[CPU]RNN not specify weight layout when creating primitive descriptor.#20068
RNN weights expose a planar layout to the CPU graph.
3143910 to 697a04b
697a04b to 4356738
@EgorDuplensky, please review the PR.
CPU_REGISTER_PASS_COMMON(manager, ConvertToSwishCPU);
CPU_REGISTER_PASS_COMMON(manager, OptimizeSequenceTransposes);
if (!ov::op::util::has_op_with_type<ov::op::v0::FakeQuantize>(nGraphFunc)) {
    CPU_REGISTER_PASS_COMMON(manager, ReshapeFullyConnectedFusion);
Could you please split it into two PRs, one for "always any layout for RNN weights" and another one for disabling the ReshapeFullyConnectedFusion transformation?
Also, don't we want to remove the transformation itself if it is not needed anymore?
Both changes aim to let the brgemm kernel cover more cases.
The reason for removing this Reshape + FC fusion is that we found the brgemm implementation has some limitations on the input activation and weight shapes: the oneDNN brgemm FC falls back to the legacy gemm implementation when the input tensors are 4D.
Also, as far as I recall, the Reshape node in this case only changes the input/output memory descriptor and performs no memory copy.
With the fusion, "4D input -> Reshape to 2D input with 2D weight -> FC" is transformed to "4D input with 4D weight -> FC".
Results of the earlier internal CI regression run:
http://10.67.108.202:8080/benchmark/tput/1566/1536
http://10.67.108.202:8080/benchmark/latency/1567/1529
It seems that removing this fusion has no side effects on AVX2 or AVX-512.
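To make the "descriptor-only Reshape" point concrete, here is a minimal standalone sketch. It is not OpenVINO or oneDNN code: `TensorView`, `reshape_to_2d`, and `fully_connected` are hypothetical names, the data is assumed to be dense row-major, and the Reshape is assumed to collapse the trailing dimensions ({N, C, H, W} -> {N, C*H*W}). It only illustrates that the reshaped view reuses the same buffer, so the unfused "Reshape -> 2D FC" path adds no memory copy.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a memory descriptor: a shape plus a non-owning pointer.
struct TensorView {
    const float* data;              // shared buffer, never copied by reshape
    std::vector<std::size_t> dims;  // logical shape over dense row-major data
};

// Collapse all trailing dims into one: {N, C, H, W} -> {N, C*H*W}.
// Only the shape changes; the data pointer is reused as-is.
TensorView reshape_to_2d(const TensorView& t) {
    assert(!t.dims.empty());
    const std::size_t outer = t.dims.front();
    const std::size_t inner = std::accumulate(t.dims.begin() + 1, t.dims.end(),
                                              std::size_t{1},
                                              std::multiplies<std::size_t>());
    return TensorView{t.data, {outer, inner}};
}

// Naive 2D fully connected layer: out[m][n] = sum_k in[m][k] * w[n][k].
std::vector<float> fully_connected(const TensorView& in, const TensorView& w) {
    const std::size_t M = in.dims[0];
    const std::size_t K = in.dims[1];
    const std::size_t N = w.dims[0];
    assert(w.dims[1] == K);
    std::vector<float> out(M * N, 0.0f);
    for (std::size_t m = 0; m < M; ++m)
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t k = 0; k < K; ++k)
                out[m * N + n] += in.data[m * K + k] * w.data[n * K + k];
    return out;
}
```

Calling `reshape_to_2d` on a {2, 1, 2, 2} view yields a {2, 4} view whose `data` pointer equals the original one, and `fully_connected` then operates on plain 2D shapes, which is the case the comment above says brgemm handles.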
@EgorDuplensky, split into 2 PRs; the other one is #21442. Thx!
This PR will be closed in a week because of 2 weeks of no activity.
So, this heuristic is not needed anymore?
Are we enabling AVX2 for the Convolution node at the same time?
If so, we had better do it in the scope of a separate PR as well.
It seems I merged in another feature branch last time. Sorry for that; there should only be RNN-related changes in this PR. I have already force-pushed the update. Thx!
0553681 to bd40a06
@EgorDuplensky, applied review comments.
@EgorDuplensky Could you please continue to review the PR? Thanks!