Paddle2Torch test infrastructure: add paddle._C_ops internal operator mappings and MOE operator support #635
Merged
Fix moe_permute config generation, add moe_permute and moe_unpermute mapping and rules code
Add paddle._C_ops mapping and rules code, fix convert_np_dtype_to_dtype_, update _COPS_API_PUBLIC_ALIAS and args mapping
Add get_signature in ana_torch_api_info
Comment on lines +582 to +588
elif api_config.api_name == "paddle._C_ops.fused_linear_param_grad_add":
    # When has_bias=False, Paddle returns an uninitialized tensor for dbias (2nd output).
    # Only compare the first output (dweight).
    if isinstance(paddle_output, (list, tuple)) and len(paddle_output) > 1:
        paddle_output = paddle_output[:1]
    if isinstance(torch_output, (list, tuple)) and len(torch_output) > 1:
        torch_output = torch_output[:1]
Collaborator
I'm afraid this isn't appropriate. This operator is very important; please use special-case logic to compare both outputs.
Collaborator
Author
This is because paddle._C_ops.fused_linear_param_grad_add has two outputs, [dweight_out, dbias_out]. When has_bias=False, the second output tensor is uninitialized, and accessing it raises an error 😢:
[T1P1] [accuracy error] paddle._C_ops.fused_linear_param_grad_add(Tensor(paddle.Size([1536, 4096]),"bfloat16"), Tensor(paddle.Size([1536, 4096]),"bfloat16"), Tensor(paddle.Size([4096, 4096]),"float32"), None, True, False, )
(PreconditionNotMet) Tensor not initialized yet when DenseTensor::place() is called.
[Hint: holder_ should not be null.] (at /paddle/paddle/phi/core/dense_tensor_impl.cc:57)
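In my view, comparing only the first output is sufficient for testing in this case, although this is poor behavior on the framework's side.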
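One way to reconcile the two positions in this thread, sketched under assumptions: always compare dweight, and compare dbias only when has_bias is true, since the second Paddle output is reported as uninitialized otherwise. The helper name select_outputs_to_compare and the plain has_bias flag are hypothetical and do not come from the PR.

```python
from typing import Any, List, Tuple


def select_outputs_to_compare(
    paddle_output: Any, torch_output: Any, has_bias: bool
) -> List[Tuple[Any, Any]]:
    """Pair up the outputs that are safe to compare (hypothetical helper)."""
    # Normalize single outputs to sequences so both sides are handled uniformly.
    if not isinstance(paddle_output, (list, tuple)):
        paddle_output = [paddle_output]
    if not isinstance(torch_output, (list, tuple)):
        torch_output = [torch_output]
    # Outputs are ordered [dweight_out, dbias_out]; per the thread, dbias is
    # only initialized on the Paddle side when has_bias is True.
    keep = 2 if has_bias else 1
    return list(zip(paddle_output[:keep], torch_output[:keep]))
```

With this shape, the has_bias=True case still checks both outputs, and the uninitialized tensor is never touched when has_bias=False.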
Comment on lines +6994 to +6996
with torch.no_grad():
    # Weight decay on work_param
    if with_decay and wd > 0:
Collaborator
Compare this against torch.optim.adam._fused_adam.
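As a reference point for such a comparison, here is a minimal scalar sketch of the textbook AdamW step with decoupled weight decay. It is neither the PR's rule code nor torch's fused kernel; the function name adamw_step and all parameter defaults are assumptions chosen for illustration, with with_decay and wd borrowed from the snippet above.

```python
import math
from typing import Tuple


def adamw_step(
    param: float, grad: float, m: float, v: float, step: int,
    lr: float = 1e-3, beta1: float = 0.9, beta2: float = 0.999,
    eps: float = 1e-8, wd: float = 0.01, with_decay: bool = True,
) -> Tuple[float, float, float]:
    """One textbook AdamW update on a scalar; step counts from 1."""
    # Decoupled weight decay: shrink the parameter before the Adam update.
    if with_decay and wd > 0:
        param = param * (1.0 - lr * wd)
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction for the zero-initialized moments.
    m_hat = m / (1.0 - beta1 ** step)
    v_hat = v / (1.0 - beta2 ** step)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v
```

A rule under test can be checked against a reference of this kind step by step, which makes it easier to see whether a mismatch comes from the decay term or from the moment updates.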
📌 Main changes
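1. _C_ops operator argument mapping (base.py)
- _COPS_API_PUBLIC_ALIAS dictionary: 13 _C_ops operators reuse the signature of the corresponding public API: add_, subtract_, multiply_, concat, flatten_, etc.
- Other operators listed: adamw_, full_, fused_linear_param_grad_add, gaussian, matmul_grad, squared_l2_norm, _run_custom_op
2. Signature caching and argument binding optimization (base.py)
- Cache inspect.signature() results to avoid repeated calls
- _C_ops alias resolution: when an operator has no signature, automatically look up the public API to obtain it
3. Torch equivalence rules (rules.py)
Added 17 conversion rules:
- CopsAdd_Rule, CopsSubtract_Rule, CopsMultiply_Rule, CopsConcatRule, CopsTransposeRule, CopsClipRule
- CopsFlatten_Rule, CopsScale_Rule, CopsReshape_Rule, CopsPutAlongAxis_Rule
- CopsAdamwRule
- CopsFusedLinearParamGradAddRule, CopsMatmulGradRule
- CopsGaussianRule, CopsUniformRule, CopsNumelRule, CopsSquaredL2NormRule, CopsRunCustomOpRule
4. MOE operator support (rules.py)
- MoePermuteRule: implements expert routing permutation
- MoeUnpermuteRule: implements expert output aggregation
5. Accuracy fixes
- fused_linear_param_grad_add: compare only the first output when has_bias=False
- get_dtype supports both old and new versions
6. Mapping configuration update (mapping.json)
- Rule configuration for _C_ops and MOE operators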
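The signature caching and alias fallback described in item 2 could look roughly like the sketch below. Only the dictionary name _COPS_API_PUBLIC_ALIAS and the function name get_signature appear in the PR; the registries PUBLIC_APIS and COPS_ALIAS, the demo.* names, and the helpers public_add and bind_args are hypothetical stand-ins, and the real base.py may be organized differently.

```python
import inspect
from functools import lru_cache
from typing import Callable, Dict


def public_add(x, y, name=None):
    """Hypothetical stand-in for a public API with a Python signature."""
    return x + y


# Hypothetical registries; the PR's real alias table is _COPS_API_PUBLIC_ALIAS.
PUBLIC_APIS: Dict[str, Callable] = {"demo.add": public_add}
COPS_ALIAS: Dict[str, str] = {"demo._C_ops.add_": "demo.add"}


@lru_cache(maxsize=None)
def get_signature(api_name: str) -> inspect.Signature:
    """Return the signature for api_name, falling back to its public alias."""
    target = PUBLIC_APIS.get(api_name)
    if target is None:
        # No signature of its own: resolve the alias to the public API.
        target = PUBLIC_APIS[COPS_ALIAS[api_name]]
    return inspect.signature(target)


def bind_args(api_name: str, *args, **kwargs) -> dict:
    """Map positional and keyword arguments onto parameter names."""
    bound = get_signature(api_name).bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)
```

Because get_signature is memoized, repeated test configurations for the same operator pay the inspect.signature() cost once, and an internal operator inherits the parameter names of its public counterpart when arguments are bound.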