Open
Description
Describe the issue
MatMulInteger fails with NOT_IMPLEMENTED when input A is int8 and input B is uint8.
According to the ONNX spec, T1 and T2 are constrained independently
(T1 in {int8, uint8}, T2 in {int8, uint8}), so A=int8, B=uint8
should be a valid combination.
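For reference, the result expected from the mixed-type case can be sketched in plain NumPy. This is only an illustration of the operator semantics as I read the spec (widen both inputs to int32, subtract the zero points, then multiply); `matmul_integer_reference` is a hypothetical helper name, not an ONNX or ONNX Runtime API, and it assumes scalar zero points only.

```python
import numpy as np

def matmul_integer_reference(a, b, a_zero_point=0, b_zero_point=0):
    """NumPy sketch of MatMulInteger: int8/uint8 inputs, int32 output.

    Hypothetical helper for illustration; handles scalar zero points only.
    """
    # Widen first so the int8/uint8 values cannot overflow or wrap.
    a32 = a.astype(np.int32) - np.int32(a_zero_point)
    b32 = b.astype(np.int32) - np.int32(b_zero_point)
    return np.matmul(a32, b32)

# A is int8, B is uint8: the combination that fails in this issue.
a = np.array([[-128, -1, 0, 127]], dtype=np.int8)
b = np.full((4, 4), 255, dtype=np.uint8)

expected = matmul_integer_reference(a, b)
print(expected, expected.dtype)  # [[-510 -510 -510 -510]] int32
```

A kernel supporting A=int8, B=uint8 would be expected to match this output for the same inputs with default (zero) zero points.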
To reproduce
import onnx
import numpy as np
from onnx import TensorProto
import onnxruntime

MODEL_PATH = "matmul_integer_A_int8_B_uint8.onnx"


def create_matmul_integer_model(model_path):
    # Input B: uint8 weights stored as an initializer.
    weights_np = np.random.randint(0, 255, size=(4, 4), dtype=np.uint8)
    weights_initializer = onnx.numpy_helper.from_array(
        weights_np, name='input_b_weights_u8'
    )

    node_def = onnx.helper.make_node(
        'MatMulInteger',
        inputs=['input_a_activations_i8', 'input_b_weights_u8'],
        outputs=['output_c_int32'],
    )

    # Input A: int8 activations supplied at run time.
    input_a_info = onnx.helper.make_tensor_value_info(
        'input_a_activations_i8',
        TensorProto.INT8,
        [1, 4]
    )
    output_c_info = onnx.helper.make_tensor_value_info(
        'output_c_int32',
        TensorProto.INT32,
        [1, 4]
    )

    graph_def = onnx.helper.make_graph(
        nodes=[node_def],
        name='matmul_integer_graph_mixed',
        inputs=[input_a_info],
        outputs=[output_c_info],
        initializer=[weights_initializer]
    )
    model_def = onnx.helper.make_model(graph_def, producer_name='onnx-example')
    onnx.save(model_def, model_path)


create_matmul_integer_model(MODEL_PATH)


def run_inference(model_path):
    session = onnxruntime.InferenceSession(model_path, providers=['CPUExecutionProvider'])
    input_name = session.get_inputs()[0].name
    output_name = session.get_outputs()[0].name
    test_input_int8 = np.random.randint(-128, 127, size=(1, 4), dtype=np.int8)
    result = session.run([output_name], {input_name: test_input_int8})
    print(result[0])


run_inference(MODEL_PATH)

Running the script above throws:
onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented:
[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED :
Could not find an implementation for MatMulInteger(10) node with name ''
Urgency
No response
Platform
Windows
OS Version
10.0.26100 N/A Build 26100
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.23.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response