MatMulInteger NOT_IMPLEMENTED for INT8 x UINT8 (A=int8, B=uint8) #26743

@rivkastroh

Describe the issue

MatMulInteger fails with NOT_IMPLEMENTED on the CPU execution provider when input A is int8 and input B is uint8.
According to the ONNX spec, T1 and T2 are constrained independently
(T1 in {int8, uint8}, T2 in {int8, uint8}), so A=int8, B=uint8
should be a valid combination.

To reproduce

import onnx
import numpy as np
from onnx import TensorProto
import onnxruntime

MODEL_PATH = "matmul_integer_A_int8_B_uint8.onnx"

def create_matmul_integer_model(model_path):
    # randint's upper bound is exclusive, so 256 covers the full uint8 range
    weights_np = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)
    weights_initializer = onnx.numpy_helper.from_array(
        weights_np, name='input_b_weights_u8'
    )

    node_def = onnx.helper.make_node(
        'MatMulInteger',
        inputs=['input_a_activations_i8', 'input_b_weights_u8'],
        outputs=['output_c_int32'],
    )

    input_a_info = onnx.helper.make_tensor_value_info(
        'input_a_activations_i8',
        TensorProto.INT8,
        [1, 4]
    )

    output_c_info = onnx.helper.make_tensor_value_info(
        'output_c_int32',
        TensorProto.INT32,
        [1, 4]
    )

    graph_def = onnx.helper.make_graph(
        nodes=[node_def],
        name='matmul_integer_graph_mixed',
        inputs=[input_a_info],
        outputs=[output_c_info],
        initializer=[weights_initializer]
    )

    model_def = onnx.helper.make_model(graph_def, producer_name='onnx-example')
    onnx.save(model_def, model_path)

create_matmul_integer_model(MODEL_PATH)

def run_inference(model_path):
    session = onnxruntime.InferenceSession(model_path, providers=['CPUExecutionProvider'])
    input_name = session.get_inputs()[0].name
    output_name = session.get_outputs()[0].name

    # randint's upper bound is exclusive, so 128 covers the full int8 range
    test_input_int8 = np.random.randint(-128, 128, size=(1, 4), dtype=np.int8)
    result = session.run([output_name], {input_name: test_input_int8})
    print(result[0])

run_inference(MODEL_PATH)

Running the script above throws:

onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented:
[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED :
Could not find an implementation for MatMulInteger(10) node with name ''
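
For reference, the result this model would be expected to produce can be computed with plain NumPy. This is only a sketch of the operator's arithmetic, not ONNX Runtime's kernel: the function name `matmul_integer_reference` is hypothetical, and it assumes scalar zero points (both default to 0, as in the model above, which passes no zero-point inputs).

```python
import numpy as np


def matmul_integer_reference(a, b, a_zero_point=0, b_zero_point=0):
    """Integer matmul with int32 accumulation; scalar zero points assumed."""
    # Widen to int32 first so subtracting the zero points cannot overflow
    # the narrow int8/uint8 input types.
    a32 = a.astype(np.int32) - np.int32(a_zero_point)
    b32 = b.astype(np.int32) - np.int32(b_zero_point)
    return np.matmul(a32, b32)


a = np.array([[-128, -1, 0, 127]], dtype=np.int8)   # A = int8
b = np.full((4, 4), 255, dtype=np.uint8)            # B = uint8
expected = matmul_integer_reference(a, b)
print(expected.dtype, expected)  # int32 [[-510 -510 -510 -510]]
```

Comparing the session output against such a reference would be a quick check once an int8 x uint8 kernel is available.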

Urgency

No response

Platform

Windows

OS Version

10.0.26100 N/A Build 26100

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.23.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response
