[Example] Add More Hopper Matmul Examples #154

Merged: yaoyaoding merged 17 commits into NVIDIA:main from WilliamZhang20:main on May 17, 2026

WilliamZhang20 (Contributor) commented on May 7, 2026

Added v4/v5 of hopper_matmul, with the following results on H200 NVL:

Benchmark results (m=8192, n=8192, k=8192):
 version  latency (ms)     tflops  % of cublas
tilus_v0      3.842016 286.180909    41.596911
tilus_v1      2.169152 506.885464    73.676716
tilus_v2      2.113136 520.322213    75.629772
tilus_v3      1.922096 572.037845    83.146732
tilus_v4      1.953824 562.748557    81.796517
tilus_v5      1.740832 631.601216    91.804375
  cublas      1.605648 684.777511   100.000000
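
As a cross-check on the table, the tflops column is consistent with the usual convention of counting 2·m·n·k floating-point operations per matmul and dividing by the measured latency. The sketch below reproduces that arithmetic for a few rows; the helper name `matmul_tflops` is hypothetical and is not part of the tilus examples or of this PR.

```python
def matmul_tflops(m: int, n: int, k: int, latency_ms: float) -> float:
    """Achieved TFLOPS for an m x n x k matmul, assuming 2*m*n*k FLOPs."""
    flops = 2 * m * n * k           # one multiply and one add per inner-product term
    seconds = latency_ms * 1e-3     # the table reports latency in milliseconds
    return flops / seconds / 1e12   # FLOP/s -> TFLOP/s


# Latencies (ms) copied from the table above for m = n = k = 8192.
latencies_ms = {
    "tilus_v0": 3.842016,
    "tilus_v5": 1.740832,
    "cublas": 1.605648,
}

for version, latency in latencies_ms.items():
    tflops = matmul_tflops(8192, 8192, 8192, latency)
    print(f"{version:>8}  {tflops:10.3f} TFLOPS")
```

The same helper can be applied to the remaining rows by adding their latencies to the dictionary.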

WilliamZhang20 and others added 11 commits April 4, 2026 03:26
Signed-off-by: William Zhang <wzhang20@yahoo.com>
copy-pr-bot (Bot) commented on May 7, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

WilliamZhang20 changed the title from "Add More Hopper Matmul Examples" to "[Example] Add More Hopper Matmul Examples" on May 10, 2026
yaoyaoding (Member) commented

/ok to test c07413f

yaoyaoding merged commit 8970fbb into NVIDIA:main on May 17, 2026
2 checks passed