Fix benchmarks and gpu crash by bgreni · Pull Request #20 · BradLarson/max-cv

bgreni · 2026-02-11T00:43:18Z

Fix a crash caused by converted_intensity being captured by reference; the color tensor is now read by the CPU before GPU kernel execution.
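The hazard of capturing a value by reference in a kernel closure has a rough analogue outside Mojo. The following is only a sketch in plain Python with hypothetical names (`make_kernels_by_reference`, `make_kernels_by_value`); it is not the max-cv code. In Python the result is a wrong value rather than a crash, but the underlying cause is the same: the closure refers to storage owned by the caller instead of holding its own copy.

```python
def make_kernels_by_reference(intensities):
    """Each closure looks up `value` when it is called, not when it is created."""
    kernels = []
    for value in intensities:
        kernels.append(lambda pixel: pixel * value)
    return kernels


def make_kernels_by_value(intensities):
    """Binding `value` as a default argument copies it at creation time."""
    kernels = []
    for value in intensities:
        kernels.append(lambda pixel, value=value: pixel * value)
    return kernels


# All by-reference closures see the final loop value.
by_ref = [kernel(10) for kernel in make_kernels_by_reference([1, 2, 3])]
# Each by-value closure keeps the value it was created with.
by_val = [kernel(10) for kernel in make_kernels_by_value([1, 2, 3])]
```

On a GPU the stakes are higher, since a captured reference may point at host memory that the device cannot safely read, so copying the value into the closure is the safer default.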

Also, foreach is apparently async on GPU now (or maybe I only noticed because I had been running on Mac until now?), so an explicit sync is required. The stride value in the bench tensor spec was also wrong.
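Why an explicit sync matters for benchmarks can be shown with any asynchronous launch. The sketch below uses a Python thread pool as a stand-in for a GPU queue; `fake_kernel` and `time_launch` are hypothetical names, and nothing here reflects the actual Mojo or MAX API. Without waiting on the result, the timer measures only the cost of submitting the work.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_kernel(duration_s):
    """Stand-in for device work that takes `duration_s` seconds."""
    time.sleep(duration_s)
    return "done"


def time_launch(executor, duration_s, synchronize):
    """Time one launch, optionally waiting for completion inside the timed region."""
    start = time.perf_counter()
    future = executor.submit(fake_kernel, duration_s)
    if synchronize:
        future.result()  # explicit sync: block until the work has finished
    elapsed = time.perf_counter() - start
    future.result()  # drain outside the timed region so runs do not overlap
    return elapsed


with ThreadPoolExecutor(max_workers=1) as executor:
    unsynced = time_launch(executor, 0.05, synchronize=False)
    synced = time_launch(executor, 0.05, synchronize=True)

print(f"without sync: {unsynced:.4f}s, with sync: {synced:.4f}s")
```

The unsynchronized figure looks far better than the real cost, which is the kind of misleading benchmark number an explicit sync is meant to prevent.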

bgreni · 2026-02-12T16:37:47Z

Putting this into draft, as I am trying to write a GPU kernel for the Sobel operation and it is not going well so far.

bgreni · 2026-02-12T23:56:48Z

Looks like my GPU implementation is now correct, but it is still a bit slower than the foreach variant.

bgreni · 2026-02-13T02:37:02Z

Looks like simdifying things got me a little bit across the finish line on my RTX 3080. The performance difference on my Mac M3 Pro seems negligible.
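For readers unfamiliar with the operation being tuned, here is a minimal scalar reference for a Sobel gradient magnitude in plain Python. It is a sketch under assumptions, not the kernel from this PR: the function name `sobel_magnitude` is hypothetical, border pixels are simply left at zero, and the input is a single-channel image stored as a list of rows. A reference like this is useful mainly for checking a faster GPU implementation against known values.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Return the gradient magnitude for interior pixels; borders stay 0.0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for ky in range(3):
                for kx in range(3):
                    sample = image[y + ky - 1][x + kx - 1]
                    gx += SOBEL_X[ky][kx] * sample
                    gy += SOBEL_Y[ky][kx] * sample
            out[y][x] = math.hypot(gx, gy)
    return out


# A vertical edge: dark on the left, bright on the right.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
edges = sobel_magnitude(image)
```

Each output pixel depends only on its own 3x3 neighborhood, so adjacent outputs can be computed independently. That independence is what makes it possible to process several neighboring pixels per step with SIMD.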

Add better gpu kernel for sobel operator

bgreni force-pushed the improve-gpu-performance branch from 5aa1a5a to 622335c Compare February 12, 2026 16:36

bgreni marked this pull request as draft February 12, 2026 16:36

bgreni force-pushed the improve-gpu-performance branch 2 times, most recently from 04663a3 to 390badf Compare February 12, 2026 23:55

bgreni force-pushed the improve-gpu-performance branch 2 times, most recently from a6570cd to a884c33 Compare February 13, 2026 02:35

bgreni marked this pull request as ready for review February 13, 2026 02:35

bgreni force-pushed the improve-gpu-performance branch from a884c33 to 22fdd20 Compare February 13, 2026 18:28

Fix benchmarks and gpu crash

b94ec59

Add better gpu kernel for sobel operator

bgreni force-pushed the improve-gpu-performance branch from 22fdd20 to b94ec59 Compare April 1, 2026 16:57

speedup unit tests

0b82e8f

bgreni force-pushed the improve-gpu-performance branch from 281cce9 to 0b82e8f Compare April 5, 2026 19:06

bgreni wants to merge 2 commits into BradLarson:main from bgreni:improve-gpu-performance
