Speed up integral images #79

minnerbe · 2025-05-26T19:38:40Z

While investigating a performance issue (which turned out to be unrelated to this library) in some downstream code, I found that a few minimal changes provide a significant speedup. Integral images are accumulated both row-wise and column-wise, and the column-wise accumulation took significantly longer than the row-wise one.

This PR changes the memory access pattern of the column-wise accumulation to row-wise access, thereby significantly reducing the run time; see the changes in IntIntegralImage as an example. These are the min/max run times for 10 runs of computing the integral image for a 20000x20000 image before and after the changes:

Pass	Before	After
Row-wise	214ms - 267ms	222ms - 269ms
Column-wise	627ms - 716ms	131ms - 141ms
Total	841ms - 983ms	353ms - 410ms
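
For readers who have not looked at the diff, the change in access pattern can be sketched roughly as follows. This is a minimal illustration, not the code in IntIntegralImage: the class name, the method names, and the flat row-major `long[]` layout without padding are assumptions made for the example. Both methods compute the same column-wise running sum; only the loop order, and therefore the memory access pattern, differs.

```java
import java.util.Arrays;

/** Hypothetical sketch; names and array layout are assumptions, not the library's code. */
final class ColumnAccumulationSketch {

    /** Column-wise pass with strided access: walks down one column at a time. */
    static void accumulateColumnsStrided(final long[] sum, final int w, final int h) {
        for (int x = 0; x < w; ++x) {
            for (int y = 1; y < h; ++y) {
                // consecutive iterations touch elements that are w entries apart
                sum[y * w + x] += sum[(y - 1) * w + x];
            }
        }
    }

    /** Same pass with contiguous access: adds the previous row onto the current row. */
    static void accumulateColumnsRowWise(final long[] sum, final int w, final int h) {
        for (int y = 1; y < h; ++y) {
            final int row = y * w;
            final int prev = row - w;
            for (int x = 0; x < w; ++x) {
                // consecutive iterations touch neighbouring elements
                sum[row + x] += sum[prev + x];
            }
        }
    }

    public static void main(final String[] args) {
        final int w = 4, h = 3;
        final long[] a = new long[w * h];
        Arrays.fill(a, 1);
        final long[] b = a.clone();
        accumulateColumnsStrided(a, w, h);
        accumulateColumnsRowWise(b, w, h);
        // both loop orders should yield identical sums
        System.out.println(Arrays.equals(a, b));
    }
}
```

The inner loop of the second variant reads and writes adjacent array elements, which is the cache-friendly pattern the timings above are attributed to.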

Every change either modifies the accumulation in exactly the manner described above or is an automatically applied fix for an IDE warning. For the former category, I manually checked correctness using images with elements i -> 1 and i -> i + 1.
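
A check along these lines could look like the sketch below. It is not the test that was actually run: the class and method names are made up, `i` is assumed to be the row-major pixel index, and the integral image is assumed to be inclusive and unpadded. Under those assumptions both test images have simple closed-form sums to compare against.

```java
import java.util.function.IntToLongFunction;

/** Hypothetical sketch of a correctness check; not the check used in the PR. */
final class IntegralImageCheckSketch {

    /** Inclusive integral image of a w x h image whose pixel i (row-major) has value f(i). */
    static long[] integral(final int w, final int h, final IntToLongFunction f) {
        final long[] sum = new long[w * h];
        // row-wise pass: running sum along each row
        for (int y = 0; y < h; ++y) {
            long acc = 0;
            for (int x = 0; x < w; ++x) {
                acc += f.applyAsLong(y * w + x);
                sum[y * w + x] = acc;
            }
        }
        // column-wise pass, written with row-wise memory access
        for (int y = 1; y < h; ++y) {
            for (int x = 0; x < w; ++x) {
                sum[y * w + x] += sum[(y - 1) * w + x];
            }
        }
        return sum;
    }

    /** Expected sum over [0..x] x [0..y] for the image i -> 1. */
    static long expectedOnes(final long x, final long y) {
        return (x + 1) * (y + 1);
    }

    /** Expected sum over [0..x] x [0..y] for the image i -> i + 1, with i = y * w + x. */
    static long expectedRamp(final long x, final long y, final long w) {
        return w * (x + 1) * y * (y + 1) / 2   // contribution of the y * w term
                + (y + 1) * x * (x + 1) / 2    // contribution of the x term
                + (x + 1) * (y + 1);           // contribution of the + 1 term
    }

    /** Returns true if both test images match their closed forms at every position. */
    static boolean check(final int w, final int h) {
        final long[] ones = integral(w, h, i -> 1L);
        final long[] ramp = integral(w, h, i -> i + 1L);
        for (int y = 0; y < h; ++y) {
            for (int x = 0; x < w; ++x) {
                if (ones[y * w + x] != expectedOnes(x, y)) return false;
                if (ramp[y * w + x] != expectedRamp(x, y, w)) return false;
            }
        }
        return true;
    }
}
```

The constant image catches off-by-one errors in the summation bounds, while the ramp image also catches mixed-up rows and columns, since its sums are not symmetric in x and y.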

Let me know what you think, @axtimwalde

axtimwalde · 2025-05-27T02:00:14Z

Beautiful! Have you benchmarked small images? 1024^2 or something?

minnerbe · 2025-05-27T14:42:50Z

Good point @axtimwalde ! Our use case was large images (slices of compute blocks of large 3D stacks), so I focused on that.

I reran a small benchmark for n x n images, with n ranging from 10000 down to 100, and took the average run time over 100 to 10000 runs. As expected, the difference is not as stark for medium-sized images but is still noticeable, and at least there is no degradation in run time for small images.

Size	Before	After
10000	240ms	97ms
1000	1.23ms	0.97ms
100	0.012ms	0.011ms

I haven't looked at the assembly instructions generated by the JIT compiler, but I suspect that all of this is 'just' more efficient memory access, since Java doesn't reliably compile SIMD instructions. Should this change in the future, the new way of doing column-wise summation is very amenable to this kind of further automatic optimization.

axtimwalde · 2025-05-27T15:16:38Z

Thanks a lot!

minnerbe added 6 commits May 24, 2025 14:27

Speed up cumulative summation of DoubleIntegralImage

4190b8c

Fix some IDE warnings

97ccc34

Rewrite the other constructor and add documentation

b0b0c32

Change other integral image implementations

182d7b0

Fix some IDE warnings

2c9d34b

Change sums in BlockPMCC and BlockStatistics (and fix some warnings)

5f72c62

axtimwalde merged commit c452d05 into axtimwalde:master May 27, 2025
1 check passed

minnerbe deleted the refactor/integral-images branch May 27, 2025 15:23

minnerbe mentioned this pull request Oct 14, 2025

Fix/integral image oob #86

Merged
