fix: cpu_exclusive_scan #11

Open
Livinfly wants to merge 1 commit into stanford-cs149:master from Livinfly:master

@Livinfly Livinfly commented Aug 7, 2025

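First, the result after the upsweep phase is wrong, although this does not affect the final result: for N = 8, only the last element differs (4 vs. 8), and the downsweep output is identical.
My experimental results are below.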

(two_d < N / 2)
    ./cudaScan -i ones -n 7
    Array size: 7
    1 2 1 4 1 2 1 (upsweep phase)
    2 3 4 6 3 4 3 (downsweep phase)

    ./cudaScan -i ones -n 8
    Array size: 8
    1 2 1 4 1 2 1 4 (upsweep phase)
    0 1 2 3 4 5 6 7 (downsweep phase)

(two_d <= N / 2)
    ./cudaScan -i ones -n 7
    Array size: 7
    1 2 1 4 1 2 1 (upsweep phase)
    2 3 4 6 3 4 3 (downsweep phase)

    ./cudaScan -i ones -n 8
    Array size: 8
    1 2 1 4 1 2 1 8 (upsweep phase)
    0 1 2 3 4 5 6 7 (downsweep phase)
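
The upsweep outputs above for N = 8 can be reproduced with a small standalone sketch. The function `upsweep` and its `inclusive_bound` parameter are hypothetical names used only for illustration, not code from the repository, and the sketch assumes N is a power of 2 so that every index stays in bounds.

```cpp
#include <cassert>
#include <vector>

// Upsweep (reduce) phase only. Hypothetical helper for illustration.
// inclusive_bound == false uses `twod <  N / 2`,
// inclusive_bound == true  uses `twod <= N / 2`.
std::vector<int> upsweep(std::vector<int> a, bool inclusive_bound) {
    const int N = static_cast<int>(a.size());
    // Assumption: N is a power of 2; otherwise the indexing below overruns.
    assert(N > 0 && (N & (N - 1)) == 0);

    for (int twod = 1; inclusive_bound ? twod <= N / 2 : twod < N / 2; twod *= 2) {
        const int twod1 = twod * 2;
        // Each iteration of this inner loop is independent (parallel in the GPU version).
        for (int i = 0; i < N; i += twod1) {
            a[i + twod1 - 1] += a[i + twod - 1];
        }
    }
    return a;
}
```

With eight ones as input, the `<` bound stops one level early and leaves 4 in the last slot, while the `<=` bound leaves the full sum 8 there.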

Second, N is not guaranteed to be a power of 2, and whenever it is not, the downsweep always reads and writes out of bounds at `output[i + twod1 - 1]`.

To fix this completely, either round N up to a power of 2 at the very start with a function like nextPow2, or state in a comment that the function must only be used when N is a power of 2.
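
A minimal sketch of the first option, assuming it is acceptable to copy the input into a zero-padded buffer. The names `next_pow2` and `exclusive_scan_padded` are hypothetical and the signature differs from the repository's `cpu_exclusive_scan`; this is one possible way to do the padding, not the change made in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: smallest power of 2 that is >= n (1 for n <= 1).
static std::size_t next_pow2(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p *= 2;
    return p;
}

// Exclusive scan that pads the input with zeros up to a power of 2,
// runs upsweep/downsweep on the padded buffer, then drops the padding.
std::vector<int> exclusive_scan_padded(const std::vector<int>& in) {
    const std::size_t n = in.size();
    if (n == 0) return {};

    const std::size_t m = next_pow2(n);
    assert(m >= n);
    std::vector<int> buf(m, 0);  // zero padding does not change any prefix sum
    for (std::size_t i = 0; i < n; ++i) buf[i] = in[i];

    // Upsweep: `<=` so the last element ends up holding the total.
    for (std::size_t twod = 1; twod <= m / 2; twod *= 2) {
        const std::size_t twod1 = twod * 2;
        for (std::size_t i = 0; i < m; i += twod1) {
            buf[i + twod1 - 1] += buf[i + twod - 1];
        }
    }

    buf[m - 1] = 0;

    // Downsweep: every index is below m because m is a power of 2.
    for (std::size_t twod = m / 2; twod >= 1; twod /= 2) {
        const std::size_t twod1 = twod * 2;
        for (std::size_t i = 0; i < m; i += twod1) {
            const int tmp = buf[i + twod - 1];
            buf[i + twod - 1] = buf[i + twod1 - 1];
            buf[i + twod1 - 1] += tmp;
        }
    }

    buf.resize(n);  // discard the padded tail
    return buf;
}
```

For seven ones this returns 0 1 2 3 4 5 6 without touching memory past the buffer. The cost is an extra allocation of at most about twice the input size.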
