Nearest neighbor scaling

We have some questions about running nearest neighbors, specifically the distributed_nearest_neighbors.cpp example. We are focusing on the DistKernelMatrix run (second half of the script), with the following parameters:

n = 5000
d = 3
k = 64


As we increase the number of MPI tasks, we notice strange behavior. Most importantly, the code slows down significantly when more tasks are introduced. This is clear from watching the run, and the flops and mops reported after each neighbor iteration tell the same story:

nprocs = 1
[ RT]    31 [normal]     0 [listen]      0 [nested] 3.000E+04 flops 3.000E+04 mops
[ RT]    16 [normal]     0 [listen]      0 [nested] 6.250E+06 flops 1.250E+07 mops

nprocs = 2
[ RT]    32 [normal]     0 [listen]      0 [nested] 9.000E+04 flops 9.000E+04 mops
[ RT]    16 [normal]     0 [listen]      0 [nested] 6.250E+06 flops 1.250E+07 mops

nprocs = 4
[ RT]    36 [normal]     0 [listen]      0 [nested] 2.100E+05 flops 2.100E+05 mops
[ RT]    16 [normal]     0 [listen]      0 [nested] 6.250E+06 flops 1.250E+07 mops

nprocs = 8
[ RT]    48 [normal]     0 [listen]      0 [nested] 2.400E+05 flops 2.400E+05 mops
[ RT]    16 [normal]     0 [listen]      0 [nested] 6.250E+06 flops 1.250E+07 mops

This leads to question 1: Should the number of flops grow for the same problem size when increasing tasks, or is there a problem here?
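To make the growth concrete, here is a minimal sketch that divides each run's first reported flop count by the single-task value. The helper name `flop_growth` is ours and purely illustrative; the inputs are the numbers from the logs above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of the library): ratio of each reported
// flop count to the first entry, which we take as the nprocs = 1 baseline.
std::vector<double> flop_growth(const std::vector<double>& flops) {
  assert(!flops.empty() && flops.front() > 0.0);
  std::vector<double> ratios;
  ratios.reserve(flops.size());
  for (double f : flops) {
    ratios.push_back(f / flops.front());
  }
  return ratios;
}
```

Calling `flop_growth({3.000e4, 9.000e4, 2.100e5, 2.400e5})` gives the growth for nprocs = 1, 2, 4, 8. The second reported line (6.250E+06 flops) is identical in all four runs, so only the first line grows.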

Our hypothesis is that this is related to the splitter warning

[WARNING] increase the middle gap to 10 percent!

which is displayed more and more frequently as the number of MPI tasks increases. The counts below are gap warnings per nearest-neighbor iteration.

nprocs :  1  warnings: 0
nprocs :  2  warnings: 2
nprocs :  4  warnings: 8
nprocs :  8  warnings: 24
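One thing we noticed about these counts: they coincide with nprocs * log2(nprocs). The sketch below just checks that arithmetic. The function names are ours, and the model is only a guess at a pattern (for instance, every task printing the warning once per level of a binary distributed tree), not something we have confirmed in the code.

```cpp
#include <cassert>

// floor(log2(p)) for p >= 1, computed with integer shifts.
int log2_floor(int p) {
  assert(p >= 1);
  int levels = 0;
  while (p >>= 1) {
    ++levels;
  }
  return levels;
}

// Hypothetical model (an assumption, not taken from the library):
// one warning per task per distributed tree level.
int modeled_warnings(int nprocs) {
  return nprocs * log2_floor(nprocs);
}
```

For nprocs = 1, 2, 4, 8 this model returns 0, 2, 8, 24, which matches the observed counts. If the pattern is real, the warning may fire on every distributed split rather than only on genuinely degenerate ones.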

Unless we are misunderstanding the algorithm, the warning is displayed when multiple points project to the median under that split. Our question is twofold: why would this happen more often as the number of tasks increases, and is it fixable?
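To be explicit about what we mean by points projecting to the median, here is a minimal sketch of the check as we understand it. This is not the library's splitter: the function name `count_at_median` and the tolerance parameter `tol` are our own assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical check (not the library's code): given the 1-D projections of
// the points onto the split direction, count how many lie within `tol` of
// the median. A count above 1 means the median does not separate the points
// cleanly.
std::size_t count_at_median(std::vector<double> proj, double tol = 0.0) {
  assert(!proj.empty());
  // nth_element places the upper-median value at position size / 2.
  auto mid = proj.begin() + static_cast<std::ptrdiff_t>(proj.size() / 2);
  std::nth_element(proj.begin(), mid, proj.end());
  const double median = *mid;
  return static_cast<std::size_t>(
      std::count_if(proj.begin(), proj.end(), [median, tol](double v) {
        return std::abs(v - median) <= tol;
      }));
}
```

With distinct random points in d = 3 we would expect this count to be 1 on essentially every split, which is why the growing number of warnings surprises us.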

For the record, this happens even when care is taken to randomize each process's data with a unique seed, so there are no duplicate points. Thanks for the help!

Nearest neighbor scaling #41
