Hello, I'm trying to test this code on ImageNet. At the first step, when the program reaches `self.comm.Bcast([self.model_recv_buf.recv_buf[layer_idx], MPI.DOUBLE], root=0)` in the function `async_fetch_weights_bcast` in distributed_worker.py, it throws the error `MPI_ERR_TRUNCATE: message truncated`. I checked the memory size of the buffer passed to Bcast, and the same code works when run on Cifar10/100. Have you encountered this problem?
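MPI generally reports `MPI_ERR_TRUNCATE` when a receive buffer is smaller than the incoming message, so one way to narrow this down is to compare the byte size of every receive buffer with the size implied by the layer shape on each rank. The following is only a minimal sketch under assumptions: `layer_shapes`, `recv_nbytes`, and both helper functions are hypothetical names that do not come from the repository, and the example shapes are made up. With NumPy buffers, the actual sizes could be collected from each buffer's `.nbytes`.

```python
from math import prod


def expected_nbytes(shape, itemsize=8):
    """Bytes needed for one layer; itemsize=8 corresponds to MPI.DOUBLE (float64)."""
    return prod(shape) * itemsize


def find_mismatched_layers(layer_shapes, recv_nbytes, itemsize=8):
    """Return (layer_idx, expected_bytes, actual_bytes) for each buffer whose size differs."""
    mismatches = []
    for idx, (shape, actual) in enumerate(zip(layer_shapes, recv_nbytes)):
        expected = expected_nbytes(shape, itemsize)
        if expected != actual:
            mismatches.append((idx, expected, actual))
    return mismatches


# Hypothetical case: the root broadcasts a 1000-class final layer (ImageNet),
# while the worker's buffer was allocated for a 10-class layer (CIFAR-10).
layer_shapes = [(64, 3, 7, 7), (1000, 512)]
recv_nbytes = [64 * 3 * 7 * 7 * 8, 10 * 512 * 8]

for idx, expected, actual in find_mismatched_layers(layer_shapes, recv_nbytes):
    print(f"layer {idx}: expected {expected} bytes, buffer has {actual} bytes")
```

Running such a check on both the root and the workers before the first `Bcast` would show whether a buffer was allocated for a different model size or element type (for example a float32 buffer broadcast as `MPI.DOUBLE`). Either is only a possible cause here, not a confirmed one.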
Another issue: after I replaced PyTorch 0.3.0 with PyTorch 0.4/1.1, the QSGD decode step became significantly slower, taking almost 10 times as long as with 0.3.0. Have you tried this?
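To make the slowdown easy to reproduce, the decode step can be timed in isolation under each PyTorch version. This is a minimal sketch under assumptions: `time_call` and `fake_decode` are hypothetical names, and `fake_decode` is only a stand-in for the real QSGD decode function. If the decode runs on a GPU, passing `torch.cuda.synchronize` as `sync` should make the numbers more reliable, because CUDA kernels are launched asynchronously.

```python
import time


def time_call(fn, *args, repeats=5, sync=None):
    """Return the best wall-clock time in seconds of fn(*args) over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        if sync is not None:
            sync()  # wait for pending work before starting the clock
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()  # wait for the timed call itself to finish
        best = min(best, time.perf_counter() - start)
    return best


def fake_decode(n):
    """Hypothetical stand-in for the QSGD decode step."""
    return [x * 0.5 for x in range(n)]


elapsed = time_call(fake_decode, 10000)
print(f"best of 5 runs: {elapsed:.6f} s")
```

Comparing the best-of-N time for the same input across versions separates a real regression in the decode step from one-off warm-up costs.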