Hello. I reported this to Debian here:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1101363
On AWS instances of types c7a.large, m7a.large, r7a.large, which incidentally have 2 vCPUs, the Debian package for dbcsr used to take less than 4 minutes to build.
After I added PRTE_MCA_rmaps_default_mapping_policy=:oversubscribe so that it also builds on systems with a single CPU, the build on systems with 2 CPUs now fails with a timeout, like this:
11: **********************************************************************
11: -- TESTING dbcsr_multiply (N, C, 5 , A, N, N) ............... PASSED !
11: **********************************************************************
11: test_name multiply_LIMITS_MIX_3
11: The solution is CORRECT !
11: **********************************************************************
11: -- TESTING dbcsr_multiply (T, N, 5 , A, N, N) ............... PASSED !
11: **********************************************************************
11/19 Test #11: dbcsr_unittest1 .......................................***Timeout 1500.01 sec
[...]
The following tests FAILED:
11 - dbcsr_unittest1 (Timeout)
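For reference, a minimal sketch of how the oversubscribe setting mentioned above can be applied, assuming it is exported in the environment before the test suite runs. Where exactly it is set in the Debian packaging is not shown here, and the ctest invocation is only illustrative:

```shell
# Sketch only: allow PRRTE to map more MPI ranks than available CPUs.
export PRTE_MCA_rmaps_default_mapping_policy=:oversubscribe

# Illustrative: the tests would then be run from the build directory, e.g.
# ctest --output-on-failure
```

The variable must be visible to the process that launches the MPI tests, so it has to be exported rather than set as a plain shell variable.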
I tried increasing the timeout, like this:
--- a/tests/CMakeLists.txt
+++ b/tests/CMakeLists.txt
@@ -140,6 +140,7 @@ foreach (dbcsr_test ${DBCSR_TESTS_FTN})
   endif ()
   set_tests_properties(
     ${dbcsr_test} PROPERTIES ENVIRONMENT OMP_NUM_THREADS=${NUM_THREADS}
+                             TIMEOUT 3600
                              PROCESSORS ${test_processors})
 endforeach ()
but 3600 was not enough, and 7200 was not enough either (still timeouts), which makes me think that the proper fix may lie somewhere else.
Thanks.