Device could not be initialized or missing initialization #41
Description
I'm trying to set up Aurora-HLS on Noctua2 on node n2fpga17 with 2 xcu280_u55c_0 cards connected together in the following configuration: (https://pc2.github.io/fpgalink-gui/index.html?import=%20--fpgalink%3Dn00%3Aacl0%3Ach0-n00%3Aacl1%3Ach0%20--fpgalink%3Dn00%3Aacl0%3Ach1-n00%3Aacl1%3Ach1)
I followed the README and compiled the project with make aurora, then built the example with
make host
make xclbin
After the 2 hours of waiting for the xclbin to compile, I ran ./host_aurora_hls_test and got the error/warning [n2fpga17:3185664] MCW rank 0 is not bound (or bound to all available processors) as well as [n2fpga17:3185664:0:3185664] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil)).
I added the following debug prints to host/host_aurora_hls_test.cpp and tried again:
uint32_t device_id = emulation ? 0 : (((node_rank / 2) + config.device_id_offset) % 3);
printf ("device_id: %u %u \n", device_id, config.device_id_offset);
uint32_t instance = node_rank % 2;
printf ("instance: %u \n", instance);
xrt::device device = xrt::device(device_id);
std::cout << "device name: " << device.get_info<xrt::info::device::name>() << "\n";
std::cout << "device bdf: " << device.get_info<xrt::info::device::bdf>() << "\n";
This gave me the following output, which shows where the problem occurs:
emulation: 0
device_id: 0 0
instance: 0
[n2fpga17:3185664:0:3185664] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace (tid:3185664) ====
0 0x0000000000012cf0 __funlockfile() :0
1 0x00000000000ccc6f xrt_core::system_linux::get_userpf_device() ???:0
2 0x00000000000d1187 xrt_core::get_userpf_device() ???:0
3 0x00000000001186cb xrt::device::device() ???:0
4 0x00000000004046b8 main() /scratch/hpc-prf-gripv/Aurora-HLS/host/host_aurora_hls_test.cpp:102
5 0x000000000003ad85 __libc_start_main() ???:0
6 0x000000000040415e _start() ???:0
=================================
[n2fpga17:3185664] *** Process received signal ***
[n2fpga17:3185664] Signal: Segmentation fault (11)
[n2fpga17:3185664] Signal code: (-6)
[n2fpga17:3185664] Failing at address: 0x12d6f00309c00
[n2fpga17:3185664] [ 0] /lib64/libpthread.so.0(+0x12cf0)[0x1553fbcd6cf0]
[n2fpga17:3185664] [ 1] /opt/software/FPGA/Xilinx/xrt/xrt_2.15/lib/libxrt_core.so.2(_ZNK8xrt_core12system_linux17get_userpf_deviceEj+0x1f)[0x1553e1719c6f]
[n2fpga17:3185664] [ 2] /opt/software/FPGA/Xilinx/xrt/xrt_2.15/lib/libxrt_coreutil.so.2(_ZN8xrt_core17get_userpf_deviceEj+0x67)[0x1553fc564187]
[n2fpga17:3185664] [ 3] /opt/software/FPGA/Xilinx/xrt/xrt_2.15/lib/libxrt_coreutil.so.2(_ZN3xrt6deviceC2Ej+0x5b)[0x1553fc5ab6cb]
[n2fpga17:3185664] [ 4] ./host_aurora_hls_test[0x4046b8]
[n2fpga17:3185664] [ 5] /lib64/libc.so.6(__libc_start_main+0xe5)[0x1553fb939d85]
[n2fpga17:3185664] [ 6] ./host_aurora_hls_test[0x40415e]
[n2fpga17:3185664] *** End of error message ***
Segmentation fault (core dumped)
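To make the device selection easier to check, here is a minimal standalone sketch of the arithmetic from the debug snippet above. The names `Selection` and `select_device` are hypothetical and only for illustration; the formula itself is copied from the snippet. It only reproduces the index calculation and does not touch XRT, so it says nothing about why `xrt::device(device_id)` crashes.

```cpp
#include <cstdint>

// Hypothetical helper type: the pair of values the host code derives per rank.
struct Selection {
    uint32_t device_id;
    uint32_t instance;
};

// Same arithmetic as in the debug snippet: two ranks share one device,
// and the device index wraps around after three device slots.
constexpr Selection select_device(uint32_t node_rank,
                                  uint32_t device_id_offset,
                                  bool emulation) {
    return Selection{
        emulation ? 0u : (((node_rank / 2) + device_id_offset) % 3),
        node_rank % 2};
}

// Rank 0 with offset 0 selects device 0, instance 0, matching the output above.
static_assert(select_device(0, 0, false).device_id == 0, "rank 0 -> device 0");
static_assert(select_device(0, 0, false).instance == 0, "rank 0 -> instance 0");
```

With an offset of 0, ranks 0 and 1 map to device 0, ranks 2 and 3 to device 1, and ranks 4 and 5 to device 2, so the crash happens with the lowest possible index.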
I know the cards are present and working as shown in the screenshot below:
