[GPU] tensor from nthandle#35543
Conversation
…prefix to dx12 tests
| const Shape& shape, | ||
| void* shared_buffer, | ||
| const MemType memory_type) { | ||
| OPENVINO_ASSERT(shared_buffer != nullptr, "shared_buffer must not be nullptr for SHARED_BUF memory type"); |
There was a problem hiding this comment.
| OPENVINO_ASSERT(shared_buffer != nullptr, "shared_buffer must not be nullptr for SHARED_BUF memory type"); | |
| OPENVINO_ASSERT(shared_buffer != nullptr, "shared_buffer must not be nullptr for SHARED_BUF memory type"); | |
| OPENVINO_ASSERT(memory_type == MemType::SHARED_BUF, "Only SHARED_BUF memory type is supported for raw buffer pointer or NT handle") |
At first glance this check is useless as MemType contains only one record, but MemType may be extended independently, and in that case this check becomes meaningful.
There was a problem hiding this comment.
Did you evaluate how much the Vulkan bootstrap affects the overall configure/build time? Not it looks like a hard dependency for the functional tests, so Vulkan now must be always there whenever we build tests and only for one test. Should we consider extracting this test into a separate binary or converting it into a demo?
There was a problem hiding this comment.
got 27.83s(this PR + setting set(Vulkan_FOUND OFF), not including one time when got hang) vs 9.52s (this PR without vulkan changes in cmake - similar to master cmake) for command cmake .. -DENABLE_INTEL_CPU=OFF -DENABLE_INTEL_NPU=OFF -DENABLE_TESTS=ON -DBUILD_TYPE=Release . To measure it was done 20times.
There was a problem hiding this comment.
this file vulkan_handle.json takes about 5s to compile (from clang report), full build (gpuplugin + tests takes 33minutes)
{
"pid": 240035,
"tid": 240036,
"ph": "X",
"ts": 0,
"dur": 4730359,
"name": "Total ExecuteCompiler",
"args": {
"count": 1,
"avg ms": 4730
}
},
| const Shape& shape, | ||
| void* shared_buffer, | ||
| const MemType memory_type) { | ||
| OPENVINO_ASSERT(shared_buffer != nullptr, "shared_buffer must not be nullptr for SHARED_BUF memory type"); |
There was a problem hiding this comment.
could you update documentation as well? docs/articles_en/openvino-workflow/running-inference/inference-devices-and-modes/gpu-device/remote-tensor-api-gpu-plugin.rst
| * @param type Tensor element type | ||
| * @param shape Tensor shape | ||
| * @param shared_buffer External memory handle from another API (DX12 shared NT handle on Windows, | ||
| * DMA-BUF fd on Linux), passed as void* |
There was a problem hiding this comment.
fd on Linux is handled as int from NPU plugin. I think it should be separated.
| ClBufferTensor create_tensor(const element::Type type, | ||
| const Shape& shape, | ||
| void* shared_buffer, | ||
| const MemType memory_type) { |
There was a problem hiding this comment.
I think this should support importing usm as well. It is identical with usm-version for first three arguments.
There was a problem hiding this comment.
I think that at least at windows creating handles supports only allocation on gpu
| BT_EMPTY, | ||
| BT_BUF_INTERNAL, | ||
| BT_BUF_SHARED, | ||
| BT_BUF_SHARED_IMPORTED, |
There was a problem hiding this comment.
could you leave some comment about this? it is difficult to distinguish against BT_BUF_SHARED. Maybe SHARED_FROM_HANDLE?
| /// Create shared memory object using user-supplied memory buffer @p buf using specified @p layout | ||
| memory_ptr share_buffer(const layout& layout, shared_handle buf); | ||
|
|
||
| virtual memory_ptr import_external_buffer(const layout& layout, shared_handle external_handle); |
There was a problem hiding this comment.
nit: import implies it is coming from outside. so I suggest to rename it to import_buffer
| #ifdef _WIN32 | ||
| constexpr auto handle_type_token = CL_EXTERNAL_MEMORY_HANDLE_OPAQUE_WIN32_KHR; | ||
| #elif defined(__linux__) | ||
| constexpr auto handle_type_token = CL_EXTERNAL_MEMORY_HANDLE_DMA_BUF_KHR; |
There was a problem hiding this comment.
does this PR support dmabuf as well? if not, I think we need to add an assert, instead.
There was a problem hiding this comment.
supports but tests from vulkan was deleted
| cl::Buffer _buffer; | ||
| }; | ||
|
|
||
| struct gpu_external_buffer : public gpu_buffer { |
There was a problem hiding this comment.
It is SHARED_IMPORTED and BUFFER_FROM_HANDLE from enum, but external here. Could you unify the term?
| /// Create shared memory object using user-supplied memory buffer @p buf using specified @p layout | ||
| memory_ptr share_buffer(const layout& layout, shared_handle buf); | ||
|
|
||
| virtual memory_ptr import_buffer(const layout& layout, shared_handle external_handle); |
There was a problem hiding this comment.
It makes sense to convert it to a pure method virtual memory_ptr import_buffer(const layout& layout, shared_handle external_handle) = 0; and implement a stub in the ze engine (ocl implementation is already a part of the PR)
Also please add a comment with this method description.
| * @ingroup ov_runtime_ocl_gpu_cpp_api | ||
| */ | ||
| #ifdef __linux__ | ||
| using handle_param = int; |
There was a problem hiding this comment.
NIT. One line above there is gpu_hanlde_param, here we see handle_param. This is rather confusing. Can a more appropriate type name be thought of?
Details:
Tickets: