Open
Description
The proposed feature consists of adding a default operator to the DMA abstraction that copies a contiguous chunk of data into a non-overlapping region.
Providing such a method in the abstraction makes sense because all available backends (linux, cuda, OpenCL, level_zero) provide memcpy functionality. There are currently only a few rarely used accelerated copies for 2D/3D matrices, and no real programmable DMA engine other than the CPU itself.
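To make the proposal concrete, here is a minimal sketch of what such a default operator could look like. All names (`struct dma`, `dma_copy_fn`, `dma_linux_copy`, `dma_copy`) are hypothetical and are not taken from the existing code base. The sketch assumes each backend exposes one flat-copy hook, and that the host backend can reject overlapping regions because both pointers live in the same address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout: none of these are the library's real API. */
struct dma;

/* Backend hook for a flat copy of `size` bytes; returns 0 on success. */
typedef int (*dma_copy_fn)(struct dma *dma, void *dst, const void *src,
                           size_t size);

struct dma {
    dma_copy_fn copy; /* flat-copy operator supplied by the backend */
    void *data;       /* backend-specific state (stream, queue, ...) */
};

/* Nonzero when [a, a+size) and [b, b+size) share at least one byte.
 * Only meaningful when both pointers are in the same address space. */
static int regions_overlap(const void *a, const void *b, size_t size)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    return pa < pb ? pb - pa < size : pa - pb < size;
}

/* Host ("linux") backend: a flat copy is a plain memcpy. */
static int dma_linux_copy(struct dma *dma, void *dst, const void *src,
                          size_t size)
{
    (void)dma; /* the host backend needs no extra state */
    if (regions_overlap(dst, src, size))
        return -1; /* the operator is specified for non-overlapping regions */
    memcpy(dst, src, size);
    return 0;
}

/* Backend-agnostic entry point: validate arguments, then dispatch. */
static int dma_copy(struct dma *dma, void *dst, const void *src, size_t size)
{
    assert(dma != NULL && dma->copy != NULL);
    if (size == 0)
        return 0; /* nothing to copy */
    if (dst == NULL || src == NULL)
        return -1;
    return dma->copy(dma, dst, src, size);
}
```

A GPU backend would presumably fill the same `copy` slot with a wrapper around its own memcpy call, so callers would never need to know which backend they are talking to.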
On top of that, there are currently two use cases for such functionality:
- All layout transform operations are currently implemented using memcpy operations. Providing such a method would let layouts offer a backend-agnostic copy operator for transforming into another layout.
- The WIP deepcopy abstraction uses many flat copies and has to go through the layout abstraction every time, even though such a data description is not really needed.
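The first use case can be illustrated with a small sketch: a layout transform that packs padded rows into a dense buffer using nothing but a flat-copy callback. The names `flat_copy_fn`, `host_flat_copy`, and `pack_rows` are hypothetical, and the host callback only stands in for whatever the DMA backend would provide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flat-copy callback: copies `size` contiguous bytes and
 * returns 0 on success. */
typedef int (*flat_copy_fn)(void *dst, const void *src, size_t size);

/* Host stand-in for a backend copy; a GPU backend would wrap its own
 * memcpy call behind the same signature. */
static int host_flat_copy(void *dst, const void *src, size_t size)
{
    memcpy(dst, src, size);
    return 0;
}

/* Pack `rows` rows of `row_bytes` bytes, stored `pitch` bytes apart in
 * `src`, into a dense `dst` buffer using only flat copies. */
static int pack_rows(flat_copy_fn copy, void *dst, const void *src,
                     size_t rows, size_t row_bytes, size_t pitch)
{
    if (copy == NULL || pitch < row_bytes)
        return -1; /* rows would overlap in the source */
    for (size_t r = 0; r < rows; r++) {
        int err = copy((char *)dst + r * row_bytes,
                       (const char *)src + r * pitch, row_bytes);
        if (err != 0)
            return err; /* propagate the backend error */
    }
    return 0;
}
```

Because the transform touches memory only through the callback, the same loop would work unchanged for any backend that supplies a flat copy.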