Divide 1D tensor into more than 2 TPC instances #22

@mcisowsk

I noticed that for a 1D input tensor, the index space can apparently only be defined in such a way that at most 2 TPC cores are utilized (as in the example at https://docs.habana.ai/en/latest/TPC/TPC_User_Guide/TPC_Programming_Model.html#index-space-mapping). To use 4 TPCs, the tensor must be 2D. What I want to achieve is to take a 1D tensor and divide the load equally across all TPC cores. So for a 1D tensor of size 512, I want each TPC core to handle 64 elements. But all I can accomplish is 2 TPCs, each handling 256 elements. Why is that?

int elementsInVec = 64;
unsigned depthIndex = (outputSizes[0] + (elementsInVec - 1)) / elementsInVec;
kernel->indexSpaceGeometry.dims = 1;
kernel->indexSpaceGeometry.sizes[0] = depthIndex;

kernel->inputTensorAccessPattern[0].dim[0].dim      = 0;
kernel->inputTensorAccessPattern[0].dim[0].start_a  = elementsInVec;
kernel->inputTensorAccessPattern[0].dim[0].end_a    = elementsInVec;
kernel->inputTensorAccessPattern[0].dim[0].start_b  = 0;
kernel->inputTensorAccessPattern[0].dim[0].end_b    = elementsInVec - 1;

I defined the mapping as:

  • startF(x) = 64*x + 0
  • endF(x) = 64*x + 63

but it seems that it is ignored and instead it behaves more as if the mapping was:

  • startF(x) = 256*x + 0
  • endF(x) = 256*x + 255

What values will x actually take? [0, 1]? What is wrong with my code? Is it even possible to launch 8 TPCs for a data layout like this?
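To make the intended mapping concrete, here is a minimal standalone sketch in plain C that does not use the TPC SDK at all. It only evaluates the linear formulas start = start_a*x + start_b and end = end_a*x + end_b for a given index-space member x. The names `ElemRange`, `index_space_size`, and `member_range` are hypothetical helpers made up for illustration, and the sketch assumes that x ranges over 0 .. indexSpaceGeometry.sizes[0] - 1, which is exactly the point the question asks about.

```c
#include <assert.h>

/* Hypothetical helper type: one inclusive element range [start, end]. */
typedef struct {
    unsigned start;
    unsigned end;
} ElemRange;

/* Number of index-space members needed to cover `size` elements when each
 * member handles `elems_per_member` elements (ceiling division, same
 * arithmetic as the depthIndex computation in the post). */
unsigned index_space_size(unsigned size, unsigned elems_per_member)
{
    assert(elems_per_member > 0);
    return (size + (elems_per_member - 1)) / elems_per_member;
}

/* Evaluate the linear access pattern for index-space member x:
 *   start = start_a * x + start_b
 *   end   = end_a   * x + end_b
 * Assumption: x is a plain 0-based index-space coordinate. */
ElemRange member_range(unsigned x,
                       unsigned start_a, unsigned start_b,
                       unsigned end_a, unsigned end_b)
{
    ElemRange r;
    r.start = start_a * x + start_b;
    r.end   = end_a * x + end_b;
    return r;
}
```

With the values from the post (size 512, 64 elements per member), `index_space_size(512, 64)` gives 8 members, `member_range(0, 64, 0, 64, 63)` covers elements 0 to 63, and `member_range(7, 64, 0, 64, 63)` covers elements 448 to 511. The sketch only describes which elements each index-space member is declared to touch; it does not model how those members are distributed across physical TPC cores.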
