gpu output array wrong dimensions

Question

nicola on 18 Sep 2024


https://in.mathworks.com/matlabcentral/answers/2153550-gpu-output-array-wrong-dimensions

Commented: nicola on 19 Sep 2024

When I run a CUDA kernel with feval, the output array has the wrong size. The array is declared in MATLAB as:

hits_gpu = gpuArray.zeros(numRays * numTriangles, 1, 'single'); % Hits array, single precision to match float* hits

The kernel is:

__global__ void moller_trumbore(
    float* rayOrig_x,
    float* rayOrig_y,
    float* rayOrig_z,
    float* rayDir_x,
    float* rayDir_y,
    float* rayDir_z,
    float* vert0_x,
    float* vert0_y,
    float* vert0_z,
    float* vert1_x,
    float* vert1_y,
    float* vert1_z,
    float* vert2_x,
    float* vert2_y,
    float* vert2_z,
    float* hits,
    float* t_vals,
    float* u_vals,
    float* v_vals,
    int numRays,
    int numTriangles)
{
    // Calculate ray index and triangle index from thread index
    int rayIdx = blockIdx.x * blockDim.x + threadIdx.x; // Ray index
    int triangleIdx = blockIdx.y * blockDim.y + threadIdx.y; // Triangle index

    // Load ray origin and direction components
    float orig_x = rayOrig_x[rayIdx];
    float orig_y = rayOrig_y[rayIdx];
    float orig_z = rayOrig_z[rayIdx];
    float dir_x = rayDir_x[rayIdx];
    float dir_y = rayDir_y[rayIdx];
    float dir_z = rayDir_z[rayIdx];

    // Load triangle vertices from separate coordinate arrays
    float v0_x = vert0_x[triangleIdx];
    float v0_y = vert0_y[triangleIdx];
    float v0_z = vert0_z[triangleIdx];
    float v1_x = vert1_x[triangleIdx];
    float v1_y = vert1_y[triangleIdx];
    float v1_z = vert1_z[triangleIdx];
    float v2_x = vert2_x[triangleIdx];
    float v2_y = vert2_y[triangleIdx];
    float v2_z = vert2_z[triangleIdx];

    // Calculate edges for triangle
    float edge1_x = v1_x - v0_x;
    float edge1_y = v1_y - v0_y;
    float edge1_z = v1_z - v0_z;
    float edge2_x = v2_x - v0_x;
    float edge2_y = v2_y - v0_y;
    float edge2_z = v2_z - v0_z;

    // Calculate determinant using cross product and dot product
    float h_x = dir_y * edge2_z - dir_z * edge2_y;
    float h_y = dir_z * edge2_x - dir_x * edge2_z;
    float h_z = dir_x * edge2_y - dir_y * edge2_x;
    float a = edge1_x * h_x + edge1_y * h_y + edge1_z * h_z;
    float f = 1.0f / a;
    float s_x = orig_x - v0_x;
    float s_y = orig_y - v0_y;
    float s_z = orig_z - v0_z;
    float u = f * (s_x * h_x + s_y * h_y + s_z * h_z);

    // Calculate q vector and v
    float q_x = s_y * edge1_z - s_z * edge1_y;
    float q_y = s_z * edge1_x - s_x * edge1_z;
    float q_z = s_x * edge1_y - s_y * edge1_x;
    float v = f * (dir_x * q_x + dir_y * q_y + dir_z * q_z);

    // Calculate t to find the intersection point
    float t = f * (edge2_x * q_x + edge2_y * q_y + edge2_z * q_z);

    //hits[rayIdx * numTriangles + triangleIdx] = 1; // Mark as hit
    hits[rayIdx * numTriangles + triangleIdx] = 1.0f; // Mark as hit
}

The launch parameters are:

numRays = 512; % Example number of rays
numTriangles = 512; % Example number of triangles
blockSize = [16,16]; % Block size (adjust based on your hardware)
gridSize = [ceil(numRays / blockSize(1)), ceil(numTriangles / blockSize(2))];
k.ThreadBlockSize = blockSize;
k.GridSize = gridSize;

After the kernel launch, the output array has numRays elements instead of numRays * numTriangles. Why?

Launching MATLAB from a terminal and adding printf calls to the kernel shows that hits is written the right number of times, so the problem seems to occur when the result is returned to MATLAB.


Answer 1

Joss Knight on 19 Sep 2024


https://in.mathworks.com/matlabcentral/answers/2153550-gpu-output-array-wrong-dimensions#answer_1519225

The outputs of your kernel will be the non-const pointer inputs to your kernel in the order they appear in your function signature. The launch parameters do not change that.

https://uk.mathworks.com/help/parallel-computing/run-cuda-or-ptx-code-on-gpu.html#bsit2so-1
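To illustrate that rule, here is a minimal sketch using a made-up kernel (scale_and_flag and its file names are hypothetical, not from the question). It has one const pointer and two non-const pointers, so feval would be expected to return two outputs, in signature order. The MATLAB call is shown only as a comment and assumes the kernel was compiled to PTX beforehand.

```cuda
// Hypothetical kernel for illustration only (not part of the original question).
__global__ void scale_and_flag(const float* in,  // const pointer: input only
                               float* scaled,    // non-const pointer: 1st output
                               float* flag,      // non-const pointer: 2nd output
                               float factor,     // scalar: input only
                               int n)            // scalar: input only
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;  // skip threads past the end of the arrays

    scaled[i] = factor * in[i];
    flag[i]   = (in[i] > 0.0f) ? 1.0f : 0.0f;
}

// MATLAB side, as a comment (file names are assumptions):
//   k = parallel.gpu.CUDAKernel('scale_and_flag.ptx', 'scale_and_flag.cu');
//   k.ThreadBlockSize = 256;
//   k.GridSize = ceil(n / 256);
//   s0 = zeros(n, 1, 'single', 'gpuArray');
//   f0 = zeros(n, 1, 'single', 'gpuArray');
//   [scaled, flag] = feval(k, in, s0, f0, single(2), int32(n));
```

Every non-const pointer still has to be passed in as an argument, because that input is what fixes the size of the corresponding output.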

It looks like you are passing your desired output as the 16th non-const input, so perhaps you are simply not retrieving that output? If that is the only kernel output, make all the other kernel inputs const pointers and then hits will be the only kernel output, or just reorder the arguments so that it's the first input, and therefore the first output.
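A sketch of the first option, assuming the ray and vertex arrays are only ever read: mark them const so that hits becomes the first kernel output. With the original signature the first output would instead be rayOrig_x, which would be consistent with the numRays-sized result reported in the question. The bounds guard is an addition for this sketch, and the intersection math is left as a placeholder comment.

```cuda
// Sketch only: same kernel name and argument order as in the question,
// with the read-only arrays declared as const pointers.
__global__ void moller_trumbore(
    const float* rayOrig_x, const float* rayOrig_y, const float* rayOrig_z,
    const float* rayDir_x,  const float* rayDir_y,  const float* rayDir_z,
    const float* vert0_x,   const float* vert0_y,   const float* vert0_z,
    const float* vert1_x,   const float* vert1_y,   const float* vert1_z,
    const float* vert2_x,   const float* vert2_y,   const float* vert2_z,
    float* hits,    // non-const: now the 1st output
    float* t_vals,  // non-const: 2nd output
    float* u_vals,  // non-const: 3rd output
    float* v_vals,  // non-const: 4th output
    int numRays,
    int numTriangles)
{
    int rayIdx      = blockIdx.x * blockDim.x + threadIdx.x;
    int triangleIdx = blockIdx.y * blockDim.y + threadIdx.y;

    // Guard added in this sketch (not in the question): the grid is sized
    // with ceil(), so it can be larger than the data when sizes do not divide evenly.
    if (rayIdx >= numRays || triangleIdx >= numTriangles) return;

    // ... intersection math from the question goes here, unchanged ...

    hits[rayIdx * numTriangles + triangleIdx] = 1.0f; // Mark as hit
}

// MATLAB side, as a comment (t_gpu, u_gpu, v_gpu are assumed names):
//   [hits_gpu, t_gpu, u_gpu, v_gpu] = feval(k, <the 15 input arrays>, ...
//       hits_gpu, t_gpu, u_gpu, v_gpu, numRays, numTriangles);
```

If t_vals, u_vals and v_vals are not needed in MATLAB, requesting only the first output should be enough once hits comes first among the non-const pointers.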

1 Comment

nicola on 19 Sep 2024

Many thanks Joss, just fixed it according to your post! Great
