gpucoder.atomicMin

Atomically compute the minimum of a value and a variable in global or shared memory

Since R2021b

    Description

    The gpucoder.atomicMin function reads a value from a global or shared GPU memory location, compares it to an operand, and writes the minimum value back to the memory location. In generated GPU code, the operation is atomic, which means that a GPU thread performs the read-modify-write operation without interference from other threads.

    A = gpucoder.atomicMin(A,B) compares B to the value of A and writes the value of min(A,B) back into A.

    Call the gpucoder.atomicMin function directly inside a for-loop that you want to execute on the GPU in the generated CUDA® code. Each iteration of the loop must be able to read and write to A.


    [A,oldA] = gpucoder.atomicMin(A,B) returns the previous value of A as oldA. When you use this syntax, use the coder.gpu.kernel pragma before the loop that contains gpucoder.atomicMin.

    Examples


    Perform a simple atomic minimum operation by using the gpucoder.atomicMin function and generate CUDA code that calls the corresponding CUDA atomicMin() API.

    In one file, write an entry-point function myAtomicMin that accepts matrix inputs a and b.

    function a = myAtomicMin(a,b)
    coder.gpu.kernelfun;
    
    for i = 1:numel(a)
        a(i) = gpucoder.atomicMin(a(i), b);
    end
    
    end
    

    To create a type for a uint32 matrix for use in code generation, use the coder.newtype function. The third argument marks each dimension as fixed-size (0) or variable-size (1), so A is a row vector with up to 30 elements and B is a fixed-size scalar.

    A = coder.newtype('uint32', [1 30], [0 1]);
    B = coder.newtype('uint32', [1 1], [0 0]);
    inputArgs = {A,B};
    

    To generate a CUDA library, use the codegen function.

    cfg = coder.gpuConfig('lib');
    cfg.GenerateReport = true;
    
    codegen -config cfg -args inputArgs myAtomicMin -d myAtomicMin
    

    The generated CUDA code contains the myAtomicMin_kernel1 kernel with calls to the atomicMin() CUDA APIs.

    //
    // File: myAtomicMin.cu
    //
    ...
    
    static __global__ __launch_bounds__(1024, 1) void myAtomicMin_kernel1(
        const uint32_T b, const int32_T i, uint32_T a_data[])
    {
      uint64_T loopEnd;
      uint64_T threadId;
    ...
    
      for (uint64_T idx{threadId}; idx <= loopEnd; idx += threadStride) {
        int32_T b_i;
        b_i = static_cast<int32_T>(idx);
        atomicMin(&a_data[b_i], b);
      }
    }
    ...
    
    void myAtomicMin(uint32_T a_data[], int32_T a_size[2], uint32_T b)
    {
      dim3 block;
      dim3 grid;
    ...
    
        cudaMemcpy(gpu_a_data, a_data, a_size[1] * sizeof(uint32_T),
                   cudaMemcpyHostToDevice);
        myAtomicMin_kernel1<<<grid, block>>>(b, i, gpu_a_data);
        cudaMemcpy(a_data, gpu_a_data, a_size[1] * sizeof(uint32_T),
                   cudaMemcpyDeviceToHost);
    ...
    
    }
    

    Input Arguments


    A — Reference to a shared or global GPU memory location, specified as a scalar. Use the reference as both an input and an output argument of gpucoder.atomicMin.

    Data Types: int32 | uint32 | uint64

    B — Operand to compare against A, specified as a scalar.

    Data Types: int32 | uint32 | uint64

    Extended Capabilities


    C/C++ Code Generation
    Generate C and C++ code using MATLAB® Coder™.

    GPU Code Generation
    Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

    Version History

    Introduced in R2021b