gpucoder.atomicAdd

Atomically add value and variable in global or shared memory

Since R2021b

Syntax

A = gpucoder.atomicAdd(A,B)

[A,oldA] = gpucoder.atomicAdd(A,B)

Description

The gpucoder.atomicAdd function reads a value from a global or shared GPU memory location, adds an operand to the value, and writes the result back to the memory location. In generated GPU code, the operation is atomic, which means that a GPU thread performs the read-modify-write operation without interference from other threads.

A = gpucoder.atomicAdd(A,B) atomically adds the value of B to the value in the global or shared GPU memory location, A, and writes the result back into A.

Call the gpucoder.atomicAdd function directly inside a for-loop that you want to execute on the GPU in the generated CUDA® code. Each iteration of the loop must be able to read and write to A.

[A,oldA] = gpucoder.atomicAdd(A,B) returns the previous value of A as oldA. When you use this syntax, use the coder.gpu.kernel pragma before the loop that contains gpucoder.atomicAdd.
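The two-output syntax can be useful when each thread needs a unique slot, because oldA is the value that was in A just before that thread's addition. The following is a minimal, untested sketch rather than a MathWorks example: the function name compactPositive and its variables are hypothetical, and whether GPU Coder maps the loop to a single kernel in this exact form is an assumption.

function [out, n] = compactPositive(X) %#codegen
% Hypothetical sketch: copy the positive elements of X to the front of out.
out = zeros(1, numel(X));
n = 0;                      % running count of elements written so far
coder.gpu.kernel;           % pragma placed before the loop that calls atomicAdd
for i = 1:numel(X)
    if X(i) > 0
        % oldN is the count before this increment, so oldN+1 is a slot
        % that no other iteration receives
        [n, oldN] = gpucoder.atomicAdd(n, 1);
        out(oldN + 1) = X(i);
    end
end
end

When the loop runs in MATLAB, the elements appear in their original order. In generated GPU code, threads can reach the atomic addition in any order, so the order of the first n elements of out is not guaranteed; only the set of values is.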

Examples

Create Histogram by Atomically Adding Values

Create a histogram of a vector by using the gpucoder.atomicAdd function to parallelize the computations in generated CUDA code.

Write an entry-point function named numberHistogram that accepts the matrix input X and outputs the number of times each integer from 0 through 31 appears in X. Use a for-loop and addition to count the number of times each integer appears.

function counts = numberHistogram(X)
counts = zeros(1, 32);
for i=1:numel(X)
    curr = X(i);
    if 0 <= curr && curr < 32
        counts(curr+1) = counts(curr+1)+1;
    end
end

The function calculates the histogram in MATLAB®. However, mapping this loop to a kernel would generate a kernel with a data race. Because multiple iterations of the loop can increment the same element of counts, multiple threads can read and write to an element of counts at the same time. To perform the increment operation without interference from other threads, use gpucoder.atomicAdd to make the addition operation atomic.

function counts = numberHistogram(X)
counts = zeros(1, 32);
for i=1:numel(X)
    curr = X(i);
    if 0 <= curr && curr < 32
        counts(curr+1) = gpucoder.atomicAdd(counts(curr+1),1);
    end
end

Create an input variable named X that contains random integers from 0 to 31.

X = randi([0 31],[1 2^10]);

To generate a CUDA MEX function, use the codegen command.

cfg = coder.gpuConfig("mex");
codegen -config cfg -args {X} numberHistogram

The generated CUDA code contains a kernel named numberHistogram_kernel2 that calculates the histogram. The kernel uses atomic addition to prevent threads from reading from or writing to an element of counts while another thread performs the addition operation.

static __global__
    __launch_bounds__(128, 1) void numberHistogram_kernel2(const real_T X[1024],
                                                           real_T counts[32])
{
  int32_T i;
  i = static_cast<int32_T>(mwGetGlobalThreadIndex());
  if (i < 1024) {
    real_T b;
    b = X[i];
    if ((b >= 0.0) && (b < 32.0)) {
      gpu_atomicAdd(&counts[static_cast<int32_T>(b + 1.0) - 1], 1.0);
    }
  }
}
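One way to check the generated MEX function is to compare its output with a reference histogram computed in MATLAB. This sketch assumes that the codegen command above succeeded and produced numberHistogram_mex (the default MEX name) in the current folder, and that a supported GPU is available. The variable names countsMex and countsRef are illustrative.

% Assumes numberHistogram_mex exists from the codegen step above
X = randi([0 31],[1 2^10]);              % same size and type as the -args example
countsMex = numberHistogram_mex(X);      % histogram computed by the CUDA MEX
countsRef = histcounts(X, -0.5:1:31.5);  % 32 unit-width bins centered on 0 through 31
assert(isequal(countsMex, countsRef), ...
    "MEX histogram differs from the histcounts reference");

Because each atomic addition adds exactly 1 to a double-precision count, the totals are exact regardless of thread ordering, so an exact comparison with isequal is reasonable here.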

Input Arguments

`A` — Reference to shared or global memory location
scalar

Reference to a shared or global GPU memory location, specified as a scalar. Use the reference as an input and output argument of gpucoder.atomicAdd.

Data Types: double | single | int32 | uint32 | uint64

`B` — Operand
scalar

Operand, specified as a scalar.

Data Types: double | single | int32 | uint32 | uint64

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

Version History

Introduced in R2021b
