Using dlarray with betarnd/randg
I am writing a custom layer with the Deep Learning Toolbox, and part of the layer's forward pass draws from a beta distribution whose b parameter is to be optimised as part of the network training. However, I am having difficulty using betarnd (and, by extension, randg) with a dlarray-valued parameter.
Consider the following, which works as expected.
>> betarnd(1, 0.1)
ans =
0.2678
However, if I instead pass a dlarray-valued parameter, the same call fails.
>> b = dlarray(0.1)
b =
1×1 dlarray
0.1000
>> betarnd(1, b)
Error using randg
SHAPE must be a full real double or single array.
Error in betarnd (line 34)
g2 = randg(b,sizeOut); % could be Infs or NaNs
Is it not possible to use such functions with parameters to be optimised via automatic differentiation (hence dlarray)?
Many thanks
Accepted Answer
Matt J on 20 Jun 2024 (edited 20 Jun 2024)
Random number generation operations do not have derivatives in the standard sense. You will have to define some approximate derivative for yourself by implementing a backward() method.
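A minimal sketch of where that backward() method would live, assuming a single-input custom layer with one learnable shape parameter. The class name, the way the draw is used inside forward, and the helpers forward_pass, backward_pass and grad_gaminv (sketched in the comments below) are illustrative, not a tested implementation:
classdef betaSampleLayer < nnet.layer.Layer
    % Illustrative layer: draws a Beta(1, Beta) sample in its forward pass
    % and supplies a hand-written, approximate backward pass.
    properties (Learnable)
        Beta    % shape parameter b, optimised during training
    end
    methods
        function layer = betaSampleLayer(name)
            layer.Name = name;
            layer.Beta = 0.1;
        end
        function Z = predict(layer, X)
            % Inference path: use the Beta(1, b) mean, 1/(1 + b), as a
            % deterministic stand-in (illustrative choice).
            Z = X .* (1 ./ (1 + layer.Beta));
        end
        function [Z, memory] = forward(layer, X)
            % Training path: draw outside the autodiff trace and stash what
            % backward() will need.
            b = layer.Beta;
            if isdlarray(b)
                b = extractdata(b);   % gaminv/betarnd need a plain numeric shape
            end
            [z, U1, U2] = forward_pass(1, b);   % reparameterized draw (see below)
            Z = X .* z;                         % example use of the draw
            memory = struct('U1', U1, 'U2', U2, 'z', z);
        end
        function [dLdX, dLdBeta] = backward(layer, X, ~, dLdZ, memory)
            % Approximate derivative of the draw w.r.t. Beta, uniforms held fixed.
            b = layer.Beta;
            if isdlarray(b), b = extractdata(b); end
            [~, dz_db] = backward_pass(1, b, memory.U1, memory.U2, @grad_gaminv);
            dLdX    = dLdZ .* memory.z;
            dLdBeta = sum(dLdZ .* X .* dz_db, 'all');
        end
    end
end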
2 Comments
Matt J on 20 Jun 2024 (edited 22 Jun 2024)
You will have to define some approximate derivative for yourself by implementing a backward() method.
One candidate would be to reparametrize the beta distribution in terms of uniform random variables, U1 and U2, which you would save during forward propagation,
function [Z, U1, U2] = forward_pass(alpha, beta)
    % Generate uniform random variables
    U1 = rand();
    U2 = rand();
    % Generate Gamma(alpha, 1) and Gamma(beta, 1) using the inverse CDF (ppf)
    X = gaminv(U1, alpha, 1);
    Y = gaminv(U2, beta, 1);
    % Combine to get Beta(alpha, beta)
    Z = X / (X + Y);
end
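As a quick sanity check (with alpha = 2 and beta = 5 chosen arbitrarily), the reparameterized draw should match the Beta(alpha, beta) distribution, for example its mean alpha/(alpha+beta):
a = 2; b = 5; n = 1e5;
Z = zeros(n, 1);
for k = 1:n
    Z(k) = forward_pass(a, b);   % each call draws fresh uniforms internally
end
fprintf('empirical mean %.3f, theoretical mean %.3f\n', mean(Z), a/(a+b));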
With U1 and U2 held fixed, Z becomes an ordinary deterministic function of alpha and beta, so its derivatives are well-defined. During back propagation, your backward() method would therefore differentiate non-stochastically with respect to alpha and beta, using the saved U1 and U2 data as fixed and given values,
function [dZ_dalpha, dZ_dbeta] = backward_pass(alpha, beta, U1, U2, grad_gaminv)
    % Differentiate gaminv with respect to the shape parameters alpha and beta
    dX_dalpha = grad_gaminv(U1, alpha);
    dY_dbeta = grad_gaminv(U2, beta);
    % Compute partial derivatives of Z with respect to X and Y
    X = gaminv(U1, alpha, 1);
    Y = gaminv(U2, beta, 1);
    dZ_dX = Y / (X + Y)^2;
    dZ_dY = -X / (X + Y)^2;
    % Use the chain rule to compute gradients with respect to alpha and beta
    dZ_dalpha = dZ_dX * dX_dalpha;
    dZ_dbeta = dZ_dY * dY_dbeta;
end
This assumes you have provided a function grad_gaminv() which can differentiate gaminv(), e.g.,
function grad = grad_gaminv(U, shape)
    % Placeholder for the actual derivative computation of gaminv with respect
    % to the shape parameter; here we use a numerical approximation for demonstration
    delta = 1e-6;
    grad = (gaminv(U, shape + delta, 1) - gaminv(U, shape, 1)) / delta;
end
DISCLAIMER: All code above was ChatGPT-generated.
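One way to gain some confidence in the pieces above (again with arbitrary illustrative values alpha = 2, beta = 5) is to compare backward_pass against a finite difference of the reparameterized sample with the uniforms held fixed:
a = 2; b = 5; delta = 1e-6;
[Z, U1, U2] = forward_pass(a, b);
[~, dZ_db] = backward_pass(a, b, U1, U2, @grad_gaminv);
% Re-evaluate the sample with the same uniforms but beta nudged by delta
X  = gaminv(U1, a, 1);
Yp = gaminv(U2, b + delta, 1);
fd = (X / (X + Yp) - Z) / delta;
fprintf('dZ/dbeta: backward_pass %.6f, finite difference %.6f\n', dZ_db, fd);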