Value to differentiate is not traced. It must be a traced real dlarray scalar. Use dlgradient inside a function called by dlfeval to trace the variables.

I'm training a network that predicts 9 noise current values (the first 6 are magnitudes, the last 3 are phases) with a custom loss function, but I get this error when training the network.
Error using dlarray/dlgradient (line 105)
Value to differentiate is not traced. It must be a traced real dlarray scalar. Use dlgradient inside a function called by dlfeval to trace the variables.
The following is my custom loss function.
function [total_loss, gradients] = forwardLoss_3port_V4(dlnet, dX, dY8, dY18)
% Forward pass through the network head
rawOut = forward(dlnet, dX);
rawOut = extractdata(rawOut);
dY8 = extractdata(dY8);
dY18 = extractdata(dY18);
B = size(dY8 , 2);
%Division of predicted data
mags_pred = 1./(1 + exp(-rawOut(1:6,:)));
phases_pred = pi*tanh(rawOut(7:9,:));
%Creating minmax values for the true values
phase_tar12 = atan2(dY8(3,:), dY8(4,:));
re12 = dY8(3,:);
im12 = dY8(4,:);
mag_tar12 = hypot(re12, im12);
a = log10(dY8(1,:));
b = log10(dY8(7,:));
c = log10(max(mag_tar12, realmin('double')));
yLog = [a,b,c];
minmax_target = [min(yLog, [], 1).', max(yLog, [], 1).'];
%revamp the predicted magnitudes and build the Cy3
preC11 = mags_pred(1,:) .* (minmax_target(1,2) - minmax_target(1,1)) + minmax_target(1,1);
C11 = 10.^preC11;
preC22 = mags_pred(2,:) .* (minmax_target(2,2) - minmax_target(2,1)) + minmax_target(2,1);
C22 = 10.^preC22;
preC33 = mags_pred(3,:) .* (minmax_target(1,2) - minmax_target(1,1)) + minmax_target(1,1);
C33 = 10.^preC33;
pre_offdiag = mags_pred(4:6,:) .* (minmax_target(3,2) - minmax_target(3,1)) + minmax_target(3,1);
magC12 = 10.^pre_offdiag(1,:);
magC13 = 10.^pre_offdiag(2,:);
magC23 = 10.^pre_offdiag(3,:);
% Build complex off-diagonals from mag & phase
C12 = magC12 .* exp(1i * phases_pred(1,:)); % [1×B]
C13 = magC13 .* exp(1i * phases_pred(2,:)); % [1×B]
C23 = magC23 .* exp(1i * phases_pred(3,:));
Cy3 = dlarray(complex(zeros(3,3,B)));
% Diagonals
Cy3(1,1,:) = reshape(C11, 1,1,[]);
Cy3(2,2,:) = reshape(C22, 1,1,[]);
Cy3(3,3,:) = reshape(C33, 1,1,[]);
% Off-diagonals
Cy3(1,2,:) = reshape(C12, 1,1,[]); Cy3(2,1,:) = conj(Cy3(1,2,:));
Cy3(1,3,:) = reshape(C13, 1,1,[]); Cy3(3,1,:) = conj(Cy3(1,3,:));
Cy3(2,3,:) = reshape(C23, 1,1,[]); Cy3(3,2,:) = conj(Cy3(2,3,:));
% Y3 from flattened [Re(9) Im(9)]
Y3r_flat = reshape(dY18(1:9,:), 3,3,[]);
Y3i_flat = reshape(dY18(10:18,:), 3,3,[]);
Y3_all = Y3r_flat + 1i*Y3i_flat;
% Z_source (unscale from normalized features)
Zs = rescaleVector(dX(end-1,:), 10, 50, 0, 1) + 1j*rescaleVector(dX(end,:), -50, 50, -1, 1) ;
Ys = 1./(Zs); % 1x1xfreq admittance
% Source noise (targets include it)
G2 = 2*physconst('boltzmann')*290*real(Ys);
G2 = reshape(G2, 1,1,[]);
Y2 = reshape(Ys, 1,1,[]);
% Which node of the 3-port is connected to the 1-port
ports1 = 3; ports2 = 1;
% Cascade → external 2-port
Gc = cascadeNoiseCorrelation(Y3_all, Cy3, Y2, G2, ports1, ports2); % 2x2xfreq
% Cascade elements
y11n_pred = log10(real(Gc(1,1,:)));
y22n_pred = log10(real(Gc(2,2,:)));
y12n_pred = log10(hypot(real(Gc(1,2,:)), imag(Gc(1,2,:))));
y12p_pred = atan2(imag(Gc(1,2,:)), real(Gc(1,2,:)));
% 2x2 target
y11n_tar = log10(dY8(1,:));
y22n_tar = log10(dY8(7,:));
y12n_tar = c;
y12p_tar = phase_tar12;
deltaphi = y12p_pred - y12p_tar;
deltaphi = atan2(sin(deltaphi), cos(deltaphi));
mag12_term = (y12n_tar - y12n_pred).^2;
alpha = 0.1;
phase12_term = 0.5 .* (y12n_tar.^2) .* (1-cos(pi*deltaphi));
beta = 1.0;
mag11_term = (y11n_tar - y11n_pred).^2;
mag22_term = (y22n_tar - y22n_pred).^2;
per_ex = mag12_term + alpha.*phase12_term + beta.*(mag11_term + mag22_term);
total_loss = mean(per_ex, 'all');
total_loss = dlarray(total_loss);
% Backprop
gradients = dlgradient(total_loss, dlnet.Learnables);
end
The cascadeNoiseCorrelation function is written below (in case needed for analysis):
function [Gc] = cascadeNoiseCorrelation(Y1, G1, Y2, G2, ports1, ports2)
% Cascades current noise correlation matrices over frequency
%
% Inputs:
% Y1,Y2 m×m×f and n×n×f admittances, respectively
% G1,G2 m×m×f and n×n×f noise correlation matrices, respectively
% ports1,ports2 vector with connecting ports of each Network of same length c
%
% Outputs:
% Gc (m+n-c)×(m+n-c)×f current noise correlation matrix
% Preallocate:
% External Ports
ext1 = setdiff(1:size(Y1,1), sort(ports1));
ext2 = setdiff(1:size(Y2,1), sort(ports2));
rows1 = [ext1, ports1];
rows2 = [ext2, ports2];
freq = size(Y1,3);
N = numel(ext1) + numel(ext2);
Gc = complex(zeros(N,N,freq));
% block partitioning
m_ext = numel(ext1);
n_ext = numel(ext2);
for k = 1:freq
% Temporary 2D-Matrices
Y1k = Y1(:,:,k);
Y2k = Y2(:,:,k);
G1k = G1(:,:,k);
G2k = G2(:,:,k);
% Permutated Matrices
Y1p = Y1k(rows1, rows1);
Y2p = Y2k(rows2, rows2);
G1p = G1k(rows1, rows1);
G2p = G2k(rows2, rows2);
% Y-Parameters from external to internal
Y_ei = [ Y1p(1:m_ext,m_ext+1:end); Y2p(1:n_ext,n_ext+1:end) ];
% Combined internal Yii parameters
Y_ii = Y1p(m_ext+1:end,m_ext+1:end) + Y2p(n_ext+1:end,n_ext+1:end);
% Noise contributions separated into:
% external to external
G_ee = blkdiag(G1p(1:m_ext,1:m_ext), G2p(1:n_ext,1:n_ext));
% internal to internal
G_ii = G1p(m_ext+1:end,m_ext+1:end) + G2p(n_ext+1:end,n_ext+1:end);
% external to internal
G_ei = [G1p(1:m_ext,m_ext+1:end) ; G2p(1:n_ext,n_ext+1:end)];
% internal to external
G_ie = [G1p(m_ext+1:end,1:m_ext) , G2p(n_ext+1:end,1:n_ext)];
% total noise correlation matrix
G_tot = [G_ii, G_ie; G_ei, G_ee];
% Noise current transition function from internal currents to external ports
Hj = -Y_ei/(Y_ii);
% Total noise transition matrix
Htot = [Hj, eye(N)];
% Total noise correlation matrix
Gc(:,:,k) = Htot*G_tot*Htot';
end
end
My training initialization is also shown:
function initDLnet(obj)
% Define DL Network
% define input layer
networkLayers = [featureInputLayer(obj.InputDim, 'Name', 'ParameterInputs')];
% define hidden layer
for i_layer = 1:obj.numHiddenLayer
networkLayers = [
networkLayers
fullyConnectedLayer(round(obj.InputDim*obj.hiddenLayerScaling), 'Name', "fc"+num2str(i_layer))
layerNormalizationLayer('Name',"ln"+num2str(i_layer))
reluLayer('Name', "relu"+num2str(i_layer))
];
end
% define output layer
networkLayers = [
networkLayers
fullyConnectedLayer(obj.OutputDim, 'Name', 'fcout');
% tanhLayer("Name", "dOutput")
% sigmoidLayer('Name', 'Output');
% reluLayer('Name', 'dOutput');
];
obj.DLnet = dlnetwork(networkLayers);
end

Answers (1)

Matt J about 4 hours ago
Edited: Matt J 8 minutes ago
rawOut = extractdata(rawOut);
dY8 = extractdata(dY8);
dY18 = extractdata(dY18);
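If you convert all your inputs to ordinary arrays with extractdata at the top of your code, then none of the operations that follow will be traced. For dlgradient to work, the computation from the network's learnables to the loss must be an unbroken chain of operations, each one taking a dlarray and returning a dlarray. Wrapping the final value in dlarray(total_loss) at the end does not restore the trace, because it creates a new, untraced dlarray.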
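As a rough illustration of that point, here is a minimal sketch of a loss function that stays traced from the network output to the scalar loss. It is not the 3-port loss from the question: the function name toyLoss, the two-layer network, the random data, and the plain mean-squared-error loss are all assumptions made up for this example. It only shows the pattern of calling the loss through dlfeval, never calling extractdata on the network output, and passing the computed loss straight to dlgradient.

```matlab
% Minimal sketch (hypothetical names and sizes, not the original loss).
% Save as a script; in older releases local functions must sit at the end of the file.
layers = [featureInputLayer(4) fullyConnectedLayer(9)];
net = dlnetwork(layers);

X = dlarray(rand(4, 8), 'CB');   % 4 features, batch of 8 (toy data)
T = dlarray(rand(9, 8), 'CB');   % 9 toy targets per example

% dlfeval switches tracing on; dlgradient must run inside the evaluated function.
[loss, gradients] = dlfeval(@toyLoss, net, X, T);

function [loss, gradients] = toyLoss(net, X, T)
    rawOut = forward(net, X);                 % traced dlarray; no extractdata here
    magsPred   = sigmoid(rawOut(1:6, :));     % dlarray in, dlarray out
    phasesPred = pi * tanh(rawOut(7:9, :));   % still traced
    pred = [magsPred; phasesPred];
    loss = mean((pred - T).^2, 'all');        % traced real scalar; do not re-wrap in dlarray
    gradients = dlgradient(loss, net.Learnables);
end
```

If numeric values are needed for logging or plotting, extractdata(loss) can be called after dlfeval returns, once the gradients have already been computed.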
