Missing counts from histcounts?

Joy Shen on 30 Aug 2023
Hi, I am randomly generating Nsim=10000 values and then binning them with histcounts, but some values seem to be missing and I'm not sure where they went. I believe that when I sum my histcounts output
sum(Hist_CVol1)
I should get Nsim, since every simulated value is binned into VBinEdges, but right now the sum is 3742. I'm not sure where the other values went, or whether I'm misunderstanding what histcounts does.
VBinEdges=[0 1E-17 0.3:0.05:0.8 1.1].*Vroom(:,1); % ft^3; the last edge is 1.1*Vroom to act as "infinity" and capture everything; we cap it later
VBinEdges_Lo=VBinEdges(1:end-1);
VBinEdges_Hi=VBinEdges(2:end);
VBinEdges_Md=VBinEdges_Lo+(diff(VBinEdges)/2);
IndEFH2 = [1:length(ExtSurgeBinEdges_Md)]'; % EFH index
IndV1 = [1:length(VBinEdges_Md)]'; %Flow volume 1 index
IndA_ps= [1:length(DamState)]'; % PS flow area index
temp1=[]; temp2=[]; temp3=[]; [temp1,temp2, temp3]=ndgrid(IndA_ps, IndV1, IndEFH2);
IndMatCVol1=[temp1(:),temp2(:), temp3(:)];
% CVol1 Node Sim
for iCombCVol1=1:size(IndMatCVol1,1) % one iteration per row of IndMatCVol1, i.e. iCombCVol1=1:624
% EFH loc 2 sim
EFH2_a=[]; EFH2_a=ExtSurgeBinEdges_Lo(IndMatCVol1(iCombCVol1,3));
EFH2_c=[]; EFH2_c=ExtSurgeBinEdges_Hi(IndMatCVol1(iCombCVol1,3));
pdEFH2=[]; pdEFH2=makedist('Uniform',EFH2_a,EFH2_c);
Esim2=[]; Esim2=random(pdEFH2,[Nsim,1]); %simulate external flood heights within bin (Uniform within bin)
Esim2=Esim2.*(Esim2>=0); % make sure it's positive
% Flow volume 1 sim
V1_a=[]; V1_a=VBinEdges_Lo(IndMatCVol1(iCombCVol1,2));
V1_c=[]; V1_c=VBinEdges_Hi(IndMatCVol1(iCombCVol1,2));
pdV1=[]; pdV1=makedist('Uniform',V1_a,V1_c);
V1sim=[]; V1sim=random(pdV1,[Nsim,1]);
V1sim=V1sim.*(V1sim>=0); % make sure the difference is never below zero
V1sim=V1sim.*(V1sim<=Vroom(:,1)); % make sure the difference is always below Vroom
%Flow volume 2 sim
% PS state sim through area
Mu_ps=[]; Mu_ps=Amean_ps(IndMatCVol1(iCombCVol1,1));
Sig_ps=[]; Sig_ps=0.2*Mu_ps;
pd_Asim_ps=[]; pd_Asim_ps = makedist('normal',Mu_ps,Sig_ps);
Asim_ps = []; Asim_ps=random(pd_Asim_ps,[Nsim,1]);
Asim_ps(Asim_ps<0)=0; % make sure the area is positive
Asim_ps(Asim_ps>A_ps)=A_ps; %make sure area is never bigger than the full area
% Volumetric flow rate 2 sim calculation
Qsim_ps2=[]; Qsim_ps2=Cd.*Asim_ps.*sqrt(2*g*(Esim2-n_ps)); %ft^3/s
Qsim_ps2(Esim2<=n_ps)=1E-20; %No flow when Esim is less than or equal to installation height
% Flow volume sim calculation
Vsim_ps2=[]; Vsim_ps2=Qsim_ps2.*Dsim; %ft^3
Vsim_ps2=Vsim_ps2.*(Vsim_ps2<=Vroom(:,1)); %Ensure Vsim is less than Vroom (3 Vroom configs)
Vsim_ps2(Esim2<=n_ps)=1E-20; %No flow when Esim is less than or equal to installation height
% Cumulative flow 1 sim calculation
CVol1sim= V1sim+Vsim_ps2;
% PMF
Hist_CVol1=[]; [Hist_CVol1,~] = histcounts(CVol1sim,VBinEdges);
Hist_CVol1=Hist_CVol1';
PMF_CVol1(:,iCombCVol1)=Hist_CVol1./sum(Hist_CVol1); % Bins it by the parents
end
  2 Comments
Image Analyst on 30 Aug 2023
I tried to reproduce and I just got "Unrecognized function or variable 'Vroom'." What is that function?
Joy Shen on 31 Aug 2023
Vroom = [6250 8125 5000];
Vroom holds three room volumes, which is why I index it with Vroom(:,1). Eventually I'll loop over every value of Vroom and store the results for each one, but for now I'm using only the first value to keep things simple.

Answers (1)

Voss on 30 Aug 2023
I suspect that the data you are using histcounts on has elements outside the range of bin edges you have specified.
For example:
% generate 10000 samples from a standard normal distribution
d = makedist('normal',0,1);
x = random(d,[10000,1]);
size(x)
ans = 1×2
10000 1
% specify some edges that don't include +/- infinity.
% the domain of normal distributions is (-Inf,Inf)
e = [0 1E-17 0.3:0.05:0.8 1.1];
% do the histogram binning
N = histcounts(x,e);
% count the number of samples within the specified bins:
N_inside = sum(N)
N_inside = 3584
% count the number of samples outside the specified bins,
% either below the first edge or above the last edge:
N_outside = nnz(x < e(1) | x > e(end))
N_outside = 6416
% the total number of samples:
N_inside + N_outside
ans = 10000
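If the goal is for every sample to land in some bin, one option is to pad the edges with -Inf and Inf so the first and last bins catch whatever falls outside the original range. The following is only a sketch under that assumption: it uses randn instead of makedist/random so it runs without the Statistics Toolbox, and the name eOpen is made up for illustration.

```matlab
% Sketch: pad the edges with -Inf/Inf so nothing falls outside the bins.
rng(1);                          % fix the seed so the run is repeatable
x = randn(10000,1);              % standard normal samples, as in the example above
e = [0 1E-17 0.3:0.05:0.8 1.1];  % the original finite edges
eOpen = [-Inf e Inf];            % hypothetical name: edges with open-ended outer bins
N = histcounts(x,eOpen);         % N(1) counts x < e(1), N(end) counts x >= e(end)
total  = sum(N);                 % equals numel(x) as long as x contains no NaN
nBelow = N(1);                   % samples below the first original edge
nAbove = N(end);                 % samples at or above the last original edge
```

With the padded edges, sum(N) matches the number of samples, and the two outer counts show how much of the data was being dropped on each side.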
  3 Comments
Voss on 31 Aug 2023
Those lines work to limit the values of each vector, but you do histcounts on the sum of two of those vectors:
CVol1sim = V1sim+Vsim_ps2;
[Hist_CVol1,~] = histcounts(CVol1sim,VBinEdges);
It looks to me like there's no guarantee that all elements of CVol1sim are within the range of VBinEdges. In fact, since V1sim and Vsim_ps2 can each be as large as Vroom(:,1), CVol1sim could be as high as 2*Vroom(:,1), whereas the last bin edge is only 1.1*Vroom(:,1).
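To check this on a small case, here is a sketch that counts how many elements of the sum fall outside the edges, and then replaces the last edge with Inf so the final bin really does act as "infinity". The two uniform vectors are stand-ins I made up for V1sim and Vsim_ps2, not the actual simulation, and edgesOpen is a hypothetical name.

```matlab
% Sketch: count the out-of-range sums, then bin with an open-ended last bin.
rng(0);                                  % repeatable random numbers
Vroom = 6250;                            % first room volume, ft^3
Nsim  = 10000;
VBinEdges = [0 1E-17 0.3:0.05:0.8 1.1].*Vroom;
V1sim    = Vroom*rand(Nsim,1);           % stand-in: each term is capped at Vroom
Vsim_ps2 = Vroom*rand(Nsim,1);           % stand-in: each term is capped at Vroom
CVol1sim = V1sim + Vsim_ps2;             % the sum can reach 2*Vroom
% elements that histcounts silently drops with the finite edges
nOutside = nnz(CVol1sim < VBinEdges(1) | CVol1sim > VBinEdges(end));
N_closed = histcounts(CVol1sim,VBinEdges);
% open-ended last bin: everything at or above 0.8*Vroom lands in it
edgesOpen = VBinEdges;
edgesOpen(end) = Inf;
N_open = histcounts(CVol1sim,edgesOpen);
```

Here sum(N_closed) + nOutside equals Nsim, and sum(N_open) equals Nsim on its own. Whether the overflow belongs in the last bin or should be capped some other way depends on what the model is supposed to represent.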
Voss on 31 Aug 2023
By the way, this logic:
V1sim=V1sim.*(V1sim<=Vroom(:,1));
sets the elements of V1sim that are greater than Vroom(:,1) to zero (or NaN if the element was Inf). Is that what you want to do?
Here's an example:
Vroom = 6250;
V1sim = [-Inf -1000 0 1 7000 Inf]
V1sim = 1×6
-Inf -1000 0 1 7000 Inf
V1sim=V1sim.*(V1sim<=Vroom(:,1))
V1sim = 1×6
-Inf -1000 0 1 0 NaN
I imagine you want to set those elements that are greater than Vroom(:,1) to Vroom(:,1), in which case, you can use logic like you have in other places:
V1sim = [-Inf -1000 0 1 7000 Inf]
V1sim = 1×6
-Inf -1000 0 1 7000 Inf
V1sim(V1sim > Vroom(:,1)) = Vroom(:,1)
V1sim = 1×6
-Inf -1000 0 1 6250 6250
(This doesn't address the original question or impact my suggestion that elements sent to histcounts are outside the range of the bin edges.)
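If both limits are wanted at once, min and max can clamp the vector in a single line. This is a sketch that assumes negative values should become 0 and values above Vroom should become Vroom; V1clamped is a name introduced here for illustration.

```matlab
% Sketch: clamp every element to the interval [0, Vroom].
Vroom = 6250;
V1sim = [-Inf -1000 0 1 7000 Inf];
% max(...,0) raises negatives (including -Inf) to 0,
% min(...,Vroom) lowers anything above Vroom (including Inf) to Vroom
V1clamped = min(max(V1sim,0),Vroom);
```

Unlike multiplying by a logical mask, this never computes Inf*0, so no NaN values are introduced.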


Release

R2022a