How to make a weighted histogram with specific bins?

I'm new and I use MATLAB 2011.
I have a column of data and another column of weights:
d = [33 11 18 ... ] %data
w = [1 0.5 1 ...] %weights
There are only 11 possible values in my data, so there would be 11 bins. I need the frequency count of each value.
I got a plot to work nicely WITHOUT including the weights by changing the data to ordinal:
bins = [0 11 12 13 14 15 16 17 18 32 33];
counts = droplevels(ordinal(d,[],bins));
hist(counts);
set(gca,'XTick',1:11);
I changed it to ordinal because otherwise the bars either had large spaces between them (because the x-axis ranged from 0 to 33) or merged together (11-18 clumped together as one). I tried so many things that I couldn't list them all here.
The point is I need the plot to INCLUDE the weights in the frequency counts. I assume it can't be ordinal, so the above code is irrelevant. I've done a lot of googling, but nothing I tried has worked.
Any help is appreciated, sorry if I'm confusing.
  3 Comments
Jaclyn on 11 Jul 2013
Each value in d has a 1 or a 0.5 associated with it as the weight. Values with a 0.5 are considered half an occurrence. For example, if 11 comes up five times in the dataset but one is weighted 0.5, the frequency count in the histogram for 11 will be 4.5.
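The half-occurrence rule described here can be checked for a single value with logical indexing. This is only a minimal sketch: the data below are made up to match the example in the comment (five occurrences of 11, one of them weighted 0.5), and the name count11 is hypothetical.

```matlab
% Hypothetical data: 11 occurs five times, one occurrence is weighted 0.5.
d = [11 11 11 11 11 18]';
w = [1 1 1 1 0.5 1]';

% Weighted count for the value 11 = sum of the weights of its occurrences.
count11 = sum(w(d == 11));   % 1 + 1 + 1 + 1 + 0.5 = 4.5
```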


Accepted Answer

dpb on 11 Jul 2013
Edited: dpb on 11 Jul 2013
edges=unique(data); % vector of unique values
n=histc(data,unique(data)); % to get bin counts w/ bins specified
bar(edges,n,'histc') % plot
Use weights how desired...[edited to add]
If the answer to the question just posed is "yes" then
n=n.*w;
I don't know what bar() will do w/ a half for a count; never tried it. If it doesn't like non-integer values you can either floor() or round() the results, depending on which you think is more appropriate for your problem.
bar() is OK w/ the fractional count. If you still have trouble w/ the bar locations because the x-axis is positioned by the value of each bin rather than by a bin label, you can just use x=1:11 and then
set(gca,'xticklab', num2str(bins'))
to label them by the bin numbers. It's a key point w/ handle graphics axes object that the xtick values and the labels do not have to be the same.
doc histc % and friends for further details
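Put together, the evenly spaced bars with value labels might look like the sketch below. The sample data are made up (the real d comes from the question), and the variable x is only an illustration of the "use 1:N for position, label by value" idea; the counts here are unweighted.

```matlab
% Made-up sample data standing in for the real column d.
d = [33 11 18 11 0]';

edges = unique(d);            % sorted unique values: 0, 11, 18, 33
n = histc(d, edges);          % one count per unique value
x = 1:numel(edges);           % evenly spaced bar positions

bar(x, n)                     % bars sit at 1,2,3,... instead of at 0..33
% Tick positions stay 1..N, but the labels show the actual data values.
set(gca, 'XTick', x, 'XTickLabel', num2str(edges))
```

Because the tick positions and the tick labels are set separately, the bars stay adjacent no matter how far apart the underlying values are.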
  5 Comments
dpb on 12 Jul 2013
Edited: dpb on 14 Jul 2013
c gives, for each element of data, the index of its value in the list of unique values; it's what accumarray groups the weights over to get the total weight per bin. In the accumarray call c must be a column vector, for reasons having to do w/ how the function works; not to worry over the details at the moment, it just must be a column. In the sample I tested I had a row, hence the transpose. If you already have a column owing to the orientation of data, then remove the transpose. The weights must then also be a column vector to match.
In fact going back I see you say you do have column vectors; my bad -- remove the transpose [ ' ] from both and joy should ensue.
nwt=n.*accumarray(c,w);
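One way to sidestep the row-versus-column question is to force both inputs to columns with (:) before grouping. This is a sketch under assumptions, not part of the original answer: the data are made up, and the names c and wtot are illustrative.

```matlab
% Made-up row vectors; (:) reshapes them to columns whatever their orientation.
d = [33 11 18 11];
w = [1 0.5 1 1];

[u, ~, c] = unique(d(:));        % u = [11; 18; 33], c = index of each d into u
wtot = accumarray(c(:), w(:));   % total weight per unique value: [1.5; 1; 1]
```

With the (:) in place, the same two lines work whether d and w arrive as rows or as columns, so no transpose needs to be added or removed.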
Here's a toy example small enough you can see what's going on explicitly...
>> d=[1 1 3]'; w=[1 .5 1]';
>> [n,ib]=histc(d,unique(d));
>> [u,~,c] = unique(d);
>> [n,ib]=histc(d,u);
>> nwt=n.*accumarray(c,w)
nwt =
3
1
>> n
n =
2
1
>>
As you can see,
nwt(1)= 2*(1+0.5)
nwt(2) = 1*1
that is, each entry is the bin count multiplied by the sum of the weights that fall in that bin. You can reorder the d vector and get the same answer to prove it to yourself.
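For comparison, here is a sketch on the same toy data that computes only the per-value sum of weights, without multiplying by the count. Under the half-occurrence rule stated in the question comments (five occurrences of 11, one weighted 0.5, giving 4.5), this sum on its own appears to be the quantity that rule describes; the name wsum is my own, not from the thread.

```matlab
d = [1 1 3]'; w = [1 .5 1]';     % same toy data as above
[u, ~, c] = unique(d);           % u = [1; 3], c = [1; 1; 2]
n = histc(d, u);                 % plain counts: [2; 1]

wsum = accumarray(c, w);         % sum of weights per value: [1.5; 1]
nwt  = n .* wsum;                % count times weight sum:   [3; 1]
```

Which of the two vectors to plot depends on what the weights are meant to represent, so it is worth checking one value by hand before trusting the histogram.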
Jaclyn on 12 Jul 2013
Ohhh okay, that clears it up. It works now, thank you!! :)
