Upper, middle and lower third of a histogram
Dear all,
I have data on GDP growth rates for 64 countries over 12 time periods (panel data set).
I want to see whether each country's GDP growth in a given year falls in the upper, middle or lower third of the growth rates recorded over the entire sample period, so as to construct a dummy that takes 3 values (-1 for the lower third, 0 for the middle third and 1 for the upper third).
To give you an example of my data set:
A={'country' 'obs' 'GDPpercent'
'Argentina' [2000] [ -0.7890]
'Argentina' [2001] [ -4.4088]
'Argentina' [2002] [ -10.8945]
'Argentina' [2003] [ 8.8370]
'Argentina' [2004] [ 9.0296]
'Argentina' [2005] [ 9.1790]
'Argentina' [2006] [ 8.4661]
'Argentina' [2007] [ 8.6533]
'Argentina' [2008] [ 6.7584]
'Argentina' [2009] [ 0.8502]
'Argentina' [2010] [ 9.1609]
'Argentina' [2011] [ 8.8696]
'Australia' [2000] [ 3.9514]
'Australia' [2001] [ 2.0716]
'Australia' [2002] [ 3.9038]
'Australia' [2003] [ 3.2723]
'Australia' [2004] [ 4.1558]
'Australia' [2005] [ 2.9591]
'Australia' [2006] [ 3.0814]
'Australia' [2007] [ 3.5642]
'Australia' [2008] [ 3.8321]
'Australia' [2009] [ 1.4484]
'Australia' [2010] [ 2.2569]
'Australia' [2011] [ 1.9056]
'Austria' [2000] [ 3.6676]
'Austria' [2001] [ 0.8574]
'Austria' [2002] [ 1.6937]
'Austria' [2003] [ 0.8659]
'Austria' [2004] [ 2.5896]
'Austria' [2005] [ 2.4007]
'Austria' [2006] [ 3.6698]
'Austria' [2007] [ 3.7059]
'Austria' [2008] [ 1.3962]
'Austria' [2009] [ -3.8100]
'Austria' [2010] [ 2.3147]
'Austria' [2011] [ 2.6964]};
Is there any way of doing that in MATLAB?
Any code provided is greatly appreciated.
Thanks in advance.
Answers (1)
Image Analyst
on 18 Apr 2013
Do you mean like this:
column3 = cell2mat(A(2:end,3))
plot(column3, 'bo-')
grid on;
counts = hist(column3, 3)
or, do you want to just sort the values and split them into thirds?
sortedValues = sort(column3)
oneThird = int32(length(column3)/3)
twoThirds = int32(2*length(column3)/3)
firstThird = sortedValues(1:oneThird)
secondThird = sortedValues(oneThird+1:twoThirds)
thirdThird = sortedValues(twoThirds+1:end)
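If the thirds are meant by count, the sorted split above loses track of which row each value came from. A minimal sketch of mapping the thirds back onto the original rows follows, using only the twelve Argentina values from the question as sample data. The variable names (growth, rankOfRow, dummy) and the -1/0/1 coding are assumptions for illustration, not something confirmed in the thread.

```matlab
% Sample growth rates (the twelve Argentina values from the question).
growth = [-0.7890; -4.4088; -10.8945; 8.8370; 9.0296; 9.1790; ...
           8.4661;  8.6533;   6.7584; 0.8502; 9.1609; 8.8696];

n = numel(growth);
[~, sortOrder] = sort(growth);   % sortOrder(k) = row index of the k-th smallest value
rankOfRow = zeros(n, 1);
rankOfRow(sortOrder) = (1:n)';   % rank 1 = smallest growth rate

oneThird  = round(n/3);
twoThirds = round(2*n/3);

% Assumed coding: -1 lower third, 0 middle third, 1 upper third.
dummy = zeros(n, 1);
dummy(rankOfRow <= oneThird)  = -1;
dummy(rankOfRow >  twoThirds) =  1;
```

With the full data set, growth would be the whole column3 vector, and the result could be attached to the cell array with something like A(2:end,4) = num2cell(dummy).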
5 Comments
Image Analyst
on 18 Apr 2013
I think that would be the first case, but it really depends on how you define the lower third. Is it by the range of values or by the values themselves? Look at the plot. For example, if you had 3000 points with values in the 2000 to 3000 range and only one outlier with a value of 1, then is the lower third the range of 1 - 1000, or is it 1 - 2333? The first range is one third of the range, while the second range encompasses one third of the data points, though they're all clustered into the upper third of the total range. So which is it?
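To make the distinction concrete, here is a rough sketch that builds the hypothetical outlier data described above (evenly spaced values are an assumption) and computes the lower-third cutoff under both definitions.

```matlab
% Hypothetical data: 3000 evenly spaced values in [2000, 3000] plus one outlier at 1.
values = [1; linspace(2000, 3000, 3000)'];

% Definition 1: lower third of the value range (min to min + span/3).
minValue = min(values);
maxValue = max(values);
rangeCutoff = minValue + (maxValue - minValue)/3;
inLowerRangeThird = values < rangeCutoff;

% Definition 2: lowest third of the data points by rank.
sortedValues = sort(values);
countCutoff = sortedValues(round(numel(values)/3));
inLowerCountThird = values <= countCutoff;

fprintf('Range-based cutoff: %.1f, points selected: %d\n', rangeCutoff, sum(inLowerRangeThird));
fprintf('Count-based cutoff: %.1f, points selected: %d\n', countCutoff, sum(inLowerCountThird));
```

Under the range-based definition only the single outlier lands in the lower third, while the count-based definition selects a third of the points with a cutoff near 2333.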
Image Analyst
on 18 Apr 2013
Find the max and min of column3, then find the range and divide by three. That gives you the boundaries of the three value ranges, and you can then find which values fall in each range by simple thresholding:
column3 = cell2mat(A(2:end,3))
plot(column3, 'bo-')
grid on;
minValue = min(column3)
maxValue = max(column3)
valueRange = maxValue - minValue   % named valueRange to avoid shadowing the toolbox function range()
lowerThird = column3 < (minValue + valueRange/3)
middleThird = column3 >= (minValue + valueRange/3) & column3 < (minValue + 2*valueRange/3)
upperThird = column3 >= (minValue + 2*valueRange/3)
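The three logical masks can then be folded into a single dummy variable. The sketch below is one possible way to do it, shown on the twelve Austria values from the question. The names growth, valueSpan and dummy, and the -1/0/1 coding, are assumptions for illustration.

```matlab
% Sample growth rates (the twelve Austria values from the question).
growth = [3.6676; 0.8574; 1.6937; 0.8659; 2.5896; 2.4007; ...
          3.6698; 3.7059; 1.3962; -3.8100; 2.3147; 2.6964];

minValue  = min(growth);
valueSpan = max(growth) - minValue;   % width of the full value range

lowerThird = growth <  (minValue + valueSpan/3);
upperThird = growth >= (minValue + 2*valueSpan/3);

% Assumed coding: -1 lower third, 0 middle third, 1 upper third.
% Subtracting the two masks leaves 0 for everything in the middle third.
dummy = double(upperThird) - double(lowerThird);
```

Because the thresholds come from the range of values rather than from counts, the groups need not be equal in size: here only the 2009 value falls in the lower third.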