Upper, middle and lower third of a histogram

Dear all,
I have data on GDP growth rates for 64 countries over 12 time periods (panel data set).
I want to see whether the GDP growth for each country in each year was in the upper, middle or lower third of the growth rates recorded over the entire sample period, so as to construct a dummy that takes 3 values (-1 for the lower third, 0 for the middle third and 1 for the upper third).
To give you an example of my data set
A={'country' 'obs' 'GDPpercent'
'Argentina' [2000] [ -0.7890]
'Argentina' [2001] [ -4.4088]
'Argentina' [2002] [ -10.8945]
'Argentina' [2003] [ 8.8370]
'Argentina' [2004] [ 9.0296]
'Argentina' [2005] [ 9.1790]
'Argentina' [2006] [ 8.4661]
'Argentina' [2007] [ 8.6533]
'Argentina' [2008] [ 6.7584]
'Argentina' [2009] [ 0.8502]
'Argentina' [2010] [ 9.1609]
'Argentina' [2011] [ 8.8696]
'Australia' [2000] [ 3.9514]
'Australia' [2001] [ 2.0716]
'Australia' [2002] [ 3.9038]
'Australia' [2003] [ 3.2723]
'Australia' [2004] [ 4.1558]
'Australia' [2005] [ 2.9591]
'Australia' [2006] [ 3.0814]
'Australia' [2007] [ 3.5642]
'Australia' [2008] [ 3.8321]
'Australia' [2009] [ 1.4484]
'Australia' [2010] [ 2.2569]
'Australia' [2011] [ 1.9056]
'Austria' [2000] [ 3.6676]
'Austria' [2001] [ 0.8574]
'Austria' [2002] [ 1.6937]
'Austria' [2003] [ 0.8659]
'Austria' [2004] [ 2.5896]
'Austria' [2005] [ 2.4007]
'Austria' [2006] [ 3.6698]
'Austria' [2007] [ 3.7059]
'Austria' [2008] [ 1.3962]
'Austria' [2009] [ -3.8100]
'Austria' [2010] [ 2.3147]
'Austria' [2011] [ 2.6964]};
Is there any way of doing that in MATLAB?
Any code provided is greatly appreciated.
Thanks in advance.

Answers (1)

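Do you mean like this:
% Pull the numeric growth rates out of the cell array (skip the header row)
column3 = cell2mat(A(2:end,3))
plot(column3, 'bo-')
grid on;
% Count how many observations fall into each of 3 equal-width bins
counts = hist(column3, 3)
Or do you want to just sort the values and split them into thirds, so that each third holds the same number of points?
sortedValues = sort(column3)
% Index of the last element of the first and second thirds
oneThird = round(length(column3)/3)
twoThirds = round(2*length(column3)/3)
firstThird = sortedValues(1:oneThird)
secondThird = sortedValues(oneThird+1:twoThirds)
thirdThird = sortedValues(twoThirds+1:end)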
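Sorting alone loses track of which country and year each value came from. A minimal sketch of one way to keep that link is below, using the second output of sort to rank each observation and then turning the rank into a -1/0/1 label. The sample values and the variable names order, ranks and dummy are only illustrative, and tied values are assumed to be split in whatever order sort returns them.

```matlab
% Illustrative sample of growth rates (not the full data set)
column3 = [3.9; -4.4; 8.8; 0.9; 2.1; -10.9];
n = numel(column3);

% order(k) is the position in column3 of the k-th smallest value
[~, order] = sort(column3);

% ranks(i) is the rank (1 = smallest) of observation i, in original order
ranks = zeros(n, 1);
ranks(order) = (1:n)';

% Map ranks to thirds by count: 1, 2 or 3, then shift to -1, 0 or 1
dummy = ceil(3 * ranks / n) - 2;
```

Here dummy has one entry per row of column3, in the original row order, so it can be placed next to the country and year columns.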

5 Comments

antonet on 18 Apr 2013
Edited: antonet on 18 Apr 2013
Thanks a lot.
Well, I want to construct the histogram of annual real GDP growth rates for the entire sample period. If the current observation of annual growth falls into the lower third of this distribution, the indicator is assigned a value of -1 for that year, a 0 if it falls in the middle third and a 1 if it falls in the upper third.
I think that would be the first case, but it really depends on how you define the lower third. Is it by the range of values or by the values themselves? Look at the plot. For example, if you had 3000 points with values in the 2000 to 3000 range and only one outlier with a value of 1, then is the lower third the range of 1 - 1000, or is it 1 - 2333? The first range is one third of the range, while the second range encompasses one third of the data points, though they're all clustered into the upper third of the total range. So which is it?
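The difference between the two definitions can be checked numerically. The sketch below builds the hypothetical data described above (one outlier at 1 plus 3000 points spread evenly between 2000 and 3000, which is an assumption about how the points are distributed) and counts how many observations land in the "lower third" under each definition.

```matlab
% One outlier at 1 plus 3000 evenly spaced points in [2000, 3000]
values = [1; linspace(2000, 3000, 3000)'];
n = numel(values);

% Definition 1: lower third of the RANGE of values
rangeCut = min(values) + (max(values) - min(values)) / 3;
nByRange = sum(values < rangeCut);     % only the outlier is below this cut

% Definition 2: lower third of the DATA POINTS (by count)
sortedValues = sort(values);
countCut = sortedValues(round(n / 3)); % value at the one-third position
nByCount = sum(values <= countCut);    % about a third of all observations
```

With these numbers rangeCut is close to 1000 while countCut is close to 2333, so the two rules give very different groupings.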
antonet on 18 Apr 2013
Edited: antonet on 18 Apr 2013
Yes, you are right. It is the first case. But how can I assign these dummies (-1, 0, 1) to each year for each country in MATLAB in an "automatic" way? I have a large number of countries and years.
Thanks a lot
antonet on 18 Apr 2013
Edited: antonet on 18 Apr 2013
Please, I am so close...
Find the max and min of column3, then find the range and divide by three. Now you know the boundaries of the three value ranges, and you can find out which observations fall in each one by simple thresholding:
column3 = cell2mat(A(2:end,3))
plot(column3, 'bo-')
grid on;
minValue = min(column3)
maxValue = max(column3)
% Named valueRange so it does not shadow the range() function
valueRange = maxValue - minValue
% Logical vectors: true where the observation falls in that third of the range
lowerThird = column3 < (minValue + valueRange/3)
middleThird = column3 >= (minValue + valueRange/3) & column3 < (minValue + 2*valueRange/3)
upperThird = column3 >= (minValue + 2*valueRange/3)
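The thresholding above gives three logical vectors, not yet a single -1/0/1 indicator. A minimal sketch of how they could be combined into one dummy and attached back to the cell array is below. It uses a small excerpt of A, and the names lowerCut, upperCut, dummy and B are illustrative rather than part of the original answer.

```matlab
% Small excerpt of the cell array from the question
A = {'country',   'obs', 'GDPpercent'
     'Argentina', 2000,  -0.7890
     'Argentina', 2001,  -4.4088
     'Argentina', 2002,  -10.8945
     'Argentina', 2003,  8.8370
     'Australia', 2000,  3.9514
     'Australia', 2001,  2.0716};

column3 = cell2mat(A(2:end, 3));

% Boundaries of the three equal-width value ranges
minValue = min(column3);
maxValue = max(column3);
valueRange = maxValue - minValue;
lowerCut = minValue + valueRange / 3;
upperCut = minValue + 2 * valueRange / 3;

% Start at 0 (middle third), then overwrite the lower and upper thirds
dummy = zeros(size(column3));
dummy(column3 < lowerCut) = -1;
dummy(column3 >= upperCut) = 1;

% Append the dummy as a fourth column, with its own header
B = [A, [{'dummy'}; num2cell(dummy)]];
```

Because dummy is computed on the whole column at once, the same lines work unchanged for all 64 countries and 12 periods; no loop over countries or years is needed.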

Asked:

on 18 Apr 2013
