How to create a cross table?

I have a categorical data set (see bottom of the message). How do I count frequencies by multiple variables? Eg. frequency by Country and Artist, so that result would be as follows:

              | Abba   ACDC   Metallica   Shakira
-------------------------------------------------
Sweden        |  16      1          1         3
Australia     |   4     15          1         1
United States |   2      4         17         1
Colombia      |   1      3          5        15

Data set used:

Person   | Country         Artist      Sport        Colour
-----------------------------------------------------------
id_1     | Colombia        Shakira     Soccer       Green
id_2     | Colombia        Metallica   Rugby        Yellow
id_3     | Sweden          Abba        Ice hockey   Blue
id_4     | United States   Metallica   Baseball     Red
id_5     | United States   ACDC        Ice hockey   Yellow
id_6     | Colombia        Shakira     Soccer       Green
id_7     | Colombia        ACDC        Soccer       Yellow
id_8     | Colombia        Shakira     Rugby        Blue
id_9     | Colombia        Shakira     Soccer       Red
id_10    | Sweden          Abba        Ice hockey   Blue
id_11    | Australia       ACDC        Rugby        Yellow
id_12    | Australia       ACDC        Rugby        Yellow
id_13    | Sweden          Abba        Ice hockey   Blue
id_14    | United States   Metallica   Baseball     Red
id_15    | Colombia        ACDC        Ice hockey   Yellow
id_16    | Australia       ACDC        Rugby        Yellow
id_17    | Australia       ACDC        Rugby        Yellow
id_18    | United States   Metallica   Baseball     Red
id_19    | United States   Metallica   Baseball     Red
id_20    | Australia       Abba        Baseball     Yellow
id_21    | Sweden          Abba        Baseball     Blue
id_22    | United States   Metallica   Baseball     Red
id_23    | Sweden          Abba        Baseball     Blue
id_24    | Colombia        Shakira     Soccer       Green
id_25    | United States   Metallica   Baseball     Red
id_26    | Sweden          Abba        Ice hockey   Blue
id_27    | Sweden          Abba        Ice hockey   Blue
id_28    | Sweden          Metallica   Ice hockey   Blue
id_29    | Colombia        Shakira     Soccer       Green
id_30    | Colombia        Abba        Soccer       Blue
id_31    | Australia       ACDC        Rugby        Yellow
id_32    | Australia       ACDC        Rugby        Blue
id_33    | Australia       ACDC        Rugby        Yellow
id_34    | United States   Metallica   Baseball     Blue
id_35    | Australia       Abba        Rugby        Yellow
id_36    | Australia       ACDC        Rugby        Yellow
id_37    | United States   ACDC        Baseball     Yellow
id_38    | Australia       ACDC        Rugby        Yellow
id_39    | Sweden          Abba        Ice hockey   Blue
id_40    | Colombia        Shakira     Soccer       Green
id_41    | Sweden          Abba        Ice hockey   Blue
id_42    | Colombia        ACDC        Soccer       Red
id_43    | United States   Abba        Soccer       Green
id_44    | Sweden          Abba        Ice hockey   Blue
id_45    | Sweden          Shakira     Baseball     Blue
id_46    | Sweden          Abba        Ice hockey   Blue
id_47    | Sweden          Shakira     Ice hockey   Green
id_48    | Colombia        Metallica   Rugby        Blue
id_49    | Sweden          ACDC        Baseball     Red
id_50    | United States   ACDC        Soccer       Green
id_51    | Sweden          Abba        Ice hockey   Blue
id_52    | United States   Metallica   Baseball     Red
id_53    | United States   Metallica   Baseball     Red
id_54    | Colombia        Shakira     Soccer       Green
id_55    | United States   ACDC        Rugby        Blue
id_56    | Australia       Shakira     Soccer       Yellow
id_57    | Australia       ACDC        Rugby        Yellow
id_58    | Australia       ACDC        Rugby        Yellow
id_59    | Australia       ACDC        Rugby        Yellow
id_60    | United States   Metallica   Baseball     Red
id_61    | Colombia        Metallica   Ice hockey   Yellow
id_62    | Sweden          Abba        Baseball     Yellow
id_63    | Sweden          Abba        Soccer       Yellow
id_64    | United States   Metallica   Baseball     Red
id_65    | Sweden          Abba        Rugby        Green
id_66    | Australia       ACDC        Rugby        Yellow
id_67    | Colombia        Shakira     Soccer       Green
id_68    | United States   Metallica   Baseball     Red
id_69    | United States   Metallica   Baseball     Red
id_70    | Colombia        Shakira     Soccer       Green
id_71    | Sweden          Shakira     Ice hockey   Blue
id_72    | Sweden          Abba        Ice hockey   Blue
id_73    | United States   Shakira     Baseball     Yellow
id_74    | Australia       ACDC        Rugby        Green
id_75    | Australia       Metallica   Baseball     Red
id_76    | United States   Abba        Rugby        Yellow
id_77    | Colombia        Shakira     Soccer       Green
id_78    | Colombia        Metallica   Rugby        Yellow
id_79    | United States   Metallica   Baseball     Green
id_80    | Colombia        Shakira     Baseball     Blue
id_81    | Colombia        Shakira     Rugby        Red
id_82    | Australia       Abba        Rugby        Red
id_83    | United States   Metallica   Baseball     Red
id_84    | Colombia        Shakira     Soccer       Green
id_85    | United States   Metallica   Baseball     Red
id_86    | Australia       Abba        Ice hockey   Blue
id_87    | Colombia        Shakira     Soccer       Green
id_88    | Australia       ACDC        Rugby        Yellow
id_89    | Colombia        Metallica   Ice hockey   Red
id_90    | United States   Metallica   Baseball     Red

Answers (2)

I reformatted your data as a CSV, and read it into a table:
>> t = readtable('tmp5.csv','Format','%s%C%C%C%C','ReadRowNames',true)
t =
90×4 table
Country Artist Sport Colour
_____________ _________ __________ ______
id_1 Colombia Shakira Soccer Green
id_2 Colombia Metallica Rugby Yellow
id_3 Sweden Abba Ice hockey Blue
id_4 United States Metallica Baseball Red
id_5 United States ACDC Ice hockey Yellow
id_6 Colombia Shakira Soccer Green
id_7 Colombia ACDC Soccer Yellow
id_8 Colombia Shakira Rugby Blue
id_9 Colombia Shakira Soccer Red
id_10 Sweden Abba Ice hockey Blue
id_11 Australia ACDC Rugby Yellow
id_12 Australia ACDC Rugby Yellow
[snip]
Then count the number of rows within each artist/country combination. (This has the drawback that any combinations that are not present will not show up in the result. I think the crosstab function avoids that.):
>> tcounts = varfun(@(x) length(x), t,'GroupingVariables',{'Artist' 'Country'},'InputVariables',{})
tcounts =
16×3 table
Artist Country GroupCount
_________ _____________ __________
ACDC Australia 15
ACDC Colombia 3
ACDC Sweden 1
ACDC United States 4
Abba Australia 4
Abba Colombia 1
Abba Sweden 16
Abba United States 2
Metallica Australia 1
Metallica Colombia 5
Metallica Sweden 1
Metallica United States 17
Shakira Australia 1
Shakira Colombia 15
Shakira Sweden 3
Shakira United States 1
Depending on what you mneed, that might be your stopping point. If you want the actual cross-tabulation, you can unstack the counts by country, for each artist:
>> tcrosstab = unstack(tcounts,'GroupCount','Artist');
>> tcrosstab.Properties.RowNames = cellstr(tcrosstab.Country); tcrosstab.Country = [];
>> tcrosstab
tcrosstab =
4×4 table
ACDC Abba Metallica Shakira
____ ____ _________ _______
Australia 15 4 1 1
Colombia 3 1 5 15
Sweden 1 16 1 3
United States 4 2 17 1
This gets you a table, which provides nice named subscripting, but is not useful for further numeric calculations. Depending on what you need, you can take one more step:
>> tcrosstab.Variables
ans =
15 4 1 1
3 1 5 15
1 16 1 3
4 2 17 1
Star Strider
Star Strider on 11 May 2017
See if the Statistics and Machine Learning Toolbox crosstab (link) function will do what you want.

Asked:

on 11 May 2017

Answered:

on 12 May 2017

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!