Main Content

Read and Analyze MAT-File with Key-Value Data

This example shows how to create a datastore for key-value pair data in a MAT-file that is the output of mapreduce. Then, the example shows how to read all the data in the datastore and sort it. This example assumes that the data in the MAT-file fits in memory.

Create a datastore from the sample file, mapredout.mat, using the datastore function. The sample file contains unique keys representing airline carrier codes and corresponding values that represent the number of flights operated by that carrier.

ds = datastore('mapredout.mat');

datastore returns a KeyValueDatastore. The datastore function automatically determines the appropriate type of datastore to create.

Preview the data using the preview function. This function does not affect the state of the datastore.

preview(ds)
ans=1×2 table
     Key        Value  
    ______    _________

    {'AA'}    {[14930]}

Read all of the data in ds using the readall function. The readall function returns a table with two columns, Key and Value.

T = readall(ds)
T=29×2 table
       Key          Value  
    __________    _________

    {'AA'    }    {[14930]}
    {'AS'    }    {[ 2910]}
    {'CO'    }    {[ 8138]}
    {'DL'    }    {[16578]}
    {'EA'    }    {[  920]}
    {'HP'    }    {[ 3660]}
    {'ML (1)'}    {[   69]}
    {'NW'    }    {[10349]}
    {'PA (1)'}    {[  318]}
    {'PI'    }    {[  871]}
    {'PS'    }    {[   83]}
    {'TW'    }    {[ 3805]}
    {'UA'    }    {[13286]}
    {'US'    }    {[13997]}
    {'WN'    }    {[15931]}
    {'AQ'    }    {[  154]}
      ⋮

T contains all the airline and flight data from the datastore in the same order in which the data was read. The table variables, Key and Value, are cell arrays.

Convert Value to a numeric array.

T.Value = cell2mat(T.Value)
T=29×2 table
       Key        Value
    __________    _____

    {'AA'    }    14930
    {'AS'    }     2910
    {'CO'    }     8138
    {'DL'    }    16578
    {'EA'    }      920
    {'HP'    }     3660
    {'ML (1)'}       69
    {'NW'    }    10349
    {'PA (1)'}      318
    {'PI'    }      871
    {'PS'    }       83
    {'TW'    }     3805
    {'UA'    }    13286
    {'US'    }    13997
    {'WN'    }    15931
    {'AQ'    }      154
      ⋮

Assign new names to the table variables.

T.Properties.VariableNames = {'Airline','NumFlights'};

Sort the data in T by the number of flights.

T = sortrows(T,'NumFlights','descend')
T=29×2 table
    Airline    NumFlights
    _______    __________

    {'DL'}       16578   
    {'WN'}       15931   
    {'AA'}       14930   
    {'US'}       13997   
    {'UA'}       13286   
    {'NW'}       10349   
    {'CO'}        8138   
    {'MQ'}        3962   
    {'TW'}        3805   
    {'HP'}        3660   
    {'OO'}        3090   
    {'AS'}        2910   
    {'XE'}        2357   
    {'EV'}        1699   
    {'OH'}        1457   
    {'FL'}        1263   
      ⋮

View a summary of the sorted table.

summary(T)
T: 29x2 table

Variables:

    Airline: cell array of character vectors
    NumFlights: double

Statistics for applicable variables:

                  NumMissing     Min       Median        Max            Mean              Std      

    Airline           0                                                                            
    NumFlights        0           69        1457        16578        4.2594e+03        5.5065e+03  

Reset the datastore to allow rereading of the data.

reset(ds)

See Also

| | |

Related Topics