transprobprep

Preprocess credit ratings data to estimate transition probabilities

Description

[prepData] = transprobprep(data) preprocesses credit ratings historical data (that is, credit migration data) for the subsequent estimation of transition probabilities.

[prepData] = transprobprep(___,Name,Value) adds optional name-value pair arguments.

Examples

Load input data from the file Data_TransProb.mat. In this example, the inputs are provided in character vector format.

load Data_TransProb
  
% Preprocess credit ratings data.
prepData = transprobprep(data)
prepData = struct with fields:
           idStart: [1506x1 double]
      numericDates: [4315x1 double]
    numericRatings: [4315x1 double]
     ratingsLabels: {'AAA'  'AA'  'A'  'BBB'  'BB'  'B'  'CCC'  'D'}

Estimate transition probabilities with the default settings.

transMat = transprob(prepData)
transMat = 8×8

   93.1170    5.8428    0.8232    0.1763    0.0376    0.0012    0.0001    0.0017
    1.6166   93.1518    4.3632    0.6602    0.1626    0.0055    0.0004    0.0396
    0.1237    2.9003   92.2197    4.0756    0.5365    0.0661    0.0028    0.0753
    0.0236    0.2312    5.0059   90.1846    3.7979    0.4733    0.0642    0.2193
    0.0216    0.1134    0.6357    5.7960   88.9866    3.4497    0.2919    0.7050
    0.0010    0.0062    0.1081    0.8697    7.3366   86.7215    2.5169    2.4399
    0.0002    0.0011    0.0120    0.2582    1.4294    4.2898   81.2927   12.7167
         0         0         0         0         0         0         0  100.0000

Estimate transition probabilities with the 'cohort' algorithm.

transMatCoh = transprob(prepData,'algorithm','cohort')
transMatCoh = 8×8

   93.1345    5.9335    0.7456    0.1553    0.0311         0         0         0
    1.7359   92.9198    4.5446    0.6046    0.1560         0         0    0.0390
    0.1268    2.9716   91.9913    4.3124    0.4711    0.0544         0    0.0725
    0.0210    0.3785    5.0683   89.7792    4.0379    0.4627    0.0421    0.2103
    0.0221    0.1105    0.6851    6.2320   88.3757    3.6464    0.2873    0.6409
         0         0    0.0761    0.7230    7.9909   86.1872    2.7397    2.2831
         0         0         0    0.3094    1.8561    4.5630   80.8971   12.3743
         0         0         0         0         0         0         0  100.0000
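Both matrices above are expressed in percent, so each row should sum to 100. The following is a minimal sanity-check sketch that reuses the same preprocessed data for both algorithms. It assumes Data_TransProb.mat and the transprob function are available on the path; the tolerance value tol is an arbitrary choice made for illustration.

```matlab
% Preprocess once, then reuse prepData for both estimation calls.
load Data_TransProb
prepData = transprobprep(data);

transMat    = transprob(prepData);                       % default settings
transMatCoh = transprob(prepData,'algorithm','cohort');  % cohort algorithm

% Each row of a transition matrix in percent should sum to 100.
tol = 1e-6;  % illustrative tolerance, not taken from the documentation
rowSums    = sum(transMat,2);
rowSumsCoh = sum(transMatCoh,2);

okDefault = all(abs(rowSums - 100) < tol);
okCohort  = all(abs(rowSumsCoh - 100) < tol);
fprintf('Default rows sum to 100: %d\n', okDefault);
fprintf('Cohort rows sum to 100: %d\n', okCohort);
```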

Input Arguments

Historical input data for credit ratings (the data argument), specified as one of the following:

  • A MATLAB® table of size nRecords-by-3 containing the credit ratings. Each row contains an ID (column 1), a date (column 2), and a credit rating (column 3). The assigned credit rating corresponds to the associated ID on the associated date. All information corresponding to the same ID must be stored in contiguous rows. Sorting this information by date is not required, but recommended for efficiency. When using a MATLAB table input, the names of the columns are irrelevant, but the ID, date and rating information are assumed to be in the first, second, and third columns, respectively. Also, when using a table input, the first and third columns can be categorical arrays, and the second can be a datetime array. Here is an example with all the information in table format:

     ID            Date             Rating
    __________    _____________    ______
    '00010283'    '10-Nov-1984'    'CCC'
    '00010283'    '12-May-1986'    'B'  
    '00010283'    '29-Jun-1988'    'CCC'
    '00010283'    '12-Dec-1991'    'D'  
    '00013326'    '09-Feb-1985'    'A'  
    '00013326'    '24-Feb-1994'    'AA' 

    The following summarizes the supported data types for table input:

    Data Input Type    ID (1st Column)                    Date (2nd Column)                  Rating (3rd Column)
    _______________    _______________________________    _______________________________    _______________________________
    Table              Numeric array                      Numeric array                      Numeric array
                       Cell array of character vectors    Cell array of character vectors    Cell array of character vectors
                       Categorical array                  Datetime array                     Categorical array

  • A cell array of size nRecords-by-3 containing the credit ratings. Each row contains an ID (column 1), a date (column 2), and a credit rating (column 3). The assigned credit rating corresponds to the associated ID on the associated date. All information corresponding to the same ID must be stored in contiguous rows. Sorting this information by date is not required but is recommended. IDs, dates, and ratings are stored in character vector format, but they can also be entered in numeric format. Here is an example with all the information in character vector format:

     '00010283'    '10-Nov-1984'    'CCC'
     '00010283'    '12-May-1986'    'B'  
     '00010283'    '29-Jun-1988'    'CCC'
     '00010283'    '12-Dec-1991'    'D'  
     '00013326'    '09-Feb-1985'    'A'  
     '00013326'    '24-Feb-1994'    'AA' 

    The following summarizes the supported data types for cell array input:

    Data Input Type    ID (1st Column)              Date (2nd Column)            Rating (3rd Column)
    _______________    _________________________    _________________________    _________________________
    Cell               Numeric elements             Numeric elements             Numeric elements
                       Character vector elements    Character vector elements    Character vector elements

Data Types: table | cell
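As a rough illustration of the two input forms described above, the following sketch builds the six sample records first as a cell array and then as a table, and passes each to transprobprep. The variable names dataCell, dataTable, prepFromCell, and prepFromTable are hypothetical, and the table column names are arbitrary because only column position matters.

```matlab
% Six sample records for two IDs; rows for the same ID are contiguous.
dataCell = {
    '00010283', '10-Nov-1984', 'CCC'
    '00010283', '12-May-1986', 'B'
    '00010283', '29-Jun-1988', 'CCC'
    '00010283', '12-Dec-1991', 'D'
    '00013326', '09-Feb-1985', 'A'
    '00013326', '24-Feb-1994', 'AA'};

% Same records as a table: ID, date, and rating in columns 1, 2, and 3.
dataTable = cell2table(dataCell,'VariableNames',{'ID','Date','Rating'});

% Preprocess both forms with the default rating scale.
prepFromCell  = transprobprep(dataCell);
prepFromTable = transprobprep(dataTable);

% Both results describe the same two IDs and six records.
disp(numel(prepFromCell.idStart))
disp(numel(prepFromTable.numericDates))
```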

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: prepData = transprobprep(data,'labels',{'AAA','AA','A','BBB','BB','B','CCC','F'})

Credit-rating scale, specified as the comma-separated pair consisting of 'labels' and an nRatings-by-1 or 1-by-nRatings cell array of character vectors.

labels must be consistent with the ratings labels used in the third column of data. Use a cell array of numbers for numeric ratings, and a cell array of character vectors for categorical ratings.

Data Types: cell
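A minimal sketch of the 'labels' argument with a nondefault scale follows. The scale customLabels, the borrower IDs, and the dates are all made up for illustration; the scale is ordered from lowest to highest risk here only to mirror the AAA-to-D ordering of the default scale.

```matlab
% Hypothetical four-grade internal scale.
customLabels = {'Low','Medium','High','Default'};

% Hypothetical ratings history; every rating in column 3 appears in customLabels.
data = {
    'B001', '15-Jan-2010', 'Low'
    'B001', '20-Mar-2012', 'Medium'
    'B001', '05-Jul-2014', 'Default'
    'B002', '01-Feb-2011', 'High'
    'B002', '10-Oct-2013', 'Medium'};

% Pass the custom scale so the ratings are mapped against it.
prepData = transprobprep(data,'labels',customLabels);

% The rating scale stored in the output should match the one supplied.
disp(prepData.ratingsLabels)
```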

Output Arguments

Preprocessed credit ratings data (prepData), summarizing where the credit ratings information corresponding to each company starts and ends, returned as a structure with the following fields:

  • idStart — Array of size (nIDs+1)-by-1, where nIDs is the number of distinct IDs in column 1 of data. This array summarizes where the credit ratings information corresponding to each company starts and ends. The dates and ratings corresponding to company j in data are stored from row idStart(j) to row idStart(j+1)−1 of numericDates and numericRatings.

  • numericDates — Array of size nRecords-by-1, containing the dates in column 2 of data, in numeric format.

  • numericRatings — Array of size nRecords-by-1, containing the ratings in column 3 of data, mapped into numeric format.

  • ratingsLabels — Cell array of size1-by-nRatings, containing the credit rating scale.

Version History

Introduced in R2011b