mlecov

Asymptotic covariance of maximum likelihood estimators

Syntax

acov = mlecov(params,data,'pdf',pdf)

acov = mlecov(params,data,'logpdf',logpdf)

acov = mlecov(params,data,'nloglf',nloglf)

acov = mlecov(___,Name,Value)

Description

acov = mlecov(params,data,'pdf',pdf) returns an approximation to the asymptotic covariance matrix of the maximum likelihood estimators of the parameters for a distribution specified by the custom probability density function pdf. The output acov is a p-by-p matrix, where p is the number of parameters in params.

mlecov computes a finite difference approximation to the Hessian of the loglikelihood at the maximum likelihood estimates params, given the observed data, and returns the negative inverse of that Hessian.

example

acov = mlecov(params,data,'logpdf',logpdf) returns acov for a distribution specified by the custom log probability density function logpdf.

example

acov = mlecov(params,data,'nloglf',nloglf) returns acov for a distribution specified by the custom negative loglikelihood function nloglf.

example

acov = mlecov(___,Name,Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify the censored data and frequency of observations.

example

Examples

collapse all

Custom Probability Density Function (pdf)

Open Live Script

Load the sample data.

load carbig

The vector Weight contains the weights of 406 cars.

Define a custom function that returns the pdf of a lognormal distribution. Save the file in your current folder as lognormpdf.m.

function newpdf = lognormpdf(data,mu,sigma)
newpdf = exp((-(log(data)-mu).^2)/(2*sigma^2))./(data*sigma*sqrt(2*pi));

Estimate the parameters mu and sigma of the custom distribution.

[phat,pci] = mle(Weight,'pdf',@lognormpdf,'Start',[4.5 0.3])

phat = 1×2

    7.9600    0.2804

pci = 2×2

    7.9327    0.2611
    7.9872    0.2997

Compute the approximate covariance matrix of the parameter estimates.

acov = mlecov(phat,Weight,'pdf',@lognormpdf)

acov = 2×2
10^-3 ×

    0.1937   -0.0000
   -0.0000    0.0968

Estimate the standard errors of the estimates.

se = sqrt(diag(acov))'

se = 1×2

    0.0139    0.0098

The standard error of the estimates of mu and sigma are 0.0139 and 0.0098, respectively.

Recalculate the confidence intervals pci from the standard error se by using the Wald method (normal approximation).

alpha = 0.05;
probs = [alpha/2; 1-alpha/2];
pci2 = norminv(repmat(probs,1,numel(phat)),[phat; phat],[se; se])

pci2 = 2×2

    7.9327    0.2611
    7.9872    0.2997

Custom Log Probability Density Function (pdf)

Open Live Script

Define a custom function that returns the log pdf of a beta distribution. Save the file in your current folder as betalogpdf.m.

function logpdf = betalogpdf(x,a,b)
logpdf = (a-1)*log(x)+(b-1)*log(1-x)-betaln(a,b);

Generate sample data from a beta distribution with parameters 1.23 and 3.45, and estimate the parameters using the simulated data.

rng('default') % For reproducibility
x = betarnd(1.23,3.45,25,1);
phat = mle(x,'Distribution','beta')

phat = 1×2

    1.1213    2.7182

Compute the approximate covariance matrix of the parameter estimates.

acov = mlecov(phat,x,'logpdf',@betalogpdf)

acov = 2×2

    0.0810    0.1646
    0.1646    0.6074

Custom Negative Loglikelihood Function

Open Live Script

Load the sample data.

load('readmissiontimes.mat')

The data includes ReadmissionTime, which has readmission times for 100 patients. This data is simulated.

Define a custom negative loglikelihood function for a Poisson distribution with the parameter lambda, where 1/lambda is the mean of the distribution. You must define the function to accept a logical vector of censorship information and an integer vector of data frequencies, even if you do not use these values in the custom function.

custnloglf = @(lambda,data,cens,freq) ...
    - length(data)*log(lambda) + sum(lambda*data,'omitnan');

Estimate the parameter of the custom distribution and specify its initial parameter value (Start name-value argument).

phat = mle(ReadmissionTime,'nloglf',custnloglf,'Start',0.05)

phat = 
0.1462

Compute the variance of the parameter estimate.

acov = mlecov(phat,ReadmissionTime,'nloglf',custnloglf)

acov = 
2.1374e-04

Compute the standard error.

sqrt(acov)

ans = 
0.0146

Specify Right-Censored Data

Open Live Script

Load the sample data.

load('readmissiontimes.mat');

The data includes ReadmissionTime, which has readmission times for 100 patients. The column vector Censored contains the censorship information for each patient, where 1 indicates a right-censored observation, and 0 indicates that the exact readmission time is observed. This data is simulated.

Define a custom log probability density function (pdf) and log survival function for a Weibull distribution with the scale parameter lambda and the shape parameter k. When the data contains censored observations, you must pass both the log pdf and log survival function to mle and mlecov.

custlogpdf = @(data,lambda,k) ...
    log(k) - k*log(lambda) + (k-1)*log(data) - (data/lambda).^k;
custlogsf = @(data,lambda,k) - (data/lambda).^k;

Estimate the parameters of the custom distribution for the censored sample data. Specify the initial parameter values (Start name-value argument) for the custom distribution.

phat = mle(ReadmissionTime,'logpdf',custlogpdf,'logsf',custlogsf, ...
    'Start',[1,0.75],'Censoring',Censored)

phat = 1×2

    9.2090    1.4223

The scale and shape parameters of the custom distribution are 9.2090 and 1.4223, respectively.

Compute the approximate covariance matrix of the parameter estimates.

acov = mlecov(phat,ReadmissionTime, ...
    'logpdf',custlogpdf,'logsf',custlogsf,'Censoring',Censored)

acov = 2×2

    0.5653    0.0102
    0.0102    0.0163

Input Arguments

collapse all

`params` — Parameter estimates
vector

Parameter estimates, specified as a vector. These parameter estimates must be maximum likelihood estimates. For example, you can specify parameter estimates returned by mle.

Data Types: single | double

`data` — Sample data and censorship information
vector | two-column matrix

Sample data and censorship information used to estimate the distribution parameters params, specified as a vector of sample data or a two-column matrix of sample data and censorship information.

You can specify the censorship information for the sample data by using either the data argument or the Censoring name-value argument. mlecov ignores the Censoring argument value if data is a two-column matrix.

Specify data as a vector or a two-column matrix depending on the censorship types of the observations in data.

Fully observed data — Specify data as a vector of sample data.
Data that contains fully observed, left-censored, or right-censored observations — Specify data as a vector of sample data, and specify the Censoring name-value argument as a vector that contains the censorship information for each observation. The Censoring vector can contain 0, –1, and 1, which refer to fully observed, left-censored, and right-censored observations, respectively.
Data that includes interval-censored observations — Specify data as a two-column matrix of sample data and censorship information. Each row of data specifies the range of possible survival or failure times for each observation, and can have one of these values:
- [t,t] — Fully observed at t
- [–Inf,t] — Left-censored at t
- [t,Inf] — Right-censored at t
- [t₁,t₂] — Interval-censored between [t₁,t₂], where t₁ < t₂

mlecov ignores NaN values in data. Additionally, any NaN values in the censoring vector (Censoring) or frequency vector (Frequency) cause mlecov to ignore the corresponding rows in data.

Data Types: single | double

`pdf` — Custom probability density function
function handle | cell array

Custom probability distribution function (pdf), specified as a function handle or a cell array containing a function handle and additional arguments to the function.

The custom function accepts a vector containing sample data, one or more individual distribution parameters, and any additional arguments passed by a cell array as input parameters. The function returns a vector of probability density values.

Example: @newpdf

Data Types: function_handle | cell

`logpdf` — Custom log probability density function
function handle | cell array

Custom log probability density function, specified as a function handle or a cell array containing a function handle and additional arguments to the function.

Example: @customlogpdf

Data Types: function_handle | cell

`nloglf` — Custom negative loglikelihood function
function handle | cell array

Custom negative loglikelihood function, specified as a function handle or a cell array containing a function handle and additional arguments to the function.

The custom function accepts the following input arguments, in the order listed in the table.

Input Argument of Custom Function	Description
`params`	Vector of distribution parameter values `params`.
`data`	Sample data. The `data` value is a vector of sample data or a two-column matrix of sample data and censorship information.
`cens`	Logical vector of censorship information. `nloglf` must accept `cens` even if you do not use the `Censoring` name-value argument. In this case, you can write `nloglf` to ignore `cens`.
`freq`	Integer vector of data frequencies. `nloglf` must accept `freq` even if you do not use the `Frequency` name-value argument. In this case, you can write `nloglf` to ignore `freq`.
`trunc`	Two-element numeric vector of truncation bounds. `nloglf` must accept `trunc` if you use the `TruncationBounds` name-value argument.

nloglf can optionally accept the additional arguments passed by a cell array as input parameters.

nloglf returns a scalar negative loglikelihood value and, optionally, a negative loglikelihood gradient vector (see the GradObj field in the Options name-value argument).

Example: @negloglik

Data Types: function_handle | cell

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'Censoring',cens,'Options',opt instructs mlecov to read the censored data information from the vector cens and perform according to the new options structure opt.

`cdf` — Custom cumulative distribution function
function handle | cell array

Custom cumulative distribution function (cdf), specified as a function handle or a cell array containing a function handle and additional arguments to the function.

For censored or truncated observations, you must define both cdf and pdf. For fully observed and untruncated observations, mlecov does not use cdf. You can specify the censorship information by using either data or Censoring and specify the truncation bounds by using TruncationBounds.

Example: 'cdf',@newcdf

Data Types: function_handle | cell

`logsf` — Custom log survival function
function handle | cell array

Custom log survival function, specified as a function handle or a cell array containing a function handle and additional arguments to the function.

For censored or truncated observations, you must define both logsf and logpdf. For fully observed and untruncated observations, mlecov does not use logsf. You can specify the censorship information by using either data or Censoring and specify the truncation bounds by using TruncationBounds.

Example: 'logsf',@logsurvival

Data Types: function_handle | cell

`Censoring` — Indicator of censored data
vector of 0s (default) | vector consisting of 0, –1, and 1

Indicator of censored data, specified as a vector consisting of 0, –1, and 1, which indicate fully observed, left-censored, and right-censored observations, respectively. Each element of the Censoring value indicates the censorship status of the corresponding observation in data. The Censoring value must have the same size as data. The default is a vector of 0s, indicating all observations are fully observed.

You cannot specify interval-censored observations using this argument. If the sample data includes interval-censored observations, specify data using a two-column matrix. mlecov ignores the Censoring value if data is a two-column matrix.

For censored data, you must define the custom distribution by using pdf and cdf, logpdf and logsf, or nloglf.

mlecov ignores any NaN values in the censoring vector. Additionally, any NaN values in data or the frequency vector (Frequency) cause mlecov to ignore the corresponding values in the censoring vector.

Example: 'Censoring',censored, where censored is a vector that contains censorship information.

Data Types: logical | single | double

`TruncationBounds` — Truncation bounds
`[-Inf,Inf]` (default) | vector of two elements

Truncation bounds, specified as a vector of two elements.

For censored data, you must define the custom distribution by using pdf and cdf, logpdf and logsf, or nloglf.

Example: 'TruncationBounds',[0,10]

Data Types: single | double

`Frequency` — Frequency of observations
vector of 1s (default) | vector of nonnegative integer counts

Frequency of observations, specified as a vector of nonnegative integer counts that has the same number of rows as data. The jth element of the Frequency value gives the number of times the jth row of data was observed. The default is a vector of 1s, indicating one observation per row of data.

mlecov ignores any NaN values in this frequency vector. Additionally, any NaN values in data or the censoring vector (Censoring) cause mlecov to ignore the corresponding values in the frequency vector.

Example: 'Frequency',freq, where freq is a vector that contains the observation frequencies.

Data Types: single | double

`Options` — Numerical options
`statset('mlecov')` (default) | structure

Numerical options for the finite difference Hessian calculation, specified as a structure returned by statset.

The mlecov function interprets the following statset options.

Field Name Description

Field Name	Description
`GradObj`	Flag indicating whether the function provided by the `nloglf` input argument can return the gradient vector of the negative loglikelihood as a second output, specified as `'on'` or `'off'` (default).
`DerivStep`	Relative step size used in the finite difference for Hessian calculations, specified as a vector of the same size as `params`. The default value is `eps^(1/4)`. A smaller value than the default might be appropriate if `'GradObj'` is `'on'`.

GradObj

Flag indicating whether the function provided by the nloglf input argument can return the gradient vector of the negative loglikelihood as a second output, specified as 'on' or 'off' (default).

DerivStep

Relative step size used in the finite difference for Hessian calculations, specified as a vector of the same size as params.

The default value is eps^(1/4). A smaller value than the default might be appropriate if 'GradObj' is 'on'.

Example: 'Options',statset('GradObj','on')

Data Types: struct

More About

collapse all

Censorship Types

mlecov supports left-censored, right-censored, and interval-censored observations.

Left-censored observation at time t — The event occurred before time t, and the exact event time is unknown.
Right-censored observation at time t — The event occurred after time t, and the exact event time is unknown.
Interval-censored observation within the interval [t₁,t₂] — The event occurred after time t₁ and before time t₂, and the exact event time is unknown.

Double-censored data includes both left-censored and right-censored observations.

Survival Function

The survival function is the probability of survival as a function of time. It is also called the survivor function.

The survival function gives the probability that the survival time of an individual exceeds a certain value. Because the cumulative distribution function F(t) is the probability that the survival time is less than or equal to a given point t in time, the survival function for a continuous distribution S(t) is the complement of the cumulative distribution function: S(t) = 1 – F(t).

Version History

Introduced before R2006a

mlecov

Syntax

Description

Examples

Custom Probability Density Function (pdf)

Custom Log Probability Density Function (pdf)

Custom Negative Loglikelihood Function

Specify Right-Censored Data

Input Arguments

`params` — Parameter estimates
vector

`data` — Sample data and censorship information
vector | two-column matrix

`pdf` — Custom probability density function
function handle | cell array

`logpdf` — Custom log probability density function
function handle | cell array

`nloglf` — Custom negative loglikelihood function
function handle | cell array

Name-Value Arguments

`cdf` — Custom cumulative distribution function
function handle | cell array

`logsf` — Custom log survival function
function handle | cell array

`Censoring` — Indicator of censored data
vector of 0s (default) | vector consisting of 0, –1, and 1

`TruncationBounds` — Truncation bounds
`[-Inf,Inf]` (default) | vector of two elements

`Frequency` — Frequency of observations
vector of 1s (default) | vector of nonnegative integer counts

`Options` — Numerical options
`statset('mlecov')` (default) | structure

More About

Censorship Types

Survival Function

Version History

See Also

Topics

mlecov

Syntax

Description

Examples

Custom Probability Density Function (pdf)

Custom Log Probability Density Function (pdf)

Custom Negative Loglikelihood Function

Specify Right-Censored Data

Input Arguments

params — Parameter estimates vector

data — Sample data and censorship information vector | two-column matrix

pdf — Custom probability density function function handle | cell array

logpdf — Custom log probability density function function handle | cell array

nloglf — Custom negative loglikelihood function function handle | cell array

Name-Value Arguments

cdf — Custom cumulative distribution function function handle | cell array

logsf — Custom log survival function function handle | cell array

Censoring — Indicator of censored data vector of 0s (default) | vector consisting of 0, –1, and 1

TruncationBounds — Truncation bounds [-Inf,Inf] (default) | vector of two elements

Frequency — Frequency of observations vector of 1s (default) | vector of nonnegative integer counts

Options — Numerical options statset('mlecov') (default) | structure

More About

Censorship Types

Survival Function

Version History

See Also

Topics

`params` — Parameter estimates
vector

`data` — Sample data and censorship information
vector | two-column matrix

`pdf` — Custom probability density function
function handle | cell array

`logpdf` — Custom log probability density function
function handle | cell array

`nloglf` — Custom negative loglikelihood function
function handle | cell array

`cdf` — Custom cumulative distribution function
function handle | cell array

`logsf` — Custom log survival function
function handle | cell array

`Censoring` — Indicator of censored data
vector of 0s (default) | vector consisting of 0, –1, and 1

`TruncationBounds` — Truncation bounds
`[-Inf,Inf]` (default) | vector of two elements

`Frequency` — Frequency of observations
vector of 1s (default) | vector of nonnegative integer counts

`Options` — Numerical options
`statset('mlecov')` (default) | structure