How to apply a threshold to gevfit?

Darcy Cordell on 20 Jan 2025
Answered: Mathieu NOE on 21 Jan 2025
I am trying to fit a generalized extreme value (GEV) distribution to some data, following the example found here: https://www.mathworks.com/help/stats/modelling-data-with-the-generalized-extreme-value-distribution.html
I am using 10-day block maxima from about 17 years' worth of data (around 600 blocks). However, if I fit the GEV to all 600 blocks, the fit is terrible at the extremes. I have seen that some authors fit the GEV only to blocks whose maxima exceed some threshold (e.g. above the 99.9th percentile of the data).
But how do I apply this threshold while still being able to compare the GEV fit with the original data?
Specifically, the example in the MATLAB docs linked above compares the empirical CDF with the fitted GEV CDF, but I want to fit only the data above some threshold.
As an example, I have a vector of block maxima (see attached blockmax.mat):
load('blockmax.mat')
[blockcdf , blockx ] = ecdf(blockmax); %Get the empirical cumulative distribution function of data
threshold = 0.2; %some arbitrary threshold (e.g. 99.9th percentile)
[p, ci] = gevfit(blockmax(blockmax>threshold)); %Get GEV fit for all values above the threshold
gevc = gevcdf(blockx,p(1),p(2),p(3)); %Get the GEV CDF
%These two plots are not comparing apples to apples! The GEV cdf has a
%threshold applied...
plot(blockx,blockcdf); hold on
plot(blockx,gevc)
plot([threshold threshold],[0 1],'--k')
grid on
ylabel('Cumulative Probability')
xlabel('Value')
set(gca,'XScale','log')
legend('Empirical CDF','GEV Fit')
As can be seen, the ECDF of the data and the GEV CDF do not match. What am I missing here?
Any help is appreciated.

Answers (1)

Mathieu NOE on 21 Jan 2025
Hello,
You can already get a better fit without the threshold, and use the threshold only to truncate the plotted curve. I agree this is not the same as fitting only the data above the threshold, but that does not seem to be an option with gevfit.
load('blockmax.mat')
[blockcdf , blockx ] = ecdf(blockmax); %Get the empirical cumulative distribution function of data
threshold = 0.05;
[p, ci] = gevfit(blockmax(blockmax>0)); %Get GEV fit for all values above 0
gevc = gevcdf(blockx,p(1),p(2),p(3)); %Get the GEV CDF
%Here the GEV is fitted to all positive values; the threshold only limits
%the range over which the fitted CDF is plotted
plot(blockx,blockcdf); hold on
ind = (blockx>threshold);
plot(blockx(ind),gevc(ind))
plot([threshold threshold],[0 1],'--k')
grid on
ylabel('Cumulative Probability')
xlabel('Value')
set(gca,'XScale','log')
legend('Empirical CDF','GEV Fit','Location','northwest')
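If you do want to fit only the values above the threshold, as in the question, one possible reason the curves do not line up is that the fitted GEV then describes the distribution conditional on exceeding the threshold, while the ECDF describes all block maxima. A minimal sketch of putting both on the same scale is below, using the identity P(X <= x) = P(X <= u) + P(X > u) * P(X <= x | X > u) for x above the threshold u. The names exceed, pBelow and gevFull are made up for illustration, and the synthetic fallback data is only an assumption so the snippet runs without the attachment. Whether a GEV is an appropriate model for thresholded block maxima is a separate question that this does not settle.

```matlab
% Sketch: compare a GEV fitted to exceedances with the ECDF of ALL block maxima.
if isfile('blockmax.mat')
    load('blockmax.mat')                     % provides blockmax, as in the question
else
    rng(1)                                   % synthetic stand-in data (assumption only)
    blockmax = gevrnd(0.3, 0.02, 0.03, 600, 1);
end

threshold = 0.05;                            % arbitrary threshold, as above
exceed = blockmax(blockmax > threshold);     % block maxima above the threshold
pBelow = mean(blockmax <= threshold);        % empirical P(X <= threshold)

p = gevfit(exceed);                          % p = [k sigma mu], fitted to exceedances only

[blockcdf, blockx] = ecdf(blockmax);         % ECDF of all block maxima
ind = blockx > threshold;

% Rescale the conditional CDF so it starts at pBelow instead of 0:
% P(X <= x) = P(X <= u) + P(X > u) * P(X <= x | X > u)
gevFull = pBelow + (1 - pBelow) .* gevcdf(blockx(ind), p(1), p(2), p(3));

figure
plot(blockx, blockcdf); hold on
plot(blockx(ind), gevFull)
plot([threshold threshold], [0 1], '--k')
grid on
ylabel('Cumulative Probability')
xlabel('Value')
set(gca, 'XScale', 'log')
legend('Empirical CDF', 'Rescaled GEV fit (exceedances)', 'Location', 'northwest')
```

The alternative that avoids rescaling is to compare the fitted CDF against ecdf(exceed) instead of ecdf(blockmax), so that both curves are conditional on exceeding the threshold.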
