How to apply a threshold to gevfit?
I am trying to fit a generalized extreme value (GEV) distribution to some data. I am following the example found here: https://www.mathworks.com/help/stats/modelling-data-with-the-generalized-extreme-value-distribution.html
I am using 10-day block maxima from about 17 years' worth of data (i.e. around 600 blocks). However, if I fit the GEV to all 600 blocks, the fit is terrible at the extremes. I have seen that some authors fit the GEV only to blocks whose maxima exceed some threshold (e.g. the 99.9th percentile of the data).
But how do I apply this threshold while still being able to compare the GEV with the original data?
Specifically, the example in the MATLAB docs linked above compares the empirical CDF with the fitted GEV CDF. But I want to fit only the data above some threshold, like this:

As an example, I have some vector of block maxima (see attached blockmax.mat):
load('blockmax.mat')
[blockcdf , blockx ] = ecdf(blockmax); %Get the empirical cumulative distribution function of data
threshold = 0.2; %some arbitrary threshold (e.g. 99.9th percentile)
[p, ci] = gevfit(blockmax(blockmax>threshold)); %Get GEV fit for all values above the threshold
gevc = gevcdf(blockx,p(1),p(2),p(3)); %Get the GEV CDF
%These two plots are not comparing apples to apples! The GEV was fitted only
%to values above the threshold, but the ECDF is computed from all the data...
plot(blockx,blockcdf); hold on
plot(blockx,gevc)
plot([threshold threshold],[0 1],'--k')
grid on
ylabel('Cumulative Probability')
xlabel('Value')
set(gca,'XScale','log')
legend('Empirical CDF','GEV Fit')

As can be seen, the ECDF of the data and the GEV CDF do not match. What am I missing here?
Any help is appreciated.
Answers (1)
Mathieu NOE
on 21 Jan 2025
Hello,
You can already get a better fit without applying the threshold to the data (here I fit all values above 0) and use the threshold only to truncate the plotted GEV curve. I agree this is not the same as fitting only the data above the threshold, but that does not seem to be an option in gevfit.
load('blockmax.mat')
[blockcdf , blockx ] = ecdf(blockmax); %Get the empirical cumulative distribution function of data
threshold = 0.05; %used only to truncate the plotted GEV curve, not to select the data
[p, ci] = gevfit(blockmax(blockmax>0)); %Get GEV fit for all values above 0
gevc = gevcdf(blockx,p(1),p(2),p(3)); %Get the GEV CDF
%Plot the full ECDF, but show the GEV CDF only above the threshold
plot(blockx,blockcdf); hold on
ind = (blockx>threshold);
plot(blockx(ind),gevc(ind))
plot([threshold threshold],[0 1],'--k')
grid on
ylabel('Cumulative Probability')
xlabel('Value')
set(gca,'XScale','log')
legend('Empirical CDF','GEV Fit','Location','northwest')
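If you do want to fit only the values above the threshold, as in the question, one way to make the comparison like-for-like is to treat the fitted CDF as conditional on exceeding the threshold. You can then either compare it with the ECDF of that same subset, or rescale it onto the full-sample scale with F(x) = F(u) + (1 - F(u))*G(x) for x > u, where u is the threshold, G is the fitted CDF and F(u) is estimated empirically. The code below is a minimal sketch of that idea, not a tested recipe for this data set. It uses synthetic data from gevrnd as a stand-in for blockmax.mat, and the parameter values and the 90th-percentile threshold are assumptions made up for illustration. Note also that a GEV fitted to a truncated sample can still put probability below the threshold, so the rescaled curve need not meet the ECDF exactly at u.

```matlab
% Sketch only: synthetic stand-in for blockmax.mat (parameter values are made up).
rng(0)                                        % reproducible random numbers
blockmax = gevrnd(0.1, 0.05, 0.1, 600, 1);    % hypothetical block maxima
threshold = prctile(blockmax, 90);            % hypothetical threshold choice

tail = blockmax(blockmax > threshold);        % values the GEV is fitted to
p = gevfit(tail);                             % p = [shape, scale, location]

% Comparison 1: ECDF of the same subset that was fitted
[tailcdf, tailx] = ecdf(tail);
gevtail = gevcdf(tailx, p(1), p(2), p(3));

% Comparison 2: rescale the conditional CDF onto the full-sample scale,
% F(x) = F(u) + (1 - F(u))*G(x) for x > u, with F(u) estimated empirically
Fu = mean(blockmax <= threshold);
gevfull = Fu + (1 - Fu) * gevtail;

[blockcdf, blockx] = ecdf(blockmax);
plot(blockx, blockcdf); hold on
plot(tailx, gevfull)
plot([threshold threshold], [0 1], '--k')
grid on
ylabel('Cumulative Probability')
xlabel('Value')
legend('Empirical CDF (all data)', 'Rescaled GEV fit (tail only)', 'Threshold', ...
    'Location', 'southeast')
```

To use it with the real data, replace the two synthetic-data lines with load('blockmax.mat') and your own threshold. Plotting tailx against tailcdf and gevtail instead gives the first comparison, restricted to the exceedances.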