Is it possible to use polyfit in a way that it takes into account the errorbars?

Dear all,
I have two sets of data that I want to compare. I do so by making a scatter plot of the two and then fitting the data with a simple linear regression model. However, for some of the data points I assume a large uncertainty, which I have already estimated. As far as I understand the polyfit command, it finds the polynomial that matches all the data points as closely as possible in the least-squares sense. What I am interested in, though, is a fit that agrees best with the ranges of values I have estimated rather than with the measured data points themselves. I could not find an answer to that in the documentation and hope someone here can give me a hint.
Thank you very much.
  2 Comments
dpb on 17 Jul 2018
Are these estimated errors symmetric around the measured points? If so, the OLS and other estimates will still be centered on the mean.
Typically, one uses a weighting function to give points with higher error a lower weight in the fit. What's your objective here? It almost sounds like you're trying to bias the fit towards an error rather than fit the data...
As always, showing the data for context would be helpful.
Alex W on 17 Jul 2018
I am very sorry, I did not take the time to craft a meaningful working example. Say I have a linear correlation between two variables x and y, but for whatever reason some data points of y are corrupted. I assume a large error for these points and expect a clean linear dependency for all the others. However, the polyfit command does not take the error of each data point into account and gives a fit that does not show the correlation we clearly assume is there. So in a nutshell, I would like to know how to get polyfit to take the errors into account. Your approach of weighting the data sounds plausible to me, yet it is not entirely what I want. In any case, may I ask how you would implement that in MATLAB?
For illustration, here is a very minimal example. We assume a function y=x, for instance, and have 4 data points to test our hypothesis. The second data point, however, was not measured cleanly for whatever reason, and we estimate a large error for it. In the example I constructed, the linear function y=x would still pass through all the data points within their error bounds, but the polyfit command gives a line with a completely different slope and intercept (how could it not, since I never told it to take the errors into account; that is, after all, what I am interested in).
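% Data: y = x is expected, but the second point is unreliable
x=[1 2 3 4];
y=[1 1 3 4];
% Estimated error per point; the second point gets a large error
err=0.1*ones(1, 4);
err(2)=1.1;
% Unweighted linear least-squares fit (polyfit ignores err)
fitfunction=polyfit(x,y,1);
yp=polyval(fitfunction,x);
% Plot the data, the fitted line, and the error bars
scatter(x,y)
hold on
plot(x,yp,'g');
errorbar(x,y,err,'LineStyle','none')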
Thank you so much in advance :)

Answers (2)

John D'Errico on 16 Dec 2023
Different people mean different things when they say error bars. But you also use the word weights, which is a bit more standard. So I'll assume that you have weights for each point.
However, polyfit does not allow weights. Other tools, such as fit from the Curve Fitting Toolbox, do allow weights, but you need that toolbox. Or, if you have the Statistics and Machine Learning Toolbox, you could use fitlm. With slightly more effort, you could also use tools like lsqlin (Optimization Toolbox) or lscov, but they require you to build the appropriate matrices yourself. If you know enough to build the matrices those tools need, then even backslash will solve the problem.
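As a rough sketch of the lscov and backslash routes, using the data from the question: this assumes the error bars are one-standard-deviation estimates, so that each point is weighted by the inverse variance, w = 1/err^2. That weighting choice, and the variable names A, w, sw, pLscov, pBackslash and pPolyfit, are assumptions made for illustration and are not part of the original question. The row scaling in the backslash variant relies on implicit expansion, so it needs R2016b or later.

```matlab
% Data and error estimates from the question
x   = [1 2 3 4];
y   = [1 1 3 4];
err = 0.1*ones(1, 4);
err(2) = 1.1;

% Assumption: err holds 1-sigma uncertainties, so weight by inverse variance
w = 1 ./ err.^2;

% Design matrix for the straight line y = p(1)*x + p(2)
A = [x(:), ones(numel(x), 1)];

% Weighted least squares with lscov (base MATLAB, no toolbox needed)
pLscov = lscov(A, y(:), w(:));

% Same problem with backslash: scale every row by sqrt(w) first
sw = sqrt(w(:));
pBackslash = (sw .* A) \ (sw .* y(:));

% Unweighted polyfit result for comparison
pPolyfit = polyfit(x, y, 1);

% Plot both fits on top of the data with error bars
errorbar(x, y, err, 'o', 'LineStyle', 'none')
hold on
plot(x, polyval(pPolyfit, x), 'g')
plot(x, polyval(pLscov, x), 'r')
hold off
legend('data', 'polyfit (unweighted)', 'lscov (weighted)', 'Location', 'northwest')
```

Both weighted variants return the coefficients in the same order as polyfit (slope first, then intercept), so polyval can be used on them directly. With these weights the poorly measured second point barely influences the result, and the fitted line should lie close to y = x.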

Douglas Novaes on 16 Dec 2023
MATLAB's fit function (Curve Fitting Toolbox) can account for error bars through per-point weights. Call fit with the appropriate fit type ('poly1' for a straight line, or 'poly5' for a 5th-degree polynomial, for instance) and pass the weights with the 'Weights' name-value option; points with a larger weight then have more influence on the fitted curve.
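A minimal sketch of that call for the straight-line example from the question might look like the following. It requires the Curve Fitting Toolbox, and it assumes the error bars are one-standard-deviation estimates so that 1/err^2 is a sensible weight; that weighting and the variable names w, f and p are illustrative assumptions rather than something stated in the question.

```matlab
% Data and error estimates from the question, as column vectors for fit
x   = [1 2 3 4]';
y   = [1 1 3 4]';
err = 0.1*ones(4, 1);
err(2) = 1.1;

% Assumption: err holds 1-sigma uncertainties, so weight by inverse variance
w = 1 ./ err.^2;

% Weighted straight-line fit (requires Curve Fitting Toolbox)
f = fit(x, y, 'poly1', 'Weights', w);

% Coefficients in polyfit order: slope first, then intercept
p = coeffvalues(f);

% Plot the weighted fit over the data with error bars
errorbar(x, y, err, 'o', 'LineStyle', 'none')
hold on
plot(x, f(x), 'r')
hold off
```

For a higher-degree polynomial, only the fit type string changes (for example 'poly5'), provided there are enough data points to determine all the coefficients.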
