Different results with Polyfit?

Jules Ray on 3 Mar 2017
Edited: John D'Errico on 3 Mar 2017
I'm using polyfit to fit a simple linear regression (degree 1).
However, the coefficients of the linear equation vary slightly. It is not a big variation, but each time I run polyfit the coefficients are slightly different from the previous run.
Indeed, if I run polyfit in a loop or with a bootstrap, I obtain several values that are similar but slightly different.
I know the result is OK and that this is probably related to the uncertainty of the polyfit estimate, but I don't understand the mathematics behind this slight variation.
Does anyone have an idea why this happens?
Cheers

Accepted Answer

John D'Errico on 3 Mar 2017
It does not happen. IF you pass exactly the same data (in the same order) to polyfit, calling polyfit the same way each time, it will return exactly the same result, time after time after time.
If you change the data in any way, then of course you will get different results. In fact, even changing the order of the data points is sufficient.
x = randn(100,1);
y = randn(100,1);
P1 = polyfit(x,y,1);
s = randperm(100);
P2 = polyfit(x(s),y(s),1);
P1 == P2
ans =
1×2 logical array
0 0
P1 - P2
ans =
1.7347e-17 6.9389e-17
So all I did was permute the data. Change the order of ANY set of floating point additions, subtractions, multiplications, etc., and you can get a slightly different result.
0.3 - 0.2 - 0.1
ans =
-2.7756e-17
-0.1 - 0.2 + 0.3
ans =
-5.5511e-17
What I have described is NOT due to the accuracy of polyfit, but simply a basic feature of floating point arithmetic. If you are having a different problem, I cannot guess what it is, since you have told us virtually nothing about what you really did. People screw up all sorts of things in all sorts of different ways. So without knowing true specifics, all we can do is make guesses.
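Since the differences above are at the level of floating point round-off, an exact == comparison is the wrong test. The following is a minimal sketch of comparing the two fits with a tolerance instead; the seed, the variable names tol and sameToTol, and the tolerance value 1e-12 are assumptions chosen for illustration, not part of the original answer.

```matlab
% Sketch: compare two polyfit results with a tolerance rather than ==.
rng(0);                          % assumed seed, only to make the sketch repeatable
x = randn(100,1);
y = randn(100,1);

P1 = polyfit(x, y, 1);           % fit on the original ordering
s  = randperm(100);              % random permutation of the row indices
P2 = polyfit(x(s), y(s), 1);     % same data, different order

tol = 1e-12;                     % assumed tolerance, far above round-off here
scale = max(1, max(abs(P1)));    % guard against very small coefficients
sameToTol = all(abs(P1 - P2) <= tol * scale);

disp(sameToTol)
```

Whether P1 == P2 holds exactly depends on the particular permutation, but the tolerance test should pass for well-conditioned data like this.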
  2 Comments
Jules Ray on 3 Mar 2017
Edited: Jules Ray on 3 Mar 2017
Thanks for your answer... maybe I'm misunderstanding something. Here is an example:
if true
clear
X_E=[257180.148132324,257182.200988770,257183.951254355,257185.459594727,257188.927673340,257191.060831488,257193.484331052,257195.999755230,257198.445690139,257200.644455053,257202.582223719,257204.208745053,257205.498291016,257206.860518441,257207.425048828,257208.542259765,257211.649983677,257212.819885254,257213.430984312,257213.955834038,257215.425800405,257216.288452148,257217.560921910,257219.270987033,257221.237270770,257227.119033113,257228.795787703,257230.161193848,257231.636797666,257234.892593131,257236.712341309,257240.565673828,257243.738312829,257246.554913088,257248.971808274,257250.970520020,257253.111625686,257254.053405762,257255.256326070,257258.240346596,257259.448242188,257260.668271778,257263.137341217,257264.457885742,257265.430337653,257267.760200739,257268.697021484,257271.394470215,257273.033179828,257276.404113770,257280.206731603,257287.889458686,257291.017592100,257294.171966406,257296.828247070,257298.714889681,257303.110915463,257305.305908203,257308.438319313,257309.544860840,257310.066639695,257310.900325875,257311.471618652,257312.162379056,257314.169325606,257314.939941406,257315.341124861,257315.646524022,257316.445689880,257316.793421783];
Y_E=[5596222.34149170,5596231.86547852,5596232.07256538,5596232.58905029,5596235.28668213,5596236.33734967,5596237.16789960,5596237.71112480,5596237.93261018,5596237.82725106,5596237.40775195,5596236.68531664,5596235.67193604,5596233.58199607,5596232.97448731,5596232.55405675,5596232.52503967,5596232.20361328,5596231.68610531,5596230.90280274,5596227.01130997,5596225.65270996,5596224.73418842,5596224.14679759,5596223.86592936,5596223.48732638,5596223.12727301,5596222.56982422,5596221.53931947,5596218.59161443,5596217.56018066,5596216.78955078,5596215.75764809,5596214.50269830,5596213.04436437,5596211.39428711,5596208.42910900,5596207.54095459,5596207.07512067,5596206.73276630,5596206.38488770,5596205.51530027,5596202.89199862,5596202.14593506,5596202.05706356,5596202.29071064,5596202.14593506,5596200.21893311,5596199.85311049,5596199.83367920,5596199.93702897,5596200.97563792,5596201.20824736,5596200.99804124,5596200.21893311,5596199.10731173,5596195.42072160,5596194.05340576,5596193.13523611,5596192.51190186,5596191.76819885,5596189.69076160,5596189.04382324,5596188.83174319,5596188.84636981,5596188.65838623,5596188.29374978,5596187.68107973,5596184.40266686,5596183.89414149];
iterations = 1000;
[p_bootstrp_E,bb] = bootstrp(iterations,'polyfit',X_E,Y_E,1);
end
John D'Errico on 3 Mar 2017
Edited: John D'Errico on 3 Mar 2017
I think you don't understand what a bootstrap does.
https://en.wikipedia.org/wiki/Bootstrapping_(statistics)
In there, the very first line says "In statistics, bootstrapping is any test or metric that relies on random sampling with replacement."
The help for bootstrp says the same thing.
"bootstrp creates each bootstrap sample by sampling with replacement from the rows of the non-scalar data arguments"
The variability is NOT due to polyfit. This is not a question of what happens each time you use polyfit. In fact, you are not calling polyfit directly at all. The issue is which resample of your data bootstrp passes to polyfit on each iteration. When you do random sampling, you will get different results. Can you expect a fixed result given random sampling?
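To make the point concrete, here is a rough sketch of what a bootstrap over polyfit amounts to, written as an explicit loop rather than a call to bootstrp. The synthetic data, the seed, and the names nBoot, idx, P, pFull and slopeSpread are all assumptions for illustration; this is not the internal implementation of bootstrp, only the same idea of sampling rows with replacement.

```matlab
% Sketch: hand-rolled bootstrap of a degree-1 polyfit on synthetic data.
rng(42);                               % assumed seed; fixes the random resamples
n = 50;
x = linspace(0, 10, n)';               % synthetic x, not the poster's data
y = 2*x + 1 + 0.5*randn(n,1);          % synthetic noisy line

nBoot = 200;                           % number of bootstrap resamples
P = zeros(nBoot, 2);                   % one [slope intercept] row per resample
for k = 1:nBoot
    idx = randi(n, n, 1);              % n row indices drawn WITH replacement
    P(k,:) = polyfit(x(idx), y(idx), 1);
end

pFull = polyfit(x, y, 1);              % fit on the full data: no randomness here
slopeSpread = std(P(:,1));             % spread of the slope across resamples

disp(pFull)
disp(slopeSpread)
```

Each row of P comes from a different random resample, so the rows differ from one another, while pFull is the same every time it is computed from the same x and y. Resetting the generator with rng before the loop should reproduce the same resamples.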
