# piece-wise linear fitting

220 views (last 30 days)
Majid Rezaei on 9 Jun 2017
Hi,
I have a data series and I want to fit 3 consecutive lines on my data. I know the exact slope of the middle line, while slope of the left line is unknown as well as that of the right line. Also, placement of the breakpoints are unknown. how can I do that? Can I use SLM functions? If yes, how can I specify slope of the middle line in SLM?
Many thanks, Majid

John D'Errico on 9 Jun 2017
Edited: John D'Errico on 9 Jun 2017
No, you won't be able to use SLM,because whole you can specify aslope at some point, you cannot require that the specified location be between a pair of knots that are allowed to float.
Its not that difficult to do using other methods though. You will need a simple optimizer, one that can handle at least bound constraints. So fmincon, lsqnonlin, fminsearchbnd even will work. lsqcurvefit would be simplest.
I'll see if I can put together a quick example before I need to run out. I have about 30 minutes, so let me see what I can do in that time. If not enough time, then I'll post what I can, then finish after I return this afternoon.
Ok. Start with some sample data. Sin(x), near x == 0, plus some error in y.
x = 3*(rand(100,1) - .5);
y = sin(x) + randn(size(x))/20;
plot(x,y,'o')
So, we know from high school math about sin(x), that the slope near zero is 1. We will choose to model this as a piecewise linear function, with 2 breaks/knots.
The trick is to use what I recall were named "plus functions".
plusfun = @(x) max(x,0);
So, for x > 0, we get x. For x < 0, we get 0.
Lets build a model now. Assume the curve is defined in terms of three linear segments, with breaks at b1 and b2. Below break b1, the slope is s1. between b1 and b2, the slope will be s2, a fixed, known value. Above b2, the slope will be s3. Finally, we will need to know a simple additive constant. I'll call it a. a will shift the entire curve up or down in y.
Next, there will be constraints on the breaks. These are absolutely necessary to build a model that is well posed everywhere. Otherwise the problem may become singular, causing the optimizer to get upset at you.
It is necessary, that there be at LEAST one data point in each curve segment. That means we need at least one data point below b1, one data point at least above b2, and one or more between them. We could write them like this:
min(x) + tol <= b1
b2 - b1 >= max(diff(sort(x))) + tol
b2 <= max(x) - tol
Those are linear inequality constraints. So really, a tool like fmincon would be needed. But we can come close using bound constraints with lsqcurvefit.
The next trick I'll use is to define the second break as a positive offset from the first. Thus:
b2 = b1 + deltab
Where deltab is a parameter we will estimate, subject to the constraint that
deltab >= max(diff(sort(x))) + tol
min(x) + tol <= b1
b1 <= max(x) - max(diff(sort(x))) - tol
Those are bound constraints that will usually be sufficient to avoid problems. I had to fool around at the top end a bit.
Now I can finally build my model function. The vector of unknown parameters will be P, a vector of length 5. Assume the middle slope is called s2.
plusfun = @(x) max(x,0);
model = @(P,x) P(1) + s2*x + (s2-P(2))*plusfun(P(4)-x) + (P(3)-s2)*plusfun(x-(P(4)+P(5)));
While the approach I am following here is not how I would formulate the problem were I writing it inside a tool like SLM, this approach will be entirely valid. I was also thinking that a simple solution would have been to add an extra optional constraint to SLM. Regardless, lets see how lsqcurvefit works...
s2 = 1;
P0 = [.1 1 1 -.5 1];
lb = [-1 -10 -10 -.9 .1];
ub = [1 10 10 .8 1.8];
s2 = 1;
plusfun = @(x) max(x,0);
model = @(P,x) P(1) + s2*x + (s2-P(2))*plusfun(P(4)-x) + (P(3)-s2)*plusfun(x-(P(4)+P(5)));
Pfit = lsqcurvefit(model,P0,x,y,lb,ub)
Local minimum found.
Optimization completed because the size of the gradient is less than
the default value of the optimality tolerance.
<stopping criteria details>
Pfit =
0.0032732 0.3385 0.44027 -0.75622 1.4282
Do the parameters make sense? The constant term os roughly zero. The lopes in the bottom and top segments were both around 0.4, and the break points were at -.756, and
-0.75622 + 1.4282
ans =
0.67198
What really matters is does the fit look like what I want to see?
modelpred = model(Pfit,sort(x));
plot(x,y,'o',sort(x),modelpred,'r-')
Yes, it fits reasonably well, given it is just three linear segments, continuously joined together. We have built what is a very simple piecewise linear spline.
Philipp Kreyenmeier on 14 Aug 2020
I had a similar question: How would I adapt this method if I only know the first slope?

Greg Dionne on 9 Jun 2017
Edited: Greg Dionne on 9 Jun 2017
Have you tried FINDCHANGEPTS in the Signal Processing Toolbox, and how well does it fit your middle segment?