Determine which of N points is not on sin(ax+b), where a and b are unknown.
sangeet sagar
on 1 Mar 2018
Commented: Walter Roberson
on 8 May 2018
Suppose N points (x1,y1), (x2,y2), ..., (xN,yN) are taken from a curve y=sin(ax+b), where a and b are unknown. Before these N points are given to you, the y coordinate of one point is randomly tampered with so that it no longer lies on the curve. Write a program to determine which of the N points is NOT on the sinusoidal curve.
Any suggestions on how to approach this question would be highly appreciated. Please don't suggest built-in fitting functions or toolboxes (e.g. NonLinearModel.fit or the Curve Fitting Toolbox).
1 Comment
Walter Roberson
on 8 May 2018
C and C++ code are mostly off topic for this forum, except in connection with Mex or loadlibrary or Polyspace.
Accepted Answer
James Tursa
on 2 Mar 2018
Edited: James Tursa
on 2 Mar 2018
A modification of your posted approach:
Calculate the a and b using two points at a time, then use that to generate an associated y vector and compare it to the original y vector to find the outlier. Do this for several pairs of points to make sure several of the pairs do not contain the outlier (it is OK if a couple of them do). Then pick off the outlier that most of the pairs agree on. E.g., an outline:
x=(0:0.01:2*pi); % sample data
y=sin(2*x + 3); % sample data
y(100) = something_else; % pick some arbitrary spot to be the outlier
m = zeros(numel(x)-1,1); % calculated outlier index array, one element for each pair of points
for i=1:numel(x)-1 % loop through a bunch of point pairs
    % you insert code here to find the a and b values for the i and i+1 pair of points using posted P\R method
    % you insert code here to calculate a y vector for the entire x vector using the a and b you just calculated
    % you insert code here to find the index of the max abs difference between original y and just calculated y
    % save that index in the m(i) spot
end
outlier = mode(m); % the most frequent value in m is the index of the outlier
So, you simply need to fill in the code inside the loop using the techniques that are already discussed in this thread. I could have given you this code, but thought I would leave it to you to figure out (it is just the appropriate pieces from what has already been posted in this thread). I picked the looping above to match the looping that you already had in your code, but it doesn't have to be this. You could pick the pairs randomly if you wanted to ... you just need to generate enough different pairs to guarantee that most of them will not contain the outlier. The mode(m) stuff will get rid of any pair results that may have been corrupted by an outlier.
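For readers who want to check their own attempt, one possible way to fill in the loop is sketched below. It assumes that the "P\R method" means solving the 2-by-2 linear system asin(y) = a*x + b for each pair of points; that reading, the size of the tampering (+0.5, chosen so the tampered value stays inside [-1,1]), and the names P, R, ab and yp are assumptions rather than anything stated in the thread.

```matlab
% Sample data as in the outline; a = 2 and b = 3 are used only to build it.
x = (0:0.01:2*pi).';
y = sin(2*x + 3);
y(100) = y(100) + 0.5;          % hypothetical tampering (assumed value)

n = numel(x);
m = zeros(n-1, 1);              % outlier index voted for by each pair
for i = 1:n-1
    P  = [x(i) 1; x(i+1) 1];    % coefficients of a and b for the pair
    R  = asin(y([i i+1]));      % right-hand side: asin(y) = a*x + b
    ab = P \ R;                 % ab(1) is a, ab(2) is b for this pair
    yp = sin(ab(1)*x + ab(2));  % curve predicted over the whole x vector
    [~, m(i)] = max(abs(y - yp));
end
outlier = mode(m);              % index most pairs agree on
```

Because asin only returns values in [-pi/2, pi/2], a pair usually recovers either (a, b) or the equivalent (-a, pi-b) up to a multiple of 2*pi, and both reproduce the same curve. Pairs that straddle a peak or trough, or that contain the tampered point, give wrong values, but they are few and mode(m) outvotes them.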
More Answers (2)
Walter Roberson
on 1 Mar 2018
Loop over k = 1 to N
    take the data without sample #k
    fit arcsin(y) = a*x+b as a linear fit to get a and b
    project yp = sin(a*x+b)
    calculate residue(k) between y and yp
end
The lowest residue matches the case where the tampered point was excluded.
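A minimal sketch of this leave-one-out idea follows. It assumes that a*x+b stays inside [-pi/2, pi/2] for all samples, so that asin(y) really is a straight line in x; the sample data (a = 2, b = -1), the tampering of +0.3 and the choice to measure the residual as a sum of squares over the kept points are all illustrative assumptions.

```matlab
% Sample data chosen so that a*x + b stays within [-1, 1] (assumption).
x = linspace(0, 1, 50).';
y = sin(2*x - 1);
y(20) = y(20) + 0.3;                  % hypothetical tampering (assumed value)

n = numel(x);
res = zeros(n, 1);                    % residual when sample k is left out
for k = 1:n
    keep = true(n, 1);
    keep(k) = false;                  % drop sample k
    A  = [x(keep) ones(n-1, 1)];
    ab = A \ asin(y(keep));           % least-squares line through asin(y)
    yp = sin(ab(1)*x(keep) + ab(2));  % projected curve on the kept points
    res(k) = sum((y(keep) - yp).^2);
end
[~, outlier] = min(res);              % smallest residual: tampered point was excluded
```

If the data span more than one branch of asin, the linear fit no longer applies directly and the samples would have to be unwrapped or handled pair by pair first.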
10 Comments
Srikanth KS
on 2 Mar 2018
How about trying a different approach? Since a and b are unknown but the coordinates x and y are given, I would substitute x and y into the equation and solve the resulting simultaneous equations to get a and b. I would do this for a couple of pairs of points to confirm that my a and b are correct, since only one point was tampered with. Once I know a and b, I would iterate over all the points and compute the difference between y and the predicted value yp. Typically y - yp should be close to 0; if I find a point where y - yp is not close to 0, I would pick that point as the manipulated one.
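This could be sketched roughly as follows, under the same assumption that a*x+b stays inside [-pi/2, pi/2] so that asin(y) = a*x + b can be solved directly. The three disjoint test pairs, the tolerance tol and the sample data are hypothetical choices for illustration; since only one point is tampered, at most one of the disjoint pairs can contain it.

```matlab
% Sample data chosen so that a*x + b stays within [-1, 1] (assumption).
x = linspace(0, 1, 50).';
y = sin(2*x - 1);
y(1) = y(1) + 0.3;        % hypothetical tampering, placed inside the first pair

tol = 1e-8;               % assumed threshold for "close to 0"
outlier = [];
for p = [1 3 5]           % disjoint pairs (1,2), (3,4), (5,6)
    ab  = [x(p) 1; x(p+1) 1] \ asin(y([p p+1]));  % solve for a and b
    bad = find(abs(y - sin(ab(1)*x + ab(2))) > tol);
    if numel(bad) == 1    % a clean pair leaves exactly one mismatch
        outlier = bad;
        break
    end
end
```

Here the first pair contains the tampered point, so its a and b disagree with almost every sample and it is skipped; the second pair is clean and flags the single point that does not fit.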