Script to remove polynomial/quadratic error off CSV data

[tl;dr: read a CSV, fit a curve, subtract it from the data and write the result back to the CSV]
Hello everyone,
for a research project I have large amounts of data coming off a profilometer. If you don't know, this is a device that measures a surface profile, in my case that of a thin film on a piece of glass, and stores it as X/Y data in .csv form. Inherent to this data is an error caused by the curvature of the glass plate, which needs to be removed. One such measurement produces about 40000 lines of data.
I have determined that a quadratic compensation is good enough for what I'm looking to measure. There are areas in front of and behind the film, as well as in the middle, where there is no film, and these can be used to fit a quadratic polynomial. The data is quite noisy, so you need to average over a few hundred points. What I would like to do is write a script that reads a CSV file, fits a quadratic polynomial to the areas that are known to be the glass plate, and subtracts this polynomial from the data. I would hopefully end up with data that is compensated for the curvature of the glass plate, which would then be added to the CSV file, ideally in a third column, if that is even possible.
Unfortunately, I am quite new to MATLAB. Although I managed to cobble together a script in the past that could read a CSV file and plot it, I don't know where to even start with this one. Has anyone done this before, or does anyone know how to do it?
Best, IJ
  6 Comments
dpb on 10 May 2021
Ah...that's a lot less restrictive of a problem statement than I had inferred from prior... :)
Are the spikes "real" in that they're going to be influencing this estimate across the sample or would/should rejecting them be part of the algorithm?
I've not looked at the rest, but there are relatively few really large spikes, from 2-3X to 5-6X the surrounding area, that are extremely large excursions at the beginning/ending, although they have some noise/structure at the peak (that may/may not be real?). Would it be desirable/acceptable to remove those and replace them with, say, a spline interpolant between?
That likely could be done reasonably robustly and then, having done that in your three selected areas, just fit that parabola on the means of those locations. You could investigate the effect of fitting the raw data as well, but I suspect it wouldn't help much and would, in fact, reintroduce more noise than would help.
I've got other tasks right now, but I'll try to look again later this evening. That would be my thinking on what I'd probably try. findpeaks, if you have the Signal Processing Toolbox, could be very helpful in locating peaks.
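The idea of fitting a parabola to the means of the three bare-glass areas can be sketched as follows. This is only an illustration on synthetic data: the scan shape, film positions, noise level and index ranges are all assumptions made up for the example, not values from the real measurement.

```matlab
% Synthetic stand-in for one profilometer scan (all values are assumed).
rng(0);                                   % reproducible noise
x = linspace(0, 40, 40000).';             % scan position
bend = 0.002*(x - 20).^2;                 % curvature of the glass plate
film = 1.5 * ((x > 5 & x < 18) | (x > 22 & x < 35));  % two film patches
y = bend + film + 0.05*randn(size(x));    % noisy measured height

% Index ranges assumed to be bare glass: start, middle and end of the scan.
n = numel(x);
mid = ceil(n/2);
regions = {1:2000, mid-1000:mid+1000, n-2000:n};

% Reduce each region to one averaged point to suppress the noise.
xm = cellfun(@(idx) mean(x(idx)), regions);
ym = cellfun(@(idx) mean(y(idx)), regions);

% Three points determine a quadratic exactly.
p = polyfit(xm, ym, 2);

% Subtract the fitted curvature from the whole scan.
y_corrected = y - polyval(p, x);
```

Because only three averaged points enter the fit, spikes inside the film have no influence on the estimated bend. Fitting all raw points of the three areas instead would be the alternative to compare against.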
Ivo Trausch on 10 May 2021
The spikes are part of the data and should actually stay in the representation, at least in the film. They may indicate contaminants, air bubbles in the film etc. which help to judge the surface quality. You could still do the interpolation for your data processing, but I wouldn't bother to be honest. Making the selected area bigger or moving it to a less noisy spot is probably easier, and it does not have to be perfect at all, I just need to take out the overall bend.
Thanks anyways for taking so much time out of your day.

Answers (2)

Steven Lord on 10 May 2021
The detrend function and/or the Remove Trends task may be of interest or use to you.
  4 Comments
Steven Lord on 10 May 2021
As of release R2019a detrend allows you to remove polynomial trends. See the Release Notes.
The Remove Trends task is new as of release R2019b. See the Release Notes.
Ivo Trausch on 18 May 2021
I tried detrend, but I could not get it to select the right part of my data.
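A possible explanation, offered as a sketch rather than a diagnosis: detrend(y, 2) fits the quadratic to every sample, so the film itself pulls on the fit. The example below compares that with a polyfit restricted to samples assumed to be bare glass. The data is synthetic, and the names glass, y_detrend and y_masked are made up for the illustration.

```matlab
% Synthetic scan: quadratic bend plus a film step in the middle (assumed values).
rng(1);
x = linspace(0, 40, 4000).';
bend = 0.002*(x - 20).^2;
film = 1.5 * (x > 5 & x < 35);
y = bend + film + 0.05*randn(size(x));

% detrend removes a degree-2 trend fitted to ALL samples, film included.
y_detrend = detrend(y, 2, 'SamplePoints', x);

% Alternative: fit only where the surface is assumed to be bare glass.
glass = ~(x > 5 & x < 35);
p = polyfit(x(glass), y(glass), 2);
y_masked = y - polyval(p, x);

% Apparent film height (film mean minus glass mean) after each correction.
height_detrend = mean(y_detrend(~glass)) - mean(y_detrend(glass));
height_masked  = mean(y_masked(~glass))  - mean(y_masked(glass));
fprintf('detrend: %.3f, masked polyfit: %.3f\n', height_detrend, height_masked);
```

In this synthetic case the masked fit recovers a film height close to the 1.5 that was put in, while the whole-scan fit absorbs part of the film into the trend and reports a smaller step.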


Ivo Trausch on 18 May 2021
Edited: Ivo Trausch on 18 May 2021
Hey everyone,
just to update you: I managed to perform the correction using the polyfit function. I collected three slices of my original data in a new matrix, fitted a quadratic polynomial to it, and subtracted that polynomial from the data. Here's the section of the code that's doing the job:
% Read the two-column X/Y profile
FirstSeries = readmatrix('G53_in.csv');
% SecondSeries = readmatrix('G53_cross.csv');
% Stack the three film-free slices (start, middle, end) into one matrix
FirstSeries_selection = [FirstSeries(1:2000, :); ...
    FirstSeries(ceil(end/2)-1000:ceil(end/2)+1000, :); ...
    FirstSeries(end-2000:end, :)];
% Fit a quadratic to the selected points only
corrector = polyfit(FirstSeries_selection(:, 1), FirstSeries_selection(:, 2), 2);
x_axis = FirstSeries(:, 1);
y_1 = FirstSeries(:, 2);
% y_2 = SecondSeries(:, 2);
% Evaluate the fit over the full scan and subtract it
y_1_fit = polyval(corrector, x_axis);
y_1_corrected = y_1 - y_1_fit;
And here's a plot generated with the original data, the fit and the corrected data.
Writing back to the CSV data is still a work in progress.
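One way the write-back might look, as a minimal sketch: append the corrected values as a third column and write the matrix out with writematrix. The file names below are hypothetical and point into tempdir so that nothing real is overwritten, and the data is synthetic and noise-free.

```matlab
% Hypothetical file names in the temp folder (not the real measurement files).
inFile  = fullfile(tempdir, 'example_in.csv');
outFile = fullfile(tempdir, 'example_out.csv');

% Synthetic two-column X/Y data standing in for a profilometer export.
x = (0:0.001:39.999).';
y = 0.002*(x - 20).^2 + 1.5*(x > 5 & x < 35);
writematrix([x y], inFile);

% Read, fit a quadratic to rows assumed to be bare glass, and subtract it.
data = readmatrix(inFile);
nRows = size(data, 1);
glass = [1:2000, nRows-2000:nRows];
p = polyfit(data(glass, 1), data(glass, 2), 2);
corrected = data(:, 2) - polyval(p, data(:, 1));

% Append the corrected profile as a third column and write the file.
writematrix([data corrected], outFile);
```

Writing to a separate output file keeps the raw measurement untouched. Passing the input file name to writematrix instead would overwrite the original with the three-column version. Note that readmatrix skips header lines and writematrix writes none, so if the real files carry a header, readtable and writetable may be the better fit.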

Release

R2021a
