Processing experimental data with polyfit

I have this code. It works, but with some warnings, and it doesn't look like other code I see here because I am not a programmer. It is very helpful to me, though, and I want to share it on the File Exchange. Would you please share your knowledge to make it look and function better? Thanks.
% this file reads data, extrapolates and generates fit curves with given degree of polyfit.
clear all
clc
D=22;% degree of polyfit
Y_axis=xlsread('F:\KT312',1,'b5:v42');
X_axis=xlsread('F:\KT312',1,'a5:a42');
for i=1:size(Y_axis,2)
Ys0=Y_axis(:,i);
[x,~]=find(isfinite(Ys0));
Xs=X_axis(x);
Ys=Ys0(x);
Xs2=linspace(Xs(1),Xs(end),100)';% extrapolation of the curve for a better precision
Ys2=spline(Xs,Ys,Xs2);
fitYs=polyfit(Xs2,Ys2,D);
Ysfit=zeros(size(Xs2));
for j=1:length(fitYs)-1
Ysfit(:,j)=fitYs(j)*Xs2.^(length(fitYs)-j);
end
Ysfit=sum(Ysfit,2)+fitYs(length(fitYs));
Ys2data(:,i)=Ys2;
Ysfit_data(:,i)=Ysfit;
Xs2_data(:,i)=Xs2;
end
clearvars -except Ys2data Ysfit_data Xs2_data
display 'done'
  3 Comments
Rik on 30 Oct 2020
That seems reasonable. Do you have a further question?


Accepted Answer

John D'Errico on 30 Oct 2020
Edited: John D'Errico on 30 Oct 2020
You asked for feedback...
So what does your tool offer that is innovative that someone might gain some benefit from?
  1. It reads in data. Generally, hard-coding the calls that read in data is terrible programming style. If someone else has their data in a different place, they need to modify your code, which is a bad idea.
  2. It tests for finite data, discarding infs. A good idea, I suppose. This is something anyone should do and know how to do in advance. Note that isfinite also returns false for NaNs, so the same test already drops them, which matters because NaNs are commonly used to signify missing data.
  3. It uses linspace to "extrapolate" the data. WRONG. There is no extrapolation done in your code; the new points all lie between Xs(1) and Xs(end), so this is interpolation. And since it uses a fixed number of points (100), the result may actually be a coarser grid, in case the real data involved a finer set of points.
  4. You interpolate the data with a spline, and then you use polyfit to approximate the spline? This is a really bad idea, since the spline will very possibly introduce artifacts into your data. Splines can often create ringing artifacts. Any noise in the data will be exaggerated by the interpolation.
  5. It fits the curve using polyfit, but it fits a polynomial of order 22? This is literally insane. That high a degree of polynomial is almost always going to result in garbage fits, and is especially bad if the polynomial was fit to arbitrarily scaled data. polyfit is a useful tool to fit a straight line. But once you get past a quadratic, or maybe a cubic, you may be using the wrong tool, for the wrong reasons.
  6. Finally, it computes something called Ysfit in a loop, raising Xs2 to powers as large as 22. That loop is a hand-written evaluation of the fitted polynomial, which polyval does in one call. Evaluating a polynomial of that degree on unscaled data is numerically fragile, so the result has little mathematical or statistical value.
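As a rough illustration of the points above, here is a minimal sketch (not the original poster's method) of fitting a low-degree polynomial directly to the finite samples, with no intermediate spline. The data are synthetic and every variable name is made up for this example. It assumes the three-output form of polyfit, which centers and scales x before fitting, and uses polyval to evaluate the result.

```matlab
% Synthetic stand-in for one column of measurements (assumed data, not from the post).
x = (0:5:100)';                          % abscissa, 21 samples
y = 0.002*x.^2 - 0.1*x + 3;              % smooth quadratic trend
y([4 11]) = NaN;                         % pretend two readings are missing

% isfinite is false for NaN, Inf and -Inf, so one mask removes all three.
keep  = isfinite(x) & isfinite(y);
xGood = x(keep);
yGood = y(keep);

% Low-degree fit. With three outputs, polyfit fits in the variable
% (x - mu(1))/mu(2), where mu = [mean(xGood); std(xGood)], which keeps
% the least-squares problem well conditioned.
degree = 2;
[coeffs, S, mu] = polyfit(xGood, yGood, degree);

% Evaluate on a grid that stays inside the measured range (no extrapolation).
xFine = linspace(min(xGood), max(xGood), 200)';
yFit  = polyval(coeffs, xFine, S, mu);

% Residuals at the measured points show how well the model describes the data.
residuals = yGood - polyval(coeffs, xGood, S, mu);
fprintf('largest residual: %g\n', max(abs(residuals)));
```

The degree is kept at 2 here only because the synthetic data are quadratic; for real measurements the residuals (and a plot) are what should justify the choice.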
Some other general comments.
  1. No documentation is provided. So if someone wants to know what you are doing or why, they need to make some sort of educated guess. And if a highly experienced user and writer of code involving splines and other modeling tools has no clue as to what and why you did something, then how in the name of god and little green apples do you expect a novice user to be able to guess what the code does?
  2. You used a script. This alone is bad, since it creates new variables in the base workspace. It steps on variables that may already exist, overwriting the values they had. Of course, since the first line of code clears everything the user might have had in their workspace, that too is unfriendly. And it leaves behind junk in the workspace. LEARN TO USE AND WRITE FUNCTIONS!
  3. The variables in your code have meaningless names, strings of characters that make no sense. Why are meaningful, readable, intelligent variable names important? They make your code easier to read and debug. They make it easier to use. And one day, when someone else in the universe might want to use your code (perhaps you get run over by the crosstown bus and someone else needs to take over and maintain your code base) they can do so. And it may not even be that you are unable to work on your code. Imagine that next month or next year, you need to use and modify this code. Would you have any reason to remember what your code does, and why? I have MATLAB code that is pushing 35 years in age, and is still usable.
I'm sorry, but this script offers essentially no value to others. It does nothing innovative. It chains together many things that are each alone highly suspect. And yes, you will have every right to be upset at my "review" of what you have to offer, but don't kill the messenger just because they tell you something you don't want to hear.
How could you fix it?
  1. First, learn to write and use functions. That would be your most important step in moving beyond novice programming.
  2. Breaking your code into functions means the code is itself easier to use, test, maintain, and modify as needed.
  3. Learn to document what you do. My recommended target is that EVERY line of code be readable and also have a comment attached that explains what it does. Comments are free, but are of incredible value to someone who will maintain the code. Since that target is a difficult one, I would insist that at least every significant block of code (every loop, for example, every test, etc.) have a comment explaining the purpose of the block, as well as an explanation in clear words of how it was done, in case there is anything of significance. Essentially, if it took you more than a minute to write a code fragment, then it should have an explanatory comment attached to that fragment!
  4. Learn to use intelligent variable names. They make your code easy to read and follow.
  5. Learn to write help for the functions you provide. The help should explain what the inputs to those functions are. It should explain what the function returns, what it does.
  6. Advice that I often give out is to PLOT EVERYTHING. You plotted nothing. When you provide a plot, you provide visual feedback to the user. Did this code do something reasonable? Or is it insanity poured into a blender?
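The advice in this list can be sketched as a single small function. This is only one possible shape, under the assumption that the caller has already read the data into vectors; the function name fitCleanColumn and all variable names are hypothetical. It takes its inputs as arguments instead of hard-coding a file path, carries a help block, comments each step, and plots the result.

```matlab
function [coeffs, mu] = fitCleanColumn(x, y, degree)
%FITCLEANCOLUMN Fit a polynomial to the finite samples of one data column.
%   [COEFFS, MU] = FITCLEANCOLUMN(X, Y, DEGREE) removes every sample where
%   X or Y is NaN or Inf, fits a polynomial of the given DEGREE with
%   POLYFIT (using centering and scaling), and plots the data and the fit.
%
%   COEFFS holds the coefficients in the scaled variable (X - MU(1))/MU(2),
%   so evaluate the fit with POLYVAL(COEFFS, XNEW, [], MU).

    % Work on column vectors so row and column input behave the same.
    x = x(:);
    y = y(:);

    % Keep only the samples that are usable in both vectors.
    keep = isfinite(x) & isfinite(y);
    x = x(keep);
    y = y(keep);

    % Fit in the centered and scaled variable for better conditioning.
    [coeffs, ~, mu] = polyfit(x, y, degree);

    % Plot everything: raw points and the fitted curve over the data range.
    xFine = linspace(min(x), max(x), 200)';
    plot(x, y, 'o', xFine, polyval(coeffs, xFine, [], mu), '-');
    xlabel('x');
    ylabel('y');
    legend('data', sprintf('degree %d fit', degree), 'Location', 'best');
end
```

Saved as fitCleanColumn.m, it could be called as [c, mu] = fitCleanColumn(X_axis, Y_axis(:,1), 3); and because it is a function, nothing in the caller's workspace is cleared or overwritten.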
I'll admit there is a lot of similar random stuff posted on the file exchange. But does what you have written improve on anything someone would find? How? Why? What is innovative about what you did? What did you do that would make the life of someone easier? Or would someone need to spend more time trying to figure out what your code does and how to modify it, than it would take to just rewrite the mess from scratch?
Again, I know you won't be happy to hear what I have written. You asked for feedback.
  3 Comments
John D'Errico on 31 Oct 2020
Edited: John D'Errico on 31 Oct 2020
If you think it is helpful, then you need to recognize that most other people are not solving the same problems as you are. But there will probably be a few such people. The issue is to find them. How do you do that?
First, make your code usable, in the ways I describe. Make it as friendly and easy to use as possible. Document it well. Explain clearly what you did and why. Note that this process will be valuable for you too, since doing so will make the code more usable for you. This will extend the useful life of the code, since if you want to use it a year from now, you will surely not remember what it did or how to use it.
The good documentation you should provide will make it easy for others to use, but also help since it will help you to explain what the code does, and why that was useful for you. And that will help others who are looking for what you have written. (Honestly, I still don't know for sure why you think it was useful to you, since order 22 polynomials are invariably garbage. Sorry, but they are meaningless. Nobody has ever convinced me you can approximate an interpolating spline with an order 22 polynomial and have the result provide useful value. But you could be the exception, and I would then accept it, if you could show the result does offer value.)
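One way to see why a degree this high is suspect is to look at the conditioning of the least-squares problem behind the fit. The sketch below is an illustration under assumed data (an abscissa running from 0 to 100, which is not taken from the question), and it relies on implicit expansion, so it needs R2016b or later. It compares the condition number of the design matrix for a cubic, for degree 22, and for degree 22 after centering and scaling x.

```matlab
% Hypothetical abscissa, 100 points from 0 to 100 (assumed, not the poster's data).
x = linspace(0, 100, 100)';

% Columns of each design matrix are x.^degree, ..., x, 1. This is the
% Vandermonde-type system that a polynomial least-squares fit has to solve.
designCubic = x .^ (3:-1:0);
designDeg22 = x .^ (22:-1:0);

% The same degree-22 system after centering and scaling x, which is what
% polyfit does internally when called with three outputs.
z = (x - mean(x)) / std(x);
designDeg22Scaled = z .^ (22:-1:0);

% A large condition number means small changes in the data can produce
% large changes in the fitted coefficients.
condCubic       = cond(designCubic);
condDeg22       = cond(designDeg22);
condDeg22Scaled = cond(designDeg22Scaled);

fprintf('cond, degree 3:           %.3g\n', condCubic);
fprintf('cond, degree 22:          %.3g\n', condDeg22);
fprintf('cond, degree 22 (scaled): %.3g\n', condDeg22Scaled);
```

Scaling helps, but it does not make a degree-22 fit meaningful; it only reduces the purely numerical part of the problem.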
Regardless, you will need to provide good tags when you post this on the file exchange. Tags are how someone can search for your code, how they might find it.
For example, your code uses polynomial fits to data. So the first tag I would think of might be "polyfit". Next, look for other one-word tags that will help direct someone to your code. Essentially, you use the tag cloud as a way to help the people who would look for the routines you provide to then find your code.
Once they find it, then you need to write a GOOD CLEAR explanation of what the code does in the description. Show someone why they want to download your code, explain what it does that will improve their MATLAB lives. I would STRONGLY recommend an example. Pictures are also good. You may not appreciate this, but it is the packaging that sells the most successful products. Good packaging is essential. In the end, in order to convince someone to download something from the file exchange, you need to sell it to them. The cost of downloading a tool from the file exchange is the time required to find it, the time required to download and install it, the time required to learn to use it. You get no money in return, but it still has a cost to the person who might download it. (All of this said by someone who has many tools on the file exchange, and many, many, many thousands of downloads.)
