Not just another dynamic variable naming question! Generating a new string and using it in a command.
I am working on a large data project that has potentially thousands of similarly named files in CSV format that I need to analyze in MATLAB. These files are named nnnwavex.csv (where nnn = a number and x = either I, II, III, or V).
Example directory contents:
- 352waveI.csv
- 352waveII.csv
- 352waveIII.csv
- 352waveV.csv
My previous code to import this was to enter the following
wave352i = importdata('352waveI.csv');
tele352i = wave352i.data;
text352i = wave352i.textdata;
wave352ii = importdata('352waveII.csv');
tele352ii = wave352ii.data;
text352ii = wave352ii.textdata;
wave352iii = importdata('352waveIII.csv');
tele352iii = wave352iii.data;
text352iii = wave352iii.textdata;
wave352v = importdata('352waveV.csv');
tele352v = wave352v.data;
text352v = wave352v.textdata;
Now I have generated a bit of variable naming code that seems to work well for the first part (importing the file as a struct of data and textdata). I have spent hours reading the material about why variable naming is not ideal. This is simply to import large numbers of files quickly.
x=352;
y='wave';
z='I'
a='.csv'
b='tele'
c='.data'
cat(2,num2str(x),y,z,a)
eval(['temp = importdata(ans)']);
cat(2,y,int2str(x),z);
v = genvarname(ans);
eval([v '= temp']);
z='II'
cat(2,num2str(x),y,z,a)
eval(['temp = importdata(ans)']);
cat(2,y,int2str(x),z);
v = genvarname(ans);
eval([v '= temp']);
z='III'
cat(2,num2str(x),y,z,a)
eval(['temp = importdata(ans)']);
cat(2,y,int2str(x),z);
v = genvarname(ans);
eval([v '= temp']);
z='V'
cat(2,num2str(x),y,z,a)
eval(['temp = importdata(ans)']);
cat(2,y,int2str(x),z);
v = genvarname(ans);
eval([v '= temp']);
clear x y z v temp c b ans a
This leaves me with four structs:
- wave352I
- wave352II
- wave352III
- wave352V
How can I make a script that then builds a variable name to accomplish the following and gives me a double and a cell from the original struct?
If I create a string a = 'wave352V.data', I cannot use it in my code to call that subset of the struct and apply it to a new variable.
tele352v = wave352v.data;
text352v = wave352v.textdata;
I could use Excel and Word and build a script that uses mail merge to create multiple iterations of the initial code I demonstrated above, but that seems like a very amateur way to approach this.
Any ideas would be appreciated!
3 Comments
dpb
on 6 May 2017
Well, despite your efforts I think you're heading down the wrong path here...
First, what data do you need at any one time to do the analysis: is it a single one of these files, or a defined set of the nnn for a given Roman numeral, or all files of a given numeral, or ...?
That'll be the starting point for what you need for addressing a given (set of) file(s).
Then, what needs to happen on these files and what's the output and where's it going to go?
Once we have a clear definition of the task, THEN we can approach the sequence of operations to accomplish it.
Thomas Fogarty III
on 6 May 2017
@Thomas Fogarty III: "I find working in cell's very painful and have avoided that route". Okay... today might be a good day to practice using cell arrays then, because the more you practice using them, the easier they are to use! Take the plunge with cell arrays and you will learn more useful ways of using MATLAB, and you will learn how to solve your question yourself using simple code.
Do you have a particular reason why you need to have lots of separately named variables in your workspace? I read your question and comments several times carefully, but could not find one.
"I have spent hours reading the material about why variable naming is not ideal". Hopefully you understood that there are multiple reasons why doing this is a bad idea, not least of which is what on earth will you do with a million separate variables in your workspace? Let me illustrate: I want to calculate sine of several values, which might seem easy with just two variables:
a1 = 0;
a2 = pi/2;
sin(a1)
sin(a2)
What happens if I now have one million values? Would I write one million times:
sin(a1)
sin(a2)
...
sin(a1000000)
Or would I use the much better method of putting my data into one array? In fact the second half of your title deserves comment: "Generating a new string and using it in a command." Once you have generated one million variable names using eval, the only way to call any operator with those variables is to use eval again. And so it goes on... you paint yourself into a corner, with no way out.
Why do you want/need to do this? Note that you have not yet given a single reason why you cannot use the better programming methods suggested in the FAQ, the tutorials, the MATLAB documentation, or the answers below.
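A minimal sketch of the one-array alternative to the sine example above. The variable names `a` and `s` are only illustrative:

```matlab
% One array replaces a1, a2, ..., a1000000.
a = linspace(0, 2*pi, 1e6);   % one million values in a single variable
s = sin(a);                   % one call evaluates every element

% Any individual value is reached with an index, not with a new name.
third = s(3);
```

The same indexing works whether the array holds three values or a million, which is the point of avoiding numbered variable names.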
Processing multiple data files usually involves looping or comparing of their data, and the method is simple: put the data into an array (ND numeric, cell, struct, table, etc) and use indices. It is so simple that it works! (and is also faster, more reliable, simpler, etc, etc). So far you have not given one single reason why you cannot do this.
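As a rough sketch of "put the data into an array and use indices", the loop below imports every matching file into two cell arrays. The temporary folder, the sample file contents, and the assumption of exactly one header line are all made up for the demonstration; the real files may need different `importdata` arguments.

```matlab
% Demo setup (assumption): write four small sample files to a temporary folder.
folder = tempname;
mkdir(folder);
names = {'352waveI.csv', '352waveII.csv', '352waveIII.csv', '352waveV.csv'};
for k = 1:numel(names)
    fid = fopen(fullfile(folder, names{k}), 'w');
    fprintf(fid, 'time,value\n');                        % one header line
    fprintf(fid, '%d,%d\n', [1 2 3; 10*k 20*k 30*k]);    % three numeric rows
    fclose(fid);
end

% The actual import: one loop, one index, no eval and no generated names.
files = dir(fullfile(folder, '*wave*.csv'));
tele  = cell(1, numel(files));   % numeric data of each file
txt   = cell(1, numel(files));   % text/header data of each file
for k = 1:numel(files)
    S = importdata(fullfile(folder, files(k).name), ',', 1);
    tele{k} = S.data;
    txt{k}  = S.textdata;
end
```

After the loop, `files(k).name`, `tele{k}` and `txt{k}` all refer to the same file, so the file name is never lost even though it is not part of a variable name.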
Please do not ignore dpb's comment "Once we have a clear definition of the task, THEN we can approach the sequence of operations to accomplish it." You do not tell us what your task is. So far we do not have a clear explanation of what you are trying to do with this data. Do you want to merge that data into new files, process some particular fields or values over all measurements, or process each measurement independently?
Accepted Answer
More Answers (3)
Steven Lord
on 6 May 2017
Don't write a script to do this. Write a function that accepts the name of the file to be imported and returns the specific pieces you want from that file. The names of the variables you use inside the function don't matter to code outside the function under most circumstances, so you can use general names in that function.
function [dataVector, scalarStatus] = readMyFiles(filename)
data = load(filename);
% process data to generate the variables dataVector and scalarStatus
Once the data returns from your file, if you must store it in some way that allows you to access it via name, consider a struct array or a table (if your file name is also a valid MATLAB identifier as per the isvarname function) or perhaps a containers.Map object.
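One way this could look, as a sketch only: `readWaveFile` is a hypothetical helper (not a MATLAB function), the sample files are invented for the demonstration, and the import assumes a comma-delimited file with one header line. Local functions at the end of a script need R2016b or later; otherwise save the function in its own file.

```matlab
% Demo setup (assumption): write two small sample files to a temporary folder.
folder = tempname;
mkdir(folder);
names = {'352waveI.csv', '352waveII.csv'};
for k = 1:numel(names)
    fid = fopen(fullfile(folder, names{k}), 'w');
    fprintf(fid, 'time,value\n');
    fprintf(fid, '%d,%d\n', [1 2; 10*k 20*k]);
    fclose(fid);
end

% Key the results by file name instead of creating one variable per file.
waves = containers.Map('KeyType', 'char', 'ValueType', 'any');
for k = 1:numel(names)
    [tele, txt] = readWaveFile(fullfile(folder, names{k}));
    entry.data = tele;
    entry.text = txt;
    waves(names{k}) = entry;
end
waveII = waves('352waveII.csv');   % look up by name, no eval needed

function [tele, txt] = readWaveFile(filename)
% Hypothetical helper: return the numeric block and the header text of one file.
% Assumes a comma-delimited file with exactly one header line.
S = importdata(filename, ',', 1);
tele = S.data;
txt  = S.textdata;
end
```

A `containers.Map` accepts any character vector as a key, so file names that start with a digit (and are therefore not valid MATLAB identifiers) can be used unchanged.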
2 Comments
Thomas Fogarty III
on 6 May 2017
"But at the end of the day, it would require me putting in output names for each variable and really be only a touch quicker than running a script or doing it manually"
Quicker how? Quicker in terms of code running time? No, your script with eval will be slower. Quicker in terms of writing time? No, you have already wasted hours on this and had to ask for advice on an internet forum. Using a loop and a cell array would have taken 30 seconds to write. Quicker in terms of debugging time? Without all of the help that MATLAB gives when writing code properly using a loop, that would be a joke!
"I have upwards of 300-400 files at present that need importing and they follow the same pattern and naming convention, something that should be easy to code/automate." It is easy. People write code like this all the time. They import tens/hundreds/thousands/millions of files easily, without any problems, by using loops (and not just in MATLAB, but most any language). The MATLAB documentation tells us how, and also many threads on this forum show how:
"I know this concept of dynamic names make most of you guys wince!" Forget about us, how about reading the MATLAB documentation?:
Les Beckham
on 7 May 2017
Edited: Les Beckham
on 7 May 2017
2 votes
A lot of what I'm going to say has already been said by far smarter people on this thread. I'm going to rephrase from a less specific viewpoint, in hopes that it will help you grasp what the others have been telling you.
As I see it, you are facing this situation:
- You have a lot of input data to process and, fortunately, this data is stored in files with a well defined file naming convention.
- You need to process each of these files (or well defined groups of them) with some algorithm. You yourself have used the word 'each' several times and, to me, that screams 'loop'.
So, you need to define:
- How do I group these files logically (using the naming convention)?
- Do I need to load and process one file at a time, or groups of them?
- What algorithm do I need to apply to each file or group?
Once you've answered those questions, the FAQs on how to process multiple files come into play along with these points:
- When you are looping through a bunch of data to apply the same algorithm to each piece of data you should implement that algorithm in a function where you pass the data to the function and get back the result.
- This kind of data (and the corresponding results of the algorithm applied to the data) is often best stored in an array.
- This 'array' can be a basic numeric array or a cell array or an array of structs or even a table.
- Remember that the fact that the source data came from files with unique names does not mean that the processing of each of those files (or appropriate groups of files) needs to retain those names. For processing (applying your algorithm), you simply have 'input' data and 'output' data. Where the data came from and where you eventually store that data should not be embedded into the processing of the data.
You could read your input data into cell arrays or struct arrays, apply your processing and either append the output data to these or create separate output cell or struct arrays. It is totally up to you. Just don't embed the names of the source of the data into the processing of that data.
Note that it would be perfectly acceptable (and, often, even recommended) to retain the input source (e.g., input filename) as a field (if you use structs) or a cell (if you use cells) in the output data.
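A small sketch of that separation, using a struct array. The input matrices are made-up stand-ins for imported file contents, and `analyze` is a hypothetical placeholder for whatever algorithm is actually needed:

```matlab
% Made-up input data standing in for the imported file contents (assumption).
names = {'352waveI.csv', '352waveII.csv', '352waveIII.csv', '352waveV.csv'};
raw   = {[1 2; 3 4], [5 6; 7 8], [9 10; 11 12], [13 14; 15 16]};

% Hypothetical algorithm: column means. It sees only data, never a file name.
analyze = @(x) mean(x, 1);

% One struct per file; the source name travels along as ordinary metadata.
results = struct('source', names, 'input', raw, 'output', []);
for k = 1:numel(results)
    results(k).output = analyze(results(k).input);
end

% The name is used only when looking something up afterwards.
isV  = strcmp({results.source}, '352waveV.csv');
outV = results(isV).output;
```

Because the processing step takes plain input and returns plain output, swapping in a different algorithm or a different set of files does not require touching any names.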
I hope this helps.
3 Comments
Les Beckham
on 7 May 2017
I would be delighted to be a part of your excellent tutorial.
Thanks
Image Analyst
on 6 May 2017
1 vote
4 Comments
Thomas Fogarty III
on 6 May 2017
Stephen23
on 6 May 2017
" It doesn't get around the need for dynamically naming them based on the input though from what I can tell"
What do you want to do with all of these dynamically named variables?
Steven Lord
on 6 May 2017
Generate the list of filenames using the commands in the FAQ. Pass those filenames in turn to the function I suggested you write. When the data comes out of the function to a standard variable (NOT one with a dynamic name) copy those results into a struct, cell, table, or containers.Map. Use the changeable information as the field name in the struct, the variable (column) name in the table, or the key in the containers.Map.
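A sketch of those steps using dynamic field names in a struct. The sample files, the regular expression, and the `wave<nnn><numeral>` field naming are assumptions chosen to match the file pattern in the question; field names must start with a letter, so the file name itself cannot be used directly.

```matlab
% Demo setup (assumption): sample files in a temporary folder.
folder = tempname;
mkdir(folder);
numerals = {'I', 'II', 'III', 'V'};
for k = 1:numel(numerals)
    fid = fopen(fullfile(folder, sprintf('%dwave%s.csv', 352, numerals{k})), 'w');
    fprintf(fid, 'time,value\n%d,%d\n', k, 100*k);   % header plus one data row
    fclose(fid);
end

% Generate the file list, import each file, and store it under a field name
% built from the changeable parts of the file name.
files = dir(fullfile(folder, '*wave*.csv'));
waves = struct();
for k = 1:numel(files)
    tok = regexp(files(k).name, '^(\d+)wave([IV]+)\.csv$', 'tokens', 'once');
    if isempty(tok)
        continue;                            % skip files that do not match the pattern
    end
    S = importdata(fullfile(folder, files(k).name), ',', 1);
    field = ['wave' tok{1} tok{2}];          % e.g. 'wave352III', a valid field name
    waves.(field).data     = S.data;
    waves.(field).textdata = S.textdata;
end

tele352v = waves.wave352V.data;              % ordinary field access, no eval
```

The parenthesized form `waves.(field)` also works for reading, so a name held in a string can select the data without `eval`.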
dpb
on 6 May 2017
Be interesting on this one to see if we ever manage to break through the preconceived mindset, Steven... :)