Add new row to String Array.

Hi,
I have an X-by-1 string array, and I currently have something along the lines of:
strarray.full = [strarray.full;NEWDATA];
The code adds a new row on each loop iteration and recreates the string array each time.
This works fine for smaller files, but the txt files I am reading are around 200,000 lines, so this takes around 20 minutes to run.
I am trying to get something like this to work, to avoid recreating the matrix on each loop and make it a lot faster:
strarray.full(end+1,1) = NEWDATA
I keep getting the error "Unable to perform assignment because the indices on the left side are not compatible with the size of the right side".
The first loop results in an empty NEWDATA. I also do not believe I am allowed to share the code itself.
Thanks.

 Accepted Answer

You can preallocate memory for the entire string array with
str = strings(200000,1);
and then fill row-by-row rather than appending. This will avoid the string array being copied to different memory locations as it fills, and should radically speed up the operation.
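A minimal sketch of filling a preallocated array by index, with made-up data. The names `nMax`, `n`, and `newData` are illustrative and not from the original code. It also skips empty results, on the assumption that the assignment error in the question comes from `NEWDATA` being empty on the first pass:

```matlab
nMax = 200000;                 % assumed upper bound on the number of rows
str  = strings(nMax, 1);       % preallocated column of empty strings ("")
n    = 0;                      % number of rows filled so far

for k = 1:10
    % Hypothetical stand-in for NEWDATA; empty on the first pass,
    % as described in the question.
    if k == 1
        newData = strings(0, 1);
    else
        newData = "line " + k;
    end

    % Assigning a 0-by-1 value to a single element is a size mismatch
    % and errors, so skip empty results.
    if isempty(newData)
        continue
    end

    n = n + 1;
    str(n) = newData;          % fill in place, no reallocation
end

str = str(1:n);                % trim the unused preallocated rows
```

Keeping a separate counter `n` means the array can be preallocated to a generous upper bound and trimmed once at the end.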

1 Comment

Adam Brabec on 12 Jul 2020
Edited: Adam Brabec on 12 Jul 2020
Hi Cyclist,
I was able to rework the code in the way you suggested, so thank you. I'm not sure why, but it appears to be slower than the original method.
The loop times still drastically increase as the matrix is being filled. I stopped it at around 30 minutes.

More Answers (1)

Adam Brabec on 13 Jul 2020
I was able to find a fix using a "dump matrix": the lines from each .txt file are collected in a small temporary matrix, and after the file has been read, the dump matrix is appended to the end of the larger matrix.
This knocked the time down from 20 minutes to 7 minutes.
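The dump-matrix approach might look roughly like the sketch below. The file contents are simulated, and `nFiles`, `linesPerFile`, `dump`, and `allLines` are hypothetical names, since the original code was not posted:

```matlab
nFiles       = 3;                      % assumed number of .txt files
linesPerFile = 4;                      % assumed lines per file
allLines     = strings(0, 1);          % large array, grown once per file

for f = 1:nFiles
    dump = strings(linesPerFile, 1);   % small buffer for this file only
    for r = 1:linesPerFile
        dump(r) = "file " + f + ", line " + r;   % stand-in for a parsed line
    end
    allLines = [allLines; dump];       %#ok<AGROW> one append per file
end
```

The large array is still reallocated, but only once per file rather than once per line, which is consistent with the run time dropping without disappearing.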

6 Comments

If you are able to upload the data and code that currently has such a long run time, maybe people could find a way to speed it up. 7 minutes still seems pretty long for just filling a string array. But you hadn't mentioned reading text files, so maybe that is the bottleneck. You could use the profiler to investigate.
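As a rough illustration of that last suggestion, the profiler can be wrapped around a loop like this. The loop is a deliberately slow stand-in, not the original code:

```matlab
profile on                      % start collecting timing data

str = strings(0, 1);
for k = 1:2000
    str = [str; "line " + k];   %#ok<AGROW> deliberately slow append
end

profile off                     % stop collecting
stats = profile('info');        % struct of results, including FunctionTable
% profile viewer                % uncomment to open the interactive report
```

The interactive report breaks the time down by line, which helps separate time spent reading files from time spent growing the array.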
Sorry for the delay. I was able to get permission to upload the code here, and I can post a stripped-down version of the .txt files as well.
The Profiler showed that the bottleneck was still inserting the new data into the matrix.
.txt Format 1:
Datetime1 ~ *string* ~ *string* ~ *string*
*string*
Datetime1 ~ *string* ~ *string* ~ *string*
*string*
*string*
*string*
*string*
Datetime1 ~ *string* ~ *string* ~ *string*
*string*
*string*
.txt Format 2:
datetime2 ~ *string* ~~~ *string* ~~~ *string* ~*~
datetime2 ~ *string* ~~~ *string* ~~~ *string* ~*~
datetime2 ~ *string* ~~~ *string* ~~~ *string* ~*~
the cyclist on 18 Jul 2020
Edited: the cyclist on 18 Jul 2020
Rather than post the format of a text file, can you just upload a sample text file with made-up data? Make it look as close to a typical file as possible.
How many files do you have, such that it takes 7 minutes to run your code? If you post one typical file, can I duplicate that one file many times in the directory to recreate your issue? Can you just upload a zip file with many files like that?
Do you see what I am doing here? I'm trying to make it as easy as possible for me (and any other contributor here) to replicate exactly what is happening to you. Otherwise, you are leaving too much effort -- and too much guesswork -- that might inadvertently be spent on tasks that are not directly solving your actual problem.
I'm sorry, Cyclist, that is as close to the format as I'm allowed to post. I don't want to break any rules at the company I work for.
I kept looking around and found that preallocating ended up being correct, but for some reason, preallocating a field inside a structure doesn't work.
With the string array preallocated as you suggested earlier, I switched it from:
strarray.full(j) = [NEWDATA];
to:
full(j) = NEWDATA;
where j is a counting variable.
I apologize for frustrating you and wasting your time. I appreciate you still trying to help me.
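For reference, the two variants discussed above (a preallocated struct field versus a preallocated plain variable) can be sketched side by side. The row count and contents are made up, and this only checks that both produce the same array; any speed difference would need to be measured on the real data, for example with `tic`/`toc`:

```matlab
n = 1000;                          % assumed row count

% Variant 1: preallocated struct field, as in strarray.full
strarray.full = strings(n, 1);
for j = 1:n
    strarray.full(j) = "row " + j;
end

% Variant 2: preallocated plain variable. Named fullLines rather than
% full to avoid shadowing MATLAB's built-in full function.
fullLines = strings(n, 1);
for j = 1:n
    fullLines(j) = "row " + j;
end

sameResult = isequal(strarray.full, fullLines);
```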
Not frustrated, and I don't feel like I wasted my time. I'm just trying to help you get to a solution efficiently.
Just bear in mind that any work you leave for someone else, rather than doing that prep work yourself, has to be done by every person who tries to help you. So, it can be a hurdle that some people just won't bother with.
Looking back at what I posted, I completely understand. From now on I will be sure to do more prep work.
Thanks

Release

R2019b
