Split a cell array of character vectors at multiple-number delimiter
Giuseppe Degan Di Dieco
on 24 Sep 2021
Commented: Giuseppe Degan Di Dieco
on 27 Sep 2021
Hello Everyone,
I hope that you're doing well, and your loved ones as well.
I wanted to ask you for help, after many trials. I have data on construction types for bridges, imported from an Excel spreadsheet, and their data type is CELL ARRAY OF CHARACTER VECTORS.
Below, I jotted down a few example lines of these data:
BEAMS-PREFLEX WITH SLAB-RC 4 NO SPANS
ARCH-MASONRY 1 NO STONE (LIMESTONE SADDLED 1924)
ARCH-MASONRY 2 NO SPAN WIDENED WITH SLAB-R.C. 2 NO SPAN
BOX-R.C. 1 NO SPAN
We can see that they have a typical structure: the construction system, followed by the superstructure material, and then the number of spans. However, these parameters should go into separate cells.
I'm a beginner at analysing text with computing techniques. To use the built-in function 'split', I converted the data to STRING and then defined a pattern for the number of spans, namely the numbers 1 to 20 converted to strings. However, I couldn't save the split text in the variable spl because my for loop doesn't work, and the number of pieces returned by split changes from one data line to the next.
For example, for the first dataline I retrieve:
spl, 2x1 string, that is:
BEAMS-PREFLEX WITH SLAB-R.C.
NO SPANS
Then, for the second dataline I retrieve:
spl, 6x1 string, that is:
ARCH-MASONRY
NO SPAN (LIMESTONE SADDLED)
3 WHITE SPACES
)
I also attached my simple code for any advice.
Thanks for your time and consideration, I look forward to hearing from you.
Best.
% T2 is the data table
% T2.ConstructionType = string(T2.ConstructionType)
% str = T2{:, 6}; %T2.ConstructionType is the sixth column of T2
% noOfSpans = [1:1:20]
% pat = string(noOfSpans)
% spl = []; %Initialisation of variable for the split results
% for i = 1:1:length(str)
% spl = split(str(i), pat)
% end
Accepted Answer
Stephen23
on 24 Sep 2021
Edited: Stephen23
on 24 Sep 2021
Rather than telling us what you currently get, it is probably more useful if you tell us what you want.
I made some guesses about how that text is formatted:
str = ["BEAMS-PREFLEX WITH SLAB-RC 4 NO SPANS";...
"ARCH-MASONRY 1 NO STONE (LIMESTONE SADDLED 1924)";...
"ARCH-MASONRY 2 NO SPAN WIDENED WITH SLAB-R.C. 2 NO SPAN";...
"BOX-R.C. 1 NO SPAN"];
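% Tokens: (1) system before the first hyphen, (2) material up to the first number,
% (3) the first number (span count), (4) the rest of the line.
tkn = regexp(str,'^(\w+)-(\D+?)\s*(\d+)\s*(.*)','tokens','once');
% Stack the per-line token rows into one array, one row per input line.
tkn = vertcat(tkn{:})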
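If the goal is to have each parameter in its own column, the tokens can be moved into a table. The following is only a sketch under the same guesses about the text layout as above; the column names (System, Material, NumSpans, Remainder) are made up for illustration and are not from the original data. It also assumes str is a string array, and it checks that every line matched before stacking, because a non-matching line would return an empty result that vertcat skips, leaving the rows misaligned.

```matlab
% Example lines copied from the question
str = ["BEAMS-PREFLEX WITH SLAB-RC 4 NO SPANS"; ...
       "ARCH-MASONRY 1 NO STONE (LIMESTONE SADDLED 1924)"; ...
       "ARCH-MASONRY 2 NO SPAN WIDENED WITH SLAB-R.C. 2 NO SPAN"; ...
       "BOX-R.C. 1 NO SPAN"];

% One set of four tokens per line (same expression as above)
tkn = regexp(str, '^(\w+)-(\D+?)\s*(\d+)\s*(.*)', 'tokens', 'once');

% Stop early if any line does not follow the assumed layout
matched = ~cellfun(@isempty, tkn);
assert(all(matched), 'Some lines do not follow the assumed layout.');

% Stack into an N-by-4 string array, one row per input line
tkn = vertcat(tkn{:});

% Hypothetical column names; the span count is converted to a number
T = table(tkn(:,1), tkn(:,2), str2double(tkn(:,3)), tkn(:,4), ...
    'VariableNames', {'System','Material','NumSpans','Remainder'});
disp(T)
```

Note that only the first number on each line is treated as the span count, so a line such as the third example keeps "NO SPAN WIDENED WITH SLAB-R.C. 2 NO SPAN" together in the last column. Whether that is what you want depends on how the second structure should be handled.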