How to filter certain parts of a string.
6 views (last 30 days)
Show older comments
Hello,
So I have a string file which has a ton of data in it.
Let's say it looks like this:
"Introduction:
x1x1x1x1x1x1x1x1x1x1x1x1x1
Summary:
x2x2x2x2x2x2x2x2x22x2x2x2
Conclusion:"
What I would want to do is setup a filter so that it reads the keyword Introduction and Summary and captures everything in between. While ignoring everything else.
In this case it should be
"Introduction:
x1x1x1x1x1x1x1x1x1x1x1x1x1"
If it helps, I am running a script with this code to generate this string file after reading the contents from a word document:
word = actxserver('Word.Application');
word.visible = 0;
wdoc = word.Documents.Open('wordfile.docx');
x = splitlines(string(wdoc.Content.Text));
wdoc.Close;
word.Quit;
0 Comments
Accepted Answer
Walter Roberson
on 17 Sep 2019
Edited: Walter Roberson
on 17 Sep 2019
S = char(wdoc.Content.Text);
x = regexp(S, '^Introduction:.*?(?=^\nSummary:)', 'match', 'lineanchors');
10 Comments
greenyellow22
on 23 Jun 2022
Hey Walter! I thought I might just ask you directly. I have a table of which one of the columns includes text. Some cells of this column look like this: 'New Scene was loaded: Castle' or 'New Scene was loaded: World1'.
I want to filter out/split for all cells with the text 'New Scene was loaded:' the last word e.g. 'Castle' or 'World1' and insert it into a new column. Can you tell me how I can split these cells based on this rule? I appreciate any help!! :)
Walter Roberson
on 24 Jun 2022
TableName.NewColumnName = regexp(TableName.ColumnName, '(?<=New Scene was loaded:\s+)\S+', 'match', 'once')
There is a possibility that TableName.ColumnName might need to be {TableName.ColumnName} but I think it unlikely.
More Answers (0)
See Also
Categories
Find more on Data Import and Export in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!