How to extract string/numbers after a keyword in a text
Hi,
I have a text like the following.
>HEAD
DATAID="P=01 R=14 (H)"
ACQBY="DGSM"
FILEBY=""
ACQDATE=04/11/22
ENDDATE=04/13/22
FILEDATE=02/16/23
COUNTRY="UG"
LAT=3:26:1.6
LONG=30:56:25.9
ELEV=1224.44
UNITS=M
STDVERS="SEG 1.0"
PROGDATE="10/20/20"
EMPTY=1.0E32
>INFO MAXINFO=108
UNIQUE ID: {fd652be6-76e5-476d-ace0-bf2981887cf2}
PROCESS DATE: 2022-06-29 18:13
DURATION: 38 h 36 m 20 s
DECLINATION: 2°
...
...
How can I extract the value that follows a given keyword? For instance, I would like to have
keyword = "LAT"
lat = [3,26,1.6];
keyword = "ACQDATE"
date = 04/11/22;
keyword = "EMPTY"
emptyval = 1.0E32;
keyword = "UNIQUE ID"
uid = 'fd652be6-76e5-476d-ace0-bf2981887cf2';
...
Would there be a function that can take care of all types of values?
Thanks,
Jasmine
2 Comments
the cyclist
on 30 May 2023
Edited: the cyclist
on 30 May 2023
Can you upload the text file? You can use the paper clip icon in the INSERT section of the toolbar.
Also, can you be more specific about what you need? For example, do you just want to input a keyword, and have MATLAB store/output the corresponding value? Or, do you want MATLAB to figure out all the keyword-value pairs and store them?
Accepted Answer
dpb
on 30 May 2023
Edited: dpb
on 31 May 2023
"How to extract the corresponding value after a given keyword?"
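data=readlines('Example.txt');      % read the file once into a string array; don't open/close it for every lookup
keywords=extractBefore(data,"=");   % text before the first "=" on each row; same length as data, <missing> where a row has no "="
while true
    instr=input("Enter keyword wanted ('Q' to quit): ",'s');
    if matches(instr,'Q','IgnoreCase',true), break, end      % quit before attempting a lookup
    value=extractAfter(data(matches(keywords,instr)),"=");   % exact, case-sensitive match on the keyword only
    if isempty(value)
        disp("No such keyword: "+instr)
    else
        disp(value)
    end
end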
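The snippet above keys on "=", so records in the >INFO block such as UNIQUE ID, which use a colon, will not be found. A minimal sketch of one way to cover both follows. It assumes every row without "=" uses ":" as its separator, which is inferred only from the sample in the question. The inline data array stands in for the file contents.

```matlab
% Stand-in for readlines('Example.txt'); rows copied from the sample in the question.
data = ["LAT=3:26:1.6"
        "EMPTY=1.0E32"
        "UNIQUE ID: {fd652be6-76e5-476d-ace0-bf2981887cf2}"
        "DURATION: 38 h 36 m 20 s"];

hasEq    = contains(data, "=");      % rows of the form KEY=value
keywords = strings(size(data));
values   = strings(size(data));

% "=" rows: split at the first "=" (LAT keeps its colons in the value).
keywords(hasEq) = extractBefore(data(hasEq), "=");
values(hasEq)   = extractAfter(data(hasEq), "=");

% Remaining rows: ASSUMED to be "KEY: value"; split at the first ":".
keywords(~hasEq) = extractBefore(data(~hasEq), ":");
values(~hasEq)   = strtrim(extractAfter(data(~hasEq), ":"));

% Look up one keyword and strip the braces around the id.
uid = erase(values(matches(keywords, "UNIQUE ID")), ["{" "}"]);
disp(uid)
```

Rows that contain neither separator (for example >HEAD) would end up with a <missing> keyword and simply never match.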
"Would there be a function that can take care of all types of values?"
Only a cell array can hold arbitrary mixed data types. As the snippet above illustrates, return the value as a string first, then worry about converting it to the proper data type. There is no magic elixir there, though; you'll need either a lookup table of data type by keyword or some parsing logic that tries to figure out automagically what each variable's type is.
Dates may not be too hard to convert to datetime: they may be the only fields containing a forward slash, or you can rely on the keyword always containing the substring "DATE". I noticed one "DURATION" field; studying the file may reveal other patterns that can be exploited rather than just attempting a conversion blindly. Strings appear to be delimited with ", so that is also a clue.
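Those clues can be turned into a small set of tests applied in order. The following is only a sketch under stated assumptions: the rules, their order, and the month/day/year date format are guesses from the >HEAD sample in the question (ENDDATE=04/13/22 suggests month first), and the variable names are made up for illustration. Values from the >INFO block, such as PROCESS DATE, would need rules of their own.

```matlab
% Raw values as extractAfter would return them (all strings), with their keywords.
keys = ["ACQBY"    "ACQDATE"  "LAT"      "EMPTY"];
raw  = ["""DGSM""" "04/11/22" "3:26:1.6" "1.0E32"];

vals = cell(size(raw));                      % a cell array can hold mixed types
for k = 1:numel(raw)
    r = strtrim(raw(k));
    if startsWith(r, '"')
        vals{k} = erase(r, '"');             % quoted text: keep as string, drop the quotes
    elseif contains(keys(k), "DATE") && contains(r, "/")
        % ASSUMED format: month/day/two-digit year
        vals{k} = datetime(r, 'InputFormat', 'MM/dd/yy');
    elseif contains(r, ":")
        vals{k} = str2double(split(r, ":")).';   % e.g. deg:min:sec as a 1x3 double
    else
        v = str2double(r);                   % NaN when the text is not a number
        if isnan(v)
            vals{k} = r;                     % fall back to the raw string
        else
            vals{k} = v;
        end
    end
end
```

Because the quote test runs first, a quoted date such as PROGDATE="10/20/20" stays a string here; reordering the rules or using a per-keyword lookup table would change that.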
ADDENDUM: Edited to use a separate keywords array so the match can be made explicitly without other gyrations. One could instead split the original array into two arrays with split() on the "=" sign, but that would entail removing the other header records first, as split() is very unforgiving in requiring exactly the same number of fields for every record.
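For comparison, a rough sketch of the split() route. It assumes you are willing to drop every row that does not contain exactly one "=", which also discards rows like DATAID="P=01 R=14 (H)" whose value contains the delimiter. The data array is a stand-in for the file.

```matlab
% Stand-in for the file contents; rows taken from the sample in the question.
data = [">HEAD"
        "DATAID=""P=01 R=14 (H)"""
        "LAT=3:26:1.6"
        "ELEV=1224.44"
        "UNIQUE ID: {fd652be6-76e5-476d-ace0-bf2981887cf2}"];

keep  = count(data, "=") == 1;     % only rows that split into exactly two fields
parts = split(data(keep), "=");    % N-by-2 string array: [keyword, value]

elev = str2double(parts(matches(parts(:,1), "ELEV"), 2));
disp(elev)
```

The extractBefore/extractAfter pair used in the answer avoids this filtering because it only looks at the first "=" on each row.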
3 Comments
dpb
on 30 May 2023
That was "air code". contains() may be safe for this purpose, but it is risky: depending on the total population of keywords, it might not always produce the desired result. If one keyword is a substring of another, both will match, whereas you need to ensure there is exactly one match.
The more robust solution would be to first extract the keywords as in
keywords=extractBefore(data,"=");
then do the matches comparison on that set instead of the full string.
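A small illustration of the difference, using rows from the sample in the question (the inline data array is a stand-in for the file):

```matlab
data = ["ACQDATE=04/11/22"
        "ENDDATE=04/13/22"
        "LAT=3:26:1.6"
        "ELEV=1224.44"];

% Substring search on the whole row: "DATE" hits both date records.
loose = data(contains(data, "DATE"));

% Exact comparison against the extracted keywords: at most one hit per keyword.
keywords = extractBefore(data, "=");
exact    = extractAfter(data(matches(keywords, "ENDDATE")), "=");

disp(numel(loose))
disp(exact)
```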