Are Regular Expressions the best way to do this job?
This question is closed. Reopen it to edit or answer.
I have a big list of devices, but their names are "encrypted" and look like this:
- 'A10_EG_KitchenRadio_L2_P'
- 'A11_KG_FloorPP_L1_P'
- 'C01_EG_PC_L3_P'
- 'C02_EG_TV_Video_L3_P'
- 'C03_EG_HIFI_L3_P'
- 'C04_EG_Switch_L1_P'
- 'C05_EG_MeasuringSystems_L3_P'
- 'A03_freezer_cooling_combi_L1_P'
Instead, they should look like this:
- Radio
- PowerPlug
- PC
- TV
- HIFI
- Switch
- Measuringsystem
- Freezer
I'm trying to solve this by using Regular Expressions. Is there a better way to do this?
TIA.
4 Comments
per isakson
on 23 Apr 2012
Extracting the string between the second and third underscores is a task made for regular expressions. However, for 'A03_freezer_cooling_combi_L1_P' that will return "cooling" rather than "Freezer".
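A minimal sketch of that extraction, assuming the names are stored in a cell array (the variable names here are made up for illustration). It splits each name at the underscores and keeps the third piece, which also shows the "cooling" problem on the freezer entry:

```matlab
names = {'A10_EG_KitchenRadio_L2_P', 'C01_EG_PC_L3_P', 'A03_freezer_cooling_combi_L1_P'};

% Split every name at the underscores; the result holds one cell of parts per name.
parts = regexp(names, '_', 'split');

% Keep the third part, i.e. the text between the second and third underscore.
third = cellfun(@(p) p{3}, parts, 'UniformOutput', false);

disp(third)   % KitchenRadio, PC, cooling
```

The first two names give something usable, but the third has no two-letter code after the prefix, so a purely positional rule picks the wrong word.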
Norbert
on 23 Apr 2012
Daniel Shub
on 23 Apr 2012
Do you want "freezer" or "Freezer"? Can you state the rule for what you want? In particular, I am concerned about 'C02_EG_TV_Video_L3_P' and the possibility of 'C02_TV_Video_L3_P'.
Norbert
on 23 Apr 2012
Answers (2)
Daniel Shub
on 23 Apr 2012
I think the answer depends on what you mean by regular expressions. In MATLAB, I think of
doc regexp
doc regexprep
For this task, I would argue that regular expression support in MATLAB is not as good or as user-friendly as it is in Perl, sed, or AWK.
If you are looking for help writing the regular expression, you need to be able to state the rules. For example
- All lines start with Letter-digit-digit-underscore
- There is an optional Letter-Letter-underscore (where Letter-Letter is not TV or PC)
- All lines end with underscore-Letter-digit-underscore-Letter
- What is in between the start and the end of the line should be classified as one of N things according to the following rules
- If it contains ...
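As a rough sketch, the first three rules above could be written as a single pattern that captures whatever lies between the fixed start and end. This is only one possible reading of those rules (it assumes "Letter" means any ASCII letter), and the classification step is left out because its rules are not stated:

```matlab
names = {'A10_EG_KitchenRadio_L2_P', 'C02_EG_TV_Video_L3_P', ...
         'C02_TV_Video_L3_P', 'A03_freezer_cooling_combi_L1_P'};

% ^[A-Za-z]\d\d_                 Letter-digit-digit-underscore at the start
% (?:(?!TV_|PC_)[A-Za-z]{2}_)?   optional Letter-Letter-underscore, but not TV or PC
% (.+)                           the middle part, captured as a token
% _[A-Za-z]\d_[A-Za-z]$          underscore-Letter-digit-underscore-Letter at the end
pat = '^[A-Za-z]\d\d_(?:(?!TV_|PC_)[A-Za-z]{2}_)?(.+)_[A-Za-z]\d_[A-Za-z]$';

tok    = regexp(names, pat, 'tokens', 'once');
middle = cellfun(@(t) t{1}, tok, 'UniformOutput', false);

disp(middle)   % KitchenRadio, TV_Video, TV_Video, freezer_cooling_combi
```

A name that does not follow the rules yields an empty token list, so `t{1}` would fail for it; real data would need a check for that case before the `cellfun` call.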
Andrei Bobrov
on 23 Apr 2012
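% Remove the EG/KG tokens, turn underscores into spaces, then normalize the known device words.
t = regexprep(A,{'[EK]G','_','\w*Radio','\w*PP','(cool\w*|freezer)','\w*vision'},{'',' ','Radio','PowerPlug','Freezer','TV'})
% The device name is then the second word of each cleaned string.
out = cellfun(@(x)x{2},regexp(t,'\w*','match'),'UniformOutput',false)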
6 Comments
Norbert
on 23 Apr 2012
Daniel Shub
on 23 Apr 2012
That is not going to deal with freezer to Freezer, or Television to TV. It also is going to have problems with 'A03_freezer_cooling_combi_L1_P'.
Daniel Shub
on 23 Apr 2012
You are working hard, Andrei, but in Norbert's comments you have to deal with 'C02_Home_TV_L3_P' being TV and not Home. I am guessing there are a lot more possible exceptions than in the current list, but good luck.
Norbert
on 23 Apr 2012
Oleg Komarov
on 23 Apr 2012
I would define which sequences map to which device. Then I would loop over the devices, testing the names against each one with regexp.
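A minimal sketch of that idea, using the names and target labels from the question. The lookup table `map` and its patterns are assumptions made up for this example, not a complete rule set:

```matlab
names = {'A10_EG_KitchenRadio_L2_P', 'A11_KG_FloorPP_L1_P', 'C01_EG_PC_L3_P', ...
         'C02_EG_TV_Video_L3_P', 'C03_EG_HIFI_L3_P', 'C04_EG_Switch_L1_P', ...
         'C05_EG_MeasuringSystems_L3_P', 'A03_freezer_cooling_combi_L1_P'};

% Hypothetical lookup table: column 1 is a pattern, column 2 the device label.
% Rows are tried in order, and the first row that matches a name wins.
map = {'Radio',     'Radio';
       'PP',        'PowerPlug';
       'PC',        'PC';
       'TV',        'TV';
       'HIFI',      'HIFI';
       'Switch',    'Switch';
       'Measuring', 'Measuringsystem';
       'freezer',   'Freezer'};

device = cell(size(names));              % empty entries mean "not classified yet"
for k = 1:size(map, 1)
    hit  = ~cellfun(@isempty, regexp(names, map{k, 1}, 'once'));
    todo = hit & cellfun(@isempty, device);
    device(todo) = map(k, 2);            % label every still-unclassified match
end

disp(device)
```

Names that match no pattern stay empty, which makes it easy to spot the exceptions that still need a rule. New device types only require a new row in the table.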
Norbert
on 23 Apr 2012