Regular Expression to extract bigram
string = 'ab bc cd ef gh ij kl'
What regular expression will extract the bigrams from the given string?
I am using this code:
    regexp(string,'\w* \w*','match');
The output is: 'ab bc' 'cd' 'ef' 'gh' 'ij' 'kl'
whereas the output I expect is:
- 'ab bc'
- 'bc cd'
- 'cd ef'
- 'ef gh'
- 'gh ij'
- 'ij kl'
2 Comments
Walter Roberson on 26 Sep 2013
I believe the term is "bi-gram".
If the string were
'abc defg'
would you want the result to be
ab bc c<space> <space>d de ef fg
or
ab de
or
ab bc de ef fg
?
Or does it only need to work on letter pairs?
Accepted Answer
Azzi Abdelmalek on 26 Sep 2013
Edited: Azzi Abdelmalek on 26 Sep 2013
  
      EDIT
Is this what you want?
string = 'ab bc cd ef gh ij kl'
regexp(string,'\s+','split');
3 Comments
Azzi Abdelmalek on 26 Sep 2013
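string = 'ab bc cd ef gh ij kl'
out = regexp(string,'\s+','split');  % split the string into words on whitespace
% Join each word with its successor to form the bigrams
cellfun(@(x,y) [x ' ' y], out(1:end-1)', out(2:end)', 'UniformOutput', false)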
More Answers (1)
Andrei Bobrov on 26 Sep 2013
z = regexp(string,'\w*','match')   % extract the words
strcat(z(1:end-1),{' '},z(2:end))  % pair each word with the next one
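For context on why the pattern in the question falls short: regexp returns non-overlapping matches, so once 'ab bc' has been consumed, 'bc' is no longer available to start the next match. Both answers above work around this by extracting the words first and pairing neighbours afterwards. A minimal sketch of that idea wrapped in a function, assuming a hypothetical helper name wordBigrams (not part of MATLAB) and assuming that an input with fewer than two words should return an empty cell array:

```matlab
function bigrams = wordBigrams(str)
% wordBigrams  Return overlapping word pairs from a string.
% Hypothetical helper for illustration only; the name and the
% empty-result convention are assumptions, not from the thread.
words = regexp(str, '\w+', 'match');   % non-overlapping word matches
if numel(words) < 2
    bigrams = cell(1, 0);              % no pair can be formed
    return
end
% Pair each word with its successor; the cell {' '} keeps the space
bigrams = strcat(words(1:end-1), {' '}, words(2:end));
end
```

Calling wordBigrams('ab bc cd ef gh ij kl') should give the six pairs listed in the question. The guard for short inputs avoids passing empty cell arrays to strcat.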
0 Comments