Parsing ⅛ and ⅓ Characters from actxserver Outlook Mail object Body and Converting to Floats

Hi all
I am parsing Outlook mails in MATLAB using actxserver and regexp.
Some mails contain fraction characters, as below.
The ½, ¼, ¾ characters are read correctly, but the eighths (⅛, ⅜, ⅝, ⅞) and thirds (⅓, ⅔) appear in the Body property of the mail object as "?" [char(63)], as seen when the mail body is printed at the command line.
MATLAB recognises only ¼ ½ ¾ [char(188:190)], so I guess I need to access non-ASCII characters. It's not clear whether the issue is MATLAB's 16-bit Unicode or the actxserver object. The characters are available in the Arial font on Windows Vista as U+215C, U+215E, etc.
You can verify this for yourself by emailing yourself a mail with the subject line
⅛¼⅓⅜½⅝⅔¾⅞
and then running the code below in MATLAB to regexp the subject line of that mail in your inbox. Put a breakpoint at the regexp line to inspect the subject variable; you should see "?" in it.
Two questions here:
1. How could I extend MATLAB's ASCII set to read these characters?
2. Is there a neat way to convert them into equivalent floats (3¼ --> 3.25) within regexp?
Grateful for any suggestions here
Mark
% The function below will need to be adapted depending on how your Outlook folders are set up:
function myfrac = TestReadFractions
    outlook = actxserver('Outlook.Application');
    mapi = outlook.GetNamespace('mapi');
    folder1 = mapi.Folders(1);
    myaccount = folder1.Item(2);
    inboxmails = myaccount.Folders.Item(2).Folders.Item(9).Items;
    count = inboxmails.Count;
    myfrac = {};
    % Check the last 11 items, stopping at item 1 if the folder holds fewer
    for i = count:-1:max(1, count-10)
        if strcmp(inboxmails.Item(i).SenderEmailAddress, 'yourname@youraddress.com')
            subject = inboxmails.Item(i).Subject; % mail subject line
            myfrac = regexp(subject, '\x215c', 'match'); % look for U+215C (⅜)
        end
    end
end
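
On question 2, one possible approach is to let regexp pick out each optional integer followed by a fraction character, and then look the fraction up in a small table. This is only a sketch: the names fracChars, fracVals and the sample subject string are made up for illustration, and it assumes the fraction characters have reached MATLAB intact, which is exactly what fails for the eighths and thirds in the question.

```matlab
% Code points for ¼ ½ ¾ ⅓ ⅔ ⅛ ⅜ ⅝ ⅞, built numerically so the
% encoding of the source file does not matter
fracChars = char([188 189 190 8531 8532 8539 8540 8541 8542]);
fracVals  = [1/4 1/2 3/4 1/3 2/3 1/8 3/8 5/8 7/8];

% Hypothetical subject line: 'Bid 3¼ offer 101⅞'
subject = ['Bid 3' char(188) ' offer 101' char(8542)];

% Token 1: optional integer part, token 2: one fraction character
tokens = regexp(subject, ['(\d*)([' fracChars '])'], 'tokens');

values = zeros(1, numel(tokens));
for k = 1:numel(tokens)
    whole = str2double(tokens{k}{1});   % NaN when there is no integer part
    if isnan(whole)
        whole = 0;
    end
    values(k) = whole + fracVals(fracChars == tokens{k}{2});
end
```

For the sample subject, values comes out as [3.25 101.875]. The lookup is done outside regexp rather than in a dynamic replacement expression, which keeps the table in ordinary workspace variables.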

Answers (1)

Walter Roberson on 3 Feb 2014
regexprep('ABC','B','\x215c')
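A related check, as a sketch only: a MATLAB char can hold U+215C, and regexp can match it, when the character is created inside MATLAB. If that holds on your machine, it suggests the "?" is introduced before the string reaches regexp, although this snippet cannot show where. The variable names are illustrative, and the character is built with char() rather than an escape sequence so that nothing depends on how \x is parsed.

```matlab
threeEighths = char(hex2dec('215C'));      % U+215C, the 3/8 character
s = ['A' threeEighths 'C'];

codes = double(s);                         % code points held by the char array
hit   = regexp(s, threeEighths, 'match');  % search for the character itself
```

Here codes is [65 8540 67] and hit contains one match, so the character survives both storage and matching.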
  4 Comments
Mark Whirdy on 5 Feb 2014
Edited: Mark Whirdy on 5 Feb 2014
Hi Walter,
emailing myself "⅛¼⅓⅜½⅝⅔¾⅞" and reading as per code above gives
K>> subject+0
ans =
63 188 63 63 189 63 63 190 63
Since all the question marks have the same integer value 63, I think that passing the string through native2unicode will not work.
I can't change the PC locale, as I think this would affect all other applications.
Do you know why changing the InternetCodepage property of the Outlook mail doesn't work? (i.e. as above, if I set it to anything other than 65001, it is still 65001 when I check it afterwards). I guess the property is immutable; perhaps there is a way of setting it in the actxserver constructor? But even if I can do this, I don't know which InternetCodepage value would fix it.
Mark
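
The point about native2unicode can be illustrated with the values reported above. This is a sketch, and ISO-8859-1 is chosen only as an example encoding: decoding the bytes gives back the same code points, and all six lost characters share the single value 63, so nothing remains to tell them apart.

```matlab
codes = [63 188 63 63 189 63 63 190 63];    % subject+0 as reported above

% Treat the values as bytes and decode them as Latin-1
decoded   = native2unicode(uint8(codes), 'ISO-8859-1');
sameCodes = isequal(double(decoded), codes); % decoding changes nothing

% Every lost character collapsed to the same value
lostCodes = unique(codes(codes == 63));
nLost     = nnz(codes == 63);
```

Six of the nine characters end up as 63, which is why the fix has to happen before the text is converted, not afterwards.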
