Info
This question is closed. Reopen it to edit or answer.
urlread - missing some href's
Hi,
When I read the source code of a web page using urlread, some (not all) URLs are missing from the string I get. I think it has something to do with the HTML tag "div": I can't see any href inside a div element.
EXAMPLE:
Here is a snippet of the source code I get from "str = urlread('http://www.mathworks.de/matlabcentral/')" (lines 493-497):
<div class="spotlight custom" style="padding-left:12px;">
<img src="/images/blogs/blogs_spotlight_trials_gray.jpg">
</div> </div>
The corresponding snippet from the page source shown in my web browser:
<div class="spotlight custom" style="padding-left:12px;">
<a href="http://www.mathworks.com/programs/trials/trial_request.html?eventid=56763&prodcode=ML&s_iid=mlcmain_trial_mlc_cta1">
<img src="/images/blogs/blogs_spotlight_trials_gray.jpg">
</a>
</div> </div>
The complete <a href="..."> element is missing. On this example page, 64 links are missing.
Thank you very much in advance, Hans
Answers (2)
Jan
on 7 Jul 2013
Edited: Jan
on 8 Jul 2013
I cannot reproduce this, because the linked document has changed.
Are you checking this in the Command Window, where "<a href..." is automatically displayed as a hyperlink? If so, the HREF is still in the string, but it is shown as an underlined link rather than as text.
[EDITED] Workaround:
str = urlread('http://www.mathworks.de/matlabcentral/');
str = strrep(str, '<a href=', '<A HREF=');
disp(str)
Unfortunately, I cannot remember whether upper case disables the auto-formatting, but I have used a similar replacement to show such strings as plain text. Another idea would be to open the string in the editor:
str = urlread('http://www.mathworks.de/matlabcentral/');
matlab.desktop.editor.newDocument(str);
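A further check that does not depend on how the Command Window renders the text is to extract the link targets from the string and count them. The following is only a minimal sketch: the sample string is made up for illustration and stands in for the output of urlread, and the regular expression assumes double-quoted href attributes.

```matlab
% Hypothetical sample HTML standing in for str = urlread(...).
str = ['<div class="spotlight custom">' ...
       '<a href="http://www.example.com/trial.html">' ...
       '<img src="/images/spotlight.jpg"></a></div>'];

% Capture the target of every anchor tag, ignoring letter case.
% Assumption: href values are enclosed in double quotes.
tokens = regexpi(str, '<a\s[^>]*?href\s*=\s*"([^"]*)"', 'tokens');
links  = cellfun(@(t) t{1}, tokens, 'UniformOutput', false);

% Print the count and the extracted targets.
fprintf('Found %d link(s)\n', numel(links));
fprintf('  %s\n', links{:});
```

If the count matches the number of links seen in the browser, the anchors are present in the string and only their display in the Command Window differs.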
4 Comments
Jan
on 8 Jul 2013
Edited: Jan
on 8 Jul 2013
Strange: When I copy the full message to a new answer, I can re-open it completely. But when I open the above message, I see only the version from before the [EDITED] part was appended.
I will ask Randy.
[EDITED] Dear Randy and other readers: Sorry, the effect disappeared after closing the browser and restarting(!) the machine. Apparently my computer had reloaded an outdated version from the Prism cache before.
Ken Atwell
on 8 Jul 2013
urlread and your browser are hitting the web site independently of each other, and there is no guarantee that exactly the same content will be returned in both situations. In this case, the example you give involves a rather dynamically created page and a spotlight (okay, an "ad") for a MATLAB trial -- this might be offered or not depending on a host of factors (including the browser being used and even randomness).
In short, you should not count on urlread returning the same content as your browser for anything other than a completely static web page.
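To see whether the server really delivered different content, one could compare the set of links in the urlread result with the set in a copy of the page source saved from the browser. This is a rough sketch under stated assumptions: both strings below are made-up stand-ins (in practice they might come from urlread and from fileread on a saved file with a name of your choosing), and the pattern only handles double-quoted href attributes.

```matlab
% Hypothetical stand-ins for the two copies of the page source.
% In practice: fetched = urlread(url); browser = fileread(savedFile);
fetched = ['<div><img src="/a.jpg"></div>' ...
           '<a href="http://www.example.com/x">x</a>'];
browser = ['<div><a href="http://www.example.com/trial">' ...
           '<img src="/a.jpg"></a></div>' ...
           '<a href="http://www.example.com/x">x</a>'];

% Helper that returns all double-quoted href targets of anchor tags.
pattern  = '<a\s[^>]*?href\s*=\s*"([^"]*)"';
getLinks = @(s) cellfun(@(t) t{1}, regexpi(s, pattern, 'tokens'), ...
                        'UniformOutput', false);

% Links present in the browser copy but absent from the fetched copy.
missing = setdiff(getLinks(browser), getLinks(fetched));
fprintf('%d link(s) only in the browser copy\n', numel(missing));
```

A non-empty result here would point to the two requests receiving different pages, rather than to a display issue on the MATLAB side.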