How to download PDF files from a website?

Yara on 7 Dec 2022
Commented: Yara on 17 Dec 2022
I need to download all PDF files from a specific URL (I do not have a list of the file names).
I just need to download every file whose name ends with .pdf.
I've tried:
url = 'https://... '; %assume it is a real url
urlwrite(url,'*.pdf');
but it does not work.

Answers (1)

C B on 7 Dec 2022
system('wget -r -A.pdf https://smallpdf.com/blog/sample-pdf')
--2022-12-07 15:30:38-- https://smallpdf.com/blog/sample-pdf
Resolving smallpdf.com (smallpdf.com)... 99.86.127.71
Connecting to smallpdf.com (smallpdf.com)|99.86.127.71|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 450993 (440K) [text/html]
Saving to: ‘smallpdf.com/blog/sample-pdf.tmp’
smallpdf.com/blog/sample-pdf.tmp 0%[ ] 0 --.-KB/s
smallpdf.com/blog/sample-pdf.tmp 100%[============================================================================================================>] 440.42K --.-KB/s in 0.005s
2022-12-07 15:30:38 (89.8 MB/s) - ‘smallpdf.com/blog/sample-pdf.tmp’ saved [450993/450993]
Loading robots.txt; please ignore errors.
--2022-12-07 15:30:38-- https://smallpdf.com/robots.txt
Reusing existing connection to smallpdf.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 57 [text/plain]
Saving to: ‘smallpdf.com/robots.txt.tmp’
smallpdf.com/robots.txt.tmp 0%[ ] 0 --.-KB/s
smallpdf.com/robots.txt.tmp 100%[============================================================================================================>] 57 --.-KB/s in 0s
2022-12-07 15:30:38 (16.3 MB/s) - ‘smallpdf.com/robots.txt.tmp’ saved [57/57]
Removing smallpdf.com/blog/sample-pdf.tmp since it should be rejected.
--2022-12-07 15:30:38-- https://smallpdf.com/
Reusing existing connection to smallpdf.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 445828 (435K) [text/html]
Saving to: ‘smallpdf.com/index.html.tmp’
smallpdf.com/index.html.tmp 0%[ ] 0 --.-KB/s
smallpdf.com/index.html.tmp 100%[============================================================================================================>] 435.38K --.-KB/s in 0.005s
2022-12-07 15:30:38 (82.2 MB/s) - ‘smallpdf.com/index.html.tmp’ saved [445828/445828]
Removing smallpdf.com/index.html.tmp since it should be rejected.
FINISHED --2022-12-07 15:30:38--
Total wall clock time: 0.6s
Downloaded: 3 files, 876K in 0.01s (85.8 MB/s)
ans = 0
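If wget is not available on the system, a MATLAB-only variant of the same idea might look like the sketch below. It is not part of the answer above: it reads one page with webread, collects the href targets that end in .pdf, and saves each one with websave. The folder name pdfs and the helper extractPdfLinks are hypothetical names chosen for illustration. The sketch assumes that the PDF links appear as plain href attributes in the page source and that they are absolute http(s) URLs; relative links are skipped and nothing is followed recursively. Local functions in a script require R2016b or later.

```matlab
% Sketch: download every PDF linked directly from one page (no recursion).
pageUrl   = 'https://smallpdf.com/blog/sample-pdf';  % example page from the wget command
outFolder = 'pdfs';                                  % hypothetical output folder

% Ask for the page as plain text so the result is a char vector.
html  = webread(pageUrl, weboptions('ContentType', 'text'));
links = extractPdfLinks(html);

if ~exist(outFolder, 'dir')
    mkdir(outFolder);
end

for k = 1:numel(links)
    link = links{k};
    if isempty(regexp(link, '^https?://', 'once'))
        continue  % relative link: not resolved in this sketch
    end
    [~, name, ext] = fileparts(link);
    try
        websave(fullfile(outFolder, [name ext]), link);
    catch err
        warning('Could not download %s: %s', link, err.message);
    end
end

function links = extractPdfLinks(html)
% Return the unique href targets in HTML text that end in .pdf (any case).
    tokens = regexp(html, 'href\s*=\s*["'']([^"'']+\.pdf)["'']', ...
                    'tokens', 'ignorecase');
    links = unique(cellfun(@(t) t{1}, tokens, 'UniformOutput', false));
end
```

Unlike urlwrite(url,'*.pdf'), which writes the content of a single URL to a single file name and does not expand wildcards, this approach needs the page to list its PDF links explicitly. Pages that build their links with JavaScript would return no matches.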
  3 Comments
C B on 11 Dec 2022
Sorry for the late reply. Are you using Windows, Linux, or Mac?
