Normally wget works as expected, but every now and then a page comes back forbidden (HTTP 403) or similar, so the below is an alternative approach.
First, download the HTML page with something like:
wget www.somepage.com/index.html
then extract the links to the file type you want (PDF files in this example) with:
grep -o 'http[^"]*\.pdf' index.html > links.txt
then use wget again to fetch everything in the list:
wget -i links.txt
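As a quick sanity check of the grep step, here is a local run against a made-up index.html (the domain and filenames are hypothetical stand-ins, not a real download):

```shell
# Create a sample page standing in for the downloaded index.html
cat > index.html <<'EOF'
<html><body>
<a href="http://www.somepage.com/docs/guide.pdf">Guide</a>
<a href="http://www.somepage.com/papers/intro.pdf">Intro</a>
<a href="http://www.somepage.com/about.html">About</a>
</body></html>
EOF

# Extract only the .pdf URLs into links.txt
grep -o 'http[^"]*\.pdf' index.html > links.txt

# links.txt now holds the two pdf links, one per line,
# ready to be passed to wget -i links.txt
cat links.txt
```

Note the pattern stops at the closing double quote of the href attribute, so it assumes the links are quoted with double quotes; single-quoted or unquoted hrefs would need a tweaked pattern.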
Obviously, this won't work for all cases, but it has helped me on occasion.