fuzzylizard, posted January 21, 2003

Hello, I am looking for some software that I can point at a website and have it capture the entire thing. Something like BlackWidow on Windows. Preferably I would like to be able to point it at a website, tell it how deep to go and whether it can leave the server, and have it download the entire site to a specified folder. Anyone have any suggestions? Thanks in advance.

MottS, posted January 21, 2003

I don't know BlackWidow, but just an idea here... What about 'wget -r http://blablabla.com/'? See 'man wget' or 'wget --help' for more details.

MOttS

fuzzylizard (Author), posted January 21, 2003

Thanks but no. I get: Warning: wildcards not supported in HTTP. Anyone else?

BlackWidow is a Windows program that lets you download an entire website according to a set of rules. It downloads all files while retaining the file structure of the website. "All files" includes html, jpeg, gif, and anything else that is not behind a password-protected directory or that is script driven, i.e. it won't download php or cfm files.

MottS, posted January 21, 2003

I downloaded the whole IceWM site in about a minute.

[gd@localhost tmp]$ wget -r http://www.icewm.org/
--21:52:29-- http://www.icewm.org/
=> `www.icewm.org/index.html'
Resolving www.icewm.org... done.
Connecting to www.icewm.org[66.35.250.210]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
[ <=> ] 9,452 56.98K/s
21:52:29 (56.98 KB/s) - `www.icewm.org/index.html' saved [9452]
... ... ... ...
--21:53:08-- http://www.icewm.org/files/es/FAQ/IceWM-FAQ-12.html
=> `www.icewm.org/files/es/FAQ/IceWM-FAQ-12.html'
Connecting to www.icewm.org[66.35.250.210]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4,692 [text/html]
100%[====================================================================>] 4,692 33.45K/s ETA 00:00
21:53:09 (33.45 KB/s) - `www.icewm.org/files/es/FAQ/IceWM-FAQ-12.html' saved [4692/4692]
FINISHED --21:53:09--
Downloaded: 1,535,694 bytes in 180 files
[gd@localhost tmp]$ du -sh www.icewm.org/
1.9M www.icewm.org

????

MOttS

fuzzylizard (Author), posted January 21, 2003

Getting closer. However, I need to download everything below a certain level on a website: www.foo.com/~html/foo/bar/... When I used the wget command, it downloaded everything from the root on down. I need something with more control.

fuzzylizard (Author), posted January 21, 2003

Alright, a little reading of the man page and I have figured it out. I know, I know, RTFM!!! Anyway, here is the command I needed to use:

wget -r -np -l 3 http://www.foo.com/~html/foo/bar/

-np: no parent. wget does not ascend into the parent directory.
-l 3: the number of levels to recurse. In this case 3. The default is 5.

wget is a very cool command that I am going to have to look at more. Thanks for the help.

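For anyone finding this thread later, the command above can be wrapped in a small shell function. This is only a sketch: the function name mirror_subtree, the DRY_RUN switch, and the ./mirror default folder are made up for illustration and are not from the thread. The -P flag (directory prefix) is a standard wget option added here to cover the "download to a specified folder" part of the original question.

```shell
#!/bin/sh
# Sketch: mirror only the subtree below a starting URL with wget.
# mirror_subtree, DRY_RUN and ./mirror are illustrative names (assumptions).
mirror_subtree() {
    url=$1                # starting URL; keep the trailing slash
    depth=${2:-3}         # recursion depth (wget's own default is 5)
    dest=${3:-./mirror}   # folder to save into (hypothetical default)

    # -r   recurse through links
    # -np  never ascend into the parent directory
    # -l   maximum recursion depth
    # -P   directory prefix under which files are saved
    set -- wget -r -np -l "$depth" -P "$dest" "$url"

    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"         # print the command instead of touching the network
    else
        "$@"
    fi
}

# Print the command that would be run, without downloading anything.
DRY_RUN=1 mirror_subtree "http://www.foo.com/~html/foo/bar/" 3 ./bar-mirror
```

Drop the DRY_RUN=1 prefix to actually run the download. By default wget stays on the starting host during recursion; if the crawl should be allowed to leave the server, the man page describes -H (--span-hosts) for that.
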
MottS, posted January 21, 2003

Cool!! ... and it's free!! lol

Vdubjunkie, posted January 23, 2003

fuzzylizard wrote: "Thanks but no."

Boy, you were quick to jump. wget is a wonderful tool, and the right answer was given immediately. I use wget DAILY and cannot say enough nice things about it.

fuzzylizard (Author), posted January 23, 2003

Vdubjunkie wrote: "Boy, you were quick to jump. wget is a wonderful tool, and the right answer was given immediately. I use wget DAILY and cannot say enough nice things about it."

Yeah, I know. Hopefully I can be forgiven for that. I guess I was looking for a graphical solution. Hey, at least I checked the man page and saw the error of my ways, that must count for something. :)

ramfree17, posted January 23, 2003

I know this is late, but: httrack.

Ciao!