SuckIT allows you to recursively visit and download a website's content to
your disk.
- Vacuums the entirety of a website recursively
- Uses multithreading
- Writes the website's content to your disk
- Enables offline navigation
- Offers random delays to avoid IP banning
- Saves application state on CTRL-C for later pickup
suckit [FLAGS] [OPTIONS] <url>
-c, --continue-on-error Flag to enable or disable exit on error
--disable-certs-checks Disable SSL certificate verification
--dry-run Do everything without saving the files to the disk
-h, --help Prints help information
-V, --version Prints version information
-v, --verbose Enable more information regarding the scraping process
--visit-filter-is-download-filter Use the download filter in/exclude regexes for visiting as well
-a, --auth <auth>...
HTTP basic authentication credentials space-separated as "username password host". Can be repeated for
multiple credentials as "u1 p1 h1 u2 p2 h2"
--cookie <cookie>
Cookie to send with each request, format: key1=value1;key2=value2 [default: ]
--delay <delay>
Add a delay in seconds between downloads to reduce the likelihood of getting banned [default: 0]
-d, --depth <depth>
Maximum recursion depth to reach when visiting. Default is -1 (infinity) [default: -1]
-e, --exclude-download <exclude-download>
Regex filter to exclude saving pages that match this expression [default: $^]
--exclude-visit <exclude-visit>
Regex filter to exclude visiting pages that match this expression [default: $^]
--ext-depth <ext-depth>
Maximum recursion depth to reach when visiting external domains. Default is 0. -1 means infinity [default: 0]
-i, --include-download <include-download>
Regex filter to limit to only saving pages that match this expression [default: .*]
--include-visit <include-visit>
Regex filter to limit to only visiting pages that match this expression [default: .*]
-j, --jobs <jobs> Maximum number of threads to use concurrently [default: 1]
-o, --output <output> Output directory
--random-range <random-range>
Generate an extra random delay between downloads, from 0 to this number. This is added to the base delay
seconds [default: 0]
-t, --tries <tries> Maximum amount of retries on download failure [default: 20]
-u, --user-agent <user-agent> User agent to be used for sending requests [default: suckit]
<url> Entry point of the scraping
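The visit and download filters can be combined to crawl one part of a site while skipping certain files. The following is a minimal sketch, not a recommended configuration: `https://example.com`, the `./example-mirror` directory, and both patterns are placeholders, and it assumes the regexes are matched against each page's URL, which the help text above does not state explicitly. `--dry-run` is included so that nothing is written to disk while you check the patterns.

```shell
URL="https://example.com"   # hypothetical entry point
OUT="./example-mirror"      # hypothetical output directory

# Only visit pages whose URL contains /docs/, and never save .zip or .pdf files.
# -v prints more information about the scraping process.
if command -v suckit >/dev/null 2>&1; then
  suckit "$URL" -o "$OUT" --dry-run -v \
    --include-visit '/docs/' \
    --exclude-download '\.(zip|pdf)$'
else
  echo "suckit is not installed" >&2
fi
```

Once the output looks right, dropping `--dry-run` should perform the same crawl and save the files.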
A common use case could be the following:
suckit <url> -j 8 -o /path/to/downloaded/pages/
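For sites that may ban aggressive clients, the delay options can be added to a similar command. This is only a sketch under stated assumptions: `https://example.com` and `./example-mirror` are placeholders, and the specific values are arbitrary. Going by the option descriptions, a base delay of 1 second plus a random range of 2 should give a wait of between 1 and 3 seconds between downloads.

```shell
URL="https://example.com"   # hypothetical entry point
OUT="./example-mirror"      # hypothetical output directory

# 4 threads, 1 s base delay plus 0 to 2 s of random delay,
# following links at most 3 levels deep.
if command -v suckit >/dev/null 2>&1; then
  suckit "$URL" -j 4 -o "$OUT" --delay 1 --random-range 2 -d 3
else
  echo "suckit is not installed" >&2
fi
```

The URL is placed first so that it cannot be mistaken for the value of a preceding option.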
As of right now, SuckIT does not work on Windows.
To install it, you need to have Rust installed.
Check out this link for instructions on how to install Rust.
If you just want to install the suckit executable, you can simply run
cargo install --git
Now, run it from anywhere with the suckit command.
On Arch Linux, suckit can be installed from the available AUR packages using an AUR helper. For example:
yay -S suckit
Want to contribute? Feel free to open an issue or submit a PR!
SuckIT is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0).