Crawnix is a Python tool designed to crawl all of the web pages on a website. Crawnix uses Beautiful Soup to parse each page for links, and it only crawls pages that match the base URL. Crawnix extracts the base URL with urlparse().netloc and then uses a regular expression to keep only the URLs that contain that base URL, printing the matches to the display.
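The scope check described above can be sketched as follows. This is a minimal illustration, not Crawnix's actual code: the in_scope function name and the example URLs are assumptions; only urlparse and a regular-expression match come from the description.

```python
import re
from urllib.parse import urlparse

def in_scope(url, base_url):
    """Return True when url's network location contains the base URL's host."""
    # Extract the base host, e.g. "example.com" from "https://example.com".
    base_netloc = urlparse(base_url).netloc
    # Regex check mirroring the behaviour described above: keep URLs
    # whose netloc matches the base host.
    return re.search(re.escape(base_netloc), urlparse(url).netloc) is not None

print(in_scope("https://example.com/about", "https://example.com"))  # True
print(in_scope("https://other.org/page", "https://example.com"))     # False
```

Escaping the base host with re.escape keeps dots in the domain from matching arbitrary characters.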
git clone http://github.com/Mehra1998/Crawnix.git
Crawnix currently supports Python 3.
- Any Python 3.x release is recommended.
Crawnix depends on the colorama, Beautiful Soup (bs4), and lxml Python modules; the subprocess and codecs modules it also uses ship with the Python standard library.
These dependencies can be installed using the requirements file:
- Installation in Windows:
py -3 -m pip install -r requirements.txt
- Installation on Linux:
$ sudo pip3 install -r requirements.txt
- Crawl all web pages within the in-scope URLs:
$ python3 crawnix.py
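Crawnix itself extracts links with Beautiful Soup and lxml; as an illustration using only the standard library, the link-extraction step can be sketched like this. The LinkExtractor class name and the HTML snippet are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect href targets from anchor tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the base URL.
                    self.links.append(urljoin(self.base_url, value))

html = '<a href="/about">About</a> <a href="https://other.org/x">Out</a>'
parser = LinkExtractor("https://example.com")
parser.feed(html)
print(parser.links)  # ['https://example.com/about', 'https://other.org/x']
```

A crawler would then feed each extracted link through the scope check before fetching it, so only URLs on the base domain are visited.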
Crawnix is licensed under the GNU GPL. Take a look at the LICENSE file for more information.