A script that performs web scraping using Beautiful Soup (bs4), enabling the retrieval of information such as data in CSV format or images from the website.

githubstevemas/PyScraper

PyScraper 🐍

Using the BeautifulSoup Python library, PyScraper collects information from books.toscrape.com, such as book titles, prices, ratings, descriptions, and more.
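As a rough illustration of the kind of extraction described above, here is a minimal sketch that parses one book entry with BeautifulSoup. It is not the project's actual code: the sample HTML, the CSS selectors, and the names `parse_books` and `RATING_WORDS` are assumptions made for this example, and the markup of the real site may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical sample modeled on a book listing; the real site markup may differ.
SAMPLE_HTML = """
<article class="product_pod">
  <p class="star-rating Three"></p>
  <h3><a href="#" title="A Light in the Attic">A Light in the ...</a></h3>
  <p class="price_color">£51.77</p>
</article>
"""

# Assumed convention: the rating is encoded as a word in the CSS class list.
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def parse_books(html):
    """Return a list of dicts with the title, price and rating of each book."""
    soup = BeautifulSoup(html, "html.parser")
    books = []
    for article in soup.select("article.product_pod"):
        link = article.select_one("h3 a")
        # "class" is a multi-valued attribute, so bs4 returns it as a list.
        rating_classes = article.select_one("p.star-rating")["class"]
        rating = next(
            (RATING_WORDS[c] for c in rating_classes if c in RATING_WORDS), None
        )
        books.append(
            {
                "title": link["title"],  # full title, not the truncated link text
                "price": article.select_one("p.price_color").get_text(strip=True),
                "rating": rating,
            }
        )
    return books


books = parse_books(SAMPLE_HTML)
```

In a real run, the HTML would come from an HTTP response rather than an inline string; the parsing step stays the same.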

How it works

When launched, the script saves all data locally in an "Outputs" folder, writing each book category to a separate CSV file. The scraping date is included in the folder name to make backups easier to manage. Finally, you can choose whether to download the corresponding book images.
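The output layout described above could be sketched as follows, using only the standard library. This is an illustration under assumptions, not the project's implementation: the function name `write_category_csvs`, the `scrape_<date>` folder naming, and the CSV columns are all hypothetical.

```python
import csv
from datetime import date
from pathlib import Path


def write_category_csvs(books_by_category, base_dir="Outputs", scrape_date=None):
    """Write one CSV per category into a dated folder and return the file paths."""
    scrape_date = scrape_date or date.today()
    # Hypothetical naming scheme: the scraping date is part of the folder name.
    out_dir = Path(base_dir) / f"scrape_{scrape_date.isoformat()}"
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for category, books in books_by_category.items():
        csv_path = out_dir / f"{category}.csv"
        # newline="" lets the csv module control line endings on every platform.
        with csv_path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=["title", "price", "rating"])
            writer.writeheader()
            writer.writerows(books)
        written.append(csv_path)
    return written
```

Calling `write_category_csvs({"Travel": [...]})` would create a file such as `Outputs/scrape_<date>/Travel.csv`, so each run on a different day lands in its own folder.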

Requirements

  • Python 3.6 or later

How to run

Once the code has been downloaded, go to the project directory and enter the following commands in a terminal:

python -m venv env (creates a new virtual environment)

env/Scripts/activate (activates the environment)

pip install -r requirements.txt (installs all the dependencies)

python main.py (runs the code)

deactivate (deactivates the environment when you are done)

Note

The commands above are for Windows. On macOS or Unix, the environment is typically activated with source env/bin/activate instead; see the official Python documentation for details.

Contact

Feel free to email me with any questions, comments, or suggestions.
