This project scrapes data from two different websites and saves the data in CSV format.
- `.github/workflows/`: Contains GitHub Actions workflows.
- `Data/`: Contains data files.
- `Datasets/`: Contains scraped datasets.
- `logs/`: Contains log files.
- `Report/`: Contains report files.
- `Scripts/`: Contains the Python scripts.
  - `additional.py`: Additional functions and logging configuration (see the logging sketch below).
  - `dataset_download.py`: Downloads datasets from Kaggle (see the Kaggle API sketch below).
  - `dataset_upload.py`: Uploads datasets to Kaggle (see the Kaggle API sketch below).
  - `Selenium.py`: Browser automation with Selenium (see the Selenium sketch below).
- `requirements.txt`: Lists the required Python packages.
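The scripts themselves are not reproduced in this README. As a rough illustration of the logging configuration that `additional.py` is described as providing, here is a minimal sketch; the logger name, log file name, and format string are assumptions, not the repository's actual values.

```python
# Hypothetical logging setup; the logger name, file name, and format string
# are assumptions, not values taken from additional.py.
import logging
import os

def setup_logger(name: str = "scraper", log_dir: str = "logs") -> logging.Logger:
    """Create a logger that writes both to logs/<name>.log and to the console."""
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        fmt = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        file_handler = logging.FileHandler(os.path.join(log_dir, f"{name}.log"), encoding="utf-8")
        file_handler.setFormatter(fmt)
        console_handler = logging.StreamHandler()
        console_handler.setFormatter(fmt)
        logger.addHandler(file_handler)
        logger.addHandler(console_handler)
    return logger
```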
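`dataset_download.py` and `dataset_upload.py` talk to Kaggle. A minimal sketch of that flow using the official `kaggle` package could look like the following; the dataset slug, folder layout, and version notes are placeholders, and authentication assumes a configured `~/.kaggle/kaggle.json`.

```python
# Hypothetical Kaggle download/upload flow; the dataset slug, paths, and
# version notes are placeholders, not this repository's actual configuration.
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()  # reads credentials from ~/.kaggle/kaggle.json or environment variables

# Download an existing dataset into Datasets/ and unzip it.
api.dataset_download_files("owner/example-dataset", path="Datasets", unzip=True)

# Upload a new version from a folder that contains the CSV files
# and a dataset-metadata.json describing the dataset.
api.dataset_create_version("Datasets", version_notes="Automated update", dir_mode="zip")
```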
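`Selenium.py` drives a real browser. The sketch below shows the general pattern of scraping with headless Chrome and writing the results to CSV; the URL, CSS selector, and output path are placeholders, not the sites this project actually scrapes.

```python
# Hypothetical Selenium scrape-to-CSV sketch; the URL, selector, and output
# path are placeholders.
import csv
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

options = Options()
options.add_argument("--headless=new")  # run Chrome without opening a window
driver = webdriver.Chrome(options=options)

driver.get("https://example.com/categories")
items = driver.find_elements(By.CSS_SELECTOR, ".category a")
rows = [(item.text, item.get_attribute("href")) for item in items]
driver.quit()

with open("Data/categories.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "url"])
    writer.writerows(rows)
```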
- Install the required packages:
  `pip install -r requirements.txt`
- Scrape BKM categories:
  `python Scripts/bkm_scrape_categories.py`
- Download the Kaggle dataset:
  `python Scripts/dataset_download.py`
- Scrape BKM data:
  `python Scripts/bkm_scrape.py`
- Combine the scraped data (a pandas sketch of this step follows the list):
  `python Scripts/bkm_combine.py`
- Upload the dataset to Kaggle:
  `python Scripts/dataset_upload.py`
- Scrape KY categories:
  `python Scripts/ky_scrape_categories.py`
- Download the Kaggle dataset:
  `python Scripts/dataset_download.py`
- Scrape KY data:
  `python Scripts/ky_scrape.py`
- Combine the scraped data:
  `python Scripts/ky_combine.py`
- Upload the dataset to Kaggle:
  `python Scripts/dataset_upload.py`
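The combine scripts are only invoked above, not shown. Assuming they merge the per-run CSV files into a single dataset, a minimal pandas sketch of that step (the glob pattern and output path are assumptions about the repository layout) would be:

```python
# Hypothetical combine step; the glob pattern and output path are assumptions,
# not the actual behavior of bkm_combine.py or ky_combine.py.
import glob
import pandas as pd

parts = [pd.read_csv(path) for path in sorted(glob.glob("Data/*.csv"))]
combined = pd.concat(parts, ignore_index=True).drop_duplicates()
combined.to_csv("Datasets/combined.csv", index=False)
```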