A web scraper app to track elements of a webpage given an HTML attribute

adefrutoscasado/id-scraper

ID Scraper

A web scraper app that tracks elements of a webpage identified by a given HTML attribute. It includes:

  • Email notifications when changes appear, plus a daily notification when no changes were found.
  • A web page showing the current status and a retrospective report of the elements found.
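
To make "tracking elements by an HTML attribute" concrete, here is a minimal sketch that collects the distinct values of one attribute from a page. The helper name `extractAttributeValues`, the `data-id` attribute, and the sample markup are assumptions made for illustration and are not taken from the repository. A real scraper would normally use an HTML parser rather than a regular expression.

```javascript
// Naive sketch: collect the distinct values of one HTML attribute.
// Hypothetical helper, not the repository's implementation.
function extractAttributeValues(html, attribute) {
  // Escape the attribute name so it is safe to embed in a RegExp.
  const name = attribute.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Match name="value" or name='value' when preceded by whitespace.
  const pattern = new RegExp(`\\s${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)')`, 'gi');
  const values = new Set();
  for (const match of html.matchAll(pattern)) {
    // Group 1 holds double-quoted values, group 2 single-quoted ones.
    values.add(match[1] !== undefined ? match[1] : match[2]);
  }
  return [...values];
}

// Assumed sample markup; 'data-id' is a placeholder attribute name.
const sampleHtml = `
  <div data-id="A1">First item</div>
  <div data-id="A2">Second item</div>
  <div data-id='A1'>Duplicate of the first item</div>
`;

console.log(extractAttributeValues(sampleHtml, 'data-id'));
```

Returning distinct values keeps each sample small and makes it straightforward to compare one run against the next.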

Getting Started

Specify the HTML attribute to track in ./constants/catalog.js:

[Screenshot: HTML attribute sample]
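
As a rough illustration of what such a configuration entry might contain, the sketch below pairs a page URL with the attribute to track. Every key and value here (`url`, `attribute`, `data-id`, the example.com address) is a placeholder assumption. The actual schema is defined by the file in the repository and should be checked there.

```javascript
// Hypothetical shape for ./constants/catalog.js. Every key below is a
// placeholder chosen for illustration, not the repository's actual schema.
const catalog = {
  // Page to scrape (assumed example URL).
  url: 'https://example.com/products',
  // HTML attribute whose values identify the tracked elements (assumed).
  attribute: 'data-id',
};

module.exports = catalog;
```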

The retrospective report of tracked elements is available at the root URL of the web app:

[Screenshot: HTML attribute sample]

Prerequisites

This app relies on several open source projects, including Node.js (with npm) and MongoDB, which the installation steps below assume are already installed.

Installing

Install Node modules:

npm install

Start the MongoDB server:

service mongod start

Start the app:

npm start

Deployment

The app can be deployed on Heroku. The recommended add-ons are:

  • mLab MongoDB: scraped data is stored in MongoDB so that samples can be compared and notifications sent when changes appear.
  • Heroku Scheduler: a scheduler is needed to scrape the page at a regular interval.
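
The comparison between stored samples can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: the function names `diffSamples` and `chooseNotification`, the notification labels, and the sample values are all hypothetical. In the real app the previous sample would come from MongoDB and the current one from the latest scheduled scrape.

```javascript
// Hypothetical sketch: compare two samples of tracked attribute values.
function diffSamples(previous, current) {
  const before = new Set(previous);
  const after = new Set(current);
  return {
    // Values present now that were absent in the previous sample.
    added: current.filter((value) => !before.has(value)),
    // Values present before that are missing from the current sample.
    removed: previous.filter((value) => !after.has(value)),
  };
}

// Decide which kind of email a scheduled run should send (assumed labels).
function chooseNotification(diff) {
  const changed = diff.added.length > 0 || diff.removed.length > 0;
  return changed ? 'changes-found' : 'daily-no-changes';
}

// Assumed sample data standing in for the stored and freshly scraped values.
const diff = diffSamples(['A1', 'A2'], ['A2', 'A3']);
console.log(diff);
console.log(chooseNotification(diff));
```

Keeping the diff separate from the notification decision lets each scheduled run store its sample first and then choose between a change alert and the daily "no changes" message.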

Demo

Please note that the following live demo is not maintained. Because web scrapers need frequent maintenance, I can't guarantee that the demo will work.

http://amazon-scrap.herokuapp.com/

License

This project is licensed under the MIT License.
