Distributed web data discovery and collection framework
npm install @achannarasappa/locust
- Configuration-driven jobs
- Distributed execution model to support serverless architectures
- Handle client-side JavaScript execution
- Data extraction using CSS selectors
- Depth-based stop condition along with support for custom stop conditions
- Robust dev tooling with locust-cli to build and test jobs
- Web indexing (i.e. web crawling)
- Web data extraction (i.e. web scraping)
- Documentation
- Examples
- Related
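
To illustrate how a depth-based stop condition can combine with a custom one, here is a minimal, library-independent sketch. It does not use Locust's API: the `site` link map, the `crawl` function, and the `depthLimit` and `shouldStop` option names are all hypothetical, and an in-memory object stands in for real page fetching.

```javascript
// Hypothetical in-memory "site": each URL maps to the links found on that page.
// This replaces real HTTP fetching and is NOT part of Locust.
const site = {
  'https://example.com/': ['https://example.com/a', 'https://example.com/b'],
  'https://example.com/a': ['https://example.com/a/1', 'https://example.com/'],
  'https://example.com/b': [],
  'https://example.com/a/1': ['https://example.com/a/1/x'],
  'https://example.com/a/1/x': [],
};

// Breadth-first crawl (hypothetical helper). It stops following links once
// depthLimit is reached and ends early if shouldStop returns true.
function crawl(startUrl, { depthLimit, shouldStop = () => false }) {
  const visited = new Set([startUrl]);
  const queue = [{ url: startUrl, depth: 0 }];
  const results = [];

  while (queue.length > 0) {
    const { url, depth } = queue.shift();
    results.push({ url, depth });

    // Custom stop condition: evaluated after each page is recorded.
    if (shouldStop(results)) break;

    // Depth-based stop condition: do not follow links past the limit.
    if (depth >= depthLimit) continue;

    for (const link of site[url] || []) {
      if (!visited.has(link)) {
        visited.add(link);
        queue.push({ url: link, depth: depth + 1 });
      }
    }
  }
  return results;
}

// Stop by depth only: visits the start page and its direct links.
const byDepth = crawl('https://example.com/', { depthLimit: 1 });

// Stop by a custom rule: end the crawl once two pages have been collected.
const byCount = crawl('https://example.com/', {
  depthLimit: 5,
  shouldStop: (pages) => pages.length >= 2,
});

console.log(byDepth.map((p) => p.url));
console.log(byCount.map((p) => p.url));
```

Tracking visited URLs keeps the crawl from looping on pages that link back to each other, and checking the custom predicate before the depth check lets either condition end the work independently.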