Stars

Webscraping

85 repositories

A bunch of website scraping scripts

Ruby 8 1 Updated Jul 9, 2013

List of libraries, tools and APIs for web scraping and data processing.

Makefile 6,924 804 Updated Dec 27, 2024

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Python 6,674 699 Updated Oct 12, 2024

Lighter web automation with Python

Python 7,602 462 Updated Feb 20, 2025

Faster requests on Python 3

Nim 1,110 90 Updated Feb 13, 2025

Guide, reference and cheatsheet on web scraping using rvest, httr and Rselenium.

R 392 104 Updated Dec 20, 2022

Web scraping and related analytics using Python tools

Jupyter Notebook 273 168 Updated Jun 7, 2020

A browser testing and web crawling library for PHP and Symfony

PHP 2,982 229 Updated Jan 30, 2025

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)

Python 10,819 1,199 Updated Jun 25, 2024

Get info from any web service or page

PHP 2,105 311 Updated Jan 2, 2025

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…

TypeScript 17,106 765 Updated Mar 12, 2025

🚀 Fast and simple Node.js version manager, built in Rust

Rust 19,919 521 Updated Mar 10, 2025

artoo.js - the client-side scraping companion.

JavaScript 1,105 93 Updated Mar 31, 2021

Websites crawler with built-in exploration and control web interface

JavaScript 341 62 Updated Jan 29, 2025

A web mining CLI tool & library for Python.

Python 307 27 Updated Mar 11, 2025

DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any sc…

Go 814 22 Updated Dec 5, 2021

🥫 The simple, fast, and modern web scraping library

Python 767 57 Updated Dec 7, 2023

🧹 Python package for text cleaning

Python 971 78 Updated May 9, 2023

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Python 4,022 285 Updated Feb 17, 2025

A suite of utilities for converting to and working with CSV, the king of tabular file formats.

Python 6,104 607 Updated Feb 27, 2025

bypass-url-parser

Python 1,058 116 Updated Mar 8, 2025

Downloads videos and playlists from YouTube

C# 10,696 1,419 Updated Mar 3, 2025

Abstraction layer over YouTube's internal API

C# 3,128 512 Updated Mar 3, 2025

splinter - Python test framework for web applications

Python 2,742 514 Updated Nov 1, 2024

Community maintained fork of pdfminer - we fathom PDF

Python 6,273 952 Updated Aug 2, 2024

Next generation web scanner

Ruby 5,758 930 Updated Jul 16, 2024

Google Apps Script library - interprets Google Sheets Formats, converts to formatted text or html

JavaScript 50 25 Updated Sep 6, 2023

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets

Python 4,366 416 Updated Mar 11, 2025

GraphQL automated security testing toolkit

Python 312 23 Updated Feb 20, 2024

🕸️ Blazing fast GraphQL endpoints finder using subdomain enumeration, scripts analysis and bruteforce. 🕸️

Python 210 13 Updated May 22, 2023