
SimpleScraper is an open-source tool designed to fetch and transform website content based on URLs provided via an API. It can be used to give LLMs access to live browsing capabilities.


TakSec/simple-scraper


Simple Scraper 🤖🌐

SimpleScraper is an open-source tool that enables you to retrieve and manipulate content from websites through a straightforward API. It's perfect for developers looking to add live browsing capabilities to large language models (LLMs) and other applications.


Requirements

To run SimpleScraper, you need:

  • Node.js: The JavaScript runtime that executes your server code.
  • npm (Node Package Manager): Manages the libraries that SimpleScraper depends on.

Installation

To set up SimpleScraper on your system, run the following command in your terminal from the project's root directory (the folder that contains package.json):

npm install

This installs all necessary dependencies to get you started.

Usage

Starting the Server

Launch the server with this command:

node server.js

This command starts the SimpleScraper server so it can handle requests. The request example in the next section targets http://localhost:3001.
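To confirm the server is up before sending requests, a small readiness check can help. The sketch below only tests whether something accepts TCP connections on a port. The helper name isPortOpen is hypothetical, and port 3001 is an assumption taken from the curl example in this README, not from the server's source.

```javascript
// Minimal sketch: check whether a TCP port is accepting connections.
// isPortOpen is a hypothetical helper, not part of SimpleScraper.
const net = require("node:net");

function isPortOpen(port, host = "127.0.0.1", timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host });

    // Close the socket and report the result; later calls to resolve are no-ops.
    const finish = (open) => {
      socket.destroy();
      resolve(open);
    };

    socket.setTimeout(timeoutMs, () => finish(false)); // no answer in time
    socket.once("connect", () => finish(true));        // something is listening
    socket.once("error", () => finish(false));         // e.g. connection refused
  });
}

// Example usage (port 3001 is assumed from the curl example):
// isPortOpen(3001).then((open) => console.log(open ? "ready" : "not ready"));
```

A successful connection only shows that some process is listening on that port; it does not prove the process is SimpleScraper.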

Making Requests

To scrape content from a website, make a POST request to the /api/fetch-content endpoint.

Here’s how you can do it using curl:

curl -X POST http://localhost:3001/api/fetch-content -H "Content-Type: application/json" -d '{"url": "http://example.com"}'

Replace "http://example.com" with the URL of the site you want to scrape.
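The same request can be sent from Node.js. The sketch below mirrors the curl call (POST, JSON body with a "url" field) and assumes Node.js 18 or later for the built-in fetch. The function names buildFetchContentRequest and fetchContent are hypothetical. The response format is not documented here, so the body is returned as raw text.

```javascript
// Endpoint taken from the curl example; adjust it if your server runs elsewhere.
const FETCH_CONTENT_ENDPOINT = "http://localhost:3001/api/fetch-content";

// Build the request: same method, header, and JSON body as the curl call.
// (Hypothetical helper, not part of SimpleScraper.)
function buildFetchContentRequest(targetUrl) {
  return {
    endpoint: FETCH_CONTENT_ENDPOINT,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url: targetUrl }),
    },
  };
}

// Send the request. fetchImpl defaults to the global fetch (Node.js 18+);
// passing a stand-in makes the function usable without a running server.
async function fetchContent(targetUrl, fetchImpl = fetch) {
  const { endpoint, options } = buildFetchContentRequest(targetUrl);
  const response = await fetchImpl(endpoint, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  // The response format is an unknown here, so return the body as text.
  return response.text();
}

// Example usage (requires the server to be running):
// fetchContent("http://example.com").then(console.log).catch(console.error);
```

Keeping the request construction separate from the network call makes it easy to inspect what will be sent before any traffic reaches the server.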

See It in Action

Check out Web Code Explainer GPT to see SimpleScraper in action!
