A robust document search engine built with Flask and React that supports boolean and proximity search operations. This system allows users to upload text documents and perform advanced searches with real-time indexing.
- Boolean search operations (AND, OR, NOT)
- Proximity search with customizable word distance
- Stopwords management
- Multi-threaded document processing
- Real-time indexing status
- Document content viewer
- Support for large text files (up to 32MB)
- Python 3.7+
- Node.js 14+
- npm or yarn
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # For Windows use: venv\Scripts\activate
- Install dependencies using requirements.txt:
pip install -r requirements.txt
- Create the required uploads directory:
mkdir uploads
- Install Node.js dependencies:
cd frontend
npm install
- Start the Flask backend:
python app.py
The server will start at http://localhost:5000
- In a new terminal, start the React frontend:
cd frontend
npm start
Access the application at http://localhost:3000
- Click the "Upload Files" button
- Select one or more .txt files
- Wait for processing completion
- Create a text file with stopwords (one per line)
- Upload via the stopwords upload feature
- The system automatically applies the uploaded stopwords to all searches
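The stopwords steps above can be sketched as follows. This is a minimal illustration, assuming a one-word-per-line file and case-insensitive matching; the names `load_stopwords` and `remove_stopwords` are hypothetical and not taken from the project's code.

```python
def load_stopwords(text):
    """Parse stopwords file contents: one word per line, blank lines ignored."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def remove_stopwords(tokens, stopwords):
    """Drop tokens that appear in the stopword set (case-insensitive)."""
    return [t for t in tokens if t.lower() not in stopwords]


stopwords = load_stopwords("the\nand\n\nof\n")
print(remove_stopwords("the art of computer programming".split(), stopwords))
# ['art', 'computer', 'programming']
```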
Use AND, OR, NOT operators to combine search terms:
computer AND science
technology OR programming
software NOT bugs
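For readers curious how such queries can be evaluated, here is a rough sketch using an inverted index and set operations. It assumes whitespace tokenization, left-to-right evaluation with no operator precedence or parentheses, and `NOT` meaning "and not"; `build_index` and `boolean_search` are illustrative names, and the actual backend may work differently.

```python
def build_index(docs):
    """Map each lowercase term to the set of document IDs containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


def boolean_search(index, query):
    """Evaluate 'term OP term OP term ...' left to right (no precedence)."""
    tokens = query.split()
    if not tokens:
        return set()
    # Copy the first posting set so in-place updates never mutate the index.
    result = set(index.get(tokens[0].lower(), set()))
    # The remaining tokens come in (operator, term) pairs.
    for op, term in zip(tokens[1::2], tokens[2::2]):
        postings = index.get(term.lower(), set())
        if op == "AND":
            result &= postings
        elif op == "OR":
            result |= postings
        elif op == "NOT":
            result -= postings
        else:
            raise ValueError(f"unknown operator: {op}")
    return result


docs = {
    1: "computer science basics",
    2: "computer hardware",
    3: "software bugs and fixes",
    4: "software design",
}
index = build_index(docs)
print(boolean_search(index, "computer AND science"))  # {1}
print(boolean_search(index, "software NOT bugs"))     # {4}
```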
Find words within a specific distance:
data science /5 # Finds "data" and "science" within 5 words
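A proximity check like this can be sketched with word positions. The example below assumes that "within N words" means the two word positions differ by at most N, in either order; that interpretation and the helper names `parse_proximity_query`, `term_positions`, and `within_distance` are assumptions for illustration only.

```python
def parse_proximity_query(query):
    """Split 'first second /N' into (first, second, N)."""
    terms, _, distance = query.rpartition("/")
    first, second = terms.split()
    return first, second, int(distance)


def term_positions(text):
    """Map each lowercase term to the list of its word positions."""
    positions = {}
    for i, term in enumerate(text.lower().split()):
        positions.setdefault(term, []).append(i)
    return positions


def within_distance(text, first, second, max_distance):
    """True if the two terms occur at most max_distance positions apart."""
    positions = term_positions(text)
    a = positions.get(first.lower(), [])
    b = positions.get(second.lower(), [])
    return any(abs(i - j) <= max_distance for i in a for j in b)


first, second, distance = parse_proximity_query("data science /5")
text = "data drives most decisions in modern science"
# "data" is at position 0 and "science" at position 6, so they are 6 apart.
print(within_distance(text, first, second, distance))  # False
print(within_distance(text, first, second, 6))         # True
```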
Endpoint | Method | Description |
---|---|---|
/upload | POST | Upload documents |
/upload-stopwords | POST | Upload stopwords file |
/search | POST | Perform search |
/status | GET | Get system status |
/clear | POST | Clear indexes |
/document/<doc_id> | GET | Get document content |
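As a hedged example of calling these endpoints from Python, the sketch below builds requests with the standard library. The paths, methods, and port come from this README, but the request body format is not documented here: the JSON payload with a `query` key is an assumption, as are the helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # backend address from the setup steps


def build_status_request(base_url=BASE_URL):
    """GET /status takes no body."""
    return urllib.request.Request(f"{base_url}/status", method="GET")


def build_search_request(query, base_url=BASE_URL):
    """POST /search; the {'query': ...} JSON body is an assumed format."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request):
    """Send a request and decode a JSON response (needs the backend running)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the Flask server running, `fetch_json(build_status_request())` would return the decoded status response, provided the endpoint responds with JSON.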
The system provides clear error messages for:
- Invalid file types
- Processing errors
- Search syntax errors
- System errors
- Secure filename handling
- Input validation
- File size restrictions (32MB max)
- CORS protection
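The file checks listed above might look roughly like the following. This is a standard-library sketch, not the project's code: `validate_upload` is a hypothetical helper, and restricting uploads to `.txt` is inferred from the usage steps. Flask applications commonly use `werkzeug.utils.secure_filename` and the `MAX_CONTENT_LENGTH` setting for the same purpose; whether this project does is not stated here.

```python
import os

MAX_UPLOAD_BYTES = 32 * 1024 * 1024  # 32MB limit stated above
ALLOWED_EXTENSIONS = {".txt"}        # assumption: only .txt uploads are accepted


def validate_upload(filename, size_bytes):
    """Return (ok, detail): detail is the sanitized name or an error message."""
    # Strip directory components so "../../etc/passwd" cannot escape uploads/.
    safe_name = os.path.basename(filename.replace("\\", "/"))
    ext = os.path.splitext(safe_name)[1].lower()
    if not safe_name or ext not in ALLOWED_EXTENSIONS:
        return False, "Invalid file type"
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "File exceeds 32MB limit"
    return True, safe_name


print(validate_upload("../../notes.txt", 1024))  # (True, 'notes.txt')
print(validate_upload("script.exe", 1024))       # (False, 'Invalid file type')
```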
If you encounter issues:
- Check if both servers are running
- Verify uploads directory permissions
- Check console for error messages
- Ensure all dependencies are installed
- Try clearing indexes and restarting
- Flask for the backend API
- React for the frontend interface
- Multi-threaded document processing
- Real-time status updates
- Memory-efficient indexing
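One way multi-threaded document processing can be combined with a shared index is sketched below. It assumes a thread pool that tokenizes documents in parallel while a single thread merges the results, which avoids locking; the function names and the worker count are illustrative, and the project's actual threading model may differ.

```python
from concurrent.futures import ThreadPoolExecutor


def tokenize(doc):
    """Return (doc_id, set of lowercase terms) for one (doc_id, text) pair."""
    doc_id, text = doc
    return doc_id, set(text.lower().split())


def index_documents(docs, max_workers=4):
    """Build a term -> document IDs index, tokenizing in worker threads."""
    index = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Workers only tokenize; merging happens in the calling thread,
        # so the index dict needs no lock.
        for doc_id, terms in pool.map(tokenize, docs.items()):
            for term in terms:
                index.setdefault(term, set()).add(doc_id)
    return index


print(index_documents({1: "data science", 2: "data engineering"})["data"])  # {1, 2}
```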
- Fork the repository
- Create a feature branch
- Commit changes
- Push to the branch
- Create a Pull Request