Picarch is a Python project for face detection and image similarity search using insightface and PostgreSQL. The project detects faces in images, encodes them, stores the embeddings along with image paths in a PostgreSQL database, and allows searching for similar images.
I had a collection of 12k+ photos and was too lazy to go through all of them to find the pictures with me in them, so I built this project.
I overengineered the problem and I'm open-sourcing it so you don't have to - clone it and save some time in your life 🙃
Features:
- Face detection and embedding using InsightFace.
- Image storage and similarity search using PostgreSQL with vector data.
- Command line interface to encode images, search for similar faces, and manage the database.
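As a rough illustration of the face detection and embedding step, here is a minimal sketch built on InsightFace's `FaceAnalysis` class. The function names `l2_normalize` and `embed_faces`, the `buffalo_l` model pack, and the CPU-only settings are assumptions made for this example and are not taken from Picarch's source.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so cosine similarity becomes a dot product."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def embed_faces(image_path: str) -> List[List[float]]:
    """Return one unit-length embedding per face detected in the image.

    Hypothetical helper: model pack and settings are assumptions.
    """
    # Imported lazily so l2_normalize works without the heavy dependencies.
    import cv2
    from insightface.app import FaceAnalysis

    # Building the model is expensive; real code would create it once and reuse it.
    app = FaceAnalysis(name="buffalo_l", providers=["CPUExecutionProvider"])
    app.prepare(ctx_id=-1, det_size=(640, 640))  # a negative ctx_id selects the CPU

    image = cv2.imread(image_path)  # BGR array, or None if the file is unreadable
    if image is None:
        raise FileNotFoundError(f"could not read image: {image_path}")

    # Each detected face carries an `embedding` NumPy array.
    return [l2_normalize(face.embedding.tolist()) for face in app.get(image)]
```

Normalizing the embeddings up front is a design choice in this sketch: it keeps distance comparisons independent of vector magnitude.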
Requirements:
- Python 3.12+
- PostgreSQL database server
Installation:
- Clone the repository and navigate into the project directory:
git clone https://github.com/SirusCodes/picarch.git
cd picarch
- Create a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install the required packages:
pip install -r requirement.txt
- Set up the database:
Install the pgvector extension in your PostgreSQL server, or run a Docker image that ships with it preinstalled.
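To make the database step more concrete, the sketch below shows one way a pgvector-backed schema could be created from Python. The table names `images` and `face_embeddings`, the column layout, and the `setup_schema` helper are all hypothetical. The 512 dimension assumes InsightFace's commonly used ArcFace recognition models. If you go the Docker route, the `pgvector/pgvector` image on Docker Hub is one option that bundles the extension.

```python
from typing import List

# Assumption: 512-dimensional embeddings, as produced by common ArcFace models.
EMBEDDING_DIM = 512

# Hypothetical schema: table and column names are illustrative only.
SCHEMA_STATEMENTS: List[str] = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    """
    CREATE TABLE IF NOT EXISTS images (
        id   SERIAL PRIMARY KEY,
        path TEXT NOT NULL UNIQUE
    )
    """,
    f"""
    CREATE TABLE IF NOT EXISTS face_embeddings (
        id        SERIAL PRIMARY KEY,
        image_id  INTEGER NOT NULL REFERENCES images (id) ON DELETE CASCADE,
        embedding vector({EMBEDDING_DIM}) NOT NULL
    )
    """,
]


def setup_schema(conn) -> None:
    """Run the statements on an open DB-API connection, e.g. from psycopg2.connect()."""
    cur = conn.cursor()
    try:
        for statement in SCHEMA_STATEMENTS:
            cur.execute(statement)
    finally:
        cur.close()
    conn.commit()
```

Keeping paths and embeddings in separate tables lets one image hold several faces, which is an assumption about how group photos would be handled.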
Picarch provides the following commands:
Recursively search a directory for images, encode faces, and store the embeddings in the database.
python main.py encode <path_to_images>
Note: encoding a large collection takes a long time. I ran it overnight.
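The recursive directory search behind `encode` could look roughly like the sketch below. The helper name `iter_images` and the list of accepted extensions are assumptions. The actual project may filter files differently.

```python
from pathlib import Path
from typing import Iterator

# Assumed set of extensions; the real project may accept a different list.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def iter_images(root: str) -> Iterator[Path]:
    """Yield every image file under root, descending into subdirectories."""
    # rglob("*") walks the whole tree; sorting keeps the order reproducible.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            yield path
```

Each yielded path would then be passed to the face encoder and written to the database together with its embeddings.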
Provide an image of a face to search for similar images in the database.
python main.py search <image_path> [--output <output_directory>]
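A similarity search over stored embeddings can be sketched as follows. `SEARCH_SQL` shows how a query might use pgvector's `<=>` cosine-distance operator, with placeholder table and column names. `rank_by_similarity` is an in-memory stand-in that applies the same ordering without a database. None of these names come from Picarch itself.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical query: `<=>` is pgvector's cosine-distance operator. The table
# and column names are placeholders, not Picarch's real schema.
SEARCH_SQL = """
SELECT i.path, f.embedding <=> %s::vector AS distance
FROM face_embeddings AS f
JOIN images AS i ON i.id = f.image_id
ORDER BY distance
LIMIT %s
"""


def to_pgvector_literal(vec: Sequence[float]) -> str:
    """Format a vector in the text form pgvector parses, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity, the quantity `<=>` computes in pgvector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float],
    stored: Dict[str, Sequence[float]],
    limit: int = 5,
) -> List[Tuple[str, float]]:
    """Return up to `limit` (path, distance) pairs, closest match first."""
    scored = [(path, cosine_distance(query, emb)) for path, emb in stored.items()]
    return sorted(scored, key=lambda item: item[1])[:limit]
```

With a database connection, the query would be run as something like `cur.execute(SEARCH_SQL, (to_pgvector_literal(query), 10))`, where the limit of 10 is arbitrary.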
Truncate all image and embedding tables in the database.
python main.py truncate
Drop the tables from the database.
python main.py drop
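The four commands above map naturally onto subcommands. The sketch below uses the standard library's `argparse` to mirror them. It is only an assumption about how `main.py` might be organized, and the real entry point may use a different CLI library or argument names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with encode, search, truncate, and drop subcommands."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Face detection and image similarity search.",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    encode = sub.add_parser("encode", help="encode faces found under a directory")
    encode.add_argument("path", help="directory to search recursively for images")

    search = sub.add_parser("search", help="find images with a similar face")
    search.add_argument("image_path", help="image containing the face to look for")
    # Optional, matching the [--output <output_directory>] form shown above.
    search.add_argument("--output", default=None, help="output directory")

    sub.add_parser("truncate", help="empty the image and embedding tables")
    sub.add_parser("drop", help="drop the tables from the database")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```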
Project structure:
- src/
  - ml.py: Contains the image encoding functions.
  - db/: Contains database utilities and classes that handle PostgreSQL operations.
- database.ini: Configuration file for the PostgreSQL connection.
- requirement.txt: Lists the project dependencies.
- main.py: CLI entry point for the project.
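For reference, a connection file like `database.ini` is commonly read with the standard library's `configparser`. The sketch below assumes a section named `postgresql` with keys such as `host`, `dbname`, `user`, and `password`. The section name, the keys, and the `load_db_config` helper are guesses for illustration, so check the actual file for the names Picarch expects.

```python
from configparser import ConfigParser
from typing import Dict


def load_db_config(
    filename: str = "database.ini",
    section: str = "postgresql",  # assumed section name
) -> Dict[str, str]:
    """Return the key/value pairs of one section of an INI file."""
    parser = ConfigParser()
    # read() returns the list of files it parsed; empty means the file is missing.
    if not parser.read(filename):
        raise FileNotFoundError(f"config file not found: {filename}")
    if not parser.has_section(section):
        raise KeyError(f"section [{section}] not found in {filename}")
    return dict(parser.items(section))
```

The resulting dictionary can be unpacked into a driver call such as `psycopg2.connect(**load_db_config())`, provided the keys match the driver's connection parameters.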