This is a runnable proof of concept demonstrating semantic search over DesignSafe publications. The tree representation of each publication is converted to a natural-language description, which is embedded using the nomic-embed-text model via ollama and stored in a ChromaDB collection. A search embeds the query string with the same model and retrieves the most similar vectors from the collection.
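The pipeline described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the notebook's code: the tree fields (`title`, `type`, `children`) are hypothetical, `embed` is a toy bag-of-words stand-in for nomic-embed-text (so it matches words, not meaning), and an in-memory list with cosine similarity stands in for the ChromaDB collection.

```python
import math
import re
from collections import Counter

# Hypothetical publication trees; the real DesignSafe schema may differ.
PUBLICATIONS = [
    {"title": "Shake table tests of timber frames",
     "children": [{"type": "experiment", "title": "Cyclic loading of shear walls"}]},
    {"title": "Hurricane storm surge field survey",
     "children": [{"type": "mission", "title": "Coastal flooding reconnaissance"}]},
]


def describe(node, depth=0):
    """Flatten a tree node and its descendants into one descriptive string."""
    if depth == 0:
        text = f"A publication titled '{node['title']}'."
    else:
        text = f"It contains a child of type '{node['type']}' titled '{node['title']}'."
    for child in node.get("children", []):
        text += " " + describe(child, depth + 1)
    return text


def embed(text):
    """Toy stand-in for the embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, index, k=1):
    """Embed the query and return the titles of the k most similar entries."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [title for title, _ in ranked[:k]]


# Stand-in for the vector store: (title, vector) pairs held in memory.
INDEX = [(p["title"], embed(describe(p))) for p in PUBLICATIONS]

print(search("storm surge flooding", INDEX))
```

In the demo itself, the embedding step is handled by the nomic-embed-text model and the storage and nearest-neighbor lookup by ChromaDB; the overall shape (describe, embed, store, query) is the part this sketch is meant to convey.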
The demo is packaged as a Jupyter notebook with the embedding model and vector database served via docker-compose. To run it:
- Navigate to the repository root.
- Run `docker compose build`.
- Run `docker compose up`.
- Open http://localhost:8888/lab/tree/notebook.ipynb in your browser and run the Jupyter notebook served there.

The demo relies on the following components:
- nomic-embed-text model: https://ollama.com/library/nomic-embed-text
- ollama docker image: https://hub.docker.com/r/ollama/ollama
- ChromaDB vector database: https://docs.trychroma.com/docs/overview/introduction