text-to-sql using LLM

Summary

Motivated by a desire to explore what Large Language Models (LLMs) can do for us, this project applies one of the open-source LLMs available on Hugging Face to take text input from a user and return a SQL command through a chatbot interface, similar to ChatGPT.

We often need to interact with databases in a corporate setting. An LLM can support this process, especially for non-technical users who need to query and retrieve certain data but may not yet have the necessary domain knowledge to do so.

While ChatGPT is openly accessible right now, the OpenAI GPT-3.5 model API is not free to use. Hugging Face, on the other hand, hosts a plethora of open-source models that can be easily used from Python. This app does not require any API key.
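As a rough illustration of the text-in, SQL-out flow described above, here is a minimal sketch. It is not the project's actual code: the function names, the prompt wording, and the post-processing are assumptions made for illustration, and the model is left as a parameter because this README does not name the one used. The Hugging Face call is imported lazily, so the prompt logic can be tried without downloading a model.

```python
from typing import Callable


def build_prompt(question: str, schema: str) -> str:
    """Combine a table schema and a natural-language question into one prompt.

    The prompt wording is a hypothetical example, not the project's prompt.
    """
    return (
        "Translate the question into a SQL query.\n"
        f"Schema: {schema.strip()}\n"
        f"Question: {question.strip()}\n"
        "SQL:"
    )


def text_to_sql(question: str, schema: str, generate: Callable[[str], str]) -> str:
    """Turn a question into a single SQL statement.

    `generate` is any callable that maps a prompt string to generated text.
    """
    raw = generate(build_prompt(question, schema))
    # Keep only the first statement and normalise the trailing semicolon.
    sql = raw.strip().split(";")[0].strip()
    return sql + ";"


def make_hf_generator(model_name: str) -> Callable[[str], str]:
    """Wrap a Hugging Face text-generation pipeline as a `generate` callable.

    `model_name` is whichever Hub model you choose (assumption: a causal LM).
    """
    # Imported lazily: requires the transformers and torch packages.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=model_name)

    def generate(prompt: str) -> str:
        # return_full_text=False drops the prompt from the returned text.
        out = pipe(prompt, max_new_tokens=64, return_full_text=False)
        return out[0]["generated_text"]

    return generate
```

Passing `make_hf_generator(...)` as the `generate` argument would run a real model locally with no API key, while a stub function can stand in for it when testing.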

Technology Used

Python ( Streamlit, PyTorch )
Docker
Kubernetes ( k3d, kubectl )

Getting Started

There are multiple ways to spin up this project, listed below.

Option 1: Spin up using Python

To begin, install the required Python packages by running:

pip install -r requirements.txt

Then navigate to main/frontend and run:

streamlit run streamlit-sql.py

Option 2: Spin up using Docker compose (note that this requires you to also have Docker installed):

docker-compose up
Navigate to localhost:8501 to interact with the app.

Option 3: Spin up using Kubernetes cluster:

Navigate to main/deployment.
Run sh setup_cluster.sh.
Navigate to localhost:8501 once port-forwarding has started.

Note: This approach is not yet polished and currently requires port-forwarding to reach the pod within the cluster. To run on a Kubernetes cluster, install k3d first.
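Both the Docker and Kubernetes options end with opening localhost:8501, and with port-forwarding it may not be obvious when the app is actually reachable. The helper below is a small sketch of one way to check, using only the Python standard library. The function name and its defaults are illustrative assumptions and are not part of this repository.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Calling `wait_for_port("localhost", 8501)` after starting docker-compose or the port-forward returns True as soon as something is listening on that port. It only confirms that the port is open, not that the model has finished loading.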

Limitations and Future Improvements

While I selected a relatively compressed LLM, running a query still puts substantial memory pressure on my local machine ( M1 MacBook Pro, 16GB RAM ). A future improvement could be to host the model in the cloud, where more memory is available, to avoid overwhelming local machines.
The deployment process is not yet smooth, as it requires port-forwarding to access the service deployed on the Kubernetes cluster. It would be better to expose the service in a way that can be defined in a config script. Once that setup is available, the app can be deployed to the cloud (e.g. AWS) and made accessible to the public.
I tested the queries myself, which may introduce bias: I have some prior experience writing SQL queries, so the results may not be representative of how well this project supports non-technical users. More feedback will need to be gathered.
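To put the memory limitation above in perspective, the sketch below gives a back-of-the-envelope estimate of how much RAM a model's weights alone occupy at different numeric precisions. The parameter counts used with it are hypothetical, since this README does not state the model's size, and the estimate ignores activations, tokenizer data, and framework overhead.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weights_memory_gib(num_params: int, dtype: str = "float32") -> float:
    """Approximate memory for model weights only, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# Hypothetical model sizes, for illustration only.
for params in (250_000_000, 1_000_000_000, 7_000_000_000):
    for dtype in ("float32", "float16"):
        gib = weights_memory_gib(params, dtype)
        print(f"{params:>13,} params @ {dtype}: {gib:6.2f} GiB")
```

Comparing such an estimate with the memory available on a laptop or a cloud instance gives a first indication of whether a given model will fit.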

Future Features

Allow "adding on to previous query"
Add more examples or prompting guides
Copy to clipboard

