Using OpenAI with Databricks SQL for queries in natural language

This is an example application that uses OpenAI, Databricks Unity Catalog, and DBSQL to convert natural-language queries into SQL and execute them against DBSQL.

Read more about this example project here.
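The flow described above (question in, SQL out, SQL executed against DBSQL) can be sketched roughly as follows. This is a minimal illustration, not the application's actual code: `build_prompt`, `answer`, the prompt wording, and the `part` table with its `p_brand` and `p_size` columns are all hypothetical. The model call and the DBSQL call are passed in as plain callables, so the sketch does not assume any particular OpenAI or Databricks client API.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def build_prompt(tables: Dict[str, Sequence[str]], question: str) -> str:
    """Describe the available tables, then ask the model for one SQL query.

    The prompt wording is an assumption made for this sketch.
    """
    schema_lines = [
        f"Table {name}, columns: {', '.join(columns)}"
        for name, columns in tables.items()
    ]
    return (
        "Given the following tables:\n"
        + "\n".join(schema_lines)
        + f"\nWrite a SQL query that answers: {question}\nSQL:"
    )


def answer(
    question: str,
    tables: Dict[str, Sequence[str]],
    complete: Callable[[str], str],
    run_sql: Callable[[str], List[Tuple]],
) -> Tuple[str, List[Tuple]]:
    """Turn a question into SQL via `complete`, then execute it via `run_sql`.

    `complete` stands in for the language-model call and `run_sql` for the
    DBSQL query execution; both are hypothetical hooks supplied by the caller.
    """
    prompt = build_prompt(tables, question)
    # Models often return surrounding whitespace or a trailing semicolon.
    sql = complete(prompt).strip().rstrip(";")
    return sql, run_sql(sql)


if __name__ == "__main__":
    # Stubs replace the real services so the sketch runs offline.
    tables = {"part": ["p_brand", "p_size"]}  # hypothetical schema
    fake_complete = lambda prompt: (
        "SELECT p_brand, AVG(p_size) FROM part GROUP BY p_brand;"
    )
    fake_run_sql = lambda sql: [("brand-a", 25.0)]  # made-up row
    print(answer("Average part size by brand", tables, fake_complete, fake_run_sql))
```

Keeping the two external calls behind callables makes the prompt construction testable on its own; in the real application the table metadata would presumably come from Unity Catalog rather than a hard-coded dictionary.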

Average part size by brand

Solution Architecture

  1. Clone the repository: git clone https://github.com/renardeinside/databricks-uc-semantic-layer.git
  2. Get your DBSQL endpoint coordinates from the UI
  3. Get your OpenAI API key
  4. Generate the data using the job defined in ./uc-semantic-layer:
     1. Configure the catalog and schema in ./uc-semantic-layer/conf/data_preparation.yml
     2. Run the job (either on an interactive cluster or as a job):

        cd uc-semantic-layer
        # install dbx and other relevant libraries
        pip install -r unit-requirements.txt
        # optional: configure dbx to use another profile (by default it uses the DEFAULT one)
        dbx configure -e default --profile=<some-other-profile-name>
        # option A: execute on an interactive cluster
        dbx execute --job=semantic-layer-data-preparation --cluster-name=<interactive-cluster-name>
        # option B: launch on an automated cluster (configure your node_type_id in conf/deployment.yml first)
        dbx deploy --job=semantic-layer-data-preparation --files-only
        dbx launch --job=semantic-layer-data-preparation --as-run-submit --trace
        cd .. # back to the project root

  5. Set up the relevant variables in the .env file (see .env.example for reference).
  6. Start the services:

        make launch

  7. Open http://localhost:3000 and enjoy the app!
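Before running `make launch`, it can help to check that the .env file actually defines everything the services need. The snippet below is a small sketch of such a check. The variable names in `REQUIRED` are placeholders chosen for illustration; the authoritative list is whatever .env.example in the repository contains. `parse_env` and `missing_keys` are likewise hypothetical helpers, not part of the project.

```python
from pathlib import Path
from typing import Dict, Iterable, List

# Hypothetical names for illustration only; take the real ones from .env.example.
REQUIRED = (
    "DATABRICKS_SERVER_HOSTNAME",
    "DATABRICKS_HTTP_PATH",
    "DATABRICKS_TOKEN",
    "OPENAI_API_KEY",
)


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str], required: Iterable[str] = REQUIRED) -> List[str]:
    """Return the required keys that are absent or empty, in declared order."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found; copy .env.example and fill it in.")
    else:
        missing = missing_keys(parse_env(env_path.read_text()))
        print("Missing variables:", ", ".join(missing) if missing else "none")
```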