This repository has been archived by the owner on Mar 6, 2024. It is now read-only.

basetenlabs/vicunlocked-alpaca-30b

Vicunlocked-Alpaca-30B Truss

This repository packages Vicunlocked-Alpaca-30B as a Truss.

Running inference with this model can be challenging because of its hardware requirements. Baseten and Truss simplify both deployment and inference.

Deploying Vicunlocked-Alpaca-30B

We found that this model runs reasonably fast on A100s. You can configure the hardware you'd like in config.yaml:

...
resources:
  cpu: "3"
  memory: 14Gi
  use_gpu: true
  accelerator: A100
...
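
To make the hardware requirement more concrete, the sketch below estimates the memory needed just to hold the model weights. It assumes roughly 30 billion parameters (taken from the model name at face value) and a few common storage precisions. The function name `weight_memory_gib` is illustrative, and the estimate ignores activations and the attention cache, so real usage will be higher. How this Truss actually loads the weights is not stated here.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory (in GiB) needed to hold the weights alone."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 30e9  # assumption: "30B" in the model name taken at face value

# Compare a few common storage precisions (illustrative, not this repo's settings).
for label, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    print(f"{label}: ~{weight_memory_gib(NUM_PARAMS, nbytes):.0f} GiB")
```

Under these assumptions, half-precision weights alone come to roughly 56 GiB, which helps explain why a large-memory accelerator such as the A100 is a reasonable choice.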

Before deployment:

  1. Make sure you have a Baseten account and API key. You can sign up for a Baseten account here.
  2. Install Truss and the Baseten Python client: pip install --upgrade baseten truss
  3. Authenticate your development environment with baseten login

Deploying the Truss is easy; simply load it and push from a Python script:

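import baseten
import truss

# Load the Truss from the current directory (the root of this repository)
vicunlocked_truss = truss.load('.')
# Push the packaged model to your Baseten account
baseten.deploy(vicunlocked_truss)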

Invoking Vicunlocked-Alpaca-30B

The usual GPT-style generation parameters pass straight through to the inference endpoint:

import baseten

# Replace the placeholder with the ID of your deployed model
model = baseten.deployed_model_id('YOUR MODEL ID')
model.predict({"prompt": "Write a movie plot about vicunas planning to take over the world", "do_sample": True, "max_new_tokens": 300})
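
If you call the model from several places, it can help to assemble the request dictionary in one spot. The sketch below is one possible way to do that. `build_payload` is a hypothetical helper, not part of the Baseten client, and the parameter names are simply the ones used in the examples in this README.

```python
def build_payload(prompt: str, **generation_params) -> dict:
    """Combine a prompt with optional generation parameters into one dict."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    payload = {"prompt": prompt}
    # Parameters left as None are dropped, so they are simply not sent.
    payload.update({k: v for k, v in generation_params.items() if v is not None})
    return payload


payload = build_payload(
    "Write a movie plot about vicunas planning to take over the world",
    do_sample=True,
    max_new_tokens=300,
    temperature=None,  # left out of the resulting payload
)
print(payload)
```

The resulting dictionary can then be passed to `model.predict(payload)` as in the example above.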

You can also invoke your model via a REST API:

curl -X POST "https://app.baseten.co/models/YOUR_MODEL_ID/predict" \
     -H "Content-Type: application/json" \
     -H 'Authorization: Api-Key {YOUR_API_KEY}' \
     -d '{
           "prompt": "Write a movie plot about vicunas planning to take over the world",
           "do_sample": true,
           "max_new_tokens": 300,
           "temperature": 0.3
         }'
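
The same REST call can be sketched in Python using only the standard library. The URL pattern and the `Api-Key` header come from the curl example above. `build_predict_request` is a hypothetical helper name, and the request is only constructed here, not sent, so the snippet runs without credentials.

```python
import json
import urllib.request


def build_predict_request(model_id: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the predict endpoint."""
    url = f"https://app.baseten.co/models/{model_id}/predict"
    return urllib.request.Request(
        url,
        # json.dumps serializes Python's True as the JSON literal true
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Api-Key {api_key}",
        },
        method="POST",
    )


request = build_predict_request(
    "YOUR_MODEL_ID",  # placeholder, as in the curl example
    "YOUR_API_KEY",   # placeholder, as in the curl example
    {
        "prompt": "Write a movie plot about vicunas planning to take over the world",
        "do_sample": True,
        "max_new_tokens": 300,
        "temperature": 0.3,
    },
)
print(request.full_url)
```

With a real model ID and API key, passing `request` to `urllib.request.urlopen` would send it. The shape of the response is not documented in this README, so inspect it before relying on any particular field.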
