Starlight039/visual-chatgpt-googlecolab

 
 


Visual ChatGPT

Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting.

See our paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models

Intro

I implemented a Google Colab version that runs in the standard GPU environment. Because of limited GPU memory, I use only two models, T2I and ImageCaption, to process images. You can try my Colab notebook here:

Open 2k image generation in Colab

Demo

T2I

ImageCaption

GPU memory usage

The table below lists the GPU memory usage of each visual foundation model. To save GPU memory, modify self.tools so that it loads fewer visual foundation models:

| Foundation Model | Memory Usage (MB) |
| --- | --- |
| ImageEditing | 6667 |
| ImageCaption | 1755 |
| T2I | 6677 |
| canny2image | 5540 |
| line2image | 6679 |
| hed2image | 6679 |
| scribble2image | 6679 |
| pose2image | 6681 |
| BLIPVQA | 2709 |
| seg2image | 5540 |
| depth2image | 6677 |
| normal2image | 3974 |
| InstructPix2Pix | 2795 |
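
To decide which models to keep in self.tools, it can help to add up the figures from the table before loading anything. The following is a minimal sketch, not code from this repository: the helper names (`total_memory_mb`, `fits_budget`) and the 15000 MB budget are hypothetical, and the sum is only a rough estimate because real usage also depends on inference overhead.

```python
# Memory figures (MB) copied from the table above.
MEMORY_MB = {
    "ImageEditing": 6667,
    "ImageCaption": 1755,
    "T2I": 6677,
    "canny2image": 5540,
    "line2image": 6679,
    "hed2image": 6679,
    "scribble2image": 6679,
    "pose2image": 6681,
    "BLIPVQA": 2709,
    "seg2image": 5540,
    "depth2image": 6677,
    "normal2image": 3974,
    "InstructPix2Pix": 2795,
}


def total_memory_mb(models):
    """Sum the listed memory usage (MB) for the given model names."""
    return sum(MEMORY_MB[name] for name in models)


def fits_budget(models, budget_mb):
    """Return True if the listed usage of `models` is within `budget_mb`."""
    return total_memory_mb(models) <= budget_mb


# Hypothetical budget; replace it with the memory your GPU actually has.
BUDGET_MB = 15000

selected = ["T2I", "ImageCaption"]  # the two models used in this Colab version
print(total_memory_mb(selected))         # 8432
print(fits_budget(selected, BUDGET_MB))  # True
```

By the same arithmetic, loading all thirteen models would need about 69052 MB according to the table, which is why this Colab version keeps only T2I and ImageCaption.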

Acknowledgement

We appreciate the following open-source projects:

Hugging Face, LangChain, Stable Diffusion, ControlNet, InstructPix2Pix, CLIPSeg, BLIP

About

VisualChatGPT for googlecolab-version
