Is your feature request related to a problem? Please describe.
No.
Describe the solution you'd like
Additional zero-shot models, such as Grounding DINO, and possibly Detectron2 or Segment Anything. Grounding DINO in particular would be great, since it is promptable.
Describe alternatives you've considered
n/a
Additional context
The Grounding DINO model is promptable and reportedly scores higher than CLIP.
I think Segment Anything / Grounding DINO produce more restrictive embeddings due to their promptable nature (more focused training data). In other words, CLIP on its own lets you search using more "obscure" language, while the others might be restricted to more common words (car, sky, bird, face, etc.).
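To illustrate the kind of open-vocabulary search CLIP enables, here is a minimal sketch of embedding-based retrieval. Random vectors stand in for real embeddings (in practice they would come from an encoder such as CLIP's image and text towers); the function names and dimensions here are illustrative assumptions, not this project's API.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins for real embeddings: in practice these would be produced by
# a model's encode_image / encode_text calls (names vary by library).
rng = np.random.default_rng(0)
dim = 512  # CLIP ViT-B/32 uses a 512-dim joint embedding space
image_embeddings = {f"img_{i}": rng.standard_normal(dim) for i in range(100)}

def search(query_embedding, image_embeddings, top_k=5):
    """Rank stored image embeddings by similarity to a query embedding."""
    scored = [(name, cosine_similarity(query_embedding, emb))
              for name, emb in image_embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# The query embedding would normally encode free-form text, which is why
# CLIP can handle "obscure" phrases: any string maps into the same space.
query = rng.standard_normal(dim)
results = search(query, image_embeddings)
```

A promptable detector like Grounding DINO instead takes the text prompt at inference time and returns boxes, so it doesn't produce a single reusable per-image vector the same way.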
We're preparing an update that also adds support for GPT-Vision and LLaVA-like models, which would let you ingest and prompt images directly too.