Updated Dependencies, Better Docker Support, and Segmentation Demo #480

Open: tim-win wants to merge 10 commits into base: master

@tim-win tim-win commented Sep 1, 2024

This PR introduces several significant improvements and updates to code surrounding the core YOLO-World project, addressing multiple issues and enhancing overall functionality and ease of use. When I finally got out of dependency hell, I decided to put down a ladder!

Key Changes

  1. Dependency Updates:

    • Updated dependencies to match the latest recommended versions from issue #364, including torch >2.0.0 (phew!).
    • Upgraded to CUDA 12.1 for compatibility with the latest GPU architectures and because, thank god, it works.
  2. Docker Support:

    • Took the existing Docker demo system under my wing and cleaned it right up: it automatically handles the mm* dependency issues everyone has run into, as well as torch and others required for the demo.
    • Added a build_and_run.sh script for easy building and running of Docker containers with different model configurations, matching configs to models, so no one else needs the headache I have.
  3. Segmentation Demo:

    • Added demo/segmentation_demo.py to showcase YOLO-World's open vocabulary segmentation capabilities. The guts of it were shamelessly stolen from @onuralpszr's excellent Hugging Face Space, https://huggingface.co/spaces/onuralpszr/YOLO-World-Seg, which did not work but showed me enough to get this running.
    • Integrated segmentation support into the Docker container, allowing for easy testing and demonstration of this feature.
  4. Issue Resolutions:

    • This PR covers much of the work done in #419, bringing it up to date as of August 2024.
    • Implicitly fixes issues #279, #364, and #425.
  5. Tested Configurations:

    • Verified functionality with pretrain-x-1280ft, which performs excellently.
    • Tested seg-l and seg-l-seghead configurations, which show good performance but don't really work well with my use case ( :/ )

Detailed Improvements

  • Refactored the Dockerfile for better efficiency and clarity.
  • Updated pyproject.toml and requirements files with pinned dependency versions.
  • Minor changes to configuration files: some local paths needed to be removed.
  • Documented the Docker-based demo workflow.

How to Use

Users can now easily run YOLO-World demos, including the new segmentation demo, using the provided Docker build system. For example:

./build_and_run.sh pretrain-x-1280ft  # For gradio object detection demo
./build_and_run.sh seg-l              # For segmentation demo
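
The two commands above suggest the model key decides which demo gets launched. As a rough sketch only (this is not the contents of build_and_run.sh, and the helper name `demo_kind_for_model` is hypothetical), that routing could look like:

```shell
# Hypothetical helper, NOT taken from build_and_run.sh: classify a model key
# by the demo it would launch. Assumption: segmentation keys start with "seg-".
demo_kind_for_model() {
  case "$1" in
    seg-*) echo "segmentation" ;;  # e.g. seg-l, seg-l-seghead
    *)     echo "detection" ;;     # e.g. pretrain-x-1280ft
  esac
}

# Example: print the demo kind for each configuration tested in this PR.
for model in pretrain-x-1280ft seg-l seg-l-seghead; do
  echo "$model -> $(demo_kind_for_model "$model")"
done
```

A prefix match keeps the mapping short, but an explicit list of supported keys would fail faster on a typo.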

Note: while this PR is open, the fixes are not on master, so you have to replace this line in the Dockerfile:

RUN git clone --recursive https://github.com/AILab-CVC/YOLO-World /yolo/

With this line:

RUN git clone --recursive https://github.com/tim-win/YOLO-World /yolo/
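
If you would rather not edit the file by hand, the swap can be scripted. This is a minimal sketch, assuming the Dockerfile contains the upstream URL exactly as quoted above. The function name `use_fork_clone_url` and the `./Dockerfile` path in the usage comment are illustrative, not part of the PR.

```shell
# Sketch: point the Dockerfile's clone step at the fork instead of upstream.
# Assumption: the upstream URL appears in the file exactly as quoted in this PR.
use_fork_clone_url() {
  dockerfile="$1"  # path to the Dockerfile (hypothetical; adjust to your checkout)
  upstream="https://github.com/AILab-CVC/YOLO-World"
  fork="https://github.com/tim-win/YOLO-World"
  # Write to a temp file and move it into place, which avoids the
  # GNU/BSD difference in how `sed -i` takes its argument.
  sed "s#${upstream}#${fork}#" "$dockerfile" > "${dockerfile}.tmp" \
    && mv "${dockerfile}.tmp" "$dockerfile"
}

# Usage (path is an assumption):
# use_fork_clone_url ./Dockerfile
```

Once the PR is merged, the swap is no longer needed and the original clone line can be restored.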

Hopefully this PR will save the people who come after me significant amounts of time. Feedback and further testing are welcome!


tim-win commented Sep 3, 2024

Pinging @wondervictor as you may be able to review!
