
GPU isn't used #2506

Closed

abrahimzaman360 opened this issue Feb 5, 2024 · 5 comments
Labels: bug (Something isn't working)

Comments

@abrahimzaman360

GPU is not utilized during the process!

@abrahimzaman360 abrahimzaman360 added the bug Something isn't working label Feb 5, 2024
@muazhari

muazhari commented Apr 19, 2024

NEED IT TOO!!!

@javier-cohere

javier-cohere commented Apr 26, 2024

Same here. Running in Colab and getting the warning that the GPU is not being utilised.

@hahazei

hahazei commented Apr 30, 2024

I need it too.

@javier-cohere

javier-cohere commented Apr 30, 2024

I did a bit of investigation/debugging and here is what I learned. From what I can see, there are two types of layout detection models in Unstructured:

  • Models that run with ONNXRuntime: YoloX, Detectron_ONNX, and several others use this method.
  • Native models: Detectron2, whose weights are downloaded from HF and loaded into memory.

For the models that run with ONNXRuntime

ONNXRuntime has a series of providers available that it uses to run inference. In order to use the GPU, the TensorrtExecutionProvider and CUDAExecutionProvider need to be available. You can check this by adding the following to your code:

import logging

from onnxruntime.capi import _pybind_state as C

logger = logging.getLogger(__name__)
# The public onnxruntime.get_available_providers() returns the same list.
logger.info(f"Available ONNXRT providers: {C.get_available_providers()}")

In my case, I was getting Available ONNXRT providers: ['AzureExecutionProvider', 'CPUExecutionProvider'], which means that GPU wasn't being used.

To utilise the GPU, you need to install onnxruntime-gpu:

  • pip install onnxruntime-gpu if your CUDA drivers are <12.
  • pip install onnxruntime-gpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/ otherwise.
    See https://onnxruntime.ai/docs/install/#python-installs

After installing this library, I could see ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']. I think ONNXRT uses the providers in order of preference, so first it will try to use Tensorrt, then CUDA.
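
If you'd rather not rely on the implicit ordering, ONNX Runtime also lets you pass an explicit provider list when creating a session. Here's a minimal sketch (the "model.onnx" path is a placeholder, not a file shipped by Unstructured):

import onnxruntime as ort

# Providers are tried in the order given; the session falls back to the next
# entry if a provider isn't available on the machine.
session = ort.InferenceSession(
    "model.onnx",  # placeholder: path to any ONNX model
    providers=[
        "TensorrtExecutionProvider",
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)
print(session.get_providers())  # the providers the session actually resolved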

For models that do not use ONNXRT

In the case of Detectron2, I could verify in the Unstructured code that the detectron2 model does not correctly receive the device parameter it needs to use CUDA. This can be worked around by leveraging the fact that unstructured first tries to load the model config from the environment variable UNSTRUCTURED_DEFAULT_MODEL_INITIALIZE_PARAMS_JSON_PATH, and only falls back to the default model config if it is not set. See https://github.com/Unstructured-IO/unstructured-inference/blob/main/unstructured_inference/models/base.py#L67

You can load the default Detectron2 model config in your code, add device: "cuda", dump it into a temporary file, and point unstructured at it with os.environ["UNSTRUCTURED_DEFAULT_MODEL_INITIALIZE_PARAMS_JSON_PATH"] = "your_config_file".
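
Here is a minimal sketch of that workaround. The params dict below is a stand-in (device is the only field we actually care about here); copy the remaining fields from the real default Detectron2 config in your installed unstructured-inference version:

import json
import os
import tempfile

# Stand-in params for illustration only; start from the actual default
# Detectron2 config shipped with your unstructured-inference install.
params = {
    "device": "cuda",  # the key addition: tell the model to run on the GPU
}

# Dump the edited config to a temporary file...
with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
    json.dump(params, f)

# ...and point unstructured at it before any model is loaded.
os.environ["UNSTRUCTURED_DEFAULT_MODEL_INITIALIZE_PARAMS_JSON_PATH"] = f.name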

However, I do not recommend this approach, since Detectron2 already has an ONNX flavour and you don't need any of this to use it. Moreover, YoloX works better as a layout model.

@MthwRobinson
Contributor

Thanks for the write-up @javier-cohere! The detectron2 model @javier-cohere mentioned at the end is no longer supported as of 0.14.1. If you'd still like to use detectron2, you can use the ONNX version, though we recommend yolox.
