This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

Cannot compile against nvidia drivers in the container #103

Closed
hsysuper opened this issue Jun 3, 2016 · 7 comments

hsysuper commented Jun 3, 2016

Currently, the provisioned nvidia-driver volume only includes the driver library files (e.g. libnvidia-opencl.so.352.93) and the runtime symbolic links (e.g. libnvidia-opencl.so.1 -> libnvidia-opencl.so.352.93). This allows binaries that use the nvidia drivers to run without a problem.

However, when compiling programs against the nvidia driver libraries, the linker looks for the unversioned .so file (e.g. libnvidia-opencl.so) to link against, and that file does not exist in the current volume configuration.

Could nvidia-docker also create these .so symbolic links when provisioning the volume, so that programs can be compiled in the container when people want to use it as a development/debugging environment?
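
To make the request concrete, here is a minimal sketch of the kind of links meant, assuming the runtime links follow the libNAME.so.1 pattern shown above. The helper name make_linker_links is hypothetical and is not part of nvidia-docker.

```shell
#!/bin/sh
# Hypothetical helper: for every runtime link libNAME.so.1 in a directory,
# add the unversioned libNAME.so link that the linker looks for with -lNAME.
make_linker_links() {
    dir=$1
    for soname in "$dir"/lib*.so.1; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$soname" ] || continue
        # Strip the trailing ".1" to get the linker name.
        devlink=${soname%.1}
        # Leave any existing file or link alone.
        if [ ! -e "$devlink" ] && [ ! -L "$devlink" ]; then
            ln -s "$(basename "$soname")" "$devlink"
        fi
    done
}
```

It would be called with the directory that holds the driver libraries, for example `make_linker_links /some/writable/lib/dir`, where the path is a placeholder.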

Thanks!

flx42 (Member) commented Jun 3, 2016

libnvidia-opencl.so* is just an example, right? You should not link against this particular library.
Let's take other examples that make more sense: libcuda.so and libnvidia-ml.so. Stub libraries are present in the devel versions of the images we provide, under /usr/local/cuda/lib64/stubs/.
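
As a rough sketch of how a build could pick up those stubs: the helper name stub_ldflags is hypothetical, and only the default stubs path comes from the comment above. It prints linker flags for libcuda when a stub is found in the given directory.

```shell
#!/bin/sh
# Hypothetical helper: print linker flags that point at the CUDA stub
# directory, but only when a libcuda.so stub is actually there.
stub_ldflags() {
    stubs=${1:-/usr/local/cuda/lib64/stubs}
    if [ -e "$stubs/libcuda.so" ]; then
        printf '%s\n' "-L$stubs -lcuda"
    else
        echo "no libcuda.so stub in $stubs" >&2
        return 1
    fi
}
```

A build line could then look like `gcc app.c $(stub_ldflags) -o app`, where app.c is a placeholder source file.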

hsysuper (Author) commented Jun 3, 2016

Thanks for the fast reply.

Yes, libnvidia-opencl.so* was just an example, and it is very good to know that several libraries are available under /usr/local/cuda/lib64/stubs/. However, the particular library that I am interested in linking against is libnvidia-encode.so. nvidia-encode is the library required by the NvEncoder example in the Nvidia-Video-Codec-SDK.

Could this library be provisioned by nvidia-docker?

3XX0 (Member) commented Jun 3, 2016

Unfortunately, we do not have stubs for the video libraries:
libnvidia-encode.so, libnvcuvid.so, libnvidia-fbc.so, libnvidia-ifr.so

You would have to rely on dynamic loading at runtime (i.e. dlopen). You can look at the common/src directory of the SDK samples to see how it is done.

libnvidia-encode.so, for example, has a single entry point, NvEncodeAPICreateInstance; from there you can retrieve all the other function pointers.

hsysuper (Author) commented Jun 3, 2016

Thanks for the quick support.

After some trial and error, I found that NVENC does not work very well with nvidia-docker. In the end, I had to manually share devices between the host and the container and install the nvidia-XXX driver or the complete cuda-toolkit in order for the example to run.

I tried installing the nvidia-XXX driver or the cuda-toolkit with apt-get after adding the cuda deb repository. However, I kept getting the NV_ENC_ERR_UNSUPPORTED_DEVICE (0x2) error from the NvEncOpenEncodeSessionEx API call.

However, after I provisioned the devices myself and installed the same driver, the error disappeared and I was able to run the example in the container.

3XX0 (Member) commented Jun 4, 2016

Please don't do that; the problem is elsewhere.
I just tried it, and apparently nvidia-encode looks for libcuda.so internally, which is an issue on our end. As a workaround, you can create the link yourself inside the container and everything will work as intended:

$ nvidia-docker run -ti nvidia/cuda
# ln -s /usr/local/nvidia/lib64/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so

If you want to try the NvEnc sample, you would have to remove -lnvidia-encode from the Makefile (an oversight in the samples).
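
A minimal sketch of that Makefile change, assuming the flag appears literally as -lnvidia-encode. The helper name drop_nvenc_link is hypothetical, and the exact layout of the sample Makefile is not shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: drop the -lnvidia-encode flag from a Makefile.
# Writes to a temporary file first so it does not depend on sed -i.
drop_nvenc_link() {
    makefile=$1
    tmpfile=$(mktemp) || return 1
    if sed 's/[[:space:]]*-lnvidia-encode//g' "$makefile" > "$tmpfile"; then
        # Overwrite in place so the file keeps its permissions.
        cat "$tmpfile" > "$makefile"
        status=$?
    else
        status=1
    fi
    rm -f "$tmpfile"
    return "$status"
}
```

It would be run from the sample directory as `drop_nvenc_link Makefile` before calling make.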

hsysuper (Author) commented Jun 5, 2016

I tried your solution and it worked!

Now, I am able to run the NvEncoder example in the nvidia-docker enabled docker container. It seems that when building custom applications, there is no need for -lnvidia-encode in LDFLAGS, as the library can be dynamically loaded. I believe the sample has affected some projects using NVENC to link against nvidia-encode during build.

For example, the GStreamer NVENC plugin specifies that it needs nvidia-encode at the build stage in its configure.ac script.

Thanks for helping investigate the issue. I hope the two oversights can be fixed soon.

3XX0 added the bug label Jun 7, 2016

3XX0 (Member) commented Jun 17, 2016

Closing since this issue is now partly fixed with the addition of the libcuda.so symlink.
