Opendap KO behind a proxy #2752

Closed
robin-cls opened this issue Sep 19, 2023 · 5 comments

@robin-cls

Hello,

I am encountering an issue when opening an OPeNDAP URL from Ifremer using the netcdf4-python library. I did a preliminary analysis and I think the issue is related to the underlying netcdf-c library, which is why I am submitting it here.

Environment

I am working on a computing platform that has access to the internet via a proxy. This proxy is set up using the canonical http_proxy, https_proxy and no_proxy environment variables.
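For anyone reproducing this setup, a minimal sketch of how to confirm which proxy settings a process actually sees. It uses Python's standard `urllib.request.getproxies()`; the helper name `proxy_settings` is made up for illustration, and this only shows what Python reads from the environment, not what libcurl inside netcdf-c ends up using.

```python
import urllib.request


def proxy_settings():
    """Return the proxy configuration that Python reads from the environment."""
    # getproxies() maps a scheme to its proxy URL; no_proxy shows up under "no".
    proxies = urllib.request.getproxies()
    return {
        "http": proxies.get("http"),
        "https": proxies.get("https"),
        "no_proxy": proxies.get("no"),
    }


if __name__ == "__main__":
    for name, value in proxy_settings().items():
        print(f"{name}: {value}")
```

Proxy URLs often embed credentials, so mask the output before pasting it into a public issue.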

The computing platform runs RHEL8, and I am working in a conda virtual environment with the following libraries:

image

The following script attempts to download data via OPeNDAP:

import netCDF4
  
nds = netCDF4.Dataset('https://tds0.ifremer.fr/thredds/dodsC/OSI-204-b-metop_b/2023/024/20230124074303-OSISAF-L2P_GHRSST-SSTsubskin-AVHRR_SST_METOP_B-sstmgr_metop01_20230124_074303-v02.0-fv01.0.nc', mode='r', clobber=True, diskless=False, persist=False, format='NETCDF4')

print(nds)

I expected the dataset metadata to be displayed, but got an SSL-related error instead:

  File "src/netCDF4/_netCDF4.pyx", line 2464, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 2027, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -68] NetCDF: I/O failure: 'https://tds0.ifremer.fr/thredds/dodsC/OSI-204-b-metop_b/2023/024/20230124074303-OSISAF-L2P_GHRSST-SSTsubskin-AVHRR_SST_METOP_B-sstmgr_metop01_20230124_074303-v02.0-fv01.0.nc

First analysis

I have tried multiple things to narrow down my problem.

Proxy / No proxy

The problem seems to be linked to the presence of a proxy, because the script runs properly on other platforms that have direct access to the internet.

SSL certificate validity

When I open the URL in Chrome, the certificate appears to be valid, so in my opinion the error I get is not justified.
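A browser may trust a different certificate store than the one a command-line process uses, so it can help to repeat the check from the same environment. The following is a sketch only, using the standard library: it requests the DAP2 `.dds` response for the dataset with certificate verification enabled, going through whatever proxy the environment variables define. The function names and the optional `cafile` argument are illustrative assumptions, not part of netCDF.

```python
import ssl
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def dds_url(dataset_url):
    """Build the DAP2 structure (.dds) URL for an OPeNDAP dataset URL."""
    parts = urlsplit(dataset_url)
    return urlunsplit(parts._replace(path=parts.path + ".dds"))


def check_tls(dataset_url, cafile=None, timeout=30):
    """Fetch the .dds response with certificate verification enabled.

    Returns (True, first response line) on success,
    or (False, error text) if the connection or the verification failed.
    """
    try:
        # cafile=None means "use the default trust store of this Python build".
        context = ssl.create_default_context(cafile=cafile)
        with urllib.request.urlopen(
            dds_url(dataset_url), context=context, timeout=timeout
        ) as resp:
            return True, resp.readline().decode("utf-8", "replace").strip()
    except (OSError, ValueError) as exc:
        # URLError, ssl.SSLError and FileNotFoundError are all OSError subclasses.
        return False, str(exc)
```

If `check_tls(url)` fails with a certificate error but `check_tls(url, cafile="/path/to/cert.pem")` succeeds, the CA bundle is the likely culprit rather than the proxy itself.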

Netcdf versions

I have tried downgrading the netcdf libraries to the following versions

image

It works, so I ran strace in the two environments and noticed that netcdf-c 4.7.4 skips the SSL verification. This led me to check whether anything had changed in netcdf-c 4.8.0, and I found that the curl options for SSL verification had been updated.

Following this, I tried disabling the SSL checks by setting HTTP.SSL.VERIFYPEER and HTTP.SSL.VERIFYHOST to 0 in ~/.dodsrc, but to no avail.

Question time

Is there something specific to set up with netcdf-c when working behind a proxy? Setting HTTP.PROXY_SERVER does not seem to resolve my issue.

Am I doing something wrong when trying to set up the netcdf library SSL configuration?

@DennisHeimbigner
Collaborator

We do not run a proxy here at Unidata, so we have trouble debugging the proxy support code, and I can easily believe that it contains flaws.
As a first step, try setting HTTP.SSL.VERIFYPEER and HTTP.SSL.VERIFYHOST to -1 instead of 0.

@robin-cls
Author

Thanks for the hint, Dennis. I first tried setting HTTP.SSL.VERIFYPEER and HTTP.SSL.VERIFYHOST to -1 but got the same error. Your suggestion then made me think of setting HTTP.VERBOSE to get more information on the CURL error.

One of the errors I got was about proxy authentication. A clean setup with fewer virtual-environment activations made it work for me. I could also make it work in a badly initialized environment by using the HTTP.PROXY_SERVER configuration key.

With proxy authentication out of the way, I got a genuine bad-certificate error. The certificates I was using were provided by the certifi installation in conda, and they were probably not sufficient to trust my OPeNDAP site. Switching to the certificates provided by the computing platform with HTTP.SSL.CAINFO resolved the problem.
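To see which CA bundles are candidates in a given environment, a short sketch like the one below can help. It relies on `ssl.get_default_verify_paths()` from the standard library and on `certifi.where()` when certifi is installed; the helper name and dictionary keys are illustrative. Note the assumption: this reports what Python's OpenSSL would use, which is not necessarily what the libcurl linked into netcdf-c picks up.

```python
import os
import ssl


def ca_bundle_candidates():
    """Collect CA bundle locations visible to this Python environment."""
    paths = ssl.get_default_verify_paths()
    candidates = {
        "openssl default cafile": paths.cafile,  # None if the file does not exist
        "openssl default capath": paths.capath,  # None if the directory does not exist
        "SSL_CERT_FILE": os.environ.get("SSL_CERT_FILE"),
    }
    try:
        import certifi  # optional; often present in conda environments

        candidates["certifi"] = certifi.where()
    except ImportError:
        candidates["certifi"] = None
    return candidates


if __name__ == "__main__":
    for name, path in ca_bundle_candidates().items():
        print(f"{name}: {path}")
```

Any of the reported files could then be tried as the value of HTTP.SSL.CAINFO.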

Note that I could replicate the certificate issue in the 'working' environment by setting HTTP.SSL.VERIFYPEER and HTTP.SSL.VERIFYHOST to 1. So the difference between the two library versions, 4.7.4 and 4.8.0, really seems to be that older versions skip SSL certificate verification.

Now for the final part: I tried to apply this configuration (SSL.VERIFYPEER=1, SSL.VERIFYHOST=1, SSL.CAINFO=..., VERBOSE=2) in a production environment (libnetcdf 4.9.2), but it seems to be ignored. strace tells me that my ~/.dodsrc file is properly read, but I do not get any CURL log and the CAINFO is not properly set up. This probably explains my earlier confusion as to why the configuration seemed to have no effect.

Has the configuration handling changed in newer versions?

@DennisHeimbigner
Collaborator

Can you post the final .dodsrc that you used (with any sensitive info XXXX'd out)?

@robin-cls
Author

robin-cls commented Sep 21, 2023

Here is my configuration. I have commented out PROXY_SERVER because it is not needed in a properly set up environment.

HTTP.SSL.VERIFYPEER=1
HTTP.SSL.VERIFYHOST=1
HTTP.VERBOSE=2
#HTTP.PROXY_SERVER=http://xxxx:xxxxx@proxy.xxxx.fr:8888
HTTP.SSL.CAINFO=/path/to/cert.pem
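When testing several environments, it can be convenient to generate this file from a script. The following is a minimal sketch that renders the same KEY=VALUE lines as the configuration shown above; `render_dodsrc` and `write_dodsrc` are hypothetical helper names, and the CAINFO path is the same placeholder used in the thread.

```python
from pathlib import Path


def render_dodsrc(settings):
    """Render KEY=VALUE lines in the style of the .dodsrc shown above."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


def write_dodsrc(settings, path):
    """Write the rendered settings to the given path and return it."""
    path = Path(path)
    path.write_text(render_dodsrc(settings))
    return path


settings = {
    "HTTP.SSL.VERIFYPEER": 1,
    "HTTP.SSL.VERIFYHOST": 1,
    "HTTP.VERBOSE": 2,
    "HTTP.SSL.CAINFO": "/path/to/cert.pem",  # placeholder path, as in the thread
}

if __name__ == "__main__":
    # Print only; call write_dodsrc(settings, Path.home() / ".dodsrc") to apply it.
    print(render_dodsrc(settings), end="")
```

Writing to `~/.dodsrc` overwrites any existing file, so the sketch prints by default and leaves the write as an explicit step.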

Update: The configuration seems to be properly used in libnetcdf=4.8.1, but not in libnetcdf=4.9.1 and later. I couldn't install v4.9.0 because of compatibility problems in conda.
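Since the behaviour depends on the libnetcdf version, it is worth confirming which C library the Python module is actually linked against in each environment. A small sketch, using the version attributes that netCDF4-python exposes at module level; the helper name is illustrative, and it returns None when netCDF4 cannot be imported.

```python
def linked_library_versions():
    """Return the netCDF4-python, libnetcdf and HDF5 versions, or None if netCDF4 is missing."""
    try:
        import netCDF4
    except ImportError:
        return None
    return {
        "netCDF4-python": netCDF4.__version__,
        "libnetcdf": netCDF4.__netcdf4libversion__,
        "hdf5": netCDF4.__hdf5libversion__,
    }


if __name__ == "__main__":
    print(linked_library_versions())
```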

I see these kinds of errors in strace just after the configuration files are read:
image

I have also seen that libnetcdf 4.9.0 introduced new HDF5 compression capabilities:
Unidata/netcdf4-python#1164
Is it possible that the hdf5plugin (I suppose it is a plugin) not being found breaks the reading of the configuration?

If it is a Python wheel issue, we can move this issue to the netcdf4-python GitHub repository.

@DennisHeimbigner
Collaborator

I have been experimenting with this.
Try the following:

  1. put the attached file into the oc2 directory with the name occurlfunctions.c (replacing the one already there)
  2. modify your .dodsrc to set HTTP.SSL.VERIFYPEER=-1 and HTTP.SSL.VERIFYHOST=-1
  3. Rebuild and try your test again.
    occurlfunctions.c.txt

DennisHeimbigner added a commit to DennisHeimbigner/netcdf-c that referenced this issue Oct 8, 2023
re: Issue Unidata#2752

The authorization setup when using a proxy is apparently not
being used, or used incorrectly.

This PR ensures that the relevant curl options, specifically
CURLOPT_VERIFYHOST and CURLOPT_VERIFYPEER, are properly setup.
As part of this, the ability to turn off these options was fixed.
Note that no testing of this PR is currently possible because we
do not have access to a proxy.
@WardF closed this as completed Oct 27, 2023