Importing D1 dataset fails with I/O error #530

Open
craig-willis opened this issue Mar 2, 2022 · 1 comment

@craig-willis

Encountered while testing v1.1rc1 (whole-tale/wt-design-docs#166). This reproduces for me on both test and local.

Test steps:
From test case "Import from DataONE: READ-WRITE":

  1. Navigate to https://girder.local.wholetale.org/api/v1/integration/dataone?uri=https%3A%2F%2Fsearch.dataone.org%2Fview%2Fdoi%3A10.18739%2FA2VQ2S94D&title=Fire%20influences%20on%20forest%20recovery%20and%20associated%20climate%20feedbacks%20in%20Siberian%20Larch%20Forests%2C%20Russia&environment=RStudio
  2. Confirm that the Tale title matches the dataset
  3. Select READ/WRITE
  4. Click Create New Tale

Expected results:
Dataset is imported into workspace, tale is successfully created

Actual results:

Traceback (most recent call last):
  File "/girder/girder/events.py", line 164, in run
    event = trigger(eventName, info, _async=True, daemon=True)
  File "/girder/girder/events.py", line 314, in trigger
    handler['handler'](e)
  File "/girder/plugins/jobs/server/__init__.py", line 43, in scheduleLocal
    fn(job)
  File "/girder/plugins/wholetale/server/tasks/import_binder.py", line 196, in run
    copy_fs(source_fs, destination_fs)
  File "/girder/venv/lib/python3.9/site-packages/fs/copy.py", line 48, in copy_fs
    return copy_fs_if(
  File "/girder/venv/lib/python3.9/site-packages/fs/copy.py", line 108, in copy_fs_if
    return copy_dir_if(
  File "/girder/venv/lib/python3.9/site-packages/fs/copy.py", line 448, in copy_dir_if
    copier.copy(_src_fs, dir_path, _dst_fs, copy_path)
  File "/girder/venv/lib/python3.9/site-packages/fs/_bulk.py", line 142, in copy
    copy_file_internal(
  File "/girder/venv/lib/python3.9/site-packages/fs/copy.py", line 279, in copy_file_internal
    _copy_locked()
  File "/girder/venv/lib/python3.9/site-packages/fs/copy.py", line 269, in _copy_locked
    with src_fs.openbin(src_path) as read_file:
  File "/girder/plugins/wholetale/server/tasks/import_binder.py", line 363, in openbin
    return open(fdict["path"], "r+b")
  File "/girder/venv/lib/python3.9/site-packages/fs/error_tools.py", line 89, in __exit__
    reraise(fserror, fserror(self._path, exc=exc_value), traceback)
  File "/girder/venv/lib/python3.9/site-packages/six.py", line 718, in reraise
    raise value.with_traceback(tb)
  File "/girder/plugins/wholetale/server/tasks/import_binder.py", line 362, in openbin
    self._fs._ensure_region_available(path, fdict, fd, 0, fdict["obj"]["size"])
  File "/girder/venv/lib/python3.9/site-packages/girderfs/dms.py", line 226, in _ensure_region_available
    self._wait_for_file(fdict)
  File "/girder/venv/lib/python3.9/site-packages/girderfs/dms.py", line 263, in _wait_for_file
    raise OSError(EIO, os.strerror(EIO))
fs.errors.OperationFailed: operation failed, [Errno 5] Input/output error
@Xarthisius
Collaborator

Xarthisius commented Mar 7, 2022

Not our fault. DataONE CN claims that some of the files are checksummed using MD5, but in reality they were checksummed using SHA1. EIO is raised due to a mismatched hash.

How to reproduce (outside of WT):

#!/bin/bash

rm -rf tree_cores.csv
echo "CN claims that urn:uuid:1dad942b-e6ec-480c-82c3-9a3c87f67fa5 (tree_cores.csv) has"
curl -s "https://cn.dataone.org/cn/v2/query/solr/?q=identifier:%22urn%3Auuid%3A1dad942b-e6ec-480c-82c3-9a3c87f67fa5%22&fl=identifier,formatType,title,size,formatId,fileName,documents,checksum,checksumAlgorithm&rows=1000&start=0&wt=json" | jq . | grep '"checksum'

echo "Downloading urn:uuid:1dad942b-e6ec-480c-82c3-9a3c87f67fa5"
curl -s -LJO https://cn.dataone.org/cn/v2/resolve/urn:uuid:1dad942b-e6ec-480c-82c3-9a3c87f67fa5
echo "I'm checking md5 sum of tree_cores.csv"
md5sum tree_cores.csv
echo "<sad trombone/>"
echo "But..."
sha1sum tree_cores.csv
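
The same check can be sketched in Python, which may be handy when triaging more than one file. This is only an illustration of the comparison the script above does by hand, not the code girderfs uses: the function names (`normalize`, `matching_algorithms`, `describe`) and the candidate list are made up for this example, and the checksum and algorithm are assumed to be the values copied from the CN query output.

```python
from __future__ import annotations

import hashlib

# Candidate algorithms to try (hashlib names). This list is an assumption
# for the sketch, not an exhaustive list of what DataONE may report.
CANDIDATES = ("md5", "sha1", "sha256")


def normalize(name: str) -> str:
    """Map spellings such as 'SHA-1' or 'MD5' to hashlib names ('sha1', 'md5')."""
    return name.lower().replace("-", "")


def matching_algorithms(data: bytes, declared_checksum: str) -> list[str]:
    """Return every candidate algorithm whose hex digest equals the declared checksum."""
    wanted = declared_checksum.strip().lower()
    return [alg for alg in CANDIDATES if hashlib.new(alg, data).hexdigest() == wanted]


def describe(data: bytes, declared_checksum: str, declared_algorithm: str) -> str:
    """Say whether the declared algorithm really produced the declared checksum."""
    declared = normalize(declared_algorithm)
    matches = matching_algorithms(data, declared_checksum)
    if declared in matches:
        return "ok"
    if matches:
        return f"mismatch: declared {declared}, but checksum is a valid {matches[0]} digest"
    return "mismatch: checksum matches none of the candidate algorithms"
```

To use it on the downloaded file, read `tree_cores.csv` in binary mode and pass its bytes to `describe()` together with the `checksum` and `checksumAlgorithm` values printed by the query. A result naming a different algorithm than the declared one would correspond to the situation described above.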
