Update DataONE Tale #111

Closed
wants to merge 6 commits into from

@ThomasThelen ThomasThelen commented Sep 17, 2020

Background

This PR makes several changes:

  1. Cleared up the documentation in a few spots while I was working in the file
  2. I moved the resource map to the DataONEMetadata class
  3. I moved the client object to the DataONEPublishProvider class
  4. Removed some commented-out code (we can't roll back a failed submission)
  5. Played with the tests
  6. Updated the README

The changes are largely split into three commits (apologies): Update Tales, Add Tests and Fix Tests.

Update Tales has the logic for updating Tales that were already published and handles appending DataCite:IsDerivedFrom if the Tale was copied from a published Tale. A few updates to it were made in Add Tests. Add Tests was the first pass at expanding the unit tests. Inside you'll see that the TALE and MANIFEST variables are in each file. The Fix Tests commit largely moves these to __init__.py.

Misc

There are a few miscellaneous changes: marking some methods as static, rewording some of the exceptions, improving the docstrings, and adding type annotations for types from the dataone library for easier reading and IDE support.

The styling changes in 8c3bbf9 are fairly trivial. The riskiest one is probably renaming the zip variable, which shadowed Python's built-in zip function, to zip_file.
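
For reviewers wondering why the rename matters, here is a minimal sketch of the shadowing problem. The function names are hypothetical and this is not the code from the commit; it only illustrates what happens when a local variable is called zip.

```python
import io
import zipfile


def shadowed() -> str:
    """Bind a ZipFile to the name ``zip`` and then try to use the built-in."""
    # Rebinding the name hides the built-in zip() for the rest of this scope.
    zip = zipfile.ZipFile(io.BytesIO(), mode="w")
    try:
        list(zip(["a"], [1]))  # ZipFile objects are not callable
    except TypeError as exc:
        return type(exc).__name__
    finally:
        zip.close()
    return "no error"


def renamed() -> list:
    """Same logic with the archive bound to ``zip_file`` instead."""
    zip_file = zipfile.ZipFile(io.BytesIO(), mode="w")
    try:
        # The built-in zip() is still reachable here.
        return list(zip(["a", "b"], [1, 2]))
    finally:
        zip_file.close()
```

Calling shadowed() returns "TypeError", while renamed() pairs the two lists as expected.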

Updating

A Tale is 'updated' when it's published a second time. If a Tale is published to DataONE and a copy of it is then made and published, a datacite:IsDerivedFrom statement pointing to the first Tale's DOI is added to the resource map. The relevant code for this is here
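
As a rough illustration of the derived-from rule, the sketch below models resource map statements as plain (subject, predicate, object) tuples. The function name, the choice of the new package identifier as the subject, and the example DOIs are all assumptions made for illustration; the actual code builds the statement with the resource map tooling in the DataONEMetadata class.

```python
from typing import List, Optional, Tuple

# A statement is modelled as (subject, predicate, object) strings.
Triple = Tuple[str, str, str]


def derivation_triples(new_pid: str, source_doi: Optional[str]) -> List[Triple]:
    """Return the extra statement for a Tale copied from a published Tale.

    ``new_pid`` is a hypothetical subject (the package being published);
    ``source_doi`` is the DOI of the Tale it was copied from, or None.
    """
    if not source_doi:
        # Not a copy of a published Tale: nothing to add.
        return []
    return [(new_pid, "datacite:IsDerivedFrom", source_doi)]
```

For example, derivation_triples("doi:10.5072/FK2COPY", "doi:10.5072/FK2ORIG") yields a single statement, and passing None as the source DOI yields an empty list.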

When a Tale is published to DataONE a second time, the previous system metadata is retrieved. A new system metadata document is created from that copy, with values such as the md5 checksum and size replaced. This new system metadata document describes the object that will obsolete the one already in DataONE.
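
The copy-and-replace step can be sketched with a small stand-in class. The SystemMetadata dataclass and its field names below are hypothetical and much simpler than the dataone library's real system metadata type, and replacing the identifier is also an assumption; only the md5 and size replacement comes from the description above.

```python
import hashlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SystemMetadata:
    """Hypothetical stand-in for a DataONE system metadata document."""
    identifier: str
    checksum: str
    checksum_algorithm: str
    size: int
    format_id: str
    rights_holder: str


def sysmeta_for_new_version(previous: SystemMetadata, new_pid: str,
                            payload: bytes) -> SystemMetadata:
    """Copy the previous document, replacing the values tied to the bytes."""
    return replace(
        previous,
        identifier=new_pid,                        # assumed: new object, new pid
        checksum=hashlib.md5(payload).hexdigest(),
        checksum_algorithm="MD5",
        size=len(payload),
    )
```

Fields that do not depend on the object's bytes, such as the rights holder and format, carry over unchanged from the previous version.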

_obsolete_object just calls the DataONE Python library's update method, which takes care of the rest of the heavy lifting.
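
A minimal sketch of that delegation is below. It assumes an update call that takes the old pid, the new object's bytes, the new pid and the new system metadata, mirroring the argument order of DataONE's MNStorage.update. RecordingClient is a hypothetical test double, not part of the dataone library, and obsolete_object is not the PR's actual method.

```python
import io
from typing import Any, BinaryIO, List, Protocol, Tuple


class UpdateClient(Protocol):
    """Anything exposing an update call shaped like MNStorage.update."""

    def update(self, pid: str, obj: BinaryIO, new_pid: str, sysmeta: Any) -> Any:
        ...


def obsolete_object(client: UpdateClient, old_pid: str, new_pid: str,
                    payload: bytes, sysmeta: Any) -> Any:
    """Replace ``old_pid`` with a new object by delegating to the client."""
    return client.update(old_pid, io.BytesIO(payload), new_pid, sysmeta)


class RecordingClient:
    """Hypothetical stand-in that records calls instead of contacting a node."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str, bytes]] = []

    def update(self, pid: str, obj: BinaryIO, new_pid: str, sysmeta: Any) -> str:
        self.calls.append((pid, new_pid, obj.read()))
        return new_pid
```

Passing a recording double like this into the function is one way a unit test could check the pids and bytes handed to the library without a live member node.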

Tests

I restructured the unit tests to match the src directory. I took the JSON from test_publish.py and moved it into the __init__.py file so that it can be used in other tests. I also added structures for published Tales, etc. Then I pulled the DataONE logic out of test_publish.py and moved it into test_dataone_publish.py. I added test_dataone_metadata.py to test some of the methods in metadata.py, but left out tests for the EML.
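
One thing to watch with module-level fixtures shared through __init__.py is accidental mutation leaking between tests. The sketch below shows one way to guard against that. The TALE contents, the publishInfo and pid field names, and the published_tale helper are hypothetical placeholders, not the structures added in this PR.

```python
import copy

# Would live in the test package's __init__.py so every test module can import it.
TALE = {
    "_id": "hypothetical-tale-id",
    "title": "Example Tale",
    "publishInfo": [],
}


def published_tale(doi: str) -> dict:
    """Return a fresh copy of TALE that looks like it was already published."""
    tale = copy.deepcopy(TALE)  # deep copy so the nested list is not shared
    tale["publishInfo"].append({"pid": doi})
    return tale
```

Because each test works on its own deep copy, appending publish records in one test cannot change what another test sees in TALE.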

Testing

  1. Create a Tale
  2. Publish it to DataONE
  3. View the landing page for that Tale (don't close the tab!)
  4. Publish again
  5. Refresh the page from step 3; you should see a notification that an updated version is available

_

  1. Create a Tale
  2. Publish it to DataONE
  3. Make the Tale public
  4. Sign out and sign in under a new account
  5. Copy the Tale from step 1
  6. Publish it to DataONE
  7. Copy the ID of the resource map (the uuid in the picture below)
  8. Visit https://dev.nceas.ucsb.edu/knb/d1/mn/v2/object/<RESOURCE_MAP_ID>
  9. Search for "datacite"
  10. Check that it references the DOI of the original Tale (the one that was copied)
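
Steps 8 to 10 can also be scripted. The sketch below uses the object URL from step 8; the helper names are hypothetical, and it assumes the DOI appears on the same line as the "datacite" text in the serialized resource map, which may not hold for every serialization.

```python
import urllib.parse
import urllib.request

OBJECT_URL = "https://dev.nceas.ucsb.edu/knb/d1/mn/v2/object/"


def fetch_resource_map(resource_map_id: str) -> str:
    """Download the resource map document as text (requires network access)."""
    url = OBJECT_URL + urllib.parse.quote(resource_map_id, safe="")
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def datacite_lines(document: str) -> list:
    """Return the lines of the document that mention 'datacite'."""
    return [line.strip() for line in document.splitlines()
            if "datacite" in line.lower()]


def references_doi(document: str, doi: str) -> bool:
    """True if any 'datacite' line also contains the given DOI."""
    return any(doi in line for line in datacite_lines(document))
```

A check would then look like references_doi(fetch_resource_map(resource_map_id), original_doi), where both arguments are the values gathered in steps 2 and 7.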

Example:

I published a Tale here, copied it and republished here. In the new resource map found here, you can see that the datacite:IsDerivedFrom points to the DOI of the first package.

Find the resource map pid at the top of the file table on the dataset landing page, shown below.
[Screenshot: Screen Shot 2020-09-16 at 5 03 20 PM]


codecov bot commented Sep 17, 2020

Codecov Report

Merging #111 into master will increase coverage by 0.60%.
The diff coverage is 78.33%.


@@            Coverage Diff             @@
##           master     #111      +/-   ##
==========================================
+ Coverage   62.16%   62.76%   +0.60%     
==========================================
  Files           9        9              
  Lines        1163     1206      +43     
==========================================
+ Hits          723      757      +34     
- Misses        440      449       +9     
| Impacted Files | Coverage Δ |
|---|---|
| gwvolman/utils.py | 40.52% <0.00%> (ø) |
| gwvolman/lib/dataone/metadata.py | 84.50% <59.37%> (-4.16%) ⬇️ |
| gwvolman/lib/dataone/publish.py | 84.90% <85.88%> (+1.92%) ⬆️ |
| gwvolman/lib/publish_provider.py | 97.56% <100.00%> (+4.87%) ⬆️ |
| gwvolman/tasks.py | 24.44% <100.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d66d6e3...44eb624. Read the comment docs.

@ThomasThelen
Member Author

Closing this PR; it's been replaced by #114

@ThomasThelen ThomasThelen deleted the dataone_update_tale branch October 11, 2020 23:21