Add support for deposit to the DSpace REST API #135

Open
wants to merge 8 commits into
base: main

Conversation

markpatton
Contributor

@markpatton markpatton commented Nov 15, 2024

A new deposit option is added which uses the DSpace REST API.

Also:

  • Cleaned up DSpace configuration properties
  • Small fix to the duplicate JSONObject warnings when running tests.

A DSpace deposit takes three REST calls (create a workspace item, patch its metadata, create a workflow item) and tries to handle resumption after an error in either of the last two calls.
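To make the resumption idea concrete, here is a minimal sketch and not the code in this PR. It assumes a small piece of state survives between attempts, so a retry skips steps that already succeeded. The names `ResumableDepositSketch`, `DepositState`, and `DspaceCalls` are hypothetical.

```java
/** Hypothetical sketch of a three-step deposit that can resume after a failure. */
final class ResumableDepositSketch {

    /** State kept between attempts (assumed to be persisted by the caller). */
    static final class DepositState {
        String workspaceItemId;   // null until the workspace item has been created
        boolean metadataPatched;  // true once the metadata PATCH has succeeded
        boolean workflowCreated;  // true once the workflow item has been created
    }

    /** Stand-in for the three remote calls; any of them may throw on failure. */
    interface DspaceCalls {
        String createWorkspaceItem();
        void patchMetadata(String workspaceItemId);
        void createWorkflowItem(String workspaceItemId);
    }

    /** Runs only the steps that have not yet succeeded, in order. */
    static void deposit(DepositState state, DspaceCalls calls) {
        if (state.workspaceItemId == null) {
            state.workspaceItemId = calls.createWorkspaceItem();
        }
        if (!state.metadataPatched) {
            calls.patchMetadata(state.workspaceItemId);
            state.metadataPatched = true;
        }
        if (!state.workflowCreated) {
            calls.createWorkflowItem(state.workspaceItemId);
            state.workflowCreated = true;
        }
    }
}
```

If `patchMetadata` throws on the first attempt, a second call to `deposit` with the same `DepositState` reuses the existing workspace item instead of uploading the files again.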

Operational changes:

  • Can now use the repository key DSpace to select the new direct deposit
  • Additional DSPACE_ variables must be set. See the documentation PR below.
  • The metadata set is slightly different and needs to be reviewed on stage.

Required prs:

Related prs:

Test by building the pass-core, pass-ui, and pass-support PRs. Then, using the pass-docker PR, start up PASS with DSpace support. Finally, attempt some deposits and verify that they work. You should notice that publication date is now required.

Notes on interacting directly with the DSpace REST API:

Generate a CSRF token. Note that this endpoint returns 404.

CSRF_TOKEN=`curl -I http://localhost:9000/server/api/security/csrf -w '%header{DSPACE-XSRF-TOKEN}' -o /dev/null`

Login and get the auth token

AUTH_TOKEN=`curl -v -X POST http://localhost:9000/server/api/authn/# --data "user=xxx&password=yyy" -H "X-XSRF-TOKEN: $CSRF_TOKEN" -b "DSPACE-XSRF-COOKIE=$CSRF_TOKEN" -w '%header{Authorization}'`
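For reference, the login request above could be built with the JDK's `java.net.http` client roughly as follows. This is a sketch under assumptions: the class and method names are hypothetical, the login URL is supplied by the caller, and the header and cookie names are copied from the curl command.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper mirroring the curl login call; only builds the request, does not send it. */
final class DspaceLoginRequestSketch {

    static HttpRequest buildLoginRequest(URI loginUri, String csrfToken, String user, String password) {
        // Form-encode the credentials, as curl does with --data
        String form = "user=" + URLEncoder.encode(user, StandardCharsets.UTF_8)
                + "&password=" + URLEncoder.encode(password, StandardCharsets.UTF_8);

        return HttpRequest.newBuilder(loginUri)
                // The CSRF token is sent both as a header and as a cookie
                .header("X-XSRF-TOKEN", csrfToken)
                .header("Cookie", "DSPACE-XSRF-COOKIE=" + csrfToken)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
    }
}
```

After sending the request with `HttpClient.send`, the auth token would be read from the `Authorization` response header, which is what `-w '%header{Authorization}'` prints in the curl version.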

Create a workspace item

export COLLECTION=8d80c978-5cf3-46bb-a5ea-dd8425f1af2f

curl -v -X POST "http://localhost:9000/server/api/submission/workspaceitems?owningCollection=$COLLECTION" --form 'file=@/home/msp/Downloads/test.pdf' --form 'file=@/home/msp/Downloads/test2.pdf' -H "Authorization: $AUTH_TOKEN" -H "X-XSRF-TOKEN: $CSRF_TOKEN" -b "DSPACE-XSRF-COOKIE=$CSRF_TOKEN"
export WSI_ID=49

See workspace items

curl -v "http://localhost:9000/server/api/submission/workspaceitems/" -H "Authorization: $AUTH_TOKEN"

Set required metadata

curl -v -X PATCH "http://localhost:9000/server/api/submission/workspaceitems/$WSI_ID" -H 'Content-Type: application/json' --data-binary "@/home/msp/work/pass/pass-support/md.patch" -H "Authorization: $AUTH_TOKEN" -H "X-XSRF-TOKEN: $CSRF_TOKEN" -b "DSPACE-XSRF-COOKIE=$CSRF_TOKEN"

Create a workflow item for workspace item

curl -v -X POST http://localhost:9000/server/api/workflow/workflowitems -H "Content-Type:text/uri-list" --data "http://localhost:9000/server/api/submission/workspaceitems/$WSI_ID" -H "Authorization: $AUTH_TOKEN" -H "X-XSRF-TOKEN: $CSRF_TOKEN" -b "DSPACE-XSRF-COOKIE=$CSRF_TOKEN"

Example of patch

[
    {
        "op": "add",
        "path": "/sections/teste/dc.title",
        "value": [
            {
                "authority": null,
                "confidence": -1,
                "display": "AROLDO TEST WITH REST",
                "language": null,
                "otherInformation": null,
                "place": 0,
                "value": "AROLDO TEST WITH REST"
            }
        ]
    },
    {
        "op": "add",
        "path": "/sections/license/granted",
        "value": "true"
    }
]
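
A patch like the one above could be generated in code. The following is a minimal sketch using only the JDK; the class `MetadataPatchSketch` and its methods are hypothetical, the section name is passed in because it depends on the submission form configuration, and the hand-rolled escaping is only enough for illustration. A real implementation would more likely use a JSON library.

```java
/** Hypothetical builder for a two-operation JSON Patch: set dc.title and grant the license. */
final class MetadataPatchSketch {

    /** Escapes backslashes and double quotes; sufficient for this sketch, not a full JSON encoder. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    static String titleAndLicensePatch(String section, String title) {
        String t = escape(title);
        return "["
            // Operation 1: add the title value to the given submission section
            + "{\"op\":\"add\",\"path\":\"/sections/" + section + "/dc.title\","
            + "\"value\":[{\"value\":\"" + t + "\",\"display\":\"" + t + "\",\"language\":null,"
            + "\"authority\":null,\"confidence\":-1,\"place\":0,\"otherInformation\":null}]},"
            // Operation 2: mark the license as granted
            + "{\"op\":\"add\",\"path\":\"/sections/license/granted\",\"value\":\"true\"}"
            + "]";
    }
}
```

The resulting string is what would be sent as the body of the PATCH request to the workspace item.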






@markpatton markpatton marked this pull request as draft November 15, 2024 15:17
@markpatton markpatton force-pushed the 1063-dspace-rest-deposit branch 2 times, most recently from fec504f to d4a366b Compare November 22, 2024 18:32
@markpatton markpatton force-pushed the 1063-dspace-rest-deposit branch from 6ce178a to d750107 Compare December 16, 2024 14:56
@markpatton markpatton force-pushed the 1063-dspace-rest-deposit branch from 5bf461e to c84a802 Compare January 2, 2025 18:16
@markpatton markpatton marked this pull request as ready for review January 3, 2025 15:35
@markpatton markpatton force-pushed the 1063-dspace-rest-deposit branch from 94a9f14 to d34fdf5 Compare February 4, 2025 14:47
@markpatton markpatton requested a review from rpoet-jh February 5, 2025 16:14
Contributor

@rpoet-jh rpoet-jh left a comment


Nice job, Mark! Just a few comments, but in general it looks good. I still need to test and review the other PRs, but this is a first pass at the review for this main PR.

throw new RuntimeException("SourceMD with id '" + id + "' not found.");
}

private SourceMD createSourceMd() throws METSException {
Contributor

Do the method deletions in this class affect the sword deposit for dspace? I know we are going to deprecate the sword deposit eventually, but should these changes be done at that time?

Contributor Author

These were unused methods. It wasn't really meaningful to delete them. But I did it during another cleanup.

deposit.setDepositStatusRef(packageStream.metadata().packageDepositStatusRef());

// Only set deposit status ref if not already set during transport
if (deposit.getDepositStatusRef() == null) {
Contributor

What about the retry case? Is it possible deposit.getDepositStatusRef() has a value, then after retry it needs to be updated?


public boolean verifyConnectivity() {
URI uri = URI.create(dspaceApiUrl);
return repositoryConnectivityService.verifyConnect(uri.getHost(), uri.getPort());
Contributor

Probably want to use repositoryConnectivityService.verifyConnectByURL(dspaceApiUrl).

Contributor Author

I'm not sure if that url would work. I think it would give a 404.

Contributor Author

I sort of see this as making sure the host is running and can be connected to on a port. The problem is that dspaceApiUrl is not a valid service call. It's the base for other service calls. This is what I was writing and then I did some testing. It turns out that, although completely undocumented, it is in fact a service endpoint and returns links to the other service endpoints. There is actually a health endpoint that could be used, but just the api service endpoint seems fine.

Contributor

the api service endpoint seems fine -> yes, better to use this to cover cases where response code is > 500, like 502 Bad Gateway when the server is unreachable but there is a proxy in front of it.
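
As a rough illustration of the difference from a host/port check, a URL-based check might look like the sketch below. It is not the actual `verifyConnectByURL` implementation, which is not shown here; the class name, the timeouts, and the rule that any status below 500 counts as reachable are all assumptions.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical URL-based connectivity check that treats 5xx responses as failures. */
final class ConnectivityCheckSketch {

    /** Assumption: any status below 500 means the service itself answered. */
    static boolean statusIndicatesReachable(int status) {
        return status >= 100 && status < 500;
    }

    static boolean verifyConnectByUrl(String url) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        try {
            // The body is irrelevant here, so discard it
            HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
            return statusIndicatesReachable(response.statusCode());
        } catch (IOException e) {
            // Connection refused, timeout, DNS failure, and similar
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this rule a proxy answering 502 Bad Gateway is reported as unreachable, whereas a plain socket connect to the proxy's host and port would succeed.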

Contributor Author

Done.

.header(AUTHORIZATION, authContext.authToken())
.header(X_XSRF_TOKEN, authContext.xsrfToken())
.header(COOKIE, DSPACE_XSRF_COOKIE + authContext.xsrfToken())
.body(body)
Contributor

Do you know what the Spring RestClient does with a post body like this? Will it stream the contents of the Resources to the server? I ask out of concern about memory use for large files and the risk of OOM exceptions.

Contributor Author

I believe it will stream, but I will poke around.

Contributor Author

The body is composed of resources and the default is to stream, not buffer. I tested with some larger files and I think it is ok.


AuthContext authContext = dspaceDepositService.authenticate();

// Create WorkspaceItem if it does not already exist
Contributor

Style comment: this method is a bit long. Its logic could be split into perhaps three methods, such as createWorkspace, updateMetadata, and createWorkflow, which are then called from this one.

try {
DepositSubmission depositSubmission = packageStream.getDepositSubmission();

LOG.info("Processing Dspace Deposit for Submission: {}", depositSubmission.getId());
Contributor

Should change to LOG.warn so it shows up in default deployed logging level.

Contributor Author

This seems a bit odd to be at level warn instead of info. As in this isn't a warning. This is normal operation.

String itemUuid = workspaceItemContext.read("$._embedded.workspaceitems[0]._embedded.item.uuid");
String accessUrl = dspaceDepositService.createAccessUrlFromItemUuid(itemUuid);

LOG.info("Completed DSpace Deposit for Submission: {}, accessUrl: {}",
Contributor

Should change to LOG.warn so it shows up in default deployed logging level.

…ake the connection verification for DSpace more robust.
Successfully merging this pull request may close these issues.

Add support for deposit using the DSpace REST API
2 participants