Implement endpoint for queuing visits #48 #49

Merged 3 commits on Jan 15, 2025
4 changes: 4 additions & 0 deletions src/main/config/run.properties.example
@@ -95,3 +95,7 @@ defaultFacilityName=LILS
# but queued requests will only be started when there are less than this many RESTORING downloads.
# Negative values will start all queued jobs immediately, regardless of load.
queue.maxActiveDownloads = 10
# Limit the number of files per queued Download part. Multiple Datasets will be combined into part
# Downloads based on their fileCount up to this limit. If a single Dataset has a fileCount
# greater than this limit, it will still be submitted in a part by itself.
queue.maxFileCount = 10000
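
The batching rule described in the comments above can be illustrated with a minimal sketch. It works on plain file counts rather than ICAT Datasets, and `QueuePartitionSketch` and `partition` are hypothetical names used only for illustration; they are not part of Topcat.

```java
import java.util.ArrayList;
import java.util.List;

class QueuePartitionSketch {

    /**
     * Hypothetical helper: greedily groups Dataset file counts into parts.
     * A part is closed when adding the next Dataset would exceed maxFileCount,
     * unless the part is still empty, so an oversized Dataset gets a part to itself.
     */
    static List<List<Long>> partition(List<Long> fileCounts, long maxFileCount) {
        List<List<Long>> parts = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentCount = 0L;

        for (long fileCount : fileCounts) {
            if (currentCount > 0L && currentCount + fileCount > maxFileCount) {
                // Close the current part and start a new one
                parts.add(current);
                current = new ArrayList<>();
                currentCount = 0L;
            }
            current.add(fileCount);
            currentCount += fileCount;
        }
        // The final part is always added, even if it is the only one
        parts.add(current);
        return parts;
    }

    public static void main(String[] args) {
        List<Long> fileCounts = List.of(4000L, 5000L, 3000L, 12000L, 100L);
        // prints [[4000, 5000], [3000], [12000], [100]]
        System.out.println(partition(fileCounts, 10000L));
    }
}
```

With a limit of 10000, the first two Datasets share a part, and the 12000-file Dataset is submitted alone even though it exceeds the limit.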
59 changes: 59 additions & 0 deletions src/main/java/org/icatproject/topcat/IcatClient.java
@@ -116,6 +116,65 @@ public String getFullName() throws TopcatException {
}
}

/**
* Get all Datasets whose parent Investigation has the specified visitId.
*
* @param visitId ICAT Investigation.visitId
* @return JsonArray of Datasets, where each entry is a JsonArray of
* [dataset.id, dataset.fileCount].
* @throws TopcatException
*/
public JsonArray getDatasets(String visitId) throws TopcatException {
try {
String query = "SELECT dataset.id, dataset.fileCount from Dataset dataset";
query += " WHERE dataset.investigation.visitId = '" + visitId + "' ORDER BY dataset.id";
String encodedQuery = URLEncoder.encode(query, "UTF8");

String url = "entityManager?sessionId=" + URLEncoder.encode(sessionId, "UTF8") + "&query=" + encodedQuery;
Response response = httpClient.get(url, new HashMap<String, String>());
if (response.getCode() == 404) {
throw new NotFoundException("Could not run getDatasets got a 404 response");
} else if (response.getCode() >= 400) {
throw new BadRequestException(Utils.parseJsonObject(response.toString()).getString("message"));
}
return Utils.parseJsonArray(response.toString());
} catch (TopcatException e) {
throw e;
} catch (Exception e) {
throw new BadRequestException(e.getMessage());
}
}

/**
* Utility method to get the fileCount (not size) of a Dataset by COUNT of its
* child Datafiles. Ideally the fileCount field should be used; this is a
* fallback option if that field is not set.
*
* @param datasetId ICAT Dataset.id
* @return The number of Datafiles in the specified Dataset
* @throws TopcatException
*/
public long getDatasetFileCount(long datasetId) throws TopcatException {
try {
String query = "SELECT COUNT(datafile) FROM Datafile datafile WHERE datafile.dataset.id = " + datasetId;
String encodedQuery = URLEncoder.encode(query, "UTF8");

String url = "entityManager?sessionId=" + URLEncoder.encode(sessionId, "UTF8") + "&query=" + encodedQuery;
Response response = httpClient.get(url, new HashMap<String, String>());
if (response.getCode() == 404) {
throw new NotFoundException("Could not run getDatasetFileCount got a 404 response");
} else if (response.getCode() >= 400) {
throw new BadRequestException(Utils.parseJsonObject(response.toString()).getString("message"));
}
JsonArray jsonArray = Utils.parseJsonArray(response.toString());
return jsonArray.getJsonNumber(0).longValueExact();
} catch (TopcatException e) {
throw e;
} catch (Exception e) {
throw new BadRequestException(e.getMessage());
}
}

/**
* Gets a single Entity of the specified type, without any other conditions.
*
246 changes: 207 additions & 39 deletions src/main/java/org/icatproject/topcat/web/rest/UserResource.java
@@ -63,6 +63,7 @@ public class UserResource {

private String anonUserName;
private String defaultPlugin;
private long queueMaxFileCount;

@PersistenceContext(unitName = "topcat")
EntityManager em;
@@ -71,6 +72,7 @@ public UserResource() {
Properties properties = Properties.getInstance();
this.anonUserName = properties.getProperty("anonUserName", "");
this.defaultPlugin = properties.getProperty("defaultPlugin", "simple");
this.queueMaxFileCount = Long.valueOf(properties.getProperty("queue.maxFileCount", "10000"));
}

/**
@@ -89,7 +91,6 @@ private String getCartUserName(String userName, String sessionId) {
* Login to create a session
*
* @param facilityName A facility name - properties must map this to a url to a valid ICAT REST api, if set.
* Can be null iff one Facility set in the config for the API.
* @param username ICAT username
* @param password Password for the specified authentication plugin
* @param plugin ICAT authentication plugin. If null, a default value will be used.
@@ -705,9 +706,7 @@ public Response submitCart(@PathParam("facilityName") String facilityName,
throw new BadRequestException("fileName is required");
}

if (transport == null || transport.trim().isEmpty()) {
throw new BadRequestException("transport is required");
}
validateTransport(transport);

String icatUrl = getIcatUrl( facilityName );
IcatClient icatClient = new IcatClient(icatUrl, sessionId);
@@ -725,50 +724,21 @@ public Response submitCart(@PathParam("facilityName") String facilityName,
if(email != null && email.equals("")){
email = null;
}


if (cart != null) {
em.refresh(cart);

Download download = new Download();
download.setSessionId(sessionId);
download.setFacilityName(cart.getFacilityName());
download.setFileName(fileName);
download.setUserName(cart.getUserName());
download.setFullName(fullName);
download.setTransport(transport);
download.setEmail(email);
download.setIsEmailSent(false);
download.setSize(0);

Download download = createDownload(sessionId, cart.getFacilityName(), fileName, cart.getUserName(),
fullName, transport, email);
List<DownloadItem> downloadItems = new ArrayList<DownloadItem>();

for (CartItem cartItem : cart.getCartItems()) {
DownloadItem downloadItem = new DownloadItem();
downloadItem.setEntityId(cartItem.getEntityId());
downloadItem.setEntityType(cartItem.getEntityType());
downloadItem.setDownload(download);
DownloadItem downloadItem = createDownloadItem(download, cartItem.getEntityId(),
cartItem.getEntityType());
downloadItems.add(downloadItem);
}

download.setDownloadItems(downloadItems);

Boolean isTwoLevel = idsClient.isTwoLevel();
download.setIsTwoLevel(isTwoLevel);

if(isTwoLevel){
download.setStatus(DownloadStatus.PREPARING);
} else {
String preparedId = idsClient.prepareData(download.getSessionId(), download.getInvestigationIds(), download.getDatasetIds(), download.getDatafileIds());
download.setPreparedId(preparedId);
download.setStatus(DownloadStatus.COMPLETE);
}

downloadId = submitDownload(idsClient, download, DownloadStatus.PREPARING);
try {
em.persist(download);
em.flush();
em.refresh(download);
downloadId = download.getId();
em.remove(cart);
em.flush();
} catch (Exception e) {
@@ -780,7 +750,205 @@ public Response submitCart(@PathParam("facilityName") String facilityName,
return emptyCart(facilityName, cartUserName, downloadId);
}

/**
* Create a new Download object and set basic fields, excluding data and status.
*
* @param sessionId ICAT sessionId
* @param facilityName ICAT Facility.name
* @param fileName Filename for the resultant Download
* @param userName ICAT User.name
* @param fullName ICAT User.fullName
* @param transport Transport mechanism to use
* @param email Optional email to send notification to on completion
* @return Download object with basic fields set
*/
private static Download createDownload(String sessionId, String facilityName, String fileName, String userName,
String fullName, String transport, String email) {
Download download = new Download();
download.setSessionId(sessionId);
download.setFacilityName(facilityName);
download.setFileName(fileName);
download.setUserName(userName);
download.setFullName(fullName);
download.setTransport(transport);
download.setEmail(email);
download.setIsEmailSent(false);
download.setSize(0);
return download;
}

/**
* Create a new DownloadItem.
*
* @param download Parent Download
* @param entityId ICAT Entity.id
* @param entityType EntityType
* @return DownloadItem with fields set
*/
private static DownloadItem createDownloadItem(Download download, long entityId, EntityType entityType) {
DownloadItem downloadItem = new DownloadItem();
downloadItem.setEntityId(entityId);
downloadItem.setEntityType(entityType);
downloadItem.setDownload(download);
return downloadItem;
}

/**
* Set the final fields and persist a new Download request.
*
* @param idsClient Client for the IDS to use for the Download
* @param download Download to submit
* @param downloadStatus Initial DownloadStatus to set if and only if the IDS isTwoLevel
* @return Id of the new Download
* @throws TopcatException
*/
private long submitDownload(IdsClient idsClient, Download download, DownloadStatus downloadStatus)
throws TopcatException {
Boolean isTwoLevel = idsClient.isTwoLevel();
download.setIsTwoLevel(isTwoLevel);

if (isTwoLevel) {
download.setStatus(downloadStatus);
} else {
String preparedId = idsClient.prepareData(download.getSessionId(), download.getInvestigationIds(),
download.getDatasetIds(), download.getDatafileIds());
download.setPreparedId(preparedId);
download.setStatus(DownloadStatus.COMPLETE);
}

try {
em.persist(download);
em.flush();
em.refresh(download);
return download.getId();
} catch (Exception e) {
logger.info("submitDownload: exception during EntityManager operations: " + e.getMessage());
throw new BadRequestException("Unable to submit cart for download");
}
}

/**
* Queue an entire visit for download, split by Dataset into part Downloads if
* needed.
*
* @param facilityName ICAT Facility.name
* @param sessionId ICAT sessionId
* @param transport Transport mechanism to use
* @param email Optional email to notify upon completion
* @param visitId ICAT Investigation.visitId to submit
* @return Array of Download ids
* @throws TopcatException
*/
@POST
@Path("/queue/visit")
public Response queueVisitId(@FormParam("facilityName") String facilityName,
@FormParam("sessionId") String sessionId, @FormParam("transport") String transport,
@FormParam("email") String email, @FormParam("visitId") String visitId) throws TopcatException {

logger.info("queueVisitId called");
validateTransport(transport);

String icatUrl = getIcatUrl(facilityName);
IcatClient icatClient = new IcatClient(icatUrl, sessionId);
String transportUrl = getDownloadUrl(facilityName, transport);
IdsClient idsClient = new IdsClient(transportUrl);

// If we wanted to block the user, this is where we would do it
String userName = icatClient.getUserName();
String fullName = icatClient.getFullName();
JsonArray datasets = icatClient.getDatasets(visitId);

long downloadId;
JsonArrayBuilder jsonArrayBuilder = Json.createArrayBuilder();

long downloadFileCount = 0L;
List<DownloadItem> downloadItems = new ArrayList<DownloadItem>();
List<Download> downloads = new ArrayList<Download>();
// String filename = formatQueuedFilename(facilityName, visitId, part);
Download newDownload = createDownload(sessionId, facilityName, "", userName, fullName, transport, email);

for (JsonValue dataset : datasets) {
JsonArray datasetArray = dataset.asJsonArray();
long datasetId = datasetArray.getJsonNumber(0).longValueExact();
long datasetFileCount = datasetArray.getJsonNumber(1).longValueExact();
if (datasetFileCount < 1L) {
// Database triggers should set this, but check explicitly anyway
datasetFileCount = icatClient.getDatasetFileCount(datasetId);
}

if (downloadFileCount > 0L && downloadFileCount + datasetFileCount > queueMaxFileCount) {
newDownload.setDownloadItems(downloadItems);
downloads.add(newDownload);
// downloadId = submitDownload(idsClient, download, DownloadStatus.PAUSED);
// jsonArrayBuilder.add(downloadId);

// part += 1L;
downloadFileCount = 0L;
downloadItems = new ArrayList<DownloadItem>();
// filename = formatQueuedFilename(facilityName, visitId, part);
newDownload = createDownload(sessionId, facilityName, "", userName, fullName, transport, email);
}

DownloadItem downloadItem = createDownloadItem(newDownload, datasetId, EntityType.dataset);
downloadItems.add(downloadItem);
downloadFileCount += datasetFileCount;
}
newDownload.setDownloadItems(downloadItems);
downloads.add(newDownload);
// downloadId = submitDownload(idsClient, download, DownloadStatus.PAUSED);
// jsonArrayBuilder.add(downloadId);
int part = 1;
for (Download download : downloads) {
String filename = formatQueuedFilename(facilityName, visitId, part, downloads.size());
download.setFileName(filename);
downloadId = submitDownload(idsClient, download, DownloadStatus.PAUSED);
jsonArrayBuilder.add(downloadId);
part += 1;
}

return Response.ok(jsonArrayBuilder.build()).build();
}

/**
* Format the filename for a queued Download, possibly one part of many.
*
* @param facilityName ICAT Facility.name
* @param visitId ICAT Investigation.visitId
* @param part 1 indexed part of the overall request
* @param size Number of parts in the overall request
* @return Formatted filename
*/
private static String formatQueuedFilename(String facilityName, String visitId, int part, int size) {
String partString = String.valueOf(part);
String sizeString = String.valueOf(size);
StringBuilder partBuilder = new StringBuilder();
while (partBuilder.length() + partString.length() < sizeString.length()) {
partBuilder.append("0");
}
partBuilder.append(partString);


Rather than having a zero-padded part number, I would favour just having the word "part" and the unpadded number.
This would hopefully give the user a clue that this is only one part of their restore request.
If we could say "part n of N" then that would be even better, although I realise that might not be easy, as we are currently creating the parts as we go along and so don't yet know how many there will be.

Author

My motivation for zero padding is to maintain alphabetical sorting in case there are 10 or more parts (I don't know how likely this is). I.e. part_10_of_10 would appear before part_2_of_10 if sorted by filename, whereas part_02_of_10 would appear in the right order.

As for submitting as we go, I think it's slightly more efficient (we don't have to loop twice, or store more than one Download in memory at a time), but I think I can rewrite it to iterate over all of the Datasets once to build the parts, then over the parts to set the filename and submit.
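
The sorting argument can be checked with a short sketch. `PartNameSortSketch` and its helpers are hypothetical names for illustration only, and the padding uses `String.format` rather than the `StringBuilder` loop in this PR; the facility and visit prefix is omitted because it does not affect the ordering.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PartNameSortSketch {

    /** Hypothetical helper: zero-pads part to the number of digits in total. */
    static String paddedName(int part, int total) {
        int width = String.valueOf(total).length();
        return String.format("part_%0" + width + "d_of_%d", part, total);
    }

    /** Hypothetical helper: the same name without any padding. */
    static String unpaddedName(int part, int total) {
        return "part_" + part + "_of_" + total;
    }

    /** Builds all names for a request of the given size and sorts them as plain strings. */
    static List<String> sortedNames(int total, boolean padded) {
        List<String> names = new ArrayList<>();
        for (int part = 1; part <= total; part++) {
            names.add(padded ? paddedName(part, total) : unpaddedName(part, total));
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // prints [part_10_of_10, part_1_of_10, part_2_of_10]
        System.out.println(sortedNames(10, false).subList(0, 3));
        // prints [part_01_of_10, part_02_of_10, part_03_of_10]
        System.out.println(sortedNames(10, true).subList(0, 3));
    }
}
```

Without padding, part 10 sorts first because the character `0` compares lower than `_`; with padding, string order matches numeric order.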


StringBuilder filenameBuilder = new StringBuilder();
filenameBuilder.append(facilityName);
filenameBuilder.append("_");
filenameBuilder.append(visitId);
filenameBuilder.append("_part_");
filenameBuilder.append(partBuilder);
filenameBuilder.append("_of_");
filenameBuilder.append(sizeString);
return filenameBuilder.toString();
}

/**
* Validate that the submitted transport mechanism is not null or empty.
*
* @param transport Transport mechanism to use
* @throws BadRequestException if null or empty
*/
private static void validateTransport(String transport) throws BadRequestException {
if (transport == null || transport.trim().isEmpty()) {
throw new BadRequestException("transport is required");
}
}

/**
* Retrieves the total file size (in bytes) for any investigation, datasets or datafiles.