🐛 Ease how we determine preprocessed location #304

Closed
27 changes: 22 additions & 5 deletions app/services/iiif_print/derivative_rodeo_service.rb
@@ -141,30 +141,47 @@ def self.get_ancestor(filename: nil, file_set:)
# @param file_set [FileSet]
# @param filename [String]
# @return [String] the dirname (without any "/" we hope)
# rubocop:disable Metrics/AbcSize
# rubocop:disable Metrics/MethodLength
def self.derivative_rodeo_preprocessed_directory_for(file_set:, filename:)
# SpaceStone does not know about lineage; it makes assumptions based on the URL of the work.
# If we have an import_url, let's follow the same assumption that SpaceStone would make.
#
# NOTE: We're assuming that a page ripped from a PDF will not have an import_url. This may
# not be the case.
return file_set.import_url.split("/")[-2] if file_set&.import_url&.split("/")[-2]&.presence

ancestor, ancestor_type = get_ancestor(filename: filename, file_set: file_set)

# Why might we not have an ancestor? In the case of grandparent_for, we may not yet have run
# the create relationships job. We could sneak a peek in the table to maybe glean some insight.
# However, read on to the `else` clause to see the novel approach.
#
# Why might the ancestor not respond to (nor have a value for) the configured
# parent_work_identifier_property_name? Because data is sloppy. And we're trying to "guess"
# how this data was written in SpaceStone; a non-trivial task.
#
# TODO: Perhaps we could use the original remote_url to sniff out the SpaceStone
# directory?
#
# rubocop:disable Style/GuardClause
if ancestor
if ancestor && ancestor.try(parent_work_identifier_property_name).presence
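# NOTE: Inside a class method `self` is the class itself, so interpolate `self` rather than
# `self.class` (which would print "Class"). The second fragment needs a leading space.
message = "#{self}.#{__method__} #{file_set.class} ID=#{file_set.id} and filename: #{filename.inspect}" \
" has #{ancestor_type} of #{ancestor.class} ID=#{ancestor.id}"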
Rails.logger.info(message)
ancestor.public_send(parent_work_identifier_property_name) ||
raise("Expected #{ancestor.class} ID=#{ancestor.id} (#{ancestor_type} of #{file_set.class} ID=#{file_set.id}) " \
"to have a present #{parent_work_identifier_property_name.inspect}")
ancestor.public_send(parent_work_identifier_property_name)
else
# HACK: This makes critical assumptions about how we're creating the title for the file_set,
# but we don't have much to fall back on. Consider making this a configurable function. Or
# perhaps this entire method should be more configurable.
# TODO: Revisit this implementation.
file_set.title.first.split(".").first ||
Array.wrap(file_set.title).first.split(".").first ||
raise("#{file_set.class} ID=#{file_set.id} has title #{file_set.title.first} from which we cannot infer information.")
end
# rubocop:enable Style/GuardClause
end
# rubocop:enable Metrics/MethodLength
# rubocop:enable Metrics/AbcSize

def initialize(file_set)
@file_set = file_set
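The lookup order this change introduces (import_url segment first, then the ancestor's identifier, then the file set's title) can be sketched in plain Ruby without Rails. This is only an illustration under stated assumptions: SketchFileSet, SketchWork, sketch_preprocessed_directory and the :identifier property are hypothetical stand-ins, not part of iiif_print, and the sketch substitutes nil/empty checks for ActiveSupport's `presence`, `try` and `Array.wrap`.

```ruby
# Hypothetical stand-ins for FileSet and its ancestor work; not the real models.
SketchFileSet = Struct.new(:import_url, :title, keyword_init: true)
SketchWork = Struct.new(:identifier, keyword_init: true)

# Returns the first usable candidate for the preprocessed directory name, or raises.
# The `property` default (:identifier) is an assumption made for this sketch.
def sketch_preprocessed_directory(file_set:, ancestor: nil, property: :identifier)
  # 1. Second-to-last URL segment, mirroring the URL-based assumption described above.
  from_url = file_set.import_url.to_s.split("/")[-2]
  return from_url unless from_url.nil? || from_url.empty?

  # 2. The ancestor's identifier, when an ancestor exists and has a non-empty value.
  if ancestor.respond_to?(property)
    value = ancestor.public_send(property)
    return value unless value.nil? || value.empty?
  end

  # 3. The first title, up to the first "."; tolerates a nil, String, or Array title.
  from_title = Array(file_set.title).first.to_s.split(".").first
  return from_title unless from_title.nil? || from_title.empty?

  raise ArgumentError, "cannot infer a directory from #{file_set.inspect}"
end

with_url = SketchFileSet.new(import_url: "https://example.org/files/work-123/page-1.tiff",
                             title: ["ignored.tiff"])
sketch_preprocessed_directory(file_set: with_url) # => "work-123"

no_url = SketchFileSet.new(import_url: nil, title: ["work-456.pages-2.tiff"])
sketch_preprocessed_directory(file_set: no_url, ancestor: SketchWork.new(identifier: "work-789")) # => "work-789"
sketch_preprocessed_directory(file_set: no_url) # => "work-456"
```

Unlike the diff, the title fallback here converts a nil title to an empty string before splitting, so a missing title reaches the explicit raise instead of failing with NoMethodError on nil.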