Default media type of documents #70
From @jonathanrobie on September 9, 2017 20:26 I think that the media type should be determined by standard HTTP content negotiation. A server should give the most precise media type that a client is willing to accept for a given document, based on its knowledge of the document. If the most precise type of a document is |
From @PonteIneptique on September 9, 2017 21:4
I agree.
I don't "agree" :). This is where I think we need to enforce something. One of the failures of CTS was accepting as many schemes as there are implementations, without enforcing any kind of correct format. This leads to clients not knowing how to parse the content and not really being able to foresee what they will get. We have the chance to have a format for textual editions: TEI. Right now, there is not a single standard API (as far as I know) that relies on it. We discussed this a while ago and, while I think we should allow implementers to provide other MIME types, I think XML/TEI should always be accessible. One of the reasons for this is that, unlike images (I look kindly at you, IIIF) and structured metadata, textual editing is broad, broad enough to be complex already in TEI. When you query most standard LOD APIs, you can expect to get RDF in different serializations: JSON-LD, RDF/XML, Turtle, etc. We do not have the same standard for textual editions. HTML, raw text, CSV, non-grammar-based XML, etc. are way too different to be treated the same way. For these reasons, having TEI, a scholarly accepted standard, as a standard and obligatory output is required for the API to really have an impact. Let's be clear in terms of implementation duties: the server must be able to reply with valid content to |
From @jonathanrobie on September 9, 2017 21:14 Suppose I want to serve syntax trees in a format that is semantically different from TEI's standard. I cannot really convert that to a TEI format, and doing so would require effort for the data producer. If both the producer and the consumer want that format for these documents, do they need to use a different protocol? Or suppose the only document available is HTML5. Should a server be required to convert it to TEI? |
From @PonteIneptique on September 10, 2017 5:41 Any format transformation in the case of text serving is a loss of As such, and to be clear, I am not against tree serving, but historically, I do not believe it will be that complicated to transform part of the |
From @jonathanrobie on September 14, 2017 13:6 DTS is a replacement for CTS, which is not restricted to TEI. I don't know what level of design decision has been made on this. I was not aware of a firm decision, but there may have been one. We'll discuss in the meeting. |
From @PonteIneptique on September 14, 2017 13:13
I agree. And CTS has also been a failing API because it did not provide a clear content type restriction (or base content type), which led to very few APIs that can actually be parsed.
There was no hard decision written in stone, but I definitely remember a meeting where this was discussed. We clearly leaned towards at least TEI as the basis, potentially with a view to proposing this whole work to TEI-C, and we even assigned poor @hcayless to chair the passage endpoint discussion (at the time when we had multiple chairs). |
From @PonteIneptique on September 15, 2017 6:54 To recap and build on some of the discussion in #9, drawing on two comments from @hcayless (1, 2): we could, at least, make the following rules:
FragmentA
Other content types
There is a secondary question, which could be: do we want to limit content types, i.e. should we forbid replies in binary formats such as PDF or images (which are not "minable")? I have no specific point of view on the question, but it felt like one worth asking. |
My instinctual vote would be for a default TEI response when making a passage/document response. I would vote for default JSON-LD when we're talking about metadata about the passage/document. I generally like the rules proposed immediately above, though I wonder if allowing fragments creates more confusion for a consuming client. It would be easier on a client if it could simply expect a well-formed TEI response in all cases. If we want to respond with a small fragment of a text, can we demand that the service wrap this fragment in a TEI wrapper that includes a TEI header and text/body? |
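To make the wrapping question concrete, here is a minimal Python sketch of what "wrap the fragment in a TEI header plus text/body" could look like. It is not the wrapper the group ends up specifying: the function name `wrap_fragment`, the placeholder header text, and the assumption that the fragment is a single element already in the TEI namespace are all hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements without a prefix (xmlns="..." on the root).
ET.register_namespace("", TEI_NS)


def tei(name):
    """Return the namespace-qualified tag for a TEI element name."""
    return f"{{{TEI_NS}}}{name}"


def wrap_fragment(fragment_xml, title="Untitled fragment"):
    """Wrap one TEI-namespaced fragment element in TEI/teiHeader + text/body.

    Hypothetical helper: assumes `fragment_xml` is a single well-formed
    element; the header content below is placeholder text only.
    """
    root = ET.Element(tei("TEI"))

    # Minimal header: fileDesc with titleStmt, publicationStmt, sourceDesc.
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    publication = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(publication, tei("p")).text = "Generated wrapper (placeholder)"
    source = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source, tei("p")).text = "Fragment of a larger document (placeholder)"

    # The fragment itself goes under text/body.
    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    body.append(ET.fromstring(fragment_xml))

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    line = '<l xmlns="http://www.tei-c.org/ns/1.0" n="1">Arma virumque cano</l>'
    print(wrap_fragment(line, title="Example passage"))
```

With a wrapper like this a client can always parse from a `TEI` root, at the cost of a header that carries little real information for small fragments.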
I think this is close, but I would either (1) require TEI for document data, or (2) add "if available" to the first point:
I would require any XML to be well-formed:
I think we should specify how fragments are wrapped. |
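As an illustration of the well-formedness requirement, a server or test suite could check a response body before sending it. This is only a sketch using Python's standard library parser; the function name `is_well_formed` is hypothetical, and the check covers well-formedness only, not validity against any TEI schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if `xml_text` parses as a single well-formed XML document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # Unclosed tags, mismatched tags, multiple root elements, etc.
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<ab><w>arma</w></ab>"))   # properly nested
    print(is_well_formed("<ab><w>arma</ab>"))       # <w> is never closed
    print(is_well_formed("<w>arma</w><w>virum</w>"))  # two root elements
```

The last case is the one that matters for fragments: a passage made of several sibling elements is not a well-formed document on its own, which is why some wrapper element is needed.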
I think we can either:
Neither one requires TEI to do anything new, as I understand the discussion. |
Clearer Proposal
After the last meeting, my* proposal is actually:
Fragment
As for the fragment, there are multiple choices from the discussion:
I definitely lean towards the new tag (specifically, thinking about all this, an element which would be at expath and name My personal ranking would be 1. new tag in TEI, 2. ab, 3. DTS. |
Note, I think we could have a good solution for the ab fragment, to avoid a root node:
<TEI>
<text><body><ab /></body></text>
</TEI>
The only thing I don't like with |
We closed this at the last meeting - see decisions marked with the string ** Decision ** in this decision tree: https://github.com/distributed-text-services/collection-api/wiki/Decision-Tree:-Issue-70 ** Decision **: Support for TEI is mandatory, we can open it up later if there is a reason to. We will make up our own wrapper and propose it to TEI, but will not wait for them. |
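A rough Python sketch of how the content negotiation discussed in this thread could combine with a mandatory TEI default. Everything here is an assumption for illustration: the function names, the choice to answer `*/*` or a missing Accept header with TEI, and the choice to fall back to TEI rather than return a 406 when nothing matches are not part of the recorded decision.

```python
DEFAULT_TYPE = "application/tei+xml"  # mandatory format per the decision above


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_media_type(accept_header, available):
    """Pick a response type; `available` lists the extra types the server can produce.

    Hypothetical policy: TEI is always offered, and it is the answer for a
    missing header, for */*, and when nothing else matches.
    """
    offered = list(available)
    if DEFAULT_TYPE not in offered:
        offered.append(DEFAULT_TYPE)
    if not accept_header:
        return DEFAULT_TYPE
    for pattern, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if pattern == "*/*":
            return DEFAULT_TYPE
        for media_type in offered:
            if pattern == media_type:
                return media_type
            # Handle ranges such as text/* by comparing the "text/" prefix.
            if pattern.endswith("/*") and media_type.startswith(pattern[:-1]):
                return media_type
    return DEFAULT_TYPE


if __name__ == "__main__":
    print(choose_media_type("text/html;q=0.9, application/tei+xml;q=0.5", ["text/html"]))
    print(choose_media_type("application/pdf", ["text/html"]))
```

A stricter server might prefer to answer 406 Not Acceptable instead of silently falling back; the sketch picks the fallback only because it keeps TEI reachable for every client.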
From @jonathanrobie on September 9, 2017 20:23
Thibault proposes that the default media type of a document should be application/tei+xml, leaving other media types to the implementation.
Copied from original issue: distributed-text-services/distributed-text-services.github.io#4