Use Case: Data annotation
Implementation: #3
Not all information about a dataset is known to the data producer. Users can contribute much valuable information after a dataset is published, including:
- Statements about data quality and fitness for purpose (e.g. "Errors in SST are found to be higher near coastlines")
- Links to publications and other resources (web pages, blogs) about the data
- Information about "significant events" (e.g. hurricanes, volcanic eruptions, changes to processing procedures) that might have an impact on the data
In the MELODIES demo portal, it would be useful to enable the user to display (and perhaps create) such annotations.
A typical use case would involve:
- User logs on to MELODIES demo portal
- User selects a dataset
- In addition to dataset metadata, the user is given access to the annotations supplied by other users. These annotations may be grouped or sorted by type (e.g. free-text comment, publication)
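The last step above, presenting annotations grouped by type, can be sketched as follows. This is only an illustration: the `Annotation` class, its fields, the `kind` labels and the sample records are all assumptions made for the example, not part of the CHARMe data model or the MELODIES portal.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified view of one user annotation."""
    author: str
    kind: str  # assumed labels such as "comment" or "publication"
    body: str


def group_by_kind(annotations):
    """Return {kind: [annotations]} with the kinds in alphabetical order."""
    grouped = defaultdict(list)
    for annotation in annotations:
        grouped[annotation.kind].append(annotation)
    return dict(sorted(grouped.items(), key=lambda item: item[0]))


# Made-up sample records, for illustration only.
sample = [
    Annotation("alice", "publication", "Paper evaluating the SST product"),
    Annotation("bob", "comment", "Errors in SST are found to be higher near coastlines"),
    Annotation("carol", "comment", "Processing procedure changed recently"),
]

for kind, items in group_by_kind(sample).items():
    print(kind, [a.author for a in items])
```

Within each group the annotations keep the order in which they were supplied; a real portal would probably sort them further, for example by date.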
Other use cases might be:
- Use of the MELODIES portal to create annotations
- Discovering and creating annotations at the sub-dataset level (i.e. "fine-grained" annotations)
Much of the technology to enable this was created in the CHARMe project, which used the W3C Open Annotation vocabulary to represent annotations. A server to hold these annotations is already running at STFC, and we can connect to it.
- Users need to create accounts on CHARMe before they can generate annotations
- The ability to use geospatially-linked annotations is not fully functional in the current CHARMe system. We may have to enhance the CHARMe server through the use of a geospatial triplestore like Strabon.
- The NASA EONET (Earth Observatory Natural Event Tracker) system could be useful here as a database of natural events. Can we use this?
Linked Data is central to this use case. Datasets need unique, publicly resolvable identifiers (like DOIs) in order to enable users to make statements about them. An RDF model provides a natural fit to represent the wide variety of possible annotations, although this variety can make querying the database of annotations more complex.
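To make the RDF fit concrete, here is a minimal sketch of one free-text comment expressed as triples with terms from the W3C Open Annotation vocabulary (`oa:Annotation`, `oa:hasBody`, `oa:hasTarget`, `oa:motivatedBy`) and the W3C Content in RDF vocabulary (`cnt:ContentAsText`, `cnt:chars`). It uses plain Python tuples rather than an RDF library, and the annotation IRI, body IRI and dataset DOI are placeholders invented for the example. How CHARMe actually structures its annotations may differ.

```python
from dataclasses import dataclass

OA = "http://www.w3.org/ns/oa#"
CNT = "http://www.w3.org/2011/content#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class Literal:
    """Marks a plain string value, as opposed to an IRI."""
    value: str


def comment_annotation(anno_iri, body_iri, target_iri, text):
    """Build the triples for one free-text comment about a target."""
    return [
        (anno_iri, RDF_TYPE, OA + "Annotation"),
        (anno_iri, OA + "motivatedBy", OA + "commenting"),
        (anno_iri, OA + "hasTarget", target_iri),
        (anno_iri, OA + "hasBody", body_iri),
        (body_iri, RDF_TYPE, CNT + "ContentAsText"),
        (body_iri, CNT + "chars", Literal(text)),
    ]


def annotations_about(triples, target_iri):
    """Return the IRIs of all annotations whose target is target_iri."""
    return {
        subject
        for subject, predicate, obj in triples
        if predicate == OA + "hasTarget" and obj == target_iri
    }


# Placeholder identifiers (assumptions, not real resources).
DATASET = "https://doi.org/10.0000/example-dataset"
ANNO = "http://example.org/annotation/1"
BODY = "http://example.org/annotation/1/body"

triples = comment_annotation(
    ANNO, BODY, DATASET,
    "Errors in SST are found to be higher near coastlines",
)
print(annotations_about(triples, DATASET))
```

The lookup only works because the dataset has a single stable identifier to match against, which is why resolvable identifiers such as DOIs matter here. It also hints at the querying cost: a publication link or an event annotation would have a differently shaped body, so a query that displays bodies has to handle each shape.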