Initial modeling discussion

Carlos Rueda edited this page Jul 28, 2014 · 1 revision

Toward an RDF representation of the UDUNITS-2 unit definitions

For a preliminary review of the XML structure see Samples-from-the-XML-files.

Modeling

Example

Note, the discussion below makes use of the following XML unit definition:

        <unit>
            <def>'/60</def>
            <name><singular>arc_second</singular></name>
            <symbol>"</symbol>
            <symbol>&#x2033;</symbol>           <!-- DOUBLE PRIME -->
            <aliases>
                <name><singular>angular_second</singular></name>
                <name><singular>arcsecond</singular></name>
                <name><singular>arcsec</singular></name>
            </aliases>
        </unit>

Initial idea

  • Define class Unit
  • Define properties for "hasDefinition", "hasAlternate", "hasSymbol", "hasCardinality":
    • hasDefinition: string
    • hasAlternate: Unit
    • hasSymbol: string
    • hasCardinality: "singular" | "plural"

In the following, italicized name refers to the semantic name concept, while non-italicized name refers to the actual string instances in the vocabulary.

  • Each unit name from the XML definitions will be captured in a corresponding instance of the class Unit.
  • For this purpose, the extracted names are all explicit names (singular and plural) and all aliases (singular and plural) from the unit definition.
  • Once all names associated with a particular <unit> definition are collected, the corresponding Unit instances are related to each other using the "hasAlternate" property.
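
The extraction step described above can be sketched in Python as follows. This is only an illustration, not the actual conversion code: the helper names extract_names and alternate_pairs are hypothetical, and the <plural> element is assumed to sit next to <singular> inside <name>, which the arc_second sample does not show.

```python
import xml.etree.ElementTree as ET
from itertools import permutations

# The arc_second definition from the example (XML comment omitted).
UNIT_XML = """
<unit>
    <def>'/60</def>
    <name><singular>arc_second</singular></name>
    <symbol>"</symbol>
    <symbol>&#x2033;</symbol>
    <aliases>
        <name><singular>angular_second</singular></name>
        <name><singular>arcsecond</singular></name>
        <name><singular>arcsec</singular></name>
    </aliases>
</unit>
"""


def extract_names(unit):
    """Return (name, cardinality) pairs for explicit names and aliases."""
    names = []
    # ".//name" matches the direct <name> and those nested in <aliases>.
    for name_el in unit.iterfind(".//name"):
        for form in name_el:
            # Assumption: plural forms appear as <plural> children.
            if form.tag in ("singular", "plural") and form.text:
                names.append((form.text.strip(), form.tag))
    return names


def alternate_pairs(names):
    """Relate every name to every other name of the same <unit>."""
    labels = [label for label, _ in names]
    return list(permutations(labels, 2))


unit = ET.fromstring(UNIT_XML)
names = extract_names(unit)
pairs = alternate_pairs(names)  # one (subject, object) pair per hasAlternate link
```

With four names, this yields twelve directed "hasAlternate" links, which is the fully materialized form.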

I think of the relationship between unit and definition as 1-1, which is why I don't like making 1 name -> 1 unit. I don't think 'ampere' is a unit and 'amp' is another unit, I think they are two names for the same unit. Whereas your hasAlternate wants to specify a relation between Units, I think it really is for connecting different names.

yes, good point.

Example

The arc_second unit definition would be converted to the following RDF Unit instances:

@prefix :        <http://mmisw.org/ont/mmitest/udunits2-accepted/> .
@prefix prop:    <http://mmisw.org/ont/mmitest/udunits2-prop/> .

:arc_second
      a       :Unit ;
      prop:hasCardinality  "singular";
      prop:hasAlternate  :arcsec , :angular_second , :arcsecond ;
      prop:hasDefinition    "'/60" ;
      prop:hasSymbol "\"" , "″" .

:arcsec
      a       :Unit ;
      prop:hasCardinality  "singular";
      prop:hasAlternate  :arc_second , :angular_second , :arcsecond ;
      prop:hasDefinition    "'/60" ;
      prop:hasSymbol "\"" , "″" .

:angular_second
      a       :Unit ;
      prop:hasCardinality  "singular";
      prop:hasAlternate  :arc_second , :arcsec, :arcsecond ;
      prop:hasDefinition    "'/60" ;
      prop:hasSymbol "\"" , "″" .

:arcsecond
      a       :Unit ;
      prop:hasCardinality  "singular";
      prop:hasAlternate  :arc_second , :arcsec, :angular_second ;
      prop:hasDefinition    "'/60" ;
      prop:hasSymbol "\"" , "″" .

Note that although the example above explicitly reflects all associated properties for each Unit instance, not all of these would need to be actually materialized internally as some of this information could be inferred with appropriate modeling. More concretely:

  • One "master" Unit instance is designated to have all characterization explicitly associated including links to all its alternate names via the prop:hasAlternate property, which is symmetric.
  • Each of the other instances will basically only indicate prop:hasCardinality.
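
A minimal sketch of that inference, assuming triples are held as plain Python tuples (the symmetric_closure helper is hypothetical, and a real deployment would more likely rely on a reasoner):

```python
HAS_ALTERNATE = "prop:hasAlternate"

# Only the links asserted on the "master" instance.
asserted = {
    (":arc_second", HAS_ALTERNATE, ":arcsec"),
    (":arc_second", HAS_ALTERNATE, ":angular_second"),
    (":arc_second", HAS_ALTERNATE, ":arcsecond"),
}


def symmetric_closure(triples, prop):
    """Add (o, prop, s) for every asserted (s, prop, o)."""
    inferred = set(triples)
    for s, p, o in triples:
        if p == prop:
            inferred.add((o, p, s))
    return inferred


closed = symmetric_closure(asserted, HAS_ALTERNATE)
```

Note that symmetry alone recovers only the links from each alternate back to the master. The links between the alternates themselves (for example :arcsec to :arcsecond) would additionally require the property to be transitive, which is an assumption not stated above.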

Example Alternate -- proposed for first release (graybeal)

In this case, we treat this as a vocabulary, in which the strings are the first-class objects and everything is related to them. It makes a distinction between the primary unit string and the aliases. But otherwise, this is not a normalized model; information is repeated everywhere. (So if arc_second ever gets a new symbol, a lot of terms will change.) For what it's worth, an ideal UDUNITS exploration tool would be able to present alphabetized lists not just of the Unit and Alias terms, but also of the definition and symbol strings.

The example will generate the following RDF unit instances:

@prefix :        <http://mmisw.org/ont/mmitest/udunits2-accepted/> .
@prefix prop:    <http://mmisw.org/ont/mmitest/udunits2-prop/> .

:arc_second
      a       :Unit ;
      prop:hasCardinality  "singular";
      prop:hasSingularAlias  :arcsec , :angular_second , :arcsecond ;
      prop:hasDefinition    "'/60" ;
      prop:hasComment       "DOUBLE PRIME" ;
      prop:hasSymbol        "\"", "″" .

:arcsec
      a       :Alias ;
      prop:hasCardinality  "singular";
      prop:hasUnit          :arc_second  ;
      prop:hasDefinition    "'/60" ;
      prop:hasSymbol        "\"", "″" .

:angular_second
      a       :Alias ;
      prop:hasCardinality  "singular";
      prop:hasUnit          :arc_second  ;
      prop:hasDefinition    "'/60" ;
      prop:hasSymbol        "\"", "″" .

:arcsecond
      a       :Alias ;
      prop:hasCardinality  "singular";
      prop:hasUnit          :arc_second  ;
      prop:hasDefinition    "'/60" ;
      prop:hasSymbol        "\"", "″" .

Example Alternate Properly Modeled (graybeal)

In this case, we treat the unit itself as the primary entity. It is actually the definition that is the unique 'key' for each unit (as in the original document), but the fact that some units have URL-unfriendly definitions means we do the right thing and use an opaque code for each unit. A tool that works with this ontology will have to be able to recognize and present the name strings associated with the units; is the label the right way to do this systematically?
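
How the opaque code is generated is not specified here. One hypothetical option, sketched below, is to derive it from the definition string with a truncated hash, so that the same definition always maps to the same code. The function name opaque_code, the choice of SHA-1, and the six-character length are all assumptions for illustration; nothing suggests this scheme produces the codes used in the examples on this page.

```python
import hashlib


def opaque_code(definition: str, length: int = 6) -> str:
    """Derive a short, URL-friendly code from a unit definition.

    Assumption: a truncated SHA-1 hex digest is acceptable as an identifier.
    Truncation makes collisions possible, so a real converter would need to
    detect and resolve them.
    """
    digest = hashlib.sha1(definition.encode("utf-8")).hexdigest()
    return digest[:length]


code = opaque_code("'/60")  # the arc_second definition
```

A sequential counter or a UUID would serve equally well if stability across regenerations of the ontology is not required.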

Now the example will generate the following RDF unit instance:

@prefix :        <http://mmisw.org/ont/mmitest/udunits2-accepted/> .
@prefix prop:    <http://mmisw.org/ont/mmitest/udunits2-prop/> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .

:2a1369 
      a       :Unit ;
      prop:hasDefinition     "'/60" ;
      rdfs:label             "arc_second" ;     # I am not sure I've used the right property
      prop:hasSingularName   "arc_second" ;
      prop:hasSingularAlias  "arcsec", "angular_second", "arcsecond" ;
      prop:hasSymbol         "\"", "″" ;
      prop:hasSymbolComment  "DOUBLE PRIME"  .

This would require special handling to display as a vocabulary. One could choose to augment this model with the concepts for Names and Aliases, something like the following. (Using the strings as subjects is appropriate here, because the entire concept of the alias is built around the string itself; change the string and you have a different alias.)

:arcsec
      a        :Alias  ;
      prop:referencesUnit    :2a1369;
      rdfs:label             "arcsec" ;
      prop:hasCardinality    "singular" .

If you wanted to build a complete model you might create entries for all the symbols, like the following. But I think there is not an important use case for doing so.

:39f2c1
      a       :Symbol ;
      prop:referencesUnit    :2a1369;
      rdfs:label             "″";
      rdfs:comment           "DOUBLE PRIME"  .

==================

Unit and UnitName

Basically the idea is:

  • class Unit: an instance of this class captures one main unit entry in the XML. It may have a (primary) name and zero or more aliases.

  • class UnitName: instances of this class capture names and aliases associated with instances of a class Unit.

Applied on the example:

@prefix :        <http://mmisw.org/ont/mmitest/udunits2-accepted/> .
@prefix prop:    <http://mmisw.org/ont/mmitest/udunits2-prop/> .

:2a1369 
      a                      :Unit ;
      prop:hasDefinition     "'/60" ;
      prop:hasName           "arc_second" ;
      prop:hasAlias          :arcsec, :angular_second, :arcsecond ;  # NOTE: URIs, not strings
      prop:hasSymbol         "\"", "″" .

:arc_second
      a                      :UnitName ;
      prop:forUnit           :2a1369 ;
      prop:hasCardinality    "singular" .

:arcsec
      a                      :UnitName ;
      prop:forUnit           :2a1369 ;
      prop:hasCardinality    "singular" .

:angular_second
      a                      :UnitName ;
      prop:forUnit           :2a1369 ;
      prop:hasCardinality    "singular" .

:arcsecond
      a                      :UnitName ;
      prop:forUnit           :2a1369 ;
      prop:hasCardinality    "singular" .

Remarks:

  • First off, the idea is not completely new, of course! It overlaps with previous ideas, in particular "proposed for first release"; hopefully we would just need to consolidate them for the very first version of the ontologies!
  • As in previous proposals, it separates the concept of Unit from any associated specific names/aliases.
  • Those names/aliases will have URIs per se (so they could be self-resolvable, which is the case with our ORR ;)
  • Instead of introducing an Alias class, it uses a single class UnitName for both unit names and unit aliases. It is the concrete Unit instance that indicates which UnitName is primary via the prop:hasName property (which may be absent if there is no such primary name).
  • prop:hasSymbol could also be used as a property of UnitName in cases where the symbol is for the specific name.
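
To make the Unit/UnitName split concrete, here is a rough Python sketch that holds one unit in memory and serializes it to Turtle in the shape shown above. The dataclasses and the helpers literal, describe and to_turtle are hypothetical names introduced for illustration; prefix declarations are left out of the output.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UnitName:
    label: str
    cardinality: str  # "singular" or "plural"


@dataclass
class Unit:
    code: str                          # opaque identifier, e.g. "2a1369"
    definition: str
    name: Optional[UnitName] = None    # primary name; may be absent
    aliases: List[UnitName] = field(default_factory=list)
    symbols: List[str] = field(default_factory=list)


def literal(text):
    """Quote a string as a Turtle literal, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def describe(subject, pairs):
    """Render one subject with its predicate/object pairs."""
    body = " ;\n".join(f"      {p} {o}" for p, o in pairs)
    return f"{subject}\n{body} .\n"


def to_turtle(unit):
    pairs = [("a", ":Unit"), ("prop:hasDefinition", literal(unit.definition))]
    if unit.name is not None:
        pairs.append(("prop:hasName", literal(unit.name.label)))
    if unit.aliases:
        pairs.append(("prop:hasAlias", ", ".join(f":{a.label}" for a in unit.aliases)))
    if unit.symbols:
        pairs.append(("prop:hasSymbol", ", ".join(literal(s) for s in unit.symbols)))
    blocks = [describe(f":{unit.code}", pairs)]
    # The primary name and every alias each become a UnitName instance.
    names = ([unit.name] if unit.name is not None else []) + unit.aliases
    for n in names:
        blocks.append(describe(f":{n.label}", [
            ("a", ":UnitName"),
            ("prop:forUnit", f":{unit.code}"),
            ("prop:hasCardinality", literal(n.cardinality)),
        ]))
    return "\n".join(blocks)


arc = Unit(
    code="2a1369",
    definition="'/60",
    name=UnitName("arc_second", "singular"),
    aliases=[UnitName(n, "singular") for n in ("arcsec", "angular_second", "arcsecond")],
    symbols=['"', "\u2033"],
)
turtle = to_turtle(arc)
```

Keeping the primary name as an optional field on Unit mirrors the remark that prop:hasName may be absent, while the UnitName instances themselves carry no marker of being primary.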