Chris Mungall edited this page Feb 3, 2015 · 1 revision

Performing queries in OWL

Introduction

We want to be able to run complex queries over ontologies. Sometimes these need to include inferred axioms, and sometimes non-logical constructs (that is, OWL annotations). What are our options?

DL Queries

This should be familiar to Protege users from the "DL Query" tab. DL queries are possible via the OWLAPI reasoner interface: use the OWL API to build up a class expression, then ask the reasoner for its ancestors or descendants.

Problems:

  • DL queries are limited to the logical structure of the ontology. Elements such as labels, synonyms and OBO subsets (i.e. annotations) cannot be incorporated without contorting the ontology somehow
  • DL queries do not allow for closed-world negation (which SPARQL offers via filters)
  • Not all reasoners support DL queries
    • In particular, our favorite reasoner, ELK, only allows you to query using named classes (also true of JCEL?)
  • Reasoning can be slow, particularly if you have a lot of individuals

SPARQL

SPARQL does not suffer from these limitations, but has its own problems:

  • not supported in the OWL API
  • querying over entailments not always supported
  • SPARQL is very awkward and low-level for use with OWL class expressions

SPARQL-DL

SPARQL-DL is in theory the best of both worlds. It has a nice OWL syntax (yet another one...) for expressing queries. You can use closed-world negation and combine logical relationships with annotations.

However:

  • Not a W3C standard
  • Future not clear. E.g.
    • will updates be supported?
    • how does SPARQL-DL track SPARQL, if at all?
  • OWLAPI support is not good
    • OWLTools comes with the Dresden sparql-dl jar, but this does not support all of SPARQL-DL

Non-standard solutions

OBO-Edit

OBO-Edit provides a visual query editor for building complex queries that combine lexical elements with logical relationships, closed-world negation and entailments.

  • Doesn't have a standard syntax - must use GUI (check this..?)
  • Non-standard
  • Limited to OBO-Edit reasoner (EL-ish expressivity, slow)

SQL

One possibility is to load all inferences into a relational database and use SQL. We are aware of two schemas that support this:

  • OBD
  • GOLD (Gene Ontology schema)
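
As a rough illustration of the relational approach, the sketch below uses Python's built-in sqlite3 module with a made-up two-table schema. It is not the OBD or GOLD schema. The rows in `inferred_link` stand in for reasoner output loaded in advance, and the `EX_` identifiers and all label text are hypothetical. It shows how SQL can combine a lexical filter with closed-world negation (`NOT EXISTS`) over pre-computed inferences.

```python
import sqlite3

NEURON = "FBbt_00005106"
MUSHROOM_BODY = "FBbt_00005801"
PART_OF = "BFO_0000050"


def build_db():
    """Create an in-memory database with a hypothetical schema and toy data."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE label (cls TEXT PRIMARY KEY, text TEXT);
        CREATE TABLE inferred_link (subj TEXT, rel TEXT, obj TEXT);
    """)
    db.executemany("INSERT INTO label VALUES (?, ?)", [
        (NEURON, "neuron"),
        (MUSHROOM_BODY, "mushroom body"),
        ("EX_1", "example mushroom body neuron"),  # hypothetical class
        ("EX_2", "example motor neuron"),          # hypothetical class
    ])
    # Pretend these links were inferred by a reasoner and loaded beforehand.
    db.executemany("INSERT INTO inferred_link VALUES (?, ?, ?)", [
        ("EX_1", "subClassOf", NEURON),
        ("EX_1", PART_OF, MUSHROOM_BODY),
        ("EX_2", "subClassOf", NEURON),
    ])
    return db


def neurons_not_in_mushroom_body(db, label_pattern="%"):
    """Neurons with no stored part-of link to the mushroom body (closed world)."""
    rows = db.execute("""
        SELECT l.cls, l.text
        FROM inferred_link AS isa
        JOIN label AS l ON l.cls = isa.subj
        WHERE isa.rel = 'subClassOf' AND isa.obj = ?
          AND l.text LIKE ?
          AND NOT EXISTS (
              SELECT 1 FROM inferred_link AS p
              WHERE p.subj = isa.subj AND p.rel = ? AND p.obj = ?)
        ORDER BY l.cls
    """, (NEURON, label_pattern, PART_OF, MUSHROOM_BODY))
    return rows.fetchall()
```

Calling `neurons_not_in_mushroom_body(build_db())` returns only the hypothetical motor neuron row. The answer is only as complete as the inferences that were loaded.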

Prolog

See examples for the Prolog OWL shell

Query support in OWLTools

In the current absence of "one query language to rule them all", OWLTools takes a pragmatic approach and allows you to mix and match as the situation calls for. It provides convenience methods for common processing operations that help with querying.

SPARQL-DL support on OWLTools command line

Command line example:

owltools fly_anatomy_XP.obo --reasoner hermit --sparql-dl "SELECT * WHERE {SubClassOf(?x, <http://purl.obolibrary.org/obo/FBbt_00005106>)}"

Unfortunately, the SPARQL-DL library used is very limited. In the future this should hopefully allow more powerful queries.

DL Queries on OWLTools command line

Say we want to query for neurons in the mushroom body. This can be expressed as a DL query:

owltools fly_anatomy_XP.obo --reasoner hermit --reasoner-query "FBbt_00005106 and (BFO_0000050 some FBbt_00005801)"

(Support for labels on the command line is planned for the future.)

Note that ELK is much faster than HermiT (we heart ELK). Why don't we use it instead? One current limitation of ELK is that the reasoner implementation only allows queries on named classes. A common(?) workaround is to name the query. OWLTools will do this for you with the -m option (to materialize the query expression as a class):

owltools fly_anatomy_XP.obo  --reasoner-query -r elk -m "FBbt_00005106 and (BFO_0000050 some FBbt_00005801)"
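
To make the "name the query" idea concrete, here is a minimal sketch that builds the kind of axiom involved, written in OWL functional syntax. It assumes the workaround amounts to declaring a fresh class equivalent to the query expression and then asking the reasoner about that named class. The exact axiom OWLTools generates is not shown in this page. The query IRI `http://example.org/query#Q1` and the helper names are hypothetical.

```python
OBO = "http://purl.obolibrary.org/obo/"


def iri(short_id):
    """Expand an OBO-style short id into a bracketed full IRI."""
    return f"<{OBO}{short_id}>"


def name_query(query_iri, genus, relation, filler):
    """Return an EquivalentClasses axiom naming 'genus and (relation some filler)'."""
    expression = (
        f"ObjectIntersectionOf({iri(genus)} "
        f"ObjectSomeValuesFrom({iri(relation)} {iri(filler)}))"
    )
    return f"EquivalentClasses(<{query_iri}> {expression})"


# Hypothetical query class for "neuron and part_of some mushroom body".
axiom = name_query(
    "http://example.org/query#Q1",
    "FBbt_00005106",
    "BFO_0000050",
    "FBbt_00005801",
)
print(axiom)
```

Once such an axiom is added to the ontology, a reasoner restricted to named classes can classify it, and the subclasses of the query class are the answers.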

Closed world interpretation of queries

Danger Will Robinson! - this trick involves a pretty major deviation from OWL semantics....

What if we want to throw in closed world negation? E.g. neurons that are not inferred to be part of the mushroom body?

owltools fly_anatomy_XP.obo --query-cw "FBbt_00005106 and not (BFO_0000050 some FBbt_00005801)"

(Of course, the DL expression actually means neurons that are not part of the MB, which is different from neurons that are not provably part of the MB. Still, this is useful in the absence of a standard query language.)

Note that no external reasoner is used here (it is assumed that classification is done in advance). The OWLTools graph walking algorithm is used here. Property chains etc are taken into account. See the OWLTools graph package for more details.
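
The difference between "not part of" and "not provably part of" can be sketched with a toy graph walk. This is not the OWLTools graph algorithm, only an illustration of the closed-world reading under simplifying assumptions: the graph is already classified, part-of is treated as transitive and as propagating over subclass links, and the `EX_` classes are hypothetical.

```python
SUBCLASS_OF = "subClassOf"
PART_OF = "BFO_0000050"
NEURON = "FBbt_00005106"
MUSHROOM_BODY = "FBbt_00005801"

# Hypothetical, pre-classified toy graph: (subject, relation, object).
EDGES = {
    ("EX_kenyon_cell", SUBCLASS_OF, NEURON),
    ("EX_kenyon_cell", PART_OF, "EX_calyx"),
    ("EX_calyx", PART_OF, MUSHROOM_BODY),
    ("EX_motor_neuron", SUBCLASS_OF, NEURON),
}


def walk(start):
    """Return (node, crossed_part_of) states reachable from start.

    crossed_part_of is True when the path used at least one part-of edge.
    """
    seen, stack = set(), [(start, False)]
    while stack:
        node, crossed = stack.pop()
        for subj, rel, obj in EDGES:
            if subj != node:
                continue
            state = (obj, crossed or rel == PART_OF)
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return seen


def is_a(cls, ancestor):
    """True if ancestor is reached via subclass edges only."""
    return (ancestor, False) in walk(cls)


def inferred_part_of(cls, whole):
    """True if whole is reached via a path containing a part-of edge."""
    return (whole, True) in walk(cls)


def query_closed_world(genus, whole):
    """Classes under genus with no inferable part-of path to whole."""
    classes = {subj for subj, _, _ in EDGES}
    return sorted(
        c for c in classes
        if is_a(c, genus) and not inferred_part_of(c, whole)
    )
```

Here `query_closed_world(NEURON, MUSHROOM_BODY)` keeps the hypothetical motor neuron simply because no part-of path was found. An open-world reasoner would not return it for the same DL expression, since nothing in the graph proves that it is not part of the mushroom body.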

In future, these dark arts will be made obsolete by a working SPARQL-DL query regime.

SPARQL with OWLTools

SPARQL isn't supported with the OWLAPI, and as OWLTools is mostly a wrapper onto the OWLAPI, there is no SPARQL support at this time. This may be provided in future if there is demand.

One possibility is to load all inferences into a triplestore and provide a means of accessing this. This does not fit well into the OWL-centric nature of this tool.

Querying for individuals

Adding individuals to an ontology can slow down reasoning considerably. Another challenge is that ELK does not currently handle individuals.

Often we do not need complete ABox reasoning, in particular if the individuals are not interconnected.

One strategy is to translate individuals to classes:

owltools myi.owl --i2c -o file://`pwd`/myc.owl

The resulting ontology can then be reasoned over with ELK.
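
One plausible reading of such a translation is sketched below. It is an assumption, not a description of what --i2c actually emits: each individual is treated as a class, type assertions become SubClassOf axioms, and property assertions become existential restrictions. The axiom types, the `:` prefixed names and the example individuals are all hypothetical, and the output is OWL functional syntax as plain strings.

```python
from typing import NamedTuple


class ClassAssertion(NamedTuple):
    individual: str
    cls: str


class PropertyAssertion(NamedTuple):
    subject: str
    prop: str
    obj: str


def individuals_to_classes(axioms):
    """Rewrite ABox assertions as TBox axioms (assumed translation scheme)."""
    out = []
    for ax in axioms:
        if isinstance(ax, ClassAssertion):
            # i Type C  ->  i SubClassOf C
            out.append(f"SubClassOf(:{ax.individual} :{ax.cls})")
        elif isinstance(ax, PropertyAssertion):
            # i R j  ->  i SubClassOf (R some j)
            out.append(
                f"SubClassOf(:{ax.subject} "
                f"ObjectSomeValuesFrom(:{ax.prop} :{ax.obj}))"
            )
    return out


# Hypothetical ABox: one neuron instance located in one mushroom body instance.
abox = [
    ClassAssertion("neuron_1", "Neuron"),
    PropertyAssertion("neuron_1", "part_of", "mb_1"),
    ClassAssertion("mb_1", "MushroomBody"),
]
tbox = individuals_to_classes(abox)
```

This weakens the semantics (a class standing in for an individual no longer denotes exactly one thing), which is acceptable mainly when the individuals are not heavily interconnected.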

Hybrid Queries

In future, OWLTools may support hybrid query strategies. For example, we might have individuals stored in a Solr server with pre-classified facets. It would be nice to issue a DL query and have it executed via a mixture of reasoning (TBox) and Solr queries (ABox).