Query Parameterization #57

Open
VladimirAlexiev opened this issue Apr 7, 2019 · 15 comments
Labels
goodpractice: A good practice for developers/deployers
protocol: improving sending queries over the wire
query: Extends the Query spec

Comments

@VladimirAlexiev
Contributor

VladimirAlexiev commented Apr 7, 2019

Why?

Many people can't write SPARQL but can invoke queries written by others.
Then it's important to be able to pass (bind) parameters into the query, eg some fixed object, or a date range.

Previous work

Proposed solution

  • It'd be useful to standardize $var to mean "query parameter", to allow it as a parameter in the SPARQL HTTP protocol, and to accept values in Turtle syntax (so prefixes can be used).
  • The Basil / grlc special var naming is hacky but practical.
  • The more advanced Wikidata solution is also very attractive

Considerations for backward compatibility

  • Some people may frown at appropriating $ for this purpose, but IMHO there's no value in having two variable prefix chars.
  • SHACL uses a similar parameter replacement, in which $this is bound to the resource being evaluated. @pfps has expressed concerns about its semantics, but if the param value is replaced textually in all occurrences of $var (even in subqueries), I think that's quite clear? That is another argument for removing the meaning "free variable" from $.
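
To make the "textual replacement in all occurrences" idea concrete, here is a minimal sketch in Python. It is not taken from any existing tool: the function name `bind_params`, the sample query and the `my:` terms are all hypothetical, and it assumes the parameter values arrive already serialised as Turtle terms. It is also deliberately naive, since a plain regular expression would rewrite `$name` inside string literals and comments too, which a real implementation would have to avoid by working on parsed tokens.

```python
import re

# Matches $name only; ?name variables are left untouched (an assumption of this sketch).
PARAM = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def bind_params(query: str, params: dict[str, str]) -> str:
    """Replace every $name occurrence with its already-serialised Turtle term."""
    def repl(match: re.Match) -> str:
        name = match.group(1)
        # Parameters without a supplied value are left in place.
        return params.get(name, match.group(0))
    return PARAM.sub(repl, query)


# Hypothetical query; $country sits inside a subquery and is still replaced.
query = """SELECT ?widget {
  ?widget my:category $category .
  { SELECT ?widget { ?widget my:madeIn $country } }
}"""

bound = bind_params(query, {
    "category": "my:Gadgets",
    "country": "<http://dbpedia.org/resource/Germany>",
})
print(bound)
```

Because the replacement is purely textual, it reaches the subquery regardless of what the inner SELECT projects.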
@kasei
Collaborator

kasei commented Apr 7, 2019

+1

I like the idea of standardizing parameterization.

However,

> Some people may frown at appropriating $ for this purpose, but IMHO there's no value in having two variable prefix chars

More than "frown", I'd consider it really worrisome to break existing queries that happen to use $.

Regarding there being "no value," my recollection is that the two variable syntaxes were introduced to ease use of SPARQL in languages that might have different escaping rules for ? and $. Having the option allowed simply embedding a SPARQL query in a language-native string (hopefully) without having to worry about escaping every variable.

@cygri

cygri commented Apr 7, 2019

TopQuadrant's approach to this is SPIN templates, originally introduced more than 10 years ago for SPARQL 1.0. It is not a SPARQL extension, but a layer on top of SPARQL. A SPIN template is a resource described in RDF, with properties for the query that we want to parameterise, and for the arguments to be passed in. Here is an illustration; it's not actual SPIN, as some of the naming and design choices in SPIN distract from the core idea:

my:Template1 a :SelectTemplate;
    rdfs:comment "Retrieves widgets by category, optionally filtered by production status.";
    :argument [
        rdfs:comment "The category of widget to retrieve";
        :varName "category";
        :valueType my:WidgetCategory;
    ];
    :argument [
        rdfs:comment "Include discontinued widgets; default: false";
        :varName "withDiscontinued";
        :valueType xsd:boolean;
        :optional true;
    ];
    :query """
        SELECT ?widget {
            ?widget my:category ?category.
            FILTER IF(COALESCE(?withDiscontinued, false),
                true,
                EXISTS { ?widget my:inProduction true }
            )
        }"""^^:SPARQL.

So we have two arguments, ?category and ?withDiscontinued. We can provide various annotations, like human-readable comments, expected class/datatype for the argument, is it required or optional.

The template can be invoked like so:

http://myserver/template?template=my:Template1&category=...&withDiscontinued=true

This doesn't use the SPARQL endpoint but a separate endpoint for template queries.
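
As a rough illustration of what such a template endpoint might do before running the query, the following Python sketch checks the supplied request arguments against the declared ones. It is only a sketch under assumptions: the `Argument` and `Template` classes and `check_arguments` are hypothetical names mirroring the illustrative `:varName`, `:valueType` and `:optional` properties above, not SPIN or TopQuadrant APIs, and value types are kept as opaque strings rather than validated.

```python
from dataclasses import dataclass, field


@dataclass
class Argument:
    var_name: str           # mirrors :varName in the illustration
    value_type: str         # mirrors :valueType (not validated in this sketch)
    optional: bool = False  # mirrors :optional


@dataclass
class Template:
    name: str
    arguments: list[Argument] = field(default_factory=list)


def check_arguments(template: Template, supplied: dict[str, str]) -> dict[str, str]:
    """Reject unknown or missing arguments; return the accepted bindings."""
    declared = {a.var_name for a in template.arguments}
    unknown = sorted(set(supplied) - declared)
    if unknown:
        raise ValueError(f"unknown argument(s): {', '.join(unknown)}")
    missing = [a.var_name for a in template.arguments
               if not a.optional and a.var_name not in supplied]
    if missing:
        raise ValueError(f"missing required argument(s): {', '.join(missing)}")
    return dict(supplied)


template1 = Template("my:Template1", [
    Argument("category", "my:WidgetCategory"),
    Argument("withDiscontinued", "xsd:boolean", optional=True),
])

# The optional argument may be omitted; the required one may not.
accepted = check_arguments(template1, {"category": "my:Gadgets"})
print(accepted)
```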

The main downside of this approach is the relatively verbose syntax, and mixing Turtle and SPARQL in the same file. TopQuadrant has always had IDE support for working with this, so we rarely write/edit these files by hand. A significant advantage is the ability to add arbitrary additional properties, e.g., for documentation, access control, and categorisation of services.

The main takeaway is that parameterised queries do not require an extension to SPARQL, but can be done as a layer on top, and this approach has been successfully used in production systems for many years. Half of what's needed to standardise this is already in SHACL.

@lisp
Contributor

lisp commented Apr 25, 2019

another option is to provide for parameters at the protocol level and leave the implementation to perform dynamic binding or query rewriting as it sees fit.
in that case, the compile/runtime environment determines the appropriate interpretation for a variable rather than the variable syntax itself.
for example, given

https://dydra.com/jhacker/foaf/@query#all

which is

select * where {?s ?p ?o} limit 10

both of

 curl 'https://dydra.com/jhacker/foaf/all?limit=10'
 curl 'https://dydra.com/jhacker/foaf/all?$s=<http://dbpedia.org/resource/Germany>'

are possible.
in this approach, the url variable syntax follows the convention established by sesame's http protocol, which offers some level of interoperability.
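
For clients that build such request URLs programmatically, a small Python sketch of the `$name=value` convention might look like the following. The helper name `view_url` is hypothetical, and it is an assumption here that the server decodes percent-encoded parameter names and values (the curl example above sends them unencoded).

```python
from urllib.parse import urlencode


def view_url(base: str, bindings: dict[str, str]) -> str:
    """Build a request URL that passes each binding as a `$name=term` parameter."""
    # urlencode percent-encodes both the `$` prefix and the angle brackets of IRIs.
    query = urlencode({f"${name}": term for name, term in bindings.items()})
    return f"{base}?{query}"


url = view_url(
    "https://dydra.com/jhacker/foaf/all",
    {"s": "<http://dbpedia.org/resource/Germany>"},
)
print(url)
```

Encoding the value avoids problems with characters such as `<`, `>` and `#`, which are not safe to send raw in a query string.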

@VladimirAlexiev
Contributor Author

Triply-Dev/YASGUI.YASQE-deprecated#24 is a great discussion

@lisp
Contributor

lisp commented Jun 4, 2019

following on to that discussion, we eventually implemented a mechanism to allow values clauses

the mechanism uses the values clause dimensions to perform substitutions in the symbolic sparql expression.

@VladimirAlexiev
Contributor Author

http://docs.rdf4j.org/rest-api/#_repository_queries is broken.
@lisp thanks for the new link http://archive.rdf4j.org/system/ch08.html#d0e247

@jeenbroekstra Isn't there an equivalent chapter in rdf4j's new documentation?

@abrokenjester
Collaborator

@VladimirAlexiev the more up to date link is https://rdf4j.org/documentation/reference/rest-api/ .

@namedgraph

Is there a list of triplestores that support this REST API? Specifically repository creation:
https://rdf4j.org/documentation/reference/rest-api/#repository-creation

@abrokenjester
Collaborator

> Is there a list of triplestores that support this REST API? Specifically repository creation:
> https://rdf4j.org/documentation/reference/rest-api/#repository-creation

Any triplestore that implements the RDF4J Repository or Sail API can be exposed through this REST API using the RDF4J Server application. But if you're asking which stores support this REST API directly, I don't have an exhaustive list, I'm afraid. Off the top of my head I believe GraphDB, Halyard, and Strabon do so, though I haven't checked or tested this.

afs added the label "protocol" (improving sending queries over the wire) on Jun 23, 2021
@afs
Collaborator

afs commented Dec 3, 2021

https://afs.github.io/substitute.html covers query parameterization.

The difference between EXISTS usage and query parameterization is the treatment of nested SELECT. For query parameterization, it would seem natural to allow substitution by name inside nested SELECT even when the variable is not exposed by the inner project. That's whether to apply the scope-sensitive renaming or not.

Example: replacement of ?c in:

SELECT ... {
   ...
   SELECT (count(?s) AS ?X) {
      ?a ?b ?c 
   }
}

@lisp
Contributor

lisp commented Dec 4, 2021

in relation to jena's notion, the term substitution in my remark from 2019 warrants clarification.
with respect to values request arguments, the query compiler does manipulate the respective sse.
this is because, for that case, simple static analysis suffices and the mechanism changes just the values of variable bindings, but not their nature.
this mechanism is not, however, that which is used to effect general query parameterisation.
for that, rather than manipulating the sse, the query compiler establishes dynamic bindings.
this mechanism makes it much easier to avoid the issues described in jena's documentation.

@VladimirAlexiev
Contributor Author

Is the trailing VALUES clause specifically made for this purpose? Because you can just append it to the query.
Diagram from https://rawgit2.com/VladimirAlexiev/grammar-diagrams/master/sparql11-grammar.xhtml:

[image: grammar diagram, not reproduced]

@lisp
Contributor

lisp commented Dec 20, 2022

a mechanism which accepts a values clause as parameter has a very different behaviour than one which accepts individual variable bindings.
the former provides a solution field as an algebra argument while the latter provides initial bindings for free variables.

@afs
Collaborator

afs commented Dec 21, 2022

> Is the trailing VALUES clause specifically made for this purpose?

Not really. The VALUES syntax at that point isn't treated any differently from VALUES anywhere else in a query. It is joined with the results of evaluating the query so far (it comes after GROUP BY and HAVING, for example). It can't "see" all variables.

https://www.w3.org/TR/sparql11-query/#sparqlAlgebraFinalValues

It would have an effect of parameterization in some cases but not all. It is different behaviour.

I think the user intuition of parameterization of a query is more syntactic - replace a variable by a value at the syntax level before processing the query, replace everywhere in the query, including inside sub-queries. It is on the "input", not the "output". If the replacement is "bad" (e.g. BOUND(?var) becomes BOUND(constant)), then the query doesn't execute.
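
The contrast can be sketched at the text level in Python. This is only an illustration under assumptions: `append_values` and `substitute` are hypothetical helpers, the example query and IRI are made up, and a regular expression stands in for real query parsing. Nothing is executed against a SPARQL engine; the sketch only shows where the value ends up in each case.

```python
import re


def append_values(query: str, var: str, term: str) -> str:
    """Add a trailing VALUES clause; it is joined after the rest of the query."""
    return f"{query}\nVALUES ?{var} {{ {term} }}"


def substitute(query: str, var: str, term: str) -> str:
    """Replace the variable everywhere at the syntax level, before processing."""
    name = re.escape(var)
    # A "bad" replacement such as BOUND(constant) is rejected up front.
    if re.search(r"BOUND\s*\(\s*[?$]" + name + r"\s*\)", query, re.IGNORECASE):
        raise ValueError(f"cannot substitute ?{var}: it is used in BOUND()")
    token = re.compile(r"[?$]" + name + r"(?![A-Za-z0-9_])")
    return token.sub(lambda m: term, query)


grouped = "SELECT ?b (COUNT(?s) AS ?n) { ?s ?b ?c } GROUP BY ?b"

# ?c is not visible after grouping, so the trailing clause does not constrain the pattern.
print(append_values(grouped, "c", "<http://example/x>"))

# Substitution puts the value inside the triple pattern itself.
print(substitute(grouped, "c", "<http://example/x>"))
```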

@lisp
Contributor

lisp commented Dec 21, 2022

some ambiguity in expectations could be traced to the passage in the federation recommendation

https://www.w3.org/TR/sparql11-federated-query/#values

8 participants