Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Use node references tables in join query to better performance #312

Open
hason opened this issue Jan 14, 2016 · 3 comments
Open

Use node references tables in join query to better performance #312

hason opened this issue Jan 14, 2016 · 3 comments

Comments

@hason
Copy link

hason commented Jan 14, 2016

The QueryBuilder

$qb
    ->fromDocument(Article::class, 'a')
    ->addJoinInner()
        ->right()->document(ArticleCategory::class, 'c')->end()
        ->condition()->equi('a.category', 'c.uuid')->end()
    ->end()
    ->where()
        ->eq()->field('c.uuid')->literal($contentDocument->getUuid())->end()
    ->end()
    ->andWhere()
        ->eq()->field('a.publishable')->literal(true)->end()
    ->end()
    ->orderBy()->desc()->field('a.publishStartDate')->end()
;

is actual transformed to very slow sql (sqlite 400 ms)

SELECT 
  n0.path AS n0_path, 
  n0.identifier AS n0_identifier, 
  n0.props AS n0_props, 
  n1.path AS n1_path, 
  n1.identifier AS n1_identifier, 
  n1.props AS n1_props 
FROM 
  phpcr_nodes n0 
  INNER JOIN phpcr_nodes n1 ON (
    n0.workspace_name = n1.workspace_name 
    AND n1.type IN ('nt:unstructured', 'rep:root') 
    AND EXTRACTVALUE(
      n0.props, '//sv:property[@sv:name="category"]/sv:value[1]'
    ) = n1.identifier
  ) 
WHERE 
  n0.workspace_name = ? 
  AND n0.type IN ('nt:unstructured', 'rep:root') 
  AND (
    (
      (
        n1.identifier = 'c39b134f-81dd-4ef8-937d-d0fd975c349b' 
        AND EXTRACTVALUE(
          n0.props, '//sv:property[@sv:name="publishable"]/sv:value[1]'
        ) = '1'
      ) 
      AND (
        EXTRACTVALUE(
          n0.props, 'count(//sv:property[@sv:name="phpcr:class"]/sv:value[text()="AppBundle\Document\Article"]) > 0'
        ) 
        OR EXTRACTVALUE(
          n0.props, 'count(//sv:property[@sv:name="phpcr:classparents"]/sv:value[text()="AppBundle\Document\Article"]) > 0'
        )
      )
    ) 
    AND (
      EXTRACTVALUE(
        n1.props, 'count(//sv:property[@sv:name="phpcr:class"]/sv:value[text()="AppBundle\Document\ArticleCategory"]) > 0'
      ) 
      OR EXTRACTVALUE(
        n1.props, 'count(//sv:property[@sv:name="phpcr:classparents"]/sv:value[text()="AppBundle\Document\ArticleCategory"]) > 0'
      )
    )
  ) 
ORDER BY 
  CAST(
    EXTRACTVALUE(
      n0.numerical_props, '//sv:property[@sv:name="publishStartDate"]/sv:value[1]'
    ) AS DECIMAL
  ) DESC, 
  EXTRACTVALUE(
    n0.props, '//sv:property[@sv:name="publishStartDate"]/sv:value[1]'
  ) DESC

The sql is much more faster with references tables (4 ms):

SELECT 
  n0.path AS n0_path, 
  n0.identifier AS n0_identifier, 
  n0.props AS n0_props, 
  n1.path AS n1_path, 
  n1.identifier AS n1_identifier, 
  n1.props AS n1_props 
FROM 
  phpcr_nodes n0 
  INNER JOIN phpcr_nodes n1 ON (
    n0.workspace_name = n1.workspace_name 
    AND n1.type IN ('nt:unstructured', 'rep:root')
  ) 
  INNER JOIN phpcr_nodes_weakreferences n2 ON (
    n2.target_id = n1.id 
    AND n2.source_id = n0.id 
    AND n2.source_property_name = "category"
  ) 
WHERE 
  n0.workspace_name = ? 
  AND n0.type IN ('nt:unstructured', 'rep:root') 
  AND (
    (
      (
        n1.identifier = 'c39b134f-81dd-4ef8-937d-d0fd975c349b' 
        AND EXTRACTVALUE(
          n0.props, '//sv:property[@sv:name="publishable"]/sv:value[1]'
        ) = '1'
      ) 
      AND (
        EXTRACTVALUE(
          n0.props, 'count(//sv:property[@sv:name="phpcr:class"]/sv:value[text()="AppBundle\Document\Article"]) > 0'
        ) 
        OR EXTRACTVALUE(
          n0.props, 'count(//sv:property[@sv:name="phpcr:classparents"]/sv:value[text()="AppBundle\Document\Article"]) > 0'
        )
      )
    ) 
    AND (
      EXTRACTVALUE(
        n1.props, 'count(//sv:property[@sv:name="phpcr:class"]/sv:value[text()="AppBundle\Document\ArticleCategory"]) > 0'
      ) 
      OR EXTRACTVALUE(
        n1.props, 'count(//sv:property[@sv:name="phpcr:classparents"]/sv:value[text()="AppBundle\Document\ArticleCategory"]) > 0'
      )
    )
  ) 
ORDER BY 
  CAST(
    EXTRACTVALUE(
      n0.numerical_props, '//sv:property[@sv:name="publishStartDate"]/sv:value[1]'
    ) AS DECIMAL
  ) DESC, 
  EXTRACTVALUE(
    n0.props, '//sv:property[@sv:name="publishStartDate"]/sv:value[1]'
  ) DESC
@dbu
Copy link
Member

dbu commented Jan 15, 2016

i kind of wonder if it even makes sense that the condition reads ->condition()->equi('a.category', 'c.uuid') instead of ->condition()->equi('a.category', 'c'). the odm query builder should not use the phpcr level things...

on the phpcr level, i don't know if we can detect this situation - and we probably don't know if its a weak or a hard reference, and those are in different tables. but maybe we can, or can use both tables or something. or find a way to pass hints from phcpr-odm to the phcpr query builder - the performance difference is quite massive.

if you have time to improve some of this, it would be great.

@hason
Copy link
Author

hason commented Jan 18, 2016

i kind of wonder if it even makes sense that the condition reads ->condition()->equi('a.category', 'c.uuid') instead of ->condition()->equi('a.category', 'c'). the odm query builder should not use the phpcr level things...

->condition()->equi('a.category', 'c') throws Invalid field specification, expected string like "[alias].[field_name]", got "c"

@dbu
Copy link
Member

dbu commented Jan 21, 2016

yes, its how it currently works, but it feels like a violation of abstractions :-(

@dbu dbu added the performance label Apr 7, 2017
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

No branches or pull requests

2 participants