Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Long schema scan query on row insert #377

Closed
tQuant opened this issue Dec 23, 2024 · 18 comments
Closed

Long schema scan query on row insert #377

tQuant opened this issue Dec 23, 2024 · 18 comments
Assignees

Comments

@tQuant
Copy link

tQuant commented Dec 23, 2024

What steps will reproduce the problem?

Insert row

/** @var \Yiisoft\Db\Connection\ConnectionInterface $db */
$db
                ->createCommand()
                ->insertWithReturningPks('my.my_table_name', [
                    ...
                ]);

What is the expected result?

Row inserted

What do you get instead?

Row inserted, but
A long additional query is performed to scan the database schema - from 60 to 80 seconds.

Additional info

Slow query:

SELECT d.nspname AS table_schema, c.relname AS table_name, a.attname AS column_name,
    COALESCE(td.typname, tb.typname, t.typname) AS data_type, COALESCE(td.typtype, tb.typtype, t.typtype) AS type_type,
    (SELECT nspname
     FROM pg_namespace
     WHERE oid = COALESCE(td.typnamespace, tb.typnamespace, t.typnamespace)) AS type_scheme,
    a.attlen AS character_maximum_length, pg_catalog.col_description(c.oid, a.attnum) AS column_comment,
    information_schema._pg_truetypmod(a, t) AS modifier, NOT (a.attnotnull OR t.typnotnull) AS is_nullable,
    COALESCE(t.typdefault, pg_get_expr(ad.adbin, ad.adrelid)) AS column_default,
        COALESCE(pg_get_expr(ad.adbin, ad.adrelid) ~ 'nextval', false) OR a.attidentity != '' AS is_autoinc,
    pg_get_serial_sequence(quote_ident(d.nspname) || '.' || quote_ident(c.relname), a.attname) AS sequence_name,
    CASE WHEN COALESCE(td.typtype, tb.typtype, t.typtype) = 'e'::char
             THEN array_to_string((SELECT array_agg(enumlabel)
                                   FROM pg_enum
                                   WHERE enumtypid = COALESCE(td.oid, tb.oid, a.atttypid))::varchar[], ',')
         ELSE NULL END AS enum_values,
    information_schema._pg_numeric_precision(COALESCE(td.oid, tb.oid, a.atttypid),
                                             information_schema._pg_truetypmod(a, t)) AS numeric_precision,
    information_schema._pg_numeric_scale(COALESCE(td.oid, tb.oid, a.atttypid),
                                         information_schema._pg_truetypmod(a, t)) AS numeric_scale,
    information_schema._pg_char_max_length(COALESCE(td.oid, tb.oid, a.atttypid),
                                           information_schema._pg_truetypmod(a, t)) AS size,
    ct.oid IS NOT NULL AS is_pkey,
    COALESCE(NULLIF(a.attndims, 0), NULLIF(t.typndims, 0), (t.typcategory = 'A')::int) AS dimension
FROM pg_class c
         LEFT JOIN pg_attribute a ON a.attrelid = c.oid
         LEFT JOIN pg_attrdef ad ON a.attrelid = ad.adrelid AND a.attnum = ad.adnum
         LEFT JOIN pg_type t ON a.atttypid = t.oid
         LEFT JOIN pg_type tb ON (a.attndims > 0 OR t.typcategory = 'A') AND t.typelem > 0 AND t.typelem = tb.oid OR
                                 t.typbasetype > 0 AND t.typbasetype = tb.oid
         LEFT JOIN pg_type td ON t.typndims > 0 AND t.typbasetype > 0 AND tb.typelem = td.oid
         LEFT JOIN pg_namespace d ON d.oid = c.relnamespace
         LEFT JOIN pg_rewrite rw ON c.relkind = 'v' AND rw.ev_class = c.oid AND rw.rulename = '_RETURN'
         LEFT JOIN pg_constraint ct ON ct.conrelid = c.oid AND ct.contype = 'p' AND a.attnum = ANY (ct.conkey) OR
                                       rw.ev_action IS NOT NULL AND ct.contype = 'p' AND
                                       (ARRAY(SELECT regexp_matches(rw.ev_action,
                                                                    '{TARGETENTRY .*? :resorigtbl (\d+) :resorigcol (\d+) ',
                                                                    'g')))[a.attnum : a.attnum] <@
                                       (ct.conrelid::text || ct.conkey::text[])
WHERE a.attnum > 0 AND t.typname != '' AND NOT a.attisdropped AND c.relname = 'my_table_name' AND d.nspname = 'my'
ORDER BY a.attnum;
Q A
Version 1.3.0
PHP version 8.1.29
Operating system CentOS Linux 7
@roxblnfk

This comment was marked as off-topic.

@samdark

This comment was marked as off-topic.

@samdark
Copy link
Member

samdark commented Dec 23, 2024

What's the database? How large is the schema?

@tQuant
Copy link
Author

tQuant commented Dec 23, 2024

DBMS: PostgreSQL (ver. 12.15)

Family of 498 table
Family of 4 materialized view
Family of 483 view
Family of 315 routine
Family of 239 sequence
Family of 12 object type

@tQuant
Copy link
Author

tQuant commented Dec 23, 2024

The slowest part is "regexp_matches" as I understend.
EXPLAIN:
image
All other parts is less then 1ms

@samdark
Copy link
Member

samdark commented Dec 23, 2024

Thanks for the helpful info.

@xepozz
Copy link
Member

xepozz commented Dec 23, 2024

@tQuant could you please also attach raw "explain analyze" results?

@tQuant
Copy link
Author

tQuant commented Dec 23, 2024

explain.json

@Tigrov
Copy link
Member

Tigrov commented Dec 24, 2024

@tQuant Does the insertion takes place into view?

@tQuant
Copy link
Author

tQuant commented Dec 24, 2024

@tQuant Does the insertion takes place into view?

yes

@Tigrov Tigrov transferred this issue from yiisoft/db Dec 25, 2024
@Tigrov
Copy link
Member

Tigrov commented Dec 25, 2024

@tQuant Could you try this PR

Also plan to make type casting optional in DB 2.0.

@tQuant
Copy link
Author

tQuant commented Dec 25, 2024

Can you please give me a result query? I will run it in DB, to check performance?

@Tigrov
Copy link
Member

Tigrov commented Dec 25, 2024

SELECT
            a.attname AS column_name,
            COALESCE(td.typname, tb.typname, t.typname) AS data_type,
            COALESCE(td.typtype, tb.typtype, t.typtype) AS type_type,
            (SELECT nspname FROM pg_namespace WHERE oid = COALESCE(td.typnamespace, tb.typnamespace, t.typnamespace)) AS type_scheme,
            a.attlen AS character_maximum_length,
            pg_catalog.col_description(c.oid, a.attnum) AS column_comment,
            NOT (a.attnotnull OR t.typnotnull) AS is_nullable,
            COALESCE(t.typdefault, pg_get_expr(ad.adbin, ad.adrelid)) AS column_default,
            COALESCE(pg_get_expr(ad.adbin, ad.adrelid) ~ 'nextval', false) OR a.attidentity != '' AS is_autoinc,
            pg_get_serial_sequence(quote_ident(d.nspname) || '.' || quote_ident(c.relname), a.attname)
            AS sequence_name,
            CASE WHEN COALESCE(td.typtype, tb.typtype, t.typtype) = 'e'::char
                THEN array_to_string(
                    (
                        SELECT array_agg(enumlabel)
                        FROM pg_enum
                        WHERE enumtypid = COALESCE(td.oid, tb.oid, a.atttypid)
                    )::varchar[],
                ',')
                ELSE NULL
            END AS enum_values,
            COALESCE(
                information_schema._pg_char_max_length(
                    COALESCE(td.oid, tb.oid, a.atttypid),
                    a.atttypmod
                ),
                information_schema._pg_datetime_precision(
                    COALESCE(td.oid, tb.oid, a.atttypid),
                    a.atttypmod
                ),
                CASE a.atttypmod
                    WHEN -1 THEN null
                    ELSE ((a.atttypmod - 4) >> 16) & 65535
                END
            ) AS size,
            information_schema._pg_numeric_scale(
                COALESCE(td.oid, tb.oid, a.atttypid),
                a.atttypmod
            ) AS scale,
            ct.contype,
            COALESCE(NULLIF(a.attndims, 0), NULLIF(t.typndims, 0), (t.typcategory='A')::int) AS dimension
        FROM
            pg_class c
            LEFT JOIN pg_attribute a ON a.attrelid = c.oid
            LEFT JOIN pg_attrdef ad ON a.attrelid = ad.adrelid AND a.attnum = ad.adnum
            LEFT JOIN pg_type t ON a.atttypid = t.oid
            LEFT JOIN pg_type tb ON (a.attndims > 0 OR t.typcategory='A') AND t.typelem > 0 AND t.typelem = tb.oid
                OR t.typbasetype > 0 AND t.typbasetype = tb.oid
            LEFT JOIN pg_type td ON t.typndims > 0 AND t.typbasetype > 0 AND tb.typelem = td.oid
            LEFT JOIN pg_namespace d ON d.oid = c.relnamespace
            LEFT JOIN pg_rewrite rw ON c.relkind = 'v' AND rw.ev_class = c.oid AND rw.rulename = '_RETURN'
            LEFT JOIN pg_constraint ct ON (ct.contype = 'p' OR ct.contype = 'u' AND cardinality(ct.conkey) = 1)
                AND (
                    ct.conrelid = c.oid AND a.attnum = ANY (ct.conkey)
                    OR rw.ev_action IS NOT NULL
                    AND strpos(rw.ev_action, ':resorigtbl ' || ct.conrelid || ' ') > 0
                    AND rw.ev_action ~ ('.* :resno ' || a.attnum || ' :resname \S+ :ressortgroupref \d+ :resorigtbl '
                        || ct.conrelid || ' :resorigcol (?:'
                        || replace(substr(ct.conkey::text, 2, length(ct.conkey::text) - 2), ',', '|') || ') .*')
                )
        WHERE
            a.attnum > 0 AND t.typname != '' AND NOT a.attisdropped
            AND c.relname = 'my_table_name'
            AND d.nspname = 'my'
        ORDER BY
            a.attnum;

@tQuant
Copy link
Author

tQuant commented Dec 25, 2024

"[42883] ERROR: function regexp_like(pg_node_tree, text) does not exist"

The regexp_like() function was introduced in PostgreSQL 15

@Tigrov
Copy link
Member

Tigrov commented Dec 25, 2024

Oh, sorry, try now, I updated the query above

@tQuant
Copy link
Author

tQuant commented Dec 25, 2024

Thats work (execution: 1 s)

@samdark
Copy link
Member

samdark commented Dec 25, 2024

Great improvement considering that it's possible to cache it.

@Tigrov
Copy link
Member

Tigrov commented Dec 26, 2024

#378

@Tigrov Tigrov closed this as completed Dec 26, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants