PSQL and SQL profiler #7086

Merged
merged 30 commits into from
Aug 11, 2022

Conversation

@asfernandes (Member)

No description provided.

@asfernandes (Member Author)

Are there any objections to integrating this into master, or is there anything left to discuss?

@romansimakov (Contributor)

In firebird-devel we talked about the ability to profile other connections. Has it been added, or are you going to add it later?

@asfernandes (Member Author)

> In firebird-devel we talked about the ability to profile other connections. Has it been added, or are you going to add it later?

I think this change would not invalidate the current design, so it does not make much sense to continue adding features before an initial integration and before we get feedback on what we have now.

@romansimakov (Contributor)

> > In firebird-devel we talked about the ability to profile other connections. Has it been added, or are you going to add it later?
>
> I think this change would not invalidate the current design, so it does not make much sense to continue adding features before an initial integration and before we get feedback on what we have now.

If so, I don't mind.

@dyemanov (Member) commented May 1, 2022

I'd prefer the default plugin to be usable without naming it in START_SESSION (for example, declare NULL as the default value). It could be done via a DefaultProfilerPlugin option in firebird.conf, for example. Alternatively, we may have a ProfilerPlugin option in firebird.conf (similar to TracePlugin = fbtrace) that substitutes the missing plugin name in START_SESSION with any specified plugin (not only the default one). One may still specify the plugin name explicitly in START_SESSION, if necessary.
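A rough sketch of what the suggestion above could look like in practice. The option name `ProfilerPlugin`, the plugin name `Default_Profiler`, and the exact START_SESSION signature are assumptions for illustration, not the committed API:

```sql
-- firebird.conf (hypothetical option, per the suggestion above):
--   ProfilerPlugin = Default_Profiler

-- With such a default configured, START_SESSION could omit the plugin name
-- (or accept NULL) and fall back to the configured plugin:
SELECT RDB$PROFILER.START_SESSION('my session') FROM RDB$DATABASE;

-- An explicit plugin name would still be possible when necessary:
SELECT RDB$PROFILER.START_SESSION('my session', 'Default_Profiler') FROM RDB$DATABASE;
```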

@dyemanov (Member) commented May 1, 2022

Once we support profiling other connections (I agree this is a top-priority feature), should we extend START_SESSION with a new parameter (ATTACHMENT_ID DEFAULT CURRENT_ATTACHMENT), or will some new function be introduced?
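The extension proposed above might look like the following sketch. The parameter's position and the exact signature are assumptions at this point in the discussion:

```sql
-- Profile the current attachment (the implied default):
SELECT RDB$PROFILER.START_SESSION('local profiling') FROM RDB$DATABASE;

-- If START_SESSION gains an ATTACHMENT_ID parameter defaulting to
-- CURRENT_ATTACHMENT, profiling another connection could be:
SELECT RDB$PROFILER.START_SESSION('remote profiling', 123) FROM RDB$DATABASE;
-- where 123 would be the MON$ATTACHMENT_ID of the attachment to profile
```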

@dyemanov (Member) commented May 1, 2022

While I don't mind the CREATE_SESSION and FINISH_SESSION routine names, I'd suggest renaming the SESSION_ID columns inside the tables to PROFILE_ID. Having SESSION_ID and ATTACHMENT_ID side by side looks somewhat ambiguous, considering we have SESSION_ID and USER_SESSION meaning attachments in RDB$GET_CONTEXT, and also considering the SQL specification has SESSION_USER, where session also means an attachment.

@dyemanov (Member) commented May 1, 2022

Why the FBPROF$ prefix for tables/views instead of just PROF$? We're Firebird anyway, so what's the point?

@dyemanov (Member) commented May 1, 2022

Given that the default profiler plugin is supposed to be really usable (not just a draft example of how things should be coded), why are its tables/views created dynamically rather than being part of the ODS? Are they expected to change significantly over time?

@dyemanov (Member) commented May 1, 2022

I really miss timings (count/min/max/total) for statements as a whole. It would be inconvenient to measure them with external tools and then look into the PROF$ tables/views for details; it would be handier to have everything in a single place, especially if we speak about multiple executions of a single prepared/cached statement.

@dyemanov (Member) commented May 1, 2022

I've attempted to collect execution times in the past, but wanted to split them into total/cpu/wait parts, with the wait part also being detailed (I/O, lock, latch, pause, etc.). Given that extra measurements are not always dirt cheap, I had doubts they should be presented via MON$ tables unconditionally. Now it looks like this could be integrated with your profiler design after it's committed and thus measured on demand. But the question is how deep we need to dive into the CPU time. Is it OK to calculate it as (total - wait) (or leave this calculation up to users), or do we want to see real CPU time and possibly deal with some delta (total - wait - cpu) that remains unknown? And if we need real CPU time, do we want user/kernel times? All these things complicate the implementation and make it somewhat platform-dependent.

@dyemanov (Member) commented May 1, 2022

I feel the profiler package also needs a SET_FLUSH_INTERVAL routine that does flushing automagically, based on a timer. Zero (the default) means manual flushing, as currently designed.
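Such a routine might be used as in the sketch below. The routine name follows the proposal above, but its existence and semantics are not committed API at this point; RDB$PROFILER.FLUSH is shown as the manual counterpart described in the current design:

```sql
-- Hypothetical: flush profiler data to the snapshot tables every 5 seconds.
EXECUTE PROCEDURE RDB$PROFILER.SET_FLUSH_INTERVAL(5);

-- Back to manual flushing (the current design): zero interval, explicit flush.
EXECUTE PROCEDURE RDB$PROFILER.SET_FLUSH_INTERVAL(0);
EXECUTE PROCEDURE RDB$PROFILER.FLUSH;
```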

@aafemt (Contributor) commented May 1, 2022

> It could be done via a DefaultProfilerPlugin option in firebird.conf, for example. Alternatively, we may have a ProfilerPlugin option in firebird.conf (similar to TracePlugin = fbtrace)

Talking about configuration: because profiling is entirely engine-related, I'd say that this setting belongs in Engine14.conf, not firebird.conf.

@asfernandes (Member Author)

> Given that the default profiler plugin is supposed to be really usable (not just a draft example of how things should be coded), why are its tables/views created dynamically rather than being part of the ODS? Are they expected to change significantly over time?

  1. Then it's not really a plugin design.
  2. It would require a lot of code to make backup and restore of them work.
  3. It would not support primary and foreign keys.

@asfernandes (Member Author)

> Why the FBPROF$ prefix for tables/views instead of just PROF$? We're Firebird anyway, so what's the point?

My idea was to have an FB namespace to differentiate things from user objects, as users can also use the dollar sign.

But since it's a new naming convention, I would not have a problem changing it.

We also have PLG$* tables, so maybe we should name them PLG$PROF*?

@asfernandes (Member Author)

> I really miss timings (count/min/max/total) for statements as a whole. It would be inconvenient to measure them with external tools and then look into the PROF$ tables/views for details; it would be handier to have everything in a single place, especially if we speak about multiple executions of a single prepared/cached statement.

How would these stored timings be different from aggregating the request-based timings per STATEMENT_ID?

@asfernandes (Member Author)

> I feel the profiler package also needs a SET_FLUSH_INTERVAL routine that does flushing automagically, based on a timer. Zero (the default) means manual flushing, as currently designed.

Data (even when flushed) is stored as part of the user transaction, which may be rolled back later.

I do not see how automatic flush would work with this, or how it would be less confusing than manually flushing the data before reading it.

But it could be useful in the case of profiling other connections.

@dyemanov (Member) commented May 4, 2022

> We also have PLG$* tables, so maybe we should name them PLG$PROF*?

This looks good to me.

@dyemanov (Member) commented May 4, 2022

> Data (even when flushed) is stored as part of the user transaction, which may be rolled back later.

Automatic flushing could behave as if it's executed in an autonomous transaction. Rollback will surely not be possible, but one may always delete the rows manually. This should be well documented, of course.

@dyemanov (Member) commented May 4, 2022

> How would these stored timings be different from aggregating the request-based timings per STATEMENT_ID?

They cannot be aggregated from RECORD_SOURCE_STATS, as time may also be spent between cursor operations. Perhaps they could be aggregated from PSQL_STATS, but only for PSQL routines. Imagine I execute INSERT FROM SELECT and 90% of the time is spent inside VIO_store(); how could I get the total execution time for my statement? What about non-PSQL procedures/functions?

@AlexPeshkoff (Member) commented May 4, 2022 via email

@asfernandes (Member Author)

I've implemented profiling of other attachments with this commit set. Its semantics are documented in the readme.

To avoid confusion, the user's transaction is no longer passed to plugins, and flush always starts its own transaction.

hvlad and others added 5 commits June 4, 2022 15:45
…of MSVC.

Note: it is not documented so far for the newly released MSVC 17.1.
Assume _MSC_VER will be increased to >= 2000 when/if the VC CRT library gets a new version number in its suffix.
Add parameter FLUSH_INTERVAL to START_SESSION.
@asfernandes (Member Author)

> > How would these stored timings be different from aggregating the request-based timings per STATEMENT_ID?
>
> They cannot be aggregated from RECORD_SOURCE_STATS, as time may also be spent between cursor operations. Perhaps they could be aggregated from PSQL_STATS, but only for PSQL routines. Imagine I execute INSERT FROM SELECT and 90% of the time is spent inside VIO_store(); how could I get the total execution time for my statement? What about non-PSQL procedures/functions?

I think we need the TOTAL time of requests (as stored in PLG$PROF_REQUESTS), which is valid for both SQL and PSQL.

Then it's possible to calculate MIN/MAX per statement (as stored in PLG$PROF_STATEMENTS).
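Assuming PLG$PROF_REQUESTS records one row per request with a total elapsed time (the column names below are illustrative, not necessarily the final schema), the per-statement aggregation described above could be a simple query:

```sql
-- Per-statement timings aggregated from per-request rows
-- (:profile_id selects one profile session's data).
SELECT STATEMENT_ID,
       COUNT(*) AS EXECUTIONS,
       MIN(TOTAL_ELAPSED_TIME) AS MIN_ELAPSED_TIME,
       MAX(TOTAL_ELAPSED_TIME) AS MAX_ELAPSED_TIME,
       SUM(TOTAL_ELAPSED_TIME) AS TOTAL_ELAPSED_TIME
  FROM PLG$PROF_REQUESTS
 WHERE PROFILE_ID = :profile_id
 GROUP BY STATEMENT_ID;
```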

@asfernandes (Member Author)

> I've attempted to collect execution times in the past, but wanted to split them into total/cpu/wait parts, with the wait part also being detailed (I/O, lock, latch, pause, etc.). Given that extra measurements are not always dirt cheap, I had doubts they should be presented via MON$ tables unconditionally. Now it looks like this could be integrated with your profiler design after it's committed and thus measured on demand. But the question is how deep we need to dive into the CPU time. Is it OK to calculate it as (total - wait) (or leave this calculation up to users), or do we want to see real CPU time and possibly deal with some delta (total - wait - cpu) that remains unknown? And if we need real CPU time, do we want user/kernel times? All these things complicate the implementation and make it somewhat platform-dependent.

AFAIK there is no way to directly get the "wait time" from the OS. It's the elapsed time minus the thread's CPU time (user + kernel).

I think we can pass the elapsed time and the thread's total CPU time. The helper views could also calculate the total wait time.

We should decide whether the APIs that currently receive only a uint64 runTime should get a new uint64 cpuTime parameter, and hence not be extensible.

Or whether we should add another interface that would be extensible, with plugin writers calling its methods to get timings. The downside is that this would be slower.
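If both elapsed and CPU times end up being recorded per request, the helper-view calculation mentioned above might look like this sketch (the view name and the TOTAL_CPU_TIME column are assumptions, since CPU-time collection is still under discussion):

```sql
-- Hypothetical helper view deriving wait time as elapsed minus CPU time.
CREATE VIEW PLG$PROF_REQUEST_TIMES
AS SELECT PROFILE_ID,
          STATEMENT_ID,
          REQUEST_ID,
          TOTAL_ELAPSED_TIME,
          TOTAL_CPU_TIME,
          TOTAL_ELAPSED_TIME - TOTAL_CPU_TIME AS TOTAL_WAIT_TIME
     FROM PLG$PROF_REQUESTS;
```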

@dyemanov (Member)

Yes, there is no way to get wait time from the system. But we can measure "logical waits" ourselves at most points where we check out from the engine. Some of them won't be honest (I/O wait time will actually be I/O CPU time when reads are performed from the filesystem cache, for example), but this is OK as they're expected to be short and unlikely to be noticeable as "top" waits. If the CPU time is also measured, then we could calculate "wait" as "total time - CPU time" and it should be more or less real, but I doubt it is going to be useful. So I'm somewhat skeptical about whether we really need CPU time...

Add ProfilerStats interface and pass it to plugin instead of runTime parameter.

Rename *_TIME columns to *_ELAPSED_TIME.
@asfernandes (Member Author)

Does anyone see a blocking point to having this merged into master?

@AlexPeshkoff (Member) commented Oct 11, 2022 via email

@asfernandes (Member Author) commented Oct 11, 2022 via email

@pavel-zotov
@@@ QA issue @@@
The test verifies only the example from doc/sql.extensions/README.profiler.md.
More complex checks will be implemented later.

@asfernandes asfernandes deleted the work/profiler-plugin branch April 9, 2024 00:34