Skip to content

AI Driven Context Enhancements. Add Variable Names to the Casting Hint. #190

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

kirkw
Copy link
Contributor

@kirkw kirkw commented Jun 9, 2025

Having used this code for years. Here is my second attempt to give back.
The message:
Detail: cast "integer" value to "bigint" type
now indicates the variable, if available:
Detail: cast "integer" value to "bigint" type: variable "y"

in complex statements, it was ALWAYS a struggle to figure out which BIGINT/int was acting up.
NOW this is a proof of concept Pull request.

It had to refactor a bit of code to get the variable names to come through.
AGAIN, this is proof of concept. If you kinda like it and agree with the concept but REQUIRE changes.
Please just provide the feedback.

ALSO, I apologize that these are 3 commits. They are not quite accurate in the naming.
I do not know GIT very will. And there should probably be a "git smash" or something to combine them.
TOGETHER they work.

Feedback is appreciated.

I wasn't sure if I should attach my tests. I will include them here. (RETURN statements will only display a variable if one is used).
CREATE OR REPLACE FUNCTION public.test_fail()
RETURNS text
LANGUAGE plpgsql
AS $function$
DECLARE
x int; y bigint; z numeric;
BEGIN
Select 1,1,3
into
x,y,z;
PERFORM 'PRAGMA:ECHO:TEST_FAIL @@name';
return 'test'::varchar;
END;
$function$;

and:
CREATE OR REPLACE FUNCTION public.test_fail2()
RETURNS text
LANGUAGE plpgsql
AS $function$
DECLARE
x int; y bigint; z numeric;
BEGIN
Select 1,1,3
into
x,y,z;
PERFORM 'PRAGMA:ECHO:TEST_FAIL @@name';
return x;
END;
$function$;

@okbob
Copy link
Owner

okbob commented Jun 9, 2025

What is a target of this patch? Display the variable name can be good idea.

I am not sure about building stmt_info - same content should be known from context?

The code

switch (cstate->estate->err_stmt->cmd_type)
						{
							case PLPGSQL_STMT_ASSIGN:
								appendStringInfo(&stmt_info, " in assignment statement");
								break;
							case PLPGSQL_STMT_FETCH:
								appendStringInfo(&stmt_info, " in FETCH INTO statement");
								break;
							case PLPGSQL_STMT_GETDIAG:
								appendStringInfo(&stmt_info, " in SELECT INTO statement");
								break;
							case PLPGSQL_STMT_EXECSQL:
								appendStringInfo(&stmt_info, " in EXECUTE statement");
								break;
							case PLPGSQL_STMT_DYNEXECUTE:
								appendStringInfo(&stmt_info, " in EXECUTE USING statement");
								break;
							default:
								appendStringInfo(&stmt_info, " in %s statement", 
									plpgsql_check__stmt_typename_p(cstate->estate->err_stmt));
						}

is repeated 3x.

a) I am not sure if it is necessary - same information should be visible from context.

b) What for these statements you explicitly build string, although for all cases

appendStringInfo(&stmt_info, " in %s statement", 
									plpgsql_check__stmt_typename_p(cstate->estate->err_stmt));

should be enough?

Regress tests are every time good idea

any typo and white chars modifications should be in separate pull request - it is hard to review code when you mix some new functionality with cleaning white chars. Probably there is more formatting issues, and then probably the best thing can be using pgident - but it should be separate pull request.

Don't worry about git. plpgsql_check is not in upstream, and his git history is not extra great. I am not git expert too. Maybe one day I'll change it to some stricter process, but now - because mostly changes in plpgsql is from me, I don't see this as extra important. Now - am use mainly one git super command git rebase -i.

More important is a specification of goals - what is problem, what is an expected change and why?

Hidden cast from int to bigint can be very common in Postgres - just you can write

do $$
declare v bigint;
begin
v := 1;
end;
$$

plpgsql_parser is dumb - 1 is small int always. And without explicit casting, the cast is hidden and repeated. It can be a performance problem only in cycles or if you use it in functions that are called per row over billions rows. Fortunately new implementation of assign command (from Postgres 10) is smarter and the constant casted to bigint is stored in cached plan - so the performance impact is not too visible like in 9.x versions.

@kirkw
Copy link
Contributor Author

kirkw commented Jun 9, 2025

Pavel,
Apologies. I unfortunately learn by making mistakes :-)
There were a lot here, and I will have to setup the PG version testing 14 - 17. Good to know (soon to add 18).
Not sure the best way to organize all of those, I have a special build box I am using...
What do you recommend for all of the PG versions? (Separate Folders + Environment Variables on which one to test)?

We cleaned our code to the degree that we have ZERO reported hints/warnings/errors. And it is a requirement to
keep it that way. Honestly, I wish there were a fine tuned easy way to limit some HINTS, like this one in DECLARE/RETURN.

But having fixed these in 300 routines. It was GUESSWORK as to which line, which variable.
The goal was to add the variable name, so AT LEAST we can figure it out faster.

The patch, while accomplishing this. Breaks regressions, and is poorly cobbled together.
I am more than willing to cleanup the patch for this feature. It would have saved me MANY HOURS.
[At one point, I was inserting lines and re-checking, just to see if I was before/after the line number].

But to answer your question. YES, having the full statement WOULD be 80% and sufficient.

When you have 10-12 variables you are initializing, you have to pick through them, one at a time.

Finally, should this additional "Show the statement?" be driven by an extra flag? (It seems so)??

Thanks for the comments. Let me know how you would like me to proceed.
But you can trash this pull request, it has serious problems. But I want to help get some progress in this direction.

@okbob
Copy link
Owner

okbob commented Jun 9, 2025

I afraid so plpgsql_check_function with new content can be unreadable. Maybe result of plpgsql_check_function_tb can be enhanced about some columns - like variable_name, statement_name, query.

@kirkw
Copy link
Contributor Author

kirkw commented Jun 9, 2025

I afraid so plpgsql_check_function with new content can be unreadable.
Agreed. the entire SQL statement is just too much.
That's why I reduced it to the variable names.

If I rework the patch, and do a better implementation that shows the variable name...
Are you interested?

@okbob
Copy link
Owner

okbob commented Jun 9, 2025 via email

@okbob
Copy link
Owner

okbob commented Jun 9, 2025 via email

@kirkw
Copy link
Contributor Author

kirkw commented Jun 10, 2025

prepare and test it against pg13 - pg18. With scripts it is not hard work. https://github.com/okbob/home/tree/main/bin

Thank you for sharing your scripts. I've forked them, and will work them into my system next weekend.
I will work on running and testing in all of the versions you have listed.
Then I will "train my dragon (AI)" to use the scripts. And work on the proper rules/prompts and validate the CICD failures before I continue to clean/fix, rewrite this patch.
I am torn with what to call the AI... Internally we call it "My Friend". Sarcastically, we call the other devs... "Your Girlfriend (LOL) coded that for you". I just used "My Dragon" for the first time. I like it. It represents the power and the LACK of fidelity...

Again, thank you for the support and encouragement. You represent the ideals of Open Source and community!

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants