
Support parsing empty map literal syntax for DuckDB and Generic #1361


Merged
merged 5 commits into from
Aug 4, 2024

Conversation

goldmedal
Contributor

Description

DuckDB supports creating an empty map, for example:

D select map {};
┌────────────────────────────────────────────────┐
│ main.map(main.list_value(), main.list_value()) │
│             map(integer, integer)              │
├────────────────────────────────────────────────┤
│ {}                                             │
└────────────────────────────────────────────────┘

@coveralls

coveralls commented Aug 2, 2024

Pull Request Test Coverage Report for Build 10235175469

Details

  • 11 of 11 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.001%) to 89.235%

Totals Coverage Status
Change from base Build 10206043624: 0.001%
Covered Lines: 27472
Relevant Lines: 30786

💛 - Coveralls


Ok(Expr::Map(Map { entries: fields }))
if self.peek_token().token == Token::RBrace {
let _ = self.next_token(); // consume }
Copy link


Nit: ambiguous comments

Contributor Author


Copy link


My bad, I misinterpreted it. This looks good.

self.expect_token(&Token::RBrace)?;

Ok(Expr::Map(Map { entries: fields }))
if self.peek_token().token == Token::RBrace {
Contributor

Is it possible to switch to let fields = self.parse_comma_separated0(Self::parse_duckdb_map_field)?; instead? If so, that would let us skip the if/else here.

Contributor Author

I don't think we can do it. It seems that parse_comma_separated0 is for parentheses, but MAP uses braces.
https://github.com/sqlparser-rs/sqlparser-rs/blob/d49acc67b13e1d68f2e6a25546161a68e165da4f/src/parser/mod.rs#L3487

Contributor

Ah, I see. I'm thinking we could still reuse the logic (it's currently a bit odd that parse_comma_separated0 is hardcoded to parentheses while parse_comma_separated is token-agnostic, as it should be). There seem to be only a couple of usages of parse_comma_separated0, so we could change it to something like parse_comma_separated0(|| {}, Token::RParen)?

Contributor Author

I addressed this in 0763f52 and 5d81654. Brace-, parenthesis-, and bracket-delimited lists now share the same logic. I think it looks better. Thanks.
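
The generalization discussed in this thread can be sketched outside sqlparser-rs. What follows is a minimal, self-contained toy and not the code merged in this PR: `ToyParser`, its small `Token` enum, `peek_token`, `next_token`, and `parse_number` are hypothetical stand-ins. Only the name `parse_comma_separated0` and the idea of passing an `end_token` come from the discussion.

```rust
// Toy token type (assumption); the real sqlparser-rs `Token` enum is much larger.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Number(i64),
    Comma,
    RParen,
    RBrace,
    RBracket,
    Eof,
}

// Hypothetical minimal parser over an already-tokenized input.
pub struct ToyParser {
    tokens: Vec<Token>,
    pos: usize,
}

impl ToyParser {
    pub fn new(tokens: Vec<Token>) -> Self {
        ToyParser { tokens, pos: 0 }
    }

    /// Look at the next token without consuming it.
    pub fn peek_token(&self) -> Token {
        self.tokens.get(self.pos).cloned().unwrap_or(Token::Eof)
    }

    /// Consume and return the next token.
    pub fn next_token(&mut self) -> Token {
        let tok = self.peek_token();
        if self.pos < self.tokens.len() {
            self.pos += 1;
        }
        tok
    }

    /// Element parser used for illustration: accepts a single number.
    pub fn parse_number(&mut self) -> Result<i64, String> {
        match self.next_token() {
            Token::Number(n) => Ok(n),
            other => Err(format!("expected a number, found {:?}", other)),
        }
    }

    /// Parse zero or more comma-separated items. The closing delimiter is a
    /// parameter, so `)`, `}` and `]` lists can share this one routine.
    /// `end_token` itself is left for the caller to consume.
    pub fn parse_comma_separated0<T, F>(
        &mut self,
        mut f: F,
        end_token: Token,
    ) -> Result<Vec<T>, String>
    where
        F: FnMut(&mut ToyParser) -> Result<T, String>,
    {
        // Empty list: the closing delimiter comes first.
        if self.peek_token() == end_token {
            return Ok(vec![]);
        }
        let mut items = vec![f(self)?];
        while self.peek_token() == Token::Comma {
            self.next_token(); // consume the comma
            items.push(f(self)?);
        }
        Ok(items)
    }
}
```

With this shape, an empty map body `{}` is simply the case where the first token seen is `Token::RBrace`, so the caller no longer needs its own if/else for the empty case.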

pub fn parse_comma_separated0<T, F>(
&mut self,
f: F,
trailing_commas: bool,

This comment was marked as outdated.

Contributor

Hmm, I'm not too sure that pulling out the trailing_comma option is worth it or desirable. The current behavior isn't specified at the dialect level but on the parser itself, to be overridden by the user, and I wonder if it will get confusing to have to figure out where each dialect does or doesn't support trailing commas. Would it be okay to keep the behavior as it was before (i.e. always guarding internally on self.options.trailing_comma)?

Contributor Author

> Hmm, I'm not too sure that pulling out the trailing_comma option is worth it or desirable. The current behavior isn't specified at the dialect level but on the parser itself, to be overridden by the user, and I wonder if it will get confusing to have to figure out where each dialect does or doesn't support trailing commas. Would it be okay to keep the behavior as it was before (i.e. always guarding internally on self.options.trailing_comma)?

I see. I tried more cases in DuckDB and found that the trailing-comma behavior is shared by parenthesized lists, struct literals, and list literals:

D select * from t where a in (1,2,);
┌────────┐
│   a    │
│ int32  │
├────────┤
│ 0 rows │
└────────┘
D select {'a':1,};
┌──────────────────────────┐
│ main.struct_pack(a := 1) │
│    struct(a integer)     │
├──────────────────────────┤
│ {'a': 1}                 │
└──────────────────────────┘
D select [1,2,];
┌───────────────────────┐
│ main.list_value(1, 2) │
│        int32[]        │
├───────────────────────┤
│ [1, 2]                │
└───────────────────────┘

I think we can use self.options.trailing_comma internally.
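
To make the agreed behavior concrete, here is a small self-contained sketch of guarding trailing commas on a parser-level option rather than on a per-call argument. It is a toy under stated assumptions: `Tok`, `Options`, and `parse_list` are hypothetical names invented for this example, and the `trailing_commas` field only stands in for the parser option referred to in the thread as self.options.trailing_comma.

```rust
// Toy token type (assumption), just enough for a bracketed list of numbers.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tok {
    Num(i64),
    Comma,
    RBracket,
}

// Hypothetical stand-in for the parser-wide options the user can override.
#[derive(Debug, Clone, Copy, Default)]
pub struct Options {
    pub trailing_commas: bool,
}

/// Parse zero or more comma-separated numbers up to (not including) `end`.
/// Returns the values and the number of tokens consumed.
/// A trailing comma before `end` is accepted only if the option allows it.
pub fn parse_list(
    tokens: &[Tok],
    end: Tok,
    options: Options,
) -> Result<(Vec<i64>, usize), String> {
    let mut pos = 0;
    let mut values = Vec::new();

    // Empty list: the closing delimiter comes first.
    if tokens.get(pos) == Some(&end) {
        return Ok((values, pos));
    }

    loop {
        match tokens.get(pos) {
            Some(Tok::Num(n)) => {
                values.push(*n);
                pos += 1;
            }
            other => {
                return Err(format!(
                    "expected a number at token {}, found {:?}",
                    pos, other
                ))
            }
        }
        if tokens.get(pos) != Some(&Tok::Comma) {
            break; // no separator, so the list is finished
        }
        pos += 1; // consume the comma
        // The guard lives here, inside the shared routine, so callers never
        // have to decide per call site whether trailing commas are allowed.
        if options.trailing_commas && tokens.get(pos) == Some(&end) {
            break;
        }
    }
    Ok((values, pos))
}
```

With `trailing_commas` enabled, the token sequence for `1,2,]` parses to two values; with it disabled, the same input is rejected because a number is expected after the last comma.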

self.expect_token(&Token::RBracket)?;
Ok(Expr::Array(Array { elem: exprs, named }))
}
let exprs = self.parse_comma_separated0(Parser::parse_expr, false, Token::RBracket)?;
Contributor

Do we know if this change is going to be correct for all dialects? Otherwise, we can probably keep the behavior matching the parser configuration.

Contributor Author

I'm not sure, but all the tests pass. The logic is the same as what I did for the map literal (actually, I followed it to implement the empty map). Should I roll it back?

Contributor

@iffyio iffyio left a comment

LGTM! cc @alamb

pub fn parse_comma_separated0<T, F>(
&mut self,
f: F,
end_token: Token,
Contributor

❤️ that is a nice generalization

Contributor

@alamb alamb left a comment

Thanks @goldmedal and @iffyio

@alamb alamb merged commit 8f8c96f into apache:main Aug 4, 2024
10 checks passed
@alamb
Contributor

alamb commented Aug 4, 2024

(and @git-hulk and @dharanad 🤯 )

@goldmedal goldmedal deleted the support-empty-map branch August 4, 2024 14:09
ayman-sigma pushed a commit to sigmacomputing/sqlparser-rs that referenced this pull request Nov 19, 2024
6 participants