-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
feat: add wildcard expr #7
Conversation
Reviewer's Guide by SourceryThis PR refactors the AST package by splitting it into multiple files and adds support for wildcard expressions in the parser. The implementation moves each AST node type into its own file and introduces a new WildcardExpr type for handling wildcard patterns in identifiers and strings. Class diagram for AST refactoring and WildcardExpr additionclassDiagram
class Expr {
<<interface>>
+Pos() int
+End() int
+String() string
}
class ParenExpr {
+L int
+R int
+Expr Expr
+Pos() int
+End() int
+String() string
}
Expr <|-- ParenExpr
class CombineExpr {
+LeftExpr Expr
+Keyword token.Kind
+RightExpr Expr
+Pos() int
+End() int
+String() string
}
Expr <|-- CombineExpr
class BinaryExpr {
+pos int
+Field string
+Operator token.Kind
+Value Expr
+HasNot bool
+Pos() int
+End() int
+String() string
}
Expr <|-- BinaryExpr
class Literal {
+pos int
+end int
+Kind token.Kind
+Value string
+WithDoubleQuote bool
+Pos() int
+End() int
+String() string
}
Expr <|-- Literal
class WildcardExpr {
+Literal *Literal
+Indexes []int
+Pos() int
+End() int
+String() string
}
Expr <|-- WildcardExpr
Literal <|-- WildcardExpr
note for WildcardExpr "New class for handling wildcard patterns in identifiers and strings"
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Here's the code health analysis summary for commits Analysis Summary
Code Coverage Report
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @laojianzi - I've reviewed your changes and they look great!
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
||
var indexes []int | ||
|
||
runes := []rune(tok.Value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: The wildcard detection logic needs to properly handle escaped sequences
The current implementation only checks for a single backslash before '*', but doesn't handle cases where the backslash itself is escaped (e.g. "foo\*bar"). Consider implementing proper escape sequence parsing that accounts for all levels of escaping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Progressing #3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Auto Pull Request Review from LlamaPReview
1. Overview
1.1 PR Summary
- Purpose and scope of changes: Introduce support for wildcard expressions in the KQL parser, allowing for flexible pattern matching with identifiers and strings. The changes involve refactoring the existing expression handling to accommodate this new feature and introducing extensive tests to validate the implementation.
- Key components modified: AST package and parser logic.
- Impact assessment: Improved query flexibility and robustness, potentially increasing the complexity of the parser logic.
1.2 Architecture Changes
- System design modifications: Split the AST package into multiple files for better organization. Each AST node type (Binary, Combine, Literal, Paren, Wildcard) is moved into separate files.
- Component interactions: The parser is modified to detect and handle wildcard characters (*) in identifiers and strings.
- Integration points: The new WildcardExpr type is introduced to handle wildcard patterns, and corresponding test files are added for each AST node type.
2. Detailed Technical Analysis
2.1 Code Logic Deep-Dive
Core Logic Changes
Wildcard Detection Logic
- File Path:
parser/parser.go
- Submitted PR Code:
func (p *defaultParser) parseWildcard() (ast.Expr, error) { kind := p.lexer.Token.Kind tok, err := p.expect(kind) if err != nil { return nil, err } pos, end := tok.Pos, tok.End if kind == token.TokenKindString { // with double quote " pos -= 1 end += 1 } lit := ast.NewLiteral(pos, end, kind, tok.Value) if kind != token.TokenKindIdent && kind != token.TokenKindString { return lit, nil } var indexes []int runes := []rune(tok.Value) for i := range runes { if runes[i] == '*' && (i == 0 || runes[i-1] != '\\') { // skip escaped wildcard indexes = append(indexes, i) } } if len(indexes) == 0 { // not found wildcard return lit, nil } return ast.NewWildcardExpr(lit, indexes), nil }
- Analysis:
- The current logic only checks for a single backslash before '*'.
- Does not handle cases where the backslash itself is escaped (e.g. "foo\*bar").
- LlamaPReview Suggested Improvements:
func (p *defaultParser) parseWildcard() (ast.Expr, error) { kind := p.lexer.Token.Kind tok, err := p.expect(kind) if err != nil { return nil, err } pos, end := tok.Pos, tok.End if kind == token.TokenKindString { // with double quote " pos -= 1 end += 1 } lit := ast.NewLiteral(pos, end, kind, tok.Value) if kind != token.TokenKindIdent && kind != token.TokenKindString { return lit, nil } var indexes []int runes := []rune(tok.Value) escape := false for i := range runes { if !escape && runes[i] == '\\' { escape = true continue } if runes[i] == '*' && !escape { indexes = append(indexes, i) } escape = false } if len(indexes) == 0 { // not found wildcard return lit, nil } return ast.NewWildcardExpr(lit, indexes), nil }
- Submitted PR Code:
2.2 Implementation Quality
- Code organization and structure: The AST package is now better organized with each node type in its own file.
- Design patterns usage: The parser logic is modular and follows good design principles.
- Error handling approach: Errors are handled gracefully, but there is room for improvement in the wildcard detection logic.
- Resource management: Memory usage may increase slightly due to the storage of wildcard indexes.
3. Critical Findings
3.1 Potential Issues
-
🔴 Critical Issues
- Issue description: The wildcard detection logic needs to properly handle escaped sequences.
- Impact: Incorrect parsing of escaped wildcards can lead to unexpected behavior.
- Recommendation: Improve the wildcard detection logic to handle multiple levels of escaping.
-
🟡 Warnings
- Warning description: Ensure comprehensive test coverage for various wildcard patterns.
- Potential risks: Missing edge cases can lead to incomplete testing.
- Suggested improvements: Add more test cases to cover edge cases and complex patterns.
3.2 Code Quality Concerns
- Maintainability aspects: The code is well-organized and modular.
- Readability issues: The code is generally readable, but the wildcard detection logic could be clearer.
- Performance bottlenecks: The parser's performance with wildcard expressions needs to be validated through performance testing.
4. Security Assessment
4.1 Security Considerations
- Authentication/Authorization impacts: No direct impact.
- Data handling concerns: No direct impact.
- Input validation: Ensure that wildcard patterns do not introduce new attack vectors.
- Security best practices: Validate that the parser correctly handles escaped sequences to prevent injection attacks.
4.2 Vulnerability Analysis
- Potential security risks: Incorrect handling of escaped wildcards.
- Mitigation strategies: Implement proper escape sequence parsing.
- Security testing requirements: Test the parser with various wildcard patterns to ensure correct behavior.
5. Testing Strategy
5.1 Test Coverage
- Unit test analysis: Ensure all AST node types are thoroughly tested.
- Integration test requirements: Test the parser with complex queries containing wildcards.
- Edge cases coverage: Cover cases with multiple wildcards and escaped characters.
5.2 Test Recommendations
Suggested Test Cases
func TestWildcard(t *testing.T) {
type args struct {
pos int
end int
kind token.Kind
value string
withDoubleQuote bool
indexes []int
}
cases := []struct {
name string
args args
wantPos int
wantEnd int
wantString string
wantIndexes []int
}{
{
name: "only wildcard on ident",
args: args{
end: 1,
kind: token.TokenKindIdent,
value: "*",
},
wantEnd: 1,
wantString: "*",
wantIndexes: []int{0},
},
{
name: "only wildcard on string",
args: args{
pos: 1,
end: 2,
kind: token.TokenKindString,
value: "*",
withDoubleQuote: true,
},
wantPos: 1,
wantEnd: 2,
wantString: `"*"`,
wantIndexes: []int{1},
},
{
name: "int value with wildcard on ident",
args: args{
end: 3,
kind: token.TokenKindIdent,
value: "4*9",
},
wantEnd: 3,
wantString: "4*9",
wantIndexes: []int{1},
},
{
name: "int value with multi-wildcard on ident",
args: args{
end: 3,
kind: token.TokenKindIdent,
value: "*0*",
},
wantEnd: 3,
wantString: "*0*",
wantIndexes: []int{0, 2},
},
{
name: "float value with wildcard on ident",
args: args{
end: 4,
kind: token.TokenKindIdent,
value: "0.*9",
},
wantEnd: 4,
wantString: "0.*9",
wantIndexes: []int{2},
},
{
name: "float value with multi-wildcard on ident",
args: args{
end: 4,
kind: token.TokenKindIdent,
value: "*.9*",
},
wantEnd: 4,
wantString: "*.9*",
wantIndexes: []int{0, 3},
},
{
name: "string value with wildcard on ident",
args: args{
end: 3,
kind: token.TokenKindIdent,
value: "f*o",
},
wantEnd: 3,
wantString: "f*o",
wantIndexes: []int{1},
},
{
name: "string value with multi-wildcard on ident",
args: args{
end: 3,
kind: token.TokenKindIdent,
value: "*o*",
},
wantEnd: 3,
wantString: "*o*",
wantIndexes: []int{0, 2},
},
{
name: "value with wildcard on string",
args: args{
end: 5,
kind: token.TokenKindString,
value: "f*o",
withDoubleQuote: true,
},
wantEnd: 5,
wantString: `"f*o"`,
wantIndexes: []int{2},
},
{
name: "value with multi-wildcard on string",
args: args{
end: 5,
kind: token.TokenKindString,
value: "*o*",
withDoubleQuote: true,
},
wantEnd: 5,
wantString: `"*o*"`,
wantIndexes: []int{1, 3},
},
}
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
expr := ast.NewLiteral(c.args.pos, c.args.end, c.args.kind, c.args.value)
assert.Equal(t, c.wantPos, expr.Pos())
assert.Equal(t, c.wantEnd, expr.End())
assert.Equal(t, c.wantString, expr.String())
})
}
}
- Coverage improvements: Aim for at least 90% test coverage.
- Performance testing needs: Measure the performance impact of wildcard expressions on large queries.
6. Documentation & Maintenance
6.1 Documentation Requirements
- API documentation updates: Document the new
WildcardExpr
type and its usage. - Architecture documentation: N/A
- Configuration changes: N/A
- Usage examples: N/A
6.2 Maintenance Considerations
- Long-term maintainability: The code is well-organized and modular.
- Technical debt assessment: Address the wildcard detection logic to handle escaped sequences.
- Monitoring requirements: Monitor the performance of the parser in production.
7. Deployment & Operations
7.1 Deployment Impact
- Deployment strategy: Follow standard deployment practices.
- Rollback plan: N/A
- Configuration changes: N/A
7.2 Operational Considerations
- Monitoring requirements: Track any errors related to wildcard expressions.
- Performance metrics: Measure the parser's performance with wildcard expressions.
- Resource utilization: Ensure efficient resource usage.
8. Summary & Recommendations
8.1 Key Action Items
- Fix the wildcard detection logic to handle multiple levels of escaping (High Priority).
- Add comprehensive test cases for various wildcard patterns and edge cases (Medium Priority).
- Optimize the parsing logic for wildcards to improve performance (Low Priority).
8.2 Future Considerations
- Long-term improvements: Continuously improve the parser's efficiency.
- Technical debt items: Address any remaining issues with the wildcard detection logic.
- Scalability considerations: Ensure the parser remains efficient with the new feature.
By addressing the identified issues and following the recommended actions, this PR can be improved to ensure a robust and efficient implementation of wildcard expressions in the KQL parser.
Summary by Sourcery
Add support for wildcard expressions in the parser, enabling pattern matching with identifiers and strings. Refactor existing expression handling to accommodate this new feature and introduce extensive tests to validate the implementation.
New Features:
Tests: