feat: Configure splitting on graphemes #314

Merged: afnanenayet merged 1 commit into main from afnan/input-processing-cfg-refactor on Mar 21, 2022

afnanenayet (Owner)

Allow configuring whether the input tree is split into graphemes. This
allows for a performance-precision trade-off. Splitting on graphemes is
more granular, but it means we need to allocate a metadata struct for
every single Unicode grapheme in the document. We can skip this
processing step, but then we compare the text of each node as a whole.
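
A rough sketch of the trade-off described above, not the code from this PR. The names `ProcessingConfig`, `split_units`, `Entry`, and `entries` are hypothetical. To stay within the standard library, the sketch splits on `char` (Unicode scalar values) as a stand-in for graphemes; real grapheme segmentation would need a segmentation library and can group several scalar values into one unit.

```rust
/// Hypothetical config flag; not the option name introduced by this PR.
#[derive(Debug, Clone, Copy)]
pub struct ProcessingConfig {
    /// When true, emit one entry per unit of text instead of one per node.
    pub split_units: bool,
}

/// Hypothetical per-entry metadata: a slice of the node's text plus its
/// byte offset inside that text.
#[derive(Debug, PartialEq, Eq)]
pub struct Entry<'a> {
    pub text: &'a str,
    pub offset: usize,
}

/// Turn the text of a single node into the entries that would be diffed.
pub fn entries<'a>(node_text: &'a str, cfg: ProcessingConfig) -> Vec<Entry<'a>> {
    if !cfg.split_units {
        // Cheap path: a single entry, so the node's text is compared as a whole.
        return vec![Entry { text: node_text, offset: 0 }];
    }
    // Granular path: one metadata struct per unit. ASSUMPTION: `char` is used
    // here as an approximation of a grapheme.
    node_text
        .char_indices()
        .map(|(i, c)| Entry {
            text: &node_text[i..i + c.len_utf8()],
            offset: i,
        })
        .collect()
}
```

With `split_units` off, a node yields one entry regardless of its length. With it on, the number of allocated entries grows with the length of the text, which is the cost the description refers to.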

This also does some organizational refactoring to clean up the former
`ast` module. The input processing method and its supporting functions
have been cleaned up. The method that computes the diff between two
entry vectors has been moved into the `diff` module, which seems more
appropriate.

@afnanenayet merged commit b89d2fa into main on Mar 21, 2022
@afnanenayet deleted the afnan/input-processing-cfg-refactor branch on March 21, 2022 at 03:07