Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

fix: Make fewer copies when building a string array #5503

Merged
merged 2 commits into from
Aug 13, 2024

Conversation

mhilton
Copy link
Contributor

@mhilton mhilton commented Aug 13, 2024

Change the StringBuilder to only create an arrow builder when it becomes necessary. Whist the StringBuilder has only a single distinct value it will just record how long the array is. Creating an arrow.StringBuilder when a second distinct value is added.

Checklist

Dear Author 👋, the following checks should be completed (or explicitly dismissed) before merging.

  • ✏️ Write a PR description, regardless of triviality, to include the value of this PR
  • 🔗 Reference related issues
  • 🏃 Test cases are included to exercise the new code
  • 🧪 If new packages are being introduced to stdlib, link to Working Group discussion notes and ensure it lands under experimental/
  • 📖 If language features are changing, ensure docs/Spec.md has been updated

Dear Reviewer(s) 👋, you are responsible (among others) for ensuring the completeness and quality of the above before approval.

"In the last few months a performance degradation has crept in when
processing flux. One possible cause was the changes that were made
to the way string arrays were built, which always built a full array
even if it would ultimately not be needed. This change ensures that
a full arrow array is only created when necessary.
@mhilton mhilton requested a review from a team as a code owner August 13, 2024 06:41
@mhilton mhilton requested a review from btasker August 13, 2024 06:42
array/builder.go Outdated
return
}
b.makeBuilder(buf)

}

// Append appends a string to the array being built. The input string
// will always be copied.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor point: the second half of this comment was true prior to this change, but now isn't

@mhilton mhilton merged commit c2433e6 into master Aug 13, 2024
7 checks passed
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants