Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Faster LinkedHashMap tail() #2725

Merged

Conversation

j-baker
Copy link
Contributor

@j-baker j-baker commented Nov 28, 2022

Change: LinkedHashMap tail() no longer copies the entire datastructure.

Motivation: We have a use case which looks somewhat like the following.

  1. It's effectively a Queue use case (fifo).
  2. Iteration order needs to be stable.
  3. Elements in the queue frequently need to be verified as present in the queue.
  4. Occasionally, removals must occur, which are usually from the head of the queue although occasionally randomly.

We started with a Queue and eventually found performance issues from too much linear scanning. This was not obviously going to be a bottleneck since our queues are usually of very small size, but ended up so. The next type we considered was a LinkedHashSet, using the head and tail methods to implement a dequeue-like operation and the set operations being sufficient for the other operations. Unfortunately, it seems that tail at present copies the entire queue despite not seemingly needing to.

We moved to our own hand-crafted HashSet+Queue structure, but this PR seems like it'd generally be beneficial for users of Vavr! It affects LinkedHashMap, and because LinkedHashSet contains a LinkedHashMap, LinkedHashSet, too.

Change: LinkedHashMap tail() is now constant-time, from being O(size)
time and space.

Motivation: We have a use case which looks somewhat like the following.

1. It's effectively a Queue use case (fifo).
2. Iteration order needs to be stable.
3. Elements in the queue frequently need to be verified as present in
   the queue.
4. Occasionally, removals must occur, which are usually from the head of
   the queue although occasionally randomly.

We started with a Queue and eventually found performance issues from too
much linear scanning. This was not obviously going to be a bottleneck
since our queues are usually of very small size, but ended up so.
The next type we considered was a LinkedHashSet, using the `head` and
`tail` methods to implement a dequeue-like operation and the set
operations being sufficient for the other operations. Unfortunately,
it seems that `tail` at present copies the entire queue despite not
seemingly needing to.

We moved to our own hand-crafted HashSet+Queue structure, but this PR
seems like it'd generally be beneficial for users of Vavr!
Copy link
Contributor

@danieldietrich danieldietrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi James,
thanks for giving me insight into your use-case and your journey, very interesting!
The change looks good to me, I will approve and merge it now.
Thx
Daniel

@danieldietrich danieldietrich merged commit 7e2911f into vavr-io:master Nov 28, 2022
@j-baker
Copy link
Contributor Author

j-baker commented Nov 28, 2022

Thanks, Daniel! I appreciate the fast review + merge! When should I expect this code to be in a release?

@danieldietrich
Copy link
Contributor

You're welcome. I can prepare a patch release this week!

pivovarit pushed a commit that referenced this pull request Aug 19, 2024
Change: LinkedHashMap tail() is now constant-time, from being O(size)
time and space.

Motivation: We have a use case which looks somewhat like the following.

1. It's effectively a Queue use case (fifo).
2. Iteration order needs to be stable.
3. Elements in the queue frequently need to be verified as present in
   the queue.
4. Occasionally, removals must occur, which are usually from the head of
   the queue although occasionally randomly.

We started with a Queue and eventually found performance issues from too
much linear scanning. This was not obviously going to be a bottleneck
since our queues are usually of very small size, but ended up so.
The next type we considered was a LinkedHashSet, using the `head` and
`tail` methods to implement a dequeue-like operation and the set
operations being sufficient for the other operations. Unfortunately,
it seems that `tail` at present copies the entire queue despite not
seemingly needing to.

We moved to our own hand-crafted HashSet+Queue structure, but this PR
seems like it'd generally be beneficial for users of Vavr!
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants