Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Remove and ban usage of [Inheritable]ThreadLocal #463

Open
1 task done
snazy opened this issue Nov 21, 2024 · 12 comments · Fixed by #589
Open
1 task done

Remove and ban usage of [Inheritable]ThreadLocal #463

snazy opened this issue Nov 21, 2024 · 12 comments · Fixed by #589
Assignees
Labels
1.0-blocker bug Something isn't working
Milestone

Comments

@snazy
Copy link
Member

snazy commented Nov 21, 2024

Is this a possible security vulnerability?

  • This is NOT a possible security vulnerability

Describe the bug

While [Inheritable]ThreadLocals are relatively easy to start with to pass along request related information, ThreadLocals come with a non-negligible cost and maintenance burden:

  • TLs can cause very hard to detect memory leaks as objects (and classes!) are (often permanently) attached to a thread.
  • TLs and their usage are hard to test
  • (The use of) TLs can accidentally share data across requests
  • Use of TLs becomes complex and hard to reason about

The proper way of sharing request related information is to use CDI's @RequestScoped beans.

To Reproduce

No response

Actual Behavior

No response

Expected Behavior

No response

Additional context

No response

System information

No response

@adutra
Copy link
Contributor

adutra commented Feb 4, 2025

Reopening since #589 will be reverted soon.

@adutra adutra reopened this Feb 4, 2025
@adutra
Copy link
Contributor

adutra commented Feb 4, 2025

A few datapoints:

https://quarkus.io/guides/duplicated-context:

When using a traditional, blocking, and synchronous framework, processing of each request is performed in a dedicated thread. So, the same thread is used for the entire processing. [...] When you need to propagate data along the processing [...] you can use ThreadLocals. [...] When using a reactive and asynchronous execution model [...] the same thread can be used to handle multiple concurrent processing. Thus, you cannot use ThreadLocals as the values would be leaked between the various concurrent processing.

https://quarkus.io/guides/context-propagation:

Traditional blocking code uses ThreadLocal variables to store contextual objects in order to avoid passing them as parameters everywhere [...] If you write reactive/async code [...] try/finally blocks as well as ThreadLocal variables stop working, because your reactive code gets executed in another thread, after the caller ran its finally block.

https://stackoverflow.com/questions/76468966/requestscope-vs-threadlocal-for-mutlti-tenant-in-quarkus-with-mutiny:

Using @RequestScope with Context Propagation is the way to go. Dealing with ThreadLocal manually is going to be very error prone.

@flyrain
Copy link
Contributor

flyrain commented Mar 12, 2025

Thanks for filing this issue. IIUC, the community has agreed on replacing LocalThread with @RequestScoped. I'd encourage any volunteer to take this. As the context, the current production system with LT didn't hit any significant issues. That's said, it is more a continuous improvement instead of a 1.0 blocker to me. Tentatively removed the label 1.0-blocker, still keep the milestone 1.0. We do hope this can be fixed before 1.0 release.

cc @snazy @jbonofre @adutra @dimas-b @collado-mike @eric-maynard @dennishuo @HonahX

@flyrain flyrain added this to the 1.0.0 milestone Mar 12, 2025
@flyrain flyrain closed this as completed by moving to Done in Basic Kanban Board Mar 12, 2025
@flyrain flyrain removed bug Something isn't working 1.0-blocker labels Mar 12, 2025
@flyrain flyrain reopened this Mar 12, 2025
@eric-maynard
Copy link
Contributor

eric-maynard commented Mar 12, 2025

As the context, the current production system with [ThreadLocal] didn't hit any significant issue

My fear is that the issues caused by the ThreadLocal may be very subtle -- e.g. one client gets another client's cache by accident. Maybe these issues are occurring, but they have gone unnoticed. Or maybe there are no issues now, but a PR will introduce an issue soon.

I really hope that we can resolve this prior to 1.0.

@dimas-b
Copy link
Contributor

dimas-b commented Mar 12, 2025

I'd encourage any volunteer to take this.

@flyrain @eric-maynard : previous attempt at removing thread locals by @adutra (#589) was reverted in #1000.

How do you envision organizing the work on implementing this issue to avoid a similar situation?

@eric-maynard
Copy link
Contributor

It's a good question. Ideally, any interface changes can be either minimal or purely additive (i.e. CallContext is added to interfaces, nothing is removed) so that the fix can be focused on removing threadlocal rather than also removing CallContext.

Now that I am more familiar with Quarkus, I also feel that CallContext needs a refactor and that we have a few too many disjoint beans managed by CDI. But I would suggest that we decouple this work from the ThreadLocal work to keep the ThreadLocal removal as uncontroversial and as unlikely to disrupt existing deployments as possible.

@dimas-b
Copy link
Contributor

dimas-b commented Mar 13, 2025

@eric-maynard : I'm not sure we can completely avoid changing method parameters while removing thread locals because the information currently obtained from they will have to be provided by other means (CDI or direct call parameters).

Because of downstream dependencies, I'd prefer you to tackle thread locals. However, if you prefer I can have a go at it and tag you for a review... WDYT?

@flyrain
Copy link
Contributor

flyrain commented Mar 13, 2025

Thanks @dimas-b for picking this up. Assigned this to you. Let's discuss on the PR once it's ready.

@dimas-b
Copy link
Contributor

dimas-b commented Mar 13, 2025

@flyrain : Thanks for your trust, but I did not pick it up... My question to @eric-maynard above did not get an answer yet ;)

@flyrain
Copy link
Contributor

flyrain commented Mar 13, 2025

@dimas-b, sorry for the miss understanding, un-assigned, feel free to pick up later.

@snazy
Copy link
Member Author

snazy commented Mar 21, 2025

This is a serious 1.0-blocker, see reason explained here

@eric-maynard
Copy link
Contributor

Hey @dimas-b, I can try to pick this up.

@eric-maynard eric-maynard self-assigned this Mar 21, 2025
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
1.0-blocker bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants