Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Do not refcount Trace instances in TraceItem API #721

Merged
merged 1 commit into from
Jun 8, 2023

Conversation

jasnell
Copy link
Member

@jasnell jasnell commented May 31, 2023

The underlying io Trace objects are kj heap allocated. They cannot be referenced unprotected by the TraceItem jsg object and we do not want to restrict them to the IoContext lifetime. Instead we'll extract the detail we need on TraceItem creation.

@jasnell jasnell force-pushed the jsnell/fixup-refcounted-trace-items branch from a8d36ec to c270e69 Compare May 31, 2023 17:38
Copy link
Contributor

@ohodson ohodson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay here, but definitely covers ground I'm not familiar with.

@jasnell jasnell force-pushed the jsnell/fixup-refcounted-trace-items branch from c270e69 to cbda22e Compare June 1, 2023 14:21
@kentonv kentonv requested review from jclee and removed request for kentonv June 5, 2023 14:34
@kentonv
Copy link
Member

kentonv commented Jun 5, 2023

Added @jclee as a reviewer and removed myself since he wrote a lot of this originally.

src/workerd/api/trace.c++ Show resolved Hide resolved
src/workerd/api/trace.h Show resolved Hide resolved
src/workerd/api/trace.c++ Outdated Show resolved Hide resolved
@jclee
Copy link
Contributor

jclee commented Jun 7, 2023

So, the motivation for the original design was to minimize allocation overhead, if a single trace record needed to pass through several trace handlers, each of which might only examine a portion of the data. I guess this hasn't turned out to be a major concern today, with only a few trace handlers installed. It does mean, though, that an entire copy of the trace data needs to be made on each trace handler invocation, which seems like a pity, for something that is generally request-scoped, or at least pipeline-scoped, so the work done during the trace handler will be proportional to the amount of trace data recorded regardless of what the trace handler is doing... e.g. if it's only being used to collect a few request metrics.

But maybe I'm overthinking it... The cost wouldn't be more than was already paid during the trace recording, and we do have safeguards in place to limit the size of the recorded data.

If I'm understanding the problem correctly:

  • We see refcount errors due to the refcount being modified in multiple threads.
  • This is most likely due to accesses in a GC thread conflicting with an access in a request thread.
  • One way we address similar problems is by using kj::AtomicRefcounted, but we don't want to do that here due to overhead.
  • We address similar situations for kj I/O objects via IoOwn, but that assumes the value is request-scoped, which might not be the case for these objects.
  • We don't have an existing analogous solution for things that are pipeline-scoped.

Hm... I suppose if we have to make a copy, long-term, it might be more efficient to just pass around something that's a bunch of views onto a single arena allocation, such the allocation and copy can just be done once. Or, perhaps, just a rpc::Trace::Reader?

@jasnell
Copy link
Member Author

jasnell commented Jun 7, 2023

It does mean, though, that an entire copy of the trace data needs to be made on each trace handler invocation...

We could potentially get around that by following a pattern similar to the new Detail struct this PR introduces. Specifically, use kj::Refcounted instances that are guaranteed to be own entirely by the isolate rather than split between the kj heap and the isolate heap. I do have to wonder, however, how much of an actual issue that's going to be tho.

Hm... I suppose if we have to make a copy, long-term, it might be more efficient to just pass around something that's a bunch of views onto a single arena allocation, such the allocation and copy can just be done once. Or, perhaps, just a rpc::Trace::Reader?

Maybe? I've not dug in that far on this. There's likely lots of better ways of improving this.

The underlying io Trace objects are kj heap allocated. They cannot be
referenced unprotected by the TraceItem jsg object and we do not want
to restrict them to the IoContext lifetime. Instead we'll extract the
detail we need on TraceItem creation.
@jasnell jasnell force-pushed the jsnell/fixup-refcounted-trace-items branch from cbda22e to 787c07e Compare June 7, 2023 20:40
@jasnell jasnell merged commit 39be22e into main Jun 8, 2023
@jasnell jasnell deleted the jsnell/fixup-refcounted-trace-items branch June 8, 2023 15:05
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants