-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL][XPTI] Revisit resource management strategy #4494
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
[SYCL][XPTI] Revisit resource management strategy #4494
Conversation
This PR depends on #4462 |
4da67e0
to
73115fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, just copy-past errors that I pointed in comments. This also suggests that proper testing is missing, since this wasn't noted before.
Right, tests were disabled for a long time and are hardly buildable right now due to warnings implied by compiler upgrades. I'm working on cleaning that up and enabling it in CI. I'll convert this PR to draft till tests are merged. UPD: tests for this patch are included into intel/llvm-test-suite#458, which depends on #4545 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
sycl/source/detail/xpti_registry.hpp
Outdated
@@ -31,6 +31,15 @@ inline constexpr const char *SYCL_PIDEBUGCALL_STREAM_NAME = "sycl.pi.debug"; | |||
|
|||
class XPTIRegistry { | |||
public: | |||
void initializeFrameworkOnce() { | |||
#ifdef XPTI_ENABLE_INSTRUMENTATION | |||
if (!MInitialized) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need a thread guard to avoid race conditions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Right now this is only called from a single place, which is guarded std::call_once
, but the future instrumentations will initialize streams closer to use place, so that we don't notify tools about streams we're not going to use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have replaced this with std::call_once, it better demonstrates my intentions and is definitely thread safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@andykaylor @smaslov-intel I fixed merge conflicts, could you please re-review? |
* sycl: (108 commits) [SYCL][XPTI] Revisit resource management strategy (intel#4494) [SYCL][ESIMD] Fix misprint: ESIMD_L1_FLUASH_RO_DATA -> ESIMD_L1_FLUSH_RO_DATA (intel#4681) [SYCL] Make kernel_bundle interop more conformant (intel#4672) [SYCL] Submission with kernel parameter ignores set kernel bundle (intel#4667) [SYCL] Add support for std::byte to vec class (intel#4637) [BuildBot] Uplift CPU/FPGAEMU RT version for CI Process (intel#4671) [SYCL] Fix an error on host when big image is used on opencl:gpu (intel#4668) [SYCL] Exclude exported symbols from kernel bundles (intel#4660) Revert "[SYCL] Allow overriding plugin libraries (intel#4067)" (intel#4659) [SYCL] Handle exceptions on mutually exclusive handler operations (intel#4639) [sycl-post-link] Don't split module if function pointer has a user that's not CallInst (intel#4657) [SYCL][HIP] Fix MemBufferFill for nvidia platform (intel#4629) [SYCL][Doc] Describe DPC++ CUDA install w/ non-standard toolkit loc (intel#4663) [SYCL] Fix device code instrumentation (intel#4615) Remove myself as a code owner (intel#4653) [SYCL] Fix overwriting insert to sub_group_mask (intel#4656) [x86][Matrix] Replace packed_a with row_major in matrix testcases (intel#4641) [SYCL][Doc] Add device global extension spec (intel#4233) [sycl-post-link] Adds property listing exported functions (intel#4626) [Driver][SYCL] Do not consider non-archive files for FPGA binary checks (intel#4644) ...
* sycl: (107 commits) [SYCL][XPTI] Revisit resource management strategy (intel#4494) [SYCL][ESIMD] Fix misprint: ESIMD_L1_FLUASH_RO_DATA -> ESIMD_L1_FLUSH_RO_DATA (intel#4681) [SYCL] Make kernel_bundle interop more conformant (intel#4672) [SYCL] Submission with kernel parameter ignores set kernel bundle (intel#4667) [SYCL] Add support for std::byte to vec class (intel#4637) [BuildBot] Uplift CPU/FPGAEMU RT version for CI Process (intel#4671) [SYCL] Fix an error on host when big image is used on opencl:gpu (intel#4668) [SYCL] Exclude exported symbols from kernel bundles (intel#4660) Revert "[SYCL] Allow overriding plugin libraries (intel#4067)" (intel#4659) [SYCL] Handle exceptions on mutually exclusive handler operations (intel#4639) [sycl-post-link] Don't split module if function pointer has a user that's not CallInst (intel#4657) [SYCL][HIP] Fix MemBufferFill for nvidia platform (intel#4629) [SYCL][Doc] Describe DPC++ CUDA install w/ non-standard toolkit loc (intel#4663) [SYCL] Fix device code instrumentation (intel#4615) Remove myself as a code owner (intel#4653) [SYCL] Fix overwriting insert to sub_group_mask (intel#4656) [x86][Matrix] Replace packed_a with row_major in matrix testcases (intel#4641) [SYCL][Doc] Add device global extension spec (intel#4233) [sycl-post-link] Adds property listing exported functions (intel#4626) [Driver][SYCL] Do not consider non-archive files for FPGA binary checks (intel#4644) ...
This reverts commit c91b3b8.
Global objects are typically destroyed on process tear down, but the order of the destructor calls is undefined. Since SYCL applications can potentially call SYCL APIs from global context, XPTI and all of its resources have to outlive user application. This patch refactors XPTI proxy library and framework to allocate global objects on heap and manages their lifetime based on communications from traced application.
Each user of XPTI (e.g. SYCL runtime) has to call
xptiFrameworkInitialize()
once prior to any other XPTI API. When application is done collecting trace information, it must close streams and then callxptiFrameworkFinalize()
.The XPTI framework will maintain a reference counter, and will only free resources and unload libraries, when the counter hits 0. This will allow the subscribers to survive past
DllMain
call or global shared library destructor.