Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Allow using external elfutils #252

Open
rbberger opened this issue Mar 3, 2023 · 6 comments
Open

Allow using external elfutils #252

rbberger opened this issue Mar 3, 2023 · 6 comments
Labels
bug Something isn't working cmake Modifies the CMake build system

Comments

@rbberger
Copy link

rbberger commented Mar 3, 2023

Tried installing v1.8.0 by adding it to Spack, but on our systems it always fails during the internal (ExternalProject) elfutils build and missing gettext symbols. While installing elfutils via Spack is easy, the current CMake setup doesn't seem to allow providing an external one to avoid this issue.

It should be straight forward to support this by following the same CMake logic found in dyninst.

@rbberger
Copy link
Author

rbberger commented Mar 7, 2023

Follow up on that issue.

Here is a workaround for the original issue I encountered.

To make v1.8.0 compile with Spack and resolve the TBB include errors and the gettext linking errors:

  • Add the new version to the omnitrace spackage:
    version("1.8.0", commit="7c73d981258cc3a29477756a95c1f90c5f8897dd", submodules=True)
    
  • Add depends_on("tbb@2018.6.0:") to the omnitrace spackage (copied from dyninst)
  • Spack install with an additional ldflag:
    spack install omnitrace@1.8.0 ldflags="-lintl"
    

@jrmadsen
Copy link
Collaborator

jrmadsen commented Mar 8, 2023

Hi @rbberger, sorry for the delay, I was on vacation last week. It was always my intention to enable this but it got sort of lost in all the changes that arose from the causal profiling support. I can potentially get this feature into the repo in the 1.8.2 release -- if any of the solutions below solve your problem, I will probably semi-delay doing this until Dyninst merges the CMake overhaul I provided them when I reworked it to be build as a subproject. They just opened a PR into their main branch. Once that is merged I will be working on this anyway -- Dyninst decided to remove the support for internal builds of Boost/TBB/etc. so I am going to have to migrate that into OmniTrace.

Anyway, onto potential short-term solutions:

I poked around and it appears adding --disable-nls should discard the need to link to libintl. I cannot think of any need for it in omnitrace since we don't output messages in any language of than English anywhere else so I added that flag to #254. Thus, potentially, spack install omnitrace@main might fix your issue once that is merged.

Furthermore, following #254 getting merged, the RedHat installers should finally be available. Thus, there should plenty of install options available via the self-extracting installer scripts (Ubuntu 18.04 - 22.04, OpenSUSE 15.x, and RedHat 8.7, 9.0, and 9.1). These are quick and easy and 100% relocatable. Even before the 1.8.1 release (hopefully tomorrow), you can go to the Actions -> Installer Packaging (CPack) tab and find the latest installers for the last merge into the main branch.

  • Add depends_on("tbb@2018.6.0:") to the omnitrace spackage (copied from dyninst)

Interesting this was necessary. Ideally, omnitrace should not have to add anything like this since Dyninst relies on TBB internally, OmniTrace does not. But then again, the CMake for your Dyninst installation was a literal nightmare so I am not surprised at all Dyninst isn't propagating the TBB include flags it needs for omnitrace to build against it.

@rbberger
Copy link
Author

rbberger commented Mar 8, 2023

@jrmadsen thanks for the detailed reply. For now, the workaround I found should be good enough. Just a heads up on what I found along the way (beside a Spack bug spack/spack#35913 😄 ). Please feel free to ignore this rambling, but maybe it's useful in case you run into this eventually.

Once I had the workaround (which I built for ROCm 5.3.0) I decided to upgrade to ROCm 5.4.3 (in Spack) to match what we have at the LLNL machines we are using. Things then got really weird because of this:

#include <hip/hip_runtime.h>
#include "hip_ostream_ops.h"
#include <hip/amd_detail/hip_prof_str.h>

https://github.com/ROCm-Developer-Tools/roctracer/blob/4c7c59eb652f6dd1f57fe0cceb2d82b03deba161/inc/roctracer_hip.h#L26-L28

The first include of <hip_runtime.h> then does this

#if USE_PROF_API
#include <hip/amd_detail/hip_prof_str.h>
#endif

https://github.com/ROCm-Developer-Tools/HIP/blob/fea9cf73ba64592cdbc6c946d03b2ad2f14b77db/include/hip/hip_runtime_api.h#L6968

because USE_PROF_API is set true from the installs CMake targets of HIP:
https://github.com/ROCm-Developer-Tools/hipamd/blob/rocm-5.4.x/src/CMakeLists.txt#L258

Which then leads to undefined roctracer compilation errors during the include of <hip_runtime.h> because the required "hip_ostream_ops.h" only gets included after.

Interestingly, I did not see this on the LLNL machine (where I used the system ROCm 5.4.3 installation), so my guess is that it's somehow related to the way Spack installs HIP, how the files are laid out differently in that case, and/or how the CMake is configured by default there.

It might be related to ROCm/roctracer@05ee3ff which would mean you could/should change includes such as

https://github.com/AMDResearch/omnitrace/blob/40e0d2a92f5c8c8837fe28576d2645fef0ffaf94/source/lib/omnitrace/library/roctracer.hpp#L31

to

#include <roctracer/roctracer.h>

Anyway, I worked around it by

#define USE_PROF_API 0
#include <hip/hip_runtime.h>
#define USE_PROF_API 1
#include "hip_ostream_ops.h"
#include <hip/amd_detail/hip_prof_str.h>

@jrmadsen
Copy link
Collaborator

jrmadsen commented Mar 8, 2023

Interesting, and no worries, I enjoy a semi-related rambling.

Luckily, we won't run into this issue though. The include paths of <roctracer.h> vs. <roctracer/roctracer.h> are accounted for in the Findroctracer.cmake so that the former is effectively the same as the latter.

Plus the dependency on the HIP runtime is only as a last resort / only really used in omnitrace-avail and thus, there's only one place the header is included (lib/core/gpu.cpp), which is completely separated from any roctracer stuff. We basically avoid any calls to the HIP runtime since if OmniTrace were to make a call to the HIP runtime to say, query the number of GPU during initialization, it would activate the ROCm tooling even if the application is not using HIP (and thus produce superfluous output related to ROCm despite the app not using it).

@jrmadsen jrmadsen added bug Something isn't working cmake Modifies the CMake build system labels Jun 22, 2023
@ppanchad-amd
Copy link

@rbberger Has your issue been resolved? If so, please close the ticket. Thanks!

@rbberger
Copy link
Author

rbberger commented Oct 7, 2024

@ppanchad-amd I'll check, now that this has become part of the official ROCm distribution it might be less of an issue.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
bug Something isn't working cmake Modifies the CMake build system
Projects
None yet
Development

No branches or pull requests

3 participants