
After conversion to LLVM we should be able to delete the inferred source of the kernel. #520


Draft · vchuravy wants to merge 7 commits into master

Conversation

vchuravy
Member

@simonbyrne has shown me a heap snapshot where the inferred source took up >>1 GB of RAM.

@codecov

codecov bot commented Sep 20, 2023

Codecov Report

Patch coverage: 88.88% and project coverage change: -7.74% ⚠️

Comparison is base (edfdc1a) 83.18% compared to head (919242d) 75.44%.
Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #520      +/-   ##
==========================================
- Coverage   83.18%   75.44%   -7.74%     
==========================================
  Files          24       24              
  Lines        3300     3270      -30     
==========================================
- Hits         2745     2467     -278     
- Misses        555      803     +248     
Files Changed       Coverage            Δ
src/jlgen.jl        77.85% <85.71%>     (-2.07%) ⬇️
src/execution.jl    67.79% <100.00%>    (-32.21%) ⬇️

... and 13 files with indirect coverage changes


@simonbyrne
Contributor

This doesn't seem to fix my issue. I'm not sure exactly where the problem is, but I did notice:

julia> GPUCompiler.GLOBAL_CI_CACHES
Dict{CompilerConfig, GPUCompiler.CodeCache} with 2 entries:
  CompilerConfig for PTXCompilerTarget => CodeCache(IdDict{MethodInstance, Vector{CodeInstance}}(MethodInstance for >>(…
  CompilerConfig for PTXCompilerTarget => CodeCache(IdDict{MethodInstance, Vector{CodeInstance}}(MethodInstance for >>(…

julia> Base.summarysize(GPUCompiler.GLOBAL_CI_CACHES) / 10^6
1396.946174

julia> Base.summarysize(collect(values(GPUCompiler.GLOBAL_CI_CACHES))[1]) / 10^6
1393.855007

julia> Base.summarysize(collect(values(GPUCompiler.GLOBAL_CI_CACHES))[2]) / 10^6
3.090233

I tried manually calling empty! on this dict: it didn't seem to make any difference, so I suspect the data is being retained somewhere else as well.
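For reference, a sketch of that attempt (assuming "this dict" means the Dict{CompilerConfig, CodeCache} shown above):

# Roughly the attempt: drop all cache entries, then force a full collection.
empty!(GPUCompiler.GLOBAL_CI_CACHES)
GC.gc(true)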

@simonbyrne
Contributor

Also, what's odd is that RES reported by top is 6.3g, but

julia> Sys.maxrss() / 10^9
17.232601088
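(Note: Sys.maxrss() wraps getrusage and reports the peak resident set size, while top's RES is the current one, so the two can diverge once memory has been returned to the OS. A sketch for a like-for-like comparison on Linux, assuming /proc is available:)

# Current RSS in bytes (Linux only), to compare against Sys.maxrss(), the peak.
function current_rss()
    for line in eachline("/proc/self/status")
        startswith(line, "VmRSS:") && return parse(Int, split(line)[2]) * 1024
    end
end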

@maleadt
Member

maleadt commented Sep 21, 2023

Removed a call to jl_uncompress_ir, as IIRC it was only needed for the 1.6 overlay hack: #151 (comment)
Maybe that also helps?

@simonbyrne
Contributor

Unfortunately still no.

@maleadt
Member

maleadt commented Sep 21, 2023

You could try taking a heap snapshot.
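(On Julia 1.9 and later this can be done from the REPL with the Profile stdlib; the resulting file opens in the Memory tab of Chrome DevTools:)

using Profile
Profile.take_heap_snapshot("gpucompiler.heapsnapshot")  # open in Chrome DevTools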

@simonbyrne
Contributor

I did that: it looks like most of it is still the inferred objects:
[screenshot, 2023-09-21: heap snapshot dominated by inferred objects]

I tried clearing them out manually:

# Walk every compiler cache and drop the inferred source of each CodeInstance.
for cache in values(GPUCompiler.GLOBAL_CI_CACHES)
    for insts in values(cache.dict)   # MethodInstance => Vector{CodeInstance}
        for inst in insts
            @atomic :release inst.inferred = nothing
        end
    end
end

that seemed to work:

[screenshot, 2023-09-21: heap snapshot after clearing, inferred objects no longer dominating]

top is still reporting 4 GB of memory usage, though, so I'm not sure what is going on.

@vchuravy
Member Author

So I am only deleting the inferred source of top-level kernel calls, since everything else is reusable.
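A hypothetical sketch of that idea (illustrative names only, not the actual patch): strip inferred just from the entry kernel's CodeInstances, leaving callee entries cached for reuse.

# Illustrative sketch; `entry_mi` stands for the kernel's MethodInstance.
function strip_entry!(cache::GPUCompiler.CodeCache, entry_mi::Core.MethodInstance)
    for ci in get(cache.dict, entry_mi, Core.CodeInstance[])
        @atomic :release ci.inferred = nothing
    end
end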

@vchuravy
Member Author

@maleadt are we tracking anywhere how big the modules we load onto the GPU are?

@maleadt
Member

maleadt commented Sep 21, 2023

@maleadt are we tracking anywhere how big the modules we load onto the GPU are?

No, and I don't know of a way to query the size of a CuModule or CuContext.
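(One illustrative workaround, not an existing GPUCompiler/CUDA.jl API: accumulate the byte size of each compiled image before it is handed to the driver, as a rough proxy for on-GPU module size:)

# Illustrative only: running total of compiled-image bytes loaded onto the GPU.
const LOADED_IMAGE_BYTES = Threads.Atomic{Int}(0)

track_image!(image::Vector{UInt8}) = (Threads.atomic_add!(LOADED_IMAGE_BYTES, sizeof(image)); image)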

maleadt force-pushed the master branch 5 times, most recently from 1d233d7 to e18b7c2 (January 20, 2025).