Test Enzyme and reexport ADTypes.AutoEnzyme #1887

Closed
wants to merge 97 commits into from

devmotion
Member

@devmotion devmotion commented Sep 28, 2022

Note: This does not work yet


I opened this PR to make it easier to debug (and possibly fix) issues with Enzyme.

Currently, the following example does not work (note that the snippet does not require the PR which solely reexports AutoEnzyme at this point):

using Turing
using Enzyme
using ADTypes
Enzyme.API.runtimeActivity!(true);
Enzyme.API.typeWarning!(false);

@model function model()
    m ~ Normal(0, 1)
    s ~ InverseGamma()
    x ~ Normal(m, s)
end

sample(model() | (; x=0.5), NUTS(; adtype = ADTypes.AutoEnzyme()), 10)

With Enzyme#main my Julia (1.8.1) segfaults. An incomplete (it filled my whole terminal) output: https://gist.github.com/devmotion/1352197f2354c6fecddd7b778ec4bcf7#file-log-txt

Update: the example now works (latest releases of Turing, Enzyme, and ADTypes on Julia 1.10.0), but the following warnings show up:

warning: didn't implement memmove, using memcpy as fallback which can result in errors
warning: didn't implement memmove, using memcpy as fallback which can result in errors

@coveralls

coveralls commented Nov 13, 2022

Pull Request Test Coverage Report for Build 13975938687

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 133 unchanged lines in 5 files lost coverage.
  • Overall coverage decreased (-9.2%) to 76.86%

Files with Coverage Reduction    New Missed Lines    %
src/mcmc/abstractmcmc.jl         5                   82.35%
src/mcmc/repeat_sampler.jl       10                  50.0%
src/mcmc/Inference.jl            36                  54.81%
src/mcmc/hmc.jl                  37                  66.92%
src/mcmc/gibbs.jl                45                  63.87%

Totals
Change from base Build 13945146446: -9.2%
Covered Lines: 1116
Relevant Lines: 1452

@codecov

codecov bot commented Nov 13, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.85%. Comparing base (e4cd6a2) to head (8690786).
Report is 2 commits behind head on main.

❗ There is a different number of reports uploaded between BASE (e4cd6a2) and HEAD (8690786).

HEAD has 21 fewer uploads than BASE:
Flag    BASE (e4cd6a2)    HEAD (8690786)
        42                21
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1887      +/-   ##
==========================================
- Coverage   86.03%   76.85%   -9.18%     
==========================================
  Files          21       21              
  Lines        1454     1452       -2     
==========================================
- Hits         1251     1116     -135     
- Misses        203      336     +133     

@wsmoses
Collaborator

wsmoses commented Jun 26, 2023

Also, if you want to disable the warnings, you can set it like so (https://github.com/EnzymeAD/Enzyme.jl/blob/c29e6119c7963ddb22f1363726f762455748e193/src/api.jl#L414):

Enzyme.API.typeWarning!(false)

@wsmoses
Collaborator

wsmoses commented Jun 26, 2023

You also may want to set the version to 0.11.2 since your CI currently is running at 0.11.0 (⌃ [7da242da] Enzyme v0.11.0)
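
For anyone reproducing this locally, one way to pin the version is through Pkg. This is only a sketch: it assumes you run it inside the environment the tests use, and it does not show how the CI for this PR is actually configured.

```julia
using Pkg

# Assumption: the active project is the test environment you want to change.
# Request the exact Enzyme release mentioned above.
Pkg.add(name="Enzyme", version="0.11.2")

# Print the resolved version to confirm what was installed.
Pkg.status("Enzyme")
```

A `[compat]` entry in the project file would be the more durable way to express the same constraint; the snippet above is just the quickest check.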

@wsmoses
Collaborator

wsmoses commented Jun 27, 2023

@devmotion this PR (EnzymeAD/Enzyme.jl#914) should fix the immediate issues you see on CI if you want to try.

@yebai
Member

yebai commented Dec 1, 2024

> separately @yebai you appear to have removed my permissions to run tests, if that can be restored

I don't know what happened precisely -- some changes were made to the TuringLang repos permissions to make CI work more robustly.

@wsmoses
Collaborator

wsmoses commented Dec 2, 2024

┌ Warning: Could not use exact versions of packages in manifest, re-resolving
└ @ Pkg.Operations /opt/hostedtoolcache/julia/1.11.1/x86/share/julia/stdlib/v1.11/Pkg/src/Operations.jl:1902
ERROR: Unsatisfiable requirements detected for package DynamicPPL [366bfd00]:
 DynamicPPL [366bfd00] log:
 ├─possible versions are: 0.1.0 - 0.31.0 or uninstalled
 ├─restricted to versions [0.29, 0.30.4 - 0.31] by Turing [fce5fe82], leaving only versions: [0.29.0 - 0.29.2, 0.30.4 - 0.31.0]
 │ └─Turing [fce5fe82] log:
 │   ├─possible versions are: 0.35.3 or uninstalled
 │   └─Turing [fce5fe82] is fixed to version 0.35.3
 ├─restricted by compatibility requirements with Mooncake [da2b9cff] to versions: 0.29.0 - 0.30.5 or uninstalled, leaving only versions: [0.29.0 - 0.29.2, 0.30.4 - 0.30.5]
 │ └─Mooncake [da2b9cff] log:
 │   ├─possible versions are: 0.3.0 - 0.4.53 or uninstalled
 │   └─restricted to versions 0.4.19 - 0.4 by project [23fc8c3f], leaving only versions: 0.4.19 - 0.4.53
 │     └─project [23fc8c3f] log:
 │       ├─possible versions are: 0.0.0 or uninstalled
 │       └─project [23fc8c3f] is fixed to version 0.0.0
 └─restricted by compatibility requirements with Bijectors [76274a88] to versions: 0.31.0 or uninstalled — no versions left
   └─Bijectors [76274a88] log:
     ├─possible versions are: 0.1.0 - 0.15.2 or uninstalled
     ├─restricted to versions 0.14 - 0.15 by Turing [fce5fe82], leaving only versions: 0.14.0 - 0.15.2
     │ └─Turing [fce5fe82] log: see above
     └─restricted by compatibility requirements with Enzyme [7da242da] to versions: [0.1.0 - 0.13.16, 0.15.0 - 0.15.2] or uninstalled, leaving only versions: 0.15.0 - 0.15.2
       └─Enzyme [7da242da] log:
         ├─possible versions are: 0.1.0 - 0.13.18 or uninstalled
         └─restricted to versions 0.13 by project [23fc8c3f], leaving only versions: 0.13.0 - 0.13.18
           └─project [23fc8c3f] log: see above
Stacktrace:

I'd try to help, but I don't have permission to edit things or rerun CI xD

@penelopeysm
Member

penelopeysm commented Dec 2, 2024

It should resolve with Mooncake 0.4.54 as that allows for DPPL=0.31.0. Don't know why CI isn't picking up the new version.

└─Mooncake [da2b9cff] log:
     ├─possible versions are: 0.3.0 - 0.4.53 or uninstalled

0.4.54 should have been available a few hours ago.
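
If the resolver is working from a stale copy of the registry, refreshing it by hand is a quick way to check locally. A sketch using standard Pkg calls; whether this is what was wrong on CI is an assumption, not something the log confirms.

```julia
using Pkg

# Refresh all installed registries so newly registered releases become visible.
Pkg.Registry.update()

# Ask the resolver to move Mooncake to the newest version the compat bounds allow.
Pkg.update("Mooncake")

# Show which version was actually resolved.
Pkg.status("Mooncake")
```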

@mhauru
Member

mhauru commented Dec 4, 2024

The registry issue is sorted, now merrily running with the latest Enzyme.

@wsmoses
Collaborator

wsmoses commented Dec 4, 2024

Check ADType: Error During Test at /home/runner/work/Turing.jl/Turing.jl/test/mcmc/hmc.jl:334
  Got exception outside of a @test
  ArgumentError: Unsupported ADType: ADTypes.AutoEnzyme{Nothing, Nothing}
  Stacktrace:
    [1] Main.ADUtils.ADTypeCheckContext(adbackend::ADTypes.AutoEnzyme{Nothing, Nothing}, child::DynamicPPL.DefaultContext)
      @ Main.ADUtils ~/work/Turing.jl/Turing.jl/test/test_utils/ad_utils.jl:102
    [2] macro expansion
      @ ~/work/Turing.jl/Turing.jl/test/mcmc/hmc.jl:336 [inlined]
    [3] macro expansion
      @ /opt/hostedtoolcache/julia/1.11.2/x86/share/julia/stdlib/v1.11/Test/src/Test.jl:1704 [inlined]
    [4] macro expansion
      @ ~/work/Turing.jl/Turing.jl/test/mcmc/hmc.jl:335 [inlined]
    [5] macro expansion
      @ /opt/hostedtoolcache/julia/1.11.2/x86/share/julia/stdlib/v1.11/Test/src/Test.jl:1793 [inlined]
    [6] top-level scope
      @ ~/work/Turing.jl/Turing.jl/test/mcmc/hmc.jl:22
    [7] include(fname::String)
      @ Main ./sysimg.jl:38
    [8] macro expansion
      @ ~/.julia/packages/TimerOutputs/6KVfH/src/TimerOutput.jl:237 [inlined]
    [9] macro expansion
      @ ~/work/Turing.jl/Turing.jl/test/runtests.jl:26 [inlined]
   [10] macro expansion
      @ /opt/hostedtoolcache/julia/1.11.2/x86/share/julia/stdlib/v1.11/Test/src/Test.jl:1704 [inlined]
   [11] macro expansion
      @ ~/work/Turing.jl/Turing.jl/test/runtests.jl:56 [inlined]
   [12] macro expansion
      @ ~/.julia/packages/TimerOutputs/6KVfH/src/TimerOutput.jl:237 [inlined]
   [13] macro expansion
      @ ~/work/Turing.jl/Turing.jl/test/runtests.jl:54 [inlined]
   [14] macro expansion
      @ /opt/hostedtoolcache/julia/1.11.2/x86/share/julia/stdlib/v1.11/Test/src/Test.jl:1704 [inlined]
   [15] top-level scope
      @ ~/work/Turing.jl/Turing.jl/test/runtests.jl:34
   [16] include(fname::String)
      @ Main ./sysimg.jl:38
   [17] top-level scope
      @ none:6
   [18] eval
      @ ./boot.jl:430 [inlined]
   [19] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:296
   [20] _start()
      @ Base ./client.jl:531

Looks like something in turing needs to be updated?
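
To illustrate the kind of error in the trace: a check that dispatches on the backend type and has a catch-all method will throw for any backend nobody has added yet. The sketch below uses made-up stand-in types and a made-up function name (`backend_label`); it is not the code in test/test_utils/ad_utils.jl.

```julia
# Hypothetical stand-ins for AD backend types (the real ones live in ADTypes.jl).
abstract type AbstractBackend end
struct ForwardBackend <: AbstractBackend end
struct EnzymeLikeBackend <: AbstractBackend end
struct OtherBackend <: AbstractBackend end

# Hypothetical check: known backends get a label, everything else is rejected.
backend_label(::ForwardBackend) = :forward
backend_label(b::AbstractBackend) =
    throw(ArgumentError("Unsupported ADType: $(typeof(b))"))

# Supporting a new backend is then a matter of adding one more method.
backend_label(::EnzymeLikeBackend) = :enzyme_like

# An unregistered backend still hits the fallback and raises ArgumentError.
rejected = try
    backend_label(OtherBackend())
    false
catch err
    err isa ArgumentError
end
```

Under that assumption the fix is a one-method change in the test utilities rather than anything in Enzyme.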

@mhauru
Member

mhauru commented Dec 5, 2024

Fixed the above issue that @wsmoses pointed out.

We are seeing a lot of illegal type analysis errors, which I suspect are all instances of EnzymeAD/Enzyme.jl#2169.
Too many to reasonably mark as broken, I think we need to get that fixed first and then have another look.

@wsmoses
Collaborator

wsmoses commented Dec 5, 2024

So this is indicative of a union (which isn't presently fully supported, at least without setting Enzyme.API.strictAliasing!(false) which may permit it).

Something around here https://github.com/TuringLang/DynamicPPL.jl/blob/2252a9b6012da8e2ac56353770a0f848f6874357/src/abstract_varinfo.jl#L791 is sometimes an int and other times a double. I think this will need to be fixed on the turing side.
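
For readers unfamiliar with the "sometimes an int and other times a double" pattern, here is a generic illustration in plain Julia. It is not the DynamicPPL code at the link above, and the function names are made up; it only shows how an integer-initialised accumulator makes inference report a Union.

```julia
# Accumulator starts as an Int and becomes a Float64 inside the loop,
# so the inferred return type is a Union of the two.
function sum_unstable(xs::Vector{Float64})
    acc = 0
    for x in xs
        acc += x
    end
    return acc
end

# Initialising with a zero of the element type keeps a single concrete type.
function sum_stable(xs::Vector{Float64})
    acc = zero(eltype(xs))
    for x in xs
        acc += x
    end
    return acc
end

# Inspect what inference concludes for each version.
unstable_type = only(Base.return_types(sum_unstable, (Vector{Float64},)))
stable_type = only(Base.return_types(sum_stable, (Vector{Float64},)))
println("unstable: ", unstable_type)
println("stable:   ", stable_type)
```

`@code_warntype` on the real call would be the usual way to look for the same pattern in the model code.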

@wsmoses wsmoses mentioned this pull request Dec 6, 2024
@mhauru
Member

mhauru commented Dec 6, 2024

If

@model function gdemo_copy()
    s ~ InverseGamma(2, 3)
end

fails, and it does, then I assume most Turing models are affected, since they don't really get simpler than that. We could look into trying to chase down that Union somewhere, but I'm surprised that this is an issue given that we should have type stability at most function boundaries for such a simple model, especially as "deep" in as invlink_with_logpdf (we've made sure of that for performance reasons). It could end up taking quite a lot of time to track down the issue on the Turing side.

Enzyme.API.strictAliasing!(false) doesn't seem to save us, the simplest MWE in the issue I made still fails.

@wsmoses has something in Enzyme gotten stricter so that these illegal type analysis errors come up more often nowadays? Some of the errors are from tests that already passed at an earlier point.

@yebai
Member

yebai commented Dec 16, 2024

> So this is indicative of a union (which isn't presently fully supported, at least without setting Enzyme.API.strictAliasing!(false) which may permit it). Something around here https://github.com/TuringLang/DynamicPPL.jl/blob/2252a9b6012da8e2ac56353770a0f848f6874357/src/abstract_varinfo.jl#L791 is sometimes an int and other times a double. I think this will need to be fixed on the turing side.

Ideally, a proper fix should be added to Enzyme rather than requiring packages like Turing.jl / DynamicPPL.jl to work around it. One good reason is that Turing allows arbitrary Julia code inside the @model macro, so the same problem will resurface whenever users write code that involves a Union.

@wsmoses
Collaborator

wsmoses commented Jan 4, 2025

> > So this is indicative of a union (which isn't presently fully supported, at least without setting Enzyme.API.strictAliasing!(false) which may permit it). Something around here https://github.com/TuringLang/DynamicPPL.jl/blob/2252a9b6012da8e2ac56353770a0f848f6874357/src/abstract_varinfo.jl#L791 is sometimes an int and other times a double. I think this will need to be fixed on the turing side.
>
> Ideally, a proper fix should be added to Enzyme instead of requiring packages like Turing.jl / DynamicPPL.jl to work around it. One good reason is that Turing allows arbitrary Julia code inside the @model macro, which will get hit again if users write code that involves union.

Sure, but that's doable after this merges, so we can at least confirm and test that the things currently expected to work do work. Not all AD tools support all code (at least at any given time). Zygote historically did not, and currently does not, support mutation (and that's fine here). Enzyme historically (but not presently) didn't like type-unstable code. Mooncake presently fails on this PR/CI (see below).

Can we just mark whatever isn't working now as test_broken, open issues, and at least track where things are at? That seems to be the case with all other ADs here.
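
For reference, marking a known failure as broken with the Test standard library looks roughly like this. The two gradient functions are hypothetical placeholders, not Turing or Enzyme calls; the point is only that the suite stays green while the failure remains visible in the summary.

```julia
using Test

# Hypothetical stand-ins: one computation that works, one that currently throws.
working_gradient(x) = 2x
failing_gradient(x) = error("placeholder for a currently failing AD call")

ts = @testset "AD backends (sketch)" begin
    # Regular assertion for behaviour that is expected to work today.
    @test working_gradient(1.5) ≈ 3.0
    # Recorded as Broken instead of failing the run; it turns into an error
    # once the expression starts passing, prompting removal of the marker.
    @test_broken failing_gradient(1.5) ≈ 3.0
end
```

Because `@test_broken` reports an unexpected pass, the marker cannot silently outlive the upstream fix.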

2024-12-05T15:34:20.5424999Z   StackOverflowError:
2024-12-05T15:34:20.5425533Z   Stacktrace:
2024-12-05T15:34:20.5437430Z         [1] set_to_zero!!(x::Mooncake.PossiblyUninitTangent{Any})
2024-12-05T15:34:20.5438650Z           @ Mooncake ~/.julia/packages/Mooncake/19jl1/src/tangents.jl:658
2024-12-05T15:34:20.5439433Z         [2] tuple_map
2024-12-05T15:34:20.5440277Z           @ ~/.julia/packages/Mooncake/19jl1/src/utils.jl:46 [inlined]
2024-12-05T15:34:20.5441047Z         [3] set_to_zero!!
2024-12-05T15:34:20.5441902Z           @ ~/.julia/packages/Mooncake/19jl1/src/tangents.jl:657 [inlined]
2024-12-05T15:34:20.5442677Z         [4] set_to_zero!!
2024-12-05T15:34:20.5443513Z           @ ~/.julia/packages/Mooncake/19jl1/src/tangents.jl:663 [inlined]
2024-12-05T15:34:20.5444289Z         [5] tuple_map
2024-12-05T15:34:20.5445094Z           @ ~/.julia/packages/Mooncake/19jl1/src/utils.jl:46 [inlined]
2024-12-05T15:34:20.5445843Z         [6] set_to_zero!!
2024-12-05T15:34:20.5446672Z           @ ~/.julia/packages/Mooncake/19jl1/src/tangents.jl:657 [inlined]
2024-12-05T15:34:20.5448469Z         [7] set_to_zero!!(x::Mooncake.Tangent{@NamedTuple{gdemo_copy::Mooncake.MutableTangent{@NamedTuple{contents::Mooncake.PossiblyUninitTangent{Any}}}}})
2024-12-05T15:34:20.5450076Z           @ Mooncake ~/.julia/packages/Mooncake/19jl1/src/tangents.jl:661
2024-12-05T15:34:20.5451259Z         [8] set_to_zero!!(x::Mooncake.PossiblyUninitTangent{Any})
2024-12-05T15:34:20.5452390Z           @ Mooncake ~/.julia/packages/Mooncake/19jl1/src/tangents.jl:659
2024-12-05T15:34:20.5453468Z   --- the above 7 lines are repeated 26659 more times ---
2024-12-05T15:34:20.5491994Z    [186622] tuple_map
2024-12-05T15:34:20.5492882Z           @ ~/.julia/packages/Mooncake/19jl1/src/utils.jl:46 [inlined]
2024-12-05T15:34:20.5493655Z    [186623] set_to_zero!!
2024-12-05T15:34:20.5494514Z           @ ~/.julia/packages/Mooncake/19jl1/src/tangents.jl:657 [inlined]
2024-12-05T15:34:20.5495177Z    [186624] set_to_zero!!
2024-12-05T15:34:20.5495713Z           @ ~/.julia/packages/Mooncake/19jl1/src/tangents.jl:663 [inlined]
2024-12-05T15:34:20.5496189Z    [186625] tuple_map
2024-12-05T15:34:20.5496930Z           @ ~/.julia/packages/Mooncake/19jl1/src/utils.jl:46 [inlined]
2024-12-05T15:34:20.5497630Z    [186626] set_to_zero!!
2024-12-05T15:34:20.5498141Z           @ ~/.julia/packages/Mooncake/19jl1/src/tangents.jl:657 [inlined]
2024-12-05T15:34:20.5499255Z    [186627] set_to_zero!!(x::Mooncake.Tangent{@NamedTuple{gdemo_copy::Mooncake.MutableTangent{@NamedTuple{contents::Mooncake.PossiblyUninitTangent{Any}}}}})
2024-12-05T15:34:20.5500207Z           @ Mooncake ~/.julia/packages/Mooncake/19jl1/src/tangents.jl:661
2024-12-05T15:34:38.4612463Z dynamic model: Error During Test at /home/runner/work/Turing.jl/Turing.jl/test/mcmc/gibbs.jl:119

@wsmoses
Collaborator

wsmoses commented Mar 20, 2025

This seems like an easy fix you need on the turing side?

ADType test with ADTypes.AutoEnzyme(mode=EnzymeCore.ReverseMode{false, true, EnzymeCore.FFIABI, false, false}()): Error During Test at /home/runner/work/Turing.jl/Turing.jl/test/optimisation/Optimisation.jl:622
  Got exception outside of a @test
  ArgumentError: Unsupported ADType: ADTypes.AutoEnzyme{EnzymeCore.ReverseMode{false, true, EnzymeCore.FFIABI, false, false}, Nothing}
  Stacktrace:
    [1] Main.ADUtils.ADTypeCheckContext(adbackend::ADTypes.AutoEnzyme{EnzymeCore.ReverseMode{false, true, EnzymeCore.FFIABI, false, false}, Nothing}, child::DynamicPPL.DefaultContext)
      @ Main.ADUtils ~/work/Turing.jl/Turing.jl/test/test_utils/ad_utils.jl:97
    [2] macro expansion
      @ ~/work/Turing.jl/Turing.jl/test/optimisation/Optimisation.jl:624 [inlined]
    [3] macro expansion
      @ /opt/hostedtoolcache/julia/1.11.4/x64/share/julia/stdlib/v1.11/Test/src/Test.jl:1793 [inlined]
    [4] macro expansion
      @ ~/work/Turing.jl/Turing.jl/test/optimisation/Optimisation.jl:622 [inlined]
    [5] macro expansion
      @ /opt/hostedtoolcache/julia/1.11.4/x64/share/julia/stdlib/v1.11/Test/src/Test.jl:1704 [inlined]
    [6] top-level scope
      @ ~/work/Turing.jl/Turing.jl/test/optimisation/Optimisation.jl:28
    [7] include(fname::String)
      @ Main ./sysimg.jl:38
    [8] macro expansion
      @ ~/work/Turing.jl/Turing.jl/test/runtests.jl:25 [inlined]
    [9] macro expansion
      @ /opt/hostedtoolcache/julia/1.11.4/x64/share/julia/stdlib/v1.11/Test/src/Test.jl:1704 [inlined]
   [10] macro expansion
      @ ~/work/Turing.jl/Turing.jl/test/runtests.jl:52 [inlined]
   [11] macro expansion
      @ /opt/hostedtoolcache/julia/1.11.4/x64/share/julia/stdlib/v1.11/Test/src/Test.jl:1704 [inlined]
   [12] top-level scope
      @ ~/work/Turing.jl/Turing.jl/test/runtests.jl:33
   [13] include(fname::String)
      @ Main ./sysimg.jl:38
   [14] top-level scope
      @ none:6
   [15] eval
      @ ./boot.jl:430 [inlined]
   [16] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:296
   [17] _start()
      @ Base ./client.jl:531

@yebai
Member

yebai commented Apr 18, 2025

Closed in favour of https://github.com/TuringLang/ADTests

Please note that this PR doesn't add any functionality to Turing but only adds Enzyme to the Turing test suite. Such tests will now be included in https://github.com/TuringLang/ADTests

@yebai yebai closed this Apr 18, 2025
@yebai yebai deleted the dw/enzyme branch April 18, 2025 20:14