
WIP: Generated function recompilation with edges #32774


Closed
Changes from all commits
20 commits
5cd866b
Initial version: trying to add edges to all generated functions
NHDaly Jul 31, 2019
f98ed40
Changed to have a weird manually constructed Tuple type, which seems …
NHDaly Jul 31, 2019
bb7966f
AHA!! Apparently b/c the generator is never specialized, all the type…
NHDaly Jul 31, 2019
84eb92e
Remove printlns
NHDaly Jul 31, 2019
f670295
Change implementation to add forward edges to CodeInfo, rather than d…
NHDaly Jul 31, 2019
5f57247
reenabled print statements and reorderded for clarity
NHDaly Aug 1, 2019
1e2b78e
Cool: fix segfault in boostrap -- use different function to specializ…
NHDaly Aug 1, 2019
e2a086d
OOPS remove spurious JL_GC_POP
NHDaly Aug 1, 2019
037eb79
Switch to using jl_specializations_get_linfo b/c (I thought) this wil…
NHDaly Aug 1, 2019
0f32fea
Remove printlns from my tests
NHDaly Aug 1, 2019
576b14c
ACTUALLY, it looks like jl_get_specialization1 works!
NHDaly Aug 1, 2019
063c920
Cleanup
NHDaly Aug 1, 2019
9b10e0d
Move all this logic into a function
NHDaly Aug 1, 2019
06a7011
Start adding unit tests for invalidating generated functions.
NHDaly Aug 2, 2019
ba6c254
Fix MWE broken test to _actually_ show the problem
NHDaly Aug 2, 2019
e75224b
Remove world-age freezing when calling generator
NHDaly Aug 2, 2019
58c60ac
Revert "Remove world-age freezing when calling generator"
NHDaly Aug 2, 2019
12687d8
Fix unit tests to _actually_ show broken behavior
NHDaly Aug 2, 2019
8c05fc6
Fix redef generated for functions with type sparams
NHDaly Aug 5, 2019
f20d374
Fix recompiling generated functions w/ varargs...
NHDaly Aug 5, 2019
3 changes: 2 additions & 1 deletion base/boot.jl
@@ -521,7 +521,8 @@ end

# invoke and wrap the results of @generated
function (g::GeneratedFunctionStub)(@nospecialize args...)
body = g.gen(args...)
#body = g.gen(args...)
body = Core._apply_pure(g.gen, (args...,))
if body isa CodeInfo
return body
end
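For context, the stub's `g.gen` is the user's generator, which receives the argument *types* (not values) and returns the code that becomes the method body. A minimal sketch of that behavior (`tsize` is a hypothetical example, not part of this PR):

```julia
# Hedged sketch: inside a @generated function the arguments are bound to
# their types, and the generator returns an Expr (or CodeInfo) used as the body.
@generated function tsize(x)
    # `x` here is the argument's type, e.g. Int64 for tsize(1)
    return :(sizeof($x))
end

tsize(1) == sizeof(Int)   # body was generated as sizeof(Int64)
tsize(1.0f0) == 4         # a fresh body is generated per argument type
```

The PR's concern is exactly this generator call: if a function invoked inside the generator body is later redefined, nothing currently forces the generator to rerun.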
1 change: 1 addition & 0 deletions src/julia.h
@@ -1202,6 +1202,7 @@ JL_DLLEXPORT jl_svec_t *jl_alloc_svec(size_t n);
JL_DLLEXPORT jl_svec_t *jl_alloc_svec_uninit(size_t n);
JL_DLLEXPORT jl_svec_t *jl_svec_copy(jl_svec_t *a);
JL_DLLEXPORT jl_svec_t *jl_svec_fill(size_t n, jl_value_t *x);
// Construct the DataType for `Tuple{v, v, v...}`
JL_DLLEXPORT jl_value_t *jl_tupletype_fill(size_t n, jl_value_t *v);
JL_DLLEXPORT jl_sym_t *jl_symbol(const char *str) JL_NOTSAFEPOINT;
JL_DLLEXPORT jl_sym_t *jl_symbol_lookup(const char *str) JL_NOTSAFEPOINT;
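The newly documented `jl_tupletype_fill(n, v)` builds a tuple type with `n` copies of `v`. Its Julia-level counterpart can be sketched as follows (the helper name `tupletype_fill` is hypothetical, used only for illustration):

```julia
# Build Tuple{v, v, ..., v} with n copies of v, mirroring what the C helper
# jl_tupletype_fill(n, v) constructs (sketch; name is hypothetical).
tupletype_fill(n, v) = Tuple{ntuple(_ -> v, n)...}

tupletype_fill(3, Any)                   # Tuple{Any,Any,Any}
tupletype_fill(3, Any) == NTuple{3,Any}  # NTuple is shorthand for the same type
```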
66 changes: 66 additions & 0 deletions src/method.c
@@ -376,6 +376,67 @@ STATIC_INLINE jl_value_t *jl_call_staged(jl_method_t *def, jl_value_t *generator
return code;
}

// Sets func.edges = MethodInstance[MethodInstance for generator(::Any, ::Any...)]
// This is so that @generated functions will be re-generated if any functions called from
// the generator body are invalidated.
static void set_codeinfo_forward_edge_to_generator(jl_method_t *def,
jl_code_info_t *func,
jl_value_t *generator,
jl_svec_t *sparam_vals,
jl_tupletype_t *ttdt) {
// Get generator function body (not GeneratedFunctionStub) to attach an edge to it
jl_value_t* gen_func = jl_get_field(generator, "gen");

// Get jl_method_t for generator body
// The generator body takes three sets of arguments: (typeof(func), params..., T...)
// For example, for `foo(x::Array{T}) where T`:
// Tuple{getfield(Main, Symbol("##s4#3")), typeof(Main.foo), Int64, Array{Int64}}
// BUT --- THIS IS THE WEIRD PART:
// Apparently the backedge needs to be from `##s4#3(::Any, ::Any)` instead of
// the actual types of the arguments! Who knows why!! Weirdness.
// (EDIT: I _think_ this is because the generator body is marked nospecialize?)
// Manually construct that weird signature, here:
size_t n_sparams = jl_svec_len(sparam_vals);
// Get def->nargs, not number of args in ttdt, to correctly count varargs.
size_t numargs = n_sparams + def->nargs;
// Construct Tuple{typeof(gen_func), ::Any, ::Any} for correct number of args.
jl_value_t* weird_types_tuple = jl_tupletype_fill(numargs, (jl_value_t*)jl_any_type);
jl_tupletype_t* typesig = (jl_tupletype_t*)jl_argtype_with_function(gen_func, weird_types_tuple);

// TODO: I still don't know what the right way to specialize the method is.
// I've tried `jl_gf_invoke_lookup` (but that segfaults during bootstrap),
// `jl_get_specialization1`, and `jl_specializations_get_linfo`. The last two seem
// to behave identically, so I don't know which is better. Certainly
// jl_get_specialization1 is simpler.
// UPDATE: Actually jl_specializations_get_linfo seems to also cause an error during
// bootstrap, complaining about `UndefRefError()`. So maybe jl_get_specialization1.

// next look for or create a specialization of this definition.
size_t min_valid = 0;
size_t max_valid = ~(size_t)0;
jl_method_instance_t *edge = jl_get_specialization1((jl_tupletype_t*)typesig, -1,
&min_valid, &max_valid,
1 /* store new specialization */);

if (edge != NULL && (jl_value_t*)edge != jl_nothing) {
// Now create the edges array and set the edge!
if (func->edges == jl_nothing) {
// TODO: How to construct this array type properly
jl_value_t* array_mi_type = jl_apply_type2((jl_value_t*)jl_array_type,
(jl_value_t*)jl_method_instance_type, jl_box_long(1));

func->edges = (jl_value_t*)jl_alloc_array_1d(array_mi_type, 0);
}

//jl_method_instance_add_backedge(edge, linfo);
jl_array_ptr_1d_push((jl_array_t*)func->edges, (jl_value_t*)edge);
}
else {
jl_printf(JL_STDERR, "WARNING: no edge for generated function body ");
jl_static_show(JL_STDERR, (jl_value_t*)def); jl_printf(JL_STDERR, "\n");
}
}

// return a newly allocated CodeInfo for the function signature
// effectively described by the tuple (specTypes, env, Method) inside linfo
JL_DLLEXPORT jl_code_info_t *jl_code_for_staged(jl_method_instance_t *linfo)
@@ -420,6 +481,11 @@ JL_DLLEXPORT jl_code_info_t *jl_code_for_staged(jl_method_instance_t *linfo)
jl_resolve_globals_in_ir(stmts, def->module, linfo->sparam_vals, 1);
}

// Set forward edge from the staged func `func` to the generator, so that the
// generator will be rerun if any dependent functions called from the generator body
// are invalidated. This keeps generated functions up-to-date, like other functions.
set_codeinfo_forward_edge_to_generator(def, func, generator, linfo->sparam_vals, ttdt);

ptls->in_pure_callback = last_in;
jl_lineno = last_lineno;
ptls->world_age = last_age;
83 changes: 83 additions & 0 deletions test/method.jl
@@ -0,0 +1,83 @@
using Test

@testset "Normal generated functions" begin
@generated g1(x) = :(x)
@test g1(1) == 1
@test g1("hi") == "hi"
end

# Invalidating Generated Functions
# TODO: For some reason this doesn't work inside a testset right now (See below)
@generated foo() = bar()
bar() = 2
# We can still call foo() even though bar() was defined after! Hooray! :)
@test foo() == 2
# Now we can change bar(), and foo() is also updated! Woohoo! :D
bar() = 3
@test foo() == 3


@testset "invalidating generated functions in a testset" begin
@generated foo() = bar()
bar() = 2

# TODO: It seems like this doesn't work because of the @testset. Is that expected?
# Would this work for regular functions? I think it's broken...
@test foo() == 2
bar() = 3
@test_broken foo() == 3
end
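The `@testset` failure is consistent with Julia's world-age rules: a `@testset` body runs inside a compiled closure, whose world is frozen at entry, so a method redefinition made mid-body is not visible until control returns to the top level. A minimal stock-Julia sketch of the underlying mechanism (`q` and `demo` are hypothetical names):

```julia
# World-age sketch: redefinitions made while a function body is running are
# not visible to that body; Base.invokelatest escapes the frozen world.
q() = 1
function demo()
    a = q()                   # old method; demo's world is frozen at entry
    @eval q() = 2             # redefine q in a newer world
    b = q()                   # still 1: the frozen world does not see it
    c = Base.invokelatest(q)  # 2: runs in the latest world
    return (a, b, c)
end
demo()  # → (1, 1, 2)
```

So a plain `bar() = 3; @test foo() == 3` sequence inside a testset would hit the same frozen-world behavior even for regular function calls, independent of this PR's edges.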


# Functions that take arguments
@generated f(x) = f2(x) + f2(x)
f2(x::Type) = sizeof(x)
@test f(1) == 16
f2(x::Type) = sizeof(x)÷2
@test f(1) == 8


# Method at bottom of call-stack accepts ::Type, not ::Any
# The simple case -- bar(::Any):
@generated foo(x) = bar(x)
bar(x) = 2
@test foo(1) == 2
bar(x) = 3
@test foo(1) == 3
# This also works, with t(::Type{Int})
@generated f_type(x) = t(x)
t(::Type{Int}) = 2
@test f_type(1) == 2
t(::Type{Int}) = 3
@test f_type(1) == 3
# Yet for some reason this does not work:
# Somehow having t1(T) call typemax prevents forming a backedge from t1 to the generator.
@generated f_type2(x) = t1(x)
t1(T) = typemax(T)
@test f_type2(1) == typemax(Int)
t1(T) = 3
@test_broken f_type2(1) == 3
@NHDaly (Member, Author) commented on Aug 2, 2019:
This is one of the currently broken examples. Somehow, when the generator (#s5#3 in this case) is compiled, it doesn't cause any backedges to be set from t1 to it.

This simple utility function (defined here) prints out all the backedges for all specializations of a function, and you can see that there are none for t1:

julia> NHDalyUtils.func_all_backedges(t1)
1-element Array{Pair{Any,Any},1}:
 :MethodTable => Any[]

Whereas the previous, working example does get such backedges set:

julia> NHDalyUtils.func_all_backedges(t)
2-element Array{Pair{Any,Array{Any,1}},1}:
                 :MethodTable => [Tuple{typeof(t),Any}, MethodInstance for #s5#4(::Any, ::Any)]
 Tuple{typeof(t),Type{Int64}} => [MethodInstance for #s5#4(::Any, ::Any)]

@NHDaly (Member, Author) commented:
After some digging, I think I've found a MWE that explains why this case doesn't work, but it boggles my understanding of how Julia recompiles functions...
As I show in this example, functions that call typemax don't get backedges set from typemax, despite the fact that they do inline the result.
But by some mysterious magic, they still get recompiled when typemax is updated, so somehow there's something other than backedges that can cause function recompilation??

julia> struct X x end

julia> Base.typemax(::Type{X}) = 8

julia> f1(t) = Base.typemax(t)
f1 (generic function with 1 method)

julia> f1(X)
8

julia> Base.method_instances(typemax, (Type{X},))[1].backedges
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getproperty(::Any, ::Symbol) at ./sysimg.jl:18

julia> f2(x::T) where T = Base.typemax(T)  # This function _does_ get a backedge added to typemax.
f2 (generic function with 1 method)

julia> f2(X(2))
8

julia> Base.method_instances(typemax, (Type{X},))[1].backedges
1-element Array{Any,1}:
 MethodInstance for f2(::X)

julia> Base.typemax(::Type{X}) = 100

julia> f1(X)  # wat! if there is no backedge to f1 from typemax, how is this getting recompiled?!?
100

julia> @code_typed f1(X)  # But we know it _is_ getting recompiled, because the value is inlined, here:
CodeInfo(
1 ─ return 100
) => Int64

julia> methods(typemax).mt.backedges  # And it's not hiding on the MethodTable, as we can see.
ERROR: UndefRefError: access to undefined reference

So my test is failing because there's no backedge from typemax to f1. And in my test case above, f1 is the generator body. So the generator body is never properly invalidated, and so my generated function is never re-generated.

But somehow typemax is able to cause f1 to recompile via some mystery voodoo that I don't understand. And so I guess I want to also apply that to my generator body?


@vtjnash you pointed to this test case as an example of why "this approach won't work". Is this what you were referring to? That sometimes there is a different mechanism besides backedges that is used to trigger function recompilation?

@NHDaly (Member, Author) commented on Aug 12, 2019:
This is now explained by my comment below, #32774 (comment).

It turns out that -- despite what it looks like -- the call to typemax(t) in f1 above is triggering a dynamic dispatch, so as explained in that comment, the backedges mechanism implemented in this PR so far has no way to trigger regenerating the generated function. (It looks like the call to typemax is being inlined above, but this is actually due to @code_typed showing fully specialized code, even though it isn't actually fully specialized in reality: #32834.)

A simpler example that showcases why this doesn't work can be found here -- this example fails for the current implementation in this PR (as of f20d374):

julia> baz() = 2
baz (generic function with 2 methods)

julia> f() = Any[baz][1]()  # f calls `baz()` through a type-erased `Any` reference, and inference cannot de-virtualize it.
f (generic function with 2 methods)

julia> @generated foo() = f()
foo (generic function with 2 methods)

julia> foo()  # When computing the result for `foo()`, `baz()` is called via dynamic dispatch
2

julia> baz() = 4  # Updating baz does not trigger regenerating foo
baz (generic function with 2 methods)

julia> foo()  # So foo() still returns 2, instead of 4
2



# Functions with type params
@generated f(x::T) where T<:Number = width(T)
width(::Type{Int}) = 5
@test f(10) == 5
width(::Type{Int}) = 100
@test f(10) == 100

# It also works for newly defined types
struct MyNum <: Number x::Int end
width(::Type{MyNum}) = MyNum(5)
@test f(MyNum(10)) == MyNum(5)
width(::Type{MyNum}) = MyNum(100)
@test f(MyNum(10)) == MyNum(100)


# Functions with varargs
@generated f(a::T, b, c...) where T<:Number = bar(T) + bar(a) + bar(b) + sum(bar(v) for v in c)
bar(x) = 2
@test f(2,3,4) == 8
bar(x) = 3
@test f(2,3,4) == 12
@test f(2,3,4,5) == 15