Some models saved under Flux v0.14 do not load on v0.15+ #2584

Open
dorn-gerhard opened this issue Jan 18, 2025 · 2 comments

Comments

@dorn-gerhard

Calling u = Flux.loadmodel!(empty_model, model_state) leads to the following error:

[Image: screenshot of the error]

This happens because under Flux v0.14 pad_size is not stored as a separate entry (model_state2), whereas in the new version it is (model_state):

[Image]

Is there a workaround to upgrade old model states?

@ToucheSir
Member

I don't recall any built-in layers which have a pad_size field. Can you provide an MWE? Depending on what pad_size represents, I think the easiest solution could be to use Accessors.jl + maybe Functors.jl to remove pad_size from model_state.
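As a rough illustration of the "remove pad_size from model_state" idea, here is a minimal sketch in plain Julia rather than Accessors.jl or Functors.jl. The helper name strip_key is hypothetical, and the sketch assumes the state consists only of nested NamedTuples and Tuples with arrays or numbers as leaves. Whether the key needs to be dropped from the new state or added to the old one depends on which side carries the extra field.

```julia
# Hypothetical helper (not part of Flux): recursively drop every entry
# named `key` from a nested state made of NamedTuples and Tuples.

# Leaves (arrays, numbers, ...) are returned unchanged.
strip_key(x, key::Symbol) = x

# Tuples: process each element.
strip_key(t::Tuple, key::Symbol) = map(v -> strip_key(v, key), t)

# NamedTuples: keep all fields except `key`, recursing into the rest.
function strip_key(nt::NamedTuple, key::Symbol)
    kept = filter(!=(key), keys(nt))
    return NamedTuple{kept}(map(k -> strip_key(nt[k], key), kept))
end
```

Usage would look like new_state = strip_key(model_state, :pad_size), followed by the usual Flux.loadmodel! call. Containers other than Tuple and NamedTuple (for example a Vector of layer states) are treated as leaves here and would need their own method.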

@mcabbott
Member

A possible cause is Functors v0.5 recursing into arbitrary custom structs?

For built-in layers, Flux.state does typically save integers like padding. loadmodel! does not load them (since they are immutable), so differing values are silently ignored, but it does complain if the set of fields doesn't match:

julia> st = Flux.state(Conv((2,), 1=>1, pad=1, stride=2))
(σ = (), weight = Float32[-0.8863135; -0.70230323;;;], bias = Float32[0.0], stride = (2,), pad = (1, 1), dilation = (1,), groups = 1)

julia> Flux.loadmodel!(Conv((2,), 1=>1, pad=0, stride=0), st)
Conv((2,), 1 => 1, stride=0)  # 3 parameters

julia> Flux.loadmodel!(Conv((2,), 1=>1, pad=0, stride=0), st[propertynames(st)[1:end-1]])
ERROR: ArgumentError: Tried to load (:pad, :σ, :weight, :bias, :stride, :dilation) into (:pad, :σ, :weight, :bias, :groups, :stride, :dilation) but the structures do not match.

One possible remedy might be to define a method of loadmodel! for your struct, as was done when some built-in layers changed what fields they had:

# Issue 2476, after ConvTranspose got a new field in 2462. Minimal fix to allow loading?
function loadmodel!(dst::ConvTranspose, src::NamedTuple{(:σ, :weight, :bias, :stride, :pad, :dilation, :groups)}; kw...)
    new_src = (; src.σ, src.weight, src.bias, src.stride, src.pad, dst.outpad, src.dilation, src.groups)
    loadmodel!(dst, new_src; kw...)
end
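
The same pattern might be adapted to a user-defined layer along the following lines. This is only a sketch under assumptions: the struct PaddedDense and its fields are hypothetical (the actual struct from the issue is not shown), and the old state is assumed to contain only the dense entry, without pad_size. Outside the Flux module the method has to be qualified as Flux.loadmodel!.

```julia
using Flux

# Hypothetical custom layer standing in for the unknown struct in the issue.
struct PaddedDense{D}
    dense::D
    pad_size::Int
end
Flux.@layer PaddedDense

# Assumption: states saved under v0.14 have only the `dense` entry.
# Fill in the missing `pad_size` from the destination, then defer to
# the generic loadmodel! method.
function Flux.loadmodel!(dst::PaddedDense, src::NamedTuple{(:dense,)}; kw...)
    new_src = (; src.dense, dst.pad_size)
    return Flux.loadmodel!(dst, new_src; kw...)
end
```

Because the method dispatches on the exact field names of the old state, newer states that already include pad_size still go through the generic method unchanged.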
