
Converting from integer-tokens to one-hot tokens gives different results. #179

Closed

codetalker7 opened this issue May 14, 2024 · 2 comments

@codetalker7

I tried to use the "colbert-ir/colbertv2.0" pretrained checkpoint for a task (it's essentially a BERT model plus a linear layer; for this issue we focus only on the BERT model). Here is how I loaded the model:

using CUDA
using Flux
using OneHotArrays
using Test
using Transformers
using Transformers.TextEncoders

const PRETRAINED_BERT = "colbert-ir/colbertv2.0"

bert_config = Transformers.load_config(PRETRAINED_BERT)
bert_tokenizer = Transformers.load_tokenizer(PRETRAINED_BERT)
bert_model = Transformers.load_model(PRETRAINED_BERT)

const VOCABSIZE = size(bert_tokenizer.vocab.list)[1]

Now, we'll simply run the bert_model over a bunch of sentences.

docs = [
    "hello world",
    "thank you!",
    "a",
    "this is some longer text, so length should be longer",
]

encoded_text = encode(bert_tokenizer, docs)
ids, mask = encoded_text.token, encoded_text.attention_mask

Above, by default, ids is a OneHotArray. We convert it to an integer matrix containing integer token IDs:

integer_ids = Matrix(onecold(ids))

As expected, the bert_model gives the same results on the integer-ids as well as the one-hot encodings:

julia> @test isequal(bert_model((token = integer_ids, attention_mask=mask)), bert_model((token = ids, attention_mask=mask)))
Test Passed

Note that we can also convert from integer_ids back to the OneHotArray using the onehotbatch function. Here's a test just for a sanity check:

julia> @test isequal(ids, onehotbatch(integer_ids, 1:VOCABSIZE))             # test passes
Test Passed

However, if we convert back from the integer ids to the one-hot encodings, and use the converted one-hot encodings in the bert_model, the model throws an error:

julia> bert_model((token = onehotbatch(integer_ids, 1:VOCABSIZE), attention_mask=mask))
ERROR: ArgumentError: invalid index: false of type Bool
Stacktrace:
  [1] to_index(i::Bool)
    @ Base ./indices.jl:293
  [2] to_index(A::Matrix{Float32}, i::Bool)
    @ Base ./indices.jl:277
  [3] _to_indices1(A::Matrix{Float32}, inds::Tuple{Base.OneTo{Int64}}, I1::Bool)
    @ Base ./indices.jl:359
  [4] to_indices
    @ ./indices.jl:354 [inlined]
  [5] to_indices
    @ ./indices.jl:355 [inlined]
  [6] to_indices
    @ ./indices.jl:344 [inlined]
  [7] view
    @ ./subarray.jl:176 [inlined]
  [8] _view(X::Matrix{Float32}, colons::Tuple{Colon}, k::Bool)
    @ NNlib ~/.julia/packages/NNlib/Fg3DQ/src/scatter.jl:38
  [9] gather!(dst::Array{Float32, 4}, src::Matrix{Float32}, idx::OneHotArrays.OneHotArray{UInt32, 2, 3, Matrix{UInt32}})
    @ NNlib ~/.julia/packages/NNlib/Fg3DQ/src/gather.jl:107
 [10] gather
    @ ~/.julia/packages/NNlib/Fg3DQ/src/gather.jl:46 [inlined]
 [11] Embed
    @ ~/.julia/packages/Transformers/lD5nW/src/layers/embed.jl:43 [inlined]
 [12] macro expansion
    @ ~/.julia/packages/Transformers/lD5nW/src/layers/architecture.jl:108 [inlined]
 [13] WithArg
    @ ~/.julia/packages/Transformers/lD5nW/src/layers/architecture.jl:103 [inlined]
 [14] apply_on_namedtuple
    @ ~/.julia/packages/Transformers/lD5nW/src/layers/architecture.jl:80 [inlined]
 [15] macro expansion
    @ ~/.julia/packages/Transformers/lD5nW/src/layers/layer.jl:0 [inlined]
 [16] (::Transformers.Layers.CompositeEmbedding{Tuple{Transformers.Layers.WithArg{(:token,), Transformers.Layers.Embed{Nothing, Matrix{Float32}}}, Transformers.Layers.WithOptArg{(:hidden_state,), (:position,), Transformers.Layers.ApplyEmbed{Base.Broadcast.BroadcastFunction{typeof(+)}, Transformers.Layers.FixedLenPositionEmbed{Matrix{Float32}}, typeof(identity)}}, Transformers.Layers.WithOptArg{(:hidden_state,), (:segment,), Transformers.Layers.ApplyEmbed{Base.Broadcast.BroadcastFunction{typeof(+)}, Transformers.Layers.Embed{Nothing, Matrix{Float32}}, typeof(Transformers.HuggingFace.bert_ones_like)}}}})(nt::NamedTuple{(:token, :attention_mask), Tuple{OneHotArrays.OneHotArray{UInt32, 2, 3, Matrix{UInt32}}, NeuralAttentionlib.LengthMask{1, Vector{Int32}}}})
    @ Transformers.Layers ~/.julia/packages/Transformers/lD5nW/src/layers/layer.jl:620
 [17] apply_on_namedtuple
    @ ~/.julia/packages/Transformers/lD5nW/src/layers/architecture.jl:80 [inlined]
 [18] macro expansion
    @ ~/.julia/packages/Transformers/lD5nW/src/layers/architecture.jl:0 [inlined]
 [19] Chain
    @ ~/.julia/packages/Transformers/lD5nW/src/layers/architecture.jl:319 [inlined]
 [20] (::Transformers.HuggingFace.HGFBertModel{Transformers.Layers.Chain{Tuple{Transformers.Layers.CompositeEmbedding{Tuple{Transformers.Layers.WithArg{(:token,), Transformers.Layers.Embed{Nothing, Matrix{Float32}}}, Transformers.Layers.WithOptArg{(:hidden_state,), (:position,), Transformers.Layers.ApplyEmbed{Base.Broadcast.BroadcastFunction{typeof(+)}, Transformers.Layers.FixedLenPositionEmbed{Matrix{Float32}}, typeof(identity)}}, Transformers.Layers.WithOptArg{(:hidden_state,), (:segment,), Transformers.Layers.ApplyEmbed{Base.Broadcast.BroadcastFunction{typeof(+)}, Transformers.Layers.Embed{Nothing, Matrix{Float32}}, typeof(Transformers.HuggingFace.bert_ones_like)}}}}, Transformers.Layers.DropoutLayer{Transformers.Layers.LayerNorm{Vector{Float32}, Vector{Float32}, Float32}, Nothing}}}, Transformer{NTuple{12, Transformers.Layers.PostNormTransformerBlock{Transformers.Layers.DropoutLayer{Transformers.Layers.SelfAttention{NeuralAttentionlib.MultiheadQKVAttenOp{Nothing}, Transformers.Layers.Fork{Tuple{Transformers.Layers.Dense{Nothing, Matrix{Float32}, Vector{Float32}}, Transformers.Layers.Dense{Nothing, Matrix{Float32}, Vector{Float32}}, Transformers.Layers.Dense{Nothing, Matrix{Float32}, Vector{Float32}}}}, Transformers.Layers.Dense{Nothing, Matrix{Float32}, Vector{Float32}}}, Nothing}, Transformers.Layers.LayerNorm{Vector{Float32}, Vector{Float32}, Float32}, Transformers.Layers.DropoutLayer{Transformers.Layers.Chain{Tuple{Transformers.Layers.Dense{typeof(gelu), Matrix{Float32}, Vector{Float32}}, Transformers.Layers.Dense{Nothing, Matrix{Float32}, Vector{Float32}}}}, Nothing}, Transformers.Layers.LayerNorm{Vector{Float32}, Vector{Float32}, Float32}}}, Nothing}, Transformers.Layers.Branch{(:pooled,), (:hidden_state,), Transformers.HuggingFace.BertPooler{Transformers.Layers.Dense{typeof(tanh_fast), Matrix{Float32}, Vector{Float32}}}}})(nt::NamedTuple{(:token, :attention_mask), Tuple{OneHotArrays.OneHotArray{UInt32, 2, 3, Matrix{UInt32}}, NeuralAttentionlib.LengthMask{1, Vector{Int32}}}})
    @ Transformers.HuggingFace ~/.julia/packages/Transformers/lD5nW/src/huggingface/implementation/bert/load.jl:51
 [21] top-level scope
    @ REPL[26]:1
 [22] top-level scope
    @ ~/.julia/packages/CUDA/s5N6v/src/initialization.jl:190

Am I missing something here?

@chengchingwen
Owner

You should use integer_ids = reinterpret(Int32, ids) and OneHotArray{VOCABSIZE}(integer_ids). The OneHotArray used in Transformers is different from the one in Flux, and the error happens because that OneHotArray (the one produced by onehotbatch) does not overload gather.
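
A minimal sketch of that suggestion, assuming OneHotArray here is the one-hot type Transformers uses internally (provided by the PrimitiveOneHot package) rather than the one from OneHotArrays.jl:

# `ids`, `mask`, `VOCABSIZE`, and `bert_model` are from the snippets above.
integer_ids = reinterpret(Int32, ids)             # view the encoder's one-hot tokens as Int32 token IDs
onehot_ids = OneHotArray{VOCABSIZE}(integer_ids)  # rebuild Transformers' own OneHotArray

# The reconstructed one-hot array should now pass through the Embed/gather call:
bert_model((token = onehot_ids, attention_mask = mask))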

@codetalker7
Author

Thanks for this! I didn't notice that the package was using its own OneHotArray.
