Trouble understanding the derivation behind the multi-head implementation #176

Open
frostjsy opened this issue Aug 13, 2021 · 0 comments


1. Why is there a concat right after the tensors are split into multiple heads?

Split and concat

    Q_ = tf.concat(tf.split(Q, num_heads, axis=2), axis=0)  # (h*N, T_q, d_model/h)
    K_ = tf.concat(tf.split(K, num_heads, axis=2), axis=0)  # (h*N, T_k, d_model/h)
    V_ = tf.concat(tf.split(V, num_heads, axis=2), axis=0)  # (h*N, T_k, d_model/h)
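
One way to read the split-then-concat step: the split along `axis=2` cuts the feature dimension into `num_heads` slices, and the concat along `axis=0` stacks those slices onto the batch dimension, so a single batched matmul can score every head independently. The sketch below is only an illustration under stated assumptions: it uses NumPy's `np.split` and `np.concatenate` as stand-ins for `tf.split` and `tf.concat`, and the shapes, the random inputs, and the helper name `fold_heads` are hypothetical, not taken from the repository.

```python
import numpy as np

# Hypothetical small shapes, chosen only for illustration.
N, T_q, T_k, d_model, num_heads = 2, 3, 4, 8, 4
d_head = d_model // num_heads

rng = np.random.default_rng(0)
Q = rng.standard_normal((N, T_q, d_model))
K = rng.standard_normal((N, T_k, d_model))


def fold_heads(x, num_heads):
    """NumPy stand-in for tf.concat(tf.split(x, num_heads, axis=2), axis=0)."""
    return np.concatenate(np.split(x, num_heads, axis=2), axis=0)


Q_ = fold_heads(Q, num_heads)  # (h*N, T_q, d_model/h)
K_ = fold_heads(K, num_heads)  # (h*N, T_k, d_model/h)

# One batched matmul scores every (head, example) pair at once.
scores = Q_ @ K_.transpose(0, 2, 1)  # (h*N, T_q, T_k)

# Row i*N + n of the folded batch is head i of example n.
fold_ok = all(
    np.array_equal(Q_[i * N + n], Q[n, :, i * d_head:(i + 1) * d_head])
    for i in range(num_heads)
    for n in range(N)
)

# The batched scores match an explicit loop over heads and examples.
loop_ok = all(
    np.allclose(
        scores[i * N + n],
        Q[n, :, i * d_head:(i + 1) * d_head]
        @ K[n, :, i * d_head:(i + 1) * d_head].T,
    )
    for i in range(num_heads)
    for n in range(N)
)
```

Under this reading, the concat along `axis=0` is not the multi-head concatenation from the paper; it is bookkeeping that lets the heads share one matmul without mixing their features.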

2. Why is the operation below equivalent to concatenating the heads?

    outputs = tf.concat(tf.split(outputs, num_heads, axis=0), axis=2)  # (N, T_q, d_model)
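
A possible way to check the second point numerically: if the heads were stacked along the batch axis, then splitting along `axis=0` recovers one `(N, T_q, d_model/h)` block per head, and concatenating those blocks along `axis=2` places them side by side in the feature dimension, which is the usual Concat(head_1, ..., head_h). The sketch below again uses NumPy stand-ins for the TensorFlow calls; the shapes and the random `heads` array are assumptions made purely for the demonstration.

```python
import numpy as np

# Hypothetical small shapes, chosen only for illustration.
N, T_q, d_model, num_heads = 2, 3, 8, 4
d_head = d_model // num_heads

rng = np.random.default_rng(0)

# Made-up per-head attention outputs, indexed [head, example, position, feature].
heads = rng.standard_normal((num_heads, N, T_q, d_head))

# Layout left by the split/concat step: heads stacked along the batch axis,
# so row i*N + n holds head i of example n.
outputs = heads.reshape(num_heads * N, T_q, d_head)  # (h*N, T_q, d_model/h)

# NumPy stand-in for tf.concat(tf.split(outputs, num_heads, axis=0), axis=2).
merged = np.concatenate(np.split(outputs, num_heads, axis=0), axis=2)

# Textbook form: concatenate the heads along the feature axis.
expected = np.concatenate([heads[i] for i in range(num_heads)], axis=-1)

merge_ok = np.array_equal(merged, expected)  # same values, shape (N, T_q, d_model)
```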
