When reading your paper, I was a little puzzled about the T-Head module, and I hope to get your answer.
Why can "N consecutive conv layers" extract task-interactive features? By comparison, do the features extracted by the preceding backbone+FPN contain no interactive information?
Our view is that the closer a feature is to the prediction layer, the richer its classification and localization information. In our method, the features extracted by the N consecutive conv layers are used directly to predict both classification and localization. They therefore carry richer classification and localization information for task interaction than the features extracted by the preceding backbone+FPN.
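To make the point above concrete, here is a minimal PyTorch sketch, not the implementation from the paper or the repository. The class name `SharedConvStack`, the helper `shared_grad_norm`, the channel counts, and the plain 3x3 predictors are all assumptions for illustration, and the layer attention and alignment steps of T-Head are omitted. It only shows that when one conv stack feeds both predictors directly, the losses of both tasks send gradients into the same layers.

```python
import torch
import torch.nn as nn


class SharedConvStack(nn.Module):
    """Hypothetical stand-in for the N consecutive conv layers (illustration only)."""

    def __init__(self, channels: int = 16, num_layers: int = 3, num_classes: int = 4):
        super().__init__()
        # N consecutive conv layers, each followed by an activation.
        self.layers = nn.ModuleList(
            [
                nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
                for _ in range(num_layers)
            ]
        )
        # Assumption: both predictors read the same concatenated features.
        self.cls_pred = nn.Conv2d(channels * num_layers, num_classes, 3, padding=1)
        self.reg_pred = nn.Conv2d(channels * num_layers, 4, 3, padding=1)

    def forward(self, fpn_feat: torch.Tensor):
        inter, x = [], fpn_feat
        for layer in self.layers:
            x = layer(x)
            inter.append(x)  # keep the output of every layer in the stack
        feat = torch.cat(inter, dim=1)
        return self.cls_pred(feat), self.reg_pred(feat)


def shared_grad_norm(head: SharedConvStack, loss: torch.Tensor) -> float:
    """Gradient norm that `loss` induces on the first shared conv layer."""
    head.zero_grad()
    loss.backward(retain_graph=True)
    return head.layers[0][0].weight.grad.norm().item()


torch.manual_seed(0)
head = SharedConvStack()
cls_out, reg_out = head(torch.randn(1, 16, 8, 8))  # random stand-in for an FPN level

# Dummy losses: each one depends on a single task's prediction only.
g_cls = shared_grad_norm(head, cls_out.abs().mean())
g_reg = shared_grad_norm(head, reg_out.abs().mean())
print(g_cls > 0, g_reg > 0)
```

In this sketch the shared layers receive a training signal from both tasks, while the input tensor (standing in for a backbone+FPN feature) sits further from the two predictors.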