add svtr large model #10937
Conversation
Thanks for your contribution!
d6dc304 to 435a928
configs/rec/rec_svtrnet_large.yml
Outdated
use_visualdl: false
infer_img: doc/imgs_words/ch/word_1.jpg
character_dict_path: ppocr/utils/ppocr_keys_v1.txt
max_text_length: &max_text_length 25
This can be changed to 40.
configs/rec/rec_svtrnet_large.yml
Outdated
beta2: 0.99
epsilon: 1.0e-08
weight_decay: 0.05
no_weight_decay_name: norm pos_embed char_node_embed pos_node_embed char_pos_embed vis_pos_embed
Has this been tuned?
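For readers unfamiliar with the `no_weight_decay_name` setting, here is a minimal sketch of how a whitespace-separated keyword list like the one above could be used to exclude parameters from weight decay. It assumes substring matching against parameter names; the function `split_decay_groups` and the parameter names are hypothetical and are not taken from this PR.

```python
from typing import List, Tuple


def split_decay_groups(
    param_names: List[str], no_weight_decay_name: str
) -> Tuple[List[str], List[str]]:
    """Return (decayed, not_decayed) parameter names.

    Assumption: a parameter is excluded from weight decay when any
    whitespace-separated keyword occurs as a substring of its name.
    """
    keywords = no_weight_decay_name.split()
    decayed: List[str] = []
    not_decayed: List[str] = []
    for name in param_names:
        if any(keyword in name for keyword in keywords):
            not_decayed.append(name)
        else:
            decayed.append(name)
    return decayed, not_decayed


# Hypothetical parameter names, for illustration only.
param_names = [
    "backbone.blocks1.0.norm1.weight",
    "backbone.blocks1.0.mixer.qkv.weight",
    "backbone.pos_embed",
    "head.ctc_head.fc.weight",
]
decayed, not_decayed = split_decay_groups(
    param_names,
    "norm pos_embed char_node_embed pos_node_embed char_pos_embed vis_pos_embed",
)
print("decayed:", decayed)
print("not decayed:", not_decayed)
```

Under this assumption, keywords that match no parameter name in the model have no effect, which is one thing to check when reusing a keyword list from another config.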
configs/rec/rec_svtrnet_large.yml
Outdated
out_channels: 512
patch_merging: Conv
embed_dim: [192, 256, 512]
depth: [6, 6, 9]
Have all of these parameters been tuned?
configs/rec/rec_svtrnet_large.yml
Outdated
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
Should this be algorithm: SVTR? There is no LCNet here anymore.
57d2780 to c675aa5
self.dec_pos_embed = self.create_parameter(
    shape=[1, w, dim], default_initializer=zeros_)
self.add_parameter("dec_pos_embed", self.dec_pos_embed)
# self.pos_drop = nn.Dropout(p=drop_rate)
Remove the redundant code.
@@ -88,7 +111,9 @@ def __init__(self, in_channels, out_channels_list, **kwargs):
                '{} is not supported in MultiHead yet'.format(name))

    def forward(self, x, targets=None):

        if self.use_pool:
            # print(x.shape)
Delete this.
@@ -61,8 +78,14 @@ def __init__(self, in_channels, out_channels_list, **kwargs):
max_text_length = gtc_args.get('max_text_length', 25)
nrtr_dim = gtc_args.get('nrtr_dim', 256)
num_decoder_layers = gtc_args.get('num_decoder_layers', 4)
self.before_gtc = nn.Sequential(
if self.use_pos:
    # add_pos = AddPos(nrtr_dim, 60)
Delete this.
b3487df to f8eb3f5
ppocr/modeling/backbones/rec_vit.py
Outdated
# See the License for the specific language governing permissions and
# limitations under the License.

from matplotlib.mlab import stride_windows
Remove the unrelated code.
def forward(self, x):

    qkv = paddle.reshape(self.qkv(x), (0, -1, 3, self.num_heads, self.dim //
Could writing it this way prevent exporting the inference model? Has this been verified?
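As background for the `paddle.reshape` call in question, the sketch below mimics in pure Python how a target shape containing `0` and `-1` is resolved under Paddle's documented rule: `0` copies the input dimension at the same index, and a single `-1` is inferred from the remaining elements. The helper `resolve_reshape` and the concrete sizes (a 192-wide embedding with 6 heads, and a `qkv` projection with `3 * 192` outputs) are assumptions for illustration. The sketch does not itself show whether the model exports correctly.

```python
from math import prod
from typing import List, Sequence


def resolve_reshape(in_shape: Sequence[int], target: Sequence[int]) -> List[int]:
    """Resolve a reshape target that may contain 0 and -1 entries.

    0  -> copy the input dimension at the same index.
    -1 -> infer this dimension from the total element count (at most one).
    """
    out = list(target)
    for i, d in enumerate(out):
        if d == 0:
            if i >= len(in_shape):
                raise ValueError("0 at index %d has no matching input dim" % i)
            out[i] = in_shape[i]
    if out.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    total = prod(in_shape)
    if -1 in out:
        known = prod(d for d in out if d != -1)
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        out[out.index(-1)] = total // known
    elif prod(out) != total:
        raise ValueError("element count mismatch")
    return out


# Assumed sizes: embedding dim 192, 6 heads, so head_dim = 192 // 6 = 32.
num_heads, head_dim = 6, 192 // 6
target = (0, -1, 3, num_heads, head_dim)

# The same target resolves correctly for different batch and sequence sizes,
# because neither value is written into the target shape.
print(resolve_reshape([2, 50, 576], target))   # [2, 50, 3, 6, 32]
print(resolve_reshape([8, 120, 576], target))  # [8, 120, 3, 6, 32]
```

Because the batch and sequence lengths are never hard-coded in the target, the same call covers variable input sizes. Whether that is sufficient for inference export still needs to be checked by running the export itself.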
f8eb3f5 to 40396c4
40396c4 to 2d8a6ae
LGTM
embed_dim: [192, 256, 512]
depth: [6, 6, 9]
num_heads: [6, 8, 16]
mixer: ['Conv','Conv','Conv','Conv','Conv','Conv','Conv','Conv','Conv','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global']
Hi @zhangyubo0722, may I ask a question? I assume the Permutation column in the SVTR paper gives the value of mixer, so I set my config to Conv*10 and Global*11, but your config uses Conv*9 and Global*11. Could you point me to the source you used for this config? Thank you very much.
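One way to sanity-check a `mixer` list against `depth` is sketched below. It assumes the backbone consumes one `mixer` entry per block, stage by stage, so that `len(mixer)` should equal `sum(depth)`. The helpers `build_mixer` and `split_by_stage` are hypothetical and not part of this PR. To count the entries of a real config, paste its `mixer` list in place of the generated one.

```python
from collections import Counter
from typing import Dict, List, Sequence


def build_mixer(n_conv: int, n_global: int) -> List[str]:
    """Build a mixer list with n_conv 'Conv' entries followed by n_global 'Global'."""
    return ["Conv"] * n_conv + ["Global"] * n_global


def split_by_stage(depth: Sequence[int], mixer: Sequence[str]) -> List[Dict[str, int]]:
    """Count mixer types per stage, assuming one mixer entry per block."""
    if len(mixer) != sum(depth):
        raise ValueError(
            "mixer has %d entries but sum(depth) is %d" % (len(mixer), sum(depth))
        )
    stages, start = [], 0
    for d in depth:
        stages.append(dict(Counter(mixer[start:start + d])))
        start += d
    return stages


depth = [6, 6, 9]  # from the config above; sum(depth) is 21

# With 9 'Conv' entries, 12 'Global' entries are needed to reach 21 blocks.
print(split_by_stage(depth, build_mixer(9, 12)))

# A list with 9 'Conv' and 11 'Global' entries has only 20 entries and is rejected.
try:
    split_by_stage(depth, build_mixer(9, 11))
except ValueError as err:
    print("rejected:", err)
```

Under this assumption, both Conv*9 with Global*12 and Conv*10 with Global*11 give 21 entries for `depth: [6, 6, 9]`, whereas Conv*9 with Global*11 gives only 20.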