add svtr large model #10937

Merged: 2 commits, Sep 26, 2023

zhangyubo0722
Collaborator

No description provided.

@paddle-bot

paddle-bot bot commented Sep 18, 2023

Thanks for your contribution!

@zhangyubo0722 zhangyubo0722 force-pushed the add_svtr_large branch 2 times, most recently from d6dc304 to 435a928 Compare September 18, 2023 12:08
use_visualdl: false
infer_img: doc/imgs_words/ch/word_1.jpg
character_dict_path: ppocr/utils/ppocr_keys_v1.txt
max_text_length: &max_text_length 25
Collaborator

This can be changed to 40.

beta2: 0.99
epsilon: 1.0e-08
weight_decay: 0.05
no_weight_decay_name: norm pos_embed char_node_embed pos_node_embed char_pos_embed vis_pos_embed
Collaborator

Has this been tuned?
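
For readers unfamiliar with this option, here is a minimal sketch of how a space-separated `no_weight_decay_name` value could be used to exempt parameters from weight decay. It assumes simple substring matching on parameter names; the helper `split_decay_groups` and the sample parameter names are hypothetical and are not taken from this PR.

```python
def split_decay_groups(param_names, no_weight_decay_name):
    """Partition parameter names into (decay, no_decay) lists.

    Assumption: a parameter is exempt from weight decay when any keyword
    from the space-separated string occurs in its name.
    """
    keywords = no_weight_decay_name.split()
    decay, no_decay = [], []
    for name in param_names:
        if any(keyword in name for keyword in keywords):
            no_decay.append(name)
        else:
            decay.append(name)
    return decay, no_decay


# Hypothetical parameter names, for illustration only.
names = [
    "blocks.0.norm1.weight",
    "blocks.0.mixer.qkv.weight",
    "pos_embed",
    "head.fc.bias",
]
decay, no_decay = split_decay_groups(
    names,
    "norm pos_embed char_node_embed pos_node_embed char_pos_embed vis_pos_embed",
)
print("decay:", decay)
print("no decay:", no_decay)
```

Under that assumption, normalization weights and positional embeddings land in the exempt group while ordinary weights and biases keep the configured `weight_decay`.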

out_channels: 512
patch_merging: Conv
embed_dim: [192, 256, 512]
depth: [6, 6, 9]
Collaborator

Have these parameters all been tuned?


Architecture:
model_type: rec
algorithm: SVTR_LCNet
Collaborator

algorithm: SVTR?
LCNet is no longer used here.

@zhangyubo0722 zhangyubo0722 force-pushed the add_svtr_large branch 2 times, most recently from 57d2780 to c675aa5 Compare September 25, 2023 12:01
self.dec_pos_embed = self.create_parameter(
shape=[1, w, dim], default_initializer=zeros_)
self.add_parameter("dec_pos_embed", self.dec_pos_embed)
# self.pos_drop = nn.Dropout(p=drop_rate)
Collaborator

Remove the redundant code.

@@ -88,7 +111,9 @@ def __init__(self, in_channels, out_channels_list, **kwargs):
'{} is not supported in MultiHead yet'.format(name))

def forward(self, x, targets=None):

if self.use_pool:
# print(x.shape)
Collaborator

Delete this.

@@ -61,8 +78,14 @@ def __init__(self, in_channels, out_channels_list, **kwargs):
max_text_length = gtc_args.get('max_text_length', 25)
nrtr_dim = gtc_args.get('nrtr_dim', 256)
num_decoder_layers = gtc_args.get('num_decoder_layers', 4)
self.before_gtc = nn.Sequential(
if self.use_pos:
# add_pos = AddPos(nrtr_dim, 60)
Collaborator

Delete this.

@zhangyubo0722 zhangyubo0722 force-pushed the add_svtr_large branch 3 times, most recently from b3487df to f8eb3f5 Compare September 25, 2023 12:24
tink2123
tink2123 previously approved these changes Sep 25, 2023
# See the License for the specific language governing permissions and
# limitations under the License.

from matplotlib.mlab import stride_windows
Collaborator

Remove the unrelated code.


def forward(self, x):

qkv = paddle.reshape(self.qkv(x), (0, -1, 3, self.num_heads, self.dim //
Collaborator

Could writing it this way prevent exporting the inference model? Has this been verified?
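
For context on the reshape being questioned: in `paddle.reshape`, a `0` in the target shape copies the dimension at the same index from the input, and a single `-1` is inferred from the remaining elements. The sketch below reproduces only that shape resolution in plain Python, so it says nothing about whether export actually succeeds. The helper `resolve_shape` is hypothetical, and because the quoted line is truncated, the example dimensions (6 heads, head size 32) are assumptions.

```python
from math import prod


def resolve_shape(in_shape, target):
    """Resolve a reshape target that may contain 0 (copy) and -1 (infer)."""
    # 0 means "keep the input dimension at this index".
    out = [in_shape[i] if d == 0 else d for i, d in enumerate(target)]
    if out.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    total = prod(in_shape)
    if -1 in out:
        known = prod(d for d in out if d != -1)
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        out[out.index(-1)] = total // known
    elif prod(out) != total:
        raise ValueError("element count mismatch")
    return out


# Assumed example: qkv output of shape [batch, tokens, 3 * dim] with dim = 192.
qkv_shape = [2, 50, 576]
resolved = resolve_shape(qkv_shape, (0, -1, 3, 6, 32))
print(resolved)
```

Because the batch dimension is copied and the token count is inferred, neither has to be read from `x.shape` as a concrete number when the shape tuple is built.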

Collaborator

@tink2123 tink2123 left a comment

LGTM

@tink2123 tink2123 merged commit e49e491 into PaddlePaddle:dygraph Sep 26, 2023
embed_dim: [192, 256, 512]
depth: [6, 6, 9]
num_heads: [6, 8, 16]
mixer: ['Conv','Conv','Conv','Conv','Conv','Conv','Conv','Conv','Conv','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global']

Hi @zhangyubo0722, may I ask a question?
I assume the Permutation column in the SVTR paper corresponds to the value of mixer, so I set my config to Conv*10 and Global*11, but your config uses Conv*9 and Global*11. Could you point me to the reference you used for this config? Thank you.
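
A quick way to check a config like this is to count the entries in `mixer` and compare the total with `sum(depth)`. This sketch assumes each `mixer` entry configures exactly one block; that assumption and the helper `summarize_mixer` are illustrative and are not taken from the PR. The `depth` and `mixer` values are copied from the config lines quoted above.

```python
from collections import Counter


def summarize_mixer(depth, mixer):
    """Return (per-type counts, whether len(mixer) equals sum(depth))."""
    counts = dict(Counter(mixer))
    return counts, len(mixer) == sum(depth)


# Values copied from the quoted config.
depth = [6, 6, 9]
mixer = ['Conv','Conv','Conv','Conv','Conv','Conv','Conv','Conv','Conv','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global']

counts, consistent = summarize_mixer(depth, mixer)
print(counts)
print("len(mixer) == sum(depth):", consistent)
```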
image
