Reproducing the DocTr++ document rectification network paper #10379
Comments
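The training set is custom-made, so we would have to build the training set ourselves.
Claiming this. It should take about one month to complete.
The dataset construction is now described in more detail in the issue. If you have any questions, we can keep discussing.
I have written up a walkthrough of the paper, for reference.
I will write up the training part when I have time.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
Hello, how is this progressing?
@zhuxiaobin You can take a look at this PR.
Hello, how is this progressing?
@Li-Yidong You can take a look at this PR.
Thanks for sharing!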
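Background
Following the requirements collection in #10334 and discussion at the weekly technical seminar #10223, we settled on the DocTr++ document rectification task. The task matters in important scenarios such as document comparison, keyword extraction, and contract tampering verification. Completing it can significantly improve the granularity of OCR results, and it has many practical applications.
Through quantitative experiments and qualitative comparisons, the authors validated the performance advantages and generalization ability of DocTr++ and set new best results on several existing and newly proposed benchmarks, making it the best document rectification approach currently available.
No pretrained weights or training code are available for now, so the model has to be retrained following the description in the paper.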
Steps
Dataset: