The y coordinate value of cell bbox seems to be inaccurate #25

Open
qyhou opened this issue Jan 28, 2022 · 0 comments
qyhou commented Jan 28, 2022

Thank you for providing the large-scale dataset.

When converting the HTML to a kind of split structure, I found that the y coordinate values of the cell bboxes seem to be inaccurate.

e.g. PMC5842743_009_00, which is an 11x6 table.
A03 line: [2, 65, 19, 76], [31, 65, 46, 76], [68, 65, 82, 76], [110, 65, 133, 76], [165, 65, 176, 76], [211, 65, 228, 76]
A04 line: [2, 78, 20, 89], [31, 78, 46, 89], [71, 75, 79, 90], [118, 75, 125, 90], [167, 75, 174, 90], [216, 75, 223, 90]
Obviously, y1 of the upper cell is greater than y0 of the lower cell (76 > 75).
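
For anyone who wants to reproduce the check, here is a minimal sketch of how the overlap between two adjacent rows could be measured. It assumes each bbox is `[x0, y0, x1, y1]` with y increasing downward, as in the values quoted above. The helper names `row_overlap` and `overlapping_row_pairs` are hypothetical and not part of any dataset tooling.

```python
from typing import List, Sequence, Tuple

BBox = Sequence[int]  # assumed layout: [x0, y0, x1, y1]


def row_overlap(upper: List[BBox], lower: List[BBox]) -> int:
    """Return how far the upper row's lowest edge extends past the
    lower row's highest edge, or 0 if the rows do not overlap."""
    upper_bottom = max(box[3] for box in upper)  # largest y1 in the upper row
    lower_top = min(box[1] for box in lower)     # smallest y0 in the lower row
    return max(0, upper_bottom - lower_top)


def overlapping_row_pairs(rows: List[List[BBox]]) -> List[Tuple[int, int, int]]:
    """List (upper_index, lower_index, overlap) for adjacent rows that overlap."""
    result = []
    for i in range(len(rows) - 1):
        overlap = row_overlap(rows[i], rows[i + 1])
        if overlap > 0:
            result.append((i, i + 1, overlap))
    return result


# The two rows quoted above from PMC5842743_009_00.
row_a03 = [[2, 65, 19, 76], [31, 65, 46, 76], [68, 65, 82, 76],
           [110, 65, 133, 76], [165, 65, 176, 76], [211, 65, 228, 76]]
row_a04 = [[2, 78, 20, 89], [31, 78, 46, 89], [71, 75, 79, 90],
           [118, 75, 125, 90], [167, 75, 174, 90], [216, 75, 223, 90]]

print(row_overlap(row_a03, row_a04))
print(overlapping_row_pairs([row_a03, row_a04]))
```

Grouping cells into rows is left out here, since that depends on how the HTML structure is parsed.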

I randomly checked 100 tables in the training set and found that 37 of them have this problem.

Thanks

qyhou changed the title from "The y coordinate values of cell bbox seem not accurate" to "The y coordinate value of cell bbox seems to be inaccurate" on Jan 28, 2022