if (bbox_intersection_area(ba, bb) / bbox_area(ba)) > 0.8: ZeroDivisionError: float division by zero #495

Open
arjungandeeva opened this issue Apr 4, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@arjungandeeva

I'm encountering a ZeroDivisionError: float division by zero in camelot-py, raised by the expression bbox_intersection_area(ba, bb) / bbox_area(ba). It occurs only under certain conditions, most likely when the area of the bounding box ba is zero.
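
For illustration, here is a minimal sketch of how a degenerate (zero-width) box could trigger this error. The `Box` type and the helper functions are hypothetical stand-ins written for this example; they are not camelot's actual implementations, and the coordinates are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a layout object carrying bounding-box coordinates.
Box = namedtuple("Box", ["x0", "y0", "x1", "y1"])


def area(b):
    """Area of an axis-aligned box (stand-in, not camelot's bbox_area)."""
    return (b.x1 - b.x0) * (b.y1 - b.y0)


def intersection_area(a, b):
    """Overlap area of two boxes, 0.0 if they do not overlap (stand-in)."""
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    return max(w, 0.0) * max(h, 0.0)


def overlap_ratio(a, b):
    """Fraction of a's area covered by b, with no guard against zero area."""
    return intersection_area(a, b) / area(a)


degenerate = Box(10.0, 20.0, 10.0, 30.0)  # x0 == x1, so the area is zero
other = Box(0.0, 0.0, 50.0, 50.0)

try:
    overlap_ratio(degenerate, other)
    raised = False
except ZeroDivisionError:
    # Dividing by a float zero raises ZeroDivisionError in Python.
    raised = True
```

Any box whose width or height collapses to zero would hit the same unguarded division.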

arjungandeeva added the bug label Apr 4, 2024
@cktse

cktse commented Apr 10, 2024

I did a quick fix/hack to circumvent the error by skipping over the area check if ba is singular (area is zero):

~/.pyenv/versions/3.11.3/lib/python3.11/site-packages/camelot/utils.py: Line 375:

            if bbox_area(ba) > 0 and bbox_intersect(ba, bb):
                # if the intersection is larger than 80% of ba's size, we keep the longest
                if (bbox_intersection_area(ba, bb) / bbox_area(ba)) > 0.8:
                    if bbox_longer(bb, ba):
                        rest.discard(ba)
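
The same guard can be sketched as a small standalone helper. This is only an illustration of the idea behind the quick fix, not the code in camelot or in the fork: `Box`, `box_area`, `box_intersection_area` and `mostly_covered` are hypothetical names introduced here, and the 0.8 threshold is taken from the snippet above.

```python
from collections import namedtuple

# Hypothetical stand-in for a layout object carrying bounding-box coordinates.
Box = namedtuple("Box", ["x0", "y0", "x1", "y1"])


def box_area(b):
    """Area of an axis-aligned box (stand-in, not camelot's bbox_area)."""
    return (b.x1 - b.x0) * (b.y1 - b.y0)


def box_intersection_area(a, b):
    """Overlap area of two boxes, 0.0 if they do not overlap (stand-in)."""
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    return max(w, 0.0) * max(h, 0.0)


def mostly_covered(a, b, threshold=0.8):
    """Return True if more than `threshold` of a's area lies inside b.

    A box with zero (or negative) area is never reported as covered,
    which avoids the division by zero.
    """
    a_area = box_area(a)  # computed once and reused for guard and ratio
    if a_area <= 0:
        return False
    return box_intersection_area(a, b) / a_area > threshold


big = Box(0.0, 0.0, 50.0, 50.0)
print(mostly_covered(Box(10.0, 20.0, 10.0, 30.0), big))  # zero-width box
print(mostly_covered(Box(0.0, 0.0, 10.0, 10.0), big))    # fully inside big
```

One consequence of this choice is that zero-area boxes are never discarded by the overlap check, which matches the behaviour of the quick fix above; whether they should be dropped earlier instead is a separate question.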

@bosd
Collaborator

bosd commented Aug 6, 2024

Hey!

As discussed in #343, we are building a maintained fork at pypdf_table_extraction.

Do you want to check that code and open an issue / PR there to include this fix?

@cktse

cktse commented Aug 12, 2024

I just took another look at the branches -- it looks like this has already been fixed as part of "Release camelot-fork 0.20.1", which is already included in your fork.

@bosd
Collaborator

bosd commented Aug 12, 2024

Thanks for checking 👍

@cktse

cktse commented Aug 12, 2024

Great to see camelot lives on!

BTW is this fork going to be packaged on pip under a separate name? Think the current package is stale from the main branch.

@bosd
Collaborator

bosd commented Aug 12, 2024

> BTW is this fork going to be packaged on pip under a separate name? Think the current package is stale from the main branch.

Yes, it is published here https://pypi.org/project/pypdf-table-extraction/

We're currently working on a new release by merging the open PRs from this repo and rebranding the package.
