Allow reading order detectors to support any class that has a bounding box/PdfRectangle #855

Open
davebrokit opened this issue Jun 25, 2024 · 1 comment
Labels
document-reading Related to reading documents enhancement

Comments

@davebrokit
Contributor

davebrokit commented Jun 25, 2024

Currently the IReadingOrderDetector interface takes TextBlock as its parameter. This limits its use to the TextBlock class.

I propose adding an IBoundingBox interface

public interface IBoundingBox
{
    PdfRectangle BoundingBox { get; }
}

Then change the IReadingOrderDetector interface and its implementing classes to take IBoundingBox as the parameter type.

Adding an overload that takes a Func<T, PdfRectangle> would let the caller supply the bounding box for any type, making the interface more useful.
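A minimal sketch of what the generic interface and the Func<T, PdfRectangle> overload could look like. The method name Get, the generic signatures, and the TopToBottomDetector class are assumptions for illustration, not the existing PdfPig API, and PdfRectangle here is a simplified stand-in so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for PdfPig's PdfRectangle (assumption: only the four edges are needed here).
public readonly record struct PdfRectangle(double Left, double Bottom, double Right, double Top);

// The interface proposed in this issue.
public interface IBoundingBox
{
    PdfRectangle BoundingBox { get; }
}

// Hypothetical generic shape of the detector; name and signatures are assumptions.
public interface IReadingOrderDetector
{
    // For types that implement IBoundingBox.
    IReadOnlyList<T> Get<T>(IEnumerable<T> blocks) where T : IBoundingBox;

    // For any type: the caller says how to obtain the bounding box.
    IReadOnlyList<T> Get<T>(IEnumerable<T> blocks, Func<T, PdfRectangle> getBoundingBox);
}

// Toy detector used only to exercise the interface: top-to-bottom, then left-to-right.
// PDF coordinates have their origin at the bottom-left, so a larger Top comes first.
public sealed class TopToBottomDetector : IReadingOrderDetector
{
    public IReadOnlyList<T> Get<T>(IEnumerable<T> blocks) where T : IBoundingBox
        => Get(blocks, b => b.BoundingBox);

    public IReadOnlyList<T> Get<T>(IEnumerable<T> blocks, Func<T, PdfRectangle> getBoundingBox)
        => blocks
            .OrderByDescending(b => getBoundingBox(b).Top)
            .ThenBy(b => getBoundingBox(b).Left)
            .ToList();
}
```

With this shape, a caller whose type does not implement IBoundingBox could write something like detector.Get(items, i => i.Box), while TextBlock and similar classes would go through the constrained overload.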

Breaking changes: IReadingOrderDetector would instead return an IReadOnlyList<T> containing the ordered results. This means TextBlock.ReadingOrder would no longer be set, which is a breaking change. However, some code can be added so that ReadingOrder is still set when T is TextBlock.
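One way the TextBlock special case might look, as a rough sketch. The TextBlock class below is a simplified stand-in with a settable ReadingOrder, and ReadingOrderCompat.Order and the getTop selector are hypothetical names used only for illustration; the real ordering logic would live in each detector.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for PdfPig's TextBlock; only the members this sketch needs.
public sealed class TextBlock
{
    public TextBlock(string text, double top)
    {
        Text = text;
        Top = top;
    }

    public string Text { get; }
    public double Top { get; }

    // -1 means "not ordered yet" in this sketch (assumption).
    public int ReadingOrder { get; set; } = -1;
}

public static class ReadingOrderCompat
{
    // Hypothetical helper: orders items top-to-bottom and returns them,
    // then back-fills ReadingOrder on any item that happens to be a TextBlock.
    public static IReadOnlyList<T> Order<T>(IEnumerable<T> items, Func<T, double> getTop)
    {
        var ordered = items.OrderByDescending(getTop).ToList();

        for (var i = 0; i < ordered.Count; i++)
        {
            if (ordered[i] is TextBlock block)
            {
                block.ReadingOrder = i;
            }
        }

        return ordered;
    }
}
```

Existing callers that read TextBlock.ReadingOrder would keep working, while callers passing other types simply use the returned list.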

Happy to make the changes

@BobLd
Collaborator

BobLd commented Jun 25, 2024

@davebrokit I was thinking of doing something similar, please go ahead and implement your idea.

I wrote a similar interface for my project: https://github.com/BobLd/Caly/blob/master/Caly.Pdf/Models/IPdfTextElement.cs. Feel free to reuse it or not.

I think the Letter class has a method instead of a property to get the bounding box. This might be a good opportunity to change that too (in my mind, letters, text lines and text blocks should all implement your interface, but please let me know what you think).
