hOCRtoMarkdown

Synopsis: hocr2md sourceImage [OPTION]

Description: Converts an image into an hOCR file via Tesseract, then converts the hOCR into Markdown.

    -h, --help              display this help menu
    
    -o, --output            set the name and location of the output Markdown file (the directory must already exist)

    -e, --extractImages     extract images into nameOfInputFile_Images/

    -p, --psm               choose psm to be used by tesseract (default=3)

    -l, --lang              choose language to be used by tesseract (default=por)

    --extractImagesFolder   choose the folder to extract images into
    
    --conf                  minimum line confidence (default=40; lines below this value are deleted)
    
    --dc                    show image with Careas limits drawn

    --dp                    show image with Pars limits drawn

    --dl                    show image with Lines limits drawn

    --di                    show image with Images limits drawn

    --da                    show image with Articles limits drawn
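
The `--conf` option drops lines whose confidence falls below a threshold. The sketch below illustrates one way such a filter could work on hOCR data. It assumes that a line's confidence is the mean of the `x_wconf` values stored in the `title` attribute of its words; the README does not say how hocr2md computes it, and the helper names `word_confidence` and `keep_line` are hypothetical.

```python
import re
from statistics import mean

# Hypothetical helpers for illustration; hocr2md's real implementation may differ.
WCONF_RE = re.compile(r"x_wconf\s+(\d+(?:\.\d+)?)")


def word_confidence(title):
    """Return the x_wconf value found in an hOCR title attribute, or None."""
    match = WCONF_RE.search(title)
    return float(match.group(1)) if match else None


def keep_line(word_titles, threshold=40.0):
    """Keep a line when the mean confidence of its words reaches the threshold.

    Assumption: line confidence is the mean of the word confidences.
    """
    confs = [c for c in map(word_confidence, word_titles) if c is not None]
    if not confs:
        return False  # no confidence data, so treat the line as unreliable
    return mean(confs) >= threshold
```

Here `word_titles` would be the `title` strings of the word elements in one line, for example `"bbox 10 20 110 40; x_wconf 93"`.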

Some page segmentation modes:

     1                      Automatic page segmentation with OSD.
     3                      Fully automatic page segmentation, but no OSD. (Default)
     4                      Assume a single column of text of variable sizes.
     5                      Assume a single uniform block of vertically aligned text.
     6                      Assume a single uniform block of text.
    11                      Sparse text. Find as much text as possible in no particular order.
    13                      Raw line. Treat the image as a single text line,
                            bypassing hacks that are Tesseract-specific.
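
The `--psm` and `--lang` values are passed on to Tesseract. As a rough sketch of the first stage (image to hOCR), the function below builds a Tesseract command line using the defaults listed above. The function name `build_tesseract_command` and the choice of output base name are assumptions made for illustration; how hocr2md actually invokes Tesseract is not shown in this README.

```python
from pathlib import Path


def build_tesseract_command(image, output_base=None, psm=3, lang="por"):
    """Build the argument list for a Tesseract run that writes an hOCR file.

    Hypothetical helper: the defaults mirror the README (psm=3, lang=por).
    """
    image = Path(image)
    if output_base is None:
        # Assumption: name the output after the image, minus its extension.
        output_base = image.with_suffix("")
    return [
        "tesseract",
        str(image),
        str(output_base),
        "--psm", str(psm),
        "-l", lang,
        "hocr",  # Tesseract config that selects hOCR output
    ]
```

Passing the result to `subprocess.run(cmd, check=True)` would run Tesseract, provided it is installed and the requested language data is available. For `scan.png` with the defaults, Tesseract would write `scan.hocr`.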

DrBr4n/hOCRtoMarkdown
