I have a PDF in which a table is spread across multiple pages. I need it as a single table in CSV or Excel format.
I have attached a screenshot of the PDF as well.
Steps to reproduce the bug
If you run the code below on the PDF, it extracts the first table correctly but does not extract the table below it.
Expected behavior
Both tables should be extracted and merged into a single table.
Code
import os
from typing import Any, Dict

import camelot


# The original snippet is an excerpt; the function name and signature are assumed.
def extract_tables(pdf_path: str, tables_dir: str) -> Dict[str, Any]:
    try:
        tables = camelot.read_pdf(pdf_path, pages="all")  # Extract all pages
    except Exception as e:
        print(f"Error extracting tables from {pdf_path}: {e}")
        return {}

    extracted_data: Dict[str, Any] = {}
    # Store table data as CSV and include path in JSON
    for i, table in enumerate(tables):
        table_filename = f"table_{i + 1}.csv"
        table_path = os.path.join(tables_dir, table_filename)
        table.to_csv(table_path, index=False)  # store as CSV
        extracted_data[f"table_{i+1}"] = table_path
    return extracted_data
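For reference, a minimal sketch of the expected behavior: it merges the pieces Camelot returns into a single CSV instead of writing one file per piece. The helper names (`merge_row_blocks`, `write_csv`, `merge_pdf_tables`) are hypothetical and not part of Camelot. The sketch assumes Camelot detects every piece of the table, that all pieces have the same number of columns, and that a header row repeated on later pages matches the first one exactly. It does not fix the detection problem itself; if a piece is missed, trying `flavor="stream"` in `camelot.read_pdf` may be worth a test.

```python
import csv
from typing import Iterable, List, Sequence

Row = List[str]


def merge_row_blocks(blocks: Iterable[Sequence[Sequence[str]]]) -> List[Row]:
    """Concatenate row blocks (one per extracted table) into one table.

    The first row of the first non-empty block is treated as the header.
    A later block that starts with an identical row has that row dropped.
    """
    merged: List[Row] = []
    header = None
    for block in blocks:
        rows = [list(r) for r in block]
        if not rows:
            continue  # skip empty extractions
        if header is None:
            header = rows[0]
        else:
            if len(rows[0]) != len(header):
                raise ValueError("blocks have different column counts")
            if rows[0] == header:
                rows = rows[1:]  # drop the repeated header row
        merged.extend(rows)
    return merged


def write_csv(rows: Iterable[Sequence[str]], path: str) -> None:
    """Write the merged rows to a single CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


def merge_pdf_tables(pdf_path: str, csv_path: str) -> List[Row]:
    """Extract all tables from the PDF and save them as one CSV (assumed workflow)."""
    import camelot  # imported lazily so the helpers above work without it

    tables = camelot.read_pdf(pdf_path, pages="all")
    # Each Camelot table exposes its cells as a pandas DataFrame in `.df`.
    blocks = [table.df.values.tolist() for table in tables]
    rows = merge_row_blocks(blocks)
    write_csv(rows, csv_path)
    return rows
```

Calling `merge_pdf_tables(pdf_path, "merged.csv")` would then produce one file for the whole table, provided the assumptions above hold for the PDF in question.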
PDF
Screenshots
Environment
OS: [e.g. macOS]
Python version:
Numpy version:
OpenCV version:
Ghostscript version:
camelot version:
Additional context