Scraping two different PDFs yields exactly the same results when using the FileCache.
The problem is that set_hash_key() always computes the same key: the file pointer is already at the end of the file when the hash is computed, so the md5 is taken over empty input (md5("") == "d41d8cd98f00b204e9800998ecf8427e") and pdfquery ends up using the same cached data for both PDFs.
Adding file.seek(0) before computing the md5 seems to solve the issue.
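The behaviour can be reproduced without pdfquery. The sketch below is only an illustration, not pdfquery's actual code: the two helper functions are hypothetical stand-ins for what set_hash_key() does, and io.BytesIO objects with made-up contents stand in for the opened PDF files.

```python
import hashlib
import io

# md5 of empty input, as quoted in the report above.
EMPTY_MD5 = "d41d8cd98f00b204e9800998ecf8427e"


def hash_key_without_rewind(file):
    # Hypothetical stand-in: hashes whatever is left to read
    # from the current file position.
    return hashlib.md5(file.read()).hexdigest()


def hash_key_with_rewind(file):
    # Same as above, but rewinds first so the whole file is hashed.
    file.seek(0)
    return hashlib.md5(file.read()).hexdigest()


# Two different "PDFs" (contents are made up for the demonstration).
pdf_a = io.BytesIO(b"%PDF-1.4 first document")
pdf_b = io.BytesIO(b"%PDF-1.4 second document")

# Simulate a parser that has already consumed both files,
# leaving each file pointer at the end.
pdf_a.read()
pdf_b.read()

bad_a = hash_key_without_rewind(pdf_a)
bad_b = hash_key_without_rewind(pdf_b)
good_a = hash_key_with_rewind(pdf_a)
good_b = hash_key_with_rewind(pdf_b)

print(bad_a == bad_b == EMPTY_MD5)  # both keys collapse to md5("")
print(good_a != good_b)             # rewinding gives distinct keys
```

Without the rewind, both files produce the md5 of empty input and therefore share one cache entry; with the rewind, each file gets its own key.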