[Bug]: parse error #101

Closed
craigwinkler opened this issue Oct 27, 2023 · 4 comments · Fixed by #102
Labels
bug Something isn't working

Comments

@craigwinkler

enex2notion version

0.3.0

What OS are you using?

macOS

OS Version / Linux distribution

macOS 14.0

Bug description

In general enex2notion works terrifically well, thank you.

In some Evernote notebooks I get a parse error at the beginning of the process. Even in verbose mode there is no information beyond that.

In each case some notes are imported correctly, but far fewer than the source .enex file contains. Is there any more information about what the parsing problem was? I can't work out how to fix an issue I can't see.

In the log info provided below for "Personal" the original file contains 4569 notes, NOT the 284 that enex2notion thinks.

Log excerpt

WARNING: No token provided, dry run mode. Nothing will be uploaded to Notion!
INFO: Processing notebook 'Personal'...
WARNING: 'Personal.enex' file parsed with errors
DEBUG: Personal.enex:571373:95:xmlSAX2Characters: huge text node
DEBUG: 'Personal' notebook contains 284 note(s)
DEBUG: Parsing note 'Mindy G Cleaning'
DEBUG: Parsing note 'Electrolux EBE5307SA Refrigerator-Freezer manual'
...etc
craigwinkler added the bug label Oct 27, 2023
@vzhd1701
Owner

When you use the --verbose flag, all parsing errors are printed right after the "file parsed with errors" warning. This is your parse error, at line 571373, column 95 of Personal.enex:

DEBUG: Personal.enex:571373:95:xmlSAX2Characters: huge text node
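To inspect the spot the parser complained about, the debug line can be split into its file, line, column and message parts. The following is a minimal sketch, not part of enex2notion: the names `parse_error_line` and `read_context` are hypothetical, and the column is treated as a character offset, which is only an approximation of what the parser reports.

```python
import re
from typing import NamedTuple


class ParseErrorLocation(NamedTuple):
    filename: str
    line: int
    column: int
    message: str


# Matches "<file>:<line>:<column>:<message>", with an optional "DEBUG: " prefix.
_LOG_PATTERN = re.compile(
    r"^(?:DEBUG:\s*)?(?P<file>.+?):(?P<line>\d+):(?P<col>\d+):(?P<msg>.*)$"
)


def parse_error_line(log_line: str) -> ParseErrorLocation:
    """Split one parser debug line into its components (hypothetical helper)."""
    match = _LOG_PATTERN.match(log_line.strip())
    if match is None:
        raise ValueError(f"unrecognised log line: {log_line!r}")
    return ParseErrorLocation(
        filename=match["file"],
        line=int(match["line"]),
        column=int(match["col"]),
        message=match["msg"].strip(),
    )


def read_context(path: str, line_no: int, column: int, width: int = 40) -> str:
    """Return up to `width` characters on either side of the reported position."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        # Stream the file so a multi-gigabyte export is never fully in memory.
        for current, text in enumerate(handle, start=1):
            if current == line_no:
                start = max(column - width, 0)
                return text[start:column + width]
    raise ValueError(f"{path} has fewer than {line_no} lines")
```

Calling `read_context("Personal.enex", 571373, 95)` would then show the text surrounding the reported position, which helps identify the note that contains it.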

@zzamboni
Contributor

zzamboni commented Oct 27, 2023

@vzhd1701 Thanks also from my side for enex2notion!

Thanks for the explanation - I'm getting the same error, and looking at the location in my ENEX file, it's in the middle of a <data encoding="base64"> block in a note that contains a few large PDF attachments. I guess this is what's causing the "huge text node" error?

Is there a way to increase the maximum allowed node size, or some other workaround for this? I tried using --skip-failed but it doesn't seem to help with this particular error.

zzamboni added a commit to zzamboni/enex2notion that referenced this issue Oct 27, 2023
- Enabled the "huge_tree" option in the XML parser to prevent the
  "xmlSAX2Characters: huge text node" error.
- Fixed a "list index out of range" error that happened on some notes
  with title but no content.

Fixes vzhd1701#101.
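The "huge_tree" change in the commit above can be illustrated with a minimal sketch. It assumes the parser is lxml, where `huge_tree=True` lifts libxml2's built-in limits on text length and tree depth. The function `count_notes` is a hypothetical helper for checking how many notes an export really contains; it is not the code from the pull request.

```python
import io

from lxml import etree  # third-party dependency: pip install lxml


def count_notes(enex_bytes: bytes, huge_tree: bool = True) -> int:
    """Count <note> elements in an ENEX export (hypothetical helper).

    With huge_tree=True the parser accepts very long text nodes, such as
    base64-encoded attachments, instead of reporting "huge text node".
    """
    count = 0
    context = etree.iterparse(
        io.BytesIO(enex_bytes),
        events=("end",),
        tag="note",
        huge_tree=huge_tree,
    )
    for _event, element in context:
        count += 1
        # Free the note's children once counted so memory use stays flat.
        element.clear()
    return count
```

Because `huge_tree` disables a safety limit, it is best reserved for files from a trusted source, such as your own Evernote exports.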
@zzamboni
Contributor

I've just submitted PR #102 to fix this bug and another one I found, where some notes that had just a title and no content were failing.
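The title-only failure is the kind of error that comes from indexing into an empty result. The sketch below shows the general pattern of an explicit emptiness check. It is an illustration using the standard library, with a hypothetical function name, and does not reproduce the actual change in the pull request.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def first_note_body(content_xml: str) -> Optional[ET.Element]:
    """Return the first <en-note> element, or None when the note has no body."""
    if not content_xml or not content_xml.strip():
        # Title-only note: there is nothing to parse.
        return None
    root = ET.fromstring(content_xml)
    bodies = [root] if root.tag == "en-note" else root.findall(".//en-note")
    # Check explicitly instead of using bodies[0] directly, which would
    # raise "list index out of range" when no body is present.
    if not bodies:
        return None
    return bodies[0]
```

A caller can then treat `None` as an empty note and still import its title, rather than failing on the whole note.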

vzhd1701 added a commit that referenced this issue Oct 27, 2023
* Fix parse errors for huge and empty nodes

- Enabled the "huge_tree" option in the XML parser to prevent the
  "xmlSAX2Characters: huge text node" error.
- Fixed a "list index out of range" error that happened on some notes
  with title but no content.

Fixes #101.

* refactor: make empty dom check explicit

* test: add big resource note test

* test: add empty note dom test

---------

Co-authored-by: vzhd1701 <vzhd1701@gmail.com>
@vzhd1701
Owner

This should be fixed in the new version. Thanks again, @zzamboni! The Homebrew version will be updated in a few hours.

3 participants