Describe the bug
The splitter's methods _move_to_comma_or_closing_curly_bracket and _move_to_closed_bracket each contain a check for unexpected block starts. Unfortunately, this interferes with the parsing of entries that contain the @ sign as raw text.
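To illustrate the failure mode, here is a minimal standalone sketch. It is not the library's code: `find_closing_bracket` and `BlockAborted` are hypothetical stand-ins that only mimic the described behaviour of treating any `@` seen before the closing bracket as an unexpected block start.

```python
class BlockAborted(Exception):
    """Hypothetical stand-in for the library's BlockAbortedException."""


def find_closing_bracket(text: str, start: int) -> int:
    """Return the index of the `}` that closes the `{` at `start`.

    Simplified assumption: any `@` met before the closing bracket is
    treated as the start of a new block, and the scan is aborted.
    """
    depth = 0
    for i in range(start + 1, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            if depth == 0:
                return i
            depth -= 1
        elif ch == "@":
            raise BlockAborted(f"Unexpected block start at index {i}")
    raise BlockAborted("Reached end of input before closing bracket")


# A field value with a raw `@`, as in the title from this report.
field = "{LeQua @ {CLEF} 2022: {A} Shared Task}"
try:
    find_closing_bracket(field, 0)
    aborted = False
except BlockAborted:
    aborted = True

# The same kind of value without `@` is scanned to its closing bracket.
plain = "{LeQua at {CLEF} 2022}"
closing_index = find_closing_bracket(plain, 0)
```

In this sketch the scan aborts even though the brackets in `field` are balanced, which mirrors the behaviour reported here.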
Reproducing
Version: 2.0.0b7
Code:
Parsing this example fails because of the @ in the title: a BlockAbortedException is raised and the block is added to failed_blocks.
import bibtexparser

test = bibtexparser.parse_string(
    """
@inproceedings{DBLP:conf/cikm/EsuliM021,
author = {Andrea Esuli and Alejandro Moreo and Fabrizio Sebastiani},
editor = {Gao Cong and Maya Ramanath},
title = {LeQua @ {CLEF} 2022: {A} Shared Task for Evaluating Quantification Systems},
booktitle = {Proceedings of the {CIKM} 2021 Workshops co-located with 30th {ACM}
International Conference on Information and Knowledge Management {(CIKM}
2021), Gold Coast, Queensland, Australia, November 1-5, 2021},
series = {{CEUR} Workshop Proceedings},
volume = {3052},
publisher = {CEUR-WS.org},
year = {2021},
url = {https://ceur-ws.org/Vol-3052/abstract4.pdf},
timestamp = {Fri, 10 Mar 2023 16:22:33 +0100},
biburl = {https://dblp.org/rec/conf/cikm/EsuliM021.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
"""
)
print(test.entries_dict['DBLP:conf/cikm/EsuliM021'])
Bibtex:
@inproceedings{DBLP:conf/cikm/EsuliM021,
author = {Andrea Esuli and Alejandro Moreo and Fabrizio Sebastiani},
editor = {Gao Cong and Maya Ramanath},
title = {LeQua @ {CLEF} 2022: {A} Shared Task for Evaluating Quantification Systems},
booktitle = {Proceedings of the {CIKM} 2021 Workshops co-located with 30th {ACM}
International Conference on Information and Knowledge Management {(CIKM}
2021), Gold Coast, Queensland, Australia, November 1-5, 2021},
series = {{CEUR} Workshop Proceedings},
volume = {3052},
publisher = {CEUR-WS.org},
year = {2021},
url = {https://ceur-ws.org/Vol-3052/abstract4.pdf},
timestamp = {Fri, 10 Mar 2023 16:22:33 +0100},
biburl = {https://dblp.org/rec/conf/cikm/EsuliM021.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Workaround
Monkey-patching the two methods by removing the @ check leads to a successful parse.
Remaining Questions (Optional)
I would be willing to contribute a PR to fix this issue.
This issue is a blocker, I'd be grateful for an early fix.
A comment in the code says that new blocks are identified by appearing after a newline. If that assumption is generally safe to make, I could remove the two checks altogether. The only other solution I could think of is replacing the "@" check with a check against a tuple of the most common entry types, e.g. startswith(("@article", "@book", "@proceedings", ...)). Let me know if one of those works and I'll gladly prepare a PR.
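For discussion, the two options could be sketched roughly as follows. This is a standalone illustration, not the splitter's code: both function names and the `KNOWN_BLOCK_TYPES` tuple are hypothetical, and allowing leading whitespace before the `@` in the first option is my own assumption.

```python
# Hypothetical, incomplete list of block types for the second option.
KNOWN_BLOCK_TYPES = ("@article", "@book", "@proceedings", "@inproceedings")


def starts_block_after_newline(text: str, i: int) -> bool:
    """Option 1: `@` starts a block only if nothing but whitespace
    precedes it on its line."""
    if text[i] != "@":
        return False
    line_start = text.rfind("\n", 0, i) + 1
    return text[line_start:i].strip() == ""


def starts_known_block_type(text: str, i: int) -> bool:
    """Option 2: `@` starts a block only if it begins a known type."""
    return text.lower().startswith(KNOWN_BLOCK_TYPES, i)


sample = "title = {LeQua @ {CLEF} 2022},\n@misc{key,"
at_in_title = sample.index("@")    # raw `@` inside the title value
at_new_block = sample.rindex("@")  # `@misc` at the start of a line

newline_results = (
    starts_block_after_newline(sample, at_in_title),
    starts_block_after_newline(sample, at_new_block),
)
known_type_results = (
    starts_known_block_type(sample, at_in_title),
    starts_known_block_type(sample, at_new_block),
)
```

In this sketch both options ignore the `@` in the title. The tuple-based check also misses `@misc`, because it is not in the list, so any entry type left out of the tuple would go undetected. The newline-based check does not depend on such a list.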