Commit

Fix missed detection of single letter part names
Fix the misclassification of single-letter part names when the single
letter is also a stopword.
klb2 committed Jul 5, 2024
1 parent a18e7ba commit 7955b63
Showing 2 changed files with 8 additions and 1 deletion.
3 changes: 2 additions & 1 deletion pyiso4/lexer.py
@@ -132,7 +132,7 @@ def yield_hyphenated(word: str, base_pos: int) -> Iterable[Token]:
yield Token(TokenType.PART, word, self.start_word)
was_part = self.count
# check if ordinal (preceded by PART)
-elif IS_ORDINAL.match(word) and self.count == was_part + 1:
+elif IS_ORDINAL.fullmatch(word) and self.count == was_part + 1:
yield Token(TokenType.ORDINAL, word, self.start_word)
# check if article (after ordinal, so "a" is detected as ordinal if preceded by PART)
elif lower_word in ARTICLES:
@@ -155,6 +155,7 @@ def yield_hyphenated(word: str, base_pos: int) -> Iterable[Token]:
# yield the remaining symbols, if any
if len(end_symbols) > 0:
yield Token(TokenType.SYMBOLS, end_symbols, self.pos - len(end_symbols))
+was_part = self.count

self.next()
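
The switch from `match` to `fullmatch` in the first hunk can be illustrated with a minimal sketch. The pattern below is hypothetical (the real `IS_ORDINAL` in `pyiso4/lexer.py` is not shown in this diff and may differ); it only assumes that the ordinal pattern is not anchored at the end, in which case `match` accepts any word that merely starts like an ordinal.

```python
import re

# Hypothetical stand-in for IS_ORDINAL; the actual pattern in pyiso4 may differ.
ORDINAL_SKETCH = re.compile(r"[A-Z]|\d+")


def is_ordinal_prefix(word: str) -> bool:
    """Old behaviour: succeeds if the pattern matches at the start of the word."""
    return ORDINAL_SKETCH.match(word) is not None


def is_ordinal_full(word: str) -> bool:
    """New behaviour: succeeds only if the pattern matches the whole word."""
    return ORDINAL_SKETCH.fullmatch(word) is not None


for word in ("A", "Iuridica", "2024"):
    print(word, is_ordinal_prefix(word), is_ordinal_full(word))
```

Under this assumption, a full word such as "Iuridica" passes the prefix check because its first letter matches, but fails the full-match check, while a genuine single-letter part name such as "A" passes both.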

6 changes: 6 additions & 0 deletions tests/tests.tsv
@@ -43,3 +43,9 @@ Norsk Militært Tidsskrift	Nor. Mil. Tidsskr.
Proceedings of the 2024 Conference on Science	Proc. 2024 Conf. Sci.
IEEE Power and Energy Magazine	IEEE Power Energy Mag.
IEEE Transactions on Automatic Control	IEEE Trans. Autom. Control
+E.S.A. bulletin	E.S.A. bull.
+Acta Universitatis Carolinae. Iuridica	Acta Univ. Carol., Iurid.
+Physical Review. A	Phys. Rev., A
+Physical Review. D	Phys. Rev., D
+Physical Review. E	Phys. Rev., E
+Physical Review. I	Phys. Rev., I