Fix IndexError when abbrv is longer than original
In some cases the abbreviation is longer than the original word because
a dot is appended to a word that is not actually abbreviated, e.g., "Control".
When this happens, the abbreviation is truncated to the length of the
original word, which removes the dot.
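
The failure and the guard can be reproduced in isolation. The following is a minimal sketch, not the pyiso4 implementation: `match_capitalization` and its `guard` parameter are hypothetical stand-ins that only copy capitalization and skip the `normalize`/`unidecode` steps of the real function.

```python
def match_capitalization(abbrv: str, original: str, guard: bool = True) -> str:
    """Hypothetical, simplified stand-in for match_capitalization_and_diacritic."""
    if guard and len(abbrv) > len(original):
        # Drop the surplus characters (here, the trailing dot).
        abbrv = abbrv[:len(original)]

    result = []
    for i, c in enumerate(abbrv):
        o = original[i]  # raises IndexError if abbrv is longer than original
        result.append(c.upper() if o.isupper() else c)
    return "".join(result)


# Without the guard, the extra dot indexes past the end of "Control".
try:
    match_capitalization("control.", "Control", guard=False)
except IndexError:
    unguarded_failed = True
else:
    unguarded_failed = False

# With the guard, the abbreviation is cut to the original's length first.
guarded = match_capitalization("control.", "Control")
```

Truncating before the loop keeps the per-character indexing unchanged, at the cost of silently discarding any surplus characters rather than only a trailing dot.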
klb2 committed Jul 5, 2024
1 parent 8cc47b5 commit a18e7ba
Showing 2 changed files with 6 additions and 1 deletion.
3 changes: 3 additions & 0 deletions pyiso4/ltwa.py
@@ -177,6 +177,9 @@ def match_capitalization_and_diacritic(abbrv: str, original: str) -> str:
    """Matches the capitalization and diacritics of the `original` word, as long as they are similar
    """

    if len(abbrv) > len(original):
        abbrv = abbrv[:len(original)]

    normalized_abbrv = list(normalize(abbrv, Level.SOFT))
    for i, c in enumerate(normalized_abbrv):
        unided = unidecode(original[i])
4 changes: 3 additions & 1 deletion tests/tests.tsv
@@ -40,4 +40,6 @@ Zeitschrift des Deutschen Palästina-Vereins Z. Dtsch. Paläst.-Ver.
International Journal of e-Collaboration Int. J. e-Collab.
Proceedings of A. Razmadze Mathematical Institute Proc. A. Razmadze Math. Inst.
Norsk Militært Tidsskrift Nor. Mil. Tidsskr.
Proceedings of the 2024 Conference on Science Proc. 2024 Conf. Sci.
IEEE Power and Energy Magazine IEEE Power Energy Mag.
IEEE Transactions on Automatic Control IEEE Trans. Autom. Control
