Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[Match History] football-data CSV files about 24/25 season start with BOM char #750

Open
2 of 12 tasks
sandromodarelli opened this issue Nov 6, 2024 · 0 comments · May be fixed by #751
Open
2 of 12 tasks

[Match History] football-data CSV files about 24/25 season start with BOM char #750

sandromodarelli opened this issue Nov 6, 2024 · 0 comments · May be fixed by #751
Labels
bug Something isn't working

Comments

@sandromodarelli
Copy link

Describe the bug
all CSV files from football-data about 24/25 season start with a BOM charachter. this leads to a "fake" column named Div in the dataset and makes the league translation impossible (with an exception)

Affected scrapers
This affects the following scrapers:

  • ClubElo
  • ESPN
  • FBref
  • FiveThirtyEight
  • FotMob
  • Match History
  • SoFIFA
  • Understat
  • WhoScored

Code example
A minimal code example that fails. Use no_cache=True to make sure an invalid cached file does not cause the bug and make sure you have the latest version of soccerdata installed.

import soccerdata as sd
mh = sd.MatchHistory('ITA-Serie A', "24/25", no_cache=True)
data = mh.read_games()

Error message

Traceback (most recent call last):
  File ".venv/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'league'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".venv/lib/python3.12/site-packages/soccerdata/match_history.py", line 116, in read_games
    .pipe(self._translate_league)
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/pandas/core/generic.py", line 6231, in pipe
    return common.pipe(self, func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/pandas/core/common.py", line 502, in pipe
    return func(obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/soccerdata/_common.py", line 399, in _translate_league
    mask = ~df[col].isin(flip)
            ~~^^^^^
  File ".venv/lib/python3.12/site-packages/pandas/core/frame.py", line 4102, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 3812, in get_loc
    raise KeyError(key) from err
KeyError: 'league'

Additional context
Add any other context about the problem here.

Contributor Action Plan

  • I can fix this issue and will submit a pull request.
  • I’m unsure how to fix this, but I'm willing to work on it with guidance.
  • I’m not able to fix this issue.
@sandromodarelli sandromodarelli added the bug Something isn't working label Nov 6, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant