Chapter 03 Yelp Dataset has a Typo #30

Open
amancioandre opened this issue Jun 11, 2020 · 0 comments

Hi everyone,

The Chapter 3 code fails to load the Yelp data because of a typo on the last line of the dataset. The line appears to be cut off, and its opening quote is never closed:

Line Review
73357: "1","Capital City Transfer han

Passing the nrows argument with the number of rows minus 1, so that the broken last line is never read, fixed it for me:

train_reviews = pd.read_csv(args.raw_train_dataset_csv, header=None, names=['rating', 'review'], nrows=73356)

Or:

train_reviews = pd.read_csv(args.raw_train_dataset_csv, header=None, names=['rating', 'review'], error_bad_lines=False)

Or just append a closing " to the end of that line.
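
The quote-appending fix can also be done programmatically instead of editing the file by hand. Below is a minimal sketch, not the repository's own code: the helper name `close_unterminated_quote` and the two-row sample are hypothetical, and it assumes the default CSV convention where a literal quote inside a field is written as two quotes, so well-formed text always contains an even number of them.

```python
import csv
import io


def close_unterminated_quote(text: str) -> str:
    """Append a closing double quote if the CSV text contains an odd number.

    Assumption: quotes inside fields are escaped by doubling (""), so an odd
    count means one quoted field was never closed.
    """
    if text.count('"') % 2 == 0:
        return text  # quotes are balanced, nothing to repair
    stripped = text.rstrip("\r\n")
    trailing_newlines = text[len(stripped):]
    # Put the missing quote before any trailing newline characters.
    return stripped + '"' + trailing_newlines


# Hypothetical miniature of the broken file: the last row is cut off mid-field.
sample = '"2","Great food"\n"1","Capital City Transfer han\n'
repaired = close_unterminated_quote(sample)

# Parse the repaired text with the standard library to check the result.
rows = list(csv.reader(io.StringIO(repaired)))
print(rows)
```

The repaired text could then be handed to pd.read_csv through io.StringIO(repaired) with the same header=None and names arguments as above. Note that this only helps when the unclosed quote is on the final line, as it is here.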

Still, it would be nice to fix this typo in the dataset itself.
