fake user agent for IMDB scraping #223

Merged: rmrector merged 3 commits into xbmc:master on Dec 21, 2024

pkscout
Member

pkscout commented Dec 11, 2024

This should fix IMDB scraping by sending a fake but valid User-Agent header. I have tested this with the TV show scraper, and it works fine. I haven't had a chance to test it with movies, so someone should validate that it works there and doesn't break some other part of the scraping. Movie scraping is more straightforward, so this shouldn't break anything.
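For readers unfamiliar with the technique, here is a minimal sketch of sending a browser-like User-Agent, using only the Python standard library. It is not the code from this PR: the `build_imdb_request` helper, the `FAKE_USER_AGENT` value, and the example title URL are all assumptions made for illustration.

```python
from urllib.request import Request

# Hypothetical browser-like value; the string actually used in the PR is not shown here.
FAKE_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)

def build_imdb_request(url, user_agent=FAKE_USER_AGENT):
    """Return a Request that carries only a browser-like User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})

# Building the request does not touch the network; urlopen(req) would send it.
req = build_imdb_request("https://www.imdb.com/title/tt0111161/")
print(req.header_items())
```

Note that `urllib` stores header names capitalized, so the value is read back with `req.get_header("User-agent")`.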

@KarellenX
Member

@pkscout
I copied the changes to my local install, restarted and tested.
No IMDB ratings were scraped.

@pkscout
Member Author

pkscout commented Dec 13, 2024

> @pkscout I copied the changes to my local install, restarted and tested. No IMDB ratings were scraped.

I'll find some time to take a closer look this weekend. It's possible the header has some extra stuff in it from other web calls, and that might be causing IMDB to continue blocking it. I had that problem with the TV show scraper, so I had to make sure to blank out the header and then create a new one.
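The "blank out the header and then create a new one" idea could look roughly like the sketch below. The shared dictionary, its leftover entries, and the `reset_headers_for_imdb` name are hypothetical stand-ins, not the scraper's real state or functions.

```python
# Hypothetical shared header dict, standing in for state left by earlier web calls.
shared_headers = {
    "Authorization": "Bearer hypothetical-token",
    "Accept": "application/json",
}

def reset_headers_for_imdb(headers, user_agent):
    """Blank out the shared dict in place, then set only a User-Agent."""
    headers.clear()
    headers["User-Agent"] = user_agent
    return headers

reset_headers_for_imdb(shared_headers, "Mozilla/5.0 (hypothetical browser string)")
print(shared_headers)
```

Clearing in place matters when other code holds a reference to the same dictionary; rebinding the name to a new dict would leave those references pointing at the stale entries.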

@pkscout
Member Author

pkscout commented Dec 13, 2024

@KarellenX try the updated fix. I think I stuck the call to set the header in the wrong place the first time. I also had to clean up how and when headers were set, as we were sending every site a bunch of header info that didn't relate to it, and IMDB could use that to block us again. This one worked on my test setup, at least for the dozen or so movies I tested.
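One way to avoid sending unrelated header info to every site is to keep headers per host and look them up at request time. This is only a sketch of that idea under assumed names; `SITE_HEADERS`, `headers_for`, and the `api.example.org` host are illustrative and do not come from the PR.

```python
from urllib.parse import urlparse

# Hypothetical per-site header table; hosts and values are for illustration only.
SITE_HEADERS = {
    "www.imdb.com": {"User-Agent": "Mozilla/5.0 (hypothetical browser string)"},
    "api.example.org": {"Accept": "application/json"},
}

def headers_for(url):
    """Return a fresh copy of the headers registered for this URL's host."""
    host = urlparse(url).hostname or ""
    # Copy so callers cannot mutate the shared table by accident.
    return dict(SITE_HEADERS.get(host, {}))

print(headers_for("https://www.imdb.com/title/tt0111161/"))
print(headers_for("https://api.example.org/v1/items"))
```

Unknown hosts get an empty dict, so nothing site-specific leaks to a site it was not meant for.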

@KarellenX
Member

@pkscout IMDB ratings were scraped for the handful of movies I tested.
Thanks

pkscout requested a review from rmrector December 14, 2024 11:25
@pkscout
Member Author

pkscout commented Dec 14, 2024

Great. I've added @rmrector as a requested reviewer. Hopefully he'll be by soon to merge this and then get the update pushed out.

rmrector merged commit b14e1fb into xbmc:master Dec 21, 2024
2 checks passed