Fixed issue with links not being found for new google response format #309
base: master
…itDataCallback() The Google page contains the image info in a script variable, `AF_initDataCallback`. See the JavaScript that parses it: https://gist.github.com/FarisHijazi/6c9ba3fb315d0ce9bfa62c10dfa8b2f8 This commit is an implementation of that code.
fix-2020-format I have added an iterator that returns rg_meta objects.
_parse_AF_initDataCallback() The BeautifulSoup library returns text differently under Python 2, and some Unicode decoding had to be done differently for Python 3. There were also some issues with siteAndNameInfo being accessed unsafely; these are fixed as well.
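For readers who want a feel for the parsing step described in these commits, here is a minimal sketch. It is not the code from this PR: the sample HTML, the assumed `data: [...]` shape inside `AF_initDataCallback(...)`, the helper names `iter_callback_payloads` and `iter_image_entries`, and the `url`/`width`/`height` keys are all hypothetical, and the real page layout may differ.

```python
import json
import re

# Hypothetical, heavily simplified stand-in for a results page.
SAMPLE_HTML = """
<html><body>
<script>AF_initDataCallback({key: 'ds:1', data: [["https://example.com/a.jpg", 600, 400], ["https://example.com/b.png", 800, 600]]});</script>
</body></html>
"""

# Assumed shape: AF_initDataCallback({... data: <JSON array> });
_CALLBACK_RE = re.compile(
    r"AF_initDataCallback\(\{.*?data:\s*(\[.*?\])\s*\}\);", re.DOTALL
)


def iter_callback_payloads(html):
    """Yield the decoded JSON array from each AF_initDataCallback(...) call."""
    for match in _CALLBACK_RE.finditer(html):
        try:
            yield json.loads(match.group(1))
        except ValueError:
            # Not valid JSON under our assumed shape; skip this call.
            continue


def iter_image_entries(html):
    """Yield one dict per image row (keys here are illustrative only)."""
    for payload in iter_callback_payloads(html):
        for row in payload:
            if isinstance(row, list) and len(row) == 3:
                url, width, height = row
                yield {"url": url, "width": width, "height": height}
```

Calling `list(iter_image_entries(SAMPLE_HTML))` on the sample yields two dicts, one per image row. Writing it as a generator lets the caller stop as soon as enough images have been collected.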
Can you add bs4 to requirements.txt?
Hi, I tried to use this PR locally and am getting errors when running.
The command I am using to download images
The exception raised when running
Yup, verified this PR doesn't work.
This won't fix failures, but it will catch any errors in _parse_AF_initDataCallback() and stop them from propagating any higher.
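The pattern this commit describes can be sketched roughly as below. This is an illustration, not the PR's actual code: `parse_safely` is a hypothetical wrapper name, and the parser passed in stands for whatever function does the real parsing.

```python
import logging

logger = logging.getLogger(__name__)


def parse_safely(parser, html):
    """Run parser(html); on any exception, log it and return an empty list."""
    try:
        return list(parser(html))
    except Exception:
        # Log the traceback so the failure is still visible, but do not re-raise.
        logger.exception("parsing failed; returning no results")
        return []
```

The trade-off is that a broken parser now looks like "no images found" to the caller, so the logged traceback is the only place the real error shows up.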
Done. I didn't want to add it to the requirements, as it is an optional requirement.
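One common way to treat bs4 as an optional dependency is a guarded import, sketched below. The function name `extract_script_texts` and the regex fallback are assumptions made for illustration; the PR may handle a missing bs4 differently.

```python
import re

try:
    from bs4 import BeautifulSoup  # optional dependency
except ImportError:
    BeautifulSoup = None


def extract_script_texts(html):
    """Return the text content of every <script> tag in html."""
    if BeautifulSoup is not None:
        soup = BeautifulSoup(html, "html.parser")
        return [str(tag.string or "") for tag in soup.find_all("script")]
    # Fallback (an assumption of this sketch): a crude regex when bs4 is absent.
    return re.findall(
        r"<script[^>]*>(.*?)</script>", html, re.DOTALL | re.IGNORECASE
    )
```

With this layout the script still imports cleanly on machines without bs4, which is the usual reason for keeping a package out of requirements.txt.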
Hi. The command that I am running: The error that I got: Item no.: 1 --> Item name = tree
This works for me. It is also the right solution to the problem. It might have a few bugs that will have to be sorted out before it works for everyone, but @FarisHijazi is right about AF_initDataCallback. I checked, and the required information certainly is there, so it just needs to be parsed out.
Works, but it thinks every image is a GIF, even when it's not.
At line 776, before:
Insert this:
If you don't, the script will think that all images are in GIF format.
This is a duplicate of #298
The new 2020 Google Images update changes where the image information is stored. I found that it is now stored in a script, in the variable AF_initDataCallback.
This implementation is backward compatible: it uses rg_meta first, and if that doesn't work, it parses the new info.
This code was tested with both Python 3 and Python 2.
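The backward-compatible flow described above could look roughly like the sketch below. It assumes the legacy pages carry JSON inside elements whose class starts with `rg_meta`; the helper names `parse_legacy` and `parse_with_fallback`, the sample markup, and the `ou` key are illustrative and not taken from this PR.

```python
import json
import re

# Assumed legacy markup: <div class="rg_meta ...">{...json...}</div>
_RG_META_RE = re.compile(r'class="rg_meta[^"]*">(.*?)</div>', re.DOTALL)


def parse_legacy(html):
    """Collect the JSON objects embedded in legacy rg_meta elements."""
    items = []
    for match in _RG_META_RE.finditer(html):
        try:
            items.append(json.loads(match.group(1)))
        except ValueError:
            continue  # skip anything that is not valid JSON
    return items


def parse_with_fallback(html, new_format_parser):
    """Try the legacy rg_meta path first; if it finds nothing, use the new parser."""
    items = parse_legacy(html)
    if items:
        return items
    return list(new_format_parser(html))
```

Here `new_format_parser` stands for whatever callable handles the AF_initDataCallback data. Passing it in as an argument keeps the two formats separate, so the legacy path can be dropped later without touching the new parser.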