Some websites put meta tags outside the head. #192

Open
paul-rchds opened this issue Apr 13, 2022 · 2 comments

Comments

@paul-rchds

On some pages, meta tags appear outside the head tag, for example on this YouTube channel page: https://www.youtube.com/c/Freecodecamp

Because the opengraph extractor only looks inside the head tag, it misses all of the og:* meta properties on such pages.
In my fork, I changed the extractor to look in the body instead.

If that's okay with the maintainers, I can open a PR.

Here is a link to where I made the change:

for head in document.xpath('//head'):
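
To illustrate the difference between a head-only search and a whole-document search, here is a minimal sketch. It is not extruct's code and does not use lxml; it relies on Python's standard `html.parser` to stay dependency-free. The names `OpenGraphCollector` and `extract_opengraph` and the `SAMPLE` markup are made up for this example.

```python
from html.parser import HTMLParser


class OpenGraphCollector(HTMLParser):
    """Collect og:* <meta> tags and record whether each sat inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (property, content, was_in_head)

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta":
            attr = dict(attrs)
            prop = attr.get("property") or ""
            if prop.startswith("og:"):
                self.found.append((prop, attr.get("content"), self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def extract_opengraph(html, head_only=False):
    """Return (property, content) pairs; optionally restrict to <head>."""
    collector = OpenGraphCollector()
    collector.feed(html)
    collector.close()
    return [
        (prop, content)
        for prop, content, in_head in collector.found
        if in_head or not head_only
    ]


# Hypothetical markup with og:* tags placed in <body>, as described above.
SAMPLE = """<html><head><title>Channel</title></head>
<body><meta property="og:title" content="Example Channel">
<meta property="og:type" content="profile"></body></html>"""

head_only = extract_opengraph(SAMPLE, head_only=True)  # misses both tags
whole_doc = extract_opengraph(SAMPLE)                   # finds both tags
```

With the head-only search the sample yields nothing, while the whole-document search returns both properties, which is the behaviour change proposed here.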

@lopuhin
Member

lopuhin commented Apr 14, 2022

Hi @paul-rchds, yes, that would be great. I noticed the same issue myself but didn't get around to implementing everything required. Here is a link to a PR, #129; feel free to start a new one.

@frostrot

I have changed the extract_item function in the OpengraphExtractor class so that it also picks up meta tags outside of the head. I have tested it on the link shared by @paul-rchds. Please review my PR and let me know whether it works for you. Thanks
