
Feature: inherit robot hiding #39

Open
rolfrb opened this issue Nov 20, 2018 · 4 comments

rolfrb commented Nov 20, 2018

It would be really, really neat if we could choose to let the hiding from external search engines be inherited by sub-pages. If you can find a sub-page through a search engine, it's really easy for users to find the parent page. For instance, if x is hidden but y isn't, then finding x is really simple for users coming to

https://www.example.com/x/y/

So generally we hide sections of a site, not individual pages, and today we would have to remember to hide each page in a section.

Bellfalasch (Contributor) commented

Agree, that would be a nice feature. However, the robot config easily gets complicated and borders on an area slightly outside of SEO, so I am more tempted to create a robots.txt app instead that adds these checkboxes and options to the content instead of piggybacking on the SEO app. This would of course break things a bit, since the robot config would be removed from the SEO app entirely.

poi33 (Contributor) commented May 27, 2020

We have a robots.txt app now.
It should handle all exclusions. The extended standard that Google uses for robots.txt even lets you exclude sections of a website.
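For instance, a single robots.txt rule can hide a whole section so that nothing under it has to be excluded page by page (the paths below are only placeholders):

```
User-agent: *
# One rule covers the whole /x/ section, including /x/y/
Disallow: /x/

# Google's extended syntax also allows wildcards and end-of-URL anchors
Disallow: /*/internal/
Disallow: /*.pdf$
```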

poi33 closed this as completed May 27, 2020
rolfrb (Author) commented May 27, 2020

Yes, but

  1. Google will ignore robots.txt directives if it finds a link to the page somewhere else, and recommends using meta tags instead (see the snippet after this list):

     > To properly prevent your URL from appearing in Google Search results, you should password-protect the files on your server or use the noindex meta tag or response header (or remove the page entirely).

  2. The sitemap app uses the settings from this app to remove pages from sitemap.xml, so using robots.txt won't remove the pages from the sitemap.
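For reference, the meta-tag variant of Google's recommendation looks roughly like this; deciding which pages carry it is exactly what an inherited hiding setting would control:

```html
<!-- Placed in the <head> of each page that should stay out of search results -->
<meta name="robots" content="noindex" />
```

The same signal can be sent as an `X-Robots-Tag: noindex` HTTP response header, which also covers non-HTML resources.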

Anyway - your app, your call :) Perhaps we'll look into simple password protection, as that is one of Google's other suggestions.

poi33 (Contributor) commented May 27, 2020

If possible, in the robots.txt app we would want to find all pages excluded by the robots.txt config and also meta-exclude those pages.
The problem is that we would need to use regexes or, even worse, implement our own matcher for the robots exclusion format (roughly like the sketch below).
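A rough sketch of what such a matcher could look like, assuming we only support the `*` wildcard and `$` end-anchor from the robots.txt pattern syntax (the function names are made up for illustration, not the app's actual API):

```typescript
// Sketch: turn a robots.txt Disallow pattern into a RegExp and test paths against it.
function disallowPatternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters (but not '*')
    .replace(/\*/g, '.*');                 // '*' matches any sequence of characters
  // A trailing '$' in the pattern anchors the match to the end of the URL path
  const anchored = escaped.endsWith('\\$')
    ? escaped.slice(0, -2) + '$'
    : escaped;
  return new RegExp('^' + anchored); // Disallow rules match from the start of the path
}

function isExcluded(path: string, disallowPatterns: string[]): boolean {
  return disallowPatterns.some((p) => disallowPatternToRegExp(p).test(path));
}

// Example: one rule on /x/ also excludes /x/y/, so that page could get a meta noindex too
isExcluded('/x/y/', ['/x/']); // true
```

This ignores Allow rules and longest-match precedence, which is part of why a hand-rolled matcher quickly gets hairy.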

I was thinking maybe a content selector, adding all pages that you want excluded.
But I don't think the app users would want to add a config, and then also include all pages "again" with the content selector.

I'll try finding a good solution; suggestions are welcome.
