Indexing issue regarding Astro v5 Beta and Astro v4 #10673

Open
ArmandPhilippot opened this issue Jan 9, 2025 · 8 comments
Assignees
Labels
site improvement Some thing that improves the website functionality - ask @delucis for help!

Comments

@ArmandPhilippot
Member

📋 Explain your issue

I just came across this link after a Google search: https://5-0-0-beta--astro-docs-2.netlify.app/en/getting-started/

The content might be out of date if someone refers to it thinking they're on Astro Docs. And duplicated content between this website and the real Docs website could harm SEO, I think... (both set a canonical URL, but each points to its own domain, so I think Google can't determine which one is the real "original" and may suggest the beta website instead of the current docs...)

So I think we should handle this.

And quoting Chris: "We need to handle this and also https://v4.docs.astro.build/ properly and I don’t think we did. We’ll need to figure out if we can actually kill the v5 beta one, or if we need to update how it is indexed"

sarah11918 assigned sarah11918 and delucis and unassigned sarah11918 on Jan 10, 2025
@sarah11918
Member

Thanks for filing this issue so it's on the radar for @delucis ! 🙌

@delucis
Member

delucis commented Jan 11, 2025

Thanks for helping track this. Small update: I updated and redeployed the v5 branch (https://github.com/withastro/docs/tree/5.0.0-beta) to use docs.astro.build for canonical URLs, so that should hopefully filter through and remove beta URLs from search results.
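As a rough illustration of the canonical approach described above: every page on the branch deploy advertises the matching docs.astro.build URL rather than its own Netlify URL. This is a minimal sketch, not the code in the docs repository; `canonicalFor` and `canonicalLinkTag` are hypothetical helper names.

```typescript
// The production docs origin named in the comment above.
const CANONICAL_ORIGIN = "https://docs.astro.build";

/**
 * Build the canonical URL for a page from its path, ignoring the host the
 * page is actually served from (for example a Netlify branch deploy).
 * Hypothetical helper, for illustration only.
 */
function canonicalFor(pathname: string): string {
  // Normalise so exactly one slash joins the origin and the path.
  const path = pathname.startsWith("/") ? pathname : `/${pathname}`;
  return `${CANONICAL_ORIGIN}${path}`;
}

/** Render the <link> tag a page head would contain. */
function canonicalLinkTag(pathname: string): string {
  return `<link rel="canonical" href="${canonicalFor(pathname)}" />`;
}
```

With this, a page served at https://5-0-0-beta--astro-docs-2.netlify.app/en/getting-started/ would declare https://docs.astro.build/en/getting-started/ as its canonical URL.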

For the v4 branch (https://github.com/withastro/docs/tree/v4), I’ve tried a different approach for now: adding an X-Robots-Tag: noindex header to all pages to tell search engines not to index them. My thinking here was that the v4 branch is longer lived and its content is less likely to match the v5 canonical content, so just requesting that it not be indexed may make more sense. But we can monitor it (AFAIK I haven’t seen the v4 subdomain causing issues in search results just yet).
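The thread does not say how the header is configured on the deploy, so the following is only a sketch of the idea: whatever produces the response headers for a page adds `X-Robots-Tag: noindex` to them. `HeaderMap` and `withNoIndex` are hypothetical names.

```typescript
// Plain map of response header names to values (hypothetical type).
type HeaderMap = Record<string, string>;

/**
 * Return a copy of the given headers with an X-Robots-Tag: noindex entry,
 * asking search engines not to index the response. The input is not mutated.
 */
function withNoIndex(headers: HeaderMap): HeaderMap {
  return { ...headers, "X-Robots-Tag": "noindex" };
}
```

Unlike a canonical link, this does not point crawlers at another URL; it only asks them to leave the page out of the index.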

@sarah11918
Member

Thanks @delucis ! Is there follow up/monitoring to do here? How will we know when we're "done" and the issue can be closed?

sarah11918 added the "site improvement" label on Jan 13, 2025
@delucis
Member

delucis commented Jan 13, 2025

I’m not sure 😁 Basically, yes, I’d expect this to take a little time to resolve so might need monitoring. Maybe @ArmandPhilippot could share the search query that returned the 5-0-0 deploy URL? Then once that seems to be working we can close?

@ArmandPhilippot
Member Author

Good thing I didn't delete the history... because I forgot to note it somewhere. 😅
I don't remember why I was searching for that but, from the date and time, it was "@astrojs/mdx component".

I just checked in Google (in the first 3 pages) and I no longer see the v5 beta! However, I see: https://v4--astro-docs-2.netlify.app/fr/guides/integrations-guide/mdx/
The x-robots-tag appears in the headers, so I guess it might be a matter of days before Google decides to remove it from the search results.
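For anyone repeating this header check, here is a small sketch of how an `X-Robots-Tag` value could be interpreted once it has been read from a response. `blocksIndexing` is a hypothetical helper; it assumes the header is a comma-separated list of directives, optionally prefixed with a crawler name, and it treats `none` as equivalent to `noindex, nofollow`.

```typescript
/**
 * Return true when an X-Robots-Tag header value asks crawlers not to index
 * the page. Accepts null/undefined for responses without the header.
 * Note: a prefixed value such as "googlebot: noindex" only applies to that
 * crawler, but this sketch still reports it as blocking.
 */
function blocksIndexing(headerValue: string | null | undefined): boolean {
  if (!headerValue) return false;
  return headerValue
    .toLowerCase()
    .split(",")
    .map((part) => part.trim())
    // Drop an optional "botname:" prefix and keep the directive itself.
    .map((part) =>
      part.includes(":") ? part.slice(part.indexOf(":") + 1).trim() : part
    )
    .some((directive) => directive === "noindex" || directive === "none");
}
```

A `true` result only confirms that the request not to index is being sent; as the thread shows, removal from search results can lag well behind.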

@delucis
Member

delucis commented Jan 13, 2025

Thank you! I get a v5 branch URL on the 3rd page of results on Google (in Korean for some reason 😅):

[Screenshot: Google search for astrojs/mdx component showing a result pointing to the v5 branch deployment of docs]

But hopefully that will clear up, and in any case the 3rd page isn’t a disaster as most people don’t end up there (especially these days given how bad most results are…)

@delucis
Member

delucis commented Jan 13, 2025

I can also see 5-0-0 URLs showing up as not being indexed in Google Search Console, which is good.

@ArmandPhilippot
Member Author

Yeah, even I don't look at all the pages anymore. 😆 So, I just tried some less specific queries to check the results:

  • with @astrojs/mdx (no quotes) on google.fr, I still see the Netlify v4 website on the first page, then the Korean doc on page 2.
  • with astro mdx on google.com, the fifth result is a 5-0-0-beta URL...
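When checking result pages by hand like this, a small filter can help spot stray deploy URLs. The sketch below assumes docs.astro.build is the only host that should appear; `hostOf` and `nonCanonicalResults` are hypothetical helpers, and the list of result URLs has to be collected separately.

```typescript
// The one host that search results are expected to point at.
const CANONICAL_HOST = "docs.astro.build";

/** Extract the lowercased host from an http(s) URL, or null if it has none. */
function hostOf(url: string): string | null {
  const match = /^https?:\/\/([^/?#]+)/i.exec(url);
  return match?.[1]?.toLowerCase() ?? null;
}

/** Keep only the URLs that are not served from the canonical host. */
function nonCanonicalResults(urls: string[]): string[] {
  return urls.filter((url) => hostOf(url) !== CANONICAL_HOST);
}
```

Applied to the URLs mentioned in this thread, both the 5-0-0-beta and v4 Netlify deploys would be flagged, as would https://v4.docs.astro.build/.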

Well, site:5-0-0-beta--astro-docs-2.netlify.app still gives a lot of results... 30 result pages until I get:

In order to display the most relevant results, we have omitted some entries that are very similar to the current 298 entries.

I guess I was optimistic when I said "matter of days"... maybe a little longer than that. 😅 But, at least, it means that some pages have already been removed!


3 participants