Robots.txt / indexing problem with site running as...
# workers-help
d
Running into a situation where Google Search is unable to index the site running as Pages. Robots.txt renders as index.html for some reason. I did create an explicit robots.txt in my repo and I'm pretty sure it's propagating to Pages - anyone know what might be happening?
j
Do you have a 404.html? If not, any page that 404s will render the index.html, for SPA behaviour. That won't prevent Pages from serving a file though. Where is your robots.txt? Can you provide a link to it and/or your repo for us to see?
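A minimal sketch of what that looks like, assuming your static files live in a `public/` folder: shipping a 404.html in the build output opts out of Pages' SPA fallback, so genuinely missing paths return a real 404 instead of index.html.

```shell
# Create a bare-bones 404 page in public/ (adjust the folder to your setup).
mkdir -p public
cat > public/404.html <<'EOF'
<!doctype html>
<title>Not found</title>
<h1>404 - page not found</h1>
EOF
```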
d
Thanks

https://cdn.discordapp.com/attachments/1109614324768592003/1109619292615475251/image.png

j
Your robots.txt should be in your output folder. I imagine that’s public or dist?
The same goes for your index.html.
i
From the looks of it you're running Vite (or a framework that uses Vite). The way Vite works is that it copies everything from public -> dist at build time, so your robots.txt needs to go in public/ as James says. Side note - I'd recommend against committing dist (add it to your .gitignore); those generated files change often and will slow down git in the long run.
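A quick sketch of the two fixes above, assuming a stock Vite layout where `public/` is copied verbatim into `dist/` at build time:

```shell
# robots.txt belongs in public/, not the repo root, so Vite copies it to dist/.
mkdir -p public
printf 'User-agent: *\nAllow: /\n' > public/robots.txt

# Keep the generated dist/ folder out of version control.
grep -qx 'dist' .gitignore 2>/dev/null || echo 'dist' >> .gitignore
```

After the next `vite build`, dist/robots.txt should exist and be served at /robots.txt.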
d
Thanks, you’re right - robots.txt doesn’t seem to make it into public or dist. If I remove dist from the GitHub repo, will it get regenerated by Cloudflare Pages?
i
Yes, it should.
As long as you have the build command set correctly (`npm run build` I assume, but it might be different for you).
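For reference, the build command Pages runs should match a script in your package.json. A typical Vite project (the exact script names may differ in your setup) looks something like:

```json
{
  "scripts": {
    "dev": "vite",
    "build": "vite build",
    "preview": "vite preview"
  }
}
```

With that in place, Pages runs `npm run build` and deploys the resulting dist/ folder, robots.txt included.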
d
Got it, I'll try - thanks!