Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Attack of the dummy urls -- what to do?
-
It occurs to me that a malicious program could set up thousands of links to dummy pages on a website:
www.mysite.com/dynamicpage/dummy123
www.mysite.com/dynamicpage/dummy456
etc..
How is this normally handled? Does a developer have to look at all the parameters to see if they are valid and if not, automatically create a 301 redirect or 404 not found? This requires a table lookup of acceptable url parameters for all new visitors.
I was thinking that bad url names would be rare so it would be ok to just stop the program with a message, until I realized someone could intentionally set up links to non existent pages on a site.
-
Hello,
I am also having this issue with hundreds of dummy urls that never existed as a part of our website's blog. Do I go into parameters and specify each of the dummy urls to avoid this?
Thanks in advance for any help!!!! (and sorry to piggyback this question Theodore-hope you don't mind!)
-
Thanks Ray. Appreciate the advice!
-
It's great that you've identified issues like this. I also suggest that if you know certain parameters are generated often and not necessary to index, that you go into your Google Webmaster Tools account > Crawl > URL Parameters and proactively set the crawl rate to 'No URLs' is appropriate. I do this with certain custom parameters for sites that are prone to having these extra URLs indexed mistakenly.
-
Hi Ray-pp,
Thanks for your answer. I'm not getting anything significant, but occasionally a bot will come with extra stuff added to the parameter names, so it got me to thinking a malicious program or nasty competitor might want to do that to cause havoc. My understanding is that 404s don't hurt SEO ranking from Google, but I was thinking that the way things are set up now no-one would get a 404 and in fact Google would index the 'bad' pages, so maybe I needed to do something proactively to 404 or 301 such pages so they would never get put into an index at all.
Since my site has lots of dynamically generated pages, I've had my share of surprises, and am just trying to avoid any new ones!
-
Hi Theodore - You pose an interesting problem, are you currently experiencing this issue? I don't see why someone would create a bunch of random non-existent links to your site, but if they did (and the pages were receiving low quality traffic) then I would proactively disavow those domains that created the links. That would be enough to prevent any penalties you're afraid of receiving.
If, however, you're noticing that specific 404 pages are receiving quality traffic (maybe an old page was removed but good traffic is still sent to the page) then you would want to 301 that page to its closest relative page that deserves the traffic and authority.
Does that help? Maybe a little more information around you specific problem would allow me to tailor the advice better.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Mass URL changes and redirecting those old URLS to the new. What is SEO Risk and best practices?
Hello good people of the MOZ community, I am looking to do a mass edit of URLS on content pages within our sites. The way these were initially setup was to be unique by having the date in the URL which was a few years ago and can make evergreen content now seem dated. The new URLS would follow a better folder path style naming convention and would be way better URLS overall. Some examples of the **old **URLS would be https://www.inlineskates.com/Buying-Guide-for-Inline-Skates/buying-guide-9-17-2012,default,pg.html
Intermediate & Advanced SEO | | kirin44355
https://www.inlineskates.com/Buying-Guide-for-Kids-Inline-Skates/buying-guide-11-13-2012,default,pg.html
https://www.inlineskates.com/Buying-Guide-for-Inline-Hockey-Skates/buying-guide-9-3-2012,default,pg.html
https://www.inlineskates.com/Buying-Guide-for-Aggressive-Skates/buying-guide-7-19-2012,default,pg.html The new URLS would look like this which would be a great improvement https://www.inlineskates.com/Learn/Buying-Guide-for-Inline-Skates,default,pg.html
https://www.inlineskates.com/Learn/Buying-Guide-for-Kids-Inline-Skates,default,pg.html
https://www.inlineskates.com/Learn/Buying-Guide-for-Inline-Hockey-Skates,default,pg.html
https://www.inlineskates.com/Learn/Buying-Guide-for-Aggressive-Skates,default,pg.html My worry is that we do rank fairly well organically for some of the content and don't want to anger the google machine. The way I would be doing the process would be to edit the URLS to the new layout, then do the redirect for them and push live. Is there a great SEO risk to doing this?
Is there a way to do a mass "Fetch as googlebot" to reindex these if I do say 50 a day? I only see the ability to do 1 URL at a time in the webmaster backend.
Is there anything else I am missing? I believe this change would overall be good in the long run but do not want to take a huge hit initially by doing something incorrectly. This would be done on 5- to a couple hundred links across various sites I manage. Thanks in advance,
Chris Gorski0 -
Does google ignore ? in url?
Hi Guys, Have a site which ends ?v=6cc98ba2045f for all its URLs. Example: https://domain.com/products/cashmere/robes/?v=6cc98ba2045f Just wondering does Google ignore what is after the ?. Also any ideas what that is? Cheers.
Intermediate & Advanced SEO | | CarolynSC0 -
All URLs in the site is 302 redirected to itself
Hi everyone, I have a problem with a website wherein all URLs (homepage, inner pages) are 302 redirected. This is based on Screaming Frog crawl. But the weird thing is that they are 302 redirected to themselves which doesn't make any sense. Example:
Intermediate & Advanced SEO | | alex_goldman
https://www.example.com.au/ is 302 redirected to https://www.example.com.au/ https://www.example.com.au/shop is 302 redirected to https://www.example.com.au/shop https://www.example.com.au/shop/dresses is 302 redirected to https://www.example.com.au/shop/dresses Have you encountered this issue? What did you do to fix it? Would be very glad to hear your responses. Cheers!0 -
Duplicate content on URL trailing slash
Hello, Some time ago, we accidentally made changes to our site which modified the way urls in links are generated. At once, trailing slashes were added to many urls (only in links). Links that used to send to
Intermediate & Advanced SEO | | yacpro13
example.com/webpage.html Were now linking to
example.com/webpage.html/ Urls in the xml sitemap remained unchanged (no trailing slash). We started noticing duplicate content (because our site renders the same page with or without the trailing shash). We corrected the problematic php url function so that now, all links on the site link to a url without trailing slash. However, Google had time to index these pages. Is implementing 301 redirects required in this case?1 -
Does Google Read URL's if they include a # tag? Re: SEO Value of Clean Url's
An ECWID rep stated in regards to an inquiry about how the ECWID url's are not customizable, that "an important thing is that it doesn't matter what these URLs look like, because search engines don't read anything after that # in URLs. " Example http://www.runningboards4less.com/general-motors#!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 Basically all of this: #!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 That is a snippet out of a conversation where ECWID said that dirty urls don't matter beyond a hashtag... Is that true? I haven't found any rule that Google or other search engines (Google is really the most important) don't index, read, or place value on the part of the url after a # tag.
Intermediate & Advanced SEO | | Atlanta-SMO0 -
301 redirect with /? in URL
For a Wordpress site that has the ending / in the URL with a ? after it... how can you do a 301 redirect to strip off anything after the / For example how to take this URL domain.com/article-name/?utm_source=feedburner and 301 to this URL domain.com/article-name/ Thank you for the help
Intermediate & Advanced SEO | | COEDMediaGroup0 -
What is the best way to handle special characters in URLs
What is the best way to handle special characters? We have some URL's that use special characters and when a sitemap is generate using Xenu it changes the characters to something different. Do we need to have physically change the URL back to display the correct character? Example: URL: http://petstreetmall.com/Feeding-&-Watering/361.html Sitmap Link: http://www.petstreetmall.com/Feeding-%26-Watering/361.html
Intermediate & Advanced SEO | | WebRiverGroup0 -
Google News URL Structure
Hi there folks I am looking for some guidance on Google News URLs. We are restructuring the site. A main traffic driver will be the traffic we get from Google News. Most large publishers use: www.site.com/news/12345/this-is-the-title/ Others use www.example.com/news/celebrity/12345/this-is-the-title/ etc. www.example.com/news/celebrity-news/12345/this-is-the-title/ www.example.com/celebrity-news/12345/this-is-the-title/ (Celebrity is a channel on Google News so should we try and follow that format?) www.example.com/news/celebrity-news/this-is-the-title/12345/ www.example.com/news/celebrity-news/this-is-the-title-12345/ (unique ID no at the end and part of the title URL) www.example.com/news/celebrity-news/celebrity-name/this-is-the-title-12345/ Others include the date. So as you can see there are so many combinations and there doesnt seem to be any unity across news sites for this format. Have you any advice on how to structure these URLs? Particularly if we want to been seen as an authority on the following topics: fashion, hair, beauty, and celebrity news - in particular "celebrity name" So should the celebrity news section be www.example.com/news/celebrity-news/celebrity-name/this-is-the-title-12345/ or what? This is for a completely new site build. Thanks Barry
Intermediate & Advanced SEO | | Deepti_C0