Comment on WordPress SEO Tutorial by SEO Dave.

When using a robots.txt file, the first question to ask is: what are you trying to achieve?

Let's go through the example robots.txt file you found:

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content
Disallow: /tag
Disallow: /author
Disallow: /wget/
Disallow: /httpd/
Disallow: /i/
Disallow: /f/
Disallow: /t/
Disallow: /c/
Disallow: /j/

“User-agent: *” means all bots should obey this set of rules, though that doesn’t mean they will (all my robots.txt files include “Crawl-delay: 20”, but Google ignores it).

Any line starting with Disallow lists something you don’t want spidered. Are you having specific problems with sections of a site being indexed that you don’t want indexed (there’s not much point using “Disallow: /cgi-bin” if you don’t even have a cgi-bin directory, for example)?

If not, there’s no need to have anything here. I will add: never use a robots.txt file to hide files you don’t want visitors to see. It’s real easy to load https://stallion-theme.co.uk/robots.txt, and if I were dumb enough to list files that I wanted to keep secret, they wouldn’t be much of a secret :-)

As a side note, most of my robots.txt files are identical to the one for this site, apart from the crawl delay (which is supposed to slow down spidering; my sites are constantly hammered by bots). I only have a robots.txt file at all to stop a 404 error code when bots look for one.
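To see how a well-behaved bot would read a minimal file like mine, here’s a sketch using Python’s standard-library robots.txt parser (the file contents below are illustrative, not a copy of my actual file):

```python
from urllib.robotparser import RobotFileParser

# A minimal robots.txt: nothing disallowed, just a crawl delay
# (which Google ignores, but other bots may honour).
rules = """User-agent: *
Crawl-delay: 20
Disallow:
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.crawl_delay("*"))                                # 20
print(rp.can_fetch("*", "http://example.com/any-page/"))  # True
```

An empty “Disallow:” blocks nothing, so every URL stays fetchable; only the crawl delay is communicated.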

If you were trying to stop WordPress tags from being indexed, adding “Disallow: /tag” would stop them from being spidered, BUT it would waste a lot of link benefit and mean your tags won’t be indexed (is that what you want?). The new WordPress SEO plugin I mentioned above that I was working on has been released as the Stallion WordPress SEO Plugin, and it can achieve the equivalent of the WordPress-relevant disallows above without wasting link benefit.
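You can check exactly which URLs a rule like “Disallow: /tag” blocks with the same stdlib parser (example.com stands in for your domain). Note the prefix matching: it catches more than just the /tag/ archive pages:

```python
from urllib.robotparser import RobotFileParser

# Test what "Disallow: /tag" actually blocks under the wildcard agent.
rules = """User-agent: *
Disallow: /tag
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("Googlebot", "http://example.com/tag/wordpress-seo/"))  # False: blocked
print(rp.can_fetch("Googlebot", "http://example.com/tagline-page/"))       # False: prefix match!
print(rp.can_fetch("Googlebot", "http://example.com/2011/03/some-post/"))  # True
```

Because Disallow rules match path prefixes, “/tag” also blocks a hypothetical page like /tagline-page/, which is one more reason to be careful with blanket disallows.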

User-agent: Mediapartners-Google
Allow: /
User-agent: Adsbot-Google
Allow: /
User-agent: Googlebot-Image
Allow: /
User-agent: Googlebot-Mobile
Allow: /

Specifying a user-agent just to allow it is a complete waste of time. By default bots are allowed to spider your site, and adding the above won’t increase or decrease the number of visits a user agent makes.

User-agent: ia_archiver-web.archive.org
Disallow: /

Specifying a user-agent and disallowing it can be useful. If you are having a problem with a spider you can set this to stop it spidering completely, but be aware that if you are having a problem with a bot, it probably isn’t a well-behaved one that will follow the rule!
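A quick sketch of how a per-agent block behaves for a rule-following bot (“BadBot” is a made-up user-agent token for illustration):

```python
from urllib.robotparser import RobotFileParser

# Block one named bot while leaving everything else at the default (allowed).
rules = """User-agent: BadBot
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("BadBot", "http://example.com/"))     # False: fully blocked
print(rp.can_fetch("Googlebot", "http://example.com/"))  # True: no rule applies to it
```

Any agent not named falls back to the default of full access, which is why the Allow blocks above add nothing.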

Sitemap: http://www.askapache.com/sitemap_index.xml

Useful if you have a sitemap file.

As you can see, the robots.txt file is basically useful for stopping spiders from doing things you don’t want them to do, but not all spiders follow the rules, and when you do disallow a section of a WordPress site it can come at the cost of link benefit.

BTW some of the information at http://codex.wordpress.org/Search_Engine_Optimization_for_WordPress is wrong.

Search Engine Site Submissions: complete BS. Submitting a site to a search engine like Google is a total waste of time; the only way to get a site indexed long term and increase rankings is through backlinks that are not rel=”nofollow” and are not on a noindex page.

Meta Tags have zero ranking value.

Robots.txt Optimization: the example robots.txt file is awful. It could seriously damage a site’s SEO, and would damage my sites for certain!!!

Talian 3 was never broken; that’s a permissions issue you have. If you want to edit your files online they have to have the correct permissions: depending on your server, you’ll want the files set to 666 for full access, but I would strongly advise setting them back to 644 for added security after editing. I find it more secure to edit the files offline and upload via FTP (there’s a tendency to forget to change the permissions back).
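If you script your maintenance, the permission flip can be done in a couple of lines. This sketch uses a throwaway temp file as a stand-in for a theme file (on a real server you’d target something like style.css):

```python
import os
import tempfile

# Stand-in for a theme file; path is created just for this demo.
fd, path = tempfile.mkstemp(suffix=".php")
os.close(fd)

# 666 = read/write for owner, group and world: what some hosts need
# for online editing, but risky to leave in place.
os.chmod(path, 0o666)
print(oct(os.stat(path).st_mode & 0o777))  # 0o666

# Back to 644 (owner read/write, everyone else read-only) after editing.
os.chmod(path, 0o644)
print(oct(os.stat(path).st_mode & 0o777))  # 0o644
```

The point is simply to pair every 666 with a matching 644 so the writable window stays short.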

As a Talian 3 customer you’ll be entitled to both a Talian 5 upgrade and Stallion 6 upgrade. Drop me an email from the email address you used to order and I’ll give you the upgrade details.

David