jstern Posted May 27, 2010
I have a few development sites online that I don't want crawled or indexed by search engines (which, unfortunately, they are). Does anyone understand how to use the rel=nofollow (or noread??) tags well?
Andrea Posted May 27, 2010
Just use the .htaccess file to exclude those sites.
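The post above does not say which .htaccess directives are meant, so the following is only a sketch of two common readings, assuming an Apache server that allows overrides (AllowOverride FileInfo and AuthConfig). The first sends an X-Robots-Tag response header, which the major search engines treat like a robots meta tag. The second puts the dev site behind a password so crawlers cannot fetch it at all. The AuthUserFile path and the realm name are placeholders, not values from this thread.

```apache
# Option 1: ask crawlers not to index or follow anything served from this directory.
# Requires mod_headers to be enabled.
<IfModule mod_headers.c>
    Header set X-Robots-Tag "noindex, nofollow"
</IfModule>

# Option 2: require a login, so crawlers cannot fetch pages at all.
# The AuthUserFile path is a placeholder; point it at a real htpasswd file.
AuthType Basic
AuthName "Development site"
AuthUserFile /path/to/.htpasswd
Require valid-user
```

Either option can be used on its own. If the .htaccess file lives in the same repository as the site, it can be overwritten by a commit just like any other file.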
jstern Posted May 28, 2010 (Author)
I'm not sure how .htaccess would help. We use our dev servers so we can keep our projects 'live' and the boss man can take a look at projects as they progress. I suppose we could move them to an internal server, but that's another task altogether at the moment, and we've already purchased domain names for them.
I think what I've decided to do is add the specific meta tags in one of the site setup controllers (another reason I love Zend!): if (DEV), output NOINDEX, NOFOLLOW, that kind of idea. Either that or add to the dev sites' robots.txt file. I think something like:
User-agent: *
Disallow: /
If I use the robots.txt method, I run the risk of the guys overwriting the file when committing a new branch to the dev servers. The meta tag approach would probably be quicker and easier for anyone else with this problem, with no risk of the file being overwritten. (Anyone know a way to protect the robots.txt file? All our developers have root access.)
weblink Posted September 30, 2010
To tell Google that you do not want a page indexed or its links followed, add the following meta tag:
<META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">
To tell all search engine spiders the same thing, add the following meta tag:
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
The rel="nofollow" attribute is applied to an individual link instead. Example:
<a href="http://spammer.example.com/">buy now</a>
changes to:
<a href="http://spammer.example.com/" rel="nofollow">buy now</a>
Kyle Undefined Posted September 30, 2010
Even better, use a robots.txt file. Here's an example:
# robots.txt for http://www.site.com/
User-agent: *
Disallow: /Admin/
Disallow: /Content/
Disallow: /UserControls/
jstern Posted September 30, 2010 (Author)
Here's what I ended up doing. In my viewsetup.php I added the following line:
if (DEV) { $this->_view->headMeta()->setName('ROBOTS', 'NOARCHIVE,NOINDEX,NOFOLLOW'); }
DEV is defined in the bootstrap.php file as anything that resembles a development server. (E.g. I use ***dev1.com locally, so it resolves to DEV. We also have ***dev2.com, dev3.com, dev8.com etc. on the web host, so these also resolve to DEV.) So I'm sure you can see why I didn't want search engines crawling around the dev sites. Robots.txt would be a good idea if I weren't worried about someone committing over it when they upload to or clear out that shared dev environment.