SEO

jstern · May 27, 2010

I have a few development sites online that I don't want crawled or indexed by search engines (which, unfortunately, they currently are).

Does anyone understand how to use the rel=nofollow (or noread??) tags well?

Andrea · May 27, 2010

Just use the .htaccess file to exclude those sites.

jstern · May 28, 2010

I'm not sure how .htaccess would help. We use our dev servers so we can keep our projects 'live' and the boss man can take a look at projects as they progress. I suppose we could move them to an internal server, but that's another task altogether at the moment, and we've already purchased domain names for them.

I think what I've decided to do is add the specific meta tags during one of the site setup controllers. (Another reason I love Zend!)

If (DEV) NOINDEX NOFOLLOW kinda idea.

Either that, or add rules to the dev sites' robots.txt file. I think something like:

User-agent: *
Disallow: /

If I go the robots.txt route, I run the risk of the guys overwriting the file when committing a new branch to the dev servers. The meta tag approach would probably be quicker and easier, with no risk of being overwritten, if anyone else has this problem. (Does anyone know a way to protect the robots.txt file? All our developers have root access.)
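One possible way to sidestep the overwrite risk is to generate the robots.txt response from application code instead of keeping a static file. The following is only a minimal sketch: the function name robotsTxtBody is hypothetical, and it assumes requests for /robots.txt are routed to a PHP script and that a boolean DEV flag like the one mentioned above is available.

```php
<?php
// Hypothetical helper: builds the robots.txt body for the current environment.
function robotsTxtBody(bool $isDev): string
{
    if ($isDev) {
        // Ask every compliant crawler to stay away from the whole site.
        return "User-agent: *\nDisallow: /\n";
    }

    // An empty Disallow value places no restrictions on crawlers.
    return "User-agent: *\nDisallow:\n";
}

// Usage, assuming /robots.txt is routed to this script and DEV is defined elsewhere:
// header('Content-Type: text/plain');
// echo robotsTxtBody(DEV);
```

Because the rules live in the application rather than in a file in the web root, a routine commit is less likely to replace them by accident, although a static robots.txt committed later could still take precedence depending on the server setup.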

weblink · September 30, 2010

To let Google know that you do not want a page indexed or its links followed, you can add the following meta tag:

<META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">

To let all search engine spiders know that you do not want a page indexed or its links followed, you can add the following meta tag:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

Use of the nofollow attribute:

Applying the rel="nofollow" attribute to a link:

Example:

<a href="http://spammer.example.com/">buy now</a>

changes to:

<a href="http://spammer.example.com/" rel="nofollow">buy now</a>

Kyle Undefined · September 30, 2010

Even better, use a robots.txt file. Here's an example:

# robots.txt for http://www.site.com/

User-agent: *
Disallow: /Admin/
Disallow: /Content/
Disallow: /UserControls/

jstern · September 30, 2010

Here's what I ended up doing:

In my viewsetup.php I added the following lines:

if (DEV) {
    $this->_view->headMeta()->setName('ROBOTS', 'NOARCHIVE,NOINDEX,NOFOLLOW');
}

DEV is defined in the bootstrap.php file as anything that resembles a development server. (E.g. I use ***dev1.com locally, so it would resolve to DEV. I also have ***dev2.com, dev3.com, dev8.com, etc. on the web host, so these would also resolve to DEV.)
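A minimal sketch of how a DEV constant like this might be defined, assuming the dev hosts all end in dev<number>.com. The helper name isDevHost and the pattern are illustrative guesses based on the host names above, not the actual contents of bootstrap.php:

```php
<?php
// Hypothetical helper: true when the host name looks like one of the dev domains.
// The "dev<number>.com" pattern is an assumption, not the original bootstrap code.
function isDevHost(string $host): bool
{
    // Drop an optional port such as ":8080" and normalise case before matching.
    $host = strtolower(preg_replace('/:\d+$/', '', $host));

    return preg_match('/dev\d+\.com$/', $host) === 1;
}

// Define DEV once, based on the host the request came in on.
if (!defined('DEV')) {
    define('DEV', isDevHost($_SERVER['HTTP_HOST'] ?? ''));
}
```

Matching on the host name keeps the check in one place, so new dev domains that follow the same naming scheme pick up the robots meta tag without further changes.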

So I'm sure you can see why I didn't want search engines crawling around the dev sites. Robots.txt would be a good idea if I weren't worried about someone committing over it when they upload to or clear out that shared dev environment.
