<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tropical Blogging &#187; canonical URL</title>
	<atom:link href="http://www.tropicalwebworks.org/tag/canonical-url/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.tropicalwebworks.org</link>
	<description>Warm breezes, sunshine, and random thoughts</description>
	<lastBuildDate>Sun, 04 Apr 2010 23:27:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Pre-Launch Steps for Your Site</title>
		<link>http://www.tropicalwebworks.org/2007/03/20/pre-launch-steps-for-your-site/</link>
		<comments>http://www.tropicalwebworks.org/2007/03/20/pre-launch-steps-for-your-site/#comments</comments>
		<pubDate>Tue, 20 Mar 2007 21:20:07 +0000</pubDate>
		<dc:creator>Sonja</dc:creator>
				<category><![CDATA[Usability]]></category>
		<category><![CDATA[Web Site Design]]></category>
		<category><![CDATA[Web Standards]]></category>
		<category><![CDATA[404]]></category>
		<category><![CDATA[canonical URL]]></category>
		<category><![CDATA[cron]]></category>
		<category><![CDATA[forms]]></category>
		<category><![CDATA[pre-launch steps]]></category>
		<category><![CDATA[robots.txt]]></category>
		<category><![CDATA[spiders]]></category>

		<guid isPermaLink="false">http://www.tropicalwebworks.org/2007/03/20/pre-launch-steps-for-your-site/</guid>
		<description><![CDATA[Does your host or website developer do these things? Developing a new web site &#8212; or re-developing an old one, for that matter &#8212; typically involves consulting with the client to determine the site&#8217;s target audience and primary objective, creating an attractive and functional design, turning the design into properly coded, valid HTML, building out [...]]]></description>
			<content:encoded><![CDATA[<h3>Does your host or website developer do these things?</h3>
<p>Developing a new web site &#8212; or re-developing an old one, for that matter &#8212; typically involves consulting with the client to determine the site&#8217;s target audience and primary objective, creating an attractive and functional design, turning the design into properly coded, valid HTML, building out the pages of content, and writing the server-side programming to perform whatever dynamic features are needed. But there are several steps that are frequently overlooked before a site &#8220;goes live.&#8221;</p>
<p><span id="more-37"></span><strong>A custom &#8220;404 not found&#8221; page should be created. </strong>At minimum, it should incorporate the site&#8217;s overall design and navigation links, and might also include a search form, links to the most popular sections of the site, and/or a way to contact the site owner or webmaster for assistance. And make sure that requests for non-existent pages actually get a &#8220;404 page not found&#8221; server response. If any other server response is returned &#8212; particularly a &#8220;200 OK&#8221; response &#8212; the site could easily become persona non grata in the search engines, among other problems.</p>
<p><strong>A robots.txt file should be created</strong> to tell the search engine spiders what pages or parts of the site should not be spidered. Even if you want every page to be spidered, a robots.txt file should be placed in the document root of the site, so as to avoid filling up the site&#8217;s error logs with &#8220;not found&#8221; entries for a non-existent robots.txt. This makes it much easier to spot errors resulting from actual bad links when you examine the error logs.</p>
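<p>As a rough sketch of the robots.txt step, the snippet below parses a small example file with Python&#8217;s standard <code>urllib.robotparser</code> module and checks which URLs a spider would be allowed to fetch. The <code>/private/</code> path and the example.com URLs are made up for illustration; they are not taken from any real site.</p>

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every spider may crawl everything except /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    for url in (
        "http://www.example.com/index.html",
        "http://www.example.com/private/notes.html",
    ):
        # Print each URL with whether a generic spider may fetch it.
        print(url, parser.can_fetch("*", url))
```

<p>If you want every page spidered, an allow-everything file can be as short as a <code>User-agent: *</code> line followed by an empty <code>Disallow:</code> line.</p>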
<p><strong>A canonical URL redirect</strong> should be implemented to send all site traffic to your desired canonical URL &#8212; either www.example.com, or just example.com (without the www). Whichever you prefer, you should make sure that all traffic to the other form is redirected via a proper 301 redirect.</p>
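<p>On a real site this redirect normally lives in the web server configuration, but the decision itself is simple enough to sketch in a few lines of Python. The snippet assumes the www form was chosen as canonical; <code>CANONICAL_HOST</code>, <code>ALTERNATE_HOSTS</code>, and <code>canonical_redirect</code> are names invented for this illustration.</p>

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the site has chosen the www form as its canonical host.
CANONICAL_HOST = "www.example.com"
ALTERNATE_HOSTS = {"example.com"}


def canonical_redirect(url: str):
    """Return (301, canonical_url) if the request needs redirecting, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALTERNATE_HOSTS:
        return None
    # Keep the scheme, path, and query string; swap only the host name.
    target = urlunsplit((parts.scheme, CANONICAL_HOST, parts.path, parts.query, ""))
    return 301, target


if __name__ == "__main__":
    print(canonical_redirect("http://example.com/about/?x=1"))
    print(canonical_redirect("http://www.example.com/about/"))
```

<p>The point of keeping the path and query string is that a visitor who asks for a deep page on the wrong host lands on the same page on the right one, rather than being dumped on the home page.</p>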
<p><strong>Test all forms and other interactive features.</strong> Submit every form. Attempt to submit forms without required information, or with invalid information. Try to break them. Try really really hard to break them. And make sure that whatever is supposed to be done after the form is submitted actually happens. If an e-mail is supposed to be sent to the site owner, test it, and make sure the owner gets that e-mail with all the appropriate information. If there&#8217;s a search engine, search for some things. If the site relies on cron jobs, set them up ahead of time, and make sure they&#8217;re running as scheduled and performing as expected.</p>
<p><strong>Test all redirects and rewrites.</strong> If the site uses Apache&#8217;s mod_rewrite module to present search-engine-friendly URLs, test them all, and test the non-search-engine-friendly versions too, to make sure that every bit of content can be reached at only one unique URL.</p>
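<p>One way to spot content that answers at more than one URL is to compare what each URL returns. The sketch below assumes you have already fetched the pages into a dictionary of URL to response body (that fetching step is left out), and <code>find_duplicate_urls</code> is a made-up helper name.</p>

```python
import hashlib
from collections import defaultdict


def find_duplicate_urls(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose response bodies are exactly identical."""
    by_digest = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        by_digest[digest].append(url)
    # Only groups with two or more URLs indicate duplicate content.
    return [sorted(urls) for urls in by_digest.values() if len(urls) >= 2]
```

<p>Any group this returns is a set of URLs that should be collapsed into one, with the others redirected to it. It only catches exact copies, so pages that differ by a timestamp or a session ID would slip through.</p>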
<p><strong>Check subdirectories for directory listings.</strong> Make sure that directory indexes are turned off, or plop an index page into any subdirectory that doesn&#8217;t have one.</p>
<p><strong>Test all internal links and all outgoing links.</strong> Make absolutely sure that every single link leads to the right place. We don&#8217;t need no stinkin&#8217; dead links!</p>
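<p>The first half of a link check is simply collecting the links. Here is a minimal sketch using Python&#8217;s standard <code>html.parser</code> module; <code>LinkCollector</code> and <code>split_links</code> are names invented for this example, and actually requesting each link to confirm it works is a separate step not shown here.</p>

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href of every anchor tag, resolved against a base URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)
        # Skip mailto:, javascript:, and other non-web schemes.
        if urlsplit(absolute).scheme in ("http", "https"):
            self.links.append(absolute)


def split_links(base_url: str, html: str):
    """Return (internal, outgoing) link lists for one page of HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    site_host = urlsplit(base_url).hostname
    internal = [u for u in collector.links if urlsplit(u).hostname == site_host]
    outgoing = [u for u in collector.links if urlsplit(u).hostname != site_host]
    return internal, outgoing
```

<p>Note that this treats the www and non-www hosts as different sites, which is one more reason to settle on a single canonical host before launch.</p>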
<p>There&#8217;s simply no excuse for ever launching a site without having completed each of these steps.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tropicalwebworks.org/2007/03/20/pre-launch-steps-for-your-site/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Infamous Canonical URL Issue</title>
		<link>http://www.tropicalwebworks.org/2007/01/18/infamous-canonical-url/</link>
		<comments>http://www.tropicalwebworks.org/2007/01/18/infamous-canonical-url/#comments</comments>
		<pubDate>Fri, 19 Jan 2007 01:42:24 +0000</pubDate>
		<dc:creator>Sonja</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Search Engine Optimization]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[301]]></category>
		<category><![CDATA[canonical URL]]></category>
		<category><![CDATA[duplicate content]]></category>

		<guid isPermaLink="false">http://www.tropicalwebworks.org/2007/01/18/the-infamous-canonical-url-issue/</guid>
		<description><![CDATA[Difficult as it may be to believe, as of January 2007 Google is still unable to recognize when URLs that obviously lead to the same page are in fact the same page. So what&#8217;s a URL, and what&#8217;s the problem here? URL (pronounced you-are-ell, or sometimes &#8220;earl&#8221; as in Duke of) stands for Uniform [...]]]></description>
			<content:encoded><![CDATA[<p>Difficult as it may be to believe, as of January 2007 Google is <strong>still</strong> unable to recognize when URLs that obviously lead to the same page are in fact the same page. So what&#8217;s a URL, and what&#8217;s the problem here?</p>
<p><span id="more-30"></span>URL (pronounced you-are-ell, or sometimes &#8220;earl&#8221; as in <em>Duke of</em>) stands for Uniform Resource Locator. It&#8217;s the technical name for the <strong>address</strong> of a particular web page. For example, the URL of this site&#8217;s home page is <code>http://www.tropicalwebworks.org</code>, and the URL of this page is <code>http://www.tropicalwebworks.org/2007/01/18/infamous-canonical-url/</code>.</p>
<p>It&#8217;s common that any particular web page may be reached at multiple URLs. If this site were not configured optimally, the home page might be reachable at both <code>http://www.tropicalwebworks.org</code> and <code>http://tropicalwebworks.org</code> (notice the missing &#8220;www.&#8221;). Normal people would logically think that this would be desirable: After all, you don&#8217;t want people to get a &#8220;server not found&#8221; error if they try to get to your site without including the www part.</p>
<p>But Google sees these as two completely separate URLs that just happen to contain exactly the same content. There are two problems with such a situation:</p>
<ol>
<li>First, the &#8220;strength&#8221; of that page, and its ability to turn up in the search engine results, is diluted. Some of the page&#8217;s strength is allotted to one version, and some to the other, and neither &#8220;page&#8221; performs as well as it would if all the strength were concentrated in one page.</li>
<li>And second, Google attempts to filter out pages containing duplicate content, based on the reasonable logic that people don&#8217;t want to see multiple results in their searches for the exact same thing. Thus, since both of these &#8220;pages&#8221; contain the exact same content, one of them will suffer in searches due to the dupe content filter.</li>
</ol>
<p>It&#8217;s a double whammy. It&#8217;s not that your site actually <strong>has</strong> duplicate content. No, we could possibly call this situation &#8220;virtual duplicate content.&#8221; But it&#8217;s all the same to Google: It&#8217;s duplicate content, period.</p>
<p>And if that&#8217;s not bad enough, many people link to their home page like this: http://www.example.com/index.html. Now Google sees yet another instance of duplicate content: http://www.example.com and http://www.example.com/index.html. So ultimately what Google sees is <strong>four</strong> &#8220;duplicate content&#8221; pages:</p>
<ul>
<li>http://www.example.com</li>
<li>http://example.com</li>
<li>http://www.example.com/index.html</li>
<li>http://example.com/index.html</li>
</ul>
<p>And all this before we&#8217;ve even gotten past the home page of your site!</p>
<p>It&#8217;s easy-peasy to configure the server to do what&#8217;s called a &#8220;301 permanent redirect&#8221; from the non-www version to the www version of your site. This technique, which is recommended by Google, tells Google that the two are indeed the same and keeps the poor Googlebot from deciding that you have duplicate content and splitting your page&#8217;s strength among more than one version. &#8220;301&#8221; refers to the status code that&#8217;s returned by the web server to the browser (or the spider, in this case), and it says, in effect, &#8220;Hey, the correct, permanent URL for the page you&#8217;re requesting is actually over there. Don&#8217;t index it at this URL.&#8221;</p>
<p>It&#8217;s likewise easy-peasy to link to your home page without the &#8220;index.html&#8221; (or other directory index name, such as home.htm or default.asp). For index pages in subdirectories, you simply link to the directory: <code>http://www.example.com/subdirectory/</code>, again leaving out the actual filename index.html.</p>
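<p>If a site already has links that include the index filename, cleaning them up is mechanical. The following is a small sketch of that cleanup in Python; the list of index filenames is an assumption (use whatever your server is actually configured with), and <code>strip_directory_index</code> is a name invented for this example.</p>

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the directory index filenames the server uses.
INDEX_NAMES = {"index.html", "home.htm", "default.asp"}


def strip_directory_index(url: str) -> str:
    """Rewrite a link to a directory index file so it points at the directory."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    if filename.lower() not in INDEX_NAMES:
        return url
    # Keep the trailing slash so the link names the directory itself.
    return urlunsplit(
        (parts.scheme, parts.netloc, directory + "/", parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(strip_directory_index("http://www.example.com/subdirectory/index.html"))
```

<p>Links that don&#8217;t end in one of the listed filenames are returned unchanged.</p>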
<p>I apply an appropriate 301 permanent redirect to the www version of every web site I develop. It&#8217;s not something I charge extra for, or something that I tout to my clients as being anything special. It&#8217;s about a 20-second task to set up the 301 properly. And I never link to directory index pages by filename. I don&#8217;t know why some of the big companies aren&#8217;t aware of this issue, or, if they are aware, why they don&#8217;t care enough to do it properly. It raises the question, if they&#8217;re so ignorant, or uncaring, about a thing that is so simple to do right, in how many other areas are they incompetent?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tropicalwebworks.org/2007/01/18/infamous-canonical-url/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Web site development, SEO, and Hippocrates</title>
		<link>http://www.tropicalwebworks.org/2007/01/13/web-site-development-seo-and-hippocrates/</link>
		<comments>http://www.tropicalwebworks.org/2007/01/13/web-site-development-seo-and-hippocrates/#comments</comments>
		<pubDate>Sun, 14 Jan 2007 03:42:21 +0000</pubDate>
		<dc:creator>Sonja</dc:creator>
				<category><![CDATA[Rants]]></category>
		<category><![CDATA[Search Engine Optimization]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Web Standards]]></category>
		<category><![CDATA[canonical URL]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[SEO]]></category>

		<guid isPermaLink="false">http://www.tropicalwebworks.org/2007/01/13/web-site-development-seo-and-hippocrates/</guid>
		<description><![CDATA[What do web site development, search engine optimization and Hippocrates have in common? A line from the Hippocratic Oath comes to mind: First, do no harm. In a previous post, I touched on how the technological factors underlying a web site are important to the site&#8217;s search engine optimization. These factors aren&#8217;t important so much [...]]]></description>
			<content:encoded><![CDATA[<p>What do web site development, search engine optimization and Hippocrates have in common? A line from the Hippocratic Oath comes to mind: <strong>First, do no harm.</strong> In a previous post, I touched on how the <a href="http://www.tropicalwebworks.org/2007/01/13/components-of-seo/">technological factors</a> underlying a web site are important to the site&#8217;s search engine optimization.</p>
<p>These factors aren&#8217;t important so much for their ability to rank a site highly, as they are for avoiding problems that can harm a site&#8217;s ranking.</p>
<p><span id="more-10"></span>Using standard href links rather than JavaScript links, and avoiding canonical URL problems by implementing a 301 redirect from non-www to www, won&#8217;t actually help your site rank better for any given search. But using all JavaScript links and having canonical URL problems can <strong>harm</strong> your site&#8217;s ability to rank well.</p>
<p>These technological factors fall under the &#8220;Do no harm&#8221; umbrella. Don&#8217;t throw obstacles in the path of high rankings. Removing the obstacles won&#8217;t cause you to win that race &#8212; but it will make winning <strong>possible</strong>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tropicalwebworks.org/2007/01/13/web-site-development-seo-and-hippocrates/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
