Don’t Forget These 5 Things On A New Site
I recently acquired the excellent Web Design Checklist, which does a great job of listing all the little things one should remember to include in a new site. I love it. But guess what? I forgot to look at it when I created my newest site.
DOH! (Can you see my forehead hitting my desk right about now?)
Luckily, it’s only been a week or two (who remembers) since I uploaded the new site, so I quickly looked through the checklist once I remembered I had it. Want to know what I noticed? The same 5 things I almost *always* forget to include were the ones I forgot to include this time as well. Oy.
Ok, so, just in case there’s some kind of weird synergy between you and me, and you always forget the same 5 things when you create a site, here’s my abbreviated checklist (which of course is nowhere near as comprehensive as the one I listed in the first sentence of this post).
Robots.txt file
A default robots.txt file that allows search engines to index every page should look like this:
User-Agent: *
Disallow:
Actually, if you just want everything indexed, there’s really no need for a robots.txt file, since “Index Everything” is the default if robots.txt doesn’t exist. I usually include one anyway, just to cover all my bases. I often follow the “just in case” method of site building.
Don’t get confused and accidentally add a / after “Disallow:”. If your robots.txt says “Disallow: /” (note the /), then you will be preventing the search bots from indexing your entire site. Very, very bad.
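If you want to see the difference that one little slash makes, here’s a minimal sketch using Python’s standard urllib.robotparser module. The can_crawl helper and the example.com URL are my own placeholders, not part of any tool mentioned in this post, and real search bots may interpret rules a little differently than this parser does.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, agent="*"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# An empty Disallow: line blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# A single slash after Disallow: blocks the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# example.com is a placeholder domain.
print(can_crawl(ALLOW_ALL, "http://example.com/about.html"))
print(can_crawl(BLOCK_ALL, "http://example.com/about.html"))
```

The first call should report that the page can be crawled, and the second that it can’t.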
There are lots of different things you might want to exclude with a robots.txt file, but here’s the most common one (sample directories, substitute yours as needed):
Exclude certain directories
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
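Before uploading rules like these, you can sanity-check them locally. This is a rough sketch with Python’s standard urllib.robotparser module; the blocked_paths helper and the sample page paths are made up for illustration, so swap in your own.

```python
from urllib.robotparser import RobotFileParser

# The same sample rules as above.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""


def blocked_paths(robots_txt, paths, agent="*"):
    """Return the paths from `paths` that the rules tell `agent` not to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


# Hypothetical pages on a site; only the ones under excluded directories
# should come back.
pages = ["/index.html", "/tmp/scratch.html", "/junk/old.html", "/blog/post.html"]
print(blocked_paths(RULES, pages))
```

Only the pages inside /tmp/ and /junk/ should show up in the output; everything else stays crawlable.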
You can generate a custom robots.txt file with a tool like http://tools.seobook.com/robots-txt/generator/ (Note that this tool will automatically default to Allow Everything, as soon as you load the tool).
TIP: Your robots.txt file MUST be in your top-level web (root) directory, not in a subdirectory.
For more info about what a robots.txt file is all about, go to http://www.robotstxt.org/robotstxt.html.
XML sitemap
In some ways, an XML sitemap is exactly the opposite of a robots.txt, in the sense that it is designed to make sure the search engines know which pages on your site you want them to index. It provides more information than just a list of URLs, however, such as giving search engines hints as to which pages are more important to you than others.
It’s not strictly necessary to have an XML sitemap. In most cases, the search engines will get around to indexing all (or most) of the pages you list at some point. However, any time you can exert some bit of control (or at least give strong hints) over your site’s inclusion in search engines, you should take advantage of it.
To read more about sitemaps, check out the site that Google, Yahoo, and Microsoft collaborated on at http://www.sitemaps.org/
If your site is powered by WordPress then just go download this plugin, and the job of adding a sitemap will be greatly simplified.
If your site isn’t WordPress-based, then one of the XML sitemap generator tools should work for you.
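If you’re curious what such a generator actually produces, here’s a bare-bones sketch in Python that builds a sitemap following the sitemaps.org format: a urlset element containing one url entry per page, each with a loc and an optional priority hint between 0.0 and 1.0. The build_sitemap function and the example.com URLs are my own illustration, not how any particular plugin or tool does it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from an iterable of (url, priority) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, priority in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # priority is a hint from 0.0 to 1.0 about relative importance
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages; list only real URLs you want in search results.
print(build_sitemap([
    ("http://example.com/", 1.0),
    ("http://example.com/about.html", 0.5),
]))
```

Save the output as sitemap.xml in your root directory. A real generator will usually add more detail (like last-modified dates), but the basic shape is the same.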
Custom 404 page
Don’t trust your visitors to stick around if they land on the generic 404 – Not Found page. Give them a custom error page and gently lead them to other areas of your site that might be useful to them.
To create a custom 404 page for WordPress blogs, simply edit your theme’s 404.php page template. The default WordPress theme has a 404.php file, but not all themes have their own custom 404 error template file. If your template has one, it will be named 404.php. WordPress will automatically use that page if a Page Not Found error occurs. Just edit it in your admin (Appearance / Editor) and include links back to your home page, your archives, your search box, etc. For more details, go to http://codex.wordpress.org/Creating_an_Error_404_Page.
To create a custom 404 page for a non-WordPress site whose hosting account uses cPanel, log in to cPanel and click Error Pages, under the Advanced block. Select the domain you want to create a custom 404 page for, and click the type of error page to edit – in this case, 404 (not found). Insert your own custom HTML code. The changes will be applied after you click Save. Here’s a screenshot of what my cPanel shows for the Error Pages section in my hosting account.

Favicon
It’s really easy to create a favicon for your site (the tiny little graphic that shows up when you bookmark a site). Just take any square image (like a square portion of your logo, for example) and run it through a favicon generator like this one. Place it in the root folder (where your home page is). Then add the following code to the head section of your pages.
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico">
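Since this is exactly the kind of thing I forget, here’s a small sketch of a check you could run against a page’s HTML to confirm the favicon link made it into the head. It uses Python’s standard html.parser module; FaviconFinder and find_favicon are names I made up for this example.

```python
from html.parser import HTMLParser


class FaviconFinder(HTMLParser):
    """Remember the href of the first <link> tag whose rel includes "icon"."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        # rel can hold several words, e.g. "shortcut icon"
        rel_words = (attributes.get("rel") or "").lower().split()
        if "icon" in rel_words:
            self.href = attributes.get("href")


def find_favicon(html):
    """Return the favicon href found in `html`, or None if there isn't one."""
    finder = FaviconFinder()
    finder.feed(html)
    return finder.href


page = '<html><head><link rel="shortcut icon" type="image/x-icon" href="/favicon.ico"></head></html>'
print(find_favicon(page))
```

If the function returns None for one of your pages, the link tag is missing from that page.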
Analytics code
I think just about *everyone* forgets to add their analytics or stats code to their new site at least once. I forget every time! Anyway, whether you are using Google Analytics, Clicky Web Analytics or any other stats application, don’t forget to find the directions for your analytics code, and place it where it belongs (usually in the head section or just before the end of the page).
Now, if I *still* forget those 5 things next time I create a new site, just hit me over the head with either this post or the Web Design Checklist, ok?
Tags: Robots exclusion standard, Sitemaps, Web analytics, web design, Web Design Checklist
Thanks for mentioning our checklist Donna! I’m glad it came in handy.
We’ve actually updated it since the last version you saw, and will be sending out the updated version to you and everyone else who bought it sometime in the next few days.
Recent post by Jeremy L. Knauff ..Great Customer Service Makes Your Readers Feel Appreciated
Oh sweet! Looking forward to getting it. Now, if I can just get it to jump out at me when I need it.
Thanks for the list. Looking over it, I realized that our new site is missing the site map which when done correctly can get a second listing on the keyword.
I’m guessing you’re thinking of a regular user sitemap, rather than an XML sitemap. Assuming I’m guessing correctly, then yes, that’s a way to do that – or even to use a secondary keyword for a page. A user sitemap is different from an XML sitemap, however, and the XML sitemap really won’t help for that.
On Stumbleupon, added to delicious – I learn every time I visit this blog, did not know anything about a robots.txt file & an XML sitemap (I installed the XML sitemap plugin a while ago because it sounded essential). But I would have never thought the concepts contrast to a degree.
Recent post by ashok ..Margaret Levine- “A Man I Knew”
That’s the beauty of the web, right? We all get to learn from each other, since each of us has a little bit of knowledge to pass on, and a lot of info we can get from others. Crazy awesome, imo.
Be careful with XML sitemaps – with tools that auto generate them they can often do more harm than good.
You need to make sure that any sitemap only contains real URLs that you would want to be returned in search results.
I’ve seen tons of sitemaps on commercial sites with secure/internal/defunct pages included in the sitemap, often searchable in Google as a result. For example in one instance, a page that set currency conversion rates for the whole e-commerce system could be fiddled with at leisure by anyone who could find it in Google, purely because it was included in the auto-generated sitemap.
The risks are to SEO as well as security – auto-generated sitemaps may often include lots of low quality pages that are only meant to be internal to the CMS (eg pages that act as containers for file downloads or images), or defunct/old versions of pages that are no longer part of the site structure and introduce 404 errors to spidering.
I’ve never had those problems, but that’s probably more likely because I’ve never had any of those types of pages on my sites. Still, good points, Miles. Always good to know what you’re putting out there.
If you are going to get a VPS server, make sure that it has cPanel, coz it makes server maintenance easier.
Yep, I agree.
What great advice! Thanks! I’ve been looking for some info on favicons. Sites without them just look so sad.
Recent post by Laura Davis ..What You Didnt Know About Wall Street Reform
Thanks for sharing. When I put up a new site, Google XML Sitemaps, Smart 404 and Google Analytics are in my list of plugin must-haves. So that’s 3 out of 5. Is there a plugin for robots.txt too?
Recent post by Shinta ..Earn Money from Home- When do You do Decide to Take the Risk
Apparently, there are, Shinta, but I haven’t used any of them, so I have no idea if any are worth recommending or not. But searching does bring up a few out there.
What software do you use to upload the favicon file to the site directory? Where do I get it? I am still figuring out all this stuff from a non-web designer standpoint.
Recent post by Stephanie Suesan Smith ..The Texan’s Irish Bride Book Trailer
I use Filezilla ftp software (it’s free), but you can also use whatever file manager your host provides. If it’s cPanel, just go to the File Manager, and upload it from there.
Filezilla: http://filezilla-project.org/
cPanel File Manager documentation: http://docs.cpanel.net/twiki/b.....ileManager