Monday, March 27, 2006

Backup Your Blog

If you spend any time developing your blog, you're probably not going to enjoy recreating it. Occasionally though, mistakes happen.

  • Maybe you hit the "Delete This Blog" button, and realised it just a microsecond AFTER you hit the Yes button. Too late. Now you have to wait for Blogger to restore it, and hope that it doesn't get taken by the spammers in the meantime.

  • Maybe you just tweaked the template a little too much, and can't get it right again.

  • Maybe you're plagued by the problem of the week, Blog Corruption / Hijacking.

  • Maybe you would like to carry your computer around, and show others your blog (or look at it yourself), even if you're not online.



If you can, by a miracle, get Blogger to restore your blog, or if you're one of the lucky ones reading this who doesn't have a problem, take a couple of hours and Backup Your Blog. One solution for this need is HTTrack, which is free and very easy to use. Note this caution when using HTTrack. And note this additional advice - back up your template, separately.

  • Get HTTrack (free).

  • Install it. You will want to close all open applications, and prepare to reboot afterwards.

  • Run it. The first time, you will have to identify your blog by URL and by Title, and identify the HTTrack Mirror Base Path. The mirroring itself is easy.

  • Set up a shortcut to the mirrored code. Now you can view your blog locally, whether or not your computer is online.

  • Whenever you make changes, rerun HTTrack. Changed blogs remirror quickly.


Organise Your Mirrors Properly - Set Up A Significant Base Path
When you set up a mirror using HTTrack, you aren't forced through a complicated applet full of options. HTTrack is very configurable, and there are a lot of options that you can change later, to tweak your mirroring. There is, however, one important setting that you input on the first screen.

There are 3 settings that you will see on the first screen, when you start HTTrack.

  1. Project Name.

  2. Project Category.

  3. Base Path.


Project name should be pretty obvious, or easily guessed. Project category is something that you can change, to suit your needs. Base path is something that you can change, at any time, too.

The URL for your blog you will input on the second screen. HTTrack allows you to mirror multiple blogs in one Project. If your Blog is named "myvacation2006.blogspot.com", you'll input "http://myvacation2006.blogspot.com" into the URL List. Simply hit the "Add URL..." button on the second screen, and input each URL, as you need. Then hit the Next button, twice, and watch it mirror.

But, before you hit Next from the first screen, set up the Base Path properly. My professional recommendation is to separate your operating system, (non operating system) program libraries, and data into 3 separate partitions. I have, as an example:

  • C:\Windows for the operating system.

  • D:\Program Files for the non operating system program libraries.

  • E:\Web Site Mirrors for my HTTrack web site mirror databases.


You might have everything on C:, which is not my preference, but a default system installation will do this to you. Then you'll have:

  • C:\Windows for the operating system.

  • C:\Program Files for the non operating system program libraries.

  • C:\Web Site Mirrors for my HTTrack web site mirror databases.



When you install HTTrack, the default will be for Base Path to be equal to the folder containing the HTTrack program components. If you install HTTrack into "C:\Program Files\HTTrack", that will be the default for Base Path too. I DO NOT recommend this. Please change Base Path, when you run HTTrack for the first time, to something like "C:\Web Site Mirrors". You will thank me, in the long run.

If you have the latter setup, and you mirror 3 blogs - Blog1, Blog2, and Blog3, you'll mirror them into "C:\Web Site Mirrors\Blog1", "C:\Web Site Mirrors\Blog2", and "C:\Web Site Mirrors\Blog3". The name "Blog1" for the Project and for the folder go hand in hand. If you rename the Project to "My Main Blog", the mirror folder will be automatically renamed to "C:\Web Site Mirrors\My Main Blog". Similarly, if you go into Windows Explorer, and rename "Blog1" to "My Vacation Blog", the entry in the pulldown list for Project Name will become "My Vacation Blog".
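The relationship between Base Path, Project Name, and mirror folder can be sketched in a few lines of Python. This is only an illustration, not part of HTTrack: the function names mirror_folder and httrack_command are made up here, and the second function assumes that you use the command-line version of HTTrack (with its -O output option) rather than the Windows screens described in this post.

```python
from pathlib import PureWindowsPath


def mirror_folder(base_path: str, project_name: str) -> PureWindowsPath:
    """Return the folder that holds one project's mirror: Base Path + Project Name."""
    return PureWindowsPath(base_path) / project_name


def httrack_command(urls: list[str], base_path: str, project_name: str) -> list[str]:
    """Build an argument list for the httrack command-line tool.

    The -O option names the folder that receives the mirror and its log files.
    Assumption: the command-line httrack is installed and on the PATH.
    """
    return ["httrack", *urls, "-O", str(mirror_folder(base_path, project_name))]


# Illustrative values taken from the examples in this post.
for name in ("Blog1", "Blog2", "Blog3"):
    print(mirror_folder(r"C:\Web Site Mirrors", name))

print(httrack_command(["http://myvacation2006.blogspot.com"],
                      r"C:\Web Site Mirrors", "Blog1"))
```

The list returned by httrack_command could be handed to subprocess.run if you wanted to script the mirror. The sketch only builds the command and does not run anything.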

Always organise your data libraries. If you really use your computer, your data libraries will become many times larger than your operating system, or your applications. Data library organisation is therefore more important.

If you care for your blogs, as I do for mine, you'll take extreme care in organising the mirrors that you set up. You will thank me, in the long run, if you can do this.

Make it easy on yourself - set up shortcuts to the mirrors. If my Project Name is "The Real Blogger Status", and my Base Path is "C:\Web Site Mirrors", I could create a shortcut to "C:\Web Site Mirrors\The Real Blogger Status\index.html", and copy the shortcut to a folder on my desktop or Start menu. I do this for each mirror I create. Try it, it makes checking and using your mirror so much easier.
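If you would rather script this than create shortcuts by hand, here is a minimal Python sketch that opens a mirror's index.html in your default browser. The function names mirror_index_uri and open_mirror are hypothetical, and the sketch assumes the mirror already exists under the Base Path and Project Name used in the example above.

```python
import webbrowser
from pathlib import PureWindowsPath
from urllib.parse import quote


def mirror_index_uri(base_path: str, project_name: str) -> str:
    """Return a file:// address for the index.html of one mirror."""
    index = PureWindowsPath(base_path) / project_name / "index.html"
    # Percent-encode spaces, but keep the slashes and the drive colon readable.
    return "file:///" + quote(index.as_posix(), safe="/:")


def open_mirror(base_path: str, project_name: str) -> bool:
    """Open the local mirror in the default browser; no connection is needed."""
    return webbrowser.open(mirror_index_uri(base_path, project_name))


print(mirror_index_uri(r"C:\Web Site Mirrors", "The Real Blogger Status"))
```

Calling open_mirror(r"C:\Web Site Mirrors", "The Real Blogger Status") does the same job as double-clicking the shortcut.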

NOTE: Here's a word of caution. Don't just save one copy of your important blogs. Suppose you read in the forums about problems accessing blogs, and you go run HTTrack. If your blog is afflicted like the others, HTTrack may have the same problem accessing your blog. HTTrack will copy the same problems that everybody else is experiencing into your blog mirror, overwriting the good copy. You will have no blog mirror either, when it's done.

If there are ongoing problems, make multiple mirrors of your blogs. I have one main backup job - "Main Backup", which backs up (among others) PChuck's Network and The Real Blogger Status, together. I have "Main Backup - A", "Main Backup - B", and "Main Backup - C", each a copy of the others. Today I run the first, tomorrow the second, the following day the third, and the day after that, I run the first again. This was called, in olden IT days, a "grandfather - father - son" backup strategy.
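The three day rotation can be sketched like this in Python. It is only an illustration of the idea, with a made-up function name (backup_set_for). It picks the set from the calendar date, so you never have to remember which one you ran yesterday.

```python
from datetime import date, timedelta

# The three project names from the rotation described above.
BACKUP_SETS = ["Main Backup - A", "Main Backup - B", "Main Backup - C"]


def backup_set_for(day: date) -> str:
    """Pick the backup set for a given day, cycling through the sets in order."""
    return BACKUP_SETS[day.toordinal() % len(BACKUP_SETS)]


# Show which set would be run on each of the next four days.
today = date.today()
for offset in range(4):
    day = today + timedelta(days=offset)
    print(day.isoformat(), backup_set_for(day))
```

Because the choice depends only on the date, the fourth day always lands on the same set as the first, and any three consecutive days use three different sets.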

A backup takes maybe 5 - 10 minutes, and can be run while I am busy elsewhere. It occupies (for me) maybe 15 - 20 MB of disk space.

After you run an HTTrack job, check the error log. Look for any line besides "link added" or "link updated", and make sure that it doesn't indicate a problem with the mirror that you just created. Test what you just created. With a GFS (grandfather - father - son) backup set, if you find that today's mirror is corrupt, because of problems with Blogspot, you can fall back on yesterday's mirror. In extreme cases, you can fall back on the mirror from the day before that.
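If the log is long, a small Python sketch like the one below can pick out the lines worth reading. It assumes only what is said above, that routine lines contain "link added" or "link updated". The function name suspicious_lines and the sample log text are made up for the illustration, and are not real HTTrack output.

```python
ROUTINE_MARKERS = ("link added", "link updated")


def suspicious_lines(log_text: str) -> list[str]:
    """Return the non-blank log lines that are not routine link messages."""
    return [
        line
        for line in log_text.splitlines()
        if line.strip() and not any(marker in line for marker in ROUTINE_MARKERS)
    ]


# Made-up sample text, only to show the filtering. A real log will look different.
sample_log = """\
example entry one: link added
example entry two: link updated
example entry three: something went wrong here
"""

for line in suspicious_lines(sample_log):
    print(line)
```

To check a real log, read the log file from your mirror folder into a string and pass it to suspicious_lines. An empty result suggests that only routine messages were logged, but you should still test the mirror itself.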

How important is another 20MB of disk space to you? When I make significant changes to my blogs, I can even run an extra mirror. The peace of mind provided, to me, is significant.

Quicker Alternatives

Maybe you would simply like to print off your blog to paper, to save a visual copy of the blog. And you can Export / Import the comments and posts, to / from files locally stored on your computer or network.


13 comments:

Scott Marlowe said...

Excellent tip. The backup utility works great. Thanks for the info!

Katherine said...

Errrm - not being very techie-minded (and your explanation left me with my jaw hanging open!), I'm backing up by e-mailing a copy of each post to myself and then I archive that.

Plus I archive the comments as well - same way (e-mails)

Then the only thing I have to do is copy the template into Word and archive that as well.

What do I lose if I just do this? OK - I know I might have to reassemble a blog at some point - but my solution requires minimal effort.

Marion said...

I want to thank you for the tip about HTTrack! It worked great, and I have my whole blog backed up now...something I thought would take me hours took only a little while.

Incredible!

Anonymous said...

Well, i tried to use HTTrack but unfortunately it goes looping and tries to download over 200 MB of content, which my blog never has, as i only have 80+ posts...probably i'm doing something wrong i don't know, or perhaps it just doesn't work with blogger beta...if anyone has an idea i would be happy to know about it.

Anonymous said...

Thanks, totally thanks, if I lost the work on my blog - it would really set me back.

Not sure I got the bit about saving what in which drive - going to have to ask someone about that, I think I did the thing you said not to do.

But thank you - you're a legend.

Nadia said...

Uhhh, I think I followed all your instructions correctly and it all seems to work. Only that the mirror I made on my blog contains only the last 7 or so posts (only those appearing on the first page). Am I doing sth wrong or I just need to change my blog settings temporarily to show all blogs on the 1st page, then backup, then change the settings again?
And one more stupid question: Where can I find my blog's template (I mean on the back-up) or should i save that separately?
Thanks a million!

Chuck said...

Nadia,

The instructions for backing up the template are in Backup Your Template, with a link above.

I don't know about the mirror scope. You've got a Layouts template - I see the blog - you are talking about http://mybloodyluv.blogspot.com/? I may have more reading to do.

fel was shaking her ass said...

i used the blogger backup utility from codeplex. it worked out well but when the XML documents are opened, they're all in html format and i don't really want that.

does this utility offered here, save posts in that format as well or hopefully more readable formats?

fel

Diane said...

I am thinking that the HTTrack is in French, which I do not speak. How can I fix this so I can get free HTTrack? Thanks, Diane

Chuck said...

Diane,

Je ne parle pas le francaise, either (well, un peu). I do, however, have HTTrack en anglaise.

What leads you to think that the HTTrack is in French?

Fernanda said...

I'll do at once.
never thought about that.
Thanks

Andy said...

Dumb question. If I ever have to restore my entire blog how would I do it?

Chuck said...

Andy,

This is an interesting question. I can think of at least three different scenarios where your question would be applicable. Each scenario will require a different backup strategy.

If you want to discuss this question more fully, try my private forum.

https://groups.google.com/forum/?fromgroups#!forum/nitecruzr-dot-net-blogging