rsync and bzip2-compressed data

September 6, 2008 – 2:07 pm

As it only transfers deltas between source and destination files, rsync is a great backup tool when working with uncompressed data. The structure of compressed data, however, can change drastically between backups, defeating the benefits of rsync. I’d read somewhere recently, however, that bzip2’s “blocking” design might make it a viable compression format to use with rsync. I ran an ad hoc experiment this morning to check this out.

Uncompressed Data

Here are the results from rsync’ing an uncompressed MySQL database with a few minor record changes.

total: matches=1634 hash_hits=2136 false_alarms=0 data=21227
sent 6.73K bytes received 9.92K bytes 6.66K bytes/sec
total size is 2.69M speedup is 161.44

Nice. Only about 7K transferred. Roughly the size of the change.

bzip2 Compressed Data

And here are the results from rsync’ing the same MySQL database, compressed in advance with bzip2.

total: matches=596 hash_hits=17533 false_alarms=1 data=876602
sent 876.99K bytes received 7.73K bytes 353.89K bytes/sec
total size is 1.64M speedup is 1.85

Woah! What amounts to about a 7K change is resulting in well over 100x the data transfer (about 877K sent versus under 7K).
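To put the two runs side by side, here is a small Python sketch of the arithmetic. It assumes rsync's "speedup" is the total file size divided by the bytes sent plus the bytes received, and that the rounded K and M figures above are decimal units. The function name is made up for illustration.

```python
def speedup(total_size, sent, received):
    """Total file size divided by the bytes that actually crossed the wire."""
    return total_size / (sent + received)

# Assumption: the rounded figures in the rsync output use decimal units.
K, M = 1_000, 1_000_000

uncompressed = speedup(2.69 * M, 6.73 * K, 9.92 * K)
compressed = speedup(1.64 * M, 876.99 * K, 7.73 * K)

# How many times more data was sent for the bzip2 file.
sent_ratio = (876.99 * K) / (6.73 * K)

print(f"uncompressed speedup: {uncompressed:.2f}")
print(f"bzip2 speedup:        {compressed:.2f}")
print(f"bytes sent, bzip2 vs uncompressed: {sent_ratio:.0f}x")
```

Because the inputs are rounded, the first figure lands close to, but not exactly on, the 161.44 that rsync reported.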

Which makes sense: digging into the details of bzip2, I see that the algorithm compresses data in independent blocks of 100K to 900K.  So I suppose that using bzip2 might make sense if you have an incrementally growing data store that adds about 100K of data between backups, and where the older data rarely, if ever, changes.  Barring that, to achieve the benefits of rsync, uncompressed data is probably the way to go.
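A rough way to see the block effect without rsync is to compress the same data twice with Python's bz2 module, once with a byte changed near the start and once with a byte changed near the end, and count how many leading bytes of compressed output stay identical. This is only a sketch: the record format and helper names are invented for illustration, and a shared prefix is a much cruder measure than what rsync's rolling checksum actually matches.

```python
import bz2
import random

def make_records(n=25000, seed=0):
    """Build about 600K of fake, database-like text records."""
    rng = random.Random(seed)
    lines = (f"{i:08d},{rng.random():.12f}\n" for i in range(n))
    return "".join(lines).encode("ascii")

def flip_byte(data, index):
    """Return a copy of data with one bit flipped in the byte at index."""
    changed = bytearray(data)
    changed[index] ^= 0x01
    return bytes(changed)

def common_prefix_len(a, b):
    """Count the leading bytes that a and b have in common."""
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            return i
    return n

original = make_records()

# compresslevel=1 selects the smallest bzip2 block size (100K).
packed = bz2.compress(original, compresslevel=1)
early = bz2.compress(flip_byte(original, 5), compresslevel=1)
late = bz2.compress(flip_byte(original, len(original) - 5), compresslevel=1)

print("compressed size:            ", len(packed))
print("shared prefix, early change:", common_prefix_len(packed, early))
print("shared prefix, late change: ", common_prefix_len(packed, late))
```

A change in the last block leaves everything before it byte-for-byte identical, which is the append-only case described above. A change in the first block alters the output almost immediately, and since bzip2 packs its blocks at bit rather than byte boundaries, the later blocks may not line up for rsync even when their contents are unchanged.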

That said, there seems to be a version of gzip with an --rsyncable switch for Debian.  The BeezNest has a great article on this here.

What does the “g” in “gDiapers” stand for?

September 3, 2008 – 4:01 am

Genuine?  Green?  Actually, I’m pretty sure it stands for

GOOD GOD!  THE TOILET IS GROTESQUELY GUSHING GALLONS!

Yes, while I’m for saving the planet and all, I think that the makers of planet-friendly, biodegradable, flushable gDiapers should have a large warning on the box:  May cause toilet to explode at 3am.

Granted, it’s my fault for not reading the instructions.  But then again, I’m a guy.  A guy with a baby.  Like I’m going to read diaper instructions.  If not a warning, the gDiaper people should at least be guy-conscious/guy-friendly and include a picture on the box indicating that the included swizzle stick is for helping the diaper break apart in water; not for ramming vast quantities of diaper down the nether regions of the toilet.

A simple drawing of an angry diaper-prodding guy with a big slash through it would suffice.

Jealous Teddy

September 1, 2008 – 5:08 pm

Ever since Layla came along, our dog Teddy has been “Uncle Teddy”.  We were a bit worried at first, but he’s been great.  Layla is always the first person to be greeted when we come home.  And when Layla cries, Teddy makes sure to come and get us.  He’s a big, orange furred, over-protective uncle.

But, he does get a little jealous at times.  And when we’re spending too much time focused on little Layla, he’ll occasionally butt in to remind us that he’s there.  Which means we have a lot of snapshots kind of like this:

Sorry Ted!  I’ll pick up a fresh bag of pig ears for you this week.

What to do about TortoiseSVN 1.5.x svn+ssh “Connection closed unexpectedly” errors on Vista

August 28, 2008 – 8:47 pm

Annoying.  If, like me, you’re suddenly seeing this (despite assurances that it’s been fixed), I have two recommendations:

  1. Revert to an older TortoiseSVN 1.4.x build if you can find it.
  2. Try SmartSVN

SmartSVN is a Java-based free and “pro” drop-in replacement for Tortoise.  Unlike other SVN clients, you can use it exactly as you were using TortoiseSVN.  Possibly not as feature-rich, but considerably more polished than our favorite old Testudine.  Runs everywhere.  And it works over svn+ssh.

Fighting Layla

August 22, 2008 – 8:59 am

I’m not sure if this is a something-babies-do thing, or a Layla thing, but lately she has been holding her fists up in a “put up your dukes” pose.  Here the look on her face seems to say “Oh, I am so going to have fun kicking your ass.”

Sixteen years and I could be in big trouble.

Interesting minor modes discovered in the Emacs guided tour

August 19, 2008 – 10:02 pm

I was looking for a good introduction to Emacs for some friends and stumbled upon the truly excellent Guided Tour of Emacs on gnu.org.  And unsurprisingly it contained a few useful minor modes of which I had never heard.

icomplete-mode

icomplete-mode shows completions in the minibuffer as you type.  If you’re too impatient to hit tab, then this is the minor mode for you.

iswitchb-mode

This global minor mode solves an inconvenience that had always bothered me.  Typically, to see a list of buffers without resorting to the mouse, one has to type C-x b TAB, or C-x C-b and then switch over to the other window (C-x o) to select a buffer from the buffer menu. Too many keystrokes.

With iswitchb-mode turned on, C-x b shows a list of available buffers in the minibuffer, narrowing them down as you type.  A much easier way to jump between dozens (hundreds?) of open files.

Update: Looks like iswitchb-mode has been replaced with the far superior ido-mode as of Emacs 22.

How to change your default shell

August 12, 2008 – 8:56 pm

I ain’t never going to remember this, so into the blog it goes:

$ chsh -s /path/to/fish

Masochists should substitute whatever less friendly shell they prefer for “fish”.
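One thing that can trip this up: on many systems chsh will refuse, or at least complain, unless the shell's full path is listed in /etc/shells. Here is a small Python sketch for checking that. The helper name and the sample file contents are made up for illustration.

```python
def listed_shells(shells_text):
    """Return the shell paths named in /etc/shells-style text."""
    shells = []
    for line in shells_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if line and not line.startswith("#"):
            shells.append(line)
    return shells

# Hypothetical contents; the real file varies from system to system.
sample = """# /etc/shells: valid login shells
/bin/sh
/bin/bash
/usr/bin/fish
"""

print("/usr/bin/fish" in listed_shells(sample))
```

To check a real machine, pass in the contents of /etc/shells instead of the sample string.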

Some Evidence for Nancy Pelosi

August 10, 2008 – 3:07 pm

Recently on The View, Nancy Pelosi stated that “if somebody had a crime that the president had committed” (what she later refers to as “the goods”), she might consider impeachment hearings.

Um.  Okay.  Well, here are the goods. Or, at least, the tip of the iceberg:

Vincent Bugliosi, the L.A. attorney who prosecuted Charles Manson in 1970 and recent author of The Prosecution of George W. Bush for Murder, presented documented evidence that Bush Administration officials took the country to war under false pretenses and are therefore, under law, guilty of the deaths of over 4,000 American soldiers. Not to mention what the war has done to the Iraqis.

On Friday, Mike Barnicle interviewed Ron Suskind, author of The Way of the World: A Story of Truth and Hope in an Age of Extremism, which presents documented evidence that the White House ordered the CIA to “manufacture” evidence connecting Iraq to Al-Qaeda.

I think it’s time for Ms. Pelosi and others to stop pretending that Congress’ daily order of business is more important than calling into question the widespread misconduct overarching this highly questionable administration.

At the very least, an illegal war resulting in thousands of deaths and expenditure of trillions of dollars (much of it unaccounted for, much of it now in the coffers of businesses directly connected to Bush Administration officials) is an extremely dangerous precedent to let stand.

apache2: apr_sockaddr_info_get() failed for somehost

August 7, 2008 – 9:15 am

Hypothetically speaking of course, let’s assume you forget to renew a domain name. And suddenly that domain’s email is not working. And then you notice the site is down. The next step is, logically speaking, to panic, followed by an attempt to figure out what the hell is going on. Which usually means restarting Apache. Which results in:

apache2: apr_sockaddr_info_get() failed for yourhost

Which is, wow, an exotic new error. If you see this it means that, even though Apache says it’s restarting, really it’s probably not. And now all your other sites are down. And, so, more panic. More panic for you.

Now that you’ve probably realized that the default domain name has expired, you will want to get Apache back up on a different, actually non-expired domain. Like this:

$ hostname actual-non-expired-domain-name.com

Now restart.

Alternatively, if your hostname is set to something like “www”, you can probably change the default site in vhosts so that Apache can connect the hostname to the TLD.
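As far as I can tell, this error shows up when Apache cannot resolve the machine's own hostname to an address. A quick way to check whether a name resolves, before and after changing it, is a few lines of Python. This is a sketch, not a diagnosis tool, and the function name is made up for illustration.

```python
import socket

def resolves(name):
    """Return True if the local resolver can turn name into an address."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False

if __name__ == "__main__":
    # Check whatever hostname this machine currently reports.
    host = socket.gethostname()
    print(host, "resolves" if resolves(host) else "does NOT resolve")
```

If the current hostname does not resolve, that would fit the expired-domain story above.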

This is all hypothetical of course.

Pensive Layla

August 5, 2008 – 11:08 pm

Layla is pondering how best to launch a cutting-edge web 2.0 app running on a Gentoo-based LAMP stack... even though the management, sales, and technical teams are bickering like alley cats over a puddle of spilt milk.

Either that or she’s trying to figure out where stuffed doggy Julien got off to.
It’s hard to tell.  Same face.