Ah the (Digital) Humanity!

Today I'm sitting in O'Hare; tomorrow I get to meet some of the investigators in the burgeoning academic field of the Digital Humanities. These folks are all pondering the question "What sort of neat things can you do if you apply computing to understanding culture and history?"

The nascent field has already borne some interesting fruit:

Tonight, however, I just hope to find someone in Urbana who might like to go to Papa Del's with me when I get in around 8pm...


Dolan v United States

In a tragic day for jurisprudence, the Supreme Court of the United States Monday approved "fill in the blank" sentencing, allowing a court to decide details of a sentence (in this case the amount awarded for restitution) an indeterminate time after the defendant is convicted. Dissenting, Chief Justice Roberts wrote:

[If] a trial court can “leave open, say, the amount of a fine,” ante, at 12, why not, say, the number of years? Thus, after a defendant like Dolan has served his entire sentence—and who knows how long after?—a court might still order additional imprisonment, additional restitution, an additional fine, or an additional condition of supervised release.

BREYER, J., delivered the opinion of the Court, in which THOMAS, GINSBURG, ALITO, and SOTOMAYOR, JJ., joined. ROBERTS, C. J., filed a dissenting opinion, in which STEVENS, SCALIA, and KENNEDY, JJ., joined.

"Lifting Map"

I feel like a college freshman again.

There's a one-to-one correspondence between the set of circles in the plane and intersections of planes with the parabaloid z = x^2 + y^2. So you can turn the InCircle question into a half space test by passing your test point through the "Lifting Map" (x, y) -> (x, y, x^2+y^2)

Consistency, Survivability, Speed: Pick Two

Legal Disclaimer: The views and opinions expressed in this article are mine only and do not necessarily reflect the views of Google.

I spent my first two years at Google working on Google Checkout, and I walked away with two great lessons. One was the lesson I learned from Pat Helland on building scalable systems. The other was admitting a fundamental trade off in system design. Like many things, it seems obvious in retrospect, but you would be surprised at the number of people who keep trying for everything [cf, OceanStore].

When building a computing system, you only get to pick two: (1) Consistency (2) Survivability (3) Speed.

What do I mean by these? Consistency: Someone in Tokyo and someone in Chicago make a query on a piece of data and both observe the same state. Survivability: The system continues to function even in the face of a catastrophic outage such as the loss of power in New York state in 2003. Speed: How long does it take to mutate a single piece of data.

Scenario 1: Consistency and Survivability

If you want consistency and survivability, then you are forced to make each write in your system to backends on different failure domains. Failure domains may be quite large, as shown by the New York State 2003 power outage of 2003. This implies geographically distributed backends, and you can't beat the speed of light. A ping from Mountain View to Atlanta takes 60 ms on Google's network; and the speed of light puts the lower bound at 22 ms.

At Google, systems which choose consistency and survivability include Google Docs and our global Chubby cell. This trade off has the advantage of requiring no operator interaction when one of our clusters goes offline. However, it comes at a cost: writes which could have taken 2 ms take 200 ms. An application which makes 100 writes for each user action might work well enough in the former case but won't in the latter. Therefore, Google Docs and users of Global Chubby must be much more frugal with their writes.

Brewer's Law: In a network partition, there is always a losing side.

If two systems are independently modifying a piece of data, it may be difficult (or impossible) to resolve the combined effects of those modifications. What does it mean for an order to be canceled and shipped (so-called "forked state")1 ? To maintain consistency during a network partition, at most one side will win ownership of each piece of data. The other side will at best have a read-only version that may be stale.

Scenario 2: Consistency and Speed

If you're willing to forgo survivability, you can get consistency and speed in a variety of ways. The simplest way is to have one computer and a single copy of the data (a global "master"). To reduce your mean time between failure, you can put a quorum system like RAID or GFS in place so that when individual components fail, your system keeps humming along. However, if your entire site goes offline due to a fiber cut or other catastrophic failure, you have an outage.

The common approximation to survivability chosen by people who choose consistency and speed is to have a replica which lags a certain distance behind the master and is located elsewhere. The replica has a consistent view of the world: it's just a bit out of date. Such replicas are often used to serve read-only versions of the data, and there are approaches to force-promote a replica to be master if the master is lost, but care must be taken regarding the possibly-stale state of existing objects.

Scenario 3: Speed and Survivability

How does Google provision web search for the world? We don't synchronize the index.

Google has S search clusters each of which can serve Q search queries per second. Each search cluster receives a continuous stream of background updates that bring it to a close approximation of the "current" state of the Internet. However, there's no need for each cluster to be strictly in sync with the others. Maybe you have the blog post that was made ten minutes ago in your search results and I don't. In return for such small inconsistencies, Google provides web search which is always on and always fast.


Even though I've framed the survivability versus speed trade off here as one involving geographically distributed backends, the trade off arises at lower levels as well. GFS trades some survivability for speed: write completion does not wait for disk commit, only memory commit to three chunk servers. If all three chunk servers lose power before their disk caches are flushed, the writes are lost. We've found this trade off acceptable given the speed up: disk seeks take about 10 ms, whereas memory references take only 10 ns.

1 Maybe Amazon knows what it means to have an order be independently shipped and canceled. Amazon's Dynamo Paper indicates that many of their objects are written to have mergable histories.

-David Eger, November 2008

Media List 2007

I've stumbled upon a variety of good media this year which ended up falling into three broad categories: horror, curiosities and fantasy. The best of which I list below.


Both real and imagined this genre has the ability to make us want to turn our eyes, hoping not to see the degradation and inhumanity that lays before us.


Curiosities give us something interesting to think about. They are full of details from which we can learn, or which we may simply enjoy.


The realm of fantasy is the realm of our dreams: worlds we can explore, people we wish we knew (or were!), or stories that are just plain fun.Key: Fiction, Non-Fiction

New Year's in Atlanta

It's been five months and I'm settling in to life in the Bay Area. The San Francisco Bay Area is about as expensive as New York, but you get half again as much space for your money. Lots has happened this year: a cross country move, three weddings, a funeral, a new piano, getting promoted, moving projects, seeing asia for the first time, and listening to some good books.

I'll be in the Atlanta around the New Year (Dec 28 @ 7PM - Jan 6 @ 7AM), staying with Hammad and hopefully getting a chance to see you. Let me know if you're around.

Please post your current address (replies screened) if you'd like a christmas card.

Owl visit

jalenstrix drove up from her new home as resident linguist in the desert to visit me this weekend, hooray! Her new boy is a lot of fun; and jalen continues to be the most fun girl on the dance floor. Skinny czech girls approve.

Dark and Imaginative Fiction

I've been thinking of pigmassacre a lot lately, ever since I picked up a copy of Borderlands 3.

Dark fiction is something that brings back memories of Terrell Street -- the place where I was first introduced to the Borderlands series. My friend Pynk kept all manner of macabre games and books, so it's no surprise that he discovered some of the more imaginative books of the genre. Not all of the stories in Monteleone's series are great, but he does a good job of finding fresh ideas. The Borderlands series is a limited edition run, however. So it's a rare pleasure to find a copy of one of the five books.

Slightly easier to come by is the good stuff over at pseudopod.org. Pynk edits the podcast under very much the same rules as Monteleone, and there are enough up and coming writers to keep a good dark story coming out every week or so.

US Presidential Candidate Ron Paul

My sister told me the other day about a candidate for president: Ron Paul. He's one of the 133 congressmen who voted against the authorization of a war in Iraq, which is a nice quality in any candidate for president. Of the other popular candidates, Biden, Clinton, Edwards, Lieberman, and McCain all voted for the war. Feingold voted against it. Barack Obama didn't get elected to congress until 2004, two years after he might have voted. Senate Bill for Iraq War Authorization House Bill for Iraq War Authorization House Roll Call

Ron Paul is a reasonable candidate. His stances play out as follows according to his web site:

  • The Iraq war was a mistake, a war sold on a folio of lies.
  • The government has taken too broad a license to invade and monitor its citizens.
  • Tariffs are just more government taxes - get rid of them.

  • We tax and spend too much.
  • We must stop the government from taking land through eminent domain.
  • America should withrdraw from the WTO and other meddling international organizations.

  • Fetuses should be protected by law.

He'll be stopping by Google in a couple of weeks, and his speech will likely end up on Google Video. Let's see if he's any more receptive to our concerns than McCain was.