Monday, April 14, 2008

Beauty and the beast

Here are pictures from the UW campus out here. We went in the beginning of April to see the cherry blossoms. Later we took a detour to visit some of the libraries and sit in some guy's study.

UW Campus/Cherry Blossoms


Here are pictures from the UW campus back home. I think I took these late one night in MC when handing in an assignment.

UW (ugly)


You be the judge. I'm going to eat a cutie.

Sunday, April 13, 2008

Big news

I'm coming back to Toronto in June! Email me for more details. For everyone who is in town, especially friends I didn't get to see over the winter break, I will be sure to contact you. I've been terrible at keeping in touch.

Things in Seattle are going well. I finally found a roommate and a new place. My roommate is named John and he's from Microsoft, which is okay because he's tolerant of my constant jokes. Seriously, John is a great guy. The place is near downtown Redmond and it already feels like home. We finished unpacking and things are starting to settle back to normal. Yesterday we had our inaugural hot pot, before the weather grew too warm.

Last week we went to see the cherry blossoms at UW. The UW out here, that is. How beautiful the campus is! Now I see what the left-brainers of the world miss out on when they attend university. The blossoms were in full bloom, pale and white. Not pink like in Japan. I wore red to compensate.

March went by with minimal commotion, with the exception of the move. All things said and done, it went really smoothly, especially with the help of some great friends: Martin, Stan, Pat, and John. Recently I have been volunteering at church and my lesson to teach is how to abandon worry (Phil 4). Somehow the providence of the timing of the lesson and the move made sense: I'm normally pretty anxious and without what I learned the whole ordeal would have been a lot worse than it actually was.

For the first few nights I slept on the floor in the living room. I brought the cushions from my big orange couch. I set up the computer because I was bored and it was one of the few possessions I had with me. Somehow, the change in scenery and the liveliness of the sodium lights gave me some inspiration to write. It was nice to return to a meager existence — Mac Pro notwithstanding — for just a little while.

Further back, in February, Sil, Chris, Fred, and I went to Cuba. How I wish I had known about the lesson not to worry then! I had the wrong hotel name when I arrived. Everything worked out fine, though, despite my frantic pacing in the wrong hotel's lobby. After seven great days of sun, I learned many things. Always pack enough medicine. How to drive stick. Cuban food gives me LDZ. All in all, a great trip. And it's still free because Chris still hasn't cashed the check I wrote him.

Winter went by like a lamb in Seattle. I wish I could have said the same thing about Toronto. The only upside is snowboarding. Adrian came to visit in February, right before the Cuba trip. We had a blast in Crystal Mountain, him on skis and most of us on snowboards. Adrian, if you're reading this, I'm still waiting for this country to have a long weekend so I can fly up there! I miss the beef in Calgary, so expect a visit soon.

I think I'll leave things here for now. I don't want to say too much on this blog or I'll run out of stories to tell in person. So to you, my friends, I say good night. And take care.

P.S. The Queens project was a success. I'm writing it up in another venue, so you won't see any more posts about it here.

Sunday, September 30, 2007

Queens, growing pains

Part II. Blind alleys

Not everything goes smoothly in the passage of life. One of the numerous false starts I ran into stemmed in a false assumption. First off, my program is written in Java.1 I had reached the performance level of the current record holder, which was also written in Java, and I was ready to move on. Beating a Java program in terms of speed is kind of like winning the Special Olympics. I mean, it's kind of an accomplishment, but you know you're not facing off against the best the world has to offer. My point is that no matter what Sun tries to tell you, Java is slow. Slower than the guy who offed himself in Full Metal Jacket. So in order to face off against the fastest program in the world, my next target would have to be the C program that is now the second runner-up for the Queens record.2

Which as easier said than done. Where the Java world-record holder took 70 seconds to solve the 16-queens problem, the C program took 8 seconds.3 That's almost ten times faster. Ain't that a peach? Or maybe "peach" isn't the right word.

I decided to try something completely new with my approach, and here was my reasoning for this experiment. In C, recursion is about ten times slower than iteration.4 Which is why that fast runner-up program used iteration. And why I decided to use it too. Here's my code. Be warned, it's ugly. I'll explain the ugliness further down.
  private static void findSolutions(
int index,
int row,
long down,
long ld,
long rd,
int maxRowIndex,
SolutionResult res) {

int startIndex = index;
int startRow = row;
int[] indexStack = new int[maxRowIndex];
long mask;
long tries = res.tries;
long solutionCount = res.solutionCount;

// Main (critical) loop.
main_loop:
for (;;) {
mask = index;
if ((down & mask) == 0L &&
(ld & (mask << row)) == 0L &&
(rd & (mask << (32 - row))) == 0L) {
tries += 1;
if (row == maxRowIndex) {
solutionCount += 1;
// This optimization disables rectangular boards.
index = 1; // Triggers exit condition below.
} else {
down |= mask;
ld |= mask << row;
rd |= mask << (32 - row);
indexStack[row] = index;
++row;
index = startIndex;
continue;
}
}
while (index == 1) {
if (row == startRow) {
break main_loop;
}
--row;
index = indexStack[row];
mask = index;
down &= ~mask;
ld &= ~(mask << row);
rd &= ~(mask << (32 - row));
}
index >>= 1;
} // End critical for loop

res.tries = tries;
res.solutionCount = solutionCount;
}
Why is it ugly? Well, the problem we're trying to solve is inherently recursive. Transforming it into an iterative solution forced us to make certain concessions. These concessions are twofold. First, size: there's a lot more code. More code means more chance for bugs and should always be avoided if possible. Secondly, we have to maintain a custom stack in an array since we can't take advantage of the function stack.5 These two problems jam together to make this snippet awkward.

So right now, I should take a moment to explain exactly why I chose to try something ugly and awkward. It's the mentality that the ends justify the means. Just ask the Chinese: what's the childhood of one little girl compared to winning a gold medal in gymnastics? Well worth it, they'll tell you. In terms of cars, this solution is totally the pimped-out Civic that can do 300 HP. It may look like ass, but it goes fast. And that's worth it, right?

Right, if it went fast. But unfortunately, it didn't. What? That's kind of like going out with a girl who's both ugly and dumb. There's just no point. A total waste of effort. This solution wound up taking over 80 seconds. The assumption that an iterative solution would be fast in Java just because it's fast in C was wrong.

The only thing we can do is learn from our mistake. This thing was supposed to save our necks, but instead it totally bombed. How could that have happened? The mystery deepens.

The answer lies in the loops.6 Loops in Java are very slow. Kind of like the rest of Java, but inordinately so. The presence of a loop seems to incur much more overhead than a method invocation. That's an important clue as to where we should go next. Loops are slow, so if we want to go fast, we should avoid loops as much as possible.

This experiment failed, but from this failure we have obtained a critical hint about our goal. I kind of liken it to dating. Sometimes it's only through trials and failures that we learn what doesn't work. And the lesson is best learned through experience. It can be painful, even heartbreaking, to do your best and not be good enough. But instead of giving up and succumbing to the life of a spinster, we can take away nuggets of wisdom that leave us better equipped in the future. And I think that's the lesson learned here: our failures are much more valuable than our successes. Maybe I'm making too much a deal out of something so small, but then again programming makes me philosophize.

Footnotes

(1) Java 1.5 on Mac OS X, client VM.

(2) Solution for the 24 x 24 case by Jeff Somers. His site has several links to other Queens sites. And yes, this program is significantly faster than the current world record holder.

(3) Somers' site has some timings that give 23 seconds for the solution of the 16 x 16 case. These were taken on an older machine. All of my timings are taken on the same computer, a 2.6GHz Mac Pro. I compiled Somers' code with gcc 4.0 with -O3 optimization (the fastest). The resulting program solved the 16 x 16 case in 8 seconds.

(4) Recursion and iteration are two differnt ways of approaching a problem that can be broken down into smaller problems. For a classic example of recursion, see the Towers of Hanoi problem, a personal favorite interview question of mine. Iteration is somewhat simpler. If you've ever taken a programming class, a simple "for" loop is an example of iteration. It allows you to step sequentially through a range of values.
    Although both approaches are powerful tools for the programmer, there are certain classes of problems that lend themselves naturally to one approach over the other. Binary search, for example, can be implemented using either method but most computer science courses teach it using recursion because it's easier to understand that way.

(5) When one function calls another, the data from the new function is pushed onto a part of memory called the function stack or call stack. Think of it like putting a new, blank piece of paper on top of a pile when you start something new. Switching to iteration is like trying to use a whole bunch of Post-Its stuck all over your desk instead of a neat stack of paper.

(6) I determined this by using Apple's Shark profiling tool. It's free with OS X's developer tools and easily beats other profiling software that costs hundreds more. If you develop in Java, Shark alone is worth switching your environment to Mac OS X. It's really that good. And believe me, when a cynic like me says that, it really means something.
    Incidentally, all of my speed analyses in the Queens project are done using Shark.

Queens, finding dry land

Sometimes I think it's a curse to be in high-tech. Not because of the job itself but because it basically ruins movies for me. Any movie that has anything to do with computers. Any time a movie throws some hackery on the screen, it winds up looking incredibly amateurish. Imagine a medical show throwing around dialog about freezing peoples' heads. Utterly ridiculous. And I bet I know why this technobabble is wildly inaccurate: it's boring. Code is boring. And not just kind of boring, it's amazingly boring. Public television-boring.

Which is why I gotta warn you: these Queens entries will have a substantial amount of code. But don't worry, I'll be your tour guide. And all the code will be in snippets and clearly marked, so you can gawk at them like you're looking at animals in a zoo. Or inmates in an insane asylum. And we're walking down a corridor, lit only by bare light bulbs, that leads to Dr. Lecter himself.

But don't worry. I'll make it interesting. When I have a pen and paper in my hands, I can make anything interesting.

Part I. Growing pains, or discovery

Software is really malleable. It doesn't exist anywhere except in the minds of its creators. There's nothing substantial about it. That's both a blessing and a curse, though, as many companies are guilty of releasing things that are half-baked. Kind of like the time I got an egg tart from Pacific Mall that was half-cooked. I tried to return it but only got half my money back. But I digest...

Sorry, hungry. Anyways, the best part of software is how easily you can change your mind about things. There are no parts to assemble, no chemicals to dispose. But all that flexibility comes at a cost. You're basically afloat in a vast ocean of possibilities on a tiny little raft. It's night and you're navigating using a deft combination of your wits and the stars. And each trick you know is another constellation you can use for guidance.

I've solved a few problems like this before, so I knew a few tricks. I essentially took what I knew and applied it cookie-cutter to this problem. The solution was, shall we say, competent but uninspired. In case you're curious:
  private static long findSolutions(
int index,
int row,
long down,
long ld,
long rd) {

int startIndex = index;
long solutions = 0;
long mask = 0;

// Main (critical) loop.
for (;; index >>= 1) {
mask = index;
if ((down & mask) == 0L &&
(ld & (mask << row)) == 0L &&
(rd & (mask << (32 - row))) == 0L) {
if (row != 0) {
solutions += findSolutions(startIndex, row - 1,
down | mask,
ld | mask << row,
rd | mask << (32 - row));
} else {
// This optimization disables rectangular boards.
return 1L;
}
}
if (index == 1) {
return solutions;
}
} // End critical for loop
/* NOT REACHED */
} // end of function
If you're not a programmer, that was kind of a Keanu-"whoa" moment. What does any of it mean? Doesn't matter. All you really need to know is that it's basically the standard, plain-vanilla version of the solution. No one would fault you for a solution like this, but you probably wouldn't garnish much praise either.1

The only thing really noticeable about this solution is how normal it is. It sets everything up, then goes off into a loop. Ya gotta loop, cuz it's, you know, how you get stuff done on computers. Whenever you use a computer it's essentially in a giant loop waiting for you to do stuff. After the loop you get your "if" statements that check to see if we have a solution. Read one of my other entries for the wacky wigged-out fun that is "if" statements. Then the solution calls itself. Not like a schizophrenic with two cell phones; this kind of "calling yourself" actually makes sense. It's called recursion, and in order to define recursion, we must first define recursion. (If you got that, send me a resume.)

For the programming-minded, I have some basic optimizations in place. I'm using longs as bit fields2 and looping backwards to avoid a comparison to a variable through each iteration of the loop.3 Real Mickey Mouse stuff. In fact, in any other project, I probably wouldn't suggest looking into such trivial matters. But in this case, trying to optimize this loop was kind of like squeezing blood from an onion.

So there you go. At this point, we're like a kid out of high school. We know what we're doing and where we want to go. We can accomplish things, but we're neither noteworthy nor especially fast. But we're past the first hurdle.

Timing: For my baseline, I'm using the current Java world-record holder.4 It solves the 16-queen problem5 in 70 seconds on my computer. This version of my solver does it in 70.48. I actually timed it. Yes, I have no life. But I'm just about as fast as a world-record holder, and that's a good start.

Footnotes.

(1) Actually, my first attempt separated the representation of the chessboard into a separate class. Well, since method invocations on classes are much slower than accessing variables, that approach was initially much slower, taking over 180 seconds to solve the 16 x 16 chessboard case. That was embarrassingly slow to mention, except as footnote material.

(2) This is as opposed to using boolean arrays. Array accesses in Java are automatically bounds-checked. This means that each time an array access occurs, the VM checks to see if the index is in bounds. This incurs a significant overhead.

(3) This is a really esoteric form of optimization, and I'd only use in cases where there's a very frequently-called loop and precious little room to improve elsewhere. Such as a tight loop like this. It essentially involves looping backwards from N to 0 instead of forwards from 0 to N in cases where you don't know what N is beforehand. The normal way of iterating forwards requires a comparison between two variables (the loop index and N), which requires two loads from main memory. Looping backwards requires only a comparison between a variable and a constant (the index and 0), which only requires one load from main memory. Loads from main memory are slow and best avoided if possible.

(4) Inria's ProActive solver. Info and source code available at http://proactive.inria.fr/nqueen.htm.

(5) This is the problem of finding the total number of solutions on a 16 x 16 chessboard. Incidentally, there are 14,772,512. Timing was done on a single computer.

Sunday, September 23, 2007

Queens, at first glance

Sometimes obsession is a good thing. Without the ability to focus our thoughts we'd be nowhere today. Without a Dutchman obsessed with pornography we'd have no auto-focusing lenses on our cameras. Without a Japanese comic book artist steadfastly refusing to take her psychotic medication, we wouldn't have Sailor Moon. And finally, without Microsoft's obsession with copying every innovative high-tech company in the business, we wouldn't have the Xbox, Live Search, Zune, or a bunch of other products that no one uses. Cheap shot, I know.

It was obsession that led me to the subject matter at hand. A while ago I was called to do some interviewing. For this interview I had to think of a question. I was kind of lacking creativity that day so I decided to look at the Eight Queens problem. My introduction to this problem was way back in the ninth grade playing 7th Guest. It's a pretty easy problem to understand, and actually pretty fun to solve by hand. The goal for my interview question would be, obviously, to find a solution by computer. At that time, I decided that if I was going to ask this question, I better be able to solve it myself! So I took a quick lunch break to do so. It wasn't too hard—a fun diversion if you're programming-minded—and my own initial solution was pretty straightforward.

Soon, I wanted to make sure that my program was correct. I designed it to spit out all the solutions that it found and checked the first few by hand. They all looked fine. Then I saw the total number of solutions: on an 8 by 8 chessboard, my program claimed that there were 92.1 So I cross-checked this against a known list and hey, whaddayaknowit, it matched. To make super-extra sure, I checked the number against some different sizes of chessboards. The numbers were weird, but they matched. Good, I think, it works.

The part of me that went through combinatorics (man that's a lot of links... am I really that hard to understand?) decided to think about how these numbers grow and how to count the number of solutions without actually going through the messy legwork of finding them all.2 Well, it turns out that no one knows of a better way. In order to find out that there are 2 quadrillion-and-whatever solutions for a 25 by 25 board, you actually need to find each and every one of them. Not only that, you also need to find all the possibilities that could have led to solutions but didn't. That's just about 500 bajillion others.3

Twenty-five, by the way... that's the biggest chessboard anyone has ever solved. It's a world record, as far as I know. So that's when I wonder... what would be involved in conquering lucky number 26? So then I see the web page of the current record holder. The results? Over a period of six months, they used the combined resources of 260 machines. That was for a 25 by 25 board.

For 26, it's estimated that there are ten times as many solutions and that the computation will take ten times as long. That means that we either spend five years waiting for this thing to finish with 260 machines, or we somehow recruit 2600 suckers for a period of six months. I don't know about you, but I have better things to do for five years, and I don't even know 200 suckers, let alone 2000.

So what's a programmer to do? World record, interesting challenge, epic computation. I'm sold. Put it on my Visa. I'll pay in installments. I'm gonna take my shot at finding the twenty-something-quadrillion magic number.

But wait, you say, hold it a sec. What about the kabillion-jillion computations and the mind-numbing years it will take to solve this? You really don't know me, do you? Insert smilie here. I'm going to work smarter, not harder. That means, if I have ten times as much work to do, using the same amount of resources as you have, then I've got to pump out my solutions ten times faster than you. But that's okay. Won't be the first time I've done something like this.

And I'm going to need help. Lots of help.

(1) This count is for unique solutions. There are actually 12 solutions unique up to symmetry. Two solutions are considered the same up to symmertry if you can rotate the chessboard 90 degrees in either direction or flip it horizontally or vertically to transform one solution into the other. I'm only interested in counting solutions without symmetry.

(2) The best conjecture right now is that the number c(n) of solutions for an n by n chessboard is asymptotic to n! / (2.54^n). That means that c(26) is about 10 times greater than c(25). So the number of solutions we're expecting for c(26) is in the range of 22 quadrillion, or 22,000,000,000,000,000, give or take a few hundred trillion.

(3) My current algorithm appears to make around 38 times more steps than the total number of solutions. For example, on a 16 by 16 board, there are 14,772,512 solutions. The algorithm takes 570,595,151 steps to find all these solutions, giving a factor of almost 38. This trend appears to continue. I haven't found any references about calculating the number of guesses taken by various solutions. This also means that the solution to 26 will require about 836 quadrillion steps to make sure we've found them all. Now you see why we've no time to dawdle?


References.

ProActive NQueens World Record [http://proactive.inria.fr/nqueens25.htm]. Current world record holder. Used a distributed computing solution to solve the 25 by 25 case. The figures for the amount of computation required is taken from their web site. Their computation was later repeated and confirmed independently by a team in Taiwan.

Online Encyclopedia of Integer Sequences [http://www.research.att.com/~njas/sequences/A000170]. All known values of this sequence are here. So are links to existing record holders and the conjecture for the asymptotic growth of the solution count.

Wikipedia [http://en.wikipedia.org/wiki/Eight_queens]. Solution counts and a list of all solutions for an 8 by 8 board up to symmetry. Also includes a number of interesting links and variations of the puzzle. Note that the section "The eight queens puzzle as an exercise in algorithm design" seems to (confusingly) discuss two forms of the puzzle without explicitly stating which form is under discussion. The form I propose to undertake counts the total number of solutions for a given board size. The latter half of that Wikipedia section talks about finding a single solution on a large board, which is a much simpler problem.

Like a rat in a cage

Tonight we went to Endfest, which featured music to go deaf by. Don't get me wrong, the bands were amazing, but now I'm left with a blaring fire alarm in my head. After surviving mosh pits, bouncers, and drunken brawls, our reward was a surprisingly good Paramore (I might even download their music), the me-too Social Distortion, the we're-too-good-to-say-hello-Seattle Smashing Pumpkins (who rocked anyways), and finally MXPX at some local dive.


Paramore


Social D


Pumpkins


MXPX


At MXPX. I'm slightly tenderized by the mosh pit.

So now the only thing I have to do is wash the smell of pot and tobacco out of my clothes.

Endfest was awesome!

Saturday, September 22, 2007

September rain

Every so often we find ourselves at a crossroad. When we realize that among the vast ocean of our dreams, the best we can hope for is to scoop a precious handful to be with us the rest of our lives. And it seems, for me at least, that each scoop is smaller and smaller, containing only the most urgent and precious of ambitions. That's not to say that we lose our dreams throughout the course of our lives. On the contrary—if we didn't let go of some of our goals in life then none of them would come true. Instead, I see it as a refinement. When a diamond is cut, it increases in value and decreases in quantity, and this process of sifting through our ambitions in life is similar. It is only by consciously choosing what to focus on, and realizing what we give up in the decision, that we truly discover what matters to us most.

Have you met my friend Billy? Billy's a great guy, really smart, really cool, but he doesn't get things done. Sure, he's a Microsoft, but that's not his ultimate goal in life. He has a multitude of projects: media browsers, building a motorcycle game, yoga. His variety of interest is great, boggling even. He'd be the start of a true renaissance man—if he ever followed through. I've seen his repertoire. It feels like reading through a thousand post-it notes, and scrawled on the back of each one is an excuse:

"I need to find more people to collaborate with."
"I don't want to start building something now and end up using obsolete technology by the time I finish it."
"I think it's a great idea, but I just haven't found the audience for it."

Not each one is an excuse in the pejorative sense. Some of these projects were refined (to use my own terminology) for obvious reasons: the dot-com bust. Some startup stole his idea. But others do seem like they might fly, and yet poor Billy holds them back. And every time Billy tells me about one of these things, I have to resist the urge to kick him in the butt. It's almost like he's afraid that he'll actually succeed.

Billy's an extreme. He's a shotgun artist. He throws a thousand blind darts and hopes that they stick. He doesn't see that sometimes, you have to stop waiting for that ideal point in your life to finally pick up and do that thing you always wanted to do, go that place you always wanted to go, or talk to that girl you always had a crush on. I see in this idealism a lot of naivete. And eventually his dreams lose their romantic energy and they peter out. And at the end of the day, always, gargantuan plans, always. But nothing accomplished. It turns out that Billy doesn't refine. He just gets bored.

Meet my other friend, Walt. Walt is my age (whatever that means to you). He works at a government job where he can't get fired and he lives with his parents. He sits in his room and downloads gigabytes and gigabytes of anime. I have the feeling that, if you looked really hard, you could find some stains in his carpet that are ten years old. Or twenty. I don't have much to say about Walt, other than his goals in life amount to nothing more than to inherit the house that he lives in. I've never heard him talk about his goals. I don't think he has any. This paragraph is probably the most that anyone will ever write about him. And he's perfectly happy with that.

As for me, I'm trying to stay somewhere in the reality between these two. I can see elements of both of these people in my own psyche, and that's scary. In order not to turn into either one, something's gotta give. It's difficult to live with too many hopes and aspirations; it's impossible to live with too few. The more thought I pour into what I want to do with my life that's truly mine, I come up with one answer. It's an answer that I've often visited, but only now do I make the commitment.

That long-winded intro is basically padding for my announcement.

I'm writing a book. That's it. I said it. Now it's out there, and I'm accountable. I might post some snippets on this blog from time to time. And if you care, if you really do care, then you'll bug me about it incessantly, to make sure I don't give up. Believe me, if you do, you'll earn a special place in my heart (and potentially dust jacket).