The official online Fjord
August 22, 2010 at 7:48 pm · Filed under Tech
Lucky asked me the other day if there was any reason to use MyISAM instead of InnoDB for a MySQL database. Scum thought my response was useful, so I’m posting it in case it would be useful for anyone else:
From what I understand, there are still several advantages to MyISAM. If your database load is primarily read or primarily write activity, MyISAM is better. For mixed loads, InnoDB wins. Also, InnoDB comes with a lot of overhead of journaling and stuff to make it a “real” database. MyISAM is pretty simple. If you don’t need all those features, you might be better off with MyISAM.
In InnoDB’s favor:
Obviously if you need anything along the lines of transactions, foreign key constraints, or some general feeling of safety that your data will actually survive after something goes wrong, InnoDB is the answer.
MyISAM has no cache for data. Some or all of its index can get cached, but data retrieval always results in a disk hit. The kernel can help, but it’s still less than ideal. InnoDB uses the same structure to store the index and the data, so make your buffer pool size large enough, and the whole thing can be in RAM.
MyISAM tables get fragmented easily. They have a pretty boring mechanism of marking rows as deleted then filling them back in with new data, fragmenting as needed. It will stay that way until you do an OPTIMIZE TABLE. I don’t remember exactly how things work with InnoDB, but the problem isn’t nearly as bad, and you don’t need to do periodic table optimizations.
InnoDB has row-level locks; MyISAM has to lock the entire table whenever an update occurs. Obviously this is important for a database with mixed read/write activity because you can update part of a table while letting things continue on the rest of the table. There are several caveats to this rule. InnoDB locks have lots of subtleties and you’ll probably end up locking more of the table than you thought you were. Also, MyISAM does support a mode where a table can have INSERT and SELECT statements occurring concurrently, but you need to have very few UPDATE/DELETE statements for it to work.
That’s what I can think of off the top of my head. In general, I’d say go with InnoDB unless you have a good reason. It’ll give you a “real” database. If you don’t need a relational database (which I’m increasingly convinced is the case), use MongoDB or something. As always, the final answer depends on the characteristics of the particular application: benchmarks are your friend. Oh, and MySQL Performance Blog is an excellent resource for everything on the topic of MySQL tuning.
August 8, 2010 at 7:22 pm · Filed under Life
Last week, Rebecca asked me if I’d heard the new Arcade Fire album, The Suburbs. My reply ended up being more or less a full review, so here are my thoughts on the album:
It’s pretty great, but at this point, I wouldn’t rank it as high as their previous two albums. While they maintain their trademark epic soundscape, the music itself doesn’t sound quite as epic to me. “Neon Bible” has several multi-part songs and lots of cool things with harmonies, counter-melodies, time signature changes, creative instrumentation (boys’ choir, men’s choir, pipe organ(!), bells, etc.), and dark lyrics reflecting on such weighty topics as postmodernism and theology.
On the other hand, the topic at hand in the album–the suburbs–doesn’t really lend itself to those kinds of things. Do you really need a pipe organ and a boys’ choir to bemoan the fact that the house you grew up in looked like all the other ones on your block? Besides, their instrumentation does evoke a bit of a 90s vibe, which makes sense given that Win and Regine are in their early 30s.
And the album certainly has its moments. Who would have thought endlessly repeating the word “rococo” would be so interesting? There’s also the always-eloquent biting cultural commentary: “They heard me singing and they told me to stop. Quit these pretentious things and just punch the clock” (Sprawl II [Mountains Beyond Mountains]) and “In line for a number, but you don’t understand. Like a modern man” (Modern Man). And I particularly enjoyed “We Used to Wait,” their lament over the transience of modern communication and our obsession with instant gratification.
So should you buy it? Yes. But also get “Neon Bible” and “Funeral.” :-)
August 1, 2010 at 5:48 pm · Filed under Life
She’s a black belt in Karate. :-)
I adopted her from the friendly folks at Humanimal Connection a few weeks ago. It’s nice to finally have a cat. Even if she does think play time comes well before my alarm clock goes off…
The best part is that I didn’t even make up her name. It came from her previous owners. Clearly it was meant to be.
July 5, 2010 at 4:22 pm · Filed under Tech
So you’ve heard that tuning the Ruby garbage collector is essential for Rails performance, but all those tuning parameters in the REE documentation look scary and indecipherable? Well, I took an afternoon and played around with Birdstack‘s GC config, and it turns out that not only is it actually really important to tune those parameters, they’re not really that scary.
First, some terminology:
Ruby Heap: A collection of ~20-40 byte “slots” that keep track of every object allocated in a Ruby program
C Heap: Memory allocated by Ruby for storing data for your objects if the slot is too small. If you’re storing an integer, it’ll probably fit in the slot. If you’re storing a string containing the entirety of your latest novel, Ruby will need a slot to keep track of the object and some space on the C heap to store your bestseller.
Garbage Collection: Unlike C/C++, Ruby takes care of deallocating objects for you. It doesn’t do this as soon as you stop using an object; instead, it waits and does periodic sweeps through the Ruby heap (the collection of slots), figures out which objects aren’t being used, and then deletes them.
Ruby Enterprise Edition: A distribution of MRI 1.8.x with all sorts of nice patches. To tune Ruby’s garbage collector, you’ll need to be running REE.
If you want to know more about the Ruby garbage collector, go read the Engine Yard article on MRI Memory Allocation. Also, you should really take a look at the REE documentation to see more about these parameters. And if you really want to know what’s going on, just go read the gc.c source. It’s really not too complicated.
OK, so REE gives us a bunch of knobs to turn for GC and memory allocation, and a lot of diagnostic output. The first step is to turn on that output so we can decide what knobs to fiddle with. Here’s how I set up my Rails app to give me an approximate idea of what GC activity happens with every request.
GitHub Gist: http://gist.github.com/464624
Once you’ve got that set up, get your app running in a staging environment that closely mimics your production environment, and start up an instance of your app with GC logging turned on. I found that I needed to use just ‘script/server’ with WEBrick instead of Passenger or I wouldn’t get anything in my logs. Probably something with how Passenger handles file descriptors when forking; I didn’t feel like debugging it. You might also be interested in my post on setting machine-wide Ruby GC defaults.
I started with these environment variables:
RUBY_GC_STATS=1
RUBY_GC_DATA_FILE=/tmp/ruby_gc.txt
And here’s what I got for my first request:
***
Request completed: http://localhost:8080/people
GC collections: 7
GC time (us): 563772
Bytes allocated since last GC run: 1856079
Bytes allocated since last request: 43486971
Number of allocations since last request: 887968
Live objects: 558106
Total allocated objects since interpreter start: 2557208
HEAP[ 0]: size= 10000
HEAP[ 1]: size= 20001
HEAP[ 2]: size= 38001
HEAP[ 3]: size= 70402
HEAP[ 4]: size= 128722
HEAP[ 5]: size= 233698
HEAP[ 6]: size= 422655
***
Whoa, after only 1 request, we’ve done 7 GC runs, allocated 7 Ruby heaps, and spent over half a second doing nothing but garbage collecting. To be fair, this includes loading WEBrick, and I’m on a resource-starved virtual machine on my laptop, so things will be a bit different once we’re in full production. But this is still pretty horrible!
Let’s start with the easy optimization: heap allocations. After 1 request, we’ve got 558,106 objects, so it’s pretty silly to stick with the default heap size of 10,000 and make Ruby allocate more heaps when it runs out of room, so let’s up that to 600,000 and try again.
RUBY_GC_STATS=1
RUBY_GC_DATA_FILE=/tmp/ruby_gc.txt
RUBY_HEAP_MIN_SLOTS=600000
And here’s the request:
***
Request completed: http://localhost:8080/people
GC collections: 6
GC time (us): 502798
Bytes allocated since last GC run: 4013508
Bytes allocated since last request: 43482680
Number of allocations since last request: 887880
Live objects: 616405
Total allocated objects since interpreter start: 2557151
HEAP[ 0]: size= 600000
HEAP[ 1]: size= 610000
***
Well, that’s a bit better. We cut out 1 GC run, 5 Ruby heap allocations, and about 60ms of GC time. We ran over our 600,000 item heap limit (616,405 live objects), so let’s increase that to 650,000 for the next run. But that doesn’t explain why there were so many GC runs.
The next important tuning variable is RUBY_GC_MALLOC_LIMIT. By default, it’s set to 8000000, which means that every time we allocate more than 8 million bytes on the C heap (not the Ruby heap), we run GC. Well, on the first request, we allocated 43,482,680 bytes! Divide that by 8 million, and you get almost 5.5, so yup, 6 GC runs makes sense. But if we know that we’ll be allocating nearly 50 MB every request, let’s just increase that number.
RUBY_GC_STATS=1
RUBY_GC_DATA_FILE=/tmp/ruby_gc.txt
RUBY_HEAP_MIN_SLOTS=650000
RUBY_GC_MALLOC_LIMIT=50000000
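The arithmetic behind that choice of RUBY_GC_MALLOC_LIMIT is easy to redo for your own numbers. Here is a minimal sketch in plain Ruby. The helper name malloc_limit_ratio is made up for illustration and is not part of REE. The byte count comes from the second report above.

```ruby
# Hypothetical helper (not an REE API): how many "malloc limits" worth of
# C-heap allocation a single request performs. A ratio above 1 suggests the
# limit alone will trigger GC during the request.
def malloc_limit_ratio(bytes_allocated, malloc_limit)
  bytes_allocated.to_f / malloc_limit
end

DEFAULT_LIMIT = 8_000_000    # default RUBY_GC_MALLOC_LIMIT, per the post
TUNED_LIMIT   = 50_000_000   # value chosen in the walkthrough

bytes = 43_482_680           # "Bytes allocated since last request" above

puts format("default limit: %.2f", malloc_limit_ratio(bytes, DEFAULT_LIMIT))
puts format("tuned limit:   %.2f", malloc_limit_ratio(bytes, TUNED_LIMIT))
```

With the default limit the ratio comes out a little under 5.5, and with the tuned limit it drops below 1, so the malloc limit by itself should no longer force a collection during a request of this size.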
Here’s the request:
***
Request completed: http://localhost:8080/people
GC collections: 3
GC time (us): 339061
Bytes allocated since last GC run: 2796396
Bytes allocated since last request: 43482627
Number of allocations since last request: 887879
Live objects: 582949
Total allocated objects since interpreter start: 2557151
HEAP[ 0]: size= 650000
***
Much better! We cut out 3 GC runs, and over 150ms of GC time. But why is it still doing 3 GC runs? Well, it turns out that was just because either WEBrick or Rails initialized a ton of objects on startup. If we go back in the logs and look at the GC runs, we can see that two of the runs happened before the request started, and each time the freelist had 0. That means the Ruby heap space really was out of room to store more objects.
Garbage collection started
objects processed: 0650000
live objects : 0403654
freelist objects : 0000000
freed objects : 0246346
…
Garbage collection started
objects processed: 0650000
live objects : 0428574
freelist objects : 0000000
freed objects : 0221426
…
***
Request starting: http://localhost:8080/people
***
Garbage collection started
objects processed: 0650000
live objects : 0489192
freelist objects : 0000000
freed objects : 0160808
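If you’d rather not eyeball the log, entries like the ones above can be checked with a few lines of Ruby. This is only a sketch: it assumes every entry has the four counters in exactly the layout shown here, and parse_gc_runs is a name I made up, not something REE provides.

```ruby
# Two of the entries from the log above, copied verbatim.
SAMPLE_LOG = <<EOS
Garbage collection started
objects processed: 0650000
live objects : 0403654
freelist objects : 0000000
freed objects : 0246346
Garbage collection started
objects processed: 0650000
live objects : 0489192
freelist objects : 0000000
freed objects : 0160808
EOS

# Hypothetical parser: returns one hash of counters per GC run.
def parse_gc_runs(text)
  runs = []
  text.each_line do |line|
    if line =~ /\AGarbage collection started/
      runs << {}
    elsif line =~ /\A\s*([a-z ]+?)\s*:\s*(\d+)\s*\z/ && !runs.empty?
      runs.last[$1] = $2.to_i   # to_i reads "0650000" as decimal 650000
    end
  end
  runs
end

parse_gc_runs(SAMPLE_LOG).each_with_index do |run, i|
  # In these excerpts, live + freed adds up to the objects processed.
  balanced = run["live objects"] + run["freed objects"] == run["objects processed"]
  # An empty freelist means the heap had no free slots when GC started.
  full = run["freelist objects"] == 0
  puts "run #{i + 1}: live=#{run['live objects']} balanced=#{balanced} heap_was_full=#{full}"
end
```

A run that starts with an empty freelist was triggered by running out of slots rather than by the malloc limit, which is the distinction that matters when deciding which knob to turn.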
GC runs themselves aren’t a bad thing: if you use a language with garbage collection, you’ll have to collect dead objects sooner or later. The larger your heap, the fewer times you’ll have to do garbage collection, but a large heap also means that doing GC will take a long time. I figure a reasonable goal is an average of one GC run per request cycle.
So let’s run a few more requests on this same instance:
***
Request completed: http://localhost:8080/people/cghawthorne
GC collections: 1
GC time (us): 97365
Bytes allocated since last GC run: 3556
Bytes allocated since last request: 5147220
Number of allocations since last request: 158437
Live objects: 505713
Total allocated objects since interpreter start: 2736920
HEAP[ 0]: size= 650000
***
…
***
Request completed: http://localhost:8080/people/cghawthorne/lists/3.html
GC collections: 0
GC time (us): 0
Bytes allocated since last GC run: 2175864
Bytes allocated since last request: 1620109
Number of allocations since last request: 43456
Live objects: 562918
Total allocated objects since interpreter start: 2794125
HEAP[ 0]: size= 650000
***
Hey, it worked out! About 1 GC run per request (we’re almost at the point of having to do another one, with fewer than 100,000 objects to go) and no extra heap allocations.
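The heap sizing in this walkthrough boils down to two small calculations, sketched below. Both helpers are hypothetical, and rounding up to the next 50,000 is just a rule of thumb that happens to reproduce the values picked above; it is not an REE recommendation.

```ruby
# Hypothetical helper: round the live-object count up to the next multiple
# of `step` to get a candidate value for RUBY_HEAP_MIN_SLOTS.
def suggested_min_slots(live_objects, step = 50_000)
  ((live_objects + step - 1) / step) * step
end

# Slots still free in a heap of `heap_slots` given the reported live objects.
def headroom(heap_slots, live_objects)
  heap_slots - live_objects
end

puts suggested_min_slots(558_106)   # => 600000 (first report)
puts suggested_min_slots(616_405)   # => 650000 (second report)
puts headroom(650_000, 562_918)     # => 87082  (last report)
```

If your app’s live-object count swings a lot between requests, a larger step leaves more slack at the cost of a bigger heap to sweep on each GC run.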
There are several other settings you can tweak that have to do with determining the size of additionally allocated heaps. If your application’s memory size can occasionally grow, you might look into fine tuning them as well.
Update: If you’re curious, I collected some performance data from running Birdstack with and without these modifications.
Without GC tuning, the average VIRT size, as reported by ‘top’, of each Rails process was about 80 MB shortly after starting. Over the course of a day, the average time to serve a page was 599ms.
After tuning, the average start size of the Rails processes was 120 MB, and over a day, the average serve time was 581ms. After letting it run for another couple days, the average serve time dropped to 475ms.
So what does that mean? Well, the tunings didn’t actually improve response times on the site drastically. A lot of the variance can probably be attributed to different types of traffic. But, the changes certainly didn’t hurt things, and I’m willing to pay a few extra megabytes of RAM for a little more performance. I’m guessing the real problem here is that my server is very much IO-bound rather than CPU-bound. It lives on a virtual host with a limited amount of RAM and pretty slow disk access. So, every page to swap (thankfully not that many) and every disk hit the db has to do (quite a bit more) hurts. I’d be interested to see GC performance numbers for hosts that have better disk and memory resources. I know Twitter got a big performance boost with GC tuning.
June 9, 2010 at 1:55 pm · Filed under Life
Just like everyone else, I like to think that I’m immune to advertising’s influence. But that’s not really the case. The other day, Gmail figured out that I’m interested in music and decided to show me an ad for Musician’s Dice. Those sounded pretty novel, so I decided to take a look. Next thing I knew, I was ordering a copy of Muzundrum, a sort of musical Scrabble/crossword puzzle board game that is played with the Musician’s Dice.

I had a chance to play a few games while visiting my family in Kansas, and I think it’s a pretty great game. Players take turns rolling one of the 12-sided (one side for each note in the chromatic scale) dice and then try to place that die on the board in a way that adds to a scale or a triad. To keep things from getting too complicated, only major scales are allowed and triads must be either major, minor, or diminished. If that’s too simple, they also have a master’s version of the game.
I’ve played only three rounds (and won one!) so far, and I’m still using the cheat sheet quite a bit. But, as I hoped, it has helped improve my understanding of music theory.
Nathan (who is a full-time music minister and really knows his music theory) and Abu (who isn’t half bad at music theory either) were able to be in town one evening, and we all played a game with Grey (who claims not to know music theory, but has successfully applied his mathematical superpowers to the game). We probably spent more time talking about music theory and fiddling around on the piano than we did actually playing the game. It was great! Nathan won. :-)
If you happen to get a copy of the game (or come visit me!), here are the important bits of music theory you’ll need to know. Major scales are made up of whole steps and half steps in this pattern: WWHWWWH. Or, you can think of it as two identical tetrachords separated by a whole step: WWH-W-WWH (thank you, Wikipedia). Major triads are made up of a major third (the second note is two whole steps above the first) and a minor third (the second note is one whole step and one half step above the first). Minor triads are the opposite: a minor third followed by a major third. Diminished triads are made of two minor thirds.
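For the programmers in the audience, those rules fit in a few lines of Ruby. This is just a sketch: it counts intervals in half steps (a whole step is 2, a major third is 4, a minor third is 3) and spells every note with sharps, so it ignores proper enharmonic spelling. All the names are my own.

```ruby
NOTES = %w[C C# D D# E F F# G G# A A# B]

# W W H W W W H, written as half steps.
MAJOR_SCALE_STEPS = [2, 2, 1, 2, 2, 2, 1]

# Each triad is two stacked thirds: 4 half steps = major, 3 = minor.
TRIADS = {
  :major      => [4, 3],
  :minor      => [3, 4],
  :diminished => [3, 3]
}

# Start at `root` and climb by each interval in turn, wrapping at the octave.
def walk(root, steps)
  index = NOTES.index(root)
  steps.inject([root]) do |notes, step|
    index = (index + step) % NOTES.length
    notes << NOTES[index]
  end
end

def major_scale(root)
  walk(root, MAJOR_SCALE_STEPS)
end

def triad(root, quality)
  walk(root, TRIADS.fetch(quality))
end

puts major_scale("C").join(" ")          # C D E F G A B C
puts triad("C", :major).join(" ")        # C E G
puts triad("A", :minor).join(" ")        # A C E
puts triad("B", :diminished).join(" ")   # B D F
```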
I’d learned all that stuff at some point, but playing the game and talking through it all really helped solidify it in my mind.
One potential gripe I have with the game is that there seems to be very little strategy involved. You never know what note you’ll get on your next roll, so there’s usually not much point in planning ahead; just find the best place to put the note you just rolled. For now, just trying to find that best place is sufficiently challenging to keep me interested, but I can see how it might get less interesting once I get better.
I haven’t tried it yet, but it might be fun to try pre-rolling some number of dice for each player and keeping them secret. This could make Scrabble-like strategizing possible. Or, maybe once I get bored with the standard gameplay, I could just move up to the master level.
Anyway, it’s a cool game. You should come visit me so we can play a round!
June 6, 2010 at 1:35 pm · Filed under Life
Fjord’s Last Stand photos
Last weekend was my final weekend in Maryland before moving to California to start my shiny new job (which starts tomorrow!). So, my totally awesome friends put together a non-stop four-day-long party which Jared titled “Fjord’s Last Stand.” Luckily, things worked out better for me than for Custer.
Friday
The weekend began with an all-you-can-eat crab extravaganza at a little seafood place that Jared found. Zerg had never had crabs before, and I’d only had them once, so we definitely had to experience a true Maryland crab dinner before I left. It was pretty great. The all-you-can-eat aspect certainly made it less stressful–we didn’t have to be worried about extracting all possible meat from every crab.
Then we went back to the Osborns’ (who graciously let me sleep on their couch because I had already vacated my apartment) and watched the last two episodes of Chuck. Zerg and I had watched all the rest of the season while we were still roommates, so it seemed fitting to finish it off at the party. The season had a great finale, but I really wish they would have just ended the show instead of adding another crazy twist at the last moment.
Saturday

The next morning, we got up early and headed to Philadelphia to see (and hear) THE LARGEST PIPE ORGAN IN THE WORLD! Woo! Oddly enough, it’s not in a church or a concert hall, but a Macy’s department store. The Wanamaker Organ did not disappoint. The sound wasn’t overwhelmingly loud, but amazingly rich, warm, complex, diverse, and yes, powerful. I highly recommend a visit. In addition to a fantastic concert by Peter Conte himself, we also got to go on a tour of the organ.

But, best of all, I got to sit at the console! More photos of the tour and the console are in the gallery.
We also got to see the Liberty Bell, visited a couple yarn shops, and ate at Pat’s, home of the original Philly Cheese Steak. Those were all great, but obviously my favorite was the organ. :-)
Sunday
On Sunday, we had a special outdoor church service under a big tent for the Bishop’s visit. It was actually pretty neat, and we had great weather for it. St. Tim’s gave me a really amazing going away present: an accompanist’s edition of the 1982 Hymnal and Service Music, signed by everyone on the worship team. I hope I can find a good church that will let me use it on their pipe organ!

After the service, we had a big picnic lunch. And after that, Fr. Terry had the crazy idea that we should have a Young Adults vs. The Vestry rope pull over a mud pit. He had told us we’d be doing this earlier, but we really couldn’t tell if he was joking or not until it actually happened. Unfortunately, most of the young adults had left by that time, but the teens joined in with us in our battle against the vestry. We still lost. And got rather muddy. What a way to finish up my time at St. Tim’s! :-)
Then, we went back to the Osborns’ for an excellent Rock Band session (we played a lot of Rock Band that weekend) and watched Hero and Kill Bill Vol. 2 (the greatest movie ever made, ever).
Monday

Thanks to Memorial Day, most people were off work on Monday, so the party continued! In the morning, we went to Jared’s parents’ place to celebrate his dad’s birthday. Then it was back to the Osborns’ for Make-Your-Own-Sushi Night! We had salmon and shark for the meat and cucumber, carrot, avocado, and scrambled eggs for the, uh, non-meat. We mostly did regular rolls but also a few handrolls and even some sashimi. It was pretty tasty.
Then, in continuing our movie theme, we watched Once (one of Rebecca’s favorite movies). It had some pretty great music and made me want to go start a band or something. Brandon and CC even came over, and I got to play with their cat Diotima.
Conclusion
If you were to make a list of Things Fjord Likes, there wouldn’t be very many things on it that didn’t occur that weekend. Thanks to everyone who made it happen! I shall miss you, my east coast buddies. Thanks for being the awesomest friends anyone could ask for!
You can also read Rebecca’s account of the party. The last bit almost brings tears to my eyes. ::sniff::
May 24, 2010 at 11:53 pm · Filed under Tech
Update: I’ve posted some more info on how to go about tuning the garbage collector in Ruby Garbage Collector Tuning: A Walkthrough
After attending Ruby Nation (which was awesome) and hearing a couple talks on how Ruby’s garbage collection works, I’ve been playing around some with the GC settings that REE lets you tweak. I’m still experimenting with the best setup, but I do know that by setting RUBY_HEAP_MIN_SLOTS to 600,000 (the default is 10,000), the Birdstack code base loads with 1 heap allocation instead of 7.
So, that’s an obvious win, and I wanted to make sure that all my app servers used that setting. Unfortunately, there’s no easy way to do that. My first thought was to look for a Passenger option to set the environment for the REE instances it launches, but that setting doesn’t exist, and that wouldn’t cover my delayed_job process or the cron jobs that run.
My solution was to write a wrapper around the ruby program itself. You can’t use a script as an interpreter for a shebang line (#!) with kernels earlier than 2.6.27.9, so for my deployment, it needed to be in C. It’s a pretty trivial program, but it gets the job done and ensures that all my Ruby programs use the correct GC settings.
GitHub Gist: http://gist.github.com/412737
Add whatever other GC settings you’d like and modify the exec line to point to your real Ruby executable. To compile, just do ‘make ruby’.
Read up on the other REE settings in the REE documentation. Give it a few reads; the way the allocator works isn’t totally obvious. I also recommend the Engine Yard blog post on the MRI allocator.
May 1, 2010 at 9:09 pm · Filed under Life
Spence flew in Thursday night for our long-awaited Epic Research Party of Awesometude. Remember that Stockade Case thing we obsessively researched for an entire semester back in college and were going to write a book about? Well, it’s going to happen. It turns out that the National Archives in DC has all of the original copies of everything used in the trial, so we’re going to spend the next week researching and writing like crazy.
It’s pretty exciting to handle original documents from the 1860s. We’ve spent countless hours reading about the people involved in this crazy trial, and now we’re able to handle papers that they actually wrote on! Better yet, the things they wrote are pretty nuts. My advice: don’t mess with a Reconstruction-era disenfranchised Texan.
Bonus: we get to use locker #41 every day to store our stuff. :-)
February 21, 2010 at 2:01 pm · Filed under Life
As of today, I’m actually a real organist! Nearly two years ago (April ’08), I started piano lessons with the goal of eventually playing the organ in church. Today, I played Old Hundredth (“Praise God from whom all Blessings Flow”) in church. I used all three manuals of the pipe organ at St. Tim’s, both feet on the pedals, and a pretty full registration. The congregation sang and everything! I made a few minor mistakes, but I kept on playing, and I doubt very many people even noticed.
Now, I just need to join the guild. :-)
December 31, 2009 at 7:26 pm · Filed under Life
For Christmas break, I’m back in Kansas with my parents. This last Sunday, I decided to check out the local Episcopal church. I started attending an Episcopal church after college, so I was curious what the one in my hometown was like. Grace Episcopal turned out to be a great little church, and better yet, they’ve got an awesome pipe organ! It’s a little Gabriel Kney tracker organ (entirely mechanical) built in 1989 in Canada. I talked with the rector after the service, and he said I could come by during the week and practice on it if I wanted, so guess where I’ve been the last three afternoons. :-)
It has only 13 stops, but it’s tons of fun to play. It’s small, so I was right next to the pipes and immersed in their sound when playing. And, it’s a tracker organ, so I could literally feel the levers opening the pipes for every note I played.
I’m still trying to figure out a reasonable way to record pipe organ music, so I don’t have any music to post yet, but I’ll get that figured out soon. Next week, it’s back to the much larger pipe organ at St. Tim’s!
More pipe organ pictures