X-Combinator

making the human scalable

RailsConf08: 23 Hacks

Nathaniel Talbott (from Terralien) gave a great talk on a number of interesting hacks in Ruby. There was one in particular that I want to focus on here: gitjour. Gitjour is a fun and novel app that discovers git repositories on a local network via Bonjour. It’s hardly going to change your life, but it is fantastic for ad-hoc sharing of code on a local network and quick collaboration without too much ceremony, e.g. at a conference or a Ruby users group.

It’s a fun tool and you should definitely try it out.

While I’m on the topic of git, I’d like to point out that the community at RailsConf was completely sold on using git and using GitHub in particular. Even though I am a git-switcher myself, it was really surprising to see that almost the whole community had either switched or was planning on switching.

GitHub did a great job marketing at the conference. They gave out free t-shirts that said “Fork You” on the front. On the back the shirt said “http://github.com/________” and you were expected to take a Sharpie and write in your GitHub username so that others could see your projects (my projects are here). The result was that at nearly every session or meal someone would use the phrase “I think I’ll put this code up on GitHub”.

UPDATE: I’ve found a picture of the shirt here

RailsConf08: (opinion) ttyrec and my advice to session presenters

RailsConf was fantastic. A huge thank-you goes out to everyone who took the time to create presentations and sessions. Creating them is a lot of work and I commend you for preparing them. One common negative theme, however, was this: Live coding isn’t efficient and it is bug-prone. It doesn’t matter who you are or how good you are, you’re not going to be maximally efficient presenting if you are performing live coding. Even if you practice, when you get up on stage you will make typos and it will not run exactly as planned. If you’re not convinced, let me give you a list of people who did live coding this weekend and made typos or mistakes:

Each of these guys is a Ruby rock star and they are way smarter than I am. The content of what they presented was, in every case, phenomenal. They are some of the best programmers in our community, and if they can’t give a presentation without typos and mistakes I can almost guarantee that I (you) won’t be able to either. When you get in front of 300 people, start typing, start talking, and realize you’re under the pressure of time (and the crowd), it changes your ability to focus on any one task and it is very hard not to make mistakes. My theory is that one can present a live-like demonstration with no loss of effectiveness.

The most obvious way to do a live-like demonstration is by recording a screencast. You could also use AppleScript or a library like castanaut (if you are on a Mac). The app I’d like to recommend today is ttyrec. It’s a great little command-line app (it compiles flawlessly on my OS X 10.4) that records the output of a shell session and can play it back as if you were typing in real time. I see two major benefits to this:

  • You can’t make mistakes. The session is recorded and all of its output is stored in a text file. This means that if the network fritzes out when you’re on stage it doesn’t matter. Network outages, ill-timed bugs, etc. become irrelevant because the commands are not actually run at playback time (the stderr/stdout stream is saved into a text file).
  • You can talk while you’re playing the demo. They do this in TV all the time (think Friends). In a scene where actors play video games they don’t actually play the game and act at the same time. The video games are recorded so the actor can focus on acting. When your computer is playing back the work you’ve prepared beforehand you can focus on talking while the “typing” is happening on the screen. You don’t have to do the mental context-switching of talking and typing simultaneously, which causes mistakes that cost you (and maybe more importantly, your audience) time.
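
The record-then-replay idea behind both points can be sketched in a few lines of Ruby. This is only an illustration of the concept, not ttyrec’s actual file format or implementation: the `Frame` struct, the `record` and `replay` methods, and the sample session text are all hypothetical. Each frame stores a chunk of output plus the pause that preceded it, and playback simply writes the stored text with the same pauses.

```ruby
# Hypothetical model of a recorded terminal session: each frame holds the
# pause (in seconds) before a chunk of output and the output text itself.
Frame = Struct.new(:delay, :text)

# Build a recording from [delay, text] pairs (a stand-in for a real capture).
def record(pairs)
  pairs.map { |delay, text| Frame.new(delay, text) }
end

# Replay the frames: wait, then write the stored text. No command is run at
# playback time; the output comes entirely from the recording.
def replay(frames, out: $stdout, sleeper: ->(seconds) { sleep(seconds) })
  frames.each do |frame|
    sleeper.call(frame.delay)
    out.write(frame.text)
  end
  out
end

# Made-up sample session, for illustration only.
session = record([
  [0.0, "$ "],
  [0.3, "rake test\n"],
  [1.0, "12 tests, 0 failures\n"]
])
```

Calling `replay(session)` would “type” the session to the terminal with the recorded pauses, whether or not the network (or the code) is working at that moment.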

You may say “Well, I don’t want to do a recording, I want my demo to be live”. Here’s my view on that: Most people don’t care if your demo is actually live. When you’re on stage, whatever you’re showing is assumed to be running in the perfect environment because you’ve been testing and developing in that exact environment. Even if your demo is truly live it still seems “contrived” in that we, the viewers, can never actually see all the pre-work, installing, compiling, back-end setup, etc. that was prepared behind the scenes. As Josh Susser pointed out: “keep in mind Lansford’s Corollary to Clarke’s Third Law: ‘Any sufficiently advanced technology is indistinguishable from a rigged demo.’”

I believe that recording your shell with ttyrec or similar is exactly as effective as a live demo. If I, as a listener, am interested in your project I’ll be just as convinced to download/try/buy even if your demo is recorded. In fact, I’ll be a lot more convinced to try it out if you play something recorded that runs perfectly as opposed to a system that is so complicated or buggy that even the author is having problems using it.

So please record your “live coding” beforehand. Your audience will thank you, you’ll avoid embarrassment, and you’ll be more effective for it.

UPDATE I removed Yehuda Katz from the list of non-recorded presenters. My apologies go out to Yehuda, who pointed out that he did record his presentation beforehand.

Pair Programming with screen

I want to share a simple tip that I learned from Jim Weirich (author of rake, xml builder, and rubygems) this weekend at RailsConf: You can do pair programming with screen. When I learned about this I asked around and apparently everyone knew about this already, but I had never done it. Here’s how you can do it too:

Open up ~/.screenrc and add the following lines:

In terminal one, start a screen session. Then log in as the same user on terminal two and type screen -x. Now either user can see and control, in real time, what is going on in the tty. It is possible to do this across users, I believe. See your local man page.

They use this in combination with audio/video chat (Skype or iChat). This way they can see facial expressions, ask for control of the keyboard, etc. Jim and co. have keywords for passing control; I believe they are “drive?” and “release.” The person who wants to take over says “drive?” and they are allowed to use the keyboard iff the other person says “release.” Seems like a simple but powerful idea.

UPDATE Jim pointed out that they use the terms “tag” and “yield”. Thanks Jim!

RailsConf08: Passenger or mod_rails RIP

There was no lack of hubris on the stage today as the guys from Phusion talked about their new Apache extension, Passenger. If Passenger lives up to its claims, it could quickly become the de facto standard for deploying Rails (and more) applications.

The 19 22-year-old duo was obviously ecstatic about sharing what they created but I kept getting the feeling that they were surprised that the crowd didn’t give them a presidential-state-of-the-union-like standing ovation every few minutes. (If Passenger does what they claim maybe they deserve it, but respect and appreciation often lose something when they are too eagerly expected.)

So what was it that they claimed? That Passenger will not only make Rails deployment dead-simple (think PHP, upload and go) but also crank out better performance while using less memory. It’s a worthy goal and as Kent Beck said in our keynote, “Humility is not a prerequisite to ideas with impact”. I’d like to write up a bit about Passenger and the session they presented at RailsConf. You can download the Keynote slides here [zip] (I’m sure the PDF will appear somewhere soon. If you find it, please feel free to leave a URL in the comments).

Memory Usage and Clustering

Memory Usage

First let’s talk about memory usage. When you’re using Mongrel, each Mongrel process holds a full copy of the Rails and application code in memory, plus the private memory for the individual process. In this model, N Mongrel processes means N copies of the application code. With Passenger, the processes share one copy of your Rails/application code. Each process still gets its own chunk of private memory but the shared code greatly reduces the overall memory usage.

The Phusion guys also patched Ruby (and they’ve horribly named it “Ruby Enterprise Edition”). This version has a modified garbage collector that causes Ruby to use significantly less memory. They achieve this by making garbage collection copy-on-write friendly, so that memory shared between processes stays shared. This hasn’t been released, so no word yet on how well it works or how stable it is.

Clustering

Another nice feature is that with clustering they use “fair load balancing”. The idea is that you keep track of how many jobs each process in the cluster has and you give the next job to the process with the least amount of work to do.
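
To make the idea concrete, here is a minimal Ruby sketch of least-busy dispatch. It is not Passenger’s implementation (which I haven’t read); the `FairBalancer` class, its method names, and the worker names are hypothetical. It only shows the bookkeeping described above: count the jobs each process has in flight and hand the next job to the one with the fewest.

```ruby
# Hypothetical dispatcher: tracks in-flight jobs per worker and always
# hands the next job to the worker with the fewest.
class FairBalancer
  def initialize(worker_names)
    @active = worker_names.each_with_object({}) { |name, counts| counts[name] = 0 }
  end

  # Pick the least-busy worker and count the new job against it.
  def assign
    name, _count = @active.min_by { |_name, count| count }
    @active[name] += 1
    name
  end

  # Call when a worker finishes a job so it becomes eligible sooner.
  def finish(name)
    @active[name] -= 1 if @active.fetch(name) > 0
  end

  # Snapshot of the current job counts per worker.
  def load
    @active.dup
  end
end
```

Compare this with plain round-robin, where a process stuck on one slow request keeps receiving new ones anyway; here a busy process is skipped until its count drops again.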

Competitor Comparison

They compared Passenger’s performance (as an Apache extension) to many competing products (including Nginx and Mongrel) and claimed that it used significantly less memory and was much faster. I won’t repeat all the statistics; you can check out the slides.

mod_rails, RIP

Although they had greatly simplified deployment for Rails, they didn’t stop there. Passenger now supports Rack. I see this as probably the coolest thing about Passenger. Now any custom server that you write using Rack can basically be “dropped” into Apache and is effortlessly handled by Passenger. This also means that Rails alternatives like Merb and Camping work out of the box. But there is more…

They also added support for Python’s Django. It was almost comical: when they announced this the crowd nearly booed them. I think it was a bit unfair, but I guess they should have expected it at a Rails conference. Either way, you have to give them props for taking the initiative and pushing the software to its boundaries.

Because Passenger supports frameworks other than Rails, they decided to drop the name mod_rails and call it Phusion Passenger. They mentioned that their focus is still going to be developing and perfecting Rails support. Passenger will simply not be exclusive to Rails.

Case Studies

Passenger is already being used in production at Dreamhost. It is also being used by soocial and ilike.

The Q&A

By the Q&A the crowd was full of skeptics; what they promised seemed too good to be true. However, one gentleman from the crowd sensed this, stood up, and said that he works with a developer/deployer, Tom, who has been working with production Rails deployment for 4 years. Tom apparently knows the ins and outs of Mongrel, Monit, Capistrano, etc. They found out about Passenger two or three weeks ago and he deployed all of their existing applications to Passenger in a single day. His comment was:

Everything I learned [about Rails deployment] over the last few years is moot now, and that is a good thing. -Tom the deployer

He said that “[Passenger] is incredibly awesome when it comes to Rails deployment.”

A skeptic stood and rightly asked: “Why wasn’t this done five years ago? What was the technological hurdle?” The team answered that they believed the problem was largely social. Developers who had written Rails applications wanted to deploy them as quickly as possible. They researched, learned about Capistrano, Nginx, Mongrel, etc., and made it work. The Phusion team said that the people who were smart enough to tackle this problem were complacent and chose to deploy applications in this (painful) way.

There is nothing technically preventing [ease of rails deployment]. We’ve shown it’s possible. Why it hasn’t been done is a social or political problem. There is no technical things stopping you. -Hongli Lai (Phusion)

Summary

Time will tell us if Apache/Passenger will live up to the hype and become the new standard. I, for one, am hoping that it really does take hold. If Rails deployment becomes as common and easy as PHP deployment we can spend time solving more interesting problems and that will be a good thing.

RailsConf08: Engine Yard on Rails Deployment Issues

Yesterday I sat in on a session on Rails deployment, headed up by the guys from Engine Yard. The idea was to discuss deployment problems but it turned into general deployment tips. If anyone knows about deploying Rails it is these guys, and they have some fantastic ideas. I took away some interesting things that I’d like to share with you.

Server choice

One of the issues discussed was the choice of which Rails server to run your application on.

  • ebb: ebb is extremely fast, but probably not production ready today
  • thin: thin can be good for requests that are completed quickly. Because thin is event-driven it doesn’t work as well for longer running requests
  • mongrel

Tom pointed out one important fact to consider when choosing between these. He said:

They can all respond to requests faster than your application can generate. There are way more important things to spend time on. -Tom Mornini, Engine Yard

The improvements you gain by switching between these are often insignificant when compared to your use of caching, limiting disk I/O in your apps, and controlling your overall application architecture.

Mongrel is, obviously, going to be most people’s first choice, because it’s great for general-purpose use. But when using mongrel, a common question is “how many mongrel processes should I be running?” Tom said that “you can burn out a modern CPU with 3 mongrels” and there is no reason to run more than 3 mongrel processes per core; any beyond that are generally wasted.

The guys at Engine Yard love nginx. They said they’ve had no problems with it. Tom said that in internal tests against static files they’ve seen nginx serve 40 megabytes per second of static images and not show up in top.

Misc other tips

Static Files: Static files don’t have to be local. They can be shared across the entire cluster with a clustered file system. They use RedHat GFS for static storage and it is convenient because multiple machines can read the same filesystem. “If you can avoid NFS, do… NFS was really, really cool in 1979.”

Static resource domains: Browsers limit the number of concurrent requests per domain. At Engine Yard they have had success in improving load times by creating domain name aliases that often point to the same physical machine. For example, images1.domain.com, images2.domain.com, etc. can all point to the exact same machine and exact same IP address, but the browser is tricked into loading them concurrently because the domain names are different. They have seen significant improvement in load times by using this technique on pages that need to load a lot of files.
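
Here is a rough Ruby sketch of how an application might spread asset URLs across such aliases. It is my own illustration, not Engine Yard’s setup: the `asset_url` helper is hypothetical, the four-host list just extends the images1/images2 example, and choosing the alias with a CRC32 of the path is an assumption on my part, made so that a given file always gets the same URL and stays cacheable.

```ruby
require "zlib"

# Hypothetical alias hosts; in the setup described above they would all
# resolve to the same machine and IP address.
ASSET_HOSTS = %w[
  images1.domain.com
  images2.domain.com
  images3.domain.com
  images4.domain.com
].freeze

# Map an asset path to one of the aliases. CRC32 is stable across processes
# and restarts, so the same path always produces the same URL.
def asset_url(path, hosts = ASSET_HOSTS)
  host = hosts[Zlib.crc32(path) % hosts.size]
  "http://#{host}#{path}"
end
```

Picking a host at random on every render would also spread the load, but the same image would then show up under several URLs and the browser would download it more than once.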

Virtualization: They use (the free, open-source version of) Xen and love it. Nearly everything at Engine Yard is virtualized. Because of the way Xen works they said they take very little performance hit when using virtualization. One tip they gave was that it is not always good for each service to be on a separate virtual machine. They said that, by default, every slice (VM) at Engine Yard has nginx, 3 mongrels, and memcached. They group the services and find that this often works well.

After the session I chatted with the guys. I told them that I spent a few weeks with the free version of Xen and found it very complicated to work with. They said that it took them nine months to perfect their use of Xen. I’m glad to find out that it wasn’t just me. However, it does inspire me to give it a second chance.

mod_rails and passenger: When asked about the new mod_rails they said that they are much more interested in rubinius and mod_rubinius. More posts on both rubinius and passenger to come.
