Why Do Cells Have DNA?
In order for a system to reproduce itself, it seems necessary for it to hold an encoded form of itself somehow. This idea, and the inevitability of the existence of DNA-like structures within living cells, is well-illustrated using computer code.
We start with the problem: write a Ruby script that can print itself out, without it reading its own file.
The restriction “without it reading its own file” is important; otherwise a trivial solution would be to have the script open its own source file and print the contents.
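A one-line sketch of such a trivial solution, assuming the script is run from a file on disk (`__FILE__` is Ruby's built-in name for the path of the current source file):

```ruby
# Cheating version: read this very file from disk and print its contents.
puts File.read(__FILE__)
```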
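Starting naively, without thinking about it too much, we may try to put the script's text in a string and print it, only to find that the print statement itself must then appear inside that string as well.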
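One way such a naive attempt might go, as an illustrative sketch rather than the exact attempt meant here: wrap the program's text in a `puts`. The hypothetical helper `wrap` below builds a program that prints a given source, which shows why this approach never closes the loop.

```ruby
# Build a one-line program that prints `source` when run.
# (`wrap` is a hypothetical helper name used only for this illustration.)
def wrap(source)
  "puts #{source.inspect}"
end

attempt = 'puts ""'
3.times do
  attempt = wrap(attempt)
  # Each new attempt prints the previous attempt, not itself,
  # and is always longer than the text it prints.
  puts attempt
end
```

Every layer of wrapping produces a program that is one step behind its own source, so the chase never ends.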
We soon find ourselves in an infinite regress that, we quickly realize, has no escape. It seems we have to be a little more clever.
Before getting to the solution, the interested reader is urged to stop reading at this point and try to come up with a solution independently.
Here’s one way to accomplish this. Create the following script, which is just the first step toward the final script:
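The first-step script can be recovered by CGI-decoding the dna string quoted in the Discussion below. Decoded, it reads as follows; note that DNA is still only a literal placeholder at this stage, so running this version on its own prints just that placeholder rather than the script.

```ruby
#!/usr/bin/env ruby

require 'cgi'

# 'DNA' is a placeholder; it is replaced by the encoded script later on.
dna = 'DNA'
# Decode the CGI-encoded string back into plain Ruby source.
cell = CGI.unescape dna
# Substitute the encoded string for the first placeholder and print.
puts cell.sub( 'DNA', dna )
```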
Save the above file as “pregodel.rb”. Next create this helper script:
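The helper only needs to CGI-escape whatever arrives on standard input. A minimal sketch, assuming nothing beyond Ruby's standard cgi library (the method name `godelize` is an illustrative choice and not necessarily what the original helper used):

```ruby
#!/usr/bin/env ruby

require 'cgi'

# CGI-escape a script's source so it can sit inside a single-quoted string:
# quotes, newlines and most punctuation become %XX codes, spaces become "+".
def godelize(source)
  CGI.escape(source)
end

# Read the whole script from standard input and print its encoded form.
print godelize($stdin.read)
```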
Save this as “godelize.rb”, and run it with pregodel.rb as its standard input:
$ ruby godelize.rb < pregodel.rb
This outputs a CGI-encoded string whose initial characters are '%23%21%2Fusr%2Fbin%2Fenv ...'. Finally, copy pregodel.rb to godel.rb and edit godel.rb: replace the first instance of DNA with the output of godelize.rb, and save it.
godel.rb should now be:
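Applying that substitution to the pregodel.rb source gives the listing below. The long literal is the same dna string quoted in the Discussion; the comments of the earlier sketches are left out, since any extra text in the file would also have to be encoded in the string.

```ruby
#!/usr/bin/env ruby

require 'cgi'

dna = '%23%21%2Fusr%2Fbin%2Fenv+ruby%0A%0Arequire+%27cgi%27%0A%0Adna+%3D+%27DNA%27%0Acell+%3D+CGI.unescape+dna%0Aputs+cell.sub%28+%27DNA%27%2C+dna+%29%0A'
cell = CGI.unescape dna
puts cell.sub( 'DNA', dna )
```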
When run, the output is identical to the file contents. A conclusive test is comparing md5 sums:
$ cat godel.rb | md5
4b92a6303568cde3a98d042a67616d2d
$ ruby godel.rb | md5
4b92a6303568cde3a98d042a67616d2d
(This was run on a Mac; many Linux distros have md5sum instead.)
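The same comparison can be made from Ruby itself, which sidesteps the md5 versus md5sum difference. The sketch below rebuilds godel.rb in memory from the template, runs it in a child Ruby process, and compares MD5 digests of the source and the output. The method name `self_reproducing?` and the use of a temporary file are assumptions made for illustration.

```ruby
require 'cgi'
require 'digest/md5'
require 'open3'
require 'rbconfig'
require 'tempfile'

# Template with the literal placeholder DNA (the pregodel.rb stage).
pregodel = <<~'RUBY'
  #!/usr/bin/env ruby

  require 'cgi'

  dna = 'DNA'
  cell = CGI.unescape dna
  puts cell.sub( 'DNA', dna )
RUBY

# Replace the first DNA with the encoded template (the godelize step).
godel = pregodel.sub('DNA', CGI.escape(pregodel))

# Run `source` as a script in a child Ruby process and report whether
# its output has the same MD5 digest as the source itself.
def self_reproducing?(source)
  Tempfile.create(['godel', '.rb']) do |file|
    file.write(source)
    file.flush
    output, _status = Open3.capture2(RbConfig.ruby, file.path)
    Digest::MD5.hexdigest(output) == Digest::MD5.hexdigest(source)
  end
end

puts "pregodel.rb reproduces itself: #{self_reproducing?(pregodel)}"
puts "godel.rb reproduces itself:    #{self_reproducing?(godel)}"
```

Checking the template as well as the finished script makes the point that it is the substitution step, not the template alone, that produces a self-replicating file.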
Discussion
So what does this have to do with DNA in cells? The godel.rb script is about as simple a self-replicating Ruby script as one can write (again, without reading the file itself). Its function and structure are surprisingly similar to how a biological cell works. The entire script is stored in the dna variable as a CGI-encoded string, just as a cell is encoded in the deoxyribonucleic acid in its nucleus:
dna = '%23%21%2Fusr%2Fbin%2Fenv+ruby%0A%0Arequire+%27cgi%27%0A%0Adna+%3D+%27DNA%27%0Acell+%3D+CGI.unescape+dna%0Aputs+cell.sub%28+%27DNA%27%2C+dna+%29%0A'
The first step in the script's self-replication process is to unescape itself from a CGI-encoded string into Ruby code. This is analogous to a cell's ribosomes performing their role in translating DNA into protein. (The intermediate step of RNA transcription is omitted for simplicity; strictly, ribosomes translate the RNA transcript rather than the DNA itself.) Protein is encoded as sequences of nucleotides in DNA, just as Ruby code is encoded as CGI-encoded character sequences.
Finally, a cell's DNA must be copied when the cell divides, and the copy inserted into the new cell. The analog of this is the last line of the script:
puts cell.sub( 'DNA', dna )
Though the dna string holds a representation of the whole script, observe how it avoids the infinite regress. At the place where the string would have to contain itself, it does not store itself again (which would lead to an infinitely long string); instead it uses a symbol for itself, namely DNA. This acts as a placeholder marking where the value of the dna variable is substituted when the string is printed out.
For further reading, check out the book Gödel, Escher, Bach by Douglas Hofstadter, which applies the above ideas to not only self-replicating cells, but to the nature of mathematical truth, the art of Escher, the music of Bach, and human consciousness, among other related topics.