Toward general computing with DNA

Erik Winfree, faculty candidate, Caltech

High order bits: Model a TM as a tiling problem; make the "tiles" be different type of DNA structures; use DNA self-assembly techniques to see if the machine halts (if the tiling succeeds, ie the DNA self-assembles).  Use atomic force microscopy to examine resulting assemblies.  2D lattice self-assembly still in early stages experimentally.  Hypothesis: different types of DNA structures allow the expression of problems at different levels of expressiveness (regular languages, CFG, general recursive languages).

What can you do with DNA?

An "enzyme Turing machine" (Bennett, 1982): Polymerase inserts copy of WC complement to bind to a strand.  A simple TM that copies and requires energy roughly proportional to reaction speed.  Can arbitrary fns be computed?  Hypothetical strategy: to make the transition table, get groups of bases that recognize the group representing the head of the TM and rewrite it.  If th TM computes a reversible fcuntion...

Joyce 89 and others: to find molecule M that binds w/target T, just generate large library of random molecules, and let the chemicals do their work.  Discard those that "bind nonspecifically" or don't bind.  Brute force, highly parallel.

Extensions of Adleman's technique have been used to solve experimentally small examples of circuit-sat, formula-sat, Hamiltonian, max-clique.  For about $100K you can get a Beckman Instruments "DNA workstation" setup.  Optimistic estimates suggest it would still take longer

Hagiya & Yokoyama 1997 showed ("whiplash PCR") that you can add an additional group that inhibits further copying; you can use this to implement "goto".  Suppose you can represent f(x,y,z) and x,y,z discretely on a strand: as long as formula accesses x,y,z exactly once, this technique can b eused to code this.  You can then create a combinatorial library.  These are called "mu-formulas".  Find a mu-formula that agrees with the example data for N points (one at a time, keeping only consistent results each time), and in O(N) stepes you'll have a mu-formula that satisfies all of them.

Other DNA structures (hairpins, crossover double-strands, three-arm junctures) result in different kinds of libraries thru self-assembly!

Theorems (Winfree 96): Simple Adleman strands yield regular languages; three-arm junctures yield context free languages; double crossovers yield general recursive languages (via assembling in 2D lattice).

Context: Tiling problem; given a set of shapes, can you tile the plane?  Problem is undecidable!  Proof: by reduction to the Halting Problem; TM halts <--> no tile can be added, so plane cannot be tiled.  Interesting part: the tiles can be used to simulate a computation performed by a TM.  What if the "tiles" can be made from DNA?

Example: Pascal's triangle mod 2 is the Sierpinski triangle: the (n,k) entry is ((n choose k) mod 2.  Mod N: if you can compute any entry from its 2 predecessors, you can make an arbitrary CA.