The minimal genome of Craig Venter’s Syn3.0

The J Craig Venter Institute has published a paper detailing the genome of their new Syn3.0 synthetic organism. The major accomplishment was to construct a viable cell with a synthetic, extremely small genome: only 473 genes and about 500 kbp.

Even though it is considered to be fully “synthetic”, this genome is not built from scratch. Instead, the starting point is the Mycoplasma mycoides bacterium, from which genes and regions are deleted to produce something that is much smaller, but still viable. This means that even this fully synthetic genome still contains regions and functionalities that are not fully understood. M. mycoides was also the basis for JCVI’s Syn1.0, which was produced in 2010, but the genome of Syn3.0 is the smallest so far – “smaller than that of any autonomously replicating cell found in nature”. Syn3.0 should be a very valuable starting point for developing an explicit understanding of the basic gene frameworks needed by any cell for its survival – the “operating system of the cell” in the words of the authors.

Since so many genes are still basically not understood, the authors could not rely entirely on logic and common sense when choosing which genes to remove. They used an approach that introduced random mutations into the starting organism, and then checked which mutations were viable and which were not. This allowed them to classify genes as essential, inessential or quasi-essential (!). The deletion of essential genes would cause the cell to simply die. The deletion of quasi-essential genes would not kill it, but would dramatically slow its replication rate, severely crippling it. The final Syn3.0 organism has a doubling time of about 3 hours.
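The three-way classification can be made concrete with a small Python sketch. It assumes we already have, for each disrupted gene, the mutant's growth rate relative to the unmodified organism. The gene names, the numbers and the 0.5 cutoff are all hypothetical illustrations, not values or criteria taken from the paper.

```python
from enum import Enum


class Essentiality(Enum):
    ESSENTIAL = "essential"
    QUASI_ESSENTIAL = "quasi-essential"
    INESSENTIAL = "inessential"


def classify(relative_growth: float, slow_threshold: float = 0.5) -> Essentiality:
    """Classify a gene by the growth of a mutant in which it is disrupted.

    relative_growth is the mutant's growth rate divided by that of the
    unmodified organism; 0.0 means the mutant is not viable.
    slow_threshold is an assumed cutoff for "severely crippled".
    """
    if relative_growth <= 0.0:
        return Essentiality.ESSENTIAL
    if relative_growth < slow_threshold:
        return Essentiality.QUASI_ESSENTIAL
    return Essentiality.INESSENTIAL


# Hypothetical disruption results: gene name -> relative growth rate.
knockouts = {"geneA": 0.0, "geneB": 0.2, "geneC": 0.95}
classes = {gene: classify(rate) for gene, rate in knockouts.items()}
```

Only the inessential genes are candidates for deletion here. Removing a quasi-essential gene keeps the cell alive but at the cost of a much slower doubling time.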

Some of the points I took away from this readable and interesting paper were:

Synthetic biology methods are starting to resemble software development methods. The authors describe a design-build-test (DBT) cycle that involves several nontrivial methods, such as in silico design, oligonucleotide synthesis, yeast cloning, insertion into the bacteria, testing, and then (perhaps) sequencing to go back to computers and figure out what went wrong or what went well. Thus, a feedback loop between the cells and the in silico design space is set up.

A very small genome needs a very tightly controlled environment to survive. The medium (nutrient solution) that Syn3.0 lives in apparently contains almost all the nutrients and raw materials it could possibly need from its environment. This means that many genes that would normally be useful for overcoming adverse conditions, perhaps for synthesising nutrients that are not available from the environment, are now redundant and can be removed. So when thinking about genome design, it seems we really have to think about how everything relates to a specific environment.

The mechanics of getting a synthetic genome into a living cell are still complex. A huge number of wet-lab (and, presumably, dry-lab) processes are still needed to get the genome from the computer into something viable in a cell culture. However, things are going much faster than in 2010, and it’s interesting to think about where this field might be in 2021.

 

Method and object. Horizons for technological biology

(This post is an attempt at elaborating the ideas I outlined in my talk at Bio-Pitch in February.)

The academic and investigative relationship to biology – our discourse about biology – is becoming increasingly technological. In fields such as bioinformatics and computational biology, the technological/instrumental relationship to nature is always at work, constructing deterministic models of phenomena. By using these models, we may repeatedly extract predictable results from nature. An example would be a cause-effect relationship like: exposing a cell to heat causes “heat shock proteins” to be transcribed and translated.

The implicit understanding in all of these cases is that nature can be turned into engineering. Total success, in this understanding, would amount to one or both of the following:

  1. Replacement/imitation as success. If we can replace the phenomenon under study with its model (concretely, a machine or a simulation), we have achieved success.
  2. Control as success. If we can consistently place the phenomenon under study in verifiable, fully defined states, we have achieved success. (Note that this ideal implies that we also possess perfect powers of observation, down to a hypothetical “lowest level”.)

These implicitly held ideals are not problematic as long as we acknowledge that they are mere ideals. They are very well suited as horizons for these fields to work under, since they stimulate the further development of scientific results. But if we forget that they are ideals and begin to think that they really can become realities, or if we prematurely think that biology really must be like engineering, we might be in trouble. Such a belief conflates the object of study with our relatedness to that object. It misunderstands the role of the equipment-based relationship. The model – and associated machines, software, formulae, et cetera – is equipment that constitutes our relatedness to the phenomena. It cannot be the phenomena themselves.

Closely related to the ideals of replacement and control is the widespread application of abstraction and equality in engineering-like fields (and their application to new fields that are presently being clad in the trappings of engineering, such as biology). Abstraction and equality – the notion that two entities, instances, moments, etc., are in some way the same – allow us to introduce an algebra, to reason in the general and not in specifics. And this is of course what computers do. It also means that two sequences of actions (laboratory protocols for example), although they are different sequences, or the same sequence but at different instances in time, can lead to the same result. Just as 3+1 and 2+2 both “equal” 4. In other words, history becomes irrelevant, the specific path taken no longer means very much. But it is not clear that this can ever truly be the case outside of an algebra, and that is what risks being forgotten.

We might call all this the emergence of technological biology, or technological nature, the conquest of biology by λόγος, et cetera. The principal danger seems to be the conflation of method with object, of abstraction with the specific. And here we see clearly how something apparently simple – studying RNA expression levels in the software package R, for example – opens up the deepest metaphysical abysses. One of the most important tasks right now, then, would be the development of a scientific and technological culture that keeps the benefits of the technological attitude without losing sight of a more basic non-technological relatedness. The path lies open…

Is bioinformatics possible?

I recently gave a talk at the Bio-Pitch event at the French-Japanese institute. I was fortunate to be able to speak about some of the ideas I’ve been developing here among so many interesting projects (MetaPhorest, HTGAA, Yoko Shimizu, Tupac Bio, Bento Lab etc).

The topic of my talk was “Is bioinformatics possible?” This was a deliberate provocation, since of course many people, including myself, work with bioinformatics every day. I simply mean to suggest that there are intrinsic problems in the field that are not usually discussed or thought about, and that it might be valuable to confront those problems.

The slides are available, if anyone is interested.

The bigger topic that is hinted at, but not discussed, might be the instrumental relationship of humans to nature. I hope to return to this problem soon.

Reactive software and the outer world

At Scala Matsuri a few weeks ago (incidentally, an excellent conference), I was fortunate to be able to attend Jonas Bonér’s impassioned talk about resilience and reactive software. His theme: “without resilience, nothing else matters”.

At the core of it is a certain way of thinking about the ways that complex systems fail. Importantly, complex systems are not the same as complicated systems, although in everyday speech we tend to confuse the two. Perhaps a related or even identical question is: how do composite systems fail?

Using a terminology that originates with the Erlang language, Bonér talked about the “error kernel”, which is the part of a software system that must never fail, no matter what. As long as this innermost part stays alive, other parts are allowed to fail. There are mechanisms to replace, restart or route around failures in the outer parts.

This style of design leads to a well-structured failure and supervision hierarchy. Maybe this style of thinking itself is the most important contribution. In most software systems being designed today, the possibility of errors or failures is a second-class citizen, swept under the carpet, and certainly not part of a carefully considered structure of possibilities of failure. What if this structure becomes a primary concern?

Once errors are well structured and organised in a hierarchy, it also becomes easy to decide what to do when errors occur. The hierarchy structure clearly indicates which parts of a system have become defunct and need to be replaced or bypassed. Recoverability – being able to crash safely – at every level takes the software system a little bit closer, it seems, to biological systems.
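The idea of an error kernel that supervises replaceable outer parts can be sketched in a few lines. This is a minimal illustration in plain Python, not Akka or Erlang code, and the `Worker` and `Supervisor` classes are hypothetical names invented for the example. The supervisor does no risky work itself; it only delegates, and replaces a worker that crashes.

```python
class Worker:
    """An outer component that is allowed to fail."""

    def __init__(self, name):
        self.name = name

    def handle(self, message):
        if message == "bad":
            raise ValueError(f"{self.name} cannot handle {message!r}")
        return f"{self.name} handled {message}"


class Supervisor:
    """A tiny error kernel: it delegates work to a worker and
    replaces that worker with a fresh one whenever it crashes."""

    def __init__(self, worker_factory):
        self.worker_factory = worker_factory
        self.worker = worker_factory()
        self.restarts = 0

    def send(self, message):
        try:
            return self.worker.handle(message)
        except Exception:
            # Let it fail: discard the broken worker and start a new one.
            self.worker = self.worker_factory()
            self.restarts += 1
            return None


supervisor = Supervisor(lambda: Worker("parser"))
results = [supervisor.send(m) for m in ["a", "bad", "b"]]
```

The failed message is lost, but the supervisor survives and the next message is handled by a fresh worker. In a real hierarchy, a supervisor that cannot cope would in turn fail and be handled by its own supervisor one level up.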

Biological systems, Bonér pointed out, usually operate with some degree of inherent failure, be it disease, weakness, mutations or environmental stress. Perfect functioning is not typical, and it seems to me that for most organisms such a state may not even exist.

Recoverability at every level, resilience, and error hierarchies – “let it fail” – together make up a truly significant and very humble way of thinking about software. It means that as the developer, I acknowledge that the software I am writing does not control the universe (although as a developer I often fall prey to that illusion). The active principle, the “prime mover”, is somewhere outside the scope that I control. When it produces some unforeseen circumstance, we must respond properly. Reactive software to me seems to quietly acknowledge this order of things.

I have only had a very brief opportunity to try out Akka, Typesafe’s actor framework, in my projects so far, but I felt inspired by Bonér’s talk and hope to use it more extensively in the future.

The inexhaustible wealth of appearance, information and specificity

When perceiving an object, for example a chair, the statement “this is X” (this is a chair) is almost entirely uninteresting. The concept by which we identify the object is a mere word, and in a sense entirely devoid of meaning.

That concept does help us align this object with other entities in space and time. It sets expectations about what has been done and what can be done to and with it, and it links the object to social practices. But none of these things are very interesting. After all, we understand quite well what society expects from chairs.

What is more interesting is all the other statements we could make about a particular chair, that is, all the qualities, information, phenomena and experiences that do not fit the general concept of a chair. Call this the chair’s particularity. It may be unusually sturdy or rickety. It may evoke a sense of sorrow or longing for a person who used to sit on it. It may make us think about economics. Its shape may even have something spiritual about it. It may, if it is a chair in an abandoned house, be decomposing. And even this is just scratching the surface.

In all likelihood, we are able to produce an unbounded number of interesting statements about this locus that is the chair. (Recall the famous school assignment about writing a story several hundred words long about the face of a coin.) And this would hold true both when we speak freely, metaphorically and poetically, and when we restrict ourselves to testable, scientific (in the modern sense) statements. New metaphors can always be invented, new scientific equipment may always be constructed. These additional modes of relatedness to the locus provide, perhaps, the basis for new statements.

How are we to understand this fundamental overflowing, this exuberant blossoming, the profound potential wealth that we draw upon and realise when we articulate statements about an entity such as this chair? It is not part of the concept “chair”. This concept is overlaid as an afterthought in order to make the surplus of impressions manageable and graspable. We are used to economising the use of our consciousness, dispensing it only sparingly, through the shielding, buffering and deflection that concepts afford us.

For Heidegger, being is the basis of intelligibility, a carrier of meaning. Language and intelligibility exist only on the basis of primordial being. He makes it his task to inquire as to what this being is.

For Georges Bataille, all activity that involves redistribution of energy, human and otherwise, accumulates a surplus that necessarily must be released in some way.

Myths and archetypes repeat themselves throughout history and society, in constantly renewed forms which are both always the same and always made from different specific constituent parts. They can always be repeated in a different way. The hero myth exists in every culture (see for example Jung or Campbell). Conversely, this myth in all its specific detail is always different each time it appears.

In Difference and Repetition, Deleuze argues that conceptual machinery is constantly at work, extracting difference from whatever the underlying basis is.

Genetic material successfully reproduces and preserves itself, and perhaps prospers, only through the continual introduction of difference and variation at an appropriate rate.

The digital world, on the other hand, denies the possibility of generating an unbounded number of statements from some entity (such as a record in a database). In fact, its essence is the possibility of perfect copying, which happens only when the information being carried is strictly circumscribed and limited.

All these concepts, it seems, have something in common – the interaction between a specific form and the possibility of an infinite number of variations of and departures from that form.