Category: Computer science


Is bioinformatics possible?

February 21st, 2016 — 5:43pm

I recently gave a talk at the Bio-Pitch event at the French-Japanese institute. I was fortunate to be able to speak about some of the ideas I’ve been developing here among so many interesting projects (MetaPhorest, HTGAA, Yoko Shimizu, Tupac Bio, Bento Lab etc).

The topic of my talk was “Is bioinformatics possible?” A deliberate provocation, since of course many people, including myself, work with this every day. I simply mean to suggest that there are intrinsic problems in the field that are not usually discussed or thought about, and that it might be valuable to confront those problems.

The slides are available, if anyone is interested.

The bigger topic that is hinted at, but not discussed, might be the instrumental relationship of humans to nature. I hope to return to this problem soon.

1 comment » | Bioinformatics, Computer science, Philosophy

Reactive software and the outer world

February 12th, 2016 — 11:13am

At Scala Matsuri a few weeks ago (incidentally, an excellent conference), I was fortunate to be able to attend Jonas Bonér’s impassioned talk about resilience and reactive software. His theme: “without resilience, nothing else matters”.

At the core of it is a certain way of thinking about the ways that complex systems fail. Importantly, complex systems are not the same as complicated systems, although in everyday speech we tend to confuse the two. Perhaps a related or even identical question is: how do composite systems fail?

Using terminology that originates with the Erlang language, Bonér talked about the “error kernel”, which is the part of a software system that must never fail, no matter what. As long as this innermost part stays alive, other parts are allowed to fail. There are mechanisms to replace, restart or route around failures in the outer parts.

This style of design leads to a well-structured failure and supervision hierarchy. Maybe this style of thinking is itself the most important contribution. In most software systems being designed today, the possibility of errors or failures is often a second-class citizen, swept under the carpet, and certainly not part of a carefully considered structure of possibilities of failure. What if this structure becomes a primary concern?

Once errors are well structured and organised in a hierarchy, it also becomes easy to decide what to do when errors occur. The hierarchy structure clearly indicates which parts of a system have become defunct and need to be replaced or bypassed. Recoverability – being able to crash safely – at every level takes the software system a little bit closer, it seems, to biological systems.
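The supervision idea can be sketched without any particular framework. Below is a toy Python sketch (not Akka; all names are my own invention) in which the supervisor plays the role of the error kernel: it never crashes itself, and it restarts a worker that fails against an unreliable outer resource.

```python
class FlakyResource:
    """Stands in for the unreliable outer world: the first two
    uses fail, later ones succeed."""
    def __init__(self, failures=2):
        self.failures = failures

    def use(self, message):
        if self.failures > 0:
            self.failures -= 1
            raise RuntimeError("resource unavailable")
        return message.upper()


class Worker:
    """An outer part of the system that is allowed to fail."""
    def __init__(self, resource):
        self.resource = resource

    def handle(self, message):
        return self.resource.use(message)


class Supervisor:
    """The 'error kernel': it must never fail itself, so it only
    creates workers and decides what to do when one crashes."""
    def __init__(self, resource, max_restarts=5):
        self.resource = resource
        self.max_restarts = max_restarts
        self.worker = Worker(resource)

    def handle(self, message):
        for _ in range(self.max_restarts + 1):
            try:
                return self.worker.handle(message)
            except RuntimeError:
                self.worker = Worker(self.resource)  # restart the failed part
        raise RuntimeError("escalate: too many failures")


sup = Supervisor(FlakyResource())
print(sup.handle("hello"))  # prints HELLO after two restarts
```

In a real actor system the restart policy (how many retries, over what time window, when to escalate upward) is declared on the supervisor; the bounded loop above merely stands in for such a policy.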

Biological systems, Bonér pointed out, usually operate with some degree of inherent failure, be it disease, weakness, mutations or environmental stress. Perfect functioning is not typical, and it seems to me that for most organisms such a state may not even exist.

Recoverability at every level, resilience, and error hierarchies – “let it crash” – make for a truly significant and very humble way of thinking about software. It means that as the developer, I acknowledge that the software I am writing does not control the universe (although as a developer I often fall prey to that illusion). The active principle, the “prime mover”, is somewhere outside the scope that I control. When it produces some unforeseen circumstance, the software must respond properly. Reactive software, to me, seems to quietly acknowledge this order of things.

I have only had a very brief opportunity to try out Akka, Typesafe’s actor framework, in my projects so far, but I felt inspired by Bonér’s talk and hope to use it more extensively in the future.

Comment » | Bioinformatics, Computer science, Philosophy, Software development

Mysteries of the scientific method

November 7th, 2015 — 10:48am

The scientific method can be understood as the following steps: formulating a hypothesis, designing an experiment, carrying out experiments, and drawing conclusions. Conclusions can feed into hypothesis formulation again, in order for a different (related or unrelated) hypothesis to be tested, and we have a cycle. This feedback can also take place via a general theory that conclusions contribute to and hypotheses draw from. The theory comes to represent everything we have learned about the domain so far. Some of the steps may be expanded into sub-steps, but in principle this cycle is how we generally think of science.

This looks quite simple, but is it really? Let’s think about hypothesis formulation and drawing conclusions. In both of these steps, the results are bounded by our imagination and intuition. Thus, something that never enters anybody’s imagination will not be established as scientific fact. In view of this, we should hope that scientists do have vivid imaginations. It is easy to imagine that there might be very powerful findings out there, on the other side of our current scientific horizon, that nobody has yet been creative enough to speculate about. It is not at all obvious that we can see the low-hanging fruit or even survey this mountainous landscape well – particularly in an age of hyper-specialisation.

But scientists’ imaginations are probably quite vivid in many cases – thankfully. Ideas come to scientists from somewhere, and some ideas persist more strongly than others. Some ideas seduce scientists to years of hard labour, even when the results are meagre at first. Clearly this intuition and sense that something is worth investigating is absolutely crucial to high quality results.

A hypothesis might be: there is a force that makes bodies with mass attract one another, with a strength inversely proportional to the square of the distance between them. To formulate this hypothesis we need concepts such as force, bodies, mass, distance and attraction. Even though the hypothesis might be formulated in mere words, these words all depend on experience and practices – and thus on equipment (even if the equipment used in some cases is simply our own bodies). If this hypothesis is successfully confirmed, a new concept becomes available: the law of gravity. This concept in turn may be incorporated into new hypotheses and experiments, paving the way for ever higher and more complex levels of science and scientific phenomena.
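Once formulated, the hypothesis can be computed with directly. A minimal sketch in Python (the masses and distances are arbitrary illustrative values; G is the measured gravitational constant):

```python
G = 6.674e-11  # gravitational constant, N·m²/kg²

def gravitational_force(m1, m2, r):
    """Newton's hypothesis: attraction proportional to both masses,
    inversely proportional to the squared distance between them."""
    return G * m1 * m2 / r**2

# Doubling the distance quarters the force:
f_near = gravitational_force(5.0, 10.0, 2.0)
f_far = gravitational_force(5.0, 10.0, 4.0)
print(f_near / f_far)  # 4.0
```

Note how the three lines of code already presuppose the whole conceptual apparatus of the paragraph above: mass, distance and force all appear as parameters before a single value is computed.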

Our ability to form hypotheses, construct equipment and draw conclusions seems to be a human capacity that is not easy to automate.

Entities such as matter, energy, atoms and electrons become accessible – I submit – primarily through the concepts and equipment that give access to them. In a world with an alternate history different from ours, it is conceivable that entirely different concepts and ideas would explain the same phenomena that are explained by our physics. For science to advance, new equipment and new concepts need to be constructed continually. This process is almost itself an organic growth.

Can we have automated science? Do we no longer need scientific theory? (!?) Can computers one day carry out our science for us? Only if either: a) science is not an essentially human activity, or b) computers become able to take on this human essence, including the responsibility for growing the conceptual-equipmental boundary. Data mining in the age of “big data” is not enough, since this (as far as I know) operates with a fixed equipmental boundary. As such, it would only be a scientific aid and not a substitute for the whole process. Can findings that do not result in concepts and theories ever be called scientific?

If computer systems ever start designing and building new I/O-devices for themselves, maybe something in the way of “artificial science” could be achieved. But it is not clear that the intuition guiding such a system could be equivalent to the human intuition that guides science. It might proceed on a different path altogether.

1 comment » | Bioinformatics, Computer science, Philosophy

Historical noise? Simulation and essential/accidental history

June 24th, 2015 — 4:58pm

Scientists and engineers around the world are, with varying degrees of success, racing to replicate biology and intelligence in computers. Computational biology is already simulating the nervous systems of entire organisms. Each year, artificial intelligence seems able to replicate more tasks formerly thought to be the sole preserve of man. Many of the results are stunning. All of this is done on digital circuits and/or Turing-Church computers (two terms that for my purposes here are interchangeable — we could also call it symbol manipulation). Expectations are clearly quite high.

What should we realistically hope for? How far can these advances actually go? If they do not culminate in “actual” artificial biology (AB) and artificial intelligence (AI), then what will they end in – what logical conclusion will they reach, what kind of wall would they run up against? What expectations do we have of “actual” AB and AI?

These are extremely challenging questions. When thinking about them, we ought always to keep in mind that minds and biology are both, as far as science knows, open-ended systems, open worlds. This is in the sense that we do not know all existing facts about them (unlike classical mechanics or integer arithmetic, which we can reduce to sets of rules). For all intents and purposes, given good enough equipment, we could make an indefinite number of observations and data recordings from any cell or mind. Conversely, we cannot construct a cell or a mind from scratch out of pure chemical compounds. Even given godlike powers in a perfectly controlled space, we wouldn’t know what to do. We cannot record in full detail the state of a (single!) cell or a mind, we cannot make perfect copies, and we cannot configure the state of a cell or mind with full precision. This is in stark contrast to digital computation, where we can always make an indefinite number of perfect copies, and where we know the lower bound of all relevant state – we know the smallest detail that matters. We know that there’s no perceivable high-level difference between having a potential difference of 5.03 volts and 5.04 volts in our transistors on the lowest level.
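The last point can be made concrete: digital hardware reads a continuous voltage but keeps only a bit, so small analog differences vanish by design. A toy sketch (the threshold value here is an arbitrary illustrative choice):

```python
def to_bit(voltage, threshold=2.5):
    """Quantise a continuous voltage into a logic level.
    Every voltage above the threshold is the same '1'."""
    return 1 if voltage > threshold else 0

# 5.03 V and 5.04 V are indistinguishable at the digital level:
print(to_bit(5.03), to_bit(5.04))  # 1 1

# ...and once quantised, the bit pattern can be copied perfectly:
word = [to_bit(v) for v in (5.03, 0.1, 4.9, 0.2)]
copy = list(word)
print(copy == word)  # True
```

This thresholding is exactly what "screens out" the analog noise of the outer world, and it is also what we do not know how to do for cells or minds, where no such lower bound of relevant state is known.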

(Quantum theory holds that ultimately, energy can only exist in discrete states. It seems that one consequence would be that a given volume of matter can only represent a finite amount of information. For practical purposes this does not affect our argument here, since measurement and manipulation instruments in science are very far from being accurate and effective at a quantum level. It may certainly affect our argument in theory, but who says that we will not some day discover a deeper level that can hold more information?)

In other words, we know the necessary and sufficient substrate (theoretical and hardware basis) for digital computation, but we know of no such substrate for minds or cells. Furthermore, there are reasons to think that any such substrate would lie much deeper, and at a much smaller scale, than we tend to believe. We repeatedly discover new and unexpected functions of proteins and DNA. Junk DNA, a name that has more than a hint of hubris to it, was later found to have certain crucial functions – not exactly junk, in other words.

Attempts at creating artificial minds and/or artificial biology are attempts at creating detached versions of the original phenomena. They would exist inside containers, independently of time and entropy, as long as sufficient electrical charge or storage integrity is maintained. Their ability to affect the rest of the universe, and to be affected by it, would be very strictly limited (though not nonexistent – for example, memory errors may occur in a computer as a result of electromagnetic interference from the outside). We may call such simulations unrooted or perhaps hovering. This is the quality that allows digital circuits to preserve information reliably. Interference and noise are screened out, removed.

In attempting to answer the questions posed above, we should think about two alternative scenarios, then.

Scenario 1. It is possible to find a sufficient substrate for biology and/or minds. Beneath a certain level, no further microscopic detail is necessary in the model to replicate the full range of phenomena. Biology and minds are then reduced to a kind of software; a finite amount of information, an arrangement of matter. No doubt such a case would be comforting to many of the logical positivists at large today. But it would also have many strange consequences.

Each of us as a living organism, society around us, and every entity has a history that stretches back indefinitely far. The history of cells needs a long pre-history and evolution of large molecules to begin. A substrate, in the above sense, exists and can be practically used if and only if large parts of history are dispensable. If we could create a perfect artificial cell on some substrate (in software, say) in a relatively short time span, say an hour, or, why not, less than a year, then it means that nature took an unnecessarily long way to get to its goal. (Luckily, efficient, rational, enlightened humans have now come along and found a way to cut out all that waste!) Our shorter way to the goal would then be something that cuts out all the accidental features of history, leaving only the essential parts in place. So the practically usable substrate, which allows for shortcuts in time, seems to imply a division between essential and accidental history of the thing we wish to simulate! (I say “practically” usable, since an impractical alternative is a working substrate that requires as much time as natural history in the “real” world. In this scenario, getting to the first cell on the substrate takes as long as it did in reality starting from, say, the beginning of the universe. Not a practical scenario, but an interesting thought experiment.) Note that if we were able to somehow run time faster in the simulation than in reality, it would also mean that parts of history (outside the simulation) are dispensable: some time would have been wasted on unnecessary processes.

Scenario 2. Such a substrate does not exist. If no history is accidental, if the roundabout historical process taken by the universe to reach the goal of, say, the first cell or first mind, is actually the only way that such things can be attained, then this scenario would be implied. This scenario is just as astounding as the first, since it implies that each of us depends fully on all of the history and circumstances that led up to this moment.

In deciding which of the two scenarios is more plausible, we should note that both biology and minds seem to be mechanisms for recording history in tremendous detail. Recording ability gives them advantages. This, I think, speaks in favour of the second scenario. The “junk DNA” problem becomes transposed to history itself (of matter, of nature, of societies, of the universe). Is there such a thing as junk history, events that are mere noise?

In writing the above, my aim has not been to discourage any existing work or research. But the two possibilities above must be considered, and they could point the way to the most worthwhile research goals for AI and AB. If the substrates can be found, then all is “well”, and we would need to truly grapple with the fact that we ourselves are mere patterns/arrangements of building blocks, mere software, body and mind. If the substrates cannot be found, as I am inclined to think, then perhaps we should begin to think about completely new kinds of computation, which could somehow incorporate the parts that are missing from mere symbol manipulation. We should also consider much more seriously how closed-world systems, such as the world of digital information, can coexist harmoniously with open-world systems, such as biology and minds. It seems that these problems are scarcely given any thought today.

4 comments » | Bioinformatics, Computer science, Philosophy

The bounded infinity of language

August 9th, 2014 — 5:48pm

Works of art, including film, painting, sculpture, literature and poetry, have a seemingly inexhaustible quality. As we keep confronting them, renewing our relationship with them over time, we continually extract more meaning from them. Some works truly appear to be bottomless. Reaching the bottom easily is, of course, a sure sign that a work will not have much lasting value.

Out of the forms listed above, (written) poetry and literature have the particular property that they are crafted out of a demonstrably finite medium: text. A finite alphabet, finite vocabulary, and a finite number of pages. As long as one disregards the effect of details such as paper quality, typography and binding, perfect copies can be made; the text can indeed be transcribed in its entirety without information loss. Somehow, reading Goethe on a Kindle is an experience that still holds power, although he presumably never intended his books to be read on Kindles (and some might argue that reading him in this way is ignoble).
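That a text survives transcription without loss is easy to demonstrate: a faithful copy is character-for-character identical, which a checksum makes visible. A small sketch (the quoted line is the opening of Goethe’s “Erlkönig”):

```python
import hashlib

original = "Wer reitet so spät durch Nacht und Wind?"
transcription = str(original)  # a transcription: a perfect copy of the finite medium

# The copy carries exactly the same information as the original:
print(transcription == original)  # True

same_digest = (hashlib.sha256(original.encode("utf-8")).hexdigest()
               == hashlib.sha256(transcription.encode("utf-8")).hexdigest())
print(same_digest)  # True
```

Whatever inexhaustible quality the line has, it plainly does not live in any property the checksum fails to capture: the medium is finite and fully copyable.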

How is it then that the evocative power of something finite can seem boundless? This curious property is something we might call the poetic or metaphorical quality of a text. (Works of film, painting, sculpture and so on most likely also have this power, but it is trickier to demonstrate that they are grounded in a finite medium.) Through this mysterious evocative power, the elements that make up a work of art allow us to enter into an infinity that has been enclosed in a finite space. It may be argued that what is evoked comes as much from the reader as from the text, but this duality applies to all sensation.

With this in mind we turn, once again, to programming and formal “languages”. Terms in programming languages receive their meaning through a formal semantics that describes, mathematically, how the language is to be translated into an underlying, simpler language. This process takes place on a number of levels, and eventually the lowest underlying language is machinery. This grounds the power of a program to command electrons. But this is something different from the meaning of words in a natural language. The evocative power described above is clearly absent, and computer programs today do not transcend their essential finitude. With brute force, we could train ourselves to read source code metaphorically or poetically, but in most languages I know, this would result in strained, awkward and limited metaphors. (Perhaps mostly because programming languages to a large extent reference a world different from the human world.)
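The meaning-as-translation point can be illustrated with a toy interpreter: each term of a tiny expression language receives its meaning only by being mapped onto the simpler operations of a host language (here, Python arithmetic stands in for the machine):

```python
# Each "term" in the little language is a nested tuple; its meaning
# is fixed entirely by this translation into the host's operations.
def evaluate(expr):
    if isinstance(expr, (int, float)):      # literals mean themselves
        return expr
    op, left, right = expr
    a, b = evaluate(left), evaluate(right)  # meaning is compositional
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    raise ValueError(f"term without a semantics: {op!r}")

# (2 + 3) * 4: the term denotes 20, and nothing more.
print(evaluate(("mul", ("add", 2, 3), 4)))  # 20
```

The term denotes exactly one value and evokes nothing beyond it; this exhaustibility is precisely the contrast with the poetic language discussed above.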

Consider how this inability to transcend finitude impacts our ability to model a domain in a given programming language. With an already formal domain, such as finance or classical mechanics, the task is simple, since what needs to happen is mere translation. Other domains, such as biology, resist formalisation – and perhaps this is one of their essential properties. Here we would like to draw on the evocative, poetic, and metaphorical capacities of natural language – for the sake of program comprehension and perhaps also to support effective user interfaces – while also writing practical programs. But we have yet to invent a formal language that is both practical and evocative to the point that works of art could be created in it.

an ancient pond / a frog jumps in / the splash of water

(Bashou, 1686)

1 comment » | Computer science, Philosophy, Software development
