Tag: scientific software


Method and object. Horizons for technological biology

March 22nd, 2016 — 10:32pm

(This post is an attempt at elaborating the ideas I outlined in my talk at Bio-pitch in February.)

The academic and investigative relationship to biology – our discourse about biology – is becoming increasingly technological. In fields such as bioinformatics and computational biology, the technological/instrumental relationship to nature is always at work, constructing deterministic models of phenomena. By using these models, we may repeatedly extract predictable results from nature. An example would be a cause-effect relationship like: exposing a cell to heat causes “heat shock proteins” to be transcribed and translated.

The implicit understanding in all of these cases is that nature can be turned into engineering. Total success, in this understanding, would amount to one or both of the following:

  1. Replacement/imitation as success. If we can replace the phenomena under study with their model (concretely, a machine or a simulation), we have achieved success.
  2. Control as success. If we can consistently place the phenomena under study in verifiable, fully defined states, we have achieved success. (Note that this ideal implies that we also possess perfect powers of observation, down to a hypothetical “lowest level”).

These implicitly held ideals are not problematic as long as we acknowledge that they are mere ideals. They are very well suited as horizons for these fields to work under, since they stimulate the further development of scientific results. But if we forget that they are ideals and begin to think that they really can become realities, or if we prematurely think that biology really must be like engineering, we might be in trouble. Such a belief conflates the object of study with our relatedness to that object. It misunderstands the role of the equipment-based relationship. The model – and associated machines, software, formulae, et cetera – is equipment that constitutes our relatedness to the phenomena. It cannot be the phenomena themselves.

Closely related to the ideals of replacement and control is the widespread application of abstraction and equality in engineering-like fields (and their application to new fields that are presently being clad in the trappings of engineering, such as biology). Abstraction and equality – the notion that two entities, instances, moments, etc., are in some way the same – allow us to introduce an algebra, to reason in general terms and not in specifics. And this is of course what computers do. It also means that two sequences of actions (laboratory protocols, for example), although they are different sequences, or the same sequence but at different instances in time, can lead to the same result. Just as 3+1 and 2+2 both “equal” 4. In other words, history becomes irrelevant, the specific path taken no longer means very much. But it is not clear that this can ever truly be the case outside of an algebra, and that is what risks being forgotten.
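
To make this concrete, consider a small sketch in Python (the protocol steps and names are invented purely for illustration): two different sequences of actions end in states that the model's notion of equality cannot tell apart.

    # Two toy "protocols" with invented steps: different orderings, but the model's
    # notion of equality (dictionary comparison) cannot tell the results apart.
    def protocol_a(sample):
        sample = dict(sample, heated=True)
        sample = dict(sample, stained=True)
        return sample

    def protocol_b(sample):
        sample = dict(sample, stained=True)   # same steps, opposite order
        sample = dict(sample, heated=True)
        return sample

    start = {"id": "cell-1", "heated": False, "stained": False}
    print(protocol_a(start) == protocol_b(start))  # True: the path taken is forgotten

Within the abstraction, the two histories are simply the same result; outside of it, they remain distinct.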

We might call all this the emergence of technological biology, or technological nature, the conquest of biology by λόγος, et cetera. The principal danger seems to be the conflation of method with object, of abstraction with the specific. And here we see clearly how something apparently simple – studying RNA expression levels in the software package R, for example – opens up the deepest metaphysical abysses. One of the most important tasks right now, then, would be the development of a scientific and technological culture that keeps the benefits of the technological attitude without losing sight of a more basic non-technological relatedness. The path lies open…

Comment » | Bioinformatics, Computer science, Philosophy

Mysteries of the scientific method

November 7th, 2015 — 10:48am

The scientific method can be understood as the following steps: formulating a hypothesis, designing an experiment, carrying out experiments, and drawing conclusions. Conclusions can feed into hypothesis formulation again, in order for a different (related or unrelated) hypothesis to be tested, and we have a cycle. This feedback can also take place via a general theory that conclusions contribute to and hypotheses draw from. The theory gets to represent everything we have learned about the domain so far. Some of the steps may be expanded into sub-steps, but in principle this cycle is how we generally think of science.
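
Rendered schematically – the step functions below are crude stand-ins of my own, not a claim about how research actually proceeds – the cycle looks something like this:

    # A schematic of the cycle: hypothesis -> experiment -> conclusions -> theory.
    # The individual steps are placeholders; the point is only the feedback loop.
    import random

    def formulate_hypothesis(theory):
        return f"hypothesis drawn from ({theory})"

    def run_experiment(hypothesis):
        return random.random()                      # stand-in for measured data

    def draw_conclusions(hypothesis, data):
        return f"{hypothesis}: supported={data > 0.5}"

    def scientific_cycle(theory, rounds=3):
        for _ in range(rounds):
            hypothesis = formulate_hypothesis(theory)
            data = run_experiment(hypothesis)
            conclusion = draw_conclusions(hypothesis, data)
            theory = f"{theory}; {conclusion}"      # conclusions feed back into the theory
        return theory

    print(scientific_cycle("initial theory"))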

This looks quite simple, but is it really? Let’s think about hypothesis formulation and drawing conclusions. In both of these steps, the results are bounded by our imagination and intuition. Thus, something that doesn’t ever enter anybody’s imagination will not be established as scientific fact. In view of this, we should hope that scientists do have vivid imaginations. It is easy to imagine that there might be very powerful findings out there, on the other side of our current scientific horizon, that nobody has yet been creative enough to speculate about. It is not at all obvious that we can see the low-hanging fruit or even survey this mountainous landscape well – particularly in an age of hyper-specialisation.

But scientists’ imaginations are probably quite vivid in many cases – thankfully. Ideas come to scientists from somewhere, and some ideas persist more strongly than others. Some ideas seduce scientists to years of hard labour, even when the results are meagre at first. Clearly this intuition and sense that something is worth investigating is absolutely crucial to high quality results.

A hypothesis might be: there is a force that makes bodies with mass attract one another, in a way that is inversely proportional to the square of the distance between them. To formulate this hypothesis we need concepts such as force, bodies, mass, distance, attraction. Even though the hypothesis might be formulated in mere words, these words all depend on experience and practices – and thus equipment (even if the equipment used in some cases is simply our own bodies). If this hypothesis is successfully proven, then a new concept becomes available: the law of gravity. This concept in turn may be incorporated into new hypotheses and experiments, paving the way for ever higher and more complex levels of science and scientific phenomena.
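
In its familiar Newtonian formulation – supplied here only for concreteness – the resulting law reads

    F = G m_1 m_2 / r^2

where F is the attractive force, m_1 and m_2 are the two masses, r is the distance between them and G is the gravitational constant. Every one of these symbols presupposes the concepts, practices and equipment just mentioned.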

Our abilities to form hypotheses, to construct equipment and to draw conclusions seem to be human capacities that are not easy to automate.

Entities such as matter, energy, atoms and electrons become accessible – I submit – primarily through the concepts and equipment that give access to them. In a world with an alternate history different from ours, it is conceivable that entirely different concepts and ideas would explain the same phenomena that are explained by our physics. For science to advance, new equipment and new concepts need to be constructed continually. This process is almost itself an organic growth.

Can we have automated science? Do we no longer need scientific theory? (!?) Can computers one day carry out our science for us? Only if either: a) science is not an essentially human activity, or b) computers become able to take on this human essence, including the responsibility for growing the conceptual-equipmental boundary. Data mining in the age of “big data” is not enough, since this (as far as I know) operates with a fixed equipmental boundary. As such, it would only be a scientific aid and not a substitute for the whole process. Can findings that do not result in concepts and theories ever be called scientific?

If computer systems ever start designing and building new I/O-devices for themselves, maybe something in the way of “artificial science” could be achieved. But it is not clear that the intuition guiding such a system could be equivalent to the human intuition that guides science. It might proceed on a different path altogether.

1 comment » | Bioinformatics, Computer science, Philosophy

The bounded infinity of language

August 9th, 2014 — 5:48pm

Works of art, including film, painting, sculpture, literature and poetry, have a seemingly inexhaustible quality. As we keep confronting them, renewing our relationship with them over time, we continually extract more meaning from them. Some works truly appear to be bottomless. Reaching the bottom easily is, of course, a sure sign that a work will not have much lasting value.

Out of the forms listed above, (written) poetry and literature have the particular property that they are crafted out of a demonstrably finite medium: text. A finite alphabet, finite vocabulary, and a finite number of pages. As long as one disregards the effect of details such as paper quality, typography and binding, perfect copies can be made; the text can indeed be transcribed in its entirety without information loss. Somehow, reading Goethe on a Kindle is an experience that still holds power, although he presumably never intended his books to be read on Kindles (and some might argue that reading him in this way is ignoble).

How is it then that the evocative power of something finite can seem to be boundless? This curious property is something we might call the poetic or metaphorical qualities of a text. (Works of film, painting, sculpture and so on most likely also have this power, but it is trickier to demonstrate that they are grounded in a finite medium.) Through this mysterious evocative power, the elements that make up a work of art allow us to enter into an infinity that has been enclosed in a finite space. It will be argued that what is evoked comes as much from the reader as from the text, but this duality applies to all sensation.

With this in mind we turn, once again, to programming and formal “languages”. Terms in programming languages receive their meaning through a formal semantics that describes, mathematically, how the language is to be translated into an underlying, simpler language. This process takes place on a number of levels, and eventually the lowest underlying language is machinery. This grounds the power of a program to command electrons. But this is something different from the meaning of words in a natural language. The evocative power described above is clearly absent, and computer programs today do not transcend their essential finitude. With brute force, we could train ourselves to read source code metaphorically or poetically, but in most languages I know, this would result in strained, awkward and limited metaphors. (Perhaps mostly because programming languages to a large extent reference a world different from the human world.)
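
As a crude illustration of what such a translation looks like – a toy sketch of my own, not a fragment of any real language definition – consider an expression language whose entire meaning is given by mapping it into something simpler, namely plain numbers:

    # A toy denotational semantics: the meaning of an expression is defined by
    # translating it into a simpler domain (here, ordinary Python numbers).
    def meaning(expr):
        if isinstance(expr, (int, float)):      # a literal denotes itself
            return expr
        op, left, right = expr                  # e.g. ("add", e1, e2)
        if op == "add":
            return meaning(left) + meaning(right)
        if op == "mul":
            return meaning(left) * meaning(right)
        raise ValueError(f"unknown operator: {op}")

    # Two different expressions, one and the same meaning.
    print(meaning(("add", 3, 1)), meaning(("mul", 2, 2)))  # 4 4

The translation is exhaustive: once the number has been computed, nothing in the expression is left over, and this absence of surplus is precisely the contrast with the evocative power described above.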

Consider how this inability to transcend finitude impacts our ability to model a domain in a given programming language. With an already formal domain, such as finance or classical mechanics, the task is relatively simple, since what needs to happen is a mere translation. On the other hand, other domains, such as biology, resist formalisation – and perhaps this is one of their essential properties. Here we would like to draw on the evocative, poetic, and metaphorical capacities of natural language – for the sake of program comprehension and perhaps also to support effective user interfaces – while also writing practical programs. But we have yet to invent a formal language that is both practical and evocative to the point that works of art could be created in it.

an ancient pond / a frog jumps in / the splash of water

(Bashou, 1686)

1 comment » | Computer science, Philosophy, Software development

Equipmental visibility and barriers to understanding

July 12th, 2013 — 9:28pm

The following is an excerpt from a text I am currently in the process of writing, which may or may not be published in this form. The text is concerned with the role of software in the scientific research process, and what happens when researchers must interact with software instead of hardware equipment, and finally the constraints that this places on the software development process.

Technological development since the industrial revolution has made equipment more intricate. Where we originally had gears, levers and pistons, we progressed via tape, vacuum tubes and punch cards to solid-state memory, CPUs and wireless networks. The process of the elaboration of technology has also been the process of its hiding from public view. An increasing amount of complexity is packed into compact volumes and literally sealed into “black boxes”. This does not render the equipment inaccessible, but it does make it harder to understand and manipulate as soon as one wants to go outside of the operating constraints that the designers foresaw. As we have already noted, this poses problems for the scientific method. Scientists are human, and they engage with their equipment through the use of their five senses. Let us suggest a simple rule of thumb: the more difficult equipment is to see, touch, hear, etc., the more difficult it becomes to understand it and modify its function. The evolution of technology has happened at the expense of its visibility. The user-friendly interface that provides a simple means of interacting with a complex piece of machinery, initially very valuable, can often become a local maximum that is difficult to escape if one wants to put the equipment to new and unforeseen uses. We may note two distinct kinds of user-friendly interfaces: interfaces where the simplified view closely approximates the genuine internals of the machinery, and interfaces where the simplified view uses concepts and metaphors that have no similarity to those internals. The former kind of interface we will call an authentic simplification, the latter an inauthentic simplification.

Of course, software represents a very late stage in the progression from simple and visible to complex and hidden machinery. Again we see how software can both accelerate and retard scientific studies. Software can perform complex information processing, but it is much harder to interrogate than physical equipment: the workings are hidden, unseen. The inner workings of software, which reside in source code, are notoriously hard to communicate. A programmer watching another programmer at work for hours may not be fully able to understand what kind of work is being done, even if both are highly skilled, unless a disciplined coding style and development methodology are being used. Software is by its very nature something hidden away from human eyes: from the very beginning it is written in artificial languages, which are then gradually compiled into even more artificial languages for the benefit of the processor that is to interpret them. Irreversible, one-way transformations are essential to the process of developing and executing software. This leads to what might be called a nonlinearity when software equipment is being used as part of an experimental setup. Whereas visible, tangible equipment generally yields more information about itself when inspected, and whereas investigators generally have a clear idea of how hard it is to inspect or modify such equipment, software equipment often requires an unknown expenditure of effort to inspect or modify – unknown to all except those programmers who have experience working with the relevant source code, and even they will sometimes have a limited ability to judge how hard it would be to make a certain change (software projects often run over time and over budget, but almost never under time or under budget). This becomes a severe handicap for investigators. A linear amount of time, effort and resources spent understanding or modifying ordinary equipment will generally have clear payoffs, but the inspection and modification of software equipment will be a dark area that investigators, unless they are able to collaborate well with programmers, will instinctively avoid.

To some degree these problems are inescapable, but we suggest the maximal use of authentic simplification in interfaces as a remedy. In addition, it is desirable to have access to multiple levels of detail in the interface, so that each level is an authentic simplification of the level below. In such interface strata, layers have the same structure and only differ in the level of detail. Thus, investigators are given, as far as possible, the possibility of smooth progression from minimal understanding to full understanding of the software. The bottom level interface should in its conceptual structure be very close to the source code itself.
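
A toy rendering of such strata – the task and the names are invented for illustration and drawn from no real package – might look like this, with each layer phrased in the concepts of the layer beneath it:

    # Three strata over the same operation. Each layer is an authentic simplification
    # of the one below: fewer knobs, but the same concepts (threshold, minimum length).
    def trim_read(read, qualities, threshold, min_length):
        """Bottom layer: close to the actual computation."""
        kept = "".join(base for base, q in zip(read, qualities) if q >= threshold)
        return kept if len(kept) >= min_length else ""

    def trim_read_simple(read, qualities):
        """Middle layer: sensible defaults, same vocabulary as the layer below."""
        return trim_read(read, qualities, threshold=20, min_length=5)

    def clean_reads(reads_with_qualities):
        """Top layer: a single call, still described in terms of trimming reads."""
        trimmed = (trim_read_simple(read, q) for read, q in reads_with_qualities)
        return [t for t in trimmed if t]

    print(clean_reads([("ACGTACGT", [30, 30, 10, 30, 30, 30, 30, 30])]))  # ['ACTACGT']

An investigator content with the top layer never has to look further down, but when the defaults no longer fit, the next layer speaks the same language rather than a new one.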

Comment » | Bioinformatics, Computer science, Philosophy, Software development
