Category: Bioinformatics


The year and decade in review. 2020s: orderly peace?

December 30th, 2019 — 8:49am

2019 comes to a close, and with it the 2010s. Below are a few thoughts on these periods of time.

The most significant book I’ve read in 2019 is probably Hannah Arendt’s The Origins of Totalitarianism. The German title, literally “Elements and Origins of Totalitarian Rule”, more closely reflects the contents of this monograph. Arendt explores antisemitism, imperialism and totalitarianism to form a grand analysis of totalitarian forms of government, which she considers to be genuinely new and unprecedented. Those who make it through the somewhat slow early chapters will be richly rewarded. It’s a very timely book – although written in the 1950s, most of the ideas feel like they could be from last week. Elements of totalitarian rule are absolutely something we should worry about.

Another notable book from this year has been Edward Snowden’s Permanent Record. Aside from the obvious political dynamite, I found myself relating to a lot of the experiences he had growing up. Perhaps this is a generational story. In the late 90s, the Internet suddenly became relatively mainstream and for a short while, it was a very special place, seemingly full of utopian promise and all kinds of possibilities and exploration. For many born in the mid-80s this coincided with our teenage years.

I’ve lived in Japan throughout the 2010s, the final part of the Heisei (平成) era. In 2019 this era came to a close and we are now officially in Reiwa (令和). I can’t easily summarise the 2010s. Both my personal life and Japan seem to have undergone great change during this time, and sometimes it’s hard to separate one from the other. The Fukushima incident in 2011 was perhaps a watershed moment that Japan is still grappling with. Although the future of nuclear power has not yet been resolved, the country’s response to such a tense incident has in many ways been admirable, and the famous Japanese virtue (sometimes a double-edged sword) of stability certainly came through. The surrounding world is also changing, and Japan, though still a relatively separate culture, is becoming considerably more open and mixed as a society, perhaps out of necessity. Tourism and the inflow of foreign labour have both increased significantly. This raises interesting questions about what kind of society Japan might be in 10–20 years.

During the decade I have had diverse personal and professional experiences. I lived in Tokyo, Osaka, then Tokyo again. I was able to complete a PhD thesis. I visited many countries for the first time, and became interested in bioinformatics (mainly as a field in which to apply fundamental computer science and software engineering). I took up several new hobbies, obtained permanent residency in Japan, and was able to improve my Japanese to the point of reading novels, although I’m still not quite where I’d like to be with the language. I’ve been reading a lot of philosophy and general literature and have tried to systematically develop a worldview (fragments of which sometimes appear on this blog). Not everything I tried to do worked out the way I expected, but the learning has felt very valuable, and I do feel much wiser and more capable in my approach to many things. I expect to be sincerely expressing the same sentiment in the year 2029, though.

One technical focus this year was improving my Spark (and Scala) skills and developing an algorithm for De Bruijn graph compaction (similar to what Bcalm does). I was pleased with the efficient research process I was able to achieve, probably my best ever on this kind of project. In terms of my professional path, the overall trend for me seems to be towards smaller firms and greater independence. (Although I remain with Lifematics, I will now also be available for consulting and contracting opportunities in bioinformatics as well as general software development. If you are reading this and think you would like to work with me, do get in touch.)
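
For readers who haven’t come across it, compaction means merging the non-branching paths of a De Bruijn graph into single sequences (unitigs). The toy Spark sketch below shows one small ingredient of that idea, finding the branching (k-1)-mers at which unitigs must start or end. It is not the algorithm I actually developed: it ignores reverse complements and k-mer counting, and the input path and value of k are placeholders.

```scala
import org.apache.spark.sql.SparkSession

// Toy sketch: find the branching (k-1)-mers of a De Bruijn graph built from reads.
// Reverse complements and k-mer abundance filtering are deliberately ignored.
object BranchPointSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("branch-point-sketch").getOrCreate()
    val sc = spark.sparkContext
    val k = 31

    // Placeholder input: one read per line (a real pipeline would parse FASTA/FASTQ).
    val reads = sc.textFile("reads.txt")

    // All distinct k-mers in the input.
    val kmers = reads.flatMap(_.sliding(k)).filter(_.length == k).distinct().cache()

    // How many distinct k-mers extend each (k-1)-mer forwards (out-degree)...
    val outDegree = kmers.map(km => (km.take(k - 1), 1)).reduceByKey(_ + _)
    // ...and backwards (in-degree).
    val inDegree = kmers.map(km => (km.drop(1), 1)).reduceByKey(_ + _)

    // (k-1)-mers where the graph branches: unitigs must start or end here.
    val branchPoints = outDegree.filter(_._2 > 1).keys
      .union(inDegree.filter(_._2 > 1).keys)
      .distinct()

    println(s"distinct k-mers: ${kmers.count()}, branch points: ${branchPoints.count()}")
    spark.stop()
  }
}
```

Bcalm, for instance, partitions k-mers by minimizer to make this kind of computation scale; the sketch is only meant to illustrate the graph structure involved.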

Thus ends a politically very strange decade, from a global perspective, and we enter the brave new world of the 2020s. Will it be a time of “orderly peace”, as the name 令和 suggests?


Nietzschean toxicology

January 25th, 2018 — 10:40am

Although one of my main projects is software for toxicology and toxicogenomics, my background in toxicology is not as strong as in, for example, computer science, and I’m lucky to be able to rely on experienced collaborators. With that said, I’d still like to try to speculate about the field through a mildly Nietzschean lens.

Toxicology focuses in the main on identifying mechanisms of degradation. Ingesting large quantities of the painkiller acetaminophen will cause liver damage and necrosis of liver cells. This will seriously harm the organism, since the liver is such an important organ, and many essential functions that the body depends on will be degraded or perhaps vanish completely. Untreated acute liver failure is fatal. It is very clearly a degradation.

Toxicology wishes to understand the mechanisms that lead to such degradation. If we understand the sequence of molecular events that eventually leads to the degradation, perhaps we can either make a drug or compound safer by blocking those events, or distinguish between safe and unsafe compounds or stimuli.

Safety testing of a new drug, however, is done in aggregate, on a population of cells (or, in a clinical trial for example, on a group of animals or even humans, after a high degree of confidence has been established). If only a few individuals develop symptoms out of a large population, the drug is considered unsafe. But in practice, different individuals have different metabolisms, different versions of molecular pathways, different variants of genes and proteins, and so on. Accordingly, personalised medicine holds the promise that, once we have sufficient insight into individual metabolism, drugs that are unsafe for the general population could be prescribed only to those individuals who can safely metabolise them.

It is easy to take a mechanism apart and stop its functioning. However, while a child can take a radio apart, often he or she cannot put it back together again, and only very rarely can a child improve a radio. And in which way should it be improved? Should it be more tolerant to noise, play sound more loudly, receive more frequencies, perhaps emit a pleasant scent when receiving a good signal? Some of these improvements are as hard to identify, once achieved, as they might be to effect. Severe degradation of function is trivial both to effect and to identify, but improvement is manifold, subtle, may be genuinely novel, and may be hard to spot.

An ideal toxicology of the future should, then, be personalised, taking into account not only what harms people in the average case, but what harms a given individual. In the best case (a sophisticated science of nutrition) it should also take into account how that person might wish to improve themselves, a problem that is psychological and ethical as much as it is biological, especially when such improvement involves further specialisation or a trade-off between different possibilities of life. Here the need for consent is even more imperative than with more basic medical procedures that simply aim to preserve or restore functioning.

In fact, the above issues are relevant not only for toxicology but also for medicine as a whole. Doctors can only address diseases and problems after viewing them as a form of ailment. Such a viewpoint is based on a training that has as its topic the average human being. But species and individuals tend towards specialisation, and perhaps the greatest problems are never merely average problems. Personalised medicine as a field may eventually turn out to be much more complex than we can now imagine, and place entirely new demands on physicians.


Interactive toxicogenomics

May 4th, 2017 — 10:14am

If you work in toxicology or drug discovery, you might be familiar with Open TG-GATEs, a large transcriptomics database that catalogues gene expression responses to well-known drugs and toxins. This database was developed by Japan’s Toxicogenomics Project over many years as a public-private partnership, and remains a very valuable resource. As with many large datasets, despite its openness, accessing and working with this data can require considerable work. Data must always be placed in a context, and these contexts must be continually renewed. One user-friendly interface that simplifies access to this data is Toxygates, which I began developing as a postdoc at NIBIOHN in the Mizuguchi Lab in 2012 (and am still the lead developer of). As a web application, Toxygates lets you look at data of interest in context, together with annotations such as gene ontology terms and metabolic pathways, and provides visualisation tools.

We are now releasing a new major version of Toxygates, which, among many other new features, allows you to perform and visualise gene set clustering analyses directly in the web browser. Gene sets can also be easily characterised through an enrichment function, which is supported by the TargetMine data warehouse. Last but not least, users can now upload their own data and cluster and analyse it in context, together with the Open TG-GATEs data.
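
As a brief aside for readers who haven’t encountered enrichment analysis: the underlying question is whether a gene set (a pathway or GO term, say) overlaps a list of selected genes more than chance alone would predict. A common way of quantifying this is a hypergeometric (over-representation) test, sketched below in Scala with made-up numbers. The exact statistics applied by TargetMine may differ, so treat this purely as an illustration of the idea.

```scala
// Toy over-representation (hypergeometric) test. All numbers are invented
// purely to illustrate the calculation.
object EnrichmentSketch {
  // log(n!) computed by summing logs; adequate for gene-scale inputs.
  private def logFact(n: Int): Double =
    (2 to n).iterator.map(i => math.log(i.toDouble)).sum

  // log of the binomial coefficient C(n, k), with an out-of-range guard.
  private def logChoose(n: Int, k: Int): Double =
    if (k < 0 || k > n) Double.NegativeInfinity
    else logFact(n) - logFact(k) - logFact(n - k)

  /** P(X >= k) where X ~ Hypergeometric(N, K, n): N genes in total, K of them in
    * the gene set, n genes selected by the analysis, k of the selected genes in the set. */
  def pValue(N: Int, K: Int, n: Int, k: Int): Double =
    (k to math.min(K, n)).map { i =>
      math.exp(logChoose(K, i) + logChoose(N - K, n - i) - logChoose(N, n))
    }.sum

  def main(args: Array[String]): Unit = {
    // 20000 genes measured, a pathway of 150 genes, 500 genes differentially
    // expressed, 15 of them falling in the pathway.
    println(f"p = ${pValue(20000, 150, 500, 15)}%.3g")
  }
}
```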

Our new paper in Scientific Reports documents the new version of Toxygates and illustrates the use of the new functions through a case study performed on the hepatotoxic drug WY-14643. If you are curious, give it a try.

When I began the development as a quick prototype, I had no idea that the project would still be evolving many years later. Toxygates represents considerable work and many learning experiences for me as a researcher and software engineer, and I’m very grateful to everybody who has collaborated with us, supported the project, and made our journey possible.



Synthesis is appropriation

December 11th, 2016 — 5:58pm

In contemporary society, we make use of the notion that things may be synthetic. Thus we may speak of synthetic biology, “synthesizers” (synthetic sound), synthetic textiles and so on. Such things are supposed to be artificial and not come from “nature”.

However, the Greek root of the word synthesis actually seems to refer to the conjoining of pre-existing things, rather than something being purely man-made. But what does it mean to be purely man-made?

Furniture, bricks, bottles, roads and bread are all made in some sense; they are the result of human methods, tools and craft applied to some substrate. But they do not ever lose the character of the original substrate, and usually this is the point – we would like to see the veins of wood in fine furniture, and when we eat bread, we would like to ingest the energy, minerals and other substances that are accumulated in grains of wheat.

Products like liquid nitrogen or pure chlorine, created in laboratories, are perhaps the ones most readily called “synthetic”, or the ones that would most readily form the basis for something synthetic. This is owing to their apparent lack of specific character or particularity, such as the veins of wood or the minerals in wheat. On the other hand, they possess this lack of character only if we take atoms to be the lowest level of reference. If we take into consideration ideas from string theory or quantum mechanics, the bottom level most likely shifts, and the pure chlorine no longer seems so homogeneous.

Accordingly, if we follow this line of thought to the end, as long as we have not established the bottom or ground level of nature – and it is questionable if we ever shall – all manufacture, all making and synthesis, is only a rearrangement of pre-existing specificity. Our crafts leave traces in the world, such as objects with specific properties, but do not ever bring something into existence from nothing.

Synthesis is appropriation: making is taking.


The minimal genome of Craig Venter’s Syn3.0

March 28th, 2016 — 6:21pm

The J Craig Venter Institute has published a paper detailing the genome of their new Syn3.0 synthetic organism. The major accomplishment was to construct a viable cell with a synthetic, extremely small genome: only 473 genes and about 500 kbp.

Even though it is considered to be fully “synthetic”, this genome is not built from scratch. Instead, the starting point is JCVI-syn1.0, a Mycoplasma mycoides cell with a chemically synthesised genome, first produced in 2010; genes and regions are deleted from it to arrive at something that is much smaller, but still viable. This means that even this fully synthetic genome still contains regions and functionalities that are not fully understood. The genome of Syn3.0 is the smallest so far – “smaller than that of any autonomously replicating cell found in nature”. It should be a very valuable starting point for developing an explicit understanding of the basic gene frameworks needed by any cell for its survival – the “operating system of the cell” in the words of the authors.

Since so many genes are still basically not understood, the authors could not rely entirely on logic and common sense when choosing which genes to remove. They used an approach that introduced random mutations into the starting organism, and then checked which mutants were viable and which were not. This allowed them to classify genes as essential, inessential or quasi-essential (!). The deletion of essential genes would cause the cell to simply die. The deletion of quasi-essential genes would not kill it, but would dramatically slow its replication rate, severely crippling it. The final Syn3.0 organism has a doubling time of about 3 hours.
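
To make the classification concrete, here is how I picture it as a small piece of code. The gene names, the growth measure and the threshold below are entirely hypothetical and are not taken from the paper.

```scala
// Hypothetical illustration of the essential / quasi-essential / inessential split.
sealed trait GeneClass
case object Essential extends GeneClass
case object QuasiEssential extends GeneClass
case object Inessential extends GeneClass

// Outcome of disrupting one gene: did the cell survive, and how fast did it grow
// relative to the undisrupted strain?
final case class Disruption(gene: String, viable: Boolean, relativeGrowth: Double)

object ClassifySketch {
  // Hypothetical cutoff: disruptions leaving less than 40% of normal growth
  // are treated as hitting a quasi-essential gene.
  val quasiEssentialCutoff = 0.4

  def classify(d: Disruption): GeneClass =
    if (!d.viable) Essential
    else if (d.relativeGrowth < quasiEssentialCutoff) QuasiEssential
    else Inessential

  def main(args: Array[String]): Unit = {
    val observations = Seq(
      Disruption("geneA", viable = false, relativeGrowth = 0.0),
      Disruption("geneB", viable = true, relativeGrowth = 0.2),
      Disruption("geneC", viable = true, relativeGrowth = 0.95)
    )
    observations.foreach(d => println(s"${d.gene}: ${classify(d)}"))
  }
}
```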

Some of the points I took away from this readable and interesting paper were:

Synthetic biology methods are starting to resemble software development methods. The authors describe a design-build-test (DBT) cycle that involves several nontrivial methods, such as in silico design, oligonucleotide synthesis, yeast cloning, insertion into the bacteria, testing, and then (perhaps) sequencing to go back to computers and figure out what went wrong or what went well. Thus, a feedback loop between the cells and the in silico design space is set up.

A very small genome needs a very tightly controlled environment to survive. The medium (nutrient solution) that Syn3.0 lives in apparently provides almost all the nutrients and raw materials it could possibly need. This means that many genes that would normally be useful for overcoming adverse conditions, perhaps for synthesising nutrients that are not available from the environment, are now redundant and can be removed. So when thinking about genome design, it seems we really have to think about how everything relates to a specific environment.

The mechanics of getting a synthetic genome into a living cell are still complex. A huge amount of wet-lab (and, presumably, dry-lab) work is still needed to get the genome from the computer into something viable in a cell culture. However, things are going much faster than in 2010, and it’s interesting to think about where this field might be in 2021.


