The path towards the concept of a Human Cytome Project

Introduction

I will give a bit more background on the path that led me to the idea that something like a Human Cytome Project might be feasible. Such a project should be part of a solution for late-stage attrition in drug development and should facilitate personalized medicine.

There was a personal, a scientific and an organizational reason for making the suggestion of a Human Cytome Project.
The personal reason was the decline in R&D productivity which had cost me my job in a pharmaceutical company in 2001. This decline in R&D productivity seemed to be a world-wide phenomenon, so there must be something wrong with the overall approach to drug discovery and development in recent years.
The scientific reason was that I had seen the power of elucidating complex biological processes with the automated microscopy system developed at Janssen Pharmaceutica in Beerse, Belgium, over a period of 15 years, on which I had worked successfully for 4.5 years (1997-2001 CE). The robust autofocus algorithm, the scale-space algorithms and the spatial color model we used allowed for superior operation, object detection and feature extraction. Commercial systems started appearing that promised to take this even further, mainly in the automation of sample handling. Molecular imaging would allow for exploring the human cytome in vivo.
I also became increasingly aware that personal and scientific efforts would not be sufficient, so I became more involved in process improvement and organization-building in order to shape the organizational architecture which would allow for working on ambitious goals. Process and organization development can be fun and very rewarding.

The idea for large-scale screening of the dynamics of the (living) cell came when I visited the Sanger Centre in the UK in 2001 and was shown a big room filled with DNA sequencers. From then on I wanted to create a system which could mean for cell-based research what DNA sequencing had meant for Human Genome research. We had already developed an advanced automated microscopy system at Janssen Pharmaceutica (Beerse, Belgium), and only needed to add sample handling robotics and integrate the systems into a larger (conveyor belt) system. Our knowledge and understanding of pathological processes needs to be pushed to a higher level of biological complexity and organisation, closer to the in vivo environment in the ecosystem of man. Complexity needs to be captured and understood, not just dismantled and reduced to static building blocks which are unable to convey the true nature of a process in man. Too much data and not enough process knowledge hinders our attempts to come to grips with complex pathological processes.

I wanted to achieve at least the same level of understanding of biomedical processes at the cellular level and beyond: not only the molecular structure, but also the structure-function relation, in other words the molecular physiology of an entire organism. I also noticed the dramatic decline in productivity of pharmaceutical research and wanted to provide a way out of the dead end that drug discovery was heading into with its focus on target-based discovery. Attrition rates in drug development of up to 90% overall and 50% in Phase III of clinical development are not an example of a highly productive process, are they? We are asking the wrong questions and giving the wrong answers about pathological processes. We generate a lot of data and knowledge about genes and proteins, but we lack a true understanding of the complex molecular pathological process in the human biosystem.

I did not want to create only a catalog of the components of the cytome, but to allow for the functional exploration of the cell in order to capture and describe the dynamics of cellular processes. It is important not only to study an average cell type, but also to understand the impact of cellular diversity. The multidimensional world of our cells requires a higher-dimensional approach than the linear world of DNA, and a different inner and outer resolution is needed for each level of biological integration. It became clear to me that the cellular level is the lowest level of biological organization close enough to the complex dynamics of a disease process. Only a high correlation to the disease process itself allows a model to be used as a valid disease model.

Today powerful techniques to explore the cytome are available, such as flow cytometry (Edwards B.S., 2004) and advanced digital microscopy (Price J. H., 2003; Tsien R, 2003), which enable the exploration of cellular function and phenotype. There are now exciting technological developments going on in what is called High Content Screening, which will allow us to explore cellular systems on a large scale (Taylor DL, 2001; Giuliano KA, 2003). These developments and other technological advances made me feel confident that the exploration of the human cytome would be feasible. We should be able to open the door to the cell wide and look at cellular structure and dynamics better than we do now by just looking through the keyhole.

My personal interest and research

Figure 1: Exploring the organizational levels of biology. A linked and overlapping cascade of exploratory systems, each exploring -omes at different organizational levels of biological systems, in the end allows for creating an interconnected knowledge architecture of entire cytomes. The approach leads to the creation of an Organism Architecture (OA) in order to capture the multi-level dynamics of an organism.

I am no longer directly involved in the design and development of a High Content Screening (HCS) system or cytomics. This information is solely for the purpose of providing some background information on my personal interest and previous involvement in the design and development of such a system.

My goal with cytome-oriented research was to unravel the secrets of many cells (mainly primary cells), not only one cell type, as the human cytome consists of a highly diverse population of cells. I wanted to understand the structure and function of human cells in a quantitative way to improve the predictive power and relevance of the drug discovery and pre-clinical stages of pharmaceutical R&D. This bridging knowledge must be applied to understand and predict the course of disease processes down to the cellular level, but also up to the entire cytome level, in order to create better disease treatments and drugs (drugs that truly succeed instead of falling to late-stage attrition). Many problems in medicine remain unsolved due to our limited knowledge and understanding of cellular processes in the human cytome as a whole.

I myself wanted to know if a framework to explore cells on a very large scale could be implemented and would work. Managing the flow of data from physics to features is the centerpiece of such a system. I wanted to transform the space-time continuum of biological processes in cells into their digital representations on a truly massive scale. Once a process is represented in a digital state it becomes accessible to quantitative content extraction and analysis.

As technologies evolve, it should be easy to exchange components of a system or expand it with new technologies. The system should therefore be modular and scalable. The core of the system should be of a different design than the interface to the outside world, and the two should evolve separately, linked to each other only for the exchange of information. The concept should allow for up-scaling the system to process massive amounts of high-dimensional data.

The core has to be able to deal with multidimensional spaces and datasets and manage the dataflow between modules, each module dealing with a part of the entire process, from acquisition and detection to data generation. From center to periphery, the system becomes increasingly machine and technology related, while the core is only a data-transfer module unaware of technical or physical constraints. Each machine connected to the core enables the exploration of a subset of a physical space and informs the core about its capacities and restrictions (0D up to 5D, spatial, spectral and temporal).
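The idea of devices informing a technology-agnostic core about their capacities can be illustrated with a minimal sketch. It is not the original implementation: the class names, the axis labels and the two example devices are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

# The five axes named in the text: three spatial, one spectral, one temporal.
AXES = ("x", "y", "z", "spectral", "temporal")


@dataclass(frozen=True)
class DeviceCapabilities:
    """What a device reports to the core (hypothetical structure)."""
    name: str
    axes: frozenset  # subset of AXES that the device can sample

    @property
    def dimensionality(self) -> int:
        # 0D (a single reading) up to 5D (x, y, z, spectral, temporal)
        return len(self.axes)


@dataclass
class Core:
    """Data-transfer hub that knows devices only through their declarations."""
    devices: dict = field(default_factory=dict)

    def register(self, caps: DeviceCapabilities) -> None:
        unknown = set(caps.axes) - set(AXES)
        if unknown:
            raise ValueError(f"unknown axes: {sorted(unknown)}")
        self.devices[caps.name] = caps

    def devices_covering(self, *axes: str) -> list:
        """Names of registered devices that can sample all requested axes."""
        wanted = set(axes)
        return sorted(n for n, c in self.devices.items() if wanted <= c.axes)


# Two hypothetical devices: a 2D time-lapse reader and a full 5D instrument.
core = Core()
core.register(DeviceCapabilities("widefield", frozenset({"x", "y", "temporal"})))
core.register(DeviceCapabilities("confocal", frozenset(AXES)))
print(core.devices_covering("x", "y", "z"))
```

In this sketch the core never inspects hardware details; it only matches requested axes against what each device has declared, which is one way to keep the center free of technical constraints.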

A device attached to the system as such should allow for the exploration of a part of this spatio-spectro-temporal continuum. Devices differ in their sampling of the electromagnetic spectrum (LM, EM, CT, NMR), the spatial scale at which they can operate (nm, microns, mm) and their temporal resolution (nsec, msec, sec, min). A given device has an inner and outer spatial, spectral and temporal resolution limit. All (imaging) devices generate pixel or voxel density profiles which can be used for (semi-) quantitative exploration. A given input data point represents a spatial, spectral and temporal sampling of the spatio-spectro-temporal continuum. The basic principles remain the same, only our point of view and our perspective of the physical boundaries in space and time change.

The physical dimensions of the high-dimensional space and the meaning of each pixel/voxel are only relevant for the quantification module, as the detection module only deals with 'density' patterns in a 5D space. Anisotropy in spatial, temporal and spectral sampling is only accounted for at the periphery of the system, as it has an impact on the quantification of objects. Each dimension (XYZ, spectral, temporal) is regarded as a continuum, sampled at discrete intervals, each with its own inner and outer resolution.
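A small worked example may help to show why anisotropic sampling matters only at quantification. In the sketch below, detection is assumed to report an object purely in voxel units, and the physical spacing is applied afterwards. The function names and the spacing values are hypothetical.

```python
from math import prod


def object_volume(voxel_count, spacing_um):
    """Physical volume in cubic microns from a voxel count and (dz, dy, dx) spacing."""
    return voxel_count * prod(spacing_um)


def object_extent(extent_voxels, spacing_um):
    """Physical bounding-box size in microns along each axis."""
    return tuple(n * s for n, s in zip(extent_voxels, spacing_um))


# Detection output in index space: 1000 voxels in a 5 x 20 x 20 voxel box.
voxel_count = 1000
extent_voxels = (5, 20, 20)

# Assumed anisotropic sampling: 1.0 micron between planes, 0.25 micron in-plane.
spacing_um = (1.0, 0.25, 0.25)

volume = object_volume(voxel_count, spacing_um)
extent = object_extent(extent_voxels, spacing_um)
print(volume, extent)
```

The object looks flat in index space (5 against 20 voxels), yet its physical bounding box is a cube of 5 microns per side. A detection step working on density patterns does not need to know this; the quantification step does.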

The system design allowed for distributed operation, so a system could run on different platforms and interact with components over a network. It should use open standards for its communication with the outside world to allow for easy integration in a heterogeneous environment (XML, CORBA). The output of the system should be a set of linked feature hyperspaces, each describing structural and functional attributes of the individual cell and its components. The data output must be in a format which can easily be parsed and fed into data analysis and visualization systems.
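As an illustration of an easily parsed output format, the following sketch writes per-cell feature values to XML and reads them back with Python's standard library. The element and attribute names (cells, cell, feature, id, name) and the feature names are assumptions for this example, not the schema of the original system.

```python
import xml.etree.ElementTree as ET


def features_to_xml(cells):
    """Serialize {cell_id: {feature_name: value}} to an XML string."""
    root = ET.Element("cells")
    for cell_id, features in cells.items():
        cell = ET.SubElement(root, "cell", id=str(cell_id))
        for name, value in features.items():
            feat = ET.SubElement(cell, "feature", name=name)
            feat.text = repr(float(value))  # repr keeps full float precision
    return ET.tostring(root, encoding="unicode")


def xml_to_features(text):
    """Parse the XML produced above back into nested dictionaries."""
    root = ET.fromstring(text)
    return {
        cell.get("id"): {
            feat.get("name"): float(feat.text) for feat in cell.findall("feature")
        }
        for cell in root.findall("cell")
    }


# Hypothetical feature values for two cells.
cells = {
    "c1": {"area_um2": 112.5, "mean_intensity": 843.0},
    "c2": {"area_um2": 98.25, "mean_intensity": 790.5},
}
xml_text = features_to_xml(cells)
restored = xml_to_features(xml_text)
print(xml_text)
```

Because the round trip restores the same values, any downstream analysis or visualization tool with an XML parser could consume such output without knowing anything about the acquisition hardware.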

The system extracts features. Their meaning as such is not relevant for the system itself, but for the observer of the feature space. Capturing attributes is not the same process as assigning a meaning to the features we extract from a biological system. The meaning of a change in the multidimensional feature space can be built into a postprocessing system, but the content extraction process has to capture quantitative data and not interpretations of data. The ultimate data reduction is the assignment of a meaning to a quantitative feature or attribute change, not the extraction of only a minimum of features.
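The separation between extraction and interpretation can be sketched as two independent steps. The feature names, the threshold and the labels below are hypothetical and serve only to show where meaning enters the process.

```python
def extract(intensities):
    """Content extraction: return measurements only, with no interpretation."""
    n = len(intensities)
    return {
        "n_pixels": n,
        "mean_intensity": sum(intensities) / n,
        "max_intensity": max(intensities),
    }


def interpret(features, threshold=100.0):
    """Post-processing: assign a meaning to a feature (threshold is an assumption)."""
    if features["mean_intensity"] > threshold:
        return "responder"
    return "non-responder"


features = extract([50, 150, 250])
label = interpret(features)
print(features, label)
```

The extraction step stays valid if the interpretation changes: a different threshold or a different question can be applied later to the same stored features without re-acquiring or re-analyzing the images.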

From 2001 until 2005 I had been thinking about, and working on, the design of such a scalable system (Van Osta P., 2004). The core of the system was being built into a framework for the exploration of cells, tissues and model organisms using a microscopy-based reader. The system was designed to be used as a discovery process plug-in which enables cell-based experiments to flow through its modules, converting physical events into a feature space for exploration and interpretation.

The roots and predecessor of my work

The predecessor of this system and a source of inspiration dates back to the late eighties and early nineties of the twentieth century (Geerts H, 1987; Ver Donck L, 1992; Cornelissen F, 1993; Geerts, H, 1992; Geusebroek J.M., 2000; Van Osta P, 2002).

Life Science Laboratory at Janssen Pharmaceutica in 2001

This use of digital microscopy in drug discovery at Janssen Pharmaceutica (Beerse, Belgium) originated long ago from Nanovid (ultra-) microscopy (nanometer particle video ultramicroscopy) (De Mey J., 1981; De Brabander M., 1986; De Brabander M, 1986b; Geuens G, 1986; Geerts H, 1987; De Brabander M, 1989; Geerts H., 1991). Nanovid (ultra-) microscopy itself had its origin in the study of microtubules (De Mey J., 1976; De Brabander M., 1977). Automated calcium (Ca2+) ratio imaging was used for studying the effect of drugs on isolated cardiomyocytes. This research dates back to the mid-eighties of the twentieth century (Borgers M, 1985; Ver Donck L, 1986; Ver Donck L, 1987; Borgers M, 1988; Ver Donck L, 1988; Ver Donck L, 1990; Geerts H, 1989; Olbrich HG, 1991; Ver Donck L, 1991; Ver Donck L, 1992; Cornelissen F, 1993; Ver Donck L, 1993; Cornelussen RN, 1996).

Drug discovery research using cellular disease models with automated microscopy-based systems was done in this environment for many years, before it became fashionable in the outside world (Geerts H, 1989; Ver Donck L, 1992; Cornelissen F, 1993; Nuydens R, 1993; Nuydens R, 1995; Nuydens R, 1995b; Geerts H, 1996; Nuydens R, 1998).

System history

It started in 1983 at Janssen Pharmaceutica in Beerse, Belgium, with the Quantimet 970 Image Analyzer made by Cambridge Instruments. The team at Janssen developed Nanovid microscopy in 1985. The original Janssen automated imaging system, the first of which was built in 1993, was based on Carl Zeiss microscopes, equipped with high-precision Märzhäuser Wetzlar scanning stages and various CCD camera types. Software development was based on the SCIL Image system, running on SGI IRIX systems and developed in ANSI C. I joined the team at the Life Science department at Janssen on 31 Dec. 1996, working on image analysis applications with Frans Cornelissen and Jan-Mark Geusebroek. A patented proprietary object-based auto-focusing system, together with scale-space and spatial color model algorithms, allowed for unprecedented performance. In 2001, I continued my work at MAIA SCIENTIFIC (Geel, Belgium), a subsidiary of Harvard Bioscience, while system development at Janssen was discontinued. The system became known as the MIAS-2 system, equipped with Hudson Robotics and the Linux-based eaZYX software system. In 2005, I left MAIA SCIENTIFIC and later on the company became a subsidiary of Digilabs.

Some references

Acknowledgments

I am indebted, for their pioneering work on automated digital microscopy and High Content Screening (HCS) (1988-2001), to my former colleagues at Janssen Pharmaceutica (1997-2001 CE), such as Frans Cornelissen, Hugo Geerts, Jan-Mark Geusebroek, Roger Nuyens, Rony Nuydens, Luk Ver Donck, Johan Geysen and their colleagues.

Many thanks also to the pioneers of Nanovid microscopy at Janssen Pharmaceutica, Marc De Brabander, Jan De Mey, Hugo Geerts, Marc Moeremans, Rony Nuydens, Kris Ver Donck, Johan Geysen and their colleagues.

I also want to thank all those scientists who have helped me with general information and articles. My special thanks go to Andres Kriete, Robert F. Murphy, J. Paul Robinson, Attila Tarnok, and Guenter K. Valet.

Created 17 July 2005 by Peter Van Osta, latest change on 30 December 2019

Email: pvosta{at}gmail{dot}com