Professional Documents
Culture Documents
Informatics
I
n 1999 the Harvard Business School published identification and validation. The company sought Dr David Sedlock
two case studies highlighting the meteoric rise and signed a significant number of business deals to
of Millennium Pharmaceuticals, Inc as a promi- help it grow as a new biotech company in this rap-
nent company in the newly minted world of idly evolving world. Between its genesis in 1993,
genomics targeted drug discovery1,2. Millennium with its first investment of $8.45 million in seed
had reached its notoriety at that time through a money, until the launch of Velcade in 2003, the
combination of timely deal making, a strong com- first drug that it shepherded through clinical testing
mitment to excellence in developing a new technol- and the regulatory process, Millennium had com-
ogy platform, and demonstrated success with the pleted some 54 deals worth a total of $4.8 billion
companies it had developed a business relationship. (Figure 1). This included a number of acquisitions
During its early growth years Millennium built up and corporate spin-offs as Millennium forged a
a strong reputation for being able to talk the talk business model toward the goal of a fully integrat-
and walk the walk in the area of new drug target ed biopharmaceutical company.
Informatics
Figure 1
The history and value of
Millennium business deals Millennium Corporate Deals
14 2500
12
2000
8 1500
6 1000
4
500
2
0 0
01
94
95
96
97
98
99
02
03
04
05
06
93
00
19
20
Creation of Millennium Discovery gets to the drug discovery process. The various
Informatics platform programme technologies cited in Table 1 that were
Millennium, along with several other companies of created and developed to accomplish this task gen-
its genre (eg Celera Genomics and Human Genome erated significant technology hurdles that were
Sciences) were at the forefront of what many saw attempted to be solved through a variety of mech-
as a new revolution in the world of drug discovery anisms, including a substantial investment in soft-
to help fill what was commonly seen as a dearth of ware developed by Millennium engineers. The
new chemical entities moving through the regula- companys foray into the world of software devel-
tory processes for approval as marketed drugs3. To opment allowed it to create major new tools in the
accomplish this plan Millennium made a sizeable area of genomics data mining, differential expres-
investment in the creation of a technology platform sion analyses, tissue banking, image processing and
to help manage the terabytes of data that were data viewing that enabled the scientists at
being generated as part of the growth of the com- Millennium and at its corporate partners to under-
pany (Table 1). Managing the massive growth of stand quickly the value of the plethora of data
data became an immense undertaking for the com- fields that were being mined.
pany and Millennium responded through the cre-
ation of a significant infrastructure devoted to the Changes to the environment
storage, analysis, processing, access and viewing of The evolution of the Informatics environment mir-
these various data types. The goal has been the rored the companys growth and business priori-
same throughout the industry: to develop tools to ties. Almost from its inception the company
enable the conversion of data to information, and focused on creating the data mining tools needed
the assimilation of this information to knowledge, to support the explosion of genomics data with a
with the intended use of this knowledge to aid in vast infrastructure following to handle the pipelin-
the discovery and development of new treatments ing of sequence data and the concomitant applica-
for known diseases. tion of these data to the identification of discrete
This information flow from data to knowledge proteins that could be associated with disease tar-
creation forms the basic process that comprises our gets. A major effort was devoted to the analysis of
Informatics infrastructure. For the first 10 years of differential expression profiles and the creation of
the companys existence, this Informatics platform numerous cDNA libraries for cell expression
was geared toward the established need to find experiments to help interpret these data and to cre-
more validated drug targets and to apply these tar- ate a database for target validation. This was
Informatics
Bioinformatics Systems
Expression Biobanking
Profiling Repository Gene
DNA Platform Advanced Pathways Animal 2nd Generation
Millennium Sequencing Genomics Cloning Gene Knowledge Imaging Biobanking
Incorporated Platform Platform Library Mining Management Center System
1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007
Assay
Chemical Chemical Project Request Chemistry
Compound Reagent Data System Electronic
Registration Inventory Viewing Notebook
Mangagement Platform Compound
Bioassay Inventory
Data Management
Capture
Cheminformatics Systems
Informatics
Informatics
The most common approach is a mixture of Excel 1. Software system environment One recommen-
files, PowerPoint slides, and e-mail to compile and dation is to implement system lifecycles into the
communicate these data. During the past few platform so that you are able to plan for technolo-
years there has been an attempt to address the gy changes, software system use changes and
integration of these data through electronic note- wholesale environment evolution. Software should
book technology; however, the deployment of this be deployed for a purpose and that purpose should
solution, especially in the biology world, has been translate to use. Once that use changes or even dis-
slow to materialise. The bottom line is that the appears, the software system needs to change as
lifeblood of any biotech or pharmaceutical com- well. All systems require the implementation of a
pany is the ability to make rapid decisions that maintenance schedule but too often the mainte-
correctly predict the outcome of a new drug in the nance becomes a major cost burden. Every system
next test. If you predict correctly, you succeed; if deployed should have decommissioning built in as
you dont, the project, the programme, even the part of its lifecycle. A corollary to this is the need
company can fail. to maintain an accurate network architecture map.
Our informatics systems environment is under con-
A contemporary approach stant pressure from environmental change (appli-
So, how do you construct an informatics environ- cation upgrades, OS patching, database mainte-
ment that contributes to this process that posi- nance, hardware changes, etc). We place a huge
tively impacts the decision-making process? burden on our support staff to maintain this envi-
Millennium and the rest of the industry are grap- ronment amidst these constant intrusions and an
pling with this issue. We have explored different accurate map of our network architecture includ-
solutions and, admittedly, do not propose to have ing an updated application inventory, configura-
developed the ideal approach we are still learn- tion plans, and dependency grid is critical to enable
ing. Over the years, working through several dif- implementing a solid test plan for these changes.
ferent options and working solutions, I can com- 2. Data archiving A second component is the
ment on a few select jewels that have helped us development of an active data archiving model to
navigate this minefield. These can be divided into manage the constant growth of data and the fact
three areas: that only a very small percentage of stored data is
Figure 4
Component analysis of drug
(Ground Level)
Biology Chemistry
Systems Systems
Platform Platform
Bioassay Testing
Target ID blank blank Compound ID
Project ID
Informatics
Informatics
References drug disposition parameters, or integrating in-life consequence. There is predictable overhead asso-
1 Matthews, Sarah. Strategic study parameters with bioanalytical profiles of ciated with the deployment of this architecture
deal-making at Millennium new drug candidates. the most noteworthy being the enhanced com-
Pharmaceuticals. 1999. Case
Our focus has progressed during the past few in plexity of the environment impacting the testing
Study 9-800-032. Harvard
Business School Publishing. three stages: from a data storage and access para- that is needed when making changes to any of the
Cambridge, MA. digm centred on protein targets to the development system components, but we have found the main-
2 Thomke, S, Nimgade, A. of a compound-centric view integrating early dis- tenance of this architecture to be manageable with
Millennium Pharmaceuticals, covery data to the state we have entered, today, a small support staff.
Inc. 1999. Case Study N9-
where we are focusing on the complete project and The migration to this architecture has allowed us
600038. Harvard Business
School Publishing. Cambridge, the recognition that the most valuable data for this to begin planning a roadmap toward a fully inte-
MA. project tends to be the late-stage study data that grated discovery (including preclinical develop-
3 Challenge and opportunity deal with the compilation of numerous animal ment) data environment. Our approach, to date,
on the critical path to new tests. Additionally we need to be able to apply has relied primarily on lightweight point solutions
medical products. 2004. Food
traceability to the data, themselves, as they are governed by the aforementioned architectural
and Drug Administration
Report. March. analysed, processed, translated and transcribed model and driven by use cases and user need asso-
4 Augen, Jeffrey. The evolving into study reports. Integrating across this diverse ciated with specific data integration requirements.
role of information technology platform has definitely presented numerous chal- This model is working for us now, will certainly
in the drug discovery process. lenges. One basic strategy that we are taking is to evolve as the technology evolves, and gives us a
2002. Drug Discovery Today,
spend considerable time analysing work processes framework that allows evolution either within our
Vol. 7, No. 5: 315-323.
5 Davies, N, Peakman, T. 2004. and use cases to before we incorporate any addi- proprietary environment or with vendor applica-
Making the most of your tional changes into our Informatics environment. tions without the need to completely rebuild as we
discovery data. Drug We have found it imperative that the user be a key migrate this forward. DDW
Discovery World. Spring Issue: participant in the design of any solution and a sig-
17-23.
nificant investment in time is taken to work
6 Koontz, Bryan. Redefining
drug discovery informatics. through an iterative process of system deployment.
2005. Drug Discovery World. Since we have already built out a number of sub-
Spring Issue: 85-90. stantial platforms, the most cost-effective
7 Claus, BL, Underwood, DJ. approach has been to leverage against the present
2002. Discovery informatics: its
state as we decommission old, rebuild active appli-
evolving role in drug discovery.
Drug Discovery Today. Vol 7, cations where new features are truly needed, and
No. 18: 957-966. deploy selected new systems (from completely
8 Wang, X, Gorlitsky, R, home-grown to completely commercial out-of-
Almeida, JS. 2005. From XML the-box solutions) where completely new solu-
to RDF: how semantic web
tions are required.
technologies will change the
design of omic standards. The general model that we are adopting as we
Nature Biotechnology. Vol 23, migrate toward a full integration strategy relies on
Issue 9: 1099-1103. a service-oriented framework to connect our vari-
9 Mukherjea, S. 2005. ous disparate data stores through dedicated client
Information retrieval and
interfaces that have been deployed for specific user
knowledge discovery utilising a
biomedical semantic web. groups. These client tools might be standard ven-
Briefings in Bioinformatics. Vol dor products (thick client or web application) or
6, Issue 3: 252-262. our own internally built product for a particular
process. Using our electronic chemistry notebook
as a model we have developed a very facile archi-
tecture that consists of a number of layers. The
key is in the development of a middleware layer Dr David Sedlock is currently the Director of
that allows us to customise the client interface Research Systems at Millennium Pharmaceuticals,
without impacting our vendor products (Figure 5). Inc and is responsible for the development and
This way we can maintain a vendor neutral management of the Informatics environment that
approach to system maintenance and focus only supports drug discovery and early-stage preclinical
on those components that we have created when activities. David received his PhD in
we undertake a system change. This also allows us Bacteriology/Biochemistry from the University of
to control data flow, access and viewing in a man- Wisconsin, Madison, USA and has been directing
ner that is customised for the particular user or drug discovery functions in the pharmaceutical and
user group. This approach does not come without biotechnology industries for more than 25 years.