Chemical Information Bulletin, Volume 53, Number 2, Fall 2001 Page: 36
Chihae Yang1, Paul E. Blower Jr.1, Limin Yu1,
Bhavik Bakshi2, and James F. Rathman2. (1)
LeadScope, Inc, 1275 Kinnear Rd, Columbus, OH
43212, cyang@leadscope.com, (2) Department of
Chemical Engineering, The Ohio State University
Tremendous amounts of data are produced by the
high throughput screening methods currently
employed in drug discovery and product
development. A typical cDNA microarray or
genechip experiment easily generates over 10,000
data points for each array or chip. Inferring
meaningful information from a data set of this size
is a formidable challenge. Most published data
handling techniques include clustering of the gene
sets for sub-categorization and mapping the
classifications for visualization. In this paper,
multiscale Bayesian approaches including principal
component analysis (PCA) and wavelet
transformation (WT) methods are used to extract
subsets and to visualize the data in multidimensions
for comparisons. Data available from the National
Cancer Institute (NCI) are used to demonstrate the
new methods. These include gene expression data
from cDNA microarray studies on 60 cancer cell
lines, and the effects of various drug compounds on
activity for the same 60 cell lines. Similarity in cell
lines and compound-gene correlations are effectively
visualized and quantitatively compared by PCA and
WT.
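As a rough illustration of the PCA step described above, the sketch below projects a small synthetic expression matrix onto its first two principal components via SVD. The matrix, its cluster structure, and all numbers are invented stand-ins for the NCI-60 microarray data, not the authors' actual pipeline:

```python
import numpy as np

# Toy expression matrix: 6 "cell lines" x 8 "genes" (synthetic stand-in
# for the NCI cDNA microarray data; values are random).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))
X += np.array([0, 0, 0, 3, 3, 3])[:, None]  # two loose clusters of cell lines

# PCA via SVD of the mean-centered matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U * s                 # cell lines projected onto principal components

# Fraction of total variance captured by each component.
explained = (s**2) / (s**2).sum()
print(scores[:, :2].shape)     # (6, 2): coordinates for a 2-D visualization
```

Plotting the first two score columns against each other gives the kind of low-dimensional visualization in which similar cell lines fall close together.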
10:30 - 36. Towards hermeneutic knowledge
management in science.
Thomas F. Kochmann1, Werner Dubitzky1, Ruediger
M. Flaig2, and Roland Eils1. (1) Intelligent
Bioinformatics Systems, German Cancer Research
Center, Im Neuenheimer Feld 280, D-69120
Heidelberg, Germany, Fax: +49 6221 42-3620,
t.kochmann@dkfz-heidelberg.de, w.dubitzky@dkfz-heidelberg.de,
(2) Institute for Pharmaceutical
Technology and Biopharmacy, University of
Heidelberg
Knowledge in the life sciences and molecular
sciences has become increasingly complex. As a
consequence, most scientists are forced to specialize
in a narrow field. This problem has recently
triggered a new discussion on novel ways to
organize, manage, and advance scientific knowledge.
In addition to its inherent complexity and increasing
degrees of specialization, a critical dimension of
future knowledge management in science is that
science has become a global endeavor. This calls for
an entirely new approach to scientific knowledge
management. We propose an approach to a
computational global framework for knowledge
management based on the distributed/intelligent-agent paradigm and emergent processes able to
synthesize symbolic and subsymbolic information
within a multiple-level scheme.
11:00 - 37. An extension of recursive partitioning
for mining large screening sets.
Paul E. Blower Jr.1, Jeff Bjoraker1, Denise Fiacco1,
Joseph Verducci2, and Michael Fligner2. (1)
LeadScope, Inc, 1275 Kinnear Rd, Columbus, OH
43212, Fax: 614-675-3732, pblower@leadscope.com,
jbjoraker@leadscope.com, (2) Ohio State University
Statistical data-mining methods have proven to be
powerful tools for investigating correlations between
molecular structure and biological activity. Recursive
partitioning, in particular, offers several advantages
in mining large, diverse data sets resulting from high
throughput screening. We use simulated annealing to
find sets of structural features whose simultaneous
presence or absence best separates the largest group
of most active compounds. The search is
incorporated into a recursive partitioning design to
produce a regression tree for biological activity on
the space of structural fingerprints. Each node is
characterized by some specific combination of
structural features, and the terminal nodes with high
average activities correspond, roughly, to different
classes of compounds. In this talk, we will describe
the statistical techniques used in this new method and
illustrate its application in mining a large dataset.
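A minimal sketch of the partitioning idea follows. It grows a tiny regression tree over binary structural fingerprints, splitting on the bit that best separates mean activity. Note the assumptions: the fingerprints and activities are invented, and a greedy single-bit search stands in for the simulated-annealing search over feature sets that the abstract describes:

```python
import statistics

# Toy data: binary structural fingerprints (one tuple per compound)
# and a measured activity for each compound. Values are illustrative.
fingerprints = [
    (1, 0, 1), (1, 1, 1), (1, 0, 0), (0, 1, 0),
    (0, 0, 1), (0, 1, 1), (1, 1, 0), (0, 0, 0),
]
activity = [8.1, 7.9, 7.5, 2.2, 1.9, 2.5, 7.8, 1.6]

def best_split(rows, y):
    """Pick the fingerprint bit whose presence/absence best separates
    mean activity (greedy stand-in for the simulated-annealing search)."""
    best = None
    for bit in range(len(rows[0])):
        left = [a for r, a in zip(rows, y) if r[bit] == 1]
        right = [a for r, a in zip(rows, y) if r[bit] == 0]
        if not left or not right:
            continue
        gap = abs(statistics.mean(left) - statistics.mean(right))
        if best is None or gap > best[1]:
            best = (bit, gap)
    return best

def partition(rows, y, depth=0, max_depth=2):
    """Grow a small regression tree; leaves report mean activity."""
    split = best_split(rows, y)
    if split is None or depth == max_depth or len(y) < 3:
        return {"mean": statistics.mean(y), "n": len(y)}
    bit = split[0]
    present = [(r, a) for r, a in zip(rows, y) if r[bit] == 1]
    absent = [(r, a) for r, a in zip(rows, y) if r[bit] == 0]
    return {
        "bit": bit,
        "present": partition([r for r, _ in present], [a for _, a in present], depth + 1),
        "absent": partition([r for r, _ in absent], [a for _, a in absent], depth + 1),
    }

tree = partition(fingerprints, activity)
print(tree["bit"])  # the root split; high-mean leaves ~ active compound classes
```

In the real setting each node would be labeled by a combination of structural features rather than a single bit, with high-activity terminal nodes corresponding to candidate compound classes.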
11:30 - 38. Informative library design: overview
and method.
Jennifer L. Miller, Consultant, 935 College Avenue,
Menlo Park, CA 94025, and Erin K. Bradley,
Chemical and Physical Sciences, DuPont
Pharmaceuticals Research Laboratories
One can view screening a compound library as
performing a collection of parallel experiments. As
with any set of scientific experiments, one wishes to
learn as much as possible with minimum effort. Yet,
by being based on diversity, contemporary screening
libraries do not take full advantage of the information
gain possible in this parallel setting. Informative
library design was developed explicitly to select the
set of compounds (set of experiments) that
maximizes the information gain per assay. This
information theoretic approach will be discussed in
detail including a discussion of the design of both
combinatorial and discrete libraries.