The biological questions we address
In the broadest terms, we seek to understand how information is encoded and
dynamically utilized in living eukaryotic genomes. We focus specifically on those
areas of the genome that regulate chromosomal functions such as
transcription, DNA replication and repair, recombination, and chromosome
segregation.
DNA is an elegant molecule, but it carries its information in a language that
consists of only four letters. The molecular simplicity of DNA imposes
practical limits on the complexity and types of information it can encode.
How do complex organisms overcome these limitations? Conceptually,
information in living genomes can be visualized as existing in layers, with
the information being more diffusely coded in each ascending layer. The
primary layer is best represented by protein-coding DNA, which operates
according to the relatively inflexible universal genetic code. A second layer
encodes regulatory information through the occurrence of millions of
degenerate sequence motifs potentially recognized by “sequence-specific”
DNA-binding proteins such as transcription factors. A third
layer of sequence information is very diffusely encoded over hundreds of
bases and guides the positioning and occupancy of nucleosomes, the basic
units of DNA packaging. The final layer is composed of the nucleosomes
themselves. Nucleosomes greatly extend the information-coding capacity of the
genome by allowing overlapping, redundant, and even illegitimate information
to be safely encoded in DNA sequences. Nucleosomes accomplish this by
blocking regulatory protein access to most of the genome, and by dynamically
allowing access to relatively small portions of the genome that are utilized
specifically in a given cellular environment. We seek to characterize
quantitatively how the regulation of genome accessibility occurs and how it
is coordinated with the underlying layers of information encoded in DNA.
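To make the idea of a "degenerate sequence motif" concrete, here is a minimal sketch of how such a motif might be located in a DNA sequence. It assumes the motif is written in standard IUPAC nucleotide codes; the motif CAYGTG and the test sequence are hypothetical illustrations, not binding sites taken from our data.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_to_pattern(motif):
    """Translate a degenerate IUPAC motif into a regex pattern string."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_motif(sequence, motif):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # Wrapping the pattern in a lookahead reports overlapping occurrences too.
    lookahead = re.compile("(?=" + motif_to_pattern(motif) + ")")
    return [m.start() for m in lookahead.finditer(sequence.upper())]

# Hypothetical motif and sequence, for illustration only.
hits = find_motif("TTCACGTGAACATGTGCC", "CAYGTG")
print(hits)
```

Because a short degenerate motif matches many sequences, a scan like this over a whole genome returns far more candidate sites than are actually occupied in vivo, which is why the chromatin layers described above matter.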
Yeast, worms, and humans: A strategy for linking basic biology and medicine
The projects in my laboratory are united by the scientific goal of understanding
relationships between chromatin, transcription factor targeting, and gene
expression. We use three biological systems: (1) S. cerevisiae (hereafter “yeast”) to address basic
molecular mechanisms; (2) C. elegans
to test the importance of those mechanisms in a simple multicellular
organism; and (3) cell lines and clinical samples to directly interrogate
chromatin function in human development and disease. The genomes of these
organisms span three orders of magnitude in size (12 Mb, 100 Mb, and 3000 Mb,
respectively) and a wide range of genome complexity (~50% coding, ~25%
coding, and ~1.5% coding, respectively). Use of these systems, with C. elegans serving as a
“stepping stone” to bridge yeast and human studies, permits us to
quickly bring concepts discovered in model systems to medical relevance.
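The genome figures quoted above can be turned into a rough estimate of how much coding sequence each organism carries. This is back-of-the-envelope arithmetic using only the approximate sizes and coding fractions stated in the text; the function and variable names are ours, chosen for illustration.

```python
# Genome size (Mb) and approximate coding fraction, as quoted in the text.
genomes = {
    "yeast": (12, 0.50),
    "C. elegans": (100, 0.25),
    "human": (3000, 0.015),
}

def coding_mb(size_mb, coding_fraction):
    """Approximate amount of coding sequence, in Mb."""
    return size_mb * coding_fraction

for name, (size_mb, fraction) in genomes.items():
    print(f"{name}: ~{coding_mb(size_mb, fraction):.0f} Mb coding out of {size_mb} Mb")

# Fold differences between human and yeast.
size_fold = genomes["human"][0] / genomes["yeast"][0]
coding_fold = coding_mb(*genomes["human"]) / coding_mb(*genomes["yeast"])
print(f"genome size differs {size_fold:.0f}-fold, coding content {coding_fold:.1f}-fold")
```

On these numbers the human genome is 250 times larger than the yeast genome but contains well under ten times as much coding sequence, so most of the added sequence is non-coding.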
The major projects in the lab are as follows:
Project Group 1: Using yeast transcription factors to investigate regulation of
protein-genome interactions
We use the localization of yeast proteins as model systems to investigate in vivo
DNA-binding specificity, and how it is regulated under different
environmental and developmental conditions. Genome-wide localization of
proteins is determined by a method commonly called "ChIP-chip",
which stands for Chromatin Immunoprecipitation followed by microarray
analysis. We also measure transcription genome-wide to study the biological
implications of protein-DNA interactions.
Project Group 2: Using in vitro methods to identify factors that regulate in vivo
target selection by DNA-binding proteins
In collaboration with Neil Clarke's group (now at the Genome Institute of Singapore), we have
developed a new method for determining the DNA-binding specificity of
proteins. In DIP-chip (DNA immunoprecipitation with microarray detection),
protein-DNA complexes are isolated from an in vitro mixture of purified protein and naked genomic DNA.
Whole-genome DNA microarrays are used to identify the protein-bound DNA
fragments, and the sequence of the identified fragments is used to derive
binding site descriptions. The DIP-chip protocol can also be used for a rather
different purpose: comparing the sites of binding in vitro with the sites of
binding in vivo, as defined by ChIP-chip. Comparisons of DIP-chip and
ChIP-chip experiments will be useful in determining how much of the
specificity of in vivo interactions depends on chromatin and other factors,
and how much is inherent to the protein and the DNA themselves.
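The in vitro versus in vivo comparison can be sketched as a simple partition of bound sites. This is an illustration only, assuming each experiment has already been reduced to a list of bound array probes; the probe identifiers are hypothetical, and real analyses would also need to handle enrichment thresholds and array resolution.

```python
def compare_binding(in_vitro_sites, in_vivo_sites):
    """Partition sites by whether they are bound in vitro, in vivo, or both."""
    in_vitro = set(in_vitro_sites)
    in_vivo = set(in_vivo_sites)
    return {
        # Bound on naked DNA and in cells: consistent with intrinsic specificity.
        "both": in_vitro & in_vivo,
        # Bound on naked DNA only: possibly blocked in vivo, e.g. by chromatin.
        "in_vitro_only": in_vitro - in_vivo,
        # Bound in cells only: possibly dependent on cofactors or context.
        "in_vivo_only": in_vivo - in_vitro,
    }

# Hypothetical probe identifiers, for illustration only.
dip_chip = ["probe_001", "probe_002", "probe_003"]
chip_chip = ["probe_002", "probe_003", "probe_004"]

for category, sites in compare_binding(dip_chip, chip_chip).items():
    print(category, sorted(sites))
```

The interpretations in the comments are working hypotheses of the kind the comparison is meant to test, not conclusions.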
Project Group 3: Combining biochemical and genomic methods to study genome and
chromatin organization
Our group works to characterize how DNA is packaged, focusing in
particular on the regulation of nucleosome dynamics. We have published
results that provide evidence that the basic repeating units of eukaryotic
chromatin, nucleosomes, are depleted from active regulatory elements
throughout the Saccharomyces cerevisiae
genome in vivo. Alterations in the global transcriptional program resulted in
increased nucleosome occupancy at repressed promoters and decreased
nucleosome occupancy at promoters that became active. Given the conservation
of sequence and function among components of both chromatin and the
transcriptional machinery, nucleosome depletion at promoters may be a
fundamental feature of eukaryotic transcriptional regulation. We are
continuing to study the regulation of nucleosome occupancy genome-wide in yeast.
We are interested in bringing
technologies and concepts we develop in model systems to the study of human
biology and health. One example is FAIRE (Formaldehyde-Assisted
Isolation of Regulatory Elements), a simple, low-cost
method for the isolation and identification of nucleosome-depleted regions of
chromatin genome-wide. FAIRE was initially discovered in yeast, where we
observed that if formaldehyde-crosslinked chromatin was subjected to
phenol-chloroform extraction, nucleosome-depleted sequences were recovered in
the aqueous phase with much greater efficiency than coding sequences. FAIRE
presumably works because covalently crosslinked protein–DNA complexes
are retained at the interface of the organic and aqueous solvents, whereas
DNA that is not crosslinked (or trapped by crosslinks) escapes into the
aqueous phase. Higher-resolution comparison of FAIRE signal to nucleosome
mapping data revealed that nearly all yeast genomic regions depleted in
histone H3 and H4-Myc ChIPs were enriched by FAIRE. Histone proteins are
likely to dominate the crosslinking profile because of their abundant primary
amines and close proximity to DNA, both required for crosslinking. We have
developed FAIRE as an alternative method for identification of open chromatin
sites in human chromatin. FAIRE isolates regulatory regions in human cells
that overlap to a large degree with DNaseI hypersensitive regions, but it also
detects a unique set of loci. Our discovery of FAIRE in yeast and its
continued development in human cells provide the foundation for projects
designed to create a human open chromatin atlas, and for our proposal to profile
chromatin in human cancer.
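The comparison between FAIRE regions and DNaseI hypersensitive regions amounts to an interval-overlap test. The sketch below is a minimal illustration, assuming each region is a (chromosome, start, end) tuple with half-open coordinates; the coordinates are invented, and a genome-scale analysis would use sorted intervals or an interval tree rather than this all-against-all scan.

```python
def overlaps(a, b):
    """True if two half-open regions (chrom, start, end) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def split_by_overlap(query, reference):
    """Split query regions into those overlapping any reference region and the rest."""
    shared, unique = [], []
    for region in query:
        if any(overlaps(region, ref) for ref in reference):
            shared.append(region)
        else:
            unique.append(region)
    return shared, unique

# Hypothetical regions, for illustration only.
faire_regions = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 150)]
dnase_regions = [("chr1", 250, 400), ("chr2", 500, 600)]

shared, unique = split_by_overlap(faire_regions, dnase_regions)
print("overlapping DNaseI sites:", shared)
print("FAIRE only:", unique)
```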
Project Group 4: Establishment of Caenorhabditis elegans
as a model metazoan for the study of protein-DNA interactions during
development
Yeast is a fabulous system, but we are also interested in studying aspects of
chromatin regulation that are required for development. For this purpose, we
initiated studies of C. elegans. C. elegans is at the forefront of both
large-scale genomic research and gene function discovery. It was the first
animal to have a fully mapped and sequenced genome. Genomic approaches
including EST projects, SAGE sequencing, an ORFeome library, extensive yeast
two-hybrid datasets, microarray profiling, and genome-wide RNAi screens have
provided a wealth of information regarding gene structure and function
(www.wormbase.org). The versatility of C.
elegans for experimental manipulation has led to a large collection of mutant
alleles and many well-known discoveries of basic biology (www.wormbook.org).
Worms also offer unique advantages for the study of chromatin
factors regulating meiosis and germline development, which are notoriously
difficult to study in mammalian systems. All of these features make realistic
the goal of understanding how a genome sequence directs animal development. C. elegans has traditionally been
exploited as a model for genetics, cell biology, and neurobiology, but
application of biochemical approaches has lagged. We sought to extend
ChIP-chip, which we helped to develop in yeast, to this important model
system. For this purpose we used the C. elegans dosage compensation complex
(DCC) proteins. Because the DCC binds
specifically to X and not to the autosomes, we could measure the specificity
and sensitivity of our assay and optimize procedures to maximize the ratio of
signal (X-chromosome hybridization) to noise (autosome hybridization).
Furthermore, we were able to cross-validate experiments using antibodies
against distinct components of the DCC. Therefore, in addition to yielding
important biological discoveries, this test case offered technical advantages that
allowed protocol development and objective assessment of our methods. This
led to successful ChIPs of other factors, including the histone variant H2A.Z
and the transcription factor NFI-1.
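The signal-to-noise measure used to optimize the DCC ChIP-chip protocol can be sketched as a ratio of mean X-chromosome enrichment to mean autosomal enrichment. This is a simplified illustration, assuming each probe is a (chromosome, enrichment) pair on a linear scale; the values are invented, and the statistic we actually used during protocol development may have differed.

```python
from statistics import mean

def signal_to_noise(probes, signal_chrom="X"):
    """Ratio of mean enrichment on the signal chromosome to mean enrichment elsewhere.

    probes: iterable of (chromosome, enrichment) pairs, enrichment on a linear scale.
    """
    probes = list(probes)
    signal = [value for chrom, value in probes if chrom == signal_chrom]
    noise = [value for chrom, value in probes if chrom != signal_chrom]
    if not signal or not noise:
        raise ValueError("need probes on both the signal chromosome and the autosomes")
    return mean(signal) / mean(noise)

# Hypothetical enrichment values, for illustration only.
example = [("X", 4.0), ("X", 6.0), ("I", 1.0), ("II", 1.0)]
print(signal_to_noise(example))
```

A protocol change that raises this ratio improves the separation between true DCC binding on X and background hybridization on the autosomes.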
We are currently funded as part of NHGRI’s modENCODE project, a large effort
to identify elements encoded in DNA that control chromatin behavior
in C. elegans.
Site updated June 18, 2007
A similar description of the Lab's research can be found on the Department of
Biology Faculty Page.