Archive for the ‘PNAS’ Category

Systematic discovery of regulatory motifs in conserved regions of the human genome, including thousands of CTCF insulator sites

May 29, 2007

Xiaohui Xie, Tarjei S. Mikkelsen, Andreas Gnirke, Kerstin Lindblad-Toh, Manolis Kellis, and Eric S. Lander

Conserved noncoding elements (CNEs) constitute the majority of sequences under purifying selection in the human genome, yet their function remains largely unknown. Experimental evidence suggests that many of these elements play regulatory roles, but little is known about regulatory motifs contained within them. Here we describe a systematic approach to discover and characterize regulatory motifs within mammalian CNEs by searching for long motifs (12–22 nt) with significant enrichment in CNEs and studying their biochemical and genomic properties. Our analysis identifies 233 long motifs (LMs), matching a total of approximately 60,000 conserved instances across the human genome. These motifs include 16 previously known regulatory elements, such as the histone 3′-UTR motif and the neuron-restrictive silencer element, as well as striking examples of novel functional elements. The most highly enriched motif (LM1) corresponds to the X-box motif known from yeast and nematode. We show that it is bound by the RFX1 protein and identify thousands of conserved motif instances, suggesting a broad role for the RFX family in gene regulation. A second group of motifs (LM2*) does not match any previously known motif. We demonstrate by biochemical and computational methods that it defines a binding site for the CTCF protein, which is involved in insulator function to limit the spread of gene activation. We identify nearly 15,000 conserved sites that likely serve as insulators, and we show that nearby genes separated by predicted CTCF sites show markedly reduced correlation in gene expression. These sites may thus partition the human genome into domains of expression.
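The idea of searching for motifs enriched in conserved sequence can be illustrated with a small sketch. This is not the authors' pipeline: the toy sequences, the k-mer length, and the function names `count_kmers` and `enrichment` are hypothetical, and the score is a plain frequency ratio with a pseudocount rather than the significance test used in the paper.

```python
from collections import Counter

def count_kmers(seqs, k):
    """Count every k-mer across a list of sequences (overlapping windows)."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts

def enrichment(conserved, background, k, pseudocount=1.0):
    """Score each k-mer seen in `conserved` by its frequency ratio
    relative to `background` (pseudocount avoids division by zero)."""
    fg = count_kmers(conserved, k)
    bg = count_kmers(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    scores = {}
    for kmer, n in fg.items():
        fg_freq = (n + pseudocount) / (fg_total + pseudocount)
        bg_freq = (bg.get(kmer, 0) + pseudocount) / (bg_total + pseudocount)
        scores[kmer] = fg_freq / bg_freq
    return scores

# Toy data, made up for illustration (not real genomic sequence).
conserved = ["ACGTGGCCACTA", "TTGGCCACGA", "GGCCACTTAC"]
background = ["ACGTACGTACGT", "TTTTAAAACCCC", "GATTACAGATTA"]

scores = enrichment(conserved, background, k=6)
top_kmer = max(scores, key=scores.get)
print(top_kmer, round(scores[top_kmer], 2))
```

In this toy input, the 6-mer that recurs in all three "conserved" sequences and never in the background receives the highest score. A real analysis would use much longer motifs, allow degenerate positions, and correct for multiple testing.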

Discovering transcriptional regulatory regions in Drosophila by a nonalignment method for phylogenetic footprinting

May 29, 2007

Alona Sosinsky, Barry Honig, Richard S. Mann, and Andrea Califano

The functional annotation of the nonprotein-coding DNA of eukaryotic genomes is a problem of central importance. Phylogenetic footprinting methods, which attempt to identify functional regulatory regions by comparing orthologous genomic sequences of evolutionarily related species, have shown promising results. The main advantage of this class of approaches is that they do not require any knowledge of the regulating transcription factors. Here we describe a method called Enhancer Detection using only Genomic Information (EDGI), which integrates a traditional motif-discovery algorithm with a local permutation-clustering algorithm. Together, they can identify large regulatory elements (e.g., enhancers) as evolutionarily conserved order-independent clusters of short conserved motifs. We show that EDGI can distinguish between established sets of known enhancers and nonenhancers with 88% accuracy, rivaling predictions by methods that rely on the knowledge of the regulating transcription factors and their DNA-binding specificities. We tested EDGI’s performance on a set of Drosophila genomes. Our results demonstrate that comparative genomic analysis of multiple closely related species has substantial power to identify key functional elements without additional biological knowledge.
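The notion of an "order-independent cluster of short conserved motifs" can be made concrete with a rough sketch. This is not EDGI itself: exact shared k-mers stand in for the motif-discovery step, a fixed sliding window stands in for the permutation-clustering step, and the sequences and the function names `kmers`, `conserved_kmers`, and `best_cluster` are hypothetical.

```python
def kmers(seq, k):
    """Return the set of k-mers present in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def conserved_kmers(orthologs, k):
    """k-mers present in every orthologous sequence, at any position."""
    shared = kmers(orthologs[0], k)
    for seq in orthologs[1:]:
        shared &= kmers(seq, k)
    return shared

def best_cluster(seq, motifs, k, window):
    """Return (start, count) for the first window of `seq` that contains
    the largest number of distinct conserved k-mers, in any order."""
    best_start, best_count = 0, -1
    for start in range(max(1, len(seq) - window + 1)):
        region = seq[start:start + window]
        count = len(kmers(region, k) & motifs)
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count

# Toy orthologs, made up for illustration: the same two motifs occur
# in opposite order, so a sequence alignment would not line them up.
species_a = "TTTTTTGATTATCCGGTTTTTTT"
species_b = "CACACCCGGTACGATTACACAC"

motifs = conserved_kmers([species_a, species_b], k=5)
print(sorted(motifs))
print(best_cluster(species_a, motifs, k=5, window=12))
print(best_cluster(species_b, motifs, k=5, window=12))
```

Because the comparison is on sets of motifs rather than aligned positions, the cluster is found in both species even though the motif order differs between them.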

Small dsRNAs induce transcriptional activation in human cells

January 9, 2007

Long-Cheng Li, Steven T. Okino, Hong Zhao, Deepa Pookot, Robert F. Place, Shinji Urakami, Hideki Enokida, and Rajvir Dahiya

Department of Urology, Veterans Affairs Medical Center and University of California, San Francisco, CA 94121

…… In conclusion, we have identified several dsRNAs that activate gene expression by targeting noncoding regulatory regions in gene promoters. These findings reveal a more diverse role for small RNA molecules in the regulation of gene expression than previously recognized and identify a potential therapeutic use for dsRNA in targeted gene activation.

Using the principle of entropy maximization to infer genetic interaction networks from gene expression patterns

January 9, 2007

Timothy R. Lezon, Jayanth R. Banavar, Marek Cieplak, Amos Maritan, and Nina V. Fedoroff

Pennsylvania State University.

We describe a method based on the principle of entropy maximization to identify the gene interaction network with the highest probability of giving rise to experimentally observed transcript profiles. In its simplest form, the method yields the pairwise gene interaction network, but it can also be extended to deduce higher-order interactions. Analysis of microarray data from genes in Saccharomyces cerevisiae chemostat cultures exhibiting energy metabolic oscillations identifies a gene interaction network that reflects the intracellular communication pathways that adjust cellular metabolic activity and cell division to the limiting nutrient conditions that trigger metabolic oscillations. The success of the present approach in extracting meaningful genetic connections suggests that the maximum entropy principle is a useful concept for understanding living systems, as it is for other complex, nonequilibrium systems.
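For continuous data constrained only by means and pairwise covariances, the maximum-entropy distribution is a multivariate Gaussian, and its pairwise couplings are given by the inverse of the covariance matrix. The sketch below illustrates that step under stated assumptions: the expression values are made up, the covariance matrix is assumed to be invertible (real microarray data with more genes than samples would need extra handling), the sign convention (couplings as the negative inverse covariance) is a choice made here, and the function names are hypothetical.

```python
def covariance(samples):
    """Sample covariance matrix; `samples` has one row per condition
    and one column per gene."""
    n = len(samples)
    g = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(g)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j])
                 for row in samples) / (n - 1)
             for j in range(g)] for i in range(g)]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def interaction_matrix(samples):
    """Pairwise couplings, taken here as the negative inverse covariance."""
    return [[-v for v in row] for row in invert(covariance(samples))]

# Toy expression levels: 5 conditions (rows) x 3 genes (columns).
samples = [
    [1.0, 2.0, 0.5],
    [2.0, 1.0, 0.7],
    [3.0, 4.0, 0.2],
    [4.0, 3.0, 0.9],
    [5.0, 6.0, 0.4],
]

for row in interaction_matrix(samples):
    print([round(v, 3) for v in row])
```

The off-diagonal couplings differ from the raw correlations because the inverse removes the part of each pairwise correlation that is explained through the other genes.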