Posts Tagged ‘computing’
Adding a Little Reality to Building Ontologies for Biology
Written by Scott Christley et al. on September 3, 2010 – 7:00 am
Many areas of biology are open to mathematical and computational modelling. The application of discrete, logical formalisms defines the field of biomedical ontologies. Ontologies have been put to many uses in bioinformatics. The most widespread is for description of entities about which data have been collected, allowing integration and analysis across multiple resources. There are now over 60 ontologies in active use, increasingly developed as large, international collaborations. There are, however, many opinions on how ontologies should be authored; that is, what is appropriate for representation. Recently, a common opinion has been the “realist” approach that places restrictions upon the style of modelling considered to be appropriate.
Methodology/Principal Findings: Here, we use a number of case studies for describing the results of biological experiments. We investigate the ways in which these could be represented using both realist and non-realist approaches; we consider the limitations and advantages of each of these models.
Conclusions/Significance: From our analysis, we conclude that while realist principles may enable straightforward modelling for some topics, there are crucial aspects of science and the phenomena it studies that do not fit into this approach; realism appears to be over-simplistic, which, perversely, results in overly complex ontological models. We suggest that it is impossible to avoid compromise in modelling ontology; a clearer understanding of these compromises will better enable appropriate modelling, fulfilling the many needs for discrete mathematical models within computational biology.
Tags: biology, computing, news
Posted in Computational biology | Comments Off
Gradient Descent Optimization in Gene Regulatory Pathways
Written by Scott Christley et al. on September 3, 2010 – 7:00 am
Gene Regulatory Networks (GRNs) have become a major focus of interest in recent years. Elucidating the architecture and dynamics of large-scale gene regulatory networks is an important goal in systems biology. Knowledge of gene regulatory networks further gives insight into gene regulatory pathways. This information leads to many potential applications in medicine and molecular biology, examples of which are identification of metabolic pathways, complex genetic diseases, drug discovery and toxicology analysis. High-throughput technologies allow studying various aspects of gene regulatory networks on a genome-wide scale, and we will discuss recent advances as well as limitations and future challenges for gene network modeling. Novel approaches are needed both to infer the causal genes and to generate hypotheses on the underlying regulatory mechanisms.
Methodology: In the present article, we introduce a new method for identifying a set of optimal gene regulatory pathways by using structural equations as a tool for modeling gene regulatory networks. The method first generates data on reaction flows in a pathway. A set of constraints is then formulated incorporating weighting coefficients. Finally, the gene regulatory pathways are obtained through optimization of an objective function with respect to these weighting coefficients. The effectiveness of the present method is successfully tested on ten gene regulatory networks existing in the literature. A comparative study with the existing extreme pathway analysis also forms a part of this investigation. The results compare favorably with earlier experimental results. The validated pathways point to a combination of previously documented and novel findings.
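The final step described above, optimizing an objective function with respect to weighting coefficients, can be illustrated with a toy gradient descent. This is a minimal sketch under stated assumptions, not the article's formulation: the least-squares objective, the `flows` and `target` values, and the box constraint keeping each weight in [0, 1] are all hypothetical choices made for illustration.

```python
# Toy sketch: gradient descent over pathway weighting coefficients.
# The objective, data, and [0, 1] bounds are illustrative assumptions,
# not the objective function or constraints used in the article.


def objective(weights, flows, target):
    """Sum of squared differences between weighted flows and a target."""
    total = 0.0
    for row, t in zip(flows, target):
        predicted = sum(w * f for w, f in zip(weights, row))
        total += (predicted - t) ** 2
    return total


def gradient(weights, flows, target):
    """Analytic gradient of the objective with respect to each weight."""
    grad = [0.0] * len(weights)
    for row, t in zip(flows, target):
        residual = sum(w * f for w, f in zip(weights, row)) - t
        for j, f in enumerate(row):
            grad[j] += 2.0 * residual * f
    return grad


def gradient_descent(flows, target, step=0.05, iterations=2000):
    """Minimize the objective, clipping each weight into [0, 1] after every step."""
    weights = [0.5] * len(flows[0])
    for _ in range(iterations):
        grad = gradient(weights, flows, target)
        weights = [min(1.0, max(0.0, w - step * g)) for w, g in zip(weights, grad)]
    return weights


# Hypothetical reaction-flow data: rows are observations, columns are reactions.
flows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [0.2, 0.7, 0.9]
weights = gradient_descent(flows, target)
```

With this made-up data the target is exactly reproducible, so the weights settle near 0.2 and 0.7. The step size matters: it has to stay small relative to the curvature of the objective, or the iteration diverges.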
Conclusions: We show that our method can correctly identify the causal genes and effectively output experimentally verified pathways. The present method has been successful in deriving the optimal regulatory pathways for all the regulatory networks considered. The biological significance and applicability of the optimal pathways have also been discussed. Finally, the usefulness of the present method for genetic engineering is illustrated with an example.
Tags: biology, computing, news
Posted in Computational biology | Comments Off
The Proneural Molecular Signature Is Enriched in Oligodendrogliomas and Predicts Improved Survival among Diffuse Gliomas
Written by Scott Christley et al. on September 3, 2010 – 7:00 am
The Cancer Genome Atlas Project (TCGA) has produced an extensive collection of ‘-omic’ data on glioblastoma (GBM), resulting in several key insights on expression signatures. Despite the richness of TCGA GBM data, the absence of lower grade gliomas in this data set prevents analysis of genes related to progression and the uncovering of predictive signatures. A complementary dataset exists in the form of the NCI Repository for Molecular Brain Neoplasia Data (Rembrandt), which contains molecular and clinical data for diffuse gliomas across the full spectrum of histologic class and grade. Here we present an investigation of the significance of the TCGA consortium's expression classification when applied to Rembrandt gliomas. We demonstrate that the proneural signature predicts improved clinical outcome among 176 Rembrandt gliomas that include all histologies and grades, including GBMs (log rank test p = 1.16e-6), but also among 75 grade II and grade III samples (p = 2.65e-4). This gene expression signature was enriched in tumors with oligodendroglioma histology and also predicted improved survival in this tumor type (n = 43, p = 1.25e-4). Thus, expression signatures identified in the TCGA analysis of GBMs also have intrinsic prognostic value for lower grade oligodendrogliomas, and likely represent important differences in tumor biology with implications for treatment and therapy. Integrated DNA and RNA analysis of low-grade and high-grade proneural gliomas identified increased expression and gene amplification of several genes including GLIS3, TGFB2, TNC, AURKA, and VEGFA in proneural GBMs, with corresponding loss of DLL3 and HEY2. Pathway analysis highlights the importance of the Notch and Hedgehog pathways in the proneural subtype.
Tags: biology, computing, news
Posted in Computational biology | Comments Off
Comparative Analysis of Plasmids in the Genus Listeria
Written by Scott Christley et al. on September 2, 2010 – 7:00 am
We sequenced four plasmids of the genus Listeria, including two novel plasmids from L. monocytogenes serotype 1/2c and 7 strains as well as one from the species L. grayi. A comparative analysis in conjunction with 10 published Listeria plasmids revealed a common evolutionary background.
Principal Findings: All analysed plasmids share a common replicon-type related to theta-replicating plasmid pAMbeta1. Nonetheless, plasmids could be broadly divided into two distinct groups based on replicon diversity and the genetic content of the respective plasmid groups. Listeria plasmids are characterized by the presence of a large number of diverse mobile genetic elements and a commonly occurring translesion DNA polymerase, both of which have probably contributed to the evolution of these plasmids. We detected small non-coding RNAs on some plasmids that were homologous to those present on the chromosome of L. monocytogenes EGD-e. Multiple genes involved in heavy metal resistance (cadmium, copper, arsenite) as well as multidrug efflux (MDR, SMR, MATE) were detected on all listerial plasmids. These factors promote bacterial growth and survival in the environment and may have been acquired as a result of selective pressure due to the use of disinfectants in food processing environments. MDR efflux pumps have also recently been shown to promote transport of cyclic diadenosine monophosphate (c-di-AMP) as a secreted molecule able to trigger a cytosolic host immune response following infection.
Conclusions: The comparative analysis of 14 plasmids of the genus Listeria implied the existence of a common ancestor. Ubiquitously occurring MDR genes on plasmids and their role in listerial infection now deserve further attention.
Tags: biology, computing, news
Posted in Computational biology | Comments Off
Community Landscapes: An Integrative Approach to Determine Overlapping Network Module Hierarchy, Identify Key Nodes and Predict Network Dynamics
Written by Scott Christley et al. on September 2, 2010 – 7:00 am
Network communities help the functional organization and evolution of complex networks. However, developing a method that is both fast and accurate and that provides modular overlaps and partitions of a heterogeneous network has proven rather difficult.
Methodology/Principal Findings: Here we introduce the novel concept of ModuLand, an integrative method family determining overlapping network modules as hills of an influence function-based, centrality-type community landscape, and including several widely used modularization methods as special cases. As various adaptations of the method family, we developed several algorithms, which provide an efficient analysis of weighted and directed networks, and (1) determine pervasively overlapping modules with high resolution; (2) uncover a detailed hierarchical network structure allowing an efficient, zoom-in analysis of large networks; (3) allow the determination of key network nodes; and (4) help to predict network dynamics.
Conclusions/Significance: The concept opens a wide range of possibilities to develop new approaches and applications including network routing, classification, comparison and prediction.
Tags: biology, computing, news
Posted in Computational biology | Comments Off
Prediction of Nucleosome Positioning Based on Transcription Factor Binding Sites
Written by Scott Christley et al. on September 1, 2010 – 7:00 am
The DNA of all eukaryotic organisms is packaged into nucleosomes, the basic repeating units of chromatin. The nucleosome consists of a histone octamer around which a DNA core is wrapped and the linker histone H1, which is associated with linker DNA. By altering the accessibility of DNA sequences, the nucleosome has profound effects on all DNA-dependent processes. Understanding the factors that influence nucleosome positioning is of great importance for the study of genomic control mechanisms. Transcription factors (TFs) have been suggested to play a role in nucleosome positioning in vivo.
Principal Findings: Here, the minimum redundancy maximum relevance (mRMR) feature selection algorithm, the nearest neighbor algorithm (NNA), and the incremental feature selection (IFS) method were used to identify the most important TFs that either favor or inhibit nucleosome positioning by analyzing the numbers of transcription factor binding sites (TFBSs) in 53,021 nucleosomal DNA sequences and 50,299 linker DNA sequences. A total of nine important families of TFs were extracted from 35 families, and the overall prediction accuracy was 87.4% as evaluated by the jackknife cross-validation test.
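The evaluation step mentioned above, a nearest neighbor classifier scored by a jackknife (leave-one-out) test, can be sketched as follows. This is an illustrative sketch only: the feature vectors are made-up TFBS counts, and squared Euclidean distance is an assumption, since the abstract does not state which distance measure the authors' NNA uses. The mRMR ranking and IFS steps are not shown.

```python
# Sketch of jackknife (leave-one-out) evaluation of a nearest-neighbor classifier.
# Feature vectors stand in for per-family TFBS counts; the data are made up, and
# squared Euclidean distance is an assumed choice of distance measure.


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbor_label(query, samples, labels):
    """Return the label of the training sample closest to the query."""
    best = min(range(len(samples)), key=lambda i: squared_distance(query, samples[i]))
    return labels[best]


def jackknife_accuracy(samples, labels):
    """Hold out each sample once, predict it from the rest, and report accuracy."""
    correct = 0
    for i in range(len(samples)):
        rest = samples[:i] + samples[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if nearest_neighbor_label(samples[i], rest, rest_labels) == labels[i]:
            correct += 1
    return correct / len(samples)


# Hypothetical TFBS counts for two TF families per sequence.
samples = [[0, 1], [1, 0], [1, 1], [5, 6], [6, 5], [6, 6]]
labels = ["nucleosome", "nucleosome", "nucleosome", "linker", "linker", "linker"]
accuracy = jackknife_accuracy(samples, labels)
```

In an incremental feature selection loop, a score like `accuracy` would be recomputed as ranked features are added one at a time, and the feature subset with the best score kept.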
Conclusions: Our results are consistent with the notion that TFs are more likely to bind linker DNA sequences than the sequences in the nucleosomes. In addition, our results imply that there may be some TFs that are important for nucleosome positioning but that play an insignificant role in discriminating nucleosome-forming DNA sequences from nucleosome-inhibiting DNA sequences. The hypothesis that TFs play a role in nucleosome positioning is, thus, confirmed by the results of this study.
Tags: biology, computing, news
Posted in Computational biology | Comments Off
Prediction and Testing of Biological Networks Underlying Intestinal Cancer
Written by Scott Christley et al. on September 1, 2010 – 7:00 am
Colorectal cancer progresses through an accumulation of somatic mutations, some of which reside in so-called “driver” genes that provide a growth advantage to the tumor. To identify points of intersection between driver gene pathways, we implemented a network analysis framework using protein interactions to predict likely connections – both precedented and novel – between key driver genes in cancer. We applied the framework to find significant connections between two genes, Apc and Cdkn1a (p21), known to be synergistic in tumorigenesis in mouse models. We then assessed the functional coherence of the resulting Apc-Cdkn1a network by engineering in vivo single node perturbations of the network: mouse models mutated individually at Apc (Apc1638N+/−) or Cdkn1a (Cdkn1a−/−), followed by measurements of protein and gene expression changes in intestinal epithelial tissue. We hypothesized that if the predicted network is biologically coherent (functional), then the predicted nodes should associate more specifically with dysregulated genes and proteins than stochastically selected genes and proteins. The predicted Apc-Cdkn1a network was significantly perturbed at the mRNA-level by both single gene knockouts, and the predictions were also strongly supported based on physical proximity and mRNA coexpression of proteomic targets. These results support the functional coherence of the proposed Apc-Cdkn1a network and also demonstrate how network-based predictions can be statistically tested using high-throughput biological data.
Tags: biology, computing, news
Posted in Computational biology | Comments Off
GOPred: GO Molecular Function Prediction by Combined Classifiers
Written by Scott Christley et al. on August 31, 2010 – 7:00 am
Functional protein annotation is an important matter for in vivo and in silico biology. Several computational methods have been proposed that make use of a wide range of features such as motifs, domains, homology, structure and physicochemical properties. There is no single method that performs best in all functional classification problems because information obtained using any of these features depends on the function to be assigned to the protein. In this study, we present a novel approach that combines different methods to better represent protein function. First, we formulated the function annotation problem as a classification problem defined on 300 different Gene Ontology (GO) terms from the molecular function aspect. We presented a method to form positive and negative training examples while taking into account the directed acyclic graph (DAG) structure and evidence codes of GO. We applied three different methods and their combinations. Results show that combining different methods improves prediction accuracy in most cases. The proposed method, GOPred, is available as an online computational annotation tool (http://kinaz.fen.bilkent.edu.tr/gopred).
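To make the idea of forming positive and negative training examples from a DAG and evidence codes concrete, here is a minimal sketch. It is not GOPred's procedure: the tiny DAG fragment (with the kinase activity parentage simplified), the protein names `P1` to `P4`, the choice of trusted evidence codes, and the rule of skipping proteins annotated only with an ancestor of the target are all assumptions made for illustration.

```python
# Sketch: build positive and negative training sets for one GO term.
# The DAG fragment, protein names, trusted evidence codes, and the handling of
# ambiguous proteins are illustrative assumptions; GOPred's rules may differ.

# child term -> parent terms (a small, simplified fragment of molecular function)
PARENTS = {
    "GO:0003824": ["GO:0003674"],  # catalytic activity -> molecular_function
    "GO:0005488": ["GO:0003674"],  # binding -> molecular_function
    "GO:0016301": ["GO:0003824"],  # kinase activity -> catalytic activity (simplified)
}


def ancestors(term):
    """All terms reachable by following parent links upward from the term."""
    found = set()
    stack = [term]
    while stack:
        for parent in PARENTS.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def training_sets(target, annotations, trusted_codes):
    """Split proteins into positives and negatives for the target term."""
    target_ancestors = ancestors(target)
    positives, negatives = set(), set()
    for protein, records in annotations.items():
        # Keep only annotations backed by a trusted evidence code.
        trusted = {term for term, code in records if code in trusted_codes}
        # Propagate upward: an annotation to a term implies all of its ancestors.
        implied = set(trusted)
        for term in trusted:
            implied |= ancestors(term)
        if target in implied:
            positives.add(protein)
        elif trusted and not (trusted & target_ancestors):
            # Annotated elsewhere in the DAG, so treated as a negative example.
            negatives.add(protein)
        # Otherwise the protein is skipped: it has no trusted annotation, or only
        # a less specific ancestor of the target, which makes its status ambiguous.
    return positives, negatives


# Hypothetical annotations: protein -> list of (GO term, evidence code).
annotations = {
    "P1": [("GO:0016301", "IDA")],
    "P2": [("GO:0005488", "IDA")],
    "P3": [("GO:0003824", "EXP")],
    "P4": [("GO:0016301", "IEA")],
}
positives, negatives = training_sets("GO:0016301", annotations, {"EXP", "IDA"})
```

Here `P3` is left out of both sets for the kinase activity term because it is annotated only with the broader catalytic activity term, and `P4` is left out because its only annotation carries the electronically inferred `IEA` code.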
Tags: biology, computing, news
Posted in Computational biology | Comments Off
Identification of cis-Regulatory Elements in the Mammalian Genome: The cREMaG Database
Written by Scott Christley et al. on August 31, 2010 – 7:00 am
A growing number of gene expression-profiling datasets provides a reliable source of information about gene co-expression. In silico analyses of the properties shared among the promoters of co-expressed genes facilitate the identification of transcription factors (TFs) involved in the co-regulation of those genes. Our previous experience with microarray data led to the development of a database suitable for the examination of regulatory motifs in the promoters of co-expressed genes.
Methodology: We introduce the cREMaG (cis-Regulatory Elements in the Mammalian Genome) system designed for in silico studies of the promoter properties of co-regulated mammalian genes. The cREMaG system offers an analysis of data obtained from human, mouse, rat, bovine and canine gene expression-profiling studies. More than eight analysis parameters can be utilized in user-defined combinations. The selection of alternative transcription start sites and information about CpG islands are also available.
Conclusions: Using the cREMaG system, we successfully identified TFs mediating transcriptional responses in reference gene sets. The cREMaG system facilitates in silico studies of mammalian transcriptional gene regulation. The resource is freely available at http://www.cremag.org.
Tags: biology, computing, news
Posted in Computational biology | Comments Off
A Novel Preprocessing Method Using Hilbert Huang Transform for MALDI-TOF and SELDI-TOF Mass Spectrometry Data
Written by Scott Christley et al. on August 31, 2010 – 7:00 am
Mass spectrometry is a high-throughput, fast, and accurate method of protein analysis. Using the peaks detected in spectra, we can compare a normal group with a disease group. However, the spectrum is complicated by scale shifting and is also full of noise. Such shifting makes the spectra non-stationary, so they need to be aligned before comparison. Consequently, the preprocessing of the mass data plays an important role during the analysis process. Noise in mass spectrometry data takes many forms and spans many frequencies. A powerful data preprocessing method is needed for removing the large amount of noise in mass spectrometry data.
Results: The Hilbert-Huang Transformation (HHT) is a transformation for non-stationary signals used in signal processing. We provide a novel algorithm for preprocessing that can deal with MALDI and SELDI spectra. We use the Hilbert-Huang Transformation to decompose the spectrum and filter out the very-high-frequency and very-low-frequency signals. We think the noise in mass spectrometry comes from many sources and that some of it can be removed by analysis of the signal's frequency domain. Since the protein in the spectrum is expected to be a unique peak, its frequency content should lie in the middle part of the frequency domain and will not be removed. The results show that HHT, when used for preprocessing, is generally better than other preprocessing methods. The approach is not only able to detect peaks successfully but also has the advantage of denoising spectra efficiently, especially when the data are complex. The drawback of HHT is that it takes much longer to process than the wavelet and traditional methods. However, the processing time is still manageable and is worth the wait to obtain high-quality data.
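The filtering idea described above, keeping the middle of the frequency range and discarding the extremes, can be sketched in isolation. This is a rough sketch of the selection step only: the decomposition itself (empirical mode decomposition, the first stage of the Hilbert-Huang transform) is not implemented, the three components are synthetic, and ranking components by zero-crossing count is an assumed stand-in for a proper frequency estimate.

```python
# Sketch of the filtering step only: given components from a decomposition,
# drop the highest- and lowest-frequency ones and sum the rest.
# The decomposition is NOT implemented here; the components are synthetic and
# zero-crossing counts are an assumed, crude proxy for frequency.
import math


def zero_crossings(signal):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0)


def band_filter(components, drop_high=1, drop_low=1):
    """Sum the components left after removing the frequency extremes."""
    ordered = sorted(components, key=zero_crossings, reverse=True)
    kept = ordered[drop_high:len(ordered) - drop_low]
    length = len(components[0])
    return [sum(c[i] for c in kept) for i in range(length)]


# Synthetic stand-ins for decomposed components of a spectrum.
n = 200
noise = [0.2 * math.sin(2 * math.pi * 40 * i / n) for i in range(n)]  # fast oscillation
peak = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]          # mid-frequency signal
drift = [0.5 * i / n + 0.1 for i in range(n)]                         # slow baseline drift
filtered = band_filter([noise, peak, drift])
```

With these three synthetic components, the fast oscillation and the baseline drift are discarded and only the mid-frequency component survives. Real spectra would yield more components, and how many to drop at each end would need tuning.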
Tags: biology, computing, news
Posted in Computational biology | Comments Off
