Archive for the ‘Computer Science’ Category
Statistic Complexity: Combining Kolmogorov Complexity with an Ensemble Approach
Written by Scott Christley et al. on August 26, 2010 – 7:00 am -The evaluation of the complexity of an observed object is an old but outstanding problem. In this paper we take up this problem by introducing a measure called statistic complexity.
Methodology/Principal Findings: This complexity measure differs from all other measures in the following respects. First, it is a bivariate measure that compares two objects, corresponding to pattern generating processes, on the basis of their normalized compression distance to each other. Second, it quantifies the error that could arise from comparing samples of finite size from the underlying processes. Hence, the statistic complexity provides a statistical quantification of the statement that one object ‘is similarly complex as’ another.
The presented approach, ultimately, transforms the classic problem of assessing the complexity of an object into the realm of statistics. This may open a wider applicability of this complexity measure to diverse application areas.
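As a rough illustration of the normalized compression distance mentioned above, the following sketch uses zlib as the compressor. The choice of compressor, the helper names, and the sample data are all assumptions made for illustration; the abstract does not say which compressor or test statistic the authors used, and the ensemble step of the method is not reproduced here.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sample data: a periodic pattern and pseudo-random noise.
rng = random.Random(0)
periodic = b"ab" * 400
noise = bytes(rng.getrandbits(8) for _ in range(800))

print(ncd(periodic, periodic))  # small: a pattern compared with itself
print(ncd(periodic, noise))     # large: unrelated patterns
```

In an ensemble setting one would compute such distances for many sampled pairs and compare the resulting distributions statistically, rather than rely on a single value.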
Tags: computer, news, science
Posted in Computer Science | Comments Off
From Modular to Centralized Organization of Synchronization in Functional Areas of the Cat Cerebral Cortex
Written by Scott Christley et al. on August 26, 2010 – 7:00 am -Recent studies have pointed out the importance of transient synchronization between widely distributed neural assemblies for understanding conscious perception. These neural assemblies form intricate networks of neurons and synapses whose detailed map for mammals is still unknown and far beyond our experimental capabilities. Only in a few cases, for example C. elegans, do we know the complete mapping of the neuronal tissue or its mesoscopic level of description provided by cortical areas. Here we study the process of transient and global synchronization using a simple model of phase-coupled oscillators assigned to cortical areas in the cat cerebral cortex. Our results highlight the impact of the topological connectivity on the development of synchronization, revealing a transition in the synchronization organization that goes from a modular decentralized coherence to a centralized synchronized regime controlled by a few cortical areas forming a Rich-Club connectivity pattern.
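To make the idea of phase-coupled oscillators on a network concrete, here is a minimal sketch that assumes a Kuramoto-style coupling. The abstract only says "a simple model of phase-coupled oscillators", so the exact equations, the toy five-node network, and all parameter values below are assumptions, not the cat cortex data.

```python
import cmath
import math


def simulate(adjacency, omega, theta, coupling, dt=0.01, steps=2000):
    """Euler-integrate d(theta_i)/dt = omega_i + K * sum_j A_ij * sin(theta_j - theta_i)."""
    theta = list(theta)
    n = len(theta)
    for _ in range(steps):
        rates = [
            omega[i] + coupling * sum(
                adjacency[i][j] * math.sin(theta[j] - theta[i]) for j in range(n)
            )
            for i in range(n)
        ]
        theta = [t + dt * r for t, r in zip(theta, rates)]
    return theta


def order_parameter(theta):
    """Global synchronization r in [0, 1]; r = 1 means all phases coincide."""
    return abs(sum(cmath.exp(1j * t) for t in theta)) / len(theta)


# Hypothetical toy network: 5 fully connected "areas" with identical frequencies.
n = 5
adjacency = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
omega = [1.0] * n
start = [0.0, 0.5, 1.0, 1.5, 2.0]

r_before = order_parameter(start)
r_after = order_parameter(simulate(adjacency, omega, start, coupling=1.0))
print(r_before, r_after)
```

Replacing the all-to-all matrix with a modular or hub-dominated adjacency matrix is the kind of change that would let one watch how topology shapes the path to synchronization.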
A Method for the Automated, Reliable Retrieval of Publication-Citation Records
Written by Scott Christley et al. on August 19, 2010 – 7:00 am -Publication records and citation indices often are used to evaluate academic performance. For this reason, obtaining or computing them accurately is important. This can be difficult, largely due to a lack of complete knowledge of an individual's publication list and/or lack of time available to manually obtain or construct the publication-citation record. While online publication search engines have somewhat addressed these problems, using raw search results can yield inaccurate estimates of publication-citation records and citation indices.
Methodology: In this paper, we present a new, automated method that produces estimates of an individual's publication-citation record from an individual's name and a set of domain-specific vocabulary that may occur in the individual's publication titles. Because this vocabulary can be harvested directly from a research web page or online (partial) publication list, our method delivers an easy way to obtain estimates of a publication-citation record and the relevant citation indices. Our method works by applying a series of stringent name and content filters to the raw publication search results returned by an online publication search engine. In this paper, our method is run using Google Scholar, but the underlying filters can be easily applied to any existing publication search engine. When compared against a manually constructed data set of individuals and their publication-citation records, our method provides significant improvements over raw search results. The estimated publication-citation records returned by our method have an average sensitivity of […] and specificity of […] (in contrast to raw search result specificity of less than 10%). When citation indices are computed using these records, the estimated indices are within […] of the true value, compared to raw search results which have overestimates of, on average, […].
These results confirm that our method provides significantly improved estimates over raw search results, and these can either be used directly for large-scale (departmental or university) analysis or further refined manually to quickly give accurate publication-citation records.
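The filtering idea can be sketched as follows. The record layout, the matching rules, and the sample data are hypothetical and much simpler than whatever filters the authors actually apply; the h-index is used here only as one familiar example of a citation index.

```python
def passes_filters(record, surname, initials, vocabulary):
    """Keep a record only if an author matches the name and the title uses known vocabulary."""
    name_ok = any(
        a["surname"].lower() == surname.lower() and a["initials"].startswith(initials)
        for a in record["authors"]
    )
    title_words = set(record["title"].lower().split())
    content_ok = bool(title_words & vocabulary)
    return name_ok and content_ok


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)


# Hypothetical raw search results for an author "J. A. Smith".
raw_results = [
    {"title": "Kernel methods for protein folding",
     "authors": [{"surname": "Smith", "initials": "JA"}], "citations": 12},
    {"title": "Medieval trade routes revisited",
     "authors": [{"surname": "Smith", "initials": "J"}], "citations": 40},
    {"title": "Protein structure prediction",
     "authors": [{"surname": "Jones", "initials": "J"}], "citations": 7},
    {"title": "Folding pathways in silico",
     "authors": [{"surname": "Smith", "initials": "JA"}], "citations": 3},
]
vocabulary = {"protein", "folding", "kernel"}

kept = [r for r in raw_results if passes_filters(r, "Smith", "JA", vocabulary)]
raw_h = h_index([r["citations"] for r in raw_results])
filtered_h = h_index([r["citations"] for r in kept])
print(len(kept), raw_h, filtered_h)
```

In this toy data the unfiltered results inflate the index, which mirrors the overestimation the abstract attributes to raw search results.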
Inferring Geographic Coordinates of Origin for Europeans Using Small Panels of Ancestry Informative Markers
Written by Scott Christley et al. on August 18, 2010 – 7:00 am -Recent large-scale studies of European populations have demonstrated the existence of population genetic structure within Europe and the potential to accurately infer individual ancestry when information from hundreds of thousands of genetic markers is used. In fact, when genomewide genetic variation of European populations is projected down to a two-dimensional Principal Components Analysis plot, a surprising correlation with actual geographic coordinates of self-reported ancestry has been reported. This substructure can hamper the search for susceptibility genes for common complex disorders, leading to spurious correlations. The identification of genetic markers that can correct for population stratification therefore becomes of paramount importance. Analyzing 1,200 individuals from 11 populations genotyped for more than 500,000 SNPs (Population Reference Sample), we present a systematic exploration of the extent to which geographic coordinates of origin within Europe can be predicted with small panels of SNPs. Markers are selected to correlate with the top principal components of the dataset, as we have previously demonstrated. Performing thorough cross-validation experiments, we show that it is indeed possible to predict individual ancestry within Europe down to a few hundred kilometers from actual individual origin, using information from carefully selected panels of 500 or 1,000 SNPs. Furthermore, we show that these panels can be used to correctly assign the HapMap Phase 3 European populations to their geographic origin. The SNPs that we propose can prove extremely useful in a variety of different settings, such as stratification correction or genetic ancestry testing, and the study of the history of European populations.
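A prediction error "down to a few hundred kilometers" implies a distance between predicted and actual coordinates. One common way to measure such a distance is the great-circle (haversine) formula, sketched below. That the authors used this particular metric is an assumption, and the coordinate pairs are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical (predicted, actual) coordinates for three individuals.
pairs = [
    ((48.0, 3.0), (48.9, 2.4)),
    ((51.5, 11.0), (52.5, 13.4)),
    ((41.0, 13.5), (41.9, 12.5)),
]
errors = [haversine_km(*predicted, *actual) for predicted, actual in pairs]
print(sum(errors) / len(errors))  # mean prediction error in km
```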
Protein Networks Reveal Detection Bias and Species Consistency When Analysed by Information-Theoretic Methods
Written by Scott Christley et al. on August 18, 2010 – 7:00 am -We apply our recently developed information-theoretic measures for the characterisation and comparison of protein–protein interaction networks. These measures are used to quantify topological network features via macroscopic statistical properties. Network differences are assessed based on these macroscopic properties as opposed to microscopic overlap, homology information or motif occurrences. We present the results of a large-scale analysis of protein–protein interaction networks. Precise null models are used in our analyses, allowing for reliable interpretation of the results. By quantifying the methodological biases of the experimental data, we can define an information threshold above which networks may be deemed to comprise consistent macroscopic topological properties, despite their small microscopic overlaps. Based on this rationale, data from yeast-two-hybrid methods are sufficiently consistent to allow for intra-species comparisons (between different experiments) and inter-species comparisons, while data from affinity-purification mass-spectrometry methods show large differences even within intra-species comparisons.
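The abstract does not define its information-theoretic measures. As a stand-in, the sketch below computes the Shannon entropy of a network's degree distribution, which is one simple macroscopic statistic that ignores which specific nodes or edges overlap. Both the measure and the toy networks are assumptions chosen for illustration.

```python
import math
from collections import Counter


def degree_entropy(edges):
    """Shannon entropy (bits) of the degree distribution of an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())  # how many nodes have each degree
    n = len(degree)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical toy networks: a star (one hub) and a ring (all degrees equal).
star = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]
ring = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

print(degree_entropy(star))
print(degree_entropy(ring))
```

Two networks with no edges in common can still have identical values of a statistic like this, which is the sense in which macroscopic comparison differs from microscopic overlap.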
jsPhyloSVG: A Javascript Library for Visualizing Interactive and Vector-Based Phylogenetic Trees on the Web
Written by Scott Christley et al. on August 18, 2010 – 7:00 am -Many software packages have been developed to address the need for generating phylogenetic trees intended for print. With an increased use of the web to disseminate scientific literature, there is a need for phylogenetic trees to be viewable across many types of devices and feature some of the interactive elements that are integral to the browsing experience. We propose a novel approach for publishing interactive phylogenetic trees.
Methods/Principal Findings: We present a JavaScript library, jsPhyloSVG, which facilitates constructing interactive phylogenetic trees from raw Newick or phyloXML formats directly within the browser in Scalable Vector Graphics (SVG) format. It is designed to work across all major browsers and renders an alternative format for those browsers that do not support SVG. The library provides tools for building rectangular and circular phylograms with integrated charting. Interactive features may be integrated and made to respond to events such as clicks on any element of the tree, including labels.
Conclusions/Significance: jsPhyloSVG is an open-source solution for rendering dynamic phylogenetic trees. It is capable of generating complex and interactive phylogenetic trees across all major browsers without the need for plugins. It is novel in supporting the ability to interpret the tree inference formats directly, exposing the underlying markup to data-mining services. The library source code, extensive documentation and live examples are freely accessible at www.jsphylosvg.com.
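To show what interpreting a raw Newick string involves, here is a minimal recursive-descent parser. It is written in Python for illustration and is not the jsPhyloSVG code or API; it handles only well-formed, unquoted Newick with optional branch lengths, and the function names are hypothetical.

```python
def parse_newick(text):
    """Parse a Newick string into nested dicts with keys "name", "length", "children"."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else ""

    def parse_node():
        nonlocal pos
        children = []
        if peek() == "(":
            pos += 1  # consume "("
            children.append(parse_node())
            while peek() == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"
        start = pos
        while peek() and peek() not in ",():;":
            pos += 1
        name = text[start:pos].strip()
        length = None
        if peek() == ":":
            pos += 1  # consume ":"
            start = pos
            while peek() and peek() not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if peek() != ";":
        raise ValueError("Newick string must end with ';'")
    return root


def leaf_names(node):
    """Names of all leaves below a node, in left-to-right order."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in leaf_names(child)]


tree = parse_newick("(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;")
print(leaf_names(tree))
```

A renderer would walk the resulting structure to place nodes and draw branches; quoted labels and bracketed comments, which the Newick format also allows, are left out of this sketch.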
A New Acoustic Portal into the Odontocete Ear and Vibrational Analysis of the Tympanoperiotic Complex
Written by Scott Christley et al. on August 4, 2010 – 7:00 am -Global concern over the possible deleterious effects of noise on marine organisms was catalyzed when toothed whales stranded and died in the presence of high intensity sound. The lack of knowledge about mechanisms of hearing in toothed whales prompted our group to study the anatomy and build a finite element model to simulate sound reception in odontocetes. The primary auditory pathway in toothed whales is an evolutionary novelty, compensating for the impedance mismatch experienced by whale ancestors as they moved from hearing in air to hearing in water. The mechanism by which high-frequency vibrations pass from the low density fats of the lower jaw into the dense bones of the auditory apparatus is a key to understanding odontocete hearing. Here we identify a new acoustic portal into the ear complex, the tympanoperiotic complex (TPC), and a plausible mechanism by which sound is transduced into the bony components. We reveal the intact anatomic geometry using CT scanning, and test functional preconceptions using finite element modeling and vibrational analysis. We show that the mandibular fat bodies bifurcate posteriorly, attaching to the TPC in two distinct locations. The smaller branch is an inconspicuous, previously undescribed channel, a cone-shaped fat body that fits into a thin-walled bony funnel just anterior to the sigmoid process of the TPC. The TPC also contains regions of thin translucent bone that define zones of differential flexibility, enabling the TPC to bend in response to sound pressure, thus providing a mechanism for vibrations to pass through the ossicular chain. The techniques used to discover the new acoustic portal in toothed whales provide a means to decipher auditory filtering, beam formation, impedance matching, and transduction. These tools can also be used to address concerns about the potential deleterious effects of high-intensity sound in a broad spectrum of marine organisms, from whales to fish.
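The impedance mismatch between air and water can be illustrated with the textbook formula for a plane wave hitting a flat boundary at normal incidence. This is far simpler than the finite element model described above and is not taken from the paper; the density and sound-speed values are rounded, assumed textbook figures.

```python
def acoustic_impedance(density_kg_m3, sound_speed_m_s):
    """Characteristic acoustic impedance Z = rho * c."""
    return density_kg_m3 * sound_speed_m_s


def intensity_transmission(z1, z2):
    """Fraction of sound intensity transmitted across a flat boundary at normal incidence."""
    return 4 * z1 * z2 / (z1 + z2) ** 2


# Approximate, assumed values for air and water.
z_air = acoustic_impedance(1.2, 343.0)
z_water = acoustic_impedance(1000.0, 1480.0)

print(intensity_transmission(z_air, z_water))  # only a tiny fraction crosses the boundary
```

With matched impedances the same formula gives full transmission, which is why structures that bridge two very different media matter for hearing.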
People Efficiently Explore the Solution Space of the Computationally Intractable Traveling Salesman Problem to Find Near-Optimal Tours
Written by Scott Christley et al. on July 29, 2010 – 7:00 am -Humans need to solve computationally intractable problems such as visual search, categorization, and simultaneous learning and acting, yet an increasing body of evidence suggests that their solutions to instantiations of these problems are near optimal. Computational complexity advances an explanation for this apparent paradox: (1) only a small portion of instances of such problems are actually hard, and (2) successful heuristics exploit structural properties of the typical instance to selectively improve parts that are likely to be sub-optimal. We hypothesize that these two ideas largely account for the good performance of humans on computationally hard problems. We tested part of this hypothesis by studying the solutions of 28 participants to 28 instances of the Euclidean Traveling Salesman Problem (TSP). Participants were provided feedback on the cost of their solutions and were allowed unlimited solution attempts (trials). We found a significant improvement between the first and last trials and that solutions are significantly different from random tours that follow the convex hull and do not have self-crossings. More importantly, we found that participants modified their current better solutions in such a way that edges belonging to the optimal solution (“good” edges) were significantly more likely to stay than other edges (“bad” edges), a hallmark of structural exploitation. We found, however, that more trials harmed the participants' ability to tell good from bad edges, suggesting that after too many trials the participants “ran out of ideas.” In sum, we provide the first demonstration of significant performance improvement on the TSP under repetition and feedback, and evidence that human problem-solving may exploit the structure of hard problems, paralleling the behavior of state-of-the-art heuristics.
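The notions of tour cost, self-crossings, and “good” edges can be made concrete with a small sketch. The 2-opt local search below is a standard heuristic chosen here as a stand-in for "selectively improving sub-optimal parts"; it is not the procedure participants followed or the analysis the authors ran, and the four-city instance is hypothetical.

```python
import math


def tour_length(points, tour):
    """Total Euclidean length of a closed tour given as a list of point indices."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def two_opt(points, tour):
    """Reverse segments for as long as doing so shortens the tour (removes crossings)."""
    tour = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(points, candidate) < tour_length(points, tour) - 1e-12:
                    tour, improved = candidate, True
    return tour


def shared_edge_fraction(tour, reference):
    """Fraction of undirected edges of `tour` that also appear in `reference`."""
    def edges(t):
        return {frozenset((t[i], t[(i + 1) % len(t)])) for i in range(len(t))}
    return len(edges(tour) & edges(reference)) / len(tour)


# Hypothetical instance: four cities on a unit square, visited in a self-crossing order.
points = [(0, 0), (1, 0), (1, 1), (0, 1)]
crossing = [0, 2, 1, 3]
improved = two_opt(points, crossing)

print(tour_length(points, crossing), tour_length(points, improved))
print(shared_edge_fraction(crossing, improved))
```

Tracking `shared_edge_fraction` against a known optimal tour across successive attempts is one way to quantify whether good edges are being preserved.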
Multi-Environment Model Estimation for Motility Analysis of Caenorhabditis elegans
Written by Scott Christley et al. on July 22, 2010 – 7:00 am -The nematode Caenorhabditis elegans is a well-known model organism used to investigate fundamental questions in biology. Motility assays of this small roundworm are designed to study the relationships between genes and behavior. Commonly, motility analysis is used to classify nematode movements and characterize them quantitatively. Over the past years, C. elegans' motility has been studied across a wide range of environments, including crawling on substrates, swimming in fluids, and locomoting through microfluidic substrates. However, each environment often requires customized image processing tools relying on heuristic parameter tuning. In the present study, we propose a novel Multi-Environment Model Estimation (MEME) framework for automated image segmentation that is versatile across various environments. The MEME platform is constructed around the concept of Mixture of Gaussian (MOG) models, where statistical models for both the background environment and the nematode appearance are explicitly learned and used to accurately segment a target nematode. Our method is designed to simplify the burden often imposed on users; here, only a single image which includes a nematode in its environment must be provided for model learning. In addition, our platform enables the extraction of nematode ‘skeletons’ for straightforward motility quantification. We test our algorithm on various locomotive environments and compare performances with an intensity-based thresholding method. Overall, MEME outperforms the threshold-based approach for the overwhelming majority of cases examined. Ultimately, MEME provides researchers with an attractive platform for C. elegans' segmentation and ‘skeletonizing’ across a wide range of motility assays.
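The core idea of learning separate statistical models for background and nematode, then labeling each pixel by the more likely model, can be sketched as follows. For brevity this uses a single Gaussian per class on grayscale intensity, which is a deliberate simplification of the Mixture of Gaussian models named above; the training samples and the tiny image are hypothetical.

```python
import math
from statistics import mean, pvariance


def fit_gaussian(samples):
    """Return (mean, variance) of intensity samples; a small floor avoids zero variance."""
    return mean(samples), max(pvariance(samples), 1e-6)


def log_likelihood(x, model):
    """Log-density of intensity x under a 1-D Gaussian model (mean, variance)."""
    mu, var = model
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def segment(image, background_model, worm_model):
    """Label each pixel 1 (nematode) or 0 (background) by the more likely model."""
    return [
        [1 if log_likelihood(p, worm_model) > log_likelihood(p, background_model) else 0
         for p in row]
        for row in image
    ]


# Hypothetical training intensities taken from one example image.
background_model = fit_gaussian([200, 205, 198, 210, 202])
worm_model = fit_gaussian([60, 72, 65, 58, 70])

image = [
    [201, 199, 66],
    [204, 61, 63],
    [207, 203, 68],
]
mask = segment(image, background_model, worm_model)
print(mask)
```

A skeleton for motility measurements would then be extracted from the connected foreground region of such a mask.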
Googling Social Interactions: Web Search Engine Based Social Network Construction
Written by Scott Christley et al. on July 21, 2010 – 7:00 am -Social network analysis has been an enduring topic of sociology. However, until the era of information technology, the availability of data, mainly collected by the traditional method of personal survey, was highly limited and prevented large-scale analysis. Recently, the exploding amount of automatically generated data has completely changed the pattern of research. For instance, the enormous amount of data from so-called high-throughput biological experiments has introduced a systematic or network viewpoint to traditional biology. Is “high-throughput” sociological data generation then possible? Google, which has become one of the most influential symbols of the new Internet paradigm within the last ten years, might provide torrents of data sources for such study in this (now and forthcoming) digital era. We investigate social networks between people by extracting information from the Web and introduce new tools for the analysis of such networks in the context of the statistical physics of complex systems, or socio-physics. As a concrete and illustrative example, the members of the 109th United States Senate are analyzed, and it is demonstrated that the methods of construction and analysis are applicable to various other weighted networks.
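One plausible way to build such a weighted network is to use the number of search hits for each pair of names as the edge weight. The abstract does not state how weights are derived, so this rule is an assumption, and the sketch uses a stubbed lookup table instead of querying any real search engine.

```python
from itertools import combinations


def build_network(names, hit_count):
    """Weighted edges between every pair of names with at least one co-occurrence hit."""
    edges = {}
    for a, b in combinations(sorted(names), 2):
        weight = hit_count(a, b)
        if weight > 0:
            edges[(a, b)] = weight
    return edges


def strength(edges):
    """Node strength: sum of the weights of all edges attached to each node."""
    totals = {}
    for (a, b), w in edges.items():
        totals[a] = totals.get(a, 0) + w
        totals[b] = totals.get(b, 0) + w
    return totals


# Hypothetical hit counts standing in for real search-engine queries.
FAKE_HITS = {("Alice", "Bob"): 120, ("Alice", "Carol"): 0, ("Bob", "Carol"): 45}


def fake_hit_count(a, b):
    """Stub lookup; a real version would query a search engine for pages naming both people."""
    return FAKE_HITS.get(tuple(sorted((a, b))), 0)


network = build_network(["Carol", "Alice", "Bob"], fake_hit_count)
print(network)
print(strength(network))
```

Passing the hit-count source in as a function keeps the network construction independent of any particular search service.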
