10 years 6 months ago
To Release or Not to Release: Evaluating Information Leaks in Aggregate Human-Genome Data
The rapid progress of human genome studies leads to a strong demand for aggregate human DNA data (e.g., allele frequencies, test statistics), whose public dissemination, howeve...
Xiao-yong Zhou, Bo Peng, Yong Fuga Li, Yangyi Chen...
151 views, Bioinformatics, BIBM 2010
11 years 4 months ago
Probabilistic topic modeling for genomic data interpretation
Recently, the concept of a species containing both core and distributed genes, known as the supra- or pangenome theory, has been introduced. In this paper, we aim to develop a new ...
Xin Chen, Xiaohua Hu, Xiajiong Shen, Gail Rosen
11 years 5 months ago
On the embedding capacity of DNA strands under substitution, insertion, and deletion mutations
A number of methods have been proposed over the last decade for embedding information within deoxyribonucleic acid (DNA). Since a DNA sequence is conceptually equivalent to a unid...
Félix Balado
142 views, INFORMATICALT 2010
11 years 5 months ago
Complexity Estimation of Genetic Sequences Using Information-Theoretic and Frequency Analysis Methods
The genetic information in cells is stored in DNA sequences, represented by a string of four letters, each corresponding to a specific type of nucleotide. Genomic DNA sequences a...
Robertas Damasevicius
11 years 5 months ago
Finding Gapped Motifs by a Novel Evolutionary Algorithm
Background: Identifying approximately repeated patterns, or motifs, in DNA sequences from a set of co-regulated genes is an important step towards deciphering the complex gene reg...
Chengwei Lei, Jianhua Ruan
109 views, BMCBI 2002
11 years 6 months ago
Kangaroo - A pattern-matching program for biological sequences
Background: Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do no...
Doron Betel, Christopher W. V. Hogue
150 views, BMCBI 2005
11 years 6 months ago
Approaching the taxonomic affiliation of unidentified sequences in public databases - an example from the mycorrhizal fungi
Background: During the last few years, DNA sequence analysis has become one of the primary means of taxonomic identification of species, particularly so for species that are minut...
R. Henrik Nilsson, Erik Kristiansson, Martin Ryber...
201 views, BMCBI 2005
11 years 6 months ago
Principal component analysis for predicting transcription-factor binding motifs from array-derived data
Background: The responses to interleukin 1 (IL-1) in human chondrocytes constitute a complex regulatory mechanism, where multiple transcription factors interact combinatorially to...
Yunlong Liu, Matthew P. Vincenti, Hiroki Yokota
142 views, BMCBI 2005
11 years 6 months ago
transAlign: using amino acids to facilitate the multiple alignment of protein-coding DNA sequences
Background: Alignments of homologous DNA sequences are crucial for comparative genomics and phylogenetic analysis. However, multiple alignment represents a computationally difficu...
Olaf R. P. Bininda-Emonds
11 years 6 months ago
Detection of subtle variations as consensus motifs
We address the problem of detecting consensus motifs that occur with subtle variations across multiple sequences. These are usually functional domains in DNA sequences such as t...
Matteo Comin, Laxmi Parida