bTSSfinder is a novel tool that predicts putative promoters for five classes of sigma factors in E. coli and in Cyanobacteria. bTSSfinder also classifies cyanobacterial promoters. Compared with currently available tools, bTSSfinder achieves higher accuracy and has a broad scope.
This system makes a thorough analysis of ChIP-Seq peaks and identifies the
dominant sequence motif families as potential binding sites of DNA-interacting
Dragon Desert Masker
A tool that demarcates, with very high accuracy, those genomic regions that
are unlikely to promote the initiation of transcription. In our machine learning
algorithm we utilize various constraining properties of features identified in
the upstream and downstream regions around verified TSSs, as well as statistical
analyses of these surrounding regions.
If you are using this resource in your research, please cite:
Schaefer U, Kodzius R, Kai C, Kawai J, Carninci P, Hayashizaki Y, Bajic VB.
High sensitivity TSS prediction: estimates of locations where TSS cannot occur.
PLoS One. 2010 Nov;5(11):e13934. Epub 2010 Nov 15.
Dragon Motif Finder is a simple ab initio motif-finding tool for DNA
sequences. It allows the processing of large sequence sets in a relatively short
amount of time on the web. It is heavily used in the FANTOM5 consortium project
for the analysis of promoter sequences.
The Dragon PolyA Spotter is a tool for predicting polyadenylation signal
variants in human genomic DNA sequences based on two machine learning
algorithms. The tool displays predicted polyA signal variants and their
positions in each submitted DNA sequence.
Dragon PolyA Spotter:
predictor of poly(A) motifs within human genomic DNA sequences. Kalkatawi M,
Rangkuti F, Schramm M, Jankovic BR, Kamau A, Chowdhary R, Archer JA, Bajic VB.
Bioinformatics. 2013 Jun 1;29(11):1484. doi: 10.1093/bioinformatics/btt161.
A tool that aims to predict the DNA binding sites of peroxisome
proliferator-activated receptors (PPARs) with extremely high accuracy.
Dragon TIS Spotter searches for Translation Initiation Sites (TISs) in plant
genomic sequences provided in FASTA format. The tool analyzes the content of
sliding windows of 300 bp of DNA sequence, assuming the TIS is located at
positions 150-152 of the window, counted from the 5′ end. The machine learning
prediction algorithm is trained on the Arabidopsis genome and tested on genomic
sequences of three plant genomes.
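The windowing used by Dragon TIS Spotter can be illustrated with a minimal sketch. It only shows how 300 bp windows with a candidate start codon at positions 150-152 could be extracted; it is not the tool's prediction algorithm. The function name `candidate_tis_windows` is hypothetical, and restricting candidates to windows with ATG at those positions is an assumption made for illustration.

```python
def candidate_tis_windows(seq, window=300, tis_start=150):
    """Yield (window_start, window) for each window whose positions
    tis_start..tis_start+2 (1-based, counted from the 5' end) spell ATG.

    Assumption: only ATG is treated as a candidate start codon here.
    """
    seq = seq.upper()
    offset = tis_start - 1  # 0-based index of the codon inside the window
    for start in range(len(seq) - window + 1):
        win = seq[start:start + window]
        if win[offset:offset + 3] == "ATG":
            yield start, win


if __name__ == "__main__":
    # Toy sequence with a single ATG flanked by 200 bp on each side.
    toy = "C" * 200 + "ATG" + "C" * 200
    for start, win in candidate_tis_windows(toy):
        print(start, len(win), win[149:152])
```

Each extracted window would then be passed to a trained classifier; that step is omitted because the source does not describe the model's inputs in detail.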
Dragon TIS Spotter: an
Arabidopsis-derived predictor of translation initiation sites in plants.
Magana-Mora A, Ashoor H, Jankovic BR, Kamau A, Awara K, Chowdhary R, Archer
JA, Bajic VB. Bioinformatics. 2013 Jan 1;29(1):117-8. doi:
The program is a pipeline for genetic-algorithm-based optimization of decision
tree structures.
Histone Modification in Cancer (HMCan) is a Hidden Markov Model-based tool
developed to detect histone modifications in cancer ChIP-seq data. It applies
three correction steps to the data: copy number correction, GC bias correction
and noise level correction. To run HMCan, one needs a ChIP-seq target
alignment file and a control alignment file.
HMCan: a method for detecting
chromatin modifications in cancer samples using ChIP-seq data. Ashoor H,
Hérault A, Kamoun A, Radvanyi F, Bajic VB, Barillot E, Boeva V. Bioinformatics.
2013 Dec 1;29(23):2979-86. doi: 10.1093/bioinformatics/btt524
Accurate prediction of hot spot residues through physicochemical
characteristics of amino acid sequences.
LigandRFs is a random forest-based approach to predict protein-ligand binding
Dimitrios Kleftogiannis, Panos Kalnis and Vladimir B. Bajic
A fundamental problem in bioinformatics is genome assembly.
Next-Generation Sequencing (NGS) technologies produce large volumes of
fragmented genome reads, which require large amounts of memory to assemble the
complete genome efficiently. With recent improvements in DNA sequencing
technologies, it is expected that the memory footprint required for the assembly
process will increase dramatically and will emerge as a limiting factor in
processing widely available NGS-generated reads. In this report, we compare
current memory-efficient techniques for genome assembly with respect to quality,
memory consumption and execution time. Our experiments prove that it is possible
to generate draft assemblies of reasonable quality on conventional multi-purpose
computers with very limited available memory by choosing suitable assembly
methods. Our study reveals the minimum memory requirements for different
assembly programs even when data volume exceeds memory capacity by orders of
magnitude. By combining existing methodologies, we propose two general assembly
strategies that can improve short-read assembly approaches and result in
reduction of the memory footprint. Finally, we discuss the possibility of
utilizing cloud infrastructures for genome assembly and we comment on some
findings regarding suitable computational resources for assembly.
Comparing memory-efficient genome assemblers on stand-alone and cloud
infrastructures. Kleftogiannis D, Kalnis P, Bajic VB. PLoS One. 2013 Sep
27;8(9):e75505. doi:
miRNAVISA is a web-based tool that allows customized interrogation and
comparisons of miRNA families for hypotheses generation, and comparison of the
per-species chromosomal distribution of miRNA genes in different families.
Exploration of miRNA families
for hypotheses generation. Kamanu TK, Radovanovic A, Archer JA, Bajic VB.
Sci Rep. 2013 Oct 15;3:2940. doi: 10.1038/srep02940.
A framework for scalable parameter estimation of gene circuit models using
Genome-wide analysis of alternative TSSs - Improved recognition of
industrially important enzymes
The program is able to predict the 12 main variants of human poly(A) motifs,
i.e., AATAAA, ATTAAA, AAAAAG, AAGAAA, TATAAA, AATACA, AGTAAA, ACTAAA, GATAAA,
CATAAA, AATATA, and AATAGA.
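As a rough illustration of what the 12 variants look like in a sequence, the sketch below lists every position where one of these hexamers occurs. It is plain string matching, not the program's predictor: a real predictor would still have to decide which of these occurrences are functional poly(A) signals. The names `POLYA_VARIANTS` and `find_polya_candidates` are hypothetical.

```python
# The 12 main human poly(A) motif variants listed above.
POLYA_VARIANTS = (
    "AATAAA", "ATTAAA", "AAAAAG", "AAGAAA", "TATAAA", "AATACA",
    "AGTAAA", "ACTAAA", "GATAAA", "CATAAA", "AATATA", "AATAGA",
)


def find_polya_candidates(seq):
    """Return (0-based position, hexamer) for every occurrence of a
    listed variant on the given strand, including overlapping ones."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in POLYA_VARIANTS:
            hits.append((i, hexamer))
    return hits


if __name__ == "__main__":
    print(find_polya_candidates("GGAATAAAGG"))
```

Only the strand given as input is scanned; handling the reverse complement is left out to keep the example short.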
Our method trains a two-round support vector regression model for predicting
protein-DNA binding affinity.