10.4225/03/5a1372cde0ad8
Burden, Conrad J.
Conrad J.
Burden
Forêt, Sylvain
Sylvain
Forêt
Wilson, Susan R.
Susan R.
Wilson
k-Word matches: an alignment-free sequence comparison method
Monash University
2017
Bioinformatics -- Congresses
Computational biology -- Congresses
Computer vision in medicine -- Congresses
Computational biology -- Methods -- Congresses
Pattern recognition, automated -- Methods -- Congresses
Genome sequences
Protein sequences
Sequence comparison
Word matches
2008
conference paper
1959.1/63722
monash:7868
Bioinformatics Software
Bioinformatics
Pattern Recognition and Data Mining
2017-11-21 00:26:52
Dataset
https://bridges.monash.edu/articles/dataset/k-Word_matches_an_alignment-free_sequence_comparison_method/5619526
The number of k-word matches, that is, the number of words of length k shared between two sequences, also known as the D2 statistic, is used for alignment-free sequence comparison. The advantages of this statistic over alignment-based methods for nucleotide and amino-acid sequence comparison are, firstly, that it does not assume that homologous segments are contiguous and, secondly, that the algorithm is computationally extremely fast, the runtime being proportional to the size of the sequence under scrutiny. We summarise our results to date on determining the distributional properties of the D2 statistic for a range of biologically relevant parameters and outline the directions in which the research will proceed. PRIB 2008 proceedings found at: http://dx.doi.org/10.1007/978-3-540-88436-1
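As an illustration of the statistic described in the abstract, the following minimal Python sketch counts k-word matches between two sequences. It assumes the commonly used definition of D2 as the number of position pairs (i, j) at which the k-word starting at i in the first sequence equals the k-word starting at j in the second; the function names kword_counts and d2 are hypothetical and are not taken from the paper.

```python
from collections import Counter


def kword_counts(seq, k):
    """Count every overlapping word of length k in seq (single linear pass)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of k-word matches between seq_a and seq_b.

    Assumed definition: sum over all words w of
    (occurrences of w in seq_a) * (occurrences of w in seq_b),
    which equals the number of matching position pairs.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    counts_a = kword_counts(seq_a, k)
    counts_b = kword_counts(seq_b, k)
    # Iterate over the smaller table; a Counter returns 0 for absent words.
    if len(counts_a) > len(counts_b):
        counts_a, counts_b = counts_b, counts_a
    return sum(n * counts_b[word] for word, n in counts_a.items())


if __name__ == "__main__":
    print(d2("ACGT", "ACGT", 2))  # 3: AC, CG and GT each match once
    print(d2("AAAA", "AAA", 2))   # 6: three AA words times two AA words
```

Counting words with a hash table keeps the work roughly proportional to the combined sequence length for fixed k, which is in line with the linear runtime the abstract mentions. No sequence alignment is performed, so matching words need not be contiguous or in the same order.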
Contributors: Monash University. Faculty of Information Technology. Gippsland School of Information Technology ;
Chetty, Madhu ;
Ahmad, Shandar ;
Ngom, Alioune ;
Teng, Shyh Wei ;
Third IAPR International Conference on Pattern Recognition in Bioinformatics (PRIB) (3rd : 2008 : Melbourne, Australia) ;
Rights: Copyright by Third IAPR International Conference on Pattern Recognition in Bioinformatics. All rights reserved.