Computer Science – Information Retrieval
Scientific paper
2008-10-30
299 pages, 44 figures, 10 tables, 9 algorithms. PhD thesis in mathematics defended in May 2005 at the Victoria University of W
A quasi-metric is a distance function which satisfies the triangle inequality but is not necessarily symmetric: it can be thought of as an asymmetric metric. The central result of this thesis, developed in Chapter 3, is that a natural correspondence exists between similarity measures on biological (nucleotide or protein) sequences and quasi-metrics. Chapter 2 presents basic concepts of the theory of quasi-metric spaces and introduces new examples of them: the universal countable rational quasi-metric space and its bicompletion, the universal bicomplete separable quasi-metric space. Chapter 4 is dedicated to developing the notion of a quasi-metric space with a Borel probability measure, or pq-space. The main result of this chapter indicates that "a high dimensional quasi-metric space is close to being a metric space". Chapter 5 investigates the geometric aspects of the theory of database similarity search in the context of quasi-metrics. The results about pq-spaces are used to produce novel theoretical bounds on the performance of indexing schemes. Finally, the thesis presents some biological applications. Chapter 6 introduces FSIndex, an indexing scheme that significantly accelerates similarity searches of short protein fragment datasets. Chapter 7 presents a prototype of PFMFind, a system for the discovery of short functional protein motifs, which relies on FSIndex for similarity searches.
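To make the opening definition concrete, the following minimal Python sketch checks the quasi-metric axioms on a finite set of points and shows how an asymmetric distance can be symmetrised. It is not taken from the thesis. The function names (`is_quasi_metric`, `is_symmetric`, `symmetrise`, `uphill`), the toy distance and the separation axiom used (d(x, y) = d(y, x) = 0 only when x = y, one common convention) are assumptions made for illustration.

```python
from itertools import product


def is_quasi_metric(points, d, tol=1e-12):
    """Check the quasi-metric axioms for d on a finite set of points.

    Assumed convention (not confirmed by the abstract):
      - d(x, y) >= 0 and d(x, x) = 0,
      - d(x, y) = d(y, x) = 0 only when x == y,
      - d(x, z) <= d(x, y) + d(y, z)  (triangle inequality).
    Symmetry is deliberately NOT required.
    """
    for x in points:
        if abs(d(x, x)) > tol:
            return False
    for x, y in product(points, repeat=2):
        if d(x, y) < -tol:
            return False
        if x != y and abs(d(x, y)) <= tol and abs(d(y, x)) <= tol:
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return False
    return True


def is_symmetric(points, d, tol=1e-12):
    """Return True if d(x, y) == d(y, x) for every pair of points."""
    return all(abs(d(x, y) - d(y, x)) <= tol
               for x, y in product(points, repeat=2))


def symmetrise(d):
    """Return the symmetric distance max(d(x, y), d(y, x))."""
    return lambda x, y: max(d(x, y), d(y, x))


def uphill(x, y):
    """Hypothetical toy distance on the reals: going up costs, going down is free."""
    return max(y - x, 0.0)


points = [0.0, 1.0, 2.5, 4.0]
print(is_quasi_metric(points, uphill))            # axioms hold on this sample
print(is_symmetric(points, uphill))               # but the distance is asymmetric
print(is_symmetric(points, symmetrise(uphill)))   # symmetrising restores symmetry
```

On the real line the symmetrised `uphill` distance is just the absolute difference |x - y|, which illustrates the sense in which a quasi-metric is an asymmetric metric.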
Quasi-metrics, Similarities and Searches: aspects of geometry of protein datasets