Physics – Condensed Matter – Statistical Mechanics
Scientific paper
2004-02-24
Physica A - Vol 342/1-2 pp 294-300 (2004)
7 pages, LaTeX, elsart style
10.1016/j.physa.2004.01.072
In this paper we present a general method for information extraction that exploits the features of data compression techniques. We first define the so-called "dictionary" of a sequence and focus our attention on it. Dictionaries are intrinsically interesting, and studying their features can be very useful for investigating the properties of the sequences from which they have been extracted (e.g. DNA strings). We then describe a procedure for string comparison between dictionary-created sequences (or "artificial texts") that gives very good results in several contexts. We finally present some results on self-consistent classification problems.
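As a rough illustration of what a compression-derived "dictionary" and an "artificial text" might look like, here is a minimal sketch. It assumes an LZ78-style incremental parse, which is a stand-in chosen for brevity and may differ from the compressor used in the paper; the function names `lz78_dictionary` and `artificial_text` are hypothetical, and the uniform random sampling of words is likewise an assumption.

```python
import random


def lz78_dictionary(sequence):
    """Return the phrases produced by an LZ78-style incremental parse.

    Assumption: each phrase is the shortest prefix of the remaining
    input that has not been seen as a phrase before. A trailing
    fragment that matches an existing phrase is dropped.
    """
    seen = set()
    phrases = []
    current = ""
    for symbol in sequence:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    return phrases


def artificial_text(dictionary, n_words, seed=None):
    """Build an 'artificial text' by concatenating randomly chosen words.

    Assumption: words are drawn uniformly and independently from the
    dictionary; the paper may use a different sampling scheme.
    """
    rng = random.Random(seed)
    return "".join(rng.choice(dictionary) for _ in range(n_words))


if __name__ == "__main__":
    words = lz78_dictionary("ACACACA")
    print(words)  # ['A', 'C', 'AC', 'ACA']
    print(artificial_text(words, n_words=6, seed=0))
```

Two artificial texts built this way from different source sequences could then be compared with any string-similarity measure; the sketch stops short of that step because the abstract does not specify the comparison procedure.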
Baronchelli Andrea
Caglioti Emanuele
Loreto Vittorio
Pizzi E.
Dictionary based methods for information extraction