Computer Science – Computation and Language
Scientific paper
2005-07-07
Journal of Quantitative Linguistics, 2006, Volume 13, Number 1, pp. 81-109
To appear in Journal of Quantitative Linguistics
DOI: 10.1080/09296170500500637
Hilberg (1990) supposed that the finite-order excess entropy of a random human text is proportional to the square root of the text length. Assuming that Hilberg's hypothesis is true, we derive Guiraud's law, which states that the number of word types in a text is greater than proportional to the square root of the text length. Our derivation is based on a mathematical conjecture in coding theory and on several experiments suggesting that words can be defined approximately as the nonterminals of the shortest context-free grammar for the text. Such an operational definition of words can be applied even to texts deprived of spaces, which do not allow for Mandelbrot's "intermittent silence" explanation of Zipf's and Guiraud's laws. In contrast to Mandelbrot's model, ours assumes some probabilistic long-memory effects in human narration and might be capable of explaining Menzerath's law.
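As an illustration of the quantity that Guiraud's law concerns, the following minimal sketch tracks the number of word types V against the number of word tokens N along a text and reports the ratio V / sqrt(N). It is not the paper's method: the function name `type_token_profile`, the `step` parameter, and the regex tokenizer are assumptions made for this example, and words are taken here as runs of word characters rather than as nonterminals of a shortest context-free grammar.

```python
import math
import re


def type_token_profile(text, step):
    """Return (N, V, V / sqrt(N)) at every `step` tokens and at the end.

    N is the number of word tokens read so far and V the number of
    distinct word types among them. Tokenization by regex is an
    assumption of this sketch, not the paper's definition of a word.
    """
    tokens = re.findall(r"\w+", text.lower())
    seen = set()
    profile = []
    for n, token in enumerate(tokens, start=1):
        seen.add(token)
        if n % step == 0 or n == len(tokens):
            profile.append((n, len(seen), len(seen) / math.sqrt(n)))
    return profile


if __name__ == "__main__":
    sample = "the cat sat on the mat the end"
    for n, v, ratio in type_token_profile(sample, step=4):
        print(n, v, round(ratio, 3))
```

On a long text, a ratio that does not shrink as N grows would be consistent with the number of word types growing at least as fast as the square root of the text length. A toy sample like the one above is far too short to say anything about the law itself.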
On Hilberg's Law and Its Links with Guiraud's Law