Beyond Zipf's law: Modeling the structure of human language

Computer Science – Computation and Language

Scientific paper

Details

9 pages, 4 figures

Human language, the most powerful communication system in history, is closely associated with cognition. Written text is one of the fundamental manifestations of language, and the study of its universal regularities can give clues about how our brains process information and how we, as a society, organize and share it. Still, only classical patterns such as Zipf's law have been explored in depth. In contrast, other basic properties, such as the existence of bursts of rare words in specific documents, the topical organization of collections, or the sublinear growth of vocabulary size with the length of a document, have been studied only one at a time, mainly by applying heuristic methodologies rather than basic principles and general mechanisms. As a consequence, there is a lack of understanding of linguistic processes as complex emergent phenomena. Beyond Zipf's law for word frequencies, here we focus on Heaps' law, burstiness, and the topicality of document collections, which encode correlations within and across documents that are absent in random null models. We introduce and validate a generative model that explains the simultaneous emergence of all these patterns from simple rules. As a result, we find a connection between the bursty nature of rare words and the topical organization of texts, and we identify dynamic word ranking and memory across documents as key mechanisms explaining the nontrivial organization of written text. Our research can have broad implications and practical applications in computer science, cognitive science, and linguistics.
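
The two frequency regularities named in the abstract can be measured directly on any tokenized text. The following is a minimal sketch in Python, not the generative model the paper introduces: it computes the rank-frequency list that Zipf's law describes and the vocabulary growth curve that Heaps' law describes. The function names `rank_frequency` and `vocabulary_growth` and the toy sentence are assumptions made for illustration only.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return word counts sorted from most to least frequent.

    Zipf's law concerns how these counts decay with rank.
    """
    return [count for _, count in Counter(tokens).most_common()]


def vocabulary_growth(tokens):
    """Return the number of distinct words seen after each token.

    Heaps' law concerns how this curve grows with text length
    (sublinearly, according to the abstract).
    """
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth


# Toy input, chosen only to keep the example self-contained.
tokens = "the cat sat on the mat and the dog sat on the cat".split()
freqs = rank_frequency(tokens)
growth = vocabulary_growth(tokens)
```

On a real corpus, plotting `freqs` against rank and `growth` against text length on log-log axes is the usual way to inspect both regularities. A toy sentence like the one above is far too short to show either.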
