Approaching the linguistic complexity

Computer Science – Computation and Language

Scientific paper

Details

to be published in conference proceedings

We analyze the rank-frequency distributions of words in selected English and Polish texts and compare the scaling properties of these distributions in the two languages. We also study a few small corpora of Polish literary texts and find that, for a corpus consisting of texts written by different authors, the basic scaling regime is broken more strongly than for a comparable corpus consisting of texts written by the same author. Similarly, for a corpus consisting of texts translated into Polish from other languages, the scaling regime is broken more strongly than for a comparable corpus of native Polish texts. Moreover, based on the British National Corpus, we consider the rank-frequency distributions of the grammatically basic forms of words (lemmas) tagged with their proper part of speech. We find that these distributions do not scale if each part of speech is analyzed separately. Verbs are the only part of speech that, taken on its own, develops a trace of scaling.
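
The abstract does not describe how the distributions were computed or fitted, so the following is only a minimal sketch of what a rank-frequency analysis can look like. It counts word occurrences, sorts the counts by rank, and estimates a scaling exponent with an ordinary least-squares fit in log-log coordinates. The function names `rank_frequency` and `zipf_exponent`, the tokenization rule, and the choice of a plain least-squares fit are all assumptions made for illustration, not the authors' procedure.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word counts sorted in descending order (index 0 is rank 1).

    Tokenization is an assumption: a word is any run of Unicode letters,
    lowercased, so Polish diacritics are kept.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    return sorted(counts.values(), reverse=True)


def zipf_exponent(frequencies):
    """Estimate alpha in f(r) ~ r**(-alpha) by least squares on log-log data.

    This simple fit is a stand-in for illustration; it weights all ranks
    equally and does not test whether a power law actually holds.
    """
    n = len(frequencies)
    if n < 2:
        raise ValueError("need at least two ranks to fit a slope")
    xs = [math.log(rank) for rank in range(1, n + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx
```

For example, `rank_frequency("the cat and the dog and the bird")` returns `[3, 2, 1, 1, 1]`. Under this sketch, a corpus in which the scaling regime is "broken" would show up as a log-log curve that a single fitted slope describes poorly, so in practice one would inspect the curve over different rank ranges rather than rely on one exponent.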
