A practical approach to language complexity: a Wikipedia case study

Computer Science – Computation and Language

Scientific paper

In this paper we present a statistical analysis of English texts from Wikipedia (WP). We address the issue of language complexity empirically by comparing samples of the main English WP (Main) and the simple English WP (Simple). Simple is supposed to use simpler language with a limited vocabulary, and editors are explicitly requested to follow this guideline, yet in practice the vocabulary richness of the two samples is at the same level. However, a detailed analysis of longer units (n-grams rather than words alone) shows that the language of Simple is indeed less complex than that of Main. Comparing the two language varieties by the Gunning readability index supports this conclusion. We also report on the topical dependence of language complexity, e.g. that the language is more advanced in conceptual articles than in person-based (biographical) and object-based articles. Finally, we investigate the relation between conflict and language complexity by analysing the content of the talk pages associated with controversial and peacefully developing articles, concluding that controversy has the effect of reducing language complexity.
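As an illustration of the readability measure mentioned above, the following is a minimal sketch of the Gunning fog index in its standard form, 0.4 × (words per sentence + 100 × share of complex words), where a complex word has three or more syllables. The abstract does not say how the authors tokenised text or counted syllables, so the function names `count_syllables` and `gunning_fog`, the regular-expression tokenisation, and the vowel-group syllable heuristic are all assumptions made for this example, not the paper's procedure.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: number of runs of consecutive vowels.

    This is a crude heuristic (an assumption of this sketch); it miscounts
    words with silent vowels, e.g. a trailing silent 'e'.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def gunning_fog(text: str) -> float:
    """Gunning fog index: 0.4 * (words/sentences + 100 * complex/words)."""
    # Sentences are split on runs of '.', '!' or '?'; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are maximal runs of ASCII letters.
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    # "Complex" words are those with three or more (estimated) syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100.0 * len(complex_words) / len(words))


if __name__ == "__main__":
    simple_sample = "The cat sat. The dog ran."
    dense_sample = "Complexity is interesting."
    print(gunning_fog(simple_sample))
    print(gunning_fog(dense_sample))
```

Under this definition, a higher score indicates text that is harder to read, so a comparison like the one described in the abstract would compute the index on samples from each corpus and compare the resulting values.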
