A Formal Framework for Linguistic Annotation

Computer Science – Computation and Language

Scientific paper

Details

49 pages

"Linguistic annotation" covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions (audio, video and/or physiological recordings) or it may be textual. The added notations may include transcriptions of all sorts (from phonetic features to discourse structures), part-of-speech and sense tagging, syntactic analysis, "named entity" identification, co-reference annotation, and so on. While there are several ongoing efforts to provide formats and tools for such annotations and to publish annotated linguistic databases, the lack of widely accepted standards is becoming a critical problem. Proposed standards, to the extent they exist, have focussed on file formats. This paper focuses instead on the logical structure of linguistic annotations. We survey a wide variety of existing annotation formats and demonstrate a common conceptual core, the annotation graph. This provides a formal framework for constructing, maintaining and searching linguistic annotations, while remaining consistent with many alternative data structures and file formats.
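
To make the idea of an annotation graph more concrete, here is a minimal Python sketch. It is not the paper's formal definition: it assumes an annotation graph can be modelled as a set of labelled arcs between nodes, where a node may optionally carry a time offset into the underlying signal. The class names (Node, Arc, AnnotationGraph), the method names (add, search, overlapping) and the sample timings are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """A point in the annotated material (hypothetical model)."""
    ident: str
    time: Optional[float] = None  # offset in seconds; None if not anchored


@dataclass(frozen=True)
class Arc:
    """One annotation: a typed label spanning two nodes."""
    start: Node
    end: Node
    kind: str   # annotation layer, e.g. "word", "pos", "phrase"
    label: str  # the annotation content itself


@dataclass
class AnnotationGraph:
    arcs: List[Arc] = field(default_factory=list)

    def add(self, start: Node, end: Node, kind: str, label: str) -> Arc:
        """Construct a new annotation and record it in the graph."""
        arc = Arc(start, end, kind, label)
        self.arcs.append(arc)
        return arc

    def search(self, kind: Optional[str] = None,
               label: Optional[str] = None) -> List[Arc]:
        """Return arcs matching the given layer and/or label."""
        return [a for a in self.arcs
                if (kind is None or a.kind == kind)
                and (label is None or a.label == label)]

    def overlapping(self, t0: float, t1: float) -> List[Arc]:
        """Return time-anchored arcs whose span intersects [t0, t1)."""
        return [a for a in self.arcs
                if a.start.time is not None and a.end.time is not None
                and a.start.time < t1 and a.end.time > t0]


# Illustrative data: two words with made-up timings, annotated on three layers.
n0, n1, n2 = Node("n0", 0.00), Node("n1", 0.42), Node("n2", 0.95)
g = AnnotationGraph()
g.add(n0, n1, "word", "the")
g.add(n1, n2, "word", "cat")
g.add(n0, n1, "pos", "DT")
g.add(n1, n2, "pos", "NN")
g.add(n0, n2, "phrase", "NP")

words = [a.label for a in g.search(kind="word")]
at_half_second = [a.label for a in g.overlapping(0.5, 0.6)]
```

Because every layer shares the same nodes, transcription, tagging and phrase structure can be queried together or separately without committing to any particular file format, which is the kind of format independence the abstract describes.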

