DOI: 10.3115/1118078.1118083

Building a discourse-tagged corpus in the framework of Rhetorical Structure Theory

Published: 01 September 2001

ABSTRACT

We describe our experience in developing a discourse-annotated corpus for community-wide use. Working in the framework of Rhetorical Structure Theory, we were able to create a large annotated resource with very high consistency, using a well-defined methodology and protocol. This resource is made publicly available through the Linguistic Data Consortium to enable researchers to develop empirically grounded, discourse-specific applications.

Published in: SIGDIAL '01: Proceedings of the Second SIGdial Workshop on Discourse and Dialogue - Volume 16, September 2001, 214 pages

Publisher: Association for Computational Linguistics, United States

Overall Acceptance Rate: 19 of 46 submissions, 41%
