ABSTRACT
Discourse analysis is an important natural language processing task. Discourse parsers exist for many languages, such as English and Chinese; they construct discourse trees from text documents for further semantic analysis. However, there is no official release of a Vietnamese discourse treebank to support research on Vietnamese discourse parsing. This paper therefore presents our preliminary results in building a Vietnamese discourse treebank. It also discusses some problems that arise when building a discourse treebank and proposes a discourse annotation framework for it. To show the feasibility of developing discourse parsers for Vietnamese documents, two experiments, one in discourse relation classification and one in discourse nucleus classification, are conducted using the discourse-annotated documents.
Index Terms
- Towards Building Vietnamese Discourse Treebank