Shan Wang


2022

中国语言学研究 70 年:核心期刊的词汇增长(70 Years of Linguistics Research in China: Vocabulary Growth of Core Journals)
Shan Wang (王珊) | Runzhe Zhan (詹润哲) | Shuangyun Yao (姚双云)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

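Since the founding of the People's Republic of China, linguistics in China has made remarkable achievements over 70 years of development. Existing studies mainly present this process by reviewing major historical events, but research that analyzes its diachronic development with quantitative methods is still lacking. Taking vocabulary growth as its entry point, this paper builds the first large-scale diachronic corpus of abstracts from Chinese core journals in linguistics and uses three major vocabulary growth models to predict vocabulary change in the corpus. The best-fitting model, the Heaps model, is then used for an in-depth, stage-by-stage analysis of changes in linguistics vocabulary, which reveals the guiding role of national policy and the characteristics of language life in particular eras. In addition, a validation procedure that is independent of temporal order supports the validity of the method. Keywords: Chinese linguistics; vocabulary growth; core journals; abstracts; corpus; diachronic development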

2021

基于依存语法的偷抢类动词研究(Research of Verbs of Stealing and Robbing Based on Dependency Grammar)
Shan Wang (王珊) | Xiaojun Liu (刘晓骏)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

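This paper selects simple sentences containing Chinese verbs of stealing and robbing and, using the annotation scheme of dependency grammar, quantitatively analyzes the syntactic and semantic dependencies of these verbs. The results show that when these verbs are dependents, they display diverse syntactic functions, internal similarity, and a specificity that distinguishes them from other verb subclasses, and their semantic roles are diversely distributed. When these verbs are heads, their syntactic dependencies vary with their syntactic function. In terms of semantic dependencies, the semantic density of object roles is lower overall than that of subject roles, the most common circumstantial roles are location and time, and among event relations, coordinate events are the most probable. Verbs of stealing and robbing have rich syntactic and semantic properties: the main sentence pattern is subject-verb-object, and the most frequent semantic collocation pattern in it is an agent performing the act of stealing or robbing on a patient. Combining dependency grammar with frame semantics, this study deepens the understanding of the syntax, semantics, and event relations of these verbs and advances research on this verb class.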

近十年来澳门的词汇增长(Macau’s Vocabulary Growth in the Recent Ten Years)
Shan Wang (王珊) | Zhao Chen (陈钊) | Haodi Zhang (张昊迪)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

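Vocabulary growth models fit the quantitative relationship between word types and word tokens and can thereby reflect the diachronic evolution of the vocabulary of a given domain. Macau is a place where many languages and cultures meet, and its vocabulary use can reflect the focus of public attention, yet no study has so far examined the diachronic evolution of Macau's vocabulary. This paper builds the first diachronic corpus of Macau Chinese, fits the corpus's vocabulary change with three major vocabulary growth models, and uses the best-performing one, the Heaps model, to further analyze the relationship between vocabulary evolution and newspaper content. The results show that vocabulary trends in Macau are closely tied to hot news topics, Macau's policy agenda, and people's livelihood. The study also validates the method on shuffled text from which the temporal order has been removed. As the first study of Macau's vocabulary evolution based on a large-scale diachronic corpus, it is important for a deeper understanding of the development of language life in Macau.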
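
The type-token relationship that a vocabulary growth model fits can be illustrated with Heaps' law in its standard form, V = K * N**beta, where N is the number of tokens read so far and V the number of distinct types. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the log-log least-squares fit is one common way to estimate K and beta, and nothing here is taken from the paper's own procedure or corpus.

```python
import math


def vocabulary_growth(tokens):
    """Return (N, V) pairs: tokens read so far and distinct types seen so far."""
    seen = set()
    points = []
    for n, tok in enumerate(tokens, start=1):
        seen.add(tok)
        points.append((n, len(seen)))
    return points


def fit_heaps(points):
    """Estimate K and beta in V = K * N**beta by least squares in log-log space.

    Assumption: a plain linear regression of log V on log N; the paper's
    actual fitting method is not described in the abstract.
    """
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(v) for _, v in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    k = math.exp(mean_y - beta * mean_x)
    return k, beta


# Toy usage on a made-up token sequence (not data from the paper).
curve = vocabulary_growth("a b a c b d".split())
k, beta = fit_heaps(curve)
```

Shuffling the token list before calling `vocabulary_growth` gives a simple order-free baseline, similar in spirit to the shuffled-text check the abstract mentions.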

替换类动词的句法语义分析(Syntactic and Semantic Analysis of verbs of Exchange)
Shan Wang (王珊) | Le Wu (吴乐)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

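Syntactic and semantic analysis has been a focus of natural language processing in recent years, and dependency analysis of large amounts of authentic corpus data makes it possible to explore deep linguistic knowledge. Using a self-developed syntactic and semantic annotation tool, this paper annotates and counts, at the syntactic and semantic levels, example sentences containing four verbs of exchange: “替换”, “调换”, “代替” and “取代”. Based on the results, it generalizes their syntactic behavior into syntactic patterns and analyzes their combinatorial properties and the semantic selectional restrictions that hold under those properties. The study finds that, besides structures specific to each verb, verbs of exchange all occur in the structures “ADV + verb of exchange + VOB” and “verb of exchange + RAD”. They differ in that “取代” occurs in the structure “FOB + 取代” in a certain proportion of cases, while “调换” and “替换” also occur frequently in structures such as “verb of exchange + CMP” and “COO”. Building on the high-frequency syntactic structures, the paper analyzes their semantic dependencies and finds that all four verbs share four semantic dependencies: agent (施事), experiencer (当事), patient (受事), and content (客事). They differ in that the semantic dependencies of “取代” are mostly experiencer (当事); the semantic subject of “替换” is mostly the more agentive agent (施事); and “代替” and “调换” each have their own distinct semantic dependencies and semantic collocation structures.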

回避类动词的句法语义(The Syntax and Semantics of Verbs of Avoiding)
Shan Wang (王珊) | Xiaojun Liu (刘晓骏)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

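Avoidance is an important part of human cognitive experience. Existing studies of verbs of avoiding mostly analyze their implicit negative semantics and their game-like effects in discourse, while deep syntactic and semantic analyses of this verb class are few. This paper takes five disyllabic verbs of avoiding as its object of study and, drawing on dependency grammar, analyzes their syntactic and semantic features on the basis of a large-scale corpus, thereby deepening research on this verb class. The results can also help improve existing Chinese dictionaries. The study offers a valuable reference for Chinese linguistics research, Chinese language teaching, and lexicography.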

欺骗类动词的句法语义研究(On the Syntax and Semantics of Verbs of Cheating)
Shan Wang (王珊) | Jie Zhou (周洁)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

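Cheating is a common social phenomenon, but research on verbs of cheating is very limited. This paper selects simple sentences containing verbs of cheating and carries out a large-scale analysis of their syntactic and semantic dependencies. The study shows that when these verbs are dependents in a sentence, they can serve as different syntactic constituents and semantic roles, while being highly similar to one another in syntactic function. As heads, verbs of cheating show different syntactic co-occurrence patterns depending on the syntactic function they perform. Semantically, the paper describes and explains in detail the semantic dependency features of this verb class along the dimensions of semantic density, subject and object roles, circumstantial roles, and event relations. Although the syntax and semantics of verbs of cheating are diverse, the main sentence pattern is subject-verb-object, and the most frequent semantic collocation pattern in it is an agent carrying out an act of cheating on an affected participant (涉事) and thereby affecting that participant. Combining dependency grammar with frame semantics, and integrating quantitative statistics with qualitative analysis, this study explores the syntax and semantics of verbs of cheating and deepens research on verbal cues to deception and on verbs of speaking.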

2014

Identifying Idioms in Chinese Translations
Wan Yu Ho | Christine Kng | Shan Wang | Francis Bond
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Optimally, a translated text should preserve information while maintaining the writing style of the original. When this is not possible, as is often the case with figurative speech, a common practice is to simplify and make explicit the implications. However, in our investigations of translations from English to another language, English-to-Chinese texts were often found to include idiomatic expressions (usually in the form of Chengyu 成语) where there were originally no idiomatic, metaphorical, or even figurative expressions. We have created an initial small lexicon of Chengyu, which we can use to find all occurrences of Chengyu in a given corpus, and will continue to expand the database. By examining the rates and patterns of occurrence across four genres in the NTU Multilingual Corpus, a resource may be created to aid machine translation or, going further, predict Chinese translational trends in any given genre.

Building The Sense-Tagged Multilingual Parallel Corpus
Shan Wang | Francis Bond
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Sense-annotated parallel corpora play a crucial role in natural language processing. This paper introduces our progress in creating such a corpus for Asian languages using English as a pivot, which is the first such corpus for these languages. Two sets of tools have been developed for sequential and targeted tagging, which are also easy to set up for any new language in addition to those we are annotating. This paper also briefly presents the general guidelines for doing this project. The current results of monolingual sense-tagging and multilingual linking are illustrated, which indicate the differences among genres and language pairs. All the tools, guidelines and the manually annotated corpus will be freely available at compling.ntu.edu.sg/ntumc.

Issues in building English-Chinese parallel corpora with WordNets.
Francis Bond | Shan Wang
Proceedings of the Seventh Global Wordnet Conference

2013

Developing Parallel Sense-tagged Corpora with Wordnets
Francis Bond | Shan Wang | Eshley Huini Gao | Hazel Shuwen Mok | Jeanette Yiwen Tan
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse

Building the Chinese Open Wordnet (COW): Starting from Core Synsets
Shan Wang | Francis Bond
Proceedings of the 11th Workshop on Asian Language Resources

2012

Compositionality of NN Compounds: A Case Study on [N1+Artifactual-Type Event Nouns]
Shan Wang | Chu-Ren Huang | Hongzhi Xu
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation

Type Construction of Event Nouns in Mandarin Chinese
Shan Wang | Chu-Ren Huang
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation

2011

Compound Event Nouns of the ‘Modifier-head’ Type in Mandarin Chinese
Shan Wang | Chu-Ren Huang
Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation

2010

Adjectival Modification to Nouns in Mandarin Chinese: Case Studies on “cháng+noun” and “adjective+tú shū guǎn”
Shan Wang | Chu-Ren Huang
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

Compositional Operations of Mandarin Chinese Perception Verb “kàn”: A Generative Lexicon Approach
Shan Wang | Chu-Ren Huang
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

2007

Classifying Temporal Relations Between Events
Nathanael Chambers | Shan Wang | Dan Jurafsky
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions