Sentential Representations in Distributional Semantics.

dblp(2016)

Abstract
This thesis addresses the problem of representing sentential meaning in distributional semantics. Distributional semantics derives the meanings of words from their usage, based on the hypothesis that words occurring in similar contexts have similar meanings. In this framework, words are modeled as distributions over contexts and are represented as vectors in a high-dimensional space. Compositional distributional semantics attempts to extend this approach to higher-level linguistic structures. Some basic composition models proposed in the literature to obtain the meaning of phrases, and possibly sentences, show promising results in modeling simple phrases. The goal of the thesis is to extend these composition models to obtain sentence meaning representations. The thesis focuses on unsupervised methods, which use the context of phrases and sentences to optimize the parameters of a model. Three different methods are presented. The first is the PLF model, a practical and linguistically motivated composition model based on the lexical function model introduced by Baroni and Zamparelli (2010) and Coecke et al. (2010). The second is the Chunk-based Smoothed Tree Kernels (CSTKs) model, which extends Smoothed Tree Kernels (Mehdad et al., 2010) by utilizing vector representations of chunks. The final model is the C-PHRASE model, a neural network-based approach that jointly optimizes the vector representations of words and phrases using a context-predicting objective. The thesis makes three principal contributions to the field of compositional distributional semantics. The first is a general framework for estimating the parameters of the basic composition models and evaluating them, which provides a fair way to compare the models on a set of phrasal datasets. The second is the extension of these basic models to the sentence level, using syntactic information to build up the sentence vectors. The third is an evaluation of all the proposed models, showing that they perform on par with or outperform competing models presented in the literature.

Thesis Supervisor: Marco Baroni
Title: Associate Professor