The Universal Transformer combines the following key properties into one model: Weight sharing: Following intuitions behind weight sharing found in CNNs and recurrent neural networks, we extend the Transformer with a simple form of weight sharing that strikes an effective balance...
We empirically show that the global and the local memory pointer are able to effectively produce system responses even in the out-of-vocabulary scenario, and visualize how the global memory pointer helps as well
We empirically showed that when dependency parsers are not available for certain languages, such as code-mixed languages, we can use word co-occurrence frequencies and positive pointwise mutual information (PPMI) values to extract a contextual graph and use such a graph with Graph Convolu...
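The graph extraction described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `ppmi_graph`, the sliding-window size, and the choice to estimate word probabilities from pair marginals are all assumptions made for the example.

```python
import math
from collections import Counter


def ppmi_graph(sentences, window=2):
    """Build an undirected word graph weighted by positive PMI.

    `sentences` is a list of token lists. Two words co-occur when they
    appear within `window` tokens of each other (hypothetical choice).
    Returns {(w1, w2): ppmi} with w1 < w2, keeping only positive values.
    """
    pair_counts = Counter()
    word_counts = Counter()
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                v = tokens[j]
                if v == w:
                    continue  # skip self-loops
                pair_counts[tuple(sorted((w, v)))] += 1
                # Marginals are taken over co-occurrence events, so each
                # pair contributes one count to each of its two words.
                word_counts[w] += 1
                word_counts[v] += 1

    total = sum(pair_counts.values())
    edges = {}
    for (w, v), c in pair_counts.items():
        p_wv = c / total
        p_w = word_counts[w] / (2 * total)
        p_v = word_counts[v] / (2 * total)
        pmi = math.log(p_wv / (p_w * p_v))
        if pmi > 0:  # "positive" PMI: drop non-positive associations
            edges[(w, v)] = pmi
    return edges


edges = ppmi_graph([["a", "b"], ["a", "b"], ["c", "d"]], window=1)
```

The resulting edge dictionary could then be turned into an adjacency matrix for a graph-based encoder; how the weights are normalized before that step is left open here.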
Recent empirical improvements due to transfer learning with language models have demonstrated that rich, unsupervised pre-training is an integral part of many language understanding systems
We achieve state-of-the-art results on multiple benchmark knowledge base completion tasks and we show that our model is robust and can learn long chains of reasoning
Categorize and detect one type of clinical annotation stored in the hospital Picture Archiving and Communication System (PACS) as a rich retrospective data source, to build a large-scale radiology lesion image database
We study how to automatically generate textual reports for medical images, with the goal of helping medical professionals produce reports more accurately and efficiently
Our proposed measures and the analysis of strategies used by different publications and articles suggest new directions for evaluating the difficulty of summarization tasks and for developing future summarization models
We propose a denoising distantly supervised open-domain question answering system which contains a paragraph selector to skim over paragraphs and a paragraph reader to perform an intensive reading on the selected paragraphs
Like web data in computer vision, a vast, loosely-labeled, and largely untapped data source does exist in the form of hospital picture archiving and communication systems
We presented 3D context enhanced region-based convolutional neural networks to leverage the 3D context when detecting lesions in volumetric data. 3D context enhanced region-based CNN is
Some SQuAD 2.0 questions are unlikely to be asked without significant foreknowledge of the context material and do not occur in QuAC. Both SQuAD 2.0 and QuAC cover a significant number of unanswerable questions that could plausibly be in the article
Results show that the three types of edges are useful for combining global evidence and that the graph neural networks are effective at encoding the complex graphs resulting from the first step
Annotators show substantial agreement when constructing dialogs with a three-way annotator agreement at a Fleiss’ Kappa level of 0.71. Likewise, we find that
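For readers unfamiliar with the agreement statistic mentioned above, here is a minimal sketch of how Fleiss' Kappa is computed from a table of rating counts. The function name and the toy ratings are hypothetical; they illustrate the standard formula with three raters and do not reproduce the reported 0.71.

```python
def fleiss_kappa(ratings):
    """Fleiss' Kappa for a table of rating counts.

    ratings[i][j] is the number of raters who assigned item i to
    category j. Every item must be rated by the same number of raters
    (at least two). The result is undefined (division by zero) when
    all ratings fall into a single category.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_cats = len(ratings[0])
    if any(sum(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number of ratings")

    # Share of all assignments that went to each category.
    p_cat = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    # Per-item agreement: fraction of rater pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_item) / n_items        # observed agreement
    p_exp = sum(p * p for p in p_cat)    # agreement expected by chance
    return (p_bar - p_exp) / (1 - p_exp)


# Toy example: 4 items, 3 raters, 2 categories (made-up counts).
toy = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa = fleiss_kappa(toy)
```

A value of 1 means perfect agreement, while values near 0 mean agreement no better than chance.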
We presented the error type distribution obtained by manually analyzing 100 bad responses sampled from each of the Soft Typed Decoder and the Hard Typed Decoder, where "bad" means the response by our model is worse than that by some baseline during the pair-wise annotation
We present the Visual Knowledge Memory Network method as an efficient way to leverage a pre-built visual knowledge base for accurate visual question answering