In this paper, we introduce a multilingual knowledge graph (KG) to the CLIR task, since it provides rich information about entities in multiple languages. We propose a model named CLIR with hierarchical knowledge enhancement (HIKE) for this task. The proposed model encodes the textual information in queries, documents and the KG with multilingual BERT, and incorporates the KG information into the query-document matching process through a hierarchical information fusion mechanism. Experimental results demonstrate that HIKE achieves substantial improvements over state-of-the-art competitors.
In this paper, we investigate the unified ABSA task from the perspective of Machine Reading Comprehension (MRC), observing that the aspect and opinion terms can serve interchangeably as the query and the answer in MRC. We propose a new paradigm named Role Flipped Machine Reading Comprehension (RF-MRC) to resolve it. At its heart, the predicted results of either Aspect Term Extraction (ATE) or Opinion Term Extraction (OTE) are regarded as the queries, and the matched opinion or aspect terms, respectively, are extracted as answers. The queries and answers can be flipped for multi-hop detection. Finally, the sentiment of every matched aspect-opinion pair is predicted by the sentiment classifier. RF-MRC can solve the ABSA task without extra data annotation. Experiments on three widely used benchmarks and a challenging dataset demonstrate the superiority of the proposed framework. [paper]
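The role-flipping loop described in this abstract can be illustrated with a minimal sketch. It is not the RF-MRC implementation: the MRC reader is replaced by two hypothetical lookup functions (`answer_opinions`, `answer_aspects`) over a hand-written link table, the sentiment classifier by a toy lexicon (`TOY_POLARITY`), and the hop limit `max_hops` is an assumed parameter.

```python
def role_flipped_pairs(aspect_seeds, opinion_seeds, answer_opinions, answer_aspects, max_hops=3):
    """Collect (aspect, opinion) pairs by alternating which side acts as the query."""
    pairs = set()
    aspects, opinions = set(aspect_seeds), set(opinion_seeds)
    for _ in range(max_hops):
        found = set()
        # Aspects act as queries, opinions are the answers.
        for aspect in aspects:
            found.update((aspect, opinion) for opinion in answer_opinions(aspect))
        # Flipped roles: opinions act as queries, aspects are the answers.
        for opinion in opinions:
            found.update((aspect, opinion) for aspect in answer_aspects(opinion))
        if found <= pairs:  # this hop produced no new pair
            break
        pairs |= found
        # Newly found terms become queries in the next hop.
        aspects |= {a for a, _ in found}
        opinions |= {o for _, o in found}
    return pairs


# Hypothetical stand-ins for the reader, for the toy sentence
# "The pizza was great but the service was slow and rude."
LINKS = {("pizza", "great"), ("service", "slow"), ("service", "rude")}
TOY_POLARITY = {"great": "positive", "slow": "negative", "rude": "negative"}


def answer_opinions(aspect):
    return [o for a, o in LINKS if a == aspect]


def answer_aspects(opinion):
    return [a for a, o in LINKS if o == opinion]


# Assume ATE only found "pizza" and OTE only found "slow".
pairs = role_flipped_pairs({"pizza"}, {"slow"}, answer_opinions, answer_aspects)
triples = sorted((a, o, TOY_POLARITY[o]) for a, o in pairs)
```

In this toy run the first hop links "pizza" to "great" and "slow" to "service"; only the second hop, where "service" is reused as a query, recovers "rude", which is the kind of case multi-hop detection is meant to cover.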
In this paper, we formulate the personalized news headline generation problem, whose goal is to output a user-specific title based on both a user’s reading interests and a candidate news body to be exposed to her. To build a benchmark for this problem, we release a large-scale dataset named PENS. The training set is collected from user impression logs of Microsoft News, and the test set is manually created by hundreds of native speakers to provide a fair testbed for evaluating models offline. We propose a generic framework as a preparatory solution to our problem. We investigate our dataset by implementing several state-of-the-art user modeling methods in our framework to establish benchmark scores for the proposed dataset. The dataset is available at https://msnews.github.io/pens.html. [paper][code]
In this paper, we propose a multi-task learning approach named MIN to make flexible use of sub-tasks for unified ABSA. We divide the sub-tasks of ABSA into extractive sub-tasks and classification sub-tasks, and optimize these sub-tasks in a unified manner with multiplex interaction mechanisms. Specifically, we devise a pairwise attention to exploit bidirectional interactions between any pair of extractive sub-tasks, and a consistency-weighting to perform unidirectional interaction from an extractive sub-task to a classification sub-task. Since the proposed interaction mechanisms are task-agnostic, our model can also work well when some specific sub-tasks are absent. [paper][code]
In this paper, we aim to improve ATSA by discovering the potential aspect terms of the predicted sentiment polarity when the aspect terms of a test sentence are unknown. We achieve this goal by proposing a capsule-network-based model named CAPSAR. In CAPSAR, sentiment categories are denoted by capsules, and aspect term information is injected into the sentiment capsules through a sentiment-aspect reconstruction procedure during training. As a result, coherent patterns between aspects and sentiment expressions are encapsulated by these sentiment capsules. Experiments on three widely used benchmarks demonstrate that these patterns have potential for discovering aspect terms in a test sentence when only the sentence is fed to the model. [paper]
In this paper, we propose a neural model named TWASP for joint CWS and POS tagging following the character-based sequence labeling paradigm, where a two-way attention mechanism is used to incorporate both context features and their corresponding syntactic knowledge for each input character. In particular, we use existing language processing toolkits to obtain auto-analyzed syntactic knowledge for the context, and the proposed attention module can learn and benefit from it even though its quality may not be perfect. Our experiments illustrate the effectiveness of the two-way attentions for joint CWS and POS tagging, where state-of-the-art performance is achieved on five benchmark datasets. [paper]
In this paper, we present a new large-scale Multi-Aspect Multi-Sentiment (MAMS) dataset, in which each sentence contains at least two different aspects with different sentiment polarities. The release of this dataset would push forward the research in this field. In addition, we propose simple yet effective CapsNet and CapsNet-BERT models which combine the strengths of recent NLP advances. Experiments on our new dataset show that the proposed model significantly outperforms the state-of-the-art baseline methods. [paper]
In this work, we re-examine extractive text summarization by simulating the way humans extract summaries. We adopt a convolutional neural network to encode the gist of paragraphs for rough reading, and a decision-making policy with an adapted termination mechanism for careful reading. [paper][code]
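To make the idea of a termination mechanism concrete, the following is a minimal sketch of extraction with an explicit stop action. It rests on strong assumptions: the abstract describes a learned decision-making policy, whereas here the sentence scores and the stop score are fixed numbers supplied by hand, and the function name `select_with_termination` is hypothetical.

```python
def select_with_termination(sentence_scores, stop_score, max_sentences=None):
    """Greedily pick sentence indices until the stop action scores at least
    as high as every remaining sentence (or an optional cap is reached)."""
    remaining = dict(enumerate(sentence_scores))
    chosen = []
    while remaining:
        if max_sentences is not None and len(chosen) >= max_sentences:
            break
        best = max(remaining, key=remaining.get)
        if stop_score >= remaining[best]:
            break  # the stop action wins: terminate extraction
        chosen.append(best)
        del remaining[best]
    return sorted(chosen)  # restore document order


# Toy scores for a four-sentence document (assumed values, not model output).
summary_ids = select_with_termination([0.2, 0.9, 0.4, 0.7], stop_score=0.5)
```

With these toy values the sentences at indices 1 and 3 are selected and extraction stops before index 2, so the summary length is decided by the stop action rather than fixed in advance.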
In this work, we present an unsupervised neural framework that leverages sememes to enhance lexical semantics. We propose a sememe attention structure to represent word meanings and add an RNN sentence encoder to guide the sememe exploration. The experimental results show that our model is superior to existing models, especially in identifying infrequent aspects. [paper][code]
We propose an interpretable framework coined FISHQA (FInancial Sentiment analysis network with Hierarchical Query-driven Attention) for financial sentiment analysis. Multiple user-specified queries contribute to distilling the document representation through a query-based attention mechanism. The experiments demonstrate that our framework can learn a better representation of the document, uncover meaningful clues in response to different users’ preferences, and outperform the state-of-the-art methods. [paper][code]
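As an illustration of query-based attention, here is a minimal single-level sketch in plain Python. It is not the FISHQA architecture: the hierarchical structure is omitted, dot-product scoring is an assumed choice, and the token vectors and the query names ("risk", "growth") are made-up toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def query_attention(query, token_vectors):
    """Pool token vectors into one vector, weighting each token by the
    softmax of its dot product with the query."""
    weights = softmax([dot(query, t) for t in token_vectors])
    dim = len(token_vectors[0])
    pooled = [sum(w * t[i] for w, t in zip(weights, token_vectors)) for i in range(dim)]
    return pooled, weights


# Toy document of three 2-d token vectors and two user-specified queries.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = {"risk": [2.0, 0.0], "growth": [0.0, 2.0]}

doc_repr = []  # concatenation of one pooled vector per query
attention = {}  # per-query token weights, inspectable for interpretation
for name, query in queries.items():
    pooled, weights = query_attention(query, tokens)
    doc_repr.extend(pooled)
    attention[name] = weights
```

Each query yields its own weight distribution over the tokens, so the stored `attention` values show which parts of the document a given query attended to.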