In this paper, we adopt multi-scale behavior sequences generated from different granularities of web page structures and propose a model named SAH-RNN to consume these sequences for online payment fraud detection. SAH-RNN has stacked RNN layers in which the upper layers, which model compendious behaviors, are updated less frequently and receive summarized representations from the lower layers. A dual attention mechanism is devised to capture the impacts of both the sequential information within the same sequence and the structural information among different granularities of web pages. [paper]
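The update schedule described above can be illustrated with a minimal sketch. It is not SAH-RNN itself: the RNN cells are replaced by a simple exponential moving average, attention is omitted, and the function name `run_two_level`, the `(page_id, feature)` event format, and the `decay` parameter are all assumptions made for illustration. It only shows a lower level that updates on every event and an upper level that updates once per page, fed by a summary of the lower states.

```python
from typing import List, Tuple


def run_two_level(events: List[Tuple[str, float]], decay: float = 0.5):
    """Toy two-level recurrence (hypothetical stand-in for stacked RNN layers).

    events: (page_id, feature) pairs in time order.
    The lower state is updated at every event; the upper state is updated
    only when the page changes (and once at the end), using the mean of the
    lower states seen on that page as its input.
    """
    lower, upper = 0.0, 0.0
    lower_states: List[float] = []
    upper_states: List[float] = []
    segment: List[float] = []  # lower states collected on the current page
    prev_page = None

    for page, x in events:
        if prev_page is not None and page != prev_page:
            # Page boundary: summarize the finished page and update the upper level.
            summary = sum(segment) / len(segment)
            upper = decay * upper + (1.0 - decay) * summary
            upper_states.append(upper)
            segment = []
        # The lower level is updated at every event.
        lower = decay * lower + (1.0 - decay) * x
        lower_states.append(lower)
        segment.append(lower)
        prev_page = page

    if segment:
        # Summarize the last page as well.
        summary = sum(segment) / len(segment)
        upper = decay * upper + (1.0 - decay) * summary
        upper_states.append(upper)

    return lower_states, upper_states
```

With three events spread over two pages, the lower level produces three states while the upper level produces only two, one per page.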
In this paper, a heterogeneous transaction-intention network is devised to leverage the cross-interaction information over transactions and intentions. It consists of two types of nodes, namely transaction and intention nodes, and two types of edges, i.e., transaction-intention and transaction-transaction edges. We then propose a graph neural method coined IHGAT that not only perceives sequence-like intentions but also encodes the relationships among transactions. Extensive experiments on a real-world dataset from the Alibaba platform show that our proposed algorithm outperforms state-of-the-art methods in both offline and online modes. [paper]
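The network structure described above (two node types, two edge types) could be held in a container like the following sketch. The class name `TransactionIntentionGraph`, its methods, and the example identifiers are hypothetical; the sketch covers only the data structure, not the IHGAT model.

```python
from collections import defaultdict


class TransactionIntentionGraph:
    """Hypothetical container for a heterogeneous transaction-intention network."""

    EDGE_TYPES = ("transaction-intention", "transaction-transaction")

    def __init__(self):
        self.transactions = set()  # transaction node ids
        self.intentions = set()    # intention node ids
        # One adjacency map per edge type: node id -> set of neighbor ids.
        self.adj = {etype: defaultdict(set) for etype in self.EDGE_TYPES}

    def add_transaction_intention(self, txn, intention):
        """Link a transaction node to an intention node."""
        self.transactions.add(txn)
        self.intentions.add(intention)
        self.adj["transaction-intention"][txn].add(intention)
        self.adj["transaction-intention"][intention].add(txn)

    def add_transaction_transaction(self, txn_a, txn_b):
        """Link two transaction nodes."""
        self.transactions.update((txn_a, txn_b))
        self.adj["transaction-transaction"][txn_a].add(txn_b)
        self.adj["transaction-transaction"][txn_b].add(txn_a)

    def neighbors(self, node, edge_type):
        """Return the sorted neighbors of a node under one edge type."""
        if edge_type not in self.EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        return sorted(self.adj[edge_type].get(node, ()))


# Usage with made-up ids.
g = TransactionIntentionGraph()
g.add_transaction_intention("t1", "i_browse")
g.add_transaction_intention("t1", "i_pay")
g.add_transaction_transaction("t1", "t2")
```

Keeping one adjacency map per edge type lets a model aggregate over each relation separately, which is the usual reason for treating the graph as heterogeneous.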
To remedy the class imbalance problem of graph-based fraud detection, we propose a Pick and Choose Graph Neural Network (PC-GNN for short) for imbalanced supervised learning on graphs. First, nodes and edges are picked with a devised label-balanced sampler to construct sub-graphs for mini-batch training. Next, for each node in the sub-graph, the neighbor candidates are chosen by a proposed neighborhood sampler. Finally, information from the selected neighbors and different relations is aggregated to obtain the final representation of a target node. Experiments on both benchmark and real-world graph-based fraud detection tasks demonstrate that PC-GNN clearly outperforms state-of-the-art baselines. [paper][code]
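The "pick" step above can be sketched in a simplified form. This sketch assumes that a node's sampling weight is just the inverse of its label frequency; the function names are hypothetical, and the sampler in the paper may use additional graph information that is not reproduced here.

```python
import random
from collections import Counter


def label_balanced_weights(labels):
    """Give each node a weight inversely proportional to its label frequency."""
    freq = Counter(labels)
    return [1.0 / freq[y] for y in labels]


def pick_nodes(labels, k, seed=0):
    """Sample k node indices (with replacement) using label-balanced weights."""
    rng = random.Random(seed)
    weights = label_balanced_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=k)


# Nine benign nodes (label 0) and one fraud node (label 1).
labels = [0] * 9 + [1]
weights = label_balanced_weights(labels)
picked = pick_nodes(labels, k=5)
```

Under this weighting each class receives the same total sampling mass, so the single fraud node is drawn about as often as all nine benign nodes combined.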
In this paper, we propose an end-to-end multi-view and multi-task learning based approach named MvMoE (Multi-view-aware Mixture-of-Experts network) to solve credit risk and credit limit forecasting simultaneously. First, a multi-view network with a hierarchical attention mechanism is constructed to distill users’ heterogeneous financial information into shared hidden representations. Then, we jointly train these two tasks with a view-aware multi-gate mixture-of-experts network and a subsequent progressive network to improve their performance. On a real-world dataset containing 5.44 million users, we demonstrate that the proposed model improves AP by over 5.60% on credit risk forecasting and MAE by over 9.52% on credit limit forecasting. [paper]
In this paper, we propose a semi-supervised meta-learning based approach called TRUST (TRainable Undersampling with Self Training) to resolve the class-imbalance problem in credit risk forecasting. First, it decides whether to sample the data through meta-learning based reinforcement learning. Second, it learns the distribution of the data that have not yet shown financial performance via self-training and updates the model trained in the first step. Finally, the updated model is evaluated on the validation dataset, and the result is fed back through the evaluator. These three steps are iterated until the model converges. Experimental results on a real-world industrial dataset containing 1.75 million users show that the proposed method improves AP by over 5.94% on the credit risk forecasting task compared with recent methods. [paper]
In this paper, we propose a novel adversarial data augmentation method to solve the class imbalance problem in financial credit risk assessment. We train a generator for synthetic sample generation with a discriminator to identify real or fake instances. Besides, an auxiliary risk discriminator is trained cooperatively with the generator to assess the credit risk. Experimental results on three real-world datasets demonstrate the effectiveness of the proposed framework. [paper]
In this paper, we devise a tree-like structure named behavior tree to reorganize user behavioral data, in which a group of successive sequential actions denoting a specific user intention is represented as a branch of the tree. We then propose a novel neural method coined LIC Tree-LSTM (Local Intention Calibrated Tree-LSTM) to utilize the behavior tree for fraudulent transaction detection. We investigate the effectiveness of LIC Tree-LSTM on a real-world dataset from the Alibaba platform, and the experimental results show that our proposed algorithm outperforms state-of-the-art methods in both offline and online modes. [paper]
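The reorganization of a flat action sequence into branches can be sketched as follows. The sketch assumes each action already carries an intention label; how intentions are actually identified is not described here, and the function name, the dictionary layout, and the example labels are hypothetical.

```python
from itertools import groupby


def build_behavior_tree(actions):
    """Group successive actions with the same intention into one branch.

    actions: (intention_label, action_name) pairs in time order.
    Returns a dict with a single root whose children are the branches.
    """
    branches = [
        {"intention": label, "actions": [name for _, name in group]}
        for label, group in groupby(actions, key=lambda a: a[0])
    ]
    return {"root": branches}


# Made-up action log for illustration.
log = [
    ("browse", "view_item"),
    ("browse", "view_shop"),
    ("pay", "add_to_cart"),
    ("pay", "checkout"),
    ("browse", "view_item"),
]
tree = build_behavior_tree(log)
```

Because only successive actions are grouped, an intention that recurs later in the session opens a new branch instead of being merged into the earlier one.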
In this paper, we propose a multi-view attributed heterogeneous information network based approach coined MAHINDER for defaulter detection. First, multiple views of user behaviors are adopted to learn personal profiles, owing to the endogenous aspect of financial default. Second, local behavioral patterns are specifically modeled since financial default is adversarial and accumulated. The experimental results on real-world datasets from the Alibaba platform show that the proposed approach improves AUC by over 2.8% and Recall@Precision=0.1 by over 13.1% compared with state-of-the-art methods. [paper]
In this paper, we construct two graphs to represent the user interactions on social media and propose a hierarchical cross-modal embedding method that takes the high-order relationships into consideration. The key notion behind our method is a novel hierarchical embedding framework with meta-graphs connecting different layers. We introduce both inter-record and intra-record meta-graph structures, which enable learning distributed representations that preserve high-order proximities across graphs from different layers. Our empirical experiments on three real-world datasets demonstrate that our method not only outperforms state-of-the-art methods for spatiotemporal activity prediction, but also captures cross-modal proximity at a finer granularity. [paper]
Most existing FEM (Frequent Episode Mining) solutions are time-consuming. For fast-growing sequence data, old episodes may become obsolete while new useful episodes keep emerging. We propose an algorithm named MESELO (Mining frEquent Serial Episode via Last Occurrence), which uses an episode trie to store all minimal occurrences of episodes and adapts to rapidly growing data. We theoretically prove the soundness and completeness of the proposed algorithm, and experimental results on both synthetic and real datasets show its superiority. [paper][code]
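To make "minimal occurrences" concrete, here is a brute-force sketch, not MESELO and without the episode trie. It assumes the standard definition: a time window is a minimal occurrence of a serial episode if the episode's events occur in order inside it and inside no proper sub-window. The function name and input format are hypothetical, and timestamps are assumed to be distinct and increasing.

```python
def minimal_occurrences(sequence, episode):
    """Return the minimal occurrence windows (start, end) of a serial episode.

    sequence: (timestamp, event) pairs with strictly increasing timestamps.
    episode:  event types that must appear in this order.
    """
    seen_starts = set()
    windows = []
    for j, (t_end, event) in enumerate(sequence):
        if event != episode[-1]:
            continue
        # Scan backward, matching the remaining events as late as possible;
        # this finds the latest start for an occurrence ending at position j.
        need = len(episode) - 2
        i = j
        while need >= 0 and i > 0:
            i -= 1
            if sequence[i][1] == episode[need]:
                need -= 1
        if need >= 0:
            continue  # no occurrence ends at position j
        # A later end with the same latest start would contain this window,
        # so only the first (earliest) end per start is minimal.
        if i in seen_starts:
            continue
        seen_starts.add(i)
        windows.append((sequence[i][0], t_end))
    return windows


seq = [(1, "A"), (2, "B"), (3, "A"), (4, "C"), (5, "B")]
result = minimal_occurrences(seq, ["A", "B"])
```

In this example the window from time 1 to time 5 also contains A followed by B, but it is not minimal because it contains the smaller window from time 1 to time 2.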
We introduce the concept of the fixed-gap episode and develop a trie-based data structure to mine such precise-positioning episode rules with several pruning strategies. A fixed-gap episode consists of an ordered set of events in which the elapsed time between any two consecutive events is a constant. Experimental results on real datasets show that the solution can also satisfy the requirements of many time-sensitive applications. [paper][code]
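The definition above can be checked with a small brute-force sketch; it is not the trie-based miner. It assumes each consecutive pair of events has its own fixed gap, supplied in the hypothetical `gaps` list, and the function name and input format are also assumptions.

```python
from collections import defaultdict


def fixed_gap_occurrences(sequence, events, gaps):
    """Return the start times at which a fixed-gap episode occurs.

    sequence: (timestamp, event) pairs.
    events:   ordered event types of the episode.
    gaps:     gaps[i] is the exact elapsed time between events[i] and events[i + 1].
    """
    if len(gaps) != len(events) - 1:
        raise ValueError("need exactly one gap per consecutive event pair")
    # Index the sequence by timestamp; several events may share a timestamp.
    at = defaultdict(set)
    for t, event in sequence:
        at[t].add(event)

    starts = []
    for t in sorted(at):
        if events[0] not in at[t]:
            continue
        current = t
        matched = True
        for event, gap in zip(events[1:], gaps):
            current += gap
            if event not in at.get(current, ()):
                matched = False
                break
        if matched:
            starts.append(t)
    return starts


seq = [(1, "A"), (3, "B"), (4, "A"), (6, "B"), (7, "A"), (8, "B")]
starts = fixed_gap_occurrences(seq, ["A", "B"], gaps=[2])
```

The A at time 7 is not counted because the following B arrives after one time unit instead of two, which is the kind of precise positioning a fixed-gap rule is meant to capture.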
In this work, we propose a scalable distributed framework, LA-FEMH (Large-scale Frequent Episode Mining with Hierarchies), which partitions the sequence into pieces. We adopt optimized rewriting techniques and devise a local mining algorithm, PEM (Peak Episode Miner), to improve local mining performance. We also extend our framework and propose LA-FEMH+ to support other episode mining tasks, such as maximal and closed episode mining, in the context of event hierarchies. [paper]