In this paper, we propose a novel Differentially Private Graph Neural Network based on Structured Perturbation (GRASP), which combines independent and identical noise to achieve bidirectional shifts in the embedding similarity distribution, thereby effectively disrupting the ranking structure and enhancing defense against Graph Reconstruction Attacks (GRA). [paper] [code]
To enhance the stealthiness of graph backdoors, we propose SPEAR, a novel structure-preserving graph backdoor attack that avoids modifying the graph's topology. SPEAR operates within a limited attack budget by selectively perturbing node attributes while ensuring the triggers exert significant influence through a global importance-driven feature selection strategy. Additionally, a neighborhood-aware trigger generator is employed to underpin a high attack success rate by utilizing semantic information from the neighborhood. SPEAR amplifies effectiveness and stealthiness by combining subtle yet impactful attribute manipulation with a refined trigger generation mechanism. [paper] [code]
We propose DGNN-SR, a novel framework for credit risk assessment that jointly encodes dynamic transaction graphs and static fund transfer graphs. DGNN-SR uses a multi-view time encoder to capture both relative and absolute temporal information, and introduces an adaptive re-weighting strategy to fuse static relations into dynamic representations. Experiments on real-world datasets show DGNN-SR achieves a 0.85%–2.5% performance improvement over existing SOTA methods. [paper]
We introduce LOGIN, a framework that integrates Large Language Models (LLMs) as consultants within GNN training. LOGIN crafts semantic and topological prompts for nodes and adaptively leverages LLM responses to refine GNNs. Experiments on node classification tasks show that even basic GNNs, with LLM consultation, can match the performance of advanced GNNs. [paper] [code]
We present a new adversarial training paradigm for graph attack defense by re-examining both poisoning and evasion attacks from an out-of-distribution (OOD) perspective. Our method incorporates OOD detection into adversarial training, addressing the shortcomings of conventional approaches and improving robustness against adaptive attacks. [paper] [code]
We propose F2GNN, an adaptive filter with feature segmentation for graph-based fraud detection. By segmenting user features and applying adaptive graph filters to each segment, F2GNN better captures subtle fraudulent behaviors and addresses class imbalance. Experiments on real-world datasets show that F2GNN outperforms state-of-the-art methods. [paper] [code]
We propose ASD-VAE, a neural model that learns a shared latent space from both attribute and structural views of graphs to robustly handle high rates of missing node attributes. ASD-VAE uses a coupled-and-decoupled learning process for multimodal fusion and imputation. Experiments on four real-world incomplete graph datasets show that ASD-VAE outperforms state-of-the-art methods in dealing with missing attributes and improves downstream graph learning tasks. [paper] [code]
We propose FLOOD, a framework for OOD generalization on graphs that combines invariant learning with bootstrapped self-supervised refinement. FLOOD learns invariant representations across augmented environments and enables flexible encoder adaptation to the test distribution. Experiments show FLOOD consistently outperforms prior graph OOD generalization methods for both transductive and inductive node classification tasks. [paper]
We reveal that gradient-based attacks on GNNs for semi-supervised node classification concentrate adversarial edges around training nodes, explaining their effectiveness from a data distribution perspective. Based on this insight, we provide nine practical tips for both attack and defense, and propose a fast attack method and a self-training defense method that outperform state-of-the-art approaches and scale to large graphs. Extensive experiments on benchmark datasets validate our findings. [paper] [code]
We propose TLT-KGE, a method for temporal knowledge graph embedding that encodes semantic and temporal information as different axes in complex or quaternion spaces. By modeling their independence and interaction, TLT-KGE enables better distinction of entities and relations across timestamps. Experiments show that TLT-KGE significantly outperforms state-of-the-art methods on temporal knowledge graph completion tasks. [paper] [code]
We propose NGS, a framework that formalizes GNN message passing as a meta-graph and uses neural architecture search to optimize the graph structure for fraud detection. By aggregating multiple searched meta-graphs, NGS achieves superior performance and provides interpretable explanations. Experiments on real-world datasets show NGS outperforms state-of-the-art baselines. [paper]
We introduce an Uncertainty-aware Debiasing (UD) framework that addresses GNN bias toward homophilous nodes in mixed-structure graphs. UD estimates output uncertainty to identify heterophilous nodes, then prunes and retrains GNN parameters to improve performance on these nodes. Applied to both homophilous and heterophilous GNNs, UD consistently enhances performance and reduces the gap between homophilous and heterophilous nodes on various datasets. [paper]
We propose STABLE, an unsupervised pipeline that learns robust node representations insensitive to structural perturbations for graph structure optimization. The refined graph is then used by an advanced GCN, which improves robustness over standard GCNs without additional computational cost. [paper] [code]
We propose Bi-Level Selection (BLS), an algorithm that improves GNNs for fraud detection by selecting valuable nodes at both the instance and neighborhood levels using meta gradients from a clean validation set. BLS suppresses class imbalance and label noise, and can be applied to most GNNs. Experiments on real-world datasets show BLS significantly boosts GNN performance in fraud detection tasks. [paper]
We propose AO-GNN, a model that addresses label imbalance and noisy edges in GNN-based fraud detection by decoupling AUC maximization into classifier parameter search and edge pruning policy search. AO-GNN uses AUC-oriented stochastic gradients and a reinforcement learning module for edge pruning. Experiments on real-world datasets show AO-GNN significantly outperforms state-of-the-art baselines in AUC and other metrics. [paper]
In this paper, we adopt multi-scale behavior sequences generated from different granularities of web page structures and propose a model named SAH-RNN to consume these sequences for online payment fraud detection. SAH-RNN has stacked RNN layers in which upper layers, which model more condensed behaviors, are updated less frequently and receive summarized representations from lower layers. A dual attention mechanism is devised to capture both sequential information within the same sequence and structural information across different granularities of web pages. [paper]
In this paper, a heterogeneous transaction-intention network is devised to leverage the cross-interaction information over transactions and intentions. It consists of two types of nodes, namely transaction and intention nodes, and two types of edges, i.e., transaction-intention and transaction-transaction edges. We then propose a graph neural method coined IHGAT that not only perceives sequence-like intentions, but also encodes the relationships among transactions. Extensive experiments on a real-world dataset from the Alibaba platform show that our proposed algorithm outperforms state-of-the-art methods in both offline and online modes. [paper]
To remedy the class imbalance problem of graph-based fraud detection, we propose a Pick and Choose Graph Neural Network (PC-GNN for short) for imbalanced supervised learning on graphs. First, nodes and edges are picked with a devised label-balanced sampler to construct sub-graphs for mini-batch training. Next, for each node in the sub-graph, the neighbor candidates are chosen by a proposed neighborhood sampler. Finally, information from the selected neighbors and different relations is aggregated to obtain the final representation of a target node. Experiments on both benchmark and real-world graph-based fraud detection tasks demonstrate that PC-GNN clearly outperforms SOTA baselines. [paper] [code]
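As a rough illustration of the label-balanced sampling idea only, the sketch below draws nodes with probability inversely proportional to the frequency of their class, so that each class contributes equal total sampling mass. This is a simplified assumption, not the sampler from the paper, which may also take graph structure into account; the function names `label_balanced_weights` and `sample_balanced_batch` are hypothetical.

```python
import random
from collections import Counter


def label_balanced_weights(labels):
    """Return one sampling weight per node, inversely proportional
    to the frequency of that node's class (simplifying assumption)."""
    freq = Counter(labels)
    return [1.0 / freq[y] for y in labels]


def sample_balanced_batch(labels, batch_size, seed=None):
    """Draw `batch_size` node indices (with replacement) using the
    class-balanced weights above."""
    rng = random.Random(seed)
    weights = label_balanced_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=batch_size)


if __name__ == "__main__":
    # Three benign nodes (0) and one fraudulent node (1).
    labels = [0, 0, 0, 1]
    print(label_balanced_weights(labels))
    print(sample_balanced_batch(labels, batch_size=8, seed=0))
```

With these weights, the single minority-class node is as likely to be drawn as the three majority-class nodes combined, which is the balancing effect the sketch is meant to show.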
In this paper, we propose an end-to-end multi-view and multi-task learning based approach named MvMoE (Multi-view-aware Mixture-of-Experts network) to solve credit risk and limits forecasting simultaneously. First, a multi-view network with a hierarchical attention mechanism is constructed to distill users’ heterogeneous financial information into shared hidden representations. Then, we jointly train these two tasks with a view-aware multi-gate mixture-of-experts network and a subsequent progressive network to improve their performance. On a real-world dataset containing 5.44 million users, we demonstrate that the proposed model improves AP by over 5.60% on credit risk forecasting and MAE by over 9.52% on credit limits. [paper]
In this paper, we propose a semi-supervised meta-learning based approach called TRUST (TRainable Undersampling with Self Training) to resolve the class-imbalance problem in credit risk forecasting. First, it decides whether to sample the data through meta-learning based reinforcement learning. Second, it learns the distribution of the data that have not yet shown financial performance via self-training and updates the model trained in the first step. Finally, the updated model is evaluated on the validation dataset, and the result is fed back through the evaluator. These three steps are iterated until the model converges. Experimental results on a real-world industrial dataset containing 1.75 million users show that the proposed method improves AP by over 5.94% on the credit risk forecasting task compared with recent methods. [paper]
In this paper, we propose a novel adversarial data augmentation method to solve the class imbalance problem in financial credit risk assessment. We train a generator for synthetic sample generation with a discriminator to identify real or fake instances. Besides, an auxiliary risk discriminator is trained cooperatively with the generator to assess the credit risk. Experimental results on three real-world datasets demonstrate the effectiveness of the proposed framework. [paper]
In this paper, we devise a tree-like structure named behavior tree to reorganize the user behavioral data, in which a group of successive sequential actions denoting a specific user intention is represented as a branch on the tree. We then propose a novel neural method coined LIC Tree-LSTM (Local Intention Calibrated Tree-LSTM) to utilize the behavior tree for fraudulent transaction detection. We investigate the effectiveness of LIC Tree-LSTM on a real-world dataset from the Alibaba platform, and the experimental results show that our proposed algorithm outperforms state-of-the-art methods in both offline and online modes. [paper]
In this paper, we propose a multi-view attributed heterogeneous information network based approach coined MAHINDER for defaulter detection. First, multiple views of user behaviors are adopted to learn personal profiles due to the endogenous aspect of financial default. Second, local behavioral patterns are specifically modeled since financial default is adversarial and accumulated. The experimental results on real-world datasets from the Alibaba platform show that the proposed approach improves AUC by over 2.8% and Recall@Precision=0.1 by over 13.1% compared with state-of-the-art methods. [paper]
In this paper, we construct two graphs to represent the user interactions on social media and propose a hierarchical cross-modal embedding method that takes the high-order relationships into consideration. The key notion behind our method is a novel hierarchical embedding framework with meta-graphs connecting different layers. We introduce both inter-record and intra-record meta-graph structures, which enable learning distributed representations that preserve high-order proximities across graphs from different layers. Our empirical experiments on three real-world datasets demonstrate that our method not only outperforms state-of-the-art methods for spatiotemporal activity prediction, but also captures cross-modal proximity at a finer granularity. [paper]
Most existing FEM (Frequent Episode Mining) solutions are time-consuming. For fast-growing sequence data, old episodes may become obsolete while new useful episodes keep emerging. We propose an algorithm named MESELO (Mining frEquent Serial Episode via Last Occurrence), which uses an episode trie to store all minimal occurrences of episodes and adapts to rapidly growing data. We theoretically prove the proposed algorithm's soundness and completeness, and experimental results on both synthetic and real datasets show the superiority of our proposed algorithm. [paper] [code]
We introduce the concept of a fixed-gap episode and develop a trie-based data structure to mine such precise-positioning episode rules with several pruning strategies. A fixed-gap episode consists of an ordered set of events where the elapsed time between any two consecutive events is a constant. Experimental results on real datasets show that the solution can also satisfy the requirements of many time-sensitive applications. [paper] [code]
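To make the fixed-gap definition concrete, the following minimal sketch counts occurrences of an episode in a timestamped event sequence where consecutive events are exactly `gap` time units apart. It is a brute-force illustration of the definition only, not the trie-based miner or its pruning strategies; the function name `count_fixed_gap_occurrences` and the single shared `gap` parameter are assumptions made for the example.

```python
from collections import defaultdict


def count_fixed_gap_occurrences(sequence, episode, gap):
    """Count start times at which `episode` occurs with every pair of
    consecutive events separated by exactly `gap` time units.

    sequence: iterable of (timestamp, event) pairs
    episode:  tuple of event labels, in order
    gap:      constant elapsed time between consecutive events
    """
    if not episode:
        return 0

    # Index the events observed at each timestamp for direct lookup.
    events_at = defaultdict(set)
    for timestamp, event in sequence:
        events_at[timestamp].add(event)

    count = 0
    for start in events_at:
        # The i-th event of the episode must appear at start + i * gap.
        # .get() is used so that lookups never add keys while iterating.
        if all(
            episode[i] in events_at.get(start + i * gap, ())
            for i in range(len(episode))
        ):
            count += 1
    return count


if __name__ == "__main__":
    seq = [(1, "A"), (3, "B"), (4, "A"), (5, "C"), (6, "B"), (8, "C")]
    print(count_fixed_gap_occurrences(seq, ("A", "B", "C"), gap=2))
```

In the sample sequence, the episode A, B, C with gap 2 is matched twice, starting at timestamps 1 and 4, while a gap of 1 yields no match.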
In this work, we propose a scalable distributed framework LA-FEMH (Large-scale Frequent Episode Mining with Hierarchies) to partition the sequence into pieces. We adopt optimized rewrite skills and devise a local mining algorithm PEM (Peak Episode Miner) to improve local mining performance. We also make an extension of our framework and propose LA-FEMH+ to support other episode mining tasks such as maximal and closed episode mining in the context of event hierarchies. [paper]