GNN Applied to Ego-nets for Friend Suggestions

Evgeny Zamyatin · December 16, 2024

Summary

The Generalized Ego-network Friendship Score framework applies WalkGNN to friend suggestion in social networks. It reduces link prediction on the full graph to many small ego-net-level tasks and aggregates their results, which keeps the approach scalable and suits it to link prediction on heterogeneous, dynamic graphs. WalkGNN is a second-order GNN inspired by friend suggestion algorithms: it constructs representations for pairs of nodes and takes into account different link types and numerical characteristics. On the Ego-VK dataset, drawn from VK, Russia's largest social network, the framework outperforms the baselines, and A/B tests show improved business metrics. The work also describes a distributed triangle counting algorithm that computes the common neighbors and Adamic-Adar heuristics on large graphs, and it presents ego-nets as a compact way to analyze local neighborhoods in graphs with billions of nodes.
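The reduce-then-aggregate idea in the summary can be illustrated with a small sketch. It rests on several assumptions that the summary does not confirm: that a candidate pair is scored once inside the ego-net of each common friend, that the per-ego-net scores are combined by summation, and that the scorer is pluggable. The names `aggregate_ego_scores` and `constant_score` are hypothetical, and the constant scorer is only a stand-in for a learned model such as WalkGNN.

```python
from collections import defaultdict
from itertools import combinations


def aggregate_ego_scores(adj, score_in_ego):
    """Sum per-ego-net scores for every non-adjacent pair (assumed aggregation)."""
    totals = defaultdict(float)
    for ego, friends in adj.items():
        # Every pair of the ego's friends is a candidate inside this ego-net.
        for u, v in combinations(sorted(friends), 2):
            if v in adj[u]:
                continue  # u and v are already connected, nothing to suggest
            totals[(u, v)] += score_in_ego(ego, u, v, adj)
    return dict(totals)


def constant_score(ego, u, v, adj):
    """Placeholder scorer (hypothetical); a trained model would go here."""
    return 1.0


# Toy undirected graph as a dict of neighbor sets.
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}
scores = aggregate_ego_scores(adj, constant_score)
print(scores)
```

With the constant scorer, the aggregate for a pair equals its number of common neighbors, which suggests how a learned per-ego-net score can be read as a generalization of such heuristics under these assumptions.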

Key findings

Introduction

Background

Overview of social network analysis and link prediction

Importance of friend suggestion algorithms in social networks

Challenges in link prediction for large-scale, heterogeneous, and dynamic graphs

Objective

To present a scalable framework for link prediction using the Generalized Ego-network Friendship Score

To introduce WalkGNN, a second-order Graph Neural Network (GNN) for efficient link prediction

Method

Data Collection

Description of the Ego-VK dataset used for evaluation

Data sources and preprocessing steps for the dataset

Data Preprocessing

Techniques for handling large-scale, heterogeneous, and dynamic graph data

Methods for aggregating and normalizing data for model training

Model Architecture

Detailed explanation of WalkGNN architecture

Incorporation of second-order information in node representations

Handling of different link types and numerical characteristics

Evaluation

Metrics used for assessing the performance of the Generalized Ego-network Friendship Score framework

Comparison with baseline models on the Ego-VK dataset

Results

Presentation of experimental results on the Ego-VK dataset

Analysis of improvements in business metrics from A/B tests

Scalability and Efficiency

Distributed Triangle Counting Algorithm

Overview of the algorithm for computing common neighbors and Adamic-Adar heuristics

Explanation of how it handles large graphs with billions of nodes
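For reference, the two heuristics named above have simple definitions: the common neighbors score of a pair is the number of nodes adjacent to both, and the Adamic-Adar score sums 1 / log(degree) over those shared nodes. The sketch below computes both on a single machine by enumerating pairs of neighbors around each node. It is not the distributed algorithm from the paper, whose details are not given here, and the function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def pair_heuristics(adj):
    """Return (common_neighbors, adamic_adar) for pairs sharing a neighbor."""
    cn = defaultdict(int)
    aa = defaultdict(float)
    for center, nbrs in adj.items():
        if len(nbrs) < 2:
            continue  # a node with one neighbor connects no pair
        weight = 1.0 / math.log(len(nbrs))
        # Each pair of neighbors of `center` has `center` as a common neighbor.
        for u, v in combinations(sorted(nbrs), 2):
            cn[(u, v)] += 1
            aa[(u, v)] += weight
    return cn, aa


edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")]
adj = build_adjacency(edges)
cn, aa = pair_heuristics(adj)
print(cn[("a", "d")], aa[("a", "d")])
```

The work per node grows with the square of its degree, which is one reason a naive version like this does not scale to graphs with billions of nodes without partitioning or other distributed techniques.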

Ego-net Analysis

Explanation of how ego-nets provide a compact way to analyze local neighborhoods

Benefits of using ego-nets for scalable link prediction in complex networks
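To make the notion of an ego-net concrete, the sketch below extracts a one-hop ego-net from an adjacency map: the ego, its direct neighbors, and every edge among them. Whether the paper includes the ego node itself or uses a different radius is not stated here, so both choices are assumptions, and `ego_net` is a hypothetical helper name.

```python
def ego_net(adj, ego):
    """Return (nodes, edges) of the 1-hop ego-net of `ego`, ego included."""
    nodes = {ego} | adj.get(ego, set())
    # Keep each undirected edge once, as an ordered pair (u, v) with u < v.
    edges = {
        (u, v)
        for u in nodes
        for v in adj.get(u, set())
        if v in nodes and u < v
    }
    return nodes, edges


# Toy undirected graph as a dict of neighbor sets.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
nodes, edges = ego_net(adj, "a")
print(sorted(nodes), sorted(edges))
```

Each ego-net depends only on one node's neighborhood, so the ego-nets can be built and processed independently of one another, which is what makes them convenient units for parallel work.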

Conclusion

Summary of the Generalized Ego-network Friendship Score framework

Future Work

Potential improvements and extensions of the framework

Areas for further research in scalable link prediction and social network analysis

Basic info

papers

social and information networks

artificial intelligence

Insights

Which dataset was used to evaluate the performance of the Generalized Ego-network Friendship Score framework, and what were the results compared to baseline models?

How does the framework utilize WalkGNN for friend suggestion in social networks?

What are the key advantages of this scalable approach in heterogeneous, dynamic graph-level link prediction?

What is the main idea behind the Generalized Ego-network Friendship Score framework?