PolitiSky24: U.S. Political Bluesky Dataset with User Stance Labels

Peyman Rostami, Vahid Rahimzadeh, Ali Adibi, Azadeh Shakery · June 09, 2025

Summary

The POLITISKY24 dataset covers the 2024 U.S. presidential election and provides user-level stance labels toward Kamala Harris and Donald Trump for users on Bluesky. It comprises 16,044 user-target stance pairs, along with engagement metadata, interaction graphs, and posting histories. The labels were produced by a pipeline that combines advanced information retrieval with large language models, and this labeling approach is reported to reach 81% accuracy. The dataset supports applications such as tracking public opinion and understanding user attitudes in social media conversations.
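To make the terms "user-target stance pair" and "labeling accuracy" concrete, here is a minimal Python sketch. It is an illustration only, not the dataset's actual schema or the authors' evaluation code: the `StancePair` fields, the label names ("favor", "against", "neutral"), the user identifiers, and the `labeling_accuracy` helper are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StancePair:
    """One user-target stance pair (hypothetical schema, not the dataset's)."""
    user_id: str  # hypothetical user identifier
    target: str   # candidate the stance refers to, e.g. "Harris" or "Trump"
    stance: str   # assumed label set: "favor", "against", "neutral"


def labeling_accuracy(predicted, reference):
    """Return the fraction of reference pairs whose predicted stance matches.

    Pairs are matched on (user_id, target). A reference pair with no
    prediction counts as incorrect.
    """
    ref = {(p.user_id, p.target): p.stance for p in reference}
    pred = {(p.user_id, p.target): p.stance for p in predicted}
    if not ref:
        raise ValueError("reference set is empty")
    correct = sum(1 for key, stance in ref.items() if pred.get(key) == stance)
    return correct / len(ref)


# Toy data: human-checked reference labels vs. pipeline-generated labels.
reference = [
    StancePair("user_a", "Harris", "favor"),
    StancePair("user_a", "Trump", "against"),
    StancePair("user_b", "Trump", "favor"),
    StancePair("user_b", "Harris", "neutral"),
]
predicted = [
    StancePair("user_a", "Harris", "favor"),
    StancePair("user_a", "Trump", "against"),
    StancePair("user_b", "Trump", "favor"),
    StancePair("user_b", "Harris", "against"),  # the one mismatch
]

accuracy = labeling_accuracy(predicted, reference)
print(f"accuracy = {accuracy:.2f}")  # 3 of 4 pairs match, so this prints 0.75
```

Note that one user can contribute two pairs, one per candidate, which is why the unit of evaluation in this sketch is the pair rather than the user.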

Introduction
  Background
    Overview of the POLITISKY24 dataset
    Importance of stance labels in social media analysis
    Context of the 2024 U.S. presidential election
  Objective
    Purpose of the dataset
    Research questions addressed by the dataset
    Expected outcomes and applications
Method
  Data Collection
    Sources of data
    Techniques for gathering user-level stance labels
    Integration of advanced information retrieval and large language models
  Data Preprocessing
    Cleaning and normalization of data
    Handling missing values and outliers
    Preparation for analysis
  Data Analysis
    Techniques for analyzing engagement metadata
    Methods for interpreting interaction graphs
    Examination of posting histories
Dataset Characteristics
  Structure and Composition
    Description of the 16,044 user-target stance pairs
    Breakdown of data into engagement metadata, interaction graphs, and posting histories
  Accuracy and Validation
    Evaluation metrics for stance labeling
    Validation process and results (81% accuracy)
Applications
  Public Opinion Tracking
    Utilization in monitoring public sentiment
    Insights into voter preferences and trends
  User Attitude Understanding
    Analysis of user perspectives on candidates
    Identification of influential user groups
Conclusion
  Summary of Findings
    Key insights from the dataset
    Implications for future research and applications
  Future Directions
    Potential improvements to the dataset
    Areas for further exploration in social media analysis
Basic info
papers
computation and language
information retrieval
social and information networks
artificial intelligence
Insights
Can the POLITISKY24 dataset be used to track emerging hotspots or trends in user attitudes towards the candidates?
How does the POLITISKY24 dataset facilitate sentiment analysis regarding the 2024 U.S. presidential election on Bluesky?
What are the prevalent user stances towards Kamala Harris and Donald Trump within the POLITISKY24 dataset?
What methodology was used to create the POLITISKY24 dataset, and what is the reported accuracy of the stance labeling?