Experiments in News Bias Detection with Pre-Trained Neural Transformers

Tim Menzner, Jochen L. Leidner·June 14, 2024

Summary

This study investigates the use of pre-trained neural transformers, specifically OpenAI's GPT-3.5 and GPT-4 and Meta's Llama2, for detecting and classifying sentence-level news bias. The authors, from Coburg University and the University of Sheffield, address the challenges of media bias, propaganda, and fake news in the digital age, aiming to create a "nutrition label" for online content. They compare the models' effectiveness in identifying biased content, using datasets such as MBIC and Horne et al.'s, without advocating automatic censorship. The research evaluates the models in zero-shot and fine-tuned settings, focusing on zero-shot GPT-3.5 and GPT-4 and a fine-tuned GPT-3.5, with experiments on the MBIC dataset. Results show that the fine-tuned GPT-3.5 outperforms the others but struggles to distinguish factual statements from bias. The study also highlights the need for further development of cross-lingual and multilingual models, as well as tools for media literacy and combating misinformation.

Paper digest

What problem does the paper attempt to solve? Is this a new problem?

The paper aims to address the issue of news bias detection using pre-trained neural transformers and to compare their performance in identifying bias in news content. This problem is not entirely new, as media bias has long been a subject of investigation. However, the increasing spread of biased, distorted, or fake information on the internet has heightened the need for effective methods to detect and combat news bias, making this research highly relevant in the current context.


What scientific hypothesis does this paper seek to validate?

This paper aims to validate the hypothesis that pre-trained neural transformers can effectively detect news bias at the sentence level and classify bias sub-types, providing both quantitative and qualitative results. The study is part of a broader effort to realize the conceptual vision of a "nutrition label" for online content that promotes social good by combating the dissemination of biased and fake news.


What new ideas, methods, or models does the paper propose? What are the characteristics and advantages compared to previous methods?

The paper proposes several new ideas, methods, and models in the field of news bias detection:

  • The paper plans to cross-distill a single open, non-proprietary, on-premise model fine-tuned for media bias identification and classification, with the aim of developing a browser plug-in that empowers citizens to read the news critically.
  • It suggests collecting a larger corpus of biased news specimens in multiple languages and invites collaboration with other research groups on this socially relevant endeavor.
  • The experiments compare the performance of different pre-trained neural transformers, namely GPT-3.5, GPT-4, and Llama2, for news bias detection on the MBIC dataset.
  • The paper evaluates the models' ability to classify sentences from English news as biased or non-biased, and to identify the type of bias expressed in each sentence.
  • The experiments show that a fine-tuned variant of a model with fewer parameters (GPT-3.5) can outperform a model with far more parameters (GPT-4) in both precision and energy efficiency.

Compared to previous methods, the paper distinguishes itself in several ways:

  • It evaluates transformer techniques for detecting various types of bias across datasets, whereas previous works used different datasets and did not compare the transformer variants examined here.
  • It highlights remaining challenges, such as the models' difficulty with nuanced language and reported speech and their tendency to hallucinate new bias categories, indicating directions for future research.

Does related research exist? Who are the noteworthy researchers in this field? What is the key to the solution proposed in the paper?

Several related research studies exist in the field of news bias detection. Noteworthy researchers include Arapakis, Peleja, Berkant, and Magalhaes, who focused on linguistic benchmarks of online news article quality. Yano, Resnik, and Smith manually annotated sentence-level partisanship bias, while Zhang, Kawai, Nakajima, Matsumoto, and Tanaka developed a system for sentiment bias detection in support of news credibility judgment. Baumer, Elovic, Qin, Polletta, and Gay tested and compared computational approaches for identifying the language of framing in political news. Bhuiyan, Zhang, Sehat, and Mitra investigated differences in crowdsourced news credibility assessment.

The key to the solution is comparing several large, pre-trained language models for sentence-level news bias detection and sub-type classification. The study provides quantitative and qualitative results as part of a broader effort towards realizing the conceptual vision of a "nutrition label" for online content for the social good. The solution aims to enhance critical reading of the news by providing tools to citizens and by collecting a larger corpus of biased news specimens in multiple languages.


How were the experiments in the paper designed?

The experiments were designed to evaluate the performance of different large pre-trained language models, including GPT-3.5, GPT-4, and Llama2, in classifying sentences from English news for bias detection. They involved three main lines of evaluation to assess, quantitatively and qualitatively, the models' ability to identify bias in news content.

In the first mode of evaluation, sentences were grouped into batches of ten to minimize the number of API calls and to simulate bias detection in longer texts. This approach streamlined the evaluation by assessing multiple sentences together.

In the second mode, sentences were assessed individually, independent of preceding or following sentences, to determine whether batching unrelated sentences affected the models' performance. This allowed a more granular analysis of bias detection capabilities.
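The two evaluation modes described above can be sketched as follows; the prompt wording, batch construction, and sentence numbering are illustrative assumptions, not the authors' actual code:

```python
def batch(sentences, size=10):
    """First mode: group sentences into batches to reduce API calls."""
    return [sentences[i:i + size] for i in range(0, len(sentences), size)]

def build_batch_prompt(sentences):
    """One prompt covering a whole batch of numbered sentences."""
    numbered = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(sentences))
    return "Label each sentence below as 'biased' or 'non-biased':\n" + numbered

def build_single_prompt(sentence):
    """Second mode: one prompt per sentence, no cross-sentence context."""
    return f"Label this sentence as 'biased' or 'non-biased': {sentence}"

sentences = [f"Sentence {k}" for k in range(25)]
batches = batch(sentences)                        # 3 batches: 10 + 10 + 5
prompts = [build_batch_prompt(b) for b in batches]
```

Batching cuts the number of API calls from 25 to 3 here, at the risk that unrelated neighboring sentences influence each label, which is exactly what the single-sentence mode controls for.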

Various prompt engineering techniques were employed throughout the experiments to optimize performance, such as providing examples within the prompt, removing definitions for context-dependent bias criteria, and adjusting the prompt wording for clarity. These modifications aimed to improve the models' bias detection accuracy and efficiency.
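One of the techniques mentioned, providing examples within the prompt (few-shot prompting), might look like the following sketch; the example sentences and instruction wording are invented for illustration and are not taken from the paper's prompts:

```python
# Hypothetical labeled examples prepended to the instruction.
FEW_SHOT_EXAMPLES = [
    ("The senator delivered her speech on Tuesday.", "non-biased"),
    ("The senator droned on with her usual tired talking points.", "biased"),
]

def build_few_shot_prompt(sentence):
    """Build an instruction followed by labeled examples and the query."""
    lines = ["Classify each sentence as 'biased' or 'non-biased'.", ""]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Sentence: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Sentence: {sentence}")
    lines.append("Label:")  # the model is expected to complete this line
    return "\n".join(lines)

prompt = build_few_shot_prompt("Critics slammed the reckless new policy.")
```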

Additionally, the experiments varied prompt settings and model parameters to explore the impact of different configurations, ranging from changing the model temperature to filtering out results based on bias scores, with the aim of improving precision and recall in bias detection.
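The bias-score filtering mentioned above could be applied as a post-processing step, trading recall for precision; the score scale (0 to 1) and field names here are assumptions for illustration:

```python
def filter_by_score(predictions, threshold=0.5):
    """Relabel low-confidence 'biased' predictions as non-biased.

    Keeps only 'biased' labels whose score clears the threshold,
    which raises precision at the cost of recall.
    """
    out = []
    for p in predictions:
        label = p["label"]
        if label == "biased" and p["score"] < threshold:
            label = "non-biased"
        out.append({**p, "label": label})  # copy; input left unmodified
    return out

preds = [
    {"sentence": "s1", "label": "biased", "score": 0.9},
    {"sentence": "s2", "label": "biased", "score": 0.3},
    {"sentence": "s3", "label": "non-biased", "score": 0.8},
]
filtered = filter_by_score(preds, threshold=0.5)
# s2 is relabeled non-biased; s1 and s3 keep their labels
```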


What is the dataset used for quantitative evaluation? Is the code open source?

The dataset used for quantitative evaluation is the MBIC dataset: 1,700 statements drawn from US-American news outlets representing left-oriented, right-wing, and center sources, each annotated by 10 different judges. After removing statements for which the annotators could not reach a final decision, 1,551 statements remain: 1,018 biased and 533 non-biased. The code used in the experiments is not explicitly stated to be open source in the provided context.
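As a sanity check on these figures, the class counts imply a majority-class baseline (always predicting "biased") of about 65.6% accuracy, which any useful model must beat:

```python
# Class counts quoted above for the MBIC dataset after filtering.
biased, non_biased = 1018, 533
total = biased + non_biased
assert total == 1551  # matches the count after removing undecided statements

majority_baseline = biased / total  # accuracy of always predicting "biased"
print(f"{majority_baseline:.1%}")   # prints 65.6%
```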


Do the experiments and results in the paper provide good support for the scientific hypotheses that need to be verified? Please analyze.

The experiments and results presented in the paper provide substantial support for the scientific hypotheses under investigation. The study compares various pre-trained language models for news bias detection, focusing on sentence-level bias detection and sub-type classification. The findings contribute to the broader effort of creating a "nutrition label" for online content to promote social good. The experiments demonstrate the importance of identifying different types of bias in news reporting, such as linguistic bias, context bias, hate speech, and gender bias.

The study highlights the challenges and biases inherent in bias detection methods themselves, emphasizing the need for developers to make users aware of potential biases in their applications. Additionally, the research addresses the concept of false balance, where equal weight is given to views lacking credible evidence, leading to biased perceptions. The study also discusses the impact of negatively connotated words on bias perception and the challenges large language models face in generating unbiased outputs.

Overall, the experiments provide valuable insights into news bias detection methodologies, the limitations of bias detection models, and the importance of promoting unbiased reporting in online content. The results support the scientific hypotheses by showcasing the performance of different language models in detecting bias and underline the need for continued research in this area.


What are the contributions of this paper?

The paper "Experiments in News Bias Detection with Pre-Trained Neural Transformers" makes several contributions in the field of news bias detection:

  • It compares large, pre-trained language models for sentence-level news bias detection and sub-type classification, providing both quantitative and qualitative results.
  • The findings contribute to the broader effort of realizing the conceptual vision of a "nutrition label" for online content for the social good, as articulated by Fuhr et al.
  • The paper addresses the critical issues of media bias, propaganda, and fake news that threaten democracy by providing insights into detecting biased and fake information spread by state actors and commercial players.
  • It builds on previous bias detection research by incorporating second-order information, such as probability distributions of the frequency, positions, and sequential order of sentence-level bias, to enhance article-level bias detection.
  • The paper leverages linguistic cue analysis to detect bias-inducing words in news articles, contributing to the advancement of bias detection techniques in news media.
  • Additionally, the paper draws on media bias datasets released by various groups, such as the MBIC corpus introduced by Spinde et al.

What work can be continued in depth?

To further advance the field of news bias detection, several avenues of research can be pursued based on the existing work:

  • Cross-distilling a single open, non-proprietary model: developing a model fine-tuned specifically for media bias identification and classification tasks.
  • Collecting a larger corpus of biased news specimens: gathering biased news samples in multiple languages to improve the understanding and detection of bias across different contexts.
  • Inviting collaboration with other research groups: joining forces with other research teams to work collectively on socially relevant endeavors related to news bias detection and mitigation.

Outline

Introduction
Background
Rise of media bias, propaganda, and fake news in the digital age
Importance of media literacy and combating misinformation
Objective
Develop a "nutrition label" for online news content
Evaluate the effectiveness of pre-trained transformers in detecting bias
Avoid automatic censorship while promoting media transparency
Method
Data Collection
Datasets used:
MBIC (media bias dataset introduced by Spinde et al.)
Horne et al.'s dataset (potentially)
Data sources: Bias-labeled news articles and sentences
Data Preprocessing
Cleaning and formatting of collected data
Splitting data into training, validation, and testing sets
Model Selection and Evaluation
Models analyzed:
OpenAI GPT-3
GPT-4
Meta Llama2
Evaluation methods:
Zero-shot learning
Fine-tuning (GPT-3.5)
Performance metrics: Accuracy, precision, recall, and F1-score
Experiments
Focus on GPT-3.5 in fine-tuned and zero-shot settings
Comparison of models' bias detection capabilities
Results and Analysis
Fine-tuned GPT-3.5 performance
Limitations in distinguishing factual statements from bias
Cross-lingual and multilingual model challenges
Discussion
Strengths and weaknesses of the studied models
Implications for media bias detection in real-world scenarios
Future directions for research and development
Conclusion
Summary of findings
Importance of human-in-the-loop approaches for accurate bias detection
Call for collaboration between academia and industry to combat misinformation

