Automated Identification of Competing Narratives on Social Media

About The Project

This repository contains the code and data for the paper "Automated Identification of Competing Narratives in Political Discourse on Social Media" published at the Text2Story Workshop 2025@ECIR.

The project is organized as follows:

dataset/ folder should contain the data to be used for the analysis
app.py and app_pages/ contain the code for the web application built with Streamlit
the numbered scripts are used to preprocess the data

The dataset is expected to be in JSONL format, with one post per line. Each post should contain the following fields:

date: the date of the post
text: the text of the post
user_id: unique identifier for the author of the post
translation: the translation of the post (optional)
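As an illustration of the format described above, the following minimal sketch reads a gzip-compressed JSONL stream and checks that each post has the required fields. The helper name read_posts, the sample posts, and the ISO-style date strings are assumptions made for this example; the project does not prescribe a date format here, and this is not the project's own loader.

```python
import gzip
import io
import json

# Fields listed as required above; "translation" is optional.
REQUIRED_FIELDS = ("date", "text", "user_id")


def read_posts(stream):
    """Yield one dict per non-empty JSONL line, checking required fields."""
    for line_no, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue
        post = json.loads(line)
        missing = [f for f in REQUIRED_FIELDS if f not in post]
        if missing:
            raise ValueError(f"line {line_no}: missing fields {missing}")
        yield post


# Build a tiny gzip-compressed JSONL payload in memory (made-up sample data).
sample = [
    {"date": "2021-01-05", "text": "Beispieltext", "user_id": "u1",
     "translation": "Sample text"},
    {"date": "2021-01-06", "text": "Another post", "user_id": "u2"},
]
raw = "\n".join(json.dumps(p) for p in sample).encode("utf-8")
compressed = gzip.compress(raw)

# gzip.open accepts a file object as well as a path such as
# dataset/twitter-covid/dataset.jsonl.gz.
with gzip.open(io.BytesIO(compressed), mode="rt", encoding="utf-8") as fh:
    posts = list(read_posts(fh))

print(len(posts), posts[0].get("translation"), posts[1].get("translation"))
# prints: 2 Sample text None
```

To check a real file, pass its path to gzip.open instead of the in-memory buffer.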

Getting Started

Prerequisites

Python
Pipenv

Installation

Clone the repo
Add your dataset to the dataset/ folder. For example dataset/twitter-covid/dataset.jsonl.gz
- The data should be in JSONL format
- Each line should contain one post.
Add your configuration in .env. See .env.sample for a template.
Install the dependencies: pipenv install

Configuration Reference

The configuration is done in the .env file. The following variables are available:

DATASET: Path inside the dataset/ folder to the dataset.
- Example: twitter-covid/dataset.jsonl.gz
TEXT_ATTR: Name of the field in the dataset that contains the text of the post.
- Example: text
TEXT_TRANSLATION_ATTR: Name of the field in the dataset that contains the translation of the post. (optional)
- Example: translation
USER_ATTR: Name of the field in the dataset that contains the unique identifier of the author.
- Example: user_id
EMBEDDING_MODEL: Name of the sentence embedding model to use.
- Example: paraphrase-multilingual-MiniLM-L12-v2
EMBED_TRANSLATION: Whether to use the translation for the embeddings. (optional)
- 0 or 1
OPENAI_URL: URL for an OpenAI compatible API. (optional)
OPENAI_API_KEY: API key for the OpenAI API. (optional)
OPENAI_MODEL: Name of the LLM to use. (optional)
- Example: phi4:14b

An LLM can optionally be used to summarize events and stories. Without one, the pipeline falls back to keyword extraction.
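The sketch below shows one way these variables could be read and interpreted in Python. It assumes the values are already present in the process environment (Pipenv loads a .env file automatically for pipenv shell and pipenv run). The function names load_config and text_for_embedding, the treatment of unmarked variables as required, and the fallback to the original text when no translation is present are all assumptions for illustration, not the project's actual code.

```python
import os

# Variables not marked "(optional)" in the reference are treated as required.
REQUIRED = ("DATASET", "TEXT_ATTR", "USER_ATTR", "EMBEDDING_MODEL")
OPTIONAL = ("TEXT_TRANSLATION_ATTR", "EMBED_TRANSLATION",
            "OPENAI_URL", "OPENAI_API_KEY", "OPENAI_MODEL")


def load_config(env=None):
    """Collect settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError(f"missing required settings: {missing}")
    cfg = {name: env[name] for name in REQUIRED}
    # Unset or empty optional values become None.
    cfg.update({name: env.get(name) or None for name in OPTIONAL})
    # EMBED_TRANSLATION is documented as 0 or 1; convert it to a bool.
    cfg["EMBED_TRANSLATION"] = cfg["EMBED_TRANSLATION"] == "1"
    return cfg


def text_for_embedding(post, cfg):
    """Pick the translated text if enabled and present, else the original."""
    attr = cfg["TEXT_TRANSLATION_ATTR"]
    if cfg["EMBED_TRANSLATION"] and attr and post.get(attr):
        return post[attr]
    return post[cfg["TEXT_ATTR"]]


# Example values taken from the reference above.
example_env = {
    "DATASET": "twitter-covid/dataset.jsonl.gz",
    "TEXT_ATTR": "text",
    "TEXT_TRANSLATION_ATTR": "translation",
    "USER_ATTR": "user_id",
    "EMBEDDING_MODEL": "paraphrase-multilingual-MiniLM-L12-v2",
    "EMBED_TRANSLATION": "1",
}
cfg = load_config(example_env)
post = {"text": "Hola mundo", "translation": "Hello world", "user_id": "u1"}
print(text_for_embedding(post, cfg))  # prints: Hello world
```

Passing a plain dictionary instead of os.environ keeps the sketch easy to test without touching the real environment.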

Usage

Activate the virtual environment: pipenv shell
Run the numbered scripts in order to preprocess the data
Run the Streamlit app: streamlit run app.py
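To make "in order" concrete, the sketch below sorts script names by their numeric prefix, so that 2_topic_modelling.py comes before 2.1_topicmap.py. It assumes that the prefix encodes the execution order and that scripts sharing a prefix can run in either order; the helper name numbered_scripts is hypothetical, and the file list is only a sample of names following that pattern.

```python
import re

# Matches names such as "1_embeddings.py" or "2.1_topicmap.py".
PREFIX = re.compile(r"^(\d+(?:\.\d+)*)_.+\.py$")


def numbered_scripts(filenames):
    """Return the numbered scripts sorted by their numeric prefix."""
    keyed = []
    for name in filenames:
        match = PREFIX.match(name)
        if match:
            # "2.1" becomes (2, 1), so (2,) sorts before (2, 1).
            prefix = tuple(int(part) for part in match.group(1).split("."))
            keyed.append((prefix, name))
    return [name for _, name in sorted(keyed)]


files = [
    "app.py",
    "3_event_detection.py",
    "2.1_topicmap.py",
    "1_embeddings.py",
    "2_topic_modelling.py",
    "3.1_event_representations.py",
]
for script in numbered_scripts(files):
    print("python", script)
```

Each printed line is a command that could be run inside the Pipenv shell; app.py is skipped because it has no numeric prefix.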

Citation

If you use this code or data, please cite the following paper:

@inproceedings{wildemann2025automated,
  title     = {Automated Identification of Competing Narratives in Political Discourse on Social Media},
  author    = {Sergej Wildemann and Erick Elejalde},
  editor    = {Ricardo Campos and 
               Al{\'{\i}}pio M{\'{a}}rio Jorge and 
               Adam Jatowt and 
               Sumit Bhatia and 
               Marina Litvak},
  booktitle = {Proceedings of Text2Story - Eighth Workshop on Narrative Extraction
               From Texts held in conjunction with the 47th European Conference on
               Information Retrieval {(ECIR} 2025), Lucca, Italy, April 10, 2025},
  year      = {2025},
  series    = {{CEUR} Workshop Proceedings},
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

