Structure and dynamics of growing networks of Reddit threads

Diletta Goglia, Davide Vega
diletta.goglia@it.uu.se (D.G.), davide.vega@it.uu.se (D.V.)

InfoLab, Department of Information Technology, Uppsala University, Uppsala, Sweden

Abstract

Millions of people use online social networks to reinforce their sense of belonging, for example by giving and asking for feedback as a form of social validation and self-recognition. It is common to observe disagreement among people's beliefs and points of view when they express this feedback. Modeling and analyzing such interactions is crucial to understanding the social phenomena that arise when people face different opinions while expressing and discussing their values. In this work, we study a Reddit community in which people participate to judge or be judged with respect to some behavior, as it is a valuable source for studying how users express judgments online. We model threads of this community as complex networks of user interactions growing in time, and we analyze the evolution of their structural properties. We show that the evolution of Reddit networks differs from that of other real social networks, despite falling in the same category. This happens because their global clustering coefficient is extremely small and their average shortest path length increases over time. Such properties reveal how users discuss in threads, i.e. mostly with one other user and often through a single message. We strengthen this result by analyzing the role that disagreement and reciprocity play in such conversations. We also show that the evolution of a Reddit thread over time is governed by two subgraphs growing at different speeds. We discover that, in the studied community, the difference between these speeds is larger than in other communities because the user guidelines enforce specific user interactions. Finally, we interpret the obtained results on user behavior by drawing on Social Judgment Theory.
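To make the two structural properties named in the abstract concrete, here is a minimal sketch that computes the global clustering coefficient and the average shortest path length of a toy thread network. It is not the code in `src/network_analysis.py`: the function names, the toy edge list, and the choice to treat the network as undirected and connected are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (user, user) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def global_clustering(adj):
    """Global clustering coefficient: closed triples / all connected triples."""
    closed = 0
    triples = 0
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2  # triples centred on this node
        closed += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / triples if triples else 0.0


def average_shortest_path_length(adj):
    """Mean BFS distance over all reachable ordered pairs of distinct nodes."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical thread: three users reply to the original poster (OP),
# and one more user replies to one of those comments.
thread_edges = [("a", "OP"), ("b", "OP"), ("c", "OP"), ("d", "a")]
adj = build_adjacency(thread_edges)

print(global_clustering(adj))             # 0.0
print(average_shortest_path_length(adj))  # 1.8
```

A thread in which each user talks to only one other user forms a tree, so it contains no triangles and its clustering coefficient is zero, while each deeper reply lengthens the paths between users.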

Directory structure

ROOT
  │── src/
  │    │── utilities.py                     # utility functions (grouped by purpose)
  │    │── AITA_data.py                     # data preparation and preprocessing functions
  │    │── network_analysis.py              # data analysis functions
  │    └── main.py                          # file to run to execute the analysis
  │── data-raw/                             #
  │    │── CSV/                             # raw data: one directory per query (metadata + texts)
  │    └── sample.csv                       # small sample of raw data
  │── data-tidy/                            #
  │    │── processed_CSV/                   # clean data + sentiment and language features -- csv files 
  │    │── recipr_in_time/                  # reciprocity metric over time -- csv files 
  │    │── threads_properties.csv           # thread statistics, structural properties and reciprocity
  │    └── ...                              # folders with networks growth at different time resolutions
  │── data-analysis/                        #  
  │    │── figs/                            # figures and files to reproduce them
  │    └── network-data/                    #
  │         │── networks_in_time/           # networks reconstruction (edgelist with structural properties) -- csv files 
  │         └── user_edgelists.csv          # user networks (as edgelist)  
  │── docs                                  # utilities for this repo
  │── requirements.txt                      # packages to install
  │── README.md
  └── LICENSE
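As a rough illustration of how an edgelist file such as `user_edgelists.csv` could be consumed, the sketch below reads a directed edge list from CSV text and computes a simple reciprocity score. The column names `source` and `target`, the in-memory sample, and the helper names are hypothetical; check the actual file headers before adapting it. The score used here is the fraction of directed edges whose reverse edge also exists, which may differ from the definition used in the paper.

```python
import csv
import io

# Hypothetical stand-in for an edgelist file; real column names may differ.
SAMPLE = """source,target
alice,bob
bob,alice
carol,alice
dave,alice
"""


def read_edges(handle, source_col="source", target_col="target"):
    """Read directed edges from a CSV handle; duplicate rows collapse into one edge."""
    reader = csv.DictReader(handle)
    return {(row[source_col], row[target_col]) for row in reader}


def reciprocity(edges):
    """Fraction of directed edges (u, v) for which (v, u) is also present."""
    if not edges:
        return 0.0
    mutual = sum(1 for u, v in edges if u != v and (v, u) in edges)
    return mutual / len(edges)


edges = read_edges(io.StringIO(SAMPLE))
print(reciprocity(edges))  # 0.5
```

To run it on a real file, replace `io.StringIO(SAMPLE)` with an open file handle, for example `open(path, newline="")`, and pass the real column names.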

Resources

Download the dataset here: https://doi.org/10.5281/zenodo.13620016

Quick start

Install Python:
sudo apt install python3

Install pip:
sudo apt install --upgrade python3-pip

Install requirements:
python3 -m pip install --requirement requirements.txt

Execute main:

cd src/
python3 main.py

Funding

Open access funding provided by Uppsala University. This work has been partly funded by eSSENCE, an e-Science collaboration funded as a strategic research area of Sweden. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Rights

This work is licensed under the MIT License.

  • If you use this code, please cite the following paper:

Goglia, D., Vega, D. Structure and dynamics of growing networks of Reddit threads. Appl Netw Sci 9, 48 (2024). 10.1007/s41109-024-00654-y

@article{Goglia2024,
  author = {Diletta Goglia and Davide Vega},
  title = {Structure and dynamics of growing networks of Reddit threads},
  month = {aug},
  year = {2024},
  doi = {10.1007/s41109-024-00654-y},
  url = {https://doi.org/10.1007/s41109-024-00654-y},
  journal = {Applied Network Science},
  volume = {9},
  number = {48}
}

  • If you use the data included in this work, please ALSO cite the following source:

Goglia, D. Structure and dynamics of growing networks of Reddit threads [Dataset], v1.0. Appl Netw Sci 9, 48 (2024). 10.5281/zenodo.13620016

@misc{Goglia2024Zenodo,
  author       = {Goglia, Diletta},
  title        = {Structure and dynamics of growing networks of Reddit threads},
  month        = {sep},
  year         = {2024},
  publisher    = {Applied Network Science},
  version      = {v1.0},
  doi          = {10.5281/zenodo.13620016},
  url          = {https://doi.org/10.5281/zenodo.13620016},
  note = {Dataset}
}

Contact

This repository is actively maintained. For any questions or further information, please feel free to contact the corresponding author:

Diletta Goglia
Ph.D. Candidate at Uppsala University Information Laboratory (UU-InfoLab) research group.
Department of Information Technology, Uppsala University, Sweden.
diletta.goglia@it.uu.se
@dilettagoglia


Last update: September 2024