Op-Fed

This repository hosts the code and data to support the paper "Op-Fed: Opinion, Stance, and Monetary Policy Annotations on FOMC Transcripts Using Active Learning" by Alisa Kanganis and Katherine A. Keith.

If you use this data or code, please cite our paper:

@misc{kanganis2025,
  author   = {Kanganis, Alisa and Keith, Katherine A.},
  title    = {Op-Fed: Opinion, Stance, and Monetary Policy Annotations on FOMC Transcripts Using Active Learning},
  year     = {2025},
  url      = {https://arxiv.org/pdf/2509.13539},
  note     = {Preprint}
}

Corresponding author: Email Katie Keith, kak5@williams.edu

Installation and Set-Up

git clone git@github.com:kakeith/op-fed.git
cd op-fed/
conda create -y --name opFed python==3.11 -c conda-forge
conda activate opFed
pip install -r requirements.txt
python -m spacy download en_core_web_sm

Data

The main dataset is in data/opfed_v1.csv, and we also provide the raw annotations from all three annotators in data/opfed_raw_v1.csv.

Please see Section B: Datasheets for Datasets in our paper for a detailed description of this dataset.

data/opfed_v1.csv

Columns:

  • unique_id: Example 19811222_189_9. The first two components, e.g., 19811222_189, correspond to the id in ConvoKit: the transcript number (19811222) followed by the utterance number (189, starting at index 1). The third component, e.g., _9, is the sentence number after sentence segmentation with spaCy; this also starts at index 1 for the first sentence.
  • speaker: The name of the speaker, e.g., MR. TRUMAN
  • sentence: The full text of the target sentence.
  • 1_opinion: Opinion label on the target sentence. Possible labels: yes, no, or ambiguous
  • 2_mp: Monetary policy label on the target sentence. Possible labels: yes, no, or ambiguous
  • 3_mp_context: Whether the 2_mp label needed additional context. Possible labels: sentence, utterance, -5 sentences, or 200+ tokens
  • 4_stance: StanceNLI labels on the target sentence. Possible labels: neutral, entailment, contradiction, or ambiguous
  • 5_stance_context: Whether the 4_stance label needed additional context. Possible labels: sentence, utterance, -5 sentences, or 200+ tokens
  • utterance: The full utterance within which the target sentence exists.
  • -5 sentences: The previous five sentences leading up to (but not including) the utterance of the target sentence. When this is across multiple utterances, we return a list of dictionaries. Each dictionary is an utterance with keys for the 'speaker' and the 'text' for the utterance. Example:
    • [{'speaker': 'MR. LAWARE.', 'text': 'With the momentum that he will gain by our acquiescence to [releasing the transcripts], he will then say: Well, this is what I want you to decide to do.'}, {'speaker': 'MR. ANGELL.', 'text': 'Absolutely.'}, {'speaker': 'MR. LAWARE.', 'text': "He's going to back us right into a corner."}, ... ]
  • -200+ tokens: The previous 200 tokens (rounded to the nearest sentence) leading up to (but not including) the utterance with the target sentence. If this is across utterances, we use the same list-dictionary format as -5 sentences.
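As a quick sanity check on the format described above, the sketch below parses a unique_id into its three components and decodes a context cell. It assumes multi-utterance contexts are stored in the CSV as stringified Python lists of dicts (the helper names are illustrative, not part of the released code):

```python
import ast

def parse_unique_id(uid):
    """Split e.g. '19811222_189_9' into (transcript, utterance index, sentence index)."""
    transcript, utterance, sentence = uid.split("_")
    return transcript, int(utterance), int(sentence)

def decode_context(cell):
    """Decode a '-5 sentences' / '-200+ tokens' cell.

    Assumes multi-utterance contexts are stored as stringified Python lists of
    {'speaker': ..., 'text': ...} dicts; plain strings are returned unchanged.
    """
    if isinstance(cell, str) and cell.startswith("["):
        return ast.literal_eval(cell)
    return cell
```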

data/opfed_raw_v1.csv

Alongside the main dataset, we also provide opfed_raw_v1.csv, which contains the per-annotator labels combined into list form (prior to aggregation). For example, in the first row, the 1_opinion cell value is ['yes', 'yes', 'yes'], meaning all three annotators labeled the target sentence 'yes' for the opinion aspect.

Because the annotation schema is hierarchical, some annotators may have missing values, e.g., ['yes', 'nan', 'yes'], meaning the second annotator (the 'nan' value) did not reach that annotation stage due to earlier annotation decisions.
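To illustrate how one might work with these raw label lists, here is a minimal majority-vote sketch that skips 'nan' placeholders. This is for illustration only; see the paper for the actual aggregation procedure used to produce opfed_v1.csv:

```python
from collections import Counter

def aggregate(labels):
    """Illustrative majority vote over one raw label list, e.g. ['yes', 'nan', 'yes'].

    'nan' entries mark annotators who never reached this stage and are dropped.
    Returns None if no annotator reached the stage, and 'ambiguous' when no
    strict majority exists among those who did.
    """
    valid = [label for label in labels if label != "nan"]
    if not valid:
        return None
    label, count = Counter(valid).most_common(1)[0]
    return label if count > len(valid) / 2 else "ambiguous"
```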

Reproducing results in the paper

code/descriptive/

This folder contains scripts that gather descriptive statistics of the Op-Fed dataset (e.g., inter-annotator agreement rates).

  • descibe.py creates Table 6 in the appendix with the per-label breakdown of examples as well as inter-annotator agreement levels.
  • Run transcript_location.py to create Figure 3, "Opinions are expressed later in transcripts."
  • The notebook hand_selected_examples.ipynb creates Table 9.
  • The notebook score_analysis.ipynb creates Table 13 (in the appendix).

code/active_learning/

This folder contains the scripts that ran the active learning simulations. The results of the simulation runs are saved in plots/sheets/.

To generate Figure 2 (the active learning simulation results presented in the main paper), run code/active_learing/plots/plot_main.py.

The final deployed human-in-the-loop AL pipeline used to create the dataset is in the script code/active_learing/real_deal/real_loop.py.

code/baseline_models/

This folder contains scripts for the results on Op-Fed baseline models (zero-shot LLMs).

Zero-shot models

The results from all the zero-shot models (run via the APIs in August 2025) are saved in code/baseline_models/zeroshot/zeroshotpreds.zip.

To re-run the LLM API calls, run the scripts in the code/baseline_models/zeroshot/ folder. Note: you will need to substitute the path to your own API keys.

python gpt.py gpt-5
python gpt.py gpt-5-nano
python claude.py 
python deepseek.py 

This script runs the evaluation after the model's predictions are saved to disk:

python evaluate.py --full_print 

For just the accuracy metrics for the latex table run:

python evaluate.py

For just the weighted F1 metrics for the latex table run:

python evaluate.py --f1
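For reference, "weighted F1" here is the standard definition: per-class F1 scores averaged with each class weighted by its support in the ground truth. A self-contained sketch of that definition (not the repo's evaluate.py implementation):

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1, weighted by each class's support in y_true."""
    support = Counter(y_true)
    total = 0.0
    for label, count in support.items():
        tp = sum(t == p == label for t, p in zip(y_true, y_pred))
        pred_pos = sum(p == label for p in y_pred)
        precision = tp / pred_pos if pred_pos else 0.0
        recall = tp / count
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        total += count * f1
    return total / len(y_true)
```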

If you want to include the ambiguous ground-truth labels, re-run with the --includes_ambiguous flag, e.g.,

python gpt.py gpt-5 --includes_ambiguous

Alternatively, to reduce cost, you can submit the GPT and Claude models in "batch mode". Note: DeepSeek does not currently offer this option. Dev examples:

python gpt_batch_submit.py gpt-5-nano --dev
python gpt_batch_monitor.py gpt_batches/batch_ids_gpt-5-nano_2025-09-03-12-57.json --dev

BoW LogReg Baseline

For the bag-of-words logistic regression baseline on Op-Fed, run the script:

baseline_models/finetune/bow.py

Other baselines

The human baseline and majority-class baseline are in the notebook

baselines.ipynb
