
DRM-LLM Memory Experiments

This repository contains a deterministic, single-prompt DRM (Deese–Roediger–McDermott) false-memory experiment for LLMs. The goal is to measure how often models “recall” a thematic lure word that never appeared in the study list.

Contents

| Path | Purpose |
| --- | --- |
| DRM/hallucinations_full_experiment.py | Main DRM experiment (single strict recall prompt, deterministic decoding) |
| DRM/SLURM_TEMPLATE.sh | SLURM script to run the experiment on a GPU node |
| DRM/download_models.py | Prefetch all models to the HF cache (run on the login node) |
| DRM/related_words.csv | Input word lists per lure word |

Deprecated scripts (prompt variants, old downloaders) have been removed.
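The expected shape of related_words.csv can be illustrated with a short loader. The column names (`lure`, `word`) are assumptions for illustration; the actual CSV schema may differ:

```python
import csv
from collections import defaultdict

def load_word_lists(path):
    """Group associate words by their lure word.

    Assumes a header row with columns `lure` and `word`;
    check related_words.csv for the real schema.
    """
    lists = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            lists[row["lure"]].append(row["word"])
    return dict(lists)
```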

Experimental design

The experiment runs two conditions per lure word:

  • Short list (5 words)
  • Long list (10 words)

Each session proceeds through three phases:

  1. Memorization: the word list is presented; the model must reply exactly “Ready for next phase.”
  2. Arithmetic distractor: five addition problems; the model outputs digits-only answers, one per line.
  3. Free recall: the model outputs ONLY the recalled words, comma-separated, with full spellings (no stems or abbreviations).
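The three phases above can be sketched as a prompt sequence. The wording and the distractor generation below are illustrative, not the exact prompts used by the experiment script:

```python
import random

def build_session(words, n_distractors=5, seed=0):
    """Build the three-phase DRM prompt sequence (illustrative wording)."""
    rng = random.Random(seed)  # seeded RNG keeps the distractors deterministic

    # Phase 1: memorization, with the required verbatim acknowledgement.
    memorize = (
        "Memorize the following words: " + ", ".join(words)
        + '\nReply exactly "Ready for next phase."'
    )

    # Phase 2: arithmetic distractor, digits-only answers, one per line.
    problems = [(rng.randint(10, 99), rng.randint(10, 99)) for _ in range(n_distractors)]
    distractor = (
        "Answer each addition with digits only, one answer per line:\n"
        + "\n".join(f"{a} + {b} = ?" for a, b in problems)
    )

    # Phase 3: strict free recall.
    recall = (
        "Now recall the words from the first phase. Output ONLY the recalled "
        "words, comma-separated, with full spellings."
    )
    return [memorize, distractor, recall], problems
```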

Outputs are saved as CSVs: trial results and interaction logs.

Models

Models are configured in DRM/hallucinations_full_experiment.py (see the MODELS dict). The downloader includes all referenced repos, including gated ones like Llama‑2‑7B‑Chat.
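A MODELS dict of this kind typically maps short labels to Hugging Face repo IDs. The entries below are illustrative only; apart from Llama-2-7B-Chat, which this README names, the repository's actual configuration lives in DRM/hallucinations_full_experiment.py:

```python
# Illustrative MODELS mapping: short label -> Hugging Face repo ID.
MODELS = {
    "llama2-7b-chat": "meta-llama/Llama-2-7b-chat-hf",  # gated; requires license acceptance
    # further entries as configured in hallucinations_full_experiment.py
}
```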

Setup (cluster login node)

module load anaconda3/2024.6
conda activate hallucination

export HF_HOME=/scratch/gpfs/$USER/models2
mkdir -p "$HF_HOME/hub"

# Temporarily disable offline mode to download
unset HF_HUB_OFFLINE
export TRANSFORMERS_OFFLINE=0
export HF_DATASETS_OFFLINE=0

# If needed for gated models (e.g., Llama 2), run:
# huggingface-cli login
#   or securely export HF_TOKEN for the session only.

python DRM/download_models.py

# Re-enable offline mode for compute nodes
export TRANSFORMERS_OFFLINE=1
export HF_DATASETS_OFFLINE=1
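DRM/download_models.py presumably loops over the configured repos and caches them. A minimal sketch, assuming huggingface_hub's snapshot_download is used; the `download` parameter is injectable so the loop can be exercised without network access:

```python
def prefetch(repo_ids, download=None):
    """Prefetch model repos into the HF cache so compute nodes can run offline.

    `download` defaults to huggingface_hub.snapshot_download, which resolves
    the cache location from $HF_HOME automatically.
    """
    if download is None:
        from huggingface_hub import snapshot_download as download
    fetched = []
    for repo_id in repo_ids:
        download(repo_id)  # blocks until every file of repo_id is cached
        fetched.append(repo_id)
    return fetched
```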

Run the experiment (SLURM)

sbatch DRM/SLURM_TEMPLATE.sh

The template runs all configured models with the single strict recall prompt (P1) and writes CSVs to DRM/outputs/.

Outputs

  • Results CSV: per-session metrics and per-condition summaries (accuracy, precision, false memory rate, lure hallucination rate).
  • Logs CSV: all prompts and model responses (newlines escaped) for auditing and downstream analysis.
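The per-session metrics can be illustrated with a small scorer for a single recall response. The exact definitions used in the results CSV (e.g. how the session-level false memory rate aggregates) are assumptions here:

```python
def score_recall(response, study_list, lure):
    """Score one free-recall response against the study list and the lure.

    Returns accuracy (hits / list length), precision (hits / words produced),
    and whether the never-presented lure was "recalled".
    """
    recalled = [w.strip().lower() for w in response.split(",") if w.strip()]
    study = {w.lower() for w in study_list}
    hits = [w for w in recalled if w in study]
    return {
        "accuracy": len(hits) / len(study_list) if study_list else 0.0,
        "precision": len(hits) / len(recalled) if recalled else 0.0,
        "lure_hallucinated": lure.lower() in recalled,
    }
```

Averaging `lure_hallucinated` over sessions in a condition gives that condition's lure hallucination rate.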

Troubleshooting

  • Offline errors: run DRM/download_models.py on a login node with offline flags disabled.
  • Gated models: accept licenses on Hugging Face and authenticate (CLI login or HF_TOKEN).
  • Cache path: ensure HF_HOME points to the parent cache dir; models will be stored under "$HF_HOME/hub".

License

MIT for repository code. Individual model licenses apply.

About

AI hallucination research
