FROSTT

The Formidable Repository of Open Sparse Tensors and Tools

Enron Emails

This tensor is built from the Enron emails released during an investigation by the Federal Energy Regulatory Commission. Its four modes are sender, receiver, word, and date, and the values are word counts. Email senders and recipients outside the @enron.com domain were pruned. English stop words were removed, and Porter stemming was applied to the remaining words. Words that appear fewer than five times were also pruned.

Tensor Statistics

Non-zeros 54,202,099
Order 4
Dimensions 6,066 x 5,699 x 244,268 x 1,176
Tags counts, text
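
To put these statistics in perspective, the short sketch below computes the fraction of cells that are non-zero and a rough size for coordinate-style storage. The numbers are copied from the statistics above; the 4-byte index and 8-byte value widths are assumptions chosen for illustration, not properties of the distributed file.

```python
from math import prod

# Figures copied from the tensor statistics above.
nnz = 54_202_099
dims = (6_066, 5_699, 244_268, 1_176)

# Total number of cells in the dense tensor, and the fraction that is non-zero.
total_cells = prod(dims)
density = nnz / total_cells

# Coordinate (COO) storage keeps one index per mode plus one value per non-zero.
# Assumption: 4-byte indices and 8-byte values; real in-memory formats may differ.
coo_bytes = nnz * (len(dims) * 4 + 8)

print(f"cells:   {total_cells:,}")
print(f"density: {density:.3e}")
print(f"COO size under the stated assumptions: {coo_bytes / 1e9:.2f} GB")
```

Under those assumed widths the coordinate representation comes to about 1.3 GB, which is why sparse formats are the practical choice for a tensor of this shape.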

Downloadable Files

File Description
enron.tns.gz Enron tensor
mode-1-senders.map.gz Sender emails
mode-2-receivers.map.gz Receiver emails
mode-3-words.map.gz Words
mode-4-dates.map.gz Dates
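
A minimal sketch of streaming the tensor file follows. It assumes the FROSTT coordinate text format: one non-zero per line, whitespace-separated 1-based indices followed by the value, with `#` starting a comment line. The function name `read_tns` is hypothetical and not part of any FROSTT tool.

```python
import gzip
from typing import Iterator, Tuple


def read_tns(path: str) -> Iterator[Tuple[Tuple[int, ...], float]]:
    """Yield (indices, value) for each non-zero in a .tns or .tns.gz file."""
    # Assumption: gzip-compressed files carry a .gz suffix, as in the list above.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and comment lines.
            if not line or line.startswith("#"):
                continue
            *index_fields, value_field = line.split()
            # Convert the file's 1-based indices to 0-based indices.
            indices = tuple(int(field) - 1 for field in index_fields)
            yield indices, float(value_field)
```

With `enron.tns.gz` in the working directory, `for (sender, receiver, word, date), count in read_tns("enron.tns.gz"): ...` would iterate over the non-zeros without loading the whole file into memory. The `.map.gz` files are expected to translate each index back to its label (for example, a sender index to an email address); check their layout before relying on a particular line-to-index correspondence.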

Citation

@article{shetty2004enron,
  title={The {Enron} email dataset database schema and brief statistical report},
  author={Shetty, Jitesh and Adibi, Jafar},
  journal={Information Sciences Institute Technical Report, University of Southern California},
  volume={4},
  year={2004}
}
