Enron Emails
This tensor is built from the Enron emails released during an investigation by the Federal Energy Regulatory Commission. The four modes are sender, receiver, word, and date, and each value is a word count. Senders and recipients outside the @enron.com domain were pruned. English stop words were removed, the remaining words were reduced with Porter stemming, and words that appear fewer than five times were also pruned.
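The preprocessing described above can be sketched roughly as follows. This is an illustrative reconstruction, not the script that produced the dataset: the `(sender, receivers, date, tokens)` input layout, the tiny `STOP_WORDS` set, and the `build_counts` helper are all hypothetical, and the five-occurrence threshold is assumed to apply to a word's total count across the retained emails. The `stem` argument defaults to the identity function; a real Porter stemmer (for example `PorterStemmer().stem` from NLTK) could be passed in instead.

```python
from collections import Counter

# Tiny illustrative stop-word list (assumption); the list actually used
# for the dataset is not specified.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}


def build_counts(emails, stem=lambda w: w, min_word_count=5, domain="@enron.com"):
    """Return {(sender, receiver, word, date): count} for a list of emails.

    Each email is a hypothetical (sender, receivers, date, tokens) tuple.
    """
    kept = []
    for sender, receivers, date, tokens in emails:
        # Drop senders and receivers outside the chosen domain.
        if not sender.endswith(domain):
            continue
        receivers = [r for r in receivers if r.endswith(domain)]
        # Remove stop words, then stem what remains.
        words = [stem(t.lower()) for t in tokens if t.lower() not in STOP_WORDS]
        if receivers and words:
            kept.append((sender, receivers, date, words))

    # Total occurrences of each stemmed word across the retained emails.
    totals = Counter(w for _, _, _, words in kept for w in words)

    counts = Counter()
    for sender, receivers, date, words in kept:
        for w in words:
            if totals[w] < min_word_count:
                continue  # prune rare words
            for r in receivers:
                counts[(sender, r, w, date)] += 1
    return counts
```

Each key of the returned mapping corresponds to one non-zero of the four-mode tensor, and the mapped value is the word count stored there.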
Tensor Statistics
| Statistic | Value |
|---|---|
| Non-zeros | 54,202,099 |
| Order | 4 |
| Dimensions | 6,066 x 5,699 x 244,268 x 1,176 |
| Tags | counts, text |
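To put these figures in perspective, the short calculation below compares the number of non-zeros with the number of entries a dense tensor of the same dimensions would hold. It uses only the values from the table; the variable names are illustrative.

```python
from math import prod

nnz = 54_202_099                        # non-zeros, from the table above
dims = (6_066, 5_699, 244_268, 1_176)   # sender x receiver x word x date

total_cells = prod(dims)      # entries a dense tensor would hold
density = nnz / total_cells   # fraction of entries that are non-zero

print(f"dense entries: {total_cells:.3e}")
print(f"density:       {density:.3e}")
```

The density comes out at only a few non-zeros per billion entries, which is why the tensor is practical to handle only in a sparse coordinate format.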
Downloadable Files
| File | Description |
|---|---|
| enron.tns.gz | Enron tensor |
| mode-1-senders.map.gz | Sender emails |
| mode-2-receivers.map.gz | Receiver emails |
| mode-3-words.map.gz | Words |
| mode-4-dates.map.gz | Dates |
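A minimal sketch for reading these files with the Python standard library follows. It assumes the usual coordinate layout for `.tns` files: one non-zero per line, written as four whitespace-separated 1-based indices followed by the value, with optional `#` comment lines. It also assumes each `.map` file holds one label per line, so that line `i` names index `i` of that mode. Neither layout is stated on this page, so check a few lines of the downloaded files first. The helper names `read_tns` and `read_map` are hypothetical.

```python
import gzip
from itertools import islice


def read_tns(path, order=4, limit=None):
    """Yield (indices, value) pairs from a gzip-compressed .tns file.

    Assumes `order` 1-based integer indices followed by one value per line.
    `limit` caps the number of lines read, which helps when peeking at a
    file with tens of millions of non-zeros.
    """
    with gzip.open(path, "rt") as f:
        for line in islice(f, limit):
            fields = line.split()
            if not fields or fields[0].startswith("#"):
                continue  # skip blank lines and comments
            indices = tuple(int(x) for x in fields[:order])
            value = float(fields[order])
            yield indices, value


def read_map(path):
    """Return the labels of one mode, assuming one label per line."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as f:
        return [line.rstrip("\n") for line in f]
```

Under these assumptions, `read_map("mode-1-senders.map.gz")[i - 1]` would give the sender address for index `i`, and `read_tns("enron.tns.gz", limit=5)` would yield the first five non-zeros without decompressing the whole file.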
Citation
@article{shetty2004enron,
title={The Enron email dataset database schema and brief statistical report},
author={Shetty, Jitesh and Adibi, Jafar},
journal={Information sciences institute technical report, University of Southern California},
volume={4},
year={2004}
}