Enron Emails
This tensor is built from the Enron emails released during an investigation by the Federal Energy Regulatory Commission. Its four modes are sender, receiver, word, and date, and each value is a word count. Email senders and recipients outside the @enron.com domain were pruned. English stop words were removed, and the remaining words were reduced with Porter stemming. Words that appear fewer than five times were also pruned.
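The construction described above can be sketched roughly as follows. This is a simplified illustration, not the script used to build the dataset: the email record layout (`sender`, `receivers`, `date`, `body`), the tiny `STOP_WORDS` set, the whitespace tokenizer, and the `build_counts` helper are all assumptions. Stemming is left as a pluggable `stem` function (identity by default); a Porter stemmer from a library such as NLTK could be passed in instead.

```python
from collections import Counter

# Tiny illustrative stop-word list (assumption); the list used for the
# real dataset is not specified on this page.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}


def build_counts(emails, stem=lambda w: w, min_count=5):
    """Return {(sender, receiver, word, date): count}.

    `emails` is a list of hypothetical dicts with the keys
    'sender', 'receivers', 'date', and 'body'.
    """
    def in_domain(addr):
        return addr.lower().endswith("@enron.com")

    entries = Counter()      # counts per (sender, receiver, word, date)
    word_totals = Counter()  # total occurrences of each stemmed word

    for mail in emails:
        if not in_domain(mail["sender"]):
            continue  # prune senders outside @enron.com
        # Lowercase, split on whitespace, keep alphabetic non-stop words.
        words = [stem(w) for w in mail["body"].lower().split()
                 if w.isalpha() and w not in STOP_WORDS]
        word_totals.update(words)
        for receiver in mail["receivers"]:
            if not in_domain(receiver):
                continue  # prune receivers outside @enron.com
            for w in words:
                entries[(mail["sender"], receiver, w, mail["date"])] += 1

    # Drop every entry whose word occurs fewer than `min_count` times overall.
    return {k: v for k, v in entries.items() if word_totals[k[2]] >= min_count}
```

Each key of the returned dictionary corresponds to one non-zero coordinate of the tensor, and the associated count is its value.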
Tensor Statistics
Non-zeros | 54,202,099 |
Order | 4 |
Dimensions | 6,066 x 5,699 x 244,268 x 1,176 |
Tags | counts, text |
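To put these statistics in perspective, the density of the tensor (non-zeros divided by the total number of addressable entries) can be computed directly from the numbers in the table. The variable names below are illustrative.

```python
from math import prod

nonzeros = 54_202_099
dims = (6_066, 5_699, 244_268, 1_176)  # senders x receivers x words x dates

total_cells = prod(dims)          # number of addressable entries
density = nonzeros / total_cells  # fraction of entries that are non-zero

print(f"{total_cells:.3e} cells, density {density:.2e}")
```

The density comes out on the order of 10^-9, so the tensor is extremely sparse and a dense array of this shape would not fit in memory.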
Downloadable Files
File | Description |
---|---|
enron.tns.gz | Enron tensor |
mode-1-senders.map.gz | Sender emails |
mode-2-receivers.map.gz | Receiver emails |
mode-3-words.map.gz | Words |
mode-4-dates.map.gz | Dates |
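A minimal sketch for reading these files in Python follows. It assumes the tensor file is a text coordinate format with one non-zero per line (whitespace-separated, 1-based indices followed by the value) and that each map file holds one label per line, with line i labelling index i. Verify both assumptions against the downloaded files. The `read_tns` and `read_map` helpers are hypothetical names, not part of any library.

```python
import gzip


def read_tns(path):
    """Yield (indices, value) for each non-zero in a gzipped .tns file.

    Indices are converted from 1-based (assumed on disk) to 0-based.
    """
    with gzip.open(path, "rt") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comment lines
            *idx, val = line.split()
            yield tuple(int(i) - 1 for i in idx), float(val)


def read_map(path):
    """Return the labels of one mode; list position k labels 0-based index k."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in fh]
```

Because `read_tns` is a generator, the roughly 54 million non-zeros can be streamed without loading the whole file at once. For example, `words = read_map("mode-3-words.map.gz")` followed by a loop over `read_tns("enron.tns.gz")` lets each coordinate be translated back to a word via `words[idx[2]]`.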
Citation
@article{shetty2004enron,
  title={The Enron email dataset database schema and brief statistical report},
  author={Shetty, Jitesh and Adibi, Jafar},
  journal={Information sciences institute technical report, University of Southern California},
  volume={4},
  year={2004}
}