Reddit-2015
Comments posted on Reddit over the year 2015. The tensor modes represent user-subreddit-word, where a subreddit is a community forum. Entry \((u, s, w)\) is the number of times that user \(u\) posted word \(w\) in subreddit \(s\) over the year 2015. Users, subreddits, and words with less than five entries have been removed.
Tensor Statistics
| Non-zeros | 4,687,474,081 | 
| Order | 3 | 
| Dimensions | 8,211,298 x 176,962 x 8,116,559 | 
| Tags | counts , text | 
Downloadable Files
| File | Description | 
|---|---|
| reddit-2015.tns.gz | Tensor | 
| mode-1-users.map.gz | Users | 
| mode-2-subreddits.map.gz | Subreddits | 
| mode-3-words.map.gz | Words | 
Citation
@online{redditdataset,
  author = {Jason Baumgartner},
  title = {Reddit comment dataset},
  month = july,
  year = {2015},
  url = {https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/}
}