email-Enron
Description
Nodes represent email addresses. Each hyperedge represents a single email message and connects the sender with all of its recipients; a hyperedge may also carry a timestamp or weight when available. The messages come from the Enron Email Dataset.
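The mapping from one message to one hyperedge can be illustrated with a minimal sketch. The helper name `message_to_hyperedge`, the `index` dictionary, and the example addresses are hypothetical and are not taken from the scripts that built this dataset.

```python
def message_to_hyperedge(sender, recipients, index):
    """Return the set of node IDs for one message (hypothetical helper)."""
    members = [sender, *recipients]
    # Assign the next free integer ID to any address not seen before.
    for address in members:
        index.setdefault(address, len(index))
    # A set drops duplicates, e.g. a sender who also appears as a recipient.
    return frozenset(index[address] for address in members)


index = {}  # assumed mapping: email address -> integer node ID
edge = message_to_hyperedge(
    "a@example.com",
    ["b@example.com", "c@example.com", "a@example.com"],
    index,
)
print(sorted(edge))  # node IDs of the sender and all recipients
```

Using a set for the hyperedge is a design assumption here: it treats a message as an unordered group of participants, so the sender's role is not recoverable from the hyperedge alone.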
Basic statistics
- Nodes: 84172
- Hyperedges: 235395
- Unique hyperedges: 111558
- Maximum hyperedge size: 892
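The sketch below shows one plausible way to compute these four quantities from a list of hyperedges. It assumes that "unique hyperedges" means distinct node sets and that only nodes appearing in at least one hyperedge are counted; the function name `basic_statistics` and the toy data are illustrative, not part of the dataset tooling.

```python
def basic_statistics(hyperedges):
    """Summarize a list of hyperedges, each given as an iterable of node IDs."""
    edges = [frozenset(e) for e in hyperedges]
    # Only nodes that occur in some hyperedge are counted (assumption).
    nodes = set().union(*edges) if edges else set()
    return {
        "nodes": len(nodes),
        "hyperedges": len(edges),
        # Two hyperedges are treated as equal if they have the same node set.
        "unique_hyperedges": len(set(edges)),
        "max_size": max((len(e) for e in edges), default=0),
    }


toy = [[0, 1], [1, 0], [0, 1, 2], [3]]
stats = basic_statistics(toy)
print(stats)
```

In the toy input, `[0, 1]` and `[1, 0]` count as two hyperedges but one unique hyperedge, which is the same kind of gap as between the 235395 hyperedges and 111558 unique hyperedges reported above.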
Hypergraph metadata
| Property | Description |
|---|---|
| name | (STRING) Dataset name (e.g., email-Enron). |
| type | (STRING) Hypergraph type (e.g., TemporalHypergraph). |
| version | (STRING) Dataset version (e.g., 1.0.0). |
| weighted | (BOOL) Whether the hypergraph is weighted (e.g., false). |
Node metadata
| Property | Description |
|---|---|
| address | (STRING) Email address represented by the node (e.g., fran.fagan@enron.com). |
Hyperedge metadata
| Property | Description |
|---|---|
| ccs | (LIST[INT]) Carbon-copy recipient node identifiers for the email message; may be empty (e.g., [1, 8]). |
| time | (INT) Unix timestamp of the email message (e.g., 1000501543). |
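As a small illustration of how these two properties might be consumed, the following sketch converts the `time` value to a UTC datetime and counts the carbon-copy recipients. The attribute dictionary mirrors the examples in the table; the helper name `describe_hyperedge` and the dictionary layout are assumptions, not the dataset's documented loading API.

```python
from datetime import datetime, timezone


def describe_hyperedge(attrs):
    """Summarize one hyperedge's metadata (hypothetical helper)."""
    # Interpret the Unix timestamp as seconds since the epoch, in UTC.
    sent_at = datetime.fromtimestamp(attrs["time"], tz=timezone.utc)
    return {
        "sent_at": sent_at.isoformat(),
        # `ccs` may be empty or absent, so fall back to an empty list.
        "n_ccs": len(attrs.get("ccs", [])),
    }


example = {"ccs": [1, 8], "time": 1000501543}
summary = describe_hyperedge(example)
print(summary)  # sent_at is 2001-09-14T21:05:43+00:00
```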
Hyperedge size distribution
Hyperdegree distribution
Download
- Version 1.0.0: Binary (9.5 MB), JSON (4.3 MB)
Provenance
Source: https://www.cs.cmu.edu/~enron/
License: Not specified. Please refer to the original source for licensing terms.
Reproducibility: Instructions and scripts
Citation
When this data is used in published research or for visualization purposes, please cite the following:
@InProceedings{klimt2004enron,
author="Klimt, Bryan and Yang, Yiming",
title="The Enron Corpus: A New Dataset for Email Classification Research",
booktitle="Machine Learning: ECML 2004",
year="2004",
publisher="Springer Berlin Heidelberg",
address="Berlin, Heidelberg",
pages="217--226",
isbn="978-3-540-30115-8"
}