email-Enron-core

Description

Nodes represent email addresses. Each hyperedge represents one email message and connects the sender with all of its recipients; a timestamp or weight is attached when available. This dataset contains the emails generated by core employees of Enron.
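As a minimal sketch of this construction, one email can be mapped to one hyperedge as shown here. The function name `email_to_hyperedge`, its parameters, and the example addresses are hypothetical and do not reflect the dataset's actual file format. The sketch also assumes that a sender who appears among the recipients is counted only once.

```python
from typing import FrozenSet, Iterable, Optional, Tuple


def email_to_hyperedge(
    sender: str,
    recipients: Iterable[str],
    timestamp: Optional[int] = None,
) -> Tuple[FrozenSet[str], Optional[int]]:
    """Return (node set, timestamp) for one email message.

    The hyperedge is the set containing the sender and all recipients.
    Using a set is an assumption of this sketch: duplicate addresses
    collapse into a single node.
    """
    nodes = frozenset([sender, *recipients])
    return nodes, timestamp


# Hypothetical addresses, for illustration only.
edge, ts = email_to_hyperedge(
    "a@example.com", ["b@example.com", "c@example.com"], 1000
)
```

Here `edge` holds three nodes and `ts` carries the optional timestamp unchanged.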

Basic statistics

  • Nodes: 143
  • Hyperedges: 10472
  • Unique hyperedges: 1840
  • Max size hyperedge: 38
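
The four quantities above can be computed from a list of hyperedges. The following is a sketch on toy data, not the script used to produce these numbers. It assumes each hyperedge is a collection of node identifiers, and that "unique hyperedges" means distinct node sets, so repeated emails among the same addresses collapse to one. The function name `basic_statistics` is hypothetical.

```python
from typing import Dict, Hashable, Iterable, List


def basic_statistics(hyperedges: List[Iterable[Hashable]]) -> Dict[str, int]:
    """Compute node, hyperedge, unique-hyperedge, and max-size counts."""
    # Treat each hyperedge as an unordered set of nodes (assumption).
    edge_sets = [frozenset(e) for e in hyperedges]
    nodes = set().union(*edge_sets) if edge_sets else set()
    return {
        "nodes": len(nodes),
        "hyperedges": len(edge_sets),               # counts repeats
        "unique_hyperedges": len(set(edge_sets)),   # distinct node sets
        "max_size": max((len(e) for e in edge_sets), default=0),
    }


# Toy input: [1, 2] and [2, 1] are the same node set.
toy = [[1, 2], [2, 1], [1, 2, 3], [4]]
stats = basic_statistics(toy)
```

On the toy input, the two hyperedges over nodes 1 and 2 count twice toward the hyperedge total but only once toward the unique total.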

Hyperedge size distribution

Hyperdegree distribution

Related datasets

Provenance

Source: https://www.cs.cmu.edu/~enron/

License: Not specified. Please refer to the original source for licensing terms.

Reproducibility: Instructions and scripts

Citation

When this data is used in published research or for visualization purposes, please cite the following:

                    
@inproceedings{klimt2004enron,
 author = {Klimt, Bryan and Yang, Yiming},
 title = {The Enron Corpus: A New Dataset for Email Classification Research},
 booktitle = {Machine Learning: ECML 2004},
 year = {2004},
 publisher = {Springer Berlin Heidelberg},
 address = {Berlin, Heidelberg},
 pages = {217--226},
 isbn = {978-3-540-30115-8}
}

@article{benson2018simplicial,
 author = {Benson, Austin R. and Abebe, Rediet and Schaub, Michael T. and Jadbabaie, Ali and Kleinberg, Jon},
 title = {Simplicial closure and higher-order link prediction},
 year = {2018},
 doi = {10.1073/pnas.1800683115},
 publisher = {National Academy of Sciences},
 issn = {0027-8424},
 journal = {Proceedings of the National Academy of Sciences}
}