movies-cast-romance
Description
Movie-cast hypergraph for the Romance genre, built from The Movies Dataset. Nodes are actors and hyperedges are normalized movie casts. Movies with multiple genres also appear in the corresponding other genre-specific datasets.
Basic statistics
- Nodes: 53400
- Hyperedges: 6679
- Unique hyperedges: 6679
- Max size hyperedge: 312
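As a rough illustration of how the figures above can be derived, the following sketch computes the same four statistics from a list of hyperedges given as collections of node identifiers. The function name `basic_statistics` and the toy input are hypothetical and not part of the dataset tooling.

```python
def basic_statistics(hyperedges):
    """Return node count, edge count, unique edge count and max edge size."""
    # Treat each hyperedge as an unordered set of node identifiers.
    edges = [frozenset(e) for e in hyperedges]
    nodes = set().union(*edges)
    return {
        "nodes": len(nodes),
        "hyperedges": len(edges),
        "unique_hyperedges": len(set(edges)),
        "max_hyperedge_size": max((len(e) for e in edges), default=0),
    }


# Toy input with made-up node identifiers; the first and third edges coincide.
toy_edges = [[1, 2, 3], [1, 2], [3, 2, 1]]
stats = basic_statistics(toy_edges)
print(stats)
```

In this dataset the hyperedge count and the unique hyperedge count are equal (6679), which is what one would expect if identical casts have already been merged into a single weighted hyperedge.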
Hypergraph metadata
| Property | Description |
|---|---|
| name | (STRING) Dataset name (e.g., movies-cast-romance). |
| type | (STRING) Hypergraph type (e.g., Hypergraph). |
| version | (STRING) Dataset version (e.g., 1.0.0). |
| weighted | (BOOL) Whether repeated movie casts in this genre are stored as edge weights (e.g., true). |
| genre | (STRING) Movie genre used to build this dataset (e.g., Romance). |
| node_type | (STRING) Semantic type of nodes (e.g., actor). |
| edge_type | (STRING) Semantic type of hyperedges (e.g., movie_cast). |
Node metadata
| Property | Description |
|---|---|
| tmdb_person_id | (INT) TMDB person identifier for the actor (e.g., 31). |
| name | (STRING) Actor name from credits.csv (e.g., Tom Hanks). |
| gender | (INT) TMDB gender code for the actor; 0=unknown/not specified, 1=female, 2=male (e.g., 2). |
Hyperedge metadata
| Property | Description |
|---|---|
| movie_ids | (LIST[STRING]) TMDB movie identifiers represented by this normalized cast hyperedge (e.g., [862]). |
| titles | (LIST[STRING]) Movie titles represented by this normalized cast hyperedge (e.g., [Toy Story]). |
| original_languages | (LIST[STRING]) Original language codes for the represented movies (e.g., [en]). |
| release_years | (LIST[INT]) Release years for represented movies when available (e.g., [1995]). |
| weight | (INT) Number of movies collapsed into this normalized cast hyperedge (e.g., 1). |
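The list-valued fields and the `weight` property above suggest that movies sharing the same cast are merged into one hyperedge. A minimal sketch of such a merge follows, assuming that "normalized" means treating a cast as an unordered set of unique `tmdb_person_id` values; the actual build scripts may normalize differently. The function `collapse_casts`, the input record layout, and the toy movies are hypothetical.

```python
from collections import defaultdict


def collapse_casts(movies):
    """Group movies by their (assumed) normalized cast and count them."""
    groups = defaultdict(list)
    for movie in movies:
        # Assumption: normalization = unordered set of unique person ids.
        cast = frozenset(movie["cast"])
        if cast:  # skip movies without any credited actor
            groups[cast].append(movie)

    hyperedges = []
    for cast, members in groups.items():
        hyperedges.append({
            "nodes": sorted(cast),
            "movie_ids": [str(m["id"]) for m in members],
            "titles": [m["title"] for m in members],
            "original_languages": [m["language"] for m in members],
            "release_years": [m["year"] for m in members],
            "weight": len(members),
        })
    return hyperedges


# Made-up records: the first two movies share the same cast.
toy_movies = [
    {"id": 1, "title": "Movie A", "language": "en", "year": 1995, "cast": [10, 20]},
    {"id": 2, "title": "Movie B", "language": "en", "year": 1997, "cast": [20, 10, 10]},
    {"id": 3, "title": "Movie C", "language": "fr", "year": 2001, "cast": [10, 30]},
]
edges = collapse_casts(toy_movies)
for edge in edges:
    print(edge["nodes"], edge["titles"], edge["weight"])
```

Under this assumption, `weight` always equals the length of `movie_ids` for the same hyperedge.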
Hyperedge size distribution
Hyperdegree distribution
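For readers who want to recompute the two distributions named above, here is a small sketch. It assumes the usual definitions: the size of a hyperedge is the number of distinct nodes it contains, and the hyperdegree of a node is the number of hyperedges that contain it, ignoring edge weights. The function names and the toy input are hypothetical.

```python
from collections import Counter


def size_distribution(hyperedges):
    """Map hyperedge size -> number of hyperedges of that size."""
    return Counter(len(set(e)) for e in hyperedges)


def hyperdegree_distribution(hyperedges):
    """Map hyperdegree -> number of nodes with that hyperdegree."""
    # Count, for every node, how many hyperedges contain it (unweighted).
    degree = Counter(node for e in hyperedges for node in set(e))
    return Counter(degree.values())


# Toy input with made-up node identifiers.
toy_edges = [[1, 2, 3], [1, 2], [3, 4, 5, 6]]
sizes = size_distribution(toy_edges)
degrees = hyperdegree_distribution(toy_edges)
print(dict(sizes))
print(dict(degrees))
```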
Download
- Version 1.0.0: Binary (1.5 MB), JSON (1.3 MB)
Provenance
Source: https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset
License: CC0 Public Domain
Data derived from The Movies Dataset on Kaggle using movies_metadata.csv and credits.csv. The Kaggle dataset is listed as CC0: Public Domain; the movie details and credits were collected from the TMDB Open API. This product uses TMDB data but is not endorsed or certified by TMDB.
Reproducibility: Instructions and scripts
Citation
When this data is used in published research or for visualization purposes, please cite the following:
No BibTeX entry is currently available. Please refer to the original source: https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset