💾 Read & Write
Submodules
hypergraphx.readwrite.load module
- hypergraphx.readwrite.load.download_remote_dataset(dataset_name, *, fmt='hgx', timeout=30, verify_ssl=False, cache_dir=None, overwrite=False, catalog_url=None, use_catalog=True, dataset_info=None)
Download and cache a remote dataset without loading it into memory.
- Parameters:
dataset_name (str) – Dataset identifier, such as "zoo" or "contacts-hospital".
fmt ({"hgx", "binary", "json"} or None, default="hgx") – Remote format to download. If explicitly set to None, JSON URLs are tried first, then binary URLs.
timeout (int, default=30) – Download timeout in seconds.
verify_ssl (bool, default=False) – Whether to verify TLS certificates.
cache_dir (path-like, optional) – Cache directory. Defaults to ~/.cache/hypergraphx/datasets or the HYPERGRAPHX_DATA_CACHE environment variable.
overwrite (bool, default=False) – If True, re-download even when a matching cached file exists.
catalog_url (str, optional) – Catalog metadata URL used to resolve dataset download URLs.
use_catalog (bool, default=True) – If True, resolve download URLs from the remote catalog before falling back to legacy hard-coded URL patterns.
dataset_info (dict, optional) – Already loaded catalog entry. Passing this avoids reloading the catalog when downloading many datasets.
- Returns:
Local decompressed cache path, suitable for load_hypergraph(...).
- Return type:
pathlib.Path
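The cache_dir default described above can be pictured with a small standalone sketch. It does not call hypergraphx: resolve_cache_dir is a hypothetical helper, and the precedence it uses (explicit argument, then environment variable, then the home-directory default) is an assumption, since the reference only names the two fallback locations.

```python
import os
from pathlib import Path

ENV_VAR = "HYPERGRAPHX_DATA_CACHE"  # environment variable named in the reference
DEFAULT_CACHE = Path.home() / ".cache" / "hypergraphx" / "datasets"


def resolve_cache_dir(cache_dir=None, environ=None):
    """Hypothetical helper: choose the directory used for cached datasets.

    Assumed precedence (not confirmed by the reference): explicit argument,
    then the environment variable, then the default location.
    """
    environ = os.environ if environ is None else environ
    if cache_dir is not None:
        return Path(cache_dir).expanduser()
    env_value = environ.get(ENV_VAR)
    if env_value:
        return Path(env_value).expanduser()
    return DEFAULT_CACHE


# An explicit argument wins over the environment in this sketch.
print(resolve_cache_dir("/tmp/hgx-cache", environ={ENV_VAR: "/data/hgx"}))
# With no argument, the environment variable is used when it is set.
print(resolve_cache_dir(environ={ENV_VAR: "/data/hgx"}))
# Otherwise the default under the home directory is returned.
print(resolve_cache_dir(environ={}))
```

Passing environ as a plain dict keeps the sketch testable without touching the real process environment.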
- hypergraphx.readwrite.load.download_remote_datasets(dataset_names=None, *, attributes=None, match_all=True, fmt='hgx', timeout=30, verify_ssl=False, cache_dir=None, overwrite=False, catalog_url=None, continue_on_error=False, progress_callback=None)
Download and cache multiple remote datasets.
- Parameters:
dataset_names (str | Iterable[str], optional) – Dataset names, filenames, or directories to download explicitly.
attributes (str | Iterable[str], optional) – Tag/category names used to select datasets from the catalog. If both dataset_names and attributes are provided, named datasets are filtered by the requested attributes.
match_all (bool, default=True) – If True, selected datasets must contain all requested attributes. If False, any requested attribute is enough.
fmt ({"hgx", "binary", "json"} or None, default="hgx") – Remote format to download.
timeout (int, default=30) – Download timeout in seconds.
verify_ssl (bool, default=False) – Whether to verify TLS certificates.
cache_dir (path-like, optional) – Cache directory. Defaults to ~/.cache/hypergraphx/datasets or the HYPERGRAPHX_DATA_CACHE environment variable.
overwrite (bool, default=False) – If True, re-download even when matching cached files exist.
catalog_url (str, optional) – Catalog metadata URL used to resolve dataset download URLs.
continue_on_error (bool, default=False) – If True, keep downloading after a dataset fails and store the exception in that dataset's result record. If False, raise on the first failure.
progress_callback (callable, optional) – Called after each dataset with its result record.
- Returns:
Mapping from canonical dataset name to records with path, metadata, error, and status fields.
- Return type:
dict
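The continue_on_error and progress_callback behaviour can be illustrated with a standalone sketch. Nothing here calls hypergraphx: download_many and fake_fetch are hypothetical stand-ins, and the status strings "ok" and "error" are assumptions, because the reference lists the record fields but not their values.

```python
def download_many(names, fetch, *, continue_on_error=False, progress_callback=None):
    """Hypothetical batch loop; `fetch` maps a dataset name to a local path."""
    results = {}
    for name in names:
        # Record fields follow the reference; the status values are assumed.
        record = {"path": None, "metadata": None, "error": None, "status": "ok"}
        try:
            record["path"] = fetch(name)
        except Exception as exc:
            if not continue_on_error:
                raise  # fail fast on the first broken dataset
            record["error"] = exc
            record["status"] = "error"
        results[name] = record
        if progress_callback is not None:
            progress_callback(record)
    return results


def fake_fetch(name):
    """Stand-in for a real download; fails for one made-up dataset name."""
    if name == "broken":
        raise OSError("download failed")
    return f"/cache/{name}.json"


results = download_many(["zoo", "broken"], fake_fetch, continue_on_error=True)
for name, record in results.items():
    print(name, record["status"], record["path"], record["error"])
```

With continue_on_error left at False, the same call would propagate the OSError raised for the second dataset instead of recording it.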
- hypergraphx.readwrite.load.get_remote_dataset_info(dataset_name, *, timeout=30, verify_ssl=False, catalog_url=None)
Return the full catalog entry for a remote dataset.
dataset_name is matched against the catalog name, filename, and directory fields.
- hypergraphx.readwrite.load.iter_remote_hypergraphs(attributes=None, *, names=None, match_all=True, fmt='hgx', timeout=30, verify_ssl=False, catalog_url=None, include_metadata=False, store=True, cache_dir=None, overwrite=False)
Yield remote hypergraphs selected by name or catalog tags/categories.
- Parameters:
attributes (str | Iterable[str], optional) – Tag/category names to match, such as "Undirected" or ["Undirected", "Temporal"]. Matching is case-insensitive.
names (str | Iterable[str], optional) – Dataset names, filenames, or directories to load explicitly. If omitted, datasets are selected from attributes.
match_all (bool, default=True) – If True, a dataset must contain all requested attributes. If False, any requested attribute is enough.
fmt ({"hgx", "binary", "json"}, default="hgx") – Remote format to load for each matching dataset.
verify_ssl (bool, default=False) – Whether to verify TLS certificates for remote requests.
catalog_url (str, optional) – Catalog metadata URL used for filtering.
include_metadata (bool, default=False) – If True, yield (hypergraph, dataset_info) pairs. Otherwise yield only the hypergraph object.
store (bool, default=True) – Store downloaded datasets locally before loading them.
cache_dir (path-like, optional) – Cache directory. Defaults to ~/.cache/hypergraphx/datasets or the HYPERGRAPHX_DATA_CACHE environment variable.
overwrite (bool, default=False) – If True, re-download matching datasets even when cached files exist.
Notes
This is a generator: datasets are downloaded and loaded lazily as the iterator advances.
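The match_all semantics and the lazy behaviour noted above can be sketched without the library. In this illustration iter_matching, fake_load, and the three-entry catalog are all hypothetical; only the case-insensitive all/any matching rule and the generator behaviour come from the reference.

```python
def iter_matching(catalog, attributes, *, match_all=True, load=None):
    """Hypothetical generator: yield entries whose tags satisfy the request."""
    if isinstance(attributes, str):
        attributes = [attributes]
    wanted = {a.lower() for a in attributes}
    for entry in catalog:
        tags = {t.lower() for t in entry["tags"]}
        # match_all=True needs every requested tag; False needs at least one.
        matched = wanted <= tags if match_all else bool(wanted & tags)
        if matched:
            yield load(entry) if load is not None else entry["name"]


loaded = []  # names passed to the fake loader, in call order


def fake_load(entry):
    """Stand-in for downloading and parsing a dataset."""
    loaded.append(entry["name"])
    return entry["name"]


# Made-up catalog entries, for illustration only.
catalog = [
    {"name": "a", "tags": ["Undirected", "Temporal"]},
    {"name": "b", "tags": ["Undirected"]},
    {"name": "c", "tags": ["Directed"]},
]

lazy = iter_matching(catalog, "undirected", load=fake_load)
before = list(loaded)       # nothing has been loaded yet
first = next(lazy)          # advancing the iterator loads exactly one entry
after_first = list(loaded)

all_match = list(iter_matching(catalog, ["Undirected", "Temporal"], match_all=True))
any_match = list(iter_matching(catalog, ["Undirected", "Temporal"], match_all=False))
print(before, after_first, all_match, any_match)
```

Because nothing runs until the iterator is advanced, breaking out of a loop early avoids fetching the remaining datasets.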
- hypergraphx.readwrite.load.list_remote_datasets(*, timeout=30, verify_ssl=False, catalog_url=None)
List datasets advertised by the remote Hypergraphx-data catalog.
Returns a list of dictionaries with at least the following keys: name, tags/categories, vertices, edges.
- Parameters:
timeout (int, default=30) – Download timeout in seconds.
verify_ssl (bool, default=False) – Whether to verify TLS certificates when downloading the catalog. Defaults to False for compatibility with the current dataset server.
catalog_url (str, optional) – Catalog metadata URL. Defaults to the Hypergraphx-data GitHub raw URL, or HYPERGRAPHX_DATA_CATALOG_URL if set.
Notes
catalog_url can point to the generated catalog.json file, a JSON list, or the legacy related-data.js file used by the website.
- hypergraphx.readwrite.load.load_hypergraph(file_name, *, fmt=None)
Load a hypergraph from disk.
- Parameters:
file_name (str or path-like) – Input file path.
fmt ({"json", "pickle", "hgr"} | None) – Optional override for the input format. If None (default), infer format from the file extension. Gzipped files with a .gz suffix are supported for each local format, such as .json.gz and .hgx.gz.
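Extension-based inference with optional gzip wrapping might look like the sketch below. It is an illustration, not the library's code: infer_format and the EXTENSION_TO_FORMAT table are hypothetical, and .hgx is left out of the table because the reference does not say which fmt value that extension maps to.

```python
from pathlib import Path

# Hypothetical extension table; the real mapping may differ.
EXTENSION_TO_FORMAT = {
    ".json": "json",
    ".pkl": "pickle",
    ".pickle": "pickle",
    ".hgr": "hgr",
}


def infer_format(file_name, fmt=None):
    """Return (format, is_gzipped) for a path, honouring an explicit override."""
    suffixes = [s.lower() for s in Path(file_name).suffixes]
    gzipped = bool(suffixes) and suffixes[-1] == ".gz"
    if gzipped:
        suffixes = suffixes[:-1]  # look at the extension underneath .gz
    if fmt is None:
        if not suffixes or suffixes[-1] not in EXTENSION_TO_FORMAT:
            raise ValueError(f"cannot infer format from {file_name!r}")
        fmt = EXTENSION_TO_FORMAT[suffixes[-1]]
    return fmt, gzipped


print(infer_format("data/zoo.json.gz"))
print(infer_format("data/zoo.hgr"))
print(infer_format("data/zoo.bin", fmt="pickle"))
```

An explicit fmt bypasses the table entirely, which is what makes the override useful for files with unconventional names.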
- hypergraphx.readwrite.load.load_hypergraph_from_server(dataset_name, *, fmt='hgx', as_dict=False, timeout=30, verify_ssl=False, store=True, cache_dir=None, overwrite=False, catalog_url=None, use_catalog=True, dataset_info=None)
Load a dataset by name from the remote Hypergraphx-data server.
- Parameters:
dataset_name (str) – Dataset identifier, such as "zoo" or "contacts-hospital".
fmt ({"hgx", "binary", "json"} or None, default="hgx") – Remote format to load. "hgx" and "binary" load the compact binary Hypergraphx format; "json" loads the JSON format. If explicitly set to None, JSON URLs are tried first, then binary URLs.
as_dict (bool, default=False) – If True, return the exposed internal data-structure dictionary instead of a hypergraph object.
timeout (int, default=30) – Download timeout in seconds.
verify_ssl (bool, default=False) – Whether to verify TLS certificates. Defaults to False for compatibility with the current dataset server certificate chain.
store (bool, default=True) – Store the decompressed remote dataset locally before loading it. Cached files are reused on later calls.
cache_dir (path-like, optional) – Cache directory. Defaults to ~/.cache/hypergraphx/datasets or the HYPERGRAPHX_DATA_CACHE environment variable.
overwrite (bool, default=False) – If True, re-download even when a matching cached file exists.
catalog_url (str, optional) – Catalog metadata URL used to resolve dataset download URLs.
use_catalog (bool, default=True) – If True, resolve download URLs from the remote catalog before falling back to legacy hard-coded URL patterns.
dataset_info (dict, optional) – Already loaded catalog entry. Passing this avoids reloading the catalog when loading many datasets.
- Returns:
Loaded hypergraph object, or its exposed dictionary if as_dict=True.
- Return type:
Hypergraph | DirectedHypergraph | TemporalHypergraph | MultiplexHypergraph | dict
Notes
The loader tries current per-dataset .json.gz/.hgx.gz URLs first and keeps older flat URLs as fallbacks. When store=True, compressed downloads are decompressed before being written to the cache.
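The fallback order and the decompress-before-caching step in the note above can be sketched as follows. Every name here (candidate_formats, fetch_first, store_decompressed, fake_fetch) and both example.invalid URLs are hypothetical, and the fetcher is a stub so that no network access is needed; the real URL layout is not given in the reference.

```python
import gzip
import tempfile
from pathlib import Path


def candidate_formats(fmt):
    """Formats to try, in order; None means JSON first, then binary."""
    if fmt is None:
        return ["json", "hgx"]
    return ["hgx" if fmt == "binary" else fmt]


def fetch_first(urls, fetch):
    """Return (url, payload) for the first URL that `fetch` can retrieve."""
    last_error = None
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as exc:
            last_error = exc  # remember the failure and try the next URL
    raise OSError(f"all candidate URLs failed: {urls}") from last_error


def store_decompressed(url, payload, cache_dir):
    """Write the payload to the cache, gunzipping it first for .gz URLs."""
    name = url.rsplit("/", 1)[-1]
    if name.endswith(".gz"):
        payload = gzip.decompress(payload)
        name = name[:-3]
    target = Path(cache_dir) / name
    target.write_bytes(payload)
    return target


# Stub server: only the older flat URL exists in this made-up example.
SERVER = {"https://example.invalid/zoo.json.gz": gzip.compress(b'{"demo": true}')}


def fake_fetch(url):
    if url not in SERVER:
        raise OSError(f"not found: {url}")
    return SERVER[url]


urls = [
    "https://example.invalid/zoo/zoo.json.gz",  # per-dataset URL, tried first
    "https://example.invalid/zoo.json.gz",      # flat URL, used as fallback
]
with tempfile.TemporaryDirectory() as tmp:
    url, payload = fetch_first(urls, fake_fetch)
    cached = store_decompressed(url, payload, tmp)
    cached_name = cached.name
    cached_bytes = cached.read_bytes()

print(url, cached_name, cached_bytes)
```

Storing the decompressed bytes means later loads can read the cached file directly, at the cost of more disk space than the compressed download.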
- hypergraphx.readwrite.load.search_remote_datasets(query=None, *, tags=None, match_all_tags=True, source=None, license=None, min_nodes=None, max_nodes=None, min_edges=None, max_edges=None, timeout=30, verify_ssl=False, catalog_url=None)
Search the remote Hypergraphx-data catalog.
- Parameters:
query (str, optional) – Case-insensitive substring matched against dataset names and tags.
tags (str | Iterable[str], optional) – Tags/categories to require. Matching is case-insensitive.
match_all_tags (bool, default=True) – If True, all requested tags must be present. If False, any requested tag is enough.
source (str, optional) – Case-insensitive substring matched against the source URL/text.
license (str, optional) – Case-insensitive substring matched against the license identifier/text.
min_nodes (int, optional) – Inclusive size filters using catalog vertices and edges.
max_nodes (int, optional) – Inclusive size filters using catalog vertices and edges.
min_edges (int, optional) – Inclusive size filters using catalog vertices and edges.
max_edges (int, optional) – Inclusive size filters using catalog vertices and edges.
- Returns:
Matching catalog entries in catalog order.
- Return type:
list[dict]
See also
list_remote_datasets – Return the full remote catalog.
iter_remote_hypergraphs – Lazily load matching remote hypergraphs.
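The query and size filters can be pictured as a plain filter over a list of catalog dictionaries. This is a sketch under assumptions: search_catalog and within are hypothetical helpers, the catalog entries and their sizes are made up, and only the documented rules are modelled (case-insensitive substring match on names and tags, inclusive bounds on vertices and edges, results in catalog order).

```python
def within(value, low, high):
    """Inclusive range check where None means 'no bound'."""
    return (low is None or value >= low) and (high is None or value <= high)


def search_catalog(catalog, query=None, *, min_nodes=None, max_nodes=None,
                   min_edges=None, max_edges=None):
    """Hypothetical filter over catalog entries; preserves catalog order."""
    needle = query.lower() if query else None
    matches = []
    for entry in catalog:
        if needle is not None:
            haystack = [entry["name"].lower()] + [t.lower() for t in entry["tags"]]
            if not any(needle in text for text in haystack):
                continue
        if not within(entry["vertices"], min_nodes, max_nodes):
            continue
        if not within(entry["edges"], min_edges, max_edges):
            continue
        matches.append(entry)
    return matches


# Made-up entries; the sizes are illustrative, not real catalog values.
catalog = [
    {"name": "zoo", "tags": ["Undirected"], "vertices": 100, "edges": 40},
    {"name": "contacts-hospital", "tags": ["Temporal"], "vertices": 75, "edges": 1800},
    {"name": "demo-large", "tags": ["Directed"], "vertices": 5000, "edges": 9000},
]

by_name = [e["name"] for e in search_catalog(catalog, "ZOO")]
by_tag = [e["name"] for e in search_catalog(catalog, "temporal")]
by_size = [e["name"] for e in search_catalog(catalog, min_nodes=100, max_nodes=5000)]
print(by_name, by_tag, by_size)
```

The bounds are inclusive, so an entry with exactly min_nodes or max_nodes vertices is kept.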
hypergraphx.readwrite.save module
- hypergraphx.readwrite.save.save_hypergraph(hypergraph, file_name, *, fmt='json', binary=None)
Save a hypergraph to disk.
- Parameters:
hypergraph – Hypergraph-like object.
file_name (str) – Output file path.
fmt ({"json", "pickle"}) – Output format (default: "json").
binary (bool | None) – Backward-compatible alias for fmt="pickle" when True. If provided, overrides fmt and emits a DeprecationWarning.
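The interplay between fmt and the deprecated binary flag can be sketched with the standard warnings module. resolve_save_format is a hypothetical helper, not part of hypergraphx, and mapping binary=False to "json" is an assumption: the reference only states what binary=True means.

```python
import warnings


def resolve_save_format(fmt="json", binary=None):
    """Hypothetical helper: fold the legacy `binary` flag into `fmt`."""
    if binary is not None:
        warnings.warn(
            "binary is deprecated; pass fmt='pickle' or fmt='json' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: binary=False selects JSON, mirroring binary=True -> pickle.
        fmt = "pickle" if binary else "json"
    if fmt not in {"json", "pickle"}:
        raise ValueError(f"unsupported format: {fmt!r}")
    return fmt


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    chosen = resolve_save_format(fmt="json", binary=True)

print(chosen, [w.category.__name__ for w in caught])
```

New code should pass fmt directly; the flag exists only so that older call sites keep working.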
hypergraphx.readwrite.hif module
- hypergraphx.readwrite.hif.read_hif(path)
Load a hypergraph from a HIF file.
- Parameters:
path (str) – The path to the HIF file.
- Returns:
The loaded hypergraph
- Return type:
- hypergraphx.readwrite.hif.write_hif(H, path)
Save a hypergraph to a HIF file.
- Parameters:
H (Hypergraph) – The hypergraph to save.
path (str) – The path to save the hypergraph to.
hypergraphx.readwrite.io_json module
hypergraphx.readwrite.io_pickle module
hypergraphx.readwrite.hashing module
- hypergraphx.readwrite.hashing.hash_hypergraph(hypergraph)
Generates a SHA-256 hash of a hypergraph based on its exposed attributes.
- Parameters:
hypergraph (object) – The hypergraph instance to hash. Should implement expose_attributes_for_hashing.
- Returns:
The SHA-256 hash hex digest of the hypergraph.
- Return type:
str
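One way to picture hashing by exposed attributes is to serialize them canonically and feed the bytes to SHA-256. The sketch below is an assumption about the general technique, not the library's implementation: TinyHypergraph is a toy class, hash_exposed_attributes is a hypothetical function, and canonical JSON is just one possible serialization. Only the method name expose_attributes_for_hashing comes from the reference.

```python
import hashlib
import json


class TinyHypergraph:
    """Toy stand-in that exposes a canonical view of its edges."""

    def __init__(self, edges):
        # Sort nodes within each edge so that node order does not matter.
        self.edges = [tuple(sorted(edge)) for edge in edges]

    def expose_attributes_for_hashing(self):
        # Sort the edges too, so that insertion order does not matter.
        return {"type": "TinyHypergraph", "edges": sorted(self.edges)}


def hash_exposed_attributes(obj):
    """Hypothetical sketch: SHA-256 hex digest of the exposed attributes."""
    exposed = obj.expose_attributes_for_hashing()
    canonical = json.dumps(exposed, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = TinyHypergraph([(1, 2, 3), (3, 4)])
second = TinyHypergraph([(4, 3), (3, 2, 1)])  # same edges, different order
third = TinyHypergraph([(1, 2)])

digest_first = hash_exposed_attributes(first)
digest_second = hash_exposed_attributes(second)
digest_third = hash_exposed_attributes(third)
print(digest_first == digest_second, digest_first == digest_third)
```

A stable digest depends on the exposed view being canonical, which is why the toy class sorts both the nodes and the edges before serializing.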
Module contents
- hypergraphx.readwrite.download_remote_dataset(dataset_name, *, fmt='hgx', timeout=30, verify_ssl=False, cache_dir=None, overwrite=False, catalog_url=None, use_catalog=True, dataset_info=None)
Download and cache a remote dataset without loading it into memory.
- Parameters:
dataset_name (str) – Dataset identifier, such as "zoo" or "contacts-hospital".
fmt ({"hgx", "binary", "json"} or None, default="hgx") – Remote format to download. If explicitly set to None, JSON URLs are tried first, then binary URLs.
timeout (int, default=30) – Download timeout in seconds.
verify_ssl (bool, default=False) – Whether to verify TLS certificates.
cache_dir (path-like, optional) – Cache directory. Defaults to ~/.cache/hypergraphx/datasets or the HYPERGRAPHX_DATA_CACHE environment variable.
overwrite (bool, default=False) – If True, re-download even when a matching cached file exists.
catalog_url (str, optional) – Catalog metadata URL used to resolve dataset download URLs.
use_catalog (bool, default=True) – If True, resolve download URLs from the remote catalog before falling back to legacy hard-coded URL patterns.
dataset_info (dict, optional) – Already loaded catalog entry. Passing this avoids reloading the catalog when downloading many datasets.
- Returns:
Local decompressed cache path, suitable for load_hypergraph(...).
- Return type:
pathlib.Path
- hypergraphx.readwrite.download_remote_datasets(dataset_names=None, *, attributes=None, match_all=True, fmt='hgx', timeout=30, verify_ssl=False, cache_dir=None, overwrite=False, catalog_url=None, continue_on_error=False, progress_callback=None)
Download and cache multiple remote datasets.
- Parameters:
dataset_names (str | Iterable[str], optional) – Dataset names, filenames, or directories to download explicitly.
attributes (str | Iterable[str], optional) – Tag/category names used to select datasets from the catalog. If both dataset_names and attributes are provided, named datasets are filtered by the requested attributes.
match_all (bool, default=True) – If True, selected datasets must contain all requested attributes. If False, any requested attribute is enough.
fmt ({"hgx", "binary", "json"} or None, default="hgx") – Remote format to download.
timeout (int, default=30) – Download timeout in seconds.
verify_ssl (bool, default=False) – Whether to verify TLS certificates.
cache_dir (path-like, optional) – Cache directory. Defaults to ~/.cache/hypergraphx/datasets or the HYPERGRAPHX_DATA_CACHE environment variable.
overwrite (bool, default=False) – If True, re-download even when matching cached files exist.
catalog_url (str, optional) – Catalog metadata URL used to resolve dataset download URLs.
continue_on_error (bool, default=False) – If True, keep downloading after a dataset fails and store the exception in that dataset's result record. If False, raise on the first failure.
progress_callback (callable, optional) – Called after each dataset with its result record.
- Returns:
Mapping from canonical dataset name to records with path, metadata, error, and status fields.
- Return type:
dict
- hypergraphx.readwrite.get_remote_dataset_info(dataset_name, *, timeout=30, verify_ssl=False, catalog_url=None)
Return the full catalog entry for a remote dataset.
dataset_name is matched against the catalog name, filename, and directory fields.
- hypergraphx.readwrite.iter_remote_hypergraphs(attributes=None, *, names=None, match_all=True, fmt='hgx', timeout=30, verify_ssl=False, catalog_url=None, include_metadata=False, store=True, cache_dir=None, overwrite=False)
Yield remote hypergraphs selected by name or catalog tags/categories.
- Parameters:
attributes (str | Iterable[str], optional) – Tag/category names to match, such as "Undirected" or ["Undirected", "Temporal"]. Matching is case-insensitive.
names (str | Iterable[str], optional) – Dataset names, filenames, or directories to load explicitly. If omitted, datasets are selected from attributes.
match_all (bool, default=True) – If True, a dataset must contain all requested attributes. If False, any requested attribute is enough.
fmt ({"hgx", "binary", "json"}, default="hgx") – Remote format to load for each matching dataset.
verify_ssl (bool, default=False) – Whether to verify TLS certificates for remote requests.
catalog_url (str, optional) – Catalog metadata URL used for filtering.
include_metadata (bool, default=False) – If True, yield (hypergraph, dataset_info) pairs. Otherwise yield only the hypergraph object.
store (bool, default=True) – Store downloaded datasets locally before loading them.
cache_dir (path-like, optional) – Cache directory. Defaults to ~/.cache/hypergraphx/datasets or the HYPERGRAPHX_DATA_CACHE environment variable.
overwrite (bool, default=False) – If True, re-download matching datasets even when cached files exist.
Notes
This is a generator: datasets are downloaded and loaded lazily as the iterator advances.
- hypergraphx.readwrite.list_remote_datasets(*, timeout=30, verify_ssl=False, catalog_url=None)
List datasets advertised by the remote Hypergraphx-data catalog.
Returns a list of dictionaries with at least the following keys: name, tags/categories, vertices, edges.
- Parameters:
timeout (int, default=30) – Download timeout in seconds.
verify_ssl (bool, default=False) – Whether to verify TLS certificates when downloading the catalog. Defaults to False for compatibility with the current dataset server.
catalog_url (str, optional) – Catalog metadata URL. Defaults to the Hypergraphx-data GitHub raw URL, or HYPERGRAPHX_DATA_CATALOG_URL if set.
Notes
catalog_url can point to the generated catalog.json file, a JSON list, or the legacy related-data.js file used by the website.
- hypergraphx.readwrite.load_any(obj_or_path)
- hypergraphx.readwrite.load_hypergraph(file_name, *, fmt=None)
Load a hypergraph from disk.
- Parameters:
file_name (str or path-like) – Input file path.
fmt ({"json", "pickle", "hgr"} | None) – Optional override for the input format. If None (default), infer format from the file extension. Gzipped files with a .gz suffix are supported for each local format, such as .json.gz and .hgx.gz.
- hypergraphx.readwrite.load_hypergraph_from_server(dataset_name, *, fmt='hgx', as_dict=False, timeout=30, verify_ssl=False, store=True, cache_dir=None, overwrite=False, catalog_url=None, use_catalog=True, dataset_info=None)
Load a dataset by name from the remote Hypergraphx-data server.
- Parameters:
dataset_name (str) – Dataset identifier, such as "zoo" or "contacts-hospital".
fmt ({"hgx", "binary", "json"} or None, default="hgx") – Remote format to load. "hgx" and "binary" load the compact binary Hypergraphx format; "json" loads the JSON format. If explicitly set to None, JSON URLs are tried first, then binary URLs.
as_dict (bool, default=False) – If True, return the exposed internal data-structure dictionary instead of a hypergraph object.
timeout (int, default=30) – Download timeout in seconds.
verify_ssl (bool, default=False) – Whether to verify TLS certificates. Defaults to False for compatibility with the current dataset server certificate chain.
store (bool, default=True) – Store the decompressed remote dataset locally before loading it. Cached files are reused on later calls.
cache_dir (path-like, optional) – Cache directory. Defaults to ~/.cache/hypergraphx/datasets or the HYPERGRAPHX_DATA_CACHE environment variable.
overwrite (bool, default=False) – If True, re-download even when a matching cached file exists.
catalog_url (str, optional) – Catalog metadata URL used to resolve dataset download URLs.
use_catalog (bool, default=True) – If True, resolve download URLs from the remote catalog before falling back to legacy hard-coded URL patterns.
dataset_info (dict, optional) – Already loaded catalog entry. Passing this avoids reloading the catalog when loading many datasets.
- Returns:
Loaded hypergraph object, or its exposed dictionary if as_dict=True.
- Return type:
Hypergraph | DirectedHypergraph | TemporalHypergraph | MultiplexHypergraph | dict
Notes
The loader tries current per-dataset .json.gz/.hgx.gz URLs first and keeps older flat URLs as fallbacks. When store=True, compressed downloads are decompressed before being written to the cache.
- hypergraphx.readwrite.read_hif(path)
Load a hypergraph from a HIF file.
- Parameters:
path (str) – The path to the HIF file.
- Returns:
The loaded hypergraph
- Return type:
- hypergraphx.readwrite.save_hypergraph(hypergraph, file_name, *, fmt='json', binary=None)
Save a hypergraph to disk.
- Parameters:
hypergraph – Hypergraph-like object.
file_name (str) – Output file path.
fmt ({"json", "pickle"}) – Output format (default: "json").
binary (bool | None) – Backward-compatible alias for fmt="pickle" when True. If provided, overrides fmt and emits a DeprecationWarning.
- hypergraphx.readwrite.search_remote_datasets(query=None, *, tags=None, match_all_tags=True, source=None, license=None, min_nodes=None, max_nodes=None, min_edges=None, max_edges=None, timeout=30, verify_ssl=False, catalog_url=None)
Search the remote Hypergraphx-data catalog.
- Parameters:
query (str, optional) – Case-insensitive substring matched against dataset names and tags.
tags (str | Iterable[str], optional) – Tags/categories to require. Matching is case-insensitive.
match_all_tags (bool, default=True) – If True, all requested tags must be present. If False, any requested tag is enough.
source (str, optional) – Case-insensitive substring matched against the source URL/text.
license (str, optional) – Case-insensitive substring matched against the license identifier/text.
min_nodes (int, optional) – Inclusive size filters using catalog vertices and edges.
max_nodes (int, optional) – Inclusive size filters using catalog vertices and edges.
min_edges (int, optional) – Inclusive size filters using catalog vertices and edges.
max_edges (int, optional) – Inclusive size filters using catalog vertices and edges.
- Returns:
Matching catalog entries in catalog order.
- Return type:
list[dict]
See also
list_remote_datasets – Return the full remote catalog.
iter_remote_hypergraphs – Lazily load matching remote hypergraphs.
- hypergraphx.readwrite.write_hif(H, path)
Save a hypergraph to a HIF file.
- Parameters:
H (Hypergraph) – The hypergraph to save.
path (str) – The path to save the hypergraph to.