Scale-free hypergraph generation

In this notebook you will:

  • Generate scale-free hypergraphs with controlled degree correlations.

  • Compute size-degree distributions and compare layers.

  • Visualize interaction-size effects.

Definition

Scale-free hypergraphs exhibit power-law degree tails across interaction sizes.

You will learn

Create scale-free hypergraphs and analyze degree patterns across sizes.

Overview

  • Generate scale-free hypergraphs and compare degree distributions.

  • Visualize sample structures and validate model behavior.

Setup

[ ]:
import matplotlib as mpl

mpl.rcParams.update({
    "figure.figsize": (6, 4),
    "figure.dpi": 120,
    "savefig.dpi": 150,
})

[35]:
import math
import numpy as np
import matplotlib.pyplot as plt

from scipy.stats import spearmanr

from hypergraphx.generation.scale_free import scale_free_hypergraph
[36]:
# -----------------------------
# Helpers
# -----------------------------
def iter_all_edges(hg):
    for e in hg.get_edges():
        yield tuple(e)

def edges_of_size(hg, size: int):
    edges = hg.get_edges(size=size)
    return edges

def sizes_present(hg):
    return sorted(set(hg.get_sizes()))

def nodes_present(hg):
    ns = hg.get_nodes()
    return np.array(sorted(map(int, ns)))

def degree_by_size(hg, size: int, num_nodes: int | None = None):
    """Size-specific degrees k_i^{(size)} as a numpy array of length num_nodes."""
    if num_nodes is None:
        num_nodes = int(nodes_present(hg).max()) + 1
    deg = np.zeros(num_nodes, dtype=int)
    for e in edges_of_size(hg, size):
        for v in e:
            deg[int(v)] += 1
    return deg

def degree_matrix(hg, sizes=None, num_nodes: int | None = None):
    """Return dict {s: degree_vector} for requested sizes."""
    if sizes is None:
        sizes = sizes_present(hg)
    if num_nodes is None:
        # If you used 0..num_nodes-1 in generation, this works well
        num_nodes = int(nodes_present(hg).max()) + 1
    return {s: degree_by_size(hg, s, num_nodes=num_nodes) for s in sizes}

def spearman_corr_matrix(vectors_by_size: dict[int, np.ndarray]):
    """Compute Spearman correlation matrix between degree vectors across sizes."""
    sizes = sorted(vectors_by_size.keys())
    M = np.zeros((len(sizes), len(sizes)), dtype=float)
    for i, si in enumerate(sizes):
        for j, sj in enumerate(sizes):
            if i == j:
                M[i, j] = 1.0
            else:
                M[i, j], _ = spearmanr(vectors_by_size[si], vectors_by_size[sj])
    return sizes, M
[37]:
# -----------------------------
# Plotting utilities
# -----------------------------

def plot_corr_matrix(sizes, M, title="Spearman correlation across size-layers", vmin=-1.0, vmax=1.0, cmap="coolwarm"):
    """Plot a matrix with a diverging colormap and fixed bounds (useful for correlations)."""
    plt.figure()
    im = plt.imshow(M, interpolation="nearest", vmin=vmin, vmax=vmax, cmap=cmap)
    plt.colorbar(im)
    plt.xticks(range(len(sizes)), sizes)
    plt.yticks(range(len(sizes)), sizes)
    plt.title(title)
    plt.xlabel("layer size s")
    plt.ylabel("layer size s")
    plt.tight_layout()
    plt.show()

def plot_degree_ccdf(deg, label=None):
    """CCDF P(K >= k) plotted on log-log axes."""
    deg = np.asarray(deg)
    deg = deg[deg > 0]
    if deg.size == 0:
        return
    vals = np.sort(deg)
    # unique k and ccdf
    k = np.unique(vals)
    ccdf = np.array([(vals >= kk).mean() for kk in k])
    plt.loglog(k, ccdf, marker="o", linestyle="none", label=label)

def plot_degree_distributions(degrees_by_size: dict[int, np.ndarray], title="Degree CCDF by size-layer"):
    plt.figure()
    for s, deg in sorted(degrees_by_size.items()):
        plot_degree_ccdf(deg, label=f"s={s}")
    plt.xlabel("k")
    plt.ylabel("P(K ≥ k)")
    plt.title(title)
    plt.legend()
    plt.show()

def plot_rank_scatter(deg_a, deg_b, a_label="layer A", b_label="layer B"):
    """Scatter of node degree ranks (Spearman visualization)."""
    ra = np.argsort(np.argsort(deg_a))  # 0..n-1 ranks
    rb = np.argsort(np.argsort(deg_b))
    plt.figure()
    plt.scatter(ra, rb, s=8, alpha=0.4)
    plt.xlabel(f"rank({a_label})")
    plt.ylabel(f"rank({b_label})")
    corr, _ = spearmanr(deg_a, deg_b)
    plt.title(f"Rank–rank scatter (Spearman={corr:.3f})")
    plt.show()

1) Generate a baseline hypergraph

We generate a multi-layer hypergraph with sizes s=2,3,4. Use enough hyperedges per layer for stable degree correlations; parameters are described below.

[38]:
num_nodes = 500
edges_by_size = {2: 4000, 3: 2500, 4: 1500}

alpha_by_size = 1.1   # stronger hubs with larger alpha
rho = 0.8             # strong correlation across layers
seed = 7

H = scale_free_hypergraph(
    num_nodes=num_nodes,
    edges_by_size=edges_by_size,
    alpha_by_size=alpha_by_size,
    rho=rho,
    seed=seed,
    enforce_unique_edges=True,
)

print("sizes present:", sizes_present(H))
for s in sorted(edges_by_size):
    print(f"s={s}: edges={len(edges_of_size(H, s))}")

sizes present: [2, 3, 4]
s=2: edges=4000
s=3: edges=2500
s=4: edges=1500

2) Size-specific degrees and inter-layer correlation

We compute (k_i^{(s)}) for each layer and visualize:

  • Spearman correlation matrix across layers (degree vectors),

  • a rank–rank scatter between two chosen layers.

[39]:
deg = degree_matrix(H, sizes=sorted(edges_by_size.keys()), num_nodes=num_nodes)
sizes, M = spearman_corr_matrix(deg)

print("Spearman correlations of realized degrees:")
for i, si in enumerate(sizes):
    for j, sj in enumerate(sizes):
        if j > i:
            print(f"  s={si} vs s={sj}: {M[i,j]:.3f}")

plot_corr_matrix(sizes, M, title="Spearman correlation of realized size-specific degrees")
plot_rank_scatter(deg[2], deg[3], a_label="s=2", b_label="s=3")

Spearman correlations of realized degrees:
  s=2 vs s=3: 0.570
  s=2 vs s=4: 0.577
  s=3 vs s=4: 0.538
../_images/tutorials_scale_free_generation_12_1.png
../_images/tutorials_scale_free_generation_12_2.png

3) Degree distributions across layers

A common visualization for heavy-tailed degree distributions is the CCDF on log–log axes. If the generator is behaving as expected, you should see:

  • broad/heavy-tailed degree distributions in each layer,

  • heavier tails (more extreme hubs) when alpha_by_size is larger.

[40]:
plot_degree_distributions(deg, title="Degree CCDF by size-layer (baseline)")

../_images/tutorials_scale_free_generation_14_0.png

4) Generating a variety of hypergraphs

Below we vary:

  • hub persistence (rho): independent layers (rho≈0), aligned hubs (rho>0), anti-aligned hubs (rho<0);

  • heterogeneity (alpha_by_size): weaker hubs vs stronger hubs;

  • layer-specific heterogeneity: different alpha for each layer.

We track:

  • inter-layer Spearman correlations of realized degrees,

  • CCDFs per layer.

[41]:
def summarize_hypergraph(num_nodes, edges_by_size, alpha_by_size, rho, seed=0):
    H = scale_free_hypergraph(
        num_nodes=num_nodes,
        edges_by_size=edges_by_size,
        alpha_by_size=alpha_by_size,
        rho=rho,
        seed=seed,
        enforce_unique_edges=True,
    )
    deg = degree_matrix(H, sizes=sorted(edges_by_size), num_nodes=num_nodes)
    sizes, M = spearman_corr_matrix(deg)
    return H, deg, sizes, M

def show_case(title, num_nodes, edges_by_size, alpha_by_size, rho, seed=0):
    H, deg, sizes, M = summarize_hypergraph(num_nodes, edges_by_size, alpha_by_size, rho, seed=seed)
    print(title)
    print("  alpha_by_size:", alpha_by_size, " rho:", rho)
    for i, si in enumerate(sizes):
        for j, sj in enumerate(sizes):
            if j > i:
                print(f"    Spearman(k^(s={si}), k^(s={sj})) = {M[i,j]:.3f}")
    plot_corr_matrix(sizes, M, title=f"{title} — Spearman(degrees)")
    plot_degree_distributions(deg, title=f"{title} — degree CCDFs")
    return H, deg, sizes, M

num_nodes = 600
edges_by_size = {2: 6000, 3: 3500, 4: 2000}
seed = 11

# 4.1 Independent layers (rho ~ 0)
_ = show_case(
    title="Case A: Independent layers (rho=0.0)",
    num_nodes=num_nodes,
    edges_by_size=edges_by_size,
    alpha_by_size=1.1,
    rho=0.0,
    seed=seed,
)

# 4.2 Strongly aligned hubs (rho > 0)
_ = show_case(
    title="Case B: Strongly aligned hubs (rho=0.85)",
    num_nodes=num_nodes,
    edges_by_size=edges_by_size,
    alpha_by_size=1.1,
    rho=0.85,
    seed=seed,
)

# 4.3 Anti-aligned hubs (rho < 0) within feasibility bound:
# With 3 layers, rho >= -1/(3-1) = -0.5
_ = show_case(
    title="Case C: Anti-aligned hubs (rho=-0.35)",
    num_nodes=num_nodes,
    edges_by_size=edges_by_size,
    alpha_by_size=1.1,
    rho=-0.35,
    seed=seed,
)

# 4.4 Layer-dependent heterogeneity (different alpha per layer)
_ = show_case(
    title="Case D: Layer-dependent heterogeneity (alpha varies by size)",
    num_nodes=num_nodes,
    edges_by_size=edges_by_size,
    alpha_by_size={2: 0.8, 3: 1.1, 4: 1.4},
    rho=0.75,
    seed=seed,
)

Case A: Independent layers (rho=0.0)
  alpha_by_size: 1.1  rho: 0.0
    Spearman(k^(s=2), k^(s=3)) = 0.016
    Spearman(k^(s=2), k^(s=4)) = -0.016
    Spearman(k^(s=3), k^(s=4)) = -0.033
../_images/tutorials_scale_free_generation_16_1.png
../_images/tutorials_scale_free_generation_16_2.png
Case B: Strongly aligned hubs (rho=0.85)
  alpha_by_size: 1.1  rho: 0.85
    Spearman(k^(s=2), k^(s=3)) = 0.643
    Spearman(k^(s=2), k^(s=4)) = 0.629
    Spearman(k^(s=3), k^(s=4)) = 0.590
../_images/tutorials_scale_free_generation_16_4.png
../_images/tutorials_scale_free_generation_16_5.png
Case C: Anti-aligned hubs (rho=-0.35)
  alpha_by_size: 1.1  rho: -0.35
    Spearman(k^(s=2), k^(s=3)) = -0.269
    Spearman(k^(s=2), k^(s=4)) = -0.268
    Spearman(k^(s=3), k^(s=4)) = -0.181
../_images/tutorials_scale_free_generation_16_7.png
../_images/tutorials_scale_free_generation_16_8.png
Case D: Layer-dependent heterogeneity (alpha varies by size)
  alpha_by_size: {2: 0.8, 3: 1.1, 4: 1.4}  rho: 0.75
    Spearman(k^(s=2), k^(s=3)) = 0.556
    Spearman(k^(s=2), k^(s=4)) = 0.505
    Spearman(k^(s=3), k^(s=4)) = 0.504
../_images/tutorials_scale_free_generation_16_10.png
../_images/tutorials_scale_free_generation_16_11.png

5) Hub overlap across layers (optional visualization)

Another intuitive diagnostic is how much the top-q% hubs overlap across layers.

We compute, for each pair of layers, the Jaccard similarity between the sets of top hubs (according to size-specific degree).

[42]:
def top_hubs(deg, q=0.05):
    n = len(deg)
    k = max(1, int(round(q * n)))
    return set(np.argsort(deg)[-k:])

def hub_overlap_matrix(deg_by_size: dict[int, np.ndarray], q=0.05):
    sizes = sorted(deg_by_size.keys())
    M = np.zeros((len(sizes), len(sizes)), dtype=float)
    tops = {s: top_hubs(deg_by_size[s], q=q) for s in sizes}
    for i, si in enumerate(sizes):
        for j, sj in enumerate(sizes):
            A, B = tops[si], tops[sj]
            M[i, j] = len(A & B) / len(A | B)
    return sizes, M

# Using the last case generated above (Case D) if variables exist; otherwise recompute quickly.
try:
    deg_last = deg
except NameError:
    H_tmp = scale_free_hypergraph(
        num_nodes=600, edges_by_size={2: 6000, 3: 3500, 4: 2000},
        alpha_by_size=1.1, rho=0.75, seed=0
    )
    deg_last = degree_matrix(H_tmp, sizes=[2,3,4], num_nodes=600)

sizes_h, J = hub_overlap_matrix(deg_last, q=0.05)
plot_corr_matrix(sizes_h, J, title="Top-5% hub overlap (Jaccard) across layers", vmin=0.0, vmax=1.0, cmap="viridis")

../_images/tutorials_scale_free_generation_18_0.png

Notes / troubleshooting

  • If your realized degree correlations look noisy, increase edges_by_size (more hyperedges per layer).

  • Very large alpha can produce extremely dominant hubs (almost a star-like structure).

  • Negative rho is constrained when you have many sizes because the equicorrelation matrix must be a valid covariance/correlation matrix (PSD). With m sizes, the bound is (\rho `:nbsphinx-math:ge -1`/(m-1)).