🧑🤝🧑 Hy-MMSBM¶
Submodules¶
hypergraphx.communities.hy_mmsbm.model module¶
- class hypergraphx.communities.hy_mmsbm.model.HyMMSBM(K=None, u=None, w=None, assortative=None, kappa_fn='binom+avg', max_hye_size=None, u_prior=0.0, w_prior=1.0, seed=None)[source]¶
Bases: object
Implementation of the Hy-MMSBM probabilistic model from
“Community Detection in Large Hypergraphs”, Ruggeri N., Contisciani M., Battiston F., De Bacco C., and
“A framework to generate hypergraphs with community structure”, Ruggeri N., Battiston F., De Bacco C.
The model assumes that hyperedges form according to a Poisson distribution. The Poisson distribution of each hyperedge is determined by an affinity matrix w, common to all hyperedges, and by the soft community assignments u of the nodes.
- C(d='all', return_summands=False)[source]¶
Constant used in the calculation of the likelihood. It is given by
\[\sum_{d=2}^D \binom{N-2}{d-2} / \kappa_d\]
- Parameters:
d (single value or array of values for the hyperedge dimension.)
return_summands (since C is a sum of terms, specify whether to return only) – the final sum or the summands separately.
- Return type:
The C value, or its summands, according to return_summands.
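As a rough illustration of the formula above, the sum can be evaluated directly with the standard library. This is a sketch, not the library's implementation: the helper `c_constant` and its `kappa` callable are hypothetical names, and the normalization used in the demo is only an assumption chosen for illustration.

```python
from math import comb


def c_constant(N, D, kappa, return_summands=False):
    """Evaluate sum_{d=2}^{D} binom(N-2, d-2) / kappa(d).

    `kappa` is any callable returning the normalization constant for
    hyperedge size d (hypothetical interface, not the library's).
    """
    summands = [comb(N - 2, d - 2) / kappa(d) for d in range(2, D + 1)]
    return summands if return_summands else sum(summands)


# Assumed normalization, for illustration only:
# kappa_d = binom(N-2, d-2) * d * (d-1) / 2.
N, D = 10, 4


def kappa(d):
    return comb(N - 2, d - 2) * d * (d - 1) / 2


print(c_constant(N, D, kappa))
print(c_constant(N, D, kappa, return_summands=True))
```

Under this assumed normalization every binomial cancels, so each summand reduces to 2 / (d (d - 1)).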
- property N¶
Total number of nodes in the hypergraph.
- degree_sequence(include_dyadic=False, expected=False)[source]¶
Approximately sample the degree sequence from the model using the Central Limit Theorem. The degree sequence depends on the interactions to take into account. If include_dyadic, then also binary interactions (i.e. edges) are taken into account, otherwise only interactions of size three or higher are.
- Parameters:
include_dyadic (include the order-two hyperedges (i.e. edges) or not) – in the calculation of the degree sequence.
expected (return the analytical expected value of the degree sequence, or sample) – the sequence.
- Return type:
The N-dimensional array with the sampled degree sequence.
- dimension_sequence(include_dyadic=False, expected=False)[source]¶
Approximately sample the dimension sequence from the model using the Central Limit Theorem. The dimension sequence is a dictionary with {key: value} pairs {dimension: number of hyperedges with that dimension}. If include_dyadic, also binary interactions (i.e. edges) are sampled.
- Parameters:
include_dyadic (whether to sample the number of order-two interactions.)
expected (return the analytical expected value of the dimension sequence, or) – sample the sequence.
- Return type:
The dictionary representing the sampled dimension sequence.
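To make the dictionary format concrete, the sketch below builds a dimension sequence from an observed list of hyperedges. It does not sample from the model. The function name `observed_dimension_sequence` and the toy hyperedge list are hypothetical and only mirror the {dimension: count} layout and the role of include_dyadic described above.

```python
from collections import Counter


def observed_dimension_sequence(hyperedges, include_dyadic=False):
    """Count hyperedges by size and return {dimension: count}."""
    # Without dyadic interactions, only sizes three and above are counted.
    min_size = 2 if include_dyadic else 3
    counts = Counter(len(e) for e in hyperedges if len(e) >= min_size)
    return dict(sorted(counts.items()))


hyperedges = [(0, 1), (1, 2), (0, 1, 2), (2, 3, 4), (0, 1, 3, 4)]
print(observed_dimension_sequence(hyperedges))
# {3: 2, 4: 1}
print(observed_dimension_sequence(hyperedges, include_dyadic=True))
# {2: 2, 3: 2, 4: 1}
```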
- expected_degree(per_node=False, d='all')[source]¶
Compute the expected degree according to the probabilistic model. If per_node=True, the expected degree is computed for the single nodes, else for the full hypergraph. Notice that the expected degree depends on the interaction sizes taken into account. These can be specified manually by giving an array of integers d (or a single integer).
- Parameters:
per_node (whether the expected degree needs to be computed for the single nodes,) – or averaged.
d (interactions to take into account to compute the expected degree.) – This can be an array of integers, a single integer, or the string "all". Using d="all" is equivalent to d=numpy.arange(2, self.max_hye_size+1).
- Returns:
A float with the average degree if per_node=False, else the array with the
expected degrees of the single nodes.
- fit(hypergraph, n_iter=500, tolerance=None, check_convergence_every=10)[source]¶
Perform Expectation-Maximization inference on a hypergraph, as presented in
“Community Detection in Large Hypergraphs”, Ruggeri N., Contisciani M., Battiston F., De Bacco C.
The inference can be performed both on the affinity matrix w and the assignments u. If either or both have been provided as input at initialization of the model, they are regarded as ground-truth and are not inferred.
- Parameters:
hypergraph (the hypergraph to perform inference on.)
n_iter (maximum number of EM iterations.)
tolerance (tolerance for the stopping criterion.)
check_convergence_every (number of steps in between every convergence check.)
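One way to picture how n_iter, tolerance and check_convergence_every interact is a generic iterative loop with periodic convergence checks. The sketch below is not the library's EM routine: the function `run_until_converged`, the `step` callable and the relative-change criterion are all assumptions made for illustration.

```python
def run_until_converged(step, x0, n_iter=500, tolerance=None,
                        check_convergence_every=10):
    """Apply `step` up to n_iter times, testing convergence only periodically."""
    x, x_checked = x0, x0
    for it in range(1, n_iter + 1):
        x = step(x)
        if tolerance is not None and it % check_convergence_every == 0:
            # Assumed criterion: relative change since the previous check.
            if abs(x - x_checked) <= tolerance * max(abs(x_checked), 1e-12):
                return x, it
            x_checked = x
    return x, n_iter


# Toy fixed-point iteration x <- (x + 2/x) / 2, which converges to sqrt(2).
value, iterations = run_until_converged(
    lambda x: 0.5 * (x + 2.0 / x), 1.0, tolerance=1e-9
)
print(value, iterations)
```

Checking only every few steps keeps the cost of the convergence test low when the test itself is expensive, at the price of possibly running a few more iterations than strictly needed.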
- log_kappa(d)[source]¶
Compute the normalization constant kappa(d) in log-space.
- Parameters:
d (float or array of values d to compute the function kappa over.)
- Return type:
The function value of log(kappa(d)).
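Working in log-space matters because binomial coefficients overflow floating point quickly as N and d grow. The sketch below illustrates the idea with the standard library. It assumes, for illustration only, that kappa(d) = binom(N-2, d-2) * d * (d-1) / 2. The helper names `log_binom` and `log_kappa_sketch` are hypothetical and do not reproduce the library's code.

```python
from math import comb, lgamma, log


def log_binom(n, k):
    """log of binom(n, k), computed through log-gamma to avoid overflow."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_kappa_sketch(d, N):
    """log(kappa_d) under the assumed form binom(N-2, d-2) * d * (d-1) / 2."""
    return log_binom(N - 2, d - 2) + log(d) + log(d - 1) - log(2)


# Small case: agrees with the direct computation.
direct = log(comb(48, 3) * 5 * 4 / 2)
print(abs(log_kappa_sketch(5, 50) - direct) < 1e-9)

# Large case: the binomial itself is too large for a float,
# but its logarithm is an ordinary number.
print(log_kappa_sketch(100, 100_000))
```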
- log_likelihood(hypergraph)[source]¶
Compute the log-likelihood of the model on a given hypergraph.
- Parameters:
hypergraph (the hypergraph to compute the log-likelihood of.)
- Return type:
The log-likelihood value.
- poisson_params(binary_incidence, return_edge_sum=False)[source]¶
Compute the Poisson parameters for all the hyperedges. Given the binary incidence matrix, of shape (N, E), return the parameters for the edges in a tensor of shape (E,). Notice that the parameters returned are not the final Poisson means for the hyperedges, as they are not normalized by the kappa constants. Formally, this function returns the values defined as lambda_e in the paper. For every hyperedge e, they are defined as
\[\sum_{i < j \in e} u_i^T w u_j\]
- Parameters:
binary_incidence (the binary incidence matrix)
return_edge_sum (whether to return the edge sums.) –
These quantities are computed as an intermediate value. For a single hyperedge e they are defined as
\[\sum_{i \in e} u_i\]
and are collected in a vector of length E.
- Returns:
The Poisson parameters in a vector of length E. Optionally, the edge sums in
another vector of length E.
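The pairwise sum above can be obtained from the edge sums without looping over node pairs. For a symmetric w, with s_e the sum of u_i over i in e, the quantity s_e^T w s_e counts every pair i < j twice plus the terms with i = j, so lambda_e is half of s_e^T w s_e minus the sum of u_i^T w u_i over i in e. The NumPy sketch below checks this identity against a brute-force computation. It is an illustration under the stated symmetry assumption, and `poisson_params_sketch` is a hypothetical name, not the library's function.

```python
from itertools import combinations

import numpy as np


def poisson_params_sketch(binary_incidence, u, w):
    """lambda_e = sum_{i<j in e} u_i^T w u_j, assuming w is symmetric.

    binary_incidence: (N, E) 0/1 array, u: (N, K), w: (K, K).
    """
    # One K-dimensional edge sum per hyperedge in this sketch: shape (E, K).
    edge_sum = binary_incidence.T @ u
    # s_e^T w s_e sums over all ordered pairs (i, j) in e, including i == j.
    all_pairs = np.einsum("ek,kq,eq->e", edge_sum, w, edge_sum)
    # Remove the i == j terms, then halve to keep only i < j.
    self_terms = binary_incidence.T @ np.einsum("ik,kq,iq->i", u, w, u)
    return 0.5 * (all_pairs - self_terms), edge_sum


rng = np.random.default_rng(0)
N, K = 5, 2
u = rng.random((N, K))
w = rng.random((K, K))
w = (w + w.T) / 2  # symmetrize the affinity matrix

hyperedges = [(0, 1, 2), (1, 3), (0, 2, 3, 4)]
B = np.zeros((N, len(hyperedges)))
for e, nodes in enumerate(hyperedges):
    B[list(nodes), e] = 1

lam, edge_sum = poisson_params_sketch(B, u, w)
brute = np.array([
    sum(u[i] @ w @ u[j] for i, j in combinations(nodes, 2))
    for nodes in hyperedges
])
print(np.allclose(lam, brute))
```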