pathcensus.nullmodels package
Submodules
pathcensus.nullmodels.base module
Exponential Random Graph Models (ERGMs) with local constraints are
ERGMs in which sufficient statistics are defined at the level of
individual nodes (or globally for the entire graph). In other words, their
values for each node can be set independently. Unlike ERGMs with non-local
constraints, which are notoriously problematic
(e.g. due to degenerate convergence and non-projectivity),
they are analytically solvable. Prime examples of ERGMs with local constraints
are configuration models which induce maximum entropy distributions over
graphs with N nodes with arbitrary expected degree sequence and/or
strength sequence constraints.
The pathcensus.nullmodels submodule implements several such
ERGMs which are most appropriate for statistical calibration of structural
coefficients. They can be applied to simple undirected and unweighted/weighted
networks.
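To make "analytically solvable" concrete, the binary configuration model can be written out explicitly. The following is the standard textbook form of that model and not a transcription of the pathcensus source; the symbols \(\theta_i\), \(x_i\), \(a_{ij}\) and \(k_i\) are notation introduced here for illustration.

```latex
% Hamiltonian with one Lagrange multiplier per node (degree constraints)
H(G) = \sum_{i} \theta_i k_i(G) = \sum_{i<j} (\theta_i + \theta_j)\, a_{ij}

% The partition function factorizes over node pairs
Z = \sum_{G} e^{-H(G)} = \prod_{i<j} \left(1 + e^{-(\theta_i + \theta_j)}\right)

% Hence edges are independent Bernoulli variables, with x_i = e^{-\theta_i}
P(G) = \prod_{i<j} p_{ij}^{a_{ij}} (1 - p_{ij})^{1 - a_{ij}},
\qquad
p_{ij} = \frac{x_i x_j}{1 + x_i x_j}

% The parameters are fixed by matching expected and observed degrees
\mathbb{E}[k_i] = \sum_{j \neq i} p_{ij} = k_i^{\mathrm{obs}}
```

Because the distribution factorizes over pairs of nodes, fitting reduces to solving \(N\) coupled equations for the \(x_i\), and sampling reduces to independent draws per pair.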
See also
ERGM – base class for ERGMs
pathcensus.nullmodels.ubcm – Undirected Binary Configuration Model (fixed expected degree sequence)
pathcensus.nullmodels.uecm – Undirected Enhanced Configuration Model (fixed expected degree and strength sequences assuming positive integer weights)
Note
The ERGM functionalities provided by pathcensus are simple
wrappers around the NEMtropy package.
- class pathcensus.nullmodels.base.ERGM(statistics: ndarray | GraphABC, **kwds: Any)[source]
Bases: object
Generic base class for Exponential Random Graph Models with local (i.e. node-level) constraints.
- statistics
2D (float) array with sufficient statistics for nodes. First axis is for nodes and second for different statistics.
- fit_args
Dictionary with arguments used in the last call of
fit(). None if the model has not been fitted yet.
Notes
The following class attributes are required and need to be defined on concrete subclasses.
- names
Mapping from names of sufficient statistics to attribute names in the
NEMtropy solver class storing fitted model parameters. They must be provided in an order consistent with statistics. This is a class attribute which must be defined on subclasses implementing particular models. The mapping must have stable order (starting from Python 3.6 an ordinary dict will do). However, it is usually better to use mapping proxy objects instead of dicts as they are not mutable.
- labels
Mapping from abbreviated labels to full names of sufficient statistics.
- models
Model names as defined in
NEMtropy allowed for the specific type of model. Must be implemented on a subclass as a class attribute. The first model on the list will be used by default.
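The recommendation to define names as a mapping proxy rather than a plain dict can be illustrated with the standard library alone. This is a small sketch: the mapping values mirror those shown for the UECM class in this reference, and the variable names columns and mutable are introduced here only for the demonstration.

```python
from types import MappingProxyType

# Read-only, insertion-ordered mapping in the style of the `names` attribute.
names = MappingProxyType({"degree": "x", "strength": "y"})

# Key order is stable, so it can be matched against columns of `statistics`.
columns = list(names)

# A mapping proxy rejects item assignment, which protects class-level state.
try:
    names["degree"] = "z"
    mutable = True
except TypeError:
    mutable = False
```

The proxy still supports lookups such as names["degree"], so code that only reads the mapping does not need to change.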
- property Ewijfunc: Callable
JIT-compiled function calculating expected edge weights \(\mathbb{E}[w_{ij}]\) (conditional on being present) based on the model.
- property X: ndarray | None
Array with fitted model parameters (1D).
- Raises:
ValueError – If model is not fitted.
- aliases = None
- check_fitted() None[source]
Raise ValueError if model is not fitted.
- default_fit_kwds = None
- property default_model: str
- default_rtol = 0.1
- property directed: bool
Is model directed.
- property error: ndarray
Get maximum overall absolute error of the fit.
- property expected_statistics: ndarray
Model-based expected values of sufficient statistics.
- extract_statistics(graph: GraphABC) ndarray[source]
Extract array of sufficient statistics from a graph-like object.
- fit(model: str | None = None, method: Literal['auto', 'newton', 'fixed-point'] = 'auto', **kwds) float[source]
Fit model parameters to the observed sufficient statistics and return the fitted model instance. The overall maximum absolute error of the fit is then available through the error property.
- Parameters:
model – Type of model to use. Default value defined in self.default_model is used when None.
method – Solver method to use. If "auto" then either Newton or fixed-point method is used depending on the number of nodes with the threshold defined by self.fp_threshold.
**kwds – Passed to NEMtropy solver method solve_tool.
Notes
Some of the **kwds may be prefilled (but can be overridden) with default values defined on the default_fit_kwds class attribute.
- Returns:
Fitted model.
- Return type:
self
- property fp_threshold: int
Threshold on the number of nodes after which by default the fixed-point solver is used instead of the Newton method solver.
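To give a sense of what a fixed-point solver does for a model of this kind, here is a minimal NumPy sketch for the binary configuration model with \(p_{ij} = x_i x_j / (1 + x_i x_j)\). It is not the NEMtropy implementation: the function names, the stopping rule and the Chung-Lu style starting point are assumptions made for illustration.

```python
import numpy as np

def expected_degrees(x):
    """Expected degrees under p_ij = x_i x_j / (1 + x_i x_j), no self-loops."""
    xx = np.outer(x, x)
    P = xx / (1.0 + xx)
    np.fill_diagonal(P, 0.0)
    return P.sum(axis=1)

def fit_ubcm_fixed_point(degrees, tol=1e-10, max_iter=10_000):
    """Iterate x_i <- k_i / sum_{j != i} x_j / (1 + x_i x_j) until x stabilizes."""
    k = np.asarray(degrees, dtype=float)
    x = k / np.sqrt(k.sum())            # illustrative starting point
    for _ in range(max_iter):
        A = x[None, :] / (1.0 + np.outer(x, x))   # A[i, j] = x_j / (1 + x_i x_j)
        np.fill_diagonal(A, 0.0)
        x_new = k / A.sum(axis=1)
        converged = np.max(np.abs(x_new - x)) < tol
        x = x_new
        if converged:
            break
    return x

# Small graphical degree sequence used only as a demonstration input.
k = np.array([3, 2, 2, 2, 1])
x = fit_ubcm_fixed_point(k)
max_abs_error = np.max(np.abs(expected_degrees(x) - k))
```

Each iteration of this dense sketch costs \(O(N^2)\) and needs no Jacobian, whereas a Newton step also has to build and solve an \(N \times N\) linear system; that difference is a common reason to prefer fixed-point iteration for large networks.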
- property fullname: str
Full name of model. May be reimplemented on concrete subclass to allow using shortened class names.
- get_P(*, dense: bool = False) LinearOperator | ndarray[source]
Get matrix of edge probabilities.
- Parameters:
dense – If True then a dense array is returned. Otherwise a scipy.sparse.linalg.LinearOperator is returned.
- get_W(*, dense: bool = False) LinearOperator | ndarray[source]
Get matrix of expected edge weights.
- Parameters:
dense – If True then a dense array is returned. Otherwise a scipy.sparse.linalg.LinearOperator is returned.
- Raises:
NotImplementedError – If called on a model instance which is not weighted.
- get_nemtropy_graph() UndirectedGraph | DirectedGraph[source]
Get NEMtropy graph representation instance appropriate for a given type of model.
- get_param(stat: int | str) ndarray[source]
Get parameter array associated with a given sufficient statistic. None is returned if the model is not yet fitted.
- Parameters:
stat – Index or label of a sufficient statistic.
- get_stat(stat: int | str, expected: bool = False) ndarray[source]
Get sufficient statistic array by index or label.
- Parameters:
stat – Index or label of a sufficient statistic.
expected – Should observed or expected statistic be returned.
- is_fitted() bool[source]
Check if model instance is fitted (this does not check quality of the fit).
- is_valid(rtol: float | None = None) bool[source]
Check if model is approximately correct, i.e. that the relative difference |expected - observed| / |observed| is not greater than rtol.
- Parameters:
rtol – Maximum allowed relative difference. Class attribute default_rtol is used when None.
- property labels: Mapping
Mapping from short labels to full names corresponding to sufficient statistics.
- methods = ('auto', 'newton', 'fixed-point')
- property models: Tuple[str]
- property n_nodes: int
Number of nodes in the underlying graph.
- property n_stats: int
Number of sufficient statistics.
- property names: Mapping
Mapping from names to NEMtropy solver attribute names corresponding to sufficient statistics.
- property parameters: ndarray | None
Array with fitted model parameters shaped as self.statistics.
- Raises:
ValueError – If model is not fitted.
- property pijfunc: Callable
JIT-compiled function calculating \(p_{ij}\)’s based on the model.
- property pmv: Callable
JIT-compiled function calculating \(Pv\) where \(P\) is the edge probability matrix and \(v\) is an arbitrary vector.
- relerr() ndarray[source]
Get error of the fitted expected statistics relative to the observed sufficient statistics as
|expected - observed| / |observed|.
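The relative error formula and the tolerance check built on it can be sketched in a few lines of NumPy. The helper names relative_error and is_within_tolerance are hypothetical and only mirror the documented behaviour; the default rtol=0.1 matches the default_rtol value listed in this reference.

```python
import numpy as np

def relative_error(expected, observed):
    """Elementwise |expected - observed| / |observed|."""
    expected = np.asarray(expected, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return np.abs(expected - observed) / np.abs(observed)

def is_within_tolerance(expected, observed, rtol=0.1):
    """True when no relative error exceeds rtol."""
    return bool(np.all(relative_error(expected, observed) <= rtol))

# One sufficient statistic (a single column) for three nodes.
observed = np.array([[3.0], [2.0], [1.0]])
expected = np.array([[3.03], [1.9], [1.0]])

err = relative_error(expected, observed)
ok = is_within_tolerance(expected, observed)
```

Note that this plain formula divides by zero for nodes whose observed statistic is 0; how the library treats such nodes is not stated here, so the sketch leaves that case unhandled.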
- property rpmv: Callable
JIT-compiled function calculating \(vP\) where \(P\) is the edge probability matrix and \(v\) is an arbitrary vector.
- property rwmv: Callable
JIT-compiled function calculating \(vW\) where \(W\) is the matrix of expected edge weights and \(v\) is an arbitrary vector.
- sample(n: int) Iterable[spmatrix][source]
Generate n instances sampled from the model.
- Yields:
A – Graph instance represented as a sparse matrix (CSR format)
- sample_one() spmatrix[source]
Sample a graph instance as sparse matrix from the model.
- Returns:
Graph instance represented as a sparse matrix (CSR format).
- Return type:
A
- property solver: UndirectedGraph | DirectedGraph
NEMtropy graph solver instance.
- validate(rtol: float | None = None) None[source]
Raise ValueError if the relative difference |expected - observed| / |observed| is greater than rtol.
- Parameters:
rtol – Maximum allowed relative difference. Class attribute default_rtol is used when None.
- Returns:
The same model instance if the error is not raised.
- Return type:
self
- validate_statistics_shape(statistics: ndarray) None[source]
Raise ValueError if statistics has an incorrect shape which is not consistent with the class attribute cls.names.
- validate_statistics_values(statistics: ndarray) None[source]
Raise if statistics contain incorrect values.
It must be implemented on a subclass.
Notes
Validation of the shape of statistics is implemented independently in validate_statistics_shape(), which is a generic method that in most cases does not need to be implemented on subclasses.
- property weighted: bool
Is model weighted.
- property wijfunc: Callable
JIT-compiled function sampling edge weights \(w_{ij}\) based on the model.
- property wmv: Callable
JIT-compiled function calculating \(Wv\) where \(W\) is the matrix of expected edge weights and \(v\) is an arbitrary vector.
- class pathcensus.nullmodels.base.SoftConfigurationModel(statistics: ndarray | GraphABC, **kwds: Any)[source]
Bases: ERGM
Base class for soft configuration models.
- class pathcensus.nullmodels.base.UndirectedSoftConfigurationModel(statistics: ndarray | GraphABC, **kwds: Any)[source]
Bases: SoftConfigurationModel
Base class for undirected soft configuration models.
- aliases = mappingproxy({'degree': 'd', 'strength': 's'})
- property directed: bool
Is model directed.
- pathcensus.nullmodels.base.get_pmv(X: ndarray, v: ndarray, pijfunc: Callable[[ndarray, int, int], float]) ndarray[source]
Calculate \(Pv\) where \(P\) is edge probability matrix and \(v\) an arbitrary vector.
- Parameters:
X – 1D array of model parameters.
v – Arbitrary vector.
pijfunc – JIT-compiled function (in no-python mode) calculating edge probabilities \(p_{ij}\). It should have the following signature: (X, i, j) -> float, where X is a 1D array of model parameters. The return value must be a float in [0, 1].
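The idea behind computing \(Pv\) from a pijfunc, without ever storing the \(N \times N\) matrix \(P\), can be sketched in plain Python. This is not the library's JIT-compiled routine: pmv_sketch and ubcm_pij are hypothetical names, and the sketch assumes an undirected model without self-loops whose parameter array X holds one value per node.

```python
import numpy as np

def ubcm_pij(X, i, j):
    """Edge probability of the form p_ij = x_i x_j / (1 + x_i x_j)."""
    xx = X[i] * X[j]
    return xx / (1.0 + xx)

def pmv_sketch(X, v, pijfunc):
    """Accumulate u = P v pair by pair, using O(N) memory."""
    n = len(v)
    u = np.zeros(n, dtype=float)
    for i in range(n):
        for j in range(i + 1, n):
            p = pijfunc(X, i, j)
            u[i] += p * v[j]   # P is symmetric for undirected models,
            u[j] += p * v[i]   # so each pair contributes to two entries
    return u

X = np.array([2.0, 1.0, 0.5])
v = np.ones(3)
u = pmv_sketch(X, v, ubcm_pij)   # with v = 1 this gives expected degrees
```

A function of this shape is what allows edge probabilities to be exposed as a linear operator instead of a dense array.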
- pathcensus.nullmodels.base.get_wmv(X: ndarray, v: ndarray, pijfunc: Callable[[ndarray, int, int], float], Ewijfunc: Callable[[ndarray, int, int], float]) ndarray[source]
Calculate \(Wv\) where \(W\) is expected edge weight matrix and \(v\) is an arbitrary vector.
- Parameters:
X – 1D array of model parameters.
v – Arbitrary vector.
pijfunc – JIT-compiled function (in no-python mode) calculating edge probabilities \(p_{ij}\). It should have the following signature: (X, i, j) -> float, where X is a 1D array of model parameters. The return value must be a float in [0, 1].
Ewijfunc – JIT-compiled function (in no-python mode) calculating expected edge weights \(\mathbb{E}[w_{ij}]\). It should have the following signature: (X, i, j) -> float, where X is a 1D array of model parameters. The return value must be a positive float.
- pathcensus.nullmodels.base.sample_edgelist_unweighted(X: ndarray, n_nodes: int, pijfunc: Callable[[ndarray, int, int], float]) ndarray[source]
Sample edgelist array from an ERGM.
- Parameters:
X – 1D array of model parameters.
n_nodes – Number of nodes in the underlying graph.
pijfunc – JIT-compiled function (in no-python mode) calculating edge probabilities \(p_{ij}\). It should have the following signature: (X, i, j) -> float, where X is a 1D array of model parameters. The return value must be a float in [0, 1].
- Returns:
Edgelist array.
- Return type:
E
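Since edges are independent in ERGMs with local constraints, sampling an edge list amounts to one Bernoulli draw per node pair. The sketch below shows that idea with NumPy. It is an illustration, not the library routine: the function names and the explicit rng argument are assumptions, and each undirected edge is recorded once as (i, j) with i < j, which may differ from the layout the library returns.

```python
import numpy as np

def ubcm_pij(X, i, j):
    """Edge probability of the form p_ij = x_i x_j / (1 + x_i x_j)."""
    xx = X[i] * X[j]
    return xx / (1.0 + xx)

def sample_edgelist_sketch(X, n_nodes, pijfunc, rng):
    """Draw each pair (i, j), i < j, independently with probability p_ij."""
    edges = []
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if rng.random() < pijfunc(X, i, j):
                edges.append((i, j))
    # Reshape so that an empty sample still has two columns.
    return np.array(edges, dtype=int).reshape(-1, 2)

rng = np.random.default_rng(0)
X = np.array([2.0, 1.0, 0.5, 1.5])
n_nodes = 4
E = sample_edgelist_sketch(X, n_nodes, ubcm_pij, rng)
```

An edge list of this form can be converted to a sparse adjacency matrix by using the two columns as row and column indices and then symmetrizing.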
- pathcensus.nullmodels.base.sample_edgelist_weighted(X: ndarray, n_nodes: int, pijfunc: Callable[[ndarray, int, int], float], wijfunc: Callable[[ndarray, int, int], int | float]) Tuple[ndarray, ndarray | None][source]
Sample edgelist array from an ERGM.
- Parameters:
X – 1D array of model parameters.
n_nodes – Number of nodes in the underlying graph.
weighted – Is the model weighted
pijfunc – JIT-compiled function (in no-python mode) calculating edge probabilities \(p_{ij}\). It should have the following signature: (X, i, j) -> float, where X is a 1D array of model parameters. The return value must be a float in [0, 1].
wijfunc – JIT-compiled function (in no-python mode) sampling edge weights \(w_{ij}\). It should have the following signature: (X, i, j) -> float/int, where X is a 1D array of model parameters. The return value must be a positive int/float.
- Returns:
E – Edgelist array.
W – 1D array with edge weights.
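Weighted sampling can be sketched as a two-step draw per node pair: first decide whether the edge exists using pijfunc, then draw its weight using wijfunc. The example uses the standard enhanced configuration model formulas, in which weights of existing edges are geometric. Several things here are assumptions made for illustration and not taken from the library: all function names, the explicit rng handling, and the layout of X as the \(x\) parameters followed by the \(y\) parameters.

```python
import numpy as np

def uecm_pij(X, i, j):
    """p_ij = x_i x_j y_i y_j / (1 - y_i y_j + x_i x_j y_i y_j)."""
    n = len(X) // 2                  # assumed layout: X = [x_1..x_n, y_1..y_n]
    xx = X[i] * X[j]
    yy = X[n + i] * X[n + j]
    return xx * yy / (1.0 - yy + xx * yy)

def make_uecm_wij(rng):
    """Build a weight sampler with the (X, i, j) signature."""
    def wij(X, i, j):
        n = len(X) // 2
        yy = X[n + i] * X[n + j]
        # Geometric on {1, 2, ...}: P(w) = (y_i y_j)^(w - 1) * (1 - y_i y_j)
        return rng.geometric(1.0 - yy)
    return wij

def sample_weighted_sketch(X, n_nodes, pijfunc, wijfunc, rng):
    """Return (E, W): edge list with i < j and one weight per edge."""
    edges, weights = [], []
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if rng.random() < pijfunc(X, i, j):
                edges.append((i, j))
                weights.append(wijfunc(X, i, j))
    E = np.array(edges, dtype=int).reshape(-1, 2)
    W = np.array(weights, dtype=int)
    return E, W

rng = np.random.default_rng(1)
x = np.array([1.0, 2.0, 0.5, 1.5])
y = np.array([0.5, 0.6, 0.4, 0.7])
X = np.concatenate([x, y])
E, W = sample_weighted_sketch(X, 4, uecm_pij, make_uecm_wij(rng), rng)
```

The sketch requires \(0 < y_i y_j < 1\) for every pair, otherwise the geometric distribution is not defined.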
pathcensus.nullmodels.ubcm module
Undirected Binary Configuration Model (UBCM) induces a maximum entropy probability distribution over networks of a given size such that it has a specific expected degree sequence. It can be used to model undirected unweighted networks. See [VBM+21] for details.
See also
UBCM – UBCM class
Examples
>>> # Make simple ER random graph using `igraph`
>>> import random
>>> import numpy as np
>>> import igraph as ig
>>> from pathcensus.nullmodels.ubcm import UBCM
>>> random.seed(101)
>>> G = ig.Graph.Erdos_Renyi(20, p=.2)
>>> # Initialize UBCM directly from the graph object
>>> ubcm = UBCM(G)
>>> # Alternatively, initialize from degree sequence array
>>> D = np.array(G.degree())
>>> ubcm = UBCM(D).fit()
>>> # Check fit error
>>> round(ubcm.error, 6)
0.0
>>> # Check that every fitted expected degree is within 1e-6
>>> # of the corresponding observed degree
>>> (np.abs(ubcm.ED - ubcm.D) <= 1e-6).all()
True
>>> # Sample a single ensemble instance
>>> ubcm.sample_one()
<...
with ... stored elements ...>
>>> # Sample multiple instances (generator)
>>> for instance in ubcm.sample(10): pass
- class pathcensus.nullmodels.ubcm.UBCM(statistics: ndarray | GraphABC, **kwds: Any)[source]
Bases: UndirectedSoftConfigurationModel
Undirected Binary Configuration Model.
This is a soft configuration model for undirected unweighted networks which belongs to the family of Exponential Random Graph Models (ERGMs) with local constraints. It induces a maximum entropy probability distribution over a set of networks with \(N\) nodes such that it yields a specific degree sequence on average.
- statistics
2D (float) array with sufficient statistics for nodes. In this case there is only one sufficient statistic, that is, the degree sequence.
- fit_args
Dictionary with arguments used in the last call of
fit(). None if the model has not been fitted yet.
Notes
The following important class attributes are also defined:
- labels
Mapping from abbreviated labels to full names identifying sufficient statistics.
- models
Model names as defined in
NEMtropy allowed for the specific type of model.
- property D: ndarray
Observed degree sequence.
- property ED: ndarray
Expected degree sequence.
- default_fit_kwds = mappingproxy({'initial_guess': 'chung_lu'})
- property expected_statistics: ndarray
Expected sufficient statistics.
- extract_statistics(graph: GraphABC) ndarray[source]
Extract sufficient statistics from a graph-like object.
- property fullname: str
Full name of model. May be reimplemented on concrete subclass to allow using shortened class names.
- get_nemtropy_graph() UndirectedGraph[source]
Get NEMtropy graph representation instance.
- models = ('cm_exp', 'cm')
- names = mappingproxy({'degree': 'x'})
- property pijfunc: Callable
JIT-compiled routine for calculating \(p_{ij}\).
- property weighted: bool
Is model weighted.
pathcensus.nullmodels.uecm module
Undirected Enhanced Configuration Model (UECM) induces a maximum entropy probability distribution over networks of a given size such that it has specific expected degree and strength sequences. It can be used to model undirected weighted networks with edge weights being positive integers (with no upper bound). See [VBM+21] for details.
See also
UECM – UECM class
Examples
>>> import random
>>> import numpy as np
>>> import igraph as ig
>>> from pathcensus.nullmodels.uecm import UECM
>>> # Make a ER random graph with random integer weights
>>> random.seed(27732)
>>> G = ig.Graph.Erdos_Renyi(20, p=.2)
>>> G.es["weight"] = np.random.randint(1, 11, G.ecount())
>>> # Initialize UECM from the graph object
>>> uecm = UECM(G)
>>> # Alternatively initialize from an array of sufficient statistics
>>> # 1st column - degree sequence; 2nd column - strength sequence
>>> D = np.array(G.degree())
>>> S = np.array(G.strength(weights="weight"))
>>> stats = np.column_stack([D, S])
>>> uecm = UECM(stats).fit()
>>> # Check fit error
>>> round(uecm.error, 6)
0.0
>>> # Check that every fitted expected degree is within 1e-6
>>> # of the corresponding observed degree
>>> (np.abs(uecm.ED - uecm.D) <= 1e-6).all()
True
>>> # Check that every fitted expected strength is within 1e-6
>>> # of the corresponding observed strength
>>> (np.abs(uecm.ES - uecm.S) <= 1e-6).all()
True
>>> # Sample a single instance
>>> uecm.sample_one()
<...
with ... stored elements ...>
>>> # Sample multiple instances (generator)
>>> for instance in uecm.sample(10): pass
- class pathcensus.nullmodels.uecm.UECM(statistics: ndarray | GraphABC, **kwds: Any)[source]
Bases: UndirectedSoftConfigurationModel
Undirected Enhanced Configuration Model.
This is a soft configuration model for undirected weighted networks with unbounded positive integer weights which belongs to the family of Exponential Random Graph Models (ERGMs) with local constraints. It induces a maximum entropy probability distribution over a set of networks with \(N\) nodes such that it yields a specific degree sequence and a specific strength sequence on average.
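For reference, the pairwise distribution implied by joint degree and strength constraints can be written in closed form. This is the standard form of the enhanced configuration model and not a transcription of the pathcensus source; the symbols \(x_i\), \(y_i\), \(q_{ij}\) and \(\Theta\) (the step function, equal to 1 for \(w > 0\) and 0 otherwise) are notation introduced here.

```latex
% Distribution of the weight of pair (i, j), for w = 0, 1, 2, ...
q_{ij}(w) = \frac{(x_i x_j)^{\Theta(w)} (y_i y_j)^{w} (1 - y_i y_j)}
                 {1 - y_i y_j + x_i x_j y_i y_j}

% Probability that the edge exists (w > 0)
p_{ij} = \frac{x_i x_j y_i y_j}{1 - y_i y_j + x_i x_j y_i y_j}

% Weight of an existing edge is geometric on {1, 2, ...}
P(w_{ij} = w \mid w_{ij} > 0) = (y_i y_j)^{w - 1} (1 - y_i y_j),
\qquad
\mathbb{E}[w_{ij} \mid w_{ij} > 0] = \frac{1}{1 - y_i y_j}

% Constraints fixing the parameters
\mathbb{E}[k_i] = \sum_{j \neq i} p_{ij} = k_i^{\mathrm{obs}},
\qquad
\mathbb{E}[s_i] = \sum_{j \neq i} \frac{p_{ij}}{1 - y_i y_j} = s_i^{\mathrm{obs}}
```

The requirement \(0 < y_i y_j < 1\) for all pairs keeps the geometric series convergent, which is why weights have no upper bound in this model.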
- statistics
2D (float) array with sufficient statistics for nodes. In this case there are two sufficient statistics, that is, the degree sequence and the strength sequence.
- fit_args
Dictionary with arguments used in the last call of
fit(). None if the model has not been fitted yet.
Notes
The following important class attributes are also defined:
- labels
Mapping from abbreviated labels to full names identifying sufficient statistics.
- models
Model names as defined in
NEMtropy allowed for the specific type of model.
- property D: ndarray
Observed degree sequence.
- property ED: ndarray
Expected degree sequence.
- property ES: ndarray
Expected strength sequence.
- property Ewijfunc: Callable
JIT-compiled routine for calculating \(\mathbb{E}[w_{ij}]\) (conditional on the edge being present).
- property S: ndarray
Observed strength sequence.
- default_fit_kwds = mappingproxy({'initial_guess': 'strengths_minor'})
- property expected_statistics: ndarray
Expected sufficient statistics.
- extract_statistics(graph: GraphABC) ndarray[source]
Extract sufficient statistics from a graph-like object.
- get_nemtropy_graph() UndirectedGraph[source]
Get NEMtropy graph representation instance.
- models = ('ecm_exp', 'ecm')
- names = mappingproxy({'degree': 'x', 'strength': 'y'})
- property pijfunc: Callable
JIT-compiled routine for calculating \(p_{ij}\).
- property weighted: bool
Is model weighted.
- property wijfunc: Callable
JIT-compiled routine sampling \(w_{ij}\).
- pathcensus.nullmodels.uecm.uecm_Ewij(X: ndarray, i: int, j: int) float[source]
Calculate expected edge weight \(\mathbb{E}[w_{ij}]\) (conditional on the edge being present) in UECM model.
- Parameters:
X – 1D array of model parameters.
i – Index of the first node.
j – Index of the second node.
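The conditional expectation can be checked numerically against its closed form. In the standard enhanced configuration model the weight of an existing edge is geometric, so \(\mathbb{E}[w_{ij} \mid w_{ij} > 0] = 1 / (1 - y_i y_j)\). The sketch below assumes, purely for illustration, that X stores the \(x\) parameters followed by the \(y\) parameters; the function name and this layout are not taken from the library.

```python
import numpy as np

def uecm_Ewij_sketch(X, i, j):
    """Expected weight of an existing edge: 1 / (1 - y_i y_j)."""
    n = len(X) // 2                  # assumed layout: X = [x_1..x_n, y_1..y_n]
    yy = X[n + i] * X[n + j]
    return 1.0 / (1.0 - yy)

x = np.array([1.0, 2.0, 0.5])
y = np.array([0.5, 0.8, 0.25])
X = np.concatenate([x, y])

ew_01 = uecm_Ewij_sketch(X, 0, 1)    # y_0 * y_1 = 0.4

# Cross-check with the truncated series sum_w w * yy^(w - 1) * (1 - yy).
yy = y[0] * y[1]
w = np.arange(1, 2000)
series = np.sum(w * yy ** (w - 1) * (1.0 - yy))
```

The unconditional expected weight follows by multiplying this value by the edge probability \(p_{ij}\).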
Module contents
Null model classes implementing different variants of the configuration model.
The classes implemented in this module are simple wrappers
around the NEMtropy package.