pathcensus.nullmodels package
Submodules
pathcensus.nullmodels.base module
Exponential Random Graph Models (ERGMs) with local constraints are
ERGMs in which sufficient statistics are defined at the level of
individual nodes (or globally for the entire graph). In other words, their
values for each node can be set independently. Unlike ERGMs with non-local
constraints, which are notoriously problematic
(e.g. due to degenerate convergence and non-projectivity),
they are analytically solvable. Prime examples of ERGMs with local constraints
are configuration models, which induce maximum entropy distributions over
graphs with N nodes with arbitrary expected degree sequence and/or
strength sequence constraints.
The pathcensus.nullmodels
submodule implements several such
ERGMs which are most appropriate for statistical calibration of structural
coefficients. They can be applied to simple undirected and unweighted/weighted
networks.
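As a rough illustration of what "analytically solvable" means for a configuration model, the sketch below assumes the standard UBCM parametrization from the configuration model literature, in which every node has one positive parameter x_i and p_ij = x_i x_j / (1 + x_i x_j). The function name and the parameter values are hypothetical and are not part of pathcensus.

```python
import numpy as np


def ubcm_edge_probabilities(x: np.ndarray) -> np.ndarray:
    """Dense matrix of p_ij = x_i * x_j / (1 + x_i * x_j) with a zero diagonal."""
    xx = np.outer(x, x)
    P = xx / (1.0 + xx)
    np.fill_diagonal(P, 0.0)  # simple graphs have no self-loops
    return P


# Hypothetical node parameters; in practice they would come from a fitted model.
x = np.array([0.5, 1.0, 2.0, 4.0])
P = ubcm_edge_probabilities(x)

# Expected degrees are row sums of the edge probability matrix.
expected_degrees = P.sum(axis=1)
```

Given the parameters, every expected degree follows in closed form, so fitting reduces to finding the x that reproduces the observed degree sequence.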
See also
ERGM
base class for ERGMs
pathcensus.nullmodels.ubcm
Undirected Binary Configuration Model (fixed expected degree sequence)
pathcensus.nullmodels.uecm
Undirected Enhanced Configuration Model (fixed expected degree and strength sequences assuming positive integer weights)
Note
The ERGM functionalities provided by pathcensus
are simple
wrappers around the NEMtropy
package.
- class pathcensus.nullmodels.base.ERGM(statistics: Union[ndarray, GraphABC], **kwds: Any)[source]
Bases:
object
Generic base class for Exponential Random Graph Models with local (i.e. node-level) constraints.
- statistics
2D (float) array with sufficient statistics for nodes. First axis is for nodes and second for different statistics.
- fit_args
Dictionary with arguments used in the last call of fit(). None if the model has not been fitted yet.
Notes
The following class attributes are required and need to be defined on concrete subclasses.
- names
Mapping from names of sufficient statistics to attribute names in the NEMtropy solver class storing fitted model parameters. They must be provided in an order consistent with statistics. This is a class attribute which must be defined on subclasses implementing particular models. The mapping must have stable order (starting from Python 3.6 an ordinary dict will do). However, it is usually better to use mapping proxy objects instead of dicts as they are not mutable.
- labels
Mapping from abbreviated labels to full names of sufficient statistics.
- models
Model names as defined in
NEMtropy
allowed for the specific type of model. Must be implemented on a subclass as a class attribute. The first model on the list is used by default.
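The following sketch shows what such class attributes could look like when defined with read-only mapping proxies. The class ExampleModel is a hypothetical stand-in and does not inherit from ERGM; the attribute values mirror those listed for UECM further down this page.

```python
from types import MappingProxyType


class ExampleModel:
    """Hypothetical container showing the shape of the required attributes."""

    # Insertion order is preserved, so it can match the column order
    # of the sufficient statistics array.
    names = MappingProxyType({"degree": "x", "strength": "y"})
    # The first entry is the one that would be used by default.
    models = ("ecm_exp", "ecm")


order = list(ExampleModel.names)

# A mapping proxy rejects item assignment, which protects shared class state.
try:
    ExampleModel.names["degree"] = "z"
    mutated = True
except TypeError:
    mutated = False
```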
- property Ewijfunc: Callable
JIT-compiled function calculating expected edge weights \(\mathbb{E}[w_{ij}]\) (conditional on being present) based on the model.
- property X: Optional[ndarray]
Array with fitted model parameters (1D).
- Raises
ValueError – If model is not fitted.
- aliases = None
- check_fitted() None [source]
Raise ValueError if model is not fitted.
- default_fit_kwds = None
- property default_model: str
- default_rtol = 0.1
- property directed: bool
Is model directed.
- property error: ndarray
Get maximum overall absolute error of the fit.
- property expected_statistics: ndarray
Model-based expected values of sufficient statistics.
- extract_statistics(graph: GraphABC) ndarray [source]
Extract array of sufficient statistics from a graph-like object.
- fit(model: Optional[str] = None, method: Literal['auto', 'newton', 'fixed-point'] = 'auto', **kwds) float [source]
Fit model parameters to the observed sufficient statistics and returns the overall maximum absolute error.
- Parameters
model – Type of model to use. Default value defined in self.default_model is used when None.
method – Solver method to use. If "auto" then either Newton or fixed-point method is used depending on the number of nodes with the threshold defined by self.fp_threshold.
**kwds – Passed to NEMtropy solver method solve_tool.
Notes
Some of the **kwds may be prefilled (but can be overridden) with default values defined on the default_fit_kwds class attribute.
- Returns
Fitted model.
- Return type
self
- property fp_threshold: int
Threshold on the number of nodes after which by default the fixed-point solver is used instead of the Newton method solver.
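A minimal sketch of how the "auto" method could be resolved against such a threshold. The function name, the strict inequality and the threshold value used below are assumptions made for illustration only; the actual rule is whatever fit() implements.

```python
def choose_solver(n_nodes: int, fp_threshold: int, method: str = "auto") -> str:
    """Resolve a solver name from the requested method and the graph size."""
    allowed = ("auto", "newton", "fixed-point")
    if method not in allowed:
        raise ValueError(f"'method' must be one of {allowed}")
    if method != "auto":
        # An explicit choice is always respected.
        return method
    # Assumption: strictly more nodes than the threshold switches the solver.
    return "fixed-point" if n_nodes > fp_threshold else "newton"


small = choose_solver(100, fp_threshold=5000)
large = choose_solver(10_000, fp_threshold=5000)
forced = choose_solver(10_000, fp_threshold=5000, method="newton")
```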
- property fullname: str
Full name of model. May be reimplemented on concrete subclass to allow using shortened class names.
- get_P(*, dense: bool = False) Union[LinearOperator, ndarray] [source]
Get matrix of edge probabilities.
- Parameters
dense – If True then a dense array is returned. Otherwise a scipy.sparse.linalg.LinearOperator is returned.
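To make the difference between the two return types concrete, the sketch below wraps a small hypothetical probability matrix in a scipy.sparse.linalg.LinearOperator. It is a stand-in for illustration and does not reproduce how get_P() builds its operator.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

# Hypothetical symmetric edge probability matrix with a zero diagonal.
P_dense = np.array([
    [0.0, 0.2, 0.5],
    [0.2, 0.0, 0.7],
    [0.5, 0.7, 0.0],
])

# The operator form defines only products with vectors, so a real
# implementation never has to store the full N x N matrix.
P_op = LinearOperator(
    P_dense.shape,
    matvec=lambda v: P_dense @ v,
    rmatvec=lambda v: P_dense.T @ v,
    dtype=P_dense.dtype,
)

ones = np.ones(P_dense.shape[0])
expected_degrees = P_op.matvec(ones)  # equals the row sums of P
```

The operator form is usually preferable for large graphs, while the dense form is convenient for inspecting individual probabilities.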
- get_W(*, dense: bool = False) Union[LinearOperator, ndarray] [source]
Get matrix of expected edge weights.
- Parameters
dense – If True then a dense array is returned. Otherwise a scipy.sparse.linalg.LinearOperator is returned.
- Raises
NotImplementedError – If called on a model instance which is not weighted.
- get_nemtropy_graph() Union[UndirectedGraph, DirectedGraph] [source]
Get
NEMtropy
graph representation instance appropriate for a given type of model.
- get_param(stat: Union[int, str]) ndarray [source]
Get parameter array associated with a given sufficient statistic. None is returned if the model is not yet fitted.
- Parameters
stat – Index or label of a sufficient statistic.
- get_stat(stat: Union[int, str], expected: bool = False) ndarray [source]
Get sufficient statistic array by index or label.
- Parameters
stat – Index or label of a sufficient statistic.
expected – Should observed or expected statistic be returned.
- is_fitted() bool [source]
Check if model instance is fitted (this does not check quality of the fit).
- is_valid(rtol: Optional[float] = None) bool [source]
Check if model is approximately correct, i.e. that the relative difference |expected - observed| / |observed| is not greater than rtol.
- Parameters
rtol – Maximum allowed relative difference. Class attribute default_rtol is used when None.
- property labels: Mapping
Mapping from short labels to full names corresponding to sufficient statistics.
- methods = ('auto', 'newton', 'fixed-point')
- property models: Tuple[str]
- property n_nodes: int
Number of nodes in the underlying graph.
- property n_stats: int
Number of sufficient statistics.
- property names: Mapping
Mapping from names to
NEMtropy
solver attribute names corresponding to sufficient statistics.
- property parameters: Optional[ndarray]
Array with fitted model parameters shaped as self.statistics.
- Raises
ValueError – If model is not fitted.
- property pijfunc: Callable
JIT-compiled function calculating \(p_{ij}\)’s based on the model.
- property pmv: Callable
JIT-compiled function calculating \(Pv\) where \(P\) is the edge probability matrix and \(v\) is an arbitrary vector.
- relerr() ndarray [source]
Get error of the fitted expected statistics relative to the observed sufficient statistics as
|expected - observed| / |observed|
.
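The formula can be sketched in NumPy as follows. The helper name and the numbers are hypothetical; the tolerance 0.1 matches the default_rtol listed above, while the elementwise comparison is an assumption about how such a check might be aggregated.

```python
import numpy as np


def relative_error(expected, observed):
    """Elementwise |expected - observed| / |observed|."""
    expected = np.asarray(expected, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Zero observed values (e.g. isolated nodes) are not handled in this sketch.
    return np.abs(expected - observed) / np.abs(observed)


# Hypothetical statistics: rows are nodes, columns are degree and strength.
observed = np.array([[2.0, 10.0], [4.0, 8.0]])
expected = np.array([[2.1, 10.0], [3.8, 8.4]])

err = relative_error(expected, observed)
rtol = 0.1
is_valid = bool((err <= rtol).all())
```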
- property rpmv: Callable
JIT-compiled function calculating \(vP\) where \(P\) is the edge probability matrix and \(v\) is an arbitrary vector.
- property rwmv: Callable
JIT-compiled function calculating \(vW\) where \(W\) is the matrix of expected edge weights and \(v\) is an arbitrary vector.
- sample(n: int) Iterable[spmatrix] [source]
Generate n instances sampled from the model.
- Yields
A – Graph instance represented as a sparse matrix (CSR format)
- sample_one() spmatrix [source]
Sample a graph instance as sparse matrix from the model.
- Returns
Graph instance represented as a sparse matrix (CSR format).
- Return type
A
- property solver: Union[UndirectedGraph, DirectedGraph]
NEMtropy
graph solver instance.
- validate(rtol: Optional[float] = None) None [source]
Raise ValueError if the relative difference |expected - observed| / |observed| is greater than rtol.
- Parameters
rtol – Maximum allowed relative difference. Class attribute default_rtol is used when None.
- Returns
The same model instance if the error is not raised.
- Return type
self
- validate_statistics_shape(statistics: ndarray) None [source]
Raise ValueError if statistics has an incorrect shape which is not consistent with the class attribute cls.names.
- validate_statistics_values(statistics: ndarray) None [source]
Raise if statistics contain incorrect values. It must be implemented on a subclass.
Notes
Validation of the shape of statistics is implemented independently in validate_statistics_shape(), which is a generic method that in most cases does not need to be implemented on subclasses.
- property weighted: bool
Is model weighted.
- property wijfunc: Callable
JIT-compiled function sampling edge weights \(w_{ij}\) based on the model.
- property wmv: Callable
JIT-compiled function calculating \(Wv\) where \(W\) is the matrix of expected edge weights and \(v\) is an arbitrary vector.
- class pathcensus.nullmodels.base.SoftConfigurationModel(statistics: Union[ndarray, GraphABC], **kwds: Any)[source]
Bases:
ERGM
Base class for soft configuration models.
- class pathcensus.nullmodels.base.UndirectedSoftConfigurationModel(statistics: Union[ndarray, GraphABC], **kwds: Any)[source]
Bases:
SoftConfigurationModel
Base class for undirected soft configuration models.
- aliases = mappingproxy({'degree': 'd', 'strength': 's'})
- property directed: bool
Is model directed.
- pathcensus.nullmodels.base.get_pmv(X: ndarray, v: ndarray, pijfunc: Callable[[ndarray, int, int], float]) ndarray [source]
Calculate \(Pv\) where \(P\) is edge probability matrix and \(v\) an arbitrary vector.
- Parameters
X – 1D array of model parameters.
v – Arbitrary vector.
pijfunc – JIT-compiled function (in no-python mode) calculating edge probabilities \(p_{ij}\). It should have the following signature: (X, i, j) -> float, where X is a 1D array of model parameters. The return value must be a float in [0, 1].
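A plain-Python sketch of the idea behind such a product: \(Pv\) is accumulated entry by entry from a function with the (X, i, j) -> float signature, so the matrix \(P\) is never stored. This is not the library's JIT-compiled routine. The names ubcm_pij and pmv_sketch are hypothetical, the UBCM formula is assumed from the literature, and self-loops are skipped because the models target simple graphs.

```python
import numpy as np


def ubcm_pij(X, i, j):
    """Assumed UBCM edge probability: x_i * x_j / (1 + x_i * x_j)."""
    xx = X[i] * X[j]
    return xx / (1.0 + xx)


def pmv_sketch(X, v, pijfunc):
    """Compute P @ v without materializing the N x N matrix P."""
    n = len(v)
    out = np.zeros(n, dtype=float)
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-loops in simple graphs
                out[i] += pijfunc(X, i, j) * v[j]
    return out


X = np.array([0.5, 1.0, 2.0])  # hypothetical model parameters
v = np.array([1.0, 2.0, 3.0])
result = pmv_sketch(X, v, ubcm_pij)
```

Memory use stays linear in the number of nodes, at the price of recomputing probabilities on every product.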
- pathcensus.nullmodels.base.get_wmv(X: ndarray, v: ndarray, pijfunc: Callable[[ndarray, int, int], float], Ewijfunc: Callable[[ndarray, int, int], float]) ndarray [source]
Calculate \(Wv\) where \(W\) is expected edge weight matrix and \(v\) is an arbitrary vector.
- Parameters
X – 1D array of model parameters.
v – Arbitrary vector.
pijfunc – JIT-compiled function (in no-python mode) calculating edge probabilities \(p_{ij}\). It should have the following signature: (X, i, j) -> float, where X is a 1D array of model parameters. The return value must be a float in [0, 1].
Ewijfunc – JIT-compiled function (in no-python mode) calculating expected edge weights \(\mathbb{E}[w_{ij}]\). It should have the following signature: (X, i, j) -> float, where X is a 1D array of model parameters. The return value must be a positive float.
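A plain-Python sketch of one way the two callables could combine. It assumes that the expected weight matrix has entries W_ij = p_ij * E[w_ij | edge present], which is an inference from the two signatures rather than a documented formula. Both toy functions and the name wmv_sketch are hypothetical.

```python
import numpy as np


def toy_pij(X, i, j):
    """Toy edge probability in [0, 1] (hypothetical, for illustration only)."""
    xx = X[i] * X[j]
    return xx / (1.0 + xx)


def toy_Ewij(X, i, j):
    """Toy positive expected weight of a present edge (hypothetical)."""
    return 1.0 + X[i] + X[j]


def wmv_sketch(X, v, pijfunc, Ewijfunc):
    """Compute W @ v with W_ij = p_ij * E[w_ij | edge present] and W_ii = 0."""
    n = len(v)
    out = np.zeros(n, dtype=float)
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-loops in simple graphs
                out[i] += pijfunc(X, i, j) * Ewijfunc(X, i, j) * v[j]
    return out


X = np.array([0.5, 1.0, 2.0])  # hypothetical model parameters
v = np.array([1.0, 2.0, 3.0])
Wv = wmv_sketch(X, v, toy_pij, toy_Ewij)
```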
- pathcensus.nullmodels.base.sample_edgelist_unweighted(X: ndarray, n_nodes: int, pijfunc: Callable[[ndarray, int, int], float]) ndarray [source]
Sample edgelist array from an ERGM.
- Parameters
X – 1D array of model parameters.
n_nodes – Number of nodes in the underlying graph.
pijfunc – JIT-compiled function (in no-python mode) calculating edge probabilities \(p_{ij}\). It should have the following signature: (X, i, j) -> float, where X is a 1D array of model parameters. The return value must be a float in [0, 1].
- Returns
Edgelist array.
- Return type
E
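A plain-Python sketch of sampling from a model with independent edges: every pair i < j is included with probability p_ij. The names and the UBCM formula are assumptions for illustration. Whether the library lists each undirected edge once or in both directions is not stated here, so the sketch lists each edge once.

```python
import numpy as np


def ubcm_pij(X, i, j):
    """Assumed UBCM edge probability: x_i * x_j / (1 + x_i * x_j)."""
    xx = X[i] * X[j]
    return xx / (1.0 + xx)


def sample_edgelist_sketch(X, n_nodes, pijfunc, rng):
    """Draw one independent Bernoulli trial per node pair i < j."""
    edges = []
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if rng.random() < pijfunc(X, i, j):
                edges.append((i, j))
    # Reshape so that an empty sample still has two columns.
    return np.array(edges, dtype=int).reshape(-1, 2)


rng = np.random.default_rng(0)
X = np.array([0.5, 1.0, 2.0, 4.0])  # hypothetical model parameters
E = sample_edgelist_sketch(X, len(X), ubcm_pij, rng)
```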
- pathcensus.nullmodels.base.sample_edgelist_weighted(X: ndarray, n_nodes: int, pijfunc: Callable[[ndarray, int, int], float], wijfunc: Callable[[ndarray, int, int], Union[int, float]]) Tuple[ndarray, Optional[ndarray]] [source]
Sample edgelist array from an ERGM.
- Parameters
X – 1D array of model parameters.
n_nodes – Number of nodes in the underlying graph.
weighted – Is the model weighted
pijfunc – JIT-compiled function (in no-python mode) calculating edge probabilities \(p_{ij}\). It should have the following signature: (X, i, j) -> float, where X is a 1D array of model parameters. The return value must be a float in [0, 1].
wijfunc – JIT-compiled function (in no-python mode) sampling edge weights \(w_{ij}\). It should have the following signature: (X, i, j) -> float/int, where X is a 1D array of model parameters. The return value must be a positive int/float.
- Returns
E – Edgelist array.
W – 1D array with edge weights.
pathcensus.nullmodels.ubcm module
Undirected Binary Configuration Model (UBCM) induces a maximum entropy probability distribution over networks of a given size such that it has a specific expected degree sequence. It can be used to model undirected unweighted networks. See [VBM+21] for details.
See also
UBCM
UBCM class
Examples
>>> # Make simple ER random graph using `igraph`
>>> import random
>>> import numpy as np
>>> import igraph as ig
>>> from pathcensus.nullmodels.ubcm import UBCM
>>> random.seed(101)
>>> G = ig.Graph.Erdos_Renyi(20, p=.2)
>>> # Initialize UBCM directly from the graph object
>>> ubcm = UBCM(G)
>>> # Alternatively, initialize from degree sequence array
>>> D = np.array(G.degree())
>>> ubcm = UBCM(D).fit()
>>> # Check fit error
>>> round(ubcm.error, 6)
0.0
>>> # Check that the fitted expected degree sequence matches
>>> # the observed sequence elementwise (within 1e-6)
>>> (np.abs(ubcm.ED - ubcm.D) <= 1e-6).all()
True
>>> # Sample a single ensemble instance
>>> ubcm.sample_one()
<20x20 sparse matrix of type '<class 'numpy.uint8'>'
with ... stored elements in Compressed Sparse Row format>
>>> # Sample multiple instances (generator)
>>> for instance in ubcm.sample(10): pass
- class pathcensus.nullmodels.ubcm.UBCM(statistics: Union[ndarray, GraphABC], **kwds: Any)[source]
Bases:
UndirectedSoftConfigurationModel
Undirected Binary Configuration Model.
This is a soft configuration model for undirected unweighted networks which belongs to the family of Exponential Random Graph Models (ERGMs) with local constraints. It induces a maximum entropy probability distribution over a set of networks with \(N\) nodes such that it yields a specific degree sequence on average.
- statistics
2D (float) array with sufficient statistics for nodes. In this case there is only one sufficient statistic, that is, the degree sequence.
- fit_args
Dictionary with arguments used in the last call of fit(). None if the model has not been fitted yet.
Notes
The following important class attributes are also defined:
- labels
Mapping from abbreviated labels to full names identifying sufficient statistics.
- models
Model names as defined in
NEMtropy
allowed for the specific type of model.
- property D: ndarray
Observed degree sequence.
- property ED: ndarray
Expected degree sequence.
- default_fit_kwds = mappingproxy({'initial_guess': 'chung_lu'})
- property expected_statistics: ndarray
Expected sufficient statistics.
- extract_statistics(graph: GraphABC) ndarray [source]
Extract sufficient statistics from a graph-like object.
- property fullname: str
Full name of model. May be reimplemented on concrete subclass to allow using shortened class names.
- get_nemtropy_graph() UndirectedGraph [source]
Get
NEMtropy
graph representation instance.
- models = ('cm_exp', 'cm')
- names = mappingproxy({'degree': 'x'})
- property pijfunc: Callable
JIT-compiled routine for calculating \(p_{ij}\).
- property weighted: bool
Is model weighted.
pathcensus.nullmodels.uecm module
Undirected Enhanced Configuration Model (UECM) induces a maximum entropy probability distribution over networks of a given size such that it has specific expected degree and strength sequences. It can be used to model undirected weighted networks with edge weights being positive integers (with no upper bound). See [VBM+21] for details.
See also
UECM
UECM class
Examples
>>> import random
>>> import numpy as np
>>> import igraph as ig
>>> from pathcensus.nullmodels.uecm import UECM
>>> # Make a ER random graph with random integer weights
>>> random.seed(27732)
>>> G = ig.Graph.Erdos_Renyi(20, p=.2)
>>> G.es["weight"] = np.random.randint(1, 11, G.ecount())
>>> # Initialize UECM from the graph object
>>> uecm = UECM(G)
>>> # Alternatively initialize from an array of sufficient statistics
>>> # 1st column - degree sequence; 2nd column - strength sequence
>>> D = np.array(G.degree())
>>> S = np.array(G.strength(weights="weight"))
>>> stats = np.column_stack([D, S])
>>> uecm = UECM(stats).fit()
>>> # Check fit error
>>> round(uecm.error, 6)
0.0
>>> # Check that the fitted expected degree sequence matches
>>> # the observed sequence elementwise (within 1e-6)
>>> (np.abs(uecm.ED - uecm.D) <= 1e-6).all()
True
>>> # Check that the fitted expected strength sequence matches
>>> # the observed sequence elementwise (within 1e-6)
>>> (np.abs(uecm.ES - uecm.S) <= 1e-6).all()
True
>>> # Sample a single instance
>>> uecm.sample_one()
<20x20 sparse matrix of type '<class 'numpy.int64'>'
with ... stored elements in Compressed Sparse Row format>
>>> # Sample multiple instances (generator)
>>> for instance in uecm.sample(10): pass
- class pathcensus.nullmodels.uecm.UECM(statistics: Union[ndarray, GraphABC], **kwds: Any)[source]
Bases:
UndirectedSoftConfigurationModel
Undirected Enhanced Configuration Model.
This is a soft configuration model for undirected weighted networks with unbounded positive integer weights which belongs to the family of Exponential Random Graph Models (ERGMs) with local constraints. It induces a maximum entropy probability distribution over a set of networks with \(N\) nodes such that it yields a specific degree sequence and a specific strength sequence on average.
- statistics
2D (float) array with sufficient statistics for nodes. In this case there are two sufficient statistics, that is, the degree sequence and the strength sequence.
- fit_args
Dictionary with arguments used in the last call of fit(). None if the model has not been fitted yet.
Notes
The following important class attributes are also defined:
- labels
Mapping from abbreviated labels to full names identifying sufficient statistics.
- models
Model names as defined in
NEMtropy
allowed for the specific type of model.
- property D: ndarray
Observed degree sequence.
- property ED: ndarray
Expected degree sequence.
- property ES: ndarray
Expected strength sequence.
- property Ewijfunc: Callable
JIT-compiled routine for calculating \(\mathbb{E}[w_{ij}]\) (conditional on the edge being present).
- property S: ndarray
Observed strength sequence.
- default_fit_kwds = mappingproxy({'initial_guess': 'strengths_minor'})
- property expected_statistics: ndarray
Expected sufficient statistics.
- extract_statistics(graph: GraphABC) ndarray [source]
Extract sufficient statistics from a graph-like object.
- get_nemtropy_graph() UndirectedGraph [source]
Get
NEMtropy
graph representation instance.
- models = ('ecm_exp', 'ecm')
- names = mappingproxy({'degree': 'x', 'strength': 'y'})
- property pijfunc: Callable
JIT-compiled routine for calculating \(p_{ij}\).
- property weighted: bool
Is model weighted.
- property wijfunc: Callable
JIT-compiled routine sampling \(w_{ij}\).
- pathcensus.nullmodels.uecm.uecm_Ewij(X: ndarray, i: int, j: int) float [source]
Calculate expected edge weight \(\mathbb{E}[w_{ij}]\) (conditional on the edge being present) in UECM model.
- Parameters
X – 1D array of model parameters.
i – Node indices.
j – Node indices.
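For orientation, the sketch below uses the standard enhanced configuration model parametrization from the literature, in which each node has a degree-related parameter x and a strength-related parameter y, and the weight of a present edge is geometrically distributed. How uecm_Ewij unpacks these parameters from the 1D array X is not specified here, so the functions take them explicitly; their names are hypothetical.

```python
def ecm_pij(x_i: float, x_j: float, y_i: float, y_j: float) -> float:
    """Edge probability x_i x_j y_i y_j / (1 - y_i y_j + x_i x_j y_i y_j)."""
    xx = x_i * x_j
    yy = y_i * y_j
    return xx * yy / (1.0 - yy + xx * yy)


def ecm_conditional_expected_weight(y_i: float, y_j: float) -> float:
    """Expected weight of a present edge: 1 / (1 - y_i y_j)."""
    yy = y_i * y_j
    if not 0.0 <= yy < 1.0:
        raise ValueError("y_i * y_j must lie in [0, 1)")
    return 1.0 / (1.0 - yy)


# Hypothetical parameter values for a single pair of nodes.
p = ecm_pij(2.0, 3.0, 0.5, 0.6)
ew = ecm_conditional_expected_weight(0.5, 0.6)

# The unconditional expected weight is the product of the two quantities.
unconditional = p * ew
```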
Module contents
Null model classes implementing different variants of the configuration model.
The classes implemented in this module are simple wrappers around the NEMtropy package.