# metrics.distribution

Metrics on probability distributions.

class pyphi.metrics.distribution.DistributionMeasureRegistry

Storage for distance functions between probability distributions.

Users can define custom measures:

Examples

>>> @measures.register('ALWAYS_ZERO')
... def always_zero(a, b):
...    return 0


And use them by setting, e.g., config.REPERTOIRE_DISTANCE = 'ALWAYS_ZERO'.

desc = 'distance functions between probability distributions'
register(name, asymmetric=False)

Decorator for registering a distribution measure with PyPhi.

Parameters:

name (string) – The name of the measure.

Keyword Arguments:

asymmetric (boolean) – True if the measure is asymmetric.

asymmetric()

Return a list of asymmetric measures.

class pyphi.metrics.distribution.ActualCausationMeasureRegistry

Storage for distance functions used in pyphi.actual.

Users can define custom measures:

Examples

>>> @measures.register('ALWAYS_ZERO')
... def always_zero(a, b):
...    return 0


And use them by setting, e.g., config.ACTUAL_CAUSATION_MEASURE = 'ALWAYS_ZERO'.

desc = 'distance functions for use in actual causation calculations'
register(name, asymmetric=False)

Decorator for registering an actual causation measure with PyPhi.

Parameters:

name (string) – The name of the measure.

Keyword Arguments:

asymmetric (boolean) – True if the measure is asymmetric.

asymmetric()

Return a list of asymmetric measures.

class pyphi.metrics.distribution.np_suppress

Decorator to suppress NumPy warnings about divide-by-zero and multiplication of NaN.

Note

This should only be used in cases where you are sure that these warnings are not indicative of deeper issues in your code.

pyphi.metrics.distribution.hamming_emd(p, q)

Return the Earth Mover’s Distance between two distributions (indexed by state, one dimension per node) using the Hamming distance between states as the transportation cost function.

Singleton dimensions are squeezed out.
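The ground metric here is the Hamming distance between binary states. As an illustration of the transportation cost function, here is a minimal sketch of how such a cost matrix can be built (the helper name is hypothetical, not part of PyPhi):

```python
import numpy as np
from itertools import product

def hamming_distance_matrix(n_nodes):
    """Pairwise Hamming distances between all 2**n_nodes binary states."""
    states = list(product((0, 1), repeat=n_nodes))
    return np.array(
        [[sum(a != b for a, b in zip(s, t)) for t in states] for s in states],
        dtype=float,
    )
```

For example, with two nodes the states (0, 0) and (1, 1) are at transportation cost 2, while adjacent states differ by cost 1.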

pyphi.metrics.distribution.effect_emd(p, q)

Compute the EMD between two effect repertoires.

Because the nodes are independent, the EMD between effect repertoires is equal to the sum of the EMDs between the marginal distributions of each node, and the EMD between the marginal distributions of a single node is the absolute difference in the probabilities that the node is OFF.

Parameters:
• p (np.ndarray) – The first repertoire.

• q (np.ndarray) – The second repertoire.

Returns:

The EMD between p and q.

Return type:

float
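The analytic shortcut described above can be sketched in a few lines of NumPy. This is a simplified illustration, not PyPhi's implementation; it assumes repertoires shaped with one state-indexed dimension per node, index 0 meaning OFF:

```python
import numpy as np

def effect_emd_sketch(p, q):
    """Sum over nodes of |P(node OFF) - Q(node OFF)|.

    Assumes p and q have one state-indexed dimension per node,
    with index 0 meaning OFF, as in PyPhi repertoires.
    """
    total = 0.0
    for axis in range(p.ndim):
        # Marginalize out all other nodes to get this node's distribution.
        other_axes = tuple(i for i in range(p.ndim) if i != axis)
        p_marginal = p.sum(axis=other_axes)
        q_marginal = q.sum(axis=other_axes)
        # The EMD between two single-node (Bernoulli) distributions
        # is the absolute difference in the OFF probabilities.
        total += abs(p_marginal[0] - q_marginal[0])
    return total
```

Because each node contributes independently, the whole computation is linear in the number of nodes, unlike the full EMD.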

pyphi.metrics.distribution.emd(p, q, direction=None)

Compute the EMD between two repertoires for a given direction.

The full EMD computation is used for cause repertoires. A fast analytic solution is used for effect repertoires.

Parameters:
• p (np.ndarray) – The first repertoire.

• q (np.ndarray) – The second repertoire.

• direction (Direction) – The temporal direction, determining whether the full computation or the fast effect-repertoire solution is used.

Returns:

The EMD between p and q, rounded to PRECISION.

Return type:

float

Raises:

ValueError – If direction is invalid.

pyphi.metrics.distribution.l1(p, q)

Return the L1 distance between two distributions.

Parameters:
• p (np.ndarray) – The first probability distribution.

• q (np.ndarray) – The second probability distribution.

Returns:

The sum of absolute differences of p and q.

Return type:

float
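In NumPy terms this is a one-liner (the helper name is illustrative, not PyPhi's):

```python
import numpy as np

def l1_sketch(p, q):
    # Sum of element-wise absolute differences.
    return float(np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float)).sum())
```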

pyphi.metrics.distribution.entropy_difference(p, q)

Return the difference in entropy between two distributions.
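A minimal sketch, assuming base-2 entropy and that the absolute difference is returned (the helper names are illustrative, and zero-probability terms are taken to contribute nothing):

```python
import numpy as np

def entropy_difference_sketch(p, q):
    """Absolute difference in Shannon entropy, in bits."""
    def h(x):
        x = np.asarray(x, dtype=float).ravel()
        x = x[x > 0]  # 0 * log(0) is taken to be 0
        return float(-(x * np.log2(x)).sum())
    return abs(h(p) - h(q))
```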

pyphi.metrics.distribution.psq2(p, q)

Compute the PSQ2 measure.

This is defined as $$\left| f(p) - f(q) \right|$$, where

$f(x) = \sum_{i=0}^{N-1} x_i^2 \log_2 (x_i N)$
Parameters:
• p (np.ndarray) – The first distribution.

• q (np.ndarray) – The second distribution.
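The definition above can be sketched directly in NumPy. This is an illustration, not PyPhi's implementation; it assumes terms with $$x_i = 0$$ are taken to be zero:

```python
import numpy as np

def f(x):
    # sum_i x_i^2 * log2(x_i * N); terms with x_i == 0 are taken to be 0.
    x = np.asarray(x, dtype=float).ravel()
    n = len(x)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = x**2 * np.log2(x * n)
    return float(np.where(x > 0, terms, 0.0).sum())

def psq2_sketch(p, q):
    # |f(p) - f(q)|
    return abs(f(p) - f(q))
```

Note that $$f$$ vanishes on the uniform distribution, since every $$\log_2(x_i N)$$ term is zero there.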

pyphi.metrics.distribution.mp2q(p, q)

Compute the MP2Q measure.

This is defined as

$\frac{1}{N} \sum_{i=0}^{N-1} \frac{p_i^2}{q_i} \log_2\left(\frac{p_i}{q_i}\right)$
Parameters:
• p (np.ndarray) – The first distribution.

• q (np.ndarray) – The second distribution.

Returns:

The distance.

Return type:

float
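A direct transcription of the formula above (a sketch, not PyPhi's implementation; it assumes terms where $$p_i = 0$$ or $$q_i = 0$$ contribute nothing, matching the zero conventions used elsewhere in this module):

```python
import numpy as np

def mp2q_sketch(p, q):
    # (1/N) * sum_i (p_i^2 / q_i) * log2(p_i / q_i)
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    n = len(p)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = (p**2 / q) * np.log2(p / q)
    # Assumed convention: terms with p_i == 0 or q_i == 0 contribute 0.
    terms = np.where((p > 0) & (q > 0), terms, 0.0)
    return float(terms.sum() / n)
```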

pyphi.metrics.distribution.information_density(p, q)

Return the information density of p relative to q, in base 2.

This is also known as the element-wise relative entropy; see scipy.special.rel_entr().

Parameters:
• p (np.ndarray) – The first probability distribution.

• q (np.ndarray) – The second probability distribution.

Returns:

The information density of p relative to q.

Return type:

np.ndarray
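Element-wise, this is $$p_i \log_2 (p_i / q_i)$$. A pure-NumPy sketch (illustrative helper name; terms with $$p_i = 0$$ are taken to be zero, as in scipy.special.rel_entr):

```python
import numpy as np

def information_density_sketch(p, q):
    """Element-wise p * log2(p / q); terms with p == 0 are taken to be 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        d = p * np.log2(p / q)
    return np.where(p > 0, d, 0.0)
```

Summing this array over all states gives the KLD (cf. kld()), and taking the element-wise absolute value gives the absolute information density.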

pyphi.metrics.distribution.kld(p, q)

Return the Kullback-Leibler Divergence (KLD) between two distributions.

Parameters:
• p (np.ndarray) – The first probability distribution.

• q (np.ndarray) – The second probability distribution.

Returns:

The KLD of p from q.

Return type:

float

pyphi.metrics.distribution.absolute_information_density(p, q)

Return the absolute information density function of two distributions.

The information density is also known as the element-wise relative entropy; see scipy.special.rel_entr().

Parameters:
• p (np.ndarray) – The first probability distribution.

• q (np.ndarray) – The second probability distribution.

Returns:

The absolute information density of p relative to q.

Return type:

np.ndarray

pyphi.metrics.distribution.approximate_specified_state(repertoire, partitioned_repertoire)

Estimate the purview state that maximizes the AID between the repertoires.

This returns only the state of the purview nodes (i.e., there is one element in the state vector for each purview node, not for each node in the network).

Note

Although deterministic, results are only a good guess. This function should only be used in cases where running specified_state() is infeasible.

This algorithm runs in time linear in the purview size, as opposed to the exhaustive exact search, which is exponential on average. Single-node (i.e., marginal) repertoires are considered one by one, and their state is determined according to the following heuristics:

If the most probable state in the unpartitioned repertoire ($$p > 1/2$$) becomes less probable in the partitioned one ($$p > q$$), we pick that state for that node. There can be ties; in that case, the state with the lowest index is arbitrarily chosen.

Now suppose that was enough to specify the state of only $$k$$ nodes, with joint unpartitioned probability $$p_k$$ and partitioned probability $$q_k$$, and suppose we add node $$z$$. Let node $$z$$ have probability $$p_z$$ of being in state 0; for the complementary state 1, the probability is $$1 - p_z$$. We want to know which state of $$z$$ gives higher intrinsic information when it is added to the $$k$$ nodes. In other words, we want to compare $$I_x$$ and $$I_y$$:

$I_x = \left( p_k p_z \right) \log_2 \left( \frac{p_k p_z}{q_k q_z} \right)$
$I_y = \left( p_k (1-p_z) \right) \log_2 \left( \frac{p_k (1-p_z)}{q_k(1-q_z)} \right)$

For state 1 to give higher intrinsic information (i.e., $$I_y > I_x$$), $$p_z$$ and $$q_z$$ must satisfy two inequalities:

$p_z < 1/2$
$\log_2 \left( \frac{p_k}{q_k} \right) < \left( \frac{1}{1-2p_z} \right) \left( p_z \log_2 \left( \frac{p_z}{q_z} \right) - (1-p_z) \log_2 \left( \frac{1-p_z}{1-q_z} \right) \right)$

Otherwise, we should pick the state 0 as the state of node $$z$$.

Parameters:
• repertoire (np.ndarray) – The first probability distribution.

• partitioned_repertoire (np.ndarray) – The second probability distribution.

Returns:

A 2D array where the single row is the approximate specified_state().

Return type:

np.ndarray
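The per-node comparison of $$I_x$$ and $$I_y$$ described above can be sketched as follows. This is a simplified illustration with a hypothetical helper name, not PyPhi's implementation, and it assumes all four probabilities are strictly positive:

```python
import numpy as np

def pick_state_for_node(p_k, q_k, p_z, q_z):
    """Choose the state (0 or 1) of node z that yields the higher
    intrinsic information when z is appended to the k fixed nodes.

    p_k, q_k: joint unpartitioned/partitioned probabilities of the k
    already-specified nodes; p_z, q_z: probabilities that node z is in
    state 0. Assumes all probabilities are strictly positive.
    """
    i_x = (p_k * p_z) * np.log2((p_k * p_z) / (q_k * q_z))  # state 0
    i_y = (p_k * (1 - p_z)) * np.log2(
        (p_k * (1 - p_z)) / (q_k * (1 - q_z))
    )  # state 1
    return 1 if i_y > i_x else 0
```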

pyphi.metrics.distribution.intrinsic_difference(p, q)

Compute the intrinsic difference (ID) between two distributions.

This is defined as

$\max_i \left\{ p_i \log_2 \left( \frac{p_i}{q_i} \right) \right\}$

where we define $$p_i \log_2 \left( \frac{p_i}{q_i} \right)$$ to be $$0$$ when $$p_i = 0$$ or $$q_i = 0$$.

See the following paper:

Barbosa LS, Marshall W, Streipert S, Albantakis L, Tononi G (2020). A measure for intrinsic information. Sci Rep, 10, 18803. https://doi.org/10.1038/s41598-020-75943-4

Parameters:
• p (np.ndarray) – The first probability distribution.

• q (np.ndarray) – The second probability distribution.

Returns:

The intrinsic difference.

Return type:

float
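A direct transcription of the definition (a sketch with an illustrative name, applying the stated convention that a term is 0 when $$p_i = 0$$ or $$q_i = 0$$):

```python
import numpy as np

def intrinsic_difference_sketch(p, q):
    # max_i p_i * log2(p_i / q_i)
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = p * np.log2(p / q)
    # The term is defined to be 0 when p_i == 0 or q_i == 0.
    terms = np.where((p > 0) & (q > 0), terms, 0.0)
    return float(terms.max())
```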

pyphi.metrics.distribution.absolute_intrinsic_difference(p, q)

Compute the absolute intrinsic difference (AID) between two distributions.

This is the same as the ID, but the absolute value is applied before the maximum is taken.

See documentation for intrinsic_difference() for further details and references.

Parameters:
• p (np.ndarray) – The first probability distribution.

• q (np.ndarray) – The second probability distribution.

Returns:

The absolute intrinsic difference.

Return type:

float

pyphi.metrics.distribution.iit_4_small_phi(p, q, state)
pyphi.metrics.distribution.iit_4_small_phi_no_absolute_value(p, q, state)
pyphi.metrics.distribution.generalized_intrinsic_difference(forward_repertoire, partitioned_forward_repertoire, selectivity_repertoire, state=None)
pyphi.metrics.distribution.absolute_pointwise_mutual_information(p, q, state)

Compute the state-specific absolute pointwise mutual information between two distributions.

This is the same as the pointwise mutual information, but with the absolute value taken.

Parameters:
• p (np.ndarray[float]) – The first probability distribution.

• q (np.ndarray[float]) – The second probability distribution.

• state (tuple[int]) – The state at which to evaluate the measure.

Returns:

The absolute pointwise mutual information at the given state.

Return type:

float

pyphi.metrics.distribution.pointwise_mutual_information_vector(p, q)
pyphi.metrics.distribution.pointwise_mutual_information(p, q)

Compute the pointwise mutual information (PMI).

This is defined as

$\log_2\left(\frac{p}{q}\right)$

when $$p \neq 0$$ and $$q \neq 0$$, and $$0$$ otherwise.

Parameters:
• p (float) – The first probability.

• q (float) – The second probability.

Returns:

The pointwise mutual information.

Return type:

float

pyphi.metrics.distribution.weighted_pointwise_mutual_information(p, q)

Compute the weighted pointwise mutual information (WPMI).

This is defined as

$p \log_2\left(\frac{p}{q}\right)$

when $$p \neq 0$$ and $$q \neq 0$$, and $$0$$ otherwise.

Parameters:
• p (float) – The first probability.

• q (float) – The second probability.

Returns:

The weighted pointwise mutual information.

Return type:

float
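Both scalar measures can be sketched in a few lines (illustrative helper names, using the zero conventions stated above):

```python
import math

def pmi(p, q):
    # log2(p / q), defined as 0 when p == 0 or q == 0.
    if p == 0 or q == 0:
        return 0.0
    return math.log2(p / q)

def wpmi(p, q):
    # p * log2(p / q), with the same zero convention.
    return p * pmi(p, q)
```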

pyphi.metrics.distribution.repertoire_distance(r1, r2, direction=None, repertoire_distance=None, **kwargs)

Compute the distance between two repertoires for the given direction.

Parameters:
• r1 (np.ndarray) – The first repertoire.

• r2 (np.ndarray) – The second repertoire.

• direction (Direction) – The temporal direction.

• repertoire_distance (str) – The name of a registered measure to use in place of the one configured in config.REPERTOIRE_DISTANCE.

Returns:

The distance between r1 and r2, rounded to PRECISION.

Return type:

float