Distribution#

class HARK.distributions.DiscreteDistribution(pmv: ndarray, atoms: ndarray, seed: int = 0, limit: Dict[str, Any] | None = None)#

Bases: Distribution

A representation of a discrete probability distribution.

Parameters:
  • pmv (np.array) – An array of floats representing a probability mass function.

  • atoms (np.array) – Discrete point values for each probability mass. For multivariate distributions, the last dimension of atoms must index “atom” or the random realization. For instance, if atoms.shape == (2,6,4), the random variable has 4 possible realizations and each of them has shape (2,6).

  • seed (int) – Seed for random number generator.
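The layout described for atoms can be checked with plain NumPy arrays. The following is a sketch of the shape convention only: it does not construct a DiscreteDistribution, and the array values are made up for illustration.

```python
import numpy as np

# Hypothetical example values: 4 realizations, each a (2, 6) array.
atoms = np.arange(2 * 6 * 4, dtype=float).reshape(2, 6, 4)
pmv = np.array([0.1, 0.2, 0.3, 0.4])  # one probability per realization

n_atoms = atoms.shape[-1]             # the last axis indexes the realization
realization_shape = atoms.shape[:-1]  # shape of a single realization
third_realization = atoms[..., 2]     # the realization that has probability pmv[2]
```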

dim() int#

Last dimension of self.atoms indexes “atom.”

draw_events(N: int) ndarray#

Draws N ‘events’ from the distribution PMF. These events are indices into atoms.

draw(N: int, atoms: None | int | ndarray = None, exact_match: bool = False) ndarray#

Simulates N draws from a discrete distribution with probabilities P and outcomes atoms.

Parameters:
  • N (int) – Number of draws to simulate.

  • atoms (None, int, or np.array) – If None, then use this distribution’s atoms for point values. If an int, then the index of atoms for the point values. If an np.array, use the array for the point values.

  • exact_match (boolean) – Whether the draws should “exactly” match the discrete distribution (as closely as possible given finite draws). When True, returned draws are a random permutation of the N-length list that best fits the discrete distribution. When False (default), each draw is independent from the others and the result could deviate from the input.

Returns:

draws – An array of draws from the discrete distribution; each element is a value in atoms.

Return type:

np.array
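The exact_match behaviour can be illustrated with a small NumPy sketch. It assumes one plausible construction (round the cumulative probabilities to whole draws, repeat each atom accordingly, then shuffle); the helper name draw_exact_match is hypothetical and this is not presented as the library's implementation.

```python
import numpy as np

def draw_exact_match(pmv, atoms, N, seed=0):
    """Return N draws whose frequencies track pmv as closely as rounding allows."""
    rng = np.random.default_rng(seed)
    # Cumulative cutoffs measured in draws, rounded to whole draws.
    cutoffs = np.rint(np.cumsum(pmv) * N).astype(int)
    counts = np.diff(cutoffs, prepend=0)
    # Repeat each atom index by its allotted count, then shuffle the order.
    events = np.repeat(np.arange(len(pmv)), counts)
    return np.asarray(atoms)[rng.permutation(events)]

pmv = np.array([0.2, 0.3, 0.5])
atoms = np.array([-1.0, 0.0, 1.0])
draws = draw_exact_match(pmv, atoms, N=10)
```

With N = 10 the sample contains exactly 2, 3, and 5 copies of the three atoms; only their order is random.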

expected(func: Callable | None = None, *args: ndarray) ndarray#

Expected value of a function, given an array of configurations of its inputs along with a DiscreteDistribution object that specifies the probability of each configuration.

If no function is provided, it’s much faster to go straight to dot product instead of calling the dummy function.

If a function is provided, we need to add one more dimension, the atom dimension, to any inputs that are n-dim arrays. This allows numpy to easily broadcast the function’s output. For more information on broadcasting, see: https://numpy.org/doc/stable/user/basics.broadcasting.html#general-broadcasting-rules

Parameters:
  • func (function) – The function to be evaluated. This function should take the full array of distribution values and return either arrays of arbitrary shape or scalars. It may also take other arguments *args. This function differs from the standalone calc_expectation method in that it uses numpy’s vectorization and broadcasting rules to avoid costly iteration. Note: If you need to use a function that acts on single outcomes of the distribution, consider distribution.calc_expectation.

  • *args – Other inputs for func, representing the non-stochastic arguments. The expectation is computed at f(dstn, *args).

Returns:

f_exp – The expectation of the function at the queried values. Scalar if only one value.

Return type:

np.array or scalar
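The two paths described above (a direct dot product when no function is given, and broadcasting over an added atom dimension when one is) can be sketched in plain NumPy. The function f and the grid a_grid are hypothetical, and the code mirrors the idea rather than the method's internals.

```python
import numpy as np

pmv = np.array([0.25, 0.5, 0.25])
atoms = np.array([[0.9, 1.0, 1.1]])  # shape (1, 3): one variable, three atoms

# No function: the mean is a dot product over the atom axis.
mean = atoms @ pmv                   # shape (1,)

# With a function: give the non-stochastic input a trailing atom axis
# so NumPy broadcasts it against the atoms.
def f(shock, a):
    return shock * a + 1.0

a_grid = np.array([0.0, 1.0, 2.0, 3.0])
values = f(atoms[0], a_grid[:, np.newaxis])  # shape (4, 3)
f_exp = values @ pmv                         # shape (4,): one expectation per grid point
```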

dist_of_func(func: ~typing.Callable[[...], float] = <function DiscreteDistribution.<lambda>>, *args: ~typing.Any) DiscreteDistribution#

Finds the distribution of a random variable Y that is a function of discrete random variable atoms, Y=f(atoms).

Parameters:
  • func (function) – The function to be evaluated. This function should take the full array of distribution values. It may also take other arguments *args.

  • *args – Additional non-stochastic arguments for func. The function is computed as f(dstn, *args).

Returns:

f_dstn – The distribution of func(dstn).

Return type:

DiscreteDistribution

discretize(N: int, *args: Any, **kwargs: Any) DiscreteDistribution#

DiscreteDistribution is already an approximation, so this method returns a copy of the distribution.

TODO: print warning message?

make_univariate(dim_to_keep, seed=0)#

Make a univariate discrete distribution from this distribution, keeping only the specified dimension.

Parameters:
  • dim_to_keep (int) – Index of the distribution to be kept. Any other dimensions will be “collapsed” into the univariate atoms, combining probabilities.

  • seed (int, optional) – Seed for random number generator of univariate distribution

Returns:

univariate_dstn – Univariate distribution with only the specified index.

Return type:

DiscreteDistribution
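One reading of "collapsing" the other dimensions is to keep a single row of atoms and add up the probabilities of atoms that share the same value. The NumPy sketch below assumes that reading; the example arrays are made up.

```python
import numpy as np

# Two variables (rows) and four atoms (columns); values are illustrative.
atoms = np.array([[1.0, 1.0, 2.0, 2.0],
                  [10.0, 20.0, 10.0, 20.0]])
pmv = np.array([0.1, 0.2, 0.3, 0.4])

dim_to_keep = 0
# Unique values of the kept variable, plus a map from each atom to its value.
uni_atoms, inverse = np.unique(atoms[dim_to_keep], return_inverse=True)
# Sum the probabilities of all atoms that map to the same value.
uni_pmv = np.bincount(inverse, weights=pmv)
```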

class HARK.distributions.DiscreteDistributionLabeled(pmv: ndarray, atoms: ndarray, seed: int = 0, limit: Dict[str, Any] | None = None, name: str = 'DiscreteDistributionLabeled', attrs: Dict[str, Any] | None = None, var_names: List[str] | None = None, var_attrs: List[Dict[str, Any] | None] | None = None)#

Bases: DiscreteDistribution

A representation of a discrete probability distribution stored in an underlying xarray.Dataset.

Parameters:
  • pmv (np.array) – An array of values representing a probability mass function.

  • atoms (np.array) – Discrete point values for each probability mass. For multivariate distributions, the last dimension of atoms must index “atom” or the random realization. For instance, if atoms.shape == (2,6,4), the random variable has 4 possible realizations and each of them has shape (2,6).

  • seed (int) – Seed for random number generator.

  • name (str) – Name of the distribution.

  • attrs (dict) – Attributes for the distribution.

  • var_names (list of str) – Names of the variables in the distribution.

  • var_attrs (list of dict) – Attributes of the variables in the distribution.

classmethod from_unlabeled(dist, name='DiscreteDistributionLabeled', attrs=None, var_names=None, var_attrs=None)#
classmethod from_dataset(x_obj, pmf)#
property variables#

A dict-like container of DataArrays corresponding to the variables of the distribution.

property name#

The distribution’s name.

property attrs#

The distribution’s attributes.

dist_of_func(func: ~typing.Callable = <function DiscreteDistributionLabeled.<lambda>>, *args, **kwargs) DiscreteDistribution#

Finds the distribution of a random variable Y that is a function of discrete random variable atoms, Y=f(atoms).

Parameters:
  • func (function) – The function to be evaluated. This function should take the full array of distribution values. It may also take other arguments *args.

  • *args – Additional non-stochastic arguments for func. The function is computed as f(dstn, *args).

  • **kwargs – Additional keyword arguments for func. Must be xarray compatible in order to work with xarray broadcasting.

Returns:

f_dstn – The distribution of func(dstn).

Return type:

DiscreteDistribution or DiscreteDistributionLabeled

expected(func: Callable | None = None, *args: Any, **kwargs: Any) float | ndarray#

Expectation of a function, given an array of configurations of its inputs along with a DiscreteDistributionLabeled object that specifies the probability of each configuration.

Parameters:
  • func (function) – The function to be evaluated. This function should take the full array of distribution values and return either arrays of arbitrary shape or scalars. It may also take other arguments *args. This function differs from the standalone calc_expectation method in that it uses numpy’s vectorization and broadcasting rules to avoid costly iteration. Note: If you need to use a function that acts on single outcomes of the distribution, consider distribution.calc_expectation.

  • *args – Other inputs for func, representing the non-stochastic arguments. The expectation is computed at f(dstn, *args).

  • labels (bool) – If True, the function should use labeled indexing instead of integer indexing using the distribution’s underlying rv coordinates. For example, if dims = (‘rv’, ‘x’) and coords = {‘rv’: [‘a’, ‘b’], }, then the function can be lambda x: x[“a”] + x[“b”].

Returns:

f_exp – The expectation of the function at the queried values. Scalar if only one value.

Return type:

np.array or scalar
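The labeled-indexing idea can be illustrated without xarray by letting a plain dict stand in for the name-indexed atoms. This is only an analogy: the helper expected_labeled and the data are hypothetical, and the real class stores its data in an xarray.Dataset.

```python
import numpy as np

pmv = np.array([0.5, 0.5])
# Hypothetical labeled atoms: each variable name maps to its values across atoms.
labeled_atoms = {"a": np.array([1.0, 3.0]), "b": np.array([10.0, 20.0])}

def expected_labeled(func, pmv, labeled_atoms):
    # Evaluate func on name-indexed atoms, then weight by the probabilities.
    return float(np.dot(func(labeled_atoms), pmv))

f_exp = expected_labeled(lambda x: x["a"] + x["b"], pmv, labeled_atoms)
```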

class HARK.distributions.Distribution(seed: int | None = 0)#

Bases: object

Base class for all probability distributions with seed and random number generator.

For discussion on random number generation and random seeds, see https://docs.scipy.org/doc/scipy/tutorial/stats.html#random-number-generation

Parameters:

seed (Optional[int]) – Seed for random number generator.

property seed: int#

Seed for random number generator.

Returns:

Seed.

Return type:

int

reset() None#

Reset the random number generator of this distribution. Resetting the seed will result in the same sequence of random numbers being generated.
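The reproducibility property can be seen with any seeded generator. In this sketch Python's random.Random stands in for the distribution's generator, which is an assumption made for illustration.

```python
import random

seed = 0
rng = random.Random(seed)
first = [rng.random() for _ in range(3)]

# "Reset": rebuild the generator from the same seed.
rng = random.Random(seed)
second = [rng.random() for _ in range(3)]
```

Both lists hold the same three numbers, because the generator restarts from the same state.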

random_seed() None#

Generate a new random seed for this distribution.

draw(N: int) ndarray#

Generate arrays of draws from this distribution. If input N is a number, output is a length N array of draws from the distribution. If N is a list, output is a length T list whose t-th entry is a length N array of draws from the distribution[t].

Parameters:
  • N (int) – Number of draws in each row.

Returns:

draws – T-length list of arrays of random variable draws each of size N, or a single array of size N (if sigma is a scalar).

Return type:

np.array or [np.array]

discretize(N: int, method: str = 'equiprobable', endpoints: bool = False, **kwds: Any) DiscreteDistribution#

Discretize the distribution into N points using the specified method.

Parameters:
  • N (int) – Number of points in the discretization.

  • method (str, optional) – Method for discretization, by default “equiprobable”

  • endpoints (bool, optional) – Whether to include endpoints in the discretization, by default False

Returns:

Discretized distribution.

Return type:

DiscreteDistribution

Raises:

NotImplementedError – If method is not implemented for this distribution.
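As an illustration of what an equiprobable discretization can look like, the sketch below splits a normal distribution into N bins of probability 1/N and places each atom at the conditional mean of its bin. This is one common construction, assumed here for illustration; the helper name equiprobable_normal is hypothetical and the method's actual rule may differ.

```python
from statistics import NormalDist

def equiprobable_normal(N, mu=0.0, sigma=1.0):
    """N atoms with probability 1/N each, at the conditional mean of each bin."""
    std = NormalDist()
    # Interior bin edges at the quantiles 1/N, 2/N, ..., (N-1)/N.
    cuts = [std.inv_cdf(i / N) for i in range(1, N)]
    # Standard normal density at every edge; it is zero at -inf and +inf.
    dens = [0.0] + [std.pdf(c) for c in cuts] + [0.0]
    # E[z | a < z < b] = (pdf(a) - pdf(b)) / (cdf(b) - cdf(a)), and cdf(b) - cdf(a) = 1/N.
    atoms = [mu + sigma * N * (dens[i] - dens[i + 1]) for i in range(N)]
    pmv = [1.0 / N] * N
    return pmv, atoms

pmv, atoms = equiprobable_normal(5)
mean = sum(p * a for p, a in zip(pmv, atoms))
```

Because the density terms telescope, the discretized mean equals mu up to round-off.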

class HARK.distributions.IndexDistribution(engine, conditional, RNG=None, seed=0)#

Bases: Distribution

This class provides a way to define a distribution that is conditional on an index.

The current implementation combines a defined distribution class (such as Bernoulli, LogNormal, etc.) with information about the conditions on the parameters of the distribution.

For example, an IndexDistribution can be defined as a Bernoulli distribution whose parameter p is a function of a different input parameter.

Parameters:
  • engine (Distribution class) – A Distribution subclass.

  • conditional (dict) – Information about the conditional variation on the input parameters of the engine distribution. Keys should match the arguments to the engine class constructor.

  • seed (int) – Seed for random number generator.

conditional = None#
engine = None#
discretize(N, **kwds)#

Approximation of the distribution.

Parameters:
  • N (int) – Number of discrete points to approximate continuous distribution into.

  • kwds (dict) – Other keyword arguments passed to engine distribution approx() method.

Returns:

dists – A list of DiscreteDistributions that are the approximation of engine distribution under each condition.

TODO: It would be better if there were a conditional discrete distribution representation. But that integrates with the solution code. This implementation will return the list of distribution representations expected by the solution code.

Return type:

[DiscreteDistribution]

draw(condition)#

Generate arrays of draws. The input is an array containing the conditions. The output is an array of the same length (axis 1 dimension) as the conditions containing random draws of the conditional distribution.

Parameters:
  • condition (np.array) – The input conditions to the distribution.

Returns:

draws – Random draws of the conditional distribution, one per input condition.

Return type:

np.array
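The Bernoulli example mentioned above can be sketched directly in NumPy: each entry of the condition array selects the parameter p used for that draw. The arrays p_by_condition and condition are hypothetical, and this does not use the IndexDistribution class itself.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical setup: the Bernoulli parameter p depends on an integer condition.
p_by_condition = np.array([0.0, 1.0, 0.5])
condition = np.array([0, 1, 1, 0, 2, 2])

# One uniform number per entry, compared against that entry's own p.
draws = rng.random(condition.size) < p_by_condition[condition]
```

Entries with condition 0 are always False and entries with condition 1 are always True; only condition 2 yields random outcomes.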

class HARK.distributions.TimeVaryingDiscreteDistribution(distributions, seed=0)#

Bases: Distribution

This class provides a way to define a discrete distribution that is conditional on an index.

Wraps a list of discrete distributions.

Parameters:
  • distributions ([DiscreteDistribution]) – A list of discrete distributions

  • seed (int) – Seed for random number generator.

distributions = []#
draw(condition)#

Generate arrays of draws. The input is an array containing the conditions. The output is an array of the same length (axis 1 dimension) as the conditions containing random draws of the conditional distribution.

Parameters:
  • condition (np.array) – The input conditions to the distribution.

Returns:

draws – Random draws of the conditional distribution, one per input condition.

Return type:

np.array

class HARK.distributions.Lognormal(mu: float | ndarray = 0.0, sigma: float | ndarray = 1.0, seed: int | None = 0, mean=None, std=None)#

Bases: ContinuousFrozenDistribution

A Lognormal distribution

Parameters:
  • mu (float or [float]) – One or more means of underlying normal distribution. Number of elements T in mu determines number of rows of output.

  • sigma (float or [float]) – One or more standard deviations of underlying normal distribution. Number of elements T in sigma determines number of rows of output.

  • seed (int) – Seed for random number generator.

classmethod from_mean_std(mean, std, seed=0)#

Construct a Lognormal distribution from its mean and standard deviation.

This differs from the default constructor, which takes the mu and sigma of the underlying normal distribution, that is, the distribution of the logarithm of the lognormal variable.

Parameters:
  • mean (float or [float]) – One or more means. Number of elements T in mean determines number of rows of output.

  • std (float or [float]) – One or more standard deviations. Number of elements T in std determines number of rows of output.

  • seed (int) – Seed for random number generator.

Return type:

Lognormal
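The relationship between the two parameterizations follows from the standard lognormal moment formulas: mean = exp(mu + sigma²/2) and std = mean · sqrt(exp(sigma²) − 1). The sketch below implements both directions; the function names are hypothetical and are not the library's own conversion helpers.

```python
import math

def normal_pars_from_lognormal_pars(mean, std):
    """Return (mu, sigma) of the underlying normal for a given lognormal mean and std."""
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def lognormal_pars_from_normal_pars(mu, sigma):
    """Return (mean, std) of the lognormal for given underlying normal parameters."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    std = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean, std

mu, sigma = normal_pars_from_lognormal_pars(mean=1.0, std=0.1)
mean_back, std_back = lognormal_pars_from_normal_pars(mu, sigma)
```

For a mean of one, mu equals −sigma²/2, which is the restriction a mean-one lognormal imposes.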

class HARK.distributions.MeanOneLogNormal(mu: float | ndarray = 0.0, sigma: float | ndarray = 1.0, seed: int | None = 0, mean=None, std=None)#

Bases: Lognormal

A Lognormal distribution with mean 1.

class HARK.distributions.Normal(mu=0.0, sigma=1.0, seed=0)#

Bases: ContinuousFrozenDistribution

A Normal distribution.

Parameters:
  • mu (float or [float]) – One or more means. Number of elements T in mu determines number of rows of output.

  • sigma (float or [float]) – One or more standard deviations. Number of elements T in sigma determines number of rows of output.

  • seed (int) – Seed for random number generator.

discretize(N, method='hermite', endpoints=False)#

For normal distributions, the Gauss-Hermite quadrature rule is used as the default method for discretization.
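A Gauss-Hermite discretization of a normal distribution can be sketched with NumPy's hermgauss, which returns nodes and weights for the weight function exp(−x²). The change of variables below is the standard one; the helper name normal_gauss_hermite is hypothetical and this is not claimed to match the method's code line for line.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def normal_gauss_hermite(N, mu=0.0, sigma=1.0):
    """Return (pmv, atoms) of an N-point Gauss-Hermite approximation to N(mu, sigma^2)."""
    x, w = hermgauss(N)                    # nodes and weights for weight exp(-x**2)
    atoms = mu + np.sqrt(2.0) * sigma * x  # substitute z = mu + sqrt(2) * sigma * x
    pmv = w / np.sqrt(np.pi)               # rescale so the weights sum to one
    return pmv, atoms

pmv, atoms = normal_gauss_hermite(7, mu=0.5, sigma=2.0)
mean = np.dot(pmv, atoms)
var = np.dot(pmv, (atoms - mean) ** 2)
```

An N-point rule integrates polynomials up to degree 2N − 1 exactly, so the discretized mean and variance match mu and sigma² up to round-off.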

class HARK.distributions.Weibull(scale=1.0, shape=1.0, seed=0)#

Bases: ContinuousFrozenDistribution

A Weibull distribution.

Parameters:
  • scale (float or [float]) – One or more scales. Number of elements T in scale determines number of rows of output.

  • shape (float or [float]) – One or more shape parameters. Number of elements T in shape determines number of rows of output.

  • seed (int) – Seed for random number generator.

class HARK.distributions.Bernoulli(p=0.5, seed=0)#

Bases: DiscreteFrozenDistribution

A Bernoulli distribution.

Parameters:
  • p (float or [float]) – Probability or probabilities of the event occurring (True).

  • seed (int) – Seed for random number generator.

class HARK.distributions.MVLogNormal(mu: List | ndarray = [0.0, 0.0], Sigma: List | ndarray = [[1.0, 0.0], [0.0, 1.0]], seed=None)#

Bases: multi_rv_frozen, Distribution

A Multivariate Lognormal distribution.

Parameters:
  • mu (Union[list, numpy.ndarray], optional) – Means of underlying multivariate normal, default [0.0, 0.0].

  • Sigma (Union[list, numpy.ndarray], optional) – nxn variance-covariance matrix of underlying multivariate normal, default [[1.0, 0.0], [0.0, 1.0]].

  • seed (int, optional) – Seed for random number generator, default None.

mean()#

Mean of the distribution.

Returns:

Mean of the distribution.

Return type:

np.ndarray

rvs(size: int = 1, random_state=None)#

Random sample from the distribution.

Parameters:
  • size (int) – Number of data points to generate.

  • random_state (optional) – Seed for random number generator.

Returns:

Random sample from the distribution.

Return type:

np.ndarray

class HARK.distributions.MVNormal(mu=[1, 1], Sigma=[[1, 0], [0, 1]], seed=0)#

Bases: multivariate_normal_frozen, Distribution

A Multivariate Normal distribution.

Parameters:
  • mu (numpy array) – Mean vector.

  • Sigma (2-D numpy array) – Variance-covariance matrix. Each dimension must have length equal to that of mu.

  • seed (int) – Seed for random number generator.

discretize(N, method='hermite', endpoints=False)#

For multivariate normal distributions, the Gauss-Hermite quadrature rule is used as the default method for discretization.

HARK.distributions.approx_beta(N, a=1.0, b=1.0)#

Calculate a discrete approximation to the beta distribution. May be quite slow, as it uses a rudimentary numeric integration method to generate the discrete approximation.

Parameters:
  • N (int) – Size of discrete space vector to be returned.

  • a (float) – First shape parameter (sometimes called alpha).

  • b (float) – Second shape parameter (sometimes called beta).

Returns:

d – Probability associated with each point in array of discrete points for discrete probability mass function.

Return type:

DiscreteDistribution

HARK.distributions.approx_lognormal_gauss_hermite(N, mu=0.0, sigma=1.0, seed=0)#
HARK.distributions.calc_expectation(dstn, func=<function <lambda>>, *args)#

Expectation of a function, given an array of configurations of its inputs along with a DiscreteDistribution object that specifies the probability of each configuration.

Parameters:
  • dstn (DiscreteDistribution) – The distribution over which the function is to be evaluated.

  • func (function) – The function to be evaluated. This function should take an array of shape dstn.dim() and return either arrays of arbitrary shape or scalars. It may also take other arguments *args.

  • *args – Other inputs for func, representing the non-stochastic arguments. The expectation is computed at f(dstn, *args).

Returns:

f_exp – The expectation of the function at the queried values. Scalar if only one value.

Return type:

np.array or scalar

HARK.distributions.calc_lognormal_style_pars_from_normal_pars(mu_normal, std_normal)#
HARK.distributions.calc_normal_style_pars_from_lognormal_pars(avg_lognormal, std_lognormal)#
HARK.distributions.combine_indep_dstns(*distributions, seed=0)#

Given n independent vector-valued discrete distributions, construct their joint discrete distribution. Can take multivariate discrete distributions as inputs.

Parameters:

distributions (DiscreteDistribution) – Arbitrary number of discrete distributions to combine. Their realizations must be vector-valued (for each D in distributions, it must be the case that len(D.dim())==1).

Returns:

A DiscreteDistribution representing the joint distribution of the given random variables.
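For independent variables, the joint probabilities are products of the marginal probabilities and the joint atoms are all combinations of the marginal atoms. The NumPy sketch below builds such a joint distribution for two univariate inputs; the arrays are made up and the code does not call combine_indep_dstns.

```python
import numpy as np

# Two hypothetical univariate discrete distributions.
p1, a1 = np.array([0.5, 0.5]), np.array([0.9, 1.1])
p2, a2 = np.array([0.2, 0.3, 0.5]), np.array([1.0, 2.0, 3.0])

# Joint probabilities: the product of marginals for every (i, j) pair.
pmv = np.outer(p1, p2).ravel()

# Joint atoms in the same (i, j) order; rows are variables, columns are atoms.
g1, g2 = np.meshgrid(a1, a2, indexing="ij")
atoms = np.stack([g1.ravel(), g2.ravel()])  # shape (2, 6)

# Independence check: E[x1 * x2] equals E[x1] * E[x2].
joint_moment = np.dot(atoms[0] * atoms[1], pmv)
```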

HARK.distributions.distr_of_function(dstn, func=<function <lambda>>, *args)#

Finds the distribution of a random variable Y that is a function of discrete random variable atoms, Y=f(atoms).

Parameters:
  • dstn (DiscreteDistribution) – The distribution over which the function is to be evaluated.

  • func (function) – The function to be evaluated. This function should take an array of shape dstn.dim(). It may also take other arguments *args.

  • *args – Additional non-stochastic arguments for func. The function is computed at f(dstn, *args).

Returns:

f_dstn – The distribution of func(dstn).

Return type:

DiscreteDistribution

HARK.distributions.expected(func=None, dist=None, args=(), **kwargs)#

Expectation of a function, given an array of configurations of its inputs along with a DiscreteDistribution or DiscreteDistributionLabeled object that specifies the probability of each configuration.

Parameters:
  • func (function) – The function to be evaluated. This function should take the full array of distribution values and return either arrays of arbitrary shape or scalars. It may also take other arguments *args. This function differs from the standalone calc_expectation method in that it uses numpy’s vectorization and broadcasting rules to avoid costly iteration. Note: If you need to use a function that acts on single outcomes of the distribution, consider distribution.calc_expectation.

  • dist (DiscreteDistribution or DiscreteDistributionLabeled) – The distribution over which the function is to be evaluated.

  • args (tuple) – Other inputs for func, representing the non-stochastic arguments. The expectation is computed at f(dstn, *args).

  • labels (bool) – If True, the function should use labeled indexing instead of integer indexing using the distribution’s underlying rv coordinates. For example, if dims = (‘rv’, ‘x’) and coords = {‘rv’: [‘a’, ‘b’], }, then the function can be lambda x: x[“a”] + x[“b”].

Returns:

f_exp – The expectation of the function at the queried values. Scalar if only one value.

Return type:

np.array or scalar

class HARK.distributions.Uniform(bot=0.0, top=1.0, seed=0)#

Bases: ContinuousFrozenDistribution

A Uniform distribution.

Parameters:
  • bot (float or [float]) – One or more bottom values. Number of elements T in bot determines number of rows of output.

  • top (float or [float]) – One or more top values. Number of elements T in top determines number of rows of output.

  • seed (int) – Seed for random number generator.

class HARK.distributions.MarkovProcess(transition_matrix, seed=0)#

Bases: Distribution

A representation of a discrete Markov process.

Parameters:
  • transition_matrix (np.array) – An array of floats representing a probability mass for each state transition.

  • seed (int) – Seed for random number generator.

transition_matrix = None#
draw(state)#

Draw new states from the transition matrix.

Parameters:

state (int or np.array) – The state or states (1-D array) from which to draw new states.

Returns:

new_state – New states.

Return type:

int or np.array
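Drawing the next state amounts to sampling from the row of the transition matrix that corresponds to the current state. The sketch below assumes rows index the current state and columns the next state; the helper draw_next_states is hypothetical and uses an inverse-CDF comparison rather than any particular library routine.

```python
import numpy as np

def draw_next_states(transition_matrix, state, seed=0):
    """Sample one next state for each current state in `state`."""
    rng = np.random.default_rng(seed)
    P = np.asarray(transition_matrix, dtype=float)
    state = np.atleast_1d(state)
    # One row of cumulative probabilities per current state.
    cutoffs = np.cumsum(P[state], axis=1)
    u = rng.random(state.size)[:, np.newaxis]
    # The next state is the number of cutoffs that u has reached or passed.
    new_state = (u >= cutoffs).sum(axis=1)
    # Guard against round-off leaving the last cutoff slightly below one.
    return np.minimum(new_state, P.shape[1] - 1)

# A deterministic two-state chain that always switches state.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])
new_state = draw_next_states(P, np.array([0, 1, 0]))
```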

HARK.distributions.add_discrete_outcome_constant_mean(distribution, x, p, sort=False)#

Adds a discrete outcome of x with probability p to an existing distribution, holding constant the relative probabilities of other outcomes and overall mean.

Parameters:
  • distribution (DiscreteDistribution) – A one-dimensional DiscreteDistribution.

  • x (float) – The new value to be added to the distribution.

  • p (float) – The probability of the discrete outcome x occurring.

  • sort (bool) – Whether to sort the atoms before returning the distribution.

Returns:

d – Probability associated with each point in array of discrete points for discrete probability mass function.

Return type:

DiscreteDistribution
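One way to satisfy both constraints (unchanged relative probabilities and unchanged mean) is to scale the old probabilities by 1 − p and rescale the old atoms by a common factor chosen to restore the mean. The pure-Python sketch below assumes that construction, a nonzero original mean, and p < 1; the helper name is hypothetical.

```python
def add_outcome_constant_mean(pmv, atoms, x, p):
    """Add outcome x with probability p, keeping the mean and relative probabilities."""
    mean = sum(q * a for q, a in zip(pmv, atoms))
    # Solve p * x + (1 - p) * scale * mean = mean for the common scale factor.
    scale = (mean - p * x) / ((1.0 - p) * mean)
    new_pmv = [p] + [q * (1.0 - p) for q in pmv]
    new_atoms = [x] + [a * scale for a in atoms]
    return new_pmv, new_atoms

new_pmv, new_atoms = add_outcome_constant_mean([0.5, 0.5], [0.9, 1.1], x=0.0, p=0.05)
new_mean = sum(q * a for q, a in zip(new_pmv, new_atoms))
```

When the original mean is one, the scale factor reduces to (1 − p·x) / (1 − p).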