Distribution#
- class HARK.distribution.Distribution(seed: int | None = 0)#
Bases:
object
Base class for all probability distributions with seed and random number generator.
For discussion on random number generation and random seeds, see https://docs.scipy.org/doc/scipy/tutorial/stats.html#random-number-generation
- Parameters:
seed (Optional[int]) – Seed for random number generator.
- reset() None #
Reset the random number generator of this distribution. Resetting the seed will result in the same sequence of random numbers being generated.
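The reset behaviour described above can be illustrated with a small stand-in class. This is a minimal sketch using only the standard library; `SeededSource` is a hypothetical name and is not part of HARK, whose own generator may differ.

```python
import random


class SeededSource:
    """Hypothetical stand-in for a distribution that owns its own RNG."""

    def __init__(self, seed=0):
        self.seed = seed
        self.reset()

    def reset(self):
        # Rebuild the generator from the stored seed.
        self._rng = random.Random(self.seed)

    def draw(self, n):
        return [self._rng.random() for _ in range(n)]


source = SeededSource(seed=0)
first = source.draw(3)
second = source.draw(3)    # continues the stream, so it differs from `first`
source.reset()
replayed = source.draw(3)  # same values as `first`
```

Resetting is useful when a simulation has to be repeated with identical shocks, for example when comparing two parameterizations of a model.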
- draw(N: int) ndarray #
Generate arrays of draws from this distribution. If the distribution has scalar parameters, the output is a length N array of draws. If the distribution has T sets of parameters, the output is a length T list whose t-th entry is a length N array of draws from the t-th parameterization.
- Parameters:
N (int) – Number of draws in each row.
- Returns:
draws – T-length list of arrays of random variable draws, each of size N, or a single array of size N (if the distribution's parameters are scalars).
- Return type:
np.array or [np.array]
- discretize(N: int, method: str = 'equiprobable', endpoints: bool = False, **kwds: Any) DiscreteDistribution #
Discretize the distribution into N points using the specified method.
- Parameters:
- Returns:
Discretized distribution.
- Return type:
DiscreteDistribution
- Raises:
NotImplementedError – If method is not implemented for this distribution.
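As an illustration of what an equiprobable discretization can look like, the sketch below splits a normal distribution into N bins of probability 1/N and uses the conditional mean of each bin as its atom. The choice of conditional means as atoms and the name `equiprobable_normal` are assumptions for this example, not a description of HARK's implementation.

```python
from statistics import NormalDist


def equiprobable_normal(n, mu=0.0, sigma=1.0):
    """Return (pmv, atoms) for an n-point equiprobable approximation."""
    std = NormalDist()  # standard normal, used for pdf and inverse cdf
    # Interior bin edges in standard units; the outer edges are +/- infinity.
    edges = [std.inv_cdf(k / n) for k in range(1, n)]
    # The pdf vanishes at the infinite outer edges.
    pdf_at = [0.0] + [std.pdf(z) for z in edges] + [0.0]
    pmv = [1.0 / n] * n
    # E[Z | a < Z < b] = (pdf(a) - pdf(b)) / P(a < Z < b), with P = 1/n here.
    atoms = [mu + sigma * n * (pdf_at[k] - pdf_at[k + 1]) for k in range(n)]
    return pmv, atoms


pmv, atoms = equiprobable_normal(5, mu=2.0, sigma=0.5)
mean = sum(p * x for p, x in zip(pmv, atoms))  # matches mu up to rounding
```

Because the atoms are conditional means, the approximation reproduces the mean of the continuous distribution for any N.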
- class HARK.distribution.ContinuousFrozenDistribution(dist: rv_continuous, *args: Any, seed: int = 0, **kwds: Any)#
Bases:
rv_continuous_frozen, Distribution
Parameterized continuous distribution from scipy.stats with seed management.
- class HARK.distribution.Normal(mu=0.0, sigma=1.0, seed=0)#
Bases:
ContinuousFrozenDistribution
A Normal distribution.
- Parameters:
- discretize(N, method='hermite', endpoints=False)#
For normal distributions, the Gauss-Hermite quadrature rule is used as the default method for discretization.
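To show the idea behind a Gauss-Hermite discretization, the sketch below hard-codes the three-point rule for a standard normal (nodes 0 and ±√3, weights 2/3 and 1/6) and rescales it to a given mean and standard deviation. The function name `hermite_3pt` is hypothetical; a general N-point rule would compute nodes and weights numerically.

```python
import math

# Three-point Gauss-Hermite rule for a standard normal (probabilists' form):
# nodes 0 and +/- sqrt(3), with weights 2/3 and 1/6 each.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def hermite_3pt(mu=0.0, sigma=1.0):
    """Return (pmv, atoms) of the rescaled three-point rule."""
    atoms = [mu + sigma * z for z in NODES]
    return list(WEIGHTS), atoms


pmv, atoms = hermite_3pt(mu=1.0, sigma=2.0)
mean = sum(p * x for p, x in zip(pmv, atoms))
variance = sum(p * (x - mean) ** 2 for p, x in zip(pmv, atoms))
```

With only three points the rule already matches the mean and variance of the normal distribution, which is why quadrature nodes are a natural default for normal shocks.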
- class HARK.distribution.Lognormal(mu: float | ndarray = 0.0, sigma: float | ndarray = 1.0, seed: int | None = 0, mean=None, std=None)#
Bases:
ContinuousFrozenDistribution
A Lognormal distribution.
- Parameters:
mu (float or [float]) – One or more means of underlying normal distribution. Number of elements T in mu determines number of rows of output.
sigma (float or [float]) – One or more standard deviations of underlying normal distribution. Number of elements T in sigma determines number of rows of output.
seed (int) – Seed for random number generator.
- classmethod from_mean_std(mean, std, seed=0)#
Construct a Lognormal distribution from its mean and standard deviation.
This differs from the regular constructor, which takes mu and sigma, the mean and standard deviation of the underlying normal distribution (that is, of the logarithm of the lognormal variable).
- Parameters:
- Return type:
Lognormal
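The relationship between the two parameterizations follows from the standard lognormal moment formulas. The helper names below are hypothetical and the code is a sketch of the arithmetic only, not HARK's implementation.

```python
import math


def normal_pars_from_lognormal_moments(mean, std):
    """Return (mu, sigma) of log(X) given the mean and std of lognormal X."""
    sigma_sq = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma_sq
    return mu, math.sqrt(sigma_sq)


def lognormal_moments_from_normal_pars(mu, sigma):
    """Return (mean, std) of lognormal X given mu and sigma of log(X)."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    std = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean, std


mu, sigma = normal_pars_from_lognormal_moments(mean=1.0, std=0.1)
mean_back, std_back = lognormal_moments_from_normal_pars(mu, sigma)
```

For a mean of 1 the first formula reduces to mu = -sigma**2 / 2, which is the restriction a mean-one lognormal has to satisfy.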
- class HARK.distribution.MeanOneLogNormal(mu: float | ndarray = 0.0, sigma: float | ndarray = 1.0, seed: int | None = 0, mean=None, std=None)#
Bases:
Lognormal
A Lognormal distribution with mean 1.
- class HARK.distribution.Uniform(bot=0.0, top=1.0, seed=0)#
Bases:
ContinuousFrozenDistribution
A Uniform distribution.
- class HARK.distribution.Weibull(scale=1.0, shape=1.0, seed=0)#
Bases:
ContinuousFrozenDistribution
A Weibull distribution.
- class HARK.distribution.MVNormal(mu=[1, 1], Sigma=[[1, 0], [0, 1]], seed=0)#
Bases:
multivariate_normal_frozen, Distribution
A Multivariate Normal distribution.
- Parameters:
mu (numpy array) – Mean vector.
Sigma (2-d numpy array) – Variance-covariance matrix. Each dimension must have length equal to that of mu.
seed (int) – Seed for random number generator.
- discretize(N, method='hermite', endpoints=False)#
For multivariate normal distributions, the Gauss-Hermite quadrature rule is used as the default method for discretization.
- class HARK.distribution.MVLogNormal(mu: List | ndarray = [0.0, 0.0], Sigma: List | ndarray = [[1.0, 0.0], [0.0, 1.0]], seed=None)#
Bases:
multi_rv_frozen, Distribution
A Multivariate Lognormal distribution.
- Parameters:
mu (Union[list, numpy.ndarray], optional) – Means of underlying multivariate normal, default [0.0, 0.0].
Sigma (Union[list, numpy.ndarray], optional) – nxn variance-covariance matrix of underlying multivariate normal, default [[1.0, 0.0], [0.0, 1.0]].
seed (int, optional) – Seed for random number generator, default 0.
- mean()#
Mean of the distribution.
- Returns:
Mean of the distribution.
- Return type:
np.ndarray
- class HARK.distribution.DiscreteFrozenDistribution(dist: rv_discrete, *args: Any, seed: int = 0, **kwds: Any)#
Bases:
rv_discrete_frozen, Distribution
Parameterized discrete distribution from scipy.stats with seed management.
- class HARK.distribution.Bernoulli(p=0.5, seed=0)#
Bases:
DiscreteFrozenDistribution
A Bernoulli distribution.
- class HARK.distribution.DiscreteDistribution(pmv: ndarray, atoms: ndarray, seed: int = 0, limit: Dict[str, Any] | None = None)#
Bases:
Distribution
A representation of a discrete probability distribution.
- Parameters:
pmv (np.array) – An array of floats representing a probability mass function.
atoms (np.array) – Discrete point values for each probability mass. For multivariate distributions, the last dimension of atoms must index “atom” or the random realization. For instance, if atoms.shape == (2,6,4), the random variable has 4 possible realizations and each of them has shape (2,6).
seed (int) – Seed for random number generator.
- draw_events(N: int) ndarray #
Draws N ‘events’ from the distribution PMF. These events are indices into atoms.
- draw(N: int, atoms: None | int | ndarray = None, exact_match: bool = False) ndarray #
Simulates N draws from a discrete distribution with probabilities pmv and outcomes atoms.
- Parameters:
N (int) – Number of draws to simulate.
atoms (None, int, or np.array) – If None, then use this distribution’s atoms for point values. If an int, then the index of atoms for the point values. If an np.array, use the array for the point values.
exact_match (boolean) – Whether the draws should “exactly” match the discrete distribution (as closely as possible given finite draws). When True, returned draws are a random permutation of the N-length list that best fits the discrete distribution. When False (default), each draw is independent from the others and the result could deviate from the input.
- Returns:
draws – An array of draws from the discrete distribution; each element is a value in atoms.
- Return type:
np.array
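The difference between independent draws and exact_match draws can be sketched as follows: each atom is repeated in proportion to its probability and the resulting list is shuffled. The rounding rule for allocating counts and the name `draw_exact_match` are assumptions for this example; the library may allocate the draws differently.

```python
import random


def draw_exact_match(pmv, atoms, n, seed=0):
    """Return n draws whose frequencies match pmv as closely as possible."""
    rng = random.Random(seed)
    # Cumulative probabilities rounded to whole numbers of draws
    # (this allocation rule is an assumption of the sketch).
    cutoffs = []
    running = 0.0
    for p in pmv:
        running += p
        cutoffs.append(min(n, round(running * n)))
    cutoffs[-1] = n  # guard against floating-point drift in the last cutoff
    draws = []
    start = 0
    for atom, stop in zip(atoms, cutoffs):
        draws.extend([atom] * (stop - start))
        start = stop
    rng.shuffle(draws)  # random permutation of the best-fitting list
    return draws


draws = draw_exact_match([0.25, 0.75], [10, 20], n=8)
```

In this example exactly two of the eight draws equal 10 and six equal 20, whereas independent draws would only match those shares on average.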
- expected(func: Callable | None = None, *args: ndarray) ndarray #
Expected value of a function, given an array of configurations of its inputs along with a DiscreteDistribution object that specifies the probability of each configuration.
If no function is provided, it’s much faster to go straight to dot product instead of calling the dummy function.
If a function is provided, we need to add one more dimension, the atom dimension, to any inputs that are n-dim arrays. This allows numpy to easily broadcast the function’s output. For more information on broadcasting, see: https://numpy.org/doc/stable/user/basics.broadcasting.html#general-broadcasting-rules
- Parameters:
func (function) – The function to be evaluated. This function should take the full array of distribution values and return either arrays of arbitrary shape or scalars. It may also take other arguments *args. This function differs from the standalone calc_expectation method in that it uses numpy’s vectorization and broadcasting rules to avoid costly iteration. Note: If you need to use a function that acts on single outcomes of the distribution, consider distribution.calc_expectation.
*args – Other inputs for func, representing the non-stochastic arguments. The expectation is computed at f(dstn, *args).
- Returns:
f_exp – The expectation of the function at the queried values. Scalar if only one value.
- Return type:
np.array or scalar
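The quantity being computed is the probability-weighted sum of func evaluated at each atom. The sketch below spells out that arithmetic in plain Python, deliberately without NumPy, so it shows the definition rather than the broadcasting strategy; `expected_value` is a hypothetical name and is not the method itself.

```python
def expected_value(pmv, atoms, func=None, *args):
    """Probability-weighted average of func over the atoms."""
    if func is None:
        # No function supplied: this is the plain mean, i.e. a dot product.
        return sum(p * x for p, x in zip(pmv, atoms))
    return sum(p * func(x, *args) for p, x in zip(pmv, atoms))


pmv = [0.5, 0.5]
atoms = [1.0, 3.0]
mean = expected_value(pmv, atoms)                                   # 2.0
scaled_sq = expected_value(pmv, atoms, lambda x, a: a * x**2, 2.0)  # 10.0
```

The extra positional argument plays the role of a non-stochastic input: it is passed unchanged to the function for every atom.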
- dist_of_func(func: ~typing.Callable[[...], float] = <function DiscreteDistribution.<lambda>>, *args: ~typing.Any) DiscreteDistribution #
Finds the distribution of a random variable Y that is a function of discrete random variable atoms, Y=f(atoms).
- Parameters:
func (function) – The function to be evaluated. This function should take the full array of distribution values. It may also take other arguments *args.
*args – Additional non-stochastic arguments for func. The function is computed as f(dstn, *args).
- Returns:
f_dstn – The distribution of func(dstn).
- Return type:
DiscreteDistribution
- discretize(N: int, *args: Any, **kwargs: Any) DiscreteDistribution #
DiscreteDistribution is already an approximation, so this method returns a copy of the distribution.
TODO: print warning message?
- make_univariate(dim_to_keep, seed=0)#
Make a univariate discrete distribution from this distribution, keeping only the specified dimension.
- Parameters:
- Returns:
univariate_dstn – Univariate distribution with only the specified index.
- Return type:
DiscreteDistribution
- class HARK.distribution.DiscreteDistributionLabeled(pmv: ndarray, atoms: ndarray, seed: int = 0, limit: Dict[str, Any] | None = None, name: str = 'DiscreteDistributionLabeled', attrs: Dict[str, Any] | None = None, var_names: List[str] | None = None, var_attrs: List[Dict[str, Any] | None] | None = None)#
Bases:
DiscreteDistribution
A representation of a discrete probability distribution stored in an underlying xarray.Dataset.
- Parameters:
pmv (np.array) – An array of values representing a probability mass function.
atoms (np.array) – Discrete point values for each probability mass. For multivariate distributions, the last dimension of atoms must index “atom” or the random realization. For instance, if atoms.shape == (2,6,4), the random variable has 4 possible realizations and each of them has shape (2,6).
seed (int) – Seed for random number generator.
name (str) – Name of the distribution.
attrs (dict) – Attributes for the distribution.
var_names (list of str) – Names of the variables in the distribution.
var_attrs (list of dict) – Attributes of the variables in the distribution.
- classmethod from_unlabeled(dist, name='DiscreteDistributionLabeled', attrs=None, var_names=None, var_attrs=None)#
- classmethod from_dataset(x_obj, pmf)#
- property variables#
A dict-like container of DataArrays corresponding to the variables of the distribution.
- property name#
The distribution’s name.
- property attrs#
The distribution’s attributes.
- dist_of_func(func: ~typing.Callable = <function DiscreteDistributionLabeled.<lambda>>, *args, **kwargs) DiscreteDistribution #
Finds the distribution of a random variable Y that is a function of discrete random variable atoms, Y=f(atoms).
- Parameters:
func (function) – The function to be evaluated. This function should take the full array of distribution values. It may also take other arguments *args.
*args – Additional non-stochastic arguments for func. The function is computed as f(dstn, *args).
**kwargs – Additional keyword arguments for func. Must be xarray compatible in order to work with xarray broadcasting.
- Returns:
f_dstn – The distribution of func(dstn).
- Return type:
DiscreteDistribution
- expected(func: Callable | None = None, *args: Any, **kwargs: Any) float | ndarray #
Expectation of a function, given an array of configurations of its inputs along with a DiscreteDistributionLabeled object that specifies the probability of each configuration.
- Parameters:
func (function) – The function to be evaluated. This function should take the full array of distribution values and return either arrays of arbitrary shape or scalars. It may also take other arguments *args. This function differs from the standalone calc_expectation method in that it uses numpy’s vectorization and broadcasting rules to avoid costly iteration. Note: If you need to use a function that acts on single outcomes of the distribution, consider distribution.calc_expectation.
*args – Other inputs for func, representing the non-stochastic arguments. The expectation is computed at f(dstn, *args).
labels (bool) – If True, the function should use labeled indexing instead of integer indexing using the distribution’s underlying rv coordinates. For example, if dims = (‘rv’, ‘x’) and coords = {‘rv’: [‘a’, ‘b’], }, then the function can be lambda x: x[“a”] + x[“b”].
- Returns:
f_exp – The expectation of the function at the queried values. Scalar if only one value.
- Return type:
np.array or scalar
- class HARK.distribution.IndexDistribution(engine, conditional, RNG=None, seed=0)#
Bases:
Distribution
This class provides a way to define a distribution that is conditional on an index.
The current implementation combines a defined distribution class (such as Bernoulli, Lognormal, etc.) with information about the conditions on the parameters of the distribution.
For example, an IndexDistribution can be defined as a Bernoulli distribution whose parameter p is a function of a different input parameter.
- Parameters:
- conditional = None#
- engine = None#
- discretize(N, **kwds)#
Approximation of the distribution.
- Parameters:
N (int) – Number of discrete points to approximate continuous distribution into.
kwds (dict) – Other keyword arguments passed to the engine distribution's discretize() method.
- Returns:
dists – A list of DiscreteDistributions that approximate the engine distribution under each condition.
- Return type:
[DiscreteDistribution]
TODO: It would be better if there were a conditional discrete distribution representation. But that integrates with the solution code. This implementation will return the list of distribution representations expected by the solution code.
- draw(condition)#
Generate arrays of draws. The input is an array containing the conditions. The output is an array of the same length (axis 1 dimension) as the conditions containing random draws of the conditional distribution.
- Parameters:
condition (np.array) – The input conditions to the distribution.
- Returns:
draws
- Return type:
np.array
- class HARK.distribution.TimeVaryingDiscreteDistribution(distributions, seed=0)#
Bases:
Distribution
This class provides a way to define a discrete distribution that is conditional on an index.
Wraps a list of discrete distributions.
- Parameters:
distributions ([DiscreteDistribution]) – A list of discrete distributions
seed (int) – Seed for random number generator.
- distributions = []#
- draw(condition)#
Generate arrays of draws. The input is an array containing the conditions. The output is an array of the same length (axis 1 dimension) as the conditions containing random draws of the conditional distribution.
- Parameters:
condition (np.array) – The input conditions to the distribution.
- Returns:
draws
- Return type:
np.array
- HARK.distribution.approx_lognormal_gauss_hermite(N, mu=0.0, sigma=1.0, seed=0)#
- HARK.distribution.calc_normal_style_pars_from_lognormal_pars(avg_lognormal, std_lognormal)#
- HARK.distribution.calc_lognormal_style_pars_from_normal_pars(mu_normal, std_normal)#
- HARK.distribution.approx_beta(N, a=1.0, b=1.0)#
Calculate a discrete approximation to the beta distribution. May be quite slow, as it uses a rudimentary numeric integration method to generate the discrete approximation.
- Parameters:
- Returns:
d – Probability associated with each point in array of discrete points for discrete probability mass function.
- Return type:
- HARK.distribution.make_markov_approx_to_normal(x_grid, mu, sigma, K=351, bound=3.5)#
Creates an approximation to a normal distribution with mean mu and standard deviation sigma, returning a stochastic vector called p_vec, corresponding to values in x_grid. If a RV is distributed x~N(mu,sigma), then the expectation of a continuous function f() is E[f(x)] = numpy.dot(p_vec,f(x_grid)).
- Parameters:
x_grid (numpy.array) – A sorted 1D array of floats representing discrete values that a normally distributed RV could take on.
mu (float) – Mean of the normal distribution to be approximated.
sigma (float) – Standard deviation of the normal distribution to be approximated.
K (int) – Number of points in the normal distribution to sample.
bound (float) – Truncation bound of the normal distribution, as +/- bound*sigma.
- Returns:
p_vec – A stochastic vector with probability weights for each x in x_grid.
- Return type:
numpy.array
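The identity E[f(x)] ≈ dot(p_vec, f(x_grid)) can be tried out with a simple weighting scheme. The sketch below assigns each grid point the normal probability of the interval between the midpoints to its neighbours; this binning rule is a swapped-in technique chosen for brevity and is not the K-point sampling scheme that the function's K and bound parameters refer to. The name `normal_weights_on_grid` is hypothetical.

```python
from statistics import NormalDist


def normal_weights_on_grid(x_grid, mu, sigma):
    """Return a probability vector over x_grid for N(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    # Midpoints between neighbouring grid points act as bin edges.
    mids = [(a + b) / 2.0 for a, b in zip(x_grid[:-1], x_grid[1:])]
    cdf = [0.0] + [dist.cdf(m) for m in mids] + [1.0]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]


x_grid = [-3.0 + 0.25 * i for i in range(25)]  # -3.0, -2.75, ..., 3.0
p_vec = normal_weights_on_grid(x_grid, mu=0.0, sigma=1.0)
approx_mean = sum(p * x for p, x in zip(p_vec, x_grid))
approx_second_moment = sum(p * x ** 2 for p, x in zip(p_vec, x_grid))
```

On this grid the approximate mean is close to 0 and the approximate second moment is close to 1, the exact values for a standard normal.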
- HARK.distribution.make_markov_approx_to_normal_by_monte_carlo(x_grid, mu, sigma, N_draws=10000)#
Creates an approximation to a normal distribution with mean mu and standard deviation sigma, by Monte Carlo. Returns a stochastic vector called p_vec, corresponding to values in x_grid. If a RV is distributed x~N(mu,sigma), then the expectation of a continuous function f() is E[f(x)] = numpy.dot(p_vec,f(x_grid)).
- Parameters:
x_grid (numpy.array) – A sorted 1D array of floats representing discrete values that a normally distributed RV could take on.
mu (float) – Mean of the normal distribution to be approximated.
sigma (float) – Standard deviation of the normal distribution to be approximated.
N_draws (int) – Number of draws to use in Monte Carlo.
- Returns:
p_vec – A stochastic vector with probability weights for each x in x_grid.
- Return type:
numpy.array
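A Monte Carlo version of the same idea can be sketched by drawing from the normal distribution and counting how many draws land nearest to each grid point. Assigning each draw to its nearest grid point is an assumption of this example, and `normal_weights_by_monte_carlo` is a hypothetical name; the library may distribute the weight of each draw differently.

```python
import bisect
import random


def normal_weights_by_monte_carlo(x_grid, mu, sigma, n_draws=10000, seed=0):
    """Return a probability vector over x_grid estimated from random draws."""
    rng = random.Random(seed)
    # Midpoints between neighbouring grid points separate the nearest-point bins.
    mids = [(a + b) / 2.0 for a, b in zip(x_grid[:-1], x_grid[1:])]
    counts = [0] * len(x_grid)
    for _ in range(n_draws):
        draw = rng.gauss(mu, sigma)
        # bisect_left returns the index of the bin that contains the draw.
        counts[bisect.bisect_left(mids, draw)] += 1
    return [c / n_draws for c in counts]


grid = [-3.0 + 0.5 * i for i in range(13)]  # -3.0, -2.5, ..., 3.0
mc_p_vec = normal_weights_by_monte_carlo(grid, mu=0.0, sigma=1.0)
mc_mean = sum(p * x for p, x in zip(mc_p_vec, grid))
```

The estimate carries sampling noise that shrinks as N_draws grows, so it is mainly useful as a check on a deterministic approximation.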
- HARK.distribution.make_tauchen_ar1(N, sigma=1.0, ar_1=0.9, bound=3.0)#
Function to return a discretized version of an AR1 process. See https://www.fperri.net/TEACHING/macrotheory08/numerical.pdf for details
- Parameters:
- Returns:
y (np.array) – Grid points on which the discretized process takes values
trans_matrix (np.array) – Markov transition array for the discretized process
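For orientation, a textbook version of Tauchen's method is sketched below in plain Python. It assumes that sigma is the standard deviation of the innovation, that the grid spans ±bound unconditional standard deviations of the process, and that N is at least 2; these are assumptions of the sketch and `tauchen_ar1` is a hypothetical name, so details may differ from the library function.

```python
import math
from statistics import NormalDist


def tauchen_ar1(n, sigma=1.0, ar_1=0.9, bound=3.0):
    """Return (grid, transition matrix) for y' = ar_1 * y + eps."""
    std_norm = NormalDist()
    # Unconditional std of y when eps has std sigma (assumption, see lead-in).
    sigma_y = sigma / math.sqrt(1.0 - ar_1 ** 2)
    y_max = bound * sigma_y
    step = 2.0 * y_max / (n - 1)
    y = [-y_max + step * i for i in range(n)]
    trans = []
    for y_i in y:
        row = []
        for j, y_j in enumerate(y):
            # Standardized edges of the bin around y_j, given current state y_i.
            upper = (y_j + step / 2.0 - ar_1 * y_i) / sigma
            lower = (y_j - step / 2.0 - ar_1 * y_i) / sigma
            if j == 0:
                row.append(std_norm.cdf(upper))  # everything below the first edge
            elif j == n - 1:
                row.append(1.0 - std_norm.cdf(lower))  # everything above the last
            else:
                row.append(std_norm.cdf(upper) - std_norm.cdf(lower))
        trans.append(row)
    return y, trans


y, trans_matrix = tauchen_ar1(5, sigma=1.0, ar_1=0.9, bound=3.0)
```

Each row of the transition matrix is a probability vector over next-period grid points conditional on the current grid point.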
- HARK.distribution.add_discrete_outcome_constant_mean(distribution, x, p, sort=False)#
Adds a discrete outcome of x with probability p to an existing distribution, holding constant the relative probabilities of other outcomes and overall mean.
- Parameters:
distribution (DiscreteDistribution) – A one-dimensional DiscreteDistribution.
x (float) – The new value to be added to the distribution.
p (float) – The probability of the discrete outcome x occurring.
sort (bool) – Whether or not to sort atoms before returning it
- Returns:
d – Probability associated with each point in array of discrete points for discrete probability mass function.
- Return type:
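One way to hold the mean constant while adding an outcome is to scale the existing probabilities by (1 - p) and rescale the existing atoms by a common factor. The sketch below works through that arithmetic; the common-factor rescaling and the name `add_outcome_constant_mean` are assumptions for this example, and it requires a nonzero original mean and p < 1.

```python
def add_outcome_constant_mean(pmv, atoms, x, p):
    """Add outcome x with probability p, keeping the overall mean unchanged."""
    old_mean = sum(q * a for q, a in zip(pmv, atoms))
    # Choose the scale so that p * x + (1 - p) * scale * old_mean == old_mean.
    scale = (old_mean - p * x) / ((1.0 - p) * old_mean)
    new_pmv = [p] + [(1.0 - p) * q for q in pmv]   # relative probabilities kept
    new_atoms = [x] + [scale * a for a in atoms]   # existing atoms rescaled
    return new_pmv, new_atoms


new_pmv, new_atoms = add_outcome_constant_mean(
    pmv=[0.5, 0.5], atoms=[0.9, 1.1], x=0.0, p=0.1
)
new_mean = sum(q * a for q, a in zip(new_pmv, new_atoms))
```

A typical use is adding a small-probability zero-income event to an income shock distribution without changing expected income.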
- HARK.distribution.add_discrete_outcome(distribution, x, p, sort=False)#
Adds a discrete outcome of x with probability p to an existing distribution, holding constant the relative probabilities of other outcomes.
- Parameters:
distribution (DiscreteDistribution) – One-dimensional distribution to which the outcome is to be added.
x (float) – The new value to be added to the distribution.
p (float) – The probability of the discrete outcome x occurring.
- Returns:
d – Probability associated with each point in array of discrete points for discrete probability mass function.
- Return type:
- HARK.distribution.combine_indep_dstns(*distributions, seed=0)#
Given n independent vector-valued discrete distributions, construct their joint discrete distribution. Can take multivariate discrete distributions as inputs.
- Parameters:
distributions (DiscreteDistribution) – Arbitrary number of discrete distributions to combine. Their realizations must be vector-valued (for each D in distributions, it must be the case that len(D.dim())==1).
- Returns:
A DiscreteDistribution representing the joint distribution of the given random variables.
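Under independence, the joint distribution is the Cartesian product of the atoms with probabilities multiplied together. The sketch below represents each distribution as a hypothetical (pmv, atoms) pair of lists rather than a DiscreteDistribution object, so it illustrates the construction only.

```python
from itertools import product


def combine_independent(*dstns):
    """Each input is a (pmv, atoms) pair of equal-length lists."""
    joint_pmv = []
    joint_atoms = []
    for combo in product(*(list(zip(pmv, atoms)) for pmv, atoms in dstns)):
        prob = 1.0
        outcome = []
        for p, atom in combo:
            prob *= p  # independence: the joint probability is the product
            outcome.append(atom)
        joint_pmv.append(prob)
        joint_atoms.append(tuple(outcome))
    return joint_pmv, joint_atoms


coin = ([0.5, 0.5], [0, 1])
shock = ([0.2, 0.8], [0.9, 1.1])
joint_pmv, joint_atoms = combine_independent(coin, shock)
```

Two inputs with two atoms each give four joint outcomes, so the number of atoms grows multiplicatively with the number of distributions combined.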
- HARK.distribution.calc_expectation(dstn, func=<function <lambda>>, *args)#
Expectation of a function, given an array of configurations of its inputs along with a DiscreteDistribution object that specifies the probability of each configuration.
- Parameters:
dstn (DiscreteDistribution) – The distribution over which the function is to be evaluated.
func (function) – The function to be evaluated. This function should take an array of shape dstn.dim() and return either arrays of arbitrary shape or scalars. It may also take other arguments *args.
*args – Other inputs for func, representing the non-stochastic arguments. The expectation is computed at f(dstn, *args).
- Returns:
f_exp – The expectation of the function at the queried values. Scalar if only one value.
- Return type:
np.array or scalar
- HARK.distribution.distr_of_function(dstn, func=<function <lambda>>, *args)#
Finds the distribution of a random variable Y that is a function of discrete random variable atoms, Y=f(atoms).
- Parameters:
dstn (DiscreteDistribution) – The distribution over which the function is to be evaluated.
func (function) – The function to be evaluated. This function should take an array of shape dstn.dim(). It may also take other arguments *args.
*args – Additional non-stochastic arguments for func. The function is computed at f(dstn, *args).
- Returns:
f_dstn – The distribution of func(dstn).
- Return type:
DiscreteDistribution
- class HARK.distribution.MarkovProcess(transition_matrix, seed=0)#
Bases:
Distribution
A representation of a discrete Markov process.
- Parameters:
transition_matrix (np.array) – An array of floats representing a probability mass for each state transition.
seed (int) – Seed for random number generator.
- transition_matrix = None#
- HARK.distribution.expected(func=None, dist=None, args=(), **kwargs)#
Expectation of a function, given an array of configurations of its inputs along with a DiscreteDistribution or DiscreteDistributionLabeled object that specifies the probability of each configuration.
- Parameters:
func (function) – The function to be evaluated. This function should take the full array of distribution values and return either arrays of arbitrary shape or scalars. It may also take other arguments *args. This function differs from the standalone calc_expectation method in that it uses numpy’s vectorization and broadcasting rules to avoid costly iteration. Note: If you need to use a function that acts on single outcomes of the distribution, consider distribution.calc_expectation.
dist (DiscreteDistribution or DiscreteDistributionLabeled) – The distribution over which the function is to be evaluated.
args (tuple) – Other inputs for func, representing the non-stochastic arguments. The expectation is computed at f(dstn, *args).
labels (bool) – If True, the function should use labeled indexing instead of integer indexing using the distribution’s underlying rv coordinates. For example, if dims = (‘rv’, ‘x’) and coords = {‘rv’: [‘a’, ‘b’], }, then the function can be lambda x: x[“a”] + x[“b”].
- Returns:
f_exp – The expectation of the function at the queried values. Scalar if only one value.
- Return type:
np.array or scalar