Probability Distributions

The distributions available for use in PoPy models are shown in Table 72:-

Table 72 Probability Distributions

Name                          Syntax
Uniform                       x ~ unif(min_x, max_x) init_x
Normal                        y ~ norm(mean, var)
Censored Normal               y ~ cennorm(mean, var, LLQ=llq, ULQ=ulq)
Rectified Normal              y ~ rectnorm(mean, var, LLQ=llq, ULQ=ulq)
Truncated Normal              y ~ truncnorm(mean, var, MIN=min, MAX=max)
Truncated Censored Normal     y ~ trunccennorm(mean, var, MIN=min, LLQ=llq, ULQ=ulq, MAX=max)
Truncated Rectified Normal    y ~ truncrectnorm(mean, var, MIN=min, LLQ=llq, ULQ=ulq, MAX=max)
Multi Normal                  y_vec ~ mnorm(mean_vec, var_mat)
Bernoulli                     y ~ bernoulli(p)
Poisson                       y ~ poisson(p)
Binomial                      y ~ binomial(p, n)
Negative Binomial             y ~ negbinomial(p, n)

Uniform Distribution

Uniform is a continuous univariate distribution, written as:-

x ~ unif(min_x, max_x) init_x

The uniform distribution is used to define a range of values for an unknown scalar that you wish PoPy to estimate.

The input parameters are:-

  • min_x - the minimum value that variable ‘x’ is allowed to take during estimation.
  • max_x - the maximum value that variable ‘x’ is allowed to take during estimation.
  • init_x - the initial value that variable ‘x’ takes at the start of estimation.

The output ‘x’ and inputs ‘min_x’, ‘max_x’, ‘init_x’ are all continuous values.

For more information see Uniform Distribution on Wikipedia.

Uniform Distribution Examples

You use the Uniform Distribution in the EFFECTS section of a PoPy Fit Script as follows:-

f[KE] ~ unif(0.1, 100) 0.05

The above expression limits the f[KE] variable to the range [0.1, 100] with an initial value of 0.05.

Alternatively you can use some convenient shortcuts, for example:-

f[KE] ~ P 0.05

Here ‘P’ stands for positive. The equivalent long form is:-

f[KE] ~ unif(1e-06, +inf) 0.05

Which limits f[KE] to be greater than 1e-06.

You can also have an unconstrained variable as follows:-

f[KE] ~ U 0.05

Where ‘U’ stands for unlimited. The equivalent long form is:-

f[KE] ~ unif(-inf, +inf) 0.05
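The relationship between the shortcuts and their long forms can be sketched in Python as follows. This is only an illustration of the mapping described above; the names SHORTCUT_BOUNDS and expand_shortcut are hypothetical and are not part of PoPy:-

```python
import math

# Hypothetical lookup table (not a PoPy API): each shortcut letter maps to the
# (min_x, max_x) bounds of the equivalent unif() long form quoted in the text.
SHORTCUT_BOUNDS = {
    "P": (1e-06, math.inf),       # positive
    "U": (-math.inf, math.inf),   # unlimited
}


def expand_shortcut(letter, init_x):
    """Return (min_x, max_x, init_x) for a shortcut such as 'P 0.05'."""
    min_x, max_x = SHORTCUT_BOUNDS[letter]
    return min_x, max_x, init_x


# 'f[KE] ~ P 0.05' corresponds to 'f[KE] ~ unif(1e-06, +inf) 0.05'.
print(expand_shortcut("P", 0.05))
```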

Normal Distribution

The Normal distribution is used for continuous variables and written in PoPy as:-

x ~ norm(mean, var)

The Normal models a Gaussian distribution with two parameters ‘mean’ and ‘var’.

The input parameters are:-

  • mean - the expected value of the Normal
  • var - the variance of the Normal

The output ‘x’ and inputs ‘mean’, ‘var’ are all continuous values.

For more information see Normal Distribution on Wikipedia.

Normal Random Effect Example

You can use the Normal Distribution in the EFFECTS section of a PoPy script to define an r[X] random effect variable as follows:-

EFFECTS:
    ID: |
        r[KE] ~ norm(0, f[KE_isv])

Here the r[KE] scalar variable is defined as a normal with mean zero and positive scalar variance f[KE_isv].

r[KE] is defined at the ‘ID’ level, so each individual in the population has an independent sample of this normal distribution.

Normal Likelihood Example

You can use the Normal Distribution in the PREDICTIONS section of a PoPy Fit Script as follows:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    c[DV_CENTRAL] ~ norm(p[DV_CENTRAL], var)

The above syntax in a Fit Script specifies the likelihood of the observed c[DV_CENTRAL] observation from the data file, when modelled as a Normal variable, with mean p[DV_CENTRAL] and variance ‘var’.
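To make the arithmetic concrete, the following Python sketch evaluates the same combined additive and proportional variance and the resulting Normal log likelihood for a single observation. It is a standalone illustration, not PoPy code; the function names and the numbers used are assumptions chosen for the example:-

```python
import math


def combined_variance(pred, anoise, pnoise):
    """Additive plus proportional variance, mirroring the 'var' line above."""
    return anoise ** 2 + pnoise ** 2 * pred ** 2


def normal_loglik(obs, mean, var):
    """Log density of a Normal(mean, var) evaluated at 'obs'."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (obs - mean) ** 2 / var)


# Illustrative values only (not taken from any PoPy data set).
pred = 10.0                                             # plays the role of p[DV_CENTRAL]
var = combined_variance(pred, anoise=0.5, pnoise=0.1)   # 0.25 + 0.01 * 100
loglik = normal_loglik(9.0, pred, var)                  # observed value 9.0
print(var, loglik)
```

Note that the second argument is a variance, not a standard deviation, which matches the ~norm(mean, var) convention used throughout this section.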

Censored Normal Distribution

The ~cennorm() distribution is used to model whether the output of a Normal random variable lies within a particular range and is written in PoPy as:-

x ~ cennorm(mean, var, LLQ=llq, ULQ=ulq)

The ~cennorm() distribution models a Censored Gaussian distribution with two parameters ‘mean’ and ‘var’ and two limit parameters ‘llq’ and ‘ulq’.

The input parameters are:-

  • mean - the expected value of the Normal
  • var - the variance of the Normal
  • llq - lower limit of quantification - optional parameter - default value is -inf
  • ulq - upper limit of quantification - optional parameter - default value is +inf

The inputs ‘mean’, ‘var’, ‘llq’, ‘ulq’ are all continuous values. The default values above imply that the following:-

x ~ cennorm(mean, var)

Is the same as this:-

x ~ cennorm(mean, var, LLQ=-inf, ULQ=+inf)

Which is completely uninformative, as for any value of x the likelihood is one and the log likelihood contribution zero.

The ‘llq’ and ‘ulq’ values define 3 adjacent regions in the range [-inf, +inf]. When sampling from a Censored Normal, the output can take one of three values:-

  • llq - represents a sample in the range [-inf, llq]
  • (llq + ulq)/2 - represents a sample in the range [llq, ulq]
  • ulq - represents a sample in the range [ulq, +inf]

The probability of each value is computed using the cumulative normal distribution for each range. The three range probabilities sum to one.
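The three region probabilities can be reproduced with the Normal cumulative distribution function. The Python sketch below is a minimal illustration using only the standard library; norm_cdf and cennorm_probs are hypothetical helper names, not PoPy functions:-

```python
import math


def norm_cdf(x, mean, var):
    """Cumulative distribution function of a Normal(mean, var)."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def cennorm_probs(mean, var, llq=-math.inf, ulq=math.inf):
    """Probabilities of the regions [-inf, llq], [llq, ulq] and [ulq, +inf]."""
    p_low = norm_cdf(llq, mean, var)
    p_high = 1.0 - norm_cdf(ulq, mean, var)
    p_mid = 1.0 - p_low - p_high
    return p_low, p_mid, p_high


# Illustrative values: mean 5, variance 4, llq 2, ulq 100.
probs = cennorm_probs(5.0, 4.0, llq=2.0, ulq=100.0)
print(probs, sum(probs))
```

With the default limits the middle region carries all of the probability, which is why an unrestricted ~cennorm() is uninformative.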

For more information see Cumulative Normal Distribution on Wikipedia.

Censored Normal Likelihood Example

You can use the ~cennorm() distribution in the PREDICTIONS section of a PoPy Fit Script to model below level of quantification (BLQ) data, i.e. observations that are not observed directly, but are known to be below a certain lower limit of quantification (LLQ), as follows:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    llq = 2.0
    if c[DV_CENTRAL] <= llq:
        c[DV_CENTRAL] ~ cennorm(p[DV_CENTRAL], var, LLQ=llq)
    else:
        c[DV_CENTRAL] ~ norm(p[DV_CENTRAL], var)

The above syntax in a Fit Script specifies the likelihood of the observed c[DV_CENTRAL] observation from the data file. c[DV_CENTRAL] observations greater than LLQ are modelled as a Normal variable, with mean p[DV_CENTRAL] and variance ‘var’. c[DV_CENTRAL] observations less than or equal to LLQ contribute the cumulative normal probability, with the same mean and variance, of lying within the range [-inf, llq]. This BLQ data model is referred to as method ‘M3’ in [Beal2001].

Note that any value for c[DV_CENTRAL] in the data set less than or equal to llq is treated as a BLQ observation by this model.

Also note that PoPy requires the keyword syntax ‘LLQ=llq’ here to clarify the purpose of the third ~cennorm() distribution parameter. It is also possible to model above level of quantification (ALQ) observations, as follows:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    ulq = 100.0
    if c[DV_CENTRAL] >= ulq:
        c[DV_CENTRAL] ~ cennorm(p[DV_CENTRAL], var, ULQ=ulq)
    else:
        c[DV_CENTRAL] ~ norm(p[DV_CENTRAL], var)

Or potentially both BLQ and ALQ observations:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    llq = 2.0
    ulq = 100.0
    if c[DV_CENTRAL] <= llq or c[DV_CENTRAL] >= ulq:
        c[DV_CENTRAL] ~ cennorm(p[DV_CENTRAL], var, LLQ=llq, ULQ=ulq)
    else:
        c[DV_CENTRAL] ~ norm(p[DV_CENTRAL], var)

The ‘if’ statements above make it reasonably clear how PoPy models BLQ and ALQ data when fitting a model. However, these formulae are long-winded and difficult to sample from, so in practice it is recommended to use a Rectified Normal Distribution instead, see below.

Rectified Normal Distribution

The ~rectnorm() distribution combines a ~cennorm() distribution and a ~norm() distribution. Its primary purpose is modelling of BLQ and ALQ observations. It is written in PoPy as:-

x ~ rectnorm(mean, var, LLQ=llq, ULQ=ulq)

The ~rectnorm() distribution models BLQ and ALQ observations using a ~cennorm() distribution and a ~norm() distribution for fully observed data, with shared parameters ‘mean’ and ‘var’ over the following ranges:-

  • [-inf, llq] - cennorm(mean, var, LLQ=-inf, ULQ=llq)
  • [llq, ulq] - norm(mean, var)
  • [ulq, +inf] - cennorm(mean, var, LLQ=ulq, ULQ=+inf)

The input parameters are:-

  • mean - the expected value of the Normal
  • var - the variance of the Normal
  • llq - lower limit of quantification - optional parameter - default value is -inf
  • ulq - upper limit of quantification - optional parameter - default value is +inf

The inputs ‘mean’, ‘var’, ‘llq’, ‘ulq’ are all continuous values. The default values above imply that the following:-

x ~ rectnorm(mean, var)

Is the same as this:-

x ~ rectnorm(mean, var, LLQ=-inf, ULQ=+inf)

Which is the same as a ~norm() distribution:-

x ~ norm(mean, var)

The ‘llq’ and ‘ulq’ values define 3 adjacent regions in the range [-inf, +inf]. When sampling from a Rectified Normal, the output can take one of three types of value:-

  • llq - represents a sample in the range [-inf, llq]
  • [llq, ulq] - an ordinary Normal sample that falls in the range [llq, ulq]
  • ulq - represents a sample in the range [ulq, +inf]

The discrete probability of a value of LLQ or less is computed using the cumulative normal distribution over the range [-inf, llq]. The discrete probability of a value of ULQ or more is computed using the cumulative normal distribution over the range [ulq, +inf]. The continuous probability density function (pdf) in the range [llq, ulq] is computed from the Normal distribution with the same ‘mean’ and ‘var’. The area under the pdf in the range [llq, ulq] added to the BLQ and ALQ discrete probabilities sums to one.
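The description above can be sketched in Python as a log likelihood function plus a sampler. This is an illustration under the assumption that a rectified sample is obtained by drawing from the underlying Normal and clamping it to [llq, ulq]; the function names are hypothetical and this is not PoPy's implementation:-

```python
import math
import random


def norm_cdf(x, mean, var):
    """Cumulative distribution function of a Normal(mean, var)."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def norm_logpdf(x, mean, var):
    """Log density of a Normal(mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def rectnorm_loglik(obs, mean, var, llq=-math.inf, ulq=math.inf):
    """Discrete mass at or below llq and at or above ulq, density in between."""
    if obs <= llq:
        return math.log(norm_cdf(llq, mean, var))
    if obs >= ulq:
        return math.log(1.0 - norm_cdf(ulq, mean, var))
    return norm_logpdf(obs, mean, var)


def rectnorm_sample(rng, mean, var, llq=-math.inf, ulq=math.inf):
    """Draw from the underlying Normal, then clamp the draw to [llq, ulq]."""
    draw = rng.gauss(mean, math.sqrt(var))
    return min(max(draw, llq), ulq)


rng = random.Random(0)
draws = [rectnorm_sample(rng, 5.0, 4.0, llq=2.0, ulq=7.0) for _ in range(1000)]
print(min(draws), max(draws))
```

A production implementation would also need to guard against taking the log of a probability that underflows to zero; that is omitted here for brevity.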

For more information see Rectified Gaussian Distribution on Wikipedia.

Rectified Normal Likelihood Example

You can use the ~rectnorm() distribution in the PREDICTIONS section of a PoPy Fit Script to model below level of quantification (BLQ) data, i.e. observations that are not observed directly, but are known to be below a certain lower limit of quantification (LLQ), as follows:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    llq = 2.0
    c[DV_CENTRAL] ~ rectnorm(p[DV_CENTRAL], var, LLQ=llq)

The above syntax in a Fit Script specifies the likelihood of the observed c[DV_CENTRAL] observation from the data file. c[DV_CENTRAL] observations greater than LLQ are modelled as a Normal variable, with mean p[DV_CENTRAL] and variance ‘var’. c[DV_CENTRAL] observations less than or equal to LLQ contribute the cumulative normal probability, with the same mean and variance, of lying within the range [-inf, llq]. This BLQ data model is referred to as method ‘M3’ in [Beal2001] and recommended by [Ahn2008].

Note that any value for c[DV_CENTRAL] in the data set less than or equal to llq is treated as a BLQ observation by this model. You can vary the LLQ limit for each observation by specifying the limit as a separate field in the data file:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    c[DV_CENTRAL] ~ rectnorm(p[DV_CENTRAL], var, LLQ=c[LLQ])

You can then remove the BLQ limit for selected observations by setting c[LLQ] to zero or a large negative number. Sometimes a BLQ observation is recorded in the data file using a separate flag field and the c[DV_CENTRAL] value itself is then the LLQ. In this case you could use the PREDICTIONS section above and PREPROCESS the data to compute a suitable c[LLQ] field as follows:-

PREPROCESS:
    if c[BLQ_FLAG] > 0.5:
        c[LLQ] = c[DV_CENTRAL]
    else:
        c[LLQ] = -inf

Also note that PoPy requires the keyword syntax ‘LLQ=llq’ here to clarify the purpose of the third ~rectnorm() distribution parameter. It is also possible to model above level of quantification (ALQ) observations, as follows:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    ulq = 100.0
    c[DV_CENTRAL] ~ rectnorm(p[DV_CENTRAL], var, ULQ=ulq)

Or potentially both BLQ and ALQ observations:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    llq = 2.0
    ulq = 100.0
    c[DV_CENTRAL] ~ rectnorm(p[DV_CENTRAL], var, LLQ=llq, ULQ=ulq)

The functionality above is the same as combining a ~cennorm() distribution and a ~norm() distribution using an ‘if’ statement, see Censored Normal Likelihood Example. However, using the ~rectnorm() distribution is recommended as it is more compact and also more flexible. For example, the syntax above works in the context of a Gen Script, Sim Script or Tut Script as well as a Fit Script, i.e. you can sample from a ~rectnorm() distribution easily.

The ~rectnorm() distribution is the principal way that PoPy modellers are encouraged to deal with BLQ and ALQ data.

Truncated Normal Distribution

The ~truncnorm() distribution is based on the ~norm() distribution, but with the domain of the distribution limited to a range [min,max]. It can be used to restrict a ~norm() distribution to say all positive values. It is written in PoPy as:-

x ~ truncnorm(mean, var, MIN=min, MAX=max)

The input parameters are:-

  • mean - the expected value of the Normal
  • var - the variance of the Normal
  • min - minimum value of truncated range - optional parameter - default value is -inf
  • max - maximum value of truncated range - optional parameter - default value is +inf

The inputs ‘mean’, ‘var’, ‘min’, ‘max’ are all continuous values. The default values above imply that the following:-

x ~ truncnorm(mean, var)

Is the same as this:-

x ~ truncnorm(mean, var, MIN=-inf, MAX=+inf)

Which is the same as a ~norm() distribution:-

x ~ norm(mean, var)

Note the ~truncnorm() distribution is different from the ~rectnorm() distribution. A ~truncnorm() distribution rescales its probability density function, so that the area under the curve in the domain [min, max] is one. There is zero probability mass outside of the [min, max] range. A ~rectnorm() distribution keeps the same probability density function as the ~norm() distribution within the range [llq, ulq], but includes the cumulative probability outside this region to achieve a total probability of one.
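The rescaling can be illustrated with a short Python sketch. The truncated density is the Normal density divided by the Normal probability mass inside [min, max], and a numerical integration confirms the area is approximately one. The helper names are hypothetical, and the arguments are called min_x and max_x only to avoid shadowing Python built-ins:-

```python
import math


def norm_cdf(x, mean, var):
    """Cumulative distribution function of a Normal(mean, var)."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def norm_pdf(x, mean, var):
    """Density of a Normal(mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def truncnorm_pdf(x, mean, var, min_x=-math.inf, max_x=math.inf):
    """Normal density rescaled so that it integrates to one over [min_x, max_x]."""
    if x < min_x or x > max_x:
        return 0.0
    mass = norm_cdf(max_x, mean, var) - norm_cdf(min_x, mean, var)
    return norm_pdf(x, mean, var) / mass


def trapezoid_area(func, lower, upper, steps=20000):
    """Simple trapezoidal rule, enough for a sanity check."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    total += sum(func(lower + i * width) for i in range(1, steps))
    return total * width


# Positive-only Normal with mean 1 and variance 4; 30 is far into the upper tail.
area = trapezoid_area(lambda x: truncnorm_pdf(x, 1.0, 4.0, min_x=0.0), 0.0, 30.0)
print(area)
```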

For more information see Truncated Normal Distribution on Wikipedia.

Truncated Normal Likelihood Example

You can use the ~truncnorm() distribution in the PREDICTIONS section of a PoPy Fit Script to model data that is known to occur in a certain range, e.g. all positive data:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    c[DV_CENTRAL] ~ truncnorm(p[DV_CENTRAL], var, MIN=0)

The above syntax in a Fit Script specifies the likelihood of the observed c[DV_CENTRAL] observation from the data file.

Also note that PoPy requires the keyword syntax ‘MIN=min’ here to clarify the purpose of the third ~truncnorm() distribution parameter. It is also possible to model observations with a known upper limit, e.g. data that is known to be negative, as follows:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    c[DV_CENTRAL] ~ truncnorm(p[DV_CENTRAL], var, MAX=0)

Or potentially observations that are known to be within a single standard deviation:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    std = sqrt(var)
    min = p[DV_CENTRAL] - std
    max = p[DV_CENTRAL] + std
    c[DV_CENTRAL] ~ truncnorm(p[DV_CENTRAL], var, MIN=min, MAX=max)

Note that if values of c[DV_CENTRAL] lie outside the range [min, max] then this model makes little sense, as the likelihood of such observations is zero and the log likelihood is -inf.

You might wish to use the ~truncnorm() distribution to generate synthetic positive-only observations from a model. An alternative is to use the ~rectnorm() distribution and generate synthetic data with a small positive LLQ.

Truncated Censored Normal Distribution

The ~trunccennorm() distribution is based on the ~cennorm() distribution, but with the domain of the distribution limited to a range [min,max]. It can be used to restrict a ~cennorm() distribution to say all positive values. It is written in PoPy as:-

x ~ trunccennorm(mean, var, MIN=min, LLQ=llq, ULQ=ulq, MAX=max)

The input parameters are:-

  • mean - the expected value of the Normal
  • var - the variance of the Normal
  • min - minimum value of truncated range - optional parameter - default value is -inf
  • llq - lower limit of quantification - optional parameter - default value is -inf
  • ulq - upper limit of quantification - optional parameter - default value is +inf
  • max - maximum value of truncated range - optional parameter - default value is +inf

The inputs ‘mean’, ‘var’, ‘min’, ‘llq’, ‘ulq’, ‘max’ are all continuous values. The default values above imply that the following:-

x ~ trunccennorm(mean, var)

Is the same as this:-

x ~ trunccennorm(mean, var, MIN=-inf, LLQ=-inf, ULQ=+inf, MAX=+inf)

Which is completely uninformative, as for any value of x the likelihood is one and the log likelihood contribution zero.

Note the ~trunccennorm() distribution rescales a ~cennorm() distribution, so that the area under the curve in the domain [min, max] is one. There is zero probability mass outside of the [min, max] range. Effectively the range [-inf,+inf] is split into 5 sub ranges:-

  • [-inf, min] - zero probability
  • [min, llq] - cumulative normal probability
  • [llq, ulq] - cumulative normal probability
  • [ulq, max] - cumulative normal probability
  • [max, +inf] - zero probability
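The three non-zero sub ranges can be sketched in Python by dividing each cumulative normal probability by the total Normal mass inside [min, max]. This is an illustration only; the helper names are hypothetical, and the clamping of llq and ulq into [min, max] is an assumption made here so that the default limits behave sensibly:-

```python
import math


def norm_cdf(x, mean, var):
    """Cumulative distribution function of a Normal(mean, var)."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def trunccennorm_probs(mean, var, min_x=-math.inf, llq=-math.inf,
                       ulq=math.inf, max_x=math.inf):
    """Probabilities of [min, llq], [llq, ulq] and [ulq, max] after truncation."""
    llq = max(llq, min_x)   # assumption: limits outside the domain are clamped
    ulq = min(ulq, max_x)
    c_min = norm_cdf(min_x, mean, var)
    c_llq = norm_cdf(llq, mean, var)
    c_ulq = norm_cdf(ulq, mean, var)
    c_max = norm_cdf(max_x, mean, var)
    mass = c_max - c_min    # Normal probability mass inside [min, max]
    return (c_llq - c_min) / mass, (c_ulq - c_llq) / mass, (c_max - c_ulq) / mass


# Illustrative values: positive-only data (min 0) with an LLQ of 2.
probs = trunccennorm_probs(5.0, 4.0, min_x=0.0, llq=2.0)
print(probs, sum(probs))
```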

Truncated Censored Normal Likelihood Example

You can use the ~trunccennorm() distribution in the PREDICTIONS section of a PoPy Fit Script to model data that is known to occur in a certain range, e.g. all positive data:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    llq = 2.0
    if c[DV_CENTRAL] <= llq:
        c[DV_CENTRAL] ~ trunccennorm(p[DV_CENTRAL], var, MIN=0, LLQ=llq)
    else:
        c[DV_CENTRAL] ~ truncnorm(p[DV_CENTRAL], var, MIN=0)

The above syntax in a Fit Script specifies the likelihood of the c[DV_CENTRAL] known positive observation from the data file, with an LLQ of 2.0.

The model above shows how you might implement the ‘M4’ method described in [Beal2001], which conditions on the BLQ data being positive. However a more convenient notation for doing this is described in ~truncrectnorm() distribution below.

Truncated Rectified Normal Distribution

The ~truncrectnorm() distribution is based on the ~rectnorm() distribution, but with the domain of the distribution limited to a range [min,max]. It can be used to restrict a ~rectnorm() distribution to say all positive values. It is written in PoPy as:-

x ~ truncrectnorm(mean, var, MIN=min, LLQ=llq, ULQ=ulq, MAX=max)

The input parameters are:-

  • mean - the expected value of the Normal
  • var - the variance of the Normal
  • min - minimum value of truncated range - optional parameter - default value is -inf
  • llq - lower limit of quantification - optional parameter - default value is -inf
  • ulq - upper limit of quantification - optional parameter - default value is +inf
  • max - maximum value of truncated range - optional parameter - default value is +inf

The inputs ‘mean’, ‘var’, ‘min’, ‘llq’, ‘ulq’, ‘max’ are all continuous values. The default values above imply that the following:-

x ~ truncrectnorm(mean, var)

Is the same as this:-

x ~ truncrectnorm(mean, var, MIN=-inf, LLQ=-inf, ULQ=+inf, MAX=+inf)

Which is the same as a ~norm() distribution:-

x ~ norm(mean, var)

Note the ~truncrectnorm() distribution rescales a ~rectnorm() distribution, so that the area under the curve in the domain [min, max] is one. There is zero probability mass outside of the [min, max] range. Effectively the range [-inf,+inf] is split into 5 sub ranges:-

  • [-inf, min] - zero probability
  • [min, llq] - cumulative normal probability
  • [llq, ulq] - normal probability density
  • [ulq, max] - cumulative normal probability
  • [max, +inf] - zero probability

Truncated Rectified Normal Likelihood Example

You can use the ~truncrectnorm() distribution in the PREDICTIONS section of a PoPy Fit Script to model data that is known to occur in a certain range, e.g. all positive data:-

PREDICTIONS:
    p[DV_CENTRAL] = s[CENTRAL]/m[V1]
    var = m[ANOISE]**2 + m[PNOISE]**2 * p[DV_CENTRAL]**2
    c[DV_CENTRAL] ~ truncrectnorm(p[DV_CENTRAL], var, MIN=0, LLQ=2.0)

The above syntax in a Fit Script specifies the likelihood of the c[DV_CENTRAL] known positive observation from the data file, with an LLQ of 2.0.

The model above shows the recommended way for PoPy modellers to implement the ‘M4’ method described in [Beal2001], which conditions on the BLQ data being positive.

The ~truncrectnorm() distribution is easier to sample from, and therefore easier to use in a Gen Script, Tut Script and Sim Script, than the ‘if’ statement method shown in Truncated Censored Normal Likelihood Example.

Note in many cases it may be easier and more appropriate to use the ‘M3’ method and the ~rectnorm() distribution shown in Rectified Normal Likelihood Example.
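For readers who want to see the ‘M4’ style likelihood written out, the Python sketch below evaluates a truncated rectified Normal log likelihood for one observation. It is a minimal illustration under the stated sub range definitions, not PoPy's implementation; all function names are hypothetical:-

```python
import math


def norm_cdf(x, mean, var):
    """Cumulative distribution function of a Normal(mean, var)."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def norm_logpdf(x, mean, var):
    """Log density of a Normal(mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def truncrectnorm_loglik(obs, mean, var, min_x=-math.inf, llq=-math.inf,
                         ulq=math.inf, max_x=math.inf):
    """Rectified Normal log likelihood, conditioned on lying in [min_x, max_x]."""
    if obs < min_x or obs > max_x:
        return -math.inf                      # zero probability outside the domain
    c_min = norm_cdf(min_x, mean, var)
    c_max = norm_cdf(max_x, mean, var)
    mass = c_max - c_min
    if llq > min_x and obs <= llq:            # BLQ observation
        return math.log((norm_cdf(llq, mean, var) - c_min) / mass)
    if ulq < max_x and obs >= ulq:            # ALQ observation
        return math.log((c_max - norm_cdf(ulq, mean, var)) / mass)
    return norm_logpdf(obs, mean, var) - math.log(mass)


# Illustrative values: prediction 5, variance 4, positive-only data, LLQ of 2.
blq_ll = truncrectnorm_loglik(1.0, 5.0, 4.0, min_x=0.0, llq=2.0)
obs_ll = truncrectnorm_loglik(6.0, 5.0, 4.0, min_x=0.0, llq=2.0)
print(blq_ll, obs_ll)
```

Setting min_x to -inf recovers the ‘M3’ style likelihood, which makes the relationship between the two methods easy to check numerically.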

Multivariate Normal Distribution

The Multivariate Normal distribution is used for vectors of continuous variables and is written in PoPy as:-

output_vector ~ mnorm(mean_vector, covariance_matrix)

The Multivariate Normal is a generalisation of the Normal Distribution with two parameters ‘mean_vector’ and ‘covariance_matrix’, as follows:-

  • mean_vector - the mean of the ‘output_vector’
  • covariance_matrix - the covariance of the ‘output_vector’ elements

The ‘output_vector’ must have the same number of dimensions as the ‘mean_vector’. Also the ‘covariance_matrix’ needs to be symmetric positive definite with a matching dimensionality. See Matrices for examples of how to define the covariance matrix.
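The symmetric positive definite requirement can be checked with a Cholesky factorisation, which also gives a convenient way to evaluate the log density. The following standard-library Python sketch illustrates the idea; it is not how PoPy is implemented, and the function names are hypothetical:-

```python
import math


def cholesky(matrix, tol=1e-12):
    """Lower-triangular Cholesky factor; raises ValueError if not symmetric positive definite."""
    size = len(matrix)
    for i in range(size):
        for j in range(i):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                raise ValueError("covariance matrix is not symmetric")
    lower = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                diag = matrix[i][i] - partial
                if diag <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                lower[i][j] = math.sqrt(diag)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def mnorm_logpdf(x, mean, cov):
    """Log density of a multivariate Normal with the given mean vector and covariance."""
    lower = cholesky(cov)
    size = len(x)
    # Forward substitution: solve lower * z = (x - mean).
    z = []
    for i in range(size):
        partial = sum(lower[i][k] * z[k] for k in range(i))
        z.append((x[i] - mean[i] - partial) / lower[i][i])
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(size))
    return -0.5 * (size * math.log(2.0 * math.pi) + log_det + sum(v * v for v in z))


# Illustrative diagonal covariance with variances 1, 4 and 9.
cov = [[1.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 9.0]]
print(mnorm_logpdf([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], cov))
```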

For more information see Multivariate Normal Distribution on Wikipedia.

Multivariate Normal Random Effect Example

You can use the Multivariate Normal Distribution in the EFFECTS section of a PoPy script, to define a vector of r[X] random effects variables as follows:-

EFFECTS:
    ID: |
        r[KA,CL,V] ~ mnorm([0, 0, 0], f[KA_isv,CL_isv,V_isv])

Here the r[KA,CL,V] variable is defined as a 3 element vector with mean zero. [0,0,0] is a 3 element ‘mean_vector’ and f[KA_isv,CL_isv,V_isv] is a 3x3 ‘covariance_matrix’. The f[KA_isv,CL_isv,V_isv] matrix can be a diagonal or square symmetric matrix, see Matrices.

The r[KA,CL,V] is defined at the ‘ID’ level, so each individual in the population has an independent sample of this multivariate normal distribution.

Bernoulli Distribution

The Bernoulli is a univariate discrete distribution used to model binary variables, and written in PoPy as:-

y ~ bernoulli(prob_success)

The Bernoulli models the distribution of a single Bernoulli trial.

The input parameters are:-

  • prob_success - probability of success of the Bernoulli trial

The output ‘y’ is a binary value, i.e. either 1 for success or 0 for failure. ‘prob_success’ is a real valued number in the range [0,1].

For more information see Bernoulli Distribution on Wikipedia.

Bernoulli Likelihood Example

You can use the Bernoulli Distribution in the PREDICTIONS section of a PoPy Fit Script as follows:-

PREDICTIONS:
    conc = s[X]/m[V]
    p[DV_BERN] = 1.0 / (1.0+ exp(-conc))
    c[DV_BERN] ~ bernoulli(p[DV_BERN])

The above syntax in a Fit Script specifies the likelihood of the observed c[DV_BERN] binary observation from the data file, when modelled as a Bernoulli variable, with success rate dependent on ‘conc’ via a logistic transform.
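The logistic transform and the Bernoulli log likelihood it feeds can be sketched in Python as follows. This is a standalone illustration with hypothetical function names and an arbitrary concentration value, not PoPy code:-

```python
import math


def logistic(x):
    """Map any real value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bernoulli_loglik(obs, prob_success):
    """Log probability of a binary observation (1 = success, 0 = failure)."""
    if obs not in (0, 1):
        raise ValueError("a Bernoulli observation must be 0 or 1")
    return math.log(prob_success) if obs == 1 else math.log(1.0 - prob_success)


conc = 0.8                      # illustrative value standing in for s[X]/m[V]
prob = logistic(conc)           # plays the role of p[DV_BERN]
print(prob, bernoulli_loglik(1, prob), bernoulli_loglik(0, prob))
```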

Poisson Distribution

The Poisson is a discrete univariate distribution used to model count variables, and written in PoPy as:-

y ~ poisson(lambda)

The Poisson models the distribution of the number of events occurring within a fixed time interval, if each individual event occurs independently and at constant rate ‘lambda’.

The input parameters are:-

  • lambda - the expected number of occurrences within the time interval

The output ‘y’ is the observed count, i.e. a non-negative integer value. ‘lambda’ is a positive real valued number, which represents the mean rate of event occurrence.

For more information see Poisson Distribution on Wikipedia.

Poisson Likelihood Example

You can use the Poisson Distribution in the PREDICTIONS section of a PoPy Fit Script as follows:-

PREDICTIONS:
    c[COUNT] ~ poisson(m[LAMBDA])

The above syntax in a Fit Script specifies the likelihood of the observed c[COUNT] count observations from the data file, when modelled as a Poisson process with estimated rate parameter m[LAMBDA].
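For reference, the Poisson log probability of a count can be written in a few lines of Python. This sketch uses the textbook probability mass function and the standard library only; the function name is hypothetical and the rate value is illustrative:-

```python
import math


def poisson_logpmf(count, rate):
    """Log probability of observing 'count' events when the expected number is 'rate'."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


rate = 3.0   # illustrative value standing in for m[LAMBDA]
total = sum(math.exp(poisson_logpmf(k, rate)) for k in range(100))
print(poisson_logpmf(2, rate), total)
```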

Binomial Distribution

The binomial is a univariate discrete distribution, written in PoPy as:-

num_successes ~ binomial(prob_success, num_trials)

The binomial models the distribution of the number of successes given a fixed number of independent Bernoulli trials.

The input parameters are:-

  • prob_success - probability of success of each Bernoulli trial
  • num_trials - number of Bernoulli trials

Here the output ‘num_successes’ is an integer. ‘num_trials’ is also an integer and ‘prob_success’ is a real valued number in the range [0,1].

For more information see Binomial Distribution on Wikipedia.

Binomial Likelihood Example

You can use the Binomial Distribution in the PREDICTIONS section of a PoPy Fit Script as follows:-

PREDICTIONS:
    conc = s[X]/m[V]
    p[DV_B] = 1.0 / (1.0 + exp(-conc))
    c[DV_B] ~ binomial(p[DV_B], c[N_OBS])

The above syntax in a Fit Script specifies the likelihood of the observed c[DV_B] count data from the data file when modelled as the number of successes of c[N_OBS] trials performed. Here success rate is dependent on ‘conc’ via a logistic transform.
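The binomial probability used in such a likelihood can be sketched in Python with the textbook formula. The argument order below mirrors binomial(prob_success, num_trials); the function name and the numbers are assumptions chosen for illustration:-

```python
import math


def binomial_pmf(num_successes, prob_success, num_trials):
    """Probability of 'num_successes' successes in 'num_trials' independent Bernoulli trials."""
    ways = math.comb(num_trials, num_successes)
    return ways * prob_success ** num_successes * (1.0 - prob_success) ** (num_trials - num_successes)


prob = 0.3       # illustrative value standing in for p[DV_B]
n_obs = 10       # illustrative value standing in for c[N_OBS]
total = sum(binomial_pmf(k, prob, n_obs) for k in range(n_obs + 1))
print(binomial_pmf(3, prob, n_obs), total)
```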

Negative Binomial Distribution

The negative binomial is a univariate discrete distribution, written in PoPy as:-

num_fails ~ negbinomial(prob_success, num_of_successes)

The negative binomial models the distribution of the number of failures for a series of independent Bernoulli trials until the success count reaches ‘num_of_successes’.

The input parameters are:-

  • prob_success - probability of success of each Bernoulli trial
  • num_of_successes - number of successful Bernoulli trials before num_fails output

Here the output ‘num_fails’ is an integer. ‘num_of_successes’ is also an integer and ‘prob_success’ is a real valued number in the range [0,1].

For more information see Negative Binomial Distribution on Wikipedia. However, at the time of writing, the Wikipedia page inverts the definition of success/failure. There are many ways of parameterising the negative binomial; PoPy uses the SciPy parameterisation described here:-

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.nbinom.html

Negative Binomial Likelihood Example

You can use the Negative Binomial Distribution in the PREDICTIONS section of a PoPy Fit Script as follows:-

PREDICTIONS:
    conc = s[X]/m[V]
    p[DV_NB] = 1.0 / (1.0 + exp(-conc))
    c[DV_NB] ~ negbinomial(p[DV_NB], 1)

The above syntax in a Fit Script specifies the likelihood of the observed c[DV_NB] count data from the data file when modelled as the number of failures of a Bernoulli variable (with success rate dependent on ‘conc’ via a logistic transform) until the occurrence of the first success.
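The parameterisation in terms of failures before a fixed number of successes can be written out in Python as below. The formula follows the probability mass function given in the SciPy nbinom documentation; the function name is hypothetical and the code deliberately avoids importing SciPy so that it runs with the standard library alone:-

```python
import math


def negbinomial_pmf(num_fails, prob_success, num_of_successes):
    """Probability of 'num_fails' failures before the 'num_of_successes'-th success."""
    ways = math.comb(num_fails + num_of_successes - 1, num_of_successes - 1)
    return ways * prob_success ** num_of_successes * (1.0 - prob_success) ** num_fails


prob = 0.3   # illustrative value standing in for p[DV_NB]
# With a single required success the distribution reduces to the geometric case:
# the probability of k failures followed by one success is (1 - prob)**k * prob.
total = sum(negbinomial_pmf(k, prob, 1) for k in range(500))
print(negbinomial_pmf(2, prob, 1), total)
```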
