compute_DIC

compute_DIC(model: Model, print_summary: bool = True) -> dict[str, float]

Computes the Deviance Information Criterion (DIC) for the fitted model and, optionally, prints a deviance summary report.

Parameters

model : Model

The model with data, regressors, response variable and priors to be solved through Monte Carlo sampling.

print_summary : bool, optional

If True, prints the deviance summary report. Default is True.

Returns

dict
Dictionary with deviance summary. It contains:
  • key 'Deviance at posterior means',

  • key 'Posterior mean deviance',

  • key 'Effective number of parameters',

  • key 'Deviance Information Criterion'.
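As a minimal sketch of how the returned dictionary could be inspected, the snippet below checks that the four entries are mutually consistent with the definitions given in the Notes (effective number of parameters = posterior mean deviance minus deviance at posterior means; DIC = posterior mean deviance plus effective number of parameters). The helper name check_deviance_summary and the numeric values are hypothetical and used only for illustration; only the dictionary keys come from this page.

```python
import math


def check_deviance_summary(summary, tol=1e-8):
    """Return True if the summary entries satisfy pD = Dbar - Dhat and DIC = Dbar + pD."""
    d_hat = summary['Deviance at posterior means']
    d_bar = summary['Posterior mean deviance']
    p_d = summary['Effective number of parameters']
    dic = summary['Deviance Information Criterion']
    return (
        math.isclose(p_d, d_bar - d_hat, abs_tol=tol)
        and math.isclose(dic, d_bar + p_d, abs_tol=tol)
    )


# Made-up values for illustration only; they are not output of compute_DIC.
example_summary = {
    'Deviance at posterior means': 100.0,
    'Posterior mean deviance': 103.0,
    'Effective number of parameters': 3.0,
    'Deviance Information Criterion': 106.0,
}
print(check_deviance_summary(example_summary))
```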

Raises

TypeError
  • If model is not a Model,

  • if print_summary is not a bool.

ValueError
  • If model.posteriors is None because the sampling has not been done yet,

  • if a posterior key is not a column of model.data,

  • if model.data is an empty pandas.DataFrame,

  • if model.response_variable is not a column of model.data.

Notes

The DIC estimates posterior predictive error by combining a measure of model fit (the deviance) with a penalty for model complexity, given by the effective number of parameters. When alternative models fitted to the same data are compared, the model with the smaller DIC is preferred. Consider a linear regression of the response variable \(y\) on the regressors \(X\), according to the following model:

\[y \sim N(\mu, \sigma^2)\]
\[\mu = \beta_0 + B X = \beta_0 + \sum_{j = 1}^m \beta_j x_j\]

Then the likelihood is:

\[p \left( y \left\vert B,\sigma^2 \right. \right) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left( - \frac{\left(y - \mu \right)^2}{2 \sigma^2} \right) .\]

The deviance [1] [2] is defined as:

\[D \left( y, B, \sigma^2 \right) = -2\log p \left( y \left\vert B,\sigma^2 \right. \right) .\]

The deviance at the posterior means of \(B\) and \(\sigma^2\), denoted by \(\overline{B}\) and \(\overline{\sigma^2}\), is:

\[D_{{\overline{B}}, \overline{\sigma^2}} (y) = D \left( y, \overline{B}, \overline{\sigma^2} \right)\]

while the posterior mean deviance is:

\[\overline{D} \left( y, B, \sigma^2 \right) = E \left( \left. D \left( y, B, \sigma^2 \right) \right\vert y \right) ,\]

and the effective number of parameters is defined as:

\[pD = \overline{D} \left( y, B, \sigma^2 \right) - D_{{\overline{B}}, \overline{\sigma^2}} (y) .\]

The Deviance Information Criterion [1] is:

\[DIC = 2 \overline{D} \left( y, B, \sigma^2 \right) - D_{{\overline{B}}, \overline{\sigma^2}} (y) = \overline{D} \left( y, B, \sigma^2 \right) + pD = D_{{\overline{B}}, \overline{\sigma^2}} (y) + 2pD .\]
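The quantities above can be illustrated numerically. The following is a minimal sketch, not the library's implementation: it assumes posterior draws are available as plain NumPy arrays, that observations are independent so the deviance is summed over them, and that the intercept \(\beta_0\) is averaged along with \(B\). The function names normal_deviance and dic_from_draws and their arguments are hypothetical; the dictionary keys mirror those listed under Returns.

```python
import numpy as np


def normal_deviance(y, mu, sigma2):
    """D = -2 log p(y | B, sigma^2), summed over independent observations."""
    y = np.asarray(y, dtype=float)
    return float(np.sum(np.log(2.0 * np.pi * sigma2) + (y - mu) ** 2 / sigma2))


def dic_from_draws(y, X, beta0_draws, B_draws, sigma2_draws):
    """Compute the deviance summary from S posterior draws.

    Assumed shapes: y (n,), X (n, m), beta0_draws (S,),
    B_draws (S, m), sigma2_draws (S,).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    beta0_draws = np.asarray(beta0_draws, dtype=float)
    B_draws = np.asarray(B_draws, dtype=float)
    sigma2_draws = np.asarray(sigma2_draws, dtype=float)

    # Linear predictor for every draw: shape (S, n).
    mu_draws = beta0_draws[:, None] + B_draws @ X.T

    # Deviance evaluated at each draw, then averaged: posterior mean deviance.
    dev_draws = np.sum(
        np.log(2.0 * np.pi * sigma2_draws)[:, None]
        + (y[None, :] - mu_draws) ** 2 / sigma2_draws[:, None],
        axis=1,
    )
    mean_deviance = float(dev_draws.mean())

    # Deviance evaluated at the posterior means of the parameters.
    mu_bar = beta0_draws.mean() + X @ B_draws.mean(axis=0)
    deviance_at_means = normal_deviance(y, mu_bar, sigma2_draws.mean())

    p_d = mean_deviance - deviance_at_means
    return {
        'Deviance at posterior means': deviance_at_means,
        'Posterior mean deviance': mean_deviance,
        'Effective number of parameters': p_d,
        'Deviance Information Criterion': mean_deviance + p_d,
    }
```

In this sketch the three equivalent expressions for the DIC agree up to floating-point rounding, and if every posterior draw is identical the effective number of parameters is zero, since the posterior mean deviance then coincides with the deviance at the posterior means.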

References

See Also

LinearRegression