A new discussion paper by Dr Sylvain Barde, KDPE 1504, January 2015

**Non-technical summary**

The recent increase in the breadth of computational methodologies has been matched by a corresponding increase in the difficulty of comparing the relative explanatory power of models from different methodological lineages, particularly simulations.

The traditional statistical and econometric methods that researchers rely on to evaluate the relative explanatory power of different models require that these models possess a specific formal structure of equations and parameters. This is no longer the case for many of the modelling techniques used nowadays, making the problem of comparing the predictions of such models an important open question in the field...

To help address this problem, the paper develops an information criterion that is analogous to the traditional Akaike information criterion (AIC) in its theoretical derivation, yet can be applied much more widely: it can be used to compare the explanatory power of any model able to generate simulated data, regardless of its formal structure. Both the proposed criterion and the AIC are grounded in the same information-theoretic concept of using the Kullback-Leibler (KL) distance between model predictions and real data as a measure of prediction accuracy. However, instead of using the standard maximum likelihood approach, as the AIC does, the proposed criterion relies on the original computer science interpretation of the KL distance as the inefficiency of compressing data using a model that imperfectly approximates the true process that generated the data.
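The compression interpretation can be illustrated with a small numerical sketch. It is not taken from the paper: the two distributions below are hypothetical, chosen so that the arithmetic is exact. The sketch computes the shortest achievable average code length for data from a known process, the average code length paid when the code is instead built from an imperfect model, and the KL distance as the gap between the two.

```python
import math

def entropy_bits(p):
    """Shortest achievable average code length (bits per symbol) for source p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy_bits(p, q):
    """Average code length when data from p are encoded with a code built for q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_bits(p, q):
    """KL distance: extra bits per symbol paid for using q instead of p."""
    return cross_entropy_bits(p, q) - entropy_bits(p)

true_process = [0.5, 0.25, 0.25]  # hypothetical data-generating distribution
model = [0.25, 0.25, 0.5]         # hypothetical imperfect model

print(entropy_bits(true_process))               # 1.5
print(cross_entropy_bits(true_process, model))  # 1.75
print(kl_bits(true_process, model))             # 0.25
```

In this example the imperfect model wastes a quarter of a bit per symbol, and a model identical to the true process would waste none.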

While this may seem like an unnecessary complication, it is what enables the comparison of very different formal models: the algorithm chosen for the procedure maps all the models to a standardised representation (formally, their Markov transition matrices), at which point their predictions can be compared easily. The specific algorithm used in the paper is the Context Tree Weighting (CTW) algorithm. The paper shows that this algorithm provides the proposed criterion with three desirable properties:

- The criterion is optimal, which essentially guarantees that the measurements produced by the algorithm reach the maximum theoretical precision.
- It is also universal, i.e. the optimal performance mentioned previously is proven for all Markov processes. Markov processes are a very wide class of data-generating processes that encompasses nearly all the modelling methodologies in existence, from regression models to simulations. This property underpins the claim that the proposed criterion can be used to compare the predictions of any model capable of producing simulated data.
- Finally, it is sequential: the criterion can measure the relative prediction accuracy of different models observation by observation. This means that when comparing the predictive power of different models on a given set of data, statistical testing can be performed to ensure the measurements obtained are statistically significant.
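The idea of mapping models to transition matrices and scoring them observation by observation can be sketched in a few lines. This is a deliberately simplified illustration and not the paper's procedure: it assumes binary data and a fixed first-order Markov structure, whereas CTW weights over context trees of varying depth. All function names, parameter values and sample sizes are hypothetical, and the t-statistic shown is the naive one that ignores serial correlation in the score differences.

```python
import math
import random
import statistics

def simulate_markov(p_one_given, n, rng):
    """Simulate a binary first-order Markov chain.

    p_one_given[s] is the probability that the next symbol is 1
    when the current symbol is s (s is 0 or 1).
    """
    state, series = 0, []
    for _ in range(n):
        state = 1 if rng.random() < p_one_given[state] else 0
        series.append(state)
    return series

def transition_matrix(series):
    """Estimate transition probabilities from a model's simulated data.

    Counts start at 0.5 so that no transition is assigned probability zero.
    """
    counts = [[0.5, 0.5], [0.5, 0.5]]
    for prev, nxt in zip(series, series[1:]):
        counts[prev][nxt] += 1
    return [[c / sum(row) for c in row] for row in counts]

def code_lengths(matrix, data):
    """Bits needed to encode each observation given the previous one."""
    return [-math.log2(matrix[prev][nxt]) for prev, nxt in zip(data, data[1:])]

rng = random.Random(0)
real = simulate_markov([0.2, 0.7], 5000, rng)  # stand-in for the real data
model_a = transition_matrix(simulate_markov([0.2, 0.7], 20000, rng))  # close to the truth
model_b = transition_matrix(simulate_markov([0.5, 0.5], 20000, rng))  # ignores persistence

# Observation-by-observation difference in code length (B minus A).
diff = [b - a for a, b in zip(code_lengths(model_a, real),
                              code_lengths(model_b, real))]
mean_diff = statistics.mean(diff)
t_stat = mean_diff / (statistics.stdev(diff) / math.sqrt(len(diff)))
print(f"mean extra bits per observation for model B: {mean_diff:.3f}")
print(f"naive t-statistic: {t_stat:.1f}")
```

Because each observation receives its own score, the difference between two models is a series rather than a single number, which is what makes a significance test possible.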

Two Monte Carlo exercises are carried out to validate the proposed methodology. The first checks that these theoretical properties are realised in practice, which is shown to be the case, confirming that the algorithm behaves the way information theory predicts.

The second Monte Carlo exercise tests the effectiveness of the methodology at ranking models according to their accuracy. Seven models (one “true” model and six alternative models) are simulated and passed through the CTW algorithm. The result of this test confirms that the methodology can distinguish the true model from the others and rank all the models according to their predictive power.
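A toy version of such a ranking exercise might look as follows. It is only a sketch under strong assumptions: the five candidate models are hypothetical binary first-order Markov chains rather than the seven models used in the paper, a simple smoothed transition-count estimate stands in for the CTW algorithm, and a single replication is shown where a Monte Carlo exercise would repeat the procedure over many draws.

```python
import math
import random

def simulate(p_one_given, n, rng):
    """Binary first-order Markov chain; p_one_given[s] = P(next = 1 | current = s)."""
    state, series = 0, []
    for _ in range(n):
        state = 1 if rng.random() < p_one_given[state] else 0
        series.append(state)
    return series

def mean_code_length(training, data):
    """Average bits per observation when `data` is encoded with transition
    probabilities estimated from a model's simulated `training` series."""
    counts = [[0.5, 0.5], [0.5, 0.5]]  # smoothed so no probability is zero
    for prev, nxt in zip(training, training[1:]):
        counts[prev][nxt] += 1
    bits = 0.0
    for prev, nxt in zip(data, data[1:]):
        bits -= math.log2(counts[prev][nxt] / sum(counts[prev]))
    return bits / (len(data) - 1)

# Hypothetical candidates; the first one generates the stand-in "real" data.
candidates = {
    "true": [0.2, 0.7],
    "alt 1": [0.3, 0.6],
    "alt 2": [0.4, 0.5],
    "alt 3": [0.5, 0.5],
    "alt 4": [0.7, 0.2],
}

rng = random.Random(1)
data = simulate(candidates["true"], 10000, rng)
scores = {name: mean_code_length(simulate(p, 20000, rng), data)
          for name, p in candidates.items()}

# A lower average code length means better predictions of the data.
ranking = sorted(scores, key=scores.get)
for name in ranking:
    print(f"{name}: {scores[name]:.3f} bits per observation")
```

The candidates are set up so that each alternative lies further from the true process than the previous one, so the expected ranking follows the order in which they are listed.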

Read the complete paper here.