An information theoretic criterion for empirical validation of simulation models
Lamperti, Francesco
2018-01-01
Abstract
Simulation models suffer intrinsically from validation and comparison problems. The choice of a suitable indicator quantifying the distance between the model and the data is pivotal to model selection. An information theoretic criterion, called GSL-div, is introduced to measure how closely a model's synthetic output replicates the properties of observable time series, without resorting to the likelihood function or imposing stationarity requirements. The indicator is sufficiently general to be applied to any model able to simulate or predict time series data, from simple univariate models to more complex objects, including Agent-Based Models. When a set of models is given, a simple function of the L-divergence is used to select the candidate producing distributions of patterns that are closest to those observed in the data. The proposed approach is illustrated through three examples of increasing complexity, in which the GSL-div is used to discriminate among a variety of competing models. Results are compared to those obtained with alternative measures of model fit. The GSL-div is found to perform better than the alternatives in the vast majority of cases.