EuroSciPy 2014


Cambridge, UK - 27-30 August 2014

you’re doing it wrong: the lack of reproducibility in statistical science, and how to fix it

Michael McKerns


A recent mathematical proof by Owhadi et al. [1,2] details how many of today's common statistical methods are inherently unreliable. For example, Bayesian inference is only guaranteed to be correct when the selected prior is exact; otherwise, its predictions are not guaranteed to be any more likely to be true than a random guess. Common tools in statistical science such as Bayesian inference, Monte Carlo, and machine learning impose strong implicit assumptions on a problem set in order to yield a solution, yet they provide no means of testing the assumptions they themselves require. With Bayesian inference one must select a prior, and selecting a prior essentially turns the past into an explicit predictor of future events. Monte Carlo can never rigorously predict bounds on risk, and falls victim to the curse of dimensionality.
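As a small illustration of the prior dependence described above (a textbook Beta-Bernoulli example, not a rendering of the proof in [1,2]), the sketch below computes the posterior mean of a success rate from the same data under two different priors. The function name `posterior_mean`, the data, and both priors are assumptions chosen purely for illustration.

```python
from fractions import Fraction

def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a Bernoulli rate under a Beta(prior_a, prior_b) prior.

    Uses the standard conjugate update: the posterior is
    Beta(prior_a + successes, prior_b + failures).
    """
    return Fraction(prior_a + successes, prior_a + prior_b + trials)

# Same observations, two different priors (both hypothetical).
successes, trials = 2, 10
flat = posterior_mean(1, 1, successes, trials)     # uniform prior
skewed = posterior_mean(50, 1, successes, trials)  # prior strongly expecting success

print("uniform prior:", float(flat))
print("skewed prior: ", float(skewed))
```

The data are identical in both calls, so the gap between the two answers comes entirely from the choice of prior, and nothing in the calculation itself indicates which prior, if either, is justified.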

We have developed a comprehensive mathematical framework (OUQ) [3,4] capable of utilizing all available information to rigorously predict the impact of high-impact rare events, where our predictors are multiply-nested global optimizations over all possible valid scenarios. Such optimizations are high-dimensional, highly constrained, non-convex, and generally impossible to solve with current optimization technology. However, by addressing optimization constraints as quantum operators on a probability distribution, our software (called 'mystic') [5,6] converts highly nonlinear statistical calculations to those that are nearly embarrassingly parallel. By utilizing abstractions on programming models and global distributed caching of data and results, we can scale up from desktop calculations to petascale and larger with little burden on the programmer.
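To make "optimization over all valid scenarios" concrete, here is a deliberately tiny sketch in plain Python. It does not use mystic or its API, and it replaces the global optimizers described above with a brute-force grid search. The toy problem is an assumption for illustration: a quantity X is known only to lie in [0, 1] with mean at most 1/5, and we want the largest possible probability that X reaches 1/2. The sketch searches over two-point probability measures only, on the assumption that extreme scenarios for a problem with a single mean constraint can be found among such discrete measures. The function name `worst_case_tail` and the grid sizes are hypothetical.

```python
from fractions import Fraction
from itertools import product

def worst_case_tail(mean_cap, threshold, n_x=20, n_w=100):
    """Largest P[X >= threshold] over two-point measures on [0, 1]
    whose mean does not exceed mean_cap (exhaustive grid search)."""
    xs = [Fraction(i, n_x) for i in range(n_x + 1)]  # candidate support points
    ws = [Fraction(j, n_w) for j in range(n_w + 1)]  # candidate weights on x1
    best_tail, best_scenario = Fraction(0), None
    for x1, x2, w in product(xs, xs, ws):
        mean = w * x1 + (1 - w) * x2
        if mean > mean_cap:
            continue  # scenario violates the available information
        # Probability mass sitting at or above the failure threshold.
        tail = w * (x1 >= threshold) + (1 - w) * (x2 >= threshold)
        if tail > best_tail:
            best_tail, best_scenario = tail, (x1, x2, w)
    return best_tail, best_scenario

bound, scenario = worst_case_tail(Fraction(1, 5), Fraction(1, 2))
print("worst-case P[X >= 1/2]:", bound)
print("attained by (x1, x2, weight on x1):", scenario)
```

For this toy case the search recovers the classical Markov bound, mean_cap / threshold. Every scenario on the grid is evaluated independently of the others, which hints at why calculations of this kind parallelize well. Realistic problems have many more constraints and dimensions, so exhaustive search stops being feasible and global optimizers are needed instead.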

Within the context of this framework, assumptions inherent to common statistical science can be tested and validated, and models can be rigorously tested and improved. Results obtained are rigorous and optimal with respect to the information provided, and should enable great strides in reproducibility in statistical science. This framework has been used in calculations of materials failure under hypervelocity impact, elasto-plastic failure in structures under seismic ground acceleration, and the design of the next generation of large-scale heterogeneous compute clusters. Tools are included for rigorously constraining design space, constructing standard and statistical constraints, leveraging discrete and symbolic math, and quantifying uncertainties and risk.

This talk will lightly cover Owhadi's proof in pictorial form, but will focus primarily on the implementation of Owhadi's new rigorous statistical framework in the mystic software, and will discuss the outlook for, and impact on, reproducibility in statistical science.