Title:
Stochastic search in high-dimensional model spaces

Abstract:
Model search in very high-dimensional spaces raises computational challenges, and standard approaches such as serial Markov chain Monte Carlo (MCMC) methods are often ineffective. I introduce a novel shotgun stochastic search (SSS) approach for model space exploration that is inspired by existing MCMC approaches but identifies good models much more rapidly as dimension escalates. Parallel computing is at the core of SSS methodology. Rather than simply parallelizing existing MCMC stochastic search methods by running multiple chains simultaneously, I describe a new stochastic search that differs in two key respects: (i) SSS evaluates and records many candidate models in parallel at each iteration, efficiently exploring neighborhoods of models; (ii) SSS is designed to move towards and aggressively explore regions of model space that contain multiple high probability models. While serial approaches typically traverse model space via pairwise model comparisons, parallel computing allows potentially tens of thousands of models in a neighborhood of a given model to be considered simultaneously, yielding a stochastic search whose properties differ from those of the usual serial implementation. I highlight the relationship between standard MCMC approaches and SSS and provide examples in which SSS catalogues high probability models more rapidly than competing MCMC methods. I present examples from cancer genomics that demonstrate the effectiveness of SSS, where modeling goals include the identification of complex multivariate patterns of association within sets of key genes and also between sets of genes and observed patient outcomes.
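The two key respects above can be illustrated with a minimal sketch. It is not the implementation described in this work, and several pieces are assumptions made only for illustration: a model is a set of variable indices, a neighborhood consists of all models one addition, deletion, or swap away, the next model is drawn with probability proportional to its exponentiated score, and `toy_log_score` is a hypothetical stand-in for an unnormalized log posterior model probability.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def neighborhood(model, p):
    """Models one addition, deletion, or swap away (an assumed definition)."""
    inside = sorted(model)
    outside = [j for j in range(p) if j not in model]
    adds = [model | {j} for j in outside]
    dels = [model - {i} for i in inside]
    swaps = [(model - {i}) | {j} for i in inside for j in outside]
    return adds + dels + swaps


def shotgun_search(log_score, p, n_iter=50, workers=4, seed=0):
    """Score every neighbor in parallel, record all of them, then move."""
    rng = random.Random(seed)
    current = frozenset()
    visited = {current: log_score(current)}
    # Threads only illustrate the structure; a CPU-bound score would need
    # a process pool or a cluster to gain real speed.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(n_iter):
            candidates = neighborhood(current, p)
            scores = list(pool.map(log_score, candidates))
            # (i) every evaluated model is catalogued, not just the one chosen
            visited.update(zip(candidates, scores))
            # (ii) move preferentially towards high-scoring neighbors
            # (assumed rule: sample proportionally to exp(score))
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            current = rng.choices(candidates, weights=weights, k=1)[0]
    return visited


def toy_log_score(model, truth=frozenset({1, 4, 7})):
    """Hypothetical score: penalize each variable that disagrees with `truth`."""
    return -2.0 * len(model ^ truth)


visited = shotgun_search(toy_log_score, p=10)
best = max(visited, key=visited.get)
print(sorted(best), visited[best], len(visited))
```

Because every neighbor is recorded, the catalogue `visited` grows by a whole neighborhood per iteration, whereas a serial pairwise comparison would record a single proposal per step.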