Monte Carlo: Concepts, Algorithms, and Applications

Preface (excerpt)

This book provides an introduction to the Monte Carlo method suitable for a one- or two-semester course for graduate and advanced undergraduate students in the mathematical and engineering sciences. It also can serve as a reference for the professional analyst. In the past, my inability to provide students with a single-source book on this topic for class and for later professional reference had left me repeatedly frustrated, and eventually motivated me to write this book.In addition to focused accounts of major topics, the book has two unifying themes: One concerns the effective use of information and the other concerns error control and reduction. The book describes how to incorporate information about a problem into a sampling plan in a way that reduces the cost of estimating its solution to within a specified error bound. Although exploiting special structures to reduce cost long has been a hallmark of the Monte Carlo method, the propensity of users of the method to discard useful information because it does not fit traditional textbook models repeatedly has impressed me. The present account aims at reducing the impediments to integrating this information.

Errors, both statistical and computational, abound in every Monte Carlo sampling experiment, and a considerable methodology exists for controlling them. However, error is not an appealing topic, especially in an age of easily accessible software. Many conversations with students and professional users of the Monte Carlo method have made me acutely aware of the seductive quality of the rapidly developing user-friendly software for performing sampling experiments: If it executes and produces a numerical result, why not use it? To serve a useful purpose, a book on the Monte Carlo method should sensitize students to the potential sources of error that lurk in every Monte Carlo sampling experiment and encourage them to recognize the limitations that off-the-shelf software may unwittingly impose. Moreover, the book should provide techniques for reducing and, if possible, eliminating error. The present account does this for both statistical and computional errors.

This concern for error and cost containment has led to several departures from standard advice given to potential users. Chapter 2 does this with regard to error assessment when independent replications generate the sample data. In particular, the traditional reliance on asymptotic normal theory inevitably induces an error of approximation in reporting statistical error. To remove this error, when possible, the book describes distribution-free methods for statistical error reporting. While these lead to a higher cost for achieving a specified error bound, they offer the benefit of eliminating the need to answer the often vexing question: How large a sample size is needed to make the error of normal approximation negligible? Doubtlessly, some readers will react with skepticism to this emphasis. After all, standard asymptotic normal theory for i.i.d. data provides a convenient basis for assessing statistical error as the sample size n grows without bound. However, n is always finite in practice, and it is well understood that convergence rates can vary widely in different problems, leading to widely varying errors of approximation. When they apply, the distribution-free techniques eliminate these errors of approximation.

While i.i.d. sampling represents a fundamental component of the Monte Carlo method, a substantial proportion of the problems to which the method is applied call for generating sample paths based on Markov chain formulations. These paths of dependent observations provide the data for parameter estimation. When I began work on this manuscript about nine years ago, the literature on Monte Carlo Markov chain sampling experiments had been growing at a relatively slow pace since the initial flurry of publications on this topic following World War II. During these nine years, the rate of publication on this topic has increased substantially, and a major portion of my time has been devoted to identifying those new developments that I could weave into a coherent account accessible to a newcomer to the Monte Carlo method. These developments include sampling techniques (e.g., Gibbs sampling), convergence analysis, computational complexity, and estimating statistical accuracy from data on a single sample path and on multiple sample paths. Chapter 5 first addresses the conceptual issues and then Ch. 6 turns to the computational considerations. As writing progressed, Ch. 5 seemed to grow without bound. This was a consequence of the many new ideas that were published and the depth of analysis to which each had been exposed. Although I culled many topics and reduced the space devoted to others, such as sunulated annealing, Ch. 5 remains the most substantial of the chapters. The conceptual foundations of contemporary Monte Carlo Markov chain sampling are organized and presented there in a manner that allows readers to skip over some sections which may be of lesser interest to them without losing sight of the chapter’s underlying theme.

The account of computational issues in Ch. 6 also called for selectivity. Recently, many papers have appeared with proposals for diluting the influence of initial conditions on sample path data. These augment an existing literature on discrete-event simulation that addresses this problem. Much reading, testing, and pondering led me to conclude that a theoretically well-supported and computationally feasible methodology for solving this problem remains for future development. Accordingly, the description of how to adjust for the influence of initial conditions relies heavily on easily implementable, but admittedly subjective, graphical analyses. Ideally, one hopes that more objective criteria grounded, for example, in statistical hypothesis testing eventually will complement graphical interpretation as a basis for choosing a warm-up interval.

By contrast, theoretically justified and computationally feasible methods for computing confidence intervals based on single and multiple sample path data long have been available in the time-series and discrete-event simulation literature. However, the considerable number of careful decisions, based on theoretical and computational considerations that successful implementation requires, has limited the use of these techniques in practice. To overcome this limitation, Ch. 6 focuses on the batch rneans method, a conceptually simple methodology, and describes an implementation that removes many of the impediments. Software for employing this method in the context of long sample paths is available by anonymous file transfer protocol (ftp) as described in Sec. 1.2.