Technique for increasing the precision of estimates in Monte Carlo experiments
The control variates method is a variance reduction technique used in Monte Carlo methods. It exploits information about the errors in estimates of known quantities to reduce the error of an estimate of an unknown quantity.[1][2][3]
Let the unknown parameter of interest be $\mu$, and assume we have a statistic $m$ such that the expected value of $m$ is $\mu$: $\mathbb{E}\left[m\right]=\mu$, i.e. $m$ is an unbiased estimator for $\mu$. Suppose we calculate another statistic $t$ such that $\mathbb{E}\left[t\right]=\tau$ is a known value. Then

$$m^{\star} = m + c\left(t - \tau\right)$$

is also an unbiased estimator for $\mu$ for any choice of the coefficient $c$.
The variance of the resulting estimator $m^{\star}$ is

$$\operatorname{Var}\left(m^{\star}\right) = \operatorname{Var}\left(m\right) + c^{2}\,\operatorname{Var}\left(t\right) + 2c\,\operatorname{Cov}\left(m,t\right).$$
By differentiating the above expression with respect to $c$, it can be shown that choosing the optimal coefficient

$$c^{\star} = -\frac{\operatorname{Cov}\left(m,t\right)}{\operatorname{Var}\left(t\right)}$$

minimizes the variance of $m^{\star}$. (Note that this coefficient is the same as the coefficient obtained from a linear regression.) With this choice,

$$\operatorname{Var}\left(m^{\star}\right) = \operatorname{Var}\left(m\right) - \frac{\left[\operatorname{Cov}\left(m,t\right)\right]^{2}}{\operatorname{Var}\left(t\right)} = \left(1 - \rho_{m,t}^{2}\right)\operatorname{Var}\left(m\right)$$
where

$$\rho_{m,t} = \operatorname{Corr}\left(m,t\right)$$

is the correlation coefficient of $m$ and $t$. The greater the value of $\left|\rho_{m,t}\right|$, the greater the variance reduction achieved.
In the case that $\operatorname{Var}\left(t\right)$, $\operatorname{Cov}\left(m,t\right)$, and/or $\rho_{m,t}$ are unknown, they can be estimated across the Monte Carlo replicates. This is equivalent to solving a certain least squares system; therefore this technique is also known as regression sampling.
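The estimation of the coefficient from the replicates themselves can be sketched as follows. The setup here is our own illustration, not taken from the text: the target is $\mathbb{E}\left[e^{U}\right] = e - 1$ for $U$ uniform on $[0, 1]$, with $t = U$ as the control variate and known mean $\tau = 1/2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (our own choice): estimate mu = E[exp(U)]
# for U ~ Uniform(0, 1), using t = U as a control variate
# with known mean tau = 1/2.
n = 10_000
u = rng.uniform(0.0, 1.0, size=n)
m_samples = np.exp(u)   # replicates of the statistic of interest
t_samples = u           # replicates of the control variate
tau = 0.5               # known E[t]

# Estimate the optimal coefficient c* = -Cov(m, t) / Var(t)
# from the same Monte Carlo replicates.
cov_mt = np.cov(m_samples, t_samples, ddof=1)[0, 1]
c_star = -cov_mt / np.var(t_samples, ddof=1)

plain = m_samples.mean()
controlled = plain + c_star * (t_samples.mean() - tau)
```

Because $e^{U}$ is increasing in $U$, the two statistics are positively correlated and the estimated coefficient comes out negative; the controlled estimator has a much smaller per-sample variance than the plain one.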
When the expectation of the control variable, $\mathbb{E}\left[t\right]=\tau$, is not known analytically, it is still possible to increase the precision in estimating $\mu$ (for a given fixed simulation budget), provided that two conditions are met: 1) evaluating $t$ is significantly cheaper than computing $m$; 2) the magnitude of the correlation coefficient $\left|\rho_{m,t}\right|$ is close to unity.[3]
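A minimal sketch of that regime, under assumptions of our own (the functions, sample sizes, and budget split below are all hypothetical): the "expensive" statistic averages $f(U)=e^{U}$, the cheap control averages $g(U)=U$, and $\tau$ is estimated from a much larger independent sample of $g$ alone.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: f is the "expensive" integrand, g a cheap,
# highly correlated control whose mean tau is not known in closed form.
f = lambda u: np.exp(u)
g = lambda u: u

n = 2_000        # paired (expensive) evaluations of f and g
N = 200_000      # extra cheap evaluations of g, used only to estimate tau

u = rng.uniform(0.0, 1.0, n)
m_s, t_s = f(u), g(u)

# Estimate tau = E[g(U)] from the large, cheap, independent sample.
tau_hat = g(rng.uniform(0.0, 1.0, N)).mean()

# Estimated optimal coefficient, as in the derivation above.
c = -np.cov(m_s, t_s, ddof=1)[0, 1] / np.var(t_s, ddof=1)

estimate = m_s.mean() + c * (t_s.mean() - tau_hat)
```

The extra noise introduced by estimating $\tau$ is small precisely because the cheap evaluations can be made very numerous, which is the point of the two conditions above.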
We would like to estimate

$$I = \int_{0}^{1} \frac{1}{1+x}\,\mathrm{d}x$$

using Monte Carlo integration. This integral is the expected value of $f(U)$, where

$$f(U) = \frac{1}{1+U}$$

and $U$ follows a uniform distribution on $[0, 1]$. Using a sample of size $n$, denote the points in the sample as $u_{1},\ldots,u_{n}$. Then the estimate is given by

$$I \approx \frac{1}{n}\sum_{i} f(u_{i}).$$
Now we introduce $g(U) = 1 + U$ as a control variate with a known expected value $\mathbb{E}\left[g(U)\right] = \int_{0}^{1}(1+x)\,\mathrm{d}x = \tfrac{3}{2}$ and combine the two into a new estimate

$$I \approx \frac{1}{n}\sum_{i} f(u_{i}) + c\left(\frac{1}{n}\sum_{i} g(u_{i}) - \tfrac{3}{2}\right).$$
Using $n$ realizations and an estimated optimal coefficient $c^{\star}$, we obtain the following results:
| | Estimate | Variance |
| --- | --- | --- |
| Classical estimate | 0.69475 | 0.01947 |
| Control variates | 0.69295 | 0.00060 |
The variance was significantly reduced after using the control variates technique. (The exact result is $\ln 2 \approx 0.69315$.)
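This example can be reproduced with a short Python sketch. The sample size and random seed below are our own choices (the exact settings behind the table are not given), so the figures will agree in magnitude but not digit-for-digit.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 1500                  # sample size chosen for illustration
u = rng.uniform(0.0, 1.0, n)

f = 1.0 / (1.0 + u)       # integrand samples; E[f(U)] = ln 2
g = 1.0 + u               # control variate; E[g(U)] = 3/2 exactly

# Classical Monte Carlo estimate of the integral
classical = f.mean()

# Estimated optimal coefficient c* = -Cov(f, g) / Var(g)
c = -np.cov(f, g, ddof=1)[0, 1] / np.var(g, ddof=1)

# Control-variates estimate
controlled = classical + c * (g.mean() - 1.5)

# Per-sample variances of the two estimators' summands,
# comparable to the "Variance" column of the table
var_classical = np.var(f, ddof=1)
var_controlled = np.var(f + c * (g - 1.5), ddof=1)
```

Since $f$ is decreasing and $g$ increasing in $U$, their covariance is negative and the estimated coefficient is positive; the per-sample variance drops by roughly the factor $(1-\rho_{m,t}^{2})$ derived above.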