The Ornstein-Uhlenbeck Process

From Continuous Control With Deep Reinforcement Learning:

For the exploration noise process we used temporally correlated noise in order to explore well in physical environments that have momentum. We used an Ornstein-Uhlenbeck process (Uhlenbeck & Ornstein, 1930) with $\theta = 0.15$ and $\sigma = 0.3$. The Ornstein-Uhlenbeck process models the velocity of a Brownian particle with friction, which results in temporally correlated values centered around 0.

Start with Brownian motion:

$$dW_t \sim \mathcal{N}(0, dt)$$

Then add friction:

$$dx_t = \theta (\mu - x_t) \, dt + \sigma \, dW_t$$

where $\theta$ sets the strength of the friction that pulls the particle back towards the long-run mean $\mu$, and $\sigma$ sets the scale of the noise.
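To make the equation concrete, here is a minimal sketch that simulates it with an Euler-Maruyama step, $x_{t+dt} = x_t + \theta (\mu - x_t) \, dt + \sigma \sqrt{dt} \, \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, 1)$. The defaults for `theta` and `sigma` come from the quoted passage and `mu = 0` matches "centered around 0". The function name `simulate_ou`, the step size `dt`, the starting value `x0`, the step count and the seed are illustrative assumptions, not values from the paper.

```python
import math
import random


def simulate_ou(theta=0.15, sigma=0.3, mu=0.0, x0=0.0,
                dt=1e-2, n_steps=1000, seed=0):
    """Simulate dx = theta * (mu - x) * dt + sigma * dW with Euler-Maruyama.

    dt, x0, n_steps and seed are illustrative choices (assumptions).
    Returns a list of n_steps + 1 values, starting with x0.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        # dW ~ N(0, dt): variance dt, so the standard deviation is sqrt(dt).
        dw = rng.gauss(0.0, math.sqrt(dt))
        # Friction pulls x towards mu; the noise term pushes it around.
        x = x + theta * (mu - x) * dt + sigma * dw
        path.append(x)
    return path


path = simulate_ou()
print(len(path))  # n_steps + 1 samples, including the starting value
```

Because each value is the previous one plus a small pull towards `mu` and a small random kick, consecutive samples are correlated, unlike independent Gaussian noise. Setting `sigma=0.0` removes the noise and leaves only the decay towards `mu`, which is a quick way to check the friction term on its own.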

References