2.2. Sampling the whole space
In the original paper, the authors also define what it means for an algorithm to sample the whole search space, given a function. An algorithm is said to sample the whole search space, given f, if the supremum over the search space of the minimum distance between a point and the samples tends to 0 in probability, i.e.
\sup_{x \in \mathcal{X}} \min_{0 \le i \le n} \| x - X_i\| \xrightarrow{p} 0,
where \| \cdot \| denotes a norm on the search space (usually the Euclidean norm), and X_i is the i-th sample produced by the algorithm given f.
This definition can be unfolded as follows:
\forall \varepsilon > 0, \; \mathbb{P}(\sup_{x \in \mathcal{X}} \min_{0 \le i \le n} \| x - X_i\| > \varepsilon) \xrightarrow[n \to \infty]{} 0,
where \mathbb{P} is given by Algorithm.fin_measure.
To formalize this definition, we define the minimum distance between a point in the search space and a sequence of samples.
noncomputable def min_dist_x :=
  fun {n : ℕ} (u : iter α n) (x : α) => Tuple.min ((fun xi => dist xi x) ∘ u)
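In the notation of the definition above, min_dist_x u x computes \min_{0 \le i \le n} \mathrm{dist}(X_i, x), where X_i = u\,i is the i-th sample; this agrees with \min_{0 \le i \le n} \| x - X_i \| whenever the distance is induced by the norm.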
As min_dist_x u is a continuous function of x and the search space is compact, this function attains its maximum over the search space. Hence, the \sup in the definition above can be replaced by a \max over the search space.
noncomputable def max_min_dist {n : ℕ} (u : iter α n) :=
  min_dist_x u (compact_argmax (min_dist_x_continuous u))
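Here compact_argmax returns a point at which a continuous real-valued function on the (nonempty, compact) search space attains its maximum. As a rough illustration, such a definition could be built on Mathlib's IsCompact.exists_forall_ge; the name compact_argmax' and the exact typeclass assumptions below are ours, and the development's actual definition may differ.
import Mathlib

-- A maximizer of a continuous function on a nonempty compact space:
-- `IsCompact.exists_forall_ge` provides existence of a point where the
-- maximum is attained, and `choose` extracts a witness (hence `noncomputable`).
noncomputable def compact_argmax' {γ : Type*} [TopologicalSpace γ]
    [CompactSpace γ] [Nonempty γ] {g : γ → ℝ} (hg : Continuous g) : γ :=
  (isCompact_univ.exists_forall_ge Set.univ_nonempty hg.continuousOn).choose
With this reading, max_min_dist u is exactly \max_{x \in \mathcal{X}} \min_{0 \le i \le n} \| x - X_i \|, the quantity appearing in the definition above.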
Finally, given a continuous function f, an algorithm A samples the whole search space if, for every \varepsilon > 0, the measure of the set of sample sequences of size n + 1 on which max_min_dist is greater than \varepsilon tends to 0 as n tends to infinity.
noncomputable def sample_whole_space (A : Algorithm α β) {f : α → β} (hf : Continuous f) :=
  ∀ ε > 0, Tendsto (fun n => A.fin_measure hf {u : iter α n | max_min_dist u > ε}) atTop (𝓝 0)
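As a quick sanity check, the definition specializes to any fixed ε as expected; the following is a minimal sketch, assuming the surrounding context of the development (the variables α, β and the definitions above):
-- Applying `sample_whole_space` to a fixed ε > 0 recovers convergence of
-- the corresponding measures to 0.
example (A : Algorithm α β) {f : α → β} (hf : Continuous f)
    (h : sample_whole_space A hf) {ε : ℝ} (hε : ε > 0) :
    Tendsto (fun n => A.fin_measure hf {u : iter α n | max_min_dist u > ε})
      atTop (𝓝 0) :=
  h ε hε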
Now, we have all the definitions needed to formalize Proposition 3 of Malherbe and Vayatis (2017) (“Global optimization of Lipschitz functions”, International Conference on Machine Learning).