2.2. Sampling the whole space
In the original paper, the authors also define what it means for an algorithm to sample the whole search space, given a function. An algorithm is said to sample the whole search space, given f, if the supremum over the search space of the minimum distance between a point and the samples tends to 0 in probability, i.e.
\sup_{x \in \mathcal{X}} \min_{0 \le i \le n} \| x - X_i\| \xrightarrow{p} 0,
where \| \cdot \| denotes a norm on the search space (usually the Euclidean norm), and X_i is the i-th sample produced by the algorithm given f.
This definition can be unfolded as follows:
\forall \varepsilon > 0, \; \mathbb{P}(\sup_{x \in \mathcal{X}} \min_{0 \le i \le n} \| x - X_i\| > \varepsilon) \xrightarrow[n \to \infty]{} 0,
where \mathbb{P} is given by Algorithm.fin_measure.
To formalize this definition, we define the minimum distance between a point in the search space and a sequence of samples.
noncomputable def min_dist_x :=
  fun {n : ℕ} (u : iter α n) (x : α) => Tuple.min ((fun xi => dist xi x) ∘ u)
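In the notation of the definition above, min_dist_x u x computes \min_{0 \le i \le n} \mathrm{dist}(X_i, x), where X_i = u\,i is the i-th sample; this agrees with \min_{0 \le i \le n} \| x - X_i \| whenever the distance is induced by the norm.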
As min_dist_x u is a continuous function of x and the search space is compact, this function attains its maximum over the search space. Hence, the \sup in the definition above can be replaced by a \max over the search space.
noncomputable def max_min_dist {n : ℕ} (u : iter α n) :=
  min_dist_x u (compact_argmax (min_dist_x_continuous u))
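Here compact_argmax returns a point at which a continuous real-valued function on the (nonempty, compact) search space attains its maximum. As a rough illustration, such a definition could be built on Mathlib's IsCompact.exists_forall_ge; the name compact_argmax' and the exact typeclass assumptions below are ours, and the development's actual definition may differ.
import Mathlib

-- A maximizer of a continuous function on a nonempty compact space:
-- `IsCompact.exists_forall_ge` provides existence of a point where the
-- maximum is attained, and `choose` extracts a witness (hence `noncomputable`).
noncomputable def compact_argmax' {γ : Type*} [TopologicalSpace γ]
    [CompactSpace γ] [Nonempty γ] {g : γ → ℝ} (hg : Continuous g) : γ :=
  (isCompact_univ.exists_forall_ge Set.univ_nonempty hg.continuousOn).choose
With this reading, max_min_dist u is exactly \max_{x \in \mathcal{X}} \min_{0 \le i \le n} \| x - X_i \|, the quantity appearing in the definition above.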
Finally, given a continuous function f, an algorithm A samples the whole search space if, for every \varepsilon > 0, the measure of the set of sample sequences of size n + 1 on which max_min_dist is greater than \varepsilon tends to 0 as n tends to infinity.
noncomputable def sample_whole_space (A : Algorithm α β) {f : α → β} (hf : Continuous f) :=
  ∀ ε > 0, Tendsto (fun n => A.fin_measure hf {u : iter α n | max_min_dist u > ε}) atTop (𝓝 0)
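As a quick sanity check, the definition specializes to any fixed ε as expected; the following is a minimal sketch, assuming the surrounding context of the development (the variables α, β and the definitions above):
-- Applying `sample_whole_space` to a fixed ε > 0 recovers convergence of
-- the corresponding measures to 0.
example (A : Algorithm α β) {f : α → β} (hf : Continuous f)
    (h : sample_whole_space A hf) {ε : ℝ} (hε : ε > 0) :
    Tendsto (fun n => A.fin_measure hf {u : iter α n | max_min_dist u > ε})
      atTop (𝓝 0) :=
  h ε hε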
Now, we have all the definitions needed to formalize Proposition 3 of Malherbe and Vayatis (2017) (“Global optimization of Lipschitz functions”, International Conference on Machine Learning).