Data depth and floating body

Little-known relations of the renowned concept of the halfspace depth for multivariate data with notions from convex and affine geometry are discussed. Halfspace depth may be regarded as a measure of symmetry for random vectors. As such, the depth stands as a generalization of a measure of symmetry for convex sets, well studied in geometry. Under a mild assumption, the upper level sets of the halfspace depth coincide with the convex floating bodies used in the definition of the affine surface area for convex bodies in Euclidean spaces. These connections enable us to partially resolve some persistent open problems regarding theoretical properties of the depth.


Introduction
Halfspace depth and floating body are the same concept. The first is extensively studied in nonparametric statistics, the second is of great importance in convex geometry. Until recently, work on data depth has gone largely unnoticed by the convex geometry community, and work on floating bodies in convex geometry has likewise gone unnoticed by researchers in statistics. Of course, the motivation and the goals of the two fields are different, and even their philosophies are not the same. Nonetheless, there is an abundance of results common to both fields. We want to explore and summarize here what is common to both fields, what is known, and what is not known.
In nonparametric statistics, data depth is a generalization of order statistics and ranks to multivariate random variables. Its aim is, for a multivariate probability distribution, to devise a distribution-specific ranking of points in the sample space. In other words, depth is a function intended to distinguish points that fit the overall pattern of the distribution, from measurement errors and other outliers.
In convex geometry, the concept of floating body was used, among other things, to introduce the affine surface area to all convex bodies. The associated affine isoperimetric inequality is much stronger than the classical isoperimetric inequality. It provides solutions to many problems where ellipsoids are extrema.

Motivation and background
In classical statistics of univariate data it is well known that the ordering of data points and the corresponding rank statistics constitute powerful statistical tools, valid under very broad sets of assumptions.

[Figure 1: The halfspace depth-based median (brown star), and the depth-based central region containing 25 % of the observations, provide a more appropriate representation of the location and the variability of the main data cloud.]

The median, for instance, is a rather efficient, robust, affine equivariant location estimator. Quantiles are invaluable in both visualization and inference. Rank tests provide versatile analogues to the traditional testing procedures, and unlike many standard parametric statistical tests, work under minimal assumptions imposed on the data. In multivariate spaces, however, no natural ordering exists. For d > 1, we are not able to rank points x and y in R^d according to their magnitude, or tell whether x lies "to the left" of y. Still, for a given dataset, one can ask how well a point fits into the overall pattern of the observations. If the data concentrate around a focal point, and follow a simple scatter structure, we can say that a point x inside the data cloud is "deeper" inside the mass of the data than a point y that lies on the outskirts, or outside the data cloud. This notion of multivariate center-outwards ranking is formalized by the idea of data depth. In general, a depth D(x; P) is a function that, given a probability distribution P on R^d (or a random sample from this distribution), quantifies the centrality (the depth) of a point x with respect to (w.r.t.) the geometry of P. The more x is inside the main bulk of the mass of P, the higher the depth D(x; P). As such, the depth enables us to rank the points of R^d according to their centrality w.r.t. P, and devise the corresponding depth-rank statistics, or depth-based quantile regions.
Let us illustrate our point by giving two simple examples where the depth plays an instrumental role. In the first one, our search for a sensible data ordering is motivated by the problem of defining a multivariate analogue of the median. In the classification task presented afterwards, we stress the importance of general global ranking procedures in data science.
Consider the dataset of 47 bivariate observations taken from [150], displayed in Figure 1. The data correspond to the Hertzsprung-Russell diagram of the stars in the Star Cluster CYG OB1 in the Cygnus constellation. In the scatterplot, the logarithm of the effective temperature at the surface of the star (log.Te) is plotted against the logarithm of its light intensity (log.light). The majority of the observations follows a common pattern: their data points concentrate in the south-east part of the plot, and appear to be scattered rather regularly. Four stars clearly do not follow that pattern, and could be considered as outliers (the red points in Figure 1). Those are known to be stars of different characteristics (so-called giant stars). Let us determine the location of the random sample. The sample mean (red triangle) is attracted towards the outlying observations, and does not represent the location of the majority of the data appropriately. That is, of course, caused by the fact that the expectation is known to be severely affected by erroneous data and outliers, i.e., it is not robust. For univariate data, one can opt for the median in such situations. But, what is a median of a multivariate dataset? Intuitively, the median should capture the location of the majority of observations, and should be little affected by errors, or other anomalies in the data. The median should be a point "deep" inside the data cloud. With the notion of the halfspace depth (the precise definition is given below), we consider the depth median to be the point whose depth w.r.t. the random sample is the highest (brown star in Figure 1). The depth median is robust, i.e. it is much less affected by the four giant stars than the mean. It captures the center of the main bulk of the data much better than the sample mean.
Additionally, let us consider the Mahalanobis ellipse (for precise definition see (6) below) that corresponds to the sample mean, and the sample covariance of our dataset, and contains 25 % of the data points (red ellipse in Figure 1). This ellipse is intended to represent the scatter pattern of the data. As seen in Figure 1, it is also heavily biased towards the anomalous observations. On the other hand, the halfspace depth region that contains (roughly) 25 % of the deepest points (brown polygon), still represents the main modes of variation of the data quite reliably.
For our second motivating example, the hemophilia data (available in [140]) are visualized in Figure 2. The dataset consists of bivariate measurements (AHF activity and AHF antigen) taken from blood samples of 75 women, out of whom 45 are known to be hemophilia A carriers (black dots in Figure 2). Our task here is classification: given a new datum with the two measured characteristics, decide whether the new patient is a potential hemophilia A carrier. The literature where problems of this type are studied in statistics is immense. One approach to this problem is to make use of the depth, and the ranking of the observations. Firstly, compute the depth of the point w.r.t. both random samples, i.e. rank the new data point inside the group of carriers, and the group of non-carriers, respectively. Then, assign the new datum to the group for which it is more typical, as reflected by its higher depth-based rank. Figure 2 shows the contours of the halfspace depth functions for both random samples (left panel), and the regions where one of the depths is larger than the other (the polygons on the right hand side). A new observation within the light-gray polygon would be assigned to the black cloud of points (carriers); a point inside the light-red polygon is assessed to come from a non-carrier patient. This approach, sometimes called maximum depth classification, and its other depth-based variants, turned out to be particularly appealing in the past years. Mainly due to their conceptual simplicity, versatility, and good robustness properties, depth-based classification rules have gained great importance over the past decade.
As we saw, data depth introduces ranking and ordering also for multivariate datasets. Other applications of the depth include multivariate extensions of the rank tests, L-statistics (linear combinations of order statistics), and many other nonparametric and robust procedures.
Let us now provide a rigorous definition of the halfspace depth. For d ≥ 1, a point x in the d-dimensional Euclidean space R^d, and a probability distribution P on R^d, the halfspace depth hD of x with respect to P is given by

hD(x; P) = inf{ P(H^-) : H is a hyperplane in R^d with x ∈ H },

where H^- denotes one of the closed halfspaces associated with its boundary hyperplane H in R^d. In other words, the depth hD is given as the smallest probability of a closed halfspace that contains x. Points outside the convex hull of the support of P have zero depth. More generally, any point x with hD(x; P) < δ can be separated from the main mass of P by a hyperplane cutting away both the point x, and a mass of probability at most δ. Points with rather high depth values can be seen as those lying at the center of the distribution, as no halfspace of small probability can separate them from the rest of P. This way, the depth hD acts as a mapping that orders the sample points in a center-outwards direction, with the ordering given subject to the distribution P.
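To make the definition concrete, the following sketch computes the empirical halfspace depth of a point w.r.t. a bivariate sample. It is only an illustration, not part of the original material: the function name and the implementation strategy (testing finitely many inward normal directions, slightly rotated away from the critical angles of the recentered data points) are our own choices; for data in general position this finite search is exact, and faster specialized algorithms exist in the computational geometry literature.

```python
import math

def halfspace_depth_2d(x, points, eps=1e-9):
    """Empirical halfspace depth hD(x; P_n) for a bivariate sample.

    Minimizes, over closed halfplanes H with x on the boundary, the
    fraction of sample points lying in H.  For data in general position
    it suffices to test inward normal directions just before and after
    the critical angles phi_i +- pi/2, where phi_i is the angle of
    points[i] - x.
    """
    n = len(points)
    diffs = [(px - x[0], py - x[1]) for px, py in points]
    # a point coinciding with x lies in every closed halfplane
    base = sum(1 for dx, dy in diffs if dx * dx + dy * dy == 0.0)
    angles = [math.atan2(dy, dx) for dx, dy in diffs if dx * dx + dy * dy > 0.0]
    if not angles:
        return 1.0
    best = len(angles)
    candidates = [a + s * (math.pi / 2 + t * eps)
                  for a in angles for s in (1.0, -1.0) for t in (1.0, -1.0)]
    for psi in candidates:
        ux, uy = math.cos(psi), math.sin(psi)
        # count sample points in the closed halfplane with inward normal u
        count = sum(1 for dx, dy in diffs
                    if dx * dx + dy * dy > 0.0 and dx * ux + dy * uy >= 0.0)
        best = min(best, count)
    return (base + best) / n
```

For instance, for the four sample points (±1, 0) and (0, ±1), the depth of the origin is 1/2, while any point outside their convex hull has depth 0.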
One plausible statistical application of the depth is the possibility to introduce quantiles to multivariate data. Consider, for observations on the real line R, the central quantile regions given as the intervals bounded by the α-, and (1 − α)-quantiles of P , for α ∈ (0, 1/2]. A natural multivariate analogue of these sets are then the loci of points whose depth exceeds given thresholds. Such sets of points are called the central regions of P in R d (given by the depth hD). As discussed in Section 3 below, the collection of all central regions of P consists of affine equivariant nested closed convex sets, monotone in the sense of set inclusion, see also the left panel of Figure 2, or Figure 3 below. For sufficiently regular distributions, the smallest non-empty set in this collection is a singleton -the most central point of R d for P . This point is frequently recognized as a generalization of the median to R d -valued data.
The earliest contribution in statistics that deals with some form of the halfspace depth is believed to be [79] from 1955. There, a sign test for bivariate data is proposed and examined. Its test statistic takes the form of the depth hD at a single, given point x.
The seminal paper that introduced the depth in the sample case (that is, for datasets) is Tukey [177] from 1975. In that paper, the depth is proposed as a tool that enables efficient visualization of random samples in R 2 . It is in [177], where the word depth is used for the first time. The original formal definition of the halfspace depth for multivariate data can be found in Donoho [44] and Donoho and Gasko [45] (see also [168]).
Starting with the study of Donoho, much research has focused on data depth and related concepts. The prominent, loosely related simplicial depth for multivariate data was defined by Liu [95,96], building upon the ideas presented in Oja [135]. Soon, the idea of depth was extended to data in non-linear spaces [168,99], general metric spaces [35], observations on graphs [169], regression [148], or data taking values in functional spaces [56], and Banach spaces [43].
The general concept of data depth in R d was formalized by Zuo and Serfling [187], Dyckerhoff [49], and Serfling [166], see also Mosler [128]. Nowadays, dozens of depth functions and related methods for all types of data can be found in the literature. It is, however, the halfspace depth hD, that is the single most important depth that continues to reappear as the prime representative of this idea.
Apart from statistics, halfspace depth gained considerable attention also in discrete and computational geometry (see [98]). There, the combinatorial nature of the sample version of hD provides a rich source of interesting problems, especially in connection with its computational aspects. For instance, the halfspace medians, i.e. points at which hD is maximized over R d , are closely related to the notion of centerpoints studied in discrete geometry (see [119,Section 1.4]). For a recent overview of data depth and its links to computational geometry see Rousseeuw and Hubert [149].
A notable article on the properties of the halfspace depth for general (probability) measures is Rousseeuw and Ruts [151]. There, the population version of the depth hD (i.e. the depth w.r.t. the true sampling distribution P ) is investigated. Several interesting links between the concept of the halfspace depth, and some sources outside mathematical statistics are outlined. In Rousseeuw and Ruts [151,Section 8] it is noted that the halfspace depth relates with the voting problem studied in Caplin and Nalebuff [33,34] in the theory of social choice. Further, it is also observed that some results concerning the maximal depth of a point in R d can be found already in Neumann [132], Rado [141], and Grünbaum [69], in the literature concerned with the geometric properties of functions and sets.
In the present paper, we pursue this line of research, and point to the remarkable similarity of the notion of halfspace depth, and some concepts used in other fields of mathematics, especially in convex and affine geometry.
The paper is organized as follows. In Section 3 we introduce the notation, and give a brief overview of some of the most important properties of the halfspace depth. In Section 4 we follow the lead provided by Rousseeuw and Ruts [151], and trace a little-known early precursor of the halfspace depth: the so-called Winternitz measure of symmetry of convex bodies, a functional that dates back at least to Blaschke [17]. In Sections 5 and 6 we examine relations of the halfspace depth with the (convex) floating bodies, an important tool used in the study of convex sets in R^d. As demonstrated, the history of the halfspace depth is much longer than assumed: the earliest predecessors of the depth hD appear to be the floating bodies in R^d, studied already by Dupin [47] in 1822. Later, floating bodies reappear in mathematics in 1923 in Blaschke [17], in connection with an affine invariant, the affine surface area of convex bodies, and other problems. As discussed in Section 5, the modern notion of the floating body, the convex floating body, defined independently by Bárány and Larman [12] and Schütt and Werner [161], plays a major role in the concept of affine surface area studied in geometry. We present extensions of this notion to log-concave measures and show its importance in questions of approximation of convex bodies by polytopes. It is also discussed in Section 6 that the convex floating body corresponds to the upper level sets of the halfspace depth. Using this identity, we provide in Section 7 a surprising bound on the halfspace depth in terms of the Mahalanobis depth. Section 8 is devoted to the distribution-by-depth characterization problem, concerned with finding conditions under which no two probability measures can have the same depth over R^d. It is shown that two important partial positive results to this problem [76,88] are both special cases of a more general theorem, conveniently stated in terms of floating bodies of measures.
As a corollary, we obtain some new classes of distributions characterized by their depth. Finally, in Section 9 we discuss some extensions of the depth to more exotic data. The survey is completed with a series of open problems relevant to the topics of halfspace depth and floating bodies.

Data depth: Notation and essential properties
Let (Ω, F , P) be the probability space on which all random variables are defined. For a measurable space M, denote by P (M) the space of all probability distributions on M, and write X ∼ P for a random variable X with distribution P ∈ P (M). The support of P is denoted by Supp(P ) ⊆ M. For n ∈ N = {1, 2, . . . } and a random sample X 1 , X 2 , . . . , X n from P , let P n ∈ P (M) be the associated empirical measure, i.e. the uniform measure supported in the sample points. For X ∼ P and φ : M → M measurable, write P φ(X) ∈ P (M) for the probability distribution of the transformed random variable φ(X). This way, P ≡ P X .
The space R^d is equipped with the Euclidean norm ‖·‖ and the inner product ⟨·, ·⟩. For x ∈ R^d and r > 0, B^d(x, r) = { y ∈ R^d : ‖y − x‖ ≤ r } is the closed Euclidean ball centered at x with radius r. B^d stands for the unit ball B^d(0, 1) and S^{d−1} = ∂B^d denotes the unit sphere. ∂K stands for the topological boundary of K ⊂ R^d. The Lebesgue measure of a measurable set K ⊂ R^d will be denoted by vol_d(K).
A convex body is a convex, compact subset of R^d with non-empty interior. For k ∈ N it is said to have a C^k boundary if its boundary, locally parametrized as a function from R^{d−1}, is k-times continuously differentiable. We denote the collection of all convex bodies in R^d by K^d. For an interior point x_0 of a convex body K define the polar body

(1) K^{x_0} = { y ∈ R^d : ⟨x − x_0, y − x_0⟩ ≤ 1 for all x ∈ K }.

If 0 ∈ Int(K), the interior of K, we write K° for the polar body (1) of K w.r.t. x_0 = 0. A star body K is a compact subset of R^d with the property that there exists x ∈ K such that the open line segment from x to any point y ∈ K is contained in the interior of K.
The centroid (or the barycenter) of a compact set K ⊂ R d is the expectation of a random variable distributed uniformly on K.
3.1. Statistical depth for multivariate data. The first formal definition of the halfspace depth in R d for general probability distributions can be found in Donoho [44].
Definition. Let P ∈ P(R^d) and x ∈ R^d. The halfspace depth (or Tukey depth) of x w.r.t. P is defined as

(3) hD(x; P) = inf{ P(H) : H ∈ H and x ∈ H },

where H stands for the collection of all closed halfspaces in R^d. In (3), the infimum can be equally well taken only over those H ∈ H with x ∈ ∂H [151].

In the study of the theoretical properties of hD, two regularity conditions imposed on P ∈ P(R^d) frequently play an important role. The first, a smoothness condition, appears in Dümbgen [46] and Mizera and Volauf [126], and reads

(4) P(∂H) = 0 for all H ∈ H.

It is trivially satisfied if, for instance, P has a density in R^d.
The second requirement concerns the support of P. We say that P ∈ P(R^d) has contiguous support [126,88] if there are no two disjoint halfspaces H_1, H_2 ∈ H with P(H_1) > 0, P(H_2) > 0, and P(H_1 ∪ H_2) = 1. In other words, the support of P cannot be separated by a slab between two closed parallel hyperplanes.
In Section 7 we demonstrate a surprising relation between the halfspace depth and another renowned depth function that can be found in the literature: the Mahalanobis depth. To this end, let us briefly recall its definition, and some elementary properties.
For any symmetric positive definite matrix Σ ∈ R^{d×d}, the Mahalanobis distance [112] of two points x, y ∈ R^d is defined as

d_Σ(x, y) = ( ⟨x − y, Σ^{-1}(x − y)⟩ )^{1/2}.

It is a metric on R^d. Based on this distance, Liu [97] proposed the following depth function.
Definition. Let X ∼ P ∈ P(R^d) be such that E X = µ and Var X = Σ is positive definite. The Mahalanobis depth of x w.r.t. P is defined as

(5) MD(x; P) = ( 1 + ⟨x − µ, Σ^{-1}(x − µ)⟩ )^{-1}.

The Mahalanobis depth w.r.t. P takes the maximal value 1 at the expectation of P. Its upper level sets are concentric ellipsoids given, for δ ∈ (0, 1], by

(6) { x ∈ R^d : MD(x; P) ≥ δ } = { x ∈ R^d : ⟨x − µ, Σ^{-1}(x − µ)⟩ ≤ (1 − δ)/δ }.

These ellipsoids are also called the Mahalanobis ellipsoids of the distribution P. Note that unlike the halfspace depth hD, the Mahalanobis depth MD is not defined for all P ∈ P(R^d), but rather it is restricted to distributions with finite second moments, and positive definite variance matrices.
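A minimal sketch of the Mahalanobis depth, assuming the formula above, with the mean µ and the inverse covariance matrix Σ^{-1} supplied by the user (in practice these would be estimated from the sample); the function name is ours.

```python
def mahalanobis_depth(x, mu, sigma_inv):
    """MD(x; P) = 1 / (1 + <x - mu, Sigma^{-1} (x - mu)>).

    x, mu: points of R^d as sequences; sigma_inv: d x d nested lists
    representing the inverse of the covariance matrix Sigma.
    """
    d = [xi - mi for xi, mi in zip(x, mu)]
    # quadratic form (x - mu)^T Sigma^{-1} (x - mu)
    q = sum(d[i] * sigma_inv[i][j] * d[j]
            for i in range(len(d)) for j in range(len(d)))
    return 1.0 / (1.0 + q)
```

With Σ the identity, MD equals 1/(1 + r^2) at Euclidean distance r from µ, so the δ-level set is the centered ball of squared radius (1 − δ)/δ, in accordance with (6).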

3.2.
Properties of the halfspace depth. In this section we collect some basic properties of the halfspace depth (3), and of its upper level sets, that will prove to be useful in the sequel. For any P ∈ P(R^d), consider the upper level sets of hD

(7) P_δ = { x ∈ R^d : hD(x; P) ≥ δ } for δ ∈ [0, 1].

Immediately from the definition we see that the collection of sets P_δ, δ ∈ [0, 1], is nested, decreasing in the sense of set inclusion, and P_0 = R^d. The set P_δ is also called the central region of P corresponding to δ ∈ [0, 1].
Example 1. Let X ∼ P ∈ P(R^d) be the uniform distribution on the unit ball B^d. The marginal distribution function of the first coordinate of X is given by

F_1(t) = ∫_{−1}^{t} c_d (1 − s^2)^{(d−1)/2} ds for t ∈ [−1, 1],

with c_d > 0 the normalizing constant of the marginal density. It is not difficult to see that

hD(x; P) = F_1(−‖x‖) for x ∈ R^d,

i.e. the central region P_δ of P is a ball with radius −F_1^{-1}(δ) for δ ∈ [0, 1/2] and F_1^{-1} the quantile function corresponding to F_1. For d = 2 we obtain

hD(x; P) = ( arccos(‖x‖) − ‖x‖ (1 − ‖x‖^2)^{1/2} ) / π for ‖x‖ ≤ 1,

which agrees with Rousseeuw and Ruts [151, Section 5.6]. Uniform distributions on balls are a special case of spherically (and elliptically) symmetric distributions. Such distributions will be treated in Example 2 below. For the uniform distribution P ∈ P(R^2) on the unit square [0, 1]^2, the halfspace depth and its central regions were computed by Rousseeuw and Ruts [151, Section 5.4]:

hD((x, y); P) = 2 min{x, 1 − x} min{y, 1 − y} for (x, y) ∈ [0, 1]^2, and hD((x, y); P) = 0 otherwise.
The expression for the halfspace depth of P ∈ P (R 2 ) distributed uniformly on the equilateral triangle can be found in Rousseeuw and Ruts [151,Section 5.3]. Several central regions (7) of the halfspace depth for the latter two distributions centered at their halfspace medians are displayed in Figure 3. Exact expressions for the halfspace depth for the uniform distribution on a simplex, and a (hyper)-cube in R d are much more involved for d > 2 than for d = 2. They can be obtained from [158, Lemma 1.3 and its proof]. The black dots stand for the halfspace medians of these two distributions. Note that the central region for δ = 0.45 is empty for P uniform on a triangle. In that case Π(P ) = 4/9 < 0.45 for Π from (9).
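The closed-form depths of Example 1 are easy to verify numerically. The sketch below evaluates hD for the uniform distributions on the unit disk and on the unit square, and cross-checks the disk formula against F_1(−‖x‖) obtained by numerical integration of the marginal density f_1(s) = (2/π)(1 − s^2)^{1/2}; all function names are ours.

```python
import math

def depth_unit_disk(x):
    """hD(x; P) for P uniform on the unit disk B^2 (Example 1, d = 2)."""
    r = math.hypot(x[0], x[1])
    if r >= 1.0:
        return 0.0
    return (math.acos(r) - r * math.sqrt(1.0 - r * r)) / math.pi

def depth_unit_square(x):
    """hD((x, y); P) for P uniform on [0, 1]^2 (Rousseeuw and Ruts)."""
    u, v = x
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        return 0.0
    return 2.0 * min(u, 1.0 - u) * min(v, 1.0 - v)

def marginal_cdf_disk(t, steps=20000):
    """F_1(t) for the first coordinate of a uniform point on B^2,
    via the trapezoidal rule applied to f_1(s) = (2/pi) sqrt(1 - s^2)."""
    a = -1.0
    b = max(-1.0, min(1.0, t))
    h = (b - a) / steps
    f = lambda s: (2.0 / math.pi) * math.sqrt(max(0.0, 1.0 - s * s))
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, steps)) + 0.5 * f(b))
```

At the centers of symmetry both depths take the maximal value 1/2, and for the disk the quadrature of F_1(−r) reproduces the closed form to high accuracy.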
Example 2. In accordance with Fang et al. [53] we say that the distribution of X ∼ P ∈ P(R^d) is α-symmetric, 0 < α ≤ 2, if for some continuous function φ : [0, ∞) → R the characteristic function of the random vector X takes the form

E exp( i ⟨t, X⟩ ) = φ( ‖t‖_α ) for t ∈ R^d,

where i is the imaginary unit, and for t = (t_1, …, t_d) ∈ R^d we set ‖t‖_α = ( ∑_{i=1}^{d} |t_i|^α )^{1/α} for 0 < α < ∞, and ‖t‖_∞ = max_{i=1,…,d} |t_i|. For α = 2 we obtain the collection of all spherically symmetric distributions, i.e. distributions invariant under all orthonormal rotations of the sample space [53, Chapter 2]. For instance, the uniform distribution on the unit ball B^d, the uniform distribution on the unit sphere S^{d−1}, or the standard multivariate Gaußian distribution are all spherically symmetric. The multivariate probability distribution with independent Cauchy marginals is 1-symmetric. For φ(s) = e^{−s^α} with α ∈ (0, 2] we obtain the multivariate symmetric stable laws. α-symmetric distributions have been studied by many authors [32,53,86]. They are distinguished by the special property that all univariate projections of an α-symmetric random vector X = (X_1, …, X_d) ∼ P are multiples of the same univariate distribution, in the sense that

⟨t, X⟩ =_d ‖t‖_α X_1 for all t ∈ R^d,

where =_d stands for "is equal in distribution" [53, Theorem 7.1]. This makes it possible to compute the depth hD(·; P) exactly. For an α-symmetric P ∈ P(R^d) we have

(8) hD(x; P) = inf_{t ≠ 0} F_1( −⟨t, x⟩ / ‖t‖_α ) = F_1( −‖x‖_{α*} ) for all x ∈ R^d,

where F_1 is the distribution function of X_1, and α* = α/(α − 1) is the conjugate exponent to α (with α* = ∞ for 0 < α ≤ 1). The last equality in (8) is due to the (generalized) Hölder inequality (see, e.g., [38, Lemma A.1]). All central regions P_δ of an α-symmetric distribution are therefore the lower level sets of the norm ‖·‖_{α*}. In particular, for all spherically symmetric distributions the central regions are centered balls, and for all α ≤ 1 the central regions are centered (hyper)-cubes in R^d.
Apart from simple uniform distributions on convex bodies such as those in Example 1 and atomic distributions (see the left panel of Figure 2), α-symmetric distributions (and their affine images) are the only class of probability distributions whose depth hD we are able to evaluate exactly. This was noticed by Massé and Theodorescu [118, Example (C)] and Chen and Tyler [38]. See also Figure 4.
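As a concrete instance of (8), take the bivariate distribution with independent standard Cauchy marginals. It is 1-symmetric, so α* = ∞ and the depth reduces to F_1(−‖x‖_∞), with F_1 the standard Cauchy distribution function; a short sketch follows (function names ours).

```python
import math

def cauchy_cdf(t):
    """Distribution function of the standard Cauchy law."""
    return 0.5 + math.atan(t) / math.pi

def depth_indep_cauchy(x):
    """hD(x; P) for P with independent standard Cauchy marginals in R^d.

    P is 1-symmetric, so by (8) with alpha* = infinity the depth equals
    F_1(-||x||_inf) = F_1(-max_i |x_i|)."""
    return cauchy_cdf(-max(abs(xi) for xi in x))
```

The contours of this depth are concentric squares (cubes for d > 2), in line with the statement of Example 2 that for α ≤ 1 the central regions are centered hypercubes.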
3.2.1. Affine invariance. For a non-singular matrix A ∈ R^{d×d} and b ∈ R^d, consider the affine transformation T : R^d → R^d : x ↦ A x + b. Since T maps closed halfspaces onto closed halfspaces, hD(T(x); P_{T(X)}) = hD(x; P_X) for all x ∈ R^d. This implies that the central regions P_δ ≡ (P_X)_δ are affine equivariant under affine transformations T of full rank, i.e. T((P_X)_δ) = (P_{T(X)})_δ for any δ ∈ [0, 1]. Due to the affine invariance of hD and Example 2, the central regions P_δ of elliptically symmetric distributions (i.e. invertible affine images of spherically symmetric distributions, see [53, Chapter 2]) are concentric ellipsoids with the same center and orientation as the density level sets of P (if the density exists). In particular, this holds true for the central regions of any full-dimensional multivariate Gaußian distribution (see Figure 4).

[Figure 4. Left panel: for any x ∈ R^d \ {0} (brown triangle), the unique hyperplane H ∋ x such that P(H^-) = hD(x; P) is the hyperplane that supports the contour ellipsoid of the density of P passing through x (solid brown line); the depth central regions P_δ with δ ∈ (0, 1/2) are thus concentric ellipsoids of the same shapes as the density contours. Right panel: several density contours (dashed lines) and the corresponding halfspace depth contours (solid lines) of the bivariate distribution with independent Cauchy marginals P; since P is 1-symmetric, the central regions P_δ are all concentric squares.]

3.2.2.
Quasi-concavity. The sets P δ are all convex, which means that the mapping hD is quasi-concave in its first argument. Quasi-concavity of hD is essential for the construction of estimators based on the depth, such as the depth-trimmed means.

3.2.3.
Maximality at the center. Denote the maximal depth value of a distribution by

(9) Π(P) = sup_{x ∈ R^d} hD(x; P) for P ∈ P(R^d).
By Rousseeuw and Struyf [152, Lemma 1], Π(P) > 0 for any P ∈ P(R^d), and Π(P) ≤ 1/2 for P that satisfies (4). As shown by Rousseeuw and Ruts [151, Proposition 7], for any P ∈ P(R^d) the maximal depth is attained in R^d. Therefore, it makes sense to define the halfspace median (or depth median) of P as any point x_P ∈ R^d such that hD(x_P; P) = Π(P). The halfspace median is not necessarily unique: consider, for instance, the uniform distribution P on the vertices of a simplex in R^d, where any point in that simplex is a halfspace median of P. If the set M(P) = { x ∈ R^d : hD(x; P) = Π(P) } is not a singleton, some authors prefer to define the halfspace median as the barycenter of the region M(P).
In this paper, we do not follow that convention, and unless stated otherwise, we call all elements of the set M(P ) halfspace medians of P .
In general, the set of all halfspace medians of P can be shown to be non-empty, compact and convex. If (4) is true for P with contiguous support, then by Mizera and Volauf [126,Proposition 7] the halfspace median of P is unique. In any case, the central regions (7) are non-empty if and only if δ ∈ [0, Π(P )].
If the distribution P is (in some sense) symmetric around a point x P ∈ R d , it is natural to require that the center of symmetry x P is the unique halfspace median of P , i.e. the only point such that Π(P ) = hD(x P ; P ).
Definition. The distribution P ≡ P_X ∈ P(R^d) is said to be centrally symmetric around x_P ∈ R^d if P_{X − x_P} = P_{x_P − X}. P is centrally symmetric if it is centrally symmetric around some x_P ∈ R^d.
If P is centrally symmetric around x_P, the maximal depth value Π(P) must be at least 1/2, and this depth is attained at the center of symmetry x_P. But centrally symmetric distributions are not the only ones for which the maximal depth is at least 1/2. This leads to the following definition, due to Zuo and Serfling [188].
Definition. The distribution P ∈ P(R^d) is said to be halfspace symmetric around x_P ∈ R^d if hD(x_P; P) ≥ 1/2. P is said to be halfspace symmetric if it is halfspace symmetric around some x_P ∈ R^d.
As discussed in Zuo and Serfling [188], the halfspace symmetry of measures in R d is more general than the rather restrictive central symmetry, in the sense that any centrally symmetric distribution is also halfspace symmetric. To see that the converse does not hold true, consider the following example.
Example 3. Let P ∈ P(R^2) be the uniform distribution concentrated in the vertices (±1, ±1) of a centered square in R^2. P is halfspace symmetric, and centrally symmetric, around x_P = 0 ∈ R^2. For any λ > 0, translate the point mass from (1, 1) to (λ, λ). The resulting distribution P′ is then still halfspace symmetric around the origin. Yet, for λ ≠ 1, P′ is not centrally symmetric.
Any univariate distribution is halfspace symmetric around its (univariate) median. For a comprehensive discussion on the subject of symmetry of multivariate probability distributions see Serfling [167].

3.2.4.
Vanishing at infinity. Any random vector X ∼ P ∈ P(R^d) lives with large probability inside a closed ball of finite diameter. Thus, it is reasonable to ask that also the depth associated to P assigns high values of hD only to points inside (big) closed balls. This property, often called the vanishing at infinity property of hD, can be expressed as

lim_{‖x‖ → ∞} hD(x; P) = 0.

For the halfspace depth this condition is satisfied (see, for instance, [187, Theorem 2.1]). The central regions (7) are therefore bounded for all δ > 0.
3.2.5. Continuity of the depth. As observed by Donoho and Gasko [45, Lemma 6.1], the halfspace depth is upper semi-continuous in its first argument, i.e.

(10) lim sup_{x → x_0} hD(x; P) ≤ hD(x_0; P) for all x_0 ∈ R^d.

By Mizera and Volauf [126, Proposition 1], if (4) holds true for P, then hD is also continuous in x. For the central regions (7), condition (10) means that each P_δ is a (convex) closed set for δ ∈ [0, Π(P)], and compact for δ ∈ (0, Π(P)], for any P ∈ P(R^d).
3.2.6. Continuity of the central regions. Consider now the set-valued mapping that, for P ∈ P(R^d) given, assigns to δ ∈ [0, Π(P)] its central region (7). This mapping is essential for understanding the properties of the depth, as the level sets of hD are usually of greater interest than individual depth values at fixed points in R^d. The mapping δ → P_δ takes values in the space of convex subsets of R^d. That space can be equipped with the Hausdorff distance (see, e.g., [156, Section 1.8])

δ_H(K, L) = min{ ε ≥ 0 : K ⊆ L^ε and L ⊆ K^ε },

where K^ε = { x ∈ R^d : ‖x − y‖ ≤ ε for some y ∈ K } is the ε-neighborhood of K. Continuity properties of the map δ → P_δ were investigated by several authors. The following result was first stated by Massé. In a slightly different context, it was also considered by Kong and Mizera [87].
Theorem 1. Let (4) be true for P ∈ P R d with contiguous support. Then the map δ → P δ is continuous in the Hausdorff distance for δ ∈ (0, Π(P )).
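For finite point sets, the Hausdorff distance used in Theorem 1 reduces to a maximum of two directed max-min distances; a small sketch (ours) is below. Note that for the convex regions P_δ themselves one would measure distances to the whole set, not only to finitely many of its points, so for vertex sets of polytopes the quantity below is in general only an upper bound for the Hausdorff distance of their convex hulls.

```python
import math

def hausdorff_distance(A, B):
    """Hausdorff distance between two finite point sets in R^d:
    the maximum of the two directed distances max_a min_b ||a - b||."""
    def directed(P, Q):
        return max(min(math.dist(p, q) for q in Q) for p in P)
    return max(directed(A, B), directed(B, A))
```

For example, the Hausdorff distance between {(0, 0), (1, 0)} and {(0, 0)} is 1: the directed distance from the first set to the second is 1, the reverse directed distance is 0.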

3.2.7.
Consistency, robustness and other statistical properties. In statistics, the true distribution P ∈ P(R^d) is seldom known. Instead, one usually observes for n ∈ N only a random sample X_1, …, X_n of independent random variables with distribution P, and infers the properties of P from the empirical distribution P_n of that sample. As n → ∞, the halfspace depth is universally consistent, which means that for any P ∈ P(R^d) the depth hD based on the empirical distribution P_n (the sample depth) approaches the true depth evaluated w.r.t. P uniformly over the whole space, i.e.

sup_{x ∈ R^d} | hD(x; P_n) − hD(x; P) | → 0 almost surely as n → ∞.

This result was first established in Donoho and Gasko [45, p. 1817]. Interestingly, it does not require any properties of the distribution P. For P satisfying (4), it can be strengthened to the form that

sup_{x ∈ R^d} | hD(x; Q_n) − hD(x; P) | → 0 for any sequence of measures Q_n ∈ P(R^d) that converges weakly to P.

This property follows by an argument of Dümbgen [46, Corollary 2] applied to hD, see also [130, Theorem A.3], and is frequently called the uniform qualitative robustness property of hD. Further robustness properties of hD were studied by Romanazzi [146,147], and Chen and Tyler [37,38], among others.
Uniform consistency results hold true also for the depth level sets (7). In its full generality, the following result, recently established in Dyckerhoff [50, Theorem 4.5 and Example 4.2], unifies and completes the partial results from [118,77,189].
Theorem 2. Let (4) be true for P ∈ P(R^d) with contiguous support. Then for every δ ∈ (0, Π(P)), δ_H((P_n)_δ, P_δ) → 0 almost surely as n → ∞.

In Theorem 2, (P_n)_δ stands for the δ-central region (7) of the empirical measure P_n. Further valuable improvements of the statistical theory of the halfspace depth include the derivation of the rates of convergence of the depth and its central regions [84,29,28], and distributional asymptotics of these and related quantities [8,185,186,115,116,117].

3.3. Concave measures and maximal depth. The maximal depth functional Π from (9) can be found in the literature much earlier than the definition of the halfspace depth (see [151, Sections 3 and 4]). From these references, it appears that the behavior of the maximal depth relates to the degree of concavity of the measure P. Following Borell [19], see also Bobkov [18], let us first provide a rigorous definition of concave probability measures.
As noted by Bobkov [18], if P is not a Dirac measure, then s ≤ 1. Further, a measure P ∈ P(R^d) with s ≤ 1/d is s-concave if and only if P has a density f that is supported on an open convex subset U of R^d and that is s_d-concave with s_d = s/(1 − ds), i.e., for all x, y ∈ U and all λ ∈ [0, 1],

f(λ x + (1 − λ) y) ≥ (λ f(x)^{s_d} + (1 − λ) f(y)^{s_d})^{1/s_d}.

For s = 0, s-concave measures are also called log-concave measures, and represent a natural generalization of uniform measures on convex bodies. Indeed, any uniform measure on a convex body is log-concave.
We are ready to state a result that summarizes what is known about the maximal depth functional Π(P) defined in (9).

Theorem 3. The following inequalities hold true:
(i) For any P ∈ P(R^d), Π(P) ≥ 1/(d + 1).
(ii) For P ∈ P(R^d) uniformly distributed on a convex body, Π(P) ≥ (d/(d + 1))^d ≥ 1/e.
(iii) For P ∈ P(R^d) s-concave with s > −1, the lower bound (12) on Π(P), depending only on s and d, holds true.

As noted by Grünbaum [69, Section 4], the lower bounds in parts (i) and (ii) are sharp. In part (i) it is enough to take the uniform distribution on the vertices of a simplex in R^d. For part (ii) one takes the uniform distribution on a simplex in R^d.

Problem 1. Are the lower bounds in part (iii) of Theorem 3 sharp? That is, does there exist an s-concave probability measure P with equality on the right hand side of (12)?
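The sharpness of the simplex case can be checked by simulation. For the uniform distribution on a triangle (d = 2), Grünbaum's bound gives depth (2/3)² = 4/9 at the expectation, with equality. The sketch below (Monte Carlo with 5000 points and a 360-direction scan, both arbitrary choices of ours) estimates the depth of the centroid of a triangle.

```python
import math
import random

random.seed(2)

def sample_triangle(n):
    # uniform samples from the triangle with vertices (0,0), (1,0), (0,1)
    pts = []
    for _ in range(n):
        u, v = random.random(), random.random()
        if u + v > 1:          # reflect into the lower-left half of the square
            u, v = 1 - u, 1 - v
        pts.append((u, v))
    return pts

def depth(x, pts, n_dir=360):
    # direction-scan approximation of the bivariate halfspace depth
    n = len(pts)
    best = 1.0
    for k in range(n_dir):
        a = 2 * math.pi * k / n_dir
        ux, uy = math.cos(a), math.sin(a)
        frac = sum(1 for (px, py) in pts
                   if (px - x[0]) * ux + (py - x[1]) * uy >= 0) / n
        best = min(best, frac)
    return best

centroid = (1 / 3, 1 / 3)
est = depth(centroid, sample_triangle(5000))
print(est, 4 / 9)   # the estimate should be close to 4/9 ≈ 0.444
```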
The lower bounds in parts (i) and (ii) of Theorem 3 were proved by Neumann [132] for d = 2. In full generality, part (ii) was proved independently by Grünbaum [69] and Hammer [74]. Part (iii) can be found in Caplin and Nalebuff [34, Proposition 3], see also Bobkov [18, Theorem 5.2]. As discussed by Bobkov [18], the condition s > −1 implies the existence of the expectation E X of X ∼ P. In fact, the proofs of all three parts of Theorem 3 show that hD(E X; P) is never smaller than the given lower bounds.

Theorem 4. Let P ∈ P(R^d) be uniformly distributed on a convex body K ∈ K^d. Then P is halfspace symmetric around x_P ∈ R^d if and only if it is centrally symmetric around x_P.
The proof of Theorem 4 was first obtained in 1915 for d = 2 and d = 3 by Funk [59]. In its full generality the result was conjectured, among others, by Grünbaum [71, p. 251], but completely solved only in 1970 in Schneider [155, Satz 4.2] and Schneider [154, Theorem 1.5], see also Falconer [52]. For its modern version, including an extension to star convex bodies K ⊂ R^d, see Groemer [65, Section 5.6].
By Theorem 4, the two notions of central and halfspace symmetry from Section 3.2.3 coincide for uniform distributions on (star) convex bodies in R^d, see also Example 3. This suggests asking to what extent the equivalence extends beyond uniform distributions.

Definition. The distribution of a random vector X ∼ P ∈ P(R^d) is said to be angularly symmetric around x_P ∈ R^d if the random variables (X − x_P)/‖X − x_P‖ and −(X − x_P)/‖X − x_P‖ are identically distributed. P is angularly symmetric if it is angularly symmetric around some x_P ∈ R^d.

Angular symmetry is an intermediate notion between the rather strong concept of central symmetry and the halfspace symmetry considered in Section 3.2.3. Any P that is centrally symmetric around x_P is angularly symmetric around x_P [188, Lemma 2.2], and any P angularly symmetric around x_P is also halfspace symmetric around x_P [188, Lemma 2.4]. Neither of these implications can be reversed. A partial converse to the second one was, however, asserted in the statistical literature. For d = 2, Zuo and Serfling [188, Theorem 2.6] in 2000 and Dutta et al. [48, Theorem 2] in 2011 independently proved that if P is absolutely continuous and halfspace symmetric around x_P, then P must also be angularly symmetric around x_P. Rousseeuw and Struyf [152, Theorems 1 and 2] in 2004 gave a complete proof for general d ∈ N in the following form.
In particular, (i) any P halfspace symmetric around x P with P ({x P }) = 0 is angularly symmetric around x P , and (ii) for any P such that sup x∈R d P ({x}) = 0, halfspace symmetry and angular symmetry are equivalent notions.
When P is the uniform distribution on a (centered) convex body K ∈ K d , Theorem 5 stands as a generalization of Funk's theorem to probability measures. Indeed, assume that P is halfspace symmetric around the origin x P = 0 ∈ R d . Since P is absolutely continuous, by Theorem 5 it is also angularly symmetric around x P . Because P is uniform, angular symmetry of P implies that the support function h K from (2) must be an even function on S d−1 , which in turn gives that K must be centrally symmetric around x P .
Remarkably, the results of Rousseeuw and Struyf [152, Theorems 1 and 2] were discovered independently of the corresponding results in geometry. The proof of Rousseeuw and Struyf [152] makes use of the classical theorem of Cramér and Wold [40] from 1936, closely related to the Fourier transforms of measures. The known proofs of Theorem 4 employ techniques from spherical harmonics, or integral equations. Thus, all known proofs of Theorems 4 and 5 are non-trivial, and they all rely on harmonic analysis.

4.3. Measures of symmetry. Characterization results like Theorem 4 for convex bodies stimulated much research in convex geometry. Eventually, these efforts led to measures of symmetry for convex sets, comprehensively covered by Grünbaum [71]. A measure of symmetry is a mapping S : K^d → [0, 1] such that (i) S(K) = 1 if and only if K is centrally symmetric, (ii) S(T(K)) = S(K) for any non-singular affine transformation T : R^d → R^d, and (iii) S is continuous on K^d (equipped with a suitable topology 2). A variant of part (ii) in Theorem 3, stating that for any X ∼ P ∈ P(R^d) uniformly distributed on a convex body the depth of the expectation satisfies hD(E X; P) ≥ (d/(d + 1))^d,

2 For details on possible choices of topology see Grünbaum [71].
is known since the 1910s as the Winternitz theorem (due to Artur Winternitz, according to [17]). This result gave rise to the following measure of symmetry, which is remarkably close to the halfspace depth. For K ∈ K^d and x ∈ K, let w_K(x) be the infimum, over all hyperplanes H passing through x, of the ratio of the volume of the smaller of the two parts into which H divides K to the volume of the larger part. The Winternitz measure of symmetry of K is then defined as W(K) = sup_{x ∈ K} w_K(x). The measure of symmetry W(K) was considered by many authors. For a historical account and the theoretical background on measures of symmetry see the seminal paper of Grünbaum [71, Section 6.2]. For a modern treatment of the topic see Toth [175].
Obviously, for K ∈ K^d, the Winternitz measure of symmetry is equivalent to the maximal depth (9) attained w.r.t. the uniform measure P on K.
For w K , it was noted already by Grünbaum [71] in 1963 that its upper level sets are convex, and that its maximal value is always attained in K (cf. Sections 3.2.2 and 3.2.3 above).
Connections of the depth hD with results on partitions of convex bodies (Theorem 3 above) have already been noted by Rousseeuw and Ruts [151]. Yet, as far as we know, no links between the measures of symmetry for convex bodies and the halfspace depth have been established in the statistical literature.
In the other direction, some notions of depth can be found in the literature on the geometry of convex bodies. For instance, in Bose et al. [24] the "depth" for a convex body K is defined as the halfspace depth (3) of the associated uniform distribution, in connection with a generalized version of the Winternitz theorem. Nonetheless, precise links between the respective fields of mathematics appear to be still lacking.
A halfspace H− ∈ H− is called minimal at x ∈ R^d (w.r.t. P) if x lies on its boundary hyperplane H and P(H−) = hD(x; P); H is then called a minimal hyperplane at x. From the definition of the minimal halfspace it is easy to see that the following holds.

Proposition 6. Let P ∈ P(R^d) have contiguous support and let H− ∈ H− be minimal at x ∈ R^d with hD(x; P) = δ. Then the halfspace H+ supports P_δ.
An interesting characterization of the halfspace median of a measure P ∈ P R d in terms of minimal halfspaces was observed by Donoho and Gasko [45, pp. 1818-1819] in 1992 and Rousseeuw and Ruts [151,Propositions 8 and 12] in 1999. For P absolutely continuous, x is a halfspace median of P if and only if the union of the collection of minimal halfspaces at x is R d . In Rousseeuw and Ruts [151], this result is dubbed the ray basis theorem.
Theorem 7. Let P ∈ P R d , and x ∈ R d be such that the union of the collection of minimal halfspaces at x is R d . Then x is a halfspace median of P .
Conversely, assume that P satisfies (4), and let x ∈ R^d be a halfspace median of P. Then there exists a collection of minimal halfspaces at x of cardinality at most d + 1 whose union is R^d.
The smoothness condition (4) is important in Theorem 7. As noted by Massé [116,Example 4.3] it is possible to construct distributions P ∈ P R d , that violate (4), with a unique minimal halfspace at their halfspace median.
For P ∈ P R d uniformly distributed on a convex body K ∈ K d , a result similar to Theorem 7 was stated in Grünbaum [71, p. 251] in 1963 for the Winternitz measure of symmetry. There, it was asserted that it follows from a version of Helly's theorem that there must exist at least d + 1 different minimal halfspaces at the halfspace median x P of P . The assumptions of that result appear, however, to be incomplete, as pointed out to us by M. Tancer [138].
Another interesting problem closely connected with the halfspace median and Theorem 7 is a conjecture of Grünbaum [70, p. 41] from 1961 that asks whether for any convex body K ∈ K^d with d ≥ 2 there exists a point x ∈ K that is a centroid of at least d + 1 sections of K by different hyperplanes passing through x. For d = 2, the solution to this problem is straightforward, as noted already in [70]. For d > 2, this problem appears to be still open (see [172], [71, p. 251], and [41, Problem A8]). It is natural to conjecture that the halfspace median is such a point. Indeed, combine Theorem 7 with a theorem of Dupin [47] (stated in part (ii) of Proposition 12 below), which says that for any K ∈ K^d the point x ∈ K is the centroid of the section K ∩ H for every minimal hyperplane H at x (w.r.t. the uniform distribution P on K), to obtain that if the minimal hyperplanes at x are in general position, then the halfspace median is a point as postulated in the conjecture. Here, a set of hyperplanes is said to be in general position if for all choices of at most d such distinct hyperplanes their normals are linearly independent. A further open question is whether the conjecture holds true with x being the centroid of K.
Theorem 7 provides a useful characterization criterion for the depth-based extension of the median. Apart from its theoretical appeal, it promises applications in the computation of the depth, and the depth median.

4.3.2. Minimality and stability. An important question regarding the measures of symmetry concerns their minimality, i.e. the characterization of sets K ∈ K^d such that S(K) = inf{S(K′) : K′ ∈ K^d}. As remarked by Grünbaum [69, Section 4] in 1960, for the Winternitz measure of symmetry the infimum is attained if and only if K is a bounded cone in R^d; the minimal value corresponds to the lower bound in part (ii) of Theorem 3. In a related question, Grünbaum [69] also determined the collection of measures P ∈ P(R^d) for which the lower bound Π(P) = 1/(d + 1) from part (i) of Theorem 3 is attained, by showing that this can happen if and only if P is a uniform distribution on the vertices of a non-degenerate simplex in R^d. In statistics, this result was observed independently by Donoho and Gasko [45, Lemma 6.3] in 1992 for hD.
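The extremal case of the last statement can be checked directly in the plane: for P uniform on the three vertices of a triangle, the maximal depth equals 1/3 = 1/(d + 1), attained at the centroid. A small sketch (a direction scan over 360 angles, which is exact here since there are only three atoms; the equilateral triangle is an arbitrary choice):

```python
import math

# vertices of an equilateral triangle, centroid at the origin
V = [(1.0, 0.0), (-0.5, math.sqrt(3) / 2), (-0.5, -math.sqrt(3) / 2)]

def depth(x, atoms):
    # halfspace depth of x w.r.t. the uniform measure on the atoms:
    # minimum over scanned directions u of P({p : <p - x, u> >= 0})
    best = 1.0
    for k in range(360):
        a = 2 * math.pi * k / 360
        u = (math.cos(a), math.sin(a))
        mass = sum(1 for p in atoms
                   if (p[0] - x[0]) * u[0] + (p[1] - x[1]) * u[1] >= -1e-12)
        best = min(best, mass / len(atoms))
    return best

# Every closed halfplane through the centroid contains at least one vertex,
# and the halfplane {x >= 1} catches exactly one, so the depth is 1/3.
print(depth((0.0, 0.0), V))
```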
In convex analysis, another desirable property of measures of symmetry is their stability. A measure of symmetry S is said to have the stability property if closeness of S(K) to its maximal value 1 forces K to be close to a centrally symmetric convex body: for any ε > 0, S(K) ≥ 1 − ε implies δ(K, K′) ≤ c ε^α for some centrally symmetric K′ ∈ K^d and constants c, α > 0. Here, δ stands for some metric on K^d, and c may depend on d, as well as on some characteristic of K such as its volume, or diameter. An important stability theorem for the Winternitz measure of symmetry was derived by Groemer [66, Theorem 2]; there, δ is the symmetric difference metric d_S(K, L) = vol_d((K \ L) ∪ (L \ K)) on K^d from (13).
As far as we are aware, no results corresponding to stability theorems can be found for probability measures and the halfspace depth. 4.4. Affine invariant points. Symmetry is a key structural property of convex bodies, relevant in many problems. A systematic study of symmetry was initiated by Grünbaum in his seminal paper [71] from 1963. A crucial notion in his work is that of an affine invariant point, which allows one to analyze the symmetry of a convex body. In a nutshell: the more affine invariant points, the fewer symmetries.
Recall that the set K^d is equipped with the Hausdorff distance (11). A mapping p : K^d → R^d is called an affine invariant point if p is continuous w.r.t. the Hausdorff distance and p(T(K)) = T(p(K)) for every K ∈ K^d and every non-singular affine transformation T : R^d → R^d. We denote by P_d the set of all affine invariant points on R^d.
Examples of affine invariant points, already known to Grünbaum [71], are, e.g., the centroid of a convex body K (i.e. the expectation of the uniform distribution on K), the Santaló point (the unique point s(K) in the interior of K ∈ K^d at which the minimum of the functional x_0 ↦ vol_d(K^{x_0}) is attained, see also the important Blaschke–Santaló inequality in (19) below), and the center of the ellipsoid of maximal volume inside a convex body. Grünbaum [71] asked a number of questions about affine invariant points. We denote P_d(K) = {p(K) : p ∈ P_d} and F_d(K) = {x ∈ R^d : T(x) = x for every affine T with T(K) = K}; question (iii) of Grünbaum asks: do we have F_d(K) = P_d(K)? One can argue that those convex bodies that have only one affine invariant point are the most symmetric convex bodies. This would include the simplex in R^d, which is, from another point of view, the most non-symmetric convex body (see Theorem 3).
A convex body has only one affine invariant point, if it has enough symmetries. We say that an affine map T : R d → R d is a symmetry of a convex body K if T (K) = K. We say that a convex body has enough symmetries if the only affine maps commuting with all symmetries of K are multiples of the identity.
For a convex body K with enough symmetries the halfspace median coincides with the centroid of K.
The following theorems answer Grünbaum's questions (i) and (ii). They can be found in Meyer et al. [123].
Such convex bodies are actually dense in K d with respect to the Hausdorff metric.
In the proofs of these theorems, new classes of affine invariant points were introduced using convex floating bodies (see Section 5.2 below). We define p δ : K d → R d to be the mapping that sends K to the centroid of P δ from (7) for P uniform on K.
Moreover, in Meyer et al. [123, Theorem 2] it was shown that for convex bodies K with dim(P_d(K)) = d − 1, Grünbaum's question (iii) above has a positive answer, i.e. F_d(K) = P_d(K). The question was settled in all dimensions by Mordhorst [127], based on work of Kučment [90] (see also [91]), who had almost proved question (iii) already in 1972, with only a compactness argument missing.
Theorem 11. For any K ∈ K d we have that F d (K) = P d (K).

Description at the boundary: Convex floating bodies
Data depth is intimately related to the concept of floating body, which we now introduce. We start with a brief discussion of differentiability properties of the boundary of convex bodies, since this will be essential in what follows. 5.1. Curvature of convex bodies. We take as a measure on the boundary ∂K of a convex body K ∈ K^d the restriction of the (d − 1)-dimensional Hausdorff measure to ∂K. We call this measure the boundary measure, or the Lebesgue measure on ∂K, and denote it by µ_∂K. Let U be an open subset of R^d and f : U → R a twice continuously differentiable function. Then the classical Gauß-Kronecker curvature of the graph of f at x_0 ∈ U is

κ(x_0) = det(∇²f(x_0)) / (1 + ‖∇f(x_0)‖²)^{(d+2)/2},

where ∇f is the gradient of f and ∇²f the Hessian of f. The Gauß-Kronecker curvature of the boundary of a convex body is the curvature of a function parametrizing the boundary. By a theorem of Rademacher (see, e.g., [23, Theorem 2.5.1]), a convex function on R^d, and in particular the boundary of a convex body, is almost everywhere differentiable. There are, however, examples of convex functions that are not differentiable on a dense subset of R^d, and of convex bodies whose boundaries are not differentiable on a dense subset. Such examples do not have a second derivative at any point, and thus the classical Gauß-Kronecker curvature κ does not exist at any point.
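The graph formula for the curvature can be checked numerically. The sketch below (central finite differences with step h = 1e−4; the quadratic test function is our own arbitrary choice) computes the Gauß-Kronecker curvature of the graph of f(x, y) = x² + 2y² and compares it with the closed-form value det(∇²f)/(1 + ‖∇f‖²)^{(d+2)/2} with d = 2.

```python
def f(x, y):
    return x * x + 2 * y * y

def curvature(f, x, y, h=1e-4):
    # finite-difference gradient and Hessian of f at (x, y)
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    det_hess = fxx * fyy - fxy ** 2
    # Gauss-Kronecker curvature of the graph, d = 2: exponent (d+2)/2 = 2
    return det_hess / (1 + fx ** 2 + fy ** 2) ** 2

# At (0.3, 0.1): grad f = (0.6, 0.4), Hessian = diag(2, 4), so
# kappa = 8 / (1 + 0.36 + 0.16)^2 = 8 / 1.52^2.
print(curvature(f, 0.3, 0.1), 8 / 1.52 ** 2)
```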
Therefore we use the generalized Gauß-Kronecker curvature as introduced by Busemann and Feller [30] in dimension d = 3 and Aleksandrov [2] in general. We present here only a short explanation of the generalized Gauß-Kronecker curvature and we refer to e.g., [164, Section 1.6] and [161] for a detailed account.
A cap of K ∈ K^d at x ∈ ∂K is the intersection of a halfspace H− with K such that there is a supporting hyperplane to K at x that is parallel to H. There may, of course, be points on the boundary of K having more than one supporting hyperplane, but those points form a set of measure 0 and play no role in our discussion.
If K has a unique supporting hyperplane at x ∈ ∂K, we denote by ∆(x, δ) the height of a cap with volume δ. The height of a cap is the distance of the supporting hyperplane at x to the parallel hyperplane cutting off a set of volume δ.
Assume that K has at x a unique supporting hyperplane. We say that K has a generalized Gauß-Kronecker curvature at x if the limit lim_{δ→0} c_d ∆(x, δ)^{d+1}/δ² exists. In this case we define

(14) κ(x) = lim_{δ→0} c_d ∆(x, δ)^{d+1}/δ²

to be the generalized Gauß-Kronecker curvature at x; here c_d > 0 is a normalizing constant, depending only on d, chosen so that (14) agrees with the classical curvature whenever the latter exists.
If the Gauß-Kronecker curvature exists, then it is equal to the generalized Gauß-Kronecker curvature. By a theorem of Busemann, Feller and Aleksandrov [30,2] the generalized Gauß-Kronecker curvature of a convex body exists almost everywhere. Geometrically, the existence of the generalized Gauß-Kronecker curvature at x means that ∂K can be "well" approximated by an ellipsoid, or ellipsoidal cylinder at x (see, e.g., [164, Section 1.6]).
The following example clarifies the difference between the Gauß-Kronecker curvature and the generalized Gauß-Kronecker curvature. Let f : [−1, 1] → R be the convex function with f(0) = 0, f(±1/n) = 1/n² for n ∈ N, and f linear on each interval 1/(n + 1) ≤ |x| ≤ 1/n, i.e. the piecewise linear interpolation of x ↦ x² at the points ±1/n. The function f is not differentiable at the points x = ±1/n and therefore f is not twice differentiable at 0. Thus, the Gauß-Kronecker curvature of f does not exist at 0. On the other hand, it is not difficult to compute that f has a generalized Gauß-Kronecker curvature at 0, and this curvature is 2, see Figure 5.
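The value 2 can be verified numerically via the cap-based formula (14). For a planar convex body (d = 2) an elementary computation with circular caps gives the normalizing constant c₂ = 32/9 (our own derivation), so κ(0) = lim_{t→0} (32/9) t³/δ(t)², where δ(t) is the area of the cap cut off from the epigraph of f at height t. The sketch below evaluates this quotient at t = 10⁻⁶; all grid sizes are arbitrary choices.

```python
def f(x):
    # piecewise linear interpolation of x -> x^2 at the nodes 1/n
    x = abs(x)
    if x == 0.0:
        return 0.0
    n = max(1, int(1.0 / x))
    a, b = 1.0 / (n + 1), 1.0 / n
    if x < a:                       # guard against rounding at a node
        a, b = 1.0 / (n + 2), a
    s = (x - a) / (b - a)
    return (1 - s) * a * a + s * b * b

def cap_area(t, n_grid=200_000):
    # area of the cap {(x, y) : f(x) <= y <= t}; since f(x) >= x^2,
    # the cap is contained in |x| <= sqrt(t)
    r = t ** 0.5
    h = 2 * r / n_grid
    ys = [max(t - f(-r + i * h), 0.0) for i in range(n_grid + 1)]
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))   # trapezoidal rule

t = 1e-6
delta = cap_area(t)
kappa = (32.0 / 9.0) * t ** 3 / delta ** 2
print(kappa)   # close to 2, the curvature of x -> x^2 at 0
```

Formula (14) uses the cap height for given volume, while the code fixes the height and computes the volume; the two quotients have the same limit.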

5.2. Floating body and convex floating body. Earliest records on floating bodies can be traced back to the early 19th century work of Dupin [47] and are motivated by mechanics. By the Archimedean principle, a solid convex body K ∈ K³ of constant (volumetric mass) density that floats in water has always a set of the same volume above the water surface, regardless of its position. This leads to the definition of floating bodies for convex bodies K ∈ K^d according to Dupin: A nonempty convex subset K_[δ] of K is a floating body of K if each supporting hyperplane to K_[δ] cuts off a set of volume δ > 0 of K. Dupin observed that a supporting hyperplane H to K_[δ] touches the boundary of K_[δ] in exactly one point, the barycenter of K ∩ H. This implies that if K_[δ] exists, its boundary is given by the surface of all barycenters of H ∩ K for hyperplanes H that cut off volume δ from K.
The floating body cannot exist for δ > vol_d(K)/2. Indeed, suppose it did exist. Then any two distinct parallel supporting hyperplanes of K_[δ] would cut off disjoint sets of volume δ from K, which is impossible since 2δ > vol_d(K); therefore K_[δ] would have to be empty. As shown in the next example, the floating body K_[δ] may fail to exist even for small δ > 0.
Example 5. Let K ∈ K 2 be the equilateral triangle from Example 1. For all δ > 0, the curve of barycenters of lines that cut off volume δ from K is not the boundary of a convex set. Some of these curves for various values of δ are displayed on the left panel of Figure 6. Therefore, in agreement with the observation of Leichtweiß [92, pp. 433-434], no floating body of a triangle exists. Compare this also to Example 1.
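The non-convexity of the curve of barycenters can also be confirmed numerically: the midpoints of chords that cut off a fixed area from a triangle are not in convex position. The sketch below (δ = 0.05 and 90 scan directions are arbitrary choices of ours; plain Sutherland–Hodgman clipping and bisection) computes these midpoints for the triangle with vertices (0, 0), (1, 0), (0, 1) and checks that some of them lie strictly inside the convex hull of the others.

```python
import math

TRI = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]          # area 1/2

def clip(poly, u, c):
    """Part of poly in the halfplane {p : <u, p> <= c} (Sutherland-Hodgman)."""
    out = []
    for i in range(len(poly)):
        p, q = poly[i], poly[(i + 1) % len(poly)]
        fp = u[0] * p[0] + u[1] * p[1] - c
        fq = u[0] * q[0] + u[1] * q[1] - c
        if fp <= 0:
            out.append(p)
        if fp * fq < 0:
            t = fp / (fp - fq)
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out

def area(poly):
    s = 0.0
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

def chord_midpoint(u, delta):
    """Midpoint of the chord of TRI on the line {<u, p> = c}, with c chosen
    by bisection so that the cut-off part has area delta."""
    projs = [u[0] * x + u[1] * y for (x, y) in TRI]
    lo, hi = min(projs), max(projs)
    for _ in range(60):
        c = (lo + hi) / 2
        if area(clip(TRI, u, c)) < delta:
            lo = c
        else:
            hi = c
    c = (lo + hi) / 2
    cut = clip(TRI, u, c)
    ends = [p for p in cut if abs(u[0] * p[0] + u[1] * p[1] - c) < 1e-7]
    return (sum(p[0] for p in ends) / len(ends),
            sum(p[1] for p in ends) / len(ends))

def hull(points):
    """Strict convex hull via the monotone chain algorithm."""
    pts = sorted(set(points))
    def half(seq):
        h = []
        for p in seq:
            while len(h) >= 2 and ((h[-1][0] - h[-2][0]) * (p[1] - h[-2][1])
                                   - (h[-1][1] - h[-2][1]) * (p[0] - h[-2][0])) <= 0:
                h.pop()
            h.append(p)
        return h[:-1]
    return half(pts) + half(pts[::-1])

delta = 0.05
mids = [chord_midpoint((math.cos(a), math.sin(a)), delta)
        for a in [2 * math.pi * (k + 0.13) / 90 for k in range(90)]]
print(len(mids), len(hull(mids)))   # fewer hull vertices than midpoints
```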
If K ∈ K 2 is the unit square of Example 1, all floating bodies K [δ] exist for δ ∈ (0, vol d (K) /2], and they coincide with the halfspace depth central regions (7). If K ∈ K d has a sufficiently smooth boundary, then K [δ] exists by Leichtweiß [92,Satz 2], at least for small δ > 0. However, in many applications (e.g., in Section 5.3 below), existence of floating bodies for all convex bodies is needed. Therefore a modified definition has been proposed, independently by Bárány and Larman [12] and Schütt and Werner [161], called the convex floating body.
Definition. Let K be a convex body in R^d and δ ≥ 0. The convex floating body K_δ is the intersection of all halfspaces whose defining hyperplanes cut off a set of volume δ from K,

K_δ = ⋂ { H+ : H ∈ H with vol_d(K ∩ H−) = δ },

where H ∈ H is a hyperplane and H+ and H− are its associated halfspaces.
The convex floating body exists for all convex bodies, since it is an intersection of halfspaces. For instance, the convex floating body of the triangle has a boundary described by the red curve in Figure 6. Note also that K_0 = K. It is easy to see that whenever K_[δ] exists, then K_[δ] = K_δ [161]. Unlike the floating body, the convex floating body is allowed to be an empty set. This way, all convex floating bodies K_δ of K are well defined convex sets, but certainly K_δ = ∅ if δ > vol_d(K)/2.
Properties of the convex floating body are stated in the next proposition.

Proposition 12. Let K ∈ K^d and δ ≥ 0.
(i) Through every point of ∂K_δ there is at least one supporting hyperplane of K_δ that cuts off a set of volume δ from K.
(ii) A supporting hyperplane H of K_δ that cuts off a set of volume δ touches K_δ in exactly one point, the barycenter of K ∩ H.
(iii) Put
(15) δ_0 = max{δ ≥ 0 : K_δ ≠ ∅}.
Then K_{δ_0} consists of one point only, and for δ < δ_0 we have that K_δ is a convex body.
Most of Proposition 12 was proved in [163, Lemma 2]. Part (ii), in dimension d = 3, is due to Dupin [47], see also [92, p. 435]. In general, it is not true that all supporting hyperplanes of the convex floating body K_δ cut off a set of volume exactly δ from K; an example is the simplex, as can be seen from Example 5. Also, not every point on the boundary of K_δ has a unique supporting hyperplane; an example is the cube.
Meyer and Reisner [122] show that for centrally symmetric convex bodies K [δ] exists for any δ ∈ (0, vol d (K) /2]. Moreover, in that case each K [δ] is also (centrally) symmetric around the same center of symmetry as K. In an unpublished work, K. Ball gave a different proof of the existence result, see [121,Section 4].
Proposition 13. Let K ∈ K d be a convex body that is (centrally) symmetric with respect to the origin 0, i.e. x ∈ K implies −x ∈ K. Then we have for all δ ∈ (0, vol d (K) /2) (i) The floating body of K exists.
(ii) For all convex bodies K with C 1 boundary and all δ the floating body K δ has a C 2 boundary.
The next two results can be found in Schütt and Werner [162, Theorem 5.3 and Proposition 5.1], and describe the behavior of the volume of K \ K_δ.

Proposition 14. Let K ∈ K^d, and let δ_0 be as in (15). Then vol_d(K \ K_δ) is a differentiable function of δ on (0, δ_0), and its derivative is given by an integral over ∂K_δ involving the hyperplane sections K ∩ H(x, N_{∂K_δ}(x)), where H(x, N_{∂K_δ}(x)) is the hyperplane passing through x orthogonal to the normal of K_δ at x.

5.3. Affine surface area. An important affine invariant from affine convex geometry is the affine surface area. Applications of the affine surface area are numerous. We only name some in convex geometry [111,61,21,102,103,72], in differential geometry [3,4,170,83], approximation of convex bodies by polytopes (see Section 5.7), information theory [110,180,7,181], and partial differential equations [108,176]. Let K be a convex body in R^d with a C² boundary. Then for all x ∈ ∂K the Gauß-Kronecker curvature κ(x) exists, and the (classical) affine surface area, introduced by Blaschke [17] in 1923 in dimensions two and three, is defined as

as(K) = ∫_∂K κ(x)^{1/(d+1)} dµ_∂K(x).

For a Euclidean ball with radius 1, the affine surface area equals its surface area. It is 0 for all polytopes. Blaschke [17] observed that for convex bodies in R³ with analytic boundary the following identity holds:

(16) lim_{δ→0} (vol_d(K) − vol_d(K_[δ]))/δ^{2/(d+1)} = c_d as(K),

where c_d > 0 is a constant depending only on d. An important tool in the proof of this identity is the rolling theorem of Blaschke [17]: the floating body exists if a sufficiently small Euclidean ball rolls freely inside K, i.e., there is r > 0 such that for all x ∈ ∂K there is y ∈ K such that ‖x − y‖ = r and B^d(y, r) ⊆ K.
It is natural to ask if formula (16) can be extended to all dimensions and all convex bodies using the convex floating body instead of the floating body. This is indeed the case and was achieved in Schütt and Werner [161], where now the function κ under the integral is the generalized Gauß-Kronecker curvature (14).
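On the Euclidean unit disk everything is computable, since the convex floating body is a concentric disk: the volume difference vol_d(K) − vol_d(K_δ) must scale like δ^{2/(d+1)} = δ^{2/3}, so dividing δ by 8 divides the volume difference by approximately 8^{2/3} = 4. The sketch below (bisection on the cap height; the values of δ are arbitrary choices of ours) checks this scaling.

```python
import math

def cap_area(h):
    # area of a circular cap of height h of the unit disk
    return math.acos(1 - h) - (1 - h) * math.sqrt(2 * h - h * h)

def vol_diff(delta):
    """vol(K) - vol(K_delta) for K the unit disk: K_delta is the disk of
    radius 1 - h, where h is the cap height with cap area delta."""
    lo, hi = 0.0, 1.0
    for _ in range(80):
        h = (lo + hi) / 2
        if cap_area(h) < delta:
            lo = h
        else:
            hi = h
    h = (lo + hi) / 2
    return math.pi * (1 - (1 - h) ** 2)

d1, d2 = vol_diff(1e-3), vol_diff(1e-3 / 8)
print(d2 / d1)   # close to (1/8)^(2/3) = 1/4
```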
The expressions in the above theorem can thus be used to define the affine surface area for all convex bodies. Around the same time, different extensions of the affine surface area to arbitrary convex bodies were given by Leichtweiß [92] and Lutwak [105] and afterwards several more have been found, e.g., [81,124,178]. It has been shown that all those extensions coincide.
Expression (17) is called the affine surface area because of its similarity to Minkowski's definition of the surface area, and because for all non-singular affine maps T : R^d → R^d,

as(T(K)) = |det(T)|^{(d−1)/(d+1)} as(K).

The latter equation follows easily from (17). An important tool in the proof of Theorem 16 is a strengthening of Blaschke's rolling theorem. To achieve this, Schütt and Werner [161] introduce the rolling function. For x ∈ ∂K, the rolling function r(x) is the supremum of all radii of Euclidean balls that contain x and that are contained in K, i.e. r : ∂K → R is defined by

r(x) = sup{ ρ ≥ 0 : x ∈ B^d(y, ρ) ⊆ K for some y ∈ K }.

If K does not have a unique normal at x, then r(x) = 0. The following was shown by Schütt and Werner [161, Lemmas 4 and 5].
Proposition 17. Let K ∈ K^d be such that B^d ⊆ K. Then for all t with 0 ≤ t ≤ 1 the set {x ∈ ∂K : r(x) ≥ t} is closed, and the boundary measure of its complement in ∂K satisfies an upper bound that is optimal. In particular, the function r^{−α} : ∂K → R is Lebesgue integrable for all α with 0 ≤ α < 1.
Note that by taking t = 0 in Proposition 17 it follows that the boundary of a convex body is almost everywhere differentiable.
Affine invariance is a useful property, as it lets us consider convex bodies independently of their position in space. Another extremely important property of the affine surface area is the affine isoperimetric inequality, which says that for all convex bodies K ∈ K^d,

(18) as(K)^{d+1} ≤ d^{d+1} vol_d(B^d)² vol_d(K)^{d−1},

with equality if and only if K is an ellipsoid (see, e.g., [156, Section 10.5]). The affine isoperimetric inequality is stronger than the classical isoperimetric inequality and provides solutions to many problems where ellipsoids are extrema [106,163,171,183]. The affine isoperimetric inequality (18) is equivalent to another classical inequality from convex geometry, the Blaschke–Santaló inequality [17,153]. For an interior point x_0 of a convex body K, recall the definition of the polar body K^{x_0} of K w.r.t. x_0 from (1). The Blaschke–Santaló inequality states that for all convex bodies K in R^d,

(19) vol_d(K) vol_d(K^{s(K)}) ≤ vol_d(B^d)²,

where s(K) is the Santaló point of K, i.e. the unique point for which the minimum is attained on the left hand side. This inequality and its counterpart, the reverse Blaschke–Santaló inequality (proved by Bourgain and Milman [26] and closely connected to the still unsolved Mahler conjecture, see e.g. Giannopoulos et al. [62]), are helpful for estimating the volume of convex bodies in situations when it is easier to compute the volume of the polar body K^{x_0}. These inequalities have important applications in convex geometry, functional analysis, Banach space theory, quantum information theory, operator theory and geometric number theory. For background including references see, e.g., the books [5,60,62,86,156].
To conclude this section, note that for a polytope S ∈ K^d the volume difference vol_d(S \ S_δ) behaves differently from the case described in Theorem 16. To describe its behavior we need the notion of a flag. A flag of a polytope S is a d-tuple (f_0, . . . , f_{d−1}), where f_i is an i-dimensional face of S and f_i ⊂ f_{i+1} for all i. We write fl_d(S) for the number of flags of the polytope S.
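The number of flags is easy to compute for specific polytopes; for the d-dimensional cube, fl_d = 2^d d! (choose a vertex, then an order in which to free the coordinates). A brute-force sketch over the face lattice, with faces of [0, 1]^d encoded as words over {0, 1, *} (our own encoding, * marking a free coordinate):

```python
from itertools import product
from math import factorial

def cube_faces(d):
    # i-dimensional faces of [0,1]^d correspond to words with i stars
    return [''.join(w) for w in product('01*', repeat=d)]

def contained(small, big):
    # small ⊆ big iff every coordinate fixed by big is fixed to the
    # same value by small
    return all(b == '*' or b == s for s, b in zip(small, big))

def flag_count(d):
    by_dim = {i: [f for f in cube_faces(d) if f.count('*') == i]
              for i in range(d)}
    # dynamic programming over chains f_0 ⊂ f_1 ⊂ ... ⊂ f_{d-1}
    cnt = {f: 1 for f in by_dim[0]}
    for i in range(1, d):
        cnt = {g: sum(c for f, c in cnt.items() if contained(f, g))
               for g in by_dim[i]}
    return sum(cnt.values())

for d in (2, 3, 4):
    print(d, flag_count(d), 2 ** d * factorial(d))
```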
Theorem 18. Let S be a convex polytope with nonempty interior in R^d. Then

lim_{δ→0} vol_d(S \ S_δ) / (δ (log(1/δ))^{d−1}) = fl_d(S) / (d! d^{d−1}).

5.4. L_p-affine surface area. The concept of affine surface area for convex bodies has been generalized to L_p-affine surface areas. Those are by now the cornerstones of the rapidly developing L_p-Brunn–Minkowski theory, initiated in the groundbreaking paper of Lutwak [107]. See also [156, Section 9.1] and, e.g., [137,73,109,124]. The next definition was given by Lutwak [107] for p > 1, and by Schütt and Werner [165] for all other p. See also Hug [82].
Definition. Let K be a convex body in R^d such that 0 is in the interior of K, and let −∞ ≤ p ≤ ∞, p ≠ −d. The L_p-affine surface area of K is

(20) as_p(K) = ∫_∂K κ(x)^{p/(d+p)} / ⟨x, N_K(x)⟩^{d(p−1)/(d+p)} dµ_∂K(x).

Here, N_K(x) is the outer unit normal at x ∈ ∂K, µ_∂K is the usual surface area measure on ∂K, and κ is the generalized Gauß-Kronecker curvature at x.
For p = 0 we get as_0(K) = d vol_d(K). For p = ±∞, the L_p-affine surface area is defined by the corresponding limit in (20) as p → ±∞, which, for K sufficiently smooth, gives as_±∞(K) = d vol_d(K°), where K° is the polar body (1) of K w.r.t. 0. For p = 1 we get the above mentioned affine surface area of K, as_1(K) = as(K). Note that in general the L_p-affine surface area is not an affine invariant anymore, only a linear invariant. There exist geometric identities, analogous to (17), also for the L_p-affine surface area. These use weighted floating bodies [179], Santaló bodies [124] and surface bodies [165]. We refer to those references for the details. Moreover, the corresponding L_p-affine isoperimetric inequalities hold true as well.
Theorem 19. Let K ∈ K d with the origin in its interior.

5.5. Floating measures.
Much effort has been devoted to extend the theory of convex bodies to a functional setting (e.g., [9,6,54]). Natural analogs of convex bodies in the realm of functions are log-concave functions, i.e. densities of log-concave measures. For such measures we present a notion of floating measure. Another approach will be shown in Section 6.
Let ψ : R^d → R be a convex function such that

(21) 0 < ∫_{R^d} e^{−ψ(x)} dx < ∞.

In the general case, when ψ is neither smooth nor strictly convex, the gradient of ψ, denoted by ∇ψ, exists almost everywhere by Rademacher's theorem [23, Theorem 2.5.1]. A theorem of Busemann and Feller [30] and Aleksandrov [2] guarantees the existence of the (generalized) Hessian, denoted by ∇²ψ, almost everywhere in R^d (for details see, e.g., [164, Section 1.6]). The Hessian is a quadratic form on R^d, and if ψ is a convex function, for almost every x ∈ R^d one has, when y → 0, that

ψ(x + y) = ψ(x) + ⟨∇ψ(x), y⟩ + ½ ⟨∇²ψ(x) y, y⟩ + o(‖y‖²).

Let µ be a log-concave measure on R^d, i.e. a measure with density e^{−ψ}, where ψ : R^d → R is a convex function. Note that we do not necessarily require that µ is a probability measure. Let

epi(ψ) = {(x, t) ∈ R^d × R : t ≥ ψ(x)}

be the epigraph of ψ. Then epi(ψ) is a closed convex set in R^{d+1}, and for sufficiently small δ we can define its floating set (epi(ψ))_δ, in analogy to the convex floating body. This was done in [94], where the definition of a floating set was introduced for convex, not necessarily bounded subsets of R^d. It is easy to see that there exists a unique convex function ψ_δ : R^d → R such that (epi(ψ))_δ = epi(ψ_δ). Consequently, Li et al. [94] define the floating function of a convex function ψ and the floating measure of the (not necessarily probability) measure µ as follows.
(ii) Let µ be a measure with density f(x) = e^{−ψ(x)}. The floating measure of µ is the measure with density f_δ, where f_δ = e^{−ψ_δ}. Note that when ψ is affine, ψ_δ = ψ and, for f = e^{−ψ}, f_δ = f. 5.6. Affine surface areas for log-concave measures. As far as we know, at present there are two approaches to a definition of affine surface area for log-concave measures. The first one is similar to the one discussed in Section 5.3 and uses the floating measure of Section 5.5 instead of the floating bodies K_δ. It was proposed in [94] and is inspired by the formula of Theorem 16. As in Section 5.5, we do not require that the log-concave measure µ with density e^{−ψ} is a probability measure.
Theorem 20. Let ψ : R^d → R be a convex function such that (21) holds true. Then

lim_{δ→0} δ^{−2/(d+2)} ( ∫_{R^d} e^{−ψ(x)} dx − ∫_{R^d} e^{−ψ_δ(x)} dx ) = c_d ∫_{R^d} (det ∇²ψ(x))^{1/(d+2)} e^{−ψ(x)} dx,

where c_d > 0 is a constant depending only on d. This theorem was proved in [94, Theorem 1]. Its comparison with the case of convex bodies (see Theorem 16) led Li et al. [94] to call the integral on the right hand side the affine surface area of the measure µ.
Definition. For a log-concave measure µ on R^d with density e^{−ψ} such that (21) holds true, the affine surface area of the measure µ is given by

(22) as(µ) = ∫_{R^d} (det ∇²ψ(x))^{1/(d+2)} e^{−ψ(x)} dx.

This definition is further justified by the fact that the expression shares many properties of the affine surface area for convex bodies. For instance, it is invariant under affine transformations with determinant 1. For the standard Gaußian measure P we have that as(P) = 1.
Another definition of affine surface area for log-concave measures was put forward in Caglar et al. [31]. There, an even more general approach was proposed, again for convex functions ψ such that (21) holds true. We let Ω_ψ be the set of points in R^d at which ∇²ψ exists and is invertible.
Definition. For a log-concave measure µ on R^d with density e^{−ψ} such that (21) holds true and λ ∈ R, the λ-affine surface area as_λ(µ) is defined by (23). For λ > 0, the domain Ω_ψ in (23) may be replaced by R^d.
Thus, for 2-homogeneous functions ψ, formula (23) simplifies, and definitions (22) and (23) agree for λ = 1/(d+2). To understand why it is justified to call the quantities (23) affine surface areas, we recall the definition of the L_p-affine surface areas (20) for convex bodies K. It was noted in Caglar et al. [31] that the definition of λ-affine surface area for a log-concave density agrees with the definition of L_p-affine surface area for convex bodies if the function ψ is built from the gauge function ‖·‖_K of a convex body K with 0 in its interior. The next theorem is from Caglar et al. [31, Theorem 3].
Theorem 21. Let K be a convex body in R^d that contains the origin in its interior. For any p ≥ 0, let λ = p/(d+p). Then the corresponding identity between the λ-affine surface area and the L_p-affine surface area of K holds. Moreover, if the set of points of ∂K where the generalized Gauß-Kronecker curvature is strictly positive has full measure in ∂K, then the same relation holds true for every p ≠ −d.
The L p -affine isoperimetric inequalities for convex bodies of Theorem 19 have analogs for the λ-affine surface areas for log-concave measures. We only mention the case λ ∈ [0, 1] and refer to Caglar et al. [31] for the other cases.
In particular, if ψ is in addition 2-homogeneous, then as(µ) satisfies the corresponding affine isoperimetric inequality. Equality holds in the inequalities if and only if there are a ∈ R and a positive definite matrix A such that ψ(x) = ⟨Ax, x⟩ + a for all x ∈ R^d.
A main ingredient in the proof of this proposition is a functional version of the Blaschke-Santaló inequality. We refer to [9,6,54] for the details.

5.7. Applications of affine surface area: approximation of convex bodies by polytopes.
Approximation by polytopes is a central topic in convex geometry with numerous applications, and there is a vast literature on the subject; a (very incomplete) list is [20, 68, 143, 159, 67, 80, 104]. We present only one aspect, approximation by polytopes with a fixed number of vertices, and refer to the literature for others.
5.7.1. Best and random approximation. Ideally, in approximation problems, one seeks a best approximating polytope in a given metric. One such result is given in the next theorem, where we consider all polytopes with at most N vertices that are contained in a convex body K. By compactness, there is a polytope P_N in this class with maximal volume. This means that the symmetric difference metric d_S(K, P_N) from (13) is minimal. Such a polytope is called best approximating with respect to the symmetric difference metric.
Theorem 23. Let K be a convex body in R^d with C^2-boundary ∂K and everywhere strictly positive Gauß-Kronecker curvature κ. For every N ∈ N let P_N be a best approximating polytope of K with at most N vertices. Then
(24) lim_{N→∞} N^{2/(d−1)} d_S(K, P_N) = (1/2) del_{d−1} ( ∫_{∂K} κ(x)^{1/(d+1)} dµ_{∂K}(x) )^{(d+1)/(d−1)},
where del_{d−1} is a constant depending only on the dimension d.
This theorem was proved by McClure and Vitale [120] in dimension 2 and by Gruber [68] for general dimension. It was shown by Mankiewicz and Schütt [113] that del_{d−1} is of the order of the dimension; more precisely, del_{d−1} grows linearly in d up to factors bounded by an absolute constant c. On the right-hand side of equation (24) we find the affine surface area of K from Section 5.3. It is natural that such a term should appear in approximation questions: intuitively, to get a good approximation in the d_S-metric, more vertices of the approximating polytope should be placed where the boundary of K is strongly curved, and fewer where the boundary is flat.
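For the unit disk in the plane (d = 2), the best approximating inscribed polytope with N vertices is the regular N-gon, and a Taylor expansion of its area deficit π − (N/2) sin(2π/N) shows that N² d_S(K, P_N) → 2π³/3, in line with the N^{−2/(d−1)} rate of Theorem 23. A minimal numerical check (the function name is ours):

```python
import math

def deficit(N):
    """Area deficit of the best inscribed N-gon (the regular one) in the unit disk."""
    return math.pi - 0.5 * N * math.sin(2.0 * math.pi / N)

limit = 2.0 * math.pi ** 3 / 3.0   # predicted limit of N^2 * deficit(N)
for N in (10, 100, 1000):
    print(N, N * N * deficit(N), limit)
```

Already at N = 100 the scaled deficit agrees with the limit 2π³/3 ≈ 20.67 to three decimal places.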
However, it is only in rare cases that a best approximating polytope can be determined explicitly. Consequently, a common practice is to randomize: choose N points at random in K with respect to a probability measure P on K. The convex hull of these randomly chosen points is a random polytope. The expected volume of a random polytope of N points is
E(K, N) = ∫_K ⋯ ∫_K vol_d([x_1, . . . , x_N]) dP(x_1) ⋯ dP(x_N),
where [x_1, . . . , x_N] is the convex hull of the points x_1, . . . , x_N. Thus the expression vol_d(K) − E(K, N) measures how close a random polytope and the convex body are in the symmetric difference metric.
We now compare best approximation with random approximation. The analog to Theorem 23 in the random case is the following theorem. There, the probability measure is the normalized Lebesgue measure on K.
Theorem 24. Let K be a convex body in R^d. Then
lim_{N→∞} (vol_d(K) − E(K, N)) N^{2/(d+1)} = c(d) vol_d(K)^{2/(d+1)} ∫_{∂K} κ(x)^{1/(d+1)} dµ_{∂K}(x),
where c(d) is a constant that depends only on d.
This theorem was proved by Rényi and Sulanke [144,145] in dimension 2. Wieacker [184] settled the case of the Euclidean ball in dimension d. Bárány [11] proved the result for convex bodies with C 3 -boundary and everywhere positive Gauß-Kronecker curvature. Finally, the general result for arbitrary convex bodies was proved by Schütt [159] and Böröczky et al. [22].
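A small Monte Carlo sketch can illustrate the N^{−2/(d+1)} rate of Theorem 24 for the unit disk (d = 2, so multiplying N by 8 should shrink the expected deficit by a factor of about 8^{2/3} = 4). All names are ours; the hull is computed by Andrew's monotone chain.

```python
import math, random

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def half(seq):
        h = []
        for p in seq:
            while len(h) >= 2 and ((h[-1][0] - h[-2][0]) * (p[1] - h[-2][1])
                                   - (h[-1][1] - h[-2][1]) * (p[0] - h[-2][0])) <= 0:
                h.pop()
            h.append(p)
        return h[:-1]
    return half(pts) + half(pts[::-1])

def polygon_area(v):
    # Shoelace formula.
    return 0.5 * abs(sum(v[i][0] * v[(i + 1) % len(v)][1]
                         - v[(i + 1) % len(v)][0] * v[i][1] for i in range(len(v))))

def mean_deficit(n_pts, trials, rng):
    """Average of  vol(K) - vol([x_1, ..., x_N])  for uniform points in the unit disk."""
    total = 0.0
    for _ in range(trials):
        pts = []
        while len(pts) < n_pts:          # rejection sampling in the unit disk
            x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
            if x * x + y * y <= 1.0:
                pts.append((x, y))
        total += math.pi - polygon_area(convex_hull(pts))
    return total / trials

rng = random.Random(0)
d1 = mean_deficit(100, 200, rng)
d2 = mean_deficit(800, 200, rng)
print(d1, d2, d1 / d2)   # the ratio should be near 8^(2/3) = 4
```

The observed ratio is only asymptotically 4; at these sample sizes lower-order corrections are still visible.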
Notice that Theorem 24 does not achieve the optimal dependence on N established for best approximation. One reason is that not all points chosen at random from K appear as vertices of the random polytope. Thus we now choose the points randomly from the boundary of K, according to a measure with a density f with respect to µ_∂K. We denote by E(K, f, N) the expected volume of the corresponding random polytope. Which density is optimal? It turns out that it is, up to normalization, the (d+1)-st root of the generalized Gauß-Kronecker curvature; the integral of this function is the affine surface area. The next theorem was shown by Schütt and Werner [164, Theorem 1.1], see also Reitzner [142].
Theorem 25. Let K be a convex body in R^d, let N_K(x) denote an outer unit normal of K at x, and let f : ∂K → (0, ∞) be continuous with ∫_{∂K} f dµ_{∂K} = 1. Then the limit relation (26) holds. The minimum on the right-hand side of (26) is attained for the normalized affine surface area measure with density proportional to κ^{1/(d+1)}.
Best approximation of Theorem 23 differs from random approximation of Theorem 25 only in the dimensional constants del_{d−1} and c(d). Comparing those, using also (25), an amazing fact follows: with this optimal density f, random approximation is, up to an absolute constant c, as good as best approximation.
Theorem 26. Let K be a convex body in R^d. Then there is N_0 ∈ N such that for all N ≥ N_0 the stated two-sided estimate holds, where c_1 and c_2 are constants that depend on d only.
Even more can be said about the connection between floating bodies and random polytopes. There is an algorithm, the floating body algorithm, where, for a given convex body K in R d , one uses floating bodies to construct a polytope P N with as few vertices N as possible such that for a suitable δ, K δ ⊆ P N ⊆ K and such that P N approximates the convex body K very well in the symmetric difference metric. It should be noted that we make no assumption on K.
We describe this algorithm. We choose the vertices x_1, . . . , x_N ∈ ∂K of the polytope P_N successively. The vertex x_1 is chosen arbitrarily. Having chosen x_1, . . . , x_{k−1}, we choose x_k according to the selection rule, where N_K(x_k) denotes a (not necessarily unique) outer normal to ∂K at x_k, Int(C) is the interior of a set C ⊂ R^d, and ∆_k is determined by the corresponding cap condition. The next theorem can be found in Schütt [160].
Theorem 27. Let K be a convex body in R^d. Then, for suitable δ and with a universal constant c in the quantitative bound, there exists a polytope P_N that has at most N vertices and satisfies K_δ ⊆ P_N ⊆ K.
How well does this polytope approximate K? It follows from Theorem 27 that the lim sup of the suitably scaled symmetric difference distance obeys a bound analogous to (24). Since del_{d−1} is of the order of d, both expressions differ only by a factor of the order of the dimension d.

6. Floating bodies of measures
The definition of the (convex) floating body of a convex body K ∈ K^d discussed in Section 5 extends naturally also to general probability measures, in a manner different from that of Section 5.5. It is closely related to the halfspace depth. Analogously to the approach of Dupin [47], let P ∈ P(R^d) and δ > 0. We say that the nonempty convex set P_[δ] ∈ K^d is the floating body of P if each of its supporting halfspaces cuts off exactly probability δ from P. For P distributed uniformly on a convex body K ∈ K^d of unit volume, this recovers Dupin's floating body of K. Therefore, the floating body P_[δ] does not exist for δ > 1/2, and it may happen that it does not exist for any δ > 0; see the example of (the uniform distribution on) a triangle from Example 5. Unlike in the situation with the floating body of K ∈ K^d, even if the floating body P_[δ] of P ∈ P(R^d) exists, it may not be uniquely defined. Take, for instance, a distribution P on R whose support is not contiguous, such as that displayed in Figure 7. For δ = 1/4, with q_2 the (1 − δ)-quantile of P, each interval [q_1, q_2] with q_1 ∈ [0, 1] is a floating body of P. Note that if P has contiguous support and P_[δ] exists, then it is unique.
To avoid these problems, let us consider, as in the case of convex bodies, the convex floating body of P , given by an intersection of halfspaces.
Definition. Let P ∈ P R d . For δ ≥ 0, the convex floating body of P with index δ is defined as the intersection of all closed halfspaces whose defining hyperplanes cut off a set of probability content at most δ from P , i.e.
where H ∈ H and H + and H − are its associated closed halfspaces.
Note that, with the convention that the intersection of an empty collection of subsets of R^d is R^d, the convex floating body of a measure is always a well-defined, unique, convex subset of R^d. It can happen that P^FB_δ = ∅, especially for larger values of δ. It is easy to see that for P ∈ P(R^d) distributed uniformly on K ∈ K^d with vol_d(K) = 1 we have P^FB_δ = K_δ for any δ ≥ 0, so that the convex floating bodies of measures generalize the convex floating bodies discussed throughout Section 5.
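In dimension d = 1 the halfspaces are halflines, and the convex floating body (27) reduces to the interval between the δ- and (1 − δ)-quantiles of P. A sketch for an empirical measure (function names are ours); note that the result is empty (None) once δ is too large:

```python
def convex_floating_body_1d(sample, delta):
    """Convex floating body of the empirical measure on `sample` for index delta:
    the intersection of all closed halflines whose complementary halfline carries
    empirical mass at most delta."""
    s = sorted(sample)
    n = len(s)
    k = int(n * delta)          # halflines cutting off mass <= delta are discarded
    lo, hi = s[k], s[n - 1 - k]
    return (lo, hi) if lo <= hi else None   # None: the floating body is empty

sample = [i / 100 for i in range(101)]        # a grid stand-in for U[0, 1]
print(convex_floating_body_1d(sample, 0.25))  # (0.25, 0.75), the quantile interval
print(convex_floating_body_1d(sample, 0.60))  # None: empty for large delta
```

The empty outcome for δ = 0.6 illustrates the remark above that P^FB_δ = ∅ can occur for larger values of δ.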
(Convex) floating bodies for general measures have already been considered in the literature, mainly due to the association of convex bodies and log-concave measures established by Ball [10]. The previous definitions were considered by Werner [179], Bobkov [18], Fresen [57,58], and Brunel [28], among others. In connection with the halfspace depth, the floating bodies (27) were considered in Nolan [134], and Massé and Theodorescu [118]. In the latter paper, those regions are called the δ-trimmed regions of P .
The convex floating body of a measure P is very closely related to the depth central region P_δ, defined in (7) as the upper level set of the depth hD(·; P). Indeed, recall the characterization of Rousseeuw and Ruts [151, Proposition 6], who showed that for any P ∈ P(R^d) and δ > 0 the region P_δ is an intersection of closed halfspaces. On the other hand, it is not difficult to see that the convex floating body (27) can be written in a completely analogous form. It follows that for all δ ≥ 0 we have P^FB_δ ⊆ P_δ, and under the assumption of the contiguity of the support of P, P^FB_δ = P_δ. For general measures P it may happen that the convex floating body is a proper subset of the depth central region, see Figure 7.
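For empirical measures in the plane, the depth hD(x; P_n), and hence the central regions P_δ, can be approximated by minimizing the mass of closed halfplanes through x over a grid of normal directions. A finite grid only yields an upper bound on the infimum in (3), although it is exact in the toy configuration below; the implementation is our own sketch.

```python
import math

def halfspace_depth_2d(x, pts, n_dir=360):
    """Approximate hD(x; P_n) for the empirical measure on pts in R^2: minimise
    the fraction of points in closed halfplanes through x over a direction grid."""
    n = len(pts)
    best = 1.0
    for k in range(n_dir):
        th = 2.0 * math.pi * (k + 0.5) / n_dir   # offset avoids axis-aligned ties
        u = (math.cos(th), math.sin(th))
        mass = sum(1 for p in pts
                   if (p[0] - x[0]) * u[0] + (p[1] - x[1]) * u[1] >= -1e-12) / n
        best = min(best, mass)
    return best

pts = [(1, 0), (-1, 0), (0, 1), (0, -1)]
print(halfspace_depth_2d((0, 0), pts))   # 0.5: every closed halfplane through 0 keeps 2 of 4 points
print(halfspace_depth_2d((2, 0), pts))   # 0.0: some halfplane through (2, 0) misses all points
```

Thresholding this function at level δ traces out the empirical central region P_δ.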
It is interesting to investigate which results for convex bodies described in Section 5 carry over to measures.
Let us first relate the floating body P_[δ] to the convex floating body P^FB_δ and the central region P_δ. If a unique floating body P_[δ] of a measure P ∈ P(R^d) exists, then the corresponding convex floating body P^FB_δ must be equal to P_[δ]. For the sake of completeness, let us provide an elementary proof of this result.
Figure 7. For distributions whose support is not contiguous, the convex floating body (27) and the halfspace depth central region (7) may differ. In this example, the density of P ∈ P(R^1), supported on the disjoint intervals [−2, 0] and [1, 5], is displayed (orange line), along with its halfspace depth function (dashed brown line). For δ = 1/4, the left endpoint of the interval P_δ is 0. But the complement of the halfline [1, ∞) (black arrow) has probability 1/4, and the left endpoint of the convex 1/4-floating body of P is 1. Points in the interval (0, 1) are not boundary points of any convex floating body of P.
Proof. Recall that if P_[δ] exists, then it is unique, as P has contiguous support. For P with contiguous support, the proof of P^FB_δ = P_δ can be found in the references above, as for P contiguous the boundary hyperplane of any closed halfspace with probability δ must support P_[δ].
Now we explore whether analogues of Propositions 12-15 stated for convex bodies in Section 5 hold true also for measures.
6.0.1. Proposition 12 for measures. A result analogous to part (i) of Proposition 12 would require that the infimum in the definition of the halfspace depth (3) can be replaced by a minimum, i.e. that a minimal halfspace of hD exists at each x ∈ R^d for any P ∈ P(R^d). For measures that do not satisfy (4) this is not true, as noted already by Rousseeuw and Ruts [151, Remark 1], where the following example is given.
Example 6. Let P ∈ P(R^2) be a mixture of the standard bivariate Gaußian distribution and the Dirac measure at the point (1, 1), with equal mixing proportions. Then, at x = (0, 1), we have hD(x; P) = Φ(−1)/2, where Φ is the distribution function of the standard univariate Gaußian distribution, see also Example 2. Yet, no minimal halfspace at x exists.
For a different example of the same phenomenon, see Massé [116,Section 2]. For distributions that satisfy (4), a minimal halfspace always exists for all x ∈ R d . That was shown, e.g., by Massé [116,Proposition 4.5 (i)].
An extension of Dupin's theorem (part (ii) of Proposition 12) to probability distributions was stated in Hassairi and Regaieg [76, Theorem 3.1]. Here we provide a version of that result with a slightly modified set of assumptions. The proof of the proposition follows very closely the original proof of [76, Theorem 3.1], and is omitted.
Proposition 29. Let X ∼ P ∈ P(R^d) be absolutely continuous with contiguous support Supp(P), and let x ∈ R^d be such that hD(x; P) > 0. Denote by f_u the density of the random variable ⟨X − x, u⟩ for u ∈ S^{d−1}. Suppose that f_u(y) is continuous as a function of u ∈ S^{d−1} and of y in a neighborhood of 0 ∈ R. Let H^- ∈ H^- be a minimal halfspace at x, i.e. x ∈ H and P(H^-) = hD(x; P). Then x is the conditional expectation of P given H, where the integrals in the defining formula are taken with respect to the (d−1)-dimensional Lebesgue measure on H.
One has to be careful with the statement of Proposition 29. Without the required continuity properties of the marginal densities f_u, the conditional expectation of P given a hyperplane H may not even be well defined. To illustrate this point, we give an example that was brought to our attention by M. Tancer [174].
Figure 8. Consider x = (ε, 0) for −1/2 ≤ ε ≤ 1/2. A simple computation shows that hD(x; P) = 1/5, and the unique minimal halfspace at all such points x is the halfspace H^+ that cuts off the smaller square from P. A direct analogue of Dupin's theorem would now fail: the conditional expectation of P given H = ∂H^+ is not unique, as any x on the line segment L that joins (−1/2, 0) and (1/2, 0) would be a candidate for the barycenter of P given H. The problem here, of course, is due to the discontinuity of the marginal density of P at H. For this particular H, the conditional expectation of P given H is not properly defined.
To see that the strict convexity of the central regions (part (iii) of Proposition 12) does not hold true for all measures, it is enough to return to Example 7. Indeed, due to the considerations made there, the line segment L lies on the boundary of the central region P_δ = P^FB_δ for δ = 1/5, and P_{1/5} is not strictly convex, see also the right panel of Figure 8. For another example where the strict convexity of P_δ is violated, recall the collection of α-symmetric distributions from Example 2 for α ≤ 1, and the right panel of Figure 4. In Example 7, the problem appears to stem from the discontinuity of the density of P at the boundary of a minimal halfspace. For α-symmetric distributions the problem is that the expectation of P is not defined. An extension of part (iv) of Proposition 12 was given by Mizera and Volauf [126, Proposition 7], who stated that if (4) is true for P with contiguous support, then the halfspace median of P is a unique point.
6.0.2. Proposition 13 for measures.
The (Dupin) floating body P_[δ] of a general measure P may not exist. Finding sufficient conditions for the existence of floating bodies of probability measures appears to be a challenging problem of great importance in mathematical statistics and the theory of data depth (see, e.g., [28, Open question 1], or [116, 117]). Many theoretical results on the behavior of the depth and its central regions hold true only under the assumption of the existence of the floating bodies of P, see also the discussion in Section 8 below. Brunel [28, Open question 2] asks a question that can be rephrased as follows: Is it true that for any log-concave measure P ∈ P(R^d) all floating bodies P_[δ] exist for δ > 0 small enough?
From the example of the uniform distribution on a triangle (Example 5), we see that the answer to the above question is negative. However, under the additional assumption of central symmetry of P, similar properties have been investigated by Meyer and Reisner [122] for convex bodies (see Proposition 13 above), and extended to certain probability measures by Bobkov [18, Section 6]. In the latter paper, it is shown that all floating bodies P_[δ] exist for centrally symmetric s-concave measures with s ≥ −1. As far as we are aware, the following theorem from [18, Theorem 6.1] is, to date, the most general result on the existence of floating bodies of measures P ∈ P(R^d).
Theorem 30. Let P ∈ P(R^d) be a centrally symmetric s-concave measure with s ≥ −1 such that Supp(P) is a d-dimensional subset of R^d. Then P_[δ] exists for all δ ∈ (0, 1/2].
As remarked by Bobkov [18], it is not known whether the restriction s ≥ −1 can be dropped.
Proposition 31. Let P ∈ P(R^d) satisfy (4), let δ ∈ (0, 1/2), and let P_δ be a convex body whose boundary is C^1. Then P_δ is a floating body of P.
Proof. Let x ∈ ∂P δ . Since, under (4), the depth hD (·; P ) is continuous on R d ([126, Proposition 1], or Section 3.2.5 above), hD(x; P ) = δ. Using [116, Proposition 4.5 (i)] there exists a minimal halfspace H − ∈ H − at x, and H + then must support P δ at x by Proposition 6. Because of the smoothness of the boundary of P δ , there is only a single supporting halfspace of P δ at each x ∈ ∂P δ . Thus, we have shown that for any supporting halfspace H + of P δ , P (H − ) = δ, and P δ is a floating body of P .
Smoothness of the boundaries of P_δ was recognized to be crucial in establishing theoretical properties of hD already by Nolan [134], and Massé and Theodorescu [118]. Many theoretical results stated for the halfspace depth in statistics rely on that condition. For instance, as shown by Massé [116, Theorem 2.1], the asymptotic distribution of the sample halfspace depth at x is Gaußian if there is a unique minimal halfspace at x on the boundary of P_δ. For another application of the smoothness of the boundaries of floating bodies see Section 8 below.
Despite being of critical importance, so far the only examples of distributions with smooth contours of hD are the (full-dimensional affine images of) α-symmetric distributions with α > 1, see Example 2. As discussed in Gijbels and Nagy [63], apart from those distributions, no other multivariate measure with smooth depth contours is known in statistics. In that paper, it is also shown that even simple distributions, such as mixtures of multivariate Gaußian distributions, and distributions with smooth centrally symmetric, or smooth strictly quasi-concave densities, may have points at which the boundary of P_δ is not smooth. It is therefore remarkable that Meyer and Reisner [122, Theorem 3] (part (ii) of Proposition 13 above) showed that for certain (centrally) symmetric convex bodies, the boundaries of K_δ exhibit a high degree of smoothness. We are not aware of any result in statistics giving sufficient conditions for higher-order differentiability of the boundary of the depth central regions, or of the convex floating bodies of measures.
6.1. Application: multivariate extremes and depth. The intimate connections of floating bodies with the approximation problems described in Section 5.7 have analogues for probability measures. If X_1, . . . , X_n is a random sample from a distribution P ∈ P(R^d), one can ask how fast the random polytope given by the convex hull of these random points grows to the convex hull of the support of P. In conjunction with the advances for uniform measures on convex bodies outlined in Section 5.7, it is not surprising that the halfspace depth and the floating bodies of measures play a prominent role in these problems.
The following theorem, called the multivariate Gnedenko law of large numbers, can be found in Fresen [58,Theorem 2].
Theorem 32. Let q > 0 and p > 1, and let P ∈ P(R^d) be a probability measure with a density of the form f(x) = c e^{−g(x)^p}, where g : R^d → [0, ∞) is a convex function and c > 0. Then there exist constants c_1, c_2 > 0 such that for a random sample X_1, . . . , X_n of any size n ≥ d + 2 from P it holds true that
(29) P( d_H([X_1, . . . , X_n], P_{1/n}) ≤ c_1 log log n … ),
where [X_1, . . . , X_n] is the closed convex hull of the points X_1, . . . , X_n, and P_{1/n} is the depth central region P_δ with δ = 1/n.
Distributions P from Theorem 32 are sometimes called p-log-concave measures. For usual log-concave measures, an inequality only slightly weaker than (29) is given in Fresen [58,Theorem 1].
Theorem 32 asserts that, with large probability, convex hulls of large random samples from P behave like the halfspace depth central regions P_δ for very small values of δ. This observation opens a whole new field of applications of the depth in multivariate extreme value theory. Indeed, until now, data depth has been used in statistics predominantly as a robust tool that identifies the central parts of the probability mass of a distribution, and little attention has been paid to its behavior near the tails. Theorem 32 gives a probabilistic interpretation also to the boundaries of those depth regions that correspond to extreme depth-quantiles. It is also interesting to compare Theorem 32 with the recent advances of Einmahl et al. [51] and He and Einmahl [78], where the authors employ extreme value theory in order to estimate P_δ reliably from the data for low values of δ. It will be interesting to see what can be obtained by a proper combination of the estimation techniques from the latter papers and the asymptotic representations of Fresen [58].
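A one-dimensional sketch of the phenomenon behind Theorem 32: for the standard Gaußian measure, hD(x; P) = Φ(−|x|), so P_{1/n} is the interval between the (1/n)- and (1 − 1/n)-quantiles, while the convex hull of the sample is simply [min, max]. The two sets are close already for moderate n (names and sample size are ours):

```python
import random, statistics

rng = random.Random(42)
nd = statistics.NormalDist()

n = 100_000
sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
hull = (min(sample), max(sample))        # the convex hull of the sample in R^1

# For the standard Gaussian, hD(x; P) = Phi(-|x|), so the depth region P_{1/n}
# is the interval between the (1/n)- and (1 - 1/n)-quantiles.
q = nd.inv_cdf(1.0 - 1.0 / n)
region = (-q, q)

# Hausdorff distance between the two intervals.
d_H = max(abs(hull[0] - region[0]), abs(hull[1] - region[1]))
print(hull, region, d_H)   # d_H is small compared with the scale q ≈ 4.26
```

The deviation d_H fluctuates on the Gumbel scale of the sample extremes, which is of smaller order than the width of P_{1/n} itself.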
In a further analogue with the exposition from Section 5, one may study the limit behavior of the quantity
(30) P(P_0) − P(P_δ) = 1 − P(P_δ)
as δ → 0 from the right. More specifically, assume that the difference (30) is scaled properly, so that the resulting limit is a finite, non-negative number Ω(P). The characteristic Ω(P), together with the sequence of its scaling constants, is then an affine invariant on P(R^d). While Ω(P) is not a generalized notion of the affine surface area such as the functional from Section 5.6, it is interesting in its own right. From the viewpoint of statistics, Ω(P) may serve as an index of heavy-tailedness of the distribution P, where not only the size of the tails is evaluated, but also "the complexity of the boundary" of Supp(P) is taken into account.

7. Mahalanobis ellipsoids and the halfspace depth
Let X ∼ P ∈ P(R^d) be distributed uniformly on K ∈ K^d. The body K is said to be isotropic, or in the isotropic position, if vol_d(K) = 1, E X = 0, and Var X = L_K^2 I_d, where L_K > 0 is a constant and I_d is the d × d identity matrix. Geometrically, this means that the barycenter of K is at the origin and that the ellipsoid of inertia of K, or equivalently, all Mahalanobis ellipsoids of P from (6), are Euclidean balls. The constant L_K is called the isotropic constant of K.
The isotropic constant plays an important role in the analysis of convex bodies; we refer, e.g., to Milman and Pajor [125] and the book of Brazitikos et al. [27, Chapter 3]. The conjecture that for all K ∈ K^d the constant L_K is bounded from above by an absolute constant independent of the dimension d is one of the major open problems in geometric analysis. The best known upper estimate so far, due to Klartag [85], is that L_K ≤ c d^{1/4} for an absolute constant c, improving an earlier estimate of Bourgain [25] by a logarithmic factor. The conjecture is equivalent to the hyperplane conjecture, first formulated by Bourgain, which asks whether every centered convex body of volume 1 has a hyperplane section through the origin whose (d−1)-dimensional volume is greater than an absolute positive constant, independent of the dimension d. We refer to, e.g., [27, Section 3.1] for the details.
For any K there exists an affine transformation T such that T (K) is isotropic, and the isotropic position is uniquely determined up to orthogonal transformations. The isotropic constant of a general body K ∈ K d is then defined as the isotropic constant of the corresponding isotropic body T (K).
Similarly, we can define the isotropic constant for probability measures P with log-concave density f. A measure that corresponds to X ∼ P is isotropic if it is centered, i.e. E X = 0, and if E⟨X, u⟩^2 = 1 for all u ∈ S^{d−1}, or, equivalently, Var X = I_d. The isotropic constant L_P of P is then defined in terms of the density f. The isotropic constant of a general probability measure P with log-concave density is, again, given as the isotropic constant of an affine image of P that is isotropic [27, Chapter 2]. Note that a convex body K ∈ K^d of volume 1 is isotropic if and only if the density of the uniform distribution on the convex body K/L_K is an isotropic log-concave density.
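As a sanity check of these definitions, the uniform measure on the centred cube of volume 1 is isotropic up to rescaling, with L_K = (1/12)^{1/2} ≈ 0.2887, since each coordinate is uniform on [−1/2, 1/2] and Var X = L_K^2 I_d. A Monte Carlo sketch (all names are ours):

```python
import random

rng = random.Random(7)
d, n = 3, 200_000

# Uniform sample from the centred unit-volume cube [-1/2, 1/2]^3; the uniform
# measure on this cube satisfies Var X = (1/12) I_d, so L_K = sqrt(1/12).
sample = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(n)]

variances = []
for j in range(d):
    col = [p[j] for p in sample]
    m = sum(col) / n
    variances.append(sum((v - m) ** 2 for v in col) / (n - 1))

L_K = (sum(variances) / d) ** 0.5   # Var X = L_K^2 I_d
print(variances, L_K, (1.0 / 12.0) ** 0.5)
```

The three per-coordinate variances agree with 1/12, and the off-diagonal covariances (not shown) vanish by symmetry.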
For bodies and log-concave measures in isotropic position, many important geometrical results are known. In this section we state one that relates to the subjects of data depth and floating bodies.
Proposition 33. The following holds true: (i) For any isotropic convex body K ∈ K^d and any δ ∈ (0, 1/e) the corresponding sandwiching between Euclidean balls holds; (ii) an analogous statement holds for any isotropic measure P ∈ P(R^d) with a log-concave density. This proposition was proved by Milman and Pajor [125, Proposition in the Appendix], and re-stated by Fresen [57], who also gave the formulation in part (ii) for isotropic log-concave measures. Its further extension to centrally symmetric s-concave measures with s > −∞ can be found in Bobkov [18, Theorem 5.1]. Part (i) of this proposition is a special case of more general relations between floating bodies and p-centroid bodies, which can be found in [137, Theorem 2.2].
Proposition 33 has important implications for the theory of the halfspace depth. By the affine equivariance of the halfspace depth central regions P_δ, the statement transfers to any log-concave measure P ∈ P(R^d) with expectation µ ∈ R^d and a positive definite variance matrix Σ ∈ R^{d×d}, where d_Σ is the Mahalanobis distance from (5). Therefore, all central regions of the halfspace depth for δ < 1/e of log-concave measures are, up to a constant that depends only on δ and L_P, isomorphic to the Mahalanobis ellipsoids given by the covariance structure of P. This corroborates the findings from statistics, where it has long been observed that the depth central regions P_δ tend to take more "ellipsoidal" shapes than the level sets of the densities, see also Figures 3 and 4 above. The results in this section provide quantitative statements that support those claims.
Problem 11. Is it possible to state an analogue of Proposition 33 also for more general probability measures?
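The extreme case of this isomorphism is the Gaußian measure itself, for which the depth is an exact function of the Mahalanobis distance: every projection ⟨X, u⟩ is univariate Gaußian, so hD(x; P) = Φ(−d_Σ(x, µ)) and the central regions are exactly the Mahalanobis ellipsoids. A numerical sketch in d = 2 (function names are ours):

```python
import math, statistics

Phi = statistics.NormalDist().cdf

mu = (1.0, -1.0)
Sigma = ((2.0, 0.6), (0.6, 1.0))   # a positive definite covariance matrix

def depth_gaussian(x, n_dir=3600):
    """hD(x; N(mu, Sigma)) = min over unit directions u of P(<X - x, u> >= 0);
    each projection <X, u> is univariate Gaussian, so the infimum can be
    scanned over a fine grid of directions."""
    best = 1.0
    for k in range(n_dir):
        th = 2.0 * math.pi * k / n_dir
        u = (math.cos(th), math.sin(th))
        m = (mu[0] - x[0]) * u[0] + (mu[1] - x[1]) * u[1]
        s2 = (u[0] * (Sigma[0][0] * u[0] + Sigma[0][1] * u[1])
              + u[1] * (Sigma[1][0] * u[0] + Sigma[1][1] * u[1]))
        best = min(best, Phi(m / math.sqrt(s2)))
    return best

# Closed form: hD(x) = Phi(-d_Sigma(x, mu)) with the Mahalanobis distance d_Sigma.
x = (2.5, 0.0)
v = (x[0] - mu[0], x[1] - mu[1])
det = Sigma[0][0] * Sigma[1][1] - Sigma[0][1] * Sigma[1][0]
quad = (Sigma[1][1] * v[0] ** 2 - 2.0 * Sigma[0][1] * v[0] * v[1]
        + Sigma[0][0] * v[1] ** 2) / det
print(depth_gaussian(x), Phi(-math.sqrt(quad)))   # the two values agree
```

The direction scan and the closed-form expression match to four decimal places; at x = µ both give depth 1/2.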

8. Characterization of distributions
One of the most important open questions connected with the halfspace depth is the halfspace depth characterization problem. It has been conjectured (e.g. [42, p. 2306] and [87, p. 1598]) that to each distribution P ∈ P(R^d) there corresponds a unique depth surface {hD(x; P) : x ∈ R^d}, i.e., that all probability distributions are determined by their halfspace depth. Such a result would be invaluable in statistics, as it would assert that the halfspace depth, just like the distribution function or the characteristic function of a random vector, could be used as a complete representative of any probability distribution.
Recently, the depth characterization conjecture was disproved in [129], where an example of two different probability distributions with the same depth at all points in R d , d ≥ 2, was given. The example employs collections of different α-symmetric distributions with α ≤ 1 whose projections coincide in some directions.
Even though the general characterization conjecture turned out to be false, important partial positive results on the characterization problem can be found in the literature. Thanks to the results of Struyf and Rousseeuw [173], Koshevoy [89], and Hassairi and Regaieg [75] we know that if P, Q ∈ P(R^d) are distributions whose supports are finite subsets of R^d, then hD(x; P) = hD(x; Q) for all x ∈ R^d implies P = Q. For non-atomic distributions, two results can be found in the papers of Hassairi and Regaieg [76], and Kong and Zuo [88]. In this section we show that the latter two theorems are special cases of the following theorem.
Theorem 34. Let P ∈ P(R^d) have contiguous support, and let x_P ∈ R^d be the halfspace median of P. Then the following are equivalent:
(FB 1) For each δ ∈ (0, 1/2) the floating body P_[δ] of P exists.
(FB 2) (4) holds true, and
(31) P(H^-) = sup_{x ∈ H} hD(x; P) for every H ∈ H with P(H^-) ≤ 1/2.
Consequently, if (FB 1) is true, then P is characterized by its halfspace depth, i.e. there is no other probability distribution with the same depth at all points of R^d.
Proof. Assume first that (FB 1) is true. We show first that (4) holds. Suppose it does not; then there exists a hyperplane H such that P(H) > 0. Without loss of generality we can assume that P(H^-) ≤ P(H^+). We put δ = P(H^-) − (3/4) P(H). Then 0 < δ < 1/2. We claim that the floating body P_[δ] does not exist: if it did, then there would be a supporting hyperplane H_1 of P_[δ] parallel to H such that P(H_1^-) = δ. Note that P(H^-) = δ + (3/4) P(H) > δ, so it must be that H_1^- ⊊ H^-. But, in that case, P(H_1^-) ≤ P(H^-) − P(H) < δ, a contradiction.
Take now an arbitrary hyperplane H ∈ H, and define ψ(H^-) = sup_{x ∈ H} hD(x; P). For any x ∈ H we have
(32) hD(x; P) = inf{ P(G^-) : G^- ∈ H^-, x ∈ G^- } ≤ P(H^-),
since the halfspace H^- belongs to the collection over which the infimum is taken. Because (32) is valid for any x ∈ H, ψ(H^-) ≤ P(H^-).
To prove the reverse inequality, assume that δ = P(H^-) > 0; otherwise, trivially ψ(H^-) ≥ P(H^-) = 0. Further, we may assume that δ ≤ 1/2. If this is not the case, take the closed halfspace H^+ complementary to H^- and proceed with H^+ (note that in the latter case we know by (4) that P(H^+) ≤ 1/2 and P(H^+) + P(H^-) = 1). We first treat the case δ < 1/2. Because all floating bodies of P are assumed to exist and because P(H^-) = δ, the hyperplane H supports the floating body P_[δ] of P. That is, there must exist a point x_H ∈ H ∩ P_[δ]. As x_H ∈ P_[δ] = P_δ = {y ∈ R^d : hD(y; P) ≥ δ}, we get ψ(H^-) ≥ hD(x_H; P) ≥ δ = P(H^-). Thus (31) holds for δ < 1/2; by continuity, it also holds for δ = 1/2. Hence (FB 1) implies that the probabilities of halfspaces are characterized by the depth, as in (31).
For the opposite implication, assume that (FB 2) is true and let δ ∈ (0, 1/2). Consider the depth level set P_δ; this is a convex compact set. From (31) applied to H^- with P(H^-) = 1/2, and from the continuity of the depth hD(·; P) guaranteed by (4), we see that P_δ is non-empty for all δ ∈ (0, 1/2). Take any H ∈ H such that P(H^-) = δ, and consider the family G ⊂ H of all hyperplanes parallel to H. Then P_δ must be supported by some G ∈ G with G^- ⊆ H^- or G^- ⊇ H^-. If P(G^-) = δ' > δ, then (FB 2) cannot be true, by the nestedness and convexity of the central regions and the continuity of hD. Indeed, because G supports P_δ, for all x ∈ G either x ∈ ∂P_δ or x ∉ P_δ. In both cases hD(x; P) ≤ δ, since, using the continuity of hD again, hD(x; P) = δ for any x ∈ ∂P_δ. By (31) this means that δ' = P(G^-) = sup_{x ∈ G} hD(x; P) ≤ δ, a contradiction. If δ' ≤ δ, then there must exist x_0 ∈ G ∩ P_δ. But then δ ≤ hD(x_0; P) ≤ P(G^-) = δ' ≤ δ, and necessarily P(G^-) = δ' = δ. Because P has contiguous support, this means that G = H, and P_δ is supported by H. As this is true for any H ∈ H with P(H^-) = δ, P_δ = P_[δ], and (FB 2) =⇒ (FB 1).
The characterization of P follows from (FB 2 ) by a theorem of Cramér and Wold [40], see also [16, p. 383].
Note that a further minor extension of Theorem 34 can be obtained if P is allowed to have a single atom at its halfspace median x P , with obvious modifications to the statement and the proof of this theorem.
By Theorem 34 and Proposition 31 we obtain that all α-symmetric distributions with α > 1, and their full-dimensional affine images, satisfy (FB 1 ). To see that there exist distributions P ∈ P (R d ) that satisfy (FB 1 ) but not the assumptions of Proposition 31, take P ∈ P (R 2 ) to be the uniform distribution on a square in R 2 from Example 5. For P it is known [92, pp. 433–434] that (FB 1 ) is true, yet each floating body P [δ] for δ ∈ (0, 1/2) contains four non-smooth points on its boundary; see also the left panel of Figure 3.
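For the uniform distribution on the square, the halfspace depth can also be approximated directly. The sketch below is our own (sample size, direction grid, and names are our choices): it recovers depth 1/2 at the center of the unit square, and a small depth near a corner, where a diagonal halfplane cuts off only a triangle of area 0.02.

```python
import math
import random

random.seed(1)

# Monte Carlo sample from the uniform distribution on the unit square.
N = 6000
pts = [(random.random(), random.random()) for _ in range(N)]

def depth(x, pts, n_dir=120):
    """Empirical halfspace depth of x: smallest fraction of the sample in a
    closed halfspace whose boundary hyperplane passes through x."""
    n = len(pts)
    best = 1.0
    for k in range(n_dir):
        a = math.pi * k / n_dir
        ux, uy = math.cos(a), math.sin(a)
        s = sum(1 for (px, py) in pts if ux * (px - x[0]) + uy * (py - x[1]) >= 0.0)
        best = min(best, s / n, (n - s) / n)
    return best

d_center = depth((0.5, 0.5), pts)  # halfspace median of the square; exact depth 1/2
d_corner = depth((0.1, 0.1), pts)  # near a corner; a diagonal halfplane cuts off
                                   # only the triangle of area 0.02
```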
Condition (FB 1 ) is, however, still rather strict. Not only does it impose (4) on P , but it also means that P must be halfspace symmetric. For (uniform measures on) convex bodies, this was noted by Meyer and Reisner [122, Lemma 4]. The next proposition extends that result to probability measures. Its proof follows closely the arguments of Meyer and Reisner [122, Lemma 4], and is omitted.
Proposition 35. Let P ∈ P R d have contiguous support. If (FB 1 ) is true for P , then P must be halfspace symmetric.
Problem 12. Describe the collection of all probability measures P ∈ P (R d ) whose halfspace depth is unique, i.e. there is no Q ≠ P with hD(x; P ) = hD(x; Q) for all x ∈ R d . Is the existence of the expectation E X for X ∼ P sufficient for the halfspace depth of P to be unique? Is the uniform distribution on a simplex in R d characterized by its halfspace depth?

Problem 13. If condition (FB 1 ) is not satisfied, how can one reconstruct the probability content of all halfspaces P (H − ) from the depth hD(x; P ) for all x ∈ R d only?
8.1. Characterization theorem of Kong and Zuo (2010). In [88, Theorem 3.2] it is shown that if, for P ∈ P (R d ) with contiguous support, (33) for all δ ∈ (0, 1/2) the boundary of the central region P δ is C 1 , and (4) holds, then (31) is true, and P is characterized by its halfspace depth. Theorem 34 generalizes this result: by Proposition 31 above, if (33) and (4) are true, then the floating body P [δ] of P exists for all δ ∈ (0, 1/2), and Theorem 34 can be used.

8.2. Characterization theorem of Hassairi and Regaieg (2008). Let us state a characterization result for the halfspace depth for absolutely continuous distributions that can be found in [76, Theorem 3.2]. For this, we define for any x ∈ R d the halfspace function

φ x : S d−1 → [0, 1] : u ↦ P (H − u,⟨x,u⟩ ),

where H − u,⟨x,u⟩ ∈ H − is the closed halfspace in R d whose outer normal is parallel to u, and x ∈ H u,⟨x,u⟩ .
Theorem 36. Let P ∈ P R d be as in Proposition 29, and suppose that (34) for all x ∈ R d , if φ x has a local minimum at u = u(x) ∈ S d−1 , then φ x (u) = hD(x; P ).
Then (31) holds true, and P is characterized by its halfspace depth.
In [76,Theorem 3.2], condition (34) is formulated in a slightly different manner in terms of derivatives of functions related to φ x . It is easy to see that for P that satisfies the conditions from Proposition 29, (34) and the corresponding condition from [76] are equivalent.
If P satisfies (4), then for any x ∈ R d the function φ x is continuous on S d−1 [116,Proposition 4.5]. Thus, it must attain a global minimum over its domain. Condition (34) therefore means that there cannot exist any local minimum of φ x that is not global.
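The halfspace function φ x can be probed empirically. The following sketch is our own (all names, the sample size, and the direction grid are our choices): it evaluates an empirical version of φ x over a grid of directions for the bivariate standard normal, for which φ x (u) = Φ(⟨x, u⟩) has a single local minimum Φ(−‖x‖), attained at u = −x/‖x‖, in line with condition (34).

```python
import math
import random

random.seed(2)

def Phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Monte Carlo sample from the bivariate standard normal N(0, I_2).
N = 8000
Y = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(N)]
x = (1.0, 0.0)

def phi_x(u):
    """Empirical halfspace function: fraction of the sample in the closed
    halfspace with outer normal u whose boundary hyperplane passes through x."""
    t = u[0] * x[0] + u[1] * x[1]
    return sum(1 for (p, q) in Y if u[0] * p + u[1] * q <= t) / N

# phi_x over a full grid of unit directions; its minimum is the depth of x.
values = [phi_x((math.cos(a), math.sin(a)))
          for a in (2.0 * math.pi * k / 180 for k in range(180))]
hD_x = min(values)  # hD(x; P) = inf_u phi_x(u)
```

Here `hD_x` is close to Φ(−1) ≈ 0.159, the exact depth of x = (1, 0) under N(0, I 2 ).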
Suppose for a moment that (4) is valid for P . By Theorem 36, (34) implies the characterization result (31), which is, by Theorem 34, equivalent with (FB 1 ). Therefore, given that (4) is true, condition (34) implies (FB 1 ), and the characterization of Hassairi and Regaieg [76] is a special case of Theorem 34 above.

8.3. Homothety conjecture. In convex geometry, the following open question, similar in nature to the depth characterization conjecture, was posed by [163]: Let a convex body K ∈ K d and one of its convex floating bodies K δ be homothetic, i.e. K δ = λK + x for some λ > 0 and x ∈ R d . Is K then necessarily an ellipsoid? Schütt and Werner [163] showed that if K is homothetic to a sequence of its floating bodies K δ n with δ n → 0, then K must be an ellipsoid. Stancu [171] demonstrated that for K with a sufficiently smooth boundary, homothety of K and K δ for a single small δ also implies that K is an ellipsoid. The latter result was later refined in [183].

Problem 14. Does the homothety conjecture hold true? More generally, which convex bodies are characterized by any of their convex floating bodies?

9. Conclusions and further perspectives
In this survey, we discussed little known relations between the concept of halfspace depth, studied extensively in statistics, and paradigms well known in functional analysis and geometry. In Section 4 we saw that the depth of the halfspace median is a particular example of a more general concept of measures of symmetry. In Sections 5 and 6 we focused on the floating body and its possible generalizations towards (probability) measures. These little explored junctions of mathematical statistics and geometry are, however, hardly limited to the halfspace depth hD defined in finite-dimensional linear spaces R d . In this concluding section our intention is to outline, and properly refer to, a few further links between the statistics of depth functions and current research in pure mathematics.
9.1. Depth in non-linear spaces. By directional data one understands data that live on the unit sphere S d−1 of R d [114]. Each observation can be interpreted as the direction of a non-zero vector in R d . Such data appear quite naturally, and it is of great interest to find depth functions suitable also for observations of this kind. Several definitions of depth have been proposed for directional data [168,99,1,93,136]. The following depth, proposed by Small [168], is an analogue of the halfspace depth for directional data.
Definition. Let P ∈ P (S d−1 ) and x ∈ S d−1 . The angular halfspace depth (or angular Tukey depth) of x w.r.t. P is defined as

AhD(x; P ) = inf { P (H − ) : H ∈ H 0 and x ∈ H − } ,

where H 0 denotes the set of hyperplanes H ∈ H in R d such that 0 ∈ H.
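The angular halfspace depth can be estimated empirically by minimizing over a grid of closed hemispheres. The following sketch is our own (the angular model with normally distributed angles, the sample size, and all names are our choices): a directional sample on S 1 concentrated around e 1 = (1, 0) gives a deep point near the bulk of the mass and a shallow antipodal point.

```python
import math
import random

random.seed(3)

# Directional sample on the unit circle S^1, concentrated around e_1 = (1, 0):
# angles drawn from N(0, 0.5^2) and mapped to the circle.
N = 6000
pts = [(math.cos(t), math.sin(t))
       for t in (random.gauss(0.0, 0.5) for _ in range(N))]

def ahd(x, pts, n_dir=360):
    """Empirical angular halfspace depth: smallest fraction of the sample in a
    closed hemisphere {y : <u, y> >= 0} that contains x."""
    n = len(pts)
    best = 1.0
    for k in range(n_dir):
        a = 2.0 * math.pi * k / n_dir
        ux, uy = math.cos(a), math.sin(a)
        if ux * x[0] + uy * x[1] >= 0.0:   # hemisphere contains x
            s = sum(1 for (p, q) in pts if ux * p + uy * q >= 0.0)
            best = min(best, s / n)
    return best

d_mode = ahd((1.0, 0.0), pts)    # deep point, near the center of the mass
d_anti = ahd((-1.0, 0.0), pts)   # antipodal point, an outlying direction
```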
It is natural to consider the collection H 0 in the definition of AhD, as H 0 ∩ S d−1 is the collection of all closed hemispheres of S d−1 . Therefore, it is not surprising that also for spherical convex bodies, concepts similar to floating bodies have been investigated. Recall that K ⊂ S d−1 is said to be spherically convex if the radial extension of K, given by rad K = {λx : x ∈ K, λ ≥ 0} , is a convex set in R d . A closed spherically convex set K ⊂ S d−1 such that the interior of rad K is non-empty is called a spherical convex body. Analogues of floating bodies and convex floating bodies for spherical convex bodies were studied by Besau and Werner [14].
Definition. For a spherical convex body K ⊂ S d−1 , let P ∈ P (S d−1 ) be the uniform distribution on K, and take δ ≥ 0. The spherical convex floating body of K is defined as

K δ = ⋂ { H − ∩ S d−1 : H ∈ H 0 and P (H + ) ≤ δ } ,

with H + the closed halfspace complementary to H − .

Just as in Section 6, it is possible to define floating bodies and convex floating bodies also for general probability measures on S d−1 , and it is easy to see that the spherical convex floating body coincides with the central regions of the angular halfspace depth for uniform distributions on spherical convex bodies. Some results in the spirit of those discussed in Section 5 can be obtained also for spherical convex floating bodies [14]. In another paper, Besau and Werner [15] provide extensions of those results to certain Riemannian manifolds. Research in this direction in the statistics of data depth is still only in its beginnings [55].

9.2. Depth for infinite-dimensional data. In statistics, since the work of Liu and Singh [100] and Fraiman and Muniz [56], considerable attention has focused on devising depth functions applicable to data from high-dimensional and infinite-dimensional (functional) spaces. Direct applications of the halfspace depth are known to be inadequate [48], but many other depth functions suited for functional data can be found in the literature [43,101,128,39,36,131,133,64]. In geometry, some advances that appear to be related are the floating functions [94] considered in Section 5.5 above. Solid connections between these two areas of research appear to be uncharted.

9.3. Centroid body and simplicial volume depth. Apart from the halfspace depth, the simplicial depth, and the Mahalanobis depth mentioned above, there exists an abundance of other depth functions defined in R d in statistics. A comprehensive survey of some of those is [187], where, based on the ideas of Oja [135], also the following depth function can be found.
Definition. Let X ∼ P ∈ P (R d ) be such that Var X = Σ is a positive definite matrix, and let x ∈ R d . The simplicial volume depth (or Oja depth) of x w.r.t. P is defined as

(35) svD(x; P ) = ( 1 + E vol d ([x, X 1 , . . . , X d ]) / √ det Σ ) −1 ,

where X 1 , . . . , X d ∼ P are independent.
The factor √ det Σ ensures the affine invariance of svD. Like the Mahalanobis depth MD, svD is not defined for all P ∈ P (R d ), but only for distributions with finite second moments and positive definite variance matrices.
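The simplicial volume depth lends itself to Monte Carlo approximation. For the bivariate standard normal, a classical computation gives E|det| = 1 for a 2 × 2 matrix with independent standard normal entries, so the expected triangle area is 1/2 and svD(0; P ) = (1 + 1/2) −1 = 2/3. The sketch below is our own (names and sample size are our choices):

```python
import random

random.seed(4)

# Simplicial volume depth of the origin for the bivariate standard normal,
# estimated by Monte Carlo; here Sigma = I_2, so sqrt(det Sigma) = 1 and
# svD(0; P) = 1 / (1 + E vol_2([0, X_1, X_2])).
N = 100000
total = 0.0
for _ in range(N):
    x1, y1 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    x2, y2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    total += 0.5 * abs(x1 * y2 - y1 * x2)   # area of the triangle [0, X_1, X_2]
mean_vol = total / N
svD_origin = 1.0 / (1.0 + mean_vol)
# E|det| = 1 for a 2x2 iid N(0,1) matrix, so mean_vol is close to 1/2
# and svD_origin close to 2/3.
```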
For (a uniform distribution on) a compact (possibly non-convex) set K ⊂ R d with vol d (K) > 0, a concept closely related to the map x ↦ E vol d ([x, X 1 , . . . , X d ]) central in (35) is that of the centroid body of K. The centroid body of K is the convex body Z ∈ K d defined via its support function (2)

h Z (u) = (1 / vol d (K)) ∫ K |⟨x, u⟩| d x.
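The support function of the centroid body is a plain expectation, and easy to approximate. A sketch for the square K = [−1/2, 1/2] 2 centered at the origin (our own example and names; for this K one has E|X 1 | = 1/4, and E|X 1 + X 2 | = 1/3 since X 1 + X 2 has a triangular density on [−1, 1]):

```python
import math
import random

random.seed(5)

# Monte Carlo evaluation of the support function h_Z(u) = E|<X, u>| of the
# centroid body Z of K = [-1/2, 1/2]^2, with X uniform on K (vol_2(K) = 1).
N = 100000
X = [(random.random() - 0.5, random.random() - 0.5) for _ in range(N)]

def h_Z(u):
    """Empirical support function of the centroid body in direction u."""
    return sum(abs(p * u[0] + q * u[1]) for (p, q) in X) / N

h_axis = h_Z((1.0, 0.0))     # exact value E|X_1| = 1/4
s = 1.0 / math.sqrt(2.0)
h_diag = h_Z((s, s))         # exact value E|X_1 + X_2| / sqrt(2) = 1/(3 sqrt(2))
```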
If K is (centrally) symmetric around the origin, ∂Z is the locus of centroids of all intersections of K with halfspaces H − ∈ H − such that 0 ∈ H. As discussed in [60, Section 9.1], this body was defined by Petty [139], but its earlier predecessors can be traced back to the work of Dupin [47]. The volume of the centroid body Z of K determines the simplicial volume depth svD of 0 ∈ R d with respect to the uniform distribution on K. The next theorem can be found in Gardner [60, Theorem 9.1.5].
Extensions not listed here can be found in [139,157]. For star bodies K ⊂ R d a version of this theorem is given in [156,Section 10.8].
Theorem 37. Let X ∼ P ∈ P (R d ) be uniformly distributed on a compact set K ⊂ R d with vol d (K) > 0. Denote Var X = Σ, and let Z x be the centroid body of K − x. Then

svD(x; P ) = ( 1 + 2 −d vol d (Z x ) / √ det Σ ) −1 .

Centroid bodies have been the subject of numerous studies in geometry and functional analysis. We only refer here to [156, Section 10.8] and [27, Section 5.1] and the references therein for a comprehensive account of results that can be found in the literature on centroid bodies and their extensions.

Problem 15. Is it possible to extend Theorem 37 also to more general probability measures?