1 Introduction

In this paper we derive analytic formulas for approximations of sub-Riemannian distances on the 2D and 3D rotation translation groups, denoted, respectively, with SE(2) and SE(3). Additionally, we extend the perceptual grouping algorithm [13] for clustering of local orientations (points on blood vessels). Here clustering is based on alignment of local orientations, which is quantified using sub-Riemannian distances on SE(n); see Fig. 1 for an illustration.

1.1 Nilpotent Approximation

The sub-Riemannian distances on SE(n) are approximated via norms on the vectors obtained from the logarithmic map (from group elements to the Lie algebra). This approach is motivated by problems from sub-Riemannian geometry in nilpotent Lie groups, in which such homogenous norms provide exact fundamental solutions to sub-Laplacians.

The vectors obtained by the logarithmic map, expressed in a left-invariant basis, are the so-called exponential coordinates of the first kind. For a nilpotent group of step two, like the Heisenberg group, these coordinates define [together with a group product defined via the Baker–Campbell–Hausdorf (BCH) formula] a global isomorphism to the group. In our SE(n) setting we have to truncate the commutator series in the BCH formula due to non-vanishing (higher-order) commutators, yielding a corresponding Heisenberg type approximation which we denote with \((\mathrm{SE}(n))_0\). The obtained Taylor development of the group product and associated left-invariant vector fields gives rise to a local approximation of the (sub-Riemannian) flows on SE(2) in the sense of Rothschild and Stein [50].

Fig. 1
figure 1

The red and green arrows have equal spatial and angular distance to the origin (black arrow). In a flat geometry on \(\mathbb {R}^2 \times S^1\) the distance between the red and green arrow and the source would be equal, and the geodesics straight lines (see dashed lines). In sub-Riemannian geometry on SE(2) the green arrow has a shorter distance to the source. The left image shows 2D projections of the sub-Riemannian geodesics in solid black, and the right image shows their paths in SE(2) (Color figure online)

Fig. 2
figure 2

The pipeline for grouping vessel segments consists of 2 steps. First, key points are generated (from a single source point) using minimal path tracking with key points [5]. Second, the automatically generated key points, with estimated orientations, are grouped based on an adaption of the perceptual grouping algorithm [13] with the use of sub-Riemannian distances on SE(2). The result on the right is obtained with the nilpotent approximations of the sub-Riemannian distances in SE(2)

We then define a norm on \((\mathrm{SE}(n))_0\) based on the Folland–Kaplan–Korányi gauge, which is known for its relation to the fundamental solution of the sub-Laplacian on the Heisenberg group [29, 33, 35]. We reason that the Folland–Kaplan–Korányi provides an accurate approximation to the fundamental solution on SE(n) as well, as it provides the exact fundamental solution on the Heisenberg type approximation \((\mathrm{SE}(n))_0\). As such, we provide an approach to approximating the heat kernel and fundamental solution of the sub-Laplacian on SE(n), as an alternative to the works [12, 22, 47].

The distance associated with the Folland–Kaplan–Korányi-type norm on \((\mathrm{SE}(n))_0\) is locally equivalent to the sub-Riemannian distance on SE(2), as was formally proved in full generality in the seminal work by Nagel et al. [43]. In this paper we show by qualitative and quantitative comparison that the norm on \((\mathrm{SE}(n))_0\) indeed provides a sharp local approximation of the sub-Riemannian distances on SE(n).

1.2 Perceptual Grouping

The motivation for perceptual grouping of local orientations comes from problems in medical image analysis in which the topologically correct reconstruction of vessel (and pulmonary) trees is of great importance in biomarker research and surgery planning. Knowing the correct connectivity in tree structures not only allows for local biomarker analysis (e.g., studies on bifurcation and crossing properties [37]), but also allows for higher level biomarker research via statistics on tree structures [28]. Topological knowledge of vessel trees is also essential in determining artery/vein classification problems [15, 25, 26]. Finally, in many medical applications involving vessel analysis, including topological tree reconstruction, distances between local orientations play a crucial role [1, 16, 27, 39, 55, 57]. The approximate sub-Riemannian distance in this paper is analytic, fast and easy to implement, and as such may be a useful tool for algorithms that rely on local orientation analysis.

Sub-Riemannian models are shown to be effective in both image processing and in neuropsychological models for line perception in the primary visual cortex [3, 7, 12, 20, 27, 40, 45, 48, 51, 53]. In this paper we indeed observe by quantitative validation of automatic connectivity analysis that sub-Riemannian distances are preferred over their (full) Riemannian counter parts.

The approach taken in this paper for doing connectivity analysis is based on the perceptual grouping algorithm proposed by Cohen [13]. This algorithm turns a set of key points into a graph by iteratively adding edges between nodes based on their geodesic distances while putting constraints on the number of connections per node. The input set of key points may be obtained via key point tracking algorithms [5, 10, 34], as is done also in this paper; see Fig. 2.

In [13] an isotropic metric was used to define the geodesic distances. Later, the perceptual grouping algorithm was adapted for use with anisotropic Riemannian metrics by Bougleux et al. [8]. In recent work [11] it was further extended for the grouping of n closed contours for an a priori n. There, a (sub-)Finsler metric on position orientation space was used, similar to the sub-Riemannian metric used in this paper. As in [8] and [11] we use the main algorithm of [13] as a backbone, but we change the metric used for perceptual grouping and we impose an additional constraint to avoid closed loops (which are physically not realistic in the vessel networks of interest).

With quantitative experiments we show that perceptual grouping with sub-Riemannian distances on SE(n) is preferred over the use of (full) Riemannian distances on SE(n), which is in turn preferred over grouping with distances on \(\mathbb {R}^n\). Furthermore, the analytic approximations allow for fast perceptual grouping with competitive performance compared to data-adaptive sub-Riemannian distances computed via fast marching.

1.3 Paper Outline

In Sects. 2 and 3 we derive approximations for sub-Riemannian distances in, respectively, SE(2) and SE(3). There, for each Lie group we first provide the preliminaries, then define the sub-Riemannian distance and then describe the proposed approximations. In Sect. 4 the algorithms (perceptual grouping, fast marching and key point tracking) are described, including an overview of the different distances used in this paper. In Sect. 5 we then compare the performance of the perceptual grouping algorithm using different distances, first on \(\mathbb {R}^2\) and SE(2) in Sect. 5.1, and then on \(\mathbb {R}^3\) and SE(3) in Sect. 5.2. General conclusions are provided in Sect. 6.

2 Sub-Riemannian Distance and its Approximation in SE(2)

2.1 The Lie Group SE(2)

2.1.1 SE(2)

In order to measure distances between local orientations we will consider the Lie group SE(2) as our base manifold. The group SE\((2) = \mathbb {R}^2 \rtimes \mathrm{SO}(2)\) is the semi-direct product of the group of planar translations \(\mathbb {R}^2\) and rotations \(\mathrm{SO}(2)\), and its group product and inverse are, respectively, defined via:

$$\begin{aligned} g \cdot g'= & {} ( \mathbf {x},\mathbf {R}_\theta ) \cdot ( \mathbf {x}', \mathbf {R}_{\theta '} ) = ( \mathbf {R}_\theta \mathbf {x}' + \mathbf {x}, \mathbf {R}_{\theta + \theta '}),\nonumber \\ g^{-1}= & {} \left( -\mathbf {R}_{\theta }^{-1} \mathbf {x} , \mathbf {R}_{\theta }^{-1}\right) , \end{aligned}$$
(1)

with group elements \(g,g' \in \mathrm{SE}(2)\). The group acts on the (coupled) space of positions and orientations \(\mathbb {R}^2 \rtimes S^1\) via

$$\begin{aligned} g \cdot (\mathbf {x}',\theta ') = ( \mathbf {R}_\theta \mathbf {x}' + \mathbf {x}, \theta +\theta '). \end{aligned}$$

Since \(( \mathbf {x} , \mathbf {R}_\theta ) \cdot ( \mathbf {0}, 0 ) = ( \mathbf {x} , \theta )\), we can uniquely identify the roto-translation group SE(2) with the space of positions and orientations \(\mathbb {R}^2 \rtimes S^1\).

2.1.2 The Lie Algebra, Exponential Map and Commutators

The Lie algebra associated with SE(2) is the real vector space \(\mathfrak {se}(2) = {\text {span}} \{ A_1, A_2, A_3 \}\) together with a bilinear operator \([\cdot ,\cdot ]:\mathfrak {se}(2)\times \mathfrak {se}(2)\rightarrow \mathfrak {se}(2)\) called the Lie bracket (which we define in Eq. (4)). The generators of the Lie algebra are given by the differential frame \(\left. \{\partial _\theta ,\partial _x,\partial _y\}\right| _{(0,0,0)}\) at the origin

$$\begin{aligned} A_1 = \left. \partial _\theta \right| _{(0,0,0)}, \quad A_2 = \left. \partial _x\right| _{(0,0,0)}, \quad A_3 = \left. \partial _y\right| _{(0,0,0)}, \end{aligned}$$
(2)

which define corresponding left-invariant vector fields

$$\begin{aligned} \begin{aligned} \left. \mathcal {A}_1\right| _{g} = (L_{g})_* A_1&= \left. \partial _\theta \right| _{g}, \\ \left. \mathcal {A}_2\right| _{g} = (L_{g})_* A_2&= \cos \theta \left. \partial _x\right| _{g} + \sin \theta \left. \partial _y\right| _{g}, \\ \left. \mathcal {A}_3\right| _{g} = (L_{g})_* A_3&= -\sin \theta \left. \partial _x\right| _{g} + \cos \theta \left. \partial _y\right| _{g} \end{aligned} \end{aligned}$$
(3)

via the push forward of left multiplication, denoted by \((L_g)_*\), and with \(g = (x,y,\theta ) \in \mathrm{SE}(2)\).

The exponential map \({\text {Exp}}: \mathfrak {se}(2) \rightarrow \mathrm{SE}(2)\) defines a mapping from a vector \(X \in \mathfrak {se}(2)\) in the tangent space at \(g=(0,0,0)\) to an element in the group SE(2) by following an integral curve along the left-invariant vector field \((L_g)_* X\). The logarithmic map \({\text {Log}}: SE(2) \rightarrow \mathfrak {se}(2)\) defines the mapping from group element to tangent vector at \(g=(0,0,0)\).

The Lie bracket for vector fields is defined as follows

$$\begin{aligned} \begin{aligned} {[}X,Y]&:= \underset{t\rightarrow 0}{{\text {lim}}} \frac{\gamma (t) - e}{t^2}, \quad \text {with} \\ \gamma (t)&= {\text {Exp}}(-t Y){\text {Exp}}(-t X){\text {Exp}}(t Y){\text {Exp}}(t X). \end{aligned} \end{aligned}$$
(4)

That is, it describes the infinitesimal displacement by following a path moving forth and back in X and Y directions. The Lie bracket of two vectors defines a new vector (the commutator) and the Lie bracket of two vector fields defines a new vector field. The nonzero commutators of \(\mathfrak {se}(2)\) are

$$\begin{aligned} \begin{aligned} {[}A_1,A_2]&= -[A_2,A_1] = A_3, \\ {[}A_1,A_3]&= -[A_3,A_1] = -A_2. \end{aligned} \end{aligned}$$
(5)

2.2 Sub-Riemannian Geometry in SE(2)

We consider a sub-Riemannian geometry on SE(2) by measuring distances between two points in SE(2) via the lengths of shortest horizontal paths. A horizontal path is a curve \(\gamma :[t_0,t_1]\subset \mathbb {R}\rightarrow \mathrm{SE}(2)\) with tangent vectors \(\dot{\gamma }(t) \in \left. \varDelta \right| _{\gamma (t)}:= {\text {span}} \{ \left. \mathcal {A}_1\right| _{\gamma (t)}, \left. \mathcal {A}_2 \right| _{\gamma (t)}\}\), where \(\varDelta \) denotes the sub-bundle of the full tangent bundle \(T(\mathrm{SE}(2)):= {\text {span}} \{\mathcal {A}_1,\mathcal {A}_2,\mathcal {A}_3\}\). Lengths of horizontal curves with \(\dot{\gamma }(t) = u^1(t) \left. \mathcal {A}_1\right| _{\gamma (t)} + u^2(t) \left. \mathcal {A}_2\right| _{\gamma (t)}\) are measured by the sub-Riemannian metric tensorFootnote 1

$$\begin{aligned} \left. \mathcal {G}^{\xi ,C}\right| _{\gamma (t)} (\dot{\gamma }(t),\dot{\gamma }(t)):= C(\gamma (t))^2 (|u^1(t)|^2 + \xi |u^2(t)|^2), \end{aligned}$$
(6)

in which \(C:\mathrm{SE}(2)\rightarrow \mathbb {R}^+\) is an external cost which penalizes the curves to move through certain regions in SE(2), \(\xi \) is a parameter which balances the penalty of motion in the angular and spatial directions and has dimensions [1/length], and \(u^1\) and \(u^2\) are the control parameters of the curve \(\gamma \).

The sub-Riemannian distances between two points \(g_1,g_2 \in \mathrm{SE}(2)\) is then given by

$$\begin{aligned} d_0(g_1,g_2) := {\text {inf}} \left\{ \int _0^1 \sqrt{ \left. \mathcal {G}^{\xi ,C}\right| _{\gamma (t)}(\dot{\gamma }(t),\dot{\gamma }(t)) } \mathrm{d}t \right\} , \end{aligned}$$
(7)

where the infimum is taken over by Lipschitz continuous curves \(\gamma \in {\text {Lip}}([0,T],\mathrm{SE}(2))\) with \(\gamma (0) = g_1\), \(\gamma (1) = g_2\) and \(\dot{\gamma }(t) = u^1(t) \left. \mathcal {A}_1\right| _{\gamma (t)} + u^2(t) \left. \mathcal {A}_2\right| _{\gamma (t)}\). Note that due to the inclusion of an external cost function C the distance d is not strictly left-invariant; however, when substituting C by \(C_g:=C(g^{-1} h)\) in (7) we do have left invariance (i.e., then \(d(g \cdot g_1, g \cdot g_2) = d(g_1, g_2)\)).

2.3 A Nilpotent Approximation \((\mathrm{SE}(2))_0\) of SE(2)

2.3.1 A Local Approximation via the Baker–Campbell–Hausdorff Formula

Consider the exponential map from Lie algebra \(\mathfrak {se}(2)\) to the group SE(2)

$$\begin{aligned} (c^1,c^2,c^3)\mapsto (x,y,\theta ) = {\text {Exp}}( c^1 A_1 + c^2 A_2 + c^3 A_3 ), \end{aligned}$$
(8)

with \(\{A_i\}_{i=1}^3\) the basis vectors of \(\mathfrak {se}(2)\) given in (2), and with \((c^1,c^2,c^3)\) the canonical coordinates of the first kind given by

$$\begin{aligned} \begin{array}{ll} c^1 = \theta , &{}\quad c^2 = \left\{ \begin{array}{ll} \tfrac{1}{2} \theta \left( y + x \cot \tfrac{\theta }{2}\right) &{} \quad \text {if} \quad \theta \ne 0\\ x &{} \quad \text {if} \quad \theta = 0\\ \end{array}\right. ,\\ &{}\quad c^3 = \left\{ \begin{array}{ll} \tfrac{1}{2} \theta \left( -x + y \cot \tfrac{\theta }{2}\right) &{} \quad \text {if} \quad \theta \ne 0\\ y &{} \quad \text {if} \quad \theta = 0\\ \end{array}\right. . \end{array} \end{aligned}$$
(9)

For two left-invariant vector fields \(X = \sum _{i=1}^3 x^i \mathcal {A}_i\) and \(Y = \sum _{i=1}^3 y^i \mathcal {A}_i\) the Baker–Campbell–Hausdorff (BCH) formula (see, e.g., [49]) gives:

$$\begin{aligned} {\text {Log}}( {\text {Exp}}(X) {\text {Exp}}(Y) )= & {} X + Y + \frac{1}{2} [X,Y] \nonumber \\&+\,\frac{1}{12}([X,[X,Y]] + [Y,[Y,X]])\nonumber \\&+\,\mathcal {O}([\cdot ,[\cdot ,[\cdot ,\cdot ]]]), \end{aligned}$$
(10)

where \(\mathcal {O}([\cdot ,[\cdot ,[\cdot ,\cdot ]]])\) denotes higher-order nested brackets. Since the Lie algebra \(\mathfrak {se}(2)\) is not nilpotent it has non-vanishing Lie brackets of order \(\ge 2\) [cf. the commutator relations in (5)] the BCH formula gives an infinite series of nested Lie brackets.

Here, we approximate the BCH formula SE(2) asFootnote 2

$$\begin{aligned} {\text {Log}}( {\text {Exp}}(X) {\text {Exp}}(Y) ) \approx X + Y + \frac{1}{2} [X,Y], \end{aligned}$$
(11)

by omitting the Lie brackets of order 2 (once nested brackets) and higher, as if our Lie algebra \(\mathfrak {se}(2)\) is nilpotent of step 2. Then, together with the commutator relations \([A_i,A_i]=0\), \(A_3 = [A_1,A_2]\), and again omitting Lie brackets of order 2 (i.e., setting \([A_1,A_3] = [A_1,[A_1,A_2]] = 0\)), the BCH formula defines a group product on the vector space \(\mathbb {R}^3\) of the canonical coordinates of the first kind via

$$\begin{aligned}&(x^1,x^2,x^3) \cdot (y^1,y^2,y^3) \nonumber \\&\quad = \left( x^1+y^1,x^2+y^2,x^3+y^3 + \frac{1}{2}(x^1y^2-x^2y^1)\right) .\nonumber \\ \end{aligned}$$
(12)

The new group product (12), where the elements are expressed in coordinates of the first kind [cf. Eq. (8)], gives rise to a nilpotent Heisenberg group. It is a localFootnote 3 approximation of the true group product \(g_1 \cdot g_2 = {\text {Exp}}(\sum _{i=1}^3 x^i A_i) \cdot {\text {Exp}}(\sum _{i=1}^3 y^i A_i)\) given by (1). We denote this group by \((\mathrm{SE}(2))_0 = H(3)\), with H(3) the three-dimensional (nilpotent) Heisenberg group. Note that if \((x^1,x^2,x^3)\) and \((y^1,y^2,y^3)\) were coordinates of the first kind for a group with a step 2 nilpotent algebra, then this new group would be globally isomorphic to that group. The new group \((\mathrm{SE}(2))_{0}\) defines a homogeneous Carnot group with respect to the dilations

$$\begin{aligned} \delta _s(\mathbf {c}) = (s \, c^1, s \, c^2, s^2 \, c^3). \end{aligned}$$
(13)

2.3.2 Homogeneous Norms on \((\mathrm{SE}(2))_0\) and the Fundamental Solution of the Sub-Laplacian

In our approximation of the sub-Riemannian distance \(d_{0}\) of Eq. (7) we make use of the following homogenous norm on \((\mathrm{SE}(2))_{0}\):

$$\begin{aligned} ||\mathbf {c} ||_\zeta := \root 4 \of {(|c^1|^2 + |c^2|^2)^2 + \zeta \, |c^3|^2}, \end{aligned}$$
(14)
Fig. 3
figure 3

Distances on SE(2) for \(\xi =1\), \(C=1\). Top row: level sets of the distance volumes on SE(2). Bottom row: minimum intensity projections of the distances to the plane \(\mathbb {R}^2\) with level set contours. From left to right: the sub-Riemannian distance \(d_{0}(e,\cdot )\), see Eq. (7); Homogenous norms \(||\cdot ||_{\xi ,\zeta }\), see Eq. (15), of the nilpotent approximation \((\mathrm{SE}(2))_0\) for, respectively, \(\zeta =44\), \(\zeta =16\) (Folland–Kaplan–Korányi gauge) and \(c=\zeta \); The (\(\xi \)-isotropic) Riemannian distance \(d_{1}(e,\cdot )\) on SE(2), see Table 1 for an overview of the different distances

with constant \(\zeta >0\) a relative penalty for the non-horizontal part. For \(\zeta =16\) this norm coincides with the well-known Folland–Kaplan–Korányi gauge, which is a widely studied norm on Carnot groups due to its relation to fundamental solutions of sub-Laplacians [6]: Folland found that \(||\mathbf c ||_{16}^{2-Q}\), with homogeneous dimensions Q, is (a constant multiple of) the fundamental solution of the canonical sub-Laplacian on the Heisenberg group [29]; Kaplan showed that this relation more generally holds for H-type (Carnot) groups [33]; Korányi derived many more of its properties in relation to harmonic analysis and potential theory [35].

In relation to sub-Riemannian geometry on SE(2) and its sub-Laplacian \(\mathcal {L}:=\mathcal {A}_1^2 + \mathcal {A}_2^2\), we find that the fundamental solution \(\varGamma \) of \(\mathcal {L}\) can be approximated by the (explicit) fundamental solution of the canonical sub-Laplacian \(\mathcal {L}_0: = \mathcal {X}_1^2 + \mathcal {X}_2^2\), with Jacobian basis \( \mathcal {X}_1 = \partial _{c^1} + \frac{c^2}{2}\partial _{c^3}\), \(\mathcal {X}_2 = \partial _{c^2} - \frac{c^1}{2}\partial _{c^3} \) on \((\mathrm{SE}(2))_{0}\). This solution in fact coincides with one of the approximations of \(\varGamma \) found by Duits and Franken [22]. There, the fundamental solution of \(\mathcal {L}\) was first approximated by relying on a contraction of SE(2) to a three-dimensional Heisenberg group (via dilations on the group SE(2)) and then derived the Gaussian estimates based on the homogeneous norm \(||\cdot ||_1\), i.e., \(\zeta =1\), with exponential coordinates derived from the contraction.

In our study on the sub-Riemannian distance approximations we found that even sharper estimates could be obtained by relying on the explicit formula for the fundamental solution of the (Kohn) sub-Laplacian on H(3) (which is up to a constant given by \(||\mathbf {c} ||_{16}^{-2}\)). In this context we thus obtain an estimate of the fundamental solution of \(\mathcal {L}\) by estimating it with \(||\mathbf {c} ||_{16}^{-2}\), which is proportional to the exact fundamental solution of \(\mathcal {L}_0\) on our approximated group \((\mathrm{SE}(2))_0\).

2.3.3 Approximation of the Sub-Riemannian Distance

Finally we arrive at the sub-Riemannian distance approximations. By the Ball–Box theorem (see, e.g., [4]) and equivalence of homogeneous norms, there exists a constant \(\mathfrak {c}\) such that

$$\begin{aligned} \mathfrak {c}^{-1} ||{\text {Log}}(g) ||_\zeta \le d_0(e, g ) \le \mathfrak {c} ||{\text {Log}}(g) ||_\zeta , \end{aligned}$$

with \({\text {Log}}(g)\) defined by Eq. (9). The logarithmic norm is locally equivalent to the sub-Riemannian distance, which was proved in full generality in [43, Thm. 2 and 4]. Via a scaling of the generators \(\tilde{{A}}_2 = \xi ^{-1} {A}_2\) and \(\tilde{{A}}_3 = \xi ^{-1} {A}_3\) we define the \(\xi \)-isotropic norm

$$\begin{aligned} ||\mathbf {c} ||_{\xi ,\zeta }:= & {} \root 4 \of {(|c^1|^2 + |\tilde{c}^2|^2)^2 + \zeta |\tilde{c}^3|^2}\nonumber \\= & {} \root 4 \of {(|c^1|^2 + \xi ^2 |c^2|^2)^2 + \zeta \; \xi ^2 |c^3|^2}, \end{aligned}$$
(15)

with \(\tilde{c}^2 = \xi c^2\) and \(\tilde{c}^3 = \xi c^3\), and the \(c^i\) given in (9). The norm \(||\cdot ||_{\xi ,\zeta }\) closely approximates the sub-Riemannian distance \(d_{0}(e,\cdot )\) for \(C=1\) (no data adaptivity) via

$$\begin{aligned} d_{0}(g,h) \approx |{\text {Log}}(g^{-1}h)|_{\xi ,\zeta }, \quad |{\text {Log}}(g)|_{\xi ,\zeta }:=||\mathbf {c} ||_{\xi ,\zeta } \end{aligned}$$
(16)

with \(\mathbf {c}\) the coordinates of the first kind obtained via (9). In view of the Folland–Kaplan–Korányi gauge setting \(\zeta =16\) in \(||\cdot ||_{\xi ,\zeta }\) would be a sensible choice. We do observe, however, that \(\zeta =44\) gives an even sharper approximation; see Fig. 3 for a visual comparison to the sub-Riemannian distance \(d_0\) and “Appendix A” for a quantitative comparison. The setting \(\zeta =44\) is used in all experiments on SE(2).

3 Sub-Riemannian Distance and Its Approximation in SE(3)

In this section we extend the concepts of the previous section to the group SE(3) of 3D translations and rotations. In the end we again obtain an approximation for the sub-Riemannian distance, which allows us to do perceptual grouping in 3D images as well.

3.1 The Lie Group SE(3)

3.1.1 SE(3)

The Lie group SE\((3) = \mathbb {R}^3 \rtimes SO(3)\) is the semi-direct product of the group of 3D translations \(\mathbb {R}^3\) and the group of 3D rotations \(\mathrm{SO}(3)\). The group product and inverse for elements \(g = (\mathbf {x},\mathbf {R}),g'=(\mathbf {x}',\mathbf {R}') \in \mathrm{SE}(3)\) are defined by

$$\begin{aligned}&g \cdot g' = (\mathbf {x}, \mathbf {R}) \cdot (\mathbf {x}',\mathbf {R}') = (\mathbf {x} + \mathbf {R} \mathbf {x}', \mathbf {R} \mathbf {R}'), \nonumber \\&g^{-1} = (-\mathbf {R}^{-1}\mathbf {x},\mathbf {R}^{-1}). \end{aligned}$$
(17)

In the 3D case, we define the space of coupled positions and orientations as a Lie group quotient of SE(3):

$$\begin{aligned} \mathbb {R}^3 \rtimes S^2 := \mathrm{SE}(3)/({\mathbf {0}}\times SO(2)). \end{aligned}$$

The group action of \(g \in \mathrm{SE}(3)\) onto \((\mathbf {y},\mathbf {n}) \in \mathbb {R}^3 \times S^2\) is defined by

$$\begin{aligned} g \cdot (\mathbf {y},\mathbf {n}) = (\mathbf {x},\mathbf {R}) \cdot (\mathbf {y},\mathbf {n}) = (\mathbf {x} + \mathbf {R} \mathbf {y}, \mathbf {R} \mathbf {n}). \end{aligned}$$

We can identify the element \((\mathbf {x},\mathbf {n}) \in \mathbb {R}^3 \times S^2\) with group elements \((\mathbf {x},\mathbf {R}_\mathbf {n}) \in \mathrm{SE}(3)/(\mathbf {0}\times \mathrm{SO}(2))\), where \(\mathbf {R}_\mathbf {n}\) is any rotation matrix such that \(\mathbf {R}_\mathbf {n} \mathbf {e}_z = \mathbf {n}\).

3.1.2 The Lie Algebra, Exponential Map and Commutators

Analogously as in the SE(2) case, we associate with the group SE(3) the Lie algebra \(\mathfrak {se}(3)\) using the exponential and logarithmic maps. This is most easily done using an isomorphism with the corresponding matrix group:

$$\begin{aligned} (\mathbf {x},\mathbf {R}_{\gamma ,\beta ,\alpha }) \leftrightarrow \begin{pmatrix} \mathbf {R}_{\gamma ,\beta ,\alpha } &{} \quad \mathbf {x}^T \\ 0 &{} \quad 1 \end{pmatrix}. \end{aligned}$$

A basis for the corresponding matrix Lie algebra is given by

$$\begin{aligned} \begin{aligned} \mathbf {X}_1&= \begin{pmatrix} 0 &{}\quad 0 &{}\quad 0 &{}\quad 1 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \end{pmatrix}, \quad \mathbf {X}_2 = \begin{pmatrix} 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 1 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \end{pmatrix}, \\ \mathbf {X}_3&= \begin{pmatrix} 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 1 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \end{pmatrix}, \quad \mathbf {X}_4 = \begin{pmatrix} 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad -1 &{}\quad 0 \\ 0 &{}\quad 1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \end{pmatrix}, \\ \mathbf {X}_5&= \begin{pmatrix} 0 &{}\quad 0 &{}\quad 1 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ -1 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \end{pmatrix}, \quad \mathbf {X}_6 = \begin{pmatrix} 0 &{}\quad -1 &{}\quad 0 &{}\quad 0 \\ 1 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \end{pmatrix}, \end{aligned} \end{aligned}$$
(18)

and their equivalents \(A_i\) in the tangent space of SE(3) span the Lie algebra \(\mathfrak {se}(3)\). Since it will be clear from the context if we are in the SE(2) or SE(3) case, we use the same notation for the generators of the Lie algebra as previously. Now the left-invariant vector fields are again obtained using the push forward of the left multiplication \((L_g)^*\), but they depend on the choice of coordinates. In this paper we mostly rely on ZYZ Euler angles in the parameterization of \(\mathrm{SO}(3)\), i.e.,

$$\begin{aligned} \mathbf {R}_{\gamma ,\beta ,\alpha } = \mathbf {R}_{\mathbf {e}_z,\gamma } \mathbf {R}_{\mathbf {e}_y,\beta } \mathbf {R}_{\mathbf {e}_z,\alpha }, \end{aligned}$$
(19)

with \(\mathbf {R}_{\mathbf {n},\alpha }\) a rotation with angle \(\alpha \) around \(\mathbf {n}\). Then, the left-invariant vector fields are given by

$$\begin{aligned} \mathcal {A}_1|_g= & {} (\cos \alpha \cos \beta \cos \gamma - \sin \alpha \sin \gamma ) \partial _x \nonumber \\&+\,(\sin \alpha \cos \gamma + \cos \alpha \cos \beta \sin \gamma ) \partial _y - \cos \alpha \sin \beta \partial _z \nonumber \\ \mathcal {A}_2|_g= & {} (- \sin \alpha \cos \beta \cos \gamma - \cos \alpha \sin \gamma ) \partial _x \nonumber \\&+\,(\cos \alpha \cos \gamma - \sin \alpha \cos \beta \sin \gamma ) \partial _y + \sin \alpha \sin \beta \partial _z, \nonumber \\ \mathcal {A}_3|_g= & {} \sin \beta \cos \gamma \partial _x + \sin \beta \sin \gamma \partial _y + \cos \beta \partial _z \nonumber \\ \mathcal {A}_4|_g= & {} \cos \alpha \cot \beta \partial _\alpha + \sin \alpha \partial _\beta - \dfrac{\cos \alpha }{\sin \beta }\partial _\gamma , \nonumber \\ \mathcal {A}_5|_g= & {} -\sin \alpha \cot \beta \partial _\alpha + \cos \alpha \partial _\beta + \dfrac{\sin \alpha }{\sin \beta }\partial _\gamma \nonumber \\ \mathcal {A}_6|_g= & {} \partial _\alpha , \end{aligned}$$
(20)

for \(\beta \ne 0,\pi \).

Remark 1

A second coordinate chart is needed to cover the entire \(\mathrm{SO}(3)\), for which, for example, ZYX angles can be used, as is done in, e.g., [23], where also the expressions for the vector fields in this alternative coordinate chart are given. In fact, the basis elements \(A_i\) of the Lie algebra correspond to partial derivatives with respect to the ZYX angles, similar to the SE(2)-case.

We can express each element \(\mathfrak {se}(3)\) in terms of the basis with coefficients \(\mathbf {c} = (c^1, \dots , c^6)^T\). Furthermore, we define \(\mathbf {c}^{(1)}:= (c^1, c^2, c^3)^T\) and \(\mathbf {c}^{(2)} := (c^4,c^5,c^6)^T\), the spatial and rotational coefficients, respectively. We can make the exponential map \({\text {Exp}}_{\mathrm{SE}(3)} : \mathfrak {se}(3) \rightarrow \mathrm{SE}(3)\) and logarithmic map \({\text {Log}}_{\mathrm{SE}(3)} : \mathrm{SE}(3)\rightarrow \mathfrak {se}(3)\) explicit using these coefficients. For a \(3 \times 3\) matrix \({\varvec{\Omega }}\) of the form

$$\begin{aligned} {\varvec{\Omega }} := \left( \begin{array}{ccc} 0 &{}\quad -c^6 &{}\quad c^5 \\ c^6 &{}\quad 0 &{}\quad -c^4 \\ -c^5 &{}\quad c^4 &{}\quad 0 \end{array} \right) , \end{aligned}$$
(21)

we obtain a rotation using the exponential map of matrices, i.e., \(\mathbf {R} = \exp ({\varvec{\Omega }})\). The relation between the spatial coefficients \(\mathbf {c}^{(1)}\) and \((\mathbf {x},\mathbf {R})\) is given by

$$\begin{aligned} \mathbf {c}^{(1)} = \left( I - \frac{1}{2} {\varvec{\Omega }} + q^{-2} \left( 1 - \frac{q}{2} \cot \left( \frac{q}{2} \right) \right) ({\varvec{\Omega }})^2 \right) \mathbf {x}, \end{aligned}$$
(22)

where \(q = ||\mathbf {c}^{(2)}||\) and \({\varvec{\Omega }}\) such that \(\mathbf {R} = \exp ({\varvec{\Omega }})\). Now

$$\begin{aligned} \begin{aligned}&{\text {Log}}_{\mathrm{SE}(3)}(g) = \sum _{i=1}^6 c_i(g) A_i, \qquad \text {and}\\&\quad {\text {Exp}}_{\mathrm{SE}(3)}\left( \sum _{i=1}^6 c_i(g) A_i \right) = g, \end{aligned} \end{aligned}$$
(23)

using the relations above.

3.2 Sub-Riemannian Geometry in SE(3)

In the SE(3) case, a horizontal path is a curve \(\gamma :\mathbb {R}\rightarrow \mathrm{SE}(3)\) with tangent vectors \(\dot{\gamma }(t) \in \left. \varDelta \right| _{\gamma (t)}:= {\text {span}} \{ \left. \mathcal {A}_3\right| _{\gamma (t)}, \left. \mathcal {A}_4 \right| _{\gamma (t)}, \left. \mathcal {A}_5 \right| _{\gamma (t)}\}\), where \(\varDelta \) is now the sub-bundle of full tangent bundle spanned by \(\{\mathcal {A}_i\}_{i=1}^6\). In this case we have one spatial control \(u^3\) and two “angular” controls \(u^4\) and \(u^5\), so that the sub-Riemannian metric tensor becomes:

$$\begin{aligned} \begin{array}{rl} \left. \mathcal {G}^{\xi ,C}\right| _{\gamma (t)} (\dot{\gamma }(t),\dot{\gamma }(t)):= &{}C(\gamma (t))^2 \left( \xi |u^3(t)|^2 \right. \\ &{}\left. +\quad |u^4(t)|^2 + |u^5(t)|^2\right) , \end{array} \end{aligned}$$
(24)

The sub-Riemannian distance between two elements \(g_1,g_2 \in \mathrm{SE}(3)\) is still defined as in (7), but now the infimum is taken over Lipschitz continuous curves \(\gamma \in {\text {Lip}}([0,T],\mathrm{SE}(3))\) with \(\gamma (0) = g_1\), \(\gamma (1) = g_2\) and \(\dot{\gamma }(t) = u^3(t) \left. \mathcal {A}_3\right| _{\gamma (t)} + u^4(t) \left. \mathcal {A}_4\right| _{\gamma (t)} + u^5(t) \left. \mathcal {A}_5\right| _{\gamma (t)}\).

3.3 A Nilpotent Approximation \((\mathrm{SE}(3))_0\) of SE(3) and the Approximated Sub-Riemannian Distance

It is important to realize that the logarithmic map is only well defined on the group SE(3) and not on the quotient \(\mathbb {R}^3 \rtimes S^2\), i.e., different choices for \(\alpha \) in the rotational part result in different values for the coefficients \(c^i\). Here, we choose the approach of [46] and set \(\alpha =-\gamma \) such that expected symmetries are preserved. With that choice the logarithm (23) gives for each \((\mathbf {x},\mathbf {n}) \in \mathbb {R} \rtimes S^2\) a unique vector \(\mathbf {c}\), on which we can put a norm:

Fig. 4
figure 4

Distances on SE(3) for \(\xi =.1\), \(C=1\), with the origin placed at \(e=(\mathbf {0},\mathbf {e}_x)\). Top row: Level sets of the spatial projections (minimum intensity projections over \(S^2\)) of the distance volumes on SE(3). Rows two to four: glyph visualizations in which each distance volume d is visualized with a “Gaussian” density \(U(g)=e^{-d(e,g)^2}\). For an interpretation of the glyphs see Remark 2. Row two: glyph visualizations of sub-volume. Row three: glyph visualization of slice at \(z=0\). Row four: zoomed in glyph visualization of the slice a \(z=0\). From left to right: the sub-Riemannian distance \(d_{0}(e,\cdot )\) on SE(3); see Eqs. (7) and (24); homogenous norms \(||\cdot ||_{\xi ,\zeta }\), see Eq. (25), of the nilpotent approximation \((\mathrm{SE}(3))_0\) for, respectively, \(\zeta =100\), \(\zeta =16\) (Folland–Kaplan–Korányi gauge) and \(\zeta =1\); the (\(\xi \)-isotropic) Riemannian distance \(d_1(e,\cdot )\) on SE(3); see Table 1 for an overview of the different distances

$$\begin{aligned} \begin{aligned}&|{\text {Log}}_{\mathrm{SE}(3)}(g)|_{\xi ,\zeta } := ||\mathbf {c}||_{\xi ,\zeta } \\&\quad := \root 4 \of { (\xi ^2 |c^3|^2 + |c^4|^2 + |c^5|^2)^2 + \zeta \, (\xi ^2(|c^1|^2 + |c^2|^2) + |c^6|^2) }, \end{aligned} \end{aligned}$$
(25)

where \(\mathbf {c} = \mathbf {c}(g)\) according to (23).

Also here, the Folland–Kaplan–Korányi-type norm can be used to approximate the fundamental solutions of the sub-Laplacian on SE(3). The norm \(||\mathbf {c}||_{\xi ,\zeta }\) with \(\zeta =1\) was, for example, used in [23] approximations of the heat kernel and the fundamental solution on SE(3), of which only recently exact solutions were found in [47]. In the context of this paper we can approximate the exact solutions of the sub-Laplacian on SE(3) by \(||\mathbf c ||_{1,16}^{2-Q}\), with homogeneous dimensions \(Q=9\), as the exact solution of the sub-Laplacian on the approximation group \((\mathrm{SE}(3))_0\). The group \((\mathrm{SE}(3))_0\) that locally approximates SE(3) is again obtained via a nilpotent step 2 approximation of the BCH formula and is defined by the group product

$$\begin{aligned}&(x^1,x^2,x^3,x^4,x^5,x^6) \cdot (y^1,y^2,y^3,y^4,y^5,y^6) \nonumber \\&\quad \quad = \left( \begin{array}{c} x^1 + y^1 + \frac{1}{2}(x^5y^3 - x^3y^5)\\ x^2 + y^2 + \frac{1}{2}(x^3y^4 - x^4y^3) \\ x^3 + y^3 \\ x^4 + y^4 \\ x^5 + y^5 \\ x^6 + y^6 + \frac{1}{2}(x^4y^5 - x^5y^4) \\ \end{array}\right) ^T, \end{aligned}$$
(26)

with \(x^i,y^i\) coordinates of the first kind given by the logarithmic map (22). This new group is a free-nilpotent group of rank 3 and step 2.

We approximate the sub-Riemannian distance \(d_{0}\) on SE(3) via the norm (25). That is,

$$\begin{aligned} d_{0}(g,h) \approx |{\text {Log}}_{\mathrm{SE}(3)}(g^{-1}h)|_{\xi ,\zeta }, \end{aligned}$$
(27)

and as such again obtain an approximation of the distance in the sense of Rothschild and Stein [50]. Based on the quantitative comparison to the sub-Riemannian distances \(d_0\) in “Appendix A” and the visualizations in Fig. 4 of the level sets we conclude that the approximated sub-Riemannian distance of (27) quite accurately approximates the true sub-Riemannian distance on SE(3). In our analysis we found that the logarithmic norm with \(\zeta =100\) gave the best approximation, and as such we used this norm in the perceptual grouping experiments of Sec. 5.2.

Remark 2

The glyph at each grid point \(\mathbf {y}\) in Fig. 4 is given by the surface \(\{\mathbf {y} + \nu U(\mathbf {y}, \mathbf {n})\mathbf {n} | \mathbf {n}\in S^2\}\), for a specific choice \(\nu > 0\), and with density \(U:\mathbb {R}^3\times S^2\rightarrow \mathbb {R}^+\). The color of each orientation \(\mathbf {n} = (n^1,n^2,n^3) \in S^2\) on the glyph is defined by the RGB color \((n^1,n^2,n^3)\).

4 Perceptual Grouping, Fast Marching and Key Point Tracking

In this section the algorithms used in this paper are explained. Our main application of interest is that of grouping/clustering of points on blood vessels via the perceptual grouping algorithm, which is explained in Sect. 4.1. The perceptual grouping algorithm takes as input a set of key points that are obtained via the minimal path tracking with key points algorithm [5], explained in Sect. 4.3, which is an adaptation of the fast marching algorithm, explained in Sect. 4.2. Finally since different metrics are used throughout the experiments (both for generating key points and for perceptual grouping) we end this section with an overview of the used metrics in this paper in Sect. 4.4.

4.1 The Perceptual Grouping Algorithm

The perceptual grouping algorithm presented in this paper is a modification of the original algorithm proposed by Cohen [13], and which was later adapted for perceptual grouping based on anisotropic distances [8]. In recent work [11], the perceptual grouping algorithm was extended for the grouping of n closed contours for an a priori specified n. Like in [8] and [11], we use the main algorithm of [13] as a backbone, but we change the metric used for perceptual grouping and we impose an additional constraint to avoid closed loops (which are physically not realistic in the vessel networks of interest). Our adapted perceptual grouping algorithm is given in pseudo-code in Algorithm 1.

figure d

The goal of the perceptual grouping algorithm is to construct a graph out of a set \(\mathcal {S}\) of points of interest in which the edges \(\mathcal {D}_\mathcal {S}\) are true connections (represented by geodesics) between points. Following the terminology of [5, 10, 17] we will refer to the points of interest as key points. Each key point is only linked to at most 2 other key points (i.e., node degree \(\delta _i\) is 2 at most). The final graph thus only contains sets of non-bifurcating vessel segments. The graph is build up by inserting one by one the edges which have the shortest geodesic distance (if the node degree allows). As such, only the strongest connections (shortest geodesics) appear in the final graph network. Since the original algorithm in [13] (and also [8]) does not include a mechanism to avoid closed loops we include an additional check in the main algorithm to prevent this. Finally, in order to avoid connecting key points which are too far apart from each other we only consider edges of which the spatial arc length of the connecting geodesic does not exceed a certain a priori threshold \(s_{max}\).

In summary our changes relative to the works [8, 11, 13] are that we

  • keep the choice for distance \(d(x_i,x_j)\) open. In our experiments the distances d will be mainly based on sub-Riemannian geometry in SE(n).

  • explicitly avoid making long distance connections by filtering out such possible connections in an initialization step.

  • avoid closed loops by not making connections between nodes that belong to the same subgraph.

  • group crossing lines without pre-specifying the number of groups.

In particular, it is the use of a sub-Riemannian metric on SE(n) that allows for the grouping of crossing lines. A first (successful) feasibility study on the possibility of perceptual grouping of crossing lines was performed by Chen et al. [11] using a (sub-)Finsler metric (based on the Euler elastica model) on position orientation space. There it was successfully demonstrated on phantom images that their algorithm is able to deal with crossing closed contours; however, it required specification of the number of contours (which is not always a priori known). Furthermore, their metric relies on a notion of directionality (instead of just orientations) which is useful in grouping closed contours, but may be disadvantages for grouping non-closed contours. Here, we focus on the grouping of non-closed crossing contours without specifying the number of contours. Furthermore, we quantify the performance of perceptual grouping of crossing lines on a large set of both retinal images in 2D, and phantom images in 3D.

4.2 Fast Marching

Most of the distances (except for the fast analytic approximations) and the geodesics used in this paper are computed via the fast marching algorithm, which is an efficient numerical solver of the eikonal equation and which can be used to obtain globally optimal geodesics [14]. Let \(g_0\) be an arbitrary source point in a domain \(\mathbb {M}\) of interest, let \(\mathcal {G}|_{g}: T_g(\mathbb {M}) \times T_g(\mathbb {M}) \rightarrow \mathbb {R}^+\) be a metric tensor defined on the tangent space \(T_g(\mathbb {M})\) at \(g\in \mathbb {M}\), and let

$$\begin{aligned} U(g):=d(g_0,g) = \underset{\gamma \in \mathcal {S}(g_0,g)}{{\text {inf}}} \int _0^1 \sqrt{ \left. \mathcal {G}\right| _{\gamma (t)}(\dot{\gamma }(t),\dot{\gamma }(t)) }\mathrm{d}t \end{aligned}$$
(28)

its associated distance map, where the infimum is taken over the set \(\mathcal {S}(g_0,g)\) of Lipschitz continuous curves with \(\gamma (0)=g_0\), \(\gamma (1) = g\), and with \(\dot{\gamma }(t)\in T_{\gamma (t)}(\mathbb {M})\). Then the distance map U is the unique viscosity solution of the eikonal equation

$$\begin{aligned} \left\{ \begin{array}{l} \sqrt{\mathcal {G}\left( \nabla _\mathcal {G} U(g),\nabla _\mathcal {G} U(g) \right) } = 1,\\ U(g_0) = 0, \end{array} \right. \leftrightarrow \left\{ \begin{array}{l} ||\nabla _\mathcal {G} U(g) ||_\mathcal {G} = 1,\\ U(g_0) = 0, \end{array} \right. \end{aligned}$$
(29)

with \(\nabla _\mathcal {G} := \mathcal {G}^{-1} \mathrm{d}U\) the intrinsic gradient with inverse metric \(\mathcal {G}^{-1}\) and \(\mathrm{d}U\) the differential of U, and \(||\cdot ||_\mathcal {G}\) the norm with respect to the metric tensor. In the standard (data-adaptive) Euclidean case with \(\mathbb {M}=\mathbb {R}^2\), \(g_0=\mathbf {0}\), \(g=\mathbf {x}\), \(\dot{\gamma }(t) = u^1(t) \partial _x + u^2(t) \partial _y \in T_{\gamma (t)}(\mathbb {R}^2)\), and with \(\mathcal {G}|_{\gamma (t)}(\dot{\gamma }(t),\dot{\gamma }(t)) = C(\gamma (t))^2 (|u^1(t)|^2+|u^2(t)|^2)\) the eikonal equation is given by \(||\nabla U (\mathbf {x})||= C(\mathbf {x})\).

The fast marching algorithm efficiently solves the eikonal equation in a one pass algorithm. It computes the values of U in increasing order [starting with \(U(g_0)=0\)] based on the Bellman principle of optimality, in a manner very similar to the Dijkstra algorithm for shortest paths on graphs [18]. The minimal geodesic connecting \(g_0\) with g is then obtained via a gradient descent on U from g back to the origin \(g_0\), i.e., solving the ODE

$$\begin{aligned} \left\{ \begin{array}{l} \dot{\gamma }(t) \propto -\mathcal {G}^{-1} \mathrm{d}U(\gamma (t)),\\ \gamma (0) = g_0. \end{array} \right. \end{aligned}$$

For details on the fast marching algorithm on isotropic manifolds we refer to [54, 56], to [32, 41] for anisotropic fast marching, and to [52] and [24] for fast marching in sub-Riemannian manifolds in SE(2) and SE(3), respectively.

4.3 Generating Key Points

The key point method is based on keeping track of a spatial arc length map \(U_l\) (in which the spatial lengths of the minimizing geodesics \(\gamma \) defining U are stored) and stops as soon as a certain distance threshold is passed [17]. The rationale behind this algorithm is that among all points with equal geodesic distance values U, the points reached by geodesics \(\gamma \) that best follow the data (paths along which C is low) have maximum spatial distance \(l(\gamma )\). Such a point maximizing spatial distance in a given level set in U is called a key point. The fast marching algorithm is highly suited for keeping track of a spatial arc length map \(U_l\), in addition to U, due to the local updating approach (wavefront propagation). Moreover, the algorithm can stop early if one is only interested in finding the first key point with length larger than \(l_{max}\) [17].

In summary a key point is detected as follows. The spatial arc length map is defined as

$$\begin{aligned} U_l(g):= l(\gamma _{g_0,g}), \end{aligned}$$
(30)

with \(\gamma _{g_0,g}= \underset{\gamma \in \mathcal {S}(g_0,g)}{{\text {argmin}}} \int _0^1 \sqrt{ \left. \mathcal {G}\right| _{\gamma (t)}(\dot{\gamma }(t),\dot{\gamma }(t)) }\mathrm{d}t\) the minimizing geodesic in (28), and with

$$\begin{aligned} l(\gamma ) = \int _0^1 ||\dot{\mathbf {x}}(t) ||\mathrm{d}t \end{aligned}$$
(31)

the spatial arc length of \(\gamma \), with \(\dot{\mathbf {x}}(t) = \mathbb {P}_{\mathbb {R}^n} \dot{\gamma }(t) \in \mathbb {R}^n\) the spatial components of the tangents \(\dot{\gamma }(t)\).Footnote 4 The fast marching algorithm stops as soon as there is a g for which \(U_l(g)\ge l_{max}\), and g will be called a key point.

With the above criteria one can iteratively detect new key points based on geodesic distances to previously found key points, a method known as minimal path tracking with key point detection [5]. One can make several choice on when to stop the key point tracking algorithm [5, 10, 34]. In this work we rely on the approach by Chen et al. [10], where we only add key points on locations which lie in a masked region (we use a binary vessel centerline mask \(m:\mathbb {M}\rightarrow \{0,1\}\)), i.e., we only add a key point when both \(U_l(g)\ge l_{max}\) and \(m(g)=1\). The algorithm is stopped as soon \(U_l(g)\ge 3 \; l_{max}\).

4.4 Overview of Distances Used in this Paper

Table 1 gives an overview of the different distances discussed in this paper and used in the experiments. The isotropic Euclidean metrics are used the generate key points in \(\mathbb {R}^2\) and \(\mathbb {R}^3\) using the algorithm of Subsec. 4.3. The isotropic Euclidean distances are also used in comparison to the other distances on SE(n) in the perceptual grouping experiments. The sub-Riemannian distances on SE(2) and SE(3) are explained, respectively, in Sects. 2.2 and 3.2. In the Riemannian distances the full tangent bundle on SE(n) is considered. This means that now also non-horizontal curves in SE(n) are considered, i.e., points on the curves \(\gamma \) are allowed to move sideways by the non-horizontal controls \(u^3(t)\) in the SE(n) case, and \(u^1(t), u^2(t)\) in the SE(3) case. Recall that in this case the blue and red oriented particles in Fig. 1 do have the same distance to the source (black arrow). Finally, the sub-Riemannian distance approximations, denoted by \(|{\text {Log}}_{\mathrm{SE}(n)}(g^{-1}h)|_{\xi ,\zeta }\), are discussed and defined in, respectively, Sect. 2.3 and Eq. (16) for SE(2) and Sect. 3.3 and Eq. (25) for SE(3).

Table 1 Overview of the metrics used in this paper

4.4.1 The Cost C

The cost functions C are constructed from functions \(U_f:\mathbb {R}^n \times S^{n-1}\rightarrow \mathbb {R}\) on the orientation-lifted space. These functions \(U_f\) are obtained via an orientation score transform [21, 31] of image \(f:\mathbb {R}^n\rightarrow \mathbb {R}\) by correlating the image with a set of anisotropic wavelets \(\psi :\mathbb {R}^n \rightarrow \mathbb {R}\):

$$\begin{aligned} U_f(g) = ( \mathcal {U}_g \psi , f )_{\mathbb {L}_2(\mathbb {R}^n)}, \end{aligned}$$
(32)

with \(( f , g )_{\mathbb {L}_2(\mathbb {R}^n)}=\int _{\mathbb {R}^n} \overline{f(\mathbf {x})}g(\mathbf {x}) \mathrm{d}\mathbf {x}\) the standard inner product on \(\mathbb {L}_2(\mathbb {R}^n)\), with the overline denoting complex conjugation, and where \(\mathcal {U}_g\) denotes the left regular representation of the Lie group on images f. For the group SE(2) acting on images \(f \in \mathbb {L}_2(\mathbb {R}^2)\) it is defined as

$$\begin{aligned} (\mathcal {U}_g f)(\mathbf {y}):=f(\mathbf {R}_\theta ^{-1}( \mathbf {y} - \mathbf {x}) ) \end{aligned}$$

with \(g = (\mathbf {x},\theta ) \in \mathrm{SE}(2)\) (recall the group definitions in Sect. 2.1.1). For the group SE(3) acting on images \(f \in \mathbb {L}_2(\mathbb {R}^3)\) it is defined as

$$\begin{aligned} (\mathcal {U}_g f)(\mathbf {y}):=f(\mathbf {R}_{\mathbf {n}}^{-1}( \mathbf {y} - \mathbf {x}) ) \end{aligned}$$

with \(g = (\mathbf {x}, \mathbf {R}_{\mathbf {n}}) \in \mathrm{SE}(3)\) (recall the group definition in Sect. 3.1.1).

The wavelets used in the orientation score transform [21, 31] are designed in such a way that all rotated version together cover the full Fourier spectrum. With this design no data are lost in the transformation and a stable invertible transform (from orientation score) back to image exists. For details on this wavelet-type transform for lifting 2D images to functions on SE(2) we refer to [21], and for lifting 3D images to 3D orientation scores we refer to [31]. In all experiments we define the cost in the following form

$$\begin{aligned} C(g) = \frac{1}{1 + \lambda \mathcal {V}(g)^p}, \end{aligned}$$
(33)

with \(\mathcal {V}\) a vessel (or centerline) enhancement obtained by processing of the orientation score \(U_f\), and which is normalized between 0 and 1. Parameters \(\lambda \) and p then control, respectively, the influence of the cost (data adaptivity) and p the contrast.

Good choices for \(\mathcal {V}\) for tracking of vessels in 2D position orientation space may be via the vessel enhancements of [58] or [30], similar to the SE(2) tracking experiments in [3]. For tracking in 3D orientation scores \(\mathcal {V}\) may be obtained via the crossing preserving vessel enhancements of [19]. In related tracking problems in lifted spaces the lifts are obtained via tubularity measures [11, 36, 38], or by correlating the image with a set of rotated templates [44].

4.4.2 Projective Line Bundle

Finally, we remark that when dealing with geodesic distances in SE(n) we have to take into account that these are defined for positions and orientations on the full sphere \(S^{n-1}\). The distances discussed in this paper thus make a distinction between forward and backward arrival directions, i.e., \(d(e,(\mathbf {x},\theta )) \ne d(e,(\mathbf {x},\theta + \pi ))\).

In practice, and in particular in our perceptual grouping problem, we often do not know the direction of the vessel, but we only have orientations. As such, we would actually want to compute distances on the projective line bundle \(\mathbb {R}^n \times P^{n-1}\), with \(P^{n-1} := S^{n-1}/\sim \) with identification of antipodal points \(\mathbf {n}_1 \sim \mathbf {n}_2 \leftrightarrow \mathbf {n}_1 = \pm \mathbf {n}_2\). We define the distances \(\tilde{d}\) on the projective line bundle by distances d on SE(n) via

$$\begin{aligned} \tilde{d}(g,(\mathbf {x},\mathbf {n})) = {\text {min}}\left\{ d( g , (\mathbf {x},\mathbf {n})), d( g , (\mathbf {x},-\mathbf {n}) ) \right\} , \end{aligned}$$
(34)

with \(n \in S^{n-1}\), and \(g,(\mathbf {x},\pm \mathbf {n})\in \mathrm{SE}(n)\). Note that in the SE(2) case we have with \(\mathbf {n}(\theta )= (\cos \theta , \sin \theta ) \leftrightarrow \theta \) and \(-\mathbf {n}(\theta )=\mathbf {n}(\theta +\pi )\). For a more detailed analysis on data-adaptive sub-Riemannian geodesics on the 2D projective line bundle we refer [2].

Table 2 Perceptual grouping performance for the 2D retinal image experiments in terms of percentage of correct key point connections (# of false connections in parentheses)
Fig. 5
figure 5

Example 1 of the retinal vessel grouping experiments. Each connected component has its own color (note that the colors might not match between experiments as the number of recovered components may differ), and false connections are indicated in red. Top row: experiments with data-adaptive distances (\(C\ne 1\)), and the ground-truth vessel components including the automatically generated key points. Bottom row: experiments without data-adaptive distance (\(C=1\)) (Color figure online)

Fig. 6
figure 6

Example 2 of the retinal vessel grouping experiments. Each connected component has its own color (note that the colors might not match between experiments as the number of recovered components may differ), and false connections are indicated in red. Top row: experiments with data-adaptive distances (\(C\ne 1\)), and the ground-truth vessel components including the automatically generated key points. Bottom row: experiments without data-adaptive distance (\(C=1\)) (Color figure online)

5 Experiments

In the experiments we aim to quantify the performance of perceptual grouping with different distances. For a fair comparison we therefore generate automatically the most reasonable key points by using a vessel centerline mask \(m:\mathbb {R}^n \rightarrow [0,1]\) (see Sect. 4.3) based on the ground-truth data. Moreover, this guarantees that the key points are always located on the ground-truth centerlines, which allows us to quantify performance using the ground-truth data. In both the 2D and 3D case the key points are then generated using the isotropic Euclidean metric tensor, and with \(\mathcal {V}(\mathbf {x})=m(\mathbf {x})\) (see Sect. 4.4.1). In all experiments we set \(p=1\), \(\lambda =100\) to compute the cost [cf. Eq. (33)].

In the perceptual grouping experiments the cost functions are constructed from orientation score transforms \(U_f\) of the mask m on \(\mathbb {R}^n\). The costs on SE(n) are then constructed via the modulus of the score:

$$\begin{aligned} \mathcal {V}(g)=\mathcal {V}_{\mathrm{SE}(n)}(g):=|U_f(g)|. \end{aligned}$$
(35)

For equal comparison the costs on \(\mathbb {R}^n\) are then constructed via \(V(\mathbf {x}) = \underset{\mathbf {n}\in S^{n-1}}{{\text {max}}}\mathcal {V}_{\mathrm{SE}(n)}(\mathbf {x},\mathbf {n})\), i.e., via a maximum intensity projection over orientations \(\mathbf {n}\).

5.1 Perceptual Grouping in SE(2)

5.1.1 Experimental Setup

The data for the 2D retinal vessel grouping experiments consist of 52 retinal image patches in which the vessels have complicated topologies (each patch contains at least 1 crossing, and at least 1 bifurcation). For each retina patch the centerlines were semiautomatically traced, after which the connectivity (bifurcation relations) between the vessel segments were manually determined. The set of images contained in total 313 separate vessel segments. A connection between two nodes was determined to be a true positive if both nodes lie on the same vessel tree.

Table 3 Perceptual grouping performance for the 3D synthetic volume experiments in terms of percentage of correct key point connections (# of false connections in parentheses)
Fig. 7
figure 7

Example 1 of the 3D synthetic vessel grouping experiments. Each connected component has its own color (note that the colors might not match between experiments as the number of recovered components may differ), and false connections are indicated in red. Top row: experiments with data-adaptive distances (\(C\ne 1\)), and the ground-truth vessel components including the automatically generated key points. Bottom row: experiments without data-adaptive distance (\(C=1\)) (Color figure online)

The minimum distance between key points in the retina experiments (with patch sizes of \(\approx 400\times 400\) pixels) was set to \(l_\mathrm{max}=30\) pixels. The maximum geodesic arc length distance in the perceptual grouping algorithm was set to \(s_\mathrm{max}=80\) pixels. The orientations \(\theta \) at each key point \(\mathbf {x}\) was estimated by the orientation that gave maximum response in the orientation score, i.e., \(\theta =\underset{\theta \in S^1}{{\text {argmax}}} \; \mathcal {V}_{\mathrm{SE}(2)}(\mathbf {x},\theta )\). The circle \(S^1\) was sampled with \(N_\theta = 32\). All distances were computing via the fast marching algorithm of [41, 42] except for the sub-Riemannian approximations, which were computed directly using (9) and (15). The position orientation balancing parameter was set to \(\xi =0.01\).

5.1.2 Results

Table 2 gives a quantitative overview of the results, and Figs. 5 and 6 show the results on two of the 52 retina patches. From Table 2 we make the following observations and conclusions:

  1. 1.

    Perceptual grouping is preferred in the lifted domain SE(2) instead of in the based domain \(\mathbb {R}^2\). This suggest that taking orientation into account in the grouping is essential.

  2. 2.

    A sub-Riemannian geometry on SE(2) is preferred over a (\(\xi \)-isotropic) Riemannian geometry. This suggests that a sub-Riemannian geometry is necessary to deal with the complex geometry at crossings and parallel tracks (cf. Figs. 5 and 6).

  3. 3.

    The results obtained with the sub-Riemannian distances on SE(2) for \(C=1\) are almost equal. This suggests that the approximations are quite accurate, and that for \(C=1\) the analytic approximations may be preferred due to speed and algorithm complexity considerations.

  4. 4.

    Overall, results for \(C\ne 1\) are better than for \(C=1\). Note, however, that the sub-Riemannian distances on SE(2) for \(C=1\) are still better then the Euclidean distance on \(\mathbb {R}^2\) and Riemannian distance on SE(2) for \(C\ne 1\), and only slightly under performs relative to the sub-Riemannian \(C\ne 1\) case. This again shows that sub-Riemannian geometry is preferred, whether data are included in the metric tensors or not.

We conclude that in perceptual grouping of 2D vessels a sub-Riemannian geometry in SE(2) is preferred over a Euclidean geometry in \(\mathbb {R}^2\), or a Riemannian geometry in SE(2). When accurate vesselness maps are available, it is preferable to use these in the distances. Furthermore, if one aims to design a easy to implement and efficient perceptual grouping pipeline, approximate sub-Riemannian distances should be used. With only a 2D key point tracking algorithm, a method for estimating orientations and the analytic approximate distances (15) one obtains very accurate grouping results.

5.2 Perceptual Grouping in SE(3)

5.2.1 Experimental Setup

To quantify and study the influence of different distances in perceptual grouping algorithms for 3D vessels we make use of synthetic 3D images. For these experiments 10 volumes were generated, each with 6 random paths. Each path was generated with a Monte Carlo simulation of a random walk in SE(3) (see, e.g., [59, Ch. 3.5]). Due to the random construction it might occur that 2 paths cross each other. This is physiologically unrealistic (vessels in 3D might bifurcate or touch, but never grow through each other), but it does make the experiments more challenging.

Fig. 8
figure 8

Mean squared errors between the sub-Riemannian distance on SE(2) [see Eq. (7)] and its approximation [see Eq. (16)]. The errors is computed for varying choices of \(\zeta \) and on a varying grid size (from \(x,y\in [-0.5,0.5]\) to \(x,y\in [-4,4]\))

Fig. 9
figure 9

Mean squared errors between the sub-Riemannian distance on SE(3) [see Eqs. (7), (24)] and its approximation [see Eq. (27)]. The errors is computed for varying choices of \(\zeta \) and on a varying grid size (from \(x,y,z\in [-0.5,0.5]\) to \(x,y,z\in [-4,4]\))

For each volume a binary centerline mask was constructed using the generated ground-truth paths. The volumes were of size \(51\times 51\times 51\) voxels. The distance between key points was set to \(l_\mathrm{max}=5\) voxels. The maximum geodesic arc length distance in the perceptual grouping algorithm was set to \(s_\mathrm{max}=15\) voxels. The orientation at each key point was again estimated as the orientation that gave maximum response in \(\mathcal {V}_{\mathrm{SE}(n)}\) [Eq. (35)]. The sphere \(S^2\) was sampled with 200 orientations using Euler angles with \(\mathbf {n}(\beta ,\gamma ) = \mathbf {R}_{\gamma ,\beta ,\alpha }.\mathbf {e}_z\), with \(\beta \in \{\frac{\pi }{2 N_\beta }, 2\frac{\pi }{2 N_\beta }, \ldots ,\pi - \frac{\pi }{2 N_\beta }\}\), \(\gamma \in \{0, \frac{\pi }{N_\beta }, \ldots ,2\pi - \frac{\pi }{N_\beta }\}\), with \(N_\beta =10\), and with \(\mathbf {R}_{\gamma ,\beta ,\alpha }\) given by (19). In the lifted metric tensor we set \(\xi =1\).

5.2.2 Results

Table 3 gives a quantitative overview of the results, and Fig. 7 shows the results on one of the ten synthetic volumes. From Table 3 we can draw the same conclusions as for the SE(2) case (using a sub-Riemannian geometry and including data adaptivity improves results). Here, however, we make two additional observations

  1. 1.

    Data-adaptive fast marching seems less sensitive to the choice of metric, but tracking in the lifted domain SE(3) still improves results. This can be explained by the fact that the volume is relatively sparse, and by the fact that the cost function C is constructed from ground-truth data (the best possible cost). If the cost function dominates the metric, then the intrinsic energy/geometry has a smaller influence.

  2. 2.

    Out of all \(C = 1\) distances (no data adaptivity) the grouping via the nilpotent distance approximations on SE(3) give best performance, even better then for the true sub-Riemannian distance. This can be explained by the fact that for long distances from the origin, the approximation gradually looses their sub-Riemannian nature and allows more non-horizontal behavior, as in the Riemannian case. It could be that, due to the discrete sampling of the sphere, not all orientations are accurately estimated. The grouping based on the sub-Riemannian distance approximations seems less sensitive to such errors.

6 Conclusion

In this paper we have proposed an efficient approach for perceptual grouping of local orientations via nilpotent approximations of sub-Riemannian distances in the roto-translation group SE(n). The quantitative experiments on grouping of retinal blood vessels in 2D images, and perceptual grouping in challenging 3D synthetic volumes, showed that (1) sub-Riemannian geometry is essential in achieving top performance and (2) that the grouping approach via the fast analytic approximations performs almost equally, or better, than the data-adaptive fast marching approaches.

The sub-Riemannian distances on SE(2) and SE(3) were approximated via norms on exponential coordinates of the first kind (obtained via the logarithmic map). In both quantitative and visual comparison it was found that the approximations accurately follow the true sub-Riemannian distances, a conclusion which was further supported by the equal performance in quantitative perceptual grouping experiments. We also numerically showed that the weighted logarithmic norms used in this paper provide a more accurate approach for approximating the heat kernel and fundamental solution of the sub-Laplacian on SE(n), compared to previous approaches [12, 22, 47].

Since the sub-Riemannian distance approximations are analytic, they are easy to implement and fast to compute. An interesting line of further research would be to embed the sub-Riemannian distances in other algorithms that rely on the quantification of the distance between local orientations. The results of this paper could be further improved by augmenting the sub-Riemannian distances with additional features (like cross-sectional profile descriptors) and use a global graph optimization approach as in [26, 57]. The potential of using sub-Riemannian distances in such problems is demonstrated by the experiments of this paper.