From Deterministic to Stochastic Methods for Shape From Shading Pascal Daniel and Jean-Denis Durou
IRIT - Institut de Recherche en Informatique de Toulouse UPS, 118 route de Narbonne, 31062 Toulouse Cedex 04, France Tel: (+33) 561 556 882 - Fax: (+33) 561 556 258 e-mail:
[email protected],
[email protected]

Shape from shading designates the problem of rebuilding the shape of a surface from a single image of this surface. We compare different iterative methods of resolution of shape from shading. An analysis of the various discrete formulations of the problem is made, which allows us to propose a certain number of error functions. Three deterministic methods to minimize these errors are tested, including the new method that we propose, which we call "optimal gradient descent", and which is the only one that guarantees convergence. In addition, we show that it is necessary to use stochastic methods to improve the results given by the deterministic methods. Finally, we confirm that the direct resolution of the problem with the height as the only unknown leads to failure.

Keywords: shape from shading, gradient descent method, minimization problem.
1. Introduction

Shape from shading is an inverse problem of image analysis considered difficult to solve, insofar as no general method of resolution has been found to date. This field, initially formulated by B. Horn [5], consists of rebuilding the relief of an object represented on an image, thanks to the illumination (or shading) information of this image.
1.1. The Image Irradiance Equation

Let us start by describing the ideal conditions which alone allow a simple mathematical formulation of the problem. The image is sharp and is considered, up to a scale factor called magnification, to be the result of the orthogonal projection of a scene on the focal plane of the observer. The photosensitive receiver is linear. Each light source is placed at infinity and produces a parallel and uniform light beam, which can be described by a single vector. The surface is opaque, has the same photometric characteristics at each point (homogeneous surface) and presents neither hidden part nor edge. The physical space is mapped onto an orthogonal frame Oxyz, so that Oxy coincides with the focal plane of the observer and Oz is directed towards the observer. Let us suppose that the surface to be rebuilt can be represented by the height function z(x, y), which is the unknown of the problem. As it was supposed that the observed surface presents neither hidden part nor edge, it follows that this function is continuous and differentiable. The normal vector of the surface, denoted by \vec{N} (outgoing normal of norm 1), is then defined at each point of the surface and has as coordinates:
\vec{N} = \frac{1}{\sqrt{1 + p^2 + q^2}} \begin{pmatrix} -p \\ -q \\ 1 \end{pmatrix},   (1)
where p = ∂z/∂x and q = ∂z/∂y. If the grey level image of the observed surface is represented by the function E(x, y), then the mathematical formulation of shape from shading results in the image irradiance equation [5]:

R(\vec{N}) = E(x, y),   (2)
where R is a function called the reflectance map [5]. This function is a priori unknown; without knowing it, one cannot go further in the resolution of equation (2). Let us now deal with a particular case for which the function R has a precise analytical expression.
1.2. The Eikonal Equation

Henceforth, we suppose that there is a single light source, described by a single vector \vec{S} parallel to the direction of observation, i.e.:

\vec{S} = S \begin{pmatrix} 0 \\ 0 \\ -1 \end{pmatrix}.   (3)
Moreover, we suppose that the surface is perfectly diffusing (or Lambertian), which implies [2] that the function R is written in the following form:

R(\vec{N}) = k \, \vec{S} \cdot \vec{N},   (4)
where k is a constant which depends on the albedo of the surface as well as on certain parameters of the observer. Thanks to the expressions (1) and (3) of the vectors \vec{N} and \vec{S}, and using the expression (4) of the reflectance map, the image irradiance equation (2) can thus be rewritten in the following form:

\frac{kS}{\sqrt{1 + p^2 + q^2}} = E(x, y).   (5)
One sees from equality (5) that the maximum value E_max of E(x, y) is reached when p = q = 0 and is equal to kS, which enables us to reformulate (5) in the following way:

\frac{E_{max}}{\sqrt{1 + p^2 + q^2}} = E(x, y).   (6)
This equation is commonly called the eikonal equation [2], and often appears in the following equivalent form:

p^2 + q^2 = \left( \frac{E_{max}}{E(x, y)} \right)^2 - 1,
but we prefer, within the framework of this paper, to preserve the formulation (6). It is a nonlinear first-order partial differential equation, having the function z as unknown. Despite its simple formulation, this equation still raises serious difficulties in its resolution.
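As an illustration of relation (6), the forward problem (rendering a grey-level image from a known height map) is straightforward; the sketch below generates a synthetic image of a spherical cap. The grid size, the radius and the use of `np.gradient` to estimate p and q are illustrative choices, not taken from the paper.

```python
# Sketch (illustrative, not the paper's code): rendering a synthetic image
# with the eikonal relation E(x, y) = E_max / sqrt(1 + p^2 + q^2).
import numpy as np

def render_eikonal(z, e_max=255.0):
    """Render a grey-level image from a height map z using equation (6)."""
    p = np.gradient(z, axis=1)   # approximation of dz/dx (second index)
    q = np.gradient(z, axis=0)   # approximation of dz/dy (first index)
    return e_max / np.sqrt(1.0 + p**2 + q**2)

# A spherical cap of radius 100 on a plane, on a 256 x 256 grid.
n, r = 256, 100.0
y, x = np.mgrid[0:n, 0:n] - n / 2.0
cap = np.sqrt(np.maximum(r**2 - x**2 - y**2, 0.0))
image = render_eikonal(cap)      # brightest (E_max) where the surface is flat
```

At the top of the cap p = q = 0, so the rendered grey level reaches E_max there, as stated below equation (5).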
1.3. Resolution of the Eikonal Equation

The resolution of the eikonal equation has already been attempted using the various usual methods of resolution of partial differential equations, namely:
• Resolution by a power series expansion [2, 9]: this method of resolution supposes that the functions E(x, y) and z(x, y) are analytic (thus, in particular, infinitely differentiable!). For this reason, it cannot be used for digital images; it primarily made it possible to show the existence of cases for which the eikonal equation has a continuous infinity of analytic solutions, thereby proving the possibility of deformations that are invisible in the image.
• Resolution by the characteristic strips expansion [5]: this method provides an integral resolution along quite particular lines, provided the values of z, p and q are known at at least one point of each of these lines. Nevertheless, the accumulation of errors makes this method unusable on noisy images.
• Resolution by viscosity solutions [11, 4]: these solutions, whose introduction goes back only about twenty years, have the advantage of not necessarily being differentiable, and in particular of including surfaces having edges. Moreover, provably convergent schemes exist, allowing numerical approximations of these solutions to be calculated, provided one knows the solution on the boundary of the rebuilding domain, a term which designates the part of the image on which one seeks to calculate the shape.
Rather than discretizing a method of continuous resolution of the eikonal equation, it can seem more logical to start by discretizing this equation, before solving the system of equations thus obtained. A large majority of the resolution methods of shape from shading which have given more or less satisfactory results [8, 7, 6, 12] are iterative methods which indeed try to solve, at each pixel of the rebuilding domain, the discretized eikonal equation. Horn and Brooks also demonstrated [7] that these methods of resolution could be regarded as resulting from the discretization of resolution methods in a continuous setting. The aim of this paper is to conduct a comparative study of the various iterative methods of resolution of shape from shading, in the particular case of the eikonal equation. In section 2, we start by clearly defining the equations coming from the discretization of the eikonal equation. In section 3, we review the various types of methods allowing the solution of these equations, and we define some quality criteria for these methods. Finally, section 4 presents the results of several series of tests.
2. Discretization of the Eikonal Equation

One says of a system of equations that it is well-constrained when its number of solutions is neither infinite nor null. Obviously, the majority of systems are badly-constrained, i.e., they admit a number of solutions that is either null (over-constrained systems) or infinite (under-constrained systems). We will now write the equations resulting from the discretization of the eikonal equation, and try to determine whether the obtained system is well-constrained or not.
2.1. Systems of Discrete Equations resulting from the Eikonal Equation

The rebuilding domain of the studied image, D, will be made up of m pixels at which one seeks to calculate the height z_{i,j}. One will say that a pixel is on the borderline of D if this pixel is not in D, but at least one of its eight closest neighbors is. One supposes that the height is known at every pixel of the borderline of D. The system to be solved consists of the eikonal equation (6), discretized at each pixel (i, j) of D, i.e.:

\frac{E_{max}}{\sqrt{1 + p_{i,j}^2 + q_{i,j}^2}} = E_{i,j}.   (7)
This system of equations, which we denote Σp,q, comprises at each pixel (i, j) of D two unknowns p_{i,j} and q_{i,j}. It is thus a system of m equations and 2m unknowns. At each pixel (i, j) of D, p_{i,j} can take any real value satisfying the following inequality:

p_{i,j}^2 \le \left( \frac{E_{max}}{E_{i,j}} \right)^2 - 1,

and q_{i,j} can take one of the two following values:

q_{i,j} = \pm \sqrt{ \left( \frac{E_{max}}{E_{i,j}} \right)^2 - 1 - p_{i,j}^2 }.
The number of solutions of the system Σp,q being infinite (and even uncountable), it follows that this system is under-constrained. In addition, one notices that it does not use the natural unknowns of shape from shading, which are the height values z_{i,j}. By making a certain choice of finite differences, one can nevertheless reveal these unknowns in (7). Fixing by convention the distance between two neighboring pixels at 1, one will choose, for example, as finite differences:

p_{i,j} = z_{i+1,j} - z_{i,j},   q_{i,j} = z_{i,j+1} - z_{i,j}.   (8)
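The finite differences (8) amount to simple array slicing; the sketch below is a minimal illustration, where the indexing convention z[i, j] and the test grid are our choices.

```python
# Sketch of the forward finite differences (8), with the inter-pixel
# distance fixed at 1 by convention, as in the text.
import numpy as np

def finite_differences(z):
    """Return (p, q) with p[i,j] = z[i+1,j] - z[i,j] and q[i,j] = z[i,j+1] - z[i,j]."""
    p = z[1:, :-1] - z[:-1, :-1]
    q = z[:-1, 1:] - z[:-1, :-1]
    return p, q

# A plane z = 2i + 3j has constant slopes: p == 2 and q == 3 everywhere.
z = np.fromfunction(lambda i, j: 2.0 * i + 3.0 * j, (5, 5))
p, q = finite_differences(z)
```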
Substituting the relations (8) into (7), one thus obtains the following new equations:

\frac{E_{max}}{\sqrt{1 + (z_{i+1,j} - z_{i,j})^2 + (z_{i,j+1} - z_{i,j})^2}} = E_{i,j}.
Let us denote this new system of equations Σz. It comprises m equations and m unknowns z_{i,j}, and thus seems better-constrained than the system Σp,q. Lastly, one can consider a third system of equations, denoted Σf,g, by introducing [8] the new unknowns f_{i,j} and g_{i,j}:

f_{i,j} = \frac{2 p_{i,j}}{1 + \sqrt{1 + p_{i,j}^2 + q_{i,j}^2}},   g_{i,j} = \frac{2 q_{i,j}}{1 + \sqrt{1 + p_{i,j}^2 + q_{i,j}^2}}.   (9)
The interest of these unknowns is that they always obey the inequality f_{i,j}^2 + g_{i,j}^2 ≤ 4, the equality being reached only for the pixels located on the silhouette of an object, pixels for which p_{i,j} or q_{i,j} is infinite [8]. To obtain the equations of the system Σf,g, it is necessary to use the inverses of the formulas (9), which are written:

p_{i,j} = \frac{4 f_{i,j}}{4 - f_{i,j}^2 - g_{i,j}^2},   q_{i,j} = \frac{4 g_{i,j}}{4 - f_{i,j}^2 - g_{i,j}^2}.   (10)
Substituting the expressions (10) into (7), one obtains the required equation:

E_{max} \left( \frac{4 - f_{i,j}^2 - g_{i,j}^2}{4 + f_{i,j}^2 + g_{i,j}^2} \right) = E_{i,j}.
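The change of unknowns (9) and its inverse (10) can be checked numerically by a round trip; the sketch below is a minimal illustration with an arbitrary test gradient.

```python
# Sketch of the change of unknowns (9) and its inverse (10): a round trip
# (p, q) -> (f, g) -> (p, q) recovers the initial gradients.
import numpy as np

def to_fg(p, q):
    """Formulas (9)."""
    d = 1.0 + np.sqrt(1.0 + p**2 + q**2)
    return 2.0 * p / d, 2.0 * q / d

def to_pq(f, g):
    """Inverse formulas (10)."""
    d = 4.0 - f**2 - g**2
    return 4.0 * f / d, 4.0 * g / d

p, q = 1.5, -0.7                 # arbitrary test gradient
f, g = to_fg(p, q)
p2, q2 = to_pq(f, g)             # round trip recovers (p, q)
bound = f**2 + g**2              # always <= 4, per the inequality above
```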
The system Σf,g, which comprises m equations and 2m unknowns, is also under-constrained. There are thus three different systems of equations, which all express the eikonal equation in a discrete form. It seems that Σz is more interesting than Σp,q and Σf,g because, on the one hand, it is the only one of the three whose unknowns are the natural unknowns of shape from shading and, on the other hand, it is the best constrained of the three. However, the majority of the iterative resolution methods of the literature use the systems Σp,q or Σf,g. We will see the reason for this choice in section 4. We will now associate with these three systems of equations several error functions.
2.2. Error functions associated with the systems of equations Σz, Σp,q and Σf,g

To solve the following system of equations Σ:

f_u(x_1, \ldots, x_n) = 0,  u \in [1, m],   (11)

one in general introduces an error function, which is written in the following way:

\varepsilon_0(x_1, \ldots, x_n) = \sum_{u=1}^{m} \lambda_u \, [f_u(x_1, \ldots, x_n)]^2.   (12)
In this expression, the coefficients λ_u are strictly positive constants, which guarantees that ε0 is a nonnegative function. Rather than solving the system Σ, one seeks an absolute minimum of this error. If Σ is an under-constrained system, each solution of Σ corresponds to a null value of the error ε0, and thus constitutes an absolute minimum of ε0. To decrease the number of absolute minima, one adds to ε0 an additional term of the following type, where λ is strictly positive: λ [g(x_1, \ldots, x_n)]^2. This amounts to adding to the system Σ the equation:

g(x_1, \ldots, x_n) = 0.   (13)
It is obvious that adding equation (13) to the system Σ constitutes a new system Σ' which is better-constrained than Σ. One thus now seeks the absolute minima of the following error function ε1:

\varepsilon_1(x_1, \ldots, x_n) = \varepsilon_0(x_1, \ldots, x_n) + \lambda \, [g(x_1, \ldots, x_n)]^2.

We will suppose (and this is true in the large majority of cases) that at least one absolute minimum remains for this new error. In addition, the number of absolute minima of ε1 is lower than or equal to the number of absolute minima of ε0, since the system Σ' is better-constrained than Σ. By taking each λ_u equal to 1 (all pixels being equivalent), the definition (12) enables us to associate the following errors with the systems Σz, Σp,q and Σf,g:

\varepsilon_{0z} = \sum_{(i,j) \in D} \left( \frac{E_{max}}{\sqrt{1 + (z_{i+1,j} - z_{i,j})^2 + (z_{i,j+1} - z_{i,j})^2}} - E_{i,j} \right)^2,

\varepsilon_{0p,q} = \sum_{(i,j) \in D} \left( \frac{E_{max}}{\sqrt{1 + p_{i,j}^2 + q_{i,j}^2}} - E_{i,j} \right)^2,

\varepsilon_{0f,g} = \sum_{(i,j) \in D} \left( E_{max} \, \frac{4 - f_{i,j}^2 - g_{i,j}^2}{4 + f_{i,j}^2 + g_{i,j}^2} - E_{i,j} \right)^2.
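As an aside, the error associated with Σp,q is simple to evaluate; the sketch below is a minimal illustration, where the small test arrays are our choices.

```python
# Sketch evaluating the error associated with the system Sigma_{p,q}
# (all lambda_u taken equal to 1, as in the text).
import numpy as np

def eps0_pq(p, q, E, e_max=255.0):
    """Sum over D of (E_max / sqrt(1 + p^2 + q^2) - E)^2."""
    residual = e_max / np.sqrt(1.0 + p**2 + q**2) - E
    return float(np.sum(residual**2))

# A flat, fully lit patch (p = q = 0, E = E_max) gives a null error.
p = np.zeros((4, 4))
q = np.zeros((4, 4))
E = np.full((4, 4), 255.0)
err = eps0_pq(p, q, E)
```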
These errors are generally called variations to data. Since the systems Σp,q and Σf,g are under-constrained, the error functions ε0p,q and ε0f,g each have an (uncountable) infinity of absolute minima. It is thus necessary to add an additional term to each of them. Let us start
with the error ε0p,q. It is clear that the unknowns p_{i,j} and q_{i,j} corresponding to a pixel (i, j) of D are not independent. Indeed, if the function z(x, y) is of class C^2, then, according to Schwartz's theorem:

\frac{\partial^2 z}{\partial y \partial x} - \frac{\partial^2 z}{\partial x \partial y} = 0 \iff \frac{\partial p}{\partial y} - \frac{\partial q}{\partial x} = 0.   (14)
The discretization of equation (14) can be written:

(p_{i,j+1} - p_{i,j}) - (q_{i+1,j} - q_{i,j}) = 0.   (15)
Adding to ε0p,q the additional term corresponding to equation (15), one obtains as new error function:

\varepsilon_{1p,q} = \varepsilon_{0p,q} + \lambda_{1p,q} \sum_{(i,j) \in D} (p_{i,j+1} - p_{i,j} - q_{i+1,j} + q_{i,j})^2.
If one seeks an error equivalent to ε1p,q for the system Σf,g, then it is necessary to express equation (15) in terms of the unknowns (f_{i,j}, g_{i,j})_{(i,j) \in D}. One finds:

(4 + f_{i,j}^2 - g_{i,j}^2)(f_{i,j+1} - f_{i,j}) - (4 - f_{i,j}^2 + g_{i,j}^2)(g_{i+1,j} - g_{i,j}) + 2 f_{i,j} g_{i,j} (g_{i,j+1} - g_{i,j} - f_{i+1,j} + f_{i,j}) = 0.   (16)
Denoting by (Schwartz(f, g))_{i,j} the left-hand side of this equation, the error ε1f,g is written:

\varepsilon_{1f,g} = \varepsilon_{0f,g} + \lambda_{1f,g} \sum_{(i,j) \in D} (Schwartz(f, g))_{i,j}^2.
The error ε1f,g has an analytical expression definitely more complicated than its equivalent ε1p,q. Moreover, its additional term results from equation (16), which is nonlinear, contrary to equation (15), and such additional terms generally tend to increase the number of relative minima of the error function, and thus make the search for an absolute minimum more delicate. For these two reasons, we prefer, like Horn and Ikeuchi [8], to add to the error ε0f,g additional terms coming from equations that are both simple and linear, as for example:

\frac{\partial f}{\partial x} = 0,  \frac{\partial f}{\partial y} = 0,   (17)
and two equivalent equations for g; these equations have no solutions other than surfaces of constant normal \vec{N}, i.e., plane surfaces. The equations (17) are probably not satisfied by the real shape, but they have the advantage of being both simple and linear. They enable us to introduce the error:

\varepsilon_{2f,g} = \varepsilon_{0f,g} + \lambda_{2f,g} \sum_{(i,j) \in D} \left[ (f_{i+1,j} - f_{i,j})^2 + (f_{i,j+1} - f_{i,j})^2 + (g_{i+1,j} - g_{i,j})^2 + (g_{i,j+1} - g_{i,j})^2 \right].
This additional term is often called a smoothing factor, because it tends to smooth the calculated shape. The introduction of an error ε2p,q equivalent to ε2f,g is then obvious, i.e.:

\varepsilon_{2p,q} = \varepsilon_{0p,q} + \lambda_{2p,q} \sum_{(i,j) \in D} \left[ (p_{i+1,j} - p_{i,j})^2 + (p_{i,j+1} - p_{i,j})^2 + (q_{i+1,j} - q_{i,j})^2 + (q_{i,j+1} - q_{i,j})^2 \right].
Lastly, we also planned to add to the system Σp,q the following simple and linear equation:

\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = 0 \iff \frac{\partial p}{\partial x} + \frac{\partial q}{\partial y} = 0,   (18)

which is none other than Laplace's equation. This equation also admits plane surfaces as solutions, but not only those. It induces the error:

\varepsilon_{3p,q} = \varepsilon_{0p,q} + \lambda_{3p,q} \sum_{(i,j) \in D} (p_{i+1,j} - p_{i,j} + q_{i,j+1} - q_{i,j})^2,
and its equivalent with the unknowns (f_{i,j}, g_{i,j})_{(i,j) \in D}:

\varepsilon_{3f,g} = \varepsilon_{0f,g} + \lambda_{3f,g} \sum_{(i,j) \in D} (f_{i+1,j} - f_{i,j} + g_{i,j+1} - g_{i,j})^2.
Now, let us describe the minimization methods of these various errors.
3. Resolution Methods of the Eikonal Equation

In this section, we will enumerate various resolution methods of the discretized eikonal equation, i.e., various minimization methods of the errors quoted in the preceding section. The search for an absolute minimum of an error function ε(x_1, \ldots, x_n), corresponding to a system of equations of the type (11), can be undertaken using two types of methods: resolution of the Euler equations associated with the error, and direct minimization methods of ε.
3.1. Resolution of the Euler’s equations associated with an Error For an absolute or relative minimum (x∗1 , . . . , x∗n ) of ε, the following equalities (by supposing that all the partial derivatives are defined) hold: ∀v ∈ [1, n],
∂ε ∗ (x1 , . . . , x∗n ) = 0. ∂xv 9
This system of n equations (known as Euler’s equations) and of n unknowns also admits as solutions the absolute or relative maxima of ε, and also its inflection points, but it is a wellconstrained system, since it comprises as many equations as unknowns. A usual resolution technique of a well-constrained system of nonlinear equations consists in taking an initial configuration ω 0 = (x01 , . . . , x0n ), and building a recurring succession of configurations ω k = (xk1 , . . . , xkn ) defined by a relation of the type: ω k+1 = F (ω k ), where F is a function having as fixed point the required minimum ω ∗ = (x∗1 , . . . , x∗n ), i.e., F must check the relation ω ∗ = F (ω ∗ ). Such iterations are called Jacobi’s iterations (or iterations). It is generally difficult to find a function F ensuring the convergence of the iteration towards the fixed point ω ∗ . In fact, one often prefers to the Jacobi’s iterations other iterations, called iterations of Gauss-Seidel (or relaxations), which have a better behaviour with respect to convergence, and which are defined, for v ∈ [1, n], by: k+1 k k k xk+1 = Fv (xk+1 v 0 , . . . , xv−1 , xv , xv+1 , . . . , xn ).
The functions Fv are the coordinates of a function F defining a Jacobi’s iteration. One can thus associate a relaxation with each Jacobi’s iteration.
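The difference between the two iteration schemes can be seen on a toy example; the small diagonally dominant linear system below is an illustrative choice, not one of the paper's systems.

```python
# Toy illustration of Jacobi versus Gauss-Seidel iterations on A x = b:
# the Jacobi step uses only the previous configuration omega^k, while the
# Gauss-Seidel step reuses the coordinates already updated at step k+1.
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])

def jacobi_step(x):
    # both coordinates computed from the old configuration
    return np.array([(b[0] - A[0, 1] * x[1]) / A[0, 0],
                     (b[1] - A[1, 0] * x[0]) / A[1, 1]])

def gauss_seidel_step(x):
    # the second coordinate reuses the freshly updated first one
    x = x.copy()
    x[0] = (b[0] - A[0, 1] * x[1]) / A[0, 0]
    x[1] = (b[1] - A[1, 0] * x[0]) / A[1, 1]
    return x

xj = jacobi_step(np.zeros(2))        # one Jacobi step from the origin
x = np.zeros(2)
for _ in range(50):                  # the relaxation converges to the solution
    x = gauss_seidel_step(x)
```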
3.2. Direct Minimization Methods of the Error

The alternative to the resolution of the Euler equations associated with an error ε is the direct search for an absolute minimum of this error. Two types of methods allow this: stochastic methods and deterministic methods. We will evoke stochastic methods at the end of this paper. As for the deterministic methods, they especially include the gradient descent methods. We chose to implement only one of these methods, which we call the optimal gradient descent method, and which works as follows: starting from an arbitrary initial configuration ω^0, one builds a recurrent sequence of configurations ω^k by moving in the direction of steepest descent, i.e.:

\omega^{k+1} = \omega^k - dl^k \, \frac{\overrightarrow{\mathrm{grad}}\, \varepsilon^k}{\| \overrightarrow{\mathrm{grad}}\, \varepsilon^k \|},

where dl^k represents the length of the displacement. The suggested method consists in calculating, at each step, the smallest displacement dl^k for which the error presents a relative minimum, which ensures an optimal decrease of the error and guarantees the convergence of the method (at least towards a relative minimum of ε). To compare the various iterative resolution methods that have just been mentioned, we will start by drawing up a list of five quality criteria concerning these methods.
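The idea above can be sketched on a toy error. This is only an illustration under our own assumptions: the quadratic test function, the step range and the sampling density are arbitrary choices, and the coarse sampled search below approximates the paper's one-dimensional search for the smallest displacement giving a relative minimum.

```python
# Sketch of the optimal gradient descent idea: move along the steepest
# descent direction by the best step found in a coarse 1-D search
# (an approximation of the exact line search described in the text).
import numpy as np

def optimal_gradient_descent(eps, grad, w0, n_steps=100, dl_max=2.0, n_samples=200):
    w = np.asarray(w0, dtype=float)
    for _ in range(n_steps):
        g = grad(w)
        norm = np.linalg.norm(g)
        if norm < 1e-12:
            break
        d = -g / norm                     # direction of steepest descent
        dls = np.linspace(0.0, dl_max, n_samples)
        values = np.array([eps(w + dl * d) for dl in dls])
        k = int(np.argmin(values))        # best sampled displacement along d
        if k == 0:                        # no decrease along d: stop
            break
        w = w + dls[k] * d
    return w

eps = lambda w: float(np.sum((w - 3.0) ** 2))   # toy error, minimum at (3, 3)
grad = lambda w: 2.0 * (w - 3.0)
w = optimal_gradient_descent(eps, grad, [0.0, 0.0])
```

Since the chosen step never increases the error, the sequence of error values is nonincreasing, which is the property that guarantees convergence towards (at least) a relative minimum.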
3.3. Quality Criteria related to an Iterative Method

For any iterative method of error minimization, the necessary qualities are in general the following ones:
• Q1: be convergent.
• Q2: converge towards the required solution.
• Q3: be fast.
• Q4: be robust in the presence of noisy data.
In the case of shape from shading, the goal is to produce as final result a shape ω = (z_{i,j})_{(i,j) \in D}.
If one has at one's disposal the real shape ω^* = (z_{i,j}^*)_{(i,j) \in D} corresponding to the analyzed image, a good measure of the precision of the rebuilding consists in calculating the following expression, called the variation with the real shape:

e_1(\omega) = \sum_{(i,j) \in D} (z_{i,j} - z_{i,j}^*)^2.
For the resolution methods of shape from shading which seek to solve the systems Σp,q or Σf,g, it is necessary to introduce another quality criterion. Indeed, by using if necessary the relations (10), which are bijective, these resolution methods enable one to obtain the values (p_{i,j}, q_{i,j})_{(i,j) \in D} which are supposed to correspond to the required shape. To calculate this shape, it is necessary to use the relations (8), which amounts to carrying out a discrete integration of the normal \vec{N}. One can thus define a new quality criterion Q5 for these methods, which consists in checking whether such an integration is possible or not, i.e., whether the equality (15) holds or not. As a measure of this quality criterion, one can calculate the following expression:

e_2(\omega) = \sum_{(i,j) \in D} (p_{i,j+1} - p_{i,j} - q_{i+1,j} + q_{i,j})^2,

which we will call the variation to integrability. We will now see how to visualize graphically the performances of an iterative method.
3.4. Graphic Visualization of the Performances of an Iterative Method

To be able to compare various methods with respect to the five quality criteria which have just been defined, we propose to trace two types of graphics, for which the X-coordinate is the calculation time t and the Y-coordinate is, according to the case, the variation with the real shape e1 or the variation to integrability e2: the first type of graphics will enable us to evaluate whether the quality criteria Q1, Q2, Q3 and Q4 are satisfied; the second one will enable us to show to what extent the quality criterion Q5 is satisfied. Obviously, for a given resolution method, the calculation time t depends closely on the programming tricks used. Nevertheless, this X-coordinate seems more reliable to us than the iteration count (used in [12]), since the calculation time δt^k corresponding to an iterative step varies from one method to another, and since δt^k may depend on the step k, in particular for the optimal gradient descent. In addition, the calculation time must take into account the duration necessary to carry out the integration corresponding to the relations (8), when the error uses the unknowns (p_{i,j}, q_{i,j})_{(i,j) \in D} or (f_{i,j}, g_{i,j})_{(i,j) \in D}. Moreover, this duration can vary, according to whether the values (p_{i,j}, q_{i,j})_{(i,j) \in D} obey the quality criterion Q5 more or less well. We chose to apply an integration method of constant duration, which systematically carries out 50 iterative steps of the iterative integration method suggested by Horn and Brooks [7], starting from an initial configuration calculated by a step-by-step integration using the relations (8).
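Such a discrete integration of a gradient field into a height map can be sketched as a least-squares fit of the finite differences (8); the gradient-step minimization below is our reading of an iterative integration in the spirit of Horn and Brooks, not their exact scheme, and the step size and iteration count are illustrative choices.

```python
# Sketch of an iterative discrete integration of (p, q) into heights z,
# by gradient steps on the least-squares residuals of the relations (8)
# (illustrative scheme, not the exact Horn-Brooks iteration).
import numpy as np

def integrate(p, q, n_steps=50, lr=0.05):
    """p has shape (n-1, n), q has shape (n, n-1); returns z of shape (n, n)."""
    n = q.shape[0]
    z = np.zeros((n, n))
    for _ in range(n_steps):
        rp = z[1:, :] - z[:-1, :] - p     # residual of p = z[i+1,j] - z[i,j]
        rq = z[:, 1:] - z[:, :-1] - q     # residual of q = z[i,j+1] - z[i,j]
        g = np.zeros_like(z)
        g[1:, :] += 2.0 * rp
        g[:-1, :] -= 2.0 * rp
        g[:, 1:] += 2.0 * rq
        g[:, :-1] -= 2.0 * rq
        z -= lr * g
    return z - z.mean()                   # heights are defined up to a constant

z_true = np.fromfunction(lambda i, j: 1.0 * i + 2.0 * j, (5, 5))
p = z_true[1:, :] - z_true[:-1, :]
q = z_true[:, 1:] - z_true[:, :-1]
z = integrate(p, q, n_steps=3000)         # recovers z_true up to a constant
```

For an integrable field, the minimum of the least-squares objective is the true height map (up to an additive constant), which is why the result is compared after removing the mean.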
4. Results

In this section, we will analyze the noise-free synthetic image, of size 256 × 256, of a spherical cap of radius 100 placed on a plane, as well as the same image degraded by a random noise; both are represented on Figure 1. The maximum grey level E_max of these images is equal to 255. We will use as starting shape a cone having the same base as the spherical cap. This shape, like the spherical cap, is represented in perspective on Figure 2.
4.1. Comparison between Jacobi’s Iteration, Relaxation and Optimal Gradient Descent Horn and Brooks [7] quote Horn and Ikeuchi [8], which proposed an iteration enabling to minimize the error ε2f,g , which we will call iteration of Horn and Ikeuchi. They also propose an 12
(a) Noise-free image. (b) Noisy image.
Figure 1: Images of a spherical cap.
(a) Spherical cap. (b) Cone.
Figure 2: Real shape and starting shape.

iteration enabling the minimization of the error ε1p,q, which we will call the iteration of Horn and Brooks. According to them, these two iterations "have a good behavior" [7]. For the iteration of Horn and Ikeuchi, a theoretical proof of convergence exists [10], but this proof requires λ2f,g to be higher than a threshold, so high that it would make the method unusable. On the graphic of Figure 3.a, we traced e1 according to t, for the iteration of Horn and Ikeuchi, for the associated relaxation and for the optimal gradient descent, with λ2f,g = 100000, starting from the conical surface. It is noted that the iteration diverges, contrary to the two other methods. This result is not really surprising, because a relaxation always has a better stability than the iteration from which it results. But when λ2f,g = 3500, one notes on the graphic of Figure 3.b that the relaxation diverges too. On the graphic of Figure 4.a, we traced e1 according to t, for the iteration of Horn and Brooks, for the associated relaxation and for the optimal gradient descent, with λ1p,q = 100000, starting from the conical surface. Lastly, the graphic of Figure 4.b corresponds
(a) Error ε2f,g, with λ2f,g = 100000. (b) Error ε2f,g, with λ2f,g = 3500.
Figure 3: Minimization methods based on the error introduced by Horn and Ikeuchi.
(a) Error ε1p,q, with λ1p,q = 100000. (b) Error ε1p,q, with λ1p,q = 3500.
Figure 4: Minimization methods based on the error introduced by Horn and Brooks.
to the same tests, when λ1p,q = 4400. Considering these four graphics, one can note that these iterations and relaxations, which are supposed, according to Horn and Brooks, to give good guarantees of convergence, can nevertheless have a divergent behavior for certain values of the coefficients λ2f,g and λ1p,q. In addition, the speed of decrease of e1 is completely comparable for the three methods. The relaxation is the fastest in the case of ε2f,g, but it is also the only one of these three methods that cannot be computed in parallel. For all these reasons, we will no longer use it in the remainder of this paper, but only the optimal gradient descent as error minimization method.
4.2. Comparison of the various Errors

Using the method of optimal gradient descent, we will compare the various error functions which were introduced in section 2. We chose to take the same value for all the coefficients of the various errors tested, even if this choice is arbitrary. On the graphics of Figures 5.a and 5.b, we traced e1 and e2 according to t, for the three errors depending on the unknowns (p_{i,j}, q_{i,j})_{(i,j) \in D}, when λ1p,q = λ2p,q = λ3p,q = 100000. It clearly appears that, of these three errors, ε3p,q is
(a) Variation to the real shape. (b) Variation to integrability.
Figure 5: ε1p,q, ε2p,q and ε3p,q.

the least interesting: not only does it not converge towards the correct shape, but it also does not satisfy the quality criterion Q5. It must thus be abandoned. As for the errors ε1p,q and ε2p,q, they have relatively equivalent behaviours. On the graphics of Figures 6.a and 6.b, we traced e1 and e2 according to t, for the three errors depending on the unknowns (f_{i,j}, g_{i,j})_{(i,j) \in D}, when λ1f,g = λ2f,g = λ3f,g = 100000. The variation with the real shape remains high for the error ε1f,g, which is in fact due to a slowness problem, and is not really astonishing, insofar as the additional term of this error, relatively complicated, requires a significant calculation time. It is also noted that the error ε3f,g does not correctly satisfy the quality criterion Q5. These two errors must thus be abandoned. Finally, let us mention that the error ε0z gives rise to a decrease of e1 even slower than that of ε1f,g. For this reason, we did not reproduce it on the graphics, and also decided to give it up. Notice that, among the seven errors that we have just compared, the one which gives the worst results is ε3p,q, for which the additional term comes from Laplace's equation (18), where do not appear the second cross derivatives
∂²z/∂x∂y and ∂²z/∂y∂x of the unknown z. This observation was already made by Blake and Zisserman [1]: in a minimization problem, a smoothing term gives better results when it uses the second cross derivatives of the unknown.
(a) Variation to the real shape. (b) Variation to integrability.
Figure 6: ε1f,g, ε2f,g and ε3f,g.

It thus appears that only ε1p,q, ε2p,q and ε2f,g remain in contention. A good way to compare these three errors consists in testing them on the noisy image of the spherical cap, represented on Figure 1.b. Since the error ε1p,q comprises an integrability term, it is probable that it will tend to interpret this image as the noise-free image of a surface presenting some asperities. On the other hand, since the errors ε2p,q and ε2f,g each comprise a smoothing term, one can expect the reconstructed shape to be smooth. On the graphics of Figures 7.a and 7.b, we traced e1 and
(a) Variation to the real shape. (b) Variation to integrability.
Figure 7: ε1p,q, ε2p,q and ε2f,g.

e2 according to t, for the three remaining errors ε1p,q, ε2p,q and ε2f,g. It is seen that, of these three errors, ε1p,q gives the smallest limit values of e1 and e2. Thus, the error ε1p,q seems to be the best of all the errors we have tested. Nevertheless, even if the global aspect of the reconstructed shape is closer to the real shape for ε1p,q (cf. Figure 8.a) than for ε2p,q or ε2f,g (cf.
(a) Error ε1p,q .
(b) Error ε2p,q .
(c) Error ε2f,g .
Figure 8: Reconstructed shapes from the images of Figure 1.b.

Figures 8.b and 8.c), it can be seen on Figure 8.a that the reconstructed shape is rough, that is, the noise on the image of Figure 1.b has been interpreted as coming from asperities on the surface, which is not the case. This could be avoided by using a smoothing term in addition to the integrability term, since such a term in ε2p,q or in ε2f,g ensures that the reconstructed shape is smooth (but warped). The best compromise could thus be to use, for example, an error ε4p,q such as:

\varepsilon_{4p,q} = \varepsilon_{1p,q} + \lambda_{4p,q} \sum_{(i,j) \in D} \left[ (p_{i+1,j} - p_{i,j})^2 + (p_{i,j+1} - p_{i,j})^2 + (q_{i+1,j} - q_{i,j})^2 + (q_{i,j+1} - q_{i,j})^2 \right].

We will now tackle a more serious problem: that of the relative minima of an error.
4.3. Problem of the Relative Minima of the Error used Among the five quality criteria mentioned above, Q2 constitutes a major problem for all the deterministic methods of error minimization. Indeed, these methods remain trapped since they fall into a relative minimum. In the case of the image processing of the spherical cap, the resolution of the problem really does not depend on the starting shape, because whatever the initial configuration (sufficient smooth otherwise), the absolute minimum is always reached. Processing on this image is shown on Figure 9 using various stages of its warping. It is true that it is about a very simple case, but it often appears in the literature of shape from shading, and this is why resolution methods are initially tested on.
(a) Random starting shape.
(b) Shape with t = 1000 sec.
(c) Shape with t = 1500 sec.
(d) Shape with t = 2000 sec.
(e) Shape with t = 2500 sec.
(f) Final shape.
Figure 9: Error ε1p,q, with λ1p,q = 10000.

Hence, we will focus on this problem for a much more complex shape than the previous one, represented in Figure 10.a, which yields the image represented in Figure 10.b. If one starts
(a) Smooth shape.
(b) Corresponding image.
Figure 10: Complex smooth shape and its image.

from an unspecified initial shape, such as for example a plane surface, the deformation will be slow, whence the idea of finding an adequate initial shape. It can be said that the only random thing in a gradient descent method is the initial shape. We decided to make the gradient descent method (a little bit more) stochastic by choosing several particular initial shapes before
applying the optimal gradient method, in order to converge towards the absolute minimum of the error. As the number of unknowns is high, the space of configurations becomes gigantic; we therefore propose to restrict it in the following way:
• First of all, one detects the points of maximum illumination in the image. According to the assumptions made in section 2, these points correspond to extrema of the surface: depending on the case, maxima, minima or inflection points.
• The second stage consists in fixing a random height for each of these points.
• Finally, it only remains to build a shape made up of facets, whose nodes are the points of maximum illumination.
If the height at each point of interest equals the real height, one obtains the shape represented in Figure 11.b. The method therefore has to be run with this methodology,
(a) Detection of points of interest by thresholding (s = 254).
(b) Faceted shape using the points of interest.
Figure 11: Construction of a starting shape.

insofar as one obtains a finite number of initial shapes (a rather large number, however). Finally, it suffices to compare the various errors obtained at the end of the processing and to choose the smallest one. Another shape built with facets is used (cf. Figure 13.a) in order to test the behaviour of these methods starting from a shape for which one knows only the points of interest, without knowing their heights. One then unavoidably reaches a relative minimum, represented in Figure 13.b.
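The three-stage construction of a random starting shape can be sketched as follows. Here a crude inverse-distance interpolation stands in for the faceted (triangulated) surface of the paper, and the threshold, height range and function names are assumptions of this sketch:

```python
import numpy as np

def faceted_starting_shape(E, s=254, rng=None):
    """Build one random starting shape from an image E:
    1. threshold E to detect the points of maximum illumination;
    2. draw a random height for each detected point;
    3. interpolate between those nodes over the whole grid
       (inverse-distance weighting, instead of true facets)."""
    rng = np.random.default_rng() if rng is None else rng
    pts = np.argwhere(E >= s)                   # step 1: points of interest
    heights = rng.uniform(0.0, 1.0, len(pts))   # step 2: random heights
    # step 3: interpolate the node heights over the whole grid
    ii, jj = np.mgrid[0:E.shape[0], 0:E.shape[1]]
    grid = np.stack([ii.ravel(), jj.ravel()], axis=1)
    d = np.linalg.norm(grid[:, None, :] - pts[None, :, :], axis=2)
    w = 1.0 / np.maximum(d, 1e-9)               # weight ~ 1/distance
    z = (w * heights).sum(axis=1) / w.sum(axis=1)
    return z.reshape(E.shape)
```

Drawing the node heights from a discrete set would make the number of such starting shapes finite, as the text requires; each shape is then used as an initial configuration for the descent, and the final errors are compared.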
(a) Initial shape.
(b) Intermediary shape.
(c) Final reconstructed shape.
Figure 12: Shape reconstruction with the error ε4p,q (λ = 10^5).
(a) Other faceted initial shape.
(b) Intermediary shape.
(c) Final reconstructed shape.
Figure 13: Shape reconstruction with the error ε4p,q (λ = 10^5).
4.4. Results on Real Images

In the case of the spherical image, the error is sufficiently regular in that it does not present many relative minima. But for images such as those of Figures 14.a and 14.c (which were taken in our laboratory to check the assumptions of shape from shading [3]), one finds the shapes represented respectively in Figures 14.b and 14.d, which obviously constitute relative minima for the error ε1p,q. Indeed, as these images are real ones, the numerical real shape is unknown in these cases; but even if the global frame of these two reconstructed shapes seems quite satisfactory, notice that the human eye interprets some details in a different way, such as the interval between the stag's horns, or the rabbit's right ear, which should both present a cavity. One can notice that the general frame still remains in spite of the presence of a relative minimum, but the smoothing factor tends to remove all the details present on the object. A means of curing this problem, and thus of obtaining better results, would be to take an object of larger size presenting coarser details.
(a) Image of a stag.
(b) Calculated shape.
(c) Image of a rabbit.
(d) Calculated shape.
Figure 14: Illustration of the problem of relative minima.
5. Conclusion

In this paper, we selected, from a great number of iterative methods, the one which presents the best compromise between the various quality criteria defined in section 3, namely the optimal gradient descent method applied to the error ε1p,q, or possibly to the error ε4p,q in the event of noisy images. Nevertheless, among the five quality criteria defined above, the criteria Q2 and Q3 may not be exactly satisfied. With regard to Q3, starting from an initial shape very far away from the sought shape, this method does converge towards the real shape (cf. Figure 9), but the calculation time required to obtain a satisfactory result is about one hour on a Sun Ultra 1, which is too much. Several authors have already tried to cure this defect [12, 13], and it would of course be desirable to apply their acceleration techniques to our method. As regards the quality criterion Q2, one touches on a major problem of all the deterministic methods of error minimization. We propose to make the optimal gradient descent algorithm partially stochastic, by randomly drawing a set of
starting shapes composed of facets, which makes it possible to restrict the space of the starting shapes and thus the calculation time. The choice of the facets depends on the detection of the pixels where E(x, y) = Emax, and such a detection cannot be made reliably on noisy images. Thus, to find the absolute minimum of the error ε1p,q (or ε4p,q) on noisy images, it will be necessary to use a genuinely stochastic method. As a conclusion, we think that the problem of relative minima in shape from shading could be avoided by using a truly stochastic method of error minimization, such as simulated annealing, a track which we have already started to explore.
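The stochastic method evoked here, simulated annealing, can be sketched in a few lines. The neighbourhood move, the geometric cooling schedule and every parameter below are illustrative assumptions, not choices made by the authors:

```python
import math
import random

def simulated_annealing(error, neighbour, x0, t0=1.0, alpha=0.99,
                        n_iter=5000, rng=None):
    """Minimal simulated-annealing loop: always accept an improving
    move; accept a worsening move of size de with probability
    exp(-de / t), where the temperature t cools geometrically."""
    rng = rng or random.Random()
    x, e = x0, error(x0)
    best, best_e = x, e
    t = t0
    for _ in range(n_iter):
        y = neighbour(x, rng)
        de = error(y) - e
        if de <= 0 or rng.random() < math.exp(-de / t):
            x, e = y, error(y)
            if e < best_e:
                best, best_e = x, e
        t *= alpha  # geometric cooling
    return best, best_e
```

The occasional acceptance of uphill moves is what lets the search escape the relative minima that trap every deterministic descent.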
References
[1] A. Blake and A. Zisserman. Visual Reconstruction. MIT Press, Cambridge, MA, 1987.
[2] A. R. Bruss. The eikonal equation: Some results applicable to computer vision. Journal of Mathematical Physics, 23(5):890–896, May 1982.
[3] P. Daniel and J. D. Durou. Creation of real images which are valid for the assumptions made in shape from shading. In Proceedings of the 10th International Conference on Image Analysis and Processing (ICIAP'99, Venice, Italy, September 27-29, 1999). IEEE Computer Society Press, September 1999.
[4] M. Falcone and M. Sagona. An algorithm for the global solution of the shape-from-shading model. In Proceedings of the IEEE International Conference on Image Analysis and Processing (ICIAP'97, Florence, Italy, September 1997), pages 596–603, 1997.
[5] B. K. P. Horn. Obtaining shape from shading information. In The Psychology of Computer Vision, 1975.
[6] B. K. P. Horn. Height and gradient from shading. International Journal of Computer Vision, 5(1):37–75, August 1990.
[7] B. K. P. Horn and M. J. Brooks. The variational approach to shape from shading. Computer Vision, Graphics, and Image Processing, 33(2):174–208, February 1986.
[8] K. Ikeuchi and B. K. P. Horn. Numerical shape from shading and occluding boundaries. Artificial Intelligence, 17:141–194, 1981.
[9] J. D. Durou, L. Mascarilla and D. Piau. Déformations invisibles. In RFIA'98 Reconnaissance des Formes et Intelligence Artificielle (Clermont-Ferrand, France, January 20-22, 1998), pages 219–225. AFCET, January 1998.
[10] D. Lee. A provably convergent algorithm for shape from shading. In L. S. Baumann, editor, Image Understanding Workshop (Miami Beach, FL, December 9-10, 1985), pages 489–496, 1985.
[11] P. L. Lions, E. Rouy and A. Tourin. Shape-from-shading, viscosity solutions and edges. Numerische Mathematik, 64:323–353, 1993.
[12] R. Szeliski. Fast shape from shading. In O. Faugeras, editor, Proceedings of the 1st European Conference on Computer Vision (Antibes, France, April 23-27, 1990), volume 427 of LNCS, pages 359–368, Berlin, April 1990. Springer.
[13] D. Terzopoulos. Image analysis using multigrid relaxation methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(2):129–139, 1986.