Human Walking Motion Synthesis Based on Multiple Regression Hidden Semi-Markov Model

Takashi Yamazaki, Naotake Niwase, Junichi Yamagishi, Takao Kobayashi
Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering
Yokohama, 226-8502, Japan
{t-yamazaki, junichi.yamagishi, takao.kobayashi}@ip.titech.ac.jp

Abstract

This paper describes a statistical approach for modeling and synthesizing human walking motion. In the approach, each motion primitive is modeled statistically from motion capture data using a multiple regression hidden semi-Markov model (HSMM). An HSMM is an extension of the hidden Markov model (HMM) in which each state has an explicit state duration probability distribution, and a multiple regression HSMM is one whose mean parameters of the probability distribution functions are given by a function of factors that affect human motion. In this paper, we introduce a training algorithm for the multiple regression HSMM, called factor adaptive training, based on the EM algorithm, and also describe a parameter generation algorithm from motion primitive HSMMs with prescribed values of the factors. Experimental results show that the proposed technique can control walking movements in accordance with a change of factors such as walking pace and stride length, and can provide realistic human motion.
1. Introduction

Generating realistic human motion is an important issue in human-computer interaction (HCI) involving agents or virtual humans. Although state-of-the-art motion synthesis systems based on motion capture data can generate realistic human motion, there still exist problems to be solved. In general, human motion varies depending on various factors. For example, ball catching and throwing movements change with the catching position and the speed of the ball, and walking motion is affected by factors such as walking speed, walking pace, and emotion. In such cases, it is not easy to synthesize motion whose desired movements are not included in the motion capture data by adapting the data in accordance with a change of
Proceedings of the 2005 International Conference on Cyberworlds (CW’05) 0-7695-2378-1/05 $20.00 © 2005
IEEE
those factors. Moreover, we would require some interpolation/blending/warping processes to smooth jerky motion in the transitions between different kinds of movements or to generate intermediate shapes between target shapes, which can make the resulting human motion animation unrealistic. A number of approaches have been proposed to overcome this problem, such as signal processing techniques [1], statistical techniques based on hidden Markov models (HMMs) or other probabilistic models [2]–[6], and corpus-based techniques [7]–[9]. In addition, we have proposed an alternative approach to human motion synthesis based on HMM [10],[11]. In this paper, we focus on human walking motion and extend our approach to synthesizing walking movements with a desired walking pace and stride length. In [11], each motion primitive is modeled by a hidden Markov model (HMM), in which duration control is not considered explicitly. To control the duration of each motion primitive, we incorporate the hidden semi-Markov model (HSMM) [12], an HMM with explicit state duration probability distributions, into the modeling of each motion primitive. Furthermore, the mean parameter of each probability distribution function of the HSMM is assumed to be given by a function of factors that affect human motion, such as walking pace and stride length. We derive a training algorithm for the model parameters of the HSMM, which will be referred to as factor adaptive training, and a parameter generation algorithm from HSMM, both based on the ML criterion. We also show that we can synthesize walking motion with arbitrarily specified walking pace and stride length using the proposed technique.
2. Related Work

Our approach is similar to conventional statistical approaches in that motion modeling is based on HMM [2]–[5] and that the motion is controlled by a low-dimensional vector [3],[6]. In particular, our approach is closely related to the style machine proposed in [3]. However, there are distinct differences between the two approaches. In the style machine approach, human motion is modeled by an HMM and controlled by a low-dimensional vector, called the style variable, defined in an eigenspace obtained by principal component analysis (PCA). In general, the axes of an eigenspace derived from PCA do not represent specific physical quantities such as stride length and walking pace. As a result, a desired style can be specified only in an implicit way. In contrast, our approach uses an HSMM parameterized by a low-dimensional vector representing a point in a space where each coordinate corresponds to a specific value of a factor such as stride length or walking pace. Hence our approach can explicitly control motion by specifying a set of desired values of the factors that affect it. Furthermore, HSMM can model state duration properly, whereas HMM cannot; this means that we can control the walking pace of the human walking motion properly. In addition, modeling the mean vector of the distribution function using multiple regression is not a new idea in the HMM-based framework [16],[17]. However, a distinctive feature of our approach is the incorporation of an adaptive training framework based on multiple regression HSMM into human motion synthesis.
3. Human Motion Modeling and Synthesis Based on HSMM

3.1. Overview of the Approach

We decompose human motion into motion primitives and describe the motion by a sequence of symbols representing the motion primitives. In this paper, we refer to the name of a motion primitive as its motion label. The procedure for synthesizing human motion animation consists of two stages: model training and motion synthesis. In the model training stage, a number of movements are acquired using a motion capture system. The motion capture data are segmented into motion primitives, and labels corresponding to the motion primitives are attached. Then each motion primitive is modeled by an HSMM using the training algorithm described in 3.2. In the motion synthesis stage, a desired motion description to be synthesized is converted into a sequence of motion primitives and represented by a motion label sequence. Then the HSMMs corresponding to the motion labels are concatenated to construct one HSMM representing the whole motion to be synthesized. Using the obtained HSMM, a human body model parameter sequence is generated by the parameter generation algorithm described in 3.3. Finally, the generated human body
model parameters at each frame are visualized as a 3D image and converted into a motion animation.

To model and synthesize human motion, we use a parameter set obtained via motion capture as a static feature vector. We also use dynamic features as well as static features. Here the dynamic feature vectors are the first and second delta parameter vectors, which correspond to the first and second time derivatives of the static feature vector, respectively. Let x_t be the human body parameter vector captured at frame t. For a given human body parameter vector sequence (x_1, x_2, \ldots, x_T) of length T frames, the k-th delta parameter vector of x_t, \Delta^k x_t, is defined by

\Delta^k x_t = \sum_{i=-L_k}^{L_k} w^{(k)}(i)\, x_{t+i}, \quad 0 \le k \le 2    (1)

where w^{(k)}(i) are coefficients for obtaining the delta parameters, and L_0 = 0, w^{(0)}(0) = 1. For example, if we set L_1 = L_2 = 1, then the w^{(k)}(i) derived from numerical differentiation are given by

w^{(1)}(i) = i/2, \quad i = -1, 0, 1    (2)
w^{(2)}(i) = 3i^2 - 2, \quad i = -1, 0, 1.    (3)

The static feature vector and its dynamic feature vectors are combined, and the observation vector at frame t, denoted by o_t, is defined by

o_t = [x_t^\top, (\Delta^1 x_t)^\top, (\Delta^2 x_t)^\top]^\top    (4)

where \top denotes the matrix transpose.
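As a concrete illustration, the delta-window computation of Eqs. (1)–(3) can be sketched as follows. This is a minimal NumPy sketch, not the authors' implementation; in particular, the edge handling by boundary-frame replication is an assumption.

```python
import numpy as np

# Delta windows for L1 = L2 = 1, from Eqs. (2)-(3):
# w1(i) = i/2 and w2(i) = 3*i**2 - 2 for i in {-1, 0, 1}.
W1 = np.array([-0.5, 0.0, 0.5])   # first-order (velocity) window
W2 = np.array([1.0, -2.0, 1.0])   # second-order (acceleration) window

def delta_features(X, w):
    """Apply a 3-tap delta window to each frame of X (T x M), edge-padded."""
    T = X.shape[0]
    Xp = np.vstack([X[:1], X, X[-1:]])  # replicate boundary frames (assumption)
    return np.stack([sum(w[j] * Xp[t + j] for j in range(3)) for t in range(T)])

def observation_vectors(X):
    """Stack static, delta and delta-delta features: o_t of Eq. (4)."""
    return np.hstack([X, delta_features(X, W1), delta_features(X, W2)])
```

For a linear ramp the interior first-order deltas are constant and the second-order deltas vanish, which is a quick way to check the windows.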
3.2. Factor Adaptive Training for Motion Primitives Using HSMM

We model each motion primitive within an HSMM framework. An N-state HSMM \lambda is specified by the initial state probabilities \{\pi_i\}_{i=1}^{N}, state transition probabilities \{a_{ij}\}_{i,j=1, i \ne j}^{N}, state output probability distributions \{b_i(\cdot)\}_{i=1}^{N}, and state duration probability distributions \{p_i(\cdot)\}_{i=1}^{N}. We assume that the i-th state output and duration distributions are given by Gaussian density functions with mean vector \mu_i and diagonal covariance matrix \Sigma_i, and mean m_i and variance \sigma_i^2, respectively:

b_i(o) = \mathcal{N}(o; \mu_i, \Sigma_i)    (5)
p_i(d) = \mathcal{N}(d; m_i, \sigma_i^2).    (6)
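A minimal sketch of evaluating these two densities follows, assuming the diagonal covariance is stored as a vector of per-dimension variances; the function names are illustrative, not from the paper.

```python
import numpy as np

def output_density(o, mu, var):
    """b_i(o) of Eq. (5): Gaussian density with diagonal covariance."""
    o, mu, var = map(np.asarray, (o, mu, var))
    quad = np.sum((o - mu) ** 2 / var)
    log_norm = 0.5 * (len(mu) * np.log(2 * np.pi) + np.sum(np.log(var)))
    return np.exp(-0.5 * quad - log_norm)

def duration_density(d, m, sigma2):
    """p_i(d) of Eq. (6): scalar Gaussian density over the state duration."""
    return np.exp(-0.5 * (d - m) ** 2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
```

At the mean, the scalar density reduces to 1/\sqrt{2\pi\sigma^2}, which is a convenient spot check.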
In the following, without loss of generality, we assume that HSMM \lambda is a simple left-to-right model with no skip paths. As a result, the parameter set of HSMM \lambda can be written as \lambda = (\mu, \Sigma, m, \sigma^2).

To take account of factors that affect the human motion, we assume that the means of the output and duration distributions at each state are given by functions of a parameter vector \xi. More specifically, we assume that the mean vector \mu_i and the mean m_i in (5) and (6) are modeled by multiple regression [11] in the following form:

\mu_i(\xi_b) = H_{bi}\, \xi_b    (7)
m_i(\xi_p) = H_{pi}\, \xi_p    (8)

where

\xi_b = [1, \xi_1^{(b)}, \xi_2^{(b)}, \ldots, \xi_{L_b}^{(b)}]^\top    (9)
\xi_p = [1, \xi_1^{(p)}, \xi_2^{(p)}, \ldots, \xi_{L_p}^{(p)}]^\top.    (10)

Here \{\xi_k^{(b)}\} and \{\xi_k^{(p)}\} are predictor variables, namely factors that affect the motion, H_{bi} and H_{pi} are M \times (L_b + 1)- and 1 \times (L_p + 1)-dimensional multiple regression matrices, and M is the dimensionality of \mu_i. Then the probability distribution functions at state i are given by

b_i(o) = \mathcal{N}(o; H_{bi}\xi_b, \Sigma_i)    (11)
p_i(d) = \mathcal{N}(d; H_{pi}\xi_p, \sigma_i^2).    (12)

Suppose that the training data contain K observation vector sequences O = \{O^{(1)}, \ldots, O^{(K)}\}, where O^{(n)} = (o_1^{(n)}, o_2^{(n)}, \ldots, o_{T_n}^{(n)}) is the n-th observation sequence of length T_n. We assume that the values of the factors depend on n and are represented by

\xi^{(n)} = (\xi_b^{(n)}, \xi_p^{(n)}).    (13)

Then, for given O, we train the model \lambda so that the likelihood defined by

L(\lambda, \Lambda) = \prod_{n=1}^{K} P(O^{(n)} \mid \lambda, \Lambda, \xi^{(n)})    (14)

is maximized jointly with respect to the parameter set of \lambda and the regression matrices \Lambda = (H_b, H_p), where H_b = \{H_{bi}\}_{i=1}^{N} and H_p = \{H_{pi}\}_{i=1}^{N}. We refer to this model training scheme as factor adaptive training. Based on the EM algorithm, we can derive re-estimation formulas for \lambda and \Lambda [15]. In the following, \sum_{n,t,d} abbreviates \sum_{n=1}^{K} \sum_{t=1}^{T_n} \sum_{d=1}^{t}:

H_{bi} = \Bigl[ \sum_{n,t,d} \gamma_{td}^{(n)}(i) \Bigl( \sum_{s=t-d+1}^{t} o_s^{(n)} \Bigr) \xi_b^{(n)\top} \Bigr] \Bigl[ \sum_{n,t,d} \gamma_{td}^{(n)}(i)\, d\, \xi_b^{(n)} \xi_b^{(n)\top} \Bigr]^{-1}    (15)

\Sigma_i = \frac{ \sum_{n,t,d} \gamma_{td}^{(n)}(i) \sum_{s=t-d+1}^{t} \bigl( o_s^{(n)} - H_{bi}\xi_b^{(n)} \bigr) \bigl( o_s^{(n)} - H_{bi}\xi_b^{(n)} \bigr)^\top }{ \sum_{n,t,d} \gamma_{td}^{(n)}(i)\, d }    (16)

H_{pi} = \Bigl[ \sum_{n,t,d} \gamma_{td}^{(n)}(i)\, d\, \xi_p^{(n)\top} \Bigr] \Bigl[ \sum_{n,t,d} \gamma_{td}^{(n)}(i)\, \xi_p^{(n)} \xi_p^{(n)\top} \Bigr]^{-1}    (17)

\sigma_i^2 = \frac{ \sum_{n,t,d} \gamma_{td}^{(n)}(i) \bigl( d - H_{pi}\xi_p^{(n)} \bigr)^2 }{ \sum_{n,t,d} \gamma_{td}^{(n)}(i) }    (18)

where \gamma_{td}(i) is the probability of generating the serial observation sequence o_{t-d+1}, \ldots, o_t at the i-th state, defined by

\gamma_{td}(i) = \frac{1}{P(O \mid \lambda)} \sum_{j=1, j \ne i}^{N} \alpha_{t-d}(j)\, a_{ji}\, p_i(d) \prod_{s=t-d+1}^{t} b_i(o_s)\, \beta_t(i)    (19)

and \alpha_t(i) and \beta_t(i) are the forward and backward probabilities defined by

\alpha_t(i) = \sum_{d=1}^{t} \sum_{j=1, j \ne i}^{N} \alpha_{t-d}(j)\, a_{ji}\, p_i(d) \prod_{s=t-d+1}^{t} b_i(o_s)    (20)

\beta_t(i) = \sum_{d=1}^{T-t} \sum_{j=1, j \ne i}^{N} a_{ij}\, p_j(d) \prod_{s=t+1}^{t+d} b_j(o_s)\, \beta_{t+d}(j)    (21)

with \alpha_0(i) = \pi_i and \beta_T(i) = 1.
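The duration updates (17)–(18) amount to a weighted least-squares fit of state duration against the factor vector. A minimal sketch for one state follows, assuming the occupancy statistics γ have already been accumulated into flat arrays; the function name and data layout are illustrative, not from the paper.

```python
import numpy as np

def reestimate_duration_regression(gammas, durations, xis):
    """Closed-form updates of H_pi (Eq. 17) and sigma_i^2 (Eq. 18) for one state.

    gammas[k]    -- posterior weight gamma of the k-th (sequence, t, d) tuple
    durations[k] -- the corresponding duration d
    xis[k]       -- the corresponding factor vector xi_p, (Lp+1)-dimensional
    """
    gammas = np.asarray(gammas)
    d = np.asarray(durations)
    Xi = np.asarray(xis)
    num = (gammas * d) @ Xi                # sum_k gamma_k * d_k * xi_k^T
    den = (gammas[:, None] * Xi).T @ Xi    # sum_k gamma_k * xi_k xi_k^T
    H = num @ np.linalg.inv(den)           # 1 x (Lp+1) regression row
    resid = d - Xi @ H
    var = (gammas * resid ** 2).sum() / gammas.sum()
    return H, var
```

If the observed durations are an exact linear function of the factors, the update recovers the regression coefficients and the variance goes to zero, which makes the formula easy to sanity-check.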
3.3. Parameter Generation from HSMM

We now consider the problem of generating a human body parameter sequence from the trained HSMM \lambda, given an arbitrarily prescribed factor vector \xi = (\xi_b, \xi_p) and frame length T. Here we generate the human body parameter sequence x_t in the ML sense as

(x_1^*, x_2^*, \ldots, x_T^*) = \mathop{\mathrm{argmax}}_{(x_1, \ldots, x_T),\, q} P(O, q \mid \lambda, \Lambda, \xi, T)    (22)

where q = (q_1, q_2, \ldots, q_T) is the state sequence. To simplify the problem of solving (22), we obtain a suboptimal solution as follows. From the definition of the HSMM \lambda, we have

P(O, q \mid \lambda, \Lambda, \xi, T) = P(O \mid q, \Sigma, H_b, \xi_b, T)\, P(q \mid H_p, \xi_p, T).    (23)

Figure 1. Motion primitives (L_step and R_step) taken from motion capture data.

From (23), we first determine an optimal state sequence q^* by maximizing P(q \mid H_p, \xi_p, T), and then we maximize P(O \mid q^*, \Sigma, H_b, \xi_b, T). In this case, we can easily obtain the optimal state sequence as

q^* = (\underbrace{1, \ldots, 1}_{d_1}, \underbrace{2, \ldots, 2}_{d_2}, \ldots, \underbrace{N, \ldots, N}_{d_N})    (24)

where

d_i = m_i + \rho\, \sigma_i^2, \quad i = 1, 2, \ldots, N    (25)
m_i = H_{pi}\, \xi_p, \quad i = 1, 2, \ldots, N    (26)
\rho = \Bigl( T - \sum_{i=1}^{N} m_i \Bigr) \Big/ \sum_{i=1}^{N} \sigma_i^2.    (27)

Given the optimal state sequence q^*, the optimal parameter vector sequence x^* is determined by solving the following set of linear equations [13]:

R x^* = r    (28)

where

R = W^\top \Sigma^{-1} W    (29)
r = W^\top \Sigma^{-1} \mu    (30)
x^* = [x_1^{*\top}, x_2^{*\top}, \ldots, x_T^{*\top}]^\top    (31)
\mu = [\mu_{q_1^*}^\top, \mu_{q_2^*}^\top, \ldots, \mu_{q_T^*}^\top]^\top = [(H_{b q_1^*} \xi_b)^\top, (H_{b q_2^*} \xi_b)^\top, \ldots, (H_{b q_T^*} \xi_b)^\top]^\top    (32)
\Sigma = \mathrm{diag}(\Sigma_{q_1^*}, \Sigma_{q_2^*}, \ldots, \Sigma_{q_T^*})    (33)
W = [w_1, w_2, \ldots, w_T]^\top    (34)
w_t = [w_t^{(0)}, w_t^{(1)}, w_t^{(2)}]    (35)
w_t^{(k)} = [\underbrace{0_M, \ldots, 0_M}_{1\text{st}, \ldots, (t-L_k-1)\text{-th}}, w^{(k)}(-L_k) I_M, \ldots, w^{(k)}(0) I_M, \ldots, w^{(k)}(L_k) I_M, \underbrace{0_M, \ldots, 0_M}_{(t+L_k+1)\text{-th}, \ldots, T\text{-th}}]    (36)

and 0_M and I_M are the M \times M zero and identity matrices, respectively, and M is the dimensionality of x_t.
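The duration computation of Eqs. (24)–(27) can be sketched as follows; this is an illustrative NumPy sketch (function names are assumptions), showing how the per-state mean durations are shifted by ρ so that they sum to the requested frame length T.

```python
import numpy as np

def state_durations(H_p, sigma2, xi_p, T):
    """Eqs. (25)-(27): durations d_i = m_i + rho * sigma_i^2 summing to T."""
    m = np.array([H_pi @ xi_p for H_pi in H_p])  # per-state mean durations (Eq. 26)
    rho = (T - m.sum()) / np.sum(sigma2)         # Eq. (27)
    return m + rho * np.asarray(sigma2)          # Eq. (25)

def state_sequence(d):
    """Expand durations into the left-to-right state sequence q* of Eq. (24)."""
    return np.repeat(np.arange(1, len(d) + 1), np.rint(d).astype(int))
```

With ρ = 0 (as in the experiments below), each duration is simply the regression mean H_pi ξ_p; a nonzero ρ spreads the surplus or deficit of frames across states in proportion to their duration variances.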
4. Experiments

4.1. Motion Primitives and Motion Database

We assume that normal human walking is a periodic motion, and we decompose one walking cycle into two motion primitives:

L step: the back-to-front interval of the left leg
R step: the back-to-front interval of the right leg

In addition, to represent the transient motion at the beginning and end of walking, we treat the first two steps and the last two steps as primitives distinct from the walking cycle:

* step b1: the interval of the first step
* step b2: the interval of the second step
* step e2: the interval of the second last step
* step e1: the interval of the last step

where * represents L or R. The above definition of walking motion primitives might differ from one based on the physics of the human body or the dynamics of human motion. However, the experimental results will show that it is a reasonable choice for synthesizing walking motion cycles in the HSMM-based framework proposed here.

To train the motion primitive models, we captured motion data consisting of 66 series of walking movements of a male adult with varying walking pace and stride length. Every series of movements was a seven-step walking motion described by the motion label sequence "R step b1 L step b2 R step L step R step L step e2 R step e1." Human body parameters were captured at a frame rate of 30 frames/s. Each frame of motion capture data was manually classified into one of the motion primitives mentioned above and labeled. We set the beginning of each back-to-front interval of a leg at the moment the foot of the other leg reaches the floor. Figure 1 shows an example of the motion primitives taken from motion capture data. In the figure, the motion is depicted every two frames. From the motion capture data, we calculated the average walking pace in s/step and the average stride length in m/step over the middle three steps of the seven-step walking motion, denoted by ξ1 and ξ2, respectively. Figure 2 shows the distribution of walking pace and stride length for the captured data.

4.2. Experimental Conditions
We used a human body model consisting of 19 joints, and used the 3 rotation angles about the Z, X, and Y axes, which were output from the motion capture system, as the parameters representing each joint. In addition, we also used the 2-D coordinate data in the Z-Y plane and the difference between two frames of the coordinate data along the X axis, i.e., the direction of movement, for the root joint. Thus the dimensionality of the body shape parameter vector was 60. Moreover, since we used dynamic features as well as the static feature, the parameter vector at each frame was a 180-dimensional vector consisting of the 60-dimensional body shape parameter vector and its first and second delta parameter vectors. We set L1 = L2 = 1 and used (2) and (3) for calculating the dynamic parameters. We modeled each motion primitive using a single-mixture, 5-state, left-to-right HSMM with no skip paths. We assumed that the covariance matrix of each Gaussian distribution was diagonal. We used 60 series of walking movements as the training data and the remaining 6 series as the test data. The values of walking pace and stride length for the test data are shown in Figure 2 and Table 1.

Human motion animation was synthesized by generating a body shape parameter vector sequence from a given motion label sequence using the algorithm described in 3.3, and then converting the generated parameter vector sequence into 3D computer graphics. Initial regression matrices were estimated using a least-squares method. We used walking pace ξ1 and stride length ξ2 as the factors in the factor adaptive training of the motion primitive HSMMs and in the parameter generation from HSMM. Walking speed, which is given by ξ2/ξ1, might be an alternative choice of factor; however, walking speed is not a global indicator of human walking motion [18]. Moreover, since there was a strong correlation between state duration and walking pace, we used walking pace as the only factor controlling the duration distribution. As a result, the factor vectors were given by

\xi_b = [1, \xi_1, \xi_2]^\top    (37)
\xi_p = [1, \xi_1]^\top.    (38)

Figure 2. Distribution of walking pace ξ1 (s/step) and stride length ξ2 (m/step) taken from motion capture data (training and test data).

Table 1. Comparison of walking pace ξ1 (s/step) and stride length ξ2 (m/step) of test data with those of synthetic motion.

No. | Captured ξ1 | Captured ξ2 | HSMM ξ1 | HSMM ξ2 | HMM ξ1 | HMM ξ2
 1  |    0.47     |    0.444    |  0.47   |  0.431  |  0.51  |  0.467
 2  |    0.48     |    0.401    |  0.48   |  0.397  |  0.51  |  0.421
 3  |    0.50     |    0.483    |  0.51   |  0.481  |  0.51  |  0.503
 4  |    0.52     |    0.433    |  0.52   |  0.430  |  0.51  |  0.415
 5  |    0.59     |    0.417    |  0.59   |  0.407  |  0.51  |  0.369
 6  |    0.60     |    0.451    |  0.61   |  0.449  |  0.51  |  0.392
4.3. Results

To evaluate the performance of the proposed technique objectively, we synthesized seven-step walking movements corresponding to the same motion label sequence as that of the test data, given by "R step b1 L step b2 R step L step R step L step e2 R step e1." We specified pairs of values of walking pace and stride length taken from the test data, as shown in Table 1, and created walking motion. We set ρ = 0 in (25); in this case, each state duration becomes equal to H_{pi} ξ_p.

Table 1 shows the comparison between the desired values of walking pace ξ1 and stride length ξ2 and the average values over the middle three steps of the synthetic motion. In the table, entries for HSMM show the results of the proposed technique, and entries for HMM show those of the conventional technique [11] based on HMM. It can be seen that the walking pace and stride length of the synthetic motion are very close to the desired values, which did not appear in the training data. In fact, the RMS errors of walking pace and stride length between desired and resultant values are 5.8 × 10⁻³ s/step (0.17 frame/step) and 7.1 × 10⁻³ m/step, respectively. In contrast, the walking pace of the motion generated by the conventional technique does not change, because the state duration is not controlled explicitly in the HMM-based case.

Figure 3. Captured motion at the interval of the fourth and fifth steps for walking pace of 0.59 s/step and stride length of 0.417 m/step.

Figure 4. Synthetic motion at the interval of the fourth and fifth steps with prescribed walking pace of 0.59 s/step and stride length of 0.417 m/step.

Figure 3 shows captured motion at the interval of the fourth and fifth steps of test sample no. 5. In this case, the average walking pace and stride length in the middle three steps were 0.59 s/step and 0.417 m/step, respectively. Figure 4 shows the synthesized motion for the same steps when we set the factors as (ξ1, ξ2) = (0.59, 0.417). The average walking pace and stride length in the middle three steps of the synthetic motion are 0.59 s/step and 0.407 m/step, respectively. Note that the motion is depicted every two frames in these figures.

Figure 5 shows the captured and generated X differences of the root joint under the same condition. One unit of X difference in the motion capture system corresponded to 2.3 cm. Note that the proposed technique provides very smooth trajectories at the boundaries of motion primitives without any smoothing, and that the captured and synthetic trajectories around the steady walking cycles resemble each other. We have observed that the synthetic motion is smooth and realistic.

Figure 5. Captured and synthetic X differences of the root joint (X difference vs. frame number).

To show that duration control is properly performed by the proposed technique, we compared the trajectories of captured and generated rotation angles of joints using several techniques. Figure 6 shows a result for the left knee joint about the X axis. In the figure, the thin line shows the values taken from the motion capture sample with (ξ1, ξ2) = (0.62, 0.419), and the thick line shows those of the motion generated by the proposed method with the desired factor values (ξ1, ξ2) = (0.62, 0.419). In the model training, we reselected the training data so that they did not contain the above motion capture sample. In contrast, the dotted line shows the trajectory generated using a simple HSMM, where "simple" means an HSMM modeled without the factor adaptive training, and the dashed line shows that obtained by linear interpolation from another motion capture sample with (ξ1, ξ2) = (0.52, 0.421). In both cases, we adjusted the total number of frames to the same value as the real motion. It can be seen from the figure that the duration of the motion generated by the proposed technique is very similar to that of the real motion, whereas the simple HSMM and linear interpolation techniques yield different durations.

Figure 6. Comparison of captured and synthetic rotation angles of the left knee joint about the X axis (rotation angle in degrees vs. frame number).

We next investigated how the amount of training data affects the objective performance of the proposed technique. Figures 7 and 8 show examples of RMS errors between desired and resultant values for the test data with respect to the amount of training data. We randomly chose training sequences from the sixty-movement set shown in Fig. 2, increasing the number of training sequences. We can see from these figures that the performance for the case of twenty training sequences is close to that for sixty training sequences. However, since the amount of training data is not large, it should be noted that the results would change depending on how the training sequences were chosen.

Figure 7. RMS error of walking pace for test data vs. number of training sequences.

Figure 8. RMS error of stride length for test data vs. number of training sequences.

We further investigated how well the multiple regression HSMM can model walking motion over a wide range of factor values. We synthesized walking motion with changing values of the factors and judged whether the synthetic motion was natural or not. Figure 9 shows a region of the factor values for which the synthetic walking motion looked natural. In this experiment, we used sixty training sequences. It can be seen that we can synthesize natural-looking walking motion everywhere in a neighborhood of the training samples.
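As a sanity check, the RMS errors reported above can be reproduced directly from the HSMM columns of Table 1 (a small NumPy sketch; the 30 frames/s rate is from Section 4.1):

```python
import numpy as np

# Desired (test-data) and HSMM-synthesized values from Table 1.
desired_pace   = np.array([0.47, 0.48, 0.50, 0.52, 0.59, 0.60])
hsmm_pace      = np.array([0.47, 0.48, 0.51, 0.52, 0.59, 0.61])
desired_stride = np.array([0.444, 0.401, 0.483, 0.433, 0.417, 0.451])
hsmm_stride    = np.array([0.431, 0.397, 0.481, 0.430, 0.407, 0.449])

rmse = lambda a, b: np.sqrt(np.mean((a - b) ** 2))
pace_err = rmse(desired_pace, hsmm_pace)        # ~5.8e-3 s/step
stride_err = rmse(desired_stride, hsmm_stride)  # ~7.1e-3 m/step
frames_err = pace_err * 30                      # ~0.17 frame/step at 30 frames/s
```

The computed values agree with the 5.8 × 10⁻³ s/step and 7.1 × 10⁻³ m/step figures quoted in the text.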
5. Concluding Remarks

We have proposed a statistical approach to synthesizing walking motion animation automatically based on HSMM. In the approach, human motion is represented by a sequence of motion primitives modeled using HSMMs, in which the mean parameter of each probability distribution function is given by a function of factors that affect the human motion. We have also derived a factor adaptive training algorithm for motion primitive HSMMs and a parameter generation algorithm from HSMM with prescribed factors. Experimental results have shown that we can control walking movements in accordance with a change of factors such as walking pace and stride length, and that the proposed approach provides smooth and realistic human motion.

In the proposed motion generation algorithm, the whole parameter sequence of one motion is determined by solving a set of linear equations. Hence the computational complexity increases with the total number of frames, which would be a problem when synthesizing a large number of walking cycles. However, for the case of a constant or piecewise constant pace and stride length, we can avoid the problem by repeating one walking cycle with the period having the prescribed pace. Although we have assumed that human walking is a periodic motion with a constant pace and stride length, it is easy to model and synthesize walking motion with variable pace and stride length by replacing the fixed values of the factors with variable ones.

Our future work will focus on avoiding violations of body constraints and on collision detection. We are also planning to investigate the evaluation of the naturalness of the synthetic animation. More complicated human motion animation synthesis and the combination of hand gesture and speech are also future issues.

Figure 9. Example of the region of factor values (walking pace ξ1 in s/step vs. stride length ξ2 in m/step) where synthetic motion looks natural.

Acknowledgment

The authors would like to thank Dr. Takashi Masuko of the Corporate Research & Development Center, Toshiba Corporation, for his valuable discussions with us. The authors would also like to thank Prof. Makoto Sato of the Tokyo Institute of Technology for providing the motion capture data.

References

[1] A. Bruderlin and L. Williams. Motion signal processing. In Proc. 22nd Annual Conf. on Computer Graphics and Interactive Techniques, SIGGRAPH '95, pages 97–104, Los Angeles, USA, Aug. 1995.
[2] M. Brand. Shadow puppetry. In Proc. IEEE Intl. Conf. on Computer Vision, ICCV '99, pages 1237–1244, Kerkyra, Greece, Sept. 1999.
[3] M. Brand and A. Hertzmann. Style machines. In Proc. 27th Annual Conf. on Computer Graphics and Interactive Techniques, SIGGRAPH 2000, pages 183–192, New Orleans, Louisiana, USA, July 2000.
[4] L. Tanco and A. Hilton. Realistic synthesis of novel human movements from a database of motion capture examples. In Proc. IEEE Workshop on Human Motion, pages 137–142, Austin, USA, Dec. 2000.
[5] H. Sidenbladh, M. Black, and L. Sigal. Implicit probabilistic models of human motion for synthesis and tracking. In Computer Vision – ECCV 2002, LNCS 2350, pages 784–800. Springer-Verlag, 2002.
[6] K. Grochow, S. Martin, A. Hertzmann, and Z. Popovic. Style-based inverse kinematics. ACM Trans. on Graphics, 23(3):522–531, Aug. 2004.
[7] J. Lee, J. Chai, R. Reitsma, J. Hodgins, and N. Pollard. Interactive control of avatars animated with human motion data. In Proc. 29th Annual Conf. on Computer Graphics and Interactive Techniques, SIGGRAPH 2002 (ACM Trans. on Graphics, 21(3)), pages 491–500, Texas, USA, July 2002.
[8] L. Kovar, M. Gleicher, and F. Pighin. Motion graphs. ACM Trans. on Graphics, 21(3):473–482, July 2002.
[9] L. Kovar and M. Gleicher. Automated extraction and parameterization of motions in large data sets. ACM Trans. on Graphics, 23(3):559–568, July 2004.
[10] T. Haoka, T. Masuko, and T. Kobayashi. HMM-based synthesis of hand-gesture animation. IEICE Technical Report, CS2002-141/IE2002-129, pages 43–48, Dec. 2002.
[11] T. Kobayashi, Y. Takamido, and T. Masuko. Generation of ball catching and throwing movements with arbitrarily prescribed ball speed and catching position. Language Understanding and Action Control: Report of Grant-in-Aid for Creative Basic Research 13NP0301, pages 250–259, Mar. 2004.
[12] S. Levinson. Continuously variable duration hidden Markov models for automatic speech recognition. Computer Speech and Language, 1(1):29–45, 1986.
[13] K. Tokuda, T. Kobayashi, and S. Imai. Speech parameter generation from HMM using dynamic features. In Proc. IEEE Intl. Conf. on Acoust., Speech & Signal Process., ICASSP-95, pages 660–663, Detroit, USA, May 1995.
[14] T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai. Speech synthesis using HMMs with dynamic features. In Proc. IEEE Intl. Conf. on Acoust., Speech & Signal Process., ICASSP-96, pages 383–392, Atlanta, USA, May 1996.
[15] J. Yamagishi and T. Kobayashi. Adaptive training for hidden semi-Markov model. In Proc. IEEE Intl. Conf. on Acoust., Speech & Signal Process., ICASSP 2005, pages 365–368, Philadelphia, USA, Mar. 2005.
[16] A. Wilson and A. Bobick. Parametric hidden Markov models for gesture recognition. IEEE Trans. Pattern Analysis and Machine Intelligence, 21(9):884–900, Sept. 1999.
[17] K. Fujinaga, M. Nakai, H. Shimodaira, and S. Sagayama. Multiple-regression hidden Markov model. In Proc. IEEE Intl. Conf. on Acoust., Speech & Signal Process., ICASSP 2001, pages 513–516, Salt Lake City, USA, May 2001.
[18] J. Davis and H. Gao. Recognizing human action efforts: An adaptive three-mode PCA framework. In Proc. IEEE Intl. Conf. on Computer Vision, ICCV 2003, pages 1463–1469, Nice, France, Oct. 2003.