Motion Tracking by Sensors for Real-time Human Skeleton Animation

Abstract: Human Computer Interaction (HCI) research emerged in the early 1980s with the advent of computer technology. Human motion capture is the process of recording the movement of people. Among the many kinds of human motion capture devices, the Microsoft Kinect sensor and inertial sensors are currently the most popular. In this paper we propose an efficient motion tracking mechanism that constructs real-time human skeleton animation using inertial sensors. We compare the results of the proposed method with those of the Microsoft Kinect sensor for complicated motion tracking and joint position. In our experiments, our results were much steadier than the Kinect results. For motions such as crossing the hands or crossing the legs, our method gave better results than Kinect, because Kinect may lose the skeleton of the blocked parts. In addition, since we use radio-frequency inertial sensors, our method has a larger working area than Kinect.


I. INTRODUCTION
Human Computer Interaction (HCI) is an interaction interface between humans and computers. With the growth of computing capability and the maturing of electronic devices, HCI has gained much attention in research and industry, and has been applied extensively in sports, special effects, user interfaces, training, rehabilitation, computer animation for movies, and video games. Hadjidj et al. used wireless sensor networks for rehabilitation [1]. The data-driven system introduced by Kurakin et al. is capable of automatic hand gesture recognition in real time using a commodity depth camera [2]. During the last decade, micro sensors have proven to be a good alternative to traditional optical motion capture systems because of their low cost and self-contained nature. Real-time inertial tracking of human motion requires attaching inertial sensors to the major segments of the human body. Shiratori et al. presented the theory and practice of using body-mounted cameras to reconstruct the motion of a subject, and showed results in settings where capture would be difficult or impossible with traditional motion capture systems [3].
Slyper and Hodgins created a performance animation system that leverages the power of low-cost accelerometers [4]. Whereas their setup was based on an upper-body suit, another study introduced a novel framework for generating full-body animations controlled by only four 3D accelerometers attached to the extremities of a human actor [5]. Favre et al. proposed a new calibration procedure adapted for the joint coordinate system (JCS), which requires only inertial measurement unit (IMU) data [6]. Another study presented an extended Kalman filter for the fusion of inertial and magnetic sensing, used to estimate relative positions and orientations [7]. Altun et al. presented different techniques for classifying human activities performed while wearing body-worn miniature inertial and magnetic sensors [8].
In this paper, we introduce a wearable real-time human motion capture system using inertial sensors. A Kalman filter is applied to integrate the sensor output data. We then compare our position and tracking information with that of the Microsoft Kinect sensor.
The rest of this paper is organized as follows. Section II discusses related work on inertial-sensor human motion tracking and gait analysis. The proposed method is explained in Section III. Section IV discusses the results in comparison with Kinect. Conclusions and future work are presented in Section V.

A. Inertial Sensor Based Motion Capture
Inertial sensors can offer an accurate and reliable method to study human motion; however, the degree of accuracy and reliability is site- and task-specific [9]. Most inertial systems use gyroscopes to measure rotational rates. Inertial tracking systems have thus attracted much interest [9][10][11]. Raptis et al. presented a real-time gesture classification system for skeleton wireframe motion, whose key components include the design of an angular representation of the skeleton [12].

B. Kinematics
We use Euler angles to represent the rotation of each segment of the human skeleton. The human body model comprises 14 segments linked by 15 joints. We use rotation information to determine the position and orientation of the joints. For the relative movement between a parent joint and its child joint, we use both forward kinematics and inverse kinematics, as shown in Figure 1. If only forward kinematics is used, the model remains fixed at the centre joint of the human model. With inverse kinematics, the position of a parent joint can be calculated from the position of its child joint and the orientation vector.
The kinematics equations for the series chain of a robot are obtained using a rigid transformation [Z] to characterize the relative movement allowed at each joint and a separate rigid transformation [X] to define the dimensions of each link [13].
For a serial open chain, the result is a sequence of rigid transformations alternating joint and link transformations from the base of the chain to its end link. A chain of n links connected in series has the kinematic equations
$$[T] = [Z_1][X_1][Z_2][X_2] \cdots [X_{n-1}][Z_n],$$
where [T] is the transformation locating the end link. Notice that the chain includes a 0th link consisting of the ground frame to which it is attached. These equations are called the forward kinematics equations of the serial chain [13].
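As a minimal sketch of the alternating joint and link product described above, the following Python snippet composes planar homogeneous transforms for a short serial chain. The planar simplification, the function names (`joint_z`, `link_x`, `forward_kinematics`) and the link lengths are illustrative assumptions and are not taken from the system described in this paper.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def joint_z(theta):
    """Joint transform [Z] (assumed planar): rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def link_x(length):
    """Link transform [X] (assumed planar): translation along the local x axis."""
    return [[1.0, 0.0, length], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def forward_kinematics(angles, lengths):
    """Compose [Z1][X1][Z2][X2]... from the base outward.

    Each joint is followed by its link, so the returned (x, y) is the
    tip of the last link expressed in the ground frame.
    """
    t = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for theta, length in zip(angles, lengths):
        t = matmul(matmul(t, joint_z(theta)), link_x(length))
    return t[0][2], t[1][2]

# Two unit links: the first joint turns +90 degrees, the second turns back by 90.
tip = forward_kinematics([math.pi / 2, -math.pi / 2], [1.0, 1.0])
```

Because every transform is applied on the right, each joint angle is interpreted relative to the preceding link, which is the parent-child relationship used for the skeleton.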

C. Kalman Filter
The algorithm works in two steps. In the prediction step, the Kalman filter produces estimates of the current state variables, along with their uncertainties. Once the outcome of the next measurement is observed, these estimates are updated using a weighted average. The objective of the Kalman filter is to estimate the state of a linear system by assuming that the true state at time k evolves from the state at (k−1) according to
$$x_k = F_k x_{k-1} + B_k u_k + w_k,$$
where $F_k$ is the state transition model, which is applied to the previous state $x_{k-1}$ and, together with the input values, yields the new state value. In our experiments, we set $F_k$ to 1.
$B_k$ is the control input model, which is applied to the control vector $u_k$, and $w_k$ is the process noise, which is assumed to be drawn from a zero-mean multivariate normal distribution with covariance $Q_k$: $w_k \sim N(0, Q_k)$ [15]. At time k an observation (or measurement) $z_k$ of the true state $x_k$ is made according to
$$z_k = H_k x_k + v_k,$$
where $H_k$ is the observation model, which maps the true state space into the observed space, and $v_k$ is the observation noise, which is assumed to be zero-mean Gaussian white noise with covariance $R_k$: $v_k \sim N(0, R_k)$ [15]. The initial state and the noise vectors at each step, $\{x_0, w_1, \ldots, w_k, v_1, \ldots, v_k\}$, are all assumed to be mutually independent [15].
The Kalman filter produces an optimal state estimate by recursively updating the system state and the estimation error covariance, $P_k$. This estimation error covariance is used to calculate the optimal Kalman gain, $K_k$, which is used in (4) to estimate the next state from the input data. $R$ is a covariance that can be defined by the user. Finally, the estimation error covariance $P_k$ is updated again using the updated $K_k$ [15].
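The predict and update cycle can be sketched for a single scalar state as follows. The state transition value of 1 follows the text; the observation model H = 1, the absence of a control input, and the values chosen for Q, R and the initial state are assumptions made only for illustration.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter with F = 1, H = 1 (assumed) and no control input.

    q is the process noise variance, r the measurement noise variance,
    x0 and p0 the initial state estimate and its error variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction: with F = 1 the state carries over; uncertainty grows by q.
        x_pred = x
        p_pred = p + q
        # Update: the gain k weights the new measurement against the prediction.
        k = p_pred / (p_pred + r)
        x = x_pred + k * (z - x_pred)
        p = (1.0 - k) * p_pred
        estimates.append(x)
    return estimates

# A constant signal of 5.0 observed 50 times; the estimate approaches 5.0.
smoothed = kalman_1d([5.0] * 50, q=1e-4, r=0.1)
```

A larger R makes the filter trust the measurements less and therefore smooths more strongly, at the cost of a slower response to real motion.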

A. System Overview
The system uses sensors to capture human motion information. First, denoising and processing are performed to correct errors and compensate for disturbances; the real-time human motion is then displayed. As shown in the hardware design diagram in Figure 2, the inertial sensors connect to a Micro Centre Unit (MCU) through nRF24L01 wireless modules. The nRF24L01 is a highly integrated, ultra-low-power 2 Mbps RF transceiver IC for the 2.4 GHz band. Transmitting floating-point data over nRF24L01 or Bluetooth is not straightforward, because both links transfer data as bytes. The MCU gathers the inertial sensor data and then uses the nRF24L01 to transmit it to the computer. Initially, we wanted to use a single nRF24L01 server. However, since we had 5 clients, communicating with all of them reduced the data rate: even if data gathering is fast, the effective rate becomes 5 times slower. Therefore, we had to use 5 independent servers for the 5 independent clients.
Three kinds of nodes are used to construct the virtual skeleton model: the centre of gravity (COG), joints and segments. The COG is the centre point of the human model and, at the same time, the root joint of the human skeleton model. A segment is the component unit of the skeleton model and describes one part of the model. The COG is also the most important node when estimating human motion: it is itself a joint, but it is the centre of the other joints and the coordinate origin for all of them. Once we have detected which leg is the support leg and which leg is swinging, we can calculate the position of the COG using inverse kinematics.
In this study we constructed a human model containing 14 segments and 15 joints. As shown in Figure 3, red points indicate sensors and blue points represent joints. We attached 9 sensors to the major segments to record human motion. We attached two sensors to the forearms and the upper arms to record the motion of the elbow and shoulder joints. One sensor is attached to the head to track the head's movement. The chest has one sensor to describe the body's movement. For each leg, we attached two sensors, at the thigh and the calf, to record the motion of the knee and ankle joints.
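Regarding the difficulty of sending floating-point values over a byte-oriented link, one common approach is to serialise each value into a fixed number of bytes before transmission and to decode it on the receiving side. The sketch below uses Python's standard `struct` module on the PC side; the three-angle payload layout and the function names are assumptions for illustration, not the packet format used in this work.

```python
import struct

# Assumed payload: three 32-bit floats in little-endian order (12 bytes),
# which fits within the 32-byte maximum payload of an nRF24L01 packet.
PAYLOAD_FORMAT = "<3f"

def pack_angles(roll, pitch, yaw):
    """Encode three float32 values as 12 bytes."""
    return struct.pack(PAYLOAD_FORMAT, roll, pitch, yaw)

def unpack_angles(payload):
    """Decode a 12-byte payload back into a tuple of three floats."""
    return struct.unpack(PAYLOAD_FORMAT, payload)

payload = pack_angles(1.5, -0.25, 90.0)
roll, pitch, yaw = unpack_angles(payload)
```

Both ends must agree on byte order and field order; fixing them explicitly in the format string avoids depending on the native layout of either the MCU or the PC.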

B. Process of Motion
Euler angles represent three elemental rotations about the axes of the coordinate system: for instance, a first rotation about z by an angle α, a second about x by an angle β, and a third again about z by an angle γ (Figure 4 (a)). The axes of the original frame are denoted x, y, z and the axes of the rotated frame are denoted X, Y, Z. The line of nodes (N) is defined as the intersection of the xy and the XY coordinate planes. In other words, it is a line passing through the origin of both frames and perpendicular to the zZ plane, on which both z and Z lie, as shown in Figure 4 (b). The three Euler angles are defined as follows: α (or Φ) is the angle between the x axis and the N axis, β (or Θ) is the angle between the z axis and the Z axis, and γ (or Ψ) is the angle between the N axis and the X axis. This implies that α, β and γ represent rotations around the z axis, the N axis and the Z axis, respectively.
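The z-x-z convention described above can be illustrated by composing the three elemental rotations into one matrix. This is a sketch under the assumption that the rotations are intrinsic (about z, then the line of nodes N, then Z) and that the matrix maps coordinates in the rotated frame to the original frame; the function names are illustrative.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_z(angle):
    """Elemental rotation about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    """Elemental rotation about the x axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def euler_zxz(alpha, beta, gamma):
    """Rotation matrix R = Rz(alpha) * Rx(beta) * Rz(gamma).

    The columns of R are the rotated axes X, Y, Z expressed in the
    original x, y, z frame.
    """
    return matmul(matmul(rot_z(alpha), rot_x(beta)), rot_z(gamma))

R = euler_zxz(0.3, 0.7, -1.1)
# R[2][2] is the cosine of the angle between z and Z, i.e. cos(beta).
```

The entry `R[2][2]` gives a quick consistency check with the definition of β as the angle between the z axis and the Z axis.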
Most pedometers demonstrate an acceptable level of accuracy and reliability in step-count measurement [16]. Many studies have examined the accuracy, reliability and validity of pedometers [17][18][19][20]. Hasson et al. proposed the first validation study examining pedometer performance under a variable-speed condition [17].
All accelerometers provide basic step counting and activity counts. Important gait parameters can be measured with an accelerometer to evaluate a person's risk of falling and mobility level [21]. Bamberg et al. proposed a gait analysis system using integrated wireless sensors [22]. Figure 5 shows the gait shoe system, with labels indicating relevant anatomical markers. For the analysis of the kinematic motion of the foot, two dual-axis accelerometers and three gyroscopes were placed at the back of the shoe, oriented such that the individual sensing axes were aligned along three perpendicular axes. Figure 6 shows the relevant coordinate systems used for the analysis of the data. The first is the global reference frame of the room, and the second is the local body frame, in which the sensors are mounted and collect their measurements.
Fig. 5 Schematic of the Gait Shoe system [22]
International Journal on Advances in ICT for Emerging Regions December 2016

D. Human Model Construction
We take the right arm as an example to analyze the position and orientation of joints. The right arm comprises two body segments, the right upper arm and the right forearm. It is modeled as a kinematic chain of these two rigid segments linked by joints, i.e. the right shoulder, right elbow and right wrist. The right shoulder is considered the root joint of the right arm model. The skeleton model is established from the root joint downward to form a kinematic model in which the joints obey a parent-child relationship [23,24]. We assume that the length of each segment is fixed. The joint position is obtained from
$$P_{\mathrm{elbow}} = P_{\mathrm{shoulder}} + Q_{\mathrm{upperarm}} \otimes V_{\mathrm{upperarm}} \otimes Q_{\mathrm{upperarm}}^{-1} \qquad (6)$$
Here, $P_{\mathrm{shoulder}}$ is the position of the shoulder joint and $V_{\mathrm{upperarm}}$ is the initial vector of the upper arm segment, which must be known before the calculation. $Q_{\mathrm{upperarm}}$ is the orientation of the upper arm segment, represented by a quaternion that rotates the segment vector in the global coordinate system. ⊗ is the quaternion multiplication operator.
Once the elbow joint position has been calculated, the wrist joint position follows from
$$P_{\mathrm{wrist}} = P_{\mathrm{elbow}} + Q_{\mathrm{forearm}} \otimes V_{\mathrm{forearm}} \otimes Q_{\mathrm{forearm}}^{-1} \qquad (7)$$
Since $P_{\mathrm{elbow}}$ is given by (6), $P_{\mathrm{wrist}}$ is obtained by substituting (6) into (7). Applying this to the whole human model gives the relation and position of each joint.
The human lower body comprises seven body segments: the pelvis, the left and right femurs, the left and right tibias, and the left and right feet. It is modeled as a kinematic chain of these seven rigid segments linked by joints, i.e. the hips, knees, ankles and toes. The pelvis is considered the root joint of the model. The skeleton model is established from the root joint downward to form a kinematic model in which the joints obey a parent-child relationship. These rigid body segments can be represented by vectors. We build two sets of lower body segments, namely $S_L$ = {Pelvis, LeftFemur, LeftTibia, LeftFoot} and $S_R$ = {Pelvis, RightFemur, RightTibia, RightFoot}, and two sets of joints, $J_L$ = {Pelvis, LeftHip, LeftKnee, LeftAnkle, LeftToe} and $J_R$ = {Pelvis, RightHip, RightKnee, RightAnkle, RightToe}. In these sets, the preceding element is the parent. We take the right lower limb as an example to demonstrate how position information is transmitted between lower body segments according to the segmental kinematics. From the proximal joint (e.g. the root joint) to the distal joint, the position of a child joint can be calculated from the position of its parent joint using
$$P_{J_R(k)} = P_{J_R(k-1)} + Q_{S_R(i)} \otimes V_{S_R(i),0} \otimes Q_{S_R(i)}^{-1},$$
where i = 1, 2, 3, 4 and k = 2, 3, 4, 5. From the distal joint (e.g. the right toe) to the proximal joint, the position of a parent joint can be calculated from the position of its child joint using
$$P_{J_R(k-1)} = P_{J_R(k)} - Q_{S_R(i)} \otimes V_{S_R(i),0} \otimes Q_{S_R(i)}^{-1},$$
where i = 4, 3, 2, 1 and k = 5, 4, 3, 2.
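The parent-to-child and child-to-parent position updates can be sketched with a small quaternion helper. The sketch assumes unit quaternions in (w, x, y, z) order and uses made-up segment vectors and lengths; none of the function names or numeric values come from this paper.

```python
import math

def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def q_conj(q):
    """Conjugate, which equals the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotate(q, v):
    """Rotate vector v by unit quaternion q: q * (0, v) * q^-1."""
    _, x, y, z = q_mul(q_mul(q, (0.0, v[0], v[1], v[2])), q_conj(q))
    return (x, y, z)

def axis_angle(axis, angle_rad):
    """Unit quaternion for a rotation of angle_rad about a unit axis."""
    s = math.sin(angle_rad / 2.0)
    return (math.cos(angle_rad / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def child_position(parent, q_segment, v_segment0):
    """Child joint = parent joint + rotated initial segment vector."""
    r = rotate(q_segment, v_segment0)
    return tuple(p + d for p, d in zip(parent, r))

def parent_position(child, q_segment, v_segment0):
    """Parent joint = child joint - rotated initial segment vector."""
    r = rotate(q_segment, v_segment0)
    return tuple(c - d for c, d in zip(child, r))

# Hypothetical arm: upper arm 0.30 m and forearm 0.25 m, both hanging along -y.
shoulder = (0.0, 0.0, 0.0)
q_upper = axis_angle((0.0, 0.0, 1.0), math.pi / 2)  # raise the arm by 90 degrees
elbow = child_position(shoulder, q_upper, (0.0, -0.30, 0.0))
wrist = child_position(elbow, (1.0, 0.0, 0.0, 0.0), (0.0, -0.25, 0.0))
```

Calling `parent_position` with the same quaternion and segment vector recovers the shoulder from the elbow, which is the direction of propagation used when a foot joint, rather than the pelvis, serves as the root.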
Range of Motion (ROM) is the angle through which a joint may normally travel. ROM gives the maximum angles of joint motion, including ante flexion, posterior extension, abduction and adduction. We therefore used ROM to limit the joint angles of the skeleton model and to calibrate the captured sensor data. Taking the right shoulder (RS) joint as an example, the range of motion of the shoulder joint is 90° for ante flexion, 60° for posterior extension, 90° for abduction and 40° for adduction, as depicted in Figure 7 (a). This gives -60° < $\theta_{RSx}$ < -180° and -180° < $\theta_{RSy}$ < 40° (10). Since the length of each segment is assumed to be fixed, let the shoulder joint coordinates be (0, 0, 0) and the upper arm length be $l_0$. The elbow coordinates can then be calculated from $l_0$ and the shoulder angles. Using the values of $\theta_{RSx}$ and $\theta_{RSy}$ from (10), the motion range of the elbow joint can be calculated. By comparing this elbow range with the result of (6), we can filter the input sensor data. If the result of (6) is not within the elbow range, we assume that it is not a regular human motion.
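The filtering idea can be sketched as a simple range check on the joint angles. The limits below reuse the shoulder values quoted above; the sign convention (positive for ante flexion and abduction, negative for posterior extension and adduction), the dictionary layout and the function names are assumptions, and the check is applied directly to angles rather than to the derived elbow position.

```python
# Assumed sign convention: extension and adduction are negative.
SHOULDER_ROM_DEG = {
    "flexion": (-60.0, 90.0),    # posterior extension .. ante flexion
    "abduction": (-40.0, 90.0),  # adduction .. abduction
}

def within_rom(angles_deg, rom=SHOULDER_ROM_DEG):
    """Return True if every angle lies inside its allowed range."""
    return all(rom[name][0] <= angles_deg[name] <= rom[name][1] for name in rom)

def filter_samples(samples, rom=SHOULDER_ROM_DEG):
    """Keep only samples that describe a plausible human posture."""
    return [s for s in samples if within_rom(s, rom)]

samples = [
    {"flexion": 45.0, "abduction": 10.0},   # plausible posture
    {"flexion": 120.0, "abduction": 10.0},  # beyond ante flexion limit
]
plausible = filter_samples(samples)
```

Rejecting a sample is the simplest policy; clamping the angle to the nearest limit is an alternative that keeps the animation continuous.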
For the right elbow (RE) joint shown in Figure 7 (b), the joint's own range of motion is not the only constraint: because the joint is part of a kinematic chain, it must also follow the transfer relation from the shoulder. Its angles are therefore restricted to a corresponding range. Once the elbow's range of motion is known, and with the forearm length denoted $l_1$, the end position values can be calculated.

E. Extended Kalman Filter
In this research, by integrating the output of the gyroscope and the accelerometer, the Kalman filter provided a noisy and disturbed but drift-free measurement of orientation. From this we obtain the process model and the measurement model for the extended Kalman filter (EKF).
For a sensor at rest, the linear acceleration is quite small, so the signals from an accelerometer can be regarded as the gravitational acceleration. In most situations, however, kinematic linear acceleration is present. The data from the accelerometer and the gyroscope are therefore fused using the EKF to calculate the gravitational acceleration. The state vector of the dynamical system is the gravitational acceleration expressed in the sensor coordinate frame, denoted $g^s$. The control vector of the system is the angular velocity from the gyroscope, denoted $\omega^s$. The measurement vector of the system is the acceleration data from the accelerometer, denoted $a^s$. The state equation of the dynamical system is expressed in terms of these quantities and the sampling period $T$ of the sensors. The measurement equation of the system includes the measurement noise $n = [n_{gx}, n_{gy}, n_{gz}]^T$.
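For orientation only, a gravity-tracking model of this kind is often written in the following form. This is an assumed, illustrative formulation (first-order discretisation, additive process noise $w_k$, negligible linear acceleration) and not necessarily the exact pair of equations used in this work; $[\omega^s_k \times]$ denotes the skew-symmetric cross-product matrix built from the gyroscope reading.

```latex
% Assumed illustrative model, not taken from the paper.
\begin{aligned}
g^s_{k+1} &= \left(I_3 - T\,[\omega^s_k \times]\right) g^s_k + w_k, \\
a^s_k     &= g^s_k + n_k .
\end{aligned}
```

The first line expresses that a vector fixed in the global frame appears to rotate with angular velocity $-\omega^s$ when seen from the rotating sensor frame; the second line treats the accelerometer output as gravity plus measurement noise.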
The resulting motion of the human skeleton is described using Euler angles, so all sensor data are converted into angles in order to drive the dynamics of the human motion. The gyroscope output is converted into an angle by integrating it over the sensor sampling time, and the accelerometer output is converted into a deflection angle using trigonometry.
$$\mathrm{Angle}(A_{xz}) = \sin^{-1}(x / \mathrm{gravity}) \qquad (14)$$
Initially, we considered only equation (14). However, when x is larger than gravity, the arcsine has no solution. There was also an error when only equation (15) was used and the z value was at its minimum. Hence we combined the two formulas and ran them while wearing the sensors, in order to obtain a much more stable angle. We then noticed a small difference between the two angles. We selected the point of minimum difference between the two formulas, which gave an accelerometer value of 0.4477539 m/s².
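The two conversions described above can be sketched as follows. Clamping the arcsine argument and using a two-axis `atan2` estimate are swapped-in, commonly used remedies for the failure cases mentioned in the text; they are not the combined formula used in this work, and the nominal gravity value is an assumption.

```python
import math

GRAVITY = 9.81  # m/s^2, assumed nominal value

def accel_tilt_asin(x, gravity=GRAVITY):
    """Tilt in degrees from equation (14), with x/gravity clamped to [-1, 1]."""
    ratio = max(-1.0, min(1.0, x / gravity))
    return math.degrees(math.asin(ratio))

def accel_tilt_atan2(x, z):
    """Alternative two-axis tilt estimate; defined even when z is near zero."""
    return math.degrees(math.atan2(x, z))

def integrate_gyro(angle_deg, rate_deg_per_s, dt):
    """Advance an angle by one gyroscope sample of duration dt seconds."""
    return angle_deg + rate_deg_per_s * dt

# A reading slightly above gravity no longer raises a math domain error.
tilt = accel_tilt_asin(10.2)
# One 20 fps frame (dt = 0.05 s) at 5 deg/s advances the angle by 0.25 deg.
angle = integrate_gyro(10.0, 5.0, 0.05)
```

The gyroscope integral responds quickly but drifts over time, whereas the accelerometer angle is drift-free but noisy, which is why the two are fused rather than used alone.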

F. Gait Estimation
Walking estimation is conducted in two steps, walking status detection and kinematics analysis, as written in Algorithm 3. Walking status detection determines the support leg, whose ankle joint is selected as the root joint of the lower body model during walking. Human walking is a cyclical motion. It can be divided into two phases, as shown in Figure 8 (a): the stance phase (ST) and the swing phase (SW).
The stance phase (ST), which starts with a heel strike, is the portion of the cycle during which a foot is in contact with the ground. The swing phase (SW), which starts with a toe-off, is the portion of the cycle during which the foot is not in contact with the ground. At times during walking, both legs are on the ground; this is the double stance phase (DS). During the DS phase, we consider the leg that is just entering the ST to be the support leg. The stick model of the walking gait is shown in Figure 8 (b).
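The support-leg rule for the double stance phase can be sketched as a small state machine. How foot contact is detected is not specified here, so the per-frame boolean contact flags, the function name and the initial support leg are all hypothetical inputs chosen for illustration.

```python
def support_leg_sequence(contacts, initial="left"):
    """Return the support leg for each frame.

    contacts is an iterable of (left_on_ground, right_on_ground) booleans.
    Single stance: the foot on the ground is the support leg.
    Double stance: the foot that has just touched down becomes the support leg.
    No contact: the previous support leg is kept.
    """
    support = initial
    prev_left, prev_right = True, True  # assume both feet start on the ground
    result = []
    for left, right in contacts:
        if left and not right:
            support = "left"
        elif right and not left:
            support = "right"
        elif left and right:
            if not prev_left:
                support = "left"    # left foot just entered stance
            elif not prev_right:
                support = "right"   # right foot just entered stance
        result.append(support)
        prev_left, prev_right = left, right
    return result

frames = [(True, False), (True, True), (False, True), (True, True), (True, False)]
legs = support_leg_sequence(frames)
```

The ankle of the returned leg would then serve as the root joint from which the other joint positions are propagated.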

IV. EXPERIMENTAL RESULTS AND DISCUSSION
We performed the following experiments to evaluate the efficacy of our inertial-sensor-based human motion tracking system. We used the OpenNI library because it is open source and makes it easy to obtain the coordinates of each joint in the human model. First, we compared the tracking results. Then we recorded the position data of 14 joints with our method and with Kinect: left shoulder, left elbow, left wrist, right shoulder, right elbow, right wrist, left hip, left knee, left ankle, right hip, right knee, right ankle, chest and head.

A. Comparing with Kinect Result
While running our method and the OpenNI project at the same time, we recorded the posture results. Inertial sensors were mounted on body segments including the upper arms, forearms, chest, thighs and shanks. First we sat on a chair and swung the arms. Then we stood and took many more postures. Figure 9 shows six screenshots of the recorded results. In each screenshot, the left picture shows the result of the proposed method and the right picture shows the Kinect OpenNI result. In this experiment, the proposed motion tracking system tracked the position of the human skeleton well in real time. We recorded the coordinate values of 13 skeleton joints for both methods, including the shoulder, elbow, wrist, hip, knee and ankle joints and the chest. Among them, we take the chest, left wrist and right ankle as examples. Figure 10 shows the recorded coordinates of these three joints.

C. Applications
HCI is widely used in many parts of human life, such as video games, virtual reality, 3D technology and multimedia. We used our method to play video games and to control Microsoft PowerPoint. Figure 11 (a) shows a first-person shooter (FPS) game played using our method. The chest joint angle controls the rotation of the player: when the chest joint leans to the right, the player in the video game turns to the right. Ankle joint movement controls the player's steps. The player fires when we raise the right hand. A flight simulator game was also tested using our method, as shown in Figure 11 (b). We used the posture of the upper body to control the plane's flight state.

D. Walking Estimation
Fig. 11 Game control using our method: (a) FPS game, (b) flight simulator game

Because we use two kinds of kinematics, our model is not fixed at the centre of mass. The model can walk as we do in reality. We recorded many walking frames, as shown in Figure 12.

V. CONCLUSION AND FUTURE WORK
This study implemented a human motion capture (HMC) system using multiple inertial sensors and introduced an efficient human tracking algorithm. By additionally applying an accurate gait estimation algorithm, we built a real-time human skeleton animation capture system. The system was fast and stable in an area of 20 m × 20 m, which is much larger than the working area of Kinect.
A series of experiments, including human motion with complicated movements, showed that our method can do what the Kinect SDK does, with much less jitter in the data. For motions such as crossing the hands or crossing the legs, our method produced better results than Kinect, because Kinect may lose the skeleton of the blocked part. Furthermore, our method has a larger working area than Kinect, because it uses a radio-frequency sensor that provides a communication range of 10 m or more. By applying kinematics, we could make the skeleton move as in reality, for example in walking estimation. Additionally, we tested our method with two video games, and it worked well in both.
As possible extensions of this work, we would like to improve our system in terms of hardware design, performance and experiments. Currently, the method runs at only 20 fps; in future, we plan to use a faster micro centre unit to increase the frame rate. Since the proposed method uses the nRF24L01 for communication between the PC and the MCU, which has a range of 10 to 15 meters, we plan to replace the nRF24L01 with a Wi-Fi unit, which would allow an even larger working area. Furthermore, by adding more joints, such as a crotch joint and a back joint, much more detailed human motion could be presented. We plan to build a back propagation neural network that uses the Kinect results as the standard output and the results of our method as the input data, to obtain better capture results for human motion. In addition to Kinect, we would like to compare our method with other inertial sensor systems, such as those from Xsens or other companies.

Fig. 3 Model construction and overview of the system

Fig. 6 Schematic of the Gait Shoe system [22]

C. Back Propagation Neural Network
At first, low-resolution 8-bit sensor data were used, which provide a range of 0 to 255. Hence, a small change in the sensor data can result in a large angle change in the calculation. To improve the sensor resolution, we used 16 bits, which provide a range of 0 to 65,535. For more accurate results, we wanted to use a neural network to verify our results against the Kinect results. The back propagation learning algorithm can be divided into two phases: propagation and weight update.

Fig. 7 The range of motion (a) right shoulder Joint, (b) right elbow joint
In the stick model of Figure 8 (b): (p) left terminal swing, right terminal stance, right toe reference; (q), (r), (s) left support leg, left toe reference; (t) left terminal stance, right terminal swing, left toe reference. Grey dashed lines represent the lower body movements.

Fig. 9 Experimental results of our method and Kinect OpenNI. Left: result of our method; right: Kinect OpenNI result. In (a) and (b) the subject sits on a chair; in the remaining panels (c), (d), (e) and (f) the subject takes various standing postures.

Fig. 10 Coordinate diagram results. (a) Chest position using our method, (b) using Kinect OpenNI. (c) Left wrist position using our method, (d) using Kinect OpenNI. (e) Right ankle position using our method, (f) using Kinect OpenNI.

Figure 10 depicts the comparative joint coordinate results for our method and the Kinect sensor. The motion analysis is plotted with position on the y-axis in meters against time in seconds. Figures 10(a) and 10(b) both show the coordinates of the chest joint: 10(a) uses our proposed method and 10(b) uses the Kinect sensor. The result in 10(a) is a little more convergent and the data are much smoother. Figures 10(c) and 10(d) show the location of the left wrist, while 10(e) and 10(f) show the right ankle. The values corresponding to Figures 10(a), 10(b) and to 10(c), 10(d) are given in Table 1 and Table 2, respectively.

TABLE 1. MOTION ANALYSIS OVER TIME IN SECONDS AND JOINT POSITION VALUES CORRESPONDING TO FIGURE 10(a) AND 10(b)