Force Sensorless Admittance Control With Neural Learning for Robots With Actuator Saturation

In this paper, we present a sensorless admittance control scheme for robotic manipulators interacting with unknown environments in the presence of actuator saturation. The external environment is modeled as a linear system with unknown dynamics. Using admittance control, the robotic manipulator is controlled to be compliant with the external torque from the environment. The external torque acting on the end-effector is estimated by a disturbance observer based on generalized momentum. The model uncertainties are approximated by radial basis function neural networks (NNs). To guarantee the tracking performance and counteract the effect of actuator saturation, an adaptive NN controller integrating an auxiliary system is designed. The stability of the closed-loop system is established using Lyapunov stability theory. Experiments on the Baxter robot verify the effectiveness of the proposed method.


I. INTRODUCTION
In recent years, robots have been increasingly applied in a wide range of fields, such as elderly care, medical care and entertainment. In these settings, the robot faces unknown and complex environments, so physical interaction with the environment is an inevitable part of robot behaviour, and the interactive behaviour may even be the main objective of control design. As the demands on robot intelligence grow, robots are expected to complete more difficult tasks safely alongside human beings in production and daily life. Robots are therefore required to learn and adapt to the environment to achieve compliant behaviours.
To help the robot adapt to an unknown environment and behave compliantly, force sensing is essential. Force sensing enables a robot to detect contact with the objects around it. The traditional way to achieve force sensing is to use force sensors. However, force sensors are often expensive, complicate system integration, and can increase the complexity of task execution. Therefore, sensorless techniques have attracted growing research interest. Early approaches to estimating the external force were introduced in [1], [2] and applied to robotic manipulators in [3]. Disturbance observers based on the control error were often used for force estimation in early robotic applications [4]. More recently, force observers based on generalized momentum have been proposed as an alternative [5], [6]. The advantage of the generalized-momentum-based approaches is that the joint acceleration is not needed. In [5], this method is extended with a filtered model and a recursive least-squares estimator. In [6], a Kalman filter is integrated into the generalized momentum approach to estimate the contact forces/torques in Cartesian space.
Interaction control between robots and environments has been studied for a long time and has attracted much attention. Hybrid position/force control [7] was the most widely used method before impedance control was proposed. However, when the environment is stiff, it may cause instability during the interaction. Impedance control aims to establish a relationship between the manipulator and the environment and has been shown to be more robust [8]. In the early literature, researchers focused on dealing with uncertainties with passive impedance models in robotic systems. Therefore, impedance control combined with adaptive control has often been studied [9]. In [10], a desired impedance model is obtained with consideration of the environmental dynamics. Under impedance control, the manipulator can be compliant with the unknown environment [11]. Admittance control, regarded as the inverse of impedance control, is another scheme for achieving compliant behavior [12]. In contrast to impedance control, in admittance control the system receives a force from the environment and outputs a motion. The compliant behavior of the manipulator is then achieved by adapting the trajectory to the environment [13].
Under admittance control, the tracking performance after trajectory adaptation is essential. Control strategies can be divided into two categories, namely model-based control and model-free control. Compared with model-free control, model-based control achieves better control performance but depends heavily on the model accuracy. In practical systems, due to the existence of nonlinearities and uncertainties [14], perfect knowledge of the model cannot be assumed. Therefore, adaptive control methods integrated with intelligent architectures [15], [16] have been widely researched. Unlike traditional control methods, these adaptive methods have powerful approximation ability and do not require the complete dynamics of the robot model [17]-[21]. In [22], to improve the testing performance of dynamically substructured systems (DSS), an adaptive NN-based controller is proposed in which neural networks approximate the uncertainties and nonlinearities of the DSS dynamics. In [23], a fuzzy logic system is employed in a backstepping control method to approximate complicated functions. Evolutionary algorithms are also combined with fuzzy systems to achieve optimal performance [24]. In [25], ant colony optimization and particle swarm optimization are integrated into fuzzy control systems to avoid the time-consuming task of manually designing the controllers.
In practical control systems, saturation is a common and unavoidable actuator nonlinearity, and dealing with it is important. Saturation not only degrades the control performance but may also lead to instability [26]. Considerable research effort has therefore been devoted to this topic. Based on adaptive control theory, several adaptive schemes have been derived to handle actuator saturation [27]. In recent years, neural learning adaptive schemes have received much attention [28]. In [29], based on a state observer, neural networks are employed in the control design to deal with the effects of unknown disturbances and the saturation nonlinearity. In [30], a well-defined smooth function and a Nussbaum function are integrated into the adaptive control design; the saturation effect leads to nonlinear terms, which are compensated by the Nussbaum function. In [31], an adaptive neural impedance controller is designed for an n-link robotic manipulator with input saturation, and an auxiliary system is introduced in the control design to deal with the saturation effect.
This paper is a continuation of our previous work [32], and the contributions are summarized as follows:
(i) An admittance control method is employed to achieve compliant behaviour with consideration of the environmental dynamics in robot-environment systems.
(ii) An RBFNN-based controller integrating an auxiliary system is designed to handle actuator saturation and uncertainties in the robotic system.
(iii) The external torque in the joint-space admittance model is estimated by a torque observer, which replaces force sensors and reduces the system burden.
The rest of the paper is organized as follows. In Section II, the problem statement and preliminaries are presented. In Section III, the admittance control design with neural networks in the presence of input saturation is discussed. In Section IV, experimental results are presented. Section V concludes the paper, and the appendix follows the conclusion.

II. PROBLEM STATEMENT AND PRELIMINARIES

A. Problem Formulation
Generally, most environmental dynamics can be expressed by the linear model (1) [33], where $M_E$, $C_E$ and $G_E$ denote the mass, damping and stiffness, respectively. As shown in Fig. 1, we consider a robot arm interacting with an unknown environment. A control scheme is designed to make the robot arm behave compliantly while satisfying the following requirements: i) based on the admittance method, the desired trajectory is modified when an external force acts on the robot arm; ii) the external torque applied at the end-effector is estimated by the observer; iii) the adaptive neural controller guarantees the tracking performance.

B. System Dynamics
The forward kinematics of the robot are given by (2), where $x(t)$ is the vector of end-effector position and orientation and $q$ is the vector of joint angles; the inverse kinematics are given by (3). Taking the time derivative of (2), we have

$$\dot{x}(t) = J(q)\dot{q} \quad (4)$$

where $J(q)$ is the Jacobian matrix. Differentiating (4) once more, we have

$$\ddot{x}(t) = \dot{J}(q)\dot{q} + J(q)\ddot{q} \quad (5)$$

The relationship between the joint torques and the end-effector wrench is given by (6). The dynamics of the n-link robot manipulator in joint space are given by (7), where $q \in R^n$, $\dot{q} \in R^n$ and $\ddot{q} \in R^n$ are the vectors of joint angles, velocities and accelerations, respectively; $D_q(q) \in R^{n \times n}$ denotes the inertia matrix; $C_q(q,\dot{q})\dot{q}$ denotes the Coriolis and centripetal torque; $G_q(q)$ is the gravity torque; $\tau$ is the robot motor torque; $\tau_{fric}$ is the friction torque; and $\tau_{ext}$ is the external torque.
Property 1 [34]: The matrix $D_q(q)$ is symmetric and positive definite.
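The velocity relation (4) can be checked numerically. The following is a minimal sketch for a planar two-link arm, which is not the Baxter model used in this paper; the link lengths, the joint configuration and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

L1, L2 = 0.4, 0.3  # hypothetical link lengths in metres


def forward_kinematics(q):
    """End-effector position of a planar two-link arm for joint angles q."""
    q1, q2 = q
    return np.array([
        L1 * np.cos(q1) + L2 * np.cos(q1 + q2),
        L1 * np.sin(q1) + L2 * np.sin(q1 + q2),
    ])


def jacobian(q):
    """Analytical Jacobian J(q) of forward_kinematics with respect to q."""
    q1, q2 = q
    s1, c1 = np.sin(q1), np.cos(q1)
    s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
    return np.array([
        [-L1 * s1 - L2 * s12, -L2 * s12],
        [L1 * c1 + L2 * c12, L2 * c12],
    ])


q = np.array([0.3, -0.7])      # hypothetical joint angles (rad)
q_dot = np.array([0.5, 1.2])   # hypothetical joint velocities (rad/s)

# Central finite difference of x(t) along the direction q_dot.
h = 1e-6
x_dot_numeric = (forward_kinematics(q + h * q_dot)
                 - forward_kinematics(q - h * q_dot)) / (2 * h)

# Velocity predicted by x_dot = J(q) q_dot.
x_dot_jacobian = jacobian(q) @ q_dot
```

The two velocity vectors should agree up to the finite-difference error, which is the content of (4).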
An admittance model describes the relationship between the external force and the position of the robot arm [35], as given in (8), where $x_d \in R^n$ is the desired trajectory and $x_r \in R^n$ is the virtual desired trajectory arising from the external force $f$. Substituting (2)-(5) into (8), the left side of (8) is expressed in joint variables as in (9). The admittance model in joint space can then be defined as in (10), where $q_d \in R^n$ and $q_r \in R^n$ are the desired trajectory and the virtual desired trajectory in joint space, respectively, and $M_d$, $C_d$ and $K_d$ are the mass, damping and stiffness gain matrices specified by the designer.
Assumption 1: Both $q_d$ and $q_r$ are bounded and differentiable: $\|q_d\|, \|q_r\| \le c_1$, $\|\dot{q}_d\|, \|\dot{q}_r\| \le c_2$ and $\|\ddot{q}_d\|, \|\ddot{q}_r\| \le c_3$, where $c_1$, $c_2$ and $c_3$ are positive constants.
Remark 1: In some specific situations, other admittance models, such as the damping-stiffness model and the stiffness model, are given in (11). If there is no external collision and the manipulator moves freely, we have $q_r = q_d$ and $\tau_{ext} = 0$. Conversely, when an external collision occurs, the robot arm follows a new trajectory, which can be seen as an adaptation to the external torque; the target admittance model defined in (10) describes this relationship.
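The trajectory adaptation described above can be illustrated with a one-joint simulation. This is a minimal sketch that assumes the joint-space admittance model has the mass-damping-stiffness form $M_d(\ddot{q}_r - \ddot{q}_d) + C_d(\dot{q}_r - \dot{q}_d) + K_d(q_r - q_d) = \tau_{ext}$; the sign convention, the scalar gains and the torque profile are assumptions made here for illustration and are not taken from the experiments.

```python
import numpy as np


def modified_trajectory(q_d, tau_ext, M_d, C_d, K_d, dt):
    """Integrate the assumed admittance model for e = q_r - q_d and return q_r."""
    e, e_dot = 0.0, 0.0
    q_r = np.empty_like(q_d)
    for k in range(len(q_d)):
        q_r[k] = q_d[k] + e
        # Assumed model: M_d * e'' + C_d * e' + K_d * e = tau_ext
        e_ddot = (tau_ext[k] - C_d * e_dot - K_d * e) / M_d
        e_dot += dt * e_ddot   # semi-implicit Euler: velocity first
        e += dt * e_dot        # then position with the updated velocity
    return q_r


dt = 1e-3
t = np.arange(0.0, 8.0, dt)
q_d = 1.25 * np.sin(0.25 * t)                          # desired trajectory
tau_ext = np.where((t >= 2.0) & (t < 6.0), 5.0, 0.0)   # hypothetical contact torque
q_r = modified_trajectory(q_d, tau_ext, M_d=1.0, C_d=20.0, K_d=100.0, dt=dt)
```

Before contact, `q_r` coincides with `q_d`; during contact the offset settles at `tau_ext / K_d`, so a smaller stiffness gain yields a more compliant response.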

C. Actuator Saturation
Saturation is a static nonlinearity that describes the insensitivity of the actuator to large signals that exceed its input limits, as shown in Fig. 2. In general, the saturation can be described as in (12), where $u$ is the input signal; $g(t)$ is a smooth function; $S_{at}(\tau)$ is the output of the saturation nonlinearity; and $\tau_{max}$ and $\tau_{min}$ denote the maximum and minimum values of the saturation nonlinearity, respectively.
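The effect of the saturation nonlinearity on a commanded torque can be sketched as follows. The sketch assumes a hard limit between $\tau_{min}$ and $\tau_{max}$ and does not model the smooth function $g(t)$; the limit values and the names `saturate` and `tau_lost` are hypothetical.

```python
import numpy as np


def saturate(tau, tau_min, tau_max):
    """Hard-limit each commanded torque to the interval [tau_min, tau_max]."""
    return np.clip(tau, tau_min, tau_max)


tau_max = np.array([2.2, 5.0])    # hypothetical upper limits (N*m)
tau_min = -tau_max                # symmetric lower limits, also hypothetical
tau_cmd = np.array([3.0, -1.5])   # commanded torque of a two-joint example

tau_applied = saturate(tau_cmd, tau_min, tau_max)
tau_lost = tau_cmd - tau_applied  # part of the command the actuator cannot deliver
```

Only the first joint exceeds its limit here, so only its command is clipped; the undelivered part is the quantity a saturation compensator has to account for.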

D. NN Approximation
With the approximation capability of the RBFNN [36], a continuous smooth function $h(Z): R^q \to R$ can be approximated by an RBFNN as in (13), where $Z_{in} \in \Omega \subset R^q$ denotes the input of the RBFNN; $W = [w_1, w_2, \ldots, w_m] \in R^m$ denotes the NN weight vector and $m > 0$ is the number of NN nodes in the hidden layer; $S(Z_{in}) = [S_1(Z_{in}), S_2(Z_{in}), \ldots, S_m(Z_{in})]^T$, and $S_i(Z_{in})$ denotes an activation function, which is often chosen as the Gaussian function (14), where $u_i = [u_{i1}, u_{i2}, \ldots, u_{iq}]^T \in R^q$ is the center of the receptive field and $\eta_i$ is the variance. From the definition of the activation function, $S(Z_{in})$ is bounded as in (15), where the bound is a positive constant. With a sufficiently large number of nodes $m$, any smooth continuous function can be approximated to any desired accuracy as in (16), where $W^*$ is the ideal weight over a compact set $\Omega_{Z_{in}} \subset R^q$ and the approximation error of the RBFNN satisfies $\|\varepsilon\| \le \omega$, where $\omega$ is a small unknown constant. Over the compact set $Z_{in} \in \Omega_{Z_{in}} \subset R^q$, the ideal weight vector is defined as in (17).

III. CONTROL SYSTEM DESIGN AND STABILITY ANALYSIS
In this section, an admittance control scheme is developed, as shown in Fig. 3. The collision with the environment is viewed as an external torque exerted at the end-effector, which is estimated by the observer and also treated as a disturbance of the system.
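As a concrete illustration of the RBFNN approximation of Section II-D, the following minimal sketch builds Gaussian activations and fits the weights of a small network to $\sin(z)$. It assumes the normalization $S_i(Z_{in}) = \exp(-\|Z_{in} - u_i\|^2 / \eta_i^2)$ with a common width for all nodes, and it computes the weights by offline least squares rather than by the adaptive updating laws of this paper; the node number, centers and width are hypothetical.

```python
import numpy as np


def rbf_activations(z, centers, eta):
    """Gaussian activations S_i(z) = exp(-||z - u_i||^2 / eta^2) (assumed form)."""
    diff = centers - z                        # shape (m, q)
    return np.exp(-np.sum(diff ** 2, axis=1) / eta ** 2)


def rbfnn(z, W, centers, eta):
    """Network output W^T S(z)."""
    return W @ rbf_activations(z, centers, eta)


# Hypothetical 1-D example: approximate h(z) = sin(z) on [-pi, pi].
m = 15                                        # number of hidden nodes
centers = np.linspace(-np.pi, np.pi, m).reshape(m, 1)
eta = 0.6                                     # common width of all nodes
z_train = np.linspace(-np.pi, np.pi, 200)

# Regressor matrix with one row of activations per training input.
S = np.array([rbf_activations(np.array([z]), centers, eta) for z in z_train])

# Offline least-squares weights, used here instead of an adaptive law.
W, *_ = np.linalg.lstsq(S, np.sin(z_train), rcond=None)

h_hat = np.array([rbfnn(np.array([z]), W, centers, eta) for z in z_train])
max_error = np.max(np.abs(h_hat - np.sin(z_train)))
```

Every activation lies in (0, 1], so $\|S(Z_{in})\|$ cannot exceed $\sqrt{m}$, which is one way to see the boundedness stated in (15).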

A. Observer Based on the Generalized Momentum
Traditional force estimation methods rely on the manipulator model and involve the joint acceleration $\ddot{q}$ and the inverse of the inertia matrix $D_q(q)$ [37], which amplifies measurement noise and increases the computational load. To solve this problem, a disturbance observer based on the generalized momentum has been developed [5]. This approach estimates the external torque without using the joint acceleration $\ddot{q}$ or computing the inverse of the matrix $D_q(q)$.
The generalized momentum of the robot joints can be described as

$$p = D_q(q)\dot{q} \quad (18)$$

and its derivative is

$$\dot{p} = \dot{D}_q(q)\dot{q} + D_q(q)\ddot{q} \quad (19)$$

Considering the robot dynamics (7), (19) can be written as (20). Using Property 2 and the symmetry of $D_q(q)$, (20) can be rewritten as (21). It can be seen from (21) that the derivative of $p$ depends on the external torque $\tau_{ext}$ and contains no joint acceleration term. An observer is now defined for the generalized momentum $p$, as shown in Fig. 4. The error of $p$ is defined in (22). Then, we have (23), where $K_p$ is the positive gain matrix.
In this paper, we use the model of the Baxter robot in [38]. Combining (21) with (23) and defining $r = K_p e_p$, we obtain (24) for $\dot{r}$.
Remark 2: In robot-environment interaction applications, control of the contact force is essential. A force observer is a feasible way to estimate external forces using the dynamics of the system. However, because of uncertainties in the system dynamics, such model-based observers cannot provide accurate force estimates when the dynamic model is inaccurate. One straightforward solution is to use adaptive methods to estimate the unknown parameters. Adaptive estimation has been investigated in the previous literature; e.g., in [5], an adaptive parameter estimation method is designed to estimate the unknown robot dynamics, and a recursive least-squares algorithm is proposed to estimate the external force without the joint acceleration $\ddot{q}$. Exploiting the powerful approximation ability of neural networks, many works have used neural-network-based force/torque observers to estimate the contact force with good performance [39], [40]. These NN-based observers require little information about the robot dynamics and avoid the restrictions of traditional model-based observer approaches. In this regard, integrating a model-free observer into the NN-based adaptive control scheme will be part of our future work.
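The behaviour of a generalized-momentum observer can be illustrated on a one-joint example. This is a minimal sketch under stated assumptions: friction is neglected, the dynamics are taken as $D_q\ddot{q} + C_q\dot{q} + G_q = \tau + \tau_{ext}$, and the residual is computed as $r = K_p\left(p - p(0) - \int(\tau + C_q^T\dot{q} - G_q + r)\,dt\right)$, which is a common form of this observer and not necessarily identical to (22)-(24). The pendulum parameters, the PD controller and the class name `MomentumObserver` are hypothetical.

```python
import numpy as np


class MomentumObserver:
    """Residual r that follows tau_ext through r_dot = K_p (tau_ext - r)."""

    def __init__(self, K_p):
        self.K_p = K_p
        self.integral = np.zeros(K_p.shape[0])
        self.p0 = None

    def update(self, D, C, G, q_dot, tau, dt):
        p = D @ q_dot                        # generalized momentum
        if self.p0 is None:
            self.p0 = p.copy()
        r = self.K_p @ (p - self.p0 - self.integral)
        # Integrate the model-based part of p_dot plus the current residual.
        self.integral += (tau + C.T @ q_dot - G + r) * dt
        return r


# Hypothetical 1-DOF pendulum: D = m*l^2, C = 0, G = m*g*l*sin(q).
m, l, g = 1.0, 0.5, 9.81
D = np.array([[m * l ** 2]])
C = np.zeros((1, 1))
dt, steps = 1e-3, 3000
observer = MomentumObserver(K_p=np.array([[50.0]]))
q, q_dot = np.zeros(1), np.zeros(1)
r_log = np.zeros(steps)

for k in range(steps):
    t = k * dt
    tau_ext = np.array([1.0 if 1.0 <= t < 2.0 else 0.0])  # contact from 1 s to 2 s
    G = np.array([m * g * l * np.sin(q[0])])
    tau = G + 2.0 * (0.5 - q) - 0.5 * q_dot               # gravity compensation + PD
    r_log[k] = observer.update(D, C, G, q_dot, tau, dt)[0]
    # Plant integration; the observer never sees tau_ext or the acceleration.
    q_ddot = np.linalg.solve(D, tau + tau_ext - C @ q_dot - G)
    q_dot = q_dot + dt * q_ddot
    q = q + dt * q_dot
```

The residual stays at zero in free motion, rises to the applied torque during contact and decays again afterwards, with a time constant set by the observer gain. Neither the joint acceleration nor the inverse of the inertia matrix is used inside `update`.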

B. Controller
In this section, we develop a controller to track the desired trajectory under input constraints, followed by the stability analysis. First, in joint space, we define the error signals

$$e_q = q_r - q, \quad \alpha = \dot{q}_r + K e_q, \quad e_v = \dot{e}_q + K e_q \quad (25)$$

where $K$ is a positive gain matrix. Taking the external disturbance into consideration, we define the control torque as in (26), where $K_v$ and $K_s$ are gain matrices; $\hat{D}_q$, $\hat{C}_q$ and $\hat{G}_q$ are the RBFNN approximations; and $\xi$ is a state variable that will be defined later.
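The error signals in (25) translate directly into code. The sketch below uses the gain $K = \mathrm{diag}[10, \ldots, 10]$ reported in the experiments, while the joint states are random placeholders and the function name `error_signals` is hypothetical.

```python
import numpy as np


def error_signals(q, q_dot, q_r, q_r_dot, K):
    """Return e_q, alpha and e_v as defined in (25)."""
    e_q = q_r - q
    e_q_dot = q_r_dot - q_dot
    alpha = q_r_dot + K @ e_q
    e_v = e_q_dot + K @ e_q
    return e_q, alpha, e_v


K = np.diag([10.0] * 7)                  # gain matrix for the 7-DOF arm
rng = np.random.default_rng(0)
q, q_dot, q_r, q_r_dot = rng.normal(size=(4, 7))   # placeholder joint states

e_q, alpha, e_v = error_signals(q, q_dot, q_r, q_r_dot, K)
```

A useful identity that follows from (25) is $e_v = \alpha - \dot{q}$, so $\alpha$ acts as a reference velocity for the joints.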
Consider the dynamics of a robot system with actuator saturation

$$D_q(q)\ddot{q} + C_q(q,\dot{q})\dot{q} + G_q(q) = S_{at}(\tau) + \tau_{ext} \quad (27)$$

where $S_{at}(\cdot)$ is the actuator saturation function defined in (12). Substituting (26) into (27), we have (28), with the associated terms defined in (29). The auxiliary system is defined as in (30), where $K_\xi$ is a positive gain matrix and $\mu$ is a small constant.
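To give an idea of how such an auxiliary system reacts to saturation, the following minimal sketch implements a form that is common in the anti-saturation literature: $\dot{\xi} = -K_\xi\xi - \frac{|e_v^T\Delta\tau| + 0.5\,\Delta\tau^T\Delta\tau}{\|\xi\|^2}\xi + \Delta\tau$ for $\|\xi\| \ge \mu$, and $\dot{\xi} = 0$ otherwise. This form is an assumption made here for illustration and may differ from (30); the numerical values and the function name are hypothetical.

```python
import numpy as np


def auxiliary_derivative(xi, e_v, delta_tau, K_xi, mu):
    """Time derivative of the auxiliary state xi (assumed form, see lead-in)."""
    norm_sq = xi @ xi
    if np.sqrt(norm_sq) < mu:
        return np.zeros_like(xi)             # state frozen inside the small ball
    gain = (abs(e_v @ delta_tau) + 0.5 * (delta_tau @ delta_tau)) / norm_sq
    return -K_xi @ xi - gain * xi + delta_tau


K_xi = np.diag([25.0, 25.0])                 # hypothetical gain matrix
xi = np.array([0.5, -0.2])                   # current auxiliary state
e_v = np.array([0.1, 0.3])                   # filtered tracking error
delta_tau = np.array([0.4, -0.6])            # torque lost to saturation

xi_dot = auxiliary_derivative(xi, e_v, delta_tau, K_xi, mu=1e-2)
```

With $\Delta\tau = 0$ the state simply decays at the rate set by $K_\xi$, so the compensation fades once the actuator leaves saturation. Under this assumed form, $\xi$ must be initialized outside the ball of radius $\mu$, since it stays frozen inside it.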
Employing the neural network approximations, we obtain the estimates $\hat{D}_q$, $\hat{C}_q$ and $\hat{G}_q$ in (31). The updating laws of the RBFNN are given in (32), where $\Theta_D$, $\Theta_C$ and $\Theta_G$ are positive matrices and $\delta$ is a small gain matrix for disturbance [41]. Then, the dynamics (28) can be rewritten as (33) and (34), whose leading terms are

$$-D_q\dot{e}_v = -e_q + C_q e_v + \Delta\tau + \Delta\tau_e + K_v(e_v + \xi) + \cdots$$

Theorem 1: Considering the definition of $V$, we obtain that $e_q$, $e_v$, $\xi$, $\|\tilde{W}_D\|$, $\|\tilde{W}_C\|$ and $\|\tilde{W}_G\|$ are uniformly ultimately bounded. Since $\|\tilde{W}\|$ is bounded, $\|\hat{W}\| = \|W^* + \tilde{W}\|$ is bounded. With the given bounded $q_d$, $\dot{q}_d$, $q_r$ and $\dot{q}_r$, and according to the definition of the error signals in (25), $q = q_r - e_q$ is bounded and $\alpha = \dot{q}_r + K e_q$ is bounded.
Remark 3: As is well known, the Lyapunov direct method is an important tool for controller design and stability analysis in nonlinear systems. By constructing a Lyapunov function and analyzing its derivative with respect to time, the stability at the equilibrium point can be established without solving the system. Consider a nonlinear dynamic system with state $x \in R^n$ whose equilibrium point is the origin, and let $N$ be a neighborhood of the origin, $N = \{x : \|x\| \le \epsilon\}$, $\epsilon > 0$. We can then analyze the convergence of the system states by constructing a scalar Lyapunov function. However, the Lyapunov method also has limitations. In general, Lyapunov stability analysis focuses on the final convergence of the system state, that is, whether the state converges or not, and rarely addresses the convergence process. For example, in a practical control system, the Lyapunov direct method can be used to analyze whether the control errors converge, but transient targets such as the overshoot and rise time are difficult to guarantee. In some practical cases, the Lyapunov method can even lead to an infeasible controller design that fails to achieve the desired performance [42]. Another source of conservatism is that only quadratic Lyapunov functions are considered in most cases. Some related strategies to reduce the conservatism are studied in [17], [43]-[45].

IV. EXPERIMENT RESULTS
To illustrate the effectiveness of the developed method, we use the Baxter robot to perform the experiments. The Baxter robot has two arms, and each arm has 7 degrees of freedom (DOF): shoulder joints $s_0$, $s_1$; elbow joints $e_0$, $e_1$; and wrist joints $w_0$, $w_1$ and $w_2$. The model of the Baxter robot is introduced in [38].
At the beginning of the experiment, when there is no interaction with the external environment, the robot manipulator follows the desired trajectory. After a period of time, the robot begins to interact with the environment and an external torque is applied at the end-effector. Under the influence of the external torque, the reference trajectory $q_d$ of the robot is modified to adapt to the environment and a modified trajectory $q_r$ is generated. The modified trajectory $q_r$ is viewed as the adaptation behaviour of the robot manipulator to the environment. The experiment is depicted in Fig. 5. In the course of the experiment, we test the shoulder joint $s_0$.  [15,15,15,15,15,15,15]. The control gains are set to $K = \mathrm{diag}[10, 10, 10, 10, 10, 10, 10]$, $K_v = \mathrm{diag}[5, 6, 5, 4, 1, 1, 1]$ and $K_s = \mathrm{diag}[2, 2, 2, 2, 2, 2, 2]$.

A. Test of NN Controller
The tracking performance is depicted in Fig. 6. As shown in Fig. 6, the actual trajectories of the end-effector and the joint $s_0$ follow the desired trajectories effectively, and the average tracking errors of the end-effector with respect to x, y and z are around (−0.03 m, 0.04 m), (0.004 m, 0.008 m) and (−0.003 m, 0.007 m), respectively, where (•) denotes the range of values. The overall results are satisfactory, which indicates the effectiveness of the adaptive NN controller.

B. Test of Saturation Compensator
This group of experiments aims to test the effectiveness of the saturation compensator on joint $s_0$. The parameters are selected as $K_\xi$ = diag{25; 0; 0; 0; 0; 0; 0}, $sat_{max}$ = diag{2.2; 5; 5; 5; 5; 5; 5} and $sat_{min}$ = diag{−2.2; −5; −5; −5; −5; −5; −5}, where $K_\xi$ is the gain matrix of the auxiliary system for saturation compensation and $sat_{max}$, $sat_{min}$ are virtual saturation limits. When the amplitude of the control input exceeds the virtual saturation limits, it is set equal to the limit. The experimental results are illustrated in Figs. 7-9. As shown in Fig. 9, without saturation compensation, the amplitude of the control input exceeds the upper bound at some points. With the compensator, the input torque at these points is compensated, so that the torque input stays within the upper and lower bounds. In Fig. 7, the tracking performance with input compensation is presented, and the tracking errors with and without input compensation are compared in Fig. 8. As shown in Fig. 8, without compensation the tracking errors are larger, because the actuator cannot provide enough energy to guarantee the tracking performance at the points where the control input reaches the saturation limits. The comparison shows that, with input compensation, the controller ensures good position tracking performance, which demonstrates that our control strategy is effective.
The experimental results are shown in Figs. 10-12. The external force acts from 13 s to 15 s. As shown in Fig. 10, when the manipulator interacts with the environment, the desired trajectory is modified to adapt to the environment. The estimation performance, filtered to make the result clearer, is shown in Fig. 12 and is satisfactory. The tracking performance for the modified trajectory is illustrated in Fig. 11. The overall results demonstrate that the proposed control scheme is effective.

D. Discussion
Comparative simulation studies are conducted to further verify the proposed method. The first group of comparative simulations illustrates the influence of the input saturation restriction on the control performance of an admittance control scheme. In [46], an adaptive admittance control method for human-robot interaction is developed, in which the inner control loop guarantees that the actual trajectory $x$ tracks the desired trajectory $x_m$ generated by an admittance model, without input constraints. The comparative simulations take the input saturation into consideration, and the results show that the tracking performance can be improved without exceeding the upper and lower bounds of the actuator input, as shown in Fig. 15. It can be inferred that, in a practical admittance control scheme, when the actuator is saturated it is difficult to ensure that the robot tracks the desired trajectory obtained from the admittance model, so that the ultimate control goal in [46] may not be achieved.
To further validate the effectiveness of the admittance control method in an interactive control system, another comparative study is designed based on the admittance method and the adaptive control method in [31]. In [31], a desired impedance model and the external force $\tau_e$ are assumed to be completely known, which are idealized conditions in practice. Therefore, we conduct the simulation under the same environment and remove these ideal conditions. The desired trajectory of joint 1 is set to $q_d = 1.25\sin(0.25t)$, and the initial joint conditions are $q_0 = 0$, $\dot{q}_0 = 0$. As depicted in Fig. 16, the manipulator is designed to follow the desired trajectory $q_d$. After a period of time, the manipulator collides with the obstacle. Simulation results are presented in Figs. 17-19. As shown in Fig. 17, after the collision, with the admittance control method (green dashed-dotted line), the desired trajectory (blue dashed-dotted line) set by the designer is modified to reduce the contact torque and achieve compliant behaviour. In contrast, with the adaptive control method (red dashed-dotted line), the desired trajectory is not modified and the manipulator tracks the desired trajectory all the time. Thus, without the ideal assumptions, the interaction torque (red solid line) under the adaptive control method increases during the collision, whereas the contact torque (green dashed-dotted line) under the admittance control method is smaller, as shown in Fig. 19. The tracking performance of both methods is depicted in Fig. 18. From these comparative results, we find that the admittance control method combined with the observer approach is more applicable and makes the robotic manipulator compliant with the external torque/force. The proposed sensorless admittance control scheme also has some weak points.
As pointed out, in practical systems, noise in the observer has an adverse influence on the estimation accuracy, so the expected modified desired trajectory obtained from the admittance model in (11) cannot be guaranteed. In terms of estimation accuracy, the force sensor used in [46] to measure the contact force can give values closer to the real torque/force. Furthermore, due to the uncertainties of the model dynamics, the generalized-momentum-based observer may meet limitations in certain practical applications; addressing this will be part of our future work to expand the scope of application of the proposed control scheme. In addition, input saturation does not cover all the nonlinear phenomena in actual mechanical systems, and other constraints (e.g. dead-zone and time-delay) should be further considered.

V. CONCLUSION
This paper presents a sensorless control scheme integrating RBFNN, torque estimation and admittance control for the Baxter robot to interact with an unknown environment under input constraints. The adaptive neural controller guarantees the tracking performance and keeps the tracking errors of the system within a small neighborhood of zero. The external torque from the environment applied at the end-effector is estimated, and the admittance control method is employed for trajectory adaptation to achieve compliant behaviour. Finally, the experimental results on the Baxter robot demonstrate the effectiveness of the proposed method.
In future work, other nonlinear constraint problems (e.g. dead-zone, time-delay and hysteresis) will be considered in the proposed system, since in these situations the performance and stability of the system might not be guaranteed. Furthermore, the analysis of more complex environmental models will broaden the applicability of the admittance control system. In addition, the proposed NN-based adaptive control scheme can be combined with a model-free observer built with intelligent tools, such as the radial basis function neural network (RBFNN), to make the control scheme more widely applicable.

VI. APPENDIX
Consider the Lyapunov candidate $V$. Substituting (34) into (37), we obtain an expression for $\dot{V}$. From (30), we obtain a relation for the auxiliary state $\xi$. Then, combining (43) and Property 2, the derivative form (40) can be bounded as

$$\dot{V} \le e_q^T\dot{e}_q - e_v^T\left(-e_q + K_v(e_v + \xi) + K_s\,\mathrm{sgn}(e_v) + \Delta\tau + \Delta\tau_e\right) - \xi^T K_\xi \xi - |e_v^T\Delta\tau| + \cdots \quad (44)$$

Considering the error signal $e_v = \dot{e}_q + K e_q$, (44) can be simplified further. To ensure the stability of the closed-loop system, the parameters should satisfy $K_v - I \ge 0$ and $2K_\xi - I - K_v^T K_v \ge 0$. Let us define the variable $\varsigma$ comprising $e_q$, $\xi$, $e_v$, $\tilde{W}_D$, $\tilde{W}_C$ and $\tilde{W}_G$; the derivative of $V$ can then be rewritten as $\dot{V}(\varsigma) \le -A\varphi(\varsigma) + C$, where $A$ and $C$ are positive constants. There exists an invariant set $L(\varsigma)$ such that $-A\varphi(\varsigma) + C < 0$ when $\varsigma$ starts outside $L(\varsigma)$. In that case, since $\dot{V}(\varsigma) < 0$, $V(\varsigma)$ decreases so that $\varsigma$ enters $L(\varsigma)$ within a period of time and remains there afterwards. Therefore, the state variable $\varsigma$ is uniformly ultimately bounded (UUB) and approaches a bounded compact set near zero.
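The ultimate-boundedness argument can be made concrete with a scalar example. The following minimal sketch integrates the comparison system $\dot{V} = -AV + C$, that is, it takes $\varphi(\varsigma) = V$ and replaces the inequality by an equality; both are simplifying assumptions made only for illustration, and the values of $A$, $C$ and the initial conditions are hypothetical.

```python
import numpy as np


def simulate_bound(V0, A=2.0, C=0.5, dt=1e-3, T=5.0):
    """Forward-Euler integration of dV/dt = -A*V + C from V(0) = V0."""
    steps = int(round(T / dt))
    V = np.empty(steps + 1)
    V[0] = V0
    for k in range(steps):
        V[k + 1] = V[k] + dt * (-A * V[k] + C)
    return V


ultimate_bound = 0.5 / 2.0            # C / A for the default parameters
V_far = simulate_bound(V0=10.0)       # starts outside the set V <= C/A
V_near = simulate_bound(V0=0.0)       # starts inside the set V <= C/A
```

A trajectory that starts outside the set $V \le C/A$ decreases monotonically towards it, while a trajectory that starts inside never leaves it, which mirrors the role of the invariant set $L(\varsigma)$ in the argument above.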