Composite learning adaptive backstepping control using neural networks with compact supports

The ability to learn is crucial for neural network (NN) control as it enhances the overall stability and robustness of control systems. In this study, a composite learning control strategy is proposed for a class of strict-feedback nonlinear systems with mismatched uncertainties, where raised-cosine radial basis function NNs with compact supports are applied to approximate system uncertainties. Both online historical data and instantaneous data are utilized to update the NN weights. Practical exponential stability of the closed-loop system is established under a weak excitation condition termed interval excitation. The proposed approach ensures fast parameter convergence, implying an exact estimation of plant uncertainties, without requiring the trajectory of NN inputs to be recurrent or the time derivatives of plant states. The raised-cosine radial basis function NNs not only reduce the computational cost but also facilitate the exact determination of the subregressor activated along any trajectory of NN inputs, so that the interval excitation condition is verifiable. Numerical results have verified the validity and superiority of the proposed approach.


INTRODUCTION
One of the success stories of applying machine learning to intelligent control is neural network (NN)-based adaptive control (NNAC). 1 Compared with traditional adaptive control, the most appealing merit of NNAC is that the modelling difficulty in many practical control problems can be greatly mitigated, thereby simplifying control synthesis for a wider class of nonlinear systems with functional uncertainties. 2 However, in most existing NNAC methods, eg, see some recent works, [3][4][5][6][7][8][9][10][11][12][13][14][15][16][17] the ability of NNs to learn plant uncertainties is not fully exploited and only tracking error convergence is available. The learning ability of NNs, reflected by the convergence of NN weights, is guaranteed by the well-known condition termed persistent excitation (PE). 18 Parameter convergence in NNAC brings several salient benefits, eg, accurate online modeling, superior tracking, and robustness against various perturbations. 19 The classical PE condition is too stringent and often infeasible in practice. 20 A more practical PE condition based on radial basis function (RBF)-NNs shows that any recurrent trajectory of NN inputs that stays within a regular lattice leads to a partial PE condition. 21 Based on this practical PE condition, several NN learning control (NNLC) methods were proposed to guarantee closed-loop practical exponential stability so that accurate NN learning is obtainable. [21][22][23][24][25] The relationship between PE levels and RBF-NN structures was analyzed in the work of Zheng and Wang. 26 However, in the existing NNLC methods, the requirement that the trajectory of NN inputs be recurrent is still stringent in practice, and the parameter convergence rate depends strongly on the PE level, which generally results in slow parameter convergence. 27
A hybrid direct and indirect adaptive control strategy termed composite adaptive control utilizes both tracking and prediction errors to update parameter estimates such that both tracking accuracy and parameter convergence can be improved. [28][29][30] Motivated by the composite adaptation, an emerging composite learning technique was proposed to achieve parameter convergence in adaptive control in the absence of PE. [31][32][33][34][35][36][37] Compared with the composite adaptation, the distinguishing feature of the composite learning is that online historical data are employed to construct prediction errors so that closed-loop exponential stability is ensured by an interval excitation (IE) condition, which greatly relaxes the PE condition. A model reference composite learning control method was presented for a class of nonlinear systems with matched parametric uncertainties in the work of Pan et al, 31 where the need for time derivatives of plant states is eliminated by using an integral transformation. In the work of Pan and Yu, 32 the approach of Pan et al 31 was extended to a class of strict-feedback nonlinear systems with mismatched parametric uncertainties via command filtered backstepping. The approach of Pan and Yu 32 was further extended to the case with functional uncertainties in the work of Pan et al. 33 In the work of Pan et al, 34 the composite learning was applied to achieve parameter convergence in least squares-based identification and indirect adaptive control. The IE condition for parameter convergence in the composite learning was relaxed to a condition of sufficient excitation in the work of Pan et al. 35 In the work of Pan and Yu, 36 a composite learning control approach was developed for a general class of robotic arms. In the work of Guo et al, 37 an NN composite learning control (NNCLC) approach with friction compensation was designed and implemented on an industrial robot arm.
However, the approaches of Pan and Yu 36 and Guo et al 37 are specifically designed for robotic systems; the time derivatives of plant states need to be estimated in the aforementioned works, 32,33 and the extension of the integral transformation in the other works 31,34,35 to the case with mismatched uncertainties is infeasible.
In this article, an NNCLC strategy is presented for the class of strict-feedback nonlinear systems in the work of Xu et al, 29 where raised-cosine RBF (RCRBF)-NNs are used to approximate plant uncertainties. Command filtered backstepping 38 is adopted to alleviate the problem of "explosion of complexity" in traditional integrator backstepping. Compared with existing NNLC approaches, the attractive feature of our approach is that fast parameter convergence in NNs, implying exact learning of plant uncertainties, is guaranteed without the trajectory of NN inputs being recurrent. Compared with the NNCLC approach of Pan et al, 33 the distinctive features of the proposed approach include the following: (1) time derivatives of plant states are not needed for the computation of prediction errors; (2) the applied RCRBF-NNs are not only helpful for reducing the computational cost but also convenient for exactly determining the subregressor activated along any trajectory of NN inputs, so that the IE condition is verifiable.
The rest of this article is organized as follows. The problem is formulated in Section 2; the RCRBF-NN is described in Section 3; the NNCLC is designed in Section 4; illustrative results are provided in Section 5; conclusions are drawn in Section 6. Throughout this article, ℝ, ℝ+, and ℝ^n denote the spaces of real numbers, positive real numbers, and real n-vectors, respectively; L∞ is the space of bounded signals; ||x|| is the Euclidean norm of x; min{·}, max{·}, and sup{·} are the operators of minimum, maximum, and supremum, respectively; tanh(x) is the hyperbolic tangent function; Ω_c := {x | ||x|| ≤ c} is the ball of radius c; and C^k represents the space of functions for which all derivatives up to order k exist and are continuous, where c ∈ ℝ+, x ∈ ℝ, x ∈ ℝ^n, and n and k are positive integers.

PROBLEM FORMULATION
Consider the following class of nth-order strict-feedback nonlinear systems with functional uncertainties 29:

ẋ_i = f_i(x̄_i) + x_{i+1}, i = 1 to n − 1,
ẋ_n = f_n(x̄_n) + u,
y = x_1,    (1)

where x̄_i(t) := [x_1(t), x_2(t), … , x_i(t)]^T with x(t) := x̄_n(t) ∈ ℝ^n a vector of system states, u(t) ∈ ℝ is a control input, y(t) ∈ ℝ is a controlled output, and f_i(x̄_i): ℝ^i → ℝ are unknown functions, with i = 1 to n. Let x_d(t) ∈ ℝ denote a desired output. The following assumptions are given to facilitate the control design. Let α_i(t) ∈ ℝ and c_i(t) ∈ ℝ with i = 1 to n − 1 be virtual control inputs and their filtered counterparts, respectively. Define tracking errors e_i(t) := x_i(t) − c_{i−1}(t) with c_0(t) = x_d(t) and i = 1 to n, and let e(t) := [e_1(t), e_2(t), … , e_n(t)]^T. In this study, the objective is to design an NN-based control law for the system (1) under Assumptions 1 and 2 such that the tracking error e tends to 0 and f_i(x̄_i) with i = 1 to n are accurately approximated by NNs along the NN input trajectories.
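To make the plant structure concrete, the following minimal Python sketch evaluates the open-loop dynamics, assuming the standard strict-feedback form ẋ_i = f_i(x̄_i) + x_{i+1} with ẋ_n = f_n(x̄_n) + u; the function names and the second-order example uncertainty are illustrative, not taken from the paper.

```python
import numpy as np

def strict_feedback_dynamics(x, u, f_list):
    """Strict-feedback dynamics: xdot_i = f_i(x_1..x_i) + x_{i+1} for i < n,
    xdot_n = f_n(x) + u, where f_list[i] acts on the first i+1 states."""
    n = len(x)
    xdot = np.empty(n)
    for i in range(n - 1):
        xdot[i] = f_list[i](x[: i + 1]) + x[i + 1]
    xdot[n - 1] = f_list[n - 1](x) + u
    return xdot

# second-order example: f_1 = 0, f_2(x) = sin(x_1) x_2 as a stand-in uncertainty
f_list = [lambda x1: 0.0, lambda x: np.sin(x[0]) * x[1]]
print(strict_feedback_dynamics(np.array([0.0, 2.0]), 1.0, f_list))  # [2. 1.]
```

Note that the mismatched uncertainties f_1, …, f_{n−1} enter before the control input u, which is what prevents a direct application of matched-uncertainty designs.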

RADIAL BASIS FUNCTION NN
Let Ω_x ⊂ ℝ^n be a domain of NN approximation. The region of each x_i is divided into m_i − 1 uniform and symmetric grids of width Δ_i ∈ ℝ+ by m_i grid points c_i^{l_i} ∈ ℝ, where m_i ≥ 3 is an odd number, l_i = 1 to m_i, and i = 1 to n. Then, an RCRBF of the form 40

φ_i^{l_i}(x_i) = (1/2)(1 + cos(π(x_i − c_i^{l_i})/Δ_i)) if |x_i − c_i^{l_i}| ≤ Δ_i, and φ_i^{l_i}(x_i) = 0 otherwise,

is applied to cover at least one grid for each possible l_i and i. Hence, N = m_1 m_2 … m_n neural nodes can be generated. The N neural nodes ordered in an n-dimensional lattice can be reordered into a one-dimensional array through a scalar index j. Then, an RCRBF-NN is represented as follows:

f̂(x) = Ŵ^T Φ(x),    (2)

where Ŵ ∈ ℝ^N is a vector of NN weights, Φ(x) := [φ_1(x), φ_2(x), … , φ_N(x)]^T, φ_j(x) is a regression function corresponding to the jth NN node, and j = 1 to N. The RCRBF belongs to the class of localized RBFs as its support is the compact set [c_i^{l_i} − Δ_i, c_i^{l_i} + Δ_i]. The RBF-NN (2) is used to approximate a function f(x): Ω_x → ℝ, resulting in an optimal NN approximation error

ε(x) := f(x) − W*^T Φ(x),    (3)

with W* ∈ Ω_w a constant vector of optimal weights given by

W* := arg min_{Ŵ∈Ω_w} {sup_{x∈Ω_x} |f(x) − Ŵ^T Φ(x)|}.    (4)

The approximation theorem of RBF-NNs shows that |ε(x)| ≤ ε*, ∀x ∈ Ω_x can be guaranteed for any given small constant ε* ∈ ℝ+ if N is sufficiently large. 41 The following definitions and lemmas are presented for the subsequent development.
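A raised-cosine RBF and a multidimensional node can be sketched as follows; this is a minimal illustration, and the construction of a multidimensional node as a product of 1-D raised cosines is an assumption on our part, since the paper does not spell out the tensor structure here.

```python
import numpy as np

def rcrbf_1d(x, c, delta):
    # raised-cosine RBF with compact support [c - delta, c + delta]
    r = abs(x - c)
    return 0.5 * (1.0 + np.cos(np.pi * r / delta)) if r <= delta else 0.0

def rcrbf_node(x, center, delta):
    # one multidimensional NN node as a product of 1-D raised cosines
    out = 1.0
    for xi, ci, di in zip(x, center, delta):
        out *= rcrbf_1d(xi, ci, di)
    return out

print(rcrbf_1d(0.0, 0.0, 1.0))  # 1.0 at the center
print(rcrbf_1d(1.5, 0.0, 1.0))  # 0.0 outside the compact support
```

Unlike a Gaussian RBF, the value is exactly zero outside the support, which is what later allows the activated subregressor to be determined exactly.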
Definition 1 (See the work of Kurdila et al 18 ). A bounded signal Φ(t) ∈ ℝ^N is of IE if there exist constants T_e, τ_d, σ ∈ ℝ+ such that ∫_{T_e−τ_d}^{T_e} Φ(τ)Φ^T(τ) dτ ≥ σI.

Definition 2 (See the work of Kurdila et al 18 ). A bounded signal Φ(t) ∈ ℝ^N is of PE if there exist constants τ_d, σ ∈ ℝ+ such that ∫_{t−τ_d}^{t} Φ(τ)Φ^T(τ) dτ ≥ σI, ∀t ≥ 0.

Lemma 1. Along a given trajectory x(t) ∈ Ω_x, the function f(x) approximated by (2) can be expressed as

f(x) = Φ̄^T(x)W̄* + ε(x),    (5)

where W̄* ∈ ℝ^N̄ and Φ̄ ∈ ℝ^N̄ are subvectors of W* and Φ, respectively, and N̄ < N is the number of total activated NN nodes.

Lemma 2 (See the work of Wang and Hill 21 ). For the RBF-NN (5) with centers placed on a regular lattice to cover Ω_x, given any recurrent trajectory of NN inputs staying within Ω_x, the subregressor Φ̄ activated along the trajectory satisfies a partial PE condition.

Remark 1. Because the RCRBF has a compact support, the maximal number of RCRBFs with nonzero values for a given input x_i, denoted by m̄, is controlled by the width Δ_i, and the number of currently activated NN nodes in the RCRBF-NN (2) is at most N_c = m̄^n. Typically, m̄ is set to 2 or 3, which is smaller than the numbers of grid points m_1 to m_n, and thus, N_c is generally much smaller than the total number of NN nodes N. A distribution of RCRBFs is illustrated in Figure 1, where n = 3, m_i = 7, and l_i = 1 to 7. In this case, one has N_c = 3^3 = 27, which is much smaller than N = 7^3 = 343. Therefore, we can use only the activated NN nodes, determined by the nonzero RCRBFs, to update Ŵ and to compute the NN output, so that the RCRBF-NN (2) can have a much lower computational cost than other types of RBF-NNs. 40
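The computational saving described in Remark 1 can be illustrated with a short sketch. The grid (7 points per dimension on an illustrative interval, width equal to the grid spacing) and the helper `active_indices_1d` are our own choices for the example, matching the m_i = 7, n = 3 case.

```python
import numpy as np

# 3-D input, 7 grid points per dimension on [-3, 3], width = grid spacing
m, n = 7, 3
centers_1d = np.linspace(-3.0, 3.0, m)
delta = centers_1d[1] - centers_1d[0]  # = 1.0

def active_indices_1d(xi):
    # indices of 1-D raised cosines with strictly nonzero value at xi
    return [l for l, c in enumerate(centers_1d) if abs(xi - c) < delta]

x = np.array([0.4, -1.2, 2.1])
per_dim = [active_indices_1d(xi) for xi in x]
n_active = int(np.prod([len(a) for a in per_dim]))
print(n_active, m ** n)  # 8 activated nodes versus 343 in total
```

Only the `n_active` nodes (here 8, bounded above by N_c = 27) need to be evaluated or updated at this input, instead of all N = 343.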

Neural network-based backstepping control
In the subsequent sections, the index i takes values from 1 to n unless otherwise indicated. The uncertainties f_i(x̄_i) are approximated by RCRBF-NNs of the form (2) with extra subscripts i, where N_i is the number of neural nodes of the ith NN. An NN-based command-filtered backstepping control law is presented as follows: *In the work of Wang and Hill, 21 due to the usage of Gaussian RBFs, one has f(x) = Φ̄^T(x)W̄* + ε̄(x), with the bound of ε̄ of the order of ε*. For the RCRBF-NN (2), as the RCRBF has a compact support, the outputs of most φ_j can be strictly zero, and thus, ε can be used directly instead of ε̄ in (5).
where v_i ∈ ℝ is an auxiliary control term given by (8), with k_i, η_i, ϵ_i ∈ ℝ+ being control parameters, and c_i and ċ_i with i = 1 to n − 1 are generated by a command filter 38:

ż_1 = z_2, ż_2 = −2ζω_n z_2 − ω_n^2 (z_1 − α_i),    (9)

with z_1(0) = α_i(0), z_2(0) = 0, c_i = z_1, and ċ_i = z_2, where ω_n ∈ ℝ+ is a natural frequency and ζ ∈ ℝ+ is a damping ratio. Note that only excited neural nodes are applied to compute the NN output in (7), and tanh(e_i/ϵ_i) in (8) serves as a smooth approximation of the sliding mode control term sgn(e_i) to reject system perturbations.
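One common realization of a second-order command filter (ż_1 = z_2, ż_2 = −2ζω_n z_2 − ω_n²(z_1 − α_i)) can be integrated with a fixed step as below; this is a sketch under that assumed filter form and illustrative parameter values, not necessarily the authors' exact implementation.

```python
def command_filter_step(z1, z2, alpha, wn, zeta, h):
    # one Euler step of the second-order command filter:
    # z1 tracks the virtual control alpha; z2 approximates its derivative
    z1_next = z1 + h * z2
    z2_next = z2 + h * (-2.0 * zeta * wn * z2 - wn**2 * (z1 - alpha))
    return z1_next, z2_next

# step response: z1 -> alpha and z2 -> 0, so (c_i, c_i_dot) = (z1, z2)
z1, z2 = 0.0, 0.0
for _ in range(20000):  # 20 s at h = 1 ms
    z1, z2 = command_filter_step(z1, z2, 1.0, wn=20.0, zeta=0.9, h=0.001)
print(round(z1, 3))  # ≈ 1.0
```

The filter supplies ċ_i analytically (as z_2), which is what removes the need to differentiate the virtual control α_i and avoids the "explosion of complexity."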
It follows from Lemma 1 that f_i(x̄_i) can be expressed by (10), where W̄*_i is a subvector of W*_i, and ε_i and W*_i are given by (3) and (4) with extra subscripts i, respectively. Therefore, one has |ε_i(x̄_i)| ≤ ε*_i, where ε*_i ∈ ℝ+ are constants that can be made sufficiently small by increasing N_i. Applying (10) to (1), one obtains (11). Applying (7) to (11) and after some transformations, one obtains the closed-loop tracking error dynamics (12), where W̃_i := W̄*_i − Ŵ_i is a parameter estimation error. The detailed steps to obtain (12) can be found in the work of Pan et al. 30

Composite learning using NNs
In the composite NNAC design, the prediction errors are usually generated by first-order filters, 28 with a filtering constant in ℝ+, such that they can be obtained without the usage of ẋ_i. Although the convergence of both the tracking and prediction errors can be achieved in composite NNAC, the PE condition still has to be satisfied to guarantee partial convergence of W̃_i. In this section, composite learning laws for Ŵ_i are designed such that partial convergence of W̃_i can be guaranteed by the IE condition in Definition 1, which is much weaker than the PE condition in Definition 2.
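The idea of obtaining prediction errors without ẋ_i can be illustrated by the standard filtered-derivative trick, sketched below as a generic example (not necessarily the exact filter used in the paper): after passing x through a first-order low-pass filter, the quantity (x − x_f)/τ equals the low-pass-filtered derivative of x, so no explicit numerical differentiation is needed.

```python
def lpf_derivative_step(x, x_f, tau, h):
    # one Euler step of x_f' = (x - x_f)/tau; the returned quantity
    # (x - x_f)/tau is the low-pass-filtered derivative of x, obtained
    # without differentiating the (possibly noisy) measurement x
    dx_filtered = (x - x_f) / tau
    return x_f + h * dx_filtered, dx_filtered

# a unit ramp has derivative 1; the filtered estimate settles near 1
x_f, h, tau = 0.0, 0.001, 0.1
for k in range(5000):  # 5 s at h = 1 ms
    x_f, dxf = lpf_derivative_step(k * h, x_f, tau, h)
print(round(dxf, 2))  # ≈ 1.0
```

Low-pass filtering also attenuates measurement noise, which is why such filters are preferred over direct differencing in the composite learning literature.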
Let 1/(s + λ) be a stable low-pass filter, with s a complex variable and λ ∈ ℝ+ a filtering constant. To avoid the usage of ẋ_i in the parameter update, 1/(s + λ) is applied to each term of (11), which leads to (13), where the filtered signals, eg, u_f = [u]/(s + λ), are represented in the hybrid time-frequency domain. 42 Define an excitation matrix

Θ_i(t) := ∫_{t−τ_d}^{t} Φ_fi(τ) Φ_fi^T(τ) dτ,    (14)

where τ_d ∈ ℝ+ is an integration duration. Multiplying each side of the ith equation in (13) by Φ_fi and integrating the result over [t − τ_d, t], one obtains (15). From Lemma 2 and Definitions 1 and 2, for any given C^1 trajectory x(t) that is not necessarily recurrent, there exist constants T_ei > T_a and σ_i ∈ ℝ+ such that Θ_i(T_ei) ≥ σ_i I, with T_e := max{T_ei}.
where c_wi ∈ ℝ+ are some constants. Then, design composite update laws (17), in which the learning rates and weight factors are positive constants, and a projection operator 43 is applied to keep Ŵ_i within Ω_wi.

Remark 2. Another advantage of applying RCRBF-NNs is that the subregressor Φ_i(x̄_i) activated along any given trajectory x(t) can be exactly determined owing to the compact support of the RCRBFs, such that the IE condition in Definition 1 is verifiable by checking the minimal singular value of Θ_i(t) in (14), and the time T_ei that satisfies the IE condition is obtainable accordingly.
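Following Remark 2, the IE condition can be checked numerically by accumulating the excitation matrix Θ over a data window and inspecting its smallest eigenvalue (which coincides with the smallest singular value for the symmetric positive semidefinite Θ). The regressor signals below are illustrative.

```python
import numpy as np

def excitation_level(phi_f_samples, h):
    # Theta = ∫ Phi_f Phi_f^T dτ over the data window (rectangle rule);
    # the smallest eigenvalue is the excitation level to compare with sigma
    dim = phi_f_samples.shape[1]
    Theta = np.zeros((dim, dim))
    for phi in phi_f_samples:
        Theta += h * np.outer(phi, phi)
    return np.linalg.eigvalsh(Theta).min()

# a regressor exploring two directions is IE; a frozen one is not
t = np.arange(0.0, 5.0, 0.01)
rich = np.c_[np.sin(t), np.cos(t)]
frozen = np.c_[np.ones_like(t), np.zeros_like(t)]
print(excitation_level(rich, 0.01) > 0.1)       # True
print(excitation_level(frozen, 0.01) < 1e-9)    # True
```

Because the RCRBF subregressor is exactly zero outside its compact support, only the activated entries of Φ_fi need to enter this check, keeping the matrix small.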

Stability and convergence analysis
The following lemmas are useful in the subsequent analysis.
As Ω_x ⊃ Ω_x0, one has c_x > c_x0 and c_e > c_e0. The following theorem demonstrates the main results of this study.

Theorem 1. For the system (1) under Assumption 1 with x(0) ∈ Ω_x0 and x_d(t) under Assumption 2, driven by the control law constituted by (7) to (9) and (17) with Ŵ_i(0) ∈ Ω_wi, if there exist constants T_ei > T_a and σ_i ∈ ℝ+ satisfying the IE conditions Θ_i(T_ei) ≥ σ_i I in Definition 1, and the control parameters k_ci in (8) are chosen to satisfy k_c1, k_cn > 1/2 and k_ci > 1 for i = 2 to n − 1, then there exist sufficiently large control parameters k_ci such that all signals involved are of L∞ on t ∈ [0, ∞) and the equilibrium point of the closed-loop system has practical exponential stability on t ∈ [T_e, ∞).

Applying lemma A.3.2 in the work of Farrell and Polycarpou 43 to the aforementioned inequality yields an exponentially decaying bound. Second, consider a Lyapunov function candidate V of the entire system so that, on x ∈ Ω_x, one obtains V̇ ≤ −k_s V + ρ with ρ := Σ_{i=1}^n ρ_i ∈ ℝ+ and k_s := min_{i∈[1,n]}{k_si} ∈ ℝ+, where the positivity of k_s follows from the definitions of k_si and the choice of k_ci in (18). The conditions x ∈ Ω_x (implying e ∈ Ω_e) and Ŵ_i, W*_i ∈ Ω_wi are used to determine a Lyapunov surface {V = ℓ}. Thus, one has ℓ > V(0) as c_e0 < c_e and Ŵ_i(0) ∈ Ω_wi. Then, there exist sufficiently large k_ci to satisfy ρ/k_s < ℓ, implying V̇ < 0 on {V = ℓ}. Thus, {V ≤ ℓ} ∩ Ω_w is positively invariant such that the trajectories of (e(t), W̃(t)) starting from Ω_e0 ∩ Ω_w stay within {V ≤ ℓ} ∩ Ω_w for all time, implying T_a = ∞, and converge to a residual set as t → ∞. Hence, one gets e(t) ∈ Ω_e and W̃(t) ∈ Ω_w, ∀t ∈ [0, ∞), implying x(t), u(t), α_i(t), c_i(t) ∈ L∞, ∀t ∈ [0, ∞). Consequently, all signals involved are of L∞ on t ∈ [0, ∞).
Third, the stability is analyzed on t ∈ [T_e, ∞). As there exist T_ei > T_a and σ_i ∈ ℝ+ satisfying Θ_i(T_ei) ≥ σ_i I, a strengthened bound follows from (22) together with the derivations in the second part. Applying Young's inequality to the third line of the aforementioned expression and noting (20), one obtains a further inequality. Solving the aforesaid inequality using lemma A.3.2 in the work of Farrell and Polycarpou 43 leads to an exponentially decaying bound. Using the aforementioned result, one obtains V̇ ≤ −k_m V + ρ_m with ρ_m := Σ_{i=1}^n ρ_{mi} ∈ ℝ+ and k_m := min_{1≤i≤n}{k_mi} ∈ ℝ+, which implies that the trajectories of (e(t), W̃(t)) exponentially converge to a positively invariant set {V ≤ ρ_m/k_m}. Thus, practical exponential stability is achieved on t ∈ [T_e, ∞) in the sense that e(t) and W̃(t) converge to small neighborhoods of 0 whose sizes are dominated by the control parameters and learning rates.
Remark 3. Dynamic regressor extension and mixing (DREM) is an alternative parameter estimation approach in which the PE condition on the regressor Φ_f is relaxed to a nonsquare-integrability condition on the determinant of an instrumental matrix. 46 However, the DREM estimator has been studied only for open-loop parameter estimation, and its closed-loop stability has not been formally proven. In addition, the nonsquare-integrability condition is not directly related to the IE condition, so parameter convergence may still not be guaranteed for the DREM estimator even if the IE condition is satisfied.
The control law composed of (7) to (9) and (17) is applied for the simulation study. Simulation is carried out in MATLAB with the fixed-step solver ode1 and a step size of 1 ms. In addition, 35-dB Gaussian white noise is applied to corrupt the state measurements, and the traditional NNLC (obtained by setting the weight factors in (17) to 0) is selected as the baseline controller.
The reference output x_d is generated by a reference model driven by the command x_c, where x_c = π/3 at t ∈ [5, 10] ∪ [35, 40] seconds, x_c = −π/3 at t ∈ [15, 20] ∪ [45, 50] seconds, and x_c = 0 at other times. It is clear that the x_d generated by the aforementioned model comprises two identical tasks and does not satisfy the partial PE condition in Lemma 2. Control trajectories of the two controllers are depicted in Figure 2. The proposed NNCLC achieves a much better transient tracking performance than the NNLC under a control input u with a similar gain and fewer oscillations, where the transient errors e_1 of the proposed NNCLC are reduced from 0.1115 and 0.05136 to 0.07008 and 0.03079 for the first and second tasks, respectively. The slight oscillations in u result from the large and frequent changes in the reference trajectory x_d. Learning trajectories under the two controllers are depicted in Figure 3. For the NNLC, no approximation of f_2 and no convergence of ||Ŵ_2|| are observed owing to the absence of the partial PE condition. On the contrary, for the proposed NNCLC, f_2 is accurately estimated and ||Ŵ_2|| converges to a constant after a short transient process. The excitation level σ_2 based on the subregressor Φ_f2 is clearly shown in Figure 3, where σ_2 under the proposed NNCLC is about two times larger than that under the NNLC. The major drawbacks of the proposed NNCLC include the following: (1) the calculation of the prediction errors in (16) with i = 1 to n increases the computational cost; (2) the composite update laws (17) with i = 1 to n are more sensitive to external disturbances, so the control design needs to be carefully considered in this case.
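The piecewise-constant command x_c described above can be reproduced as a direct transcription of the stated schedule (the reference model it feeds is not specified here and is therefore omitted):

```python
import numpy as np

def x_c(t):
    # piecewise-constant command feeding the reference model for x_d
    if 5 <= t <= 10 or 35 <= t <= 40:
        return np.pi / 3
    if 15 <= t <= 20 or 45 <= t <= 50:
        return -np.pi / 3
    return 0.0
```

The schedule repeats the same pair of set points in [0, 30] and [30, 60] seconds, which is why the two tasks are identical and the trajectory is not recurrent in the sense required by the partial PE condition.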

CONCLUSIONS
In this paper, an NNCLC strategy based on RCRBF-NNs has been developed for a class of strict-feedback nonlinear systems with mismatched uncertainties, where closed-loop practical exponential stability is established under the IE condition, which relaxes the classical PE condition. Compared with existing composite learning approaches, the proposed approach has two distinctive features: (1) an exact estimation of plant uncertainties is achieved without using the time derivatives of plant states; (2) the subregressor activated along any trajectory of NN inputs can be exactly determined, such that the IE condition is verifiable. Illustrative results have demonstrated that the proposed NNCLC achieves much better control and learning performance under a similar control input compared with the traditional NNLC, and that the excitation level can be clearly quantified owing to the usage of RCRBF-NNs. The determination of the centers and widths of RBFs using self-organizing techniques [47][48][49] and the extension of the proposed approach to a more general class of pure-feedback nonlinear systems 50 are interesting topics for future studies.