New Noise-Tolerant Neural Algorithms for Future Dynamic Nonlinear Optimization With Estimation on Hessian Matrix Inversion

Nonlinear optimization problems with dynamical parameters arise widely in practical scientific and engineering applications, and various computational models have been presented for solving them under the hypothesis of short-time invariance. To eliminate the large lagging error in the solution of the inherently dynamic nonlinear optimization problem, the only way is to estimate the future unknown information from the present and previous data during the solving process; the resulting problem is termed the future dynamic nonlinear optimization (FDNO) problem. In this paper, to suppress noises and improve accuracy in solving FDNO problems, a novel noise-tolerant neural (NTN) algorithm based on zeroing neural dynamics is proposed and investigated. In addition, to reduce algorithm complexity, the quasi-Newton Broyden-Fletcher-Goldfarb-Shanno (BFGS) method is employed to eliminate the intensive computational burden of matrix inversion, yielding the NTN-BFGS algorithm. Moreover, theoretical analyses show that the proposed algorithms globally converge to a tiny error bound with or without the pollution of noises. Finally, numerical experiments are conducted to validate the superiority of the proposed NTN and NTN-BFGS algorithms for the online solution of FDNO problems.

related methods and Newton-Raphson iteration (NRI) and their modifications are commonly used [2]. For example, a class of nonlinear conjugate gradient methods aimed at solving optimization problems is summarized in [12], which possess global convergence properties. More recently, a three-term conjugate gradient algorithm providing descent search directions is investigated in [13]. It is worth pointing out that a large number of practical problems are dynamic in nature, with parameters varying over time, thereby leading to a time-dependent theoretical solution. When solved by these traditional algorithms, a dynamic optimization problem is assumed to be time-invariant during the computational interval, and the generated solution is directly applied to the problem at the next time instant. This is mainly due to the fact that, without leveraging velocity compensation for the dynamic parameters, a traditional model is not able to track the time-dependent theoretical solution in a predictive manner [14]. Therefore, for a time-dependent problem handled by a traditional model, a large lagging error is unavoidable.
Neural networks and the related neural dynamics methods have shown superior properties in parallel distribution and high-speed computing, with extensive applications in neurophysiology, chemical equilibrium and robotics [2], [15]-[26]. For instance, Liu and Tong present an adaptive neural network based on optimal control for a class of nonlinear discrete-time systems in [17], which achieves optimal control performance with system stability guaranteed. Continuous-time zeroing neural dynamics is reported to be able to track the time-dependent solution of dynamic problems in an error-free manner [20]. A discrete-time numerical algorithm based on zeroing neural dynamics is presented in [25], [27], which is able to solve time-varying nonlinear optimization (termed the future dynamic nonlinear optimization (FDNO) problem) accurately without being perturbed by noises. However, despite the fact that noises and perturbations are widely present in the online solution process, existing methods for solving the FDNO problem in the presence of noises are considerably rare. Therefore, it is of crucial importance to find a new computational method that handles noises and perturbations while achieving high accuracy for the FDNO problem.
Considering that a continuous-time model cannot be applied to digital hardware directly, a computational method depicted in discrete form is desirable. To this end, based on zeroing neural dynamics, a discrete-time noise-tolerant neural (NTN) algorithm is constructed in this paper to solve the FDNO problem in the presence of noises and perturbations.
Given that the Hessian matrix inversion is involved in the NTN algorithm, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method [28], [29] is leveraged to approximate the inverse of the Hessian matrix, which is especially helpful in situations where directly computing the Hessian inverse is expensive or difficult. The content of this paper is organized as follows. In Section II, the FDNO problem is formulated and the NTN and NTN-BFGS algorithms are proposed to handle such a future problem; for comparison, existing solutions are presented as well. Then, Section III provides theoretical analyses to illustrate the global convergence of the proposed NTN and NTN-BFGS algorithms with or without noises. Moreover, numerical experiments and an application to a robot manipulator are presented in Sections IV and V to validate the superiority of the proposed NTN and NTN-BFGS algorithms, as compared with other existing models. Finally, conclusions are drawn in Section VI. At the end of this introductory section, the main contributions of this paper are summarized as follows.
1) This is the first work to solve nonlinear optimization with dynamic parameters and noise suppression, an intrinsic requirement of which is that the solution should be computed before its corresponding mathematical formulation becomes available. In this sense, the problem is quite different from the conventionally investigated static optimization, and it is thus termed the future dynamic nonlinear optimization (FDNO) problem.
2) Two neural algorithms, termed NTN and NTN-BFGS, are proposed to solve FDNO problems in the presence of noises based on the neural dynamics approach, of which the latter eliminates the intensive computational burden of matrix inversion.
3) Control techniques are leveraged to conduct the theoretical analyses, which reveal that the residual errors of the two proposed neural algorithms globally converge to a tiny value near zero with or without noises.
II. PROBLEM FORMULATION AND SOLUTIONS

This section presents the framework and formulations of the FDNO problem with two newly proposed neural algorithms. For comparison, existing solutions are provided as well.

A. Problem Formulation
It is required in digital hardware implementation that a problem be depicted in discrete form; therefore, it is desirable to formulate the problem in a discrete manner. Let t_s and t_f denote the start and the final time instants of the solving process, respectively. An FDNO problem is formulated as the minimization of Φ(y(t_{k+1}), t_{k+1}), i.e., problem (1), for which the calculation should be conducted during the time interval [t_s, t_f], where t = kδ with updating index k = 0, 1, 2, · · · (abbreviated as t_k) and δ > 0 represents the time sampling gap. Here, Φ(y(t_{k+1}), t_{k+1}) is discretized from the smoothly time-varying signal Φ(y(t), t), for which the following assumptions are made: Φ(·, ·) is a time-varying nonlinear function that is twice differentiable and lower bounded. This work is dedicated to finding the future solution y(t_{k+1}) ∈ R^m during the computational interval [t_k, t_{k+1}) that makes function (1) achieve its minimum value at time instant t_{k+1}. Note that, during the present computational interval [t_k, t_{k+1}), Φ(y(t_{k+1}), t_{k+1}) and its derivatives are not available. In this sense, only the present and/or previous data (e.g., y(t_k)), rather than the unknown data (e.g., y(t_{k+1})), can be leveraged to compute y(t_{k+1}).

B. Continuous-Time NTN Model
The continuous-time FDNO problem is defined as (2), of which the gradient is q(y(t), t) = ∂Φ(y(t), t)/∂y(t) ∈ R^m. The 2-norm of q(y(t), t) measures the geometric distance between the current solution y(t) and the zero of q(y(t), t). An intuitive approach to obtaining the desired path y*(t), on which q(y(t), t) = 0, is to exploit the derivative method. Therefore, to obtain the online solution of FDNO (2), the derivative of q(y(t), t) with respect to time t should be zero, i.e., H(y(t), t)ẏ(t) + q̇_t(y(t), t) = 0, where H(y(t), t) = ∂q(y(t), t)/∂y(t) ∈ R^{m×m} represents the Hessian matrix and q̇_t(y(t), t) denotes the partial derivative of q(y(t), t) with respect to time t. For the performance evaluations in this paper, how well each model solves the FDNO problem is observed through the following error function, where ξ_h(t) is the hth element of ξ(t), ∀h ∈ {1, 2, · · · , m}.
Then, the CT-NTN algorithm can be formulated as (7). It has been proven in [25] that FDNO (2) achieves its minimum when the solution to equation (7) is obtained with a positive-definite H(y(t), t).
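As a sketch of the construction behind (7) (assuming the standard noise-tolerant zeroing-dynamics design with gains γ, λ > 0, which reappear below through c_1 = δγ and c_2 = δλ; the authoritative form of (7) is the one defined in this section), the error dynamics are imposed as

```latex
\dot{q}\big(y(t),t\big) = -\gamma\, q\big(y(t),t\big) - \lambda \int_{0}^{t} q\big(y(\tau),\tau\big)\,\mathrm{d}\tau ,
```

and expanding $\dot q = H(y(t),t)\,\dot y(t) + \dot q_t(y(t),t)$ yields an explicit evolution for $y(t)$:

```latex
\dot{y}(t) = -H^{-1}\big(y(t),t\big)\Big(\dot{q}_t\big(y(t),t\big) + \gamma\, q\big(y(t),t\big) + \lambda \int_{0}^{t} q\big(y(\tau),\tau\big)\,\mathrm{d}\tau\Big).
```

The integral term is what later becomes the noise-suppressing summation term in the discrete-time NTN algorithm.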

C. Existing Discrete-Time Solutions
Existing discrete-time solutions are presented here for comparison. The discrete-time zeroing dynamics (DTZD) model derived from [30], [31] is formulated as (8), where the step size is c = δγ > 0. The three-step DTZD model obtained from [30]-[32] is given as (9), and the four-step DTZD model [30]-[32] as (10). In addition, the five-step DTZD model [33] is presented as (11). Besides, the NRI model in [34] is provided as (12).
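Since models (8)-(12) are referenced by number only, a minimal sketch may help illustrate the lagging error of such one-step-behind schemes. The snippet below applies the textbook Newton-Raphson step, y(t_{k+1}) = y(t_k) - H^{-1}(y(t_k), t_k) q(y(t_k), t_k), to an assumed scalar toy instance Φ(y, t) = 0.5(y - sin t)^2 (our own example, not the paper's benchmark), where q = y - sin t and H = 1; the computed solution is always one sampling gap late, so the error is O(δ).

```python
import numpy as np

# Toy instance (assumed for illustration): Phi(y, t) = 0.5*(y - sin(t))^2,
# so q(y, t) = y - sin(t) and H(y, t) = 1. One NRI step at time t_k gives
# y_{k+1} = y_k - (y_k - sin(t_k)) = sin(t_k), i.e., the solution of the
# *previous* instant; the tracking error at t_{k+1} is O(delta).
delta = 0.01
ts = np.arange(0.0, 2 * np.pi, delta)
y = 0.5                                        # arbitrary initial state
lag_errs = []
for tk in ts:
    y = y - (y - np.sin(tk))                   # NRI step using data at t_k
    lag_errs.append(abs(y - np.sin(tk + delta)))  # error at the next instant
max_lag = max(lag_errs)                        # should scale like delta
```

Halving `delta` roughly halves `max_lag`, which is the O(δ) lagging-error behavior attributed to the traditional models above.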

D. NTN and NTN-BFGS Neural Algorithms
Noise-interference is ever present during the solving process, e.g., the observational error, the truncation error, the quantization error and the sampling error. Therefore, a highspeed algorithm with noise-tolerant competence for solving the FDNO problem is in demand. In this section, discretetime NTN and NTN-BFGS algorithms are derived to tolerate noises during the solution.
In order to simplify the structure of the discrete-time NTN algorithm, the numerical differentiation formula with the fewest terms is chosen. Thus, applying the Euler forward difference [27] to CT-NTN algorithm (7), we obtain the discrete-time NTN algorithm
y(t_{k+1}) = y(t_k) − H^{−1}(y(t_k), t_k)[(c_1 + 1)q(y(t_k), t_k) − q(y(t_k), t_{k−1}) + c_2 Σ_{j=0}^{k} q(y(t_j), t_j)],  (13)
where the step sizes are c_1 = δγ > 0 and c_2 = δλ > 0; the noise term ǫ(t_k) is omitted here to depict the structure of the NTN algorithm only. The summation term over q(y(t_j), t_j), which is discretized from the integral term of (7), plays a significant role in offsetting the impact brought by abrupt disturbances on (13). The circuit diagram showing the general components of NTN algorithm (13) is illustrated in Fig. 1.
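The update above can be sketched numerically. The snippet below is a minimal reading of NTN algorithm (13) on an assumed toy FDNO instance (our own, not the paper's benchmark): Φ(y, t) = 0.5‖y − r(t)‖² with r(t) = [sin t, cos t]ᵀ, so q(y, t) = y − r(t) and the Hessian is the identity, which lets us drop the explicit H^{-1} factor; a constant disturbance plays the role of ǫ(t_k).

```python
import numpy as np

def r(t):                               # assumed time-varying target
    return np.array([np.sin(t), np.cos(t)])

def q(y, t):                            # gradient of Phi(y, t) = 0.5*||y - r(t)||^2
    return y - r(t)

def ntn_solve(delta=0.01, steps=2000, c1=0.5, c2=0.25, noise=0.5):
    """One possible reading of NTN (13) with H = I; `noise` is a constant
    additive disturbance injected into every update."""
    y = np.array([0.3, -0.2])           # arbitrary initial state y(t_0)
    S = np.zeros(2)                     # running sum of q(y(t_j), t_j)
    errs = []
    for k in range(steps):
        tk = k * delta
        qk = q(y, tk)
        S = S + qk
        # (c1+1)q(y_k, t_k) - q(y_k, t_{k-1}) embeds a backward-difference
        # estimate of the partial time derivative of q (velocity compensation)
        q_prev = q(y, (k - 1) * delta) if k > 0 else qk
        y = y - ((c1 + 1.0) * qk - q_prev + c2 * S + noise)
        errs.append(np.linalg.norm(q(y, (k + 1) * delta)))  # future residual
    return errs

errs_ntn = ntn_solve()                  # with the error-integration (sum) term
errs_noint = ntn_solve(c2=0.0)          # sum term removed for contrast
```

With the summation term active, the constant disturbance is absorbed and the residual settles near zero; removing it (c2 = 0) leaves a residual on the order of noise/c1, which is consistent with the role attributed to the integral term above.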
To solve for y(t_{k+1}) through NTN algorithm (13), the calculation of the Hessian matrix inverse is unavoidable, which can be quite costly if the Hessian matrix is complicated. Besides, the calculation cannot be performed offline because H^{−1}(y(t_k), t_k) is required online. To overcome this drawback, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm [35] is utilized in this section.
The highlight of the BFGS algorithm [35] is its capability to avoid the explicit use of the inverted Hessian matrix, reducing the computational complexity. In BFGS, the exact Hessian matrix is replaced by an approximation consisting of least-change updates generated from the gradient at every iteration. As stated in Section II-A, Φ(·, ·) is a convex function whose Hessian matrix is positive definite. Thus, we can conclude that the approximation matrix obtained by the BFGS algorithm [35] converges to the inverse of the Hessian matrix.
The following NTN-BFGS algorithm is given for FDNO (1) for the case in which computing the Hessian matrix inverse is expensive.
where D_k(y(t_k), t_k) is the approximation of H^{−1}(y(t_k), t_k) generated by the BFGS iterative formula (15), whose update term (16) involves no temporary matrices. Besides, the scalars s_k^T z_k and z_k^T D_k z_k and the symmetry of D_k accelerate the computation. The initial iterate D_0 should be positive definite to achieve rapid convergence, and D_0 = I is a typical choice.
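Since (15)-(16) are referenced by number, the snippet below sketches one standard form of the BFGS inverse-Hessian update consistent with the description above (treat it as an assumed equivalent, not the paper's exact formula): D_{k+1} = (I − ρ s zᵀ) D_k (I − ρ z sᵀ) + ρ s sᵀ with ρ = 1/(zᵀs), s_k = y(t_{k+1}) − y(t_k), and z_k the corresponding gradient change.

```python
import numpy as np

def bfgs_inverse_update(D, s, z):
    """Rank-two BFGS update of the inverse-Hessian approximation D from
    step s and gradient change z; requires the curvature z^T s > 0.
    No matrix inversion is performed."""
    rho = 1.0 / float(z @ s)
    I = np.eye(len(s))
    return (I - rho * np.outer(s, z)) @ D @ (I - rho * np.outer(z, s)) \
        + rho * np.outer(s, s)

# sanity setup on an assumed quadratic Phi with SPD "Hessian" A, where the
# gradient change along step s is exactly z = A s
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A = A @ A.T + 4.0 * np.eye(4)
s = rng.standard_normal(4)
z = A @ s
D1 = bfgs_inverse_update(np.eye(4), s, z)
```

By construction the update satisfies the secant condition D_{k+1} z_k = s_k, preserves symmetry, and keeps D positive definite whenever zᵀs > 0, which is why a positive-definite D_0 (e.g., the identity) is the typical choice above.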
Remark 1: The BFGS iterative formula (15) relies on the hypothesis of short-time invariance, which is the price paid to avoid the expensive computation of the inverse of H(y(t_k), t_k). Even so, the overall scheme of NTN-BFGS algorithm (14) still breaks the hypothesis of short-time invariance, as supported by the following theoretical analyses.
For one thing, NTN-BFGS algorithm (14) exploits the time difference [i.e., the term q(y(t_k), t_k) − q(y(t_k), t_{k−1})] during the real-time solution process and therefore adapts to the change of coefficients in a predictive manner, making it suitable for solving FDNO (1), whereas many conventional algorithms are not. For another, NTN-BFGS algorithm (14) exploits the error-feedback information [i.e., the term c_1 q(y(t_k), t_k)] as the input to handle the occurrence of computational errors. Additionally, the summation term over q(y(t_j), t_j) plays a significant role in offsetting the impact brought by abrupt disturbances.

When polluted by noises, NTN algorithm (13) is written as
y(t_{k+1}) = y(t_k) − H^{−1}(y(t_k), t_k)[(c_1 + 1)q(y(t_k), t_k) − q(y(t_k), t_{k−1}) + c_2 Σ_{j=0}^{k} q(y(t_j), t_j) + ǫ(t_k)].  (17)
Theorem 1: There is an equivalence between NTN algorithm (13) and the following equation:
ξ(t_{k+1}) = (1 − c_1)ξ(t_k) − c_2 Σ_{j=0}^{k} ξ(t_j) + O(δ²),  (18)
where O(δ²) denotes the vector of truncation errors with each entry being O(δ²).
Proof: NTN algorithm (13) can be rewritten, and from it we can obtain (20). The Euler forward difference [27] is utilized for the derivative terms. Substituting the above two formulas into (20) directly generates (21). The discrete version of equation (4) is (22). Then, substituting (22) into (21), a simple form of (21) with respect to the error function is obtained, which is further simplified into (18) by operating the Euler forward difference [27] once more. The proof is completed. Remark 2: To prove the linearity of formula (18), we rewrite (18) as a function ξ(t_{k+1}) = f(ξ(t_k)).
Setting a as an arbitrary scalar coefficient, it is evident that f(aξ(t_k)) = aξ(t_{k+1}) = af(ξ(t_k)), which proves the homogeneity of formula (18). Similarly, by substituting the sum ξ_1(t_k) + ξ_2(t_k) into f, we have f(ξ_1(t_k) + ξ_2(t_k)) = f(ξ_1(t_k)) + f(ξ_2(t_k)), which proves the additivity of formula (18). In summary, the linearity of formula (18) is proved in terms of homogeneity and additivity.
Regarding the noise suppressing property of NTN algorithm (13), we offer the following theoretical analyses.
Theorem 2: Consider FDNO (1) solved by noise-free NTN algorithm (13). With c_1 and c_2 chosen from the parameter region of convergence (PROC) depicted in Fig. 2, the residual error lim_{k→∞} ||ξ(t_k)||_2 converges to O(δ²).
Proof: Using ξ_i(t_k) to denote the ith subsystem of ξ(t_k) generates (26). Subtracting (26) from the ith subsystem of (18) yields (27), which is rearranged into the state-space form (28), where U is given in (29). Applying Minkowski's inequality [36] for the 2-norm to (28) bounds the error evolution. Matrix U in (29) has two distinct eigenvalues μ_1 and μ_2. The region shown in Fig. 2 describes the values of c_1 and c_2 that keep μ_1 and μ_2 inside the unit circle. Thereby, we obtain lim_{k→∞} ||U^k||_2 = 0, which further implies the convergence of lim_{k→∞} ||ξ(t_k)||_2. As long as the c_1 and c_2 chosen for (13) belong to the parameter region of convergence (PROC) depicted in Fig. 2, lim_{k→∞} ||ξ(t_k)||_2 of NTN algorithm (13) converges to O(δ²). The proof is completed.
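The stability condition can be checked numerically. Assuming the noise-free error recursion takes the form ξ_{k+1} = (1 − c_1)ξ_k − c_2 Σ_{j≤k} ξ_j (a reading of (18) with the O(δ²) term dropped), the augmented state x_k = [ξ_k, S_{k−1}] with S_k = Σ_{j≤k} ξ_j evolves through an assumed 2×2 matrix U, and convergence corresponds to a spectral radius below one:

```python
import numpy as np

def spectral_radius(c1, c2):
    # assumed state matrix for x_k = [xi_k, S_{k-1}] under the recursion
    # xi_{k+1} = (1 - c1)*xi_k - c2*S_k,  S_k = xi_k + S_{k-1}
    U = np.array([[1.0 - c1 - c2, -c2],
                  [1.0,            1.0]])
    return max(abs(np.linalg.eigvals(U)))

rho_in = spectral_radius(0.5, 0.25)    # a point assumed inside the PROC
rho_out = spectral_radius(2.5, 0.5)    # a point assumed outside it

# the scalar recursion itself decays when the spectral radius is below one
xi, S = 1.0, 0.0
for _ in range(200):
    S = S + xi
    xi = (1.0 - 0.5) * xi - 0.25 * S
```

For (c_1, c_2) = (0.5, 0.25) the spectral radius is about 0.71 and the iterates vanish geometrically, whereas (2.5, 0.5) yields a radius above one, mirroring the role of the PROC in Fig. 2.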
Theorem 3: Consider FDNO (1) solved by noise-polluted NTN algorithm (17) with c_1 and c_2 belonging to the PROC. Under constant noise ǫ(t_k) = Ω, the residual error lim_{k→∞} ||ξ(t_k)||_2 is O(δ²); under linear time-varying noise ǫ(t_k) = ηδk + Ω, it is ηδ/c_2 + O(δ²).
Proof: The factors that determine the residual error of noise-polluted NTN algorithm (17) can be classified into the truncation error O(δ²) and the injected noise ǫ(t_k). The linearity of (18), proved in Remark 2, allows us to investigate the two factors independently.
Firstly, we simply rewrite (18) as (32). It has been proven in Theorem 2 that the residual error of (32) is O(δ²). Next, consider how constant noises influence the convergence performance of (33). In a more general sense, the constant noise ǫ(t_k) = Ω is a subcase of the linear time-variant noise ǫ(t_k) = ηδk + Ω, k = 0, 1, 2, · · · . Thus, the subsystem of (33) can be rewritten as (34). The Z-transform of (34) is (35), where ξ_i(0) is the initial value of ξ_i(z); its poles impose the same range of c_1 and c_2 as in Fig. 2. Thus, utilizing the final value theorem of the Z-transform on (35), the limit of ξ_i(t_k) is obtained. Summing up, taking the linear noise ǫ(t_k) = ηδk + Ω and the residual error of (32) into account, the residual error lim_{k→∞} ||ξ(t_k)||_2 of NTN algorithm (17) is ηδ/c_2 + O(δ²). Setting η = 0, we know that for constant noise ǫ(t_k) = Ω, the residual error lim_{k→∞} ||ξ(t_k)||_2 of NTN algorithm (17) is O(δ²), regardless of the value of the constant noise. The proof is completed.
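The steady residual ηδ/c_2 under linear noise can be checked with a short simulation, again assuming the scalar error recursion ξ_{k+1} = (1 − c_1)ξ_k − c_2 Σ_{j≤k} ξ_j − ǫ_k with injected ramp noise ǫ_k = ηδk (this recursion is our reading of (18) with noise; parameter values are arbitrary):

```python
import numpy as np

delta, c1, c2, eta = 0.01, 0.5, 0.25, 2.0
xi, S = 1.0, 0.0                       # arbitrary initial error, empty sum
for k in range(5000):
    S = S + xi                         # S_k = sum of xi_j up to j = k
    xi = (1.0 - c1) * xi - c2 * S - eta * delta * k   # ramp noise injected

predicted = eta * delta / c2           # claimed steady residual magnitude
```

The iterate settles at magnitude ηδ/c_2 (0.08 for these parameters): the summation channel absorbs the constant part of the noise, while its ramp part leaves exactly this offset, matching the limit derived via the final value theorem.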
For further investigation, the ensuing theorem reveals how NTN algorithm (13) handles bounded random noises.
Theorem 4: Consider FDNO (1) solved by NTN algorithm (17) polluted by bounded random noises. With c_1 and c_2 belonging to the PROC, the residual error lim_{k→∞} ||ξ(t_k)||_2 remains bounded, with the bound proportional to the noise level.
Proof: In accordance with the superposition principle exploited in Theorem 3, the corresponding difference equation is generated. Moreover, with lim_{k→∞} ||U^k||_2 = 0, the accumulated effect of the bounded random noises is bounded. Finally, we come to the conclusion stated above. The proof is completed. The BFGS algorithm has been found dramatically helpful for easing the expensive computational burden of NTN algorithm (13) when the Hessian matrix inversion is difficult to obtain. Several theorems regarding NTN-BFGS algorithm (14) are provided as follows. In addition, the noise-polluted NTN-BFGS algorithm, obtained by injecting noise ǫ(t_k) into NTN-BFGS algorithm (14), is denoted as (39). Theorem 5: Consider FDNO (1). Given that ||D_k − H_k^{−1}||_2 = O(δ²), the solution generated by NTN-BFGS algorithm (14) converges to that generated by NTN algorithm (13) with the residual error being O(δ²).
Proof: To begin this proof, ŷ(t_k) and y(t_k) are used to denote the solutions of NTN-BFGS algorithm (14) and NTN algorithm (13), respectively. We further substitute ||D_k − H_k^{−1}||_2 = O(δ²) into (14) and then obtain ŷ(t_{k+1}) = y(t_{k+1}) + O(δ²). Therefore, the residual-error relationship between NTN-BFGS algorithm (14) and NTN algorithm (13) follows directly. The proof is completed. Theorem 5 verifies the feasibility of using NTN-BFGS algorithm (14) to replace the inverse calculation from the perspective of theoretical derivation. In addition, the following theorem is provided to reveal the effectiveness of NTN-BFGS algorithm (14). Theorem 6: NTN-BFGS algorithm (14) possesses the same convergence and noise-suppression properties as NTN algorithm (13) characterized in Theorems 2 through 4. Proof: It can be generalized from the proofs of Theorems 2 through 4, and is thus omitted. The proof is completed. Remark 3: To illustrate how the noise level influences the performance of NTN algorithm (13), upper bounds for every kind of noise are given, with the precision of the residual error being χ. For constant noise ǫ(t_k) = Ω, there is no upper bound on Ω, since the residual error of NTN algorithm (17) is invariably O(δ²). For linear noise ǫ(t_k) = ηδk + Ω, there only exists an upper bound on the rate of change η, which is c_2χ/δ + O(δ²). For random noises, the upper bound lies in sup_{1≤ι≤k, 1≤i≤m} |ρ_ι^i| ≤ (1 − ||U||_2)χ + O(δ²)/(2m). In addition, based on Theorem 6, the above conclusions can be applied to NTN-BFGS algorithm (14).
According to the theorem proposed in [25], when traditional methods, which are intrinsically constructed to solve static optimization problems, are applied to future dynamic nonlinear optimization, the residual error is O(δ) within the computing interval [0, δ]. Besides, a step-by-step methodology to carry out noise-polluted NTN algorithm (17) and NTN-BFGS algorithm (39) is presented in Algorithm 1.

IV. NUMERICAL EXPERIMENTS
In this section, the effectiveness of the proposed NTN algorithm (13) and NTN-BFGS algorithm (14), with their rapid calculating ability for FDNO (1), is substantiated through numerical experiments in the presence of noises. Meanwhile, several representative existing models, i.e., DTZD model (8), three-step DTZD model (9), four-step DTZD model (10), five-step DTZD model (11) and NRI model (12), are also used to solve the same FDNO problem in the presence of noises for comparison.
To begin with, Fig. 3 shows the comparative performances of NRI model (12) and NTN algorithm (13) in the presence of random noises. In addition, the starting state y(t_0) used in the computing process is randomly generated. Specifically, as demonstrated in Fig. 3(a), NTN algorithm (13) successfully obtains the minimum value of Φ(y(t_k), t_k) at each time instant, while NRI model (12) fails to cope with the interference caused by random noises. Furthermore, Fig. 3(b) shows that the residual error of NTN algorithm (13) converges to a very small value that is roughly 10³ times smaller than that of NRI model (12). In addition, the comparison of each element trajectory of y(t_k) is plotted in Fig. 3(c). As shown in Fig. 3(d), the minimal eigenvalues of the Hessian matrix of FDNO benchmark problem (41) obtained from the two models coincide with each other and are both larger than zero during the computational time, which is the prerequisite for utilizing NRI model (12) and NTN algorithm (13) on FDNO benchmark problem (41).
Next, any noise can be defined mathematically as a combination of constant noises, linear noises and random noises, which allows us to explore the effects that noises exert on different models in a categorical way. Experimental results of FDNO benchmark problem (41) among all the aforementioned models in the presence of different kinds of additive noises are plotted in Fig. 4 through Fig. 7. As shown in Fig. 4, DTZD model (8), three-step DTZD model (9), five-step DTZD model (11) and NRI model (12) basically fail to handle constant noises, although the residual error generated by NRI model (12) is smaller than those of the three DTZD models. Fig. 4(a) through (c) indicate that constant noises have a non-negligible influence on the existing models, with residual errors of about 2 × 10² or 4 × 10². In contrast, by associating the error-integration term with the proposed NTN algorithm (13), constant noises are effectively suppressed. Regarding c_1, the value used in four-step DTZD model (10) is 0.05, while that used in the other aforementioned algorithms is 0.5.
Overall, FDNO benchmark problem (41) verifies that the proposed NTN algorithm (13) and NTN-BFGS algorithm (14) possess an extraordinary ability to suppress different kinds of noises, and even their combinations, without any prior information about the noises.

V. APPLICATION TO MOTION GENERATION
Robotic systems have evolved rapidly in engineering fields [37]-[40]. In this section, we apply NTN algorithm (13), NTN-BFGS algorithm (14) and NRI model (12) to solve the inverse kinematics of a two-link planar robot. Let a(t) ∈ R² denote the vector of the practical end-effector position in the Cartesian coordinate system, and let a_d(t) be the corresponding desired one; θ(t) = [θ_1(t), θ_2(t)]^T symbolizes the joint-angle vector; f(·) represents the forward-kinematics mapping between the end-effector position and the joint angles [39], i.e., f(θ(t)) = a(t). Considering that T indicates the end point of the solving process, every computational time interval can be expressed as [kδ, (k + 1)δ) ⊆ [0, T], where k = 0, 1, 2, · · · and δ is the sampling gap.

Algorithm 1 Solution to FDNO problem (1) generated by noise-polluted NTN algorithm (17) and NTN-BFGS algorithm (39)
1. initialize m, maxstep, c_1, c_2, noise = ǫ(t_k) and Φ(·, ·); randomly generate y(t_0) and calculate q(y(t_0), t_0)
2. for (k = 1; k ≤ maxstep; k++) do
  calculate q(y(t_k), t_k) = ∂Φ(y(t_k), t_k)/∂y(t_k);
  calculate Σ_{j=0}^{k} q(y(t_j), t_j) = q(y(t_0), t_0) + · · · + q(y(t_k), t_k);
  calculate H(y(t_k), t_k) = ∂q(y(t_k), t_k)/∂y(t_k);
  if NTN algorithm then
    calculate y(t_{k+1}) = y(t_k) − H^{−1}(y(t_k), t_k)[(c_1 + 1)q(y(t_k), t_k) − q(y(t_k), t_{k−1}) + c_2 Σ_{j=0}^{k} q(y(t_j), t_j) + noise];
  end if
  if NTN-BFGS algorithm then
    update D_{k+1} = D_k + ΔD_k and calculate y(t_{k+1}) = y(t_k) − D_k(y(t_k), t_k)[(c_1 + 1)q(y(t_k), t_k) − q(y(t_k), t_{k−1}) + c_2 Σ_{j=0}^{k} q(y(t_j), t_j) + noise];
  end if
  ERROR(k) = ||q(y(t_k), t_k)||_2;
end for
3. plot the residual error ERROR
Thereby, the cost function of the aforementioned robot motion can be defined accordingly, and the problem of a(t) tracking a_d(t) can evidently be cast as an FDNO problem. For simple illustration, each link length is set as 1 m. To be specific, our aim is to use the two-link planar robot manipulator, with NTN algorithm (13) and NRI model (12) embedded, to draw a four-leaf clover. The ensuing figures show comparative experimental results conducted under various experimental environments that differ in the kinds of additive noises. For comparison, Fig. 8 plots the profiles of the whole tracking motion trajectories, the joint angles and the position error in the absence of noises. Fig. 9 through Fig. 11 show that the given motion is completed well by NTN algorithm (13), with only a slight position error despite the additive noises. Specifically, when constant noise ǫ(t) = 10 is added to the joint velocities solved by NTN algorithm (13), the position error of the manipulator's end-effector in Fig. 9(a) is less than 10⁻⁹ m. Thus, the proposed NTN algorithm (13) is feasible for industrial applications.
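The kinematic setup above can be sketched as follows (unit link lengths; the clover path is one common parameterization assumed for illustration, and the names `f`, `jac`, `q` are ours, not the paper's): the cost is Φ(θ, t) = 0.5‖f(θ) − a_d(t)‖², whose gradient with respect to θ is Jᵀ(θ)(f(θ) − a_d(t)), the quantity fed to NTN algorithm (13).

```python
import numpy as np

def f(theta):                          # forward kinematics f(theta) = a
    t1, t12 = theta[0], theta[0] + theta[1]
    return np.array([np.cos(t1) + np.cos(t12), np.sin(t1) + np.sin(t12)])

def jac(theta):                        # Jacobian of f w.r.t. theta
    t1, t12 = theta[0], theta[0] + theta[1]
    return np.array([[-np.sin(t1) - np.sin(t12), -np.sin(t12)],
                     [ np.cos(t1) + np.cos(t12),  np.cos(t12)]])

def a_d(t):                            # assumed four-leaf-clover reference path
    rad = 1.2 + 0.3 * np.cos(4.0 * t)
    return np.array([rad * np.cos(t), rad * np.sin(t)])

def phi(theta, t):                     # cost: half squared position error
    e = f(theta) - a_d(t)
    return 0.5 * float(e @ e)

def q(theta, t):                       # gradient of phi w.r.t. theta
    return jac(theta).T @ (f(theta) - a_d(t))
```

With θ = [0, π/2]ᵀ the end-effector sits at (1, 1), and the analytic gradient agrees with a finite-difference check of `phi`, so this cost can be plugged directly into the FDNO solvers discussed above.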
VI. CONCLUSIONS

In this paper, NTN algorithm (13) has been proposed for future dynamic nonlinear optimization problems in the presence of a class of noises affecting the system. An error-integration control technique has been incorporated to minimize the cost function rapidly. The quasi-Newton BFGS method has been exploited through NTN-BFGS algorithm (14), which can not only deal with noises for the FDNO problem but also eliminate the expensive calculation of matrix inversion. Results obtained by numerical experiments have been reported in comparison with DTZD model (8), three-step DTZD model (9), five-step DTZD model (11) and NRI model (12), thereby highlighting the superiority of the proposed algorithms in robustness, efficacy and computational complexity when solving the FDNO problem with noises. Besides, a possible future research direction is to propose an algorithm independent of the short-time invariance hypothesis, with the explicit matrix-inversion operation eliminated, for solving FDNO problems.