Performance analysis of the generalised projection identification for time-varying systems

Least-mean-square methods include two typical parameter estimation algorithms: the projection algorithm and the stochastic gradient algorithm. The former is sensitive to noise and the latter is not capable of tracking time-varying parameters. Building on these two typical algorithms, this study presents a generalised projection identification algorithm (equivalently, a finite data window stochastic gradient identification algorithm) for time-varying systems and studies its convergence by using stochastic process theory. The analysis indicates that the generalised projection algorithm can track time-varying parameters and requires less computational effort than the forgetting factor recursive least squares algorithm. A way of choosing the data window length is given so that the minimum upper bound on the parameter estimation error can be obtained. Numerical examples are provided.

Define the time-varying parameter vector θ(t − 1) ∈ ℝⁿ to be identified and the regressive information vector φ(t) ∈ ℝⁿ consisting of the observations up to and including time (t − 1), e.g. φ(t) := [−y(t − 1), …, −y(t − n_a), u(t − 1), …, u(t − n_b)]ᵀ ∈ ℝⁿ, where the superscript T denotes the vector transpose. Equation (1) can then be written in the vector form y(t) = φᵀ(t)θ(t) + v(t). It is well known that the forgetting factor recursive least squares (FF-RLS) algorithm is effective for estimating the time-varying parameter vector [15]. In the literature, Lozano [16], and Canetti and Espana [17] analysed the performance of the FF-RLS algorithms for time-invariant and time-varying systems, respectively. Unfortunately, as the forgetting factor approaches unity, their results (i.e. the parameter estimation error bounds) go to infinity even for time-invariant systems, whose parameters are constant [14]. Bittanti et al. [18] studied the convergence properties of the directional FF-RLS algorithms for time-invariant deterministic systems; for ergodic input-output data, Ljung and Priouret [19, 20] and Guo et al. [21] obtained a parameter estimation error (PEE) upper bound of the form (see (3)) for large enough t, where θ̂(t) is the estimate of θ(t), E represents the expectation operator, 0 < λ < 1 is the forgetting factor, c₁, c₂ and c are positive constants, and w(t) is the parameter changing rate. However, for deterministic time-varying systems, i.e. the observation noise v(t) ≡ 0, as λ → 0, the PEE upper bound [i.e. the expression on the right-hand side of (3)] is bounded. Unfortunately, this result is incompatible with the existing ones, because as λ → 0 the covariance matrix goes to infinity and a bounded PEE cannot be obtained. This motivates us to present a novel generalised stochastic gradient (SG) algorithm.
Although the FF-RLS algorithm can estimate the time-varying parameter vector θ(t), its computational load is heavy owing to the computation of the covariance matrix [14, 22]. From the perspective of decreasing the computational complexity, the projection algorithm is sensitive to noise and the SG algorithm is not capable of tracking time-varying parameters [22]. On the basis of the work in [23], this paper combines the advantages of the projection algorithm and the SG algorithm to present a generalised projection identification algorithm for time-varying systems, studies the convergence of the proposed algorithm, and obtains its PEE upper bound by using stochastic process theory. The generalised projection algorithm can track the time-varying parameters and requires less computational effort than the FF-RLS algorithm.
This paper is organised as follows. Section 2 gives several time-varying parameter estimation algorithms and derives the generalised projection algorithm. Section 3 provides several lemmas used to prove the main convergence results in Section 4. Section 5 provides two numerical examples and summarises some conclusions.

Several time-varying parameter identification algorithms
Let us introduce some notation. Let Iₙ be an identity matrix of order n, tr[A] denote the trace of a square matrix A, the norm of a matrix X be defined through ∥X∥² := tr[XXᵀ] = tr[XᵀX], and λ_max[A] represent the maximum eigenvalue of a positive definite matrix A.
The following discusses several time-varying parameter estimation algorithms that identify the parameter vector θ(t) in real time by using the input-output data, i.e. the observations {u(i), y(i), i ⩽ t} up to and including time t.
Using the Newton method and minimising the cost function give the following recursive algorithm for estimating the parameter vector θ(t) [13], where θ̂(t) is the estimate of θ(t) at time t. Different choices of the matrix R(t) ∈ ℝⁿˣⁿ lead to different identification algorithms, e.g. the FF-RLS algorithm, the projection algorithm, the finite data window least squares algorithm, the SG algorithm and so on.
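Choosing R(t) to accumulate the squared norms of the information vectors over a finite window of length p yields the generalised projection recursion. The following is a minimal sketch under that assumption, with a scalar gain r(t) updated as r(t) = r(t − 1) + ∥φ(t)∥² − ∥φ(t − p)∥²; the names `gpj_step`, `r` and `p` are illustrative, not the paper's.

```python
import numpy as np

def gpj_step(theta, r, phi_new, phi_old, y):
    """One update of a finite-data-window SG (generalised projection)
    recursion, sketched under assumed notation:
        r(t)     = r(t-1) + ||phi(t)||^2 - ||phi(t-p)||^2
        theta(t) = theta(t-1) + phi(t)/r(t) * (y(t) - phi(t)^T theta(t-1))
    """
    # the squared norm of the newest vector enters the gain, the one
    # leaving the window is removed, so r(t) stays bounded
    r = r + phi_new @ phi_new - phi_old @ phi_old
    theta = theta + phi_new / r * (y - phi_new @ theta)
    return theta, r
```

With p = 1 this behaves like a projection-type update, while letting the window cover all past data recovers the SG algorithm, which is how the generalised projection algorithm interpolates between the two.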
The proof is easy and omitted here.

Main convergence results
In this section, we derive the PEE upper bounds of the generalised projection algorithm for time-varying systems, assuming that the observation noise and the parameter changing rate have bounded variances.
Theorem 1: For the system in (2) and the algorithm in (21) and (22), assume that the observation noise {v(t)} and the parameter changing rate {w(t) := θ(t) − θ(t − 1)} are uncorrelated random variable sequences with zero mean and satisfy (see (30)). If the conditions in Lemma 1 hold, then the parameter estimation error vector θ̂(t) − θ(t) given by the generalised projection algorithm is mean-square bounded, i.e.
Furthermore, we have where ⌊t/p⌋ is the greatest integer not more than t/p, and The limit of the parameter estimation error upper bound is f(p), which depends on the data window length p, the noise variances σ_v² and σ_w², and α and β as well.

Proof: Define the parameter estimation error vector θ̃(t) = θ̂(t) − θ(t). Assume that θ̃(0) and {v(t)} are uncorrelated, and E[∥θ̃(0)∥²] < ∞. Subtracting θ(t) from both sides of (21), we have Furthermore, in order to prove that the parameter estimation error is mean-square bounded, taking the norm of both sides of the above equation and using Lemma 1, we have (see (31) and (32)). Note that the maximum eigenvalue of Φᵀ(t + 1, t − p + 1)Φ(t + 1, t − p + 1) is not more than unity for any p ⩾ 1. Using (A1)-(A3), taking the expectation of both sides of the above equation and using Lemma 1 give (see (32) and (33)). Let t = kp + j, 0 ⩽ j ⩽ p − 1; then we have (see (33) and (34)). Taking the limit and using Lemma 3 leads to (see (34)). This completes the proof of Theorem 1. □

From the PEE upper bound f(p) in Theorem 1, we can draw the following conclusions:

• There exists an appropriate (or best) data window length p such that the error upper bound f(p) is minimum. That is, as long as we choose an appropriate data window length p, the PEE upper bound can be minimal.
• A small noise variance σ_v² and a small parameter changing-rate variance σ_w² lead to a small PEE upper bound.
• Increasing α and decreasing β in the SPE condition results in a small PEE upper bound. That is, the larger α is and/or the smaller β is, the smaller the parameter estimation error is. In other words, the stationarity of the input-output data can reduce the PEE upper bound and improve the parameter estimation accuracy.
Theorem 1 gives the PEE upper bound. The following studies how to obtain the minimum PEE upper bound under different conditions and how to find the data window length that leads to the minimum PEE upper bound. From Theorem 1, we have the following corollaries.
Corollary 3: For the time-varying stochastic systems in (2), letting f′(p) = 0 in Theorem 1 gives an equation with three solutions, p₁, p₂ and p₃. The solution p* = [pᵢ] or p* = [pᵢ] + 1 that leads to the minimum f(p*) is the best data window length, where [x] represents the maximum integer not more than x, and the corresponding minimal estimation error upper bound is f([pᵢ]) or f([pᵢ] + 1).
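As a numerical illustration of this prescription, one can evaluate the bound over integer window lengths and keep the minimiser. The bound used below is a hypothetical stand-in with a noise term shrinking in p and a tracking term growing in p (an assumed trade-off, not the paper's f(p)), and `best_window_length` is an illustrative name.

```python
def best_window_length(f, p_max=200):
    """Return the integer data window length minimising a PEE upper
    bound f(p), assumed unimodal in p as in Corollary 3."""
    return min(range(1, p_max + 1), key=f)

# Illustrative (assumed) bound: a noise term decreasing with p plus a
# parameter-tracking term increasing with p.
f = lambda p: 0.5 / p + 0.01 * p
p_star = best_window_length(f)
```

For this assumed f the continuous minimiser is √50 ≈ 7.07, and comparing the two neighbouring integers picks the better one, exactly as the corollary prescribes.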
For a practical identification problem, the first task is to design an experiment, collect the input-output data {u(t), y(t): t = 1, 2, …, L} with the data length L ≫ n, and determine the order of the system by using some order identification methods.
According to the SPE condition, we use the input-output data to construct the information vector φ(t) and, with the window length p set, we can compute α and β and c₁ and c₂. Also, we must compute the estimates σ̂_v² and σ̂_w² of the variances σ_v² and σ_w². In practice, we do not know the noise variances σ_v² and σ_w², so we simply substitute the variances σ_v² and σ_w² in the PEE upper bound f(p) with their estimates σ̂_v² and σ̂_w² in order to obtain the approximate PEE upper bound.
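The estimator formulas themselves are not reproduced above; a common residual-based sketch (an assumption about their form, not the paper's exact expressions) computes σ̂_v² from the output residuals and σ̂_w² from the successive differences of the parameter estimates:

```python
import numpy as np

def estimate_variances(y, Phi, theta_hats):
    """Residual-based variance estimates (illustrative forms).

    y          : outputs y(1..L), shape (L,)
    Phi        : information vectors phi(1..L), shape (L, n)
    theta_hats : parameter estimates theta_hat(0..L), shape (L+1, n)
    """
    # sigma_v^2 estimate: mean squared output residual
    residuals = y - np.einsum('ij,ij->i', Phi, theta_hats[1:])
    sigma_v2 = np.mean(residuals ** 2)
    # sigma_w^2 estimate: the changing rate w(t) = theta(t) - theta(t-1)
    # is approximated by successive differences of the estimates
    diffs = np.diff(theta_hats, axis=0)
    sigma_w2 = np.mean(np.sum(diffs ** 2, axis=1))
    return sigma_v2, sigma_w2
```

Plugging these estimates into f(p) then gives the approximate PEE upper bound described above.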
A longer window permits better performance of the GPJ algorithm for slowly time-varying systems. However, if the system parameters change quickly, the window cannot be too long. These three corollaries give practical guidance for users on how to choose the window length; they may also be used to evaluate the PEE upper bound given by the GPJ algorithm for time-varying systems and to guide the choice of the data window length so as to obtain the minimal parameter estimation errors.
From Theorem 1, we can see that the identification algorithms encounter difficulties for systems with fast-changing parameters, because fast-changing parameters have a large variance σ_w² and the parameter estimation error upper bound becomes large. Conversely, slowly time-varying parameters and a small observation noise variance σ_v² lead to a small parameter estimation error. Theorem 2: For the time-invariant stochastic system y(t) = φᵀ(t)θ + v(t), where θ(t) ≡ θ is a constant parameter vector, if the conditions in Theorem 1 hold, then the parameter estimation error E[∥θ̂(t) − θ∥²] given by the SG algorithm in (17) and (18) converges to zero at the rate of O(1/t).
Proof: A derivation similar to that of Theorem 1 leads to Using (29), we have (see (37)). Using Lemma 3, it is not difficult to obtain This proves Theorem 2. □ Next, we find the relation between the forgetting factor λ in the FF-SG algorithm in (15) and (16) and the data window length p in the GPJ algorithm in (21) and (22).

Examples
In order to test the performance of the GPJ algorithm, we choose a time-invariant parameter system and a time-varying parameter system as two numerical simulation examples. The former's parameters are constant and the latter's parameters are slowly changing.
The simulation results indicate that, for time-invariant systems, the projection algorithm is sensitive to noise: the parameter estimation errors become smaller as the noise variance becomes smaller (see Table 2 and Fig. 1); the SG algorithm does not have the ability to track the (time-varying) parameters, and the parameter estimation errors given by the SG algorithm are very large (see Table 1 and the upper error curve in Fig. 2); the GPJ algorithm gives satisfactory parameter estimation accuracy for an appropriate data window length (see Tables 1-3 and Fig. 1).
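These qualitative findings can be reproduced with a small Monte Carlo sketch. The system, drift rate and noise level below are illustrative choices rather than the paper's examples, and `run`, `sg_err` and `gpj_err` are assumed names.

```python
import numpy as np

# Compare the plain SG recursion with a finite-data-window (GPJ-style)
# recursion on a slowly drifting two-parameter system.
rng = np.random.default_rng(1)
n, p, T = 2, 10, 2000

def run(window):
    """window=None: plain SG (the gain r accumulates all past norms);
    window=p: GPJ-style update (old norms leave the accumulator)."""
    th = np.array([1.0, -0.5])        # true parameter, drifts over time
    est, r = np.zeros(n), 1.0
    buf = [np.zeros(n)] * (window or 0)
    errs = []
    for _ in range(T):
        th = th + 0.01 * rng.standard_normal(n)     # slow parameter drift
        phi = rng.standard_normal(n)
        y = phi @ th + 0.1 * rng.standard_normal()  # noisy observation
        r += phi @ phi
        if window:
            old = buf.pop(0)
            r -= old @ old                          # finite data window
            buf.append(phi)
        est = est + phi / r * (y - phi @ est)
        errs.append(np.linalg.norm(est - th))
    return float(np.mean(errs[T // 2:]))            # steady-state error

sg_err, gpj_err = run(None), run(p)
```

The SG gain decays like 1/t, so its error grows as the parameters drift away from the nearly frozen estimate, whereas the finite window keeps the GPJ gain bounded away from zero and the tracking error bounded, consistent with the tables and figures discussed above.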
Example 2: Consider the following time-varying system: