## MB02ND

### Solution of Total Least-Squares problem using a partial SVD approach

[Specification] [Arguments] [Method] [References] [Comments] [Example]

Purpose

```  To solve the Total Least Squares (TLS) problem using a Partial
Singular Value Decomposition (PSVD) approach.
The TLS problem assumes an overdetermined set of linear equations
AX = B, where both the data matrix A as well as the observation
matrix B are inaccurate. The routine also solves determined and
underdetermined sets of equations by computing the minimum norm
solution.
It is assumed that all preprocessing measures (scaling, coordinate
transformations, whitening, ... ) of the data have been performed

```
Specification
```      SUBROUTINE MB02ND( M, N, L, RANK, THETA, C, LDC, X, LDX, Q, INUL,
\$                   TOL, RELTOL, IWORK, DWORK, LDWORK, BWORK,
\$                   IWARN, INFO )
C     .. Scalar Arguments ..
INTEGER           INFO, IWARN, L, LDC, LDWORK, LDX, M, N, RANK
DOUBLE PRECISION  RELTOL, THETA, TOL
C     .. Array Arguments ..
LOGICAL           BWORK(*), INUL(*)
INTEGER           IWORK(*)
DOUBLE PRECISION  C(LDC,*), DWORK(*), Q(*), X(LDX,*)

```
Arguments

Input/Output Parameters

```  M       (input) INTEGER
The number of rows in the data matrix A and the
observation matrix B.  M >= 0.

N       (input) INTEGER
The number of columns in the data matrix A.  N >= 0.

L       (input) INTEGER
The number of columns in the observation matrix B.
L >= 0.

RANK    (input/output) INTEGER
On entry, if RANK < 0, then the rank of the TLS
approximation [A+DA|B+DB] (r say) is computed by the
routine.
Otherwise, RANK must specify the value of r.
RANK <= min(M,N).
On exit, if RANK < 0 on entry and INFO = 0, then RANK
contains the computed rank of the TLS approximation
[A+DA|B+DB].
Otherwise, the user-supplied value of RANK may be
changed by the routine on exit if the RANK-th and the
(RANK+1)-th singular values of C = [A|B] are considered
to be equal, or if the upper triangular matrix F (as
defined in METHOD) is (numerically) singular.

THETA   (input/output) DOUBLE PRECISION
On entry, if RANK < 0, then the rank of the TLS
approximation [A+DA|B+DB] is computed using THETA as
(min(M,N+L) - d), where d is the number of singular
values of [A|B] <= THETA. THETA >= 0.0.
Otherwise, THETA is an initial estimate (t say) for
computing a lower bound on the RANK largest singular
values of [A|B]. If THETA < 0.0 on entry however, then
t is computed by the routine.
On exit, if RANK >= 0 on entry, then THETA contains the
computed bound such that precisely RANK singular values
of C = [A|B] are greater than THETA + TOL.
Otherwise, THETA is unchanged.

C       (input/output) DOUBLE PRECISION array, dimension (LDC,N+L)
On entry, the leading M-by-(N+L) part of this array must
contain the matrices A and B. Specifically, the first N
columns must contain the data matrix A and the last L
columns the observation matrix B (right-hand sides).
On exit, if INFO = 0, the first N+L components of the
columns of this array whose index i corresponds with
INUL(i) = .TRUE., are the possibly transformed (N+L-RANK)
base vectors of the right singular subspace corresponding
to the singular values of C = [A|B] which are less than or
equal to THETA. Specifically, if L = 0, or if RANK = 0 and
IWARN <> 2, these vectors are indeed the base vectors
above. Otherwise, these vectors form the matrix V2,
transformed as described in Step 4 of the PTLS algorithm
(see METHOD). The TLS solution is computed from these
vectors. The other columns of array C contain no useful
information.

LDC     INTEGER
The leading dimension of array C.  LDC >= max(1,M,N+L).

X       (output) DOUBLE PRECISION array, dimension (LDX,L)
If INFO = 0, the leading N-by-L part of this array
contains the solution X to the TLS problem specified by
A and B.

LDX     INTEGER
The leading dimension of array X.  LDX >= max(1,N).

Q       (output) DOUBLE PRECISION array, dimension
(max(1,2*min(M,N+L)-1))
This array contains the partially diagonalized bidiagonal
matrix J computed from C, at the moment that the desired
singular subspace has been found. Specifically, the
leading p = min(M,N+L) entries of Q contain the diagonal
elements q(1),q(2),...,q(p) and the entries Q(p+1),Q(p+2),
...,Q(2*p-1) contain the superdiagonal elements e(1),e(2),
...,e(p-1) of J.

INUL    (output) LOGICAL array, dimension (N+L)
The indices of the elements of this array with value
.TRUE. indicate the columns in C containing the base
vectors of the right singular subspace of C from which
the TLS solution has been computed.

```
Tolerances
```  TOL     DOUBLE PRECISION
This parameter defines the multiplicity of singular values
by considering all singular values within an interval of
length TOL as coinciding. TOL is used in checking how many
singular values are less than or equal to THETA. Also in
computing an appropriate upper bound THETA by a bisection
method, TOL is used as a stopping criterion defining the
minimum (absolute) subinterval width. TOL is also taken
as an absolute tolerance for negligible elements in the
QR/QL iterations. If the user sets TOL to be less than or
equal to 0, then the tolerance is taken as specified in
SLICOT Library routine MB04YD document.

RELTOL  DOUBLE PRECISION
This parameter specifies the minimum relative width of an
interval. When an interval is narrower than TOL, or than
RELTOL times the larger (in magnitude) endpoint, then it
is considered to be sufficiently small and bisection has
converged. If the user sets RELTOL to be less than
BASE * EPS, where BASE is machine radix and EPS is machine
precision (see LAPACK Library routine DLAMCH), then the
tolerance is taken as BASE * EPS.

```
Workspace
```  IWORK   INTEGER array, dimension (N+2*L)

DWORK   DOUBLE PRECISION array, dimension (LDWORK)
On exit, if INFO = 0, DWORK(1) returns the optimal value
of LDWORK, and DWORK(2) returns the reciprocal of the
condition number of the matrix F.

LDWORK  INTEGER
The length of the array DWORK.
LDWORK = max(2, max(M,N+L) + 2*min(M,N+L),
min(M,N+L) + LW + max(6*(N+L)-5,
L*L+max(N+L,3*L)),
where
LW = (N+L)*(N+L-1)/2,  if M >= N+L,
LW = M*(N+L-(M-1)/2),  if M <  N+L.
For optimum performance LDWORK should be larger.

If LDWORK = -1, then a workspace query is assumed;
the routine only calculates the optimal size of the
DWORK array, returns this value as the first entry of
the DWORK array, and no error message related to LDWORK
is issued by XERBLA.

BWORK   LOGICAL array, dimension (N+L)

```
Warning Indicator
```  IWARN   INTEGER
= 0:  no warnings;
= 1:  if the rank of matrix C has been lowered because a
singular value of multiplicity greater than 1 was
found;
= 2:  if the rank of matrix C has been lowered because the
upper triangular matrix F is (numerically) singular.

```
Error Indicator
```  INFO    INTEGER
= 0:  successful exit;
< 0:  if INFO = -i, the i-th argument had an illegal
value;
= 1:  if the maximum number of QR/QL iteration steps
(30*MIN(M,N)) has been exceeded;
= 2:  if the computed rank of the TLS approximation
[A+DA|B+DB] exceeds MIN(M,N). Try increasing the
value of THETA or set the value of RANK to min(M,N).

```
Method
```  The method used is the Partial Total Least Squares (PTLS) approach
proposed by Van Huffel and Vandewalle .

Let C = [A|B] denote the matrix formed by adjoining the columns of
B to the columns of A on the right.

Total Least Squares (TLS) definition:
-------------------------------------

Given matrices A and B, find a matrix X satisfying

(A + DA) X = B + DB,

where A and DA are M-by-N matrices, B and DB are M-by-L matrices
and X is an N-by-L matrix.
The solution X must be such that the Frobenius norm of [DA|DB]
is a minimum and each column of B + DB is in the range of
A + DA. Whenever the solution is not unique, the routine singles
out the minimum norm solution X.

Let V denote the right singular subspace of C. Since the TLS
solution can be computed from any orthogonal basis of the subspace
of V corresponding to the smallest singular values of C, the
Partial Singular Value Decomposition (PSVD) can be used instead of
the classical SVD. The dimension of this subspace of V may be
determined by the rank of C or by an upper bound for those
smallest singular values.

The PTLS algorithm proceeds as follows (see [2 - 5]):

Step 1: Bidiagonalization phase
-----------------------
(a) If M is large enough than N + L, transform C into upper
triangular form R by Householder transformations.
(b) Transform C (or R) into upper bidiagonal form
(p = min(M,N+L)):

|q(1) e(1)  0   ...  0   |
(0)  | 0   q(2) e(2)      .   |
J   = | .                  .   |
| .                e(p-1)|
| 0             ... q(p) |

if M >= N + L, or lower bidiagonal form:

|q(1)  0    0   ...  0     0   |
(0)  |e(1) q(2)  0        .     .   |
J   = | .                  .     .   |
| .                 q(p)   .   |
| 0             ... e(p-1) q(p)|

if M < N + L, using Householder transformations.
In the second case, transform the matrix to the upper
bidiagonal form by applying Givens rotations.
(c) Initialize the right singular base matrix with the identity
matrix.

Step 2: Partial diagonalization phase
-----------------------------
If the upper bound THETA is not given, then compute THETA such
that precisely p - RANK singular values (p=min(M,N+L)) of the
bidiagonal matrix are less than or equal to THETA, using a
bisection method . Diagonalize the given bidiagonal matrix J
partially, using either QL iterations (if the upper left diagonal
element of the considered bidiagonal submatrix is smaller than the
lower right diagonal element) or QR iterations, such that J is
split into unreduced bidiagonal submatrices whose singular values
are either all larger than THETA or are all less than or equal
to THETA. Accumulate the Givens rotations in V.

Step 3: Back transformation phase
-------------------------
Apply the Householder transformations of Step 1(b) onto the base
vectors of V associated with the bidiagonal submatrices with all
singular values less than or equal to THETA.

Step 4: Computation of F and Y
----------------------
Let V2 be the matrix of the columns of V corresponding to the
(N + L - RANK) smallest singular values of C.
Compute with Householder transformations the matrices F and Y
such that:

|VH   Y|
V2 x Q = |      |
|0    F|

where Q is an orthogonal matrix, VH is an N-by-(N-RANK) matrix,
Y is an N-by-L matrix and F is an L-by-L upper triangular matrix.
If F is singular, then reduce the value of RANK by one and repeat
Steps 2, 3 and 4.

Step 5: Computation of the TLS solution
-------------------------------
If F is non-singular then the solution X is obtained by solving
the following equations by forward elimination:

X F = -Y.

Notes:
If RANK is lowered in Step 4, some additional base vectors must
be computed in Step 2. The additional computations are kept to
a minimum.
If RANK is lowered in Step 4 but the multiplicity of the RANK-th
singular value is larger than 1, then the value of RANK is further
lowered with its multiplicity defined by the parameter TOL. This
is done at the beginning of Step 2 by calling SLICOT Library
routine MB03MD (from MB04YD), which estimates THETA using a
bisection method. If F in Step 4 is singular, then the computed
solution is infinite and hence does not satisfy the second TLS
criterion (see TLS definition). For these cases, Golub and
Van Loan  claim that the TLS problem has no solution. The
properties of these so-called nongeneric problems are described
in  and the TLS computations are generalized in order to solve
them. As proven in , the proposed generalization satisfies the
TLS criteria for any number L of observation vectors in B provided
that, in addition, the solution | X| is constrained to be
|-I|
orthogonal to all vectors of the form |w| which belong to the
|0|
space generated by the columns of the submatrix |Y|.
|F|

```
References
```   Golub, G.H. and Van Loan, C.F.
An Analysis of the Total Least-Squares Problem.
SIAM J. Numer. Anal., 17, pp. 883-893, 1980.

 Van Huffel, S., Vandewalle, J. and Haegemans, A.
An Efficient and Reliable Algorithm for Computing the
Singular Subspace of a Matrix Associated with its Smallest
Singular Values.
J. Comput. and Appl. Math., 19, pp. 313-330, 1987.

 Van Huffel, S.
Analysis of the Total Least Squares Problem and its Use in
Parameter Estimation.
Doctoral dissertation, Dept. of Electr. Eng., Katholieke
Universiteit Leuven, Belgium, June 1987.

 Chan, T.F.
An Improved Algorithm for Computing the Singular Value
Decomposition.
ACM TOMS, 8, pp. 72-83, 1982.

 Van Huffel, S. and Vandewalle, J.
The Partial Total Least Squares Algorithm.
J. Comput. Appl. Math., 21, pp. 333-341, 1988.

 Van Huffel, S. and Vandewalle, J.
Analysis and Solution of the Nongeneric Total Least Squares
Problem.
SIAM J. Matr. Anal. and Appl., 9, pp. 360-372, 1988.

```
Numerical Aspects
```  The computational efficiency of the PTLS algorithm compared with
the classical TLS algorithm (see [2 - 5]) is obtained by making
use of PSVD (see ) instead of performing the entire SVD.
Depending on the gap between the RANK-th and the (RANK+1)-th
singular values of C, the number (N + L - RANK) of base vectors to
be computed with respect to the column dimension (N + L) of C and
the desired accuracy RELTOL, the algorithm used by this routine is
approximately twice as fast as the classical TLS algorithm at the
expense of extra storage requirements, namely:
(N + L) x (N + L - 1)/2  if M >= N + L or
M x (N + L - (M - 1)/2)  if M <  N + L.
This is because the Householder transformations performed on the
rows of C in the bidiagonalization phase (see Step 1) must be kept
until the end (Step 5).

```
```  None
```
Example

Program Text

```*     MB02ND EXAMPLE PROGRAM TEXT
*     Copyright (c) 2002-2017 NICONET e.V.
*
*     .. Parameters ..
DOUBLE PRECISION ZERO
PARAMETER        ( ZERO = 0.0D0 )
INTEGER          NIN, NOUT
PARAMETER        ( NIN = 5, NOUT = 6 )
INTEGER          MMAX, NMAX, LMAX
PARAMETER        ( MMAX = 20, NMAX = 20, LMAX = 20 )
INTEGER          LDC, LDX
PARAMETER        ( LDC = MAX( MMAX, NMAX+LMAX ), LDX = NMAX )
INTEGER          LENGQ
PARAMETER        ( LENGQ = 2*MIN(MMAX,NMAX+LMAX)-1 )
INTEGER          LIWORK
PARAMETER        ( LIWORK = NMAX+2*LMAX )
INTEGER          LDWORK
PARAMETER        ( LDWORK = MAX(2, MAX( MMAX, NMAX+LMAX ) +
\$                            2*MIN( MMAX, NMAX+LMAX ),
\$                            MIN( MMAX, NMAX+LMAX ) +
\$                            MAX( ( NMAX+LMAX )*( NMAX+LMAX-1 )/2,
\$                              MMAX*( NMAX+LMAX-( MMAX-1 )/2 ) ) +
\$                            MAX( 6*(NMAX+LMAX)-5, LMAX*LMAX +
\$                                 MAX( NMAX+LMAX, 3*LMAX ) ) ) )
INTEGER          LBWORK
PARAMETER        ( LBWORK = NMAX+LMAX )
*     .. Local Scalars ..
DOUBLE PRECISION RELTOL, THETA, THETA1, TOL
INTEGER          I, INFO, IWARN, J, K, L, LOOP, M, MINMNL, N,
\$                 RANK, RANK1
*     .. Local Arrays ..
DOUBLE PRECISION C(LDC,NMAX+LMAX), DWORK(LDWORK),
\$                 Q(LENGQ), X(LDX,LMAX)
INTEGER          IWORK(LIWORK)
LOGICAL          BWORK(LBWORK), INUL(NMAX+LMAX)
*     .. External Subroutines ..
EXTERNAL         MB02ND
*     .. Intrinsic Functions ..
INTRINSIC        MAX, MIN
*     .. Executable Statements ..
*
WRITE ( NOUT, FMT = 99999 )
*     Skip the heading in the data file and read the data.
READ ( NIN, FMT = '()' )
READ ( NIN, FMT = * ) M, N, L, RANK, THETA, TOL, RELTOL
IF ( M.LT.0 .OR. M.GT.MMAX ) THEN
WRITE ( NOUT, FMT = 99982 ) M
ELSE IF ( N.LT.0 .OR. N.GT.NMAX ) THEN
WRITE ( NOUT, FMT = 99983 ) N
ELSE IF ( L.LT.0 .OR. L.GT.LMAX ) THEN
WRITE ( NOUT, FMT = 99981 ) L
ELSE IF ( RANK.GT.MIN( MMAX, NMAX ) ) THEN
WRITE ( NOUT, FMT = 99980 ) RANK
ELSE IF ( RANK.LT.0 .AND. THETA.LT.ZERO ) THEN
WRITE ( NOUT, FMT = 99979 ) THETA
ELSE
READ ( NIN, FMT = * ) ( ( C(I,J), J = 1,N+L ), I = 1,M )
RANK1 = RANK
THETA1 = THETA
*        Compute the solution to the TLS problem Ax = b.
CALL MB02ND( M, N, L, RANK, THETA, C, LDC, X, LDX, Q, INUL,
\$                TOL, RELTOL, IWORK, DWORK, LDWORK, BWORK, IWARN,
\$                INFO )
*
IF ( INFO.NE.0 ) THEN
WRITE ( NOUT, FMT = 99998 ) INFO
ELSE
IF ( IWARN.NE.0 ) THEN
WRITE ( NOUT, FMT = 99997 ) IWARN
WRITE ( NOUT, FMT = 99996 ) RANK
ELSE
IF ( RANK1.LT.0 ) WRITE ( NOUT, FMT = 99996 ) RANK
END IF
IF ( THETA1.LT.ZERO ) WRITE ( NOUT, FMT = 99995 ) THETA
WRITE ( NOUT, FMT = 99994 )
MINMNL = MIN( M, N+L )
LOOP = MINMNL - 1
DO 20 I = 1, LOOP
K = I + MINMNL
WRITE ( NOUT, FMT = 99993 ) I, I, Q(I), I, I + 1, Q(K)
20       CONTINUE
WRITE ( NOUT, FMT = 99992 ) MINMNL, MINMNL, Q(MINMNL)
WRITE ( NOUT, FMT = 99991 )
DO 60 J = 1, L
DO 40 I = 1, N
WRITE ( NOUT, FMT = 99990 ) X(I,J)
40          CONTINUE
IF ( J.LT.L ) WRITE ( NOUT, FMT = 99989 )
60       CONTINUE
WRITE ( NOUT, FMT = 99987 ) N + L, N + L
WRITE ( NOUT, FMT = 99985 )
DO 80 I = 1, MAX( M, N + L )
WRITE ( NOUT, FMT = 99984 ) ( C(I,J), J = 1,N+L )
80       CONTINUE
WRITE ( NOUT, FMT = 99986 )
DO 100 J = 1, N + L
WRITE ( NOUT, FMT = 99988 ) J, INUL(J)
100       CONTINUE
END IF
END IF
STOP
*
99999 FORMAT (' MB02ND EXAMPLE PROGRAM RESULTS',/1X)
99998 FORMAT (' INFO on exit from MB02ND = ',I2)
99997 FORMAT (' IWARN on exit from MB02ND = ',I2,/)
99996 FORMAT (' The computed rank of the TLS approximation  = ',I3,/)
99995 FORMAT (' The computed value of THETA = ',F7.4,/)
99994 FORMAT (' The elements of the partially diagonalized bidiagonal ',
\$       'matrix are',/)
99993 FORMAT (2(' (',I1,',',I1,') = ',F7.4,2X))
99992 FORMAT (' (',I1,',',I1,') = ',F7.4,/)
99991 FORMAT (' The solution X to the TLS problem is ',/)
99990 FORMAT (1X,F8.4)
99989 FORMAT (' ')
99988 FORMAT (I3,L8)
99987 FORMAT (/' Right singular subspace corresponds to the first ',I2,
\$       ' components of the j-th ',/' column of C for which INUL(',
\$       'j) = .TRUE., j = 1,...,',I2,/)
99986 FORMAT (/'  j    INUL(j)',/)
99985 FORMAT (' Matrix C',/)
99984 FORMAT (20(1X,F8.4))
99983 FORMAT (/' N is out of range.',/' N = ',I5)
99982 FORMAT (/' M is out of range.',/' M = ',I5)
99981 FORMAT (/' L is out of range.',/' L = ',I5)
99980 FORMAT (/' RANK is out of range.',/' RANK = ',I5)
99979 FORMAT (/' THETA must be at least zero.',/' THETA = ',F8.4)
END
```
Program Data
``` MB02ND EXAMPLE PROGRAM DATA
6     3     1     -1     0.001     0.0     0.0
0.80010  0.39985  0.60005  0.89999
0.29996  0.69990  0.39997  0.82997
0.49994  0.60003  0.20012  0.79011
0.90013  0.20016  0.79995  0.85002
0.39998  0.80006  0.49985  0.99016
0.20002  0.90007  0.70009  1.02994
```
Program Results
``` MB02ND EXAMPLE PROGRAM RESULTS

The computed rank of the TLS approximation  =   3

The elements of the partially diagonalized bidiagonal matrix are

(1,1) =  3.2280   (1,2) = -0.0287
(2,2) =  0.8714   (2,3) =  0.0168
(3,3) =  0.3698   (3,4) =  0.0000
(4,4) =  0.0001

The solution X to the TLS problem is

0.5003
0.8003
0.2995

Right singular subspace corresponds to the first  4 components of the j-th
column of C for which INUL(j) = .TRUE., j = 1,..., 4

Matrix C

-0.3967  -0.7096   0.4612  -0.3555
0.9150  -0.2557   0.2414  -0.5687
-0.0728   0.6526   0.5215  -0.2128
0.0000   0.0720   0.6761   0.7106
0.1809   0.3209   0.0247  -0.4139
0.0905   0.4609  -0.3528   0.5128

j    INUL(j)

1       F
2       F
3       F
4       T
```