[[File:Singular-Value-Decomposition.svg|thumb|350px|Illustration of the singular value decomposition: the deformation of the unit circle by the [[shear mapping]] <math>M = \begin{bmatrix}1&1\\0&1\end{bmatrix}</math> on a two-dimensional real vector space. {{mvar|M}} decomposes into the composition of an [[isometry]] given by {{math|''V''<sup>*</sup>}} (a rotation in this figure), a scaling given by {{math|&Sigma;}} (here the unit circle is deformed into an ellipse whose major and minor semi-axes correspond to the singular values), and an isometry given by {{mvar|U}} (a rotation in this figure).]]
 
The '''singular value decomposition''' ('''SVD''') is a technique of [[matrix decomposition|matrix factorization]] in [[linear algebra]] for [[matrix (mathematics)|matrices]] with [[complex number|complex]] or [[real number|real]] entries. It is used in fields such as [[signal processing]] and [[statistics]]. The singular value decomposition can be regarded as a generalization of the [[spectral theorem]] for matrices, and it applies to matrices of arbitrary shape, not only [[square matrix|square matrices]].
 
 
 
== Singular value decomposition theorem ==
 
Let {{mvar|M}} be an {{mvar|m}} &times; {{mvar|n}} [[matrix (mathematics)|matrix]] of [[rank (linear algebra)|rank]] {{mvar|r}} whose entries belong to a [[field (mathematics)|field]] {{mvar|K}}, where {{mvar|K}} is either the field of [[real number]]s {{math|'''R'''}} or the field of [[complex number]]s {{math|'''C'''}}. Then {{mvar|M}} admits a factorization of the form
 
:<math>M = U\Sigma V^*</math>
 
Such a factorization of {{mvar|M}} always exists{{sfn|Golub|Van Loan|2013|loc={{google books quote|id=X5YfsuCWpxMC|page=76|Theorem 2.4.1 (Singular Value Decomposition)}}}}{{sfn|Horn|Johnson|2013|loc={{google books quote|id=5I5AYeeh0JUC|page=150|Theorem 2.6.3 (Singular value decomposition)}}}}.
 
Here {{mvar|U}} is an {{mvar|m}} &times; {{mvar|m}} [[unitary matrix]], and {{math|''V''<sup>*</sup>}} is the [[conjugate transpose|adjoint]] (the [[complex conjugate]] of the [[transpose]]) of an {{mvar|n}} &times; {{mvar|n}} unitary matrix {{mvar|V}}. Furthermore, the [[positive-definite matrix|positive semi-definite matrix]] {{math|''MM''<sup>*</sup>}} (equivalently {{math|''M''<sup>*</sup>''M''}}) has positive [[eigenvalue]]s whose [[square root]]s satisfy {{math|''&sigma;''<sub>1</sub> &ge; &hellip; &ge; ''&sigma;<sub>r</sub>'' &gt; 0}}; setting {{math|1=''q'' = min(''m'', ''n'')}} and {{math|1=''&sigma;''<sub>''r''+1</sub> = &hellip; = ''&sigma;<sub>q</sub>'' = 0}}, the {{mvar|m}} &times; {{mvar|n}} matrix {{math|&Sigma;}} takes the following form:
 
:<math>
\Sigma =
\begin{cases}
  \begin{bmatrix}
  \Delta \\
  O
  \end{bmatrix}
  & (m > n) \\
  \Delta
  & (m = n) \\
  \begin{bmatrix}
  \Delta & O
  \end{bmatrix}
  & (m < n)
\end{cases}
\quad
\text{where } \Delta = \operatorname{diag}(\sigma_1, \sigma_2, \dotsc, \sigma_q)
</math>
 
Here {{math|&Delta;}} is the {{mvar|q}} &times; {{mvar|q}} [[diagonal matrix]] whose diagonal entries are {{math|''&sigma;''<sub>1</sub>, ..., ''&sigma;<sub>q</sub>''}}, and the block {{math|''O''}} is a [[zero matrix]].
 
This factorization is called the '''singular value decomposition''' of {{mvar|M}}, and the numbers {{math|''&sigma;''<sub>1</sub>, ..., ''&sigma;<sub>q</sub>''}} are called the '''[[singular value]]s''' of {{mvar|M}}.
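
As a concrete illustration (not part of the theorem statement), the factorization and the block form of {{math|&Sigma;}} can be checked numerically. The sketch below uses NumPy's <code>numpy.linalg.svd</code> on a small, arbitrarily chosen real {{math|3 &times; 2}} matrix:

<syntaxhighlight lang="python">
# Minimal numerical check of M = U Σ V* with NumPy (illustrative example).
import numpy as np

M = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [2.0, 0.0]])                # an arbitrary 3x2 (m > n) real matrix

U, s, Vh = np.linalg.svd(M)               # full SVD: U is 3x3, Vh is V*, s = (σ_1, σ_2)
Sigma = np.zeros(M.shape)                 # the 3x2 block matrix Σ = [Δ; O]
Sigma[:2, :2] = np.diag(s)

assert np.allclose(M, U @ Sigma @ Vh)     # M = U Σ V*
assert np.allclose(U.T @ U, np.eye(3))    # U unitary (orthogonal in the real case)
assert np.allclose(Vh @ Vh.T, np.eye(2))  # V unitary
print(s)                                  # singular values, in descending order
</syntaxhighlight>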
 
 
 
If the input is represented as a column vector {{mvar|'''v'''}} with {{mvar|n}} components and the model produces {{mvar|M'''v'''}} as its output, the unitary matrices and singular values obtained from the singular value decomposition of {{mvar|M}} admit the following interpretation.
 
 
 
* The columns of the matrix {{mvar|V}} form an [[orthonormal basis]] of the ''input'' space of {{mvar|M}}.

* The columns of the matrix {{mvar|U}} form an orthonormal basis of the ''output'' space of {{mvar|M}}.

* The singular values act as ''gains'': each one gives the factor by which the corresponding input component is amplified in the output.
 
 
 
The diagonal entries of {{math|&Sigma;}} may be arranged in any order, but in applications they are usually sorted in descending order for convenience. Under this convention {{math|&Sigma;}} is uniquely determined, although {{mvar|U}} and {{mvar|V}} are not.
 
 
 
== Singular values, singular vectors, and their relation to the singular value decomposition ==
 
Let {{mvar|M}} be a [[matrix (mathematics)|matrix]] in {{math|''K''<sup>''m''&times;''n''</sup>}}. If, for some non-negative real number {{mvar|&sigma;}}, the conditions
 
:<math>\begin{align}
 
Mv  &= \sigma u \\
 
M^*u &= \sigma v
 
\end{align}</math>
 
are satisfied by some [[unit vector]] {{mvar|u}} in {{mvar|K<sup>m</sup>}} and some unit vector {{mvar|v}} in {{mvar|K<sup>n</sup>}}, then {{mvar|&sigma;}} is called a '''singular value''' of {{mvar|M}} (corresponding to the vectors {{math|''u'', ''v''}}), and {{mvar|u}} and {{mvar|v}} are called a '''left-singular vector''' and a '''right-singular vector''' for {{mvar|&sigma;}}, respectively.
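
The defining relations above can be verified directly from the output of a numerical SVD routine. The following sketch (an illustration added here, with an arbitrary example matrix) checks {{math|1=''Mv'' = ''&sigma;u''}} and {{math|1=''M''<sup>*</sup>''u'' = ''&sigma;v''}} for each singular triple:

<syntaxhighlight lang="python">
# Checking M v = σ u and M* u = σ v for each singular triple (sketch).
import numpy as np

M = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])                    # an arbitrary 2x3 example
U, s, Vh = np.linalg.svd(M, full_matrices=False)   # thin SVD: q = min(m, n) triples
V = Vh.conj().T

for i, sigma in enumerate(s):
    u, v = U[:, i], V[:, i]
    assert np.allclose(M @ v, sigma * u)           # M v = σ u
    assert np.allclose(M.conj().T @ u, sigma * v)  # M* u = σ v
</syntaxhighlight>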
 
 
 
In any singular value decomposition
 
:<math>M = U\Sigma V^{*}</math>
 
the diagonal entries of {{math|&Sigma;}} are singular values of {{mvar|M}}, and the columns of the [[unitary matrix|unitary matrices]] {{math|''U''}} and {{math|''V''}} are corresponding left-singular and right-singular vectors, respectively. It follows that:
 
 
 
* An {{math|''m'' &times; ''n''}} matrix {{mvar|M}} has at least one and at most {{math|1=''q'' = min(''m'', ''n'')}} distinct singular values.

* There always exists an [[orthonormal basis]] of {{mvar|K<sup>m</sup>}} consisting of left-singular vectors of {{mvar|M}}.

* There always exists an orthonormal basis of {{mvar|K<sup>n</sup>}} consisting of right-singular vectors of {{mvar|M}}.
 
 
 
If two or more [[linear independence|linearly independent]] right-singular (or left-singular) vectors exist for the same singular value, that singular value is said to be '''degenerate'''. For a non-degenerate singular value there is always exactly one left-singular and one right-singular vector, up to a phase factor {{mvar|e<sup>i&theta;</sup>}}. Consequently, if all singular values of {{mvar|M}} are positive and non-degenerate, the singular value decomposition is unique up to a phase factor {{mvar|e<sup>i&theta;<sub>k</sub></sup>}} applied to corresponding columns of the unitary matrices {{math|''U''}} and {{math|''V''}}.
 
 
 
For a degenerate singular value {{math|''&sigma;''<sub>deg</sub>}}, any normalized [[linear combination]] {{math|1=''u''<sub>c</sub> = ''&alpha;u''<sub>1</sub> + ''&beta;u''<sub>2</sub>}} of left-singular vectors {{math|''u''<sub>1</sub>, ''u''<sub>2</sub>}} for {{math|''&sigma;''<sub>deg</sub>}} is again a left-singular vector for {{math|''&sigma;''<sub>deg</sub>}}, and the analogous statement holds for right-singular vectors. Since the columns of {{math|''U''}} and {{math|''V''}} corresponding to {{math|''&sigma;''<sub>deg</sub>}} may therefore be chosen freely among such linear combinations, the decomposition of {{mvar|M}} is not unique in this case.
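
This non-uniqueness is easy to exhibit: for the identity matrix every singular value equals 1, so any orthogonal change of basis yields another valid decomposition. A small sketch (an added illustration, real case):

<syntaxhighlight lang="python">
# For M = I, both singular values are 1 (fully degenerate), so M = Q Σ Q*
# holds for every orthogonal Q: the factors U and V are far from unique.
import numpy as np

M = np.eye(2)
theta = 0.7                                   # any rotation angle works
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Sigma = np.eye(2)
assert np.allclose(M, Q @ Sigma @ Q.T)        # a valid SVD with U = V = Q
</syntaxhighlight>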
 
 
 
Whereas the [[eigendecomposition of a matrix|eigenvalue decomposition]] applies only to [[square matrix|square matrices]], the singular value decomposition applies to rectangular matrices of any shape. Moreover, if {{mvar|M}} is a [[positive-definite matrix|positive semi-definite]] [[Hermitian matrix]] (hence square), its eigenvalues are real and non-negative, and the singular values and singular vectors of {{mvar|M}} coincide with its eigenvalues and eigenvectors.
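
This coincidence is easy to confirm numerically; the sketch below (an added illustration, using a random symmetric positive semi-definite matrix) compares eigenvalues and singular values:

<syntaxhighlight lang="python">
# For a Hermitian positive semi-definite matrix (here real symmetric),
# the singular values coincide with the eigenvalues.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = B.T @ B                                            # symmetric positive semi-definite

eigenvalues = np.linalg.eigvalsh(A)                    # real, non-negative, ascending
singular_values = np.linalg.svd(A, compute_uv=False)   # descending
assert np.allclose(np.sort(singular_values), eigenvalues)
</syntaxhighlight>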
 
<!--
 
== Relation to eigenvalue decomposition ==
 
The singular value decomposition is very general in the sense that it can be applied to any ''m'' &times; ''n'' matrix.  The eigenvalue decomposition, on the other hand, can only be applied to certain classes of square matrices.  Nevertheless, the two decompositions are related. 
 
 
 
In the special case that <math>M</math> is a [[Hermitian matrix]] which is [[positive-definite matrix|positive semi-definite]], i.e., all its eigenvalues are real and non-negative, the singular values and singular vectors coincide with the eigenvalues and eigenvectors of ''M'',
 
 
 
{{Indent|<math>M = V \Lambda V^{*}\,</math>}}
 
 
 
More generally, given an SVD of ''M'', the following two relations hold:
 
 
 
{{Indent|
 
<math>
 
M^{*} M = V \Sigma^{*} U^{*}\, U \Sigma V^{*} =
 
V (\Sigma^{*} \Sigma) V^{*}\,
 
</math>
 
 
 
<math>
 
M M^{*} = U \Sigma V^{*} \, V \Sigma^{*} U^{*} =
 
U (\Sigma \Sigma^{*}) U^{*}\,
 
</math>
 
}}
 
The right hand sides of these relations describe the eigenvalue decompositions of the left hand sides.  Consequently, the squares of the non-zero singular values of ''M'' are equal to the non-zero eigenvalues of either <math>M^{*}M</math> or <math>MM^{*}</math>.  Furthermore, the columns of ''U'' (left singular vectors) are eigenvectors of <math>MM^{*}</math> and the columns of ''V'' (right singular vectors) are eigenvectors of <math>M^{*}M</math>.
 
 
 
== Existence ==
 
An eigenvalue ''&lambda;'' of a matrix is characterized by the algebraic relation ''M u'' = ''&lambda; u''. When ''M'' is Hermitian, a variational characterization is also available. Let ''M'' be a real ''n'' &times; ''n'' [[symmetric matrix]]. Define ''f'': '''R'''<sup>''n''</sup> &rarr; '''R''' by ''f''(''x'') = ''x<sup>T</sup> M x''. This continuous function attains a maximum at some ''u'' when restricted to the closed unit ball {||''x''|| &le; 1}. By the Lagrange multipliers theorem, ''u'' necessarily satisfies
 
 
 
{{Indent|<math>\nabla f = \nabla \; x^T M x = \lambda \cdot \nabla \; x^T x.</math>}}
 
 
 
A short calculation shows the above leads to ''M u'' = ''&lambda; u'' (symmetry of ''M'' is needed here). Therefore ''&lambda;'' is the largest eigenvalue of ''M''. The same calculation performed on the orthogonal complement of ''u'' gives the next largest eigenvalue and so on. The complex Hermitian case is similar; there ''f''(''x'') = ''x* M x'' is a real-valued function of 2''n'' real variables.
 
 
 
Singular values are similar in that they can be described algebraically or from variational principles, although, unlike the eigenvalue case, Hermiticity or symmetry of ''M'' is no longer required.
 
 
 
This section presents these two arguments for the existence of the singular value decomposition.
 
 
 
=== Linear algebraic proof ===
 
Let ''M'' be a rectangular matrix with complex entries. ''M*M'' is positive semidefinite, therefore Hermitian. By the [[spectral theorem]], there exists a unitary ''U'' such that
 
 
 
{{Indent|<math>U^* M^* M U = \begin{bmatrix} \Sigma & 0 \\ 0 & 0\end{bmatrix}</math>}}
 
 
 
where ''Σ'' is diagonal and positive definite. Partition ''U'' appropriately so we can write
 
 
 
{{Indent|<math>\begin{bmatrix} U_1 ^* \\ U_2 ^*\end{bmatrix} M^* M \begin{bmatrix} U_1 & U_2 \end{bmatrix}
 
= \begin{bmatrix} U_1 ^* M^* M U_1 & U_1 ^* M^* M U_2 \\ U_2 ^* M^* M U_1 & U_2 ^* M^* M U_2  \end{bmatrix}
 
= \begin{bmatrix} \Sigma & 0 \\ 0 & 0\end{bmatrix}.
 
</math>}}
 
 
 
Therefore ''U''<sub>1</sub>''*M*MU''<sub>1</sub> = ''&Sigma;'', and ''MU''<sub>2</sub> = 0. Define
 
 
 
{{Indent|<math>W_1 = \Sigma ^{- \frac{1}{2}} U_1^* M^*.</math>}}
 
 
 
Then
 
 
 
{{Indent|<math>W_1 M U_1 = \Sigma ^{\frac{1}{2}}.</math>}}
 
 
 
We see that this is almost the desired result, except that ''W''<sub>1</sub> and ''U''<sub>1</sub> are not unitary in general. ''W''<sub>1</sub> is a partial isometry (''W''<sub>1</sub>''W*''<sub>1</sub> = ''I'' ) while ''U''<sub>1</sub> is an isometry (''U*''<sub>1</sub>''U''<sub>1</sub> = ''I'' ). To finish the argument, one simply has to "fill out" these matrices to obtain unitaries. ''U''<sub>2</sub> already does this for ''U''<sub>1</sub>. Similarly, one can choose ''W''<sub>2</sub> such that
 
 
 
{{Indent|<math>\begin{bmatrix} W_1 \\ W_2 \end{bmatrix}</math>}}
 
 
 
is unitary. Direct calculation shows
 
 
 
{{Indent|<math>
 
\begin{bmatrix} W_1 \\ W_2 \end{bmatrix} M \begin{bmatrix} U_1 & U_2 \end{bmatrix}
 
= \begin{bmatrix} \Sigma^{\frac{1}{2}} & 0 \\ 0 & 0\end{bmatrix}
 
</math>}}
 
 
 
which is the desired result.
 
 
 
Notice the argument could begin with diagonalizing ''MM*'' rather than ''M*M'' (This shows directly that ''MM*'' and ''M*M'' have the same non-zero eigenvalues).
 
 
 
=== Variational characterization ===
 
The singular values can also be characterized as the maxima of ''u''<sup>T</sup>''Mv'', considered as a function of ''u'' and ''v'', over particular subspaces. The singular vectors are the values of ''u'' and ''v'' where these maxima are attained.
 
 
 
Let ''M'' denote an ''m'' &times; ''n'' matrix with real entries. Let <math>S^{m-1}</math> and <math>S^{n-1}</math> denote the sets of unit 2-norm vectors in '''R'''<sup>''m''</sup> and '''R'''<sup>''n''</sup> respectively.  Define the function
 
 
 
{{Indent|<math>
 
\sigma(u,v) = u^{T} M v \,
 
</math>}}
 
 
 
for vectors ''u'' &isin; <math>S^{m-1}</math> and ''v'' &isin; <math>S^{n-1}</math>.  Consider the function ''&sigma;'' restricted to <math>S^{m-1}</math> &times; <math>S^{n-1}</math>.  Since both <math>S^{m-1}</math> and <math>S^{n-1}</math> are [[compact]] sets, their [[Product topology|product]] is also compact.  Furthermore, since ''&sigma;'' is continuous, it attains a largest value for at least one pair of vectors ''u'' &isin; <math>S^{m-1}</math> and ''v'' &isin; <math>S^{n-1}</math>. This largest value is denoted ''&sigma;''<sub>1</sub> and the corresponding vectors are denoted ''u''<sub>1</sub> and ''v''<sub>1</sub>. Since <math>\sigma_{1}</math> is the largest value of <math>\sigma(u,v)</math> it must be non-negative.  If it were negative, changing the sign of either ''u''<sub>1</sub> or ''v''<sub>1</sub> would make it positive and therefore larger.
 
 
 
'''Statement:''' ''u''<sub>1</sub>, ''v''<sub>1</sub> are left and right singular vectors of ''M'' with corresponding singular value ''&sigma;''<sub>1</sub>.
 
 
 
'''Proof:''' As in the eigenvalue case, by assumption the two vectors satisfy the Lagrange multiplier equation:
 
 
 
{{Indent|<math>\nabla \sigma = \nabla \; u^T M v = \lambda_1 \cdot \nabla \; u^T u = \lambda_2 \cdot \nabla \; v^T v.</math>}}
 
 
 
After some algebra, this becomes
 
 
 
{{Indent|<math> M v_{1} = 2 \lambda_{1} u_{1} + 0, \,</math>}}
 
 
 
and
 
 
 
{{Indent|<math> M^{T} u_{1} = 0 + 2 \lambda_{2} v_{1}. \,</math>}}
 
 
 
Multiplying the first equation from left by <math>u_{1}^{T}</math> and the second equation from left by <math>v_{1}^{T}</math> and taking ||''u''|| = ||''v''|| = 1  into account gives
 
 
 
{{Indent|<math> u_{1}^{T} M v_{1} = \sigma_{1} = 2 \lambda_{1}, </math>}}
 
 
 
{{Indent|<math> v_{1}^{T} M^{T} u_{1} = \sigma_{1} = 2 \lambda_{2}. </math>}}
 
 
 
So ''&sigma;''<sub>1</sub> = 2''&lambda;''<sub>1</sub> = 2''&lambda;''<sub>2</sub>. By properties of the functional ''&phi;'' defined by
 
 
 
{{Indent|<math>\phi(w) = u_1 ^T w, \,</math>}}
 
 
 
we have
 
 
 
{{Indent|<math> M v_{1} = \sigma_{1} u_{1}. \,</math>}}
 
 
 
Similarly,
 
 
 
{{Indent|<math> M^{T} u_{1} = \sigma_{1} v_{1}. \,</math>}}
 
 
 
This proves the statement.
 
 
 
More singular vectors and singular values can be found by maximizing ''&sigma;''(''u'', ''v'') over normalized ''u'', ''v'' which are orthogonal to ''u''<sub>1</sub> and ''v''<sub>1</sub>, respectively.
 
 
 
The passage from real to complex is similar to the eigenvalue case.
 
-->
 
 
 
== Geometric meaning ==
 
Since the matrices ''U'' and ''V'' are unitary, the column vectors ''u''<sub>1</sub>, ..., ''u<sub>m</sub>'' of ''U'' form an [[orthonormal basis]] of ''K''<sup>''m''</sup>, and the column vectors ''v''<sub>1</sub>, ..., ''v<sub>n</sub>'' of ''V'' form an orthonormal basis of ''K''<sup>''n''</sup><!-- (with respect to the standard [[scalar product]]s on these spaces)-->.
 
 
 
The [[linear map|linear transformation]] <math>T: K^{n} \rightarrow K^{m}</math> that sends a vector ''x'' to ''Mx'' takes a particularly simple form with respect to these orthonormal bases: <math>T(v_{i}) = \sigma_{i} u_{i}</math>, where ''&sigma;<sub>i</sub>'' is the ''i''-th diagonal entry of &Sigma; for ''i'' = 1, ..., min(''m'', ''n''), and ''T''(''v''<sub>''i''</sub>) = 0 for ''i'' > min(''m'', ''n'').
 
 
 
The geometric content of the singular value decomposition theorem can therefore be stated as follows: for every linear map <math>T: K^{n} \rightarrow K^{m}</math> there exist orthonormal bases of ''K''<sup>''n''</sup> and ''K''<sup>''m''</sup> such that ''T'' maps the ''i''-th basis vector of ''K''<sup>''n''</sup> to a non-negative multiple ''&sigma;''<sub>''i''</sub> of the ''i''-th basis vector of ''K''<sup>''m''</sup> (and sends any left-over basis vectors to zero). With respect to these bases, the map ''T'' is thus represented by a diagonal matrix with non-negative entries.
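
Equivalently, {{math|1=''U''<sup>*</sup>''MV'' = &Sigma;}}: in the bases formed by the columns of ''V'' and ''U'', the map is represented by a diagonal matrix. The sketch below (an added illustration) checks this for the shear mapping from the figure at the top of the article:

<syntaxhighlight lang="python">
# In the bases {v_i} and {u_i}, the map x ↦ Mx is diagonal: U* M V = Σ.
import numpy as np

M = np.array([[1.0, 1.0],
              [0.0, 1.0]])            # the shear mapping from the figure
U, s, Vh = np.linalg.svd(M)
Sigma = U.T @ M @ Vh.T                # change of bases: U* M V
assert np.allclose(Sigma, np.diag(s))
print(np.round(Sigma, 6))             # diagonal with entries σ_1 ≥ σ_2 > 0
</syntaxhighlight>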
 
<!-- The following is untranslated:
 
==Reduced SVDs==
 
In applications it is quite unusual for the full SVD, including a full unitary decomposition of the null-space of the matrix, to be required.  Instead, it is often sufficient (as well as faster, and more economical for storage) to compute a reduced version of the SVD.  The following can be distinguished for an ''m''&times;''n'' matrix ''M'' of rank ''r'':
 
 
 
===Thin SVD===
 
{{Indent|<math>M = U_n \Sigma_n V^{*} \,</math>}}
 
 
 
Only the ''n'' column vectors of ''U'' corresponding to the row vectors of ''V''* are calculated. The remaining column vectors of ''U'' are not calculated. This is significantly quicker and more economical than the full SVD if ''n''<<''m''. The matrix ''U''<sub>n</sub> is thus ''m''&times;''n'', &Sigma;<sub>n</sub> is ''n''&times;''n'' diagonal, and ''V'' is ''n''&times;''n''.
 
 
 
The first stage in the calculation of a thin SVD will usually be a [[QR decomposition]] of ''M'', which can make for a significantly quicker calculation if ''n''<<''m''.
 
 
 
===Compact SVD===
 
{{Indent|<math>M = U_r \Sigma_r V_r^*</math>}}
 
 
 
Only the ''r'' column vectors of ''U'' and ''r'' row vectors of ''V''* corresponding to the non-zero singular values &Sigma;<sub>r</sub> are calculated. The remaining vectors of ''U'' and ''V''* are not calculated. This is quicker and more economical than the thin SVD if ''r''<<''n''. The matrix ''U''<sub>r</sub> is thus ''m''&times;''r'', &Sigma;<sub>r</sub> is ''r''&times;''r'' diagonal, and ''V''<sub>r</sub>* is ''r''&times;''n''. 
 
 
 
===Truncated SVD===
 
{{Indent|<math>\tilde{M} = U_t \Sigma_t V_t^*</math>}}
 
 
 
Only the ''t'' column vectors of ''U'' and ''t'' row vectors of ''V''* corresponding to the ''t'' largest singular values &Sigma;<sub>t</sub> are calculated. The rest of the matrix is discarded. This can be much quicker and more economical than the thin SVD if ''t''<<''r''. The matrix ''U''<sub>t</sub> is thus ''m''&times;''t'', &Sigma;<sub>t</sub> is ''t''&times;''t'' diagonal, and ''V''<sub>t</sub>* is ''t''&times;''n''.
 
 
 
Of course the truncated SVD is no longer an exact decomposition of the original matrix ''M'', but as discussed below, the approximate matrix <math>\tilde{M}</math> is in a very useful sense the closest approximation to ''M'' that can be achieved by a matrix of rank ''t''.
 
 
 
==Norms==
 
The sum of the ''k'' largest singular values of ''M'' is a [[matrix norm]], the '''Ky Fan ''k''-norm''' of ''M''.  The Ky Fan 1-norm is just the [[operator norm]] of ''M'' as a linear operator with respect to the Euclidean norms of ''K''<sup>''m''</sup> and ''K''<sup>''n''</sup>. In other words, the Ky Fan 1-norm is the operator norm induced by the standard ''l''<sup>2</sup> Euclidean inner product. For this reason, it is also called the operator 2-norm. One can easily verify the relationship between the Ky Fan 1-norm and singular values. For a bounded operator ''M'' on Hilbert spaces in general, it is true that
 
 
 
{{Indent|<math>\| M \| = \| M^*M \|^{\frac{1}{2}}.</math>}}
 
 
 
But (''M*M'')<sup>&frac12;</sup> is a [[normal matrix]], so ||(''M*M'')<sup>&frac12;</sup>|| is the largest eigenvalue of (''M*M'')<sup>&frac12;</sup>, i.e. the largest singular value of ''M''.
 
 
The singular values are related to another norm on the space of operators. Consider the [[Hilbert-Schmidt]] inner product on the ''n'' &times; ''n'' matrices, defined by <''M'', ''N''> = Tr ''N*M''. Then the induced norm is ||''M''|| = <''M, M''><sup>&frac12;</sup> = (Tr ''M*M'')<sup>&frac12;</sup>. Since the trace is invariant under unitary equivalence, this shows
 
 
 
{{Indent|<math>\| M \| = (\sum s_i ^2)^{\frac{1}{2}}</math>}}
 
 
 
where ''s<sub>i</sub>'' are the singular values of ''M''. This is called the '''[[Frobenius norm]]''', '''Schatten 2-norm''', or '''Hilbert-Schmidt norm''' of ''M''. Direct calculation shows that if
 
 
 
{{Indent|<math>\, M = (m_{ij}),</math>}}
 
 
 
the Frobenius norm of ''M'' coincides with
 
 
 
{{Indent|<math>( \sum_{ij} | m_{ij} | ^2 )^{\frac{1}{2}}.</math>}}
 
 
 
== Bounded operators on Hilbert spaces ==
 
The factorization <math>M = U\Sigma V^*</math> can be extended to a [[bounded operator]] ''M'' on a separable Hilbert space ''H''. Namely, for any bounded operator ''M'', there exist a [[partial isometry]] ''U'', a unitary ''V'', a measure space (''X'', ''&mu;''), and a non-negative measurable function ''f'' such that
 
 
 
{{Indent|<math>\; M = U T_f V^*, </math>}}
 
 
 
where <math>T_f</math> is the [[multiplication operator|multiplication by ''f'']] on ''L''<sup>2</sup>(''X'', ''&mu;'').
 
 
 
This can be shown by mimicking the linear algebraic argument for the matricial case above. ''VT<sub>f</sub> V*'' is the unique positive square root of ''M*M'', as given by the [[Borel functional calculus]] for [[self adjoint operator]]s. The reason why ''U'' need not be unitary is that, unlike in the finite-dimensional case, given an isometry ''U''<sub>1</sub> with non-trivial kernel, a suitable ''U''<sub>2</sub> may not be found such that
 
 
 
{{Indent|<math>\begin{bmatrix} U_1 \\ U_2 \end{bmatrix}</math>}}
 
 
 
is a unitary operator.
 
 
 
As for matrices, the singular value factorization is equivalent to the [[polar decomposition]] for operators: we can simply write
 
 
 
{{Indent|<math>M = U V^* \cdot V T_f V^*</math>}}
 
 
 
and notice that ''U V*'' is still a partial isometry while ''VT<sub>f</sub> V*'' is positive.
 
 
 
=== Singular values and compact operators ===
 
 
 
To extend the notion of singular values and left/right-singular vectors to the operator case, one needs to restrict to [[compact operator on Hilbert space|compact operators]]. It is a general fact that compact operators on Banach spaces, and therefore on Hilbert spaces, have only discrete spectrum: if ''T'' is compact, every nonzero ''&lambda;'' in its spectrum is an eigenvalue. Furthermore, a compact self-adjoint operator can be diagonalized by its eigenvectors. If ''M'' is compact, so is ''M*M''. Applying the diagonalization result, the unitary image of its positive square root ''T<sub>f</sub>'' has a set of orthonormal eigenvectors {''e<sub>i</sub>''} corresponding to strictly positive eigenvalues {''&sigma;<sub>i</sub>''}. For any ''&psi;'' &isin; ''H'',
 
 
 
{{Indent|<math>
 
\; M \psi = U T_f V^* \psi = \sum_i  \langle U T_f V^* \psi, U e_i \rangle U e_i
 
= \sum_i \sigma_i \langle \psi, V e_i \rangle U e_i.
 
</math>}}
 
 
 
where the series converges in the norm topology on ''H''. Notice how this resembles the expression from the finite-dimensional case. The values ''&sigma;<sub>i</sub>'' are called the singular values of ''M''. {''U e<sub>i</sub>''} and {''V e<sub>i</sub>''} can be considered the left- and right-singular vectors of ''M'' respectively.
 
 
 
[[Compact operator on Hilbert space|Compact operators on a Hilbert space]] are the closure of finite rank operators in the uniform operator topology. The above series expression gives an explicit such representation. An immediate consequence of this is:
 
 
 
'''Theorem''' ''M'' is compact if and only if ''M*M'' is compact.
 
 
 
-->
 
== Applications of the singular value decomposition ==
 
=== Pseudoinverse ===
 
The singular value decomposition can be used to compute the [[Moore–Penrose pseudoinverse|pseudoinverse]] of a matrix. In terms of the singular value decomposition <math> M = U\Sigma V^*</math>, the pseudoinverse of ''M'' is
 
{{Indent|<math> M^+ = V \Sigma^+ U^*, \,</math>}}
 
where ''&Sigma;''<sup>+</sup> is the transpose of the matrix obtained from ''&Sigma;'' by replacing each non-zero entry with its reciprocal. The pseudoinverse can be used to solve linear [[least squares]] problems.
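
A minimal sketch of this construction with NumPy is given below (an added illustration; the matrix and right-hand side are arbitrary examples). It builds {{math|1=''M''<sup>+</sup> = ''V&Sigma;''<sup>+</sup>''U''<sup>*</sup>}} from a thin SVD and uses it to solve a least-squares problem, comparing against <code>numpy.linalg.pinv</code> and <code>numpy.linalg.lstsq</code>:

<syntaxhighlight lang="python">
# Pseudoinverse via the SVD, applied to linear least squares (sketch).
import numpy as np

M = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [2.0, 0.0]])                         # arbitrary full-column-rank example
b = np.array([1.0, 2.0, 3.0])

U, s, Vh = np.linalg.svd(M, full_matrices=False)   # thin SVD is enough here
Sigma_plus = np.diag(1.0 / s)                      # reciprocals of the non-zero σ_i
M_plus = Vh.T @ Sigma_plus @ U.T                   # M+ = V Σ+ U*

assert np.allclose(M_plus, np.linalg.pinv(M))
x = M_plus @ b                                     # least-squares solution of M x ≈ b
assert np.allclose(x, np.linalg.lstsq(M, b, rcond=None)[0])
</syntaxhighlight>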
 
 
 
=== Range, null space, and rank ===
 
The singular value decomposition also describes the [[range (mathematics)|range]] and [[kernel (linear algebra)|null space]] of a matrix <math>M</math>: the right-singular vectors corresponding to the vanishing singular values of <math>M</math> form a basis of the null space, and the left-singular vectors corresponding to the non-zero singular values form a basis of the range. Consequently, the [[rank (linear algebra)|rank]] of <math>M</math> equals the number of non-zero singular values. Furthermore, <math>M</math>, <math>M^*M</math>, and <math>M M^*</math> all have the same rank, and <math>M^*M</math> and <math>M M^*</math> have the same non-zero eigenvalues.
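
The following sketch (added for illustration, with an arbitrary rank-1 example) reads bases of the null space and range off the factors returned by NumPy:

<syntaxhighlight lang="python">
# Null space and range from the SVD (sketch).
import numpy as np

M = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0]])              # rank-1 example
U, s, Vh = np.linalg.svd(M)
tol = max(M.shape) * np.finfo(float).eps * s[0]
r = int(np.sum(s > tol))                     # number of non-zero singular values

null_basis = Vh[r:, :].T                     # right-singular vectors with σ_i = 0
assert np.allclose(M @ null_basis, 0.0)      # they span the null space

range_basis = U[:, :r]                       # left-singular vectors with σ_i > 0
P = range_basis @ range_basis.T              # orthogonal projector onto the range
assert np.allclose(P @ M, M)                 # every column of M lies in the range
</syntaxhighlight>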
 
 
 
In numerical computation, the singular values can also be used to determine the ''effective rank'' of a matrix: because of [[round-off error]]s, a matrix that is actually rank-deficient may yield very small but non-zero singular values, and thresholding them gives a robust rank estimate.
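
A common heuristic (the one used by default by <code>numpy.linalg.matrix_rank</code>, for example) is to count the singular values above the threshold {{math|max(''m'', ''n'') &middot; ''&epsilon;'' &middot; ''&sigma;''<sub>1</sub>}}, where {{mvar|&epsilon;}} is the machine epsilon. A sketch:

<syntaxhighlight lang="python">
# Effective (numerical) rank: count singular values above a tolerance.
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])             # second column = 2 x first: true rank 1

s = np.linalg.svd(A, compute_uv=False)
tol = max(A.shape) * np.finfo(float).eps * s[0]    # max(m, n) * eps * σ_1
effective_rank = int(np.sum(s > tol))

assert effective_rank == 1
assert effective_rank == np.linalg.matrix_rank(A)  # same heuristic by default
print(s)                               # the second value is rounding-level noise
</syntaxhighlight>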
 
 
 
=== Matrix approximation ===
 
It is often useful to approximate a matrix <math>M</math> by another matrix <math>\tilde{M}</math> of a given rank <math>r</math>. If the approximation is required to be optimal in the sense that ''the [[Frobenius norm]] of the difference between <math>M</math> and <math>\tilde{M}</math> is minimal subject to <math>\operatorname{rank}(\tilde{M}) = r</math>'', then <math>\tilde{M}</math> can be obtained from the singular value decomposition of <math>M</math>:
 
 
 
{{Indent|<math>
 
\tilde{M} = U \tilde{\Sigma} V^*
 
</math>}}
 
 
 
where <math>\tilde{\Sigma}</math> is the matrix obtained from <math>\Sigma</math> by keeping the <math>r</math> largest singular values and setting all other singular values to zero.
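
By the Eckart–Young theorem, the Frobenius-norm error of this truncation equals the square root of the sum of the squares of the discarded singular values. The sketch below (an added illustration, on a random matrix) builds the rank-<math>r</math> truncation and checks this:

<syntaxhighlight lang="python">
# Rank-r approximation by truncating the SVD; the Frobenius error equals
# sqrt(σ_{r+1}^2 + ... + σ_q^2) (Eckart-Young).
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))                    # arbitrary example
r = 2

U, s, Vh = np.linalg.svd(M, full_matrices=False)
M_r = U[:, :r] @ np.diag(s[:r]) @ Vh[:r, :]        # keep the r largest σ_i

assert np.linalg.matrix_rank(M_r) == r
err = np.linalg.norm(M - M_r, ord='fro')
assert np.isclose(err, np.sqrt(np.sum(s[r:] ** 2)))
</syntaxhighlight>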
 
<!-- 以下未訳
 
'''Quick proof:''' We wish to minimize <math>\|M - \tilde M\|_F</math> subject to <math>\operatorname{rank}(\tilde{M}) = r</math>.
 
 
 
Let <math>M = U \Sigma V^*</math> be the SVD of <math>M</math>. Since the [[Frobenius norm]] is unitarily invariant, we have an equivalent statement:
 
{{Indent|<math>\min_{\tilde M} ||\Sigma - U^* \tilde M V||_F</math>}}
 
Note that since <math>\Sigma</math> is diagonal, <math>U^* \tilde M V</math> should be diagonal in order to minimize the [[Frobenius norm]]. Recall that the [[Frobenius norm]] is the square root of the sum of the squared moduli of all entries.
 
This implies that <math>U</math> and <math>V</math> can also serve as matrices of singular vectors of <math>\tilde M</math>. Thus we may assume that the minimizing <math>\tilde M</math> has the form:
 
{{Indent|<math>{\tilde M} = U S V^*,</math>}}
 
where <math>S</math> is diagonal. The diagonal entries <math>s_i</math> of <math>S</math> are not necessarily ordered as in the SVD.
 
{{Indent|<math>
 
\min_{\tilde M} ||\Sigma - S||_F \equiv  \min_{s_i} \sqrt {\sum_{i=1}^{n} (\sigma_i - s_i)^2 }.
 
</math>}}
 
From the rank constraint, i.e. that <math>S</math> has <math>r</math> non-zero diagonal entries, the minimum of the above expression is obtained as follows:
 
{{Indent|<math>
 
\min_{s_i} \sqrt {\sum_{i=1}^{r} (\sigma_i - s_i)^2 + \sum_{i=r+1}^{n} \sigma_i^2 }
 
= \sqrt {\sum_{i=r+1}^{n} \sigma_i^2 }.
 
</math>}}
 
Therefore, <math>\tilde M</math> of rank <math>r</math> is the best approximation of <math>M</math> in the [[Frobenius norm]] sense when <math>\sigma_i = s_i \quad (i=1,\cdots,r)</math> and the corresponding singular vectors are the same as those of <math>M</math>.
 
--><!-- The following draft is unpolished:

=== Other applications ===

The singular value decomposition is used in linear [[inverse problem]]s and in regularization methods such as [[Tikhonov regularization]]. In [[statistics]] it is used in [[principal component analysis]], and it also appears in [[signal processing]] and [[pattern recognition]]. It is also used in output-only [[modal analysis]], where the non-scaled [[mode shape]]s can be determined from the singular vectors.

An application of the singular value decomposition to large matrices arises in [[numerical weather prediction]]: the singular vectors of the linearized propagator of the global atmosphere with the largest singular values, computed by the [[Lanczos method]], give the perturbations that grow fastest in the linear regime, and such perturbations are used in [[ensemble forecasting]] in an attempt to quantify the confidence of a forecast.
 
--><!-- The following is untranslated:
 
==Computation of the SVD==
 
The [[LAPACK]] subroutine [http://www.netlib.org/lapack/double/dgesvd.f  DGESVD] represents a typical approach to the computation of the singular value decomposition. If the matrix has more rows than columns a [[QR decomposition]] is first performed. The factor R is then reduced to a [[bidiagonal matrix]]. The desired singular values and vectors are then found by performing a bidiagonal QR iteration, using the LAPACK routine [http://www.netlib.org/lapack/double/dbdsqr.f DBDSQR] (See Demmel and Kahan for details).
 
 
 
The [[GNU Scientific Library]] offers four functions, one with the Golub-Reinsch algorithm, one with the modified Golub-Reinsch algorithm (faster for matrices with many more rows than columns), one with a one-sided Jacobi orthogonalization and one which uses only non-zero singular values.  See the [http://www.gnu.org/software/gsl/manual/html_node/Singular-Value-Decomposition.html#Singular-Value-Decomposition  GSL manual page on SVD].
 
 
 
==History==
 
The singular value decomposition was originally developed by [[differential geometry|differential geometers]], who wished to determine whether a real [[bilinear form]] could be made equal to another by independent orthogonal transformations of the two spaces it acts on. [[Eugenio Beltrami]] and [[Camille Jordan]] discovered independently, in 1873 and 1874 respectively, that the singular values of a bilinear form, represented as a matrix, form a complete set of [[invariant (mathematics)|invariant]]s for bilinear forms under orthogonal substitutions. [[James Joseph Sylvester]] also arrived at the singular value decomposition for real square matrices in 1889, apparently independently of both Beltrami and Jordan. Sylvester called the singular values the ''canonical multipliers'' of the matrix ''A''. The fourth mathematician to discover the singular value decomposition independently was Autonne, in 1915, who arrived at it via the [[polar decomposition]]. The first proof of the singular value decomposition for rectangular and complex matrices seems to be due to Eckart and Young in 1936; they saw it as a generalization of the [[principal axis]] transformation for Hermitian matrices.
 
 
 
In 1907, [[Erhard Schmidt]] defined an analog of singular values for [[integral operator]]s (which are compact, under some weak technical assumptions); it seems he was unaware of the parallel work on singular values of finite matrices. This theory was further developed by [[Émile Picard]] in 1910, who is the first to call the numbers <math>\sigma_k</math> ''singular values'' (or rather, ''valeurs singulières'').
 
-->
 
== Notes ==
 
{{reflist}}
 
 
 
== References ==
 
*{{Cite journal|和書|title=特異値分解とそのシステム制御への応用|last1=伊理|first1=正夫|last2=児玉|first2=慎三|last3=須田|first3=信英|journal=計測と制御|volume=21|number=8|date=1982|pages=763-772|url=http://dx.doi.org/10.11499/sicejl1962.21.763|ref=harv}}
 
*{{Cite journal|和書|title=《制御理論における数学》第1回: 線形代数-特異値分解を中心にして|last=太田|first=快人|journal=計測と制御|volume=38|number=2|date=1999|pages=144-149|url=http://dx.doi.org/10.11499/sicejl1962.38.144|ref=harv}}
 
* {{cite book
 
|last1      = Golub
 
|first1    = Gene H.
 
|last2      = Van Loan
 
|first2    = Charles F.
 
|year      = 2013
 
|title      = Matrix computations
 
|edition    = Fourth
 
|url        = {{google books|X5YfsuCWpxMC|plainurl=yes}}
 
|publisher  = Johns Hopkins University Press
 
|isbn       = 978-1-4214-0794-4
 
|mr        = 3024913
 
|ref        = harv
 
}}
 
* {{cite book
 
|last1      = Horn
 
|first1    =  Roger A.
 
|last2      = Johnson
 
|first2    = Charles R.
 
|year      = 2013
 
|title      = Matrix analysis
 
|edition    = Second
 
|url        = {{google books|5I5AYeeh0JUC|plainurl=yes|page=149}}
 
|publisher  = [[Cambridge University Press]]
 
|isbn      = 978-0-521-54823-6
 
|mr        = 2978290
 
|ref        = harv
 
}}
 
 
 
== See also ==

* [[Diagonalization]]

* [[Latent semantic analysis]]
 
 
 
{{DEFAULTSORT:Singular value decomposition}}
 
 
 
[[Category:Numerical linear algebra]]

[[Category:Matrices]]

[[Category:Matrix decompositions]]

[[Category:Functional analysis]]

[[Category:Mathematics-related articles]]
 
