Fast and stable matrix multiplication

Olga Holtz
Department of Mathematics
University of California-Berkeley
holtz@math.berkeley.edu

joint work with James Demmel, Ioana Dumitriu and Robert Kleinberg

Fast and stable matrix multiplication – p.1/44
The quest for speed

How fast can one multiply two n×n matrices?
• Standard multiplication: O(n^3) operations.
• Strassen's algorithm: O(n^2.81) operations.
• ...
• Coppersmith and Winograd's algorithm: O(n^2.38) operations.
• Is O(n^2) achievable?
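For reference, the standard O(n^3) method is just a triple loop over rows, columns, and inner products; a minimal Python sketch (the function name `matmul` is illustrative, not from the talk):

```python
def matmul(A, B):
    """Standard matrix multiplication for two n-by-n matrices
    given as lists of rows: n^3 scalar multiplications."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))  # inner product of
             for j in range(n)]                        # row i and column j
            for i in range(n)]
```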
Why should we care?

Complexity of matrix multiplication = complexity of "almost all" matrix problems:
• solving linear systems,
• evaluating determinants,
• LU factorization,
• many more.

See P. Bürgisser, M. Clausen, M. A. Shokrollahi, Algebraic Complexity Theory.
Strassen's algorithm

Volker Strassen, Gaussian elimination is not optimal. Numer. Mathematik [1969].

Main idea:
• Multiplication by recursively partitioning into smaller blocks.
• To be faster than O(n^3), this needs a method to multiply small matrices (order k) using o(k^3) multiplications.
Strassen

  [ A11  A12 ]   [ B11  B12 ]
  [ A21  A22 ] × [ B21  B22 ]

requires only 7 multiplications:

  M1 := (A11 + A22)(B11 + B22)
  M2 := (A21 + A22) B11
  M3 := A11 (B12 − B22)
  M4 := A22 (B21 − B11)
  M5 := (A11 + A12) B22
  M6 := (A21 − A11)(B11 + B12)
  M7 := (A12 − A22)(B21 + B22).
Strassen

Then

  [ A11  A12 ]   [ B11  B12 ]   [ C11  C12 ]
  [ A21  A22 ] × [ B21  B22 ] = [ C21  C22 ],

where

  C11 := M1 + M4 − M5 + M7
  C12 := M3 + M5
  C21 := M2 + M4
  C22 := M1 − M2 + M3 + M6.

Applied recursively, this yields running time O(n^(log2 7)) ≈ O(n^2.8).
Coppersmith and Winograd

Don Coppersmith & Shmuel Winograd, Matrix multiplication via arithmetic progressions. Journal of Symbolic Computation [1990].

Used a theorem on dense sets of integers containing no three terms in arithmetic progression (R. Salem & D. C. Spencer [1942]) to get an algorithm with running time ≈ O(n^2.376).
Group-theoretic approach

Henry Cohn & Chris Umans, A group-theoretic approach to matrix multiplication, FOCS Proceedings [2003].

Proposed embedding matrix multiplication into a group algebra, to be combined with recursive partitioning.
Why group algebras?

Multiplying two polynomials has complexity O(n log n) instead of O(n^2) by embedding coefficients into C[G], where G is a finite cyclic group of order N ≥ 2n, via the map

  p ↦ { p(ω) : ω = exp(2πki/N) }, k = 0, ..., N−1.

  embed into C[G] → convolve using FFT → extract from C[G]

The same can be done with matrix products, via the map A ↦ Σ_{x,y} A(x,y) x^{−1}y. The group G must have special properties.
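The embed → convolve → extract pipeline for polynomials can be sketched with a textbook radix-2 FFT; the names `fft` and `poly_multiply` are illustrative, and padding the coefficient lists to a power of two ≥ 2n plays the role of choosing the cyclic group order N:

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.
    invert=True applies the (unnormalized) inverse transform."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)  # N-th root of unity
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def poly_multiply(p, q):
    """Multiply integer-coefficient polynomials via
    embed -> pointwise multiply (convolve) -> extract."""
    n = 1
    while n < len(p) + len(q) - 1:  # group order large enough
        n *= 2                       # to hold the product
    fp = fft(p + [0] * (n - len(p)))           # embed p
    fq = fft(q + [0] * (n - len(q)))           # embed q
    prod = fft([x * y for x, y in zip(fp, fq)], invert=True)
    # Divide by n to finish the inverse transform, then extract.
    return [round((c / n).real) for c in prod[:len(p) + len(q) - 1]]
```

Each of the three stages costs O(n log n) or O(n), versus O(n^2) for coefficient-by-coefficient multiplication; the matrix-product analogue replaces the cyclic group by a noncommutative group with the right structure.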
The algorithm

• Embed A, B in group algebra
• Perform FFT
• Reorganize results into new matrices
• Multiply new matrices recursively
• Reorganize results into new matrices
• Perform inverse FFT
• Extract C = AB from group algebra