Abstract—We show that linear differential operators with polynomial coefficients over a field of characteristic zero can be multiplied in quasi-optimal time. This answers an open question raised by van der Hoeven.

Keywords—Linear differential operators; multiplication; algebraic algorithms; computational complexity.

† Supported by the Microsoft Research – INRIA Joint Centre.
‡ Supported by the ANR-09-JCJC-0098-01 MaGiX project, as well as by a Digiteo 2009-36HD grant and Région Île-de-France.

I. Introduction

The product of polynomials and the product of matrices are two of the most basic operations in mathematics; the study of their computational complexity is central in computer science. In this paper, we will be interested in the computational complexity of multiplying two linear differential operators. These algebraic objects encode linear differential equations, and form a non-commutative ring that shares many properties with the commutative ring of usual polynomials [21], [22]. The structural analogy between polynomials and linear differential equations was discovered long ago by Libri and Brassinne [18], [7], [13]. Yet, the algorithmic study of linear differential operators is currently much less advanced than in the polynomial case: the complexity of multiplication has been addressed only recently [16], [6], but not completely solved. The aim of the present work is to make a step towards filling this gap, and to solve an open question raised in [16].

Let K be an effective field. That is, we assume data structures for representing the elements of K and algorithms for performing the field operations. The aim of algebraic complexity theory is to study the cost of basic or more complex algebraic operations over K (such as the cost of computing the greatest common divisor of two polynomials of degrees less than d in K[x], or the cost of Gaussian elimination on an r × r matrix in K^{r×r}) in terms of the number of operations in K. The algebraic complexity usually does not coincide with the bit complexity, which also takes into account the potential growth of the actual coefficients in K. Nevertheless, understanding the algebraic complexity usually constitutes a first useful step towards understanding the bit complexity. Of course, in the special, very important case when the field K is finite, both complexities coincide up to a constant factor.

The complexities of operations in the rings K[x] and K^{r×r} have been intensively studied during the last decades. It is well established that polynomial multiplication is a commutative complexity yardstick, while matrix multiplication is a non-commutative complexity yardstick, in the sense that the complexity of operations in K[x] (resp. in K^{r×r}) can generally be expressed in terms of the cost of multiplication in K[x] (resp. in K^{r×r}), and for most of them, in a quasi-linear way [2], [4], [8], [24], [14].

Therefore, understanding the algebraic complexity of multiplication in K[x] and K^{r×r} is a fundamental question. It is well known that polynomials of degrees < d can be multiplied in time M(d) = O(d log d log log d) using algorithms based on the Fast Fourier Transform (FFT) [11], [25], [9], and that two r × r matrices in K^{r×r} can be multiplied in time O(r^ω), with 2 ≤ ω ≤ 3 [27], [23], [12]. The current tightest upper bound, due to Vassilevska Williams [28], is ω < 2.3727, following work of Coppersmith and Winograd [12] and Stothers [26]. Finding the best upper bound on ω is one of the most important open problems in algebraic complexity theory.

In a similar vein, our thesis is that understanding the algebraic complexity of multiplication of linear differential operators is a very important question, since the complexity of more involved, higher-level operations on linear differential operators can be reduced to that of multiplication [17].

From now on, we will assume that the base field K has characteristic zero. Let K[x, ∂] denote the associative algebra K⟨x, ∂; ∂x = x∂ + 1⟩ of linear differential operators in ∂ = d/dx with polynomial coefficients in x. Any element L of K[x, ∂] can be written as a finite sum Σ_i L_i(x) ∂^i for uniquely determined polynomials L_i in K[x]. We say that L has bidegree less than (d, r) in (x, ∂) if L has degree less than r in ∂, and if all the L_i's have degrees less than d in x. The degree in ∂ of L is usually called the order of L.

The main difference with the commutative ring K[x, y] lies in the commutation rule ∂x = x∂ + 1.
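As a concrete illustration (ours, not from the paper), the algebra K⟨x, ∂; ∂x = x∂ + 1⟩ can be prototyped on canonical forms Σ L_{i,j} x^i ∂^j, with multiplication driven entirely by the iterated commutation rule ∂^b x^c = Σ_k C(b,k) · c!/(c−k)! · x^{c−k} ∂^{b−k}. The function name `op_mul` is our own choice.

```python
from math import comb, perm

def op_mul(K, L):
    """Normal-form product of two operators in canonical form.

    Operators are dicts {(i, j): c} encoding c * x^i * d^j, where d = d/dx.
    The only non-commutative ingredient is d x = x d + 1, which iterates to
    d^b x^c = sum_k C(b, k) * c!/(c-k)! * x^(c-k) * d^(b-k).
    """
    out = {}
    for (a, b), cK in K.items():
        for (c, d), cL in L.items():
            for k in range(min(b, c) + 1):
                key = (a + c - k, b + d - k)
                out[key] = out.get(key, 0) + cK * cL * comb(b, k) * perm(c, k)
    return {m: v for m, v in out.items() if v != 0}

# The defining relation: d * x = x*d + 1, while x * d stays x*d.
D = {(0, 1): 1}   # the operator d
X = {(1, 0): 1}   # multiplication by x
assert op_mul(D, X) == {(1, 1): 1, (0, 0): 1}   # d x = x d + 1
assert op_mul(X, D) == {(1, 1): 1}
```

This naive product already costs Θ(d²r²) coefficient operations on bidegree-(d, r) operators; the whole point of the paper is to do much better.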
The cost of our algorithms will be measured by the number of field operations in K they use. We recall that polynomials in K[x]_d can be multiplied within M(d) = O(d log d log log d) = Õ(d) operations in K, using the FFT-based algorithms in [25], [9], and that ω denotes a feasible exponent for matrix multiplication over K, that is, a real constant 2 ≤ ω ≤ 3 such that two r × r matrices with coefficients in K can be multiplied in time O(r^ω). Throughout this paper, we will make the classical assumption that M(d)/d is an increasing function in d.

Most basic polynomial operations in K[x]_d (division, Taylor shift, extended gcd, multipoint evaluation, interpolation, etc.) have cost Õ(d) [2], [4], [8], [24], [14]. Our algorithms will make a crucial use of the following result due to Chin [10], see also [20] for a formulation in terms of structured matrices.

Theorem 2 (Fast Hermite evaluation–interpolation): Let K be an effective field of characteristic zero, let c_0, …, c_{k−1} be k positive integers, and let d = Σ_i c_i. Given k mutually distinct points α_0, …, α_{k−1} in K and a polynomial P ∈ K[x]_d, one can compute the vector of Hermite evaluations (P(α_0), …, P^{(c_0−1)}(α_0), …, P(α_{k−1}), …, P^{(c_{k−1}−1)}(α_{k−1})) in time O(M(d) log k); conversely, P can be recovered from this vector within the same cost.

… of roots of unity. It is also natural to represent the evaluation of L at a suitable number of such powers by a matrix. More precisely, given k ∈ N, we may regard L as an operator from K[x]_k to K[x]_{k+d}. We may also regard elements of K[x]_k and K[x]_{k+d} as column vectors, written in the canonical bases with powers of x. We will denote by

    Φ_L^{k+d,k} = ( L(1)_0        ···  L(x^{k−1})_0
                        ⋮                  ⋮
                    L(1)_{k+d−1}  ···  L(x^{k−1})_{k+d−1} )

the matrix of the K-linear map L : K[x]_k → K[x]_{k+d} with respect to these bases. Given two operators K, L in K[x, ∂]_{d,r}, we clearly have

    Φ_{KL}^{k+2d,k} = Φ_K^{k+2d,k+d} Φ_L^{k+d,k},  for all k ≥ 0.

For k = 2r (or larger), the operator KL can be recovered from the matrix Φ_{KL}^{2r+2d,2r}, whence the formula

    Φ_{KL}^{2r+2d,2r} = Φ_K^{2r+2d,2r+d} Φ_L^{2r+d,2r}.    (1)
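The matrices Φ_L^{k+d,k} and the factorization Φ_{KL}^{k+2d,k} = Φ_K^{k+2d,k+d} Φ_L^{k+d,k} can be checked on small examples with a naive sketch (ours; it does not reproduce the paper's fast algorithm, and all function names are our own). The right-hand column of Φ_{KL} is obtained directly from (KL)(x^j) = K(L(x^j)), so no operator product is needed.

```python
from math import perm

def op_apply(L, p):
    """Apply an operator {(i, j): c} (meaning c * x^i * d^j, d = d/dx) to a
    polynomial given by its coefficient list p = [p_0, p_1, ...]."""
    out = {}
    for (i, j), c in L.items():
        for n, pn in enumerate(p):
            if n >= j and pn:
                # x^i * d^j (x^n) = n!/(n-j)! * x^(i+n-j)
                m = i + n - j
                out[m] = out.get(m, 0) + c * pn * perm(n, j)
    return [out.get(m, 0) for m in range(max(out, default=-1) + 1)]

def phi_mat(L, k, rows):
    """rows x k matrix of L : K[x]_k -> K[x]_rows in the monomial bases."""
    cols = [(op_apply(L, [0] * j + [1]) + [0] * rows)[:rows] for j in range(k)]
    return [[cols[j][i] for j in range(k)] for i in range(rows)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Check Phi_{KL}^{k+2d,k} = Phi_K^{k+2d,k+d} Phi_L^{k+d,k} on a small example:
# K = x*d + 1 and L = d + x both have bidegree less than (d, r) = (2, 2).
K_op, L_op = {(1, 1): 1, (0, 0): 1}, {(0, 1): 1, (1, 0): 1}
k, d = 3, 2
lhs = matmul(phi_mat(K_op, k + d, k + 2 * d), phi_mat(L_op, k, k + d))
cols = [(op_apply(K_op, op_apply(L_op, [0] * j + [1])) + [0] * (k + 2 * d))[:k + 2 * d]
        for j in range(k)]
rhs = [[cols[j][i] for j in range(k)] for i in range(k + 2 * d)]
assert lhs == rhs
```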
For any polynomial P ∈ K[x] and any α ∈ K, we have L(P e^{αx}) = L_α(P) e^{αx}, where

    L_α = Σ_i L_i(x) (∂ + α)^i

is the operator obtained by substituting ∂ + α for ∂ in L = Σ_i L_i(x) ∂^i. Indeed, this is a consequence of the fact that, by Leibniz's rule,

    ∂^i (P e^{αx}) = ( Σ_{j≤i} C(i, j) α^j ∂^{i−j} P ) e^{αx} = (∂ + α)^i (P) e^{αx}.

Now let p = ⌈r/d⌉ and let α_0, …, α_{p−1} be p pairwise distinct points in K. For each integer k ≥ 1, we define the vector space

    V_k = K[x]_k e^{α_0 x} ⊕ · · · ⊕ K[x]_k e^{α_{p−1} x}

with canonical basis

    (e^{α_0 x}, …, x^{k−1} e^{α_0 x}, …, e^{α_{p−1} x}, …, x^{k−1} e^{α_{p−1} x}).

Then we may regard L as an operator from V_k into V_{k+d} and we will denote by Φ_L^{[k+d,k]} the matrix of this operator with respect to the canonical bases. By what precedes, this matrix is block diagonal, with p blocks of size d:

    Φ_L^{[k+d,k]} = diag( Φ_{L_{α_0}}^{k+d,k}, …, Φ_{L_{α_{p−1}}}^{k+d,k} ).

Let us now show that the operator L is uniquely determined by the matrix Φ_L^{[2d,d]}, and that this gives rise to an efficient algorithm for multiplying two operators in K[x, ∂]_{d,r}.

Lemma 2: Let L ∈ K[x, ∂]_{d,r} with r ≥ d. Then
1) We may compute Φ_L^{[2d,d]} as a function of L in time O(d M(r) log r);
2) We may recover L from the matrix Φ_L^{[2d,d]} in time O(d M(r) log r).

Proof: For any operator L = Σ_{i<d, j<r} L_{i,j} x^i ∂^j in K[x, ∂]_{d,r}, we define its truncation L* at order O(∂^d) by

    L* = Σ_{i,j<d} L_{i,j} x^i ∂^j.

Since L − L* vanishes on K[x]_d, we notice that Φ_L^{2d,d} = Φ_{L*}^{2d,d}.

If L ∈ K[∂]_r, then L* can be regarded as the power series expansion of L at ∂ = 0 and order d. More generally, for any i ∈ {0, …, p − 1}, the operator L*_{α_i}(∂) = L(∂ + α_i)* coincides with the Taylor series expansion at ∂ = α_i and order d:

    L*_{α_i}(∂) = L(α_i) + L′(α_i) ∂ + · · · + (1/(d−1)!) L^{(d−1)}(α_i) ∂^{d−1}.

In other words, the computation of the truncated operators L*_{α_0}, …, L*_{α_{p−1}} as a function of L corresponds to a Hermite evaluation at the points α_i, with multiplicity c_i = d at each point α_i. By Theorem 2, this computation can be performed in time O(M(pd) log p) = O(M(r) log r). Furthermore, Hermite interpolation allows us to recover L from L*_{α_0}, …, L*_{α_{p−1}} with the same time complexity O(M(r) log r).

Now let L ∈ K[x, ∂]_{d,r} and consider the expansion of L in x:

    L(x, ∂) = L_0(∂) + · · · + x^{d−1} L_{d−1}(∂).

For each i, one Hermite evaluation of L_i allows us to compute the L*_{α_j,i} with j < p in time O(M(r) log r). The operators L*_{α_j} with j < p can therefore be computed in time O(d M(r) log r). By Lemma 1, we need O(r M(d)) = O(d M(r)) additional operations in order to obtain Φ_L^{[2d,d]}. Similarly, given Φ_L^{[2d,d]}, Lemma 1 allows us to recover the operators L*_{α_j} with j < p in time O(d M(r)). Using d Hermite interpolations, we also recover the coefficients L_i of L in time O(d M(r) log r).

Theorem 3: Assume r ≥ d and let K, L ∈ K[x, ∂]_{d,r}. Then the product KL can be computed in time O(d^{ω−1} r + d M(r) log r).

Proof: Considering K and L as operators in K[x, ∂]_{3d,3r}, Lemma 2 implies that the computation of Φ_K^{[4d,3d]} and Φ_L^{[3d,2d]} as a function of K and L can be done in time O(d M(r) log r). The multiplication

    Φ_{KL}^{[4d,2d]} = Φ_K^{[4d,3d]} Φ_L^{[3d,2d]}

can be done in time O(d^ω p) = O(d^{ω−1} r). Lemma 2 finally implies that we may recover KL from Φ_{KL}^{[4d,2d]} in time O(d M(r) log r).

IV. The new algorithm in the case d > r

Any differential operator L ∈ K[x, ∂]_{d,r} can be written in a unique form

    L = Σ_{i<r, j<d} L_{i,j} x^j ∂^i,  for some scalars L_{i,j} ∈ K.

This representation, with x on the left and ∂ on the right, is called the canonical form of L.

Let ϕ : K[x, ∂] → K[x, ∂] denote the map defined by

    ϕ( Σ_{i<r, j<d} L_{i,j} x^j ∂^i ) = Σ_{i<r, j<d} L_{i,j} ∂^j (−x)^i.

In other words, ϕ is the unique K-algebra automorphism of K[x, ∂] that keeps the elements of K fixed, and is defined on the generators of K[x, ∂] by ϕ(x) = ∂ and ϕ(∂) = −x. We will call ϕ the reflection morphism of K[x, ∂]. The map ϕ enjoys the nice property that it sends K[x, ∂]_{d,r} onto K[x, ∂]_{r,d}. In particular, to an operator whose degree is higher than its order, ϕ associates a "mirror operator" whose order is higher than its degree.
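The reflection morphism ϕ can be prototyped naively (a sketch of ours, not the paper's quasi-optimal Taylor-shift method; `op_mul` and `reflect` are our names): each canonical term c x^a ∂^b is sent to c ∂^a (−x)^b, which is then renormalized to canonical form with the commutation rule ∂x = x∂ + 1.

```python
from math import comb, perm

def op_mul(K, L):
    """Normal-form product of operators {(i, j): c} (meaning c * x^i * d^j),
    using d^b x^c = sum_k C(b, k) * c!/(c-k)! * x^(c-k) * d^(b-k)."""
    out = {}
    for (a, b), cK in K.items():
        for (c, d), cL in L.items():
            for k in range(min(b, c) + 1):
                key = (a + c - k, b + d - k)
                out[key] = out.get(key, 0) + cK * cL * comb(b, k) * perm(c, k)
    return {m: v for m, v in out.items() if v != 0}

def reflect(L):
    """Reflection phi: x -> d, d -> -x.  Each canonical term c * x^a * d^b
    maps to c * d^a * (-x)^b, renormalized to canonical form via op_mul."""
    out = {}
    for (a, b), c in L.items():
        for m, v in op_mul({(0, a): c}, {(b, 0): (-1) ** b}).items():
            out[m] = out.get(m, 0) + v
    return {m: v for m, v in out.items() if v != 0}

# phi sends x to d and d to -x ...
assert reflect({(1, 0): 1}) == {(0, 1): 1}
assert reflect({(0, 1): 1}) == {(1, 0): -1}
# ... and is multiplicative: phi(K L) = phi(K) phi(L).
K_op = {(1, 1): 1, (2, 0): 1}      # x*d + x^2
L_op = {(0, 2): 1, (1, 0): -3}     # d^2 - 3x
assert reflect(op_mul(K_op, L_op)) == op_mul(reflect(K_op), reflect(L_op))
```

The last assertion exercises exactly the property that makes ϕ useful for multiplication: the product of two reflections is the reflection of the product.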
A. Main idea of the algorithm in the case d > r

If d > r, then the reflection morphism ϕ is the key to our fast multiplication algorithm for operators in K[x, ∂]_{d,r}, since it allows us to reduce this case to the previous case when r ≥ d. More precisely, given K, L in K[x, ∂]_{d,r} with d > r, the main steps of the algorithm are:

(S1) compute the canonical forms of ϕ(K) and ϕ(L),
(S2) compute the product M = ϕ(K)ϕ(L) of the operators ϕ(K) ∈ K[x, ∂]_{r,d} and ϕ(L) ∈ K[x, ∂]_{r,d}, using the algorithm described in the previous section,
(S3) compute and return the (canonical form of the) operator KL = ϕ^{−1}(M).

Since d > r, step (S2) can be performed in complexity Õ(r^{ω−1} d) using the results of Section III. In the next subsection, we will prove that both steps (S1) and (S3) can be performed in Õ(rd) operations in K. This will enable us to conclude the proof of Theorem 1.

B. Quasi-optimal computation of reflections

We now show that the reflection and the inverse reflection of a differential operator can be computed quasi-optimally. The idea is that performing reflections can be interpreted in terms of Taylor shifts for polynomials, which can be computed in quasi-linear time using the …

…, where we use the convention that p_{i,j} = 0 as soon as i ≥ r or j ≥ d.

Proof: Leibniz's differentiation rule implies the commutation rule

    ∂^j (x^i / i!) = Σ_{k=0}^{j} C(j, k) (x^{i−k} / (i−k)!) ∂^{j−k}.

Together with the hypothesis, this implies the equality

    Σ_{i,j} (i! q_{i,j}) (x^i / i!) ∂^j = Σ_{i,j} (i! p_{i,j}) ∂^j (x^i / i!)
                                        = Σ_{i,j} (i! p_{i,j}) Σ_{k≥0} C(j, k) (x^{i−k} / (i−k)!) ∂^{j−k}.

We conclude by extraction of coefficients.

Theorem 4: Let L ∈ K[x, ∂]_{d,r}. Then we may compute ϕ(L) and ϕ^{−1}(L) using O(min(d M(r), r M(d))) = Õ(rd) operations in K.

Proof: We first deal with the case r ≥ d. If L = Σ_{i<r, j<d} p_{i,j} x^j ∂^i, then by the first equality of Lemma 3, the reflection ϕ(L) is equal to

    ϕ(L) = Σ_{i<r, j<d} p_{i,j} ∂^j (−x)^i = Σ_{i<r, j<d} q_{i,j} (−x)^j ∂^i, …
If d > r, then we notice that the equality (2) is equivalent to

    j! q_{i,j} = Σ_{ℓ≥0} C(i+ℓ, i) (j+ℓ)! p_{i+ℓ, j+ℓ},

as can be seen by expanding the binomial coefficients. Redefining G_k := Σ_i i! q_{i+k,i} x^{i+k} and F_k := Σ_i i! p_{i+k,i} x^{i+k}, similar arguments as above show that ϕ(L) can be computed using O(d M(r)) operations in K. By what has been said at the beginning of this section, we finally conclude that the inverse reflection ϕ^{−1}(L) = ϕ(ψ(L)) can be computed for the same cost as the direct reflection ϕ(L).

References

[1] A. V. Aho, K. Steiglitz, and J. D. Ullman, "Evaluating polynomials at fixed sets of points," SIAM J. Comput., vol. 4, no. 4, pp. 533–539, 1975.

[2] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, The Design and Analysis of Computer Algorithms. Addison-Wesley Publishing Co., 1974.

[6] A. Bostan, F. Chyzak, and N. Le Roux, "Products of ordinary differential operators by evaluation and interpolation," in Proceedings of ISSAC'08. New York: ACM, 2008, pp. 23–30.

[7] E. Brassinne, "Analogie des équations différentielles linéaires à coefficients variables, avec les équations algébriques," in Note III du Tome 2 du Cours d'analyse de Ch. Sturm, École polytechnique, 2ème édition, 1864, pp. 331–347.

[8] P. Bürgisser, M. Clausen, and M. A. Shokrollahi, Algebraic Complexity Theory, ser. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, 1997, vol. 315, with the collaboration of Thomas Lickteig.

[16] J. van der Hoeven, "FFT-like multiplication of linear differential operators," J. Symbolic Comput., vol. 33, no. 1, pp. 123–127, 2002.

[17] ——, "On the complexity of skew arithmetic," Tech. Rep., 2011, HAL 00557750, http://hal.archives-ouvertes.fr/hal-00557750.
[21] O. Ore, "Formale Theorie der linearen Differentialgleichungen. (Erster Teil)," J. Reine Angew. Math., vol. 167, pp. 221–234, 1932.

[22] ——, "Theory of non-commutative polynomials," Ann. of Math. (2), vol. 34, no. 3, pp. 480–508, 1933.

[23] V. Pan, How to Multiply Matrices Faster, ser. Lecture Notes in Computer Science. Berlin: Springer-Verlag, 1984, vol. 179.

[24] V. Y. Pan, Structured Matrices and Polynomials – Unified Superfast Algorithms. Boston, MA: Birkhäuser Boston Inc., 2001.

[25] A. Schönhage and V. Strassen, "Schnelle Multiplikation großer Zahlen," Computing, vol. 7, pp. 281–292, 1971.

[26] A. Stothers, "On the complexity of matrix multiplication," Ph.D. dissertation, University of Edinburgh, 2010.

[27] V. Strassen, "Gaussian elimination is not optimal," Numer. Math., vol. 13, pp. 354–356, 1969.

[28] V. Vassilevska Williams, "Multiplying matrices faster than Coppersmith-Winograd," in Proceedings of STOC'12. New York: ACM, 2012, pp. 887–898.