Mathematics elementary


Jack D’Aurizio

Contents

0 Introduction

8.1 The Gauss circle problem

0 Introduction

This course is designed for first- and second-year university students of Mathematics. The purpose of these notes is to give elements of both strategy and tactics in problem solving, by explaining ideas and techniques that aim to be elementary and powerful at the same time. We will not focus on a single subject among Calculus, Algebra, Combinatorics or Geometry: we will just try to enlarge the “toolbox” of any aspiring professional mathematician, starting from humble requirements:

A collection of problems in Analysis and Advanced integration techniques, kindly provided by Tolaso J. Kos and Zaid

Alyafeai, are excellent sources of exercises to match with the study of these notes.

These notes are distributed under a Creative Commons Share-Alike (CCSA) license. Personal use is allowed,

distribution is allowed with the only constraint of making a proper mention of the author (Jack D’Aurizio).

Commercial use, modification or inclusion in other works are not allowed.


1 Creative Telescoping and DFT

It soon becomes evident, during the study of Mathematics, that the bijectivity of some function $f$ does not guarantee that the explicit computation of $f^{-1}(y)$ is “just as easy” as the explicit computation of $f(x)$. Some examples are the ease of multiplication, against the hardness of factorization; the possibility of computing derivatives in an algorithmic fashion, against the lack of a completely algorithmic way to find indefinite integrals; the determination of the Galois group of an irreducible polynomial over $\mathbb{Q}$, against the difficult task of finding a polynomial having a given Galois group. In the present section we will outline two interesting techniques for solving (or getting arbitrarily close to an actual “solution” of) a peculiar inverse problem, namely the computation of series.

Definition 1. We give the adjective telescopic to objects of the form $a_1+a_2+\ldots+a_n$, where each $a_i$ can be written as $b_i-b_{i+1}$ for some sequence $b_1,b_2,\ldots,b_n,b_{n+1}$. With such assumption we have:

$$a_1+a_2+\ldots+a_n=(b_1-b_2)+(b_2-b_3)+\ldots+(b_n-b_{n+1})=b_1-b_{n+1}.$$

Essentially, every telescopic sum is simple to compute, just as any convergent series with terms of the form $b_i-b_{i+1}$.

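A peculiar example is provided by Mengoli series: the identity

$$\sum_{n=1}^{N}\frac{1}{n(n+1)}=\sum_{n=1}^{N}\left(\frac{1}{n}-\frac{1}{n+1}\right)=1-\frac{1}{N+1}$$

grants we have

$$\sum_{n\geq 1}\frac{1}{n(n+1)}=1.$$

Lemma 2.

$$\forall k\in\mathbb{N}^+,\qquad \sum_{n\geq 1}\frac{1}{n(n+1)\cdot\ldots\cdot(n+k)}=\frac{1}{k\cdot k!}.$$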

In a forthcoming section we will also see a proof not relying on telescoping, but on properties of Euler’s Beta function.
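Lemma 2 can be checked with exact rational arithmetic. The sketch below is only an illustration: the helper names `partial_sum` and `telescoped` are hypothetical, and the choice $b_n = \frac{1}{k\,n(n+1)\cdots(n+k-1)}$ is one possible telescoping decomposition (it satisfies $b_n-b_{n+1}=\frac{1}{n(n+1)\cdots(n+k)}$ and $b_1=\frac{1}{k\cdot k!}$), not necessarily the one the author has in mind.

```python
from fractions import Fraction


def partial_sum(k: int, N: int) -> Fraction:
    """Exact value of sum_{n=1}^{N} 1 / (n (n+1) ... (n+k))."""
    total = Fraction(0)
    for n in range(1, N + 1):
        prod = 1
        for i in range(k + 1):
            prod *= n + i
        total += Fraction(1, prod)
    return total


def telescoped(k: int, N: int) -> Fraction:
    """Closed form b_1 - b_{N+1}, with b_n = 1 / (k * n (n+1) ... (n+k-1))."""
    def b(n: int) -> Fraction:
        prod = 1
        for i in range(k):
            prod *= n + i
        return Fraction(1, k * prod)
    return b(1) - b(N + 1)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, partial_sum(k, 100), telescoped(k, 100))
```

Since $b_{N+1}\to 0$, the partial sums increase towards $b_1=\frac{1}{k\cdot k!}$.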

The first technical issue we meet in this framework is related to the fact that recognizing a contribution of the form $b_i-b_{i+1}$ in the general term of a series is not always easy, just like in the continuous analogue: if the task is to find $\int_a^b f(x)\,dx$, it is not always easy to devise a function $g$ such that $f(x)=g'(x)$. Here are some non-trivial examples:

Lemma 3.

$$\sum_{n\geq 1}\arctan\frac{1}{1+n+n^2}=\frac{\pi}{4}.$$

Proof. If we use “backwards” the sum/subtraction formulas for the tangent function, we have that

$$\tan(x\pm y)=\frac{\tan(x)\pm\tan(y)}{1\mp\tan(x)\tan(y)}$$

implies:

$$\arctan\frac{1}{n}-\arctan\frac{1}{n+1}=\arctan\left(\frac{\frac{1}{n}-\frac{1}{n+1}}{1+\frac{1}{n(n+1)}}\right)=\arctan\frac{1}{1+n+n^2}$$

so the given series is telescopic and it converges to $\arctan(1)=\frac{\pi}{4}$.


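and $F_{n+2}=F_{n+1}+F_n$ for any $n\geq 0$. Show that:

$$\sum_{n\geq 1}\arctan\frac{(-1)^{n+1}}{F_{n+1}(F_n+F_{n+2})}=\arctan(\sqrt{5}-2).$$

$$\sum_{n\in\mathbb{Z}}\arctan\frac{\sinh 1}{\cosh(2n)}=\frac{\pi}{2},\qquad \sum_{n\geq 1}\arctan\frac{1}{8n^2}=\frac{\pi}{4}-\arctan\tanh\frac{\pi}{4}.$$

$$\sum_{n=0}^{N}\frac{1}{4^n}\binom{2n}{n}=\frac{N+1}{2^{2N+1}}\binom{2N+2}{N+1},$$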

for instance by considering that by De Moivre’s formula we have

$$A_n=\frac{1}{4^n}\binom{2n}{n}=\frac{1}{2\pi}\int_{-\pi}^{\pi}\cos^{2n}(x)\,dx$$

since $\cos(x)^{2n}=\frac{1}{4^n}\sum_{j=0}^{2n}\binom{2n}{j}e^{(2n-j)ix}e^{-jix}$ and $\int_{-\pi}^{\pi}e^{kix}\,dx=2\pi\cdot\delta(k)$, so the only non-vanishing contribution is related to the $j=n$ term, and

$$\sum_{n=0}^{N}\frac{1}{4^n}\binom{2n}{n}=\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{1-\cos^{2N+2}(x)}{\sin^2(x)}\,dx=(2N+2)\,A_{N+1}=\frac{N+1}{2^{2N+1}}\binom{2N+2}{N+1}$$

follows from the integration by parts formula ($\int\frac{dx}{\sin^2 x}=-\cot x$). Prove also that

$$\sum_{k\geq 0}\binom{2k}{k}\frac{1}{(k+1)4^k}=2$$

by recognizing in the main term a telescopic contribution. Give a probabilistic interpretation to the proved identities, by considering random paths on an infinite grid ($\mathbb{Z}\times\mathbb{Z}$) where only unit movements towards North or East are allowed.
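Both statements about central binomial coefficients can be tested exactly for small indices. The following is a minimal sketch with hypothetical helper names (`A`, `central_partial_sum`, `catalan_partial_sum`); it relies on the elementary relation $A_k-A_{k+1}=\frac{A_k}{2k+2}$, which makes $\binom{2k}{k}\frac{1}{(k+1)4^k}=2(A_k-A_{k+1})$ a telescopic term.

```python
from fractions import Fraction
from math import comb


def A(n: int) -> Fraction:
    """A_n = binom(2n, n) / 4^n."""
    return Fraction(comb(2 * n, n), 4 ** n)


def central_partial_sum(N: int) -> Fraction:
    """sum_{n=0}^{N} A_n, computed term by term."""
    return sum((A(n) for n in range(N + 1)), Fraction(0))


def catalan_partial_sum(K: int) -> Fraction:
    """sum_{k=0}^{K} binom(2k, k) / ((k+1) 4^k), computed term by term."""
    return sum((A(k) / (k + 1) for k in range(K + 1)), Fraction(0))


if __name__ == "__main__":
    N = 10
    print(central_partial_sum(N), (2 * N + 2) * A(N + 1))
    print(catalan_partial_sum(N), 2 - 2 * A(N + 1))
```

The second partial sum equals $2-2A_{K+1}$ exactly, and $A_{K+1}\to 0$.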

We now outline the first (really) interesting idea, namely creative telescoping: even if we are not able to write the

main term of a series in the bi − bi+1 form, it is not unlikely there is an accurate approximation of the main term that

can be represented in such a telescopic form. By subtracting the accurate telescopic approximation from the main

term, the original problem boils down to computing/approximating a series that is likely to converge faster than the

original one, and the same approximation-by-telescopic-series trick can be performed again. For instance, we might

employ creative telescoping for producing very accurate approximations of the series

$$\zeta(2)=\sum_{n\geq 1}\frac{1}{n^2}$$

In particular, for any $n>1$ the term $\frac{1}{n^2}$ is quite close to the telescopic term $\frac{1}{n^2-\frac{1}{4}}$:

$$\frac{1}{n^2-\frac{1}{4}}=\frac{4}{(2n-1)(2n+1)}=2\left(\frac{1}{2n-1}-\frac{1}{2n+1}\right)$$


and we have $\frac{1}{n^2}-\frac{1}{n^2-\frac{1}{4}}=-\frac{1}{(2n-1)n^2(2n+1)}$, so:

$$\begin{aligned}\zeta(2)=1+\sum_{n\geq 2}\frac{1}{n^2}&=1+\sum_{n\geq 2}\frac{1}{n^2-\frac{1}{4}}-\sum_{n\geq 2}\frac{1}{(2n-1)n^2(2n+1)}\\&=1+\frac{2}{3}-\sum_{n\geq 2}\frac{1}{(2n-1)n^2(2n+1)}\end{aligned}$$

gives us $\zeta(2)<\frac{5}{3}$ (that we will prove to be equivalent to $\pi^2<10$), and the “residual series” on the right can be manipulated in the same fashion (by extracting the first term, approximating the main term with a telescopic contribution, considering the residual series) or simply bounded above by:

$$\sum_{n\geq 2}\frac{1}{(2n-1)n^2(2n+1)}<\sum_{n\geq 2}\frac{1}{(2n-1)\left(n-\frac{1}{2}\right)\left(n+\frac{1}{2}\right)(2n+1)}=\frac{3}{2}\zeta(2)-\frac{22}{9}$$

from which $\zeta(2)>\frac{74}{45}$ follows. We may notice that the difference between $\frac{74}{45}$ and $\frac{5}{3}$ is already pretty small. In this framework the iteration of creative telescoping leads to two interesting consequences: the identity

$$\zeta(2)=\sum_{n\geq 1}\frac{1}{n^2}=\sum_{n\geq 1}\frac{3}{n^2\binom{2n}{n}},$$

providing a remarkable acceleration of the series defining $\zeta(2)$, and Stirling’s inequality:

$$\left(\frac{n}{e}\right)^n\sqrt{2\pi n}\,\exp\left(\frac{1}{12n+1}\right)\leq n!\leq\left(\frac{n}{e}\right)^n\sqrt{2\pi n}\,\exp\left(\frac{1}{12n}\right)$$
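The gain in convergence speed can be observed numerically. The sketch below compares three ways of approximating $\zeta(2)=\frac{\pi^2}{6}$ with the same number of terms: the defining series, the series obtained after one step of creative telescoping, and the accelerated series with central binomial coefficients. Function names are illustrative only.

```python
from math import comb, pi


def zeta2_plain(N: int) -> float:
    """Partial sum of the defining series sum 1/n^2."""
    return sum(1.0 / n ** 2 for n in range(1, N + 1))


def zeta2_one_step(N: int) -> float:
    """5/3 minus the partial sum of the residual series (one telescoping step)."""
    residual = sum(1.0 / ((2 * n - 1) * n ** 2 * (2 * n + 1)) for n in range(2, N + 1))
    return 5.0 / 3.0 - residual


def zeta2_binomial(N: int) -> float:
    """Partial sum of 3 / (n^2 binom(2n, n))."""
    return sum(3.0 / (n ** 2 * comb(2 * n, n)) for n in range(1, N + 1))


if __name__ == "__main__":
    exact = pi ** 2 / 6
    for f in (zeta2_plain, zeta2_one_step, zeta2_binomial):
        print(f.__name__, abs(f(20) - exact))
```

With 20 terms the plain series is off by a few hundredths, a single telescoping step already reduces the error by several orders of magnitude, and the binomial series is accurate to about machine precision.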

It is also possible to employ creative telescoping for proving that:

$$\zeta(3)=\sum_{n\geq 1}\frac{1}{n^3}=\frac{5}{2}\sum_{n\geq 1}\frac{(-1)^{n+1}}{n^3\binom{2n}{n}}$$

We may notice that:

$$\frac{1}{n^3}=\frac{1}{(n-1)n(n+1)}+\frac{(-1)}{(n-1)n^3(n+1)}$$

$$\frac{1}{(n-1)n^3(n+1)}=\frac{1}{(n-2)(n-1)n(n+1)(n+2)}+\frac{-2^2}{(n-2)(n-1)n^3(n+1)(n+2)}$$

Continuing on telescoping we get that:

$$\frac{1}{n^3}=\frac{(-1)^m\,m!^2}{(n-m)\ldots n^3\ldots(n+m)}+\sum_{j=1}^{m}\frac{(-1)^{j-1}(j-1)!^2}{(n-j)\ldots(n+j)}$$

So by setting $m=n-1$:

$$\frac{1}{n^3}=\frac{(-1)^{n-1}(n-1)!^2}{n^2(2n-1)!}+\sum_{j=1}^{n-1}\frac{(-1)^{j-1}(j-1)!^2}{(n-j)\ldots(n+j)}$$

The terms of the last series can be managed through partial fraction decomposition:

$$\frac{1}{(n-j)\ldots(n+j)}=\frac{1}{(2j)!\,(n-j)}-\frac{1}{(2j-1)!\,1!\,(n-j+1)}+\frac{1}{(2j-2)!\,2!\,(n-j+2)}-\ldots$$


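$$\frac{(n-j-1)!}{(n+j)!}=\sum_{k=0}^{2j}\frac{(-1)^k}{(2j-k)!\,k!\,(n-j+k)}=\frac{1}{(2j)!}\sum_{k=0}^{2j}\frac{(-1)^k\binom{2j}{k}}{n-j+k}$$

and since:

$$\sum_{n>j}\sum_{k=0}^{2j}\frac{(-1)^k\binom{2j}{k}}{n-j+k}=\sum_{h=1}^{2j}\frac{(-1)^{h-1}\binom{2j-1}{h-1}}{h}=\int_0^1(1-x)^{2j-1}\,dx=\frac{1}{2j}$$

we get:

$$\zeta(3)=\sum_{n=1}^{+\infty}\frac{(-1)^{n-1}\,n!^2}{n^4(2n-1)!}+\sum_{j=1}^{+\infty}\sum_{n>j}\frac{(-1)^{j-1}(j-1)!^2}{(n-j)\ldots(n+j)}$$

$$\zeta(3)=\sum_{n=1}^{+\infty}\frac{(-1)^{n-1}\,n!^2}{n^4(2n-1)!}+\sum_{j=1}^{+\infty}\frac{(-1)^{j-1}\,j!^2}{2j^3\,(2j)!}=\frac{5}{2}\sum_{n=1}^{+\infty}\frac{(-1)^{n-1}}{n^3\binom{2n}{n}}$$

as wanted.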
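The finite expansion of $\frac{1}{n^3}$ used above can be verified with exact fractions, and the resulting accelerated series can be compared with a reference value of $\zeta(3)$. This is a sketch under stated assumptions: the helper names are hypothetical and the decimal constant $1.2020569031595942$ is used as the reference value of $\zeta(3)$.

```python
from fractions import Fraction
from math import comb, factorial


def window_product(n: int, j: int) -> int:
    """(n-j)(n-j+1)...(n+j): product of 2j+1 consecutive integers centred at n."""
    prod = 1
    for i in range(-j, j + 1):
        prod *= n + i
    return prod


def telescoped_cube(n: int, m: int) -> Fraction:
    """Right-hand side of the m-step expansion of 1/n^3 (requires m < n)."""
    # (n-m)...n^3...(n+m) equals window_product(n, m) * n^2
    remainder = Fraction((-1) ** m * factorial(m) ** 2, window_product(n, m) * n * n)
    partial = sum(
        (Fraction((-1) ** (j - 1) * factorial(j - 1) ** 2, window_product(n, j))
         for j in range(1, m + 1)),
        Fraction(0),
    )
    return remainder + partial


def zeta3_accelerated(terms: int) -> float:
    """Partial sum of (5/2) * sum (-1)^(n+1) / (n^3 binom(2n, n))."""
    return 2.5 * sum((-1) ** (n + 1) / (n ** 3 * comb(2 * n, n)) for n in range(1, terms + 1))


if __name__ == "__main__":
    print(telescoped_cube(7, 6), Fraction(1, 7 ** 3))
    print(zeta3_accelerated(25))
```

Every choice of $m<n$ returns exactly $\frac{1}{n^3}$, and 25 terms of the accelerated series already agree with $\zeta(3)$ to double precision.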

Exercise 7. Prove that the following identity (about the acceleration of an “almost-geometric” series) holds.

$$\sum_{n\geq 2}\frac{1}{2^n-1}=\frac{1}{4}+\sum_{m\geq 2}\frac{8^m+1}{(2^m-1)\,2^{m^2+m}}.$$

As proved by Tachiya, this kind of acceleration trick provides a simple way for proving the irrationality of $\sum_{n\geq 1}\frac{1}{q^n+1}$ and $\sum_{n\geq 1}\frac{1}{q^n-1}$ for any $q\in\mathbb{Z}$ such that $|q|\geq 2$.

Creative telescoping can also be used for a humble purpose, like proving the divergence of the harmonic series. By recalling that the $n$-th harmonic number $H_n$ is defined through

$$H_n=\sum_{k=1}^{n}\frac{1}{k}$$

and that for any $x\in(0,1]$

$$x<2\,\text{arctanh}\frac{x}{2}=\log\frac{1+\frac{x}{2}}{1-\frac{x}{2}},$$

it follows that:

$$H_n<\sum_{k=1}^{n}\log\frac{2k+1}{2k-1}=\log(2n+1).$$

On the other hand $2\,\text{arctanh}\frac{x}{2}-x=O(x^3)$ in a neighbourhood of the origin, and the series $\sum_{k\geq 1}\frac{1}{k^3}=\zeta(3)$ is convergent, so there is an absolute constant $C$ granting $H_n\geq\log(2n+1)-C$ for any $n\geq 1$. In a similar way, by defining the $n$-th generalized harmonic number $H_n^{(j)}$ through

$$H_n^{(j)}=\sum_{k=1}^{n}\frac{1}{k^j},$$

$$a_n=2\sqrt{n}-H_n^{(1/2)}=2\sqrt{n}-\sum_{k=1}^{n}\frac{1}{\sqrt{k}}$$

$$a_{n+1}-a_n=2\sqrt{n+1}-2\sqrt{n}-\frac{1}{\sqrt{n+1}}=\frac{2}{\sqrt{n}+\sqrt{n+1}}-\frac{1}{\sqrt{n+1}}=\frac{1}{\sqrt{n+1}\left(\sqrt{n}+\sqrt{n+1}\right)^2}>0$$


and

$$a_n=\sum_{m=0}^{n-1}\frac{1}{\sqrt{m+1}\left(\sqrt{m}+\sqrt{m+1}\right)^2}\underset{n\to+\infty}{\longrightarrow}\sum_{n\geq 0}\frac{1}{\sqrt{n+1}\left(\sqrt{n}+\sqrt{n+1}\right)^2}.$$

The claim then follows from considering that the main term of the last series is well-approximated by the telescopic term

$$\frac{1}{\sqrt{4n+1}}-\frac{1}{\sqrt{4n+5}}$$

for any $n\geq 1$. In similar contexts, by exploiting creative telescoping and the Cauchy-Schwarz inequality we may get surprising results, like the following one:

$$\sum_{k=1}^{n}\frac{1}{n+k}<\sum_{k=1}^{n}\frac{1}{\sqrt{n+k}\,\sqrt{n+k-1}}\overset{\text{CS}}{\leq}\sqrt{n\sum_{k=1}^{n}\left(\frac{1}{n+k-1}-\frac{1}{n+k}\right)}=\frac{1}{\sqrt{2}}$$

but the limit of the LHS for $n\to+\infty$ is $\log(2)$, hence $\log(2)\leq\frac{1}{\sqrt{2}}$. In general, by mixing a few ingredients among creative telescoping, the Cauchy-Schwarz inequality, convexity arguments and Weierstrass products we may achieve short and elegant proofs of highly non-trivial claims, like:

$$a_n=\frac{\sqrt{\pi\left(n+\frac{1}{4}\right)}}{4^n}\binom{2n}{n}$$

That implies

$$\frac{\Gamma\left(x+\frac{1}{2}\right)}{\Gamma(x)}\sim\frac{x}{\sqrt{x+1/4}}$$

for any $x>0$, that is a strengthening of Gautschi’s inequality.

Creative telescoping is also a key element in the Wilf and Zeilberger algorithm for the symbolic computation of binomial sums (http://mathworld.wolfram.com/Wilf-ZeilbergerPair.html), further extended by Gosper to the hypergeometric case and by Risch (https://en.wikipedia.org/wiki/Risch_algorithm) to the symbolic computation of elementary antiderivatives.

$$\sum_{n\geq 1}\frac{1}{n(n+1)^k}=k-\zeta(2)-\ldots-\zeta(k)$$

where $\zeta(m)=\sum_{n\geq 1}\frac{1}{n^m}$.

The following exercise is particularly exemplary, since it stresses some interesting relations among creative telescoping, the Cauchy-Schwarz inequality, the Maclaurin series of $\arcsin^2(z)$, $\zeta(2)$ and Catalan numbers: all these topics will be deeply investigated in the following sections.


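$$S=\sum_{n\geq 1}\frac{1}{(n+1)\sqrt{n}}$$

is extremely close to $\frac{1}{2}+\frac{\pi}{4}\sqrt{3}$.

$$S\approx\frac{1}{2}+\sqrt{\pi}\sum_{n\geq 2}\frac{\binom{2n}{n}}{4^n(n+1)}=\frac{1}{2}+\frac{3}{4}\sqrt{\pi}$$

since $\frac{1}{4^n}\binom{2n}{n}\approx\frac{1}{\sqrt{\pi n}}$ is a pretty good approximation for any $n\geq 1$ and the generating function for Catalan numbers

$$\frac{1}{\sqrt{n}}\approx\frac{\sqrt{\pi}}{4^n}\binom{2n}{n}\left(1+\frac{1}{8n}+\frac{1}{128\,n(n+2)}\right).$$

Creative telescoping provides us a more elementary approach: indeed, $\frac{1}{(n+1)\sqrt{n}}<\frac{2}{\sqrt{n}}-\frac{2}{\sqrt{n+1}}$ immediately proves $S<2$, and the more accurate $\frac{1}{(n+1)\sqrt{n}}\approx\frac{2}{\sqrt{n+\frac{1}{6}}}-\frac{2}{\sqrt{n+\frac{7}{6}}}$ gives $S\approx\frac{1}{2}+2\sqrt{\frac{6}{13}}$. On the other hand we may also combine the approximation through central binomial coefficients with the Cauchy-Schwarz inequality to get an exceptionally simple and very accurate approximation:

$$S\leq\frac{1}{2}+\sqrt{\sum_{n\geq 2}\frac{4^n}{n(n+1)\binom{2n}{n}}\;\sum_{n\geq 2}\frac{\binom{2n}{n}}{(n+1)4^n}}=\frac{1}{2}+\sqrt{\frac{\pi^2}{4}\cdot\frac{3}{4}}$$

gives $S\approx\frac{1}{2}+\frac{\sqrt{3}}{4}\pi$, whose absolute error is less than $4\cdot 10^{-4}$.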

Before introducing a second tool (the discrete Fourier transform, DFT), it might be interesting to consider an application of creative telescoping to the computation of an integral.

$$\int_0^1\frac{\log(x)\log^2(1-x)}{x}\,dx=-\frac{1}{2}\sum_{n\geq 1}\frac{1}{n^4}=-\frac{\zeta(4)}{2}.$$

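Proof. The dilogarithm function is defined, for any $x\in[0,1]$, through:

$$\text{Li}_2(x)=\sum_{n\geq 1}\frac{x^n}{n^2}.$$

We have $\text{Li}_2'(x)=-\frac{\log(1-x)}{x}$, so, by integration by parts:

$$\begin{aligned}\int_0^1\frac{\log(x)\,\text{Li}_2(x)}{1-x}\,dx&=\int_0^1\log(1-x)\left[\frac{\text{Li}_2(x)}{x}+\log(x)\,\text{Li}_2'(x)\right]dx\\&=-\int_0^1\text{Li}_2'(x)\,\text{Li}_2(x)\,dx-\int_0^1\frac{\log(x)\log^2(1-x)}{x}\,dx\end{aligned}$$

In particular the opposite of our integral equals:

$$\begin{aligned}-\int_0^1\frac{\log(x)\log^2(1-x)}{x}\,dx&=\frac{1}{2}\text{Li}_2^2(1)+\int_0^1\sum_{n\geq 1}\frac{x^n}{n^2}\sum_{k\geq 0}x^k\log(x)\,dx\\&=\frac{1}{2}\zeta(2)^2-\sum_{n\geq 1}\frac{1}{n^2}\sum_{m>n}\frac{1}{m^2}\end{aligned}$$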


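where, by symmetry:

$$\sum_{m>n\geq 1}\frac{1}{m^2n^2}=\frac{1}{2}\left[\left(\sum_{n\geq 1}\frac{1}{n^2}\right)^2-\sum_{n\geq 1}\frac{1}{n^4}\right]$$

$$\frac{\log^2(1-x)}{x}=\sum_{n\geq 1}\frac{2H_n}{n+1}\,x^n,$$

since:

$$-\log(1-x)=\sum_{n\geq 1}\frac{x^n}{n},\qquad\frac{-\log(1-x)}{1-x}=\sum_{n\geq 1}H_n\,x^n,\qquad\frac{1}{2}\log^2(1-x)=\sum_{n\geq 1}\frac{H_n}{n+1}\,x^{n+1}.$$

By termwise integration (through $\int_0^1(-\log x)\,x^n\,dx=\frac{1}{(n+1)^2}$) the proved identities lead to:

$$\sum_{n\geq 1}\frac{H_n}{(n+1)^3}=\frac{1}{4}\sum_{n\geq 1}\frac{1}{n^4}=\frac{\zeta(4)}{4}.$$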

A keen reader might ask why this virtuosity¹ has been included in the creative telescoping section. The reason is the following: in order to make the magic work, we actually do not need the dilogarithm function (a mathematical function with a sense of humour, according to D. Zagier) or integration by parts. As a matter of fact:

$$H_n=\sum_{k=1}^{n}\frac{1}{k}=\sum_{m\geq 1}\left(\frac{1}{m}-\frac{1}{m+n}\right)=\sum_{m\geq 1}\frac{n}{m(m+n)}$$

$$\begin{aligned}\sum_{n\geq 1}\frac{H_n}{(n+1)^3}&=\sum_{n\geq 1}\left(\frac{H_{n+1}}{(n+1)^3}-\frac{1}{(n+1)^4}\right)=-\zeta(4)+\sum_{n\geq 1}\frac{H_n}{n^3}\\&=-\zeta(4)+\sum_{n,m\geq 1}\frac{1}{mn^2(m+n)}=-\zeta(4)+\frac{1}{2}\sum_{m,n\geq 1}\left(\frac{1}{mn^2(m+n)}+\frac{1}{m^2n(m+n)}\right)\\&=-\zeta(4)+\frac{1}{2}\sum_{m,n\geq 1}\frac{1}{m^2n^2}=-\zeta(4)+\frac{1}{2}\zeta(2)^2\end{aligned}$$

and by comparing the last identity to the identities we already know, we get that $\zeta(4)=\frac{2}{5}\zeta(2)^2$.
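The Euler sum above can be checked numerically against $\frac{\zeta(4)}{4}=\frac{\pi^4}{360}$. The snippet is a rough sanity check with a hypothetical helper name, using the known value $\zeta(4)=\frac{\pi^4}{90}$; the tail of the series after $N$ terms is of order $\frac{\log N}{N^2}$.

```python
from math import pi


def euler_sum(N: int) -> float:
    """Partial sum of sum_{n=1}^{N} H_n / (n+1)^3, updating H_n on the fly."""
    total = 0.0
    harmonic = 0.0
    for n in range(1, N + 1):
        harmonic += 1.0 / n
        total += harmonic / (n + 1) ** 3
    return total


if __name__ == "__main__":
    print(euler_sum(20000), pi ** 4 / 360)
```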

Some questions might naturally arise at this point: is it possible, in a similar fashion, to relate the value of $\zeta(2^{k+1})$ to the value of $\zeta(2^k)$? Or: is it possible to find the explicit value of $\zeta(2)$ by simply squaring the Taylor series at the origin of the arctangent function? Answers to such questions are postponed.

Exercise 12. Let A be a finite set with cardinality ≥ 4. Let P0 be the set of subsets of A with 3j elements, let P1 be

the set of subsets of A with 3k + 1 elements, let P2 be the set of subsets of A with 3h + 2 elements. Prove that any

two numbers among |P0 |, |P1 |, |P2 | differ at most by 1, no matter what |A| is.

¹ It is worth mentioning that just like $\int_0^1\frac{\log(1-x)\log^2(x)}{1-x}\,dx$ is associated to an Euler sum with weight 4, namely $\sum_{n\geq 1}\frac{H_n}{n^3}$, the similar integral $\int_0^1\frac{\log(1+x)\log^2(x)}{1+x}\,dx$ is associated to the alternating series $\sum_{n\geq 1}\frac{H_n}{n^3}(-1)^{n+1}$. On the other hand, while the first series is clearly given by the values of the Riemann $\zeta$ function at $s=2$ or $s=4$, the alternating series has a much more involved closed form:

$$\sum_{n\geq 1}\frac{H_n}{(n+1)^3}(-1)^{n+1}=\frac{1}{2}\int_0^1\frac{\log(1+x)\log^2(x)}{1+x}\,dx=\frac{1}{48}\left(-\pi^4-4\pi^2\log^2(2)+4\log^4(2)+96\,\text{Li}_4\left(\tfrac{1}{2}\right)+84\log(2)\zeta(3)\right)$$

has been proved by De Doelder in 1991. See also Flajolet and Salvy, Euler sums and contour integral representations.


The claim appears to be a (more or less) direct generalization of a well-known fact: the number of subsets of $I=\{1,2,\ldots,n\}$ with even cardinality equals the number of subsets with odd cardinality. In that framework, we may consider the map sending $B\subseteq I$ to $B\setminus\{1\}$ when $1\in B$, and to $B\cup\{1\}$ when $1\notin B$ (“if there is 1, we remove it, otherwise we insert it”): such map is an involution and provides a bijection between the subsets with even cardinality and the subsets with odd cardinality. As an alternative, by recalling that in $I$ we have $\binom{n}{k}$ subsets with $k$ elements, we may simply check that

$$\sum_{k=0}^{n}(-1)^k\binom{n}{k}=0.$$

In the ternary case we have to compare the sums

$$|P_0|=\sum_{k\equiv 0\pmod 3}\binom{n}{k},\qquad|P_1|=\sum_{k\equiv 1\pmod 3}\binom{n}{k},\qquad|P_2|=\sum_{k\equiv 2\pmod 3}\binom{n}{k}$$

and we would like to have a tool allowing us to isolate the contributions given by elements in particular positions (positions given by an arithmetic progression) in a sum. The DFT is precisely such a tool.

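Lemma 13 (DFT). If $n\geq 2$ is a natural number and $\omega=\exp\left(\frac{2\pi i}{n}\right)$, the function $f:\mathbb{Z}\to\mathbb{C}$ given by

$$f(m)=\frac{1}{n}\sum_{k=0}^{n-1}\omega^{km}$$

is the indicator function of the multiples of $n$. More generally, the indicator function of the integers $\equiv h\pmod n$ is

$$\chi_h(m)=\frac{1}{n}\sum_{k=0}^{n-1}\omega^{-hk}\,\omega^{km}.$$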

The possibility of writing some indicator functions as weighted power sums has deep consequences. In our case, if we take $\omega$ as a primitive third root of unity, we have:

$$|P_0|=\sum_{k=0}^{n}\binom{n}{k}\chi_0(k)=\frac{1}{3}\sum_{k=0}^{n}\binom{n}{k}\left(1^k+\omega^k+\omega^{2k}\right)=\frac{(1+1)^n+(1+\omega)^n+(1+\omega^2)^n}{3}$$

due to the binomial Theorem. Since both $(1+\omega)$ and its conjugate $(1+\omega^2)$ lie on the unit circle, we have that $|P_0|$ is an integer number whose distance from $\frac{2^n}{3}$ is bounded by $\frac{2}{3}$. The reader can easily check the same holds for $|P_1|$ and $|P_2|$; since any two integers within distance $\frac{2}{3}$ of the same real number differ by at most 1, the claim readily follows. The discrete Fourier transform thus proves the reasonable proposition claiming the almost-uniform distribution of the cardinality (mod 3) of subsets of $\{1,\ldots,n\}$. Perfect uniformity is clearly not possible, since $|P_0|+|P_1|+|P_2|=2^n$ never belongs to $3\mathbb{Z}$.

Exercise left to the reader: prove the claim of Exercise 12 by induction on $|A|$.
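The DFT filter can be tried out on small sets. The sketch below counts subsets by cardinality modulo 3 in two ways: directly from binomial coefficients, and through the root-of-unity formula $\frac{1}{3}\sum_{k=0}^{2}\omega^{-hk}(1+\omega^k)^n$. Function names are illustrative, and the floating-point evaluation is rounded to the nearest integer, so it is only reliable for moderate $n$.

```python
import cmath
from math import comb


def counts_direct(n: int) -> list:
    """[|P_0|, |P_1|, |P_2|] for an n-element set, summing binomial coefficients."""
    return [sum(comb(n, k) for k in range(r, n + 1, 3)) for r in range(3)]


def count_dft(n: int, h: int) -> int:
    """|P_h| through the ternary DFT filter (floating point, then rounded)."""
    w = cmath.exp(2j * cmath.pi / 3)  # primitive third root of unity
    value = sum(w ** (-h * k) * (1 + w ** k) ** n for k in range(3)) / 3
    return round(value.real)


if __name__ == "__main__":
    for n in range(4, 10):
        print(n, counts_direct(n), [count_dft(n, h) for h in range(3)])
```

For instance $n=3$ gives the counts $2,3,3$, whose distances from $\frac{8}{3}$ are $\frac{2}{3},\frac{1}{3},\frac{1}{3}$.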

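$$S=\sum_{n\geq 0}\frac{1}{(3n)!}.$$

$$e^z=\sum_{n\geq 0}\frac{z^n}{n!}$$

is defined by an everywhere-convergent power series, then apply a ternary DFT to such series and get, like in the previous exercise (here we are manipulating an infinite sum, but there is no issue since $e^z$ is an entire function):

$$\sum_{n\geq 0}\frac{z^{3n}}{(3n)!}=\sum_{n\equiv 0\pmod 3}\frac{z^n}{n!}=\sum_{n\geq 0}\frac{z^n}{n!}\chi_0(n)=\frac{1}{3}\sum_{n\geq 0}\frac{z^n+(\omega z)^n+(\omega^2z)^n}{n!}=\frac{1}{3}\left(e^z+e^{\omega z}+e^{\omega^2z}\right)$$

and since $\omega=\frac{-1+i\sqrt{3}}{2}$ and $\omega^2=\frac{-1-i\sqrt{3}}{2}$, for any $z\in\mathbb{C}$ we have:

$$\sum_{n\geq 0}\frac{z^{3n}}{(3n)!}=\frac{1}{3}\left(e^z+2e^{-z/2}\cos\left(\frac{z\sqrt{3}}{2}\right)\right),$$

$$S=\frac{e}{3}+\frac{2}{3\sqrt{e}}\cos\left(\frac{\sqrt{3}}{2}\right).$$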
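A quick numerical comparison between the series and the closed form obtained through the ternary DFT (a sketch; the function names are hypothetical and twelve terms are more than enough in double precision):

```python
from math import cos, e, factorial, sqrt


def series_value(terms: int = 12) -> float:
    """Partial sum of sum_{n>=0} 1/(3n)!."""
    return sum(1.0 / factorial(3 * n) for n in range(terms))


def closed_form() -> float:
    """e/3 + 2/(3 sqrt(e)) * cos(sqrt(3)/2)."""
    return e / 3 + 2 / (3 * sqrt(e)) * cos(sqrt(3) / 2)


if __name__ == "__main__":
    print(series_value(), closed_form())
```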

Was the introduction of the complex $z$ variable really necessary? It clearly was not: a viable alternative would have been to just re-write $1$ as $1^n$ in the definition of $S$. Although the identity $1=1^n$ is really obvious, the idea of tackling the original problem through such identity and the DFT is not obvious at all: similar situations explain just fine the subtle difference between the adjectives elementary and easy in a mathematical context. Another famous application of the DFT is related to the Frobenius coin problem:

Exercise 15. Given $n\in\mathbb{N}$, let $U_n$ be the number of natural solutions of the (diophantine) equation $a+2b+3c=n$, i.e.

$$U_n=\left|\left\{(a,b,c)\in\mathbb{N}^3:a+2b+3c=n\right\}\right|.$$

Prove that for any $n$, $U_n$ equals the closest integer to $\frac{(n+3)^2}{12}$.
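Before proving the claim of Exercise 15 it is reassuring to test it by brute force. In the sketch below (illustrative names only) `U` enumerates the admissible pairs $(b,c)$, the value of $a$ being then forced, while `closest_integer` rounds $\frac{(n+3)^2}{12}$ with integer arithmetic; the rounding is unambiguous because $(n+3)^2 \bmod 12$ only takes the values $0, 1, 4, 9$.

```python
def U(n: int) -> int:
    """Number of (a, b, c) in N^3 with a + 2b + 3c = n, by direct enumeration."""
    return sum(1 for c in range(n // 3 + 1) for b in range((n - 3 * c) // 2 + 1))


def closest_integer(n: int) -> int:
    """Closest integer to (n+3)^2 / 12, computed without floating point."""
    return ((n + 3) ** 2 + 6) // 12


if __name__ == "__main__":
    print([U(n) for n in range(12)])
    print([closest_integer(n) for n in range(12)])
```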

This claim will be proved in the section about Analytic Combinatorics, since a few elements of Complex Analysis and manipulation of formal power series are required. However we remark that the key idea is the same key idea of Hardy-Littlewood’s circle method, a really important tool in Additive Number Theory: for instance, it has been used for proving that any odd natural number large enough is the sum of three primes (Vinogradov’s theorem, also known as ternary Goldbach). Now we will focus on a typical application of the DFT in Arithmetic, i.e. a proof of a particular case of Dirichlet’s Theorem.

Theorem 16 (Dirichlet). If $a$ and $b$ are coprime positive integers, there are infinitely many prime numbers $\equiv a\pmod b$.

The particular case we are going to study is the proof of the existence of infinitely many primes of the form $6k+1$. We recall that the infinitude of primes of the form $6k-1$ follows from a minor variation on Euclid’s proof of the infinitude of primes:

Let us assume the set of primes of the form $6k-1$ is finite and given by $E=\{p_1=5,11,17,\ldots,p_M\}$. Let us consider the huge number

$$N=-1+6\prod_{m=1}^{M}p_m.$$

By construction, no element of $E$ divides $N$. On the other hand, $N$ is a number of the form $6K-1$, hence it must have some prime divisor $\equiv -1\pmod 6$. Such contradiction leads to the fact that the set of primes of the form $6k-1$ is not finite (aka infinite).


We may notice that a number of the form $6k+1$ is not compelled to have a prime divisor of the same form (for instance, $55=5\cdot 11$), so the previous argument is not well-suited for covering the $6k+1$ case, too.² We then take a step back and a step forward: we provide an alternative proof of the infinitude of primes, then prove it can be adjusted to prove the existence of infinitely many primes of the form $6k+1$, too. Let us recall the main statement in Analytic Number Theory:

Theorem 17 (Euler’s product for the $\zeta$ function). If $\mathcal{P}$ is the set of prime numbers and $s$ is a complex number with real part greater than one,

$$\prod_{p\in\mathcal{P}}\left(1-\frac{1}{p^s}\right)^{-1}=\sum_{n\geq 1}\frac{1}{n^s}=\zeta(s).$$

Since $\left(1-\frac{1}{p^s}\right)^{-1}=1+\frac{1}{p^s}+\frac{1}{p^{2s}}+\ldots$, such result is just the analytic counterpart of the Fundamental Theorem of Arithmetic, stating that $\mathbb{Z}$ is a UFD.

In such framework the following argument is pretty efficient: if there were just a finite number of primes, given Euler’s product the harmonic series would be convergent. But we know it is not, so there have to be an infinite number of primes. The Theorem just outlined has an interesting generalization:

Theorem 18 (Euler’s product for Dirichlet’s L-functions). If $\mathcal{P}$ is the set of prime numbers, $s$ is a complex number with real part greater than one and $\chi(n)$ is a totally multiplicative function (i.e. a function such that $\chi(nm)=\chi(n)\chi(m)$ holds for any pair $(n,m)$ of positive integers), we have:

$$\prod_{p\in\mathcal{P}}\left(1-\frac{\chi(p)}{p^s}\right)^{-1}=\sum_{n\geq 1}\frac{\chi(n)}{n^s}=L(s,\chi).$$

We may consider a simple totally multiplicative function: the function that equals $1$ over natural numbers $\equiv 1\pmod 6$, $-1$ over natural numbers $\equiv -1\pmod 6$ and zero otherwise. Such function is the non-principal (Dirichlet) character (mod 6). We may notice that:

$$\sum_{n\geq 1}\frac{z^n}{n}=-\log(1-z)$$

for any complex number $z$ having modulus less than one. By applying the DFT with respect to a primitive sixth root of unity:

$$L(1,\chi)=\sum_{k\geq 0}\left(\frac{1}{6k+1}-\frac{1}{6k+5}\right)=\frac{\pi}{2\sqrt{3}}.$$

As an alternative:

$$\begin{aligned}L(1,\chi)&=\sum_{k\geq 0}\left(\frac{1}{6k+1}-\frac{1}{6k+5}\right)=\sum_{k\geq 0}\int_0^1\left(x^{6k}-x^{6k+4}\right)dx\\&=\int_0^1(1-x^4)\sum_{k\geq 0}x^{6k}\,dx=\int_0^1\frac{1-x^4}{1-x^6}\,dx\\&=\int_0^1\frac{1+x^2}{1+x^2+x^4}\,dx=\frac{\pi}{2\sqrt{3}}\end{aligned}$$
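Both properties of the character used above (total multiplicativity and the value of $L(1,\chi)$) are easy to test numerically. This is a sketch with hypothetical names `chi` and `L1_partial`; the partial sums converge slowly, with a tail of order $\frac{1}{9K}$ after $K$ terms.

```python
from math import pi, sqrt


def chi(n: int) -> int:
    """Non-principal Dirichlet character modulo 6."""
    r = n % 6
    if r == 1:
        return 1
    if r == 5:
        return -1
    return 0


def L1_partial(K: int) -> float:
    """Partial sum of sum_{k>=0} (1/(6k+1) - 1/(6k+5))."""
    return sum(1.0 / (6 * k + 1) - 1.0 / (6 * k + 5) for k in range(K))


if __name__ == "__main__":
    print(L1_partial(100000), pi / (2 * sqrt(3)))
```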

² However, there is a light that never goes out: the infinitude of primes of the form $6k+1$ can be proved in an algebraic fashion by considering cyclotomic polynomials. For instance, every prime divisor of $\Phi_6(3n)=9n^2-3n+1$ is a number of the form $6k+1$.


Let us assume that prime numbers $\equiv 1\pmod 6$ are finite and consider Euler’s product for $L(s,\chi)$:

$$\begin{aligned}L(s,\chi)&=\prod_{p\equiv 1\pmod 6}\left(1-\frac{1}{p^s}\right)^{-1}\prod_{p\equiv -1\pmod 6}\left(1+\frac{1}{p^s}\right)^{-1}\\&\leq\prod_{p\equiv 1\pmod 6}\left(1-\frac{1}{p}\right)^{-1}\prod_{p\equiv -1\pmod 6}\left(1+\frac{1}{p^s}\right)^{-1}\\&=C\prod_{p\neq 2,3}\left(1+\frac{1}{p^s}\right)^{-1}\\&=D\prod_{p}\left(1+\frac{1}{p^s}\right)^{-1}\\&=D\,\frac{\zeta(2s)}{\zeta(s)}\end{aligned}$$

for some constant $D>0$. From the divergence of the harmonic series we would have:

$$\lim_{s\to 1^+}L(s,\chi)=0,$$

but we already know that $L(1,\chi)>0$ (we computed its explicit value).

Such contradiction leads to the fact that the set of primes of the form $6k+1$ is infinite.

We underline some points in the proof just outlined:

• we used Euler’s product, the analytic counterpart of the Fundamental Theorem of Arithmetic, for studying the distribution of primes in the arithmetic progressions (mod 6). It looks highly unlikely that there is just a finite number of primes $\equiv 1\pmod 6$ and infinitely many primes $\equiv -1\pmod 6$, so we just need to show that such awkward “imbalance” does not really occur;

• through the DFT, we may compute the value of $L(1,\chi)$ (with $\chi$ being the non-principal character (mod 6)) in an explicit way, and check it is a positive number;

• from Euler’s product we have that the previous “imbalance” would lead to $L(1,\chi)=0$. It is not so, hence there is no “imbalance”.

At last, we mention that both the DFT and the existence of Dirichlet’s characters are instances of Pontryagin’s duality (https://en.wikipedia.org/wiki/Pontryagin_duality). The DFT is also of great importance for algorithms, since it gives methods for the fast multiplication of polynomials (or integers): in such a context it is also known as FFT (Fast Fourier Transform). Summarizing:

• The key idea is to exploit interpolation/extrapolation. A polynomial with degree $m$ is fixed by its values at $m+1$ distinct points. If we are given $a(x)$ and $b(x)$ and we need to compute $c(x)=a(x)\cdot b(x)$, we may. . .

• compute in an explicit way the values of $a$ and $b$ at the $2^n$-th roots of unity, then the values of $c$ at such points. . .

• and compute the coefficients of $c(x)$ through such values. Nicely, both the evaluation process and the extrapolation process are associated with a matrix-vector-product problem, where the involved matrix is Vandermonde’s matrix given by the $2^n$-th roots of unity;

• the structure of such matrix depends in a simple way on the structure of Vandermonde’s matrix given by the $2^{n-1}$-th roots of unity, hence the needed matrix-vector-products can be computed through a recursive, divide et impera approach, with a significant improvement in computational costs.
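The four steps of the summary above can be sketched as a short recursive program. This is a minimal textbook-style radix-2 implementation written for illustration, not an optimized one: the names `fft` and `multiply` are hypothetical, coefficients are assumed to be integers (so that rounding the floating-point result is legitimate), and coefficient lists are ordered from the constant term upwards.

```python
import cmath


def fft(a: list, invert: bool = False) -> list:
    """Evaluate the polynomial with coefficients a at the len(a)-th roots of unity.

    len(a) must be a power of two. With invert=True the conjugate roots are
    used; dividing the output by len(a) then gives the inverse transform.
    """
    n = len(a)
    if n == 1:
        return list(a)
    # divide: A(x) = E(x^2) + x * O(x^2), each half uses the (n/2)-th roots
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def multiply(a: list, b: list) -> list:
    """Coefficients of a(x) * b(x) for integer coefficient lists a and b."""
    result_len = len(a) + len(b) - 1
    size = 1
    while size < result_len:
        size *= 2
    fa = fft(list(a) + [0] * (size - len(a)))  # values of a at the roots of unity
    fb = fft(list(b) + [0] * (size - len(b)))  # values of b at the roots of unity
    fc = [x * y for x, y in zip(fa, fb)]       # values of c = a * b
    c = fft(fc, invert=True)                   # interpolate back to coefficients
    return [round((v / size).real) for v in c[:result_len]]


if __name__ == "__main__":
    # (1 + 2x + 3x^2)(4 + 5x) = 4 + 13x + 22x^2 + 15x^3
    print(multiply([1, 2, 3], [4, 5]))
```

Each recursion level performs a linear amount of work and there are about $\log_2(\text{size})$ levels, which is the source of the improvement over the quadratic schoolbook product.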


We have studied how to use the creative telescoping machinery for producing fast-convergent series representing $\zeta(2)$ or $\zeta(3)$. Three formulas provide a wide generalization of such statement. The first one is due to Koecher (1979):

$$\sum_{n\geq 0}\zeta(2n+3)\,a^{2n}=\sum_{k\geq 1}\frac{1}{k(k^2-a^2)}=\frac{1}{2}\sum_{k\geq 1}\frac{(-1)^{k+1}}{k^3\binom{2k}{k}}\cdot\frac{5k^2-a^2}{k^2-a^2}\prod_{m=1}^{k-1}\left(1-\frac{a^2}{m^2}\right),$$

$$\sum_{n\geq 0}\left(1-\frac{1}{2^{2n+1}}\right)\zeta(2n+2)\,a^{2n}=\sum_{n\geq 1}\frac{(-1)^{n+1}}{n^2-a^2}=\frac{1}{2}\sum_{k\geq 1}\frac{1}{k^2\binom{2k}{k}}\cdot\frac{3k^2+a^2}{k^2-a^2}\prod_{m=1}^{k-1}\left(1-\frac{a^2}{m^2}\right),$$

$$\sum_{n\geq 0}\zeta(2n+2)\,a^{2n}=\sum_{k\geq 1}\frac{1}{k^2-a^2}=3\sum_{k\geq 1}\frac{1}{\binom{2k}{k}(k^2-a^2)}\prod_{m=1}^{k-1}\frac{m^2-4a^2}{m^2-a^2}.$$

They hold for any $a\in(-1,1)$: by comparing the coefficients of $a^h$ in the LHS/RHS one gets that $\zeta(m)$, for any $m\geq 2$, can be represented as a fast-convergent series involving central binomial coefficients and generalized harmonic numbers. It is straightforward to recover the well-known results

$$\zeta(2)=3\sum_{n\geq 1}\frac{1}{n^2\binom{2n}{n}},\qquad\zeta(3)=\frac{5}{2}\sum_{n\geq 1}\frac{(-1)^{n+1}}{n^3\binom{2n}{n}}$$

$$\zeta(4)=\frac{36}{17}\sum_{k\geq 1}\frac{1}{k^4\binom{2k}{k}},\qquad G=\sum_{k\geq 0}\frac{(-1)^k}{(2k+1)^2}=\frac{1}{2}\sum_{n\geq 0}\frac{2^n}{(2n+1)\binom{2n}{n}}\sum_{k=0}^{n}\frac{1}{2k+1}.$$

The last identity can also be proved by computing integrals involving the $\arcsin^2(x)$ function or by computing the binomial transform of $\frac{1}{(2k+1)^2}$.

$$\sum_{n\geq 2}\text{arctanh}\frac{1}{n^3}=\frac{\log(3)-\log(2)}{2},$$

trivially leading to $\zeta(3)<1+\frac{1}{2}\log\frac{3}{2}$.

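$$\forall s>1,\qquad\sum_{m,n\geq 1}\frac{1}{\text{lcm}(m,n)^s}=\frac{\zeta(s)^3}{\zeta(2s)}.$$

Proof. For any $M\in\mathbb{N}^+$ of the form $M=p_1^{\alpha_1}\cdots p_k^{\alpha_k}$, the number of solutions of $\text{lcm}(n,m)=M$ is given by $(2\alpha_1+1)\cdots(2\alpha_k+1)$. It follows that the given series equals

$$\sum_{M\geq 1}\frac{1}{M^s}\prod_{p\mid M}\left(2\nu_p(M)+1\right)$$

and since $M\mapsto\prod_{p\mid M}(2\nu_p(M)+1)$ clearly is a multiplicative function, by Euler’s product

$$\sum_{m,n\geq 1}\frac{1}{\text{lcm}(m,n)^s}=\prod_{p\in\mathcal{P}}\left(1+\frac{3}{p^s}+\frac{5}{p^{2s}}+\frac{7}{p^{3s}}+\ldots\right)=\prod_{p\in\mathcal{P}}\frac{p^s(p^s+1)}{(p^s-1)^2}=\prod_{p\in\mathcal{P}}\frac{1-\frac{1}{p^{2s}}}{\left(1-\frac{1}{p^s}\right)^3}=\frac{\zeta(s)^3}{\zeta(2s)}.$$

$$\int_{-1}^{1}\frac{\log^2(1-x)}{\sqrt{1-x^2}}\,dx=\pi\sum_{n\geq 1}\frac{\binom{2n}{n}H_{2n-1}}{n\,4^n}=\frac{\pi^3}{3}+\pi\log^2(2).$$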

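Exercise 22. The analytic continuation for the Riemann $\zeta$ function to the region $\text{Re}(s)>0$ gives us the identity

$$\zeta\left(\tfrac{1}{2}\right)=-2+\sum_{k\geq 1}\frac{1}{\sqrt{k}\left(\sqrt{k}+\sqrt{k+1}\right)^2}.$$

$$\begin{aligned}\zeta\left(\tfrac{1}{2}\right)&=-\frac{3}{2}+\frac{1}{2}\sum_{k\geq 1}\frac{1}{\sqrt{k}\sqrt{k+1}\left(\sqrt{k}+\sqrt{k+1}\right)^3}\\&=-\frac{35}{24}-\frac{7}{96}\sum_{k\geq 1}\frac{1}{\sqrt{k}\,k(k+1)\sqrt{k+1}\left(\sqrt{k}+\sqrt{k+1}\right)^3}+\frac{1}{96}\sum_{k\geq 1}\frac{1}{\sqrt{k}\,k(k+1)\sqrt{k+1}\left(\sqrt{k}+\sqrt{k+1}\right)^7}.\end{aligned}$$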

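Exercise 23. Find a rational approximation of $\prod_{r\geq 1}\left(1+\frac{1}{2^r}\right)$ within $\frac{1}{100}$ from the exact value.

$$1+x>\left(\frac{1+\frac{2x}{3}}{1+\frac{x}{3}}\right)^3$$

$$\prod_{r\geq 1}\left(1+\frac{1}{2^r}\right)>\frac{3}{2}\cdot\frac{5}{4}\cdot\prod_{r\geq 3}\left(\frac{1+\frac{1}{3\cdot 2^{r-1}}}{1+\frac{1}{3\cdot 2^r}}\right)^3=\frac{15}{8}\left(1+\frac{1}{12}\right)^3>\frac{50}{21}.$$

Similarly

$$1+x<\frac{1+2x+\frac{4}{3}x^2+\frac{8}{7}x^3}{1+x+\frac{1}{3}x^2+\frac{1}{7}x^3}$$

implies

$$\prod_{r\geq 1}\left(1+\frac{1}{2^r}\right)<\frac{15}{8}\left(1+2x+\frac{4}{3}x^2+\frac{8}{7}x^3\right)\Bigg|_{x=1/8}<\frac{74}{31}.$$

The difference between the upper bound and the lower bound is already less than $7\cdot 10^{-3}$.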
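The two rational bounds can be compared with partial products computed exactly. In this sketch (the helper name `partial_product` is hypothetical) the partial products increase with $R$, and the infinite product exceeds the $R$-th partial product by a factor smaller than $e^{2^{-R}}$, so forty factors are plenty for the comparison.

```python
from fractions import Fraction


def partial_product(R: int) -> Fraction:
    """Exact value of prod_{r=1}^{R} (1 + 1/2^r)."""
    p = Fraction(1)
    for r in range(1, R + 1):
        p *= 1 + Fraction(1, 2 ** r)
    return p


if __name__ == "__main__":
    lower, upper = Fraction(50, 21), Fraction(74, 31)
    print(float(lower), float(partial_product(40)), float(upper))
    print(float(upper - lower))
```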


2 CONVOLUTIONS AND BALLOT PROBLEMS

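Exercise 24. Let $\Omega(n)$ be the function returning the number of prime factors of $n$, counted according to their multiplicity, such that, for instance, $\Omega(24)=\Omega(2^3\cdot 3)=3+1=4$. Prove that

$$\sum_{\substack{n\geq 1\\ \Omega(n)\text{ is even}}}\frac{1}{n^2}=\frac{7\pi^2}{60}.$$

$$\prod_{p\in\mathcal{P}}\left(1-\frac{1}{p^2}\right)^{-1}=\sum_{n\geq 1}\frac{1}{n^2},\qquad\prod_{p\in\mathcal{P}}\left(1+\frac{1}{p^2}\right)^{-1}=\sum_{n\geq 1}\frac{(-1)^{\Omega(n)}}{n^2},$$

$$\sum_{\substack{n\geq 1\\ \Omega(n)\text{ is even}}}\frac{1}{n^2}=\frac{1}{2}\left(\zeta(2)+\frac{\zeta(4)}{\zeta(2)}\right)=\frac{7\pi^2}{60}.$$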

We start this section by recalling a well-known identity:

Lemma 25.

$$\sum_{k=0}^{n}\binom{n}{k}^2=\binom{2n}{n}.$$

Proof. The first proof we provide is based on a double counting argument. Let us assume to have a parliament with $n$ politicians in the left wing and $n$ politicians in the right wing, and to be asked to count how many committees with $n$ politicians we may have. It is pretty clear such number is given by $\binom{2n}{n}$, i.e. the number of subsets with $n$ elements in a set with $2n$ elements. On the other hand, we may count such committees according to the number of politicians from the left wing ($k\in[0,n]$) in them. There are $\binom{n}{k}$ ways for choosing $k$ politicians of the left wing from the $n$ politicians we have. If in a committee there are $k$ politicians from the left wing, there are $n-k$ politicians from the right wing, and we have $\binom{n}{n-k}=\binom{n}{k}$ ways for selecting them. It follows that:

$$\binom{2n}{n}=\sum_{k=0}^{n}\binom{n}{k}\binom{n}{n-k}=\sum_{k=0}^{n}\binom{n}{k}^2.$$

$$(f*g)(n)\overset{\text{def}}{=}\sum_{k=0}^{n}f(k)\cdot g(n-k)$$

Since $\binom{n}{k}$ is the coefficient of $x^k$ in the Taylor series of $(1+x)^n$ at the origin³,

$$(1+x)^n=\sum_{k=0}^{n}\binom{n}{k}x^k\quad\Longrightarrow\quad[x^k](1+x)^n=\binom{n}{k}$$

³ The notation $[x^k]\,f(x)$ stands for the coefficient of $x^k$ in the Taylor/Laurent series of $f(x)$ at the origin.


implies that:

$$\sum_{k=0}^{n}\binom{n}{k}\binom{n}{n-k}=\sum_{k=0}^{n}[x^k](1+x)^n\cdot[x^{n-k}](1+x)^n=[x^n]\left[(1+x)^n\cdot(1+x)^n\right]=[x^n](1+x)^{2n}=\binom{2n}{n}.$$

The second approach leads to a nice generalization of the first identity in the current section:

$$\sum_{j+k=n}\binom{a}{j}\binom{b}{k}=\binom{a+b}{n}.$$

In the introduced convolution context the last identity simply follows from the trivial $(1+x)^a\cdot(1+x)^b=(1+x)^{a+b}$.

We may notice that

$$\sum_{n\geq 0}c(n)\,x^n\cdot\sum_{n\geq 0}d(n)\,x^n=\sum_{n\geq 0}(c*d)(n)\,x^n$$

is the Cauchy product between two power series. The interplay between analytic and combinatorial arguments allows us to prove interesting things. For instance we may consider the function $f(x)=\sqrt{1-x}=(1-x)^{1/2}$, analytic in a neighbourhood of the origin. It is not difficult to compute its Taylor series by the extended binomial theorem. Moreover $f(x)^2=(1-x)$ has a trivial Taylor series, hence by defining $a(n)$ as the coefficient of $x^n$ in the Taylor series of $f(x)$, $(a*a)(n)$ always takes values in $\{-1,0,1\}$.

$$\begin{aligned}(1-x)^{1/2}&=\sum_{n\geq 0}\binom{1/2}{n}(-1)^n x^n=\sum_{n\geq 0}(-1)^n\,\frac{\frac{1}{2}\cdot\left(\frac{1}{2}-1\right)\cdot\ldots\cdot\left(\frac{1}{2}-n+1\right)}{n!}\,x^n\\&=\sum_{n\geq 0}(-1)^n\,\frac{1\cdot(1-2)\cdot\ldots\cdot(1-2n+2)}{2^n\,n!}\,x^n\\&=1-\sum_{n\geq 1}\frac{(2n-1)!!}{(2n-1)\cdot(2n)!!}\,x^n=1-\sum_{n\geq 1}\frac{(2n)!}{(2n-1)\cdot(2n)!!^2}\,x^n\\&=1-\sum_{n\geq 1}\binom{2n}{n}\frac{x^n}{4^n(2n-1)}\end{aligned}$$

$$\frac{1}{\sqrt{1-x}}=\sum_{n\geq 0}\binom{2n}{n}\frac{x^n}{4^n}$$

follows, and since $\frac{1}{1-x}=1+x+x^2+x^3+\ldots$, if we set $a(n)=\frac{1}{4^n}\binom{2n}{n}$ we have $(a*a)(n)=1$, i.e.:

Lemma 27.

$$\sum_{k=0}^{n}\binom{2k}{k}\binom{2n-2k}{n-k}=4^n.$$
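The convolution identities met so far (Lemma 25, its two-parameter generalization and Lemma 27) can all be tested with one small helper. The sketch below uses illustrative names; `convolve` is a direct transcription of the definition of $(f*g)(n)$.

```python
from math import comb


def convolve(f, g, n: int) -> int:
    """(f * g)(n) = sum_{k=0}^{n} f(k) g(n-k)."""
    return sum(f(k) * g(n - k) for k in range(n + 1))


def central(k: int) -> int:
    """Central binomial coefficient binom(2k, k)."""
    return comb(2 * k, k)


if __name__ == "__main__":
    n = 6
    print(convolve(central, central, n), 4 ** n)
    print(convolve(lambda k: comb(n, k), lambda k: comb(n, k), n), comb(2 * n, n))
    print(convolve(lambda j: comb(5, j), lambda k: comb(7, k), n), comb(12, n))
```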

$$\frac{1}{1-x}\sum_{n\geq 0}a(n)\,x^n=\sum_{n\geq 0}(a*1)(n)\,x^n=\sum_{n\geq 0}\left(\sum_{k=0}^{n}a(k)\right)x^n$$

$$\frac{1}{(1-x)\sqrt{1-x}}=2\,\frac{d}{dx}\frac{1}{\sqrt{1-x}}=2\sum_{n\geq 1}\binom{2n}{n}\frac{n\,x^{n-1}}{4^n}=\sum_{n\geq 0}\binom{2n+2}{n+1}\frac{2n+2}{4^{n+1}}\,x^n.$$

By comparing the last two RHSs we have an alternative proof of an identity claimed by Exercise 6:

$$\sum_{n=0}^{N}\binom{2n}{n}\frac{1}{4^n}=\binom{2N+2}{N+1}\frac{N+1}{2^{2N+1}}.$$

$$\frac{1}{(1-x)^{k+1}}=\sum_{n\geq 0}\binom{n+k}{k}x^n.$$

Proof. We may tackle this question both in a combinatorial and in an analytic way. The coefficient of $x^n$ in the product of $(k+1)$ terms of the form $(1+x+x^2+\ldots)$ is given by the number of ways for writing $n$ as the sum of $k+1$ natural numbers. By stars and bars we know the number of ways for writing $n$ as the sum of $k+1$ positive natural numbers is $\binom{n-1}{k}$ and it is not difficult to finish from there. As an alternative, we may proceed by induction on $k$. The claim is trivial in the $k=0$ case, and since

$$\frac{1}{(1-x)^{k+2}}=\frac{1}{1-x}\cdot\frac{1}{(1-x)^{k+1}}=\sum_{n\geq 0}\left(\binom{n+k}{k}*1\right)x^n$$

$$\sum_{j=0}^{n}\binom{k+j}{k}=\binom{n+k+1}{k+1}.$$

Exercise 29. Given the sequences of Fibonacci and Lucas numbers {Fn }n≥0 and {Ln }n≥0 ,

prove the following convolution identity:

$$\sum_{k=0}^{n}F_k\,F_{n-k}=\frac{nL_n-F_n}{5}.$$

Proof. Since Fibonacci numbers fulfill the relation $F_{n+2}=F_{n+1}+F_n$, by defining their generating function as

$$f(x)=\sum_{n\ge 0}F_n\,x^n$$

we have that $(1-x-x^2)\cdot f(x)$ is a linear polynomial (a similar idea leads to the Berlekamp-Massey algorithm). On the other hand, if

$$f(x)=\sum_{n\ge 0}F_n\,x^n=\frac{ax+b}{1-x-x^2}$$

$b=0$ has to hold to grant $f(0)=F_0=0$ and $a=1$ has to hold to grant $f'(0)=F_1=1$. It follows that:

$$f(x)=\frac{x}{1-x-x^2}=\frac{1}{\sqrt5}\left(\frac{1}{1-\varphi x}-\frac{1}{1-\bar\varphi x}\right),\qquad \varphi=\frac{1+\sqrt5}{2},\quad\bar\varphi=\frac{1-\sqrt5}{2}$$

and by computing the Taylor series (that is a geometric series) of $\frac{1}{1-\varphi x}$ and $\frac{1}{1-\bar\varphi x}$ we immediately recover Binet’s formula

$$F_n=\frac{\varphi^n-\bar\varphi^n}{\sqrt5}.$$


The identity $L_n=\varphi^n+\bar\varphi^n$ has a similar proof. By starting the convolution machinery:

$$\sum_{k=0}^{n}F_k\,F_{n-k}=[x^n]\left(\frac{x}{1-x-x^2}\right)^2=\frac15\,[x^n]\left(\frac{1}{(1-\varphi x)^2}+\frac{1}{(1-\bar\varphi x)^2}-\frac{2}{(1-\varphi x)(1-\bar\varphi x)}\right)$$

and the claim follows from partial fraction decomposition. To find a combinatorial proof is an exercise left to the reader: we recall that Fibonacci numbers are related to subsets of $\{1,2,\ldots,n\}$ without consecutive elements.
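The convolution identity of Exercise 29 is easy to test with integer arithmetic. The sketch below assumes the usual initial values $F_0=0$, $F_1=1$, $L_0=2$, $L_1=1$; the helper names are hypothetical.

```python
def fibonacci_and_lucas(n_max: int):
    """Return the lists [F_0..F_n_max] and [L_0..L_n_max] (n_max >= 1)."""
    F, L = [0, 1], [2, 1]
    for _ in range(n_max - 1):
        F.append(F[-1] + F[-2])
        L.append(L[-1] + L[-2])
    return F, L

def fibonacci_convolution(F, n: int) -> int:
    """Additive convolution (F * F)(n) = sum_{k=0}^{n} F_k F_{n-k}."""
    return sum(F[k] * F[n - k] for k in range(n + 1))

if __name__ == "__main__":
    F, L = fibonacci_and_lucas(10)
    for n in range(11):
        # The identity is checked in the division-free form 5 (F*F)(n) = n L_n - F_n.
        print(n, 5 * fibonacci_convolution(F, n), n * L[n] - F[n])
```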

A note in mathematical folklore: Alon’s Combinatorial Nullstellensatz has further tightened the interplay between combinatorial arguments and generating functions arguments. We invite the reader to delve into the bibliography to find a generalization of Cauchy-Davenport’s theorem, once known as the Erdős-Heilbronn conjecture, now known as Da Silva-Hamidoune’s Theorem:

Theorem 30 (Da Silva, Hamidoune). If $A\subseteq\mathbb{F}_p$ and $A\oplus A\stackrel{\text{def}}{=}\{a+a' : a,a'\in A,\ a\neq a'\}$, we have:

$$|A\oplus A|\ge\min\left(p,\,2|A|-3\right).$$

The convolution machinery applies very well to another kind of coefficients given by Catalan numbers. We introduce

them in a combinatorial fashion, assuming to have two people involved in a ballot and to check the votes one by one.

Theorem 31 (Bertrand’s ballot problem). If the winning candidate gets $A$ votes and the loser gets $B$ votes (so we are clearly assuming $A>B$), the probability that the winning candidate had the lead during the whole scrutiny equals:

$$\frac{A-B}{A+B}.$$

Proof. The final outcome is so simple due to a slick symmetry argument, applied in a double counting framework: instead of trying to understand what happens or might happen once a single vote is checked, it is more effective to consider which orderings of the votes favour A or not. Let us consider just the first vote: if it is a vote for B, at some point of the scrutiny there must be a tie, since the winning candidate is A. If the first vote is for A and at some point of the scrutiny there is a tie, by switching the votes for A and for B up to the first tie we return to the previous situation. It is pretty clear that the probability the first vote is a vote for B is $\frac{B}{A+B}$. It follows that the probability of a tie happening during the scrutiny is $\frac{2B}{A+B}$, and the probability that A always leads is:

$$1-\frac{2B}{A+B}=\frac{A-B}{A+B}.$$
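For small values of $A$ and $B$ the ballot theorem can be confirmed by brute force. This is just a sketch that enumerates every ordering of the votes (so it is only practical for a dozen votes or so); the function name is an assumption of this example.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def always_ahead_probability(a: int, b: int) -> Fraction:
    """Probability that the winner (a votes) is strictly ahead during the whole count."""
    total = a + b
    good = 0
    # Each ordering is identified by the positions of the b votes for the loser.
    for positions in combinations(range(total), b):
        loser_positions = set(positions)
        lead = 0
        for i in range(total):
            lead += -1 if i in loser_positions else 1
            if lead <= 0:        # a tie (or worse) has occurred
                break
        else:
            good += 1
    return Fraction(good, comb(total, b))

if __name__ == "__main__":
    print(always_ahead_probability(5, 3), Fraction(5 - 3, 5 + 3))
```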

Theorem 32 (Catalan numbers). The number of strings made by $n$ characters 0 and $n$ characters 1, with the further property that no initial substring has more 1s than 0s, is:

$$C_n=\frac{1}{n+1}\binom{2n}{n}.$$

Proof. Any string with the given property can be associated (in a bijective way) with a path on an $n\times n$ grid, starting in the bottom left corner and ending in the upper right corner, made by unit steps towards East (for each 1 character) or North (for each 0 character) and never crossing the SW-NE diagonal (this translates the substrings constraint). These paths can be associated in a bijective way with ballots that end in a tie, in which at every moment of the scrutiny the votes for B are $\le$ the votes for A. If a deus ex machina adds an extra vote for A before the scrutiny begins, we have a situation in which A gets $n+1$ votes, B gets $n$ votes and A is always ahead of B. There are $\binom{2n+1}{n}=\binom{2n+1}{n+1}$ possible scrutinies in which A gets $n+1$ votes and B gets $n$ votes: by the previous result (Bertrand’s ballot problem) the number of the wanted strings is given by:

$$\frac{(n+1)-n}{(n+1)+n}\binom{2n+1}{n}=\frac{1}{2n+1}\binom{2n+1}{n}=\frac{1}{n+1}\binom{2n}{n}.$$

For a slightly different perspective on the same subject, the reader is invited to have a look at Josef Rukavicka’s

“third proof” on the Wikipedia page about Catalan numbers.

Theorem 33. Given two natural numbers $a$ and $b$ with $a\ge b$, the following identity holds:

$$\sum_{k=0}^{b}\frac{1}{k+1}\binom{2k}{k}\,\frac{a-b}{a+b-2k}\binom{a+b-2k}{b-k}=\frac{1+a-b}{1+a+b}\binom{a+b+1}{b}.$$

Proof. It is enough to count scrutinies for a ballot between two candidates A and B, with A getting a votes, B getting

b votes and A being always ahead of B, without excluding the chance of a tie at some point. We get the RHS by

adding an extra vote for A before the scrutiny begins and mimicking the previous proof. On the other hand we may

count such scrutinies according to the last moment in which we have a tie. If the last tie happens when 2k votes have

been checked, we simply need to assign a − k votes for A and b − k votes for B: what happens before the tie can be

accounted through Catalan numbers and what happens next through Bertrand’s ballot problem. This leads to the

LHS.

The last identity is a particular case of a remarkable generalization of Vandermonde’s identity $\binom{m+n}{r}=\sum_{k=0}^{r}\binom{m}{k}\binom{n}{r-k}$:

Theorem 34 (Rothe-Hagen).

$$\sum_{k=0}^{n}\frac{x}{x+kz}\binom{x+kz}{k}\,\frac{y}{y+(n-k)z}\binom{y+(n-k)z}{n-k}=\frac{x+y}{x+y+nz}\binom{x+y+nz}{n}.$$

Proof. This identity is usually proved through generating functions and that approach is not terribly difficult. We may point out that a purely combinatorial proof is also possible, by following the lines of the previous proof. It is enough to slightly modify the constraint “at any point, the votes for B are $\le$ the votes for A” by replacing it with something involving the ratio of such votes. This is surprising both for experts and for newbies: Michael Spivey has written an interesting lecture about it on his blog.

Exercise 35 (Balanced parentheses). How many strings with $2n$ characters over the alphabet $\Sigma=\{(,)\}$ have as many open parentheses as closed parentheses, and in every initial substring the number of closed parentheses is always $\le$ the number of open parentheses?

Exercise 36 (Sub-diagonal paths). Let us consider the paths from (0; 0) to (n; n) where each step is a unit step

towards East or North. How many such paths belong to the region y ≤ x?

partition in almost-disjoint triangles, with the property that every triangle has its vertices on the boundary of the

original polygon. How many triangulations are there for a convex polygon with n + 2 sides?


Exercise 38 (Complete binary trees). A tree is a connected, undirected and acyclic graph. It is said to be binary and complete if each vertex has two neighbours (in such a case it is an inner node) or no neighbours (in such a case it is a leaf). How many complete binary trees with $n$ inner nodes are there?

It is not difficult to prove the above claims by exhibiting three combinatorial bijections:

$$TP\longleftrightarrow CBT\longleftrightarrow BP\longleftrightarrow SDP.$$

Each of these counts therefore equals the Catalan number

$$C_n=\frac{1}{n+1}\binom{2n}{n}$$

where the sub-diagonal paths interpretation proves the identity

$$C_{n+1}=\sum_{k=0}^{n}C_k\,C_{n-k}$$

in a very straightforward way. Such identity is a convolution formula: if we set $c(x)=\sum_{n\ge 0}C_n\,x^n$, we have $c(x)=1+x\cdot c(x)^2$. By solving such quadratic equation in $c(x)$ we get:

$$c(x)=\frac{1-\sqrt{1-4x}}{2x}$$

(the sign of the square root is chosen in such a way that $c(x)$ is continuous at the origin) hence the coefficients of the power series $c(x)$ can be computed from the extended binomial theorem, extended since it is applied to $(1-x)^{\alpha}$ with $\alpha=\frac12\notin\mathbb{N}$.
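The convolution recurrence and the closed form can be compared directly. A minimal sketch (the function name is illustrative), starting from $C_0=1$:

```python
from math import comb

def catalan_by_convolution(n_max: int):
    """Build [C_0, ..., C_n_max] from C_{n+1} = sum_{k=0}^{n} C_k C_{n-k}, C_0 = 1."""
    C = [1]
    for n in range(n_max):
        C.append(sum(C[k] * C[n - k] for k in range(n + 1)))
    return C

if __name__ == "__main__":
    C = catalan_by_convolution(10)
    for n, value in enumerate(C):
        # Closed form: C_n = binom(2n, n) / (n + 1), always an integer.
        print(n, value, comb(2 * n, n) // (n + 1))
```

The recurrence is exactly the coefficient-wise reading of $c(x)=1+x\cdot c(x)^2$, so agreement with the closed form is a check of the generating function computation as well.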

$$\sum_{k=0}^{n}\binom{2n-2k}{n-k}\binom{2k}{k}\frac{1}{2k+1}=\frac{16^n}{(2n+1)\binom{2n}{n}}=4^n\int_0^{\pi/2}\cos(x)^{2n+1}\,dx.$$

The last identity is not trivial at all, and it has very deep consequences. A possible proof of such identity (that is not the most elementary one: to find an elementary proof is an exercise we leave to the reader) exploits a particular class of orthogonal polynomials. We have not introduced the $L^2$ space yet, nor the usual techniques for dealing with square-integrable objects, so such “advanced” proof is postponed to the end of this section, in a dedicated box. What we can say through the convolution machinery is that the above identity is related to a Taylor coefficient in the product between $\arcsin(x)$ and its derivative $\frac{1}{\sqrt{1-x^2}}$. By termwise integration it implies:

$$\arcsin^2(x)=\frac12\sum_{n\ge 1}\frac{(4x^2)^n}{n^2\binom{2n}{n}}$$

and by evaluating the last identity at $x=\frac12$ we get that:

$$\sum_{n\ge 1}\frac{3}{n^2\binom{2n}{n}}=6\arcsin^2\left(\frac12\right)=\frac{\pi^2}{6}$$

where the LHS equals $\zeta(2)$ by creative telescoping, as seen in the previous section, where we proved $\zeta(4)=\frac25\,\zeta(2)^2$ too. It follows that:

$$\zeta(2)=\sum_{n\ge 1}\frac{1}{n^2}=\frac{\pi^2}{6},\qquad\zeta(4)=\sum_{n\ge 1}\frac{1}{n^4}=\frac{\pi^4}{90}.$$

We have just solved Basel problem through a very creative approach, i.e. by combining creative telescoping with

convolution identities for Catalan-ish numbers. Plenty of other approaches are presented in a forthcoming section.
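The gain in convergence speed is easy to observe numerically. The following sketch compares partial sums of $\sum 1/n^2$ with partial sums of $\sum 3/\big(n^2\binom{2n}{n}\big)$ in floating point; function names are illustrative only.

```python
from math import comb, pi

def plain_zeta2(terms: int) -> float:
    """Partial sum of sum_{n>=1} 1/n^2."""
    return sum(1.0 / (n * n) for n in range(1, terms + 1))

def accelerated_zeta2(terms: int) -> float:
    """Partial sum of sum_{n>=1} 3 / (n^2 * binom(2n, n))."""
    return sum(3.0 / (n * n * comb(2 * n, n)) for n in range(1, terms + 1))

if __name__ == "__main__":
    target = pi ** 2 / 6
    for terms in (5, 10, 20, 30):
        print(terms, abs(plain_zeta2(terms) - target), abs(accelerated_zeta2(terms) - target))
```

The terms of the second series decay roughly like $4^{-n}$, so a few dozen terms already reach machine precision, while the plain series still has an error of order $1/n$.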

$$\sum_{n\ge 0}(-1)^n a_n=\sum_{n\ge 0}(-1)^n\frac{\Delta^n a_0}{2^{n+1}},\qquad\Delta^n a_0=\sum_{k=0}^{n}(-1)^k\binom{n}{k}a_{n-k}.$$

This identity is simple to prove and it is really important in series manipulation and numerical computation: for instance, it is the core of Van Wijngaarden’s algorithm for the numerical evaluation of series with alternating signs.

Let us study a consequence of Euler’s acceleration technique, applied to:

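$$\sum_{n\ge 0}\frac{(-1)^n}{2n+1}=\sum_{n\ge 0}\int_0^1(-1)^n x^{2n}\,dx=\int_0^1\frac{dx}{1+x^2}=\arctan(1)=\frac{\pi}{4}.$$

We have:

$$\begin{aligned}
\Delta^n a_0&=\sum_{k=0}^{n}\binom{n}{k}\frac{(-1)^{n-k}}{2k+1}=(-1)^n\int_0^1\sum_{k=0}^{n}\binom{n}{k}(-x^2)^k\,dx\\
&=(-1)^n\int_0^1(1-x^2)^n\,dx=(-1)^n\frac{4^n}{(2n+1)\binom{2n}{n}}
\end{aligned}$$

hence:

$$\pi=\sum_{n\ge 0}\frac{2^{n+1}}{(2n+1)\binom{2n}{n}}$$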

where the main term of the last series behaves like $\frac{1}{2^n}\sqrt{\frac{\pi}{n}}$ for $n\to+\infty$, with a significant boost for the convergence speed of the original series (see footnote 4). About the series defining $\zeta(2)$, Euler’s acceleration technique leads to:

$$\sum_{n\ge 1}\frac{H_n}{n\,2^n}=\frac{\zeta(2)}{2}=\frac{\pi^2}{12}.$$

By recalling $H_n=\sum_{m\ge 1}\frac{n}{m(m+n)}$, the last identity turns out to be equivalent to

$$\frac{\pi^2}{12}=\int_0^1\frac{-\log(1-x^2)}{x}\,dx,$$

that is trivial by applying termwise integration to the Taylor series at the origin of $\frac{-\log(1-x^2)}{x}$.

4 By applying Euler’s acceleration technique to the Taylor series of the arctangent function we get something equivalent to the functional identity $\arctan x=\arcsin\frac{x}{\sqrt{1+x^2}}$.
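Euler’s acceleration technique can be tried out on the Leibniz series with exact rational arithmetic. The sketch below follows the convention $\Delta^n a_0=\sum_{k}(-1)^k\binom{n}{k}a_{n-k}$ used above; the function names are assumptions of this example, not part of the notes.

```python
from fractions import Fraction
from math import comb, pi

def forward_difference(a, n: int) -> Fraction:
    """Delta^n a_0 = sum_{k=0}^{n} (-1)^k binom(n,k) a(n-k)."""
    return sum((-1) ** k * comb(n, k) * a(n - k) for k in range(n + 1))

def plain_partial_sum(a, terms: int) -> Fraction:
    """sum_{n < terms} (-1)^n a(n)."""
    return sum((-1) ** n * a(n) for n in range(terms))

def euler_accelerated_sum(a, terms: int) -> Fraction:
    """sum_{n < terms} (-1)^n Delta^n a_0 / 2^(n+1)."""
    return sum((-1) ** n * forward_difference(a, n) / 2 ** (n + 1) for n in range(terms))

def leibniz_coefficient(n: int) -> Fraction:
    """a_n = 1/(2n+1), so that sum (-1)^n a_n = pi/4."""
    return Fraction(1, 2 * n + 1)

if __name__ == "__main__":
    for terms in (5, 10, 20, 40):
        plain = 4 * float(plain_partial_sum(leibniz_coefficient, terms))
        fast = 4 * float(euler_accelerated_sum(leibniz_coefficient, terms))
        print(terms, abs(plain - pi), abs(fast - pi))
```

With the same number of terms the transformed series gains roughly one binary digit per term, in agreement with the $2^{-n}$ behaviour of its main term.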


A convolution involving the Riemann ζ function.

We now present a result about a ubiquitous Euler sum:

$$\sum_{m=1}^{\infty}\frac{H_m}{m^q}=\frac{q+2}{2}\,\zeta(q+1)-\frac12\sum_{j=1}^{q-2}\zeta(q-j)\,\zeta(j+1).$$

Proof.

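$$\begin{aligned}
&\sum_{j=0}^{k}\zeta(k+2-j)\,\zeta(j+2)\\
\text{(expand }\zeta\text{)}\quad&=\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\sum_{j=0}^{k}\frac{1}{m^{k+2-j}\,n^{j+2}}\\
\text{(pull out the terms for }m=n\text{ and use the formula for finite geometric sums on the rest)}\quad&=(k+1)\zeta(k+4)+\sum_{\substack{m,n=1\\ m\neq n}}^{\infty}\frac{1}{m^2n^2}\cdot\frac{\frac{1}{m^{k+1}}-\frac{1}{n^{k+1}}}{\frac1m-\frac1n}\\
\text{(simplify terms)}\quad&=(k+1)\zeta(k+4)+\sum_{\substack{m,n=1\\ m\neq n}}^{\infty}\left(\frac{1}{n\,m^{k+2}(n-m)}-\frac{1}{m\,n^{k+2}(n-m)}\right)\\
\text{(exploit symmetry)}\quad&=(k+1)\zeta(k+4)+2\sum_{m=1}^{\infty}\sum_{n=m+1}^{\infty}\left(\frac{1}{n\,m^{k+2}(n-m)}-\frac{1}{m\,n^{k+2}(n-m)}\right)\\
(n\mapsto n+m\text{ and switch sums)}\quad&=(k+1)\zeta(k+4)+2\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\left(\frac{1}{(n+m)\,m^{k+2}\,n}-\frac{1}{m\,(n+m)^{k+2}\,n}\right)
\end{aligned}$$

By exploiting $\frac{1}{mn}=\frac{1}{n(m+n)}+\frac{1}{m(m+n)}$ we get:

$$(k+1)\zeta(k+4)+2\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\left(\frac{1}{m^{k+3}\,n}-\frac{1}{(m+n)\,m^{k+3}}\right)-2\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\left(\frac{1}{m(n+m)^{k+3}}+\frac{1}{n(n+m)^{k+3}}\right)$$

and since $H_m=\sum_{n\ge 1}\left(\frac1n-\frac{1}{n+m}\right)$, by exploiting the symmetry of $\frac{1}{m(n+m)^{k+3}}+\frac{1}{n(n+m)^{k+3}}$ we get:

$$\begin{aligned}
&(k+1)\zeta(k+4)+2\sum_{m=1}^{\infty}\frac{H_m}{m^{k+3}}-4\sum_{n=1}^{\infty}\sum_{m=1}^{\infty}\frac{1}{n(n+m)^{k+3}}\\
(m\mapsto m-n)\quad&=(k+1)\zeta(k+4)+2\sum_{m=1}^{\infty}\frac{H_m}{m^{k+3}}-4\sum_{n=1}^{\infty}\sum_{m=n+1}^{\infty}\frac{1}{n\,m^{k+3}}\\
\text{(reintroducing terms)}\quad&=(k+1)\zeta(k+4)+2\sum_{m=1}^{\infty}\frac{H_m}{m^{k+3}}-4\sum_{n=1}^{\infty}\sum_{m=n}^{\infty}\frac{1}{n\,m^{k+3}}+4\zeta(k+4)\\
\text{(switching sums)}\quad&=(k+5)\zeta(k+4)+2\sum_{m=1}^{\infty}\frac{H_m}{m^{k+3}}-4\sum_{m=1}^{\infty}\sum_{n=1}^{m}\frac{1}{n\,m^{k+3}}\\
&=(k+5)\zeta(k+4)+2\sum_{m=1}^{\infty}\frac{H_m}{m^{k+3}}-4\sum_{m=1}^{\infty}\frac{H_m}{m^{k+3}}\\
\text{(combining sums)}\quad&=(k+5)\zeta(k+4)-2\sum_{m=1}^{\infty}\frac{H_m}{m^{k+3}}
\end{aligned}$$

Letting $q=k+3$ and reindexing $j\mapsto j-1$ yields

$$\sum_{j=1}^{q-2}\zeta(q-j)\,\zeta(j+1)=(q+2)\,\zeta(q+1)-2\sum_{m=1}^{\infty}\frac{H_m}{m^q}.$$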
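The final identity can be sanity-checked in floating point. The sketch below is only illustrative: `zeta` is a hypothetical helper that approximates $\zeta(s)$ for integer $s\ge 2$ by a partial sum plus a two-term Euler-Maclaurin tail correction, and the Euler sums are truncated, so agreement is expected only up to the truncation error.

```python
from math import pi

def zeta(s: int, terms: int = 2000) -> float:
    """Approximate zeta(s), s >= 2: partial sum plus an Euler-Maclaurin tail estimate."""
    partial = sum(1.0 / n ** s for n in range(1, terms + 1))
    return partial + terms ** (1 - s) / (s - 1) - 0.5 * terms ** (-s)

def euler_sum_partial(q: int, terms: int = 20000) -> float:
    """Truncation of sum_{m>=1} H_m / m^q."""
    harmonic, total = 0.0, 0.0
    for m in range(1, terms + 1):
        harmonic += 1.0 / m
        total += harmonic / m ** q
    return total

def euler_sum_closed_form(q: int) -> float:
    """(q+2)/2 * zeta(q+1) - 1/2 * sum_{j=1}^{q-2} zeta(q-j) * zeta(j+1)."""
    cross = sum(zeta(q - j) * zeta(j + 1) for j in range(1, q - 1))
    return (q + 2) / 2 * zeta(q + 1) - 0.5 * cross

if __name__ == "__main__":
    for q in (2, 3, 4, 5):
        print(q, euler_sum_partial(q), euler_sum_closed_form(q))
    print("q = 3 closed form vs pi^4/72:", euler_sum_closed_form(3), pi ** 4 / 72)
```

For $q=2$ the inner sum is empty and the formula reduces to $\sum H_m/m^2=2\,\zeta(3)$; the truncated sum converges slowly in that case, since its tail decays only like $\log(N)/N$.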


Exercise 42. Do we get something interesting (like an approximated functional identity for the $\zeta$ function) from the previous convolution identity, by recalling that

$$H_m=\log(m)+\gamma+\frac{1}{2m}-\frac{1}{12m^2}+\frac{1}{120m^4}-\frac{1}{252m^6}+\ldots$$

and that

$$\sum_{m\ge 1}\frac{\log m}{m^q}=-\zeta'(q)=\sum_{m\ge 1}\frac{1}{m^q}\sum_{d|m}\Lambda(d)\ ?$$

Exercise 43. By exploiting $H_m=\sum_{n\ge 1}\left(\frac1n-\frac{1}{n+m}\right)$ and symmetry prove that

$$\sum_{m\ge 1}\frac{H_m}{m^2}=2\,\zeta(3)=2\sum_{m\ge 1}\frac{1}{m^3}.$$

Proof.

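$$\begin{aligned}
\sum_{m\ge 1}\frac{H_m}{m^2}&=\sum_{m\ge 1}\sum_{n\ge 1}\frac{1}{(m+n)\,mn}=\int_0^1\sum_{m,n\ge 1}\frac{x^{m+n-1}}{mn}\,dx=\int_0^1\frac{\log^2(1-x)}{x}\,dx\\
&=\int_0^1\frac{\log^2(x)}{1-x}\,dx=\sum_{n\ge 0}\int_0^1x^n\log^2(x)\,dx=\sum_{n\ge 0}\frac{2}{(n+1)^3}.
\end{aligned}$$

$$\sum_{n\ge 1}\frac{H_n^2}{n(n+1)}=3\,\zeta(3).$$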

Vandermonde’s identity and Bessel functions. Bessel functions are important mathematical functions: they are associated with the coefficients of the Fourier series of some inverse trigonometric functions and they arise in the study of the propagation of waves, like in the vibrating drum problem. Bessel functions of the first kind with integer order can be simply defined by giving their Taylor series at the origin:

$$J_n(z)=\sum_{l\ge 0}\frac{(-1)^l}{2^{2l+n}\,l!\,(n+l)!}\,z^{2l+n}$$

from which it is trivial that $J_n(z)$ is an entire function, a solution of the differential equation $z^2f''+zf'+(z^2-n^2)f=0$ and much more. In this paragraph we will see how Vandermonde’s identity plays a major role in dealing with the square of a Bessel function of the first kind.


Exercise 45. Prove the identity:

$$J_n^2(z)=\frac{2}{\pi}\int_0^{\pi/2}J_{2n}(2z\cos(\theta))\,d\theta.$$

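$$J_n(x)=\sum_{a\ge 0}\frac{(-1)^a}{a!\,(a+n)!}\left(\frac{x}{2}\right)^{2a+n}$$

hence:

$$J_n^2(x)=\sum_{m\ge 0}\sum_{a+b=m}\frac{(-1)^m(x/2)^{2m+2n}}{a!\,b!\,(a+n)!\,(b+n)!}$$

where:

$$\begin{aligned}
\sum_{a+b=m}\frac{1}{a!\,b!\,(a+n)!\,(b+n)!}&=\frac{1}{m!\,(m+2n)!}\sum_{a+b=m}\binom{m}{a}\binom{m+2n}{b+n}\\
&=\frac{1}{m!\,(m+2n)!}\,[x^{m+n}]\,(1+x)^m(1+x)^{m+2n}\\
&=\frac{1}{m!\,(m+2n)!}\binom{2m+2n}{m+n}\qquad(\clubsuit)
\end{aligned}$$

leads to:

$$J_n^2(x)=\sum_{m\ge 0}\frac{(-1)^m(x/2)^{2m+2n}}{m!\,(m+2n)!}\binom{2m+2n}{m+n}.\qquad(\heartsuit)$$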

Since $\frac{2}{\pi}\int_0^{\pi/2}\cos^{2h}(\theta)\,d\theta=\frac{1}{4^h}\binom{2h}{h}$ follows from De Moivre’s formula, in order to prove the claim it is enough to expand $J_{2n}(2z\cos\theta)$ as a power series in $2z\cos\theta$, perform termwise integration and exploit $(\heartsuit)$. The claim is ultimately a consequence of Vandermonde’s identity proved in $(\clubsuit)$.
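The series for $J_n^2$ and the integral representation can be compared numerically using nothing more than the Taylor series of $J_n$. This is a rough sketch: the truncation at 40 terms and the midpoint rule with 200 nodes are arbitrary choices of this example, adequate only for moderate arguments.

```python
from math import comb, cos, factorial, pi

def bessel_j(n: int, x: float, terms: int = 40) -> float:
    """Truncated Taylor series of J_n(x) at the origin."""
    return sum((-1) ** a * (x / 2) ** (2 * a + n) / (factorial(a) * factorial(a + n))
               for a in range(terms))

def bessel_j_squared(n: int, x: float, terms: int = 40) -> float:
    """Truncated series for J_n(x)^2 obtained through Vandermonde's identity."""
    return sum((-1) ** m * comb(2 * m + 2 * n, m + n) * (x / 2) ** (2 * m + 2 * n)
               / (factorial(m) * factorial(m + 2 * n))
               for m in range(terms))

def integral_side(n: int, z: float, nodes: int = 200) -> float:
    """(2/pi) * integral over (0, pi/2) of J_{2n}(2 z cos(theta)), midpoint rule."""
    h = (pi / 2) / nodes
    total = sum(bessel_j(2 * n, 2 * z * cos((i + 0.5) * h)) for i in range(nodes))
    return (2 / pi) * total * h

if __name__ == "__main__":
    for n in range(3):
        for z in (0.5, 1.0, 2.0):
            print(n, z, bessel_j(n, z) ** 2, bessel_j_squared(n, z), integral_side(n, z))
```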

This technique also shows that the Laplace transform (an important tool we will introduce soon) of $J_0^2(x)$ is related to the complete elliptic integral of the first kind (another object we will study in a forthcoming section) through the identity

$$\left(\mathcal{L}J_0^2\right)(s)=\frac{2}{\pi s}\,K\left(-\frac{4}{s^2}\right).$$

$$\binom{2n}{n}\ge\frac{4^n}{n+1}$$

is a trivial consequence of the Cauchy-Schwarz inequality.

Exercise 47. Prove that if $\varphi$ is the golden ratio $\frac{1+\sqrt5}{2}$, we have:

Hint: it might be useful to consider the rapidly convergent series $\sum_{n\ge 1}\frac{(-1)^{n+1}}{2n^2\binom{2n}{n}}$.


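$$\sum_{n\ge 0}\left(\frac{1}{(6n+1)^2}+\frac{1}{(6n+5)^2}\right)=\frac{\pi^2}{9}.$$

$$\int_0^{\pi/2}\log\left(1+\frac12\sin\theta\right)\frac{d\theta}{\sin\theta}=\frac{5\pi^2}{72}.$$

Exercise 50. Prove that $\log(3)>\frac{346}{315}$ follows from computing/approximating the integral

$$\int_0^1\frac{x^4(1-x^2)^2}{4-x^2}\,dx.$$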

There is another kind of convolution, that is known as multiplicative convolution or Dirichlet’s convolution. We say that a function $f:\mathbb{Z}^+\to\mathbb{C}$ is multiplicative when $\gcd(a,b)=1$ grants $f(ab)=f(a)\cdot f(b)$. A multiplicative function has to fulfill $f(1)=1$, and since $\mathbb{Z}$ is a UFD the values of a multiplicative function are fixed by the values of such function over prime powers. Common examples of multiplicative functions are the constant 1, the divisor function $d(n)=\sigma_0(n)$ and Euler’s totient function $\varphi(n)$. Given two multiplicative functions $f,g:\mathbb{Z}^+\to\mathbb{C}$, their Dirichlet convolution is defined through

$$(f*g)(n)=\sum_{d|n}f(d)\cdot g\left(\frac{n}{d}\right).$$

It is a simple but interesting exercise to prove that the convolution between two multiplicative functions still is a multiplicative function. Just like additive convolutions are related to products of power series, multiplicative convolutions are related to products of Dirichlet series. If we state that

$$L(f,s)=\sum_{n\ge 1}\frac{f(n)}{n^s}$$

is the Dirichlet series associated with $f$, the following analogue of Cauchy’s product holds:

$$L(f*g,s)=\sum_{n\ge 1}\frac{(f*g)(n)}{n^s}=\sum_{n\ge 1}\sum_{d|n}\frac{f(d)\,g\left(\frac{n}{d}\right)}{n^s}=L(f,s)\cdot L(g,s).$$

The main difference between additive and multiplicative convolutions is that in the multiplicative context, given $f(n)$ and $H(n)=\sum_{d|n}h(d)$, we can always find a multiplicative function $g$ such that $f*g=H$, and such function is unique. There is no additive analogue of the Möbius inversion formula allowing us to solve such a problem. The extraction process of a coefficient from a given power/Dirichlet series is similar and relies on the residue theorem, applied to the original function multiplied by $\frac{1}{x^h}$ or to the Laplace transform of the original function. Let us see how to use this machinery for solving actual problems.

Exercise 51. Prove that for any $n\in\mathbb{Z}^+$ the following identity holds:

$$n=\sum_{d|n}\varphi(d).$$


Proof. The usual combinatorial proof starts by considering the $n$-th roots of unity on the unit circle. Every $n$-th root of unity is a primitive $d$-th root of unity for some $d\mid n$, and the number of primitive $d$-th roots of unity is exactly $\varphi(d)$, so the claim follows from checking that no overcounting or undercounting occurs. With the multiplicative convolution machinery, we do not have to find a combinatorial interpretation for both sides, we just have to find the Dirichlet series associated with both sides. In equivalent terms, in order to show that $\mathrm{Id}=\varphi*1$ it is enough to compute:

$$L(\mathrm{Id},s)=\sum_{n\ge 1}\frac{n}{n^s}=\sum_{n\ge 1}\frac{1}{n^{s-1}}=\zeta(s-1),\qquad L(1,s)=\sum_{n\ge 1}\frac{1}{n^s}=\zeta(s)$$

then prove that $L(\varphi,s)=\frac{\zeta(s-1)}{\zeta(s)}$. By Euler’s product:

$$L(\varphi,s)=\prod_p\left(1+\frac{\varphi(p)}{p^s}+\frac{\varphi(p^2)}{p^{2s}}+\frac{\varphi(p^3)}{p^{3s}}+\ldots\right)=\prod_p\frac{p^s-1}{p^s-p}=\prod_p\frac{1-\frac{1}{p^s}}{1-\frac{1}{p^{s-1}}},$$

$$\zeta(s)=L(1,s)=\prod_p\left(1+\frac{1}{p^s}+\frac{1}{p^{2s}}+\frac{1}{p^{3s}}+\ldots\right)=\prod_p\left(1-\frac{1}{p^s}\right)^{-1}.$$

The keen reader might observe that the $\frac{1}{\zeta(s)}$ function played an important role in the previous proof, and ask about the multiplicative function associated to such Dirichlet series. Well, by defining $\omega(n)$ as the number of distinct prime factors of $n$ and $\mu(n)$ as

$$\mu(n)=\begin{cases}(-1)^{\omega(n)}&\text{if }n\text{ is square-free}\\0&\text{otherwise}\end{cases}$$

we have

$$L(\mu,s)=\sum_{n\ge 1}\frac{\mu(n)}{n^s}=\prod_p\left(1+\frac{\mu(p)}{p^s}\right)=\frac{1}{\zeta(s)}$$

as wanted. Then the trivial $1=\zeta(s)\cdot\frac{1}{\zeta(s)}$ leads to the following convolution identity:

$$\sum_{d|n}\mu(d)=\begin{cases}1&\text{if }n=1\\0&\text{otherwise.}\end{cases}$$

If

$$f(n)=\sum_{d|n}g(d)$$

then

$$g(n)=\sum_{d|n}\mu(d)\cdot f\left(\frac{n}{d}\right)$$

holds.

Proof. Let us denote with $\varepsilon$ the multiplicative function that equals 1 at $n=1$ and zero otherwise. Since $f=g*1$,

$$\mu*f=\mu*(g*1)=\mu*(1*g)=(\mu*1)*g=\varepsilon*g=g.$$
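Dirichlet convolution and Möbius inversion are straightforward to experiment with. The following sketch uses naive divisor enumeration (fine for small $n$, not meant to be efficient); all function names are choices of this example.

```python
from math import gcd

def divisors(n: int):
    """All positive divisors of n, by trial division."""
    return [d for d in range(1, n + 1) if n % d == 0]

def dirichlet_convolution(f, g):
    """Return the function n -> sum_{d | n} f(d) * g(n / d)."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

def mobius(n: int) -> int:
    """(-1)^omega(n) if n is square-free, 0 otherwise."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original number
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def totient(n: int) -> int:
    """Euler's totient, counted directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def one(n: int) -> int:
    return 1

def identity(n: int) -> int:
    return n

if __name__ == "__main__":
    phi_star_one = dirichlet_convolution(totient, one)
    mu_star_one = dirichlet_convolution(mobius, one)
    print([phi_star_one(n) for n in range(1, 13)])   # expected to reproduce 1..12
    print([mu_star_one(n) for n in range(1, 13)])    # expected: 1 followed by zeros
```

Möbius inversion can be tested on any arithmetic function $g$: build $f=g*1$ and check that $\mu*f$ gives back $g$.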


Corollary 53.

$$F(n)=\prod_{d|n}f(d)\quad\Longrightarrow\quad f(n)=\prod_{d|n}F(d)^{\mu(n/d)}.$$

The last identity encodes an algebraic equivalent of the inclusion-exclusion principle. For instance, by denoting through $\Phi_m(x)$ the $m$-th cyclotomic polynomial (i.e. the minimal polynomial over $\mathbb{Q}$ of a primitive $m$-th root of unity) we have

$$x^n-1=\prod_{d|n}\Phi_d(x)$$

hence

$$\Phi_n(x)=\prod_{d|n}(x^d-1)^{\mu(n/d)}$$

and, by comparing degrees,

$$\varphi(n)=\sum_{d|n}\frac{n}{d}\cdot\mu(d)=n\sum_{d|n}\frac{\mu(d)}{d}$$

corresponding to $\varphi=\mathrm{Id}*\mu$. The explicit formula for cyclotomic polynomials has many interesting consequences, for instance:

$$\forall n>1,\qquad\Phi_n(0)=(-1)^{(\mu*1)(n)}=1$$

that also follows from the fact that $\Phi_n(x)$ is a palindromic polynomial (if $\xi$ is a root of $\Phi_n$, $\xi^{-1}$ is a root of $\Phi_n$ too). We also have

$$\frac{\Phi_n'(z)}{\Phi_n(z)}=\frac{d}{dz}\log\Phi_n(z)=\sum_{d|n}\mu\left(\frac{n}{d}\right)\frac{d\,z^{d-1}}{z^d-1}$$

and, for $n>1$,

$$[z^1]\Phi_n(z)=\Phi_n'(0)=\frac{\Phi_n'(0)}{\Phi_n(0)}=\lim_{z\to 0}\sum_{d|n}\mu\left(\frac{n}{d}\right)\frac{d\,z^{d-1}}{z^d-1}=-\mu(n)$$

since only the term d = 1 may provide a non-zero contribution to the limit. By putting together the following facts:

• by Vieta’s formulas, for a monic polynomial with degree q the sum of the roots

equals the opposite of the coefficient of xq−1 ;

$$\mu(n)=\sum_{\substack{1\le m\le n\\ \gcd(m,n)=1}}\exp\left(\frac{2\pi i m}{n}\right).$$
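The representation of $\mu(n)$ as a sum of primitive roots of unity can be verified in floating point. A minimal sketch, with a self-contained (naive) Möbius function; names are illustrative.

```python
import cmath
from math import gcd

def mobius(n: int) -> int:
    """(-1)^omega(n) if n is square-free, 0 otherwise."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def primitive_root_sum(n: int) -> complex:
    """Sum of exp(2 pi i m / n) over 1 <= m <= n with gcd(m, n) = 1."""
    return sum(cmath.exp(2j * cmath.pi * m / n)
               for m in range(1, n + 1) if gcd(m, n) == 1)

if __name__ == "__main__":
    for n in range(1, 16):
        value = primitive_root_sum(n)
        print(n, mobius(n), round(value.real, 12), round(value.imag, 12))
```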

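This sum is a particular case of a Ramanujan sum. Thanks to Srinivasa Ramanujan we also know that the $\sigma_3$ function, $\sigma_3(n)=\sum_{d|n}d^3$, fulfills at the same time an additive convolution identity (due to the fact that the Eisenstein series satisfy $E_4^2=E_8$) and a multiplicative convolution identity (with $\mathrm{Id}^3(n)=n^3$, since $\sigma_3=\mathrm{Id}^3*1$):

$$\sum_{m=1}^{n-1}\sigma_3(m)\,\sigma_3(n-m)=\frac{\sigma_7(n)-\sigma_3(n)}{120},\qquad\sum_{d|n}\sigma_3(d)\,\sigma_3\left(\frac{n}{d}\right)=\left(\mathrm{Id}^3*\mathrm{Id}^3*1*1\right)(n).$$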

Thanks to Giuseppe Melfi and his work on the modular group $\Gamma(3)$ we also know that:

$$\sum_{k=0}^{n}\sigma_1(3k+1)\,\sigma_1(3n-3k+1)=\frac19\,\sigma_3(3n+2).$$
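The additive convolution identities for divisor functions can be tested on small inputs with integer arithmetic. The sketch below uses a naive divisor sum, and the function names are hypothetical.

```python
def sigma(k: int, n: int) -> int:
    """sigma_k(n): sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def ramanujan_lhs(n: int) -> int:
    """sum_{m=1}^{n-1} sigma_3(m) * sigma_3(n - m)."""
    return sum(sigma(3, m) * sigma(3, n - m) for m in range(1, n))

def melfi_lhs(n: int) -> int:
    """sum_{k=0}^{n} sigma_1(3k + 1) * sigma_1(3n - 3k + 1)."""
    return sum(sigma(1, 3 * k + 1) * sigma(1, 3 * n - 3 * k + 1) for k in range(n + 1))

if __name__ == "__main__":
    for n in range(1, 8):
        # Both identities are checked in a division-free form.
        print(n, 120 * ramanujan_lhs(n), sigma(7, n) - sigma(3, n),
              9 * melfi_lhs(n), sigma(3, 3 * n + 2))
```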


Exercise 54. Prove that for any $M\in\{3,4,5,\ldots\}$ the following identity holds:

$$\sum_{\substack{1\le n\le M\\ \gcd(n,M)=1}}\sin^2\left(\frac{\pi n}{M}\right)=\frac{\varphi(M)-\mu(M)}{2}.$$

Hint: convert the LHS into something depending on the roots of a cyclotomic polynomial,

then recall the representation of the Möbius function as an exponential sum.

If we define the sequence of Legendre polynomials through Rodrigues’ formula

$$P_n(x)=\frac{1}{2^n\,n!}\frac{d^n}{dx^n}(x^2-1)^n=\frac{1}{2^n}\sum_{k=0}^{n}\binom{n}{k}^2(x-1)^{n-k}(x+1)^k$$

they give an orthogonal and complete basis of $L^2(-1,1)$ with respect to the usual inner product:

$$\int_{-1}^{1}P_n(x)\,P_m(x)\,dx=\frac{2\,\delta(n,m)}{2n+1}.$$

Their generating function is

$$\frac{1}{\sqrt{1-2xt+t^2}}=\sum_{n\ge 0}P_n(x)\,t^n.$$

Shifted Legendre polynomials can be defined through Rodrigues’ formula

$$\tilde P_n(x)=\frac{1}{n!}\frac{d^n}{dx^n}(x^2-x)^n=(-1)^n\sum_{k=0}^{n}\binom{n}{k}\binom{n+k}{k}(-x)^k$$

and they give a complete and orthogonal basis of $L^2(0,1)$ with respect to the usual inner product:

$$\int_0^1\tilde P_n(x)\,\tilde P_m(x)\,dx=\frac{\delta(m,n)}{2n+1}.$$

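$$\int_{-1}^{1}x^n\,P_n(x)\,dx=\frac{1}{2^n}\int_{-1}^{1}(1-x^2)^n\,dx=\frac{2^{n+1}\,n!^2}{(2n+1)!}.$$

In particular:

$$\int_{-1}^{1}\frac{dx}{\sqrt{1-2x^2t+x^2t^2}}=\sum_{n\ge 0}\left(\int_{-1}^{1}x^n\,P_n(x)\,dx\right)t^n,$$

$$\frac{4\arcsin\sqrt{\frac{t}{2}}}{\sqrt{t(2-t)}}=\sum_{n\ge 0}\frac{2^{n+1}\,n!^2}{(2n+1)!}\,t^n,$$

$$2\arcsin^2(t)=\sum_{n\ge 1}\frac{(4t^2)^n}{n^2\binom{2n}{n}}.$$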


$$\frac{1}{\sqrt{1-x}}=2\sum_{n\ge 0}P_n(2x-1),\qquad-\log(1-x)=1+\sum_{n\ge 1}\frac{2n+1}{n(n+1)}\,P_n(2x-1).$$

We already mentioned that $\sum_{n\ge 1}\frac{1}{n^2}=\sum_{n\ge 1}\frac{3}{n^2\binom{2n}{n}}$ can be proved by creative telescoping. In this paragraph we prove it is a consequence of a change of variable in an integral, in particular the tangent half-angle substitution (sometimes known as Weierstrass’ substitution, apparently for no reason) $x=2\arctan t$, sending $(0,\pi/2)$ into $(0,1)$ and $\frac{dx}{\sin x}$ into $\frac{dt}{t}$. Let us set

$$I=-\int_0^{\pi/2}\log\left(1-\frac14\sin^2x\right)\frac{dx}{\sin x}.$$

By expanding $-\log\left(1-\frac14\sin^2x\right)$ as a Taylor series in $\sin x$ we have that

$$I=\sum_{n\ge 1}\frac{1}{n\,4^n}\int_0^{\pi/2}\sin(x)^{2n-1}\,dx=\sum_{n\ge 1}\frac{1}{4n(2n-1)\binom{2n-2}{n-1}}=\frac12\sum_{n\ge 1}\frac{1}{n^2\binom{2n}{n}}.$$

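On the other hand, by the tangent half-angle substitution,

$$I=\int_0^1-\log\left(1-\left(\frac{t}{1+t^2}\right)^2\right)\frac{dt}{t}$$

where the rational function $1-\left(\frac{t}{1+t^2}\right)^2$ can be written in terms of products and ratios of polynomials of the form $1-t^m$. Eureka, since:

$$I_m=-\int_0^1\frac{\log(1-t^m)}{t}\,dt=\sum_{n\ge 1}\frac1n\int_0^1t^{mn-1}\,dt=\sum_{n\ge 1}\frac{1}{m\,n^2}=\frac{\zeta(2)}{m}$$

implies:

$$I=(I_2-2I_4+I_6)=\frac16\,\zeta(2).$$

$$\zeta(3)=\sum_{n\ge 1}\frac{1}{n^3}=\frac52\sum_{n\ge 1}\frac{(-1)^{n+1}}{n^3\binom{2n}{n}}$$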

The Taylor series of $\arcsin^2(x)$ from the Complex Analysis point of view.

The function $f(z)=\sin(z)$ is an entire function of the form $z+o(z)$, hence it gives a conformal map between two neighbourhoods of the origin (this crucial observation is the same leading to the Bürmann-Lagrange inversion formula). In particular,

$$a_{2n-1}=[x^{2n-1}]\frac{\arcsin(x)}{\sqrt{1-x^2}}=\frac{1}{2\pi i}\oint\frac{\arcsin(z)}{z^{2n}\sqrt{1-z^2}}\,dz=\frac{1}{2\pi i}\oint\frac{z}{\sin(z)^{2n}}\,dz$$

and

$$a_{2n-1}=\operatorname{Res}\left(\frac{z}{\sin(z)^{2n}},\,z=0\right)=-\operatorname{Res}\left(\int\frac{dz}{\sin(z)^{2n}},\,z=0\right).$$

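Since $\frac{d}{dz}\cot(z)=-\frac{1}{\sin^2z}$, the last integral can be computed through repeated integration by parts:

$$\int\frac{dz}{\sin(z)^{2n}}=-\sum_{k=0}^{n-1}\frac{\cot(z)}{(2n)\,\sin(z)^{2n-2k-2}}\prod_{j=0}^{k}\left(1+\frac{1}{2n-2j-1}\right)$$

where, for any $m\ge 1$,

$$\operatorname{Res}\left(\frac{\cot z}{\sin^{2m}(z)},\,z=0\right)=\operatorname{Res}\left(\frac{\cos z}{\sin^{2m+1}(z)},\,z=0\right)=\operatorname{Res}\left(\frac{1}{z^{2m+1}},\,z=0\right)=0,$$

so there is a single term really contributing to the value of the residue at the origin, and since $\operatorname{Res}(\cot(z),\,z=0)=1$,

$$-\operatorname{Res}\left(\int\frac{dz}{\sin(z)^{2n}},\,z=0\right)=\frac{1}{2n}\prod_{j=0}^{n-1}\left(1+\frac{1}{2n-2j-1}\right)=\frac{(2n)!!}{(2n)\cdot(2n-1)!!}=\frac{4^n}{2n\binom{2n}{n}}$$

$$\frac{\arcsin(x)}{\sqrt{1-x^2}}=\sum_{n\ge 1}\frac{4^n}{2n\binom{2n}{n}}\,x^{2n-1},\qquad\arcsin^2(x)=\sum_{n\ge 1}\frac{(4x^2)^n}{2n^2\binom{2n}{n}}.$$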

Exercise 57. Prove that the value of the rapidly convergent series

−

8 n!(n + 2)!42n+3

n≥0

is the golden ratio $\frac{1+\sqrt5}{2}$.

Exercise 58. Given $f(x)=\sum_{k\ge 0}\frac{x^k}{k!}\sqrt{k}$, prove that $\lim_{x\to+\infty}\frac{f(x)}{e^x\sqrt{x}}=1$. Hint: compute the series of $f(x)^2$ and exploit the inequality $2z(1-z)\le\sqrt{z(1-z)}\le\frac12$, holding for any $z\in[0,1]$.

$$\sum_{k\ge 0}\frac{(-1)^k}{k+1}\sum_{m=0}^{\lfloor k/2\rfloor}\binom{2m}{m}\frac{(-1)^m}{4^m}$$

is convergent to $\frac{1}{\sqrt2}\log(1+\sqrt2)$.

Proof.

$$\sum_{m\ge 0}\frac{(-1)^m}{4^m}\binom{2m}{m}=\frac{2}{\pi}\int_0^{\pi/2}\frac{d\theta}{1+\cos^2\theta}=\frac{1}{\sqrt2}$$

and Dirichlet’s test ensure that the given series is convergent. If $k$ is even (say $k=2n$) we have

$$\sum_{m=0}^{n}\binom{2m}{m}\frac{(-1)^m}{4^m}=[x^n]\frac{1}{(1-x)\sqrt{1+x}}$$

and if $k$ is odd (say $k=2n+1$) we have the same identity, where $[x^n]f(x)$ stands for the coefficient of $x^n$ in the Maclaurin series of $f(x)$. In particular the original series can be written as

$$\sum_{n\ge 0}\frac{1}{(2n+1)(2n+2)}\cdot[x^{2n}]\frac{1}{(1-x^2)\sqrt{1+x^2}}=\int_0^1\frac{1-x}{(1-x^2)\sqrt{1+x^2}}\,dx$$

since $\int_0^1x^{2n}(1-x)\,dx=\frac{1}{(2n+1)(2n+2)}$. It turns out that the original series is just

$$\int_0^1\frac{dx}{(1+x)\sqrt{1+x^2}}=\frac{\operatorname{arcsinh}(1)}{\sqrt2}=\frac{\log(1+\sqrt2)}{\sqrt2}.$$

3 CHEBYSHEV AND LEGENDRE POLYNOMIALS

Lemma 60. For any $n\in\mathbb{N}$ there exists a polynomial $T_n(x)\in\mathbb{Z}[x]$ such that:

$$T_n(\cos\theta)=\cos(n\theta).$$

It is not difficult to prove the claim by induction on $n$. It is trivial for $n=0$ and $n=1$, and by the cosine addition formulas:

$$\cos((n+2)\theta)+\cos(n\theta)=2\cos(\theta)\cos((n+1)\theta)$$

so that:

$$T_{n+2}(x)=2x\,T_{n+1}(x)-T_n(x)$$

for any $n\ge 0$. The $T_n(x)$ polynomials are Chebyshev polynomials of the first kind and they have many properties, simple to prove:

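• (uniform boundedness)
$$\forall x\in[-1,1],\quad|T_n(x)|\le 1$$

• (distribution of roots)
$$T_n(x)=2^{n-1}\prod_{k=1}^{n}\left(x-\cos\frac{(2k-1)\pi}{2n}\right)$$

• (orthogonality)
$$\int_{-1}^{1}\frac{T_n(x)\,T_m(x)}{\sqrt{1-x^2}}\,dx=\frac{\pi}{2}\,\delta(m,n)\,(1+\delta(n,0))$$

• (generating function)
$$\frac{1-xt}{1-2xt+t^2}=\sum_{n\ge 0}T_n(x)\,t^n$$

• (explicit form)
$$T_m(x)=\frac12\left[\left(x+i\sqrt{1-x^2}\right)^m+\left(x-i\sqrt{1-x^2}\right)^m\right].$$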
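The recurrence $T_{n+2}(x)=2x\,T_{n+1}(x)-T_n(x)$ gives a direct way to generate the integer coefficients of $T_n$. The sketch below stores a polynomial as a list of coefficients, lowest degree first (a convention chosen for this example), and checks the defining property and the location of the roots numerically.

```python
from math import cos, pi

def chebyshev_t(n_max: int):
    """Coefficient lists (lowest degree first) of T_0, ..., T_n_max."""
    polys = [[1], [0, 1]]
    for _ in range(n_max - 1):
        previous, current = polys[-2], polys[-1]
        nxt = [0] + [2 * c for c in current]      # 2x * T_{k+1}
        for i, c in enumerate(previous):          # ... minus T_k
            nxt[i] -= c
        polys.append(nxt)
    return polys[: n_max + 1]

def evaluate(coeffs, x: float) -> float:
    """Horner evaluation of a coefficient list at x."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * x + c
    return value

def chebyshev_roots(n: int):
    """The n roots cos((2k-1) pi / (2n)), k = 1..n, of T_n."""
    return [cos((2 * k - 1) * pi / (2 * n)) for k in range(1, n + 1)]

if __name__ == "__main__":
    polys = chebyshev_t(5)
    for n, p in enumerate(polys):
        print(n, p)
    theta = 0.7
    print(evaluate(polys[5], cos(theta)), cos(5 * theta))
```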

Chebyshev polynomials of the second kind, $U_n(x)$, are similarly defined through $\frac{\sin((n+1)\theta)}{\sin\theta}=U_n(\cos\theta)$: they share with Chebyshev polynomials of the first kind the recurrence relation $U_{n+2}(x)=2x\,U_{n+1}(x)-U_n(x)$ and similar properties:


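• (boundedness)
$$\forall x\in[-1,1],\quad|U_n(x)|\le(n+1)$$

• (distribution of roots)
$$U_n(x)=2^n\prod_{k=1}^{n}\left(x-\cos\frac{k\pi}{n+1}\right)$$

• (orthogonality)
$$\int_{-1}^{1}U_n(x)\,U_m(x)\sqrt{1-x^2}\,dx=\frac{\pi}{2}\,\delta(m,n)$$

• (generating function)
$$\frac{1}{1-2xt+t^2}=\sum_{n\ge 0}U_n(x)\,t^n$$

• (explicit form)
$$U_m(x)=\sum_{r=0}^{\lfloor m/2\rfloor}(-1)^r\binom{m-r}{r}(2x)^{m-2r}.$$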

By combining Vieta’s formulas (about the interplay between roots and coefficients of a polynomial) with the explicit

form of the roots of Un (x) or Tn (x), we have that many trigonometric sums or products can be easily evaluated

through Chebyshev polynomials.

Lemma 61.

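$$\sum_{k=1}^{n-1}\sin^2\left(\frac{\pi k}{n}\right)=\frac{n}{2},\qquad\sum_{k=1}^{n-1}\frac{1}{\sin^2\left(\frac{\pi k}{n}\right)}=\frac{n^2-1}{3},\qquad\prod_{k=1}^{n-1}\sin\left(\frac{\pi k}{n}\right)=\frac{2n}{2^n}.$$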
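The three identities of Lemma 61 lend themselves to a quick floating-point check, and the product formula also shows how the Riemann sums of $\log\sin$ approach $-\pi\log 2$. A minimal sketch with illustrative function names:

```python
from math import log, pi, prod, sin

def sine_square_sum(n: int) -> float:
    return sum(sin(pi * k / n) ** 2 for k in range(1, n))

def inverse_sine_square_sum(n: int) -> float:
    return sum(1.0 / sin(pi * k / n) ** 2 for k in range(1, n))

def sine_product(n: int) -> float:
    return prod(sin(pi * k / n) for k in range(1, n))

def log_sine_riemann_sum(n: int) -> float:
    """(pi/n) * sum_{k=1}^{n-1} log sin(pi k / n), a Riemann sum for the log-sine integral."""
    return (pi / n) * sum(log(sin(pi * k / n)) for k in range(1, n))

if __name__ == "__main__":
    for n in (2, 5, 12):
        print(n, sine_square_sum(n), n / 2,
              inverse_sine_square_sum(n), (n * n - 1) / 3,
              sine_product(n), 2 * n / 2 ** n)
    print(log_sine_riemann_sum(50000), -pi * log(2))
```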

The last identity is related to the combinatorial broken stick problem and it provides an unexpected way for tackling

(through Riemann sums!) the following integral:

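Lemma 62.

$$\int_0^{\pi}\log\sin(x)\,dx=-\pi\log 2.$$

$$\begin{aligned}
\int_0^{\pi}\log\sin(x)\,dx&=2\int_0^{\pi/2}\log\sin(x)\,dx=\int_0^{\pi/2}\log\sin^2(x)\,dx=\int_0^{\pi/2}\log\cos^2(x)\,dx\\
&=\int_0^{\pi/2}\log\left[\sin(x)\cos(x)\right]dx=\int_0^{\pi/2}\log\left(\frac{\sin(2x)}{2}\right)dx\\
(2x=z)\quad&=-\frac{\pi}{2}\log(2)+\frac12\int_0^{\pi}\log\sin(z)\,dz.
\end{aligned}$$

Alternatively, through Riemann sums and the last identity of the previous Lemma:

$$\begin{aligned}
\int_0^{\pi}\log\sin(x)\,dx&=\lim_{n\to+\infty}\frac{\pi}{n}\sum_{k=1}^{n-1}\log\sin\left(\frac{\pi k}{n}\right)=\lim_{n\to+\infty}\frac{\pi}{n}\log\prod_{k=1}^{n-1}\sin\left(\frac{\pi k}{n}\right)\\
&=\lim_{n\to+\infty}\frac{\pi}{n}\log\left(\frac{2n}{2^n}\right)=-\pi\log 2.
\end{aligned}$$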


$$\sum_{k=1}^{2N}\frac{1}{\cos^2\left(\frac{\pi k}{2N+1}\right)}=4N(N+1),\qquad\sum_{k=1}^{N}\frac{1}{1-\cos\left(\frac{\pi k}{N}\right)}=\frac{2N^2+1}{6}.$$

Another famous application of Chebyshev polynomials is related to the determination of the spectrum of tridiagonal Toeplitz matrices. Due to the Laplace expansion and the recurrence relation for Chebyshev polynomials

$$\det\begin{pmatrix}2x&1&0&\cdots&0\\1&2x&1&\cdots&0\\0&1&\ddots&\ddots&0\\\vdots&\ddots&\ddots&\ddots&1\\0&0&0&1&2x\end{pmatrix}=U_n(x)$$

so the spectrum of the $n\times n$ matrix with $C$ on the diagonal, 1 on the super- and sub-diagonal and zero anywhere else is given by:

$$\lambda_k=C+2\cos\left(\frac{\pi k}{n+1}\right),\qquad k=1,2,\ldots,n.$$
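The spectrum can be double-checked without any linear algebra library by exhibiting eigenvectors. The sketch below relies on the standard fact (not stated in the text, so treat it as an assumption of this example) that the vector with components $\sin\left(\frac{\pi jk}{n+1}\right)$, $j=1,\ldots,n$, is an eigenvector associated with $\lambda_k$.

```python
from math import cos, pi, sin

def tridiagonal_times(c: float, v):
    """Multiply the n x n matrix (c on the diagonal, 1 on both off-diagonals) by v."""
    n = len(v)
    return [c * v[i]
            + (v[i - 1] if i > 0 else 0.0)
            + (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def eigenpair(c: float, n: int, k: int):
    """Candidate eigenvalue lambda_k = c + 2 cos(pi k / (n+1)) and its sine eigenvector."""
    eigenvalue = c + 2 * cos(pi * k / (n + 1))
    eigenvector = [sin(pi * j * k / (n + 1)) for j in range(1, n + 1)]
    return eigenvalue, eigenvector

def residual(c: float, n: int, k: int) -> float:
    """Largest component of A v - lambda v, in absolute value."""
    eigenvalue, v = eigenpair(c, n, k)
    Av = tridiagonal_times(c, v)
    return max(abs(Av[i] - eigenvalue * v[i]) for i in range(n))

if __name__ == "__main__":
    n, c = 6, -2.0   # c = -2 gives the usual discrete Laplacian
    for k in range(1, n + 1):
        print(k, eigenpair(c, n, k)[0], residual(c, n, k))
```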

These matrices are deeply involved in the numerical solution of differential equations depending on the Laplacian operator and in extensions of the rearrangement inequality, like:

$$|a_1a_2+a_2a_3+\ldots+a_{n-1}a_n|\le\left(a_1^2+\ldots+a_n^2\right)\cos\left(\frac{\pi}{n+1}\right),$$

that combined with the shoelace formula can be used to prove the isoperimetric inequality in the polygonal case.

Another important (but lesser-known) application of Chebyshev polynomials is the proof of the uniform convergence of the Weierstrass products for the sine and cosine functions, over compact subsets of $\mathbb{R}$:

$$\operatorname{sinc}(x)=\prod_{n\ge 1}\left(1-\frac{x^2}{\pi^2n^2}\right),\qquad\cos(x)=\prod_{n\ge 0}\left(1-\frac{4x^2}{(2n+1)^2\pi^2}\right)$$

Exercise 65 (Uniform convergence of the Weierstrass product for the cosine function). Let $I=[a,b]\subseteq\mathbb{R}$ and $\{f_n(x)\}_{n\in\mathbb{N}}$ the sequence of real polynomials defined through:

$$f_n(x)=\prod_{j=0}^{n}\left(1-\frac{4x^2}{(2j+1)^2\pi^2}\right).$$

Prove that on $I$ the sequence of functions $\{f_n(x)\}_{n\in\mathbb{N}}$ is uniformly convergent to $\cos x$.

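Proof. The factorization of Chebyshev polynomials of the first and second kind leads to the following identities:

$$\frac{\sin x}{(2n+1)\sin\frac{x}{2n+1}}=\prod_{k=1}^{n}\left(1-\frac{\sin^2\frac{x}{2n+1}}{\sin^2\frac{k\pi}{2n+1}}\right),\qquad\cos x=\prod_{j=0}^{n-1}\left(1-\frac{\sin^2\frac{x}{2n}}{\sin^2\frac{(2j+1)\pi}{4n}}\right),$$

holding for every $x\in\mathbb{R}$ and every $n\in\mathbb{Z}^+$. We may assume without loss of generality that $0<x<m<n$ holds, with $m$ and $n$ being positive natural numbers. Since for every $\theta$ in the interval $\left(0,\frac{\pi}{2}\right)$ we have $\frac{2\theta}{\pi}<\sin\theta<\theta$, it follows that:

$$1>\prod_{k=m+1}^{n-1}\left(1-\frac{\sin^2\frac{x}{2n}}{\sin^2\frac{(2k+1)\pi}{4n}}\right)>\prod_{k=m+1}^{n-1}\left(1-\frac{x^2}{(2k+1)^2}\right)>1-x^2\sum_{k=m+1}^{n-1}\frac{1}{(2k+1)^2}>1-\frac{x^2}{4m},$$

so, by setting

$$H_m(x)=\prod_{j=0}^{m}\left(1-\frac{\sin^2\frac{x}{2n}}{\sin^2\frac{(2j+1)\pi}{4n}}\right),$$

$\cos x$ belongs to the interval

$$\left[\left(1-\frac{x^2}{4m}\right)H_m(x),\ H_m(x)\right].$$

By sending $n$ towards $+\infty$ we get that $\cos x$ belongs to the interval:

$$\left[\left(1-\frac{x^2}{4m}\right)\prod_{j=0}^{m}\left(1-\frac{4x^2}{(2j+1)^2\pi^2}\right),\ \prod_{j=0}^{m}\left(1-\frac{4x^2}{(2j+1)^2\pi^2}\right)\right],$$

so, by sending $m$ towards $+\infty$, the pointwise convergence of the Weierstrass product for the cosine function is proved. Additionally, by the last line it follows that:

$$|f_m(x)-\cos x|\le|\cos x|\,\frac{4x^2/m}{1-4x^2/m}\le\frac{4x^2}{m-4x^2},$$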

and such inequality proves the uniform convergence. The proof of the uniform convergence (over compact subsets of

the real line) of the Weierstrass product for the sine function is analogous.

Chebyshev polynomials can also be employed to prove the following statement (a first density result in Functional Analysis) through an approach due to Lebesgue.

Theorem 66 (Weierstrass approximation Theorem). If $f(x)$ is a continuous function on the interval $[a,b]$, for any $\varepsilon>0$ there exists a polynomial $p_\varepsilon(x)$ such that:

$$\forall x\in[a,b],\qquad|f(x)-p_\varepsilon(x)|\le\varepsilon.$$

We may clearly assume [a, b] = [−1, 1] without loss of generality. Moreover, any continuous function over a compact

interval of the real line is uniformly continuous, so f can be uniformly approximated by a piecewise-linear function of

the form

$$g_n(x)=\sum_{k=-n}^{n}c_k\left|x-\frac{k}{n}\right|$$

and it is enough to prove the statement for the function $f(x)=|x|$ on the interval $[-1,1]$. For such a purpose, we may consider the projection of $f(x)$ on the subspace of $L^2(-1,1)$ (equipped with the “Chebyshev” inner product $\langle u(x),v(x)\rangle=\int_{-1}^{1}\frac{u(x)\,v(x)}{\sqrt{1-x^2}}\,dx$) spanned by $T_0(x),T_1(x),\ldots,T_{2N}(x)$. Since

$$\int_{-1}^{1}\frac{|x|\,T_{2k+1}(x)}{\sqrt{1-x^2}}\,dx=0,\qquad\int_{-1}^{1}\frac{|x|\,T_{2k}(x)}{\sqrt{1-x^2}}\,dx=\int_0^{\pi}|\cos\theta|\cos(2k\theta)\,d\theta$$

such projection is given by

$$p_N(x)=\frac{2}{\pi}+\frac{4}{\pi}\sum_{k=1}^{N}\frac{(-1)^{k+1}}{4k^2-1}\,T_{2k}(x)$$

and it can be proved that the maximum difference, in absolute value, between $p_N(x)$ and $|x|$ occurs at $x=0$ and equals:

$$p_N(0)=\frac{2}{\pi}-\frac{2}{\pi}\sum_{k=1}^{N}\left(\frac{1}{2k-1}-\frac{1}{2k+1}\right)=\frac{4}{\pi}\sum_{k>N}\frac{1}{4k^2-1}=\frac{2}{\pi(2N+1)},$$

so the sequence of polynomials $\{p_N(x)\}_{N\ge 0}$ provides a uniform approximation of $|x|$ as wanted.

As an alternative we may consider:

N

2n (1 − x2 )n

X

qN (x) = 1 −

n=1

n 4n (2n − 1)

from the truncation of the Taylor series at the origin of $\sqrt{1-z}$, evaluated at $z=1-x^2$. In such a case it is trivial that $\left||x|-q_N(x)\right|$ achieves its maximum value at the origin, but the approximation we get this way, according to the degree of the approximating polynomial, is worse than the approximation we got through Chebyshev polynomials, since $|q_N(0)|\approx\frac{1}{\sqrt{\pi N}}$.
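The two error estimates at the origin can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `p_N_at` and `q_N_at` are hypothetical, and the Chebyshev polynomials are evaluated through the identity T_{2k}(cos t) = cos(2kt).

```python
import math

def p_N_at(x, N):
    """Fourier-Chebyshev partial sum for |x|, using T_{2k}(cos t) = cos(2kt)."""
    t = math.acos(x)
    s = 2 / math.pi
    for k in range(1, N + 1):
        s += (4 / math.pi) * (-1) ** (k + 1) / (4 * k * k - 1) * math.cos(2 * k * t)
    return s

def q_N_at(x, N):
    """Truncated Taylor series of sqrt(1 - z), evaluated at z = 1 - x^2."""
    z = 1 - x * x
    s = 1.0
    c = 1.0  # c holds binom(2n, n) / 4^n, updated incrementally
    for n in range(1, N + 1):
        c *= (2 * n - 1) / (2 * n)
        s -= c * z ** n / (2 * n - 1)
    return s

if __name__ == "__main__":
    for N in (5, 50, 500):
        # errors at the origin of the two approximations of |x|
        print(N, p_N_at(0.0, N), q_N_at(0.0, N))
```

For a comparable number of terms the first column decays like 1/N and the second only like 1/√N, in agreement with the remark above.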

The projection technique on $L^2(-1,1)$ (equipped with the non-canonical inner product $\langle u(x),v(x)\rangle=\int_{-1}^{1}\frac{u(x)\,v(x)}{\sqrt{1-x^2}}\,dx$) is also known as Fourier-Chebyshev series expansion. About applications, it is important to mention that many functions have a pretty simple Fourier-Chebyshev series expansion:

$$ -\log(1-x)=\log(2)+2\sum_{n\geq 1}\frac{T_n(x)}{n} $$

$$ \sqrt{1-x^2}=\frac{2}{\pi}-\frac{4}{\pi}\sum_{n\geq 1}\frac{T_{2n}(x)}{4n^2-1} $$

$$ \arcsin(x)=\frac{4}{\pi}\sum_{n\geq 1}\frac{T_{2n-1}(x)}{(2n-1)^2}. $$

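We may notice that the last identity provides an interesting way for the explicit evaluation of ζ(2) and ζ(4). Since $T_{2n-1}(1)=1$,

$$ \frac{3}{4}\,\zeta(2)=\sum_{n\geq 1}\frac{1}{(2n-1)^2}=\frac{\pi}{4}\arcsin(1)=\frac{\pi^2}{8}, $$

while Parseval's identity gives

$$ \frac{8}{\pi}\sum_{n\geq 1}\frac{1}{(2n-1)^4}=\int_{-1}^{1}\frac{\arcsin^2(x)}{\sqrt{1-x^2}}\,dx=\int_{-\pi/2}^{\pi/2}\theta^2\,d\theta=\frac{\pi^3}{12} $$

so:

$$ \zeta(4)=\frac{16}{15}\sum_{n\geq 1}\frac{1}{(2n-1)^4}=\frac{16}{15}\cdot\frac{\pi^4}{96}=\frac{\pi^4}{90}. $$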

If q ∈ Q and cos(πq) ∈ Q, then:

$$ \cos(\pi q)\in\left\{-1,-\frac{1}{2},0,\frac{1}{2},1\right\}. $$


Proof. We may assume without loss of generality $q=\frac{a}{b}$ with a, b ∈ Z+ and gcd(a, b) = 1.

Let us study the case in which b is odd first. With such assumption:

$$ T_b(\cos(\pi q))=\cos(\pi a)=(-1)^a, $$

so cos(πq) is a root of $T_b(x)-(-1)^a$. The value at the origin of such polynomial is ±1 and the leading coefficient is $2^{b-1}$: due to the rational root Theorem,

$$ \cos(\pi q)\in\mathbb{Q}\;\Longrightarrow\;\cos(\pi q)=\pm\frac{1}{2^k}. $$

However, due to the cosine duplication formula, if α = cos(πq) is a rational number, $2\alpha^2-1=\cos(2\pi q)$ is a rational number too. Such observation leads to a proof of the given claim when b is odd. If $\nu_2(b)\geq 1$, it is enough to consider that by the duplication/bisection formulas $\cos\frac{\theta}{2}=\pm\sqrt{\frac{1+\cos\theta}{2}}$ we have:

$$ \cos(\pi q)\in\mathbb{Q}\;\Longrightarrow\;\cos\left(2^{\nu_2(b)}\pi q\right)\in\mathbb{Q}\;\Longrightarrow\;\cos\left(2^{\nu_2(b)}\pi q\right)\in\left\{-1,-\frac{1}{2},0,\frac{1}{2},1\right\} $$

so cos(πq) ∈ {−1, 0, 1}.

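Lemma 69. If n ≥ 3, the number $\alpha=\cos\frac{2\pi}{n}$ is an algebraic number over Q with degree $\frac{\varphi(n)}{2}$.

Additionally, the Galois group of its minimal polynomial over Q is abelian.

The algebraic conjugates of α are given by

$$ \alpha_k=\cos\frac{2\pi k}{n},\qquad 1\leq k<\frac{n}{2},\quad \gcd(k,n)=1. $$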

We may notice that the previous result is actually just a corollary of this Lemma.

$$ \alpha=\cos\frac{2\pi}{19}-\cos\frac{3\pi}{19}-\cos\frac{5\pi}{19} $$

is an algebraic number over Q with degree 3, i.e. it is a root of a cubic polynomial with integer coefficients.

Legendre polynomials share many properties with Chebyshev polynomials. Our opinion is that the most efficient

way for introducing Legendre polynomials is to do that through Rodrigues formula:

Definition 71.

$$ P_n(x)=\frac{1}{2^n\,n!}\,\frac{d^n}{dx^n}\left(x^2-1\right)^n $$

Such definition provides a simple way for proving the following statements about Legendre polynomials:

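• (orthogonality)

$$ \int_{-1}^{1}P_n(x)\,P_m(x)\,dx=\frac{2\,\delta(m,n)}{2n+1} $$

• (Legendre's differential equation)

$$ \frac{d}{dx}\left[(1-x^2)\,\frac{d}{dx}P_n(x)\right]+n(n+1)P_n(x)=0 $$

• (generating function)

$$ \frac{1}{\sqrt{1-2xt+t^2}}=\sum_{n\geq 0}P_n(x)\,t^n $$

• (convolution formula, where $U_n$ is the Chebyshev polynomial of the second kind)

$$ U_n(x)=\sum_{l=0}^{n}P_l(x)\,P_{n-l}(x) $$

• (Bonnet's recursion formulas)

$$ (n+1)P_{n+1}(x)=(2n+1)\,x\,P_n(x)-n\,P_{n-1}(x),\qquad (2n+1)P_n(x)=\frac{d}{dx}\left(P_{n+1}(x)-P_{n-1}(x)\right) $$

• (an explicit representation)

$$ P_n(x)=\sum_{k=0}^{n}(-1)^k\binom{n}{k}^2\left(\frac{1+x}{2}\right)^{n-k}\left(\frac{1-x}{2}\right)^{k}. $$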

It is very practical to introduce shifted Legendre polynomials too, defined by P̃n (x) = Pn (2x − 1).

Due to such affine transform, shifted Legendre polynomials fulfill the following properties:

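• (Rodrigues formula)

$$ \tilde{P}_n(x)=\frac{1}{n!}\,\frac{d^n}{dx^n}\left(x^2-x\right)^n $$

• (orthogonality)

$$ \int_{0}^{1}\tilde{P}_n(x)\,\tilde{P}_m(x)\,dx=\frac{\delta(m,n)}{2n+1} $$

• (a simple generating function)

$$ \frac{1}{\sqrt{(t+1)^2-4tx}}=\sum_{n\geq 0}\tilde{P}_n(x)\,t^n $$

• (Bonnet's recursion formulas)

$$ (n+1)\tilde{P}_{n+1}(x)=(2n+1)(2x-1)\tilde{P}_n(x)-n\,\tilde{P}_{n-1}(x),\qquad (4n+2)\tilde{P}_n(x)=\frac{d}{dx}\left(\tilde{P}_{n+1}(x)-\tilde{P}_{n-1}(x)\right) $$

• (an explicit representation)

$$ \tilde{P}_n(x)=(-1)^n\sum_{k=0}^{n}\binom{n}{k}\binom{n+k}{k}(-x)^k. $$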
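The explicit representation makes the orthogonality relation easy to test with exact rational arithmetic. This is a minimal sketch under the assumption that polynomials are stored as coefficient lists (lowest degree first); the helper names `shifted_legendre` and `inner` are hypothetical.

```python
from fractions import Fraction
from math import comb

def shifted_legendre(n):
    """Coefficients [c_0, ..., c_n] of the n-th shifted Legendre polynomial,
    read off from (-1)^n * sum_k binom(n,k) binom(n+k,k) (-x)^k."""
    return [Fraction((-1) ** (n + k) * comb(n, k) * comb(n + k, k))
            for k in range(n + 1)]

def inner(p, q):
    """Exact integral over (0, 1) of the product of two coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            total += a * b / (i + j + 1)  # integral of x^(i+j) over (0, 1)
    return total

if __name__ == "__main__":
    for m in range(4):
        print([inner(shifted_legendre(m), shifted_legendre(n)) for n in range(4)])
```

The printed matrix is expected to be diagonal, with entries 1/(2n+1) on the diagonal.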

The explicit representation and orthogonality are really important. Shifted Legendre polynomials give an orthogonal basis (with respect to the canonical inner product $\langle f,g\rangle=\int_{0}^{1}f(x)\,g(x)\,dx$) of the space of square-integrable functions over (0, 1), and every $C^1$ function over (0, 1) has a representation of the form:

$$ f(x)=\sum_{n\geq 0}c_n\,\tilde{P}_n(x),\qquad c_n=(2n+1)\int_{0}^{1}f(x)\,\tilde{P}_n(x)\,dx. $$

We may define a Fourier-Legendre series expansion just like we did for the Fourier-Chebyshev series expansion: we just have to change the integration range into (0, 1) and the inner product into the canonical one, since the existence of a complete orthogonal basis of polynomials is unchanged.

Let us study an application of the Fourier-Legendre series expansion.

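$$ 2\int_{-1}^{1}f(x)^2\,dx\;\geq\;3\left(\int_{-1}^{1}x\,f(x)\,dx\right)^2+\left(\int_{-1}^{1}f(x)\,dx\right)^2 $$

Proof. We may assume to have

$$ f(x)=\sum_{n\geq 0}c_n\,P_n(x), $$

from which

$$ \int_{-1}^{1}f(x)^2\,dx=2\sum_{n\geq 0}\frac{c_n^2}{2n+1},\qquad \int_{-1}^{1}f(x)\,dx=2c_0,\qquad \int_{-1}^{1}P_1(x)\,f(x)\,dx=\frac{2c_1}{3}, $$

so the given inequality is equivalent to

$$ 4\sum_{n\geq 0}\frac{c_n^2}{2n+1}\;\geq\;\frac{4c_1^2}{3}+4c_0^2, $$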

that is trivial and holds as an equality iff c2 = c3 = c4 = . . . = 0, i.e. iff f (x) is a linear polynomial of the form

ax + b.

Like in the Chebyshev case, the Fourier-Legendre series expansion of many functions can be simply derived by manipulating the generating function for our sequence of polynomials. For any x ∈ (0, 1) we have, for instance:

$$ \frac{1}{\sqrt{1-x}}=2\sum_{n\geq 0}\tilde{P}_n(x),\qquad -\log(x)=1+\sum_{n\geq 1}(-1)^n\frac{2n+1}{n(n+1)}\,\tilde{P}_n(x),\qquad -\log(1-x)=1+\sum_{n\geq 1}\frac{2n+1}{n(n+1)}\,\tilde{P}_n(x) $$

and

$$ K(k)=\int_{0}^{\pi/2}\frac{d\theta}{\sqrt{1-k^2\sin^2\theta}}=2\sum_{n\geq 0}\frac{\tilde{P}_n(k^2)}{2n+1}. $$

Legendre and Chebyshev polynomials share the uniform boundedness property

$$ \forall x\in[-1,1],\qquad |P_n(x)|\leq 1. $$

A possible proof relies on the application of Cauchy-Schwarz inequality to the integral representation

$$ P_n(x)=\frac{1}{\pi}\int_{0}^{\pi}\left(x+i\sqrt{1-x^2}\cos\theta\right)^n d\theta $$

that is a consequence of the generating function. An improved inequality is due to Tricomi:

$$ \forall x\in(-1,1),\qquad |P_n(x)|\leq\frac{2}{\sqrt{(2n+1)\pi}}\cdot\frac{1}{\sqrt[4]{1-x^2}}. $$

Readers can find a sketch of its (highly non-trivial) proof in the Whittaker & Watson book or on MSE (thanks to M. Spivey). From these remarks (and/or from Bonnet’s recursion formula) it follows that the Legendre polynomial $P_n(x)$, just like the Chebyshev polynomial $T_n(x)$, has only real roots.

Theorem 73 (Turán). For any x in the interval (−1, 1) the following inequality holds:

$$ P_n(x)^2>P_{n-1}(x)\,P_{n+1}(x). $$

We will later see a proof of this brilliant result (a key ingredient for the Askey-Gasper inequality, that led De

Branges in 1985 to the proof of Bieberbach conjecture) based on the following remarks:


In this paragraph we will outline an alternative and really elegant approach, due to Szegö.

We just need a few preliminary lemmas.

is the k-th elementary symmetric function of $a_1,\ldots,a_n$, by setting $S_k=\sigma_k\binom{n}{k}^{-1}$ it follows that $S_k^2>S_{k-1}S_{k+1}$.

f (z) = zn

n!

n≥0

is an entire function and the zeroes of f are real and simple, as soon as these zeroes ζ1 , ζ2 , ζ3 , . . . fulfill

$$ \sum_{k\geq 1}\frac{1}{\zeta_k^2}<+\infty $$

$$ J_0(z)=\sum_{n\geq 0}\frac{(-1)^n z^{2n}}{4^n\,n!^2} $$

Szegö’s remark is that the following identity is easy to derive from the generating function for Legendre polynomials:

$$ \sum_{n\geq 0}\frac{P_n(x)}{n!}\,z^n=e^{zx}\,J_0\left(z\sqrt{1-x^2}\right) $$

hence Turán’s inequality is a straightforward consequence of Pólya-Schur’s Lemma. Szegö’s idea is quite deep

since it implies the existence of many “Turán-type” inequalities, not only for Legendre polynomials, but for many

solutions of second-order differential equations with polynomial coefficients: Chebyshev, Hermite, Laguerre, Jacobi

polynomials, Bessel functions . . .

$$ \sum_{j=0}^{n}\binom{n}{j}\binom{n+j}{j}\frac{(-1)^{n+j}}{(j+1)^2}=\frac{(-1)^n}{n(n+1)}. $$

$$ \forall x\in(-1,1),\qquad \sum_{k=0}^{n}P_k(x)>0. $$

$$ \sum_{l=0}^{n}\binom{n}{l}^2(x+y)^{2l}(x-y)^{2n-2l}=\sum_{l=0}^{n}\binom{2l}{l}\binom{2n-2l}{n-l}x^{2l}y^{2n-2l}. $$

Exercise 80 (Ramanujan-like formulas for $\frac{1}{\pi}$ and $\frac{1}{\pi^2}$). Use Rodrigues’ formula or the generating function for Legendre polynomials to show that

$$ \frac{1}{\sqrt{x(1-x)}}\;\stackrel{L^2(0,1)}{=}\;\pi\sum_{n\geq 0}\frac{4n+1}{16^n}\binom{2n}{n}^2 P_{2n}(2x-1). $$

Use Bonnet’s formula to deduce how the Fourier-Legendre series of a function g(x) changes if g(x) is replaced by g(1 − x) or x · g(x). Use such transformations for showing that

$$ \sqrt{x(1-x)}\;\stackrel{L^2(0,1)}{=}\;\frac{\pi}{8}\sum_{n\geq 0}\frac{4n+1}{(n+1)(1-2n)\,16^n}\binom{2n}{n}^2 P_{2n}(2x-1). $$

Compute $P_{2n}(0)$ in explicit terms, then use the evaluation of the previous line at $x=\frac{1}{2}$ and Parseval’s identity for showing that

$$ \frac{4}{\pi}=\sum_{n\geq 0}\frac{(4n+1)(-1)^n}{(n+1)(1-2n)\,64^n}\binom{2n}{n}^3, $$

$$ \frac{32}{3\pi^2}=\sum_{n\geq 0}\frac{4n+1}{(n+1)^2(2n-1)^2\,256^n}\binom{2n}{n}^4. $$

A note on Delannoy numbers and their asymptotic behaviour. Let us assume to travel in Z × Z, having the origin as a starting point and the allowed steps (1, 0), (0, 1) and (1, 1).

Let us denote with $D_n$ the number of paths from the origin to (n, n).

In this way we define the sequence of Delannoy numbers

$$ D_n=[x^n y^n]\,\frac{1}{1-x-y-xy}=[x^n y^n]\,\frac{1}{2-(1+x)(1+y)}=\sum_{k\geq 0}\frac{1}{2^{k+1}}\binom{k}{n}^2. $$

Through Cauchy’s integral theorem or the orthogonality relations in $L^2(0,2\pi)$ the RHS of the previous line can be written as

$$ D_n=\frac{1}{2\pi}\int_{0}^{2\pi}\frac{d\theta}{\left(3-2\sqrt{2}\cos\theta\right)^{n+1}} $$

giving that $\{D_n\}_{n\geq 0}$ is a sequence of moments, hence a log-convex sequence.

This integral representation leads to the ordinary generating function

$$ \sum_{n\geq 0}D_n\,x^n=\frac{1}{\sqrt{1-6x+x^2}} $$

and to the estimate

$$ D_n\approx\frac{1}{\sqrt{3n}}\left(1+\sqrt{2}\right)^{2n} $$

as n → +∞. Given the generating function for Legendre polynomials, we have $D_n=P_n(3)$.
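The identity between Delannoy numbers and values of Legendre polynomials can be tested directly. The sketch below is only an illustration: it counts lattice paths with steps (1,0), (0,1), (1,1) by dynamic programming and compares the result with P_n(3) computed through Bonnet's recursion in its standard form; the function names are hypothetical.

```python
from fractions import Fraction

def delannoy(n):
    """Number of paths from (0,0) to (n,n) with steps (1,0), (0,1), (1,1)."""
    row = [1] * (n + 1)          # paths to (0, j): only one way
    for _ in range(n):
        new = [1]                # paths to (i, 0): only one way
        for j in range(1, n + 1):
            # arrive from below, from the left, or diagonally
            new.append(row[j] + new[j - 1] + row[j - 1])
        row = new
    return row[n]

def legendre_at(n, x):
    """P_n(x) via (m+1) P_{m+1} = (2m+1) x P_m - m P_{m-1}, in exact arithmetic."""
    prev, cur = Fraction(1), Fraction(x)
    if n == 0:
        return prev
    for m in range(1, n):
        prev, cur = cur, ((2 * m + 1) * x * cur - m * prev) / (m + 1)
    return cur

if __name__ == "__main__":
    for n in range(8):
        print(n, delannoy(n), legendre_at(n, 3))
```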


Exercise 81. The n-th Fibonacci number can be computed in at most (2 + ε) log2 (n) integer multiplications by

exploiting the relations between Fn and Ln and the duplication formulas
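A possible implementation of this idea is sketched below. It assumes the standard identities F_{2m} = F_m L_m, L_{2m} = L_m^2 − 2(−1)^m, F_{k+1} = (F_k + L_k)/2 and L_{k+1} = (5F_k + L_k)/2, which may differ in form from the duplication formulas intended by the exercise; the function name `fib_lucas` is hypothetical.

```python
def fib_lucas(n):
    """Return (F_n, L_n) by halving the index, with two big multiplications per level."""
    if n == 0:
        return 0, 2
    m = n // 2
    f, l = fib_lucas(m)
    f2 = f * l                       # F_{2m} = F_m L_m
    l2 = l * l - 2 * (-1) ** m       # L_{2m} = L_m^2 - 2 (-1)^m
    if n % 2 == 0:
        return f2, l2
    # shift the index from 2m to 2m + 1 (only additions and a small constant factor)
    return (f2 + l2) // 2, (5 * f2 + l2) // 2

if __name__ == "__main__":
    print([fib_lucas(n)[0] for n in range(15)])
```

Each recursion level costs one product F_m·L_m and one square L_m², so the number of large multiplications is about 2·log2(n).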

Exercise 82. Prove that if we restrict the paths defining $D_n$ to lie in the region m ≤ n, we get the sequence of Schroeder numbers $\{S_n\}_{n\geq 0}$, with ordinary generating function

$$ \sum_{n\geq 0}S_n\,x^n=\frac{1-x-\sqrt{1-6x+x^2}}{2x} $$

and

$$ S_n=\frac{1}{2}\left(-P_{n-1}(3)+6P_n(3)-P_{n+1}(3)\right). $$

Through the last identity, find the asymptotic behaviour of $S_n$ as n → +∞.

4 The glory of Fourier, Laplace, Feynman and Frullani

In the computation of many limits it is often very practical to exploit Taylor series and Landau notation, i.e., loosely

speaking, the fact that a wide class of functions is well-approximated by polynomials. A key observation in Analysis

(due to Joseph Fourier) is that the same holds for trigonometric polynomials: every real function that is 2π-periodic

and regular enough can be written as a combination of terms of the form sin(nx) or cos(mx) (in Signal Processing these

terms are often called harmonics). For every couple (m, n) of positive natural numbers the following orthogonality relations, with respect to the canonical inner product $\langle f,g\rangle=\int_{0}^{2\pi}f(x)\,g(x)\,dx$, hold:

$$ \int_{0}^{2\pi}\sin(nx)\sin(mx)\,dx=\int_{0}^{2\pi}\cos(nx)\cos(mx)\,dx=\pi\,\delta(m,n),\qquad \int_{0}^{2\pi}\sin(nx)\cos(mx)\,dx=0. $$

In particular, the coefficients of the involved harmonics can be easily computed through integrals: the determination of such coefficients is a problem equivalent to finding the coordinates of a vector in an infinite-dimensional vector space, equipped with an inner product and an orthogonal basis. We perform such computation in three exemplary cases.

Let g(x) be the 2π-periodic function that equals $\frac{\pi-x}{2}$ on the interval (0, 2π).

$$ \forall x\in\mathbb{R}\setminus 2\pi\mathbb{Z},\qquad g(x)=\sum_{n\geq 1}\frac{\sin(nx)}{n}. $$

Let f (x) be the 2π-periodic function that equals 1 on the interval (0, π) and −1 on the interval (π, 2π).

$$ \forall x\in\mathbb{R}\setminus \pi\mathbb{Z},\qquad f(x)=\frac{4}{\pi}\sum_{n\geq 0}\frac{\sin((2n+1)x)}{2n+1}. $$

Let h(x) be the 2π-periodic function that equals π − |x| on the interval [−π, π].

$$ \forall x\in\mathbb{R},\qquad h(x)=\frac{\pi}{2}+\frac{4}{\pi}\sum_{n\geq 0}\frac{\cos((2n+1)x)}{(2n+1)^2}. $$

We may notice that 85 follows from 84 by mapping g(x) into g(x) − g(x + π), while 86 follows from 85 by integration.

We remark that the computation of Fourier coefficients for a 2π-periodic and piecewise-polynomial function is always pretty simple, but there might be convergence issues nevertheless:

• The Fourier series of f (x) might fail to be pointwise convergent to f (x) for some x ∈ [0, 2π]: for instance, such

lack-of-convergence phenomenon occurs for sure if limx→0+ f (x) and limx→0− f (x) exist but do not agree;

• Even if pointwise convergence holds, the Fourier series of f might fail to be uniformly convergent to f (x) on

[0, 2π]: that happens for sure when Gibbs phenomenon occurs:

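$$ \lim_{N\to+\infty}\;\sup_{x\in(\pi/(N+1),\,\pi)}\left|\frac{\pi-x}{2}-\sum_{n=1}^{N}\frac{\sin(nx)}{n}\right|=\pi\left(\int_{0}^{1}\frac{\sin(\pi x)}{\pi x}\,dx-\frac{1}{2}\right)\neq 0. $$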
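The overshoot can be observed numerically. The following sketch estimates the supremum on a grid and compares it with ∫_0^π (sin t)/t dt − π/2 ≈ 0.2811, which is the value obtained by reading the partial sum near x = π/N as a Riemann sum; this reference value, the grid density and the function names are assumptions of the sketch.

```python
import math

def partial_sum(x, N):
    """N-th partial sum of the Fourier series of (pi - x)/2."""
    return sum(math.sin(n * x) / n for n in range(1, N + 1))

def gibbs_sup(N, points_per_lobe=50):
    """Grid estimate of sup |(pi - x)/2 - S_N(x)| over (pi/(N+1), pi)."""
    left = math.pi / (N + 1)
    M = points_per_lobe * N
    h = (math.pi - left) / M
    return max(abs((math.pi - x) / 2 - partial_sum(x, N))
               for x in (left + j * h for j in range(1, M)))

def sine_integral(x, terms=40):
    """Si(x) through its Taylor series sum (-1)^k x^(2k+1) / ((2k+1) (2k+1)!)."""
    s, term = 0.0, x
    for k in range(terms):
        s += term / (2 * k + 1)
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return s

if __name__ == "__main__":
    print(gibbs_sup(100), sine_integral(math.pi) - math.pi / 2)
```

The gap does not shrink as N grows, which is exactly the failure of uniform convergence described above.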


Moreover, given a trigonometric series, it might be extremely difficult to find a 2π-periodic function with the given

series as a Fourier series. These subtleties will be investigated in full detail during Calculus courses. Now we are just

interested in proving some consequences of the above results, by starting from this observation: if f (x) is a 2π-periodic function with mean zero, continuous on (0, 2π), whose Fourier series is pointwise convergent to f (x) on R \ πZ, then by performing a termwise integration on the Fourier series of f (x) we get the Fourier series of an antiderivative for f (x), and the convergence becomes uniform. In particular, for any x ∈ (0, 2π) we have:

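$$ \frac{\pi x}{2}-\frac{x^2}{4}=\int_{0}^{x}\frac{\pi-y}{2}\,dy=\sum_{n\geq 1}\frac{1-\cos(nx)}{n^2}=c_0-\sum_{n\geq 1}\frac{\cos(nx)}{n^2} $$

where $c_0$ has to be the mean value of the function $\frac{\pi x}{2}-\frac{x^2}{4}$ on the interval (0, 2π), since every term of the form cos(nx) has mean zero. It follows that:

$$ c_0=\zeta(2)=\frac{1}{8\pi}\int_{0}^{2\pi}\left(2\pi x-x^2\right)dx=\pi^2\int_{0}^{1}\left(x-x^2\right)dx=\pi^2\left(\frac{1}{2}-\frac{1}{3}\right)=\frac{\pi^2}{6} $$

and such identity further leads to:

$$ \forall x\in(0,1),\qquad \sum_{n\geq 1}\frac{\cos(2\pi n x)}{n^2}=\frac{\pi^2}{6}\left(1-6x+6x^2\right). $$

Since

$$ \int_{0}^{1}\cos(2\pi m x)\cos(2\pi n x)\,dx=\frac{1}{2}\,\delta(m,n) $$

we also have that:

$$ \zeta(4)=\sum_{n\geq 1}\frac{1}{n^4}=2\int_{0}^{1}\left(\frac{\pi^2}{6}\right)^2\left(1-6x+6x^2\right)^2dx=\frac{\pi^4}{18}\int_{0}^{1}\left(1-6x+6x^2\right)^2dx=\frac{\pi^4}{90}. $$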

Lemma 87. By defining the sequence of Bernoulli polynomials through $B_0(x)=1$ and

$$ \forall n\geq 0,\qquad B_{n+1}(x)=\kappa_{n+1}+(n+1)\int_{0}^{x}B_n(y)\,dy, $$

where the $\kappa_{n+1}$ constant is chosen in such a way that $B_{n+1}(x)$ has mean zero over (0, 1), we have that for any n ≥ 1 the value of ζ(2n) is a rational multiple of $\pi^{2n}\int_{0}^{1}B_n(x)^2\,dx$, hence:

$$ \frac{\zeta(2n)}{\pi^{2n}}\in\mathbb{Q}. $$
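The recursion of Lemma 87 is easy to run in exact arithmetic. The sketch below builds the polynomials as coefficient lists and then uses the relation ζ(2n) = (2π)^{2n} ∫_0^1 B_n(x)² dx / (2·n!²); this explicit rational constant comes from Parseval's identity and is an assumption of the sketch, since the lemma only states that some rational multiple exists. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def bernoulli_polys(n_max):
    """Coefficient lists (lowest degree first) of B_0, ..., B_{n_max}."""
    polys = [[Fraction(1)]]
    for n in range(n_max):
        prev = polys[-1]
        # (n + 1) times the antiderivative of B_n vanishing at 0
        integ = [Fraction(0)] + [(n + 1) * c / (k + 1) for k, c in enumerate(prev)]
        # pick the constant term so that the mean over (0, 1) is zero
        mean = sum(c / (k + 1) for k, c in enumerate(integ))
        integ[0] = -mean
        polys.append(integ)
    return polys

def square_integral(p):
    """Exact integral of p(x)^2 over (0, 1)."""
    return sum(a * b / (i + j + 1) for i, a in enumerate(p) for j, b in enumerate(p))

def zeta_even_over_pi_power(n):
    """The rational number zeta(2n) / pi^(2n), under the Parseval relation stated above."""
    b = bernoulli_polys(n)[n]
    return Fraction(2 ** (2 * n - 1)) * square_integral(b) / factorial(n) ** 2

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, zeta_even_over_pi_power(n))
```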

If we apply the same technique to the Fourier series of the triangle wave $\sum_{n\geq 0}(-1)^n\frac{\sin((2n+1)x)}{(2n+1)^2}$, we may prove in a similar way that:

$$ \forall m\in\mathbb{N},\qquad \sum_{n\geq 0}\frac{(-1)^n}{(2n+1)^{2m+1}}\in\pi^{2m+1}\,\mathbb{Q}. $$

Since

$$ \sum_{n\geq 0}\frac{(-1)^n}{(2n+1)^3}=\frac{\pi^3}{32} $$

it follows that $\pi^3\approx 31$ is a pretty accurate approximation. Let us see how to prove the last identity through a very powerful tool, the Laplace transform. Let us assume that f (x) is a continuous and “vaguely integrable” function on R+, meaning that the following limit

$$ \lim_{M\to+\infty}\int_{0}^{M}f(x)\,e^{-sx}\,dx $$

is finite for any s ∈ R+. Its Laplace transform, denoted by $\mathcal{L}f$, is defined through:

$$ \forall s\in\mathbb{R}^+,\qquad (\mathcal{L}f)(s)=\int_{0}^{+\infty}f(x)\,e^{-sx}\,dx. $$

Under the given hypotheses, the map sending a continuous and vaguely integrable function into its Laplace transform is injective, so if $\mathcal{L}g=f$ holds we may say that g is the inverse Laplace transform of f, by using the notation $g=\mathcal{L}^{-1}f$.

If $f(x)=x^k$ with k ∈ N, we may notice that

$$ (\mathcal{L}f)(s)=\frac{k!}{s^{k+1}}, $$

hence by exploiting the linearity of the Laplace transform and the integration by parts formula we may state that:

$$ \mathcal{L}^{-1}\left(\frac{1}{(2x+1)^3}\right)=\frac{s^2}{16}\,e^{-s/2}. $$

That allows us to convert the series $\sum_{n\geq 0}\frac{(-1)^n}{(2n+1)^3}$ into an (improper) integral:

$$ \sum_{n\geq 0}\frac{(-1)^n}{(2n+1)^3}=\int_{0}^{+\infty}\frac{s^2}{16}\,e^{-s/2}\sum_{n\geq 0}(-1)^n e^{-ns}\,ds=\frac{1}{32}\int_{0}^{+\infty}\frac{s^2}{\cosh(s/2)}\,ds $$

and the equality between the last integral and $\pi^3$ is a straightforward consequence of the residue Theorem. In a similar way,

$$ \frac{3}{4}\,\zeta(2)=\sum_{n\geq 0}\frac{1}{(2n+1)^2}=\frac{1}{8}\int_{0}^{+\infty}\frac{s\,ds}{\sinh(s/2)}. $$
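Both sides of the series-to-integral conversion can be compared numerically. This is a rough sketch: the composite Simpson rule, the truncation of the integral at s = 100 and the function names are choices made here for illustration only.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def alternating_cubes(terms):
    """Partial sum of sum_{n>=0} (-1)^n / (2n+1)^3."""
    return sum((-1) ** n / (2 * n + 1) ** 3 for n in range(terms))

def laplace_side():
    """(1/32) * integral of s^2 / cosh(s/2) over (0, 100); the tail is negligible."""
    return simpson(lambda s: s * s / math.cosh(s / 2), 0.0, 100.0, 20000) / 32

if __name__ == "__main__":
    print(alternating_cubes(2000), laplace_side(), math.pi ** 3 / 32)
```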

The trick of converting a series into an integral through the (inverse) Laplace transform is widespread in Analytic

Number Theory: for instance, it is the key ingredient of Ankeny and Wolf’s proof of the fact that, by assuming the

generalized Riemann hypothesis (GRH), for any prime p large enough the minimum quadratic non-residue (mod p) is

$\leq 2\log^2(p)$. It is interesting to underline that the sharpest GRH-independent upper bound currently known, arising from the combination of Burgess inequality with Vinogradov’s amplification trick, is much weaker: $\eta_p\ll p^{\frac{1}{4\sqrt{e}}}$.

Another crucial property of the Laplace transform is the following one:

Lemma 88. Under suitable (and not particularly restrictive) hypotheses on the regularity of f and g and their speed of decay, we have:

$$ \int_{0}^{+\infty}f(x)\cdot g(x)\,dx=\int_{0}^{+\infty}(\mathcal{L}f)(s)\cdot(\mathcal{L}^{-1}g)(s)\,ds. $$

There are many famous applications of this Lemma, that can be seen as a “regularized” version of the integration

by parts formula. For instance, Dirichlet and Fresnel’s integrals are very simple to compute by using the Laplace

transform.

Lemma 89 (Dirichlet).

$$ \lim_{M\to+\infty}\int_{0}^{M}\frac{\sin x}{x}\,dx=\frac{\pi}{2}. $$

Proof. Since $\mathcal{L}^{-1}\left(\frac{1}{x}\right)=1$ and, due to the integration by parts formula, $\mathcal{L}(\sin x)=\frac{1}{s^2+1}$, we have:

$$ \lim_{M\to+\infty}\int_{0}^{M}\frac{\sin x}{x}\,dx=\int_{0}^{+\infty}\frac{ds}{s^2+1}=\lim_{N\to+\infty}\arctan(N)=\frac{\pi}{2}. $$

Lemma 90 (Fresnel).

$$ \lim_{M\to+\infty}\int_{0}^{M}\sin(x^2)\,dx=\lim_{M\to+\infty}\int_{0}^{M}\cos(x^2)\,dx=\sqrt{\frac{\pi}{8}}. $$

Proof. Since the equality $\Gamma\left(\frac{1}{2}\right)=\sqrt{\pi}$ holds (that can be proved by considering the integral of a gaussian function or, equivalently, as a consequence of Legendre’s duplication formula for the Γ function), we have that $\mathcal{L}^{-1}\left(\frac{1}{\sqrt{x}}\right)=\frac{1}{\sqrt{\pi s}}$ (i.e. $\frac{1}{\sqrt{x}}$ is an eigenfunction for the Laplace transform). In particular:

$$ \int_{0}^{+\infty}\sin(x^2)\,dx=\int_{0}^{+\infty}\frac{\sin x}{2\sqrt{x}}\,dx=\frac{1}{2\sqrt{\pi}}\int_{0}^{+\infty}\frac{ds}{\sqrt{s}\,(s^2+1)}=\frac{1}{\sqrt{\pi}}\int_{0}^{+\infty}\frac{dt}{t^4+1}, $$

$$ \int_{0}^{+\infty}\cos(x^2)\,dx=\int_{0}^{+\infty}\frac{\cos x}{2\sqrt{x}}\,dx=\frac{1}{2\sqrt{\pi}}\int_{0}^{+\infty}\frac{\sqrt{s}\,ds}{s^2+1}=\frac{1}{\sqrt{\pi}}\int_{0}^{+\infty}\frac{t^2\,dt}{t^4+1}. $$

We remark that the integrals of rapidly oscillating functions have been converted into integrals of positive and very well-behaved functions. Additionally, from Sophie Germain’s identity

$$ t^4+1=\left(t^2-\sqrt{2}\,t+1\right)\left(t^2+\sqrt{2}\,t+1\right) $$

we can derive the partial fraction decomposition of $\frac{1}{t^4+1}$, $\frac{t^2}{t^4+1}$ and the computation of Fresnel integrals boils down to the computation of some values of the arctangent function, like in the previous case.

We also have that the Laplace transform (or the Fourier transform, that we have not introduced yet) can be used to define derivatives of fractional order, or fractional derivatives. We have indeed that due to the integration by parts formula, the Laplace transform of $f^{(n)}(x)$ depends in a very simple way on $s^n(\mathcal{L}f)(s)$, hence we may define a half-differentiation operator (also known as semiderivative) in the following way:

$$ (D^{1/2}f)(x)=\mathcal{L}^{-1}\left(\sqrt{s}\,(\mathcal{L}f)(s)\right)(x). $$

As soon as all the involved transforms and inverse transforms are well-defined, we have $D^{1/2}\circ D^{1/2}=\frac{d}{dx}$.

For instance, by setting

$$ C=\int_{-\infty}^{+\infty}e^{-x^2}\,dx $$

we have $\mathcal{L}(\sqrt{x})=\frac{C}{2s^{3/2}}$ and $D^{1/2}x=\frac{2}{C}\sqrt{x}$. Additionally $\mathcal{L}\left(\frac{1}{\sqrt{x}}\right)=\frac{C}{\sqrt{s}}$ and

$$ D^{3/2}x=\frac{d}{dx}D^{1/2}x=\frac{1}{C\sqrt{x}}. $$

However this is not the only way for defining fractional derivatives.

For instance, for every $C^\infty$ function on the whole real line we have that:

$$ f^{(n)}(x)=\lim_{h\to 0^+}\frac{1}{h^n}\sum_{k\geq 0}(-1)^k\binom{n}{k}f(x-kh) $$

where the binomial coefficient $\binom{n}{k}$ is well-defined also if n ∉ N.

So we might introduce the semiderivative of f (in the Grünwald-Letnikov sense) also as

$$ \lim_{h\to 0^+}\frac{1}{\sqrt{h}}\sum_{k\geq 0}(-1)^k\binom{1/2}{k}f(x-kh)=\lim_{h\to 0^+}\frac{1}{\sqrt{h}}\left[f(x)-\sum_{k\geq 1}\binom{2k}{k}\frac{f(x-kh)}{4^k(2k-1)}\right] $$

but in order that the involved series is (at least conditionally) convergent, the f function has to be smooth and with a sufficiently rapid decay to zero. For functions in the Schwartz space S(R), the two given definitions of semiderivative are equivalent, but they are not in full generality.

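Exercise 91. Investigate the interplay between the semiderivative of the sine function (defined through the Laplace transform) and Fresnel integrals.

$$ \lim_{\alpha\to+\infty}2\alpha\int_{0}^{+\infty}\sin(x^\alpha)\,dx=\pi,\qquad \int_{0}^{+\infty}\left(\frac{\sin x}{x}\right)^4dx=\frac{\pi}{3}. $$

Lemma 93. For any a > 0, the integration by parts formula leads to:

$$ \frac{1}{a}\int_{0}^{+\infty}\cos(nx)\,e^{-ax}\,dx=\frac{1}{a^2+n^2}=\int_{0}^{+\infty}\frac{\sin(nx)}{n}\,e^{-ax}\,dx. $$

Since

$$ \sum_{n=1}^{+\infty}\frac{\sin(nx)}{n} $$

is the Fourier series of

$$ f(x)=\pi\left(\frac{1}{2}-\left\{\frac{x}{2\pi}\right\}\right), $$

where {·} denotes the fractional part, we have

$$ \sum_{n=1}^{+\infty}\frac{1}{a^2+n^2}=\pi\int_{0}^{+\infty}\left(\frac{1}{2}-\left\{\frac{x}{2\pi}\right\}\right)e^{-ax}\,dx, $$

and by partitioning [0, +∞) as [0, 2π) ∪ [2π, 4π) ∪ . . . it follows that:

$$ \sum_{n=1}^{+\infty}\frac{1}{a^2+n^2}=\frac{e^{2a\pi}}{e^{2a\pi}-1}\int_{0}^{2\pi}\frac{\pi-x}{2}\,e^{-ax}\,dx=\frac{\pi a\coth(\pi a)-1}{2a^2}. $$

By computing the limit of both sides as a → 0+, we find $\zeta(2)=\frac{\pi^2}{6}$ again.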

Exercise 95. Prove the following identity through the Laplace transform:

$$ \int_{0}^{+\infty}e^{-t}\sin^2(t)\,\frac{dt}{t}=\frac{1}{4}\log 5. $$

$$ \int_{0}^{+\infty}\frac{\sin(s)}{e^s-1}\,ds,\qquad \int_{0}^{+\infty}\frac{\sin(x)-x\cos(x)}{x^2}\,dx. $$

Exercise 97. By considering the Laplace transform of the indicator function of (0, a), prove that for any a > 0 we have:

$$ \arctan(a)=\int_{0}^{+\infty}\frac{\sin t}{t}\left(1-e^{-at}\right)dt. $$

$$ \int_{0}^{+\infty}\frac{1}{x+1-u}\cdot\frac{dx}{\pi^2+\log^2(x)}=\frac{1}{u}+\frac{1}{\log(1-u)}. $$

Exercise 99. By the Laplace transform and the Cauchy-Schwarz inequality, prove that:

$$ \int_{0}^{+\infty}\frac{\sin x}{x(x+1)}\,dx\leq 1. $$

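Prove that, due to the Laplace transform:

$$ \sum_{n\geq 0}\frac{(-1)^n}{\sqrt{2n+1}}=\frac{1}{\sqrt{\pi}}\int_{0}^{+\infty}\frac{ds}{\cosh(s^2)} $$

where a slowly convergent series has been converted into the integral of a function in S(R+).

Notice that it is possible to state:

$$ \frac{1}{\sqrt{\pi}}\int_{0}^{+\infty}\frac{ds}{\exp\left(\frac{s^4}{2}\right)}\;\leq\;\sum_{n\geq 0}\frac{(-1)^n}{\sqrt{2n+1}}\;\leq\;\frac{1}{\sqrt{\pi}}\int_{0}^{+\infty}\frac{ds}{1+\frac{s^4}{2}}. $$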

The Laplace transform also encodes an important Theorem known as differentiation under the integral sign or Feynman’s trick. In his autobiography Surely you’re joking, Mr. Feynman, the American physicist Richard Feynman talks about instruments learned by reading Woods’ Advanced Calculus book, that in Feynman’s opinion did not get proper credit in University courses: differentiation under the integral sign granted some fame to Feynman, since he proved many logarithmic integrals can be tackled in a very slick way by exploiting such instrument.

Lemma 101. As soon as the hypotheses of the dominated convergence Theorem are met,

$$ \frac{d}{dy}\int_{E}f(x,y)\,dx=\int_{E}\frac{\partial f}{\partial y}(x,y)\,dx. $$

The very deep consequences of the fact that, under suitable assumptions, the operators $\frac{\partial}{\partial y}$ and $\int_{E}dx$ commute might not be evident at first sight. Practical applications are needed to fully understand the hidden “magic”.

$$ I=\int_{0}^{1}\frac{\arctan(x)}{x\sqrt{1-x^2}}\,dx=\frac{\pi}{2}\log\left(1+\sqrt{2}\right). $$

By setting

$$ I(k)=\int_{0}^{\pi/2}\frac{\arctan(k\sin\theta)}{\sin\theta}\,d\theta $$

we have that I = I(1) and

$$ I'(k)=\int_{0}^{\pi/2}\frac{d\theta}{1+k^2\sin^2\theta}=\frac{\pi}{2\sqrt{1+k^2}} $$

holds by the substitution θ = arctan(t). Since $\lim_{k\to 0^+}I(k)=0$, it follows that:

$$ I=I(1)=\frac{\pi}{2}\int_{0}^{1}\frac{dk}{\sqrt{1+k^2}}=\frac{\pi}{2}\,\mathrm{arcsinh}(1)=\frac{\pi}{2}\log\left(1+\sqrt{2}\right) $$

as wanted.
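The two steps of the argument (the closed form of the derivative and the final value) can be checked with a simple quadrature. This is a sketch only: the midpoint rule, the number of nodes and the function names are assumptions made here.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint rule with n subintervals (never evaluates f at the endpoints)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def I(k, n=2000):
    """I(k) = integral over (0, pi/2) of arctan(k sin t) / sin t."""
    return midpoint(lambda t: math.atan(k * math.sin(t)) / math.sin(t),
                    0.0, math.pi / 2, n)

def I_prime(k, n=2000):
    """Integral over (0, pi/2) of 1 / (1 + k^2 sin^2 t), the derivative of I(k)."""
    return midpoint(lambda t: 1.0 / (1.0 + (k * math.sin(t)) ** 2),
                    0.0, math.pi / 2, n)

if __name__ == "__main__":
    print(I(1.0), math.pi / 2 * math.log(1 + math.sqrt(2)))
    print(I_prime(0.7), math.pi / (2 * math.sqrt(1 + 0.7 ** 2)))
```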

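$$ \int_{0}^{\pi/2}\log\left(\sin^2\theta+k^2\cos^2\theta\right)d\theta=\pi\log\frac{1+|k|}{2}. $$

Proof. By denoting as I(k) the integral in the LHS and exploiting $\int_{0}^{\pi/2}\log\cos\theta\,d\theta=-\frac{\pi}{2}\log 2$ we have that:

$$ I(k)=\int_{0}^{\pi/2}\log\left(k^2+\tan^2\theta\right)d\theta-\pi\log 2=\int_{0}^{+\infty}\frac{\log(k^2+t^2)}{1+t^2}\,dt-\pi\log 2, $$

$$ I'(k)=\int_{0}^{+\infty}\frac{2k\,dt}{(k^2+t^2)(1+t^2)}=\frac{\pi}{k+\mathrm{Sign}(k)}. $$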

Exercise 104. Prove that for any couple (a, b) of distinct and positive natural numbers we have:

$$ I(a,b)=\int_{0}^{+\infty}\frac{\log x}{(x+a)(x+b)}\,dx=\frac{\log^2(b)-\log^2(a)}{2(b-a)}. $$

$$ \begin{aligned}\int_{0}^{+\infty}\frac{\log x}{(x+a)(x+b)}\,dx&=\frac{1}{b-a}\int_{0}^{+\infty}\left(\frac{x}{x+a}-\frac{x}{x+b}\right)\frac{\log(x)}{x}\,dx\\&=\frac{1}{2(b-a)}\int_{0}^{+\infty}\left(\frac{b}{(x+b)^2}-\frac{a}{(x+a)^2}\right)\log^2(x)\,dx.\end{aligned} $$

If we define J(c) as

$$ J(c)=\int_{0}^{+\infty}\frac{c\log^2(x)}{(x+c)^2}\,dx=\int_{0}^{+\infty}\frac{\log^2(cz)}{(z+1)^2}\,dz $$

and notice that

$$ \int_{0}^{+\infty}\frac{\log^2(c)}{(z+1)^2}\,dz=\log^2(c),\qquad \int_{0}^{+\infty}\frac{2\log(c)\log(z)}{(z+1)^2}\,dz=0 $$

we immediately have:

$$ J(c)=\log^2(c)+\int_{0}^{+\infty}\frac{\log^2(z)}{(z+1)^2}\,dz=\log^2(c)+2\int_{0}^{1}\frac{\log^2(z)}{(z+1)^2}\,dz $$

and the claim, since the last integral does not depend on c.

Anyway we may observe that by the integration by parts formula

$$ \int_{0}^{1}\frac{\log^2(z)}{(1+z)^2}\,dz=-\int_{0}^{1}\frac{2\log z}{1+z}\,dz=2\int_{0}^{1}\frac{\log(1+z)}{z}\,dz=2\sum_{n\geq 1}\int_{0}^{1}\frac{(-1)^{n+1}z^{n-1}}{n}\,dz=2\sum_{n\geq 1}\frac{(-1)^{n+1}}{n^2}=\zeta(2). $$

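$$ \int_{0}^{+\infty}\frac{1-\cos\left(t\sqrt{e-1}\right)}{t\,e^{t}}\,dt=\frac{1}{2}. $$

$$ I(\omega)=\int_{0}^{+\infty}\frac{1-\cos(\omega t)}{t}\,e^{-t}\,dt. $$

It is trivial that $\lim_{\omega\to 0^+}I(\omega)=0$. We also have:

$$ I'(\omega)=\int_{0}^{+\infty}\sin(\omega t)\,e^{-t}\,dt=\frac{\omega}{1+\omega^2} $$

hence for any ω > 0

$$ I(\omega)=\int_{0}^{\omega}\frac{u}{1+u^2}\,du=\frac{1}{2}\log(1+\omega^2) $$

and by considering $\omega=\sqrt{e-1}$ the claim follows.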

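$$ I(a,b)=\int_{0}^{\pi}\frac{d\theta}{(a+b\cos\theta)^2}=\frac{\pi a}{(a^2-b^2)^{3/2}}. $$

$$ I(a,b)=\int_{0}^{\pi/2}\frac{d\theta}{(a+b\cos\theta)^2}+\int_{0}^{\pi/2}\frac{d\theta}{(a-b\cos\theta)^2}=2\int_{0}^{\pi/2}\frac{a^2+b^2\cos^2\theta}{(a^2-b^2\cos^2\theta)^2}\,d\theta. $$

By enforcing the substitution θ = arctan t we get:

$$ I(a,b)=2\int_{0}^{+\infty}\frac{a^2(1+t^2)+b^2}{\left(a^2(1+t^2)-b^2\right)^2}\,dt=\frac{2}{a(a^2-b^2)^{3/2}}\int_{0}^{+\infty}\frac{(a^2+b^2)+(a^2-b^2)t^2}{(1+t^2)^2}\,dt $$

so I(a, b) only depends on the integrals $H_0=\int_{0}^{+\infty}\frac{dt}{(1+t^2)^2}$ and $H_2=\int_{0}^{+\infty}\frac{t^2\,dt}{(1+t^2)^2}=\frac{\pi}{2}-H_0$.

On the other hand, for any β > 0

$$ \int_{0}^{+\infty}\frac{dt}{(\beta+t^2)^2}=-\int_{0}^{+\infty}\frac{\partial}{\partial\beta}\,\frac{1}{\beta+t^2}\,dt=-\frac{d}{d\beta}\int_{0}^{+\infty}\frac{dt}{\beta+t^2}=-\frac{d}{d\beta}\,\frac{\pi}{2\sqrt{\beta}}=\frac{\pi}{4\beta^{3/2}} $$

hence $H_0=H_2=\frac{\pi}{4}$ and we have:

$$ I(a,b)=\frac{2}{a(a^2-b^2)^{3/2}}\cdot\frac{\pi a^2}{2}=\frac{\pi a}{(a^2-b^2)^{3/2}} $$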

as wanted. The given problem can be solved through an interesting geometric approach, too. If a simple closed curve around the origin is regular and has a parametrization of the form (ρ(θ) cos θ, ρ(θ) sin θ) for θ ∈ [0, 2π], the enclosed area is given by:

$$ A=\frac{1}{2}\int_{0}^{2\pi}\rho(\theta)^2\,d\theta. $$

In our case the curve described by $\rho(\theta)=\frac{1}{a+b\cos\theta}$ is an ellipse and

$$ I(a,b)=\int_{0}^{\pi}\frac{d\theta}{(a+b\cos\theta)^2}=\frac{1}{2}\int_{0}^{2\pi}\rho(\theta)^2\,d\theta $$

is precisely the area enclosed by such ellipse. Since affine maps preserve the ratios of areas, the area enclosed by an ellipse is π times the product between the lengths of the semi-axes. The length of the major axis is given by $\frac{1}{a-b}+\frac{1}{a+b}=\frac{2a}{a^2-b^2}$ and the ratio between the square of the minor semi-axis and the major semi-axis (also known as semi-latus rectum) equals $\frac{1}{a}$. It follows that:

$$ I(a,b)=A=\pi\cdot\frac{a}{a^2-b^2}\cdot\frac{1}{\sqrt{a^2-b^2}} $$

just like we found before.

Exercise 107. Prove that for any couple (A, B) of positive real numbers the following identity holds:

$$ I(A,B)=\int_{0}^{+\infty}\frac{dx}{\left(1+\frac{x^2}{A^2}\right)\left(1+\frac{x^2}{B^2}\right)}=\frac{\pi}{4}\,\mathrm{HM}(A,B) $$

where HM(A, B) is the harmonic mean of A and B, i.e. $\frac{2AB}{A+B}$.

$$ \int_{0}^{+\infty}\frac{\sin(3x)\sin(4x)\sin(5x)}{x\,\sin^2(x)\cosh(x)}\,dx. $$

Proof. It is useful to notice that:

$$ I(n)\stackrel{\text{def}}{=}\int_{0}^{+\infty}\frac{\sin(2nx)}{x\cosh(x)}\,dx=2\arctan\tanh\frac{\pi n}{2}. $$

By expanding $\frac{1}{\cosh x}$ as a geometric series we get

$$ \frac{1}{\cosh x}=\frac{2}{e^x+e^{-x}}=\frac{2e^{-x}}{1+e^{-2x}}=2\left(e^{-x}-e^{-3x}+e^{-5x}-e^{-7x}+\ldots\right) $$

and it follows that

$$ I(n)=2\sum_{k\geq 0}(-1)^k\int_{0}^{+\infty}\frac{\sin(2nx)}{x}\,e^{-(2k+1)x}\,dx. $$

By differentiating with respect to n under the integral sign:

$$ \begin{aligned}I'(n)&=4\sum_{k\geq 0}(-1)^k\int_{0}^{+\infty}\cos(2nx)\,e^{-(2k+1)x}\,dx=4\sum_{k\geq 0}\frac{(2k+1)(-1)^k}{(2k+1)^2+4n^2}\\&=2\sum_{k\geq 0}(-1)^k\left(\frac{1}{(2k+1)+2in}+\frac{1}{(2k+1)-2in}\right).\end{aligned} $$

The last series can be interpreted as a logarithmic derivative, and it leads to the identity

$$ I'(n)=\frac{\pi}{\cosh(\pi n)}. $$

Since $\lim_{n\to 0^+}I(n)=0$, we get:

$$ I(n)=\int_{0}^{n}\frac{\pi}{\cosh(\pi u)}\,du=2\arctan\tanh\frac{\pi n}{2}. $$

On the other hand, by exploiting Chebyshev polynomials of the second kind it is not difficult to check that:

$$ \frac{\sin(3x)\sin(4x)\sin(5x)}{\sin(x)^2}=2\sin(2x)+3\sin(4x)+3\sin(6x)+2\sin(8x)+\sin(10x), $$

hence

$$ \int_{0}^{+\infty}\frac{\sin(3x)\sin(4x)\sin(5x)}{x\,\sin^2(x)\cosh(x)}\,dx=2\,I(1)+3\,I(2)+3\,I(3)+2\,I(4)+I(5)\approx\frac{11\pi}{2} $$

where the last approximation follows from the fact that I(n), for n ≫ 1, converges quite fast to the value $\frac{\pi}{2}$.
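The quality of the approximation 11π/2 can be gauged by evaluating the closed form of I(n) directly. The short sketch below also spot-checks the trigonometric decomposition at a sample point; the names `I_closed`, `WEIGHTS`, `lhs` and `rhs` are hypothetical.

```python
import math

def I_closed(n):
    """Closed form 2 * arctan(tanh(pi n / 2)) of the integral I(n)."""
    return 2 * math.atan(math.tanh(math.pi * n / 2))

# coefficients of sin(2x), sin(4x), ..., sin(10x) in the decomposition
WEIGHTS = {1: 2, 2: 3, 3: 3, 4: 2, 5: 1}

def lhs(x):
    return math.sin(3 * x) * math.sin(4 * x) * math.sin(5 * x) / math.sin(x) ** 2

def rhs(x):
    return sum(w * math.sin(2 * n * x) for n, w in WEIGHTS.items())

total = sum(w * I_closed(n) for n, w in WEIGHTS.items())

if __name__ == "__main__":
    print(total, 11 * math.pi / 2)
    print(lhs(0.3), rhs(0.3))
```

Most of the gap between the two printed values in the first line comes from I(1), which is the only term noticeably below π/2.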

$$ I(t)=\int_{0}^{1}\frac{x^t-1}{\log x}\,dx=\log(t+1). $$

Proof. Since $x^t-1=\exp(t\log x)-1=t\log(x)+O\left(t^2\log^2 x\right)$ for t → 0+ and − log(x) is a positive and integrable function over the interval (0, 1), we have that $\lim_{t\to 0^+}I(t)=0$. In order to prove the claim it is enough to show that:

$$ \frac{d}{dt}\int_{0}^{1}\frac{x^t-1}{\log x}\,dx=\frac{1}{t+1}=\frac{d}{dt}\log(t+1) $$

which is a trivial consequence of $\frac{\partial}{\partial t}\frac{x^t-1}{\log x}=x^t$ and $\int_{0}^{1}x^t\,dx=\frac{1}{t+1}$.

$$ \int_{0}^{\pi}\log(7+\cos\theta)\,d\theta. $$

$$ x^{2n}-1=\prod_{k=1}^{2n}\left(x-\exp\frac{\pi i k}{n}\right)=(x^2-1)\prod_{k=1}^{n-1}\left(x^2+1-2x\cos\frac{\pi k}{n}\right). $$

If we choose x in such a way that $\frac{x^2+1}{2x}=-7$ holds, for instance through $x=4\sqrt{3}-7$, we get:

$$ \int_{0}^{\pi}\log(7+\cos\theta)\,d\theta=\lim_{n\to+\infty}\frac{\pi}{n}\log\prod_{k=1}^{n-1}\left(7+\cos\frac{\pi k}{n}\right)=\pi\log\left(\frac{7}{2}+2\sqrt{3}\right). $$

Exercise 111. Prove that for any a ∈ (−1, 1)

$$ \int_{0}^{\pi}\log\left(1-2a\cos x+a^2\right)dx=0. $$

$$ \int_{0}^{1}\frac{\arctan x}{x+1}\,dx=\frac{\pi}{8}\log 2. $$

$$ \int_{0}^{\pi/2}\log(\sin x)\log(\cos x)\,dx=\frac{\pi}{2}\log^2(2)-\frac{\pi^3}{48}. $$

At this point it should be clear that Feynman’s trick is a really powerful tool.

In the following formal manipulation

$$ \int_{E}f(x)\,dx=\int_{E}\tilde{f}(x,1)\,dx=\int_{0}^{1}\int_{E}\frac{\partial}{\partial\alpha}\tilde{f}(x,\alpha)\,dx\,d\alpha=\int_{0}^{1}g(\alpha)\,d\alpha $$

we have a complete freedom in choosing where to introduce the dummy parameter α in the definition of $\tilde{f}$, in such a way that $\tilde{f}(x,\alpha)\big|_{\alpha=1}=f(x)$ holds. In particular, Feynman’s trick is really effective in the computation of logarithmic integrals, since $\log(x)=\left.\frac{d}{d\alpha}x^\alpha\right|_{\alpha=0^+}$, or in proving identities like

$$ \sum_{n\geq 1}\frac{\log n}{n^2}=-\left.\frac{d}{d\alpha}\sum_{n\geq 1}\frac{1}{n^\alpha}\right|_{\alpha=2}=-\zeta'(2). $$

Additionally differentiation under the integral sign, the Laplace transform and Frullani’s Theorem

provide interesting integral representations for logarithms.

Theorem 114 (Frullani). If $f\in C^1(\mathbb{R}^+)$ and $\lim_{x\to+\infty}f(x)=0$, for any couple (a, b) of positive real numbers we have:

$$ \int_{0}^{+\infty}\frac{f(ax)-f(bx)}{x}\,dx=\log\left(\frac{b}{a}\right)\cdot\lim_{x\to 0^+}f(x). $$

The most typical case of Frullani’s Theorem is related to the function $f(x)=e^{-x}$:

$$ \int_{0}^{+\infty}\frac{e^{-ax}-e^{-bx}}{x}\,dx=\log\left(\frac{b}{a}\right). $$

Let us prove this Corollary through Feynman’s trick and the Laplace transform.

It is enough to show that:

$$ \forall a>0,\qquad I(a)=\int_{0}^{+\infty}\frac{e^{-x}-e^{-ax}}{x}\,dx=\log(a). $$

Since I(1) = 0 and

$$ I'(a)=\int_{0}^{+\infty}e^{-ax}\,dx=\frac{1}{a} $$

we have:

$$ I(a)=\int_{1}^{a}I'(\xi)\,d\xi=\int_{1}^{a}\frac{d\xi}{\xi}=\log(a). $$

As an alternative, by exploiting $\mathcal{L}^{-1}\left(\frac{1}{x}\right)=1$ and $\mathcal{L}(e^{-ax})=\frac{1}{a+s}$,

$$ I(a)=\int_{0}^{+\infty}\left(\frac{1}{s+1}-\frac{1}{a+s}\right)ds=\lim_{M\to+\infty}\left[\log(M+1)-\log(M+a)+\log(a)\right]=\log(a). $$
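A direct quadrature gives a quick sanity check of the exponential case of Frullani's Theorem. In this sketch the truncation of the integration range at 60 and the midpoint rule are arbitrary choices, and the function name `frullani_exp` is hypothetical.

```python
import math

def frullani_exp(a, b, upper=60.0, n=200000):
    """Midpoint-rule estimate of the integral of (exp(-a x) - exp(-b x)) / x over (0, upper)."""
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h            # midpoints avoid the removable singularity at 0
        total += (math.exp(-a * x) - math.exp(-b * x)) / x
    return total * h

if __name__ == "__main__":
    for a, b in ((1, 3), (2, 5), (1, 1)):
        print(a, b, frullani_exp(a, b), math.log(b / a))
```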

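We remark the strong analogy between the last identity and $H_n=\sum_{m\geq 1}\left(\frac{1}{m}-\frac{1}{m+n}\right)$. Frullani’s Theorem can be used to provide integral representations for the Euler-Mascheroni constant γ or the constant $\log\frac{\pi}{2}$:

$$ \gamma\stackrel{\text{def}}{=}\lim_{n\to+\infty}\left(H_n-\log n\right)=\sum_{n\geq 1}\left[\frac{1}{n}-\log\left(1+\frac{1}{n}\right)\right]=\int_{0}^{+\infty}\left(\frac{1}{e^x-1}-\frac{1}{x\,e^x}\right)dx. $$

$$ \frac{1}{n}-\log\left(1+\frac{1}{n}\right)=\int_{0}^{+\infty}\left(e^{-nx}-\frac{e^{-nx}-e^{-(n+1)x}}{x}\right)dx $$

We may notice that by Weierstrass product for the Γ function we have $\gamma=-\Gamma'(1)$.

It follows that by Feynman’s trick we also have:

$$ \gamma=-\left.\frac{d}{ds}\int_{0}^{+\infty}x^{s-1}e^{-x}\,dx\,\right|_{s=1}=-\int_{0}^{+\infty}\frac{\log x}{\exp(x)}\,dx. $$

Lemma 117 (An integral representation for the $\log\frac{\pi}{2}$ constant).

$$ \log\frac{\pi}{2}=\int_{0}^{+\infty}\frac{e^x-1}{e^x+1}\cdot\frac{dx}{x\,e^x}. $$

Proof. By Weierstrass product for the sinc function it is simple to prove that:

$$ \frac{2}{\pi}=\prod_{n\geq 1}\left(1-\frac{1}{4n^2}\right). $$

It follows that

$$ \log\frac{\pi}{2}=\sum_{n\geq 1}(-1)^{n+1}\left[\log(n+1)-\log(n)\right]=\int_{0}^{+\infty}\sum_{n\geq 1}(-1)^{n+1}\frac{e^{-nx}-e^{-(n+1)x}}{x}\,dx=\int_{0}^{+\infty}\frac{e^x-1}{e^x+1}\cdot\frac{dx}{x\,e^x}. $$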

Exercise 118. Prove that

$$ \sum_{n\geq 1}\left[n\log\left(\frac{2n+1}{2n-1}\right)-1\right]=\int_{0}^{+\infty}\frac{\sinh x-x}{2x\sinh^2(x)}\,dx=\frac{1-\log 2}{2}. $$

Hint: the first equality follows from the Laplace transform, and the equality between the first term and the last one is a consequence of Stirling’s inequality, since:

$$ \begin{aligned}S_N=\sum_{n=1}^{N}\left[n\log\left(\frac{2n+1}{2n-1}\right)-1\right]&=-N+\sum_{n=1}^{N}n\left[\log(2n+1)-\log(2n-1)\right]\\&=-N+N\log(2N+1)-\sum_{n=1}^{N-1}\log(2n+1)\\&=-N+N\log(2N+1)-\log\left[(2N-1)!!\right]\\&=-N+N\log(2N+1)-\log\left[(2N)!\right]+N\log 2+\log\left[N!\right].\end{aligned} $$

$$ S_\infty=\sum_{n\geq 1}\left[2n\,\mathrm{arctanh}\left(\frac{1}{2n}\right)-1\right]=\sum_{n\geq 1}\sum_{m\geq 1}\frac{1}{(2m+1)(2n)^{2m}}=\sum_{m\geq 1}\frac{\zeta(2m)}{4^m(2m+1)} $$

and recognize in the last series the integral $\int_{0}^{1/2}\left[1-\pi x\cot(\pi x)\right]dx$ that is simple to compute by integration by parts: in this way the problem boils down to the computation of the well-known integral $\int_{0}^{\pi/2}\log\sin(\theta)\,d\theta=-\frac{\pi}{2}\log(2)$.

We now introduce another very powerful tool for dealing with series, i.e. the discrete equivalent of the integration by

parts formula.

(ΣΣ) version: if $\{a_n\}_{n\geq 1}$ and $\{b_n\}_{n\geq 1}$ are two sequences of complex numbers, by setting $A_n=a_1+\ldots+a_n$ we have:

$$ \sum_{n=1}^{N}a_n b_n=A_N b_N-\sum_{n=1}^{N-1}A_n\left(b_{n+1}-b_n\right). $$

(Σ∫) version: if φ(x) is a $C^1$ function on R+, for any sequence $\{a_n\}_{n\geq 1}$ of complex numbers we have:

$$ \sum_{1\leq n\leq x}a_n\,\varphi(n)=A(x)\,\varphi(x)-\int_{1}^{x}A(u)\,\varphi'(u)\,du $$

where

$$ A(x)=\sum_{1\leq n\leq x}a_n. $$
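The discrete-discrete version is an exact algebraic identity, so it can be tested on arbitrary integer data. The sketch below uses 0-based Python lists, so that `A[n]` stores the partial sum called A_{n+1} above; the function names are hypothetical.

```python
import random

def direct_sum(a, b):
    """Left-hand side: sum of a_n * b_n."""
    return sum(x * y for x, y in zip(a, b))

def summed_by_parts(a, b):
    """Right-hand side: A_N b_N - sum_{n=1}^{N-1} A_n (b_{n+1} - b_n)."""
    A, partial = [], 0
    for x in a:
        partial += x
        A.append(partial)        # A[n] is the sum of the first n + 1 terms
    N = len(a)
    return A[-1] * b[-1] - sum(A[n] * (b[n + 1] - b[n]) for n in range(N - 1))

if __name__ == "__main__":
    random.seed(0)
    a = [random.randint(-9, 9) for _ in range(12)]
    b = [random.randint(-9, 9) for _ in range(12)]
    print(direct_sum(a, b), summed_by_parts(a, b))
```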

The existence of a discrete-discrete and a discrete-continuous analogue of the integration by parts formula is very practical, almost comforting. Abel's theorem allows one to study the convergence of a power series on the boundary of the region of convergence, or to regularize series whose convergence is uncertain at first sight. The following Lemma, for instance, is a straightforward consequence of the summation by parts formula:

4 THE GLORY OF FOURIER, LAPLACE, FEYNMAN AND FRULLANI

(Discrete version) If $\{a_n\}_{n\geq 1}$ and $\{b_n\}_{n\geq 1}$ are two sequences of real numbers, such that $A_n = a_1+\ldots+a_n$ is bounded and $b_n$ is decreasing towards zero, the series

$$\sum_{n\geq 1}a_n b_n$$

is convergent.

(Continuous version) If $f(x),g(x)\in C^0(\mathbb{R}^+)$, where $g(x)$ is weakly decreasing towards zero and $\int_0^x f(u)\,du$ is bounded, the improper Riemann integral

$$\int_0^{+\infty}f(x)\,g(x)\,dx$$

is convergent.

The convergence of

$$\sum_{n\geq 1}\frac{\sin n}{n},\qquad \sum_{n\geq 1}\frac{\cos n}{n},\qquad \int_0^{+\infty}\frac{\sin x}{x}\,dx$$

immediately follows from Dirichlet's test, once we show two simple results on the behaviour of particular trigonometric sums.

Lemma 121. For any $k\in(0,\pi]$ and any positive integer $N$ we have:

$$\left|\sum_{n=1}^{N}\sin(kn)\right|\leq \frac{1}{2}\cot\frac{k}{4},\qquad \left|\sum_{n=1}^{N}\cos(kn)\right|\leq \frac12+\frac{1}{2\sin\frac{k}{2}}.$$

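As a matter of fact, by the sine/cosine addition formulas:

$$\sin\frac{k}{2}\sum_{n=1}^{N}\sin(kn) = \frac12\sum_{n=1}^{N}\left[\cos\left(\left(n-\tfrac12\right)k\right)-\cos\left(\left(n+\tfrac12\right)k\right)\right] = \frac12\left[\cos\frac{k}{2}-\cos\frac{(2N+1)k}{2}\right],$$

$$\sin\frac{k}{2}\sum_{n=1}^{N}\cos(kn) = \frac12\sum_{n=1}^{N}\left[\sin\left(\left(n+\tfrac12\right)k\right)-\sin\left(\left(n-\tfrac12\right)k\right)\right] = \frac12\left[\sin\frac{(2N+1)k}{2}-\sin\frac{k}{2}\right],$$

so, in particular:

$$\left|\sum_{n=1}^{N}\sin(kn)\right|\leq\frac{1+\cos\frac k2}{2\sin\frac k2}=\frac12\cot\frac k4,\qquad \left|\sum_{n=1}^{N}\cos(kn)\right|\leq\frac{1+\sin\frac k2}{2\sin\frac k2}\leq\frac12+\frac{1}{2\sin\frac k2}.$$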
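The two bounds can be checked numerically for the case $k=1$, which is the one needed for $\sum\frac{\sin n}{n}$ and $\sum\frac{\cos n}{n}$. The sketch below is illustrative only; the helper names and the cutoff $N=10^4$ are arbitrary assumptions.

```python
import math

def trig_partial_sums(k, N):
    """Return the lists of partial sums of sin(kn) and cos(kn) for n = 1..N."""
    s = c = 0.0
    sines, cosines = [], []
    for n in range(1, N + 1):
        s += math.sin(k * n)
        c += math.cos(k * n)
        sines.append(s)
        cosines.append(c)
    return sines, cosines

def bounds(k):
    """(1/2) cot(k/4) and 1/2 + 1/(2 sin(k/2)), assuming sin(k/2) > 0 and cos(k/2) >= 0."""
    return 0.5 / math.tan(k / 4), 0.5 + 0.5 / math.sin(k / 2)

k = 1.0
sines, cosines = trig_partial_sums(k, 10_000)
sin_bound, cos_bound = bounds(k)
print(max(abs(x) for x in sines), sin_bound)
print(max(abs(x) for x in cosines), cos_bound)
```

The largest partial sums come very close to the bounds, which suggests the estimates cannot be improved much for this value of $k$.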

Exercise 122 (about Van Der Corput's trick and Weyl's inequality). Prove that the series

$$\sum_{n\geq 1}\frac{\sin(n^2)}{n}$$

is convergent. Hint: it is enough to show that for any $N$ large enough the inequality

$$\left|\sum_{n=1}^{N}\exp(in^2)\right| \leq 4\sqrt{N}\log^2(N)$$

holds.

Let us see how to compute $\sum_{n\geq1}\frac{\sin n}{n}$ and $\sum_{n\geq 1}\frac{\cos n}{n}$ in an explicit way, now that we know they are convergent series. Since we have:

$$\sum_{n\geq1}\frac{z^n}{n} = -\log(1-z)$$

for any complex number $z$ with modulus less than 1, it is reasonable to expect that, by evaluating both sides at $z=e^{i}$,

$$\sum_{n\geq1}\frac{\sin n}{n} = -\operatorname{Im}\log(1-e^i)=\frac{\pi-1}{2},\qquad \sum_{n\geq1}\frac{\cos n}{n}=-\operatorname{Re}\log(1-e^i) = -\log\left(2\sin\tfrac12\right)$$

holds. That is correct, indeed, but we have to explain why we are allowed to evaluate the Taylor series $\sum_{n\geq 1}\frac{z^n}{n}$ at a point on the boundary of the convergence region $|z|<1$. We may notice that the function $-\log(1-z)$ is continuous (and much more, actually: holomorphic) in a neighbourhood of $z=e^i$: that is enough to justify the “wild” evaluation we performed. Such observation is also known as Abel's Lemma. We may also proceed through a deformation of an integration path. Let us consider the function $g(z)=\frac{1}{1-z}$: it is a meromorphic function with a simple pole at $z=1$, and at the interior of the unit disk we have:

$$\frac{1}{1-z} = 1+z+z^2+z^3+\ldots = \sum_{n\geq 0}z^n.$$

By termwise integration,

$$-\log(1-z)=\int_0^z\frac{du}{1-u}=\sum_{n\geq1}\frac{z^n}{n},\qquad \sum_{n\geq1}\frac{\sin n}{n} = \operatorname{Im}\int_0^{e^i}g(z)\,dz.$$

Since $g$ is holomorphic away from $z=1$, the integral along the straight segment joining the origin with $e^i$ equals the integral along the path given by the concatenation of the straight segment joining $z=0$ with $z=-1$ and the arc $\gamma$ of the unit circle joining $-1$ with $e^i$. In particular:

$$\sum_{n\geq1}\frac{\sin n}{n} = \operatorname{Im}\int_0^{-1}\frac{dz}{1-z}+\operatorname{Im}\int_\gamma\frac{dz}{1-z}.$$

As a consequence:

$$\sum_{n\geq1}\frac{\sin n}{n} = \operatorname{Im}\int_{-1}^{e^i}\frac{dz}{1-z} = \operatorname{Im}\int_{\pi}^{1}\frac{ie^{i\theta}}{1-e^{i\theta}}\,d\theta = \operatorname{Re}\int_1^{\pi}\frac{e^{i\theta/2}}{e^{i\theta/2}-e^{-i\theta/2}}\,d\theta = \int_1^\pi\frac{\sin\frac\theta2}{2\sin\frac\theta2}\,d\theta = \frac{\pi-1}{2}.$$

The same argument gives the pointwise convergence of the Fourier series of $\frac{\pi-x}{2}$ on the interval $(0,\pi)$. Moreover, both the series $\sum_{n\geq1}\frac{\sin n}{n}$ and the integral $\int_0^{+\infty}\frac{\sin x}{x}\,dx$ can be computed through the properties of the Fejér kernel.
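The closed forms $\frac{\pi-1}{2}$ and $-\log\left(2\sin\frac12\right)$ can be compared with plain partial sums. This is a rough numerical sketch: the convergence is only of order $\frac1N$, so the cutoff $N=10^6$ (an arbitrary choice) gives about five correct digits.

```python
import math

def partial_sums(N):
    """Partial sums of sin(n)/n and cos(n)/n up to n = N (fsum limits rounding error)."""
    s = math.fsum(math.sin(n) / n for n in range(1, N + 1))
    c = math.fsum(math.cos(n) / n for n in range(1, N + 1))
    return s, c

sin_closed = (math.pi - 1) / 2
cos_closed = -math.log(2 * math.sin(0.5))
sin_sum, cos_sum = partial_sums(10**6)

print(sin_sum, sin_closed)
print(cos_sum, cos_closed)
```

By summation by parts the truncation error is bounded by a small multiple of $\frac1N$, which is consistent with the observed agreement.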


The Fejér kernel. In Functional Analysis density and regularization (or “mollification”) tricks are often used. They are based on the fact that the convolution $(f*g)(x)$, defined through

$$(f*g)(x)=\int_{-\infty}^{+\infty}f(\tau)\,g(x-\tau)\,d\tau$$

inherits the best regularity between the behaviours of $f(x)$ and $g(x)$. In particular, if $g(x)$ is a non-negative function with unit integral, belonging to $C^k$ and concentrated enough around the origin, $(f*g)(x)$ is an excellent approximation of $f(x)$ that may be far more regular than $f(x)$. Functions $g(x)$ fulfilling the previous constraints are regular approximations of the Dirac $\delta$ distribution, and they are called approximate identities or convolution kernels. The Fejér kernel is a classical example.

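Definition 123.

$$\begin{aligned}F_N(x) &\stackrel{\text{def}}{=} \sum_{|j|\leq N}\left(1-\frac{|j|}{N}\right)e^{ijx} = \frac1N\sum_{k=0}^{N-1}\sum_{s=-k}^{k}e^{isx} = \frac1N\sum_{k=0}^{N-1}\left(1+2\sum_{s=1}^{k}\cos(sx)\right)\\ &= \frac1N\cdot\frac{1-\cos(Nx)}{1-\cos x} = \frac1N\left(\frac{\sin\frac{Nx}{2}}{\sin\frac x2}\right)^2.\end{aligned}$$

This particular trigonometric function is non-negative due to the last identity (for short: it is a square). Moreover $\lim_{x\to0}F_N(x)=N$ and $\int_{-\pi}^{\pi}F_N(x)\,dx=2\pi$ for any $N\geq1$. By termwise integration,

$$\int_0^1F_N(x)\,dx = 1+\frac2N\sum_{k=1}^{N-1}\sum_{s=1}^{k}\frac{\sin s}{s} = 1+2\sum_{s=1}^{N-1}\frac{\sin s}{s}\left(1-\frac sN\right).$$

On the other hand,

$$\sum_{s=1}^{N}\frac{\sin s}{s}-\sum_{s=1}^{N}\frac{\sin s}{s}\left(1-\frac sN\right)=\frac1N\sum_{s=1}^{N}\sin(s) = O\left(\frac1N\right),$$

so:

$$\sum_{s\geq1}\frac{\sin s}{s}=\frac12\lim_{N\to+\infty}\left(-1+\int_0^1F_N(x)\,dx\right)=\frac{\pi-1}{2}$$

since $\lim_{N\to+\infty}\int_1^\pi F_N(x)\,dx=0$. It is not difficult to locate the stationary points of the function $\frac{\sin(Nx/2)}{\sin(x/2)}$ and state that for any large enough $N$ the ratio $\left(\frac{\sin(Nx/2)}{\sin(x/2)}\right)^2$ is bounded by an absolute constant on the interval $(1,\pi)$, from which it follows that $\int_1^\pi F_N(x)\,dx = O\left(\frac1N\right)$. With the same approach based on termwise integration and by exploiting a Riemann sum we have:

$$\lim_{N\to+\infty}\int_0^{k/N}F_N(x)\,dx = 2\cdot\operatorname{Si}(k)-\frac{2(1-\cos k)}{k},\qquad \operatorname{Si}(k)=\int_0^k\frac{\sin x}{x}\,dx,$$

and by letting $k\to+\infty$:

$$\int_0^{+\infty}\frac{\sin x}{x}\,dx=\frac\pi2.$$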
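The basic properties of the Fejér kernel can be checked numerically. The sketch below compares the trigonometric sum with the closed form and verifies that the integral over a period equals $2\pi$; the function names are hypothetical, and the integral uses an equal-weight rule, which is exact for trigonometric polynomials of degree smaller than the number of nodes.

```python
import math

def fejer_sum(N, x):
    """F_N(x) as the trigonometric sum 1 + 2 * sum_{s=1}^{N-1} (1 - s/N) cos(s x)."""
    return 1.0 + 2.0 * sum((1 - s / N) * math.cos(s * x) for s in range(1, N))

def fejer_closed(N, x):
    """F_N(x) as (1/N) * (sin(N x / 2) / sin(x / 2))^2, valid when sin(x/2) != 0."""
    return (math.sin(N * x / 2) / math.sin(x / 2)) ** 2 / N

def integral_over_period(N, points=256):
    """Equal-weight (periodic trapezoidal) rule on [-pi, pi]."""
    h = 2 * math.pi / points
    return h * sum(fejer_sum(N, -math.pi + j * h) for j in range(points))

for x in (0.3, 1.0, 2.5):
    print(x, fejer_sum(8, x), fejer_closed(8, x))
print(integral_over_period(8), 2 * math.pi)
```

Since the closed form is a square, the printed values are all non-negative, in agreement with the remark following the definition.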

The pointwise convergence of the Fourier series of the sawtooth wave can also be studied through the Laplace transform. If $f(x)$ is a bounded and vaguely integrable function over $\mathbb{R}^+$, the dominated convergence theorem ensures

$$\lim_{x\to0^+}f(x)=\lim_{s\to+\infty}\int_0^{+\infty}se^{-sx}f(x)\,dx=\lim_{s\to+\infty}s\cdot\mathcal{L}f(s).$$

If this manipulation (convolution with an approximate identity) is applied to $\sum_{n\geq1}\frac{\sin(nx)}{n}$, it produces:

$$\lim_{x\to0^+}\sum_{n\geq1}\frac{\sin(nx)}{n}=\lim_{s\to+\infty}\sum_{n\geq1}\frac{s}{s^2+n^2}$$

where the last series equals $\frac{-1+\pi s\coth(\pi s)}{2s}$. But even ignoring this identity, Riemann sums grant

$$\lim_{s\to+\infty}\sum_{n\geq1}\frac{s}{s^2+n^2}=\int_0^{+\infty}\frac{dx}{1+x^2}=\frac\pi2.$$

Lemma 124 (Poisson). If $f\in C^2(\mathbb{R})$, $|f(x)|\leq\frac{C}{1+x^2}$ and the first two derivatives of $f$ are integrable on $\mathbb{R}$,

$$\sum_{n=-\infty}^{+\infty}f(n)=\sum_{n=-\infty}^{+\infty}\hat f(n),\qquad \hat f(\nu)\stackrel{\text{def}}{=}\int_{-\infty}^{+\infty}f(x)e^{-2\pi i\nu x}\,dx.$$

This result has a great importance in Harmonic Analysis and Number Theory. It follows from the fact that the distribution known as Dirac comb is a fixed point of the Fourier transform, and the identity

$$\forall a>0,\qquad \sum_{n=-\infty}^{+\infty}\exp\left(-\pi a n^2+2\pi i b n\right)=\frac{1}{\sqrt a}\sum_{n=-\infty}^{+\infty}\exp\left(-\frac\pi a(n-b)^2\right)$$

leads to the reflection formula for the Riemann $\zeta$ function. It is possible to employ the Poisson summation formula to prove interesting identities related to modular forms, like:

$$\sum_{n\geq1}\frac{\coth(\pi n)}{n^3}=\frac{7\pi^3}{180}.$$

Two simpler instances are

$$1+2\sum_{n\geq1}\frac{1}{n^2+1}=\sum_{n\in\mathbb Z}\frac{1}{n^2+1}=\sum_{n\in\mathbb Z}\pi e^{-2\pi|n|}=\pi\coth(\pi),$$

$$\sum_{n\geq1}\frac{1}{n^2+(n+1)^2}=-1+\frac\pi2\tanh\frac\pi2.$$
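The identities obtained through the Poisson summation formula lend themselves to a quick numerical check. The following sketch uses plain partial sums; the cutoffs are arbitrary choices, and $\coth$ is computed as $1/\tanh$ to avoid overflow for large arguments.

```python
import math

def coth(x):
    """Hyperbolic cotangent, computed as 1/tanh to avoid overflow."""
    return 1.0 / math.tanh(x)

# Partial sums of the three series; the cutoffs are arbitrary choices for this sketch.
cubic = math.fsum(coth(math.pi * n) / n**3 for n in range(1, 2001))
lorentz = math.fsum(1.0 / (n * n + 1) for n in range(1, 10**6 + 1))
shifted = math.fsum(1.0 / (n * n + (n + 1) ** 2) for n in range(1, 10**6 + 1))

print(cubic, 7 * math.pi**3 / 180)
print(1 + 2 * lorentz, math.pi * coth(math.pi))
print(shifted, -1 + math.pi / 2 * math.tanh(math.pi / 2))
```

The last two series converge only like $\frac1N$, so the agreement is limited to about five digits with these cutoffs.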

The Poisson kernel. Since $\log|z|=\operatorname{Re}\log z$ and $\int_0^{2\pi}e^{ni\theta}\,d\theta=2\pi\delta(n)$, for any $r\in\mathbb R$ we have:

$$\int_0^{2\pi}\log\left|1-re^{i\theta}\right|\,d\theta=2\pi\log\max(1,|r|),$$

$$\forall r\in\mathbb R,\qquad\int_0^{2\pi}\log(1+r^2-2r\cos\theta)\,d\theta=4\pi\log\max(1,|r|),$$

$$\forall r\in\mathbb R\setminus\{-1,+1\},\qquad\int_0^{2\pi}\frac{r-\cos\theta}{1+r^2-2r\cos\theta}\,d\theta=\begin{cases}\frac{2\pi}{r}&\text{if }|r|>1\\[2pt]0&\text{if }|r|<1,\end{cases}$$

$$\forall r\in\mathbb R\setminus\{-1,+1\},\qquad\int_0^{2\pi}\frac{1-r\cos\theta}{1+r^2-2r\cos\theta}\,d\theta=\begin{cases}2\pi&\text{if }|r|<1\\[2pt]0&\text{if }|r|>1.\end{cases}$$

That is not entirely surprising: the last result also follows from the behavior of complex homographies or Cayley transforms, since

$$\forall r\in[0,1),\qquad P_r(\theta)=\sum_{n\in\mathbb Z}r^{|n|}e^{in\theta}=\operatorname{Re}\,\frac{1+re^{i\theta}}{1-re^{i\theta}}=\frac{1-r^2}{1+r^2-2r\cos\theta}.$$

$$f(r)=\frac1{2\pi}\int_0^{2\pi}\frac{(1+3r^2)-(5r+r^3)\cos(\theta)+(1+r^2)\cos(2\theta)}{(1+r^2-2r\cos\theta)^2}\,d\theta$$

equals $1$ on the interval $(-1,1)$ and $\frac{1}{r^2}$ outside the previous interval.

$$\log\Gamma(z)=\left(\tfrac12-z\right)(\gamma+\log2)+(1-z)\ln\pi-\tfrac12\log\sin\pi z+\frac1\pi\sum_{n=1}^\infty\frac{\sin 2\pi nz\cdot\log n}{n},\qquad0<z<1$$

is an important identity that can be derived through the Weierstrass product for the $\Gamma$ function and the Laplace transform. For a long time it was credited to Ernst Kummer, who proved it in 1847. Only recently Iaroslav Blagouchine has pointed out that the same result was known to the Swedish mathematician Carl Johan Malmsten since 1842.

$$a_{n+1}=\frac{a_n+\sqrt{a_n^2+\frac{1}{4^n}}}{2}$$

for any $n\geq0$. Prove that:

$$\lim_{n\to+\infty}a_n\leq\sqrt{\frac5{12}},\qquad\lim_{n\to+\infty}a_n=\frac2\pi.$$

Exercise 127 (Kronecker's Lemma). A sequence $\{a_n\}_{n\geq1}$ of real numbers is such that

$$\lim_{N\to+\infty}\sum_{n=1}^N\frac{a_n}{n}=C<+\infty.$$

Prove that $\{a_n\}_{n\geq 1}$ is necessarily a sequence with mean zero, i.e.

$$\lim_{N\to+\infty}\frac{a_1+a_2+\ldots+a_N}{N}=0.$$

Proof. Let us set $A_M=\frac{a_1}{1}+\frac{a_2}2+\ldots+\frac{a_M}{M}$. By summation by parts:

$$\frac1n\sum_{k=1}^na_k=\frac1n\sum_{k=1}^nk\cdot\frac{a_k}{k}=A_n-\frac1n\sum_{k=1}^{n-1}A_k$$

and the convergence of the original series ensures that for any $\varepsilon>0$, there is some $N$ such that $|A_n-C|\leq\varepsilon$ holds for any $n\geq N$. By picking an $\varepsilon$ and considering the associated $N$, the previous RHS can be written as

$$A_n-\frac1n\sum_{k=1}^{N-1}A_k-\frac{n-N}{n}\,C-\frac1n\sum_{k=N}^{n-1}(A_k-C).$$

If now we consider the limit as $n\to+\infty$, the first term goes to $C$, which cancels with the third term; the second term goes to zero (as the sum is a fixed value) and the last term is bounded in absolute value by $\frac{n-N}{n}\varepsilon\leq\varepsilon$.

A remark about nuking mosquitoes: since the sequence $1,1,1,\ldots$ does not have mean zero, the harmonic series is divergent.

Exercise 128. Find a function $f\in C^0(\mathbb R^+)$ such that the following equality holds for any $t>0$:

$$f(t)=e^{-3t}+e^{-t}\int_0^te^{-\tau}f(\tau)\,d\tau.$$

Proof. Let $g(s)=(\mathcal Lf)(s)$. The Laplace transform of $e^{-x}f(x)$ is given by $g(s+1)$ and the Laplace transform of $\int_0^xe^{-u}f(u)\,du$ is given by $\frac1s\,g(s+1)$, hence the given integral equation can be written in terms of $g$ as

$$g(s)=\frac1{s+3}+\frac{g(s+2)}{s+1}$$

leading to:

$$g(s)=\frac1{s+3}+\frac1{s+1}\left(\frac1{s+5}+\frac1{s+3}\left(\frac1{s+7}+\frac1{s+5}\left(\frac1{s+9}+\ldots\right)\right)\right)$$

$$g(s)=\frac1{s+3}+\frac1{(s+1)(s+5)}+\frac1{(s+1)(s+3)(s+7)}+\frac1{(s+1)(s+3)(s+5)(s+9)}+\ldots$$

It is clear that $g(s)$ is a meromorphic function with poles at the negative odd integers. Additionally it is simple to compute the closed form of $R_\xi=\operatorname{Res}_{s=\xi}g(s)$ for any $\xi\in\Xi=\{-1,-3,-5,\ldots\}$, then to consider the inverse Laplace transform of $g$:

$$f(x)=\sum_{\xi\in\Xi}R_\xi\,e^{\xi x}.$$

Explicitly,

$$\begin{aligned}g(s)&=\frac{1+2^{\frac{1+s}2}\sqrt e\left(-\Gamma\left(\frac{3+s}2\right)+\Gamma\left(\frac{3+s}2,\frac12\right)\right)}{1+s}\\&=\frac1{1+s}\left[1-2^{\frac{1+s}2}\sqrt e\int_0^{1/2}u^{\frac{s+1}2}e^{-u}\,du\right]\\&=\frac{2-\sqrt e}{s+1}+\frac{\frac12\sqrt e}{s+3}-\frac{\frac18\sqrt e}{s+5}+\frac{\frac1{48}\sqrt e}{s+7}-\frac{\frac1{384}\sqrt e}{s+9}+\ldots\end{aligned}$$

hence

$$\begin{aligned}f(x)&=(2-\sqrt e)\,e^{-x}+\sqrt e\sum_{n\geq1}\frac{(-1)^{n+1}}{(2n)!!}\,e^{-(2n+1)x}\\&=e^{-x}\left(2-e^{e^{-x}\sinh(x)}\right).\end{aligned}$$
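The closed form $f(x)=e^{-x}\left(2-e^{e^{-x}\sinh x}\right)$ can be plugged back into the equation of Exercise 128 and tested numerically. The sketch below is a sanity check under stated assumptions: the integral is approximated with a composite Simpson rule, and the names `simpson` and `residual` are hypothetical helpers.

```python
import math

def f(x):
    """Candidate solution: e^{-x} * (2 - exp(e^{-x} sinh x))."""
    return math.exp(-x) * (2.0 - math.exp(math.exp(-x) * math.sinh(x)))

def simpson(func, a, b, panels=2000):
    """Composite Simpson rule with an even number of panels."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3

def residual(t):
    """f(t) - e^{-3t} - e^{-t} * int_0^t e^{-tau} f(tau) dtau, which should vanish."""
    integral = simpson(lambda tau: math.exp(-tau) * f(tau), 0.0, t)
    return f(t) - math.exp(-3 * t) - math.exp(-t) * integral

for t in (0.5, 1.0, 3.0):
    print(t, residual(t))
```

The residual is at the level of the quadrature error for every tested $t$, which supports the closed form.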

The last entry of this section is a surprising consequence of the Laplace transform and the residue Theorem:

Theorem 129 (Ramanujan's Master Theorem). Under suitable regularity assumptions for $\varphi$, the identity

$$f(x)=\sum_{n\geq0}\frac{\varphi(n)}{n!}(-x)^n$$

implies:

$$\int_0^{+\infty}x^{s-1}f(x)\,dx=\Gamma(s)\,\varphi(-s).$$

For instance, from

$$\log\Gamma(1+x)=-\gamma x+\sum_{k\geq2}\frac{\zeta(k)}{k}(-x)^k$$

it follows that

$$\forall s\in(0,1),\qquad\int_0^{+\infty}\left(\gamma x+\log\Gamma(1+x)\right)\frac{dx}{x^{s+2}}=\frac\pi{\sin(\pi s)}\cdot\frac{\zeta(1+s)}{1+s}.$$

Exercise 130 (The Russian Integral). Prove that for any $a\in\mathbb R^+$ and any $b\in(0,2)$ the following identity holds:

$$\int_0^{+\infty}\frac{x^{-ia}}{x^2+bx+1}\,dx=\frac{2\pi}{\sqrt{4-b^2}}\cdot\frac{\sinh\left(a\arccos\frac b2\right)}{\sinh(a\pi)}.$$

We have

$$\mathcal L(x^{-ia})(s)=s^{ia-1}\,\Gamma(1-ia),\qquad\mathcal L^{-1}\left(\frac1{x^2+bx+1}\right)(s)=\frac{e^{Bs}-e^{\overline Bs}}{\sqrt{b^2-4}}$$

where $B$ is the root of $x^2+bx+1$ with a positive imaginary part. By the properties of the Laplace transform, the original integral is converted into

$$\frac{\Gamma(1-ia)}{\sqrt{b^2-4}}\int_0^{+\infty}s^{ia-1}\left(e^{Bs}-e^{\overline Bs}\right)ds$$

which can be evaluated in terms of the $\Gamma$ function. Due to the reflection formula, the final outcome simplifies into

$$\frac{B^{-ia}-\overline B^{\,-ia}}{\sinh(\pi a)}\cdot\frac\pi{\sqrt{4-b^2}}$$

and we may notice that $B=\exp\left(i\arccos\frac b2\right)$ allows a further simplification, proving the claim.

Exercise 131. Find a closed form for $\sum_{n\geq1}\frac{\cos(2nx)}{2^n}$ and use it to prove the following identity:

$$\int_0^{\pi/2}\frac{d\theta}{1+8\sin^2(\tan\theta)}=\frac\pi6\cdot\frac{2e^2+1}{2e^2-1}.$$

$$\eta(p)=\sum_{n\geq1}\frac{(-1)^{n+1}}{n^p}=\int_0^{+\infty}\frac{s^{p-1}e^{-s}}{\Gamma(p)}\cdot\frac{ds}{1+e^{-s}}=\mathbb E\left[\frac1{1+\exp(-X)}\right]$$

where $X$ is a random variable with a $\Gamma(p,1)$ distribution. Given the RHS, the inequality $\frac12<\eta(p)<1$ is trivial.

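Given the identity

$$\frac1{4^n}\binom{2n}n=\frac2\pi\int_0^{\pi/2}\sin^{2n}\theta\,d\theta=\frac1\pi\int_0^{+\infty}\frac{e^{-nx}}{\sqrt{e^x-1}}\,dx$$

we have the equality $\mathcal L^{-1}\left(\frac1{4^s}\binom{2s}s\right)=\frac1{\pi\sqrt{e^x-1}}$, hence the asymptotic behaviour of $\frac1{4^n}\binom{2n}n$ depends on the Maclaurin series of $\sqrt{\frac x{e^x-1}}$. On the other hand

$$\frac x{e^x-1}=\frac x2\left(\coth\frac x2-1\right)=1-\frac x2-2\sum_{n\geq1}(-1)^n\frac{\zeta(2n)}{(2\pi)^{2n}}\,x^{2n}$$

implies

$$\log\frac x{e^x-1}=-\frac x2+\sum_{n\geq1}(-1)^n\frac{\zeta(2n)}{n(2\pi)^{2n}}\,x^{2n}$$

and

$$\sqrt{\frac x{e^x-1}}=\exp\left(-\frac x4+\sum_{n\geq1}(-1)^n\frac{\zeta(2n)}{2n(2\pi)^{2n}}\,x^{2n}\right).$$

From

$$\mathcal L^{-1}\left(\frac1{4^s}\binom{2s}s\right)\approx\frac{e^{-x/4}}{\pi\sqrt x}\left(1-\frac{x^2}{48}\right)$$

it is straightforward to recover the very accurate asymptotic approximation (which actually is a lower bound)

$$\frac1{4^n}\binom{2n}n\sim\frac{(8n+1)(8n+3)}{2\sqrt\pi\,(4n+1)^{5/2}}.$$

The evaluation of both sides at $n=2$ produces $\frac{1292}{729}$ as a rational approximation of $\sqrt\pi$, whose absolute error is about $1.63\cdot10^{-4}$. The evaluation at $n=6$ produces the approximation $\frac{60928}{34375}$, whose absolute error is less than $3\cdot10^{-6}$. An interesting exercise is to check that the approximation produced via $\mathcal L,\mathcal L^{-1}$ outperforms the one obtained through creative telescoping (CT):

$$\begin{aligned}\left[\frac1{4^n}\binom{2n}n\right]^2&=\frac14\prod_{k=2}^n\left(1-\frac1{2k}\right)^2=\frac1{4n}\prod_{k=2}^n\left(1-\frac1{(2k-1)^2}\right)^{-1}=\frac1{\pi n}\prod_{k\geq n}\left(1-\frac1{(2k+1)^2}\right)\\&=\frac1{\pi n}\exp\left(\sum_{k\geq n}\sum_{m\geq1}\frac{-1}{m(2k+1)^{2m}}\right)\stackrel{\text{CT}}\approx\frac1{\pi n}\prod_{k\geq n}\frac{8k-1}{8k+7}\cdot\frac{8k+9}{8k+1}=\frac1{\pi n}\cdot\frac{8n-1}{8n+1}.\end{aligned}$$

On the other hand a refinement of this approach leads to a tighter upper bound:

$$\frac{(8n+1)^2(8n+3)^2}{4\pi(4n+1)^5}\stackrel{\mathcal L}\leq\left[\frac1{4^n}\binom{2n}n\right]^2\stackrel{\text{CT}}\leq\frac1{\pi n}\cdot\frac{64n^2-8n+3}{64n^2+8n+3}.$$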
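The rational approximations of $\sqrt\pi$ and the two-sided bound for the normalized central binomial coefficient can be reproduced with exact arithmetic. This is a small sketch, with hypothetical helper names; the rational estimate is only computed when $4n+1$ is a perfect square, as for $n=2$ and $n=6$, and the bounds are checked in floating point on a limited range of $n$.

```python
import math
from fractions import Fraction

def central_ratio(n):
    """Exact value of binom(2n, n) / 4^n."""
    return Fraction(math.comb(2 * n, n), 4**n)

def sqrt_pi_estimate(n):
    """Rational estimate of sqrt(pi) obtained by solving the asymptotic formula for sqrt(pi)."""
    # (8n+1)(8n+3) / (2 (4n+1)^{5/2}) is rational whenever 4n+1 is a perfect square.
    root = math.isqrt(4 * n + 1)
    assert root * root == 4 * n + 1, "this helper assumes 4n+1 is a perfect square"
    return Fraction((8 * n + 1) * (8 * n + 3), 2 * root**5) / central_ratio(n)

def bounds_squared(n):
    """Lower and upper bound for (binom(2n, n) / 4^n)^2."""
    lower = (8 * n + 1) ** 2 * (8 * n + 3) ** 2 / (4 * math.pi * (4 * n + 1) ** 5)
    upper = (64 * n * n - 8 * n + 3) / (math.pi * n * (64 * n * n + 8 * n + 3))
    return lower, upper

print(sqrt_pi_estimate(2), sqrt_pi_estimate(6), math.sqrt(math.pi))
print(bounds_squared(5), float(central_ratio(5)) ** 2)
```

Both bounds are already tight to several digits for small $n$, in agreement with the orders of magnitude of the errors quoted in the text.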


5 THE BASEL PROBLEM

The Basel problem was posed by Pietro Mengoli in 1644 and solved by Leonhard Euler in 1735 with a not entirely rigorous argument, fixed in 1741. This section is dedicated to many classical and alternative approaches for showing that:

$$\zeta(2)=\sum_{n\geq1}\frac1{n^2}=\frac{\pi^2}6.$$

Let us consider the function of complex variable usually denoted as $\operatorname{sinc}(z)$, i.e. the function that equals $1$ at the origin and $\frac{\sin z}z$ anywhere else. Due to the formula$^5$ $\sin(z)=\frac1{2i}\left(e^{iz}-e^{-iz}\right)$, all the zeroes of the function $\operatorname{sinc}(z)$ belong to the real line, are simple and exactly located at the elements of $\pi\mathbb Z\setminus\{0\}$. We may recall that by the Fundamental Theorem of Algebra, any even polynomial $p(z)$ with simple zeroes, such that $p(0)=1$, can be written in the form

$$p(z)=\prod_{k=1}^n\left(1-\frac{z^2}{\zeta_k^2}\right)$$

where $\pm\zeta_1,\ldots,\pm\zeta_n$ stand for the zeroes of $p(z)$. Given such identity,

$$\sum_{k=1}^n\frac1{\zeta_k^2}=-[z^2]\,p(z).$$

By assuming to be allowed to deal with $\operatorname{sinc}(z)$ as a “polynomial with an infinite degree”, from the identity

$$\operatorname{sinc}(z)=\prod_{k\geq1}\left(1-\frac{z^2}{\pi^2k^2}\right)$$

it follows that:

$$\zeta(2)=-\pi^2[z^2]\operatorname{sinc}(z)=-\pi^2[z^2]\left(1-\frac{z^2}6+\frac{z^4}{120}-\ldots\right)=\frac{\pi^2}6.$$

Quoting Sullivan, you may now sit back and smile smugly at his brilliance.

However, besides the last proof being absolutely brilliant and incredibly efficient, it is based on an unproven assumption: the statement that we are allowed to deal with $\operatorname{sinc}(z)$ by regarding it as an infinite-degree polynomial. Which is true, indeed, since the Weierstrass product for the entire function $\operatorname{sinc}(z)$ has no exponential part, also as a consequence of Mittag-Leffler's Theorem. However this part of Complex Analysis was developed only in the mid-nineteenth century, and it was most certainly unknown to Euler. Such “flaw” was probably the reason for Euler to fix his original proof, based on such inspired guess. In 1741, by starting from the property of uniform convergence of the Weierstrass product for the $\operatorname{sinc}(z)$ function, he proved that in a neighbourhood of the origin we have:

$$\log\operatorname{sinc}(\pi z)=\sum_{n\geq1}\log\left(1-\frac{z^2}{n^2}\right),$$

$$\frac1z-\pi\cot(\pi z)=\sum_{n\geq1}\frac{2z}{n^2-z^2},$$

$$\zeta(2)=\lim_{z\to0}\sum_{n\geq1}\frac1{n^2-z^2}=\frac12\lim_{z\to0}\left(\frac1{z^2}-\frac{\pi\cos(\pi z)}{z\sin(\pi z)}\right)=\frac{\pi^2}6$$

and rigour is safe. We may state that considering the logarithmic derivative $\frac d{dz}\log(\cdot)$ is a big detour on the original idea, nevertheless Euler's second proof exploits an instrument that a few years later would become essential in Complex Analysis (deeply related with the topological degree of curves), and is able to show that all the values of the $\zeta$ function at positive even integers can be computed through the coefficients of the Taylor series of $z\cot z$ at the origin:

$$\frac{1-\pi z\cot(\pi z)}2=\sum_{n\geq1}\zeta(2n)\,z^{2n}.$$

$^5$ Usually attributed to De Moivre, even if we are pretty sure Newton already knew it, back in 1676.

In other terms, Euler's “fixed” proof exhibits a generating function for the sequence $\{\zeta(2n)\}_{n\geq 1}$. To produce deeper results with less than two pages of written math is barely human: this is one of the reasons for which, since the nineteenth century, concise and elementary proofs, able to prove highly non-trivial statements, have been referred to as Eulerian.

We remark that, by following the Eulerian approach of differentiating the logarithm of a Weierstrass product, then exploiting the substitution $z\mapsto iz$, we have:

$$\sum_{n\geq1}\frac1{n^2+z^2}=\frac{-1+\pi z\coth(\pi z)}{2z^2}$$

for any $z\in\mathbb C$. Additionally, by differentiating both sides with respect to $z$ we have that:

$$\sum_{n\geq1}\frac1{(n^2+z^2)^2}=\frac1{4z^4}\left(-2+\frac{\pi z\cosh(\pi z)}{\sinh(\pi z)}+\frac{\pi^2z^2}{\sinh^2(\pi z)}\right).$$
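The two closed forms for $\sum\frac{1}{n^2+z^2}$ and $\sum\frac{1}{(n^2+z^2)^2}$ can be compared with partial sums at a real point. The sketch below uses $z=0.7$ as an arbitrary test value; the helper names are hypothetical.

```python
import math

def closed_first(z):
    """(-1 + pi z coth(pi z)) / (2 z^2)."""
    return (-1 + math.pi * z / math.tanh(math.pi * z)) / (2 * z * z)

def closed_second(z):
    """(-2 + pi z coth(pi z) + pi^2 z^2 / sinh^2(pi z)) / (4 z^4)."""
    w = math.pi * z
    return (-2 + w / math.tanh(w) + (w / math.sinh(w)) ** 2) / (4 * z**4)

def partial_first(z, N):
    return math.fsum(1.0 / (n * n + z * z) for n in range(1, N + 1))

def partial_second(z, N):
    return math.fsum(1.0 / (n * n + z * z) ** 2 for n in range(1, N + 1))

z = 0.7
print(partial_first(z, 10**6), closed_first(z))
print(partial_second(z, 10**4), closed_second(z))
print(closed_first(1e-3), math.pi**2 / 6)
```

The last line illustrates that the first closed form approaches $\zeta(2)$ as $z\to 0$.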

Before studying other “classical attacks” to the Basel problem, we outline a further proof due to Euler, related to manipulations of the squared arcsine function. We have

$$\int_0^1\frac{\arcsin(x)}{\sqrt{1-x^2}}\,dx=\frac12\arcsin^2(1)=\frac{\pi^2}8.$$

Additionally the Taylor series at the origin of $\arcsin(x)$ and $\frac1{\sqrt{1-x^2}}$ are well-known. It follows that:

$$\begin{aligned}\frac{\pi^2}6=\frac43\int_0^1\frac{\arcsin x}{\sqrt{1-x^2}}\,dx&=\frac43\int_0^1\frac{x+\sum_{n=1}^\infty\frac{(2n-1)!!}{(2n)!!}\frac{x^{2n+1}}{2n+1}}{\sqrt{1-x^2}}\,dx\\&=\frac43\int_0^1\frac{x}{\sqrt{1-x^2}}\,dx+\frac43\sum_{n=1}^\infty\frac{(2n-1)!!}{(2n)!!(2n+1)}\int_0^1x^{2n}\frac{x}{\sqrt{1-x^2}}\,dx\\&=\frac43+\frac43\sum_{n=1}^\infty\frac{(2n-1)!!}{(2n)!!(2n+1)}\cdot\frac{(2n)!!}{(2n+1)!!}\\&=\frac43\sum_{n=0}^\infty\frac1{(2n+1)^2}\\&=\frac43\left(\sum_{n=1}^\infty\frac1{n^2}-\frac14\sum_{n=1}^\infty\frac1{n^2}\right)\\&=\sum_{n=1}^\infty\frac1{n^2}=\zeta(2).\end{aligned}$$

We remark we already exploited a similar idea: we showed it is possible to prove the identity

$$\zeta(2)=\sum_{n\geq1}\frac3{n^2\binom{2n}n}$$

by creative telescoping or by exploiting the tangent half-angle substitution in a logarithmic integral. Once the coefficients of the Taylor series of $\arcsin^2(x)$ have been computed, one may recognize in the RHS the quantity $6\arcsin^2\left(\frac12\right)$.
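The series $\sum_{n\geq 1}\frac{3}{n^2\binom{2n}{n}}$ converges geometrically, so its value can be checked against $\frac{\pi^2}{6}$ with a handful of terms. A minimal sketch, with an arbitrary cutoff of 60 terms:

```python
import math

def central_binomial_series(terms=60):
    """Partial sum of 3 / (n^2 * binom(2n, n)); the terms decay roughly like 4^{-n}."""
    return math.fsum(3.0 / (n * n * math.comb(2 * n, n)) for n in range(1, terms + 1))

print(central_binomial_series(), math.pi**2 / 6)
```

Compared with the defining series $\sum\frac{1}{n^2}$, whose partial sums converge like $\frac1N$, this representation is far better suited to numerical evaluation.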

Proof # 3 (Cauchy, 1821) The following proof comes from its author's Cours d'Analyse, note VIII.

Let $x\in\left(0,\frac\pi2\right)$ and let $n$ be an odd natural number. By De Moivre's formula and the definition of cotangent we have:

$$\frac{\cos(nx)+i\sin(nx)}{(\sin x)^n}=\frac{(\cos x+i\sin x)^n}{(\sin x)^n}=\left(\frac{\cos x+i\sin x}{\sin x}\right)^n=(\cot x+i)^n.$$

By the binomial theorem:

$$\begin{aligned}(\cot x+i)^n&=\binom n0\cot^nx+\binom n1(\cot^{n-1}x)\,i+\ldots+\binom n{n-1}(\cot x)\,i^{n-1}+\binom nn\,i^n\\&=\left[\binom n0\cot^nx-\binom n2\cot^{n-2}x\pm\ldots\right]+i\left[\binom n1\cot^{n-1}x-\binom n3\cot^{n-3}x\pm\ldots\right].\end{aligned}$$

By comparing the equations above and considering the imaginary parts of the involved terms:

$$\frac{\sin(nx)}{(\sin x)^n}=\binom n1\cot^{n-1}x-\binom n3\cot^{n-3}x\pm\ldots.$$

Given the last identity, we may fix a positive integer $m$, set $n=2m+1$ and define $x_r=\frac{r\pi}{2m+1}$ for $r=1,2,\ldots,m$. Since $nx_r$ is an integer multiple of $\pi$, we have $\sin(nx_r)=0$. As a consequence:

$$0=\binom{2m+1}1\cot^{2m}x_r-\binom{2m+1}3\cot^{2m-2}x_r\pm\cdots+(-1)^m\binom{2m+1}{2m+1}$$

holds for $r=1,2,\ldots,m$. The numbers $x_1,\ldots,x_m$ are distinct elements of the interval $\left(0,\frac\pi2\right)$ and the function $\cot^2(x)$ is injective on such interval, hence the numbers $t_r=\cot^2x_r$ (for $r=1,2,\ldots,m$) are distinct. Given the previous equation, these $m$ numbers are the roots of the following polynomial with degree $m$:

$$p(t)=\binom{2m+1}1t^m-\binom{2m+1}3t^{m-1}\pm\cdots+(-1)^m\binom{2m+1}{2m+1}.$$

Due to Vieta's formulas we may compute the sum of the roots of $p$ through its coefficients:

$$\cot^2x_1+\cot^2x_2+\cdots+\cot^2x_m=\frac{\binom{2m+1}3}{\binom{2m+1}1}=\frac{2m(2m-1)}6.$$

By exploiting the identity $\csc^2x=\cot^2x+1$:

$$\csc^2x_1+\csc^2x_2+\cdots+\csc^2x_m=\frac{2m(2m-1)}6+m=\frac{2m(2m+2)}6.$$

By considering the inequality $\cot^2(x)<\frac1{x^2}<\csc^2(x)$ and summing its terms on $x_r=\frac{r\pi}{2m+1}$ we get:

$$\frac{2m(2m-1)}6<\left(\frac{2m+1}\pi\right)^2+\left(\frac{2m+1}{2\pi}\right)^2+\ldots+\left(\frac{2m+1}{m\pi}\right)^2<\frac{2m(2m+2)}6.$$

By multiplying both sides by $\left(\frac\pi{2m+1}\right)^2$:

$$\frac{\pi^2}6\cdot\frac{2m}{2m+1}\cdot\frac{2m-1}{2m+1}<\frac1{1^2}+\frac1{2^2}+\cdots+\frac1{m^2}<\frac{\pi^2}6\cdot\frac{2m}{2m+1}\cdot\frac{2m+2}{2m+1}.$$
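Both the identity coming from Vieta's formulas and the final two-sided inequality of Cauchy's proof can be tested numerically. A minimal sketch, with hypothetical function names:

```python
import math

def cot_squared_sum(m):
    """Sum of cot^2(r*pi/(2m+1)) for r = 1..m."""
    return math.fsum(1.0 / math.tan(r * math.pi / (2 * m + 1)) ** 2
                     for r in range(1, m + 1))

def basel_sandwich(m):
    """Lower bound, partial sum of 1/k^2 and upper bound from the final inequality."""
    scale = math.pi**2 / 6 * (2 * m) / (2 * m + 1) ** 2
    partial = math.fsum(1.0 / (k * k) for k in range(1, m + 1))
    return scale * (2 * m - 1), partial, scale * (2 * m + 2)

for m in (1, 5, 50):
    print(m, cot_squared_sum(m), 2 * m * (2 * m - 1) / 6)
print(basel_sandwich(1000))
```

As $m$ grows the two bounds pinch the partial sum of $\sum\frac{1}{k^2}$ towards $\frac{\pi^2}{6}$.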

If we consider the limits of the RHS and LHS as $m\to+\infty$, both limits equal $\frac{\pi^2}6$, hence

$$\zeta(2)=\sum_{k\geq1}\frac1{k^2}=\lim_{m\to\infty}\left(\frac1{1^2}+\frac1{2^2}+\ldots+\frac1{m^2}\right)=\frac{\pi^2}6$$

follows by squeezing. Cauchy's proof can be shortened a bit by recalling that the Cayley transform $z\mapsto\frac{1-z}{1+z}$ is an involution on $\mathbb C\setminus\{-1\}$. In particular a solution of $\left(\frac{1-z}{1+z}\right)^n=e^{i\theta}$ is given by

$$z=\frac{1-e^{i\theta/n}}{1+e^{i\theta/n}}=-i\tan\frac\theta{2n}$$

and the minimal polynomial of $\tan\frac\pi n$ or $\cot\frac\pi n$ over $\mathbb Q$ is simple to derive. A proof relying on squeezing, but not requiring the explicit construction of the minimal polynomial of $\cot^2\frac\pi{2m+1}$, is presented in Proofs from The Book:

Assuming $0<x<\pi/2$ we have that

$$\frac1{\tan^2x}<\frac1{x^2}<\frac1{\sin^2x}$$

and we may notice that $\frac1{\tan^2x}=\frac1{\sin^2x}-1$. If we partition the interval $\left(0,\frac\pi2\right)$ in $2^n$ equal subintervals, then sum both sides of the previous inequality evaluated at $x_k=\frac\pi2\cdot\frac k{2^n}$, we get:

$$\sum_{k=1}^{2^n-1}\frac1{\sin^2x_k}-\sum_{k=1}^{2^n-1}1<\sum_{k=1}^{2^n-1}\frac1{x_k^2}<\sum_{k=1}^{2^n-1}\frac1{\sin^2x_k}.$$

Denoting by $S_n$ the rightmost sum, this can be written as

$$S_n-(2^n-1)<\sum_{k=1}^{2^n-1}\left(\frac{2\cdot2^n}\pi\right)^2\frac1{k^2}<S_n.$$

On the other hand,

$$\frac1{\sin^2x}+\frac1{\sin^2\left(\frac\pi2-x\right)}=\frac{\cos^2x+\sin^2x}{\cos^2x\cdot\sin^2x}=\frac4{\sin^22x}.$$

As a consequence, by coupling terms appearing in $S_n$, except the contribution given by the term associated with $\frac\pi4$, we get 4 times a sum with the same structure, but with a doubled “step” and a halved number of terms. The contribution provided to $S_n$ by the term associated with $\frac\pi4$ equals $\frac1{\sin^2(\pi/4)}=2$, hence we have the following recurrence formula:

$$S_n=4S_{n-1}+2$$

that together with the initial condition $S_1=2$ produces the following explicit formula:

$$S_n=\frac{2(4^n-1)}3.$$

As a consequence, the following inequality holds:

$$\frac{2(4^n-1)}3-(2^n-1)\leq\frac{4^{n+1}}{\pi^2}\sum_{k=1}^{2^n-1}\frac1{k^2}\leq\frac{2(4^n-1)}3$$

and by considering the limit as $n\to+\infty$ we reach, like in Cauchy's proof, the wanted identity $\zeta(2)=\frac{\pi^2}6$.
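The recurrence $S_n=4S_{n-1}+2$ and the explicit formula $S_n=\frac{2(4^n-1)}{3}$ can be verified directly from the definition of $S_n$ as a sum of $\frac{1}{\sin^2 x_k}$ over $x_k=\frac{\pi}{2}\cdot\frac{k}{2^n}$. A short sketch, with hypothetical function names:

```python
import math

def S(n):
    """S_n = sum of 1/sin^2(x_k) over x_k = (pi/2) * k / 2^n, k = 1..2^n - 1."""
    return math.fsum(1.0 / math.sin(math.pi * k / 2 ** (n + 1)) ** 2
                     for k in range(1, 2**n))

def closed_form(n):
    """2 (4^n - 1) / 3."""
    return 2 * (4**n - 1) / 3

for n in range(1, 9):
    print(n, S(n), closed_form(n))
```

The first values are $2, 10, 42, 170, \ldots$, each obtained from the previous one by multiplying by 4 and adding 2.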

Many solutions to the Basel problem are based on trigonometric identities and bits of the theory of Hilbert spaces, in particular Parseval's Theorem.

Proof # 4 (Fourier, 1817) On the interval $(0,2\pi)$ the following identity holds pointwise:

$$\frac{\pi-x}2=\sum_{n\geq1}\frac{\sin(nx)}n.$$

Additionally the convergence is uniform on any compact subinterval. Due to the orthogonality relations

$$\int_0^{2\pi}\sin(nx)\sin(mx)\,dx=\pi\,\delta(m,n)$$

we have that:

$$\begin{aligned}\frac{\pi^3}6&=\int_0^{2\pi}\left(\frac{\pi-x}2\right)^2dx\\&=\sum_{n\geq1}\int_0^{2\pi}\left(\frac{\sin(nx)}n\right)^2dx\\&=\sum_{n\geq1}\frac\pi{n^2}=\pi\,\zeta(2).\end{aligned}$$

There are many orthogonal bases of $L^2(0,1)$, equipped with the canonical inner product $\langle f,g\rangle=\int_0^1f(x)\,g(x)\,dx$, so the above proof admits many variations. For instance, we may consider the Fourier-Legendre expansions of $K(m)$ (the complete elliptic integral of the first kind, having the elliptic modulus as a variable) and $\frac1{\sqrt{1-m}}$:

$$K(m)=\sum_{n\geq0}\frac2{2n+1}P_n(2m-1),\qquad\frac1{\sqrt{1-m}}=2\cdot\sum_{n\geq0}P_n(2m-1).$$

These expansions lead to $3\,\zeta(2)=\int_0^1\frac{K(m)}{\sqrt{1-m}}\,dm$. On the other hand, by the Taylor series of $K$,

$$\int_0^1\frac{K(m)}{\sqrt{1-m}}\,dm=\frac\pi2\sum_{n\geq0}\frac{\binom{2n}n^2}{16^n}\int_0^1\frac{x^n}{\sqrt{1-x}}\,dx=\pi\sum_{n\geq0}\frac{\binom{2n}n}{4^n(2n+1)}=\pi\int_0^1\frac{dx}{\sqrt{1-x^2}}$$

hence $3\,\zeta(2)=\pi\arcsin(1)$. There is a deep connection between Fourier-analytic proofs of the identity $\zeta(2)=\frac{\pi^2}6$ and Euler's proof #2: $f(x)=\arcsin(\sin x)$ is a triangle wave.

By combining different ideas from Apostol, Pace and Ritelli, the identity $\zeta(2)=\frac{\pi^2}6$ turns out to be a consequence of simple manipulations of a double integral:

$$\begin{aligned}\zeta(2)=\frac43\sum_{n=0}^{+\infty}\frac1{(2n+1)^2}&=\frac43\int_0^1\frac{\log y}{y^2-1}\,dy\\&=\frac23\int_0^1\frac1{y^2-1}\left[\log\frac{1+x^2y^2}{1+x^2}\right]_{x=0}^{+\infty}dy\\&=\frac43\int_0^1\int_0^{+\infty}\frac x{(1+x^2)(1+x^2y^2)}\,dx\,dy\\&=\frac43\int_0^1\int_0^{+\infty}\frac{dx\,dz}{(1+x^2)(1+z^2)}=\frac43\cdot\frac\pi4\cdot\frac\pi2=\frac{\pi^2}6.\end{aligned}$$

Proof # 6 (D'Aurizio, 2015)
The identity

$$\frac34\,\zeta(2)=\sum_{n\geq0}\frac1{(2n+1)^2}=\frac\pi2\sum_{k\geq0}\frac{(-1)^k}{2k+1}=\frac\pi2\cdot\frac\pi4$$

follows from the residue theorem. The integral

$$I=\int_{-\infty}^\infty e^y\left(\frac{e^y-1}{y^2}-\frac1y\right)\frac1{e^{2y}+1}\,dy$$

is clearly real, hence the real part of the sum of residues of the integrand function equals zero:

$$I=2\pi i\left(\frac1\pi\sum_{n=0}^\infty\frac{(-1)^n}{2n+1}-\frac2{\pi^2}\sum_{n=0}^\infty\frac1{(2n+1)^2}-\frac{2i}{\pi^2}\sum_{n=0}^\infty\frac{(-1)^n}{(2n+1)^2}\right).$$

This argument first appeared in an MSE thread, where the identity

$$\int_0^{+\infty}\left(\frac{x-1}{\log^2x}-\frac1{\log x}\right)\frac{dx}{x^2+1}=\frac4\pi\sum_{n\geq0}\frac{(-1)^n}{(2n+1)^2}$$

is also shown. We may find a similar symmetry trick in a further proof due to Euler.

If we consider the reflection formula for the $\Gamma$ function

$$\Gamma(z)\,\Gamma(1-z)=\frac\pi{\sin(\pi z)}$$

and apply the operator $\frac{d^2}{dz^2}\log(\cdot)$ to both sides, we get:

$$\psi'(z)+\psi'(1-z)=\frac{\pi^2}{\sin^2(\pi z)}.$$

From which:

$$\pi^2=2\,\psi'\left(\tfrac12\right)=2\sum_{n\geq1}\frac1{\left(n-\frac12\right)^2}=6\,\zeta(2).$$

It is interesting to remark that the previous approach through residues can be encoded in a combinatorial Lemma not making any explicit mention of residues:

Lemma 132. If $\{a_n\}_{n\geq0}$ is a weakly decreasing sequence of positive numbers and $\sum_{n\geq0}a_n^2$ is convergent, the following series are also convergent

$$s\stackrel{\text{def}}=\sum_{n=0}^\infty(-1)^na_n,\qquad\delta_k\stackrel{\text{def}}=\sum_{n=0}^\infty a_na_{n+k},\qquad\Delta\stackrel{\text{def}}=\sum_{k=1}^\infty(-1)^{k-1}\delta_k$$

and we have:

$$\sum_{n\geq0}a_n^2=s^2+2\Delta.$$
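Lemma 132 can be illustrated on a geometric sequence $a_n=q^n$, for which every quantity involved is explicit. The sketch below is an assumption-laden example: the sequence is truncated after a finite number of terms (the identity is purely algebraic for finite sequences), and the function name is hypothetical.

```python
def lemma_132_sides(q, terms=400):
    """Both sides of sum a_n^2 = s^2 + 2*Delta for a_n = q^n, truncated after `terms` terms."""
    a = [q**n for n in range(terms)]
    s = sum((-1) ** n * a[n] for n in range(terms))
    # delta_k = sum_n a_n a_{n+k}; indices beyond the truncation are dropped.
    delta = [sum(a[n] * a[n + k] for n in range(terms - k)) for k in range(1, terms)]
    big_delta = sum((-1) ** (k - 1) * d for k, d in enumerate(delta, start=1))
    return sum(x * x for x in a), s * s + 2 * big_delta

left, right = lemma_132_sides(0.5)
print(left, right)
```

For $q=\frac12$ both sides equal $\frac{1}{1-q^2}=\frac43$ up to rounding, since $s=\frac{1}{1+q}$ and $\Delta=\frac{q}{(1+q)(1-q^2)}$.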

Proof # 8 (Knopp, 1950) If we consider the sequence defined through $a_n=\frac1{2n+1}$, we have that:

$$s=\sum_{n\geq0}\frac{(-1)^n}{2n+1}=\frac\pi4,$$

$$\delta_k=\sum_{n\geq0}\frac1{(2n+1)(2n+2k+1)}=\frac1{2k}\sum_{n\geq0}\left(\frac1{2n+1}-\frac1{2n+2k+1}\right)=\frac1{2k}\left(1+\frac13+\ldots+\frac1{2k-1}\right).$$

In particular:

$$\sum_{n\geq0}\frac1{(2n+1)^2}=\left(\frac\pi4\right)^2+\sum_{k\geq1}\frac{(-1)^{k-1}}k\left(1+\frac13+\ldots+\frac1{2k-1}\right)=\frac{\pi^2}{16}+\frac{\pi^2}{16}=\frac{\pi^2}8,$$

$$\zeta(2)=\frac43\sum_{n\geq0}\frac1{(2n+1)^2}=\frac{\pi^2}6.$$

A crucial step has been hidden “under the carpet”: the equality $\sum_{k\geq1}\frac{(-1)^{k-1}}k\left(1+\frac13+\ldots+\frac1{2k-1}\right)=\frac{\pi^2}{16}$ is not entirely trivial. A proof of such equality (not really efficient, but hopefully interesting) has been included in a box at the end of this section. We finish by presenting a proof that follows from the contents of the previous sections in a very straightforward way.

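Since

$$\prod_{k=1}^{n-1}\sin\frac{\pi k}n=\frac{2n}{2^n},$$

Riemann sums give

$$\int_0^{\pi/2}\log\sin(\theta)\,d\theta=-\frac\pi2\log(2),$$

hence by the substitution $\theta\mapsto\frac\pi2-\theta$ we have:

$$0=\int_0^{\pi/2}\log(2\cos\theta)\,d\theta.$$

By exploiting the Taylor series of $\log(1+x)$ at the origin and termwise integration we have:

$$\begin{aligned}0=\operatorname{Im}\int_0^{\pi/2}\log(2\cos\theta)\,d\theta&=\int_0^{\pi/2}\theta\,d\theta+\operatorname{Im}\sum_{n\geq1}\frac{(-1)^{n+1}}n\int_0^{\pi/2}e^{-2ni\theta}\,d\theta\\&=\frac{\pi^2}8-\operatorname{Im}\sum_{n\geq1}\frac{(-1)^{n+1}\,i\left(1-e^{-\pi in}\right)}{2n^2}\\&=\frac{\pi^2}8-\sum_{n\geq1}\frac{(-1)^{n+1}\left(1-(-1)^n\right)}{2n^2}\\&=\frac{\pi^2}8-\sum_{m\geq0}\frac1{(2m+1)^2}.\end{aligned}$$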

2017 addendum. Proof # 10 (MSE, 2017) Here is another crazy approach by symmetry. For any $s>1$ we have

$$\zeta(s)=\sum_{n\geq1}\frac1{n^s}=\left(1-\frac2{2^s}\right)^{-1}\sum_{n\geq1}\frac{(-1)^{n+1}}{n^s}\stackrel{\mathcal L^{-1}}=\frac1{\Gamma(s)}\left(1-\frac2{2^s}\right)^{-1}\int_0^{+\infty}\frac{t^{s-1}}{e^t+1}\,dt$$

where the RHS is converging for any $s>0$, providing an analytic continuation of the LHS over such region. By applying integration by parts twice, we get the following integral representation for the $\zeta$ function over the region $s>-2$:

$$\zeta(s)=\frac1{\Gamma(s+2)}\left(1-\frac2{2^s}\right)^{-1}\int_0^{+\infty}\frac{t^{s+1}e^t(e^t-1)}{(e^t+1)^3}\,dt$$

and due to the reflection formula $\frac{\zeta(1-s)}{\zeta(s)}=\frac{2\,\Gamma(s)}{(2\pi)^s}\cos\left(\frac{\pi s}2\right)$ we have

$$\zeta(s)=\frac{(2\pi)^s}{2\,\Gamma(3-s)\,\Gamma(s)\cos\left(\frac{\pi s}2\right)}\left(1-2^s\right)^{-1}\int_0^{+\infty}\frac{t^{2-s}e^t(e^t-1)}{(e^t+1)^3}\,dt$$

for any $s<3$. By evaluating the previous line at $s=2$ and by enforcing the substitution $t=\log u$ we get:

$$\zeta(2)=\frac{2\pi^2}3\int_1^{+\infty}\frac{u-1}{(u+1)^3}\,du\stackrel{u\mapsto\frac1{1-v}}=\frac{2\pi^2}3\int_0^1\frac v{(2-v)^3}\,dv=\frac{2\pi^2}3\cdot\frac14=\frac{\pi^2}6.$$

In equivalent terms, the value of $\zeta(2)$ can be derived from the value of $\zeta(-1)$, which in turn is related to a Bernoulli number.

2018 addendum. Proof # 11. We start with the statement of the MacMahon master theorem.

Let $A=(a_{ij})_{m\times m}$ be a complex matrix, and let $x_1,\ldots,x_m$ be formal variables. Consider a coefficient

$$G(k_1,\ldots,k_m)=\left[x_1^{k_1}\cdots x_m^{k_m}\right]\prod_{i=1}^m\left(a_{i1}x_1+\cdots+a_{im}x_m\right)^{k_i}.$$

Let $t_1,\ldots,t_m$ be another set of formal variables, and let $T=(\delta_{ij}t_i)_{m\times m}$ be a diagonal matrix. Then

$$\sum_{(k_1,\ldots,k_m)}G(k_1,\ldots,k_m)\,t_1^{k_1}\cdots t_m^{k_m}=\frac1{\det(I_m-TA)},$$

where the sum runs over all nonnegative integer vectors $(k_1,\ldots,k_m)$, and $I_m$ denotes the identity matrix of size $m$.

By considering the matrix

$$A=\begin{pmatrix}0&1&-1\\-1&0&1\\1&-1&0\end{pmatrix}$$

it is not difficult to derive an important combinatorial identity due to Dixon:

$$\sum_{k\in\mathbb Z}(-1)^k\binom{a+b}{a+k}\binom{b+c}{b+k}\binom{c+a}{c+k}=\frac{(a+b+c)!}{a!\,b!\,c!}$$

holding for any $a,b,c\in\mathbb N^+$. By logarithmic convexity and the Bohr-Mollerup theorem, the range of validity of Dixon's identity can be extended to $a,b,c\in(-1,+\infty)$. If we consider the instance $a=b=c=\frac12$ we get:

$$\sum_{k\in\mathbb Z}\frac1{(1-4k^2)^3}=\frac{3\pi^2}{32},$$

where, by partial fraction decomposition, the LHS equals

$$\sum_{k\in\mathbb Z}\left(\frac18\cdot\frac1{(2k+1)^3}-\frac18\cdot\frac1{(2k-1)^3}+\frac3{16}\cdot\frac1{(2k+1)^2}+\frac3{16}\cdot\frac1{(2k-1)^2}+\frac3{16}\cdot\frac1{2k+1}-\frac3{16}\cdot\frac1{2k-1}\right)$$

which by cancellation and symmetry boils down to $\frac34\sum_{k\geq0}\frac1{(2k+1)^2}$.

The identity $\zeta(2)=\frac{\pi^2}6$ is a straightforward consequence.
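Dixon's identity, as used in Proof # 11, can be verified with exact integer arithmetic for small positive integers, and the instance $a=b=c=\frac12$ can be checked numerically. The sketch below uses hypothetical helper names; `binom` is defined to vanish outside the usual range, so the sum over $k\in\mathbb{Z}$ is finite.

```python
import math

def binom(n, k):
    """Binomial coefficient that vanishes outside 0 <= k <= n."""
    return math.comb(n, k) if 0 <= k <= n else 0

def dixon_lhs(a, b, c):
    """Alternating sum over k of binom(a+b, a+k) * binom(b+c, b+k) * binom(c+a, c+k)."""
    reach = max(a, b, c)
    total = 0
    for k in range(-reach, reach + 1):
        sign = -1 if k % 2 else 1
        total += sign * binom(a + b, a + k) * binom(b + c, b + k) * binom(c + a, c + k)
    return total

def dixon_rhs(a, b, c):
    """(a + b + c)! / (a! b! c!)."""
    return math.factorial(a + b + c) // (
        math.factorial(a) * math.factorial(b) * math.factorial(c))

# The series for a = b = c = 1/2 decays like k^{-6}, so a short symmetric range suffices.
half_case = math.fsum(1.0 / (1 - 4 * k * k) ** 3 for k in range(-2000, 2001))

print(dixon_lhs(2, 3, 4), dixon_rhs(2, 3, 4))
print(half_case, 3 * math.pi**2 / 32)
```

For instance, for $a=b=c=1$ the left-hand side is $8-1-1=6=\frac{3!}{1!\,1!\,1!}$.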

Proof # 12. We may consider that $J=\int_0^{+\infty}\frac{\arctan x}{1+x^2}\,dx=\left[\frac12\arctan^2x\right]_0^{+\infty}=\frac{\pi^2}8$.
On the other hand, by Feynman's trick or Fubini's theorem

$$J=\int_0^{+\infty}\int_0^1\frac x{(1+x^2)(1+a^2x^2)}\,da\,dx=\int_0^1\frac{-\log a}{1-a^2}\,da$$

and since $\int_0^1-\log(x)\,x^n\,dx=\frac1{(n+1)^2}$, by expanding $\frac1{1-a^2}$ as a geometric series we have

$$\frac{\pi^2}8=J=\sum_{n\geq0}\frac1{(2n+1)^2}.$$

Exercise 133. In order to stress the importance of the dominated convergence Theorem, prove that:

$$\sum_{n\geq1}\sum_{m\geq1}\frac{m^2-n^2}{(m^2+n^2)^2}=\frac\pi4,\qquad\sum_{m\geq1}\sum_{n\geq1}\frac{m^2-n^2}{(m^2+n^2)^2}=-\frac\pi4.$$

Hint: start by proving

$$\sum_{m\geq1}\frac{m^2-n^2}{(m^2+n^2)^2}=\frac12\left(\frac1{n^2}-\frac{\pi^2}{\sinh^2(\pi n)}\right)$$

and by exploiting the identity in the next exercise, or by exploiting Poisson summation formula.

Exercise 134 (First steps towards modular forms). Prove the identity:

$$\sum_{n\geq1}\frac1{\sinh^2(\pi n)}=\frac16-\frac1{2\pi}.$$

Hint: apply the $\frac{d^2}{dz^2}\log(\cdot)$ operator to both sides of the equality representing the Weierstrass product for the $\frac{\sinh z}z$ function.

Exercise 135 (First steps towards modular forms). Prove the identity:

$$\sum_{n\geq1}\frac{n(-1)^{n+1}}{\sinh(\pi n)}=\frac1{4\pi}.$$

Exercise 136 (Ramanujan's first steps towards modular forms). Prove the identity:

$$\frac1{e^{2\pi}-1}+\frac2{e^{4\pi}-1}+\frac3{e^{6\pi}-1}+\ldots=\frac1{24}-\frac1{8\pi}.$$
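Before attempting a proof, the identities of Exercises 134, 135 and 136 can be tested numerically, since all three series converge geometrically. This sketch is only a plausibility check; the cutoff at 39 terms is an arbitrary choice that keeps every intermediate value within floating point range.

```python
import math

# Each series converges geometrically, so a few dozen terms are plenty.
TERMS = range(1, 40)
ex134 = math.fsum(1.0 / math.sinh(math.pi * n) ** 2 for n in TERMS)
ex135 = math.fsum((-1) ** (n + 1) * n / math.sinh(math.pi * n) for n in TERMS)
ex136 = math.fsum(n / math.expm1(2 * math.pi * n) for n in TERMS)

print(ex134, 1 / 6 - 1 / (2 * math.pi))
print(ex135, 1 / (4 * math.pi))
print(ex136, 1 / 24 - 1 / (8 * math.pi))
```

Note that the value of Exercise 134 is exactly four times the value of Exercise 136, as one may check by expanding $\frac{1}{\sinh^2(\pi n)}$ as a geometric series.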

The identity under the carpet. As promised, we are going to prove that:

$$\sum_{k\geq1}\frac{(-1)^{k+1}}k\left(1+\frac13+\ldots+\frac1{2k-1}\right)=\frac{\pi^2}{16}.$$

According to the terminology introduced by Gioffrè, Iandoli and Scandone, the following proof is a “level 4 proof”, since at some point four consecutive symbols for series or integrals appear:

$$\begin{aligned}
\sum_{k\geq1}\frac{(-1)^{k+1}}{k}\left(1+\frac13+\ldots+\frac1{2k-1}\right)&=\sum_{k\geq1}\sum_{n\geq0}(-1)^{k+1}\frac{2}{(2n+1)(2n+2k+1)}\\
&=2\int_0^1\!\!\int_0^1\sum_{k\geq1}\sum_{n\geq0}(-1)^{k+1}x^{2n}y^{2n+2k}\,dx\,dy\\
&=2\iint_{(0,1)^2}\frac{y^2\,dx\,dy}{(1+y^2)(1-x^2y^2)}\\
&=2\iint_{(0,1)^2}\frac{dx\,dy}{1-x^2y^2}-2\iint_{(0,1)^2}\frac{dx\,dy}{(1+y^2)(1-x^2y^2)}\\
&=2\sum_{n\geq0}\frac1{(2n+1)^2}-2\int_0^1\frac{\operatorname{arctanh}(y)}{y(y^2+1)}\,dy\\
(y\mapsto\tanh u)\quad&=2\sum_{n\geq0}\frac1{(2n+1)^2}-2\int_0^{+\infty}\frac{u\coth(u)}{\cosh(2u)}\,du\\
(u\mapsto-\log z)\quad&=2\sum_{n\geq0}\frac1{(2n+1)^2}+4\int_0^1\frac{z(1+z^2)^2\log z}{1-z^8}\,dz\\
(\text{Feynman's trick})\quad&=2\sum_{n\geq0}\frac1{(2n+1)^2}-4\sum_{k\geq0}\left(\frac1{(8k+2)^2}+\frac1{(8k+6)^2}+\frac2{(8k+4)^2}\right)\\
&=\frac12\sum_{n\geq0}\frac1{(2n+1)^2}=\frac38\,\zeta(2).
\end{aligned}$$

In a simpler way, we may set $H_n=\sum_{k=0}^n\frac1{2k+1}$ and notice that:

$$\sum_{n\geq0}H_n\,x^{2n+1}=\frac{\operatorname{arctanh}(x)}{1-x^2},\qquad\sum_{n\geq0}(-1)^nH_n\,x^{2n+1}=\frac{\arctan(x)}{1+x^2},$$

hence

$$\sum_{n\geq0}\frac{(-1)^n}{n+1}H_n=2\int_0^1\frac{\arctan x}{1+x^2}\,dx=\arctan^2(1).$$

$$\int_{-\infty}^{+\infty}\frac{dx}{(1+x^2)\cosh^2(\pi x)}=\pi-\frac8\pi.$$

$$\sum_{m\geq1}\sum_{n\geq1}\frac1{m^2n^2(m+n)^2}=\frac{\pi^6}{2835}.$$

6 SPECIAL FUNCTIONS AND SPECIAL PRODUCTS

This section is dedicated to an introduction the Γ function and its properties. We start by investigating about the

interplay among values of the Riemann ζ function, Bernoulli numbers and power series. We recall that, by considering

the logarithmic derivative of the Weierstrass product for the sine function (Euler docet):

1 π X 1 X z 2h X X

2h+1

− cot(πz) = z = z ζ(2h + 2) = ζ(2k)z 2k−1

2z 2 n2 n2h+2

n≥1 h≥0 h≥0 k≥1

2k−1

1 d 1 π

ζ(2k) = − cot(πz)

(2k − 1)! dz 2k−1 2z 2 z=0

z X Bn

= zn

ez −1 n!

n≥0

we immediately have that every Bernoulli number with odd index equals zero, with the only exception of B1 = − 21 .

This follows from the fact that

z z z z

+ = coth

ez − 1 2 2 2

is clearly an odd function. Additionally, since coth is the logarithmic derivative of

Y z2

sinh(z) = z 1+ 2 2 ,

π n

n≥1

1 1

, B6 = 42 1

, B8 = − 30 5

, B10 = 66 691

, B12 = − 2730 , . . .,

have alternating signs from B2 onward. Due to the ordinary generating function for {ζ(2n)}n≥1 introduced in the

Basel problem section,

$$\zeta(2k)=\frac{(2\pi)^{2k}\,|B_{2k}|}{2\cdot(2k)!}\in\pi^{2k}\,\mathbb{Q}.$$

Since for any k ≥ 1 we have 1 ≤ ζ(2k) ≤ ζ(2), the previous formula allows a simple estimation of the magnitude of

|Bn |. We may further notice that:

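$$\begin{aligned}
1=\frac{e^x-1}{x}\cdot\frac{x}{e^x-1}
&=\sum_{n\geq 0}\frac{x^n}{(n+1)!}\cdot\sum_{n\geq 0}\frac{B_n}{n!}\,x^n \\
&=\sum_{n\geq 0}\left(\sum_{k=0}^{n}\frac{B_k}{k!\,(n+1-k)!}\right)x^n \\
&=\sum_{n\geq 0}\left(\sum_{k=0}^{n}\binom{n+1}{k}B_k\right)\frac{x^n}{(n+1)!}
\end{aligned}$$

so the coefficient of x^n in the last series has to vanish for every n ≥ 1: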

6 According to the usual terminology, the ordinary and exponential generating functions of a sequence {a_n}_{n≥0} are, respectively,

$$\text{(OGF)}\quad\sum_{n\geq 0}a_n z^n,\qquad\text{(EGF)}\quad\sum_{n\geq 0}\frac{a_n}{n!}\,z^n.$$

$$B_n=-\frac{1}{n+1}\sum_{k=0}^{n-1}\binom{n+1}{k}B_k,$$

an identity allowing the evaluation of Bernoulli numbers in a recursive fashion. Due to such relation, Bernoulli numbers

play a major role in problems related to (finite) power sums:

$$\sum_{k=1}^{n}k^p=\frac{1}{p+1}\sum_{j=0}^{p}(-1)^j\binom{p+1}{j}B_j\,n^{p+1-j}.$$
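The recursion for Bernoulli numbers and the power-sum formula above can be tried out directly. The following is only a minimal sketch in exact rational arithmetic; the function names `bernoulli_numbers` and `power_sum` are illustrative choices, not notation from these notes, and the convention B_1 = −1/2 is assumed.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        # B_n = -1/(n+1) * sum_{k=0}^{n-1} binom(n+1, k) B_k
        s = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-s / (n + 1))
    return B


def power_sum(n, p):
    """Sum_{k=1}^{n} k^p computed through the Bernoulli-number formula."""
    B = bernoulli_numbers(p)
    total = sum((-1) ** j * comb(p + 1, j) * B[j] * n ** (p + 1 - j)
                for j in range(p + 1))
    return total / (p + 1)
```

For instance `power_sum(10, 3)` can be compared with the direct sum of the first ten cubes, and `bernoulli_numbers(12)[12]` with the value of B_12 listed earlier.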

Since for any k ≥ 2 we have B_k = −k ζ(1 − k), and (−1)^1 B_1 = 1/2 = −ζ(0), the previous statement can also be written as:

$$\sum_{k=1}^{n}k^p=\frac{n^{p+1}}{p+1}-\sum_{j=0}^{p-1}\binom{p}{j}\,\zeta(-j)\,n^{p-j}$$

from which it follows that ζ(s) for s < 1 can be computed through the regularization of a divergent series (see footnote 7). For

instance we proved that the sequence given by $a_n=2\sqrt{n}-H_n^{(1/2)}$ is convergent as n → +∞. Due to Faulhaber's

formula, we may further state that:

$$\lim_{n\to+\infty}\left(2\sqrt{n}-H_n^{(1/2)}\right)=-\zeta\left(\tfrac12\right)=(1+\sqrt{2})\sum_{n\geq 1}\frac{(-1)^{n+1}}{\sqrt{n}}$$

then use the (inverse) Laplace transform to get an integral representation for −ζ(1/2):

$$-\zeta\left(\tfrac12\right)=(1+\sqrt{2})\int_0^{+\infty}\sum_{n\geq 1}(-1)^{n+1}\frac{e^{-ns}}{\sqrt{\pi s}}\,ds=\frac{2+2\sqrt{2}}{\sqrt{\pi}}\int_0^{+\infty}\frac{ds}{1+\exp(s^2)}.$$

A de facto equivalent technique is to apply Ramanujan’s Master Theorem to the exponential generating function

of Bernoulli numbers.

The recurrence relation fulfilled by Bernoulli numbers, together with Lucas Theorem on the behaviour of binomial

coefficients (mod p), leads to an interesting consequence: the denominator of the rational number B2k is a squarefree

integer, given by the product of primes p such that p − 1 is a divisor of 2k:

$$\forall n\geq 1,\qquad B_{2n}+\sum_{\substack{p\in\mathcal{P}\\(p-1)\mid(2n)}}\frac{1}{p}\in\mathbb{Z}.$$

7 The famous "claim" $\sum_{n\geq 1}n=-\frac{1}{12}$ is preposterous since the LHS is a divergent series. ζ(−1) = −1/12 holds, but ζ(s) for s < 1 is defined through an analytic continuation and not directly through the series $\sum_{n\geq 1}\frac{1}{n^s}$, which is convergent only if Re(s) > 1.

Theorem 141. If s is a complex number with a positive real part, the following definitions are equivalent:

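• (A)(Integral representation)

$$\Gamma(s)=\int_0^{+\infty}x^{s-1}e^{-x}\,dx$$

• (B)(Euler product)

$$\Gamma(s)=\lim_{n\to+\infty}\frac{n!\,n^s}{s(s+1)\cdot\ldots\cdot(s+n)}$$

• (C)(Weierstrass product)

$$\Gamma(s)=\frac{e^{-\gamma s}}{s}\prod_{n\geq 1}\left(1+\frac{s}{n}\right)^{-1}e^{s/n},\qquad\gamma=\lim_{n\to+\infty}(H_n-\log n)$$

• (D)(Bohr-Mollerup characterization) on the positive real line, Γ(s) is the only positive function

with a convex logarithm that fulfills Γ(1) = 1 and Γ(s + 1) = s Γ(s) for any s > 0.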
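As a numerical illustration of the equivalence between (A) and (B), the finite Euler product can be compared with a library value of Γ. This is a sketch for real s > 0 only; `gamma_euler` is a hypothetical helper name, and `math.gamma` is used as the reference value.

```python
import math


def gamma_euler(s, n):
    """n! n^s / (s (s+1) ... (s+n)), evaluated through logarithms to avoid overflow."""
    log_value = math.lgamma(n + 1) + s * math.log(n)
    log_value -= sum(math.log(s + k) for k in range(n + 1))
    return math.exp(log_value)


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, gamma_euler(2.5, n), math.gamma(2.5))
```

The convergence is slow (the relative error decays roughly like 1/n), which is one reason the integral representation and Stirling-type expansions are preferred in practice.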

Sketch of proof. (A) ↔ (B). Due to the integration by parts formula, the function defined by (A) fulfills Γ(1) = 1

and Γ(s + 1) = s Γ(s). Due to the dominated convergence Theorem we have:

$$\int_0^{+\infty}x^{s-1}e^{-x}\,dx=\lim_{n\to+\infty}\int_0^{n}x^{s-1}\left(1-\frac{x}{n}\right)^{n}dx$$

where the integral appearing in the RHS can be computed by integration by parts too, proving (B) and (B) → (A).

(B) ↔ (C) follows from simple algebraic manipulations, since any n ∈ Z+ can be written as a telescopic product:

$$n=\prod_{k=1}^{n-1}\left(1+\frac{1}{k}\right).$$

(A) → (D). If s is a positive real number, from (A) it follows that Γ(s) is a moment for a positive random variable,

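i.e. an integral of the form $\int_{\mathbb{R}^+}x^s\,\omega(x)\,dx$ with ω(x) being a positive and locally integrable function. In particular, by the Cauchy-Schwarz inequality,

$$\Gamma(s_1)\Gamma(s_2)=\int_0^{+\infty}x^{s_1}\frac{dx}{xe^x}\int_0^{+\infty}x^{s_2}\frac{dx}{xe^x}\geq\left(\int_0^{+\infty}x^{\frac{s_1+s_2}{2}}\frac{dx}{xe^x}\right)^2=\Gamma\left(\frac{s_1+s_2}{2}\right)^2.$$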

It follows that log Γ is a midpoint-convex function, and since it is a continuous function on R+ , it is tout court convex.

(D) → (B). Due to logarithmic convexity, the function of real variable defined through (D) has a representation as

an infinite product and it is not only continuous, but analytic. Due to the analytic continuation principle, the

function defined by (D) can be extended to the whole half-plane Re(s) > 0 and the statement (B) holds in that region

too. Summarizing:

(A) ↔ (B) ↔ (C), (A) → (D) → (B).

We may also notice that once the Γ(s) function is defined on the half-plane Re(s) > 0, the functional identity

Γ(s + 1) = s Γ(s) allows to extend Γ to the whole complex plane. In particular, from the Weierstrass product it follows

that:

• the singularities of Γ(s) are simple poles and they are located at the elements of {0} ∪ Z− . Additionally:

$$\forall n\in\mathbb{N},\qquad\operatorname{Res}\left(\Gamma(s),\,s=-n\right)=\frac{(-1)^n}{n!}$$

• the Γ function fulfills the following reflection formula:

$$\forall z\notin\mathbb{Z},\qquad\Gamma(z)\,\Gamma(1-z)=\frac{\pi}{\sin(\pi z)}.$$

The logarithmic convexity allows to produce tight approximations of the Γ function, like:

Theorem 142 (Gautschi’s inequality). For any real number x > 0 and for any s ∈ (0, 1) we have:

$$x^{1-s}<\frac{\Gamma(x+1)}{\Gamma(x+s)}<(x+1)^{1-s}.$$
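Gautschi's inequality is easy to probe numerically. The sketch below (with the hypothetical helper `gautschi_holds`) evaluates the ratio of Γ values through `math.lgamma`, so that large arguments do not overflow; it is a spot check on a few sample points, not a proof.

```python
import math


def gautschi_holds(x, s):
    """Check x^(1-s) < Gamma(x+1)/Gamma(x+s) < (x+1)^(1-s) for x > 0, 0 < s < 1."""
    ratio = math.exp(math.lgamma(x + 1) - math.lgamma(x + s))
    return x ** (1 - s) < ratio < (x + 1) ** (1 - s)


if __name__ == "__main__":
    for x in (0.1, 1.0, 7.5, 100.0):
        print(x, [gautschi_holds(x, s) for s in (0.1, 0.5, 0.9)])
```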

To have extended the factorial function allows us to define binomial coefficients also for non-integer parameters.

In particular, from n! = Γ(n + 1) it follows that:

$$\binom{2n}{n}=\frac{(2n)!}{n!^2}=\frac{\Gamma(2n+1)}{\Gamma(n+1)^2}.$$

Given these important notions, we are ready to study the behavior of central binomial coefficients.

We may start by noticing that for any n ∈ Z+ ,

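$$\frac{1}{4^n}\binom{2n}{n}=\frac{(2n)!}{(2n)!!^2}=\frac{(2n-1)!!}{(2n)!!}=\prod_{k=1}^{n}\left(1-\frac{1}{2k}\right).$$

By squaring both sides:

$$\left[\frac{1}{4^n}\binom{2n}{n}\right]^2=\frac{1}{4}\prod_{k=2}^{n}\left(1-\frac{1}{k}+\frac{1}{4k^2}\right)=\frac{1}{4}\prod_{k=2}^{n}\left(1-\frac{1}{k}\right)\prod_{k=2}^{n}\left(1+\frac{1}{4k(k-1)}\right)=\frac{1}{4n}\prod_{k=2}^{n}\left(1-\frac{1}{(2k-1)^2}\right)^{-1}.$$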

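If now we denote as W the infinite product $\prod_{k\geq 2}\left(1-\frac{1}{(2k-1)^2}\right)^{-1}$ we have:

$$\left[\frac{1}{4^n}\binom{2n}{n}\right]^2=\frac{W}{4n}\prod_{k>n}\left(1-\frac{1}{(2k-1)^2}\right)=\frac{W}{4n}\prod_{k>n}\left(1+\frac{1}{4k(k-1)}\right)^{-1}.$$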

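Since in a right neighbourhood of the origin we have e^x > 1 + x (truth be told, such inequality holds for any x ∈ R∗), we get:

$$\left[\frac{1}{4^n}\binom{2n}{n}\right]^2>\frac{W}{4n}\exp\left(\sum_{k>n}\frac{-1}{4k(k-1)}\right)=\frac{W}{4n}\exp\left(-\frac{1}{4n}\right)$$

$$\left[\frac{1}{4^n}\binom{2n}{n}\right]^2<\frac{W}{4n}\exp\left(\sum_{k>n}\frac{-1}{(2k-1)^2}\right)<\frac{W}{4n}\exp\left(-\frac{1}{4n+2}\right)$$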

by creative telescoping. Essentially, it is enough to find an explicit value for W (also known as Wallis product) to

have accurate approximations of central binomial coefficients. Due to the Weierstrass product for the cosine function,

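$$\cos(\pi z)=\prod_{k\geq 0}\left(1-\frac{4z^2}{(2k+1)^2}\right)$$

we have:

$$\frac{1}{W}=\prod_{k\geq 1}\left(1-\frac{1}{(2k+1)^2}\right)=\lim_{z\to 1/2}\frac{\cos(\pi z)}{1-4z^2}\overset{\text{d.H.}}{=}\frac{\pi}{4},$$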

k≥1

hence:

$$\frac{1}{4^n}\binom{2n}{n}=\frac{1}{\sqrt{\pi n}}\left(1-\frac{1}{8n}+O\left(\frac{1}{n^2}\right)\right)$$

where the asymptotic expansion still holds if n ≥ 1 is not an integer. By the reflection formula we clearly have

Γ(1/2) = √π, leading to a remarkable consequence:

Lemma 143.

$$\int_0^{+\infty}e^{-x^2}\,dx=\int_0^{+\infty}\frac{e^{-t}}{2\sqrt{t}}\,dt\overset{(A)}{=}\frac{1}{2}\,\Gamma\left(\frac{1}{2}\right)=\frac{\sqrt{\pi}}{2}.$$
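The two-sided bound for the squared normalized central binomial coefficient (with W = 4/π), and the resulting asymptotic expansion, can be checked on a few values of n. This is a minimal sketch; `central_binomial_ratio` and `bounds_hold` are illustrative names introduced here.

```python
import math


def central_binomial_ratio(n):
    """binom(2n, n) / 4^n, computed from exact integers and converted to float."""
    return math.comb(2 * n, n) / 4 ** n


def bounds_hold(n):
    """Check exp(-1/(4n)) / (pi n) < ratio^2 < exp(-1/(4n+2)) / (pi n)."""
    r2 = central_binomial_ratio(n) ** 2
    lower = math.exp(-1 / (4 * n)) / (math.pi * n)
    upper = math.exp(-1 / (4 * n + 2)) / (math.pi * n)
    return lower < r2 < upper


if __name__ == "__main__":
    for n in (1, 10, 100):
        r = central_binomial_ratio(n)
        print(n, bounds_hold(n), r * math.sqrt(math.pi * n), 1 - 1 / (8 * n))
```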

In this framework we will see soon that the area of the unit circle being equal to Γ(1/2)² is not accidental at all. Before

introducing the Beta function and the multiplication formula for the Γ function, let us investigate the Eulerian

thought "if something is defined by an infinite product, it might be worth considering its logarithmic derivative".

For any complex number with a positive real part, let us set:

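$$\psi(s)\overset{\text{def}}{=}\frac{d}{ds}\log\Gamma(s).$$

The ψ function is also known as Digamma function. Due to the Weierstrass product for the Γ function, we have:

$$\psi(s)=\frac{\Gamma'(s)}{\Gamma(s)}=\frac{d}{ds}\left[-\log s-\gamma s+\sum_{k\geq 1}\left(\frac{s}{k}-\log\left(1+\frac{s}{k}\right)\right)\right]=-\frac{1}{s}-\gamma+\sum_{k\geq 1}\left(\frac{1}{k}-\frac{1}{k+s}\right)$$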

where the function relation Γ(s + 1) = s Γ(s) translates into ψ(s + 1) = 1/s + ψ(s) for the Digamma function. Such

identity allows an analytic continuation of the ψ function to the whole complex plane. The Digamma function turns

out to be a meromorphic function with simple poles with residue −1 at each non-positive integer. Far enough from

the singularities, the following identities hold:

$$\psi(b)-\psi(a)=\sum_{n\geq 0}\left(\frac{1}{n+a}-\frac{1}{n+b}\right)$$

$$\frac{\psi(b)-\psi(a)}{b-a}=\sum_{n\geq 0}\frac{1}{(n+a)(n+b)}$$

1 1

X X

ψ(−a )Res =

k z=a k

Q Q

k (z − ak ) k (n − ak )

k n≥0

(pi −1)

pi

(−1) ψ (−ζi ) 1

X X

=

Q Q

j6=i i − ζj ) − ζi )pi

Main properties of

i

pi ! (ζ i (n

n≥0

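$$\psi'(a)=\sum_{n\geq 0}\frac{1}{(n+a)^2}$$

$$\frac{\psi^{(n)}(a)}{n!}=(-1)^{n+1}\sum_{k\geq 0}\frac{1}{(k+a)^{n+1}}$$

$$\psi(1-z)-\psi(z)=\pi\cot(\pi z)$$

$$\psi(nz)=\log(n)+\frac{1}{n}\sum_{k=0}^{n-1}\psi\left(z+\frac{k}{n}\right)$$

$$\psi(1/2)=-\gamma-\log 4,\qquad\psi(1)=-\gamma,\qquad\psi(n)=H_{n-1}-\gamma.$$

(Main properties of the Digamma function.)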

They are clearly interesting from a combinatorial point of view. On the half-plane Re(s) > 0 we also have the following

integral representations:

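$$\psi(s)=\int_0^{+\infty}\left(\frac{e^{-t}}{t}-\frac{e^{-st}}{1-e^{-t}}\right)dt,\qquad\psi(s+1)=-\gamma+\int_0^1\frac{1-x^s}{1-x}\,dx$$

and the Taylor series of ψ(x + 1) at the origin is really simple:

$$\psi(x+1)=-\gamma-\sum_{k\geq 1}\zeta(k+1)(-x)^k$$

while for large values of x the following asymptotic expansion holds:

$$\psi(x)\sim\log(x)-\frac{1}{2x}-\sum_{n\geq 1}\frac{B_{2n}}{2n\,x^{2n}}.$$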

The last identity is a typical consequence of the Euler-Maclaurin summation formula or, as we will see, of

techniques based on creative telescoping. We remark that by combining the discrete Fourier transform with the

multiplication and reflection formulas for the Γ and ψ functions we get a deep result:

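Theorem 144 (Gauss Digamma Theorem). If r, m are positive integers and r < m, we have:

$$\psi\left(\frac{r}{m}\right)=-\gamma-\log(2m)-\frac{\pi}{2}\cot\left(\frac{\pi r}{m}\right)+2\sum_{n=1}^{\lfloor(m-1)/2\rfloor}\cos\left(\frac{2\pi nr}{m}\right)\log\sin\left(\frac{\pi n}{m}\right)$$

allowing an explicit evaluation of ψ(s) for any s ∈ Q. A proof can be found on PlanetMath.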
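Gauss' formula can be compared with a direct numerical evaluation of ψ. Since the Python standard library has no Digamma function, the sketch below builds one from two facts stated above: the recurrence ψ(x + 1) = ψ(x) + 1/x and the asymptotic expansion in Bernoulli numbers, truncated after B_6. The names `digamma`, `gauss_digamma` and the hard-coded value of γ are assumptions of this example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, hard-coded


def digamma(x):
    """psi(x) for real x > 0: shift x above 10, then use the asymptotic series."""
    shift = 0.0
    while x < 10.0:
        shift -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # B_2/(2x^2) + B_4/(4x^4) + B_6/(6x^6) = 1/(12x^2) - 1/(120x^4) + 1/(252x^6)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return shift + math.log(x) - 0.5 / x - series


def gauss_digamma(r, m):
    """Right-hand side of Gauss' Digamma Theorem for integers 0 < r < m."""
    total = -EULER_GAMMA - math.log(2 * m)
    total -= 0.5 * math.pi / math.tan(math.pi * r / m)
    for n in range(1, (m - 1) // 2 + 1):
        total += 2 * math.cos(2 * math.pi * n * r / m) * math.log(math.sin(math.pi * n / m))
    return total
```

The truncated series is only as accurate as the first omitted term, which is why the argument is pushed above 10 before it is applied.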

The multiplication formulas for the Γ function follow from the multiplication formulas for the ψ function

in a very straightforward way. Since:

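$$\begin{aligned}
\frac{\log\Gamma(ns)-\log\Gamma(n)}{n}&=\int_1^s\psi(nz)\,dz=(s-1)\log n+\frac{1}{n}\sum_{k=0}^{n-1}\int_1^s\psi\left(z+\frac{k}{n}\right)dz \\
&=(s-1)\log n+\frac{1}{n}\sum_{k=0}^{n-1}\log\Gamma\left(s+\frac{k}{n}\right)-\frac{1}{n}\sum_{k=0}^{n-1}\log\Gamma\left(1+\frac{k}{n}\right)
\end{aligned}$$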

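Theorem 145 (Multiplication formulas for the Γ function). For any s ∈ C with positive real part and for any n ∈ Z+ ,

$$\Gamma(ns)=(2\pi)^{\frac{1-n}{2}}\,n^{ns-1/2}\prod_{k=0}^{n-1}\Gamma\left(s+\frac{k}{n}\right).$$

In particular, for n = 2:

$$\Gamma(2z)=\frac{2^{2z-1}}{\sqrt{\pi}}\,\Gamma(z)\,\Gamma\left(z+\frac{1}{2}\right).$$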
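A quick floating-point check of the multiplication formula, restricted to real s > 0, might look as follows. The helper name `gauss_multiplication_rhs` is an illustrative choice; `math.prod` requires Python 3.8 or later.

```python
import math


def gauss_multiplication_rhs(s, n):
    """(2 pi)^((1-n)/2) * n^(n s - 1/2) * prod_{k=0}^{n-1} Gamma(s + k/n)."""
    product = math.prod(math.gamma(s + k / n) for k in range(n))
    return (2 * math.pi) ** ((1 - n) / 2) * n ** (n * s - 0.5) * product


if __name__ == "__main__":
    for s, n in ((1.3, 3), (0.7, 2), (2.0, 5)):
        print(s, n, gauss_multiplication_rhs(s, n), math.gamma(n * s))
```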

These identities can also be proved through a slick technique known as Herglotz trick: if two meromorphic functions

f (z), g(z) only have simple poles at the same points with the same residues, they share a value at a regularity point

and they fulfill the same functional equation, they are the same function. In particular, for any z ∈ C with positive

real part we may define

$$g(z)\overset{\text{def}}{=}\frac{\Gamma(2z)}{\Gamma(z)\,\Gamma(z+1/2)}$$

and check that g(z) is an entire function such that g(1) = 2/√π and g(z + 1) = 4 g(z). Legendre duplication formula

and, in a similar fashion, Gauss multiplication formulas are thus straightforward consequences.

Back to central binomial coefficients, we may notice that:

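$$\frac{1}{4^n}\binom{2n}{n}=\frac{\Gamma(2n+1)}{4^n\,\Gamma(n+1)^2}=\frac{\Gamma(2n)}{2^{2n-1}\,\Gamma(n)\,\Gamma(n+1)}=\frac{\Gamma\left(n+\frac12\right)}{\sqrt{\pi}\,\Gamma(n+1)}$$

hence the identity $\prod_{k\geq 1}\left(1-\frac{1}{(2k+1)^2}\right)=\frac{\pi}{4}$ is equivalent to the identity Γ(1/2) = √π, and loosely speaking:

Wallis product ←→ Γ(1/2) = √π ←→ Integral of the Gaussian function.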

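Theorem 146 (Stirling's approximation). For any n > 0 (without assuming n ∈ Z) the following inequality

holds:

$$\left(\frac{n}{e}\right)^n\sqrt{2\pi n}\,\exp\left(\frac{1}{12n+1}\right)\leq n!\leq\left(\frac{n}{e}\right)^n\sqrt{2\pi n}\,\exp\left(\frac{1}{12n}\right).$$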
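The two-sided estimate can be tested numerically, reading n! as Γ(n + 1) when n is not an integer. The following sketch (with the hypothetical helper `stirling_bounds`) only probes a handful of sample points in double precision.

```python
import math


def stirling_bounds(n):
    """Lower and upper bounds for Gamma(n + 1) given by Theorem 146."""
    base = math.sqrt(2 * math.pi * n) * (n / math.e) ** n
    return base * math.exp(1 / (12 * n + 1)), base * math.exp(1 / (12 * n))


if __name__ == "__main__":
    for n in (0.5, 1, 2.5, 10, 20):
        lower, upper = stirling_bounds(n)
        print(n, lower, math.gamma(n + 1), upper)
```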

Proof. Since $\frac{1}{n^2}-\frac{1}{n(n+1)}=\frac{1}{n^2(n+1)}$, for any m ∈ Z+ we have:

$$\begin{aligned}
\sum_{n\geq m}\frac{1}{n^2}={}&\sum_{n\geq m}\left(\frac{1}{n}-\frac{1}{n+1}\right)+\frac{1}{2}\sum_{n\geq m}\left(\frac{1}{n^2}-\frac{1}{(n+1)^2}\right) \\
&+\frac{1}{6}\sum_{n\geq m}\left(\frac{1}{n^3}-\frac{1}{(n+1)^3}\right)-\frac{1}{6}\sum_{n\geq m}\frac{1}{n^3(n+1)^3}
\end{aligned}$$

hence:

$$\psi'(m)=\sum_{n\geq m}\frac{1}{n^2}\leq\frac{1}{m}+\frac{1}{2m^2}+\frac{1}{6m^3}$$

and the inequality still holds if m ≥ 1 does not belong to Z. Additionally, in a similar fashion:

$$\psi'(m)\geq\frac{1}{m}+\frac{1}{2m^2}+\frac{1}{6m^3}-\frac{1}{30m^5}.$$

By integrating both sides with respect to the m variable twice, we get that log Γ(m) has the following behaviour:

$$\log\Gamma(m)\approx\left(m-\frac{1}{2}\right)\log(m)-\alpha m+\beta+\frac{1}{12m}$$

where α = 1 follows from the functional relation log Γ(m + 1) − log Γ(m) = log m. That gives Stirling's approximation

up to a multiplicative constant: β = log √(2π) then follows from Legendre duplication formula and the

identity Γ(1/2) = √π.

The coefficients found through creative telescoping, due to Faulhaber’s formula, directly depend on Bernoulli numbers.

In particular the Trigamma function ψ 0 (z) has the following asymptotic expansion:

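$$\psi'(z)=\sum_{m\geq 0}\frac{1}{(z+m)^2}\sim\frac{1}{z}+\frac{1}{2z^2}+\sum_{t\geq 1}\frac{B_{2t}}{z^{2t+1}}$$

$$\psi(z)\sim\log(z)-\frac{1}{2z}-\sum_{t\geq 1}\frac{B_{2t}}{2t\,z^{2t}},$$

$$\log\Gamma(z)\sim\left(z-\frac{1}{2}\right)\log(z)-z+\log\sqrt{2\pi}+\sum_{t\geq 1}\frac{B_{2t}}{2t(2t-1)\,z^{2t-1}},$$

$$\log\Gamma(z)=\left(z-\frac{1}{2}\right)\log(z)-z+\log\sqrt{2\pi}+2\int_0^{+\infty}\frac{\arctan\frac{t}{z}}{e^{2\pi t}-1}\,dt,$$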

where the last identity, following from the (inverse) Laplace transform, is also known as second Binet’s Theorem

for log Γ.

$$I(a,b)=\int_0^1x^{a-1}(1-x)^{b-1}\,dx$$

for positive integer values of a and b. By exploiting the substitution x = e^{−t} and the identity

$$\mathcal{L}\left((1-e^{-t})^{b-1}\right)(s)=\sum_{k=0}^{b-1}\binom{b-1}{k}\frac{(-1)^k}{s+k}=\frac{(b-1)!}{s(s+1)\cdots(s+b-1)}$$

we have:

$$I(a,b)=\int_0^{+\infty}e^{-ta}(1-e^{-t})^{b-1}\,dt=\frac{(b-1)!}{a(a+1)\cdots(a+b-1)}=\frac{(a-1)!\,(b-1)!}{(a+b-1)!}=\frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)}.$$

The last identity holds in a more general context:

Theorem 147 (Euler). If a, b are complex numbers with positive real parts,

$$B(a,b)\overset{\text{def}}{=}\int_0^1x^{a-1}(1-x)^{b-1}\,dx=\frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}.$$

Proof. The function B(a, b) is clearly continuous and non-vanishing on its domain of definition.

Through the substitution x 7→ (1 − x) we have B(a, b) = B(b, a) and due to the integration by parts formula:

$$B(a+1,b)=\frac{a}{a+b}\,B(a,b).$$

$B(1,1)=\int_0^1 1\,dx=1$ holds and on the positive real line both B(·, b) and B(a, ·) are log-convex, since they are moments.

The claim hence follows from the Bohr-Mollerup characterization: B(a, b) is known as Euler’s Beta function.

Euler’s Beta function immediately gives extremely useful integral representations. For instance:

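Lemma 148. If α and β are complex numbers with real parts greater than −1,

$$\int_0^{\pi/2}\sin^{\alpha}(\theta)\cos^{\beta}(\theta)\,d\theta=\frac{\Gamma\left(\frac{\alpha+1}{2}\right)\Gamma\left(\frac{\beta+1}{2}\right)}{2\,\Gamma\left(\frac{\alpha+\beta+2}{2}\right)}.$$

Lemma 149.

$$\int_0^{\pi}\sin^{2m}(\theta)\cos(2n\theta)\,d\theta=\frac{\pi(-1)^n}{4^m}\binom{2m}{m+n}.$$

Lemma 150. If Re(a) > −1 and Re(b) > Re(a) + 1 we have:

$$\int_0^{+\infty}\frac{t^a}{1+t^b}\,dt=\frac{\pi}{b\,\sin\left(\frac{\pi(a+1)}{b}\right)}.$$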

By combining Euler’s Beta function with Feynman’s trick, we get that many non-trivial integrals can be computed in

an explicit way. We immediately study a highly non-trivial example.

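$$\int_0^{\pi/2}\log^3(\sin\theta)\,d\theta=-\frac{\pi}{8}\left(\pi^2\log 2+4\log^3 2+6\,\zeta(3)\right).$$

By differentiating the identity of Lemma 148 (with β = 0) three times with respect to α:

$$\int_0^{\pi/2}\log^3(\sin\theta)\,d\theta=\frac{\sqrt{\pi}}{2}\left.\frac{d^3}{d\alpha^3}\,\frac{\Gamma\left(\frac12+\frac{\alpha}{2}\right)}{\Gamma\left(1+\frac{\alpha}{2}\right)}\right|_{\alpha=0}.$$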

In particular the value of our integral just depends on the values of Γ, Γ′, Γ″ and Γ‴ at the points 1/2 and 1.

We know that Γ(1) = 1, Γ(1/2) = √π and by differentiating both sides of Γ′(x) = Γ(x) ψ(x) multiple times we get:

$$\begin{aligned}
\Gamma'(x)&=\Gamma(x)\psi(x),\\
\Gamma''(x)&=\Gamma(x)\psi(x)^2+\Gamma(x)\psi'(x),\\
\Gamma'''(x)&=\Gamma(x)\psi(x)^3+3\Gamma(x)\psi(x)\psi'(x)+\Gamma(x)\psi''(x).
\end{aligned}$$

The problem boils down to computing the values of ψ, ψ′ and ψ″ at the points 1/2 and 1.

By Gauss Theorem we have:

$$\psi\left(\tfrac12\right)=-\gamma-2\log 2,\qquad\psi(1)=-\gamma,$$

and we also know that:

$$\psi'(a)=\sum_{n\geq 0}\frac{1}{(n+a)^2}$$

from which ψ′(1) = ζ(2) = π²/6 and ψ′(1/2) = 3 ζ(2) = π²/2 follow.

By differentiating again with respect to the a variable, we get:

$$\psi''(a)=-2\sum_{n\geq 0}\frac{1}{(n+a)^3}$$

At this point, in order to prove the claim it is enough to trust in a Computer Algebra System to perform the needed

simplifications, or just perform them by hand with a bit of patience. In a similar way it is possible to prove:

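$$\int_0^{\pi/2}\log(\sin\theta)\log(\cos\theta)\,d\theta=\frac{\pi}{2}\log^2(2)-\frac{\pi^3}{48},$$

$$\int_0^{\pi/2}\log^2(\sin\theta)\,d\theta=\frac{\pi^3}{24}+\frac{\pi}{2}\log^2(2),$$

$$\int_0^{\pi/2}\log^4(\sin\theta)\,d\theta=\frac{19\pi^5}{480}+\frac{\pi^3}{4}\log^2(2)+\frac{\pi}{2}\log^4(2)+3\pi\log(2)\,\zeta(3).$$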

As mentioned before, we now investigate the interplay between the Γ function and the area of the unit circle.

Since the graph of f(x) = √(1 − x²) over [−1, 1] is a half-circle,

$$\begin{aligned}
\pi&=4\int_0^1\sqrt{1-x^2}\,dx=2\int_0^1\frac{\sqrt{1-u}}{\sqrt{u}}\,du \\
&=2\int_0^1u^{\frac12-1}(1-u)^{\frac32-1}\,du=2\,B\left(\tfrac12,\tfrac32\right)=2\,\frac{\Gamma\left(\frac12\right)\Gamma\left(\frac32\right)}{\Gamma(2)}=\Gamma\left(\tfrac12\right)^2.
\end{aligned}$$

The last identity has an interesting generalization: let Vn (ρ) and An (ρ) be the volume and the surface area of the

Euclidean ball with radius ρ ≥ 0 in Rn . For starters, we may notice that:

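$$V_n(\rho)=\rho^n\,V_n(1),\qquad A_n(\rho)=\rho^{n-1}A_n(1),\qquad A_n(\rho)=\frac{d}{d\rho}V_n(\rho)=n\rho^{n-1}V_n(1),$$

then we consider the integral

$$I_n=\int_{\mathbb{R}^n}\exp\left(-(x_1^2+x_2^2+\ldots+x_n^2)\right)d\mu.$$

Due to Fubini's Theorem, the following identity clearly holds:

$$I_n=\left(\int_{\mathbb{R}}e^{-x^2}\,dx\right)^n=\pi^{n/2}.$$

On the other hand, by integrating along spherical shells:

$$I_n=\int_0^{+\infty}A_n(\rho)\,e^{-\rho^2}\,d\rho=A_n(1)\int_0^{+\infty}\rho^{n-1}e^{-\rho^2}\,d\rho=\frac{A_n(1)}{2}\int_0^{+\infty}u^{\frac{n}{2}-1}e^{-u}\,du$$

hence $2\pi^{n/2}=A_n(1)\,\Gamma(n/2)$ and finally:

$$V_n(\rho)=\frac{\pi^{n/2}}{\Gamma(1+n/2)}\,\rho^n,\qquad A_n(\rho)=\frac{2\pi^{n/2}}{\Gamma(n/2)}\,\rho^{n-1}.$$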

Additionally, if n ≥ 13 the volume of the unit balls is less than the volume of the unit cube.
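The closed forms for the volume and the surface area of the unit ball, and the remark about dimension 13, can be verified with a few lines of code. The function names below are illustrative and the check is done in floating point.

```python
import math


def unit_ball_volume(n):
    """V_n(1) = pi^(n/2) / Gamma(1 + n/2)."""
    return math.pi ** (n / 2) / math.gamma(1 + n / 2)


def unit_sphere_area(n):
    """A_n(1) = 2 pi^(n/2) / Gamma(n/2)."""
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)


if __name__ == "__main__":
    for n in (2, 3, 5, 12, 13):
        print(n, unit_ball_volume(n), unit_sphere_area(n))
```

The volume increases up to dimension 5 and then decays to zero, dropping below 1 (the volume of the unit cube) between n = 12 and n = 13.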

$$\pi=\sum_{k\geq 0}\frac{1}{16^k}\left(\frac{4}{8k+1}-\frac{2}{8k+4}-\frac{1}{8k+5}-\frac{1}{8k+6}\right).$$

This quite recent result (1995) has opened new frontiers in the problem of finding in an efficient way the digits in the

binary expansion of π. The proof of such identity through the instruments so far acquired is pretty simple:

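$$\begin{aligned}
\sum_{k\geq 0}\frac{1}{16^k}\left(\frac{4}{8k+1}-\frac{2}{8k+4}-\frac{1}{8k+5}-\frac{1}{8k+6}\right)
&=\int_0^1\sum_{k\geq 0}\frac{4x^{8k}-2x^{8k+3}-x^{8k+4}-x^{8k+5}}{16^k}\,dx \\
&=16\int_0^1\frac{1-x}{4-4x+2x^3-x^4}\,dx \\
&=16\int_0^1\frac{x\,dx}{(1+x^2)(1+2x-x^2)} && (x\mapsto 1-x) \\
&=4\int_0^1\frac{1+x}{1+x^2}\,dx-4\int_0^1\frac{x}{2-x^2}\,dx \\
&=(\pi+\log 4)-\log 4=\pi.
\end{aligned}$$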
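The speed of convergence of the series is easy to observe: each term contributes a factor 1/16, i.e. about 1.2 decimal digits. The sketch below just sums the series in floating point (the name `bbp_pi` is an illustrative choice); it does not implement the digit-extraction algorithm the formula is famous for.

```python
import math


def bbp_pi(terms):
    """Partial sum of the BBP series with the given number of terms."""
    return sum(
        (4 / (8 * k + 1) - 2 / (8 * k + 4) - 1 / (8 * k + 5) - 1 / (8 * k + 6)) / 16 ** k
        for k in range(terms)
    )


if __name__ == "__main__":
    for terms in (1, 2, 4, 8, 12):
        print(terms, bbp_pi(terms), abs(bbp_pi(terms) - math.pi))
```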

The author conjectures the BBP formula (or series expansions with a similar structure) might be the key for proving

the base-2 normality of the π constant, i.e., loosely speaking, the fact that any binary string appears in the binary

expansion of π an infinite number of times, possibly with a regular frequency. In turn, the base-2 normality of π

is deeply related to the (currently unknown) convergence properties of series like $\sum_{n\geq 1}\frac{\sin(2^n)}{n}$ and similar ones, linked

with universally bad averaging sequences.

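$$\sum_{n\geq 0}\frac{n!}{(2n+1)!}=2e^{1/4}\int_0^{1/2}e^{-x^2}\,dx\leq\frac{1}{3}+\frac{2}{3}\,e^{1/4}.$$

$$\begin{aligned}
\sum_{n\geq 0}\frac{n!}{(2n+1)!}&=\sum_{n\geq 0}\frac{\Gamma(n+1)}{\Gamma(2n+2)}=\sum_{n\geq 0}\frac{B(n+1,n+1)}{n!} \\
&=\int_0^1\sum_{n\geq 0}\frac{x^n(1-x)^n}{n!}\,dx \\
&=\int_0^1\exp\left[x(1-x)\right]dx \\
&=2\int_0^{1/2}\exp\left[x(1-x)\right]dx \\
&=\int_0^1\exp\left(\frac{1-x^2}{4}\right)dx.
\end{aligned}$$

The last inequality follows from the fact that, by convexity, on the interval [0, 1] we have:

$$\exp\left(\frac{1-x^2}{4}\right)\leq 1+(e^{1/4}-1)(1-x^2).$$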
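A numerical comparison of the series, its closed form and the upper bound can be sketched as follows. The closed form is rewritten through the error function, using ∫_0^{1/2} e^{−x²} dx = (√π/2)·erf(1/2); the names `series_value`, `CLOSED_FORM` and `UPPER_BOUND` are assumptions of this example.

```python
import math
from fractions import Fraction


def series_value(terms=30):
    """Partial sum of sum_{n>=0} n!/(2n+1)!, accumulated exactly and then rounded."""
    exact = sum(Fraction(math.factorial(n), math.factorial(2 * n + 1)) for n in range(terms))
    return float(exact)


# 2 e^(1/4) * int_0^(1/2) exp(-x^2) dx, written through math.erf
CLOSED_FORM = math.exp(0.25) * math.sqrt(math.pi) * math.erf(0.5)
UPPER_BOUND = 1 / 3 + 2 / 3 * math.exp(0.25)

if __name__ == "__main__":
    print(series_value(), CLOSED_FORM, UPPER_BOUND)
```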

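Exercise 154 (Reflection formula for the Dilogarithm function). Prove that for any x ∈ (0, 1) we have:

$$\operatorname{Li}_2(x)+\operatorname{Li}_2(1-x)+\log(x)\log(1-x)=\zeta(2)=\frac{\pi^2}{6}$$

where $\operatorname{Li}_2(x)=\sum_{n\geq 1}\frac{x^n}{n^2}$.

Proof. Let us set

$$f(x)=\log(x)\log(1-x)+\operatorname{Li}_2(x)+\operatorname{Li}_2(1-x).$$

We are interested in showing that f is constant, hence we compute f′:

$$f'(x)=\frac{\log(1-x)}{x}-\frac{\log(x)}{1-x}-\frac{\log(1-x)}{x}+\frac{\log x}{1-x}=0.$$

In order to prove the claim it is enough to compute f(x) at a point of regularity, or to compute the limit:

$$\lim_{x\to 1^-}f(x)=\operatorname{Li}_2(1)=\zeta(2).$$

In particular:

$$\sum_{n\geq 1}\frac{1}{2^n\,n^2}=\operatorname{Li}_2\left(\tfrac12\right)=\frac{1}{2}\left(f\left(\tfrac12\right)-\log^2 2\right)=\frac{\pi^2}{12}-\frac{\log^2 2}{2}.$$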
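Both the special value at 1/2 and the reflection formula itself can be checked from the defining series, which converges geometrically for |x| < 1. A minimal sketch, with `dilog_series` as an illustrative helper name:

```python
import math


def dilog_series(x, terms=400):
    """Partial sum of Li_2(x) = sum_{n>=1} x^n / n^2, meant for 0 <= x < 1."""
    return sum(x ** n / n ** 2 for n in range(1, terms + 1))


def reflection_lhs(x):
    """Li_2(x) + Li_2(1-x) + log(x) log(1-x), expected to equal pi^2/6 on (0, 1)."""
    return dilog_series(x) + dilog_series(1 - x) + math.log(x) * math.log(1 - x)


if __name__ == "__main__":
    print(dilog_series(0.5), math.pi ** 2 / 12 - math.log(2) ** 2 / 2)
    print(reflection_lhs(0.3), math.pi ** 2 / 6)
```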

n≥1


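Exercise 155 ("Continuous" binomial theorem). Prove that for any n ∈ N we have:

$$\int_{-\infty}^{+\infty}\binom{n}{x}\,dx=2^n.$$

Proof. By the reflection formula for the Γ function,

$$\binom{n}{x}=\frac{n!}{\pi}\cdot\frac{\sin(\pi x)}{(n-x)(n-1-x)\cdot\ldots\cdot(1-x)\,x}$$

and since

$$\frac{1}{(n-x)(n-1-x)\cdot\ldots\cdot(1-x)\,x}=\sum_{k=0}^{n}\frac{c_k}{x-k},\qquad\int_{-\infty}^{+\infty}\frac{\sin(\pi x)}{x-k}\,dx=(-1)^k\pi$$

follows from computing a partial fraction decomposition through the residue Theorem, $c_k=\frac{(-1)^k}{n!}\binom{n}{k}$, we have:

$$\int_{-\infty}^{+\infty}\binom{n}{x}\,dx=\sum_{k=0}^{n}\binom{n}{k}=2^n.$$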

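Exercise 156. By recalling that sinc(x) is the function that equals 1 at x = 0 and (sin x)/x anywhere else, prove that for

any couple (α, β) of real numbers in (0, 1) we have:

$$\sum_{n\in\mathbb{Z}}\operatorname{sinc}(n\alpha)\operatorname{sinc}(n\beta)=\frac{\pi}{\max(\alpha,\beta)}.$$

Proof. Since

$$\sum_{n\in\mathbb{Z}}\operatorname{sinc}(n\alpha)\operatorname{sinc}(n\beta)=1+\frac{2}{\alpha\beta}\sum_{n\geq 1}\frac{\sin(n\alpha)\sin(n\beta)}{n^2},$$

due to the addition formulas for the sine and cosine functions it is enough to prove the equality

$$\forall\theta\in[0,2\pi],\qquad g(\theta)\overset{\text{def}}{=}\sum_{n\geq 1}\frac{\cos(n\theta)}{n^2}=\frac{\pi^2}{6}-\frac{\theta(2\pi-\theta)}{4}$$

to get:

$$\begin{aligned}
\sum_{n\in\mathbb{Z}}\operatorname{sinc}(n\alpha)\operatorname{sinc}(n\beta)&=1+\frac{g(|\alpha-\beta|)-g(\alpha+\beta)}{\alpha\beta} \\
&=1+\frac{1}{4\alpha\beta}\left[(\alpha-\beta)^2-(\alpha+\beta)^2+2\pi\left(\alpha+\beta-|\alpha-\beta|\right)\right] \\
&=1+\frac{1}{4\alpha\beta}\left[-4\alpha\beta+4\pi\min(\alpha,\beta)\right]=\frac{\pi}{\max(\alpha,\beta)}.
\end{aligned}$$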

Given

$$A_n=\prod_{k=1}^{n}\frac{3k}{6k-4},$$

what is the value of $\sum_{n\geq 1}A_n$?

Proof. Since $A_n=\frac{n}{2^n}\cdot B\left(n,\frac13\right)$, we have:

$$\begin{aligned}
\sum_{n\geq 1}A_n&=\sum_{n\geq 1}\frac{n}{2^n}\int_0^1x^{n-1}(1-x)^{-2/3}\,dx \\
&=\int_0^1\frac{2(1-x)^{-2/3}}{(2-x)^2}\,dx \\
&=\int_0^1\frac{2x^{-2/3}}{(1+x)^2}\,dx && (x\mapsto 1-x) \\
&=\int_0^1\frac{6\,du}{(1+u^3)^2} && (x\mapsto u^3)
\end{aligned}$$

and the last integral can be (tediously) computed through partial fraction decomposition:

$$\sum_{n\geq 1}A_n=1+\frac{4\pi}{3\sqrt{3}}+\frac{4\log 2}{3}.$$

def

X sin(n2 x)

f (x) =

n

n≥1

we have that:

π

lim f (x) = .

x→0+ 2

Proof. f (x) is defined by a pointwise convergent series by Dirichlet’s test and by Weyl’s inequality. It follows

that the wanted limit can be computed through a convolution trick, i.e. by expoiting an approximated identity:

Z +∞ X mn X 2m2 n

lim+ f (x) = lim f (x)me−mx dx = lim = lim .

x→0 m→+∞ 0 m→+∞ m2 + n4 m→+∞ 4m4 + n4

n≥1 n≥1

2m2 n

m 1 1

= − ,

4m4 + n4 2 2m2 + 2mn + n2 2m2 − 2mn + n2

hence:

X 2m2 n i

lim = lim H−m(1+i) − Hm(−1+i) − Hm(1−i) + Hm(1+i)

m→+∞ 4m4 + n4 m→+∞ 4

n≥1

i π

= [log(−1 − i) − log(−1 + i) − log(1 − i) + log(1 + i)] = .

4 4

In a similar way it is possible to show that:

X sin(nk x) π

∀k ∈ Z+ , lim+ = .

x→0 n 2k

n≥1

Exercise 159 (Raabe’s log Γ Theorem). Prove that for any positive real number α we have:

$$\int_0^1\log\Gamma(x+\alpha)\,dx=\alpha\log\alpha-\alpha+\log\sqrt{2\pi}.$$


Exercise 160 (Glasser’s Master Theorem). Prove that if F and F ◦ ϕ are integrable functions on the real line, where

N

X |ak |

ϕ(x) = |a|x − ,

x − bk

k=1

F (x) dx = F (ϕ(x)) dx.

−∞ −∞

For the proof presented here, the author is really grateful to achillehui.

φ(R) ⊂ R∗

def

P = φ−1 (∞) = p ∈ C : p poles of φ(z) ⊂ R

=⇒

−1

φ (R) ⊂ R

S

2. Split R \ P as a countable union of its connected components (an , bn ) . Each connected component is an open

n

−

interval (an , bn ) and on such an interval, φ(z) increases from −∞ at a+

n to ∞ at bn .

∞

[

{0} ⊂ D1 ⊂ D2 ⊂ · · · with Dk = C

k=1

whose boundaries ∂Dk are ”well behaved”, ”diverge” to infinity and |z − φ(z)| is bounded on the boundaries.

More precisely, let

def

lim R = ∞

Rk

= inf |z| : z ∈ ∂Dk k→∞ k

def

lim Lk2 = 0

R

Lk = |dz| < ∞ and

∂Dk k→∞ Rk

def

Mk = sup |z − φ(z)| : z ∈ ∂Dk M = supk Mk < ∞

Given such a meromorphic function φ(z) and any Lebesgue integrable function f (x) on R, we have the following

identity: Z ∞ Z ∞

f (φ(x))dx = f (x)dx (∗1)

−∞ −∞

In order to prove this, we split our integral into a sum over the connected components of R \ P .

Z Z XZ bn

f (φ(x))dx = f (φ(x))dx = f (φ(x))dx

R R\P n an

For any connected component (an , bn ) of R \ P and y ∈ R, consider the roots of the equation φ(x) = y. Using

properties (1) and (2) of φ(z), we find there is a unique root for the equation y = φ(x) over (an , bn ). Let us call this

root as rn (y). Enforcing the substitution y = φ(x) the integral becomes

!

XZ ∞ Z ∞

drn (y) X drn (y)

f (y) dy = f (y) dy.

n −∞ dy −∞ n

dy

drn (y)

We can use the obvious fact dy ≥ 0 and dominated convergence theorem to justify the switching of order of

summation and integral.


X drn (y) ?

=1 (∗2)

n

dy

For any y ∈ R, let R(y) = φ−1 (y) ⊂ R be the collection of roots of the equation φ(z) = y.

Over any Jordan domain Dk , we have the following expansion

φ0 (z) X 1 X 1

= − + something analytic

φ(z) − y z−r z−p

r∈R(y)∩Dk p∈P ∩Dk

This leads to

φ0 (z)

Z

X X 1

r− p= z dz

2πi ∂Dk φ(z) − y

r∈R(y)∩Dk p∈P ∩Dk

φ0 (z)

Z Z

X drn (y) 1 1 d 1

= z dz = − z dz

dy 2πi ∂Dk (φ(z) − y)2 2πi ∂Dk dz φ(z) − y

rn (y)∈Dk

Z

1 dz

=

2πi ∂Dk φ(z) − y

For those k large enough to satisfy Rk > 2(M + |y|), we can expand the integrand in last line as

∞

1 1 1 X (y + z − φ(z))j

= = +

φ(z) − y z − (y + z − φ(z)) z j=1 z j+1

∞ Z ∞ j

(|y| + |z − φ(z)|)j

X drn (y)

− 1 ≤ 1

X (M + |y|)Lk X M + |y| M + |y| Lk

|dz| ≤ 2 ≤

rn (y)∈Dk dy 2π

j=1 ∂Dk

|z|j+1 2πR k j=0

R k π Rk2

Lk

Since lim 2 = 0, this leads to

k→∞ Rk

X drn (y) X drn (y)

= lim =1

n

dy k→∞ dy

rn (y)∈Dk

drn (y)

This justifies (∗2) and hence (∗1) is proved. Notice all the dy are positive, there is no issue in rearranging the order

of summation in last line.

Corollary 161.

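$$\int_0^{+\infty}\frac{x^2\,dx}{x^4+x^2+1}\overset{\text{parity}}{=}\frac{1}{2}\int_{-\infty}^{+\infty}\frac{dx}{\left(x-\frac{1}{x}\right)^2+3}\overset{\text{G.M.T.}}{=}\frac{1}{2}\int_{-\infty}^{+\infty}\frac{dx}{x^2+3}=\frac{\pi}{2\sqrt{3}}.$$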

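$$\cot x=\lim_{N\to+\infty}\left(\frac{1}{x}+\frac{1}{x+\pi}+\frac{1}{x-\pi}+\cdots+\frac{1}{x+N\pi}+\frac{1}{x-N\pi}\right)$$

leading to:

$$\int_{-\infty}^{+\infty}f(x-\cot x)\,dx=\int_{-\infty}^{+\infty}f(x)\,dx.$$

By the substitution x ↦ π/2 − x it follows that:

$$\int_{-\infty}^{+\infty}\frac{dx}{1+(x+\tan x)^2}=\int_{-\infty}^{+\infty}\frac{dx}{1+x^2}=\pi.$$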

Corollary 163. For any a, b ∈ R+ ,

$$\int_0^{+\infty}e^{-\left(ax^2+\frac{b}{x^2}\right)}\,dx=\sqrt{\frac{\pi}{4a}}\;e^{-2\sqrt{ab}}.$$
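The identity of Corollary 163 can be tested with an elementary quadrature. The sketch below uses the midpoint rule on a truncated interval, which is adequate here since the integrand and all of its derivatives vanish at both ends; the function names, the truncation point and the number of steps are assumptions of this example, tuned for a and b of moderate size.

```python
import math


def gaussian_type_integral(a, b, upper=12.0, steps=100000):
    """Midpoint-rule approximation of int_0^upper exp(-(a x^2 + b / x^2)) dx."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = h * (i + 0.5)  # midpoints avoid the removable singularity at x = 0
        total += math.exp(-(a * x * x + b / (x * x)))
    return h * total


def closed_form(a, b):
    """sqrt(pi / (4a)) * exp(-2 sqrt(ab))."""
    return math.sqrt(math.pi / (4 * a)) * math.exp(-2 * math.sqrt(a * b))


if __name__ == "__main__":
    for a, b in ((1.0, 2.0), (2.0, 0.5)):
        print(a, b, gaussian_type_integral(a, b), closed_form(a, b))
```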

The Riemann ζ function and its analytic continuation. For any complex number with real part greater

than one the Riemann ζ function is defined through the absolutely convergent series

$$\zeta(s)=\sum_{n\geq 1}\frac{1}{n^s}$$

that through the (inverse) Laplace transform admits the following integral representation:

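$$\zeta(s)=\frac{1}{\Gamma(s)}\int_0^{+\infty}\frac{x^{s-1}}{e^x-1}\,dx.$$

For the same values of s we also have

$$\eta(s)\overset{\text{def}}{=}\sum_{n\geq 1}\frac{(-1)^{n+1}}{n^s}=\left(1-\frac{2}{2^s}\right)\zeta(s),$$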

but the series defining η(s) has a larger domain of (conditional) convergence, Re(s) > 0.

Due to such fact we are allowed to extend the ζ function to the half-plane Re(s) > 0 through

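$$\begin{aligned}
\zeta(s)&=\left(1-\frac{2}{2^s}\right)^{-1}\sum_{n\geq 1}\frac{(-1)^{n+1}}{n^s} \\
&=\frac{1}{\Gamma(s)}\left(1-\frac{2}{2^s}\right)^{-1}\int_0^{+\infty}\frac{x^{s-1}}{e^x+1}\,dx && (\text{by }\mathcal{L}^{-1}) \\
&=\frac{4^s}{2(2^s-2)\,\Gamma(s+1)}\int_0^{+\infty}\frac{x^s\,dx}{\cosh^2(x)} && (\text{by parts}) \\
&=\frac{4^s}{2(2^s-2)\,\Gamma(s+1)}\int_0^1\operatorname{arctanh}(x)^s\,dx && (\text{by substitution}) \\
&=\frac{2^{s-1}}{(2^s-2)\,\Gamma(s+1)}\int_0^1\log^s\left(\frac{1+x}{1-x}\right)dx.
\end{aligned}$$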

These integral representations shed a good amount of light on the tight interplay between logarithmic integrals

and values of the ζ function. They also underline that s = 1 is a simple pole with residue 1, that ζ(s) attains

negative values for any s ∈ [0, 1) and that ζ(0) = −1/2. By exploiting Euler's product

$$\forall s:\ \operatorname{Re}(s)>1,\qquad\zeta(s)=\prod_{p\in\mathcal{P}}\left(1-\frac{1}{p^s}\right)^{-1}$$

and its logarithmic derivative, we get that there is a very deep correspondence between the distribution of zeroes

of the ζ function in the region 0 < Re(s) < 1 and the distribution of prime numbers:

$$\sum_{p^m<X}\log p=X-(b+1)-\lim_{T\to+\infty}\sum_{|\operatorname{Im}(\rho)|<T}\frac{X^{\rho}}{\rho}+\sum_{n\geq 1}\frac{X^{-2n}}{2n}.$$


lim =1

n→+∞ n

is substantially equivalent to the statement the ζ function is non-vanishing on the line Re(s) = 1, statement

that was almost simultaneously (but independently) proved by Hadamard and de la Vallée-Poussin through

a trigonometric trick. By defining π(x) as the number of prime numbers in the interval [1, x], the following

strengthening of the PNT

$$\left|\pi(x)-\int_2^x\frac{dt}{\log t}\right|\ll\sqrt{x}\,\log^2(x)$$

is substantially equivalent to the Riemann Hypothesis (RH): all the zeroes of the ζ(s) function in the region

0 < Re(s) < 1 lie on the critical line Re(s) = 1/2. Conversely, inequalities of the form

$$\left|\pi(x)-\int_2^x\frac{dt}{\log t}\right|\ll x^{\alpha}$$

for some α ∈ (1/2, 1) imply the absence of zeroes of the ζ function in subsets of the critical strip 0 < Re(s) < 1.

The best result actually known about the zero-free region for the ζ function is due to Korobov and Vinogradov:

$$\sum_{p^m<X}\log p=X+O\left(X\exp\left(-c\,\frac{\log^{3/5}X}{(\log\log X)^{1/5}}\right)\right).$$

This result (pitifully very far from RH) comes from a sophisticated combinatorial manipulation of exponential sums

(variants on Van Der Corput’s trick), combined with classical inequalities in Complex Analysis (Hadamard,

Borel-Caratheodory). The reader can find on Terence Tao’s blog a very detailed lecture. The ζ function can

be further extended to the whole complex plane. By setting

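$$\theta(z)=\sum_{n\in\mathbb{Z}}e^{\pi i n^2 z}$$

we have the functional equation

$$\theta(z)=\frac{1}{\sqrt{-iz}}\,\theta\left(-\frac{1}{z}\right)$$

and the function

$$\xi(s)\overset{\text{def}}{=}\pi^{-s/2}\,\Gamma\left(\frac{s}{2}\right)\zeta(s)$$

has the following integral representation

$$\xi(s)=-\frac{1}{s(1-s)}+\int_1^{+\infty}\frac{\theta(iy)-1}{2}\left(y^{s/2}+y^{(1-s)/2}\right)\frac{dy}{y}$$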

and ξ turns out to be a meromorphic function (with simple poles at s = 0 and s = 1) such that ξ(s) = ξ(1 − s).

This is known as the reflection formula for the ζ function.

By defining P as the set of prime numbers, prove that for any real number s > 1 we have:

$$\prod_{p\in\mathcal{P}}\left(1-\frac{1}{p^s}\right)^{-1}=\zeta(s)=\sum_{n\geq 1}\frac{1}{n^s}$$

then prove the following inequality:

$$\sum_{p\in\mathcal{P}}\frac{1}{p^2}\leq\log\sqrt{\frac{5}{2}}.$$

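$$\sum_{n\geq 1}\left(\zeta(2n)-1\right)=\frac{3}{4}.$$

$$\sum_{n\geq 1}\left(\zeta(2n)-1\right)=\lim_{z\to 1^-}\left(\frac{1-\pi z\cot(\pi z)}{2}-\frac{z^2}{1-z^2}\right)\overset{\text{d.H.}}{=}\frac{3}{4}.$$

As an alternative,

$$\sum_{n\geq 1}\left(\zeta(2n)-1\right)=\sum_{n\geq 1}\int_0^{+\infty}\frac{x^{2n-1}}{(2n-1)!}\left(\frac{1}{e^x-1}-\frac{1}{e^x}\right)dx=\int_0^{+\infty}\frac{\sinh(x)}{e^x(e^x-1)}\,dx=\int_1^{+\infty}\frac{t+1}{2t^3}\,dt=\frac{3}{4}$$

or simply:

$$\sum_{n\geq 1}\left(\zeta(2n)-1\right)=\sum_{n\geq 1}\sum_{m\geq 2}\frac{1}{m^{2n}}=\sum_{m\geq 2}\frac{1}{m^2-1}=\frac{1}{2}\sum_{m\geq 2}\left(\frac{1}{m-1}-\frac{1}{m+1}\right)=\frac{H_2}{2}.$$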

n≥1 n≥1 m≥2 m≥2 m≥2

Exercise 166. Prove that for any s ≥ 1 the following inequality holds:

$$\int_0^{+\infty}\frac{dx}{(1+x^2)^s}\leq\sqrt{\frac{\pi}{4s-3}}$$

and notice that it implies π < √10 (Hint: evaluate both sides at s = 3).

X 1 √

≤ 4 ζ(4) + π 2 ζ(3).

m4 +n 4

(m,n)∈A

π

= · .

2 2n + 1 4n

n≥0


X

2 1

log 1+ < 1.

n

n≥1

2Hn−1 n

Proof. By recalling log2 (1 − z) =

P

n≥1 n z and the integral representation for harmonic numbers we have:

X

1 X X 2Hn−1 (−1)n

2

S= log 1+ =

m nmn

m≥1 m≥1 n≥1

Z 1XX

2Hn−1 (z n−1 − 1)(−1)n

= dz

0 m≥1 n≥1 nmn (z − 1)

Z 1X z 1

log 1 + m − z log 1 + m

= 2 dz

0 m≥1 z(1 − z)

Z 1

log Γ(1 + z)

(by Euler’s product for the Γ function) = −2 dz

0 z(1 − z)

Z 1

z

(by integration by parts) = 2 ψ(z + 1) log dz

0 1−z

Z 1

1 1

(by the reflection formula for ψ) = 2 − − π cot(πz) log(z) dz

0 z 1−z

The magic now comes from studying the function g(z) = z1 − 1−z 1

− π cot(πz) over the interval (0, 1).

It is extremely close to 2z − 1 by “cancellation of singularities”, hence

Z 1

S≈2 (2z − 1) log(z) dz = 1.

0

1

g(z) is symmetric with respect to z = 2 and log(z) is negative and increasing over (0, 1), hence ≈ is indeed a <.

X n+1

log2 ≤ 2 log 2 + 2 − log2 2.

n−1

n≥2

X (−1)k+1 X 1

= 1.

k k2n + 1

k≥1 n≥0

Proof. The dominated convergence Theorem allows us to perform the following manipulations:

X (−1)k+1 X 1 X X (−1)k 1 1 1

X h X (−1)k+1

h+1 2

0

= − + − . . . = (−1)

k k2n + 1 k k2n k 2 4n k 3 8n 2h − 1 k h+1

k≥1 n≥0 n≥0 k≥1 h≥1 k≥1

Z +∞ Z +∞

X0 1 X (−1)h+1 xh

= (−1)h+1 ζ(h + 1) = dx = e−x dx = 1

0 ex − 1 h! 0

h≥1 h≥1

P0

where denotes a regularized sum in the Cesàro sense.


An exercise on generalized Euler sums, logarithmic integrals and values of ζ.

∞

X (−1)k 3

Hk Hk−1 = ζ(4).

k2 16

k=2

Proof. We are going to see a generalization of the approach used (11) to show that ζ(4) = (2/5) ζ(2)². This is also

an opportunity to make a tribute to Pieter J. de Doelder (1919-1994) from Eindhoven University of Technology,

who evaluated in closed form the given series in a somewhat famous paper published in 1991. One may start by

using the following identity coming from the Cauchy product,

∞

X Hn n+1

ln2 (1 + x) = 2 (−1)n−1 x

n=1

n+1

giving

∞

1

ln(1 − x) ln2 (1 + x)

Z Z 1

n−1 Hn

X

dx = 2 (−1) xn ln(1 − x) dx,

0 x n=1

n + 1 0

Z 1

Hn+1

xn ln(1 − x) dx = − , n ≥ 0,

0 n+1

one gets

∞

1

ln(1 − x) ln2 (1 + x)

Z X Hn Hn−1

dx = 2 (−1)n−1 .

0 x n=2

n2

Here are the main steps which de Doelder took to evaluate the related integral. We clearly have

1 1

ln3 (1 + x) 1

ln(1 − x) ln2 (1 + x)

Z Z Z

1+x dx

ln3 = dx − 3 dx

0 1−x x 0 x 0 x

1

ln2 (1 − x) ln(1 + x) 1

ln3 (1 − x)

Z Z

+3 dx − dx

0 x 0 x

and

ln3 1 − x2

Z 1 3

1

ln(1 − x) ln2 (1 + x)

Z Z 1

ln (1 + x)

dx = dx + 3 dx

0 x 0 x 0 x

1

ln2 (1 − x) ln(1 + x) 1

ln3 (1 − x)

Z Z

+3 dx + dx,

0 x 0 x

1 1

Z 1 Z 1 3

ln (1 − x)

Z Z

1 + x dx

6 dx = dx − ln3 −2 dx

0 x 0 x 0 1−x x 0 x

= I1 − I2 − 2I3 .


It is easy to obtain

1

ln3 1 − x2 1 1 ln3 (1 − u)

Z Z

I1 = dx = du (u = x2 )

0 x 2 0 u

1 1 ln3 v

Z

= dv (v = 1 − u)

2 0 1−v

∞ Z

1X 1 n 3

= v ln v dv

2 n=0 0

∞

X 1 π4

= −3 = − ,

n=1

n4 30

similarly

1

ln3 (1 − x) π4

Z

I3 = dx = − .

0 x 15

1−x dx −2 du

By the change of variable, u = , one has = getting

1+x x 1 − u2

1

ln3 u

1

Z Z

1+x dx

I2 = ln3 = −2 2

du

0 1−x x 0 1−u

∞ Z 1

X

= −2 u2n ln3 u dv

n=0 0

∞

X 1 π4

= 12 4

= .

n=0

(2n + 1) 8

Then,

1

ln(1 − x) ln2 (1 + x) π4

Z

dx = −

0 x 240

and

∞

X Hn Hn−1 3 π4

(−1)n = ζ(4) =

n=2

n2 16 480

as wanted.

ex + 1

Z

I1 = dx = 1.

−∞ (ex − x + 1)2 + π 2

Lemma 174. If a > 0 and b ∈ R,

Z +∞

a2 dx 1

=

1 + W a1 e−b/a

x 2 2

−∞ (e − ax − b) + (aπ)


where $W$ is Lambert’s function, i.e. the principal branch of the inverse function of $x\mapsto xe^x$.

Let us consider the function $f(z)=\frac{1}{a\log(-z)+b-z}\cdot\frac{1}{z}$ on the region $D=\mathbb{C}\setminus\mathbb{R}^{\geq 0}$. For any $z\in D$ it is
possible to pick some $r>0$ and some $\varphi\in\left(-\frac{\pi}{2},\frac{\pi}{2}\right)$
same sign as $\varphi$ and the real part is a monotonic function of the $r$ variable. It follows that
$z=-a\cdot W\!\left(\frac1a e^{-b/a}\right)$ is the only pole of $f$, at-
computation gives
\[ \operatorname{Res}\left(f(z),\ z=-a\cdot W\!\left(\tfrac1a e^{-b/a}\right)\right) = \frac1a\cdot\frac{1}{1+W\!\left(\frac1a e^{-b/a}\right)}, \]

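\[ \int_{\gamma_{R,\varepsilon}} f\,dz + \int_{C_R} f\,dz + \int_{\delta_{R,\varepsilon}} f\,dz + \int_{c_\varepsilon} f\,dz = \frac1a\cdot\frac{2\pi i}{1+W\!\left(\frac1a e^{-b/a}\right)}. \]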

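From $L(C_R)\sim 2\pi R$ and $M(C_R)=\max_{z\in C_R}|f(z)|\sim\frac{1}{R^2}$ it follows that $\left|\int_{C_R} f\,dz\right|\leq L(C_R)\,M(C_R)\sim\frac{2\pi}{R}$.
Therefore the contribution of $\int_{C_R} f\,dz$ is negligible as $R\to+\infty$. From $L(c_\varepsilon)=\pi\varepsilon$ and $M(c_\varepsilon)=\max_{z\in c_\varepsilon}|f(z)|\sim\frac{-1}{a\log\varepsilon}\cdot\frac{1}{\varepsilon}$ it follows that $\left|\int_{c_\varepsilon} f\,dz\right|\leq L(c_\varepsilon)\,M(c_\varepsilon)\sim\frac{-\pi}{a\log\varepsilon}$, hence the contribution of $\int_{c_\varepsilon} f\,dz$ is negligible as $\varepsilon\to 0^+$.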

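By letting $R\to+\infty$ and $\varepsilon\to 0^+$ we get the equality
\[ \int_0^{\infty}\left(\frac{1}{a\log(-x-i0^+)+b-x}-\frac{1}{a\log(-x+i0^+)+b-x}\right)\frac{dx}{x} = \frac1a\cdot\frac{2\pi i}{1+W\!\left(\frac1a e^{-b/a}\right)} \]
and since:
\[ \frac{1}{a(\log x-i\pi)+b-x}-\frac{1}{a(\log x+i\pi)+b-x} = \frac{2\pi i\cdot a}{(a\log x+b-x)^2-(i\pi a)^2}, \]
we have
\[ \int_0^{\infty}\frac{a^2}{(a\log x+b-x)^2+(a\pi)^2}\cdot\frac{dx}{x} = \frac{1}{1+W\!\left(\frac1a e^{-b/a}\right)}, \]
which, through the substitution $x\mapsto e^x$, becomes
\[ \int_{-\infty}^{\infty}\frac{a^2}{(ax+b-e^x)^2+(a\pi)^2}\,dx = \frac{1}{1+W\!\left(\frac1a e^{-b/a}\right)}. \]
In our case, by choosing $a=1$ and $b=-1$ we get that $I_1$ depends on $W(e)=1$ and equals $1$.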


\[ \begin{aligned}
I_1 &= \int_0^{+\infty}\frac{u+1}{u}\cdot\frac{du}{(u+1-\log u)^2+\pi^2}\\
&= \frac{1}{\pi}\int_0^{+\infty}\!\!\int_0^{+\infty}(u+1)\,u^{x-1}e^{-(u+1)x}\sin(\pi x)\,dx\,du\\
&= \frac{2}{\pi}\int_0^{+\infty}e^{-x}x^{-x}\,\Gamma(x)\sin(\pi x)\,dx\\
&= 2\int_0^{+\infty}\frac{e^{-x}x^{-x}}{\Gamma(1-x)}\,dx
\end{aligned} \]
\[ \int_0^{+\infty}\frac{(ex)^{-x}}{\Gamma(1-x)}\,dx = \frac12 \tag{HNT} \]
equivalent to the claim $I_1=1$. It would be interesting to find an independent proof of (HNT), maybe based
on Glasser’s master theorem, Ramanujan’s master theorem or Lagrange inversion. There also is an interesting
discrete analogue of (HNT),
\[ \sum_{n\geq 1}\frac{n^n}{n!\,(4e)^{n/2}} = 1 \]
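The discrete analogue lends itself to a quick numerical check. This is only a sketch under stated assumptions: the helper name `discrete_hnt` and the number of terms are illustrative, and each term is evaluated through logarithms (`math.lgamma`) to avoid overflow of $n^n$ and $n!$. The terms decay geometrically with ratio close to $\sqrt{e}/2$, so a few hundred terms are more than enough.

```python
import math

def discrete_hnt(terms: int = 400) -> float:
    """Partial sum of sum_{n>=1} n^n / (n! * (4e)^(n/2)) (hypothetical helper)."""
    total = 0.0
    for n in range(1, terms + 1):
        # log of n^n / (n! (4e)^(n/2)); log(4e) = log(4) + 1
        log_term = n * math.log(n) - math.lgamma(n + 1) - 0.5 * n * (math.log(4.0) + 1.0)
        total += math.exp(log_term)
    return total

print(discrete_hnt())
```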

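From the Weierstrass product
\[ \cosh(\pi z/2) = \prod_{m\geq 0}\left(1+\frac{z^2}{(2m+1)^2}\right) \]
by applying $\frac{d^2}{dz^2}\log(\cdot)$ to both sides we get:
\[ \frac{\pi^2}{8\cosh^2(\pi z/2)} = \sum_{m\geq 0}\frac{(2m+1)^2-z^2}{\left((2m+1)^2+z^2\right)^2} \]
\[ \sum_{n\geq 0}\frac{\pi^2}{8\cosh^2(\pi(2n+1)/2)} = \sum_{n\geq 0}\sum_{m\geq 0}\frac{(2m+1)^2-(2n+1)^2}{\left((2m+1)^2+(2n+1)^2\right)^2} \]
\[ \sum_{n\geq 0}\sum_{m\geq 0}\int_0^{+\infty}\cos((2n+1)x)\,x\,e^{-(2m+1)x}\,dx = \sum_{n\geq 0}\int_0^{+\infty}\frac{x\cos((2n+1)x)}{2\sinh(x)}\,dx \]
\[ \sum_{n\geq 0}\int_0^{+\infty}\frac{x\cosh(x)-\sinh(x)}{2\sinh^2(x)}\cdot\frac{\sin((2n+1)x)}{2n+1}\,dx \]
$\sum_{n\geq 0}\frac{\sin((2n+1)x)}{2n+1}$ is the Fourier series of a $2\pi$-periodic rectangle wave that equals $\frac{\pi}{4}$ over $(0,\pi)$
and $-\frac{\pi}{4}$ over $(\pi,2\pi)$. That implies, by massive cancellation:
\[ \sum_{n\geq 0}\frac{1}{\cosh^2(\pi(2n+1)/2)} = \frac{1}{2\pi}. \]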


√

1 s 8π

On the other hand, the Fourier transform of cosh2 (πx)

is given by by sinh(πs) .

By Poisson’s summation formula,
\[ \sum_{n\geq 1}\frac{(-1)^{n+1}\,n}{\sinh(\pi n)} = \frac{1}{4\pi}. \]
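Both hyperbolic series in this passage converge extremely fast, so the stated closed forms are easy to test numerically. The sketch below is illustrative only; the function names `sech_squared_sum` and `alternating_sinh_sum` and the truncation at 40 terms are assumptions, not part of the notes.

```python
import math

def sech_squared_sum(terms: int = 40) -> float:
    """Partial sum of sum_{n>=0} 1 / cosh^2(pi (2n+1) / 2)."""
    return sum(1.0 / math.cosh(math.pi * (2 * n + 1) / 2) ** 2 for n in range(terms))

def alternating_sinh_sum(terms: int = 40) -> float:
    """Partial sum of sum_{n>=1} (-1)^(n+1) n / sinh(pi n)."""
    return sum((-1) ** (n + 1) * n / math.sinh(math.pi * n) for n in range(1, terms + 1))

print(sech_squared_sum(), 1 / (2 * math.pi))
print(alternating_sinh_sum(), 1 / (4 * math.pi))
```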

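\[ \int_0^{\pi/4}\arctan\left(\frac{\sqrt2\cos(3\varphi)}{(3+2\cos(2\varphi))\sqrt{\cos(2\varphi)}}\right)d\varphi = 0. \]
\[ I = \int_0^1\frac{1}{1+t^2}\arctan\left(\frac{\sqrt2\,(1-3t^2)}{(5+t^2)\sqrt{1-t^2}}\right)dt = \frac{\pi^2}{8}-\int_0^1\frac{3\sqrt2\,t\arctan(t)}{(3-t^2)\sqrt{1-t^2}}\,dt. \]
\[ \int\frac{3\sqrt2\,t}{(3-t^2)\sqrt{1-t^2}}\,dt = -3\arctan\sqrt{\frac{1-t^2}{2}} \]
integrating by parts once again we get:
\[ I = \frac{\pi^2}{8}-3\int_0^1\frac{1}{1+t^2}\arctan\sqrt{\frac{1-t^2}{2}}\,dt \]
\[ \int_0^1\frac{dt}{1+t^2}\arctan\sqrt{\frac{1-t^2}{2}} = \int_0^{1/\sqrt2}\frac{\arctan\sqrt{1-2t^2}}{1+t^2}\,dt = \frac{\pi^2}{24} \]
\[ \int_0^1\frac{dt}{1+t^2}\,(1-t^2)^{\frac{2m+1}{2}},\qquad \int_0^{1/\sqrt2}\frac{(1-2t^2)^{\frac{2m+1}{2}}}{1+t^2}\,dt \]
can be computed through the residue theorem or other techniques. For instance:
\[ \int_0^1\frac{(1-t)^{\frac{2m+1}{2}}}{t^{\frac12}(1+t)}\,dt = \sum_{n\geq 0}(-1)^n\int_0^1(1-t)^{\frac{2m+1}{2}}\,t^{\,n-\frac12}\,dt = \sum_{n\geq 0}(-1)^n\frac{\Gamma\!\left(m+\frac32\right)\Gamma\!\left(n+\frac12\right)}{\Gamma(m+n+2)} \]
or just:
\[ \int_0^1\frac{\sqrt{\frac{1-t^2}{2}}}{(1+t^2)\left(1+\frac{1-t^2}{2}u^2\right)}\,dt = \frac{\pi}{2(1+u^2)}\left(1-\frac{1}{\sqrt{2+u^2}}\right) \]
from which:
\[ \int_0^1\frac{dt}{1+t^2}\arctan\sqrt{\frac{1-t^2}{2}} = \frac{\pi}{2}\int_0^1\frac{du}{1+u^2}\left(1-\frac{1}{\sqrt{2+u^2}}\right) = \frac{\pi^2}{24} \]
as wanted, since:
\[ \int\frac{du}{(1+u^2)\sqrt{2+u^2}} = \arctan\frac{u}{\sqrt{2+u^2}}. \]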


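Exercise 176. Prove that for any $a>-2$ the following identity holds:
\[ \int_0^1\frac{1-x}{1+x}\cdot\frac{dx}{\sqrt{x^4+ax^2+1}} = \frac{1}{\sqrt{a+2}}\log\left(1+\frac{\sqrt{a+2}}{2}\right). \]
Proof. Simple algebraic manipulations (the substitution $u=\frac1x-x$) allow us to write the given integral in the following form:
\[ \int_0^{+\infty}\left(1-\frac{2}{\sqrt{4+u^2}}\right)\frac{du}{u\sqrt{u^2+a+2}} \]
and by setting $u=\sqrt{a+2}\,\sinh\theta$ we get:
\[ \frac{1}{\sqrt{a+2}}\int_0^{+\infty}\left(1-\frac{2}{\sqrt{4+(a+2)\sinh^2\theta}}\right)\frac{d\theta}{\sinh\theta} \]
We may get rid of the last term through the "hyperbolic Weierstrass substitution"
\[ \theta = 2\,\operatorname{arctanh}(e^{-v}) = \log\left(\frac{e^v+1}{e^v-1}\right) \]
\[ \frac{1}{\sqrt{a+2}}\int_0^{+\infty}\left(1-\frac{2}{\sqrt{4+\frac{a+2}{\sinh^2 v}}}\right)dv \]
To fill in the missing details is an exercise we leave to the reader.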

\[ \int_0^1\frac{\arctan\sqrt{x^2+2}}{(x^2+1)\sqrt{x^2+2}}\,dx = \frac{5\pi^2}{96}. \]
Proof. In 2001-2002, Zafar Ahmed proposed the above integral in the American Mathematical Monthly (AMM). Here
we present his maiden solution. Let us call the given integral $I$ and use $\arctan z=\frac{\pi}{2}-\arctan\frac1z$ to split $I$ as $I_1-I_2$.
Using the substitution $x=\tan\theta$, we can write
\[ I_1 = \frac{\pi}{2}\int_0^{\pi/4}\frac{\cos\theta}{\sqrt{2-\sin^2\theta}}\,d\theta \]
which can be evaluated as $I_1=\frac{\pi^2}{12}$ by using the substitution $\sin\theta=\sqrt2\sin\phi$. Next we use the representation
\[ \frac1a\arctan\frac1a = \int_0^1\frac{dx}{x^2+a^2} \]
to express
\[ I_2 = \int_0^1\!\!\int_0^1\frac{dx\,dy}{(1+x^2)(2+x^2+y^2)}. \]
By partial fraction decomposition $I_2$ can be re-written as:
\[ I_2 = \int_0^1\!\!\int_0^1\frac{dx\,dy}{(1+x^2)(1+y^2)} - \int_0^1\!\!\int_0^1\frac{dx\,dy}{(1+y^2)(2+x^2+y^2)}. \]


Using the symmetry of the integrands and the domains for $x$ and $y$, the second integral in the RHS of the last identity
equals $I_2$ itself. This leads to:
\[ I = \frac{\pi^2}{12}-\frac12\left(\int_0^1\frac{dx}{1+x^2}\right)^2 = \frac{\pi^2}{12}-\frac{\pi^2}{32} = \frac{5\pi^2}{96}. \]
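The value $\frac{5\pi^2}{96}$ can be cross-checked by direct quadrature, since the integrand is smooth on $[0,1]$. The following sketch uses a hand-written composite Simpson rule; the helper names `simpson` and `ahmed_integrand` and the panel count are illustrative assumptions rather than anything prescribed by the notes.

```python
import math

def simpson(f, a: float, b: float, panels: int = 2000) -> float:
    """Composite Simpson rule; `panels` must be even."""
    h = (b - a) / panels
    total = f(a) + f(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def ahmed_integrand(x: float) -> float:
    """arctan(sqrt(x^2+2)) / ((x^2+1) sqrt(x^2+2))."""
    root = math.sqrt(x * x + 2.0)
    return math.atan(root) / ((x * x + 1.0) * root)

approx = simpson(ahmed_integrand, 0.0, 1.0)
print(approx, 5 * math.pi ** 2 / 96)
```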

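\[ \int_0^{\pi/2}\operatorname{arccot}\sqrt{1+\csc\theta}\;d\theta = \frac{\pi^2}{12},\qquad \int_0^{\pi/2}\operatorname{arccsc}\sqrt{1+\cot\theta}\;d\theta = \frac{\pi^2}{8}. \]
\[ \int_0^{1/\sqrt2}\frac{\arcsin(x^2)}{(2x^2+1)\sqrt{x^2+1}}\,dx = \frac{\pi^2}{144}. \]
\[ \zeta(2) = \int_0^1\arctan\left(\frac{88\sqrt{21}}{215+36x^2}\right)\frac{dx}{\sqrt{1-x^2}}. \]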

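\[ \sum_{n\geq 1}\frac{H_n^2}{n^2} = \frac{17\pi^4}{360}. \]
\[ -\log^3(1-z) = \sum_{n\geq 1}\frac{3\left(H_n^2-H_n^{(2)}\right)}{n+1}\,z^{n+1}, \]
\[ \sum_{n\geq 1}\frac{H_n^2-H_n^{(2)}}{(n+1)^2} = \frac13\int_0^1\frac{-\log^3(1-z)}{z}\,dz = \frac13\int_0^1\frac{-\log^3 z}{1-z}\,dz = \frac{\pi^4}{45}. \]
\[ \begin{aligned}
S = \sum_{n\geq 1}\frac{H_n^2}{n^2} &= \sum_{n\geq 1}\frac{H_{n-1}^2}{n^2}+2\sum_{n\geq 1}\frac{H_{n-1}}{n^3}+\zeta(4)\\
&= \sum_{n\geq 1}\frac{H_{n-1}^2-H_{n-1}^{(2)}}{n^2}+\sum_{n\geq 1}\frac{H_{n-1}^{(2)}}{n^2}+\frac{\pi^4}{60}
\end{aligned} \]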


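since the value of $\zeta(4)$ is known and the Euler sum $\sum_{n\geq 1}\frac{H_{n-1}}{n^3}$ can be tackled through the Theorem (41).
On the other hand, by symmetry:
\[ \sum_{n\geq 1}\frac{H_{n-1}^{(2)}}{n^2} = \sum_{\substack{m,n\geq 1\\ m<n}}\frac{1}{m^2n^2} = \frac12\left(\zeta(2)^2-\zeta(4)\right) = \frac{\pi^4}{120} \]
\[ S = \frac{\pi^4}{45}+\frac{\pi^4}{120}+\frac{\pi^4}{60} = \frac{17\pi^4}{360} \]
as wanted.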

Exercise 182. Prove that by applying Feynman’s trick to an integral representation for $\zeta(s)$ in the region $\operatorname{Re}(s)\in(0,1)$
we have:
\[ \int_0^{+\infty}\frac{\log(x)}{e^{2\pi x}+1}\,dx = -\frac{\log(2)\log(8\pi^2)}{4\pi}. \]

Exercise 183. By exploiting Euler’s Beta function, convolutions and the discrete Fourier transform, prove that for
any $n\in\mathbb{N}$ the following identity holds:
\[ \sum_{k=0}^{2n}(-1)^k\binom{4n}{2k}\binom{2n}{k}^{-1} = \frac{1}{1-2n}. \]
Proof.
\[ \begin{aligned}
S(n) &= (2n+1)\sum_{k=0}^{2n}\binom{4n}{2k}(-1)^k\int_0^1(1-x)^k x^{2n-k}\,dx\\
&= (2n+1)\sum_{k=0}^{2n}\binom{4n}{2k}(-1)^k\int_0^1 2z\,(1-z^2)^k z^{4n-2k}\,dz\\
&= (2n+1)\int_0^{\pi/2}\sin(\theta)\cos(\theta)\left(e^{4ni\theta}+e^{-4ni\theta}\right)d\theta\\
&= (2n+1)\int_0^{\pi/2}\sin(2\theta)\cos(4n\theta)\,d\theta\\
&= -\frac{2n+1}{4n^2-1} = -\frac{1}{2n-1}.
\end{aligned} \]
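Since both sides of the identity are rational numbers, it can be verified exactly for small $n$. The sketch below does so with exact arithmetic; the function name `binomial_ratio_sum` is an illustrative assumption, and the check is of course no substitute for the proof.

```python
from fractions import Fraction
from math import comb

def binomial_ratio_sum(n: int) -> Fraction:
    """Exact value of sum_{k=0}^{2n} (-1)^k C(4n, 2k) / C(2n, k)."""
    return sum(
        Fraction((-1) ** k * comb(4 * n, 2 * k), comb(2 * n, k))
        for k in range(2 * n + 1)
    )

for n in range(1, 6):
    print(n, binomial_ratio_sum(n), Fraction(1, 1 - 2 * n))
```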

Exercise 184. By using suitable substitutions, the Laplace transform and special values of $\Gamma'$, $\zeta$, $\zeta'$, prove that:
\[ \int_0^1\left(\frac{1}{\log t}+\frac{1}{1-t}\right)\frac{dt}{(1+t)^2} = \frac12+\frac{\log 2}{3}-\frac{\log\pi}{4}-\frac{3}{2\pi^2}\sum_{n\geq 1}\frac{\log n}{n^2}. \]

1+t , check that

\[ \int_0^1\frac{\arctan x}{\sqrt{x(1-x^2)}}\,dx = \sqrt{\frac{\pi}{2}}\;\Gamma\!\left(\frac54\right)^2. \]

7 The Cauchy-Schwarz inequality and beyond

Definition 186. A Hilbert space is a vector space $H$ (real or complex) equipped with a positive definite inner
product $\langle\cdot,\cdot\rangle$ such that $(H,\|\cdot\|)$, where $\|u\|^2=\langle u,u\rangle$, is a complete metric space.

Hilbert spaces like $\ell^2$ or $L^2$ (respectively the space of square-summable sequences and the space of square-integrable
functions) are the most natural places to extend the theory of inner products and orthogonal projections on $\mathbb{R}^n$:
Fourier series and Fourier transforms are natural products of this viewpoint. The context of Hilbert spaces is
also the typical framework for important inequalities like Bessel’s inequality (becoming Parseval’s identity under the
completeness assumption) and the Cauchy-Schwarz inequality:

Theorem 187 (Cauchy-Schwarz inequality). If $(a_1,\ldots,a_n)$ and $(b_1,\ldots,b_n)$ are $n$-tuples of real numbers, we have:
\[ \left(\sum_{k=1}^n a_kb_k\right)^2 \leq \left(\sum_{k=1}^n a_k^2\right)\cdot\left(\sum_{k=1}^n b_k^2\right) \]
and equality holds if and only if one $n$-tuple is a real multiple of the other one.
\[ \left(\int_a^b f(x)\,g(x)\,dx\right)^2 \leq \left(\int_a^b f(x)^2\,dx\right)\cdot\left(\int_a^b g(x)^2\,dx\right) \]
and equality holds if and only if $\frac{f(x)}{g(x)}=\lambda$, or $\frac{g(x)}{f(x)}=\lambda$, almost everywhere in $(a,b)$.

We start this section by studying multiple proofs of the Cauchy-Schwarz inequality, among “classical” approaches and
less typical ones. The first proof depends on an amplification trick.

Proof # 1. Let us assume that $u,v$ are non-zero vectors. From the trivial inequality
\[ \|u-v\|^2\geq 0 \]
it follows immediately:
\[ \langle u,v\rangle \leq \tfrac12\|u\|^2+\tfrac12\|v\|^2 \]
where by considering some $\lambda>0$, the inner product $\langle u,v\rangle$ stays unchanged if $u$ is replaced by $\lambda u$ and $v$ is replaced by
$\frac{v}{\lambda}$, implying:
\[ \forall\lambda>0,\qquad \langle u,v\rangle \leq \frac{\lambda^2}{2}\|u\|^2+\frac{1}{2\lambda^2}\|v\|^2 \]
and we have complete freedom in choosing $\lambda$ in such a way that the RHS is as small as possible.
With the optimal choice $\lambda=\sqrt{\|v\|/\|u\|}$ we get:
\[ \langle u,v\rangle \leq \|u\|\cdot\|v\| \]
that is precisely the Cauchy-Schwarz inequality: the inner product between two vectors is bounded by the product of their
lengths⁸. It is pretty clear that equality holds only if $u$ and $v$ are linearly dependent, and this is even more evident
as a consequence of the next proof.

8 This is without doubt a really efficient mnemonic trick for recalling if CS holds as ≤ or ≥ when needed.
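The amplification trick of Proof # 1 can be illustrated numerically: the bound $\frac{\lambda^2}{2}\|u\|^2+\frac{1}{2\lambda^2}\|v\|^2$ is loose for a generic $\lambda$ and collapses to $\|u\|\cdot\|v\|$ at the optimal one. The sketch below assumes vectors stored as plain tuples; the helper names `norm`, `inner` and `amplified_bound` are illustrative, not taken from the notes.

```python
import math

def norm(w):
    return math.sqrt(sum(x * x for x in w))

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

def amplified_bound(u, v, lam: float) -> float:
    """Right-hand side (lam^2 / 2) |u|^2 + |v|^2 / (2 lam^2)."""
    return 0.5 * lam ** 2 * norm(u) ** 2 + 0.5 * norm(v) ** 2 / lam ** 2

u, v = (3.0, 4.0), (1.0, 2.0)
lam_opt = math.sqrt(norm(v) / norm(u))   # minimiser of the bound
print(inner(u, v), amplified_bound(u, v, 1.0), amplified_bound(u, v, lam_opt))
```

With these sample vectors the unamplified bound ($\lambda=1$) is $15$, while the optimised one equals $\|u\|\cdot\|v\|=5\sqrt5$, which still dominates $\langle u,v\rangle=11$.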


\[ \left(\sum_{i=1}^n a_i^2\right)\cdot\left(\sum_{i=1}^n b_i^2\right) = \left(\sum_{i=1}^n a_ib_i\right)^2+\sum_{j<k}(a_jb_k-b_ja_k)^2. \]
To check the last identity is straightforward, and what better way of proving an inequality than deriving it from
an identity? Given Lagrange’s identity, it is clear that the Cauchy-Schwarz inequality holds as an equality if and only if
$a_jb_k=a_kb_j$ for any $k\neq j$.⁹
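Lagrange’s identity is a polynomial identity, so it can be tested exactly on integer vectors. The sketch below sums the cross terms over unordered pairs $j<k$ (each pair counted once); the function names `cauchy_schwarz_gap` and `cross_term_sum` are illustrative assumptions.

```python
from itertools import combinations

def cauchy_schwarz_gap(a, b) -> int:
    """(sum a_i^2)(sum b_i^2) - (sum a_i b_i)^2."""
    return sum(x * x for x in a) * sum(y * y for y in b) - sum(x * y for x, y in zip(a, b)) ** 2

def cross_term_sum(a, b) -> int:
    """Sum over pairs j < k of (a_j b_k - a_k b_j)^2."""
    return sum((a[j] * b[k] - a[k] * b[j]) ** 2 for j, k in combinations(range(len(a)), 2))

a, b = (1, 2, 3), (4, 5, 6)
print(cauchy_schwarz_gap(a, b), cross_term_sum(a, b))
```

For proportional vectors every cross term vanishes, which is the equality case of the Cauchy-Schwarz inequality.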

The most well-known proof exploits the concept of discriminant for a quadratic polynomial. The function
\[ p(t) = \sum_{i=1}^n(a_i-tb_i)^2 \]
is clearly a quadratic polynomial, not assuming any negative value for any $t\in\mathbb{R}$ (as a sum of squares).
That implies the discriminant of $p(t)$ is non-positive, and from
\[ [t^2]\,p(t)=\sum_{i=1}^n b_i^2,\qquad [t^1]\,p(t)=-2\sum_{i=1}^n a_ib_i,\qquad [t^0]\,p(t)=\sum_{i=1}^n a_i^2 \]
the Cauchy-Schwarz inequality immediately follows. Another approach, usually proposed to students as a tedious exercise,
is to prove the Cauchy-Schwarz inequality by induction on $n$. Such an approach is not tedious anymore (quite the opposite,
indeed) if we bring the Cauchy-Schwarz inequality into an (almost) equivalent form:

Lemma 190 (Titu). If $(a_1,\ldots,a_n)$ and $(b_1,\ldots,b_n)$ are $n$-tuples of positive real numbers, we have:
\[ \sum_{k=1}^n\frac{a_k^2}{b_k} \geq \frac{(a_1+\ldots+a_n)^2}{b_1+\ldots+b_n}. \]
We may firstly notice that Titu’s Lemma is a consequence of the Cauchy-Schwarz inequality:
\[ (a_1+\ldots+a_n)^2 = \left(\frac{a_1}{\sqrt{b_1}}\cdot\sqrt{b_1}+\ldots+\frac{a_n}{\sqrt{b_n}}\cdot\sqrt{b_n}\right)^2 \stackrel{CS}{\leq} \left(\sum_{k=1}^n\frac{a_k^2}{b_k}\right)\cdot\left(\sum_{k=1}^n b_k\right). \]
Conversely, Titu’s Lemma implies that the Cauchy-Schwarz inequality holds for positive $n$-tuples.
On the other hand the $n=1$ case of Titu’s Lemma is trivial and the $n=2$ case
\[ \frac{a_1^2}{b_1}+\frac{a_2^2}{b_2} \geq \frac{(a_1+a_2)^2}{b_1+b_2} \]
is equivalent to the inequality $(a_1b_2-a_2b_1)^2\geq 0$. By combining the cases $n=2$ and $n=N$,
\[ \begin{aligned}
\sum_{k=1}^{N+1}\frac{a_k^2}{b_k} &= \frac{a_{N+1}^2}{b_{N+1}}+\sum_{k=1}^{N}\frac{a_k^2}{b_k}\\
(\text{case } n=N)\quad &\geq \frac{a_{N+1}^2}{b_{N+1}}+\frac{(a_1+\ldots+a_N)^2}{b_1+\ldots+b_N}\\
(\text{case } n=2)\quad &\geq \frac{(a_1+\ldots+a_{N+1})^2}{b_1+\ldots+b_{N+1}}
\end{aligned} \]
the proof of Titu’s lemma turns out to be straightforward.

9 This raises a historical/epistemological question: since the Cauchy-Schwarz inequality immediately follows from Lagrange’s identity,
why was such inequality not attributed to Lagrange, whose work came before Cauchy’s work by at least thirty years?

We may notice that the triangle inequality is a consequence of the Cauchy-Schwarz inequality:
\[ \langle u,v\rangle\leq\|u\|\cdot\|v\| \;\longrightarrow\; \|u+v\|^2\leq\|u\|^2+\|v\|^2+2\|u\|\cdot\|v\| \;\longrightarrow\; \|u+v\|\leq\|u\|+\|v\|. \]
The Cauchy-Schwarz inequality can also be used to reconcile the typical definitions of inner product in Physics and
Linear Algebra. By assuming that $u,v\in\mathbb{R}^2$, $u=(u_x,u_y)$, $v=(v_x,v_y)$, $\langle u,v\rangle$ can be equivalently defined as $u_xv_x+u_yv_y$
or as $\|u\|\|v\|\cos\theta$, where $\theta$ is the angle between the $u$ and $v$ vectors. How do they come to be equivalent? Simple: on
one hand, $\|u\|\|v\|\cos\theta$ is the product between the length of $v$ and the length of the projection of $u$ along $v$, that we
may name $w$. $w$ is a vector of the form $\lambda v$ minimizing $\|u-\lambda v\|$ or, equivalently:
\[ \|u-\lambda v\|^2 = (u_x-\lambda v_x)^2+(u_y-\lambda v_y)^2. \]
Such quadratic polynomial in $\lambda$ attains its minimum value at $\lambda=\frac{u_xv_x+u_yv_y}{\|v\|^2}$, reconciling the previous definitions and
proving that two vectors have a zero inner product if and only if they are orthogonal.

\[ \max_{\theta}\,(A\sin\theta+B\cos\theta) = \sqrt{A^2+B^2}. \]
\[ (A\sin\theta+B\cos\theta)^2 \leq A^2+B^2 \]
and equality holds as soon as $\tan\theta=\frac{A}{B}$.

Exercise 192. Prove that for any $N\geq 1$ we have $H_N\leq\sqrt{2N}$.
\[ H_N = \sum_{n=1}^N\frac1n \leq \sqrt{N\cdot\sum_{n=1}^N\frac{1}{n^2}} \leq \sqrt{N\,\zeta(2)}. \]

an

n≥1

\[ \sum_{n\geq 1}\frac{1}{n^2a_n} \]
is a divergent series.
\[ \left(\sum_{n=1}^N a_n\right)\cdot\left(\sum_{n=1}^N\frac{1}{n^2a_n}\right) \geq \left(\sum_{n=1}^N\frac1n\right)^2 \geq \log^2(N). \]


Exercise 194. Prove that $\sqrt3\,\log(3)<2$.
Proof.
\[ \log(3) = \int_0^2\frac{dx}{1+x} \stackrel{CS}{<} \sqrt{\int_0^2\frac{dx}{(1+x)^2}\int_0^2 dx} = \frac{2}{\sqrt3}. \]

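\[ I = \int_0^{+\infty}\frac{\sin x}{x+1}\,dx \leq \sqrt{\frac{\pi}{8}}. \]
in order to improve the convergence speed,
it is very practical to consider that $\mathcal{L}(\sin x)=\frac{1}{s^2+1}$ and $\mathcal{L}^{-1}\!\left(\frac{1}{x+1}\right)=e^{-s}$, from which:
\[ I = \int_0^{+\infty}\frac{e^{-s}}{s^2+1}\,ds \stackrel{CS}{\leq} \sqrt{\int_0^{+\infty}e^{-2s}\,ds\int_0^{+\infty}\frac{ds}{(1+s^2)^2}} = \sqrt{\frac{\pi}{8}}. \]
$(1+s^2)^2$ have a very similar behaviour in a right
neighbourhood of the origin.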

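Exercise 196. In an acute-angled triangle $ABC$, $L_A, L_B, L_C$ are the feet of the angle bisectors from $A, B, C$.
By denoting as $r$ the inradius, prove that:
\[ AL_A+BL_B+CL_C \geq 9r. \]
Proof. By naming $I$ the incenter of $ABC$, from Van Aubel’s theorem it follows that:
\[ \frac{AI}{IL_A} = \frac{b+c}{a} \]
so $AL_A=IL_A\cdot\frac{a+b+c}{a}\geq r\cdot\frac{a+b+c}{a}$, since $IL_A\geq r$. By summing the analogous bounds for $BL_B$ and $CL_C$:
\[ AL_A+BL_B+CL_C \geq r\cdot(a+b+c)\cdot\left(\frac1a+\frac1b+\frac1c\right) \stackrel{CS}{\geq} 9r. \]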

Exercise 197. Prove that if $p(x)\in\mathbb{R}[x]$ is a polynomial with non-negative coefficients,
for any couple $(x,y)$ of positive real numbers we have:
\[ p(\sqrt{xy}) \leq \sqrt{p(x)\,p(y)}. \]

With greater generality, the concept of convexity is a cornerstone in the theory of inequalities.

Theorem 198 (Arithmetic-geometric inequality, AM-GM). If $a_1,\ldots,a_n$ are non-negative real numbers, we have:
\[ \operatorname{GM}(a_1,\ldots,a_n) \stackrel{\text{def}}{=} \sqrt[n]{a_1\cdot\ldots\cdot a_n} \leq \frac{a_1+\ldots+a_n}{n} \stackrel{\text{def}}{=} \operatorname{AM}(a_1,\ldots,a_n) \]
and equality holds iff $a_1=\ldots=a_n$.
Proof. We may clearly assume that all the variables are positive without loss of generality.
In such a case, by setting $b_k=\log a_k$, the arithmetic-geometric inequality turns out to be equivalent to
\[ \exp\left(\frac1n\sum_{k=1}^n b_k\right) \leq \frac1n\sum_{k=1}^n\exp(b_k) \]
that is Jensen’s inequality for the exponential function, holding since $\frac{d^2}{dx^2}e^x=e^x>0$. An alternative and really
interesting approach is due to Cauchy. In his Cours d’analyse he observes the arithmetic-geometric inequality can be
proved through an “atypical induction”:

• if the AM-GM inequality holds for n variables, it also holds for 2n variables;

• if the AM-GM inequality holds for (n + 1) variables, it also holds for n variables.

Lemma 199 (Superadditivity of the geometric mean). If $a_1,\ldots,a_n$ are positive and distinct numbers,
\[ \prod_{k=1}^n(\tau+a_k) > \left(\tau+\sqrt[n]{a_1\cdot\ldots\cdot a_n}\right)^n \]
and by setting $b_k=\frac{a_k}{\tau}$ and $c_k=\log b_k$ it is also equivalent to:
\[ \frac1n\sum_{k=1}^n\log(1+e^{c_k}) > \log\left(1+e^{\frac{c_1+\ldots+c_n}{n}}\right), \]
i.e. Jensen’s inequality for $f(x)=\log(1+e^x)$. In order to prove the claim it is enough to notice that:
\[ \frac{d^2}{dx^2}\log(1+e^x) = \frac{e^x}{(e^x+1)^2} > 0. \]

In a similar way we have that, if (a1 , . . . , an ) and (b1 , . . . , bn ) are n-uples of non-negative numbers,

Exercise 200 (Huygens inequality). Prove that for any $\theta\in[0,1]$ we have: $\tan\theta+2\sin\theta\geq 3\theta$.


Proof.
\[ \tan\theta+2\sin\theta = \int_0^\theta\left(\frac{1}{\cos^2u}+\cos u+\cos u\right)du \stackrel{\text{AM-GM}}{\geq} 3\int_0^\theta 1\,du = 3\theta. \]

Exercise 201 (Weitzenböck’s inequality). $ABC$ is a triangle with area $\Delta$ and side lengths $a,b,c$.
Prove that the equality $a^2+b^2+c^2=4\Delta\sqrt3$ implies that $ABC$ is an equilateral triangle.
\[ 4\Delta = \sqrt{(a^2+b^2+c^2)^2-2(a^4+b^4+c^4)}, \]
hence $a^2+b^2+c^2=4\Delta\sqrt3$ implies
\[ (a^2+b^2+c^2)^2 = (1+1+1)(a^4+b^4+c^4), \]
that is the equality case in $(a^2+b^2+c^2)^2\stackrel{CS}{\leq}(1+1+1)(a^4+b^4+c^4)$.
For other interesting proofs have a look at this thread on MSE.

N

2 X

1

2

3

5 + log N ≥ n1/n − 1 ≥ 1

2 log2 N.

n=1

Proof. Since
\[ n^{1/n} = \operatorname{GM}\left(1,\frac21,\ldots,\frac{n}{n-1}\right) \stackrel{\text{AM-GM}}{<} 1+\frac{H_{n-1}}{n} \]
and
\[ \sum_{n=1}^N\frac{H_{n-1}}{n} = \sum_{1\leq k<n\leq N}\frac{1}{kn} = \frac{H_N^2-H_N^{(2)}}{2} \]
In order to prove the inequality on the right it is enough to show that for any $N\geq 1$ we have:
\[ (N+1)^{\frac{1}{N+1}}-1 \geq \frac12\left[\log^2(N+1)-\log^2(N)\right]. \]
To complete the proof is a task left to the reader. It might be useful to exploit:
\[ \frac12\left[\log^2(N+1)-\log^2(N)\right] = \int_N^{N+1}\frac{\log x}{x}\,dx. \]

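\[ a_n = \left(1+\frac1n\right)^n \]
Proof. We have $a_{n+1}>a_n$ by AM-GM:
\[ \sqrt[n+1]{1\cdot a_n} = \operatorname{GM}\left(1,1+\frac1n,\ldots,1+\frac1n\right) \stackrel{\text{AM-GM}}{<} \frac{1}{n+1}\left(1+n\cdot\left(1+\frac1n\right)\right) = 1+\frac{1}{n+1}. \]
Additionally:
\[ \frac{a_{2n}}{a_n} = \left(1+\frac{1}{4n(n+1)}\right)^n \leq \frac{1}{1-\frac{1}{4(n+1)}} = 1+\frac{1}{4n+3} \]
hence for any $N\geq 1$ we have:
\[ a_N \leq a_1\prod_{k\geq 0}\left(1+\frac{1}{4\cdot 2^k+3}\right) = \frac{16}{7}\prod_{k\geq 1}\left(1+\frac{1}{4\cdot 2^k+3}\right) \leq \frac{16}{7}\prod_{k\geq 1}\frac{1+\frac{1}{2^{k+1}}}{1+\frac{1}{2^{k+2}}} = \frac{20}{7}. \]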

Exercise 204 (from Baby Rudin, 1). Let $\{a_n\}_{n\in\mathbb{N}}$ be an increasing and unbounded sequence of positive real numbers.
Prove that there is a sequence $\{b_n\}_{n\in\mathbb{N}}$ of positive real numbers such that the series $\sum_{n\in\mathbb{N}}b_n$ is convergent but the
series $\sum_{n\in\mathbb{N}}a_nb_n$ is divergent.

Exercise 205 (from Baby Rudin, 2). Let $\{a_n\}_{n\in\mathbb{N}}$ be a sequence of positive real numbers, such that the series $\sum_{n\in\mathbb{N}}a_n$
is convergent. Prove the existence of a sequence $\{b_n\}_{n\in\mathbb{N}}$ of positive real numbers, increasing and unbounded, such
that the series $\sum_{n\in\mathbb{N}}a_nb_n$ is convergent.

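Proof. In the first exercise we may consider the sequence $\{b_n\}_{n\in\mathbb{N}}$ defined through:
\[ b_0 = \frac{1}{\sqrt{a_0}},\qquad b_n = \frac{1}{\sqrt{a_{n-1}}}-\frac{1}{\sqrt{a_n}}. \]
\[ \sum_{n\leq N}b_n = \frac{1}{\sqrt{a_0}}+\sum_{n=1}^N\left(\frac{1}{\sqrt{a_{n-1}}}-\frac{1}{\sqrt{a_n}}\right) = \frac{2}{\sqrt{a_0}}-\frac{1}{\sqrt{a_N}}. \]
Since the sequence $\{a_n\}_{n\in\mathbb{N}}$ is unbounded, the previous identity proves the series $\sum_{n\in\mathbb{N}}b_n$ is convergent to $\frac{2}{\sqrt{a_0}}$.
We further have:
\[ a_nb_n = a_n\cdot\frac{\sqrt{a_n}-\sqrt{a_{n-1}}}{\sqrt{a_{n-1}}\sqrt{a_n}} \geq \sqrt{a_n}-\sqrt{a_{n-1}}. \]
The last inequality proves $\sum_{n=1}^N a_nb_n\geq\sqrt{a_N}-\sqrt{a_0}$, hence the series $\sum_{n\in\mathbb{N}}a_nb_n$ is divergent.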
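The construction for the first exercise can be tried on a concrete sequence. The sketch below assumes the sample choice $a_n=n+1$ (strictly increasing and unbounded), which is not taken from the notes; the helper name `rudin_weights` is likewise illustrative. It compares the partial sums with the telescoped value $\frac{2}{\sqrt{a_0}}-\frac{1}{\sqrt{a_N}}$ and with the lower bound $\sqrt{a_N}-\sqrt{a_0}$.

```python
import math

def rudin_weights(a):
    """b_0 = 1/sqrt(a_0), b_n = 1/sqrt(a_{n-1}) - 1/sqrt(a_n) for n >= 1."""
    b = [1.0 / math.sqrt(a[0])]
    for n in range(1, len(a)):
        b.append(1.0 / math.sqrt(a[n - 1]) - 1.0 / math.sqrt(a[n]))
    return b

a = [float(n + 1) for n in range(10_000)]   # sample sequence, an assumption
b = rudin_weights(a)
sum_b = sum(b)                                        # stays bounded
sum_ab = sum(x * y for x, y in zip(a[1:], b[1:]))     # grows like sqrt(a_N)
print(sum_b, 2 / math.sqrt(a[0]) - 1 / math.sqrt(a[-1]))
print(sum_ab, math.sqrt(a[-1]) - math.sqrt(a[0]))
```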

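About the second exercise, we may assume $\sum_{n\in\mathbb{N}}a_n=1$ without loss of generality, by multiplying every element of
$\{a_n\}_{n\in\mathbb{N}}$ by a suitable constant. Under such assumption, we may consider the sequence $\{b_n\}_{n\in\mathbb{N}}$ defined through:
\[ b_0=c_0=1,\qquad b_n=\frac{1}{\sqrt{c_n}},\qquad c_n=1-\sum_{m=0}^{n-1}a_m. \]
Since $\sum_{n\in\mathbb{N}}a_n$ is convergent to $1$ and has positive terms, $\{c_n\}_{n\in\mathbb{N}}$ is a sequence with positive terms decreasing to $0$,
hence $\{b_n\}_{n\in\mathbb{N}}$ is positive and unbounded. We also have $a_n=c_n-c_{n+1}$, from which:
\[ a_nb_n = \frac{c_n-c_{n+1}}{\sqrt{c_n}} = \frac{\left(\sqrt{c_n}-\sqrt{c_{n+1}}\right)\cdot\left(\sqrt{c_n}+\sqrt{c_{n+1}}\right)}{\sqrt{c_n}} \leq 2\left(\sqrt{c_n}-\sqrt{c_{n+1}}\right). \]
It follows that:
\[ \sum_{n\leq N}a_nb_n \leq a_0+2\cdot\sum_{n\leq N}\left(\sqrt{c_n}-\sqrt{c_{n+1}}\right) = a_0+2\left(1-\sqrt{c_{N+1}}\right), \]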


hence the series $\sum_{n\in\mathbb{N}}a_nb_n$ is convergent to some value $\leq(a_0+2)$.

Both the exercises studied here follow from a more general inequality:

−β !1−β

N N

X X 1 X

∀β ∈ (0, 1), ∀N ≥ 1, an am < an ,

n=1

β n=1

m≥n

holding for any sequence $\{a_n\}_{n=1}^{+\infty}$ with positive terms such that $\sum_{n=1}^{+\infty}a_n$ is convergent.

Exercise 206. $\{a_n\}_{n\geq 1}$ is a sequence of real numbers with the following property: for any sequence of real numbers
$\{b_n\}_{n\geq 1}$ such that $\sum_{n\geq 1}b_n^2$ is convergent, $\lim_{N\to+\infty}\sum_{n=1}^N a_nb_n$ exists and it is finite. Prove that $\{a_n\}_{n\geq 1}$ is square-summable,
i.e. $\sum_{n\geq 1}a_n^2$ is convergent. Hint: assume that $\sum_{n\geq 1}a_n^2$ is divergent and consider $b_n=\frac{a_n}{A_n}$, where
$A_N=a_1^2+a_2^2+\ldots+a_N^2$.

\[ f\left(\frac{a+b}{2}\right) \leq \frac{1}{b-a}\int_a^b f(x)\,dx \leq \frac{f(a)+f(b)}{2}. \]
In other terms: for convex (or concave) functions it is not difficult to estimate the magnitude of the error
in computing $\int_a^b f(x)\,dx$ through the rectangle or trapezoid numerical methods.

Z n

n log n − n + 1 = log(x) dx ≥ f (1) + f (2) + . . . + f (n) = log(n!).

1

\[ \frac{\pi^2}{6}-\frac{1}{n+\frac12} \leq H_n^{(2)} \leq \frac{\pi^2}{6}-\frac{1}{n+1}. \]

Exercise 210. Given $b>a>0$, provide an accurate lower bound for $\frac{\log b-\log a}{b-a}$ in terms of a rational function.
Proof. As a rule of thumb, when dealing with objects like $f(b)-f(a)$ it is always worth wondering if such difference
has a nice/practical integral representation. That is certainly the case here:
\[ \delta(a,b) \stackrel{\text{def}}{=} \frac{\log b-\log a}{b-a} = \frac{1}{b-a}\int_a^b\frac{dx}{x} = \frac{1}{b-a}\int_0^{b-a}\frac{dx}{x+a} = \int_0^1\frac{du}{(1-u)a+ub} \]
and, by averaging the last integrand with its reflection $u\mapsto 1-u$,
\[ \delta(a,b) = \frac{a+b}{2}\int_0^1\frac{du}{\left((1-u)a+ub\right)\left(ua+(1-u)b\right)}. \]
This “folding trick” leads to an integrand function that is convex, almost-constant on $[0,1]$ and symmetric with respect
to $u=\frac12$. By applying the same trick a second time we get:
\[ \delta(a,b) = \int_0^1\frac{2(a+b)}{(a+b)^2-(b-a)^2t^2}\,dt, \]
hence by denoting $g_{a,b}(t)=\frac{2(a+b)}{(a+b)^2-(b-a)^2t^2}$ and by exploiting the Hermite-Hadamard inequality we get:
\[ \delta(a,b) \geq g_{a,b}\left(\tfrac12\right) = \frac{8(a+b)}{(a+3b)(3a+b)}. \]
As a straightforward corollary, $\log(2)\geq\frac{24}{35}$ holds, and such inequality is pretty tight.
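The quality of the rational lower bound is easy to inspect numerically. The sketch below evaluates both sides for a few sample pairs; the function names `log_mean_slope` and `rational_lower_bound` and the chosen pairs are illustrative assumptions.

```python
import math

def log_mean_slope(a: float, b: float) -> float:
    """delta(a, b) = (log b - log a) / (b - a)."""
    return (math.log(b) - math.log(a)) / (b - a)

def rational_lower_bound(a: float, b: float) -> float:
    """Hermite-Hadamard bound 8(a+b) / ((a+3b)(3a+b))."""
    return 8.0 * (a + b) / ((a + 3.0 * b) * (3.0 * a + b))

for a, b in [(1.0, 2.0), (1.0, 1.1), (1.0, 10.0), (3.0, 7.0)]:
    print(a, b, rational_lower_bound(a, b), log_mean_slope(a, b))
```

For $(a,b)=(1,2)$ the bound is $24/35\approx 0.6857$ against $\log 2\approx 0.6931$; the gap widens as $b/a$ grows.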

Let $(a_1,\ldots,a_k)$ and $(b_1,\ldots,b_k)$ be sequences of real numbers with the following properties:
• $\forall i\in[1,k]$, $A_i\stackrel{\text{def}}{=}\sum_{j=1}^i a_j\geq\sum_{j=1}^i b_j\stackrel{\text{def}}{=}B_i$ (the first sequence majorizes the second one);
• $\sum_{j=1}^k(a_j-b_j)=0$ (same sum).
states that the previous assumptions grant that for every real convex function $f$:
\[ \sum_{i=1}^k f(a_i) \geq \sum_{i=1}^k f(b_i). \]
\[ \delta_f(a,b) = \frac{f(b)-f(a)}{b-a} \]
is symmetric in its arguments and increasing with respect to each argument.
In the given hypothesis we may define
\[ c_i = \delta_f(a_i,b_i), \]
then notice that:
\[ \sum_{i=1}^k\left(f(a_i)-f(b_i)\right) = \sum_{i=1}^k c_i(a_i-b_i) = \sum_{i=1}^k c_i\left(A_i-A_{i-1}-B_i+B_{i-1}\right) = \sum_{i=1}^{k-1}(c_i-c_{i+1})(A_i-B_i), \]
so
\[ c_i = \delta_f(a_i,b_i) \geq \delta_f(b_i,a_{i+1}) \geq \delta_f(a_{i+1},b_{i+1}) = c_{i+1} \]
and the claim is proved.

Theorem 212 (Young’s inequality). If $a$ and $b$ are positive real numbers, $p>1$ and $\frac1p+\frac1q=1$, we have:
\[ ab \leq \frac{a^p}{p}+\frac{b^q}{q}, \]
where equality holds iff $a^p=b^q$.
\[ \log(ab) = \frac1p\log(a^p)+\frac1q\log(b^q) \leq \log\left(\frac1p a^p+\frac1q b^q\right), \]
then the claim follows by exponentiation. We may notice that, if $a^p\neq b^q$, the last inequality is strict.
As an alternative, one might consider that on $\mathbb{R}^+$ the function $f(x)=x^{p-1}$ has the inverse function $g(x)=x^{q-1}$.
By a well-known theorem on the integration of inverse functions (or simply by integration by parts) it follows that:
\[ \frac{a^p}{p}+\frac{b^q}{q} = \int_0^a f(x)\,dx+\int_0^b g(x)\,dx \geq ab. \]
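A short numerical illustration of Young’s inequality and of its equality case follows. It is only a sketch: the function name `young_gap` and the sample values are assumptions chosen for illustration.

```python
def young_gap(a: float, b: float, p: float) -> float:
    """a^p / p + b^q / q - a*b, with q the conjugate exponent of p."""
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b

# Equality case: a^p = b^q. With p = 3 (so q = 3/2) and a = 2 this gives b = 4.
print(young_gap(2.0, 4.0, 3.0))
# Generic case: the gap is strictly positive.
print(young_gap(2.0, 3.0, 3.0))
```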


Hölder’s inequality can be obtained from Young’s inequality through an amplification trick:

Theorem 213 (Hölder’s inequality). If $x_1,\ldots,x_n$ and $y_1,\ldots,y_n$ are non-negative real numbers, $p>1$ and $\frac1p+\frac1q=1$,
we have:
\[ \sum_{i=1}^n x_iy_i \leq \left(\sum_{i=1}^n x_i^p\right)^{1/p}\cdot\left(\sum_{i=1}^n y_i^q\right)^{1/q}, \]
where equality holds if and only if for any $k\in[1,n]$ we have $x_k^p=\lambda y_k^q$.
Proof. By setting
\[ \|x\|_p = \left(\sum_{i=1}^n x_i^p\right)^{1/p}, \]
\[ \frac{\sum_{i=1}^n x_iy_i}{\|x\|_p\cdot\|y\|_q} = \sum_{i=1}^n\frac{x_i}{\|x\|_p}\cdot\frac{y_i}{\|y\|_q} \leq \frac1p\,\frac{\|x\|_p^p}{\|x\|_p^p}+\frac1q\,\frac{\|y\|_q^q}{\|y\|_q^q} = \frac1p+\frac1q = 1. \]
Moreover, it is not difficult to prove the following generalization: if $p,q,r$ are positive real numbers and $\frac1p+\frac1q+\frac1r=1$,
then:
\[ \sum_i|x_iy_iz_i| \leq \left(\sum_i|x_i|^p\right)^{\frac1p}\cdot\left(\sum_i|y_i|^q\right)^{\frac1q}\cdot\left(\sum_i|z_i|^r\right)^{\frac1r}. \]

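Theorem 214 (Minkowski’s inequality, triangle inequality for the $L^p$ norm). If $x_1,\ldots,x_n$ and $y_1,\ldots,y_n$ are non-negative
real numbers and $p>1$, by setting
\[ \|x\|_p = \left(\sum_{i=1}^n x_i^p\right)^{1/p} \]
we have:
\[ \|x+y\|_p \leq \|x\|_p+\|y\|_p. \]
Proof.
\[ \|x+y\|_p^p = \|(x+y)^p\|_1 \leq \|x(x+y)^{p-1}\|_1+\|y(x+y)^{p-1}\|_1, \]
where each term can be bounded through Hölder’s inequality with conjugate exponents $p$ and $\frac{p}{p-1}$,
from which:
\[ \|x+y\|_p^p \leq \left(\|x\|_p+\|y\|_p\right)\|x+y\|_p^{p-1}. \]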

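Theorem 215 (Carleman’s inequality). If $\{a_n\}_{n=1}^{+\infty}$ is a sequence of positive real numbers and the series
\[ \sum_{n=1}^{+\infty}a_n \]
is convergent to $C$, then
\[ \sum_{n=1}^{+\infty}\left(\prod_{i=1}^n a_i\right)^{1/n} \leq e\,C. \]
Proof. By denoting through GM the geometric mean, through AM the arithmetic mean:
\[ \operatorname{GM}(a_1,\ldots,a_n) = \frac{\operatorname{GM}(a_1,2a_2,\ldots,na_n)}{(n!)^{1/n}} \leq \frac{\operatorname{AM}(a_1,2a_2,\ldots,na_n)}{(n!)^{1/n}} \]
where by Stirling’s inequality we have $(n!)^{-1/n}\leq\frac{e}{n+1}$ for any $n\geq 1$, hence:
\[ \operatorname{GM}(a_1,\ldots,a_n) \leq \frac{e}{n(n+1)}\sum_{k=1}^n ka_k. \]
It follows that:
\[ \sum_{n=1}^{+\infty}\operatorname{GM}(a_1,\ldots,a_n) \leq e\sum_{k=1}^{+\infty}ka_k\sum_{n\geq k}\frac{1}{n(n+1)} = e\sum_{k=1}^{+\infty}a_k. \]
We may notice that such inequality always holds as a strict inequality. As a matter of fact, by assuming equality
we have $a_k=\frac{D}{k}$ for some constant $D$, but the harmonic series is divergent.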
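Carleman’s inequality can be observed on a concrete convergent series. The sketch below assumes the sample choice $a_n=1/n^2$, which is not taken from the notes, and accumulates $\log a_1+\ldots+\log a_n$ so that each geometric mean is computed without underflow; the helper name `carleman_sides` is illustrative.

```python
import math

def carleman_sides(a):
    """Return (sum of GM(a_1..a_n) for n <= N, e * sum of a_n for n <= N)."""
    log_sum = 0.0
    gm_total = 0.0
    for n, term in enumerate(a, start=1):
        log_sum += math.log(term)
        gm_total += math.exp(log_sum / n)   # geometric mean of the first n terms
    return gm_total, math.e * sum(a)

a = [1.0 / (n * n) for n in range(1, 2001)]   # sample sequence, an assumption
lhs, rhs = carleman_sides(a)
print(lhs, rhs)
```

The finite-$N$ version checked here follows from the same chain of inequalities used in the proof, since truncating the inner sum only decreases it.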

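Exercise 216 (Indam test 2014, Exercise B3). Given $\sum_{n\geq 1}a_n$, convergent series with positive terms, prove that
\[ \sum_{n\geq 1}(a_n)^{\frac{n-1}{n}} \]
is convergent as well.
Proof. We may notice that $(a_n)^{\frac{n-1}{n}}\leq\frac{1+(n-1)a_n}{n}$ by AM-GM, but $\sum_{n\geq 1}\frac1n$ is a divergent series, so such argument does
not prove the claim directly. By tweaking it a bit:
\[ a_n^{\frac{n-1}{n}} = \operatorname{GM}\left(\frac1n,\,2a_n,\,\frac32a_n,\,\ldots,\,\frac{n}{n-1}a_n\right) \]
\[ a_n^{\frac{n-1}{n}} \leq \frac{1}{n^2}+\frac1n\sum_{k=1}^{n-1}\frac{k+1}{k}\,a_n \leq \frac{1}{n^2}+\left(1+\frac{\log n}{n}\right)a_n \]
hence
\[ \sum_{n\geq 1}a_n^{\frac{n-1}{n}} \leq \frac{\pi^2}{6}+\left(1+\frac1e\right)\sum_{n\geq 1}a_n. \]
An alternative approach is the following one: let $\alpha>1$ be some real number and let
\[ S = \left\{n : a_n\leq\frac{1}{\alpha^n}\right\},\qquad L = \left\{n : a_n>\frac{1}{\alpha^n}\right\}. \]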


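Our purpose is to prove that both $\sum_{n\in S}\frac{a_n}{\sqrt[n]{a_n}}$ and $\sum_{n\in L}\frac{a_n}{\sqrt[n]{a_n}}$ are convergent. By comparison with a geometric series,
$\sum_{n\in S}\frac{a_n}{\sqrt[n]{a_n}}\leq\sum_{n\geq 1}\frac{1}{\alpha^{n-1}}=\frac{\alpha}{\alpha-1}$. On the other hand, if $n\in L$ then $a_n^{-1/n}\leq\alpha$, hence:
\[ \sum_{n\geq 1}(a_n)^{\frac{n-1}{n}} \leq \frac{\alpha}{\alpha-1}+\alpha\sum_{n\geq 1}a_n \]
and, by choosing $\alpha$ in the optimal way,
\[ \sum_{n\geq 1}(a_n)^{\frac{n-1}{n}} \leq \left(1+\sqrt{\sum_{n\geq 1}a_n}\right)^2. \]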

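Theorem 217 (Hilbert’s inequality). If $\{a_m\}_{m\in\mathbb{N}}$ and $\{b_n\}_{n\in\mathbb{N}}$ are two sequences in $\ell^2(\mathbb{R})$ the following inequality
holds:
\[ \sum_{n=1}^{+\infty}\sum_{m=1}^{+\infty}\frac{a_mb_n}{n+m} \leq \pi\sqrt{\sum_{m=1}^{+\infty}a_m^2}\cdot\sqrt{\sum_{n=1}^{+\infty}b_n^2}. \]
Proof. The proof we are going to present is essentially due to Schur, whose key idea is to prove the claim by a weighted
version of the Cauchy-Schwarz inequality. For any positive real number $\lambda$ we have:
\[ \left(\sum_{m=1}^{+\infty}\sum_{n=1}^{+\infty}\frac{a_mb_n}{n+m}\right)^2 \leq \sum_{m=1}^{+\infty}\sum_{n=1}^{+\infty}\frac{a_m^2}{n+m}\left(\frac mn\right)^{2\lambda}\cdot\sum_{n=1}^{+\infty}\sum_{m=1}^{+\infty}\frac{b_n^2}{n+m}\left(\frac nm\right)^{2\lambda}. \]
We may notice that the first term in the RHS can be written as
\[ \sum_{m=1}^{+\infty}a_m^2\sum_{n=1}^{+\infty}\frac{1}{m+n}\left(\frac mn\right)^{2\lambda}, \]
so, by symmetry, it is enough to prove the existence of some $\lambda>0$ such that, for any $m\geq 1$:
\[ \sum_{n=1}^{+\infty}\frac{1}{m+n}\left(\frac mn\right)^{2\lambda} \leq \pi. \]
Since the terms of the series on the LHS are positive and decreasing, the following inequality holds:
\[ \sum_{n=1}^{+\infty}\frac{1}{m+n}\left(\frac mn\right)^{2\lambda} \leq \int_0^{+\infty}\frac{dx}{m+x}\cdot\frac{m^{2\lambda}}{x^{2\lambda}} = \int_0^{+\infty}\frac{dy}{(1+y)\,y^{2\lambda}} = \frac{\pi}{\sin(2\pi\lambda)}, \]
so the choice $\lambda=\frac14$ proves the claim. We leave to the reader to check that, if $a_n=b_n=n^{-1/2-\varepsilon}$,
\[ \sum_{n=1}^{+\infty}\sum_{m=1}^{+\infty}\frac{a_mb_n}{n+m} = \left(\pi-O(\varepsilon)\right)\sqrt{\sum_{m=1}^{+\infty}a_m^2}\cdot\sqrt{\sum_{n=1}^{+\infty}b_n^2}. \]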

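Prove that if $p>1$, $\frac1p+\frac1q=1$, $\{a_m\}_{m\in\mathbb{N}}\in\ell^p(\mathbb{R})$, $\{b_n\}_{n\in\mathbb{N}}\in\ell^q(\mathbb{R})$, then
\[ \sum_{n=1}^{+\infty}\sum_{m=1}^{+\infty}\frac{a_mb_n}{n+m} \leq \frac{\pi}{\sin\frac{\pi}{p}}\left(\sum_{m=1}^{+\infty}a_m^p\right)^{1/p}\cdot\left(\sum_{n=1}^{+\infty}b_n^q\right)^{1/q}. \]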

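Exercise 219 (A variant of Hilbert’s inequality).
Prove that if $\{a_n\}_{n\geq 1}$ and $\{b_n\}_{n\geq 1}$ are two sequences in $\ell^2(\mathbb{R}^+)$, then:
\[ \sum_{m\geq 1}\sum_{n\geq 1}\frac{a_mb_n}{\sqrt{m^2+n^2}} \leq \frac{\Gamma\!\left(\frac14\right)^2}{2\sqrt{\pi}}\,\|a\|_2\,\|b\|_2. \]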

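Theorem 220 (Hardy’s inequality). Let $p>1$ and let $a_1,\ldots,a_N$ be positive real numbers.
By setting $A_k=\sum_{i=1}^k a_i$, we have:
\[ \sum_{n=1}^N\left(\frac{A_n}{n}\right)^p \leq \left(\frac{p}{p-1}\right)^p\sum_{n=1}^N a_n^p. \]
\[ \int_0^{+\infty}\left(\frac1x\int_0^x f(u)\,du\right)^p dx \leq \left(\frac{p}{p-1}\right)^p\int_0^{+\infty}f(x)^p\,dx. \]
We may firstly prove:
\[ (\spadesuit)\qquad \sum_{n=1}^N\frac{A_n^p}{n^p} \leq \frac{p}{p-1}\sum_{n=1}^N\frac{a_nA_n^{p-1}}{n^{p-1}}. \]
Let $B_n=\frac{A_n}{n}$ and let $\Delta_n$ be the difference between the $n$-th terms in the LHS and RHS of $(\spadesuit)$. We have:
\[ \Delta_n \stackrel{\text{def}}{=} B_n^p-\frac{p}{p-1}\,a_nB_n^{p-1} = B_n^p-\frac{p}{p-1}\left(nB_n-(n-1)B_{n-1}\right)B_n^{p-1}, \]
or:
\[ \Delta_n = B_n^p\left(1-\frac{np}{p-1}\right)+\frac{p(n-1)}{p-1}\,B_{n-1}B_n^{p-1}. \]
As a consequence of Young’s inequality (212) we have:
\[ B_{n-1}B_n^{p-1} \leq \frac{B_{n-1}^p}{p}+(p-1)\frac{B_n^p}{p}, \]
\[ \Delta_n \leq \frac{n-1}{p-1}\,B_{n-1}^p-\frac{n}{p-1}\,B_n^p, \]
and by creative telescoping we may state:
\[ \sum_{n=1}^N\Delta_n \leq -\frac{N\,B_N^p}{p-1} < 0, \]
proving $(\spadesuit)$. By applying Hölder’s inequality (213) to the RHS of $(\spadesuit)$ we have:
\[ \sum_{n=1}^N\frac{A_n^p}{n^p} \leq \frac{p}{p-1}\sum_{n=1}^N\frac{a_nA_n^{p-1}}{n^{p-1}} \leq \frac{p}{p-1}\left(\sum_{n=1}^N a_n^p\right)^{1/p}\left(\sum_{n=1}^N\frac{A_n^p}{n^p}\right)^{(p-1)/p}, \]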
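Hardy’s inequality can be probed numerically for a given exponent and a given finite sequence. In the sketch below the sample sequence $a_n=n^{-0.6}$ and the exponents are assumptions chosen for illustration, and the helper name `hardy_sides` is hypothetical.

```python
def hardy_sides(a, p: float):
    """Return (sum (A_n / n)^p, (p/(p-1))^p * sum a_n^p) for a finite sequence a."""
    partial = 0.0
    lhs = 0.0
    for n, term in enumerate(a, start=1):
        partial += term                 # A_n = a_1 + ... + a_n
        lhs += (partial / n) ** p
    rhs = (p / (p - 1.0)) ** p * sum(t ** p for t in a)
    return lhs, rhs

a = [n ** -0.6 for n in range(1, 5001)]   # sample sequence, an assumption
for p in (1.5, 2.0, 3.0):
    print(p, hardy_sides(a, p))
```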


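Exercise 221. Let $\{p_n\}_{n=1}^{+\infty}$ be a sequence of positive real numbers such that the series $\sum_{n=1}^{+\infty}\frac{1}{p_n}$ is convergent.
Prove that such assumptions grant that the series
\[ \sum_{n=1}^{+\infty}\frac{n^2}{(p_1+\ldots+p_n)^2}\,p_n \]
is convergent as well.
\[ C^2 = \sum_{n=1}^{+\infty}\frac{1}{p_n},\qquad P_N = \sum_{i=1}^N p_i,\qquad S_N = \sum_{n=1}^N\frac{n^2}{(p_1+\ldots+p_n)^2}\,p_n = \sum_{n=1}^N\frac{n^2p_n}{P_n^2}. \]
Since $\{P_N\}_{N=1}^{+\infty}$ is an increasing sequence, we have:
\[ S_N = \frac{1}{p_1}+\sum_{n=2}^N\frac{n^2(P_n-P_{n-1})}{P_n^2} < \frac{1}{p_1}+\sum_{n=2}^N\frac{n^2(P_n-P_{n-1})}{P_nP_{n-1}} = \frac{1}{p_1}+\sum_{n=2}^N\left(\frac{n^2}{P_{n-1}}-\frac{n^2}{P_n}\right), \]
and, through summation by parts,
\[ S_N < \frac{5}{P_1}+\left(\sum_{n=2}^{N-1}\frac{2n+1}{P_n}\right)-\frac{N^2}{P_N} < 5\sum_{n=1}^N\frac{n}{P_n}. \]
By exploiting the Cauchy-Schwarz inequality we also have:
\[ \sum_{n=1}^N\frac{n}{P_n} \leq \sqrt{\sum_{n=1}^N\frac{1}{p_n}}\cdot\sqrt{\sum_{n=1}^N\frac{n^2p_n}{P_n^2}} \leq C\sqrt{S_N}, \]
hence
\[ S_N < 5C\sqrt{S_N}, \]
or:
\[ S_N < 25\,C^2. \]
Since such inequality holds for any $N$, and since the sequence $\{S_N\}_{N=1}^{+\infty}$ is increasing, the following series is convergent
by the monotone convergence Theorem:
\[ \sum_{n=1}^{+\infty}\frac{n^2p_n}{P_n^2}. \]
It is not difficult to prove that we actually have the sharper inequality
\[ \sum_{n=1}^N\frac{2n+1}{P_n} < 4\sum_{n=1}^N\frac{1}{p_n}, \]
from which
\[ S_N < \frac{2}{p_1}+4C^2. \]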

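Exercise 222. Prove that if $\sum_{n=1}^{+\infty}\frac{1}{a_n}$ is a convergent series with positive real terms,
there exists a constant $C\in\mathbb{R}$ such that:
\[ \sum_{n=1}^{+\infty}\frac{n}{a_1+\ldots+a_n} \leq C\sum_{n=1}^{+\infty}\frac{1}{a_n}. \]
Proof. Due to the AM-GM inequality we have that:
\[ \sum_{n=1}^{+\infty}\frac{n}{a_1+\ldots+a_n} \leq \sum_{n=1}^{+\infty}\frac{1}{\operatorname{GM}(a_1,\ldots,a_n)} = \sum_{n=1}^{+\infty}\operatorname{GM}\left(\frac{1}{a_1},\ldots,\frac{1}{a_n}\right), \]
and by Carleman’s inequality (215) the last series is bounded by
\[ e\sum_{n=1}^{+\infty}\frac{1}{a_n}, \]
so the given inequality holds for sure by taking $C=e$. In the next exercise we will see that such result can be
improved: the given inequality holds for $C=2$, that is the optimal constant.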

\[ \sum_{k=1}^n\frac{2k+1}{a_1+a_2+\ldots+a_k} < 4\sum_{k=1}^n\frac{1}{a_k}. \]
Proof. We recall that, by exploiting the Cauchy-Schwarz inequality in the form of Titu’s Lemma, for $\gamma=\alpha+\beta$ we have
\[ \frac{\gamma^2}{r_1+r_2} \leq \frac{\alpha^2}{r_1}+\frac{\beta^2}{r_2}. \]
This Lemma implies:
\[ \frac{(n+1)^2}{a_1+\ldots+a_n}+\frac{2n-1}{a_1+\ldots+a_{n-1}} \leq \frac{4}{a_n}+\frac{2n-1+(n-1)^2}{a_1+\ldots+a_{n-1}} = \frac{4}{a_n}+\frac{n^2}{a_1+\ldots+a_{n-1}}, \]
from which it follows that:
\[ \frac{n^2}{a_1+\ldots+a_n}+\sum_{k=1}^n\frac{2k+1}{a_1+a_2+\ldots+a_k} \leq \frac{4}{a_n}+\frac{(n-1)^2}{a_1+\ldots+a_{n-1}}+\sum_{k=1}^{n-1}\frac{2k+1}{a_1+a_2+\ldots+a_k}, \]
and by induction:
\[ \frac{n^2}{a_1+\ldots+a_n}+\sum_{k=1}^n\frac{2k+1}{a_1+a_2+\ldots+a_k} \leq \left(\sum_{k=2}^n\frac{4}{a_k}\right)+\frac{1}{a_1}+\frac{3}{a_1} = \sum_{k=1}^n\frac{4}{a_k}. \]
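The last inequality can be checked with exact rational arithmetic. The sketch below is illustrative (the function name `partial_sum_sides` and the sample sequences are assumptions); it also shows that the choice $a_k=k$ turns the final inequality into an equality, which suggests the constant $4$ cannot be lowered in that statement.

```python
from fractions import Fraction

def partial_sum_sides(a):
    """Return (n^2/(a_1+..+a_n) + sum_k (2k+1)/(a_1+..+a_k), 4 * sum_k 1/a_k), exactly."""
    prefix = Fraction(0)
    weighted = Fraction(0)
    reciprocal = Fraction(0)
    for k, term in enumerate(a, start=1):
        prefix += term
        weighted += Fraction(2 * k + 1) / prefix
        reciprocal += Fraction(1) / term
    n = len(a)
    return Fraction(n * n) / prefix + weighted, 4 * reciprocal

print(partial_sum_sides([3, 1, 4, 1, 5, 9, 2, 6]))
print(partial_sum_sides(list(range(1, 31))))   # a_k = k: both sides coincide
```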

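Exercise 225. Prove that for any $p>1$ and for any $a,b,\alpha,\beta>0$ we have:
\[ \left(\frac{(\alpha+\beta)^{p+1}}{a^p+b^p}\right)^{1/p} \leq \left(\frac{\alpha^{p+1}}{a^p}\right)^{1/p}+\left(\frac{\beta^{p+1}}{b^p}\right)^{1/p}. \]
Proof. If we set $b/a=x$, it is enough to show that the minimum of the function $f:\mathbb{R}^+\to\mathbb{R}^+$
defined by:
\[ f(x) = \alpha^{\frac{p+1}{p}}(1+x^p)^{1/p}+\beta^{\frac{p+1}{p}}(1+x^{-p})^{1/p} \]
is exactly $(\alpha+\beta)^{\frac{p+1}{p}}$. In order to do that, it is enough to check that $f'(x)$ vanishes only at
\[ x = \left(\frac{\beta}{\alpha}\right)^{1/p}. \]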


Theorem 226 (Knopp). For any real number p ≥ 1 there exists some constant Cp ∈ R+ such that

for any sequence a1 , . . . , aN of positive real numbers,

N 1/p N

X n X 1

< Cp .

n=1

ap1 + . . . + apn a

n=1 n

Proof. In a similar way to Exercise 223, we prove there is a positive and increasing function f : N0 → R+ such that:

!1/p N 1/p !1/p N −1 1/p

f (N ) X n Cp f (N − 1) X n

(♦) PN + ≤ + + ,

ap1 + . . . + apn ap1 + . . . + apn

p PN −1 p

n=1 an n=1

aN n=1 an n=1

!1/p N 1/p N

f (N ) X n 1 + f (1)1/p X Cp

+ ≤ + .

ap1 + . . . + apn

PN p

n=1 an n=1

a1 a

n=2 n

In order that (♦) implies the claim it is enough that f (1) ≤ (Cp − 1)p holds and:

!1/p !1/p !1/p

f (N ) N Cp f (N − 1)

∀N ≥ 2, PN p

+ PN ≤ + PN −1 p .

n=1 an n=1 apn an n=1 an

p/(p+1)

By exploiting the inequalities proved in 225, by assuming f (N )1/p + N 1/p ≥ Cp we have:

p

p+1 p p+1

p

f (N )1/p

+N 1/p

Cp f (N )1/p + N 1/p − Cpp+1

P 1/p ≤ a + P 1/p ,

N p N N −1

n=1 an n=1 apn

p

p+1 p

1

f (N )1/p + N 1/p ≤ f (N − 1) p+1 + Cpp+1 .

Now we may consider $C_p = (1+p)^{\frac{1}{p}}$, that is the best constant we may put in the RHS of the initial inequality, if $a_n = n$. Then we consider $f(N) = k \cdot N^{p+1}$: the previous inequality becomes:

$$(\spadesuit)\qquad k^{\frac{1}{p+1}}\, N \left(1+\frac{1}{N k^{1/p}}\right)^{\frac{p}{p+1}} \leq k^{\frac{1}{p+1}}(N-1) + (1+p)^{\frac{1}{p+1}}.$$

Due to Bernoulli's inequality we have:

$$\left(1+\frac{1}{N k^{1/p}}\right)^{\frac{p}{p+1}} \leq 1 + \frac{p}{N(p+1)k^{1/p}},$$

so if we have some $k$ such that:

$$(\heartsuit)\qquad \frac{p}{p+1}\,k^{-\frac{1}{p(p+1)}} + k^{\frac{1}{p+1}} \leq C_p^{\frac{p}{p+1}} = (p+1)^{\frac{1}{p+1}}$$

the inequality (♠) is fulfilled. By studying the stationary points of the function $g(x) = Ax^{-\alpha} + x^{\beta}$ it is simple to derive that the choice

$$k = (p+1)^{-p}$$

leads to an equality in (♥). It just remains to prove that with the choice $f(N) = \frac{N^{p+1}}{(p+1)^p}$ we have $f(1) \leq (C_p-1)^p$, or $C_p \geq 1 + \frac{1}{p+1}$, i.e.:

$$(p+1)^{\frac{1}{p}} \geq 1 + \frac{1}{p+1}.$$

By multiplying both sides by $(p+1)$ we get that such inequality is equivalent to:

$$(p+1)^{\frac{p+1}{p}} \geq p+2,$$

which follows again from Bernoulli's inequality:

$$(p+1)^{\frac{p+1}{p}} \geq 1 + \frac{p+1}{p}\cdot p = p+2.$$

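In particular, for any $p \geq 1$ and any sequence $a_1, \ldots, a_N$ of positive real numbers we have:

$$\frac{N^{\frac{p+1}{p}}}{(p+1)\left(a_1^p + \ldots + a_N^p\right)^{1/p}} + \sum_{n=1}^{N}\left(\frac{n}{a_1^p+\ldots+a_n^p}\right)^{1/p} \leq (1+p)^{\frac{1}{p}} \sum_{n=1}^{N}\frac{1}{a_n},$$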

that essentially is a generalization of Hardy’s inequality (220) to negative exponents. We leave to the reader to prove

that the shown Cp constant is optimal, since by assuming an = n and letting N → +∞, it is clear it cannot be

replaced by any smaller real number.
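A small numerical experiment may help in getting a feeling for Theorem 226. The following Python sketch (the names `knopp_sides` and `knopp_ratio` are ours and purely illustrative) computes the ratio between the left-hand side and $\sum 1/a_n$ for a given positive sequence. For $a_n = n$ the ratio is expected to creep towards $(1+p)^{1/p}$ from below, although only at a logarithmic pace.

```python
def knopp_sides(a, p):
    """Return (lhs, rhs_sum) for Knopp's inequality on the positive sequence a.

    lhs     = sum_n (n / (a_1^p + ... + a_n^p))^(1/p)
    rhs_sum = sum_n 1 / a_n   (the constant C_p is NOT included)
    """
    lhs = 0.0
    power_sum = 0.0
    for n, a_n in enumerate(a, start=1):
        power_sum += a_n ** p          # running value of a_1^p + ... + a_n^p
        lhs += (n / power_sum) ** (1.0 / p)
    rhs_sum = sum(1.0 / a_n for a_n in a)
    return lhs, rhs_sum


def knopp_ratio(a, p):
    """Ratio lhs / rhs_sum; Theorem 226 bounds it by (1 + p)^(1/p)."""
    lhs, rhs_sum = knopp_sides(a, p)
    return lhs / rhs_sum


if __name__ == "__main__":
    p = 2
    for size in (10, 100, 1000, 10000):
        print(size, knopp_ratio(range(1, size + 1), p), (1 + p) ** (1 / p))
```

Running the script for increasing sizes shows how slowly the extremal sequence $a_n = n$ approaches the optimal constant.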

Theorem 227 (Brunn-Minkowski). If A and B are two compact subsets of Rn and µ is the n-dimensional Lebesgue

measure,

$$\mu(A+B)^{\frac{1}{n}} \geq \mu(A)^{\frac{1}{n}} + \mu(B)^{\frac{1}{n}}.$$

If both A and B are given by cartesian products of closed intervals (let us say boxes) the given inequality is trivial

by AM-GM. By an ingenious trick known as Hadwiger-Ohmann’s cut it is possible to show it continues to hold for

disjoint unions of boxes, hence the claim follows from the regularity of the Lebesgue measure.

Equality is achieved only if A and B are homothetic shapes, i.e. can be brought one into the other by the composition

of a uniform dilation and a translation. This inequality is extremely powerful and it can be employed to prove the

isoperimetric inequality in n dimensions.
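The case of boxes is easy to test numerically: if $A$ and $B$ are axis-parallel boxes with side lengths $a_i$ and $b_i$, then $A+B$ is a box with side lengths $a_i+b_i$. The sketch below (function names are ours) evaluates the gap between the two sides of the Brunn-Minkowski inequality in that special case only.

```python
import math


def box_volume(sides):
    """Lebesgue measure of an axis-parallel box with the given side lengths."""
    return math.prod(sides)


def brunn_minkowski_gap(a, b):
    """mu(A+B)^(1/n) - mu(A)^(1/n) - mu(B)^(1/n) for two boxes in R^n.

    a and b are the lists of side lengths of A and B; the Minkowski sum
    of two such boxes is the box with side lengths a_i + b_i.
    """
    n = len(a)
    sum_sides = [x + y for x, y in zip(a, b)]
    return (box_volume(sum_sides) ** (1.0 / n)
            - box_volume(a) ** (1.0 / n)
            - box_volume(b) ** (1.0 / n))


if __name__ == "__main__":
    print(brunn_minkowski_gap([1, 2, 3], [3, 1, 2]))   # non-negative gap
    print(brunn_minkowski_gap([1, 2, 3], [2, 4, 6]))   # homothetic boxes
```

For homothetic boxes the gap vanishes up to rounding, in agreement with the equality case.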

Theorem 228 (Isoperimetric inequality in the plane). If γ is a regular, closed and simple curve, we have:

$$4\pi A \leq L^2$$

where L is the length of γ and A is the area enclosed by γ. Equality holds if and only if γ is a circle.

A key observation is that if a simple, regular and closed curve with a given length encloses the maximum area, it has to be convex. Otherwise we might apply a reflection to an arc of such curve, increasing the enclosed area without affecting the length.

On the other hand, given a closed and convex curve, we may consider an arbitrary chord and apply a reflection with respect to the perpendicular bisector of such chord to one of the arcs cut. Through such transform, both the perimeter and the enclosed area stay unchanged. In particular any curve with a given length that is regular, simple, closed and maximizes the enclosed area is mapped by any of such transforms into a convex curve: otherwise it would be possible to increase the enclosed area without affecting the length, as depicted on the right.

Given these considerations it is not difficult to show that the circle is the only solution of the isoperimetric prob-

lem in the plane. As an alternative, one may follow a discretization approach.

Lemma 229. Among all the simple and closed polygonal lines $P_1 P_2 \ldots P_n$ ($P_{n+1} = P_1$) having sides $l_i = P_i P_{i+1}$ with fixed lengths, the polygonal line enclosing the greatest area is such that $P_1, \ldots, P_n$ are vertices of a cyclic polygon.

That case can be tackled through Ptolemy’s inequality.

Theorem 230 (Ptolemy). If A, B, C, D (in this ordering) are the vertices of a convex quadrilateral in the plane, we

have:

AB · CD + BC · DA ≥ AC · BD

Proof. It is enough to apply a circle inversion with respect to a unit circle centered at A, then consider how distances

change under circle inversion.

We have:

$$\begin{aligned}
a^2 &= p_1^2 + q_1^2 - 2p_1 q_1 \cos(\pi-\theta)\\
b^2 &= p_1^2 + q_2^2 - 2p_1 q_2 \cos(\theta)\\
c^2 &= p_2^2 + q_2^2 - 2p_2 q_2 \cos(\pi-\theta)\\
d^2 &= p_2^2 + q_1^2 - 2p_2 q_1 \cos(\theta)
\end{aligned}$$

from which it follows that:

$$\begin{aligned}
16A^2 &= (2pq)^2 - (2pq\cos\theta)^2\\
&\leq (2ac+2bd)^2 - (a^2-b^2+c^2-d^2)^2\\
&= (a+b+c-d)(a+b-c+d)(a-b+c+d)(-a+b+c+d)
\end{aligned}$$

and ≤ holds as an equality iff the previous quadrilateral is cyclic. Moreover, given some closed and simple polygonal

line P1 , P2 , . . . , Pn (Pn+1 = P1 ) having sides with fixed lengths, it is always possible to rearrange its vertices in such a

way they lie on the same circle. Indeed we may place Q1 , Q2 , . . . , Qn , Qn+1 on a circle with a huge radius, in such a way

that Q1 Q2 = P1 P2 , . . . , Qn Qn+1 = Pn Pn+1 hold, then slowly decrease the radius of such circle, letting Q1 , . . . , Qn+1

“slide” on it while preserving the mutual distances. By continuity, there exists some radius such that Qn+1 ≡ Q1 ,

leading to a cyclic rearrangement of the polygonal line.

In particular, if P1 P2 . . . Pn is a closed and simple polygonal line with n ≥ 4 vertices, enclosing the largest possible

area, all the quadrilaterals Pk Pk+1 Pk+2 Pk+3 are cyclic. Otherwise by leaving Pk and Pk+3 where they are and by

rearranging Pk+1 and Pk+2 we would increase the enclosed area. This proves Lemma (229).

onal line, and since all the solutions to the isoperimetric problem

in the plane (i.e. the curves of fixed length enclosing the largest

possible area) are regular functions, the isoperimetric inequality

follows from Lemma (229) “by sending n towards +∞”. How-

ever the Brunn-Minkowski inequality provides a less involved

approach.

$K$ having perimeter $L$ and area $A$, and $B_r$ is a circle with radius $r$, the area of $K + B_r$ is given by $A + Lr + \pi r^2$.

the enclosed area increases.

the union of K, n rectangles with height r and bases on the sides of

K, n circle sectors corresponding to a partition of Br .

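region enclosed$^{10}$ by a regular, simple and closed curve with length $L$, we have:

$$\mu(K + B_r) = \mu(K) + rL + \pi r^2.$$

Due to the Brunn-Minkowski inequality with $n = 2$:

$$\sqrt{\mu(K) + rL + \pi r^2} \geq \sqrt{\mu(K)} + \sqrt{\pi r^2},$$

and by squaring both sides we get $L \geq 2\sqrt{\pi\,\mu(K)}$, i.e. $L^2 \geq 4\pi\,\mu(K)$.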

Equality holds if and only if, up to translations, K = λBr , meaning that K is a circle with radius λr. The last

approach can be easily extended to the n > 2 case: if the surface area of the boundary is fixed, the suitable closed balls

with respect to the Euclidean norm are convex sets enclosing the maximum volume. The last proof of the isoperimetric

inequality we are going to see has a more analytic flavour, and it is deeply related to the Poincaré-Wirtinger

inequality.

10 We used the letter K twice on purpose. The solutions to the isoperimetric problem are given by convex sets and, since every regular

and convex curve is a limit of boundaries of convex polygons, in the isoperimetric problem the continuous and discrete approaches are

equivalent.


Theorem 232 (Wirtinger's inequality for functions). (Version I) If $f \in C^1(\mathbb{R})$ is a $2\pi$-periodic function such that $\int_0^{2\pi} f(\theta)\,d\theta = 0$, we have:

$$\int_0^{2\pi} f(\theta)^2\,d\theta \leq \int_0^{2\pi} f'(\theta)^2\,d\theta$$

(Version II) If $f$ is a function of class $C^1$ on the interval $[0, 2\pi]$ and $f(0) = f(2\pi) = 0$, we have:

$$\int_0^{2\pi} f(\theta)^2\,d\theta \leq 4\int_0^{2\pi} f'(\theta)^2\,d\theta$$

where equality holds iff $f(\theta) = \kappa \sin\frac{\theta}{2}$.

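Proof. In the first case we may assume (in the $L^2$-sense)

$$f(\theta) = \sum_{n\geq 1} s_n \sin(n\theta) + \sum_{n\geq 1} c_n \cos(n\theta),$$

$$f'(\theta) = \sum_{n\geq 1} n s_n \cos(n\theta) - \sum_{n\geq 1} n c_n \sin(n\theta)$$

so that Parseval's identity gives

$$\int_0^{2\pi} f(\theta)^2\,d\theta = \pi\sum_{n\geq 1}(s_n^2 + c_n^2) \leq \pi\sum_{n\geq 1} n^2 (s_n^2 + c_n^2) = \int_0^{2\pi} f'(\theta)^2\,d\theta$$

and $\leq$ holds as an equality iff $c_2 = c_3 = \ldots = s_2 = s_3 = \ldots = 0$. This proves the first version of Wirtinger's inequality. When proving the second version we may assume without loss of generality

$$f(\theta) = \sum_{n\geq 1} s_n \sin(n\theta/2),\qquad f'(\theta) = \frac{1}{2}\sum_{n\geq 1} n s_n \cos(n\theta/2)$$

(in the $L^2$-sense) and the claim follows again from Parseval's identity.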

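We may now consider that any regular, simple and closed curve with length $L$ has an arc length parametrization, i.e. a couple of piecewise-$C^1$, $L$-periodic functions $x(s), y(s)$ such that

$$\left(\frac{dx}{ds}\right)^2 + \left(\frac{dy}{ds}\right)^2 = 1.$$

By introducing

$$f(\theta) \stackrel{\text{def}}{=} x\!\left(\frac{L\theta}{2\pi}\right),\qquad g(\theta) \stackrel{\text{def}}{=} y\!\left(\frac{L\theta}{2\pi}\right)$$

we have that the area enclosed by $\gamma$, as a consequence of Green's Theorem, is given by the integral $\int_0^{2\pi} f(\theta)\,g'(\theta)\,d\theta$. By defining $\overline{f}$ as the mean value of $f$ on the interval $[0, 2\pi]$, and by noticing that the mean value of $g'(\theta)$ is zero, we have:

$$\begin{aligned}
A = \int_0^{2\pi} f(\theta)\,g'(\theta)\,d\theta &= \int_0^{2\pi} (f(\theta) - \overline{f})\,g'(\theta)\,d\theta\\
&\leq \frac{1}{2}\int_0^{2\pi}\left[(f(\theta)-\overline{f})^2 + g'(\theta)^2\right]d\theta\\
&\stackrel{\text{Wirt}}{\leq} \frac{1}{2}\int_0^{2\pi}\left[f'(\theta)^2 + g'(\theta)^2\right]d\theta\\
&= \frac{1}{2}\int_0^{2\pi}\frac{L^2}{4\pi^2}\,d\theta = \frac{L^2}{4\pi}.
\end{aligned}$$

Equality holds if and only if $f(\theta) - \overline{f} = a\sin\theta + b\cos\theta$ and $g'(\theta) = f(\theta) - \overline{f}$, i.e. only when $\gamma$ is a circle. It is interesting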

to point out that the given proof can also be reversed, proving that Wirtinger’s inequality for functions (or, at least,

its first version) is a consequence of the isoperimetric inequality in the plane.
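The discrete side of the story can be explored numerically. The following Python sketch (helper names are ours) computes perimeter and area of a simple polygon through the shoelace formula and evaluates the isoperimetric deficit $L^2 - 4\pi A$. For a regular $n$-gon one expects $L^2 = 4n\tan\frac{\pi}{n}\,A$, so the deficit is positive and shrinks as $n$ grows.

```python
import math


def perimeter_and_area(vertices):
    """Perimeter and (shoelace) area of a simple polygon given as a list of (x, y)."""
    length, twice_area = 0.0, 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        length += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0
    return length, abs(twice_area) / 2.0


def regular_polygon(n):
    """Vertices of the regular n-gon inscribed in the unit circle."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]


def isoperimetric_deficit(vertices):
    """L^2 - 4*pi*A, which the isoperimetric inequality claims to be non-negative."""
    length, area = perimeter_and_area(vertices)
    return length ** 2 - 4 * math.pi * area


if __name__ == "__main__":
    for n in (3, 4, 6, 12, 100):
        print(n, isoperimetric_deficit(regular_polygon(n)))
```

This is only a sanity check on a few shapes, of course, not a replacement for the proofs above.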

Exercise 233 (Dido's problem). A rich landowner has bought 1 km of metal fence. He wants to use such fence and a wall of his huge home to enclose the largest possible area for his flock. What is the optimal shape he may choose for the fence, and how large is the largest area he can dedicate to his flock?

Exercise 234 (Lhuilier's inequality). Prove that if $P$ is a convex polygon with $n$ sides, having external angles $\alpha_1, \ldots, \alpha_n$, we have:

$$L(\partial P)^2 \geq 4\,A(P)\sum_{k=1}^{n}\tan\frac{\alpha_k}{2}$$

Exercise 235. Prove that if $P$ is a convex polygon with $n$ sides containing a unit circle, we have:

$$A(P) \geq n\tan\frac{\pi}{n}$$

and equality holds if and only if $P$ is a regular polygon.

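Exercise 236. Prove that if $P$ is a convex polygon with $n$ sides, we have:

$$L(\partial P)^2 \geq 4n\tan\left(\frac{\pi}{n}\right) A(P)$$

and equality holds if and only if $P$ is a regular polygon.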

Exercise 237 (Bonnesen's inequality). Let $\gamma$ be a regular, simple and closed curve enclosing a convex region $P$. Let $R, r$ denote the radii of the circumscribed and inscribed circle. Prove that the following strengthening of the isoperimetric inequality holds:

$$L(\gamma)^2 - 4\pi A(P) \geq \pi^2 (R-r)^2.$$

Among the convex polygons enclosing the origin, having sides of fixed lengths, the cyclic polygon encloses the

maximum area.

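By setting $\theta_i = \widehat{P_{i+1} P_i O}$, $L_i = d(P_i, P_{i+1})$ and $\varphi_i = \widehat{O P_{i+1} P_i}$, we want to find the maximum of

$$A = \frac{1}{2}\sum_i \frac{\sin\theta_i \sin\varphi_i}{\sin(\theta_i+\varphi_i)}\,L_i^2$$

subject to the constraint:

$$\sum_i (\theta_i + \varphi_i) = (n-2)\pi.$$

We are in position of applying Lagrange multipliers, from which we have, for any $i$:

$$\frac{\sin^2\varphi_i}{\sin^2(\theta_i+\varphi_i)} = \frac{\partial}{\partial\theta_i}\,\frac{\sin\theta_i\sin\varphi_i}{\sin(\theta_i+\varphi_i)} = \frac{\lambda}{L_i^2} = \frac{\partial}{\partial\varphi_i}\,\frac{\sin\theta_i\sin\varphi_i}{\sin(\theta_i+\varphi_i)} = \frac{\sin^2\theta_i}{\sin^2(\theta_i+\varphi_i)}.$$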

It follows that all the triangles OPi Pi+1 are isosceles triangles with vertex at O, hence the cyclic polygon P1 . . . Pn

certainly has the greatest area.

Theorem 238 (Jung). Every compact set $K \subset \mathbb{R}^n$ of diameter $d$ is contained in some closed ball of radius

$$R \leq d\sqrt{\frac{n}{2(n+1)}}$$

Proof. Given a set of points of diameter $d$ in $\mathbb{R}^n$ it is trivial to see that it can be covered by a ball of radius $d$. But the above Theorem by Jung improves the result by a factor of about $\frac{1}{\sqrt{2}}$, and is the best possible.

We first prove this Theorem for sets of points $S$ with $|S| \leq n+1$ and then extend it to an arbitrary point set. If $|S| \leq n+1$ then the smallest ball enclosing $S$ exists. We assume that its center is the origin and denote its radius by $R$. Denote by $S' \subseteq S$ the subset of points such that $\|p\| = R$ for $p \in S'$. It is easy to see that $S'$ is in fact non empty.

Observation: The origin must lie in the convex hull of $S'$. Assuming the contrary, there is a separating hyperplane $H$ such that $S'$ lies on one side and the origin lies on the other side of $H$ (strictly). By assumption, every point in $S \setminus S'$ has a distance strictly less than $R$ from the origin. Move the center of the ball slightly from the origin, in a direction perpendicular to the hyperplane $H$ towards $H$, such that the distances from the new center to every point in $S \setminus S'$ remain less than $R$. However, now the distance to every point of $S'$ is decreased and so we will have a ball of radius strictly less than $R$ enclosing $S$, which is a contradiction to the minimality of $R$.

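Let $S' = \{p_1, p_2, \ldots, p_m\}$ where $m \leq n+1$. Since the origin is in the convex hull of $S'$, there are non-negative $\lambda_i$ such that

$$\sum_i \lambda_i p_i = 0,\qquad \sum_i \lambda_i = 1.$$

For any fixed $k$ we have $\|p_i - p_k\| \leq d$, hence:

$$\begin{aligned}
1 - \lambda_k = \sum_{i\neq k}\lambda_i &\geq \frac{1}{d^2}\sum_{i=1}^{m}\lambda_i \|p_i - p_k\|^2\\
&= \frac{1}{d^2}\sum_{i=1}^{m}\lambda_i\left(2R^2 - 2\langle p_i, p_k\rangle\right)\\
&= \frac{1}{d^2}\left(2R^2 - 2\left\langle \sum_{i=1}^{m}\lambda_i p_i,\, p_k\right\rangle\right)\\
&= \frac{2R^2}{d^2}.
\end{aligned}$$

Adding up these inequalities for $k = 1, \ldots, m$ we get:

$$m - 1 \geq \frac{2mR^2}{d^2}.$$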

Thus we get $\frac{R^2}{d^2} \leq \frac{m-1}{2m} \leq \frac{n}{2n+2}$ since $m \leq n+1$ and the function $\frac{x-1}{2x}$ is monotonic. So we have immediately $R \leq d\sqrt{\frac{n}{2n+2}}$. The remainder of the proof uses the beautiful theorem of Helly. So assume $S$ is any set of points of diameter $d$. With each point as center draw a ball of radius $R = d\sqrt{\frac{n}{2n+2}}$. Clearly any $n+1$ of these balls intersect.

This is true because the center of the smallest ball enclosing n + 1 of the points is at most R away from each of those

points. So we have a collection of compact convex sets, any n + 1 of which have a nonempty intersection. By Helly’s

theorem all of them have an intersection. Any point of this intersection can be chosen to be the center of a ball of

radius R that will enclose all of S.
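The regular simplex shows that Jung's bound cannot be improved. As a sketch (with names of our own choosing), we may take the $n+1$ standard basis vectors of $\mathbb{R}^{n+1}$: they span an $n$-dimensional affine subspace, have mutual distance $\sqrt{2}$, and by symmetry we assume here that their centroid is the center of the smallest enclosing ball.

```python
import math
from itertools import combinations


def simplex_radius_and_jung_bound(n):
    """Compare the circumradius of a regular n-simplex with Jung's bound.

    The simplex is realized by the n+1 standard basis vectors of R^(n+1),
    which lie in an n-dimensional affine subspace.
    Returns (R, d * sqrt(n / (2(n+1)))).
    """
    points = [[1.0 if i == j else 0.0 for j in range(n + 1)] for i in range(n + 1)]
    diameter = max(math.dist(p, q) for p, q in combinations(points, 2))
    centroid = [1.0 / (n + 1)] * (n + 1)
    radius = max(math.dist(centroid, p) for p in points)
    return radius, diameter * math.sqrt(n / (2.0 * (n + 1)))


if __name__ == "__main__":
    for n in (1, 2, 3, 10):
        print(n, simplex_radius_and_jung_bound(n))
```

The two returned values agree up to rounding, which is the equality case of the Theorem.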

Theorem 239 (Helly). Let $\{X_1, \ldots, X_n\}$ be a finite collection of convex subsets of $\mathbb{R}^d$, with $n > d$. If the intersection of every $d+1$ of these sets is non-empty, then the whole collection has a nonempty intersection; that is,

$$\bigcap_{j=1}^{n} X_j \neq \emptyset.$$

For infinite collections one has to assume compactness: Let {Xα } be a collection of compact convex subsets of Rd ,

such that every subcollection of cardinality at most d + 1 has a non-empty intersection, then the whole collection has

a non-empty intersection.

Base case: Let n = d + 2. By our assumptions, for every j = 1, . . . , n there is a point xj that is in the common

intersection of all Xi with the possible exception of Xj . Now we apply Radon’s Theorem to the set A = {x1 , . . . , xn },

which furnishes us with disjoint subsets A1 , A2 of A such that the convex hull of A1 intersects the convex hull of A2 .

Suppose that $p$ is a point in the intersection of these two convex hulls. We claim that

$$p \in \bigcap_{j=1}^{n} X_j.$$

Indeed, consider any $j \in \{1, \ldots, n\}$. We shall prove that $p \in X_j$. Note that the only element of $A$ that may not be in $X_j$ is $x_j$. If $x_j \in A_1$, then $x_j \notin A_2$, and therefore $X_j \supset A_2$. Since $X_j$ is convex, it then also contains the convex hull of $A_2$ and therefore also $p \in X_j$. Likewise, if $x_j \notin A_1$, then $X_j \supset A_1$, and by the same reasoning $p \in X_j$. Since $p$ is in every $X_j$, it must also be in the intersection.

Above, we have assumed that the points $x_1, \ldots, x_n$ are all distinct. If this is not the case, say $x_i = x_k$ for some $i \neq k$, then $x_i$ is in every one of the sets $X_j$, and again we conclude that the intersection is nonempty. This completes the proof in the case $n = d+2$.

Inductive Step: Suppose n > d + 2 and that the statement is true for n − 1. The argument above shows that any

subcollection of d + 2 sets will have nonempty intersection. We may then consider the collection where we replace the

two sets Xn−1 and Xn with the single set Xn−1 ∩ Xn . In this new collection, every subcollection of d + 1 sets will have

nonempty intersection. The inductive hypothesis therefore applies, and shows that this new collection has nonempty

intersection. This implies the same for the original collection, and completes the proof.

Theorem 240 (Radon, 1921). Any set of d + 2 points in Rd can be partitioned into two disjoint sets whose convex

hulls intersect. A point in the intersection of these convex hulls is called a Radon point of the set.

Proof. Consider any set $\{x_1, x_2, \ldots, x_{d+2}\} \subset \mathbb{R}^d$ of $d+2$ points in a $d$-dimensional space. Then there exists a set of multipliers $a_1, a_2, \ldots, a_{d+2}$, not all of which are zero, solving the system of linear equations

$$\sum_{i=1}^{d+2} a_i x_i = 0,\qquad \sum_{i=1}^{d+2} a_i = 0,$$

because there are d + 2 unknowns (the multipliers) but only d + 1 equations that they must satisfy (one for each

coordinate of the points, together with a final equation requiring the sum of the multipliers to be zero). Fix some


particular nonzero solution a1 , a2 , . . . , ad+2 . Let I be the set of points with positive multipliers, and let J be the set

of points with multipliers that are negative or zero. Then I and J form the required partition of the points into two

subsets with intersecting convex hulls. The convex hulls of I and J must intersect, because they both contain the

point

$$p = \sum_{i\in I}\frac{a_i}{A}\,x_i = \sum_{j\in J}\frac{-a_j}{A}\,x_j,$$

where

$$A = \sum_{i\in I} a_i = -\sum_{j\in J} a_j.$$

The left hand side of the formula for p expresses this point as a convex combination of the points in I, and the

right hand side expresses it as a convex combination of the points in J. Therefore, p belongs to both convex hulls,

completing the proof.
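The proof is constructive, and it can be turned into a short program. The sketch below (function names are ours; exact rational arithmetic is used so that no rounding is involved) finds a non-trivial solution of the linear system by Gaussian elimination, then builds the Radon point from both sides of the partition.

```python
from fractions import Fraction


def null_vector(rows):
    """Return a nonzero rational solution v of rows * v = 0.

    Assumes the system has more unknowns (columns) than equations (rows).
    """
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0])
    pivots = []                      # pivots[r] = pivot column of reduced row r
    r = 0
    for c in range(n_cols):
        pivot_row = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot_row is None:
            continue
        m[r], m[pivot_row] = m[pivot_row], m[r]
        pivot_value = m[r][c]
        m[r] = [x / pivot_value for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    free = next(c for c in range(n_cols) if c not in pivots)
    v = [Fraction(0)] * n_cols
    v[free] = Fraction(1)            # set one free unknown to 1, the others to 0
    for row_index, c in enumerate(pivots):
        v[c] = -m[row_index][free]
    return v


def radon_point(points):
    """Radon partition of d+2 points in R^d, following the proof above.

    Returns (p_from_I, p_from_J, I, J): the point p computed from each side
    of the partition, and the index sets I (positive multipliers) and J.
    """
    d = len(points[0])
    assert len(points) == d + 2
    rows = [[p[k] for p in points] for k in range(d)] + [[1] * (d + 2)]
    a = null_vector(rows)
    I = [i for i, a_i in enumerate(a) if a_i > 0]
    J = [j for j, a_j in enumerate(a) if a_j <= 0]
    total = sum(a[i] for i in I)     # this is the constant A of the proof
    p_from_I = [sum(a[i] / total * points[i][k] for i in I) for k in range(d)]
    p_from_J = [sum(-a[j] / total * points[j][k] for j in J) for k in range(d)]
    return p_from_I, p_from_J, I, J


if __name__ == "__main__":
    print(radon_point([(0, 0), (1, 0), (1, 1), (0, 1)]))
```

For the four vertices of a square the two diagonals form the partition and the Radon point is the center.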

Van Der Corput's trick for lower bounds. The purpose of this paragraph is to prove that the partial sums of the sequence $\{\sin(n^2)\}_{n\geq 1}$ are not bounded. We have:

$$\begin{aligned}
2\left(\sum_{k=1}^{n}\sin(k^2)\right)^2 &= \sum_{j,k=1}^{n}\cos(j^2-k^2) - \sum_{j,k=1}^{n}\cos(j^2+k^2)\\
&= n + 2\sum_{m=1}^{n^2-1} d_1(m)\cos(m) - \sum_{m=2}^{2n^2} d_2(m)\cos(m)
\end{aligned}$$

where $d_1(m)$ accounts for the number of ways to write $m$ as $j^2-k^2$ with $1 \leq k < j \leq n$ and $d_2(m)$ accounts for the number of ways to write $m$ as $j^2+k^2$ with $1 \leq j, k \leq n$. Since both these arithmetic functions do not deviate much from their average order (by Dirichlet's hyperbola method $d_1(m)$ behaves on average like $\log m$ and $d_2(m)$ behaves on average like $\frac{\pi}{4}$), it is not terribly difficult to prove that for infinitely many values of $n$

$$\left|\sum_{k=1}^{n}\sin(k^2)\right| \geq C\sqrt{n}$$

holds for some absolute constant $C \approx \frac{1}{\sqrt{2}}$ through summation by parts and the Cauchy-Schwarz inequality.

A detailed exposition on Dirichlet’s hyperbola method can be found on Terence Tao’s blog.
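The starting identity of this paragraph is easy to double check by brute force. In the following sketch (function names are ours) $d_1$ counts the pairs $1 \leq k < j \leq n$ and $d_2$ counts the ordered pairs $1 \leq j, k \leq n$, so the diagonal $j = k$ of the first double sum is responsible for the isolated term $n$.

```python
import math
from collections import Counter


def squared_sum_side(n):
    """2 * (sum_{k<=n} sin(k^2))^2."""
    s = sum(math.sin(k * k) for k in range(1, n + 1))
    return 2.0 * s * s


def representation_side(n):
    """n + 2 * sum_m d1(m) cos(m) - sum_m d2(m) cos(m)."""
    # d1(m): ways to write m = j^2 - k^2 with 1 <= k < j <= n
    d1 = Counter(j * j - k * k for j in range(1, n + 1) for k in range(1, j))
    # d2(m): ways to write m = j^2 + k^2 with 1 <= j, k <= n (ordered pairs)
    d2 = Counter(j * j + k * k for j in range(1, n + 1) for k in range(1, n + 1))
    return (n
            + 2.0 * sum(count * math.cos(m) for m, count in d1.items())
            - sum(count * math.cos(m) for m, count in d2.items()))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        partial = sum(math.sin(k * k) for k in range(1, n + 1))
        print(n, partial, partial / math.sqrt(n))
```

The main block prints the partial sums next to their ratio with $\sqrt{n}$, which is a cheap way to look at the size of the fluctuations discussed here.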

We recall that Van Der Corput's trick is usually employed to produce upper bounds: for instance

$$\left|\sum_{k=1}^{n}\sin(k^2)\right| \leq D\sqrt{n}\,\log n,$$

for some absolute constant $D > 0$, holds for any $n$ large enough. In particular $\sum_{n\geq 1}\frac{\sin(n^2)}{n^{\alpha}}$ is convergent for any $\alpha > \frac{1}{2}$.

8 Bessel functions and the Gauss circle problem

Bessel functions naturally arise in the solution of the problem ∆u = f with certain boundary conditions, and they are

extremely relevant in Harmonic Analysis. We may introduce them by studying the Fourier sine series of the arcsin

function.

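$$\begin{aligned}
\int_{-1}^{1}\arcsin(x)\sin(\pi n x)\,dx &= \text{Im}\int_{-\pi/2}^{\pi/2} e^{i\pi n\sin z}\,z\cos(z)\,dz\\
&= \sum_{m\geq 0}\frac{(-1)^m(\pi n)^{2m+1}}{(2m+1)!}\int_{-\pi/2}^{\pi/2} z\cos(z)(\sin z)^{2m+1}\,dz\\
&\stackrel{\text{IBP}}{=} \sum_{m\geq 0}\frac{(-1)^m(\pi n)^{2m+1}}{(2m+1)!}\left(\left[\frac{z\sin(z)^{2m+2}}{2m+2}\right]_{-\pi/2}^{\pi/2} - \frac{1}{2m+2}\int_{-\pi/2}^{\pi/2}(\sin z)^{2m+2}\,dz\right)\\
&= \sum_{m\geq 0}\frac{(-1)^m(\pi n)^{2m+1}}{(2m+1)!}\cdot\frac{\pi}{2m+2}\left[1 - \frac{1}{4^{m+1}}\binom{2m+2}{m+1}\right]\\
&= \frac{1-\cos(\pi n)}{n} - \pi\sum_{m\geq 0}\frac{(-1)^m(\pi n)^{2m+1}}{4^{m+1}(m+1)!^2}\\
&= \frac{1}{n}\left[(-1)^{n+1} + \sum_{m\geq 0}\frac{(-1)^m(\pi n)^{2m}}{4^m\,m!^2}\right]
\end{aligned}$$

hence by defining

$$J_0(z) = \sum_{n\geq 0}\frac{(-1)^n z^{2n}}{4^n\,n!^2}$$

we have

$$\arcsin(z) \stackrel{L^2(-1,1)}{=} \sum_{m\geq 1}\frac{(-1)^{m+1} + J_0(\pi m)}{m}\,\sin(\pi m z),$$

$$\arcsin(z) - \frac{\pi z}{2} = \sum_{m\geq 1}\frac{J_0(\pi m)}{m}\,\sin(\pi m z).$$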
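The closed form for the Fourier sine coefficients of the arcsine can be compared with a direct numerical integration. The sketch below (all helper names are ours) evaluates $J_0$ through its power series and integrates with the composite Simpson rule, after the substitution $x = \sin z$ that removes the endpoint singularities of the derivative of $\arcsin$.

```python
import math


def bessel_j0(x, terms=80):
    """J_0(x) from the power series sum_m (-1)^m x^(2m) / (4^m m!^2)."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -(x * x) / (4.0 * (m + 1) ** 2)   # ratio of consecutive terms
    return total


def simpson(func, a, b, steps=4000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def arcsin_sine_coefficient(n):
    """Integral of arcsin(x) sin(pi n x) over (-1, 1), computed after x = sin z."""
    def integrand(z):
        return z * math.cos(z) * math.sin(math.pi * n * math.sin(z))
    return simpson(integrand, -math.pi / 2, math.pi / 2)


def closed_form(n):
    """((-1)^(n+1) + J_0(pi n)) / n, the value derived in the text."""
    return ((-1) ** (n + 1) + bessel_j0(math.pi * n)) / n


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, arcsin_sine_coefficient(n), closed_form(n))
```

The power series is only adequate for moderate arguments; for large $x$ an asymptotic expansion would be a better choice.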

It is straightforward to check from the power series that $J_0$ is the unique solution of the following differential equation of order 2:

$$z f''(z) + f'(z) + z f(z) = 0,\qquad f(0) = 1,\quad f'(0) = 0$$

which allows to state many interesting facts about the behaviour of $J_0$ far from the origin. From the power series definition it also follows that

$$(\mathcal{L}J_0)(s) = \frac{1}{\sqrt{1+s^2}}$$

while

$$(\mathcal{L}J_0^2)(s) = \frac{2}{\pi s}\,K\!\left(-\frac{4}{s^2}\right) = \frac{1}{\text{AGM}(s, \sqrt{s^2+4})}$$

behaves like $-\frac{\log s}{\pi}$ in a right neighbourhood of the origin and like $\frac{1}{s}$ in a left neighbourhood of $+\infty$.

Anyway the properties of J0 are better understood by introducing the Bessel functions of the first kind Jn and

their generating function: for such a purpose, we study the Fourier cosine series of eiz cos θ , which will lead to the

Jacobi-Anger expansion.


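$$\begin{aligned}
\int_0^{2\pi} e^{iz\cos\theta}\cos(m\theta)\,d\theta &= \sum_{n\geq 0}\frac{i^n z^n}{n!}\int_0^{2\pi}(\cos\theta)^n\cos(m\theta)\,d\theta\\
&= \sum_{n\geq 0}\frac{i^n z^n}{2^n\,n!}\int_0^{2\pi}(e^{i\theta}+e^{-i\theta})^n\cos(m\theta)\,d\theta\\
&= 2\pi\sum_{n=m+2k}\frac{i^n z^n}{2^n\,n!}\binom{m+2k}{k}\\
&= 2\pi\,i^m (z/2)^m \sum_{k\geq 0}\frac{(-1)^k z^{2k}}{4^k\,k!\,(m+k)!}
\end{aligned}$$

hence by defining

$$J_m(z) = \sum_{n\geq 0}\frac{(-1)^n (z/2)^{2n+m}}{n!\,(n+m)!}$$

we have

$$e^{iz\cos\theta} = J_0(z) + 2\sum_{m\geq 1} i^m J_m(z)\cos(m\theta).$$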

By considering the real or imaginary part of both sides, we get that Bessel functions of the first kind provide the coefficients of the Fourier series of $\sin(z\cos\theta)$ and $\cos(z\cos\theta)$. By applying Cauchy's integral formula to the Jacobi-Anger expansion we find the following integral representations:

$$J_n(z) = \frac{1}{\pi}\int_0^{\pi}\cos(z\sin\theta - n\theta)\,d\theta,$$

$$J_n(z) = \frac{1}{2\pi i}\oint_{|t|=\varepsilon}\exp\left(\frac{z}{2}\left(t - \frac{1}{t}\right)\right)\frac{dt}{t^{n+1}}.$$

Bessel functions share with the families of orthogonal polynomials many interesting properties. The recurrence relations

$$J_0'(z) = -J_1(z),\qquad J_n'(z) = \frac{1}{2}\left(J_{n-1}(z) - J_{n+1}(z)\right),$$

$$J_{n-1}(z) + J_{n+1}(z) = \frac{2n}{z}\,J_n(z),\qquad \frac{d}{dx}\left(x^m J_m(x)\right) = x^m J_{m-1}(x)$$

are straightforward to prove through the series definition or the integral representation. In the same way the entire function $J_n(z)$ can be checked to be a solution of Bessel's differential equation

$$z^2 f''(z) + z f'(z) + (z^2 - n^2) f(z) = 0$$

whose structure recalls the structure of Legendre's differential equation. The identities

$$1 = J_0(x)^2 + 2\sum_{k\geq 1} J_k(x)^2,\qquad 1 = J_0(x) + 2\sum_{k\geq 1} J_{2k}(x),$$

$$J_n(2z) = \sum_{k=0}^{n} J_k(z)J_{n-k}(z) + 2\sum_{k\geq 1}(-1)^k J_k(z)J_{n+k}(z)$$

can be proved through the orthogonality relations in $L^2(-\pi, \pi)$ applied to the Jacobi-Anger expansion. The integral representation or Bessel's differential equation allow to define $J_\nu(x)$ also for non-integer values of $\nu$. For $n \in \mathbb{N}$, the function $J_{-n}(x)$ is defined as $(-1)^n J_n(x)$. According to this convention, Bessel functions of the first kind have a very simple addition formula of the convolution-type:

$$J_n(y+z) = \sum_{m\in\mathbb{Z}} J_m(y)\,J_{n-m}(z).$$

The substitution $g(x) = \sqrt{x}\,f(x)$ turns the differential equation defining $J_0$, namely $x f''(x) + f'(x) + x f(x) = 0$, into:

$$g''(x) + \left(1 + \frac{1}{4x^2}\right) g(x) = 0.$$

Since the term $\frac{1}{4x^2}$ is non-negative and negligible for large values of $x$, it is reasonable to assume that $g(x) = A(x)\cos(x) + B(x)\sin(x)$ is such that $A(x) \to A$ and $B(x) \to B$ for $x \to +\infty$. In order to find the value of these constants, we may notice that

$$\cos(x)\,J_0(x) - \sin(x)\,J_0'(x) = \frac{2\sqrt{2}}{\pi\sqrt{x}}\int_0^{\sqrt{2x}}\cos(t^2)\sqrt{1 - \frac{t^2}{2x}}\,dt,$$

$$\sin(x)\,J_0(x) + \cos(x)\,J_0'(x) = \frac{2\sqrt{2}}{\pi\sqrt{x}}\int_0^{\sqrt{2x}}\sin(t^2)\sqrt{1 - \frac{t^2}{2x}}\,dt,$$

hence by the dominated convergence theorem and Fresnel integrals we have $A = B = \frac{1}{\sqrt{\pi}}$.

By studying the action of the operator $\frac{d^2}{dx^2} + 1 + \frac{1}{4x^2}$ on the conjectural expression

$$\sqrt{\pi x}\,J_0(x) = \left(A + \frac{A'}{x} + \frac{A''}{x^2} + \ldots\right)\cos(x) + \left(B + \frac{B'}{x} + \frac{B''}{x^2} + \ldots\right)\sin(x)$$

Poisson derived the formal series

$$\left(1 - \frac{1}{8x} - \frac{9}{2\cdot 8^2 x^2} + \frac{9\cdot 25}{2\cdot 3\cdot 8^3 x^3} + \ldots\right)\cos(x) + \left(1 + \frac{1}{8x} - \frac{9}{2\cdot 8^2 x^2} - \frac{9\cdot 25}{2\cdot 3\cdot 8^3 x^3} + \ldots\right)\sin(x)$$

which is not convergent, but whose truncations allow to devise arbitrarily accurate approximations of $J_0(x)$ as $x \to +\infty$.

As a side note, we may notice that over the real line the Laplace transform of $J_0(x)$, namely $\frac{1}{\sqrt{1+s^2}}$, and the Laplace transform of $\frac{\sin(x)+\cos(x)}{\sqrt{\pi x}}$, namely $\sqrt{\frac{1}{1+s^2} + \frac{1}{\sqrt{1+s^2}}}$, differ by a term bounded between $0$ and $\sqrt{2}-1$ and behaving like $\frac{1}{\sqrt{|s|}}$ for $|s| \to +\infty$.

Theorem 242 (RH for dummies). All the zeroes of the entire functions $J_n(z)$, $J_n'(z)$ in $\mathbb{C}\setminus\{0\}$ are real and simple. The average distance between a zero and the next one approaches $\pi$.

Proof. Assuming that for some $z \in \mathbb{C}\setminus\{0\}$ we have $J_n(z) = 0$ and $J_n'(z) = 0$, Bessel's differential equation implies $J_n''(z) = 0$, then $J_n^{(m)}(z) = 0$ for any $m$ by induction, hence $J_n \equiv 0$ by the principle of analytic continuation, which is a contradiction. Since the coefficients of the Maclaurin series of $J_n(z)$ are real, assuming $J_n(z_0) = 0$ with $z_0 \notin \mathbb{R}$ implies that $\overline{z_0}$ is a zero, too. The same holds for $J_n'$. Now we may consider the identity

$$(a^2 - b^2)\int_0^z t\,J_n(at)\,J_n(bt)\,dt = z\left[b\,J_n(az)\,J_n'(bz) - a\,J_n'(az)\,J_n(bz)\right]$$

which for $a = z_0$, $b = \overline{z_0}$ and $z = 1$ gives

$$0 = \left(z_0^2 - \overline{z_0}^{\,2}\right)\int_0^1 t\,|J_n(z_0 t)|^2\,dt$$

which can only be true for $z_0 = iy$ with $y \in \mathbb{R}\setminus\{0\}$. In such a case, however,

$$\left(\frac{iy}{2}\right)^{-n} J_n(iy) = \sum_{m\geq 0}\frac{1}{m!\,(m+n)!}\left(\frac{y}{2}\right)^{2m}$$

is clearly positive. We may invoke the Gauss-Lucas theorem (the zeroes of $f'(z)$ lie in the convex hull of the zeroes of $f(z)$) to deduce that all the zeroes of $J_n'(z)$ in $\mathbb{C}\setminus\{0\}$ are real and simple. The asymptotic formula

$$J_n(z) \sim \sqrt{\frac{2}{\pi z}}\cos\left(z - \frac{\pi n}{2} - \frac{\pi}{4}\right)\qquad\text{for } |z| \to +\infty \text{ with } |\arg z| \leq \frac{\pi}{2} - \varepsilon$$

then proves the last part of the claim.

The differential equation

$$x^2 y'' + x y' + \left(x^2 - \frac{1}{4}\right) y = 0,$$

for which $y = J_{\frac{1}{2}}$ is a solution, is mapped into a well-known homogeneous differential equation with constant coefficients by the substitution $y(x) = \frac{f(x)}{\sqrt{x}}$.

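Exercise 244 (Exploiting the Fourier transform). Compute the value of the integral $\int_0^{+\infty}\left(\frac{J_1(x)}{x}\right)^2 dx$.

Proof. Since $f = J_1$ is a solution of

$$x^2 f'' + x f' + x^2 f = f$$

integration by parts gives

$$\int_0^{+\infty}\left(J_1(x)^2 + J_1(x)\,J_1''(x)\right)dx = \frac{1}{2}\int_0^{+\infty}\left(\frac{J_1(x)}{x}\right)^2 dx$$

so we just need to recall that the Fourier transform of $\frac{J_1(x)}{x}$ is given by:

$$\mathcal{F}\left(\frac{J_1(x)}{x}\right)(t) = \sqrt{\frac{2}{\pi}}\,\sqrt{1-t^2}\cdot\mathbb{1}_{(-1,1)}(t)$$

to be able to state:

$$\int_0^{+\infty}\left(J_1(x)^2 + J_1(x)\,J_1''(x)\right)dx = \frac{1}{\pi}\int_0^1\left(1-t^2\right)dt = \frac{2}{3\pi}$$

as a consequence of Parseval's theorem.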

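$$\begin{aligned}
\int_0^{+\infty}\frac{J_1(x)}{x^{\alpha}}\,dx &= \frac{1}{\Gamma(\alpha)}\int_0^{+\infty}\frac{s^{\alpha-1}}{\sqrt{1+s^2}\left(s+\sqrt{1+s^2}\right)}\,ds\\
&= \frac{1}{\Gamma(\alpha)}\int_0^{\pi/2}\frac{(\tan\theta)^{\alpha-1}}{1+\sin\theta}\,d\theta = \frac{1}{\Gamma(\alpha)}\int_0^{\pi/2}\left[(\sin\theta)^{\alpha-1}(\cos\theta)^{-1-\alpha} - (\sin\theta)^{\alpha}(\cos\theta)^{-1-\alpha}\right]d\theta\\
&= \frac{\pi\alpha}{2^{\alpha+1}\,\Gamma\left(1+\frac{\alpha}{2}\right)^2\sin\frac{\pi\alpha}{2}}
\end{aligned}$$

holds for any $\alpha \in \left(-\frac{1}{2}, 2\right)$, as soon as the original integral is intended as an improper Riemann integral. In a similar way:

$$\int_0^{+\infty}\frac{J_1(x)^2}{x^{\alpha}}\,dx = \frac{\sqrt{\pi}\,(1-\alpha)\,\Gamma\left(\frac{\alpha}{2}\right)}{2(1+\alpha)\,\Gamma\left(\frac{1+\alpha}{2}\right)^3\cos\frac{\pi\alpha}{2}}$$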

for any α ∈ (0, 1). We also have relations with Fourier-Legendre series and hypergeometric functions. For instance:

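$$\begin{aligned}
\int_0^{+\infty}\frac{J_0(x)^2}{\sqrt{x}}\,dx &= \int_0^{+\infty}\frac{2\,K\left(-\frac{4}{s^2}\right)}{(\pi s)^{3/2}}\,ds = \int_0^{+\infty}\frac{ds}{\sqrt{\pi s}\;\text{AGM}(s, \sqrt{s^2+4})}\\
&= \int_0^{+\infty}\frac{ds}{\sqrt{2\pi s}\;\text{AGM}(s, \sqrt{s^2+1})} = \int_0^{+\infty}\frac{ds}{\sqrt{2\pi s(s^2+1)}\;\text{AGM}\left(1, \sqrt{1-\frac{1}{s^2+1}}\right)}\\
&= \int_0^{+\infty}\frac{\sqrt{2}\,K\left(\frac{1}{s^2+1}\right)ds}{\sqrt{\pi^3 s(s^2+1)}} = \int_0^{+\infty}\frac{K\left(\frac{1}{s+1}\right)ds}{\sqrt{2\pi^3 s(s+1)\sqrt{s}}} = \frac{1}{\sqrt{2\pi^3}}\int_0^1\frac{K(s)\,ds}{(s(1-s))^{3/4}}\\
&= \frac{\Gamma\left(\frac{1}{4}\right)}{\sqrt{8\pi^3}}\sum_{n\geq 0}\frac{\Gamma\left(n+\frac{1}{2}\right)\Gamma\left(n+\frac{1}{4}\right)}{\Gamma(n+1)^2} = \frac{\Gamma\left(\frac{1}{4}\right)^2}{\sqrt{8\pi^2}}\cdot{}_2F_1\left(\tfrac{1}{4}, \tfrac{1}{2}; 1; 1\right)\\
&= \frac{1}{4\pi^{5/2}}\,\Gamma\left(\tfrac{1}{4}\right)^4.
\end{aligned}$$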

The last integrals are just special instances of the Sonine-Schafheitlin integral formula.

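Once the modified Bessel functions of the first kind are defined through

$$I_m(x) = \sum_{n\geq 0}\frac{(x/2)^{2n+m}}{n!\,(n+m)!} = \frac{1}{i^m}\,J_m(ix)$$

we may consider the integrals

$$I(\alpha, \beta, m) \stackrel{\text{def}}{=} \int_0^{+\infty} x^{\alpha} e^{-\beta x}\,I_m(x)\,dx.$$

Since

$$\mathcal{L}(I_m(x))(s) = \frac{\mathbb{1}_{s>1}(s)}{\sqrt{s^2-1}}\left(s - \sqrt{s^2-1}\right)^m$$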

we have:

" √ m #

dα s − s2 − 1

I(α, β, m) = √

dsα s2 − 1 s=β

2

xm−1 dx

I

s7→ x 2x

+1

α!

= √ α+1

2πi x2 +1

2x − β

kx−(β− β 2 −1)k=ε

p m+α

z + β − β 2−1

α!

= Res α+1

2α+1 z=0 α+1

p

z z − 2 β2 − 1

p m+α

α+1 x + β − β 2−1

α!(−1) α+2

= p α+1 · [x ] α+1

4α+1 2

β −1 1 − √ z2

2 β −1

α+1 α+2

X

α!(−1) m+α 2α + 2 − k p p k−1

= α+1

(β + β 2 − 1)m+α−k 2k−1 β 2 − 1 .

8 (β 2 − 1)α+1 k α

k=0

Besides this characterization of the inner products against the elements of $\text{Span}(x^{\alpha} e^{-\beta x} : \alpha \in \mathbb{N}, \beta > 0)$, modified Bessel functions of the first kind have a simple integral representation:

$$I_n(\alpha) = \frac{1}{\pi}\int_0^{\pi}\cos(nx)\,e^{\alpha\cos x}\,dx.$$


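By the cosine addition formula and integration by parts we may easily get the following recurrence relation:

$$I_n = \frac{\alpha}{2n}\left(I_{n-1} - I_{n+1}\right).$$

From this identity, we have that the ratio between $I_n$ and $I_{n-1}$ is the continued fraction:

$$\frac{I_n}{I_{n-1}}(\alpha) = \cfrac{1}{\frac{2}{\alpha}\,n + \cfrac{1}{\frac{2}{\alpha}(n+1) + \ldots}}.$$

In particular

$$r(x) = \frac{I_1}{I_0}(x) = \cfrac{1}{\frac{2}{x} + \cfrac{1}{\frac{4}{x} + \cfrac{1}{\frac{6}{x} + \ldots}}} = \cfrac{x}{2 + \cfrac{x^2}{4 + \cfrac{x^2}{6 + \cfrac{x^2}{8 + \ldots}}}}.$$

We also have

$$\frac{d}{d\alpha}\,I_n = \frac{1}{2}\left(I_{n-1} + I_{n+1}\right)$$

mimicking the recurrence relation for $J_n'$.

Corollary 245.

$$[0; 1, 2, 3, 4, 5, 6, 7, 8, 9, \ldots] = \frac{I_1(2)}{I_0(2)} = \frac{\sum_{m\geq 0}\frac{1}{m!\,(m+1)!}}{\sum_{m\geq 0}\frac{1}{m!^2}} \approx 0.697774657964.$$

$$[A+D; A+2D, A+3D, A+4D, \ldots] = \frac{I_{A/D}\left(\frac{2}{D}\right)}{I_{1+A/D}\left(\frac{2}{D}\right)}.$$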
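The first identity of Corollary 245 can be tested in a few lines. The sketch below (function names are ours) evaluates the truncated continued fraction $[0; 1, 2, \ldots, N]$ from the innermost term outward and compares it with the ratio of the two series for $I_1(2)$ and $I_0(2)$.

```python
import math


def continued_fraction_0123(depth):
    """Value of the truncated continued fraction [0; 1, 2, ..., depth]."""
    value = float(depth)
    for k in range(depth - 1, 0, -1):
        value = k + 1.0 / value        # builds k + 1/(k+1 + 1/(...))
    return 1.0 / value                 # the leading partial quotient is 0


def bessel_ratio_at_2(terms=40):
    """I_1(2) / I_0(2) through the series sum 1/(m!(m+1)!) and sum 1/m!^2."""
    numerator = sum(1.0 / (math.factorial(m) * math.factorial(m + 1)) for m in range(terms))
    denominator = sum(1.0 / math.factorial(m) ** 2 for m in range(terms))
    return numerator / denominator


if __name__ == "__main__":
    print(continued_fraction_0123(30), bessel_ratio_at_2())
```

Since the partial quotients grow linearly, a couple of dozen levels are already enough to saturate double precision.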

The asymptotic behaviour of $I_0(x)$ for $x \to +\infty$ can be derived from Bessel's differential equation:

$$I_0(x) = \frac{e^x}{\sqrt{2\pi x}}\left(1 + \frac{1}{1!\cdot 8x} + \frac{9}{2!\cdot 8^2 x^2} + \frac{9\cdot 25}{3!\cdot 8^3 x^3} + \ldots\right)$$

where the identity holds in the Poisson sense. If combined with the continued fraction representation for $\frac{I_{m+1}}{I_m}$, it leads to the asymptotic behavior of $I_m(x)$ as $x \to +\infty$, for any $m \in \mathbb{N}$. For a fixed $x \in \mathbb{R}^+$, the continued fraction representation immediately leads to the upper bound

$$I_n(x) \leq \frac{(x/2)^n}{n!}\,I_0(x).$$

If one is just interested in the first term of the asymptotic expansion of $I_0$, it can be recovered in a very simple way: for $s \to 1^+$,

$$(\mathcal{L}I_0)(s) = \frac{1}{\sqrt{s^2-1}} \sim \frac{1}{\sqrt{2}\sqrt{s-1}} = \mathcal{L}\left(\frac{e^x}{\sqrt{2\pi x}}\right)(s).$$

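Relations with the sine and cosine integrals. Since $J_0(x) = \sum_{m=0}^{+\infty}\frac{(-1)^m}{4^m\,m!^2}\,x^{2m}$ we have:

$$\begin{aligned}
\int_0^{\pi/2} J_0(x\cos\theta)\cos\theta\,d\theta &= \sum_{m=0}^{+\infty}\frac{(-1)^m x^{2m}}{4^m\,m!^2}\int_0^{\pi/2}(\cos\theta)^{2m+1}\,d\theta\\
&= \sum_{m=0}^{+\infty}\frac{(-1)^m x^{2m}}{4^m\,m!^2}\cdot\frac{4^m\,m!^2}{(2m+1)!}\\
&= \sum_{m=0}^{+\infty}\frac{(-1)^m x^{2m}}{(2m+1)!} = \frac{\sin x}{x}.
\end{aligned}$$

In a similar way $J_1(x) = \sum_{m=0}^{+\infty}\frac{(-1)^m}{2\cdot 4^m (m+1)\,m!^2}\,x^{2m+1}$ gives:

$$\begin{aligned}
\int_0^{\pi/2} J_1(x\cos\theta)\,d\theta &= \sum_{m=0}^{+\infty}\frac{(-1)^m x^{2m+1}}{2\cdot 4^m (m+1)\,m!^2}\int_0^{\pi/2}(\cos\theta)^{2m+1}\,d\theta\\
&= \sum_{m=0}^{+\infty}\frac{(-1)^m x^{2m+1}}{2\cdot 4^m (m+1)\,m!^2}\cdot\frac{4^m\,m!^2}{(2m+1)!}\\
&= \sum_{m=0}^{+\infty}\frac{(-1)^m x^{2m+1}}{(2m+2)!} = \frac{1-\cos x}{x}.
\end{aligned}$$

As a consequence:

$$\int_0^{+\infty}\frac{\sin x}{x}\,dx = \int_0^{\pi/2}\cos(\theta)\int_0^{+\infty} J_0(x\cos\theta)\,dx\,d\theta = \int_0^{\pi/2}\frac{\cos\theta}{\cos\theta}\,d\theta = \frac{\pi}{2}$$

and similarly

$$\int_0^{+\infty}\frac{1-\cos x}{x^2}\,dx = \int_0^{\pi/2}\int_0^{+\infty}\frac{J_1(x\cos\theta)}{x}\,dx\,d\theta = \int_0^{\pi/2} 1\,d\theta = \frac{\pi}{2}.$$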
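The first of the above identities, $\int_0^{\pi/2} J_0(x\cos\theta)\cos\theta\,d\theta = \frac{\sin x}{x}$, lends itself to a quick numerical test. The following sketch (helper names are ours) uses the power series of $J_0$ and the composite Simpson rule; both choices are just the simplest tools available, assuming moderate values of $x$.

```python
import math


def bessel_j0(x, terms=80):
    """J_0(x) from the power series sum_m (-1)^m x^(2m) / (4^m m!^2)."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -(x * x) / (4.0 * (m + 1) ** 2)
    return total


def simpson(func, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def averaged_j0(x):
    """Integral over [0, pi/2] of J_0(x cos(theta)) cos(theta)."""
    return simpson(lambda t: bessel_j0(x * math.cos(t)) * math.cos(t), 0.0, math.pi / 2)


if __name__ == "__main__":
    for x in (0.5, 2.0, 7.0):
        print(x, averaged_j0(x), math.sin(x) / x)
```

The same helpers can be reused for the companion identity involving $J_1$ and $\frac{1-\cos x}{x}$.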

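$$\sum_{n\geq 0} J_0(n) = \sum_{n\geq 0}\frac{1}{2\pi}\int_0^{2\pi} e^{ni\cos\theta}\,d\theta.$$

The partial sums of this series can be written in the following form:

$$S_N = \sum_{n=0}^{N} J_0(n) = \frac{1}{2\pi}\int_0^{2\pi}\frac{1 - e^{(N+1)i\cos\theta}}{1 - e^{i\cos\theta}}\,d\theta = \frac{1}{2} + \frac{1}{2\pi}\int_0^{\pi}\frac{\sin\left(\left(N+\frac{1}{2}\right)\cos\theta\right)}{\sin\frac{\cos\theta}{2}}\,d\theta$$

by invoking the identities $\cos(\theta+\pi) = -\cos\theta$ and $\frac{1}{1-e^{z}} + \frac{1}{1-e^{-z}} = 1$. Rearranging,

$$S_N = \frac{1}{2} + \frac{1}{2\pi}\int_{-1}^{1}\frac{D_N(x)}{\sqrt{1-x^2}}\,dx$$

where $D_N(x)$ is Dirichlet's kernel. If the coefficients $c_n$ of the Fourier cosine series of $\frac{\mathbb{1}_{(-1,1)}(x)}{\sqrt{1-x^2}}$ were in $\ell^1$, we could immediately state

$$(\heartsuit)\qquad \frac{1}{2\pi}\int_{-1}^{1}\frac{D_N(x)}{\sqrt{1-x^2}}\,dx \to \frac{1}{2\pi}\int_{-1}^{1}\frac{dx}{\sqrt{1-x^2}} + \frac{1}{2}\cdot\left.\frac{1}{\sqrt{1-x^2}}\right|_{x=0} = 1\qquad\text{as } N \to +\infty.$$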

Unluckily, this is not the case: it can be shown that $|c_n| \leq \frac{8}{\pi^2\sqrt{n}}$ for any $n \geq 1$, but the density of the values of $n$ such that $\frac{1}{2} \leq \sqrt{n}\,|c_n| \leq 1$ is positive, hence $\{c_n\}_{n\geq 1} \notin \ell^1$. On the other hand, in order to prove (♥) it is enough to show that the Fourier series of $\frac{\mathbb{1}_{(-1,1)}(x)}{\sqrt{1-x^2}}$ is pointwise convergent at the origin. Since

X 2n Z 1

2

1 − e−s X (2n)!

1(0,1) (x) n 2n −sx 1 1

L √ (s) = x e dx = + + w

1 − x2 4n 0 s 2n n! s2n+1 s

n≥0 n≥1

$$\lim_{s\to+\infty} s\cdot\mathcal{L}\left(\frac{\mathbb{1}_{(0,1)}(x)}{\sqrt{1-x^2}}\right)(s) = 1$$

and the pointwise convergence at the origin for the Fourier series of $\frac{1}{\sqrt{1-x^2}}$ is proved. As a straightforward corollary, we have

$$\sum_{n\geq 0} J_0(n) = \frac{3}{2}.$$


Z +∞

X f (0)

f (n) = + + f (x) dx

2 0

n≥0

Z +∞

X f (0)

f (n) = − + f (x) dx.

2 0

n≥0

The same argument applied to $J_k(x)$ with $k \in \mathbb{N}^+$ leads to $\sum_{n\geq 0} J_k(n) = \int_0^{+\infty} J_k(x)\,dx = 1$.

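From the Jacobi-Anger expansion

$$e^{iz\cos\theta} = J_0(z) + 2\sum_{n=1}^{+\infty} i^n J_n(z)\cos(n\theta)$$

by considering the imaginary parts we get

$$\sin(z\cos\theta) = 2\sum_{m=0}^{+\infty}(-1)^m J_{2m+1}(z)\cos((2m+1)\theta)$$

and since

$$\int_0^{\pi/2}\cos((2n+1)x)\cos((2m+1)x)\,dx = \frac{\pi}{4}\,\delta_{m,n},$$

$$\sum_{m=0}^{+\infty}(-1)^m\cos((2m+1)\theta) = \frac{1}{2\cos\theta},$$

we have

$$\frac{4}{\pi}\int_0^{\pi/2}\frac{\sin(z\cos\theta)}{2\cos\theta}\,d\theta = 2\sum_{m=0}^{+\infty} J_{2m+1}(z),$$

hence:

$$\begin{aligned}
\sum_{m=0}^{+\infty} J_{2m+1}(z) &= \frac{1}{\pi}\int_0^{\pi/2}\frac{\sin(z\cos\theta)}{\cos\theta}\,d\theta = \frac{1}{\pi}\int_0^1\frac{\sin(zt)}{t\sqrt{1-t^2}}\,dt\\
&= \frac{1}{2}\sum_{r=0}^{+\infty}\frac{(-1)^r}{(2r+1)\,4^r\,(r!)^2}\,z^{2r+1} = \frac{1}{2}\int_0^z J_0(u)\,du.
\end{aligned}$$

On the other hand, by evaluating the expansion of $\sin(z\cos\theta)$ at $\theta = 0$:

$$\sum_{m=0}^{+\infty}(-1)^m J_{2m+1}(z) = \frac{1}{2}\sin z.$$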

There are no issues in exploiting the pointwise convergence of a Fourier series since g(θ) = eiz cos θ is an analytic

function.

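The integral of $\sin^2\sin^2$. We have:

$$\begin{aligned}
\int_0^{\pi/2}\sin^2(\sin^2(x))\,dx &= \int_0^{\pi/2}\frac{1-\cos(2\sin^2(x))}{2}\,dx\\
&= \frac{\pi}{4} - \frac{1}{2}\int_0^{\pi/2}\sum_{k\geq 0}\frac{(-1)^k 4^k}{(2k)!}\sin^{4k}(x)\,dx\\
&= \frac{\pi}{4} - \frac{1}{2}\sum_{k\geq 0}\frac{(-1)^k 4^k}{(2k)!}\int_0^{\pi/2}\sin^{4k}(x)\,dx\\
&= \frac{\pi}{4} - \frac{1}{2}\sum_{k\geq 0}\frac{(-1)^k 4^k}{(2k)!}\cdot\frac{(4k)!}{(4^k (2k)!)^2}\cdot\frac{\pi}{2}\\
&= \frac{\pi}{4} - \frac{\pi}{4}\sum_{k\geq 0}\frac{(-1)^k (4k)!}{((2k)!)^3\,4^k}.
\end{aligned}$$

Alternatively:

$$\begin{aligned}
\int_0^{\pi/2}\sin^2(\sin^2(x))\,dx &= \int_0^{\pi/2}\frac{1-\cos(2\sin^2(x))}{2}\,dx = \int_0^{\pi/2}\frac{1-\cos(1-\cos(2x))}{2}\,dx\\
&= \frac{\pi}{4} - \frac{\cos(1)}{2}\int_0^{\pi/2}\cos(\cos(2x))\,dx - \frac{\sin(1)}{2}\int_0^{\pi/2}\sin(\cos(2x))\,dx\\
&= \frac{\pi}{4} - \frac{\cos(1)}{2}\int_0^{\pi/2}\cos(\cos(2x))\,dx = \frac{\pi}{4} - \frac{\cos(1)}{4}\int_0^{\pi}\cos(\cos(x))\,dx\\
&= \frac{\pi}{4} - \frac{\cos(1)\,\pi}{4}\,J_0(1).
\end{aligned}$$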

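$$\sum_{n\geq 0}\frac{(2n)!}{n!^3} = e^2\,I_0(2).$$

Proof. By exploiting the integral representation for central binomial coefficients we have

$$\begin{aligned}
\sum_{n\geq 0}\frac{(2n)!}{n!^3} = {}_1F_1\left(\tfrac{1}{2}; 1; 4\right) &= \sum_{n\geq 0}\frac{2}{\pi}\int_0^{\pi/2}\frac{(2\cos\theta)^{2n}}{n!}\,d\theta = \frac{2}{\pi}\int_0^{\pi/2} e^{4\cos^2\theta}\,d\theta\\
&= \frac{2e^2}{\pi}\int_0^{\pi/2} e^{2\cos(2\theta)}\,d\theta = \frac{e^2}{\pi}\int_0^{\pi} e^{2\cos\varphi}\,d\varphi = \frac{2e^2}{\pi}\int_0^{\pi/2}\cosh(2\cos\varphi)\,d\varphi\\
&= \frac{2e^2}{\pi}\sum_{m\geq 0}\frac{1}{(2m)!}\int_0^{\pi/2}(2\cos\varphi)^{2m}\,d\varphi = e^2\sum_{m\geq 0}\frac{1}{m!^2} = e^2\,{}_1F_2(1; 1, 1; 1) = e^2\,I_0(2).
\end{aligned}$$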

$$S = \sum_{n\geq 0}\frac{(2n)!}{n!^4}$$

Proof. Through the series representation for $I_0(2z)$ and the integral representation for central binomial coefficients we have

$$S = {}_1F_2\left(\tfrac{1}{2}; 1, 1; 4\right) = \frac{2}{\pi}\int_0^{\pi/2} I_0(4\cos\theta)\,d\theta = \frac{2}{\pi}\int_0^1\frac{I_0(4x)}{\sqrt{1-x^2}}\,dx$$

and the RHS can be written as the following double series:

$$\frac{2}{\pi}\sum_{m,n\geq 0}\frac{\binom{2m}{m}\,4^n}{4^m\,n!^2\,(2m+2n+1)} = \sum_{n\geq 0}\frac{\binom{2n}{n}}{n!^2} = \sum_{m,n\geq 0}\frac{1}{m!^2\,n!^2} = I_0(2)^2 = {}_1F_2(1; 1, 1; 1)^2.$$

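Actually $S$ is just one of the coefficients of the Fourier cosine series of $I_0(4\cos\theta)$:
\[
I_0(4\cos\theta) = I_0(2)^2+\sum_{m\ge 1}c_m\cos(2m\theta),\qquad c_m = \frac{1}{\pi}\int_0^{2\pi}\sum_{n\ge 0}\frac{4^n(\cos\theta)^{2n}}{n!^2}\cos(2m\theta)\,d\theta.
\]
Using both $\cos\theta = \operatorname{Re}(e^{i\theta}) = \frac{e^{i\theta}+e^{-i\theta}}{2}$ and the binomial theorem we have
\[
c_m = 2\sum_{n\ge 0}\frac{\binom{2n}{n+m}}{n!^2} = 2\,I_m(2)^2,
\]
so the evaluation at $\theta=0$ leads to
\[
I_0(4) = I_0(2)^2+2\sum_{m\ge 1}I_m(2)^2,
\]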

i.e. to an instance of the duplication formula for the $I_0$ function, which can be seen as a direct consequence of Parseval's identity, too. Since the Fourier coefficients of $I_0(4\cos\theta)$ depend on $I_m^2$, Parseval's identity provides a curious identity about the series of $I_m(2)^4$.
\[
I_0(4z)^2 = \sum_{n\ge 0}\frac{4^n(2n)!}{n!^4}\,z^{2n}
\]
gives
\[
\int_0^{\pi/2}I_0(4\cos\theta)^2\,d\theta = \frac{\pi}{2}\sum_{n\ge 0}\frac{(2n)!^2}{n!^6} = \frac{\pi}{2}\cdot{}_2F_3\!\left(\tfrac12,\tfrac12;1,1,1;16\right)
\]
hence by the orthogonality relations
\[
{}_2F_3\!\left(\tfrac12,\tfrac12;1,1,1;16\right) = \sum_{n\ge 0}\frac{(2n)!^2}{n!^6} = I_0(2)^4+2\sum_{m\ge 1}I_m(2)^4.
\]
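All the identities of this paragraph involve rapidly convergent series, so they are easy to test numerically. The sketch below assumes nothing beyond the power series of $I_m$; the helper names `bessel_i` and `factorial_ratio_sum` are ours.

```python
import math

def bessel_i(m, x, terms=60):
    """Modified Bessel function I_m(x), integer m >= 0, from its power series."""
    return sum((x / 2) ** (2 * k + m) / (math.factorial(k) * math.factorial(k + m))
               for k in range(terms))

def factorial_ratio_sum(p, q, terms=60):
    """Sum over n >= 0 of (2n)!^p / n!^q (each term is an exact integer ratio)."""
    return sum(math.factorial(2 * n) ** p / math.factorial(n) ** q
               for n in range(terms))

i0_2 = bessel_i(0, 2.0)
squares = sum(bessel_i(m, 2.0) ** 2 for m in range(1, 40))
fourths = sum(bessel_i(m, 2.0) ** 4 for m in range(1, 40))

checks = {
    "sum (2n)!/n!^3 = e^2 I_0(2)": (factorial_ratio_sum(1, 3), math.e ** 2 * i0_2),
    "sum (2n)!/n!^4 = I_0(2)^2": (factorial_ratio_sum(1, 4), i0_2 ** 2),
    "I_0(4) = I_0(2)^2 + 2 sum I_m(2)^2": (bessel_i(0, 4.0), i0_2 ** 2 + 2 * squares),
    "sum (2n)!^2/n!^6 = I_0(2)^4 + 2 sum I_m(2)^4": (factorial_ratio_sum(2, 6), i0_2 ** 4 + 2 * fourths),
}
for name, (lhs, rhs) in checks.items():
    print(f"{name}: {lhs:.12f} vs {rhs:.12f}")
```

Truncating every series after a few dozen terms is harmless here, since the terms decay faster than geometrically.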

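\[
\left(\sqrt{x+1}-1\right)^4 = -4\sum_{m\ge 3}\left[\binom{1/2}{m-1}+2\binom{1/2}{m}\right]x^m
\]
\[
\left(\sqrt{1+\frac{1}{x}}-\sqrt{\frac{1}{x}}\right)^4 = -4\sum_{m\ge 1}\left[\binom{1/2}{m+1}+2\binom{1/2}{m+2}\right]x^m
\]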

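hence:
\[
\begin{aligned}
\sum_{n\ge 1}\left(\sqrt{n+1}-\sqrt{n}\right)^4 &= -4\sum_{m\ge 1}\left[\binom{1/2}{m+1}+2\binom{1/2}{m+2}\right]\zeta(m) \\
&= -4\int_0^{+\infty}\frac{dx}{e^x-1}\sum_{m\ge 1}\left[\binom{1/2}{m+1}+2\binom{1/2}{m+2}\right]\frac{x^{m-1}}{(m-1)!} \\
&= \int_0^{+\infty}\frac{2e^{-x/2}\left[x\,I_0\!\left(\frac{x}{2}\right)-4\,I_1\!\left(\frac{x}{2}\right)\right]}{x^2(e^x-1)}\,dx \\
&= 2\int_0^{+\infty}\frac{I_2(x)}{x\,e^x(e^x-1)(e^x+1)}\,dx
\end{aligned}
\]
where the last integrand function is pretty close to $\frac{1}{16}e^{-2x}$, from which it follows that the original series is pretty close to $\frac{1}{16}$. The following integral representation for the Bessel function $I_2$
\[
I_2(x) = \frac{x^2}{3\pi}\int_0^{\pi}\exp(x\cos\theta)\sin^4(\theta)\,d\theta
\]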

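leads to:
\[
\begin{aligned}
\sum_{n\ge 1}\left(\sqrt{n+1}-\sqrt{n}\right)^4 &= \frac{1}{6\pi}\int_0^{\pi}\psi'\!\left(\frac{3-\cos\theta}{2}\right)\sin^4(\theta)\,d\theta \\
&= -1+\frac{1}{6\pi}\int_0^{\pi}\psi'\!\left(\frac{1-\cos\theta}{2}\right)\sin^4(\theta)\,d\theta \\
&= -1+\frac{\pi}{6}\int_0^{\pi/2}\frac{\sin^4(\theta)}{\sin^2\!\left(\pi\sin^2\frac{\theta}{2}\right)}\,d\theta \\
&= -1+\frac{\pi}{3}\int_0^{\pi/4}\frac{\sin^4(2\theta)}{\sin^2\!\left(\pi\sin^2\theta\right)}\,d\theta
\end{aligned}
\]
by the reflection formula for the trigamma function. The last integral can be written in the more symmetric form
\[
\sum_{n\ge 0}\left(\sqrt{n+1}-\sqrt{n}\right)^4 = \frac{4\pi}{3}\int_0^1\frac{x^{3/2}(1-x)^{3/2}}{\sin^2(\pi x)}\,dx = \frac{\pi}{6}\int_0^1\frac{(1-x^2)^{3/2}}{\cos^2\!\left(\frac{\pi x}{2}\right)}\,dx.
\]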

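\[
\left(\sqrt{x+1}-1\right)^3 = \sum_{m\ge 3}\left[4\binom{1/2}{m}+\binom{1/2}{m-1}\right]x^m
\]
\[
\left(\sqrt{1+\frac{1}{x}}-\sqrt{\frac{1}{x}}\right)^3 = \sum_{m\ge 3}\left[4\binom{1/2}{m}+\binom{1/2}{m-1}\right]x^{m-3/2}
\]
\[
\begin{aligned}
\sum_{n\ge 1}\left(\sqrt{n+1}-\sqrt{n}\right)^3 &= \sum_{m\ge 3}\left[4\binom{1/2}{m}+\binom{1/2}{m-1}\right]\zeta\!\left(m-\frac{3}{2}\right) \\
&= \sum_{m\ge 3}\left[4\binom{1/2}{m}+\binom{1/2}{m-1}\right]\frac{1}{\left(m-\frac{5}{2}\right)!}\int_0^{+\infty}\frac{x^{m-5/2}}{e^x-1}\,dx \\
&= \int_0^{+\infty}\frac{3e^{-x}\left(2-2e^x+x+e^x x\right)}{2\sqrt{\pi}\,x^{5/2}(e^x-1)}\,dx \\
&= \frac{3}{\sqrt{\pi}}\int_0^{+\infty}\left[\frac{1}{2x^{3/2}e^x}-\frac{1}{x^{5/2}e^x}+\frac{1}{x^{3/2}e^x(e^x-1)}\right]dx
\end{aligned}
\]
hence
\[
\sum_{n\ge 0}\left(\sqrt{n+1}-\sqrt{n}\right)^3 = \frac{3}{2\pi}\,\zeta\!\left(\frac{3}{2}\right)
\]
just follows from integration by parts, Frullani's Theorem and the integral representation for the $\zeta$ function.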

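The same approach allows an explicit evaluation in terms of the $\zeta$ function for any odd value of $s$. For instance:
\[
\sum_{n\ge 0}\left(\sqrt{n+1}-\sqrt{n}\right)^5 = \frac{15}{2\pi^2}\,\zeta\!\left(\frac{5}{2}\right),\qquad \sum_{n\ge 0}\left(\sqrt{n+1}-\sqrt{n}\right)^7 = \frac{7}{2\pi}\,\zeta\!\left(\frac{3}{2}\right)-\frac{105}{2\pi^3}\,\zeta\!\left(\frac{7}{2}\right)
\]
\[
\sum_{n\ge 0}\left(\sqrt{n+1}-\sqrt{n}\right)^9 = \frac{90}{2\pi^2}\,\zeta\!\left(\frac{5}{2}\right)-\frac{945}{2\pi^4}\,\zeta\!\left(\frac{9}{2}\right).
\]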
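The closed forms for $s\in\{3,5,7,9\}$ can be tested against a direct summation. In the sketch below both $\zeta$ and the series are evaluated as a partial sum plus a tail estimate (Euler-Maclaurin for $\zeta$, the midpoint rule for the series); the function names and the cut-offs `N` are our own choices, not part of the notes.

```python
import math

def zeta(s, N=1000):
    """zeta(s) for real s > 1: partial sum plus the first Euler-Maclaurin tail terms."""
    head = sum(n ** -s for n in range(1, N))
    tail = N ** (1 - s) / (s - 1) + 0.5 * N ** -s + s * N ** (-s - 1) / 12
    return head + tail

def root_gap_sum(s, N=200_000):
    """Sum over n >= 0 of (sqrt(n+1) - sqrt(n))^s, for s > 2."""
    head = sum((1.0 / (math.sqrt(n + 1) + math.sqrt(n))) ** s for n in range(N))
    # For n >= N each term is very close to 2^-s (n + 1/2)^(-s/2): midpoint-rule tail
    tail = 2.0 ** -s * N ** (1 - s / 2) / (s / 2 - 1)
    return head + tail

pi = math.pi
closed_forms = {
    3: 3 / (2 * pi) * zeta(1.5),
    5: 15 / (2 * pi ** 2) * zeta(2.5),
    7: 7 / (2 * pi) * zeta(1.5) - 105 / (2 * pi ** 3) * zeta(3.5),
    9: 90 / (2 * pi ** 2) * zeta(2.5) - 945 / (2 * pi ** 4) * zeta(4.5),
}
for s, value in closed_forms.items():
    print(s, root_gap_sum(s), value)
```

Writing $\sqrt{n+1}-\sqrt{n}$ as $1/(\sqrt{n+1}+\sqrt{n})$ avoids the cancellation that would otherwise spoil the small terms.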


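Relations between Hermite polynomials and Bessel functions. Starting with the generating function
\[
e^{2xt-t^2} = \sum_{n\ge 0}H_n(x)\frac{t^n}{n!}
\]
and replacing $t$ with $te^{i\theta}$ we get
\[
\exp\left(2xte^{i\theta}-t^2e^{2i\theta}\right) = \sum_{n\ge 0}H_n(x)\,e^{ni\theta}\frac{t^n}{n!},
\]
hence by Parseval's identity
\[
\int_{-\pi}^{\pi}\exp\left(4xt\cos\theta-2t^2\cos(2\theta)\right)d\theta = 2\pi\sum_{n\ge 0}H_n(x)^2\frac{t^{2n}}{n!^2}. \qquad (3)
\]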

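Now we can multiply both sides of (3) by $e^{-x^2}e^{kix}$ and apply $\int_{\mathbb{R}}(\ldots)\,dx$ to get
\[
\sqrt{\pi}\,e^{-k^2/4}\int_{-\pi}^{\pi}e^{2t^2-2ikt\cos\theta}\,d\theta = 2\pi\sum_{n\ge 0}\frac{t^{2n}}{n!^2}\int_{\mathbb{R}}H_n(x)^2e^{-x^2}e^{kix}\,dx
\]
\[
\sqrt{\pi}\,e^{-k^2/4}\,e^{2t^2}J_0(2kt) = \sum_{n\ge 0}\frac{t^{2n}}{n!^2}\int_{\mathbb{R}}H_n(x)^2e^{-x^2}e^{kix}\,dx
\]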

and the wanted integral can be recovered from the Cauchy product between the Taylor series of $e^{2t^2}$ and the Taylor series of $J_0(2kt)$:
\[
\int_{\mathbb{R}}H_n(x)^2e^{-x^2}e^{kix}\,dx = \sqrt{\pi}\,e^{-k^2/4}\,n!^2\cdot[t^{2n}]\sum_{a,b\ge 0}\frac{2^a(-1)^bk^{2b}\,t^{2a+2b}}{a!\,b!^2}
\]
such that:
\[
\int_{\mathbb{R}}H_n(x)^2e^{-x^2}e^{kix}\,dx = \sqrt{\pi}\,e^{-k^2/4}\,n!^2\sum_{b=0}^{n}\frac{2^{n-b}(-1)^bk^{2b}}{(n-b)!\,b!^2}.
\]
One could place the right hand side of the last equation into the more known form of
\[
\sqrt{\pi}\,2^n\,n!\,e^{-k^2/4}\,L_n\!\left(\frac{k^2}{2}\right),
\]
where $L_n$ is the $n$-th Laguerre polynomial.
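A quick numerical test of the last two formulas, for small $n$. This is a sketch under our own conventions: $H_n$ is generated through the recurrence $H_{k+1}=2xH_k-2kH_{k-1}$, the integral over $\mathbb{R}$ is truncated to $[-12,12]$ and computed by Simpson's rule, and only the cosine part of $e^{kix}$ is kept since the sine part vanishes by parity.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integral(n, k):
    """Quadrature for the integral of H_n(x)^2 exp(-x^2) cos(kx) over the real line."""
    f = lambda x: hermite(n, x) ** 2 * math.exp(-x * x) * math.cos(k * x)
    return simpson(f, -12.0, 12.0, 24_000)

def finite_sum(n, k):
    """sqrt(pi) e^{-k^2/4} n!^2 sum_b 2^(n-b) (-1)^b k^(2b) / ((n-b)! b!^2)."""
    s = sum(2 ** (n - b) * (-1) ** b * k ** (2 * b)
            / (math.factorial(n - b) * math.factorial(b) ** 2) for b in range(n + 1))
    return math.sqrt(math.pi) * math.exp(-k * k / 4) * math.factorial(n) ** 2 * s

def laguerre_form(n, k):
    """sqrt(pi) 2^n n! e^{-k^2/4} L_n(k^2/2), with L_n expanded explicitly."""
    y = k * k / 2
    ln = sum(math.comb(n, b) * (-y) ** b / math.factorial(b) for b in range(n + 1))
    return math.sqrt(math.pi) * 2 ** n * math.factorial(n) * math.exp(-k * k / 4) * ln

for n in range(5):
    print(n, integral(n, 1.5), finite_sum(n, 1.5), laguerre_form(n, 1.5))
```

For $k=0$ the formula collapses to the usual orthogonality constant $\sqrt{\pi}\,2^n n!$.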

8.1 The Gauss circle problem

Disclaimer: most of the contents of this subsection are freely adapted from the books, notes and articles of T. Jameson, A. Ivic, J. Bell and G.N. Watson. Let $N(R)$ be the number of lattice points in the region $\{(x,y) : x^2+y^2\le R^2\}$ and let $r_2(N)$ be the following arithmetic function:
\[
r_2(N) = \left|\{(x,y)\in\mathbb{Z}^2 : x^2+y^2 = N\}\right|.
\]
We have
\[
N(R) = \sum_{x=-R}^{R}\left(1+2\left\lfloor\sqrt{R^2-x^2}\right\rfloor\right) = \sum_{N=0}^{R^2}r_2(N)
\]
and a reasonable claim is that $N(R)$, for sufficiently large values of $R$, is close to the area of a circle with radius $R$, i.e. $\pi R^2$. Since $\mathbb{Z}[i]$ (the ring of Gaussian integers) is a Euclidean domain, $r_2(N)$ only depends on the prime factors of $N$. Such numbers of representations can be shown to be four times a multiplicative function, since $\mathbb{Z}[i]$ has four invertible elements:

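\[
r_2(N) = 4(\chi_4 * 1)(N) = 4\sum_{d\mid N}\chi_4(d),\qquad \chi_4(d) = \begin{cases} 1 & \text{if } d\equiv 1 \pmod 4,\\ -1 & \text{if } d\equiv 3 \pmod 4,\\ 0 & \text{if } d\equiv 0 \pmod 2.\end{cases}
\]
The associated Dirichlet series is
\[
L(r_2,s) \stackrel{\text{def}}{=} \sum_{N\ge 1}\frac{r_2(N)}{N^s} = 4\,\zeta(s)\,L(\chi_4,s),\qquad L(\chi_4,s) = \sum_{n\ge 0}\frac{(-1)^{n}}{(2n+1)^s},
\]
where
\[
L(\chi_4,1) = \sum_{n\ge 0}\frac{(-1)^{n}}{2n+1} = \int_0^1\frac{dx}{1+x^2} = \frac{\pi}{4}
\]
and
\[
\zeta(s) = \frac{1}{s-1}+\gamma+O(s-1)
\]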

as $s\to 1^+$. Summation by parts also gives
\[
L(r_2,s) = \sum_{n\ge 1}N(\sqrt{n})\left(\frac{1}{n^s}-\frac{1}{(n+1)^s}\right) = \sum_{n\ge 1}N(\sqrt{n})\left[\frac{1}{n(n+1)}+(s-1)\cdot O\!\left(\frac{\log n}{n^2}\right)\right]
\]

hence, if $N(R)/R^2$ is convergent as $R\to+\infty$, its limit has to be $4L(\chi_4,1)=\pi$. This can be shown through a simple geometric argument. Let us consider $U(R)$ as the union of the squares centered at the lattice points of $\{(x,y) : x^2+y^2\le R^2\}$, all of them having unit side length. $U(R)$ is contained in a circle having radius $R+\frac{\sqrt{2}}{2}$. Conversely, $U(R)$ contains the circle with radius $R-\frac{\sqrt{2}}{2}$. Since the area of $U(R)$ equals $N(R)$, it follows that
\[
\pi\left(R-\frac{\sqrt{2}}{2}\right)^2 \le N(R) \le \pi\left(R+\frac{\sqrt{2}}{2}\right)^2
\]
hence
\[
\left|N(R)-\pi R^2\right| \le (\pi\sqrt{2}+\varepsilon)R = O(R)
\]
as $R\to+\infty$.
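Gauss' bounds are easy to test by brute force. The sketch below (function names are ours) counts lattice points through the column-by-column formula for $N(R)$, recomputes the same number from $r_2 = 4(\chi_4 * 1)$, and checks the two-sided geometric estimate for integer radii.

```python
import math

def lattice_count(R):
    """N(R) for an integer R >= 0: sum over x of 1 + 2*floor(sqrt(R^2 - x^2))."""
    return sum(1 + 2 * math.isqrt(R * R - x * x) for x in range(-R, R + 1))

def chi4(d):
    """The non-principal Dirichlet character modulo 4."""
    return 0 if d % 2 == 0 else (1 if d % 4 == 1 else -1)

def r2(n):
    """Number of (x, y) in Z^2 with x^2 + y^2 = n, via 4 * sum of chi_4 over divisors."""
    if n == 0:
        return 1
    return 4 * sum(chi4(d) for d in range(1, n + 1) if n % d == 0)

def brute_force(R):
    """Direct count of the lattice points of the closed disk with radius R."""
    return sum(1 for x in range(-R, R + 1) for y in range(-R, R + 1)
               if x * x + y * y <= R * R)

for R in (1, 5, 10, 50, 200):
    n = lattice_count(R)
    print(R, n, n - math.pi * R * R)
```

The printed differences $N(R)-\pi R^2$ stay far below the trivial $O(R)$ bound, in agreement with the improvements discussed in the rest of the section.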

Soon after Gauss’ work, mathematicians wondered about the optimality of the bound O(R) for the difference between

N (R) and the area of the circle with radius R. Since the Dirichlet L-function L(χ4 , s) has an infinitude of zeroes in

the strip 0 ≤ Re(s) ≤ 1, it can be shown (as done by Hardy) that the bound for the error term cannot be improved

beyond $O(R^{1/2})$. The purpose of the final part of this section is to prove that it can be improved as follows:


The classical proofs revolve around a few key ingredients. The first fact is that the sum
\[
\sum_{x=-R}^{R}\left(1+2\sqrt{R^2-x^2}\right)
\]
can be estimated through the trapezoid method or Simpson's rule, hence the problem is reduced to an accurate estimation of
\[
\sum_{x=-R}^{R}\left\{\sqrt{R^2-x^2}\right\}
\]
where $\{z\}$ stands for the fractional part of $z$. The Fourier sine series of $\rho(z)=\frac{1}{2}-\{z\}$ is well-known, the discrete Fourier transform of $\{\sqrt{R^2-x^2}\}$ is related to Bessel functions of the first kind, with a known asymptotic behavior. In order to partially compensate the erratic behaviour of $r_2(n)$ over the integers, actual bounds for $N(R)$ are produced by considering an averaged version of $r_2(n)$ over suitably short intervals. Let $N'(R)$ denote the number of pairs of integers $(m,n)$ satisfying $m^2+n^2\le R^2$, $m>0$, $n\ge 0$ and let $M=\left\lfloor\frac{R}{\sqrt{2}}\right\rfloor$, $f(x)=\sqrt{R^2-x^2}-x$. Then
\[
N'(R) = M+\lfloor R\rfloor+2\sum_{m=1}^{M}\lfloor f(m)\rfloor = \lfloor R\rfloor+2\sum_{m=1}^{M}f(m)+2\sum_{m=1}^{M}\rho(f(m))
\]
hence
\[
N'(R) = \frac{\pi R^2}{4}+2\sum_{m=1}^{M}\rho(f(m))+O(1).
\]

If we define $P(x)$ as $N'(\sqrt{x})-\frac{\pi x}{4}$, from the fact that $N'(\sqrt{x})$ is increasing we get
\[
P(X)\le P(X+y)+\frac{\pi y}{4},\qquad P(X)\ge P(X-y)-\frac{\pi y}{4},
\]
hence by integration and the triangle inequality we have
\[
|P(X)| \le \frac{1}{Y}\max\left(\left|\int_{X-Y}^{X}P(x)\,dx\right|,\ \left|\int_{X}^{X+Y}P(x)\,dx\right|\right)+\frac{\pi Y}{8}
\]
yielding a bound for $P(X)$, given a bound for the average of $P(x)$ over short intervals. In the following manipulations we will assume $Y\ll\sqrt{X}$, such that $x=X+O(Y)$ ensures $\sqrt{x}=\sqrt{X}+O(1)$. We have:
\[
P(x) = 2\sum_{m\le\sqrt{X/2}}\rho\!\left(\sqrt{x-m^2}\right)+O(1),
\]
\[
\int_{X-Y}^{X+Y}P(x)\,dx = 2\sum_{m\le\sqrt{X/2}}\int_{X-Y}^{X+Y}\rho\!\left(\sqrt{x-m^2}\right)dx+O(Y).
\]
By using the Fourier series of the fractional part provided by Bernoulli polynomials we obtain
\[
\int_{X-Y}^{X+Y}P(x)\,dx = O(\sqrt{X})+\frac{2}{\pi^2}\operatorname{Re}\sum_{h\ge 1}\frac{1}{h^2}\sum_{m\le\sqrt{X/2}}\sqrt{X-m^2}\left[e\!\left(h\sqrt{X-Y-m^2}\right)-e\!\left(h\sqrt{X+Y-m^2}\right)\right] \qquad (1)
\]
where $e(z)$ is the common shorthand notation for $e^{2\pi iz}$. The original problem is now converted into the approximated evaluation of exponential sums. Such task can be accomplished by recalling two classical results due to Kusmin, Landau and Van der Corput.

Theorem 250 (Kusmin-Landau). Let $f_A,\ldots,f_B$ be real numbers (where $A,B$ are integers with $A<B$) and let $\delta_n=f_n-f_{n-1}$. Suppose that $\delta_n$ is a weakly monotonic function of $n$. Suppose also that there is some integer $K$ and some $\delta\in\left(0,\frac{1}{2}\right]$ such that $\delta_n\in[K+\delta,K+1-\delta]$ for all $n$. Then
\[
|S| \stackrel{\text{def}}{=} \left|\sum_{n=A}^{B}e(f_n)\right| \le \frac{2}{\sin(\pi\delta)}.
\]


If $\delta_n$ is decreasing we may negate all the $f_n$s, which simply turns $S$ into $\overline{S}$, which has the same size as $S$. Then we may assume without loss of generality that $\delta_n$ is non-decreasing. We may also assume that $K=0$, by replacing $f_n$ by $f_n-Kn$. The proof of the Kusmin-Landau theorem relies on a clever trick to express $e(f_n)$ as a difference, then on the application of summation by parts. We begin by writing (for $A<n\le B$)
\[
e(f_n)-e(f_{n-1}) = e(f_n)\left(1-e(-\delta_n)\right),
\]
then setting
\[
g_n = \frac{1}{1-e(-\delta_n)} = \frac{1}{2}-\frac{i}{2}\cot(\pi\delta_n)
\]
such that
\[
e(f_n) = g_n\,e(f_n)-g_n\,e(f_{n-1})
\]
and
\[
|1-g_n| = |\overline{g_n}| = |g_n| = \frac{1}{2\sin(\pi\delta_n)}.
\]
The expression on the RHS is positive because from our assumptions we have $0<\delta_n<1$.
By using summation by parts we get
\[
S = (1-g_{A+1})\,e(f_A)+g_B\,e(f_B)+\sum_{n=A+1}^{B-1}(g_n-g_{n+1})\,e(f_n)
\]
\[
|S| \le |1-g_{A+1}|+|g_B|+\sum_{n=A+1}^{B-1}|g_n-g_{n+1}| \le \frac{1}{\sin(\pi\delta_{A+1})}+\frac{1}{\sin(\pi\delta_B)} \le \frac{2}{\sin(\pi\delta)}.
\]
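To see the cancellation at work, here is a small numerical experiment. The phase $f(n)=0.3\,n+10^{-4}n^2$ is a hypothetical example of ours, chosen so that the increments $\delta_n$ are increasing and stay inside $[0.3,0.5]$ (hence $K=0$); the sketch simply compares $|S|$ with the bound $\frac{2}{\sin(\pi\delta)}$ of the theorem.

```python
import cmath
import math

def e(z):
    """The shorthand e(z) = exp(2 pi i z)."""
    return cmath.exp(2j * math.pi * z)

def exponential_sum(f, A, B):
    """S = sum of e(f(n)) for A <= n <= B."""
    return sum(e(f(n)) for n in range(A, B + 1))

def phase(n):
    """Hypothetical phase with slowly increasing increments (an assumption of this sketch)."""
    return 0.3 * n + 1e-4 * n * n

A, B = 0, 1000
deltas = [phase(n) - phase(n - 1) for n in range(A + 1, B + 1)]
# Largest delta such that every increment lies in [delta, 1 - delta]
delta = min(min(deltas), 1 - max(deltas))
bound = 2 / math.sin(math.pi * delta)
S = exponential_sum(phase, A, B)

print(f"terms: {B - A + 1}, |S| = {abs(S):.4f}, Kusmin-Landau bound = {bound:.4f}")
```

A sum of about a thousand unimodular terms turns out to be bounded by an absolute constant, depending only on how far the increments stay from the integers.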

Suppose we have a differentiable function $f(x)$ with $f'(x)$ being monotonic and fulfilling $f'(x)\in[K+\delta,K+1-\delta]$ for all $x\in[A,B]$. Then with $f_n=f(n)$ we have $\delta_n=f(n)-f(n-1)=\int_{n-1}^{n}f'(x)\,dx$ which clearly satisfies the conditions for the applicability of the Kusmin-Landau theorem. Since we also have $\sin(\pi\delta)\ge 2\delta$ for $\delta\in\left(0,\frac{1}{2}\right]$, we may write
\[
\left|\sum_{n=A}^{B}e(f(n))\right| \le \frac{1}{\delta}.
\]
Also, if $A=B$ then this sum contains only one term and it has modulus 1, so that this bound holds trivially (since $\frac{1}{\delta}\ge 2$). Suppose now that we have a twice differentiable $f(x)$ defined for $x\in[a,b]$, such that $0<\lambda\le f''(x)\le h\lambda$. Here we have $a<b$ and $a,b$ need not be integers. This implies that $h\ge 1$ and that $f'(x)$ is strictly increasing. From the Kusmin-Landau theorem we will derive a bound for
\[
S \stackrel{\text{def}}{=} \sum_{a\le n\le b}e(f(n)).
\]
Let $f'(a)=\alpha$ and $f'(b)=\beta$. For a free parameter $\delta\in\left(0,\frac{1}{2}\right]$ we partition the interval $[\alpha,\beta]$ into sub-intervals of the form $I_n=(n-\delta,n+\delta)$ (containing the reals close to the integer $n$), and of the form $J_n=[n+\delta,n+1-\delta]$ (containing reals at least $\delta$-apart from the integers). Accordingly, we split the sum $S$ into subsums over ranges for $x$ corresponding to these ranges for $f'(x)$. The condition that $I_n\subseteq[\alpha,\beta]$ is equivalent to the condition $n\in(\alpha-\delta,\beta+\delta)$, and the number of $n$s satisfying this is $\le\beta-\alpha+2$. Similarly, the condition that $J_n\subseteq[\alpha,\beta]$ is equivalent to the condition $n\in(\alpha-1+\delta,\beta-\delta)$ and the number of $n$s satisfying this is $\le\beta-\alpha+2$ as well. By invoking the Kusmin-Landau theorem we have
\[
|S| \le (\beta-\alpha+2)\left(\frac{1}{\delta}+\frac{2\delta}{\lambda}+1\right)
\]
and this expression is minimized by choosing $\delta=\sqrt{\lambda/2}$, leading to:
\[
|S| \le (\beta-\alpha+2)\left(\sqrt{\frac{8}{\lambda}}+1\right).
\]

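The square root appearing in the RHS is crucial in encoding the cancellations in the involved exponential sum.
When dealing with weighted exponential sums, we may observe that
\[
\sum_{a<n\le b}g(n)\,c_n = \int_{a^+}^{b^+}g(x)\,d\!\left(\sum_{a<n\le x}c_n\right) = g(b)\sum_{a<n\le b}c_n-\int_a^b g'(x)\sum_{a<n\le x}c_n\,dx,
\]
\[
\left|\sum_{a<n\le b}g(n)\,c_n\right| \le \left(|g(b)|+\int_a^b|g'(x)|\,dx\right)\max_{a<x\le b}\left|\sum_{a<n\le x}c_n\right|
\]
which is very practical in inserting (or removing, depending on which way around we read things) a smooth slowly varying weight $g(x)$. In the current case Van der Corput's theorem leads to
\[
\sum_{m\le t}e\!\left(h\sqrt{X-m^2}\right) \ll h\left(\frac{h}{\sqrt{X}}\right)^{-1/2} = X^{1/4}h^{1/2}
\]
for $t\le\sqrt{X/2}+O(1)$, then summation by parts grants
\[
\sum_{m\le t}\sqrt{X-m^2}\;e\!\left(h\sqrt{X-m^2}\right) \ll X^{3/4}h^{1/2}
\]
and
\[
\sum_{m\le\sqrt{X/2}}\sqrt{X-m^2}\left[e\!\left(h\sqrt{X-Y-m^2}\right)-e\!\left(h\sqrt{X+Y-m^2}\right)\right] \ll X^{3/4}h^{1/2}\cdot\sqrt{X}\cdot\frac{hY}{X} = X^{1/4}h^{3/2}\,Y.
\]
\[
\begin{aligned}
\int_{X-Y}^{X+Y}P(x)\,dx &\ll Y\,X^{1/4}\sum_{h\le X^{1/2}/Y}h^{-1/2}+X^{3/4}\sum_{h>X^{1/2}/Y}h^{-3/2} \\
&\ll Y\,X^{1/4}\left(\frac{X^{1/2}}{Y}\right)^{1/2}+X^{3/4}\left(\frac{X^{1/2}}{Y}\right)^{-1/2} \\
&\ll (YX)^{1/2}
\end{aligned}
\]
\[
P(X) \ll Y+\left(\frac{X}{Y}\right)^{1/2}
\]
which is clearly minimized by taking $Y=X^{1/3}$, finally giving
\[
P(X) \ll X^{1/3}.
\]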

It should not take too much imagination to envisage that there is nothing spectacularly important about us restricting

to a circle in all of this, so that (using mainly just a little notational gameplay) we can get similar results for simple

closed curves satisfying certain conditions (like twice differentiability and radius of curvature bounded above and

below). We have had formulae with square root signs all over the place coming from Pythagoras, but in essence we

are just using linearising tricks. More usually a proof is given using the Poisson summation formula and bounds for


the resulting “exponential integrals”: this is exactly the second approach we are going to outline.

\[
\frac{2^{s-p-1}\,\Gamma(s/2)}{\Gamma(p+1-s/2)},\qquad 0<\operatorname{Re}(s)=\sigma<p+\frac{3}{2}
\]
is the Mellin transform of $x^{-p}J_p(x)$. This brings to the table the Perelli S-class of Dirichlet series.
If $a_n$ is an arithmetic function and the associated Dirichlet series
\[
L(a,s) = \sum_{n\ge 1}\frac{a_n}{n^s},\qquad \operatorname{Re}(s)>1
\]
has an analytic continuation to $\mathbb{C}$ whose only possible pole is at $s=1$, and there are $A,a_1,\ldots,a_g$ such that
\[
\gamma(s) = A^s\prod_{j=1}^{g}\Gamma(a_js)
\]
fulfills the reflection formula $\gamma(s)L(a,s)=\gamma(1-s)L(a,1-s)$, then by assuming $f\in S(\mathbb{R})$ and defining
\[
M(f)(s) = \int_0^{+\infty}x^{s-1}f(x)\,dx,\qquad H(x) = \frac{1}{2\pi i}\int_{\operatorname{Re}(s)=\frac{3}{2}}\frac{\gamma(s)}{\gamma(1-s)}\,x^{-s}\,ds,\qquad g(x) = \int_0^{+\infty}f(y)H(xy)\,dy
\]
we have:
\[
\sum_{n\ge 1}a_nf(n) = f(0)L(a,0)+\operatorname*{Res}_{s=1}M(f)(s)L(a,s)+\sum_{n\ge 1}a_ng(n).
\]

Since $f$ is a Schwartz function its Mellin transform $M(f)$ is holomorphic on $\operatorname{Re}(s)>0$. Over such region integration by parts ensures
\[
M(f)(s) = -\frac{1}{s}\,M(f')(s+1),
\]
hence $M(f)$ has an analytic continuation to $\mathbb{C}$ possibly with poles at $0,-1,-2,\ldots$.
Denoting $M(f)$ as $F$, the Mellin inversion formula grants
\[
\sum_{n\ge 1}a_nf(n) = \sum_{n\ge 1}\frac{a_n}{2\pi i}\int_{\operatorname{Re}(s)=\frac{3}{2}}n^{-s}F(s)\,ds = \frac{1}{2\pi i}\int_{\operatorname{Re}(s)=\frac{3}{2}}F(s)L(a,s)\,ds.
\]
The only possible pole of $L(a,s)$ is at $s=1$. From $M(f)(s)=-\frac{1}{s}M(f')(s+1)$ the only possible pole of $F(s)$ in the half-plane $\operatorname{Re}(s)>-1$ is at $s=0$, and the residue of $F(s)$ at $s=0$ is
\[
-M(f')(1) = -\int_0^{+\infty}f'(x)\,dx = f(0),
\]
so the residue of $F(s)L(a,s)$ at $s=0$ is $f(0)L(a,0)$. By the residue theorem, taking as given that $F(s)L(a,s)\to 0$ uniformly in $-\frac{1}{2}\le\operatorname{Re}(s)\le\frac{3}{2}$ as $|\operatorname{Im}(s)|\to\infty$, we have
\[
\sum_{n\ge 1}a_nf(n) = f(0)L(a,0)+\operatorname*{Res}_{s=1}F(s)L(a,s)+\frac{1}{2\pi i}\int_{\operatorname{Re}(s)=-\frac{1}{2}}F(s)L(a,s)\,ds
\]

by shifting the integration line. Now we may exploit the reflection formula for $L(a,s)$, introducing $G(s)=F(1-s)\frac{\gamma(s)}{\gamma(1-s)}$. After the substitution $s\mapsto 1-s$, the last integral in the previous line turns into $\int_{\operatorname{Re}(s)=\frac{3}{2}}G(s)L(a,s)\,ds$, and it is tedious but straightforward to check that
\[
\frac{1}{2\pi i}\int_{\operatorname{Re}(s)=\frac{3}{2}}x^{-s}G(s)\,ds = g(x),
\]
proving the Lemma through the Mellin inversion formula. Since $L(r_2,s)$ belongs to the Perelli S-class and
\[
L(\chi_4,s) = 2^{1-2s}\,\frac{\pi^{(s+1)/2}\,\Gamma\!\left(\frac{2-s}{2}\right)}{\pi^{(2-s)/2}\,\Gamma\!\left(\frac{1+s}{2}\right)}\,L(\chi_4,1-s)
\]
holds as a consequence of the Poisson summation formula, we have the following resummation formula for $r_2$:
\[
\sideset{}{'}\sum_{a\le n\le b}r_2(n)f(n) = \pi\int_a^bf(x)\,dx+\sum_{n\ge 1}r_2(n)\int_a^bf(x)\,J_0(2\pi\sqrt{xn})\,dx
\]
where $f(x)$ is a suitably smooth function and $\sum'$ denotes that at $n=a$ or $n=b$ the summand is to be halved if $a$ or $b$ is an integer (see footnote 11).

In view of the non-negativity of $r_2(n)$ we have
\[
\sum_{n\ge 1}f_-(n)\,r_2(n) \le \sum_{X<n\le 2X}r_2(n) \le \sum_{n\ge 1}f_+(n)\,r_2(n)
\]
where $f_-$ is a smooth, non-negative function supported in $[X,2X]$ such that $f_-(x)=1$ for $x\in[X+G,2X-G]$ ($X^{\varepsilon}\le G\le\sqrt{X}$), while similarly $f_+$ is supported in $[X-G,2X+G]$ and satisfies $f_+(x)=1$ for $x\in[X,2X]$. If henceforth we denote by $f(x)$ either $f_-(x)$ or $f_+(x)$, then $f^{(r)}(x)\ll_r G^{-r}$ ($r=0,1,2,\ldots$) and by the Voronoi summation formula
\[
\sum_{n\ge 1}f(n)\,r_2(n) = \pi X+O(G)+\sum_{n\ge 1}r_2(n)\int_{X-G}^{2X+G}f(x)\,J_0(2\pi\sqrt{xn})\,dx.
\]
From the theory of Bessel functions we recall the identity $\frac{d}{dz}\left[z^{\nu}J_{\nu}(z)\right]=z^{\nu}J_{\nu-1}(z)$ and the bound $J_{\nu}(z)\ll\frac{1}{\sqrt{z}}$ for $z\to+\infty$. It follows that for small values of $n$ ($n\le Y$) the integral appearing in the RHS of the last line is $\ll X^{1/4}n^{-3/4}$, and by invoking $\sum_{n\le x}r_2(n)\ll x$ and summation by parts we have
\[
\sum_{n\le Y}r_2(n)\,X^{1/4}n^{-3/4} \ll (XY)^{1/4}.
\]
By using the recurrence relation for $J_{\nu}'$, performing two integrations by parts and noting that the support of $f''$ has measure $\ll G$, we obtain that
\[
\begin{aligned}
\sum_{n>Y}r_2(n)\int_{X-G}^{2X+G}f(x)\,J_0(2\pi\sqrt{xn})\,dx &= \frac{1}{\pi^2}\sum_{n>Y}\frac{r_2(n)}{n}\int_{X-G}^{2X+G}f''(x)\,x\,J_2(2\pi\sqrt{xn})\,dx \\
&\ll \sum_{n>Y}r_2(n)\,n^{-5/4}\,G^{-1}X^{3/4} \ll X^{3/4}\,G^{-1}\,Y^{-1/4}.
\end{aligned}
\]

\[
\sum_{n\ge 1}f(n)\,r_2(n) = \pi X+O(G)+O\!\left((XY)^{1/4}\right)+O\!\left(X^{3/4}G^{-1}Y^{-1/4}\right)
\]
and, by choosing $G=Y=X^{1/3}$,
\[
\sum_{X<n\le 2X}r_2(n) = \pi X+O(X^{1/3})
\]

11 A resummation formula exists every time a Dirichlet L-series has a suitably structured reflection formula, and vice-versa.


is finally proved. Various authors have exploited the oscillations of $J_0$ and $J_1$ to refine the previous bounds. For instance Huxley has proved (in 2003) that the exponent $\frac{2}{3}$ can be replaced with the slightly smaller $\frac{131}{208}$. In the opposite direction, Hardy has shown in 1925 that $N(R)-\pi R^2$ is as large as $R^{1/2}$ infinitely often, by exploiting the reflection formula
\[
\sum_{n\ge 0}\frac{r_2(n)}{\sqrt{n+a}}\,e^{-2\pi\sqrt{(n+a)b}} = \sum_{n\ge 0}\frac{r_2(n)}{\sqrt{n+b}}\,e^{-2\pi\sqrt{(n+b)a}}
\]
due to Ramanujan. The techniques outlined in this section can be applied to the divisor problem too, concerning the number of lattice points in the first quadrant under a rectangular hyperbola. We have
\[
D(x) = \sideset{}{'}\sum_{n\le x}d(n) = x(\log x+2\gamma-1)+\frac{1}{4}+\Delta(x)
\]
where $\Delta(x)=O(x^{1/2})$ follows from simple geometric arguments. Voronoi's summation formula for $d(n)$ involves Bessel functions of the second kind and it allows to prove that $\Delta(x)=O(x^{1/3}\log x)$. In the opposite direction, $\Delta(x)$ is as large as $x^{1/4}\log x$ infinitely often.
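The "simple geometric argument" for the divisor problem is the hyperbola method, which also gives a fast way of computing $D(x)$. In the sketch below (function names are ours) $\Delta$ is sampled at half-integers $x=n+\frac{1}{2}$, so that the halving convention of $\sum'$ never comes into play.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def divisor_summatory(n):
    """Sum of d(m) for 1 <= m <= n via the hyperbola method: 2*sum floor(n/k) - floor(sqrt n)^2."""
    r = math.isqrt(n)
    return 2 * sum(n // k for k in range(1, r + 1)) - r * r

def divisor_count(n):
    """d(n) by trial division, only used as a cross-check."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def delta(n):
    """Delta(x) at x = n + 1/2, i.e. D(x) - x(log x + 2*gamma - 1) - 1/4."""
    x = n + 0.5
    return divisor_summatory(n) - x * (math.log(x) + 2 * EULER_GAMMA - 1) - 0.25

for k in range(2, 7):
    n = 10 ** k
    print(n, divisor_summatory(n), round(delta(n), 3), round(n ** (1 / 3), 3))
```

The last two printed columns allow a rough comparison between $\Delta(x)$ and $x^{1/3}$.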

9 REMARKABLE RESULTS IN LINEAR ALGEBRA

This brief section is devoted to some important results in Linear Algebra: we will outline proofs by density for the

Hamilton-Cayley Theorem and the identity Tr(AB) = Tr(BA), then we will outline a recent elementary proof (due to

Suk-Geun Hwang) of Cauchy’s interlace theorem, admitting Sylvester’s criterion as a straightforward corollary.

Theorem 253 (Hamilton-Cayley). If $p\in\mathbb{C}[x]$ is the characteristic polynomial of an $n\times n$ matrix $A$ with complex entries,
\[
p(A) = 0.
\]
Proof. If $A$ is a diagonalizable matrix the claim is trivial: if $A=J^{-1}DJ$ is the Jordan normal form of $A$, the diagonal entries of $D$ are the eigenvalues $\lambda_1,\ldots,\lambda_n$ of $A$. We have $p(\lambda_j)=0$ by the very definition of characteristic polynomial, and since $(J^{-1}DJ)^k=J^{-1}D^kJ$,
\[
p(A) = p(J^{-1}DJ) = J^{-1}p(D)J = 0.
\]
On the other hand diagonalizable matrices form a dense subset in the space of $n\times n$ matrices with complex entries. Assuming that $A$ is not a diagonalizable matrix it follows that for any $\varepsilon>0$ there exists some diagonalizable matrix $A_\varepsilon$ such that $\|A-A_\varepsilon\|_2\le\varepsilon$ (actually the choice of the Euclidean norm is immaterial, any induced norm does the job equally fine). The eigenvalues of $A_\varepsilon$ converge to the eigenvalues of $A$ and $p$ is a continuous function, hence
\[
p(A) = \lim_{\varepsilon\to 0}p_{\varepsilon}(A_\varepsilon) = 0,
\]
where $p_\varepsilon$ denotes the characteristic polynomial of $A_\varepsilon$.

Exercise 254. We have that {xn }n≥1 , {yn }n≥1 , {zn }n≥1 are three sequences of real numbers such that any term

among xn , yn , zn can be written as a linear combination of xn−1 , yn−1 , zn−1 with constant coefficients, for instance:

Prove that there exists an order-3 linear recurrence relation simultaneously fulfilled by each one of the sequences

{xn }n≥1 , {yn }n≥1 , {zn }n≥1 .

\[
a_1 = 1,\qquad a_{n+1} = \frac{2a_n}{7+a_n}.
\]
Prove that $\lim_{n\to+\infty}a_n=0$, then find $\lim_{n\to+\infty}\frac{\log a_n}{\log n}$.

Theorem 256 (“The trace is Abelian”). For any couple (A, B) of n × n matrices with complex entries, the following

identity holds:

Tr(AB) = Tr(BA).

Proof. Assuming $A$ is an invertible matrix, $AB$ and $BA$ share the same characteristic polynomial, since they are conjugated matrices due to $BA=A^{-1}(AB)A$. In particular they have the same trace. Equivalently, they share the same eigenvalues (counted according to their algebraic multiplicity) hence they share the sum of such eigenvalues. On the other hand, if $A$ is a singular matrix then $A_\varepsilon\stackrel{\text{def}}{=}A+\varepsilon I$ is an invertible matrix for any $\varepsilon\ne 0$ small enough. It follows that $\operatorname{Tr}(A_\varepsilon B)=\operatorname{Tr}(BA_\varepsilon)$, and since $\operatorname{Tr}$ is a continuous operator, by considering the limits of both sides as $\varepsilon\to 0$ we get $\operatorname{Tr}(AB)=\operatorname{Tr}(BA)$ just as well.

Corollary 257. If $A$ is an $n\times n$ matrix with real entries and $B=A^T$, $A$ and $B$ have the same rank. Prove this statement by showing that $\operatorname{Tr}(A^k)=\operatorname{Tr}(B^k)$ for any $k\in\mathbb{N}$. Hint: notice that $\operatorname{Tr}(M)=\operatorname{Tr}(M^T)$ and that the power sums of the eigenvalues fix the coefficients of the characteristic polynomial of a matrix.

Exercise 258. Prove that any matrix T with real entries such that Tr(T ) = 0 can be written in the form T = AB −BA

for a suitable choice of the matrices A, B.

Unexpected applications of the Hamilton-Cayley theorem: if you are able to draw it,

you also know its asymptotic behaviour.

Exercise 259. Let $T_n$ be the number of strings over the alphabet $\Sigma=\{0,1\}$ with length $n$ and exactly one occurrence of the "11" substring. Find an explicit formula for $T_n$.

Proof. It is pretty simple to construct a finite automaton accepting the strings of our (regular) language.

The automaton depicted above has the following transition matrix:

\[
M = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix}.
\]
Since the starting state is $A$ and the accepting states are $C,D$, we simply have:
\[
T_n = (1\ \,0\ \,0\ \,0)\;M^n\;(0\ \,0\ \,1\ \,1)^T
\]
hence, by the Hamilton-Cayley theorem, $\{T_n\}_{n\ge 0}$ and the matrix $M$ share the same characteristic polynomial. Since the eigenvalues of $M$ are $\frac{1\pm\sqrt{5}}{2}$ and they both have algebraic multiplicity 2 and geometric multiplicity 1, by the Jordan decomposition of $M$ we have:
\[
T_n = (a+bn)\left(\frac{1+\sqrt{5}}{2}\right)^n+(c+dn)\left(\frac{1-\sqrt{5}}{2}\right)^n
\]
In particular we have $T_n=\frac{1}{5}(nL_n-F_n)$, with $F_n$ and $L_n$ denoting Fibonacci and Lucas numbers. To prove the same in a purely combinatorial fashion is a bit more involved, but certainly not impossible. By stars and bars, the number of strings with length $n-1$ and exactly $k$ non-adjacent 1s is given by $\binom{n-k}{k}$. In particular:
\[
T_n = \sum_{k\ge 1}\binom{n-k}{k}\,k = \sum_{k=1}^{n-1}A_kA_{n-k}
\]


where $A_k$ is the number of strings with length $k$, having no adjacent 1s and starting with a 1. By the convolution machinery the generating function of $\{T_n\}_{n\ge 0}$ is simply given by the square of the generating function of $\{A_n\}_{n\ge 0}=\{F_{n+1}\}_{n\ge 0}$. The latter is a meromorphic function with two simple poles and we may recover the previous closed form by partial fraction decomposition.
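The solution of Exercise 259 can be double-checked by a short computation. The sketch below (all function names are ours) compares four ways of getting $T_n$: brute-force enumeration, the first row of $M^n$, the binomial sum, and the closed form $\frac{1}{5}(nL_n-F_n)$. It also verifies the Hamilton-Cayley identity for $M$, whose characteristic polynomial is $(x^2-x-1)^2=x^4-2x^3-x^2+2x+1$.

```python
from itertools import product
from math import comb

M = [[1, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 1]]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_pow(A, e):
    n = len(A)
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        R = mat_mul(R, A)
    return R

def t_matrix(n):
    """(1 0 0 0) M^n (0 0 1 1)^T: walks from state A ending in C or D."""
    row = mat_pow(M, n)[0]
    return row[2] + row[3]

def t_brute(n):
    """Binary strings of length n with exactly one (possibly overlapping) occurrence of '11'."""
    return sum(1 for bits in product("01", repeat=n)
               if sum("".join(bits)[i:i + 2] == "11" for i in range(n - 1)) == 1)

def t_binomial(n):
    """Sum over k >= 1 of k * C(n - k, k)."""
    return sum(k * comb(n - k, k) for k in range(1, n))

def fib_lucas(n):
    """The pair (F_n, L_n), with F_0 = 0, F_1 = 1, L_0 = 2, L_1 = 1."""
    f, f1, l, l1 = 0, 1, 2, 1
    for _ in range(n):
        f, f1 = f1, f + f1
        l, l1 = l1, l + l1
    return f, l

def cayley_hamilton_residual():
    """Entries of M^4 - 2M^3 - M^2 + 2M + I, expected to vanish."""
    P = [mat_pow(M, e) for e in range(5)]
    coeffs = [1, 2, -1, -2, 1]  # coefficients of x^0, ..., x^4
    return [[sum(c * P[e][i][j] for e, c in enumerate(coeffs)) for j in range(4)]
            for i in range(4)]

for n in range(2, 11):
    f, l = fib_lucas(n)
    print(n, t_brute(n), t_matrix(n), t_binomial(n), (n * l - f) // 5)
```

As a by-product, the Hamilton-Cayley identity translates into the order-4 recurrence $T_{n+4}=2T_{n+3}+T_{n+2}-2T_{n+1}-T_n$.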

Exercise 260. Let $s$ be a non-empty string over the alphabet $\Sigma=\{0,1\}$. Let $A_s$ be the finite automaton accepting the strings over $\Sigma$ that do not contain $s$ as a substring. Let us define $\operatorname{Spec}(s)$ as the spectrum of the transition matrix of $A_s$. Investigate the relations between $\operatorname{Spec}(s.t)$ and $\operatorname{Spec}(s)$, $\operatorname{Spec}(t)$, where $.$ denotes the concatenation of strings.

Exercise 261. According to the notation introduced in the previous exercise, prove or disprove the existence of a string $s$ such that $2\cos\frac{2\pi}{7}\in\operatorname{Spec}(s)$.

Definition 262. We say that a weakly increasing sequence of real numbers a1 ≤ a2 ≤ . . . ≤ an interlaces another

weakly increasing sequence of real numbers b0 ≤ b1 ≤ . . . ≤ bn if

b0 ≤ a1 ≤ b1 ≤ a2 ≤ . . . ≤ an ≤ bn

holds.

Theorem 263 (Cauchy’s interlace Theorem). The eigenvalues of a Hermitian matrix A of order n are interlaced with

those of any principal submatrix of order n − 1.

Proof. Hermitian matrices have real eigenvalues. Let A be a Hermitian matrix of order n and let B be a principal

submatrix of A of order n − 1. If λn ≤ λn−1 ≤ . . . ≤ λ1 lists the eigenvalues of A and µn ≤ µn−1 ≤ . . . ≤ µ2 the

eigenvalues of B, we shall prove that

λn ≤ µn ≤ λn−1 ≤ µn−1 ≤ . . . ≤ λ2 ≤ µ2 ≤ λ1 .

Proofs of this theorem have been based on Sylvester’s law of inertia and the Courant-Fischer min-max theorem.

Here we will give a simple, elementary proof of the theorem by using the intermediate value theorem.

Simultaneously permuting rows and columns, if necessary, we may assume that the submatrix $B$ occupies rows $2,3,\ldots,n$ and columns $2,3,\ldots,n$, so that $A$ has the form
\[
A = \begin{pmatrix} a & y^* \\ y & B \end{pmatrix}
\]
where $*$ stands for the conjugate transpose of a matrix. Let $D=\operatorname{diag}(\mu_2,\mu_3,\ldots,\mu_n)$. Then, since $B$ is also Hermitian, by the spectral Theorem there exists a unitary matrix $U$ of order $n-1$ such that $U^*BU=D$. Let $U^*y=z=(z_2,z_3,\ldots,z_n)^T$. We first prove the theorem for the special case where $\mu_n<\mu_{n-1}<\ldots<\mu_3<\mu_2$ and $z_i\ne 0$ for $i=2,3,\ldots,n$. Let
\[
V = \begin{pmatrix} 1 & 0^T \\ 0 & U \end{pmatrix}
\]
in which $0$ denotes the zero vector. Then $V$ is a unitary matrix and
\[
V^*AV = \begin{pmatrix} a & z^* \\ z & D \end{pmatrix}.
\]

Let $f(x)=\det(xI-A)=\det(xI-V^*AV)$, where $I$ denotes the identity matrix. Expanding $\det(xI-V^*AV)$ along the first row, we get
\[
f(x) = (x-a)(x-\mu_2)(x-\mu_3)\cdots(x-\mu_n)-\sum_{i=2}^{n}f_i(x)
\]
where $f_i(x)=|z_i|^2(x-\mu_2)\cdots\widehat{(x-\mu_i)}\cdots(x-\mu_n)$ for $i=2,3,\ldots,n$, with the hat-sign denoting a missing term. We may notice that $f_i(\mu_j)=0$ when $j\ne i$ and $f_i(\mu_i)$ is strictly positive or strictly negative according to $i$ being, respectively, even or odd. It follows that $f(\mu_i)$ is positive if $i$ is odd and negative if $i$ is even. Since $f(x)$ is a polynomial of degree $n$ with positive leading coefficient, the intermediate value Theorem ensures the existence of $n$ roots $\lambda_1,\lambda_2,\ldots,\lambda_n$ of the equation $f(x)=0$ such that $\lambda_n<\mu_n<\lambda_{n-1}<\mu_{n-1}<\ldots<\lambda_2<\mu_2<\lambda_1$.

For the proof of the general case, let $\varepsilon_1,\varepsilon_2,\ldots$ be a sequence of positive real numbers such that $\varepsilon_k$ is decreasing towards zero, $z_i+\varepsilon_k\ne 0$ for $i=2,3,\ldots,n$ and $k=1,2,\ldots$ and the diagonal entries of $D+\varepsilon_k\operatorname{diag}(2,3,\ldots,n)$ are distinct for fixed $k$. For $k=1,2,\ldots$ let
\[
C_k = \begin{pmatrix} a & z(\varepsilon_k)^* \\ z(\varepsilon_k) & D(\varepsilon_k) \end{pmatrix}
\]
where $z(\varepsilon_k)=z+\varepsilon_k(1,1,\ldots,1)^T$ and $D(\varepsilon_k)=D+\varepsilon_k\operatorname{diag}(2,3,\ldots,n)$, and let $A_k=VC_kV^*$. Then $A_k$ is Hermitian and $A_k$ converges towards $A$. Let $\lambda_n^{(k)}\le\lambda_{n-1}^{(k)}\le\ldots\le\lambda_2^{(k)}\le\lambda_1^{(k)}$ list the eigenvalues of $A_k$. Then
\[
\lambda_n^{(k)}<\mu_n+n\varepsilon_k<\lambda_{n-1}^{(k)}<\mu_{n-1}+(n-1)\varepsilon_k<\ldots<\lambda_2^{(k)}<\mu_2+2\varepsilon_k<\lambda_1^{(k)}.
\]
Since $\lambda_n^{(k)},\lambda_{n-1}^{(k)},\ldots,\lambda_1^{(k)}$ are $n$ distinct roots of $\det(xI-A_k)=0$ for each $k$ and since the graph of $y=\det(xI-A_k)$ is sufficiently close to that of $y=\det(xI-A)$, it follows that the proof is complete by invoking the implicit function Theorem: $\left(\lambda_n^{(k)},\lambda_{n-1}^{(k)},\ldots,\lambda_1^{(k)}\right)\to(\lambda_n,\lambda_{n-1},\ldots,\lambda_1)$.
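The sign pattern exploited in the proof can be observed on a concrete example. The $3\times 3$ real symmetric matrix below is a hypothetical choice of ours; the sketch computes the eigenvalues $\mu_3<\mu_2$ of the principal submatrix $B$ by the quadratic formula, checks the signs of $f(x)=\det(xI-A)$ at $\mu_2,\mu_3$, then locates the eigenvalues of $A$ by bisection on the three intervals granted by the intermediate value theorem.

```python
import math

# Hypothetical real symmetric matrix; B is the principal submatrix on rows/columns 2, 3
A = [[2.0, 1.0, -1.0],
     [1.0, 3.0, 0.5],
     [-1.0, 0.5, 1.0]]

def char_poly(x):
    """f(x) = det(xI - A), by cofactor expansion along the first row."""
    m = [[(x if i == j else 0.0) - A[i][j] for j in range(3)] for i in range(3)]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def bisect(f, lo, hi, iters=200):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Eigenvalues of the 2x2 symmetric block B via the quadratic formula
b11, b12, b22 = A[1][1], A[1][2], A[2][2]
center, radius = (b11 + b22) / 2, math.hypot((b11 - b22) / 2, b12)
mu2, mu3 = center + radius, center - radius

bound = 5.0  # every absolute row sum of A is at most 4.5, so all eigenvalues lie in (-5, 5)
lam3 = bisect(char_poly, -bound, mu3)
lam2 = bisect(char_poly, mu3, mu2)
lam1 = bisect(char_poly, mu2, bound)

print("f(mu2) =", char_poly(mu2), " f(mu3) =", char_poly(mu3))
print(lam3, "<", mu3, "<", lam2, "<", mu2, "<", lam1)
```

The same sign test, applied to the nested leading principal submatrices, is what makes Sylvester's criterion a corollary of the interlacing property.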

if and only if the determinants of the leading principal minors are positive.

Proof. If some minor has a negative or zero determinant the original matrix M cannot be positive definite by Cauchy’s

interlace theorem. This proves that the positivity of the mentioned determinants is a necessary condition. The converse

implication can be easily shown by induction on the dimension of $M$, or by exploiting the Cholesky decomposition $M=B^TB$ with $B$ being a non-singular matrix.

Theorem 265 (Banach-Steinhaus uniform boundedness theorem). Let $\mathcal{F}$ be a family of bounded linear operators from a Banach space $X$ to a normed linear space $Y$. If $\mathcal{F}$ is pointwise bounded (i.e., $\sup_{T\in\mathcal{F}}\|Tx\|<\infty$ for all $x\in X$), then $\mathcal{F}$ is norm-bounded (i.e., $\sup_{T\in\mathcal{F}}\|T\|<\infty$).

Proof. The following proof is due to Alan D. Sokal. Let $T$ be a bounded linear operator from a normed linear space $X$ to a normed linear space $Y$. Then for any $x\in X$ and $r>0$, we have
\[
\sup_{x'\in B(x,r)}\|Tx'\|\ge\|T\|\,r,
\]
where, as usual, $B(x,r)=\{x'\in X:\|x'-x\|<r\}$. Indeed, for any $\xi\in X$ we have
\[
\max\{\|T(x+\xi)\|,\|T(x-\xi)\|\}\ge\frac{\|T(x+\xi)\|+\|T(x-\xi)\|}{2}\ge\|T\xi\|
\]
where the second $\ge$ uses the triangle inequality in the form $\|\alpha-\beta\|\le\|\alpha\|+\|\beta\|$, and we may consider the supremum over $\xi\in B(0,r)$. If we assume that $\sup_{T\in\mathcal{F}}\|T\|=\infty$, we may construct a sequence $\{T_n\}_{n\ge 1}$ such that $\|T_n\|\ge 4^n$. Setting $x_0=0$, we may use the previous lemma on $\sup_{x'\in B(x,r)}\|Tx'\|$ to choose inductively $x_n\in X$ such that $\|x_n-x_{n-1}\|\le 3^{-n}$ and
\[
\|T_nx_n\|\ge\frac{2}{3}\,3^{-n}\|T_n\|.
\]
$\{x_n\}_{n\ge 1}$ is a Cauchy sequence, hence it is convergent to some $x\in X$: it is easy to check that $\|x-x_n\|\le\frac{1}{2}3^{-n}$, such that $\|T_nx\|\ge\frac{1}{6}3^{-n}\|T_n\|\ge\frac{1}{6}\left(\frac{4}{3}\right)^n\to\infty$, contradicting the pointwise-boundedness of $\mathcal{F}$.

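Corollary 266. There is a $2\pi$-periodic and continuous function $f$ whose Fourier series
\[
\sum_{n\in\mathbb{Z}}\hat{f}(n)\,e^{inx},\qquad \hat{f}(n) = \frac{1}{2\pi}\int_0^{2\pi}f(x)\,e^{-nix}\,dx
\]
is divergent at $x=0$.

Proof. In order to invoke the Banach-Steinhaus theorem, we consider the functionals given by the partial sums of the Fourier series of $f$, evaluated at $x=0$:
\[
\lambda_N(f) = \sum_{|n|\le N}\hat{f}(n).
\]
After rescaling the period to 1 we have
\[
|\lambda_N(f)| \le \int_0^1\left|\sum_{|n|\le N}e^{-2\pi inx}\right|\cdot|f(x)|\,dx \le \|f\|_{\infty}\cdot\left\|\sum_{|n|\le N}e^{-2\pi inx}\right\|_1.
\]
On the other hand, by considering $g(x)$ as the sign of the Dirichlet kernel
\[
\sum_{|n|\le N}e^{-2\pi inx} = \frac{\sin\left(2\pi x\left(N+\frac{1}{2}\right)\right)}{\sin(\pi x)}
\]
and $\{g_j(x)\}_{j\ge 1}$ as a sequence of periodic continuous functions, such that $|g_j(x)|\le 1$ and $g_j(x)\to g(x)$, by dominated convergence
\[
\lim_{j\to+\infty}\lambda_N(g_j) = \int_0^1g(x)\sum_{|n|\le N}e^{-2\pi inx}\,dx = \int_0^1\left|\sum_{|n|\le N}e^{-2\pi inx}\right|dx,
\]
hence the previous bound for the norm of $\lambda_N$ holds as an equality. Since the mean value of $|\sin x|$ is $\frac{2}{\pi}$, integration by parts leads to
\[
\left\|\sum_{|n|\le N}e^{-2\pi inx}\right\|_1 \sim \frac{4}{\pi^2}\log N,
\]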

hence there is no uniform bound for the L1 -norm of the Dirichlet kernel. By Banach-Steinhaus, there is some f in the

unit ball of C 0 (T) such that

sup |λN (f )| = +∞.

N

In fact, the collection of such f s is dense in the unit ball, and it is an intersection of a countable collection of dense

open sets (a Gδ ).
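The logarithmic growth of the $L^1$-norm of the Dirichlet kernel can be observed numerically. In the sketch below the norm is approximated by the midpoint rule on $(0,1)$; the function name and the number of samples are our own choices. The exact value for $N=1$, namely $\frac{1}{3}+\frac{2\sqrt{3}}{\pi}$, follows from $\int_0^1|1+2\cos(2\pi x)|\,dx$ and is used as a sanity check.

```python
import math

def dirichlet_l1_norm(N, samples=200_000):
    """Midpoint-rule approximation of the integral over (0,1) of |sin((2N+1) pi x) / sin(pi x)|."""
    h = 1.0 / samples
    total = 0.0
    for j in range(samples):
        x = (j + 0.5) * h
        total += abs(math.sin((2 * N + 1) * math.pi * x) / math.sin(math.pi * x))
    return total * h

norms = {N: dirichlet_l1_norm(N) for N in (1, 10, 100)}
slope = 4 / math.pi ** 2
for N, value in norms.items():
    print(N, value, slope * math.log(N))
```

The difference between the norms at $N=100$ and $N=10$ is already close to $\frac{4}{\pi^2}\log 10$, while the norms themselves exceed $\frac{4}{\pi^2}\log N$ by a bounded amount.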

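The same phenomenon does not occur if the Dirichlet kernel is replaced by the Fejér kernel: if $f\in C^0(\mathbb{T})$, the sequence of trigonometric polynomials defined by
\[
p_N(x) = \sum_{|n|\le N}\left(1-\frac{|n|}{N}\right)\hat{f}(n)\,e^{2\pi inx}
\]
is uniformly convergent to $f$.

Corollary 267. If $A=\{a_n\}_{n\ge 1}$ is a sequence of real numbers such that $|\langle\Lambda,A\rangle|=\left|\sum_{n\ge 1}\lambda_na_n\right|$ is finite for any sequence $\Lambda=\{\lambda_n\}_{n\ge 1}\in\ell^2$, then $A\in\ell^2$.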

Proof. We work by contradiction: we assume that $\sum_{n\ge 1}a_n^2$ is unbounded and we show that for some sequence $\Lambda\in\ell^2$ the series $\sum_{n\ge 1}\lambda_na_n$ is positively divergent. By picking the multipliers $\lambda_n$ with the same sign as the corresponding $a_n$, we may as well assume that $a_n\ge 0$. For any real number $c>0$, the subsequence $\{a_{\sigma(n)}\}_{n\ge 1}$ made by the terms $\ge c$ has a finite number of terms: otherwise the inner product with the sequence $\{\lambda_{\sigma(n)}=\frac{1}{n}\}_{n\ge 1}\in\ell^2$ would be unbounded. In particular 0 is the only accumulation point of $\{a_n\}_{n\ge 1}$ and $\lim_{n\to+\infty}a_n=0$. Let us define $a_0=1$,
\[
S_n = \sum_{k=0}^{n}a_k^2
\]
and let us consider $\lambda_n=\sqrt{\frac{1}{S_{n-1}}-\frac{1}{S_n}}$. In this case $\Lambda\in\ell^2$ by construction, since $\sum_{n\ge 1}\lambda_n^2$ is a telescopic series and $S_n\to+\infty$ by the original assumption $A\notin\ell^2$. If we manage to prove that
\[
\sum_{n\ge 1}a_n\sqrt{\frac{1}{S_{n-1}}-\frac{1}{S_n}} = \sum_{n\ge 1}\frac{a_n^2}{\sqrt{S_{n-1}S_n}} \ge \sum_{n\ge 1}\frac{a_n^2}{S_n}
\]
is divergent we are done. Let us define, by induction, $\tau(0)$ as the smallest $n$ such that $S_n\ge 2$, $\tau(m)$ as the smallest $n$ such that $S_n\ge 2\,S_{\tau(m-1)}$. We have
\[
\sum_{\tau(m)<n\le\tau(m+1)}\frac{a_n^2}{S_n} \ge \frac{1}{S_{\tau(m+1)}}\left(S_{\tau(m+1)}-S_{\tau(m)}\right) \ge \frac{1}{2}
\]
hence by summing both sides on $m\ge 0$ we have that $\sum_{n\ge 1}a_n\sqrt{\frac{1}{S_{n-1}}-\frac{1}{S_n}}$ is divergent.
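The construction of the witness $\Lambda$ is completely explicit, so it can be run on a concrete sequence. The sketch below uses the hypothetical choice $a_n=n^{-1/2}$ (which is not in $\ell^2$): it checks that $\sum\lambda_n^2$ telescopes to $1-\frac{1}{S_N}$ and that $\sum a_n\lambda_n$ grows at least like $\log S_N$, which follows from $u-\frac{1}{u}\ge 2\log u$ for $u\ge 1$.

```python
import math

def witness(a):
    """Given a_1..a_N >= 0, return (lambda_1..lambda_N, S_N) with
    lambda_n = sqrt(1/S_{n-1} - 1/S_n) and S_0 = a_0^2 = 1."""
    s_prev = 1.0
    lambdas = []
    for x in a:
        s = s_prev + x * x
        lambdas.append(math.sqrt(1.0 / s_prev - 1.0 / s))
        s_prev = s
    return lambdas, s_prev

N = 100_000
a = [n ** -0.5 for n in range(1, N + 1)]   # hypothetical example: a_n = 1/sqrt(n)
lambdas, S_N = witness(a)

norm_squared = sum(l * l for l in lambdas)            # stays below 1 = 1/S_0
pairing = sum(x * l for x, l in zip(a, lambdas))      # diverges as N grows

print(f"S_N = {S_N:.4f}, sum of lambda_n^2 = {norm_squared:.6f}, "
      f"sum of a_n*lambda_n = {pairing:.4f}, log S_N = {math.log(S_N):.4f}")
```

The divergence is very slow (doubly logarithmic in $N$ for this example), but the squared norm of $\Lambda$ never exceeds 1.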

The last proof can be easily adapted to the continuous case through very few adjustments:

Corollary 268. If f : R+ → R is a function such that f · g ∈ L1 (R+ ) for any g ∈ L2 (R+ ), then f ∈ L2 (R+ ).

Proof. We work by contradiction: we assume that $\int_0^xf(t)^2\,dt$ is unbounded and we show that for some function $g\in L^2(\mathbb{R}^+)$ the integral $\int_0^xf(t)g(t)\,dt$ is positively divergent as $x\to+\infty$. By picking the multipliers $g(x)$ such that $f$ and $g$ have almost everywhere the same sign, we may as well assume that $f(x)\ge 0$. For any real number $c>0$, the set of $x\in\mathbb{R}^+$ such that $f(x)\ge c$ has finite measure, otherwise the integral of $\frac{f(x)}{\sqrt{x^2+1}}$ over such set would be unbounded. By replacing $f$ with the convolution between $f$ and a non-negative, compactly supported and smooth kernel we may also assume that $f(x)$ is continuous on $\mathbb{R}^+$. Let us define
\[
F(x) = \int_0^xf(t)^2\,dt
\]
and let us consider $g(x)=\sqrt{-\frac{d}{dx}\frac{1}{1+F(x)}}$. In this case $g\in L^2(\mathbb{R}^+)$ by construction, since the integral of $g(x)^2$ can be computed through the fundamental Theorem of Calculus and $F(x)$ is increasing to $+\infty$. If we manage to prove that
\[
\int_0^{+\infty}f(x)\sqrt{-\frac{d}{dx}\frac{1}{1+F(x)}}\,dx = \int_0^{+\infty}\frac{f(x)^2}{1+F(x)}\,dx
\]
is divergent we are done. Let us define, by induction, $\tau(0)$ as the infimum of the set $\{x:F(x)\ge 2\}$, $\tau(m)$ as the infimum of the set $\{x:F(x)\ge 2F(\tau(m-1))\}$. We have

Z τ (m+1) Z τ (m+1)

f (x)2 1 F (τ (m + 1)) − F (τ (m)) 1

dx ≥ f (x)2 dx = ≥

τ (m) 1 + F (x) 1 + F (τ (m + 1)) τ (m) 1 + F (τ (m + 1)) 3

hence by summing both sides on $m \geq 0$ we have that $\int_0^{+\infty} f(x)\sqrt{\frac{d}{dx}\left(-\frac{1}{1+F(x)}\right)}\,dx$ is divergent.

The powerful Lemma 267 is a corollary of the Banach-Steinhaus theorem: given the sequence $A = \{a_n\}_{n\geq 1}$ and any $\Lambda \in \ell^2$, the operators

$$ T_N(\Lambda) = \sum_{n=1}^{N} a_n\lambda_n $$

are linear, continuous and pointwise bounded. The uniform boundedness criterion ensures they are norm-bounded, i.e. $A \in \ell^2$. Lemma 267 immediately leads to the fact that $\ell^2$ and $L^2(\mathbb{R}^+)$ are complete spaces. Since Fourier series give an isometry between $L^2(0, 2\pi)$ and $\ell^2$, such lemma also provides a proof of the completeness of $L^2(I)$ for any bounded interval $I$.

10 THE FUNDAMENTAL THEOREM OF ALGEBRA

The purpose of this section is to shortly introduce some elements of Complex Analysis and use them to produce a

proof of the Fundamental Theorem of Algebra, stating that C is an algebraically closed field, or, in layman’s terms:

The first serious attempt at this problem is due to Gauss: it was mainly geometric, but it had a topological gap,

filled by Alexander Ostrowski in 1920. A rigorous proof was first published by Argand in 1806 (and revisited in 1813).

We will actually show many approaches and focus on geometric and analytic insights and their consequences. All

proofs below involve some analysis, or at least the topological concept of continuity of real or complex functions. Some

also use differentiable or even analytic functions. This fact has led to the remark that the Fundamental Theorem of

Algebra is neither fundamental, nor a theorem of algebra.

Definition 270. Suppose we are given a closed, oriented curve in the xy plane, not going through the origin. We

can imagine the curve as the path of motion of some object, with the orientation indicating the direction in which

the object moves. Then the winding number of the curve is equal to the total number of counterclockwise turns

that the object makes around the origin. When counting the total number of turns, counterclockwise motion counts

as positive, while clockwise motion counts as negative. For example, if the object first circles the origin four times

counterclockwise, and then circles the origin once clockwise, then the total winding number of the curve is three.

But given a closed curve $\gamma : [0, 2\pi] \to \mathbb{R}^2 \setminus \{(0, 0)\}$ represented by $\gamma(t) = (x(t), y(t))$ (which we may temporarily assume to be smooth, too), how can we find its winding number around the origin? We may notice that in the open first quadrant $\arctan\frac{y(t)}{x(t)}$ gives an “angular displacement” with respect to the origin: in order to compute the winding number of $\gamma$ we just need to find a continuous determination of such angular displacement. For brevity we will not delve into the theory of differential forms; we just mention that such continuous determination can be achieved through a step of differentiation and a step of integration, namely:

$$ \frac{d}{dt}\arctan\frac{y(t)}{x(t)} = \frac{\frac{d}{dt}\frac{y(t)}{x(t)}}{1+\frac{y(t)^2}{x(t)^2}} = \frac{x(t)\,y'(t) - y(t)\,x'(t)}{x(t)^2 + y(t)^2} $$

leading to

$$ \frac{1}{2\pi}\int_0^{2\pi}\frac{x(t)\,y'(t) - y(t)\,x'(t)}{x(t)^2 + y(t)^2}\,dt $$

as an expression for the winding number of $\gamma$ around the origin. What if our curve is given by some $f : S^1 \to \mathbb{C}^*$, with $f$ being a holomorphic function? In such a case the previous winding number takes the following form:

$$ \frac{1}{2\pi i}\oint_{|z|=1}\frac{f'(z)}{f(z)}\,dz. $$

Now comes an interesting remark, clarifying the interplay between winding numbers and zeroes of holomorphic functions: in the previous line, the integrand function is formally $\frac{d}{dz}\log f(z)$, hence if $f = hg$ with $h$ and $g$ being holomorphic functions, the winding number of $f$ is just the sum of the winding number of $h$ and the winding number of $g$. It is very simple to check that for any $m \in \mathbb{N}$ the winding number of $z^m$ is exactly $m$ and for every $w \in \mathbb{C}\setminus S^1$, by setting $f(z) = z - w$ we get that the winding number of $f$ is $1$ or $0$ according to $|w| < 1$ or $|w| > 1$: in the first case $f(S^1)$ encloses the origin, in the latter it does not. Conversely, if $f(z)$ is a holomorphic and non-vanishing function over $D$, then $\frac{f'(z)}{f(z)}$ is a holomorphic function over $D$ and the integral $\oint_{\partial D}\frac{f'(z)}{f(z)}\,dz$ equals zero by Stokes' theorem.
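The integral expression for the winding number can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `winding_number` and `on_circle` are hypothetical, the derivative of the curve is supplied by hand, and the integral is approximated by a plain Riemann sum over equally spaced samples. It uses the fact that the integrand $\frac{x y' - y x'}{x^2+y^2}$ is the imaginary part of $\gamma'(t)/\gamma(t)$ when $\gamma = x + iy$.

```python
import cmath
import math

def winding_number(gamma, dgamma, samples=4096):
    """Approximate (1/2pi) * integral over [0, 2pi] of
    (x y' - y x') / (x^2 + y^2) dt for gamma(t) = x(t) + i y(t).
    The integrand equals Im(gamma'(t) / gamma(t))."""
    h = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        t = k * h
        total += (dgamma(t) / gamma(t)).imag
    return total * h / (2 * math.pi)

def on_circle(f, df):
    """Turn f (with derivative df) into the closed curve t -> f(e^{it})
    together with its derivative with respect to t."""
    gamma = lambda t: f(cmath.exp(1j * t))
    dgamma = lambda t: df(cmath.exp(1j * t)) * 1j * cmath.exp(1j * t)
    return gamma, dgamma

# The three cases mentioned in the text: z^m, and z - w with |w| < 1 or |w| > 1.
w_cube = winding_number(*on_circle(lambda z: z**3, lambda z: 3 * z**2))
w_inside = winding_number(*on_circle(lambda z: z - 0.5, lambda z: 1))
w_outside = winding_number(*on_circle(lambda z: z - 2.0, lambda z: 1))
```

Under these assumptions the three values should be close to $3$, $1$ and $0$ respectively, in agreement with the additivity remark above.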

$$ N_f = \frac{1}{2\pi i}\oint_{\partial D}\frac{f'(z)}{f(z)}\,dz $$

This Lemma can be seen either as a consequence of the residue Theorem or as a remark in Differential Geometry that can be used to prove the residue Theorem. Such Lemma has a crucial role in the usual proof of the Jordan curve Theorem, since it gives that a smooth, simple and closed curve cannot partition $\mathbb{R}^2$ into more than two connected components. A curve fulfilling such constraints splits $\mathbb{R}^2$ into at least two connected components by the existence of a tubular neighbourhood; the chance to drop the previous smoothness assumption is then granted by invoking Sard's Theorem.

Theorem 272 (The double leash principle). Assume that you are connected to a thin tree by a leash of fixed length $L$. Assume that your dog is connected to you by a leash of fixed length $l < L$. If you take a walk and after some time you and your dog return to your starting points, your winding number around the tree and your dog's are the same.

The above statement is usually known as Rouché's Theorem. We opted for such a fancy introduction since we believe the previous formulation might help the reader to grasp the geometric idea faster and better. In a more rigorous way:

Theorem 273 (Rouché). If $f(z)$ and $f(z) + g(z)$ are holomorphic functions on the closed unit disk $D$ centered at the origin, and for every $z \in \partial D$ we have $0 < |g(z)| < |f(z)|$, then $f$ and $f + g$ have the same number of zeros inside $D$, where each zero is counted as many times as its multiplicity. The same holds if $D$ is replaced by some compact region $K$ whose boundary $\partial K$ is a simple, piecewise-smooth and closed curve.

Proof. We have already shown that the wanted number of zeroes is given by a winding number. Since for any $z \in \partial D$ we may write

$$ f(z) + g(z) = f(z)\left(1 + \frac{g(z)}{f(z)}\right) $$

the winding number of $f + g$ is given by the sum of the winding number of $f$ and the winding number of the curve

$$ h : S^1 \to \mathbb{C}^*,\qquad h(e^{i\theta}) \stackrel{\text{def}}{=} 1 + \frac{g(e^{i\theta})}{f(e^{i\theta})}. $$

However, the winding number of $h$ is clearly zero, since $h$ stays “on the right” of the origin. As a matter of fact, $\left|\frac{g(z)}{f(z)}\right|$ is a continuous function on a compact set ($S^1$) attaining a maximum $M < 1$, hence $\operatorname{Re}(h) \geq 1 - M > 0$ by the triangle inequality. It follows that $f$ and $f + g$ have the same winding number, hence the same number of zeroes in $D$.

Lemma 274. The Fundamental Theorem of Algebra is a simple consequence of the double leash principle.

Proof. Let us assume that

$$ p(z) = z^n + a_{n-1}z^{n-1} + \ldots + a_0 $$

is a monic polynomial with degree $n \geq 1$ and $p(0) = a_0 \neq 0$. Our purpose is to prove it has a zero somewhere. We prove first that if it has a zero, it cannot be too far from the origin. Let us consider

$$ M = 1 + |a_{n-1}| + \ldots + |a_0| $$

and show that $|z| \geq M$ implies $p(z) \neq 0$. Since $M > 1$, for any $z \in \mathbb{C}$ such that $|z| \geq M$ we have:

$$ \left|a_{n-1}z^{n-1} + \ldots + a_0\right| \leq |a_{n-1}||z|^{n-1} + \ldots + |a_0| \leq \left(|a_{n-1}| + \ldots + |a_0|\right)|z|^{n-1} = (M-1)|z|^{n-1} < |z|^n $$

hence $p(z) = 0$ cannot occur, since it would imply $|z|^n = \left|a_{n-1}z^{n-1} + \ldots + a_0\right|$. This proves all the complex zeroes of $p$, if existing, lie in the region $|z| < M$. But the inequality above also shows that $f(z) = z^n$ and $g(z) = a_{n-1}z^{n-1} + \ldots + a_0$ meet the hypothesis of Rouché's Theorem for the region $K = \{z \in \mathbb{C} : |z| \leq M\}$. In particular the number of zeroes of $p(z) = f(z) + g(z)$ in $K$ equals the number of zeroes of $f(z) = z^n$ in $K$ and we are done:

$p(z)$ has exactly $n$ zeroes in the region $|z| < M$, counted according to their multiplicity.
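The bound $M$ and the domination $|g(z)| < |f(z)|$ on $|z| = M$ can be illustrated on a concrete polynomial. The following sketch assumes a hypothetical list of sample roots (chosen only for illustration) and rebuilds the monic polynomial from them, so that no root-finding is needed; `poly_from_roots` and `root_bound` are names introduced here, not taken from the notes.

```python
import cmath
import math

def poly_from_roots(roots):
    """Coefficients [a_0, ..., a_{n-1}, 1] (ascending powers) of prod (z - r)."""
    coeffs = [1 + 0j]
    for r in roots:
        shifted = [0j] + coeffs                    # z * q(z)
        scaled = [-r * c for c in coeffs] + [0j]   # -r * q(z)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

def root_bound(coeffs):
    """M = 1 + |a_{n-1}| + ... + |a_0| for a monic polynomial."""
    return 1 + sum(abs(c) for c in coeffs[:-1])

roots = [1 + 2j, -3, 0.5j, 2 - 1j]   # hypothetical sample roots
coeffs = poly_from_roots(roots)
M = root_bound(coeffs)
n = len(coeffs) - 1

def g(z):
    """Lower-order part a_{n-1} z^{n-1} + ... + a_0 of the polynomial."""
    return sum(c * z**k for k, c in enumerate(coeffs[:-1]))

# Check |g(z)| < |z|^n at 360 sample points of the circle |z| = M.
dominated = all(
    abs(g(M * cmath.exp(2j * math.pi * k / 360))) < M**n
    for k in range(360)
)
# Every prescribed root should lie in the open disk |z| < M.
all_inside = all(abs(r) < M for r in roots)
```

The bound $M$ is usually far from sharp; the sketch only confirms that it is a valid radius for applying Rouché's Theorem with $f(z) = z^n$.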

The inclusion

{z ∈ C : p(z) = 0} ⊂ {z ∈ C : |z| ≤ M }

can also be proved by applying the Gershgorin circle Theorem to the companion matrix of p.

$p_n(z) = z^n + z + 1$

not greater than one.

unity $\omega, \omega^2$ are also roots of $p_n$. Conversely, if we denote as $D$ the unit disk centered at the origin, we may notice that $\partial D$ and $-(\partial D + 1)$ intersect only at $\omega, \omega^2$: it follows that the $n \equiv 2 \pmod 3$ case is the only case in which $p_n(z)$ has roots on $\partial D$. The number of roots we want to approximate is given by the winding number of the curve $\gamma_n : [0, 2\pi] \to \mathbb{C}$, $\gamma_n(\theta) = p_n(e^{i\theta})$. The diagram depicts the $n = 13$ case, for instance.

[Figure: the graph of $e^{13i\theta} + e^{i\theta} + 1$ for $\theta \in [0, 2\pi]$.]

It is pretty clear that the graph of $\gamma_n$ is given by the union of $n$ approximate circles, completing a revolution around the point $z = 1$ in $n$ steps. Assuming $n \not\equiv 2 \pmod 3$ the number of roots we are interested in is exactly given by the number of the previous approximate circles enclosing the origin. If such portions of $\gamma_n$ were perfect circles, it would be clear from the diagram that about $\frac{n}{3}$ of them would enclose the origin. To fill in the missing details is a task we leave to the reader.
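The winding number of $\gamma_n$ can also be estimated numerically, which gives a quick way to compare the actual count with $\frac{n}{3}$. The sketch below is a hedged illustration, not the argument the exercise asks for: the function name `roots_in_unit_disk` and the sample count are our assumptions, and the code presumes $n \not\equiv 2 \pmod 3$ so that the curve avoids the origin.

```python
import cmath
import math

def roots_in_unit_disk(n, samples=20000):
    """Winding number around 0 of gamma_n(theta) = p_n(e^{i theta}),
    with p_n(z) = z^n + z + 1, obtained by accumulating the small
    phase increments between consecutive sample points.
    Assumes n is not congruent to 2 mod 3 (no roots on the unit circle)."""
    def gamma(k):
        z = cmath.exp(2j * math.pi * k / samples)
        return z**n + z + 1

    total = 0.0
    prev = gamma(0)
    for k in range(1, samples + 1):
        cur = gamma(k)
        total += cmath.phase(cur / prev)   # each increment lies in (-pi, pi]
        prev = cur
    return round(total / (2 * math.pi))

count_3 = roots_in_unit_disk(3)     # z^3 + z + 1: only the real root is inside
count_4 = roots_in_unit_disk(4)     # z^4 + z + 1: one conjugate pair is inside
count_13 = roots_in_unit_disk(13)   # the case depicted in the diagram
```

By the argument principle the returned integer is the number of roots of $p_n$ with modulus less than one, so it can be compared directly with the heuristic value $\frac{n}{3}$.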

We now outline another classical proof of the Fundamental Theorem of Algebra, relying on the following results:

Theorem 276 (Maximum modulus principle). If $D$ is a closed disk in the complex plane and $f$ is a non-constant holomorphic function over $D$, then

$$ \max_{z \in D} |f(z)| $$

is attained at $\partial D$.

Proof. If we assume that $\max_{z \in D} |f(z)|$ is attained at some $z_0$ belonging to the interior of $D$ we get a contradiction, since by Cauchy's integral formula or termwise integration of a Taylor series we have

$$ f(z_0) = \frac{1}{2\pi i}\oint_{|z - z_0| = \varepsilon}\frac{f(z)}{z - z_0}\,dz $$

for any $\varepsilon > 0$ small enough, implying

$$ |f(z_0)| \leq \frac{1}{2\pi\varepsilon}\oint_{|z - z_0| = \varepsilon}|f(z)|\,|dz|. $$

If equality holds for any $\varepsilon$ small enough then $|f(z)|$ is constant in a neighbourhood of $z_0$ and $f(z)$ is constant as well.

Proof. The theorem follows from the fact that holomorphic functions are analytic.

If $f$ is an entire function, it can be represented by its Taylor series about $0$:

$$ f(z) = \sum_{k=0}^{\infty} a_k z^k $$

where

$$ a_k = \frac{f^{(k)}(0)}{k!} = \frac{1}{2\pi i}\oint_{C_r}\frac{f(\zeta)}{\zeta^{k+1}}\,d\zeta $$

and $C_r$ is the circle about $0$ of radius $r > 0$. Suppose $f$ is bounded: i.e. there exists a constant $M$ such that $|f(z)| \leq M$ for all $z$. We can estimate directly

$$ |a_k| \leq \frac{1}{2\pi}\oint_{C_r}\frac{|f(\zeta)|}{|\zeta|^{k+1}}\,|d\zeta| \leq \frac{1}{2\pi}\oint_{C_r}\frac{M}{r^{k+1}}\,|d\zeta| = \frac{M}{2\pi r^{k+1}}\oint_{C_r}|d\zeta| = \frac{M}{2\pi r^{k+1}}\,2\pi r = \frac{M}{r^k}, $$

where in the second inequality we have used the fact that $|\zeta| = r$ on the circle $C_r$. But the choice of $r$ above is arbitrary. Therefore, letting $r$ tend to infinity gives $a_k = 0$ for all $k \geq 1$. Thus $f(z) = a_0$ and this proves the theorem.


Corollary 278 (Casorati-Weierstrass). If f is a non-constant entire function, then its image is dense in C.

Proof. If the image of $f$ is not dense, then there is a complex number $w$ and a real number $r > 0$ such that the open disk centered at $w$ with radius $r$ has no element of the image of $f$. Define $g(z) = \frac{1}{f(z) - w}$. Then $g$ is a bounded entire function, since

$$ \forall z \in \mathbb{C},\quad |g(z)| = \frac{1}{|f(z) - w|} \leq \frac{1}{r}. $$

So, by Liouville's Theorem, $g$ is constant, and therefore $f$ is constant.

Corollary 279. If $p(z) \in \mathbb{C}[z]$ is a monic polynomial with degree $n \geq 1$ and $p(0) \neq 0$, it has a complex root.

Proof. Since $|p(z)| \to +\infty$ as $|z| \to +\infty$, there is some closed disk $D$ centered at the origin such that $|p(z)| > |p(0)|$ for any $z$ outside $D$. Assuming that $p(z)$ is non-vanishing, it follows that $\min_{z \in \mathbb{C}} |p(z)|$ is attained at some $z_0 \in D$ and $q(z) = \frac{1}{p(z)}$ is an entire function such that

$$ \forall z \in \mathbb{C},\quad |q(z)| \leq \frac{1}{|p(z_0)|}. $$

By Liouville's Theorem we get that both $q$ and $p$ are constant functions, leading to a contradiction.

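A shortened proof. Assume that $p(z) = z^n + a_{n-1}z^{n-1} + \ldots + a_0 \in \mathbb{C}[z]$ with $n \geq 1$ is non-vanishing over $\mathbb{C}$. By the residue Theorem, for any $r > 0$ we have that

$$ \oint_{|z| = r}\frac{dz}{z\,p(z)} = \frac{2\pi i}{p(0)} \neq 0. $$

On the other hand the left-hand side is bounded in modulus by $2\pi\max_{|z|=r}\frac{1}{|p(z)|}$, which tends to $0$ as $r \to +\infty$: this is a contradiction.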

Theorem 280 (Open mapping Theorem). Any non-constant holomorphic function on C is an open map,

i.e. it sends open subsets of C to open subsets of C.

Proof. Assume $f : U \to \mathbb{C}$ is a non-constant holomorphic function and $U$ is a domain of the complex plane. We have to show that every point in $f(U)$ is an interior point of $f(U)$, i.e. that every point in $f(U)$ has a neighbourhood (open disk) which is also in $f(U)$. Consider an arbitrary $w_0$ in $f(U)$. Then there exists a point $z_0$ in $U$ such that $w_0 = f(z_0)$. Since $U$ is open, we can find $d > 0$ such that the closed disk $B$ around $z_0$ with radius $d$ is fully contained in $U$. Consider the function $g(z) = f(z) - w_0$. Note that $z_0$ is a root of the function. We know that $g(z)$ is not constant and holomorphic. The roots of $g$ are isolated by the identity theorem, and by further decreasing the radius $d$ of the disk $B$, we can ensure that $g(z)$ has only a single root in $B$ (although this single root may have multiplicity greater than $1$). The boundary of $B$ is a circle and hence a compact set, on which $|g(z)|$ is a positive continuous function, so the extreme value Theorem guarantees the existence of a positive minimum $e$, that is, $e$ is the minimum of $|g(z)|$ for $z$ on the boundary of $B$ and $e > 0$. Denote by $D$ the open disk around $w_0$ with radius $e$. By Rouché's theorem, the function $g(z) = f(z) - w_0$ will have the same number of roots (counted with multiplicity) in $B$ as $h(z) \stackrel{\text{def}}{=} f(z) - w_1$ for any $w_1$ in $D$. This is because $h(z) = g(z) + (w_0 - w_1)$, and for $z$ on the boundary of $B$, $|g(z)| \geq e > |w_0 - w_1|$. Thus, for every $w_1$ in $D$, there exists at least one $z_1$ in $B$ such that $f(z_1) = w_1$. This means that the disk $D$ is contained in $f(B)$. The image of the ball $B$, $f(B)$, is a subset of the image of $U$, $f(U)$. Thus $w_0$ is an interior point of $f(U)$. Since $w_0$ was arbitrary in $f(U)$ we know that $f(U)$ is open. Since $U$ was arbitrary, the function $f$ is open.

Proof. We have that |p(z)| → +∞ as |z| → +∞. Suppose that p(zk ) → w ∈ C as k → +∞: then {zk } is bounded,

so taking a subsequence if necessary, there is z ∈ C such that zk → z. By continuity p(zk ) → p(z), concluding that

w = p(z).

Corollary 282. If $p(z) \in \mathbb{C}[z]$ is a non-constant polynomial, $p(\mathbb{C})$ is unbounded and simultaneously open and closed. It follows that $p(\mathbb{C}) = \mathbb{C}$, i.e. any non-constant polynomial with complex coefficients is surjective. In particular there is at least one complex solution of $p(z) = 0$.

Exercise 283. Given a non-constant polynomial $p(z) \in \mathbb{C}[z]$ such that $p(0) \neq 0$, prove that the following statements are equivalent forms of the Fundamental Theorem of Algebra:

2. $q(z) = \frac{1}{p(z)}$ is an analytic function in a neighbourhood of the origin with a finite radius of convergence;

the following limit does not exist:

$$ \lim_{n \to +\infty}\sum_{k=0}^{n} a_k M^k. $$

An interesting part of Complex Analysis is related to the problem of extending Rolle's Theorem (if $f$ is a differentiable function on $[a, b]$ and $f(a) = f(b) = 0$ holds, there is some $\xi \in (a, b)$ such that $f'(\xi) = 0$) to the complex case.

all the roots of $p'(z)$ lie inside the convex hull of $\zeta_1, \ldots, \zeta_n$.

Proof. We may assume without loss of generality that $p(z)$ is a monic polynomial with simple roots. Then by considering the logarithmic derivative of $p(z)$ we have the following identity

$$ p'(z) = p(z)\sum_{k=1}^{n}\frac{1}{z - \zeta_k} $$

and $p'(z)$ vanishes iff $s(z) = \sum_{k=1}^{n}\frac{1}{z - \zeta_k}$ vanishes, since $p(z)$ and $p'(z)$ have no common root. Let us assume that $s(z)$ vanishes at a point $w$ lying outside the convex hull of $\zeta_1, \ldots, \zeta_n$, or on its boundary. By the Hahn-Banach theorem there is some line $\ell$ through the origin such that all the vectors $w - \zeta_1, \ldots, w - \zeta_n$ lie on the same side of $\ell$. By conjugation, the same applies to the vectors $\frac{1}{w - \zeta_1}, \ldots, \frac{1}{w - \zeta_n}$ (with respect to the conjugate line), hence for some $\theta \in \mathbb{R}$ the complex number $e^{i\theta}s(w)$ has a non-zero real/imaginary part. Then $s(w) \neq 0$, which leads to a contradiction, completing the proof.

complex plane and we denote as $D, E$ the roots of

$$ \frac{d}{dz}(z - A)(z - B)(z - C), $$

then $D, E$ are the foci of the Steiner inellipse of $ABC$, centered at $\frac{A+B+C}{3}$ and tangent to the sides of $ABC$ at their midpoints.
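The statement about the foci lends itself to a quick numerical sanity check. The sketch below assumes a hypothetical triangle (the vertices are our choice) and uses only the quadratic formula: since the midpoints of the sides lie on the inellipse, the sum of the distances from each midpoint to $D$ and $E$ should be the same for all three midpoints. The helper names `derivative_roots` and `focal_sums` are introduced here for illustration.

```python
import cmath

def derivative_roots(A, B, C):
    """Roots D, E of d/dz (z-A)(z-B)(z-C) = 3z^2 - 2(A+B+C)z + (AB+BC+CA)."""
    s1 = A + B + C
    s2 = A * B + B * C + C * A
    disc = cmath.sqrt(s1 * s1 - 3 * s2)
    return (s1 + disc) / 3, (s1 - disc) / 3

def focal_sums(A, B, C):
    """For each side midpoint P, the sum of distances |P - D| + |P - E|."""
    D, E = derivative_roots(A, B, C)
    midpoints = [(A + B) / 2, (B + C) / 2, (C + A) / 2]
    return [abs(P - D) + abs(P - E) for P in midpoints]

A, B, C = 0 + 0j, 4 + 0j, 1 + 3j      # hypothetical triangle
D, E = derivative_roots(A, B, C)
sums = focal_sums(A, B, C)            # three (nearly) equal numbers
centre = (D + E) / 2                  # midpoint of the foci
```

The common value of the three sums is the length of the major axis of the inellipse, and `centre` should coincide with the centroid $\frac{A+B+C}{3}$.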

The best possible improvement of the Gauss-Lucas Theorem is still an open problem in the general case. The following result has only been proved for polynomials having degree $\leq 8$ and for some other special cases:


The Ilieff-Sendov conjecture. If all the zeros of a polynomial $p(z)$ lie in $|z| \leq 1$ and if $r$ is a zero of $p(z)$, then there is a zero of $p'(z)$ in the disk $|z - r| \leq 1$.

Exercise 286. By exploiting the Gauss-Lucas theorem, show that the following entire functions only have real zeroes:

$$ \sin(z) + \sqrt{2}\,\sin(z\sqrt{2}),\qquad \operatorname{Si}(z) = \sum_{n \geq 0}\frac{(-1)^n z^{2n+1}}{(2n+1)\cdot(2n+1)!},\qquad J_0(z) = \sum_{n \geq 0}\frac{(-z^2)^n}{4^n\,n!^2}. $$

Exercise 287. By exploiting the Gauss-Lucas theorem, show that for any λ ∈ [1, +∞) all the solutions of

cot(x) + λx = 0 are real numbers.

Yet another way for approaching Calculus. The exponential function is usually introduced by proving that $\lim_{n \to +\infty}\left(1 + \frac{1}{n}\right)^n$ exists, then showing that $e^x \stackrel{\text{def}}{=} \lim_{n \to +\infty}\left(1 + \frac{x}{n}\right)^n$ is a differentiable function, then noticing that $\frac{d}{dx}e^x = e^x$ leads to well-known Taylor series and to De Moivre's formula. Here we outline a different way for approaching the early stages of Calculus.

1. One may directly introduce the complex exponential function through an everywhere-convergent power series,

$$ \forall z \in \mathbb{C},\quad e^z \stackrel{\text{def}}{=} \sum_{n \geq 0}\frac{z^n}{n!}, $$

2. then check through the convolution machinery that such function fulfills $e^a \cdot e^b = e^{a+b}$ for any $a, b \in \mathbb{C}$;

3. In particular, for any $\theta \in \mathbb{R}$ we have that $e^{i\theta} \in S^1$, since $|e^{i\theta}|^2 = e^{i\theta}\cdot e^{-i\theta} = 1$

4. and the map $\gamma : \mathbb{R} \to S^1$ given by $\gamma(\theta) = e^{i\theta}$ is a parametrization of $S^1$ with constant speed, since by the series definition $\frac{d}{d\theta}e^{i\theta} = ie^{i\theta}$ and the last quantity has unit modulus by the previous point (Pythagorean Theorem);

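5. we may define

$$ \sin(\theta) \stackrel{\text{def}}{=} \operatorname{Im} e^{i\theta},\qquad \cos(\theta) \stackrel{\text{def}}{=} \operatorname{Re} e^{i\theta} $$

and derive the addition formulas for $\sin$ and $\cos$ from the point (2.);

6. by defining

$$ \pi \stackrel{\text{def}}{=} \inf\left\{\theta \in \mathbb{R}^+ : \sin(\theta) = 0\right\} $$

we get that $\pi$ equals half the length of the unit circle, or, equivalently, the area of the unit disk. Additionally, $e^{i\pi} + 1 = 0$;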

7. By the previous points $e^z$ is an entire function and a solution of the differential equation $f'(z) = f(z)$;

8. similarly,

$$ \sin(z) = \sum_{n \geq 0}\frac{(-1)^n}{(2n+1)!}z^{2n+1},\qquad \cos(z) = \sum_{n \geq 0}\frac{(-1)^n}{(2n)!}z^{2n} $$

are entire functions and solutions of the differential equation $f''(z) + f(z) = 0$;

9. From the Pythagorean Theorem $\sin^2\theta + \cos^2\theta = 1$, hence

$$ \pi = 2\int_0^1\frac{dx}{\sqrt{1 - x^2}} \stackrel{x \mapsto \sqrt{u}}{=} \int_0^1\frac{du}{\sqrt{u(1-u)}} \stackrel{\text{symmetry}}{=} 4\int_0^{1/\sqrt{2}}\frac{dx}{\sqrt{1 - x^2}} $$

and by integrating termwise the Taylor series of $\frac{1}{\sqrt{1 - x^2}}$ we get the following series representation for $\pi$:

$$ \pi = 2\sqrt{2}\sum_{n \geq 0}\frac{\binom{2n}{n}}{8^n(2n+1)} = 3.1415926535897932384626433832795\ldots $$

As an alternative, the integral of $\sqrt{1 - x^2}$ over some sub-interval of $[-1, 1]$ is clearly related to the area of a circle sector. By computing a Taylor series and applying termwise integration again, we may easily derive Newton's identity

$$ \pi = 4 - 4\sum_{n \geq 1}\frac{\binom{2n}{n}}{(4n^2 - 1)\,4^n}. $$
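Some of the steps in the list above can be tested with a few lines of code. The sketch below is only a numerical illustration under stated assumptions: the function names `exp_series` and `pi_series`, the truncation at 60 terms and the sample points `a`, `b` are all our choices. It checks the functional equation of point (2.), the unit modulus of point (3.) and the first series for $\pi$ of point (9.).

```python
import math

def exp_series(z, terms=60):
    """Partial sum of sum_{n>=0} z^n / n!, the series defining e^z."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)     # z^(n+1)/(n+1)! from z^n/n!
    return total

def pi_series(terms=60):
    """Partial sum of 2*sqrt(2) * sum_{n>=0} C(2n,n) / (8^n (2n+1))."""
    return 2 * math.sqrt(2) * sum(
        math.comb(2 * n, n) / (8**n * (2 * n + 1)) for n in range(terms)
    )

a, b = 0.3 + 1.1j, -0.7 + 0.4j     # arbitrary sample points
# Point (2.): e^a * e^b should agree with e^(a+b).
functional_gap = abs(exp_series(a) * exp_series(b) - exp_series(a + b))
# Point (3.): e^{i theta} should have unit modulus for real theta.
unit_modulus_gap = abs(abs(exp_series(2.0j)) - 1.0)
# Point (9.): the series should approach pi.
pi_gap = abs(pi_series() - math.pi)
```

The terms of the series for $\pi$ decay roughly like $2^{-n}$, which is why a few dozen terms already reach machine precision; Newton's identity converges much more slowly and is not included here.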

Let us assume we have a holomorphic function which is $z + o(z)$ in a neighbourhood of the origin, like

$$ \sin(z) = \sum_{n \geq 0}\frac{(-1)^n z^{2n+1}}{(2n+1)!} \qquad (1) $$

and that we want to compute the coefficients of the Maclaurin series of its inverse function $\arcsin(z)$, say the coefficient of $z^7$. By Cauchy's integral formula

I

7 1 arcsin(z)

[z ] arcsin(z) = dz (2)