% Chapter 9 of Concrete Mathematics
% (c) Addison-Wesley, all rights reserved.
% This chapter must begin on right-hand page
% because: (a) it looks good; (b) there's no running head till 3rd page of chapter!
\input gkpmac
\refin bib
\pageno=439
\refin chap2
\refin chap3
\refin chap4
\refin chap5
\refin chap6
\refin chap7
\beginchapter 9 Asymptotics
EXACT ANSWERS are great when we can find them; there's something very
satisfying about complete knowledge. But there's also a time when
approximations are in order. If we run into a sum or a recurrence
whose solution doesn't have a closed form (as far as we can tell), we still
would like to know something about the answer; we don't have to insist
on all or nothing. And even if we do have a closed form, our knowledge
might be imperfect, since we might not know how to compare it with
other closed forms.
For example, there is (apparently) no closed form for the sum
\begindisplay
S_n=\sum_{k=0}^n{3n\choose k}\,.
\enddisplay
But it is nice to know that
\begindisplay
S_n\sim 2{3n\choose n}\,,\qquad\hbox{as $n\to\infty$};
\enddisplay
we say that the sum is ``"asymptotic" to'' $2{3n\choose n}$.
\g Uh oh \dots\ here comes that A-word.\g
It's even nicer to have more detailed information, like
\begindisplay
S_n={3n\choose n}\biggl(2-{4\over n}+O\Bigl({1\over n^2}\Bigr)\biggr)\,,
\eqno\eqref|opening-sum|
\enddisplay
which gives us a ``relative error of order $1/n^2$.'' But even this isn't enough
to tell us how big $S_n$ is, compared with other quantities. Which is larger,
$S_n$ or the Fibonacci number~$F_{4n}$? Answer: We have $S_2=22>F_8=21$ when
$n=2$; \looseness=-1
but $F_{4n}$ is eventually larger, because
$F_{4n}\sim \phi^{4n}\!/\sqrt5$ and $\phi^4\approx6.8541$, while
\begindisplay
S_n=\sqrt{3\over\pi n}(6.75)^n\biggl(1-{151\over72n}+O\Bigl({1\over n^2}\Bigr)
\biggr)\,.
\eqno\eqref|opening-asympt|
\enddisplay
Our goal in this chapter is to learn how to understand and to
derive results like this without great pain.
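These claims are easy to probe numerically. The following Python sketch (not part of the text; the tolerance $5\cdot10^{-3}$ is an empirical choice) computes $S_n$ exactly and compares it with the estimates above:

```python
from math import comb

def S(n):
    # S_n = sum of binomial(3n, k) over 0 <= k <= n, computed exactly
    return sum(comb(3 * n, k) for k in range(n + 1))

def fib(m):
    # Fibonacci numbers, F_0 = 0, F_1 = 1
    a, b = 0, 1
    for _ in range(m):
        a, b = b, a + b
    return a

# S_n / binomial(3n, n) approaches 2 - 4/n, with an O(1/n^2) error
for n in (50, 100, 200):
    print(n, S(n) / comb(3 * n, n), 2 - 4 / n)

# S_2 = 22 beats F_8 = 21, but F_{4n} wins from some point on
assert S(2) == 22 and fib(8) == 21
assert S(10) < fib(40)
assert abs(S(200) / comb(600, 200) - (2 - 4 / 200)) < 5e-3
```

The constant hidden in the $O(1/n^2)$ term is not small, so the approach to $2-4/n$ is visible only for fairly large $n$.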
The word {\it"asymptotic"\/} stems from a Greek root meaning
\g Other words like `symptom' and `ptomaine' also come from this root.\g
``not falling together.\qback'' When ancient Greek mathematicians
studied conic sections, they considered hyperbolas like the graph of
$y=\sqrt{1+x^{\mathstrut2}}$,
\begindisplay
\unitlength=.2in
\beginpicture(10,6)(-5,-1)
\put(-1,-1){\line(1,1)6}
\put(1,-1){\line(-1,1)6}
\put(-5,0){\line(1,0){10}}
\put(0,-1){\line(0,1)6}
\put(0,0){\squine(0,0.41421355,1,1.0,1.0,1.41421357)}
\put(0,0){\squine(1,1.38742588,2,1.41421357,1.68816501,2.23606798)}
\put(0,0){\squine(2,2.4142136,3,2.23606798,2.60655183,3.16227767)}
\put(0,0){\squine(3,3.43405694,4,3.16227767,3.57406026,4.12310565)}
\put(0,0){\squine(4,4.4470902,5,4.12310565,4.55684686,5.0990195)}
\put(0,0){\squine(-0,-0.41421355,-1,1.0,1.0,1.41421357)}
\put(0,0){\squine(-1,-1.38742588,-2,1.41421357,1.68816501,2.23606798)}
\put(0,0){\squine(-2,-2.4142136,-3,2.23606798,2.60655183,3.16227767)}
\put(0,0){\squine(-3,-3.43405694,-4,3.16227767,3.57406026,4.12310565)}
\put(0,0){\squine(-4,-4.4470902,-5,4.12310565,4.55684686,5.0990195)}
\endpicture
\enddisplay
which has the lines $y=x$ and $y=-x$ as ``asymptotes.\qback'' The
curve approaches but never quite touches these asymptotes, when
$x\to\infty$. Nowadays we use ``asymptotic'' in a broader sense to
mean any approximate value that gets closer and closer to the truth,
when some parameter approaches a limiting value. For us, asymptotics
means ``almost falling together.\qback''
Some asymptotic formulas are very difficult to derive, well beyond
the scope of this book. We will content ourselves with an introduction
to the subject; we hope to acquire a suitable foundation on which
further techniques can be built. We will be particularly interested
in understanding the definitions of `$\sim$' and `$O$' and similar
symbols, and we'll study basic ways to manipulate asymptotic quantities.
\beginsection 9.1 A Hierarchy
Functions of $n$ that occur in practice usually have different ``asymptotic
growth ratios''; one of them will approach infinity faster than another.
We formalize this by saying that
\begindisplay
f(n)\prec g(n)\quad\iff\quad\lim_{n\to\infty}{f(n)\over g(n)}\!=\!0\,.
\eqno\eqref|prec-def|
\enddisplay
This relation is
transitive: If $f(n)\prec g(n)$ and
$g(n)\prec h(n)$ then $f(n)\prec h(n)$.
\g All functions great~and small.\g
We also may write $g(n)\succ f(n)$ if $f(n)\prec g(n)$.
This notation was introduced in 1871 by Paul "du Bois-Reymond"~[|du-bois|].
For example, $n\prec n^2$; informally we say that $n$ grows
more slowly than~$n^2$. In fact,
\begindisplay
n^\alpha\prec n^\beta\iff \alpha<\beta\,,
\eqno\eqref|power-prec|
\enddisplay
when $\alpha$ and $\beta$ are arbitrary real numbers.
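The defining limit can be sampled numerically; here is a quick Python sketch (an illustration, not a proof) showing the ratio $f(n)/g(n)$ shrinking toward $0$ for a few pairs with $f \prec g$:

```python
from math import log

# f << g means f(n)/g(n) -> 0; sample the ratios at n = 10^2, 10^4, 10^6, 10^8
pairs = [
    ("log log n / log n", lambda n: log(log(n)) / log(n)),
    ("log n / n^0.5",     lambda n: log(n) / n**0.5),
    ("n^1.5 / n^2",       lambda n: n**1.5 / n**2),
]
for name, ratio in pairs:
    samples = [ratio(10.0**k) for k in (2, 4, 6, 8)]
    print(name, [round(s, 4) for s in samples])
    assert all(a > b > 0 for a, b in zip(samples, samples[1:]))
```

Notice how slowly $\log\log n/\!\log n$ decreases; limits like these can take astronomically large $n$ to become convincing.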
There are, of course, many functions of $n$ besides powers of~$n$. We can
use the $\prec$ relation to rank lots of functions into an asymptotic
pecking order that includes entries like this:
\begindisplay
1\prec\log\log n\prec\log n\prec n^\epsilon\prec n^c\prec n^{\log n}
\prec c^n\prec n^n\prec c^{@ c^n}\,.
\enddisplay
(Here $\epsilon$ and $c$ are arbitrary constants with $0<\epsilon<1<c$.)
G.\thinspace H. "Hardy" defined the class~$\Lfr$ of {\it"logarithmico-exponential functions"\/} to be the smallest class of real-valued functions that contains all constant functions and the identity function~$n$, and that contains $f(n)-g(n)$, $e^{f(n)}$, and $\ln f(n)$ whenever it contains $f(n)$ and~$g(n)$; the logarithm is allowed only when $f(n)$ is eventually positive, that is, when there is an integer~$n_0$ with $f(n)>0$ whenever $n\ge n_0$.
We can use these rules to show, for example, that $f(n)+g(n)$ is in $\Lfr$
whenever $f(n)$ and $g(n)$ are, because $f(n)+g(n)=f(n)-\bigl(0-g(n)\bigr)$.
If $f(n)$ and $g(n)$ are eventually positive members of~$\Lfr$, their product
$f(n)@g(n)=e^{\,\ln f(n)+\ln g(n)}$ and quotient $f(n)/g(n)=
e^{\,\ln f(n)-\ln g(n)}$ are in~$\Lfr$; so are functions like
$\sqrt{f(n)}=e^{\half\ln f(n)}$,
etc. Hardy proved that every logarithmico-exponential
function is eventually positive, eventually
negative, or identically zero. Therefore the product and quotient of
any two $\Lfr$-functions is in~$\Lfr$, except that we cannot divide
by a function that's identically zero.
Hardy's main theorem about logarithmico-exponential functions is that they
form an asymptotic hierarchy: {If\/ $f(n)$ and\/ $g(n)$ are any functions
in\/~$\Lfr$, then either\/ $f(n)\prec g(n)$, or\/ $f(n)\succ g(n)$, or\/
$f(n)\asymp g(n)$. In the last case there is, in fact, a constant\/~$\alpha$
such that}
\begindisplay
f(n)\sim\alpha\,g(n)\,.
\enddisplay
The proof of Hardy's theorem is beyond the scope of this book; but it's
nice to know that the theorem exists, because almost every function we
ever need to deal with is in~$\Lfr$. In practice, we can generally fit
a given function into a given hierarchy without great difficulty.
\beginsection 9.2 O Notation
A wonderful notational convention for asymptotic analysis was introduced
by Paul "Bachmann" in 1894 and popularized in
subsequent years by Edmund "Landau" and others.
\g\noindent\llap{``}\dots\ by the symbol\/ $O(n)$ we express a magnitude
whose "o"rder with respect to\/~$n$ does not exceed the order of\/~$n$;
whether it actually contains terms of order\/~$n$
remains undetermined by the argument so far.''\par\hfill\kern-4pt\dash---P. Bachmann [|bachmann|]\g
We have seen it in formulas like
\begindisplay
H_n=\ln n+\gamma+O(1/n)\,,
\eqno
\enddisplay
which tells us that the $n$th harmonic number is equal to the natural
logarithm of~$n$ plus Euler's constant, plus a quantity that is
``"Big Oh" of\/ $1$\kern-1pt~over~$n$.\qback'' This last quantity isn't specified
exactly; but whatever it is, the "notation" claims that its absolute
value is no more than a constant times~$1/n$.
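We can watch this claim in action; a minimal Python check (the bound $1/n$ used below is one valid choice of the constant times $1/n$):

```python
from math import log

gamma = 0.5772156649015329   # Euler's constant

def H(n):
    # the nth harmonic number, 1 + 1/2 + ... + 1/n
    return sum(1.0 / k for k in range(1, n + 1))

# H_n - ln n - gamma is positive and at most 1/n; in fact n times the
# difference settles down near 1/2
for n in (10, 100, 1000, 10000):
    err = H(n) - log(n) - gamma
    print(n, err, n * err)
    assert 0 < err < 1 / n
```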
The beauty of $O$-notation is that it suppresses unimportant detail
and lets us concentrate on salient features: The quantity $O(1/n)$ is
negligibly small, if constant multiples of~$1/n$ are unimportant.
Furthermore we get to use $O$ right in the middle of a formula. If we
want to express \thiseq\ in terms of the notations in Section~9.1,
we must transpose `$\ln n+\gamma$' to the left side and specify a weaker
result like
\begindisplay
H_n-\ln n-\gamma\prec{\log\log n\over n}
\enddisplay
or a stronger result like
\begindisplay
H_n-\ln n-\gamma\asymp{1\over n}\,.
\enddisplay
The Big Oh notation allows us to specify an appropriate amount of
detail in~place, without transposition.
The idea of imprecisely specified quantities can be made clearer if
we consider some additional examples. We occasionally use the notation
`$@\pm1@$'
to stand for something that is either $+1$ or $-1$; we don't
know (or perhaps we don't care) which it is, yet we can manipulate it
in formulas.
N.\thinspace G. "de Bruijn"
begins his book \kern-1pt{\sl Asymptotic Methods in Analysis\/}~[|de-bruijn|]
by considering a "Big~Ell" notation that helps us understand Big~Oh.
If we write $L(5)$ for a number whose absolute value is less than~$5$
(but we don't say what the number is), then we can perform certain
calculations without knowing the full truth. For example, we can
deduce formulas such as $1+L(5)=L(6)$; $L(2)+L(3)=L(5)$; $L(2)L(3)=L(6)$;
$e^{L(5)}=L(e^5)$; and so on. But we cannot conclude that $L(5)-L(3)=L(2)$,
since the left side might be $4-0$.
In fact, the most we can say is $L(5)-L(3)=L(8)$.
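These rules can be animated with a tiny Python class that tracks only the bound, always assuming the worst case (a toy model for illustration, not from the book):

```python
import math

# A toy model of de Bruijn's L-notation: L(a) stands for some number x
# with |x| < a.  We keep only the bound a.
class L:
    def __init__(self, bound):
        self.bound = float(bound)
    def __add__(self, other):
        if isinstance(other, L):
            return L(self.bound + other.bound)   # L(2)+L(3) = L(5)
        return L(abs(other) + self.bound)        # 1+L(5) = L(6)
    __radd__ = __add__
    def __sub__(self, other):
        return L(self.bound + other.bound)       # bounds ADD: L(5)-L(3) = L(8)
    def __mul__(self, other):
        return L(self.bound * other.bound)       # L(2)L(3) = L(6)
    def exp(self):
        return L(math.exp(self.bound))           # e^{L(5)} = L(e^5)

assert (1 + L(5)).bound == 6
assert (L(2) + L(3)).bound == 5
assert (L(2) * L(3)).bound == 6
assert (L(5) - L(3)).bound == 8
```

The one-way nature of the notation shows up in `__sub__`: subtraction of bounds would be unsound, so the bounds add.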
Bachmann's
$O$-notation is similar to $L$-notation but it's even less precise:
$O(\alpha)$ stands for a number whose absolute value is at most a constant
times~$\vert \alpha\vert$. We don't say what the number is and we don't
even say what the constant is. Of course the notion of a ``constant''
is nonsense
\g It's not nonsense, but it is pointless.\g
if there is nothing variable in the picture, so we use $O$-notation only
in contexts when there's at least one quantity (say~$n$) whose value is varying.
The formula
\begindisplay
f(n)=O\bigl(g(n)\bigr)\qquad\hbox{for all $n$}
\eqno\eqref|o-examp|
\enddisplay
means in this context that there is a constant $C$ such that
\begindisplay
\bigl\vert f(n)\bigr\vert\le C\bigl\vert g(n)\bigr\vert\qquad\hbox{for all $n$};
\eqno\eqref|o-def|
\enddisplay
and when $O\bigl(g(n)\bigr)$ stands in the middle of a formula it
represents a function $f(n)$ that satisfies~\thiseq. The values of~$f(n)$
are unknown, but we do know that they aren't too large. Similarly,
de~Bruijn's `$L(n)$' represents an unspecified function~$f(n)$ whose
values satisfy $\bigl\vert f(n)\bigr\vert<\vert n@\vert$. The main difference between
$L$ and~$O$ is that $O$-notation involves an unspecified constant~$C$;
each appearance of~$O$ might involve a different~$C$, but each~$C$ is
\g I've got a little list\dash---I've got a little list,\par
Of annoying terms and details that might well be under ground,\par
And that never would be missed\dash---that never would be missed.
"!Gilbert"\g % Mikado
independent of~$n$.
For example, we know that the sum of the first $n$ squares is
"!sum of consecutive squares"
\begindisplay
\Sq_n=\textstyle
{1\over3}n(n+\half)(n+1)={1\over3}n^3+\half n^2+{1\over6}n\,.
\enddisplay
We can write
\begindisplay
\Sq_n=O(n^3)
\enddisplay
because
$\vert{1\over3}n^3+\half n^2+{1\over6}n@\vert
\le{1\over3}\vert n@\vert^3+\half\vert n@\vert^2+{1\over6}\vert n@\vert
\le{1\over3}\vert n^3\vert+\half\vert n^3\vert+{1\over6}\vert n^3\vert
=\vert n^3\vert$ for all integers~$n$.
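The chain of inequalities just given shows that the constant $C=1$ works; a brute-force Python check over a range of integers:

```python
def sq(n):
    # Sq_n = (1/3) n (n + 1/2) (n + 1) = n (n+1) (2n+1) / 6, exactly
    return n * (n + 1) * (2 * n + 1) // 6

# |Sq_n| <= |n^3| for every integer n, witnessing Sq_n = O(n^3) with C = 1
for n in range(-100, 101):
    assert abs(sq(n)) <= abs(n) ** 3
print(sq(10))   # 385
```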
Similarly, we have the more specific formula
\begindisplay
\Sq_n=\textstyle{1\over3}n^3+O(n^2)\,;
\enddisplay
we can also be sloppy and throw away information, saying that
\begindisplay
\Sq_n=O(n^{10})\,.
\enddisplay
Nothing in the definition of $O$ requires us to give a best possible bound.
But wait a minute. What if the variable $n$ isn't an integer? What if we
have a formula like $S(x)=
{1\over3}x^3+\half x^2+{1\over6}x$, where $x$~is a real number?
Then we cannot say that $S(x)=O(x^3)$, because the ratio
$S(x)/x^3={1\over3}+\half x^{-1}+{1\over6}x^{-2}$ becomes unbounded when
$x\to0$. And we cannot say that $S(x)=O(x)$, because the ratio
$S(x)/x={1\over3}x^2+\half x+{1\over6}$ becomes unbounded when $x\to\infty$.
So we apparently can't use $O$-notation with~$S(x)$.
The answer to this dilemma is that variables used with $O$ are generally
subject to side conditions. For example, if we stipulate that $\vert x\vert
\ge1$, or that $x\ge\epsilon$ where $\epsilon$ is any positive constant,
or that $x$ is an integer,
then we can write $S(x)=O(x^3)$. If we stipulate that $\vert x\vert\le1$,
or that $\vert x\vert\le c$ where $c$ is any positive constant, then we
can write $S(x)=O(x)$. The $O$-notation is governed by its environment,
by constraints on the variables involved.
These constraints are often specified by a limiting relation.
For example, we might say that
\begindisplay
f(n)=O\bigl(g(n)\bigr)\qquad\hbox{as $n\to\infty$}.
\eqno
\enddisplay
This means that the $O$-condition is supposed to hold when $n$ is ``near''~%
$\infty$; we don't care what happens unless $n$ is quite large. Moreover,
we don't even specify exactly what ``near'' means; in such cases each
appearance of~$O$ implicitly asserts the existence of {\it two\/}
constants $C$ and~$n_0$, such that
\begindisplay
\bigl\vert f(n)\bigr\vert\le C\bigl\vert g(n)\bigr\vert
\qquad\hbox{whenever $n\ge n_0$}.
\eqno\eqref|o-def-limit|
\enddisplay
The values of $C$ and $n_0$ might be different for each $O$, but they do
not depend on~$n$. Similarly, the notation
\g You are the fairest\par\quad of your sex,\par
Let me be your\par\quad hero;\par
I love you as\par
\quad one over $x$,\par
As $x$ approaches\par\quad zero.\par
Positively.\g
\begindisplay
f(x)=O\bigl(g(x)\bigr)\qquad\hbox{as $x\to0$}
\enddisplay
means that there exist two constants $C$ and $\epsilon$ such that
\begindisplay
\bigl\vert f(x)\bigr\vert\le C\bigl\vert g(x)\bigr\vert
\qquad\hbox{whenever $\vert x\vert\le \epsilon$}.
\eqno\eqref|o-def-limit0|
\enddisplay
The limiting value does not have to be $\infty$ or $0@$; we can write
\begindisplay
\ln z=z-1+O\bigl((z-1)^2\bigr)\qquad\hbox{as $z\to1$},
\enddisplay
because it can be proved that $\vert\ln z-z+1\vert\le\vert z-1\vert^2$
when $\vert z-1\vert\le\half$.
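That inequality is easy to test on a grid of real values (a spot check, not a proof; the tiny slack term guards against floating-point rounding):

```python
from math import log

# |ln z - (z-1)| <= |z-1|^2 whenever |z-1| <= 1/2, which justifies
# ln z = z - 1 + O((z-1)^2) as z -> 1.
for i in range(-500, 501):
    t = i / 1000.0            # t = z - 1 runs over [-1/2, 1/2]
    assert abs(log(1.0 + t) - t) <= t * t + 1e-15
```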
Our definition of $O$ has gradually developed, over a few pages,
from something that seemed
pretty obvious to something that seems rather complex; we now have
$O$ representing an undefined function and either one or two unspecified
constants, depending on the environment. This may seem complicated enough
for any reasonable notation, but it's still not the whole story! Another
subtle consideration lurks in the background. Namely, we need to
realize that it's fine to write
\begindisplay
\textstyle{1\over3}n^3+\half n^2+{1\over6}n=O(n^3)\,,
\enddisplay
but we should {\it never\/} write this equality with the sides reversed.
Otherwise we could deduce ridiculous things like $n=n^2$ from the
identities $n=O(n^2)$ and $n^2=O(n^2)$. When we work with $O$-notation
and any other formulas that involve imprecisely specified quantities,
\g\noindent\llap{``}And to auoide the tediouse repetition of these woordes: is equalle to:
I~will sette as I doe often in woorke use, a paire of paralleles,
or Gemowe lines of one lengthe, thus: $=\joinrel=\joinrel=\joinrel=\,$,
bicause noe~.2.\ thynges, can be moare equalle.''\par
\hfill\dash---R. "Recorde" [|recorde-wit|]\g
we are dealing with {\it"one-way equalities"}. The right side of an
"!equality, one-way"
equation does not give more information than the left side, and it may
give less; the right is a ``crudification'' of the left.
From a strictly formal point of view, the notation
$O\bigl(g(n)\bigr)$ does not stand for a single function~$f(n)$, but
for the {\it set\/} of all functions~$f(n)$ such that
$\bigl\vert f(n)\bigr\vert\le C\bigl\vert g(n)\bigr\vert$ for some constant~$C$.
An ordinary formula~$g(n)$ that doesn't involve $O$-notation stands for the
set containing a single function $f(n)=g(n)$. If $S$ and~$T$ are sets
of functions of~$n$, the notation $S+T$ stands for the set of all
functions of the form $f(n)+g(n)$, where $f(n)\in S$ and $g(n)\in T$;
other notations like $S-T$, $ST$, $S/T$, $\sqrt S$, $e^S$, $\ln S$
are defined similarly. Then an ``equation'' between such sets of functions
is, strictly speaking, a {\it"set inclusion"\/}; the `$=$' sign really
means `$\subseteq$'. These formal definitions put all of our
$O$~manipulations on firm logical ground.
For example, the ``equation''
\begindisplay
\textstyle {1\over3}n^3+O(n^2)=O(n^3)
\enddisplay
means that $S_1\subseteq S_2$, where $S_1$ is the set of all functions
of the form ${1\over3}n^3+f_1(n)$ such that there exists a constant~$C_1$
with $\bigl\vert f_1(n)\bigr\vert\le C_1\vert n^2\vert$, and where
$S_2$ is the set of all functions
$f_2(n)$ such that there exists a constant~$C_2$
with $\bigl\vert f_2(n)\bigr\vert\le C_2\vert n^3\vert$.
We can formally prove this ``equation'' by taking an arbitrary element of
the left-hand side and showing that it belongs to the right-hand side:
Given
${1\over3}n^3+f_1(n)$ such that $\bigl\vert f_1(n)\bigr\vert\le C_1\vert n^2\vert$,
we must prove that there's a
constant~$C_2$
such that $\vert{1\over3}n^3+ f_1(n)\vert\le C_2\vert n^3\vert$.
The constant $C_2={1\over3}+C_1$ does the trick, since
$n^2\le\vert n^3\vert$ for all integers~$n$.
If `$=$' really means `$\subseteq$',
why don't we use `$\subseteq$'
instead of abusing the equals sign?
"!notation"
There are four reasons.
First, tradition.
Number theorists started using the equals~sign
with $O$-notation and the practice stuck.
It's sufficiently well established by now that
we cannot hope to get the mathematical community to change.
Second, tradition.
Computer people are quite used to seeing equals~signs abused\dash---%
for years "FORTRAN" and "BASIC" programmers have been writing
assignment statements like `$N = N+1$'.
One more abuse isn't much.
Third, tradition.
We often read `$=$' as the word `is'.
For instance we verbalize the formula $H_n = O(\log n)$ by saying
``$H$~sub~$n$ is Big~Oh of log~$n$.\qback''
And in English, this~`is' is one-way.
We say that a bird is an animal,
but we don't say that an animal is a bird;
``animal'' is a "crudification" of ``bird.\qback''
Fourth, for our purposes it's natural.
\g \noindent\llap{``}It is obvious that the sign\/~$=$ is really the wrong sign
for such relations, because it suggests symmetry, and there is
no such symmetry. \dots\ Once this warning has been given, there is,
however, not much harm in using the sign\/~$=$, and we shall maintain~it,
for no other reason than that it is customary.''\par
\hfill\kern-4pt\dash---N.\thinspace G.\thinspace de\thinspace
Bruijn\thinspace [|de-bruijn|]"!de Bruijn"\g
If we limited our use of $O$-notation to situations where
it occupies the whole right side of a formula\dash---as in
the harmonic number approximation $H_n = O(\log n)$,
or as in the description of a sorting algorithm's
running time $T(n) = O (n \log n)$\dash---%
it wouldn't matter whether we used `$=$'
or something else.
But when we use $O$-notation in the middle of an expression,
as we usually do in asymptotic calculations,
our intuition is well satisfied if we think of
the equals~sign as an equality, and if we
think of something like $O(1/n)$ as a very small quantity.
So we'll continue to use `$=$',
and we'll continue to regard~$O\bigl(g(n)\bigr)$ as
an incompletely specified function,
knowing that we can always fall back on the
set-theoretic definition if we must.
But we ought to mention one more technicality while we're picking nits
about definitions: If there are several variables in the environment,
$O$-notation formally represents sets of functions of two or more
variables, not just one. The domain of each function is every variable
that is currently ``free'' to vary.
This concept can be a bit subtle,
because a variable might be defined only in parts of an expression, when
it's controlled by a $\sum$ or something similar. For example, let's
look closely at the equation
\begindisplay
\sum_{k=0}^n\,\bigl(k^2+O(k)\bigr)={\textstyle{1\over3}}n^3+O(n^2)\,,
\qquad\hbox{integer $n\ge0$}.
\eqno
\enddisplay
The expression $k^2+O(k)$ on the left stands for the set of all two-variable
functions of the form $k^2+f(k,n)$ such that there exists a constant~$C$
with $\bigl\vert f(k,n)\bigr\vert\le Ck$ for $0\le k\le n$. The sum of
this set of functions, for $0\le k\le n$, is the set of all functions~$g(n)$
of the form
\begindisplay
\sum_{k=0}^n\bigl(k^2{+}f(k,n)\bigr)=\textstyle
{1\over3}n^3+\half n^2+{1\over6}n+f(0,n)+f(1,n)+\cdots+f(n,n)\,,
\enddisplay
where $f@$ has the stated property. Since we have
\begindisplay
&\textstyle
\bigl\vert\half n^2+{1\over6}n+f(0,n)+f(1,n)+\cdots+f(n,n)\bigr\vert\cr
&\qquad\le\textstyle\half n^2+{1\over6}n^2+C\cdt0+C\cdt1+\cdots+C\cdt n\cr
&\qquad\le\textstyle\half n^2+{1\over6}n^2+Cn^2
=\textstyle\bigl({2\over3}+C\bigr)n^2\,,\cr
\enddisplay
every function on the left belongs to the right-hand side, with
$C_2={2\over3}+C$. Lower bounds can be specified with a ``"Big Omega"''
"!$\Omega$"
notation:
\begindisplay
f(n)=\Omega\bigl(g(n)\bigr)\iff
\bigl\vert f(n)\bigr\vert\ge C\bigl\vert g(n)\bigr\vert
\qquad\tworestrictions{for some $C>0$,}{whenever $n\ge n_0$.}
\eqno
\enddisplay
We have $f(n)=\Omega\bigl(g(n)\bigr)$ if and only if
$g(n)=O\bigl(f(n)\bigr)$. A sorting algorithm whose running time is
$\Omega(n^2)$ is inefficient compared with one whose running time is
$O(n\log n)$, if $n$ is large enough.
Finally there's "Big Theta", which specifies an
"!$\Theta$"
\g Since\/ $\Omega$ and\/ $\Theta$ are uppercase Greek letters,
the\/ $O$ in\/ $O$\kern-1pt-notation must be a capital Greek Omicron.\par
After all, Greeks invented asymptotics.\g
exact order of growth:
\begindisplay
f(n)=\Theta\bigl(g(n)\bigr)\iff
\tworestrictions{$\displaystyle f(n)=O\bigl(g(n)\bigr)$}%
{and\quad$\displaystyle f(n)=\Omega\bigl(g(n)\bigr)$.}
\eqno
\enddisplay
We have $f(n)=\Theta\bigl(g(n)\bigr)$ if and only if $f(n)\asymp g(n)$
in the notation we saw previously, equation \eq(|hardy-asymp-def|).
Edmund "Landau" [|landau-primes|] invented a ``"little oh"'' notation,
\begindisplay \openup-3pt
&f(n)=o\bigl(g(n)\bigr)\cr
&\qquad\iff
\bigl\vert f(n)\bigr\vert\le \epsilon\bigl\vert g(n)\bigr\vert
\qquad\tworestrictions{for all $n\ge n_0(\epsilon)$ and}%
{for all constants $\epsilon>0$.}
\eqno\eqref|little-o-def|
\enddisplay
This is essentially the relation $f(n)\prec g(n)$ of \eq(|prec-def|).
We also have
\begindisplay
f(n)\sim g(n)\iff f(n)=g(n)+o\bigl(g(n)\bigr)\,.
\eqno
\enddisplay
Many authors use `$o$' in asymptotic formulas, but a more explicit
`$O$' expression is almost always
preferable. For example,
the average running time of a computer method called ``"bubblesort"''
"!sorting" "!$o$, considered harmful"
depends on the asymptotic value of the sum $P(n)=\sum_{k=0}^n k^{n-k\,}k!/n!$.
Elementary asymptotic methods suffice to
prove the formula $P(n)\sim\sqrt{\pi n/2}$, which means that the ratio
$P(n)/\mskip-2mu\sqrt{\pi n/2}$ approaches~$1$ as $n\to\infty$. However,
the true behavior of $P(n)$ is best understood by considering
the {\it difference}, $P(n)-\sqrt{\pi n/2}$, not the ratio:
\begindisplay \def\preamble{\bigstrut$\hfil##$\enspace%
&&\vrule##&\enspace\hfil$##$\hfil\enspace}%
\offinterlineskip
n\,&&P(n)/\mskip-2mu\sqrt{\pi n/2}&&P(n)-\sqrt{\pi n/2}\cr
\omit&height2pt&&\cr
\noalign{\hrule}
\omit&height2pt&&\cr
1&&0.798&&-0.253\cr
%5&&0.853&&-0.411\cr
10&&0.878&&-0.484\cr
%15&&0.893&&-0.518\cr
20&&0.904&&-0.538\cr
30&&0.918&&-0.561\cr
40&&0.927&&-0.575\cr
50&&0.934&&-0.585\cr
\enddisplay
The numerical evidence in the middle column is not very compelling;
it certainly is far from a dramatic proof that $P(n)/\mskip-2mu
\sqrt{\pi n/2}$ approaches~$1$ rapidly, if at all. But the right-hand column
shows that $P(n)$ is
very close indeed to $\displaystyle\sqrt{\pi n/2}$.
Thus we can characterize the behavior
of~$P(n)$ much better if we can derive formulas of the form
\begindisplay
P(n)=\sqrt{\pi n/2}+O(1)\,,
\enddisplay
or even sharper estimates like
\begindisplay
P(n)=\sqrt{\pi n/2}-\textstyle{2\over3}+O(1/\sqrt n\,)\,.
\enddisplay
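Using the formula for $P(n)$ quoted above, a short Python sketch reproduces the table's columns exactly and probes the sharper estimate (exact rational arithmetic avoids overflow in $k^{n-k}\,k!$):

```python
from fractions import Fraction
from math import factorial, pi, sqrt

def P(n):
    # P(n) = sum_{k=0}^{n} k^(n-k) k! / n!, evaluated exactly, then floated
    nf = factorial(n)
    return float(sum(Fraction(k**(n - k) * factorial(k), nf)
                     for k in range(n + 1)))

# the two columns of the table in the text
for n in (1, 10, 20, 50):
    m = sqrt(pi * n / 2)
    print(n, round(P(n) / m, 3), round(P(n) - m, 3))

assert abs(P(1) - sqrt(pi / 2) + 0.253) < 1e-3
assert abs(P(50) - sqrt(25 * pi) + 0.585) < 1e-3
# P(n) - sqrt(pi n / 2) + 2/3 should be O(1/sqrt(n)): already small at n = 50
assert abs(P(50) - sqrt(25 * pi) + 2 / 3) < 0.1
```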
Stronger methods
of asymptotic analysis are needed to prove $O$-results,
but the additional effort required to learn these stronger methods
is amply compensated by the improved understanding that comes with~$O$-bounds.
Moreover, many sorting algorithms have running times of the form
\begindisplay
T(n)=A\,n\lg n\,+\,B\,n\, +\, O(\log n)
\enddisplay
for some constants $A$ and $B$. Analyses that stop at $T(n)\sim A\,n\lg n$
don't tell the whole story, and it turns out to be a bad strategy to
choose a sorting algorithm based just on its $A$ value. Algorithms with
a good~`$A$' often achieve this at the expense of a bad~`$B$'. Since $n\lg n$
grows only slightly faster than~$n$, the algorithm that's faster asymptotically
(the one with a slightly smaller~$A$~value) might be faster only for
values of~$n$ that never actually arise in practice. Thus, asymptotic
methods that allow us to go past the first term and evaluate~$B$
are necessary if we are to make the right choice of method.
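A crossover computation makes the point vivid. The constants below are invented purely for illustration:

```python
from math import log2

# Two hypothetical sorting methods with running time A n lg n + B n
# (the constants A and B are made up for this sketch):
def t1(n):                       # small A, large B
    return 1.0 * n * log2(n) + 100.0 * n

def t2(n):                       # large A, small B
    return 2.0 * n * log2(n) + 1.0 * n

# t1 is asymptotically faster, but it wins only once lg n > 99:
crossover = next(k for k in range(1, 200) if t1(2**k) < t2(2**k))
print("t1 beats t2 only when n exceeds 2 **", crossover - 1)
assert crossover == 100          # n must exceed 2^99, about 6.3e29
```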
Before we go on to study $O$, let's talk about one more small aspect
of mathematical style. Three different notations for
logarithms have been used in this chapter: "lg", "ln", and~"log". We often
\g Also lD, the Dura\-flame logarithm.\g
use `lg' in connection with computer methods, because binary
logarithms are often relevant in such cases; and we often use `ln'
in purely mathematical calculations, since the formulas for natural
logarithms are nice and simple. But what about `log'? Isn't this the ``common''
base-10 logarithm that students learn in high school\dash---the ``common''
"!common logarithm"
\tabref|nn:log|logarithm that turns out
to be very uncommon in mathematics and computer science? Yes; and many
mathematicians confuse the issue by using `log'
to stand for natural logarithms or binary logarithms.
There is no universal agreement here.
But we can usually breathe a sigh of relief when a logarithm appears inside
$O$-notation, because $O$ ignores multiplicative constants.
There is no difference between $O(\lg n)$, $O(\ln n)$, and $O(\log n)$, as
$n\to\infty$;
similarly, there is no difference
between $O(\lg\lg n)$, $O(\ln\ln n)$, and $O(\log\log n)$.
\g Notice that\par$\log\log\log n$\par is undefined when\/ $n\le10$.\g
We get to choose whichever we please; and the one with
`log' seems friendlier because it is more pronounceable.
Therefore we generally use `log' in all contexts where it improves
readability without introducing ambiguity.
\beginsection 9.3 O Manipulation
Like any mathematical formalism, the $O$-notation has rules of manipulation
that free us from the grungy details of its definition. Once
we prove that the rules are correct, using the definition, we can henceforth
work on a higher plane and forget about actually verifying that one
set of functions is contained in another. We don't even need to calculate the
\g The secret of being a bore is to tell everything.\par
\hfill\dash---"Voltaire"\g
constants $C$ that are implied by each~$O$, as long as we follow rules
that guarantee the existence of such constants.
For example, we can prove once and for all that
\begindisplay \openup2pt
&n^m=O(n^{m'}),\qquad\hbox{when $m\le m'$};\eqno\cr
&O\bigl(f(n)\bigr)+O\bigl(g(n)\bigr)=
O\bigl(\vert f(n)\vert+\vert g(n)\vert\bigr)\,.
\eqno\eqref|o-f+g|
\enddisplay
Then we can say immediately that ${1\over3}n^3+{1\over2}n^2+{1\over6}n
=O(n^3)+O(n^3)+O(n^3)=O(n^3)$, without the laborious calculations
in the previous section.
Here are some more rules that follow easily from the definition:
\begindisplay
f(n)&=O\bigl(f(n)\bigr)\,;\eqno\eqref|o-identity|\cr
c\cdot O\bigl(f(n)\bigr)&=O\bigl(f(n)\bigr)\,,
\qquad\hbox{if $c$ is constant};\eqno\cr
O\bigl(O\bigl(f(n)\bigr)\bigr)&=O\bigl(f(n)\bigr)\,;\eqno\cr
O\bigl(f(n)\bigr)@O\bigl(g(n)\bigr)&=O\bigl(f(n)@g(n)\bigr)\,;
\eqno\eqref|o-prod-in|\cr
O\bigl(f(n)\,g(n)\bigr)&=f(n)@O\bigl(g(n)\bigr)\,.
\eqno\eqref|o-prod-out|\cr
\enddisplay
Exercise |prove-o-f+g| proves \eq(|o-f+g|), and the proofs of the
others are similar. We can always replace something of the form on the
left by what's on the right, regardless of the side conditions on the
variable~$n$.
Equations \eq(|o-prod-out|) and \eq(|o-identity|) allow us to derive
\g\vskip-20pt
(Note: The formula\/ $O(f(n))^2$ does not denote the set of all functions\/
$g(n)^2$ where\/ $g(n)$ is in $O(f(n))$; such functions $g(n)^2$ cannot
be negative, but the set\/ $O(f(n))^2$ includes negative functions. In general,
when $S$~is a set, the notation\/ $S^2$ stands for the set of all products\/
$s_1s_2$ with $s_1$~and\/~$s_2$ in~$S$, not for the set of all
squares\/ $s^2$ with $s\in S$.)\g
the identity $O\bigl(f(n)^2\bigr)=O\bigl(f(n)\bigr){}^2$. This sometimes
helps avoid parentheses, since we can write
\begindisplay
O(\log n)^2\qquad\hbox{instead of}\qquad O\bigl((\log n)^2\bigr)\,.
\enddisplay
Both of these are preferable to `$O(\log^2 n)$', which is ambiguous
because some authors use it to mean `$O(\log\log n)$'.
Can we also write
\begindisplay
O(\log n)^{-1}\qquad\hbox{instead of}\qquad O\bigl((\log n)^{-1}\bigr)\,?
\enddisplay
No! This is an abuse of notation, since the set of functions $1/O(\log n)$
is neither a subset nor a superset of $O(1/\!@\log n)$. We could legitimately
substitute $\Omega(\log n)^{-1}$ for $O\bigl((\log n)^{-1}\bigr)$, but
this would be awkward. So we'll restrict our use of ``exponents outside
the~$O@$'' to constant, positive integer exponents.
Power series give us some of the most useful operations of all. If the
sum
\begindisplay
S(z)=\sum_{n\ge0}a_n\,z^n
\enddisplay
converges absolutely for some complex number $z=z_0$, then
\begindisplay
S(z)=O(1)\,,\qquad\hbox{for all $\vert z\vert\le\vert z_0\vert$}.
\enddisplay
This is obvious, because
\begindisplay
\vert S(z)\vert\le
\sum_{n\ge0}\vert a_n@\vert\,\vert z\vert^n\le
\sum_{n\ge0}\vert a_n@\vert\,\vert z_0\vert^n=C<\infty\,.
\enddisplay
In particular, $S(z)=O(1)$ as $z\to0$, and $S(1/n)=O(1)$ as $n\to\infty$,
provided only that $S(z)$ converges for at least one nonzero value of~$z$.
We can use this principle to truncate a power series at any convenient
point and estimate the remainder with~$O$. For example, not only is
$S(z)=O(1)$, but
\begindisplay
S(z)&=a_0+O(z)\,,\cr
S(z)&=a_0+a_1z+O(z^2)\,,\cr
%S(z)&=a_0+a_1z+a_2z^2+O(z^3)\,,\cr
\enddisplay
and so on, because
\begindisplay
%S(z)=a_0+a_1z+a_2z^2+z^3\sum_{n\ge3}a_{n-3}z^{n-3}
S(z)=\sum_{0\le k<m}a_kz^k+z^m\!\sum_{k\ge m}a_k@z^{k-m}
=\sum_{0\le k<m}a_kz^k+O(z^m)\,,\qquad\hbox{as $z\to0$},
\enddisplay
since the remaining sum converges absolutely for $0<\vert z\vert\le\vert
z_0\vert$ and is therefore $O(1)$.
The quantity $N_n(k)$, the number of digits in the radix~$n$
representation of~$k$, satisfies $N_n(k)>\log_n k$; hence the sum
$S_n=\sum_{k\ge1}1/kN_n(k)^2$ converges, because $S_n<1+(\log n)^2\sum_{k\ge2}
1/k(\log k)^2$.
Proceeding as in Problem 1, we can try to write $N_n(k)=\log_n k+O(1)$ and
substitute this into the formula for~$S_n$. The term represented here
by~$O(1)$
is always between $0$ and~$1$, and it is about $\half$ on the average,
so it seems rather well-behaved. But still, this isn't a good enough
approximation to tell us about~$S_n$; it gives us zero significant figures
(that is, high relative error)
when $k$ is small, and these are the terms that contribute the most to
the sum. We need a different idea.
The key (as in Problem 4) is to use our manipulative skills to put the
sum into a more tractable form, before we resort to asymptotic estimates.
We can introduce a new variable of summation, $m=N_n(k)$:
\begindisplay
S_n&=\sum_{k,m\ge1}{\bigl[m=N_n(k)\bigr]\over km^2}\cr
&=\sum_{k,m\ge1}{\bigl[n^{m-1}\le k<n^m\bigr]\over km^2}\,.
\enddisplay
In the formula for $\Phi(n)$ displayed next, each floor $\lfloor n/k\rfloor$
equals $n/k+O(1)$; each such $O(1)$ term, when the square is multiplied out,
contributes $O(n/k)$, and $n/k\le n$, so the total of all these
contributions cannot be more than a constant times $nH_n=
O(n\log n)$ in absolute value.
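The regrouping by $m=N_n(k)$ can be checked exactly on a truncation of the sum; a small Python verification (the parameters $n=3$, $M=5$ are arbitrary small choices):

```python
from fractions import Fraction

def N(n, k):
    # N_n(k) = m  iff  n^(m-1) <= k < n^m  (the number of radix-n digits of k)
    m = 1
    while k >= n ** m:
        m += 1
    return m

n, M = 3, 5
# left: sum over k < n^M of 1/(k N_n(k)^2); right: the same sum regrouped
# by the value m = N_n(k), i.e., over the blocks n^(m-1) <= k < n^m
lhs = sum(Fraction(1, k * N(n, k) ** 2) for k in range(1, n ** M))
rhs = sum(Fraction(1, m * m) *
          sum(Fraction(1, k) for k in range(n ** (m - 1), n ** m))
          for m in range(1, M + 1))
assert lhs == rhs
print(float(lhs))
```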
This preliminary analysis indicates that we'll find it advantageous to write
\begindisplay \openup3pt
\Phi(n)=\half\sum_{k=1}^n\mu(k)\biggl(\Bigl({n\over k}\Bigr)+O(1)\biggr)^{\!2}
&=\half\sum_{k=1}^n\mu(k)\biggl(\Bigl({n\over k}\Bigr)^{\!2}+
O\Bigl({n\over k}\Bigr)\biggr)\cr
&=\half\sum_{k=1}^n\mu(k)\Bigl({n\over k}\Bigr)^{\!2}
+\sum_{k=1}^n O\Bigl({n\over k}\Bigr)\cr
&=\half\sum_{k=1}^n\mu(k)\Bigl({n\over k}\Bigr)^{\!2}\,+\,
O(n\log n)\,.\cr
\enddisplay
This removes the floors; the remaining problem is to evaluate the unfloored sum
$\half\sum_{k=1}^n
\mu(k)n^2\!/k^2$ with an accuracy of $O(n\log n)$; in other words, we want
to evaluate $\sum_{k=1}^n \mu(k)/k^2$ with an accuracy of $O(n^{-1}\log n)$.
But that's easy; we can simply run the sum all the way up to $k=\infty$, because
the newly added terms are
\begindisplay
\sum_{k>n}{\mu(k)\over k^2}=O\Bigl(\sum_{k>n}{1\over k^2}\Bigr)
&=O\Bigl(\sum_{k>n}{1\over k(k-1)}\Bigr)\cr
&=O\biggl(\sum_{k>n}\Bigl({1\over k-1}-{1\over k}\Bigr)\biggr)
=O\Bigl({1\over n}\Bigr)\,.
\enddisplay
We proved in \equ(7.|inverse-zeta|) that $\sum_{k\ge1}\mu(k)/k^z=
1/\zeta(z)$. Hence $\sum_{k\ge1}\mu(k)/k^2=1\big/\bigl(\sum_{k\ge1}1/k^2\bigr)=
6/\pi^2$, and we have our answer:
\g\vskip20pt(The error term was shown to be at most
$O\bigl(n(\log n)^{2/3}(\log\log n)^{1+\epsilon}\bigr)$ by "Saltykov" in
1960~[|salty-phi|]. On the other hand, it is not as small as
$o(n(\log\mskip-1mu\log\mskip-1mu n)^{1/2})$, according to
"Montgomery"~[|monty-phi|].)\g
\begindisplay
\Phi(n)={3\over \pi^2}n^2+O(n\log n)\,.
\eqno\eqref|o-phi|
\enddisplay
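Readers with a computer handy can check \eq(|o-phi|) empirically. The
following sketch (in Python, our own illustration rather than anything
from the text) sieves Euler's $\varphi$ function and compares
$\Phi(n)=\sum_{1\le k\le n}\varphi(k)$ with the main term $3n^2\!/\pi^2$:

```python
from math import pi

def totient_sums(N):
    # phi[k] for 0 <= k <= N via a sieve: for each prime p,
    # multiply in the factor (1 - 1/p) for every multiple of p
    phi = list(range(N + 1))
    for p in range(2, N + 1):
        if phi[p] == p:                  # p is prime
            for multiple in range(p, N + 1, p):
                phi[multiple] -= phi[multiple] // p
    # prefix sums: Phi[n] = phi(1) + ... + phi(n)
    Phi, running = [0] * (N + 1), 0
    for k in range(1, N + 1):
        running += phi[k]
        Phi[k] = running
    return Phi

Phi = totient_sums(1000)
for n in (10, 100, 1000):
    print(n, Phi[n], 3 * n * n / pi**2)
```

The discrepancy stays well inside the $O(n\log n)$ error bound.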
\beginsection 9.4 Two Asymptotic Tricks
Now that we have some facility with $O$ manipulations, let's look at what
we've done from a slightly higher perspective. Then we'll have some
important weapons in our asymptotic arsenal, when we need to do battle with
tougher problems.
\subhead Trick 1: Bootstrapping.
When we estimated the $n$th prime $P_n$ in Problem~3 of Section~9.3,
we solved an asymptotic recurrence of the form
\begindisplay
P_n=n\ln P_n\bigl(1+O(1/\!@\log n)\bigr)\,.
\enddisplay
We proved that $P_n=n\ln n+O(n)$ by first using the recurrence to show the
weaker result $O(n^2)$. This is a special case of a general method called
{\it"bootstrapping"}, in which we solve a recurrence asymptotically by
starting with a rough estimate and plugging it into the recurrence; in this
way we can often derive better and better estimates, ``pulling ourselves
up by our bootstraps.''
Here's another problem that illustrates bootstrapping nicely: What is
the asymptotic value of the coefficient $g_n=[z^n]\,G(z)$ in the
generating function
\begindisplay
G(z)=\exp\Bigl(@\sum_{k\ge1}{z^k\over k^2}\Bigr)\,,
\eqno
\enddisplay
as $n\to\infty$? If we differentiate this equation with respect to~$z$, we find
\begindisplay
G'(z)=\sum_{n=0}^\infty ng_n@z^{n-1}=\Bigl(@
\sum_{k\ge1}{z^{k-1}\over k} \Bigr)\,G(z)\,;
\enddisplay
equating coefficients of $z^{n-1}$ on both sides gives the recurrence
\begindisplay
ng_n=\sum_{0\le k<n}{g_k\over n-k}\,.
\enddisplay
The sequence $\langle g_n\rangle$ doesn't appear in Sloane's {\sl Handbook\/}~[|sloane|];
therefore a closed form for $g_n$ seems out of the question, and asymptotic
information is probably the best we can hope to derive.
Our first handle on this problem is the observation that $0<g_n\le1$ for
all $n\ge0$; hence $g_n=O(1)$, and plugging this into the recurrence gives
\begindisplay
ng_n=\sum_{0\le k<n}{O(1)\over n-k}=O(H_n)\,;\qquad
g_n=O\Bigl({\log n\over n}\Bigr)\,,\qquad\hbox{for $n>1$}.
\enddisplay
And we can bootstrap yet again:
\begindisplay
ng_n={1\over n}+\sum_{0<k<n}{g_k\over n-k}\,.
\enddisplay
The new estimate of the tail uses the bound $\sum_{k>n}(\log k)^2\!/k^2=
O\bigl((\log n)^2\!/n\bigr)$, which follows by summing over the ranges
$n^m<k\le n^{m+1}$ separately. (The result is a truncation of a series that diverges
for all fixed~$n$ if we let $m\to\infty$.)
There's only one flaw in our solution: We were too cautious.
We derived \eq(|log-bound|)
on the assumption that $k<\lfloor\lg n\rfloor$, but exercise~|prove-log-bound|
proves that the stated estimate is actually valid for all values of~$k$.
If we had known the stronger general
result, we wouldn't have
had to use the two-tail trick; we could have gone directly to
the final formula! But later we'll encounter problems where
exchange of tails is the only decent approach available.
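Bootstrapping can also be watched in action numerically. Here is a sketch
(ours, in Python) that iterates the recurrence
$ng_n=\sum_{0\le k<n}g_k/(n-k)$ starting from $g_0=1$; the quantity
$n^2g_n$ settles down toward a constant, numerically close to
$G(1)=e^{\pi^2\!/6}\approx5.18$, a plausible limit since the late terms of
the recurrence essentially average the early $g_k$'s:

```python
from math import exp, pi

def g_coeffs(N):
    # g_0 = 1 since G(0) = exp(0) = 1; then n*g_n = sum_{0<=k<n} g_k/(n-k)
    g = [1.0]
    for n in range(1, N + 1):
        g.append(sum(g[k] / (n - k) for k in range(n)) / n)
    return g

g = g_coeffs(2000)
# n^2 g_n for a large n, compared with G(1) = exp(pi^2/6)
print(g[2], 2000**2 * g[2000], exp(pi**2 / 6))
```

The convergence is slow, with relative error of order $(\log n)/n$, just
as the bootstrapped estimates predict.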
\beginsection 9.5 Euler's Summation Formula
And now for our next trick\dash---which is, in fact, the last important
"!Euler's summation formula"
technique that will be discussed in this book\dash---we turn to a
general method of approximating sums that was first published by
Leonhard "Euler"~[|euler-summation|] in 1732. (The idea is sometimes
also associated with the name of Colin "Maclaurin", a professor of mathematics
at Edinburgh
who discovered it independently a short time later~[|maclaurin|, page~305].)
Here's the formula:
\begindisplay \openup4pt
\sum_{a\le k<b}f(k)=\int_a^b\!f(x)\,dx
+\sum_{k=1}^m{B_k\over k!}\,f^{(k-1)}(x)\,\bigg\vert_a^b+R_m\,.
\eqno\eqref|euler-sf|
\enddisplay
To prove it by induction on~$m$, it suffices to consider the interval
from $0$ to~$1$; for $m>1$, we need
to show that
$R_{m-1}=(B_m/m!)f^{(m-1)}(x)\big\vert_0^1+R_m$, namely that
\begindisplay \openup3pt
&(-1)^m \int_0^1{B_{m-1}(x)\over(m-1)!}\,f^{(m-1)}(x)\,dx\cr
&\qquad={B_m\over m!}\,f^{(m-1)}(x)\bigg\vert_0^1-
(-1)^m \int_0^1{B_m(x)\over m!}\,f^{(m)}(x)\,dx\,.
\enddisplay
This reduces to the equation
\begindisplay
(-1)^mB_m@f^{(m-1)}(x)\bigg\vert_0^1
=m \int_0^1 \!B_{m-1}(x)@f^{(m-1)}(x)\,dx
+ \int_0^1 \!B_m(x)@f^{(m)}(x)\,dx\,.
\enddisplay
Once again \eq(|int-by-parts|) applies to these two integrals, with
\g Will the authors never get serious?\g
$u(x)=f^{(m-1)}(x)$ and $v(x)=B_m(x)$, because the derivative of
the Bernoulli polynomial \eq(|bern-poly-def|) is
\begindisplay
{d\over dx}\mskip-1mu\sum_k\mskip-2mu{m\choose k}B_kx^{m-k}
&=\sum_k{m\choose k}(m-k)B_kx^{m-k-1}\cr
&=m\sum_k\mskip-2mu{m{-}1\choose k}B_kx^{m-1-k}=mB_{m-1}(x)\,.\eqno\cr
\enddisplay % make that equation a line longer if there's room!
(The absorption identity \equ(5.|bc-absorb-r-k|) was useful here.)
Therefore the required formula will hold if and only if
\begindisplay
(-1)^mB_mf^{(m-1)}(x)\big\vert_0^1=B_m(x)@f^{(m-1)}(x)\big\vert_0^1\,.
\enddisplay
In other words, we need to have
\begindisplay
(-1)^mB_m=B_m(1)=B_m(0)\,,\qquad\hbox{for $m>1$}. \eqno
\enddisplay
This is a bit embarrassing, because $B_m(0)$ is obviously equal to $B_m$,
not to~$(-1)^mB_m$. But there's no problem really, because $m>1$;
we know that $B_m$ is zero when $m$~is odd. (Still,
that was a close call.)
To complete the proof of Euler's summation formula
we need to show that $B_m(1)=B_m(0)$, which is the
same as saying that
\begindisplay
\sum_k{m\choose k}B_k=B_m\,,\qquad\hbox{for $m>1$}.
\enddisplay
But this is just the definition of Bernoulli numbers,
\equ(6.|bern-def|), so we're done.
The identity $B_m'(x)=mB_{m-1}(x)$ implies that
\begindisplay
\int_0^1B_m(x)\,dx={B_{m+1}(1)-B_{m+1}(0)\over m+1}\,,
\enddisplay
and we know now that this integral is zero when $m\ge1$. Hence the remainder
term in Euler's formula,
\begindisplay
R_m={(-1)^{m+1}\over m!}\int_a^b B_m\bigl(\{x\}\bigr)@f^{(m)}(x)\,dx\,,
\enddisplay
multiplies $f^{(m)}(x)$ by a function $B_m\bigl(\{x\}\bigr)$ whose
average value is zero. This means that $R_m$ has a reasonable
chance of being small.
Let's look more closely at $B_m(x)$ for $0\le x\le1$, since $B_m(x)$ governs
the behavior of~$R_m$. Here are the graphs for $B_m(x)$
for the first twelve values of~$m$:\looseness=-1
\begindisplay \unitlength=50pt \advance\belowdisplayskip -\baselineskip
&}\hfill m=1\hfill{\qquad&}\hfill m=2\hfill{\qquad
&}\hfill m=3\hfill{\qquad&}\hfill m=4\hfill{\qquad\cr
\raise.5\unitlength\hbox{$B_m(x)$}\quad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0,.5,1,-.5,0,.5)}
\put(0,0){\line(1,0)1}
\endpicture
\qquad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0.0,0.125,0.25,0.166666666,0.041666666,-0.020833334 )}
\put(0,0){\squine(0.25,0.375,0.5,-0.020833334,-0.083333334,-0.083333334 )}
\put(0,0){\squine(0.5,0.625,0.75,-0.083333334,-0.083333334,-0.020833334 )}
\put(0,0){\squine(0.75,0.875,1.0,-0.020833334,0.041666664,0.166666666 )}
\put(0,0){\line(1,0)1}
\endpicture
\qquad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0.0,0.111111113,0.25,0.0,0.0555555564,0.046875 )}
\put(0,0){\squine(0.25,0.33333334,0.5,0.046875,0.041666667,0.0 )}
\put(0,0){\squine(0.5,0.66666666,0.75,0.0,-0.041666666,-0.046875 )}
\put(0,0){\squine(0.75,0.88888889,1.0,-0.046875,-0.0555555564,0.0 )}
\put(0,0){\line(1,0)1}
\endpicture
\qquad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0.0,0.0625,0.25,-0.033333333,-0.033333333,.00182291679 )}
\put(0,0){\squine(0.25,0.395833332,0.5,.00182291679,0.0291666668,0.0291666668 )}
\put(0,0){\squine(0.5,0.604166664,0.75,0.0291666668,0.0291666668,.00182291679 )}
\put(0,0){\squine(0.75,0.9375,1.0,.00182291679,-0.033333333,-0.033333333 )}
\put(0,0){\line(1,0)1}
\endpicture
\cr
\noalign{\vskip-4pt}
\raise.5\unitlength\hbox{$B_{4+m}(x)$}\quad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0.0,0.151851851,0.25,0.0,-0.0253086418,-0.0244140623 )}
\put(0,0){\squine(0.25,0.338095237,0.5,-0.0244140623,-0.0236111109,0)}
\put(0,0){\squine(0.5,0.66190476,0.75,0,0.023611111,0.0244140625 )}
\put(0,0){\squine(0.75,0.848148175,1.0,0.0244140625,0.0253086425,0 )}
\put(0,0){\line(1,0)1}
\endpicture
\qquad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0.0,0.084999998,0.25,0.0238095238,0.0238095238,-.000360398087 )}
\put(0,0){\squine(0.25,0.405000005,0.5,-.000360398087,-0.023065477,-0.0230654762 )}
\put(0,0){\squine(0.5,0.595000006,0.75,-0.0230654762,-0.0230654757,-.000360398087 )}
\put(0,0){\squine(0.75,0.914999984,1.0,-.000360398087,0.023809521,0.0238095238 )}
\put(0,0){\line(1,0)1}
\endpicture
\qquad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0.0,0.157768156,0.25,0.0,0.026294693,0.0260620115 )}
\put(0,0){\squine(0.25,0.33998976,0.5,0.0260620115,0.0258349872,0)}
\put(0,0){\squine(0.5,0.66001026,0.75,0,-0.0258349907,-0.0260620154 )}
\put(0,0){\squine(0.75,0.84223185,1.0,-0.0260620154,-0.0262946966,0)}
\put(0,0){\line(1,0)1}
\endpicture
\qquad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0.0,0.089505268,0.25,-0.033333333,-0.033333333,.000129191205 )}
\put(0,0){\squine(0.25,0.408006445,0.5,.000129191205,0.033072917,0.033072916 )}
\put(0,0){\squine(0.5,0.59199359,0.75,0.033072916,0.0330729154,.000129191205 )}
\put(0,0){\squine(0.75,0.910494655,1.0,.000129191205,-0.0333333216,-0.0333333258 )}
\put(0,0){\line(1,0)1}
\endpicture
\cr
\noalign{\vskip-4pt}
\raise.5\unitlength\hbox{$B_{8+m}(x)$}\quad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0.0,0.15885393,0.25,0.0,-0.047656179,-0.047550202 )}
\put(0,0){\squine(0.25,0.340605214,0.5,-0.047550202,-0.047444853,0.0 )}
\put(0,0){\squine(0.5,0.659394786,0.75,0.0,0.0474448525,0.0475502014 )}
\put(0,0){\squine(0.75,0.841146074,1.0,0.0475502014,0.0476561785,0)}
\put(0,0){\line(1,0)1}
\endpicture
\qquad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0.0,0.090523467,0.25,0.075757576,0.075757576,-.0000738371164 )}
\put(0,0){\squine(0.25,0.408854794,0.5,-.0000738371164,-0.075609611,-0.075609611 )}
\put(0,0){\squine(0.5,0.59114521,0.75,-0.075609611,-0.075609611,-.0000738371164 )}
\put(0,0){\squine(0.75,0.90947651,1.0,-.0000738371164,0.075757567,0.075757576 )}
\put(0,0){\line(1,0)1}
\endpicture
\qquad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0.0,0.159084527,0.25,0.0,0.13257044,0.132496597 )}
\put(0,0){\squine(0.25,0.340781596,0.5,0.132496597,0.132422864,0.0 )}
\put(0,0){\squine(0.5,0.65921841,0.75,0.0,-0.132422863,-0.132496595 )}
\put(0,0){\squine(0.75,0.84091552,1.0,-0.132496595,-0.132570438,0)}
\put(0,0){\line(1,0)1}
\endpicture
\qquad&
\beginpicture(1,1)(0,-.5)
\put(0,0){\squine(0.0,0.090766151,0.25,-0.253113553,-0.253113553,.000061765313 )}
\put(0,0){\squine(0.25,0.409078423,0.5,.000061765313,0.252989963,0.252989963 )}
\put(0,0){\squine(0.5,0.59092157,0.75,0.252989963,0.252989963,.000061765313 )}
\put(0,0){\squine(0.75,0.90923382,1.0,.000061765313,-0.253113512,-0.253113553 )}
\put(0,0){\line(1,0)1}
\endpicture
\enddisplay
Although $B_3(x)$ through $B_9(x)$ are quite small, the Bernoulli polynomials
and numbers ultimately get quite large. Fortunately $R_m$ has a compensating
factor~$1/m!$, which helps to calm things down.
The graph of $B_m(x)$ begins to look very much like a sine wave when
%$m\ge3$; exercise~|bernoulli-sines-and-cosines| proves that $B_m(x)/m!$
%is in fact very nearly equal to $-\bigl(2/(2\pi)^m\bigr)\cos(2\pi x-\half\pi m)$,
$m\ge3$; exercise~|bernoulli-sines-and-cosines| proves that $B_m(x)$ can
in fact be well approximated by a negative multiple of $\cos(2\pi x-\half\pi m)$,
with relative error $1/2^m$.
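A quick numerical experiment (our own Python sketch, generating the
Bernoulli numbers from their defining recurrence) makes the sine-wave
behavior vivid:

```python
from math import comb, cos, factorial, pi

def bernoulli_numbers(M):
    # B_0..B_M from sum_{k<=m} C(m+1,k) B_k = 0 for m >= 1 (so B_1 = -1/2)
    B = [1.0]
    for m in range(1, M + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B

B = bernoulli_numbers(12)

def bern_poly(m, x):
    # B_m(x) = sum_k C(m,k) B_k x^{m-k}
    return sum(comb(m, k) * B[k] * x ** (m - k) for k in range(m + 1))

def cosine_approx(m, x):
    # leading term of the Fourier expansion of B_m(x) on [0,1]
    return -2.0 * factorial(m) / (2 * pi) ** m * cos(2 * pi * x - pi * m / 2)

for x in (0.0, 0.25, 0.7):
    print(bern_poly(8, x), cosine_approx(8, x))
```

Already at $m=8$ the two columns agree to about one part in $2^8$.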
In general, $B_{4k+1}(x)$ is negative for $0<x<\half$ and positive for
$\half<x<1$; hence $B_{4k+2}(x)$, whose derivative is $(4k+2)B_{4k+1}(x)$,
decreases to a minimum at $x=\half$ and then increases.
The integral of $B_{4k+2}(x)$ is
$B_{4k+3}(x)/(4k+3)$, which must therefore be positive when $0<x<\half$
and negative when $\half<x<1$.
Therefore we can rewrite Euler's formula \eq(|euler-sf|) as follows:
\begindisplay \openup3pt
\sum_{a\le k<b}f(k)&=\int_a^b\!f(x)\,dx-\half f(x)\,\bigg\vert_a^b
+\sum_{k=1}^m{B_{2k}\over(2k)!}\,f^{(2k-1)}(x)\,\bigg\vert_a^b\cr
&\qquad\null+\theta_m\,{B_{2m+2}\over(2m+2)!}\,f^{(2m+1)}(x)\,\bigg\vert_a^b\,,
\qquad\hbox{for some $0<\theta_m<1$,}\cr
\eqno\eqref|euler-sf++|
\enddisplay
provided that $f^{(2m+2)}(x)\ge0$ for $a\le x\le b$. The odd-numbered
Bernoulli number $B_{2m+1}$ is zero;
hence $R_{2m}=R_{2m+1}$, and the
first discarded term must be
\begindisplay
R_{2m}-R_{2m+2}\,.
\enddisplay
We therefore want to show that $R_{2m}$ lies between $0$ and $R_{2m}-R_{2m+2}$;
and this is true if and only if $R_{2m}$ and $R_{2m+2}$ have opposite signs.
We claim that
\begindisplay
f^{(2m+2)}(x)\ge0\quad\hbox{for $a\le x\le b$\quad implies\quad}
(-1)^mR_{2m}\ge0\,.
\eqno
\enddisplay
This, together with \eq(|remainder-hypothesis|), will prove that
$R_{2m}$ and $R_{2m+2}$ have opposite signs, so the proof
of \eq(|remainder-theta|) will be complete.
It's not difficult to prove \thiseq\ if we recall the definition of
$R_{2m+1}$ and the facts we proved about the graph of $B_{2m+1}(x)$.
Namely, we have
\begindisplay
R_{2m}=R_{2m+1}=
\int_a^b{B_{2m+1}\bigl(\{x\}\bigr)\over(2m+1)!}\,f^{(2m+1)}(x)\,dx\,,
\enddisplay
and $f^{(2m+1)}(x)$ is increasing because its derivative $f^{(2m+2)}(x)$
is positive.
(More precisely, $f^{(2m+1)}(x)$ is nondecreasing because
its derivative is nonnegative.)
The graph of $B_{2m+1}\bigl(\{x\}\bigr)$ looks like
$(-1)^{m+1}$ times a sine wave, so it is geometrically
obvious that
the second half of each sine wave is more influential than the first
half when it is multiplied by an increasing function.
This makes $(-1)^mR_{2m+1}\ge0$, as desired.
Exercise~|prove-remainder-thm| proves the result formally.
\beginsection 9.6 Final Summations
Now comes the summing up, as we prepare to conclude this book. We will
apply Euler's summation formula to some interesting and important
examples.
\subhead Summation 1: This one is too easy.
But first we will consider an interesting {\it unimportant\/} example,
namely a sum that we already know how to do. Let's see what
Euler's summation formula tells us if we apply it to the
telescoping sum
\begindisplay
S_n=\sum_{1\le k<n}{1\over k(k+1)}\,.
\enddisplay
The remainders do not, however, decrease indefinitely as $m$ grows: The
Bernoulli numbers get big, for example $B_{22}=854513/138>6192$; therefore $\bigl\vert R_{22}(n)
\bigr\vert$ will be much larger than $\bigl\vert R_4(n)\bigr\vert$.
We~lose~totally.
There is a way out, however, and this escape route will turn out to be
important in other applications of Euler's formula. The key is to
notice that $R_4(n)$ approaches a definite limit as $n\to\infty$:
\begindisplay
\lim_{n\to\infty}R_4(n)=-\int_1^\infty B_4\bigl(\{x\}\bigr)\Bigl(
{1\over x^5}-{1\over(x+1)^5}\Bigr)\,dx=R_4(\infty)\,.
\enddisplay
The integral $\int_1^\infty B_m\bigl(\{x\}\bigr)@f^{(m)}(x)\,dx$ will
exist whenever $f^{(m)}(x)=O(x^{-2})$ as $x\to\infty$, and in this
case $f^{(4)}(x)$ surely qualifies. Moreover, we have
\begindisplay
R_4(n)&=R_4(\infty)+\int_n^\infty B_4\bigl(\{x\}\bigr)\Bigl(
{1\over x^5}-{1\over(x+1)^5}\Bigr)\,dx\cr
&=R_4(\infty)+O\Bigl(\int_n^\infty x^{-6}\,dx\Bigr)=R_4(\infty)+O(n^{-5})\,.
\enddisplay
Thus we have used Euler's summation formula to prove that
\begindisplay
\sum_{1\le k<n}{1\over k(k+1)}=1-{1\over n}\,.
\enddisplay
\subhead Summation 2: Harmonic numbers harmonized.
Next let's try the sum of reciprocals, taking $f(x)=1/x$. In this case
the derivative $f^{(2m+2)}(x)=(2m+2)!\,/x^{2m+3}$ is positive for all $x>0$.
Therefore \eq(|euler-sf++|) tells us that
\begindisplay
H_n=\ln n+\gamma+{1\over2n}-\sum_{k=1}^m{B_{2k}\over2k n^{2k}}
+\theta_{m,n}{B_{2m+2}\over(2m+2)n^{2m+2}}\,,
\eqno
\enddisplay
\looseness=-1
where $\theta_{m,n}$ is some fraction between $0$ and~$1$.
This is the general formula whose first few terms
are listed in Table~|o-special|. For example, when $m=2$ we get
\begindisplay
H_n=\ln n+\gamma+{1\over2n}-{1\over12n^2}+{1\over120n^4}
-{\theta_{2,n}\over252n^6}\,.
\eqno\eqref|harmonic-theta|
\enddisplay
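Equation \eq(|harmonic-theta|) is a pleasant thing to verify by machine.
The following sketch (Python; the value of $\gamma$ is the one quoted in
the text) recovers $\theta_{2,n}$ and confirms that it lies strictly
between $0$ and~$1$:

```python
from math import log

gamma = 0.5772156649015329          # Euler's constant, from the text

def H(n):
    # harmonic number H_n by direct summation
    return sum(1.0 / k for k in range(1, n + 1))

def approx(n):
    # ln n + gamma + 1/(2n) - 1/(12n^2) + 1/(120n^4);
    # the omitted term is theta_{2,n}/(252 n^6) with 0 < theta < 1
    return log(n) + gamma + 1/(2*n) - 1/(12*n**2) + 1/(120*n**4)

for n in (2, 10):
    print(n, H(n), approx(n), (approx(n) - H(n)) * 252 * n**6)
```

The last column is $\theta_{2,n}$; at $n=2$ it comes out near $0.81$,
consistent with the bound $\epsilon<{1\over16128}$ mentioned below.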
This equation, incidentally, gives us a good approximation to~$\gamma$
even when $n=2$:
\begindisplay
\textstyle \gamma=H_2-\ln 2-{1\over4}+{1\over48}-{1\over1920}+\epsilon
=0.577165\ldots+\epsilon\,,
\enddisplay
where $\epsilon$ is between zero and $1\over16128$. If we take
$n=10^4$ and $m=250$, we get the value of~$\gamma$ correct to
1271 decimal places, beginning thus~[|knuth-gamma|]:
\begindisplay
\gamma=0.57721\,56649\,01532\,86060\,65120\,90082\,40243\ldots\,.
\eqno
\enddisplay
But Euler's constant appears also in other formulas that allow it to be
"!Euler's constant, evaluation of"
evaluated even more efficiently [|sweeney-gamma|].
\subhead Summation 3: Stirling's approximation.
If $f(x)=\ln x$, we have $f'(x)=1/x$, so we can evaluate the sum of
logarithms using almost the same calculations
as we did when summing reciprocals.
Euler's summation formula yields
\begindisplay
\ln n!\;=\;n\ln n-n+{\ln n\over2}+\sigma
+\sum_{k=1}^m{B_{2k}\over2k(2k-1)n^{2k-1}}+O(n^{-2m-1})\,,
\eqno\eqref|stirling-by-euler|
\enddisplay
where $\sigma$ is a constant, to be determined below. And in the
tail-exchange estimate of $\sum_k{2n\choose k}$, the terms outside the
dominant range $D_n$, where $\vert k\vert\le n^{1/2+\epsilon}$, satisfy
\begindisplay
\sum_{k>n^{1/2+\epsilon}}e^{-k^2\!/n}
&<\exp\bigl(-\lfloor n^{1/2+\epsilon}\rfloor^2\!/n\bigr)
(1+e^{-1/n}+e^{-2/n}+\cdots\,)\cr
&=O(e^{-n^{2\epsilon}})\cdot O(n)\,,
\enddisplay
which is $O(n^{-M})$ for all~$M$; so $\sum_{k\notin D_n}b_k(n)$ is
asymptotically
negligible. (We chose the cutoff at $n^{1/2+\epsilon}$ just so that
$\smash{e^{-k^2\!/n}}$ would be exponentially small outside of~$D_n$. Other
choices like $n^{1/2}\log n$ would have been good enough too, and the
resulting estimates would have been slightly sharper, but the formulas would
have come out more complicated. We need not make the strongest
possible estimates, since our main goal is to establish the value
of the constant~$\sigma$.) Similarly, the other tail
\begindisplay
\sum_{k>n^{1/2+\epsilon}}{2n\choose n+k}
\enddisplay
is bounded by $2n$ times its largest term, which occurs at the cutoff
point $k\approx n^{1/2+\epsilon}$. This term is known to be approximately
$b_k(n)$, which is exponentially small compared with~$A_n$; and an
exponentially small multiplier wipes out the factor of~$2n$.
Thus we have successfully applied the tail-exchange trick to prove
the estimate
\begindisplay
2^{2n}=\sum_k{2n\choose k}={\sqrt{2\pi}\over e^\sigma}2^{2n}+
O(2^{2n}n^{-\half+3\epsilon})\,,\quad\hbox{if $0<\epsilon<{1\over6}$}.
\eqno\eqref|final-summation|
\enddisplay
We may choose $\epsilon={1\over8}$ and conclude that
\g Thanks for reading this, hope it comes in handy.\par
\hfill\kern-4pt\dash---The authors\g
\begindisplay
\sigma=\textstyle\half\ln2\pi\,.
\enddisplay
QED.
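With $\sigma=\half\ln2\pi$ in hand, Stirling's approximation gives the
familiar estimate ${2n\choose n}=2^{2n}\big/\sqrt{\pi n\,}\,
\bigl(1+O(1/n)\bigr)$, which a short computation (ours, in Python)
corroborates:

```python
from fractions import Fraction
from math import comb, pi, sqrt

def central_ratio(n):
    # (2n choose n) divided by its asymptotic form 2^{2n}/sqrt(pi n);
    # Fraction avoids overflowing a float with 4**n before dividing
    return float(Fraction(comb(2 * n, n), 4 ** n)) * sqrt(pi * n)

for n in (10, 100, 1000):
    print(n, central_ratio(n))
```

The ratios approach $1$ from below, at the rate $1-1/(8n)+O(n^{-2})$.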
\beginexercises
\subhead \kern-.05em Warmups
\ex:
Prove or disprove:
If $f_1(n)\prec g_1(n)$ and $f_2(n)\prec g_2(n)$, then we have
$f_1(n)+f_2(n)\prec g_1(n)+g_2(n)$.
\answer True if the functions are all positive. But otherwise we might
have, say, $f_1(n)=n^3+n^2$, $f_2(n)=-n^3$, $g_1(n)=n^4+n$, $g_2(n)=-n^4$.
\source{"Hardy" [|hardy-tract|, 1.3(g)].}
\ex:
Which function grows faster:
\smallskip
\itemitem{a}$n^{(\ln n)}$ \ or \ $(\ln n)^n$?
\itemitem{b}$n^{(\ln\ln\ln n)}$ \ or \ $(\ln n)!\mskip2mu$?
\itemitem{c}$(n!)!$ \ or \ $\bigl((n-1)!\bigr)!\,(n-1)!^{n!}@$?
\itemitem{d}$F^2_{\lceil H_n\rceil}$ \ or \ $H_{F_n}@$?
\answer (a) We have $n^{\ln n}\prec c^n\prec(\ln n)^n$, since
$(\ln n)^2\prec n\ln c\prec n\ln\ln n$. (b)~$n^{\ln\ln\ln n}\prec
(\ln n)!\prec n^{\ln\ln n}$. (c)~Take logarithms to show that
$(n!)!$ wins. (d)~$F^2_{\lceil H_n\rceil}\asymp \phi^{2\ln n}=n^{2\ln\phi}$;
$H_{F_n}\sim n\ln\phi$ wins because $\phi^2<e$.
\answer The text's derivation for the case $\alpha=1$ generalizes to give
\begindisplay
b_k(n)={2^{(2n+1/2)\alpha}\over(2\pi n)^{\alpha/2}}e^{-k^2\mskip-1mu\alpha/n}\,,
\ c_k(n)=2^{2n\alpha}\,n^{-(1+\alpha)/2+3\epsilon}
e^{-k^2\mskip-1mu\alpha/n}\,;\cr
\enddisplay
the answer is $2^{2n\alpha}(\pi n)^{(1-\alpha)/2}\alpha^{-1/2}\bigl(
1+O(n^{-1/2+3\epsilon})\bigr)$.
\source{"Bender" [|bender|, \S3.1].}
\subhead Homework exercises
\ex:
Use a computer to compare the left and right sides of the approximations
in Table~|o-special|, when $n=10$, $z=\alpha=0.1$, and $O\bigl(f(n)\bigr)=
O\bigl(f(z)\bigr)=0$.
\answer $H_{10}=2.928968254\approx2.928968256$; $10!=3628800\approx
3628712.4$; $B_{10}=0.075757576\approx0.075757494$;
$\pi(10)=4\approx10.0017845$;
$e^{0.1}=1.10517092\approx1.10517083$;
$\ln 1.1=0.0953102\approx0.0953083$;
$1.1111111\approx1.1111000$;
$1.1^{0.1}=1.00957658\approx1.00957643$.
(The approximation to $\pi(n)$ gives more significant figures when $n$ is larger;
for example, $\pi(10^9)=50847534\approx50840742$.)
\ex:
Prove or disprove the following estimates, as $n\to\infty$:
\smallskip
\itemitem{a}\displaymath O\Biggl(\biggl({n^2\over\log\log n}\biggr)^{\!1/2}\Biggr)
=O\bigl(\lfloor\sqrt n\rfloor^2\bigr)\,$.
\smallskip
\itemitem{b}\displaymath e^{(1+O(1/n))^2}=e+O(1/n)\,$.
\smallskip
\itemitem{c}\displaymath n!=O\Bigl(\bigl((1-1/n)^nn\bigr)^n\Bigr)\,$.
\answer (a) Yes; the left side is $o(n)$ while the right side is equivalent to
$O(n)$. (b)~Yes; the left side is $e\cdot e^{O(1/n)}$.
(c)~No; the left side is about $\sqrt n$ times the bound on the right.
\source{1971 final exam.}
\ex:\exref|pn-rel3|%
Equation \eq(|pn-rel2|) gives the $n$th prime with relative error
$O(\log n)^{-2}$. Improve the relative error to~$O(\log n)^{-3}$
by starting with another term of \eq(|o-pi|) in \eq(|o-pi-trunc2|).
\answer We have $P_n=m=n\bigl(\ln m-1-1/\!@\ln m+O(1/\!@\log n)^2\bigr)$, where
\begindisplay \openup3pt
\ln m&=\ln n+\ln\ln m-1/\!@\ln n+\ln\ln n/(\ln n)^2+O(1/\!@\log n)^2\,;\cr
\ln\ln m&=\ln\ln n+{\ln\ln n\over\ln n}
-{(\ln\ln n)^2\over2(\ln n)^2}
+{\ln\ln n\over(\ln n)^2}+O(1/\!@\log n)^2\,.
\enddisplay
It follows that
\begindisplay
P_n&=n\biggl(\ln n+\ln\ln n-1\cr
&\hskip5em+{\ln\ln n-2\over\ln n}
-{\half(\ln\ln n)^2-3\ln\ln n\over(\ln n)^2}+O(1/\!@\log n)^2\biggr)\,.
\enddisplay
(A slightly better approximation
\g What does a drowning analytic number theorist say?\par
\medskip log log log log \dots\g
replaces this $O(1/\!@\log n)^2$ by the quantity
$-5/(\ln n)^2+O(\log\log n/\!@\log n)^3$; then we estimate
$P_{1000000}\approx15483612.4$.)
\ex:
Improve \eq(|golomb-ans|) to $O(n^{-3})$.
\answer Replace $O(n^{-2k})$ by $-{1\over12}n^{-2k}+O(n^{-4k})$ in the
expansion of $H_{n^k}$; this replaces $O\bigl(\Sigma_3(n^2)\bigr)$
by $-{1\over12}\Sigma_3(n^2)+O\bigl(\Sigma_3(n^4)\bigr)$ in
\equ(9.|golomb-pieces|). We have
\begindisplay
\textstyle\Sigma_3(n)={3\over4}n^{-1}+
{5\over36}n^{-2}+O(n^{-3})\,,
\enddisplay
hence the term $O(n^{-2})$ in \equ(9.|golomb-ans|)
can be replaced by % 1/12*3/4 + 1/2*5/36 = 19/144
$-{19\over144}n^{-2}+O(n^{-3})$.
\ex:\exref|improve-boot3|%
Push the approximation \eq(|boot3|) further, getting absolute error $O(n^{-3})$.
\Hint: Let $g_n=c/(n+1)(n+2)+h_n$; what recurrence does $h_n$ satisfy?
\answer $nh_n=\sum_{0\le k<n}h_k/(n-k)+\cdots\,$.
\ex:
Suppose $a_n=O\bigl(f(n)\bigr)$ and $b_n=O\bigl(f(n)\bigr)$. Prove or
disprove that $\sum_{k=0}^n a_kb_{n-k}=O\bigl(f(n)\bigr)$, in the
following cases:
\smallskip
\itemitem{\bf a}$f(n)=n^{-\alpha}$, \ $\alpha>1$.
\itemitem{\bf b}$f(n)=\alpha^{-n}$, \ $\alpha>1$.
\answer (a) If $\sum_{k\ge0}\bigl\vert f(k)\bigr\vert<\infty$
and if $f(n-k)=O\bigl(f(n)\bigr)$ when $0\le k\le n/2$, we have
\begindisplay
\sum_{k=0}^n a_kb_{n-k}=\sum_{k=0}^{n/2}O\bigl(f(k)\bigr)@O\bigl(f(n)\bigr)
+\sum_{k=n/2}^{n}O\bigl(f(n)\bigr)@O\bigl(f(n-k)\bigr)\,,
\enddisplay
which is $2@@O\bigl(f(n)\sum_{k\ge0}\big\vert f(k)\big\vert\bigr)$,
so this case is proved. (b) But in this case if $a_n=b_n=\alpha^{-n}$,
the convolution $(n+1)\alpha^{-n}$ is not $O(\alpha^{-n})$.
\source{[|greene-knuth|, \S4.1.6].}
\ex:
Prove \eq(|opening-sum|) and \eq(|opening-asympt|), with which we
opened this chapter.
\answer $S_n\big/{3n\choose n}=\sum_{k=0}^n n\_k/(2n+1)\_^k$.
We may restrict the range of summation to $0\le k\le(\log n)^2$, say.
In this range $n\_k=n^k\bigl(1-{k\choose2}/n+O(k^4\!/n^2)\bigr)$ and
$(2n+1)\lower2pt\hbox{$\_^k$}=(2n)^k\bigl(1+{k+1\choose2}/2n+O(k^4\!/n^2)\bigr)$,
so the summand is
\begindisplay
{1\over2^k}\biggl(1-{3k^2-k\over4n}+O\Bigl({k^4\over n^2}\Bigr)\biggr)\,.
\enddisplay
Hence the sum over $k$ is $2-4/n+O(1/n^2)$. Stirling's approximation
can now be applied to ${3n\choose n}=(3n)!/(2n)!\,n!$,
proving \equ(9.|opening-asympt|).
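This derivation, too, can be spot-checked by machine; the sketch below
(ours, in Python) confirms $S_2=22$ and the behavior
$S_n\big/{3n\choose n}\approx2-4/n$:

```python
from math import comb

def ratio(n):
    # S_n / C(3n,n), where S_n = sum_{0<=k<=n} C(3n,k)
    s = sum(comb(3 * n, k) for k in range(n + 1))
    return s / comb(3 * n, n)

s2 = sum(comb(6, k) for k in range(3))   # S_2 = 1 + 6 + 15
for n in (2, 10, 100):
    print(n, ratio(n), 2 - 4 / n)
```

At $n=100$ the agreement is already to about three decimal places, the
residue being the $O(1/n^2)$ term.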
\ex:
Equation \eq(|stirling-by-euler|) shows how to evaluate $\ln 10!$ with
an absolute error $<{1\over126000000}$. Therefore if we take exponentials, we
get $10!$ with a relative error that is less than $e^{1/126000000}-1<10^{-8}$.
(In fact, the approximation gives $3628799.9714$.) If we now round
to the nearest integer, knowing that $10!$ is an integer, we get
an exact result.
\smallskip\item{} Is it always possible to calculate $n!$ in a similar way,
if enough terms of "Stirling's approximation" are computed? Estimate the value
of~$m$ that gives the best approximation to~$\ln n!$, when $n$ is a fixed (large)
integer. Compare the absolute error in this approximation
with $n!$ itself.
\answer The minimum occurs at a term $B_{2m}/(2m)(2m-1)n^{2m-1}$
where $2m\approx2\pi n+{3\over2}$, and this term is approximately equal to
$1/(\pi e^{2\pi n}\sqrt n\,)$. The absolute error in $\ln n!$ is therefore
too large to determine $n!$ exactly by rounding to an integer, when
$n$ is greater than about $e^{2\pi+1}$.
\ex:\exref|general-harmonic-estimate|%
Use Euler's summation formula to find the asymptotic value of $H_n^{(-\alpha)}
=\sum_{k=1}^n k^\alpha$, where $\alpha$ is any fixed real number.
(Your answer may involve a constant that you do not know in closed form.)
\answer We may assume that $\alpha\ne-1$. Let $f(x)=x^\alpha$; the
answer is
\begindisplay\tightplus
\sum_{k=1}^n k^\alpha\!=\!C_\alpha+{n^{\alpha+1}\over\alpha+1}+{n^\alpha\over2}
+\!\sum_{k=1}^m{B_{2k}\over 2k}{\alpha\choose 2k-1}n^{\alpha-2k+1}
+O(n^{\alpha-2m-1})\,.
\enddisplay
(The constant $C_\alpha$ turns out to be $\zeta(-\alpha)$, which is in fact
\g In particular, $\zeta(0)=-1/2$, and
$\zeta(-n)=-B_{n+1}/(n{+}1)$\par for integer $n>0$.\g
{\it defined\/} by this formula when $\alpha>-1$.)
\source{"Titchmarsh" [|zeta-function|].}
\ex:
Exercise 5.|hyperfactorial-def| defines the "hyperfactorial"
function $Q_n=1^12^2\ldots n^n$. Find the asymptotic value of $Q_n$
with relative error~$O(n^{-1})$.
(Your answer may involve a constant that you do not know in closed form.)
\answer In general, suppose $f(x)=x^\alpha\ln x$ in Euler's summation formula,
when $\alpha\ne-1$. Proceeding as in the previous exercise, we find
\begindisplay\openup3pt
\sum_{k=1}^n k^\alpha\ln k&=C'_\alpha+{n^{\alpha+1}\ln n\over\alpha+1}
-{n^{\alpha+1}\over(\alpha+1)^2}+{n^\alpha\ln n\over2}\cr
&\qquad{}+\sum_{k=1}^m{B_{2k}\over2k}{\alpha\choose2k-1}n^{\alpha-2k+1}
(\ln n+H_\alpha-H_{\alpha-2k+1})\cr
&\qquad{}+O(n^{\alpha-2m-1}\log n)\,;\cr
\enddisplay
the constant $C'_\alpha$ can be shown [|de-bruijn|, \S3.7] to be
$-\zeta'(-\alpha)$. (The $\log n$ factor in the $O$~term can be removed when
$\alpha$ is a positive integer $\le2m$; in that case we also replace the
$k$th term of the right sum by $B_{2k}\alpha!\*\,({2k-2-\alpha})!\*(-1)^\alpha
n^{\alpha-2k+1}/(2k)!$ when $\alpha<2k-1$.)
To solve the stated problem, we let $\alpha=1$ and $m=1$,
taking the exponential of both sides to get
\begindisplay
Q_n=A\cdot n^{n^2\!/2+n/2+1/12}e^{-n^2\!/4}\bigl(1+O(n^{-2})\bigr)\,,
\enddisplay
where $A=e^{1/12-\zeta'(-1)}\approx1.2824271291$
is ``"Glaisher's constant".\qback''
\source{"Glaisher" [|glaisher|].}
\ex:
Estimate the function $1^{1/1}2^{1/2}\ldots n^{1/n}$
as in the previous exercise.
\answer Let $f(x)=x^{-1}\ln x$. A slight modification of the calculation in
the previous exercise gives
\begindisplay
\sum_{k=1}^n{\ln k\over k}&={(\ln n)^2\over2}+\gamma_1+{\ln n\over2n}\cr
&\qquad{}
-\sum_{k=1}^m{B_{2k}\over2k}n^{-2k}(\ln n-H_{2k-1})+O(n^{-2m-1}\log n)\,,\cr
\enddisplay
where $\gamma_1\approx-0.07281584548367672486$ is a ``"Stieltjes constant"''
(see the answer to 9.|stieltjes-const|).
Taking exponentials gives
\begindisplay
e^{\gamma_1}\sqrt{n^{\mathstrut\ln n}}\,\biggl(1+{\ln n\over2n}
+O\Bigl({\log n\over n}\Bigr)^2\biggr)\,.
\enddisplay
\source{"de Bruijn" [|de-bruijn|, \S3.7].} % probably not the best source
\ex:\exref|theta-sums|%
Find the asymptotic value of $\sum_{k\ge0}k^le^{-k^2\!/n}$ with
absolute error $O(n^{-3})$, when $l$~is a fixed nonnegative integer.
\answer Let $g(x)=x^le^{-x^2}$ and $f(x)=g(x/\sqrt n\,)$. Then
$n^{-l/2}\sum_{k\ge0}k^le^{-k^2\!/n}$ is
\begindisplay
&\int_0^\infty f(x)\,dx
-\sum_{k=1}^m{B_k\over k!}f^{(k-1)}(0)
-(-1)^m\int_0^\infty{B_m\bigl(\{x\}\bigr)\over m!}f^{(m)}(x)\,dx\cr
&\qquad=n^{1/2}\int_0^\infty g(x)\,dx
-\sum_{k=1}^m{B_k\over k!}n^{(k-1)/2}g^{(k-1)}(0)+O(n^{-m/2})\,.
\enddisplay
Since $g(x)=x^l-x^{2+l}\!/1!+x^{4+l}\!/2!-x^{6+l}\!/3!+\cdots\,$, the derivatives
$g^{(m)}(x)$ obey a simple pattern, and the answer is
\begindisplay
\half n^{(l+1)/2\,}\Gamma\Bigl({l+1\over2}\Bigr)-{B_{l+1}\over(l+1)!\,0!}
+{B_{l+3}n^{-1}\over(l+3)!\,1!}-{B_{l+5}n^{-2}\over(l+5)!\,2!}+O(n^{-3})\,.
\enddisplay
\ex:
Evaluate $\sum_{k\ge0}1/(c^k+c^m)$ with absolute error $O(c^{-3m})$,
when $c>1$ and $m$~is a positive integer.
\answer The somewhat surprising identity $1/(c^{m-k}+c^m)+1/(c^{m+k}+c^m)=1/c^m$
makes the terms for $0\le k\le2m$ sum to $(m+\half)/c^m$. The remaining
terms are
\begindisplay
\sum_{k\ge1}{1\over c^{2m+k}+c^m}&=\sum_{k\ge1}\biggl({1\over c^{2m+k}}
-{1\over c^{3m+2k}}
+{1\over c^{4m+3k}}-\cdots\,\biggr)\cr
&={1\over c^{2m+1}-c^{2m}}
-{1\over c^{3m+2}-c^{3m}}
+\cdots\,,
\enddisplay
and this series can be truncated at any desired point, with an error not
exceeding the first omitted term.
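A numerical spot-check of both the pairing identity and the truncated
series, with $c=2$ and $m=3$ (our own sketch):

```python
c, m = 2, 3

# direct evaluation of sum_{k>=0} 1/(c^k + c^m), truncated far out in k
direct = sum(1 / (c ** k + c ** m) for k in range(200))

# pairing identity: terms k = m-j and m+j sum to 1/c^m,
# so the terms 0 <= k <= 2m contribute (m + 1/2)/c^m
head = (m + 0.5) / c ** m

# first two terms of the expansion of the remaining tail
tail = (1 / (c ** (2*m + 1) - c ** (2*m))
        - 1 / (c ** (3*m + 2) - c ** (3*m)))

print(direct, head + tail)
```

The two values differ by less than the first omitted term
$1/(c^{4m+3}-c^{4m})$, as claimed.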
\subhead Exam problems
\ex:
Evaluate $e^{H_n+H_n^{(2)}}$ with absolute error $O(n^{-1})$.
\answer $H_n^{(2)}=\pi^2\!/6-1/n+O(n^{-2})$ by Euler's summation
formula, since we know the constant; and $H_n$ is given by \equ(9.|harmonic-theta|).
\g The world's top three constants, $(e,\pi,\gamma)$, all appear
in this answer.\g
So the answer is
\begindisplay
ne^{\gamma+\pi^2\!/6}\bigl(1-{\textstyle\half}n^{-1}+O(n^{-2})\bigr)\,.
\enddisplay
\source{1976 final exam.}
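Numerically (our sketch; the value of $\gamma$ is the one quoted in the
text):

```python
from math import exp, pi

gamma = 0.5772156649015329   # Euler's constant

n = 1000
H1 = sum(1.0 / k for k in range(1, n + 1))       # H_n
H2 = sum(1.0 / k ** 2 for k in range(1, n + 1))  # H_n^{(2)}

lhs = exp(H1 + H2)
rhs = n * exp(gamma + pi ** 2 / 6) * (1 - 0.5 / n)
print(lhs, rhs)
```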
\ex:
Evaluate $\sum_{k\ge0}{n\choose k}/n\_^k$ with absolute error $O(n^{-3})$.
\answer We have $n\_k/n\_^k=1-k(k-1)n^{-1}+\half k^2(k-1)^2n^{-2}+O(k^6n^{-3})$;
dividing by $k!$ and summing over~$k\ge0$ yields
$e-en^{-1}+{7\over2}en^{-2}+O(n^{-3})$.
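A machine check of the three-term answer (our sketch; `rising` is a
hypothetical helper computing $n(n+1)\ldots(n+k-1)$):

```python
from math import comb, e

def rising(n, k):
    # the rising factorial power n(n+1)...(n+k-1)
    result = 1
    for j in range(k):
        result *= n + j
    return result

n = 500
s = sum(comb(n, k) / rising(n, k) for k in range(n + 1))
approx = e - e / n + 3.5 * e / n ** 2
print(s, approx)
```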
\ex:
Determine values $A$ through $F$ such that $(1+1/n)^{nH_n}$ is
\begindisplay
An+B(\ln n)^2+C\ln n+D+{E(\ln n)^2\over n}+{F\ln n\over n}+O(n^{-1})\,.
\enddisplay
\answer $A=e^\gamma$; $B=0$; $C=-\half e^\gamma$; $D=\half e^\gamma(1-\gamma)$;
$E={1\over8}e^\gamma$; $F={1\over12}e^\gamma(3\gamma+1)$.
\source{1973 final exam.}
\ex:
Evaluate $\sum_{k=1}^n1/kH_k$ with absolute error $O(1)$.
\answer Since $1/k\bigl(\ln k+O(1)\bigr)
=1/k\ln k+O\bigl(1/k(\log k)^2\bigr)$, the given sum is $\sum_{k=2}^n1/k\ln k
+O(1)$. The remaining sum is $\ln\ln n +O(1)$ by Euler's summation formula.
\source{1975 final exam.}
\ex:
Evaluate $S_n=\sum_{k=1}^n{1/(n^2+k^2)}$ with absolute error $O(n^{-5})$.
\answer This works out beautifully with Euler's summation formula:
\begindisplay \openup5pt
S_n={\pi\over4n}+\cdots\,.
\enddisplay
\ex:
Evaluate the product $F_1F_2\ldots F_n$ of the first $n$ Fibonacci
numbers, with relative error $O(\phi^{-2n})$.
\answer Let $\alpha=-1/\phi^2$. Since $F_k=\phi^k5^{-1/2}(1-\alpha^k)$,
the logarithm of the product is
\begindisplay
{n(n+1)\over2}\ln\phi-{n\over2}\ln5
+\sum_{k\ge1}\ln(1-\alpha^k)-\sum_{k>n}\ln(1-\alpha^k)\,.
\enddisplay
The latter sum is $\sum_{k>n}O(\alpha^k)=O(\alpha^n)$. Hence the answer is
\begindisplay
\phi^{n(n+1)/2}5^{-n/2}C+O(\phi^{n(n-3)/2}5^{-n/2})\,,
\enddisplay
where $C=(1-\alpha)(1-\alpha^2)(1-\alpha^3)\ldots\,\approx1.226742$.
\source{1980 final exam.}
\ex:\exref|binomial-tail|%
Let $\alpha$ be a constant in the range $0<\alpha<\half$. We've seen in
previous chapters that there is no general closed form for the sum
$\sum_{k\le\alpha n}{n\choose k}$. Show that there is, however, an
asymptotic formula
\begindisplay
\sum_{k\le\alpha n}{n\choose k}=2^{nH(\alpha)-\half\lg n+O(1)}\,,
\enddisplay
where $H(\alpha)=\alpha\lg{1\over\alpha}+(1-\alpha)\lg({1\over1-\alpha})$.
\Hint: Show that ${n\choose k-1}<{\alpha\over1-\alpha}{n\choose k}$
for $0<k\le\alpha n$.
\answer Let $\alpha$ have the continued fraction representation with
partial quotients $a_1$,~$a_2$,~\dots,
and let $\alpha_m$ be the continued fraction
$1/(a_m+\alpha_{m+1})$ for $m\ge1$. Then
$D(\alpha,n)=D(\alpha_1,n)\hat y+\half\delta$.
Therefore $\epsilon+\epsilon'>\delta$. And if $\delta<\min(\epsilon,\epsilon')$,
the rounding does distinguish $\hat y$ from both $y-\epsilon$ and $y+\epsilon'$.
Hence $10^{-b_k}<1/(k-1)+1/k$ and $10^{1-b_k}\ge1/k$; we have
$b_k=\log k+O(1)$.
Finally, therefore, $\sum_{k=1}^n d_k=\sum_{k=1}^n(\log k+\log\log k+O(1))$,
which is $n\log n+n\log\log n+O(n)$ by Euler's summation formula.
\source{1980 final exam.}
\ex:\exref|worm-finale|%
In Chapter 6 we considered the tale of a worm that reaches the end of
"!worm on band"
a stretching band after $n$ seconds, where $H_{n-1}<100\le H_n$.
Prove that if $n$ is a positive integer such that $H_{n-1}\le\alpha\le H_n$,
then
\begindisplay
\textstyle\lfloor e^{\alpha-\gamma}\rfloor\le n\le\lceil
e^{\alpha-\gamma}@\rceil\,.
\enddisplay
\answer We have $H_n>\ln n+\gamma+\half n^{-1}-{1\over12}n^{-2}=f(n)$,
where $f(x)$ is increasing for all $x>0$; hence if $n\ge e^{\alpha-\gamma}$
we have $H_n\ge f(e^{\alpha-\gamma})>\alpha$. Also $H_{n-1}<\ln n+\gamma
-\half n^{-1}=g(n)$, where $g(x)$ is increasing for all $x>0$; hence if
$n\le e^{\alpha-\gamma}$ we have $H_{n-1}\le g(e^{\alpha-\gamma})<\alpha$.
Therefore $H_{n-1}\le\alpha\le H_n$ implies that $e^{\alpha-\gamma}+1>n
>e^{\alpha-\gamma}-1$. (Sharper results have been obtained by
"Boas" and "Wrench"~[|boas-wrench|].)
\source{1974 final exam.}
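For values of $\alpha$ small enough to sum directly, the bracketing can be
confirmed by machine (our sketch; $\gamma$ as quoted in the text):

```python
from math import ceil, exp, floor

gamma = 0.5772156649015329   # Euler's constant

def crossing_n(alpha):
    # least n with H_n >= alpha, so that H_{n-1} <= alpha <= H_n
    h, n = 0.0, 0
    while h < alpha:
        n += 1
        h += 1.0 / n
    return n

for alpha in (3.0, 5.0, 7.5, 10.0):
    n = crossing_n(alpha)
    print(alpha, n, exp(alpha - gamma))
```

In each case $n$ lands on $\lfloor e^{\alpha-\gamma}\rfloor$ or
$\lceil e^{\alpha-\gamma}\rceil$.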
\ex:
"Venture capitalists" in Silicon Valley are being offered a deal giving
them a chance for an exponential payoff on their investments: For an
$n$~million dollar investment, where $n\ge2$,
the GKP consortium promises to pay up to
$N$~million dollars after one year, where $N=10^n$. Of course there's
some risk; the actual deal is that GKP pays $k$~million dollars with
probability $1/(k^2H_N^{(2)})$, for each integer~$k$ in the range
$1\le k\le N$. (All payments are in megabucks, that is, in exact
multiples of \$1,000,000; the payoff is determined by a truly
random process.) Notice that an investor
always gets at least a million dollars back.
\smallskip
\itemitem{a}What is the asymptotic expected return after one year,
if $n$~million dollars are invested?
(In other words, what is the mean value of the payment?)
\g I once earned\par $O(10^{-n})$ dollars.\g
Your answer should be correct within an absolute error of
$O(10^{-n})$ dollars.
\itemitem{b}What is the asymptotic probability that you make a profit,
if you invest $n$~million?
(In other words, what is the chance that you get back more than you put~in?)
Your answer here should be correct within an absolute error of
$O(n^{-3})$.
\answer (a) The expected return is $\sum_{1\le k\le N}k/(k^2H_N^{(2)})
=H_N/H_N^{(2)}$, and we want the asymptotic value to $O(N^{-1})$:
\begindisplay
{\ln N+\gamma+O(N^{-1})\over\pi^2\!/6-N^{-1}+O(N^{-2})}
={6\ln 10\over\pi^2}n+{6\gamma\over\pi^2}+{36\ln10\over\pi^4}{n\over10^n}
+O(10^{-n})\,.
\enddisplay
The coefficient $(6\ln 10)/\pi^2\approx1.3998$ says that we expect
about 40\% profit.
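For small $n$ the exact mean $H_N/H_N^{(2)}$ can be computed directly and
compared with this three-term estimate; here is a quick check (ours, not
the original text's):

```python
import math

GAMMA = 0.5772156649015329  # Euler's constant

def expected_return(n):
    """Exact mean payment H_N / H_N^(2) in megabucks, where N = 10**n."""
    N = 10**n
    H1 = sum(1.0 / k for k in range(1, N + 1))        # H_N
    H2 = sum(1.0 / k**2 for k in range(1, N + 1))     # H_N^(2)
    return H1 / H2

def asymptotic_return(n):
    """The estimate (6 ln 10/pi^2) n + 6 gamma/pi^2 + (36 ln 10/pi^4) n/10^n."""
    pi2 = math.pi**2
    return (6 * math.log(10) / pi2 * n + 6 * GAMMA / pi2
            + 36 * math.log(10) / pi2**2 * n / 10**n)
```

The discrepancy observed for $n=2$ and $n=3$ is of order $10^{-n}$,
consistent with the stated error term.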
\par(b) The probability of profit is
\begindisplay
\sum_{n<k\le N}{1\over k^2H_N^{(2)}}
={H_N^{(2)}-H_n^{(2)}\over H_N^{(2)}}
={6\over\pi^2}\Bigl({1\over n}-{1\over2n^2}\Bigr)+O(n^{-3})\,,
\enddisplay
since $H_m^{(2)}=\pi^2\!/6-m^{-1}+\half m^{-2}+O(m^{-3})$ and the terms
involving $N=10^n$ are exponentially small.
\ex:
Prove that, given any function $f(n)$, there is a power series $A(z)$
that converges for all complex~$z$, such that $f(n)\prec A(n)$.
\answer Given $f$, we can construct a suitable power series
by setting $m_0=0$ and letting $m_k$ be the least
integer $>m_{k-1}$ such that
\begindisplay
\Bigl({k+1\over k}\Bigr)^{\!m_k}\ge f(k+1)^2\,.
\enddisplay
Now let $A(z)=\sum_{k\ge1}(z/k)^{m_k}$. This power series converges for all~$z$,
because the terms for $k>\vert z\vert$ are bounded by a geometric series.
Also $A(n+1)\ge{\bigl((n+1)/n\bigr){}^{m_n}}\ge f(n+1)^2$, hence
$\lim_{n\to\infty}f(n)/A(n)=0$.
\source{"Poincar\'e" [|poincare|]; "Borel" [|borel|, p.~27].}
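The construction is effective; here is a sketch (with the illustrative
choice $f(n)=2^n$, not specified in the original) that computes the
exponents $m_k$ and evaluates a truncation of $A(z)$:

```python
import math

def exponents(f, K):
    """m_0 = 0; m_k is the least integer > m_{k-1} with
    ((k+1)/k)**m_k >= f(k+1)**2."""
    ms, m = [], 0
    for k in range(1, K + 1):
        need = math.ceil(2 * math.log(f(k + 1)) / math.log((k + 1) / k))
        m = max(m + 1, need)
        ms.append(m)
    return ms

def A(z, ms):
    """Truncation of A(z) = sum_{k>=1} (z/k)**m_k; the terms with
    k > |z| are dominated by a geometric series."""
    return sum((z / k)**m for k, m in enumerate(ms, 1))

f = lambda n: 2.0**n        # example function to be outgrown
ms = exponents(f, 40)       # then A(n+1) >= ((n+1)/n)**m_n >= f(n+1)**2
```

Since every term of the sum is positive, $A(n+1)$ is at least the single
term with $k=n$, which already dominates $f(n+1)^2$.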
\ex:\exref|prove-log-bound|%
Prove that if $f(x)$ is a function whose derivatives satisfy
\begindisplay \displaythick=\normalthick
f'(x)\le0\,,\ \ -f''(x)\le0\,,\ \ f'''(x)\le0\,,\ \ \dots,\ \
(-1)^mf^{(m+1)}(x)\le0
\enddisplay
for all $x\ge0$, then we have
\begindisplay
f(x)=f(0)+{f'(0)\over1!}x+\cdots+
{f^{(m-1)}(0)\over(m-1)!}x^{m-1}+O(x^m)\,,\quad
\hbox{for $x\ge0$}.
\enddisplay
In particular, the case $f(x)=-\ln(1+x)$ proves \eq(|log-bound|)
for all $k,n>0$.
\answer By induction, the $O$ term is $(m-1)!^{-1}\int_0^x t^{m-1}
f^{(m)}(x-t)\,dt$. Since $f^{(m+1)}$ has the opposite sign to $f^{(m)}$,
the absolute value of this integral is bounded by $\bigl\vert f^{(m)}(0)
\bigr\vert \int_0^xt^{m-1}\,dt$; so the error is bounded by the
absolute value of the first discarded term.
\source{"P\'olya" and "Szeg\H o" [|polya-szego|, part 1, problem 140].}
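For the special case $f(x)=-\ln(1+x)$, the claim that the error never
exceeds the first discarded term $x^m\!/m$ is easy to test numerically
(a check added here for illustration):

```python
import math

def taylor_neg_log1p(x, m):
    """Partial Taylor sum of f(x) = -ln(1+x) through the x^{m-1} term:
    -x + x^2/2 - x^3/3 + ... + (-1)^{m-1} x^{m-1}/(m-1)."""
    return sum((-1)**k * x**k / k for k in range(1, m))

# For this f, |f^(m)(0)| = (m-1)!, so the O(x^m) error is bounded by
# the first discarded term, x^m / m, for ALL x >= 0 (not just small x).
```

Note that the bound holds even for large $x$, where the Taylor series
itself diverges; that is the whole point of the exercise.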
\ex:\exref|tail-estimate|%
Let $f(x)$ be a positive, differentiable function such that
$xf'(x)\prec f(x)$ as $x\to\infty$. Prove that
\begindisplay
\sum_{k\ge n}{f(k)\over k^{1+\alpha}}=O\biggl({f(n)\over n^{\alpha}}\biggr)\,,
\qquad\hbox{if $\alpha>0$}.
\enddisplay
\Hint: Consider the quantity $f(k-\half)/(k-\half)^\alpha
-f(k+\half)/(k+\half)^\alpha$.
\answer Let $g(x)=f(x)/x^\alpha$. Then $g'(x)\sim-\alpha g(x)/x$ as $x\to
\infty$. By the mean
\g Sounds like a nasty theorem.\g
value theorem, $g(x-\half)-g(x+\half)=-g'(y)\sim
\alpha g(y)/y$ for some $y$ between $x-\half$ and $x+\half$. Now $g(y)
=g(x)\bigl(1+O(1/x)\bigr)$, so $g(x-\half)-g(x+\half)\sim\alpha g(x)/x
=\alpha f(x)/x^{1+\alpha}$. Therefore
\begindisplay
\sum_{k\ge n}{f(k)\over k^{1+\alpha}}=
O\Bigl(@\sum_{k\ge n}\textstyle\bigl(g(k-\half)-g(k+\half)\bigr)\Bigr)
=O\bigl(g(n-\half)\bigr)\,.
\enddisplay
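With the hypothetical choices $f(x)=\ln x$ and $\alpha=1$ (which satisfy
the hypothesis, since $xf'(x)=1\prec\ln x$), the tail indeed stays within
a small multiple of $f(n)/n^\alpha$:

```python
import math

def tail(n, K=200000):
    """sum_{k>=n} ln(k)/k^2, summed directly to K, with the exact
    integral of ln(x)/x^2 from K, namely (ln K + 1)/K, added to
    account for the remainder."""
    s = sum(math.log(k) / k**2 for k in range(n, K))
    return s + (math.log(K) + 1.0) / K
```

The ratio of the tail to $\ln(n)/n$ approaches $1$ from above as
$n$ grows, matching the $O$-bound.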
\ex:
Improve \eq(|final-summation|) to relative error $O(n^{-3/2+5\epsilon})$.
\answer The estimate of $
(n+k+\half)\ln(1+k/n)+
(n-k+\half)\ln(1-k/n)$ is extended to
$k^2\!/n+k^4\!/6n^3+O(n^{-3/2+5\epsilon})$, so we apparently want to have
an extra factor $e^{-k^4\!/6n^3}$ in $b_k(n)$, and $c_k(n)=
2^{2n}n^{-2+5\epsilon}e^{-k^2\!/n}$. But it turns out to be better
to leave $b_k(n)$ untouched and to let
\begindisplay
c_k(n)=2^{2n}n^{-2+5\epsilon}
e^{-k^2\!/n}+2^{2n}@n^{-5+5\epsilon}k^4e^{-k^2\!/n}\,,
\enddisplay
thereby replacing $e^{-k^4\!/6n^3}$ \kern-1ptby $1+O(k^4\!/n^3)$. The sum
$\sum_k\kern-1.5pt k^4e^{-k^2\!/n}$ is $O(n^{5/2})$, as shown in exercise~|theta-sums|.
\ex:\exref|q-of-n|%
The quantity $Q(n)=1+{n-1\over n}+{n-1\over n}{n-2\over n}+\cdots\,
=\sum_{k\ge1}n\_k/n^k$ occurs in the analysis of many algorithms.
Find its asymptotic value, with absolute error $o(1)$.
\answer If $k\le n^{1/2+\epsilon}$ we have $\ln(n\_k/n^k)=
-\half k^2\!/n+\half k/n-{1\over6}k^3\!/n^2+O(n^{-1+4\epsilon})$
by Stirling's approximation, hence
\begindisplay
n\_k/n^k=e^{-k^2\!/2n}\bigl(1+k/2n
-{\textstyle{2\over3}}k^3\!/(2n)^2+O(n^{-1+4\epsilon})\bigr)\,.
\enddisplay
Summing with the identity in exercise |theta-sums|, and remembering to
omit the term for $k=0$, gives
$-1+\Theta_{2n}+\Theta_{2n}^{(1)}-{2\over3}\Theta_{2n}^{(3)}
+O(n^{-1/2+4\epsilon}) =\sqrt{\pi n/2}-{1\over3}+O(n^{-1/2+4\epsilon})$.
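Since $Q(n)$ is a finite sum (the falling powers vanish for $k>n$), the
answer $\sqrt{\pi n/2}-{1\over3}+o(1)$ can be checked directly:

```python
import math

def Q(n):
    """Q(n) = sum_{k>=1} n(n-1)...(n-k+1)/n^k, accumulated term by term."""
    total, term = 0.0, 1.0
    for k in range(1, n + 1):
        term *= (n - k + 1) / n     # term is now (n falling k) / n^k
        total += term
    return total
```

For $n=100$ the difference from $\sqrt{\pi n/2}-{1\over3}$ is already
only about $0.01$.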
\ex:\exref|stieltjes-const|%
An asymptotic formula for "Golomb"'s sum $\sum_{k\ge1}1/k\lfloor1+\log_n k
\rfloor^2$ is derived in \eq(|golomb-ans|). Find an asymptotic formula
for the analogous sum without floor brackets,
$\sum_{k\ge1}1/k(1+\log_n k)^2$.
\Hint: Consider the integral $\int_0^\infty ue^{-u}k^{-tu}\,du=1/(1+t\ln k)^2$.
\answer Using the hint, the given sum becomes $\int_0^\infty ue^{-u}\zeta
(1+u/\!@\ln n)\,du$. The zeta function can be defined by the series
\begindisplay
\zeta(1+z)=z^{-1}+\sum_{m\ge0}(-1)^m\gamma_mz^m\!/m!\,,
\enddisplay
where $\gamma_0=\gamma$
and $\gamma_m$ is the "Stieltjes constant" [|stieltjes|, |keiper|]
\begindisplay
\lim_{n\to\infty}\biggl(\sum_{k=1}^n
{(\ln k)^m\over k}\;-\;{(\ln n)^{m+1}\over m+1}\biggr)\,.
\enddisplay
Hence the given sum is
\begindisplay
\ln n+\gamma-2\gamma_1(\ln n)^{-1}+3\gamma_2(\ln n)^{-2}-\cdots\,.
\enddisplay
\source{Andrew M. "Odlyzko".*}
\ex:\exref|bernoulli-sines-and-cosines|%
Prove that
\begindisplay
B_m\bigl(\{x\}\bigr)=-2{m!\over(2\pi)^m}
\sum_{k\ge1}{\cos(2\pi kx-\half\pi m)\over k^m}\,,\qquad\hbox{for $m\ge2$},
\enddisplay
by using residue calculus, integrating
\begindisplay
{1\over2\pi i}\oint\,\,{2\pi i\,e^{2\pi iz\theta}\over e^{2\pi iz}-1}{dz\over z^m}
\enddisplay
on the square contour $z=x+iy$, where
$\max\bigl(\vert x\vert,\vert y\vert@\bigr)
=M+\half$, and letting the integer $M$ tend to $\infty$.
\answer Let $0\le\theta\le1$ and $f(z)=e^{2\pi iz\theta}\!/(e^{2\pi iz}-1)$.
We have
\begindisplay \openup6pt
\bigl\vert f(z)\bigr\vert&={e^{-2\pi y\theta}\over1+e^{-2\pi y}}\le 1\,,
&\qquad\hbox{when $x\bmod1=\half$};\cr
\bigl\vert f(z)\bigr\vert&\le{e^{-2\pi y\theta}\over\vert e^{-2\pi y}-1\vert}
\le {1\over 1-e^{-2\pi\epsilon}}\,,
&\qquad\hbox{when $\vert y\vert\ge\epsilon$}.
\enddisplay
Therefore $\bigl\vert f(z)\bigr\vert$ is bounded on the contour, and the integral
is $O(M^{1-m})$. The residue of $2\pi if(z)/z^m$ at $z=k\ne0$ is $e^{2\pi ik\theta}
/k^m$; the residue at $z=0$ is the coefficient of $z^{-1}$ in
\begindisplay
{e^{2\pi iz\theta}\over z^{m+1}}\Bigl(B_0+B_1{2\pi iz\over1!}+\cdots\,\Bigr)
={1\over z^{m+1}}\Bigl(B_0(\theta)+B_1(\theta){2\pi iz\over1!}+\cdots\,\Bigr)\,,
\enddisplay
namely $(2\pi i)^mB_m(\theta)/m!$. Therefore the sum of residues inside the
contour is
\begindisplay
{(2\pi i)^m\over m!}B_m(\theta)\;+\;2\sum_{k=1}^M e^{\pi i m/2}
{\cos(2\pi k\theta-\pi m/2)\over k^m}\,.
\enddisplay
This equals the contour integral $O(M^{1-m})$,
so it approaches zero as $M\to\infty$.
\source{"Henrici" [|henrici|, exercise 4.9.8].}
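The identity can be tested numerically for small $m$; here is an added
check for $m=2$ and $m=3$, truncating the sum at $10^5$ terms:

```python
import math

def bernoulli_poly(m, x):
    """B_m(x) for m = 2, 3 (the two cases checked here)."""
    if m == 2:
        return x * x - x + 1.0 / 6.0
    return x**3 - 1.5 * x * x + 0.5 * x       # B_3(x)

def fourier_sum(m, x, K=100000):
    """-2 (m!/(2 pi)^m) sum_{k<=K} cos(2 pi k x - pi m/2) / k^m."""
    c = -2.0 * math.factorial(m) / (2.0 * math.pi)**m
    return c * sum(math.cos(2.0 * math.pi * k * x - math.pi * m / 2.0) / k**m
                   for k in range(1, K + 1))
```

The truncation error is $O(K^{1-m})$, well below the tolerance used.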
\ex:\exref|prove-jacobi|%
Let $\Theta_n(t)=\sum_k e^{-(k+t)^2\!/n}$, a periodic function of\/~$t$.
Show that the expansion of $\Theta_n(t)$ as a "Fourier series" is
\begindisplay
\Theta_n(t)=\sqrt{\pi n}\bigl(1+2e^{-\pi^2n}(\cos2\pi t)
&+2e^{-4\pi^2n}(\cos4\pi t)\cr
&\quad +2e^{-9\pi^2n}(\cos6\pi t)+\cdots\,\bigr)\,.
\enddisplay
(This formula gives a rapidly convergent series for the sum
$\Theta_n=\Theta_n(0)$ in equation \eq(|theta-by-euler|).)
\answer If $F(x)$ is sufficiently
well behaved, we have the general identity
\begindisplay
\sum_k F(k+t)=\sum_n G(2\pi n)e^{2\pi int}\,,
\enddisplay
where $G(y)=\int_{-\infty}^{+\infty}e^{-iyx}F(x)\,dx$.
(This is ``"Poisson's summation formula",\qback'' which can be found in
standard texts such as "Henrici"~[|henrici|, Theorem 10.6e].)
Setting $F(x)=e^{-x^2\!/n}$ in this identity gives
$G(y)=\sqrt{\pi n}\,e^{-ny^2\!/4}$, and the stated expansion follows.
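Both series converge extremely fast, so the expansion is easy to verify
numerically (an added check, not part of the original answer):

```python
import math

def theta_direct(n, t, K=60):
    """Theta_n(t) = sum over integers k of exp(-(k+t)^2/n), truncated."""
    return sum(math.exp(-(k + t)**2 / n) for k in range(-K, K + 1))

def theta_fourier(n, t, M=6):
    """sqrt(pi n)(1 + 2 sum_{m>=1} exp(-pi^2 m^2 n) cos 2 pi m t)."""
    s = 1.0 + 2.0 * sum(math.exp(-math.pi**2 * m * m * n)
                        * math.cos(2.0 * math.pi * m * t)
                        for m in range(1, M + 1))
    return math.sqrt(math.pi * n) * s
```

Already for $n=1$ the two sides agree to about ten decimal places with
only a handful of terms on each side.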
\ex:
Explain why the coefficients in the asymptotic expansion
\begindisplay
{2n\choose n}={4^n\over\sqrt{\pi n}}\biggl(1-{1\over8n}+
{1\over128n^2}+{5\over1024n^3}-{21\over32768n^4}+O(n^{-5})\biggr)
\enddisplay
all have denominators that are powers of~$2$.
\answer The stated formula is equivalent to
\begindisplay
n\_^{1/2}=n^{1/2}\biggl(1-{1\over8n}+
{1\over128n^2}+{5\over1024n^3}-{21\over32768n^4}+O(n^{-5})\biggr)
\enddisplay
by exercise 5.|factorial-dup|. Hence the result
follows from exercises 6.|half-stirling-upper| and 9.|factorial-power-asymp|.
\source{[|knuth-vardi|].}
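The expansion itself is easy to test against exact binomial coefficients
(an illustrative check we add here):

```python
import math

def central_binom_approx(n):
    """4^n/sqrt(pi n) times the five bracketed terms of the expansion."""
    series = (1.0 - 1.0 / (8 * n) + 1.0 / (128 * n**2)
              + 5.0 / (1024 * n**3) - 21.0 / (32768 * n**4))
    return 4.0**n / math.sqrt(math.pi * n) * series
```

The relative error should be $O(n^{-5})$, and indeed it shrinks roughly
by a factor of $10^5$ each time $n$ grows by a factor of $10$.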
\ex:
Exercise |prove-discrep| proves that the "discrepancy" $D(\alpha,n)$ is
$o(n)$ for all irrational numbers $\alpha$. Exhibit an irrational~$\alpha$
such that $D(\alpha,n)$ is {\it not\/} $O(n^{1-\epsilon})$ for any
$\epsilon>0$.
\answer The idea is to make $\alpha$ ``almost'' rational. Let
$a_k=2^{2^{2^k}}$ be the $k$th partial quotient of~$\alpha$, and let
$n=\half a_{m+1}q_m$, where $q_m=K(a_1,\ldots,a_m)$ and $m$~is even.
Then $0<\{q_m\alpha\}<1/K(a_1,\ldots,a_{m+1})<1/(2n)$, and if we take
$v=a_{m+1}/(4n)$ we get a discrepancy $\ge{1\over4}a_{m+1}$. If this
were less than $n^{1-\epsilon}$ we would have
$a_{m+1}^\epsilon=O(q_m^{1-\epsilon})$;
but in fact $a_{m+1}>q_m^{2^m}$.
\ex:
\def\\{\overline m(n)}%
Given $n$, let ${n\brace m(n)}=\max_k{n\brace k}$ be the largest
entry in row~$n$ of Stirling's subset triangle. Show that
for all sufficiently large~$n$, we have $m(n)=\lfloor \\\rfloor$
or $m(n)=\lceil \\\rceil$, where
\begindisplay
\\(\\+2)\ln(\\+2)=n(\\+1)\,.
\enddisplay
\Hint: This is difficult.
\answer See "Canfield" [|canfield|];
see also "David" and "Barton"~[|david-barton|, Chapter~16] for asymptotics
of "Stirling numbers" of both kinds.
\source{"Canfield" [|canfield|].}
\ex:
Prove that S.\thinspace W. "Golomb"'s self-describing sequence of
exercise 2.|self-descr| satisfies $f(n)=\phi^{2-\phi}n^{\phi-1}+
O(n^{\phi-1}/\!@\log n)$.
\answer Let $c=\phi^{2-\phi}$. \kern-1.7pt
The estimate $cn^{\phi-1}+o(n^{\phi-1})$ was proved by "Fine"~[|golomb-self|].
Ilan~"Vardi" observes that the sharper estimate stated can be deduced from the
fact that the error term $e(n)=f(n)-cn^{\phi-1}$ satisfies the approximate
recurrence $c^\phi n^{2-\phi}e(n)\approx-\sum_k e(k)\[1\le k\le
c^\phi n^{2-\phi}]$. The error is small at first; for example, $f(n)$
stays close to $cn^{\phi-1}$, which for $n=273$
is $c\cdot273^{\phi-1}\approx38.4997$. But the
small errors are eventually magnified, because of results like those in
exercise 2.|self-descr|. For example, $e(201636503)\approx35.73$;
$e(919986484788)\approx-1959.07$.
\source{"Vardi" [|vardi-self|].}
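Generating the sequence directly shows how close $f(n)$ stays to
$cn^{\phi-1}$; in this added sketch the $5\%$ tolerance is an empirical
choice, far looser than the errors actually observed:

```python
def golomb(n_max):
    """Golomb's self-describing sequence: the value k occurs exactly
    f(k) times, and the sequence is nondecreasing."""
    f = [0, 1, 2, 2]            # f[0] is a dummy; f[1..3] seed the sequence
    k = 3
    while len(f) <= n_max:
        f.extend([k] * f[k])    # append f(k) copies of the value k
        k += 1
    return f[:n_max + 1]
```

Since $f$ grows like $n^{\phi-1}$, the index $f[k]$ is always defined
by the time the loop needs it.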
\ex:
Find a proof of the identity
\begindisplay
\sum_{n\ge1}{\cos 2n\pi x\over n^2}=\pi^2\bigl(x^2-x+\textstyle{1\over6}\bigr)\,,
\qquad\hbox{for $0\le x\le1$},
\enddisplay
that uses only ``Eulerian'' (eighteenth-century) mathematics.
\answer (From this identity for $B_2(x)$ we can easily derive
\g\noindent\llap{``}The paradox is now fully established that the utmost
\hbox{abstractions} are
the true~weapons with which to control our thought of concrete~fact.''
\par"!philosophy"
\hfill\dash---A.\thinspace N. "!Whitehead"White-\par
\hfill head [|whitehead-science|]\g
the identity of exercise~|bernoulli-sines-and-cosines| by
induction on~$m$.) If $0<x<1$, the sum $\sum_{n\ge1}(\sin 2n\pi x)/n$
is the imaginary part of $\sum_{n\ge1}e^{2n\pi ix}\!/n=-\ln(1-e^{2\pi ix})$,
which Euler evaluated as $\pi(1-2x)/2$; term-by-term integration,
together with $\sum_{n\ge1}1/n^2=\pi^2\!/6$, now yields the stated identity.
\ex:
Let $Q_n$ be the least integer $m$ such that $H_m>n$. Find the smallest
integer $n$ such that
$Q_n\ne\lfloor e^{n-\gamma}+\half\rfloor$, or prove that no such $n$ exists.
\g\vskip.5in Th-th-th-that's all, folks!\g
\answer This would fail if, for example, $e^{n-\gamma}=m+\half+\epsilon/m$
for some integer~$m$ and some $0<\epsilon<{1\over8}$;
but no counterexamples are known.
\source{"Boas" and "Wrench" [|boas-wrench|].}
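For what it's worth, the conjectured formula survives a direct check for
the first several values of~$n$, with $Q_n$ computed by brute force:

```python
import math

GAMMA = 0.5772156649015329   # Euler's constant

def Q(n):
    """Least integer m such that H_m > n."""
    H, m = 0.0, 0
    while H <= n:
        m += 1
        H += 1.0 / m
    return m

# conjecture: Q(n) == floor(exp(n - GAMMA) + 1/2) for every n >= 1
```

Since $H_m$ increases in steps of $1/m$ while the accumulated rounding
error is many orders of magnitude smaller, floating point suffices here.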