
It’s been too long: a month and a half since my last review, and about three months since *Analysis I.* I’ve been immersed in my work for CHAI, but reality doesn’t grade on a curve, and I want more mathematical firepower.

*Metric spaces; completeness and compactness.*

It sucks and I hate it.

*Generalized continuity, and how it interacts with the considerations introduced in the previous chapter. Also, a terrible introduction to topology.*

There’s a lot I wanted to say here about topology, but I don’t think my understanding is good enough to break things down—I’ll have to read an actual book on the subject.

*Pointwise and uniform convergence, the Weierstrass* $M$*-test, and uniform approximation by polynomials.*

Suppose we have some sequence of functions $f^{(n)}:[0,1]\to\mathbb{R}$, $f^{(n)}(x)\overset{\text{def}}{=}x^n$, which converge pointwise to the 1-indicator function $f:[0,1]\to\mathbb{R}$ (i.e. $f(1)=1$ and $f(x)=0$ otherwise). Clearly, each $f^{(n)}$ is (infinitely) differentiable. However, the limiting function $f$ isn’t even continuous at $x=1$, let alone differentiable there! Basically, pointwise convergence isn’t at all strong enough to stop the limit from “snapping” the continuity of its constituent functions.
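To make the “snapping” concrete, here is a minimal Python sketch (the helper names `f_n`, `f_limit`, and `witness` are illustrative, not from the book). At a fixed $x<1$ the values $x^n$ shrink to 0, yet for every $n$ the point $x_n=2^{-1/n}$ gives $x_n^n=1/2$ while the limit function is 0 there. So the sup-norm distance to the limit never drops below $1/2$: the convergence is pointwise but not uniform.

```python
def f_n(n: int, x: float) -> float:
    """The n-th function in the sequence: x -> x^n on [0, 1]."""
    return x ** n


def f_limit(x: float) -> float:
    """The pointwise limit: 1 at x = 1, and 0 elsewhere on [0, 1]."""
    return 1.0 if x == 1.0 else 0.0


def witness(n: int) -> float:
    """A point x_n < 1 chosen so that x_n^n = 1/2."""
    return 0.5 ** (1.0 / n)


# Pointwise convergence: at a fixed x < 1, the values shrink toward 0.
values_at_fixed_x = [f_n(n, 0.9) for n in (1, 10, 100, 1000)]

# No uniform convergence: the gap at the moving witness point stays near 1/2.
gaps = [abs(f_n(n, witness(n)) - f_limit(witness(n))) for n in (1, 10, 100, 1000)]

print(values_at_fixed_x)
print(gaps)
```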

As in previous posts, I mark my progression by sharing a result derived without outside help.

*Already proven:* $\int_{-1}^{1}(1-x^2)^N\,dx\geq\frac{1}{\sqrt{N}}$.

*Definition.* Let $\epsilon>0$ and $0<\delta<1$. A function $f:\mathbb{R}\to\mathbb{R}$ is said to be an $(\epsilon,\delta)$*-approximation to the identity* if it obeys the following three properties:

- $f$ is compactly supported on $[−1,1]$.
- $f$ is continuous, and $\int_{-\infty}^{\infty}f=1$.
- $∣f(x)∣≤ϵ$ for all $δ≤∣x∣≤1$.

*Lemma:* For every $ϵ>0$ and $0<δ<1$, there exists an $(ϵ,δ)$-approximation to the identity which is a polynomial $P$ on $[−1,1]$.

*Proof of Exercise 14.8.2(c).* Suppose $c\in\mathbb{R},N\in\mathbb{N}$; define $f(x)\overset{\text{def}}{=}c(1-x^2)^N$ for $x\in[-1,1]$ and 0 otherwise. Clearly, $f$ is compactly supported on $[-1,1]$ and is continuous. We want to find $c,N$ such that the second and third properties are satisfied. Since $(1-x^2)^N$ is non-negative on $[-1,1]$, $c$ must be positive, as $f$ must integrate to 1. Therefore, $f$ is non-negative.

We want to show that $|c(1-x^2)^N|\leq\epsilon$ for all $\delta\leq|x|\leq 1$. Since $f$ is non-negative, we may simplify to $(1-x^2)^N\leq\frac{\epsilon}{c}$. Since the left-hand side is strictly monotone increasing on $[-1,-\delta]$ and strictly monotone decreasing on $[\delta,1]$, we substitute $x=\delta$ without loss of generality. As $\epsilon>0$ and both sides are positive, we may take the reciprocal and multiply by $\epsilon$, arriving at $\epsilon(1-\delta^2)^{-N}\geq c$.

We want $\int_{-\infty}^{\infty}f=1$; as $f$ is compactly supported on $[-1,1]$, this is equivalent to $\int_{-1}^{1}f(x)\,dx=1$. Using basic properties of the Riemann integral, we have $\int_{-1}^{1}(1-x^2)^N\,dx=\frac{1}{c}$. Substituting in for $c$,

$$\epsilon^{-1}(1-\delta^2)^N\leq\frac{1}{\sqrt{N}}\leq\int_{-1}^{1}(1-x^2)^N\,dx,$$
with the second inequality already having been proven earlier. Note that although the first inequality is not always true, we can make it so: since $\epsilon$ is fixed and $1-\delta^2\in(0,1)$, the left-hand side approaches 0 more quickly than $\frac{1}{\sqrt{N}}$ does. Therefore, we can make $N$ as large as necessary; isolating $\epsilon$,

$$\epsilon\geq(1-\delta^2)^N\sqrt{N}.$$
Because $0<1-\delta^2<1$, the right-hand side tends to 0 as $N\to\infty$, so set $N$ to be any natural number such that this inequality is satisfied. Finally, we set $c=\frac{1}{\int_{-1}^{1}(1-x^2)^N\,dx}$. By construction, these values of $c,N$ satisfy the second and third properties. □
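The recipe in the proof can be checked numerically. The following is a minimal sketch, assuming the earlier bound is $\int_{-1}^{1}(1-x^2)^N\,dx\geq 1/\sqrt{N}$ and using the standard closed form $\int_{-1}^{1}(1-x^2)^N\,dx=2\prod_{k=1}^{N}\frac{2k}{2k+1}$, which is not part of the proof above. The function names `bump_integral` and `choose_parameters` are hypothetical.

```python
import math


def bump_integral(N: int) -> float:
    """Integral of (1 - x^2)^N over [-1, 1], via 2 * prod_{k=1..N} 2k / (2k + 1)."""
    value = 2.0
    for k in range(1, N + 1):
        value *= 2 * k / (2 * k + 1)
    return value


def choose_parameters(eps: float, delta: float):
    """Follow the proof: take the smallest N with (1 - delta^2)^N * sqrt(N) <= eps,
    then normalize with c = 1 / integral so the bump integrates to 1."""
    N = 1
    while (1 - delta ** 2) ** N * math.sqrt(N) > eps:
        N += 1
    c = 1.0 / bump_integral(N)
    return N, c


eps, delta = 0.01, 0.5
N, c = choose_parameters(eps, delta)

# Third property: the largest value of f on delta <= |x| <= 1 is attained at |x| = delta.
worst_tail_value = c * (1 - delta ** 2) ** N
print(N, c, worst_tail_value)
```

The loop always terminates because the exponential factor eventually beats $\sqrt{N}$, which is the same observation the proof relies on.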

Those looking for an excellent explanation of convolutions, look no further!

*Theorem.* Suppose $f:[a,b]\to\mathbb{R}$ is continuous and compactly supported on $[a,b]$. Then for every $\epsilon>0$, there exists a polynomial $P$ such that $\|P-f\|_{\infty}<\epsilon$.

In other words, any continuous, real-valued $f$ on a finite interval can be approximated with arbitrary precision by polynomials.
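For a hands-on feel, here is a small sketch that uses Bernstein polynomials on $[0,1]$. This is a different constructive route from the convolution argument the book builds, swapped in because it fits in a few lines. The names `bernstein`, `sup_error`, and `kink` are hypothetical, and the grid maximum is only a proxy for the true sup norm.

```python
from math import comb


def bernstein(f, n: int):
    """Return the degree-n Bernstein polynomial of f on [0, 1] as a callable."""
    def poly(x: float) -> float:
        return sum(
            f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k)
            for k in range(n + 1)
        )
    return poly


def sup_error(f, p, samples: int = 201) -> float:
    """Largest |p - f| over an evenly spaced grid on [0, 1]."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(p(x) - f(x)) for x in grid)


def kink(x: float) -> float:
    """A continuous test function that is not differentiable at 1/2."""
    return abs(x - 0.5)


# The error shrinks as the degree grows, even at the kink.
errors = {n: sup_error(kink, bernstein(kink, n)) for n in (10, 50, 200)}
print(errors)
```

A general interval $[a,b]$ can be handled by first rescaling it affinely onto $[0,1]$.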

*Why I’m talking about this.* On one hand, this result makes sense, especially after taking machine learning and seeing how polynomials can be contorted into basically whatever shape you want.

On the other hand, I find this theorem intensely beautiful. The proof of $\overline{P[a,b]}=C[a,b]$ was slowly constructed, much to the reader’s benefit. I remember the very moment the proof sketch came to me, newly-installed gears whirring happily.

*Real analytic functions, Abel’s theorem, $\exp$ and $\log$, complex numbers, and trigonometric functions.*

Cached thought from my CS undergrad: Exponential functions always end up growing more quickly than polynomials, no matter the degree. Now, I finally have the gears to see why:

$$\exp(x)\overset{\text{def}}{=}\sum_{k=0}^{\infty}\frac{x^k}{k!}.$$
$\exp$ has *all* the degrees, so no polynomial (of necessarily finite degree) could ever hope to compete! This also suggests why $\frac{d}{dx}e^x=e^x$.
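Both claims can be sketched in a line each. For $x>0$ every term of the series is positive, so $\exp(x)$ is at least as large as any single term; comparing $x^d$ against the term of degree $d+1$ gives

$$\frac{x^d}{\exp(x)}\leq\frac{x^d}{x^{d+1}/(d+1)!}=\frac{(d+1)!}{x}\to 0\quad\text{as }x\to\infty.$$

Likewise, differentiating term by term (legitimate for a power series inside its radius of convergence) just shifts the index:

$$\frac{d}{dx}\sum_{k=0}^{\infty}\frac{x^k}{k!}=\sum_{k=1}^{\infty}\frac{kx^{k-1}}{k!}=\sum_{k=1}^{\infty}\frac{x^{k-1}}{(k-1)!}=\exp(x).$$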

- The book
    - You can multiply a number by itself some number of times.
- Me
- The book
    - You can multiply a number by itself a negative number of times.
- Me
    - Sure.
- The book
    - You can multiply a number by itself an irrational number of times.
- Me
    - ... OK, I understand limits.
- The book
    - You can multiply a number by itself an imaginary number of times.
- Me
    - Out. Now.

Seriously, this one’s weird (rather, it *seems* weird, but how can “how the world is” be “weird”?).

Suppose we have some $c\in\mathbb{C}$, where $c=a+bi$. Then $e^c=e^ae^{bi}$, so “all” we need to figure out is how to take an imaginary exponent. Brian Slesinsky has us covered.
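One way to see that an imaginary exponent is at least well-defined: feed $bi$ into the same power series that defines $\exp$. The sketch below does that and compares against Euler’s formula $e^{bi}=\cos b+i\sin b$. It is an illustration under that standard identity, not a summary of the linked explanation, and `exp_series` is a hypothetical helper name.

```python
import cmath
import math


def exp_series(z: complex, terms: int = 40) -> complex:
    """Partial sum of the power series: sum of z^k / k! for k < terms."""
    total = 0j
    term = 1 + 0j  # z^0 / 0!
    for k in range(terms):
        total += term
        term *= z / (k + 1)  # turn z^k / k! into z^(k+1) / (k+1)!
    return total


b = math.pi / 3
via_series = exp_series(1j * b)
via_trig = complex(math.cos(b), math.sin(b))
via_library = cmath.exp(1j * b)

# All three should agree up to floating-point error.
print(abs(via_series - via_trig), abs(via_series - via_library))
```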

*Years before becoming involved with the rationalist community, Nate asks this question, and Qiaochu answers.*

*This isn’t a coincidence, because nothing is ever a coincidence.*

*Or maybe it is a coincidence, because Qiaochu answered every question on StackExchange.*

*Periodic functions, trigonometric polynomials, periodic convolutions, and the Fourier theorem.*

*A beautiful unification of Linear Algebra and calculus: linear maps as derivatives of multivariate functions, partial and directional derivatives, Clairaut’s theorem, contractions and fixed points, and the inverse and implicit function theorems.*

If you have a set of points in $\mathbb{R}^n$, when do you know if it’s secretly a function $g:\mathbb{R}^{n-1}\to\mathbb{R}$? For functions $\mathbb{R}\to\mathbb{R}$, we can just use the geometric “vertical line test” to figure this out, but that’s a bit harder when you only have an algebraic definition. Also, sometimes we can implicitly define a function locally by restricting its domain (even if no explicit form exists for the whole set).

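*Theorem.* Let $E$ be an open subset of $\mathbb{R}^n$, let $f:E\to\mathbb{R}$ be continuously differentiable, and let $y=(y_1,\ldots,y_n)$ be a point in $E$ such that $f(y)=0$ and $\frac{\partial f}{\partial x_n}(y)\neq 0$. Then there exists an open $U\subseteq\mathbb{R}^{n-1}$ containing $(y_1,\ldots,y_{n-1})$, an open $V\subseteq E$ containing $y$, and a function $g:U\to\mathbb{R}$ such that $g(y_1,\ldots,y_{n-1})=y_n$, and

$$\{(x_1,\ldots,x_n)\in V:f(x_1,\ldots,x_n)=0\}=\{(x_1,\ldots,x_{n-1},g(x_1,\ldots,x_{n-1})):(x_1,\ldots,x_{n-1})\in U\}.$$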

So, I think what’s really going on here is that we’re using the derivative at this known zero to locally linearize the manifold we’re operating on (similar to Newton’s approximation), which lets us have some neighborhood $U$ in which we can derive an implicit function, even if we can’t always write it out.
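A small sketch of that idea on the unit circle, assuming the example $f(x,y)=x^2+y^2-1$ with known zero $(0,1)$ (the example and the helper `implicit_g` are illustrative choices, not from the book). Since $\partial f/\partial y=2y\neq 0$ at that point, Newton’s method in the last variable recovers the implicit function near it, and here we can compare against the explicit branch $\sqrt{1-x^2}$.

```python
import math


def f(x: float, y: float) -> float:
    """Unit circle as a zero set: f(x, y) = x^2 + y^2 - 1."""
    return x * x + y * y - 1.0


def df_dy(x: float, y: float) -> float:
    """Partial derivative of f with respect to the last variable."""
    return 2.0 * y


def implicit_g(x: float, y_start: float, steps: int = 30) -> float:
    """Solve f(x, y) = 0 for y by Newton's method, starting at the known zero's y-value."""
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / df_dy(x, y)
    return y


# Stay in a neighborhood of x = 0, where the upper arc is a graph over x.
xs = [-0.5, -0.1, 0.0, 0.3, 0.6]
recovered = [implicit_g(x, y_start=1.0) for x in xs]
explicit = [math.sqrt(1.0 - x * x) for x in xs]
print(list(zip(recovered, explicit)))
```

At $(1,0)$ the hypothesis fails ($\partial f/\partial y=0$), and indeed no neighborhood of that point is the graph of a function of $x$.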

*Outer measure; measurable sets and functions.*

Tao lists desiderata for an ideal measure before deriving it. Imagine that.

*Building up the Lebesgue integral, culminating with Fubini’s theorem.*

Suppose $\Omega\subseteq\mathbb{R}^n$ is measurable, and let $f:\Omega\to[0,\infty]$ be a measurable, non-negative function. The Lebesgue integral of $f$ is then defined as

$$\int_{\Omega}f\overset{\text{def}}{=}\sup\left\{\int_{\Omega}s:s\text{ is simple and non-negative, and minorizes }f\right\}.$$
This hews closely to how we defined the *lower* Riemann integral in Chapter 11; however, we don’t need the equivalent of the upper Riemann integral for the Lebesgue integral.

To see why, let’s review why Riemann integrability demands the equality of the lower and upper Riemann integrals of a function $g$. Suppose that we integrate over $[0,1]$, and that $g$ is the indicator function for the rationals. As the rationals are dense in the reals, any interval $[a,b]⊆[0,1]$ ($b>a$) contains rational numbers, no matter how much the interval shrinks! Therefore, the upper Riemann integral equals 1, while the lower equals 0 (for similar reasons). $g$ *is* Lebesgue integrable; since it’s 0 almost everywhere (as the rationals have 0 measure), its integral is 0.

This marks a fundamental shift in how we integrate. With the Riemann integral, we consider the $\limsup$ and $\liminf$ of increasingly-refined upper and lower Riemann sums; this is the *length* approach. In Lebesgue integration, however, we consider which $E\subseteq\Omega$ is responsible for each value $y$ in the range (i.e. $f^{-1}(y)=E$), multiplying $y$ by the measure of $E$; this is *inversion*.
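For a simple function, the “inversion” view reduces to a finite sum of (value) × (measure of the set where that value is taken). Here is a minimal sketch, assuming the measures of the preimages are supplied by hand (the code does not compute Lebesgue measure) and using the hypothetical helper `lebesgue_integral_simple`.

```python
from fractions import Fraction


def lebesgue_integral_simple(pieces):
    """Integrate a non-negative simple function given as (value, measure of preimage) pairs."""
    return sum(value * measure for value, measure in pieces)


# Indicator of the rationals on [0, 1]: the value 1 is taken on a null set,
# and the value 0 is taken on a set of measure 1.
dirichlet = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]

# A step function on [0, 1]: 2 on [0, 1/4) and 5 on [1/4, 1].
step = [(Fraction(2), Fraction(1, 4)), (Fraction(5), Fraction(3, 4))]

print(lebesgue_integral_simple(dirichlet))
print(lebesgue_integral_simple(step))
```

The rationals’ indicator, which defeats the Riemann integral, is just a two-piece simple function here, and the piece carrying the value 1 contributes nothing.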

In a sense, the Lebesgue integral more cleanly strikes at the heart of what it *means* to integrate. Surely, Riemann integration was not far from the mark; however, if you rotate the problem slightly in your mind, you will find a better, cleaner way of structuring your thinking.

Although Tao botches a few exercises and the section on topology, I’m a big fan of *Analysis I* and *II*. Do note, however, that *II* is far more difficult than *I* (not just in content, but in terms of the exercises). He generally provides relevant, appropriately-difficult problems, and is quite adept at helping the reader develop rigorous and intuitive understanding of the material.

- To avoid getting hung up in Chapter 17, this book should be read after a linear algebra text.
- Don’t do exercise 17.6.3—it’s wrong.
- Deep understanding comes from sweating it out. Don’t hide, don’t wave away bothersome details; stay and explore. If you follow my strategy of quickly generating outlines, ask yourself: can you formally and precisely write out each step?

I completed every exercise in this book; in the second half, I started avoiding looking at the hints provided by problems until I’d already thought for a few minutes. Often, I’d solve the problem and then turn to the hint: “be careful when doing *X*—don’t forget edge case *Y*; hint: use lemma *Z*”! A pit would form in my stomach as I prepared to locate my mistake and back-propagate where-I-should-have-looked, before realizing that I’d *already* taken care of that edge case using that lemma.

One can argue that my time would be better spent picking up things as I work on problems in alignment. However, while I’ve made, uh, quite a bit of progress with impact measures this way, concept-shaped holes are impossible to notice. If there’s some helpful information-theoretic way of viewing a problem that I’d only realize if I had *already taken* information theory, I’m out of luck.

Also, developing mathematical maturity brings with it a more rigorous thought process.

There’s a sense I get where even though I’ve made immense progress over the past few months, it still *might not be enough*. The standard isn’t “am I doing impressive things for my reference class?”, but rather the stricter “am I good enough to solve serious problems that might not get solved in time otherwise?”. This is quite the standard, and even given my textbook and research progress (including the upcoming posts), I don’t think I meet it.

In a way, this excites me. I welcome any advice for buckling down further and becoming yet stronger.

Thank you to everyone who has helped me. In particular, `TheMajor` has been incredibly generous with their explanations and encouragement.

**Sequence:** Becoming Stronger
