(Note: In this post I will refer to the complex numbers by the set $\Bbb C$ and the real numbers by $\Bbb R$. $\Bbb R^n$ is merely the Cartesian product of $\Bbb R$ with itself $n$ times (equivalently, the set of vectors with $n$ components). If $n=1$, this is simply the real line; if $n=2$, this is the $xy$-plane; and if $n=3$, this is three-dimensional space as we know it.)
Calculus on $\Bbb R^2$
(Side note: Strictly speaking, the functions of interest in the upcoming sections are defined on subsets of $\Bbb R^n$ and $\Bbb C$, but I will omit this technicality except in the case of functions on $\Bbb R$, because it only makes definitions longer without adding much to the actual concept.)
Firstly, "calculus on $\Bbb R^2$" is a bit ambiguous since one can consider functions from $\Bbb R^2$ to $\Bbb R$, $\Bbb R^2$ to $\Bbb C$ or $\Bbb R^2$ to $\Bbb R^n$. Examples of such functions include $f(x,y) = x$, $g(x,y) = x+iy$ and $h(x,y) = (x,0,\ldots,0)$, respectively. I will restrict my discussion to $\Bbb R^2$ since complex functions take $\Bbb C$ to $\Bbb C$ and the analogy is best served in this light. Since this post is about calculus on certain spaces, we need to explore topics in calculus. The simplest one to consider is, of course, limits, but limits behave similarly in $\Bbb R^2$ and $\Bbb C$ since they "look the same" (effectively have the same notion of distance). The next layer of abstraction would then be differentiation and this is where we begin to see differences between real and complex analysis.
We wish to consider differentiable functions from $\Bbb R^2$ to $\Bbb R^2$, but what is a derivative between these two sets? What does it look like? It turns out that derivatives of functions from $\Bbb R^n$ to $\Bbb R^m$ are represented by $m\times n$ matrices. To develop this notion, one must reconsider what a derivative is, because the usual definition from introductory calculus courses does not carry over exactly as stated.
The standard definition of a derivative from introductory calculus is as follows. A function $f:[a,b]\to\Bbb R$ has a derivative at some point $c\in(a,b)$ if
$$ \lim_{h\rightarrow 0} \frac{f(c+h)-f(c)}{h} $$
exists. This quantity is called the derivative of $f$ at $c$ and is denoted $f'(c)$. We would like to adapt this definition to the case of functions from $\Bbb R^n$ to $\Bbb R^m$, so let's attempt to do so.
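(As a quick sanity check, here is a small numerical illustration of this definition in Python; the function $f(x)=x^2$ and the point $c=3$ are just my choices for demonstration.)

```python
# Numerically approximating the derivative of f(x) = x^2 at c = 3 via the
# difference quotient; the exact value is f'(3) = 2*3 = 6.
f = lambda x: x * x
c = 3.0
for h in (1e-1, 1e-3, 1e-6):
    print(h, (f(c + h) - f(c)) / h)  # tends to 6 as h -> 0
```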
Suppose $f:\Bbb R^n\to\Bbb R^m$. Let's try to define the derivative like we do above. Let $\vec{x},\vec{h}\in \Bbb R^n$, then if we haphazardly apply the previous definition we have
$$ \lim_{\vec{h}\rightarrow \vec{0}} \frac{f(\vec{x}+\vec{h})-f(\vec{x})}{\vec{h}}. $$
However, this isn't well defined. The numerator makes sense, but it doesn't make sense to divide by a vector. This harks back to my point in the introduction to this post: we cannot divide by a vector, so this notion of differentiation cannot possibly work out. We could define some notion of multiplication of vectors so that division by vectors (or rather, the inverse of a vector) makes sense, but any such choice is non-unique and only complicates the matter.
Remark: The first definition of a derivative is a limit, and when working with limits one inherently needs a notion of "length" (really one needs a distance, but if one can speak of lengths of vectors, one can speak of distances between vectors). That is, we need to be able to say that vectors are getting close to one another. There are many possible definitions, but the one I will stick to is the most familiar: the Euclidean distance. In $\Bbb R^n$, the Euclidean length of a vector $\vec{x}$ is given by
$$ \|\vec{x}\| = \left(\sum_{i=1}^n x_i^2\right)^{\frac{1}{2}}, $$
where the $x_i$ are the components of $\vec{x}$. This can be thought of as the familiar Pythagorean theorem generalized to $n$ dimensions. Now when doing limits in $\Bbb R^n$, the notion of distance between vectors $\vec{x}$ and $\vec{y}$ will be given by $\|\vec{x}-\vec{y}\|$, so we say that a sequence of vectors $\vec{x}_m$ has a limit vector of $\vec{L}$ if $\|\vec{x}_m-\vec{L}\|$ goes to $0$, meaning our vectors get close to $\vec{L}$.
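(Here is a small Python sketch of this norm and the resulting notion of distance; the sequence $\vec{x}_m = (1/m, 1/m)$ converging to $\vec{0}$ is my own example.)

```python
import math

def norm(x):
    """Euclidean length of a vector in R^n (Pythagoras in n dimensions)."""
    return math.sqrt(sum(xi ** 2 for xi in x))

def distance(x, y):
    """Distance between vectors x and y, given by ||x - y||."""
    return norm([xi - yi for xi, yi in zip(x, y)])

# A sequence x_m = (1/m, 1/m) with limit L = (0, 0): ||x_m - L|| -> 0.
L = (0.0, 0.0)
for m in (1, 10, 100, 1000):
    x_m = (1.0 / m, 1.0 / m)
    print(m, distance(x_m, L))
```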
With this machinery we can start discussing derivatives of functions from $\Bbb R^n$ to $\Bbb R^m$. One important property of the derivative on $\Bbb R$ is that it linearizes a function; that is, if we have a function $f$ that is differentiable at $x$, then $f(x+a)\approx f(x)+f'(x)a$ when $a$ is small. This property is very powerful because it allows us to get approximate values of a function based on knowing its derivative and value at a point. If we are to generalize this idea to work on $\Bbb R^n$ (to $\Bbb R^m$), we will need $f'(x)$ to be an $m\times n$ matrix so that multiplication by a vector in $\Bbb R^n$ makes sense (analogous to multiplying by $a$ above) and gives a vector in $\Bbb R^m$. (If you recall from multivariable calculus, this matrix is the Jacobian.)

To formalize the above approximation, one can say that $f(x+a)-f(x)-f'(x)a = o(a)$, meaning the error vanishes faster than $a$ itself as $a$ goes to $0$ (for sufficiently smooth $f$, the error consists of terms of order $a^2$ and higher, which are negligible when $a$ is small). If we massage this a little, we get the following definition of differentiability (which is equivalent to the first definition above):
A function $f:[a,b]\to\Bbb R$ is differentiable at $c\in(a,b)$ if there exists $L\in\Bbb R$ so that
$$ \lim_{h\rightarrow 0} \frac{f(c+h)-f(c)-Lh}{h} = 0 ,$$
where $L$ is called the derivative of $f$ and is denoted $f'(c)$. It turns out that this definition carries over very well to the case of functions from $\Bbb R^n$ to $\Bbb R^m$ and the definition is as follows. A function $f:\Bbb R^n\to\Bbb R^m$ is differentiable at $\vec{x}\in\Bbb R^n$ if there exists an $m\times n$ matrix $L$ such that
$$ \lim_{\|\vec{h}\|\rightarrow 0} \frac{\|f(\vec{x}+\vec{h})-f(\vec{x})-L\vec{h}\|}{\|\vec{h}\|} = 0 ,$$
where $L$ is called the derivative of $f$ and is denoted $f'(\vec{x})$. In this definition you can see the pieces we put together: dividing by $\|\vec{h}\|$ instead of $\vec{h}$, and the linearization behavior of the derivative. The reason we take the norm of the numerator is so that we don't have to worry about it when we take the limit (though this wasn't entirely necessary, since we need to speak of lengths anyway when doing limits). And now we have a proper notion of differentiation of functions from $\Bbb R^n$ to $\Bbb R^m$. (It turns out that this definition of a derivative generalizes to functions on more abstract spaces, like function spaces, where you would be taking derivatives of functions of functions.) In the case of functions from $\Bbb R^2$ to $\Bbb R^2$, the derivative is a $2\times 2$ matrix. It will turn out that derivatives of functions from $\Bbb C$ to $\Bbb C$ are quite different, due to the ability to multiply and divide complex numbers.
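(To make this concrete, here is a small numerical sketch in Python; the test function $f(x,y)=(x^2-y,\,xy)$ and its Jacobian are my own choices for illustration. The ratio from the definition shrinks as $\|\vec{h}\|$ does.)

```python
import numpy as np

# Hypothetical test function f(x, y) = (x^2 - y, x*y) and its Jacobian.
def f(v):
    x, y = v
    return np.array([x ** 2 - y, x * y])

def jacobian(v):
    x, y = v
    return np.array([[2 * x, -1.0],
                     [y, x]])

x = np.array([1.0, 2.0])
L = jacobian(x)
for t in (1e-1, 1e-2, 1e-3, 1e-4):
    h = t * np.array([0.6, -0.8])  # shrink h along a fixed direction
    ratio = np.linalg.norm(f(x + h) - f(x) - L @ h) / np.linalg.norm(h)
    print(t, ratio)  # ratios shrink toward 0, as the definition requires
```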
Calculus on $\Bbb C$
A function $f$ on $\Bbb C$ can be written as $f(z) = f(x+iy)$ (recall from my last post that we write $z=x+iy$). In general, $f$ assigns to each point $(x,y)\in\Bbb C$ another point $(x',y')\in\Bbb C$, so we write $f(x,y) = u(x,y)+iv(x,y)$. We say that $u$ is the real part of $f$ and $v$ is the imaginary part of $f$; that is, $u$ associates $(x,y)$ with $x'$ and $v$ associates $(x,y)$ with $y'$. Like above, we would like to define a length of complex numbers. Since $\Bbb C$ and $\Bbb R^2$ both have the same structure with respect to addition, they look the same (we associate each complex number $x+iy$ with the ordered pair $(x,y)$ in the plane $\Bbb R^2$). In this way you can view complex numbers as two-dimensional vectors with one difference: we can multiply and divide by complex numbers. Since $\Bbb C$ can be viewed as a plane of points, associating $x+iy$ with the vector $(x,y)$, we can speak of the length of a complex number.
By inspection we see that the length of $x+iy$ (denoted $|x+iy|$ - this is called the modulus or norm of $x+iy$) is $\sqrt{x^2+y^2}$. It turns out that we can write this as $\sqrt{(x+iy)(x-iy)}$ (check this for yourself). The quantity $x-iy$ looks very similar to $x+iy$ but with the sign of $iy$ changed, and this is in fact the complex conjugate of $x+iy$ (the complex conjugate changes the sign of the term with $i$ in it). If we write $z=x+iy$, then we write $z^*=x-iy$ and call $z^*$ the complex conjugate of $z$. Other common notation is $\bar{z}$ for the complex conjugate. Then we write that $|z|^2 = zz^* = x^2+y^2$.
Since we can multiply complex numbers, we should be able to divide by complex numbers (with the exception of $0$, since dividing by $0$ is not well-defined). Let $z,z'$ be complex numbers with $z$ given. To find the inverse of $z$, we want to solve $zz' = 1$ for $z'$. If we multiply both sides by $z^*$, we get $(x^2+y^2)z' = z^*$, so that $z' = \dfrac{z^*}{x^2+y^2}$, and thus we can take inverses of nonzero complex numbers! This fact becomes very important when considering derivatives of functions from $\Bbb C$ to $\Bbb C$.
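(Python's built-in complex type makes this easy to check; the value $z=3+4i$ is just an example of my choosing.)

```python
z = 3 + 4j
z_conj = z.conjugate()       # the complex conjugate x - iy
mod_sq = (z * z_conj).real   # |z|^2 = x^2 + y^2 = 25
z_inv = z_conj / mod_sq      # the inverse z* / (x^2 + y^2)
print(abs(z))                # 5.0, the modulus sqrt(x^2 + y^2)
print(z * z_inv)             # (1+0j), confirming z * z^{-1} = 1
print(1 / z)                 # same value, via Python's built-in complex division
```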
Now we have built up all of the necessary machinery for talking about derivatives of functions from $\Bbb C$ to $\Bbb C$. Let us try to apply the original limit definition of the derivative and see if there are any difficulties like in the real variable case.
Suppose $f:\Bbb C\to\Bbb C$. Let us haphazardly use the definition of a derivative from the $\Bbb R$ to $\Bbb R$ case (with minor changes) and see if anything goes wrong. We then wish to look at the following
$$ \lim_{\Delta z\rightarrow 0} \frac{f(z+\Delta z)-f(z)}{\Delta z}. $$
Since we can add, subtract and divide complex numbers (as long as the divisor is nonzero), we can make sense of this limit, unlike in the real variable ($\Bbb R^2$) case! As a result, there is no need to talk about dividing by the lengths of complex numbers (norms) here. Since this quotient makes sense, we can define the derivative of a complex function with it.
Let $f:\Bbb C\to\Bbb C$. It is said to be complex differentiable at $z$ if
$$ \lim_{\Delta z\rightarrow 0} \frac{f(z+\Delta z)-f(z)}{\Delta z} $$
exists. We call this the derivative of $f$ at $z$ and denote it by $f'(z)$.
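(Here is a quick numerical illustration using Python's complex numbers; the test functions $z^2$ and $\bar{z}$ are my own choices. The second one previews an important subtlety: the quotient can depend on the direction from which $\Delta z$ approaches $0$.)

```python
def diff_quotient(f, z, dz):
    """The complex difference quotient (f(z + dz) - f(z)) / dz."""
    return (f(z + dz) - f(z)) / dz

z = 1 + 2j
for t in (1e-2, 1e-4, 1e-6):
    print(diff_quotient(lambda w: w * w, z, t),       # along a horizontal line
          diff_quotient(lambda w: w * w, z, t * 1j))  # along a vertical line
# Both columns approach 2z = 2 + 4j: f(z) = z^2 is complex differentiable.

print(diff_quotient(lambda w: w.conjugate(), z, 1e-6),       # gives 1
      diff_quotient(lambda w: w.conjugate(), z, 1e-6 * 1j))  # gives -1
# The two directions disagree, so f(z) = conj(z) is not complex differentiable.
```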
Recall from multivariable real analysis that if a limit of a function exists at a particular point, the value should come out the same regardless of how we approach that point. Making that same restriction here, we can come up with equations that couple $u$ and $v$. More specifically, if we approach $z$ along a horizontal line and along a vertical line, we should get the same result for the derivative.
Let's assume $f$ is complex differentiable and see what comes of it. Along horizontal lines, $z' = x+iy'$, where $y'$ is constant, and along vertical lines, $z' = x'+iy$, where $x'$ is constant. Therefore, along horizontal lines, $\Delta z' = \Delta x$, and along vertical lines, $\Delta z' = i\Delta y$. Hence our expression for the derivative becomes (along a horizontal line)
$$ \lim_{\Delta x\rightarrow 0} \frac{f(x+iy+\Delta x) - f(x+iy)}{\Delta x} = \lim_{\Delta x\rightarrow 0} \frac{u(x+\Delta x, y) + iv(x+\Delta x,y) - u(x, y) - iv(x,y)}{\Delta x}. $$
Along a vertical line, our expression for the derivative becomes
$$ \lim_{\Delta y\rightarrow 0} \frac{f(x+iy+i\Delta y) - f(x+iy)}{i\Delta y} = \lim_{\Delta y\rightarrow 0} \frac{u(x,y+\Delta y) +iv(x, y+\Delta y) - u(x,y) - iv(x,y)}{i\Delta y} .$$
We can recognize the limit along the $x$ direction as the partial derivatives of $u$ and $v$ with respect to $x$, giving
$$ f '(z) = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} $$
and we can recognize the limit along the $y$ direction as the partial derivatives of $u$ and $v$ with respect to $y$ (recalling that $1/i = -i$, which accounts for the factor of $-i$ below), giving
$$ f '(z) = -i\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} .$$
Equating the real and imaginary parts in these two expressions leads to the following relations
$$ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} $$
and
$$ \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} . $$
These are known as the Cauchy-Riemann equations. It turns out that these equations imply that $u$ and $v$ are both harmonic functions, about which a great deal of analysis has been done (the maximum principle, for example). These two seemingly innocuous equations are at the heart of why complex analysis is so different from real analysis. In the real case there is no such coupling between the components of the function (if you separate them as we did here); the coupling arises precisely because we can divide by complex numbers.
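(Here is a small finite-difference check of these equations in Python, using $f(z)=z^2$ as an example of my choosing, so that $u=x^2-y^2$ and $v=2xy$.)

```python
# Finite-difference check of the Cauchy-Riemann equations for f(z) = z^2,
# whose parts are u(x, y) = x^2 - y^2 and v(x, y) = 2xy.
def u(x, y): return x * x - y * y
def v(x, y): return 2 * x * y

def d_dx(g, x, y, h=1e-6): return (g(x + h, y) - g(x - h, y)) / (2 * h)
def d_dy(g, x, y, h=1e-6): return (g(x, y + h) - g(x, y - h)) / (2 * h)

x, y = 1.5, -0.7
print(d_dx(u, x, y), d_dy(v, x, y))   # u_x = v_y   (both ~ 3.0)
print(d_dy(u, x, y), -d_dx(v, x, y))  # u_y = -v_x  (both ~ 1.4)
```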
It turns out that if a complex function is complex differentiable on an open set, then it is infinitely differentiable on that open set. This shows that complex differentiability is much stronger than ordinary differentiability. One particular oddity of complex analysis is that if a complex function is differentiable everywhere (such a function is called entire) and bounded, then it must be constant (this is Liouville's theorem); in other words, a non-constant entire function is necessarily unbounded. Contrast this with the real case: $\sin(xy)$ is real differentiable everywhere and bounded, but is not constant. Complex analysis is rife with counter-intuitive results such as this. Another example: if an entire function omits two (or more) points of $\Bbb C$ from its image, then it must be constant (this is Picard's little theorem).
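(For instance, $\sin$ extends to an entire function on $\Bbb C$, so by the dichotomy above it must be unbounded there, even though it is bounded on $\Bbb R$. A quick numerical check in Python:)

```python
import cmath

# sin is entire, so it must be unbounded on C: along the imaginary axis,
# |sin(iy)| = sinh(y) grows without bound.
for y in (1.0, 5.0, 10.0, 20.0):
    print(y, abs(cmath.sin(1j * y)))  # ~1.18, ~74.2, ~1.1e4, ~2.4e8
```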
So despite the vast similarities between $\Bbb R^2$ and $\Bbb C$, defining multiplication of complex numbers vastly changes the landscape of calculus between the two spaces, as evidenced by the Cauchy-Riemann equations and other results. The beauty of complex analysis truly stems from the added structure of multiplication of complex numbers; without a notion of multiplication, calculus on $\Bbb C$ would be the same as calculus on $\Bbb R^2$. There is a lot more that could be said, but I feel this is a good place to stop, especially since the post got away from me again and ended up being much longer than I intended.