On the `$xOy$` plane, let the closed region bounded by the two semicircular arcs `$(x-1)^2+y^2=1~(x\geq 1)$` and `$(x-3)^2+y^2=1~(x\geq 3)$` and the two lines `$y=1$` and `$y=-1$` be denoted by `$D$`, as in the shaded part of the figure. Let `$\Omega$` be the solid obtained by rotating `$D$` once around the `$y$` axis. The horizontal cross section of `$\Omega$` through `$(0,y)~(|y|\leq 1)$` has area `$4\pi\sqrt{1-y^2}+8\pi$`. Using Zu Geng's principle, a cylinder lying on its side and a rectangular box, find the volume of `$\Omega$`: `$\boxed{?}$`.

**Method 1 (Zu Geng's principle)** By the cross-section formula, `$\Omega$` has the same volume as the solid composed of a sideways cylinder with base radius `$1$` and height `$2\pi$` together with a rectangular box of height `$2$` and base area `$8\pi$`, so its volume is `$V=2\pi^2+16\pi$`.
I don't quite see why the problem states the horizontal cross-section area explicitly; to me it is a redundant condition. Below are several solutions that do not use Zu Geng's principle. They may be overkill, but the main goal is to get familiar with the relevant parts of calculus...

**Method 2 (Solid of revolution)** Since the horizontal cross section has area `$4\pi\sqrt{1-y^2}+8\pi$`, we get `$$\begin{aligned}V&=\int_{-1}^{1}\left(4\pi\sqrt{1-y^2}+8\pi\right)\mathrm{d}y\\&=4\pi\int_{-1}^{1}\left(\sqrt{1-y^2}+2\right)\mathrm{d}y\\(y=\sin(\theta))&=4\pi\int_{-\frac\pi2}^{\frac\pi2}(\cos(\theta)+2)\cos(\theta)\mathrm{d}\theta\\&=4\pi\int_{-\frac\pi2}^{\frac\pi2}\left(\cos^2(\theta)+2\cos(\theta)\right)\mathrm{d}\theta\\&=4\pi\int_{-\frac\pi2}^{\frac\pi2}\left(\frac{\cos(2\theta)+1}{2}+2\cos(\theta)\right)\mathrm{d}\theta\\&=4\pi\left[\frac{\sin(2\theta)}{4}+\frac\theta2+2\sin(\theta)\right]^{\frac\pi2}_{-\frac\pi2}\\&=4\pi\left(\frac\pi2+4\right)\\&=2\pi^2+16\pi.\end{aligned}$$`

Although this directly uses the cross-section area given in the problem, it is equivalent to computing the solid-of-revolution (washer) integral `$$V=\pi\int_{-1}^{1}\left[\left(\sqrt{1-y^2}+3\right)^2-\left(\sqrt{1-y^2}+1\right)^2\right]\mathrm{d}y.$$`

**Method 3 (Pappus's theorem)** To apply Pappus's theorem, we first need the centroid of the plane region, which can be found by decomposition together with double integrals. Split the region as in the figure below, where `$R_0$` is the central `$2\times2$` rectangle and `$R_1,R_2$` are the two half-disk regions. (Since the centroids of these regions obviously lie on the `$x$` axis, below we refer to each centroid by its `$x$` coordinate only.)

The centroid of `$R_0$` is clearly `$x_0=2$`. For `$R_1,R_2$`, applying Pappus's theorem in reverse (rotating a half disk about its diameter produces a ball) gives the centroids `$$x_1=1+\frac{4}{3\pi},\quad x_2=3+\frac{4}{3\pi}.$$`

Next we find the centroid `$G$` of the region `$D$` with double integrals. Noting that `$D=R_0-R_1+R_2$`, the centroid formula gives `$$\begin{aligned}G_x&=\frac{1}{S_D}\iint_{D}x\mathrm{d}A\\&=\frac{1}{S_D}\left(\iint_{R_0}x\mathrm{d}A-\iint_{R_1}x\mathrm{d}A+\iint_{R_2}x\mathrm{d}A\right),\end{aligned}$$` where `$S_D=4$` is the area of `$D$`.

Applying the centroid formula to each of `$R_0,R_1,R_2$` gives `$$\begin{aligned}x_0&=\frac{1}{S_{R_0}}\iint_{R_0}x\mathrm{d}A\\x_1&=\frac{1}{S_{R_1}}\iint_{R_1}x\mathrm{d}A\\x_2&=\frac{1}{S_{R_2}}\iint_{R_2}x\mathrm{d}A,\end{aligned}$$` where `$S_{R_0},S_{R_1},S_{R_2}$` are the areas of `$R_0,R_1,R_2$`. Rearranging yields a set of double-integral expressions `$$\begin{aligned}\iint_{R_0}x\mathrm{d}A&=S_{R_0}x_0\\\iint_{R_1}x\mathrm{d}A&=S_{R_1}x_1\\\iint_{R_2}x\mathrm{d}A&=S_{R_2}x_2,\end{aligned}$$` and substituting them into the previous formula (note `$S_{R_0}=S_D=4$`) gives `$$\begin{aligned}G_x&=x_0-\frac{S_{R_1}}{S_D}x_1+\frac{S_{R_2}}{S_D}x_2\\&=2-\frac{\frac\pi2}{4}\left(1+\frac{4}{3\pi}\right)+\frac{\frac\pi2}{4}\left(3+\frac{4}{3\pi}\right)\\&=\frac\pi4+2,\end{aligned}$$` so by Pappus's theorem the volume of the solid is `$$\begin{aligned}V&=2\pi\cdot G_x\cdot S_D\\&=2\pi\cdot\left(\frac\pi4+2\right)\cdot4\\&=2\pi^2+16\pi.\end{aligned}$$`
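The centroid can also be checked directly by integrating over `$D$` in horizontal strips (a sketch assuming SymPy; at height `$y$` the strip runs from `$1+\sqrt{1-y^2}$` to `$3+\sqrt{1-y^2}$`):

```python
from sympy import symbols, integrate, sqrt, pi, simplify

x, y = symbols('x y')
x_in = 1 + sqrt(1 - y**2)   # inner boundary (left arc)
x_out = 3 + sqrt(1 - y**2)  # outer boundary (right arc)

S_D = integrate(x_out - x_in, (y, -1, 1))                       # area of D = 4
Gx = integrate(integrate(x, (x, x_in, x_out)), (y, -1, 1)) / S_D  # pi/4 + 2
V = 2*pi * Gx * S_D                                              # Pappus's theorem
```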

Like a regular integral, the double integral `$\iint_{R}f(x,y)\mathrm{d}A$` represents the volume under the surface over the region `$R$`, where `$\mathrm{d}A$` is a small piece of the region. So it can be defined by a summation like the following `$$\iint_{R}f(x,y)\mathrm{d}A=\lim_{\Delta A_i\to 0}\sum f(x_i,y_i)\Delta A_i.$$` In practice, we do not calculate a double integral by definition; instead, we convert it into two single integrals.

The idea is to divide the volume into thin slices along the `$x$` direction; the slice areas change with `$x$`, and the slice determined by `$x$` has area `$$S(x)=\int_{y_{\min}(x)}^{y_{\max}(x)}f(x,y)\mathrm{d}y.$$` Now letting `$x$` vary, the double integral becomes `$$\begin{aligned}\iint_{R}f(x,y)\mathrm{d}A&=\int_{x_{\min}}^{x_{\max}}S(x)\mathrm{d}x\\&=\int_{x_{\min}}^{x_{\max}}\int_{y_{\min}(x)}^{y_{\max}(x)}f(x,y)\mathrm{d}y\mathrm{d}x.\end{aligned}$$`

**e.g.** Calculate the integral of `$z=1-x^2-y^2$` on the region `$0\leq x\leq 1,0\leq y\leq 1$`.

**Solution** `$$\begin{aligned}\iint_{R}1-x^2-y^2\mathrm{d}y\mathrm{d}x&=\int_0^1\int_0^1 1-x^2-y^2\mathrm{d}y\mathrm{d}x\\&=\int_0^1\left[y-x^2y-\frac13y^3\right]^1_0\mathrm{d}x\\&=\int_0^1\frac23-x^2\mathrm{d}x\\&=\left[\frac23x-\frac13x^3\right]^1_0\\&=\frac13.\end{aligned}$$`
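The iterated integral is easy to confirm symbolically (a sketch assuming SymPy):

```python
from sympy import symbols, integrate, Rational

x, y = symbols('x y')
# Integrate y first, then x, over the unit square
val = integrate(1 - x**2 - y**2, (y, 0, 1), (x, 0, 1))
# val == 1/3
```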

**e.g.** Calculate the volume under `$z=1-x^2-y^2$` above the `$xy$` plane.

**Solution** Here the region is actually `$x^2+y^2\leq 1$`, and by symmetry we only need to calculate a quarter of the region and multiply by `$4$`.

`$$\begin{aligned}\iint_{R}1-x^2-y^2\mathrm{d}y\mathrm{d}x&=4\int_0^1\int_0^{\sqrt{1-x^2}} 1-x^2-y^2\mathrm{d}y\mathrm{d}x\\&=4\int_0^1\left[y-x^2y-\frac13y^3\right]^{\sqrt{1-x^2}}_0\mathrm{d}x\\&=4\int_0^1(1-x^2)\sqrt{1-x^2}-\frac13(1-x^2)\sqrt{1-x^2}\mathrm{d}x\\&=4\int_0^1\frac23(1-x^2)\sqrt{1-x^2}\mathrm{d}x\\(x=\sin(\theta))&=\frac83\int_0^{\frac\pi2}\cos^4(\theta)\mathrm{d}\theta\\&=\frac\pi2.\end{aligned}$$`

Previously we converted the volume into slices along the `$x$` direction. Why not the other direction? Actually, it's feasible, and in theory both directions work. The form along the other direction can be written as `$$\begin{aligned}\iint_{R}f(x,y)\mathrm{d}A&=\int_{y_{\min}}^{y_{\max}}T(y)\mathrm{d}y\\&=\int_{y_{\min}}^{y_{\max}}\int_{x_{\min}(y)}^{x_{\max}(y)}f(x,y)\mathrm{d}x\mathrm{d}y.\end{aligned}$$`

**e.g.** Evaluate `$\int_0^1\int_{x}^{\sqrt{x}}\frac{e^y}{y}\mathrm{d}y\mathrm{d}x$`.

**Solution** The region of integration looks like the following. Here the inner variable `$y$` depends on `$x$`; now we swap the order and make `$x$` depend on `$y$`, that is `$$\begin{aligned}\int_0^1\int_{x}^{\sqrt{x}}\frac{e^y}{y}\mathrm{d}y\mathrm{d}x&=\int_0^1\int_{y^2}^{y}\frac{e^y}{y}\mathrm{d}x\mathrm{d}y\\&=\int_0^1\left[\frac{e^y}{y}x\right]^{y}_{y^2}\mathrm{d}y\\&=\int_0^1 e^y(1-y)\mathrm{d}y\\&=\left[2e^y-ye^y\right]^1_0\\&=e-2.\end{aligned}$$`
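The swapped-order integral can be verified symbolically (a sketch assuming SymPy; note that the inner antiderivative in the original order is not elementary, so swapping is what makes the computation go through):

```python
from sympy import symbols, integrate, exp, E, simplify

x, y = symbols('x y')
# Swapped order: for 0 <= y <= 1, x runs from y^2 to y
val = integrate(exp(y)/y, (x, y**2, y), (y, 0, 1))
# val == E - 2
```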

In fact, the same problem sometimes becomes easier in polar coordinates. The idea is to express `$\mathrm{d}A$` in polar coordinates as in the following picture. From the picture, it can be seen that the small piece of region becomes `$$\mathrm{d}A=r\mathrm{d}r\mathrm{d}\theta.$$`

**e.g.** Calculate the volume of `$z=1-x^2-y^2$` above the `$xy$` plane.

**Solution** The expression can be converted to `$z=1-r^2$` and the region becomes `$R:r\leq 1$`, so the integral becomes `$$\begin{aligned}\iint_{R}1-r^2\mathrm{d}A&=\int_0^{2\pi}\int_0^1(1-r^2)r\mathrm{d}r\mathrm{d}\theta\\&=\int_0^{2\pi}\left[\frac{r^2}{2}-\frac{r^4}{4}\right]^1_0\mathrm{d}\theta\\&=\int_0^{2\pi}\frac14\mathrm{d}\theta\\&=\left[\frac14\theta\right]^{2\pi}_0\\&=\frac\pi2.\end{aligned}$$`
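The polar version is equally easy to confirm symbolically (a sketch assuming SymPy), including the extra Jacobian factor `$r$`:

```python
from sympy import symbols, integrate, pi

r, theta = symbols('r theta')
# dA = r dr dtheta, so the integrand picks up a factor of r
val = integrate((1 - r**2) * r, (r, 0, 1), (theta, 0, 2*pi))
# val == pi/2
```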

- Get the area of a region on the plane

  Given a region `$R$` on the plane, we'd like to get its area. It's equivalent to evaluating `$\iint_{R}\mathrm{d}A$`, where `$\mathrm{d}A$` depends on the coordinate system.

  **Variant** Get the total mass of a flat object with density function `$\delta$`: `$\iint_{R}\delta\mathrm{d}A$`.

- Average

  Like the single integral, the average of a function `$f$` on a region `$R$` can be defined as `$$\bar{f}=\frac1{S_R}\iint_{R}f\mathrm{d}A,$$` where `$S_R$` is the area of the region `$R$`.

  **Variant** Weighted average: `$$\tilde{f}=\frac1{M_R}\iint_{R}f\cdot\delta\mathrm{d}A,$$` where `$M_R=\iint_{R}\delta\mathrm{d}A$`.

  **Variant** Center of mass of a flat object (in the Cartesian plane): `$$\bar{x}=\frac1{M_R}\iint_{R}x\cdot\delta\mathrm{d}A,\quad\bar{y}=\frac1{M_R}\iint_{R}y\cdot\delta\mathrm{d}A,$$` where `$M_R=\iint_{R}\delta\mathrm{d}A$`.
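As a small illustration (a sketch assuming SymPy and uniform density `$\delta=1$`), the center-of-mass formula recovers the half-disk centroid distance `$\frac{4}{3\pi}$` used in the Pappus solution above:

```python
from sympy import symbols, integrate, sqrt, pi, simplify

x, y = symbols('x y')
# Half disk x^2 + y^2 <= 1, x >= 0, uniform density delta = 1
M = integrate(integrate(1, (x, 0, sqrt(1 - y**2))), (y, -1, 1))        # mass = pi/2
x_bar = integrate(integrate(x, (x, 0, sqrt(1 - y**2))), (y, -1, 1)) / M
# x_bar == 4/(3*pi)
```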

Like implicit differentiation in single variable calculus, there are times we cannot solve for one variable as a function of the others (especially when the powers are larger than `$2$`).

**e.g.** Given the implicit function `$y^3+xy+5=0$`, what is `$\frac{\mathrm{d}y}{\mathrm{d}x}$`?

**Solution** Applying derivatives on both sides we get `$$3y^2\cdot\frac{\mathrm{d}y}{\mathrm{d}x}+\left(1\cdot y+x\cdot\frac{\mathrm{d}y}{\mathrm{d}x}\right)=0,$$` and the answer is `$$\frac{\mathrm{d}y}{\mathrm{d}x}=-\frac{y}{3y^2+x}.$$`
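The same implicit derivative can be obtained symbolically (a sketch assuming SymPy), differentiating the defining equation and solving for `$y'$`:

```python
from sympy import symbols, Function, diff, solve, simplify

x = symbols('x')
y = Function('y')(x)
eq = y**3 + x*y + 5
# Differentiate the equation with respect to x, then solve for y'(x)
dydx = solve(diff(eq, x), y.diff(x))[0]
# dydx == -y/(3*y**2 + x)
```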

When there are more than two variables in a formula and it's not easy to solve for one variable as a function of the others, is there a similar trick? The answer is to use the total differential.

**e.g.** Given an equation `$x^2+yz+z^3=8$`, find `$\frac{\partial z}{\partial x}$` and `$\frac{\partial z}{\partial y}$` at the point `$(2,3,1)$`.

**Solution** Consider the function `$f(x,y,z)=x^2+yz+z^3$` and its total differential `$$\mathrm{d}f=2x\mathrm{d}x+z\mathrm{d}y+(y+3z^2)\mathrm{d}z;$$` since `$f(x,y,z)=8$` on the surface, `$$\mathrm{d}f=2x\mathrm{d}x+z\mathrm{d}y+(y+3z^2)\mathrm{d}z=0.$$`

Now think about the actual meaning of the partial derivatives `$\frac{\partial z}{\partial x}$` and `$\frac{\partial z}{\partial y}$`: each keeps one variable active and the others constant. So to get `$\frac{\partial z}{\partial x}$`, we can set `$\mathrm{d}y=0$`, since `$y$` is considered constant, and get `$$2x\mathrm{d}x+(y+3z^2)\mathrm{d}z=0.$$` Plugging the point in, we have `$$4\mathrm{d}x+6\mathrm{d}z=0,$$` so `$$\frac{\partial z}{\partial x}=-\frac23,$$` and similarly `$$\frac{\partial z}{\partial y}=-\frac16.$$`

When I carefully reviewed the equations above, I was struck by the meaning of the partial derivative and found a bridge between this method and implicit differentiation.

When computing `$\frac{\partial z}{\partial x}$`, we only care about the variation of `$x$` and `$z$`, so we can treat everything else as constants, and the problem is again solved by implicit differentiation, that is `$$\frac{\mathrm{d}}{\mathrm{d}x}(x^2+yz+z^3)=\frac{\mathrm{d}}{\mathrm{d}x}8,$$` where `$y$` is now a number, not a variable. We get `$$2x+yz'+3z^2z'=0,$$` so `$$z'=-\frac{2x}{3z^2+y},$$` namely `$$\frac{\partial z}{\partial x}=-\frac{2x}{3z^2+y}.$$`
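Both partials from the example can be checked symbolically (a sketch assuming SymPy), using the standard implicit-function formulas `$z_x=-f_x/f_z$` and `$z_y=-f_y/f_z$`:

```python
from sympy import symbols, Rational, diff

x, y, z = symbols('x y z')
f = x**2 + y*z + z**3
# Implicit partials of z on the level set f = 8
z_x = -diff(f, x) / diff(f, z)   # -2x/(y + 3z^2)
z_y = -diff(f, y) / diff(f, z)   # -z/(y + 3z^2)
vals = {x: 2, y: 3, z: 1}
# z_x -> -2/3 and z_y -> -1/6 at (2, 3, 1)
```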

Consider a function `$f(x,y)=x+y$` and set `$x=u,y=u+v$`, so in terms of `$u,v$` the function is `$2u+v$`. Computing, `$\frac{\partial f}{\partial x}=1$` while `$\frac{\partial f}{\partial u}=2$`, yet `$x=u$`. What is the contradiction?

Actually it's a trap of notation: when we write `$\frac{\partial f}{\partial x}$`, we are assuming that `$x$` is active and `$y$` is constant. On the other hand, when we write `$\frac{\partial f}{\partial u}$`, we are assuming that `$u$` is active and `$v$` is constant. The variation `$x=u$` is the same, but what we keep constant differs between the two notations. So here is another notation to clarify what we keep constant: `$\left(\frac{\partial f}{\partial x}\right)_y$` indicates that `$x$` is active and `$y$` is constant.

**e.g.** Given a *right* triangle with area `$A$`, like the following: find `$\frac{\partial A}{\partial \theta}$`.

**Solution** We can write the expression `$$A=\frac12ab\sin(\theta),$$` and the key point here is that we have a constraint: a *right* triangle. So the notation traps show up here.

- Treat `$a,b$` as independent variables

  Then the answer is just the partial derivative which keeps both `$a$` and `$b$` constant: `$\left(\frac{\partial A}{\partial \theta}\right)_{a,b}=\frac12ab\cos(\theta).$`

- Treat `$a,b$` as non-independent

  The constraint here is exactly `$a=b\cos(\theta)$`, and there are two further cases: either `$a$` or `$b$` is held constant.

  - Treat `$a$` constant

    **Method 1 (Total Differential)** Now we compute the differential `$$\begin{aligned}\mathrm{d}A&=\frac{\partial A}{\partial a}\mathrm{d}a+\frac{\partial A}{\partial b}\mathrm{d}b+\frac{\partial A}{\partial \theta}\mathrm{d}\theta\\&=\frac12b\sin(\theta)\mathrm{d}a+\frac12a\sin(\theta)\mathrm{d}b+\frac12ab\cos(\theta)\mathrm{d}\theta,\end{aligned}$$` and another differential derived from the constraint `$$\begin{aligned}\mathrm{d}a&=\frac{\partial a}{\partial b}\mathrm{d}b+\frac{\partial a}{\partial \theta}\mathrm{d}\theta\\&=\cos(\theta)\mathrm{d}b-b\sin(\theta)\mathrm{d}\theta.\end{aligned}$$` Here we're keeping `$a$` constant, so `$\mathrm{d}a=0$`, which gives the differential relationship between `$b$` and `$\theta$`: `$$\cos(\theta)\mathrm{d}b-b\sin(\theta)\mathrm{d}\theta=0,$$` namely `$$\mathrm{d}b=b\tan(\theta)\mathrm{d}\theta.$$` Substituting `$\mathrm{d}b$` and `$\mathrm{d}a=0$` into the previous differential, we get `$$\mathrm{d}A=\frac12ab\sin(\theta)\tan(\theta)\mathrm{d}\theta+\frac12ab\cos(\theta)\mathrm{d}\theta,$$` so `$$\mathrm{d}A=\frac12ab\left(\sin(\theta)\tan(\theta)+\cos(\theta)\right)\mathrm{d}\theta,$$` therefore `$$\left(\frac{\partial A}{\partial \theta}\right)_{a}=\frac12ab\sec(\theta).$$`

    **Method 2 (Chain Rule)** We can write `$$\begin{aligned}\left(\frac{\partial A}{\partial \theta}\right)_{a}&=\frac{\partial A}{\partial \theta}\frac{\partial \theta}{\partial \theta}+\frac{\partial A}{\partial a}\left(\frac{\partial a}{\partial \theta}\right)_{a}+\frac{\partial A}{\partial b}\left(\frac{\partial b}{\partial \theta}\right)_{a}\\&=\frac{\partial A}{\partial \theta}+\frac{\partial A}{\partial b}\left(\frac{\partial b}{\partial \theta}\right)_{a}\\&=\frac12ab\cos(\theta)+\left(\frac12a\sin(\theta)\right)\cdot\left(a\tan(\theta)\sec(\theta)\right)\\&=\frac12ab\cos(\theta)+\frac12ab\sin(\theta)\tan(\theta)\quad(a\sec(\theta)=b)\\&=\frac12ab\sec(\theta).\end{aligned}$$`

  - Treat `$b$` constant

    Omitted.
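A quick symbolic check of the `$a$`-constant case (a sketch assuming SymPy): hold `$a$` fixed, eliminate `$b=a\sec(\theta)$` using the constraint, and differentiate directly.

```python
from sympy import symbols, sin, cos, diff, simplify

a, theta = symbols('a theta', positive=True)
b = a / cos(theta)            # constraint a = b*cos(theta), with a held constant
A = sin(theta) * a * b / 2    # A = (1/2) a b sin(theta)
dA = diff(A, theta)
# dA == (1/2) a b sec(theta), i.e. a**2 / (2*cos(theta)**2)
```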

Phenomena in reality are generally governed by partial differential equations, like the heat equation in 3D space `$$\frac {\partial u}{\partial t}=\alpha \left({\frac {\partial ^{2}u}{\partial x^{2}}}+{\frac {\partial ^{2}u}{\partial y^{2}}}+{\frac {\partial ^{2}u}{\partial z^{2}}}\right).$$`

We've seen how to maximize or minimize a multivariable function, but what happens if there is a constraint? The critical point usually does not satisfy the constraint, so we have to maximize or minimize in another way.

**e.g.** Find the closest point to the origin on `$xy=3$`.

**Idea** We are going to minimize `$f(x,y)=x^2+y^2$` subject to `$xy=3$`. Consider the level curves of `$f(x,y)$` and of `$g(x,y)=xy$` at level `$3$`.

As the level curve `$f(x,y)=c$` shrinks until it barely touches `$xy=3$`, we have almost achieved the goal. We find that at the maximum or minimum value `$f_0$`, the level curve `$f(x,y)=f_0$` is tangent to the level curve `$g(x,y)=3$`.

It means that the gradient of `$f(x,y)$` is parallel to that of `$g(x,y)$`, namely `$\nabla f=\lambda\nabla g$`, where `$\lambda$` is an unknown. So we have the system of equations `$$\left\{\begin{aligned}2x&=\lambda y\\2y&=\lambda x\\xy&=3\end{aligned}\right.,$$` and we get the point `$(\sqrt3,\sqrt3)$` or `$(-\sqrt3,-\sqrt3)$`.
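The Lagrange system can be solved symbolically (a sketch assuming SymPy):

```python
from sympy import symbols, solve, sqrt

x, y, lam = symbols('x y lambda', real=True)
# Lagrange conditions for f = x^2 + y^2 subject to g = x*y = 3
eqs = [2*x - lam*y, 2*y - lam*x, x*y - 3]
sols = solve(eqs, [x, y, lam])
# real solutions: (sqrt(3), sqrt(3), 2) and (-sqrt(3), -sqrt(3), 2)
```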

Now we summarize the method: given a function `$f(x,y)$` and a constraint `$g(x,y)=c$`, to maximize or minimize the function we solve the system of equations `$\left\{\begin{aligned}\frac{\partial f}{\partial x}&=\lambda\frac{\partial g}{\partial x}\\\frac{\partial f}{\partial y}&=\lambda\frac{\partial g}{\partial y}\\g(x,y)&=c\end{aligned}\right..$`

Why is it correct? If there is no constraint `$g(x,y)=c$`, we just solve `$f_x=f_y=0$`, which means that moving in any direction near the point does not change the function (to first order).

With a constraint, we similarly look for a point where moving along the constraint curve does not change the function either. So `$\nabla_{\hat{\mathbf{u}}}f=0$`, where `$\hat{\mathbf{u}}$` is any direction tangent to the constraint curve; in other words, `$\nabla f\cdot\hat{\mathbf{u}}=0$`, so `$\nabla f$` is normal to the constraint curve. Since the gradient `$\nabla g$` is also normal to the constraint curve, we have `$\nabla f\parallel\nabla g$`.

**e.g.** Minimize the surface area of a pyramid with a given triangular base with sides `$a_1,a_2,a_3$` and a given height `$h$`.

**Solution** We can plot the base in the `$xy$` plane. And it looks like the following in 3D space. To locate the apex `$D$` more easily, we project it onto the `$xy$` plane, and take the distances from the projection to the three sides `$a_1,a_2,a_3$` as `$d_1,d_2,d_3$`. Then we can express the lateral surface area `$S$` and the base area `$A$`: `$$\begin{aligned}S&=\frac12a_1\sqrt{d_1^2+h^2}+\frac12a_2\sqrt{d_2^2+h^2}+\frac12a_3\sqrt{d_3^2+h^2}\\A&=\frac12a_1d_1+\frac12a_2d_2+\frac12a_3d_3.\end{aligned}$$` Applying the Lagrange multiplier method with `$A$` fixed, `$$\left\{\begin{aligned}\frac{\partial S}{\partial d_1}&=\lambda\frac{\partial A}{\partial d_1}\\\frac{\partial S}{\partial d_2}&=\lambda\frac{\partial A}{\partial d_2}\\\frac{\partial S}{\partial d_3}&=\lambda\frac{\partial A}{\partial d_3}\end{aligned}\right.,$$` each equation reduces to `$\frac{d_i}{\sqrt{d_i^2+h^2}}=\lambda$`, so `$d_1=d_2=d_3$` and the apex lies directly above the incenter of the triangular base.

The chain rule for a multivariable function `$f(x,y,z)$`, where `$x=x(t),y=y(t),z=z(t)$`, is `$$\frac{\mathrm{d}f}{\mathrm{d}t}=\frac{\partial f}{\partial x}\frac{\mathrm{d}x}{\mathrm{d}t}+\frac{\partial f}{\partial y}\frac{\mathrm{d}y}{\mathrm{d}t}+\frac{\partial f}{\partial z}\frac{\mathrm{d}z}{\mathrm{d}t}.$$` But when the gradient `$\nabla f=\left(\frac{\partial f}{\partial x},\frac{\partial f}{\partial y},\frac{\partial f}{\partial z}\right)$` is introduced, the formula takes another form `$$\frac{\mathrm{d}f}{\mathrm{d}t}=\nabla f\cdot\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}t},$$` where `$\mathbf{r}(t)=(x(t),y(t),z(t))$`.

The gradient `$\nabla f$` is normal to the level surface `$f(x,y,z)=c$`, and hence to the tangent plane of the level surface.

**Proof** Given a function `$f(x,y,z)$`, take *any* curve `$\mathbf{r}(t)=(x(t),y(t),z(t))$` on the level surface `$f(x,y,z)=c$`. According to the chain rule, `$$\frac{\mathrm{d}f}{\mathrm{d}t}=\nabla f\cdot\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}t},$$` and since `$f$` is constant along the level surface, `$$\frac{\mathrm{d}f}{\mathrm{d}t}=0,$$` that is `$$\nabla f\cdot\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}t}=0.$$`

**e.g.** Find the tangent plane to `$x^2+y^2-z^2=4$` at `$(2,1,1)$`.

**Solution 1** Consider the three-variable function `$f(x,y,z)=x^2+y^2-z^2$`; the surface becomes the level surface `$f=4$`. Since the gradient is normal to the tangent plane of the level surface, the normal vector of the tangent plane is `$$\mathbf{n}=\left(\frac{\partial f}{\partial x},\frac{\partial f}{\partial y},\frac{\partial f}{\partial z}\right)_{(2,1,1)}=(4,2,-2),$$` so the plane has an equation of the form `$$4x+2y-2z=k,$$` and plugging the point in, we get the tangent plane `$$4x+2y-2z=8.$$`

**Solution 2** Another point of view uses the total differential near the point, that is `$$\mathrm{d}f=\left.\frac{\partial f}{\partial x}\right|_{(2,1,1)}\mathrm{d}x+\left.\frac{\partial f}{\partial y}\right|_{(2,1,1)}\mathrm{d}y+\left.\frac{\partial f}{\partial z}\right|_{(2,1,1)}\mathrm{d}z.$$` Since we are moving on the level surface, `$\mathrm{d}f$` is actually `$0$`, which gives us `$$\left.\frac{\partial f}{\partial x}\right|_{(2,1,1)}\mathrm{d}x+\left.\frac{\partial f}{\partial y}\right|_{(2,1,1)}\mathrm{d}y+\left.\frac{\partial f}{\partial z}\right|_{(2,1,1)}\mathrm{d}z=0,$$` which means `$$4(x-x_0)+2(y-y_0)-2(z-z_0)=0,$$` namely `$$4(x-2)+2(y-1)-2(z-1)=0.$$`
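The gradient and the resulting tangent plane are easy to verify symbolically (a sketch assuming SymPy):

```python
from sympy import symbols, diff, expand

x, y, z = symbols('x y z')
f = x**2 + y**2 - z**2
p = {x: 2, y: 1, z: 1}
n = [diff(f, v).subs(p) for v in (x, y, z)]   # normal vector (4, 2, -2)
# Tangent plane: n . ((x,y,z) - p) = 0, i.e. 4x + 2y - 2z = 8
plane = sum(ni * (v - p[v]) for ni, v in zip(n, (x, y, z)))
```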

Sometimes we care about the derivative not only along `$\hat{\mathbf{i}}$` and `$\hat{\mathbf{j}}$` but along some other direction `$\hat{\mathbf{u}}$`.

The directional derivative is defined as `$$\nabla|_{\hat{\mathbf{u}}}f=\nabla f\cdot\hat{\mathbf{u}}.$$` It's natural because near a point `$$\frac{\mathrm{d}f}{\mathrm{d}s}=\nabla f\cdot\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}s},$$` where `$s$` is the arc length along the direction `$\hat{\mathbf{u}}$`, so this becomes `$$\frac{\mathrm{d}f}{\mathrm{d}s}=\nabla f\cdot\hat{\mathbf{u}}.$$`

From the definition, we can write the directional derivative in a geometric form `$$\nabla|_{\hat{\mathbf{u}}}f=|\nabla f||\hat{\mathbf{u}}|\cos(\theta),$$` where `$\theta$` is the angle between `$\nabla f$` and `$\hat{\mathbf{u}}$`.

- When `$\theta=0$`, the directional derivative is maximal, so the function increases fastest in the direction of `$\nabla f$`;
- When `$\theta=\pi$`, the directional derivative is minimal, so the function decreases fastest in the direction opposite to `$\nabla f$`;
- When `$\theta=\frac\pi2$`, the directional derivative is `$0$`, so the function does not change and stays on a level surface.

So, the gradient points at the direction where the function has a max rate of change.

When we are considering a multivariable function, is there a way to capture the changes of all components at once?

Well, there's the total differential, defined for `$f(x,y,z)$` as `$$\mathrm{d}f=\frac{\partial f}{\partial x}\mathrm{d}x+\frac{\partial f}{\partial y}\mathrm{d}y+\frac{\partial f}{\partial z}\mathrm{d}z.$$`

**Notice** I've been confusing derivatives and differentials, but now I'm drawing the line between them. In the single variable situation, when we apply the "differential" to some function `$f(x)$`, we actually get another function `$$\mathrm{d}f(x,\Delta x)\overset{\Delta}{=}f'(x)\Delta x.$$` We often write something like `$$\mathrm{d}f=\boxed{}\mathrm{d}x$$` because, according to the definition, `$$\mathrm{d}x(x,\Delta x)=\Delta x.$$`

If we have some multivariable function `$f(x,y,z)$`, where `$x=x(t),y=y(t),z=z(t)$`, we can get `$$\frac{\mathrm{d}f}{\mathrm{d}t}=\frac{\partial f}{\partial x}\frac{\mathrm{d}x}{\mathrm{d}t}+\frac{\partial f}{\partial y}\frac{\mathrm{d}y}{\mathrm{d}t}+\frac{\partial f}{\partial z}\frac{\mathrm{d}z}{\mathrm{d}t}.$$` Treat the product of two functions `$u=u(t),v=v(t)$` as a multivariable function `$f(u,v)=uv$` and apply the chain rule: `$$\begin{aligned}\frac{\mathrm{d}f}{\mathrm{d}t}&=\frac{\partial f}{\partial u}\frac{\mathrm{d}u}{\mathrm{d}t}+\frac{\partial f}{\partial v}\frac{\mathrm{d}v}{\mathrm{d}t}\\&=v\frac{\mathrm{d}u}{\mathrm{d}t}+u\frac{\mathrm{d}v}{\mathrm{d}t}.\end{aligned}$$` The quotient rule can be validated similarly; omitted.

Given a function `$f(x,y)$` where `$x=x(u,v),y=y(u,v)$`, how do we get `$\frac{\partial f}{\partial u}$` and `$\frac{\partial f}{\partial v}$` without plugging `$x=x(u,v)$` and `$y=y(u,v)$` in?

Let's calculate the total differential of `$f$`, that is `$$\begin{aligned}\mathrm{d}f&=\frac{\partial f}{\partial x}\mathrm{d}x+\frac{\partial f}{\partial y}\mathrm{d}y\\&=\frac{\partial f}{\partial x}\left(\frac{\partial x}{\partial u}\mathrm{d}u+\frac{\partial x}{\partial v}\mathrm{d}v\right)+\frac{\partial f}{\partial y}\left(\frac{\partial y}{\partial u}\mathrm{d}u+\frac{\partial y}{\partial v}\mathrm{d}v\right)\\&=\left(\frac{\partial f}{\partial x}\frac{\partial x}{\partial u}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial u}\right)\mathrm{d}u+\left(\frac{\partial f}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial v}\right)\mathrm{d}v.\end{aligned}$$` And notice that `$$\mathrm{d}f=\frac{\partial f}{\partial u}\mathrm{d}u+\frac{\partial f}{\partial v}\mathrm{d}v,$$` therefore we get `$$\left\{\begin{aligned}\frac{\partial f}{\partial u}&=\frac{\partial f}{\partial x}\frac{\partial x}{\partial u}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial u}\\\frac{\partial f}{\partial v}&=\frac{\partial f}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial v}.\end{aligned}\right.$$`
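A small symbolic check of the chain rule above (a sketch assuming SymPy; the choices `$f=x^2y$`, `$x=u+v$`, `$y=uv$` are hypothetical examples, not from the original text):

```python
from sympy import symbols, diff, simplify

u, v = symbols('u v')
x, y = symbols('x y')
x_expr, y_expr = u + v, u*v     # hypothetical substitutions x(u,v), y(u,v)
f = x**2 * y                    # hypothetical f(x,y)

# Chain rule: f_u = f_x * x_u + f_y * y_u, then substitute x, y
f_u_chain = (diff(f, x)*diff(x_expr, u) + diff(f, y)*diff(y_expr, u)).subs({x: x_expr, y: y_expr})
# Direct route: substitute first, then differentiate
f_u_direct = diff(f.subs({x: x_expr, y: y_expr}), u)
```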

From a higher point of view, a single variable function maps a number to a number, while a multivariable function maps an `$n$`-tuple to a number. The essence of a function doesn't change.

Like a single variable function, a multivariable function has its domain: for example, `$$f(x,y)=x^2+y^2$$` is defined everywhere, while `$$f(x,y)=\sqrt{y}$$` is only defined when `$y\geq 0$`.

It's difficult to plot a multivariable function accurately, but the main idea doesn't change.

**e.g.** Plot the graph of `$f(x,y)=-y$`.

Consider the `$yz$` plane: the trace there is just a line through the origin. Now vary `$x$`; since the function doesn't depend on `$x$`, the graph is a plane.

**e.g.** Plot the graph of `$f(x,y)=1-x^2-y^2$`.

Consider the `$yz$` plane, where `$x=0$`: the trace is a parabola with the equation `$z=1-y^2$`. Similarly, the trace in the `$xz$` plane is still a parabola, with the equation `$z=1-x^2$`. But if we consider the trace in the `$xy$` plane, where `$z=0$`, we get the unit circle `$x^2+y^2=1$`.

The traditional plotting process is laborious and not that easy to understand. Taking the last example, we can draw a contour plot like the following, which indicates where the function achieves the same value on the `$xy$` plane. We can feel how the function changes by observing the gaps between the curves, and see how it changes along a given direction.

We care about the rate of change of a multivariable function, but it has several variables. What we do is convert the multivariable function into a single variable function by treating the other variables as constants.

Given a function `$f(x,y)$`, the partial derivative of `$f$` with respect to `$x$` at the point `$(x_0,y_0)$` is `$$\frac{\partial f}{\partial x}(x_0,y_0)=\lim_{\Delta x\to 0}\frac{f(x_0+\Delta x, y_0)-f(x_0, y_0)}{\Delta x},$$` and generally the partial derivative of `$f$` with respect to `$x$` is itself a multivariable function, `$$f_x=\frac{\partial f}{\partial x}=\lim_{\Delta x\to 0}\frac{f(x+\Delta x, y)-f(x, y)}{\Delta x}.$$`

Like the linear approximation of a single variable function, a multivariable function also has an approximation, namely `$$\Delta f(x,y)\approx f_x\Delta x+f_y\Delta y.$$` Why is it correct? Consider the assumption `$$\left\{\begin{aligned}f_x(x_0,y_0)&=a\\f_y(x_0,y_0)&=b\end{aligned}\right.;$$` then we have two tangent lines `$$l_1:\left\{\begin{aligned}z&=z_0+a(x-x_0)\\y&=y_0\end{aligned}\right.,\quad l_2:\left\{\begin{aligned}z&=z_0+b(y-y_0)\\x&=x_0\end{aligned}\right..$$` These two lines determine a tangent plane `$z=z_0+a(x-x_0)+b(y-y_0)$`.

At a local maximum or minimum, we have `$f_x(x_0,y_0)=0$` and `$f_y(x_0,y_0)=0$`, which means the point has a horizontal tangent plane.

A point `$(x_0,y_0)$` is a critical point of `$f$` if `$f_x(x_0,y_0)=0$` and `$f_y(x_0,y_0)=0$`.

**e.g.** Find the critical points of `$f(x,y)=x^2-2xy+3y^2+2x-2y$`.

**Solution** According to the definition, we get `$$\left\{\begin{aligned}f_x&=2x-2y+2=0\\f_y&=-2x+6y-2=0\end{aligned}\right.,$$` and solving gives the critical point `$(-1,0)$`.

But is it a minimum? Notice that `$$f(x,y)=x^2-2xy+3y^2+2x-2y=(x-y+1)^2+2y^2-1\geq -1,$$` and plugging `$(-1,0)$` in gives exactly `$-1$`, so `$f$` attains a local and global minimum at `$(-1,0)$`.
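The critical point and the minimum value can be confirmed symbolically (a sketch assuming SymPy):

```python
from sympy import symbols, diff, solve

x, y = symbols('x y')
f = x**2 - 2*x*y + 3*y**2 + 2*x - 2*y
# Solve f_x = f_y = 0 for the critical point
crit = solve([diff(f, x), diff(f, y)], [x, y])   # {x: -1, y: 0}
f_min = f.subs(crit)                             # -1
```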

Focus on the condition `$f_x(x_0,y_0)=0$` and `$f_y(x_0,y_0)=0$`: it is necessary but not sufficient for a maximum or minimum, because there are examples with `$f_x(x_0,y_0)=0$` and `$f_y(x_0,y_0)=0$` where the function has no local minimum or maximum at `$(x_0,y_0)$`.

For example, take the function `$f(x,y)=x^2-y^2$`. It's easy to check that `$f_x(0,0)=0,f_y(0,0)=0$`; nevertheless, the function achieves neither a maximum nor a minimum at the point. Such points are called saddle points.

When we are given a lot of discrete data (as often seen in scientific experiments), we want to find a line that approximates the data, enabling us to find relations between variables and predict their values.

Considering the data set `$(x_1,y_1),(x_2,y_2),\cdots,(x_n,y_n)$` and the target line `$y=ax+b$`, how do we optimize the line so that it fits the discrete data best? Actually, it's a minimization problem.

By convention, the indicator we use here is the squared offset, `$(ax_i+b-y_i)^2$` for each data point `$(x_i,y_i)$`, so the function we'd like to minimize is `$$D(a,b)=\sum_1^n (ax_i+b-y_i)^2.$$`

We take the partial derivatives with respect to `$a,b$`: `$$\frac{\partial D}{\partial a}=\sum_1^n [2(ax_i+b-y_i)x_i],\quad\frac{\partial D}{\partial b}=\sum_1^n [2(ax_i+b-y_i)],$$` and to get the critical point, we solve the system `$$\left\{\begin{aligned}&\frac{\partial D}{\partial a}=0\\&\frac{\partial D}{\partial b}=0\end{aligned}\right.,$$` namely `$$\left\{\begin{aligned}&\left(\sum_1^nx_i^2\right)a+\left(\sum_1^nx_i\right)b=\sum_1^nx_iy_i\\&\left(\sum_1^nx_i\right)a+nb=\sum_1^ny_i\end{aligned}\right..$$`
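The normal equations can be solved directly on a small made-up data set (a sketch assuming SymPy; the data points are hypothetical):

```python
from sympy import symbols, solve

a, b = symbols('a b')
data = [(0, 1), (1, 3), (2, 5)]   # hypothetical points lying on y = 2x + 1
n = len(data)

Sx  = sum(x for x, y in data)
Sxx = sum(x*x for x, y in data)
Sy  = sum(y for x, y in data)
Sxy = sum(x*y for x, y in data)

# Normal equations obtained from dD/da = dD/db = 0
fit = solve([Sxx*a + Sx*b - Sxy, Sx*a + n*b - Sy], [a, b])
# fit == {a: 2, b: 1}
```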

Consider the behavior at the origin of the function `$$f(x,y)=ax^2+bxy+cy^2,$$` and try completing the square (assuming `$a\neq 0$`) to judge whether the function achieves a maximum or minimum at the origin, namely `$$\begin{aligned}f(x,y)&=a\left(x^2+\frac{b}{a}xy\right)+cy^2\\&=a\left(x+\frac{b}{2a}y\right)^2-\frac{b^2}{4a}y^2+cy^2\\&=a\left(x+\frac{b}{2a}y\right)^2+\frac{4ac-b^2}{4a}y^2\\&=\frac1{4a}\left[4a^2\left(x+\frac{b}{2a}y\right)^2+(4ac-b^2)y^2\right].\end{aligned}$$`

It's obvious that the origin is a critical point because `$$\frac{\partial f}{\partial x}=2ax+by,\quad\frac{\partial f}{\partial y}=bx+2cy,$$` and plugging the origin in, both are `$0$`. The question is: can `$f(0,0)$` be a true local maximum or minimum?

Observe the completed square: the signs in front of the two squared terms are interesting. `$4a^2$` is always positive, while the sign of `$4ac-b^2$` is not determined.

If `$4ac-b^2>0$`, then the bracket is a sum of two non-negative terms, so it is non-negative. Since `$f(0,0)=0$`, the origin is a local maximum or minimum (depending on the sign of `$\frac1{4a}$`, i.e. of `$a$`).

If `$4ac-b^2=0$`, the bracket degenerates to a single squared term, and `$f$` vanishes along the whole line `$x=-\frac{b}{2a}y$`, so the extreme value is attained at every point of that line rather than strictly at the origin; for general functions this case is inconclusive.

If `$4ac-b^2<0$`, one term in the bracket is non-negative and the other non-positive, so `$f$` takes both positive and negative values near the origin. So `$f(0,0)$` is neither a maximum nor a minimum; it's a saddle point.

We find that the quadratic discriminant `$b^2-4ac$` occurs in the analysis above. Is that a coincidence? With the same example, notice that each term is quadratic, so `$$f(x,y)=y^2\left(a\left(\frac{x}{y}\right)^2+b\left(\frac{x}{y}\right)+c\right),$$` and near the origin `$\frac{x}{y}$` can be any number.

If the equation `$at^2+bt+c=0$` has two roots, namely `$b^2-4ac>0$`, the function takes values on both sides of `$0$`, and it stays `$0$` along some directions (`$\frac{x}{y}$` indicates the direction of approach to the origin).

If the equation `$at^2+bt+c=0$` has only one root, namely `$b^2-4ac=0$`, nothing can actually be concluded; it only indicates that along some direction the function keeps its value at the origin.

If the equation `$at^2+bt+c=0$` has no roots, namely `$b^2-4ac<0$`, then `$f(0,0)=0$` is exactly a local maximum or minimum (depending on the sign of `$a$`), because near the origin the function value can't be zero.

According to the multivariable quadratic Taylor formula `$$\Delta f\approx f_x(x-x_0)+f_y(y-y_0)+\frac12f_{xx}(x-x_0)^2+f_{xy}(x-x_0)(y-y_0)+\frac12f_{yy}(y-y_0)^2,$$` we can state a general test for other functions.

To test a critical point `$(x_0,y_0)$` of `$f$`, let `$$A=f_{xx}(x_0,y_0),\quad B=f_{xy}(x_0,y_0),\quad C=f_{yy}(x_0,y_0);$$`

- if `$AC-B^2>0$`:
  - if `$A>0$`, we get a local minimum at `$(x_0,y_0)$`;
  - if `$A<0$`, we get a local maximum at `$(x_0,y_0)$`;
- if `$AC-B^2<0$`, then it's a saddle point;
- if `$AC-B^2=0$`, no conclusion.
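The test is easy to run symbolically (a sketch assuming SymPy), here on the saddle example `$f=x^2-y^2$` from earlier:

```python
from sympy import symbols, diff

x, y = symbols('x y')
f = x**2 - y**2
p = {x: 0, y: 0}

A = diff(f, x, 2).subs(p)    # f_xx = 2
B = diff(f, x, y).subs(p)    # f_xy = 0
C = diff(f, y, 2).subs(p)    # f_yy = -2
# AC - B^2 = -4 < 0, so (0, 0) is a saddle point
```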

Recall the cycloid: we used a parametric equation to describe a moving point `$P(x(t),y(t),z(t))$`. The vector `$\mathbf{r}(t)=\mathbf{OP}=(x(t),y(t),z(t))$` is called the position vector, and we can learn more by analysing this vector.

Take the cycloid with `$t=\theta$` as an example, that is `$$\mathbf{r}(t)=(t-\sin(t),1-\cos(t)).$$` Though it has no third component, this loses no generality.

We care about not only the rate at some point but also the direction, in multivariable calculus, a vector can be diffentiated. To get the velocity, we just differentiate the position vector `$$\mathbf{v}=\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}t}.$$`

In this cycloid case, `$$\mathbf{v}=\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}t}=\left(\frac{\mathrm{d}x}{\mathrm{d}t},\frac{\mathrm{d}y}{\mathrm{d}t}\right)=(1-\cos(t),\sin(t)).$$`

In some applications, we only care about the rate. The magnitude of velocity becomes speed, in this example that is `$$|\mathbf{v}|=\sqrt{(1-\cos(t))^2+\sin^2(t)}=\sqrt{2-2\cos(t)}.$$`

Imagine you are driving and take a tight turn without changing speed: from the viewpoint of single-variable calculus, you have no acceleration. In multivariable calculus, however, your change of direction is also taken into consideration. Defined similarly to velocity, we have the acceleration `$$\mathbf{a}=\frac{\mathrm{d}\mathbf{v}}{\mathrm{d}t}.$$`

In this cycloid case, `$$\mathbf{a}=\frac{\mathrm{d}\mathbf{v}}{\mathrm{d}t}=(\sin(t),\cos(t)).$$`

If we add velocity continuously, we can get a vector from the start point to the end point. How can we get the arc length during the process? A good idea is to add speed continuously, because speed doesn't have direction. So we have `$$\frac{\mathrm{d}s}{\mathrm{d}t}=|\mathbf{v}|.$$`

**e.g.** The length of an arch of the cycloid is `$\int_0^{2\pi}\sqrt{2-2\cos(t)}\mathrm{d}t=8$`, since `$\sqrt{2-2\cos(t)}=2\left|\sin\left(\frac{t}{2}\right)\right|$`.
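A quick numerical sanity check of this arc-length integral (plain Python, midpoint rule):

```python
import math

# Arc length of one arch of the cycloid: integrate the speed
# |v| = sqrt(2 - 2 cos t) over [0, 2*pi] with the midpoint rule.
def speed(t):
    return math.sqrt(2 - 2 * math.cos(t))

n = 100_000
h = 2 * math.pi / n
length = sum(speed((k + 0.5) * h) for k in range(n)) * h
print(length)  # ≈ 8, matching the closed form since sqrt(2 - 2 cos t) = 2|sin(t/2)|
```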

The unit tangent vector of the trajectory is defined as `$\hat{\mathbf{T}}=\frac{\mathbf{v}}{|\mathbf{v}|}$`

.

And we notice that `$$\mathbf{v}=\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}t}=\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}s}\frac{\mathrm{d}s}{\mathrm{d}t},$$`

where `$\frac{\mathrm{d}s}{\mathrm{d}t}$`

is actually `$|\mathbf{v}|$`

, so we have `$$\hat{\mathbf{T}}=\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}s}.$$`

[TO BE CONTINUED]

We know that a line can be treated as the intersection of two planes, so it's fine to describe a line by two equations, but that is not very natural.

Here's another way, a line can be treated as the trajectory of a moving point.

**e.g.** Line through `$Q_0(-1,2,2),Q_1(1,3,-1)$`

**Solution** Select a point `$Q(x,y,z)$`

on the line, and we should have `$\mathbf{Q_0Q}=t\mathbf{Q_0Q_1}=t(2,1,-3)$`

, for some `$t$`

. Let's consider `$t$`

as a parameter and expand the formula that is `$$\begin{bmatrix}x+1\\y-2\\z-2\end{bmatrix}=t\begin{bmatrix}2\\1\\-3\end{bmatrix}.$$`

We can write `$x,y,z$`

as functions with respect to `$t$`

as the following `$$\left\{\begin{aligned}x(t)&=&-1&+2t\\y(t)&=&2&+t\\z(t)&=&2&-3t\end{aligned}\right..$$`
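The same construction works for any two points; a minimal sketch (the helper name `line_through` is my own):

```python
# Parametric line through two points: r(t) = Q0 + t * (Q1 - Q0),
# so t = 0 gives Q0 and t = 1 gives Q1.
def line_through(q0, q1):
    d = [b - a for a, b in zip(q0, q1)]  # direction vector Q0Q1
    return lambda t: [a + t * di for a, di in zip(q0, d)]

L = line_through((-1, 2, 2), (1, 3, -1))
print(L(0))  # [-1, 2, 2], the point Q0
print(L(1))  # [1, 3, -1], the point Q1
```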

**e.g.** Where are `$Q_0(-1,2,2)$`

and `$Q_1(1,3,-1)$`

located relative to the plane `$x+2y+4z=7$`

?

Same side / Opposite sides / One is on the plane.

**Solution** The plane `$x+2y+4z=7$`

is actually dividing the whole space into two regions where one fulfills `$x+2y+4z>7$`

and the other `$x+2y+4z<7$`

. We just plug these two points into the plane equation and compare the result with `$7$`.

For `$Q_0$`

, `$-1+2\times 2+4\times 2=11>7$`

; and for `$Q_1$`

, `$1+2\times 3+4\times(-1)=3<7$`

, so neither of them is on the plane and they are on opposite sides.

To get the intersection, we just plug the parametric equation of the line into the plane equation that is `$$x(t)+2y(t)+4z(t)=7,$$`

after solving, we can get `$t=\frac12$`

, indicating that the intersection is `$(0,\frac52,\frac12)$`

.
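We can verify the substitution numerically (a small check using the parametrization above):

```python
# Plug x = -1 + 2t, y = 2 + t, z = 2 - 3t into x + 2y + 4z = 7:
# the left side simplifies to 11 - 8t, so t = 1/2.
def point(t):
    return (-1 + 2 * t, 2 + t, 2 - 3 * t)

x, y, z = point(1 / 2)
print((x, y, z))          # (0.0, 2.5, 0.5), i.e. the intersection point
print(x + 2 * y + 4 * z)  # 7.0, confirming it lies on the plane
```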

The whole process is just a trick in geometry. Recall that the parametric equation for a cycloid with radius `$a$`

is

`$$ \left\{ \begin{aligned} x(\theta)&=a(\theta-\sin(\theta))\\ y(\theta)&=a(1-\cos(\theta)) \end{aligned} \right.. $$`

Let's observe the cusps, i.e. the behavior near `$\theta=0,2\pi,\cdots$`

. Without loss of generality, take `$\theta=0$`

as an example.

According to Taylor's theorem, `$\sin(\theta)\sim\theta-\frac{\theta^3}{3!}$`

and `$\cos\theta\sim1-\frac{\theta^2}{2!}$`

, consider the slope near `$\theta=0$`

that is `$\frac{y(\theta)}{x(\theta)}=\frac{1-\cos(\theta)}{\theta-\sin(\theta)}\sim\frac{\frac{\theta^2}{2!}}{\frac{\theta^3}{3!}}=\frac{3}{\theta}\to\infty$`

, indicating that the behavior near `$\theta=0$`

is a cusp with a vertical tangent, which is easy to picture when we imagine a wheel rolling on the ground.
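A quick numerical check of this asymptotic estimate (plain Python; the ratio of the true slope to `$\frac{3}{\theta}$` should tend to `$1$`):

```python
import math

# Compare the slope (1 - cos t) / (t - sin t) with the Taylor estimate 3 / t.
for theta in (0.1, 0.01, 0.001):
    slope = (1 - math.cos(theta)) / (theta - math.sin(theta))
    print(theta, slope * theta / 3)  # the second column approaches 1
```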

The equation of a plane should be in the form of `$ax+by+cz=d$`

.

**e.g.** Plane through origin with normal vector `$\mathbf{N}=(1,5,10)$`

.

**Solution** Select a point `$P(x,y,z)$`

on the plane and we have `$\mathbf{OP}\cdot\mathbf{N}=0$`

, that is `$x+5y+10z=0$`

.

**e.g.** Plane through `$P_0(2,1,-1)$`

with normal vector `$\mathbf{N}=(1,5,10)$`

.

**Solution** It's slightly different, but what we will do is almost the same. Select a point `$P(x,y,z)$`

on the plane and we have `$\mathbf{P_0P}\cdot\mathbf{N}=0$`

, that is `$(x-2)+5(y-1)+10(z+1)=0$`

, simplified as `$x+5y+10z=-3$`

.

Observe these two examples: the coefficients before `$x,y,z$`

are exactly the components of the normal vector `$\mathbf{N}$`

, and the only difference is the constant. The constant can be understood as an offset from the origin, since there is no offset when the constant is `$0$`

, namely through the origin.

Knowing this fact, we can solve the second example in an easier way. Since the equation in the second example has the form `$x+5y+10z=d$`

, we just plug `$P_0(2,1,-1)$`

into it and solve for `$d$`

, and finally we get the whole equation.
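This shortcut is one line of arithmetic; a tiny sketch (the helper name `plane_constant` is my own):

```python
# For a plane with normal N = (a, b, c), the equation is ax + by + cz = d,
# and plugging in a known point P0 gives d = N · P0.
def plane_constant(N, P0):
    return sum(n * p for n, p in zip(N, P0))

d = plane_constant((1, 5, 10), (2, 1, -1))
print(d)  # -3, so the plane is x + 5y + 10z = -3
```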

If we know a vector and the equation of a plane, we can judge their relationship by the above trick.

**e.g.** Given a vector `$\mathbf{v}=(1,2,-1)$`

and a plane `$x+y+3z=5$`

, what's their relationship? Parallel / Perpendicular / Neither.

**Solution** The normal vector of the plane is `$\mathbf{N}=(1,1,3)$`

, and we find that `$\mathbf{N}\cdot\mathbf{v}=0$`

. Caution! It DOES NOT mean `$\mathbf{v}$`

is perpendicular to the plane; instead, what we checked is `$\mathbf{v}$`

against the *normal* vector `$\mathbf{N}$`

of the plane. Therefore, it actually means that the vector `$\mathbf{v}$`

is parallel to the plane.
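The check generalizes to a small classifier (a sketch for exact integer vectors; `relation` is a hypothetical helper):

```python
# Compare a vector v with the plane's normal N:
# N · v = 0  -> v is parallel to the plane;
# N × v = 0  -> v is parallel to N, i.e. perpendicular to the plane;
# otherwise  -> neither.
def relation(N, v):
    dot = sum(a * b for a, b in zip(N, v))
    if dot == 0:
        return "parallel to the plane"
    cross = (N[1] * v[2] - N[2] * v[1],
             N[2] * v[0] - N[0] * v[2],
             N[0] * v[1] - N[1] * v[0])
    return "perpendicular to the plane" if cross == (0, 0, 0) else "neither"

print(relation((1, 1, 3), (1, 2, -1)))  # parallel to the plane, as in the example
```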

Think about a `$3\times 3$`

linear system, for example, `$$\left\{\begin{alignedat}{4} x & {}+{} & & {}{} & z & {}={} & 1 \\ x & {}+{} & y & {} {} & & {}={} & 2 \\ x & {}+{} & 2y & {}+{} & 3z & {}={} & 3\end{alignedat}\right..$$`

Each of these three equations describes a plane, so what is the meaning of a solution?

In geometry, we find exactly one solution when the first two planes intersect in a line and that line intersects the third plane at a single point. To solve the system, a good way is to use matrices, i.e. `$\mathbf{A}\mathbf{X}=\mathbf{B}\Leftrightarrow\mathbf{X}=\mathbf{A}^{-1}\mathbf{B}$`

.
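As a sketch of this approach on the example system (using Cramer's rule rather than explicitly forming `$\mathbf{A}^{-1}$`, to keep the code short):

```python
# Solve the 3×3 system A X = B by Cramer's rule (a minimal sketch).
def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer(A, B):
    D = det3(A)
    if D == 0:
        return None  # no unique solution: either none or infinitely many
    sol = []
    for j in range(3):
        Aj = [row[:] for row in A]
        for i in range(3):
            Aj[i][j] = B[i]  # replace column j with B
        sol.append(det3(Aj) / D)
    return sol

A = [[1, 0, 1], [1, 1, 0], [1, 2, 3]]  # the example system above
B = [1, 2, 3]
print(cramer(A, B))  # [1.0, 1.0, 0.0], i.e. x = 1, y = 1, z = 0
```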

However, there exist some degenerate situations where the method doesn't work. In general, the solution set of the system has four possibilities:

- No solutions
- One point (unique solution)
- A line (infinite solutions)
- A plane (infinite solutions)

If two of the planes are parallel (and distinct), then there are no solutions at all.

If the three planes are all the same, then there are infinitely many solutions, namely a plane.

If two planes intersect in a line, and the third plane is parallel to the line (not containing it), then there are no solutions.

If two planes intersect in a line, and the third plane contains the line, then there are infinitely many solutions, namely a line.

Where does the above method go wrong in algebra? The formula `$\mathbf{A}^{-1}=\frac{\operatorname{adj}(\mathbf{A})}{\det(\mathbf{A})}$`

is definitely correct, but what will happen if `$\det(\mathbf{A})=0$`

? We say that those matrices whose determinant `$\det(\mathbf{A})\neq 0$`

are invertible and have an inverse; the others are not invertible and have no inverse.

A homogeneous system has the form `$\mathbf{A}\mathbf{X}=0$`

, and we can find that no matter what `$\mathbf{A}$`

is, obviously `$\mathbf{X}=0$`

is always a solution, which is called a trivial solution.

If `$\det(\mathbf{A})\neq 0$`

, `$\mathbf{X}=0$`

is the unique solution.

If `$\det(\mathbf{A})=0$`

, then `$\det(\mathbf{N_1},\mathbf{N_2},\mathbf{N_3})=0$`

, which means the three normal vectors are coplanar. Then there exist nontrivial solutions such as `$\mathbf{X}=\mathbf{N_1}\times\mathbf{N_2}$`

(provided this cross product is nonzero).
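A concrete check with coplanar normals (my own illustrative choice, taking `$\mathbf{N_3}=\mathbf{N_1}+\mathbf{N_2}$` so that `$\det(\mathbf{A})=0$`):

```python
# When the normals are coplanar, X = N1 × N2 solves A X = 0 nontrivially,
# since X is orthogonal to N1, N2, and hence to any vector in their span.
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

N1, N2, N3 = (1, 0, 1), (0, 1, 1), (1, 1, 2)  # N3 = N1 + N2, so coplanar
X = cross(N1, N2)
print(X)                                   # (-1, -1, 1)
print(dot(N1, X), dot(N2, X), dot(N3, X))  # 0 0 0, so X solves all three equations
```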

For a system `$\mathbf{A}\mathbf{X}=\mathbf{B}$`:

- if `$\det(\mathbf{A})\neq 0$`, then it has a unique solution
- if `$\det(\mathbf{A})=0$`, then it has either no solutions or infinitely many solutions.

More details emerge when we try to solve the system by elimination and substitution: if we know `$\det(\mathbf{A})=0$`

and find a solution, then the system must have infinitely many solutions; and if we reach a contradiction, then the system has no solutions.