Density matrix, 4

Consider a system in a mixed state: with probability 0.3 it is in a pure state |\psi_1 \rangle, and with probability 0.7 it is in another pure state |\psi_2 \rangle. Then the density matrix \rho is

0.3 | \psi_1 \rangle \langle \psi_1 | + 0.7 | \psi_2 \rangle \langle \psi_2 |

In the most general case, neither |\psi_1 \rangle nor |\psi_2 \rangle is an eigenstate, so we cannot expect \rho to be diagonal.

For example, if each of the pure states |\psi_1 \rangle and |\psi_2 \rangle is a superposition of two eigenstates (|\phi_1\rangle, |\phi_2\rangle), then

| \psi_1 \rangle = \frac{1}{\sqrt 2} |\phi_1 \rangle + \frac{1}{\sqrt 2} |\phi_2 \rangle

| \psi_2 \rangle = \frac{1}{\sqrt 3} |\phi_1 \rangle + \sqrt{\frac{2}{3}} |\phi_2 \rangle

and

\rho

= 0.3 | \psi_1 \rangle \langle \psi_1 | + 0.7 | \psi_2 \rangle \langle \psi_2 |

= 0.3 \left( \frac{1}{\sqrt 2} |\phi_1 \rangle + \frac{1}{\sqrt 2} |\phi_2 \rangle \right) \left( \frac{1}{\sqrt 2} \langle \phi_1 | + \frac{1}{\sqrt 2} \langle \phi_2 | \right)

+ 0.7 \left( \frac{1}{\sqrt 3} |\phi_1 \rangle + \sqrt{\frac{2}{3}} |\phi_2 \rangle \right) \left( \frac{1}{\sqrt 3} \langle \phi_1 | + \sqrt{\frac{2}{3}} \langle \phi_2 | \right)

.

For simplicity, assume that the eigenstates \{ |\phi_1\rangle, |\phi_2\rangle \} form a complete orthonormal set.

If we use \{ | \phi_1 \rangle, |\phi_2 \rangle \} as basis,

\rho

= 0.3 \begin{bmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix} + 0.7 \begin{bmatrix} \frac{1}{\sqrt 3} \\ \sqrt{\frac{2}{3}} \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt 3} & \sqrt{\frac{2}{3}} \end{bmatrix}

= \frac{0.3}{2} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} + \frac{0.7}{3} \begin{bmatrix} 1 & \sqrt{2} \\ \sqrt{2} & 2 \end{bmatrix}

=\cdots

— Me@2018.03.12 11:51 AM
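
As a numerical check of the matrix form above, here is a minimal sketch in plain Python. The helper names `outer` and `mix` are hypothetical, and real amplitudes are assumed, as in the example. It builds \rho in the \{ |\phi_1\rangle, |\phi_2\rangle \} basis and shows that the off-diagonal entries are non-zero while the trace is 1.

```python
import math

def outer(ket):
    """Return |ket><ket| as a nested list (real amplitudes assumed)."""
    return [[a * b for b in ket] for a in ket]

def mix(weighted_states):
    """Sum p * |psi><psi| over (p, psi) pairs."""
    dim = len(weighted_states[0][1])
    rho = [[0.0] * dim for _ in range(dim)]
    for p, psi in weighted_states:
        proj = outer(psi)
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += p * proj[i][j]
    return rho

# Components of |psi_1> and |psi_2> in the {|phi_1>, |phi_2>} basis.
psi_1 = [1 / math.sqrt(2), 1 / math.sqrt(2)]
psi_2 = [1 / math.sqrt(3), math.sqrt(2 / 3)]

rho = mix([(0.3, psi_1), (0.7, psi_2)])
trace = rho[0][0] + rho[1][1]

for row in rho:
    print(row)
print("trace =", trace)
```

The diagonal entries come out near 0.383 and 0.617, and both off-diagonal entries near 0.480, so \rho is indeed not diagonal in this basis.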

.

.

2018.03.14 Wednesday (c) All rights reserved by ACHK

Mixed states

To me the claim that mixed states are states of knowledge while pure states are not is a little puzzling because of the fact that it is not possible to uniquely recover what aspects of the mixed state are subjective and what aspects are objective.

The simple case is this:

Let’s work with a spin-1/2 particle, so there are states:

|0 \rangle
|1 \rangle
|+ \rangle = \frac{1}{\sqrt{2}} \left( |0 \rangle + |1 \rangle \right)
|- \rangle = \frac{1}{\sqrt{2}} \left( |0 \rangle - |1 \rangle \right)

The mixed state corresponding to 50% |0> + 50% |1> is the SAME as the mixed state corresponding to 50% |+> + 50% |->.

— Daryl McCullough

— Comment #13 November 19th, 2011 at 2:00 pm

— The quantum state cannot be interpreted as something other than a quantum state

.

\frac{1}{2}_c | + \rangle \langle + | + \frac{1}{2}_c | - \rangle \langle - |

=\frac{1}{2}_c \left( \frac{1}{\sqrt{2}}_q | 0 \rangle + \frac{1}{\sqrt{2}}_q | 1 \rangle \right) \left( \frac{1}{\sqrt{2}}_q \langle 0 | + \frac{1}{\sqrt{2}}_q \langle 1 | \right)+ \frac{1}{2}_c \left( \frac{1}{\sqrt{2}}_q | 0 \rangle - \frac{1}{\sqrt{2}}_q | 1 \rangle \right) \left( \frac{1}{\sqrt{2}}_q \langle 0 | - \frac{1}{\sqrt{2}}_q \langle 1 | \right)

=\frac{1}{2}_c \frac{1}{\sqrt{2}}_q \frac{1}{\sqrt{2}}_q \left( | 0 \rangle + | 1 \rangle \right) \left( \langle 0 | + \langle 1 | \right)+ \frac{1}{2}_c \frac{1}{\sqrt{2}}_q \frac{1}{\sqrt{2}}_q \left( | 0 \rangle - | 1 \rangle \right) \left( \langle 0 | - \langle 1 | \right)

=\frac{1}{2}_c \frac{1}{2}_q \left( | 0 \rangle + | 1 \rangle \right) \left( \langle 0 | + \langle 1 | \right) + \frac{1}{2}_c \frac{1}{2}_q \left( | 0 \rangle - | 1 \rangle \right) \left( \langle 0 | - \langle 1 | \right)

=\frac{1}{2}_c \frac{1}{2}_q \left( | 0 \rangle \langle 0 | + | 1 \rangle \langle 1 | + | 0 \rangle \langle 0 | + | 1 \rangle \langle 1 | \right)

=\frac{1}{2}_c \frac{1}{2}_q \left( 2_c | 0 \rangle \langle 0 | + 2_c | 1 \rangle \langle 1 | \right)

= \frac{1}{2}_q | 0 \rangle \langle 0 | + \frac{1}{2}_q | 1 \rangle \langle 1 |

— Me@2018-03-11 03:14:57 PM
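
The algebra above can be checked numerically. The following is a small sketch in plain Python; the helper names `projector` and `mixture` are made up for this illustration. It builds both 50/50 mixtures as 2×2 matrices in the \{ |0\rangle, |1\rangle \} basis and compares them entry by entry.

```python
import math

def projector(ket):
    """Return |ket><ket| for a real two-component ket."""
    return [[a * b for b in ket] for a in ket]

def mixture(p_a, ket_a, p_b, ket_b):
    """Return p_a |a><a| + p_b |b><b| as a 2x2 nested list."""
    pa, pb = projector(ket_a), projector(ket_b)
    return [[p_a * pa[i][j] + p_b * pb[i][j] for j in range(2)]
            for i in range(2)]

s = 1 / math.sqrt(2)
ket_0, ket_1 = [1.0, 0.0], [0.0, 1.0]
ket_plus, ket_minus = [s, s], [s, -s]

rho_z = mixture(0.5, ket_0, 0.5, ket_1)         # 50% |0>, 50% |1>
rho_x = mixture(0.5, ket_plus, 0.5, ket_minus)  # 50% |+>, 50% |->

same = all(abs(rho_z[i][j] - rho_x[i][j]) < 1e-12
           for i in range(2) for j in range(2))
print(same)
```

Both mixtures give half the identity matrix, so no measurement on the system alone can tell the two preparations apart.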

.

How come the classical probabilities \frac{1}{2}_c of a density matrix in one representation can become quantum probabilities \frac{1}{2}_q in another?

\frac{1}{2}_c | + \rangle \langle + | + \frac{1}{2}_c | - \rangle \langle - | = \frac{1}{2}_q | 0 \rangle \langle 0 | + \frac{1}{2}_q | 1 \rangle \langle 1 |

1. Physically, whether we label the coefficients as “classical probabilities” or “quantum probabilities” gives no real consequences. The conflict lies only in the interpretations.

2. The interpretation conflict might be resolved by realizing that probabilities, especially classical probabilities, are meaningful only with respect to an observer.

For example,

\frac{1}{2}_c | + \rangle \langle + | + \frac{1}{2}_c | - \rangle \langle - | = \frac{1}{2}_q | 0 \rangle \langle 0 | + \frac{1}{2}_q | 1 \rangle \langle 1 |

represents the fact that the observer knows that the system is either in state |+\rangle \langle+| or |-\rangle \langle-|, but not |0 \rangle \langle 0| nor |1 \rangle \langle 1|.

However,

\frac{1}{2}_c | 0 \rangle \langle 0 | + \frac{1}{2}_c | 1 \rangle \langle 1 |

represents the fact that the observer knows that the system is either in state |0 \rangle \langle 0| or |1 \rangle \langle 1|, but not |+\rangle \langle+| nor |-\rangle \langle-|.

— Me@2018-03-13 08:10:46 PM

.

.

2018.03.14 Wednesday (c) All rights reserved by ACHK

Chance-rebirth theory (機遇再生論) 1.6

.

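So “sympathetic understanding” can also be called “panning ideas for gold”.

The chance-rebirth theory can be sympathetically understood as meaning the following:

(This meaning is also used as supporting grounds in the original text of the chance-rebirth theory.)

Suppose you are holding a deck of playing cards in some arrangement A. After one shuffle, the chance that the arrangement is still A is minuscule.

A complete deck of playing cards has N = 52! \approx 8.07 \times 10^{67} possible arrangements. In other words, the probability that the deck is still in arrangement A after a shuffle is only \frac{1}{N}.

Because the denominator N is so large (roughly an 8 followed by 67 more digits), the deck should, after a shuffle, end up in some other arrangement B.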

P(A) = \frac{1}{N}

P(\text{not} A) = 1 - \frac{1}{N}

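After one shuffle, having found that the arrangement is B and not A, we can ask again: if we shuffle once more, what are the probabilities of “A” and of “not A”?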

.

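Probability concerns only unknown events; put differently, an event that is already known has probability 1. So if the first shuffle has taken place and you know its result, then when you ask “if we shuffle once more, what are the probabilities of ‘A’ and of ‘not A’?”, the probabilities of the possible outcomes of the second shuffle do not depend on the result of the first.

The probability that the second shuffle gives arrangement A is still

P(A) = \frac{1}{N}

and the probability that it does not give arrangement A is still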

P(\text{not} A) = 1 - \frac{1}{N}

.

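(Question: Then why ask a second time?)

I want to make clear that the question I really mean to ask is not this one, but another:

Before the first shuffle, that is, before the deck has been shuffled even once, ask:

“If the deck is shuffled twice, what is the probability that at least one shuffle gives the original arrangement A?”

Denote that event by A_2:

A_2 = at least one of the two shuffles gives the original arrangement A

and denote the probability of that event by P(A_2).

Since P(A_2) is relatively troublesome to compute directly, we can first compute the probability of its complementary event.

The complementary event of A_2 is “not A_2”:

not A_2

= it is not the case that at least one of the two shuffles gives the original arrangement A

= neither of the two shuffles gives arrangement A

Its probability is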

P(\text{not} A_2) = (1 - \frac{1}{N})^2

So we can deduce that

P(A_2)
= 1 - P(\text{not} A_2)
= 1 - (1 - \frac{1}{N})^2

.

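Similarly, before the deck has been shuffled at all, ask:

If the deck is shuffled m times, what is the probability that at least one shuffle gives the original arrangement A?

The answer will be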

P(A_m)= 1 - (1 - \frac{1}{N})^m

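Note that N = 52! \approx 8.07 \times 10^{67} is extremely large, which makes (1 - \frac{1}{N}) extremely close to 1. In ordinary circumstances, as long as m takes a normal value, P(A_m) will still be extremely close to 0.

For example, if you shuffle the deck ten million times in a row (m = 10,000,000), the probability of returning to the original arrangement A at least once is: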

P(A_m)
= 1 - (1 - \frac{1}{N})^m
= 1 - (1 - \frac{1}{52!})^{10,000,000}

An ordinary handheld calculator will give you 0. A computer will give you

1.239799930857148592 \times 10^{-61}

— Me@2018-01-25 12:38:39 PM
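
The contrast between the calculator's 0 and the computer's tiny non-zero value can be reproduced with the following sketch. The original post does not say how the number was computed; the `log1p`/`expm1` route here is only one possible way, assumed for illustration, to avoid the rounding of 1 - \frac{1}{N} to exactly 1 in double precision.

```python
import math

N = math.factorial(52)   # number of arrangements of a 52-card deck
m = 10_000_000           # number of shuffles

# Naive double-precision evaluation: 1 - 1/N rounds to exactly 1.0,
# so the whole expression collapses to 0.0, like a handheld calculator.
naive = 1 - (1 - 1 / N) ** m

# Stable evaluation, using (1 - 1/N)^m = exp(m * log(1 - 1/N)).
stable = -math.expm1(m * math.log1p(-1 / N))

# First-order approximation m/N, reasonable because m is tiny next to N.
first_order = m / N

print(naive)
print(stable)
print(first_order)
```

The stable value agrees with the figure quoted above to the precision of a double, and it is essentially m/N, since the correction terms are of order (m/N)^2.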

.

.

2018.02.13 Tuesday (c) All rights reserved by ACHK

Logical arrow of time, 6.2

Source of time asymmetry in macroscopic physical systems

Second law of thermodynamics

.

.

Physics is not about reality, but about what one can say about reality.

— Bohr

— paraphrased

.

.

Physics should deduce what an observer would observe,

not what it really is, for that would be impossible.

— Me@2018-02-02 12:15:38 AM

.

.

1. Physics is about what an observer can observe about reality.

2. Whatever an observer can observe is a consistent history.

observer ~ a consistent story

observing ~ gathering a consistent story from the quantum reality

3. Physics [relativity and quantum mechanics] is also about the consistency of results of any two observers _when_, but not before, they compare those results, observational or experimental.

4. That consistency is guaranteed because the comparison of results itself can be regarded as a physical event, which can be observed by a third observer, aka a meta observer.

Since whatever an observer can observe is consistent, the meta-observer would see that the two observers have consistent observational results.

5. Either of the original observers is one of the possible meta-observers, since each of them would certainly witness the comparison of the observational data.

— Me@2018-02-02 10:25:05 PM

.

.

.

2018.02.03 Saturday (c) All rights reserved by ACHK

Chance-rebirth theory (機遇再生論) 1.5

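For example,

After X passes away, X will be reborn within one hundred billion years.

is a “scientific statement” (an empirical statement), because you know under what circumstances it could be falsified: if you have waited one hundred billion years after X's death and X has still not been reborn, the statement is false.

However,

After X passes away, X will certainly be reborn, provided one waits long enough.

has no scientific meaning at all. It is merely a “tautology”, because no one can say under what circumstances it would be false.

If you have waited one hundred billion years and X has still not been reborn, this “chance-rebirth theory” still does not count as wrong, because that only means that one hundred billion years was not yet “long enough”.

Passing off a “tautology” as an “empirical statement” produces a “vacuous proposition”.

(See the articles on this blog about “tautologies”, “empirical statements” and the “verification principle”.)

But that does not mean we should abandon the chance-rebirth theory at once. Instead, we can try “sympathetic understanding”.

“Sympathetic understanding” means the following. Some theories show obvious holes after a first level of analysis. Even so, we can try to put ourselves in the author's state of mind and circumstances at the time the theory was published, and study the origin and motivation behind it, to see whether the theory fails merely because the author's language or thinking was not clear enough, that is, because it was poorly expressed.

In fact, the theory's “true form” may be full of new insights. If so, we have a chance of translating the chance-rebirth theory into a meaningful, non-vacuous version.

So “sympathetic understanding” can also be called “panning ideas for gold”.

The chance-rebirth theory can be sympathetically understood as meaning the following:

(This meaning is also used as supporting grounds in the original text of the chance-rebirth theory.)

Suppose you are holding a deck of playing cards in some arrangement A. After one shuffle, the chance that the arrangement is still A is minuscule.

A complete deck of playing cards has N = 54! \approx 2.3 \times 10^{71} possible arrangements. In other words, the probability that the deck is still in arrangement A after a shuffle is only \frac{1}{N}.

Because the denominator N is so large (roughly a 2 followed by 71 more digits), the deck should, after a shuffle, end up in some other arrangement B.

P(A) = \frac{1}{N}

P(\text{not} A) = 1 - \frac{1}{N}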

— Me@2017-12-18 02:51:11 PM
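
The \frac{1}{N} figure can be illustrated by simulation, though only for a deck small enough that returns to the original arrangement actually occur. The sketch below assumes that every shuffle produces a uniformly random arrangement, which the argument above takes for granted. The function name `estimate_return_probability` and the 4-card deck are hypothetical choices for this example.

```python
import math
import random

def estimate_return_probability(deck_size, trials, seed=0):
    """Estimate the chance that one uniform shuffle restores the original order."""
    rng = random.Random(seed)
    original = list(range(deck_size))
    hits = 0
    for _ in range(trials):
        deck = original[:]
        rng.shuffle(deck)          # uniform random permutation (assumption)
        if deck == original:
            hits += 1
    return hits / trials

deck_size = 4                      # toy deck: N = 4! = 24 arrangements
exact = 1 / math.factorial(deck_size)
estimate = estimate_return_probability(deck_size, trials=100_000)
print(exact, estimate)
```

For a full deck the same experiment would practically never register a hit, which is the point of the argument.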
 
 
 
2017.12.18 Monday (c) All rights reserved by ACHK

Mathematics

    The miracle of the appropriateness of the language of mathematics for the formulation of the laws of physics is a wonderful gift which we neither understand nor deserve.

    A possible explanation of the physicist’s use of mathematics to formulate his laws of nature is that he is a somewhat irresponsible person. As a result, when he finds a connection between two quantities which resembles a connection well-known from mathematics, he will jump at the conclusion that the connection is that discussed in mathematics simply because he does not know of any other similar connection.

— The Unreasonable Effectiveness of Mathematics in the Natural Sciences

— E. P. Wigner

2017.10.07 Saturday ACHK

Uncertainty Principle 8

EPR paradox for entangled particles

Bohr was compelled to modify his understanding of the uncertainty principle after another thought experiment by Einstein. In 1935, Einstein, Podolsky and Rosen (see EPR paradox) published an analysis of widely separated entangled particles. Measuring one particle, Einstein realized, would alter the probability distribution of the other, yet here the other particle could not possibly be disturbed. This example led Bohr to revise his understanding of the principle, concluding that the uncertainty was not caused by a direct interaction.

But Einstein came to much more far-reaching conclusions from the same thought experiment. He believed the “natural basic assumption” that a complete description of reality, would have to predict the results of experiments from “locally changing deterministic quantities”, and therefore, would have to include more information than the maximum possible allowed by the uncertainty principle.

In 1964, John Bell showed that this assumption can be falsified, since it would imply a certain inequality between the probabilities of different experiments. Experimental results confirm the predictions of quantum mechanics, ruling out Einstein’s basic assumption that led him to the suggestion of his hidden variables. Ironically this fact is one of the best pieces of evidence supporting Karl Popper’s philosophy of invalidation of a theory by falsification-experiments. That is to say, here Einstein’s “basic assumption” became falsified by experiments based on Bell’s inequalities.

While it is possible to assume that quantum mechanical predictions are due to nonlocal, hidden variables, and in fact David Bohm invented such a formulation, this resolution is not satisfactory to the vast majority of physicists. The question of whether a random outcome is predetermined by a nonlocal theory can be philosophical, and it can be potentially intractable. If the hidden variables are not constrained, they could just be a list of random digits that are used to produce the measurement outcomes.

— Wikipedia on Uncertainty principle

2017.01.18 Wednesday ACHK

Superposition always exists

A Non-classical Feature, 2

.

superposition

~ linear overlapping

~ f(ax + by) = a f(x) + b f(y)

.

Reality is a linear overlapping of potential realities, although different components may have different weightings.

Superposition always exists, if it exists at the beginning of a process.

So the expression “the wave function collapses and the superposition ceases to exist” does not make sense.

.

Superposition always exists; interference (pattern) does not.

For a superposition to have an interference pattern, the two (for example) component eigenstates need to have a constant phase difference.

In other words, they have to be coherent.

.

superposition without an interference pattern

~ microscopically decoherent component states

~ macroscopically a classical state

— Me@2016-09-01 4:42 AM

.

The above is not correct.

A quantum superposition is not just an overlapping of classical states. If it were, no interference pattern would form in the double-slit experiment, for example. If a quantum superposition were just an overlapping of classical worlds, how could you explain the destructive interference part?

— Me@2020-12-19 07:19:08 PM

.

.

2016.11.27 Sunday (c) All rights reserved by ACHK

A Non-classical Feature

What makes the interference pattern of electrons in the double-slit experiment a non-classical feature?

The probability pattern of every electron being a particle and that of being a wave are different.

For the particle pattern, the left-slit part and the right-slit part of the probability wave do not overlap. The quantum superposition does not cause an (interference) pattern.

This is why the interference pattern is a non-classical feature of the electron double-slit experiment.

— Me@2016-10-06 09:53:07 AM
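
The difference between the two patterns can be made concrete with a toy model. It is only a sketch: it uses two unit amplitudes with a relative phase and ignores the actual slit geometry, and the function names are invented for this illustration. Adding amplitudes before squaring gives a result that depends on the phase difference. Adding probabilities does not.

```python
import cmath
import math

def coherent_intensity(phase_difference):
    """|psi_left + psi_right|^2 for two unit amplitudes (wave pattern)."""
    psi_left = 1 + 0j
    psi_right = cmath.exp(1j * phase_difference)
    return abs(psi_left + psi_right) ** 2

def particle_intensity():
    """|psi_left|^2 + |psi_right|^2: probabilities are added, not amplitudes."""
    return 1.0 + 1.0

bright = coherent_intensity(0.0)        # constructive interference
dark = coherent_intensity(math.pi)      # destructive interference
classical = particle_intensity()

# Averaging the coherent result over evenly spaced phase differences
# removes the interference term and recovers the particle value.
n = 1000
average = sum(coherent_intensity(2 * math.pi * k / n) for k in range(n)) / n

print(bright, dark, classical, average)
```

The coherent sum swings between about 4 and about 0, while the particle-like sum stays at 2. The dark points are what a purely classical overlap cannot produce.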

2016.10.07 Friday (c) All rights reserved by ACHK

Black hole complementarity 3

Raphael nicely avoids many of the confusions by introducing a refined version of the complementarity principle, the so-called observer complementarity… If I add some “foundations of quantum mechanics” flavor to the principle, it says:

Quantum mechanics is a set of rules that allows an observer to predict, explain, and/or verify observations (and especially their mutual relationships) that he has access to.

An observer has access to a causal diamond – the intersection of the future light cone of the initial moment of his world line and the past light cone of the final moment of his world line (the latter, the final moment before which one must be able to collect the data, is more important in this discussion).

No observer can detect inconsistencies within the causal diamonds. However, inconsistencies between “stories” as told by different observers with different causal diamonds are allowed (and mildly encouraged) in general (as long as there is no observer who could incorporate all the data needed to see an inconsistency).

Bohr has said that physics is about the right things we can say about the real world, not about objective reality, and it has to be internally consistent. However, in the context of general relativity, the internal consistency doesn’t imply that there has to be a “global viewpoint” or “objective reality” that is valid for everyone.

— Raphael Bousso is right about firewalls

— Lubos Motl

2016.07.27 Wednesday ACHK

Euler Formula

Exponential, 2
 

a^x

general exponential increase ~ the effects are cumulative
 
e^x

natural exponential increase ~ every step has immediate and cumulative effects

— Me@2014-10-29 04:44:51 PM
 

exponential growth

e^x = \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n

~ compound interest effects with infinitesimal time intervals
 

multiply -1

~ rotate to the opposite direction

(rotate the position vector of a number on the real number line to the opposite direction)

~ rotate 180 degrees
 

multiply i

~ rotate to the perpendicular direction

~ rotate 90 degrees
 

For example, the complex number (3, 0) times i equals (0, 3):

3 \times i = 3 i
(3, 0) (0, 1) = (0, 3)
 

multiplying i

~ change the direction to the one perpendicular to the current moving direction

(current moving direction ~ the direction of a number’s position vector)
 

exponential growth with an imaginary amount

e^{i \theta} = \lim_{n \to \infty} \left( 1 + \frac{i \theta}{n} \right)^n

~ change the direction to the one perpendicular to the current moving direction continuously

~ rotate \theta radians

— Me@2016-06-05 04:04:13 PM
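
The limit picture can be tried numerically. The sketch below uses a hypothetical helper, `compound_rotation`, which multiplies 1 by \left(1 + \frac{i \theta}{n}\right) a total of n times, so that each small step is perpendicular to the current position vector. It compares the result with e^{i \theta} and also checks the (3, 0) times i example.

```python
import cmath
import math

def compound_rotation(theta, n):
    """(1 + i*theta/n)^n: n small steps, each perpendicular to the current position."""
    z = 1 + 0j
    step = 1 + 1j * theta / n
    for _ in range(n):
        z *= step
    return z

theta = math.pi / 2                      # a quarter turn
approx = compound_rotation(theta, 100_000)
exact = cmath.exp(1j * theta)            # cos(theta) + i sin(theta)

quarter_turn = complex(3, 0) * 1j        # (3, 0) times i

print(approx, exact, quarter_turn)
```

With finite n the point drifts slightly off the unit circle, because each perpendicular step lengthens the vector a little. The drift shrinks as n grows, which is why the limit gives a pure rotation by \theta radians.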
 
 
 
2016.06.08 Wednesday (c) All rights reserved by ACHK

Quantum entanglement 3

Nature never forgets about any correlations: …

— Lubos Motl

entanglement ~ correlation ~ book-keeping

— Me@2012-04-11 12:10:08 AM

2016.05.20 Friday (c) All rights reserved by ACHK

Gradient 1.2

Distance vs Displacement, 2

.

The physical reason of “the magnitude of the gradient vector represents the spatial rate of change” of a scalar field is that \displaystyle{\frac{\partial f}{\partial x}} represents the spatial rate of change of a scalar field along the \displaystyle{x} direction.

Directional derivative has exactly the same meaning except that its direction may not be along any one of the coordinate axes.

— Me@2016-02-06 07:23:32 AM

.

Assume that \displaystyle{\delta x} represents a displacement from point 1 to point 2 along the \displaystyle{x} direction and \displaystyle{\delta y} represents a displacement from point 2 to point 3 along the \displaystyle{y} direction.

Denote “the value of the scalar field” as “height”. Then

the height difference between point 3 and point 1

= the height difference between point 2 and point 1

+ the height difference between point 3 and point 2

.

That is the exact reason that the change of \displaystyle{f} due to the displacement \displaystyle{\mathbf{v}} is

\displaystyle{    \begin{aligned}    \left(\delta f\right)_{\mathbf{v}}    &= \frac{\partial f}{\partial x} \delta x + \frac{\partial f}{\partial y} \delta y \\  &= \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) \cdot (\delta x, \delta y) \\  &= \left(\nabla f\right) \cdot \mathbf{v} \\    \end{aligned}}

The “height difference” does not care about the cause or process that introduces that height change.

— Me@2016-04-21 11:16:06 PM
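
A quick numerical illustration of the height argument, using a made-up scalar field f(x, y) = x^2 y that does not come from the original post. The height difference between point 3 and point 1 equals the sum of the two step-wise differences, and for small steps it is close to the linear expression built from the two partial derivatives.

```python
def f(x, y):
    # Hypothetical scalar field used only for illustration.
    return x * x * y

x1, y1 = 1.0, 2.0
dx, dy = 1e-3, 2e-3

h1 = f(x1, y1)             # height at point 1
h2 = f(x1 + dx, y1)        # point 2: moved dx along x
h3 = f(x1 + dx, y1 + dy)   # point 3: then moved dy along y

total = h3 - h1
by_steps = (h2 - h1) + (h3 - h2)

# Analytic partial derivatives of x^2 * y, evaluated at point 1.
df_dx = 2 * x1 * y1
df_dy = x1 * x1
linear = df_dx * dx + df_dy * dy

print(total, by_steps, linear)
```

The first two numbers agree up to rounding regardless of the path taken. The third differs from them only by terms of second order in the step sizes.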

.

.

2016.05.01 Sunday (c) All rights reserved by ACHK

A state of confusion, 2

algorias 1117 days ago

An incredibly accurate depiction of research in any theoretical field, I’d say. Compound that with the fact that during your education you’re mostly presented with texts that summarize decades or more of research into a scant few pages as if the people involved had just flowed naturally from one idea to the next, from a problem statement to the incredibly complex idea that unlocks the proof.

When it’s finally your turn to try your hand at actual research, it turns out that your contributions are barely a couple of side notes on a restricted subset of a problem in the hope that someone will use that information to find out something that is actually relevant in practice.

Rather than turning me off from academia, it makes me marvel at the tower of minuscule pebbles upon which our modern civilization rests. One day, I might get to place a few more of them on top.

— Hacker News

2016.04.12 Tuesday ACHK

Gradient

Assume \displaystyle{(x, y)} represents the position of an object and \displaystyle{f(x,y)} is a scalar field on the \displaystyle{xy}-plane. Then \displaystyle{\frac{\partial f}{\partial x}} represents the change of \displaystyle{f} per unit length along the positive \displaystyle{x} direction. In other words, it is the spatial rate of change of \displaystyle{f} along the \displaystyle{x} direction.

Similarly, derivative \displaystyle{\frac{\partial f}{\partial y}} represents the spatial rate of change of \displaystyle{f} along the \displaystyle{y} direction.

For an arbitrary direction, due to the nature of displacement, the change of \displaystyle{f} is \displaystyle{\delta f = \frac{\partial f}{\partial x} \delta x + \frac{\partial f}{\partial y} \delta y} when the object has finished moving \displaystyle{\delta x} in the \displaystyle{x} direction and then \displaystyle{\delta y} in the \displaystyle{y} direction.

Then, the spatial rate of change of \displaystyle{f} is

\displaystyle{   \begin{aligned}   &\frac{\delta f}{\sqrt{(\delta x)^2 + (\delta y)^2}} \\  &= \frac{\partial f}{\partial x} \frac{\delta x}{\sqrt{(\delta x)^2 + (\delta y)^2}}  + \frac{\partial f}{\partial y} \frac{\delta y}{\sqrt{(\delta x)^2 + (\delta y)^2}} \\  \end{aligned} }

.

For simplicity, denote the resultant displacement as \displaystyle{\mathbf{v}}:

\displaystyle{\mathbf{v} = (\delta x, \delta y)}

and define \displaystyle{\nabla f} as

\displaystyle{\left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right)}

Then, the change of \displaystyle{f} due to the displacement \displaystyle{\mathbf{v}} is

\displaystyle{\begin{aligned}  \left(\delta f\right)_{\mathbf{v}}  &= \frac{\partial f}{\partial x} \delta x + \frac{\partial f}{\partial y} \delta y \\  &= \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) \cdot (\delta x, \delta y) \\  &= \left(\nabla f\right) \cdot \mathbf{v} \\  \end{aligned}}

.

So the spatial rate of change of \displaystyle{f} along the direction of the vector \displaystyle{\mathbf{v}} is

\displaystyle{\begin{aligned}  D_{\mathbf{v}}(f)  &= \frac{\left(\delta f\right)_{\mathbf{v}}}{|\mathbf{v}|} \\  &= \frac{\partial f}{\partial x} \frac{\delta x}{\sqrt{(\delta x)^2 + (\delta y)^2}}  + \frac{\partial f}{\partial y} \frac{\delta y}{\sqrt{(\delta x)^2 + (\delta y)^2}} \\  &= \left(\nabla f\right) \cdot \frac{\mathbf{v}}{|\mathbf{v}|} \\  &= \left(\nabla f\right) \cdot \hat{\mathbf{v}} \\  \end{aligned}}

\displaystyle{D_{\mathbf{v}}(f)} is called the directional derivative.

— Me@2016-02-06 09:49:22 PM

.

This is the reason that \displaystyle{\nabla f} is in the steepest direction.

If \displaystyle{\hat{\mathbf{v}}} is chosen to be parallel to \displaystyle{\nabla f}, the directional derivative \displaystyle{\left(\nabla f\right) \cdot \hat{\mathbf{v}}} would be maximized.

— Me@2021-08-20 05:20:02 PM
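
Both statements can be checked numerically on a made-up field. The field f(x, y) = x^2 + 3y, the point (1, 2) and the helper names below are assumptions for illustration only. The sketch compares a finite-difference directional derivative with \left(\nabla f\right) \cdot \hat{\mathbf{v}}, then scans unit vectors around the circle to find the steepest direction.

```python
import math

def f(x, y):
    # Hypothetical scalar field used only for illustration.
    return x * x + 3 * y

def gradient(x, y):
    # Analytic partial derivatives of x^2 + 3y.
    return (2 * x, 3.0)

def directional_derivative(x, y, angle, h=1e-6):
    """Finite-difference change of f per unit length along a unit vector."""
    ux, uy = math.cos(angle), math.sin(angle)
    return (f(x + h * ux, y + h * uy) - f(x, y)) / h

x0, y0 = 1.0, 2.0
gx, gy = gradient(x0, y0)

# Direction of v = (3, 4), i.e. unit vector (0.6, 0.8).
angle = math.atan2(4, 3)
numeric = directional_derivative(x0, y0, angle)
analytic = gx * math.cos(angle) + gy * math.sin(angle)

# Scan unit vectors in 0.1-degree steps and keep the steepest one.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
steepest = max(angles, key=lambda a: directional_derivative(x0, y0, a))

print(numeric, analytic)
print(steepest, math.atan2(gy, gx))
```

The finite-difference value matches the dot product (about 3.6 here), and the scanned steepest angle lands on the direction of \nabla f to within the grid spacing.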

.

.

2016.02.21 Sunday (c) All rights reserved by ACHK

Self-information

The information entropy of a random event is the expected value of its self-information.

In information theory, self-information or surprisal is a measure of the information content associated with an event in a probability space or with the value of a discrete random variable.

By definition, the amount of self-information contained in a probabilistic event depends only on the probability of that event: the smaller its probability, the larger the self-information associated with receiving the information that the event indeed occurred.

As a quick illustration, the information content associated with an outcome of 4 heads (or any specific outcome) in 4 consecutive tosses of a coin would be 4 bits (probability 1/16), and the information content associated with getting a result other than the one specified would be 0.09 bits (probability 15/16).

— Wikipedia on Self-information
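
The coin-toss figures in the quotation can be reproduced in a few lines. The sketch assumes the usual definition of self-information in bits, -\log_2 p, and treats “the specified outcome” versus “any other outcome” as a two-event random variable in order to illustrate the opening sentence about entropy. The function name is hypothetical.

```python
import math

def self_information(p):
    """Surprisal, in bits, of an event with probability p."""
    return -math.log2(p)

p_specific = 1 / 16    # one specific outcome of 4 fair coin tosses
p_other = 15 / 16      # any of the other 15 outcomes

bits_specific = self_information(p_specific)
bits_other = self_information(p_other)

# Entropy of the two-event variable: the expected self-information.
entropy = p_specific * bits_specific + p_other * bits_other

print(bits_specific, round(bits_other, 2), round(entropy, 3))
```

The rare event carries 4 bits and the common one about 0.09 bits. Weighted by their probabilities, they average to roughly 0.337 bits.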

2015.12.31 Thursday ACHK