Problem 2.3b5

A First Course in String Theory


2.3 Lorentz transformations, derivatives, and quantum operators.

(b) Show that the objects \displaystyle{\frac{\partial}{\partial x^\mu}} transform under Lorentz transformations in the same way as the \displaystyle{a_\mu} considered in (a) do. Thus, partial derivatives with respect to conventional upper-index coordinates \displaystyle{x^\mu} behave as a four-vector with lower indices – as reflected by writing it as \displaystyle{\partial_\mu}.


Denoting \displaystyle{ \eta_{\mu \rho} L^\rho_{~\sigma} \eta^{\nu \sigma}} as \displaystyle{L^{~\nu}_{\mu}} is misleading, because it presupposes that \displaystyle{ \eta_{\mu \rho} L^\rho_{~\sigma} \eta^{\nu \sigma}} is directly related to the matrix \displaystyle{L}.

To avoid this confusion, we instead denote \displaystyle{ \eta_{\mu \rho} L^\rho_{~\sigma} \eta^{\nu \sigma}} as \displaystyle{M^\nu_{~\mu}}. So

\displaystyle{ \begin{aligned} (x')^\mu &= L^\mu_{~\nu} x^\nu \\ (x')^\mu (x')_\mu &= \left( L^\mu_{~\nu} x^\nu \right) \left( \eta_{\mu \rho} L^\rho_{~\sigma} \eta^{\beta \sigma} x_\beta \right) \\ (x')^\mu (x')_\mu &= \left( L^\mu_{~\nu} x^\nu \right) \left( M^{\beta}_{~\mu} x_\beta \right) \\ x^\mu x_\mu &= \left( L^\mu_{~\nu} x^\nu \right) \left( M^{\beta}_{~\mu} x_\beta \right) \\ \end{aligned}}

Since this holds for an arbitrary \displaystyle{x},

\displaystyle{ \begin{aligned} \beta \neq \nu:&~~~~~~\sum_{\mu = 0}^3 L^\mu_{~\nu} M^{\beta}_{~\mu} &= 0 \\ \beta = \nu:&~~~~~~\sum_{\mu = 0}^3 L^\mu_{~\nu} M^{\beta}_{~\mu} &= 1 \\ \end{aligned}}

Using the Kronecker Delta and Einstein summation notation, we have

\displaystyle{ \begin{aligned} L^\mu_{~\nu} M^{\beta}_{~\mu} &= M^{\beta}_{~\mu} L^\mu_{~\nu} \\ &= \delta^{\beta}_{~\nu} \\ \end{aligned}}


\displaystyle{ \begin{aligned} \sum_{\mu=0}^{3} L^\mu_{~\nu} M^{\beta}_{~\mu} &= \delta^{\beta}_{~\nu} \\ \end{aligned}}

\displaystyle{ \begin{aligned}   M^{\beta}_{~\mu} &= [L^{-1}]^{\beta}_{~\mu} \\   \end{aligned}}

In other words,

\displaystyle{ \begin{aligned}    \eta_{\mu \rho} L^\rho_{~\sigma} \eta^{\beta \sigma} &= [L^{-1}]^{\beta}_{~\mu} \\   \end{aligned}}
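As a numerical sanity check (my own sketch, not part of the book), this identity can be verified for a sample Lorentz matrix. The metric signature \displaystyle{(-,+,+,+)} and the rapidity value \displaystyle{0.7} are assumptions; any Lorentz transformation would do.

```python
import numpy as np

# Metric with signature (-, +, +, +).
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

# A sample boost along x^1 with rapidity phi.
phi = 0.7
ch, sh = np.cosh(phi), np.sinh(phi)
L = np.array([[ ch, -sh, 0.0, 0.0],
              [-sh,  ch, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

# M^nu_mu = eta_{mu rho} L^rho_sigma eta^{nu sigma}
# ('m' = mu, 'r' = rho, 's' = sigma, 'n' = nu; output rows are nu, columns mu)
M = np.einsum('mr,rs,ns->nm', eta, L, eta)

# The conclusion of the derivation: M is the matrix inverse of L.
assert np.allclose(M, np.linalg.inv(L))

# Equivalently, L^mu_nu M^beta_mu = delta^beta_nu (sum over mu).
assert np.allclose(np.einsum('mn,bm->bn', L, M), np.eye(4))
```

The `einsum` subscripts follow the index placement of the equation literally, so the check does not silently assume \displaystyle{M = \eta L \eta} as a naive matrix product.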

— Me@2020-11-23 04:27:13 PM


One defines (as a matter of notation),

{\displaystyle {\Lambda _{\nu }}^{\mu }\equiv {\left(\Lambda ^{-1}\right)^{\mu }}_{\nu },}

and may in this notation write

{\displaystyle {A'}_{\nu }={\Lambda _{\nu }}^{\mu }A_{\mu }.}

Now for a subtlety. The implied summation on the right hand side of

{\displaystyle {A'}_{\nu }={\Lambda _{\nu }}^{\mu }A_{\mu }={\left(\Lambda ^{-1}\right)^{\mu }}_{\nu }A_{\mu }}

is running over a row index of the matrix representing \displaystyle{\Lambda^{-1}}. Thus, in terms of matrices, this transformation should be thought of as the inverse transpose of \displaystyle{\Lambda} acting on the column vector \displaystyle{A_\mu}. That is, in pure matrix notation,

{\displaystyle A'=\left(\Lambda ^{-1}\right)^{\mathrm {T} }A.}

— Wikipedia on Lorentz transformation



\displaystyle{ \begin{aligned}   M^{\beta}_{~\mu} &= [L^{-1}]^{\beta}_{~\mu} \\   \end{aligned}}

In other words,

\displaystyle{ \begin{aligned}    \eta_{\mu \rho} L^\rho_{~\sigma} \eta^{\beta \sigma} &= [L^{-1}]^{\beta}_{~\mu} \\   \end{aligned}}


Denote \displaystyle{[L^{-1}]^{\beta}_{~\mu}} as

\displaystyle{ \begin{aligned}   N^{~\beta}_{\mu} \\   \end{aligned}}

In other words,

\displaystyle{ \begin{aligned}   N^{~\beta}_{\mu} &= M^{\beta}_{~\mu} \\   [N^T] &= [M] \\   \end{aligned}}


The Lorentz transformation:

\displaystyle{ \begin{aligned}   (x')^\mu &= L^\mu_{~\nu} x^\nu \\   (x')_\mu &= \eta_{\mu \rho} L^\rho_{~\sigma} \eta^{\beta \sigma} x_\beta \\   \end{aligned}}


\displaystyle{ \begin{aligned}   (x')^\mu &= L^\mu_{~\nu} x^\nu \\   (x')_\mu &= N^{~\nu}_{\mu} x_\nu \\   \end{aligned}}


\displaystyle{ \begin{aligned}   x^\mu &= [L^{-1}]^\mu_{~\nu} (x')^\nu \\   (x')_\mu &= M^{\nu}_{~\mu} x_\nu \\   \end{aligned}}


\displaystyle{ \begin{aligned}   x^\mu &= [L^{-1}]^\mu_{~\nu} (x')^\nu \\   (x')_\mu &= [L^{-1}]^{\nu}_{~\mu} x_\nu \\   \end{aligned}}


\displaystyle{ \begin{aligned}   \frac{\partial}{\partial (x')^\mu} &= \frac{\partial x^\nu}{\partial (x')^\mu} \frac{\partial}{\partial x^\nu} \\   &= \frac{\partial x^0}{\partial (x')^\mu} \frac{\partial}{\partial x^0} + \frac{\partial x^1}{\partial (x')^\mu} \frac{\partial}{\partial x^1} + \frac{\partial x^2}{\partial (x')^\mu} \frac{\partial}{\partial x^2} + \frac{\partial x^3}{\partial (x')^\mu} \frac{\partial}{\partial x^3} \\   \end{aligned}}

Now we consider \displaystyle{f} as a function of the \displaystyle{x^{\mu}}'s:

\displaystyle{f(x^0, x^1, x^2, x^3)}

Since the \displaystyle{x^{\mu}}'s and \displaystyle{(x')^{\mu}}'s are related by a Lorentz transformation, \displaystyle{f} is also a function of the \displaystyle{(x')^{\mu}}'s, although indirectly.

\displaystyle{f(x^0((x')^0, (x')^1, (x')^2, (x')^3), x^1((x')^0, ...), x^2((x')^0, ...), x^3((x')^0, ...))}

For notational simplicity, we write \displaystyle{f} as

\displaystyle{f(x^\alpha((x')^\beta))}

Since \displaystyle{f} is a function of the \displaystyle{(x')^{\mu}}'s, we can differentiate it with respect to the \displaystyle{(x')^{\mu}}'s.

\displaystyle{ \begin{aligned}   \frac{\partial}{\partial (x')^\mu} f(x^\alpha((x')^\beta)) &= \sum_{\nu = 0}^3 \frac{\partial x^\nu}{\partial (x')^\mu} \frac{\partial}{\partial x^\nu}  f(x^\alpha) \\   \end{aligned}}


Since \displaystyle{ \begin{aligned}   x^\nu &= [L^{-1}]^\nu_{~\beta} (x')^\beta \\   \end{aligned}},

\displaystyle{ \begin{aligned}   \frac{\partial f}{\partial (x')^\mu}   &= \sum_{\nu = 0}^3 \frac{\partial}{\partial (x')^\mu} \left[  \sum_{\beta = 0}^3 [L^{-1}]^\nu_{~\beta} (x')^\beta \right] \frac{\partial f}{\partial x^\nu} \\   &= \sum_{\nu = 0}^3 \sum_{\beta = 0}^3 [L^{-1}]^\nu_{~\beta} \frac{\partial (x')^\beta}{\partial (x')^\mu} \frac{\partial f}{\partial x^\nu} \\   &= \sum_{\nu = 0}^3 \sum_{\beta = 0}^3 [L^{-1}]^\nu_{~\beta} \delta^\beta_\mu \frac{\partial f}{\partial x^\nu} \\   &= \sum_{\nu = 0}^3 [L^{-1}]^\nu_{~\mu} \frac{\partial f}{\partial x^\nu} \\   &= [L^{-1}]^\nu_{~\mu} \frac{\partial f}{\partial x^\nu} \\   \end{aligned}}


\displaystyle{ \begin{aligned}   \frac{\partial}{\partial (x')^\mu} &= [L^{-1}]^\nu_{~\mu} \frac{\partial}{\partial x^\nu} \\   \end{aligned}}

It is the same as the Lorentz transform for covariant vectors:

\displaystyle{ \begin{aligned}   (x')_\mu &= [L^{-1}]^{\nu}_{~\mu} x_\nu \\   \end{aligned}}
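This transformation rule can also be checked numerically. In the sketch below (my own illustration; the scalar field \displaystyle{f}, the sample boost, and the helper `grad` are arbitrary choices), the left side differentiates \displaystyle{f} with respect to the primed coordinates directly, while the right side applies \displaystyle{[L^{-1}]^\nu_{~\mu} \, \partial f / \partial x^\nu}.

```python
import numpy as np

# A sample boost with rapidity 0.7 along x^1; the chain rule itself
# holds for any invertible coordinate transformation.
phi = 0.7
ch, sh = np.cosh(phi), np.sinh(phi)
L = np.array([[ ch, -sh, 0.0, 0.0],
              [-sh,  ch, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
Linv = np.linalg.inv(L)

# An arbitrary smooth scalar field f(x^0, x^1, x^2, x^3).
def f(x):
    return np.sin(x[0]) + x[1] * x[2] + x[3] ** 2

def grad(g, x, h=1e-6):
    """Central-difference gradient of g at the point x."""
    out = np.zeros(4)
    for mu in range(4):
        e = np.zeros(4); e[mu] = h
        out[mu] = (g(x + e) - g(x - e)) / (2 * h)
    return out

xp = np.array([0.3, -1.1, 0.8, 0.5])  # a point in primed coordinates
x = Linv @ xp                         # x^nu = [L^{-1}]^nu_beta (x')^beta

# Left side: d f / d(x')^mu, differentiating f(Linv @ x') directly.
lhs = grad(lambda y: f(Linv @ y), xp)

# Right side: [L^{-1}]^nu_mu * (d f / d x^nu), the covariant rule.
rhs = np.einsum('nm,n->m', Linv, grad(f, x))

assert np.allclose(lhs, rhs, atol=1e-5)
```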

— Me@2020-11-23 04:27:13 PM



2020.11.24 Tuesday (c) All rights reserved by ACHK

Global symmetry, 2

In physics, a global symmetry is a symmetry that holds at all points in the spacetime under consideration, as opposed to a local symmetry which varies from point to point.

Global symmetries require conservation laws, but not forces, in physics.

— Wikipedia on Global symmetry



2020.11.22 Sunday ACHK

Light, 3

無額外論 7


The one in the mirror is your Light.

— Me@2011.06.24


Thou shalt have no other gods before Me.

— one of the Ten Commandments


God teach you through your mind; help you through your actions.

— Me@the Last Century



2020.11.21 Saturday (c) All rights reserved by ACHK

一萬個小時 2.4

機遇創生論 1.6.6 | 十年 3.4











— Me@2020-11-16 04:55:18 PM



2020.11.20 Friday (c) All rights reserved by ACHK

Poisson's Lagrange Equation

Structure and Interpretation of Classical Mechanics


Ex 1.10 Higher-derivative Lagrangians

Derive Lagrange’s equations for Lagrangians that depend on accelerations. In particular, show that the Lagrange equations for Lagrangians of the form \displaystyle{L(t, q, \dot q, \ddot q)} with \displaystyle{\ddot{q}} terms are

\displaystyle{D^2(\partial_3L \circ \Gamma[q]) - D(\partial_2 L \circ \Gamma[q]) + \partial_1 L \circ \Gamma[q] = 0}

In general, these equations, first derived by Poisson, will involve the fourth derivative of \displaystyle{q}. Note that the derivation is completely analogous to the derivation of the Lagrange equations without accelerations; it is just longer. What restrictions must we place on the variations so that the critical path satisfies a differential equation?

Varying the action

\displaystyle{ \begin{aligned}   S[q] (t_1, t_2) &= \int_{t_1}^{t_2} L \circ \Gamma [q] \\   \eta(t_1) &= \eta(t_2) = 0 \\   \end{aligned}}

\displaystyle{ \begin{aligned}   \delta_\eta S[q] (t_1, t_2) &= 0 \\   \end{aligned}}

\displaystyle{ \begin{aligned}   \delta_\eta S[q] (t_1, t_2) &= \int_{t_1}^{t_2} \delta_\eta \left( L \circ \Gamma [q] \right) \\   \end{aligned}}

\displaystyle{ \begin{aligned}     \delta_\eta I [q] &= \eta \\  \delta_\eta g[q] &= D \eta~~~\text{with}~~~g[q] = Dq \\   \end{aligned}}


Let \displaystyle{h[q] = D^2 q}.

\displaystyle{ \begin{aligned}   \delta_\eta h[q]   &= \lim_{\epsilon \to 0} \frac{h[q+\epsilon \eta] - h[q]}{\epsilon} \\   &= \lim_{\epsilon \to 0} \frac{D^2 (q+\epsilon \eta) - D^2 q}{\epsilon} \\   &= \lim_{\epsilon \to 0} \frac{D^2 q + D^2 \epsilon \eta - D^2 q}{\epsilon} \\   &= \lim_{\epsilon \to 0} \frac{D^2 \epsilon \eta}{\epsilon} \\   &= D^2 \eta \\   \end{aligned}}

\displaystyle{ \begin{aligned}   \Gamma [q] (t) &= (t, q(t), D q(t), D^2 q(t)) \\  \delta_\eta \Gamma [q] (t) &= (0, \eta (t), D \eta (t), D^2 \eta (t)) \\  \end{aligned}}


Chain rule of functional variation

\displaystyle{ \begin{aligned} &\delta_\eta F[g[q]] \\   &= \delta_\eta (F \circ g)[q] \\   &= \delta_{ \left( \delta_\eta g[q] \right)} F[g] \\ \end{aligned}}

Since variation commutes with integration,

\displaystyle{ \begin{aligned}   \delta_\eta S[q] (t_1, t_2)   &= \delta_\eta \int_{t_1}^{t_2} L \circ \Gamma [q] \\   &= \int_{t_1}^{t_2} \delta_\eta \left( L \circ \Gamma [q] \right) \\   \end{aligned}}

By the chain rule of functional variation:

\displaystyle{ \begin{aligned}   \delta_\eta L \circ \Gamma [q] = \delta_{ \left( \delta_\eta \Gamma[q] \right)} L[\Gamma[q]] \\   \end{aligned}}

If \displaystyle{L} is path-independent,

\displaystyle{ \begin{aligned}   \delta_\eta \left( L \circ \Gamma [q] \right) = \left( DL \circ \Gamma[q] \right) \delta_\eta \Gamma[q] \\   \end{aligned}}

But is \displaystyle{L} path-independent?

\displaystyle{L \circ \Gamma [\cdot]} is path-dependent. Its input is a path \displaystyle{q}, not just \displaystyle{q(t)}, the value of \displaystyle{q} at time \displaystyle{t}. However, \displaystyle{L(\cdot)} itself is a path-independent function, because its input is not a path \displaystyle{q}, but a quadruple of values \displaystyle{(t, q(t), Dq(t), D^2 q(t))}.

\displaystyle{ \begin{aligned}   L \circ \Gamma [q] = L(t, q(t), Dq(t), D^2 q(t)) \\   \end{aligned}}

Since \displaystyle{L} is path-independent,

\displaystyle{ \begin{aligned}   \delta_\eta \left( L \circ \Gamma [q] \right)   = \left( DL \circ \Gamma[q] \right) \delta_\eta \Gamma[q] \\   \end{aligned}}

\displaystyle{ \begin{aligned}   &\delta_\eta S[q] (t_1, t_2) \\  &= \int_{t_1}^{t_2} \delta_\eta \left( L \circ \Gamma [q] \right) \\   &= \int_{t_1}^{t_2} \left( DL \circ \Gamma[q] \right) \delta_\eta \Gamma[q]  \\   &= \int_{t_1}^{t_2} \left( DL(t, q, D q, D^2 q) \right) (0, \eta (t), D \eta (t), D^2 \eta (t))  \\   &= \int_{t_1}^{t_2} \left[ \partial_0 L \circ \Gamma[q], \partial_1 L \circ \Gamma[q], \partial_2 L \circ \Gamma[q], \partial_3 L \circ \Gamma[q] \right] (0, \eta (t), D \eta (t), D^2 \eta (t))  \\   &= \int_{t_1}^{t_2} (\partial_1 L \circ \Gamma[q]) \eta + (\partial_2 L \circ \Gamma[q]) D \eta + (\partial_3 L \circ \Gamma[q]) D^2 \eta \\                        &=   \int_{t_1}^{t_2} (\partial_1 L \circ \Gamma[q]) \eta      + \left[ \left. (\partial_2 L \circ \Gamma[q]) \eta \right|_{t_1}^{t_2} - \int_{t_1}^{t_2} D(\partial_2 L \circ \Gamma[q]) \eta \right]     + \int_{t_1}^{t_2} (\partial_3 L \circ \Gamma[q]) D^2 \eta \\                        \end{aligned}}

Since \displaystyle{\eta(t_1) = 0} and \displaystyle{\eta(t_2) = 0},

\displaystyle{ \begin{aligned}   \delta_\eta S[q] (t_1, t_2)   &=   \int_{t_1}^{t_2} (\partial_1 L \circ \Gamma[q]) \eta      - \int_{t_1}^{t_2} D(\partial_2 L \circ \Gamma[q]) \eta      + \int_{t_1}^{t_2} (\partial_3 L \circ \Gamma[q]) D^2 \eta \\                        \end{aligned}}

Here is a trick for integration by parts:

As long as the boundary term \displaystyle{\left. u(t)v(t) \right|_{t_1}^{t_2} = 0},

\displaystyle{\int_{t_1}^{t_2} u(t) dv(t) = - \int_{t_1}^{t_2} v(t) du(t)}

So if \displaystyle{D \eta(t_1) = 0} and \displaystyle{D \eta(t_2) = 0},

\displaystyle{ \begin{aligned}   \delta_\eta S[q] (t_1, t_2)   &= \int_{t_1}^{t_2} (\partial_1 L \circ \Gamma[q]) \eta        - \int_{t_1}^{t_2} D(\partial_2 L \circ \Gamma[q]) \eta        - \int_{t_1}^{t_2} D(\partial_3 L \circ \Gamma[q]) D \eta \\                        \end{aligned}}

Since \displaystyle{\eta(t_1) = 0} and \displaystyle{\eta(t_2) = 0},

\displaystyle{ \begin{aligned}   \delta_\eta S[q] (t_1, t_2)   &= \int_{t_1}^{t_2} (\partial_1 L \circ \Gamma[q]) \eta        - \int_{t_1}^{t_2} D(\partial_2 L \circ \Gamma[q]) \eta        + \int_{t_1}^{t_2} D^2 (\partial_3 L \circ \Gamma[q]) \eta \\                        \end{aligned}}

\displaystyle{ \begin{aligned}   \delta_\eta S[q] (t_1, t_2)   &= \int_{t_1}^{t_2} \left[ (\partial_1 L \circ \Gamma[q]) - D(\partial_2 L \circ \Gamma[q]) + D^2 (\partial_3 L \circ \Gamma[q]) \right] \eta \\                        \end{aligned}}

By the principle of stationary action, \displaystyle{ \delta_\eta S[q] (t_1, t_2) = 0}. So

\displaystyle{ \begin{aligned}   0   &= \int_{t_1}^{t_2} \left[ (\partial_1 L \circ \Gamma[q]) - D(\partial_2 L \circ \Gamma[q]) + D^2 (\partial_3 L \circ \Gamma[q]) \right] \eta \\                        \end{aligned}}

Since this is true for any function \displaystyle{\eta(t)} that satisfies \displaystyle{\eta(t_1) = \eta(t_2) = 0} and \displaystyle{D\eta(t_1) = D\eta(t_2) = 0},

\displaystyle{ \begin{aligned}   (\partial_1 L \circ \Gamma[q]) - D(\partial_2 L \circ \Gamma[q]) + D^2 (\partial_3 L \circ \Gamma[q]) &= 0 \\                        D^2 (\partial_3 L \circ \Gamma[q]) - D(\partial_2 L \circ \Gamma[q]) + \partial_1 L \circ \Gamma[q] &= 0 \\                        \end{aligned}}
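The result can be checked symbolically. The sketch below (Python with sympy rather than the book's Scheme; the helper name `poisson_lagrange` is my own) treats the slots \displaystyle{q}, \displaystyle{Dq}, \displaystyle{D^2 q} as independent symbols, substitutes the path afterwards, and verifies two sample Lagrangians.

```python
import sympy as sp

t = sp.symbols('t')
q = sp.Function('q')
y, v, a = sp.symbols('y v a')  # placeholders for q, Dq, D^2 q

def poisson_lagrange(L):
    """D^2(partial_3 L o Gamma[q]) - D(partial_2 L o Gamma[q]) + partial_1 L o Gamma[q]."""
    on_path = {y: q(t), v: q(t).diff(t), a: q(t).diff(t, 2)}
    p1 = sp.diff(L, y).subs(on_path)
    p2 = sp.diff(L, v).subs(on_path)
    p3 = sp.diff(L, a).subs(on_path)
    return p3.diff(t, 2) - p2.diff(t) + p1

# No acceleration dependence: recovers the usual harmonic-oscillator
# equation m q'' + k q = 0 (up to an overall sign).
m, k = sp.symbols('m k', positive=True)
eq1 = poisson_lagrange(m*v**2/2 - k*y**2/2)
assert sp.simplify(eq1 + m*q(t).diff(t, 2) + k*q(t)) == 0

# L = (D^2 q)^2 / 2: the equation involves the fourth derivative of q,
# as Poisson's general result predicts.
eq2 = poisson_lagrange(a**2/2)
assert sp.simplify(eq2 - q(t).diff(t, 4)) == 0
```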



The notation of the path function \displaystyle{\Gamma} is \displaystyle{\Gamma[q](t)}, not \displaystyle{\Gamma[q(t)]}.

The notation \displaystyle{\Gamma[q](t)} means that \displaystyle{\Gamma} takes a path \displaystyle{q} as input and returns a path-independent function \displaystyle{\Gamma[q]}, which in turn takes a time \displaystyle{t} as input and returns a value \displaystyle{\Gamma[q](t)}.

The other notation \displaystyle{\Gamma[q(t)]} makes no sense, because \displaystyle{\Gamma[.]} takes a path \displaystyle{q}, not a value \displaystyle{q(t)}, as input.

— Me@2020-11-11 05:37:13 PM



2020.11.11 Wednesday (c) All rights reserved by ACHK

Memory as past microstate information encoded in present devices

Logical arrow of time, 4.2


Memory is of the past.

The main point of memories or records is that without them, most of the past microstate information would be lost for a macroscopic observer forever.

For example, if a mixture has already reached an equilibrium state, we cannot deduce which previous microstate it is from, unless we have the memory of it.




Memory ~ some of the past microstate and macrostate information encoded in present macroscopic devices, such as paper, electronic devices, etc.


How come macroscopic time is cumulative?


Quantum time evolution is unitary.

A quantum state in the present is evolved from one and only one quantum state at any particular time point in the past.

Also, that quantum state in the present will evolve to one and only one quantum state at any particular time point in the future.



\displaystyle{t_1} = a past time point

\displaystyle{t_2} = now

\displaystyle{t_3} = a future time point

Also, let state \displaystyle{S_1} at time \displaystyle{t_1} evolve to state \displaystyle{S_2} at time \displaystyle{t_2}. And then state \displaystyle{S_2} evolves to state \displaystyle{S_3} at time \displaystyle{t_3}.


State \displaystyle{S_2} has a one-one correspondence to its past state \displaystyle{S_1}. So the state \displaystyle{S_2} does not need memory to store any information about state \displaystyle{S_1}.

Instead, just by knowing that the microstate at \displaystyle{t_2} is \displaystyle{S_2}, we can already deduce that it evolved from state \displaystyle{S_1} at time \displaystyle{t_1}.

In other words, a microstate does not require memory.
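A toy model of this one-one correspondence (my own illustration, with a random 4-dimensional unitary standing in for the evolution operator): the past state \displaystyle{S_1} can always be recovered from the present state \displaystyle{S_2} by applying \displaystyle{U^\dagger}, so nothing extra needs to be stored.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random unitary U (the evolution operator from t1 to t2),
# obtained from the QR decomposition of a random complex matrix.
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U, _ = np.linalg.qr(A)
assert np.allclose(U @ U.conj().T, np.eye(4))  # unitarity

s1 = rng.normal(size=4) + 1j * rng.normal(size=4)
s1 /= np.linalg.norm(s1)   # the past state S1 at time t1

s2 = U @ s1                # the present state S2 at time t2

# One-one correspondence: the present state alone determines the past.
recovered = U.conj().T @ s2
assert np.allclose(recovered, s1)
```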

— Me@2020-10-28 10:26 AM



2020.11.02 Monday (c) All rights reserved by ACHK