About the logical foundations of the second law of thermodynamics

“The law that entropy always increases holds, I think, the supreme position among the laws of Nature. If someone points out to you that your pet theory of the universe is in disagreement with Maxwell’s equations – then so much the worse for Maxwell’s equations. If it is found to be contradicted by observation – well, these experimentalists do bungle things sometimes. But if your theory is found to be against the Second Law of Thermodynamics I can give you no hope; there is nothing for it but to collapse in deepest humiliation.”

― Arthur Eddington, The Nature of the Physical World, Chap. 4

But why is that so? Why is the Second Law so “special” among the other laws of physics?

Simply because, as we argue in a paper recently published in Physical Review E and freely available on the arXiv, the Second Law is not so much about physics as it is about logic and consistent reasoning. More precisely, we argue that the Second Law can be seen as the shadow of a deeper asymmetry that exists in statistical inference between prediction and retrodiction, an asymmetry ultimately imposed by the consistency of the Bayes–Laplace Rule.

A little bit of background. In the past two decades, thermodynamics has made unprecedented progress. This can be traced back to the development of stochastic thermodynamics, on the one hand, and the theory of nonequilibrium fluctuations, on the other. The latter, in particular, has shown that the Second Law emerges from a more fundamental “balance relation” between a physical process and its reverse. According to such a balance relation, for example, scrambled eggs are not forbidden to unscramble spontaneously; instead, the probability of such a process is just extremely tiny compared with that of its more familiar reverse. In turn, entropy (i.e. the thing that “no one knows what it really is”, according to the apocryphal exchange between Shannon and von Neumann) is precisely a measure of such a disparity.

In this paper we go one step further and show that the existence of a disparity is not due to some kind of “physical propensity” that would make irreversible processes more likely to unfold in one direction than in the opposite one (an explanation that would lead to a circular argument), but to the intrinsic asymmetry that exists between prediction and retrodiction in inferential logic. We thus conclude that the foundations of the Second Law are not to be found within physics, but one step below, at the level of logic.
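To make the asymmetry between prediction and retrodiction a little more concrete, here is a minimal sketch in Python. It is only an illustration of the Bayes–Laplace Rule on a two-state toy process, not the construction used in the paper: the transition matrix `phi`, the two priors, and the helper `retrodict` are all made-up names and numbers. The point it shows is that prediction needs only the forward process, whereas retrodiction also depends on a prior.

```python
import numpy as np

# Hypothetical forward process: phi[y, x] = P(y | x); each column sums to 1.
phi = np.array([[0.9, 0.2],
                [0.1, 0.8]])

def retrodict(phi, prior):
    """Bayes-Laplace inverse of phi: returns R[x, y] = P(x | y) for a prior on x."""
    joint = phi * prior              # joint[y, x] = P(y | x) P(x)
    evidence = joint.sum(axis=1)     # P(y), summed over x
    return (joint / evidence[:, None]).T

# Two illustrative priors on the input x (assumptions, not data).
prior_a = np.array([0.5, 0.5])
prior_b = np.array([0.9, 0.1])

# Prediction is fixed by phi alone; retrodiction changes with the prior.
retro_a = retrodict(phi, prior_a)
retro_b = retrodict(phi, prior_b)

print("retrodiction with prior_a:\n", retro_a)
print("retrodiction with prior_b:\n", retro_b)
```

Both `retro_a` and `retro_b` are valid conditional distributions (their columns sum to one), yet they differ, even though the forward process `phi` is the same in both cases.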

A nice little piece written by CQT/NUS outreach is also available here.

Colloquium on Bayesian retrodiction in statistical physics

One month ago I gave a colloquium at the 13th Annual Symposium of the Centre for Quantum Technologies (CQT) in Singapore. I decided to speak about my recent work on the role of Bayesian retrodiction in statistical mechanics — more precisely, in the conceptual foundations underlying fluctuation relations and the second law of thermodynamics.

Occasional references to the yin, the yang, and baseball bats all refer to the previous colloquium by Ruth Kastner (also available on YouTube).

The preprint is available on the arXiv: https://arxiv.org/abs/2009.02849

“Quantumness” always happens in time—and needs to be programmable


Incompatibility of quantum measurements lies at the core of nearly all quantum phenomena, from Heisenberg’s Uncertainty Principle, to the violation of Bell inequalities, and all the way up to quantum computational speed-ups.  Historically, quantum incompatibility has been considered only in a qualitative sense.  However, recently various resource-theoretic approaches have been proposed that aim to capture incompatibility in an operational and quantitative manner.  Previous results in this direction have focused on particular subsets of quantum measurements, leaving large parts of the total picture uncharted.

A paper that I wrote together with Eric Chitambar and Wenbin Zhou, published yesterday in Physical Review Letters, proposes the first complete solution to this problem by formulating a resource theory of measurement incompatibility that allows free convertibility among all compatible measurements.  As a result, we are now able to explain quantum incompatibility in terms of quantum programmability; namely, the ability to switch on the fly between incompatible measurements is seen as a resource.  From this perspective, quantum measurement incompatibility is intrinsically a dynamical phenomenon that reveals itself in time as we try to control the system.

Read about this in Physical Review Letters or, for free, on the arXiv.

Is the Heisenberg picture propagating operators “backwards in time”?

A recent arXiv post ignited an interesting discussion with students and colleagues, demonstrating once more how the Heisenberg picture in quantum mechanics can easily be misunderstood to the point of becoming almost paradoxical. Here I intend to briefly summarize what I think may be the crux of the problem (or problems). The argument below follows a discussion on the topic that I had with Masanao Ozawa a few years ago; however, any error or misunderstanding in it is to be entirely attributed to me.

One-step evolutions

Suppose that we are following the evolution of a quantum system from an initial time t=t_0 to a later time t=t_1\ge t_0 , and that the unitary operator evolving the state of the system is U(t_1,t_0), so that

\rho(t_1)=U(t_1,t_0) \rho(t_0) U(t_1,t_0)^\dagger.

The latter is called the Schrödinger picture of the evolution. In this picture, states evolve in time, while observables (like the Hamiltonian) do not.

The Heisenberg picture is meant to do the opposite: it keeps states “frozen”, while observables evolve. It can also be understood as a “pullback” operation: very much like when one looks at a rotation from the viewpoint of vectors (Schrödinger picture) or the viewpoint of the coordinate system (Heisenberg picture).

For the two pictures to give consistent predictions, that is, Tr[\rho(t_1)\ H(t_0)]=Tr[\rho(t_0)\ H(t_1)], it is prescribed that, if an observable at time t_0 is denoted as H(t_0), the same observable at the later time will be H(t_1)=U(t_1,t_0)^\dagger H(t_0) U(t_1,t_0). From this relation, we see that the state evolves according to U(t_1,t_0), while the observable evolves according to U(t_1,t_0)^\dagger .
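The consistency condition above is easy to check numerically. The following is a minimal sketch, assuming a randomly drawn three-level system: the helper names (`random_unitary`, `random_state`, `random_observable`) and the variables `U10`, `rho0`, `H0` are illustrative stand-ins for U(t_1,t_0), \rho(t_0) and H(t_0), not anything taken from the discussion itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_complex(dim, rng):
    """Square matrix with independent complex Gaussian entries."""
    return rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

def random_unitary(dim, rng):
    """A unitary obtained from the QR decomposition of a random matrix."""
    q, _ = np.linalg.qr(random_complex(dim, rng))
    return q

def random_state(dim, rng):
    """A density matrix: positive semidefinite with unit trace."""
    a = random_complex(dim, rng)
    rho = a @ a.conj().T
    return rho / np.trace(rho)

def random_observable(dim, rng):
    """A Hermitian matrix."""
    a = random_complex(dim, rng)
    return (a + a.conj().T) / 2

dim = 3
U10 = random_unitary(dim, rng)      # stands for U(t_1, t_0)
rho0 = random_state(dim, rng)       # stands for rho(t_0)
H0 = random_observable(dim, rng)    # stands for H(t_0)

rho1 = U10 @ rho0 @ U10.conj().T    # Schrödinger picture: the state evolves
H1 = U10.conj().T @ H0 @ U10        # Heisenberg picture: the observable evolves

schrodinger_value = np.trace(rho1 @ H0).real
heisenberg_value = np.trace(rho0 @ H1).real
print(schrodinger_value, heisenberg_value)
```

The two printed expectation values agree up to floating-point error, as the cyclicity of the trace guarantees.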



It is quite tempting at this point to interpret this by saying that “states evolve forward in time, while observables evolve backwards in time”. If only two times are considered, that seems just a curious though innocuous way of phrasing it. Indeed, I have heard a lot of researchers explain the Heisenberg picture this way. I myself would have nodded my head hearing this some years ago. However, I now see why this interpretation can in fact be very confusing, potentially leading to wrong calculations, when more than two times are considered.

Two-step evolutions: the wrong approach

Imagine now that we fix three instants in time, t_0\le t_1\le t_2 , and two unitary operators: one, U(t_1,t_0), describing the evolution of states from t_0 to t_1 as before; and another one, U(t_2,t_1), propagating states from t_1 to t_2 . The problem is: how should one model the evolution of an observable H from t_1 to t_2 ? A naive guess based on the “backwards-in-time evolution” intuition would suggest a scheme like the following:


But what should be the evolution operator describing the box denoted by question marks? As the arXiv post mentioned at the beginning of this post argues, one could be tempted to say that the right evolution operator is U(t_2,t_1)^\dagger, probably by symmetry with the Schrödinger branch evolving forward in time. This naive guess leads to the equation H(t_2)= U(t_2,t_1)^\dagger H(t_1) U(t_2,t_1) = U(t_2,t_1)^\dagger U(t_1,t_0)^\dagger H(t_0) U(t_1,t_0) U(t_2,t_1) .

The problem is that this is, of course, wrong! The correct thing to do is to understand that the total evolution of the state from t_0 to t_2 is given by the unitary operator U(t_2,t_0)=U(t_2,t_1)U(t_1,t_0). Consequently, one has that

H(t_2)= U(t_1,t_0)^\dagger U(t_2,t_1)^\dagger H(t_0) U(t_2,t_1) U(t_1,t_0).

This is the correct description of H(t_2) in the Heisenberg picture.
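The difference between the naive and the correct ordering shows up already for a single qubit. Here is a minimal sketch with a hand-picked example, none of which comes from the discussion above: U(t_1,t_0) is taken to be the Hadamard gate, U(t_2,t_1) the phase gate diag(1, i), H(t_0) the Pauli Z operator, and the initial state the "plus" state.

```python
import numpy as np

def dag(m):
    """Hermitian conjugate."""
    return m.conj().T

# Illustrative choices (assumptions made only for this example).
U10 = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard, for U(t_1, t_0)
U21 = np.diag([1, 1j])                                         # phase gate, for U(t_2, t_1)
H0 = np.diag([1.0, -1.0]).astype(complex)                      # Pauli Z, for H(t_0)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho0 = np.outer(plus, plus.conj())                             # initial state rho(t_0)

# Schrödinger picture: evolve the state through both steps.
rho2 = U21 @ U10 @ rho0 @ dag(U10) @ dag(U21)
reference = np.trace(rho2 @ H0).real

# Heisenberg picture, correct ordering: pull back along U(t_2, t_0) = U21 U10.
H2_correct = dag(U10) @ dag(U21) @ H0 @ U21 @ U10
correct = np.trace(rho0 @ H2_correct).real

# Heisenberg picture, naive "backwards in time" ordering.
H2_naive = dag(U21) @ dag(U10) @ H0 @ U10 @ U21
naive = np.trace(rho0 @ H2_naive).real

print(reference, correct, naive)
```

With these choices the correct expression reproduces the Schrödinger-picture expectation value, which equals 1, while the naive ordering gives 0. The two orderings coincide only when the two propagators commute.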

Another, more subtle, source of confusion

We have seen how the naive “backwards in time” interpretation is wrong. However, at this point, another structure emerges that still suggests some kind of “time-reversal”. I am speaking now of the fact that, in the correct equation, that is, H(t_2)= U(t_1,t_0)^\dagger U(t_2,t_1)^\dagger H(t_0) U(t_2,t_1) U(t_1,t_0), the order of the propagators is reversed with respect to the one that is used for states, that is \rho(t_2)= U(t_2,t_1) U(t_1,t_0) \rho(t_0)U(t_1,t_0)^\dagger U(t_2,t_1)^\dagger.

Given that the equation itself is correct, in what follows I am simply criticizing its interpretation. I would like to argue, in particular, that, even though the evolution operators act in reverse order on the observable, the Heisenberg picture should not (or, at least, need not) be interpreted or explained as “backwards in time” evolution.

The point is that U(t_2,t_1), on its own, has no meaning in the Heisenberg picture. In the Heisenberg picture, all operators must be evolved consistently. In particular, the operator U(t_2,t_1), which is defined formally at t=t_0 , when applied at time t_1 , must also be consistently evolved before being applied on anything. (Better said, the Hamiltonian generating the unitary evolves in time and, with it, the unitary operator it generates.)

Hence, in the Heisenberg picture, the propagator of observables from t_1 to t_2 is not U(t_2,t_1)^\dagger but its evolved version, that is,

\tilde U(t_2,t_1)^\dagger=U(t_1,t_0)^\dagger  U(t_2,t_1)^\dagger U(t_1,t_0).

If we substitute this into the initial formula, then we indeed obtain that H(t_2)=\tilde U(t_2,t_1)^\dagger H(t_1) \tilde U(t_2,t_1)=U(t_1,t_0)^\dagger  U(t_2,t_1)^\dagger H(t_0) U(t_2,t_1) U(t_1,t_0) , as computed in the two-step evolution above.


However, once written as above, it gives us a very clear understanding of what is going on in the Heisenberg picture.
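The role of the evolved propagator can also be checked numerically. The sketch below assumes randomly drawn unitaries on a three-level system; `U10`, `U21`, `U21_tilde` and `H0` are illustrative names for U(t_1,t_0), U(t_2,t_1), \tilde U(t_2,t_1) and H(t_0). It verifies that stepping the observable from t_1 to t_2 with the evolved propagator gives the same result as the direct pullback from t_0 to t_2.

```python
import numpy as np

rng = np.random.default_rng(7)

def dag(m):
    """Hermitian conjugate."""
    return m.conj().T

def random_complex(dim, rng):
    """Square matrix with independent complex Gaussian entries."""
    return rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

def random_unitary(dim, rng):
    """A unitary obtained from the QR decomposition of a random matrix."""
    q, _ = np.linalg.qr(random_complex(dim, rng))
    return q

dim = 3
U10 = random_unitary(dim, rng)      # stands for U(t_1, t_0)
U21 = random_unitary(dim, rng)      # stands for U(t_2, t_1)
a = random_complex(dim, rng)
H0 = (a + dag(a)) / 2               # a Hermitian observable, for H(t_0)

# First step: the observable at time t_1.
H1 = dag(U10) @ H0 @ U10

# The propagator U(t_2, t_1) must itself be evolved to time t_1.
U21_tilde = dag(U10) @ U21 @ U10

# Second step, forward in time, using the evolved propagator.
H2_stepwise = dag(U21_tilde) @ H1 @ U21_tilde

# Direct pullback along the total evolution U(t_2, t_0) = U21 U10.
H2_direct = dag(U10) @ dag(U21) @ H0 @ U21 @ U10

print(np.allclose(H2_stepwise, H2_direct))
```

The check succeeds for any choice of unitaries, since the factors U(t_1,t_0) U(t_1,t_0)^\dagger cancel in the middle of the product.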

Summarizing, the Heisenberg picture is indeed a pullback transformation, but a pullback that happens forward in time. After all, both Heisenberg and Schrödinger pictures provide equivalent representations of exactly the same process, which of course happens forward in time.

Birkhoff-von Neumann Prize

I was delighted to learn that I have been awarded the “Birkhoff-von Neumann Prize” by the International Quantum Structures Association. I feel very honored and humbled (at once!) to join a list including such superb colleagues. Thank you very much!

Francesco Buscemi is Associate Professor at the Department of Mathematical Informatics of Nagoya University, Japan. His results solved some long-standing open problems in the foundations of quantum physics, using ideas from mathematical statistics and information theory. He established, in a series of single-authored papers, the theory of quantum statistical morphisms and quantum statistical comparison, generalizing to the noncommutative setting some fundamental results in mathematical statistics dating back to works of David Blackwell and Lucien Le Cam. In particular, Prof. Buscemi successfully applied his theory to construct the framework of “semiquantum nonlocal games,” which extend Bell tests and are now widely used in theory and experiments to certify, in a measurement device-independent way, the presence of non-classical correlations in space and time.

On such an occasion, it is impossible not to remember Professor Paul Busch, gentleman scientist, President of IQSA until his sudden death, of which I learned almost simultaneously with my award.

The “No-Hypersignaling Principle”

An important consequence of special relativity, in particular, of the constant and finite speed of light, is that space-like separated regions in spacetime cannot communicate. This fact is often referred to as the “no-signaling principle” in physics.


However, even when signaling is in fact possible, there still are obvious constraints on how signaling can occur: for example, by sending one physical bit, no more than one bit of information can be communicated; by sending two physical bits, no more than two bits of information can be communicated; and so on. Such extra constraints, which by analogy we call “no-hypersignaling,” are not dictated by special relativity, but by the physical theory describing the system being transmitted. If the physical bit is described by classical theory, then the no-hypersignaling principle is true by definition. It is not so in quantum theory, where the validity of the no-hypersignaling principle becomes a non-trivial mathematical theorem relying on a recent result by Péter E. Frenkel and Mihály Weiner (whose proof, using the “supply-demand theorem” for bipartite graphs, is very interesting in itself).

As one may suspect, the no-hypersignaling principle does not hold in general: it is possible to construct artificial worlds in which the no-hypersignaling principle is violated. Such worlds are close relatives of the “box world,” a toy-model theory used to describe conceptual devices called Popescu-Rohrlich boxes. Exploring such alternative box worlds, one further discovers that the no-hypersignaling principle is logically independent of both the conventional no-signaling principle and the information causality principle, however related these two may seem to be to no-hypersignaling.

This means that the no-hypersignaling principle needs to be either assumed from the start, or derived from presently unknown physical principles analogous to the finite and constant speed of light behind Einstein’s no-signaling principle.

The paper was published in Physical Review Letters, but is also available free of charge on the arXiv.