Bayes’ rule and Jeffrey’s updating from the principle of minimum change

January 28, 2026 · February 28, 2026 · francesco

What follows is, to the best of my understanding, largely folklore. Variants of the argument appear implicitly across the literature on Bayesian updating, relative entropy, and consistent inference, and the conclusion is often taken for granted once one is familiar with Jeffrey updating. However, I have not been able to find a place where the reasoning is laid out explicitly and in a self-contained way, starting from minimal assumptions and making clear what is, and is not, being imposed. For that reason, I am recording this note here, both for my own reference and in the hope that it may be useful to others. A PDF version containing the same material, with slightly improved formatting, is available here.

Let $X$ and $Y$ be finite-valued quantities (random variables) about which we wish to reason. Our prior information is summarized by a joint distribution

P(x,y)=p(x)\,\varphi(y|x),

where $p(x)$ represents our prior state of knowledge about $X$, and $\varphi(y|x)$ encodes the likelihood. No interpretation beyond this bookkeeping role is assumed.

From $P$ we may compute the implied marginal distribution $P_Y(y)=\sum_x p(x)\,\varphi(y|x)$ and, by the product rule (I wouldn’t call this “Bayes’ theorem” yet), the inverse conditional distribution

\hat\varphi(x|y)=P(x|y)=\frac{p(x)\,\varphi(y|x)}{P_Y(y)}.

Now suppose that new information becomes available which does not specify a particular value of $Y$, but instead constrains our revised beliefs about $Y$ to take the form of a probability distribution $\tau(y)$. The problem is then to determine what joint distribution $R(x,y)$ should represent our new state of knowledge, given that it must be consistent with $\tau$ and must not contain any information not logically implied by the prior $P$ together with this new constraint.

Principle of minimum change

Following the general principle that probabilities should be updated only to the extent required by new information, we select $R$ to minimize the relative entropy

D(R\|P)=\sum_{x,y}R(x,y)\log\frac{R(x,y)}{P(x,y)},

subject to the constraint $R_Y=\tau$. This criterion ensures that no unwarranted assumptions are introduced.

Solution

Any admissible $R$ may be written as

R(x,y)=\tau(y)\,R(x|y).

Substitution into the relative entropy yields the identity

\begin{aligned}
D(R\|P) &= \sum_{x,y}R(x,y)\log\frac{R(x,y)}{P(x,y)} \\
&= \sum_{x,y}\tau(y)R(x|y)\left[\log\frac{\tau(y)}{P_Y(y)}+\log\frac{R(x|y)}{\hat\varphi(x|y)}\right] \\
&= \sum_{x,y}\tau(y)R(x|y)\log\frac{\tau(y)}{P_Y(y)}+\sum_{x,y}\tau(y)R(x|y)\log\frac{R(x|y)}{\hat\varphi(x|y)} \\
&= D(\tau\|P_Y)+\sum_{x,y}\tau(y)\,R(x|y)\,\log\frac{R(x|y)}{\hat\varphi(x|y)} \\
&= D(\tau\|P_Y) + \sum_y \tau(y)\, D\big(R(\cdot|y)\,\big\|\,\hat\varphi(\cdot|y)\big).
\end{aligned}

The first term depends only on the revised marginal $\tau$ and the prior marginal $P_Y$, both of which are fixed by the problem. The second term is nonnegative and vanishes if and only if

R(x|y)=\hat\varphi(x|y) \quad\text{for all }x\text{ and all }y\text{ with }\tau(y)>0.

Hence the unique distribution consistent with the stated constraints and the principle of minimum change is $R(x,y)=\tau(y)\,\hat\varphi(x|y)$, wherever $\tau(y)>0$. Note that if $\tau(y)>0$ but $P_Y(y)=0$ for some $y$, the new evidence falsifies our prior. This is signaled by the fact that, in this case, $D(\tau\|P_Y)=+\infty$.
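As a sanity check on the argument above, here is a minimal numerical sketch in Python (NumPy). The numbers chosen for `p`, `phi`, and `tau` are made up for illustration, the function names `kl` and `jeffrey_update` are my own, and full support is assumed so that every logarithm is finite. The sketch builds $R(x,y)=\tau(y)\,\hat\varphi(x|y)$, checks that $D(R\|P)=D(\tau\|P_Y)$, and compares it against randomly drawn competitors with the same marginal $\tau$.

```python
import numpy as np


def kl(a, b):
    """Relative entropy D(a||b) in nats; entries with a == 0 contribute zero."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    mask = a > 0
    return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))


def jeffrey_update(p, phi, tau):
    """Return R[x, y] = tau[y] * P(x|y), where P[x, y] = p[x] * phi[x, y]."""
    joint = p[:, None] * phi          # prior joint P(x, y)
    p_y = joint.sum(axis=0)           # prior marginal P_Y(y)
    posterior = joint / p_y           # inverse conditional; columns sum to 1
    return tau * posterior            # broadcast tau over the columns


# Illustrative numbers (assumptions, not taken from the note).
p = np.array([0.5, 0.3, 0.2])                          # prior on X
phi = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])   # phi[x, y] = phi(y|x)
tau = np.array([0.25, 0.75])                           # revised marginal on Y

joint = p[:, None] * phi
R = jeffrey_update(p, phi, tau)

# Random competitors with the same Y-marginal: record how much worse they do.
rng = np.random.default_rng(0)
excess = []
for _ in range(200):
    cond = rng.random((3, 2))
    cond /= cond.sum(axis=0)          # a random conditional R(x|y)
    excess.append(kl(tau * cond, joint) - kl(R, joint))

print("D(R||P)     =", kl(R, joint))
print("D(tau||P_Y) =", kl(tau, joint.sum(axis=0)))
print("min excess  =", min(excess))
```

The first two printed numbers should agree up to rounding, and the smallest excess should be positive. This is of course only a spot check on one example, not a substitute for the identity derived above.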

Interpretation

The new information alters only our beliefs about $Y$; therefore, rational consistency requires that our conditional beliefs about $X$ given $Y$ remain exactly those implied by the prior. Any other choice would amount to smuggling in additional information not contained in the premises.
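To connect this with the two names in the title, it may help to write out the marginal on $X$ explicitly. The symbols $R_X$ and $y_0$ are introduced here only for this remark. Summing the minimum-change solution $R(x,y)=\tau(y)\,\hat\varphi(x|y)$ over $y$ gives

R_X(x)=\sum_y \tau(y)\,\hat\varphi(x|y)=\sum_y \tau(y)\,\frac{p(x)\,\varphi(y|x)}{P_Y(y)},

which is Jeffrey's rule. If the new information happens to be sharp, that is, $\tau(y)=\delta_{y,y_0}$ for some value $y_0$ with $P_Y(y_0)>0$, the sum collapses to a single term,

R_X(x)=\hat\varphi(x|y_0)=\frac{p(x)\,\varphi(y_0|x)}{P_Y(y_0)},

which is the usual Bayes' rule.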

What does “macroscopic” really mean in quantum theory?

December 16, 2025 · February 28, 2026 · francesco

One of the persistent difficulties in discussing the emergence of macroscopic irreversibility from microscopic quantum dynamics is that we often talk about “macroscopic states”, “macroscopic operations”, or “macroscopic correlations” without a fully satisfactory definition of what these terms actually mean.

This paper, written together with Teruaki Nagasawa, Eyuri Wakakuwa, and Kohtaro Kato, grew out of a very concrete problem related to the second law of thermodynamics. In earlier work, we showed that for an isolated quantum system the macroscopic entropy does not increase if and only if the system remains in a macroscopic state throughout its evolution. The problem was that, at that stage, “macroscopic state” was defined only implicitly. This made it hard to push the discussion further, because it was unclear what structural, algebraic, or operational features such states really had.

The goal of this work was therefore not to introduce yet another entropy, but to understand macroscopicity itself in a precise way. We wanted a definition that was algebraic, constructive, and operational, and that could serve as a solid foundation for discussing emergence.

Macroscopicity as an inferential boundary

One of the main messages of the paper is that the divide between “macro” and “micro” is not absolute. It depends on the observer. This statement should be taken in a very literal and modest sense.

By an observer we do not mean anything related to the measurement problem or consciousness. We simply mean a specification of which physical quantities can be simultaneously measured. These are what von Neumann already called macroscopic observables. Before discussing emergence, one must decide which variables one is actually focusing on.

Once this choice is made, the distinction between macroscopic and microscopic degrees of freedom coincides exactly with a boundary between what can be inferred and what cannot be inferred. Given the measured variables and some prior information, certain microscopic details can be retrodicted from macroscopic data, while others are irretrievably lost. What counts as macroscopic is determined by this inferential boundary. It is not that macroscopic variables “emerge” from microscopic ones that are “hidden” underneath. Rather, the choice of macroscopic variables determines what is macro and what is micro.

Interestingly, this inferential and retrodictive perspective was not imposed from the outside. It emerged naturally from the mathematics. In hindsight, this was quite telling, especially in light of recent work we did on the role of prediction and retrodiction in fluctuation relations.

From observational entropy to observational deficit

Observational entropy already captures part of this story. It explains how entropy can increase under unitary dynamics when one restricts attention to macroscopic measurements. However, in its standard formulation it relies on a uniform prior.

This is a serious limitation. In thermodynamic settings, a thermal prior is often the natural choice. In infinite-dimensional systems, the uniform state may not even exist. For this reason, we introduced the notion of observational deficit, defined relative to an arbitrary prior. Conceptually, the observational deficit measures how much information about a state is lost when one moves from a microscopic description to macroscopic data, taking into account the observer’s prior knowledge.

Macroscopic states are then precisely those for which this deficit vanishes. They are the states that can be perfectly retrodicted from macroscopic data alone.
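A purely classical toy version may make this more concrete. The sketch below is only an illustration under stated assumptions: I take the deficit to be the drop in relative entropy to the prior caused by the measurement, $D(p\|\gamma)-D(Mp\|M\gamma)$, which is my reading of the idea rather than a quotation of the paper's definition, and the names `observational_deficit` and `retrodict`, as well as all the numbers, are hypothetical. Four microstates are lumped into two macrostates, and the prior is deliberately non-uniform.

```python
import numpy as np


def kl(a, b):
    """Relative entropy D(a||b) in nats; entries with a == 0 contribute zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mask = a > 0
    return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))


def observational_deficit(p, prior, meas):
    """Assumed form: D(p||prior) - D(M p || M prior), with meas[x, y] = P(y|x)."""
    return kl(p, prior) - kl(p @ meas, prior @ meas)


def retrodict(p, prior, meas):
    """Bayesian retrodiction of the microstate from the outcome statistics of p."""
    outcome = p @ meas                               # what the observer sees
    reverse = prior[:, None] * meas / (prior @ meas) # reverse[x, y] = P(x|y)
    return reverse @ outcome


# Coarse-graining: microstates {0, 1} -> macrostate 0, {2, 3} -> macrostate 1.
meas = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
prior = np.array([0.4, 0.2, 0.3, 0.1])               # a non-uniform prior

p_micro = np.array([0.1, 0.2, 0.3, 0.4])             # generic state
p_macro = retrodict(p_micro, prior, meas)            # its retrodicted version

print("deficit of p_micro:", observational_deficit(p_micro, prior, meas))
print("deficit of p_macro:", observational_deficit(p_macro, prior, meas))
```

In this toy model the retrodicted state has zero deficit and is a fixed point of `retrodict`, while the generic state has a strictly positive deficit. The quantum statement in the paper is of course much richer than this.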

Inferential reference frames and the MPPP

A central technical result of the paper is the existence and uniqueness of what we call the maximal projective post-processing (MPPP) of a measurement, relative to a given prior. This object plays a key conceptual role.

The result shows that any observer, defined by a measurement and a prior, can be represented by a suitable projective measurement. This projective measurement captures exactly the information that is inferentially accessible. For this reason, we interpret it as an inferential reference frame.

Just as a symmetry reference frame distinguishes between speakable and unspeakable information, an inferential reference frame distinguishes between what can be retrodicted from macroscopic data and what cannot. It is this structure that decides what counts as macroscopic and what counts as microscopic for a given observer.

A resource theory of microscopicity

Once macroscopic states are clearly identified, it becomes natural to ask what makes one state more microscopic than another. Answering this question requires more than a classification. It requires an operational framework.

This is why we developed a resource theory of microscopicity. In this theory, macroscopic states are free states, and microscopicity is the resource. The framework forces one to think in terms of operations that do not generate microscopicity and to clarify what is really at stake when microscopic details matter.

An important outcome is that several well-known resource theories appear as special cases. Coherence, athermality, and asymmetry all fit naturally into this framework once appropriate choices of measurements and priors are made. Seeing these theories as instances of microscopicity provides a unified operational interpretation, and sheds light, for example, on the distinction between speakable and unspeakable coherence.

Observer-dependent correlations

The same perspective can be applied to correlations. Entanglement, discord, and related notions are often treated as absolute properties of states. Our results support a different view. The visibility and usefulness of correlations depend on the observer’s inferential reference frame.

Correlations are, after all, correlations among observers, so it is entirely natural to ask what it means for correlations themselves to be macroscopic. What matters is not an abstract, observer-independent property of a state, but which correlations are actually accessible given a specific inferential reference frame. From this perspective, it is natural to start from what correlations look like from the viewpoint of a single observer with limited access, since this already determines which inter-observer correlations can meaningfully be discussed. Our framework provides a starting point for such an analysis, and may eventually lead to something like a relativity theory for inferential reference frames.

What this enables

At a broad level, this work provides a rigorous solution to a basic mathematical problem: what does it mean for a state, an operation, or a correlation to be macroscopic?

Without a precise answer to this question, discussions of the emergence of macroscopic behavior risk remaining vague. With it, we can finally say what we are talking about. This does not solve the problem of emergence by itself, but it clarifies the language and the structure needed to address it in a meaningful way.

The paper is titled Macroscopicity and observational deficit in states, operations, and correlations and appeared in Reports on Progress in Physics earlier this year. It is available for free on the arXiv.

How do you update your quantum beliefs?

October 8, 2025 · November 11, 2025 · francesco

Bayes’ rule is often introduced in textbooks through examples involving urns filled with colored balls. It is striking, however, that the same formula works just as well in situations totally unrelated to urns or counting. In fact, Bayes’ rule applies also to how we update beliefs, learn from data, and draw inferences about the world. The fact that its success extends far beyond probability puzzles suggests that Bayes’ rule captures something fundamental about rational reasoning itself.

Indeed, there are several ways to justify Bayes’ rule as the only possible rule of consistent updating. De Finetti, Cox, Jeffreys, Jeffrey, and many others argued that, if an agent deviates from Bayes’ rule, they open themselves to “attacks” that would make them “lose” (money, time, resources, etc.) with probability one in the long run. However, all such approaches rely on axioms, and while some axioms may seem natural and acceptable to one researcher, they may not seem as convincing to another.

Interestingly, however, there is another way to justify Bayes’ rule (similar, in fact, to Jaynes’ maximum entropy principle): the principle of minimum change. When new data arrive, we should only revise our prior beliefs as much as needed to remain consistent with the new evidence. This is a conservative stance toward knowledge that avoids bias by preferring the smallest possible adjustment compatible with the facts.

In a work recently published in Physical Review Letters, together with Ge Bai and Valerio Scarani, we asked how this principle might extend to the quantum world. Quantum systems do not possess definite properties before measurement, and probabilities are replaced by density matrices that describe our partial knowledge of outcomes. Updating such knowledge is therefore not straightforward. What does it mean to change a quantum state “as little as possible” while incorporating new information?

To address this, we reformulated the minimum change principle directly at the level of quantum processes rather than their individual states. We used quantum fidelity to measure how similar two processes are and searched for the update that maximizes this similarity. The result turned out to coincide, in many important cases, with a well-known mathematical transformation in quantum information theory called the Petz recovery map.

This finding provides a new interpretation of the Petz map: it is not merely a technical tool, but rather, the natural quantum analogue of Bayes’ rule. Of all the possible updates, this one changes our quantum description in the smallest consistent way. This explains why, despite often being described as merely “pretty good,” the Petz map keeps reappearing in many different contexts. It has been independently rediscovered in quantum error correction, statistical mechanics, and even quantum gravity because it expresses the same logic that makes Bayes’ rule universal. It is not an approximation, but rather, the quantum form of rational inference itself.
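To see in what sense the Petz map generalizes Bayes’ rule, here is a small numerical sketch in Python (NumPy). It uses the standard textbook form of the Petz map for a channel $\mathcal{E}$ and a prior state $\gamma$, namely $\sigma\mapsto\gamma^{1/2}\,\mathcal{E}^\dagger\big(\mathcal{E}(\gamma)^{-1/2}\,\sigma\,\mathcal{E}(\gamma)^{-1/2}\big)\,\gamma^{1/2}$. The helper names, the Kraus-operator encoding of a classical likelihood, and the numbers are my own illustrative choices. In this fully commuting example the output should be the ordinary Bayesian posterior.

```python
import numpy as np


def herm_power(a, power, tol=1e-12):
    """Power of a positive semidefinite Hermitian matrix, taken on its support."""
    w, v = np.linalg.eigh(a)
    wp = np.zeros_like(w)
    mask = w > tol
    wp[mask] = w[mask] ** power
    return (v * wp) @ v.conj().T


def apply_channel(kraus, rho):
    """E(rho) = sum_k K rho K^dagger."""
    return sum(k @ rho @ k.conj().T for k in kraus)


def apply_adjoint(kraus, x):
    """E^dagger(x) = sum_k K^dagger x K."""
    return sum(k.conj().T @ x @ k for k in kraus)


def petz_map(kraus, gamma, sigma):
    """Petz map of the channel `kraus` with respect to the prior state `gamma`."""
    g_half = herm_power(gamma, 0.5)
    out_inv_half = herm_power(apply_channel(kraus, gamma), -0.5)
    inner = apply_adjoint(kraus, out_inv_half @ sigma @ out_inv_half)
    return g_half @ inner @ g_half


# Classical likelihood phi[x, y] = phi(y|x), encoded as Kraus operators
# K_{yx} = sqrt(phi(y|x)) |y><x|  (illustrative numbers).
p = np.array([0.5, 0.3, 0.2])
phi = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])
kraus = []
for x in range(3):
    for y in range(2):
        k = np.zeros((2, 3))
        k[y, x] = np.sqrt(phi[x, y])
        kraus.append(k)

gamma = np.diag(p)                 # prior as a diagonal density matrix
y0 = 1
sigma = np.zeros((2, 2))
sigma[y0, y0] = 1.0                # sharp observation of outcome y0

posterior = petz_map(kraus, gamma, sigma)
bayes = p * phi[:, y0] / (p @ phi)[y0]

print("Petz output (diagonal):", np.diag(posterior))
print("Bayes posterior       :", bayes)
```

The two printed vectors should coincide. The interesting content of the paper lies in the non-commuting case, which this sketch does not probe.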

The quantum version of Bayes’ rule does more than provide a new mathematical identity. It offers a systematic way to reason about quantum systems, to retrodict past states from observed data, and to quantify how much information about the past is lost over time. This connects directly to our previous studies on observational entropy and retrodictability, where the second law of thermodynamics emerges as a statement about the progressive loss of our ability to reconstruct the past.

Seen from this perspective, learning, inference, and even entropy growth are aspects of a single story about how information evolves. The quantum Bayes’ rule sits at the centre of that story, providing a bridge between classical reasoning and the probabilistic structure of quantum theory. A rule that is often illustrated with urns and balls finds a new form in quantum theory, where neither urns nor balls exist. This perhaps suggests that rational updating is not tied to any particular physical model (classical, quantum, etc.) but expresses a deeper logic of information itself.

Causal vs noncausal information revivals: a fresh take on quantum non-Markovianity

April 24, 2025 · May 1, 2025 · francesco

Two things have long bothered me about the way we discuss quantum non-Markovianity. First, that the concept fails to exclude classical processes, so that the use of the word “quantum” seems unjustified. Second, and related to the first but somewhat more puzzling, that a mere mixture of two Markovian processes can appear non-Markovian. Such a non-Markovianity surely shouldn’t be taken too seriously. This disconnect has persisted despite years of work, subtly undermining the foundations of our understanding.

In our new paper, Causal and Noncausal Revivals of Information: A New Regime of Non-Markovianity in Quantum Stochastic Processes, just published in PRX Quantum (open access) together with Rajeev Gangwar, Kaumudibikash Goswami, Himanshu Badhani, Tanmoy Pandit, Brij Mohan, Siddhartha Das, and Manabendra Nath Bera, we finally provide a satisfying resolution. By distinguishing between causal and noncausal information revivals, we create a conceptual and operational framework that clarifies what non-Markovianity in quantum systems really means.

The Puzzle of Mixtures and Backflows

Let’s begin with the basics. A process is Markovian if its future evolution depends only on the present state, not the past. In quantum dynamics, this notion gets murky when the system is not isolated but interacts with an environment. Textbooks typically assume that the system and its environment start uncorrelated: a neat assumption, but physically unrealistic. Initial correlations are inevitable, especially in strongly coupled systems. (In fact, the first post on this blog was about this very topic.)

This led to a long line of research investigating whether quantum channels can still describe the evolution of a subsystem even when initial correlations are present. While various conditions and formalisms were proposed (notably assignment maps and quantum discord), none satisfactorily addressed the oddity that a convex combination of Markovian processes can appear non-Markovian. This is particularly troubling if we hope to construct a resource theory of non-Markovianity, where convexity is essential.

Revivals Are Not Always Backflows

Our central insight is that information revivals, which we model as increases in the mutual information between a system and an ancillary reference after an interaction with the environment, can come in two flavors: causal and noncausal. Causal revivals correspond to genuine backflows of information from the environment to the system. Noncausal revivals, however, can be explained entirely using degrees of freedom that never interacted with the system at all.

This subtle distinction allows us to separate apparent non-Markovianity (which can be faked by a clever model with hidden correlations) from genuinely quantum non-Markovianity (which requires a true causal connection).

To formalize this, we bring in tools from quantum information theory: squashed entanglement, conditional mutual information, and Petz’s theory of statistical sufficiency. We prove that if a revival can be explained without violating the data-processing inequality for some inert extension (i.e., hidden, non-interacting degrees of freedom), then it is noncausal. In contrast, causal revivals must involve a flow of information from the environment.

Why This Matters

Distinguishing causal from noncausal revivals does more than just clean up conceptual confusion. It has direct implications for experiments and applications. For instance, we provide operational conditions, depending only on system observables, to detect genuine information backflow. This opens the door to device-independent verification of quantum memory effects.

Moreover, we show that anything that is not genuine non-Markovianity, thus including noncausal revivals, forms a convex set. In other words, if you take the convex mixture of processes without genuine non-Markovianity, you get another process without genuine non-Markovianity: genuine non-Markovianity, unlike conventional non-Markovianity, cannot simply be generated by classical randomization. This is a very welcome feature: convexity is a cornerstone of resource theories, and it means that we can now meaningfully talk about genuine non-Markovianity as a resource in quantum information processing.

Maxwell’s demon, quantum theory, and the second law: who’s really in charge?

February 7, 2025 · francesco

It is often said that the paradox of “Maxwell’s demon” is resolved by including the thermodynamic costs of measurement and memory erasure. The common claim is that these costs offset the demon’s apparent violation of the second law of thermodynamics, thereby restoring its validity. The truth, however, is quite the opposite: the fact that the costs of measurement and erasure offset the demon’s violation is a consequence of assuming that the second law is valid in the first place. In other words, it is the assumption of the validity of the second law that forces this equilibrium, not the other way around!

This realization, which echoes the thesis of Earman and Norton, came while studying some not-so-recent-anymore but still quite influential papers on quantum feedback protocols in quantum thermodynamics. While the narrative there adhered to the usual folklore on the subject (i.e., that quantum feedback protocols obey the second law whenever the thermodynamic costs of measurement and erasure are properly accounted for), what we found instead was quite different: quantum theory is in fact completely independent of the second law of thermodynamics! And it couldn’t be otherwise, simply because quantum theory has no built-in thermodynamics. There is no “quantum bullet” to exorcise Maxwell’s demon. In fact, in our work we explicitly constructed a model of a measurement and feedback process that violates the bounds required by the second law, even after including all costs (i.e., measurement and erasure costs) in the thermodynamic balance.

However, we also found that although quantum theory can violate the second law, it does not need to: any quantum process can be realized in a way that does not violate the second law of thermodynamics, simply by adding more systems (bath, battery, and so on) until the thermodynamic balance is restored.

There are two main takeaways from this story. First, even in quantum theory, it is the second law that guarantees the balance between gains and costs in feedback protocols, not the other way around. Second, although quantum theory and the second law of thermodynamics are logically independent, they can peacefully coexist. The second law of thermodynamics does not impose any hard constraints on what can be achieved in quantum theory: there is always a way to ensure compliance with the second law.

In conclusion, by no means can we claim that quantum theory is “demon-proof” by design. However, we now have a much clearer understanding of how quantum feedback protocols work, and of what they can and cannot do. The paper appeared today in npj Quantum Information, but I like the look of the arXiv version better.

Quantum theory can exorcise Laplace’s demon. But Maxwell’s demon is still lurking…

Generalizing Observational Entropy for Complex Systems

November 15, 2024 · francesco

Work in collaboration with Ge Bai and Valerio Scarani (NUS, Singapore), Dom Šafránek (IBS, Korea) and Joe Schindler (UAB, Spain), published today in Quantum.

In his 1932 book, von Neumann not only introduced the now familiar von Neumann entropy, but also discussed another entropic quantity that he called “macroscopic”. He argued that this macroscopic entropy, rather than the von Neumann entropy, is the key measure for understanding thermodynamic systems. Here we revisit and extend a concept derived from von Neumann’s macroscopic entropy, observational entropy (OE) — an information-theoretic quantity that measures both the intrinsic uncertainty of a system and the additional uncertainty introduced by the measurement we use to observe it.

Typically, the definition of OE assumes a “uniform prior,” i.e., it starts with an assumption of maximum uncertainty about the state of the system. However, this assumption is not always tenable, especially in more complex settings such as systems subject to energy constraints or infinite-dimensional systems, where other priors, such as the Gibbs distribution, would be preferable both physically and mathematically.

Measurement is our window to the microscopic world, but it’s like a stained glass window: what’s on the other side looks distorted and coarse-grained. That’s where the second law comes in.

To fill this gap and extend OE to arbitrary priors, we first show how OE can be interpreted in two ways: as a measure of how much a measurement scrambles the true state of a system (statistical deficiency), and as the difficulty of inferring the original state from the measurement results (irretrodictability). These two aspects provide complementary insights into how much we lose or gain in our knowledge of the original state of a system when we make observations on it.

This conceptual insight leads us to introduce three generalized versions of OE: two that capture either statistical deficiency or irretrodictability, but are inherently incompatible; and a third, based on Belavkin-Staszewski relative entropy, that instead is able to combine both perspectives and provide a unified view of commuting and non-commuting priors alike. We expect that our results will pave the way for a consistent treatment of the second law of thermodynamics and fluctuation relations in fully quantum scenarios. From the abstract:

Observational entropy captures both the intrinsic uncertainty of a thermodynamic state and the lack of knowledge due to coarse-graining. We demonstrate two interpretations of observational entropy, one as the statistical deficiency resulting from a measurement, the other as the difficulty of inferring the input state from the measurement statistics by quantum Bayesian retrodiction. These interpretations show that the observational entropy implicitly includes a uniform reference prior. Since the uniform prior cannot be used when the system is infinite-dimensional or otherwise energy-constrained, we propose generalizations by replacing the uniform prior with arbitrary quantum states that may not even commute with the state of the system. We propose three candidates for this generalization, discuss their properties, and show that one of them gives a unified expression that relates both interpretations.

Tim Maudlin about the Gibbsian approach

May 5, 2024 · August 30, 2024 · francesco

Tim Maudlin here is talking about the Gibbsian approach to explaining the second law of thermodynamics. He introduces the “coarse-graining” procedure, but then adds: “but what the hell is going on here? how do you think about these interventions?”

Here is the answer: the observer makes an inference based on what they can see about the system. That’s it. The second law is about the inferential process of the observer, not about the underlying physics. We have written about this in a few papers, and I have given several talks on this idea. It all makes perfect sense to me!

Virtual Quantum Broadcasting

March 28, 2024 · April 5, 2024 · francesco

The new paper “Virtual Quantum Broadcasting” with Arthur Parzygnat, James Fullwood and Giulio Chiribella was published a few days ago in Physical Review Letters. Here are my answers to a number of questions posed by phys.org.

Q: Did any personal motivation or inspiration drive you to pursue this research?

A: Quantum theory is often said to be the most successful theory ever conceived in physics, and yet its formalism cannot predict the outcome of a measurement: it can only predict averages. It’s as if there is a layer beyond which no one can look. This becomes painfully clear when you try to describe quantum phenomena in terms of trajectories: the formalism of quantum theory simply does not allow it. Nevertheless, there are situations where we would like to follow a quantum system — and the correlations in its physical properties — in time, like following its “trajectory”. So should we just give up? Or should we try to push the limits of quantum formalism a bit and see what we can do? More than a decade ago, I and other co-authors took the latter path and proposed the idea of a quantum “time correlator” (link to arXiv), which would allow direct observation of correlations in physical quantities of the same physical system at different times. Much like what happens when we can look at the trajectory of a classical object. Perhaps ten years ago this idea was premature, but in my eyes this paper finally brings it to fruition, expands its scope, and motivates it even further. I sincerely hope that more researchers in quantum information theory will join us on these still relatively unexplored paths.

Q: In simple terms, could you briefly explain your research and its key findings?

A: This paper is rather technical, but its key finding, loosely speaking, is that it is not necessary to give up on time correlations in quantum theory altogether. In fact, in agreement with our previous paper on the quantum time-correlator, we find that in quantum theory there’s basically a unique, canonical way to describe a quantum system at different times. This is achieved by means of a virtual broadcasting map, which is able to “spread” a quantum state symmetrically between different times, thus allowing the extraction of information that can be used to reconstruct the time correlations typical of a trajectory.
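For readers who like to see numbers, here is a small Python (NumPy) sketch of what “virtual” means in practice. As a working assumption, and not as a quotation of the paper's definition, I take the map to have the form $\rho\mapsto\frac{1}{2}\{\rho\otimes\mathbb{1},S\}$, where $S$ is the swap operator on two copies of the system; the function names and the example state are hypothetical. The sketch checks that both marginals of the output equal $\rho$, and that the output nevertheless has a negative eigenvalue, so it cannot be the output of a physical channel.

```python
import numpy as np


def swap_operator(d):
    """Swap operator S on C^d (x) C^d, with S|j, i> = |i, j>."""
    s = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            s[i * d + j, j * d + i] = 1.0
    return s


def virtual_broadcast(rho):
    """Assumed form of the map: (1/2) * anticommutator of rho (x) 1 with S."""
    d = rho.shape[0]
    a = np.kron(rho, np.eye(d))
    s = swap_operator(d)
    return 0.5 * (a @ s + s @ a)


def partial_trace(op, d, keep):
    """Marginal of a two-party operator on the first (keep=0) or second (keep=1) party."""
    t = op.reshape(d, d, d, d)        # indices: (row1, row2, col1, col2)
    if keep == 0:
        return np.einsum("ikjk->ij", t)
    return np.einsum("kikj->ij", t)


rho = np.array([[0.75, 0.25], [0.25, 0.25]])   # illustrative qubit state
out = virtual_broadcast(rho)

print("first marginal:\n", partial_trace(out, 2, keep=0))
print("second marginal:\n", partial_trace(out, 2, keep=1))
print("eigenvalues of the output:", np.linalg.eigvalsh(out))
```

Under the stated assumption, both marginals reproduce `rho` and the trace is one, but the spectrum contains a negative value. That is the price paid for broadcasting, and the reason the map has to be reconstructed from measurement statistics rather than implemented directly.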

Q: Could you elaborate on any unexpected or particularly interesting findings that emerged during the course of your research?

A: The most appealing, motivating, and intriguing feature of this work, in my opinion, is that the virtual broadcasting map we found is uniquely characterized by a simple set of natural requirements. That’s why we call it “canonical”. Such a uniqueness property in turn seems to point to a whole new part of quantum theory, i.e. its time-like structure, which is still largely unexplored.

Q: How does the concept of virtual quantum broadcasting contribute to our understanding of quantum state manipulation and information processing?

A: For example, consider the situation where one needs to evaluate the accuracy of a measurement of a physical quantity. This is something we encounter all the time in the design of quantum devices. By definition, the measurement is “accurate” if the final result is “highly correlated” with the value of the physical quantity before the measurement. As natural and innocuous as this definition may seem, quantum theory does not allow it in the strict sense. However, with our virtual broadcasting map, we can now directly measure the accuracy of any measurement — something that was previously thought to be, if not impossible, at least “unorthodox”. This is just the first application that comes to mind, but there are many more in quantum communication and quantum computation.

Q: The study mentions the optimization of physical approximations to the canonical virtual broadcasting map. Could you explain the significance of this optimization and how it impacts the practical implementation of virtual quantum broadcasting?

A: Symmetric and antisymmetric cloning are the “optimal” physical approximations to our canonical virtual broadcasting map, in the precise sense that using the former makes it possible to reconstruct the latter with minimal overhead in terms of measurement statistics. It’s all about efficiency of reconstructing time correlations. Moreover, as an added bonus, symmetric and antisymmetric cloning can be easily implemented using simple interferometric schemes.

Q: What avenues do you see for further research in this area, and are there specific aspects you’d like to explore in the future?

A: As mentioned earlier, it seems that we have found enough mathematical and conceptual evidence to justify entering an area of quantum theory that was previously considered “off limits”. Perhaps the answers to many fundamental questions can be found here.

How much would you bet against Maxwell’s Demon?

November 10, 2023 · November 11, 2023 · francesco

I didn’t quite know how to begin this post, and I also wanted to give these bots a try, so I got the picture above from LensGo and asked ChatGPT to generate a short story about Dr. Spock gambling against Maxwell’s Demon. Here it is:

Dr. Spock, the renowned physicist, found himself in an unexpected predicament. Across the green-felted table in a high-stakes casino, he faced an unlikely opponent: Maxwell’s Demon, a quantum-imbued entity known for challenging the laws of thermodynamics.
Their game wasn’t the usual casino fare. Instead, they wagered on the concept of dissipated work and energy. Dr. Spock, guided by the principles of scientific rationality, made precise calculations. Maxwell’s Demon, with a mischievous glint in its eye, played the odds.
As the final bets were placed, the cards were revealed, and the roulette wheel spun. In the end, it was Dr. Spock’s logical deductions that won the day, while Maxwell’s Demon learned that even in its magical universe, the laws of thermodynamics held sway. With a nod of acknowledgment, the demon vanished into the enigmatic folds of uncertainty, leaving Dr. Spock to ponder the intersection of science and chance.

Meh. Not super impressive… And why should Maxwell’s Demon be “quantum-imbued”? I didn’t prompt for anything quantum…

Anyway, this was to introduce a paper that we recently published in Physical Review Letters (paper freely available on the arXiv), discussing the second law of thermodynamics and Crooks’ fluctuation relations from the point of view of expected utility theory (EUT), which is what economists use to model betting strategies and rational agents.

The punchline is that all Rényi divergences between the forward and backward processes have an exact operational interpretation: they quantify how much a rational betting agent, with risk aversion r=𝛼-1, would be willing to pay to avoid betting on the performance of a stochastic thermal engine. In EUT jargon, this is called the certainty equivalent value. Thus, for example, the conventional relative entropy (𝛼=1) is not only the usual average dissipated work, but also the certainty equivalent amount for a perfectly rational agent (like the Spock in the picture…) who is neither risk-seeking nor risk-averse (r=0). Instead, for an infinitely risk-averse player (𝛼=r=∞), who cannot tolerate any uncertainty and simply wants to walk away from any possible gamble, the certainty equivalent amount naturally corresponds to the worst-case scenario of maximum dissipated work, as measured by Dmax, thus recovering a result of Yunger-Halpern et al. We are also able to describe the zoology of extremely risk-seeking, happy-go-lucky gamblers with r<-1 (𝛼<0), although the corresponding certainty equivalent amount is no longer a properly defined Rényi divergence.
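To make the role of 𝛼 a bit more concrete, here is a minimal Python sketch (not taken from the paper) that computes the Rényi divergence of order 𝛼 between two finite distributions, written as (1/r) log E_P[(P/Q)^r] with r=𝛼-1. The function name and the toy distributions `forward` and `backward` are made up for illustration, and the code only evaluates the divergence: identifying it with a certainty equivalent is the statement of the paper, which the code does not derive.

```python
import math

def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p||q) in nats, for finite distributions
    with q > 0 wherever p > 0. alpha = 1 and alpha = inf are taken as limits."""
    pairs = [(pi, qi) for pi, qi in zip(p, q) if pi > 0]
    if alpha == 1:
        # Limit alpha -> 1: the usual relative entropy (risk-neutral case r = 0).
        return sum(pi * math.log(pi / qi) for pi, qi in pairs)
    if math.isinf(alpha):
        # Limit alpha -> inf: Dmax, the worst-case log-ratio.
        return max(math.log(pi / qi) for pi, qi in pairs)
    r = alpha - 1  # plays the role of the risk aversion in the text
    return math.log(sum(pi * (pi / qi) ** r for pi, qi in pairs)) / r

# Hypothetical toy distributions, chosen only to have something to evaluate.
forward = [0.5, 0.5]
backward = [0.25, 0.75]

for alpha in (0.5, 1, 2, 10, math.inf):
    print(alpha, renyi_divergence(forward, backward, alpha))
```

The printed values do not decrease as 𝛼 grows, moving from the relative entropy at 𝛼=1 towards Dmax at 𝛼=∞, which matches the picture of an increasingly risk-averse player focusing more and more on the worst case.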

We had fun with this paper: putting Maxwell’s Demon and economics together makes for a lot of jokes… Speaking of which, here is one by the bot:

Why did Maxwell’s Demon decide to walk into Wall Street?

Because he heard there was a hot market for sorting bull and bear markets!

OK, bot, nice try…

Say hello to von Neumann’s “other” entropy…

June 11, 2023, francesco

…alias observational entropy!

In his 1932 book, von Neumann famously discusses a thermodynamic device, similar to Szilard’s engine, through which he is able to compute the entropy of a quantum state, obtaining the formula $S(\rho)=-Tr[\rho\log\rho]$ . I discussed this in another post.

Probably fewer people know, however, that just a few pages after deriving his eponymous formula, von Neumann writes:

Although our entropy expression, as we saw, is completely analogous to the classical entropy, it is still surprising that it is invariant in the normal (note added, in the sense of “unitary/Hamiltonian”) evolution in time of the system, and only increases with measurements — in the classical theory (where the measurements in general played no role) it increased as a rule even with the ordinary mechanical evolution in time of the system. It is therefore necessary to clear up this apparently paradoxical situation.

What von Neumann is referring to in the above passage is the phenomenon of free or “Joule” expansion, in which a gas, initially contained in a small volume, is allowed to expand against the vacuum, thereby doing no work but causing a net increase in the entropy of the universe, even though its evolution is Hamiltonian, i.e., reversible, all along.

In order to resolve this issue, von Neumann suggests that the correct quantity to consider in thermodynamic situations is not what we today call the von Neumann entropy, but another quantity, which he calls macroscopic entropy (and which today is called observational entropy):

$S_M(\rho)=-\sum_ip_i(\log p_i-\log V_i)$

where $M$ denotes a fixed measurement, i.e., a POVM $\{M_i\}_i$ , $p_i=Tr[M_i\ \rho]$ is the expected probability of occurrence of each outcome, and $V_i=Tr[M_i]$ are “volume” terms. The measurement, with respect to which the above quantity is computed, is used by von Neumann to represent, in a mathematically manageable way, a macroscopic observer “looking at” the system: the system’s state is $\rho$ , but the observer only possesses some coarse-grained information about it, and the amount of such coarse-grained information is measured by the macroscopic entropy.
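As a small sanity check of the formula, here is a hedged Python sketch that evaluates $S_M(\rho)$ for a toy example. Everything in it is an assumption made for illustration: matrices are real and given as nested lists, the measurements are projective (a special case of a POVM), and the helper names `trace_of_product`, `projector` and `observational_entropy` are mine.

```python
import math

def trace_of_product(a, b):
    """Tr[A B] for real square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def observational_entropy(rho, povm):
    """S_M(rho) = -sum_i p_i (log p_i - log V_i), in nats."""
    total = 0.0
    for m in povm:
        p = trace_of_product(m, rho)              # p_i = Tr[M_i rho]
        v = sum(m[i][i] for i in range(len(m)))   # V_i = Tr[M_i]
        if p > 0:                                 # outcomes with p_i = 0 contribute nothing
            total -= p * (math.log(p) - math.log(v))
    return total

def projector(dim, indices):
    """Diagonal projector onto the basis vectors listed in `indices`."""
    return [[1.0 if (i == j and i in indices) else 0.0 for j in range(dim)]
            for i in range(dim)]

dim = 4
rho = projector(dim, {0})                                   # a pure state
fine = [projector(dim, {i}) for i in range(dim)]            # fully resolving measurement
coarse = [projector(dim, {0, 1}), projector(dim, {2, 3})]   # two macrostates of volume 2
trivial = [projector(dim, set(range(dim))) ]                # observer who sees nothing

for name, povm in (("fine", fine), ("coarse", coarse), ("trivial", trivial)):
    print(name, observational_entropy(rho, povm))
```

In this example the fine measurement gives zero, which is the von Neumann entropy of the pure state, while the coarse and trivial measurements give log 2 and log 4 respectively: the less the observer can resolve, the larger the macroscopic entropy they assign to the same $\rho$ .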

In a paper written in collaboration with Dom Safranek and Joe Schindler and recently published in the New Journal of Physics, we delve into the mathematical properties and operational meaning of observational/macroscopic entropy, and discover some deep connections with the theory of approximate recovery and statistical retrodiction, a topic that keeps showing up in my recent work, even when I’m not looking for it from the start.

Is it perhaps because retrodiction really does play a central role in science? Or is it just me seeing retrodictive reasoning everywhere?


Quantum Quia

state contenti, umana gente, al quia
