Superposition and Entanglement in Quantum Mechanics (Part I)

Feb 28, 2024

Note: Apologies for re-posting this, but the original was quite lengthy and the math stuff failed to render properly in the phone app and in email. So I’m seeing whether splitting it into 2 shorter pieces that are not deemed “too long for email” helps.

Note after posting: Nope - the math is still not being rendered properly in the phone app. That’s frustrating. I guess if you want to see the squiggles you’ll have to read this on the poota.

Today, I want to delve a little more deeply into quantum mechanics (QM) and why it’s a bit different from the “day-to-day” way we tend to interpret the world.

I’m going to get a little bit technical in places, but I urge you to stick with it if, like most people, the math squiggles give you the heebie-jeebies. I will try my best to ‘look inside’ the math - but, honestly, there’s no good way (that I know of) to really get to any sort of decent understanding of QM without some squiggles.

We’re Not in Kansas Anymore

If you feel a little like Dorothy in The Wizard of Oz whenever you try to get to grips with an account of QM, then welcome to the club. Most physicists feel the same way too, and there are some rather vigorous debates amongst physicists about what it all means - and we’re nearly 100 years on from when Schrödinger and Heisenberg first published their papers on what we now call quantum mechanics.

The first really big difference is in the uncertainty. Just to clarify here, I’m not talking about the Uncertainty Principle, which you’ve probably heard of, but about how uncertainty itself is different in QM.

What do I mean by that?

Well, the old view, what is called the ‘classical’ view, was built on a set of physical laws that are time-reversible. Take Newton’s Laws, for example. Time enters into the equations (usually with the symbol t) and, if we can specify the starting conditions and all the forces acting, we can predict what the object is going to be doing at some later time. We can go the other way too. If we know the endpoints, we can work backwards (by ‘reversing’ time in the equations) to figure out what it was doing when it started. So we can predict forwards and ‘predict’ backwards.
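Here is a minimal sketch of that forwards-and-backwards prediction, assuming a ball thrown straight up under constant gravity (the numbers and the function name `evolve` are made up purely for illustration). We run the motion forwards, flip the sign of the velocity, run the very same law for the same length of time, and land back where we started.

```python
G = -9.81  # constant gravitational acceleration in m/s^2 (assumed value)

def evolve(position, velocity, t):
    """Exact Newtonian motion under constant acceleration G for a time t."""
    new_position = position + velocity * t + 0.5 * G * t * t
    new_velocity = velocity + G * t
    return new_position, new_velocity

# Predict forwards: start at height 0 m, moving upwards at 20 m/s.
x0, v0 = 0.0, 20.0
x1, v1 = evolve(x0, v0, 1.5)

# 'Reverse time': flip the velocity and apply the very same law again.
x_back, v_back = evolve(x1, -v1, 1.5)
v_back = -v_back  # flip back so it can be compared with the start

print(x1, v1)          # where the ball is, and how fast, after 1.5 s
print(x_back, v_back)  # the starting height and speed, recovered
```

Nothing in `evolve` knows which direction time is 'really' running in. That is all that time-reversible means here.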

This means that, if only we could figure out all the necessary starting conditions we could, in principle, know the future. The reason why we can’t is because we end up with having to handle too many equations (we can’t solve them) and we just can’t know all of the starting conditions well enough.

Uncertainty, in this perspective, arises from our ignorance. If only we could work the equations out, and know where everything started from, we could predict the future - there would be no uncertainty at all.

Even in classical physics we have to use the tools of uncertainty - probabilities, statistics, etc, but it’s because it’s the only way we can meaningfully get to grips with our own ignorance about the world.

Not so in QM.

In QM the uncertainty is baked in. It’s not arising because we don’t know stuff, or because we can’t solve the equations, but because the world is, in principle, unpredictable.

There are no things we’re ignorant of, no ‘hidden’ variables that, if we only knew them, would turn this state of uncertainty into certainty.

In 1964 the Northern Irish physicist John Bell wrote a paper that is probably one of the most important ever written. In it he showed that any theory which assumes the existence of these ‘hidden’ variables, and in which they act only locally (no faster-than-light influences), cannot reproduce all of the predictions of QM.

This opens the way for an experimental test of whether the world is ‘quantum’ or ‘classical’. The 2022 Nobel Prize in physics went to 3 scientists who had been at the forefront of doing those experiments (amongst them being John Clauser who has found fame recently as a climate ‘sceptic’).

These experiments (and the many others done by lots of other physicists) have convincingly demonstrated that the world, as we know it, is quantum¹. The quantum predictions, the ones that can’t be reproduced by any local hidden variable theory, are the ones we see.

So we can’t extricate ourselves from the quantum quagmire by assuming that there’s stuff (these hidden variables) that we just don’t know about. That’s not where the uncertainty is coming from.

We’ll come back to Bell’s paper in a bit, because it teases out this ‘quantumness’ by using the QM predictions for correlated particles. It shoved entanglement centre stage, but the uncertainty in QM is not dependent on entanglement, or correlation. Entanglement, as we shall see, arises from the superposition principle applied to 2 particles.

The difficulties inherent in understanding this superposition, even for single objects, were noted early on - and the most famous exposition of this can be found with Schrödinger’s unfortunate cat.

With a Little Bit of This, and a Little Bit of That

In order for us to understand this difference between the quantum and classical perspectives we have to know a little bit about how things are described in QM, and what the quantum ‘rules’ are.

I’m going to be using a view of QM that is mostly based on what’s called the Copenhagen Interpretation (CI). There are other ways to interpret the math - and the version of the CI that I present here has some philosophical flaws, serious ones, although it will still allow you to generate the correct predictions for experiments.

The reason I do this, despite the philosophical limitations which I fully acknowledge, is because it’s quite straightforward, albeit somewhat odd and unfamiliar to a classical perspective. It’s the way I usually work out stuff when it comes to solving quantum problems - although I have used other interpretations (they all give the same answers - they’re just different ways of thinking about what’s “going on”).

First off, we have to remind ourselves about vectors. Suppose we walk 1 metre to the East. We can represent this with an arrow. Then we walk 1 metre to the North. We can represent this with an arrow, also. We can see that the same endpoint could have been reached if we’d just walked in a straight line between the start and end.

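These arrows can be thought of as vectors, and we can see, visually, that the single straight-line arrow (green) is what you get by adding the East arrow (blue) and the North arrow (red).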

The way we’d write this is r = i + j where the r is the green arrow and i is a unit vector in the ‘East’ direction (the blue arrow) and j is a unit vector in the ‘North’ direction (the red arrow). The bold typeface is a typical way of indicating that you’re dealing with a vector quantity, and by ‘unit vector’ we mean a vector that has length 1.

The reason why the unit vectors are so convenient is that they’re at right angles to one another and they have unit length. You can see, I hope, that any ‘green’ vector you care to draw can be ‘made up’ of certain numbers of these blue and red steps. You could, for example, travel half a unit East and then 3 units North.

The blue and red unit vectors form what is known as a basis for the ‘space’ of green vectors². All green vectors can be ‘made up’ of an appropriate combination, a superposition, of blue and red vectors.
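As a small sketch of what ‘made up of’ means here, take the half-a-unit-East, 3-units-North arrow from above (the variable names and the helper `dot` are just illustrative). The dot product with each basis vector tells us how much of that basis vector is in the mix.

```python
import math

# Basis vectors: i points East, j points North (both have length 1).
i = (1.0, 0.0)
j = (0.0, 1.0)

def dot(u, v):
    """Ordinary dot product of two arrows on the page."""
    return u[0] * v[0] + u[1] * v[1]

# A 'green' arrow: half a unit East, then 3 units North.
r = (0.5 * i[0] + 3.0 * j[0], 0.5 * i[1] + 3.0 * j[1])

# The dot product with each basis vector recovers 'how much' of it is in r.
east_part = dot(r, i)
north_part = dot(r, j)
length = math.sqrt(dot(r, r))

print(east_part, north_part)  # the two 'amounts': 0.5 and 3.0
print(dot(i, j))              # 0.0, because the basis arrows are at right angles
print(length)                 # the length of the green arrow
```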

What’s all this got to do with QM?

Well, it turns out that when describing the ‘state’ of a physical system in QM we must think of it as a vector. It’s not a vector you can draw, though. If we think of the piece of paper for our arrows as being the ‘space’ where these vector arrows live, then state vectors in QM ‘live’ in a very different kind of space.

This is known as a Hilbert space and, furthermore, it’s a complex Hilbert space that requires the use of real and imaginary numbers.

What is this ‘state’ vector? Don’t worry too much about this, because physicists don’t know either. There are lots of arguments about what it ‘is’.

Broadly speaking we can think of the ‘state’ vector as something that encodes everything we can know, physically, about some object. Does it represent reality in some sense, or is it just some convenient mathematical tool that allows us to predict the results of experiments? Dunno - and neither does anyone else - and that’s what a lot of the arguments in QM fundamentals centre around.

The problem is that if we don’t use these quantum ‘rules’ or postulates (in one form or another) we end up predicting the wrong things. There are different ways to formulate the quantum rules, but they all contain the same basic underlying algebraic system (even if that’s well-hidden) which is fundamental to getting the right predictions. You can use a Copenhagen Interpretation, a minimalist interpretation, a transactional interpretation, a many-worlds interpretation, or even the ‘pilot wave’ interpretation of Bohm, amongst others.

They all just hide the ‘weirdness’ in different places.

This is not as odd as it seems. In classical mechanics, for example, there are also different ways of formulating the same basic ideas. We might call these different ‘interpretations’. Thus we have the Newtonian formulation, but we also have the Lagrangian and Hamiltonian formulations of classical mechanics which look, mathematically, very different indeed. They’re all equivalent, though.

Anyway, when we write down the ‘state’ of a system in QM we use a vector and we typically use a specific notation invented by Dirac. The quantum equivalent of the vector equation above, r = i + j, would be to write something like

\(\ket{ \psi } = \frac {1}{\sqrt{2}}( \ket{0}+\ket{1})\)

Well, that looks a bit more scary than i + j but it’s pretty much the same. We have some vector being ‘made up’ of the sum of 2 other vectors. The angled bracket here is just a notation that tells us we’re dealing with a vector that lives in this complex Hilbert space (it’s not the same as an arrow drawn on a piece of paper, but it is still, nevertheless, a vector).

And what the heck is that square root of 2 doing? What’s all that about? We’ll come back to that a bit later.

If I were to write the same equation with different ‘things’ inside the angled brackets, we end up with Schrödinger’s cat

\(\ket{\text{cat}} = \frac{1}{\sqrt{2}}( \ket{\text{dead}}+\ket{\text{alive}})\)

The ‘state’ of the cat is ‘made up’ of a sum (a superposition) of the two cat ‘states’, dead and alive.

This is where a lot of the confusion arises. This way of thinking tends to make us think that the cat is sort of a little bit dead and a little bit alive, all at the same time. The poor cat is in some horrible superposition of being both dead and alive.

But we don’t think that about r = i + j. We don’t think of the vector r as actually being a mix of two things, even though we have the strict equality in a mathematical sense.

So, what’s gone ‘wrong’ with the QM vector superposition here?

It all comes down to how we interpret the quantum equation. To understand one way of interpreting this equation we need to delve into what these various symbols mean and how they get used to furnish experimental predictions.

It’s worth noting that Schrödinger came up with the idea of his hapless cat to critique the interpretations of QM that were being discussed at the time. It was to highlight the absurdity of having a weird alive/dead cat thing as a description of reality.

I Don’t Care if They’re Stupid, They’re the Rules

Before we try to unpack what the quantum version of the vector superposition equation means, we need to get to grips, a bit, with the quantum rules.

If we look back at the arrows on paper example of vectors, we had two vectors, two special arrows (the blue and red) that were serving as our ‘basis’ for the space. Any general arrow (vector) could be written in terms of these two blue and red arrows.

We have exactly the same idea in QM. We have some general state (usually given the symbol of the Greek letter psi) that can be written as a sum (a superposition) of a set of special basis vectors.

Where do these basis vectors come from in QM?

Here’s where things start to go a bit all sort of “you what?”.

In QM the object we’re interested in is said to be in some quantum state (the ‘psi’ thing). The things we can measure, however, are represented mathematically by operators. These are not ‘variables’ in the way we’d think of them in classical physics.

If you want an example of an operator then our arrows on paper come in useful again. Suppose you had an arrow represented by the math - what mathematical object could be used to rotate that arrow so it ended up pointing in a different direction after it had been operated on? The mathematical object that does this for us is known as the rotation operator³.
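For arrows on paper, one common way to write such an operator is as a 2x2 rotation matrix. The sketch below assumes that form (the function name `rotate` is just illustrative) and turns the ‘East’ arrow through a quarter turn.

```python
import math

def rotate(vector, angle):
    """Apply the 2x2 rotation matrix for `angle` (radians, anticlockwise)."""
    x, y = vector
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

east = (1.0, 0.0)
rotated = rotate(east, math.pi / 2)  # a quarter turn
print(rotated)  # very close to (0, 1): the East arrow now points North
```

The arrow goes in, a different arrow comes out. That’s all an operator is: a rule for turning one vector into another.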

The operators that represent physical observables in QM are special kinds of operators known as Hermitian operators.

OK we’re just going to have to accept this for the moment, because it gets a bit more odd. Associated with an Hermitian operator⁴ is a special set of vectors known as its eigenstates. These are a special set of states that, when operated on, don’t change except for multiplication by a constant. The constants produced are known as the eigenvalues.

It turns out that these special states, these eigenstates, form a basis for the space (like our red and blue arrows). Each observable can generate a different set of eigenstates - thus we might have energy eigenstates, or polarization eigenstates, etc. These states also have the nice property, like our red and blue arrows, that they are orthogonal to one another (which means they are at ‘right angles’ to one another in this more complicated space).
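To make this a bit less abstract, here is a rough sketch using a small 2x2 Hermitian matrix. The matrix `H` and the names `plus` and `minus` are chosen by me purely for illustration and don’t represent any particular observable discussed here. It checks the two claims above: the operator only multiplies its eigenstates by a constant, and the eigenstates are orthogonal.

```python
import math

# A hypothetical 2x2 Hermitian operator, chosen only for illustration
# (it equals its own conjugate transpose).
H = [[0, -1j],
     [1j, 0]]

def apply(op, state):
    """Matrix-times-vector: the operator acting on a state."""
    return [op[0][0] * state[0] + op[0][1] * state[1],
            op[1][0] * state[0] + op[1][1] * state[1]]

def inner(u, v):
    """Inner product in a complex space: conjugate the first vector."""
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]

s = 1 / math.sqrt(2)
plus = [s + 0j, s * 1j]    # eigenstate with eigenvalue +1
minus = [s + 0j, -s * 1j]  # eigenstate with eigenvalue -1

print(apply(H, plus))      # the same vector as `plus`
print(apply(H, minus))     # `minus` multiplied by -1
print(inner(plus, minus))  # zero: the two eigenstates are orthogonal
```

Notice that the inner product has to conjugate one of the vectors. That is the price of living in a complex space rather than on a piece of paper.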

OK - you might need to read all that a few times. It’s definitely not Kansas any more. It’s physics, Jim, but not as we know it.

The basic idea is that we can take a general quantum state and write it as a sum of these special states. These special states, the eigenstates, are associated with observable properties. So, if we wrote a state in terms of the eigenstates of the energy operator we might say we had expanded the state in the energy basis.

Let’s try to illustrate this with an example. If you have an electromagnetic (EM) field you could have a field with no photons at all - this would be the vacuum state. You could have one with exactly one photon, or exactly two photons, and so on. These are the energy eigenstates for the EM field.

So you represent a general EM field state as a sum (a superposition) of all of these special state vectors (these special eigenstates) and it might look like this :

\(\ket { \psi } = a \ket{0} +b \ket{1} + c \ket{2} + d \ket{3} + \ldots\)

The a, b, c, and d (and so on) here just tell us ‘how much’ of each special eigenstate needs to get added into the mix in order to construct the overall ‘psi’ state.

For those of you who have done a technical degree, you might recognise the hand of linear algebra writ large here. You wouldn’t be wrong.

What does all this mean? What’s the connection with real ‘stuff’ - things we can measure, machines that go ping, and all that?

You’ve settled down a bit and are now ready for even more odd rules, aren’t you?

Well, let’s take our EM field example above, and go about ‘measuring’ the energy - at least theoretically. What do the quantum rules tell us will happen?

The rules say that if we make a measurement of energy we’ll get the result “0 photons” with probability a squared, the result “1 photon” with probability b squared, and so on.

This is where the randomness kicks in. There’s no way, in principle, to know which result we’re going to get. This isn’t happening because we just don’t know stuff well enough - it’s built in to the thing.

There’s another weird thing, too.

The rules (the ‘projection postulate’) tell us that if we do a non-destructive measurement (a very idealized form of measurement) then after the measurement the field will be in a new state - and it will be the eigenstate associated with our measurement result.

Not only is randomness built in, but there’s an irreversibility directly built in, too.

Holy Schmoly, what the heck are these physicists smoking?

It gets worse (but you knew that already, didn’t you?). Remember those constants that told us ‘how much’ of each special state goes into the mix? They’re complex numbers. So when we square them, to get our probabilities, we have to do complex number ‘squaring’.

These ‘complex squared’ values are probabilities, so they also have to add up to one. We do a measurement of energy and we have to get some result - the sum of all these probabilities for individual results has to add up to one.
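A quick sketch of what complex ‘squaring’ involves, using two made-up coefficients picked only so that the arithmetic is easy: squaring a complex number in the ordinary way can give a negative answer, which is no good for a probability, so instead we multiply it by its complex conjugate.

```python
# Hypothetical coefficients, chosen only so that the arithmetic is easy.
a = 0.6 + 0j   # coefficient of |0>
b = 0.8j       # coefficient of |1>: purely imaginary

# Ordinary squaring of a complex number can give a negative (or complex) result...
plain_square = b * b

# ...so the 'complex square' multiplies the number by its complex conjugate.
prob_0 = (a * a.conjugate()).real
prob_1 = (b * b.conjugate()).real

print(plain_square)                     # has a negative real part
print(prob_0, prob_1, prob_0 + prob_1)  # two non-negative numbers summing to (almost exactly) one
```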

This is why we had the square root of 2 thingy in the first quantum equation above.
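Spelling that out for the two-term state, where both coefficients are 1 over the square root of 2, the two probabilities are

\(\left|\frac{1}{\sqrt{2}}\right|^2 + \left|\frac{1}{\sqrt{2}}\right|^2 = \frac{1}{2} + \frac{1}{2} = 1\)

Without the square root of 2 the two ‘probabilities’ would each be 1, and would add up to 2.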

So let’s imagine we could create an EM field state that looked like this

\(\ket{ \psi } = \frac {1}{\sqrt{2}}( \ket{0}+\ket{1})\)

If we made a measurement of energy on this field we’d get the result “no photon” (i.e. the vacuum state) with probability of 1/2, and the result “one photon” with probability 1/2.

If we got the result “one photon” the field would now be in a different state (with one of these highly idealized non-destructive measurements) and that state would be the one photon state.

As soon as we start to imagine things in terms of photons (little bitty particles of light), as soon as we have this mental picture, we end up back in ‘cat’ territory. What does it even mean to have a state that is a superposition of no photons and one photon, at the same time?

Our brain starts to melt as we vainly struggle to ‘picture’ this.

The quantum ‘rule’, for measurements, is something like this.

Expand the state in terms of the eigenstates of the operator representing the thing you want to measure (e.g. energy). The measurement result will be an eigenvalue of that operator, and which one you get is random, with probability given by the ‘complex square’ of the coefficient of the corresponding eigenstate. Furthermore, the new state, after measurement, will be the eigenstate associated with the measured eigenvalue.
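The rule can be turned into a toy simulation. This is only a sketch under simplifying assumptions: the state is just a list of coefficients in the measured basis, the measurement is the highly idealized non-destructive kind, and the function name `measure` is made up. It uses the no-photon/one-photon state from above.

```python
import math
import random

def measure(coefficients, rng):
    """One idealized measurement: returns (outcome index, state afterwards)."""
    # Probability of each outcome is the 'complex square' of its coefficient.
    probabilities = [abs(c) ** 2 for c in coefficients]
    outcome = rng.choices(range(len(coefficients)), weights=probabilities)[0]
    # Projection postulate: afterwards only the measured eigenstate is left.
    new_state = [1.0 if n == outcome else 0.0 for n in range(len(coefficients))]
    return outcome, new_state

rng = random.Random(0)  # seeded only so the sketch is repeatable
psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # (|0> + |1>)/sqrt(2)

# Measure many freshly prepared copies of psi and count the results.
counts = [0, 0]
for _ in range(10000):
    outcome, _state = measure(psi, rng)
    counts[outcome] += 1
print(counts)  # roughly half "0 photons" and half "1 photon"

# Measuring the collapsed state again repeats the same result every time.
outcome, collapsed = measure(psi, rng)
repeat, _ = measure(collapsed, rng)
print(outcome == repeat)
```

Bear in mind that the random number generator here is only standing in for the quantum randomness. A computer’s ‘random’ numbers are exactly the ignorance kind of uncertainty, which is the very thing QM says is not going on.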

Can I explain why this is so? Not at all. The rules are, by any ‘reasonable’ standard, just bonkers. But they are what they are.

I’ll emphasize again that what I’ve presented here is just one way of writing out the quantum ‘rules’. There are other approaches to QM which are possible - but these also tend to be just as technical and are still bonkers (the bonkers just gets shuffled around a bit in the formulation).

What happens in between measurements? Well, the state is said to evolve according to Schrödinger’s equation, which is a time-reversible equation. There’s been a lot of work done to try to explain the irreversibility of the quantum measurement rule as a kind of statistical property. In this approach you model the object (the thing you’re interested in) and the measuring device itself, quantum mechanically.

The irreversibility then arises as a result of the quantum interaction of a small quantum thing with a large (quantum) thing and is a very similar argument to that used to explain ‘the arrow of time’ in classical statistical mechanics.

It’s generally known as the ‘decoherence’ approach - but it’s a bit of a fix that doesn’t properly work in my view. The fundamental ‘measurement problem’ of QM is still there, but I’ll leave that for another day perhaps.

If you start to delve into the underlying algebraic structures more technically, you find that despite these ‘rules’ just looking bonkers, there’s a real connection with the classical perspective. It comes down to the fact that in QM we have to use what’s called a non-commuting algebra, whereas things in the classical version of ‘reality’ all commute.

We can get a feel for this by thinking about operators. Operators commute when it doesn’t matter what order you do them in. But try these two operations

(a) put socks on
(b) put shoes on

in a different order - and you’ll understand what non-commuting means.
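The same thing shows up with the kind of operators we met earlier. In this sketch the two 2x2 matrices are picked by me only as a simple example: `A` swaps the East and North parts of an arrow, and `B` flips the North part. Doing A-after-B is not the same as doing B-after-A.

```python
def matmul(p, q):
    """Product of two 2x2 matrices: 'do q first, then p'."""
    return [[sum(p[r][k] * q[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

# Two illustrative operators on arrows (assumed examples, not from any experiment).
A = [[0, 1],
     [1, 0]]   # swap the East and North components
B = [[1, 0],
     [0, -1]]  # flip the sign of the North component

AB = matmul(A, B)
BA = matmul(B, A)

print(AB)        # [[0, -1], [1, 0]]
print(BA)        # [[0, 1], [-1, 0]]
print(AB == BA)  # False: the order matters, so A and B do not commute
```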

. . . (to be continued in part II)

Riggery Pokery
