Quantum Mechanics is a Local Theory

There is No “Spooky Action at a Distance”

bunchberry
19 min read · Sep 9, 2024

It is often argued that Bell’s theorem proves there is nonlocality in nature. It is also often stated that entanglement is a nonlocal effect whereby entangled particles can influence each other no matter how far apart they are. However, both of these claims are not only wrong, but it can be shown that entanglement is precisely what guarantees that quantum mechanics is a local theory.

Entanglement Guarantees Locality

Quantum mechanics is a statistical theory in which probability amplitudes are complex-valued. It operates identically to any other probabilistic theory except that in quantum mechanics it makes sense to say that events have a negative, or even imaginary, probability amplitude associated with them, rather than merely a number between 0 and 1. This gives rise to interference effects, the hallmark of quantum mechanics, whereby positive and negative amplitudes can cancel out in a way that cannot occur in traditional probability theory. I have an article here in which I attempt to give a simple intuition for interference effects.

Not only do I explain interference in that article, but I also explain how, whenever two systems are entangled, it is no longer valid to assign the state vector (the list of complex-valued probability amplitudes) to the individual subsystems, but only to the system as a whole. Interference effects thus become a property of the system as a whole and disappear for the individual subsystems.

In the linked article I don’t actually go into the real mathematics, but here I will. We can demonstrate this using the program Octave (similar to MATLAB but better). First, we need to define some logic gates. We will only be using two: the CX gate and the Hadamard gate. A logic gate is represented as a matrix with a column for each possible input and a row for each possible output (the matrices we use here are symmetric, so you can read them either way). A single-qubit gate needs two columns for the possible inputs of 0 and 1 and two rows for the outputs of 0 and 1. A two-qubit gate needs four columns for the inputs 00, 01, 10, and 11 and four rows for the outputs 00, 01, 10, and 11. Below we define our two logic gates.

octave:1> H = 1/sqrt(2) * [ 1, 1; 1, -1 ]
H =
0.7071 0.7071
0.7071 -0.7071

octave:2> CX = [
> 1, 0, 0, 0;
> 0, 0, 0, 1;
> 0, 0, 1, 0;
> 0, 1, 0, 0
> ]
CX =
1 0 0 0
0 0 0 1
0 0 1 0
0 1 0 0

For the Hadamard gate, if the input is 0, then we look at the first column for the outputs, which gives an amplitude of 1/sqrt(2) for the output 0 and 1/sqrt(2) for the output 1. That means the Hadamard gate will place a qubit that is in an eigenstate (either 0 or 1) into a superposition of states. The CX gate negates the most significant qubit only if the least significant qubit is 1. Hence, 00 has an output of 00, 01 has an output of 11, 10 has an output of 10, and 11 has an output of 01. You can again tell this by matching the inputs to the columns and the outputs to the rows.

Applying a logic gate is performed simply by multiplying the gate matrix by the state vector, giving a new state vector U * psi, where U is some unitary logic gate. In physical terms, this would be some sort of interaction with a particle that changes its probability amplitudes for what you might observe.
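
For instance, in Octave (a minimal sketch of my own, using the H we just defined), applying the Hadamard gate to a qubit starting in the 0 eigenstate looks like this:

H = 1/sqrt(2) * [ 1, 1; 1, -1 ];  % the Hadamard gate defined above
psi = [ 1; 0 ];                   % a single qubit in the 0 eigenstate
psi = H * psi                     % yields [ 0.7071; 0.7071 ], an equal superposition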

We can begin with a bipartite (two-qubit) system, represented by a state vector of size 4: a list of probability amplitudes for measuring 00, 01, 10, and 11. We can entangle these two qubits by first applying a Hadamard gate to the least significant qubit and then applying the CX gate. The most significant qubit will be negated based on the value of the least significant one, while the least significant one is in a superposition of states. The result is that the two qubits become correlated with one another while collectively being in a superposition of states.

As you can see below in the final output, there are only non-zero probability amplitudes for the outcomes 00 and 11, so the qubits are guaranteed to have the same value when you observe them, even though prior to observing them they are in a superposition of states.

octave:3> psi = [ 1; 0; 0; 0 ]
psi =
1
0
0
0

octave:4> psi = kron(eye(2), H) * psi
psi =
0.7071
0.7071
0
0

octave:5> psi = CX * psi
psi =
0.7071
0
0
0.7071

As a side note, you can Kronecker product logic gates together to form a larger gate that acts as running them in parallel. You need to do this in the example above (with the “kron” function) because the Hadamard gate is a single-qubit gate, a 2×2 matrix, which is too small to apply directly to our bipartite system’s four-entry state vector. So we Kronecker product it with the identity matrix (with the “eye” function), which forms a new logic gate that applies the identity matrix (does nothing) to the most significant qubit and applies the Hadamard gate to the least significant one.

Let’s say I handed you a single qubit that I applied a Hadamard gate to. You can see in the gate matrix that an input of 0 or 1 produces an output with equal probabilities of 0 or 1, so it would have a 50% chance of being measured as 0 or 1. Yet, like the interference example I showed in the other article, if you apply the Hadamard gate twice it cancels itself out and you get back the original value you started with: if you started with a 1 and apply it twice, the output is a 1.
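
You can check this directly in Octave (a quick sanity check of my own, reusing the H defined earlier): applying the Hadamard gate twice gives back the identity, so the original eigenstate is recovered.

H = 1/sqrt(2) * [ 1, 1; 1, -1 ];  % the Hadamard gate defined above
H * H                             % the identity matrix: the two applications cancel out
H * (H * [ 0; 1 ])                % starting from the 1 eigenstate, we get [ 0; 1 ] back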

Classical and quantum probabilities behave differently. Imagine if I instead randomly, with 50% probability, handed you either a 0 or a 1 and had you apply a Hadamard gate to it. If you follow the gate matrix, then either way you end up with an output with equal probability of 0 and 1. Both cases, the qubit I had already applied a Hadamard gate to and the classically random bit, start with a 50% chance of measuring a 0 or a 1, yet applying a Hadamard gate to the first gives a determined outcome while applying it to the second gives a random outcome.

The difference between quantum and classical probabilities can be expressed using density matrices. A density matrix is computed by multiplying the state vector by its own Hermitian transpose, forming an outer product. The Hermitian transpose is computed by transposing the vector and taking the complex conjugate of all its entries.

Density matrices hold the Born rule probabilities (values between 0 and 1) on their diagonal, the entries running from the top-left to the bottom-right. For any superposition of states, at least some of the off-diagonal entries (the entries not on the diagonal) are non-zero. Eigenstates, however, always have all zeroes in the off-diagonals.

You can see below that after applying a Hadamard gate, the diagonal gives probabilities of 50% for 0 and 50% for 1, and there are also non-zero values in the off-diagonals. For the eigenstates, however, the off-diagonals contain only zeroes.

octave:7> [1; 0] * transpose(conj([ 1; 0 ]))
ans =
1 0
0 0

octave:8> [0; 1] * transpose(conj([ 0; 1 ]))
ans =
0 0
0 1

octave:9> (H * [0; 1]) * transpose(conj(H * [ 0; 1 ]))
ans =

0.5000 -0.5000
-0.5000 0.5000

You can thus represent classical probabilities with linear combinations of eigenstate density matrices, each multiplied by its probability of occurring. You still get a matrix with probability values on its diagonal, but with all zero entries in its off-diagonals, which does not correspond to any possible superposition of quantum states.

In the example below, we take the two eigenstate density matrices for 0 and 1 and combine them with a weight of 0.5 each, and we get a new density matrix that still has probabilities of 50% for 0 and 50% for 1 on the diagonal but all zeroes in the off-diagonals, which differs from the density matrix we got by applying the Hadamard gate.

octave:13> 0.5 * ([0; 1] * transpose(conj([ 0; 1 ]))) + 0.5 * ([1; 0] * transpose(conj([ 1; 0 ])))
ans =
0.5000 0
0 0.5000

To show why this is interesting, we then have to introduce the partial trace. Recall that whenever you have entangled particles or qubits, you cannot assign a state vector to the individual particles or qubits, only to the system as a whole. However, what if we want to know how a single particle will behave on its own? You can compute this using a partial trace, whereby you take the density matrix of an entangled system and trace out (ignore) the particles you don’t care about. Below, the partial trace is written out for a two-qubit density matrix, first tracing out the least significant qubit and then tracing out the most significant qubit.
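
In Octave notation, with kron providing the tensor product, the two reduced density matrices of any two-qubit (4×4) density matrix p can be written out like this (a small sketch of my own; pMS and pLS are just my labels for the reduced density matrices of the most and least significant qubits, and the second expression is exactly the computation we perform below):

% Trace out the least significant qubit, leaving the most significant one:
pMS = kron(eye(2), [1, 0]) * p * kron(eye(2), [1; 0]) + kron(eye(2), [0, 1]) * p * kron(eye(2), [0; 1]);
% Trace out the most significant qubit, leaving the least significant one:
pLS = kron([1, 0], eye(2)) * p * kron([1; 0], eye(2)) + kron([0, 1], eye(2)) * p * kron([0; 1], eye(2));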

Let’s go back to our bipartite entangled system. Remember that? Let’s first compute its density matrix to see what it looks like. You can tell from the non-zero entries in two of the off-diagonal positions that it is a quantum probability distribution which can still interfere with itself.

octave:14> psi
psi =
0.7071
0
0
0.7071

octave:15> p = psi * transpose(conj(psi))
p =
0.5000 0 0 0.5000
0 0 0 0
0 0 0 0
0.5000 0 0 0.5000

Now, what happens if we trace out the most significant qubit, leaving us with just the density matrix for a single qubit of this entangled system? You can see the results below. What we are left with is a density matrix that is classical. That is to say, a single qubit taken in isolation from an entangled pair behaves as if it were still random, but classically random. It would not exhibit interference effects.

octave:20> kron([1, 0], eye(2)) * p * kron(transpose(conj([1, 0])), eye(2)) + kron([0, 1], eye(2)) * p * kron(transpose(conj([0, 1])), eye(2))
ans =
0.5000 0
0 0.5000

Remember that we formed this entangled state by first applying the Hadamard gate to a single qubit and then entangling it with another using the CX gate. That means the single qubit on its own could exhibit interference effects, but after being entangled with the other qubit, it lost its ability to interfere with itself. Indeed, take a look below, where we apply only the Hadamard gate to the least significant qubit, without the CX gate, and trace out the most significant qubit again. We get a density matrix with non-zero values in the off-diagonals.

octave:21> kron(eye(2), H) * [1; 0; 0; 0]
ans =
0.7071
0.7071
0
0

octave:22> p = ans * transpose(conj(ans))
p =
0.5000 0.5000 0 0
0.5000 0.5000 0 0
0 0 0 0
0 0 0 0

octave:23> kron([1, 0], eye(2)) * p * kron(transpose(conj([1, 0])), eye(2)) + kron([0, 1], eye(2)) * p * kron(transpose(conj([0, 1])), eye(2))
ans =
0.5000 0.5000
0.5000 0.5000

Entangling a qubit or a particle with another one removes its ability to interfere with itself, as only the system as a whole can exhibit interference effects. Indeed, we can repeat this whole process with a tripartite system where we entangle three qubits together and trace out just one of them. We find that the two remaining qubits, both together and separately, do not interfere with themselves.

octave:24> psi = kron(CX, eye(2)) * kron(eye(2), CX) * kron(eye(4), H) * [1; 0; 0; 0; 0; 0; 0; 0 ]
psi =
0.7071
0
0
0
0
0
0
0.7071

octave:25> p = psi * transpose(conj(psi))
p =
0.5000 0 0 0 0 0 0 0.5000
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0.5000 0 0 0 0 0 0 0.5000

octave:27> kron([1, 0], eye(4)) * p * kron(transpose(conj([1, 0])), eye(4)) + kron([0, 1], eye(4)) * p * kron(transpose(conj([0, 1])), eye(4))
ans =
0.5000 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0.5000

octave:28> kron([1, 0], eye(2)) * ans * kron(transpose(conj([1, 0])), eye(2)) + kron([0, 1], eye(2)) * ans * kron(transpose(conj([0, 1])), eye(2))
ans =
0.5000 0
0 0.5000

Entanglement, rather than being nonlocal, in fact guarantees that quantum mechanics remains a local theory. Why? Well, first, to entangle the qubits, you have to bring them together to interact. That is a local interaction. Second, once they are entangled, it is further guaranteed that separately they will not exhibit interference effects. They will behave like classical particles. This means that the only way to view the interference effects is to bring the entangled particles back together locally.
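
One way to see this in Octave (a sketch of my own, reusing the gates defined above): undoing the entangling circuit is a joint operation that requires both qubits in the same place, and it makes the interference visible again by deterministically returning the system to its starting state, even though each qubit on its own remains classically random.

H = 1/sqrt(2) * [ 1, 1; 1, -1 ];                          % gates as defined above
CX = [ 1, 0, 0, 0; 0, 0, 0, 1; 0, 0, 1, 0; 0, 1, 0, 0 ];
psi = CX * kron(eye(2), H) * [ 1; 0; 0; 0 ];              % the entangled state (00 and 11)
kron(eye(2), H) * (CX * psi)                              % undoing the circuit gives [ 1; 0; 0; 0 ] back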

Affecting One Does not Affect the Other

When the two particles are separated, they will, again, behave classically. There is nothing you could do to them that could not be explained classically. There is a myth that when you interact with one particle, it instantly affects the other, for example, that if you flip one particle in an entangled pair, the other particle also gets flipped.

Let’s say we have two entangled qubits with equal probabilities of being 00 and 11. If affecting one affected the other, then flipping one of them should flip the other as well. This would change the probability distribution from 00 and 11 to 11 and 00 with the same probabilities, which is identical to the initial state, and thus nothing would change. If affecting one does not affect the other, then flipping the least significant qubit changes the probability distribution from 00 and 11 to 01 and 10. It would transform the perfect correlation into a perfect anti-correlation.

Let’s say I have two envelopes, each with a coin inside. I mix the coins up such that either H (heads) is facing up in both envelopes or T (tails) is facing up in both envelopes. The possible outcomes are thus HH and TT. Now, let’s say we each take one envelope and travel millions of miles apart before opening them, but you decide to be clever and flip yours upside down before opening it. The effect of this will be to change the probability distribution from HH and TT to HT and TH. Yours would be guaranteed to be the opposite of mine, but you did not affect mine at all.

We can carry out this experiment in Octave as shown below. Flipping one of the qubits changes the probabilities from equal probability of 00 and 11 to equal probability of 01 and 10. It does not affect the other qubit, and everything behaves as it would in the classical case.

octave:29> X = [ 0, 1; 1, 0 ]
X =
0 1
1 0

octave:31> psi = CX * kron(eye(2), H) * [ 1; 0; 0; 0 ]
psi =
0.7071
0
0
0.7071

octave:32> kron(eye(2), X) * psi
ans =
0
0.7071
0.7071
0

Indeed, if we compute the reduced density matrix for each qubit, we find that the reduced density matrices are not altered. There is nothing you can do to one of the qubits that affects the reduced density matrix of the other.

That is essentially what the No-Communication Theorem demonstrates. Take the density matrix of the two entangled qubits, and take a second density matrix formed by applying some unitary operation to, let’s say, the least significant qubit. If you perform a partial trace on each to get the reduced density matrix of the most significant qubit, you can prove that this reduced density matrix is always the same as if no unitary operation had been applied to the least significant qubit at all.
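
Here is a small check of this in Octave (a sketch of my own, reusing the gates defined above; rdm_ms is just my own helper for the reduced density matrix of the most significant qubit): whether we apply nothing, the X flip, or a Hadamard gate to the least significant qubit, the most significant qubit’s reduced density matrix comes out the same.

H = 1/sqrt(2) * [ 1, 1; 1, -1 ];                          % gates as defined above
CX = [ 1, 0, 0, 0; 0, 0, 0, 1; 0, 0, 1, 0; 0, 1, 0, 0 ];
X = [ 0, 1; 1, 0 ];
psi = CX * kron(eye(2), H) * [ 1; 0; 0; 0 ];              % the entangled state

% Reduced density matrix of the most significant qubit (trace out the least significant):
rdm_ms = @(p) kron(eye(2), [1, 0]) * p * kron(eye(2), [1; 0]) + ...
              kron(eye(2), [0, 1]) * p * kron(eye(2), [0; 1]);

p0 = psi * psi';                                          % ' is the Hermitian transpose
p1 = (kron(eye(2), X) * psi) * (kron(eye(2), X) * psi)';  % least significant qubit flipped
p2 = (kron(eye(2), H) * psi) * (kron(eye(2), H) * psi)';  % Hadamard applied to it instead

rdm_ms(p0), rdm_ms(p1), rdm_ms(p2)                        % all three are 0.5 * eye(2)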

Separately, they are guaranteed not to affect one another. Only when brought back together do you observe interference between them. You thus can only ever observe interference effects locally. They simply cannot be observed nonlocally.

Therefore, if you want to claim quantum mechanics is nonlocal, you are pushed into a corner where you have to argue that this nonlocality is conspiratorial. It has to be nonlocal in a way where all the nonlocality is hidden below the surface and cannot actually be observed, so that the mathematics just so happens to work out identically to a local theory. Nature would have to be nonlocal, but in a way that conspires to hide it from us.

In the real world, when Bell tests are carried out, you first entangle two particles locally, then you separate them by a vast distance, and then you bring them back together and observe their interference effects with each other. If you never bring them back together, no violations of Bell inequalities can be observed. Bell tests would only require something nonlocal if you presumed the outcomes were predetermined. I talk about it in this article here.

If you treat the outcomes as predetermined, you have to preassign all the values, but this leads to a mathematical contradiction unless you presume the measurement settings alter the outcomes. Yet, in a Bell test, the measurements are spatially separated between the two particles. Each individual particle would have to be affected by measurement settings which could be a great distance away, and thus this implies nonlocality.

Yet, if you do not presume predetermination, then you do not need to preassign all the values, and so you do not run into this contradiction. Bell’s theorem rules out local hidden variable theories (setting debates about superdeterminism aside) but not locality itself.

Quantum mechanics is not probabilistic because there is some hidden variable that, if we knew it, would let us predict the outcome ahead of time. Quantum mechanics is probabilistic precisely because the outcome is not predetermined. What we are ignorant of is what the outcome will be in the future when we interact with the system. Of course, if we somehow knew that information, we could predict the outcome ahead of time, but that is impossible as it would imply knowing the future. Like all statistical theories, it is statistical because we are ignorant of something, but in this case, what we are ignorant of is something that cannot be acquired ahead of time.

There is No Nonlocal “Collapse”

Now, you might say: even if the two particles don’t affect each other, doesn’t measuring one collapse the wave function for both simultaneously, making the collapse of the wave function a nonlocal event? No, and for a simple reason: there just is no wave function collapse. And I do not mean this in the sense of devolving into mysticism about a multiverse. I already wrote an article here about why I think the Many-Worlds Interpretation is incoherent. Quantum mechanics taken at face value simply does not have a “collapse postulate,” as is often claimed.

Indeed, when you observe a definite outcome, you reduce the state vector to 1 for whatever outcome you saw. This, however, has absolutely nothing to do with “collapse,” nor is it a postulate. It is just how probability works. If I flip a coin, while it is in the air it has a 50% chance of being heads and a 50% chance of being tails. If it lands on heads and I see the outcome, I update my prediction to 100% for heads and 0% for tails because I now know what the outcome is. This is not a separate postulate or assumption; it necessarily follows from the definition of probability. If we already agree that the state vector represents probability amplitudes, then this follows by definition and is not an additional postulate on top of that. You do not need a separate postulate to say that you update a probability distribution in this way when you make an observation.

The problem here is that people tend to objectify the state vector, which is merely a list of probability amplitudes. They claim that the state vector is not a list of probability amplitudes but instead represents an object: the dimensions of a physical wave. This reduction of the state vector would thus imply that you perturb the wave such that it physically “collapses” like a house of cards into a single particle.

Yet, this is just an incredibly silly belief. If I have one coin, I would assign it 2 probability amplitudes: one for 0 and one for 1. If I have two coins, I would assign 4 amplitudes, for 00, 01, 10, and 11. If I have three coins, I would assign 8 amplitudes, for 000, 001, 010, 011, 100, 101, 110, and 111. The number of amplitudes is unbounded and grows exponentially, doubling with every coin I add.

Since the state vector is merely a list of probability amplitudes, it also grows exponentially and without bound as more particles come under consideration. If each amplitude is not really a probability amplitude but describes some dimension of a wave, then the number of dimensions of the wave would be unbounded. This wave would not exist in our simple four-dimensional spacetime but in a vastly high-dimensional space known as a Hilbert space, whose dimension keeps growing as you add particles.
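
As a quick sanity check in Octave (an illustration of my own, separate from the walkthrough above), the length of the state vector doubles with every qubit or coin you add:

length(kron([ 1; 0 ], [ 1; 0 ]))                   % two qubits: 4 amplitudes
length(kron(kron([ 1; 0 ], [ 1; 0 ]), [ 1; 0 ]))   % three qubits: 8 amplitudes
2^100                                              % one hundred qubits: about 1.27e+30 amplitudes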

People like to trick themselves into thinking the wave picture is intuitive: imagining a wave passing through two slits in the double-slit experiment makes them believe they have an “intuition” for what is going on. Yet, they are just playing mental tricks on themselves. The wave going through the two slits would exist in Hilbert space and not in spacetime, and thus any depiction of it as a wave moving through space is inherently wrong. It would simply be a false depiction of what is actually going on.

Indeed, this false belief that imagining things as waves moving through space gives an “intuitive” visualization just leads to confusion and wasted time. The moment you try to apply this visualization to just about anything else, it doesn’t work. Consider a quantum computer where the qubits are encoded in electron spin. The electrons are, roughly speaking, held in place and are not moving through space the way the photons are, yet you would still assign a state vector that evolves over time as you apply logic gates to this quantum computer.

With just 100 electrons, this wave would exist in a Hilbert space with over 10³⁰ dimensions. There are only about 10¹⁸ grains of sand on Earth. This wave would evolve over the course of the program without moving anywhere. How on earth do you visualize this? Falsely believing you can, and then trying to visualize quantum computation as waves, just leads you to waste your time. It is a mere mental trick that deludes you into thinking you can easily visualize the double-slit experiment, using a false visualization that doesn’t work at all when applied anywhere else. I wish no one had ever told me to think about it that way, because it definitely contributed to a lot of initial confusion as I tried to visualize things in that sense, and it never helped at all but just left me more confused than ever.

It is much simpler not to think of the state vector as a literal wave but as, again, a list of complex-valued probability amplitudes. Their complex-valued nature is what gives rise to interference effects. The wave-like behavior of light is not a property of a single particle, as if it turns into a wave when not being looked at; indeed, there simply is no wave for a single particle. The actually observed waves, such as the wave-like behavior of light, are weakly emergent from these interference effects over large numbers of particles. The observed waves, such as the interference pattern, should be seen as a separate and emergent phenomenon from how the particles individually behave, which is simply probabilistically with interference.

When you make this distinction, it becomes rather clear why separating two entangled particles and then measuring one of them does not have a nonlocal effect on the other. The state vector is, again, a list of probability amplitudes; it is thus a predictive tool and does not describe the system itself. If I say a coin has a 50% chance of being heads and a 50% chance of being tails, I am not describing the coin as somehow halfway between the two, but making a statement about what my prediction would be for the outcome, the gambling odds so to speak.

If I have entangled qubits, my gambling odds are at first 50% for 00 and 50% for 11. If I measure one of them and see, let’s say, a 0, then I update my gambling odds to 100% for 00 and 0% for 11. This does not imply I perturbed the other particle in any way simply by updating my prediction. There is no “collapse” of some giant wave that connects me to the other person. I am merely updating my prediction, because that’s how probability works.

Recall that I said that, given the No-Communication Theorem, believing quantum mechanics is nonlocal implies a sort of conspiratorial nonlocality. People who believe that particles literally turn into waves in Hilbert space until you look at them, at which point they “collapse” back into particles, are committed to the natural world containing superluminal (faster-than-light) signaling, just below the surface where we can’t see it.

Consider, for example, Alice and Bob sharing a pair of entangled particles, with Alice measuring hers. If you believe the state vector literally represents a physical wave, then at that moment Alice collapses a giant wave stretching between Alice and Bob, leaving in its place a particle on both ends. If, somehow, Bob could see this wave without collapsing it, he would be immediately signaled faster than light as to what Alice had done, violating the speed-of-light limit.

Of course, Bob cannot see the wave without collapsing it. If he tries to look at it before Alice does, he collapses it himself and thus cannot be signaled by Alice. However, this would still imply that nature really does have superluminal signaling, just conspiratorially arranged so that it disappears whenever you look. It is always just below the surface where you can’t see it, while what you can see behaves entirely locally; but if there were some godlike figure who could see nature “as it really is,” they could make use of this superluminal signaling.

Why believe that nature conspires in a way that just so happens to appear local while in reality being nonlocal just below the surface, where we can never observe it? Why not just believe nature is local? That conclusion deals solely with what we observe and does not require positing these unobservable, astronomically high-dimensional nonlocal waves that collapse like a house of cards when you look at them.


bunchberry

Professional software developer (B.S. in CompSci), quantum computing enthusiast.