Today, I wanted to write about something fascinating, as opposed to modern cultural trends which could, perhaps, be described as both fascinating and horrifying.
This is a ‘brain dump’ from memory - and so one or two of the details may be a tiny bit off here and there, but I hope I can give a bit of the flavour of why one of the biggest revolutions in our understanding of the universe actually happened.
It’s interesting that what is, loosely, called ‘modern’ physics revolves around two great theories in which Einstein played pivotal roles. I’ve already written a bit about quantum mechanics (QM), but today I wanted to talk a bit about relativity.
Both of these great pillars of modern physics, QM and relativity, are counter-intuitive to say the least. In a way, that’s not surprising; QM deals with things on the scale of atoms and relativity only ‘kicks in’ when things are moving really, really fast[1].
I’m only going to consider what is known as Special Relativity (SR) and try to explain why it’s necessary. The results you get from it are, on the surface, a bit bonkers. They’re really, really hard to understand - perhaps even more so than the results of QM.
The fascinating thing about SR is that had Newton just viewed things in the ‘right’ way, he could have derived it all. All of the maths and physics was (essentially) there back in Newton’s day.
So, what’s all the fuss about?
To understand that, we really have to go back to Galileo.
Galileo is, most famously, seen as the bloke who stood up to the Catholic Church. He was a spreader of ‘disinformation’ according to the ‘official’ narrative.
Many amateur scientists use Galileo as some kind of example to support their own pet (and incorrect) theories about physics. They liken their struggles to get their ideas accepted by the ‘dogmatic’ mainstream to those of Galileo. They tend to forget that:
To be seen as another Galileo it is not enough merely to be a heretic.
One must also be right.
Galileo was the first (that I’m aware of) to state the principle of relativity.
Ooh, that sounds all techy and stuff. But it’s not, really. It’s a very simple, yet very profound, idea. We all ‘intuitively’ know about it.
If you’re flying to your holiday destination (assuming they’ll let a filthy unvaccinated person in) you’re likely travelling at something like 400mph - but if it’s a nice smooth flight (no turbulence etc) you’re not aware that you’re moving. You can usually eat the sludge they serve as food without it ending up all over you. You can drop a peanut and it lands on the floor exactly as you would expect it to if you were back home watching the next episode of The Simpsons.
The idea here is that if you’re on a plane (nice smooth ride) and you can’t look out of the windows, how do you know you’re moving? What experiment could you do to figure out how fast you were going, or if you were moving at all?
Galileo’s great insight is that there is no such experiment. It can’t be done (at least not with any purely ‘mechanical’ experiment[2]).
OK, OK, I get that, but why the heck is this important and what’s it got to do with anything?
Well, if you’re going to try to do some (mechanical) experiment, the bits and bobs you use to conduct that experiment are all going to obey Newton’s Laws. Doesn’t matter what your experimental design is - whatever you build is going to behave itself according to Newton.
What Galileo’s principle of relativity means is that you could build your experiment on the ground, or you could build your experiment on the plane, and you’d get the same results.
If you got different results you’d be able to tell the difference between ‘stationary’ (on the ground) and ‘moving’ (in the plane).
Putting this another way would be to claim that the (mechanical) laws of physics do not depend on your state of motion - they do not ‘change’ if you’re on the ground or in the plane.
Here’s where it gets a bit more tricky. How do we ‘quantify’ any of this? How do we ‘firm up’ a bit on this?
You guessed it, we’re going to need that maths stuff again. The idea is that we write the laws of physics down mathematically - so that means writing things in terms of things like position in space and time. The values that describe position and time are called coordinates.
We attach an x, y, and z value to be able to say where something is and we attach a time value, t, to be able to say when something is. It’s easier to think in just one dimension of space so we’re just going to think of the x value. An x value of 10, for example, would mean that the position we’re focusing on is 10m away from us. A t value of 2 would mean that we’re focusing on what’s happening at 2 seconds.
We imagine a kind of line of rulers and clocks stretching out. So, a flash of light that happens 10m away at a time of 2 seconds would be recorded by our imaginary rulers and imaginary clocks as having a coordinate value (x,t) of (10,2).
We say that we have a coordinate system for our frame of reference.
We could, we hope, write all our laws of physics in terms of these coordinates. Indeed, we can. Newton’s 2nd law, for example, which is often written as F = ma (force equals mass times acceleration) can be written as F(x,t) = mx"(t) which means that the force applied at position x and at time t gives rise to an acceleration (the acceleration is the 2nd time derivative[3] of the position at time t).
Don’t worry if you haven’t quite got all the details there - the important thing to note is that we can write our laws in terms of these coordinates x and t.
All fine and dandy, if maybe a little unfamiliar and confusing at first.
But if we have this imaginary coordinate system on the ground (and it’s only imaginary in the sense that we’re thinking about it - we could actually implement this by having real clocks and rulers laid out) can’t the person on the plane do the same thing?
Sure can. The person on the plane can lay out a system of rulers and clocks too. However, we have to recognise that these might be a different set of coordinates to those of the person on the ground.
Eh? Well, imagine our flash of light as before. Let’s suppose our ground person assigns (10,2) as the coordinates of this event. What about the plane person? Well, he might have travelled 5m past the ground person[4] and so, according to him the flash occurs at 5m, not 10m, according to his set of rulers. What about the time on the clocks? We’ll be general and suppose that it’s possible a different time was assigned by the plane person too.
This means that instead of the (x,t) being used on the ground, the plane person is using a different set of coordinates - and let’s label them as (X,T).
The thing is, if our plane person is travelling past the ground person at a nice steady speed, we ought to be able to relate the two different sets of measurements for that flash of light. If we know the plane person’s speed and we know what measurements (coordinates) the ground person has attached to this flash of light, we should be able to work out what measurements the plane person would attach to the same flash of light.
The technical name for doing this is a coordinate transformation.
I don’t blame you if you’re getting a bit lost at this point, so let’s remind ourselves of what we’re thinking about here. We’re trying to ‘firm up’ on this principle of relativity. We’re trying to find a way to describe it using our maths descriptions of the physical laws.
Now that we have an inkling that different people might assign different measurement values (different coordinates) to events (like flashes of light) what does this mean for our laws of physics?
Galileo’s principle of relativity means that when we write the laws down in terms of x and t, they look (mathematically) the same as when we write them down in terms of X and T.
To take a bit of a silly example, let’s suppose our (ground) law was written as
law = x + t
then Galileo’s principle means that our ‘plane’ law should be written as
LAW = X + T
If we had LAW = X + T + v, where v was the speed of the plane, then we’d be able to tell the plane was ‘moving’.
The ‘transformation’ that must be applied to the coordinates to go from one set (ground) to the other (plane) was thought to be what’s known as the Galilean transformation. It looks like this
X = x - vt
T = t
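To make that concrete, here is a rough sketch in Python of the Galilean transformation applied to the flash of light from earlier. The function names are mine, and the speed of 2.5 m/s is an assumption chosen so that the plane person has moved 5m by the time t = 2 (assuming both sets of rulers lined up at t = 0).

```python
def galilean_transform(x, t, v):
    """Ground coordinates (x, t) -> plane coordinates (X, T)."""
    return x - v * t, t


def inverse_galilean_transform(X, T, v):
    """Plane coordinates (X, T) -> ground coordinates (x, t)."""
    return X + v * T, T


# The flash as recorded on the ground: 10 m away, at 2 seconds.
flash_ground = (10.0, 2.0)

# Assumed speed (m/s): the plane person is 5 m along when the flash happens.
v = 2.5

flash_plane = galilean_transform(*flash_ground, v)
print("plane coordinates:", flash_plane)  # (5.0, 2.0)

# Going back the other way recovers the ground measurements.
print("back on the ground:", inverse_galilean_transform(*flash_plane, v))
```

The plane person records the flash at 5m rather than 10m, but at the same time on the clock, which is exactly what T = t says.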
If we ‘transform’ our hypothetical (ground) law to the new coordinates relevant to the ‘plane’ then we’re going to get
x + t → X + vT + T
So, the law ‘looks’ different, mathematically - which means we would be able to detect our state of motion in violation of the Galilean principle of relativity with some experiment (we could work out the v here).
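As a quick check of that silly example, the sketch below evaluates the toy law in both sets of coordinates. It reuses the flash at (10,2) and an assumed speed of 2.5 m/s; the function names are made up for illustration only.

```python
def ground_law(x, t):
    """The toy 'law' written in ground coordinates."""
    return x + t


def to_plane(x, t, v):
    """Galilean transformation: ground (x, t) -> plane (X, T)."""
    return x - v * t, t


def same_form_plane_law(X, T):
    """What the law would be if it kept the same form on the plane."""
    return X + T


def transformed_plane_law(X, T, v):
    """What the ground law actually becomes in plane coordinates."""
    return X + v * T + T


v = 2.5  # assumed speed of the plane in m/s
x, t = 10.0, 2.0
X, T = to_plane(x, t, v)

print(ground_law(x, t))                # 12.0
print(transformed_plane_law(X, T, v))  # 12.0, but only because v appears
print(same_form_plane_law(X, T))       # 7.0

# The mismatch gives the speed away: (12 - 7) / T = v
recovered_v = (ground_law(x, t) - same_form_plane_law(X, T)) / T
print(recovered_v)                     # 2.5
```

The leftover vT term is the giveaway: a law of this shape would let the plane person measure v, which is exactly what the principle of relativity forbids.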
So we’ve been able to re-write the Galilean principle of relativity from “peanuts fall as expected” into: the laws of physics are invariant upon coordinate transformation between these ‘frames of reference’ (the ground and the plane).
It’s a lot of work and a lot of thought, but we’ve been able to capture the idea behind the principle of relativity in a more precise (and useful) way.
And, what do you know? - if you apply this Galilean coordinate transformation to Newton’s laws - the laws look the same in both the ‘ground’ and the ‘plane’ coordinate sets. Great stuff.
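For anyone who wants to see why, here is a sketch of the one-dimensional argument for Newton’s 2nd law. It assumes that v is constant and that the force F and the mass m take the same value in both frames.

```latex
\begin{align*}
X &= x - vt, \qquad T = t \\
\frac{dX}{dT} &= \frac{dx}{dt} - v \\
\frac{d^2X}{dT^2} &= \frac{d^2x}{dt^2} \qquad \text{(the constant } v \text{ drops out)} \\
F = m\,\frac{d^2x}{dt^2} \;&\Longrightarrow\; F = m\,\frac{d^2X}{dT^2}
\end{align*}
```

Velocities differ by v between the two frames, but accelerations do not, so the 2nd law has the same form whether it is written with (x,t) or with (X,T).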
Then Maxwell buggered everything up.
Maxwell, building on the work of people like Gauss and Faraday, wrote down the complete set of equations that describe the electromagnetic field (EM field). Maxwell’s equations, as they came to be known, were a phenomenal success and were added in to the physics canon as the laws of electromagnetism. There was a tiny issue, though.
It was known that the laws of mechanics (Newton’s laws) ‘behaved’ themselves when transforming from one reference frame to another (from the ground frame to the plane frame, as above) using the Galilean transformation. Maxwell’s equations didn’t.
OK - maybe, then, we could use some electromagnetic experiment to determine the state of motion; we could tell the difference between ‘ground’ and ‘plane’, not by a purely mechanical experiment, but by using the properties of the EM field. They tried it and it didn’t work. They got the same results, just like they did when using purely mechanical experiments.
Physicists tried all sorts of tweaks and adaptations of Maxwell’s laws that transformed ‘properly’ under the Galilean transformation, but none of them gave the right predictions for experimental results. Then Hendrik Lorentz figured out a transformation rule that did work. If, instead of using the Galilean transformation, you used this new transformation (known as the Lorentz transformation) the equations of electromagnetism ‘looked’ the same when you went from ‘ground’ frame to ‘plane’ frame.
So you had this situation where one set of laws of physics (the mechanical laws) were invariant under the Galilean transformation, and another set of laws of physics (the EM laws) were invariant under this new Lorentz transformation.
What the heck was going on?
Experiments had shown that neither electromagnetism nor mechanics could be used to tell the difference between ‘ground’ and ‘plane’ frames - so the mathematical laws had to have this underlying invariance when you went from the description in one frame to the description in another. But 2 different transformations?
Many of you will probably already know that it was Einstein who put all the pieces of the jigsaw in place. His 1905 paper On The Electrodynamics Of Moving Bodies is an absolute classic, a real masterpiece. Other brilliant scientists like Lorentz, or mathematicians like Poincaré had come close, but it was Einstein who really fleshed everything out from first principles and completely changed our view of the world forever.
But what fewer people know is that Newton could have, more or less, worked all this out. In a brilliant paper, Mitchell Feigenbaum showed that if we assume three properties:
homogeneity of space (space is the same ‘here’ as over ‘there’)
isotropy of space (space is the same whatever angle we turn through)
the principle of Galilean relativity
then there are only two possible coordinate transformations (ways of going from the ‘ground’ frame to the ‘plane’ frame) consistent with these assumptions.
No prizes for guessing that the only 2 possible candidates for this transformation are the Galilean transformation or the Lorentz transformation.
What’s the difference? The difference is that the Galilean transformation allows relative velocities to be arbitrarily large, with no limit at all, whereas the Lorentz transformation requires some upper bound on this relative velocity. We know that this upper bound is in fact the speed of light.
I’ve introduced a new word there - relative. If you think about it, if you can’t tell the difference between ‘ground’ and ‘plane’ then you can’t tell which one is really ‘moving’. It’s a bit clearer if you think of two boxes floating in deep space moving apart from one another. Inside either box you can’t tell if you’re moving or not. So which one of the boxes is ‘moving’? You can’t tell. And so the only important thing is their relative state of motion - hence the word relativity in Special Relativity.
There is only one transformation that ‘works’ for both mechanics and electromagnetism and that’s the Lorentz transformation. So why didn’t we know about this from mechanics? The Lorentz transformation contains terms in v (the relative speed) divided by c (the speed of light). For our plane travelling at 400mph the ratio of v/c is about 0.0022[5] - and this ratio is squared in the actual formulae for the transformations so it’s about 0.000005. So the effect is small when your speed is not a big enough fraction of the speed of light. If we set the term v/c to zero in the Lorentz transformation we get back to the Galilean transformation.
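The post never writes the Lorentz transformation out, so here is a sketch using the standard textbook form, X = γ(x - vt) and T = γ(t - vx/c²) with γ = 1/√(1 - v²/c²). It takes c as 186,000 miles per second (the corrected figure from the footnote), with distances in miles and times in seconds. The function names and the sample event are assumptions for illustration.

```python
import math

C = 186_000.0  # speed of light in miles per second (approximate)


def gamma(v, c=C):
    """The factor that multiplies everything in the Lorentz transformation."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz_transform(x, t, v, c=C):
    """Ground coordinates (x, t) -> plane coordinates (X, T)."""
    g = gamma(v, c)
    return g * (x - v * t), g * (t - v * x / c ** 2)


def galilean_transform(x, t, v):
    """The low-speed version: X = x - vt, T = t."""
    return x - v * t, t


# A plane at 400 mph, converted to miles per second.
v_plane = 400.0 / 3600.0
beta = v_plane / C
print(f"v/c = {beta:.1e}, (v/c)^2 = {beta ** 2:.1e}")

# A sample event 10 miles away at t = 2 seconds, seen from the plane.
event = (10.0, 2.0)
print("Lorentz: ", lorentz_transform(*event, v_plane))
print("Galilean:", galilean_transform(*event, v_plane))

# At 60% of the speed of light the difference is no longer negligible.
print("gamma at 0.6c:", gamma(0.6 * C))
```

At airliner speeds the two transformations agree to far more decimal places than any ruler or clock of Newton’s day could resolve. At 0.6c the factor γ is about 1.25, so the Galilean version is badly wrong.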
The existence of an upper bound to relative velocity has all sorts of interesting and, frankly, bizarre consequences. You may have heard of some of them: time dilation and length contraction, for example (not to mention the famous equation of E equals m c squared). They are very hard to understand, probably because our brains are wired to understand ‘low speed’ things. If we had regularly zipped about close to the speed of light throughout our evolution, then it would all probably be intuitive.
So, Einstein’s audacious step amounted to the statement that Maxwell’s equations were fine - they already had the right invariance property when going from one frame to another. It was Newton’s equations that needed re-interpreting.
Given how odd some of the consequences of SR actually are (time dilation, for example) it’s quite sobering that the experimental predictions of SR have been confirmed countless times in laboratories (not to mention the corrections from both special and general relativity[6] that must be added to the GPS system in order to get the right accuracy at ground level).
There’s a reason why Einstein is (rightly) revered - and even if you exclude his work on relativity he would probably still be ranked as the greatest physicist of the 20th century.
[1] This is not wholly accurate, but we’ll run with this convenient simplification as a way of accepting that, perhaps, our intuition derived from everyday experience may not serve us too well when applied to things that are greatly outside that experience.
[2] The laws of electromagnetism were not known in Galileo’s day - but we’ll get on to that in a bit.
[3] The double dash here is maths shorthand for “differentiate twice”.
[4] And I suppose we’d have to imagine a very low flying plane here.
[5] This is, of course, incorrect, as Ben pointed out in the comments. I used the value of 186,000mph for the speed of light - when I should have used 186,000 miles per second. This gives a value for v/c of about 0.0000006 which becomes about 0.0000000000004 when squared. This should teach me not to attempt to write more technical pieces late at night whilst drinking brandy.
[6] The special theory deals with frames of reference that are moving at a nice constant relative velocity. The general theory extends this to accelerating frames of reference (and to gravity) - and it’s much, much harder mathematically - which is why it took Einstein about a decade to go from the special to the general theory.
Pretty sure your plane isn't traveling at 0.0022c but otherwise this is a superb explanation of special relativity.
Funny how physics and social sciences use the same basic principle: that things are only definable (measurable) relative to something or someone else, even imaginary somethings.