This will be my first post (and could be the last for quite some time) on this blog. I hope it is of interest and a good read to you.
I was a sound engineer for quite a while, working live, in the studio and for TV. For me, sound technology is a fascinating thing somewhere in between engineering/science and art. However, most sound engineers have no background in “true” engineering and mathematics – and they certainly don’t have to have that in order to do a good job. But I became curious and after a couple of years of working in the industry, I wanted to know more about the basics. Now, I’m close to graduating in electrical engineering.
There is this German-speaking group on Facebook (www.facebook.com/groups/215910965094245/) for sound engineers. A while ago, I couldn’t hold myself back and got into a discussion about “phase” there (rule #1: don’t get involved in online discussions). It started off with someone stating that “phase inversion is something different from switching polarity”. I have met several people claiming this over the years. While it may make you sound smart, it’s simply not true: a phase inversion (a shift by 180 degrees at every frequency) and a change of polarity are the same thing. Here I am to prove this to you.
I chose “But What Exactly is Phase?” as the title for this post because that’s the question I usually ask people who claim things they don’t really understand. It’s fascinating: phase seems to be this mystical thing that no one can really explain. At the same time, it’s the most commonly heard explanation for any unexpected phenomenon encountered when working with sound technology. Feedback? Must be a phase issue. Indirect sound? Must be a phase issue. Bad sound in general? Could be a phase issue. But it was rare for anyone to be able to explain to me exactly what that means.
The Short Answer
Say Hello to Jean Baptiste Joseph Fourier!
This is going to involve some mathematics. I’m sorry about that. If you didn’t get “The Short Answer”, I’ll try to explain the concepts as straightforwardly as possible.
Whenever we are talking about phase, we mean the phase of a signal. We can then analyze the signal in order to look at its properties, e.g., the phase. But what is a signal? (This might be another good question that you can use to confuse your colleagues.)
A signal is defined to be a function that maps time to amplitude. This means: for each moment in time, we define a certain value that represents the current value of the signal. We can, for example, look at electrical signals (measured as a voltage) or acoustical signals (measured as sound pressure). In most cases nowadays, we are analyzing digital signals. This type is actually not defined for every moment, but only at certain points in time. However, this doesn’t matter here and we’ll just forget about it for now.
Because both time and amplitude are one-dimensional, we can easily visualize a signal by drawing it. Some people call the plot of a signal a waveform. Below you can “see” one second of music.
You might have heard about the Fourier-transform. It is used to analyze the frequency content of signals. It basically splits up a signal into its harmonic (frequency) components. This means that we look at a new representation of the signal. Before, we looked at a value for each moment in time. Now, we look at a value for each frequency.
But why frequency? What is frequency? Well, here we could geek out, and there’s really no end to this. But at the end of the day, the answer to the question “why Fourier?” is: because it makes the math easy. There are endless applications of transforming a signal using the Fourier-transform or its mother, the Laplace-transform (ask any engineering student what exactly the difference between those two is and you most probably won’t get an answer).
But What Exactly is the Fourier-Transform?
Simply speaking, it is just a mathematical trick. It takes a signal and converts it to a different representation. Luckily, there also is an inverse transform. So we can, for example, do something to the signal after transforming it, and then convert it back to the time-representation. Please note: in reality, we’re always confronted with time series. Audio, in real life, is a signal that is represented by values for each moment in time.
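To make the “there and back again” idea concrete, here is a minimal sketch in plain Python. It uses a naive discrete Fourier transform on a short sampled signal. The helper names `dft` and `idft` and the test signal are assumptions made for illustration, and a real project would rather use an FFT library.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform: one complex value per frequency bin."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse transform: from frequency bins back to time samples."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

# Hypothetical test signal: 32 samples of an arbitrary mix of two tones.
n = 32
signal = [math.cos(2 * math.pi * 2 * t / n) + 0.5 * math.sin(2 * math.pi * 5 * t / n)
          for t in range(n)]

spectrum = dft(signal)     # time representation -> frequency representation
restored = idft(spectrum)  # ... and back again

# The round trip reproduces the original samples up to rounding error.
max_error = max(abs(r - s) for r, s in zip(restored, signal))
print(max_error)
```

The printed error should be tiny (on the order of floating-point rounding), which is all “there is an inverse transform” means in practice.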
After coming up with the new representation, we could start thinking of signals as being sums. Looking at a standard, time-dependent signal, we could say that it consists of a large number of very short signals that do not overlap. Let’s look at our music signal from before and split it, as an example, into four shorter signals that do not overlap:
You could now imagine continuing like this and splitting the signal into shorter and shorter pieces. We would end up with a huge number (strictly speaking, infinitely many) of very short (strictly speaking, infinitely short) signals that each represent the value at exactly one moment in time. But here’s the point: if we add all those signals up, we get back our original signal.
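The splitting idea can be tried out on a handful of samples. The following sketch assumes a made-up list of sample values and a hypothetical helper `split_in_time`. Each piece is as long as the original but is zero outside its own time slot, so adding the pieces restores the signal.

```python
# Hypothetical signal: any list of samples works.
signal = [0.0, 0.3, 0.8, 0.5, -0.2, -0.7, -0.4, 0.1]

def split_in_time(x, parts):
    """Return `parts` full-length signals, each nonzero only in its own time slot."""
    size = len(x) // parts
    pieces = []
    for p in range(parts):
        start = p * size
        # The last piece also takes any leftover samples.
        end = len(x) if p == parts - 1 else start + size
        pieces.append([v if start <= i < end else 0.0 for i, v in enumerate(x)])
    return pieces

pieces = split_in_time(signal, 4)

# Adding the non-overlapping pieces sample by sample gives the original back.
recombined = [sum(samples) for samples in zip(*pieces)]
print(recombined == signal)
```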
Now, let’s do the same thought experiment for frequencies. Let’s assume that we know the Fourier-transform of a signal. We could split this transform up in smaller and smaller pieces until we’d end up with many different values, each representing one single frequency.
There is an important mathematical statement that says that we can represent any “realistic” time-signal as the sum of values at all different frequencies. The Fourier-transform calculates those values, and the inverse transform performs the sum. Another important fact is that it makes no difference, mathematically, whether we transform the whole frequency representation back at once, or split it up into its individual frequencies, transform each of them back to the time-domain separately, and then sum the results. Think about this for a minute; it is quite a powerful statement.
In a second step, we could ask: what would a signal that consists of exactly one specific frequency look like? As you might know, the answer is called a “sine wave”.
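Here is a small sketch of that statement for a sampled signal, again with a naive discrete transform in plain Python. The test signal and the helper `single_frequency` are assumptions for illustration. Every frequency bin is turned back into its own time signal, and only afterwards are all of them added up.

```python
import cmath
import math

n = 16
# Hypothetical test signal.
signal = [math.sin(2 * math.pi * t / n) + 0.25 * math.cos(2 * math.pi * 3 * t / n)
          for t in range(n)]

# Forward transform: one complex value per frequency bin k.
spectrum = [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def single_frequency(k):
    """Time-domain signal that carries only the content of frequency bin k."""
    return [spectrum[k] * cmath.exp(2j * math.pi * k * t / n) / n for t in range(n)]

# Transform each frequency back on its own, then add up the results.
components = [single_frequency(k) for k in range(n)]
rebuilt = [sum(component[t] for component in components) for t in range(n)]

max_error = max(abs(r - s) for r, s in zip(rebuilt, signal))
print(max_error)
```

Each individual component is complex-valued, but the imaginary parts cancel in the sum, so the rebuilt signal matches the real input up to rounding error.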
Now we have one small problem left. Our answer to the question above is not wrong, but incomplete. Because, as you might know as well, a cosine at the same frequency consists of only that frequency as well!
This might seem strange. Let’s go back to representing our time-signal as a sum of many, very short signals. If we know the values at all points in time, we would have defined one, and exactly one, distinct signal. But this seems not to be the case here. Apparently, it’s not enough to know “the value” for each frequency.
The reason for this is that the mathematical properties of the Fourier-transform lead to a complex representation in the frequency domain. The values at all the different frequencies are complex, meaning that they’re not represented by the “normal” numbers you’re used to. Complex numbers actually consist of two numbers, i.e., a real and an imaginary part. This leads to two conclusions:
Firstly, our favorite mathematician Leonhard Euler found out that any complex number can be written as a combination of a sine and a cosine. This fact explains the ambiguity from above. There can be a signal consisting of only one frequency looking like a cosine, or looking like a sine wave. Or anything in between.
Secondly, we see that the values at each frequency are somehow two-dimensional. We need two “normal” numbers to represent each of them. One way is the mentioned split in real and imaginary part, but another way is looking at the “absolute value” (think of a quantity related to how much power the signal has at a frequency or how “big” the number is) and the – here it is – “phase” (think of this as telling you if we have more of a cosine or a sine at that frequency).
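A quick numerical check of “absolute value and phase”, as a sketch under assumptions: the signals are sampled, the tone fits exactly four times into the analysis window, and `bin_value` is a hypothetical helper that evaluates a naive discrete Fourier transform at one frequency.

```python
import cmath
import math

def bin_value(samples, k):
    """Complex Fourier coefficient of `samples` at frequency bin k (naive DFT)."""
    n = len(samples)
    return sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))

n = 64  # number of samples (an arbitrary choice)
k = 4   # the tone completes exactly four cycles in the window

cosine = [math.cos(2 * math.pi * k * t / n) for t in range(n)]
sine = [math.sin(2 * math.pi * k * t / n) for t in range(n)]

# cmath.polar splits a complex number into (absolute value, phase in radians).
cos_magnitude, cos_phase = cmath.polar(bin_value(cosine, k))
sin_magnitude, sin_phase = cmath.polar(bin_value(sine, k))

print(round(cos_magnitude, 6), round(math.degrees(cos_phase), 6))
print(round(sin_magnitude, 6), round(math.degrees(sin_phase), 6))
```

Both tones have the same absolute value at that frequency; only the phase differs, by 90 degrees. With the sign convention used in this sketch, the sine shows up at about -90 degrees relative to the cosine.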
First, let us recapitulate: Each signal can be represented as a sum of different frequencies. The value at each frequency has an absolute value and a phase.
I would like to emphasize one very important fact: If we look at a signal consisting of only one single frequency, that signal has to be infinitely long. We are, as stated above, talking about sine and cosine signals (and their combinations) here. The functions sin(t) and cos(t) are defined for all values of the time t, and there is no way around this.
This leads to one conclusion, the so-called uncertainty relation: a signal can be defined precisely either in time or in frequency, but not in both. If you listen to a sine wave, you never actually hear only one single frequency, because the duration of the signal is finite, and therefore more frequencies have to be involved.
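The effect can be observed numerically. The sketch below is an illustration under assumptions: it works on 128 samples, lets a sine that fills the whole window stand in for the endless sine (the discrete transform treats the window as repeating forever), and counts how many frequency bins carry a noticeable share of the signal. The helper names and the threshold are made up for this example.

```python
import cmath
import math

n = 128
cycles = 8  # the sine completes exactly 8 cycles in the window

# Stand-in for the endless sine: it fills the whole (periodically repeated) window.
endless = [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]
# The same sine, but switched on for only a quarter of the window.
burst = [endless[t] if 32 <= t < 64 else 0.0 for t in range(n)]

def magnitudes(samples):
    """Absolute values of a naive DFT for the bins from 0 up to half the sample count."""
    size = len(samples)
    return [abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / size)
                    for t in range(size)))
            for k in range(size // 2 + 1)]

def active_bins(samples, relative_threshold=1e-6):
    """Count the bins whose magnitude is not negligible compared to the strongest bin."""
    mags = magnitudes(samples)
    return sum(1 for m in mags if m > relative_threshold * max(mags))

endless_count = active_bins(endless)
burst_count = active_bins(burst)
print(endless_count, burst_count)
```

The full-window sine occupies a single bin, while the short burst spreads over many of them.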
Turning the Phase
The difference between sine and cosine is only a shift in phase. We could go deeper into the mathematics behind this, but let’s not do that for now.
Phase differences are related to angles. That’s another thing that I won’t explain here; it’s just math. But let’s see what happens when we shift the phase of a 2 Hz cosine wave by 0, 90, 180, 270 and 360 degrees:
We can see several things here. First, phase is ambiguous: a sine or cosine wave looks the same after a phase shift of 360 degrees (or …, -720, -360, 720, 1080, …). Also, we can transform any sine wave into a cosine and vice versa by applying a phase shift.
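Both observations are easy to check with a few lines of Python. This is a sketch with assumed details: the phase shift is added to the argument of the cosine (so turning the cosine into a sine takes -90 degrees in this convention), and the wave is compared at 200 sample points within one second.

```python
import math

FREQ_HZ = 2.0  # the 2 Hz cosine from the text

def shifted_cosine(t, shift_deg):
    """Cosine at FREQ_HZ with `shift_deg` degrees added to its phase."""
    return math.cos(2 * math.pi * FREQ_HZ * t + math.radians(shift_deg))

times = [i / 200 for i in range(200)]  # one second, 200 sample points

def max_difference(f, g):
    """Largest difference between two waves over the sample points."""
    return max(abs(f(t) - g(t)) for t in times)

# A shift by a multiple of 360 degrees gives the same wave again.
diff_360 = max_difference(lambda t: shifted_cosine(t, 360),
                          lambda t: shifted_cosine(t, 0))
diff_minus_720 = max_difference(lambda t: shifted_cosine(t, -720),
                                lambda t: shifted_cosine(t, 0))

# In this sign convention, a shift by -90 degrees turns the cosine into a sine.
diff_sine = max_difference(lambda t: shifted_cosine(t, -90),
                           lambda t: math.sin(2 * math.pi * FREQ_HZ * t))

print(diff_360, diff_minus_720, diff_sine)
```

All three differences are at the level of rounding error.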
The most important conclusion for our problem is the following: By applying a phase shift of 180 degrees, we’ll end up with the exact negative of our (co)sine.
If you went further into the math, you’d see that this is simply a property of complex numbers. Actually, this is what I wrote down above as “the short answer”. A phase shift of 180 degrees is equivalent to multiplying the wave by the factor -1, i.e., switching polarity!
Again, I’d like to point out that you have to imagine the signals drawn above to be infinitely long. This is crucial! Therefore, after a phase shift of 360 degrees, we really do have the exact same signal again!
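For readers who want to see that property written out, here is one way to put it. The notation is an assumption chosen for this sketch and does not come from the post itself: $A$ is the amplitude, $\omega$ the angular frequency and $\varphi$ the phase of a single-frequency wave, and 180 degrees corresponds to $\pi$ radians.

```latex
\begin{align}
  e^{i\pi} &= \cos\pi + i\sin\pi = -1 \\
  A\,e^{i(\omega t + \varphi + \pi)}
    &= e^{i\pi}\,A\,e^{i(\omega t + \varphi)}
     = -A\,e^{i(\omega t + \varphi)} \\
  A\cos(\omega t + \varphi + \pi) &= -A\cos(\omega t + \varphi)
\end{align}
```

The last line is just the real part of the second one: adding $\pi$ to the phase of a wave of any frequency flips its sign.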
The Distributive Law
This will be our last step. You might remember this rule from elementary school:
a · (b + c) = a · b + a · c
This is the so-called distributive law. It applies to our Fourier-transformation as well.
Recall that we found out that each signal can be looked at as being a sum of many signals at all frequencies. We want to ask ourselves what happens if we shift the phase at each frequency by 180 degrees. In the last paragraph we have seen that for each frequency, this will just be a multiplication by minus one.
Because of the distributive law, multiplying each of the frequencies by -1 and then summing up is equivalent to summing up the frequencies first and then multiplying by -1.
But summing up all the frequencies first will give us back our original signal. Therefore, a phase shift by 180 degrees is equivalent to a multiplication by -1 for the whole signal as well.
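The whole argument can be replayed numerically. The sketch below makes some assumptions: it uses a naive discrete transform on 32 samples, and the test signal is an arbitrary decaying wiggle chosen so that it is clearly not a single sine. Every frequency bin is rotated by 180 degrees and the result is transformed back.

```python
import cmath
import math

n = 32
# Hypothetical test signal: anything that is not just a single sine.
signal = [math.exp(-t / 8) * math.sin(0.9 * t) for t in range(n)]

# Forward transform (naive DFT).
spectrum = [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

# Shift the phase of every frequency by 180 degrees (pi radians).
rotated = [value * cmath.exp(1j * math.pi) for value in spectrum]

# Back to the time representation.
shifted = [sum(rotated[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
           for t in range(n)]

# If the claim holds, `shifted` is the negative of `signal`, so their sum vanishes.
max_error = max(abs(s + x) for s, x in zip(shifted, signal))
print(max_error)
```

The remaining error is pure rounding noise: the phase-shifted signal is the original with its polarity switched.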
Example 1: Facebook
I wrote all of this motivated by the discussion on Facebook that took place a while ago. I remember that the people there used an example looking like this:
This is some sort of time-restricted sine wave. Then, they stated that, after a 180-degree phase shift, this signal would look like this:
I can somehow understand where this comes from. But there is one very basic problem: As stated above, to have only one frequency, we need a signal that is spread out infinitely in time. This is simply not the case here. This becomes even more obvious when we look at the Fourier-transform (as stated, we need the absolute value and the phase) for the signal:
We can see clearly that this signal, because of its limitation in time, has much more than only one “active” frequency. Therefore, we do not see a phase shift of 180 degrees in the second plot. Believe it or not: a phase shift of 180 degrees will still result in an inversion of polarity.
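Here is a sketch of such a comparison. It rests on an assumption, since the pictures from the discussion are not reproduced here: the second picture is taken to be the burst delayed by half a period. The burst length, the period and all names are made up for illustration.

```python
import math

PERIOD = 16  # samples per cycle (an arbitrary choice)
N = 64       # length of the observation window

def burst(t):
    """Two cycles of a sine, switched on at sample 16 and off at sample 48."""
    return math.sin(2 * math.pi * t / PERIOD) if 16 <= t < 48 else 0.0

original = [burst(t) for t in range(N)]
delayed = [burst(t - PERIOD // 2) for t in range(N)]  # moved by half a period in time
inverted = [-value for value in original]             # polarity switched

# Samples where the delayed burst and the inverted burst disagree.
mismatches = [t for t in range(N) if abs(delayed[t] - inverted[t]) > 1e-12]
print(mismatches)
```

Where both bursts are “on”, the two signals agree. They differ only near the start and the end of the burst, which is exactly where the time limitation brings the additional frequencies into play.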
Example 2: “Square” Wave
You might know that a square wave consists of a fundamental frequency and its odd harmonics. Without even looking at the Fourier-transform, we’ll limit ourselves to the first few harmonics in time-domain and we will construct our own simple (almost-)square wave. After that, we will shift the phase of all the waves by 180 degrees to see what the result looks like.
Above, we see the plots for the fundamental frequency and its first few odd harmonics. The sum of all those sine waves looks like this:
If we added more and higher harmonics, we would come closer to a square wave. However, this approximation is good enough for our purposes here.
In a second step, we shift the phase of each individual sine wave by 180 degrees:
Finally, let’s see what happens if we sum up all of this:
Here we have it. It’s nothing other than an inversion of polarity, or a multiplication by -1. There is no magic in all of this.
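The same experiment in a few lines of Python, as a sketch with assumed details: a fundamental of 1 Hz, the odd harmonics 1, 3, 5 and 7 with amplitudes 1/h (the usual square-wave recipe, with the overall scaling factor left out), and 200 sample points within one second.

```python
import math

FUNDAMENTAL_HZ = 1.0      # assumed fundamental frequency
HARMONICS = [1, 3, 5, 7]  # the first few odd harmonics

def partial(h, t, shift_deg=0.0):
    """Harmonic number h with amplitude 1/h and an optional phase shift in degrees."""
    angle = 2 * math.pi * h * FUNDAMENTAL_HZ * t + math.radians(shift_deg)
    return math.sin(angle) / h

def almost_square(t, shift_deg=0.0):
    """Sum of the harmonics, all shifted by the same number of degrees."""
    return sum(partial(h, t, shift_deg) for h in HARMONICS)

times = [i / 200 for i in range(200)]  # one second, 200 sample points

normal = [almost_square(t) for t in times]
shifted = [almost_square(t, 180.0) for t in times]

# If shifting every harmonic by 180 degrees only flips the polarity,
# then shifted + normal is zero everywhere.
max_error = max(abs(s + x) for s, x in zip(shifted, normal))
print(max_error)
```

Adding more entries to `HARMONICS` brings the wave closer to a true square wave without changing the outcome.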
I want to emphasize one last time that you have to imagine the basic sine waves to be spread out infinitely in time. The whole theory won’t work otherwise.
Some last words
Congratulations if you made it this far. I would love to get some feedback on this and to hear if you’re interested in more content.
Please note that we only scratched the surface here and that the case of a shift by 180 degrees at all frequencies is very specific. Things get interesting when the phase shift is not the same anymore for every frequency, as it is, e.g., in most filters.