Phil Karn

"Who you jiving with that cosmik debris?

Now what kind of a guru are you, anyway?

Look here brother, don't waste your time on me!"

--Frank Zappa, Cosmik Debris

Extraordinary claims require extraordinary evidence.

--Attributed to Carl Sagan

I could not believe such total nonsense had been published in a mainstream engineering magazine. It reminded me of an article in 73 Amateur Radio Magazine (a less, uh, technically rigorous publication) about 10 years ago that also described a scheme for ultra-narrow-band digital modulation.

The idea was as simple as it was misguided: if the bandwidth of a phase-modulation signal depends on the deviation, then you could make the signal as narrow as you want by simply turning down the modulation. The fallacy is revealed with a simple analysis, so I wrote up a critique, posted it to USENET and forgot about it. I can't find my original analysis, but here is a recreation.

But crackpots are nothing if not tenacious. Like the poor souls who never give up trying to perfect perpetual motion, hoping to find (or claiming to have found) a loophole in the laws of thermodynamics, the communications field has its own share of cranks claiming to have beaten the Shannon limit.

The latest incarnation of this futile quest for digital snake oil
seems to be a technology called "VMSK" and a related variant called
"VMSK/2". A Yahoo! search for "VMSK"
produced literally *hundreds* of hits. Most
pages appear to belong to individuals who have bought into
a multilevel marketing scheme run by
AlphaCom Communications,
a company in Ohio, and contain links to one of the following sites:

- www.networkalpha.com
- www.vmsk-service.com
- www.vmsknetwork.com
- www.vmsk.org
- www.alphacomopportunity.com

These sites tout a "revolutionary" new data communications technology that will speed up communications by factors variously stated as 20x (vs digital cellular) up to 1,000 times (vs conventional dialup telephone modems). What sketchy technical information can be found claims spectral efficiencies as high as 90 bits/sec/Hz. Besides the article in EDN, VMSK was mentioned in a Slashdot article and has apparently been touted by Leonard Nimoy on CNBC TV.

A look at VMSK shows it to be a variant of the "narrow-band PM"
scheme described in 73. It seems to be the brainchild of one
individual, Mr. Harold (Hal) R. Walker of Pegasus Data Systems in
Middlesex, NJ. Mr. Walker has written thoroughly confused papers such
as
Reevaluating Shannon's
Limit
that demonstrate an obstinate refusal to accept the
mathematical principles that govern digital communication. Like the
would-be perpetual motion inventors who claim that the Second Law of
thermodynamics doesn't really mean what it says, Mr. Walker would have
us believe that *his* invention is exempt from
the
Shannon limit.

Shannon's paper is a landmark precisely because it applies to
*every* possible modulation and coding scheme, whether or not
it had been conceived in Shannon's time. It makes no exceptions;
*schemes for reliably exceeding channel capacity simply do not
exist.* Having withstood the test of time, Shannon's work is now
as firmly established as the laws of thermodynamics [1].

Walker has also managed to publish papers in an IEEE publication and at several professional conferences. The ones I've found copies of are

- Attain High Bandwidth Efficiency With VMSK Modulation: VMSK modulation can achieve bandwidth efficiencies up to 30 b/s/Hz, Microwaves and RF, December 1997.
- VPSK and VMSK Modulation Transmit Digital Audio and Video at 15 Bits/Sec/Hz, IEEE Transactions on Broadcasting, March 1997.
- The Advantages of VPSK Modulation for Data Transmission: 10 bits/Hz data Compression Without Loss of Signal Power, Proceedings of Wescon '95.

A look at any of these writings should convince any communications engineer of his civic duty to help review submissions to conferences and journals.

Walker claims that "properly applied", the Shannon limit does not
invalidate his claims. But a look at his tortured analysis shows that
he merely redefines "bandwidth" to suit his aims; his *real* bandwidth
is far greater than he claims.

This is a bit like proving that 2 + 2 = 5 if we simply redefine the numbers 2 and 5.

Walker has two schemes he calls "VMSK" and "VMSK/2" [2]. Although his papers are hard to read and full of analytical errors, they do give enough information about the techniques to do an independent analysis.

The VMSK waveform has a rising transition at the start of each bit. A falling transition occurs in the middle of each bit; its precise timing depends on whether a 0 or 1 bit is being sent. In one of his examples, a falling edge at 7/16ths of the bit interval signals a zero, while a one causes the falling edge to occur at 9/16ths of the bit.

Just by looking at the VMSK waveform we can see that it makes very inefficient use of transmitter power. The first and last 7/16ths of each bit are always the same, whether the bit is a zero or a one. Thus 14/16ths (87.5%) of the energy in each bit does nothing to help the receiver distinguish a zero from a one. It is simply wasted energy.

This leaves the middle 2/16ths (12.5%) of each bit. If the bit is a zero, the waveform will be at -1 during this time; if it's a one, the waveform will be at +1. This forms an "antipodal" signal set essentially equivalent to coherent BPSK.

The bit error rate for coherent BPSK is:

**BER (bpsk) = 1/2 * erfc(sqrt(Eb/No))**

where **erfc** is the
complementary error function. It is the
area under the Gaussian error curve, integrated from the argument to
infinity, normalized to 1 for an argument of 0. I.e., for Eb/No = 0,
the BER is 0.5 and the demodulator output is completely useless. For a
very large Eb/No, the BER goes to zero. (The general shape of the erfc
function is familiar to anyone who has seen modem BER-vs-Eb/No
"waterfall" plots.) Achieving a 10**-5 BER with BPSK and an ideal
receiver requires an Eb/No of about 9.6 dB.

We must now account for the fact that only 1/8 of our bit energy lies in the "meaningful middle":

**BER (vmsk) = 1/2 * erfc(sqrt(Eb/(8*No)))**

So we can see that the power efficiency of VMSK with these parameters is a factor of 8 (9 dB) worse than BPSK. Achieving a 10**-5 BER with VMSK therefore requires a minimum Eb/No of 18.6 dB.
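These formulas are easy to evaluate numerically. The following Python sketch (the bisection search is my own construction, not part of the original analysis) finds the minimum Eb/No each scheme needs for a 10**-5 BER:

```python
# Numeric check of the BPSK and VMSK error-rate formulas above,
# using the 7/16 - 9/16 VMSK parameters from the text.
from math import erfc, sqrt, log10

def ber_bpsk(ebno):                 # ebno as a linear (non-dB) ratio
    return 0.5 * erfc(sqrt(ebno))

def ber_vmsk(ebno):                 # only 1/8 of the bit energy is useful
    return 0.5 * erfc(sqrt(ebno / 8.0))

def required_ebno_db(ber_func, target=1e-5):
    """Find the Eb/No (in dB) giving the target BER, by bisection."""
    lo, hi = 0.01, 1e6
    for _ in range(100):
        mid = sqrt(lo * hi)         # bisect on a log scale
        if ber_func(mid) > target:
            lo = mid
        else:
            hi = mid
    return 10.0 * log10(mid)

print(required_ebno_db(ber_bpsk))   # ~9.6 dB
print(required_ebno_db(ber_vmsk))   # ~18.6 dB, i.e. 9 dB worse
```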

Here's another analysis that just happens to give the same results. We can look at VMSK as the sum of two signals: a 50% duty-cycle square wave "clock" at the data rate, plus one of two narrow pulses depending on the bit being sent. The pulse for a "zero" bit advances the negative-going clock transition in the middle of the data bit, and the pulse for a "one" retards it. If the clock has amplitude +/-1 unit, then the "zero bit" pulse is a negative-going pulse of amplitude -2 units and duration 1/16 starting at time 7/16 and ending at time 1/2. Similarly, the "one bit" pulse is a positive-going pulse of amplitude 2 units that starts at time 1/2 and ends at time 9/16.

Because the clock component is the same for each bit, it conveys no information. All of the information about the data is conveyed in the additive pulses. Because the pulses have amplitude +/-2 and are 1/16 of a bit long, their relative energy is (+/-2)**2 * (1/16) = 1/4.

Note that this is twice the energy of the antipodal pulses in the
previous analysis. But in this analysis, the two pulses do not
overlap; one is always zero whenever the other is nonzero.
This makes the signal set *orthogonal*, not antipodal.
In fact, these two pulses constitute pulse-position-modulation (PPM),
which is mentioned in Shannon's paper. Another widely used orthogonal
modulation technique is frequency-shift keying (FSK).

Because we have a clock, we know where each possible pulse
starts and stops, so we can demodulate this signal coherently. (The
clock isn't *totally* useless, but it's far stronger than it
needs to be. We could produce a good receiver clock by using a narrow
PLL to track a much weaker transmitted clock.)

The bit error rate for ideal coherent demodulation of binary orthogonal signalling is

**BER (coherent binary orthogonal) = 1/2 * erfc(sqrt(Eb/(2*No)))**

Note that this scheme is a factor of 2 (3dB) worse than coherent
BPSK, which is an antipodal (not orthogonal) scheme. This formula
applies to any kind of binary orthogonal modulation, such as binary
FSK and binary PPM, as long as they're all coherently
demodulated. (Note that binary FSK is usually demodulated
*non*coherently, and a different formula applies in that case
that provides worse performance.)

Once again, because the VMSK clock carries power but no useful information, we must account for the fraction (1/4) of transmitted power that goes into the data-bearing pulses. Then we get:

**BER (vmsk) = 1/2 * erfc(sqrt(Eb/(8*No)))**

Which happens to be the exact same formula we derived in our first analysis! Once again, we've shown that VMSK with these parameters is 6 dB worse than coherent binary orthogonal modulation, and 9 dB worse than coherent binary PSK.
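As a sanity check, both energy-accounting arguments can be plugged in side by side; the normalized Eb and the particular No below are illustrative choices of mine, not values from the text:

```python
# Verify that the two energy-accounting analyses give the same BER.
from math import erfc, sqrt

Eb = 1.0             # total energy per bit (normalized)
No = 0.25            # arbitrary noise density for the comparison

# Analysis 1: antipodal middle section, 2/16 of the bit at amplitude
# +/-1, so the useful energy is Eb * (2/16) = Eb/8.
ber1 = 0.5 * erfc(sqrt((Eb / 8.0) / No))

# Analysis 2: orthogonal pulses of amplitude +/-2 lasting 1/16 of a
# bit, so pulse energy is Eb * (2**2) * (1/16) = Eb/4, and coherent
# orthogonal signalling costs a further factor of 2.
ber2 = 0.5 * erfc(sqrt((Eb / 4.0) / (2.0 * No)))

print(ber1, ber2)    # identical, as derived in the text
```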

Both analyses have shown the best that can be achieved with the best possible VMSK demodulator and no bandwidth restrictions. Filtering the signal would only make things worse as the filter would chop off some of the sideband energy, weakening it further with respect to the clock.

So we've quantified our intuitive conclusion that VMSK requires more transmitter power than BPSK. Now this would be acceptable if we got something in return, such as narrower bandwidth; after all, Shannon's equation shows that there is a fundamental tradeoff between SNR and bandwidth. But in the next section we'll see that, contrary to its inventor's claims, VMSK is very wasteful of bandwidth as well.

But the previous discussion should give you a clue as to what's really going on: so much transmitter power goes into the narrow band clock that it completely dominates the spectral display. Not only does a much smaller fraction of transmitter power go into the sidebands (the only part that carries useful information), but that power is spread over a much wider bandwidth, making it hard to see. But it's still there, and it's what makes the system work.

To understand why the VMSK spectral display looks as it does, we need to understand how a spectrum analyzer works. Traditionally, a spectrum analyzer consists of a swept bandpass filter synchronized with the horizontal drive of an oscilloscope. The power at the output of the filter is detected and drives the vertical axis of the display. So as the filter sweeps across the signal spectrum, it plots the signal energy as a function of frequency.

The swept filter bandwidth is usually an operator control
labeled as the *resolution bandwidth* (RBW). Other analyzer
parameters include the frequency limits of the filter sweep, and the rate
at which the sweep is made. It is especially important to ensure that
the sweep rate is limited according to the RBW; narrower
bandwidths require slower sweeps.

Some modern spectrum analyzers work by digitizing the input signal, performing a Fast Fourier Transform (FFT), and displaying the resulting amplitudes. The same process can be performed entirely in software, as when simulating a system. Although the implementation is completely different from a traditional spectrum analyzer, you can think of the FFT-based analyzer as having a bank of filters, one for each frequency "bin" across the desired range, all operating in parallel. The width of each frequency bin is the effective resolution bandwidth, and making the bins smaller requires that the analyzer process a longer stretch of input signal.

If a narrow signal (e.g., a clean unmodulated CW carrier) is
fed to a spectrum analyzer, it will fall
completely within the bandwidth of the analyzer filter as it sweeps
past. (In a FFT analyzer, it will fall entirely into one FFT frequency
bin or two adjacent bins.)
This tends to make it stand out prominently on the display. But when a
*wideband* signal is fed to an analyzer, the swept filter (or
each FFT bin) captures only a small fraction of the total signal
power. So even if the wideband signal has the same total power as the
CW carrier, it will appear as a broad plateau at a much lower
amplitude.

As we saw earlier, we can model the VMSK signal as the sum of two signals: a 50% duty-cycle square wave "clock" at the data rate, plus one of two narrow pulses that depend on the bit being sent. Seen on a spectrum analyzer, the clock will show up as a very prominent spike: not only is it narrow enough to fall entirely into the analyzer's RBW, but it also carries most of the signal power. (Actually, the clock will show up as several discrete frequency spikes because we've shown it as a square wave, and square waves have odd-order harmonics.)

Now let's look at the data pulses. The Fourier transform (frequency spectrum) of a single square pulse T seconds long is

**T*sin(pi*f*T)/(pi*f*T)**

where **f** is the frequency in Hz.
Because this sin(x)/x expression occurs so often in signal processing,
it has a special name: the **sinc** function, defined as

**sinc(x) == sin(pi*x)/(pi*x)**

So we can restate the frequency spectrum of the square pulse as

**T*sinc(f*T)**

The sinc function evaluates to 0 at all integer arguments except 0,
where it evaluates to 1. So our square pulse T seconds long has a
sinc-shaped spectrum with a peak at 0 Hz and nulls at f= (..., -2/T,
-1/T, +1/T, +2/T, ...). Note that except for these nulls, this pulse has
spectral energy at *every* frequency, though the amplitude
steadily decreases with increasing frequency. That's because a
time-limited signal like our single data pulse has an infinitely wide
spectrum (and conversely, a band-limited signal has an infinitely long
time duration).
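A few lines of Python confirm these properties of the pulse spectrum (the pulse length T below is an arbitrary example value):

```python
# Check the sinc-spectrum properties described above: the spectrum of
# a T-second square pulse peaks at 0 Hz and has nulls at every
# nonzero integer multiple of 1/T, but is nonzero everywhere else.
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

T = 0.5                        # pulse length in seconds (arbitrary)

def pulse_spectrum(f):         # T * sinc(f*T), from the text
    return T * sinc(f * T)

print(pulse_spectrum(0.0))         # peak value: T
print(pulse_spectrum(1.0 / T))     # first null: 0
print(pulse_spectrum(3.5 / T))     # between nulls: small but nonzero
```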

Conventional BPSK uses square data pulses that last for the entire bit duration. So the baseband spectrum for the data stream is sinc-shaped with a maximum at 0 Hz (DC) and nulls at non-zero integral multiples of the bit rate:

But the data pulses in VMSK last only a fraction (1/16) of the bit time. What does this do to the VMSK spectrum relative to BPSK? It makes it wider!

Note that these two graphs have the same scale. Both BPSK and VMSK
have maximum spectral density at 0Hz (DC). But unlike BPSK, where the
first null occurs at 1 times the data rate, the first null in the VMSK
pulse spectrum occurs at *sixteen times* the data rate! At the
same time, the peak VMSK spectral density is 1/256 of that for BPSK
(i.e., it's 24 dB lower). That's because the VMSK pulse has 1/16 as
much energy as the BPSK pulse, and it spreads this energy over 16
times as much bandwidth.
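These numbers can be reproduced directly from the sinc formulas, comparing unit-amplitude pulses of length T (BPSK) and T/16 (VMSK); the code below is a sketch of that comparison:

```python
# Compare the baseband energy spectral densities of a BPSK pulse
# (unit amplitude, length T) and a VMSK data pulse (unit amplitude,
# length T/16), per the discussion above.
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

T = 1.0                                  # bit time: 1 s -> 1 bit/s

def bpsk_density(f):                     # |T*sinc(f*T)|^2
    return (T * sinc(f * T)) ** 2

def vmsk_density(f):                     # pulse is only T/16 long
    tau = T / 16.0
    return (tau * sinc(f * tau)) ** 2

ratio_db = 10 * math.log10(bpsk_density(0.0) / vmsk_density(0.0))
print(ratio_db)                          # ~24 dB: VMSK peak is 1/256
print(bpsk_density(1.0 / T))             # BPSK first null at the bit rate
print(vmsk_density(16.0 / T))            # VMSK first null at 16x bit rate
```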

All this follows from the fundamental property of Fourier transforms that signals shorter in time are wider in frequency, and vice versa.

Because the user data is random, the phase of the spectral components produced by each pulse will also be random, so averaging their spectra over time will produce the same continuous, wideband sinc-shaped spectrum as the single pulse, i.e., 16x as wide as the spectrum for conventional BPSK. At high data rates this modulation spectrum will be so weak and broad that it could easily fall beneath the noise floor of a spectrum analyzer unless the signal is extremely strong. (The front ends in spectrum analyzers are optimized for extremely linear operation over a wide frequency range, and noise figure is usually sacrificed as a result.)

So far we have analyzed the data spectrum separately from the clock. What do you see on an analyzer when you combine them? It depends on the RBW setting of the analyzer. Because the clock is a single frequency, its infinitely narrow spectrum always falls entirely into the analyzer filter when it sweeps by, no matter how small the RBW setting. So the clock always appears at the same amplitude on the analyzer, independent of the RBW setting.

This is not the case for the modulation unless the RBW is high enough to capture the entire spectrum at once. At lower settings, the apparent amplitude of the modulation will be roughly proportional to the RBW. Make the RBW small enough, and the noise and modulation will both decrease in apparent amplitude, leaving only the clock plainly visible.
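This effect is easy to demonstrate in simulation. The sketch below is my own toy model, not Walker's implementation: 256 random bits, 16 samples per bit, and a naive DFT standing in for a real analyzer. It builds the VMSK waveform described earlier and compares the DFT magnitude at the clock (bit-rate) frequency against a few arbitrarily chosen sideband bins:

```python
# Toy simulation: the clock spike towers over the data "grass".
import cmath, random

random.seed(1)
NBITS = 256
SPB = 16                          # samples per bit (1 sample = 1/16 bit)

# VMSK waveform per the text: +1 from the start of the bit until the
# falling edge (at 7/16 for a 0, 9/16 for a 1), then -1 to the end.
samples = []
for bit in (random.randint(0, 1) for _ in range(NBITS)):
    edge = 9 if bit else 7
    samples += [1.0] * edge + [-1.0] * (SPB - edge)

N = len(samples)

def mag(k):
    """Naive DFT magnitude at bin k (slow but dependency-free)."""
    return abs(sum(s * cmath.exp(-2j * cmath.pi * k * n / N)
                   for n, s in enumerate(samples)))

clock_bin = NBITS                 # bit-rate frequency = NBITS cycles/record
sideband_bins = [clock_bin - 11, clock_bin + 5, clock_bin + 37]
print(mag(clock_bin))                     # large spike
print([mag(k) for k in sideband_bins])    # much smaller sidebands
```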

Walker's claim that the "grass" is somehow "intersymbol interference"
(ISI) is utter nonsense. ISI is the *time-domain* phenomenon
whereby the pulses from adjacent data bits (or symbols) are "smeared"
into each other, making data recovery difficult or impossible. Yet the
"grass" plainly appears on the spectra of clean VMSK waveforms with no
ISI whatsoever. Indeed, as we'll see below, filtering the VMSK signal
can only *introduce* ISI. And filtering any binary signal
(including VMSK) to the 90 bps/Hz bandwidths he claims would generate
so much ISI as to render the system completely unusable.

This is inherent in the way filters work; the ISI *cannot* be
circumvented by clever design. To avoid ISI, the filter *must*
have a certain minimum bandwidth.

To better understand why, let's return to the analysis of the
spectrum of BPSK. There we saw that the Fourier transform (frequency
spectrum) of a square pulse in time is a sinc function in
frequency. By the reciprocity property of Fourier transforms, the
Fourier transform of a square "pulse" in *frequency* is a sinc
function in *time*.

The shorter a pulse, the wider the bandwidth it must occupy. We saw this phenomenon in the spectrum of VMSK's narrow data pulses. In the limit, an infinitely short pulse has uniform energy over all frequencies, as its sinc-shaped spectrum has its first null at infinite frequency.

When a short pulse (or an impulse) is fed to a filter, it "rings",
stretching the short input pulse into a longer output pulse. The
output of a filter fed an impulse is called, naturally enough, its
*impulse response*. Any linear filter is completely described
by its impulse response.

The narrower the filter bandwidth, the longer the time between the zero crossings in the impulse response. If the filter is 1 Hz wide, then the time between zero crossings is 1 second. Conversely, a wider filter has less time between the zero crossings of its impulse response. E.g., if the filter is 1 kHz wide, the zero crossings will be 1 ms apart. [3]

This is how excessively narrow bandpass filtering creates ISI. What constitutes "excessively narrow"? Let's say we set the filter bandwidth such that the sinc function zero crossings occur exactly one data bit time apart. Then if we superimpose the sinc functions that correspond to consecutive data bits on top of each other, we can see that at the moment the sinc function corresponding to any given data bit goes to its maximum of 1, all of the sinc functions belonging to adjacent bits go to zero. Hence we have zero ISI:

But if we try to spread the sinc function out further (by making the filter narrower), now we start to get ISI from the "main lobes" of the adjacent data bits:

Note how the peak of any given sinc pulse no longer coincides with the zeros of the adjacent sinc pulses. If these sinc pulses represent data bits, there is no longer any way to sample the value of one bit without interference from the adjacent bits. Thus we have ISI [5]. The tighter we filter, the more these pulses will overlap.

Here's what would happen if we filtered VMSK to Walker's claimed bandwidth of 90 bps/Hz:

Think he (or anyone) could separate those pulses?

This principle that a certain minimum bandwidth (of a filter or a
communications channel) is required to support a given pulse
signalling rate is nothing other than the famous *Nyquist Sampling
Theorem*. This principle is better known for its requirement that
digital audio systems sample at a rate at least twice that of the
highest component in the analog input signal [4].

Although Walker's data pulses are much shorter than a bit, they still occur at the same rate as the data. So while their unfiltered spectrum is much wider than the corresponding BPSK spectrum with NRZ data, their Nyquist bandwidth is the same as for BPSK: 1 bps/Hz. So Walker could filter his signal down to 1 bps/Hz -- but no further -- without introducing ISI.

Walker admits that "ordinary" filters with bandwidths equal to the claimed bandwidths of his VMSK signals "utterly destroy" his modulation [6]. This should come as no surprise to anyone reading this paper. But instead of accepting what's really going on (that his signal is really much wider than he claims) he blames the "ordinary" filters and says you must use his special "patented zero group delay" [7] filters instead.

Although Walker has not responded to my requests for his filters' frequency response data, it is already clear what must be "special" about them: they're broad enough to pass the VMSK modulation sidebands at a sufficiently high level to let his demodulator work. And if they're wide enough to pass the modulation, they would also pass other nearby signals. This utterly dooms any attempt to take advantage of VMSK's supposedly narrow bandwidth by packing multiple VMSK signals close together.

The portion of the VMSK signal that actually carries data is just
ordinary BPSK, except that it is spread over an even wider band of
frequencies, making it much harder to see on a spectrum analyzer.
The best that can be done to reduce VMSK's bandwidth by filtering
is to make it exactly the *same* bandwidth as BPSK.

VMSK is therefore shown to be nearly *identical* to the
earlier narrow band PM scheme described in the
appendix, and which Walker concedes doesn't work. The *only*
difference is the location of the strong spectral line component that
wastes most of the transmitter power in both schemes. In "narrowband
PM" it is at the RF carrier frequency, while in VMSK it is offset from
the RF carrier by a frequency equal to the data rate.

Walker gets his ridiculously high bandwidth efficiency figures by
focusing on the one part of the VMSK signal he *can* easily see
-- the clock -- and proclaiming it to be the whole thing. Of *course*
the clock looks narrow! Ideally it would occupy no bandwidth at all,
but in reality it is slightly broadened by the response of the spectrum
analyzer and by the small amounts of phase noise inherent to even
a good crystal oscillator.

Walker even
boasts about meeting FCC emission mask limits, but of course this
proves nothing. FCC masks are designed solely to minimize harmful
interference to other users on adjacent channels; merely meeting one
is no guarantee that *your* receiver won't be creamed by those
same adjacent channel signals.

So VMSK is much less efficient than BPSK in power and, despite Walker's claims, at best no better than BPSK in bandwidth efficiency. It is utterly worthless as a practical modulation technique.

1. There is in fact a deep relationship between thermodynamics and information theory that shows up in thought experiments like Maxwell's Demon.

2. VMSK/2 is just VMSK run through a divide-by-two counter triggered by the falling edges in the VMSK signal, producing a pulse whose leading and trailing edges are both varied slightly in time. The analysis for VMSK/2 is similar to that for VMSK.

3. Strictly speaking, this is true only for low pass filters, which have
impulse responses that look like sinc functions. The impulse response
of an ideal *bandpass* filter is a sinusoid at the
center frequency of the filter, amplitude modulated by a sinc-shaped "envelope"
according to the width of the filter. The time between zero crossings
of the sinc envelope is inversely proportional to the filter
bandwidth.

4. There are two ways to answer this. One is that you need two sinc pulses, properly separated to avoid ISI, per cycle of the highest input frequency to describe the waveform. Another is to say that an ordinary real signal with highest frequency component f actually has a bandwidth of 2f, ranging from -f to +f, with the negative frequencies being a mirror image of the positive frequencies.

5. Not every data communications system strives to completely avoid
ISI. The *partial response* modulation schemes deliberately
introduce a carefully limited and controlled amount of ISI to shape
the resulting modulation spectrum. One well-known example is
*duobinary*. Partial response schemes are popular in magnetic
recording because of the frequency limitations of the recording media,
particularly at low frequencies. But as Shannon dictates, higher
signal-to-noise ratios are always needed to overcome the ISI. Some of
this higher SNR can be recovered by the use of forward error
correction coding; duobinary can be decoded with the Viterbi
algorithm, for example.

6. It is ambiguous what Walker means by "work", as his system should clearly be able to pass data without any filtering as long as the signal is strong enough. It just uses a lot of bandwidth to do so. I suspect he meant that filtering is necessary for his signal to meet the emission limits imposed by the FCC emission masks. But as discussed in the conclusions, merely meeting an FCC emission mask is insufficient to claim ultra-narrow-band operation.

7. There is no such thing as a "zero group delay" filter.
Even a piece of wire has a non-zero group
delay because electromagnetic waves cannot travel faster than light.
Nor would a "zero group delay" filter ever be necessary in data
communications, only a filter where the group delay is *constant*
at all frequencies of interest. These are commonly implemented
as finite impulse response (FIR) filters in digital signal processing.
To achieve flat group delay, it is both necessary and sufficient for
the filter to have a symmetric impulse response, which is easy to do
in a FIR filter.

In both AM and narrow band PM, the RF bandwidth is twice the bandwidth of the modulating signal. And in both cases, the signal power is dominated by the power going into the carrier. Only a fraction goes into the sidebands that convey actual information.

For PM with digital modulation, the relative carrier and sideband powers, in dB relative to the total transmitter power, are as follows:

**carrier power = 20*log10[cos(phi)]**

**sideband power = 20*log10[sin(phi)]**

where **phi** is the phase deviation in radians. For
example, if the peak-to-peak deviation is pi radians (180 degrees),
then the one-sided deviation is pi/2 radians (90 degrees) and all of
the signal power is in the sidebands with none in the carrier. This is
classic suppressed-carrier BPSK. If the deviation is reduced to 90
degrees peak-to-peak, then the carrier and sidebands are both at -3dB,
i.e., they each take half of the total transmitter power. For very low
values of **phi**, nearly all of the power goes into the
carrier, with very little in the data-bearing sidebands.
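A short Python sketch of these two formulas makes the power split concrete:

```python
# Carrier/sideband power split for digital PM as a function of the
# one-sided phase deviation phi, per the formulas above.
import math

def carrier_db(phi):        # phi = one-sided phase deviation, radians
    return 20 * math.log10(abs(math.cos(phi)))

def sideband_db(phi):
    return 20 * math.log10(abs(math.sin(phi)))

print(carrier_db(math.pi / 4), sideband_db(math.pi / 4))  # 90 deg p-p: -3 dB each
print(sideband_db(math.pi / 2))   # BPSK: all power in sidebands (0 dB)
print(carrier_db(0.05), sideband_db(0.05))  # tiny deviation: nearly all carrier
```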

So not only does this scheme not conserve RF bandwidth, it also wastes RF power on a useless carrier that carries no information.

Actually, residual carriers in a BPSK signal *are* sometimes
useful, provided they have reasonable amplitudes. While
suppressed BPSK carriers are commonly regenerated by the receiver with
a Costas or squaring loop, there are "squaring losses" from the
nonlinearities in both kinds of loops that degrade the signal-to-noise
ratio of the recovered carrier. These are usually compensated for by
narrowing the bandwidth of the loop filter. This is usually acceptable
at high data rates where taking several hundred or even several
thousand bit times to acquire lock is acceptable, and where the
carrier frequency uncertainty due to Doppler shift, oscillator noise,
drift, etc., relative to the data rate is small. But at the low data
rates used on very weak signals, the squaring losses may
force unacceptably long acquisition times.

JPL has used residual-carrier BPSK for decades on low-data-rate deep-space telemetry links, trading off transmitter power in the data for improved carrier tracking. For example, the low speed link on the ACE spacecraft puts a little more than half of its transmitter power into the carrier. This results in a 3.6 dB loss in Eb/No performance compared to a suppressed-carrier BPSK signal, but the designers considered this an acceptable trade for the greater ease in acquiring and tracking the signal.

The IS-95 CDMA digital cellular system has a *pilot*, which
is a carrier that is spread by a PN code but carries no other
information. The pilot lets the receiver track the rapid changes in
carrier phase that frequently occur on fading multipath channels. As
in deep space links, transmitter power spent on the pilot is power
that cannot be spent on user data, so setting the pilot amplitude
requires a careful tradeoff.

Because it's so fundamental, let's look at Shannon's famous formula
given in his 1948 paper, A
Mathematical Theory of Communication. It gives the *channel
capacity*, **C** in bits per second of a band limited
transmission channel impaired by additive white Gaussian noise. The
capacity depends on the channel bandwidth, **B** in Hz and the
signal-to-noise ratio, **SNR**, expressed as a numeric ratio:

C = B * log2(1 + SNR)    (1)

Shannon proved that it was at least possible (though he didn't show
how) to reliably communicate at any rate below the channel capacity
C. By "reliably communicate", he meant that modulation and
coding schemes exist that yield an arbitrarily low error rate. At
the same time he proved that it was *impossible* to reliably communicate
faster than C.

Shannon's formula reveals a fundamental tradeoff: we can compensate for a lower SNR by increasing the bandwidth, and vice versa. But we can only go so far. To see why, let's rewrite Shannon's formula as:

C = B * log2(1 + S/(B*No))    (2)

where **S** is the signal power in watts and **No** is the
noise power spectral density in watts/Hz (which is equivalent to
joules). By explicitly computing the noise power in this way, we can
see the problem: as **B** increases, so does the total noise power,
offsetting (some of) the benefit of the extra bandwidth.

Let's rewrite the formula again as follows, assuming that we operate right at capacity:

C/B = log2(1 + (Eb*C)/(B*No))    (3)

where **Eb** is the energy per data bit, in joules. (**Eb** times
the data rate, which we assume to be C, is the signal power S).
The ratio **C/B** is the *spectral efficiency* in bits/sec/Hz.
If we solve
for the important ratio Eb/No, we get:

Eb/No = SNR / (C/B) = (2**(C/B) - 1)/(C/B)    (4)

Eb/No and SNR are equal only when C/B = 1.

With these formulas, we can easily compute the minimum SNR (or Eb/No)
required to support a given spectral efficiency, C/B.
If we plug in values of C/B that are near zero, corresponding to
very high bandwidths,
we can see that Eb/No approaches the value ln(2), which is 0.6931 or -1.6
dB. This is the famous *Shannon bound*; no communication system
can possibly operate reliably below an Eb/No of -1.6 dB even when
infinite bandwidth is available. For smaller (finite) bandwidths,
higher Eb/No ratios are required.
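Equation (4) is easily tabulated. The sketch below evaluates the minimum Eb/No over a range of spectral efficiencies, including the C/B -> 0 limit:

```python
# Minimum Eb/No required at a given spectral efficiency, per eq. (4).
import math

def min_ebno_db(cb):        # cb = spectral efficiency C/B in b/s/Hz
    return 10 * math.log10((2.0 ** cb - 1.0) / cb)

for cb in (0.001, 1.0, 2.0, 10.0, 90.0):
    print(cb, min_ebno_db(cb))

# The infinite-bandwidth limit: as C/B -> 0, Eb/No -> ln(2) = -1.6 dB.
print(10 * math.log10(math.log(2)))
```

Note what this says about VMSK's claimed 90 bits/sec/Hz: by equation (4), even a perfect modem would need an Eb/No of more than 250 dB to operate reliably at that spectral efficiency.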