README for the KA9Q AO-40 FEC telemetry prototype modem
Version 4.1
Note major changes below!


2 July 2003
Copyright 2003, Phil Karn, KA9Q
karn@ka9q.net

This software may be used under the terms of the GNU General Public
License (GPL). See this web page for details:

http://www.fsf.org/licenses/licenses.html#GPL

For a usage summary, see the file USAGE.

This package contains the prototype encoder/modulator,
decoder/demodulator and simulation tools for my proposed coded
telemetry format for AO-40.  For a detailed description and rationale
for the proposed format, see:

http://people.qualcomm.com/karn/ao40/

Please note that this is a *prototype* intended for developers,
experimenters and knowledgeable users. It is NOT a turnkey, end-user
communication package. It is intended to test and demonstrate the
proposed FEC format and various decoding algorithms, and to serve as a
reference for future incorporation into telemetry reception and
communication programs with their own user interfaces.

Supported platforms

This software builds and runs under Linux, UNIX or a similar
compatible system. GCC (the GNU C compiler) is required. I do *not*
support Windows, though I welcome porting efforts by others as long as
the GPL is respected.

Building the package

Three separate subroutine libraries are required and must be installed
*before* you build this package:

dsp		my library of SIMD-accelerated DSP primitives
simd-viterbi	my SIMD-accelerated Viterbi decoder library
rs		my Reed-Solomon encoder/decoder

These packages are available in source form from my web site as
follows:

http://www.ka9q.net/code/fec/dsp-1.0.2.tar.gz
http://www.ka9q.net/code/fec/simd-viterbi-2.0.3.tar.gz
http://www.ka9q.net/code/fec/reed-solomon-4.0.tar.gz

Each package contains a standard "configure" script. See the README
files in each package and build and install each one.

A fourth package, fftw, is required only if you want to generate new
FIR filter coefficients. The fftw package is standard in many Linux
distributions; if you're running Debian, install the "fftw2" and
"fftw-dev" packages with this command:

apt-get install fftw2 fftw-dev

If you need FFTW source, see http://www.fftw.org/.

Once you have these three (or four) libraries installed, you must edit
the makefile in this package according to your system. If you're
building on a Pentium-II or later Intel/AMD system, you probably won't
have to change anything. If you're on an original Pentium, you'll want
to change the -march=i686 parameter in the C compiler flags to
-march=i586. If you're on a non-IA32 machine, you'll want to change
that parameter accordingly.

(A future release will probably include a configure script that will
do all this automatically, just as the dsp, viterbi and rs packages
already do.)

This package provides the following six application programs:

dpsk_xmit	encode and modulate
dpsk_demod	noncoherent differential BPSK demodulator
fec_decode	decode FEC format
p3_decode	decode uncoded frame format
fade		add sinusoidal fading and Gaussian noise
addnoise	add Gaussian noise

dpsk_xmit reads user data from standard input, divides it into frames
of 256 bytes and encodes and modulates each one.  The modulated signal
appears as a continuous stream of signed 16 bit linear PCM audio
samples on standard output.

dpsk_demod reads a stream of signed 16 bit linear PCM audio samples on
standard input and writes demodulated 16-bit soft-decision symbols on
standard output. Note! dpsk_demod no longer does FEC decoding; it now
only does demodulation. The output of dpsk_demod is meant to be piped
into fec_decode to decode the FEC format, or into p3_decode to decode
the non-FEC P3 format.
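
Both the PCM audio and the soft-decision symbols are plain streams of
16-bit signed integers, so a downstream program only has to unpack
them. A minimal Python sketch of that step follows. It assumes
little-endian byte order (the native order on IA32; this README does
not state the byte order), and the helper name is hypothetical.

```python
import struct


def unpack_int16(data, byteorder="<"):
    """Unpack raw bytes into a list of signed 16-bit integers.

    byteorder is a struct prefix: "<" is little-endian (an assumption,
    not something the package documents), ">" is big-endian.
    A trailing odd byte is ignored.
    """
    count = len(data) // 2
    return list(struct.unpack(f"{byteorder}{count}h", data[:2 * count]))
```

A consumer could call this on each block read from standard input
(for example sys.stdin.buffer.read(4096)) and hand the resulting list
to its own decoder.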

By default, dpsk_demod does its work silently; the --verbose option
will cause information about the demodulation of each one-second
chunk of input to be sent to standard error.

fec_decode reads a stream of demodulated 16-bit soft-decision symbols
on standard input (e.g., from dpsk_demod) and attempts to locate and
decode FEC-encoded frames. Data that cannot be decoded is discarded.
By default, fec_decode does its work silently; the --verbose option
will cause a considerable amount of information about the decoding of
each frame to be sent to standard error.

p3_decode reads a stream of demodulated 16-bit soft-decision symbols
(e.g., from dpsk_demod) on standard input and looks for frames in the
uncoded legacy Phase 3 format (Sync vector + 512 bytes data +
CRC). Frames with invalid CRC are discarded. Good frames are written
to standard output as 512 byte blocks (the sync vectors and CRCs are
removed as they carry no information). By default, p3_decode does its
work silently; the --verbose option will cause information about the
decoding of each frame to be sent to standard error.

dpsk_xmit uses a carrier frequency of 1600 Hz, approximately the
center of a SSB transceiver passband. dpsk_demod accepts any carrier
frequency between 1200 and 2000 Hz, i.e., +/- 400 Hz around the
nominal 1600 Hz carrier generated by dpsk_xmit.

Two mutually exclusive command line options are accepted by both
dpsk_xmit and dpsk_demod:

--nrzi
--biphase

The --biphase option is the default. In this mode, dpsk_xmit and
dpsk_demod use biphase (Manchester) encoding with a filter matched to
the analog hardware filter on AO-40. This filter has significant
intersymbol interference that currently degrades performance by about
1 dB over an optimal Nyquist filter. (I plan further work on this
filter that will equalize the ISI at the expense of somewhat impaired
noise performance, but yield better performance overall.)

The "--nrzi" option substitutes NRZI encoding with an optimal (no
ISI) filter for the AO-40 biphase format.  This mode occupies less
bandwidth and (because of the lack of ISI) provides somewhat better
SNR performance; it is recommended for user-to-user communications
(e.g., via the transponder) where compatibility with the existing
AO-40 beacon hardware is unnecessary.
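
To illustrate why NRZI occupies less bandwidth than biphase, here is
a small Python sketch of the two line codes. The polarity conventions
(which bit value maps to which level or transition) are assumptions
chosen for illustration and may not match dpsk_xmit; pulse shaping
and the matched filters are left out entirely.

```python
def biphase_encode(bits):
    """Manchester (biphase): every bit becomes two opposite half-symbols.

    Assumed convention: 1 -> (+1, -1), 0 -> (-1, +1).
    """
    out = []
    for bit in bits:
        out.extend((1, -1) if bit else (-1, 1))
    return out


def nrzi_encode(bits, level=1):
    """NRZI: one level per bit.

    Assumed convention: a 1 toggles the level, a 0 leaves it unchanged.
    """
    out = []
    for bit in bits:
        if bit:
            level = -level
        out.append(level)
    return out
```

For the same bit rate, biphase_encode emits twice as many level
intervals as nrzi_encode, which is where the extra bandwidth goes.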

The --monitor option causes dpsk_demod to echo its input to the sound
card by internally invoking the "bplay" command. Due to buffering in
dpsk_demod, bplay and the Linux sound driver, there will be a delay
between the sound you hear and the data currently being processed by
the demodulator.

The --verbose option to dpsk_demod is documented below.

Sample test command

The "fade" and "addnoise" programs read and write the same audio PCM
format, allowing them to be used in shell pipelines to test the
performance of the demodulator at various SNRs and fade rates. For
example, here is a shell command that tests the modem with random
data, a fade period of 1.0 seconds and an average Eb/No of 10 dB using
the AO-40 format:

dpsk_xmit < /dev/urandom | fade --fade-period 1.0 --bit-rate 160 --sample-rate 9600 --ebn0 10 | dpsk_demod --verbose --monitor | fec_decode --verbose > /dev/null

The "fade" command takes the following mandatory command-line arguments:

--fade-period <seconds>
--bit-rate <bps>
--sample-rate <hz>
--ebn0 <db>

The --fade-period parameter specifies the period, in seconds, of a
sinusoidal envelope that is applied to the input signal. Note that
there are two nulls (and two carrier phase reversals) per cycle of
this fading envelope.
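
A sinusoidal envelope with these properties can be sketched in Python
as a cosine gain applied sample by sample. This is an illustration
only; the function name is hypothetical and the starting phase of the
envelope in the real fade program is not documented here.

```python
import math


def fade_envelope(n_samples, fade_period, sample_rate):
    """Return a cosine gain sequence with one full cycle per fade_period.

    The gain crosses zero twice per cycle (the two nulls), and its sign
    flips at each crossing (the two carrier phase reversals).
    """
    samples_per_cycle = fade_period * sample_rate
    return [math.cos(2 * math.pi * n / samples_per_cycle)
            for n in range(n_samples)]
```

Multiplying each PCM sample by the corresponding gain value gives the
faded signal.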

The --bit-rate parameter specifies the user data rate (*not* the
symbol rate) in bits per second. For the proposed AO-40 format
implemented in dpsk_xmit and dpsk_demod, this is 160 bits per second
(rate 0.4 FEC * 400 symbols/s = 160 bps). The fade command does not
know the user data rate (or even the modulation symbol rate) so this
parameter is necessary to compute the appropriate noise amplitude to
add to the signal to yield the desired Eb/No.

The --sample-rate parameter specifies the PCM sample rate in Hz. This
is a compile-time parameter in both dpsk_xmit and dpsk_demod, defined
in dpsk.h. It is currently set to 9600 Hz, but any sufficiently high
rate that is an integral multiple of the 400 Hz symbol rate and
supported by the sound card will work.
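
The integral-multiple requirement amounts to a whole number of PCM
samples per symbol. A small Python check, with a hypothetical
function name:

```python
def samples_per_symbol(sample_rate, symbol_rate=400):
    """Return the whole number of PCM samples in one symbol.

    Raises ValueError if sample_rate is not an integral multiple of
    symbol_rate.
    """
    if sample_rate <= 0 or sample_rate % symbol_rate != 0:
        raise ValueError("sample rate must be a positive multiple of "
                         "the symbol rate")
    return sample_rate // symbol_rate
```

At 9600 Hz this gives 24 samples per symbol; 8000 Hz would give 20,
while 11025 Hz is rejected.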

The --ebn0 parameter specifies the desired *average* Eb/No (energy per
bit to noise power spectral density ratio) for the test. This
parameter is used in conjunction with the --bit-rate and --sample-rate
parameters to compute the amplitude of the noise to be added to the
input signal after fading.
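
One plausible way to turn these three parameters into a noise
amplitude is sketched below in Python. It is an assumption about the
arithmetic, not a copy of what fade or addnoise actually does: the
signal power is taken as the mean square of the PCM samples, and the
noise is treated as white over the band from 0 to half the sample
rate.

```python
import math


def noise_sigma(signal_power, bit_rate, sample_rate, ebn0_db):
    """Standard deviation of Gaussian noise for a target Eb/No.

    signal_power: mean square of the signal samples (assumed input)
    bit_rate:     user data rate in bits per second
    sample_rate:  PCM sample rate in Hz
    ebn0_db:      desired Eb/No in dB
    """
    eb = signal_power / bit_rate              # energy per user data bit
    n0 = eb / (10 ** (ebn0_db / 10))          # one-sided noise density
    return math.sqrt(n0 * sample_rate / 2)    # noise power in 0..fs/2
```

With unit signal power, 160 bps, 9600 Hz and 10 dB, the result is the
square root of 3, i.e. the noise is considerably stronger than the
signal per sample even at a comfortable Eb/No.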

The "addnoise" command is identical to the "fade" command except that
it adds no fading -- only Gaussian noise. It accepts the --bit-rate,
--sample-rate and --ebn0 parameters.

Interfacing with Linux sound cards

The standard Linux sound card drivers require special system calls to
configure sample rate, number of channels and bits per sample.  My
programs do not make these system calls themselves; instead, they are
easily used in conjunction with the standard Linux "bplay" and "brec"
(or equivalent) programs.  To encode and transmit a file through a
sound card, use a command like this:

dpsk_xmit < input_datafile | bplay -s 9600 -b 16

Use your favorite mixer control program (e.g., tkmixer) to adjust
the output level.

To receive and decode a signal fed into the sound card, use the command:

brec -s 9600 -b 16 | dpsk_demod [--verbose] [--monitor] | fec_decode [--verbose] > receive_file

Again, use your favorite mixer program to adjust input levels. The
demodulator can operate over a fairly wide input signal range, but it
is best to set a reasonable input level.

Important change!

With version 4.0 of the ao40prototype package, the DBPSK demodulation
and FEC decoding functions have been split into separate programs.
dpsk_demod handles demodulation, and fec_decode locates and decodes
FEC-encoded frames.

dpsk_demod

dpsk_demod demodulates audio samples in 1-second "chunks", outputting
16-bit soft-decision symbols.  For the first second, the demodulator
tries all carrier frequencies from 1200 to 2000 Hz in 10 Hz steps, a
narrow range of clock frequencies around the nominal 400 Hz, and all
24 symbol timing offsets. The Es/No ratio is estimated, and if it
exceeds 2 dB the following chunks are demodulated with a narrower
search that tracks slow changes in the signal parameters. If the Es/No
ever falls below 2 dB, the wide search is resumed.
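
The switch between the wide and narrow searches can be sketched in
Python as follows. The function names are hypothetical, the behavior
at exactly 2 dB is an assumption, and the clock frequency candidates
are omitted because their range is not given above.

```python
def carrier_candidates(low_hz=1200, high_hz=2000, step_hz=10):
    """Carrier frequencies tried during the wide search."""
    return list(range(low_hz, high_hz + 1, step_hz))


def search_modes(esn0_db_per_chunk, threshold_db=2.0):
    """Return the search mode used for each chunk.

    The first chunk always gets the wide search; each later chunk is
    chosen from the Es/No estimate of the chunk before it.
    """
    modes = []
    mode = "wide"  # nothing is known before the first chunk
    for esn0_db in esn0_db_per_chunk:
        modes.append(mode)
        mode = "narrow" if esn0_db > threshold_db else "wide"
    return modes
```

With 81 carrier candidates and 24 timing offsets, the wide search
covers 1944 carrier/timing combinations before clock frequency is
even considered, which is why it is so much more expensive than
tracking.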

This search strategy means dpsk_demod will consume a considerable
amount of CPU time while attempting to demodulate random noise, and
much less time when locked onto a signal.  If the resulting CPU load
is objectionable, an external control that stops the demodulator when
reception is impossible seems desirable. This could be a manual
operator control (e.g., starting or stopping the program) or driven by
the AOS/LOS times from an orbit tracking program.
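
As a rough Python sketch of the tracking-driven control, a wrapper
script could ask whether the current time lies inside any predicted
pass before starting the demodulator. The function name and the
format of the pass list are hypothetical; a real orbit tracking
program would supply the AOS/LOS times.

```python
from datetime import datetime


def demod_should_run(now, passes):
    """True if 'now' falls inside any (aos, los) interval.

    passes is a list of (aos, los) datetime pairs, an assumed format.
    """
    return any(aos <= now < los for aos, los in passes)
```

A wrapper could poll this once a minute and start or stop the
dpsk_demod pipeline accordingly.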

fec_decode

fec_decode reads demodulated 16-bit soft-decision symbols from
standard input and attempts to locate and decode FEC-encoded AO-40
frames.  A sync vector correlator is slid down the input stream,
looking for correlator peaks that exceed a threshold. (This threshold
is currently set empirically.)
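
A sliding correlator of this general kind can be sketched in Python
as below. The sketch assumes a contiguous sync pattern of +1/-1
values; the real AO-40 sync vector, its placement within the frame
and the threshold value are not described in this README, so the
names and the example pattern are purely illustrative.

```python
def correlate(symbols, sync):
    """Correlate soft symbols against a +1/-1 sync pattern.

    Returns one correlation value per candidate offset.
    """
    n = len(sync)
    return [sum(s * p for s, p in zip(symbols[k:k + n], sync))
            for k in range(len(symbols) - n + 1)]


def peaks_above(corr, threshold):
    """Offsets whose correlation value exceeds the threshold."""
    return [k for k, value in enumerate(corr) if value > threshold]
```

For example, embedding a scaled 7-symbol Barker pattern in a run of
zeros and correlating against the same pattern yields a single peak
at the offset where the pattern starts.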

A decoding attempt is made at each sync offset whose correlator peak
exceeds the threshold.  If both Reed-Solomon codewords decode, the
decoded data is written to standard output, the input data is purged
and a search is resumed for another frame.

If only one of the RS codewords decodes, the input is purged and no
data is written; in this case, we at least know that a frame was
definitely present at this offset.

If no RS codewords decode, then the window is advanced by only one
symbol and the sync search resumes. Because the FEC frames are sent
asynchronously, we cannot distinguish an undecodable frame from an
invalid frame offset.
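
The three outcomes above can be summarized as a small decision
function. This Python sketch only restates the rules in this section;
the function name and the returned labels are hypothetical.

```python
def after_attempt(rs_ok):
    """Decide what to do after one decoding attempt.

    rs_ok is a pair of booleans, one per Reed-Solomon codeword.
    """
    good = sum(1 for ok in rs_ok if ok)
    if good == 2:
        return "write_and_purge"      # full decode: output the frame
    if good == 1:
        return "purge_only"           # frame was here, but is unusable
    return "advance_one_symbol"       # bad frame or bad offset: keep looking
```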

The intent of both dpsk_demod and fec_decode is to briefly spend a lot
of CPU power to rapidly acquire a new signal, but to avoid wasting CPU
cycles once it has been acquired. Faster CPUs, especially those with
SIMD support, can easily keep up with a real-time signal, but this may
be too much of a load for older, slower CPUs. 

Debugging information

dpsk_demod

The --verbose option to dpsk_demod produces detailed information about
the demodulation of every chunk that currently looks like this:

Carrier(Hz)    Clock(Hz)      sym time(samp)  Es/No
1600           400.000        12              9.1            
1600           400.000        0               9.0            
1600           400.000        0               9.1            
1600           400.016        1               9.0            
1600           400.033        1               8.9     

(end of sample output)

Each line represents one "chunk" of data, currently 1 second.  The
estimated carrier frequency, clock frequency, symbol timing offset (in
samples from the previous chunk) and the Es/No ratio are shown for
each chunk. Note that the Es/No figure is only an *estimate*.

fec_decode

The verbose data from fec_decode looks like this:

correlator amp 34.9 dB energy = 74.4 dB scale = 898
VD histogram: 541 227 273 301 367 325 282 169 109 89 105 154 185 208 235 1565
RS byte corrections: 0 0 - Good Frame
channel symbol error corrections: 114 (2.19%)
Decode attempts: 26 Partial decodes: 0 Full decodes: 1
correlator amp 35.0 dB energy = 74.5 dB scale = 890
VD histogram: 492 206 283 326 316 372 268 184 104 78 110 139 196 221 223 1617
RS byte corrections: 0 0 - Good Frame
channel symbol error corrections: 109 (2.1%)
Decode attempts: 27 Partial decodes: 0 Full decodes: 2
correlator amp 34.1 dB energy = 74.5 dB scale = 893
VD histogram: 449 215 307 326 345 339 310 169 102 90 116 173 185 218 223 1568
RS byte corrections: 0 0 - Good Frame
channel symbol error corrections: 120 (2.31%)
Decode attempts: 28 Partial decodes: 0 Full decodes: 3

Each successful or partly successful decoding attempt produces
five lines of debugging output. The first line shows some relative
energy and amplitude figures that aren't too interesting.

The "VD histogram" shows the number of symbols in each frame that
quantized into each of the Viterbi decoder's 16 soft-decision input
"bins".  The first number is the number of "strongest zeroes" fed to
the decoder, and the last number is the number of "strongest ones". On
a clean, well-equalized channel, the largest counts will be for the
"strongest zero" and "strongest one" bins; the other bins will
have small counts. On a noisy channel, the intermediate bins will have
larger counts.  This summarizes the information you might derive from
a classical eye pattern.
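
A histogram of this kind could be built as in the Python sketch
below. The sign convention (negative soft values mean "zero",
positive mean "one") and the meaning of the scale argument are
assumptions made for illustration; fec_decode's actual quantizer is
not described here.

```python
import math


def vd_histogram(soft_symbols, scale):
    """Count soft symbols falling in each of 16 quantizer bins.

    Bin 0 is the strongest zero and bin 15 the strongest one. Values
    at or beyond -scale/+scale are clipped into the end bins.
    """
    hist = [0] * 16
    for s in soft_symbols:
        q = math.floor(s / scale * 8) + 8   # map -scale..+scale to 0..16
        q = max(0, min(15, q))              # clip to the valid bins
        hist[q] += 1
    return hist
```

On a clean signal nearly all symbols land in bins 0 and 15; noise
spreads them toward bins 7 and 8 in the middle.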

"RS byte corrections" show how many byte errors were successfully
corrected in each of the two Reed-Solomon code words in the frame.
Values of zero indicate that the Viterbi decoder fully corrected any
channel symbol errors, leaving nothing for the Reed-Solomon decoder to
fix.

The Reed-Solomon code can correct up to 16 errors in each of the two
code words. If more than 16 occur in any one code word, "?" is
displayed. Future enhancements may make use of "erasure forecasting"
and other tricks to decode "difficult" frames.

The RS byte correction count is a very sensitive indicator of overall
SNR and demodulator performance. Note that frames in which both RS
blocks fail are not shown. They cannot be distinguished from
improperly synchronized frames, and some decoding attempts are made on
such frames to lower the chances of missing a real frame.

"channel symbol error corrections" shows how many channel symbol
errors were corrected by the FEC decoder. This is determined by
re-encoding the decoded frame and comparing the re-encoded channel
symbols with hard-quantized versions of those actually received.  This
step is skipped if any Reed-Solomon decoding failures occur, because
there's no way to know what the channel symbols should have been.
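
The comparison step can be sketched in Python as follows. The
function name is hypothetical, and the hard-decision rule (a positive
soft value means a 1) is an assumed sign convention.

```python
def channel_symbol_errors(reencoded_bits, soft_symbols):
    """Count received symbols that disagree with the re-encoded frame.

    Returns (error_count, error_percentage).
    """
    hard = [1 if s > 0 else 0 for s in soft_symbols]
    errors = sum(1 for h, r in zip(hard, reencoded_bits) if h != r)
    return errors, 100.0 * errors / len(reencoded_bits)
```

The percentages in the sample output above are consistent with about
5200 channel symbols per frame (114 / 5200 is 2.19%); that frame
length is inferred from the sample output, not stated in this README.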

