In telecommunication, a line code is a pattern of voltage, current, or photons used to represent digital data transmitted down a transmission line. This repertoire of signals is usually called a constrained code in data storage systems. Some signals are more prone to error than others when conveyed over a communication channel, as the physics of the communication or storage medium constrains the repertoire of signals that can be used reliably.
Common line encodings are unipolar, polar, bipolar, and Manchester code.
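The four encodings named above can be illustrated with a minimal sketch that maps bits to idealized signal levels. The function names, the levels +1/0/-1, and the Manchester convention (1 as a low-to-high transition) are assumptions made for illustration; real systems differ in levels and conventions.

```python
def unipolar_nrz(bits):
    """Unipolar: 1 -> +1, 0 -> 0 (one sample per bit)."""
    return [1 if b else 0 for b in bits]

def polar_nrz(bits):
    """Polar: 1 -> +1, 0 -> -1."""
    return [1 if b else -1 for b in bits]

def bipolar_ami(bits):
    """Bipolar (AMI): 0 -> 0, each 1 alternates between +1 and -1."""
    level, out = 1, []  # assumption: the first mark is positive
    for b in bits:
        if b:
            out.append(level)
            level = -level
        else:
            out.append(0)
    return out

def manchester(bits):
    """Manchester: two half-bit samples per bit.
    Assumed convention: 1 -> (-1, +1), 0 -> (+1, -1)."""
    out = []
    for b in bits:
        out.extend((-1, 1) if b else (1, -1))
    return out

bits = [1, 0, 1, 1, 0]
print(unipolar_nrz(bits))  # [1, 0, 1, 1, 0]
print(polar_nrz(bits))     # [1, -1, 1, 1, -1]
print(bipolar_ami(bits))   # [1, 0, -1, 1, 0]
print(manchester(bits))    # [-1, 1, 1, -1, -1, 1, -1, 1, 1, -1]
```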
Transmission and storage
After line coding, the signal is put through a physical communication channel, either a transmission medium or data storage medium (Karl Paulsen, "Coding for Magnetic Storage Mediums", 2007). The most common physical channels are:
* The line-coded signal can be put directly on a transmission line, in the form of variations of the voltage or current (often using differential signaling).
* The line-coded signal (the ''baseband signal'') undergoes further pulse shaping (to reduce its frequency bandwidth) and is then modulated (to shift its frequency) to create an ''RF signal'' that can be sent through free space.
* The line-coded signal can be used to turn a light source on and off in free-space optical communication, most commonly used in an infrared remote control.
* The line-coded signal can be printed on paper to create a bar code.
* The line-coded signal can be converted to magnetized spots on a hard drive or tape drive.
* The line-coded signal can be converted to pits on an optical disc.
Each line code has advantages and disadvantages. Line codes are chosen to meet one or more of the following criteria:
* Minimize transmission hardware
* Facilitate synchronization
* Ease error detection and correction
* Achieve a target spectral density
* Eliminate a DC component
Disparity
Most long-distance communication channels cannot reliably transport a DC component. The DC component is also called the ''disparity'', the ''bias'', or the DC coefficient. The disparity of a bit pattern is the difference between the number of one bits and the number of zero bits. The ''running disparity'' is the running total of the disparity of all previously transmitted bits. The simplest possible line code, unipolar, gives too many errors on such systems, because it has an unbounded DC component.
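As a small illustration of these definitions, the following sketch computes the disparity and the running disparity of a bit sequence. The function names are hypothetical, chosen only for this example.

```python
def disparity(bits):
    """Number of one bits minus number of zero bits."""
    ones = sum(bits)
    return ones - (len(bits) - ones)

def running_disparity(bits):
    """Running total of the disparity after each transmitted bit."""
    total, out = 0, []
    for b in bits:
        total += 1 if b else -1
        out.append(total)
    return out

print(disparity([1, 1, 0, 1]))          # 2
print(running_disparity([1, 1, 0, 1]))  # [1, 2, 1, 2]
```

A pattern with equal numbers of ones and zeros, such as `[1, 0, 0, 1]`, has a disparity of zero.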
Most line codes eliminate the DC component; such codes are called DC-balanced, zero-DC, or DC-free. There are three ways of eliminating the DC component:
* Use a constant-weight code. Each transmitted code word in a constant-weight code is designed so that any positive or negative levels it contains are offset by enough of the opposite levels that the average level over the code word is zero. Examples of constant-weight codes include Manchester code and Interleaved 2 of 5.
* Use a paired disparity code. Each code word in a paired disparity code that averages to a negative level is paired with another code word that averages to a positive level. The transmitter keeps track of the running DC buildup, and picks the code word that pushes the DC level back towards zero. The receiver is designed so that either code word of the pair decodes to the same data bits. Examples of paired disparity codes include alternate mark inversion, 8B10B and 4B3T.
* Use a scrambler. Examples include the scrambler specified in RFC 2615 for PPP over SONET/SDH and the scrambler used in 64b/66b encoding.
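The paired disparity idea can be sketched with alternate mark inversion, one of the examples listed above: each 1 bit may be sent as +1 or -1, and the transmitter always picks the polarity opposite to the previous mark, so the running sum of the line signal never drifts. The function names and the choice of a positive first mark are assumptions for illustration.

```python
from itertools import accumulate

def ami_encode(bits):
    """Send 0 as level 0; send each 1 with polarity opposite to the previous mark."""
    last = -1  # assumption: chosen so that the first mark is +1
    out = []
    for b in bits:
        if b:
            last = -last
            out.append(last)
        else:
            out.append(0)
    return out

def ami_decode(levels):
    """Either polarity of a mark decodes to the same data bit."""
    return [1 if v != 0 else 0 for v in levels]

bits = [1, 1, 0, 1, 1, 1, 0, 1]
line = ami_encode(bits)
print(line)                    # [1, -1, 0, 1, -1, 1, 0, -1]
print(list(accumulate(line)))  # running sum stays in {0, 1}: [1, 0, 0, 1, 0, 1, 1, 0]
print(sum(bits))               # a unipolar signal would accumulate to 6
```

Because the decoder looks only at whether a level is nonzero, inverting the whole line signal does not change the decoded bits.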
Polarity
Bipolar line codes have two polarities, are generally implemented as RZ, and have a radix of three since there are three distinct output levels (negative, positive and zero). One of the principal advantages of this type of code is that it can completely eliminate any DC component. This is important if the signal must pass through a transformer or a long transmission line.
Unfortunately, several long-distance communication channels have polarity ambiguity. Polarity-insensitive line codes compensate in these channels.
There are three ways of providing unambiguous reception of 0 and 1 bits over such channels:
* Pair each code word with the polarity-inverse of that code word. The receiver is designed so that either code word of the pair decodes to the same data bits. Examples include alternate mark inversion, differential Manchester encoding, coded mark inversion and Miller encoding.
* Differentially code each symbol relative to the previous symbol. Examples include MLT-3 encoding and NRZI.
* Invert the whole stream when inverted syncwords are detected.
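Differential coding, the second approach above, can be sketched with NRZI. This example assumes the convention that a 1 bit is sent as a change of level and a 0 bit as no change, and it sends one extra reference symbol first so that the decoder has a previous level to compare against; both choices are assumptions for illustration.

```python
def nrzi_encode(bits, start_level=0):
    """1 -> toggle the level, 0 -> keep the level.
    The first output symbol is a reference level that carries no data."""
    level = start_level
    out = [level]
    for b in bits:
        if b:
            level ^= 1
        out.append(level)
    return out

def nrzi_decode(levels):
    """A bit is 1 wherever two consecutive levels differ."""
    return [a ^ b for a, b in zip(levels, levels[1:])]

bits = [1, 0, 1, 1, 0, 0, 1]
line = nrzi_encode(bits)
inverted = [1 - v for v in line]  # channel with swapped polarity

print(line)                   # [0, 1, 1, 0, 1, 1, 1, 0]
print(nrzi_decode(line))      # [1, 0, 1, 1, 0, 0, 1]
print(nrzi_decode(inverted))  # [1, 0, 1, 1, 0, 0, 1]
```

The decoder only looks at whether consecutive levels differ, so the inverted stream yields the same bits.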
Run-length limited codes
For reliable clock recovery at the receiver, a run-length limitation may be imposed on the generated channel sequence, i.e., the maximum number of consecutive ones or zeros is bounded to a reasonable number. A clock period is recovered by observing transitions in the received sequence, so that a maximum run length guarantees sufficient transitions to assure clock recovery quality.
RLL codes are defined by four main parameters: ''m'', ''n'', ''d'', ''k''. The first two, ''m''/''n'', refer to the rate of the code, while the remaining two specify the minimal ''d'' and maximal ''k'' number of zeroes between consecutive ones. Such codes are used in both telecommunication and storage systems that move a medium past a fixed recording head.
Specifically, RLL bounds the length of stretches (runs) of repeated bits during which the signal does not change. If the runs are too long, clock recovery is difficult; if they are too short, the high frequencies might be attenuated by the communications channel. By modulating the data, RLL reduces the timing uncertainty in decoding the stored data, which could otherwise lead to the erroneous insertion or removal of bits when reading the data back. This mechanism ensures that the boundaries between bits can always be accurately found (preventing bit slip), while efficiently using the media to reliably store the maximal amount of data in a given space.
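The ''d'' and ''k'' constraints can be checked with a short sketch that measures the runs of zeroes between consecutive ones in a channel sequence. The function names are hypothetical, and zeroes before the first one or after the last one are ignored here, which is a simplifying assumption.

```python
def zero_runs_between_ones(bits):
    """Lengths of the runs of zeroes between each pair of consecutive ones."""
    ones = [i for i, b in enumerate(bits) if b]
    return [j - i - 1 for i, j in zip(ones, ones[1:])]

def satisfies_dk(bits, d, k):
    """True if every run of zeroes between consecutive ones has length in [d, k]."""
    return all(d <= run <= k for run in zero_runs_between_ones(bits))

seq = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
print(zero_runs_between_ones(seq))  # [1, 2, 3]
print(satisfies_dk(seq, 1, 3))      # True
print(satisfies_dk(seq, 2, 7))      # False: one run has only a single zero
```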
Early disk drives used very simple encoding schemes, such as the RLL (0,1) FM code, followed by the RLL (1,3) MFM code, which was widely used in hard disk drives until the mid-1980s. RLL codes are still used in digital optical discs such as CD, DVD, MD, Hi-MD and Blu-ray, using EFM and EFMPlus codes. Higher-density RLL (2,7) and RLL (1,7) codes became the de facto standards for hard disks by the early 1990s.
Synchronization
Line coding should make it possible for the receiver to synchronize itself to the phase of the received signal. If the clock recovery is not ideal, then the signal to be decoded will not be sampled at the optimal times. This will increase the probability of error in the received data.
Biphase line codes require at least one transition per bit time. This makes it easier to synchronize the transceivers and detect errors; however, the baud rate is greater than that of NRZ codes.
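The trade-off can be seen in a minimal sketch comparing NRZ with Manchester code, a biphase code, on a long run of identical bits. The levels and the Manchester convention (0 as a high-to-low transition) are assumptions for illustration.

```python
def nrz(bits):
    """One symbol per bit: 1 -> +1, 0 -> -1."""
    return [1 if b else -1 for b in bits]

def manchester(bits):
    """Two symbols per bit with a guaranteed mid-bit transition.
    Assumed convention: 1 -> (-1, +1), 0 -> (+1, -1)."""
    out = []
    for b in bits:
        out.extend((-1, 1) if b else (1, -1))
    return out

def count_transitions(levels):
    """Number of places where consecutive symbols differ."""
    return sum(a != b for a, b in zip(levels, levels[1:]))

bits = [0] * 8
print(count_transitions(nrz(bits)))         # 0: nothing for the receiver to lock onto
print(count_transitions(manchester(bits)))  # 15
print(len(nrz(bits)), len(manchester(bits)))  # 8 16: twice as many symbols per bit
```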
Other considerations
A line code will typically reflect technical requirements of the transmission medium, such as optical fiber or shielded twisted pair. These requirements are unique for each medium, because each one has different behavior related to interference, distortion, capacitance and loss of amplitude.
Common line codes
* 2B1Q
* 4B3T
* 4B5B
* 6b/8b encoding
* 8b/10b encoding
* 64b/66b encoding
* 128b/130b encoding
* Alternate mark inversion (AMI)
* Coded mark inversion (CMI)
* EFMPlus, used in DVDs
* Eight-to-fourteen modulation (EFM), used in compact discs
* Hamming code
* Hybrid ternary code
* Manchester code and differential Manchester
* Mark and space
* MLT-3 encoding
* Modified AMI codes: B8ZS, B6ZS, B3ZS, HDB3
* Modified frequency modulation, Miller encoding and delay encoding
* Non-return-to-zero (NRZ)
* Non-return-to-zero, inverted (NRZI)
* Pulse-position modulation
* Return-to-zero (RZ)
* TC-PAM
Optical line codes
* Alternate-Phase Return-to-Zero (APRZ)
* Carrier-Suppressed Return-to-Zero (CSRZ)
* Three of Six, Fiber Optical (TS-FO)
See also
* Physical layer
* Self-synchronizing code and bit synchronization
External links
Line Coding Lecture No. 9
Line Coding in Digital Communication