TRANSACTIONS
ON
INFORMATION
THEORY,
VOL.
42, NO.
6, NOVEMBER
1996
1881
Abstract-It
is well known that a convolutional code is essentially a linear system defined over a finite field. In this paper
we elaborate on this connection. W e will define a convolutional
code as the dual of a complete linear behavior in the sense of
W illems. Using ideas from systems theory, we describe a set
of generalized first-order descriptions for convolutional codes.
As an application of these ideas, we present a new algebraic
construction for convolutional codes.
duality,
first-
I. INTRODUCTION
N THIS paper we take a detailed look at convolutional
codes from the perspective of linear systems theory with
an emphasis on duality relations and on the different representations of these codes. Using these representations, we
present a construction of convolutional codes with distance
lower-bounded by the complexity of the encoder.
Throughout the relatively short history of the theory of
convolutional codes, there have been several authors that
have made the link between convolutional codes and linear
systems theory. Among the first authors to do this were Massey
and Sain. They published a series of papers [20], [21], [33],
containing a systems-theoretic analysis of convolutional codes
and encoders. After this, Omura in [24] considered Viterbi
decoding and its relationship to dynamic programming and
later applications of control theory to optimal receiver design
for convolutional codes [25]. In several landmark papers [4],
[5], Fomey started to lay the foundation for the algebraic
structure of convolutional codes.
Since these papers were written, there have been significant
advances in the theory of linear systems. One notable advance
has been the behavioral approach to linear systems of W illems,
championed in the papers [39],[41]. This point of view
generated a renewed interest in the interplay between systems
theory and convolutional coding theory. W e would like to
mention in particular the recent papers by Fomasini and
Valcher [3], [37] and the recent papers [7], [16], [17] by
Fomey, Loeliger, Mittelholzer, and Trott. Actually, as will be
Manuscript received December 15, 1995; revised July 15, 1996. This work
was supported in part by NSF under Grant DMS-9400965. This research was
carried out in part while J. Rosenthal and E. V. York were visitors at CWI,
Amsterdam, The Netherlands. The material in this paper was presented in
part at the 34th IEEE Conference on Decision and Control, New Orleans,
LA, December 1995.
J. Rosenthal and E. V. York are with the Department of Mathematics,
University of Notre Dame, Notre Dame, IN USA 46556.5683.
J. M. Schumacher is with CWI, P.O. Box 94079, 1090 GB Amsterdam, The
Netherlands, and the Department of Economics, Tilburg University, Tilburg,
The Netherlands.
Publisher Item Identifier S 001%9448(96)07516-5.
001%9448/96$05.00
THE DUALITY
BETWEEN
1882
IEEE
TRANSACTIONS
ON
INFORMATION
THEORY,
VOL.
42, NO.
6, NOVEMBER
1996
(2.2)
ROSENTHAL
et al.: ON
BEHAVIORS
AND
CONVOLUTIONAL
CODES
1883
(2.3)
= (Gt(a)w,l).
1 P(a)w
= 0).
IEEE
1884
2 W$
--oo
2
0
i?
-cc
W$.
TRANSACTIONS
ON
INFORMATION
THEORY,
VOL.
42, NO.
6, NOVEMBER
1996
Dejinition 2.9 (cJ: (3, Proposition 2.11): A code C is observable if there exists an integer N such that, whenever the
supports of w and w are separated by a distance of at least N
and w + w E C, then also w E C and w E C.
In other words, observability means that one can be sure that
a message has been completed once a sufficiently long string of
zeros has been received. An important property of observable
codes is that they allow kernel representations, in the sense that
there exists a polynomial matrix H(s) (a syndrome former)
with the property that w E C if and only if H(z)w (2) = 0; this
is the dual of the fact that controllable behaviors have an image
representation [41, Proposition 4.31. For the case in which the
time axis is Z+, observability can be defined in the same way
as above. Codes on Z+ are naturally associated with matrices
over IF[s] rather than IF[s, s-l], and the standard concept of left
or right primeness for polynomial matrices requires that the
greatest common divisor of the appropriate minors should be
a constant. The characterization that we find for observability
is however the same as in the Z case.
Proposition 2.10: A code C (on Z+) is observable if and
only if it has a generator matrix G(s) that is right-prime when
considered as a matrix over IF[s, s-l].
The proof of this proposition is in the Appendix. The proof
also shows that membership of an observable code on Z+
cannot in general be decided by a syndrome former alone;
an additional finite test is needed. However, if the predictable
delay property [26, p. 441 is added, then this additional test
can be dispensed with and one has a so-called basic encoder
[26, p. 531. It follows from a result by Massey and Sain [21]
that the property in the above proposition is equivalent to the
existence of a feedforward inverse with delay.
C. Completion
On IF [[z]] one can introduce the topology of pointwise
convergence, with the understanding that on IFn the discrete
topology is used (i.e., the topology induced by the Hamming
distance). As noted by W illems [39], a behavior is complete if
and only if it is closed in this topology. For a code C on Z+, we
denote by c its completion, i.e., its closure with respect to the
topology of pointwise convergence. More explicitly, we have
C = {w E IFn[[z]] ] U[N E C/N for all N}.
(2.4)
) Z(z)
ROSENTHAL
et al.: ON
BEHAVIORS
AND
CONVOLUTIONAL
1885
CODES
h, w:+l,
l/:+2,
. . , w;+~,
0, 0, .
.) E
A.
(3.1)
= im (zGt - Ft)
+ Lx(z) + tiw(z)
= O}
and if (K, z, r\;r) satisfies the minimality conditions of Theorem 3.1, then there exist unique invertible matrices T and
S such that
(K,L,M)
= (TKS-l,TLS-l,TM).
(3.2)
IEEE
1886
TRANSACTIONS
ON
INFORMATION
THEORY,
VOL.
42, NO.
6, NOVEMBER
1996
010101101
001100001
100010000
( 000010111
hence
Let
X(Z) = diag (X,(Z),
X;(z)
. x,(z))
[l 25 . . z v,-l]t
(i=
l,...,k).
(3.3)
/1
f(z) = (fl(z),
. . , ,h(~>) E Ek[4,
deg f;(z) I v; - 1
I-+
zX(z>
[ 1
pfk
-3
w X(z)
G(z)
(3.4)
(d;+l
;).
:=
1
z
( 0
0
0 .
11
1\
M .= 00 00 01
.I 1 1 1 I
is
. . + HmP
ROSENTHAL
et al.: ON
BEHAVIORS
AND
CONVOLUTIONAL
CODES
1887
Q-q(i) =O.
and
M
(4.5)
M
Then, after permuting columns, Pv = 0 can be expressed as
R :=
r
(Y R) ;
0
=o.
C, = ker QIR.
(4.2)
One particular way to carry out this elimination is as follows. Note that any triple (K, L, M) that satisfies minimality
conditions 1) and 2) of Theorem 3.1 can be written, after
a suitable similarity transformation and a permutation of the
components of the code vector if needed, in the following
form:
[;I]
L=
[;i]
M=
[_oI
;].
(4.3)
AB
A2B
D
D
CB
CB
CAB
..
...
AYB
CAT-lB
...
CAy-2B
(4.1)
K=
A :=
B :=
yt = Cxt + Dut,
0 2 t < y, xy = 0, x-1 = 0.
(4.6)
(4.4)
ff
Q2
.
.
.
.
.
ac
c?
...
a4
. . .
ak-l
&V--1)
,9Jc . . .
&k-l)
IEEE
1888
TRANSACTIONS
ON
INFORMATION
THEORY,
VOL.
42, NO.
6, NOVEMBER
1996
c :=
R := [B AB
g-k-1
and
1
Q
.
D:=
i .
&-k-l)
::: cik
;2
a2(n-k-1)
. 1.
ak(n:k-l)
1.
. . A-IB]
...
Ay-lB
AYB].
(4.7)
[1
...
wt
(+ci+l,
. ,LLy-l,Uy)
c+
= b < cf
1.
CAi-lB
CAY-t-B
&
.:.
CAY-t-$
which is equivalent to
(CAC-l)t]t
Y=
ut+i + ABut+i+l
+ . . . + AY-t-iBu,).
(4.8)
. Q..-~) > c - b + b + E.
ROSENTHAL
ef al.: ON
BEHAVIORS
AND
CONVOLUTIONAL
CODES
1889
that 21 = Pt(z)e
then have
= (P(a)w,[)
= 0.
So it follows that C(Pt) & Bl. The rest of the proof will be
devoted to the reverse inclusion.
First assume that the matrix P(s) is left-prime, i.e., there
is a matrix F(s) such that
U(s) :=
[P(s)
m1
1)
D=(l
1).
322 + 42 + 9
18x2 + 26~
292
+ 292 + 9
(
2Z2 + 172 + 13
26.~ + 1
.
3422 + 142 + 14 )
W(s)
=: [T(s)
1 F(s)]
u = VW
= [P(z)
I m41
1 P(z)] [ ;I
= Pyz)v
so that TJ E C(Pt).
Now consider a general kernel representation P(s). W e may
assume without loss of generality that P(s) has full-row rank.
W e may then write P(s) = T(s)Q(s) where T(s) is a square
and nonsingular polynomial matrix, and Q(S) is of size k x n
and left-prime. (This follows by an application of the Smith
form, which is valid over a general Euclidean domain and so in
particular for matrices over IF[s].) From the above it already
follows that
C(P) c B1 c C(Qt).
To prove that actually Bl = C( P), it suffices to show that the
quotient spaces C(Qt)/B and C(Qt)/C(Pt)
are both finitedimensional vector spaces and that the dimensions of these
spaces agree.
As is well known, the behavior D(T) determined by the
nonsingular matrix T(s) is a finite-dimensional vector space
over IF with dimension r := deg det T(s). Also the mapping
Q(a) from F[[z]] to IF[[z]] is surjective so we can find
elements wl,. . , w, E P[[z]] such that the elements 2zli
defined by W i = Q(o)w; form a basis for B(T). Then n(P)
is spanned by B(Q) together with the elements wi, and so
w E Bl for u E F[z] if and only if u E C(Qt) and (w;, v) = 0
for all i = 1, . . , T. To show that these extra restrictions are
independent, assume that
=o
IEEE
1890
for some a!; E F and for all w E C(Qt). It then follows that
for all e E F[z] we have
so that
=0
c a;ur;
i=l
PmTO
0 = P,+1To
+ PmQ
H(z)
[H
1(z)
= [R(z) ) R(z)]-
TRANSACTIONS
ON
INFORMATION
THEORY,
VOL.
42, NO.
6, NOVEMBER
1996
0 or H(z)v(z)
= 0, and in both cases it follows from
H(z)(w(z) + w(z)) E C(T) that H(z)w(z) E C(T) as well
as H(z)v(z)
E C(T).
For the converse part of the proof, suppose now that the fullsize minors of G(z) have a common divisor that is not of the
form pmzm for some m. W e can then write G(z) = R(z)T(z),
where T(z) is diagonal and at least one of the diagonal
elements is not of the form pmzm. It then follows from the
preceding lemma that T(z) generates an unobservable code,
so for all integers N there exist polynomial vectors U(Z) and
W (Z) whose supports are separated by a distance at least N
and whose sum belongs to C(T), but that do not themselves
belong to C(T). By considering R(z)w(z) and R(z)w(z), one
sees that the same property holds for C(G) (note that it follows
from V(Z) $! C(T) that R(z)w(z) 6 C(G), because R(z) has
0
full-column rank).
ACKNOWLEDGMENT
REFERENCES
tances are maximal, IEEE Trans. Inform. Theory, vol. 35, no. 1, pp.
188-191, Jan. 1989.
PI E. Fomasini and M. E. Valcher, Algebraic aspects of 2D convolutional
codes, ZEEE Trans. Inform. Theory, vol. 40, no. 4, pp. 1068-1082,
July 1994.
131 E. Fornasini and M. E. Valcher, Observability and extendability of
finite support nD behaviors, in Proc. 34th IEEE Conf: on Decision and
Control (New Orleans, LA, 1995), pp. 3277-3282.
141 G. D. Fomey, Convolutional codes I: Algebraic structure, IEEE Trans.
Inform. Theory, vol. IT-16, no. 5, pp. 720-738, Sept. 1970.
Structural analysis of convolutional codes via dual codes,
[51
E?rans.
inform. Theory, vol. IT-19, no. 5, pp. 512-518, Sept. 1973.
[61 G. D. Forney and M. D. Trott, The dynamics of group codes: State
space, trellis diagrams and canonical encoders, IEEE Trans. Inform.
Theory, vol. 39, pp. 1491-1513, 1993.
Controllability, observability, and duality in behavioral group
[71 -3
systems, in Proc. 34th IEEE Conf: on Decision and Control (New
&leans, LA, 1995), pp. 3259-3264.
181M. L. J. Hautus. Controllability and observability condition for linear
autonomous systems, Ned. Akad. Wetenschappen,-Proc. Ser. A, vol. 72,
pp. 443-448, 1969.
191 J. Justesen, New convolutional code constructions and a class of
asymptotically good time-varying codes, IEEE Trans. Inform. Theory,
vol. IT-19, no. 2, pp. 22%225, Mar. 1973.
An algebraic construction of rate l/v convolutional codes,
1101->
IEEE Trans. Inform. Theory, vol. IT-21, no. 1, pp. 577-580, Jan. 1975.
1111S. Kaplan, Extension of the pontryagin duality I: Infinite products,
Duke Math J., vol. 15, pp. 649-658, 1948.
WI M. Kuiiper, First-Order Reoresentations of Linear Svstems. Boston.
MA: Birkhluser, 1994.
.
t131 S. Lang, Algebra. Reading, MA: Addison-Weslev, 1965.
[I41 Y. Levy and D. J. Costello,Jr., An algebraic approach to constructing
convolutional codes from quasicvclic codes, DIMACS Ser. in Discr.
Math. and Theor. Comp. SC;., voi. 14, pp. 189-198, 1993.
1151S. Lin and D. Costello, Error Control Coding: Fundamentals and
Applications. Englewood Cliffs, NJ: Prentice-Hall, 1983.
1161H. A. Loeliger, G. D. Fornev, T. Mittelholzer, and M. D. Trott,
Minimality and observability -of group systems, Linear Alg. AppZ.,
~01s. 205/206, pp. 937-963, 1994.
[I71 H. A. Loeliger and T. Mittelholzer, Convolution codes over groups,
this issue, pp. 1660-1686.
1181F. S. Macaulay, The Algebraic Theory of Modular Systems. Cambridge,
UK: Cambridge Univ. Press, 1916.
ROSENTHAL
et al.: ON
BEHAVIORS
AND
CONVOLUTIONAL
CODES
1891