
arXiv:cs.CL/0204023 v1 10 Apr 2002

Computational Phonology
Steven Bird
University of Pennsylvania
Phonology, as it is practiced, is deeply computational. Phonological analysis is data-intensive and the resulting models are nothing other than specialized
data structures and algorithms. In the past, phonological computation (managing
data and developing analyses) was done manually with pencil and paper. Increasingly, with the proliferation of affordable computers, IPA fonts and drawing
software, phonologists are seeking to move their computation work online. Computational Phonology provides the theoretical and technological framework for
this migration, building on methodologies and tools from computational linguistics. This piece consists of an apology for computational phonology, a history,
and an overview of current research.
Documentation and Description. Phonological data is of essentially three
types: texts, wordlists and paradigms. A text is any phonetically transcribed narrative or conversation. A wordlist is any compilation of linguistic forms which
can be uttered in isolation, with information about pronunciation and meaning. A
paradigm is broadly construed to mean any tabulation of words or phrases which
illustrates contrasts and systematic variation. Any of these data types may be annotated with more abstract information originating from a phonological theory,
such as syllable boundaries, stress marks and prosodic structure. Additionally,
any of these data types may be associated with recordings of audio, video or
physiological signals. Digitizing this documentation and description brings all
the different media types together, makes the cross-links navigable, and opens up
many new possibilities for management, access and preservation.
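The three data types and their cross-links can be pictured as ordinary data structures. The following is a minimal sketch, not a description of any existing tool: the class names `Recording` and `Entry`, the function `paradigm`, the annotation keys, the file name and time offsets, and the small English wordlist are all hypothetical, chosen only to show how annotated wordlist entries linked to recordings might be tabulated into a paradigm.

```python
from dataclasses import dataclass, field


@dataclass
class Recording:
    """Link from an entry to a stretch of a signal file (hypothetical layout)."""
    uri: str      # location of the audio, video or physiological signal
    start: float  # offset in seconds
    end: float


@dataclass
class Entry:
    """One wordlist entry: a form that can be uttered in isolation."""
    form: str                                         # phonetic transcription
    gloss: str                                        # meaning
    annotations: dict = field(default_factory=dict)   # e.g. lemma, number, stress
    recordings: list = field(default_factory=list)    # cross-links to signals


def paradigm(entries, row_key, col_key):
    """Tabulate entries as {row: {column: form}} using two annotation keys."""
    table = {}
    for entry in entries:
        row = entry.annotations.get(row_key)
        col = entry.annotations.get(col_key)
        if row is None or col is None:
            continue  # entry is not annotated for this paradigm
        table.setdefault(row, {})[col] = entry.form
    return table


# Illustrative data only; the recording location is made up.
WORDLIST = [
    Entry("kæt", "cat", {"lemma": "cat", "number": "sg"},
          [Recording("session1.wav", 0.0, 0.4)]),
    Entry("kæts", "cats", {"lemma": "cat", "number": "pl"}),
    Entry("dɒg", "dog", {"lemma": "dog", "number": "sg"}),
    Entry("dɒgz", "dogs", {"lemma": "dog", "number": "pl"}),
]

PLURALS = paradigm(WORDLIST, "lemma", "number")
```

Here `PLURALS` groups the forms by lemma and number, so the contrast between the two plural endings can be read off a single table while each form stays linked to its recording.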
Exploration and Analysis. The data types described above are closely interconnected in phonological practice. For instance, the discovery of a new word
in a text may require an update to the lexicon and the construction of a new
paradigm (e.g. to correctly classify the word). Fresh insights may lead to new
annotations and further elicitation, closing the loop in this perpetual, exploratory
process. Phonological analysis typically involves defining a formal model, systematically testing it against data, and comparing it with other models. (In some
cases, the model may be incorporated into a software system, e.g. for generating
natural intonation in a text-to-speech system.) In this exploration and analysis
(sorting, searching, tabulating, defining, testing and comparing) the principal task
is computational.
Perhaps the earliest work in computational phonology was Bobrow and Fraser's
Phonological Rule Tester (Bobrow and Fraser, 1968), an implementation of SPE
designed to alleviate the problem of rule evaluation. Shortly afterwards Johnson
showed that, while SPE rules resemble general rewriting systems at the top of the
Chomsky hierarchy, the way SPE rules are used in practice only requires finite
state power (Johnson, 1972). Independently, Kaplan and Kay discovered the connections between SPE grammars and finite state transducers in the 1970s and 1980s,
and laid down a complete algebraic foundation (ultimately reported in Kaplan
and Kay, 1994). Significant implementations followed, including (Koskenniemi,
1983; Beesley and Karttunen, 2002). Attempts to apply finite state devices to
Autosegmental Phonology have largely foundered, but applications to Optimality
Theory are thriving.
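To make the link between rewrite rules and finite state transducers concrete, here is a minimal sketch of a single rule compiled by hand into a two-state transducer. It is an illustration under stated assumptions, not the Kaplan and Kay construction or the Xerox tools: the rule n -> m / _ p, the table `DELTA`, the sentinels `OTHER` and `ECHO`, and the function `apply_rule` are all invented for this example. Because the rule looks one symbol to the right, the machine delays the output of each n until it has seen what follows.

```python
# Hand-built subsequential transducer for the hypothetical rule  n -> m / _ p
OTHER = object()  # matches any input symbol with no explicit transition
ECHO = object()   # output placeholder: copy the current input symbol

# (state, input symbol) -> (next state, output symbols)
# State 0: nothing pending.  State 1: an "n" has been read but not yet emitted.
DELTA = {
    (0, "n"): (1, ()),               # hold the n until its right context is known
    (0, OTHER): (0, (ECHO,)),        # copy everything else unchanged
    (1, "p"): (0, ("m", "p")),       # context matched: the held n surfaces as m
    (1, "n"): (1, ("n",)),           # emit the held n, hold the new one
    (1, OTHER): (0, ("n", ECHO)),    # context failed: emit n, then copy the symbol
}

# Output flushed when the input ends in a given state.
FINAL_OUTPUT = {0: "", 1: "n"}


def apply_rule(word):
    """Run the transducer over a string of single-character segments."""
    state, out = 0, []
    for sym in word:
        key = (state, sym) if (state, sym) in DELTA else (state, OTHER)
        state, emitted = DELTA[key]
        out.extend(sym if e is ECHO else e for e in emitted)
    out.append(FINAL_OUTPUT[state])
    return "".join(out)
```

With this table, `apply_rule("input")` yields `"imput"`, while a word-final n, or an n before any segment other than p, passes through unchanged. The machine needs only a fixed, finite memory (one held segment), which is the sense in which rules used this way require no more than finite state power.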
While finite-state phonology fixated on SPE, generative phonology continued its rapid evolution. The discovery of rule conspiracies (Kisseberth, 1970)
and the abstractness controversy (Koutsoudas et al., 1974) led to calls for the
reintroduction of surface structure constraints. Many theories arose from the fallout; most notable for its computational ramifications was Montague Phonology
(Wheeler, 1981). This model adapted new lexicalist formalisms from syntax and
semantics, providing a declarative (as opposed to procedural) account of phonological well-formedness, and providing the first computational account of underspecification (where the phonological content of a lexical entry is incompletely
specified, to be filled in during a derivation). From these beginnings, Declarative
Phonology was born, and subsequent work provided a mathematical foundation
in first-order logic (Bird, 1995) and phonetic interpretation with links to Firthian
prosodic analysis and speech synthesis (Coleman, 1997), with implementations
generally in the Prolog programming language.
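The declarative treatment of underspecification can be illustrated with a small sketch. It is not the formalism of Wheeler (1981) or Bird (1995): it assumes flat attribute-value descriptions rather than first-order logic, and the function `unify` and the feature names are hypothetical. A lexical entry states only what is known, further constraints add information, and combining descriptions either succeeds or fails; nothing is ever overwritten.

```python
def unify(a, b):
    """Combine two partial feature descriptions.

    Returns a new description containing the information of both, or None
    if they assign different values to the same feature.
    """
    result = dict(a)
    for feature, value in b.items():
        if feature in result and result[feature] != value:
            return None  # conflicting information: no well-formed description
        result[feature] = value
    return result


# A nasal whose place of articulation is left unspecified in the lexicon.
NASAL = {"nasal": "+", "voice": "+"}

# Constraints that a context might contribute (illustrative feature values).
LABIAL = {"place": "labial"}
CORONAL = {"place": "coronal"}

FILLED_IN = unify(NASAL, LABIAL)     # place is supplied by the constraint
CLASH = unify(FILLED_IN, CORONAL)    # a second, incompatible place is rejected
```

Because `unify` only adds information, the order in which compatible constraints are combined does not affect the result, which is the practical difference from a procedural derivation in which ordered rules rewrite earlier outputs.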
A third major strand of development, complementing the finite state and declarative models, is best characterized as statistical. It seeks to apply neural networks,
information theory, and weighted automata in the automatic discovery of phonological information. Gasser trained a recurrent neural network to recognize syllables and to repair ill-formed syllables (Gasser, 1992). Ellison showed how a
technique from information theory called minimum description length (MDL)
could be applied to automatically identify syllable boundaries in phonemically
transcribed texts (Ellison, 1992). Many researchers apply Markov models (a kind
of weighted automaton) in speech recognition, mapping speech recordings to phonetic transcriptions and thence to orthographic words, using large, phonetically
annotated corpora as training data (e.g. TIMIT (Garofolo et al., 1986)).
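As a small illustration of the Markov models mentioned above, the following sketch estimates a first-order Markov chain over segments from a toy wordlist and uses it to score new strings. It is far simpler than the hidden Markov models used in speech recognition and is not taken from any of the cited systems: the function `train_bigram`, the add-alpha smoothing, the boundary symbol `#`, and the five training words are assumptions made for the example.

```python
import math
from collections import Counter


def train_bigram(words, alpha=1.0):
    """Estimate smoothed segment-to-segment transition probabilities.

    Returns a function mapping a word to its log probability under the model.
    "#" marks the word boundary; alpha is the add-alpha smoothing constant.
    """
    bigrams, contexts, alphabet = Counter(), Counter(), set()
    for word in words:
        segs = ["#"] + list(word) + ["#"]
        alphabet.update(segs)
        for prev, cur in zip(segs, segs[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    size = len(alphabet)

    def logprob(word):
        segs = ["#"] + list(word) + ["#"]
        total = 0.0
        for prev, cur in zip(segs, segs[1:]):
            # Each transition weight is a smoothed relative frequency.
            weight = (bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * size)
            total += math.log(weight)
        return total

    return logprob


# Toy training data: invented CVCV forms, one character per segment.
TOY_WORDS = ["pata", "tapa", "kata", "taka", "paka"]
score = train_bigram(TOY_WORDS)
```

Trained on these forms, the model gives `score("tata")` a higher log probability than `score("ttaa")`, since the second string contains transitions never seen in the data. Each transition carries a weight and a string is scored by combining the weights along its path, which is what makes such a model a weighted automaton.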
Four key areas of ongoing research in computational phonology are Optimality Theory, automatic learning, interfaces to grammar and phonetics, and supporting phonological description in the field. Comprehensive references to online
research papers in these areas may be found on the SIGPHON website.
Computational phonology is generating sophisticated and rigorous ways for
creating, exploring and disseminating multidimensional phonological information, encompassing primary recordings, texts, wordlists, paradigms, theories and
analyses. As phonologists adopt the computational methods described above, extending and adapting them as needed, the consequences for the discipline will be
increased accessibility, accountability, and stability of empirical research.
Resources. The Association for Computational Linguistics (ACL) has a special interest group in computational phonology (SIGPHON) with a homepage at
http://www.cogsci.ed.ac.uk/sigphon/. The website contains online proceedings for SIGPHON workshops and information about relevant books,
dissertations and articles. A special issue of Computational Linguistics devoted to
computational phonology was published in 1994 (Bird, 1994).

References
Beesley, K. R. and Karttunen, L. (2002). Finite-State Morphology: Xerox Tools
and Techniques. Studies in Natural Language Processing. Cambridge
University Press.
Bird, S., editor (1994). Computational Linguistics: Special Issue on
Computational Phonology, volume 20(3). MIT Press.
Bird, S. (1995). Computational Phonology: A Constraint-Based Approach.
Studies in Natural Language Processing. Cambridge University Press.
Bobrow, D. G. and Fraser, J. B. (1968). A phonological rule tester.
Communications of the ACM, 11:766-72.
Coleman, J. S. (1997). Phonological Representations: Their Names, Forms and
powers. Cambridge Studies in Linguistics. Cambridge University Press.
Ellison, T. M. (1992). Machine Learning of Phonological Structure. PhD thesis,
University of Western Australia.
Garofolo, J. S., Lamel, L. F., Fisher, W. M., Fiscus, J. G., Pallett, D. S., and
Dahlgren, N. L. (1986). The DARPA TIMIT Acoustic-Phonetic Continuous
Speech Corpus CDROM. NIST.
http://www.ldc.upenn.edu/Catalog/LDC93S1.html.
Gasser, M. (1992). Learning distributed representations for syllables. In
Proceedings of the Fourteenth Annual Conference of the Cognitive Science
Society, pages 396-401. Hillsdale NJ: Lawrence Erlbaum Associates.
Johnson, C. D. (1972). Formal Aspects of Phonological Description. The Hague:
Mouton.
Kaplan, R. M. and Kay, M. (1994). Regular models of phonological rule
systems. Computational Linguistics, 20:331-78.
Kisseberth, C. W. (1970). On the functional unity of phonological rules.
Linguistic Inquiry, 1:291-306.
Koskenniemi, K. (1983). Two-Level Morphology: A General Computational
Model for Word-Form Recognition and Production. PhD thesis, University
of Helsinki.
Koutsoudas, A., Sanders, G., and Noll, C. (1974). The application of
phonological rules. Language, 50:1-28.
Wheeler, D. W. (1981). Aspects of a Categorial Theory of Phonology. PhD
thesis, University of Massachusetts at Amherst.
