and Stylistics
PALA Summer School, Maribor, 2014
In this lecture...
Stylistics and style
Combining stylistics + corpus linguistics
Examples of studies combining corpus
linguistics and stylistics
Analysis of genres
Analysis of the works by particular authors
Analysis of individual texts
Analysis of variation inside texts
Corpus Tools
WMatrix
Stylistics
Stylistics is the study of
literature using methods,
theories and concepts from
linguistics
(Leech and Short 2007: 1)
it is "[...] the study of the
relationship between linguistic
form and literary function [...]"
(Leech and Short 2007: 3).
Linguistic style
Style is the way in which
language is used
(Leech and Short 2007: 31)
[S]tyle consists in choices
made from the repertoire
of the language.
(Leech and Short 2007: 31)
Linguistic style
Stylistic choice is
limited to those aspects
of linguistic choice which
concern alternative
ways of rendering the
same subject matter
(Leech and Short 2007: 31)
e.g. horse vs. steed but
not horse vs. dog
Linguistic style
Style and genre, e.g. science fiction,
romance novels, etc.
Style and author
Style and text
Style and parts of texts (e.g. the
narration or speech of different
characters)
Examples of studies
Combining corpus linguistics and
stylistics
Analysis of genres
Analysis of the works by particular authors
Analysis of individual texts
Analysis of variation inside texts
Genre style
Biber (1988) multivariate statistical techniques
factor analysis
many different variables
variables = linguistic features (e.g. passive
constructions)
Number of instances of DS, by corpus section

Section               Instances of DS
Whole corpus          2,974
Fiction               1,569
Press                 770
(Auto)biography       635

Fiction sub-section   Instances of DS
Serious               629
Popular               940
Authorial style
Studies attempting to fingerprint authors:
i.e. to identify linguistic items that
distinguish the works by one author from
those of others.
Burrows (1987): study of Jane Austen's
novels focusing on closed-class words,
such as 'the', 'and', 'of', 'a' and 'to'.
Burrows found that these words can
distinguish the works of different authors,
different novels, and even the words
spoken by different characters.
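
The counting behind this kind of study can be sketched as relative frequencies of the five closed-class words named above, measured per text. This is only an illustration: the two sample sentences are invented, and the tokenizer and the function name are assumptions, not Burrows's data or procedure.

```python
import re
from collections import Counter

# The closed-class words named on the slide.
FUNCTION_WORDS = ["the", "and", "of", "a", "to"]

def function_word_profile(text, words=FUNCTION_WORDS):
    """Return each word's frequency per 1,000 tokens in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return {w: 1000 * counts[w] / total for w in words}

# Two invented snippets standing in for texts by different authors.
text_a = "The lady went to the ball and danced with a friend of the family."
text_b = "A storm rose, and a ship of war ran to a bay to wait."

profile_a = function_word_profile(text_a)
profile_b = function_word_profile(text_b)
```

Comparing such profiles across many texts is what allows very common words to act as a stylistic signal, even though no single word is distinctive on its own.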
Authorial style
Hoover (2002) studied a series of corpora
containing chunks from novels by different
authors.
For example, he looked at a corpus
containing the first 30,000 words of 29
novels by 17 different authors.
The distribution of the 300 most frequent
words in the corpus as a whole correctly
clusters the novels of 15 out of the 17 authors.
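
The idea of grouping texts by their most frequent words can be sketched as follows. Each text becomes a vector of relative frequencies over the most frequent words of the whole corpus, and texts are compared by distance. This is a simplified stand-in (nearest neighbour by Euclidean distance), not Hoover's cluster analysis; the toy texts and all function names are assumptions.

```python
import re
from collections import Counter
from math import sqrt

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def frequency_vectors(texts, top_n=300):
    """Relative frequencies of the corpus-wide top_n words, per text."""
    token_lists = {name: tokenize(t) for name, t in texts.items()}
    corpus_counts = Counter(
        tok for toks in token_lists.values() for tok in toks
    )
    vocab = [w for w, _ in corpus_counts.most_common(top_n)]
    vectors = {}
    for name, toks in token_lists.items():
        counts = Counter(toks)
        size = len(toks) or 1  # avoid dividing by zero on empty input
        vectors[name] = [counts[w] / size for w in vocab]
    return vocab, vectors

def distance(u, v):
    """Euclidean distance between two frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest(name, vectors):
    """Name of the text whose vector is closest to that of `name`."""
    others = (n for n in vectors if n != name)
    return min(others, key=lambda n: distance(vectors[name], vectors[n]))

# Invented toy "texts": A1 and A2 share a word profile, B1 does not.
toy = {
    "A1": "the the the of of and cat",
    "A2": "the the the of of and dog",
    "B1": "a a a to to it sun",
}
vocab, vectors = frequency_vectors(toy)
closest_to_a1 = nearest("A1", vectors)
```

With real novels, the same vectors would normally be passed to a clustering procedure rather than a single nearest-neighbour lookup.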
Authorial style
An analysis of the most frequent word
sequences (n-grams) can also be useful,
e.g.
of the
in the
to the
it was
he was
and the
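
Counting word sequences like those listed above takes only a few lines. A minimal sketch, assuming a simple lowercase tokenizer; the function name is illustrative, and in this version n-grams run across punctuation and sentence boundaries.

```python
import re
from collections import Counter

def ngram_counts(text, n=2):
    """Count contiguous word sequences of length n."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Zip the token list against shifted copies of itself.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)

sample = "It was the best of times, it was the worst of times."
bigrams = ngram_counts(sample, n=2)
top = bigrams.most_common(3)
```

Setting `n=5` gives the longer sequences that cluster-based studies tend to work with.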
Authorial style
Mahlberg (2007, 2009, 2012)
Corpus stylistics and Dickens's
fiction
Also shows that analysis of
frequent word sequences
(clusters) can be useful.
Clusters containing body parts
his hands in his pockets
his head on one side
his hands upon his
Text style
Stubbs's (2005) study of
Joseph Conrad's Heart of
Darkness, first published in
1899.
Marlow, the protagonist and
first-person narrator, tells of
how he was contracted to
travel up a river in the Belgian
Congo, in order to find an ivory
trader called Kurtz, who was
the subject of stories of
madness and suspect
practices. However, Kurtz dies
while travelling back down the
river.
Text style
Main themes
hypocrisy of the colonizers
unreliability of progress and civilization
breakdowns in communication
Light vs. dark
Restraint vs. frenzy
Appearance vs. reality
Marlow's unreliable and distorted knowledge
(Stubbs 2005: 8-9)
Text style
Used WordSmith Tools (Scott 2007)
Compared one novel with a corpus of
fictional texts of around 700,000 words
Overused words in novel include: seemed,
mystery, darkness, absurd, horror, terror,
desolation
Several words concern uncertainty,
perception and knowledge.
Coincide with some of the novel's themes
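
Comparing one text against a reference corpus is usually scored with a keyness statistic. The sketch below uses log-likelihood, one common choice; the slide does not say which statistic or settings were used in the study, and the counts in the example are hypothetical, not figures from Stubbs (2005).

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness score for one word (2x2 contingency form)."""
    total = size_study + size_ref
    combined = freq_study + freq_ref
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    ll = 0.0
    # A zero observed count contributes nothing to the sum.
    if freq_study > 0:
        ll += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll

def is_overused(freq_study, size_study, freq_ref, size_ref):
    """True if the word is relatively more frequent in the study text."""
    return freq_study / size_study > freq_ref / size_ref

# Hypothetical: a word occurring 50 times in a 40,000-word novel
# and 100 times in a 700,000-word reference corpus.
score = log_likelihood(50, 40_000, 100, 700_000)
overused = is_overused(50, 40_000, 100, 700_000)
```

The score measures only how surprising the difference is; the direction (overuse or underuse) has to be read from the relative frequencies, which is why the sketch keeps the two checks separate.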
Text style
Stubbs shows how the application of
corpus methods can provide:
further justification for well-established
interpretations,
new insights into the language and
meaning potential of the text.
Corpus tools
Corpus tools make comparison
relatively easy
WordSmith Tools (Scott 2007)
WMatrix (Rayson 2009)
AntConc (Anthony 2011)
MLCT (Piao)
Summary
Style is the way in which language is
used.
The notion of style is fundamentally
based on comparison
Corpus linguistic methods are
relevant to the analysis of style in
fiction/literature.
They have been applied to the
analysis of genres, authors and texts.
Manual analysis and interpretation of
Summary
[...] corpus stylistics is not
purely a quantitative study
of literature. Rather, it is still
a qualitative stylistic
approach to the study of the
language of literature,
combined with or supported
by corpus-based
quantitative methods and
technology.
(Ho 2011: 10)
References
Culpeper, J. (2009) 'Keyness: words, parts-of-speech and semantic categories in the character-talk of Shakespeare's Romeo and Juliet', International Journal of Corpus Linguistics 14(1): 29-59.
Ho, Y. (2011) Corpus Stylistics in Principles and Practice: A Stylistic Exploration of John Fowles' The Magus. London: Continuum.
Leech, G. (2008) Language in Literature: Style and Foregrounding. Harlow, UK: Pearson.
Louw, B. (2008) 'Consolidating empirical method in data-assisted stylistics: Towards a corpus-attested glossary of literary terms', in Zyngier, S., Bortolussi, M., Chesnokova, A. and Auracher, J. (eds.) Directions in Empirical Literary Studies, pp. 243-264. Amsterdam: Benjamins.
Mahlberg, M. (2007) 'Clusters, key clusters and local textual functions in Dickens', Corpora 2(1): 1-31.
Mahlberg, M. (2009) 'Corpus stylistics and the Pickwickian watering-pot', in Baker, P. (ed.) Contemporary Corpus Linguistics, pp. 47-63. London: Continuum.
Mahlberg, M. (2012) Corpus Stylistics and Dickens's Fiction. London: Routledge.
McIntyre, D. (2010) 'Dialogue and characterization in Quentin Tarantino's Reservoir Dogs: A corpus stylistic analysis', in McIntyre, D. and Busse, B. (eds.) Language and Style, pp. 162-182. Basingstoke: Palgrave.
McIntyre, D. and Walker, B. (2010) 'How can corpora be used to explore the language of poetry and drama?', in McCarthy, M. and O'Keeffe, A. (eds.) The Routledge Handbook of Corpus Linguistics. London: Routledge.
Widdowson, H. G. (2008) 'The novel features of text: Corpus analysis and stylistics', in Gerbig, A. and Mason, O. (eds.) Language, People, Numbers: Corpus Stylistics and Society, pp. 293-304. Amsterdam: Rodopi.
WMatrix
WMatrix
Web-based corpus tool
Developed by Paul Rayson at
Lancaster University
Automated grammatical and
semantic analysis of texts/corpora
A web-based front end for CLAWS
and USAS
WMatrix
Using a web interface:
Texts are uploaded onto the Wmatrix
server (at Lancaster)
The upload procedure automatically
adds
(i) Grammatical or Part of Speech
(POS) tags;
(ii) Semantic tags
WMatrix
CLAWS grammatical (POS) tagger.
CLAWS = Constituent Likelihood
Automatic Word-tagging System
USAS semantic tagger
USAS = UCREL Semantic Analysis System
(UCREL = University Centre for Computer
Corpus Research on Language)
WMatrix
USAS
Assigns tags to each word using a
hierarchical framework of
categorization
Based originally on McArthur's
(1981) Longman Lexicon of
Contemporary English
B  THE BODY & THE INDIVIDUAL
C  ARTS & CRAFTS
E  EMOTION
F  FOOD & FARMING
G  GOVERNMENT & PUBLIC DOMAIN
H  ARCHITECTURE, HOUSING & THE HOME
I  MONEY & COMMERCE (IN INDUSTRY)
K  ENTERTAINMENT
L  LIFE & LIVING THINGS
M  MOVEMENT, LOCATION, TRAVEL, TRANSPORT
N  NUMBERS & MEASUREMENT
O  SUBSTANCES, MATERIALS, OBJECTS, EQUIPMENT
P  EDUCATION
Q  LANGUAGE & COMMUNICATION
S  SOCIAL ACTIONS, STATES & PROCESSES
T  TIME
W  WORLD & ENVIRONMENT
X  PSYCHOLOGICAL ACTIONS, STATES & PROCESSES
Y  SCIENCE & TECHNOLOGY
Z  NAMES &
WMatrix
G - Government and the public domain
G1 Government, politics and elections
    G1.1 Government, etc.
    G1.2 Politics
G2 Crime, law and order
G3 War, defence and the army: weapons
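
The hierarchical shape of these tags (a field letter, then a numbered division, then optional dotted subdivisions) means a tag can be expanded into its chain of parent categories. A minimal sketch, using only the G labels from the slide; the function name and the dictionary are illustrative assumptions, and any extra markers that real tagger output may attach to a tag are not handled.

```python
import re

# Labels taken from the slide; only the G branch is included here.
USAS_LABELS = {
    "G": "Government and the public domain",
    "G1": "Government, politics and elections",
    "G1.1": "Government, etc.",
    "G1.2": "Politics",
    "G2": "Crime, law and order",
    "G3": "War, defence and the army: weapons",
}

def tag_ancestry(tag):
    """Return the chain of codes from the top-level field down to `tag`."""
    match = re.fullmatch(r"([A-Z])(?:(\d+)((?:\.\d+)*))?", tag)
    if not match:
        raise ValueError(f"not a recognised tag shape: {tag!r}")
    letter, number, subdivisions = match.groups()
    chain = [letter]
    if number:
        current = letter + number
        chain.append(current)
        # ".1.2" splits into ["", "1", "2"]; skip the empty first item.
        for part in subdivisions.split(".")[1:]:
            current = f"{current}.{part}"
            chain.append(current)
    return chain

path = [(code, USAS_LABELS.get(code, "?")) for code in tag_ancestry("G1.2")]
```

Expanding tags this way makes it possible to count at whichever level of the hierarchy suits the analysis, for example grouping G1.1 and G1.2 together under G1.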
WMatrix
Allows analysis of texts at:
the word level
the grammatical level (POS)
and the semantic level
WMatrix
Allows text comparison at:
the word level
the grammatical level (POS)
and the semantic level
WMatrix
Keyness
Word level: Key-words
Grammatical level: Key-POS
Semantic level: Key-concepts