
Linguistic Negotiations

of Predictability and Informativeness


Phillip Potamites

January 12, 2010

1 4 Chinese Classics: 1300-1800

[Figure: four scatter panels of relative entropy (rH, y-axis, 0.0 to 1.0) against log10(prob) (x-axis, about −3.5 to −2.0). Panels: Three Kingdoms (14C), Marsh Chronicles (14C), Western Journey (16C), Stone Story (18C).]

1.1 Top 10s

w = character, p = probability, H = next-character entropy, rH = relative entropy; tok = tokens, typ = types.

3kingdoms (Three Kingdoms)
w    p       H       rH
將   0.0077  6.614   0.5565
馬   0.008   6.1247  0.5134
一   0.0082  7.2248  0.6031
大   0.0085  5.9796  0.4972
兵   0.0096  6.8322  0.5602
軍   0.0099  7.1326  0.5821
人   0.0104  6.9926  0.5678
不   0.0137  7.3593  0.5789
之   0.016   6.0142  0.4649
曰   0.0177  0.1282  0.0098
mean H: 2.8726
mean rH: 0.6352
tok: 490935
typ: 4141
ssum: 84501

MarshChronicles (Marsh Chronicles)
w    p       H       rH
裏   0.0086  6.6369  0.5581
去   0.0088  5.8709  0.4924
是   0.0091  8.0159  0.6696
個   0.0109  7.9621  0.6509
不   0.0133  6.9179  0.5526
人   0.0144  6.9081  0.5464
一   0.0151  6.9949  0.5506
來   0.0154  5.2464  0.412
道   0.017   0.9782  0.076
了   0.0191  5.7964  0.4442
mean H: 2.6453
mean rH: 0.5822
tok: 442894
typ: 3602
ssum: 64570

TravelWest (Western Journey)
w    p       H       rH
個   0.0096  8.4675  0.6786
他   0.0097  7.4035  0.5928
來   0.0101  5.3607  0.4272
是   0.0109  7.4956  0.5921
我   0.0121  7.1087  0.5547
那   0.0126  7.1198  0.5531
了   0.013   5.7659  0.4463
一   0.0134  7.1426  0.5511
不   0.0149  7.0349  0.5364
道   0.0184  1.0047  0.0749
mean H: 2.8387
mean rH: 0.6306
tok: 594502
typ: 4557
ssum: 98798

Dreams (Stone Story)
w    p       H       rH
我   0.0126  6.732   0.5119
說   0.0129  4.9212  0.3729
是   0.0138  7.8821  0.5933
人   0.0144  7.0574  0.5287
道   0.015   1.4555  0.1085
來   0.0156  4.8955  0.3635
一   0.0166  7.0745  0.5218
不   0.0205  7.0137  0.506
的   0.0214  7.0827  0.5087
了   0.029   4.8105  0.3351
mean H: 2.8891
mean rH: 0.6503
tok: 724688
typ: 4318
ssum: 106170
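The H and rH columns can be illustrated with a minimal sketch. The source does not define rH. The tabulated values are consistent with rH = H / log2(count): for 將 in 3kingdoms, 0.0077 × 490935 ≈ 3780 occurrences, log2(3780) ≈ 11.88, and 6.614 / 11.88 ≈ 0.557. That normalisation is therefore an assumption here, not the author's stated method. The function name `next_char_stats` and the sample string are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def next_char_stats(text):
    """Return {char: (p, H, rH)} for every character that has a successor."""
    followers = defaultdict(Counter)
    for cur, nxt in zip(text, text[1:]):
        followers[cur][nxt] += 1
    total = len(text)
    counts = Counter(text)
    stats = {}
    for ch, nxt_counts in followers.items():
        n = sum(nxt_counts.values())
        # Shannon entropy (bits) of the distribution over following characters.
        h = sum((c / n) * log2(n / c) for c in nxt_counts.values())
        # Assumed normalisation: n observations can yield at most log2(n) bits.
        # With a single observation the ratio is undefined, so report 0.
        rh = h / log2(n) if n > 1 else 0.0
        stats[ch] = (counts[ch] / total, h, rh)
    return stats


# Toy string, not taken from any of the four novels.
sample = "曰不曰之曰不人之人不"
for ch, (p, h, rh) in sorted(next_char_stats(sample).items(), key=lambda kv: kv[1][0]):
    print(ch, round(p, 4), round(h, 4), round(rh, 4))
```

Under this reading, a character such as 曰 or 道, which is almost always followed by the same material, gets an rH near 0, while a character with varied successors approaches 1.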

Frequency of top 10, 100, 1000 frequency changers:
[Figure: three panels, "chars' frequency by book"; x-axis: books (3K, Mh, JW, Dm); y-axis: log(cnt/tok,10), −6 to 0.]

Frequency of bottom 10, 100, 1000 frequency changers:
[Figure: three panels, "chars' frequency by book"; x-axis: books (3K, Mh, JW, Dm); y-axis: log(cnt/tok,10), −6 to 0.]

‘Next word’ relative entropy of top 10, 100, 1000 frequency changers:
[Figure: three panels, "chars' 'next char' relative entropy by book"; x-axis: books (3K, Mh, JW, Dm); y-axis: rH, 0.0 to 1.0.]

‘Next word’ relative entropy of bottom 10, 100, 1000 frequency changers:
[Figure: three panels, "chars' 'next char' relative entropy by book"; x-axis: books (3K, Mh, JW, Dm); y-axis: rH, 0.0 to 1.0.]
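The source does not say how a "frequency changer" is scored. One plausible reading, sketched below, ranks characters by the variance of log10(cnt/tok) across the books, so that "top" changers vary most and "bottom" changers least. The scoring rule, the restriction to characters attested in every book, and the names `log_freqs` and `rank_changers` are all assumptions made for illustration.

```python
from collections import Counter
from math import log10
from statistics import pvariance


def log_freqs(books):
    """Map each character found in every book to its log10(cnt/tok) per book."""
    counts = [Counter(text) for text in books]
    totals = [len(text) for text in books]
    shared = set(counts[0])
    for c in counts[1:]:
        shared &= set(c)
    return {ch: [log10(c[ch] / tok) for c, tok in zip(counts, totals)] for ch in shared}


def rank_changers(books):
    """Characters sorted from largest to smallest variance of log frequency."""
    table = log_freqs(books)
    # Ties are broken by the character itself so the order is deterministic.
    return sorted(table, key=lambda ch: (-pvariance(table[ch]), ch))


# Toy "books"; real input would be the full text of each novel.
toy_books = ["aaaabbcc", "abbbcccc"]
ranking = rank_changers(toy_books)
print("top changer:", ranking[0], "bottom changer:", ranking[-1])
```

Taking logs before the variance keeps rare and common characters comparable, which matches the log scale on the plots.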

Top 10, 100, 1000 freq. changers at 2 perspectives:
(though the drop lines aren’t much help beyond 100)

[Figure: three rows of paired 3D scatter plots (top 10, 100 and 1000 frequency changers), each shown from two perspectives, "Books X rH X Log-Prob." and "Log-Prob. X Books X rH"; axes: books (3K, Mh, JW, Dm), rH (0.0 to 1.0), log-prob. (about −6 to −1).]

There is a measurable inverse correlation of relative entropy with frequency.

[Figure: four scatter panels (3K, MH, JW, DM) of freq (y-axis, log scale, about −5 to −2) against rH (x-axis, 0.0 to 1.0).]

Isn’t that evidence against anti-redundancy? Higher frequency correlates with lower entropy, i.e. higher
redundancy. Can this be interpreted as the opposing force, the need for robustness? Total diversity arises only
infrequently; higher frequencies require some repetition.

Variance of log−prob as a function of variance of rH


[Figure: scatter plot; x-axis: var(rH), 0.00 to 0.25; y-axis: var(log-prob), 0.0 to 1.5.]

anova(lm(var(log-prob)∼var(rH)))
             Df    Sum Sq   Mean Sq   F value   Pr(>F)
‘var(rh)’?   1     6.365    6.365     177.09    < 2.2e-16 ***
Residuals    2231  80.188   0.036

lm(var(log-prob)∼var(rH))
(Intercept)   var(rH)(?)
0.1396        1.2821

cor.test(log-prob,rH) (Pearson’s product-moment correlation)
cor=0.2711801, t = 13.3074, df = 2231, p-value < 2.2e-16

cor.test(log-prob,rH,method="spearman")
rho=0.3964343, S = 1120055042, p-value < 2.2e-16
(Cannot compute exact p-values with ties)

cor.test(log-prob,rH,method="kendall")
tau=0.2694387, z = 19.0833, p-value < 2.2e-16
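As a quick arithmetic check, the reported ANOVA row can be recomputed from its sums of squares. The sketch below uses only numbers printed in the table. It also shows that sqrt(R²) and sqrt(F) come out close to the reported Pearson cor and t. This suggests, though the source does not say so, that the correlation tests were run on the same per-character variances as the regression.

```python
from math import sqrt

# Values copied from the ANOVA table for var(log-prob) ~ var(rH).
ss_model, df_model = 6.365, 1
ss_resid, df_resid = 80.188, 2231

ms_model = ss_model / df_model              # mean square of the predictor
ms_resid = ss_resid / df_resid              # residual mean square
f_value = ms_model / ms_resid               # F statistic on (1, 2231) df
r_squared = ss_model / (ss_model + ss_resid)
r = sqrt(r_squared)                         # |Pearson r| for a one-predictor fit
t_value = sqrt(f_value)                     # t statistic of the slope

print("residual mean square:", round(ms_resid, 3))
print("F:", round(f_value, 2))
print("R^2:", round(r_squared, 4), "r:", round(r, 4), "t:", round(t_value, 2))
```

An R² of roughly 0.07 means var(rH) accounts for only a small share of the spread in var(log-prob), even though the effect is highly significant with 2233 characters.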

Mean rH cancels itself out as a predictor of frequency variance.
(though central means appear to allow higher variance)

Variance of log−prob as a function of mean of rH

[Figure: scatter plot; x-axis: mean(rH), 0.0 to 1.0; y-axis: var(log-prob), 0.0 to 1.5.]

anova(lm(var(log-prob)∼mean(rH)))
            Df    Sum Sq   Mean Sq   F value   Pr(>F)
mean(rH)    1     0.024    0.024     0.6214    0.4306
Residuals   2231  86.529   0.039

But mean’s difference from .5 is a significant predictor of frequency variance.


(though high variance seems to require low mean difference from the center.)

Variance of log−prob as a function of mean's sq−diff from .5


[Figure: scatter plot; x-axis: (mean(rH)−.5)^2, 0.00 to 0.25; y-axis: var(log-prob), 0.0 to 1.5.]

anova(lm(var(log-prob) ~ (mean(rH) - .5)^2))
                    Df   Sum Sq  Mean Sq  F value  Pr(>F)
(mean(rH) - .5)^2    1    0.211    0.211   5.4627  0.01951 *
Residuals         2231   86.342    0.039

(all of which may just suggest that comparing means is stupid.)
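
The two ANOVA tables above differ only in the predictor: mean(rH) itself versus its squared distance from .5. A minimal Python sketch of that comparison follows, assuming an ordinary one-predictor least-squares fit. The function names and the toy data are hypothetical, chosen only so that variance peaks near the center; they are not the corpus values.

```python
def simple_regression_f(x, y):
    """F statistic for a one-predictor least-squares fit y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_model = ss_total - ss_resid          # "Sum Sq" of the predictor row
    df_resid = n - 2                        # "Df" of the Residuals row
    f_value = ss_model / (ss_resid / df_resid)
    return {"df": (1, df_resid), "ss_model": ss_model,
            "ss_resid": ss_resid, "f": f_value}


def squared_distance_from_half(mean_rh):
    """The second predictor: (mean(rH) - .5)^2."""
    return [(m - 0.5) ** 2 for m in mean_rh]


# Hypothetical toy data (NOT from the corpus): variance is highest at .5.
mean_rh = [0.1, 0.3, 0.5, 0.7, 0.9]
var_logprob = [0.05, 0.25, 0.45, 0.25, 0.05]

linear = simple_regression_f(mean_rh, var_logprob)
quadratic = simple_regression_f(squared_distance_from_half(mean_rh), var_logprob)
print(linear["f"], quadratic["f"])
```

On data shaped like this, the raw mean explains essentially nothing while the squared distance from .5 explains most of the variance, which is the pattern the two tables report in a much weaker form.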

Means support typical probabilities in 10,000ths, and rH around .6.
Variance is typically low.
[Four histograms, y-axis: Frequency. Panels: log-freq. mean (x-axis -5.5 to -2.5); log-freq. var (x-axis 0.0 to 1.5); rH mean (x-axis 0.0 to 1.0); rH var (x-axis 0.00 to 0.25)]

2 Does anti-redundancy apply to suprasegmentals?
2.1 list of unique Chinese syllable-tone combinations
ce [4] cou [4] de [2] dei [3] dia [3] diu [1] ei [1] fo [2] gei [3] hei [1] keng [1] kuo [4] lia [3] lu:e [4] miu
[4] nen [4] neng [2] nin [2] nou [4] nu:e [4] nuan [3] ri [4] ruo [4] se [4] sen [1] seng [1] shei [2] te [4]
teng [2] yo [1] zei [2]
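
If "unique" here means syllables attested with exactly one tone, a list like the one above could be derived roughly as follows. This is only a sketch under that reading. The numbered-pinyin input format (`ma1`, `gei3`) and the function name are assumptions, and the sample tokens are illustrative.

```python
from collections import defaultdict


def single_tone_syllables(tokens):
    """Return (syllable, tone) pairs for syllables seen with only one tone.

    Assumes each token is numbered pinyin, e.g. "gei3" (hypothetical format).
    """
    tones_by_syllable = defaultdict(set)
    for token in tokens:
        syllable, tone = token[:-1], int(token[-1])
        tones_by_syllable[syllable].add(tone)
    return sorted(
        (syllable, next(iter(tones)))
        for syllable, tones in tones_by_syllable.items()
        if len(tones) == 1
    )


# Illustrative tokens only: "ma" occurs with three tones and is excluded.
sample = ["ma1", "ma3", "gei3", "shei2", "ma4", "gei3"]
print(single_tone_syllables(sample))  # [('gei', 3), ('shei', 2)]
```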

3 What do partial allophones say about redundancy?


3.1 distributions of the Japanese syllabary

3.1.1 pseudo-labials
[Bar chart; y-axis ticks 0 to 400; bars: ha hi hu he ho ba bi bu be bo pa pi pu pe po fa fi fu fe fo]

3.1.2 affricates
[Bar chart; y-axis ticks 0 to 1000; bars: cha chi chu che cho tsa tsi tsu tse tso]

3.1.3 stops
[Bar chart; y-axis ticks 0 to 3000; bars: ta ti tu te to da di du de do ka ki ku ke ko ga gi gu ge go]

3.1.4 fricatives
[Bar chart; y-axis ticks 0 to 2000; bars: sa si su se so sha shi shu she sho za zi zu ze zo]

3.1.5 sonorant
[Bar chart; y-axis ticks 0 to 3500; bars: ya yi yu ye yo wa wi wu we wo ra ri ru re ro ma mi mu me mo na ni nu ne no]

3.1.6 consonants
[Bar chart; y-axis ticks 0 to 10000; bars: h b p f t d k g s sh z ch ts y w r m n]

3.1.7 vowels
[Bar chart; y-axis ticks 0 to 20000; bars: a i u e o]

3.1.8 all CV syllables
[Bar chart of all CV syllables; y-axis ticks 0 to 3500; x-axis labels (a subset of the bars): ha he bi bo pu fa fe ti to du ka ke gi go su shi za ze chi tsa tso ye wi wo ru ma mo ne]

3.1.9 summary
The allophonic status of /f/ and /ts/ appears quite strong. Major forces of change, however, seem
to be acting on /sh/ and /ch/.
Grammatical effects appear to interfere strongly with phonetic uniformity. /ga/, /wa/, and
/no/ all stand out dramatically within their classes. Such functional items are acknowledged
manifestations of redundancy. To what degree do they disprove the theory of anti-redundancy?
What are the information metrics for these distributions? What’s /ta/?
Grenon 2005 seems to give the most elegant, general explanation of the alternations: palatalization
before the high front vowel and affrication of stops before the high back vowel (or
fricativization of /h/, which becomes /f/).
from Grenon 2005:

Grenon 2006 argues that Japanese adults can distinguish /s/-/θ/ and /z/-/ð/, though the former
more than the latter, merely stating that the “contrast is unexpected, and suggest that
perception of sound contrasts is not solely constrained by the perception of phonological features”
(p. 7); the obvious frequency discrepancies between voiced and unvoiced segments seem a suggestive
difference.
Itô and Mester (1993) have argued that ‘*TI’ and ‘*F/*TS’ are not necessarily “enforced in
the periphery”, while FU/TSU (apparently meaning *HU/*TU) “hold throughout the lexicon”
(p. 12; they give examples of /f/ followed by phonemes other than /u/, but claim there are no
examples of /hu/ or /tu/; they also give examples where /ti/ has been adopted as /i/ or /te/,
and others where it has remained /ti/).
That’s all very interesting, but not very relevant to the above corpora, where we have seen /f/
and /tsu/ to be far more constrained than /sh/ and /ch/, which appear to be losing all allophonic
status.

4 Are Japanese words longer than Chinese words?


In particular, I am concerned that Japanese words may be lengthened by highly predictable
(often identical) honorifics.
The following histograms indicate somewhat greater length, though in fact it was the “words” of the
Sinica Treebank that ran up to 20 characters/syllables.

[Three histograms; y-axis ticks 0.0 to 0.4; bars listed in plotted order.
English Word Length by Syllable: 1 2 3 4 5 6 7 8
Chinese Word Length by Syllable: 2 1 3 4 5 6 7 8 9 10
Japanese Word Length by Syllable: 1 2 3 4 5 6 7 0 8 9]

(0 in the Japanese histogram is the vowelless ‘n’, which appeared 48 times and was left as such
due to my own uncertainty about whether it should count as syllabic or not).
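
One way a length of 0 can arise is if syllables are counted by vowel nuclei in romanized text, so that a bare ‘n’ has none. The sketch below assumes that counting rule, which the text does not confirm. The function names and sample words are hypothetical, and long vowels or diphthongs would be over-counted by this simplification.

```python
from collections import Counter

VOWELS = set("aiueo")


def syllable_count(romanized_word):
    """Count vowel letters as syllable nuclei (assumed rule); 'n' alone gives 0."""
    return sum(1 for ch in romanized_word if ch in VOWELS)


def length_distribution(words):
    """Relative frequency of each word length, keyed by syllable count."""
    counts = Counter(syllable_count(word) for word in words)
    total = sum(counts.values())
    return {length: counts[length] / total for length in sorted(counts)}


# Illustrative words only, not corpus data.
sample = ["n", "wa", "neko", "sakana", "no"]
distribution = length_distribution(sample)
print(distribution)
```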

5 English Syllables vs. Letters


Given that English writing is said to derive its many inconveniences from an older form of
pronunciation, it would seem reasonable to conclude that the latter changes faster than the former.
Furthermore, since Saussure, it has been common to consider speech the authentic, natural form
of language, with writing viewed as a sort of stagnant derivative (though Derrida’s early work
seriously disputes these stereotypes). Finally, common textbook examples of Shannon’s entropy
frequently claim to show language’s low information content from the supposed redundancy of
English orthography.
However, the following table shows that the letter frequency distribution of written English words
is barely more redundant than that of the syllables used to pronounce them (the text is from
the PTB and the pronunciation comes from the CMUDICT). In fact, both approach maximal
entropy, as shown by the relative entropy, rH.

                                     H        rH
syllables                            4.7673   0.90198
letters (lowercased)                 4.1809   0.88948
letters (with caps)                  4.38891  0.76992
letters (with caps & punctuation)    4.76447  0.75364
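
The rH column appears to be entropy normalized by its maximum, rH = H / log2(N), where N is the number of distinct symbols: the lowercased-letter row matches N = 26 and the capitalized row matches N = 52. A minimal Python sketch under that reading follows; the function names are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Shannon entropy H (in bits) of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def relative_entropy(symbols, alphabet_size=None):
    """rH = H / log2(N); N defaults to the number of distinct observed symbols."""
    n = alphabet_size if alphabet_size is not None else len(set(symbols))
    if n < 2:
        return 0.0  # a one-symbol alphabet carries no information
    return entropy_bits(symbols) / math.log2(n)


# A uniform distribution reaches the maximum: H = 2 bits, rH = 1.
print(entropy_bits("abcd"), relative_entropy("abcd"))

# Consistency check against the table: H for lowercased letters over log2(26).
table_h_lowercased = 4.1809
print(table_h_lowercased / math.log2(26))  # close to the table's 0.88948
```

Note that "relative entropy" in this sense is a normalized entropy, not the Kullback-Leibler divergence that the same term often denotes.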

These facts are not entirely fatal for a theory claiming evolutionary pressures for efficiency in
language. Phonetic writing is intended to imitate speech anyway. Additionally, the claim that
‘natural’ languages must be efficient in no way implies that ‘artificial’ languages could not be.
However, it is a little disappointing that orthography cannot be added to the example of computer
languages to show higher redundancy in less adaptable systems.
Collocational statistics look even worse. Though syllables and lowercased letters each pass t-tests
against the capitalized and punctuated samples, they fail such a test against each other.
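
The text does not say which t-test variant was used. As an illustration only, the sketch below computes Welch's unequal-variance t statistic and its approximate degrees of freedom. The sample values are hypothetical rH measurements, not the corpus results, and no p-value is computed.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na   # squared standard error of mean A
    vb = variance(sample_b) / nb   # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical collocational rH samples (NOT the corpus values).
syl = [0.62, 0.60, 0.64, 0.61]   # syllables
llc = [0.61, 0.63, 0.60, 0.62]   # lowercased letters
lcp = [0.45, 0.47, 0.44, 0.46]   # letters with capitals

print(welch_t(syl, llc))  # small |t|: the two samples are hard to tell apart
print(welch_t(syl, lcp))  # large |t|: clearly different means
```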

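[Plot: collocational rH; y-axis: rH, 0.0 to 1.0; x-axis categories: syl, llc, lcp, lpnc (presumably syllables, lowercased letters, letters with caps, letters with caps and punctuation)]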

The evidence suggests that lowercase letters are quite efficient, but capitalization and
punctuation much less so.
