
Hashing and Message Authentication Codes

Andreas Klappenecker
Texas A&M University
Encryption is not all!

• Alice can use encryption to protect privacy

• Need key. Key distribution and management
• Protect integrity of message
• Authentication
Remember Eve
[Figure: Alice's message "Let’s go to Christopher’s tonight! Love, Alice" passes Eve on its way to Bob; what Bob receives is signed "Love, Alice".]

Authentication and integrity needed!

Hash Functions

A transformation of a message of
arbitrary length into a fixed-length
number is called a hash function

Alternate names are fingerprint or digest
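
As a small illustration, the following sketch hashes messages of very different lengths and checks that the digests all have the same fixed length. It assumes Python's standard hashlib module and uses SHA-256 as the example hash; the helper name digest is made up for this sketch.

```python
import hashlib

def digest(message: bytes) -> str:
    """Return the SHA-256 fingerprint of a message as a hex string."""
    return hashlib.sha256(message).hexdigest()

short = digest(b"hi")
long_ = digest(b"x" * 1_000_000)

# Both digests are 256 bits = 64 hex characters, whatever the input length.
print(len(short), len(long_))
```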

Hash Functions are Versatile
Hash functions are used for
• message and file integrity
• fingerprints of keys
• authentication
• digital signatures
Desirable Properties of a Hash Function H

1) It should be possible to efficiently compute the
hash value z=H(m) of a message m.
2) Given the hash value z=H(m), it should be
computationally infeasible to find m. A function with
this property is called a one-way function.
3) Given a message m, it should be infeasible to find
another message m’ ≠ m such that H(m)=H(m’).
4) It should be infeasible to find any two distinct
messages m and m’ such that H(m)=H(m’).
Property 3) is known as weak collision resistance, and
Property 4) is known as strong collision resistance.
Birthday Attacks
Any function H: {0,1}* -> {0,1}^n must have
infinitely many collisions.

About 2^(n/2) evaluations of H suffice to find, with
good probability, two messages m and m’ that have a
collision, H(m)=H(m’).

This means n must be reasonably large,
otherwise H cannot be collision resistant.
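
The birthday bound can be observed on a deliberately weakened hash. The sketch below is an illustration under stated assumptions: the "n bit hash" is simply SHA-256 truncated to its first n bits, and the names truncated_hash and find_collision are made up for this example.

```python
import hashlib

def truncated_hash(message: bytes, n_bits: int) -> int:
    """First n_bits of SHA-256(message) as an integer (a toy n bit hash)."""
    full = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return full >> (256 - n_bits)

def find_collision(n_bits: int):
    """Hash distinct messages until two of them share a hash value."""
    seen = {}  # hash value -> message that produced it
    counter = 0
    while True:
        message = str(counter).encode()
        z = truncated_hash(message, n_bits)
        if z in seen:
            return seen[z], message, counter + 1
        seen[z] = message
        counter += 1

# For n = 24 the number of evaluations is on the order of 2^12 = 4096,
# far fewer than the 2^24 possible hash values.
m1, m2, evaluations = find_collision(24)
print(m1, m2, evaluations)
```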
Example
Suppose a hash function H produces n bit values.

Compose a document nice treaty and about 2^(n/2) + 1
semantically equivalent versions.
Similarly, compose an evil treaty and about 2^(n/2) + 1
semantically equivalent versions.

With probability ½ or more there will be a version of the
nice treaty and a version of the evil treaty that have the
same hash value.
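
The probability claim can be checked with a rough estimate. The sketch below assumes the usual heuristic that every (nice, evil) pair collides independently with probability 2^-n, so that r versions of each treaty give a collision probability of about 1 - exp(-r^2 / 2^n). The function name is made up for this example.

```python
import math

def cross_collision_probability(n_bits: int, versions: int) -> float:
    """Approximate chance that some nice version and some evil version collide.

    Heuristic: each of the versions * versions pairs collides independently
    with probability 2^-n, so P is about 1 - exp(-pairs / 2^n).
    """
    pairs = versions * versions
    return -math.expm1(-pairs / 2.0**n_bits)

n = 64
p = cross_collision_probability(n, 2 ** (n // 2))
print(p)  # about 0.63, which is above 1/2
```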
Hash Algorithms
• The message digest algorithm MD5 by Ron Rivest
with 128 bit hash values.

• The secure hash algorithm SHA-1. It was developed
by NSA and standardized by NIST. This algorithm
uses 160 bit hash values encoded in 5 x 32 bit words.

• The family SHA-256, SHA-384, SHA-512 of hash
functions that are supposed to be used with AES.
They will be part of the NIST Cryptographic Toolkit.
Why are these bit lengths used?
MD5
MD5 processes a message in blocks of 512 bits and
produces a hash of length 128 bits.
A message of arbitrary length is padded to a length of
k bits with k ≡ 448 (mod 512).
A 64 bit string describing the length of the message is
appended. The message length is now a multiple of 512 bits.

The hashing is done block-by-block.
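
The padding step can be sketched as follows. The details that are not on the slide (a single 1 bit followed by zeros, and the length stored low-order byte first) are taken from RFC 1321; the function name md5_pad is made up for this example.

```python
import struct

def md5_pad(message: bytes) -> bytes:
    """Pad a message as in RFC 1321 so its length is a multiple of 64 bytes."""
    bit_length = (8 * len(message)) % 2**64
    padded = message + b"\x80"                     # a single 1 bit, then zeros
    padded += b"\x00" * ((56 - len(padded)) % 64)  # length is now 448 mod 512 bits
    padded += struct.pack("<Q", bit_length)        # 64 bit length, low-order byte first
    return padded

print(len(md5_pad(b"abc")))  # 64 bytes = one 512 bit block
```

Note that padding is always added, so a message of exactly 448 bits still gains a full extra block.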

MD5
A buffer containing four words A,B,C,D of 32 bits is used
to compute the hash value. Initializations are
word A: 01 23 45 67 word B: 89 ab cd ef
word C: fe dc ba 98 word D: 76 54 32 10
The procedure uses four boolean functions that operate
bitwise on 32 bit words:
F(X,Y,Z) = XY v not(X) Z
G(X,Y,Z) = XZ v Y not(Z)
H(X,Y,Z) = X xor Y xor Z
I(X,Y,Z) = Y xor (X v not(Z))
and a table of constants
T[i] = floor(4294967296 * abs(sin(i))) for i = 1,…,64 (i in radians)
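
These definitions translate almost literally into code. The sketch below is an illustration, with MASK as a helper introduced here to keep Python's unbounded integers within 32 bits; the check against 0xd76aa478 uses the first constant listed in RFC 1321.

```python
import math

MASK = 0xFFFFFFFF  # keep results within 32 bits

def F(x, y, z): return ((x & y) | (~x & z)) & MASK
def G(x, y, z): return ((x & z) | (y & ~z)) & MASK
def H(x, y, z): return (x ^ y ^ z) & MASK
def I(x, y, z): return (y ^ (x | ~z)) & MASK

# T[1..64]; T[0] is unused so that the indices match the slide.
T = [0] + [int(4294967296 * abs(math.sin(i))) for i in range(1, 65)]

print(hex(T[1]))  # 0xd76aa478
```

F acts as a bitwise selector: where a bit of X is 1 it takes the bit from Y, otherwise from Z.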
MD5
The algorithm proceeds in four rounds that operate on 16
words X[k] of 32 bits, processing 16 x 32 = 512 bits.

The operation [abcd k s i] is short for
a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s)
where + is addition modulo 2^32 and <<< s is a left rotation by s bits.
The first round consists of the 16 operations
[ABCD 0 7 1] [DABC 1 12 2] [CDAB 2 17 3] [BCDA 3 22 4]
[ABCD 4 7 5] [DABC 5 12 6] [CDAB 6 17 7] [BCDA 7 22 8]
[ABCD 8 7 9] [DABC 9 12 10] [CDAB 10 17 11] [BCDA 11 22 12]
[ABCD 12 7 13] [DABC 13 12 14] [CDAB 14 17 15] [BCDA 15 22 16]

The next three rounds are similar…

The hash value is A B C D.
MD5
At the end of the four rounds, the result is added to the
previous values of ABCD.
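
Putting the pieces together gives a compact, unoptimized sketch of the whole algorithm. It is meant for study only. The word orders and shift amounts of rounds two to four, which the slides do not list, are taken from RFC 1321; the names md5_hex, rotl, S and MASK are made up for this example.

```python
import math
import struct

MASK = 0xFFFFFFFF
# Rotation amounts s for the four rounds, as listed in RFC 1321.
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
# The constants T[1..64], stored 0-based here.
T = [int(4294967296 * abs(math.sin(i + 1))) for i in range(64)]

def rotl(x: int, s: int) -> int:
    """Rotate the 32 bit value x left by s bits."""
    x &= MASK
    return ((x << s) | (x >> (32 - s))) & MASK

def md5_hex(message: bytes) -> str:
    # Padding: a 1 bit, zeros up to 448 mod 512 bits, then the 64 bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", (8 * len(message)) % 2**64)

    # Words A, B, C, D (the byte strings on the slide, read low-order byte first).
    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    for offset in range(0, len(padded), 64):
        X = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = state
        for step in range(64):
            if step < 16:      # round 1 uses F
                f, k = (b & c) | (~b & d), step
            elif step < 32:    # round 2 uses G
                f, k = (b & d) | (c & ~d), (5 * step + 1) % 16
            elif step < 48:    # round 3 uses H
                f, k = b ^ c ^ d, (3 * step + 5) % 16
            else:              # round 4 uses I
                f, k = c ^ (b | ~d), (7 * step) % 16
            new_b = (b + rotl(a + f + X[k] + T[step], S[step])) & MASK
            a, b, c, d = d, new_b, b, c   # rotate the roles of a, b, c, d
        # Add the result of the four rounds to the previous values of ABCD.
        state = [(s + v) & MASK for s, v in zip(state, (a, b, c, d))]
    return struct.pack("<4I", *state).hex()

print(md5_hex(b"abc"))  # 900150983cd24fb0d6963f7d28e17f72
```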
MD5 Reference
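A detailed description of MD5 is contained in
RFC 1321.

Hans Dobbertin found collisions for the compression
function of MD5, and collisions for the full algorithm
have since been published (Wang et al., 2004). MD5 is
therefore not collision resistant, and it is not
advisable to use this algorithm.

MD5 is nevertheless still used in IPSec and other protocols.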

Secure Hash Algorithm
The four secure hash algorithms SHA-1 and SHA-256,
SHA-384, SHA-512 are described in the FIPS 180-2
document. See the CryptoToolkit page
http://csrc.nist.gov/CryptoToolkit/tkhash.html

You should browse through the standards posted by
NIST. The Cryptographic Toolkit contains all primitives
for authentication, encryption, digital signatures, etc.
Block Ciphers from Hash Functions
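An interesting result by Luby and Rackoff shows that a
three-round Feistel cipher is strong, in the sense of being
indistinguishable from a random permutation under
chosen-plaintext attack, if three independent random
functions f1, f2, f3 are used as round functions.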

[Figure: three-round Feistel network with round functions f1, f2, f3]

Thus, if we have a cryptographically strong hash function,
then we can immediately get a strong cipher.
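
A three-round Feistel cipher built from a hash function might look like the sketch below. It is an illustration only, not a vetted cipher. It assumes HMAC-SHA256 (truncated) as a keyed stand-in for the random functions f1, f2, f3, and a block size of 32 bytes; all names are made up for this example.

```python
import hashlib
import hmac

HALF = 16  # bytes per half block, so the block size is 32 bytes

def round_function(key: bytes, round_index: int, half: bytes) -> bytes:
    """Keyed hash playing the role of f1, f2 or f3."""
    data = bytes([round_index]) + half
    return hmac.new(key, data, hashlib.sha256).digest()[:HALF]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for r in (1, 2, 3):
        # One Feistel round: (L, R) -> (R, L xor f_r(R))
        left, right = right, xor(left, round_function(key, r, right))
    return left + right

def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for r in (3, 2, 1):
        # Undo one round: (L', R') -> (R' xor f_r(L'), L')
        left, right = xor(right, round_function(key, r, left)), left
    return left + right

key = b"demo key"
plaintext = b"0123456789abcdef0123456789ABCDEF"
ciphertext = encrypt_block(key, plaintext)
print(decrypt_block(key, ciphertext) == plaintext)  # True
```

Decryption never inverts the round function itself, which is why a one-way hash can serve as the building block.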
Remark
Some countries deem it necessary to restrict
the export of encryption algorithms.

Hash functions are usually not subject to such
export restrictions. The previous result shows
how absurd such policies are.
Message Authentication Codes
A message authentication code is a family of functions
h_k that are parameterized by a secret key k such that

1) for a given k and an arbitrary input x, it is easy to
compute the MAC value h_k(x).
2) the function h_k maps a message x of arbitrary
length to a value with a fixed number of bits n.
3) if the key k is not known, then it is computationally
infeasible to compute the MAC value h_k(m) of some
new message m, even if valid MAC values are
known for other messages m_1,…,m_g not equal to m.
Message Authentication

• Alice encrypts the message m to ensure privacy.
The resulting cryptogram is E(K,m).
• She forms the message authentication code
MAC(K,m) and sends both E(K,m) and MAC(K,m).
• Actually, the proper use of the MAC is more
complicated than that, as you will see shortly...
Message Authentication Codes
from Block Ciphers
We can use any block cipher to construct a MAC; this is
the so-called CBC-MAC.

For a sequence of plaintext blocks P_1,…,P_k do

H_0 := IV (some initial vector)
H_i := E(K, P_i xor H_{i-1}) for all i from 1 to k
MAC := H_k

This is simply the CBC mode of the block cipher, but only
the last block is transmitted and all others are deleted.
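
The CBC-MAC loop can be sketched as follows. Python's standard library has no block cipher, so this example assumes a toy 16 byte block cipher (a 4-round Feistel network over HMAC-SHA256) as a stand-in for E(K, .). The names toy_encrypt and cbc_mac are made up, and the toy cipher is not meant for real use.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for E(K, .): a 4-round Feistel network built from HMAC-SHA256."""
    left, right = block[:8], block[8:]
    for r in range(4):
        f = hmac.new(key, bytes([r]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_mac(key: bytes, blocks, iv: bytes = bytes(BLOCK)) -> bytes:
    h = iv                                # H_0 := IV
    for p in blocks:                      # H_i := E(K, P_i xor H_{i-1})
        h = toy_encrypt(key, xor(p, h))
    return h                              # MAC := H_k

tag = cbc_mac(b"demo key", [b"A" * BLOCK, b"B" * BLOCK])
print(tag.hex())
```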
Using CBC-MAC
A number of attacks are known when the CBC-MAC is
simply applied to the message itself.

Rather use the following recipe:

1) Form s := l || m where l = length(m), encoded in a fixed-length format
2) Pad s until it is a multiple of the block size
3) Apply CBC-MAC to the padded string
4) Use the last block and delete the others.
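
One of the known attacks on plain CBC-MAC is easy to reproduce: if t is the tag of a one-block message m (with zero IV), then the two-block message m || (m xor t) has the same tag t. The sketch below demonstrates this and then applies the recipe. It prepends the length (s = l || m), which is the ordering Ferguson and Schneier give. The toy cipher (a Feistel network over HMAC-SHA256) is a stand-in for a real block cipher, and all names are made up for this example.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for E(K, .): a 4-round Feistel network built from HMAC-SHA256."""
    left, right = block[:8], block[8:]
    for r in range(4):
        f = hmac.new(key, bytes([r]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_mac_raw(key: bytes, data: bytes) -> bytes:
    """Plain CBC-MAC with zero IV; len(data) must be a multiple of BLOCK."""
    h = bytes(BLOCK)
    for i in range(0, len(data), BLOCK):
        h = toy_encrypt(key, xor(data[i:i + BLOCK], h))
    return h

def cbc_mac_with_length(key: bytes, message: bytes) -> bytes:
    s = len(message).to_bytes(8, "big") + message  # 1) s := l || m
    s += bytes(-len(s) % BLOCK)                    # 2) pad with zero bytes
    return cbc_mac_raw(key, s)                     # 3), 4) keep the last block

key = b"demo key"
m = b"sixteen byte msg"        # exactly one block
t = cbc_mac_raw(key, m)
forged = m + xor(m, t)         # built without knowing the key

print(cbc_mac_raw(key, forged) == t)  # True: the forgery is accepted
print(cbc_mac_with_length(key, forged) == cbc_mac_with_length(key, m))  # False
```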
Message Authentication Codes
from Hash Functions
A hash function with n bits provides n/2 bits of security
against certain attacks. A MAC should provide n bits of
security. This means we cannot define MAC(K,m) as
H( K || m) or the like.

Instead, the idea of HMAC is to use the following simple
construction
HMAC(K,m) = H( K xor a || H( K xor b || m) )
where a and b are certain bit strings.
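
The construction can be written out directly and compared against Python's standard hmac module. The sketch assumes SHA-256 as H and the standard HMAC choices from RFC 2104: a is the byte 0x5c repeated and b is the byte 0x36 repeated, each as long as the hash's input block, with the key zero-padded to that length. The function name my_hmac is made up for this example.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # input block size of SHA-256 in bytes

def my_hmac(key: bytes, message: bytes) -> bytes:
    """HMAC(K,m) = H( K xor a || H( K xor b || m) ) with H = SHA-256."""
    if len(key) > BLOCK_SIZE:                 # long keys are hashed first
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")      # then zero-padded to one block
    k_xor_a = bytes(k ^ 0x5C for k in key)    # a: outer pad
    k_xor_b = bytes(k ^ 0x36 for k in key)    # b: inner pad
    inner = hashlib.sha256(k_xor_b + message).digest()
    return hashlib.sha256(k_xor_a + inner).digest()

tag = my_hmac(b"key", b"message")
print(tag == hmac.new(b"key", b"message", hashlib.sha256).digest())  # True
```

In practice one would call the hmac module directly and compare tags with hmac.compare_digest rather than ==.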
The Secure Channel
[Figure: Alice and Bob connected by a channel, with Eve sitting on the channel between them]

Eve can insert, delete, and manipulate messages.

Alice and Bob want to transmit messages somehow.
The Secure Channel

We assume that Alice and Bob share a secret key
that is not known to anybody else.
Every time the channel is initialized, a new key K is
created. This prevents simple replay attacks.
Alice has a sequence of messages m_1, m_2, … that
are processed by the secure channel algorithms and
then sent to Bob.
Bob processes the received messages and ends up
with a sequence m_1’, m_2’, …
The Secure Channel
For the time being, our goal is that
1) Eve does not learn anything about the messages m_i
except for their timing and size, and
2) even when Eve attacks the channel, Bob will
receive a subsequence of the message sequence
sent by Alice, and he will know which subsequence
he has received, that is, he knows which packets
are missing.

Later we will improve on 1) by introducing
mechanisms that make it difficult for Eve to do
traffic analysis.
The Secure Channel
Die Gretchenfrage*
1) Should we encrypt first and then authenticate the
ciphertext, or
2) should we authenticate first and then encrypt both
message and MAC value?

*For the original question, see Goethe’s Faust.

The Secure Channel
One can make a case for either version.

Ferguson and Schneier argue that authenticating first is
advisable if one favors security over computing time.

If you find a compelling argument for either case, then
let me know.
Secure Channel: Authentication
Alice numbers her messages 1,2,3,…

Denote by i the message number, and by x_i an additional
portion of data that helps in the authentication. Let L(x_i)
denote the length of x_i in bytes.

Compute a_i = MAC( i || L(x_i) || x_i || m_i )
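
The role of the length field L(x_i) becomes clear in a small example: without it, the pairs (x_i, m_i) = ("ab", "c") and ("a", "bc") would produce the same MAC input. The sketch below assumes HMAC-SHA256 as the MAC and 32 bit big-endian encodings for i and L(x_i); these choices and the function name auth_tag are assumptions of this example, not part of the slide.

```python
import hashlib
import hmac
import struct

def auth_tag(key: bytes, i: int, x_i: bytes, m_i: bytes) -> bytes:
    """a_i = MAC( i || L(x_i) || x_i || m_i ) with HMAC-SHA256 as the MAC."""
    data = struct.pack(">I", i) + struct.pack(">I", len(x_i)) + x_i + m_i
    return hmac.new(key, data, hashlib.sha256).digest()

key = b"demo key"
# Same concatenation x_i || m_i, but different split: the tags differ
# because L(x_i) is part of the authenticated data.
print(auth_tag(key, 1, b"ab", b"c") == auth_tag(key, 1, b"a", b"bc"))  # False
```

Including the message number i means that a valid packet replayed under a different number will fail verification.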

Secure Channel: Encryption
We can use for instance AES in CTR mode. Recall that the counter
mode is defined by K_i := E(K, Nonce || i) and C_i := P_i xor K_i.

We can use for instance a plaintext block consisting of a 32
bit block number, a 32 bit message number, and 64 zero
bits. For a message with message number i, the key stream
can be defined by
k_0 k_1 k_2 … = E(K, 0 || i || 0) || E(K, 1 || i || 0) || … || E(K, 2^32-1 || i || 0)

The final message sent is i || (m_i xor K_i1) || (a_i xor K_i2)
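
The key stream generation can be sketched as follows. Since Python's standard library has no AES, the example assumes a stand-in for E(K, .) built from HMAC-SHA256 truncated to 16 bytes; counter mode only ever uses the forward direction of E. The counter block is the 32 bit block number, the 32 bit message number, and 64 zero bits. All function names are made up for this example.

```python
import hashlib
import hmac
import struct

def E(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES: maps a 16 byte block to 16 pseudorandom bytes."""
    assert len(block) == 16
    return hmac.new(key, block, hashlib.sha256).digest()[:16]

def key_stream(key: bytes, message_number: int, length: int) -> bytes:
    """First `length` bytes of E(K, 0 || i || 0) || E(K, 1 || i || 0) || ..."""
    stream = b""
    block_number = 0
    while len(stream) < length:
        # 32 bit block number || 32 bit message number || 64 zero bits
        counter_block = struct.pack(">IIQ", block_number, message_number, 0)
        stream += E(key, counter_block)
        block_number += 1
    return stream[:length]

def xor_with_key_stream(key: bytes, message_number: int, data: bytes) -> bytes:
    """Encrypts or decrypts: both are the same xor with the key stream."""
    ks = key_stream(key, message_number, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

key = b"demo key"
c = xor_with_key_stream(key, 7, b"attack at dawn, or maybe a bit later")
print(xor_with_key_stream(key, 7, c))  # recovers the plaintext
```

Because the message number is part of every counter block, two messages never share key stream as long as message numbers are not reused under the same key.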

References
Douglas R. Stinson, Cryptography: Theory and Practice,
Second Edition, Chapman & Hall/CRC, 2002.

Bruce Schneier, Applied Cryptography: Protocols,
Algorithms, and Source Code in C, Second Edition, John
Wiley & Sons, 1996.

William Stallings, Cryptography and Network Security:
Principles and Practice, Third Edition, Prentice Hall,
2003.

Ross Anderson, Security Engineering: A Guide to
Building Dependable Distributed Systems, Wiley, 2001.

Niels Ferguson and Bruce Schneier, Practical
Cryptography, Wiley, 2003.

Alfred J. Menezes, Paul C. van Oorschot and Scott A.
Vanstone, Handbook of Applied Cryptography, 5th
printing, CRC Press, 2001.
Amazingly, this book is completely available online at
http://www.cacr.math.uwaterloo.ca/hac/