An Overview of Cryptography *Gary C. Kessler <
kumquat@sover.net>*
May 1998
(18 November 2008)
A much shorter, edited version of this paper appears in the 1999 Edition of
*Handbook on Local Area Networks*, published by Auerbach in September 1998.
Since that time, this article has taken on a life of its own...
------------
------------------
CONTENTS
*1. INTRODUCTION* <http://www.garykessler.net/library/crypto.html#intro>
*2. THE PURPOSE OF
CRYPTOGRAPHY*<http://www.garykessler.net/library/crypto.html#purpose>
*3. TYPES OF CRYPTOGRAPHIC
ALGORITHMS*<http://www.garykessler.net/library/crypto.html#types>
3.1. Secret Key
Cryptography<http://www.garykessler.net/library/crypto.html#skc>
3.2. Public-Key
Cryptography<http://www.garykessler.net/library/crypto.html#pkc>
3.3. Hash Functions<http://www.garykessler.net/library/crypto.html#hash>
3.4. Why Three Encryption
Techniques?<http://www.garykessler.net/library/crypto.html#why3>
3.5. The Significance of Key
Length<http://www.garykessler.net/library/crypto.html#keylen>
*4. TRUST MODELS* <http://www.garykessler.net/library/crypto.html#trust>
4.1. PGP Web of Trust<http://www.garykessler.net/library/crypto.html#pgpweb>
4.2. Kerberos <http://www.garykessler.net/library/crypto.html#kerb>
4.3. Public Key Certificates and Certification
Authorities<http://www.garykessler.net/library/crypto.html#pkcca>
4.4. Summary<http://www.garykessler.net/library/crypto.html#trustsumm>
*5. CRYPTOGRAPHIC ALGORITHMS IN
ACTION*<http://www.garykessler.net/library/crypto.html#algorithms>
5.1. Password
Protection<http://www.garykessler.net/library/crypto.html#password>
5.2. Some of the Finer Details of Diffie-Hellman Key
Exchange<http://www.garykessler.net/library/crypto.html#dhmath>
5.3. Some of the Finer Details of RSA Public-Key
Cryptography<http://www.garykessler.net/library/crypto.html#rsamath>
5.4. Some of the Finer Details of DES, Breaking DES, and DES
Variants<http://www.garykessler.net/library/crypto.html#desmath>
5.5. Pretty Good Privacy
(PGP)<http://www.garykessler.net/library/crypto.html#pgp>
5.6. IP Security (IPsec)
Protocol<http://www.garykessler.net/library/crypto.html#ipsec>
5.7. The SSL "Family" of Secure Transaction Protocols for the World
Wide Web <http://www.garykessler.net/library/crypto.html#ssl>
5.8. Elliptic Curve
Cryptography<http://www.garykessler.net/library/crypto.html#ecc>
5.9. The Advanced Encryption Standard and
Rijndael<http://www.garykessler.net/library/crypto.html#aes>
5.10. Cisco's Stream
Cipher<http://www.garykessler.net/library/crypto.html#stream>
*6. CONCLUSION... OF
SORTS*<http://www.garykessler.net/library/crypto.html#conclusion>
*7. REFERENCES AND FURTHER
READING*<http://www.garykessler.net/library/crypto.html#refs>
*A. SOME MATH
NOTES*<http://www.garykessler.net/library/crypto.html#mathnotes>
A.1. The Exclusive-OR (XOR)
Function<http://www.garykessler.net/library/crypto.html#xor>
A.2. The *modulo*
Function<http://www.garykessler.net/library/crypto.html#modulo>
*ABOUT THE AUTHOR*<http://www.garykessler.net/library/crypto.html#author>
FIGURES
1. Three types of cryptography: secret-key, public key, and hash
function. <http://www.garykessler.net/library/crypto.html#fig01>
2. Sample application of the three cryptographic techniques for secure
communication. <http://www.garykessler.net/library/crypto.html#fig02>
3. Kerberos architecture.<http://www.garykessler.net/library/crypto.html#fig03>
4. GTE Cybertrust Global Root-issued certificate (Netscape
Navigator).<http://www.garykessler.net/library/crypto.html#fig04>
5. Sample entries in Unix/Linux password
files.<http://www.garykessler.net/library/crypto.html#fig05>
6. DES enciphering
algorithm.<http://www.garykessler.net/library/crypto.html#fig06>
7. A PGP signed
message.<http://www.garykessler.net/library/crypto.html#fig07>
8. A PGP encrypted
message.<http://www.garykessler.net/library/crypto.html#fig08>
9. The decrypted
message.<http://www.garykessler.net/library/crypto.html#fig09>
10. IPsec Authentication Header
format.<http://www.garykessler.net/library/crypto.html#fig10>
11. IPsec Encapsulating Security Payload
format.<http://www.garykessler.net/library/crypto.html#fig11>
12. IPsec tunnel and transport modes for
AH.<http://www.garykessler.net/library/crypto.html#fig12>
13. IPsec tunnel and transport modes for
ESP.<http://www.garykessler.net/library/crypto.html#fig13>
14. SSL v3 configuration screen (Netscape
Navigator).<http://www.garykessler.net/library/crypto.html#fig14>
15. SSL/TLS protocol
handshake.<http://www.garykessler.net/library/crypto.html#fig15>
16. Elliptic curve
addition.<http://www.garykessler.net/library/crypto.html#fig16>
17. AES pseudocode.<http://www.garykessler.net/library/crypto.html#fig17>
TABLES
1. Minimum Key Lengths for Symmetric
Ciphers.<http://www.garykessler.net/library/crypto.html#tab01>
2. Contents of an X.509 V3
Certificate.<http://www.garykessler.net/library/crypto.html#tab02>
3. Other Crypto Algorithms and Systems of
Note.<http://www.garykessler.net/library/crypto.html#tab03>
4. ECC and RSA Key
Comparison.<http://www.garykessler.net/library/crypto.html#tab04>
------------------------------
1. INTRODUCTION
Does increased security provide comfort to paranoid people? Or does security
provide some very basic protections that we are naive to believe that we
don't need? During this time when the Internet provides essential
communication between tens of millions of people and is being increasingly
used as a tool for commerce, security becomes a tremendously important issue
to deal with.
There are many aspects to security and many applications, ranging from
secure commerce and payments to private communications and protecting
passwords. One essential aspect for secure communications is that of
cryptography, which is the focus of this chapter. But it is important to
note that while cryptography is *necessary* for secure communications, it is
not by itself *sufficient*. The reader is advised, then, that the topics
covered in this chapter only describe the first of many steps necessary for
better security in any number of situations.
This paper has two major purposes. The first is to define some of the terms
and concepts behind basic cryptographic methods, and to offer a way to
compare the myriad cryptographic schemes in use today. The second is to
provide some real examples of cryptography in use today.
I would like to say at the outset that this paper is very focused on terms,
concepts, and schemes in *current* use and is not a treatise of the whole
field. No mention is made here about pre-computerized crypto schemes, the
difference between a substitution and transposition cipher, cryptanalysis,
or other history. Interested readers should check out some of the books in
the bibliography below for this detailed and interesting! background
information.
2. THE PURPOSE OF CRYPTOGRAPHY
Cryptography is the science of writing in secret code and is an ancient art;
the first documented use of cryptography in writing dates back to circa 1900
B.C. when an Egyptian scribe used non-standard hieroglyphs in an
inscription. Some experts argue that cryptography appeared spontaneously
sometime after writing was invented, with applications ranging from
diplomatic missives to war-time battle plans. It is no surprise, then, that
new forms of cryptography came soon after the widespread development of
computer communications. In data and telecommunications, cryptography is
necessary when communicating over any untrusted medium, which includes just
about *any* network, particularly the Internet.
Within the context of any application-to-application communication, there
are some specific security requirements, including:
- *Authentication:* The process of proving one's identity. (The primary
forms of host-to-host authentication on the Internet today are name-based or
address-based, both of which are notoriously weak.)
- *Privacy/confidentiality:* Ensuring that no one can read the message
except the intended receiver.
- *Integrity:* Assuring the receiver that the received message has not
been altered in any way from the original.
- *Non-repudiation:* A mechanism to prove that the sender really sent
this message.
Cryptography, then, not only protects data from theft or alteration, but
can also be used for user authentication. There are, in general, three types
of cryptographic schemes typically used to accomplish these goals: secret
key (or symmetric) cryptography, public-key (or asymmetric) cryptography,
and hash functions, each of which is described below. In all cases, the
initial unencrypted data is referred to as *plaintext*. It is encrypted into
*ciphertext*, which will in turn (usually) be decrypted into usable
plaintext.
In many of the descriptions below, two communicating parties will be
referred to as Alice and Bob; this is the common nomenclature in the crypto
field and literature to make it easier to identify the communicating
parties. If there is a third or fourth party to the communication, they will
be referred to as Carol and Dave. Mallory is a malicious party, Eve is an
eavesdropper, and Trent is a trusted third party.
3. TYPES OF CRYPTOGRAPHIC ALGORITHMS
There are several ways of classifying cryptographic algorithms. For purposes
of this paper, they will be categorized based on the number of keys that are
employed for encryption and decryption, and further defined by their
application and use. The three types of algorithms that will be discussed
are (Figure 1):
- Secret Key Cryptography (SKC): Uses a single key for both encryption
and decryption
- Public Key Cryptography (PKC): Uses one key for encryption and another
for decryption
- Hash Functions: Uses a mathematical transformation to irreversibly
"encrypt" information
FIGURE 1: Three types of cryptography: secret-key, public key, and hash
function.
3.1. Secret Key Cryptography
With *secret key cryptography*, a single key is used for both encryption and
decryption. As shown in Figure 1A, the sender uses the key (or some set of
rules) to encrypt the plaintext and sends the ciphertext to the receiver.
The receiver applies the same key (or ruleset) to decrypt the message and
recover the plaintext. Because a single key is used for both functions,
secret key cryptography is also called *symmetric encryption*.
With this form of cryptography, it is obvious that the key must be known to
both the sender and the receiver; that, in fact, is the secret. The biggest
difficulty with this approach, of course, is the distribution of the key.
Secret key cryptography schemes are generally categorized as being
either *stream
ciphers* or *block ciphers*. Stream ciphers operate on a single bit (byte or
computer word) at a time and implement some form of feedback mechanism so
that the key is constantly changing. A block cipher is so-called because the
scheme encrypts one block of data at a time using the same key on each
block. In general, the same plaintext block will always encrypt to the same
ciphertext when using the same key in a block cipher whereas the same
plaintext will encrypt to different ciphertext in a stream cipher.
Stream ciphers come in several flavors but two are worth mentioning
here. *Self-synchronizing
stream ciphers* calculate each bit in the keystream as a function of the
previous *n* bits in the keystream. It is termed "self-synchronizing"
because the decryption process can stay synchronized with the encryption
process merely by knowing how far into the *n*-bit keystream it is. One
problem is error propagation; a garbled bit in transmission will result in *
n* garbled bits at the receiving side. *Synchronous stream ciphers* generate
the keystream in a fashion independent of the message stream but by using
the same keystream generation function at sender and receiver. While stream
ciphers do not propagate transmission errors, they are, by their nature,
periodic so that the keystream will eventually repeat.
Block ciphers can operate in one of several modes; the following four are
the most important:
- *Electronic Codebook (ECB) mode* is the simplest, most obvious
application: the secret key is used to encrypt the plaintext block to form a
ciphertext block. Two identical plaintext blocks, then, will always generate
the same ciphertext block. Although this is the most common mode of block
ciphers, it is susceptible to a variety of brute-force attacks.
- *Cipher Block Chaining (CBC) mode* adds a feedback mechanism to the
encryption scheme. In CBC, the plaintext is exclusively-ORed (XORed) with
the previous ciphertext block prior to encryption. In this mode, two
identical blocks of plaintext never encrypt to the same ciphertext.
- *Cipher Feedback (CFB) mode* is a block cipher implementation as a
self-synchronizing stream cipher. CFB mode allows data to be encrypted in
units smaller than the block size, which might be useful in some
applications such as encrypting interactive terminal input. If we were using
1-byte CFB mode, for example, each incoming character is placed into a shift
register the same size as the block, encrypted, and the block transmitted.
At the receiving side, the ciphertext is decrypted and the extra bits in the
block (i.e., everything above and beyond the one byte) are discarded.
- *Output Feedback (OFB) mode* is a block cipher implementation
conceptually similar to a synchronous stream cipher. OFB prevents the same
plaintext block from generating the same ciphertext block by using an
internal feedback mechanism that is independent of both the plaintext and
ciphertext bitstreams.
A nice overview of these different modes can be found at
progressive-coding.com <http://www.progressive-coding.com/tutorial.php?id=3>
.
Secret key cryptography algorithms that are in use today include:
-
*Data Encryption Standard (DES):* The most common SKC scheme used today,
DES was designed by IBM in the 1970s and adopted by the National Bureau of
Standards (NBS) [now the National Institute for Standards and Technology
(NIST)] in 1977 for commercial and unclassified government applications. DES
is a block-cipher employing a 56-bit key that operates on 64-bit blocks. DES
has a complex set of rules and transformations that were designed
specifically to yield fast hardware implementations and slow software
implementations, although this latter point is becoming less significant
today since the speed of computer processors is several orders of magnitude
faster today than twenty years ago. IBM also proposed a 112-bit key for DES,
which was rejected at the time by the government; the use of 112-bit keys
was considered in the 1990s, however, conversion was never seriously
considered.
DES is defined in American National Standard X3.92 and three Federal
Information Processing Standards (FIPS):
- FIPS 46-3:
DES<http://csrc.nist.gov/publications/fips/fips46-3/fips46-3.pdf>
- FIPS 74: Guidelines for Implementing and Using the NBS Data
Encryption Standard <http://www.itl.nist.gov/div897/pubs/fip74.htm>
- FIPS 81: DES Modes of
Operation<http://www.itl.nist.gov/div897/pubs/fip81.htm>
Information about vulnerabilities of DES can be obtained from the Electronic
Frontier Foundation<http://www.eff.org/pub/Privacy/Crypto_misc/DES_Cracking>
.
Two important variants that strengthen DES are:
-
*Triple-DES (3DES):* A variant of DES that employs up to three 56-bit
keys and makes three encryption/decryption passes over the block; 3DES is
also described in FIPS
46-3<http://csrc.nist.gov/publications/fips/fips46-3/fips46-3.pdf>and
is the recommended replacement to DES.
-
*DESX <http://www.rsasecurity.com/rsalabs/node.asp?id=2232>:* A
variant devised by Ron Rivest. By combining 64 additional key bits to the
plaintext prior to encryption, effectively increases the keylength to 120
bits.
More detail about DES, 3DES, and DESX can be found below in Section
5.4<http://www.garykessler.net/library/crypto.html#desmath>
.
-
*Advanced Encryption Standard (AES):* In 1997, NIST initiated a very
public, 4-1/2 year process to develop a new secure cryptosystem for U.S.
government applications. The result, the Advanced Encryption
Standard<http://www.nist.gov/aes>,
became the official successor to DES in December 2001. AES uses an SKC
scheme called
Rijndael<http://www.esat.kuleuven.ac.be/%7Erijmen/rijndael/index.html>,
a block cipher designed by Belgian cryptographers Joan Daemen and Vincent
Rijmen. The algorithm can use a variable block length and key length; the
latest specification allowed any combination of keys lengths of 128, 192, or
256 bits and blocks of length 128, 192, or 256 bits. NIST initially selected
Rijndael in October 2000 and formal adoption as the AES standard came in
December 2001. FIPS PUB
197<http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf>describes
a 128-bit block cipher employing a 128-, 192-, or 256-bit key. The
AES process and Rijndael algorithm are described in more detail
below in Section
5.9 <http://www.garykessler.net/library/crypto.html#aes>.
-
*CAST-128/256:* CAST-128, described in Request for Comments (RFC)
2144<http://www.rfc-editor.org/rfc/rfc2144.txt>,
is a DES-like substitution-permutation crypto algorithm, employing a 128-bit
key operating on a 64-bit block.
CAST-256<http://www.entrust.com/resources/pdf/cast-256.pdf>(RFC
2612 <http://www.rfc-editor.org/rfc/rfc2612.txt>) is an extension of
CAST-128, using a 128-bit block size and a variable length (128, 160, 192,
224, or 256 bit) key. CAST is named for its developers, Carlisle Adams and
Stafford Tavares and is available internationally. CAST-256 was one of the
Round 1 algorithms in the AES process.
-
*International Data Encryption Algorithm
(IDEA)<http://home.ecn.ab.ca/%7Ejsavard/crypto/co0404.htm>
:* Secret-key cryptosystem written by Xuejia Lai and James Massey, in
1992 and patented by Ascom; a 64-bit SKC block cipher using a 128-bit key.
Also available internationally.
-
*Rivest Ciphers (*aka* Ron's Code):* Named for Ron Rivest, a series of
SKC algorithms.
-
*RC1:* Designed on paper but never implemented.
-
*RC2:* A 64-bit block cipher using variable-sized keys designed to
replace DES. It's code has not been made public although many
companies have
licensed RC2 for use in their products. Described in RFC
2268<http://www.rfc-editor.org/rfc/rfc2268.txt>
.
-
*RC3:* Found to be breakable during development.
-
*RC4 <http://ciphersaber.gurus.com/>:* A stream cipher using
variable-sized keys; it is widely used in commercial
cryptography products,
although it can only be exported using keys that are 40 bits or less in
length.
-
*RC5:* A block-cipher supporting a variety of block sizes, key sizes,
and number of encryption passes over the data. Described in RFC
2040<http://www.rfc-editor.org/rfc/rfc2040.txt>
.
-
*RC6 <http://www.rsasecurity.com/rsalabs/node.asp?id=2512>:* An
improvement over RC5, RC6 was one of the AES Round 2 algorithms.
-
*Blowfish <http://www.counterpane.com/blowfish.html>:* A symmetric 64-bit
block cipher invented by Bruce Schneier; optimized for 32-bit processors
with large data caches, it is significantly faster than DES on a
Pentium/PowerPC-class machine. Key lengths can vary from 32 to 448 bits in
length. Blowfish, available freely and intended as a substitute for DES or
IDEA, is in use in over 80 products.
-
*Twofish <http://www.counterpane.com/twofish.html>:* A 128-bit block
cipher using 128-, 192-, or 256-bit keys. Designed to be highly secure and
highly flexible, well-suited for large microprocessors, 8-bit smart card
microprocessors, and dedicated hardware. Designed by a team led by Bruce
Schneier and was one of the Round 2 algorithms in the AES process.
-
*Camellia <http://info.isl.ntt.co.jp/camellia/Publications/camellia.pdf>:
* A secret-key, block-cipher crypto algorithm developed jointly by Nippon
Telegraph and Telephone (NTT) Corp. and Mitsubishi Electric Corporation
(MEC) in 2000. Camellia has some characteristics in common with AES: a
128-bit block size, support for 128-, 192-, and 256-bit key lengths, and
suitability for both software and hardware implementations on common 32-bit
processors as well as 8-bit processors (e.g., smart cards, cryptographic
hardware, and embedded systems). Also described in RFC
3713<http://www.rfc-editor.org/rfc/rfc3713.txt>.
Camellia's application in IPsec is described in RFC
4312<http://www.rfc-editor.org/rfc/rfc4312.txt>
.
-
*MISTY1:* Developed at Mitsubishi Electric Corp., a block cipher using a
128-bit key and 64-bit blocks, and a variable number of rounds. Designed for
hardware and software implementations, and is resistant to differential and
linear cryptanalysis. Described in RFC
2994<http://www.rfc-editor.org/rfc/rfc2994.txt>
.
-
*Secure and Fast Encryption Routine
(SAFER)<http://fn2.freenet.edmonton.ab.ca/%7Ejsavard/co0403.html>
:* Secret-key crypto scheme designed for implementation in software.
Versions have been defined for 40-, 64-, and 128-bit keys.
-
*KASUMI <http://networking.champlain.edu/download/3G_KASUMI.pdf>:* A
block cipher using a 128-bit key that is part of the Third-Generation
Partnership Project (3gpp), formerly known as the Universal Mobile
Telecommunications System (UMTS). KASUMI is the intended confidentiality and
integrity algorithm for both message content and signaling data for emerging
mobile communications systems.
-
*SEED <http://www.kisa.or.kr/seed/seed_eng.html>:* A block cipher using
128-bit blocks and 128-bit keys. Developed by the Korea Information Security
Agency (KISA) and adopted as a national standard encryption algorithm in
South Korea. Also described in RFC
4269<http://www.rfc-editor.org/rfc/rfc4269.txt>
.
-
*Skipjack <http://csrc.nist.gov/CryptoToolkit/skipjack/skipjack.pdf>:*SKC
scheme proposed for Capstone. Although the details of the algorithm
were
never made public, Skipjack was a block cipher using an 80-bit key and 32
iteration cycles per 64-bit block.
3.2. Public-Key Cryptography
*Public-key cryptography* has been said to be the most significant new
development in cryptography in the last 300-400 years. Modern PKC was first
described publicly by Stanford University professor Martin Hellman and
graduate student Whitfield Diffie in 1976. Their paper described a two-key
crypto system in which two parties could engage in a secure communication
over a non-secure communications channel without having to share a secret
key.
PKC depends upon the existence of so-called *one-way functions*, or
mathematical functions that are easy to computer whereas their inverse
function is relatively difficult to compute. Let me give you two simple
examples:
1. *Multiplication vs. factorization:* Suppose I tell you that I have two
numbers, 9 and 16, and that I want to calculate the product; it should take
almost no time to calculate the product, 144. Suppose instead that I tell
you that I have a number, 144, and I need you tell me which pair of integers
I multiplied together to obtain that number. You will eventually come up
with the solution but whereas calculating the product took milliseconds,
factoring will take longer because you first need to find the 8 pair of
integer factors and then determine which one is the correct pair.
2. *Exponentiation vs. logarithms:* Suppose I tell you that I want to
take the number 3 to the 6th power; again, it is easy to calculate 36=729.
But if I tell you that I have the number 729 and want you to tell me the two
integers that I used, *x* and *y* so that logx 729 = y, it will take you
longer to find all possible solutions and select the pair that I used.
While the examples above are trivial, they do represent two of the
functional pairs that are used with PKC; namely, the ease of multiplication
and exponentiation versus the relative difficulty of factoring and
calculating logarithms, respectively. The mathematical "trick" in PKC is to
find a *trap door* in the one-way function so that the inverse calculation
becomes easy given knowledge of some item of information.
Generic PKC employs two keys that are mathematically related although
knowledge of one key does not allow someone to easily determine the other
key. One key is used to encrypt the plaintext and the other key is used to
decrypt the ciphertext. The important point here is that it *does not matter
which key is applied first*, but that both keys are required for the process
to work (Figure 1B). Because a pair of keys are required, this approach is
also called *asymmetric cryptography*.
In PKC, one of the keys is designated the *public key* and may be advertised
as widely as the owner wants. The other key is designated the *private
key*and is never revealed to another party. It is straight forward to
send
messages under this scheme. Suppose Alice wants to send Bob a message. Alice
encrypts some information using Bob's public key; Bob decrypts the
ciphertext using his private key. This method could be also used to prove
who sent a message; Alice, for example, could encrypt some plaintext with
her private key; when Bob decrypts using Alice's public key, he knows that
Alice sent the message and Alice cannot deny having sent the message (*
non-repudiation*).
Public-key cryptography algorithms that are in use today for key exchange or
digital signatures include:
-
*RSA:* The first, and still most common, PKC implementation, named for
the three MIT mathematicians who developed it Ronald Rivest, Adi Shamir,
and Leonard Adleman. RSA today is used in hundreds of software products and
can be used for key exchange, digital signatures, or encryption of small
blocks of data. RSA uses a variable size encryption block and a variable
size key. The key-pair is derived from a very large number, *n*, that is
the product of two prime numbers chosen according to special rules; these
primes may be 100 or more digits in length each, yielding an *n* with
roughly twice as many digits as the prime factors. The public key
information includes *n* and a derivative of one of the factors of *n*;
an attacker cannot determine the prime factors of *n* (and, therefore,
the private key) from this information alone and that is what makes the RSA
algorithm so secure. (Some descriptions of PKC erroneously state that RSA's
safety is due to the difficulty in *factoring* large prime numbers. In
fact, large prime numbers, like small prime numbers, only have two factors!)
The ability for computers to factor large numbers, and therefore attack
schemes such as RSA, is rapidly improving and systems today can find the
prime factors of numbers with more than 200 digits. Nevertheless, if a large
number is created from two prime factors that are roughly the same size,
there is no known factorization algorithm that will solve the problem in a
reasonable amount of time; a 2005 test to factor a 200-digit number took 1.5
years and over 50 years of compute time (see the Wikipedia article
on integer
factorization <http://en.wikipedia.org/wiki/Integer_factorization>.)
Regardless, one presumed protection of RSA is that users can easily increase
the key size to always stay ahead of the computer processing curve. As an
aside, the patent for RSA expired in September 2000 which does not appear to
have affected RSA's popularity one way or the other. A detailed example of
RSA is presented below in Section
5.3<http://www.garykessler.net/library/crypto.html#rsamath>
.
-
*Diffie-Hellman
<http://www.rsasecurity.com/rsalabs/faq/3-6-1.html>:*After the RSA
algorithm was published, Diffie and Hellman came up with their
own algorithm. D-H is used for secret-key key exchange only, and not for
authentication or digital signatures. More detail about Diffie-Hellman can
be found below in Section
5.2<http://www.garykessler.net/library/crypto.html#dhmath>
.
-
*Digital Signature Algorithm
(DSA)<http://www.nist.gov/public_affairs/releases/digsigst.htm>
:* The algorithm specified in NIST's Digital Signature Standard (DSS),
provides digital signature capability for the authentication of messages.
-
*ElGamal <http://www.iusmentis.com/technology/encryption/elgamal/>:*Designed
by Taher Elgamal, a PKC system similar to Diffie-Hellman and used
for key exchange.
-
*Elliptic Curve Cryptography (ECC):* A PKC algorithm based upon elliptic
curves. ECC can offer levels of security with small keys comparable to RSA
and other PKC methods. It was designed for devices with limited compute
power and/or memory, such as smartcards and PDAs. More detail about ECC can
be found below in Section
5.8<http://www.garykessler.net/library/crypto.html#ecc>.
Other references include "The Importance of
ECC"<http://www.certicom.com/index.php?action=res,ecc_intro_home>Web
page and the "Online
Elliptic Curve Cryptography
Tutorial"<http://www.certicom.com/index.php?action=ecc_tutorial,home>,
both from Certicom.
-
*Public-Key Cryptography Standards
(PKCS)<http://www.rsasecurity.com/rsalabs/node.asp?id=2124>
:* A set of interoperable standards and guidelines for public-key
cryptography, designed by RSA Data Security Inc.
- PKCS #1 <http://www.rsasecurity.com/rsalabs/node.asp?id=2125>: RSA
Cryptography Standard (Also RFC
3447<http://www.rfc-editor.org/rfc/rfc3447.txt>
)
- PKCS #2: *Incorporated into PKCS #1.*
- PKCS #3 <http://www.rsasecurity.com/rsalabs/node.asp?id=2126>:
Diffie-Hellman Key-Agreement Standard
- PKCS #4: *Incorporated into PKCS #1.*
- PKCS #5 <http://www.rsasecurity.com/rsalabs/node.asp?id=2127>:
Password-Based Cryptography Standard (PKCS #5 V2.0 is also RFC
2898<http://www.rfc-editor.org/rfc/rfc2898.txt>
)
- PKCS #6 <http://www.rsasecurity.com/rsalabs/node.asp?id=2128>:
Extended-Certificate Syntax Standard (being phased out in favor
of X.509v3)
- PKCS #7 <http://www.rsasecurity.com/rsalabs/node.asp?id=2129>:
Cryptographic Message Syntax Standard (Also RFC
2315<http://www.rfc-editor.org/rfc/rfc2315.txt>
)
- PKCS #8 <http://www.rsasecurity.com/rsalabs/node.asp?id=2130>:
Private-Key Information Syntax Standard (Also RFC
5208<http://www.rfc-editor.org/rfc/rfc5208.txt>
)
- PKCS #9 <http://www.rsasecurity.com/rsalabs/node.asp?id=2131>:
Selected Attribute Types (Also RFC
2985<http://www.rfc-editor.org/rfc/rfc2985.txt>
)
- PKCS #10 <http://www.rsasecurity.com/rsalabs/node.asp?id=2132>:
Certification Request Syntax Standard (Also RFC
2986<http://www.rfc-editor.org/rfc/rfc2986.txt>
)
- PKCS #11 <http://www.rsasecurity.com/rsalabs/node.asp?id=2133>:
Cryptographic Token Interface Standard
- PKCS #12 <http://www.rsasecurity.com/rsalabs/node.asp?id=2138>:
Personal Information Exchange Syntax Standard
- PKCS #13 <http://www.rsasecurity.com/rsalabs/node.asp?id=2139>:
Elliptic Curve Cryptography Standard
- PKCS #14: *Pseudorandom Number Generation Standard is no longer
available*
- PKCS #15 <http://www.rsasecurity.com/rsalabs/node.asp?id=2141>:
Cryptographic Token Information Format Standard
-
*Cramer-Shoup<http://www.zurich.ibm.com/Technology/Security/publications/1998/CS.pdf>
:* A public-key cryptosystem proposed by R. Cramer and V. Shoup of IBM in
1998.
-
*Key Exchange Algorithm
(KEA)<http://csrc.nist.gov/CryptoToolkit/skipjack/skipjack.pdf>
:* A variation on Diffie-Hellman; proposed as the key exchange method for
Capstone.
-
*LUC <http://www.kisa.or.kr/technology/sub1/LUC.htm>:* A public-key
cryptosystem designed by P.J. Smith and based on Lucas sequences. Can be
used for encryption and signatures, using integer factoring.
For additional information on PKC algorithms, see "Public-Key
Encryption"<http://networking.champlain.edu/download/hac_chap08.pdf>,
Chapter 8 in *Handbook of Applied Cryptography*, by A. Menezes, P. van
Oorschot, and S. Vanstone (CRC Press, 1996).
------------------------------
*A digression: Who invented PKC?* I tried to be careful in the first
paragraph of this section to state that Diffie and Hellman "first described
publicly" a PKC scheme. Although I have categorized PKC as a two-key system,
that has been merely for convenience; the real criteria for a PKC scheme is
that it allows two parties to exchange a secret even though the
communication with the shared secret might be overheard. There seems to be
no question that Diffie and Hellman were first to publish; their method is
described in the classic paper, "New Directions in Cryptography," published
in the November 1976 issue of *IEEE Transactions on Information Theory*. As
shown below, Diffie-Hellman uses the idea that finding logarithms is
relatively harder than exponentiation. And, indeed, it is the precursor to
modern PKC which does employ two keys. Rivest, Shamir, and Adleman described
an implementation that extended this idea in their paper "A Method for
Obtaining Digital Signatures and Public-Key Cryptosystems," published in the
February 1978 issue of the *Communications of the ACM (CACM)*. Their method,
of course, is based upon the relative ease of finding the product of two
large prime numbers compared to finding the prime factors of a large number.
Some sources, though, credit Ralph Merkle with first describing a system
that allows two parties to share a secret although it was not a two-key
system, per se. A *Merkle Puzzle* works where Alice creates a large number
of encrypted keys, sends them all to Bob so that Bob chooses one at random
and then lets Alice know which he has selected. An eavesdropper will see all
of the keys but can't learn which key Bob has selected (because he has
encrypted the response with the chosen key). In this case, Eve's effort to
break in is the square of the effort of Bob to choose a key. While this
difference may be small it is often sufficient. Merkle apparently took a
computer science course at UC Berkeley in 1974 and described his method, but
had difficulty making people understand it; frustrated, he dropped the
course. Meanwhile, he submitted the paper "Secure Communication Over
Insecure Channels" which was published in the *CACM* in April 1978; Rivest
et al.'s paper even makes reference to it. Merkle's method certainly wasn't
published first, but did he have the idea first?
An interesting question, maybe, but who really knows? For some time, it was
a quiet secret that a team at the UK's Government Communications
Headquarters (GCHQ) had first developed PKC in the early 1970s. Because of
the nature of the work, GCHQ kept the original memos classified. In 1997,
however, the GCHQ changed their posture when they realized that there was
nothing to gain by continued silence. Documents show that a GCHQ
mathematician named James Ellis started research into the key distribution
problem in 1969 and that by 1975, Ellis, Clifford Cocks, and Malcolm
Williamson had worked out all of the fundamental details of PKC, yet
couldn't talk about their work. (They were, of course, barred from
challenging the RSA patent!) After more than 20 years, Ellis, Cocks, and
Williamson have begun to get their due credit.
And the National Security Agency (NSA) claims to have knowledge of this type
of algorithm as early as 1966 but there is no supporting documentation...
yet. So this really was a digression...
------------------------------
3.3. Hash Functions
*Hash functions*, also called *message digests* and *one-way encryption*,
are algorithms that, in some sense, use no key (Figure 1C). Instead, a
fixed-length hash value is computed based upon the plaintext that makes it
impossible for either the contents or length of the plaintext to be
recovered. Hash algorithms are typically used to provide a *digital
fingerprint* of a file's contents, often used to ensure that the file has
not been altered by an intruder or virus. Hash functions are also commonly
employed by many operating systems to encrypt passwords. Hash functions,
then, provide a measure of the integrity of a file.
Hash algorithms that are in common use today include:
-
*Message Digest (MD) algorithms:* A series of byte-oriented algorithms
that produce a 128-bit hash value from an arbitrary-length message.
-
*MD2 (RFC 1319 <http://www.rfc-editor.org/rfc/rfc1319.txt>):* Designed
for systems with limited memory, such as smart cards.
-
*MD4 (RFC 1320
<http://www.rfc-editor.org/rfc/rfc1320.txt>):*Developed by Rivest,
similar to MD2 but designed specifically for fast
processing in software.
-
*MD5 (RFC 1321 <http://www.rfc-editor.org/rfc/rfc1321.txt>):* Also
developed by Rivest after potential weaknesses were reported in MD4; this
scheme is similar to MD4 but is slower because more manipulation
is made to
the original data. MD5 has been implemented in a large number of products
although several weaknesses in the algorithm were demonstrated by German
cryptographer Hans Dobbertin in 1996.
-
*Secure Hash Algorithm (SHA):* Algorithm for NIST's Secure Hash Standard
(SHS). SHA-1 produces a 160-bit hash value and was originally published as
FIPS 180-1 and RFC 3174 <http://www.rfc-editor.org/rfc/rfc3174.txt>. FIPS
180-2<http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf>describes
five algorithms in the SHS: SHA-1 plus SHA-224, SHA-256, SHA-384,
and SHA-512 which can produce hash values that are 224, 256, 384, or 512
bits in length, respectively. SHA-224, -256, -384, and -52 are also
described in RFC 4634 <http://www.rfc-editor.org/rfc/rfc4634.txt>.
-
*RIPEMD <http://www.esat.kuleuven.ac.be/%7Ebosselae/ripemd160.html>:* A
series of message digests that initially came from the RIPE (RACE Integrity
Primitives Evaluation) project.
RIPEMD-160<http://www.esat.kuleuven.ac.be/%7Ecosicart/pdf/AB-9601/AB-9601.pdf>was
designed by Hans Dobbertin, Antoon Bosselaers, and Bart Preneel, and
optimized for 32-bit processors to replace the then-current 128-bit hash
functions. Other versions include RIPEMD-256, RIPEMD-320, and RIPEMD-128.
-
*HAVAL (HAsh of VAriable
Length)<http://www.calyptix.com/technology/haval.php>
:* Designed by Y. Zheng, J. Pieprzyk and J. Seberry, a hash algorithm
with many levels of security. HAVAL can create hash values that are 128,
160, 192, 224, or 256 bits in length.
-
* Whirlpool<http://paginas.terra.com.br/informatica/paulobarreto/WhirlpoolPage.html>
:* A relatively new hash function, designed by V. Rijmen and P.S.L.M.
Barreto. Whirlpool operates on messages less than 2256 bits in length,
and produces a message digest of 512 bits. The design of this has function
is very different than that of MD5 and SHA-1, making it immune to the same
attacks as on those hashes (see below).
-
*Tiger <http://www.cs.technion.ac.il/%7Ebiham/Reports/Tiger/>:* Designed
by Ross Anderson and Eli Biham, Tiger is designed to be secure, run
efficiently on 64-bit processors, and easily replace MD4, MD5, SHA and SHA-1
in other applications. Tiger/192 produces a 192-bit output and is compatible
with 64-bit architectures; Tiger/128 and Tiger/160 produce the first 128 and
160 bits, respectively, to provide compatibility with the other hash
functions mentioned above.
Hash functions are sometimes misunderstood and some sources claim that no
two files can have the same hash value. This is, in fact, not correct.
Consider a hash function that provides a 128-bit hash value. There are,
obviously, 2128 possible hash values. But there are a lot more than 2128 *
possible* files. Therefore, there have to be multiple files in fact, there
have to be an infinite number of files! that can have the same 128-bit
hash value.
The difficulty is *finding* two files with the same hash! What is, indeed,
very hard to do is to try to create a file that has a given hash value so as
to force a hash value collision which is the reason that hash functions
are used extensively for information security and computer forensics
applications. Alas, researchers in 2004 found that *practical* collision
attacks could be launched on MD5, SHA-1, and other hash algorithms. Readers
interested in this problem should read the following:
- Burr, W. (2006, Match/April). Cryptographic hash standards: Where do we
go from here? *IEEE Security & Privacy*, *4*(2), 88-91.
- Gutman, P., Naccache, D., & Palmer, C.C. (2005, May/June). When hashes
collide. *IEEE Security & Privacy*, *3*(3), 68-71.
- Klima, V. (March 2005) "Finding MD5 Collisions - a Toy For a
Notebook."<http://cryptography.hyperlink.cz/md5/MD5_collisions.pdf>
- Thompson, E. (2005, February). MD5 collisions and the impact on
computer forensics. *Digital Investigation*, *2*(1), 36-40.
- Wang, X., Feng, D., Lai, X., & Yu, H. (August 2004). "Collisions for
Hash Functions MD4, MD5, HAVAL-128 and
RIPEMD."<http://eprint.iacr.org/2004/199.pdf>
- Wang, X., Yin, Y.L., & Yu, H. (February 2005). "Collision Search
Attacks on SHA1." <http://theory.csail.mit.edu/%7Eyiqun/shanote.pdf>
An excellent overview of the situation with hash collisions can be found in
RFC 4270 <http://www.rfc-editor.org/rfc/rfc4270.txt> (by P. Hoffman and B.
Schneier, November 2005). And for additional information on hash functions,
see David Hopwood's MessageDigest
Algorithms<http://www.users.zetnet.co.uk/hopwood/crypto/scan/md.html>page.
At this time, there is no obvious successor to MD5 and SHA-1 that could be
put into use quickly; there are so many products using these hash functions
that it could take many years to flush out all use of 128- and 160-bit
hashes. That said, NIST announced in 2007 their Cryptographic Hash Algorithm
Competition <http://csrc.nist.gov/groups/ST/hash/sha-3/index.html> to find
the next-generation secure hashing method. Dubbed SHA-3, this new scheme
will augment FIPS 180-2. A list of submissions can be foud at The
SHA-3 Zoo<http://ehash.iaik.tugraz.at/wiki/The_SHA-3_Zoo%22>.
The SHA-3 standard may not be available until 2011 or 2012.
Certain extensions of hash functions are used for a variety of information
security and digital forensics applications, such as:
- *Hash libraries* are sets of hash values corresponding to known files.
A hash library of known good files, for example, might be a set of files
known to be a part of an operating system, while a hash library of known bad
files might be of a set of known child pornographic images.
- *Rolling hashes* refer to a set of hash values that are computed based
upon a fixed-length "sliding window" through the input. As an example, a
hash value might be computed on bytes 1-10 of a file, then on bytes 2-11,
3-12, 4-13, etc.
- *Fuzzy hashes* are an area of intense research and represent hash
values that represent two inputs that are similar. Fuzzy hashes are used to
detect documents, images, or other files that are close to each other with
respect to content.
3.4. Why Three Encryption Techniques?
So, why are there so many different types of cryptographic schemes? Why
can't we do everything we need with just one?
The answer is that each scheme is optimized for some specific
application(s). Hash functions, for example, are well-suited for ensuring
data integrity because any change made to the contents of a message will
result in the receiver calculating a different hash value than the one
placed in the transmission by the sender. Since it is highly unlikely that
two different messages will yield the same hash value, data integrity is
ensured to a high degree of confidence.
Secret key cryptography, on the other hand, is ideally suited to encrypting
messages. The sender can generate a *session key* on a per-message basis to
encrypt the message; the receiver, of course, needs the same session key to
decrypt the message.
Key exchange, of course, is a key application of public-key cryptography (no
pun intended). Asymmetric schemes can also be used for non-repudiation; if
the receiver can obtain the session key encrypted with the sender's private
key, then only this sender could have sent the message. Public-key
cryptography could, theoretically, also be used to encrypt messages although
this is rarely done because secret-key cryptography operates about 1000
times faster than public-key cryptography.
FIGURE 2: Sample application of the three cryptographic techniques for
secure communication.
Figure 2 puts all of this together and shows how a *hybrid
cryptographic*scheme combines all of these functions to form a secure
transmission
comprising *digital signature* and *digital envelope*. In this example, the
sender of the message is Alice and the receiver is Bob.
A digital envelope comprises an encrypted message and an encrypted session
key. Alice uses secret key cryptography to encrypt her message using
the *session
key*, which she generates at random with each session. Alice then encrypts
the session key using Bob's public key. The encrypted message and encrypted
session key together form the digital envelope. Upon receipt, Bob recovers
the session secret key using his private key and then decrypts the encrypted
message.
The digital signature is formed in two steps. First, Alice computes the hash
value of her message; next, she encrypts the hash value with her private
key. Upon receipt of the digital signature, Bob recovers the hash value
calculated by Alice by decrypting the digital signature with Alice's public
key. Bob can then apply the hash function to Alice's original message, which
he has already decrypted (see previous paragraph). If the resultant hash
value is not the same as the value supplied by Alice, then Bob knows that
the message has been altered; if the hash values are the same, Bob should
believe that the message he received is identical to the one that Alice
sent.
This scheme also provides nonrepudiation since it proves that Alice sent the
message; if the hash value recovered by Bob using Alice's public key proves
that the message has not been altered, then only Alice could have created
the digital signature. Bob also has proof that he is the intended receiver;
if he can correctly decrypt the message, then he must have correctly
decrypted the session key meaning that his is the correct private key.
3.5. The Significance of Key Length
In a recent article in the industry literature (circa 9/98), a writer made
the claim that 56-bit keys do not provide as sufficient protection for DES
today as they did in 1975 because computers are 1000 times faster today than
in 1975. Therefore, the writer went on, we should be using 56,000-bit keys
today instead of 56-bit keys to provide adequate protection. The conclusion
was then drawn that because 56,000-bit keys are infeasible (*true*), we
should accept the fact that we have to live with weak cryptography (*false!*).
The major error here is that the writer did not take into account that the
number of possible key values double whenever a single bit is added to the
key length; thus, a 57-bit key has twice as many values as a 56-bit key
(because 257 is two times 256). In fact, a 66-bit key would have 1024 times
the possible values as a 56-bit key.
But this does bring up the issue, what is the precise significance of key
length as it affects the level of protection?
In cryptography, size does matter. The larger the key, the harder it is to
crack a block of encrypted data. The reason that large keys offer more
protection is almost obvious; computers have made it easier to attack
ciphertext by using brute force methods rather than by attacking the
mathematics (which are generally well-known anyway). With a brute force
attack, the attacker merely generates every possible key and applies it to
the ciphertext. Any resulting plaintext that makes sense offers a candidate
for a legitimate key. This was the basis, of course, of the EFF's attack on
DES.
Until the mid-1990s or so, brute force attacks were beyond the capabilities
of computers that were within the budget of the attacker community. Today,
however, significant compute power is commonly available and accessible.
General purpose computers such as PCs are already being used for brute force
attacks. For serious attackers with money to spend, such as some large
companies or governments, Field Programmable Gate Array (FPGA) or
Application-Specific Integrated Circuits (ASIC) technology offers the
ability to build specialized chips that can provide even faster and cheaper
solutions than a PC. Consider that an AT&T ORCA chip (FPGA) costs $200 and
can test 30 million DES keys per second, while a $10 ASIC chip can test 200
million DES keys per second (compared to a PC which might be able to test
40,000 keys per second).
The table below shows what DES key sizes are needed to protect data from
attackers with different time and financial resources. This information is
not merely academic; one of the basic tenets of any security system is to
have an idea of *what* you are protecting and *from who* are you protecting
it! The table clearly shows that a 40-bit key is essentially worthless today
against even the most unsophisticated attacker. On the other hand, 56-bit
keys are fairly strong unless you might be subject to some pretty serious
corporate or government espionage. But note that even 56-bit keys are
declining in their value and that the times in the table (1995 data) are
worst cases.
*TABLE 1. Minimum Key Lengths for Symmetric Ciphers.* Type of
Attacker Budget
Tool Time and Cost
Per Key Recovered Key Length Needed
For Protection
In Late-1995 40 bits 56 bits Pedestrian Hacker Tiny Scavanged
computer
time 1 week Infeasible 45 $400 FPGA 5 hours
($0.08) 38 years
($5,000) 50 Small Business $10,000 FPGA 12 minutes
($0.08) 18 months
($5,000) 55 Corporate Department $300K FPGA 24 seconds
($0.08) 19 days
($5,000) 60 ASIC 0.18 seconds
($0.001) 3 hours
($38) Big Company $10M FPGA 7 seconds
($0.08) 13 hours
($5,000) 70 ASIC 0.005 seconds
($0.001) 6 minutes
($38) Intelligence Agency $300M ASIC 0.0002 seconds
($0.001) 12 seconds
($38) 75
So, how big is big enough? DES, invented in 1975, is still in use today,
nearly 25 years later. If we take that to be a design criteria (i.e., a
20-plus year lifetime) and we believe Moore's Law ("computing power doubles
every 18 months"), then a key size extension of 14 bits (i.e., a factor of
more than 16,000) should be adequate. The 1975 DES proposal suggested 56-bit
keys; by 1995, a 70-bit key would have been required to offer equal
protection and an 85-bit key will be necessary by 2015.
The discussion above suggests that a 128- or 256-bit key for SKC will
suffice for some time because that key length keeps us ahead of the brute
force capabilities of the attackers. While a large key is good, a huge key
may not always be better. That is, many public-key cryptosystems use 1024-
or 2048-bit keys; expanding the key to 4096 bits probably doesn't add any
protection at this time but it does add significantly to processing time.
The most effective large-number factoring methods today use a mathematical
Number Field Sieve to find a certain number of relationships and then uses a
matrix operation to solve a linear equation to produce the two prime
factors. The sieve step actually involves a large number of operations of
operations that can be performed in parallel; solving the linear equation,
however, requires a supercomputer. Indeed, finding the solution to the
RSA-140 challenge in February 1999 factoring a 140-digit (465-bit) prime
number required 200 computers across the Internet about 4 weeks for the
first step and a Cray computer 100 hours and 810 MB of memory to do the
second step.
In early 1999, Shamir (of RSA fame) described a new machine that could
increase factorization speed by 2-3 orders of magnitude. Although no
detailed plans were provided nor is one known to have been built, the
concepts of TWINKLE (The Weizmann Institute Key Locating
Engine)<http://jya.com/twinkle.eps>could result in a specialized piece
of hardware that would cost about $5000
and have the processing power of 100-1000 PCs. There still appear to be many
engineering details that have to be worked out before such a machine could
be built. Furthermore, the hardware improves the sieve step only; the matrix
operation is not optimized at all by this design and the complexity of this
step grows rapidly with key length, both in terms of processing time and
memory requirements. Nevertheless, this plan conceptually puts 512-bit keys
within reach of being factored. Although most PKC schemes allow keys that
are 1024 bits and longer, Shamir claims that 512-bit RSA keys "protect 95%
of today's E-commerce on the Internet." (See Bruce Schneier's Crypto-Gram
(May 15, 1999) <http://www.counterpane.com/crypto-gram-9905.html> for more
information, as well as the comments from RSA
Labs<http://www.rsasecurity.com/rsalabs/html/twinkle.html>
.)
It is also interesting to note that while cryptography is good and strong
cryptography is better, long keys may disrupt the nature of the randomness
of data files. Shamir and van Someren ("Playing hide and seek with stored
keys" <http://www.ncipher.com/products/files/papers/anguilla/keyhide2.pdf>)
have noted that a new generation of viruses can be written that will find
files encrypted with long keys, making them easier to find by intruders and,
therefore, more prone to attack.
Finally, U.S. government policy has tightly controlled the export of crypto
products since World War II. Until recently, export outside of North America
of cryptographic products using keys greater than 40 bits in length was
prohibited, which made those products essentially worthless in the
marketplace, particularly for electronic commerce. More recently, the U.S.
Commerce Department relaxed the regulations, allowing the general export of
56-bit SKC and 1024-bit PKC products (certain sectors, such as health care
and financial, allow the export of products with even larger keys). The
Commerce Department's Bureau of Export Administration maintains a Commercial
Encryption Export Controls
<http://www.bxa.doc.gov/Encryption/Default.htm>web page with more
information. The potential impact of this policy on U.S.
businesses is well beyond the scope of this paper.
Much of the discussion above, including the table, are based on the
paper "Minimal
Key Lengths for Symmetric Ciphers to Provide Adequate Commercial
Security"<http://www.counterpane.com/keylength.html>by M. Blaze, W.
Diffie, R.L. Rivest, B. Schneier, T. Shimomura, E. Thompson,
and M. Wiener.
On a related topic, public key crypto schemes can be used for several
purposes, including key exchange, digital signatures, authentication, and
more. In those PKC systems used for SKC key exchange, the PKC key lengths
are chosen so to be resistant to some selected level of attack. The length
of the secret keys exchanged via that system have to have at least the same
level of attack resistance. Thus, the three parameters of such a system
system strength, secret key strength, and public key strength must be
matched. This topic is explored in more detail in *Determining Strengths For
Public Keys Used For Exchanging Symmetric Keys* (RFC
3766<ftp://ftp.rfc-editor.org/in-notes/rfc3766.txt>
).
4. TRUST MODELS
Secure use of cryptography requires trust. While secret key cryptography can
ensure message confidentiality and hash codes can ensure integrity, none of
this works without trust. In SKC, Alice and Bob had to share a secret key.
PKC solved the secret distribution problem, but how does Alice really know
that Bob is who he says he is? Just because Bob has a public and private
key, and purports to be "Bob," how does Alice know that a malicious person
(Mallory) is not pretending to be Bob?
There are a number of *trust models* employed by various cryptographic
schemes. This section will explore three of them:
- The web of trust employed by Pretty Good Privacy (PGP) users, who hold
their own set of trusted public keys.
- Kerberos, a secret key distribution scheme using a trusted third party.
- Certificates, which allow a set of trusted third parties to
authenticate each other and, by implication, each other's users.
Each of these trust models differs in complexity, general applicability,
scope, and scalability.
4.1. PGP Web of Trust
Pretty Good Privacy (described more below in Section
5.5<http://www.garykessler.net/library/crypto.html#pgp>)
is a widely used private e-mail scheme based on public key methods. A PGP
user maintains a local keyring of all their known and trusted public keys.
The user makes their own determination about the trustworthiness of a key
using what is called a "web of trust."
If Alice needs Bob's public key, Alice can ask Bob for it in another e-mail
or, in many cases, download the public key from an advertised server; this
server might a well-known PGP key repository or a site that Bob maintains
himself. In fact, Bob's public key might be stored or listed in many places.
(The author's public key, for example, can be found at *
http://www.garykessler.net/kumquat_pubkey.html*<http://www.garykessler.net/kumquat_pubkey.html>.)
Alice is prepared to believe that Bob's public key, as stored at these
locations, is valid.
Suppose Carol claims to hold Bob's public key and offers to give the key to
Alice. How does Alice know that Carol's version of Bob's key is valid or if
Carol is actually giving Alice a key that will allow Mallory access to
messages? The answer is, "It depends." If Alice trusts Carol and Carol says
that she thinks that her version of Bob's key is valid, then Alice *may*
at *her* option trust that key. And trust is not necessarily transitive;
if Dave has a copy of Bob's key and Carol trusts Dave, it does not
necessarily follow that Alice trusts Dave even if she does trust Carol.
The point here is that who Alice trusts and how she makes that determination
is strictly up to Alice. PGP makes no statement and has no protocol about
how one user determines whether they trust another user or not. In any case,
encryption and signatures based on public keys can only be used when the
appropriate public key is on the user's keyring.
4.2. Kerberos
Kerberos is a commonly used authentication scheme on the Internet. Developed
by MIT's Project Athena, Kerberos is named for the three-headed dog who,
according to Greek mythology, guards the entrance of Hades (rather than the
exit, for some reason!).
Kerberos employs a client/server architecture and provides user-to-server
authentication rather than host-to-host authentication. In this model,
security and authentication will be based on secret key technology where
every host on the network has its own secret key. It would clearly be
unmanageable if every host had to know the keys of all other hosts so a
secure, trusted host somewhere on the network, known as a Key Distribution
Center (KDC), knows the keys for all of the hosts (or at least some of the
hosts within a portion of the network, called a *realm*). In this way, when
a new node is brought online, only the KDC and the new node need to be
configured with the node's key; keys can be distributed physically or by
some other secure means.
FIGURE 3: Kerberos architecture.
The Kerberos Server/KDC has two main functions (Figure 3), known as the
Authentication Server (AS) and Ticket-Granting Server (TGS). The steps in
establishing an authenticated session between an application client and the
application server are:
1. The Kerberos client software establishes a connection with the
Kerberos server's AS function. The AS first authenticates that the client is
who it purports to be. The AS then provides the client with a secret key for
this login session (the *TGS session key*) and a ticket-granting ticket
(TGT), which gives the client permission to talk to the TGS. The ticket has
a finite lifetime so that the authentication process is repeated
periodically.
2. The client now communicates with the TGS to obtain the Application
Server's key so that it (the client) can establish a connection to the
service it wants. The client supplies the TGS with the TGS session key and
TGT; the TGS responds with an application session key (ASK) and an encrypted
form of the Application Server's secret key; this secret key is
*never*sent on the network in any other form.
3. The client has now authenticated itself *and* can prove its identity
to the Application Server by supplying the Kerberos ticket, application
session key, and encrypted Application Server secret key. The Application
Server responds with similarly encrypted information to authenticate itself
to the client. At this point, the client can initiate the intended service
requests (e.g., Telnet, FTP, HTTP, or e-commerce transaction session
establishment).
The current shipping version of this protocol is Kerberos V5 (described in RFC
1510 <http://www.rfc-editor.org/rfc/rfc1510.txt>), although Kerberos V4
still exists and is seeing some use. While the details of their operation,
functional capabilities, and message formats are different, the conceptual
overview above pretty much holds for both. One primary difference is that
Kerberos V4 uses only DES to generate keys and encrypt messages, while V5
allows other schemes to be employed (although DES is still the most widely
algorithm used).
4.3. Public Key Certificates and Certificate Authorities
*Certificates* and *Certificate Authorities (CA)* are necessary for
widespread use of cryptography for e-commerce applications. While a
combination of secret and public key cryptography can solve the business
issues discussed above, crypto cannot alone address the trust issues that
must exist between a customer and vendor in the very fluid, very dynamic
e-commerce relationship. How, for example, does one site obtain another
party's public key? How does a recipient determine if a public key really
belongs to the sender? How does the recipient know that the sender is using
their public key for a legitimate purpose for which they are authorized?
When does a public key expire? How can a key be revoked in case of
compromise or loss?
The basic concept of a certificate is one that is familiar to all of us. A
driver's license, credit card, or SCUBA certification, for example, identify
us to others, indicate something that we are authorized to do, have an
expiration date, and identify the authority that granted the certificate.
As complicated as this may sound, it really isn't! Consider driver's
licenses. I have one issued by the State of Vermont. The license establishes
my identity, indicates the type of vehicles that I can operate and the fact
that I must wear corrective lenses while doing so, identifies the issuing
authority, and notes that I am an organ donor. When I drive outside of
Vermont, the other jurisdictions throughout the U.S. recognize the authority
of Vermont to issue this "certificate" and they trust the information it
contains. Now, when I leave the U.S., everything changes. When I am in
Canada and many other countries, they will accept not the Vermont license,
per se, but *any* license issued in the U.S.; some other countries may not
recognize the Vermont driver's license as sufficient bona fides that I can
drive. This analogy represents the certificate chain, where even
certificates carry certificates.
For purposes of electronic transactions, certificates are digital documents.
The specific functions of the certificate include:
- *Establish identity:* Associate, or *bind*, a public key to an
individual, organization, corporate position, or other entity.
- *Assign authority:* Establish what actions the holder may or may not
take based upon this certificate.
- *Secure confidential information* (e.g., encrypting the session's
symmetric key for data confidentiality).
Typically, a certificate contains a public key, a name, an expiration date,
the name of the authority that issued the certificate (and, therefore, is
vouching for the identity of the user), a serial number, any pertinent
policies describing how the certificate was issued and/or how the
certificate may be used, the digital signature of the certificate issuer,
and perhaps other information.
FIGURE 4: GTE Cybertrust Global Root-issued certificate as viewed
by Netscape Navigator V4.
A sample abbreviated certificate is shown in Figure 4. This is a typical
certificate found in a browser; while this one is issued by GTE Cybertrust,
many so-called root-level certificates can be found shipped with browsers.
When the browser makes a connection to a secure Web site, the Web server
sends its public key certificate to the browser. The browser then checks the
certificate's signature against the public key that it has stored; if there
is a match, the certificate is taken as valid and the Web site verified by
this certificate is considered to be "trusted."
*TABLE 2. Contents of an X.509 V3 Certificate.*
version number
certificate serial number
signature algorithm identifier
issuer's name and unique identifier
validity (or operational) period
subject's name and unique identifier
subject public key information
standard extensions
certificate appropriate use definition
key usage limitation definition
certificate policy information
other extensions
Application-specific
CA-specific
The most widely accepted certificate format is the one defined in
International Telecommunication Union Telecommunication Standardization
Sector (ITU-T) Recommendation X.509. Rec. X.509 is a specification used
around the world and any applications complying with X.509 can share
certificates. Most certificates today comply with X.509 Version 3 and
contain the information listed in Table 2.
Certificate authorities are the repositories for public-keys and can be any
agency that issues certificates. A company, for example, may issue
certificates to its employees, a college/university to its students, a store
to its customers, an Internet service provider to its users, or a government
to its constituents.
When a sender needs an intended receiver's public key, the sender must get
that key from the receiver's CA. That scheme is straight-forward if the
sender and receiver have certificates issued by the same CA. If not, how
does the sender know to *trust* the foreign CA? One industry wag has noted,
about trust: "You are either born with it or have it granted upon you."
Thus, some CAs will be trusted because they are known to be reputable, such
as the CAs operated by AT&T, BBN, Canada Post Corp., CommerceNet, GTE
Cybertrust <http://www.bbn.com/products/security/cytrust/>, MCI, Nortel
EnTrust <http://www.entrust.com/>, Thawte <http://www.thawte.com/>, the U.S.
Postal Service, and VeriSign <http://www.verisign.com/>. CAs, in turn, form
trust relationships with other CAs. Thus, if a user queries a foreign CA for
information, the user may ask to see a list of CAs that establish a "chain
of trust" back to the user.
One major feature to look for in a CA is their identification policies and
procedures. When a user generates a key pair and forwards the public key to
a CA, the CA has to check the sender's identification and takes any steps
necessary to assure itself that the request is really coming from the
advertised sender. Different CAs have different identification policies and
will, therefore, be trusted differently by other CAs. Verification of
identity is just of many issues that are part of a CA's Certification
Practice Statement (CPS) and policies; other issues include how the CA
protects the public keys in its care, how lost or compromised keys are
revoked, and how the CA protects its own private keys.
4.4. Summary
The paragraphs above describe three very different trust models. It is hard
to say that any one is better than the others; it depend upon your
application. One of the biggest and fastest growing applications of
cryptography today, though, is electronic commerce (e-commerce), a term that
itself begs for a formal definition.
PGP's web of trust is easy to maintain and very much based on the reality of
users as people. The model, however, is limited; just how many public keys
can a single user reliably store and maintain? And what if you are using the
"wrong" computer when you want to send a message and can't access your
keyring? How easy it is to revoke a key if it is compromised? PGP may also
not scale well to an e-commerce scenario of secure communication between
total strangers on short-notice.
Kerberos overcomes many of the problems of PGP's web of trust, in that it is
scalable and its scope can be very large. However, it also requires that the
Kerberos server have *a priori* knowledge of all client systems prior to any
transactions, which makes it unfeasible for "hit-and-run" client/server
relationships as seen in e-commerce.
Certificates and the collection of CAs will form a Public Key Infrastructure
(PKI). In the early days of the Internet, every host had to maintain a list
of every other host; the Domain Name System (DNS) introduced the idea of a
distributed database for this purpose and the DNS is one of the key reasons
that the Internet has grown as it has. A PKI will fill a similar void in the
e-commerce and PKC realm.
While certificates and the benefits of a PKI are most often associated with
electronic commerce, the applications for PKI are much broader and include
secure electronic mail, payments and electronic checks, Electronic Data
Interchange (EDI), secure transfer of Domain Name System (DNS) and routing
information, electronic forms, and digitally signed documents. A single
"global PKI" is still many years away, that is the ultimate goal of today's
work as international electronic commerce changes the way in which we do
business in a similar way in which the Internet has changed the way in which
we communicate.
5. CRYPTOGRAPHIC ALGORITHMS IN ACTION
The paragraphs above have provided an overview of the different types of
cryptographic algorithms, as well as some examples of some available
protocols and schemes. Table 3 provides an even longer list of some of the
schemes employed today for a variety of functions, most notably electronic
commerce. The paragraphs below will show several real cryptographic
applications that many of us employ (knowingly or not) everyday; for
password protection and private communication.
*TABLE 3. Other Crypto Algorithms and Systems of Note.
* Capstone <http://csrc.nist.gov/keyrecovery/cap.txt> A now-defunct U.S.
National Institute of Standards and Technology (NIST) and National Security
Agency (NSA) project under the Bush Sr. and Clinton administrations for
publicly available strong cryptography with keys escrowed by the government
(NIST and the Treasury Dept.). Capstone included in one or more tamper-proof
computer chips for implementation (Clipper), a secret key encryption
algorithm (Skipjack), digital signature algorithm (DSA), key exchange
algorithm (KEA), and hash algorithm (SHA).
Clipper<http://csrc.nist.gov/keyrecovery/clip.txt> The
computer chip that would implement the Skipjack encryption scheme. See also
EPIC's The Clipper Chip <http://www.epic.org/crypto/clipper/> Web
page. Escrowed
Encryption Standard (EES) Largely unused, a controversial crypto scheme
employing the SKIPJACK secret key crypto algorithm and a Law Enforcement
Access Field (LEAF) creation method. LEAF was one part of the key escrow
system and allowed for decryption of ciphertext messages that had been
legally intercepted by law enforcement agencies. Described more in
FIPS 185<http://www.itl.nist.gov/fipspubs/fip185.htm>.
Federal Information Processing Standards
(FIPS)<http://csrc.nist.gov/publications/fips/index.html> These
computer security- and crypto-related FIPS are produced by the U.S. National
Institute of Standards and Technology (NIST) as standards for the U.S.
Government. Fortezza (formerly called
Tessera)<http://www.rsasecurity.com/rsalabs/faq/6-2-6.html> A
PCMCIA card developed by NSA that implements the Capstone algorithms,
intended for use with the Defense Messaging Service (DMS). GOST GOST is a
family of algorithms that is defined in the Russian cryptographic standards.
Although most of the specifications are written in Russian, RFC
4357<http://www.rfc-editor.org/rfc/rfc4357.txt>provides supplemental
information and specifications so that the algorithms
can be used effectively in Internet applications. Identity-Based Encryption
(IBE) <http://crypto.stanford.edu/ibe/> Identity-Based Encryption was first
proposed by Adi Shamir in 1984, and is a key authentication system where the
public key can be derived from some unique information based upon the user's
identity. In 2001, Dan Boneh (Stanford) and Matt Franklin (U.C., Davis)
developed a practical implementation of IBE based on elliptic curves and a
mathematical construct called the Weil Pairing. In that year, Clifford Cocks
(GCHQ) also described another IBE solution based on quadratic residues in
composite groups.
- RFC 5091: Identity-Based Cryptography Standard (IBCS)
#1<http://www.rfc-editor.org/rfc/rfc5091.txt>:
Describes an implementation of IBE using Boneh-Franklin (BF) and Boneh-Boyen
(BB1) Identity-based Encryption. This document is in part based on Voltage
Security's Identity-based Cryptography Standards (IBCS)
documents<http://www.voltage.com/ibe_dev/documentation/resources/ibcs1.pdf>
.
IP Security Protocol
(IPsec)<http://www.ietf.org/html.charters/ipsec-charter.html> The
IPsec protocol suite is used to provide privacy and authentication services
at the IP layer. An overview of the protocol suite and of the documents
comprising IPsec can be found in RFC
2411<http://www.rfc-editor.org/rfc/rfc2411.txt>.
Other documents include:
- RFC 4301 <http://www.rfc-editor.org/rfc/rfc4301.txt>: IP security
architecture.
- RFC 4302 <http://www.rfc-editor.org/rfc/rfc4302.txt>: IP Authentication
Header (AH), one of the two primary IPsec functions; AH provides
connectionless integrity and data origin authentication for IP datagrams and
protects against replay attacks.
- RFC 4303 <http://www.rfc-editor.org/rfc/rfc4303.txt>: IP Encapsulating
Security Payload (ESP), the other primary IPsec function; ESP provides a
variety of security services within IPsec.
- RFC 4304 <http://www.rfc-editor.org/rfc/rfc4304.txt>: Extended Sequence
Number (ESN) Addendum, allows for negotiation of a 32- or 64- bit sequence
number, used to detect replay attacks.
- RFC 4305 <http://www.rfc-editor.org/rfc/rfc4305.txt>: Cryptographic
algorithm implementation requirements for ESP and AH.
- RFC 4306 <http://www.rfc-editor.org/rfc/rfc4306.txt>: The Internet Key
Exchange (IKE) protocol, version 2, providing for mutual authentication and
establishing and maintaining security associations.
- IKE v1 was described in three separate documents, RFC
2407<http://www.rfc-editor.org/rfc/rfc2407.txt>(application of ISAKMP
to IPsec), RFC
2408 <http://www.rfc-editor.org/rfc/rfc2408.txt> (ISAKMP, a framework
for key management and security associations), and RFC
2409<http://www.rfc-editor.org/rfc/rfc2409.txt>(IKE, using part of
Oakley and part of SKEME in conjunction with ISAKMP to
obtain authenticated keying material for use with ISAKMP, and for other
security associations such as AH and ESP). IKE v1 is obsoleted with the
introdcution of IKEv2.
- RFC 4307 <http://www.rfc-editor.org/rfc/rfc4307.txt>: Cryptographic
algoritms used with IKEv2.
- RFC 4308 <http://www.rfc-editor.org/rfc/rfc4308.txt>: Crypto suites for
IPsec, IKE, and IKEv2.
- RFC 4309 <http://www.rfc-editor.org/rfc/rfc4309.txt>: The use of AES in
CBC-MAC mode with IPsec ESP.
- RFC 4312 <http://www.rfc-editor.org/rfc/rfc4312.txt>: The use of the
Camellia cipher algorithm in IPsec.
- RFC 4359 <http://www.rfc-editor.org/rfc/rfc4359.txt>: The Use of
RSA/SHA-1 Signatures within Encapsulating Security Payload (ESP) and
Authentication Header (AH).
- RFC 4434 <http://www.rfc-editor.org/rfc/rfc4434.txt>: Describes
AES-XCBC-PRF-128, a pseudo-random function derived from the AES for use with
IKE.
- RFC 2403 <http://www.rfc-editor.org/rfc/rfc2403.txt>: Describes use of
the HMAC with MD5 algorithm for data origin authentication and integrity
protection in both AH and ESP.
- RFC 2405 <http://www.rfc-editor.org/rfc/rfc2405.txt>: Describes use of
DES-CBC (DES in Cipher Block Chaining Mode) for confidentiality in ESP.
- RFC 2410 <http://www.rfc-editor.org/rfc/rfc2410.txt>: Defines use of
the NULL encryption algorithm (i.e., provides authentication and integrity
without confidentiality) in ESP.
- RFC 2412 <http://www.rfc-editor.org/rfc/rfc2412.txt>: Describes OAKLEY,
a key determination and distribution protocol.
- RFC 2451 <http://www.rfc-editor.org/rfc/rfc2451.txt>: Describes use of
Cipher Block Chaining (CBC) mode cipher algorithms with ESP.
- RFCs 2522 <http://www.rfc-editor.org/rfc/rfc2522.txt> and
2523<http://www.rfc-editor.org/rfc/rfc2523.txt>:
Description of Photuris, a session-key management protocol for IPsec.
IPsec was first proposed for use with IP version 6 (IPv6), but can also be
employed with the current IP version, IPv4.
(See more detail about IPsec below in Section
5.6<http://www.garykessler.net/library/crypto.html#ipsec>.)
Internet Security Association and Key Management Protocol
(ISAKMP/OAKLEY)<http://cio.cisco.com/public/library/isakmp>
ISAKMP/OAKLEY
provide an infrastructure for Internet secure communications. ISAKMP,
designed by the National Security Agency (NSA) <http://www.nsa.gov/> and
described in RFC 2408 <http://www.rfc-editor.org/rfc/rfc2408.txt>, is a
framework for key management and security associations, independent of the
key generation and cryptographic algorithms actually employed. The OAKLEY
Key Determination Protocol, described in RFC
2412<http://www.rfc-editor.org/rfc/rfc2412.txt>,
is a key determination and distribution protocol using a variation of
Diffie-Hellman. Kerberos <http://www.pdc.kth.se/kth-krb/> A secret-key
encryption and authentication system, designed to authenticate requests for
network resources within a user domain rather than to authenticate messages.
Kerberos also uses a trusted third-party approach; a client communications
with the Kerberos server to obtain "credentials" so that it may access
services at the application server. Kerberos V4 uses DES to generate keys
and encrypt messages; DES is also commonly used in Kerberos V5, although
other schemes could be employed.
Microsoft added support for Kerberos V5 with some proprietary extensions
in Windows 2000. There are many Kerberos articles posted at
Microsoft's Knowledge
Base <http://support.microsoft.com/default.aspx?scid=fh;EN-US;KBHOWTO>,
notably "Basic Overview of Kerberos User Authentication Protocol in Windows
2000 <http://support.microsoft.com/default.aspx?scid=kb;en-us;Q217098>,"
"Windows
2000 Kerberos 5 Ticket Flags and KDC Options for AS_REQ and TGS_REQ
Messages<http://support.microsoft.com/default.aspx?scid=kb;en-us;Q230669>,"
and "Kerberos Administration in Windows
2000<http://support.microsoft.com/default.aspx?scid=kb;en-us;Q232179>."
Keyed-Hash Message Authentication Code (HMAC) A message authentication
scheme based upon secret key cryptography and the secret key shared between
two parties rather than public key methods. Described in FIPS
198<http://csrc.nist.gov/publications/fips/fips198/fips-198a.pdf>and
RFC
2104 <http://www.rfc-editor.org/rfc/rfc2104.txt>. Message Digest Cipher
(MDC) <http://web.textfiles.com/software/sfs7.txt> Invented by Peter
Gutman<http://www.cs.auckland.ac.nz/%7Epgut001/>,
MDC turns a one-way hash function into a block cipher. MIME Object Security
Standard (MOSS) Designed as a successor to PEM to provide PEM-based security
services to MIME messages. Pretty Good Privacy (PGP) <http://www.pgp.com/> A
family of cryptographic routines for e-mail and file storage applications
developed by Philip Zimmermann. PGP 2.6.x uses RSA for key management and
digital signatures, IDEA for message encryption, and MD5 for computing the
message's hash value; more information can also be found in RFC
1991<http://www.rfc-editor.org/rfc/rfc1991.txt>.
PGP 5.x (formerly known as "PGP 3") uses Diffie-Hellman/DSS for key
management and digital signatures; IDEA, CAST, or 3DES for message
encryption; and MD5 or SHA for computing the message's hash value. OpenPGP,
described in RFC 2440 <http://www.rfc-editor.org/rfc/rfc2440.txt>, is an
open definition of security software based on PGP 5.x.
(See more detail about PGP below in Section
5.5<http://www.garykessler.net/library/crypto.html#pgp>.)
Privacy Enhanced Mail (PEM) Provides secure electronic mail over the
Internet and includes provisions for encryption (DES), authentication, and
key management (DES, RSA). May be superseded by S/MIME and PEM-MIME.
Developed by IETF PEM Working Group and defined in four RFCs:
RFC 1421 <http://www.rfc-editor.org/rfc/rfc1421.txt>: Part I, Message
Encryption and Authentication Procedures
RFC 1422 <http://www.rfc-editor.org/rfc/rfc1422.txt>: Part II,
Certificate-Based Key Management
RFC 1423 <http://www.rfc-editor.org/rfc/rfc1423.txt>: Part III, Algorithms,
Modes, and Identifiers
RFC 1424 <http://www.rfc-editor.org/rfc/rfc1424.txt>: Part IV, Key
Certification and Related Services Private Communication Technology
(PCT) Developed
by Microsoft and Visa for secure communication on the Internet. Similar to
SSL, PCT supports Diffie-Hellman, Fortezza, and RSA for key establishment;
DES, RC2, RC4, and triple-DES for encryption; and DSA and RSA message
signatures. A companion to SET. Secure Electronic Transactions
(SET)<http://www.mastercard.com/set/> A
merging of two other protocols: SEPP (Secure Electronic Payment Protocol),
an open specification for secure bank card transactions over the Internet,
developed by CyberCash, GTE, IBM, MasterCard, and Netscape; and STT (Secure
Transaction Technology), a secure payment protocol developed by Microsoft
and Visa International. Supports DES and RC4 for encryption, and RSA for
signatures, key exchange, and public-key encryption of bank card numbers.
SET is a companion to the PCT protocol. Secure Hypertext Transfer Protocol
(S-HTTP) <http://www.ietf.org/html.charters/wts-charter.html> An extension
to HTTP to provide secure exchange of documents over the World Wide Web.
Supported algorithms include RSA and Kerberos for key exchange, DES, IDEA,
RC2, and Triple-DES for encryption. Secure Multipurpose Internet Mail
Extensions (S/MIME) <http://www.ietf.org/html.charters/smime-charter.html> An
IETF secure e-mail scheme intended to supercede PEM. S/MIME, described in
RFCs 2311 <http://www.rfc-editor.org/rfc/rfc2311.txt> and
2312<http://www.rfc-editor.org/rfc/rfc2312.txt>,
adds digital signature and encryption capability to Internet MIME
messages. Secure
Sockets Layer (SSL) <http://home.netscape.com/eng/ssl3/ssl-toc.html> Developed
by Netscape Communications to provide application-independent security and
privacy over the Internet. SSL is designed so that protocols such as HTTP,
FTP (File Transfer Protocol), and Telnet can operate over it transparently.
SSL allows both server authentication (mandatory) and client authentication
(optional). RSA is used during negotiation to exchange keys and identify the
actual cryptographic algorithm (DES, IDEA, RC2, RC4, or 3DES) to use for the
session. SSL also uses MD5 for message digests and X.509 public-key
certificates. (Found to be breakable soon after the IETF announced formation
of group to work on TLS.)
(See more detail about SSL below in Section
5.7<http://www.garykessler.net/library/crypto.html#ssl>.)
Server Gated Cryptography
(SGC)<http://www.microsoft.com/security/tech/sgc/default.asp?ID=26&Parent=4>
Microsoft
extension to SSL that provides strong encryption for online banking and
other financial applications using RC2 (128-bit key), RC4 (128-bit key), DES
(56-bit key), or 3DES (equivalent of 168-bit key). Use of SGC requires a
Windows NT Server running Internet Information Server (IIS) 4.0 with a valid
SGC certificate. SGC is available in 32-bit Windows versions of Internet
Explorer (IE) 4.0, and support for Mac, Unix, and 16-bit Windows versions of
IE is expected in the future. Simple Authentication and Security Layer
(SASL) <http://www.rfc-editor.org/rfc/rfc4422.txt> (SASL) is a framework for
providing authentication and data security services in connection-oriented
protocols (a la TCP). It provides a structured interface and allows new
protocols to reuse existing authentication mechanisms and allows old
protocols to make use of new mechanisms.
It has been common practice on the Internet to permit anonymous access to
various services, employing a plain-text password using a user name of
"anonymous" and a password of an email address or some other identifying
information. New IETF protocols disallow plain-text logins. The Anonymous
SASL Mechanism (RFC 4505 <http://www.rfc-editor.org/rfc/rfc4505.txt>)
provides a method for anonymous logins within the SASL framework.
Simple Key-Management for Internet Protocol (SKIP) <http://skip.incog.com/> Key
management scheme for secure IP communication, specifically for IPsec, and
designed by Aziz and Diffie. SKIP essentially defines a public key
infrastructure for the Internet and even uses X.509 certificates. Most
public key cryptosystems assign keys on a per-session basis, which is
inconvenient for the Internet since IP is connectionless. Instead, SKIP
provides a basis for secure communication between any pair of Internet
hosts. SKIP can employ DES, 3DES, IDEA, RC2, RC5, MD5, and SHA-1. Transport
Layer Security (TLS) <http://www.ietf.org/html.charters/tls-charter.html> IETF
specification (RFC 2246 <http://www.rfc-editor.org/rfc/rfc2246.txt>)
intended to replace SSL. Employs Triple-DES (secret key cryptography), SHA
(hash), Diffie-Hellman (key exchange), and DSS (digital signatures).
(See more detail about TLS below in Section
5.7<http://www.garykessler.net/library/crypto.html#ssl>.)
X.509 <http://www.ietf.org/html.charters/pkix-charter.html> ITU-T
recommendation for the format of certificates for the public key
infrastructure. Certificates map (bind) a user identity to a public key. The
IETF application of X.509 certificates is documented in RFC
2459<http://www.rfc-editor.org/rfc/rfc2459.txt>.
An Internet X.509 Public Key Infrastructure is further defined in RFC
2510<http://www.rfc-editor.org/rfc/rfc2510.txt>(Certificate Management
Protocols) and RFC
2527 <http://www.rfc-editor.org/rfc/rfc2527.txt> (Certificate Policy and
Certification Practices Framework).
5.1. Password Protection
Nearly all modern multiuser computer and network operating systems employ
passwords at the very least to protect and authenticate users accessing
computer and/or network resources. But passwords are *not* typically kept on
a host or server in plaintext, but are generally encrypted using some sort
of hash scheme.
*A) /etc/passwd file*
root:Jbw6BwE4XoUHo:0:0:root:/root:/bin/bash
carol:FM5ikbQt1K052:502:100:Carol Monaghan:/home/carol:/bin/bash
alex:LqAi7Mdyg/HcQ:503:100:Alex Insley:/home/alex:/bin/bash
gary:FkJXupRyFqY4s:501:100:Gary Kessler:/home/gary:/bin/bash
todd:edGqQUAaGv7g6:506:101:Todd Pritsky:/home/todd:/bin/bash
josh:FiH0ONcjPut1g:505:101:Joshua Kessler:/home/webroot:/bin/bash
*B.1) /etc/passwd file (with shadow passwords)*
root:x:0:0:root:/root:/bin/bash
carol:x:502:100:Carol Monaghan:/home/carol:/bin/bash
alex:x:503:100:Alex Insley:/home/alex:/bin/bash
gary:x:501:100:Gary Kessler:/home/gary:/bin/bash
todd:x:506:101:Todd Pritsky:/home/todd:/bin/bash
josh:x:505:101:Joshua Kessler:/home/webroot:/bin/bash
*B.2) /etc/shadow file*
root:AGFw$1$P4u/uhLK$l2.HP35rlu65WlfCzq:11449:0:99999:7:::
carol:kjHaN%35a8xMM8a/0kMl1?fwtLAM.K&kw.:11449:0:99999:7:::
alex:1$1KKmfTy0a7#3.LL9a8H71lkwn/.hH22a:11449:0:99999:7:::
gary:9ajlknknKJHjhnu7298ypnAIJKL$Jh.hnk:11449:0:99999:7:::
todd:798POJ90uab6.k$klPqMt%alMlprWqu6$.:11492:0:99999:7:::
josh:Awmqpsui*787pjnsnJJK%aappaMpQo07.8:11492:0:99999:7:::
FIGURE 5: Sample entries in Unix/Linux password files.
Unix/Linux, for example, uses a well-known hash via its *crypt()* function.
Passwords are stored in the */etc/passwd* file (Figure 5A); each record in
the file contains the username, hashed password, user's individual and group
numbers, user's name, home directory, and shell program; these fields are
separated by colons (:). Note that each password is stored as a 13-byte
string. The first two characters are actually a *salt*, randomness added to
each password so that if two users have the same password, they will still
be encrypted differently; the salt, in fact, provides a means so that a
single password might have 4096 different encryptions. The remaining 11
bytes are the password hash, calculated using DES.
As it happens, the */etc/passwd* file is world-readable on Unix systems.
This fact, coupled with the weak encryption of the passwords, resulted in
the development of the *shadow password* system where passwords are kept in
a separate, non-world-readable file used in conjunction with the normal
password file. When shadow passwords are used, the password entry in *
/etc/passwd* is replaced with a "*" or "x" (Figure 5B.1) and the MD5 hash of
the passwords are stored in */etc/shadow* along with some other account
information (Figure 5B.2).
Windows NT uses a similar scheme to store passwords in the Security Access
Manager (SAM) file. In the NT case, all passwords are hashed using the MD4
algorithm, resulting in a 128-bit (16-byte) hash value (they are then *
obscured* using an undocumented mathematical transformation that was a
secret until distributed on the Internet). The password password, for
example, might be stored as the hash value (in hexadecimal)
60771b22d73c34bd4a290a79c8b09f18.
Passwords are not saved in plaintext on computer systems precisely so they
cannot be easily compromised. For similar reasons, we don't want passwords
sent in plaintext across a network. But for remote logon applications, how
does a client system identify itself or a user to the server? One mechanism,
of course, is to send the password as a hash value and that, indeed, may be
done. A weakness of that approach, however, is that an intruder can grab the
password off of the network and use an off-line attack (such as a *dictionary
attack* where an attacker takes every known word and encrypts it with the
network's encryption algorithm, hoping eventually to find a match with a
purloined password hash). In some situations, an attacker only has to copy
the hashed password value and use it later on to gain unauthorized entry
without ever learning the actual password.
An even stronger authentication method uses the password to modify a shared
secret between the client and server, but never allows the password in any
form to go across the network. This is the basis for the Challenge Handshake
Authentication Protocol (CHAP), the remote logon process used by Windows NT.
As suggested above, Windows NT passwords are stored in a security file on a
server as a 16-byte hash value. In truth, Windows NT stores *two* hashes; a
weak hash based upon the old LAN Manager (LanMan) scheme and the newer NT
hash. When a user logs on to a server from a remote workstation, the user is
identified by the username, sent across the network in plaintext (no worries
here; it's not a secret anyway!). The server then generates a 64-bit random
number and sends it to the client (also in plaintext). This number is the *
challenge*.
Using the LanMan scheme, the client system then encrypts the challenge using
DES. Recall that DES employs a 56-bit key, acts on a 64-bit block of data,
and produces a 64-bit output. In this case, the 64-bit data block is the
random number. The client actually uses three different DES keys to encrypt
the random number, producing three different 64-bit outputs. The first key
is the first seven bytes (56 bits) of the password's hash value, the second
key is the next seven bytes in the password's hash, and the third key is the
remaining two bytes of the password's hash concatenated with five
zero-filled bytes. (So, for the example above, the three DES keys would be
60771b22d73c34, bd4a290a79c8b0, and 9f180000000000.) Each key is applied to
the random number resulting in three 64-bit outputs, which comprise the *
response*. Thus, the server's 8-byte challenge yields a 24-byte response
from the client and this is all that would be seen on the network. The
server, for its part, does the same calculation to ensure that the values
match.
There is, however, a significant weakness to this system. Specifically, the
response is generated in such a way as to effectively reduce 16-byte hash to
three smaller hashes, of length seven, seven, and two. Thus, a password
cracker has to break at most a 7-byte hash. One Windows NT vulnerability
test program that I have used in the past will report passwords that are
"too short," defined as "less than 8 characters." When I asked how the
program knew that passwords were too short, the software's salespeople
suggested to me that the program broke the passwords to determine their
length. This is undoubtedly not true; all the software really has to do is
look at the second 7-byte block and some known value indicates that it is
empty, which would indicate a password of seven or less characters.
Consider the following example, showing the LanMan hash of two different
short passwords (take a close look at the last 8 bytes):
AA: 89D42A44E77140AAAAD3B435B51404EE AAA: 1C3A2B6D939A1021AAD3B435B51404EE
Note that the NT hash provides no such clue:
AA: C5663434F963BE79C8FD99F535E7AAD8 AAA: 6B6E0FB2ED246885B98586C73B5BFB77
It is worth noting that the discussion above describes the Microsoft version
of CHAP, or MS-CHAP (MS-CHAPv2 is described in RFC
2759<http://www.rfc-editor.org/rfc/rfc2759.txt>).
MS-CHAP assumes that it is working with hashed values of the password as the
key to encrypting the challenge. More traditional CHAP (RFC
1994<http://www.rfc-editor.org/rfc/rfc2510.txt>)
assumes that it is starting with passwords in plaintext. The relevance of
this observation is that a CHAP client, for example, cannot be authenticated
by an MS-CHAP server; both client and server must use the same CHAP version.
5.2. Some of the Finer Details of Diffie-Hellman
The first published public-key crypto algorithm was Diffie-Hellman. The
mathematical "trick" of this scheme is that it is relatively easy to compute
exponents compared to computing discrete logarithms. Diffie-Hellman allows
two parties the ubiquitous Alice and Bob to generate a secret key; they
need to exchange some information over an unsecure communications channel to
perform the calculation but an eavesdropper cannot determine the shared key
based upon this information.
Diffie-Hellman works like this. Alice and Bob start by agreeing on a large
prime number, *n*. They also have to choose some number *g* so that g<n.
There is actually another constraint on g, specifically that it must be
primitive with respect to n. *Primitive* is a definition that is a little
beyond the scope of our discussion but basically g is primitive to n if we
can find integers *i* so that gi = j mod n for all values of j from 1 to
n-1. As an example, 2 is not primitive to 7 because the set of powers of 2
from 1 to 6, mod 7 = {2,4,1,2,4,1}. On the other hand, 3 is primitive to 7
because the set of powers of 3 from 1 to 6, mod 7 = {3,2,6,4,5,1}.
(The definition of primitive introduced a new term to some readers, namely *
mod*. The phrase *x mod y* (and read as written!) means "take the remainder
after dividing x by y." Thus, 1 mod 7 = 1, 9 mod 6 = 3, and 8 mod 8 = 0.)
Anyway, either Alice or Bob selects n and g; they then tell the other party
what the values are. Alice and Bob then work independently:
*Alice...*
Choose a large random number, *x*
Send to Bob: X = g*x* mod n
Compute: KA = Y*x* mod n
*Bob...*
Choose a large random number, *y*
Send to Alice: Y = g*y* mod n
Compute: KB = X*y* mod n
Note that *x* and *y* are kept secret while X and Y are openly shared; these
are the private and public keys, respectively. Based on their own private
key and the public key learned from the other party, Alice and Bob have
computed their secret keys, KA and KB, respectively, which are equal to g*xy
* mod n.
Perhaps a small example will help here. Although Alice and Bob will really
choose large values for n and g, I will use small values for example only;
let's use n=7 and g=3.
*Alice...*
Choose *x*=2
Send to Bob: X = 32 mod 7 = 2
KA = 62 mod 7 = 1
*Bob...*
Choose *y*=3
Send to Alice: Y = 33 mod 7 = 6
KB = 23 mod 7 = 1
In this example, then, Alice and Bob will both find the secret key 1 which
is, indeed, 36 mod 7. If an eavesdropper (Mallory) was listening in on the
information exchange between Alice and Bob, he would learn g, n, X, and Y
which is a lot of information but insufficient to compromise the key; as
long as *x* and *y* remain unknown, K is safe. As said above, calculating X
as g*x* is a lot easier than finding *x* as logg X!
*A short digression on modulo arithmetic.* In the paragraph above, we noted
that 36 mod 7 = 1. This can be confirmed, of course, by noting that:
36 = 729 = 104*7 + 1
There is a nice property of modulo arithmetic, however, that makes this
determination a little easier, namely: (a mod x)(b mod x) = (ab mod x).
Therefore, one possible shortcut is to note that 36 = (33)(33). Therefore, 3
6 mod 7 = (33 mod 7)(33 mod 7) = (27 mod 7)(27 mod 7) = 6*6 mod 7 = 36 mod 7
= 1.
Diffie-Hellman can also be used to allow key sharing amongst multiple
users. Note again that the Diffie-Hellman algorithm is used to generate
secret keys, not to encrypt and decrypt messages.
5.3. Some of the Finer Details of RSA Public-Key Cryptography
Unlike Diffie-Hellman, RSA can be used for key exchange as well as digital
signatures and the encryption of small blocks of data. Today, RSA is primary
used to encrypt the session key used for secret key encryption (message
integrity) or the message's hash value (digital signature). RSA's
mathematical hardness comes from the ease in calculating large numbers and
the difficulty in finding the prime factors of those large numbers. Although
employed with numbers using hundreds of digits, the math behind RSA is
relatively straight-forward.
To create an RSA public/private key pair, here are the basic steps:
1. Choose two prime numbers, p and q. From these numbers you can
calculate the modulus, n = pq.
2. Select a third number, e, that is relatively prime to (i.e., it does
not divide evenly into) the product (p-1)(q-1). The number e is the public
exponent.
3. Calculate an integer d from the quotient (ed-1)/[(p-1)(q-1)]. The
number d is the private exponent.
The public key is the number pair (n,e). Although these values are publicly
known, it is computationally infeasible to determine d from n and e if p and
q are large enough.
To encrypt a message, M, with the public key, create the ciphertext, C,
using the equation:
C = Me mod n
The receiver then decrypts the ciphertext with the private key using the
equation:
M = Cd mod n
Now, this might look a bit complex and, indeed, the mathematics does take a
lot of computer power given the large size of the numbers; since p and q may
be 100 digits (decimal) or more, d and e will be about the same size and n
may be over 200 digits. Nevertheless, a simple example may help. In this
example, the values for p, q, e, and d are purposely chosen to be very small
and the reader will see exactly how badly these values perform, but
hopefully the algorithm will be adequately demonstrated:
1. Select p=3 and q=5.
2. The modulus n = pq = 15.
3. The value e must be relatively prime to (p-1)(q-1) = (2)(4) = 8.
Select e=11
4. The value d must be chosen so that (ed-1)/[(p-1)(q-1)] is an integer.
Thus, the value (11d-1)/[(2)(4)] = (11d-1)/8 must be an integer. Calculate
one possible value, d=3.
5. Let's say we wish to send the string *SECRET*. For this example, we
will convert the string to the decimal representation of the ASCII values of
the characters, which would be *83 69 67 82 69 84*.
6. The sender encrypts each digit one at a time (we have to because the
modulus is so small) using the public key value (e,n)=(11,15). Thus, each
ciphertext character Ci = Mi11 mod 15. The input digit string *
0x836967826984* will be transmitted as *0x2c696d286924*.
7. The receiver decrypts each digit using the private key value
(d,n)=(3,15). Thus, each plaintext character Mi = Ci3 mod 15. The input
digit string *0x2c696d286924* will be converted to *0x836967826984* and,
presumably, reassembled as the plaintext string *SECRET*.
Again, the example above uses small values for simplicity and, in fact,
shows the weakness of small values; note that 4, 6, and 9 do not change when
encrypted, and that the values 2 and 8 encrypt to 8 and 2, respectively.
Nevertheless, this simple example demonstrates how RSA can be used to
exchange information.
RSA keylengths of 512 and 768 bits are considered to be pretty weak. The
minimum suggested RSA key is 1024 bits; 2048 and 3072 bits are even better.
As an aside, Adam Back
(http://www.cypherspace.org/~adam/<http://www.cypherspace.org/%7Eadam/>)
wrote a two-line Perl script to implement RSA. It employs dc, an arbitrary
precision arithmetic package that ships with most UNIX systems:
print pack"C*",split/\D+/,`echo "16iII*o\U@{$/=$z;[(pop,pop,unpack"H*",<>
)]}\EsMsKsN0[lN*1lK[d2%Sa2/d0<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<J]dsJxp"|dc`
5.4. Some of the Finer Details of DES, Breaking DES, and DES Variants
The Data Encryption Standard (DES) has been in use since the mid-1970s,
adopted by the National Bureau of Standards (NBS) [now the National
Institute for Standards and Technology (NIST)] as Federal Information
Processing Standard 46 (FIPS
46-3<http://csrc.nist.gov/publications/fips/fips46-3/fips46-3.pdf>)
and by the American National Standards Institute (ANSI) as X3.92.
As mentioned earlier, DES uses the Data Encryption Algorithm (DEA), a secret
key block-cipher employing a 56-bit key operating on 64-bit blocks.
FIPS 81<http://www.itl.nist.gov/div897/pubs/fip81.htm>describes four
modes of DES operation: Electronic Codebook (ECB), Cipher
Block Chaining (CBC), Cipher Feedback (CFB), and Output Feedback (OFB).
Despite all of these options, ECB is the most commonly deployed mode of
operation.
NIST finally declared DES obsolete in 2004, and withdrew FIPS 46-3, 74, and
81 (*Federal Register*, July 26, 2004, *69*(142),
44509-44510<http://csrc.nist.gov/Federal-register/July26-2004-FR-DES-Notice.pdf>).
Although other block ciphers will replace DES, it is still interesting to
see how DES encryption is performed; not only is it sort of neat, but DES
was the first crypto scheme commonly seen in non-govermental applications
and was the catalyst for modern "public" cryptography. DES remains in many
products and cryptography students and cryptographers will continue to
study DES for years to come.
*DES Operational Overview*
DES uses a 56-bit key. In fact, the 56-bit key is divided into eight 7-bit
blocks and an 8th odd parity bit is added to each block (i.e., a "0" or "1"
is added to the block so that there are an odd number of 1 bits in each
8-bit block). By using the 8 parity bits for rudimentary error detection, a
DES key is actually 64 bits in length for computational purposes (although
it only has 56 bits worth of randomness, or *entropy*).
FIGURE 6: DES enciphering algorithm.
DES then acts on 64-bit blocks of the plaintext, invoking 16 rounds of
permutations, swaps, and substitutes, as shown in Figure 6. The standard
includes tables describing all of the selection, permutation, and expansion
operations mentioned below; these aspects of the algorithm are not secrets.
The basic DES steps are:
1. The 64-bit block to be encrypted undergoes an initial permutation
(IP), where each bit is moved to a new bit position; e.g., the 1st, 2nd, and
3rd bits are moved to the 58th, 50th, and 42nd position, respectively.
2. The 64-bit permuted input is divided into two 32-bit blocks, called *
left* and *right*, respectively. The initial values of the left and right
blocks are denoted L0 and R0.
3. There are then 16 rounds of operation on the L and R blocks. During
each iteration (where *n* ranges from 1 to 16), the following formulae
apply:
Ln = Rn-1
Rn = Ln-1 XOR f(Rn-1,Kn)
At any given step in the process, then, the new L block value is merely
taken from the prior R block value. The new R block is calculated by taking
the bit-by-bit exclusive-OR (XOR) of the prior L block with the results of
applying the DES cipher function, *f*, to the prior R block and Kn.
(Knis a 48-bit value derived from the 64-bit DES key. Each round uses
a
different 48 bits according to the standard's Key Schedule algorithm.)
The cipher function, f, combines the 32-bit R block value and the 48-bit
subkey in the following way. First, the 32 bits in the R block are expanded
to 48 bits by an expansion function (E); the extra 16 bits are found by
repeating the bits in 16 predefined positions. The 48-bit expanded R-block
is then ORed with the 48-bit subkey. The result is a 48-bit value that is
then divided into eight 6-bit blocks. These are fed as input into 8
selection (S) boxes, denoted S1,...,S8. Each 6-bit input yields a 4-bit
output using a table lookup based on the 64 possible inputs; this results in
a 32-bit output from the S-box. The 32 bits are then rearranged by a
permutation function (P), producing the results from the cipher function.
4. The results from the final DES round i.e., L16 and R16 are
recombined into a 64-bit value and fed into an inverse initial permutation
(IP-1). At this step, the bits are rearranged into their original
positions, so that the 58th, 50th, and 42nd bits, for example, are moved
back into the 1st, 2nd, and 3rd positions, respectively. The output from IP
-1 is the 64-bit ciphertext block.
Consider this example with the given 56-bit key and input:
Key: *1100101 0100100 1001001 0011101 0110101 0101011 1101100 0011010*
Input character string: * GoAggies*
Input bit string: * 11100010 11110110 10000010 11100110 11100110 10010110
10100110 11001110*
Output bit string: *10011111 11110010 10000000 10000001 01011011 00101001
00000011 00101111*
Output character string: *ùOÚ"Àô*
*Breaking DES*
The mainstream cryptographic community has long held that DES's 56-bit key
was too short to withstand a brute-force attack from modern computers.
Remember Moore's Law: computer power doubles every 18 months. Given that
increase in power, a key that could withstand a brute-force guessing attack
in 1975 could hardly be expected to withstand the same attack a quarter
century later.
DES is even more vulnerable to a brute-force attack because it is often used
to encrypt words, meaning that the entropy of the 64-bit block is,
effectively, greatly reduced. That is, if we are encrypting random bit
streams, then a given byte might contain any one of 28 (256) possible values
and the entire 64-bit block has 264, or about 18.5 quintillion, possible
values. If we are encrypting words, however, we are most likely to find a
limited set of bit patterns; perhaps 70 or so if we account for upper and
lower case letters, the numbers, space, and some punctuation. This means
that only about ¼ of the bit combinations of a given byte are likely to
occur.
Despite this criticism, the U.S. government insisted throughout the
mid-1990s that 56-bit DES was secure and virtually unbreakable if
appropriate precautions were taken. In response, RSA Laboratories sponsored
a series of cryptographic
challenges<http://www.rsasecurity.com/rsalabs/challenges/>to prove
that DES was no longer appropriate for use.
DES Challenge I was launched in March 1997. It was completed in 84 days by
R. Verser in a collaborative effort using thousands of computers on the
Internet.
The first DES II challenge lasted 40 days in early 1998. This problem was
solved by distributed.net <http://www.distributed.net/>, a worldwide
distributed computing network using the spare CPU cycles of computers around
the Internet (participants in distributed.net's activities load a client
program that runs in the background, conceptually similar to the SETI @Home
"Search for Extraterrestrial Intelligence" project). The
distributed.netsystems were checking 28
*billion* keys per second by the end of the project.
The second DES II challenge lasted less than 3 days. On July 17, 1998, the
Electronic Frontier Foundation (EFF) announced the construction of hardware
that could brute-force a DES key in an average of 4.5 days. Called Deep
Crack, the device could check 90 billion keys per second and cost only about
$220,000 including design (it was erroneously and widely reported that
subsequent devices could be built for as little as $50,000). Since the
design is scalable, this suggests that an organization could build a DES
cracker that could break 56-bit keys in an average of a day for as little as
$1,000,000. Information about the hardware design and all software can be
obtained from the EFF<http://www.eff.org/pub/Privacy/Crypto_misc/DES_Cracking>
.
The DES III challenge, launched in January 1999, was broken is less than a
day by the combined efforts of Deep Crack and distributed.net. This is
widely considered to have been the final nail in DES's coffin.
The Deep Crack algorithm is actually quite interesting. The general approach
that the DES Cracker Project took was not to break the algorithm
mathematically but instead to launch a brute-force attack by guessing every
possible key. A 56-bit key yields 256, or about 72 quadrillion, possible
values. So the DES cracker team looked for any shortcuts they could find!
First, they assumed that *some* recognizable plaintext would appear in the
decrypted string even though they didn't have a specific known plaintext
block. They then applied all 256 possible key values to the 64-bit block (I
don't mean to make this sound simple!). The system checked to see if the
decrypted value of the block was "interesting," which they defined as bytes
containing one of the alphanumeric characters, space, or some punctuation.
Since the likelihood of a single byte being "interesting" is about ¼, then
the likelihood of the entire 8-byte stream being "interesting" is about ¼8,
or 1/65536 (½16). This dropped the number of possible keys that might yield
positive results to about 240, or about a trillion.
They then made the assumption that an "interesting" 8-byte block would be
followed by another "interesting" block. So, if the first block of
ciphertext decrypted to something interesting, they decrypted the next
block; otherwise, they abandoned this key. Only if the second block was also
"interesting" did they examine the key closer. Looking for 16 consecutive
bytes that were "interesting" meant that only 224, or 16 million, keys
needed to be examined further. This further examination was primarily to see
if the text made any sense. Note that possible "interesting" blocks might be
1hJ5&aB7 or DEPOSITS; the latter is more likely to produce a better result.
And even a slow laptop today can search through lists of only a few million
items in a relatively short period of time. (Interested readers are urged to
read *Cracking DES* and EFF's Cracking DES
<http://www.eff.org/descracker/>page.)
It is well beyond the scope of this paper to discuss other forms of breaking
DES and other codes. Nevertheless, it is worth mentioning a couple of forms
of cryptanalysis that have been shown to be effective against DES.
*Differential
cryptanalysis*, invented in 1990 by E. Biham and A. Shamir (of RSA fame), is
a chosen-plaintext attack. By selecting pairs of plaintext with particular
differences, the cryptanalyst examines the differences in the resultant
ciphertext pairs. *Linear plaintext*, invented by M. Matsui, uses a linear
approximation to analyze the actions of a block cipher (including DES). Both
of these attacks can be more efficient than brute force.
*DES Variants*
Once DES was "officially" broken, several variants appeared. But none of
them came overnight; work at hardening DES had already been underway. In the
early 1990s, there was a proposal to increase the security of DES by
effectively increasing the key length by using multiple keys with multiple
passes. But for this scheme to work, it had to first be shown that the DES
function is *not* a *group*, as defined in mathematics. If DES was a group,
then we could show that for two DES keys, X1 and X2, applied to some
plaintext (P), we can find a single equivalent key, X3, that would provide
the same result; i.e.,:
EX2(EX1(P)) = EX3(P)
where EX(P) represents DES encryption of some plaintext *P* using DES key *X
*. If DES were a group, it wouldn't matter how many keys and passes we
applied to some plaintext; we could always find a single 56-bit key that
would provide the same result.
As it happens, DES was proven to not be a group so that as we apply
additional keys and passes, the effective key length increases. One obvious
choice, then, might be to use two keys and two passes, yielding an effective
key length of 112 bits. Let's call this Double-DES. The two keys, Y1 and Y2,
might be applied as follows:
C = EY2(EY1(P))
P = DY1(DY2(C))
where EY(P) and DY(C) represent DES encryption and decryption, respectively,
of some plaintext *P* and ciphertext *C*, respectively, using DES key *Y*.
So far, so good. But there's an interesting attack that can be launched
against this "Double-DES" scheme. First, notice that the applications of the
formula above can be thought of with the following individual steps (where
C' and P' are intermediate results):
C' = EY1(P) and C = EY2(C')
P' = DY2(C) and P = DY1(P')
Unfortunately, C'=P'. That leaves us vulnerable to a simple *known plaintext
* attack (sometimes called "Meet-in-the-middle") where the attacker knows
some plaintext (P) and its matching ciphertext (C). To obtain C', the
attacker needs to try all 256 possible values of Y1 applied to P; to obtain
P', the attacker needs to try all 256 possible values of Y2 applied to C.
Since C'=P', the attacker knows when a match has been achieved after only
256 + 256 = 257 key searches, only twice the work of brute-forcing DES. So
"Double-DES" won't work.
Triple-DES (3DES), based upon the Triple Data Encryption Algorithm (TDEA),
is described in FIPS
46-3<http://csrc.nist.gov/publications/fips/fips46-3/fips46-3.pdf>.
3DES, which is not susceptible to a meet-in-the-middle attack, employs three
DES passes and one, two, or three keys called K1, K2, and K3. Generation of
the ciphertext (C) from a block of plaintext (P) is accomplished by:
C = EK3(DK2(EK1(P)))
where EK(P) and DK(P) represent DES encryption and decryption, respectively,
of some plaintext *P* using DES key *K*. (For obvious reasons, this is
sometimes referred to as an *encrypt-decrypt-encrypt mode* operation.)
Decryption of the ciphertext into plaintext is accomplished by:
P = DK1(EK2(DK3(C)))
The use of three, independent 56-bit keys provides 3DES with an effective
key length of 168 bits. The specification also defines use of two keys
where, in the operations above, K3 = K1; this provides an effective key
length of 112 bits. Finally, a third keying option is to use a single key,
so that K3 = K2 = K1 (in this case, the effective key length is 56 bits and
3DES applied to some plaintext, P, will yield the same ciphertext, C, as
normal DES would with that same key). Given the relatively low cost of key
storage and the modest increase in processing due to the use of longer keys,
the best recommended practices are that 3DES be employed with three keys.
Another variant of DES, called DESX, is due to Ron Rivest. Developed in
1996, DESX is a very simple algorithm that greatly increases DES's
resistance to brute-force attacks without increasing its computational
complexity. In DESX, the plaintext input is XORed with 64 additional key
bits prior to encryption and the output is likewise XORed with the 64 key
bits. By adding just two XOR operations, DESX has an effective keylength of
120 bits against an exhaustive key-search attack. As it happens, DESX is no
more immune to other types of more sophisticated attacks, such as
differential or linear cryptanalysis, but brute-force is the primary attack
vector on DES.
*Closing Comments*
Although DES has been deprecated and replaced by the Advanced Encryption
Standard (AES) because of its vulnerability to a modestly-priced brute-force
attack, many applications continue to rely on DES for security, and many
software designers and implementers continue to include DES in new
applications. In some cases, use of DES is wholly appropriate but, in
general, DES should not continue to be promulgated in production software
and hardware. RFC 4772 <http://www.rfc-editor.org/rfc/rfc4772.txt> discusses
the security implications of employing DES.
5.5. Pretty Good Privacy (PGP)
Pretty Good Privacy (PGP) is one of today's most widely used public key
cryptography programs. Developed by Philip
Zimmermann<http://www.philzimmermann.com/>in the early 1990s and long
the subject of controversy, PGP is available as
a plug-in for many e-mail clients, such as Claris Emailer, Microsoft
Outlook/Outlook Express, and Qualcomm Eudora.
PGP can be used to sign or encrypt e-mail messages with the mere click of
the mouse. Depending upon the version of PGP, the software uses SHA or MD5
for calculating the message hash; CAST, Triple-DES, or IDEA for encryption;
and RSA or DSS/Diffie-Hellman for key exchange and digital signatures.
When PGP is first installed, the user has to create a key-pair. One key, the
public key, can be advertised and widely circulated. The private key is
protected by use of a *passphrase*. The passphrase has to be entered every
time the user accesses their private key.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi Carol.
What was that pithy Groucho Marx quote?
/kess
-----BEGIN PGP SIGNATURE-----
Version: PGP for Personal Privacy 5.0
Charset: noconv
iQA/AwUBNFUdO5WOcz5SFtuEEQJx/ACaAgR97+vvDU6XWELV/GANjAAgBtUAnjG3
Sdfw2JgmZIOLNjFe7jP0Y8/M
=jUAU
-----END PGP SIGNATURE-----
FIGURE 7: A PGP signed message. The sender uses their private key; at the
destination, the sender's e-mail address yields the public key from the
receiver's keyring.
Figure 7 shows a PGP signed message. This message will not be kept secret
from an eavesdropper, but a recipient can be assured that the message has
not been altered from what the sender transmitted. In this instance, the
sender signs the message using their own private key. The receiver uses the
sender's public key to verify the signature; the public key is taken from
the receiver's keyring based on the sender's e-mail address. Note that the
signature process does not work unless the sender's public key is on the
receiver's keyring.
-----BEGIN PGP MESSAGE-----
Version: PGP for Personal Privacy 5.0
MessageID: DAdVB3wzpBr3YRunZwYvhK5gBKBXOb/m
qANQR1DBwU4D/TlT68XXuiUQCADfj2o4b4aFYBcWumA7hR1Wvz9rbv2BR6WbEUsy
ZBIEFtjyqCd96qF38sp9IQiJIKlNaZfx2GLRWikPZwchUXxB+AA5+lqsG/ELBvRa
c9XefaYpbbAZ6z6LkOQ+eE0XASe7aEEPfdxvZZT37dVyiyxuBBRYNLN8Bphdr2zv
z/9Ak4/OLnLiJRk05/2UNE5Z0a+3lcvITMmfGajvRhkXqocavPOKiin3hv7+Vx88
uLLem2/fQHZhGcQvkqZVqXx8SmNw5gzuvwjV1WHj9muDGBY0MkjiZIRI7azWnoU9
3KCnmpR60VO4rDRAS5uGl9fioSvze+q8XqxubaNsgdKkoD+tB/4u4c4tznLfw1L2
YBS+dzFDw5desMFSo7JkecAS4NB9jAu9K+f7PTAsesCBNETDd49BTOFFTWWavAfE
gLYcPrcn4s3EriUgvL3OzPR4P1chNu6sa3ZJkTBbriDoA3VpnqG3hxqfNyOlqAka
mJJuQ53Ob9ThaFH8YcE/VqUFdw+bQtrAJ6NpjIxi/x0FfOInhC/bBw7pDLXBFNaX
HdlLQRPQdrmnWskKznOSarxq4GjpRTQo4hpCRJJ5aU7tZO9HPTZXFG6iRIT0wa47
AR5nvkEKoIAjW5HaDKiJriuWLdtN4OXecWvxFsjR32ebz76U8aLpAK87GZEyTzBx
dV+lH0hwyT/y1cZQ/E5USePP4oKWF4uqquPee1OPeFMBo4CvuGyhZXD/18Ft/53Y
WIebvdiCqsOoabK3jEfdGExce63zDI0=
=MpRf
-----END PGP MESSAGE-----
FIGURE 8: A PGP encrypted message. The receiver's e-mail address is the
pointer to the public key in the sender's keyring. At the destination side,
the receiver uses their own private key.
Figure 8 shows a PGP encrypted message (PGP compresses the file, where
practical, prior to encryption because encrypted files lose their randomness
and, therefore, cannot be compressed). In this case, public key methods are
used to exchange the session key for the actual message encryption using
secret-key cryptography. In this case, the receiver's e-mail address is the
pointer to the public key in the sender's keyring; in fact, the same message
can be sent to multiple recipients and the message will not be significantly
longer since all that needs to be added is the session key encrypted by each
receiver's private key. When the message is received, the recipient must use
their private key to extract the session secret key to successfully decrypt
the message (Figure 9).
Hi Gary,
"Outside of a dog, a book is man's best friend.
Inside of a dog, it's too dark to read."
Carol
FIGURE 9: The decrypted message.
It is worth noting that PGP was one of the first so-called "hybrid
cryptosystems" that combined aspects of SKC and PKC. When Zimmermann was
first designing PGP in the late-1980s, he wanted to use RSA to encrypt the
entire message. The PCs of the days, however, suffered significant
performance degradation when executing RSA so he hit upon the idea of using
SKC to encrypt the message and PKC to encrypt the SKC key.
PGP went into a state of flux in 2002. Zimmermann sold PGP to Network
Associates, Inc. (NAI) in 1997 and himself resigned from NAI in early 2001.
In March 2002, NAI announced that they were dropping support for the
commercial version of PGP having failed to find a buyer for the product
willing to pay what NAI wanted. In August 2002, PGP was purchased from NAI
by PGP Corp. (*http://www.pgp.com/*). Meanwhile, there are many freeware
versions of PGP available through the International PGP
Page<http://www.pgpi.org/>and the OpenPGP
Alliance <http://openpgp.org/>. Also check out the GNU Privacy Guard
(GnuPG)<http://www.gnupg.org/>,
a GNU project implementation of OpenPGP (defined in RFC
2440<http://www.rfc-editor.org/rfc/rfc2440.txt>
).
5.6. IP Security (IPsec) Protocol
*NOTE:* The information in this section assumes that the reader is familiar
with the Internet Protocol (IP), at least to the extent of the packet format
and header contents. More information about IP can be found in * An Overview
of TCP/IP Protocols and the
Internet*<http://www.garykessler.net/library/tcpip.html>.
More information about IPv6 can be found in IPv6: The Next Generation
Internet Protocol <http://www.garykessler.net/library/ipv6_exp.html>.
The Internet and the TCP/IP protocol suite were not built with security in
mind. This statement is not meant as a criticism; the baseline UDP, TCP, IP,
and ICMP protocols were written in 1980 and built for the relatively closed
ARPANET community. TCP/IP wasn't designed for the commercial-grade financial
transactions that they now see nor for virtual private networks (VPNs) on
the Internet. To bring TCP/IP up to today's security necessities, the
Internet Engineering Task Force (IETF) formed the IP Security Protocol
Working Group <http://www.ietf.org/html.charters/ipsec-charter.html> which,
in turn, developed the IP Security (IPsec) protocol. IPsec is not a single
protocol, in fact, but a suite of protocols providing a mechanism to provide
data integrity, authentication, privacy, and nonrepudiation for the classic
Internet Protocol (IP). Although intended primarily for IP version 6 (IPv6),
IPsec can also be employed by the current version of IP, namely IP version 4
(IPv4).
As shown in Table
3<http://www.garykessler.net/library/crypto.html#tab03-ipsec>,
IPsec is described in nearly a dozen RFCs. RFC
4301<http://www.rfc-editor.org/rfc/rfc4301.txt>,
in particular, describes the overall IP security architecture and RFC
2411<http://www.rfc-editor.org/rfc/rfc2411.txt>provides an overview of
the IPsec protocol suite and the documents
describing it.
IPsec can provide either message authentication and/or encryption. The
latter requires more processing than the former, but will probably end up
being the preferred usage for applications such as VPNs and secure
electronic commerce.
Central to IPsec is the concept of a *security association (SA)*.
Authentication and confidentiality using AH or ESP use SAs and a primary
role of IPsec key exchange it to establish and maintain SAs. An SA is a
simplex (one-way or unidirectional) logical connection between two
communicating IP endpoints that provides security services to the traffic
carried by it using either AH or ESP procedures. The endpoint of an SA can
be an IP host or IP security gateway (e.g., a proxy server, VPN server,
etc.). Providing security to the more typical scenario of two-way
(bi-directional) communication between two endpoints requires the
establishment of two SAs (one in each direction).
An SA is uniquely identified by a 3-tuple composed of:
- Security Parameter Index (SPI), a 32-bit identifier of the connection
- IP Destination Address
- security protocol (AH or ESP) identifier
The IP Authentication Header (AH), described in RFC
4302<http://www.rfc-editor.org/rfc/rfc4302.txt>,
provides a mechanism for data integrity and data origin authentication for
IP packets using HMAC with MD5 (RFC
2403<http://www.rfc-editor.org/rfc/rfc2403.txt>),
HMAC with SHA-1 (RFC 2404 <http://www.rfc-editor.org/rfc/rfc2404.txt>), or
HMAC with RIPEMD (RFC 2857 <http://www.rfc-editor.org/rfc/rfc2857.txt>). See
also RFC 4305 <http://www.rfc-editor.org/rfc/rfc4305.txt>.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Header | Payload Len | RESERVED |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Security Parameters Index (SPI) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number Field |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ Integrity Check Value-ICV (variable) |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
FIGURE 10: IPsec Authentication Header format. *(From RFC 4302)*
Figure 10 shows the format of the IPsec AH. The AH is merely an additional
header in a packet, more or less representing another protocol layer above
IP (this is shown in Figure 12 below). Use of the IP AH is indicated by
placing the value 51 (0x33) in the IPv4 Protocol or IPv6 Next Header field
in the IP packet header. The AH follows mandatory IPv4/IPv6 header fields
and precedes higher layer protocol (e.g., TCP, UDP) information. The
contents of the AH are:
- *Next Header:* An 8-bit field that identifies the type of the next
payload after the Authentication Header.
- *Payload Length:* An 8-bit field that indicates the length of AH in
32-bit words (4-byte blocks), minus "2". [The rationale for this is somewhat
counter intuitive but technically important. All IPv6 extension headers
encode the header extension length (Hdr Ext Len) field by first subtracting
1 from the header length, which is measured in 64-bit words. Since AH was
originally developed for IPv6, it is an IPv6 extension header. Since its
length is measured in 32-bit words, however, the Payload Length is
calculated by subtracting 2 (32 bit words) to maintain consistency with IPv6
coding rules.] In the default case, the three 32-bit word fixed portion of
the AH is followed by a 96-bit authentication value, so the Payload Length
field value would be 4.
- *Reserved:* This 16-bit field is reserved for future use and always
filled with zeros.
- *Security Parameters Index (SPI):* An arbitrary 32-bit value that, in
combination with the destination IP address and security protocol, uniquely
identifies the Security Association for this datagram. The value 0 is
reserved for local, implementation-specific uses and values between 1-255
are reserved by the Internet Assigned Numbers Authority (IANA) for future
use.
- *Sequence Number:* A 32-bit field containing a sequence number for each
datagram; initially set to 0 at the establishment of an SA. AH uses sequence
numbers as an anti-replay mechanism, to prevent a "person-in-the-middle"
attack. If anti-replay is enabled (the default), the transmitted Sequence
Number is never allowed to cycle back to 0; therefore, the sequence number
must be reset to 0 by establishing a new SA prior to the transmission of the
232nd packet.
- *Authentication Data:* A variable-length, 32-bit aligned field
containing the Integrity Check Value (ICV) for this packet (default length =
96 bits). The ICV is computed using the authentication algorithm specified
by the SA, such as DES, MD5, or SHA-1. Other algorithms *may* also be
supported.
The IP Encapsulating Security Payload (ESP), described in RFC
4303<http://www.rfc-editor.org/rfc/rfc4303.