The Data Encryption Standard (DES)

The basic concept of encryption is shown in Figure 7.3. The data that is to be kept secret, X, is input to an encryption process, which performs a mathematical transformation and creates an encrypted set of data, Y. The encrypted data has the same number of bits as the original data. It appears to be a random collection of bits, but a corresponding decryption process reverses the transformation and regenerates the original data, X. These two processes, encryption and decryption, can be implemented in software, as computer programs, or in special-purpose hardware.

The commonly used methods of encryption are controlled by a pair of numbers, known as keys. One key is used for encryption, the other for decryption. The methods of encryption vary in the choice of processes and in the way the keys are selected. The mathematical form of the processes is not secret. The security lies in the keys. A key is a string of bits, typically from 40 to 120 bits or more. Long keys are intrinsically much more secure than short keys, since any attempt to violate security by guessing keys becomes twice as difficult for every bit added to the key length.
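The doubling effect of key length can be checked with a few lines of arithmetic. The following Python sketch is illustrative only; the function names are hypothetical and the key lengths are simply sample values in the range mentioned above.

```python
def keyspace(bits: int) -> int:
    """Number of distinct keys that can be written with the given number of bits."""
    return 2 ** bits


def average_guesses(bits: int) -> int:
    """On average, a guessing attack must try half of all possible keys."""
    return keyspace(bits) // 2


# Sample key lengths, chosen only for illustration.
for bits in (40, 41, 56, 120):
    print(f"{bits:>3} bits: {keyspace(bits)} possible keys")

# Adding a single bit doubles the work of an exhaustive search.
assert keyspace(41) == 2 * keyspace(40)
```

Going from 40 to 120 bits therefore multiplies the number of possible keys by 2 to the power 80, which is why long keys are so much harder to guess.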

Historically, the use of encryption has been restricted by computer power. The methods all require considerable computation to scramble and unscramble data. Early implementations of DES, the method described in Panel 7.4, required special hardware to be added to every computer. With today's fast computers, this is much less of a problem, but the time to encrypt and decrypt large amounts of data is still noticeable. The methods are excellent for encrypting short messages, such as passwords, or occasional highly confidential messages, but they are less suitable for large amounts of data where response times are important.

Private key encryption

Private key encryption is a family of methods in which the key used to encrypt the data and the key used to decrypt the data are the same, and must be kept secret.

Private key encryption is also known as single key or secret key encryption. Panel 7.4 describes DES, one of the most commonly used methods.

Panel 7.4

Private key encryption is only as secure as the procedures that are used to keep the key secret. If one computer wants to send encrypted data to a remote computer it must find a completely secure way to get the key to the remote computer. Thus private key encryption is most widely used in applications where trusted services are exchanging information.
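The defining property of private key encryption, that one shared key both encrypts and decrypts, can be illustrated with a toy cipher. The Python sketch below is not DES and is not secure; it assumes a simple scheme in which a keystream is derived from the key with SHA-256 and combined with the data by exclusive-or. The function names are hypothetical.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


shared_key = b"key known to both computers"
x = b"data that is to be kept secret"

y = xor_cipher(shared_key, x)          # encryption
recovered = xor_cipher(shared_key, y)  # decryption with the same key

assert len(y) == len(x)   # encrypted data has the same number of bits
assert recovered == x     # the process is reversible
```

Both computers must hold `shared_key`, which is exactly the distribution problem described above: anyone who obtains the key can run the same function and read the data.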

Dual key encryption

When private key encryption is used over a network, the sending computer and the destination must both know the key. This poses the problem of how to get started if one computer cannot pass a key secretly to another. Dual key encryption permits all information to be transmitted over a network, including the public keys, which can be transmitted completely openly. For this reason, it has the alternate name of public key encryption. Even if every message is intercepted, the encrypted information is still kept secret.

The RSA method is the best known method of dual key encryption. It requires a pair of keys. The first key is made public; the second is kept secret. If an individual, A, wishes to send encrypted data to a second individual, B, then the data is encrypted using the public key of B. When B receives the data it can be decrypted, using the private key, which only B knows.
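The exchange between A and B can be sketched with textbook RSA arithmetic. The Python example below is a minimal illustration under stated assumptions: the primes are tiny, the message is a single small integer, and there is no padding, so it shows only the roles of the two keys and is in no way a secure implementation. The variable and function names are hypothetical. The modular inverse via `pow(e, -1, phi)` requires Python 3.8 or later.

```python
# B generates a key pair. Real keys use primes hundreds of digits long.
p, q = 61, 53
n = p * q                      # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent, known only to B

public_key_b = (n, e)          # published openly
private_key_b = (n, d)         # kept secret by B


def encrypt(message: int, public_key: tuple) -> int:
    """A encrypts with B's public key."""
    modulus, exponent = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key: tuple) -> int:
    """B decrypts with the private key."""
    modulus, exponent = private_key
    return pow(ciphertext, exponent, modulus)


x = 65                                  # message, must be smaller than n
y = encrypt(x, public_key_b)            # sent over the open network
assert decrypt(y, private_key_b) == x   # only B can recover x
```

Nothing secret ever crosses the network: A needs only `public_key_b`, and an eavesdropper who sees both the public key and `y` still lacks `d`.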

This dual key system of encryption has many advantages and one major problem. The problem is to make sure that a key is genuinely the public key of a specific individual.

The normal approach is to have all keys generated and authenticated by a trusted authority, called a certification authority. The certification authority generates certificates, which are signed messages specifying an individual and a public key.

This works well, so long as security at the certification authority is never violated.

Digital signatures

Digital signatures are used to check that a computer file has not been altered. Digital signatures are based on the concept of a hash function. A hash is a mathematical function that can be applied to the bytes of a computer file to generate a fixed-length number. One commonly used hash function is called MD5. The MD5 function can be applied to any length computer file. It carries out a special transformation on the bits of the file and ends up with an apparently random 128 bits.

If two files differ by as little as one bit, their MD5 hashes will be completely different.

Conversely, if two files have the same hash, there is an infinitesimal probability that they are not identical. Thus a simple test for whether a file has been altered is to calculate the MD5 hash when the file is created; at a later time, to check that no changes have taken place, recalculate the hash and compare it with the original. If the two are the same then the files are almost certainly the same.
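The test described above can be sketched with Python's standard `hashlib` module. The sample data and function names are illustrative assumptions; MD5 is used only because it is the function named in the text, and a stronger function such as SHA-256 can be substituted by changing one call.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the 128-bit MD5 hash of the data as 32 hexadecimal digits."""
    return hashlib.md5(data).hexdigest()


def is_unaltered(data: bytes, recorded_hash: str) -> bool:
    """Recalculate the hash and compare it with the one recorded earlier."""
    return md5_hex(data) == recorded_hash


# When the file is created, record its hash.
original = b"The quick brown fox jumps over the lazy dog"
recorded = md5_hex(original)

# Later, simulate an alteration by flipping a single bit of the first byte.
altered = bytearray(original)
altered[0] ^= 0x01

print(recorded)
print(md5_hex(bytes(altered)))
print(is_unaltered(original, recorded))        # True
print(is_unaltered(bytes(altered), recorded))  # False
```

In practice the bytes would be read from the file itself, and the recorded hash would be stored somewhere the file's custodian cannot silently change.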

The MD5 function has many strengths, including being fast to compute on large files, but, as with any security device, there is always a possibility that some bright person may discover how to reverse engineer the hash function, and find a way to create a file that has a specific hash value. At the time that this book was being written there were hints that MD5 may be vulnerable in this way. If so, other hash functions are available.

A hash value gives no information about who calculated it. A digital signature goes one step further towards guaranteeing the authenticity of a library object. When the hash value is calculated, it is encrypted using the private key of the owner of the material. This, together with the public key and the certification authority, creates a digital signature. Before checking the hash value, the digital signature is decrypted using the public key. If the hash results match, then the material is unaltered and it is known that the digital signature was generated using the corresponding private key.
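The signing and checking steps can be sketched by combining a hash with textbook RSA. The Python example below is a toy under stated assumptions: the two primes are small Mersenne primes chosen only so that the modulus is larger than a 128-bit MD5 hash, there is no padding, and the certificate that would vouch for the public key is omitted. All names are hypothetical. The modular inverse via `pow(e, -1, phi)` requires Python 3.8 or later.

```python
import hashlib

# Owner's toy key pair. Both primes are Mersenne primes; far too small for real use.
p = 2**61 - 1
q = 2**89 - 1
n = p * q                              # about 150 bits, larger than an MD5 hash
e = 65537                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent, kept by the owner


def file_hash(data: bytes) -> int:
    """MD5 hash of the data, interpreted as an integer below 2**128."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")


def sign(data: bytes) -> int:
    """Owner encrypts the hash with the private key."""
    return pow(file_hash(data), d, n)


def verify(data: bytes, signature: int) -> bool:
    """Anyone decrypts the signature with the public key and compares hashes."""
    return pow(signature, e, n) == file_hash(data)


material = b"contents of a library object"
signature = sign(material)

print(verify(material, signature))                # True: unaltered
print(verify(material + b" edited", signature))   # False: bits have changed
```

A successful check shows two things at once: the bits are unchanged, and the signature was produced by whoever holds the private key matching the public key used in `verify`.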

Digital signatures have a problem. While users of a digital library want to be confident that material is unaltered, they are not concerned with bits; their interest lies in the content. For example, the Copyright Office pays great attention to the intellectual content, such as the words in text, but does not care that a computer system may have attached some control information to a file, or that the font used for the text has been changed, yet the test of a digital signature fails completely when one bit is changed. As yet, nobody has suggested an effective way to ensure authenticity of content, rather than bits.

Deployment of public key encryption

Since the basic mathematics of public key encryption are now almost twenty years old, it might be expected that products based on the methods would have been widely deployed for many years. Sadly this is not the case.

One reason for delay is that there are significant technical issues. Many of them concern the management of the keys, how they are generated, how private keys are stored, and what precautions can be taken if the agency that is creating the keys has a security break-in. However, the main problems are policy problems.

Patents are part of the difficulty. Chapter 6 discussed the problems that surround software patents. Public key encryption is one of the few areas where most computer scientists would agree that there were real inventions. These methods are not obvious and their inventors deserve the rewards that go with invention. Unfortunately, the patent holders and their agents have followed narrow licensing policies, which have restricted the creative research that typically builds on a breakthrough invention.

A more serious problem has been interference from U.S. government departments.

Agencies such as the CIA claim that encryption technology is a vital military secret and that exporting it would jeopardize the security of the United States. Police forces claim that public safety depends upon their ability to intercept and read any messages on the networks, when authorized by an appropriate warrant. The export argument is hard to defend when the methods are widely published overseas, and reputable companies in Europe and Japan are building products that incorporate them. The public safety argument is more complicated, but it is undercut by the simple fact that the American public trusts neither the technical competence of the agencies nor their administrative procedures. People want the ability to transmit confidential information without being monitored by the police.

The result of these policy problems has been to delay the deployment of the tools that are needed to build secure applications over the Internet. Progress is being made and, in a few years' time, we may be able to report success. At present it is a sorry story.

It is appropriate that this chapter ends with a topic in which the technical solution is held up by policy difficulties. This echoes a theme that recurs throughout digital libraries and is especially important in access management. People, technology, and administrative procedures are intimately linked. Successful digital libraries combine aspects of all three and do not rely solely on technology to solve human problems.
