Public Key Distribution -- a White Paper for ESnet

DRAFT 9/95

Michael Graff, Joe Metzger and Stephen Elbert

ABSTRACT
--------

Secure communication channels will be a critical component of any distributed collaborative environment such as the Distributed Informatics, Computing and Collaborative Environment (DICCE) project for ESnet. The successful management and operation of such a system requires reliable authentication (I am who I say I am) and authorization (I have permission to do this) tools. Although public key technology has the potential to solve these problems, a scalable method for the distribution of public keys has yet to be resolved. This document proposes a protocol and implementation to address the key distribution issue.

BACKGROUND
----------

The use of public key technology to allow authenticated and secure communication is growing steadily. The arrival of the paperless office and its compelling economic advantages can only be fully realized with the use of digital signatures made possible by public key technology. There are many commercial and public domain systems in use, both within the United States and abroad, nearly all based upon the RSA algorithm. PEM (Privacy Enhanced Mail) and PGP (Pretty Good Privacy) are two examples, although PEM is actually a standard implemented by several programs and PGP is a specific program.

Use of this technology would be even more widespread if its use were transparent. One of the obstacles to transparent use is how encryption keys are managed and distributed.

When using any form of public key technology two difficulties must be overcome. The first is the obvious one: How does one distribute a public key so that others may easily find it? The second is much more important: Given a public key, how can a potential user be reasonably assured of the key's validity?

Key Validity Models

The validity issue is implementation dependent.
PEM uses a model in which a Certifying Authority (CA) puts an RSA signature on a PEM key, which assures anyone who verifies the signature that the key is authentic. PGP uses a much more general approach, where anyone can sign anyone's key. This is commonly referred to as a Web of Trust model.

The PEM method requires that every potential user trust all CAs, both to sign keys only after proof of identity and to keep their own RSA keys private. The PGP model is much more flexible: in effect, anyone can put some level of assurance on a key, and the end user decides how much assurance a key must carry before it is considered valid. PGP can use the CA model as well, simply by defining a given key (or set of keys) to be completely trusted.

Both of these methods have one serious failing: both require a user of the system to have a key stored locally for everyone the user wishes to communicate with. This is not scalable and is not an acceptable method for key distribution.

Current Key Servers

In 1992, a basic key server was written by Michael Graff and released to the world to allow PGP keys to be exchanged using electronic mail, and later by various other methods. This version is currently in widespread use and contains 13,000 PGP keys from all over the world. Currently, about 15 of these servers are interconnected and available for general public use.

This first generation server was adequate at one time, but it has some major shortcomings in terms of scalability, speed, and most importantly, the inherent dependency on PGP. Each server maintains a list of all keys and relies on PGP itself to manage a single key ring. A single key ring is, in PGP terms, one large file with no indexing for fast retrieval. Key retrieval speed ranges from slow to unbearable; interactive key retrieval is impossible.
Although a significant percentage of the keys contain useful information, there is no simple way to remove those keys which are no longer (or never were) valid. The proposed design addresses these limitations.

SCALABLE KEY DISTRIBUTION
-------------------------

It is impossible for every public key server to maintain a list of all available keys. The time required to synchronize the data and the cost of a machine powerful enough to serve all keys with reasonable speed are prohibitive. In the proposed design, each server maintains only a subset of all keys. The server is considered to be authoritative for the keys it stores and maintains. Each set of keys a given server stores is called a cell, which corresponds to an administrative domain, such as AMESLAB.GOV or ES.NET.

Storage Method

Because of a demand for speed, a full relational database engine should be used to store retrieval information for keys. The PGP key servers in operation today use PGP to manage one large public key ring. PGP uses a straightforward method for key retrieval -- reading a huge file from beginning to end. This causes the current servers to be slow to respond to key retrievals.

The public key server architecture should also be flexible, allowing any type of public encryption key to be archived for public retrieval. One server should be able to maintain and distribute PEM, PGP, or any other key type with ease.

Trust

Each server maintains a list of keys for a cell for which it is the authoritative source. To become an authoritative source, each server must be sanctioned by a central authority and added to a master list of public servers. No primary server should trust a server from another cell without server administrator intervention. This allows for a simple trust model for keys, much like that of the Domain Name System (DNS) used to distribute Internet machine names today.

NEW DESIGN
----------

The new design is a scalable, fast, and reliable method to distribute public keys.
The servers themselves do not provide authentication for the keys stored within the database; this requires any public key system to employ its own method (CA or Web of Trust) to validate a key.

Scalability

Each cell can contain any number of servers, all of which contain the same information and are considered authoritative for the cell. If one of these servers fails for any reason, another can be contacted and used. This addresses both the reliability and scalability issues by providing redundant, functionally equivalent services on multiple machines.

Improved Speed

By using a database-centric storage method, a key can be quickly located within the key database. Because retrievals of keys are the most common operation performed by a user, adding new keys and updating old ones are allowed to be slower than retrievals. The initial version will use the Postgres95 Database Management System, which is freely available from the University of California at Berkeley. Subsequent versions will use commercial-grade SQL-based systems from Oracle and Sybase. Adaptation to other, site-preferred databases should be straightforward as long as they are SQL based.

Retrieval

Initially, a simple program could be written to automatically locate and retrieve a public key. Eventually this functionality could be integrated within the public key system itself, allowing transparent retrieval of any key.

IMPLEMENTATION ISSUES
---------------------

Each PEM signed message contains enough information to locate the correct key server and retrieve the correct key. Other methods are not so well designed. PGP does not provide enough information to locate a key, so a key server design must provide this function.

Also, some methods allow for multiple identities to be associated with a given public key. This allows, for example, michael@es.net and graff@ameslab.gov to be associated with the same key. In a distributed system, this is not a simple issue, and needs to be researched.
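
As a rough illustration of the database-centric storage and the multiple-identity question raised above, the following sketch uses Python's built-in sqlite3 module as a stand-in for an SQL database such as Postgres95. The table names, column names, and helper functions are hypothetical and are not part of the proposed design, and the key material is a placeholder string. It covers only a single server; how identities that span several cells should be handled remains the open question noted above.

```python
import sqlite3

# Hypothetical schema: one row per key, one row per identity bound to a key.
# The composite primary key on identity gives an index led by address,
# so a lookup by address does not scan every stored key.
SCHEMA = """
CREATE TABLE pubkey (
    key_id   TEXT PRIMARY KEY,
    key_type TEXT NOT NULL,
    material TEXT NOT NULL
);
CREATE TABLE identity (
    address TEXT NOT NULL,
    key_id  TEXT NOT NULL REFERENCES pubkey(key_id),
    PRIMARY KEY (address, key_id)
);
"""


def open_store():
    """Create an empty in-memory key store (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_key(conn, key_id, key_type, material, addresses):
    """Store one key and bind every given address to it."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO pubkey VALUES (?, ?, ?)",
            (key_id, key_type, material),
        )
        conn.executemany(
            "INSERT INTO identity VALUES (?, ?)",
            [(address.lower(), key_id) for address in addresses],
        )


def find_keys(conn, address):
    """Return (key_id, key_type, material) rows bound to an address."""
    rows = conn.execute(
        "SELECT p.key_id, p.key_type, p.material "
        "FROM identity AS i JOIN pubkey AS p ON p.key_id = i.key_id "
        "WHERE i.address = ? ORDER BY p.key_id",
        (address.lower(),),
    )
    return rows.fetchall()


if __name__ == "__main__":
    store = open_store()
    add_key(store, "K1", "PGP", "<public key data>",
            ["michael@es.net", "graff@ameslab.gov"])
    # Both identities resolve to the same stored key.
    print(find_keys(store, "michael@es.net"))
    print(find_keys(store, "graff@ameslab.gov"))
```

Keeping identities in their own table lets one key carry several addresses without duplicating the key material, and the key_type column lets PEM, PGP, or other key types share one store.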
PRELIMINARY DESIGN
------------------

Using DNS as a model, a system was designed which would meet the requirements for a distributed key server. Its components are listed below, with a functional description of each. A cell comprises two types of servers, called primary and secondary. Also, in order to maintain a key server system, a root server is used.

Root Servers

Root servers are infrastructure only. They maintain lists such as which hosts are authoritative for a given cell and how to contact them. They may also perform other functions depending on a public key system's requirements, such as compensating for the design flaw in PGP, which does not provide enough information to locate a key easily.

Primary Servers

Primary servers are the machines which perform key retrieval, updates, and additions. They should be fast and reliable, and they may be replicated to distribute load and to maintain reliability. They perform other functions as well, such as providing a list of machines for a given cell to a client. This list helps a client find the correct set of machines for retrieving a key.

Secondary Servers

Secondary servers exist to help balance retrieval load only. They may not modify or add new keys for a given cell. This allows a secondary to run on smaller machines than would be required for a primary. A cell can contain any number of secondaries to help balance load. A client should have a list of local-cell servers and choose one at random when requesting a key.

Secondary servers are not needed for a cell to operate. They exist only because they are less resource intensive and may be run on smaller machines.

Client

A client is the user (or program) requesting, adding, or updating a public key. The client's duties should be as simple as possible, ideally nothing more than contacting a server, asking for a key, and getting either the key or another server or cell to try.
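
The client behaviour described above (choose a local-cell server at random, ask for a key, and receive either the key or another cell to try) might be sketched as follows. This is a minimal in-memory illustration, not a wire protocol. The Server class, its lookup method, the reply tuples, the shared cell directory standing in for root-server information, and the rule that a cell is taken from the domain part of an address are all assumptions made for the example.

```python
import random


class Server:
    """Hypothetical in-memory stand-in for a primary or secondary server."""

    def __init__(self, cell, keys, cell_directory):
        self.cell = cell                      # cell this server is authoritative for
        self.keys = keys                      # address -> key material
        self.cell_directory = cell_directory  # cell name -> list of Server objects

    def lookup(self, address):
        """Answer with the key, a referral to another cell, or not_found."""
        address = address.lower()
        cell = address.rsplit("@", 1)[-1]
        if cell == self.cell:
            if address in self.keys:
                return ("key", self.keys[address])
            return ("not_found", None)
        if cell in self.cell_directory:
            return ("referral", cell)
        return ("not_found", None)


def fetch_key(address, local_servers, max_referrals=3, rng=random):
    """Client side: ask a randomly chosen server and follow referrals."""
    servers = local_servers
    for _ in range(max_referrals + 1):
        server = rng.choice(servers)  # spread load across equivalent servers
        status, value = server.lookup(address)
        if status == "key":
            return value
        if status != "referral":
            return None
        # Retry against the servers listed for the referred cell.
        servers = server.cell_directory[value]
    return None


if __name__ == "__main__":
    directory = {}
    es = Server("es.net", {"michael@es.net": "<key A>"}, directory)
    ames = Server("ameslab.gov", {"graff@ameslab.gov": "<key B>"}, directory)
    directory["es.net"] = [es]
    directory["ameslab.gov"] = [ames]
    # A client in the es.net cell is referred to ameslab.gov for this key.
    print(fetch_key("graff@ameslab.gov", [es]))
```

The referral limit keeps a misconfigured directory from sending the client around in circles, and the client itself needs no knowledge beyond its list of local-cell servers.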
Relevance to X.509

We are currently investigating the feasibility and desirability of including X.509 Directory - Authentication Framework functionality in the server.

IMPLEMENTATION TIMETABLE
------------------------

It is estimated that a basic server system can be implemented in about one FTE-month. This basic server will allow proof of concept testing and will allow many technical issues to be discovered.

In approximately two more FTE-months, a small-scale distributed server system can be tested, with replicated servers and root servers added to the system. This system can be tested and installed at multiple test sites and environments.

Within eight FTE-months, the server system can be installed as a beta test with a wider test base. The design should be set at this time, and while it should be flexible enough to allow future modifications, only problems should be addressed and no new features added to the server system unless absolutely necessary.

Within one FTE-year, a full server system can be deployed within the ESnet community, allowing authentic and secure communication within and between sites.