Public Key Distribution -- a White Paper for ESnet

DRAFT  9/95

Michael Graff, Joe Metzger and Stephen Elbert


ABSTRACT
--------
Secure communication channels will be a critical component of any
distributed collaborative environment such as the Distributed Informatics,
Computing and Collaborative Environment (DICCE) project for ESnet. The
successful management and operation of such a system requires reliable
authentication (I am who I say I am) and authorization (I have permission
to do this) tools. Although public key technology has the potential to
solve these problems, a scalable method for distributing public keys has
yet to be established. This document proposes a protocol and
implementation to address the key distribution issue.

BACKGROUND
----------
The use of public key technology to allow authenticated and secure
communication is growing steadily. The arrival of the paperless office and
its compelling economic advantages can only be fully realized with the use
of digital signatures made possible by public key technology. There are
many commercial and public domain systems in use, both within the United
States and abroad, nearly all based upon the RSA algorithm. PEM (Privacy
Enhanced Mail) and PGP (Pretty Good Privacy) are two examples, although PEM
is actually a standard implemented by several programs and PGP is a
specific program. This technology would be even more widely used if it
were transparent to the user. One of the obstacles to transparent use is
how encryption keys are managed and distributed.

When using any form of public key technology two difficulties must be
overcome. The first is the obvious one: How does one distribute a public
key so that others may easily find it? The second is much more important:
Given a public key, how can a potential user be reasonably assured of the
key's validity?

Key Validity Models

The validity issue is implementation dependent. PEM uses a model in which a
Certifying Authority (CA) puts an RSA signature on a PEM key, which gives
assurance to anyone who verifies the key that it is indeed an authentic
key. PGP uses a much more general approach, where anyone can sign anyone's
key. This is commonly referred to as a Web of Trust model.

The PEM method requires that every potential user trust all CAs, both to
sign keys only after proof of identity and to maintain the privacy of
their own RSA keys. The PGP model is much more flexible: in effect, anyone
can put some level of assurance on a key, and the end user decides how
well attested a key must be before it is considered valid. PGP can use the
CA model as well, simply by defining a given key (or set of keys) to be
completely trusted.

Both of these methods share one serious failing: each requires a user of
the system to have a key stored locally for everyone the user wishes to
communicate with. This does not scale and is not an acceptable method for
key distribution.

Current Key Servers

In 1992, Michael Graff wrote a basic key server and released it to the
world to allow PGP keys to be exchanged using electronic mail and, later,
various other methods. This version is currently in widespread use and
contains 13,000 PGP keys from all over the world.

Currently, about 15 of these servers are interconnected and available for
general public use. This first generation server was adequate at one time,
but it has some major shortcomings in terms of scalability, speed, and
most importantly, the inherent dependency on PGP. Each server maintains a
list of all keys and relies on PGP itself to manage a single key ring. A
single key ring is, in PGP terms, one large file with no indexing for fast
retrieval. Key retrieval speed ranges from slow to unbearable; interactive
key retrieval is impossible. Although a significant percentage of the keys
contain useful information, there is no simple way to remove those keys
which are no longer (or never were) valid. The proposed design addresses
these limitations.

SCALABLE KEY DISTRIBUTION
-------------------------
It is impossible for every public key server to maintain a list of all
available keys. The time required to synchronize the data and the cost of
a machine powerful enough to serve all keys with reasonable speed are
prohibitive. In the proposed design, each server maintains only a subset
of all keys. The server is considered to be authoritative for the keys it
stores and maintains. Each set of keys a given server stores is called a
cell, which is an administrative domain, such as AMESLAB.GOV or ES.NET.

Storage Method

Because speed is essential, a full relational database engine should be
used to store retrieval information for keys. The PGP key servers in
operation today use PGP to manage one large public key ring. PGP uses a
straightforward method for key retrieval -- reading a huge file from
beginning to end. This makes the current servers slow to respond to key
retrievals.

The public key server architecture should also be flexible, allowing any
type of public encryption key to be archived for public retrieval. One
server should be able to maintain and distribute PEM, PGP, or any other key
type with ease.
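
As a rough illustration of the storage idea, the following Python sketch
keeps keys of several types in one indexed table. It uses the sqlite3
module only as a convenient stand-in for a full relational engine; the
table name, column names, and placeholder key strings are assumptions made
for this example and are not part of the proposed design.

```python
import sqlite3

# Hypothetical schema: one row per stored key, tagged with its key type so
# that PGP, PEM, or other key formats can share a single table. The
# composite primary key gives the database an index for fast lookups.
SCHEMA = """
CREATE TABLE public_keys (
    key_type TEXT NOT NULL,   -- for example 'PGP' or 'PEM'
    identity TEXT NOT NULL,   -- for example an electronic mail address
    key_data TEXT NOT NULL,   -- encoded key material, stored opaquely
    PRIMARY KEY (key_type, identity)
);
"""


def open_store():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_key(conn, key_type, identity, key_data):
    """Add a key, replacing any existing key of that type and identity."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO public_keys VALUES (?, ?, ?)",
            (key_type, identity, key_data),
        )


def find_key(conn, key_type, identity):
    """Return the stored key data, or None if no such key exists."""
    row = conn.execute(
        "SELECT key_data FROM public_keys"
        " WHERE key_type = ? AND identity = ?",
        (key_type, identity),
    ).fetchone()
    return row[0] if row else None
```

A retrieval here is an indexed lookup rather than a scan of one large
file. Note that INSERT OR REPLACE is SQLite syntax; another SQL database
would need its own form of the same operation.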

Trust

Each server maintains a list of keys for a cell for which it is the
authoritative source. To become an authoritative source, each server must
be sanctioned by a central authority, and added to a master list of public
servers.

No primary server should trust a server from another cell without server
administrator intervention. This allows for a simple trust model for keys,
much like in the Domain Name System (DNS) used to distribute Internet
machine names today.

NEW DESIGN
----------
The new design is a scalable, fast, and reliable method to distribute
public keys. The servers themselves do not provide authentication for the
keys stored within the database; this requires any public key system to
employ its own method (CA or Web of Trust) to validate a key.

Scalability

Each cell can contain any number of servers, all of which contain the same
information and are considered authoritative for the cell. If one of these
servers fails for any reason, another can be contacted and used. This
addresses both the reliability and scalability issues by providing
redundant functionally equivalent services on multiple machines.
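
The failover behavior described above might look like the following
Python sketch from a client's point of view. The exception type, the
function names, and the demo transport are all hypothetical; a real client
would make a network request where the stand-in fetch function is called.

```python
class ServerUnavailable(Exception):
    """Raised when a server cannot be reached (hypothetical error type)."""


def fetch_with_failover(servers, identity, fetch):
    """Ask each authoritative server in turn and return the first answer.

    `fetch(server, identity)` is supplied by the caller. It returns the
    key, or raises ServerUnavailable if the server does not respond.
    """
    last_error = None
    for server in servers:
        try:
            return fetch(server, identity)
        except ServerUnavailable as err:
            last_error = err  # remember the failure, try the next replica
    raise ServerUnavailable(f"no server in the cell answered: {last_error}")


def make_demo_fetch(down_hosts):
    """Build a stand-in transport in which `down_hosts` never answer."""
    def fetch(server, identity):
        if server in down_hosts:
            raise ServerUnavailable(server)
        return f"key for {identity} from {server}"
    return fetch
```

Because every server in a cell holds the same information, the client
does not need to care which replica finally answers.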

Improved Speed

By using a database-centric storage method, a key can be quickly located
within the key database. Because retrievals of keys are the most common
operation performed by a user, adding new keys and updating old ones are
allowed to be slower than retrievals. The initial version will use the
Postgres95 Database Management System, which is freely available from the
University of California at Berkeley. Subsequent versions will use
commercial-grade SQL-based systems from Oracle and Sybase. Adaptation to
other, site-preferred databases should be straightforward as long as they
are SQL-based.

Retrieval

Initially, a simple program could be written to automatically locate and
retrieve a public key. Eventually this functionality could be integrated
within the public key system itself, allowing transparent retrieval of any
key.

IMPLEMENTATION ISSUES
---------------------
Each PEM signed message contains enough information to locate the correct
key server and retrieve the correct key. Other methods are not so well
designed. PGP does not provide enough information to locate a key, so a key
server design must provide this function.

Also, some methods allow for multiple identities to be associated with a
given public key. This allows, for example, michael@es.net and
graff@ameslab.gov to be associated with the same key. In a distributed
system, this is not a simple issue, and needs to be researched.

PRELIMINARY DESIGN
------------------
Using DNS as a model, a system was designed to meet the requirements for a
distributed key server. Its components are listed below, with a functional
description of each.

A cell is composed of two types of servers, called primary and secondary.
In addition, a root server is used to maintain the key server system as a
whole.

Root Servers

Root servers are infrastructure only. They maintain lists such as which
hosts are authoritative for a given cell and how to contact them. They may
also perform other functions depending on a public key system's
requirements, such as compensating for the design flaw in PGP, which does
not provide enough information to locate a key easily.
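
A minimal Python sketch of the list a root server might maintain follows.
The host names are invented, and the rule that a cell is simply the domain
part of a mail-style identity is an assumption made for illustration; the
design does not yet specify how an identity maps to a cell.

```python
# Hypothetical root-server table: cell name -> hosts authoritative for it.
# The cell names come from the examples in this paper; the host names are
# invented for this sketch.
ROOT_DIRECTORY = {
    "ES.NET": ["keys1.es.net", "keys2.es.net"],
    "AMESLAB.GOV": ["keys.ameslab.gov"],
}


def cell_for_identity(identity):
    """Assume the cell is the domain part of a mail-style identity."""
    _, _, domain = identity.rpartition("@")
    return domain.upper()


def authoritative_hosts(identity, directory=ROOT_DIRECTORY):
    """Return the hosts for the identity's cell, or [] if it is unknown."""
    return list(directory.get(cell_for_identity(identity), []))
```

With this table, authoritative_hosts("graff@ameslab.gov") returns the
single AMESLAB.GOV host, and an identity in an unlisted cell returns an
empty list.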

Primary Servers

Primary servers are the machines which perform key retrieval, updates, and
additions. They should be fast and reliable, and they may be replicated to
distribute load and to maintain reliability. They perform other functions
as well, such as providing a list of machines for a given cell to a client.
This list helps a client find the correct set of machines for retrieving a
key.

Secondary Servers

Secondary servers exist to help balance retrieval load only. They may not
modify or add new keys for a given cell. This allows a secondary to run on
smaller machines than would be required for a primary. A cell can contain
any number of secondaries to help balance load. A client should have a list
of local-cell servers and choose one at random when requesting a key.
Secondary servers are not needed for a cell to operate. They exist only
because they are less resource intensive and may be run on smaller
machines.
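
The division of labor between primary and secondary servers could be
expressed on the client side roughly as in the Python sketch below. The
server list, the host names, and the operation names are assumptions made
for this example.

```python
import random

# Hypothetical description of one cell; the host names are invented.
CELL_SERVERS = [
    {"host": "primary1.es.net", "role": "primary"},
    {"host": "secondary1.es.net", "role": "secondary"},
    {"host": "secondary2.es.net", "role": "secondary"},
]


def choose_server(servers, operation, rng=random):
    """Pick a server at random from those allowed to perform `operation`.

    Retrievals may go to any server in the cell. Additions and updates
    may go only to primaries, since secondaries cannot modify keys.
    """
    if operation == "retrieve":
        candidates = servers
    elif operation in ("add", "update"):
        candidates = [s for s in servers if s["role"] == "primary"]
    else:
        raise ValueError(f"unknown operation: {operation}")
    if not candidates:
        raise LookupError("no server in this cell can perform " + operation)
    return rng.choice(candidates)["host"]
```

Choosing at random spreads retrieval load across the cell without any
coordination between clients.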

Client

A client is the user (or program) requesting, adding, or updating a public
key. The client's duties should be as simple as possible, ideally nothing
more than contacting a server, asking for a key, and getting either the key
or another server or cell to try.
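
The client behavior just described, asking a server and receiving either
the key or somewhere else to try, can be sketched in Python as follows.
The servers are simulated in memory, and the reply shapes, host names,
placeholder keys, and hop limit are all hypothetical; no wire protocol has
been defined yet.

```python
# Hypothetical in-memory stand-in for a network of key servers. Each
# server answers with ("key", data), ("referral", next_server), or
# ("unknown", None).
SERVERS = {
    "keys.ameslab.gov": {
        "keys": {"graff@ameslab.gov": "KEY-DATA-GRAFF"},
        "referrals": {"es.net": "keys.es.net"},
    },
    "keys.es.net": {
        "keys": {"michael@es.net": "KEY-DATA-MICHAEL"},
        "referrals": {},
    },
}


def ask(server, identity):
    """Simulate one request to one server."""
    entry = SERVERS[server]
    if identity in entry["keys"]:
        return ("key", entry["keys"][identity])
    domain = identity.rpartition("@")[2].lower()
    if domain in entry["referrals"]:
        return ("referral", entry["referrals"][domain])
    return ("unknown", None)


def resolve(identity, start_server, max_hops=5):
    """Follow referrals until a key is found or the hop limit is reached."""
    server = start_server
    for _ in range(max_hops):
        kind, value = ask(server, identity)
        if kind == "key":
            return value
        if kind != "referral":
            return None  # the server knows neither the key nor a referral
        server = value   # try the server we were referred to
    return None
```

The hop limit guards against referral loops, in the same spirit as a DNS
resolver that bounds the number of referrals it will follow.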

Relevance to X.509

We are currently investigating the feasibility and desirability of
including X.509 Directory - Authentication Framework functionality in the
server.

IMPLEMENTATION TIMETABLE
------------------------
It is estimated that a basic server system can be implemented in about
one FTE-month. This basic server will allow proof of concept testing and
many technical issues to be discovered.

In approximately two more FTE-months, a small-scale distributed server
system can be tested, with replicated servers and root servers added to the
system. This system can be tested and installed at multiple test sites and
environments.

Within eight FTE-months, the server system can be installed for beta
testing with a wider test base. The design should be set at this
time, and while it should be flexible enough to allow future modifications,
only problems should be addressed and no new features added to the server
system unless absolutely necessary.

Within one FTE-year, a full server system can be deployed within the ESnet
community, allowing authentic and secure communication within and between
sites.