Electronic mail (abbreviated "e-mail" or, more commonly, "email") is a store-and-forward method of composing, sending, storing, and receiving messages over electronic communication systems. The term "e-mail" (as a noun or verb) applies both to the Internet e-mail system based on the Simple Mail Transfer Protocol (SMTP) and to intranet systems allowing users within one organization to e-mail each other. These workgroup collaboration systems often use the Internet protocols for their internal e-mail service.
Origins of e-mail
E-mail predates the Internet; existing e-mail systems were a crucial tool in creating the Internet. The Compatible Time-Sharing System (CTSS) was begun at MIT in 1961. It allowed multiple users to log into the IBM 7094 from remote dial-up terminals, and to store files online on disk. This new ability encouraged users to share information in new ways. E-mail started in 1965 as a way for multiple users of a time-sharing mainframe computer to communicate. Although the exact history is murky, among the first systems to have such a facility were SDC's Q32 and MIT's CTSS.
E-mail was quickly extended to become network e-mail, allowing users to pass messages between different computers. Messages could be transferred between users on different computers by 1966, but it is possible that the SAGE system had something similar some time before.
The ARPANET computer network made a large contribution to the evolution of e-mail. There is one report which indicates experimental inter-system e-mail transfers on it shortly after its creation, in 1969. Ray Tomlinson initiated the use of the @ sign to separate the names of the user and their machine in 1971. The ARPANET significantly increased the popularity of e-mail, and it became the killer app of the ARPANET.
As the utility and advantages of e-mail on the ARPANET became more widely known, the popularity of e-mail increased, leading to demand from people who were not allowed access to the ARPANET. A number of protocols were developed to deliver e-mail among groups of time-sharing computers over alternative transmission systems, such as UUCP and IBM's VNET e-mail system.
Since not all computers or networks were directly inter-networked, e-mail addresses had to include the "route" of the message, that is, a path between the computer of the sender and the computer of the receivers. E-mail could be passed this way between a number of networks, including the ARPANET, BITNET, and NSFNET, as well as to hosts connected directly to other sites via UUCP.
The route was specified using so-called "bang path" addresses, which list the hops needed to get from some assumed-reachable location to the addressee. They are so called because each hop is separated from the next by a "bang" (an exclamation mark).
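As an illustration only, a bang path can be read as a list of relay hops followed by the user name, with exclamation marks between them. The sketch below uses a made-up path; the host names and the helper name `parse_bang_path` are hypothetical and do not come from any real UUCP map.

```python
def parse_bang_path(path: str) -> tuple[list[str], str]:
    """Split a UUCP-style bang path into its relay hops and the final user."""
    parts = path.split("!")
    # A usable path needs at least one hop plus a user, with no empty segments.
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"not a valid bang path: {path!r}")
    *hops, user = parts
    return hops, user


hops, user = parse_bang_path("hosta!hostb!hostc!user")
print(hops)  # ['hosta', 'hostb', 'hostc']
print(user)  # user
```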
The CCITT developed the X.400 standard in the 1980s to allow different e-mail systems to interoperate. At roughly the same time, the IETF developed a much simpler protocol called the Simple Mail Transfer Protocol (SMTP), which has become the de facto standard for e-mail transfer on the Internet. With the advent of widespread use of home personal computers connected to the Internet, interoperability via SMTP-based Internet e-mail has become a critical feature for all e-mail systems.
In 1969, US Air Force users were sending text messages by keypunching them onto cards, using one card for each 80-character line, and transmitting them as card decks from one computer to another. In 1973, David Woolley developed the "notes" bulletin board for the PLATO educational software system, and in 1974 Kim Mast extended that to a "personal notes" facility.
In the mid-1970s it was becoming apparent that as computers shrank to a size that would fit on a desktop, they might become valuable tools for increasing organizational productivity. The problem was that no one had the vaguest idea about how to use a networked system of computers and workstations productively within the office environment.
In 1976, the United States Air Force took the initiative and implemented a project called “The Paperless Laboratory”, at the Rome Air Development Center (RADC) – a research and development facility in Northern New York State.
Three members of RADC were selected for the project: Captain Kenneth Nocito, John McNamara, and Fred Dyer. Over the next year they studied organizational processes and workflow and then wrote a specification for a system, which they named LONEX (Laboratory-Office Networking Experiment).
LONEX would implement a network system of central computers linked to remote workstations with integrated tools to create text, graphic, form, and data documents and then give the user the ability to transmit these documents to others, singly or grouped as part of a “work package”. At the end of the implementation the LONEX Team would measure the productivity increase of each of the electronic tools against its manual equivalent.
During LONEX’s design phase, it became obvious that the most critical factor in whether computers in the office would spark a revolution was the ease and speed with which the creator of electronic documents could transmit them to others outside as well as within the organization.
Although the ability to transmit electronically created files existed with the File Transfer Protocol (FTP) and with the Simple Mail Transfer Protocol (SMTP), the LONEX team quickly discovered that the "average" office worker had neither the temperament nor, in many cases, the ability to type without errors the long strings of command-line instructions that were required.
To resolve this dilemma, the team specified a simple "front end" that became the application known today as Email: typing information into fields headed "To", "From", "Send", "Carbon Copy" (CC), "Blind Copy" (BC), "Reply", "Subject", and "Attach" would automatically trigger background processes to accomplish the transmission.
In 1978, the LONEX specification was released for bid and Bunker Ramo and Wooldridge won the contract. In the ensuing three years they wrote the LONEX software, implemented the hardware, and used existing telephone wires within and between the buildings of RADC to create the network for sending the electronic documents to other workstations within as well as (using a modem) outside of RADC.
Over the course of the experiment the Project Team managed system implementation, resolved problems, compiled data, and monitored usage of LONEX’s 44 integrated electronic tools, which were being used daily by the 1300 engineers, scientists, secretaries and staff at RADC.
During this time word of the impending computer revolution was also spreading rapidly throughout the country. By 1980 corporate executives began tasking their office managers to implement systems to improve their organization’s productivity. With no knowledge to draw upon, these managers quickly realized that they desperately needed help. As word spread about the far-reaching integrated system that was operational at RADC, the LONEX Program Office soon became besieged with requests for demonstrations.
In response to these requests, the LONEX team assembled a portable system consisting of a workstation, a large-screen projector and a dial-up modem. From 1981 through 1985 hundreds of demonstrations were given to packed rooms at organizations as well as business conferences throughout the United States and Canada.
By linking to LONEX at RADC, attendees were able to view, on a large screen, an operational system unlike anything they had ever seen – where in real time integrated text, graphics, forms, and data documents were being created. Then, using LONEX’s Email application, they witnessed these documents being transmitted and then received by other users anywhere in the world, almost instantly.
At the culmination of the LONEX implementation the independent firm of Booz Allen Hamilton was contracted to measure and document the increases in productivity of applications created on LONEX versus their manual equivalent. Not surprisingly, of the 44 LONEX applications measured, Email showed the greatest increase in productivity – which rose significantly as the distance between the document’s creator and recipient increased.
By the end of 1983, US Air Force users were using user names like email@example.com to send e-mail between a nationwide linkup of VAX computers. By 1984 these same users were using personal computers for the same purpose.
In 1982 the White House adopted a prototype e-mail system from IBM called the Professional Office System, or PROFs, for the National Security Council (NSC) staff. By April 1985, the system was fully operational within the NSC, with home terminals for principals on the staff. By November 1986 the rest of the White House came online, first with the PROFs system, and later (by the end of the 1980s) through a variety of systems including VAX A-1 ("All in One") and cc:Mail.
In 1991, the first e-mail from space was sent from aboard the Space Shuttle Atlantis, mission STS-43, using AppleLink running on a Macintosh Portable.
Modern Internet e-mail
How Internet e-mail works
The diagram above shows a typical sequence of events that takes place when Alice composes a message using her mail user agent (MUA). She types in, or selects from an address book, the e-mail address of her correspondent. She hits the "send" button.
- Her MUA formats the message in Internet e-mail format and uses the Simple Mail Transfer Protocol (SMTP) to send the message to the local mail transfer agent (MTA), in this case smtp.a.org, run by Alice's Internet Service Provider (ISP).
- The MTA looks at the destination address provided in the SMTP protocol (not from the message header), in this case bob@b.org. A modern Internet e-mail address is a string of the form localpart@domain, creating a Fully Qualified Domain Address (FQDA). The part before the @ sign is the local part of the address, often the username of the recipient, and the part after the @ sign is a domain name. The MTA looks up this domain name in the Domain Name System to find the mail exchange servers accepting messages for that domain.
- The DNS server for the b.org domain, ns.b.org, responds with an MX record listing the mail exchange servers for that domain, in this case mx.b.org, a server run by Bob's ISP.
- smtp.a.org sends the message using SMTP to mx.b.org, which delivers it to the mailbox of the user bob.
- Bob presses the "get mail" button in his MUA, which picks up the message using the Post Office Protocol (POP3).
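The routing decision in the steps above can be sketched in a few lines of Python. This is a toy model, not a mail server: the `MX_TABLE` dictionary stands in for a real DNS query, the function names are hypothetical, and the address bob@b.org is assembled from the user name and domain that appear in the steps.

```python
# Toy stand-in for the Domain Name System: domain -> mail exchange servers.
MX_TABLE = {"b.org": ["mx.b.org"]}


def split_address(address: str) -> tuple[str, str]:
    """Split an address into (local part, domain) at the last @ sign."""
    local_part, sep, domain = address.rpartition("@")
    if not sep or not local_part or not domain:
        raise ValueError(f"not a valid address: {address!r}")
    return local_part, domain.lower()


def route(envelope_recipient: str) -> tuple[str, str]:
    """Return (mail exchanger, mailbox) for an SMTP envelope recipient."""
    local_part, domain = split_address(envelope_recipient)
    exchangers = MX_TABLE.get(domain)
    if not exchangers:
        raise LookupError(f"no mail exchanger known for {domain}")
    return exchangers[0], local_part


print(route("bob@b.org"))  # ('mx.b.org', 'bob')
```

Note that `route` is given the envelope recipient, not the "To" header, mirroring the point that the MTA takes the destination from the SMTP protocol.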
This sequence of events applies to the majority of e-mail users. However, there are many alternative possibilities and complications to the e-mail system:
- Alice or Bob may use a client connected to a corporate e-mail system, such as IBM's Lotus Notes or Microsoft's Exchange. These systems often have their own internal e-mail format and their clients typically communicate with the e-mail server using a vendor-specific, proprietary protocol. The server sends or receives e-mail via the Internet through the product's Internet mail gateway which also does any necessary reformatting. If Alice and Bob work for the same company, the entire transaction may happen completely within a single corporate e-mail system.
- Alice may not have a MUA on her computer but instead may connect to a webmail service.
- Alice's computer may run its own MTA, thereby avoiding the transfer in the first step above.
- Bob may pick up his e-mail in many ways, for example using the Internet Message Access Protocol, by logging into mx.b.org and reading it directly, or by using a webmail service.
- Domains usually have several mail exchange servers so that they can continue to accept mail when the main mail exchange server is not available.
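The fallback behaviour in the last point can be sketched as follows. In DNS, each MX record carries a preference number and senders try lower numbers first. The backup host name, the preference values, and the `reachable` set below are invented for illustration.

```python
def pick_exchanger(mx_records, reachable):
    """Return the most preferred reachable host, or None if none respond.

    mx_records is a list of (preference, host) pairs; the lowest
    preference number is tried first.
    """
    for _preference, host in sorted(mx_records):
        if host in reachable:
            return host
    return None


mx_records = [(20, "mx2.b.org"), (10, "mx.b.org")]

# Main server is up: it is chosen even though it is listed second.
print(pick_exchanger(mx_records, {"mx.b.org", "mx2.b.org"}))  # mx.b.org

# Main server is down: the backup accepts the mail instead.
print(pick_exchanger(mx_records, {"mx2.b.org"}))  # mx2.b.org
```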
It used to be the case that many MTAs would accept messages for any recipient on the Internet and do their best to deliver them. Such MTAs are called open mail relays. This was important in the early days of the Internet when network connections were unreliable. If an MTA could not reach the destination, it could at least deliver the message to a relay that was closer to the destination. The relay would have a better chance of delivering the message at a later time. However, this mechanism proved to be exploitable by people sending unsolicited bulk e-mail. As a consequence, very few modern MTAs are open mail relays, and many MTAs will not accept messages from open mail relays because such messages are very likely to be spam.
Note that the people, e-mail addresses and domain names in this explanation are fictional: see Alice and Bob.
Internet e-mail format
The format of Internet e-mail messages is defined in RFC 2822 and a series of RFCs, RFC 2045 through RFC 2049, collectively called Multipurpose Internet Mail Extensions (MIME). Although as of July 13, 2005 RFC 2822 is technically a proposed IETF standard and the MIME RFCs are draft IETF standards, these documents are the de facto standards for the format of Internet e-mail. Prior to the introduction of RFC 2822 in 2001 the format described by RFC 822 was the de facto standard for Internet e-mail for nearly two decades; it is still the official IETF standard. The IETF reserved the numbers 2821 and 2822 for the updated versions of RFC 821 (SMTP) and RFC 822, honoring the extreme importance of these two RFCs. RFC 822 was published in 1982 and based on the earlier RFC 733.
Internet e-mail messages consist of two major sections:
- Header - Structured into fields such as summary, sender, receiver, and other information about the e-mail
- Body - The message itself as unstructured text; sometimes containing a signature block at the end
The header is separated from the body by a blank line.
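As a small illustration, Python's standard `email` package splits a raw message at the first blank line. The message text and Alice's address below are made up for the example.

```python
from email import message_from_string

raw = (
    "From: Alice <alice@a.org>\n"
    "To: Bob <bob@b.org>\n"
    "Subject: Lunch\n"
    "\n"  # the blank line that ends the header
    "Are you free at noon?\n"
)

msg = message_from_string(raw)
print(msg.keys())         # ['From', 'To', 'Subject']
print(msg["Subject"])     # Lunch
print(msg.get_payload())  # the body text: Are you free at noon?
```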
Internet e-mail header
The message header consists of fields, usually including at least the following:
- From: The e-mail address, and optionally name, of the sender of the message
- To: The e-mail addresses, and optionally names, of the receivers of the message
- Subject: A brief summary of the contents of the message
- Date: The local time and date when the message was originally sent
Each header field has a name and a value. RFC 2822 specifies the precise syntax. Informally, the field name starts in the first character of a line, followed by a ":", followed by the value which is continued on non-null subsequent lines that have a space or tab as their first character. Field names and values are restricted to 7-bit ASCII characters. Non-ASCII values may be represented using MIME encoded words.
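The two mechanisms just described, continuation lines and MIME encoded words, can be seen with Python's standard `email` package. This is only a sketch: the header values and the `X-Note` field are invented, and the whitespace join is a simple way to unfold a value for display, not the full RFC 2822 procedure.

```python
from email import message_from_string
from email.header import decode_header, make_header

raw = (
    "Subject: =?utf-8?q?Caf=C3=A9_menu?=\n"
    "X-Note: first part of the value\n"
    "\tcontinued on a second line\n"  # leading tab marks a continuation
    "\n"
    "body\n"
)

msg = message_from_string(raw)

# Decode the MIME encoded word back into Unicode text.
subject = str(make_header(decode_header(msg["Subject"])))
print(subject)  # Café menu

# The folded field arrives as one value; collapse the fold for display.
note = " ".join(msg["X-Note"].split())
print(note)  # first part of the value continued on a second line
```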
Note that the "To" field in the header is not necessarily related to the addresses to which the message is delivered. The actual delivery list is supplied in the SMTP protocol, not extracted from the header content. The "To" field is similar to the greeting at the top of a conventional letter, which is delivered according to the address on the outer envelope. Likewise, the "From" field does not have to name the real sender of the e-mail message. It is very easy to fake the "From" field and make a message seem to come from any mail address. It is possible to digitally sign e-mail, which is much harder to fake. Some Internet service providers do not relay e-mail claiming to come from a domain not hosted by them, but very few (if any) check that the person or even the e-mail address named in the "From" field is the one associated with the connection. Some Internet service providers digitally sign e-mail sent through their MTA so that other MTAs can detect forged spam that appears to come from them.
Other common header fields include:
- Cc: Courtesy copy
- Received: Tracking information generated by mail servers that have previously handled a message
- Content-Type: Information about how the message has to be displayed, usually a MIME type
Many e-mail clients present "Bcc" (Blind courtesy copy, recipients not visible in the "To" field) as a header field. Since the entire header is visible to all recipients, "Bcc" is not included in the message header. Addresses added as "Bcc" are only added to the SMTP delivery list, and do not get included in the message data.
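The split between the message header and the SMTP delivery list can be sketched as below. The helper name `prepare` and all addresses are hypothetical. The returned pair is the kind of input that Python's `smtplib.SMTP.sendmail(from_addr, to_addrs, msg)` expects, but nothing is actually sent here.

```python
from email.message import EmailMessage


def prepare(sender, to, bcc, subject, body):
    """Return (envelope recipients, message text).

    Bcc addresses are added to the envelope only, never to the header.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(to)
    msg["Subject"] = subject
    msg.set_content(body)
    envelope = list(to) + list(bcc)
    return envelope, msg.as_string()


envelope, text = prepare(
    "alice@a.org", ["bob@b.org"], ["carol@c.org"], "Hi", "Hello Bob"
)
print(envelope)               # ['bob@b.org', 'carol@c.org']
print("carol@c.org" in text)  # False
```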
E-mail content encoding
E-mail was originally designed for 7-bit ASCII. Much e-mail software is 8-bit clean but must assume it will be communicating with 7-bit servers and mail readers. The MIME standard introduced charset specifiers and two content transfer encodings for transmitting 8-bit data: quoted-printable for mostly 7-bit content with a few characters outside that range, and base64 for arbitrary binary data. The 8BITMIME extension was introduced to allow transmission of mail without the need for these encodings, but many mail transfer agents still do not support it fully. For international character sets, Unicode is growing in popularity.
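The two transfer encodings can be tried with Python's standard `quopri` and `base64` modules. The sample text and bytes are arbitrary. Note how quoted-printable leaves the ASCII characters readable, while base64 re-encodes everything.

```python
import base64
import quopri

# Mostly 7-bit text with one non-ASCII character (two bytes in UTF-8).
text = "Café au lait".encode("utf-8")
qp = quopri.encodestring(text)
print(qp)  # b'Caf=C3=A9 au lait'

# Arbitrary binary data.
blob = bytes([0, 255, 16, 128])
b64 = base64.b64encode(blob)
print(b64)  # b'AP8QgA=='

# Both encodings are reversible.
print(quopri.decodestring(qp) == text)  # True
print(base64.b64decode(b64) == blob)    # True
```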
Saved message file extension
Most, but not all, e-mail clients save individual messages as separate files, or allow users to do so. Different applications save e-mail files with different filename extensions.
- .eml - This is the default e-mail extension for Mozilla Thunderbird and is used by Microsoft Outlook Express.
- .emlx - Used by Apple Mail.
- .msg - Used by Microsoft Office Outlook.
Messages and mailboxes
Messages are exchanged between hosts using the Simple Mail Transfer Protocol with software like Sendmail. Users can download their messages from servers with standard protocols such as POP or IMAP or, as is more likely in a large corporate environment, with a proprietary protocol specific to Lotus Notes or Microsoft Exchange Server.
Mail can be stored either on the client, on the server side, or in both places. Standard formats for mailboxes include Maildir and mbox. Several prominent e-mail clients use their own proprietary format and require conversion software to transfer e-mail between them.
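As a brief illustration of a standard mailbox format, Python's standard `mailbox` module can write and read an mbox file. The file name, addresses, and message are invented, and the mailbox is created in a temporary directory so nothing is left behind.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@a.org"
msg["To"] = "bob@b.org"
msg["Subject"] = "Lunch"
msg.set_content("Are you free at noon?")

with tempfile.TemporaryDirectory() as tmp:
    # Opening a path that does not exist creates an empty mbox file.
    box = mailbox.mbox(os.path.join(tmp, "inbox.mbox"))
    box.add(msg)
    box.flush()  # write the pending message to disk
    subjects = [stored["Subject"] for stored in box]
    box.close()

print(subjects)  # ['Lunch']
```

The same module also provides a `mailbox.Maildir` class for the Maildir format, which stores each message in its own file.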
When a message cannot be delivered, the recipient MTA must send a bounce message back to the sender, indicating the problem.
Spamming and e-mail worms
The usefulness of e-mail is being threatened by three phenomena: spamming, phishing and e-mail worms.
Spamming is the sending of unsolicited commercial e-mail. Because of the very low cost of sending e-mail, spammers can send hundreds of millions of e-mail messages each day over an inexpensive Internet connection. Hundreds of active spammers sending this volume of mail results in information overload for many computer users, who receive tens or even hundreds of junk messages each day.
E-mail worms use e-mail as a way of replicating themselves into vulnerable computers. Although the first e-mail worm affected UNIX computers, the problem is most common today on the more popular Microsoft Windows operating system.
The combination of spam and worm programs results in users receiving a constant drizzle of junk e-mail, which reduces the usefulness of e-mail as a practical tool.
A number of technology-based initiatives mitigate the impact of spam. In the United States, Congress has also passed a law, the CAN-SPAM Act of 2003, attempting to regulate such e-mail.
Privacy problems regarding e-mail
E-mail privacy, without some security precautions, can be compromised because
- e-mail messages are generally not encrypted;
- e-mail messages have to go through intermediate computers before reaching their destination, meaning it is relatively easy for others to intercept and read messages;
- many Internet Service Providers (ISPs) store copies of e-mail messages on their mail servers before they are delivered. Backups of these can remain on their servers for up to several months, even if the user deletes the messages from the mailbox;
- the Received: headers and other information in the email can often identify the sender, preventing anonymous communication.
There are cryptography applications that can serve as a remedy to one or more of the above. For example, Virtual Private Networks or the Tor anonymity network can be used to encrypt traffic from the user's machine to a safer network, while GPG, PGP or S/MIME can be used for end-to-end message encryption, and SMTP STARTTLS or SMTP over Transport Layer Security/Secure Sockets Layer can be used to encrypt communications for a single mail hop between the SMTP client and the SMTP server.
Another risk is that e-mail passwords might be intercepted during sign-in. One may use encrypted authentication schemes such as SASL to help prevent this.