Decoding Internet Attachments

A Tutorial by Michael Santovec



Table of Contents

For additional help, information and resources, see my Technical Help page.

Why Are Attachments Encoded?

Internet e-mail and Usenet news posts were designed for plain text messages. As such, many systems expect the messages to only contain printable characters from the 7-bit (first bit of the 8-bit byte is always zero) ASCII character set. These programs can have problems if the message includes extended 8-bit (the first bit is a one) characters, such as the various accented letters. This also poses a problem for sending files, such as images, sound, video, spreadsheets and programs which can contain any combination of 8-bit binary data. This even poses a problem for formatted documents, since many word processors embed binary control fields in the files.

The way around this limitation is to encode the binary data (attachment) into ASCII characters before sending. To the mail and news systems that the message travels through, the file is just so much text. At the receiving end, the message is decoded back into the original file, none the worse for the experience. Many mail and news programs automate the encoding and decoding. However, sometimes a separate program may be required.

The nice thing about Standards is there are so many to choose from. Encoding is no exception. Among the more popular are: Uuencode, MIME, Base64, Quoted-Printable and Binhex. There are other less common methods as well.

It should be noted that encoding is not the same as encryption. The purpose of encoding is to allow some information to be stored in, or pass through, a medium that can't handle the data directly. The purpose of encryption is to prevent unauthorized persons from viewing or using some information. It's possible for a message to use both encoding and encryption.




Uuencode

Uuencode (Unix-to-Unix Encode), as its name implies, comes from the Unix world. It was commonly used to encode files transmitted from one Unix computer to another. Since the early Internet consisted almost entirely of computers running the Unix operating system, it's not surprising that Uuencode is widely used. Today, almost all computer platforms have programs capable of encoding/decoding using Uuencode.

Most mail and news programs can decode Uuencode. However, not all of them can encode it. Most mail systems will pass Uuencode without problems. If you don't know your recipient's capabilities, Uuencode is a good first guess. Uuencode is more common in news than mail. MIME is making inroads in mail faster than news.

Uuencode results in a transmitted message about 42% larger than the original file. This is typical of the encoding penalty.

Below is how the sample image (a.gif) would look if Uuencoded. The first line starts with the word "begin". The "644" represents the Unix file permissions (read/write/execute). This is largely ignored by other operating systems. In this example, "a.gif" is the file name.

The encoded file follows. Most lines begin with an "M" (representing the line length) and 60 characters of data. The last data line is usually shorter, and therefore starts with a different character. The end of the encoding has "`" on a line by itself and then the word "end" on a line by itself.

begin 644 a.gif
M1TE&.#EA)0`H`+,``.P`2QBQ`/__________________________________
M_____________________RP`````)0`H```$5S#(2:N]..O-N_]@*(YD:9YH
MB@(LH%ZM^U;Q3,6L+>6MW@>_5VXW%,J`Q500>5PUF:HE\4F20ITX'#:K-5FG
;WB3MZR?J^(QM9RVF'7PN'Q.K]OO^+Q^!``[
`
end
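The layout described above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names uuencode and uudecode_line are made up for illustration, and it relies on the standard library's binascii.b2a_uu and binascii.a2b_uu functions to convert one line (up to 45 bytes) at a time. It writes zero values as "`" rather than a blank.

```python
import binascii


def uuencode(data: bytes, name: str, mode: int = 0o644) -> str:
    """Return data as Uuencoded text (hypothetical helper, for illustration)."""
    lines = [f"begin {mode:o} {name}"]
    # Each full line carries 45 bytes of data as 60 characters,
    # prefixed by a length character ("M" stands for 45).
    for i in range(0, len(data), 45):
        chunk = data[i:i + 45]
        # backtick=True writes zero values as "`" instead of a blank.
        encoded = binascii.b2a_uu(chunk, backtick=True)
        lines.append(encoded.decode("ascii").rstrip("\n"))
    lines.append("`")   # zero-length line marks the end of the data
    lines.append("end")
    return "\n".join(lines) + "\n"


def uudecode_line(line: str) -> bytes:
    """Decode a single data line (not the begin/end lines)."""
    return binascii.a2b_uu(line.encode("ascii"))


if __name__ == "__main__":
    print(uuencode(b"GIF89a", "a.gif"), end="")
```

The six bytes "GIF89a" encode to the line "&1TE&.#EA", which matches the first eight data characters of the sample above apart from the length character.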

Problems can occur due to inconsistent encoding/decoding in different mail and news programs. For example, Microsoft Outlook Express will use a blank (x'20') as an encoding character. (Some other encoders will use the ` character (x'60') instead of a blank.) If the blank ends up as the last character in a line, Outlook Express will then drop the blank, resulting in a short line. If Netscape decodes this attachment, it will assume that the short line is padded with nulls (x'00') rather than blanks. This can result in what was originally a x'40', x'80' or x'C0' byte becoming a x'00'. This problem only occurs when a x'40', x'80' or x'C0' byte was originally at the 45th byte of the file, or a multiple thereof (e.g. 90th, 135th, etc.).

The file corruption may or may not be apparent. For an image file, a chunk of the image may appear to be off color or otherwise distorted. For an executable file, it may seem to run OK, give some error when used, or give incorrect results. A ZIP file should indicate that it is corrupted when unzipped.

This problem can be avoided if the Outlook Express user uses MIME (Base64) encoding instead of Uuencode. Netscape users can successfully decode the attachment by decoding it manually with a product such as Wincode or StuffIt Expander, both of which correctly assume that short lines are padded with blanks.




MIME

MIME (Multipurpose Internet Mail Extensions) is not actually a method for encoding attachments. Rather, it deals with the overall structure of a message. A message using MIME doesn't necessarily include attachments. If it does include attachments, they most often use Base64 encoding, or sometimes Quoted-Printable encoding. Theoretically, MIME could even use Uuencode, Binhex, or other methods, but that is both rare and frowned upon in the MIME standards (RFC 2045).

The main advantage of MIME is that it provides a consistent way for the sending program to describe the message contents to the receiving program. The original Internet mail message specification (RFC 822) just describes simple text messages. The message might include an encoded attachment, but it's up to the receiving program to find it in the midst of the message text.

Most newer mail and news programs support MIME. However, older programs don't. And some older mail and news servers either remove or mutilate some of the MIME headers, rendering the message unintelligible to a receiving MIME capable program. Due to its flexibility and power, MIME is the best choice if all parties can handle it.

Some mail and news programs present a choice between Uuencode and MIME encoding. This is a bit misleading and confusing. The Uuencode choice usually means to use a simple mail message (none of the MIME message headers), and to Uuencode any attachments. The MIME choice means to use the MIME headers, and use Base64 or Quoted-Printable for attachments.

The distinguishing characteristic of a MIME message is the presence of the MIME headers. These are normally invisible in a MIME capable reader, but can be seen in the message source. Below are shown some typical MIME headers. The "MIME-Version:" header is present in all messages using MIME. The others are specific to the attachments or other contents. A MIME message may have multiple attachments of various types.

MIME-Version: 1.0
Content-Description: "Base64 encode of a.gif by Wincode 2.7.3"
Content-Type: image/gif; name="a.gif"
Content-Transfer-Encoding: Base64
Content-Disposition: attachment; filename="a.gif"


Although the MIME name specifies "Internet Mail", the same considerations also apply to news. And some parts of MIME are also used by Web Browsers. In particular, web servers use the Content-Type ("image/gif" in the above example) to identify the type of file being sent to the browser so that the browser can determine how to handle it. However, since the browser protocol (HTTP) supports binary transfers, the encoding issues don't apply there. (For more information on Content-Type, see: RFC 2046 and MIME Types)
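To see where headers like these come from, here is a small sketch using Python's standard email package. The function name message_with_attachment is hypothetical, and the image/gif type is hard-coded purely to mirror the a.gif example; a real mail program would work out the type from the file.

```python
from email.message import EmailMessage


def message_with_attachment(body: str, data: bytes, filename: str) -> EmailMessage:
    """Build a MIME message with one GIF attachment (illustrative only)."""
    msg = EmailMessage()
    msg.set_content(body)  # plain text body; also adds MIME-Version: 1.0
    # The library chooses Base64 for binary data and adds
    # Content-Type, Content-Transfer-Encoding and Content-Disposition.
    msg.add_attachment(data, maintype="image", subtype="gif", filename=filename)
    return msg


if __name__ == "__main__":
    msg = message_with_attachment("Here's the image.", b"GIF89a", "a.gif")
    print(msg.as_string())
```

Printing the message shows the message source, including the headers that a MIME capable reader normally hides.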




Base64

Base64 is the preferred encoding method for attachments in messages using MIME. However, in some cases Quoted-Printable is used instead. Although Base64 could be used without MIME, this is rare.

Base64 results in a transmitted message about 37% larger than the original file. This is typical of the encoding penalty, but slightly more efficient than Uuencode.

Below is how the sample image (a.gif) would look if using Base64 encoding. The MIME headers provide all the descriptive information. This includes the file name, its type, and that Base64 encoding is used.

MIME-Version: 1.0
Content-Description: "Base64 encode of a.gif by Wincode 2.7.3"
Content-Type: image/gif; name="a.gif"
Content-Transfer-Encoding: Base64
Content-Disposition: attachment; filename="a.gif"

R0lGODlhJQAoALMAAOwASxixAP//////////////////////////////////////////////////
/////ywAAAAAJQAoAAAEVzDISau9OOvNu/9gKI5kaZ5oigIsoF6t+1bxTMWsLeWt3ge/V243FMqA
xVQQeVw1maol8UmSQp04HDarNVmn3iTt6yGDq+IxtZy2mHXwuHxOr9vv+Lx+BAA7
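The size penalty quoted above can be checked with a little arithmetic. Base64 turns every 3 bytes into 4 characters, so a 76-character line holds 57 bytes; adding a two-character line ending gives 78 characters sent for every 57 bytes, or roughly 37% more. The sketch below does the same calculation in Python. The helper names are invented for illustration, and the two-character (CR LF) line ending is an assumption about the transport.

```python
import base64


def base64_lines(data: bytes, width: int = 76) -> list[str]:
    """Encode data as Base64 and split it into lines of at most `width` characters."""
    text = base64.b64encode(data).decode("ascii")
    return [text[i:i + width] for i in range(0, len(text), width)]


def overhead_percent(n_bytes: int, eol_len: int = 2) -> float:
    """Percentage growth when n_bytes of data are sent as Base64 lines."""
    lines = base64_lines(bytes(n_bytes))
    sent = sum(len(line) + eol_len for line in lines)
    return 100.0 * (sent - n_bytes) / n_bytes


if __name__ == "__main__":
    print(base64_lines(b"GIF89a"))
    print(round(overhead_percent(5700), 1))
```

The six bytes "GIF89a" come out as "R0lGODlh", which is how the sample above begins.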




Quoted-Printable

Quoted-Printable is used to encode some attachments in messages using MIME. Quoted-Printable leaves printable ASCII characters alone and only encodes those characters (bytes) that might get lost or converted in transit.

If the attachment consists mostly of printable ASCII characters, the MIME program may automatically select quoted-printable over Base64, since this would be much more efficient. In the best case, Quoted-Printable results in a transmitted message only about 3% larger than the original file. However, in the worst case, the transmitted message could be about 200% larger than the original file. So it's important to only use this encoding method on suitable files.

Although Quoted-Printable may be used for attachments, it is more often used for the main message text. The mail or news program may offer the option to encode the text using Quoted-Printable. There are two advantages to this. 1) Characters outside of the normal printable ASCII can be safely transmitted. This includes some special characters, and letters from some foreign alphabets. 2) The intended paragraph layout can be preserved. Simple text messages are arbitrarily chopped into suitable chunks (typically 70-80 characters per line) by the sending program. Quoted-Printable allows the logical lines to exceed the physical line limits of the mail or news transport. It places a hard carriage return (line break) only at the end of a paragraph. The receiving program can then reflow the paragraph to the viewing window. Not all receiving programs support wrapping. This may result in each paragraph being displayed as a single line making the message difficult to read. And some programs will wrap the text for display, but not for printing.

Some mail and news servers may automatically convert any messages that contain 8-bit characters into quoted-printable encoding as the message passes through them.

The following sample text

This is a example of a quoted-printable text file. This might contain some special characters such as:
equal sign =, dollar sign $, or even extended characters such as the cent sign ¢ or foreign characters ÀÆß


is shown below as it would look if using Quoted-Printable encoding. An equal sign "=" at the end of a line indicates a soft carriage-return. The receiving program should remove it and flow the following line into this one. An "=20" at the end of a line represents a space. Normally, trailing spaces on a line are removed in transit. Encoding them as "=20" causes them to be preserved. And finally, an equal sign followed by 2 hexadecimal characters (0-9, A-F) represents an extended character.

MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

This is a example of a quoted-printable text file.  This might contain =
some special characters such as:=20
equal sign =3D, dollar sign $, or even extended characters such as the =
cent sign =A2 or foreign characters =C0=C6=DF

If the recipient's mail or news program can't handle quoted-printable (many older ones can't), the message will look peculiar with all the equal signs and hexadecimal encoding, but it is still largely readable.
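The rules above are simple enough to try out. The following is a minimal sketch using Python's standard quopri module; the wrapper names qp_encode and qp_decode are made up for illustration, and ISO-8859-1 is assumed as the character set to match the sample headers.

```python
import quopri


def qp_encode(text: str, charset: str = "iso-8859-1") -> str:
    """Encode text as Quoted-Printable (illustrative wrapper)."""
    return quopri.encodestring(text.encode(charset)).decode("ascii")


def qp_decode(encoded: str, charset: str = "iso-8859-1") -> str:
    """Decode Quoted-Printable text, removing soft line breaks."""
    return quopri.decodestring(encoded.encode("ascii")).decode(charset)


if __name__ == "__main__":
    print(qp_encode("equal sign =, cent sign \xa2"))
    print(qp_decode("This might contain =\nsome special characters"))
```

Decoding joins a line ending in "=" with the line that follows it, and turns sequences such as "=A2" back into the original character.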




Binhex

Binhex is most often used with the Macintosh. Although Binhex decoders are available for other platforms, few people have them. Binhex is a reasonable choice for encoding if both the sender and recipient are using Macs. However, in any other case another encoding method should be used.

Unlike most other methods, Binhex encodes the file name and other information along with the data.

Also, unlike most other methods, Binhex has a built-in compression capability. It's possible that a highly compressible file could result in a smaller transmitted message than the original file. However, you will generally get better results by compressing the file first with a standard compression utility. For an already compressed file, Binhex results in a transmitted message about 40% larger than the original file. This is typical of the encoding penalty, but slightly less efficient than Base64.

Below is how the sample image (a.gif) would look if using Binhex encoding. The first line, rather obviously, indicates the encoding method.

(This file must be converted with BinHex 4.0)
:"@%ZCfPQ!&4&@&4YC'pc!*!&SJ#3"*!!bNG*4MJjB58!+!#c!!$X!%XBX3$rN#S
X!*!%*3!S!!!%9c$)5DZp11[0ZrpJ+)jNDCjSLJ)XS&kYqeEa6-@X,H@YhJHr9fi
h&-U!a933H9`eQDSPm8Q53Tdi($DV09QRhL6Ykb'$Uq)aYCbfQ(A`Z(a1Vp[[q,a
q"!!lU1B!!!!!:

A file encoded using Binhex often has an HQX file extension. If Binhex is used in a MIME formatted message, it usually has a Content-type: application/mac-binhex40. This is a departure from the usual MIME format, in that the Content-type indicates the encoding method rather than the type of data in the file. For more information, see RFC 1741.




Other Encoding Methods

In addition to the most common encoding methods discussed above, you might encounter several other methods of encoding attachments, or several things that look similar to encoded attachments. The following should help you identify what you've got.

Binary, 8-Bit, Raw

Below is how the sample image (a.gif) might look if using Binary, 8-bit, or Raw encoding. This does not actually encode the file, but rather includes the data without any conversion. Some mail and news programs allow you to do this, or you might be able to paste the file or concatenate it into the message text without conversion. However, Internet mail and news transports don't guarantee to transmit the file without alteration. It's quite possible that the message will get truncated because some combination of characters looks like an end of message indication. It might even result in a corrupted mail folder on the mail server or in the recipient's mail program. It's unlikely that the recipient will be able to use the file unless it was a text file to begin with. If you receive such a file, you might as well throw it away and ask the sender to try again.

[Image: a.gif shown as raw 8-bit data]

BTOA

Below is how the sample image (a.gif) might look if using BTOA encoding. This is rarely seen on the Internet today.

xbtoa5 78 a.gif Begin
7nH003FO36-igRR!:0\Y(pO)@s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s"<
"-M!!";F-ia5M="q]eX1^LYc+F!`.#qhPSnNu_/>-=Oqc<5\`N1ZQXkj;t=)Kr2b(.H1&:%M<RAqS,
'8WlE23#jiW2-Hj6,jjn@K<*ud[@F[mFmung:9WF@pq2%Y!':/\E
xbtoa End N 162 a2 E 26 S 3a3f R 7626cb65


BOO

Below is how the sample image (a.gif) might look if using BOO encoding. This is rarely seen on the Internet today.

A.GIF
AdU6>3UQ9@0X0;<00>`0BaRa0?oooooooooooooooooooooooooooooooooooooooooooooooooo
ooooob`0~39@0X~215L`b4V[_CS[cK_oH2R>I6VNJ8X2;:1N[O]FlDc5[2gU[Mh7_eM^=aC:P<ED
47UL=IVZ9O59TT:M>1`fZcEIYmhTkN\QPj_R<KFL]YQel;QlCZoKkoRlOP@0>`00


ROT-13

ROT-13 is not an encoding format for attachments. It is a simple encryption for text. It rotates each letter of the alphabet 13 positions. "A" and "N" are exchanged, "B" and "O" are exchanged, etc. Numbers, spaces and punctuation are not changed. Because it is so simple, its purpose is not security. Rather, it is used so that others don't accidentally read a message that they don't want to. It was most often used for messages of questionable taste. Some news readers have ROT-13 decoding built-in. It is rarely used on the Internet today. Below is a sample of a message using ROT-13 encoding.

Guvf vf n fnzcyr bs n zrffntr rapbqrq hfvat EBG-13 rapbqvat.
Orpnhfr bs gur fvzcyr angher bs gur rapelcgvba, vgf checbfr 
vf abg frphevgl ohg gb cerirag nppvqragny ernqvat.
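Because the rotation is by exactly half the alphabet, applying ROT-13 a second time restores the original text. A quick way to try this is Python's built-in "rot_13" text codec; the wrapper name rot13 below is just for illustration.

```python
import codecs


def rot13(text: str) -> str:
    """Rotate each letter 13 positions; digits, spaces and punctuation are untouched."""
    return codecs.encode(text, "rot_13")


if __name__ == "__main__":
    sample = "Guvf vf n fnzcyr bs n zrffntr rapbqrq hfvat EBG-13 rapbqvat."
    print(rot13(sample))
```

The same function both encodes and decodes, so no separate decoder is needed.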





MS-TNEF WINMAIL.DAT Attachments

Mail programs in the Microsoft Exchange family, which includes Windows Messaging, Outlook97, Outlook98 and Outlook2000, will include a TNEF (Transport Neutral Encapsulation Format) attachment named WINMAIL.DAT when the sender selects, or defaults to, RTF (Rich Text Format). If the sender is using MIME formatting, this attachment will have Content-Type: application/ms-tnef.

The TNEF attachment includes a Rich Text Format (e.g. bold, underline, fonts) version of the plain text message. If the sender has included any attachments (e.g. pictures, spreadsheets, programs), they will be embedded within the TNEF attachment and not as separate attachments.

Most other mail programs do not know how to handle the TNEF attachments and so Exchange family users should avoid using RTF unless they know that the recipient has a compatible program. The sender can control the use of RTF on a recipient by recipient basis. However, if sending via a Microsoft Exchange server, the server can override the sender's settings. In this case the Exchange administrator will need to make changes on the server. For more information on this, see the following articles: How Message Formats Affect Internet Mail, XCLN: Sending Messages In Rich-Text Format, XFOR: Preventing WINMAIL.DAT Sent to Internet Users, Sending RTF with Attachment as MIME Loses Attachment, XADM: POP3 Users may not Receive an Attachment if Part of DL and XFOR: CR Receives Rich Text Format Information Unexpectedly.

Fentun

Fentun is a freeware TNEF Attachment Extractor. It is available for the Win95/98/NT4 and Linux operating systems. It does not show the RTF message embedded in the TNEF attachment, but does let you see if there are any other attachments within the TNEF attachment and lets you save them.

For Netscape Mail, Fentun can be installed as a Helper. Instructions are at the Fentun web site.

If the Fentun author's web site is unavailable, a copy of the Win95/98 version, along with notes, is available for download here.

If you download this MS-TNEF.REG file, and run it on a Win95/98 system, it will create a file association for *.TNF files to Fentun, in order to make Fentun easier to use with other mail programs. Depending on where you install the FENTUN.EXE program, you will need to either edit the path in the registry file before running, or else update the path in the file association after running. For details, see the comments in the MS-TNEF.REG file.

Users of Microsoft IE3 Internet Mail, and IE4/IE5 Outlook Express will not see any indication that they have received a TNEF attachment. Apparently, Microsoft has decided that since these programs can't handle a TNEF attachment, it will be hidden. In order to get these programs to decode the TNEF attachment and make it available to Fentun via the above file association, perform the following steps:
  1. Save the whole e-mail message to an *.EML file via File, Save As, or Drag-and-drop
  2. Open the *.EML file in a text editor such as Wordpad or Notepad
  3. Locate the line Content-Type: application/ms-tnef;, and change the Content subtype to something else, such as Content-Type: application/ms-tnefx;
  4. Locate the line filename="winmail.dat" and change to filename="winmail.tnf". You may change the winmail part as well, if desired
  5. Save the *.EML file from the text editor
  6. Double click the *.EML file from the Windows File Explorer. This will open the message in a mail window and the TNEF attachment should now be available for saving or opening.
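Steps 3 and 4 above amount to two text substitutions, so they could be scripted instead of done by hand in an editor. The following is a rough sketch only: the function name relabel_tnef is hypothetical, and it assumes the saved *.EML file contains the header text in the form shown in the steps (matching is case-insensitive, but headers folded or quoted differently would be missed).

```python
import re


def relabel_tnef(eml_text: str) -> str:
    """Apply steps 3 and 4 to the text of a saved *.EML file (illustrative only)."""
    # Step 3: change the Content subtype so the mail program no longer hides it.
    patched = re.sub(
        r"(Content-Type:\s*application/ms-tnef)(?=[;\s])",
        r"\1x",
        eml_text,
        flags=re.IGNORECASE,
    )
    # Step 4: give the attachment a .tnf extension for the file association.
    patched = re.sub(
        r'(filename="winmail)\.dat(")',
        r"\1.tnf\2",
        patched,
        flags=re.IGNORECASE,
    )
    return patched


if __name__ == "__main__":
    sample = (
        "Content-Type: application/ms-tnef;\n"
        "Content-Disposition: attachment;\n"
        '\tfilename="winmail.dat"\n'
    )
    print(relabel_tnef(sample))
```

Running it a second time on the same text changes nothing, so it is safe to apply to a file that was already edited.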

LS-TNEF

LS-TNEF is a Java based TNEF Decoder. The LS-TNEF API allows one to decode the TNEF file from the command line, via the API or by using Sun's Java Mail API.




HTML - Web Pages

HTML (Hyper Text Markup Language) is used to describe web pages. A web page consists of one or more files. The main file is a text file that contains the HTML formatting codes, usually most or all of the page text, links to any other web pages and links to any images, animations, sound clips, etc. Each image, animation, sound clip, etc. is a separate file.

Some, but not all, mail and news programs can display HTML messages. Most that can display HTML require that the message use MIME formatting and specify Content-Type: text/html for the HTML message portion. Many programs that create HTML messages specify Content-Type: multipart/alternative and include two copies (attachments) of the message text. The first is the plain text version of the message (Content-Type: text/plain), the second is the HTML version (Content-Type: text/html). If a receiving program understands the MIME multipart/alternative and HTML, it will display the HTML version in the message body and hide the plain text one. If it doesn't understand HTML, it will display the plain text version in the message body and hide the HTML one. If it doesn't understand multipart/alternative it may display either or both message copies as attachments. And if it doesn't understand MIME, it will display both copies in the message body, but the HTML version will be difficult to read because it will be the raw HTML with all the formatting codes displayed.

Below is "A simple HTML message.":
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_0059_01BEA6E2.1A467F40"

This is a multi-part message in MIME format.

------=_NextPart_000_0059_01BEA6E2.1A467F40
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

A simple HTML message.

------=_NextPart_000_0059_01BEA6E2.1A467F40
Content-Type: text/html;  charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD W3 HTML//EN">
<HTML>
<HEAD>
<META content=3Dtext/html;charset=3Diso-8859-1 =
http-equiv=3DContent-Type>
<META content=3D'"MSHTML 4.72.3110.7"' name=3DGENERATOR>
</HEAD>
<BODY>
<DIV>A simple <STRONG>HTML</STRONG> message.</DIV>
<DIV> </DIV></BODY></HTML>

------=_NextPart_000_0059_01BEA6E2.1A467F40--
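The fallback behaviour described above can be sketched with Python's standard email package. The names build_alternative and pick_body are invented for illustration, and the understands_html flag is a stand-in for whatever a real mail program knows about its own display abilities.

```python
from email.message import EmailMessage


def build_alternative(plain: str, html: str) -> EmailMessage:
    """Build a multipart/alternative message: plain text first, HTML second."""
    msg = EmailMessage()
    msg.set_content(plain)                     # Content-Type: text/plain
    msg.add_alternative(html, subtype="html")  # Content-Type: text/html
    return msg


def pick_body(msg: EmailMessage, understands_html: bool) -> str:
    """Return the copy of the message text a receiving program would display."""
    prefs = ("html", "plain") if understands_html else ("plain",)
    return msg.get_body(preferencelist=prefs).get_content()


if __name__ == "__main__":
    msg = build_alternative(
        "A simple HTML message.",
        "<HTML><BODY><DIV>A simple <STRONG>HTML</STRONG> message.</DIV></BODY></HTML>",
    )
    print(pick_body(msg, understands_html=False))
```

A program that understands HTML picks the second copy; one that does not falls back to the first.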


The HTML capabilities of mail and news programs come in 3 levels:

  1. HTML Text Only: Some programs can only display HTML text and links. They cannot display any referenced images, play sound clips, etc. Links to images and such might just appear as a link, or not at all.
  2. HTML Text and External Images: Some programs can display images if the HTML contains a link to an external source, such as a web site or a corporate file server. If the image link is to a web site, then the recipient must have an open Internet connection while reading the message for the image to be displayed. If the image link is to a corporate file server, then the recipient must have access to the file server while reading the message, and use the same drive mappings as the sender. If the image link is to the sender's local disk drive, then the recipient would have to have a copy of the image file already on their own local disk drive in the corresponding directory. Support for links to sound clips may be limited or non-existent.
  3. HTML Text, External and Internal Images: Some newer programs support Content-Type: multipart/related which also allows the linked images to be attachments in the same message as the HTML code. The advantage is that the recipient doesn't need a live Internet connection or access to a file server to see the images while reading the message. One disadvantage is that the mail or news message becomes much larger in size. Additionally, if such a message is sent to a recipient with only level 2 HTML support, they won't see the images within the message. They may see the images as separate attachments, or not at all.

If a message's HTML features exceed the capabilities of the recipient's mail or news program, it is possible to save the HTML portion of the message to an HTML file (.HTM or .HTML file extension) and then open it in the web browser. If the message used multipart/related, the images will also need to be saved, possibly requiring manual decoding. The HTML file will require editing if multipart/related was used because the links are not normal file names. Even in other cases, some editing may be required to make the HTML file usable. Overall, this is a lot more bother than it is probably worth.




Macintosh Notes - AppleSingle and AppleDouble

Some Macintosh files consist of two parts: a Data Fork and a Resource Fork.

A Macintosh mail or news program could encode attachments in one or more of the following ways: AppleSingle, AppleDouble, or Data Fork only. Some may offer a choice; some will support only one of these.

AppleSingle and AppleDouble attachments normally use MIME formatted messages with Base64 encoding. The Data Fork only attachments could use Base64, Uuencode or Binhex. For example, if the Macintosh program gives you encoding options of AppleDouble, Base64 and Binhex, the Base64 and Binhex options likely send only the Data Fork.

When a non-Macintosh user receives an AppleDouble attachment, they will most likely see two attachments. Both attachments might have the same name (e.g. photo.jpg), or the first attachment (Resource Fork) might have a generated name (e.g. att0001.dat) while the second (Data Fork) has the real name (e.g. photo.jpg). The first attachment is usually small (less than 10 KB). They will likely get an error if they try to open the first attachment (unknown file type or invalid file format). They should ignore the first attachment and just open/save the second one.

You can verify the presence of AppleSingle or AppleDouble encoding by looking at the message source. AppleSingle will have a single attachment with a Content-type: application/applefile. AppleDouble will have a header with a Content-type: multipart/appledouble followed by the Resource Fork attachment with Content-type: application/applefile followed by the Data fork attachment with a Content-type that depends on the actual file type (e.g. Content-type: image/jpeg). For more information, see RFC 1740.




Identifying Attachment File Types

Usually the file name of an attachment indicates the type of file it is. For example, a file named A.GIF has a file extension of GIF which indicates that it is probably a GIF image file. Knowing the type of file allows you to select an appropriate application program to open the file in.

The following sites have lists of file extensions and the types of files they could be: Joz's Extensions Base, CKNOW.COM File Extensions, Common Internet File Formats, WhatIs.com.

However, file extensions are not necessarily unique. The same file extension may be used by different types of files. For example, a DAT file could be just about anything. In some cases, the attachment arrives with an incorrect file extension. It may be necessary to take a look at the contents of a file to determine what it is. Often you can use a text or word processor, such as the Windows Notepad or Wordpad, to look at a file.

If the file is mostly a jumble of letters and numbers, it may need manual decoding. Comparing the file to the examples in Uuencode, MIME, Base64, Quoted-Printable, Binhex, and Other sections should allow you to identify the type of encoding used. See the Decoding Mechanics section for how to handle these files.
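The comparison can be partly automated by looking for the telltale lines shown in the earlier examples. The sketch below is a rough heuristic, not a complete detector: the function name guess_encoding is hypothetical, and it only recognizes the markers that appear in this tutorial's samples.

```python
import re


def guess_encoding(text: str) -> str:
    """Guess the encoding method from marker lines (heuristic, illustrative only)."""
    for line in text.splitlines():
        if re.match(r"begin [0-7]{3,4} \S", line):
            return "Uuencode"
        if line.startswith("(This file must be converted with BinHex"):
            return "Binhex"
        if line.startswith("xbtoa"):
            return "BTOA"
        if re.match(r"content-transfer-encoding:\s*base64", line, re.IGNORECASE):
            return "Base64"
        if re.match(r"content-transfer-encoding:\s*quoted-printable", line, re.IGNORECASE):
            return "Quoted-Printable"
    return "unknown"


if __name__ == "__main__":
    print(guess_encoding("Hi Bob,\n\nbegin 644 a.gif\nM1TE&...\n"))
```

Base64 and Quoted-Printable are only recognized here through their MIME headers; a bare Base64 block without headers would be reported as unknown.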

If the file appears to be mostly strange characters, there may be a few letters in the mix that let you identify the type of file it is. For example, if the sample image (a.gif) is opened in your word processor, it appears as a jumble of characters beginning with GIF89a. The GIF as the first 3 characters identifies this as a GIF image file. The identifying characters don't necessarily match the file extension and they aren't necessarily the first 3 characters. This varies by file type.
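The same check can be done by a program that looks at the first few bytes of the file. This is a minimal sketch: the identify function and the short SIGNATURES table are my own illustration covering a handful of well-known formats, and it only checks the start of the file, so it will miss types whose identifying characters appear elsewhere.

```python
# A few well-known leading byte sequences (illustrative selection only).
SIGNATURES = [
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"%PDF-", "PDF document"),
]


def identify(data: bytes) -> str:
    """Return a file type guess based on the leading bytes of the file."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


if __name__ == "__main__":
    print(identify(b"GIF89a%\x00(\x00"))
```

Only the first dozen or so bytes of the file need to be read for this kind of check.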

The following is a list of some common file types and the identifying information.

If the file is a Microsoft Office file (Word, Excel, Access, Power Point), the following free utilities may be useful: Microsoft Office Converters and Viewers




Compression and Message Size Limits

Sometimes it is a good idea to compress the files before attaching. The advantages of this are:

However, since compression does require a little extra work on both the sender's and recipient's part, there are times when it isn't worth the bother. If the file size is small (e.g. less than 40 KB) or a file type that doesn't compress well (see above) and you are only sending one or two files, I wouldn't compress them. Also take into consideration whether the recipient will know how to uncompress the file.

There are several compression methods:

ZIP

This is the most common compression method on DOS and Windows PCs. Programs are also readily available to handle Zip files on other platforms. This is the best choice of compression method for DOS and Windows, as well as cross-platform file transfers. PKWARE PKZIP/PKUNZIP is the standard DOS program for ZIP files. They also have versions for Windows, OS/2 and Unix. There are also numerous DOS and Windows front-end programs available to make the use of the DOS PKZIP easier. Winzip is a popular Windows-based Zip/Unzip program that also has decoding abilities. Info-ZIP has freeware Zip and Unzip programs for over 30 operating systems.

StuffIt - SIT

Aladdin Systems StuffIt is commonly used on Macs. Unstuffing utilities (e.g. StuffIt Expander) are available for DOS and Windows PCs, but few people have them. Utilities for creating SIT files are not commonly available on these platforms. StuffIt Expander can also uncompress ZIP files as well as decode Uuencode and Binhex.

TAR - Compress

TAR (Tape ARchive) is the standard archive for Unix systems. This is often combined with the standard Unix utility "compress". Compressed Unix files typically have a file name ending in ".Z". TAR and Unix "compress" compatible utilities are available for other platforms, but few people have them.

For ZIP and other utilities, try these sites: ZDNET Hotfiles, SHAREWARE.COM, c|net Download.Com, Filez.




Encoding Mechanics

Most newer mail and news programs provide automatic encoding of attachments. You merely select the menu item or button for Attachment and then select the file to attach. Or your operating system may provide a drag-and-drop or other method to send a file. If your mail or news program supports more than one encoding method, there will be an option to set the default encoding method. There may also be the option to override the default encoding method when composing a message. (Note: Some programs support more decoding methods than they do encoding methods. For example, a program might always use MIME encoding in sending, but be able to decode either MIME or Uuencoded attachments on receipt.)

There are two cases where you might need or want to manually encode an attachment: your program doesn't support attachments or the recipient's program can't decode any of the encoding methods that your program supports.

Because MIME headers are integrated with the message headers, it would be difficult, if not impossible, to manually insert a MIME encoded attachment in an existing message such that the recipient's program would automatically decode it. For this reason, Uuencode would be the best choice for manual encoding.

When you manually encode a file, the encoder program produces a plain text file. For example, the A.GIF file might get converted to A.UUE. ".UUE" is a common extension for Uuencoded files. However, the exact name is unimportant since it should appear nowhere in the sent message. Once you have the encoded file, you need to insert it as text into the body of the message. The encoded file should appear along with your message text, as in the following example.

Hi Bob,
Here's the image file that I promised you.

begin 644 a.gif
M1TE&.#EA)0`H`+,``.P`2QBQ`/__________________________________
M_____________________RP`````)0`H```$5S#(2:N]..O-N_]@*(YD:9YH
MB@(LH%ZM^U;Q3,6L+>6MW@>_5VXW%,J`Q500>5PUF:HE\4F20ITX'#:K-5FG
;WB3MZR?J^(QM9RVF'7PN'Q.K]OO^+Q^!``[
`
end
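For the curious, here is roughly what a manual encoder does to produce text like the above. This is only a minimal sketch in Python, not the code of Wincode or any other particular encoder. The function name `uuencode` and the default mode of 644 are my own choices for illustration; `binascii.b2a_uu` is the standard library routine that encodes one line (up to 45 bytes) at a time.

```python
import binascii

def uuencode(data: bytes, name: str, mode: int = 0o644) -> str:
    """Return data as Uuencoded text, ready to insert into a message body."""
    lines = [f"begin {mode:o} {name}"]
    # Uuencode handles at most 45 input bytes per output line.
    for i in range(0, len(data), 45):
        encoded = binascii.b2a_uu(data[i:i + 45], backtick=True)
        lines.append(encoded.decode("ascii").rstrip("\n"))
    lines.append("`")    # a zero-length line marks the end of the data
    lines.append("end")
    return "\n".join(lines) + "\n"

# Encode three bytes and show the text that would go into the message.
print(uuencode(b"Cat", "cat.txt"))
```

The output is plain 7-bit text, so it can be pasted below your normal message text just like the a.gif example.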

Some people include a line, such as "------ CUT HERE -----" before and after the encoded text. This is unnecessary. If the recipient's program can decode the attachment type, it won't use those lines. If the recipient has to manually cut the encoded text for decoding, after they've done it once, it will be obvious to them what needs to be cut.

Notes: If you are manually encoding a file, be sure to Insert as Text and not Attach the encoded file. If you attach the encoded file, it gets encoded a second time. Not only does this make the resulting message larger than necessary, it also defeats the purpose of manual encoding. If the recipient could decode your attachments in the first place, there would be no need for the manual encoding. By attaching an already encoded file, you are forcing the recipient to double decode it.

If you manually Uuencode an attachment and insert that into a message that is using MIME, that may allow a recipient whose program only supports Uuencode to automatically decode it. However, if the recipient's program supports MIME, it may not automatically decode such a message, even if that program also supports Uuencode. That's because many programs don't expect MIME and Uuencode to be used in the same message. They only look for Uuencoded attachments in messages without any MIME headers.

If you are going to use compression, you compress the original file first. Then either attach the compressed file (if using automatic encoding) or encode the compressed file (if using manual encoding).
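The order matters because encoded text compresses poorly and, more to the point, compressed data is binary again and would still need encoding. The sketch below illustrates the compress-then-encode order. It is an illustration under assumptions: `zlib` stands in for whatever compression utility you actually use (ZIP, for example), Base64 stands in for the encoding method, and the function names are made up for this example.

```python
import base64
import zlib

def compress_then_encode(data: bytes) -> bytes:
    """Compress first, then encode the compressed bytes as text."""
    return base64.b64encode(zlib.compress(data))

def decode_then_decompress(text: bytes) -> bytes:
    """The receiving end reverses the steps: decode first, then decompress."""
    return zlib.decompress(base64.b64decode(text))

# Repetitive data compresses well, so the difference is easy to see.
original = b"the same line of text repeats\n" * 200
packed = compress_then_encode(original)
encoded_only = base64.b64encode(original)
print(len(original), len(encoded_only), len(packed))
```

For data like this, the compressed-then-encoded text is much shorter than the text produced by encoding alone. Files that are already compressed (GIF or JPEG images, for instance) will show little or no gain.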




Decoding Mechanics

Most newer mail and news programs provide automatic decoding of attachments. However, your program might not support the encoding method used by the sender. If your program can handle the decoding of more than one method, it will usually automatically detect the message's method.

If your program automatically decodes an attachment, it will do one of several things, depending on the program, your options, the type of file and/or how it was attached.

You might need to manually decode an attachment. The reasons for this include: the sender used an encoding method not supported by your mail or news program; your mail or news server doesn't support MIME and removed some critical MIME headers; the sender double encoded the attachment (see Encoding Mechanics); or the original message was split into multiple parts (see Compression and Message Size Limits).

If your mail or news program decodes an attachment, but it needs further decoding, use the program's save attachment feature, then use the manual decoder on that file (for an exception, see Multipart Messages below). Otherwise you need to save the raw message as a plain text file. Most programs have a File, Save As option to save a message to an external file. Although they may give the file a special extension, these are normally plain text files. Your manual decoder program might expect that the file has a special extension, such as ".UUE", but this is usually not necessary.

It's a good idea to look at the saved file in your word processor to see what you've got. Comparing the file to the examples in the Uuencode, MIME, Base64, Quoted-Printable, Binhex, and Other sections should allow you to identify the type of encoding used. If the file does not look like any of the encoding formats, see the Identifying Attachment File Types section for further help.
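The same eyeball check can be roughed out in code. The following Python sketch looks for the tell-tale lines of the formats discussed in this tutorial. It is a heuristic of my own for illustration (the function name `guess_encoding` is hypothetical), not how any real decoder identifies files, and it will miss cases such as Base64 with no headers at all.

```python
import re

def guess_encoding(text: str) -> str:
    """Guess the encoding of a saved message from its tell-tale lines."""
    # Binhex files announce themselves with a fixed banner line.
    if "(This file must be converted with BinHex" in text:
        return "binhex"
    # Uuencoded data starts with "begin", a file mode and a file name.
    if re.search(r"^begin [0-7]{3,4} \S", text, re.MULTILINE):
        return "uuencode"
    # MIME names the method in a Content-Transfer-Encoding header.
    match = re.search(r"^Content-Transfer-Encoding:\s*(\S+)", text,
                      re.MULTILINE | re.IGNORECASE)
    if match:
        return match.group(1).lower()
    return "unknown"

print(guess_encoding("Hi Bob,\n\nbegin 644 a.gif\nM1TE&.#EA\n`\nend\n"))
```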

Depending on the manual decoder program that you use, you may need to do some editing of the saved file before decoding. You may need to remove message headers (e.g. From:, Subject:, etc.) and normal message text. However, MIME decoders generally expect the message headers, and Binhex decoders expect the "(This file must be converted with BinHex 4.0)" line. Some better decoder programs (e.g. Wincode) do a good job of ignoring what they don't need, so that you rarely need to edit the file before decoding.
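To give an idea of what "ignoring what they don't need" means for Uuencode, here is a minimal Python sketch that skips the headers and message text, finds the "begin" line, and decodes up to the "end" line. This is an illustration only, not Wincode's actual logic; the function name `uudecode_message` is made up, and a real decoder copes with many more oddities.

```python
import binascii

def uudecode_message(text: str) -> tuple[str, bytes]:
    """Find the Uuencoded block in a saved message; return (name, data)."""
    lines = text.splitlines()
    # Skip headers and ordinary message text until the "begin" line.
    for start, line in enumerate(lines):
        fields = line.split(" ", 2)
        if len(fields) == 3 and fields[0] == "begin" and fields[1].isdigit():
            name = fields[2]
            break
    else:
        raise ValueError("no 'begin' line found")
    data = bytearray()
    for line in lines[start + 1:]:
        if line == "end":
            return name, bytes(data)
        if line.strip():                     # ignore blank lines
            data += binascii.a2b_uu(line)    # decode one encoded line
    raise ValueError("no 'end' line found")

saved = "From: alice\nSubject: cat\n\nHi Bob,\n\nbegin 644 cat.txt\n#0V%T\n`\nend\n"
print(uudecode_message(saved))
```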

You might also need to tell the decoder program the type of encoding used, or select a different decoder program based on the type of encoding. Wincode version 2.7.3 or later does a good job of determining the encoding type if set to "AUTO-1" decoding. (However, some files use Base64 with incomplete or no MIME headers. In that case you will need to manually strip the file of any headers before decoding and set Wincode to decode Raw Base64.)
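Decoding raw Base64 by hand amounts to dropping everything that isn't encoded data and decoding what's left. A rough Python sketch follows. It assumes the headers, if any, are separated from the encoded text by the first blank line, and the function name `decode_raw_base64` is hypothetical. A file with other stray text in it would still need manual editing first.

```python
import base64

def decode_raw_base64(text: str) -> bytes:
    """Decode Base64 text, dropping any header block before it."""
    # Assumption: headers (if present) end at the first blank line.
    _headers, separator, body = text.partition("\n\n")
    if not separator:
        body = text              # no blank line, so treat it all as data
    # Remove line breaks and spaces, then decode in one go.
    return base64.b64decode("".join(body.split()))

saved = "From: alice\nSubject: greeting\n\nSGVsbG8s\nIHdvcmxk\n"
print(decode_raw_base64(saved))
```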

Multipart Messages

Because of message size limits imposed by some ISPs (see Compression and Message Size Limits), larger attachments may have been split into multiple messages. Since decoding is the reverse of encoding, you must perform the steps in the reverse order. The original file was first encoded, then split into multiple messages. So you must first combine the multiple messages, then decode the combined text.

Some news programs (but fewer mail programs) may automatically identify the message parts, combine them and decode. Some others may allow you to identify the parts and order them, then the program will decode them. However, in many cases, you will need to manually save the parts and then decode them.

Your mail or news program may automatically decode the first message part. However, that doesn't do you any good. There is no practical way to combine that already decoded part with the rest of the parts. You will need to save each part (including the first) as a plain text file (as discussed above), then decode that. Your program may allow you to select all the parts and save as a single text file. If not, save each message part as a separate file. If you give the individual parts names such as FILE1.UUE, FILE2.UUE, FILE3.UUE, etc. then tell Wincode to decode FILE1.UUE, it will automatically find the other parts for decoding. For some decoders it may be necessary to use your word processor (or other program) to combine the individual parts into a single file before decoding.
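The combine-then-decode order can be sketched in Python as follows. This is a simplified illustration, not a description of how Wincode works: it assumes the parts have already been saved as text with headers and message text removed, that the first part holds the "begin" line and the last part the "end" line, and that the names follow the FILE1.UUE pattern. The function names are made up for the example.

```python
import binascii
import re

def part_number(name: str) -> int:
    """Pull the part number out of a name such as FILE2.UUE."""
    return int(re.search(r"(\d+)\.", name).group(1))

def combine_parts(parts: dict[str, str]) -> str:
    """Join the saved parts in numeric order into one block of text."""
    ordered = sorted(parts, key=part_number)
    return "".join(parts[name] for name in ordered)

def uudecode(text: str) -> bytes:
    """Decode the lines between 'begin' and 'end' in the combined text."""
    data = bytearray()
    inside = False
    for line in text.splitlines():
        if line.startswith("begin "):
            inside = True
        elif line == "end":
            break
        elif inside and line.strip():
            data += binascii.a2b_uu(line)
    return bytes(data)
```

Sorting on the number rather than on the name keeps FILE10.UUE after FILE9.UUE, which a plain alphabetical sort would get wrong.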

For Wincode and other decoding utilities, try these sites: ZDNET Hotfiles, SHAREWARE.COM, c|net Download.Com, Filez. Winzip also includes decoding functions. Aladdin Systems StuffIt Expander can also decode Uuencode and Binhex.




Problems and Complications

If there weren't problems and complications, you wouldn't be reading this. Most of these are caused by incompatibilities between the sending and receiving program. Another source of problems is the sender's lack of understanding of encoding. One or more of the following may apply to the problem message.




Notes on Mail and News Programs

Disclaimer: Information in this section is based on rumor and innuendo. I don't use all of these programs. Program features are subject to change without notice. Different versions of a program on the same or different platforms may have different features. For example, the Macintosh version might support both encoding and decoding using Binhex, while the Windows version might support only decoding Binhex, or not support Binhex at all. If you have more accurate or up-to-date information on these or other major mail and news programs regarding their support for Attachments, drop me a note. I might even update this page with that information.




Notes on Service Providers

Disclaimer: Information in this section is based on rumor and innuendo. I don't use all of these services. Service features are subject to change without notice. Users on the same service may have different software or versions. The service on different platforms may have different features. If you have more accurate or up-to-date information on these or other major services regarding their support for Attachments, drop me a note. I might even update this page with that information.

Some of the links below are restricted to members of the service.






Last updated: 2000-08-01
