Advice for Electronic Documents

Introduction

This document collects advice on creating electronic documents that can be posted on an HTTP server. These documents are typically in PostScript or PDF form, and typically originate in MS Word or LaTeX.


PostScript fonts

There is one example of difficulty with a document generated in MS Word, and printed to file to create PostScript output. The document made use of the Times New Roman fonts, which are not in the standard set of 35 that come with virtually every laser printer (see this URL for a discussion of how this came about). As a result, the document's appearance was fine on an X-Windows screen when viewed with Ghostview on FNALU, but it printed out badly. The WH12W_HP5M printer substituted some font which was too big and the text ran off the page. At ANL, Maury Goodman was unable to print the document. The solution to the problem was to change the Times New Roman font references in the PostScript file to Times Roman.
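
The font-name change described above can be scripted. The following Python sketch is only an illustration: the names on the left of the table are assumptions about what a Windows driver might have written into the file, so look at the %%DocumentFonts line of your own file before relying on them.

```python
import re

# Hypothetical mapping. The left-hand names are assumptions, not taken from
# the original document; check the font names that your own file contains.
FONT_SWAPS = {
    "TimesNewRoman,BoldItalic": "Times-BoldItalic",
    "TimesNewRoman,Bold": "Times-Bold",
    "TimesNewRoman,Italic": "Times-Italic",
    "TimesNewRoman": "Times-Roman",
}

# Try the longest names first, so that "TimesNewRoman,Bold" is not cut
# short by a match on plain "TimesNewRoman".
_PATTERN = re.compile(
    "|".join(re.escape(name) for name in sorted(FONT_SWAPS, key=len, reverse=True))
)


def swap_fonts(ps_text):
    """Return ps_text with each listed font name replaced by its substitute."""
    return _PATTERN.sub(lambda match: FONT_SWAPS[match.group(0)], ps_text)
```

When applying this to a real file, reading and writing it with the "latin-1" encoding keeps any non-ASCII bytes unchanged.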

The books "The LaTeX Companion" and "The LaTeX Graphics Companion" list the fonts available in almost all PostScript printers. They say that these are four serif families (Times, Palatino, Bookman, and New Century Schoolbook), two sans-serif families (Helvetica and AvantGarde), one monospaced typewriter family (Courier), a symbol font, and the cursive Zapf Chancery; these were the fonts Apple provided with the first LaserWriter Plus in 1986. Together these families, in their various shapes, make up 35 fonts, and you often see references to "the 35 fonts". "The LaTeX Companion" shows the 35 "basic" PostScript fonts in Table 11.6; it says that most or all of these fonts are present in the ROM of the common laser printers. On printers that do not have all of them (the older or cheaper models), the ones most likely to be present are Times-Roman, Helvetica, and Courier.

A list of these 35 fonts may be found in the file "Fontmap" from the Ghostscript distribution (ftp.cs.wisc.edu:/ghost/aladdin/). This list is shown below. The PostScript names are on the left; the fonts that Ghostscript substitutes for them are listed on the right.

% Aliases (PostScript fonts)
 1. /Bookman-Demi                   /URWBookmanL-DemiBold ;
 2. /Bookman-DemiItalic             /URWBookmanL-DemiBoldItal ;
 3. /Bookman-Light                  /URWBookmanL-Ligh ;
 4. /Bookman-LightItalic            /URWBookmanL-LighItal ;
 5. /Courier                        /NimbusMonL-Regu ;
 6. /Courier-Oblique                /NimbusMonL-ReguObli ;
 7. /Courier-Bold                   /NimbusMonL-Bold ;
 8. /Courier-BoldOblique            /NimbusMonL-BoldObli ;
 9. /AvantGarde-Book                /URWGothicL-Book ;
10. /AvantGarde-BookOblique         /URWGothicL-BookObli ;
11. /AvantGarde-Demi                /URWGothicL-Demi ;
12. /AvantGarde-DemiOblique         /URWGothicL-DemiObli ;
13. /Helvetica                      /NimbusSanL-Regu ;
14. /Helvetica-Oblique              /NimbusSanL-ReguItal ;
15. /Helvetica-Bold                 /NimbusSanL-Bold ;
16. /Helvetica-BoldOblique          /NimbusSanL-BoldItal ;
17. /Helvetica-Narrow               /NimbusSanL-ReguCond ;
18. /Helvetica-Narrow-Oblique       /NimbusSanL-ReguCondItal;
19. /Helvetica-Narrow-Bold          /NimbusSanL-BoldCond ;
20. /Helvetica-Narrow-BoldOblique   /NimbusSanL-BoldCondItal;
21. /Palatino-Roman                 /URWPalladioL-Roma ;
22. /Palatino-Italic                /URWPalladioL-Ital ;
23. /Palatino-Bold                  /URWPalladioL-Bold ;
24. /Palatino-BoldItalic            /URWPalladioL-BoldItal ;
25. /NewCenturySchlbk-Roman         /CenturySchL-Roma ;
26. /NewCenturySchlbk-Italic        /CenturySchL-Ital ;
27. /NewCenturySchlbk-Bold          /CenturySchL-Bold ;
28. /NewCenturySchlbk-BoldItalic    /CenturySchL-BoldItal ;
29. /Times-Roman                    /NimbusRomNo9L-Regu ;
30. /Times-Italic                   /NimbusRomNo9L-ReguItal ;
31. /Times-Bold                     /NimbusRomNo9L-Medi ;
32. /Times-BoldItalic               /NimbusRomNo9L-MediItal ;
33. /Symbol                         /StandardSymL ;
34. /ZapfChancery-MediumItalic      /URWChanceryL-MediItal ;
35. /ZapfDingbats                   /Dingbats ;
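
If you want to look up what Ghostscript substitutes for a given PostScript font, the alias entries can be read into a table. This is a rough Python sketch that assumes each alias has the form "/Name /Substitute ;" as in the list above; the function name is made up for illustration, and entries of other forms are simply skipped.

```python
import re

# One alias entry: "/PostScriptName /SubstituteName ;" (the space before
# the semicolon is optional, as in the Helvetica-Narrow entries).
ALIAS = re.compile(r"/([^\s/;]+)\s+/([^\s/;]+)\s*;")


def parse_aliases(fontmap_text):
    """Return a dict mapping PostScript font names to substitute font names."""
    aliases = {}
    for line in fontmap_text.splitlines():
        line = line.split("%", 1)[0]  # drop comments such as "% Aliases"
        for name, substitute in ALIAS.findall(line):
            aliases[name] = substitute
    return aliases
```

For example, parse_aliases(text).get("Times-Roman") returns the substitute name when the file defines that alias, and None otherwise.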

This list, in another form, can be found in the file "psfonts.map" that dvips needs. This file can be found in the $TEX_FILES_DIR/texmf/dvips/ directory on FNALU (if "setup tex_files" has been done).


PostScript files from a PC

The following is advice from Dave Anderson. He has tried it on his PC (which is running Windows NT, I believe). This is advice on how to produce a "clean" PostScript file (i.e. one with a minimum of embedded "Features").

--------------------

The trick is to add an Apple printer that is set up to save to a file. To do this:

Open "Add Printer"
Select "My Computer"
Select "File" for the port
Choose "Apple" under Manufacturers
Choose an Apple LaserWriter II

If you are on Windows NT you may have to log in as a local administrator for your machine to do this.

After this printer is installed, use it to produce PostScript files. The PC will put the ".prn" ending on the file. Before you send it elsewhere you should change the file suffix to ".ps" [e.g. a WWW browser won't send such a file to Ghostview if it ends in ".prn", unless a special setup has been made].

There is a "PrintFile" application available that allows one to drag and drop ps files on its icon on a PC. It is very Mac-like. You may download it in zipped form from this URL.

--------------------

The choice of the Apple LaserWriter II is what minimizes the number of embedded "Features". An example of a "Feature" is choice of a paper tray.

The "My Computer" choice above refers to a choice between "My Computer" and "Network". Choosing "Network" gives you a list of printers that the network administrators have already set up.

When I tried these instructions on my Windows NT machine, the Add Printer Wizard asked me to insert my Windows NT Workstation CD in my CD drive, so that it could find the driver for the Apple LaserWriter II.

You can avoid having the ".prn" suffix tacked onto your file name. To do this choose "All files(*.*)" in the dialogue box and then edit away the ".prn" that is shown in the file name line. Replace it with ".ps" at the end of your file name. You won't end up with ".ps.prn" like you might have otherwise.

--------------------

Here is similar advice from others on how to set up to do "print to file" by choosing a "printer" that actually creates a file (instead of using the "print to file" button provided when printing from most applications). These are equivalent to Dave Anderson's instructions.

--------------------

Can I suggest the way to create a "Print to file" printer? From Start > Settings > Printers > Add Printer. Follow the Add Printer Wizard instructions, selecting your printer and driver and give it a suitable name. But when you get to the Port selection, select File so that the printer will create a *.prn file. Now, when going to File > Print, use the drop-down printer selector to choose your new printer.

Hope this helps, please let us know how you get on :-)

Mike Glen
Microsoft MVP - Project

See http://www.gallicrow.co.uk/PrintingFAQ.html for step by step instructions on how to set up a dummy printer to print to the "File:" port or a named local port.

Jon

- --------------------------------------------------------------------
Jon Eva                                mailto:joneva@gallicrow.co.uk
Gallicrow Software Limited             http://www.gallicrow.co.uk
Home of Imprint - a utility for printing text and binary files
- --------------------------------------------------------------------

PCL or PostScript?

Many PC applications let you choose any printer (e.g. one that includes a built-in PostScript interpreter) via File > Print, and then offer a button on the print form that creates a file instead of sending the job to the printer. What gets put into the file depends on the driver for that printer. You could end up with a PCL file instead of a PostScript file. Or, if it is a PostScript file, it could contain all sorts of "Features" selected for that printer (which can interfere with the operation of some versions of Ghostview).
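
A quick way to tell which kind of file you ended up with is to look at its first bytes. The Python sketch below rests on two assumptions that are mine rather than from any driver documentation: a PostScript job has a line starting with "%!" near the top (possibly after a Ctrl-D or a PJL line), and a PCL or PJL job starts with an escape character. It is a heuristic and can be fooled.

```python
def sniff_print_file(data):
    """Guess the language of a print file from its first bytes (heuristic)."""
    head = data[:2048]
    for line in head.splitlines():
        # Some drivers put a Ctrl-D (0x04) in front of the "%!" line.
        if line.lstrip(b"\x04").startswith(b"%!"):
            return "postscript"
    if head.startswith(b"\x1b"):
        # Printer escape sequences with no PostScript header in sight.
        return "pcl-or-pjl"
    return "unknown"
```

Usage would be along the lines of sniff_print_file(open("slide.prn", "rb").read()), where "slide.prn" is a made-up file name.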


Advice on which print driver to use

From the help file of the Windows version of GSview:

Some PostScript printer drivers include code that is specific to a particular printer. The PostScript output from these drivers may be unportable and may not display in GSview. If you are having this problem, try using a reasonably generic PostScript driver such as Apple LaserWriter II NT for PostScript level 2 printers, or Apple LaserWriter Plus for PostScript level 1 printers.


Advice on encoding documents

Files that contain binary information are best encoded if they are going to be shipped somewhere else. Mail readers (MUAs--mail user agents) and browsers know what to do with such encoded files. For example, I believe that "dvips" includes fonts in binary form in the PostScript files that it generates (dvips is used when going from TeX or LaTeX to PostScript). It doesn't include the comment "%%DocumentData: Clean7Bit" in the PostScript file; such a comment indicates that no encoding is necessary.

Listed next are some of the DSC comments from a PostScript file generated on my Mac (DSC = Document Structuring Conventions; these comments start with "%%").

%!PS-Adobe-3.0
%%Title: (Untitled1)
%%Creator: (Microsoft Word: PSPrinter 8.3.1)
%%CreationDate: (1:57 PM Monday, February 22, 1999)
%%For: (alan wehmann)
%%Pages: 1
%%DocumentFonts: Courier
%%DocumentNeededFonts: Courier
%%DocumentSuppliedFonts:
%%DocumentData: Clean7Bit

The last one indicates that the file is pure 7-bit ASCII and therefore does not need to be encoded.
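
Whether a given file declares itself clean, and whether it really is, can be checked with a few lines of code. The Python sketch below is an illustration only; the function names are invented, and the header scan is simplified (it stops at %%EndComments or at the first line that does not start with "%").

```python
def document_data(ps_bytes):
    """Return the value of the %%DocumentData header comment, or None."""
    for raw in ps_bytes.splitlines():
        if not raw.startswith(b"%") or raw.startswith(b"%%EndComments"):
            break  # simplified: treat this as the end of the header comments
        if raw.startswith(b"%%DocumentData:"):
            return raw.split(b":", 1)[1].strip().decode("ascii")
    return None


def is_seven_bit(ps_bytes):
    """True if every byte in the file has its high bit clear."""
    return all(byte < 0x80 for byte in ps_bytes)
```

A file for which document_data() returns None but is_seven_bit() returns True is probably safe to send unencoded, although the missing comment means that the generating program made no promise.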

Here are some extracts from URL "http://www.ics.uci.edu/~mh/book/overall/ovofmime.htm" that explain some aspects of MIME (Multipurpose Internet Mail Extensions):

Of course, when MIME encodes a binary file (like a digitized picture) that people can't read in the first place, the encoded data won't be any easier for a person to read. MIME encoding is designed to get the data safely through almost every known mail transfer system and gateway. One of the major wins in MIME is that it was designed to work everywhere, including "broken" and "brain-damaged" systems. Instead of trying to impose a new standard on mail transfer systems, MIME works with existing systems -- and adapts to their eccentricities.

base64 is used for data and other text that was never meant to be read by humans -- or must be preserved verbatim. Every 3 octets (24 bits) are encoded into a 4-character sequence. The 64-character set was chosen carefully. It comes from ASCII characters that aren't munged by known gateways or transfer systems.


What does base64 encoding do?

Here is something nice found in "comp.mail.mime":

A real quick and simple explanation for Base64 encoding.

Basically you are converting 8-bit binary data to text that is safe on 7-bit ASCII channels.

The way it's done is very simple. String all the 8-bit bytes together as one long number. Then count 6 bits over from the left and turn that value (0 to 63) into a character from the 64-character Base64 alphabet. Then count another 6 bits over and do the same. Every 3 bytes (24 bits) therefore become 4 characters. When you get to the end of this huge number you may not have exactly 6 bits left, so the last group is padded with zero bits, and "=" characters are added to bring the output up to a multiple of 4 characters. That is why you'll sometimes notice "=" or "==" at the end of a Base64 encoding.

Example:

original binary file in hex: CD 4C 6F

binary representation: 11001101 01001100 01101111

Now for the encoding.
group in 6 bits:
110011 010100 110001 101111

decimal:
51 20 49 47

Now look up each value in the Base64 alphabet (0-25 are A-Z, 26-51 are a-z, 52-61 are 0-9, 62 is "+", 63 is "/") and you are done: the result is "zUxv". Remember you must group as exactly 6 bits, and when you reach the end you'll need to pad bits on the end to make the last grouping 6 bits.

Do the same process in reverse to convert back: turn each character back into its 6-bit value, string the bits together, and regroup them in 8 bits. Any padding bits left over at the end are thrown away.
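
The MIME extract quoted earlier says that every 3 octets (24 bits) become a 4-character sequence, which works out to 6 bits per character. The following Python sketch does the grouping by hand for illustration; in practice one would simply call the standard library's base64.b64encode. The function name encode_by_hand is made up.

```python
import base64

# The 64-character Base64 alphabet: values 0-25, 26-51, 52-61, 62, 63.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_by_hand(data):
    """Base64-encode bytes by explicit 6-bit grouping (for illustration)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 6)  # pad the last group with zero bits
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    chars += "=" * (-len(chars) % 4)  # pad the output to a multiple of 4
    return "".join(chars)


# The three bytes CD 4C 6F, encoded both ways.
by_hand = encode_by_hand(bytes([0xCD, 0x4C, 0x6F]))
by_library = base64.b64encode(bytes([0xCD, 0x4C, 0x6F])).decode("ascii")
```

Both by_hand and by_library come out as "zUxv" for these three bytes.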


Problem Lines in PostScript

I keep a file with notes on lines found in PostScript files that have given me problems. This is a simple text file. If it is somewhat unintelligible, please complain and I'll improve it. That way I'll know people are referencing it.

Some of these lines gave problems with an older version of Ghostview on the FNALU cluster. A newer version of Ghostview has been installed since then that may be more forgiving (GSview on \\numiwinctr1 didn't have these problems). Some of the other problem lines are those that are illegal for inclusion in EPS files. I don't attempt to identify which is which in this file of informal notes.


Bulletins

3/12/99, Adobe URL of interest

This location gives advice on what to do if you are having problems with viewing PDF files inside your Web Browser Window.

3/22/99, PostScript Advice from Randy Herber

Jean Slisz, Publications Office, made me aware of a document containing useful advice regarding PostScript (posted on a CDF web server).

5/7/99, PDF --> PS --> EPSF --> PPT

Stan got some PDF files with Super-K results, which were converted to PS format. He wanted to use them in a MS PowerPoint presentation (for the 5/18/99 DOE NuMI/MINOS Review).

The PS files were further converted to EPS, and a TIFF preview was inserted (by using GSview). In this form each file had the required binary header for an EPS file, and contained both a preview bitmap for screen display in MS Windows and the PostScript code to be interpreted when printed. When the EPS file was imported as a picture file into a MS PowerPoint slide, the preview was okay on the screen, but the PostScript code contained in the EPS file didn't print as desired on WH12W_HP5M (a queue using a PostScript driver).

After some investigation the problem was narrowed down to the presence of an illegal PostScript operator--initgraphics (illegal in an EPS file). This had the effect of re-initializing the PostScript interpreter's graphics state, and thus negated all of the resizing and positioning commands that MS PowerPoint had inserted ahead of the EPS file, in the larger PostScript file.
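
A file can be checked for such operators before it is used as EPS. The Python sketch below is a rough aid, not a PostScript parser: the operator list is the set that, as I recall, the EPSF specification says must not be used (check it against the specification), and the scan ignores everything after a "%" on a line, so a "%" inside a string will confuse it.

```python
import re

# Operators an EPS file must not use, as recalled from the EPSF
# specification. Treat this list as an assumption and verify it.
FORBIDDEN = {
    "banddevice", "clear", "cleardictstack", "copypage", "erasepage",
    "exitserver", "framedevice", "grestoreall", "initclip", "initgraphics",
    "initmatrix", "quit", "renderbands", "setglobal", "setpagedevice",
    "setshared", "startjob",
}

# A token is a run of characters without whitespace or PostScript
# delimiters, optionally preceded by the "/" of a literal name.
TOKEN = re.compile(r"/?[^\s()<>\[\]{}/%]+")


def find_forbidden(eps_text):
    """Return (line number, operator) pairs for forbidden operators found."""
    hits = []
    for lineno, line in enumerate(eps_text.splitlines(), start=1):
        code = line.split("%", 1)[0]  # crude comment removal
        for token in TOKEN.findall(code):
            if token in FORBIDDEN:  # literal names such as /quit do not match
                hits.append((lineno, token))
    return hits
```

An empty result does not prove the file is legal EPS; it only means that none of the listed names appears as an executable token.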

The lesson is to remove the illegal operators from the PS or EPS file, before adding the preview. After adding the preview, it gets much harder to remove illegal operators, since then one has to modify the binary header (to readjust the pointer to the start of the TIFF preview, and readjust the parameter giving the length of the PostScript code in the EPS file).
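
To see which numbers would have to be readjusted, the binary header can be decoded. The sketch below follows the 30-byte layout given in the EPSF specification for EPS files with a preview, as I understand it: four magic bytes, then little-endian start and length pairs for the PostScript, Metafile, and TIFF sections, then a 2-byte checksum. The function and key names are made up for illustration.

```python
import struct

# 4 magic bytes, six unsigned 32-bit values, one unsigned 16-bit checksum,
# all little-endian with no padding: 30 bytes in total.
HEADER = struct.Struct("<4sIIIIIIH")
MAGIC = b"\xc5\xd0\xd3\xc6"


def read_eps_header(data):
    """Decode the binary EPS header, or return None if there is none."""
    if len(data) < HEADER.size or data[:4] != MAGIC:
        return None  # a plain-text EPS file starts with "%!" instead
    (_, ps_start, ps_length, wmf_start, wmf_length,
     tiff_start, tiff_length, checksum) = HEADER.unpack_from(data)
    return {
        "ps_start": ps_start, "ps_length": ps_length,
        "wmf_start": wmf_start, "wmf_length": wmf_length,
        "tiff_start": tiff_start, "tiff_length": tiff_length,
        "checksum": checksum,
    }
```

If the PostScript section is edited so that its length changes, ps_length has to be updated, and so does tiff_start when the TIFF preview is stored after the PostScript.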

A further confusion during this episode was that the slides with the inserted EPS files printed okay on the WH12W_HP4P printer. This printer used the preview bitmap, not the PostScript instructions from the EPSF file. A test of "print to file" with this printer chosen as the printer for printing a slide from PPT indicated that the file generated was in the HP PCL language. There wasn't a hint of PostScript in the file. Since the printer for making color transparencies is a PostScript printer, it doesn't help that a slide prints okay on a printer that isn't using PostScript (by using the eps preview bitmap).


5/20/99, Inserting EPS from MS Project into PPT

Sam Childress wanted to put a Gantt chart from MS Project into a MS PowerPoint slide, showing installation of the absorber and near detector on the same schedule. We tried doing it with an eps file from MS Project. As I recall, we first tried using Landscape orientation for the print file. When we scaled the eps file beyond a certain shrinkage value, the PostScript interpreter reported an illegal command and we couldn't get the shrinkage factor we wanted. We managed to get around this by using Portrait orientation for the generation of the eps file, and by using clipping operators in PostScript to remove the parts of the image that we didn't want.

Further investigation showed that the problem with too big a shrinkage factor for the Landscape orientation was with the generation of the patterns that were in the bars on the Gantt chart. A transformation matrix and its inverse are used to get these patterns right. Beyond a certain shrinkage factor, integer division round-off was producing an illegal transformation matrix. With this knowledge one could finesse the original problem by fiddling with the division of the integers. The details are given below.

In the section that starts with the DSC comment
     
     %%BeginResource: file Adobe_WinNT_Pattern 2.0 0
     
the following code snippet was the problem:
     
     GDIBWPatternDict begin Width Height
     end dsnap scale
     
A change as follows would fix this problem (but would probably not work at the original scale):
     
     GDIBWPatternDict begin Width Height
     end
     %Extra Step here to avoid bad CTM for makepattern
     %otherwise get bad scale factors from dsnap
     2 add exch 2 add exch
     dsnap 
     scale

12/15/99, Invisible eps file in MS PowerPoint

I had the experience of being unable to extract a table from a LaTeX document as an Encapsulated PostScript file (eps), insert it into a MS PowerPoint 97 SR-1 slide, and have it appear on the printed output. Nothing from the eps file was visible when I printed--either to a printer or to a print file (when viewed with GSview). I made the Encapsulated PostScript file using dvips with option -E.

I also tried inserting the eps file into MS Word, and it didn't print there either. Having or not having a wmf preview inserted by GSview didn't matter--for either application.

Upon further investigation, in MS PowerPoint 97 SR-1 the PostScript that gets generated when printing has an insert of PostScript code--ahead of the eps file--that positions and scales (and, in principle, crops) the eps file. In this insert there is a line of PostScript code that sets the grey level to white. My guess is that this line has the purpose of creating a clean background for the inserted eps file. The PostScript code insert fails to save the graphics state ahead of this line (the one that sets the grey level to white). The consequence is that when the insert later restores the graphics state, the grey level stays set to white. The eps file generated by dvips -E does not set the grey level to black, so the image that the eps file paints on the page is white and therefore invisible against the white background.

The solution is to insert a line into the eps file (generated by dvips -E) that sets the grey level to black. After doing this, the eps insertion into MS PowerPoint is visible when printed. After insertion the local part of the eps file looks as follows:

%%EndProlog
%%BeginSetup
%%Feature: *Resolution 600dpi
%%EndSetup
TeXDict begin
0 setgray
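
The insertion shown above can be automated when there are many files to fix. This Python sketch assumes that the file contains a line reading exactly "TeXDict begin" somewhere after "%%EndSetup", as in the extract; if no such line is found, the text is returned unchanged. The function name is invented.

```python
def add_setgray(eps_text):
    """Insert "0 setgray" after the first "TeXDict begin" following %%EndSetup."""
    lines = eps_text.splitlines(keepends=True)
    seen_end_setup = False
    for i, line in enumerate(lines):
        if line.startswith("%%EndSetup"):
            seen_end_setup = True
        elif seen_end_setup and line.strip() == "TeXDict begin":
            if not line.endswith("\n"):
                lines[i] = line + "\n"
            lines.insert(i + 1, "0 setgray\n")
            break  # only the first matching line is changed
    return "".join(lines)
```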

In contrast to the extraction from LaTeX, printing from Canvas on my Mac (with the Adobe PSPrinter 8.3.1 printer driver) does set the grey level to black inside the generated eps print file; that is why an eps file from a Canvas drawing works when inserted into MS PowerPoint.


3/1/00, Faint drawings from IHEP

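IHEP sent us PostScript files in which some drawings were rather faint. These seemed to be LaTeX-produced files with embedded eps drawings from AutoCAD. A cure was to replace "0 setlinewidth" in the prolog part of these eps files with "0.5 setlinewidth". (A line width of 0 asks for the thinnest line the device can render, which is a single device pixel and therefore very faint on a high-resolution printer.)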
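
The replacement can be done with any text editor. As a sketch for doing it on several files, the following Python function replaces a line width of exactly 0 and leaves values such as 10 or 1.0 alone. The function name and the default of 0.5 are illustrative choices.

```python
import re

# A "0" that is not the tail of a longer number (such as 10 or 1.0),
# followed by whitespace and the setlinewidth operator.
ZERO_WIDTH = re.compile(r"(?<![\w.])0(\s+setlinewidth)\b")


def thicken_hairlines(ps_text, width="0.5"):
    """Replace "0 setlinewidth" with "<width> setlinewidth"."""
    return ZERO_WIDTH.sub(lambda match: width + match.group(1), ps_text)
```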


4/11/00, another problem with MS PPT and gray level

As in the 12/15/99 bulletin, I had another problem with the gray level not being set ahead of an eps insert in MS PPT. This time it was a PS file generated by Softdesk's Quick CAD. A part of the drawing was missing. I ended up going into the PS file and adding a setting of the gray level at the beginning. An extract is shown below:

NTPSOct95 begin
%%Page: 1 1
NTPSOct95 /PageSV save put
11.32 776.219 translate 72 600 div dup neg scale
0 0 transform .25 add round .25 sub exch .25 add round .25 sub exch
itransform translate
1 sl
0 g
n

My insert was the line "0 g". Earlier "g" had been defined to be "setgray". This fixed the problem.


6/2/00, a problem with insertion of eps file into MS Word

This was a problem with a 3 page MS Word document, with an eps file inserted on page 2. GSview for Windows version 2.5 (or version 2.9) acted as if there weren't any page 3. The problem also occurred in a similar fashion with MS PowerPoint. The problem did not occur with Ghostview on Unix node Fsui03.

The problem was traced to two things: the lack of the DSC comments "%%BeginDocument:" and "%%EndDocument" surrounding the eps file (see URL PostScript Language Document Structuring Conventions Specification), and the comment line

%MSEPS Preamble [Softek v3.6]

that is inserted ahead of the eps file. Putting in the DSC comments and adding a space between "%" and "M" in the "MSEPS Preamble" comment line fixed the problem.
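
Both parts of this fix are simple text edits. The Python sketch below shows them as two small helper functions with invented names; finding where the eps file starts and ends inside the larger print file is left to the reader, since that depends on the application that wrote the file.

```python
import re


def wrap_embedded_eps(eps_text, name):
    """Surround an eps file's text with the DSC document comments."""
    body = eps_text if eps_text.endswith("\n") else eps_text + "\n"
    return "%%BeginDocument: " + name + "\n" + body + "%%EndDocument\n"


def fix_mseps_comment(ps_text):
    """Put a space between "%" and "MSEPS" at the start of a line."""
    return re.sub(r"(?m)^%MSEPS", "% MSEPS", ps_text)
```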

This fix did not strictly obey the following stricture from the DSC document, since the eps file wanted a font that the enclosing document hadn't asked for. That didn't seem to matter.

Note: All feature and resource requirements of an included (child) document should be inherited by the including (parent) document. For example, if a child document needs the StoneSerif font resource, this must be reflected in the %%DocumentNeededResources: comment of the parent. This is necessary so document managers can examine the top level header of any document and know all resources and features that are required.

The specification for eps can be found at URL spec for eps. This and the DSC specification were taken from the Adobe web site. Adobe keeps moving them, so I have put them here for easy reference.


12/6/00, Another insertion problem in MS PPT

This was a problem of inserting an eps file with preview into MS PPT 2000 SR-1. The insertion procedure got mixed up as to what it was doing. The problem is documented in a PDF file.
Comments to: Alan Wehmann (wehmann@fnal.gov)
Last modified: Wednesday, 12/6/00