Hi pdfTeX fans,

first, please excuse the long e-mail. Here is some news on JBIG2 file inclusion with pdfTeX. I have put the newest experimental pdfTeX driver on my homepage, along with a PDF file containing the datastream example from the JBIG2 standard.

Status of Experimental JBIG2 Driver
-----------------------------------

Multiple JBIG2 images from a given JBIG2 file can be selected, but only one per call, e.g.:

  \pdfximage page 1 {foo.jb2}
  \pdfximage page 2 {foo.jb2}
  \pdfximage page 3 {foo.jb2}

In this case the page 0 object is stored only once. JBIG2 compression pays off best when ALL images from a given JBIG2 file are included. Giving page/width together does not work; see my other e-mail.

The newest xpdf 2.0 can nicely display and print the PDF file generated from the full datastream example file (all three images!) from Annex H of the JBIG2 draft. Wow, how did they do it? But my Acrobat Reader ((R) by Adobe)

  x86 linux 5.0.5 Apr 25 2002 11:55:36

crashes already at page/image 2, saying just `Abgebrochen' ("Aborted") :-( So most likely I have done something wrong. But where?

Some Remarks on JBIG2 Multiple Image Inclusion in pdfTeX
--------------------------------------------------------

Including multiple images from the same JBIG2 file brings some conceptual complications (similar problems might be known from PDF inclusion). The example below shows how a JBIG2 file is organized (I hope my understanding is right): each digit denotes one segment and gives the number of the page it belongs to, EOP is an end-of-page flag, and EOF is the end-of-file flag. As I understand it, a segment X of a page N requires all info from every page 0 segment up to segment number X. A JBIG2 file might look like:

  00010111111EOP(1)22222220022EOP(2)0333303EOP(3)EOF

So if one wants to include the image from page 1, one needs all the page 0 segments up to EOP(1). If one also wants page 2, one needs the additional page 0 segments up to EOP(2). But the required page 0 info has to be written out as a PDF object already when the first page is written.
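To make the rule "page N needs every page 0 segment before EOP(N)" concrete, here is a minimal Python sketch that works on the toy notation from the example, not on real JBIG2 data. It is not the C code of the experimental driver; the names tokenize() and page0_needed() are made up for this illustration, and real segment headers are not parsed at all.

```python
import re

# Toy stream from the example: each digit is one segment tagged with its
# page number; EOP(n) closes page n and EOF ends the file.
STREAM = "00010111111EOP(1)22222220022EOP(2)0333303EOP(3)EOF"


def tokenize(stream):
    """Split the toy notation into ('seg', page), ('eop', page), ('eof', None)."""
    tokens = []
    for m in re.finditer(r"EOP\((\d+)\)|EOF|(\d)", stream):
        if m.group(1) is not None:
            tokens.append(("eop", int(m.group(1))))
        elif m.group(2) is not None:
            tokens.append(("seg", int(m.group(2))))
        else:
            tokens.append(("eof", None))
    return tokens


def page0_needed(tokens, page):
    """Return the token positions of all page 0 segments before EOP(page)."""
    needed = []
    for pos, (kind, value) in enumerate(tokens):
        if kind == "seg" and value == 0:
            needed.append(pos)
        elif kind == "eop" and value == page:
            return needed
    raise ValueError(f"page {page} not found in stream")


if __name__ == "__main__":
    toks = tokenize(STREAM)
    for n in (1, 2, 3):
        print(n, len(page0_needed(toks, n)))
```

For the example stream this counts 4, 6 and 8 page 0 segments for pages 1, 2 and 3, so each later page needs a strictly larger page 0 prefix than the one before.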
The PDF definition seems to require that all page 0 info is collected in ONE PDF stream. (Or can one define a continuation stream in PDF?) This would mean that a later page written additionally (e.g. page 2) has to be accompanied by its own page 0 information as well, which wastes space in the PDF file, since part of that page 0 info has already been written before. What to do?

(1) Accompany every page N with all its page 0 information up to the EOP(N) flag. This is straightforward, but it increases the size of the PDF file, as the same page 0 information is included several times. So it voids part of the JBIG2 compression advantage.

(2) Scan once over the whole file and make one big page 0 object, then reference it from every included page. This gives relatively small files if multiple images from the same file are included, but it might increase the PDF file size if only one image from the JBIG2 file is used.

(3) Plan ahead, i.e. know in advance which images (e.g. up to which page) will be included from a given JBIG2 file. Then the included page 0 data could be kept at a minimum: just take everything up to the maximum image number. Or use a real JBIG2 decoder (xpdf) to cleverly decide which segments are really needed for the image subset. The decoder would have almost nothing to decode (the UNdecoded JBIG2 stream is included); it would only have to decide... :-)

How to remember that page 0 has already been written out for a given image? It seems that once an image object is written out, its img_name(img) and other structure info are forgotten. As a simple kludge, I remember it in a static (fixed-size, booo!) string storage within the write_jbig2() function. And _all_ page 0 information from the JBIG2 file is written, see case (2) above.

Inclusion of multiple images is slow on the JBIG2 reading side, as the JBIG2 file is scanned afresh from the beginning for every new image. There is no data structure optimizing this.
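To put rough numbers on options (1) to (3), the following Python sketch counts how many page 0 segments each option would write for the toy stream 00010111111EOP(1)22222220022EOP(2)0333303EOP(3)EOF. The counts in the table are taken by hand from that example, the function names are invented for this illustration, and segment counts are only a stand-in for real byte sizes.

```python
# Page 0 segments found before each end-of-page flag in the toy stream
# 00010111111EOP(1)22222220022EOP(2)0333303EOP(3)EOF (counted by hand).
PAGE0_BEFORE_EOP = {1: 4, 2: 6, 3: 8}
PAGE0_TOTAL = 8  # in this toy stream no page 0 segment follows EOP(3)


def cost_per_page_copy(pages):
    """Option (1): every page carries its own copy of page 0 up to EOP(N)."""
    return sum(PAGE0_BEFORE_EOP[p] for p in set(pages))


def cost_whole_file(pages):
    """Option (2): one shared object holding all page 0 segments of the file."""
    return PAGE0_TOTAL if pages else 0


def cost_planned(pages):
    """Option (3), simple variant: one shared object up to the highest page."""
    return PAGE0_BEFORE_EOP[max(pages)] if pages else 0


if __name__ == "__main__":
    for wanted in ([1], [1, 2], [1, 2, 3]):
        print(wanted,
              cost_per_page_copy(wanted),
              cost_whole_file(wanted),
              cost_planned(wanted))
```

With only page 1 included, options (1) and (3) write 4 page 0 segments while option (2) writes all 8. With all three pages included, option (1) grows to 18 while (2) and (3) both stay at 8, which is the trade-off described in the list.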
If one knew beforehand which images in total to include from a JBIG2 file, things could be optimized. If a TeX data structure telling about the pages (e.g. page 2,3-5,12) were available to the C side, one could include only the page 0 info that is actually required. If the pages were sorted in ascending order, one could loop through the pages within the write_jbig2() function with high JBIG2 reading speed.

All this is far from production use, obviously with errors, and only for experimentation. I haven't looked into the xpdf sources yet (sorry, they are still too complicated for me). Is there anybody out there who can give me some hints about the question marks above, and about what to do next?

Have fun!

Greetings
Hartmut

------------------------------------------------------------------------
Dr.-Ing. Hartmut Henkel
In den Auwiesen 6, D-68723 Oftersheim, Germany
E-Mail: hartmut_henkel@gmx.de
http://www.circuitwizard.de
------------------------------------------------------------------------