Version 0.50, Rich Franzen, August 1998
NOTE: this page and PNG16 Technical have cross-linked sections.
NOTE: The PNG-16 image format is neither sponsored nor approved by the PNG Development Group. It has no standing with the W3C.
Why 16 bits? Well, 8 bits per pixel is inadequate for full color imagery, and 24 bits (8 bits each for red, green, and blue) is more than is necessary. And if an image 200 kbytes in size is indistinguishable from one 300 kbytes in size (uncompressed comparison), why not use the smaller one? Also 16 bits is a natural size for computer hardware to work with. The full pixel integrity is within the one entity, rather than handled as 3 separate color channel bytes.
The primary proposal is for the pixels to be stored in saturation, intensity, and hue form. This format will be tentatively called fc16 (full color 16). Saturation is the degree of color intensity; grey has no saturation, and very garish colors have full saturation. Intensity is the degree of brightness involved; black has no intensity, and "as bright as your monitor gets" has full intensity. Finally, hue is the color component: red, yellow, burgundy, etc. The 16 bits of a pixel would be divided as 4 bits for saturation, 6 bits for intensity, and 6 bits for hue. As we will show, the bit fields have s.i.h order for a very good reason.
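As a rough illustration of the 4/6/6 split, here is a minimal Python sketch. It assumes that "s.i.h order" means saturation sits in the most significant bits, followed by intensity and then hue. The function names are made up for this example, and the sketch ignores every special code range described later on this page.

```python
# Hypothetical fc16 field layout: ssss iiiiii hhhhhh
# (assumption: saturation occupies the top bits, hue the bottom bits).
SAT_BITS, INT_BITS, HUE_BITS = 4, 6, 6


def pack_sih(saturation: int, intensity: int, hue: int) -> int:
    """Pack raw field values into one 16-bit word."""
    if not 0 <= saturation < (1 << SAT_BITS):
        raise ValueError("saturation must fit in 4 bits")
    if not 0 <= intensity < (1 << INT_BITS):
        raise ValueError("intensity must fit in 6 bits")
    if not 0 <= hue < (1 << HUE_BITS):
        raise ValueError("hue must fit in 6 bits")
    return (saturation << (INT_BITS + HUE_BITS)) | (intensity << HUE_BITS) | hue


def unpack_sih(pixel: int) -> tuple[int, int, int]:
    """Split a 16-bit word back into (saturation, intensity, hue)."""
    saturation = (pixel >> (INT_BITS + HUE_BITS)) & ((1 << SAT_BITS) - 1)
    intensity = (pixel >> HUE_BITS) & ((1 << INT_BITS) - 1)
    hue = pixel & ((1 << HUE_BITS) - 1)
    return saturation, intensity, hue
```

With this layout the whole pixel is a single 16-bit integer, so one shift and one mask recover each field.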
For those completely uncomfortable with working outside of rgb color space, a secondary format, mc16 (many color 16), is proposed. It has 5 bits each for red, green, and blue, plus an intensity bit. If the image were being converted to mc16 from 24-bit rgb, the i-bit would be calculated democratically, giving close to the color and intensity resolution of an 18-bit rgb image.
For those who need some color, but with deep, deep, deep grey, there is a third format, gc16 (grey with color).  This provides a 15-bit greyscale (32768 shades) along with 15-bit RGB color.
And for those who want absolute control of which colors they get from either 24-bit or 48-bit RGB space, there is even a fourth format, ic16 (indexed color). This provides up to 65536 precise colors from a palette of, ummm, 16 million times 16 million colors.
You can stop imagining. The fc16 format delivers it all -- and with an appearance that is indistinguishable from a 24-bit image. By indistinguishable, I mean that if a true color image were displayed on three different monitors, two of them showing it in 24-bit form, and one in fc16 form, then the vast majority of people could not reliably identify the one that was "different". I.e. the inherent difference between monitors exceeds any loss as the image was mapped to 16 bits. (If you have any technical experience working with true color imagery, you probably will not believe this. Yet. Please read on...)
The fc16 format has lots of bells and whistles. But at its core, it provides great true color representation for imagery. Even with all its greys and special effects, there are still 60,480 permanent, defined colors. (Go see them now!)
If you do not think this is enough, think again. The Amiga computer was renowned for its graphical capabilities, and it only had a palette of 4096 colors to choose from. Similarly, in the early 1980s, Spatial Data Systems had the VDI (Video Display Interface) that offered high-end, professional, true color displays in 14-bit IHS space (a limited form of the 16-bit SIH being proposed herein). The VDI had 6 bits of intensity, 5 bits of hue, and 3 bits of saturation. (It used the two remaining bits for a cursor overlay.) So true color has already been offered commercially at something other than 24 bits per pixel. And only modern high-end 32-bit (24-bit color with an 8-bit alpha channel) graphics systems offer more capability than fc16.
One of the cool features that PNG already has is multiple levels of transparency. I.e., if a PNG image is sitting on top of another image (web page backdrop, for example) it can show the underlying image through. And it can darken the underlying image as well, in an effect similar to looking at people through the glass in their car window. However, regular PNG transparency is achieved via a separate overlay channel.
Why not take 192 of the black magic colors and extend the concept? We could include highlighting (making the underlying image look brighter) and intensity inversion (making it a photographic negative). The large number of levels is chosen so that the intensity of the underlying image can be modified from -200% to +280% in 2.5% increments. 100% transparency would be the unmodified backdrop pixel, and -100% would be the photographic negative of the backdrop at its original intensity. Since 0% would be completely opaque (black), it is not necessary to have a 193rd value for this.
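The count of 192 can be checked by listing the levels. The sketch below only enumerates the percentages. How each one would be assigned to a pixel code is not specified here, so no code assignment is attempted, and the function name is invented for the example.

```python
STEP = 2.5                  # percent per level, from the text
LOW, HIGH = -200.0, 280.0   # range of intensity modification, from the text


def transparency_levels() -> list[float]:
    """List every modification percentage except 0% (fully opaque black)."""
    count = int((HIGH - LOW) / STEP) + 1            # 193 values including 0%
    levels = [LOW + k * STEP for k in range(count)]
    return [p for p in levels if p != 0.0]          # 0% needs no code of its own


levels = transparency_levels()
print(len(levels), levels[0], levels[-1])   # 192 -200.0 280.0
```

Multiples of 2.5 are exact in binary floating point at this scale, so the comparison against 0.0 is safe here.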
Note that the normal PNG overlay channel would still be available, but it would seldom be necessary in the PNG-16 paradigm. When an actual blend of the backdrop image and the overlaying PNG image is desired, then of course the normal PNG overlay feature would be used.
256 more black magic colors would be allocated for color cycling, in 8 cycling groups of 32 colors each. Additionally, each cycling group could contain up to 8 sets of 32 colors. This effect can be used to make water (or wine) ripple, flames waver, or Jack both in and out of the box. (Yes, with fc16, you can have your cake and eat it too!) It is a very limited form of animation, but one that is useful in many cases. Each cycling group would have two time values associated with it, from 0 to 25.5 seconds. One of the timers would be the cycle-timer -- how long to wait before the next set of colors load. The other time would be the offset timer -- how long to wait from the initial display of the image until color cycling begins. Note that the colors used for cycling come from the entire palette of 65536 colors, even allowing, crazy fool that I am, color cycling of color-cycled colors!
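To make the two timers concrete, here is a small Python sketch of how a viewer might pick the active color set for one cycling group. Several things in it are assumptions rather than part of the proposal: that each timer is one byte counting tenths of a second (which would give the 0 to 25.5 second range), that a cycle-timer of 0 means "do not cycle", and that the first switch happens one full cycle period after the offset expires. The function name is hypothetical.

```python
GROUPS = 8              # cycling groups, from the text
COLORS_PER_GROUP = 32   # colors in each group, from the text
MAX_SETS = 8            # color sets a group may hold, from the text
TICKS_PER_SECOND = 10   # assumption: timers count tenths of a second (0..255)


def active_set(elapsed_ticks: int, offset_ticks: int,
               cycle_ticks: int, n_sets: int) -> int:
    """Return the 0-based color set a group shows after elapsed_ticks."""
    if not 1 <= n_sets <= MAX_SETS:
        raise ValueError("a cycling group holds between 1 and 8 sets")
    if not (0 <= offset_ticks <= 255 and 0 <= cycle_ticks <= 255):
        raise ValueError("timers must fit in one byte")
    # Before the offset expires (or with cycling disabled) show the first set.
    if cycle_ticks == 0 or elapsed_ticks < offset_ticks:
        return 0
    return ((elapsed_ticks - offset_ticks) // cycle_ticks) % n_sets
```

For example, with a 1.0 second offset, a 0.5 second cycle, and 4 sets, active_set(15, 10, 5, 4) selects set 1, and the sequence wraps back to set 0 at tick 30.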
Some would argue that not defining these 256 values would lead to non-portable images. They would be correct; images that used these values would not look the same on some other vendor's display. However, images which used only the 65,280 defined values would be portable.
So let us make a trade-off; we will give up some levels of saturation in the dark for more total intensity levels at the brighter end. Intensities 1 through 14 are compressed into half their normal space by giving up levels of saturation. Intensity 1 is only given 1 saturation level (full saturation), intensity 2 is given 2 saturation levels (full and half-saturated), intensity 3 is given 3 saturation levels (full, 2/3, and 1/3), etc. Finally, at intensity 14, there are 14 of the 15 available saturation levels, and all higher intensities have the full 15 levels.
With this pattern, exactly 7 more total intensities are gained, with each one of the extra intensities having its full complement of hues and saturations. Instead of the 63 intensity levels mentioned above, now there are 70.
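The "exactly 7" figure follows from simple counting, which the short Python check below spells out. It assumes, as the text above implies, that a normal intensity occupies 15 saturation slots. The function name is only for illustration.

```python
FULL_SATURATIONS = 15   # saturation levels available at a normal intensity


def shadow_soup_gain(compressed_intensities: int = 14) -> tuple[int, int, int]:
    """Return (saturation slots used, full-intensity slots used, intensities gained)."""
    # Intensity n keeps n saturation levels, for n = 1..14.
    slots_used = sum(range(1, compressed_intensities + 1))
    full_slots, remainder = divmod(slots_used, FULL_SATURATIONS)
    assert remainder == 0, "the pattern should pack into whole intensity slots"
    return slots_used, full_slots, compressed_intensities - full_slots


slots, full, gained = shadow_soup_gain()
print(slots, full, gained, 63 + gained)   # 105 7 7 70
```

The 14 dark intensities need 1 + 2 + ... + 14 = 105 saturation slots, which is the space of 7 full intensities, so 7 slots are freed and 63 becomes 70.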
Since Shadow Soup reduced the number of saturation levels at the dark end, it would have been silly to have the Pale Parables add levels back. So something better was done -- one and a half new saturation levels were created!
What is "half of an intensity"? Good question. Intensity 1 was virtually black, and yet it contained 64 hues. So 12 hues were removed from it, and these formed intensity ½. If you think in terms of the 256 intensities of an 8-bit byte, then SIH intensity ½ corresponds to byte intensities 2 and 3. SIH intensity 1 corresponds to byte intensities 4, 5, and 6. From there on up there are either 3 or 4 byte intensities for each SIH intensity.
So the combination of condensed Shadow Soup and the Pale Parables increases the number of SIH intensities from 64 to 75.5. This is getting really close to the number of levels your eyes will allow you to discern. Check out the SIH Wheel and see for yourself.
In late September 1998, I wrote the SIH Wheel Java applet to see what fc16 (also called "SIH colorspace") actually looks like. I found that the green and yellow domains were very smooth, but the other 4 domains had visible differences between most adjacent hues and saturations. So I revised fc16 to eliminate the indexed colors, which now permanently augment red and cyan. I also admitted to myself that most images do not need the capability of deep grey. Thus there is now an option to use the 12-bit greyscale slots for augmenting the blue and magenta domains.
Even when this option is chosen, there still remain 256 levels of grey, which is all most image sources will actually contain. This augmentation increases the number of static colors to 64,260.
"Wait a minute--if I have a 24-bit graphics board, what good is fc16?" A fair question, but maybe the section on black magic put you to sleep. To begin with, disk storage is reduced (by 1/3 for uncompressed images, and somewhat less if comparing compressed images) and time to transfer the image over the net is also reduced. Additionally, the fc16 image is indistinguishable from its 24-bit equivalent. Except, of course for the features it offers that 24-bit does not.
There is a good analogy in musical CD players. Although most people do not really know what it means, vendors advertise an ability known as oversampling (e.g. "with 2x oversampling" or "3x..."). One way to look at this is that the CD player can produce more beats per second than the music file actually contains. It blends these extra beats with the actual beats to produce a fuller sound. No one insists on music files that contain the extra beats; it is not necessary, and there would be less music per CD. Think of your 24-bit display as providing color oversampling to the fc16 imagery. The "right" 64,576 colors have already been chosen.
As stated earlier, the pixel bits are defined as rrrrrgggggibbbbb. This is five bits each for red, green, and blue, plus an intensity bit. Thus each of the 32,768 unique shades can be represented at two intensity levels. The intensity bit would be common to all three color components, almost providing 6 bits per color component. In fact, the image differs from an 18-bit image by no more than one intensity level of one color component. That is, starting from an 18-bit RGB pixel, if 0 or 1 of the color components wanted the low-order bit ("i") set, it would not be set. But if 2 or all 3 of the color components wanted it, then it would be set.
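The "democratic" vote can be sketched in a few lines of Python. This assumes the 24-bit source is first reduced to 18-bit RGB by dropping the two low bits of each component. The function names are invented for this example.

```python
def rgb24_to_mc16(r: int, g: int, b: int) -> int:
    """Pack 8-bit r, g, b into rrrrrgggggibbbbb with a majority-vote i bit."""
    r6, g6, b6 = r >> 2, g >> 2, b >> 2        # 18-bit RGB: 6 bits per component
    votes = (r6 & 1) + (g6 & 1) + (b6 & 1)     # components wanting the low bit set
    i = 1 if votes >= 2 else 0                 # 2 or 3 votes set the shared bit
    return ((r6 >> 1) << 11) | ((g6 >> 1) << 6) | (i << 5) | (b6 >> 1)


def mc16_to_rgb18(pixel: int) -> tuple[int, int, int]:
    """Expand an mc16 pixel to 6-bit components by sharing the i bit."""
    i = (pixel >> 5) & 1
    r5 = (pixel >> 11) & 0x1F
    g5 = (pixel >> 6) & 0x1F
    b5 = pixel & 0x1F
    return (r5 << 1) | i, (g5 << 1) | i, (b5 << 1) | i
```

Because the i bit follows the majority, at most one component ends up with a low-order bit it did not ask for, which is where the "one intensity level of one color component" bound comes from.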
The i bit is placed between green and blue for a reason. Many hardware displays have a 16-bit pixel option, and this is the bit order used by them. Well, except that i is a sixth green bit. It would be stupid to be so close to an existing hardware format and not support it directly. Thus an option exists to treat i as green, and have a plain five bits for red and blue.
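Read with the i bit as green, the same 16 bits become the familiar 5-6-5 layout. The Python sketch below unpacks a pixel that way. The expansion to 8 bits per component by bit replication is a common technique added here for illustration, not something this proposal specifies, and the function names are hypothetical.

```python
def unpack_565(pixel: int) -> tuple[int, int, int]:
    """Read rrrrrgggggibbbbb with i treated as the sixth (lowest) green bit."""
    r5 = (pixel >> 11) & 0x1F
    g6 = (pixel >> 5) & 0x3F    # five green bits plus i as one 6-bit field
    b5 = pixel & 0x1F
    return r5, g6, b5


def to_rgb24(pixel: int) -> tuple[int, int, int]:
    """Expand to 8 bits per component by replicating the high bits (assumption)."""
    r5, g6, b5 = unpack_565(pixel)
    return (r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2)
```

In this mode a pixel with only the i bit set, 0x0020, is simply the darkest non-black green.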
Ignoring the extra features, what makes fc16 superior? Basically, the precision of intensity. Studies have shown that people can discern more than 64 intensity levels on good monitors, but fewer than 128. The fc16 format provides 70 for any given color, and 4096 for grey. Since mc16 provides somewhere between 32 and 64 intensity levels for any given shade of color (for many colors even fewer!), it cannot make the claim of indistinguishability I make for fc16.
Still, 18-bit RGB does a pretty good job of showing full color imagery, and mc16 comes very close in effecting the 18-bit RGB quality. If your thought processes demand that you think of color in terms of what goes out the red, green, and blue wires of your monitor cable, then mc16 is a reasonable choice.
The bit ordering in this case is a color-selection bit as the low-order bit, preceded either by 15 bits of grey or 5.5.5 rgb color.
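A decoder for this layout might look like the Python sketch below. Two details are assumptions, since the text does not state them: that a selection bit of 0 means grey and 1 means color, and that the 5.5.5 field is ordered red, green, blue from the high bits down. The names are invented for this example.

```python
GREY_FLAG = 0   # assumption: low-order bit 0 selects grey, 1 selects color


def decode_gc16(pixel: int):
    """Return ("grey", level) or ("color", (r5, g5, b5)) for a gc16 pixel."""
    payload = pixel >> 1                  # the 15 bits above the selection bit
    if (pixel & 1) == GREY_FLAG:
        return ("grey", payload)          # 0..32767
    # Assumed field order within the 15 bits: rrrrr ggggg bbbbb.
    return ("color", ((payload >> 10) & 0x1F, (payload >> 5) & 0x1F, payload & 0x1F))
```

Under these assumptions decode_gc16(0xFFFE) is the brightest grey, ("grey", 32767), and decode_gc16(0xFFFF) is the brightest color entry, ("color", (31, 31, 31)).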
PNG-16 offers a realistic and useful choice, what I call the No-Brainer Mosaic. The source images can be left alone, and stored within the PNG-16 envelope as byte-by-byte replicas of themselves. Their indexed values would be mapped using the normal PNG-16 indexed colors, and the target image would end up being only slightly larger than the sum of the sizes of the source images (even taking compression into account). Since ic16 supports 65,536 indexed colors, it would theoretically be possible to build a target image with 256 source images.
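The figure of 256 source images comes from dividing the 65,536 ic16 entries among 256-color palettes. One way this could work is sketched below in Python. It assumes each source image gets a fixed block of 256 consecutive palette entries, which is an illustration only and not something the proposal defines. All names are hypothetical.

```python
IC16_ENTRIES = 65536                         # indexed colors in ic16, from the text
PALETTE_SIZE = 256                           # entries in one 8-bit source palette
MAX_SOURCES = IC16_ENTRIES // PALETTE_SIZE   # 256 source images


def mosaic_index(source_number: int, local_index: int) -> int:
    """Map a source image's 8-bit pixel value to a shared 16-bit palette index."""
    if not 0 <= source_number < MAX_SOURCES:
        raise ValueError("at most 256 source images fit")
    if not 0 <= local_index < PALETTE_SIZE:
        raise ValueError("source pixels are 8-bit palette indexes")
    # Assumption: source n owns entries n*256 .. n*256 + 255.
    return source_number * PALETTE_SIZE + local_index


def build_palette(source_palettes: list[list[tuple[int, int, int]]]):
    """Concatenate source palettes, padding each to 256 entries with black."""
    if len(source_palettes) > MAX_SOURCES:
        raise ValueError("too many source images")
    combined = []
    for palette in source_palettes:
        if len(palette) > PALETTE_SIZE:
            raise ValueError("source palette has more than 256 entries")
        combined.extend(palette + [(0, 0, 0)] * (PALETTE_SIZE - len(palette)))
    return combined
```

With this arrangement the source pixel bytes never change; only the block offset for each source image has to be recorded.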
For fc16, mc16, and gc16, some calculations would have to be done to convert the color values to rgb values. For these, maybe we should call this feature the Some-Brainer Mosaic.
An imaging program may use fc16 to simply jazz up an existing 8-bit image with effects such as color cycling. No problem -- make Mona Lisa blink, or cry (or maybe even smile!). If 256 colors is all you need, why pay more?