Michael W Harris, 2020-09-12 (v1). (PDF version also available.)
This document provides a high-level, non-collection-specific assessment of the JPEG file format with regard to preservation risks and the practicalities of preserving data in this format. The structure of this assessment has been partly adopted from a British Library document assessing the TIFF format1. This is a “quick” assessment, and may not consider all issues that you think are important. It is not an academic paper; instead, it is intended to collect evidence for what might be considered obvious: that the JPEG file format is not an “at risk” format.
At some point, a broader framework for selecting file formats for digital preservation purposes will be published, and JPEG will be assessed against that framework as well. This document is being published as a “stop-gap”.
This document will look at the JPEG format as commonly used (JFIF and JPEG/Exif; see background). It will not consider JPEG XL, other extensions, JPEG 2000 or similar, or the use of JPEG compression in other formats (e.g. in TIFF files).
Strictly speaking, JPEG is a compression format. Most of the time when talking about JPEG, either “JPEG File Interchange Format (JFIF)”2 or “JPEG/Exif” is meant. While these two formats are, strictly, incompatible, they are close enough for our purposes. JFIF “is a minimal file format that enables JPEG bitstreams to be exchanged between a wide variety of platforms and applications”3. JPEG/Exif is mostly just JPEG bitstreams with additional metadata.
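To make the JFIF versus JPEG/Exif distinction concrete, here is a minimal sketch (not a validator) that reports which of the two a given file claims to be. It relies on the usual conventions that a JFIF file carries an APP0 segment whose payload starts with `JFIF\0`, and an Exif file carries an APP1 segment whose payload starts with `Exif\0\0`. The function name `jpeg_flavours` is made up for this illustration, and the parser assumes a well-formed header made only of length-prefixed segments.

```python
import struct


def jpeg_flavours(data: bytes) -> set:
    """Return which of 'JFIF' and 'Exif' appear before the image data.

    Illustrative sketch only: assumes every segment before the scan
    carries a two-byte length, which holds for typical camera and
    editor output.
    """
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG: missing SOI marker")

    found = set()
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # not at a marker; give up rather than guess
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # SOS (scan data follows) or EOI
            break
        # The length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2 : pos + 4])
        payload = data[pos + 4 : pos + 2 + length]
        if marker == 0xE0 and payload.startswith(b"JFIF\x00"):
            found.add("JFIF")
        elif marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            found.add("Exif")
        pos += 2 + length
    return found
```

Calling `jpeg_flavours(open(path, "rb").read())` on a file of your own (where `path` is whatever file you want to inspect) returns a set, because in practice a single file may carry both segments. That is one reason the two formats are “close enough” for this assessment.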
For further background, and any number of links, see: the Wikipedia articles on JPEG, JFIF and Exif; the Just Solve the File Format Problem wiki articles on JPEG, JFIF and Exif; and the Library of Congress 'Sustainability of Digital Formats' pages on JPEG, JFIF and JPEG/Exif.
JPEG compression and related formats are open and have been standardised by such organisations as the International Telecommunication Union and the International Organization for Standardization. The relevant standards are freely available and are true open standards (see links in 'Background' above).
Apparently, development continues on both the standards and the original reference software (see, for example, the Independent JPEG Group reference site).
Because there may be some concerns around whether particular software will or will not support the entirety of a standard, some care should be taken to preserve software (in source and/or binary form) that has been confirmed not to have issues with your particular files.
The format is widely used for various image-related purposes. Many digital cameras will produce JPEGs by default. It’s possibly one of the most commonly used image formats of all time. The format can be opened and edited by software on various platforms from most of the last thirty-odd years, including all modern web browsers. There is a multitude of different, independent, free and open source software programs that can edit the format, including ImageMagick, GIMP, libjpeg (reference implementation) and many others.
One of the common refrains around JPEG is “it's lossy, the sky is falling!”. I humbly submit that if you are using JPEGs as a source file format and editing in that format, then you may have a point. However:
If you are not editing your preservation (“master”) copies, then it doesn't matter.
If you receive an image in JPEG format, and you will not change it further after it is entered into your repository, then it doesn’t matter.
If you are going to edit the image, and you save it in an intermediate/working format (a lossless format, perhaps PNG or XCF) while working on it, and then export that to a lossless format such as PNG, then it doesn't matter.
If you are going to edit the image, and you save it in an intermediate/working format (a lossless format, perhaps PNG or XCF) while working on it, and the result exported to JPEG is more than sufficient for your purposes, then it doesn't matter!
So, being lossy only matters if you are regularly editing and saving the JPEG file. This continued re-saving will result in loss of quality over time.
If you already have files in JPEG format that you are not going to be editing, “converting them to TIFF [or PNG] will not magically improve the quality of the pictures”4. Action is not required to move the files to another format.
Other organisations and people have assessed this file format. In every case listed here, JPEG is one of several formats that the organisation places at the same level for the same type of file.
Duke University Libraries: Recommended/Acceptable format.
National Archives of Australia says of JFIF: “low risk of becoming obsolete”.
Public Record Office Victoria: Sustainable Format – “It is unlikely that JPEG would be unreadable at any time in the foreseeable future.”
There are no known current legal issues. Historical legal issues are no longer relevant.
Three alternatives are Better Portable Graphics (BPG), High Efficiency Image File Format (HEIF), and WebP. While these may have advantages over JPEG (e.g. smaller file size for the same image quality), all three have yet to demonstrate the staying power, adoption, or software support that JPEG has. There are also some concerns around patents for HEIF, at least in some jurisdictions.
PNG and TIFF can also be considered alternatives for some purposes. Both have demonstrated staying power, as well as adoption and software support (to a lesser extent in the case of TIFF). Both will result in larger files, but allow lossless editing.
This format is in no way at risk, and shows no sign of becoming obsolete. You should continue to:
use this format for access copies, where applicable;
accept this format as an input format into your repositories;
create images in this format where the trade-offs between size and quality are acceptable; and
consider creating images in this format where a preferred lossless format (e.g. PNG) is not available (e.g. not supported by hardware) and your processing pipeline is not able to convert to a preferred lossless format.
‘JPEG Image Compression FAQ, Part 1/2’, 28 March 1999. http://www.faqs.org/faqs/jpeg-faq/part1/.
Library of Congress. ‘JFIF, JPEG File Interchange Format, Version 1.02’. Web page, 30 January 2012. https://www.loc.gov/preservation/digital/formats/fdd/fdd000018.shtml.
Stanescu, Andreas. ‘Assessing the Durability of Formats in a Digital Preservation Environment: The INFORM Methodology’. OCLC Systems & Services 21, no. 1 (2005): 61–81. https://doi.org/10.1108/10650750510578163.
Wheatley, Paul, Peter May, Maureen Pennock, and Akiko Kimura. ‘TIFF Format Preservation Assessment’. British Library Digital Preservation Team, 26 March 2015. https://wiki.dpconline.org/images/f/f3/TIFF_Assessment_v1.2_external.pdf.
1. Wheatley et al., ‘TIFF Format Preservation Assessment’.
2. ‘JPEG Image Compression FAQ, Part 1/2’.
3. Library of Congress, ‘JFIF, JPEG File Interchange Format, Version 1.02’.
4. Stanescu, ‘Assessing the Durability of Formats in a Digital Preservation Environment’.
This page is located at http://next-nexus.info/writing/computer~ing/justifying-jpeg.php and was last modified on 2020-09-12.