Public offices use a wide variety of equipment and software to create records of official business. Digital formats include text, images, videos, CAD files, databases and websites. These records must be managed and made accessible for as long as they are required, regardless of the file format or technology used to create them.

Under s14 of the State Records Act 1998, it is the public office’s responsibility to ensure that records remain able to be produced or made available for the minimum authorised retention period, irrespective of changing technology.

One of the strategies to ensure that records remain accessible or usable over a period of time is the use of sustainable file formats.


Criteria for selecting file formats

The file formats best suited for long-term sustainability and accessibility:

  • are widely used and supported
  • are identifiable and well documented
  • are independent of specific software, developers or vendors
  • are unencrypted
  • have open specifications or embody open-source principles
  • are stable (rare releases of newer versions) and backwards and forwards compatible with other versions
  • are uncompressed or, if compression is used, use lossless compression
  • are metadata friendly or have the ability to embed metadata.

Please note that not all formats will meet the above criteria. Some criteria will also be more relevant for specific formats.
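As an illustration of the "identifiable" criterion, the sketch below checks a file's leading bytes against a few well-known format signatures. It is a minimal example, not a replacement for a format registry: the `SIGNATURES` table covers only four formats, and the function names `identify_bytes` and `identify_file` are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Leading-byte signatures for a few formats named in this guidance.
# Illustrative only: a real identification tool uses a far larger registry.
SIGNATURES = {
    b"%PDF-": "Portable Document Format (.pdf)",
    b"\x89PNG\r\n\x1a\n": "Portable Network Graphics (.png)",
    b"PK\x03\x04": "ZIP container (.zip, also used by .docx, .xlsx, .pptx)",
    b"fLaC": "Free Lossless Audio Codec (.flac)",
}


def identify_bytes(head: bytes) -> Optional[str]:
    """Return a format name if the leading bytes match a known signature."""
    for magic, name in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return None


def identify_file(path: Path) -> Optional[str]:
    """Read the first 16 bytes of a file and try to identify its format."""
    with path.open("rb") as handle:
        return identify_bytes(handle.read(16))
```

A result of `None` only means the signature is not in this small table. It does not mean the file is unidentifiable.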


Recommendations

We recommend that public offices:

  • use sustainable file formats when creating new records and information
  • use sustainable file formats when converting or migrating records
  • use the list below as a way of identifying records at risk of format obsolescence
  • consider relevant sustainable file formats as a criterion when acquiring new systems or software
  • consider relevant sustainable file formats as a design requirement when developing or implementing new systems
  • use the relevant sustainable file format in requesting records from a cloud service provider.
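One way to act on the recommendation to identify records at risk of format obsolescence is to compare file extensions against the formats this guidance lists. The sketch below is a rough first pass under stated assumptions: `SUSTAINABLE_EXTENSIONS` holds only a subset of the listed extensions, the function name `at_risk` is hypothetical, and an extension alone does not prove a file's actual format.

```python
from pathlib import Path

# Assumption: a subset of the extensions listed in this guidance.
# Extend this set to match the formats relevant to your organisation.
SUSTAINABLE_EXTENSIONS = {
    ".wav", ".flac", ".csv", ".xml", ".json", ".pdf", ".png",
    ".tif", ".tiff", ".docx", ".xlsx", ".pptx", ".txt", ".html", ".warc",
}


def at_risk(filenames):
    """Return the names whose extension is not in SUSTAINABLE_EXTENSIONS."""
    return [
        name
        for name in filenames
        # Compare case-insensitively so "REPORT.PDF" matches ".pdf".
        if Path(name).suffix.lower() not in SUSTAINABLE_EXTENSIONS
    ]
```

Because the list of sustainable formats is not exhaustive, a flagged file is a candidate for review, not necessarily a record in an unsustainable format.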

The file formats covered in this guidance have been identified as sustainable. Please note that the list below is not exhaustive.

Media type

Sustainable file formats

Audio files

  • Waveform Audio (.wav)
  • MPEG audio layer 3 (.mp3)
  • MPEG-4 (.mp4 or .m4a)
  • Free Lossless Audio Codec (.flac)

CAD

  • AutoCAD Drawing Interchange Format (.dxf)
  • AutoCAD Drawing (.dwg)
  • STEP-file specification (.stp, .step, .p21)

Data files and databases

  • Comma Separated Values file (.csv)
  • eXtensible Markup Language (.xml) 
  • JavaScript Object Notation (.json or .jsn)
  • Software Independent Archiving of Relational Databases Version 1.0 (.siard) 

Email

Individual

  • Electronic Mail (.eml) for storing each message as a single file. Attachments may either be included or written out as separate files.
  • Outlook Item Message (.msg) for storing a single message object such as an email or an appointment, a contact, or a task in a file. May also include attachments.
  • iCalendar Electronic Calendar and Scheduling format (.ics, .ifb, .iCal)

Aggregate

  • MBOX Email (.mbx or .mbox) for storing collections of electronic mail messages
  • Personal Folders File (.pst) for storing local copies of messages, calendar events and other items within Microsoft software

Encapsulation

  • ZIP File format (.zip)
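As a sketch of how records might be encapsulated in line with the compression criterion above, the example below writes files into a ZIP archive without compression and then verifies the stored checksums. It uses Python's standard `zipfile` module. The function name `package_records` is hypothetical, and storing entries by bare file name is an assumption made for brevity.

```python
import zipfile
from pathlib import Path


def package_records(files, archive_path):
    """Write files into an uncompressed ZIP archive, then verify it."""
    # ZIP_STORED keeps each member uncompressed inside the container.
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for file_path in files:
            file_path = Path(file_path)
            # Assumption: file names are unique, so directories are dropped.
            archive.write(file_path, arcname=file_path.name)

    with zipfile.ZipFile(archive_path) as archive:
        # testzip() returns the first member with a bad CRC, or None if all pass.
        return archive.testzip() is None
```

If lossless compression is acceptable, `zipfile.ZIP_DEFLATED` can be passed instead of `zipfile.ZIP_STORED`.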

Geospatial

Vector GIS file formats

  • ESRI Shapefile with mandatory component files (.shp, .shx, .dbf)
  • GeoJSON Version 1.0 (.json)
  • Geography Markup Language (.gml)
  • Keyhole Markup Language (.kml)
  • OpenStreetMap XML (.osm)
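The Shapefile entry above depends on three mandatory component files being kept together. The sketch below shows one way to check a list of file names for missing components. The function name `missing_components` is hypothetical, and the check assumes the components share the same base name as the .shp file.

```python
from pathlib import Path

# The mandatory Shapefile components named in this guidance.
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_components(shp_name, available_names):
    """Return the mandatory component suffixes missing for one Shapefile."""
    stem = Path(shp_name).stem.lower()
    # Collect the suffixes of every available file sharing the base name.
    present = {
        Path(name).suffix.lower()
        for name in available_names
        if Path(name).stem.lower() == stem
    }
    return [suffix for suffix in REQUIRED_SUFFIXES if suffix not in present]
```

An empty result means all three mandatory files are present for that Shapefile.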

Raster GIS file formats

  • ERDAS Imagine (.img)
  • Geo-Referenced TIFF  (.tif, .tiff, .gtiff)

Geographic database file formats

  • ESRI Geodatabase (.gdb)
  • GeoPackage Encoding Standard, version 1.0 (.gpkg)

Light Detection and Ranging (LiDAR)

  • LASer, version 1.4 (.las)

CAD file formats

  • AutoCAD Drawing (.dwg)
  • AutoCAD Drawing Interchange Format (.dxf)

Image Files

  • Tagged Image File Format, Revision 6.0  (.tiff, .tif)
  • JPEG Image Encoding family (.jpeg, .jpg)  
  • Portable Network Graphics (.png)  
  • Adobe Digital Negative (.dng, .tif) for storing and interchanging camera raw images
  • Scalable Vector Graphics (.svg)
  • OpenDocument Drawing format (.odg) for editable graphics documents

Office documents 

Office Open XML:

  • Word Processing (.docx)
  • Spreadsheet files (.xlsx)
  • Presentation / Slideshow files (.pptx)

Page-layout format

  • Portable Document Format (.pdf) is most appropriate as a final-state format, for delivery to end users, and for multi-page scanned documents. For more information on multi-page scanned documents, please see our guidance on Digitisation: Technical Specifications.

Text-based

  • Plain text (.txt)
  • Tabular data exchange (.csv)
  • XML (.xml) for textual content, metadata records or as a wrapper format for complex digital objects (or records)
  • JavaScript Object Notation (.json or .jsn) used as a data interchange format

Video Files

  • Motion JPEG 2000 (.mj2, .mjp2)
  • Digital Moving-Picture Exchange (.dpx)
  • Digital Cinema Initiative Package (.dcp)
  • Ogg File Format (.ogg, .ogv)  
  • MPEG-2 (.mpeg, .mpg)
  • MPEG-4 (.mp4,  .m4a)

Websites

Web pages

  • Hyper Text Markup Language (.html, .htm)
  • eXtensible Markup Language (.xml)

Webpage components

  • Cascading Style Sheet (.css) for styling web pages
  • XML Schema Definition (.xsd) used to control the structure of XML documents

Archived websites

  • Web ARChive (.warc) is used for web-accessible content in archived state.

 

 

Published August 2019 / Updated November 2019
