Introduction
Western Sydney University (WSU) has been using digitisation to enhance its business for many years. Various staff members are engaged in both business process digitisation programs and back capture digitisation projects.
The business process digitisation program, which has operated since around 2004-5, has involved digitising student records, incoming and outgoing correspondence and legal documents.
The back capture projects are for records of archival value and are undertaken by records staff in their spare time (at no additional cost to the University). These include the digitisation of photographs, a project which began in 2005-06, and the digitisation of student cards and student fiche which began in 2009 and 2011 respectively.
The variety of programs and projects in train demonstrates that there is no ‘one-size-fits-all’ approach to digitisation. Each project or program may have different aims and business drivers, and the records themselves may be retained for different periods of time and present different challenges. As a result, different approaches, technical standards and metadata may be required. An essential part of planning for digitisation involves determining these factors, as small differences may have cost and management implications over time.
WSU has kindly agreed to share its TRIM Scanning standards.
Business process digitisation programs
Student files
Aims
Western Sydney University has six campuses located on various sites throughout the Western Sydney region. Prior to the implementation of an electronic document and records management system (EDRMS) across the organisation, which began in late 2004, student files had not been centralised. As a result, the management of student information was becoming increasingly complex. Discovery Orders or Freedom of Information (FOI) requests were labour intensive and involved staff on many campuses.
Digitisation advocates recognised that the technology could streamline the handling of student files. Location would no longer be an impediment to searching for and finding the information required. By having the records controlled by the University’s EDRMS, unauthorised access or modification could be prevented or detected and files could be easily monitored and audited. These critical records, essential to University business, could be managed in a way that enhanced their accuracy, accountability and credibility.
Approach
From 2004 a program was incrementally introduced to digitise all student files. Digitisation of hard copy student forms is mainly conducted by an Electronic Data Management (EDM) team operating from the University’s Academic Registrar’s Office. Some student centres also perform digitisation, but all student files undergo quality control by the EDM team.
Staff members performing the digitisation are thoroughly trained in the needs of these particular records and the standards of digitisation required. A procedure manual is provided to ensure consistency of digitisation and quality assurance practices.
The software used allows for handwriting recognition and similar functions, though many of the old handwritten forms are gradually going online. As a result, the University will be discontinuing the use of this software in the near future.
Original records are held for a designated time for quality assurance purposes, then destroyed. The original period of retention was 12 months but over time, as staff became more experienced, and made fewer errors, it was reduced to 6 months then 3 months.
Technical standards
The current technical standards employed for these records are as follows:
Technical specifications for student files | |
---|---|
Scan resolution | 300-400ppi (400ppi for records that were difficult to read) |
File bit depth | 1 bit |
Format | PDF |
Colour | B/W (some colour as required) |
Compression | Lossless |
The higher resolution reflects the importance of these records to the organisation and the need to ensure legibility and completeness of all detail present in the originals. Compression is lossless to ensure no loss of information.
The file format chosen was originally multi-page TIFF but it was discovered that, although the University was on a standard operating platform, many default viewers for TIFF within the University varied. For example, some could open a multi-page document but only the first page could be read. As a result, the University moved to PDF as its main format as all computers could view PDF. While this is not an archival format, student records have a relatively short term retention.
One of the issues with PDF as a format is that it may not automatically capture some important metadata for the management of records over time. For example, some technical metadata like image software may not be automatically captured and persistently linked to the records.
The main metadata elements required for retrieval are the student ID, the form and the date, which are captured or inherited from the file when the digital images are saved into the EDRMS.
The University is now digitising some student records in colour as some have annotations in red ink that are important to distinguish from other annotations of a different colour.
Legal documents
For the last 6 years, all legal documents have been digitised by the University. The aim of this digitisation program is to increase accessibility of the records for legal staff and to ensure the security and protection of originals. The original legal documents are retained and held in fire-resistant safes (one hour fire rated) while the digital duplicates are used for business.
Technical specifications for legal documents | |
---|---|
Scan resolution | 300ppi |
File bit depth | Varies |
Format | PDF |
Colour | Greyscale and colour |
Greyscale and colour are used to ensure that all relevant annotations and information can be obtained from the digital images without the need to further reference the original paper records. Again, while metadata cannot be automatically captured in PDF, the important metadata for retrieval is added when the documents are registered into the EDRMS.
Other current business process digitisation programs
Since about 2004, multi-function devices (MFDs), such as photocopiers with built-in scanners, have been available in most business units, allowing users to scan the paper correspondence and other business records they receive and send. In these cases, digitisation is performed by the staff member managing the information, according to documented procedures. The staff member is responsible for checking their own digitisation and saving the image into the EDRMS.
Technical specifications for WSU default | |
---|---|
Scan resolution | 200ppi (default and recommended minimum) |
File bit depth | 1 bit |
Format | |
Colour | B&W |
These technical standards reflect the less critical (and primarily short term) nature of the business being conducted.
Note: One issue with this devolved scanning is that staff members sometimes scan very large files, such as a 150-page file, in one batch. When a very large file (e.g. over 10MB) is sent to the computer from the MFD, the result is often a completely illegible file. If the user does not check this, it may be saved to the EDRMS in an unreadable state. It is recommended that original paper records are destroyed within 3 months, so there is some risk that the problem will not be discovered until it is too late. Scanning guidelines at WSU strongly stress the need for quality control checks.
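An automated pre-check could support, though not replace, the visual quality control check described in the note. The following is a minimal sketch only, not the University's procedure. It assumes the MFD delivers PDF files, and the function name `check_scan` and the 1024-byte search window are illustrative choices. It relies on two general properties of PDF files: they begin with a `%PDF-` header and carry an `%%EOF` marker near the end.

```python
from typing import List

# Approximation of the "over 10MB" figure mentioned in the note.
SIZE_WARNING_BYTES = 10 * 1024 * 1024


def check_scan(data: bytes) -> List[str]:
    """Return warnings for one scanned file; an empty list means nothing was flagged."""
    warnings: List[str] = []
    if len(data) == 0:
        return ["file is empty"]
    if len(data) > SIZE_WARNING_BYTES:
        warnings.append("file exceeds 10MB; open it and check every page")
    # Assumption: the scan is a PDF, so it should start with the PDF header.
    if not data.startswith(b"%PDF-"):
        warnings.append("missing PDF header")
    # A missing end-of-file marker suggests the transfer was cut short.
    if b"%%EOF" not in data[-1024:]:
        warnings.append("missing end-of-file marker; file may be truncated")
    return warnings


# Usage: a file that was cut off before the end-of-file marker is flagged.
print(check_scan(b"%PDF-1.4\n...page data cut off"))
```

A file that passes these checks can still be illegible, so the check is only useful as a prompt for staff to open and inspect large or damaged files before the originals are destroyed.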
Current business papers with archival value, such as board papers, are converted to PDF/A.
Back-capture digitisation projects
Archival photographs
Records staff at the University are involved in a back-capture project to digitise the University’s significant and valuable archival photographic collection. The prime aim of this project is to make the images more accessible, including to the wider community. The original photographs are being retained.
The digital images (the master and derivatives) are saved into the EDRMS and are linked to the series registration of the original paper records. The file format and resolution form part of the title so that versions can be distinguished.
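As an illustration of how format and resolution might be carried in a title, the sketch below builds titles for a master and a derivative. The title pattern, the function name `image_title` and the sample title "Graduation ceremony" are hypothetical; the source does not describe the University's actual naming convention.

```python
def image_title(base_title: str, file_format: str, resolution_ppi: int) -> str:
    """Build a record title that includes the file format and scan resolution."""
    return f"{base_title} - {file_format.upper()} {resolution_ppi}ppi"


# Hypothetical example: one photograph, two versions that stay distinguishable.
master = image_title("Graduation ceremony", "tiff", 600)
derivative = image_title("Graduation ceremony", "jpeg", 75)
print(master)
print(derivative)
```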
Technical specifications for archival photographs | |
---|---|
Scan resolution | 600ppi for TIFF master; 75ppi for JPEG derivatives |
File bit depth | Varies |
Format | TIFF for master; JPEG for derivatives |
Colour | Colour |
Compression | Lossless for master; lossy for derivatives |
The technical standards for masters are very high, reflecting the need to produce a high quality image for publication. In addition, the University was keen to digitise the photographs at the best quality possible so they would not need to be re-digitised in the future. The technical standards for derivatives, which are shared over the web and by similar means, are lower to reduce file size and ease transmission.
TIFF allows for the automatic capture of metadata, including technical metadata required to manage these images over time. Although the originals are being maintained it is important that the high quality images can also be maintained to maximise business benefits of the digitisation over time.
Note: Originally the archival photographs which were black and white were being digitised in greyscale. However, it was soon realised that greyscale did not capture the depth of the image so the specification was changed to colour. Even though it increased the file size, it was important to capture this depth.
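The effect of resolution and colour on file size can be estimated with simple arithmetic: uncompressed size is pixel width times pixel height times bits per pixel. The sketch below is a rough illustration only. It assumes a hypothetical 6 x 4 inch print, 8 bits per pixel for greyscale and 24 bits per pixel for colour (the specification above lists bit depth as "Varies"), and it ignores compression, so actual file sizes will differ.

```python
def uncompressed_size_bytes(width_in: float, height_in: float,
                            ppi: int, bits_per_pixel: int) -> int:
    """Estimate the uncompressed size of a scan from print size, resolution and bit depth."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical 6 x 4 inch photograph.
grey_master = uncompressed_size_bytes(6, 4, 600, 8)     # 8,640,000 bytes
colour_master = uncompressed_size_bytes(6, 4, 600, 24)  # 25,920,000 bytes
colour_deriv = uncompressed_size_bytes(6, 4, 75, 24)    # 405,000 bytes

print(colour_master / grey_master)   # colour is 3.0 times the greyscale size
print(colour_master / colour_deriv)  # 600ppi is 64.0 times the 75ppi size
```

Under these assumptions, moving from greyscale to colour triples the uncompressed size of a master, while dropping from 600ppi to 75ppi reduces it by a factor of 64, which is why the lower-resolution derivatives are easier to transmit.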
Archival student cards
The University has student cards dating from 1891 to the 1970s which are the primary record of student enrolments and results for that period. The technical specifications for these cards are lower as digitisation is for access purposes only and the original cards are maintained as the official records.
The digital images are saved into the EDRMS and are linked to the series registration of the original paper records.
Technical specifications for archival student cards | |
---|---|
Scan resolution | 200ppi |
File bit depth | 8 bit |
Format | JPEG then PDF |
Colour | Colour |
Archival student fiche
The University also has some student fiche (the equivalent of student transcripts) from the 1960s-1970s. The original conversion and quality checking of these images was poor with fuzzy images, skewed pages and missing text common. However, the University decided to digitise them anyway, as they are the only student records in existence from this period for two of the University's predecessor institutions (the originals were destroyed on microfilming). They are still referenced and need to be more accessible and they have archival value. The microfiche will be retained after digitisation.
The digital images are saved into the EDRMS and are linked to the series registration of the original paper records.
Technical specifications for archival student fiche | |
---|---|
Scan resolution | 300ppi |
File bit depth | 1 bit |
Format | |
Colour | Black and white |
The original microfiche were created in black and white so there was no value in changing to colour or greyscale.
Conclusions drawn and lessons learned
Keys to success
The keys to the success of these projects are that the project leaders:
- considered the aims/required outcomes of the project and the particular records in question when deciding on what to digitise and the technical standards and metadata required - this meant that standards chosen could be ‘fit for purpose’ and not unnecessarily burdensome for the University
- limited critical digitisation projects (e.g. student records, legal documents, archival projects) to key people who were highly trained, to ensure quality control was very effective where it was needed
- planned their quality assurance processes from the start which enabled processes to be refined and accuracy to be improved immediately
- maintained clear documentation about each project or program including planning, processes, specifications, the quality assurance undertaken, outcomes etc. This enables the University to meet the requirements of the General retention and disposal authority: imaged records [since replaced by the general retention and disposal authority for original or source records that have been copied (GA45)] and can help to ensure the images are authentic and credible.
Lessons learned
Lessons learned included that:
- the standard operating environment for technology in an organisation can impede how the images can be viewed and should be considered in planning
- it is essential to determine upfront what the essential characteristics of the records are and whether they can be reproduced with the technical specifications chosen e.g. is it necessary to reproduce the colour in the documents? Is the added depth provided by colour important for digital images of publication quality?
- each file format has its advantages and drawbacks. PDF, for example, can be read by many computers but inhibits the automatic collection of metadata. If significant metadata needs to be added this will increase project costs. TIFF, however, requires that users have viewers that can read the files. The key is to understand these limitations and make informed decisions about what is the most suitable for the records in question.
Published June 2012 / Revised June 2016 (reference to GA45 only)