Fluid: Designing Software that works - For everyone.

The Decapod Project

The Problem Space

Scholarly content needs to be online, and for much mass-produced content that migration has already happened. The online presence of other scholarly material is far more sporadic, however: small journals, original source materials in the humanities and social sciences, non-journal periodicals, and similar collections.

A major barrier to making this content available is the cost and complexity of setting up a digitization project for small and scattered collections, coupled with a lack of revenue opportunities to recoup those costs. Collections with limited audiences, and hence limited revenue opportunities, are nonetheless often of considerable scholarly importance within their domains. The expense and difficulty of digitization therefore present a significant obstacle to making paper archives available online.

Working Towards the Solution

To meet this need we are building Decapod. Decapod will be an inexpensive, attaché-case-sized hardware and software solution that local staff or volunteers can readily procure, assemble, and take into the stacks to capture material quickly and unobtrusively and deliver it in a usable format. It will be open-source, easy to use, and will provide an out-of-the-box method of digitizing small to medium archives of scholarly material.

Decapod will remove the barriers to digitization now encountered by archives of documentary material: cost of equipment, cost of labour, lack of digitization expertise, lack of suitable distribution formats, and lack of acceptable remediation workflows. Decapod will address them all to produce a paper-to-digital document solution that is highly effective, highly automated, and requires little operator interaction (apart from page turning).

The solution will address these problem areas:
  1. Allow the camera-based capture of bound material by using computer vision techniques to produce flat, clean page images equivalent to those produced by a flatbed scanner.
  2. Remove the need for extensive operator intervention in the capture process by detecting scan problems and allowing the operator to rectify the scan immediately.
  3. Reduce user intervention in the conversion process by using advanced document understanding techniques to automate almost all of it, and by reducing the remainder to very simple "1-click" operations.
  4. Produce PDF/A output that is visually faithful to the original, searchable, and widely usable.
  5. Allow the output to be viewable on mobile devices that support PDF reflow.
  6. Remove the need for deep software, hardware or digitization skills by integrating all software components into a turnkey end-to-end solution.
  7. Remove capital cost barriers by using consumer-grade cameras.
  8. Reduce operational cost barriers by allowing volunteers or local staff to operate the system with minimal training or commitment.

Fluid is a project of the Adaptive Technology Resource Centre, University of Toronto,
funded by a grant from The Andrew W. Mellon Foundation.
© the fluid project, 2009