What do archives and forensics techniques have in common? According to Simson Garfinkel, a computer scientist and early proponent of the emerging field of digital forensics, computer crime investigations are far from the only instances in which forensics techniques are useful. I had the opportunity to learn about some of the intersections between archival practice and digital forensics at the BitCurator workshop, held at UNC Chapel Hill’s School of Information and Library Science on Friday, April 27th. The day-long workshop provided an opportunity to explore the BitCurator environment and gain hands-on practice with its suite of services.
Corporations, government agencies and law enforcement professionals have used traditional and computer forensics methods for a wide range of purposes. Forensics techniques for archives have emerged over the last decade from the field of computer forensics, with some important differences in application. Archivists working with digital materials frequently need to work with legacy files on a wide range of electronic devices and removable media. Digital forensics strategies help archivists identify, access, and preserve the files found on these devices.
BitCurator was developed to enable institutions to integrate appropriate digital forensics strategies into their archival workflows. The flexible software environment created by the BitCurator team allows archivists to perform automated acquisition and ingest functions on disk images, exact replicas of the information on a device or hard drive. Allowing digital files to remain on their original physical media contributes to long-term preservation risks; to preserve the underlying bitstreams on a device or drive, it’s important to copy and “lift” data off of the physical medium. After a drive has been copied using BitCurator imaging tools, archivists can manipulate and examine the disk image without damaging the original files. In addition, the workflows facilitated by BitCurator help archives establish a chain of custody and ensure that the data retain their integrity, usability and authenticity.
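To make the integrity point concrete, here is a minimal sketch of a fixity check on a disk image: compute a checksum when the image is created, record it, and recompute it later to confirm that the bits have not changed. This illustrates the general idea only and is not BitCurator's own tooling; the function names `sha256_of_file` and `verify_image` are hypothetical, and the choice of SHA-256 is an assumption.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so that large disk images do not
    have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(path, recorded_digest):
    """Return True if the image still matches the digest recorded at acquisition."""
    return sha256_of_file(path) == recorded_digest.lower()
```

In practice, the digest would be recorded alongside the image at acquisition time, and a later mismatch would signal that the copy can no longer be treated as an exact replica of the original media.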
The BitCurator package is a free and open-source suite of tools that can be installed on a Linux machine or run in a virtual machine. The environment provides scripts and suggested workflows for performing initial data triage, disk imaging, file system analysis, identification of personally identifiable information (PII) and metadata export. Using BitCurator, digital archivists can create and virtually mount a disk image, analyze the complete file system of the device, identify any deleted files, and quickly detect the presence of personal data (such as social security numbers and credit card numbers).
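As a rough illustration of what detecting personal data can involve, the sketch below scans raw bytes for strings shaped like US social security numbers and for digit runs that pass the Luhn checksum used by payment card numbers. It is a simplified stand-in built on assumptions, not a description of how BitCurator's tools work internally; the patterns and the names `luhn_valid` and `find_pii_candidates` are hypothetical, and a real scan would produce false positives that need human review.

```python
import re

# Assumed patterns: NNN-NN-NNNN for SSN-like strings, and runs of
# 13 to 19 digits (optionally separated by spaces or hyphens) for cards.
SSN_PATTERN = re.compile(rb"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(rb"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii_candidates(data):
    """Return (kind, byte offset, matched value) tuples found in `data`."""
    hits = []
    for m in SSN_PATTERN.finditer(data):
        hits.append(("ssn", m.start(), m.group().decode("ascii")))
    for m in CARD_PATTERN.finditer(data):
        digits = re.sub(rb"\D", b"", m.group()).decode("ascii")
        if luhn_valid(digits):
            hits.append(("card", m.start(), digits))
    return hits
```

The byte offsets matter for archival work: they let a reviewer locate each candidate within the disk image and decide whether the surrounding file should be restricted or redacted.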
The project is managed through the BitCurator Consortium, a group of institutions that support its development and participate in its governance. The Consortium, which is currently housed at the Educopia Institute, offers ongoing training, webinars and online materials to help institutions get started in digital forensics. BitCurator’s funding model and infrastructure allow the project to continue to evolve and to spin off new tools and programs, such as BitCurator Access, which provides tools that let end users work with disk images remotely, and BitCurator NLP, a project designed to enable natural language processing of digital collections.
If you’re curious about digital forensics and would like to learn more about BitCurator’s ongoing projects, visit bitcurator.net.