Retouching before Photoshop
If you’ve ever admired the flawless complexions in vintage portraits and assumed your great-grandparents had impossibly smooth skin – think again. Image manipulation has been around since the earliest days of photography. Angus Hamilton (Librarian, Digital Access) shared some insights, inspired by the Library’s Make Believe: Encounters with Misinformation exhibition.
Long before Photoshop, photo editing was a hands-on craft, carried out with an array of fine tools at specially designed retouching desks with slanted surfaces. It was meticulous work, typically performed on portraits in the dark rooms of photographic studios, and most commonly by women, who were paid significantly less than their male counterparts.
[Image: A woman at a retouching desk]
[Image: Perry Winkle writes about the art of retouching]
While retouching was widespread, it’s rare to find surviving glass negatives that show markings left behind by the craft. However, these can be seen in two of State Library Victoria’s collections: the glass plate negatives of the Spencer Shier collection and the Rosenberg collection: studio of Vincent Kelly.
[Image: Miss J Barber]
[Image: Mrs F McDonald]
[Image: Close-up of Mrs F McDonald]
Spotting fakes
When we're presented with an image, how can we tell what's real and what isn't?
The Library’s Lead Developer (and keen birder) Nick Paustian explored the growing challenge of identifying manipulated images in the age of AI.
Nick noted that we’ve reached a tipping point: distinguishing real from fake now requires more time and skill. While our eyes can still catch telltale signs – like distorted features or overly smooth textures – more sophisticated methods rely on metadata and online detection tools.
[Image: Three birds]
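As a small illustration of the metadata route Nick mentioned, here is a hedged Python sketch using the Pillow library: it prints whatever EXIF tags an image carries. The filename is hypothetical. Editing software often records itself in the ‘Software’ tag, while AI-generated or re-encoded images frequently carry no EXIF at all, so even absence is a (weak) signal.

```python
from PIL import Image, ExifTags

def inspect_metadata(path: str) -> None:
    """Print an image's EXIF tags as weak provenance signals."""
    with Image.open(path) as im:
        exif = im.getexif()
        if not exif:
            # Many AI-generated or re-encoded images carry no EXIF at all.
            print("No EXIF metadata found.")
            return
        for tag_id, value in exif.items():
            # Map numeric tag IDs to readable names, e.g. 305 -> 'Software'.
            name = ExifTags.TAGS.get(tag_id, tag_id)
            print(f"{name}: {value}")

inspect_metadata("downloaded-photo.jpg")  # hypothetical filename
```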
Platforms like Meta – the parent company of Facebook and Instagram – have begun adding ‘AI info’ labels to AI-generated images, video and audio; but their detectors often misfire, flagging authentic content edited with AI-powered tools like Photoshop. This is also the case for many online AI detector tools, like wasitai.com.
[Image: Detail of a bird image being checked with wasitai.com]
Instead of trying to spot fakes, what if we could authenticate what's real?
Nick introduced the concept of cryptographic fingerprinting, a technique borrowed from software development. Generate a unique hash for an image and any subsequent modification – even to a single pixel – will produce a different hash, revealing tampering.
[Image: Hatted birbs]
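To make the fingerprinting idea concrete, here is a minimal Python sketch (the filename is hypothetical, and real systems hash more than raw pixels): fingerprint an image's pixel data with SHA-256, alter one pixel, and the fingerprint changes.

```python
import hashlib
from PIL import Image

def pixel_fingerprint(path: str) -> str:
    """Return a SHA-256 hex digest of an image's raw pixel data."""
    with Image.open(path) as im:
        return hashlib.sha256(im.convert("RGB").tobytes()).hexdigest()

original = pixel_fingerprint("portrait.png")  # hypothetical file

# Alter a single pixel and fingerprint the result.
with Image.open("portrait.png") as im:
    im = im.convert("RGB")
    r, g, b = im.getpixel((0, 0))
    im.putpixel((0, 0), (255 - r, g, b))  # guaranteed to differ from the original
    tampered = hashlib.sha256(im.tobytes()).hexdigest()

print(original == tampered)  # False: a one-pixel edit changes the hash
```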
Companies like Truepic are pioneering this space, working with Adobe and others through the Coalition for Content Provenance and Authenticity (C2PA). Their goal is to embed cryptographic signatures and metadata directly into images, allowing platforms and users to verify authenticity.
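The C2PA specification itself is far richer, but its cryptographic core can be sketched in a few lines. Assuming the third-party `cryptography` package and a hypothetical image file, a publisher signs the image bytes with a private key, and anyone holding the matching public key can verify that the bytes are untouched:

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

with open("portrait.png", "rb") as f:  # hypothetical file
    image_bytes = f.read()

signing_key = Ed25519PrivateKey.generate()   # held by the camera or publisher
signature = signing_key.sign(image_bytes)    # travels with the image

# Verification passes on the untouched bytes...
verify_key = signing_key.public_key()
verify_key.verify(signature, image_bytes)    # no exception: authentic

# ...and fails if even one byte has been altered.
try:
    verify_key.verify(signature, image_bytes + b"\x00")
except InvalidSignature:
    print("Tampering detected")
```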
This technology could be especially valuable for journalism and archival institutions, where knowing when and where an image was captured is crucial. Nick suggested that organisations like State Library Victoria might consider adopting such systems to enhance the provenance of their collections.
Optical Character Recognition (OCR)
SLV LAB’s Innovation Lead Sotirios Alpanis spoke about Optical Character Recognition (OCR) – a technology that has quietly shaped how we access and interpret historical collections.
OCR is the process of extracting text from images. It sounds simple, but as Sotirios put it, programming a computer to do this is ‘like teaching a rock how to think’. We can glance at a newspaper and instantly understand its structure – columns, headlines, captions – but for a computer, it’s chaos. Different fonts, sizes and layouts make segmentation and recognition a complex task.
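In code, the task looks deceptively simple. As a hedged sketch using pytesseract, a common Python wrapper for the open-source Tesseract engine (the filename is hypothetical), the single call below hides all of the segmentation and recognition work Sotirios described:

```python
import pytesseract
from PIL import Image

# One line of code, decades of research: the engine must first segment
# the page into blocks, lines and words before recognising characters.
text = pytesseract.image_to_string(Image.open("newspaper-page.png"))
print(text)
```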
Despite these challenges, OCR has transformed research. It doesn’t just make scanned documents searchable; it allows us to slice and analyse data in new ways, opening possibilities for both scholarship and creative projects. Sotirios shared examples ranging from his own playful ‘Primary Source Ransom Note Generator’ to Olivia Vane’s Steptext, a tool for visualising text patterns in digitised archives.
[Image: Primary Source Ransom Note Generator]
[Image: Steptext screenshot]
Sotirios reflected on his early career at the British Newspaper Archive, where OCR dramatically accelerated digitisation. What once took years could now be done in days. But this progress came with hidden costs: segmentation was still manual, and OCR training was often outsourced to low-paid workers in Southeast Asia – a practice that echoes the way modern AI models are trained today. As Sotirios noted, ‘OCR has a slightly murky past.’
The story of OCR is also tied to the big tech race of the 2000s. Google and Microsoft competed to digitise vast libraries, including the British Library’s collections. Microsoft eventually abandoned its project, returning images with minimal metadata and documentation. Google persisted, digitising millions of pages – but kept its methods opaque and focused on English-language material, leaving gaps for non-Latin scripts.
One bright spot was Tesseract, an OCR engine open-sourced by HP in 2005 and later developed by Google. Today, it underpins many OCR tools, including those used by cultural institutions.
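Because Tesseract is open source, the layout analysis it performs is easy to inspect. As a sketch (again via pytesseract, with a hypothetical filename), its `image_to_data` call exposes the segmentation step directly, returning each detected word with its block, line, position and confidence:

```python
import pytesseract
from PIL import Image

page = Image.open("newspaper-page.png")  # hypothetical scan

# Tesseract's own segmentation: one entry per detected word, with block,
# paragraph and line numbers plus pixel coordinates and confidence.
data = pytesseract.image_to_data(page, output_type=pytesseract.Output.DICT)

for i, word in enumerate(data["text"]):
    if word.strip():
        print(
            f"block {data['block_num'][i]}, line {data['line_num'][i]}: "
            f"{word!r} at ({data['left'][i]}, {data['top'][i]}), "
            f"confidence {data['conf'][i]}"
        )
```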
Printed text was only the beginning. Handwritten materials posed an even greater challenge, leading to the development of Transkribus, a platform born from EU-funded projects. It allows users to upload documents, correct transcriptions and train custom AI models – a powerful tool for cultural heritage institutions. Sotirios contributed to its early stages by helping to prepare datasets of Arabic scientific manuscripts.
[Image: Transkribus with an Arabic manuscript]
Despite the hype around AI, OCR remains a reliable workhorse – one Sotirios has often returned to throughout his career. Its evolution still raises important concerns, though: uneven datasets dominated by English-language material build bias into the tools and leave other languages behind; the human labour involved is often invisible; and OCR output now feeds large language models (LLMs), adding another layer of bias to digital systems.
_________________
Code Club is a grassroots initiative within State Library Victoria that provides space for staff to learn and engage with technology. With an inclusive and accessible approach, the club aims to increase digital literacy, demystify technology and foster cross-departmental connections. Learn more