Linking Crowdsourced Transcription to Automated Handwriting Recognition: Lessons from Transcribe Bentham

old_uid14156
titleLinking Crowdsourced Transcription to Automated Handwriting Recognition: Lessons from Transcribe Bentham
start_date2017/06/23
schedule09h40-10h25
onlineno
location_infosalle Jean Jaurès
detailsCrossing borders: Three talks on Text Analysis and Digital Humanities. Conference organised by the LATTICE laboratory
summaryFor nearly seven years, the Transcribe Bentham project has been generating high-quality crowdsourced transcripts of the writings of the philosopher and jurist Jeremy Bentham (1748-1832), held at University College London and, latterly, the British Library. Nearly 6 million words have now been transcribed by volunteers; little did we know at the outset that this project would provide an ideal, quality-controlled dataset to serve as "ground truth" for the development of Handwriting Recognition Technology. This paper will look at the past, present and future of automated handwriting analysis for documents, showing how our research on the EU Framework 7 Transcriptorium project, and now the H2020 READ project, is working towards a service to improve the searching and analysis of digitised manuscript collections across Europe, reusing the data created by crowdsourced volunteer labour for machine learning purposes. Melissa Terras is Director of UCL Centre for Digital Humanities, Professor of Digital Humanities in UCL's Department of Information Studies, and Vice Dean of Research in UCL's Faculty of Arts and Humanities. Her publications include "Image to Interpretation: Intelligent Systems to Aid Historians in the Reading of the Vindolanda Texts" (2006, Oxford University Press) and "Digital Images for the Information Professional" (2008, Ashgate), and she has co-edited various volumes such as "Digital Humanities in Practice" (Facet 2012) and "Defining Digital Humanities: A Reader" (Ashgate 2013). She is currently serving on the Board of Curators of the University of Oxford Libraries and the Board of the National Library of Scotland, and is a Fellow of the Chartered Institute of Library and Information Professionals and a Fellow of the British Computer Society. Her research focuses on the use of computational techniques to enable research in the arts and humanities that would otherwise be impossible.
responsibles<not specified>
speakers