toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Record Links
Author (up) G.Thorvaldsen; Joana Maria Pujadas-Mora; T.Andersen ; L.Eikvil; Josep Llados; Alicia Fornes; Anna Cabre edit  url
openurl 
  Title A Tale of two Transcriptions Type Journal
  Year 2015 Publication Historical Life Course Studies Abbreviated Journal  
  Volume 2 Issue Pages 1-19  
  Keywords Nominative Sources; Census; Vital Records; Computer Vision; Optical Character Recognition; Word Spotting  
  Abstract non-indexed
This article explains how two projects implement semi-automated transcription routines: for census sheets in Norway and marriage protocols from Barcelona. The Spanish system was created to transcribe the marriage license books from 1451 to 1905 for the Barcelona area; one of the world’s longest series of preserved vital records. Thus, in the Project “Five Centuries of Marriages” (5CofM) at the Autonomous University of Barcelona’s Center for Demographic Studies, the Barcelona Historical Marriage Database has been built. More than 600,000 records were transcribed by 150 transcribers working online. The Norwegian material is cross-sectional as it is the 1891 census, recorded on one sheet per person. This format and the underlining of keywords for several variables made it more feasible to semi-automate data entry than when many persons are listed on the same page. While Optical Character Recognition (OCR) for printed text is scientifically mature, computer vision research is now focused on more difficult problems such as handwriting recognition. In the marriage project, document analysis methods have been proposed to automatically recognize the marriage licenses. Fully automatic recognition is still a challenge, but some promising results have been obtained. In Spain, Norway and elsewhere the source material is available as scanned pictures on the Internet, opening up the possibility for further international cooperation concerning automating the transcription of historic source materials. Like what is being done in projects to digitize printed materials, the optimal solution is likely to be a combination of manual transcription and machine-assisted recognition also for hand-written sources.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 2352-6343 ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.077; 602.006 Approved no  
  Call Number Admin @ si @ TPA2015 Serial 2582  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: