Assisted transcription of historical documents by keyword spotting: a performance model

Adolfo Santoro; Angelo Marcelli
2017-01-01

Abstract

We propose a model for estimating the time needed to transcribe a large collection of historical handwritten documents when transcription is assisted by a keyword spotting system that follows the query-by-string approach. The model assumes that the system is segmentation-based and that, for each item, it outputs either a transcription (which may be correct or incorrect) or a reject. We also assume that any other information the system needs is obtained from the training set. The model has been validated by comparing its estimates with the actual time required for the manual transcription of pages from the Bentham dataset. Finally, we discuss possible ways of extending the model to other kinds of keyword spotting systems, such as those that output a ranked list of alternatives and/or adopt the query-by-example approach.
2017
ISBN: 978-1-5386-3586-5

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4703375