Module 1 - Layout Analysis


Outcome: text blocks (blue), text lines (light blue), decoration (yellow), page (red)

The goal of the image analysis module was to automatically detect the different layout elements of a scanned manuscript page, such as ornaments, illustrations, and text elements. The detected text elements are then passed to the text recognition module.

First, a database of document images was compiled from e-codices, a virtual manuscript library from the Medieval Institute of the University of Fribourg, and a layout model was defined [C10.2]. Based on this layout model, a semi-automatic annotation tool was developed to label the different layout elements in the database [C10.1].

 

Pyramidal approach

Machine learning techniques were developed to extract the layout elements automatically [C11.1]. A novel pyramidal approach was pursued that classifies the layout elements at successively higher resolutions. First, a rough decision is taken on a low-resolution image: each pixel of a small, downscaled page image is classified as foreground, background, or out of page. Next, at a higher resolution, foreground and background are refined into two classes, text blocks and "out of text blocks"; the latter covers illustrations, ornaments, colored initials, marginalia, etc. Finally, the text blocks are refined into text lines using a detailed, high-resolution image.
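The coarse-to-fine idea described above can be illustrated with a minimal Python sketch. It is not the project's classifier: the label codes, the helper names (downscale, upscale_labels, pyramid_labels), and the two threshold rules are all hypothetical stand-ins for trained models, and the image is a tiny grayscale grid. The sketch only shows how a label decided at a low resolution is handed down to the next level and refined there.

```python
from typing import Callable, List

Grid = List[List[int]]
Classifier = Callable[[int, int], int]  # (pixel value, inherited label) -> new label

# Hypothetical label codes, chosen only for this sketch.
BACKGROUND, FOREGROUND, TEXT = 0, 1, 2
NO_PRIOR = -1


def downscale(img: Grid, factor: int) -> Grid:
    """Average non-overlapping factor x factor blocks of a grayscale image."""
    h, w = len(img) // factor, len(img[0]) // factor
    area = factor * factor
    return [
        [
            sum(img[y * factor + dy][x * factor + dx]
                for dy in range(factor) for dx in range(factor)) // area
            for x in range(w)
        ]
        for y in range(h)
    ]


def upscale_labels(labels: Grid, factor: int) -> Grid:
    """Copy each coarse label onto the factor x factor pixels it covers."""
    return [
        [labels[y // factor][x // factor] for x in range(len(labels[0]) * factor)]
        for y in range(len(labels) * factor)
    ]


def pyramid_labels(img: Grid, factors: List[int], classifiers: List[Classifier]) -> Grid:
    """Label img level by level, from the coarsest factor to the finest.

    Assumes the image sides are multiples of factors[0] and that each
    factor is a multiple of the next one (for example [4, 2, 1]).
    """
    labels: Grid = []
    prev_factor = 0
    for factor, classify in zip(factors, classifiers):
        level = downscale(img, factor)
        if not labels:
            prior = [[NO_PRIOR] * len(level[0]) for _ in level]
        else:
            prior = upscale_labels(labels, prev_factor // factor)
        labels = [
            [classify(level[y][x], prior[y][x]) for x in range(len(level[0]))]
            for y in range(len(level))
        ]
        prev_factor = factor
    return labels


# Toy threshold rules standing in for trained classifiers.
def coarse_rule(value: int, prior: int) -> int:
    return FOREGROUND if value < 128 else BACKGROUND


def fine_rule(value: int, prior: int) -> int:
    # Only pixels inside coarse foreground may become text.
    return TEXT if prior == FOREGROUND and value < 128 else prior


page = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [255, 255, 255, 255],
    [255, 255, 255, 0],
]
result = pyramid_labels(page, [2, 1], [coarse_rule, fine_rule])
```

In this toy run the dark pixel in the bottom-right corner is not labelled as text, because its 2 x 2 block was judged background at the coarse level. In the system described here the previous level's output is one input feature among several, so a finer level is presumably less rigidly bound to the coarse decision than this hard rule suggests.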

The classifier is a special form of neural network, a dynamic multilayer perceptron (DMLP) [C11.1, C13.1]. Each pixel is classified based on several features, including its position, its color, the colors of neighboring pixels, and the classification output from the preceding level of the pyramid. For more details on the recognition system and the features, we refer to [C13.3, C13.1].
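To make the feature description concrete, the following Python sketch assembles one feature vector for a single pixel from the four kinds of information listed above. The function name pixel_features, the 3 x 3 neighborhood, the normalization to [0, 1], and the encoding of the previous label as a single number are assumptions made for illustration; the features actually used are described in [C13.3, C13.1].

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def pixel_features(
    img: List[List[RGB]],
    prev_labels: List[List[int]],
    y: int,
    x: int,
    radius: int = 1,
) -> List[float]:
    """Build a feature vector for pixel (y, x).

    prev_labels holds the label map of the preceding pyramid level,
    already resized to the shape of img (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])

    # Position, normalized to [0, 1].
    feats = [
        y / (h - 1) if h > 1 else 0.0,
        x / (w - 1) if w > 1 else 0.0,
    ]

    # Color of the pixel itself.
    feats += [c / 255 for c in img[y][x]]

    # Colors of the neighbors; coordinates are clamped at the image border.
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            ny = min(max(y + dy, 0), h - 1)
            nx = min(max(x + dx, 0), w - 1)
            feats += [c / 255 for c in img[ny][nx]]

    # Classification output of the preceding pyramid level.
    feats.append(float(prev_labels[y][x]))
    return feats
```

With radius 1 the vector has 2 + 3 + 8 x 3 + 1 = 30 entries. A vector of this kind would be computed for every pixel and passed to the classifier of the current level.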

The main aim of the module was to provide a generic layout analysis tool that can be easily adapted to different types of documents, and the use of machine learning methods made this possible. As an example, on digital images of Cod. Sang. 231, a Latin manuscript held by the Abbey Library of Saint Gall, text lines were extracted with a pixel-level accuracy of about 92% [Cn1]. Although about 8% of the pixels are misclassified, this is an encouraging result that gives automatic access to a large part of the textual content.
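For readers unfamiliar with the measure, pixel-level accuracy can be read as the fraction of pixels whose predicted label equals the ground-truth label. The Python sketch below computes it under that reading; the function name pixel_accuracy is hypothetical, and the exact evaluation protocol behind the figure reported in [Cn1] may differ (for example in which pixels are counted).

```python
from typing import List


def pixel_accuracy(pred: List[List[int]], truth: List[List[int]]) -> float:
    """Fraction of pixels whose predicted label matches the ground truth."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("label maps must have the same shape")
    total = 0
    correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    if total == 0:
        raise ValueError("label maps are empty")
    return correct / total
```

For example, a 2 x 2 label map with one wrong pixel has an accuracy of 3/4 = 0.75.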

 

Publications

[C13.1] M. Baechler, M. Liwicki, and R. Ingold. Text Line Extraction using DMLP Classifiers for Historical Manuscripts. In Proc. 12th Int. Conf. on Document Analysis and Recognition, in print, 2013.

[C13.3] H. Wei, M. Baechler, F. Slimane, and R. Ingold. Evaluation of SVM, MLP and GMM Classifiers for Layout Analysis of Historical Documents. In Proc. 12th Int. Conf. on Document Analysis and Recognition, in print, 2013.

[C12.1] A. Fischer, H. Bunke, N. Naji, J. Savoy, M. Baechler, and R. Ingold. HisDoc: Historical Document Analysis, Recognition, and Retrieval. In Digital Humanities, Book of Abstracts, pp. 94–97, 2012.

[C12.4] A. Garz, A. Fischer, R. Sablatnig, and H. Bunke. Binarization-free text line segmentation for historical documents based on interest point clustering. In Proc. 10th Int. Workshop on Document Analysis Systems, pp. 95–99, 2012. doi:10.1109/DAS.2012.23

[C11.1] M. Baechler and R. Ingold. Multi Resolution Layout Analysis of Medieval Manuscripts Using Dynamic MLP. In Proc. 11th Int. Conf. on Document Analysis and Recognition, pp. 1185–1189, 2011. doi:10.1109/ICDAR.2011.239

[C10.1] M. Baechler, J.-L. Blochle, and R. Ingold. Semi-automatic annotation tool for medieval manuscripts. In Proc. 12th Int. Conf. on Frontiers in Handwriting Recognition, pp. 182–187, 2010. doi:10.1109/ICFHR.2010.36

[C10.2] M. Baechler and R. Ingold. Medieval manuscript layout model. In Proc. 10th ACM Symposium on Document Engineering, pp. 275–278, 2010. doi:10.1145/1860559.1860622

[Cn1] A. Fischer, H. Bunke, N. Naji, J. Savoy, M. Baechler, and R. Ingold. The HisDoc Project. Automatic Analysis, Recognition, and Retrieval of Handwritten Historical Documents for Digital Libraries. In Proc. InterNational and InterDisciplinary Aspects of Scholarly Editing, in print, Bern, 2012.