Table Detection and Extraction from Image Document

International Journal of Computer & Organization Trends (IJCOT)          
 
© 2013 by IJCOT Journal
Volume-3 Issue-4                           
Year of Publication :  2013
Authors : Tanushree Dhiran, Rakesh Sharma

Citation

Tanushree Dhiran, Rakesh Sharma. "Table Detection and Extraction from Image Document". International Journal of Computer & Organization Trends (IJCOT), V3(4):7-10, Jul - Aug 2013, ISSN 2249-2593, www.ijcotjournal.org. Published by Seventh Sense Research Group.

Abstract

Tables make information easier to perceive and understand than a regular text block, and they have become a popular structure for representing information. Table formats differ and change according to the information being presented. This variety makes tables difficult for an OCR system to recognize, so the system often just segments them as an image block. We propose a novel approach that detects all types of table format in single-column image documents. Tables are categorized into three types based on their row and column separators. Type 1 tables use lines as both row and column separators. Type 2 tables use horizontal lines to separate rows and space to separate columns. In Type 3 tables, only space is used as both row and column separator. Tables are detected in image documents using a simple projection profile and the Hough line detection method. We tested this approach on 1200 image documents containing all types of table format and obtained 89% accuracy.
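
The three table types described in the abstract can be illustrated with a minimal Python sketch built on projection profiles. This is not the authors' implementation. It assumes an already binarized table region stored as nested lists (1 = ink, 0 = background), and it approximates ruling-line detection by thresholding the profile instead of running a Hough transform. The names `projection_profile`, `runs`, `classify_table` and the `line_ratio` parameter are hypothetical.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def projection_profile(img: Image, axis: str) -> List[int]:
    """Count ink pixels per row (axis="rows") or per column (axis="cols")."""
    if axis == "rows":
        return [sum(row) for row in img]
    return [sum(col) for col in zip(*img)]


def runs(profile: List[int], keep: Callable[[int], bool]) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, where keep(value) holds."""
    found, start = [], None
    for i, value in enumerate(profile):
        if keep(value) and start is None:
            start = i                      # a new run begins here
        elif not keep(value) and start is not None:
            found.append((start, i))       # the run ended at the previous index
            start = None
    if start is not None:
        found.append((start, len(profile)))
    return found


def classify_table(img: Image, line_ratio: float = 0.9) -> str:
    """Guess the table type from ruling lines seen in the profiles.

    Assumption: a row (column) whose ink count reaches line_ratio of the
    image width (height) is treated as a horizontal (vertical) ruling line.
    """
    height, width = len(img), len(img[0])
    has_h = any(v >= line_ratio * width for v in projection_profile(img, "rows"))
    has_v = any(v >= line_ratio * height for v in projection_profile(img, "cols"))
    if has_h and has_v:
        return "Type 1"    # lines separate both rows and columns
    if has_h:
        return "Type 2"    # lines separate rows, space separates columns
    if not has_v:
        return "Type 3"    # space separates both rows and columns
    return "unknown"       # vertical lines only: not covered by the abstract
```

For a Type 3 region, the column separators would then be the zero-valued runs of the column profile, for example `runs(projection_profile(img, "cols"), lambda v: v == 0)`. Thresholding the profile only works for rulings that are nearly axis-aligned, so a skewed scan would need deskewing or a real Hough line detector first.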

Keywords

Line Segmentation, Hough Line Detection, Word Level segmentation, Projection Profile