ABSTRACT
Automatic restoration of shredded paper is an important problem in computer science, with applications in the recovery of judicial evidence, the reconstruction of secret documents, and many other areas. In this article, we establish a similarity measurement model using data mining techniques, focusing on Chinese text documents that have been cut regularly. The mathematical model is used to drive the restoration, and we provide several similarity measures that accomplish the restoration while reducing the amount of manual intervention required. We also provide a method for restoring shredded documents printed on both sides. Experimental results demonstrate the effectiveness of the proposed method.