Genalog is an open-source, cross-platform Python package that generates document images with synthetic noise mimicking scanned analog documents. It lays out text using templates written in HTML and then applies various degradations to the resulting images, which makes generating synthetic documents fast and efficient.
Genalog’s capabilities include flexible format image generation, custom image degradation, extracting text from images using a cognitive search pipeline, and getting OCR performance metrics.
The package provides a comprehensive solution for generating synthetic images from any text data rich in natural language and for imitating most of the OCR noise found in scanned documents.
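To make the idea of an OCR performance metric concrete, here is a minimal sketch of one common measure, the character error rate (edit distance divided by reference length). It uses only the Python standard library; the function names `levenshtein` and `character_error_rate` are illustrative assumptions and are not taken from Genalog's API, whose own metrics may be defined differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Hypothetical helper (not a Genalog function): edit distance
    normalized by the length of the ground-truth text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One dropped character and one extra character over 8 reference characters.
    print(character_error_rate("New York", "Nw Yorkk"))
```

A lower value means the OCR output is closer to the source text; a value of 0 means the two strings are identical.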
Genalog provides you with several document templates to use as a starting point. The document’s layout can be altered using standard CSS properties like font-family, font-size, text-align, etc.
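As a rough illustration of how CSS properties can drive a document's layout, the sketch below fills a small HTML template with style values. It is an assumption-laden stand-in: the template text, the `render_page` function, and its parameters are hypothetical and use Python's standard `string.Template`, not Genalog's bundled templates or its API.

```python
from html import escape
from string import Template

# Hypothetical one-paragraph template; Genalog ships its own templates.
PAGE = Template("""<!DOCTYPE html>
<html>
<head>
<style>
  body { font-family: $font_family; font-size: $font_size; text-align: $text_align; }
</style>
</head>
<body>
<p>$text</p>
</body>
</html>
""")


def render_page(text: str,
                font_family: str = "Georgia, serif",
                font_size: str = "12pt",
                text_align: str = "justify") -> str:
    """Return an HTML page whose layout is set by standard CSS properties."""
    return PAGE.substitute(
        text=escape(text),  # escape &, <, > so the text cannot break the markup
        font_family=font_family,
        font_size=font_size,
        text_align=text_align,
    )


if __name__ == "__main__":
    print(render_page("Sample text rich in natural language.", font_size="14pt"))
```

Changing only the style arguments yields visually different pages from the same text, which is the kind of variation a template-based generator relies on.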
Apart from document generation and degradation, Genalog also provides an efficient implementation of text alignment between the source text and the noisy text.
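Text alignment here means padding the source and noisy strings so that corresponding characters sit at the same positions. The sketch below shows the idea using the standard library's `difflib.SequenceMatcher`; this is a swapped-in technique for illustration, not Genalog's own alignment algorithm, and the `align` function and the `@` gap marker are assumptions made for this example.

```python
from difflib import SequenceMatcher

GAP = "@"  # assumed gap marker; the inputs should not contain this character


def align(source: str, noisy: str, gap: str = GAP) -> tuple[str, str]:
    """Pad both strings with `gap` so that they end up the same length
    and matching characters line up position by position."""
    matcher = SequenceMatcher(None, source, noisy, autojunk=False)
    source_out, noisy_out = [], []
    for _tag, i1, i2, j1, j2 in matcher.get_opcodes():
        src_seg, noisy_seg = source[i1:i2], noisy[j1:j2]
        # Pad the shorter segment so both sides of this edit have equal width.
        width = max(len(src_seg), len(noisy_seg))
        source_out.append(src_seg.ljust(width, gap))
        noisy_out.append(noisy_seg.ljust(width, gap))
    return "".join(source_out), "".join(noisy_out)


if __name__ == "__main__":
    aligned_source, aligned_noisy = align("New York", "Nw Yorkk")
    print(aligned_source)
    print(aligned_noisy)
```

Once the two strings are aligned, labels attached to the source text (for example, entity tags) can be carried over to the noisy text by position.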
Related Paper: https://arxiv.org/pdf/2108.02899.pdf
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views, illustrating its popularity among audiences.