Microsoft Unveils Genalog: An Open Source, AI Cross-Platform Python Package For Generating Document Images With Synthetic Noise

Source: https://microsoft.github.io/genalog/

Genalog is an open-source, cross-platform Python package that generates document images with synthetic noise that mimics scanned analog documents. It offers a fast way to produce synthetic documents: text is laid out using templates written in HTML, and various degradations can then be applied to the resulting images.

Genalog’s capabilities include flexible format image generation, custom image degradation, extracting text from images using a cognitive search pipeline, and getting OCR performance metrics.
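To make the "OCR performance metrics" item concrete, here is a minimal sketch of one widely used metric, the character error rate (CER), computed from the Levenshtein edit distance. This is an illustration only: the function names are hypothetical, and the metrics Genalog itself reports may be defined differently.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def character_error_rate(ground_truth: str, ocr_text: str) -> float:
    """Edit distance normalized by the length of the ground-truth text."""
    if not ground_truth:
        raise ValueError("ground_truth must be non-empty")
    return edit_distance(ground_truth, ocr_text) / len(ground_truth)
```

A lower value means the OCR output is closer to the original text; a value of 0 means the two strings are identical.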

The package can generate synthetic images from any natural-language text and imitates most of the OCR noise found in scanned documents.

Genalog provides you with several document templates to use as a starting point. The document’s layout can be altered using standard CSS properties like font-family, font-size, text-align, etc.
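The idea of an HTML template whose layout is controlled through CSS properties can be sketched with nothing but the Python standard library. The template and the `render_page` helper below are hypothetical and far simpler than the templates that ship with Genalog; they only show how text and style values could be combined into an HTML page before it is rendered to an image.

```python
import html
from string import Template

# Hypothetical minimal template; not one of Genalog's bundled templates.
PAGE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head>
<style>
  body { font-family: $font_family; font-size: $font_size; text-align: $text_align; }
</style>
</head>
<body>
<p>$text</p>
</body>
</html>
""")


def render_page(text, font_family="Times", font_size="12px", text_align="left"):
    """Fill the template with escaped text and CSS values; return HTML source."""
    return PAGE_TEMPLATE.substitute(
        text=html.escape(text),  # keep characters such as & and < from breaking the markup
        font_family=font_family,
        font_size=font_size,
        text_align=text_align,
    )


page = render_page("Scanned memo, 1987", font_family="Courier", font_size="14px")
```

Changing the keyword arguments changes only the style block, so the same text can be rendered in many visual variants.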

Apart from document generation and degradation, Genalog also provides an efficient implementation of text alignment between the source text and the noisy text.
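As an illustration of what aligning source and noisy text means, here is a minimal character-level global alignment in the style of the Needleman-Wunsch algorithm. It is a sketch under stated assumptions, not Genalog's implementation: the function name, the scoring values, and the `@` gap character are all choices made for this example.

```python
def align(source, noisy, gap_char="@", match=1, mismatch=-1, gap=-1):
    """Globally align two strings, padding with gap_char so lengths match."""
    n, m = len(source), len(noisy)

    def sub(i, j):
        # Score for pairing source[i-1] with noisy[j-1].
        return match if source[i - 1] == noisy[j - 1] else mismatch

    # score[i][j] = best score for aligning source[:i] with noisy[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_src, out_noisy = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_src.append(source[i - 1])
            out_noisy.append(noisy[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_src.append(source[i - 1])  # character missing from the noisy text
            out_noisy.append(gap_char)
            i -= 1
        else:
            out_src.append(gap_char)       # extra character in the noisy text
            out_noisy.append(noisy[j - 1])
            j -= 1
    return "".join(reversed(out_src)), "".join(reversed(out_noisy))
```

Once the two strings have equal length, each position can be compared directly, which is what makes it possible to map labels or tokens from the clean text onto the OCR output. The sketch assumes the gap character does not occur in either input.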

GitHub: https://github.com/microsoft/genalog

Related Paper: https://arxiv.org/pdf/2108.02899.pdf