[Public] Unsatisfactory OCR result output from IronOCR

Enhancing OCR Accuracy with IronOcr.Extensions.AdvancedScan

Working with IronOCR can sometimes be challenging, as the effectiveness of a particular configuration, setting, or filter may vary depending on the input document. A solution that works well for one input might not yield the same results for another.

In many cases, integrating the IronOcr.Extensions.AdvancedScan package significantly improves OCR accuracy.

Key Differences: IronOcr vs. IronOcr.Extensions.AdvancedScan

The primary distinction between IronOcr.Extensions.AdvancedScan and the standard IronOCR package lies in the OCR engine they use:

- IronOCR uses the Tesseract engine.
- IronOcr.Extensions.AdvancedScan is powered by PaddleOCR, an OCR engine built on deep-learning models.

The AdvancedScan package is particularly effective on noisy images such as screenshots, and on structured documents or images such as passports and license plates. To accommodate these use cases, the package introduces specialized read methods for these input types.
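The specialized methods mentioned above can be sketched as follows. This is a minimal illustration rather than an authoritative reference: the method names (ReadScreenShot, ReadLicensePlate, ReadPassport) are given as they appear in the IronOCR how-to guides at the time of writing and should be checked against the API Reference for the installed version. The image file names are hypothetical placeholders, and the example assumes both the IronOcr and IronOcr.Extensions.AdvancedScan NuGet packages are installed.

```csharp
// Sketch only: requires the IronOcr and IronOcr.Extensions.AdvancedScan packages.
// All file names below are hypothetical placeholders.
using System;
using IronOcr;

class AdvancedScanSketch
{
    static void Main()
    {
        // The AdvancedScan methods are called on the regular IronTesseract object.
        var ocr = new IronTesseract();

        // Screenshot input
        using (var screenshot = new OcrInput())
        {
            screenshot.LoadImage("screenshot.png");
            var screenshotResult = ocr.ReadScreenShot(screenshot);
            Console.WriteLine(screenshotResult.Text);
        }

        // License plate input
        using (var plate = new OcrInput())
        {
            plate.LoadImage("license-plate.jpg");
            var plateResult = ocr.ReadLicensePlate(plate);
            Console.WriteLine(plateResult.Text);
        }

        // Passport input
        using (var passport = new OcrInput())
        {
            passport.LoadImage("passport.jpg");
            var passportResult = ocr.ReadPassport(passport);
            Console.WriteLine(passportResult.Text);
        }
    }
}
```

Each call returns its own result type, so only the Text property is used here. Any further properties should be looked up per method in the API Reference.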

Considerations for .NET Framework Users

For applications built on .NET Framework, it is crucial to set the project's platform target to x64. AdvancedScan requires a significant amount of memory, and running it under a different architecture (for example, x86) may lead to runtime errors.
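The platform target is normally set in the project's build properties (in an MSBuild project file this is the PlatformTarget property with the value x64). As a defensive measure, an application could also verify at startup that it is really running as a 64-bit process before calling any AdvancedScan method. The sketch below uses only Environment.Is64BitProcess from the .NET base class library. The class name and messages are illustrative assumptions, not part of IronOCR.

```csharp
using System;

class ArchitectureCheck
{
    static void Main()
    {
        // Environment.Is64BitProcess is true only when the current process is 64-bit.
        // A project built as x86 (or AnyCPU with "Prefer 32-bit") reports false here.
        if (!Environment.Is64BitProcess)
        {
            Console.Error.WriteLine(
                "This process is not 64-bit. Set the platform target to x64 before using AdvancedScan.");
            return;
        }

        Console.WriteLine("Running as a 64-bit process; AdvancedScan calls can proceed.");
    }
}
```

Failing early with a clear message is usually easier to diagnose than a runtime error raised from inside the OCR call.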

For more details, refer to the official troubleshooting guide:
🔗 Advanced Scan on .NET Framework

Limitations and Future Development

While the AdvancedScan package provides impressive OCR accuracy, it does have some limitations:

- The result objects from AdvancedScan methods have fewer features than the standard OcrResult object.
- Creating searchable PDFs from AdvancedScan outputs is not yet supported (this feature is currently under development).
- Different read methods return different result object types, so refer to the API Reference for details on each.
- Language support is limited. The supported languages currently are:
  - English
  - Chinese
  - Japanese
  - Korean
  - LatinAlphabets

Conclusion

Customers experiencing OCR accuracy issues with IronOCR should consider adding IronOcr.Extensions.AdvancedScan to their solution. Its PaddleOCR engine makes it a powerful alternative for inputs that the standard Tesseract-based engine handles poorly, such as screenshots, passports, and license plates.