[Public] Unsatisfactory OCR result output from IronOCR
IronOCR did not give a satisfactory result? Here is what to do.
Enhancing OCR Accuracy with IronOcr.Extensions.AdvancedScan
Working with IronOCR can sometimes be challenging, as the effectiveness of a particular configuration, setting, or filter may vary depending on the input document. A solution that works well for one input might not yield the same results for another.
In many cases, integrating the IronOcr.Extensions.AdvancedScan package significantly improves OCR accuracy.
Key Differences: IronOcr vs. IronOcr.Extensions.AdvancedScan
The primary distinction between IronOcr.Extensions.AdvancedScan and the standard IronOCR package lies in the OCR engine they use:
- IronOCR utilizes the Tesseract Engine.
- IronOcr.Extensions.AdvancedScan is powered by PaddleOCR, which includes machine learning capabilities.
This advanced package is particularly effective on noisy images such as screenshots, and on structured documents or images such as passports and license plates. To accommodate these use cases, the package introduces specialized read methods.
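As a rough sketch of how these specialized read methods are typically called, the example below assumes that both the IronOcr and IronOcr.Extensions.AdvancedScan NuGet packages are installed. The method names ReadScreenShot, ReadLicensePlate, and ReadPassport are written as they appear in the IronOCR how-to guides and should be confirmed against the API Reference for the installed version. The image file names are hypothetical placeholders.

```csharp
using System;
using IronOcr;

class AdvancedScanSketch
{
    static void Main()
    {
        // Assumption: IronOcr.Extensions.AdvancedScan is installed next to IronOcr,
        // which makes the specialized read methods available on IronTesseract.
        var ocr = new IronTesseract();

        // Screenshot input ("screenshot.png" is a hypothetical file name).
        using (var screenshot = new OcrInput())
        {
            screenshot.LoadImage("screenshot.png");
            var screenshotResult = ocr.ReadScreenShot(screenshot);
            Console.WriteLine(screenshotResult.Text);
        }

        // License plate input ("plate.jpeg" is a hypothetical file name).
        using (var plate = new OcrInput())
        {
            plate.LoadImage("plate.jpeg");
            var plateResult = ocr.ReadLicensePlate(plate);
            Console.WriteLine(plateResult.Text);
        }

        // Passport input ("passport.jpg" is a hypothetical file name).
        using (var passport = new OcrInput())
        {
            passport.LoadImage("passport.jpg");
            var passportResult = ocr.ReadPassport(passport);
            Console.WriteLine(passportResult.Text);
        }
    }
}
```

Only the Text property is read here because each method returns its own result type; other members differ from one result type to another.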
Considerations for .NET Framework Users
For applications built on .NET Framework, it is crucial to configure the project as x64. AdvancedScan requires a significant amount of memory, and running it under a different architecture, such as x86, may lead to runtime errors.
For more details, refer to the official troubleshooting guide:
🔗 Advanced Scan on .NET Framework
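One possible way to catch a wrong platform target early is a startup check like the sketch below. The class and method names are hypothetical and are not part of IronOCR. The check relies only on Environment.Is64BitProcess from the .NET base class library, so it confirms that the process is 64-bit but cannot tell x64 apart from other 64-bit architectures.

```csharp
using System;

static class ArchitectureGuard
{
    // Hypothetical helper (not an IronOCR API): fail fast when the
    // current process is not 64-bit, before any AdvancedScan call runs.
    public static void Ensure64BitProcess()
    {
        if (!Environment.Is64BitProcess)
        {
            throw new PlatformNotSupportedException(
                "AdvancedScan needs a 64-bit process. Set the project's platform target to x64.");
        }
    }
}
```

Calling ArchitectureGuard.Ensure64BitProcess() once at application startup turns a hard-to-diagnose runtime error into an explicit message about the build configuration.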
Limitations and Future Development
While the AdvancedScan package provides impressive OCR accuracy, it does have some limitations:
- The result objects from AdvancedScan methods have fewer features compared to the standard OcrResult object.
- Creating searchable PDFs from AdvancedScan outputs is not yet supported (this feature is currently under development).
- Different read methods return different result object types, so refer to the API Reference for details on each.
- Limited language support. Supported languages currently include:
  - English
  - Chinese
  - Japanese
  - Korean
  - LatinAlphabets
Conclusion
For customers experiencing issues with OCR accuracy using IronOCR, consider adding IronOcr.Extensions.AdvancedScan to the solution. Its PaddleOCR engine and machine learning enhancements make it a powerful alternative for handling complex image processing tasks.