Information extraction from invoices for expense management tool

THE PROBLEM

The client, a fintech software provider, was developing an AI-enabled Employee Expense Management Tool to automate expense processing for its customers. To that end, the client required a generic AI module that extracts a predefined set of relevant information from invoices, bills, and receipts issued by different vendors in image or PDF format.

INXITE OUT APPROACH

Data Annotation & Knowledge Repository Creation

Documents (invoices, bills, and receipts) in different formats (PDF and image) from various organizations were analyzed to create a continually evolving Knowledge Repository. In addition, training data was annotated for the computer vision models to be trained.

Text Extraction from Images using Computer Vision

A custom object detection model was applied in conjunction with open-source image-to-text libraries to extract different clusters of textual information from image files. The text extracted from these clusters was then aggregated for information extraction using NLP models.
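The detect, read, and aggregate flow described above can be sketched roughly as follows. This is a minimal illustration, not the client's implementation: the names TextCluster, extract_clusters, and aggregate are hypothetical, the detector and the image-to-text step are passed in as plain callables so that any detection model or OCR library could be plugged in, and the top-to-bottom, left-to-right ordering is an assumed aggregation rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (left, top, right, bottom) in pixels; an assumed box convention
Box = Tuple[int, int, int, int]


@dataclass
class TextCluster:
    label: str  # hypothetical region label, e.g. "header" or "totals"
    box: Box    # where the detector found the region
    text: str   # text read from that region


def extract_clusters(
    image: object,
    detect: Callable[[object], List[Tuple[str, Box]]],
    ocr: Callable[[object, Box], str],
) -> List[TextCluster]:
    """Run the detector, then read the text of each detected region."""
    return [
        TextCluster(label, box, ocr(image, box).strip())
        for label, box in detect(image)
    ]


def aggregate(clusters: List[TextCluster]) -> str:
    """Join cluster text in reading order: top to bottom, then left to right."""
    ordered = sorted(clusters, key=lambda c: (c.box[1], c.box[0]))
    return "\n".join(c.text for c in ordered if c.text)
```

In practice, detect would wrap the trained object detection model and ocr would wrap an image-to-text library call on the cropped region; the aggregated string is what the downstream extraction step consumes.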

NLP Based Information Extraction from Text

Context-aware regex rules, developed from the Knowledge Repository, are applied to extract the relevant information from text-based files and from the text clusters of image files.
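As an illustration of what a context-aware rule might look like, the sketch below anchors each value pattern to a nearby keyword, so that a number is only treated as a total when it follows a "total" label. The field names, keywords, and patterns here are assumptions made for the example; the actual rules derived from the Knowledge Repository are not described in this case study.

```python
import re
from typing import Dict, Optional

# Hypothetical rules: each pattern ties the captured value to its keyword context.
RULES = {
    "invoice_number": re.compile(
        r"(?i)\binvoice\s*(?:no\.?|number|#)\s*[:\-]?\s*([A-Z0-9][A-Z0-9\-/]*)"
    ),
    "invoice_date": re.compile(
        r"(?i)\b(?:invoice\s+)?date\s*[:\-]?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})"
    ),
    # \b keeps "Subtotal" from matching; an optional currency symbol is skipped.
    "total_amount": re.compile(
        r"(?i)\b(?:grand\s+total|total\s+due|total)\s*[:\-]?\s*[^\d\s]?\s*(\d[\d,]*\.\d{2})"
    ),
}


def extract_fields(text: str) -> Dict[str, Optional[str]]:
    """Apply every rule to the text; fields without a match are set to None."""
    result: Dict[str, Optional[str]] = {}
    for field, pattern in RULES.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result


sample = (
    "ACME Ltd\n"
    "Invoice No: INV-2041\n"
    "Date: 12/03/2023\n"
    "Subtotal: $90.00\n"
    "Total: $99.00"
)
fields = extract_fields(sample)
```

Keeping the rules in a single dictionary makes it straightforward to add or refine patterns as new vendor layouts are encountered.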

RESULT

The solution was adopted by the client and integrated into their product.