Fashion might shine on the runway, but there’s an unseen force that keeps it moving: supply chain management. This web of transactions, certifications, and validations relied heavily, until recently, on human intervention. A partnership between Levi9’s data science and engineering team and a clothing giant demonstrates a more efficient approach. Levi9’s AI-based solution significantly cut back the manual work on tens of thousands of documents.
The paper trail problem
Our customer must handle and validate tens of thousands of certificates, most of which arrive as scanned documents. “These bureaucratic administrative procedures are time-consuming, create a heavy workload, and are also prone to human error,” explains Ana-Maria Cehan, Levi9’s Data Scientist.
The question was clear: how could this customer maintain the accuracy and integrity of its supply chain documentation while reducing the burden on its workforce? Levi9 accepted the challenge and proposed an artificial intelligence-based solution.
Levi9's AI-powered solution
“We have developed an AI-based solution using AWS Textract that extracts relevant information from structured and semi-structured scanned documents.” With this solution, for example, purchase orders can be validated against invoices received from the seller.
This rather complex process involves several checks, such as verifying that certificates were renewed periodically and that the goods and services listed in a certificate match the purchase orders. Ana-Maria Cehan explains how the process works: “If the validation fails, an alert is triggered.” The system automatically flags the document and informs providers that they need to resend the information because it is either incomplete or missing when compared with the purchase order.
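To make the two checks above concrete, here is a minimal sketch in plain Python. It is not Levi9’s implementation: the `Certificate` and `PurchaseOrder` classes, their fields, and the sample values are all hypothetical, and the real system works on data extracted from scanned documents rather than on hand-written objects.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Certificate:  # hypothetical shape of an extracted certificate
    supplier: str
    valid_until: date
    goods: set[str]


@dataclass
class PurchaseOrder:  # hypothetical shape of a purchase order record
    supplier: str
    order_date: date
    goods: set[str]


def validate(cert: Certificate, order: PurchaseOrder) -> list[str]:
    """Return a list of problems; an empty list means the validation passed."""
    problems = []
    # Check 1: the certificate must still be valid on the order date.
    if cert.valid_until < order.order_date:
        problems.append(f"certificate expired on {cert.valid_until.isoformat()}")
    # Check 2: every ordered item must be covered by the certificate.
    missing = sorted(order.goods - cert.goods)
    if missing:
        problems.append("goods not covered by certificate: " + ", ".join(missing))
    return problems


# Made-up sample data: an expired certificate that also lacks one ordered item.
cert = Certificate("Acme Textiles", date(2024, 3, 31), {"organic cotton", "denim"})
order = PurchaseOrder("Acme Textiles", date(2024, 6, 1), {"denim", "linen"})

alerts = validate(cert, order)
for problem in alerts:
    print(f"ALERT for {order.supplier}: {problem}")
```

Returning a list of problems rather than a single pass/fail flag means one alert can tell the provider everything that needs to be resent.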
But how does one teach a machine to read and understand complex supply chain documentation? It’s not as simple as feeding it a dictionary and hoping for the best. The process, as Ana Maria puts it, is “an art.”
Teaching machines to read between the lines
“Working with information extracted from scanned documents implies understanding the needs of the client, development of those needs, Natural Language Processing techniques knowledge, and lots of creativity,” Ana Maria explains.
The solution developed by Levi9 is akin to teaching a computer to read between the lines – literally. It doesn’t just scan for keywords; it understands context, recognizes patterns, and can even interpret the spatial relationships between different pieces of information on a page.
Why tables are tricky for a machine
One of the key challenges was dealing with tables in scanned documents. What is easy for the average human eye proves difficult for a machine: “We can recognize columns and rows and cells within a row, even when there are no table lines. But a machine struggles.”
To overcome this issue, Levi9’s team used AWS Textract. It parses information from well-structured documents, such as those with tables, and also from documents that look structured but are in fact semi-structured. Ana Maria explains the approach: “Textract is an amazing API from AWS, a Questioning and Answering Machine Learning Model that does not only consider what it was trained on to give answers but also can reply with new information it sees by understanding the question and also answering looking at the other details on the page and their coordinates.” In practice, by combining the AI results with post-processing techniques, the tool managed to recover the structure of a document even when that structure was not explicitly defined.
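As an illustration of the kind of post-processing involved, the sketch below rebuilds a table from Textract output. It assumes the block list comes from a boto3 call such as `analyze_document(..., FeatureTypes=["TABLES"])`, which returns `TABLE`, `CELL`, and `WORD` blocks linked by `CHILD` relationships. The `table_rows` helper and the hand-written `table_blocks` sample are hypothetical and far smaller than a real response; this is not the project’s actual code.

```python
def table_rows(blocks: list[dict]) -> list[list[str]]:
    """Rebuild the first TABLE in a Textract block list as rows of cell text."""
    by_id = {b["Id"]: b for b in blocks}

    def children(block: dict) -> list[dict]:
        # Follow only CHILD relationships (TABLE -> CELL, CELL -> WORD).
        ids = [i for rel in block.get("Relationships", [])
               if rel["Type"] == "CHILD" for i in rel["Ids"]]
        return [by_id[i] for i in ids if i in by_id]

    table = next((b for b in blocks if b["BlockType"] == "TABLE"), None)
    if table is None:
        return []

    # Textract numbers rows and columns from 1.
    grid: dict[tuple[int, int], str] = {}
    for cell in children(table):
        if cell["BlockType"] != "CELL":
            continue
        words = [w["Text"] for w in children(cell) if w["BlockType"] == "WORD"]
        grid[(cell["RowIndex"], cell["ColumnIndex"])] = " ".join(words)

    n_rows = max((r for r, _ in grid), default=0)
    n_cols = max((c for _, c in grid), default=0)
    return [[grid.get((r, c), "") for c in range(1, n_cols + 1)]
            for r in range(1, n_rows + 1)]


def _cell(cell_id: str, row: int, col: int, word_ids: list[str]) -> dict:
    """Build a made-up CELL block for the sample below."""
    return {"Id": cell_id, "BlockType": "CELL", "RowIndex": row, "ColumnIndex": col,
            "Relationships": [{"Type": "CHILD", "Ids": word_ids}]}


# Hand-written sample in the shape of a Textract response (values are made up).
table_blocks = [
    {"Id": "t1", "BlockType": "TABLE",
     "Relationships": [{"Type": "CHILD", "Ids": ["c1", "c2", "c3", "c4"]}]},
    _cell("c1", 1, 1, ["w1"]),
    _cell("c2", 1, 2, ["w2"]),
    _cell("c3", 2, 1, ["w3", "w4"]),
    _cell("c4", 2, 2, ["w5"]),
    {"Id": "w1", "BlockType": "WORD", "Text": "Product"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Quantity"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Organic"},
    {"Id": "w4", "BlockType": "WORD", "Text": "cotton"},
    {"Id": "w5", "BlockType": "WORD", "Text": "500"},
]

for row in table_rows(table_blocks):
    print(row)
```

Because the row and column indices come from Textract rather than from drawn lines, the same approach also works for tables that have no visible borders.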
Textract is also used to query the scanned documents in this project for specific information, such as dates and company names, a task typical of Named Entity Recognition (NER) systems.
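A query-based extraction of this kind could look roughly like the sketch below. It assumes Textract’s Queries feature (`FeatureTypes=["QUERIES"]` with a `QueriesConfig`), which returns `QUERY` blocks linked to `QUERY_RESULT` blocks through `ANSWER` relationships. The question texts, the aliases, the `query_answers` helper, and the sample blocks are hypothetical; the article does not list the actual queries used.

```python
from typing import Optional

# Hypothetical questions; this dict would be passed as QueriesConfig to Textract.
QUERIES_CONFIG = {
    "Queries": [
        {"Text": "What is the certificate expiry date?", "Alias": "EXPIRY_DATE"},
        {"Text": "Who is the certificate issued to?", "Alias": "COMPANY_NAME"},
    ]
}


def query_answers(blocks: list[dict]) -> dict[str, Optional[str]]:
    """Map each query alias to its most confident answer, or None if unanswered."""
    by_id = {b["Id"]: b for b in blocks}
    answers: dict[str, Optional[str]] = {}
    for block in blocks:
        if block["BlockType"] != "QUERY":
            continue
        key = block["Query"].get("Alias") or block["Query"]["Text"]
        results = [by_id[i] for rel in block.get("Relationships", [])
                   if rel["Type"] == "ANSWER" for i in rel["Ids"] if i in by_id]
        best = max(results, key=lambda r: r.get("Confidence", 0.0), default=None)
        answers[key] = best["Text"] if best else None
    return answers


# Hand-written sample in the shape of a Textract response (values are made up).
query_blocks = [
    {"Id": "q1", "BlockType": "QUERY",
     "Query": {"Text": "What is the certificate expiry date?", "Alias": "EXPIRY_DATE"},
     "Relationships": [{"Type": "ANSWER", "Ids": ["a1"]}]},
    {"Id": "a1", "BlockType": "QUERY_RESULT", "Text": "2025-03-31", "Confidence": 97.0},
    {"Id": "q2", "BlockType": "QUERY",
     "Query": {"Text": "Who is the certificate issued to?", "Alias": "COMPANY_NAME"}},
]

print(query_answers(query_blocks))
```

A query with no answer is kept as `None` instead of being dropped, so a missing field can itself trigger a validation alert.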
The power of Natural Language Processing
But the real magic happens in the matching process. The system doesn’t just extract information. It compares it to existing data, looking for discrepancies or inconsistencies, using the power of Natural Language Processing (NLP).
“We used various NLP techniques, such as clean-up, preprocessing, removing stopwords, and computing cosine distance,” Ana Maria explains. One of the key techniques employed in this process is Named Entity Recognition (NER), an important piece of Natural Language Processing. NER allows the system to automatically identify and classify specific entities within the text, such as dates, company names, and product details.
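The clean-up, stopword removal, and cosine distance steps mentioned in the quote can be sketched with the standard library alone. This is an illustrative bag-of-words version, not the project’s code: the tiny `STOPWORDS` set, the tokenisation rule, and the sample strings are assumptions, and a production pipeline would more likely rely on an NLP library and a fuller stopword list.

```python
import math
import re
from collections import Counter

# Deliberately tiny, hypothetical stopword list for the example.
STOPWORDS = {"a", "an", "and", "of", "the", "for", "with", "in"}


def tokens(text: str) -> list[str]:
    """Clean-up and preprocessing: lowercase, keep alphanumeric runs, drop stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two strings."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # one side has no usable tokens
    return dot / (norm_a * norm_b)


def cosine_distance(a: str, b: str) -> float:
    """0.0 means identical token counts; 1.0 means no tokens in common."""
    return 1.0 - cosine_similarity(a, b)


# Compare a product line from a certificate with one from a purchase order.
certificate_line = "The Organic Cotton T-Shirt"
order_line = "organic cotton shirt, white"
print(round(cosine_distance(certificate_line, order_line), 3))
```

A distance threshold then decides whether two descriptions count as the same item; where to set it is a tuning decision that depends on the documents at hand.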
The solution developed with Levi9 reduced our customer’s human workload from manually checking tens of thousands of documents to only reviewing the handful that trigger automatic alerts.
A careful integration of technical solutions
The implementation of this system required a carefully orchestrated symphony of AWS services, including S3 for storage, CloudWatch for debugging, Athena for querying ground truth tables, and Glue Jobs for structuring the extracted data. As mentioned above, AWS Textract was essential in parsing information from structured and semi-structured documents. SNS messages were used to send notifications to a Teams channel, ensuring that our customer’s staff were always in the loop about any errors or failures in the process. Finally, all results were visualized in Tableau.
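For the notification step, a failure alert published through SNS might be assembled along these lines. This is a sketch under assumptions: the message layout, the `build_alert` and `send_alert` helpers, and the topic ARN are hypothetical, and the article does not describe how the SNS topic is connected to the Teams channel. The only AWS call used is the standard boto3 `publish(TopicArn=..., Subject=..., Message=...)` on an SNS client.

```python
import json


def build_alert(document_key: str, problems: list[str]) -> dict:
    """Assemble the Subject and Message arguments for an SNS publish call."""
    return {
        # SNS limits the subject of a message to 100 characters.
        "Subject": f"Validation failed: {document_key}"[:100],
        "Message": json.dumps({"document": document_key, "problems": problems}, indent=2),
    }


def send_alert(sns_client, topic_arn: str, document_key: str, problems: list[str]):
    """Publish the alert; sns_client is expected to be boto3.client("sns")."""
    alert = build_alert(document_key, problems)
    return sns_client.publish(TopicArn=topic_arn, **alert)


# Example usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   send_alert(boto3.client("sns"),
#              "arn:aws:sns:eu-west-1:123456789012:validation-alerts",
#              "certificates/acme-2024.pdf",
#              ["certificate expired on 2024-03-31"])
```

Passing the client in as a parameter keeps the function easy to test with a stand-in object, without touching AWS.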
A three-step approach for your company
Can this approach be applied to other companies and industries? Ana Maria Cehan says yes, but keep in mind that AI, like any technology, is not a one-size-fits-all solution. Levi9’s data scientist recommends a three-step approach for companies looking to leverage AI in their supply chain management:
- Define the problem: Look at your supply chain and identify the specific challenges you’re facing.
- Assess your data: Consider what data you have, or what you can gather. The magic often happens when you look at the entire chain, not just individual components.
- Start small and scale: Begin with a focused project and gradually work towards a unified data model.
Levi9 experts can help you in all stages of this journey.
If you’re hesitant about adopting AI in your company, Ana Maria has some advice for you: “The future will be different. Period. Embrace AI within your organization early as an essential strategic move, rather than being under pressure to do so later.”