Making Document and Content Extraction Easy with K2, UiPath, and Azure AI
Last Updated Friday, August 16, 2019
Much of my career has been spent in the area of process automation technologies, and before joining K2 my life was consumed by Document Automation: the ability to assemble documents, Word or PDF, using data sets plus rules defined by someone in the organization who creates document templates. Part of my role was to be an evangelist for the technology, getting excited about how document automation can radically reduce risk, ensure compliance and increase efficiency in document production.
As an aside - the thing I love about the process automation marketing rule of three “reduce risk, ensure compliance and increase efficiency” is that it can be applied to any tool in the industry. Marketing a technology with this kind of tagline is really not a game changer, that imaginative or a differentiator. See what I did there. I say this because I’m sure most people take this as a given when buying technology for their organization. Why would you buy something that “increases risk, makes you non-compliant and decreases your efficiency?” But I digress.
Working in that space allowed me to see first-hand the problems that organizations have “going digital.” It’s not about digitizing what you have. It’s about changing the way we work using technology – where appropriate. There is a tremendous amount of great technology out in the wild; however, sometimes the tooling promises a lot but in practice gets in the way of solving the underlying business problem. I believe the reason for this lies at the feet of the people who design those technologies or solutions without looking at the human/user factors – we don’t fully investigate the problem we are trying to solve. Sometimes we solve the wrong problems because we want the latest and greatest “cool” new tech rather than the boring tech that may provide a better solution to the organization’s problems. For me, this was a recurring theme when working to get people excited about document automation. I would start many conference presentations with the line “What I am about to talk about and demonstrate is not bleeding edge technology, but I promise you it will make a tangible difference to your organization today.”
The reason I preface this blog with these musings is that I am fascinated by the amount of legacy data and key information that organizations have in storage, especially contained within Word or PDF documents. Many organizations cannot categorically say what obligations they have to many of their customers, as they would have to go into every single document they have created to find out. In 2013 (I know this is an old stat), Microsoft casually mentioned that over 500 billion Word documents had been created the previous year. That is a lot of unstructured data. How could you go about trying to extract that information and make it valuable? Make it usable? Make decisions on it? I have had many conversations over the years with one of my colleagues about different methods that could be used to extract key information from this unstructured data. I even saw some early prototypes that worked extremely well. There are technologies dedicated to this end that use Machine Learning to facilitate it. But that got me thinking, could you do this with K2?
If you’ve read this far, you’d be disappointed if I said no. In fact, this blog wouldn’t exist if the answer was no. So yes, yes you can. To do it we need to call upon a couple of other technologies – UiPath and Azure Cognitive Services. These two tools can easily be consumed within K2 using our SmartObject integration technologies, which allow you to connect to compliant REST services.
Here’s a short explanation of how this works. We built a process app on K2 that allows a user to upload a document. The upload triggers a workflow that coordinates the extraction of the document content using a UiPath robot. K2 then gets that content and passes it into Azure Text Analysis to extract the keywords from the document text.
Here’s a little more detail. We have a K2 SmartForm made up of a few views. One of the controls on the page is a file upload, which on receipt of a file will trigger my workflow and start the process of extracting the content for said document. The first step in the workflow sends the document to a location that can be read by the UiPath robot. In this instance, we’re using an Azure SQL database. We then trigger the UiPath Robot using a REST call that we created using the K2 SmartObject REST broker (further reading on the REST broker). This Robot then performs the following:
takes the file uploaded to K2 and opens it in Microsoft Word
extracts the text content
cleans the text structures and splits it into smaller text chunks – mainly by sentence
converts the Word document to PDF
saves the PDF, the full document text and the newly created sentence model back to the Azure SQL database; and
calls K2 to say it is finished
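To make the "clean and split" step above a little more concrete, here is a minimal Python sketch of the idea. To be clear, this is not what the UiPath robot runs; the robot does this with its own activities. The function names `clean_text` and `split_sentences` are made up for illustration, and the splitting rule is deliberately naive (it breaks after `.`, `!` or `?`, so abbreviations such as "e.g." will produce extra chunks).

```python
import re
from typing import List


def clean_text(raw: str) -> str:
    """Collapse runs of whitespace (line breaks, tabs, double spaces) into single spaces."""
    return re.sub(r"\s+", " ", raw).strip()


def split_sentences(text: str) -> List[str]:
    """Split cleaned text into chunks at whitespace that follows '.', '!' or '?'.

    Naive on purpose: abbreviations and decimal-free numbering like 'No. 5'
    will also cause a split.
    """
    cleaned = clean_text(text)
    if not cleaned:
        return []
    return re.split(r"(?<=[.!?])\s+", cleaned)


if __name__ == "__main__":
    sample = "The Supplier shall deliver\nthe goods.  Payment is due in 30 days!  Is that clear?"
    for sentence in split_sentences(sample):
        print(sentence)
```

Each chunk produced this way is the kind of row that would end up in the sentence model saved back to the database.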
Once K2 receives the callback from the Robot, it renders the PDF in the SmartForm (using a small add-in, PDF.js from Mozilla) along with a list of all the sentences and text chunks in the document. The sentence list control has an event that fires on a double click, which fires off another call to the Azure Text Analysis API. That call takes in the selected text and returns the keywords it extracted, which we then put into the list control to display.
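In the demo this call is made through a K2 SmartObject with no code, but for anyone curious about what travels over the wire, here is a rough Python sketch of a key phrase request and how the response could be read. Treat the details as assumptions: the endpoint host and key are placeholders, the `v3.0` path is one version of the Text Analytics REST API and not necessarily the one used in the demo, and the helper names are hypothetical. The sketch only builds the request and parses a response body; it does not send anything.

```python
import json
import urllib.request
from typing import List


def build_key_phrase_request(endpoint: str, api_key: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for key phrases of one text chunk."""
    # Assumption: v3.0 of the Text Analytics REST API; adjust the path for other versions.
    url = endpoint.rstrip("/") + "/text/analytics/v3.0/keyPhrases"
    body = {"documents": [{"id": "1", "language": "en", "text": text}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": api_key,  # placeholder key supplied by the caller
        },
        method="POST",
    )


def parse_key_phrases(response_body: str) -> List[str]:
    """Return the key phrases for the first document in the response, or an empty list."""
    payload = json.loads(response_body)
    documents = payload.get("documents", [])
    return documents[0].get("keyPhrases", []) if documents else []


if __name__ == "__main__":
    request = build_key_phrase_request(
        "https://example.cognitiveservices.azure.com",  # hypothetical resource endpoint
        "YOUR-KEY-HERE",
        "The Supplier shall deliver the goods within 30 days.",
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request).read().decode("utf-8")
```

The list returned by `parse_key_phrases` corresponds to what the SmartForm puts into the list control after a double click.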
As with some of the other recent tech demos from me, there’s no concrete process involved here, or a real-world scenario. But that was not my intention. The purpose here was to make me think about the art of the possible with K2, and hopefully to get people thinking about how we could extend this prototype to achieve something tangible. What if we could unlock the key information contained in our document repositories and have tangible insights into them? It’s important to note that this was built using out-of-the-box features from K2, UiPath and Azure Cognitive Services. (Actually, to be 100% transparent, there was a little bit of wizardry for the document viewer in the K2 SmartForm, but I’ll address that in another blog.)
There have been some exciting developments recently from the key players in the Artificial Intelligence and Machine Learning field. Google has just released some tech around Document Understanding – when I get access we’ll have some more on that. We are obviously using Azure Cognitive Services here at K2 for our tech demos, but there is a range of technologies that could also be used for this purpose, with K2 being the backbone and orchestration layer to bring about end-to-end transformation and insight into your process and data.
K2 is a very powerful tool for low-code app development that allows organizations to automate and build end-to-end applications. These apps can integrate with multiple systems and data sources, including some clever technology that sits perfectly with digital process automation, such as robotic process automation (RPA) and strands of artificial intelligence. With no code, using out-of-the-box functionality across the three technologies, we have managed to extract document content, analyze it and provide key information from it using K2 Cloud, UiPath and Azure Cognitive Services.
Watch the Video
Check out how K2, UiPath, and Azure can be used to simplify document extraction and accelerate business decisions