Learn how to get started with Azure’s AI services by generating an AI-enabled summary of a PDF document, using Telerik UI for ASP.NET Core and the Telerik Document Processing Libraries.
While Progress Telerik is busy creating its own line of smart components (which you can try out on a bunch of platforms), there’s nothing stopping you from integrating AI into your applications right now. You can, for example, start using Azure’s Language Services in any application you’re currently building.
To make that point (and show you how to get started with Azure’s AI services), I’m going to show how to create an AI-enabled summary from a PDF document (I’ve got a kid going back to school and, I bet, he could find this useful). And, if you don’t care about how to make this work and just want to see how good a job Azure’s tools do in generating summaries from technical articles, scroll to the section at the end of this post called “So: How Good are the Summaries?” You’ll find some summaries of a variety of business and geeky articles, along with links to the original articles. You can make up your own mind about how good the tools are.
For this case study, I’m going to implement a solution in ASP.NET Core that displays the summary on the page above the PDF document. But, quite frankly, except for the markup making up the UI, the code you see would be the same on any platform (and, if you use one of the PDF Viewers from Progress Telerik, even the UI code will be similar, no matter what platform you try this on).
If you want to implement this yourself, you’ll need to start by creating an Azure Language Service. If you’ve already signed up with Azure, you can use your existing login credentials to get to the Language service overview page.
Once on the Azure AI Services | Language Service page, click on the +Create link in the horizontal menu on the right to start the Wizard that will create your service.
The first page in the wizard lists the included features of a language service and some additional customization features that you can add. The base features include default Text Summarization processing, which is all I need for this case study, so I just clicked on the “Continue to create your resource” button at the bottom of the page.
The next page includes the typical information required to create any Azure resource: Subscription, resource group, etc. You will need, when picking a region, to pick one that supports the AI features you want to use—in this case, text analysis summarization. East U.S. is both close to me and, as near as I can tell, supports all the AI services, so I selected that region. You might want to select a closer region.
You’ll also find two items that aren’t part of the regular resource-creation process:
After filling out this page, I skipped the rest of the wizard by clicking the Review + Create button. If you want to control network access, set up a managed identity to assign permissions, or assign tags, you may want to visit the other pages. Once on the final page of the wizard, I clicked the Create button and waited patiently for my resource to be created.
Once your service is created, enter Language in the search box at the top of the Azure page and select Language from the resulting dropdown list to be taken back to the Azure AI Services | Language service page. You should find your new service listed there—click on the service to open its Overview page.
Your code is, eventually, going to require two pieces of information about the service: the service’s endpoint URL and one of its secret keys.
Both of those are available from the Overview page’s Manage keys link—clicking on the link will take you to the service’s “Keys and endpoint” page. On that page, copy the values for Key 1 (the secret key) and Endpoint (the service’s URL) and paste them somewhere safe.
Once you’ve got those two values, you can close down your Azure page.
Rather than wait for my resource to be created, I started creating the ASP.NET Core application that will let me load a PDF file and call my new service to summarize its content. I set up the project as usual for a Telerik-enabled project.
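If you’re wondering what that usual setup looks like, here’s a minimal Program.cs sketch of the configuration a project like this one needs: Razor Pages for the page that hosts the PDF Viewer, MVC controllers for the action that feeds it, and the Kendo services that the Telerik tag helpers depend on. Your project template and Telerik version may call for slightly different code:
// A minimal sketch (not the article's code) of a typical Telerik-enabled setup.
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRazorPages();
builder.Services.AddControllersWithViews()
    .AddJsonOptions(o => o.JsonSerializerOptions.PropertyNamingPolicy = null);
builder.Services.AddKendo();  // registers the Telerik UI services

var app = builder.Build();

app.UseStaticFiles();
app.UseRouting();

app.MapRazorPages();
app.MapDefaultControllerRoute();  // so /Home/GetInitialPDF resolves

app.Run();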
Not surprisingly, I decided to use the ASP.NET Core PDF Viewer to display the PDF document I was summarizing. The language service only accepts text files, so I also decided to use the Telerik Document Processing Library (DPL) objects to convert my PDF file to text. To support all of that, my project requires just three NuGet packages: the Telerik UI for ASP.NET Core package, the Telerik Document Processing package for PDF files, and the Azure.AI.TextAnalytics package.
To load and display a PDF in the kendo-pdfviewer, I added this markup to a Razor Page (I could use the same code in any View for MVC-style processing):
@page
@model IndexModel
@addTagHelper *, Kendo.Mvc
<kendo-pdfviewer height="1200"
                 name="pdfviewer">
    <dpl-processing load-on-demand="true">
        <read url="/Home/GetInitialPDF" />
    </dpl-processing>
    <toolbar enabled="false"/>
</kendo-pdfviewer>
<script src="~/kendo-ui-license.js"></script>
The kendo-pdfviewer’s read element’s url attribute has to point to a method that will retrieve the initial document to display. As the URL in my sample code implies, I implemented that by creating a method called GetInitialPDF in my project’s Home controller.
In this code from my GetInitialPDF method, I use .NET’s File object to load a PDF document (named sample.pdf) from a subfolder called Documents in my project’s wwwroot folder. I then use the Telerik FixedDocument object to create an object that the kendo-pdfviewer can display.
That code looks like this:
public class HomeController : Controller
{
    public IActionResult GetInitialPDF(int? pageNumber)
    {
        JsonResult jrt;

        // Load the PDF from wwwroot/Documents (System.IO.File is fully
        // qualified because Controller has its own File() methods)
        byte[] pdfBytes = System.IO.File.ReadAllBytes(
            @"wwwroot\Documents\sample.pdf");

        // Wrap the bytes in a FixedDocument the kendo-pdfviewer can display
        var pdfDoc = FixedDocument.Load(pdfBytes, true);

        if (pageNumber == null)
        {
            // Initial request: return the document itself
            jrt = new JsonResult(pdfDoc.ToJson());
        }
        else
        {
            // Load-on-demand request: return just the requested page
            jrt = new JsonResult(pdfDoc.GetPage((int)pageNumber));
        }
        return jrt;
    }
}
To generate the summary that I’ll display above the PDF file in my webpage, I created a separate class called AIProcessing with a method called GetSummary. I set up the GetSummary method to return my summary (a string) and to accept the relative file path to the document in my website (also a string). I’m eventually going to call an async method inside this method, so I set up my GetSummary method with the async modifier and wrapped my return value in a Task object:
public class AIProcessing
{
    internal async Task<string> GetSummary(string docPath)
    {
Within my GetSummary method, I create two Telerik DPL objects: a PdfFormatProvider (to import my PDF document) and a TextFormatProvider (to export its contents as plain text).
With those two objects in place, I use a .NET File object to read my document into a Stream that I import into the PdfFormatProvider to create a format-neutral RadFixedDocument. I then feed the RadFixedDocument to the TextFormatProvider’s Export method to create my text document:
PdfFormatProvider pdfProv = new PdfFormatProvider();
TextFormatProvider txtProvider = new TextFormatProvider();

using (Stream str = File.OpenRead(docPath))
{
    // Import the PDF, then export its contents as plain text
    RadFixedDocument radDoc = pdfProv.Import(str);
    string txtDoc = txtProvider.Export(radDoc);
Now I have a text document that I can feed to my Language Service. The next step is to connect to that service. I do that by creating a TextAnalyticsClient object, passing the URL and secret key I copied from my service’s “Keys and endpoint” page (I’ve omitted those values from this sample code):
TextAnalyticsClient tac = new TextAnalyticsClient(
    new Uri("<Endpoint>"),
    new AzureKeyCredential("<Key 1>"));
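One caveat: In a real application, you probably don’t want the endpoint and key hard-coded in your source. Here’s a minimal sketch of an alternative that reads them from configuration (appsettings.json or user secrets). The LanguageService:Endpoint and LanguageService:Key names are my own invention, so use whatever configuration keys you store the values under:
// A sketch only—the configuration key names below are made up for this example.
public static class LanguageClientFactory
{
    public static TextAnalyticsClient Create(IConfiguration config)
    {
        // Pull the endpoint and secret key from configuration instead of
        // pasting them into the source code
        return new TextAnalyticsClient(
            new Uri(config["LanguageService:Endpoint"]!),
            new AzureKeyCredential(config["LanguageService:Key"]!));
    }
}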
To actually pass my text document to the service, I need to add the document to a List of TextDocumentInput objects … which means I first have to create a TextDocumentInput that holds my text document.
A TextDocumentInput consists of a unique identifier for the document (unique within the collection, at any rate) and the text document itself. For the key, I decided to use the name of my PDF file (I used the .NET Path class to pull the file name out of the filepath passed to my method). Once I’ve created that TextDocumentInput, I add it to my list.
Here’s that code:
string key = Path.GetFileName(docPath);
TextDocumentInput doc = new(key, txtDoc);
IList<TextDocumentInput> docs = new List<TextDocumentInput>();
docs.Add(doc);
The next step is to create my summary by calling the client’s AbstractiveSummarizeAsync method and then catching the resulting AbstractiveSummarizeOperation object.
I won’t lie: The summary is buried pretty deeply in the AbstractiveSummarizeOperation object. Once you have the object, you first call its GetValues method, which hands back the collection of results from the analysis. The summaries that I’m interested in for this case study are in the first results collection returned by the GetValues method, so I use this code to retrieve that collection of results:
AbstractiveSummarizeOperation aso = await tac.AbstractiveSummarizeAsync(
    WaitUntil.Completed, docs);
AbstractiveSummarizeResultCollection results = aso.GetValues().First();
Now that I’ve got the result collection I’m interested in, I pull out the result for the document I’m interested in, using the key I created when adding the document to the list of documents I processed (in my case, that key was the file name). I worry about things so, in the following code, I check to see if I found a result and, if I did, pull out the text of the first summary in the result. That summary is what I return from this method (and if I don’t find anything, I return an empty string):
AbstractiveSummarizeResult docResult = results.FirstOrDefault(r => r.Id == key);
if (docResult != null)
{
    return docResult.Summaries.First().Text;
}
return string.Empty;
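If you’d like to see all of those pieces in one place, here’s the complete AIProcessing class, assembled from the snippets above. The using directives reflect my best guess at the Azure and Telerik namespaces involved, so check them against the versions of the libraries you install:
// The full AIProcessing class, assembled from the snippets in this post.
using Azure;
using Azure.AI.TextAnalytics;
using Telerik.Windows.Documents.Fixed.FormatProviders.Pdf;
using Telerik.Windows.Documents.Fixed.FormatProviders.Text;
using Telerik.Windows.Documents.Fixed.Model;

public class AIProcessing
{
    internal async Task<string> GetSummary(string docPath)
    {
        // Convert the PDF to plain text with the Telerik DPL providers
        PdfFormatProvider pdfProv = new PdfFormatProvider();
        TextFormatProvider txtProvider = new TextFormatProvider();
        string txtDoc;
        using (Stream str = File.OpenRead(docPath))
        {
            RadFixedDocument radDoc = pdfProv.Import(str);
            txtDoc = txtProvider.Export(radDoc);
        }

        // Connect to the Language Service (endpoint and key omitted here)
        TextAnalyticsClient tac = new TextAnalyticsClient(
            new Uri("<Endpoint>"),
            new AzureKeyCredential("<Key 1>"));

        // Wrap the text in a TextDocumentInput, keyed by the file name
        string key = Path.GetFileName(docPath);
        TextDocumentInput doc = new(key, txtDoc);
        IList<TextDocumentInput> docs = new List<TextDocumentInput> { doc };

        // Request the abstractive summary and dig the text out of the results
        AbstractiveSummarizeOperation aso = await tac.AbstractiveSummarizeAsync(
            WaitUntil.Completed, docs);
        AbstractiveSummarizeResultCollection results = aso.GetValues().First();

        AbstractiveSummarizeResult docResult = results.FirstOrDefault(r => r.Id == key);
        if (docResult != null)
        {
            return docResult.Summaries.First().Text;
        }
        return string.Empty;
    }
}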
The last step is to display that summary on my page just above my PDF document. I just need to update my Razor Page’s OnGetAsync method with the code to call my GetSummary method and then update a property in my Razor page with the text summary the method returns. If I were using MVC-style processing, I’d put this code in a method in my Controller that a user could surf to and update a property on the Model I would pass to the View holding my PDF Viewer.
This code catches the string returned by my GetSummary method and stuffs it into a property called summary:
public string summary { get; set; }

public async Task OnGetAsync()
{
    AIProcessing aiproc = new AIProcessing();
    summary = await aiproc.GetSummary(@"wwwroot\Documents\526M_Labs.pdf");
}
I feel that I should make at least some attempt to provide a readable display of the summary and the document, so I used this markup in my View to display the summary property (in Model.summary), just above my PDFViewer, inside a nicely formatted box:
@page
@model IndexModel
@addTagHelper *, Kendo.Mvc

<div style="width: 800px">
    <label>Summary:</label>
    <div style="margin: 20px; padding: 5px; border: 1px solid black;">
        @Model.summary
    </div>
    <br />
    <label>Full Document:</label>
    <div style="margin: 20px">
        <kendo-pdfviewer name="pdfviewer">
            …..
        </kendo-pdfviewer>
    </div>
</div>
Of course, you won’t want an application that displays exactly one document. To add the ability to dynamically switch between documents and generate summaries for each document, you should look at the PDF Viewer’s documentation.
But that’s your problem. What I wanted to know was: Are the summaries any good?
To test the quality of the summaries generated, I decided to use some of my own articles, as published on the Telerik.com site. Among other things, this plan would avoid any copyright infringement issues. In addition, you can review those documents and make your own decision on how good the summaries are (that you’ll also be driving up my readership statistics when you do is just a happy accident).
For my first test, I took a subject near and dear to my heart that has general appeal but also has some technical content: The second of a series of posts I did on the fundamental principles of UX design (there’s nothing wrong with the first post, but I wanted something a little geekier than that introductory post). Here’s the AI-generated summary:
The article provides a comprehensive guide on implementing effective user interfaces (UIs) by adhering to five key principles. It emphasizes the importance of being consistent with design, leveraging design patterns, supporting user scenarios, organizing UIs logically, and conducting thorough testing. The author introduces the concept of supporting a user’s mental model, whether it’s guiding them through a process or replacing it with a new one that offers greater value. The article further delves into the use of progress bars, menu structures, and navigation tools to enhance the user experience. It also discusses the significance of understanding user expectations and mental models in creating intuitive and efficient UIs.
And, I have to admit: That’s a pretty good summary of what I covered in the article. To put it another way—if I was forced to write a 100-word summary of that post, I don’t know that I could do better.
For my second test, I went with a post with very little technical content and more of a business orientation: A post on enterprise reporting standards and best practices. Here’s the summary for that post, which is also pretty good:
Enterprise reporting has evolved with self-service reporting tools, enabling users to transform data into valuable information independently. However, this progression has introduced two levels of reports: those created for individual or team use, and enterprise reports intended for a broader audience, often across multiple departments. To manage these reports effectively, it’s crucial to organize them in a way that’s intuitive for users, much like curating art in a museum. Each report should have a clear description, outlining its target audience, the questions it answers, and the data aggregation level, which aids in assessing report quality and relevance. Additionally, establishing consistent data sources, a standardized report format, and a company-wide style sheet are vital steps in ensuring the accuracy, accessibility, and coherence of enterprise reports.
I would have made a couple of changes in this summary:
And for my final test … how about this document you just read (minus this paragraph and the next)? It’s nerdier than the previous two documents and self-referential, so the summary doesn’t get it quite right, does it?
The source document provides a comprehensive guide on integrating AI analysis with Telerik’s PDF viewer and document processing tools, as well as implementing AI-enabled summaries using Azure’s Language Services. It details the process of creating an ASP.NET Core application that can display a PDF document and call upon an AI service to generate a summary, which is then displayed above the document. The document emphasizes the importance of understanding user expectations and mental models in designing intuitive and efficient user interfaces. Additionally, the document includes a case study where the author tests the quality of AI-generated summaries using their own articles and explores the principles of effective user experience design.
Plainly, the processor got a little confused between the content of the article and the examples that I used to demonstrate what you should expect from the tool (it was all too “meta,” I guess). But fixing that would just mean deleting two things: the second to last sentence on UI design and the clause at the end of the last sentence. I’d also like to include a sentence about how this article contains all the code you need to implement language processing in your own application. Altogether, maybe 30 seconds’ work. I could live with that.
But that’s only my opinion and I’m not here to tell you what to think: You can skim the articles and make up your own mind. And, if you decide that the summaries have value and that, whatever issues they have, the AI-generated summaries are better than no summary at all … well, now you know how to add them to your own apps.
But here’s the important part: I used the S (paid) tier to build this case study. Development, testing and running all of these samples cost me a total of six cents. I can live with that. I might even do some more.
He did, indeed, do more. Read the next post: Applying AI Document Analysis with Blazor.
Try out Telerik UI for ASP.NET Core yourself, free for 30 days.
Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter also writes courses and teaches for Learning Tree International.