In my previous post, I covered how to set up a Large Language Model (LLM) either in the cloud using Azure AI or on your desktop/server using Ollama. This post walks through the code you need to load your own content into a custom AI agent based on that LLM.
That will let you create a Retrieval-Augmented Generation–enhanced application that provides grounded answers on the content that matters to your users. You could use this code to, for example, create a custom AI assistant for any part of your organization.
There are four steps to creating that application:
I’m going to demonstrate the code for loading your content using a Blazor application (Blazor simplifies creating an interactive application that integrates server and client-side processing). However, the code in this post will be very similar in any C# application and, in a later post, I’ll wrap my agent in a web service and access it from client-side JavaScript code.
Once your project is created, your next step is to add the necessary NuGet packages. The best advice I can give you around picking the right NuGet package is to a) include prerelease versions and b) always take the most recent package available.
The packages you’ll need are:
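Based on the classes used throughout this post, the installation will look something like the following. These package names are my inference from the types involved (AzureOpenAIClient, IChatClient, OllamaApiClient, SummarizationProcessor, RadFlowDocument), not a confirmed list; check NuGet for the exact names and take the latest prerelease of each:

```shell
# Hedged sketch: package names inferred from the types used in this post
dotnet add package Azure.AI.OpenAI --prerelease                # AzureOpenAIClient
dotnet add package Microsoft.Extensions.AI --prerelease        # IChatClient
dotnet add package Microsoft.Extensions.AI.OpenAI --prerelease # AsIChatClient()
dotnet add package OllamaSharp --prerelease                    # OllamaApiClient
dotnet add package Telerik.Documents.AIConnector --prerelease  # SummarizationProcessor
dotnet add package Telerik.Documents.Flow --prerelease         # RadFlowDocument, DocxFormatProvider
```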
Connecting to your LLM depends on whether you’re using Azure or Ollama to host your LLM. (And, if you just created your LLM deployment, it can take up to 15 minutes before your LLM is ready to be used.)
If you’re using an Azure LLM then you need to create an AzureOpenAIClient, passing the URL and key from your deployment’s information page in the ai.azure.com portal (see my previous post).
Once you’ve created your AzureOpenAIClient, you can use its GetChatClient method to retrieve a ChatClient. But, rather than access that ChatClient directly, you should use the AsIChatClient method to, effectively, cast the ChatClient to the more general-purpose IChatClient interface. All that requires just these two statements:
AzureOpenAIClient aiclt = new(
    new Uri("<deployment URL>"),
    new AzureKeyCredential("<access key>"));
IChatClient chatClt = aiclt.GetChatClient("<Deployment Name>").AsIChatClient();
If you’re working with Ollama, you create a chat client that implements the IChatClient interface by creating an OllamaApiClient object, passing the URL for your local Ollama server and the name of the LLM that you want to use. That looks like this:
IChatClient chatClt =
    new OllamaApiClient(new Uri("<address for Ollama>"),
        "<LLM name>");
Do be aware: For a typical development machine, processing documents using Ollama is not going to be as responsive as using an LLM on Azure. For example, the document I used in this case study contains about 1,500 words and took a few seconds to summarize using one of the Azure LLMs. Using Ollama, that process sometimes took over a minute. In some cases, your application may time out waiting for Ollama to respond.
You can deal with that issue in Ollama by creating a custom HttpClient object and passing it to your OllamaApiClient when you create it.
Here’s some sample code that creates an HttpClient that is a) tied to the Ollama client’s URL (probably http://localhost:11434) and b) sets a five-minute timeout. The code then uses that custom HttpClient to create an Ollama client:
HttpClient httpClt = new()
{
    BaseAddress = new Uri("<address for Ollama>"),
    Timeout = new TimeSpan(0, 5, 0)
};
IChatClient chatClt =
    new OllamaApiClient(httpClt, "<model name>");
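Whichever host you use, it’s worth confirming the connection with a quick prompt before moving on to document processing. This sketch assumes the GetResponseAsync extension method and ChatResponse type from recent Microsoft.Extensions.AI previews (the prompt text is just an illustration):

```csharp
// Hedged sketch: chatClt is the IChatClient created above (Azure or Ollama).
// GetResponseAsync and response.Text come from Microsoft.Extensions.AI;
// verify against the preview version you've installed.
ChatResponse response = await chatClt.GetResponseAsync(
    "Reply with the single word: ready");
Console.WriteLine(response.Text);
```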
The Telerik Document Processing Libraries (DPL) provide multiple AI processors for analyzing documents.
In this post, I’m going to focus on the summarization processor (I’ll look at other processors in my next post). Since all the processors expect to be passed a DPL SimpleTextDocument, switching between processors is simple.
As an example, here’s the code to convert a DOCX file into a SimpleTextDocument using the Telerik WordsProcessing library (the Flow library also handles RTF and HTML files; for PDF files, you would use the Telerik PdfProcessing library):
RadFlowDocument dDoc;
DocxFormatProvider dProv = new();
using (Stream str = System.IO.File.OpenRead(@"wwwroot/documents/scrolltoitem.docx"))
{
    dDoc = dProv.Import(str, TimeSpan.FromSeconds(10));
}
SimpleTextDocument std = dDoc.ToSimpleTextDocument(TimeSpan.FromSeconds(10));
Now that you have a SimpleTextDocument, your next step is to configure one of the Telerik AI processors to work with it.
In terms of AI processing, this document represents the content (or corpus) to be used by your LLM … but it’s a very small corpus that consists of only a single document. While I’m going to stick with a single document for this case study, you can use the Merge method on both Flow and Fixed documents to load multiple documents into a single document object before creating your SimpleTextDocument from that single document.
This code, for example, loads one DOCX file and then merges a second one into it, before creating a SimpleTextDocument from the result:
RadFlowDocument ddocMaster;
RadFlowDocument ddocTemp;
DocxFormatProvider dprov = new();
using (Stream str = System.IO.File.OpenRead(@"wwwroot/documents/InitialDoc.docx"))
{
    ddocMaster = dprov.Import(str, TimeSpan.FromSeconds(10));
}
using (Stream str = System.IO.File.OpenRead(@"wwwroot/documents/SecondDoc.docx"))
{
    ddocTemp = dprov.Import(str, TimeSpan.FromSeconds(10));
}
ddocMaster.Merge(ddocTemp);
SimpleTextDocument std = ddocMaster.ToSimpleTextDocument(TimeSpan.FromSeconds(10));
The next step is to use one of the Telerik AI Connectors to enable your content for use by the LLM. To support this processing, you’ll need to add the Telerik.Documents.AIConnector NuGet package to your project.
In this example, I’m using the Telerik summarization processor to generate a summary of my custom content (more on the summarization processor and the other two Telerik AI processors in my next post):
SummarizationProcessorSettings spOpts = new(3500, "Summarize in 100 words");
string summary;
using (SummarizationProcessor sp = new(chatClt, spOpts))
{
    summary = await sp.Summarize(std);
}
Don’t expect a fast response when testing—it takes some time to absorb a complete document. The Azure-based LLMs I used would pause for three or four seconds to absorb each document, while the Ollama LLM took about 90 seconds on my laptop.
Which raises an important point: The summarization processor passes the whole document to your LLM, and that can be both time-consuming and expensive. You might want to catch the SummaryResourceCalculated event that the processor raises. The EventArgs parameter passed to that event includes two properties (EstimatedCallsRequired and EstimatedTokensRequired) that you can check to see if the request is larger than you want to handle. If the request is “too big,” you can set the EventArgs parameter’s ShouldContinueExecution property to false to stop processing.
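Wired together, that check might look something like this sketch. The event and property names follow the description above, but verify them against the version of Telerik.Documents.AIConnector you have installed; the 10,000-token threshold is my own arbitrary cutoff:

```csharp
// Hedged sketch: cancel summarization if the estimated cost is too high.
// spOpts and chatClt are the settings and chat client created earlier.
using (SummarizationProcessor sp = new(chatClt, spOpts))
{
    sp.SummaryResourceCalculated += (sender, e) =>
    {
        // Threshold is arbitrary; tune it to your own cost tolerance
        if (e.EstimatedTokensRequired > 10_000)
        {
            e.ShouldContinueExecution = false;
        }
    };
    string summary = await sp.Summarize(std);
}
```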
And there you have your own custom AI agent, which you can load with whatever content you want. You can do more than just summarize document content, though, as I’ll cover in my next post.
But, looking ahead to providing a frontend for users to access my custom AI agent, I have two UI issues that I should be thinking about to create a genuinely useful agent:
And that’s ignoring the reality that, thanks to the existing AI-enabled UIs my users are already using (looking at you, ChatGPT), my users have expectations about what an AI-enabled UI should look like. All that suggests that my application will need an interactive UI. So, after my next post, I’m going to use the Telerik AI Prompt component to create a UI that provides my user with that interactivity.
Explore Telerik Document Processing Libraries, plus component libraries, reporting and more with a free trial of the Telerik DevCraft bundle:
Peter Vogel is both the author of the Coding Azure series and the instructor for Coding Azure in the Classroom. Peter’s company provides full-stack development from UX design through object modeling to database design. Peter holds multiple certifications in Azure administration, architecture, development and security and is a Microsoft Certified Trainer.