Summarize with AI:
Now we’ll add content to the LLM using AI processors from the Progress Telerik Document Processing Libraries to summarize and query our agent’s content.
In this series of posts, I’ve been creating a custom AI Agent using Telerik tools. That’s included accessing an Azure Large Language Model (LLM), loading my own content and creating a client to access the LLM.
The last step is to pass my content to my LLM using one of the Progress Telerik AI processors (part of the Document Processing Libraries), which is what this post is about.
The Telerik AI processors support two scenarios for your users: summarizing your agent’s content and querying your agent’s content.
To summarize your content, you’ll use the Telerik SummarizationProcessor processor. For querying content, on the other hand, you have a choice between two processors:
- CompleteContextQuestionProcessor, which loads all of your content before querying it
- PartialContextQuestionProcessor, which loads just part of your content as a series of fragments (a good choice when your content is large and you don’t want to allocate, i.e., “pay for,” enough tokens to load all of it at once)

A token, by the way, represents a word, part of a word or a punctuation mark. The sample document I’ll be using to demonstrate the summarization process contains roughly 1,500 words (a relatively small document). So, when summarizing that document, I’ll set my token count to 3,500 (I figured doubling the word count to include punctuation marks and adding a 33% buffer would work).
On the other hand, to demonstrate querying, I’ll be loading three documents as my agent’s content, totaling about 8,000 words. That’s going to require either a larger token count or fragmenting my document.
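If you want to automate that back-of-the-envelope math, a tiny helper like this will do. To be clear, the doubling and the 33% buffer are just my rough guesses, not anything the Telerik libraries require, and a literal calculation lands a bit above the 3,500 I actually used:

```csharp
internal static class TokenBudget
{
    // Rough heuristic only: double the word count to allow for punctuation
    // and sub-word tokens, then add a percentage buffer on top.
    // Neither number comes from the Telerik libraries.
    internal static int Estimate(int wordCount, int bufferPercent = 33)
    {
        int baseTokens = wordCount * 2;
        return baseTokens + (baseTokens * bufferPercent / 100);
    }
}
```

For the 1,500-word document this suggests a budget of 3,990 tokens (I rounded down to 3,500 and it worked); for the 8,000 words of query content, it suggests over 21,000 tokens, which is why fragmenting starts to look attractive.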
Before you can do any summarizing or querying, you’ll need to add the Telerik.Documents.AIConnector NuGet package to your project. With that package added, to summarize your agent’s content you’ll use the Telerik SummarizationProcessor processor. Your first step is to create a settings object that defines the context window for the model by specifying a maximum token count and the prompt that drives the summarization.
Once you’ve created the settings object, you create a SummarizationProcessor object, passing a chat client (discussed in my last post) and your settings object. Once you’ve created the processor, you pass its Summarize method your content and catch the result as a string.
In the following code, I’ve assumed that you’ve used Telerik Document Processing Libraries to load a SimpleTextDocument into a variable called std (again, see my previous posts). The code then:
- Creates a SummarizationProcessor from the chat client and the settings object
- Passes the content to the SummarizationProcessor’s Summarize method

And it’s only four lines of code:
AzureOpenAIClient aiclt = new(
new Uri("<deployment URL>"),
new AzureKeyCredential("<access key>"));
IChatClient chatClt = aiclt.GetChatClient("<Deployment Name>").AsIChatClient();
SummarizationProcessorSettings spOpts = new (3500, "Summarize in 100 words");
using (SummarizationProcessor sp = new (chatClt, spOpts))
{
string summary = await sp.Summarize(std);
}
As an example of what you can expect, I fed my custom agent a post I wrote on how to use the scrollToItem method in the Kendo Grid. You can compare the output to the post to decide how good a job the LLMs did.
For my first try, I used Azure’s gpt-5.2-chat LLM and asked for a summary in 100 words or less. That gave me these 94 words (I’ve stripped out the HTML formatting):
This article explains how to ensure the right row is visible in a Kendo Grid without forcing users to scroll. It focuses on the `scrollToItem` method, which scrolls directly to a row based on a data item’s unique ID. The method works in Angular, ASP.NET MVC, and jQuery grids, with alternatives for React and Blazor. The article walks through configuring an ASP.NET Core project, setting up the grid and data source, enabling scrolling, and invoking `scrollToItem` via JavaScript. It also covers handling invalid selections and loading rows not on the current page using callbacks.
I tried the same content using the gpt-oss LLM running in Ollama and got these 114 words:
This article explains Kendo Grid’s `scrollToItem` method, which brings a specified row into view without manual scrolling. It works across Angular, ASP .NET MVC, and jQuery grids (React and Blazor can use DOM’s `scrollIntoView`). The method requires the grid’s data model to have a unique `id` field (e.g., `ProductID`) and virtual or endless scrolling enabled. The author demonstrates setting up an ASP .NET Core Razor Page, configuring the grid with a datasource, model, and columns, and creating a textbox/dropdown that triggers `MoveToItem`. The method accepts a callback for invalid IDs or to fetch non‑paged items by calling the callback’s `success` with the row index. The result is a row positioned at the top of the grid.
I also tried passing just “summarize” to both LLMs without specifying a word count. The gpt-5.2-chat LLM on Azure gave me back just under 300 words with a pretty good breakdown of the main points in the article. The same prompt with Ollama and gpt-oss gave me just over 400 words but also threw in some of the sample code.
Finally, I passed “provide a title” as my requested processing to both LLMs. From gpt-5.2-chat, I got back “Using Kendo Grid’s scrollToItem to Instantly Bring Data Into View,” and from gpt-oss, I got “Kendo Grid’s scrollToItem: A Practical Guide to Quick Row Navigation.” I have to admit, I think both of those proposed titles are better than the one I used (feel free to disagree). Not surprisingly, given that I was using the Telerik summarization processor, both engines threw in a summary of the post along with their new title.
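Because the prompt travels in the settings object, trying these variations just means creating a new SummarizationProcessorSettings for each one. Here’s a sketch, assuming the chatClt client and std document from the earlier example:

```csharp
// Each prompt variation gets its own settings object; the processor itself
// is recreated per prompt. chatClt and std come from the earlier example.
string[] prompts = { "Summarize in 100 words", "summarize", "provide a title" };

foreach (string prompt in prompts)
{
    SummarizationProcessorSettings opts = new(3500, prompt);
    using (SummarizationProcessor sp = new(chatClt, opts))
    {
        string result = await sp.Summarize(std);
        Console.WriteLine($"--- {prompt} ---\n{result}");
    }
}
```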
If you want to let your user ask queries that require looking at your complete document, you can use the Telerik CompleteContextQuestionProcessor processor (which is also the easier of the two query processors to use).
To query your agent’s content, you just need to:
- Create a CompleteContextProcessorSettings object, passing a maximum token count
- Create a CompleteContextQuestionProcessor object, passing a chat client and the settings object
- Pass your content and the user’s question to the CompleteContextQuestionProcessor object’s AnswerQuestion method

Typical code will look like this (I’ve omitted the code to create the chat client because it’s identical to my previous example). Note the larger token count:
CompleteContextProcessorSettings caOpts = new(9000);
using (CompleteContextQuestionProcessor ca = new(chatClt, caOpts))
{
string answer = await ca.AnswerQuestion(sampleDocument, prompt);
}
For this example, I loaded three of my posts on how to create asynchronous applications using Azure storage queues (starting with this Coding Azure 18 post). With that content loaded, I tried asking “How do I create a storage queue?” using the gpt-5.2-chat LLM and got back this result:
You create a storage queue in two main steps:
1. Create a storage account
In the Azure Portal, go to Storage Accounts and select Create.
Give the storage account a URL friendly name (lowercase, no special characters).
Choose settings appropriate for queues (Primary service set to Other).
Complete the wizard and create the storage account.
2. Create the queue inside the storage account
Open the storage account you just created.
From the left menu, select Data Storage, then Queues.
Click + Queue.
Enter a lowercase, URL friendly queue name.
Click OK to create the queue.
At that point, the storage queue exists and is ready to be secured and used by applications.
Which isn’t, I thought, a bad response (though I question the need for the last sentence). I then tried asking “What’s the minimum code required” and the answer I got back began with this text:
The minimum code required depends on what you are trying to do. The simplest meaningful scenario covered is adding a message to an Azure Storage Queue from a server-side app using a Managed Identity (the recommended production approach).
Assumptions (already configured in Azure):
- The queue already exists
- The App Service has a User Assigned Managed Identity
- That identity has the Storage Queue Data Message Sender role on the queue
That was followed by the code to write out a message to a storage queue (about another 200 words). What I especially liked about this response is that the opening cued me (pun intended) to ask follow-up questions. For example, my next prompts would probably be “How do I create a User-Assigned Managed Identity?” and/or “How do I read a message from the queue?”
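Since AnswerQuestion takes the content and the question on every call, a follow-up flow is just a loop inside the same using block. Here’s a sketch, assuming the chatClt client from the first example and the sampleDocument variable from the query example (the question list is, of course, just illustrative):

```csharp
// Reuse one processor for a chain of questions against the same content.
// chatClt and sampleDocument are assumed from the earlier examples.
CompleteContextProcessorSettings opts = new(9000);
using (CompleteContextQuestionProcessor ca = new(chatClt, opts))
{
    string[] questions =
    {
        "How do I create a storage queue?",
        "How do I create a User-Assigned Managed Identity?",
        "How do I read a message from the queue?"
    };
    foreach (string question in questions)
    {
        // Each call re-sends the full content along with the new question
        string answer = await ca.AnswerQuestion(sampleDocument, question);
        Console.WriteLine($"Q: {question}\nA: {answer}\n");
    }
}
```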
However, if you have a very large document (and your LLM is charging you by your token count), you might want to use the Telerik PartialContextQuestionProcessor processor. Using this processor does make an assumption: that you don’t need the whole document to answer any one question. And, for a large document that presumably covers multiple topics, that may be a reasonable assumption.
Using the PartialContextQuestionProcessor is more complicated than using a CompleteContextQuestionProcessor. First, you’ll need to pick an LLM that supports embeddings (I used Azure’s text-embedding-3-large LLM). Second, you’ll need to provide a custom “embedding” class that will segment your content into fragments to be processed individually.
I’ve provided a version of the default Telerik “embedding” class at the end of this post. When you instantiate this class, you’ll need to pass an access key, your deployment URL, your deployment name and the API version.
With your “embedding” class available, to use the PartialContextQuestionProcessor you need to:
- Use the EmbeddingSettingsFactory object’s static CreateSettingsForTextDocuments method to create an IEmbeddingSettings object, passing a token count and the name of your deployment
- Create a PartialContextQuestionProcessor, passing a chat client, your embedding object, your settings object and your content
- Pass the user’s question to the PartialContextQuestionProcessor object’s AnswerQuestion method

Typical code looks like this (note the lower token count) and is still only four lines of code:
IEmbeddingSettings paOpts =
EmbeddingSettingsFactory.CreateSettingsForTextDocuments(3500,
"<deployment name>");
DefaultAIEmbedder embedder =
new DefaultAIEmbedder(
"<key>",
"<deployment URL>",
"<deployment name>",
"<deployment version>"
);
using (PartialContextQuestionProcessor pa =
new PartialContextQuestionProcessor(chatClt, embedder, paOpts, std))
{
string answer = await pa.AnswerQuestion(prompt);
}
I then retried the “How do I create a storage queue” question with the same content I used for the CompleteContextQuestionProcessor. Here’s the result I got, which, you’ll notice, omits the material on creating the Storage Account that the CompleteContextQuestionProcessor answer included:
Here is how you create a storage queue in the Azure Portal. First, make sure you already have a Storage Account created that is dedicated to queues.
Then do the following:
1. Go to the Azure Portal.
2. Open your Storage Account.
3. From the menu on the left, select Data Storage, then Queues.
4. On the Queues page, click the + Queue button at the top.
5. Enter a queue name. The name must be lowercase, contain no spaces, and have no special characters. Example: updateproductinventory
6. Click OK.
That’s it. The queue is created immediately.
In answer to my “What is the minimal code” question, I got identical code, but the introduction wasn’t as comprehensive as with the CompleteContextQuestionProcessor (the prerequisites/assumptions section is terser, for example):
From the context, the smallest useful interpretation of “minimal code” is: The minimal code required to add a message to an Azure Storage Queue, assuming the queue already exists and security is already configured.
Below are the minimal working examples for both scenarios described in the context.
Still, I thought, a pretty good answer.
With those three processors (and an LLM and your content) you have all you need to create an AI-enabled backend. The next step is to create a frontend that supports your users interacting with your backend and meets their expectations for an AI-enabled application. That’s my next post.
And here’s that default embedder I promised:
internal class DefaultAIEmbedder : IEmbedder
{
internal readonly HttpClient httpClient;
internal readonly string deploymentName;
internal readonly string apiVersion;
internal DefaultAIEmbedder(string apiKey, string url, string deploymentName, string apiVersion)
{
this.deploymentName = deploymentName;
this.apiVersion = apiVersion;
this.httpClient = new HttpClient();
this.httpClient.Timeout = TimeSpan.FromMinutes(10);
this.httpClient.DefaultRequestHeaders.Add("api-key", apiKey);
this.httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
this.httpClient.BaseAddress = new Uri(Path.TrimEndingDirectorySeparator(url));
}
public async Task<IList<Telerik.Documents.AI.Core.Embedding>> EmbedAsync(IList<IFragment> fragments)
{
AzureEmbeddingsRequest requestBody = new AzureEmbeddingsRequest
{
Input = fragments.Select(p => p.ToEmbeddingText()).ToArray(),
Dimensions = 3072
};
string json = JsonSerializer.Serialize(requestBody);
StringContent content = new StringContent(json, Encoding.UTF8, "application/json");
using HttpResponseMessage response = await this.httpClient.PostAsync(
"openai/deployments/" + this.deploymentName + "/embeddings?api-version=" + this.apiVersion,
content,
CancellationToken.None);
response.EnsureSuccessStatusCode();
Telerik.Documents.AI.Core.Embedding[] embeddings = new Telerik.Documents.AI.Core.Embedding[fragments.Count];
string responseJson = await response.Content.ReadAsStringAsync(CancellationToken.None);
AzureEmbeddingsResponse? responseObj = JsonSerializer.Deserialize<AzureEmbeddingsResponse>(responseJson);
List<EmbeddingData> sorted = responseObj!.Data.OrderBy(d => d.Index).ToList();
for (int i = 0; i < sorted.Count; i++)
{
EmbeddingData item = sorted[i];
embeddings[i] = new Telerik.Documents.AI.Core.Embedding(fragments[i], item.Embedding);
}
return embeddings;
}
private sealed class AzureEmbeddingsRequest
{
[System.Text.Json.Serialization.JsonPropertyName("input")]
public string[] Input { get; set; } = Array.Empty<string>();
[System.Text.Json.Serialization.JsonPropertyName("dimensions")]
public int? Dimensions { get; set; }
}
private sealed class AzureEmbeddingsResponse
{
[System.Text.Json.Serialization.JsonPropertyName("data")]
public EmbeddingData[] Data { get; set; } = Array.Empty<EmbeddingData>();
[System.Text.Json.Serialization.JsonPropertyName("model")]
public string? Model { get; set; }
[System.Text.Json.Serialization.JsonPropertyName("usage")]
public UsageInfo? Usage { get; set; }
}
private sealed class UsageInfo
{
[System.Text.Json.Serialization.JsonPropertyName("prompt_tokens")]
public int PromptTokens { get; set; }
[System.Text.Json.Serialization.JsonPropertyName("total_tokens")]
public int TotalTokens { get; set; }
}
private sealed class EmbeddingData
{
[System.Text.Json.Serialization.JsonPropertyName("embedding")]
public float[] Embedding { get; set; } = Array.Empty<float>();
[System.Text.Json.Serialization.JsonPropertyName("index")]
public int Index { get; set; }
}
}
Peter Vogel is both the author of the Coding Azure series and the instructor for Coding Azure in the Classroom. Peter’s company provides full-stack development from UX design through object modeling to database design. Peter holds multiple certifications in Azure administration, architecture, development and security and is a Microsoft Certified Trainer.