Telerik blogs

Now we’ll add content to the LLM using AI processors from the Progress Telerik Document Processing Libraries to summarize and query our agent’s content.

In this series of posts, I’ve been creating a custom AI Agent using Telerik tools. That’s included accessing an Azure Large Language Model (LLM), loading my own content, and creating a client to access the LLM.

The last step is to pass my content to my LLM using one of the Progress Telerik AI processors (part of the Document Processing Libraries), which is what this post is about.

The Telerik AI processors support two scenarios for your users: summarizing your agent’s content and querying your agent’s content.

To summarize your content, you’ll use the Telerik SummarizationProcessor processor. For querying content, on the other hand, you have a choice between two processors:

  • CompleteContextQuestionProcessor, which loads all of your content before querying it
  • PartialContextQuestionProcessor, which loads just part of your content as a series of fragments—a good choice when your content is large and you don’t want to allocate (i.e., “pay for”) enough tokens to load all of it at once

A token, by the way, represents a word, part of a word or a punctuation mark. The sample document I’ll be using to demonstrate the summarization process contains roughly 1,500 words (a relatively small document). So, when summarizing that document, I’ll set my token count to 3,500 (I figured doubling the word count to include punctuation marks and adding a 33% buffer would work).

On the other hand, to demonstrate querying, I’ll be loading three documents as my agent’s content, totaling about 8,000 words. That’s going to require either a larger token count or fragmenting my document.
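That budget arithmetic can be captured in a tiny helper. This is only a back-of-the-envelope sketch of the doubling-plus-buffer rule described above, not a real tokenizer; actual token counts depend on the model’s tokenizer:

```csharp
// Rough token-budget heuristic, not a real tokenizer: double the word
// count (to allow for punctuation and sub-word tokens), then add a
// buffer of a third of the word count. For 1,500 words: 3,500 tokens.
static int EstimateTokenBudget(int wordCount)
{
    return (wordCount * 2) + (wordCount / 3);
}
```

Applied to the 8,000-word, three-document query scenario, the same heuristic suggests a budget of well over 18,000 tokens, which is why fragmenting starts to look attractive.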

Summarizing Documents

Before you can do any summarizing or querying, you’ll need to add the Telerik.Documents.AIConnector NuGet package to your project. With that package added, to summarize your agent’s content you’ll use the Telerik SummarizationProcessor processor. Your first step is to create a settings object that defines the context window for the model by specifying:

  • The maximum number of tokens you’re willing to have used in processing the document
  • Any text describing how you want to customize the summarization process (e.g., “under 100 words,” “target project managers”)

Once you’ve created the settings object, you create a SummarizationProcessor object, passing a chat client (discussed in my last post) and your settings object. Once you’ve created the processor, you pass its Summarize method your content and catch the result as a string.

In the following code, I’ve assumed that you’ve used Telerik Document Processing Libraries to load a SimpleTextDocument into a variable called std (again, see my previous posts). The code then:

  1. Creates a chat client tied to an LLM deployed in Azure
  2. Creates a settings object that tells the summarization processor to:
    • Accept up to 3,500 tokens
    • Summarize the document in less than 100 words
  3. Creates the SummarizationProcessor from the chat client and the settings object
  4. Passes my content document to the SummarizationProcessor’s Summarize method
  5. Catches the result as a string

And it’s only four lines of code:

// Create a chat client tied to the LLM deployed in Azure
AzureOpenAIClient aiclt = new(
    new Uri("<deployment URL>"),
    new AzureKeyCredential("<access key>"));
IChatClient chatClt = aiclt.GetChatClient("<Deployment Name>").AsIChatClient();

// Accept up to 3,500 tokens and ask for a summary of under 100 words
SummarizationProcessorSettings spOpts = new(3500, "Summarize in 100 words");
using (SummarizationProcessor sp = new(chatClt, spOpts))
{
    string summary = await sp.Summarize(std);
}

Summarization Results

As an example of what you can expect, I fed my custom agent a post I wrote on how to use the scrollToItem method in the Kendo Grid. You can compare the output to the post to decide how good a job the LLMs did.

For my first try, I used Azure’s gpt-5.2-chat LLM and asked for a summary in 100 words or less. That gave me these 94 words (I’ve stripped out the HTML formatting):

This article explains how to ensure the right row is visible in a Kendo Grid without forcing users to scroll. It focuses on the `scrollToItem` method, which scrolls directly to a row based on a data item’s unique ID. The method works in Angular, ASP.NET MVC, and jQuery grids, with alternatives for React and Blazor. The article walks through configuring an ASP.NET Core project, setting up the grid and data source, enabling scrolling, and invoking `scrollToItem` via JavaScript. It also covers handling invalid selections and loading rows not on the current page using callbacks.

I tried the same content using the gpt-oss LLM running in Ollama and got these 114 words:

This article explains Kendo Grid’s `scrollToItem` method, which brings a specified row into view without manual scrolling. It works across Angular, ASP .NET MVC, and jQuery grids (React and Blazor can use DOM’s `scrollIntoView`). The method requires the grid’s data model to have a unique `id` field (e.g., `ProductID`) and virtual or endless scrolling enabled. The author demonstrates setting up an ASP .NET Core Razor Page, configuring the grid with a datasource, model, and columns, and creating a textbox/dropdown that triggers `MoveToItem`. The method accepts a callback for invalid IDs or to fetch non‑paged items by calling the callback’s `success` with the row index. The result is a row positioned at the top of the grid.

I also tried passing just “summarize” to both LLMs without specifying a word count. The gpt-5.2-chat LLM on Azure gave me back just under 300 words with a pretty good breakdown of the main points in the article. The same prompt with Ollama and gpt-oss gave me just over 400 words but also threw in some of the sample code.

Finally, I passed “provide a title” as my requested processing to both LLMs. From gpt-5.2-chat, I got back “Using Kendo Grid’s scrollToItem to Instantly Bring Data Into View” and from gpt-oss, I got “Kendo Grid’s scrollToItem: A Practical Guide to Quick Row Navigation.” I have to admit, I think both of those proposed titles are better than the one I used (feel free to disagree). Not surprisingly, given that I was using the Telerik summarization processor, both engines threw in a summary of the post along with their new title.

Complete Document Queries

If you want to let your user ask questions that require looking at your complete document, you can use the Telerik CompleteContextQuestionProcessor processor (which is also the easier of the two query processors to use).

To query your agent’s content, you just need to:

  1. Create a CompleteContextProcessorSettings object, passing a maximum token count
  2. Create a new CompleteContextQuestionProcessor object, passing a chat client and the settings object
  3. Pass your content and the user’s question to the CompleteContextQuestionProcessor object’s AnswerQuestion method
  4. Catch the result as a string

Typical code will look like this (I’ve omitted the code to create the chat client because it’s identical to my previous example). Note the larger token count:

// Allow a larger context window; prompt holds the user's question
CompleteContextProcessorSettings caOpts = new(9000);
using (CompleteContextQuestionProcessor ca = new(chatClt, caOpts))
{
    string answer = await ca.AnswerQuestion(std, prompt);
}

Complete Query Results

For this example, I loaded three of my posts on how to create asynchronous applications using Azure storage queues (starting with this Coding Azure 18 post). With that content loaded, I tried asking “How do I create a storage queue?” using the gpt-5.2-chat LLM and got back this result:

You create a storage queue in two main steps:

1. Create a storage account
   • In the Azure Portal, go to Storage Accounts and select Create.
   • Give the storage account a URL-friendly name (lowercase, no special characters).
   • Choose settings appropriate for queues (Primary service set to Other).
   • Complete the wizard and create the storage account.

2. Create the queue inside the storage account
   • Open the storage account you just created.
   • From the left menu, select Data Storage, then Queues.
   • Click + Queue.
   • Enter a lowercase, URL-friendly queue name.
   • Click OK to create the queue.

At that point, the storage queue exists and is ready to be secured and used by applications.

Which isn’t, I thought, a bad response (though I question the need for the last sentence). I then tried asking “What’s the minimum code required?” and the answer I got back began with this text:

The minimum code required depends on what you are trying to do. The simplest meaningful scenario covered is adding a message to an Azure Storage Queue from a server-side app using a Managed Identity (the recommended production approach).

Assumptions (already configured in Azure):
- The queue already exists
- The App Service has a User Assigned Managed Identity
- That identity has the Storage Queue Data Message Sender role on the queue 

That was followed by the code to write out a message to a storage queue (about another 200 words). What I especially liked about the response is that the opening cued me (pun intended) to ask follow-up questions. For example, my next prompts would probably be “How do I create a User-Assigned Managed Identity?” and/or “How do I read a message from the queue?”

Partial Document Queries

However, if you have a very large document (and your LLM is charging you by your token count), you might want to use the Telerik PartialContextQuestionProcessor processor. Using this processor does make an assumption: that you don’t need the whole document to answer any one question. And, for a large document that presumably covers multiple topics, that may be a reasonable assumption.

Using the PartialContextQuestionProcessor is more complicated than using a CompleteContextQuestionProcessor. First, you’ll need to pick a model that supports embeddings (I used Azure’s text-embedding-3-large model). Second, you’ll need to provide a custom “embedding” class that will segment your content into fragments to be processed individually.
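For context on why loading only fragments can still answer questions: embedding-based retrieval turns each fragment and the question into vectors, then keeps only the fragments whose vectors are most similar to the question’s. The sketch below is a generic illustration of that ranking idea (cosine similarity), not Telerik’s internal implementation:

```csharp
// Illustrative only: rank fragment vectors by cosine similarity to a
// query vector and keep the most relevant ones. This shows the general
// idea behind embedding-based retrieval, not Telerik's actual code.
static double CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, magA = 0, magB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(magA) * Math.Sqrt(magB));
}

static int[] TopFragments(float[][] fragmentVectors, float[] queryVector, int keep)
{
    // Indexes of the 'keep' fragments most similar to the query
    return Enumerable.Range(0, fragmentVectors.Length)
        .OrderByDescending(i => CosineSimilarity(fragmentVectors[i], queryVector))
        .Take(keep)
        .ToArray();
}
```

Only those top-ranked fragments then need to be sent to the LLM, which is what keeps the token count down.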

I’ve provided a version of the default Telerik “embedding” class at the end of this post. When you instantiate this class, you’ll need to pass:

  • The security key for your deployment
  • The first segment of the URL for your LLM’s deployment (i.e., everything up to the first single forward slash)
  • Your deployment’s name
  • The version of your LLM
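If “the first segment of the URL” is unclear, .NET’s Uri class can extract exactly that piece: the scheme and host, with nothing after the first single forward slash. The endpoint below is a made-up example:

```csharp
// Extract everything up to the first single forward slash after the
// host (the scheme-and-host portion) from a full endpoint URL.
// The endpoint shown is a hypothetical example.
Uri full = new Uri("https://myresource.openai.azure.com/openai/deployments/mydeployment/embeddings");
string baseUrl = full.GetLeftPart(UriPartial.Authority);
// baseUrl == "https://myresource.openai.azure.com"
```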

With your “embedding” class available, to use the PartialContextQuestionProcessor you need to:

  1. Use the EmbeddingSettingsFactory object’s static CreateSettingsForTextDocuments method to create an IEmbeddingSettings object, passing a token count and the name of your deployment
  2. Create your “embedding” object
  3. Create a PartialContextQuestionProcessor, passing a chat client, your embedding object, your settings object and your content
  4. Pass your query to the PartialContextQuestionProcessor object’s AnswerQuestion method
  5. Catch the result in a string

Typical code looks like this (note the lower token count) and is still only four lines of code:

// Lower token count: only the relevant fragments get loaded
IEmbeddingSettings paOpts =
    EmbeddingSettingsFactory.CreateSettingsForTextDocuments(3500, "<deployment name>");
DefaultAIEmbedder embedder =
    new DefaultAIEmbedder(
        "<key>",
        "<deployment URL>",
        "<deployment name>",
        "<deployment version>");
using (PartialContextQuestionProcessor pa =
           new PartialContextQuestionProcessor(chatClt, embedder, paOpts, std))
{
    string answer = await pa.AnswerQuestion(prompt);
}

Partial Query Results

I then retried the “How do I create a storage queue” question with the same content I used for the CompleteContextQuestionProcessor. Here’s the result I got, which, you’ll notice, omits the material on creating the Storage Account that the CompleteContextQuestionProcessor answer included:

Here is how you create a storage queue in the Azure Portal. First, make sure you already have a Storage Account created that is dedicated to queues.
Then do the following:
1. Go to the Azure Portal.
2. Open your Storage Account.
3. From the menu on the left, select Data Storage, then Queues.
4. On the Queues page, click the + Queue button at the top.
5. Enter a queue name. The name must be lowercase, contain no spaces, and have no special characters. Example: updateproductinventory
6. Click OK.
That’s it. The queue is created immediately.

In answer to my “What is the minimal code” question, I got identical code, but the introduction wasn’t as comprehensive as with the CompleteContextQuestionProcessor (the prerequisites/assumptions section is terser, for example):

From the context, the smallest useful interpretation of “minimal code” is: The minimal code required to add a message to an Azure Storage Queue, assuming the queue already exists and security is already configured.
Below are the minimal working examples for both scenarios described in the context.

Still, I thought, a pretty good answer.

With those three processors (and an LLM and your content) you have all you need to create an AI-enabled backend. The next step is to create a frontend that supports your users interacting with your backend and meets their expectations for an AI-enabled application. That’s my next post.

And here’s that default embedder I promised:

    internal class DefaultAIEmbedder : IEmbedder
    {
        internal readonly HttpClient httpClient;
        internal readonly string deploymentName;
        internal readonly string apiVersion;
        internal DefaultAIEmbedder(string apiKey, string url, string deploymentName, string apiVersion)
        {
            this.deploymentName = deploymentName;
            this.apiVersion = apiVersion;

            this.httpClient = new HttpClient();
            this.httpClient.Timeout = TimeSpan.FromMinutes(10);
            this.httpClient.DefaultRequestHeaders.Add("api-key", apiKey);
            this.httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));            
            this.httpClient.BaseAddress = new Uri(Path.TrimEndingDirectorySeparator(url));
        }

        public async Task<IList<Telerik.Documents.AI.Core.Embedding>> EmbedAsync(IList<IFragment> fragments)
        {
            AzureEmbeddingsRequest requestBody = new AzureEmbeddingsRequest
            {
                Input = fragments.Select(p => p.ToEmbeddingText()).ToArray(),
                Dimensions = 3072
            };

            string json = JsonSerializer.Serialize(requestBody);
            StringContent content = new StringContent(json, Encoding.UTF8, "application/json");

            using HttpResponseMessage response = await this.httpClient.PostAsync(
                "openai/deployments/" + this.deploymentName + "/embeddings?api-version=" + this.apiVersion,
                content,
                CancellationToken.None);

            response.EnsureSuccessStatusCode();

            string responseJson = await response.Content.ReadAsStringAsync(CancellationToken.None);
            AzureEmbeddingsResponse? responseObj = JsonSerializer.Deserialize<AzureEmbeddingsResponse>(responseJson);

            // Azure returns one embedding per input; sorting by Index pairs
            // each embedding with the fragment it was generated from
            List<EmbeddingData> sorted = responseObj!.Data.OrderBy(d => d.Index).ToList();
            Telerik.Documents.AI.Core.Embedding[] embeddings = new Telerik.Documents.AI.Core.Embedding[fragments.Count];

            for (int i = 0; i < sorted.Count; i++)
            {
                embeddings[i] = new Telerik.Documents.AI.Core.Embedding(fragments[i], sorted[i].Embedding);
            }

            return embeddings;
        }

        private sealed class AzureEmbeddingsRequest
        {
            [System.Text.Json.Serialization.JsonPropertyName("input")]
            public string[] Input { get; set; } = Array.Empty<string>();

            [System.Text.Json.Serialization.JsonPropertyName("dimensions")]
            public int? Dimensions { get; set; }
        }

        private sealed class AzureEmbeddingsResponse
        {
            [System.Text.Json.Serialization.JsonPropertyName("data")]
            public EmbeddingData[] Data { get; set; } = Array.Empty<EmbeddingData>();

            [System.Text.Json.Serialization.JsonPropertyName("model")]
            public string? Model { get; set; }

            [System.Text.Json.Serialization.JsonPropertyName("usage")]
            public UsageInfo? Usage { get; set; }
        }

        private sealed class UsageInfo
        {
            [System.Text.Json.Serialization.JsonPropertyName("prompt_tokens")]
            public int PromptTokens { get; set; }

            [System.Text.Json.Serialization.JsonPropertyName("total_tokens")]
            public int TotalTokens { get; set; }
        }

        private sealed class EmbeddingData
        {
            [System.Text.Json.Serialization.JsonPropertyName("embedding")]
            public float[] Embedding { get; set; } = Array.Empty<float>();

            [System.Text.Json.Serialization.JsonPropertyName("index")]
            public int Index { get; set; }
        }
    }

Peter Vogel
About the Author


Peter Vogel is both the author of the Coding Azure series and the instructor for Coding Azure in the Classroom. Peter’s company provides full-stack development from UX design through object modeling to database design. Peter holds multiple certifications in Azure administration, architecture, development and security and is a Microsoft Certified Trainer.
