OpenSearch is an open-source search and analytics engine that makes it easier to search and analyze large volumes of data and to tune results. See how to create a NestJS API with this full-text search capability.
In this post, we’ll build a full-text search API using NestJS, OpenSearch and Docker. You’ll learn how to set up OpenSearch locally with Docker and customize search behavior for more effective queries.
Our API will manage article documents and support searches using keywords or exact phrases, along with filters, sorting, and paginating results. It will also provide highlighted matching snippets for a better user interface.
To follow along and get the most from this post, you’ll need basic knowledge of HTTP, RESTful APIs, and cURL, basic familiarity with Docker, and a basic understanding of NestJS and TypeScript.
Full-text search examines the entire contents of documents and considers the relevance and intent of the query as a whole. Unlike a traditional exact-match search, it doesn’t require an exact word or phrase; instead, it ranks results by relevance using typo tolerance, proximity and term-matching rules.
Full-text search is mostly used in systems like ecommerce and content management, where large amounts of unstructured data need to be efficiently searched.
Imagine you’re searching through transaction documents. Instead of using the ID or an exact keyword like the date or sender, as you would in traditional databases, you use a part of a phrase you think is in the transaction description. Full-text search is much more intuitive and user-friendly.
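To make the difference concrete, here is a rough sketch of the two query styles as OpenSearch query clauses, written as TypeScript objects (the sender and description fields and the sample text are purely illustrative):
// Traditional exact matching: only documents whose sender is exactly "acme-corp" match.
const exactMatch = {
  term: { sender: 'acme-corp' },
};

// Full-text matching: the query text is analyzed and documents are ranked by relevance,
// so "coffee machine refund" still matches "Refund issued for the broken coffee maker"
// because "refund" and "coffee" match, even though "machine" does not.
const fullText = {
  match: { description: 'coffee machine refund' },
};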
OpenSearch is an open-source search and analytics engine. It facilitates the storage, search and analysis of large amounts of data at scale. The project is completely open source, so it can be used, modified or distributed freely.
OpenSearch is built on the Lucene library, using its data structures and algorithms. It works by first indexing your data and storing it efficiently. Then, when you send a search query, it goes through its indices looking for the relevant results and returns them immediately.
Run the following commands in your terminal to create a NestJS project and move into it:
nest new open-search-demo
cd open-search-demo
Next, run the command below to install the dependencies we will need for this project:
npm i @opensearch-project/opensearch @nestjs/config class-validator class-transformer
The @opensearch-project/opensearch package provides the Client class used to create an OpenSearch client instance. The @nestjs/config package lets us load environment variables into our code. We also need class-validator and class-transformer to define and validate our DTOs.
Now, update your main.ts file with the following to enable DTO validation globally:
import { ValidationPipe } from '@nestjs/common';
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';
async function bootstrap() {
const app = await NestFactory.create(AppModule);
app.useGlobalPipes(new ValidationPipe({ whitelist: true, transform: true }));
await app.listen(process.env.PORT ?? 3000);
}
bootstrap();
Next, create a synonyms.txt file in the project directory. This will let searches match a term even when a close synonym is used instead (e.g., “ai” could be used in a query in place of “artificial intelligence”):
ai, artificial intelligence
ux, user experience
ml, machine learning
react, javascript
database, db
frontend, front-end
backend, back-end
api, application programming interface
search, query
performance, optimization
Next, create a docker-compose.yaml file and add the following to it:
services:
  opensearch:
    image: opensearchproject/opensearch:2.14.0
    environment:
      discovery.type: single-node
      OPENSEARCH_JAVA_OPTS: "-Xms512m -Xmx512m"
      OPENSEARCH_INITIAL_ADMIN_PASSWORD: "ChangeMe_StrongPassword_123!"
    ports:
      - "9200:9200"
    volumes:
      - opensearch-data:/usr/share/opensearch/data
      - ./synonyms.txt:/usr/share/opensearch/config/synonyms.txt:ro

volumes:
  opensearch-data:
In our OpenSearch service, we configured the following:
discovery.type: single-node – Runs OpenSearch as a single node for development rather than as a whole cluster, as would be the case in production.
OPENSEARCH_JAVA_OPTS: "-Xms512m -Xmx512m" – Sets the heap size for the Java virtual machine that OpenSearch runs on. For production, it’s recommended to use about half of your system’s RAM.
OPENSEARCH_INITIAL_ADMIN_PASSWORD: "ChangeMe_StrongPassword_123!" – The admin password used for authorization.
ports: "9200:9200" – Exposes OpenSearch on port 9200.
volumes: opensearch-data:/usr/share/opensearch/data – Mounts a named volume at that path so our index data persists across container restarts (the volume itself is declared under the top-level volumes key).
volumes: ./synonyms.txt:/usr/share/opensearch/config/synonyms.txt:ro – Mounts our synonyms.txt file read-only into the OpenSearch config directory, where the analyzer we define later will reference it when handling queries.

With the Docker Compose file set up, we can now start our container by running the command below:
docker compose up --build -d
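Before wiring OpenSearch into NestJS, you can optionally confirm the container is reachable. Below is a minimal standalone sketch using the client package we installed earlier (the file name check-cluster.ts is just a suggestion; rejectUnauthorized: false is acceptable only because the local container uses a self-signed certificate):
// check-cluster.ts – optional connectivity check (run with: npx ts-node check-cluster.ts)
import { Client } from '@opensearch-project/opensearch';

const client = new Client({
  node: 'https://localhost:9200',
  auth: { username: 'admin', password: 'ChangeMe_StrongPassword_123!' },
  ssl: { rejectUnauthorized: false }, // self-signed dev certificate only
});

async function main() {
  const health = await client.cluster.health();
  // A single-node dev cluster typically reports "green" or "yellow"
  console.log(health.body.status);
}

main().catch(console.error);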
In the project directory, create a file named content_index.json and add the following to it:
{
"settings": {
"analysis": {
"filter": {
"my_synonyms": {
"type": "synonym_graph",
"lenient": true,
"synonyms_path": "synonyms.txt"
}
},
"analyzer": {
"my_text_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["lowercase", "stop", "my_synonyms"]
}
}
}
},
"mappings": {
"properties": {
"id": { "type": "keyword" },
"title": { "type": "text", "analyzer": "my_text_analyzer", "fields": { "raw": { "type": "keyword" } }, "index_options":"positions", "term_vector":"with_positions_offsets" },
"tags": { "type": "keyword" },
"category": { "type": "keyword" },
"body": { "type": "text", "analyzer": "my_text_analyzer", "index_options":"positions", "term_vector":"with_positions_offsets" },
"publishedAt":{ "type": "date" },
"views": { "type": "integer" },
"isFeatured": { "type": "boolean" }
}
}
}
An index is like the OpenSearch equivalent of a database in traditional databases. Here, we’re instructing OpenSearch on how to structure and store our content. We created a custom analyzer called “my_text_analyzer” that applies lowercase conversion, removes common stop words like “the” and “and,” and also references synonyms using our synonyms.txt file.
The mapping section defines the data types for our article documents and the search behavior of each field:
"id", "category" and "tags" use exact keyword matching
"title" and "body" are full-text searchable and have position tracking for phrase search and highlighting
"publishedAt" and "views" allow for sorting by time and popularity
"isFeatured" is a boolean and can be filtered as such

With our content_index.json file set up, run the command below to create our index, which will be called "content_v1":
curl -k -u admin:ChangeMe_StrongPassword_123! -X PUT "https://localhost:9200/content_v1" \
-H "Content-Type: application/json" \
--data-binary @content_index.json
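If you’d like to confirm the analyzer behaves as described (lowercasing, stop-word removal and synonym expansion), you can optionally call the _analyze API against the new index. A minimal sketch, reusing the same client configuration as the connectivity check above:
// analyze-check.ts – optional: inspect how my_text_analyzer tokenizes a sample string
import { Client } from '@opensearch-project/opensearch';

const client = new Client({
  node: 'https://localhost:9200',
  auth: { username: 'admin', password: 'ChangeMe_StrongPassword_123!' },
  ssl: { rejectUnauthorized: false },
});

async function main() {
  const res = await client.indices.analyze({
    index: 'content_v1',
    body: { analyzer: 'my_text_analyzer', text: 'The AI tools' },
  });
  // Expect lowercased tokens with "the" removed and "ai" expanded via the synonyms file
  console.log(res.body.tokens);
}

main().catch(console.error);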
Next, let’s add an alias called content_current. An alias is a nickname that points to one or more OpenSearch indices. It decouples our application from the physical index: our NestJS app only ever references “content_current”, while we can later switch the underlying index from “content_v1” to a “content_v2” without touching application code.
Run the command below:
curl -k -u admin:ChangeMe_StrongPassword_123! -X POST "https://localhost:9200/_aliases" \
-H "Content-Type: application/json" \
-d '{"actions":[{"add":{"index":"content_v1","alias":"content_current"}}]}'
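The payoff comes later, when you reindex. The sketch below is hypothetical (there is no content_v2 in this tutorial), but it shows how both alias actions are applied atomically, so “content_current” always resolves to exactly one index and searches never hit a half-built one:
// alias-swap.ts – hypothetical zero-downtime switch from content_v1 to a new content_v2
import { Client } from '@opensearch-project/opensearch';

const client = new Client({
  node: 'https://localhost:9200',
  auth: { username: 'admin', password: 'ChangeMe_StrongPassword_123!' },
  ssl: { rejectUnauthorized: false },
});

async function main() {
  await client.indices.updateAliases({
    body: {
      actions: [
        { remove: { index: 'content_v1', alias: 'content_current' } },
        { add: { index: 'content_v2', alias: 'content_current' } },
      ],
    },
  });
}

main().catch(console.error);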
Create a file called sample_data.json and add the following content to it. We’ll use it to populate our index with article documents:
{"index":{"_index":"content_v1","_id":"1"}}
{"id":"1","title":"Getting started with NestJS","tags":["nestjs","api"],"category":"dev","body":"NestJS makes building scalable server-side apps easy.","publishedAt":"2025-01-01","views":1200,"isFeatured":true}
{"index":{"_index":"content_v1","_id":"2"}}
{"id":"2","title":"OpenSearch fuzzy matching tips","tags":["opensearch","search"],"category":"search","body":"Configure analyzers, synonyms, and boosts for better relevance.","publishedAt":"2025-02-10","views":800,"isFeatured":false}
{"index":{"_index":"content_v1","_id":"3"}}
{"id":"3","title":"AI and Machine Learning Basics","tags":["ai","ml"],"category":"tech","body":"Artificial intelligence is transforming how we build applications.","publishedAt":"2025-03-15","views":1500,"isFeatured":true}
{"index":{"_index":"content_v1","_id":"4"}}
{"id":"4","title":"UX Design Principles","tags":["ux","design"],"category":"design","body":"User experience design focuses on creating intuitive interfaces.","publishedAt":"2025-03-20","views":900,"isFeatured":false}
{"index":{"_index":"content_v1","_id":"5"}}
{"id":"5","title":"Frontend Development with React","tags":["react","frontend","javascript"],"category":"dev","body":"React provides powerful tools for building modern user interfaces.","publishedAt":"2025-04-01","views":2000,"isFeatured":true}
{"index":{"_index":"content_v1","_id":"6"}}
{"id":"6","title":"Database Optimization Techniques","tags":["database","performance","sql"],"category":"backend","body":"Learn how to optimize your database queries for better performance.","publishedAt":"2025-04-15","views":750,"isFeatured":false}
Run the command below to add the sample data (note that the bulk API requires the file to end with a newline):
curl -k -u admin:ChangeMe_StrongPassword_123! -X POST "https://localhost:9200/_bulk" \
-H "Content-Type: application/x-ndjson" \
--data-binary @sample_data.json
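If you would rather seed the index from code than from cURL, the client exposes the same bulk API. A minimal sketch (the file name seed.ts is just a suggestion, and only the first document is shown; the rest follow the same shape):
// seed.ts – optional: load the sample documents through the client's bulk API
import { Client } from '@opensearch-project/opensearch';

const client = new Client({
  node: 'https://localhost:9200',
  auth: { username: 'admin', password: 'ChangeMe_StrongPassword_123!' },
  ssl: { rejectUnauthorized: false },
});

const docs = [
  { id: '1', title: 'Getting started with NestJS', tags: ['nestjs', 'api'], category: 'dev', body: 'NestJS makes building scalable server-side apps easy.', publishedAt: '2025-01-01', views: 1200, isFeatured: true },
  // ...the remaining documents from sample_data.json
];

async function main() {
  // The bulk body alternates an action line with the document it applies to
  const body = docs.flatMap((doc) => [{ index: { _index: 'content_v1', _id: doc.id } }, doc]);
  const res = await client.bulk({ body, refresh: true });
  console.log('bulk errors:', res.body.errors);
}

main().catch(console.error);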
Run the command below to test and make sure everything works as expected:
curl -k -u admin:ChangeMe_StrongPassword_123! "https://localhost:9200/content_current/_search?q=*:*&pretty"
If everything has been set up properly, you’ll get a response with all the article documents we indexed.
Create a .env file and add these environment variables to it:
OPENSEARCH_NODE=https://localhost:9200
OPENSEARCH_USERNAME=admin
OPENSEARCH_PASSWORD=ChangeMe_StrongPassword_123!
OPENSEARCH_INDEX_ALIAS=content_current
Create a search module with the following file structure:
src/
├── search/
│ ├── search.controller.ts
│ ├── search.dto.ts
│ ├── search.interface.ts
│ ├── search.module.ts
│ └── search.service.ts
Next, update the search.module.ts file with the following:
import { Module } from '@nestjs/common';
import { ConfigModule, ConfigService } from '@nestjs/config';
import { Client } from '@opensearch-project/opensearch';
import { SearchService } from './search.service';
import { SearchController } from './search.controller';
@Module({
imports: [ConfigModule.forRoot({ isGlobal: true })],
providers: [
{
provide: Client,
useFactory: (cfg: ConfigService) => {
const node = cfg.getOrThrow<string>('OPENSEARCH_NODE');
const username = cfg.getOrThrow<string>('OPENSEARCH_USERNAME');
const password = cfg.getOrThrow<string>('OPENSEARCH_PASSWORD');
return new Client({
node,
auth: {
username,
password,
},
ssl: {
rejectUnauthorized: false,
},
});
},
inject: [ConfigService],
},
SearchService,
],
controllers: [SearchController],
exports: [SearchService],
})
export class SearchModule {}
Now, take a look at the @Module decorator. We first load environment variables in the imports array using ConfigModule.forRoot(). Next, in the providers array, we register the OpenSearch client using a factory and the SearchService for dependency injection.
Add the following code to your search.dto.ts file. These are the DTOs we’ll use in our controller:
import { IsString, IsArray, IsNumber, IsBoolean, IsDateString, IsNotEmpty, IsOptional, IsIn, Min, Max } from 'class-validator';
export class SearchParamsDTO {
@IsOptional()
@IsString()
q?: string;
@IsOptional()
@IsString()
phrase?: string;
@IsOptional()
@IsArray()
@IsString({ each: true })
tags?: string[];
@IsOptional()
@IsString()
category?: string;
@IsOptional()
@IsBoolean()
featured?: boolean;
@IsOptional()
@IsIn(['_score', 'recent', 'views'])
sort?: '_score' | 'recent' | 'views';
@IsOptional()
@IsNumber()
@Min(1)
page?: number;
@IsOptional()
@IsNumber()
@Min(1)
@Max(50)
pageSize?: number;
}
export class DocumentDTO {
@IsString()
@IsNotEmpty()
id: string;
@IsString()
@IsNotEmpty()
title: string;
@IsArray()
@IsString({ each: true })
tags: string[];
@IsString()
@IsNotEmpty()
category: string;
@IsString()
@IsNotEmpty()
body: string;
@IsDateString()
publishedAt: string;
@IsNumber()
views: number;
@IsBoolean()
isFeatured: boolean;
}
export interface SearchHit {
id: string;
score: number | null;
title: string;
tags: string[];
category: string;
publishedAt: string;
views: number;
highlight?: {
title?: string[];
body?: string[];
};
}
export interface SearchResponse {
total: number;
hits: SearchHit[];
}
Next, add the following to your search.interface.ts file. These are the types we’ll use in our search service:
export type SearchParams = {
q?: string;
phrase?: string;
tags?: string[];
category?: string;
featured?: boolean;
sort?: '_score' | 'recent' | 'views';
page?: number;
pageSize?: number;
};
export interface QueryClause {
multi_match?: {
query: string;
fields: string[];
fuzziness?: string;
operator?: string;
type?: string;
slop?: number;
};
bool?: {
should: QueryClause[];
};
terms?: {
[key: string]: string[];
};
term?: {
[key: string]: string | boolean;
};
}
Add the following to your search.service.ts file:
import { Injectable, InternalServerErrorException } from '@nestjs/common';
import { Client } from '@opensearch-project/opensearch';
import { Search_Response } from '@opensearch-project/opensearch/api/_core/search';
import { DocumentDTO, SearchResponse, SearchHit } from './search.dto';
import { SearchParams, QueryClause } from './search.interface';
import { ConfigService } from '@nestjs/config';
import { TotalHits } from '@opensearch-project/opensearch/api/_types/_core.search';
@Injectable()
export class SearchService {
private readonly indexAlias: string;
constructor(private readonly os: Client, private readonly cfg: ConfigService) {
this.indexAlias = this.cfg.getOrThrow<string>('OPENSEARCH_INDEX_ALIAS');
}
// Search
async search(params: SearchParams): Promise<SearchResponse> {}
// Upsert document
async indexDocument(doc: DocumentDTO): Promise<void> {}
// Delete document
async removeDocument(id: string): Promise<void> {}
// Delete all documents
async deleteAllDocuments(): Promise<void> {}
}
Let’s start by filling in the search method:
async search(params: SearchParams): Promise<SearchResponse> {
try {
const {
q,
phrase,
tags,
category,
featured,
sort = '_score',
page = 1,
pageSize = 10,
} = params;
const should: QueryClause[] = [];
const filter: QueryClause[] = [];
if (q) {
should.push({
multi_match: {
query: q,
fields: ['title^2', 'body'],
fuzziness: 'AUTO',
operator: 'and',
},
});
}
if (phrase) {
should.push({
multi_match: {
query: phrase,
type: "phrase",
fields: ['title^2', 'body'],
slop: 2,
},
});
}
if (tags?.length) filter.push({ terms: { tags } });
if (category) filter.push({ term: { category } });
if (featured !== undefined) filter.push({ term: { isFeatured: featured } });
const query = { bool: { filter, should } };
const sortClause =
sort === 'recent' ? [{ publishedAt: 'desc' }] :
sort === 'views' ? [{ views: 'desc' }] :
[sort];
const size = pageSize ?? 10;
// Convert page number to OpenSearch offset (from)
const from = ((page ?? 1) - 1) * size;
const body: Record<string, unknown> = {
query,
sort: sortClause,
from,
size,
track_total_hits: true,
_source: ['id', 'title', 'tags', 'category', 'publishedAt', 'views'],
highlight: {
pre_tags: ['<mark>'],
post_tags: ['</mark>'],
fields: { title: {}, body: {} },
},
};
const res: Search_Response = await this.os.search({ index: this.indexAlias, body });
const raw = res?.body ?? {};
const totalRaw = raw?.hits?.total as TotalHits;
const total = totalRaw.value;
const hits = raw?.hits?.hits ?? [];
const items: SearchHit[] = hits
.map((h) => {
const src = h?._source;
if (!src) throw new Error('Hit missing _source');
return {
id: src.id,
score: h?._score as number,
title: src.title,
tags: src.tags,
category: src.category,
publishedAt: src.publishedAt,
views: src.views,
...(h?.highlight && { highlight: h.highlight }),
};
});
return { total, hits: items };
} catch (error) {
throw new InternalServerErrorException('Search operation failed. Please try again later.');
}
}
Our search method maps the incoming parameters into a valid OpenSearch query, and sets safe defaults. It then transforms the raw OpenSearch response into a clean SearchResponse type.
After extracting the parameters and setting defaults, we create two arrays: should, which are optional signals that score documents based on relevance, and filter, which are hard constraints that narrow the scope but don’t contribute to the score.
If q is present, we push a multi_match query into our should array, which searches across multiple fields while giving certain fields more importance with boosts (e.g., title^2). This query can also be typo-tolerant with fuzziness: 'AUTO'.
With the default multi_match type (best_fields) and operator set to and, all terms must match within a single field, though their order isn’t enforced. This means if q were “nestjs tutorials”, a document must contain both “nestjs” and “tutorials” in either its title or its body to match.
If phrase is present, we push a multi_match of type phrase into our should array. This is a stricter version of multi_match for groups of words in an exact sequence. Slop controls how loosely the phrase can be matched, with words being reordered or having gaps. If tags, category or featured have values, we push them into the filter array.
Next, we pass the should and filter arrays into our bool query. This is a compound query that lets us mix different conditions. Its options are must, should, must_not and filter.
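For example, a request with q, tags and category set would produce a query along these lines (the values are illustrative; this is what the code above assembles before sorting and pagination are added):
// What the service builds for { q: 'nestjs tutorials', tags: ['api'], category: 'dev' }
const query = {
  bool: {
    // Hard constraints: they narrow the result set but don't affect the score
    filter: [
      { terms: { tags: ['api'] } },
      { term: { category: 'dev' } },
    ],
    // Relevance signals: title matches are boosted and small typos are tolerated
    should: [
      {
        multi_match: {
          query: 'nestjs tutorials',
          fields: ['title^2', 'body'],
          fuzziness: 'AUTO',
          operator: 'and',
        },
      },
    ],
  },
};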
We set the sorting behavior using the publishedAt field in our documents for sorting by recent and the views field for sorting by views, or we can default to sorting by the document’s score.
We also set the page size and then convert the user-provided page number to an offset for OpenSearch’s from parameter. For example, page 2 with a size of 10 would have a from value of 10 (meaning the documents on this page would start from 10 and end at 19).
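As a quick worked example of that conversion:
// The page → offset conversion used in the service, with illustrative numbers
const pageSize = 10;
const from = (2 - 1) * pageSize; // page 2 → from = 10 → documents 10–19
// page 1 → from = 0, page 3 → from = 20, and so on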
Lastly, we put all the parts together and assemble the request body, adding:
track_total_hits – Tells OpenSearch to calculate the exact number of documents that matched our query and return it to us.
_source – Specifies which fields we want returned.
highlight – Tells OpenSearch to wrap matching snippets in the matching documents with <mark>...</mark>. This helps with UI.

For our response, we first retrieve the total from raw?.hits?.total and cast it as TotalHits. We do this because, in older versions, its value is a simple number, while in newer versions, it’s an object of type TotalHits. The OpenSearch client type definitions support both versions, but our code handles only the newer one.
Next, we extract the hits from raw?.hits?.hits and map them neatly into an items array. If the source is missing for a hit, we throw an error; however, this is primarily done to satisfy the client type definitions, as our query requests the source, so we can expect it to be returned.
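If you would like the mapping to tolerate the older numeric shape of hits.total as well, a small helper is enough. This is optional hardening, and the helper name resolveTotal is arbitrary:
// Optional: accept both shapes of hits.total – a plain number (older responses)
// or a { value, relation } object (newer responses)
function resolveTotal(total: number | { value: number } | null | undefined): number {
  return typeof total === 'number' ? total : (total?.value ?? 0);
}

// Usage in the search method:
// const total = resolveTotal(raw?.hits?.total);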
Now, let’s update the other methods:
// Upsert document
async indexDocument(doc: DocumentDTO): Promise<void> {
try {
await this.os.index({
index: this.indexAlias,
id: doc.id,
body: doc,
refresh: 'wait_for'
});
} catch (error) {
throw new InternalServerErrorException('Failed to index document. Please try again later.');
}
}
// Delete document
async removeDocument(id: string): Promise<void> {
try {
await this.os.delete({
index: this.indexAlias,
id,
refresh: 'wait_for'
});
} catch (error) {
throw new InternalServerErrorException('Failed to remove document. Please try again later.');
}
}
// Delete all documents
async deleteAllDocuments(): Promise<void> {
try {
await this.os.deleteByQuery({
index: this.indexAlias,
body: {
query: {
match_all: {}
}
},
refresh: true,
});
} catch (error) {
throw new InternalServerErrorException('Failed to delete all documents. Please try again later.');
}
}
In the code above, the refresh parameter allows us to configure how OpenSearch handles our writes.
By default, when we index, update or delete a document, OpenSearch accepts the write but doesn’t make the change visible to search until the next scheduled refresh (roughly every second). This means a newly indexed document may not be searchable immediately, and a deleted document may still show up in results until the next refresh.
The refresh parameter has three options:
false – Returns a response without waiting for a refresh. Writes are fast, but the change doesn’t become searchable immediately.
true – Forces an immediate refresh after the write. The change becomes searchable right away, but forcing refreshes is more expensive.
wait_for – Doesn’t return a response until the next scheduled refresh occurs and the change becomes searchable. Latency is higher, but it avoids the cost of forcing a refresh. Note that this option is not supported with the deleteByQuery() method.

Next, update your search.controller.ts file with the following:
import { Controller, Post, Body, Delete, Param, Put } from '@nestjs/common';
import { SearchService } from './search.service';
import { DocumentDTO, SearchParamsDTO, SearchResponse } from './search.dto';
@Controller('search')
export class SearchController {
constructor(private readonly search: SearchService) {}
@Post('/query')
find(@Body() query: SearchParamsDTO): Promise<SearchResponse> {
return this.search.search(query);
}
@Put('/upsert')
upsertDocument(@Body() document: DocumentDTO) {
return this.search.indexDocument(document);
}
@Delete('/delete/:id')
removeDocument(@Param('id') id: string) {
return this.search.removeDocument(id);
}
@Delete('/delete-all')
deleteAllDocuments() {
return this.search.deleteAllDocuments();
}
}
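One step that’s easy to miss: the SearchModule has to be registered in the root module, or Nest won’t mount the /search routes. Assuming the default app.module.ts generated by the Nest CLI, it would look like this:
import { Module } from '@nestjs/common';
import { AppController } from './app.controller';
import { AppService } from './app.service';
import { SearchModule } from './search/search.module';

@Module({
  imports: [SearchModule], // exposes the /search/* endpoints
  controllers: [AppController],
  providers: [AppService],
})
export class AppModule {}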
With everything set up, run the following command in your terminal to start the NestJS server:
npm run start:dev
Now, we can test the endpoints using the following cURL requests:
curl -s -X POST http://localhost:3000/search/query -H "Content-Type: application/json" --data-raw '{"q":"nestjs","phrase":"getting started","tags":["api","dev"],"sort":"_score","page":1,"pageSize":10}'
This tests the search endpoint.
curl -s -X PUT http://localhost:3000/search/upsert -H "Content-Type: application/json" --data-raw '{"id":"4","title":"UX Design Principles (Updated)","tags":["ux","design"],"category":"design","body":"User experience design focuses on creating intuitive interfaces.","publishedAt":"2025-03-20","views":950,"isFeatured":true}'
This tests the upsert endpoint.
curl -s -X DELETE http://localhost:3000/search/delete/5
This tests deleting a single document.
curl -s -X DELETE http://localhost:3000/search/delete-all
This tests deleting all documents.
In this post, we’ve built a NestJS server that enables full-text search using OpenSearch.
Our setup implements an OpenSearch bool query with should clauses (multi_match) for relevance and filter clauses for hard constraints. It ranks documents by relevance, tolerates typos, and includes sorting, pagination and highlighted snippets for a better UI.
With all we’ve covered, you now have a solid foundation for building a full-text search API with NestJS and OpenSearch. Possible next steps include implementing search templates and exploring other OpenSearch query types.
Chris Nwamba is a Senior Developer Advocate at AWS focusing on AWS Amplify. He is also a teacher with years of experience building products and communities.