
Learn to handle large files efficiently and reliably on a NestJS server with Node.js streams, S3 buckets and CSV-to-JSON.

In this post, we’ll learn how to handle large files on a NestJS server efficiently. Using Node.js streams, we’ll handle downloads, uploads to disk and S3 buckets and even file processing with a CSV-to-JSON example. After reading this post, you’ll no longer have to worry about your server crashing due to large files.

Prerequisites

To follow along and get the most from this post, you’ll need basic knowledge of how HTTP downloads and uploads generally work, some familiarity with Multer for handling uploads, basic knowledge of AWS S3 SDK and a basic understanding of NestJS architecture.

Project Setup

Let’s start by creating a NestJS project:

nest new stream-app
cd stream-app

Next, run the commands below to create the FilesModule, FilesController and FilesService, along with the CSV and S3 controllers and services:

nest g module files \
&& nest g controller files \
&& nest g service files \
&& nest g controller files/csv \
&& nest g service files/csv \
&& nest g controller files/s3 \
&& nest g service files/s3

Install the dependencies we’ll need for this project:

npm install multer csv-parser mime-types @aws-sdk/client-s3 @nestjs/config
npm install -D @types/multer @types/mime-types

Here, we’ll be using multer for handling uploads, csv-parser for transforming CSV to JSON, mime-types for setting the correct Content-Type for files, @aws-sdk/client-s3 for uploading files to an S3-compatible storage service (DigitalOcean Spaces) and @nestjs/config for retrieving environment variables.

Next, to use the NestJS Config service in our app, we need to import the ConfigModule. Update your app.module.ts file with the following:

import { Module } from "@nestjs/common";
import { AppController } from "./app.controller";
import { AppService } from "./app.service";
import { FilesModule } from "./files/files.module";
import { ConfigModule } from "@nestjs/config";

@Module({
  imports: [ConfigModule.forRoot({ isGlobal: true }), FilesModule],
  controllers: [AppController],
  providers: [AppService],
})
export class AppModule {}

Lastly, in the root directory, create a folder called storage and add a large file to it (at least 100MB to show the memory benefits of streaming).

For example:

stream-app/storage/large-report.pdf
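If you don’t have a large file handy, one quick way to create a dummy ~150MB file for testing is shown below (the result isn’t a real PDF, but that doesn’t matter for demonstrating memory usage):

mkdir -p storage
dd if=/dev/zero of=storage/large-report.pdf bs=1048576 count=150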

Basic Streaming in NestJS

Now, the wrong way to send a large file to a user is to use readFileSync(). This loads the entire file into memory and sends it all at once. This approach isn’t practical for large files or large-scale apps.

// BAD EXAMPLE -- DO NOT USE
@Get('download-bad')
getFileBad(@Res() res: Response) {
  const filePath = join(process.cwd(), 'storage', 'large-report.pdf');
  const fileBuffer = readFileSync(filePath); // Loads ENTIRE file into memory

  res.setHeader('Content-Type', 'application/pdf');
  res.setHeader('Content-Disposition', 'attachment; filename="report.pdf"');

  return res.send(fileBuffer); // Sends the entire buffer at once
}

Fortunately, Node.js lets us work with streams. Streams are a way to handle data efficiently, progressively and in a non-blocking way. This way, data is processed in chunks instead of as a whole. Using createReadStream(), we read a file in chunks of 64KB (default).
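To see this chunking for yourself, here’s a minimal standalone Node.js script (not part of the NestJS app, and assuming the storage/large-report.pdf file from our setup) that logs the size of each chunk as it arrives:

// chunk-demo.ts -- standalone sketch to observe chunked reads
import { createReadStream } from "fs";
import { join } from "path";

const stream = createReadStream(join(process.cwd(), "storage", "large-report.pdf"));

stream.on("data", (chunk) => {
  // Each chunk is at most 64KB, the default highWaterMark for fs read streams
  console.log(`Received a chunk of ${chunk.length} bytes`);
});

stream.on("end", () => console.log("Finished reading the file"));
stream.on("error", (err) => console.error("Read failed:", err.message));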

Update your files.controller.ts file with the following:

import {
  Controller,
  Get,
  Query,
  Res,
  HttpException,
  HttpStatus,
  Post,
  UploadedFile,
  UseInterceptors,
} from "@nestjs/common";
import { Response } from "express";
import { extname, join } from "path";
import { createReadStream, statSync } from "fs";
import { StreamableFile } from "@nestjs/common";
import * as mime from "mime-types";
import { FilesService } from "./files.service";
import { FileInterceptor } from "@nestjs/platform-express";
import { diskStorage } from "multer";

@Controller("files")
export class FilesController {
  constructor(private readonly filesService: FilesService) {}

  @Get("download")
  getFile(@Res({ passthrough: true }) res: Response) {
    const filePath = join(process.cwd(), "storage", "large-report.pdf");
    const fileStream = createReadStream(filePath);

    res.set({
      "Content-Type": "application/pdf",
      "Content-Disposition": 'attachment; filename="report.pdf"',
    });

    return new StreamableFile(fileStream);
  }
}

In the code above, the @Res({ passthrough: true }) decorator lets us set headers (modify the response) while leaving it to NestJS to send the response body (meaning we don’t have to call res.send()).

The headers we are setting are:

  • Content-Type: Which tells the browser the file type we are sending
  • Content-Disposition: Which tells the browser what to name the file and to download the file

StreamableFile(fileStream) wraps the raw stream, allowing NestJS to understand how to return it as a response. This approach works for both Express and Fastify. The StreamableFile class handles the low-level differences, so if you want to switch to Fastify, you only need to modify your main.ts file and install the adapter.
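For reference, the Fastify swap looks roughly like this after installing @nestjs/platform-fastify (a sketch of the standard adapter setup; the rest of this post sticks with Express):

// main.ts -- Fastify variant (sketch)
import { NestFactory } from "@nestjs/core";
import {
  FastifyAdapter,
  NestFastifyApplication,
} from "@nestjs/platform-fastify";
import { AppModule } from "./app.module";

async function bootstrap() {
  const app = await NestFactory.create<NestFastifyApplication>(
    AppModule,
    new FastifyAdapter()
  );
  await app.listen(3000);
}
bootstrap();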

Improved File Download

The earlier example works, but in production we’ll want more: better error handling, input validation, properly set headers and reusable logic.

Update your files.service.ts file with the following:

import {
  Injectable,
  StreamableFile,
  NotFoundException,
  BadRequestException,
} from "@nestjs/common";
import { join } from "path";
import { createReadStream, existsSync } from "fs";
import { ReadStream } from "fs";

@Injectable()
export class FilesService {
  getFileStream(fileName: string): { stream: ReadStream; path: string } {
    try {
      // Basic filename validation
      if (!fileName || typeof fileName !== "string") {
        throw new BadRequestException("Invalid filename provided");
      }

      // Prevent directory traversal
      if (
        fileName.includes("..") ||
        fileName.includes("/") ||
        fileName.includes("\\")
      ) {
        throw new BadRequestException(
          "Invalid filename: contains path traversal characters"
        );
      }

      const filePath = join(process.cwd(), "storage", fileName);

      if (!existsSync(filePath)) {
        throw new NotFoundException(`File '${fileName}' not found`);
      }

      const stream = createReadStream(filePath);
      return { stream, path: filePath };
    } catch (error) {
      if (
        error instanceof NotFoundException ||
        error instanceof BadRequestException
      ) {
        throw error;
      }
      throw new BadRequestException(
        `Failed to get file stream for ${fileName}: ${error.message}`
      );
    }
  }
}

In the code above, first, we perform basic filename validation to prevent null or undefined values from causing crashes. Next, we improve security by preventing directory traversal attacks (we block attempts to access files outside the storage directory). Lastly, we implement proper error handling using NestJS’s exceptions.

Please note that existsSync() checks if a specified path or directory exists; it returns true if it does and false if it doesn’t.

Now, update your files.controller.ts file to include the following endpoint:

@Get('improved-download')
downloadFile(@Query('name') name: string, @Res({ passthrough: true }) res: Response) {
  if (!name) {
    throw new HttpException('Filename is required', HttpStatus.BAD_REQUEST);
  }

  const { stream, path } = this.filesService.getFileStream(name);
  const fileSize = statSync(path).size;
  const fileExtension = extname(path);
  const contentType = mime.lookup(fileExtension) || 'application/octet-stream';

  res.set({
    'Content-Type': contentType,
    'Content-Disposition': `attachment; filename="${name}"`,
    'Content-Length': fileSize.toString(),
    'Cache-Control': 'no-cache, no-store, must-revalidate',
  });

  return new StreamableFile(stream);
}

In the code above, we first implement dynamic file selection using the name query parameter. We pass this name to our getFileStream(name) method to get the stream and path. Using statSync(), we get the file size, which we set in the Content-Length header to help browsers show progress bars during download. We then get the file extension and use the mime-types library to map it to the correct MIME type, e.g., application/pdf or image/jpeg. Finally, we set the headers before letting NestJS handle the response.

When we download files, browsers sometimes cache the response, which can lead to issues like users getting outdated versions of files. By setting Cache-Control, we prevent such issues.
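To try it out (assuming the app is running on NestJS’s default port 3000), you can download the file and inspect the response headers with curl:

curl -v -o large-report.pdf "http://localhost:3000/files/improved-download?name=large-report.pdf"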

Uploading Large Files

Next, let’s cover how to handle uploads with streams. We’ll go over streaming uploads to disk and streaming uploads to S3 buckets.

Uploading to Disk

Add the POST /upload route below to the FilesController:

@Post('upload')
@UseInterceptors(
  FileInterceptor('file', {
    storage: diskStorage({
      destination: './uploads',
      filename: (req, file, callback) => {
        const uniqueName = Date.now() + extname(file.originalname);
        callback(null, uniqueName);
      },
    }),
    limits: {
      fileSize: 500 * 1024 * 1024, // 500MB
    },
  }),
)
handleUpload(@UploadedFile() file: Express.Multer.File) {
  return {
    message: 'File uploaded successfully',
    filename: file.filename,
    size: file.size,
  };
}

In the code above, @UseInterceptors is a NestJS decorator that attaches interceptors to a route handler. Here we attach FileInterceptor, a NestJS helper that wraps Multer: it pulls the file out of the request, parses it with Multer and makes it available in our controller through the @UploadedFile() decorator.

FileInterceptor takes the name of the field in the form data that has our file (file) and the Multer configuration object. We set storage to diskStorage instead of buffering the file into memory. This way, we write the file chunk by chunk as it’s being received.

The diskStorage() method takes the destination (the directory we want to save the file in) and filename, which is a function that is used to determine the name of the file.

Finally, with the @UploadedFile() decorator, we get access to the file object, which has information like the filename, originalname, mimetype, size, path and buffer. But because we set storage to diskStorage, file.buffer would be undefined. Using the file object, we send a few details back as the response to show that the upload was successful.
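Multer can also validate files before they’re written to disk. It accepts a fileFilter function in the same options object as storage and limits; here’s a sketch that accepts only PDFs (HttpException and HttpStatus are already imported in this controller), which you can adapt to whatever types your app expects:

// Sketch: add to the FileInterceptor options, next to storage and limits
fileFilter: (req, file, callback) => {
  if (file.mimetype !== 'application/pdf') {
    // Passing an error rejects the upload before anything is written to disk
    return callback(
      new HttpException('Only PDF files are allowed', HttpStatus.BAD_REQUEST),
    );
  }
  callback(null, true);
},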

Uploading to S3

Here we’ll first write the uploaded file to disk with diskStorage(), then stream it from disk to our S3 bucket.

In this example, we’ll be using DigitalOcean Spaces, which is fully S3-compatible. It uses the AWS SDK the same way, but with a custom endpoint and CDN URL from DigitalOcean.

Update your s3.service.ts file with the following:

import { Injectable } from "@nestjs/common";
import { ConfigService } from "@nestjs/config";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { Readable } from "stream";
import * as path from "path";

@Injectable()
export class S3Service {
  private s3: S3Client;
  private readonly bucketName: string;
  private readonly endpoint: string;
  private readonly region: string;
  private readonly cdnUrl: string;

  constructor(private readonly configService: ConfigService) {
    this.bucketName = this.configService.getOrThrow<string>(
      "DIGITAL_OCEAN_SPACE_BUCKET_NAME"
    );
    this.endpoint = this.configService.getOrThrow<string>(
      "DIGITAL_OCEAN_SPACE_ENDPOINT"
    );
    this.region = this.configService.getOrThrow<string>(
      "DIGITAL_OCEAN_SPACE_REGION"
    );
    this.cdnUrl = this.configService.getOrThrow<string>(
      "DIGITAL_OCEAN_SPACE_CDN_URL"
    );
    const accessKeyId = this.configService.getOrThrow<string>(
      "DIGITAL_OCEAN_SPACE_ACCESS_KEY_ID"
    );
    const secretAccessKey = this.configService.getOrThrow<string>(
      "DIGITAL_OCEAN_SPACE_SECRET_KEY"
    );
    this.s3 = new S3Client({
      endpoint: this.endpoint,
      forcePathStyle: false,
      region: this.region,
      credentials: {
        accessKeyId,
        secretAccessKey,
      },
    });
  }

  async uploadImageStream(payload: {
    location: string;
    file: {
      stream: Readable;
      filename: string;
      mimetype: string;
      size: number;
    };
  }): Promise<{ path: string; key: string }> {
    const { location, file } = payload;
    const uid = Date.now().toString(); // Replace with ULID if needed
    const extension = path.extname(file.filename);
    const key = `${location}/${uid}${extension}`;

    const command = new PutObjectCommand({
      Bucket: this.bucketName,
      Key: key,
      Body: file.stream,
      ContentType: file.mimetype,
      ContentLength: file.size,
    });

    try {
      await this.s3.send(command);
      return {
        path: `${this.cdnUrl}/${key}`,
        key,
      };
    } catch (error) {
      console.error("Error uploading file stream:", error);
      throw new Error("File upload failed");
    }
  }
}

In our uploadImageStream() method, we first define a unique key for the file or object, then we set up the AWS SDK v3 upload command, passing the readable stream as the body and setting the ContentType and ContentLength (when the body is a stream, the SDK can’t infer the length on its own, so we supply it).

Finally, we perform the upload in our try-catch block and return the path and key.
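Because the service reads its configuration with getOrThrow(), each of these environment variables must be set. A sample .env with placeholder values (swap in your own Space’s bucket, region, endpoint, CDN URL and keys) might look like this:

DIGITAL_OCEAN_SPACE_BUCKET_NAME=my-space
DIGITAL_OCEAN_SPACE_REGION=nyc3
DIGITAL_OCEAN_SPACE_ENDPOINT=https://nyc3.digitaloceanspaces.com
DIGITAL_OCEAN_SPACE_CDN_URL=https://my-space.nyc3.cdn.digitaloceanspaces.com
DIGITAL_OCEAN_SPACE_ACCESS_KEY_ID=your-access-key-id
DIGITAL_OCEAN_SPACE_SECRET_KEY=your-secret-key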

Next, update your s3.controller.ts file with the following:

import {
  Controller,
  Post,
  UploadedFile,
  UseInterceptors,
  BadRequestException,
} from "@nestjs/common";
import { FileInterceptor } from "@nestjs/platform-express";
import { diskStorage } from "multer";
import * as fs from "fs";
import * as path from "path";
import { S3Service } from "./s3.service";

@Controller("s3")
export class S3Controller {
  constructor(private readonly s3Service: S3Service) {}

  @Post("upload")
  @UseInterceptors(
    FileInterceptor("file", {
      storage: diskStorage({
        destination: "./uploads",
        filename: (req, file, cb) => {
          cb(null, `${Date.now()}-${file.originalname}`);
        },
      }),
      limits: { fileSize: 200 * 1024 * 1024 },
    })
  )
  async uploadToS3(@UploadedFile() file: Express.Multer.File) {
    if (!file) {
      throw new BadRequestException("No file uploaded");
    }

    const location = "uploads";
    const filePath = file.path; // From disk storage
    const readStream = fs.createReadStream(filePath);
    const { size } = fs.statSync(filePath);

    try {
      const uploadResult = await this.s3Service.uploadImageStream({
        location,
        file: {
          stream: readStream,
          filename: file.originalname,
          mimetype: file.mimetype,
          size,
        },
      });

      return {
        message: "File uploaded to S3",
        ...uploadResult,
      };
    } catch (error) {
      throw new Error(`File upload failed: ${error.message}`);
    } finally {
      // Clean up temp file
      if (file.path && fs.existsSync(file.path)) {
        fs.unlinkSync(file.path);
      }
    }
  }
}

In our uploadToS3 route handler, we pass the location and file to the uploadImageStream() method and return a success response with the key and path. If an error occurs, we throw the error. Finally, we clear the temporarily stored file from our disk using fs.unlinkSync(file.path).
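With the server running (again assuming port 3000), you can exercise the endpoint with curl:

curl -F "file=@storage/large-report.pdf" http://localhost:3000/s3/upload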

Processing Large Files: CSV-to-JSON Example

Finally, let’s look at processing a large uploaded file with streams, converting CSV rows to JSON as they arrive. Update your csv.service.ts file with the following:

import { Injectable, BadRequestException } from "@nestjs/common";
import * as csv from "csv-parser";
import { Readable } from "stream";

export interface CsvRow {
  [key: string]: string;
}

export interface CsvProcessingResult {
  totalRows: number;
  data: CsvRow[];
}

@Injectable()
export class CsvService {
  async processCsvStream(fileStream: Readable): Promise<CsvProcessingResult> {
    return new Promise((resolve, reject) => {
      const results: CsvRow[] = [];
      // For very large files, consider replacing this with database logic

      // Create CSV parser stream
      const csvStream = csv();

      // Set up error handling
      csvStream.on("error", (error) => {
        reject(new BadRequestException(`CSV parsing failed: ${error.message}`));
      });

      // Process completion
      csvStream.on("end", () => {
        resolve({
          totalRows: results.length,
          data: results,
        });
      });

      // Pipe the streams together and collect data
      fileStream.pipe(csvStream).on("data", (data: CsvRow) => {
        results.push(data);
        // For very large files, replace the above line with database logic:
        // this.databaseService.insertRow(data);
        // Or accumulate a batch and insert in bulk for better performance.
      });
    });
  }
}

In the processCsvStream() method, we first create a new promise to handle the asynchronous nature of streaming. The results array is where each parsed row of the CSV will be stored as it comes in (for larger files, replace this with database logic). Next, we create a CSV stream using csv() from csv-parser, which works as a transform stream. A transform stream is a stream that can both read and write data (here we’re reading raw CSV data and writing it out as JSON, one row at a time).

fileStream.pipe(csvStream) sends chunks of raw CSV data into the csv-parser. When the data is in the csv-parser, it converts it into JSON row by row. Once the csv-parser has converted a row into a JSON object, it emits that JSON object as a data event. This data event is then handled by our event handler, which takes each resulting JSON object and pushes it into our results array.

We have two other event handlers for error and end. When an error event is received, we reject the promise with a bad request exception. When the end event is received, it means the entire CSV file has been processed, and we then resolve the promise with the collected results.
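For very large CSVs where holding every row in memory isn’t an option, the data handler is also where you would batch rows for bulk database inserts. Below is a rough sketch of that idea replacing the data and end handlers inside processCsvStream(); databaseService and its insertBatch() method are hypothetical, so adapt them to your own ORM or driver. We pause the parser while each batch is written so the stream doesn’t outrun the database:

// Sketch: batch rows for bulk inserts instead of accumulating them in memory
const BATCH_SIZE = 1000;
let batch: CsvRow[] = [];
let totalRows = 0;

fileStream.pipe(csvStream).on("data", async (row: CsvRow) => {
  batch.push(row);
  totalRows++;

  if (batch.length >= BATCH_SIZE) {
    csvStream.pause(); // stop emitting rows while the batch is written
    await this.databaseService.insertBatch(batch); // hypothetical bulk insert
    batch = [];
    csvStream.resume();
  }
});

csvStream.on("end", async () => {
  if (batch.length > 0) {
    await this.databaseService.insertBatch(batch); // flush the final partial batch
  }
  resolve({ totalRows, data: [] }); // rows now live in the database, not in memory
});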

Now, update your csv.controller.ts file with the following:

import {
  Controller,
  Post,
  UploadedFile,
  UseInterceptors,
  BadRequestException,
} from "@nestjs/common";
import { FileInterceptor } from "@nestjs/platform-express";
import { diskStorage } from "multer";
import * as fs from "fs";
import { CsvService } from "./csv.service";

@Controller("csv")
export class CsvController {
  constructor(private readonly csvService: CsvService) {}

  @Post("upload")
  @UseInterceptors(
    FileInterceptor("file", {
      storage: diskStorage({
        destination: "./uploads",
        filename: (req, file, cb) => {
          cb(null, `${Date.now()}-${file.originalname}`);
        },
      }),
      limits: { fileSize: 50 * 1024 * 1024 }, // 50MB limit
    })
  )
  async handleCsvUpload(@UploadedFile() file: Express.Multer.File) {
    if (!file) {
      throw new BadRequestException("No file uploaded");
    }

    // Create read stream from file (true streaming)
    const fileStream = fs.createReadStream(file.path);

    try {
      // Process CSV as a stream using the service
      const result = await this.csvService.processCsvStream(fileStream);

      return {
        message: "CSV processed successfully",
        filename: file.originalname,
        ...result,
      };
    } catch (error) {
      throw new BadRequestException(`CSV processing failed: ${error.message}`);
    } finally {
      // Clean up temp file
      if (file.path && fs.existsSync(file.path)) {
        fs.unlinkSync(file.path);
      }
    }
  }
}

Finally, check your files.module.ts file and verify all the providers and controllers are configured correctly as shown below:

import { Module } from "@nestjs/common";
import { FilesController } from "./files.controller";
import { FilesService } from "./files.service";
import { CsvController } from "./csv/csv.controller";
import { S3Controller } from "./s3/s3.controller";
import { S3Service } from "./s3/s3.service";
import { CsvService } from "./csv/csv.service";

@Module({
  controllers: [FilesController, CsvController, S3Controller],
  providers: [FilesService, S3Service, CsvService],
})
export class FilesModule {}

Conclusion

In this post, we’ve covered how to handle file downloads, uploads to disk and S3, and file processing with streams. You now know what to do, what not to do and the reasons why. Possible next steps are adding database logic to the CSV-to-JSON example or adding retry logic for S3 uploads.


About the Author

Christian Nwamba

Chris Nwamba is a Senior Developer Advocate at AWS focusing on AWS Amplify. He is also a teacher with years of experience building products and communities.
