Node.js Stream

I often forget the Node.js stream API and how to use it.

What For?

As the post says, it is for:

  1. Memory efficiency
  2. Time efficiency

The stream API helps with batch operations, for example processing 100,000 records from a PostgreSQL table.

interface User {
  id: string;
  name: string;
}

// fetchUsers sends a query to the DB and returns all records from the `users` table.
function fetchUsers(): Promise<User[]> {
  // implementation omitted
  throw new Error("not implemented");
}

(async () => {
  const users = await fetchUsers();

  for (const user of users) {
    // process each user
  }
})();

The code above is NOT memory efficient. It loads every user record into memory at once, so the process might be killed by the OOM killer as the table grows.

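The stream API works very well in this case because it processes records as they arrive instead of loading them all first. It also handles flow control (how many items are buffered, pausing and resuming the flow, etc.).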
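As a rough sketch of a streaming alternative to the loop above, assuming the table can be read page by page: `fetchUsersPage`, `iterateUsers`, `processAllUsers`, and the in-memory `fakeTable` are hypothetical names made up for this example, standing in for a real paginated query. The generator holds one page at a time, and the stream buffers only a bounded number of items ahead of the consumer.

```typescript
import { Readable } from "node:stream";

interface User {
  id: string;
  name: string;
}

// Hypothetical stand-in for the `users` table so the sketch runs on its own.
const fakeTable: User[] = Array.from({ length: 10 }, (_, i) => ({
  id: String(i + 1),
  name: `user-${i + 1}`,
}));

// Hypothetical paginated query: returns at most `limit` rows starting at `offset`.
async function fetchUsersPage(offset: number, limit: number): Promise<User[]> {
  return fakeTable.slice(offset, offset + limit);
}

// Yields users one by one while holding only a single page in memory.
async function* iterateUsers(pageSize: number): AsyncGenerator<User> {
  for (let offset = 0; ; offset += pageSize) {
    const page = await fetchUsersPage(offset, pageSize);
    if (page.length === 0) return; // no more rows
    yield* page;
  }
}

// Consumes the stream and returns the ids in the order they were processed.
async function processAllUsers(pageSize: number): Promise<string[]> {
  const processed: string[] = [];
  const stream = Readable.from(iterateUsers(pageSize));
  for await (const user of stream) {
    processed.push((user as User).id); // process each user
  }
  return processed;
}

processAllUsers(3).then((ids) => console.log(`processed ${ids.length} users`));
```

With a real database, `fetchUsersPage` would typically be replaced by a cursor or a keyset-paginated query; the consuming loop stays the same.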

Stream Types

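  • stream.Readable: can be a source (src)
  • stream.Writable: can be a destination (dest)
  • stream.Duplex: both Readable and Writable
  • stream.Transform: a Duplex that modifies data as it passes through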
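To make the roles concrete, here is a minimal sketch that wires a Readable source through a Transform into a Writable destination. The function name `runPipeline` and the upper-casing step are assumptions chosen for illustration only.

```typescript
import { Readable, Transform, Writable } from "node:stream";

// Pipes words through an upper-casing Transform and collects them in a Writable.
function runPipeline(words: string[]): Promise<string[]> {
  const received: string[] = [];

  // Readable: the source of the data.
  const source = Readable.from(words);

  // Transform: a Duplex whose output is computed from its input.
  const upperCase = new Transform({
    objectMode: true,
    transform(chunk, _encoding, callback) {
      callback(null, String(chunk).toUpperCase());
    },
  });

  // Writable: the destination of the data.
  const dest = new Writable({
    objectMode: true,
    write(chunk, _encoding, callback) {
      received.push(chunk);
      callback();
    },
  });

  return new Promise((resolve, reject) => {
    // `.pipe()` does not forward errors, so listen on every stream.
    source.on("error", reject);
    upperCase.on("error", reject);
    dest.on("error", reject);
    dest.on("finish", () => resolve(received));
    source.pipe(upperCase).pipe(dest);
  });
}

runPipeline(["readable", "transform", "writable"]).then((out) => console.log(out));
```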

See the docs if you’re interested in the details.

Usage

In most web apps, you mostly want to handle Readable streams.

There are several ways to do so, and I summarized them here.

I think the following are the modern ways:

  • Using Readable.from (1-b in the gist) to create a Readable stream
  • Using for await (2-c in the gist) to consume a Readable stream
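A minimal sketch of these two pieces together, not taken from the gist: `take` is a hypothetical helper that reads at most `limit` items from any Readable. Leaving a `for await` loop early with `break` destroys the stream, so no explicit cleanup is needed in that case.

```typescript
import { Readable } from "node:stream";

// Collects at most `limit` items from a Readable using `for await`.
async function take<T>(stream: Readable, limit: number): Promise<T[]> {
  const items: T[] = [];
  if (limit <= 0) {
    stream.destroy(); // nothing to read, release the stream right away
    return items;
  }
  for await (const item of stream) {
    items.push(item as T);
    if (items.length >= limit) break; // breaking out destroys the stream
  }
  return items;
}

// Readable.from accepts any iterable or async iterable; here, a plain array.
take<string>(Readable.from(["a", "b", "c"]), 2).then((items) => console.log(items));
```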

stream/promises module

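This module was introduced in Node.js v15.0.0. It provides promise-based versions of stream utilities such as pipeline and finished.

The main classes, like Readable and Writable, are still exported from the stream module.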
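A small sketch of how the two modules fit together: the classes come from `stream`, while the awaitable `pipeline` comes from `stream/promises`. The `sumNumbers` function is a made-up example, not something from the Node.js docs.

```typescript
import { Readable, Writable } from "node:stream";
import { pipeline } from "node:stream/promises";

// Sums numbers by piping a Readable into a Writable and awaiting completion.
async function sumNumbers(numbers: number[]): Promise<number> {
  let total = 0;

  const sink = new Writable({
    objectMode: true,
    write(chunk: number, _encoding, callback) {
      total += chunk;
      callback();
    },
  });

  // Resolves once the sink has finished; rejects if any stream errors.
  await pipeline(Readable.from(numbers), sink);
  return total;
}

sumNumbers([1, 2, 3, 4]).then((total) => console.log(`total: ${total}`));
```

Compared with chaining `.pipe()` calls, `pipeline` forwards errors from every stream in the chain and cleans the streams up, which is why it pairs well with `async`/`await`.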
