Electrons Are Fast, So Can Be Electron – How to Optimize Electron App Performance

Daniel Kawka — Mon, 24 Jul 2023 09:13:00 +0000

Electron is a popular framework for building desktop applications for different systems using the same codebase.

However, we often hear that it is slow, consumes a lot of memory, and spawns multiple processes that slow down the whole system. Still, some very popular applications are built using Electron, including:

Microsoft Teams (but they are migrating to Edge Webview2),
Signal,
WhatsApp.

Not all of them are perfect, but there are some very good examples, like Visual Studio Code. Can we say it’s slow? In our experience, it’s the opposite – it’s quite performant and responsive.

In this article, we’ll show you how we reduced bottlenecks in our Electron application and made it fast! The presented method can also be applied to other Node.js-based applications, like API servers or other tools requiring high performance.

Electron-based game launcher is our test subject

Our project is an Electron-based game launcher. If you play games, you probably have a few of them installed on your computer. Most launchers download game files, install updates, and verify files so games can launch without any problems.

Some parts we can’t speed up because they depend on, e.g., connection speed. Verifying downloaded or patched files is a different story: if the game is big, the whole process can take an impressive amount of time. This is our case.

Our app is responsible for downloading files and, if eligible, applying binary patches. When this is done, we must ensure that nothing gets corrupted. It does not matter what causes the corruption: our users want to play the game, and we have to make it possible.

Now, let me give you some numbers. Our games consist of 44 files with a total size of ~4.7GB.

We must verify them all after downloading the game or an update. We used https://www.npmjs.com/package/crc to calculate the CRC of each file and verify it against the manifest file. Let’s see how performant this approach is. Time for some benchmarks.

Running the Electron app pre-performance-optimization benchmark test

All benchmarks are run on a 2021 MacBook Pro 14" M1 Pro.

First, we need some files to verify. We can create a few using the command:

mkfile -n 200m test_200m_1

But if we look at the content, we will see it’s all zeros!

➜  /tmp cat test_200m_1 | xxd | tail -n 10
0c7fff60: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0c7fff70: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0c7fff80: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0c7fff90: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0c7fffa0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0c7fffb0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0c7fffc0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0c7fffd0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0c7fffe0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0c7ffff0: 0000 0000 0000 0000 0000 0000 0000 0000  ................

This might give us skewed results. Instead, we will use this command:

dd if=/dev/urandom of=test_200m_1 bs=1M count=200

I will create 10 files, 200MB each, and because the data in them is random, they should have different checksums.

The benchmark code:

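import crc32 from "crc/crc32";
import { createReadStream } from "fs";

// Stream the file and fold each chunk into a running CRC32,
// so the whole file never has to fit in memory.
const calculate = async (path) => {
  return new Promise((resolve) => {
    let checksum = null;
    const readStream = createReadStream(path);

    // Update the checksum with every chunk, seeding it with the previous value
    readStream.on("data", (chunk) => {
      checksum = crc32(chunk, checksum);
    });

    // All chunks processed: the running value is the file's checksum
    readStream.on("end", () => {
      resolve(checksum);
    });

    // On a read error, resolve with false instead of a checksum
    readStream.on("error", () => {
      resolve(false);
    });
  });
};

// Time a single run
const then = new Date().getTime();
await calculate("test_200m_1");
const now = new Date().getTime();
const elapsedTimeInMs = now - then;

console.log(elapsedTimeInMs);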

It takes around 800ms to create the read stream and calculate the checksum incrementally. We prefer streams because we can’t afford to load big files into system memory. If we calculate CRC32 for all files one by one, the result is ~16700ms. It slows down after the 3rd file.

Is it any better if we use Promise.all to run them concurrently? Well… the difference is within measurement error. The total hovers around ~16100ms.
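For reference, the two multi-file variants could be driven roughly as in the sketch below. The helper names runSequentially, runConcurrently, and timed are hypothetical, and calculate is passed in as a parameter so the snippet stands on its own. Promise.all starts all reads at once, but every chunk is still checksummed synchronously on the single JavaScript thread, which is a likely reason why the two totals are so close.

```javascript
// Process the files strictly one after another.
const runSequentially = async (paths, calculate) => {
  const checksums = [];
  for (const path of paths) {
    checksums.push(await calculate(path));
  }
  return checksums;
};

// Start every calculation at once and wait for all of them.
const runConcurrently = (paths, calculate) =>
  Promise.all(paths.map((path) => calculate(path)));

// Hypothetical timing wrapper: resolves to the result and the elapsed milliseconds.
const timed = async (run) => {
  const start = performance.now();
  const result = await run();
  return { result, elapsedMs: performance.now() - start };
};
```

A call such as timed(() => runConcurrently(paths, calculate)) then yields both the checksums and the elapsed time for one variant.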

So, here are our results so far:

Single file	10 files one by one	10 files in Promise.all
~800ms	~16700ms	~16100ms

Possible ways to optimize an Electron app performance

There are many paths you can take when optimizing an Electron app, but we are primarily interested in:

Node.js Worker Threads
Node-API
Neon Bindings
NAPI-RS
Another JS library that works natively

Node.js Worker Threads

Worker threads require some boilerplate code around them. They can also be problematic if your code base is in TypeScript: it’s doable, but requires additional tools like ts-node or extra configuration. We don’t want to spawn who knows how many worker threads either, as this would be inefficient too. Besides, the performance problem is somewhere else: the calculation itself is slow, and it will be slow wherever we put it.

Conclusion: worker threads would only move the slow calculation to another thread without making it any faster, so Node.js Worker Threads are not for us.
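To give a sense of the boilerplate involved, here is a minimal sketch of handing a job to a worker thread. The helper name runInWorker and the byte-summing job are placeholders, not code from our launcher. The worker source is inlined with eval: true only to keep the example in one file.

```javascript
import { Worker } from "node:worker_threads";

// Worker body, kept inline so the sketch fits in a single file.
// In a real project this would normally live in its own module.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  // Stand-in for the heavy work: sum every byte of the input.
  let sum = 0;
  for (const byte of workerData) sum += byte;
  parentPort.postMessage(sum);
`;

// Hypothetical helper: run the job in a worker and resolve with its result.
const runInWorker = (bytes) =>
  new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: bytes });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });

runInWorker(new Uint8Array([1, 2, 3])).then((total) => console.log(total));
```

Even in this small form, the message passing, error handling, and exit handling have to be wired up by hand, and the work inside the worker runs at the same speed as it would on the main thread.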

Node-API

If we want it fast, Node-API looks like a perfect solution. A library written in C/C++ must be fast. If you prefer to use C++ over C, the node-addon-api can help. This is probably one of the best solutions available, especially since it is officially supported by the Node.js team. It’s super stable once it is built, but it can be painful during development. Errors are often far from easy to understand, so if you are no expert in C, it might kick your ass very easily.

Conclusion: we don’t have C skills to fix the errors, so Node-API is not for us.

Neon Bindings

Now it is getting interesting: Neon Bindings. Rust in Node.js sounds amazing. It’s another buzzword, but is it only a buzzword? Neon says it is used by popular apps like 1Password and Signal (https://neon-bindings.com/docs/example-projects). Before deciding, let’s take a look at the other Rust-based option, NAPI-RS.

Conclusion: Neon Bindings looks promising, but let’s see how it compares to our last option.

NAPI-RS

If we look at the documentation, NAPI-RS’s docs look much better than Neon’s. The framework is sponsored by some big names in the industry. The extensive documentation and support of big brands are sufficient reasons for us to go with NAPI-RS rather than Neon Bindings.

Conclusion: NAPI-RS provides better documentation than the comparable Neon Bindings and is therefore the safer choice.

Using NAPI-RS to optimize the Electron app performance

To optimize our Electron app, we’ll use NAPI-RS, which mixes Rust with Node.js. Rust is an attractive addition to Node.js because of its performance, memory safety, community, and tools (cargo, rust-analyzer). No wonder it’s one of the most liked languages, and more and more companies are rewriting their modules in Rust.

With NAPI-RS, we need to build a library that includes https://crates.io/crates/crc32fast to calculate CRC32 extremely fast. NAPI-RS gives us great ready-to-go workflows to build NPM packages, so building it and integrating it with the project is a breeze. Prebuilt binaries are supported, too, so you don’t need the Rust environment at all to use the package: the correct build will be downloaded and used, no matter if you use Windows, Linux, or macOS (Apple M1 machines are on the list too).

With the crc32fast library, we will use the Hasher instance to update the checksum from the read stream, as in the JS implementation:

use crc32fast::Hasher;
use std::fs::File;
use std::io::Read;
use std::thread;

// Spawn and run the thread, it starts immediately.
// `path` is the file path handed in by the surrounding function.
let handle = thread::spawn(move || {
  // Size of the read buffer. CRC32 does not depend on how the data is
  // chunked, so this affects throughput only, not the resulting checksum.
  const BUFFER_LEN: usize = 64 * 1024;
  let mut buffer = [0u8; BUFFER_LEN];

  // Open the file, if it fails return -1 instead of a checksum
  let mut f = match File::open(path) {
    Ok(f) => f,
    Err(_) => return -1,
  };

  // Hasher instance, allows us to calculate the checksum chunk by chunk
  let mut hasher = Hasher::new();

  loop {
    // Read bytes and put them in the buffer, again, return -1 if it fails
    let read_count = match f.read(&mut buffer) {
      Ok(count) => count,
      Err(_) => return -1,
    };

    // Reading 0 bytes means end of file. A short read can also happen
    // before the end, so it must not be treated as the last chunk.
    if read_count == 0 {
      break;
    }

    // Hash only the bytes that were actually read
    hasher.update(&buffer[..read_count]);
  }

  // Calculate the final checksum and return it from the thread
  i64::from(hasher.finalize())
});

Running the Electron app post-performance-optimization benchmark test

It might sound like a fake or invalid result but it’s just 75ms for a single file! It’s ten times faster than the JS implementation. When we process all files one by one, it’s around 730ms, so it also scales much better.

But that’s not all. There is one more quite simple optimization we can make. Instead of calling the native library N times (where N is the number of files), we can make it accept an array of paths and spawn a thread for each file.

Remember: Rust itself does not limit the number of threads, because these are OS threads managed by the system, so the practical limit depends on the system. If you know how many threads will be spawned and the number is not very high, you should be safe. Otherwise, we recommend setting a limit and processing the files (or doing the computation) in chunks.

Let’s put our calculation in a thread per single file and return all checksums at once:
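One way to apply such a limit from the JavaScript side is to hand the paths to the native function in batches. The sketch below assumes a batch function that takes an array of paths and returns the checksums in the same order. The names chunk, checksumInChunks, and checksumBatch, as well as the default of 8, are illustrative choices and not part of our library.

```javascript
// Split an array into consecutive slices of at most `size` items.
const chunk = (items, size) => {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
};

// `checksumBatch` is a hypothetical stand-in for the native function: it takes
// an array of paths and returns (or resolves to) checksums in the same order.
const checksumInChunks = async (paths, checksumBatch, maxThreads = 8) => {
  const checksums = [];
  for (const batch of chunk(paths, maxThreads)) {
    // With one thread per file, each call spawns at most `maxThreads` threads.
    checksums.push(...(await checksumBatch(batch)));
  }
  return checksums;
};
```

The batches run one after another, so the order of the results matches the order of the input paths.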

// Vector of threads, to be "awaited" later
let mut threads = Vec::<std::thread::JoinHandle<i64>>::new();

for path in paths.into_iter() {
 // Spawn and run the thread, it starts immediately
 let handle = thread::spawn(move || {
   // ... code removed for brevity
 });

 // Push handle to the vector
 threads.push(handle);
}

// Prepare an empty vector for checksums
let mut results = Vec::<i64>::new();

// Go through every thread and wait for it to finish
for task in threads.into_iter() {
 // Get the checksum and push it to the vector
 let result = task.join().unwrap();
 results.push(result);
}

// Return vector(array) of checksums to JS
Ok(results)

How long does it take to call the native function with an array of paths and do all the calculations?

Only 150ms, yes, it is THAT quick. To be 100% sure, we restarted our MacBook and did two additional tests.

First run:

Rust took 463ms Checksums [
  2918571326,  644605025,
   887396193, 1902706446,
  2840008691, 3721571342,
  2187137076, 2024701528,
  3895033490, 2349731754
]
JS promise.all took 16190ms Checksum [
  2918571326,  644605025,
   887396193, 1902706446,
  2840008691, 3721571342,
  2187137076, 2024701528,
  3895033490, 2349731754
]

Second run:

Rust took 197ms Checksums [
  2918571326,  644605025,
   887396193, 1902706446,
  2840008691, 3721571342,
  2187137076, 2024701528,
  3895033490, 2349731754
]
JS promise.all took 16189ms Checksum [
  2918571326,  644605025,
   887396193, 1902706446,
  2840008691, 3721571342,
  2187137076, 2024701528,
  3895033490, 2349731754
]

Let’s bring all the results together and see how they compare.

	JS	Rust
Single file	~800ms	~75ms
10 files one by one	~16700ms	~730ms
10 files Promise.all	~16100ms	-
10 files in threads	-	~200ms

It’s worth noting that calling the native function with an empty array takes 124584 nanoseconds, which is about 0.12ms, so the overhead is very small.
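A measurement with nanosecond resolution can be taken with process.hrtime.bigint(). The sketch below is only an illustration: measureNs and nsToMs are hypothetical helper names, and the measured callback is a placeholder for the native call with an empty array.

```javascript
// Run `fn` once and return how long it took, in nanoseconds (as a BigInt).
const measureNs = (fn) => {
  const start = process.hrtime.bigint();
  fn();
  return process.hrtime.bigint() - start;
};

// Convert nanoseconds to milliseconds for display.
const nsToMs = (ns) => Number(ns) / 1e6;

// Placeholder for the native call with an empty array of paths.
const elapsedNs = measureNs(() => [].map((path) => path));
console.log(`${elapsedNs}ns = ${nsToMs(elapsedNs).toFixed(2)}ms`);
```

A single run is noisy at this scale, so repeating the call and looking at the typical value gives a more trustworthy number.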

Remember to keep your Electron app unpacked

As mentioned in the beginning, all of this applies to Web APIs, CLI tools, and Electron. Basically, to everything where Node.js is used. But with Electron, there is one more thing to remember. Electron bundles the app into an archive called app.asar. Some Node modules must be unpacked in order to be loaded by the runtime. Most bundlers like Electron Builder or Forge automatically keep those modules outside the archive file, but it might happen that our library stays inside the ASAR archive. If so, you should specify which libraries should remain unpacked. It’s not mandatory, but it will reduce the overhead of unpacking and loading these .node files.
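With Electron Forge, for example, this could be configured roughly as follows. It is a sketch that assumes the native addon ships as .node files and that the project uses a forge.config.js file. The glob is only an example and should be adjusted to the project. Electron Builder offers a comparable asarUnpack option.

```javascript
// forge.config.js (CommonJS, as in Forge's project templates)
module.exports = {
  packagerConfig: {
    asar: {
      // Keep native addons outside app.asar so they can be loaded directly
      unpack: "*.node",
    },
  },
};
```

Files matching the pattern end up next to the archive in app.asar.unpacked instead of inside app.asar.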

Our advice: Try experimenting with Rust and C to improve your Electron app performance

As you can see, there are multiple ways of speeding up parts of your Electron application, especially when it comes to doing heavy computations. Luckily, developers can choose from different languages and strategies to cover a wide spectrum of use cases.

In our app, verifying files is only part of the whole launcher process. The slowest part for most players is downloading the files, but this cannot be optimized beyond what your internet service provider offers. Also, some players have older machines with HDDs, where I/O might be the bottleneck and not the CPU.

But if there is something we can improve and make more performant at reasonable costs, we should strive for it. If there are any functions or modules in your application that can be rewritten in either Rust or C, why not try experimenting? Such optimizations could significantly improve your app’s overall performance.
