The process.env frontend time bomb (plus: a sustainable definition of “fixed”)

Massimiliano Mirra

Date: Apr 2024

Status: finished

When I open a file and see process.env used like this, I know I’m in trouble:

function Cart() {
  useEffect(() => {
    fetch(`${process.env.REACT_APP_API_BASE_URL}/cart`).then(
      (res) => {
        // ...

Why? Well, suppose I offer you a strategy that makes your app work locally yet break in production, in a way that your users will notice but you won’t, that is hard to debug, and that is guaranteed to happen at some point because the root cause is unsolvable.

Assuming you’re not the villain of this story… do you take the offer?

If the answer is “no”, then I hope you’re not using process.env as in the example above, because it’s got all the ingredients of trouble:

if you forget to set REACT_APP_API_BASE_URL in CI, the app breaks only in production
unless you manually test after each deployment, you won’t find out that the cart is broken until users report it
users will say “the cart doesn’t load” and the console message they’ll (hopefully) report is Uncaught SyntaxError: Unexpected token '<', "<!doctype "... is not valid JSON, both of which are a far cry from “I wonder if somebody forgot to set an environment variable”
you don’t introduce variables often enough for the task of setting them in CI to become a habit, so it will always depend on conscious attention, which you’ll inevitably lack at some point.
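
To see why the failure is so quiet, here is a minimal sketch of what the template string does when the variable is missing. The buildCartUrl helper is hypothetical and exists only to make the behavior observable outside a component:

```typescript
// Hypothetical helper that mirrors the template string used in Cart.
export const buildCartUrl = (
  envVars: Record<string, string | undefined>,
): string => `${envVars.REACT_APP_API_BASE_URL}/cart`;

// Variable set: the URL is what you expect.
console.log(
  buildCartUrl({ REACT_APP_API_BASE_URL: "https://example.com/api" }),
);
// prints "https://example.com/api/cart"

// Variable missing: no exception, just a bogus relative URL.
console.log(buildCartUrl({}));
// prints "undefined/cart"
```

Nothing throws. The browser resolves "undefined/cart" against the current page, and hosts configured for single-page apps typically answer unknown paths with index.html, which is most likely where the "<!doctype" in the JSON error comes from.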

The issues with that code don’t stop there (hardcoded strings are a fragile way of representing routes, there are much better alternatives to the combination of useEffect and fetch, …) but, in my experience, process.env is the one that most consistently flies under the radar.

Below, I’ll show a way of removing this class of bug from your process. But first, let’s dig a little deeper, because the “fix” that most of us rush toward in such a situation is a worse cure than the disease.

A sustainable definition of “fixed”

It’s late afternoon, the cart has been broken for most of the day, but you just found out. The business lost revenue, long-time users are frustrated, first-time users are gone, and you’ve canceled your evening plans. After much digging, you finally realize what’s going on. You set that damn variable and redeploy…

Well done! By all means go ahead with those evening plans. But don’t declare anything “fixed” just yet.

Teams like to debate the “definition of done”, but I’ve seldom been in a conversation about the “definition of fixed”. Yet software keeps breaking in ways it already broke, and nobody connects the dots.

“Fixed” isn’t putting back up the sign that was knocked down by a gust of wind, only to find it down again two weeks later. “Fixed” is anchoring the sign so that it stays up. “Fixed” is making the change that lets you say not only “it works now” but also “it will work tomorrow”. Other definitions are unsustainable: code bases grow, time does not. If we let weak spots grow along with code, soon we spend all our time dealing with breakage rather than adding value.

If setting that environment variable didn’t really fix the problem, then, what does? And what problem are we really talking about anyway?

Forgetting to set an environment variable isn’t a problem: it’s a certainty. And so is the failure of any process that relies on something inherently fallible like human attention. Not recognizing certain failure and not planning for it, that is the true problem.

It’s ironic: we plan for the failure of machines and networks, which work fine 99.99% of the time, yet when it comes to memory, which for most people fails many times a day, the response is “let’s remember next time”.

The fix isn’t to try harder, it is to reframe the problem so it can be attacked with what machines are good at rather than what people are bad at.

Centralize, parse, anticipate

We got into trouble because:

we were reading environment variables all over the place
didn’t see something break until users told us
received obscure error messages

So let’s define the goal as:

read variables all in one place
if any is invalid, fail the build so that the bug never gets deployed
produce errors that point at the exact problem

Start by creating a config.ts to centralize all configuration code so that the rest of the app doesn’t need to worry about it. A React component has no business dealing with the environment anyway.

Treat these variables as external data (because they are) and have them go through security before letting them on board:

// config.ts
import { createContext } from "react";
import { z } from "zod";

export interface Config {
  api: {
    baseUrl: string;
  };
}

export const parseConfig = (
  envVars: Record<string, string | undefined>,
): Config => {
  const envSchema = z.object({
    REACT_APP_API_BASE_URL: z.string().url(),
  });

  const env = envSchema.parse(envVars);

  return {
    api: { baseUrl: env.REACT_APP_API_BASE_URL },
  };
};

export const ConfigContext = createContext<Config | null>(
  null,
);

In the index.tsx entry point, read the configuration and make it available to the rest of the app via React Context. (Bonus: this will also save you from mocking process.env and other acrobatics in tests and Storybook.)

// index.tsx
  import reportWebVitals from "./reportWebVitals";
+ import { parseConfig, ConfigContext } from "./config";
+
+ const config = parseConfig(process.env);

  const root = ReactDOM.createRoot(
    document.getElementById("root") as HTMLElement,
  );
  root.render(
    <React.StrictMode>
-     <App />
+     <ConfigContext.Provider value={config}>
+       <App />
+     </ConfigContext.Provider>
    </React.StrictMode>,
  );
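
On the testing bonus: because parseConfig receives the environment as an argument, a test can hand it a plain object instead of patching process.env. Here's a rough sketch of that idea. To keep it runnable on its own, it uses a dependency-free stand-in for the zod-based parseConfig (the regex is only a crude placeholder for zod's url() check), and the throws helper is hypothetical:

```typescript
interface Config {
  api: { baseUrl: string };
}

// Stand-in for the zod-based parseConfig in config.ts (an assumption made
// so this snippet has no dependencies; the real one uses zod).
export const parseConfig = (
  envVars: Record<string, string | undefined>,
): Config => {
  const baseUrl = envVars.REACT_APP_API_BASE_URL;
  if (baseUrl === undefined || !/^https?:\/\/\S+$/.test(baseUrl)) {
    throw new Error(
      "REACT_APP_API_BASE_URL: expected a URL, got " + JSON.stringify(baseUrl),
    );
  }
  return { api: { baseUrl } };
};

// Hypothetical test helper: runs fn and reports whether it threw.
const throws = (fn: () => unknown): boolean => {
  try {
    fn();
    return false;
  } catch {
    return true;
  }
};

// Each case is a plain object literal: no process.env mocking required.
const valid = parseConfig({
  REACT_APP_API_BASE_URL: "https://example.com/api",
});
console.log(valid.api.baseUrl); // prints "https://example.com/api"
console.log(throws(() => parseConfig({}))); // prints "true"
console.log(
  throws(() => parseConfig({ REACT_APP_API_BASE_URL: "foo" })),
); // prints "true"
```

The same applies to components: a test or a Storybook story can wrap them in ConfigContext.Provider with whatever Config object the scenario calls for.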

Adapt Cart.tsx:

// Cart.tsx
- import { useState, useEffect } from "react";
+ import { useState, useEffect, useContext } from "react";
+ import { ConfigContext } from "./config";

  export const Cart: React.FC = () => {
+   const config = useContext(ConfigContext);
+   if (config === null) throw new Error("Config not set");
+
    const [cart, setCart] = useState(null);

    useEffect(() => {
-     fetch(`${process.env.REACT_APP_API_BASE_URL}/carts/1`)
+     fetch(`${config.api.baseUrl}/carts/1`)
        .then((res) => res.json())
        .then(setCart)
        .catch(console.error);
-   }, []);
+   }, [config]);

Finally, add a package.json script that will call parseConfig in CI. For example, using tsx:

    "scripts": {
+     "validate-env": "tsx -e 'import { parseConfig } from \"./src/config\"; parseConfig(process.env)'",
      "start": "react-scripts start",

Now, when a build variable is unset or invalid, you get an error in CI, not in production, and it tells you exactly what’s wrong:

$ REACT_APP_API_BASE_URL=foo npm run validate-env
...
ZodError: [
  {
    "validation": "url",
    "code": "invalid_string",
    "message": "Invalid url",
    "path": [
      "REACT_APP_API_BASE_URL"
    ]
  }
]

See the full code example on GitHub (also covers the advanced bits below).

Advanced

Really disconnecting from `process.env`

A config.REACT_APP_API_BASE_URL that’s guaranteed to be valid is better than a fickle process.env, but it might not map to what’s optimal for calling code. For example, we might want feature flags as an array of literals rather than a raw string that must be split every time:

  const SearchResults: React.FC = () => {
    const config = useContext(ConfigContext);
-   if (config.REACT_APP_ENABLED_FEATURES.split(",").includes("infinite-scroll")) {
+   if (config.enabledFeatures.includes("infinite-scroll")) {
      // ...

To achieve that, validate environment variables like before, but instead of returning them verbatim, use them to fill a domain-specific Config object:

// config.ts
import { z } from "zod";

interface Config {
  enabledFeatures: Array<"infinite-scroll" | "dark-mode" | "share-button">;
}

export const parseConfig = (
  envVars: Record<string, string | undefined>,
): Config => {
  const envSchema = z.object({
    REACT_APP_ENABLED_FEATURES: z.string().optional(),
  });

  const env = envSchema.parse(envVars);

  // The variable is a comma-separated string, so split it before
  // validating each entry against the list of known features.
  const enabledFeatures = env.REACT_APP_ENABLED_FEATURES
    ? z
        .array(z.enum(["infinite-scroll", "dark-mode", "share-button"]))
        .parse(env.REACT_APP_ENABLED_FEATURES.split(","))
    : [];

  return {
    enabledFeatures,
  };
};
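
One thing to watch in the snippet above is that the list of feature names appears twice, once in the interface and once in the schema. As a possible refinement, here is a dependency-free sketch (not the zod version from this article) that derives the type from a single array so the two can't drift apart. KNOWN_FEATURES, isFeature, and parseEnabledFeatures are names made up for this example, and trimming whitespace around entries is an extra assumption:

```typescript
// Single source of truth for feature names (hypothetical constant).
const KNOWN_FEATURES = ["infinite-scroll", "dark-mode", "share-button"] as const;
type Feature = (typeof KNOWN_FEATURES)[number];

// Type guard: narrows an arbitrary string to a known feature name.
const isFeature = (value: string): value is Feature =>
  (KNOWN_FEATURES as readonly string[]).indexOf(value) !== -1;

export const parseEnabledFeatures = (raw: string | undefined): Feature[] => {
  // Unset or blank means "no features enabled".
  if (raw === undefined || raw.trim() === "") return [];

  const entries = raw.split(",").map((entry) => entry.trim());
  const unknown = entries.filter((entry) => !isFeature(entry));
  if (unknown.length > 0) {
    throw new Error(
      "REACT_APP_ENABLED_FEATURES contains unknown features: " +
        unknown.join(", "),
    );
  }
  return entries.filter(isFeature);
};

console.log(parseEnabledFeatures(undefined).length); // prints "0"
console.log(parseEnabledFeatures("infinite-scroll, dark-mode").join("|"));
// prints "infinite-scroll|dark-mode"
```

With zod, a similar effect can probably be had by passing the same const array to z.enum and inferring the type from the schema.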

Default configuration values

To prevent .env fatigue, you can set default values:

export const parseConfig = (
  envVars: Record<string, string | undefined>,
): Config => {
  const envSchema = z.object({
    REACT_APP_API_BASE_URL: z
      .string()
      .url()
      .default("https://example.com/api"),
  });

Dependent configuration values

Let’s say that some environment variables only make sense together, and you want them all set or none at all. Here’s how to do it:

interface Config {
  api: {
    baseUrl: string;
  };
  datadog?: {
    applicationId: string;
    site: string;
  };
}

export const parseConfig = (
  envVars: Record<string, string | undefined>,
): Config => {
  const apiEnvSchema = z.object({
    REACT_APP_API_BASE_URL: z.string().url(),
  });

  const dataDogEnvSchema = z
    .object({
      REACT_APP_DATADOG_APPLICATION_ID: z.string(),
      REACT_APP_DATADOG_SITE: z.string(),
    })
    .or(
      z.object({
        REACT_APP_DATADOG_APPLICATION_ID: z.undefined(),
        REACT_APP_DATADOG_SITE: z.undefined(),
      }),
    );

  const fullEnvSchema = apiEnvSchema
    .and(dataDogEnvSchema);

  const env = fullEnvSchema.parse(envVars);

  const datadog = env.REACT_APP_DATADOG_APPLICATION_ID
    ? {
        site: env.REACT_APP_DATADOG_SITE,
        applicationId: env.REACT_APP_DATADOG_APPLICATION_ID,
      }
    : undefined;

// ...
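
To make the intended behavior concrete, here is the same all-or-none rule written as a small dependency-free function, followed by the three cases it has to handle. This is a sketch of the rule itself, not of the zod schema above; parseDatadog and the sample values are made up for illustration:

```typescript
interface DatadogConfig {
  applicationId: string;
  site: string;
}

// Hypothetical helper: returns the Datadog config when both variables are
// set, undefined when neither is, and throws when only one is present.
export const parseDatadog = (
  envVars: Record<string, string | undefined>,
): DatadogConfig | undefined => {
  const applicationId = envVars.REACT_APP_DATADOG_APPLICATION_ID;
  const site = envVars.REACT_APP_DATADOG_SITE;

  if (applicationId === undefined && site === undefined) return undefined;
  if (applicationId === undefined || site === undefined) {
    throw new Error(
      "REACT_APP_DATADOG_APPLICATION_ID and REACT_APP_DATADOG_SITE " +
        "must be set together",
    );
  }
  return { applicationId, site };
};

// Neither set: Datadog is simply disabled.
console.log(parseDatadog({})); // prints "undefined"

// Both set: a complete config object (sample values are placeholders).
console.log(
  parseDatadog({
    REACT_APP_DATADOG_APPLICATION_ID: "abc123",
    REACT_APP_DATADOG_SITE: "datadoghq.com",
  })?.site,
); // prints "datadoghq.com"

// Only one set: parseDatadog throws, which is what should fail the CI step.
```

The zod union in the snippet above should accept and reject the same three cases, with the added benefit that the error names the offending variable.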

“My teammates keep forgetting about this and end up using `process.env` in component code again”

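Add this to .eslintrc.json, and exempt the one place that legitimately reads process.env (index.tsx in this example), for instance with an eslint-disable-next-line comment: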

{
  "rules": {
    "no-process-env": "error"
  }
}

Summary

Code bugs can be just symptoms of process bugs.
Breakage due to a missing env variable reveals that we’re giving people tasks that are suited to machines (a process bug if there ever was one).
Reframe such tasks and do give them to machines.

Credits

Thanks to Atris for many productive conversations on the topic, Sebastién for his review and additional perspectives, and Kyle (as well as his excellent product Crone) for the precious feedback on writing.
