dev

I was going through our deployment process this week and hit a snag: I was using the dev host in our production environment. Then I hit another one, and another, and another… It turned out that we’d been breaking our own rules, not all at once, but slowly and surely. Sometimes it was a matter of convenience. Other times, a matter of bad assumptions. Each instance was innocent enough, but combined they created a nightmare when it came time to deploy. We’d been hardcoding dev values.

[Image: “dev” crossed out and replaced with the function get_env().]

While sometimes it can feel innocent enough to hardcode “dev” here and there to speed things up, I’d argue that it’s almost always a bad idea if you have any hopes of using any of the code in prod someday. So while it may take a little more upfront thought to get started, it will save you a lot of tricky debugging in the long run if you have a plan for switching environments early on.

How to Handle Environments

There are three keystone decisions you need to make in order to properly handle different environments. They are:

Separate environments completely
Decide the source of truth for the environment
Use switching logic and a secret store to manage values

Separate Environments Completely

dev, qa, and prod should not just be folders. They are entirely separate environments. In GitHub, only your main branch should deploy anything to prod, and it should be done via CI/CD (e.g. GitHub Actions).

If you are using Databricks, this would mean having a dev workspace, a qa workspace, and a prod workspace. In GCP, this would be different projects.

In addition, each environment should have its own service principal that is allowed to run your code, and whose permissions are used when running said code.

Decide the Source of Truth for the Environment

You need a way for your code to programmatically know which environment it is in. You can then use this info to handle your switching logic and grab the right information that you need, such as service principals, passwords, hosts, etc.

How do you know you are in dev? And similarly, how do you know you are in qa or prod? It may be a host URL, or an environment variable that is set in only one environment or that takes a different value in each.

We do this with a combination of compute policies and environment variables. The policy is set at the workspace (environment) level and automatically added to any compute in that environment. The policy sets environment variables, namely one we call ENV_CODE which takes a value of dev, qa, or prod based on the policy and environment.
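As a rough illustration of reading that variable, here is a minimal Python sketch of a get_env() helper. ENV_CODE and its three values come from the setup described above; the validation, the optional environ parameter (there only to make the function easy to test), and the choice to raise instead of defaulting are assumptions of this sketch, not a description of our actual code.

```python
import os

# The three environment codes described above.
VALID_ENVS = ("dev", "qa", "prod")


def get_env(environ=None):
    """Return the current environment code from ENV_CODE.

    `environ` is a hypothetical parameter for testing; by default the
    real process environment is used.
    """
    environ = os.environ if environ is None else environ
    env = environ.get("ENV_CODE", "").strip().lower()
    if env not in VALID_ENVS:
        # Fail loudly instead of silently falling back to dev.
        raise RuntimeError(
            f"ENV_CODE must be one of {VALID_ENVS}, got {env!r}"
        )
    return env
```

Raising on a missing or unknown value is a deliberate choice in this sketch: a silent default of dev is just another hardcoded dev value.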

Use Switching Logic and a Secret Store to Manage Values

Now that you know definitively where you are, you can very simply determine where to deploy code to, who should deploy it, and who should run it. The easiest way to manage this is to have simple switches based on the environment, and then corresponding folders in a secret store (e.g. Azure Key Vault, HashiCorp Vault, etc.) that contain all of the relevant info for each environment (host, SP name and password, etc.).
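One way to picture the switch is a function that turns the environment code into secret-store keys, so the code never mentions a concrete host or password. The sketch below is hypothetical throughout: the key layout (dev/host, dev/sp_name, dev/sp_password) and the fetch_secret callable are stand-ins for whatever your secret store client provides, not a real Key Vault or Vault API.

```python
from dataclasses import dataclass

VALID_ENVS = ("dev", "qa", "prod")


@dataclass(frozen=True)
class SecretKeys:
    """Secret-store keys for one environment (hypothetical layout)."""
    host: str
    sp_name: str
    sp_password: str


def secret_keys(env):
    """Build the per-environment keys, e.g. 'qa/host'."""
    if env not in VALID_ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    return SecretKeys(
        host=f"{env}/host",
        sp_name=f"{env}/sp_name",
        sp_password=f"{env}/sp_password",
    )


def load_settings(env, fetch_secret):
    """Resolve all values for `env` using the caller's secret lookup.

    `fetch_secret` is any callable mapping a key to its secret value;
    in real use it would wrap your secret store's client.
    """
    keys = secret_keys(env)
    return {
        "host": fetch_secret(keys.host),
        "sp_name": fetch_secret(keys.sp_name),
        "sp_password": fetch_secret(keys.sp_password),
    }


if __name__ == "__main__":
    # An in-memory dict stands in for the secret store in this demo.
    fake_store = {
        "qa/host": "https://qa.example.com",
        "qa/sp_name": "qa-service-principal",
        "qa/sp_password": "not-a-real-password",
    }
    print(load_settings("qa", fake_store.__getitem__)["host"])
```

The only thing that varies between environments here is the prefix, so the same code path runs in dev, qa, and prod.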

If you are deploying with GitHub Actions, you can use your branch names as inputs to your workflows (e.g. if the workflow is kicked off from main, deploy to prod). If you use trunk-based development, you can use event triggers to determine where to deploy to: a pull request merge can deploy to dev, a workflow dispatch can deploy to qa, and a release being published can deploy to prod.
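The trunk-based mapping above can be sketched as a small lookup that a deploy script might run inside a workflow. GITHUB_EVENT_NAME is one of the default environment variables GitHub Actions sets on its runners; the function name, the mapping table, and the assumption that the workflow's own trigger filters have already narrowed pull_request events down to merges are all illustrative.

```python
import os

# Event name -> target environment, following the mapping in the text.
# Assumes the workflow triggers only fire pull_request for merged PRs
# and release for published releases.
EVENT_TO_ENV = {
    "pull_request": "dev",
    "workflow_dispatch": "qa",
    "release": "prod",
}


def deploy_target(event_name):
    """Return the environment to deploy to for a GitHub event name."""
    try:
        return EVENT_TO_ENV[event_name]
    except KeyError:
        # Refuse to deploy anywhere on an event we did not plan for.
        raise ValueError(
            f"no deploy target for event {event_name!r}"
        ) from None


if __name__ == "__main__":
    # Inside a workflow run, GitHub Actions sets GITHUB_EVENT_NAME.
    event = os.environ.get("GITHUB_EVENT_NAME", "")
    print(deploy_target(event))
```

Keeping the mapping in one table means that changing which event deploys where is a one-line edit, with no conditionals scattered through the deploy script.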

Why Environments are Important

The main purpose of environments, in my opinion, is to prevent humans from touching and accidentally breaking important things.

dev is a sandbox. A bit of the wild west. It’s okay to break things here, have duplicates, be a bit messy. And importantly, it’s okay for humans to be in here running things manually.

qa is the tidied-up version of dev, and is a barebones replica of prod in some ways. Maybe not everything is running all the time, but most things that you expect to find in the prod environment should also be present in the qa environment. This is a clean environment where your deployment and run processes are as close to prod as possible, so you can get a true sense of whether your end-to-end application will work (or still works). Humans should not be deploying to or running the code in qa. Deployments should happen automatically via CI/CD (triggered by a release PR, a workflow dispatch, or some other method), and service principals should be running the code.

prod is the most important: the big leagues. Everything is live and real. Everything in qa applies to prod as well. The only difference is that what happens in prod matters and is permanent in some way most of the time. You have upstream dependencies and someday will have downstream dependencies.

If you follow these steps and build them into your process early on, you’ll have far fewer headaches come deployment time, and less risk of something breaking once it makes it to production.