A Modern DevOps Solution for the Age-Old Embedded Build, Test, and Release Process
If you have been a developer or leader in the embedded space for any longer than 5 minutes, here are some stories you know very well:
The “It Won’t Build for Me” Scenario
Jack and Jill are feverishly developing modules, each on their own Windows PC and local development system. Both have been using their locally installed cross-compile toolchains all day without issue. When they are ready to test together, Jill commits her code to the branch, and Jack pulls to integrate.
Ugh! Jill’s code won’t build on Jack’s machine.
The “It Doesn’t Build for Production” Scenario
Jack is working on a delivery for today with his Linux machine and development system on his desk. Finally, at 7 pm, he’s got it working and commits his code to the repository. The build system picks up the changes and starts the build.
Ugh! The production build system, a Windows machine, won’t compile the code.
The “Disgruntled Windows User” Scenario
New hire Jack requests a machine with Ubuntu for his development environment. Sorry, our IT department only supports Windows. Here’s your Dell, and here’s Visual Studio.
Ugh! But wait, isn’t our target environment Linux?
For a software developer, their PC is like a chef’s set of knives: a very personal tool. Most developers are intimately familiar with their machines, partly because we have to be, but mostly because we want to be. We configure them just the way we like them. We are comfortable and fast in our favorite editor and development environment.
We want the IT team to leave us alone, thank you very much.
The Best of the Old Way
Good organizations do their best to minimize these issues through a combination of organizational and personal discipline along with good configuration management. The goal is always to keep each developer’s local build environment close to the production build environment and to each other’s.
Toolchain and Build System/Environment
Embedded developers always need a cross-compile toolchain. Sometimes it’s free (e.g., ARM GCC), sometimes licensed (e.g., ARM Keil). Managing this toolchain, its version, configuration, and environment is one of the most critical coordination efforts across the development team because it is the source of many “it doesn’t build for me” issues.
There is also the matter of the build system or build scripts. Whether it’s some flavor of make, autotools, yocto, an IDE project, or custom scripts in you-name-the-shell, this set of black magic can singlehandedly derail any release on any day.
The best organizations figure out how to get both the toolchain and build system under configuration management. One such method is putting the entire toolchain and build utilities into a repository.
Even though a golden-master build system exists, most developers’ local build environments deviate from it. In day-to-day operations, there are many reasons developers modify or update these elements.
We try stuff. It’s what we do.
Installation Script
Good organizations create a utility that installs and configures the golden-master build environment. This utility provides a common process and known state for getting someone up and running.
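For illustration, here is a minimal sketch of what such an installation utility might look like, assuming a GCC-based toolchain archive kept under configuration management (the paths, archive name, and version below are hypothetical):
#!/bin/bash
#########################################################################
# install_build_env.sh -- hypothetical sketch of a golden-master installer
# Unpacks an archived toolchain and records its version for later checks.
set -e

TOOLCHAIN_VER="6.3"                                    # assumed version
TOOLCHAIN_TAR="tools/arm-gcc-$TOOLCHAIN_VER.tar.bz2"   # assumed archive location
INSTALL_DIR="$HOME/toolchains/arm-gcc-$TOOLCHAIN_VER"

mkdir -p "$INSTALL_DIR"
tar -xjf "$TOOLCHAIN_TAR" -C "$INSTALL_DIR" --strip-components=1

# Record the installed version so the build scripts can verify it later
echo "$TOOLCHAIN_VER" > "$INSTALL_DIR/VERSION"
echo "Toolchain $TOOLCHAIN_VER installed to $INSTALL_DIR"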
But its usefulness degrades over time. Developers modify their environment after installation. The build scripts can be easily updated locally via pulls from the repository, but the toolchain may not update in as streamlined a fashion.
And again, we try stuff.
Virtual Machines
There is also the not-so-little matter of the user’s host environment. In some cases, we’re all using different OSes. In others, we’re all using the same OS, but the inevitable update and configuration nuances create subtly different build environments between each developer’s machine and the production build machine.
Another reasonable practice is to use an officially supported Virtual Machine that includes the supported toolchain and build environment. That helps solve the host environment differences and even liberates the user from a particular host OS. But it opens up a new set of challenges because the VM is only a copy. As soon as the user copies it to their local system, they’ve forked it, and it can start to deviate. The VM goes down its own path the first time the developer starts whacking at the build environment to debug issues. It’s essentially just another machine the developer has to manage.
The DevOps Solution — Common Build Environment
Regardless of how good the organization’s configuration management is and how disciplined a developer is, with the practices described above a developer’s local environment can only ever approximate the production environment.
However, we’ve solved the problem in our organization for both build configuration management and local build versus production build. We did so by applying the modern DevOps concept of containers.
We create production build containers with the golden-master toolchain and access to the necessary repositories and build scripts. All developers, testers, and managers have access to these containers and use them locally to build. We generically call these containers a Common Build Environment (CBE).
A CBE is not merely similar to the production environment; it is the production environment. Therein lies the magic. Since we first deployed the CBE to our firmware team several years ago, we’ve enjoyed 100% consistency between local development and the production build. Not a single build failure has been attributed to the build environment.
Here’s how a CBE is deployed both locally and to the production build environment.
Another benefit of a CBE is that it can liberate developers from a particular platform, at least for the build. Docker containers are supported on Windows, Linux, and macOS.
If your build is for an embedded target processor, there is a good chance that you can create some flavor of Linux container for that target. However, even if you require Windows (*cough*…ARM Keil…*cough*), you can still build a Windows CBE. The downside is that a Windows host system is required to execute Windows containers. However, you may work around this by using a Windows VM on your Host OS and then executing the CBE build from within that VM.
How to Create Your CBE and its Workflow
You’ll need some development work and configuration management to make the CBE useful for the development team. The benefit is that everything required for a CBE can be put into a source repository and a container repository, so it is always traceable and reproducible.
What You Need:
- Container in a container repository with the toolchain and 3rd party utilities (Dockerfile and/or container image)
- User build scripts/project and utilities (i.e., make, autotools, custom, etc.)
- Docker installation on the host machine
The CBE Container
Create your container, and install the toolchain and any 3rd party, non-custom, build-related utilities (note that these are NOT the “build” scripts/utilities themselves). The build-related utilities needed in a CBE are applications such as lint, statistics gatherers, linker and image tools, 3rd party static analysis, etc.
We found that in some cases, it was easier to start from a prototype container image (rather than work from the Dockerfile) and use it interactively to install the toolchain and utilities. That provided an easy and quick method for testing the install. We created the Dockerfile after we were satisfied with the container we built interactively.
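As a rough sketch of that interactive approach (the base image, package names, and registry below are illustrative assumptions, not our actual setup):
#########################################################################
# Sketch: build a prototype CBE interactively, then snapshot it.
# Start from a base image and get a shell inside it.
docker run -it --name cbe-proto ubuntu:14.04 /bin/bash
#   ...inside the container, install the toolchain and utilities, e.g.:
#   apt-get update && apt-get install -y make dos2unix gcc-arm-none-eabi
#   ...test a build, then exit.

# Snapshot the result as a candidate CBE image for further evaluation.
docker commit cbe-proto my-registry.local/cbe-arm-gcc6:prototype
docker rm cbe-proto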
You can use a single CBE that contains all the toolchains for the various targets you support, or separate CBEs, each with a single toolchain. The choice is yours, with benefits and drawbacks to each. We have chosen to use many CBEs, each with a particular target toolchain. This keeps every CBE very stable; once created, they rarely require updates, which helps with traceability and with recreating previous releases.
This diagram shows the basics of how we configuration-manage the CBEs themselves.
We create a CBE per target processor toolchain and then version that CBE with an image tag.
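In practice, that amounts to a tag-per-version scheme along these lines (the registry host and image names are hypothetical):
#########################################################################
# Sketch: one CBE image per target toolchain, versioned with tags.
docker build -t my-registry.local/cbe-arm-gcc6:1.2 .
docker push my-registry.local/cbe-arm-gcc6:1.2

# A separate CBE for a different target toolchain, versioned independently
docker build -t my-registry.local/cbe-arm-keil5:2.0 .
docker push my-registry.local/cbe-arm-keil5:2.0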
You must be careful about placing IP or internal repository access credentials into your containers. If you do, you should use an internal container repository. If you don’t have any IP or credentials in your CBE, then you can use the public Docker Hub. We don’t have any IP in our containers, but we do have SSH keys for accessing our internal repositories. Therefore, we host our containers inside our network.
Build Project and Scripts
These are the Makefiles, autotools, Bazel, Yocto, etc., or custom build scripts required to compile, link, and package your embedded image files.
Use good software design principles that include encapsulation and interface consideration. Encapsulation is essential since you will export the source workspace to the CBE to execute the build.
We have a relatively easy build and use a combination of Makefiles with some bash scripts on top.
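For context, “bash on top of make” means a thin wrapper roughly like this (simplified sketch; the targets and artifact paths are assumptions):
#!/bin/bash
#########################################################################
# build.sh -- simplified sketch of the thin wrapper over make
set -e

TARGET=${1:-release}                     # e.g., release, debug, clean
JOBS=$(nproc 2>/dev/null || echo 4)

echo "=== Building target: $TARGET"
make -j"$JOBS" "$TARGET"

# Collect the outputs into a single, well-known location
mkdir -p output
cp build/*.elf build/*.map output/ 2>/dev/null || true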
User Level Scripts and Utilities: Working with the Container
Here is where you’ll find most of the new work required to effectively use a CBE in your build workflow.
You will need to create a set of utilities/scripts that allow your developers to use the CBE easily. Ideally, a developer doesn’t even know that the build occurs in a CBE container. It appears, whether via CLI or IDE, that the build is native to the host system.
Your requirements are as follows:
- The speed of the build must be on par with a local host OS build.
- All build options (i.e., CLI options, targets, flags, etc.) must be accessible as if the build was local.
- All build stdout, stderr, and log files must be presented just as if the build was local.
- All build artifacts, including image files, debug symbols, linker maps, etc, must be deposited into the same location as if the build was local.
Here is the basic user level script workflow for using a CBE:
This workflow assumes the following (a simple pre-flight check covering both is sketched after the list):
- The user either has internet connectivity or already has the CBE locally. However, one of the benefits is that the internet is not required once the CBE is available locally.
- Docker is present and executing on the developer’s system.
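A pre-flight check along these lines covers both assumptions (sketch; $CBE is whatever image name your wrapper uses):
#########################################################################
# Sketch: verify Docker is running and the CBE image is available locally,
# pulling it only when it is missing (which requires connectivity).
if ! docker info > /dev/null 2>&1; then
    echo "ERROR: Docker is not running on this machine" >&2
    exit 1
fi

if ! docker image inspect "$CBE" > /dev/null 2>&1; then
    echo "=== CBE image not found locally, pulling $CBE..."
    docker pull "$CBE"
fi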
When we initially rolled out the CBE, we did it for a single target: an ARM part built with GCC. The build was make with some bash on top for usability (referred to below as build.sh).
To support the developers’ use of the CBE, we created another bash wrapper that implements the workflow above. We’ll refer to it as cbe_build.sh from here on.
Here are the main guts of that initial wrapper. The CBE is a stripped-down Ubuntu 14.04 LTS image containing the ARM GCC 4.9 and 6.3 series toolchains. The developer specifies the toolchain version as a command-line argument to this script.
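Invocation looks roughly like this (the option names here are illustrative, not the exact flags we ship):
# Build a release image with the GCC 6.3 toolchain
./cbe_build.sh --toolchain 6.3 release

# Build a debug image with the GCC 4.9 toolchain
./cbe_build.sh --toolchain 4.9 debug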
Overall Structure of cbe_build.sh
######################################################################
# Main script
main()
{
    cmdline $ARGS
    get_container
    # Grab the ID of the most-recently-created container
    DOCKER_PID=`docker ps -q -n 1`
    clean
    copy_source_code
    build
    get_artifacts
    log "=== Destroying container..."
    docker rm -f $DOCKER_PID
}
Get the Container
Since we host our containers internally, we use a service account to access and are not very strict with the credentials.
#########################################################################
# get_container()
#
# Pulls latest version of CBE container
get_container()
{
    log "=== Pulling latest version of container..."
    # No IP, can be loose with credentials
    docker login -u $UNAME -p $PWORD
    docker pull $CBE
    # Launch the container, give it a no-op command to run so it will stop
    # quickly and wait for us.
    echo "echo" | docker run -i $CBE
}
Copy the Source Code
Note: We copy the source into the container rather than share a mounted volume due to the unreliability of shared mounts on a Windows Host OS. If your Host OS fleet is all Linux or macOS, you can skip this step and share the volume (see the mount example after the function below).
When copying the source code, also copy in your build tools/scripts. In our case, our build scripts are part of the source tree because we use Makefiles with some bash on top.
Note that our source code base for this target is very small (< 100K lines of code); therefore, we copy the entire source tree into the CBE.
We do have to hack the line endings if the developer is on Windows. I’m sure there is a more elegant solution to that part.
#########################################################################
# copy_source_code()
#
# Copies the source code from Host OS into the container and
# ensures proper line endings
#
copy_source_code()
{
    # Copy user's sandbox into container FS
    log "=== Copying source files..."
    docker cp ./ $DOCKER_PID:/build
    # if windows, change the EOL's in the container
    log "=== OS = ${OS}"
    if [[ ${OS} =~ .*MINGW.* ]] || [[ ${OS} =~ .*CYGWIN.* ]]
    then
        log "=== Changing EOL's to IX style"
        container_run $DOCKER_PID "find . -type f -exec dos2unix -q {} {} ';'"
    else
        log "=== EOL change NOT required"
    fi
}
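For comparison, if your hosts are all Linux or macOS, the copy (and the later copy-back of artifacts) can be replaced with a bind mount along these lines (sketch):
#########################################################################
# Sketch: Linux/macOS alternative -- mount the workspace instead of copying.
# The container sees the live source tree at /build, so artifacts land
# directly in the host workspace and no copy-back is needed.
docker run --rm -v "$(pwd)":/build -w /build/tools "$CBE" ./build.sh $ARGS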
Build
I’ve left out some $ARGS preprocessing for brevity. That preprocessing sets the toolchain path and massages the $ARGS variable to ensure the build options (e.g., the target) are correct. As mentioned previously, a bash script, build.sh, sits on top of make to perform the build inside the CBE.
#########################################################################
# container_run (DOCKER_PID) (command)
#
# Runs (command) in the stopped foreground container with pid (DOCKER_PID)
# by piping it into stdin of "docker start -i (DOCKER_PID)"
container_run()
{
    if [ -z "$2" ]
    then
        log_error "container_run(): missing parameter"
        log_error "Usage: container_run (DOCKER_PID) (command string)"
        return 1
    fi
    echo $2 | docker start -i $1
}
##########################################################################
# build()
#
# Runs the build command and captures time information
#
build()
{
    # Run script in container
    log "=== Building with CBE..."
    bdstart=$(date +%s)
    log " - build ARGS: $ARGS"
    # Note that toolchain path is set by preprocessing
    container_run $DOCKER_PID "export PATH=$toolchain:\"$PATH\" && cd /build/tools && ./build.sh $ARGS"
    bdstop=$(date +%s)
    BDCOUNT=$((bdstop-bdstart))
}
Get the Artifacts
This step requires precision to keep the build time with CBE on par with a local build. Copy ONLY what is absolutely necessary from the CBE back to the host OS. We use some logic to determine if the build was a success or failure and modify the behavior accordingly.
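The listing below is a simplified sketch of that step; the exact artifact paths and the success/failure check are assumptions, but the core of it is a targeted docker cp back to the host:
#########################################################################
# get_artifacts()
#
# Copies ONLY the required build outputs from the container back to the
# Host OS. Paths and the success check below are illustrative.
#
get_artifacts()
{
    log "=== Retrieving artifacts from CBE..."
    mkdir -p ./output
    if docker cp $DOCKER_PID:/build/output/. ./output > /dev/null 2>&1; then
        log "=== Artifacts copied to ./output"
    else
        log_error "=== Build failed or produced no artifacts"
    fi
}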
Integration with an IDE
We integrated the CBE build with a few IDEs, but it requires another custom-developed utility. Unfortunately, each IDE is a different animal, and we haven’t found a common method that works for all of them. But here are the concepts to understand (a sketch of such a wrapper follows the list):
- How the build is invoked and how the targets and options are specified
- Where the debug symbols and other files/info required by the debugger must end up
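As one example of the shape this takes, a thin adapter that an IDE’s external-build command can call might look like this (the configuration names, toolchain flags, and paths are hypothetical):
#!/bin/bash
#########################################################################
# Sketch: adapter between an IDE's build configurations and cbe_build.sh.
# Maps the IDE configuration to build options, then leaves the debug
# symbols where the IDE's debugger expects to find them.
IDE_CONFIG=${1:-Debug}

case "$IDE_CONFIG" in
    Debug)   ./cbe_build.sh --toolchain 6.3 debug ;;
    Release) ./cbe_build.sh --toolchain 6.3 release ;;
    *)       echo "Unknown configuration: $IDE_CONFIG" >&2; exit 1 ;;
esac

# Copy the ELF with debug symbols to the path the IDE project references
cp output/*.elf ide_project/"$IDE_CONFIG"/ 2>/dev/null || true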
Benefits and Challenges
Here are what we’ve experienced as both benefits and challenges to the CBE approach.
Benefits
- All users are building in the production environment.
- Configuration management is on a single item, the CBE, rather than spread across the user base.
- All users get any updates to the build environment automatically.
- Works with the internet on or off (assuming the prerequisites are met).
- Host OS independence.
Challenges
- Docker must be installed on all developer machines. Not much of an issue once it’s in place.
- Enabling developers to monkey around with the tools and scripts. This is a legitimate challenge because, as embedded developers, we all need to (or like to) tinker with the tools and options sometimes. We solve this by showing developers how to work inside a CBE interactively (see the sketch after this list), or by having the developer install the build environment locally and work there until happy.
- Windows containers have portability limitations: they must execute on a Windows Host OS. Therefore, the developer must use Windows natively or run a Windows VM to execute the build.
- Licensed toolchains require floating licenses. A licensed toolchain ropes you into a floating-license scenario, which is typically more expensive. Many organizations already work this way, so it may not be a problem.
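For reference, working interactively inside a CBE is just a matter of starting it with a shell and your workspace attached (the image name and mount below are illustrative; on a Windows host you may prefer docker cp over the bind mount):
#########################################################################
# Sketch: interactive session inside a CBE for experimenting with tools.
docker run -it --rm -v "$(pwd)":/build -w /build my-registry.local/cbe-arm-gcc6:1.2 /bin/bash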
Summary
The Common Build Environment has eliminated the “doesn’t build for me” and “doesn’t build in production” problems for embedded target builds in our organization. The CBE is a Docker container with the golden-master embedded toolchain and build environment. Since all developers and the production build server use the same Docker container, the developers are always building in the production environment.