Upgrading to Node 6 on Elastic Beanstalk
And speeding up npm install by 95%!
Friday, Dec 2nd, 2016
Mixmax is a communications platform that brings professional communication & email into the 21st century.
In case you haven’t heard, Node 6 went LTS mid-October, with AWS Elastic Beanstalk adding support at the end of the month. Since Node 6 promised support for 99% of ES6 features as well as a host of performance and security improvements, we moved quickly to adopt it. We found it to be very easy to upgrade locally—we only had to upgrade a few native dependencies to their latest version to pick up new bindings, and did not have to change any code. Kudos to the Node Foundation for a stable release and the community for embracing Node 6 well in advance of LTS.
It was not as easy to upgrade Elastic Beanstalk, however—upgrading the platform version persistently resulted in stuck deploys and rollbacks. Debugging this required exploring Elastic Beanstalk’s inner workings, but we ultimately made fixing it as simple as installing a Node package. And not only did that package enable us to upgrade Elastic Beanstalk to Node 6, but it also sped up npm install by 95%. Here’s how we did it.
What went wrong
We initially tried to upgrade the platform version in place using the upgrade button on our application’s dashboard. But the configuration deploy never finished—boxes would just time out. We then cloned the environment with the latest platform. This initially succeeded, only for further deploys to fail.
Watching these deploys fail was agonizing. Elastic Beanstalk’s dashboard gives very little insight into what’s going on during a deploy. But you can easily SSH into the EC2 instances and tail the deployment logs. Using the EB CLI tool:
    eb ssh -i <instance id>
    tail -f /var/log/eb-activity.log
This revealed that the boxes were getting stuck running npm install:
    [2016-12-02T01:17:44.287Z] INFO  - [Application update app-91d6-161202_011650-stage-161202_011650@28/AppDeployStage0/AppDeployPreHook/50npm.sh] : Starting activity...
We were perplexed. EB’s docs said that platform 3.1.0 was using npm 2.15.5, same as the previous platform. What was the difference?
We quickly suspected that EB’s docs were wrong, since Node 6.9.1 usually ships with npm 3.10.8. We confirmed this on an EC2 instance:
    [ec2-user@ip-10-20-4-104 ~]$ export PATH=/opt/elasticbeanstalk/node-install/node-v6.9.1-linux-x64/bin:$PATH
    [ec2-user@ip-10-20-4-104 ~]$ /opt/elasticbeanstalk/node-install/node-v6.9.1-linux-x64/bin/npm -v
    3.10.8
We upgraded to npm 3 locally and timed npm install in several of our projects. We found that npm 3.10.8 consistently takes about 2x longer to run npm install than npm 2.15.5. And on the resource-constrained EC2 instances, it was taking even longer, much longer than the deploy timeout. Cloning the environment appeared to fix the problem only because EB uses a longer timeout when creating an environment than when deploying configuration changes to an existing environment.
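A comparison like this can be reproduced with a small timing helper. The sketch below is an assumption about how one might measure it, not the exact procedure we used; `elapsed_seconds` is a hypothetical name, and it reports whole seconds only.

```shell
# Hypothetical helper: print how many whole seconds a command takes.
# The function body runs in a subshell so its variables do not leak.
elapsed_seconds() (
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  status=$?
  end=$(date +%s)
  echo "$(( $end - $start ))"
  # Pass the command's own exit status back to the caller.
  exit "$status"
)

# Example, run inside a project once per npm version being compared:
#   rm -rf node_modules && elapsed_seconds npm install
```

Removing node_modules first matters: without it, the second run mostly measures a no-op install rather than the npm version.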
So the fix was going to involve downgrading npm 3 to npm 2… on every EC2 instance, across all of our services, whenever Elastic Beanstalk deployed to a new instance. How could we automate this?
Luckily, Elastic Beanstalk offers a way to hook into its deploy process: by adding configuration files to a folder named .ebextensions, you can add scripts for EB to run during deploy and even overwrite its default scripts.
This let us make an ebextension file that would install an npm-downgrading script. (Note: these intermediate scripts are for illustration, not use, since the ultimate set of scripts is way better.)
    # EB runs deploy scripts in alphabetical order http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/ebextensions.html,
    # Node is installed using a script called "40install_node.sh", and `npm install` is
    # run using a script called "50npm.sh", so we downgrade npm in a script called
    # "45npm_downgrade.sh".
    files:
      "/opt/elasticbeanstalk/hooks/appdeploy/pre/45npm_downgrade.sh":
        mode: "000755"
        owner: root
        group: users
        content: |
          #!/usr/bin/env bash

          EB_NODE_VERSION=$(/opt/elasticbeanstalk/bin/get-config optionsettings -n aws:elasticbeanstalk:container:nodejs -o NodeVersion)

          # Make sure Node binaries can be found (required to run npm).
          # And this lets us invoke npm more simply too.
          export PATH=/opt/elasticbeanstalk/node-install/node-v$EB_NODE_VERSION-linux-x64/bin:$PATH

          if [ "$(npm -v)" != "2.15.9" ]; then
            echo "Downgrading npm to 2.15.9..."
            npm install npm@2.15.9 -g
          else
            echo "npm already at 2.15.9"
          fi
But now we had the challenge of distributing this file across our 14 microservices. Were we going to copy-and-paste it? No way!
A while back, we made an npm package precisely to solve the problem of distributing files like this ebextension. The package is called install-files, and what it does is allow another package to install files into its host package’s directory.
Let’s say that my-microservice installs the eb-fix-npm package. The eb-fix-npm package can then call install-files source from an install script to copy the contents of its source directory into my-microservice’s directory.
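To make the effect concrete, here is a rough shell illustration of what such a copy amounts to. This is not how install-files is implemented; `copy_into_host` and both of its arguments are hypothetical, and the paths in the comments are examples only.

```shell
# Illustration only: mimics the *effect* described above, copying the
# contents of a dependency's directory into the package that installed it.
copy_into_host() (
  source_dir="$1"   # e.g. a "source" directory shipped inside the dependency
  host_dir="$2"     # e.g. the root of my-microservice
  mkdir -p "$host_dir"
  # The trailing "/." copies the directory's contents, dotfiles included,
  # so a folder such as .ebextensions ends up in the host package.
  cp -R "$source_dir/." "$host_dir/"
)

# Example:
#   copy_into_host node_modules/eb-fix-npm/source .
```

The value of doing this through npm rather than by hand is that the copied files are versioned with the package, so updating the dependency updates the files.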
This tool lets you share files between Node projects the same way you would share code, using npm and declarative package names/versions. And with the ability to quickly distribute changes to the script, we got ambitious.
Simply by downgrading to npm 2, we were able to upgrade our Elastic Beanstalk environments to Node 6. But, in the process of investigating Elastic Beanstalk’s npm script, we noticed several inefficiencies.
First, it installed Node modules afresh on every deploy. So we introduced a cache:
    files:
      "/opt/elasticbeanstalk/hooks/appdeploy/pre/46cache_node_modules.sh":
        mode: "000755"
        owner: root
        group: users
        content: |
          #!/usr/bin/env bash

          # Cache Node modules in /var.
          if [ ! -d "/var/node_modules" ]; then
            mkdir /var/node_modules ;
          fi

          ln -s /var/node_modules /tmp/deployment/application/
By comparing timestamps when tailing EB’s activity log, we could see that EB’s npm script went from taking ~4m to ~1m: a 75% speedup.
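The caching idea can be tried outside Elastic Beanstalk with a variant that is safe to run more than once. This is a sketch, not the script we shipped: `link_module_cache` is a hypothetical name, the directories are passed in rather than hard-coded, and it assumes the application directory does not already contain a real node_modules folder.

```shell
# Hypothetical, rerunnable variant of the cache step: keep modules in a
# persistent directory and expose them to the app through a symlink.
link_module_cache() (
  cache_dir="$1"   # e.g. /var/node_modules
  app_dir="$2"     # e.g. /tmp/deployment/application
  mkdir -p "$cache_dir" "$app_dir"
  # -f replaces an existing link; -n stops ln from following an existing
  # link into the cache and creating a nested link there.
  ln -sfn "$cache_dir" "$app_dir/node_modules"
)

# Example:
#   link_module_cache /var/node_modules /tmp/deployment/application
```

Because the cache directory survives between deploys, npm install only has to fetch modules that changed.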
Then we noticed that EB was calling npm rebuild after installing. But modules are automatically built for the appropriate architecture when installing! The only time you need to rebuild is when the architecture changes, which happens on configuration deploy. And on configuration deploy, EB was trying to install new modules, even though package.json doesn’t change on configuration deploy, only on application deploy. So: no npm rebuild on application deploy, and no npm install on configuration deploy:
    files:
      "/opt/elasticbeanstalk/env.vars":
        mode: "000775"
        owner: root
        group: users
        content: |
          # Exports variables for use by the other scripts below.
          EB_NODE_VERSION=$(/opt/elasticbeanstalk/bin/get-config optionsettings -n aws:elasticbeanstalk:container:nodejs -o NodeVersion)
          export PATH=/opt/elasticbeanstalk/node-install/node-v$EB_NODE_VERSION-linux-x64/bin:$PATH

      "/opt/elasticbeanstalk/hooks/appdeploy/pre/50npm.sh":
        mode: "000755"
        owner: root
        group: users
        content: |
          #!/usr/bin/env bash
          #
          # Note that this *overwrites* Elastic Beanstalk's default 50npm.sh script.
          . /opt/elasticbeanstalk/env.vars
          cd /tmp/deployment/application && npm install --production

      "/opt/elasticbeanstalk/hooks/configdeploy/pre/50npm.sh":
        mode: "000755"
        owner: root
        group: users
        content: |
          #!/usr/bin/env bash
          #
          # Note that this *overwrites* Elastic Beanstalk's default 50npm.sh script.
          . /opt/elasticbeanstalk/env.vars
          cd /tmp/deployment/application && npm rebuild --production
During application deploy, our replacement npm script now took ~10 seconds: a 95% speedup compared to the initial duration.
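For anyone checking the percentages, the arithmetic is simple enough to sketch in shell. The durations are the approximate figures quoted in this post (~4m, ~1m, ~10s), and `speedup_percent` is a hypothetical helper name.

```shell
# Hypothetical helper: percentage of time saved going from $1 to $2 seconds.
# Shell arithmetic is integer-only, so the result is truncated.
speedup_percent() {
  echo "$(( ($1 - $2) * 100 / $1 ))"
}

speedup_percent 240 60   # ~4m down to ~1m  -> 75
speedup_percent 240 10   # ~4m down to ~10s -> 95 (truncated from about 95.8)
```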
Bonus: EB’s configuration deploy npm script doesn't actually do anything—it uses the wrong working directory. Our script actually rebuilds your modules if, for instance, you change your Node version.
One package to fix everything
So there you have it: you can unblock upgrading your Elastic Beanstalk environments to Node 6 and virtually eliminate npm install time by installing a single package, eb-fix-npm. After installation, you’ll effectively only have to npm install when Elastic Beanstalk spins up new EC2 instances, without the cache. But we think we have a way to get rid of this hiccup too. Stay tuned…