Node.js 101 : Part #4 – Basic Deployment and Hosting with Azure, Heroku, and AppHarbor

Following on from my recent post about doing something this year, I’m committing to doing 12 months of “101”s; posts and projects themed at beginning something new (or reasonably new) to me.

January is all about node, and I started with a basic intro, then cracked open a basic web server with content-type manipulation and basic routing, and the last one was a basic API implementation.

AppHarbor, Azure, and Heroku

Being a bit of a cocky git, I said on Twitter at the weekend:

It’s not quite that easy, but it’s actually not far off!

Deployment & Hosting Options

These are not the only options, but just three that I’m aware of and have previously had a play with. A prerequisite for each of these – for the purposes of this post – is using git for version control, since AppHarbor, Azure, and Heroku all support git hooks and remotes; essentially, you can push your changes directly to your host, which will automatically deploy them (if pre-checks pass).

I’ll be using the set of files from my previous API post for this one, except that the api key now needs to come from a querystring parameter instead of being passed in as a command line argument.

The initial files are the same as the last post and can be grabbed from github.

Those changes are:

app.js (removed lines about getting value from command line):

[js]var server = require("./server"),
    router = require("./router"),
    requestHandlers = require("./requestHandlers");

// only handling GETs at the moment
var handle = {};
handle["favicon.ico"] = requestHandlers.favicon;
handle["product"] = requestHandlers.product;
handle["products"] = requestHandlers.products;

var port = process.env.PORT || 3000;
server.start(router.route, handle, port);[/js]

server.js (added in querystring param usage):

[js highlight="7"]var http = require("http"),
    url = require("url");

function start(route, handle, port) {
  function onRequest(request, response) {
    var pathname = url.parse(request.url).pathname;
    var apiKey = url.parse(request.url, true).query.key;
    route(handle, pathname, response, apiKey);
  }

  http.createServer(onRequest).listen(port);
  console.log("Server has started listening on port " + port);
}

exports.start = start;[/js]

With the second argument to url.parse set to true, ".query" returns a parsed querystring object, which means I can get the parameter "key" by using ".key" instead of something like ["key"].
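To make that concrete, here's a quick sketch of what the parsed object looks like (the URL is just for illustration):

[js]var url = require("url");

var parsed = url.parse("/product?key=ABC123", true);

console.log(parsed.pathname);     // "/product"
console.log(parsed.query.key);    // "ABC123" - dot notation
console.log(parsed.query["key"]); // "ABC123" - equivalent indexer syntax[/js]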

Ideal scenario

In the perfect world all I’d need to do is something like:
[code]git add .
git commit -m "initial node stuff"
git push {azure/appharbor/heroku/whatever} master
…..
done
…..
new site deployed to blahblah.websitey.net
…..
have a lovely day
[/code]
and I could pop off for a cup of earl grey.

In order to get to that point there were a few steps I needed to take for each of the three hosts.

AppHarbor

:: AppHarbor home page

Getting started

First things first; go and sign up for a free account with AppHarbor.

Then set up a new application in order to be given your git remote endpoint to push to.

I’ve previously had a play with AppHarbor, but this is the first time I’m using it for more than just a freebie host.

Configuring

It’s not quite as simple as I would have liked; there are a couple of things to bear in mind. Although AppHarbor supports node deployments, it is primarily a .NET hosting service and uses Windows hosting environments (even though they’re on EC2 as opposed to Azure). Running node within IIS means you need to supply a web.config file and give it some IIS-specific info.

The config file I had to use is:
[xml highlight="3,9"]<configuration>
  <system.web>
    <compilation batch="false" />
  </system.web>
  <system.webServer>
    <handlers>
      <add name="iisnode" path="app.js" verb="*" modules="iisnode" />
    </handlers>
    <iisnode loggingEnabled="false" />

    <rewrite>
      <rules>
        <rule name="myapp">
          <match url="/*" />
          <action type="Rewrite" url="app.js" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>[/xml]

Most of that should be pretty straightforward (redirect all calls to app.js), but notice the lines about compilation and logging; the account under which the AppHarbor deployment process runs for node projects doesn’t have access to the filesystem, so it can’t create anything in a “temp” dir (precompilation) nor write any log files upon errors. As such, you need to disable both.

You could instead enable file system access and disable precompilation within your application’s settings – as far as I can tell, this does the same thing.

:: AppHarbor application settings

Deploying

Commit that web.config to your repo, add a remote for AppHarbor, then push to it. Any branch other than master, default, or trunk needs a manual deploy instead of happening automatically, but you can specify the branch name to track within your AppHarbor application settings; I put in the branch name “appharbor” that I’ve been developing against, and now it automatically deploys when I push that branch or master, but not any others.
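The full sequence, then, is something like this (the remote URL comes from your AppHarbor application page):

[code]git add web.config
git commit -m "add web.config for IIS/node"
git remote add appharbor {your appharbor git url}
git push appharbor master[/code]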

You’ll see your dashboard update as it deploys (automatically, if it’s a tracked branch):

:: AppHarbor deployment dashboard

And then you can browse to your app:

:: The deployed app running on AppHarbor

Azure

:: Azure home page

Getting started

Again, the first step is to go and sign up for Azure – you can get a free trial, and if you only want to host up to 10 small websites then it’s completely free.

You’ll need to set up a new Azure website in order to be given your git remote endpoint to push to.

Configuring

This is pretty similar to the AppHarbor process in that Azure Websites sit on Windows and IIS, so you need to define a web.config to set up IIS for node. The same web.config works as for AppHarbor.

Deploying

Although you can push to AppHarbor from any branch and it will only deploy automatically from the specific tracked branch, you can’t manually deploy from within Azure; you either need to use [code]git push azure {branch}:master[/code] (assuming your remote is called “azure”) or define your tracked branch in the configuration section:

:: Azure website configuration
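For example, pushing a hypothetical working branch called “dev” to Azure’s master:

[code]git remote add azure {your azure git url}
git push azure dev:master[/code]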

Following a successful push, your dashboard updates as the site deploys:

:: Azure deployment dashboard

And then your app is browsable:

:: The deployed app running on Azure

Heroku

:: Heroku home page

Getting started

Sign up for a free account.

Configuring

Heroku isn’t Windows based; it’s aimed at hosting Ruby, Node.js, Clojure, Java, Python, and Scala. What this means for our node deployment is that we don’t need a web.config to get the application running on Heroku. It’s still running on Amazon’s EC2, as far as I can tell.

However, we do need to jump through several other strange hoops:

Procfile

The Procfile is a list of the “process types in an application. Each process type is a declaration of a command that is executed when a process of that process type is executed.” These can be arbitrarily named, except for the “web” one, which handles HTTP traffic.

For node, this Procfile needs to be:

Procfile:
[code]web: node app.js[/code]

Should I want to pass in command line arguments, as in the previous version of my basic node API code, I could do it in this file, e.g. [code]web: node app.js mYAp1K3Y[/code]
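For reference, that argument would be picked up inside app.js with something like:

[js]// process.argv is [node, script, arg1, ...], so the key is the third element
var apiKey = process.argv[2]; // "mYAp1K3Y" when run as: node app.js mYAp1K3Y[/js]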

Deploying

Heroku Toolbelt

There’s a command line tool which you need to install in order to use Heroku, called the Toolbelt; this is the Heroku client which allows you to do a lot of powerful things from the command line, including scaling up and down, and starting and stopping your application.

Instead of adding heroku as a git remote yourself, you need to open a command line in your project’s directory and run [code]heroku login[/code] and then [code]heroku create[/code]
Your application space will now have been created within Heroku automatically (no need to log in and create one first), as well as your git remote; this will have the default name of “heroku”.

Deploying code is still the same as before [code]git push heroku master[/code]

With Heroku you do need to commit to master to have your code built and deployed; I couldn’t find anywhere to specify a different tracking branch.

Before that we need to create the last required file:
package.json:
[js]{
  "name": "rposbo-basic-node-hosting-options",
  "author": "Robin Osborne",
  "description": "the node.js files used in my blog post about a basic node api being hosted in various places (github, azure, heroku)",
  "version": "0.0.1",
  "engines": {
    "node": "0.8.x",
    "npm": "1.1.x"
  }
}[/js]

This file is used by npm (node package manager) to install the module dependencies for your application, e.g. express, jade, stylus. Even though our basic API project has no specific dependencies, the file is still required by Heroku in order to define the version of node and npm to use (otherwise your application simply isn’t recognised as a node.js app).
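If the project did have dependencies, they’d be declared in a “dependencies” section; a hypothetical example (the express version is purely illustrative):

[js]{
  "name": "rposbo-basic-node-hosting-options",
  "version": "0.0.1",
  "dependencies": {
    "express": "3.0.x"
  },
  "engines": {
    "node": "0.8.x",
    "npm": "1.1.x"
  }
}[/js]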

Something to consider is that Heroku doesn’t necessarily have the same version of node installed as you might; I defined 0.8.16 and received an error upon deployment which listed the available versions (the highest at the time of writing being 0.8.14). I decided to define my required version as “0.8.x” (any version with major 0, minor 8).

However, if you define a version of node in the 0.8.x series you must also define the version of npm. A known issue, apparently. Not only that, it needs to be specifically “1.1.x”.

Add these settings into the “engines” section of the package.json file, git add, git commit, and git push to see your dashboard updated:

:: Heroku deployment dashboard

And then your app – with a quite random URL! – is available:

:: The deployed app running on Heroku

If you have problems pushing due to your existing public keys not being registered with Heroku, run the following to import them: [code]heroku keys:add[/code]

You can also scale your number of instances up and down using the Heroku client: [code]heroku ps:scale web=1[/code]

Debugging

The Heroku Toolbelt is a really useful client to have; you can check your logs with [code]heroku logs[/code] and you can even leave a trace session open using [code]heroku logs --tail[/code], which is amazing for debugging problems.

The error codes you encounter are all listed on the Heroku site, as is all of the information on using the Heroku Toolbelt logging facility.

A quick one: if you see the error “H14” then, although your deployment may have worked, it hasn’t automatically started a web dyno – you can see this where the logs say “dyno=” instead of “dyno=web.1”; you just need to run the following command to start one up: [code]heroku ps:scale web=1[/code]

Also – make sure you’ve created a Procfile (with a capitalised “P”) and that it contains [code]web: node app.js[/code]

Summary

Ok, so we can now easily deploy and host our API. The files that I’ve been working with throughout this post are on github; everything has been merged into master (both heroku files and web.config) so it can be deployed to any of these hosts.

There are also separate branches for Azure/AppHarbor and Heroku, should you want to check the different files in isolation.
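For reference, the complete Heroku flow from the steps above condenses to just a handful of commands (assuming the Procfile and package.json are already committed):

[code]heroku login
heroku create
git push heroku master
heroku ps:scale web=1
heroku logs --tail[/code]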

Next Up

Node packages!

Node.js @ UKWAUG: MS Cloud Day – Windows Azure Spring Release

The fourth session I attended was the highly energetic and speedy introduction to writing node.js and running it on Azure, presented by the author of Simple.Data and Simple.Web, and one of those voices of the developer community with a great JFDI attitude, Mark Rendle (@markrendle).

I’ve just recently got into node.js development myself and have been very much enjoying node, npm, express, stylus, and nib; there is a fantastic community and expanse of modules already, which can be a bit daunting.

During the session Mark’s short code example showed just how simple it can be to get up and running with node, and also how easy it is to deploy to Azure.

A nice comment was that we are on the road to “ecmascript harmony”! And that “Javascript is a great language so long as you ignore the 90% of it which coffeescript doesn’t compile to.”

It was a very fast-paced session; hopefully my notes still make sense though..

What the various aspects of Azure do

  • compute – web, worker, vm
  • websites – .net, node, php
  • storage – blob, tables (distributed nosql, like cassandra), queues
  • sql – sql azure, reporting
  • services – servicebus, caching, acs

What are the Cloud Service types used for

  • web roles – iis, for apps
  • worker – no iis, for running anything

How to peruse the contents of blob or table

General tips for developing sites for use in Azure

  • keep static content in blob storage
  • websites commit and deploy much faster than the cloud services commit and deploy process
  • azure/iis needs server.js, not app.js

How to run RavenDB in Azure

  • Spin up a vm and install it!! (this used to be a much trickier process, but the recent Azure update meant that the VM support is mature enough to allow the simpler solution)

Developing node.js

Use JetBrains WebStorm for debugging, or the wonderful online editor Cloud9IDE. Sublime Text 2 is a great editor for simple code requirements, and has great plugins for Javascript support. I also used this for taking all of the seminar notes, as it has a simple “zen” zero-distractions interface.

Next up – Hadoop and High Performance Computing

MongoDB @ UKWAUG: MS Cloud Day – Windows Azure Spring Release

My third session was about MongoDB and how you might implement it in Azure, presented by MongoDB’s own Gregor Macadam (@gregormacadam).

I only had limited knowledge of what MongoDB was before this session (a document based data store, much like CouchDB and other NoSQL variants), so given that this session appeared to be an intro to MongoDB as opposed to MongoDB on Azure then that suited me just fine!

Here are the basic notes I made during Gregor’s talk (although you may as well just go to MongoDB.org and read the intro..):

MongoDB uses sharding for write throughput.
The REST interface uses JSON as the data transport format.
Data is saved in BSON structure.

The db structure (usually?) consists of three nodes: a single primary and two replicated secondaries – these are referred to as a Replica Set.
A Replica Set has a single write node, with async replication to the other set members; reads can go to all of them.
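As a rough sketch of how that’s configured (the mongo shell is itself JavaScript; the host names here are hypothetical):

[js]// initiate a three-node replica set from the mongo shell
rs.initiate({
  _id: "myReplicaSet",
  members: [
    { _id: 0, host: "node1:27017" }, // one of these will be elected primary
    { _id: 1, host: "node2:27017" },
    { _id: 2, host: "node3:27017" }
  ]
});[/js]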

The write history (known as the oplog) is in the format "move from state A to state B", so as to avoid overwriting changed data.

If a write (to the primary) fails, an automatic election determines which remaining node becomes the new primary; usually this is the node with the latest data.

It can be configured to write to multiple hosts, but the write won’t return until all writes are completed.

An "arbiter" can be the tie breaker in determining the new primary node during election, and we can specify weighting for that process.

"Read" scales with more read nodes, "Write" scales with multiple read/write groups (replica sets) or sharding.

You need a config database to define key ranges for sharding etc.

The MongoS process runs on another node and knows which shard to write your data to.
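Roughly, and purely as an illustration (the database and shard key names are hypothetical), sharding is enabled from the mongo shell like so:

[js]// enable sharding for a database, then shard a collection on a key
sh.enableSharding("mydb");
sh.shardCollection("mydb.products", { productId: 1 });[/js]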

Updates are released on Windows and Linux at the same time.

Within Azure

Data is persisted in blob storage.
MongoDB runs in a worker role.
A page blob is used as an NTFS cloud drive (the data drive?).

The MongoS router process is required to load balance access to the correct node, rather than the Azure load balancer, which can end up sending the write request to a non-primary node.

The OS disk has caching enabled by default; the data disk doesn’t.

The code is open source and can be found on github, and issues can be raised on the Mongo Jira site.

You can sign up for a free Mongo Monitoring Service from 10gen.

The main point I took away from this is that it sounds like you need a large number of Azure VMs to get Mongo running: one for each node, one for each MongoS service, one for an arbiter (maybe more – I didn’t catch all of the details that were raised by a couple of good questions from the audience).

Although I have a big plan to use NoSQL for the front end of an ecommerce website, I don’t think that MongoDB’s Azure offering is mature enough yet. I’ll be looking into CouchDB and Raven initially and keeping an eye on MongoDB. (Interested in how to get Raven running on Azure? Wait for the next post!)

The slide deck from this session is here.

Next up – node.js

IaaS @ UKWAUG: MS Cloud Day – Windows Azure Spring Release

Infrastructure as a Service in Azure

Unfortunately, the earlier network disaster at the conference meant that this session seemed to have been cut short. This is a shame, as the Azure IaaS offering has really matured and I was looking forward to seeing how I can utilise the improved system.

Since it was a short one, the notes taken on what Microsoft’s own Michael Washam (@MWashamMS) talked about are limited. Here goes:

You can upload your own VHDs, which must be fixed disks, not dynamic.

Data disks are virtual HDs which can be attached to your VM; each data disk is up to 1TB!

Instance size / maximum number of data disks:

  • XS – 1
  • S – 2
  • M – 4
  • L – 8
  • XL – 16

So a single XL instance VM can have 16TB of HA storage attached!

Data disks live in blob storage.

Using port forwarding and configuring a Load Balanced Set allows you to set up a cluster/farm.

The load balancer functionality has custom probes; these look for an HTTP 200 on a health check page. The health check page can be custom and do custom logic (e.g., auth/db conn checks) to determine whether to return a 200 status or not.
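As a sketch of what that could look like in node (the checkDatabase helper is a hypothetical stand-in for real auth/db logic):

[js]var http = require("http");

// hypothetical health test - swap in real auth/db connectivity logic
function checkDatabase() {
  return true;
}

http.createServer(function (request, response) {
  if (request.url === "/healthcheck") {
    // custom logic gates the status code the load balancer probe sees
    response.writeHead(checkDatabase() ? 200 : 503);
    response.end();
  } else {
    response.writeHead(404);
    response.end();
  }
}).listen(process.env.PORT || 3000);[/js]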

Availability Sets ensure that not all the VMs in a set go down for the usual updates and patches at the same time; i.e., your load balanced endpoint will always have an active VM on the other end.

The Windows Azure Virtual Network allows, as mentioned in the Keynote, a VPN to be set up which can be patched into your on-premises router to act as if it’s on-prem itself.

The VPN can be configured/defined via an xml file. The creation of the VMs and their attached data disks can be scripted using the mature Azure PowerShell cmdlet library. Using these together, Michael was able to show how he could run a PowerShell script to spin up a farm of ten servers from a pre-existing VHD, attach a 1TB data disk to each, and assign them IP addresses within the configured VPN range.

He then downloaded the VPN settings from Azure in a format specific to a possible corporate router to effectively bring that new server farm into the on-premises network.

This automatic setup and configuration was really rushed through and quite tough to follow at times, probably due to the lack of time. There were some tantalising snippets of script on screen momentarily, and then it was all over.

My big take away from this session was the ability to automatically spin up a preconfigured set of servers to create a QA/dev/load test environment, do horrible things to it, get some reports generated, then turn it back off. Wonderful.

:: Michael with the money shot


Next up >> MongoDB

UKWAUG: MS Cloud Day – Windows Azure Spring Release

Some notes on the recent UK Windows Azure User Group (@ukwaug) Microsoft Cloud Day held in the Vue cinema in Fulham Broadway; a place close to my heart, as Fulham was the first place I rented in London well over a decade ago. The cinema wasn’t there then. Nor was the mall which houses the cinema.

Nor, more significantly, was Azure (pretty sure Microsoft was though..)

To be fair, even though Azure *was* around a couple of years ago, it certainly was not a first class citizen in my tekky box of tricks; more of a "this is pretty cool – think I’ll wait until it’s finished first."

Now it’s pretty much ready; not quite "stick a knife in it, comes out clean" ready, but more "better give it five more minutes" ready.*

The first Azure attempt had a reasonably complex and confusing Silverlight interface that attempted to convey the full capabilities of Azure. If you really gave it your attention then you could develop some really complex and clever solutions via cloud-based computing.
:: Silverlight portal

If you already had a chance to catch the recent Azure keynote then you’ll probably already know that Azure has a wonderful new HTML5 portal (ta-ta, Silverlight – though you can still get that version if you want), WebSites (sort of an extension/simplification of the Web Role cloud service), git support, tfs support, and loads more; I’m not going to list and explain everything new here, as I just wanted to cover some of the notes I made during the UKWAUG MS Cloud Day.
:: HTML5 portal

Given that the Spring release has been out a couple of weeks now, I’ve already had a chance to play with what I found to be the more interesting changes, namely: git support, tfs support, websites (sort of lightweight web roles), and support for non-.NET languages, such as Node.js.

Keynote

After a short intro from Andy Cross of Elastacloud (and co-founder of UKWAUG), we were presented with the trademark red polo shirt of Scott Guthrie, ready to show off the great new developments in the Spring Azure release.

Interesting points:

  • The Azure SDKs are published on github and they take pull requests, which makes the Azure SDKs actual OSS! Amazing!
  • The Azure servers themselves are located close to hydroelectric or wind powered generators. They are site managed by 1 person per 15k servers.
  • There is a 99.95% SLA, with a money back guarantee
  • The new release supports node.js, java, php, and python
  • The Azure SDKs run on linux, mac, and windows
  • There is a much simplified pricing calculator
    :: Old pricing calculator
    :: New pricing calculator

VMs

With regards to licensing for software installed on VMs in Azure (such as SQL): you can use the standard licensing from vendor, volume licensing/SA is also supported.

You can attach a "data drive" to vm, which is a virtual fixed disk of 1TB. You can attach multiple data drives and these can be RAID-ed should you want. The data stored on these is persistent and will survive a restart.

The VMs all have a D drive which is just a temporary swap file drive; the data on these is temporary and will not survive a restart.

One interesting point was raised by an audience member: why does the Windows Server 2012 VM in Azure have a floppy drive listed in "My Computer"?.. Scott couldn’t answer that one!

All endpoints on a VM are locked down initially and have to be explicitly opened up.

You can create an Availability Set; these are multiple VMs which will be guaranteed to be deployed on separate servers, separate racks, etc. This gives a sort of automatic "High" Availability.

There’s a great blog post by Maarten Balliauw about using Azure VMs to create a cloud-hosted web farm!

Since the VMs support other OSes, which also support these data drives etc, the drives can be formatted to whatever file format is required (EXT3, for example).

Scott then proceeds to use git bash to SSH into his Ubuntu server VM.

Within the new portal there is a "My Images" section under the VM area; you can upload your own VM image or use a capture/snapshot created from within Azure itself.

One thing that was wonderful for a real nerd like me (whose degree was based on Unix) is the Azure cmd line support!

From the command line you can run "azure vm list", which executes a REST query against your Azure management portal. Adding a "--json" flag returns json formatted data.
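For instance, both forms as mentioned in the session:

[code]azure vm list
azure vm list --json[/code]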

VPN support

From within the portal a VPN can be created by simply selecting New -> Network -> Address Range. This is virtualised so no conflicts exist between customers using the same address range.

It is then possible to download these VPN settings to make the Azure VPN link with your local network, making it work as if it’s just in the next room.

Supports UDP and TCP but not multicast.

There is hypervisor support for porting VMs to and from Azure.

VMs are set up with a persistent drive; these are HA drives accessing data from Azure storage. As such, they are always-on.

The High Availability is implemented with a layer of SSD and caching above the HA triplicate drives, plus continuous storage georeplication; this covers for natural disaster (though replicas always stay within the same geopolitical boundary). It is automatically on, but you can turn it off to save money.

Azure now supports Chef, the open-source systems integration framework!
http://www.opscode.com/chef/

No AV is provided on the VM image by default, but you can install it yourself on the VM.

My big take-away from this session is the plan to utilise VMs to spin up dev/test environments within the VPN and turn them off outside of office hours to save money. This could potentially be covered within a standard MSDN subscription; something to look into.

Websites

Each Azure account has 10 websites for free. These start in a shared location, but you can pay for private.

All Azure accounts include 10 websites, and the MSDN subscription includes bandwidth, storage, VMs, etc, which can be used to enhance the website (e.g., adding a SQL backend or something)

Supports git, tfs, ftp, and web deploy

Supports incremental deployment; the first push is big, subsequent ones are incremental/smaller

The free Azure account already includes 10 websites, 1GB bandwidth, a 20GB db, and 1GB storage.

In terms of DNS, you can map CNAMEs to websites, but you would need to use a VM to map A records.

At this point there was a major network outage eventually resulting in a dodgy, insanely long RJ45 cable to be laid from the projection room at the back all the way to the front of the auditorium! Apparently this cable was actually longer than the max allowed length for a wired ethernet connection, so only worked on a couple of laptops; neither of which were Scott’s.. Note to self: always have a backup *everything*, even a backup network!

Cloud Services

When creating a website in Visual Studio you can just deploy to Azure as a website; the only additions to your file system might be a web.config and a csdef file.

You can convert a website project in VS to a cloud service, which adds a cloud service project to your solution and then deploys as an Azure Cloud Service.

It is possible to scale programmatically; currently this is done using the Wasabi library and needs to be scripted, but the functionality will be added as a feature in future releases.

The Azure Application building blocks for Cloud Services:

  • big data,
  • db,
  • storage,
  • traffic caching,
  • messaging,
  • id (authentication),
  • media,
  • cdn,
  • networking

SQL

  • Reporting services has been added recently for BI
  • SQL Server 2012 denali (?) is a subset of full SQL Server 2012
  • VS2012 can support data projects to target the correct SQL DB type for Azure and can identify incompatible features (remember ye olde SQL Azure’s incompatibilities? Yuck..)
  • SQL is PAYG.
  • It is possible to restore a db to a specific point via the constantly backed up transaction logs
** Unfortunately the earlier network outage during this session meant we had to cut it short here as we were already overrunning quite significantly **

:: Scott Guthrie delivering the Spring Release Azure Keynote


Next up – IaaS, MongoDB, Node.js & git & Cloud9IDE >>

* yes, I like cooking. So?!