Data URI scheme

The data URI scheme is a method of including data inline in a web page or stylesheet, where it would otherwise be a separate external resource.

For example, the usual way of referencing an image (which is almost always a separate file from the page you’ve loaded) would be either html:

[html]<img src="/assets/images/core/flagsprite.png" alt="flags" />[/html]

or css:

[css]background:url(/assets/images/core/flagsprite.png)[/css]

However, this remote image (or other resource) can be base64 encoded and included directly in the html or css using the data URI scheme:

[html]<img src="data:image/png;base64,
iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAABGdBTUEAALGP
C/xhBQAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9YGARc5KB0XV+IA
AAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAF1J REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
vr4MkhoXe0rZigAAAABJRU5ErkJggg==">[/html]

or

[css]background:url(data:image/png;base64,
iVBORw0KGgoAAAANSUhEUgAAABAAAAAQAQMAAAAlPW0iAAAABlBMVEUAAAD/
//+l2Z/dAAAAM0lEQVR4nGP4/5/h/1+G/58ZDrAz3D/McH8yw83NDDeNGe4U
g9C9zwz3gVLMDA/A6P9/AFGGFyjOXZtQAAAAAElFTkSuQmCC)[/css]
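
In both cases the general shape is data:[<media type>][;base64],<data>. As a minimal sketch of building one by hand in C# (DataUriSketch and ToDataUri are names made up for illustration, and the caller is assumed to already know the media type):

[csharp]using System;

static class DataUriSketch
{
    // Hypothetical helper: wraps raw bytes in a base64 data URI.
    // mediaType is supplied by the caller, e.g. "image/png".
    public static string ToDataUri(byte[] bytes, string mediaType)
    {
        return "data:" + mediaType + ";base64," + Convert.ToBase64String(bytes);
    }
}[/csharp]

Feeding it the bytes of an image file (from File.ReadAllBytes, say) along with "image/png" should give a string you can paste straight into an img src or a css url(...).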

So, if you fancy cutting down on the number of HTTP requests required to load a page, at the cost of noticeably bigger css and html downloads (base64 encoding adds roughly a third to the size of each asset), then why not look into the data URI scheme to actually include images in your css/html files instead of referencing them?!

Sounds crazy, but it just might work.
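
To put a rough number on the size cost: base64 turns every 3 bytes into 4 characters, so an encoded asset comes out about a third larger than the original, before any compression the server applies. A small sketch of the arithmetic (Base64Size and EncodedLength are illustrative names, not part of the program further down):

[csharp]static class Base64Size
{
    // Length in characters of the base64 encoding of byteCount bytes,
    // including any '=' padding: 4 * ceil(byteCount / 3).
    public static long EncodedLength(long byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }
}[/csharp]

So a 3,000 byte sprite would become 4,000 characters of css, in exchange for one fewer request.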

Using the code below you can recursively traverse a directory for css files with “url(“ image references in them, download the images, encode them, and inject the encoded images back into the css file. The idea is that this little proof of concept will allow you to see the difference in HTTP requests versus full page download size between referencing multiple external resources (normal) and referencing fewer, bigger resources (data URI).

Have a play, why don’t you:

[csharp highlight="73,74,76"]using System;
using System.IO;
using System.Text.RegularExpressions;
using System.Net;

namespace Data_URI
{
    class Data_URI
    {
        static void Main(string[] args)
        {
            try
            {
                var rootPath = @"D:\WebSite\";

                // css file specific stuff
                var cssExt = "*.css";
                // RegEx "url(….)" (unquoted references only; no hyphens or query strings)
                var cssPattern = @"url\(([a-zA-Z0-9_.\:/]*)\)";
                // new structure to replace "url(…)" with
                var cssReplacement = "url(data:{0};base64,{1})";

                // recursively get all files matching the extension specified
                foreach (var file in Directory.GetFiles(rootPath, cssExt, SearchOption.AllDirectories))
                {
                    Console.WriteLine(file + " injecting");

                    // read the file
                    var contents = File.ReadAllText(file);

                    // get the new content (with injected images)
                    // match css referenced images: "url(/blah/blah.jpg);"
                    var newContents = GetAssetDataURI(contents, cssPattern, cssReplacement);

                    // overwrite file if it’s changed
                    if (newContents != contents)
                    {
                        File.WriteAllText(file, newContents);
                        Console.WriteLine(file + " injected");
                    }
                    else
                    {
                        Console.WriteLine(file + " no injecting required");
                    }
                }

                Console.WriteLine("** DONE **");
                Console.ReadKey();
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
                Console.ReadKey();
            }
        }

        static string GetAssetDataURI(string fileContents, string pattern, string replacement)
        {
            try
            {
                // pattern matching fun
                return Regex.Replace(fileContents, pattern, new MatchEvaluator(delegate(Match match)
                {
                    string assetUrl = match.Groups[1].ToString();

                    // check for relative paths (anything that isn't already an absolute http/https url)
                    if (!assetUrl.StartsWith("http://") && !assetUrl.StartsWith("https://"))
                        assetUrl = "http://mywebroot.example.com" + assetUrl;

                    // get the image, encode, build the new css content
                    using (var client = new WebClient())
                    {
                        var base64Asset = Convert.ToBase64String(client.DownloadData(assetUrl));
                        var contentType = client.ResponseHeaders["content-type"];

                        return String.Format(replacement, contentType, base64Asset);
                    }
                }));
            }
            catch (Exception e)
            {
                // usually a 404 for a badly referenced image;
                // the whole file is left untouched if any one download fails
                Console.WriteLine("Error: " + e.Message);
                return fileContents;
            }
        }
    }
}[/csharp]

The key lines are highlighted: they download the referenced resource, convert it to a byte array, encode that as base64, and generate the new css.
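
To see what the pattern actually captures without touching the network, here is a cut-down sketch of the same Regex.Replace call with the download swapped for a caller-supplied function. The CssInjectSketch class, the Inject method and the fetch delegate are all made up for illustration, and the content type is assumed to be known up front; only the pattern and replacement strings come from the program above.

[csharp]using System;
using System.Text.RegularExpressions;

static class CssInjectSketch
{
    const string Pattern = @"url\(([a-zA-Z0-9_.\:/]*)\)";
    const string Replacement = "url(data:{0};base64,{1})";

    // fetch stands in for the WebClient download: given the captured
    // url it returns the asset's bytes. contentType is assumed known.
    public static string Inject(string css, string contentType, Func<string, byte[]> fetch)
    {
        return Regex.Replace(css, Pattern, match =>
        {
            var assetUrl = match.Groups[1].Value;
            var base64Asset = Convert.ToBase64String(fetch(assetUrl));
            return String.Format(Replacement, contentType, base64Asset);
        });
    }
}[/csharp]

Worth noticing: a quoted reference such as url('/img/x.png') falls outside the character class, so it is left exactly as it was.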

This practice probably isn’t very useful for swapping out img refs in HTML, since you lose out on browser caching and on static assets cached in CDNs. It may be more useful for images referenced in CSS files, since those are static files themselves which can be minified, pushed to CDNs, and take advantage of browser caching.

Comments welcomed.

Quick and Dirty C# Recursive Find and Replace

Say you had a vast Visual Studio solution of something ridunculous like 120+ projects and wanted to test out a few proofs of concept on improving build times.

Now say that one of the proofs of concept was to use a shared bin folder for all projects in a single solution. Editing 120+ proj files is going to make you a little crazy.

How about a little recursive find-and-replace app using regular expressions (my saviour in many menial text manipulation tasks) to do it all for you? That’d be nice, wouldn’t it? That’s what I thought too. So I just did a quick and dirty console app to do just that.

[csharp]using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

namespace RecursiveFindAndReplace
{
    class Program
    {
        static void Main(string[] args)
        {
            // where to start your directory walk
            var directoryToTraverse = @"C:\VisualStudio2010\Projects\TestSolutionWithLoadsOfProjectsInIt\";

            // what files to open
            var fileTypeToOpen = "*.csproj";

            // what to look for
            var patternToMatch = @"<OutputPath>bin\\[a-zA-Z]*\\</OutputPath>";
            var regExp = new Regex(patternToMatch);
            // the new content ("$$" is a literal "$" in a Regex replacement string)
            var patternToReplace = @"<OutputPath>C:\bin\$$(Configuration)\</OutputPath>";

            // get all the files we want and loop through them
            foreach (var file in GetFiles(directoryToTraverse, fileTypeToOpen))
            {
                // open, replace, overwrite
                var contents = File.ReadAllText(file);
                var newContent = regExp.Replace(contents, patternToReplace);
                File.WriteAllText(file, newContent);
            }
        }

        // recursive method to return the files we want in all sub dirs of the initial root
        static List<string> GetFiles(string directoryPath, string extension)
        {
            var fileList = new List<string>();
            foreach (var subDir in Directory.GetDirectories(directoryPath))
            {
                fileList.AddRange(GetFiles(subDir, extension));
            }

            fileList.AddRange(Directory.GetFiles(directoryPath, extension));

            return fileList;
        }
    }
}[/csharp]

No doubt this could be made prettier with a little lambda, but like I said – quick and dirty.
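
For what it’s worth, here is one way the “little lambda” version might look, as a sketch and not a drop-in replacement (FileWalker is a made-up class name; the method mirrors GetFiles above and leans on System.Linq):

[csharp]using System.Collections.Generic;
using System.IO;
using System.Linq;

static class FileWalker
{
    // Same walk as GetFiles: matching files from every sub directory
    // first, then the matching files in the directory itself.
    public static List<string> GetFiles(string directoryPath, string extension)
    {
        return Directory.GetDirectories(directoryPath)
            .SelectMany(subDir => GetFiles(subDir, extension))
            .Concat(Directory.GetFiles(directoryPath, extension))
            .ToList();
    }
}[/csharp]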

—————–

Edit: I’ve just realised that Directory.GetFiles will do the recursion itself if you pass it SearchOption.AllDirectories. Duh. So the foreach instead becomes:

[csharp highlight="4"]// get all the files we want and loop through them
foreach (var file in Directory.GetFiles(directoryToTraverse,
                                        fileTypeToOpen,
                                        SearchOption.AllDirectories))
{
    // open, replace, overwrite
    var contents = File.ReadAllText(file);
    var newContent = regExp.Replace(contents, patternToReplace);
    File.WriteAllText(file, newContent);
}[/csharp]

So that’s even quicker and slightly less dirty. Ah well.
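
Before pointing this at 120+ real project files, it may be worth a dry run against a sample string. A minimal sketch, assuming the project files contain a line shaped like <OutputPath>bin\Debug\</OutputPath> (the OutputPathSketch class and Rewrite method are names invented here, and the patterns are written out in full so the sketch stands alone). In a .NET replacement string $$ emits a single literal $, which keeps the MSBuild $(Configuration) property from being read as a substitution:

[csharp]using System.Text.RegularExpressions;

static class OutputPathSketch
{
    static readonly Regex OutputPath =
        new Regex(@"<OutputPath>bin\\[a-zA-Z]*\\</OutputPath>");

    // "$$" in a replacement string produces one literal "$".
    const string Replacement = @"<OutputPath>C:\bin\$$(Configuration)\</OutputPath>";

    public static string Rewrite(string projectXml)
    {
        return OutputPath.Replace(projectXml, Replacement);
    }
}[/csharp]

If the output of Rewrite on a line copied from one of your own proj files looks right, the real run should be safe enough for a proof of concept.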