This is part 3 of a series on getting up and running with Azure WebJobs in .NET Core. If you are just joining us, it’s highly recommended you start back on Part 1 as there’s probably some pretty important stuff you’ve missed on the way.


Azure WebJobs In .NET Core

Part 1 – Initial Setup/Zip Deploy
Part 2 – App Configuration and Dependency Injection
Part 3 – Deploying Within A Web Project and Publish Profiles


Deploying Within A Web Project

So far in this series we’ve been packaging up our WebJob as a standalone service and deploying it as a zip file. There are two problems with this approach.

  • We have to manually upload a zip to the Azure portal
  • 9 times out of 10, we just want to package the WebJob with the actual website and deploy them together to Azure

Solving the second issue actually solves the first too. Typically we will have our website deployment process all set up, whether that’s manual or via a build pipeline. If we can just package up our WebJob so it’s actually “part” of the website, then we don’t have to do anything special to deploy our WebJob to Azure.

If you’ve ever created an Azure WebJob in .NET Framework, you know that in that ecosystem you can just right click your web project, select “Add Existing Project As WebJob”, and be done with it. Something like this :

Well, things aren’t quite that easy in the .NET Core world. Although I wouldn’t rule out this sort of integration into Visual Studio in the future, right now things are a little more difficult.

What we actually want to do is create a publish step that goes and builds our WebJob and places it into a particular folder in our Web project. Something we haven’t jumped into yet is that WebJobs are simply console apps that live inside the “App_Data” folder of our web project. Basically it’s a convention that we can make use of to “include” our WebJobs in the website deployment process.

Let’s say we have a solution that has a website, and a web job within it.

What we need to do is edit the csproj of our website project. We need to add something a little like the following anywhere within the <project> node :
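Here’s a sketch of what that might look like, assuming the WebJob project sits in a sibling folder called WebJobExample (adjust the relative path and project name to suit your own solution):

```xml
<Target Name="PublishWebJob" AfterTargets="Publish">
  <!-- Publish the WebJob project and drop its output into the web project's
       App_Data\Jobs\continuous folder so Azure picks it up automatically. -->
  <Exec Command="dotnet publish ..\WebJobExample\WebJobExample.csproj -o $(ProjectDir)$(PublishDir)App_Data\Jobs\continuous\WebJobExample" />
</Target>
```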

What we are doing is saying when the web project is published, *after* publishing, also run the following command.

Our command is a dotnet publish command that calls publish on our WebJob, and says to output it (signified by the -o flag) to a folder in the web project’s output directory called “App_Data\Jobs\continuous\WebJobExample”.

Now a quick note on that output path. As mentioned earlier, WebJobs basically just live within the App_Data folder within a website. When we publish a website up to the cloud, Azure basically goes hunting inside these folders looking for webjobs to run. We don’t have to manually specify them in the portal.

A second thing to note is that while we are putting it in the “continuous” folder, you can also put jobs inside the “triggered” folder which are more for scheduled jobs. Don’t worry too much about this for now as we will be covering it in a later post, but it’s something to keep in mind.

Now on our *Website* project, we run a publish command: dotnet publish -c Release. We can head over to our website output directory and check that our WebJob has been published into the App_Data folder of our web project.

At this point, you can deploy your website publish package to Azure however you like. I don’t want to get too in depth on how to deploy the website specifically because it’s less about the web job, and more about how you want your deploy pipeline to work. However below I’ll talk about a quick and easy way to get up and running if you need something to just play around with.

Deploying Your Website And WebJob With Publish Profiles

I have to say that this is by no means some enterprise level deployment pipeline. It’s just a quick and easy way to validate your WebJobs on Azure. If you are a one man band deploying a hobby project, this could well suit your needs if you aren’t deploying all that often. Let’s get going!

For reasons that I haven’t been able to work out yet, the csproj variables behave differently when publishing from Visual Studio compared to the command line, so we need to tweak the Exec command in our web project’s csproj a little before we start. We remove the $(ProjectDir) variable from the output path (see the sketch below). The reason is that when we publish from the command line, the $(PublishDir) variable is relative, whereas when we publish from Visual Studio it’s an absolute path. I tried working out how to do it within MSBuild with conditional builds etc., but frankly, you are typically only ever going to build one way or the other, so pick whichever one works for you.
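Assuming the same hypothetical WebJobExample project as before, the two variants of the Exec command look something like this:

```xml
<!-- Command line publishing ($(PublishDir) is relative, so prefix it with $(ProjectDir)): -->
<Exec Command="dotnet publish ..\WebJobExample\WebJobExample.csproj -o $(ProjectDir)$(PublishDir)App_Data\Jobs\continuous\WebJobExample" />

<!-- Visual Studio publishing ($(PublishDir) is already absolute, so drop $(ProjectDir)): -->
<Exec Command="dotnet publish ..\WebJobExample\WebJobExample.csproj -o $(PublishDir)App_Data\Jobs\continuous\WebJobExample" />
```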

If you head to your Azure Web App, on the overview screen, you should have a bar running along the top. You want to select “Get Publish Profile” :

This will download a .publishsettings file to your local machine. We are going to use this to deploy our site shortly.

Inside Visual Studio, right click your website project and select the option to Publish. This should pop up a box where you can select how you want to publish your website. We will be clicking the “Import Profile” button in the bottom left-hand corner. Go ahead and click it, and select the .publishsettings file you just downloaded.

Immediately Visual Studio will kick into gear and push your website (Along with your WebJob) into Azure.

Once completed, we can check that our website has been updated (Visual Studio should immediately open a browser window with your new website), but on top of that we can validate our WebJob has been updated too. If we open up the Azure Portal for our Web App, and scroll down to the WebJob section, we should see the following :

Great! We managed to publish our WebJob up to Azure in a way that it goes seamlessly along with our website too. As I mentioned earlier, this isn’t the sort of pipeline you want to rely on daily for large projects, but it works for a solo developer or someone just trying to get something into Azure as painlessly as possible.

Verifying WebJob Files With Kudu

As a tiny little side note, I wanted to point out something if you ever needed to hastily change a setting on a WebJob on the fly, or you needed to validate that the WebJob files were actually deployed properly. The way to do this is using “Kudu”. The name of this has sort of changed but it’s all the same thing.

Inside your Azure Web App, select “Advanced Tools” from the side menu :

Notice how the icon is sort of a “K”… Like K for Kudu… Sorta.

Anyway, once inside you want to navigate to Debug Tools -> CMD. From here you can navigate through the files that actually make up your website. Most notably you want to head along to /site/wwwroot/App_Data/ where you will find your WebJob files. You can add/remove files on the fly, or even edit your appsettings.json files for a quick and dirty hack to fix bad configuration.

What’s Next?

So far all of our WebJobs have printed out “Hello World!” on repeat. But we can actually “Schedule” these jobs to run every minute, hour, day, or some combination of the lot. Best of all, we can do all of this with a single configuration file, without the need to write more C# code!

This is part 2 of a series on getting up and running with Azure WebJobs in .NET Core. If you are just joining us, it’s highly recommended you start back on Part 1 as there’s probably some pretty important stuff you’ve missed on the way.


Azure WebJobs In .NET Core

Part 1 – Initial Setup/Zip Deploy
Part 2 – App Configuration and Dependency Injection
Part 3 – Deploying Within A Web Project and Publish Profiles


App Configuration Of Our .NET Core Web Job

Previously we managed to get our WebJob all up and running as a simple console application. But the reality is that in almost every application, we are going to want some sort of configuration swap happening when we deploy up to Azure. In full framework we might have done something as simple as a web.config transform, but here we are using appsettings.json, so things are a little different. Let’s get cracking!

I’m not going to explain what each one of these does, but I would just recommend installing all of the following nuget packages into your WebJob/Console Application. Adding these will give you an experience that you are probably used to from building .NET Core web projects.
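The original package list isn’t reproduced here, but a typical set for file-based configuration in a console app would be something like the following (from the package manager console):

```
Install-Package Microsoft.Extensions.Configuration
Install-Package Microsoft.Extensions.Configuration.FileExtensions
Install-Package Microsoft.Extensions.Configuration.Json
Install-Package Microsoft.Extensions.Configuration.EnvironmentVariables
```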

Now let’s go ahead and create two configuration files in our project. One we will call appsettings.json, and the other appsettings.production.json

Both of these files should be set to “Copy If Newer” so that they are output to the bin directory both when running locally, and when publishing.

The contents of the appsettings.json file will be :
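Something like the following would do (the exact message text is just an example):

```json
{
  "Message": "Hello from Development!"
}
```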

And for our production file :
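The production version overrides the same key, for example:

```json
{
  "Message": "Hello from Production!"
}
```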

You can probably see where this is going!

And now we will change the console app code to look a bit like so :
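The original listing isn’t shown here, but based on the walkthrough below, a minimal sketch would be something like this (assuming the packages above and the appsettings files copied to the output directory):

```csharp
using System;
using Microsoft.Extensions.Configuration;

class Program
{
    static void Main(string[] args)
    {
        // Grab the environment name - we'll set this in Azure later.
        var environment = Environment.GetEnvironmentVariable("ASPNETCORE_ENVIRONMENT");
        Console.WriteLine($"Environment: {environment}");

        // Load appsettings.json, then overlay the environment specific file if it exists.
        var configuration = new ConfigurationBuilder()
            .SetBasePath(AppContext.BaseDirectory)
            .AddJsonFile("appsettings.json", optional: false)
            .AddJsonFile($"appsettings.{environment}.json", optional: true)
            .Build();

        // Print the "Message" setting so we can see which file won.
        Console.WriteLine(configuration["Message"]);
        Console.ReadLine();
    }
}
```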

So let’s break this down a little.

First we go ahead and grab the environment variable “ASPNETCORE_ENVIRONMENT”; we’ll look at how we actually set this later. We also write this out to the console just for a bit of debugging.

Next we load configuration files. We load the default appsettings.json, and we load one that will be appsettings.environment.json, where environment is whatever was in the environment variable from earlier. Obviously if we set this to “production”, then we will pull in our production app settings which will override the default file. Perfect!

Finally, we print out the “message” app setting. Locally we see the following output :

Let’s publish our WebJob up to Azure (using the zip method from Part 1) and see what we get. After uploading, we check the WebJob logging output :

Bleh! That didn’t work, it still thinks we are running in Development?! It’s because we didn’t set the environment variable!

Head to the “Application Settings” configuration panel of your Azure Web App. You need to add the variable “ASPNETCORE_ENVIRONMENT” with the setting of “production”.

If we head back to our WebJob logging output :

Awesome! So now our WebJob has the concept of both configuration, and which “environment” it’s running in so that it can swap configuration out at runtime.

Using .NET Core Dependency Injection

A pretty big part of .NET Core is its out-of-the-box dependency injection support. While usage of DI in a quick and dirty console application is probably pretty rare, if you are using shared libraries between the web project and your WebJob, you are probably going to need it to wire up a couple of services here and there. If you’ve already added DI into a .NET Core console application before, this may not be totally new for you, but it’s still worth stepping through it anyway.

First we need the nuget package that actually has the .NET Core dependency injection classes in it :
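That package is Microsoft.Extensions.DependencyInjection (plus the configuration binder if you want to bind sections straight to POCOs):

```
Install-Package Microsoft.Extensions.DependencyInjection
Install-Package Microsoft.Extensions.Configuration.Binder
```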

Our first order of business is actually going to be wiring up our configuration properly. Previously when we accessed our configuration, we were doing it on the raw ConfigurationRoot object, which isn’t too great. I personally prefer using POCOs to hold configuration. So let’s create something to hold our “Message” :
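The post’s interface is called IWebJobConfiguration; the concrete class name here is my own guess, so rename as you see fit:

```csharp
public interface IWebJobConfiguration
{
    string Message { get; set; }
}

public class WebJobConfiguration : IWebJobConfiguration
{
    public string Message { get; set; }
}
```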

We need to modify our appsettings.json files (Both the default and production) to better match our structure :

Now we need to fire up our service collection and bind up our configuration. To do this, in the Main method of our Program.cs, underneath the configuration setup, add the following :
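As a sketch, assuming the appsettings files nest the message under a section named “WebJobConfiguration” (the section name is my assumption), this could look like:

```csharp
// Requires Microsoft.Extensions.DependencyInjection and Microsoft.Extensions.Configuration.Binder
IServiceCollection serviceCollection = new ServiceCollection();

// Bind the "WebJobConfiguration" section to our POCO and register it against the interface.
serviceCollection.AddSingleton<IWebJobConfiguration>(
    configuration.GetSection("WebJobConfiguration").Get<WebJobConfiguration>());
```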

What this does is it creates a new service collection, and binds our configuration for the IWebJobConfiguration interface. Not that helpful (yet!), but follow on.

Next we are going to do something a little funky. We are going to create a new “Entry Point” for our application. So far all of our setup *and* our actual “work” has been done within the main method of our program.cs. This is fine for now, but what we actually need is an application root that can be the kick off point for our WebJob to actually start. Now I want to point out, that later in this series we are going to move this “Entry Point” and instead use a library that essentially does it for us, but it’s still good to know.

Go ahead and create a class that looks like the following :
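A minimal sketch of that entry point, using the class name WebJobEntryPoint referenced further down:

```csharp
public class WebJobEntryPoint
{
    private readonly IWebJobConfiguration _configuration;

    public WebJobEntryPoint(IWebJobConfiguration configuration)
    {
        _configuration = configuration;
    }

    public void Run()
    {
        // Print the configured message - this differs between local and production.
        Console.WriteLine(_configuration.Message);
        Console.ReadLine();
    }
}
```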

Pretty simple, we just have a single method that prints out our configuration message (Remember, this changes depending if we are local or in production).

Now we need to wire this up. We add the following to the very bottom of the Main method in our Program.cs :
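Something along these lines (a sketch, matching the description that follows):

```csharp
serviceCollection.AddTransient<WebJobEntryPoint>();

var serviceProvider = serviceCollection.BuildServiceProvider();
var entryPoint = serviceProvider.GetService<WebJobEntryPoint>();
entryPoint.Run();
```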

So essentially we are adding WebJobEntryPoint as a service we can build. We are then requesting that service back (which in the process injects our configuration), and then we call Run on it. The reason we do this is that as we add more dependencies or services into that entry point, the call to build it up doesn’t change. All in all, our Main method ends up being the configuration setup, the service registrations, and then this resolve-and-run block at the bottom.

Publish this up to Azure and we get our usual output! Now of course, we haven’t really achieved much by adding dependency injection to our console app, but that’s because this is a simple application without too many dependencies. If we had a much more complicated service layer and a tonne of business logic, especially if it’s buried away in shared DLLs, having this wired up means we wouldn’t have such a headache getting things up and running.

What’s Next?

So far we’ve been fiddling around with individual console applications that we’ve deployed as zips. Next we are going to take a look at how we might deploy a WebJob as part of a web application. By that I mean when we deploy our web app to Azure, we automatically push the WebJob with it. Again this is something that is pretty standard when building WebJobs in .NET full framework, but does not work straight out of the box when working with .NET Core. You can check out Part 3 right here!

While working on a web project recently that was written in .NET Core, there was a need for a small background task to run either on a timer, or on an Azure queue message. No worries, we can just use an Azure WebJob for that, no extra infrastructure or Azure setup required right? Well, it then occurred to me that I had never actually written a WebJob in .NET Core. What started as a “Oh you just add a WebJob in Visual Studio and you are done” quickly turned into “OK… So I need this specific beta version of this nuget package that a random Microsoft developer mentioned over here”. So here it is, here is how you create an Azure WebJob in .NET Core.


Azure WebJobs In .NET Core

Part 1 – Initial Setup/Zip Deploy
Part 2 – App Configuration and Dependency Injection
Part 3 – Deploying Within A Web Project and Publish Profiles


Forewarning

If you already have a WebJob written in .NET Framework, it’s tempting to skip through this tutorial right to the part that mentions something specifically related to your existing project. If there is one thing I learned about WebJobs in .NET Core, it’s that it pays to start with the real basics. I would recommend, even if you already know about WebJobs and how they work, to still start on Part 1 and work your way through. Everything from configuration to publishing is totally different from what you might have expected, so even if it feels like we are going right back to Hello World level, it’s worth it to get a total understanding of how WebJobs currently work under .NET Core.

What You Should Already Know

While we may be creating a “Hello World” type WebJob along the way, this is certainly not a beginner’s series on “what” WebJobs are. In fact, we probably brush over some really important basics, and if you’ve never used WebJobs (or even Azure Web Apps) before, you are going to have a really rough time. This series is more about the .NET Core part, and less about the “How do I use Azure” part. So if you’ve never created an Azure WebJob before, go ahead and create one in the full .NET Framework, have a play, then come back when you are ready to get cracking with .NET Core.

A Console App And Nothing More

Did you know that a WebJob is actually just a console application? In fact, I’ve seen people upload diagnostic tools as plain .exes as “WebJobs” to an Azure Web App just so they can run a specific tool on the same machine as their web app. For example a tool to spit out what SSL certificates were available on the machine, or what it thought certain environment variables were set to.

So let’s start there. Let’s forget all about the concept of a “WebJob”, and let’s just create a simple console application that runs in an Azure Web App.

First go ahead and create a new .NET Core Console Application. At the time of writing, there are no Visual Studio templates for creating web jobs for .NET Core. So for example you may see below inside VS:

But this is for full framework only. In the grand scheme of things, it’s not a big deal. The VS template just installs a few nuget packages and gives us some boilerplate code, but realistically it’s just a console application anyway.

So what we want is :

After creating, we are going to have code that looks like the following :
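That’s just the stock console template:

```csharp
class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Hello World!");
    }
}
```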

Congrats! You just created your first Web Job. How easy was that! But now how to push things up to Azure. Hmmm.

Kicking Our Job Into Gear

Now let’s say we deploy this as is. Given that .NET Core applications compile down to .dll files, our publish folder could contain multiple .dll files, so how will Azure know which one to run? Well, we actually need to give it a nudge in the right direction. The easiest way to do this is to create a “run” file.

The easiest way to explain a run file is that we take an extension from the following list :  .fsx, .cmd, .bat, .exe, .ps1, .sh, .php, .py, .js  and we create a file called “run” with that extension. So for example, “run.cmd”. Azure checks our WebJob folder for any run file, and kicks it into gear. Also note that the .cmd file extension is obviously used for Windows App Services, .sh for Linux, etc.

In the root of your project, go ahead and create a file called “run.cmd” with the contents :
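For example, if your compiled assembly is called WebJobExample.dll (substitute your own dll name), run.cmd would just contain:

```
dotnet WebJobExample.dll
```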

Pretty simple. When we upload our WebJob, it’s going to find our run file, and then use the dotnet SDK to kick off our console app. Obviously replace the .dll filename with your projects dll name, and you are good to go.

Very Important!

Do not create this file in Visual Studio. It creates the file as UTF-8 with BOM, which will break the WebJob. Even if you don’t create this file in Visual Studio, you should still check that you are not using a BOM. The easiest way is to use Notepad++ and check the encoding of the file. Ensure that it’s set to UTF-8 (without BOM).

And the second thing is to ensure that the file itself is set to copy to your output directory.

If either of these steps aren’t adhered to, your WebJob can either just fail to start, or Azure will fail the deployment (And silently, I might add) which can be a bit frustrating.

Zip Publishing

Later on in this series we are going to cover a much better way of getting WebJobs into Azure, but for now let’s just get things up and running. For that we are going to publish a very simple zip with our WebJob inside it, then upload it via the Azure Portal.

To publish our app, we just run the following in our project directory :
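A plain release publish is enough here:

```
dotnet publish -c Release
```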

You then need to zip the following folder :  bin\Release\netcoreapp2.1\publish . Note that’s the publish folder not the parent. Once we have all of that zipped up, head over to Azure. If you haven’t already, now would be a great time to spin up a test Azure Web App (Windows) before progressing. I’ll wait…

Now with your Azure Web App in the Azure Portal, select Web Jobs from the side menu, then Add.

Don’t worry too much about the various settings, we can dive into these later. But it should look a bit like the following :

Once uploaded, we now need to check the logs of our WebJob and make sure it’s all working as intended. Select our WebJob from the list (You may need to refresh the list a couple of times after the upload), then select Logs. You should be taken to a screen where you can see the output of your WebJob :

Woop! So it ran and printed out Hello World! Our first WebJob is all go!

What’s Next?

I know the above might not seem like you’ve achieved a lot. I mean it took us probably a good 30 mins just to print Hello World on a log window in Azure, not that fancy right! But everything we’ve done so far will be built upon and give us better understanding of exactly how WebJobs work under the hood.

In future parts of this series, we will cover using .NET Core’s configuration, dependency injection, scheduled jobs, triggered jobs, uploading as part of a web app, and so much more! Check out Part 2 here!

ELO is a player rating system that was first used to rank chess players, but later found a lot of usage in video games, basketball, baseball and many other sports. It’s designed in a way that when underdogs win in a game, they are given more “credit” for their win than if a favourite was to win. For example if chess grandmaster Magnus Carlsen was to beat me (a non ranked player) in chess, it really shouldn’t affect his ranking. And vice versa should someone with a low rating beat the number 1 player in the world, their ranking should shoot up.

ELO is also designed in a way that means a top player can’t just continually play much lower ranked players and keep moving up the rankings at the same rate. It doesn’t mean that matchups don’t exist where a certain “style” of play (depending on the sport/game) means it’s easier or harder than their ELO lets on, but it’s meant to reward underdogs winning a whole lot more than someone picking favourable matchups.

ELO seems like it should be some massively complicated system, but it can really be boiled down to two “simple” equations.

Expectation To Win Equation

The “Expectation To Win” equation is really part of the ELO equation, but it’s worth splitting out because you often may want to see it as a stand alone number anyway.

After passing in two player ratings, we are returned the likelihood of Player 1 winning the match up. This is represented as a decimal (So a result of 0.5 would mean that Player 1 has a 50% chance to win the match up).
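The underlying formula is the standard ELO expectation: 1 / (1 + 10^((RatingB - RatingA) / 400)). Sketched in C#:

```csharp
public static double ExpectationToWin(double playerOneRating, double playerTwoRating)
{
    // 1 / (1 + 10 ^ ((opponentRating - playerRating) / 400))
    return 1 / (1 + Math.Pow(10, (playerTwoRating - playerOneRating) / 400));
}
```

Plugging in 1700 vs 1300 gives 1 / (1 + 10^-1) ≈ 0.909, which matches the second example below.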

Some example input and output would look like :

Player 1 Rating : 1500
Player 2 Rating : 1500
Expectation For Player 1 Win : 0.5 (50% chance to win)

Player 1 Rating : 1700
Player 2 Rating : 1300
Expectation For Player 1 Win : 0.90 (90% chance to win)

ELO Rating Equation

The next step is, given two players compete against each other with a clear winner and loser, what would their new ELO ranking be? The equation for that looks a little bit like this :
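A sketch of the update step in C#, assuming K = 32 (which roughly reproduces the example numbers below, give or take rounding):

```csharp
public static (double playerOneNewRating, double playerTwoNewRating) CalculateNewRatings(
    double playerOneRating, double playerTwoRating, int outcome)
{
    // K controls how quickly ratings move - assumed to be 32 here.
    const double K = 32;

    // outcome is 1 if player one won, 0 if player one lost.
    var expectation = ExpectationToWin(playerOneRating, playerTwoRating);
    var delta = K * (outcome - expectation);

    return (playerOneRating + delta, playerTwoRating - delta);
}
```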

Let’s break this down a little.

We pass in our two players’ ratings, and the outcome as seen by player 1. Once inside, we use a special number called “K”; we’ll talk a little more about this number later, but for now just think of it as a constant. We then take the outcome (either 1 or 0) and subtract the expected outcome of the game. We then add this delta to player one’s rating, and subtract it from player two’s. Because we are using the expected outcome as part of the equation, we can reward the underdog for winning more than if the expected winner actually wins. Let’s look at some actual examples :

Player 1 ELO : 1700
Player 2 ELO : 1300
Outcome : Player 1 Wins
Player 1 New ELO : 1702
Player 2 New ELO : 1298
ELO Shift  : 2

Player 1 ELO : 1700
Player 2 ELO : 1300
Outcome : Player 1 Loses
Player 1 New ELO : 1671
Player 2 New ELO : 1329
ELO Shift  : 29

So as we can see, when player 1 wins, they gain only 2 ELO points because they were expected to come out on top (in fact they were expected to win 90% of the time). However when they lose, they pass along 29 points to the winner, which is a huge shift. This represents the underdog winning a game they were not expected to win at all.

Finding The Perfect “K”

So in our calculate method, we use a constant called “K”. In simple terms, we can think of this number as how “fast” ELO changes hands. A high number (such as 32) means ratings fluctuate rapidly in response to results, while a low number (such as 10) means rankings move much more slowly. Typically this means that for sports/games with only a few games played per season (or per career), you may want a rather high K so that results move ratings quickly enough to matter. In sports where there are possibly hundreds of games a year, you would want a lower K to reflect that you are expected to lose a few games here and there, and that only a long losing run should really start to drag a rating down.

Other times K can change based on either the ELO rating of the players, or the number of games already played. For example in a new “season”, with few games played, a high K would mean that ELO rankings fluctuate wildly, and then as the season goes on you could lower K to stabilize the rankings. I’m not a huge fan of this as it begins to put more importance on winning games early in the season rather than at the end, but it does make sense for new players in a game to be given a higher K so they can find their true ELO faster.

If you vary K based on the ELO rating of the players, you can give a high K to lower/mid range ranked players so that they can dig themselves out of the weeds rather fast, then lower K as you reach higher ELOs to reflect that at the top of the rankings, things should be a bit more stable.

Ultimately, K usually sits somewhere between 10 and 32, and exactly where depends on what you are rating players in.

I recently came across the need to host a .NET Core web app as a Windows Service. In this case, it was because each machine needed to locally be running an API. But it’s actually pretty common to have a web interface to manage an application on a PC without needing to set up IIS. For example if you install a build/release management tool such as Jenkins or TeamCity, it has a web interface to manage the builds and this is able to be done without the need for installing and configuring an additional web server on the machine.

Luckily .NET Core actually has some really good tools for accomplishing all of this (And even some really awesome stuff for being able to run a .NET Core web server by double clicking an EXE if that’s your thing).

A Standalone .NET Core Website/Web Server

The first step actually has nothing to do with Windows Services. If you think about it, all a Windows Service is, is a managed application that’s hidden in the background, will restart on a machine reboot, and if required, will also restart on erroring. That’s it! So realistically what we first want to do is build a .NET Core webserver that can be run like an application, and then later on we can work out the services part.

For the purpose of this tutorial, I’m just going to be using the default template for an ASP.net Core website. The one that looks like this :

We first need to head to the csproj file of our project and add in a specific runtime (Or multiple), and an output type. So overall my csproj file ends up looking like :
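A sketch of the relevant parts (the target framework and runtime identifier are inferred from the publish path later in the post):

```xml
<Project Sdk="Microsoft.NET.Sdk.Web">
  <PropertyGroup>
    <TargetFramework>netcoreapp2.1</TargetFramework>
    <RuntimeIdentifiers>win10-x64</RuntimeIdentifiers>
    <OutputType>Exe</OutputType>
  </PropertyGroup>
</Project>
```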

Our RuntimeIdentifiers (and importantly notice the “s” on the end there) specifies the runtimes our application can be built for. In my case I’m building only for Windows 10, but you could specify other runtime monikers if required.

On top of this, we specify that we want an OutputType of exe. This is so we have a nice complete exe to run, rather than using the “dotnet run” command to start our application. I’m not 100% sure, but I think the exe that comes out of this is simply a wrapper to boot up the actual application dll. I noticed this because when you change code and recompile, the exe doesn’t change at all, but the dll does.

Now we need to be able to publish the app as a standalone application. Why standalone? Because then any target machine doesn’t have to have the .NET Core runtime installed to get everything running. On top of that, there is no “what version do you have installed?” type talk. It’s just double click and run.

To publish a .NET Core app as standalone, you need to run the following command from the project directory in a command prompt/powershell :
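Based on the description below, the command is along these lines:

```
dotnet publish -c Release --self-contained -r win10-x64
```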

It should be rather self explanatory. We are doing a publish, using the release configuration, we pass through the self contained flag, and we pass through that the runtime we are building for is Windows 10 – 64 Bit.

From your project directory, you can head to :  \bin\Release\netcoreapp2.1\win10-x64\publish

This contains your application exe as well as all framework DLL’s to run without the need for a runtime to be installed on the machine. It’s important to note that you should be inside the Publish folder. One level up is also an exe but this is not standalone and relies on the runtime being installed.

From your publish folder, try double clicking yourapplication.exe.

In your browser head to http://localhost:5000 and voilà, you now have your website running from an executable. You can copy and paste this publish folder onto any Windows 10 machine, even a fresh install, and have it spin up a webserver hosting your website. Pretty impressive!

Installing As A Windows Service

So the next part of this tutorial is actually kinda straight forward. Now that you have an executable that hosts your website, installing it as a service is exactly the same as setting up any regular application as a service. But we will try and have some niceties to go along with it.

First we need to do a couple of code changes for our app to run both as a service, and still be OK running as an executable (Both for debugging purposes, and in case we want to run in a console window and not as a service).

We need to install the following from your package manager console :
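The package in question is Microsoft.AspNetCore.Hosting.WindowsServices, which is where the RunAsService() extension lives:

```
Install-Package Microsoft.AspNetCore.Hosting.WindowsServices
```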

Next we need to go into our Program.cs and make the Main method look like the following :
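Here’s a sketch that matches the bullet points below (names like Startup are the usual defaults; adjust to your project):

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using Microsoft.AspNetCore;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Hosting.WindowsServices;

public class Program
{
    public static void Main(string[] args)
    {
        var isService = !(Debugger.IsAttached || args.Contains("--console"));

        var builder = WebHost.CreateDefaultBuilder(args.Where(arg => arg != "--console").ToArray());

        if (isService)
        {
            // Services start with a working directory of C:\Windows\System32,
            // so point the content root back at the folder the exe lives in.
            var pathToExe = Process.GetCurrentProcess().MainModule.FileName;
            var pathToContentRoot = Path.GetDirectoryName(pathToExe);
            builder = builder.UseContentRoot(pathToContentRoot);
        }

        var host = builder
            .UseStartup<Startup>()
            .Build();

        if (isService)
        {
            host.RunAsService();
        }
        else
        {
            host.Run();
        }
    }
}
```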

This does a couple of things :

  • It checks whether we are using the debugger, or if we have a console argument of “--console” passed in.
  • If neither of the above are true, it sets the content root manually back to where the exe is running. This is specifically for the service runtime.
  • Next, if we are a service, we use a special “RunAsService()” method that .NET Core gives us
  • Otherwise we just do a “Run()” as normal.

Obviously the main point of this is that if the debugger is attached (e.g. we are running from Visual Studio), or we run from a command prompt with the flag “--console”, it’s going to run exactly the same as before. Back in the day we used to have to run the service with a 10 second sleep at the start of the app, and quickly try and attach the debugger to the process before it kicked off to be able to set breakpoints etc. Now it’s just so much easier.

Now let’s actually get this thing installed!

In your project in Visual Studio (Or your favourite editor) add a file called install.bat to your project. The contents of this file should be :
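As a sketch (MyService and MyWebApp.exe are placeholders for your own service and exe names), the batch file might look like:

```
sc create MyService binPath= "%~dp0MyWebApp.exe"
sc failure MyService reset= 86400 actions= restart/60000/restart/60000/restart/60000
sc start MyService
sc config MyService start= auto
```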

Obviously replace MyService with the name of your service, and be sure to rename the exe to the actual name of your applications exe. Leave the %~dp0 part as this refers to the current batch path (Allowing you to just double click the batch file when you want to install).

The install file creates the service, sets up failure restarts (Although these won’t really be needed), starts the service, and sets the service to auto start in the future if the machine reboots for any reason.

Go ahead and create an uninstall.bat file in your project. This should look like :
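Again a sketch, with the same placeholder service name:

```
sc stop MyService
timeout /t 10 /nobreak > NUL
sc delete MyService
```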

Why the timeout? I sometimes found that it took a while to stop the service, so giving it a little bit of a break in between stopping and deleting helped it along its way.

Important! For both of these files, be sure to set them up so they copy to the output directory in Visual Studio. Without this, your bat files won’t output to your publish directory.

Go ahead and publish your application again using our command from earlier :

Now in your publish directory, you will find your install and uninstall bat files. You will need to run both of these as Administrator for them to work as installing Windows Services requires elevated access. A good idea is that the first time you run these, you run them from a command prompt so you can catch any errors that happen.

Once installed, you should be able to browse to http://localhost:5000 and see your website running silently in the background. And again, the best part is when you restart your machine, it starts automatically. Perfect!

This week there was a great blog post about Bing.com running on .NET Core 2.1, and the performance gains that brought along with it. Most curious to me was that they singled out the performance gains of string.Equals and string.IndexOf in .NET Core 2.1 as having the largest amount of impact to performance.

Whichever way you slice it, HTML rendering and manipulation are string-heavy workloads. String comparisons and indexing operations are major components of that. Vectorization of these operations is the single biggest contributor to the performance improvement we’ve measured.

At first I thought they must have some very special use case that runs millions of string comparisons under the hood, so it’s not going to be much use to me. But then I thought about how many string comparisons must happen under the hood when building a web application. There is probably a whole lot more happening than we realize, and the singling out of string manipulation performance improvements may not be as off as I first thought.

So let’s just take their word for it and say that doing stuff on the web is a string-heavy workload in .NET. How much of a performance gain can we actually expect to see in .NET Core 2.1 for these methods? We aren’t necessarily looking at the time it takes for these functions to complete, but rather the factor of improvement that .NET Core 2.1 has over versions of the full .NET Framework.

String.Equals Performance Benchmarks

(Before reading too much into these results, see the next section entitled “String.Equals Performance Benchmarks Updated”. Some interesting stuff!)

Now we could write some huge loop and run it on each runtime one by one, or we could write a nice benchmark using BenchmarkDotNet (Guide Here), and get it all in one go.

Our benchmark looks like :
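The original listing isn’t reproduced here, but a sketch of the idea using BenchmarkDotNet would be something like the following (the exact job/toolchain configuration in the original may differ, and depending on your BenchmarkDotNet version the job attributes may live in BenchmarkDotNet.Attributes.Jobs):

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[CoreJob, ClrJob] // run the benchmark on both .NET Core and full .NET Framework
public class StringEqualsBenchmark
{
    private readonly string _a = "Hello World!";
    private readonly string _b = "Hello World!";

    [Benchmark]
    public bool IsEqual() => string.Equals(_a, _b);
}

public class Program
{
    public static void Main(string[] args) => BenchmarkRunner.Run<StringEqualsBenchmark>();
}
```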

So a couple of things to point out. First that we are using 2 different tool chains. .NET Core 2.1 and .NET Full Framework 4.7.2. Both of which are the latest version of runtimes.

The benchmark itself is simple, we compare the string “Hello World!” to another string that says “Hello World!”. That’s it! Nothing too fancy.

Now typically with benchmarks on large pieces of code, I feel OK running them on my own machine. While this can give you skewed results, especially if you are trying to use your computer at the same time, for big chunks of code I’m usually just looking to see if there is actually any difference whatsoever, not the actual level of difference. Here, it’s a little different. We are going to be talking about differences down to the nanosecond, so we need to be far more careful.

So instead, I spun up a VM in Azure to run the benchmarks on. It’s a D2s_V3 machine, so 2 CPU cores and 8GB of ram. It’s probably pretty typical of your standard web box that you might scale up to, before starting to scale out horizontally in a web farm.

Enough waffle, what do the results look like?

| Method  | Toolchain     | Mean      | Error     | Scaled |
|---------|---------------|-----------|-----------|--------|
| IsEqual | .NET Core 2.1 | 0.9438 ns | 0.0686 ns | 1.00   |
| IsEqual | CsProjnet472  | 1.9381 ns | 0.0844 ns | 2.06   |

I ran this a couple of times to make sure… And yes, to do a string compare in full framework took twice as long to complete. And trust me, I ran this multiple times just to make sure I wasn’t doing something stupid, the results were that astounding.

In case someone doesn’t believe me, the exact tooling as per BenchmarkDotNet that was used was :

.NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
.NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3062.0

Again, prove me wrong because I couldn’t believe the results myself. Pretty unbelievable.

String.Equals Performance Benchmarks Updated (2018-08-23)

I’m always nervous when I post benchmarks. There is so much that can go wrong, get optimized out, or have something minor completely skew the results. This time was no different. There were a couple of observations with the benchmark.

Compile time strings are interned so I think the string equal test is testing equality on the same string instance

and

You should test with longer string, that’s where the optimizations will kick in

Both good points (hat tip to Jeff Cyr). First I wanted to test the claim that if I am using the same string instance, I shouldn’t see much of a performance difference, because the objects will actually share the same memory under the hood. So let’s modify our benchmark a little to :
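So the second string is now built at runtime to force a separate instance, something like:

```csharp
private readonly string _a = "Hello World!";
// Building the string at runtime guarantees a different instance to the interned literal.
private readonly string _b = new string("Hello World!".ToCharArray());
```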

So now it’s definitely a different instance. Running the benchmarks and what do you know :

| Method  | Toolchain     | Mean     | Error     | Scaled |
|---------|---------------|----------|-----------|--------|
| IsEqual | .NET Core 2.1 | 7.370 ns | 0.1855 ns | 1.00   |
| IsEqual | CsProjnet472  | 7.152 ns | 0.1928 ns | 0.97   |

So point proven: when the instance is different and small, there is very little performance difference. In fact .NET Core is actually slower in my benchmark, but within the margin of error.

So let’s scale up the test to prove the second point. That for cases where the strings are much longer, we should see the performance benefits kick in. Our benchmark this time will look like :
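For example:

```csharp
// Two separately allocated 12,000 character strings.
private readonly string _a = new string('a', 12000);
private readonly string _b = new string('a', 12000);
```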

So the strings we will compare are 12,000 characters long, and they are different instances. Running our benchmark we get :

| Method  | Toolchain     | Mean     | Error    | StdDev   | Scaled | ScaledSD |
|---------|---------------|----------|----------|----------|--------|----------|
| IsEqual | .NET Core 2.1 | 128.7 ns | 4.367 ns | 12.88 ns | 1.00   | 0.00     |
| IsEqual | CsProjnet472  | 211.7 ns | 6.989 ns | 20.28 ns | 1.66   | 0.24     |

This is what we expected to see, so on larger strings, there is a definite performance improvement in .NET Core.

So what are the takeaways here?

  1. It seems .NET Core has done some work that optimizes equality tests of strings when they are the same instance
  2. For short strings, there isn’t any great performance benefit.
  3. For long strings, .NET Core has a substantial performance boost.
  4. I’m still nervous about posting benchmarks!

String.IndexOf Performance Benchmarks

Next up let’s take a look at IndexOf performance. This one was interesting because using IndexOf on a string, you can either do IndexOf(string) or IndexOf(char). And from the looks of the change (you can view the original PR into the Core Github repo here), the performance impact should only affect IndexOf(char). But this actually gives us a good opportunity to make sure that we are benchmarking correctly. Let’s include a benchmark that does an IndexOf(string) too! We should expect to see very minimal difference between .NET Core and Full Framework on this, but it would be good to see it in the numbers.

The benchmarking code is :
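Again a sketch rather than the original listing: the param values and search terms are my assumptions, but the shape matches the results table below.

```csharp
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;

[CoreJob, ClrJob]
public class IndexOfBenchmark
{
    // One short haystack and one ~12,000 character haystack. Neither contains the
    // character/string we search for, so IndexOf always scans to the very end.
    public static IEnumerable<string> Haystacks => new[]
    {
        "Hello World!",
        string.Concat(Enumerable.Repeat("Hello World!", 1000))
    };

    [ParamsSource(nameof(Haystacks))]
    public string haystack;

    [Benchmark]
    public int IndexOfString() => haystack.IndexOf("zz");

    [Benchmark]
    public int IndexOfChar() => haystack.IndexOf('z');
}
```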

You’ll notice that in this test case, we are passing in two different arguments for each benchmark. The first is with a string that is 12 characters long, and the second is with a string that is 12,000 characters long. This was mostly because of the comment on the original PR that stated :

for longer strings, where the match is towards the end or doesn’t match at all, the gains are substantial.

Because of this I also made sure that the indexOf didn’t actually find a match at all. So we could see the maximum performance gain that this new code has in .NET Core 2.1.

And the results?

| Method        | Toolchain     | haystack              | Mean          | Error       | Scaled |
|---------------|---------------|-----------------------|---------------|-------------|--------|
| IndexOfString | .NET Core 2.1 | Hello World!          | 171.212 ns    | 3.3849 ns   | 1.00   |
| IndexOfString | CsProjnet472  | Hello World!          | 184.194 ns    | 3.6937 ns   | 1.08   |
| IndexOfChar   | .NET Core 2.1 | Hello World!          | 7.962 ns      | 0.4588 ns   | 1.00   |
| IndexOfChar   | CsProjnet472  | Hello World!          | 12.305 ns     | 0.2841 ns   | 1.59   |
| IndexOfString | .NET Core 2.1 | Hello(…)orld! [12000] | 39,964.455 ns | 781.2495 ns | 1.00   |
| IndexOfString | CsProjnet472  | Hello(…)orld! [12000] | 40,476.489 ns | 805.1209 ns | 1.01   |
| IndexOfChar   | .NET Core 2.1 | Hello(…)orld! [12000] | 765.894 ns    | 15.2256 ns  | 1.00   |
| IndexOfChar   | CsProjnet472  | Hello(…)orld! [12000] | 7,522.823 ns  | 147.9425 ns | 9.83   |

There is a bit to take in here but here goes.

First, when the method is “IndexOfString”, we see minimal to no difference between the two runtimes. .NET Core is slightly faster, but this could be down to a whole host of factors not related to this specific method.

When we move to the IndexOfChar method, we see that when the string is small, we lop quite a bit off the average time. But when we move to working on larger strings… wow… We are almost 10x faster in .NET Core than Full Framework. Pretty. Damn. Incredible.

Won’t This Stuff Make It Into .NET Framework?

Because much of this work actually relies on C# 7.2’s new Span feature, it’s likely it will make its way across eventually. But what I typically see now is that the release cycle is that much faster for .NET Core than for Framework, so these sorts of improvements land in the Core runtime at a much more rapid pace and only later backfill their way into the Framework. I’m sure at some point a reader will come across this post, and in .NET Framework version 4.8.X there is no performance difference, but by that point, there will be some other everyday method that is blazingly fast in Core, but not Framework.

Getting a mime type based on a file name (Or file extension), is one of those weird things you never really think about until you really, really need it. I recently ran into the issue when trying to return a file from an API (Probably not the best practice, but I had to make it happen), and I wanted to specify the mime type to the caller. I was amazed with how things “used” to be done both in the .NET Framework, and people’s answers on Stack Overflow.

How We Used To Work Out The Mime Type Based On a File Name (aka The Old Way)

If you were using the .NET Framework, you had two ways to get going. Now I know this is a .NET Core blog, but I still found it interesting to see how we got to where we are now.

The first way is that you build up a huge dictionary yourself of mappings between file extensions and mime types. This actually isn’t a bad way of doing things if you only expect a few different types of files need to be mapped.

The second was that in the System.Web namespace of the .NET Framework there is a static class for mime mapping. We can actually see the source code for this mapping here : https://referencesource.microsoft.com/#system.web/MimeMapping.cs. If you were expecting some sort of mime mapping magic to be happening, well, just take a look at that source.

400+ lines of manual mappings that were copied and pasted from the default IIS7 list. So, not that great.

But the main issue with all of this is that it’s too hard (close to impossible) to add and remove custom mappings. So if your file extension isn’t in the list, you are out of luck.

The .NET Core Way

.NET Core obviously has its own way of doing things that may seem a bit more complicated, but it does work well.

First, we need to install the following nuget package :
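That package is Microsoft.AspNetCore.StaticFiles:

```
Install-Package Microsoft.AspNetCore.StaticFiles
```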

Annoyingly the class we want to use lives inside this static files nuget package. I would say if that becomes an issue for you, to look at the source code and make it work for you in whatever way you need. But for now, let’s use the package.

Now we have access to a nifty little class called FileExtensionContentTypeProvider. Here’s some example code using it. I’ve created a simple API action that takes a filename, and returns the mime type :
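A sketch of such an action (the route and parameter names are mine; FileExtensionContentTypeProvider lives in the Microsoft.AspNetCore.StaticFiles namespace):

```csharp
[HttpGet]
public IActionResult GetMimeType(string fileName)
{
    var provider = new FileExtensionContentTypeProvider();

    if (!provider.TryGetContentType(fileName, out var contentType))
    {
        // Fall back to a sensible default when no mapping exists.
        contentType = "application/octet-stream";
    }

    return Ok(contentType);
}
```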

Nothing too crazy, and it works! We also catch the case where it doesn’t manage to map the file, and map it ourselves to a default content type. This is one thing the .NET Framework MimeMapping class did do for you: if it couldn’t find the correct mapping, it returned application/octet-stream. But I can see how being explicit here is far more definitive as to what’s going on.

But here’s the thing, if we look at the source code of this here, we can see we are no better off in terms of doing things by “magic”, it’s still one big dictionary under the hood. And the really interesting part? We can actually add our own mappings! Let’s modify our code a bit :
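Something like this, with the mime type value obviously being whatever you want for the made-up .dnct extension:

```csharp
var provider = new FileExtensionContentTypeProvider();
provider.Mappings.Add(".dnct", "application/dotnetcoretutorials"); // made-up mime type for a made-up extension

if (!provider.TryGetContentType(fileName, out var contentType))
{
    contentType = "application/octet-stream";
}

return Ok(contentType);
```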

I’ve gone mad with power and created a new file extension called .dnct and mapped it to its own mime type. Everything is a cinch!

But here’s our last problem. What if we want to use this in multiple places? What if we need better control for unit testing than “instantiating” every time will really give us? Let’s create a nice mime type mapping service!

We could create this static, but then we lose a little flexibility around unit testing. So I’m going to create an interface too. Our service looks like so :
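A sketch of the service, matching the description that follows:

```csharp
public interface IMimeMappingService
{
    string Map(string fileName);
}

public class MimeMappingService : IMimeMappingService
{
    private readonly FileExtensionContentTypeProvider _contentTypeProvider;

    public MimeMappingService(FileExtensionContentTypeProvider contentTypeProvider)
    {
        _contentTypeProvider = contentTypeProvider;
    }

    public string Map(string fileName)
    {
        if (!_contentTypeProvider.TryGetContentType(fileName, out var contentType))
        {
            contentType = "application/octet-stream";
        }
        return contentType;
    }
}
```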

So we provide a single method called “Map”. And when creating our MimeMappingService, we take in a content service provider.

Now we need to head to our startup.cs and in our ConfigureServices method we need to wire up the service. That looks a bit like this :
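As a sketch:

```csharp
public void ConfigureServices(IServiceCollection services)
{
    // Build the provider, add any custom mappings, then register the service as a singleton.
    var contentTypeProvider = new FileExtensionContentTypeProvider();
    contentTypeProvider.Mappings.Add(".dnct", "application/dotnetcoretutorials");

    services.AddSingleton<IMimeMappingService>(new MimeMappingService(contentTypeProvider));

    services.AddMvc();
}
```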

So we instantiate our FileExtensionContentTypeProvider, give it our extra mappings, then bind our MimeMappingService all up so it can be injected.

In our controller we change our code to look a bit like this :
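Something along these lines (the controller name is mine):

```csharp
public class MimeTypeController : Controller
{
    private readonly IMimeMappingService _mimeMappingService;

    public MimeTypeController(IMimeMappingService mimeMappingService)
    {
        _mimeMappingService = mimeMappingService;
    }

    [HttpGet]
    public IActionResult GetMimeType(string fileName)
    {
        return Ok(_mimeMappingService.Map(fileName));
    }
}
```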

Nice and clean. And it means that any time we inject our MimeMappingService around, it has all our custom mappings contained within it!

.NET Core Static Files

There is one extra little piece of info I should really give out too. And that is if you are using the .NET Core static files middleware to serve raw files, you can also use this provider to return the correct mime type. So for example you can do things like this :
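For example, in Configure, reusing a provider that has our custom .dnct mapping:

```csharp
app.UseStaticFiles(new StaticFileOptions
{
    // contentTypeProvider is a FileExtensionContentTypeProvider with our custom mappings added.
    ContentTypeProvider = contentTypeProvider
});
```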

So now, even outside of C# code, when we are just serving the raw file of type .dnct, we will still return the correct mime type.

It’s almost a rite of passage for a junior developer to cludge together their own CSV parser using a simple string.Split(‘,’), and then subsequently work out that there is a bit more to this whole CSV thing than just separating out values by a comma. I recently took a look at my options for parsing CSVs in .NET Core; here’s what I found!

CSV Gotchas

There are a couple of CSV gotchas that have to be brought up before we dive deeper. Hopefully they should go ahead and explain why rolling your own is sometimes more pain than it’s worth.

  • A CSV may or may not have a header row. If there is a header row, then the order of the columns is not important since you can detect what is actually in each column. If there is no header row, then you rely on the order of the columns being the same. Any CSV parser should be able to both read columns based on a “header” value, and by index.
  • Any field may be contained in quotes. However fields that contain a line-break, comma, or quotation marks must be contained in quotes.
  • To re-emphasize the above, line breaks within a field are allowed within a CSV as long as they are wrapped in quotation marks, this is what trips most people up who are simply reading line by line like it’s a regular text file.
  • Quote marks within a field are notated by doing double quote marks (As opposed to say an escape character like a back slash).
  • Each row should have the same amount of columns, however in the RFC this is labelled as a “should” and not a “must”.
  • While yes, the C in CSV stands for comma, ideally a CSV parser can also handle things like TSV (That is using tabs instead of commas).

And obviously this is just for parsing CSV files into primitives, but in something like .NET we will also be needing :

  • Deserializing into a list of objects
  • Handling of Enum values
  • Custom mappings (So the header value may or may not match the name of the class property in C#)
  • Mapping of nested objects

Setup For Testing

For testing out these libraries, I wanted a typical scenario for importing. So this included the use of different primitive types (strings, decimals etc), the usage of a line break within a string (Valid as long as it’s contained within quotes), the use of quotes within a field, and the use of a “nested” object in C#.

So my C# model ended up looking like :
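The original model isn’t reproduced here; a sketch that matches the fields mentioned below (property names beyond Make, Model, Type, and Comment are my own guesses) might be:

```csharp
public enum VehicleType
{
    Car,
    Truck
}

public class Comment
{
    public string Text { get; set; }
}

public class Automobile
{
    public string Make { get; set; }
    public string Model { get; set; }
    public VehicleType Type { get; set; }
    public decimal Price { get; set; }
    public Comment Comment { get; set; }
}
```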

And our CSV file ended up looking like :

So a couple of things to point out :

  • Our “Make” is enclosed in quotes but our model is not. Both are valid.
  • The “Type” is actually an enum and should be deserialized as such
  • The “Comment” field is a bit of a pain. It contains quotes, it has a line break, and in our C# code it’s actually a nested object.

All of this in my opinion is a pretty common setup for a CSV file import, so let’s see how we go.

CSV Libraries

So we’ve read all of the above and we’ve decided that rolling our own library is just too damn hard. Someone’s probably already solved these issues for us right? As it happens, yes they have. Let’s take a look at what’s out there in terms of libraries. Realistically, I think there is only two that really matter, CSVHelper and TinyCSVParser, so let’s narrow our focus down to them.

CSVHelper

Website : https://joshclose.github.io/CsvHelper/
CSVHelper is the granddaddy of working with CSV in C#. I figured I would have to do a little bit of manual mapping for the comment field, but hopefully the rest would all work out of the box. Well… I didn’t even have to do any mapping at all. It managed to work everything out itself with nothing but the following code :
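Roughly like this, using the hypothetical Automobile model sketched earlier (note that newer versions of CsvHelper also want a CultureInfo passed to the CsvReader constructor):

```csharp
using (var reader = new StreamReader("import.csv"))
using (var csv = new CsvReader(reader))
{
    var records = csv.GetRecords<Automobile>().ToList();
}
```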

Really, really impressive. And it handled double quotes, new lines, and enum parsing all on its own.

The thing that wowed me the most about this library is how extensible it is. It can handle completely custom mappings, custom type conversion (so you could split a field into, say, a dictionary or a child list), and even has really impressive error handling options.

The fact that in 3 lines of code I’m basically done with the CSV parsing really points out how good this thing is. I could go on and on about its features, but we do have one more parser to test!

Tiny CSV Parser

Website : http://bytefish.github.io/TinyCsvParser/index.html
Next up was Tiny CSV Parser. Someone had pointed me to this one because its speed was supposed to be pretty impressive. We’ll take a look at that in a bit, but for now we just wanted to see how it handled our file.

That’s where things started to fall down. It seems that Tiny CSV doesn’t have “auto” mappings, and instead you have to create a class to map them yourself. Mine ended up looking like this :

So, right from the outset there is a bit of an overhead here. I also don’t like how I have to map a specific row index instead of a row heading. You will also notice that I had to create my own type converter for the nested comment object. I feel like this was expected, but the documentation doesn’t specify how to actually create this and I had to delve into the source code to work out what I needed to do.

And once we run it, oof! While it does handle almost all scenarios, it doesn’t handle line breaks in a field. Removing the line break did mean that we could parse everything else, including the enums, double quotes, and (eventually) the nested object.

The code to actually parse the file (Together with the mappings above) was :

So not a hell of a lot to actually parse the CSV, but it does require a bit of setup. Overall, I didn’t think it was even in the same league as CSVHelper.

Benchmarking

The thing is, while I felt CSVHelper was miles more user friendly than Tiny CSV, the entire reason the latter was recommended to me was because it’s supposedly faster. So let’s put that to the test.

I’m using BenchmarkDotNet (Guide here) to do my benchmarking. And my code looks like the following :

A quick note on this, is that I tried to keep it fairly simple, but also I had to ensure that I used “ToList()” to make sure that I was actually getting the complete list back, even though this adds quite a bit of overhead. Without it, I get returned an IEnumerable that might not actually be enumerated at all.

First I tried a 100,000 line CSV file that used our “complicated” format above (Without line breaks in fields however as TinyCSVParser does not handle these).

TinyCSVParser was quite a bit faster (4x in fact). And when it came to memory usage, it was way down on CSVHelper.

Let’s see what happens when we up the size to a million rows :

It seems pretty linear here in terms of both memory allocated, and the time it takes to complete. What’s mostly interesting here is that the file itself is only 7.5MB big at a million rows, and yet we are allocating close to 2.5GB to handle it all, pretty crazy!

So, the performance claims about TinyCSV definitely hold up: it’s much faster and uses far less memory. But to my mind CSVHelper is still the much more user friendly library if you are processing smaller files.

I was recently looking into the new Channel<T>  API in .NET Core (For an upcoming post), but while writing it up, I wanted to do a quick refresher of all the existing “queues” in .NET Core. These queues are also available in full framework (And possibly other platforms), but all examples are written in .NET Core so your mileage may vary if you are trying to run them on a different platform.

FIFO vs LIFO

Before we jump into the .NET specifics, we should talk about the concepts of FIFO and LIFO, or “First In, First Out” and “Last In, First Out”. For the concept of queues, we typically think of FIFO. So the first message put into the queue is the first one that comes out. Essentially, messages are processed in the order they go into the queue. LIFO is typically rare when it comes to queues, but in .NET there is a type called Stack<T> that works with LIFO. That is, after filling the stack with messages/objects, the last one put in would be the first one out. Essentially the order is reversed.

Queue<T>

Queue<T>  is going to be our barebones simple queue in .NET Core. It takes messages, and then pops them out in order. Here’s a quick code example :
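A minimal sketch:

```csharp
var queue = new Queue<string>();
queue.Enqueue("Hello");
queue.Enqueue("World!");

Console.WriteLine(queue.Dequeue()); // Hello
Console.WriteLine(queue.Dequeue()); // World!
```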

Pretty stock standard and not a lot of hidden meaning here. The Enqueue  method puts a message on our queue, and the Dequeue  method takes one off (In a FIFO manner). Our console app obviously prints out two lines, “Hello” then “World!”.

Barring multi threaded scenarios (which we will talk about shortly), you’re not going to find too many reasons to use this barebones queue. In a single threaded app, you might pass around a queue to process a “list” of messages, but you may find that using a List<T> within a loop is a simpler way of achieving the same result. In fact, if you look at the source code of Queue, you will see it’s actually just an implementation of IEnumerable anyway!

So how about multi threaded scenarios? It kind of makes sense that you may want to load up a queue with items, and then have multiple threads all trying to process the messages. Well using a queue in this manner is actually not threadsafe, but .NET has a different type to handle multi threading…

ConcurrentQueue<T>

ConcurrentQueue<T>  is pretty similar to Queue<T> , but is made threadsafe by a copious amount of spinlocks. A common misconception is that ConcurrentQueues are just a wrapper around a queue with the use of the lock  keyword. A quick look at the source code here shows that’s definitely not the case. Why do I feel the need to point this out? Because I often see people try and make their use of Queue<T>  threadsafe by using locks, thinking that they are doing what Microsoft does when using ConcurrentQueue, but that’s pretty far from the truth and actually takes a pretty big performance hit when doing so.

Here’s a code sample of a ConcurrentQueue :
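A sketch along the same lines as before (ConcurrentQueue lives in System.Collections.Concurrent):

```csharp
var concurrentQueue = new ConcurrentQueue<string>();
concurrentQueue.Enqueue("Hello");
concurrentQueue.Enqueue("World!");

while (concurrentQueue.TryDequeue(out var message))
{
    Console.WriteLine(message);
}
```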

So you’ll notice we can no longer just dequeue a message, we need to TryDequeue. It will return true if we managed to pop a message, and false if there is no message to pop.

Again, the main point of using a ConcurrentQueue over a regular Queue is that it’s threadsafe to have multiple consumers (Or producers/enqueuers) all using it at the same time.

BlockingCollection<T>

A blocking collection is an interesting “wrapper” type that can go over the top of any IProducerConsumerCollection<T>  type (Of which Queue<T>  and ConcurrentQueue<T>  are both). This can be handy if you have your own implementation of a queue, but for most cases you can roll with the default constructor of BlockingCollection. When doing this, it uses a ConcurrentQueue<T> under the hood making everything threadsafe (See source code here). The main reason to use a BlockingCollection is that it has a limit to how many items can sit in the queue/collection. Obviously this is beneficial if your producer is much faster than your consumers.

Let’s take a quick look :
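A sketch that reproduces the behaviour described below:

```csharp
var blockingCollection = new BlockingCollection<string>(2); // bounded to 2 items

blockingCollection.Add("Hello");
Console.WriteLine("Adding Hello");

blockingCollection.Add("World!");
Console.WriteLine("Adding World!");

// The collection is already at its bound of 2, so this Add blocks forever
// and the final message never gets printed.
blockingCollection.Add("Foo");
Console.WriteLine("Adding Foo");
```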

What will happen with this code? You will see “Adding Hello”, “Adding World!”, and then nothing… Your application will just hang. The reason is the bounded capacity we passed to the constructor.

We’ve initialized the collection to be a max size of 2. If we try and add an item where the collection is already at this size, we will just wait until a message is dequeued. How long will we wait? Well by default, forever. However we can change our add line to be :
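Something like this (the 10 second timeout is an arbitrary value for the sketch):

```csharp
if (!collection.TryAdd(message, TimeSpan.FromSeconds(10)))
{
    // TryAdd returns false if the item couldn't be added before the timeout.
    Console.WriteLine($"Could not add {message}");
}
```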

So we’ve changed our Add call to TryAdd, and we’ve specified a timespan to wait. If this timespan is hit, then the TryAdd method will return false to let us know we weren’t able to add the item to the collection. This is handy if you need to alert someone that your queue is overloaded (e.g. the consumers are stalled for whatever reason).

Stack<T>

As we talked about earlier, a Stack<T> type allows for a Last In, First Out (LIFO) queuing style. Consider the following code :
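A minimal sketch:

```csharp
using System;
using System.Collections.Generic;

var stack = new Stack<string>();

// Push puts items onto the top of the stack.
stack.Push("Hello");
stack.Push("World!");

// Pop takes them off the top, so the last item in is the first one out (LIFO).
Console.WriteLine(stack.Pop()); // World!
Console.WriteLine(stack.Pop()); // Hello
```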

The output would be "World!" then "Hello". It's rare that you would need this reversal of messages, but it does happen. Stack<T> also has its companion in ConcurrentStack<T>, and you can initialize a BlockingCollection with a ConcurrentStack within it.

Channel<T>

There is a brand new Channel<T> type released with .NET Core. Because it’s just so different from the existing queue types in .NET, I’ll have an upcoming post discussing a tonne more about how they work, and why you might use them. In the meantime the only documentation I can find is on Github from Stephen Toub here. Have a look, and see if it works for you until the next post!
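If you want a tiny taste before then, here's a minimal sketch using the System.Threading.Channels package (real code would hand the reader and writer to separate producer/consumer tasks, but this shows the basic shape):

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        var channel = Channel.CreateUnbounded<string>();

        // The Writer and Reader halves can be handed to different producers/consumers.
        await channel.Writer.WriteAsync("Hello");
        await channel.Writer.WriteAsync("World!");
        channel.Writer.Complete();

        // Keep reading until the writer signals that nothing more is coming.
        while (await channel.Reader.WaitToReadAsync())
        {
            while (channel.Reader.TryRead(out var message))
            {
                Console.WriteLine(message);
            }
        }
    }
}
```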

While working on an API that was built specifically for mobile clients, I ran into an interesting problem that I couldn't believe I hadn't hit before. When working on a REST API that deals exclusively in JSON payloads, how do you upload images? Which naturally leads to the next question: should a JSON API then start accepting multipart form data? Isn't it going to look weird that every endpoint accepts JSON payloads, but this one accepts a multipart form? And because we will want to upload metadata along with the image, we are going to have to read out form data values too. That seems so 2000's! So let's see what we can do.

While some of the examples within this post are going to be using .NET Core, I think this really applies to any language that is being used to build a RESTful API. So even if you aren’t a C# guru, read on!

Initial Thoughts

My initial thoughts sort of boiled down to a few points.

  • The API I was working on had been hardcoded in a couple of areas to really be forcing the whole JSON payload thing. Adding in the ability to accept formdata/multipart forms would be a little bit of work (and regression testing).
  • We had custom JSON serializers for things like decimal rounding that would somehow need to be replicated manually for form data endpoints if required. We are even using snake_case property names in the API (dang iOS developers!), which would have to be handled differently in the form data post.
  • And finally, is there any way to just serialize what would have been sent under a multi-part form post, and include it in a JSON payload?

What Else Is Out There?

It really became clear that I didn't know what I was doing. So, like any good developer, I looked to copy. I took a look at the public APIs of the three social media giants.

Twitter

API Doc : https://developer.twitter.com/en/docs/media/upload-media/api-reference/post-media-upload

Twitter has two different ways to upload files. The first is a "chunked" upload, which I assume exists because you can upload some pretty large videos these days. The second is a simpler way for uploading general images; let's focus on the latter.

It’s a multi-part form, but returns JSON. Boo.

The really interesting part about the API, however, is that it allows uploading the actual data in two ways. Either you can upload the raw binary data as you typically would in a multipart form post, or you can serialise the file as a Base64 encoded string and send that as a parameter.

Base64 encoding a file was interesting to me because theoretically (and, as we will see later, definitely), we can send this string data any way we like. Of all the C# SDKs I looked at, I couldn't find any actually using this Base64 method, so there weren't any great examples to go off.

Another interesting point about this API is that you are uploading “media”, and then at a later date attaching that to an actual object (For example a tweet). So if you wanted to tweet out an image, it seems like you would (correct me if I’m wrong) upload an image, get the ID returned, and then create a tweet object that references that media ID. For my use case, I certainly didn’t want to do a two step process like this.

LinkedIn

API Doc : https://developer.linkedin.com/docs/guide/v2/shares/rich-media-shares#upload

LinkedIn was interesting because it's a pure JSON API. All data POSTs contain JSON payloads, similar to the API I was creating. But wouldn't you know it, they use multipart form data for uploads too!

Similar to Twitter, they also have this concept of uploading the file first, and attaching it to where you actually want it to end up second. And I totally get that, it’s just not what I want to do.

Facebook

API Doc : https://developers.facebook.com/docs/graph-api/photo-uploads

Facebook uses a Graph API. So while I wanted to take a look at how they did things, so much of their API is not really relevant in a RESTful world. They do use multi-part forms to upload data, but it's kinda hard to say how or why that is the case. Also, at this point, I couldn't get my mind off how Twitter did things!

So Where Does That Leave Us?

Well, in a weird way I think I got what I expected: multipart forms were well and truly alive. It didn't seem like there was any great innovation in this area. In some cases, the use of multipart forms didn't look so brutal because they didn't need to upload metadata at the same time. Simply sending a file with no attached data doesn't look so out of place in a JSON API. However, I did want to send metadata in the same payload as the image, not have it as a two step process.

Twitter’s use of Base64 encoding intrigued me. It seemed like a pretty good option for sending data across the wire irrespective of how you were formatting the payload. You could send a Base64 string as JSON, XML or Form Data and it would all be handled the same. It’s definitely proof of concept time!

Base64 JSON API POC

What we want to do is just test that we can upload images as a Base64 string, and we don’t have any major issues within a super simple scenario. Note that these examples are in C# .NET Core, but again, if you are using any other language it should be fairly simple to translate these.

First, we need our upload JSON Model. In C# it would be :
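Something like this (ImageUploadModel and the property names are just my naming for the sketch):

```csharp
public class ImageUploadModel
{
    // Freetext description of the image, supplied by the user.
    public string Description { get; set; }

    // The image itself, as a Base64 encoded string.
    public string ImageData { get; set; }
}
```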

Not a whole lot to it. Just a description field that can be freetext for a user to describe the image they are uploading, and an ImageData field that will hold our Base64 string.

For our controller :
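A minimal sketch (the route, controller and action names are mine; the important parts are the Convert.FromBase64String call and echoing the bytes back out):

```csharp
using System;
using Microsoft.AspNetCore.Mvc;

[Route("api/images")]
public class ImagesController : Controller
{
    [HttpPost]
    public IActionResult Upload([FromBody] ImageUploadModel model)
    {
        // Convert the Base64 string back into the raw image bytes.
        var imageBytes = Convert.FromBase64String(model.ImageData);

        // Proof of concept: just echo the image straight back to the caller.
        return File(imageBytes, "image/jpeg");
    }
}
```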

Again, fairly damn simple. We take in the model, and C# has a great way to convert that string into a byte array, or to read it into a memory stream. Also note that, as we are just building a proof of concept, I echo the image data back out to make sure it's been received, read, and output the way I expect, but not a whole lot else.

Now let’s open up postman, our JSON payload is going to look a bit like :
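Roughly like this (the values here are placeholders):

```json
{
    "description": "A picture of my cat",
    "imageData": "/9j/4AAQSkZJRgABAQEASABIAAD..."
}
```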

I've obviously truncated imagedata down here, but a super simple way to turn an image into a Base64 string is something like this website here. I would also note that when you send your payload, it should be without the data:image/jpeg;base64, prefix that you sometimes see with online tools that convert images to strings.

Hit send in Postman and :

Great! So my image upload worked and the picture of my cat was echo’d back to me! At this point I was actually kinda surprised that it could be that easy.

Something that became very evident while doing this, though, was that the payload size was much larger than the original image. In my case, the image itself is 109KB, but the Base64 version was 149KB, so about 136% of the original image. Having a quick search around, it seems expected that a Base64 version of a file would be about 33% bigger than the original. When it comes to larger files, I worry less about sending 33% more across the wire, and more about reading the file into memory, converting it into a huge string, and then writing that out… It could cause a few issues. But for a few basic images, I'm comfortable with a 33% increase.

I will also note that there were a few code snippets around for using BSON or Protobuf to do the same thing, which may actually cut down the payload size substantially. The mentality is the same: a JSON payload with a "stringified" file.

Cleaning Up Our Code Using JSON Converters

One thing that I didn't like in our POC was that we are accepting a string that will almost certainly be converted to a byte array every single time. The great thing about using a JSON library such as JSON.net in C# is that how the client sees the model and how our backend code sees the model don't necessarily have to be exactly the same. So let's see if we can turn that string into a byte array on an API POST automagically.

First we need to create a “Custom JSON Converter” class. That code looks like :
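A sketch of such a converter using JSON.net (the class name is mine):

```csharp
using System;
using Newtonsoft.Json;

public class Base64FileJsonConverter : JsonConverter
{
    public override bool CanConvert(Type objectType) => objectType == typeof(byte[]);

    // We only care about reading incoming payloads; we never write Base64 back out (yet).
    public override bool CanWrite => false;

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        // Take the incoming string value and convert it into a byte array.
        return Convert.FromBase64String((string)reader.Value);
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        throw new NotImplementedException();
    }
}
```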

Fairly simple, all we are doing is taking a value and converting it from a string into a byte array. Also note that we are only worried about reading JSON payloads here, we don’t care about writing as we never write out our image as Base64 (yet).

Next, we head back to our model and apply the custom JSON Converter.
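Continuing the sketch from above:

```csharp
using Newtonsoft.Json;

public class ImageUploadModel
{
    public string Description { get; set; }

    // Now a byte array; the converter turns the incoming Base64 string into bytes for us.
    [JsonConverter(typeof(Base64FileJsonConverter))]
    public byte[] ImageData { get; set; }
}
```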

Note we also change the “type” of our ImageData field to a byte array rather than a string. So even though our postman test will still send a string, by the time it actually gets to us, it will be a byte array.

We will need to modify our Controller code too :
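Again, as a sketch:

```csharp
[HttpPost]
public IActionResult Upload([FromBody] ImageUploadModel model)
{
    // ImageData is already a byte array by the time the action runs.
    return File(model.ImageData, "image/jpeg");
}
```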

So it becomes even simpler. We no longer need to bother handling the Base64 encoded string anymore, the JSON converter will handle it for us.

And that’s it! Sending the exact same payload will still work and we have one less piece of plumbing to do if we decide to add more endpoints to accept file uploads. Now you are probably thinking “Yeah but if I add in a new endpoint with a model, I still need to remember to add the JsonConverter attribute”, which is true. But at the same time, it means if in the future you decide to swap to BSON instead of Base64, you aren’t going to have to go to a tonne of places and work out how you are handling the incoming strings, it’s all in one handy place.