With the announcement of .NET 5 last year, and the subsequent announcements leading up to Microsoft Build 2020, a big question has been what’s going to happen to “.NET Standard”. That sort-of framework that’s not an actual framework, just an interface that various platforms/frameworks are required to implement (but not really), and then you have to get David Fowler to write a GitHub Gist that gets shared a million times to actually explain to people what the hell this thing is.

Anyway. .NET Standard is no more (or will be, eventually). As confusing as it may be at first to get rid of something that was only created three-odd years ago, it does kind of make sense to get rid of it at this juncture.

Rewinding The “Why” We Even Had .NET Standard

Let’s take a step back and look at how and why .NET Standard came to be.

When .NET Core was first released, there was a conundrum. We had all these libraries already written for .NET Framework; did we really want to re-write them all for .NET Core? Given that the majority of early .NET Core was actually a port of .NET Framework to work cross platform, many of the classes and method signatures were identical (in fact I would go as far as to say most of them were).

Let’s use an example. Let’s say that I want to open a file inside my library using the standard File.ReadAllLines(string path) call. Now it just so happens that if you write this code in .NET Framework, .NET Core or even Mono, it takes the same parameter (a string path) and returns the same thing (a string array). Now *how* these calls read a file is up to the individual platform (for example .NET Core and Mono may have some special code to handle Mac file paths), but the result should always be the same: a string array of lines from the file.

So if I had a library that does nothing but open a file, read the lines and return them, should I really need to release that library multiple times for different frameworks? Well, that’s where .NET Standard comes in. The simplest way to think about it is that it defines a list of classes and methods that every platform agrees to implement. So if File.ReadAllLines() is part of the standard, then I can be assured that my library can be released once as a .NET Standard library, and it will work on multiple platforms.
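
To make that concrete, here’s a minimal sketch of what such a library’s project file might look like (the SDK-style csproj shown is just an illustrative example):

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <!-- Target the standard itself rather than a specific platform. Any platform that
         implements .NET Standard 2.0 (.NET Framework 4.6.1+, .NET Core 2.0+, Mono, Xamarin etc)
         can then consume this one library. -->
    <TargetFramework>netstandard2.0</TargetFramework>
  </PropertyGroup>

</Project>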

If you’re looking for a longer explanation about .NET Standard, then there’s an article I wrote over 3 years ago that is still relevant today : https://dotnetcoretutorials.com/2017/01/13/net-standard-vs-net-core-whats-difference/

TL;DR; .NET Standard provided a way for different .NET platforms to share a set of common method signatures, allowing library creators to write code once and have it run on multiple platforms.

.NET Standard Is No Longer Needed

So we come to the present day, where announcements are coming out that .NET Standard is no longer relevant (sort of). And there are two main reasons for that…

.NET Core Functionality Surpassed .NET Framework – Meaning New .NET Standard Versions Were Hard To Come By

Initially, .NET Core was a subset of .NET Framework functionality. So .NET Standard was almost a way of saying: if you wrote a library for .NET Framework, here’s how you know it will work out of the box on .NET Core. Yes, .NET Standard was also used as a way to compare functionality across other platforms like Mono, Xamarin, Silverlight, and even Windows Phone. But I feel like the majority of use cases were .NET Framework => .NET Core comparisons.

As .NET Core built up its functionality, it was still essentially trying to reach feature parity with .NET Framework. So as a new version of .NET Core got released each year, a new version of .NET Standard got released with it that was, again, almost exclusively about the common method signatures across .NET Framework <=> .NET Core. Eventually .NET Core surpassed .NET Framework, or at the very least said “we aren’t porting anything extra over”. That point is essentially .NET Standard 2.0.

But obviously work on .NET Core doesn’t stop, and new features are added to .NET Core that don’t exist in .NET Framework. Meanwhile .NET Framework updates become few and far between, until it’s announced that it’s essentially maintenance mode only (or some variation thereof). So with new features being added to .NET Core, does it make sense to add them to a new version of the standard, given that .NET Framework will never actually implement that standard? Kind of. Or at least they tried. .NET Standard 2.1 was the last release of the standard and (supposedly, although some would disagree) is implemented by both Mono and Xamarin, but not .NET Framework.

So now we have a standard that was devised to describe the parity between two big platforms, and one of those platforms is no longer going to be participating. I mean, I guess we can keep publishing new versions of the standard, but if there is only one big player actually adhering to it (and, in fact, probably defining it), then it’s kind of moot.

The Merger Of .NET Platforms Makes A Standard Double Moot

But then of course, roughly 6 months after the release of .NET Standard 2.1, came the news that .NET Framework and .NET Core are being rolled into a single .NET platform called .NET 5. Now we doubly don’t need a standard, because the two platforms whose parity we were trying to define are actually just going to become one and the same.

Now take that, and add in the fact that .NET 6 is going to roll in the Xamarin platform as well. All those .NET Standard charts where you traced your finger along the columns to check which version you should support become moot, because there’s only one row now: .NET 6.

In the future there is only one .NET platform. There is no Xamarin, no .NET Core, no Mono, no .NET Framework. Just .NET.

So I Should Stop Using .NET Standard?

This was something that got asked of me recently. If it’s all becoming one platform, do we just start writing libraries for .NET 5 going forward then? The answer is no. .NET Standard will still exist as a way to write libraries that run on .NET Framework or older versions of .NET Core. Even today, when picking a .NET Standard version for a library, you try to pick the lowest number you can feasibly go to, to ensure you support as many platforms as you can. That won’t change going forward. .NET 5 still implements .NET Standard (right back to 1.0), so any library targeting an older standard still runs on the latest version of the .NET platform.

What will change for the better are those hideously complex charts and NuGet dependency descriptions of which platforms can run a particular library/package. A few years from now it won’t be “Oh, this library is for .NET Standard 2.1. Is that for .NET Core 2.1? No, it’s for .NET Core 3+… who could have known”. Instead it will be: this library is for .NET 5, so it will work on .NET 7, no problems.


So by now you’ve probably heard a little about the various .NET 5 announcements and you’re ready to give it a try! I thought first I would give some of the cliff notes from the .NET 5 announcement, and then jump into how to actually have a play with the preview version of .NET 5 (correct as of writing this post on 2020-05-22).

Cliff Notes

  • .NET Standard is no more! Because .NET Framework and .NET Core are being merged, there is less of a need for .NET Standard. .NET Standard also covers things like Xamarin, but that’s being rolled into .NET 6 (more on that a little later), so again, no need for it.
  • .NET 5 coincides with the C# 9 and F# 5 releases (as new .NET releases typically do), but PowerShell will now also be released on the same cadence.
  • They have added a visual designer for building WinForms applications. You could theoretically build WinForms applications in .NET Core 3.X, but you couldn’t design them with quite as much functionality as you typically would.
  • .NET 5 now runs on Windows ARM64.
  • While the concept of Single File Publish already exists in .NET Core 3.X, it looks like there have been improvements so that it’s a true single exe instead of a self-extracting ZIP. Mostly for reasons around being on read-only media (e.g. a locked-down user may not be able to extract that single exe to their temp folder etc).
  • More features have been added to System.Text.Json for feature parity with Newtonsoft.Json.
  • As mentioned earlier, Xamarin will be integrated into .NET 6 so that there is a single unifying framework. It also looks like Microsoft will be doing a big push around the .NET ecosystem as a way to build apps once (in Xamarin) and deploy to Windows, Mac, iOS, Android etc. Not sure how likely this actually is, but it looks like it’s the end goal.

Setting Up The .NET 5 SDK

So as always, the first thing you need to do is head to the .NET SDK download page here : https://dotnet.microsoft.com/download/dotnet/5.0. Note that if you go to the regular download page at https://dotnet.microsoft.com/download you are only given the option to download .NET Core 3.1 or .NET Framework 4.8 (but there is a tiny little banner above telling you where you can download the preview).

Anyway, download the .NET 5 SDK installer for your OS.

After installing, you can run the dotnet --info command from a command prompt :

dotnet --info

Make sure that you do have the SDK installed correctly. If you don’t see .NET 5 in the list, the most common reason I’ve found is people installing the x86 version on their x64 PC. So make sure you get the correct installer!

Now if you use VS Code, you are all set. For any existing project you have that you want to test out running on .NET 5 (for example a small console app), all you need to do is open the .csproj file and change :

<PropertyGroup>
  <OutputType>Exe</OutputType>
  <TargetFramework>netcoreapp3.1</TargetFramework>
</PropertyGroup>

To :

<PropertyGroup>
  <OutputType>Exe</OutputType>
  <TargetFramework>net5.0</TargetFramework>
</PropertyGroup>

As noted in the cliff notes above, because going forward there is really only one framework and no standard, they ditched the whole “netcoreapp” moniker and just went with “net”. That means if you want to update any of your .NET Standard libraries, you actually need them to target “net5.0” as well. But hold fire, because there is actually no reason to bump the version of a library unless you really need something in .NET 5 (pretty unlikely!).
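
If you do still need to support both worlds from a single library, multi-targeting remains an option. A hedged sketch of a csproj that builds for both .NET Standard consumers and .NET 5:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <!-- Note the plural TargetFrameworks. NuGet picks the best match for each consumer:
         .NET Framework and older .NET Core get the netstandard2.0 build, .NET 5 gets net5.0. -->
    <TargetFrameworks>netstandard2.0;net5.0</TargetFrameworks>
  </PropertyGroup>

</Project>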

.NET 5 In Visual Studio

Now if you’ve updated your .NET Core 3.1 app to .NET 5 and try and build in Visual Studio, you may just get :

The reference assemblies for .NETFramework,Version=v5.0 were not found. 

Not great! But all we need to do is update Visual Studio to the latest version and away we go. It’s somewhat lucky that there isn’t a new Visual Studio release this year (e.g. there is no Visual Studio 2020), otherwise we would have to download yet another version of VS. So to update, inside Visual Studio simply go Help -> Check For Updates. The version you want to be on is at least 16.6, which as of right now is the latest non-preview version.

Now after installing this update for the first time, for the life of me I couldn’t work out why I could build an existing .NET 5 project, but when I went to create a new project, I didn’t have the option of creating it as .NET 5.

As it turns out, by default the non-preview version of Visual Studio can only see non-preview versions of the SDK. I guess that’s so you can keep the preview stuff all together. If you are like me and just want to start playing without having to install the preview version of VS, then you need to go Tools -> Options inside Visual Studio. Then, inside the options window under Environment, there is an option for “Preview Features”.

Tick this. Restart Visual Studio. And you are away laughing!

Do note that some templates, such as Console Applications, don’t actually prompt you for the SDK version when creating a new project; they just use the latest SDK available. In this case, your “default” for Visual Studio suddenly becomes a preview .NET SDK. Perfectly fine if you’re ready to sit on the edge, but just something to note in case this is a work machine or similar.
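
If that’s a concern, one way I know of to keep a particular folder pinned to the stable SDK is a global.json file next to your solution. A sketch (the version number is just an example, use whichever stable SDK you actually have installed):

{
  "sdk": {
    "version": "3.1.300",
    "rollForward": "latestFeature"
  }
}

With that in place, dotnet (and Visual Studio) will resolve that SDK for anything under that folder, and only projects outside it will pick up the preview.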

 


Learning basic sorting algorithms is a bit of a Computer Science 101 exercise. But many examples out there are either in pseudocode, or in languages you may not use day to day (e.g. Python). So I thought I would quickly go over the three basic sorting algorithms, and demonstrate them in C#.

Default Sorting In C#/.NET

So going over these, you’re probably going to be thinking “so which one does C# actually use?”. By that I mean, if you call Array.Sort(), does it use any of these examples? Well, the answer is “it depends”. In general, when using Sort() on a List, Array or Collection it will use :

  • If the collection has less than 16 elements, the algorithm “Insertion Sort” will be used (We will talk about this below).
  • If the number of partitions exceeds 2 * log N (where N is the size of the array), Heapsort is used.
  • Otherwise Quicksort is used.

However, this is not always the case. For example, when using LINQ on a list and calling OrderBy, Quicksort is always used as the underlying implementation.
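
To make that distinction concrete, here’s a quick sketch of the two calls side by side (both produce a sorted result, they just differ in the algorithm underneath and in whether the original collection is mutated):

var numbers = new[] { 5, 3, 8, 1 };

// In-place sort using the introspective approach described in the bullet points above.
Array.Sort(numbers);

// LINQ OrderBy leaves the source untouched and returns a new ordered sequence (requires System.Linq).
var ordered = numbers.OrderBy(x => x).ToList();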

What I’m trying to point out here is that the sorting algorithms outlined below are rarely, if ever, used in the real world, and are more likely to come up as an interview question. But they are important to understand, because other sorting algorithms are often built on top of these more “archaic” ones (it also helps you understand the jokes in Silicon Valley!).

Array vs List

I just want to point out something very important when talking about sorting algorithms in C#. When I first started programming, I couldn’t understand why examples always used arrays in their sample code. Surely since we are in C#, Lists are way cooler! And even though a fixed array is marginally faster than a List, do we really need to use arrays even in examples?

Well, the thing is, these are all “in place” sorts. That is, we do not create a second object to return the result, store state, or hold “partial” results. When we do Insertion Sort below, I’ll give an example of how this might be done more easily with a List, but it requires an additional list to be created in memory. In almost all sorting algorithms you’ll find that they work within the data structure given and don’t “clone” or select items out into a new List to return an entirely different object. Once I realized that “sorting” was not simply “give me the lowest item and I’ll just put it in this new list over here and keep going until I’ve selected out all items”, but instead almost about the most efficient way to “juggle” items inside an array, those pseudocode sort algorithms suddenly made sense.

Bubble Sort

So first up we are going to look at Bubble Sort. This is essentially the worst case scenario for sorting data, as it takes many “passes” of single swaps for things to actually sort.

Let’s look at the code :

public static void BubbleSort(int[] input)
{
    var itemMoved = false;
    do
    {
        itemMoved = false;
        for (int i = 0; i < input.Length - 1; i++)
        {
            if (input[i] > input[i + 1])
            {
                var lowerValue = input[i + 1];
                input[i + 1] = input[i];
                input[i] = lowerValue;
                itemMoved = true;
            }
        }
    } while (itemMoved);
}

Now how does Bubble Sort work? Starting at index zero, we take an item and the next item in the array and compare them. If they are in the right order, we do nothing. If they are in the wrong order (e.g. the item lower in the array actually has a higher value than the next element), then we swap them. Then we continue through each item in the array doing the same thing (swapping with the next element if it’s higher).

Now since we are only comparing each item with its neighbour, each item may only move a single place per pass when it actually needs to move several places. So how does Bubble Sort solve this? Well, it just runs the entire process all over again. Notice how we have the variable called “itemMoved”: we simply set this to true if we did swap an item, and start the scan all over again.

Because we are moving things one place at a time rather than directly to the right position, and having to do multiple passes to get things right, Bubble Sort is seen as extremely inefficient.
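
Using it is as simple as handing over the array, since the sort happens in place (the array you pass in is mutated):

var input = new[] { 9, 2, 7, 4, 1 };
BubbleSort(input);
Console.WriteLine(string.Join(", ", input)); // 1, 2, 4, 7, 9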

Insertion Sort

Next up is Insertion Sort. We still check items one by one, but what we instead do is “insert” each item at the correct index right from the get go. Unlike Bubble Sort, where we swap the item with its neighbour, we instead insert the item into the correct position given what we have already checked.

I’m actually going to show the code twice. First is what I think is your typical insertion sort :

public static void InsertionSort(int[] input)
{

    for (int i = 0; i < input.Length; i++)
    {
        var item = input[i];
        var currentIndex = i;

        while (currentIndex > 0 && input[currentIndex - 1] > item)
        {
            input[currentIndex] = input[currentIndex - 1];
            currentIndex--;
        }

        input[currentIndex] = item;
    }
}

So a quick explanation of this code.

We loop through each item in the array and get its value. Then we loop through the indexes *below* the index we started at. If our current item has a lower value than the item below it, we “shift” that item up by 1 and check the next one down. In a way it’s like a bubble sort, because we are comparing with the neighbour below, but if we do shift, we keep shifting until we find the right spot.

If we get to the start of the array (index 0), or we hit an item that has a lower value than our current item, then we “break” and simply insert our current item at that index.

But a simpler way to view the “Insertion” sort algorithm is actually by building a new list to return. For example :

public static List<int> InsertionSortNew(this List<int> input)
{
    var clonedList = new List<int>(input.Count);

    for (int i = 0; i < input.Count; i++)
    {
        var item = input[i];
        var currentIndex = i;

        while (currentIndex > 0 && clonedList[currentIndex - 1] > item)
        {
            currentIndex--;
        }

        clonedList.Insert(currentIndex, item);
    }

    return clonedList;
}

So in this example, we instead create a brand new list and slowly build it up by inserting items at the correct location. Again, not quite a true in-place sort, because we rely on things like being able to insert items at certain indexes without having to shift the items above up an index ourselves. But really, being able to insert an item at a certain index is just sugar that C#’s List takes care of for us.

Again however, generally when we talk about list sorting, we are talking about “inplace” sorting and not trying to cherry pick items out into a new object.

Selection Sort

Selection Sort is actually very very similar to Insertion Sort. The code looks like so :

public static void SelectionSort(int[] input)
{
    for (var i = 0; i < input.Length; i++)
    {
        var min = i;
        for(var j = i + 1; j < input.Length; j++) { 
            if(input[min] > input[j])
            {
                min = j;
            }
        }

        if(min != i)
        {
            var lowerValue = input[min];
            input[min] = input[i];
            input[i] = lowerValue;
        }
    }
}

What we are essentially doing is scanning the array from start to finish. For each index, we scan the rest of the array for an item that is lower (in fact, the lowest) compared to the current item. If we find one, we swap it with the current item. The fact that the current item goes into a position later in the array isn’t too important, as eventually all elements will be checked.

Now, again, this looks more complicated than it should be because of our in-place array criteria. But really, all we are doing is scanning one by one, finding the lowest remaining item in the list, putting it into the next position, and then continuing with the next lowest and so on.
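
To tie the three together, here’s a small hypothetical driver (assuming the three methods above live in the same class) that runs each in-place sort over its own copy of the same data:

static void Main(string[] args)
{
    var data = new[] { 42, 7, 19, 3, 88, 1 };

    // Each sort mutates its input, so give each one its own copy.
    var forBubble = (int[])data.Clone();
    var forInsertion = (int[])data.Clone();
    var forSelection = (int[])data.Clone();

    BubbleSort(forBubble);
    InsertionSort(forInsertion);
    SelectionSort(forSelection);

    // All three print: 1, 3, 7, 19, 42, 88
    Console.WriteLine(string.Join(", ", forBubble));
    Console.WriteLine(string.Join(", ", forInsertion));
    Console.WriteLine(string.Join(", ", forSelection));
}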

Divide and Conquer Sorting

Not featured here are “Divide and Conquer” sorting algorithms. These are things like Merge Sort and Quicksort that divide the work up into many smaller sorting operations and then combine the results at the end. These are generally the sorting algorithms you will find out in the wild, but they’re maybe a little bit past the “basics” of sorting.


This post is part of a series on using Azure CosmosDB with .NET Core

Part 1 – Introduction to CosmosDB with .NET Core
Part 2 – Azure CosmosDB with .NET Core EF Core


When I first found out EntityFramework supported Azure CosmosDB, I was honestly pretty excited. Not because I thought it would be revolutionary, but because if there was a way to get new developers using Cosmos by leveraging what they already know (Entity Framework), then that would actually be a pretty cool pathway.

But honestly, after hitting many many bumps along the road, I don’t think it’s quite there yet. I’ll first talk about setting up your own small test, and then at the end of this post I’ll riff a little on some challenges I ran into.

Setting Up EFCore For Cosmos

I’m going to focus on Cosmos only information here, and not get too bogged down in details around EF Core. If you already know EF Core, this should be pretty easy to follow!

The first thing you need to do is install the nuget package for EF Core with Cosmos. So from your Package Manager Console :

Install-Package Microsoft.EntityFrameworkCore.Cosmos

In your startup.cs, you will need a line such as this :

services.AddDbContext<CosmosDbContext>(options =>
    options.UseCosmos("CosmosEndPoint",
    "CosmosKey",
    "CosmosDatabase")
);

Now… this is the first frustration of many. There is no overload to pass in a connection string here (you know, the thing that literally every other database provider allows). So when you put this into config, you have to keep these values separated out instead of having them as part of your usual “ConnectionStrings” configuration.
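
In practice that means pulling the three values out of configuration yourself, something along these lines (the config keys and the endpoint here are placeholders, not real values):

services.AddDbContext<CosmosDbContext>(options =>
    options.UseCosmos(
        Configuration["Cosmos:Endpoint"],
        Configuration["Cosmos:Key"],
        Configuration["Cosmos:DatabaseName"])
);

And the matching section in appsettings.json:

"Cosmos": {
  "Endpoint": "https://mycosmosaccount.documents.azure.com:443/",
  "Key": "<your account key>",
  "DatabaseName": "TestDatabase"
}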

Let’s say I am trying to store the following model :

public class People
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public Address Address { get; set; }
}

public class Address
{
    public string City { get; set; }
    public string ZipCode { get; set; }
}

Then I would make my context resemble something pretty close to :

public class CosmosDbContext : DbContext
{
    public DbSet<People> People { get; set; }

    public CosmosDbContext(DbContextOptions<CosmosDbContext> options)
        : base(options)
    {
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<People>()
            .ToContainer("People")
            .OwnsOne(x => x.Address);
    }
}

Now a couple of notes here.

For reasons known only to Microsoft, the default name of the container it tries to pull from is the name of the context. e.g. when it goes to Cosmos, it looks for a container called “CosmosDbContext”, even though my DbSet itself is called People. I have no idea why it’s built like this because, again, in every other use of Entity Framework the table/container name takes after the DbSet, not the entire context. So we have to add an explicit call to map the container.

Secondly, Cosmos in EF Core seems unable to work out subdocuments on its own. I kind of understand this one, because in my model, is Address its own collection, or is it a subdocument of People? But the default should be subdocument, as it’s unlikely people are doing “joins” across CosmosDB collections, and if they are, they aren’t expecting EF Core to handle that for them through navigation properties. So if you don’t have that “OwnsOne” config, it thinks that Address is its own collection and throws a wobbly with :

'The entity type 'Address' requires a primary key to be defined. If you intended to use a keyless entity type call 'HasNoKey()'.'

And honestly, that’s all you need to get set up with EF Core and Cosmos. That’s your basic configuration!
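
From there it’s just regular EF Core usage. A quick hedged sketch of writing and reading a document through the context (assuming the context has been registered as above and injected as “context”, and that you have a using for Microsoft.EntityFrameworkCore for the async LINQ extensions):

var person = new People
{
    Id = Guid.NewGuid(),
    Name = "Joe Blogs",
    Address = new Address { City = "New York", ZipCode = "90210" }
};

context.People.Add(person);
await context.SaveChangesAsync(); // Creates the document in the "People" container.

// Reads look like any other EF Core query.
var found = await context.People
    .Where(p => p.Name == "Joe Blogs")
    .ToListAsync();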

Now Here’s The Bad

Here’s what I found while trying to set up even a simple configuration in Cosmos with EF Core.

  • As mentioned, defaulting the collection names in Cosmos to the context name is illogical. Given that in most cases you will have only a single DbContext within your application, but multiple collections you need to access, 9 times out of 10 you are going to need to re-define the container name for each DbSet.
  • The default mappings aren’t what you would expect from Cosmos. As pointed out, the fact that it can’t handle subdocuments out of the box seems strange to me, given that if I used the raw .NET Core SDK it works straight away.
  • You have no control (or less control, anyway) over naming conventions. I couldn’t find a way to use camelCase naming at all; it had to use PascalCase. I personally prefer NoSQL stores to always be camelCase, but you don’t get the option here.
  • Before I knew about it trying to connect to a collection with the same name as the context, I wasn’t getting any results back from my queries (since I was requesting data from a non-existent collection), but my code wasn’t throwing any exceptions, it just returned nothing. Maybe this is by design, but it’s incredibly frustrating that I can call a non-existent resource and not get any error message.
  • Because you might already have a DbContext for SQL Server in your project, things can become hectic when you introduce a second one for Cosmos (since you can’t use the same context). Things like migration CLI commands now need an additional flag to say which context they should run on (even though Cosmos doesn’t use migrations).

Should You Use It?

Honestly, your mileage may vary. I have a feeling that the abstraction of EF Core may be a little too much for some (e.g. the naming conventions), and that many would prefer to have a bit more control over what’s going on behind the scenes. I feel like Entity Framework really shines when working with a large number of tables with foreign keys between them via navigation properties, something that CosmosDB won’t really have. So I don’t see a great value proposition in wrangling EF Core for a single Cosmos container. But have a try and let me know what you think!


This post is part of a series on using Azure CosmosDB with .NET Core

Part 1 – Introduction to CosmosDB with .NET Core
Part 2 – Azure CosmosDB with .NET Core EF Core


I haven’t used CosmosDB an awful lot over the years, but when I have, it’s been a breeze to use. It’s not a one-size-fits-all option, so forget about it being a one-for-one replacement for something like SQL Server, but I’ve used it many a time to store large amounts of data that we “rarely” need access to. A good example is that for a chatbot project I worked on, we needed to store all the conversations in case there was ever a dispute over what was said within the chatbot. Those disputes are few and far between, and it’s a manual lookup anyway, so we don’t need crazy read rates or wild search requirements. But storing every conversation adds up over time when it comes to storage costs. Those sorts of loads are perfect for CosmosDB (which was formerly DocumentDB, by the way, just in case you wondered where that went!).

I was having a look today at all the ways you can actually talk to CosmosDB from code, and it’s pretty astounding how many options there are. Microsoft have done a really good job porting existing “APIs” from storage solutions like MongoDB, Cassandra and even their own Table Storage, which means you can basically swap one out for the other (although there are obviously big caveats thrown in there). So I thought I would do a quick series rattling off a few different ways of talking to CosmosDB with .NET Core.

Setting Up CosmosDB

Setting up CosmosDB via the Azure portal is pretty straightforward, but I do want to point out one thing: when creating the resource, you need to select the “API” that you want to use.

This *cannot* be changed at a later date. If you select the wrong API (for example you select MongoDB because that sounds interesting, and then you want to connect via SQL), then you need to create a new resource with the correct API and migrate all the data (an absolute pain). So be careful! For this example, we are going to use the Core (SQL) API.

Once created, we then need to create our “container”.

What’s A Container?

CosmosDB has the concept of a “container”, which you can kind of think of as a table. A container belongs to a database, and a database can have multiple containers. So why not just call it a table? Well, because the container may be a table, or it may be a “collection” in the MongoDB sense, or it could be a graph etc. So we call it a container as an overarching term for a collection of rows/items/documents, because CosmosDB can do them all.

Partition Keys

When creating your container, you will be asked for a Partition Key. If you’ve never used CosmosDB, or really any large data store, this may be new to you. So what makes a good partition key? You essentially want to pick a top level property of your item that has a distinct set of values which can be “bucketed”. CosmosDB uses these buckets to distribute your data across multiple servers for scalability.

So two bad examples for you :

  • A GUID ID. This is bad because it can never be “bucketed”. It’s essentially always going to be unique.
  • A User “Role” where the only options are “Member” and “Administrator”. Now we have gone the opposite way: we only have 2 distinct values to partition on, and it’s going to be very lopsided, with only a handful of users fitting into the Administrator bucket and the rest going into the Member bucket.

I just want to add that I *have* used both of the above as partition keys before. They do work, and IMO, even though they run against the recommendations from Microsoft, it’s pretty hard to shoot yourself in the foot with them when it comes to querying.

And a couple of good examples :

  • ZipCode (This is actually used as an example from Microsoft). There is a finite amount of zipcodes and people are spread out across them. There will be a decent amount of users in each zipcode assuming your application is widely used across the country.
  • You could use something like DepartmentId if you were creating a database of employees as another example.

Honestly, there is a lot more that goes into deciding on partition keys. The types of filters you will be running, and even consistency models, go into choosing one. While you are learning, you should stick to the basics above, but there are entire video series dedicated to the subject, so if your datastore is going to be hitting 10+GB in size any time soon, it would be best to do further reading.
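
For reference, you can also create the same database and container from code instead of the portal. Roughly, using the v3 SDK covered below (the names match the ones used later in this post, and the partition key is the zip code path):

// Assumes a CosmosClient has already been built (see the next section).
var database = (await client.CreateDatabaseIfNotExistsAsync("TestDatabase")).Database;

// "/address/zipCode" is the JSON path to the partition key property, 400 is the throughput in RU/s.
var container = (await database.CreateContainerIfNotExistsAsync("People", "/address/zipCode", 400)).Container;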

Our Example Container

For the purposes of this guide, I’m going to be using the following example. My data in JSON format I want to look like :

{
	"id" : "{guid}", 
	"name" : "Joe Blogs", 
	"address" : 
	{
		"city" : "New York", 
		"zipcode" : "90210"
	}

}

Pretty simple (and keeping with the easy Zipcode example). That means that my setup for my CosmosDB will look like so :

Nothing too crazy here!

Creating Items in C#

For the purpose of this demo, we are going to use the basic C# SDK to create/read items. The first thing we need to do is install the CosmosDB NuGet package, so run the following from your Package Manager Console :

Install-Package Microsoft.Azure.Cosmos

Next we need to model our data as C# classes. In the past we had to decorate the models with all sorts of attributes, but now they can be just plain POCOs.

class People
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public Address Address { get; set; }

}
    
class Address
{
    public string City { get; set; }
    public string ZipCode { get; set; }
}

Nothing too spectacular, now onto the code to create items. Just winging it inside a console application, it looks like so :

var connectionString = "";
var client = new CosmosClientBuilder(connectionString)
                    .WithSerializerOptions(new CosmosSerializationOptions
                    {
                        PropertyNamingPolicy = CosmosPropertyNamingPolicy.CamelCase
                    })
                    .Build();

var peopleContainer = client.GetContainer("TestDatabase", "People");

var person = new People
{
    Id = Guid.NewGuid(),
    Name = "Joe Blogs",
    Address = new Address
    {
        City = "New York",
        ZipCode = "90210"
    }
};

await peopleContainer.CreateItemAsync(person);

Now I just want to point out a couple of things. Firstly, I use the CosmosClientBuilder. I found that when I created the client directly and tried to change settings (like the serializer options in this case), they didn’t take effect, but when I used the builder, magically everything started working.

Secondly I want to point out that I’m using a specific naming policy of CamelCase. If you’ve used CosmosDB before you’ve probably seen things like :

[JsonProperty("id")]
public Guid Id { get; set; }

Littered everywhere, because in C# we use PascalCase, but in CosmosDB the pre-defined columns are all camelCase and there was no way to override everything at once. Personally, I prefer that JSON always be camelCase, and the above serialization setting does just that.

The rest of the code should be straightforward. We get our “container” (or table), and we call CreateItemAsync on it. And voila, the item is created :

But We Didn’t Define Schema?!

So the first thing people notice when jumping into CosmosDB (or probably most NoSQL data stores) is that we didn’t pre-define the schema we wanted to store. Nowhere in this process did we go and create the “table” that we insert data into. Instead I just told CosmosDB to store what I send it, and it does. Other than the ID, everything else is optional and it really doesn’t care what it’s storing.

This is obviously great when you are working on a brand new greenfield project, because you can basically riff and change things on the fly. As projects get bigger though, it can become frustrating when a developer “tries” something out and adds a new column, but now half your data doesn’t have that column! You’ll find that as time goes on, your models become a hodgepodge of nullable data types to handle migration scenarios or columns being added/removed.

Reading Data

There are two main ways to read data from Cosmos.

People person = null;

// Can write raw SQL, but the iteration is a little annoying. 
var iterator = peopleContainer.GetItemQueryIterator<People>("SELECT * FROM c WHERE c.id = '852ad197-a5f1-4709-b16d-5e9019d290af' " +
                                                                "AND c.address.zipCode = '90210'");
while (iterator.HasMoreResults)
{
    foreach (var item in (await iterator.ReadNextAsync()).Resource)
    {
        person = item;
    }
}

// If you prefer Linq
person = peopleContainer.GetItemLinqQueryable<People>(allowSynchronousQueryExecution: true)
                            .Where(p => p.Id == Guid.Parse("852ad197-a5f1-4709-b16d-5e9019d290af"))
                            .ToList().First();

So the first is for fans of Dapper and the like. Personally, I find it kind of unwieldy at times to get the results I want, but it does allow for more complete control. The second is obviously using LINQ.

Now I want to point something out in the LINQ example. Notice that I’m calling ToList()? That’s because the Cosmos LINQ provider does not support First/FirstOrDefault. In our case it’s an easy fix, because we can just execute the query, get our list back, and then take the first item anyway. But it’s a reminder that just because something supports LINQ doesn’t mean it supports *all* of LINQ.
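
As an aside, if the allowSynchronousQueryExecution flag above makes you uneasy (it is blocking), the LINQ provider can also hand back a feed iterator so the whole thing stays async. A sketch, assuming a using for Microsoft.Azure.Cosmos.Linq and reusing the person variable from above:

var feedIterator = peopleContainer.GetItemLinqQueryable<People>()
                                  .Where(p => p.Id == Guid.Parse("852ad197-a5f1-4709-b16d-5e9019d290af"))
                                  .ToFeedIterator();

while (feedIterator.HasMoreResults)
{
    foreach (var item in await feedIterator.ReadNextAsync())
    {
        person = item;
    }
}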

Finally, I also want to say that, generally speaking, every query you write against CosmosDB should try to include the partition key. Because we’ve used the ZipCode, is that really feasible in our example? Probably not. It would mean we would have to already have the zip code before querying for the user, which is rather unlikely. This is one of the tradeoffs you have to think about when picking a partition key, and really even when thinking about using CosmosDB or another large datastore in general.
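
One nice consequence of having the partition key on hand is that you can do a point read, which skips the query engine entirely and is the cheapest call you can make against Cosmos. A rough sketch using the id and zip code from above:

// Point read: id + partition key value, no SQL involved.
var response = await peopleContainer.ReadItemAsync<People>(
    "852ad197-a5f1-4709-b16d-5e9019d290af",
    new PartitionKey("90210"));

var fetchedPerson = response.Resource;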

Up Next

In the next part of this series, I want to talk about something really cool with CosmosDB. Using it with EntityFramework!


One of the most popular posts on this blog is a very simple write-up on how to parse JSON in .NET Core. I mostly wrote it because I thought that there was definitely a “proper” way of doing things, and people were almost going out of their way to make life difficult for themselves when working with JSON.

I think working with XML is slightly different because (just IMO), there still isn’t a “gold standard” library for XML.

Unlike JSON, which has the incredible JSON.NET library to handle everything and anything, in the majority of cases when you work with XML you’ll use one of the inbuilt XML parsers in the .NET Core framework. These can be frustrating at times and incredibly brittle. Part of it is that they were created very early in the life of .NET, and because they always need to be backwards compatible, you lose out on things like generics. The other part is that the actual XML spec, which involves things like namespaces and DTDs, while simple at first glance, can be incredibly harsh. By harsh I mean that things will just plain not work if you are missing just one piece of the puzzle, and it can take hours to work out what’s wrong.

Anyway, let’s jump right in and check out our options for working with XML in .NET.

Our Example XML File

I’m going to be using a very simple XML file that has an element, an attribute property and a list. I’ll use this same file for every option, so we are always comparing like for like.

<?xml version="1.0" encoding="utf-8" ?>
<MyDocument xmlns="http://www.dotnetcoretutorials.com/namespace">
  <MyProperty>Abc</MyProperty>
  <MyAttributeProperty value="123" />
  <MyList>
    <MyListItem>1</MyListItem>
    <MyListItem>2</MyListItem>
    <MyListItem>3</MyListItem>
  </MyList>
</MyDocument>

Using XMLReader

So the first option we have is the class XmlReader. It’s a forward-only XML parser (by that I mean you read the file almost line by line). I’ll warn you now, it’s very, very primitive. For example, our code might look a bit like so :

XmlReaderSettings settings = new XmlReaderSettings();
settings.IgnoreWhitespace = true;

using (var fileStream = File.OpenText("test.xml"))
using(XmlReader reader = XmlReader.Create(fileStream, settings))
{
    while(reader.Read())
    {
        switch(reader.NodeType)
        {
            case XmlNodeType.Element:
                Console.WriteLine($"Start Element: {reader.Name}. Has Attributes? : {reader.HasAttributes}");
                break;
            case XmlNodeType.Text:
                Console.WriteLine($"Inner Text: {reader.Value}");
                break;
            case XmlNodeType.EndElement:
                Console.WriteLine($"End Element: {reader.Name}");
                break;
            default:
                Console.WriteLine($"Unknown: {reader.NodeType}");
                break;
        }
    }
}

With the output looking like :

Unknown: XmlDeclaration
Start Element: MyDocument. Has Attributes? : True
Start Element: MyProperty. Has Attributes? : False
Inner Text: Abc
End Element: MyProperty
Start Element: MyAttributeProperty. Has Attributes? : True
Start Element: MyList. Has Attributes? : False
Start Element: MyListItem. Has Attributes? : False
Inner Text: 1
End Element: MyListItem
Start Element: MyListItem. Has Attributes? : False
Inner Text: 2
End Element: MyListItem
Start Element: MyListItem. Has Attributes? : False
Inner Text: 3
End Element: MyListItem
End Element: MyList
End Element: MyDocument

It sort of reminds me of using ADO.NET and reading data row by row, trying to store it in an object. The general idea is that because you are only parsing line by line, it’s less memory intensive. But you’re also having to handle each line individually, with any number of permutations of elements/attributes/lists etc. I think the only reason to use this method would be if you have extremely large XML files (100+MB), or you are looking for something very, very specific, e.g. you only want to read a single element from the file and you don’t want to load the entire thing while looking for that one element.
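
As a sketch of that “very specific” case, if all I wanted was the value of MyProperty and nothing else, something like this works without loading the whole document:

using (var fileStream = File.OpenText("test.xml"))
using (XmlReader reader = XmlReader.Create(fileStream))
{
    // Skips forward to the first element with this name without building a DOM.
    if (reader.ReadToFollowing("MyProperty"))
    {
        Console.WriteLine(reader.ReadElementContentAsString()); // Abc
    }
}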

Another thing I will point out is that the difficulty around XML namespaces wasn’t there with XmlReader. It just sort of powered through, and there weren’t any issues around prefixes, namespaces, DTDs etc.

But again in general, I wouldn’t use XMLReader in the majority of cases.

Using XPathDocument/XPathNavigator

So another way of getting individual XML nodes, while still being able to “search” the document, is the XPathNavigator object.

First, the code :

using (var fileStream = File.Open("test.xml", FileMode.Open))
{
    //Load the file and create a navigator object. 
    XPathDocument xPath = new XPathDocument(fileStream);
    var navigator = xPath.CreateNavigator();

    //Compile the query with a namespace prefix. 
    XPathExpression query = navigator.Compile("ns:MyDocument/ns:MyProperty");

    //Do some BS to get the default namespace to actually be called ns. 
    var nameSpace = new XmlNamespaceManager(navigator.NameTable);
    nameSpace.AddNamespace("ns", "http://www.dotnetcoretutorials.com/namespace");
    query.SetContext(nameSpace);

    Console.WriteLine("My Property Value : " + navigator.SelectSingleNode(query).Value);
}

Now honestly… this is bad, and I made it bad for a reason. Namespaces here are really painful. In my particular case, because I have a default namespace, this was the only way I could find to get the XPath working. Without the namespace, things would actually be a cinch. So with that said, I’m going to admit something here… I have totally used string replace functions to remove namespaces before. Now I know someone will jump in the comments and say “but the XML spec says blah blah blah”. I honestly think every headache I’ve ever had working with XML has been because of namespaces.

So let me put a caveat on my recommendation here. If the document you are working with does not make use of namespaces (or you are willing to remove them), and you need to use an XPath expression to get a single node, then using the XPathNavigator actually isn’t a bad option. But that’s a big if.

Using XMLDocument

XmlDocument can be thought of as an upgraded version of the XPathNavigator. It has a few easier methods to load documents, and it allows you to modify XML documents in memory too!

XmlDocument document = new XmlDocument();
document.Load("test.xml");

XmlNamespaceManager m = new XmlNamespaceManager(document.NameTable);
m.AddNamespace("ns", "http://www.dotnetcoretutorials.com/namespace");
Console.WriteLine(document.SelectSingleNode("ns:MyDocument/ns:MyProperty", m).InnerText);

Overall, you still have to deal with some namespace funny business (e.g. default namespaces are not handled great), and you still have to get each element one by one as you need it, but I do think this is the best option if you are looking to load only a small subset of the XML doc. The fact you can modify the XML and save it back to file is also a pretty good one.
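
Carrying on from the snippet above, modifying and saving is only a couple of extra lines (writing back to the same file here just as an example):

var node = document.SelectSingleNode("ns:MyDocument/ns:MyProperty", m);
node.InnerText = "Def";
document.Save("test.xml"); // Writes the modified XML back to disk.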

Using XMLSerializer

Now we are cooking with gas. XmlSerializer, in my opinion, is the very best way to parse XML in .NET Core. If you’ve used JsonConvert from JSON.NET before, then this is very close to being the same sort of setup.

First we simply create a class that models our actual XML file. We use a bunch of attributes to specify how to read the doc, which namespace we are using, even what type of element we are trying to deserialize (e.g. an attribute, element or array).

[XmlRoot("MyDocument", Namespace = "http://www.dotnetcoretutorials.com/namespace")]
public class MyDocument
{
    public string MyProperty { get; set; }

    public MyAttributeProperty MyAttributeProperty { get; set; }

    [XmlArray]
    [XmlArrayItem(ElementName = "MyListItem")]
    public List<string> MyList { get; set; }
}

public class MyAttributeProperty
{
    [XmlAttribute("value")]
    public int Value { get; set; }
}

Really really simple. And then the code to actually read our XML and turn it into this class :

using (var fileStream = File.Open("test.xml", FileMode.Open))
{
    XmlSerializer serializer = new XmlSerializer(typeof(MyDocument));
    var myDocument = (MyDocument)serializer.Deserialize(fileStream);

    Console.WriteLine($"My Property : {myDocument.MyProperty}");
    Console.WriteLine($"My Attribute : {myDocument.MyAttributeProperty.Value}");

    foreach(var item in myDocument.MyList)
    {
        Console.WriteLine(item);
    }
}

No messing about trying to get namespaces right, no trying to work out the correct XPath, it just works. I think once you start using XMLSerializer, you will wonder why you ever bothered trying to manually read out XML documents again.

Now there is a big caveat. If you don’t really care about the bulk of the document and you are just trying to get at a really deep element, it can be painful creating these huge models and classes just to get a single value.

Overall, in 99.9% of cases, try and use XMLSerializer to parse XML. It’s less brittle than other options and follows a very similar “pattern” to that of JSON serialization meaning anyone who has worked with one, can work with the other.
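
One last note: the serializer works in the other direction too. Writing the same model back out as XML is the mirror image of the read (output.xml being whatever path you like). A sketch:

using (var fileStream = File.Create("output.xml"))
{
    XmlSerializer serializer = new XmlSerializer(typeof(MyDocument));
    serializer.Serialize(fileStream, myDocument);
}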


I skipped out on University/College in my earlier years and instead opted to fully teach myself programming. While that means I can chew someone’s ear off about the latest feature in C# 8, it also means that I missed out on plenty of “Data Structure/Algorithm” style programming problems. Let’s be honest, unless you are going for a job interview it’s rare you use these anyway, especially in programming business systems. But every now and again I come across an algorithm that makes me think “how have I not heard of this before?”. One such algorithm is the “Knapsack Problem”, also sometimes known as the “Rucksack Problem”.

One of the biggest issues I found when looking for a C# implementation of this algorithm is that very rarely did anyone explain how the code worked, or even give a thorough explanation of how to use it. So this is that explanation.

What Is The Knapsack Problem?

The Knapsack Problem is this: you have a “bag” that can hold a limited weight, and a set of items to choose from, each with its own weight and “value”. How do you fill your bag so that it carries the most valuable combination of items possible?

Let’s take a real world example. A robber has broken into a jewellery store and wants to steal precious jewellery. His backpack can only hold 50KG of weight (he’s Superman). As he walks around the store thinking about what to steal, he has to do a sort of cost/benefit sum in his head to maximize his total take. As an example, with a 50KG bag, is it better to steal one 50KG item worth $100, or five 10KG items worth $50 each? Obviously the latter, because even though that single item is worth more than any one of the smaller items, it takes up all the space in the bag, whereas stealing multiple smaller items actually maximizes the value the bag can hold. This is the knapsack/rucksack problem.

Code Explanation

Before I just give you the code, I want to explain a little bit about how it actually works. Even though the code below has a tonne of comments, I still want to go that extra mile, because it can be hard to wrap your head around at first.

The code works a bit like this :

  • Create a loop to go through each jewel one by one.
  • Starting with a smaller bag, and increasing the bag size each inner loop, check if the jewel could fit inside the bag (even if we had to empty out something else).
  • If the jewel can fit, look up what the best value was at this bag size for the previous jewels. Then ask: if we take the current jewel and fill the remaining capacity with the best combination of previous jewels, is that worth more than the best we had without it? If so, store that new value in the matrix; otherwise carry forward the previous best at this weight.
  • If the jewel can’t fit into the smaller bag (e.g. the jewel weighs 30KG but the bag can only carry 20KG), then carry forward whatever the best value was for the previous jewels at this 20KG bag size.
  • Because we carry values forward, at the very end the last cell in our matrix (last jewel, full bag capacity) holds the maximum value.

… I’m going to stop here and say that this is probably a little confusing. So let’s take an example of where we might be in the current loop cycle and how this might look, by building a quick flow diagram.

Knapsack Code

Now here’s the code. Note that this is not crazy-optimized with one-letter variable names for a leetcode competition, but it’s written in a way that hopefully is easier to understand.

public class Jewel
{
	public int Weight { get; set; }
	public int Value { get; set; }
}

public static int KnapSack(int bagCapacity, List<Jewel> jewels)
{
	var itemCount = jewels.Count;

	int[,] matrix = new int[itemCount + 1, bagCapacity + 1];

	//Go through each item. 
	for (int i = 0; i <= itemCount; i++)
	{
		//This loop basically starts at 0, and slowly gets bigger. 
		//Think of it like working out the best way to fit into smaller bags and then keep building on that. 
		for (int w = 0; w <= bagCapacity; w++)
		{
			//If we are on the first loop, then set our starting matrix value to 0. 
			if (i == 0 || w == 0)
			{
				matrix[i, w] = 0;
				continue;
			}

			//Because indexes start at 0, 
			//it's easier to read if we do this here so we don't think that we are reading the "previous" element etc. 
			var currentJewelIndex = i - 1;
			var currentJewel = jewels[currentJewelIndex];

			//Is the weight of the current jewel less than W 
			//(e.g. We could find a place to put it in the bag if we had to, even if we emptied something else?)
			if (currentJewel.Weight <= w)
			{
				//If I took this jewel right now, and combined it with other gems
				//Would that be bigger than what you currently think is the best effort now? 
				//In other words, if W is 50, and I weigh 30. If I joined up with another jewel that was 20 (Or multiple that weigh 20, or none)
				//Would I be better off with that combination than what you have right now?
				//If not, then just set the value to be whatever happened with the last item 
				//(may have fit, may have done the same thing and not fit and got the previous etc). 
				matrix[i, w] = Math.Max(currentJewel.Value + matrix[i - 1, w - currentJewel.Weight]
										, matrix[i - 1, w]);
			}
			//This jewel can't fit, so bring forward what the last value was because that's still the "best" fit we have. 
			else
				matrix[i, w] = matrix[i - 1, w];
		}
	}

	//Because we carry everything forward, the very last item on both indexes is our max val
	return matrix[itemCount, bagCapacity];
}

static void Main(string[] args)
{
	var items = new List<Jewel>
		{
			new Jewel {Value = 120, Weight = 10},
			new Jewel {Value = 100, Weight = 20},
			new Jewel {Value = 500, Weight = 30},

		};

	Console.WriteLine(KnapSack(50, items));
}

I’ve included a little sample console app (so you can paste this into a new console project) to illustrate how it works. Again, I’ve commented it and used a “Jewel” class to make the example a little clearer, so that hopefully it’s easier to understand than some code golf example. Even if you don’t use C# as your main language, hopefully it’s easy to follow!


I had a friend who was taking a look through the classic “Gang Of Four” Design Patterns book for the first time. He reached out to ask me which of the design patterns I’ve actually used in business applications and have actually thought “I’m using this pattern right now”. Singleton, Factory Pattern, Mediator – I’ve used all of these and I’ve even written about them before. But one that I haven’t talked about before is the Chain Of Responsibility pattern.

What Is “Chain Of Responsibility”

The Chain Of Responsibility pattern (or, as I’ve sometimes called it, Chain Of Command) is a design pattern that allows “processing” of a request in a hierarchical fashion. The classic Wikipedia definition is

In object-oriented design, the chain-of-responsibility pattern is a design pattern consisting of a source of command objects and a series of processing objects. Each processing object contains logic that defines the types of command objects that it can handle; the rest are passed to the next processing object in the chain. A mechanism also exists for adding new processing objects to the end of this chain. Thus, the chain of responsibility is an object oriented version of the if … else if … else if ……. else … endif idiom, with the benefit that the condition–action blocks can be dynamically rearranged and reconfigured at runtime.

That probably doesn’t make much sense but let’s look at a real world example that we can then turn into code.

Let’s say I own a bank. Inside this bank I have 3 levels of employees: a Bank Teller, a Supervisor, and a Bank Manager. If someone comes in to withdraw money, the Teller can allow any withdrawal of less than $10,000, no questions asked. If the amount is more than $10,000, then the request gets passed on to the Supervisor. The Supervisor can handle requests up to $100,000, but only if the account has ID on record. If the ID is not on record, then the request must be rejected no matter what. If the requested amount is more than $100,000, it goes to the Bank Manager. The Bank Manager can approve any amount for withdrawal, even if the ID is not on record, because if someone is withdrawing that amount they are a VIP and we don’t care about ID and money laundering regulations.

This is the hierarchical “chain” that we talked about earlier, where each person tries to process the request and can then pass it on to the next. If we take this approach and map it to code (in an elegant way), we get the Chain Of Responsibility pattern. But before we go any further, let’s look at a bad way to solve this problem.

A Bad Approach

Let’s just solve this entire problem using If/Else statements.

class BankAccount
{
    bool idOnRecord { get; set; }

    void WithdrawMoney(decimal amount)
    {
        // Handled by the teller. 
        if(amount < 10000)
        {
            Console.WriteLine("Amount withdrawn by teller");
        } 
        // Handled by supervisor
        else if (amount < 100000)
        {
            if(!idOnRecord)
            {
                throw new Exception("Account holder does not have ID on record.");
            }

            Console.WriteLine("Amount withdrawn by Supervisor");
        }
        else
        {
            Console.WriteLine("Amount withdrawn by Bank Manager");
        }
    }
}

So there are a few issues with our code.

  • Adding additional levels of employees in here is really hard to manage with the mess of If/Else statements.
  • The special logic of checking ID at the supervisor level is somewhat hard to unit test because it has to pass a few other checks first.
  • While the only defining logic is for the amount withdrawn at the moment, we could add additional checks in the future (e.g. VIP customers are marked as such and are always handled by the supervisor). This logic is going to be hard to manage and could easily get out of control.

Coding Chain Of Responsibility

Let's rewrite the code a little. Instead, let's create "employee" objects that handle the logic of whether they can process the request themselves or not. On top of that, let's give them a line manager so that they know they can pass the request up if needed.

interface IBankEmployee
{
    IBankEmployee LineManager { get; }
    void HandleWithdrawRequest(BankAccount account, decimal amount);
}

class Teller : IBankEmployee
{
    public IBankEmployee LineManager { get; set; }

    public void HandleWithdrawRequest(BankAccount account, decimal amount)
    {
        if(amount > 10000)
        {
            LineManager.HandleWithdrawRequest(account, amount);
            return;
        }

        Console.WriteLine("Amount withdrawn by Teller");
    }
}

class Supervisor : IBankEmployee
{
    public IBankEmployee LineManager { get; set; }

    public void HandleWithdrawRequest(BankAccount account, decimal amount)
    {
        if (amount > 100000)
        {
            LineManager.HandleWithdrawRequest(account, amount);
            return;
        }

        if(!account.idOnRecord)
        {
            throw new Exception("Account holder does not have ID on record.");
        }

        Console.WriteLine("Amount withdrawn by Supervisor");
    }
}

class BankManager : IBankEmployee
{
    public IBankEmployee LineManager { get; set; }

    public void HandleWithdrawRequest(BankAccount account, decimal amount)
    {
        Console.WriteLine("Amount withdrawn by Bank Manager");
    }
}

We can then create the “chain” by creating the employees required along with their managers. Almost like creating an Org Chart.

var bankManager = new BankManager();
var bankSupervisor = new Supervisor { LineManager = bankManager };
var frontLineStaff = new Teller { LineManager = bankSupervisor };

We can then completely transform the BankAccount class Withdraw method to instead be handled by our front line staff member (The Teller).

class BankAccount
{
    public bool idOnRecord { get; set; }

    public void WithdrawMoney(IBankEmployee frontLineStaff, decimal amount)
    {
            frontLineStaff.HandleWithdrawRequest(this, amount);
    }
}
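
Putting it together, a quick usage sketch (using the chain of employees we built above) looks like this :

var account = new BankAccount { idOnRecord = true };

account.WithdrawMoney(frontLineStaff, 5000);   // Handled directly by the Teller
account.WithdrawMoney(frontLineStaff, 50000);  // Passed up to the Supervisor
account.WithdrawMoney(frontLineStaff, 500000); // Passed all the way up to the Bank Manager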

Now, when we make a withdrawal request, the Teller always handles it first. If it can't, it passes the request to its line manager, *whoever* that may be. So the beauty of this pattern is

  • Subsequent items in the "chain" don't need to know why things got passed to them. A Supervisor doesn't need to know why a Teller passed a request up the chain.
  • A Teller doesn't need to know the entire chain after it. Just that it passed the request to the Supervisor and it will be handled there (or further up if need be).
  • The entire org chart can be changed by introducing new employee types. For example, I could create a "Teller Manager" that handles requests between 10k -> 50k and then passes anything larger to the Supervisor. The Teller object would stay the same, the Supervisor object would stay the same, and I would just change the LineManager of the Teller to be the "Teller Manager" instead.
  • Any Unit Tests we write can focus on a single employee at a time. For example when testing a Supervisor, we don't also need to test the Teller's logic on when requests get passed up (see the sketch after this list).
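
To illustrate that last point, here's a rough sketch of what unit tests for the Supervisor could look like. The xUnit bits ([Fact] and Assert) and the stub line manager are assumptions for the example rather than part of the pattern itself :

// A stand-in line manager that simply records whether a request was escalated to it.
class StubLineManager : IBankEmployee
{
    public IBankEmployee LineManager { get; set; }
    public bool WasAskedToHandle { get; private set; }

    public void HandleWithdrawRequest(BankAccount account, decimal amount)
    {
        WasAskedToHandle = true;
    }
}

public class SupervisorTests
{
    [Fact]
    public void WithdrawalsOver100k_ArePassedToTheLineManager()
    {
        var stub = new StubLineManager();
        var supervisor = new Supervisor { LineManager = stub };

        supervisor.HandleWithdrawRequest(new BankAccount { idOnRecord = true }, 250000);

        Assert.True(stub.WasAskedToHandle);
    }

    [Fact]
    public void WithdrawalsWithoutIdOnRecord_AreRejected()
    {
        var supervisor = new Supervisor { LineManager = new StubLineManager() };

        Assert.Throws<Exception>(() =>
            supervisor.HandleWithdrawRequest(new BankAccount { idOnRecord = false }, 50000));
    }
}

Notice the tests never need a Teller at all; each link in the chain can be tested in isolation.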

Extending Our Example

While I think the above example is a great way to illustrate the pattern, you'll often find people using a method called "SetNext" instead. In general I think this is pretty uncommon in C# because we have property getters and setters. Using a "SetVariableName" method is typically a holdover from C++ (and, for me, Pascal) days, where that was the preferred way of encapsulating variables.

But on top of that, other examples also typically use an abstract class to tighten up how requests are passed along. The problem with our example above is that there is a lot of duplicated code for passing the request on to the next handler. Let's tidy that up a little bit.

There is a lot of code so bear with me. The first thing we want to do is create an abstract class that allows us to handle the withdrawal request in a standardized way. It should check the condition; if it passes, do the withdrawal, and if not, pass the request on to its line manager. That looks like so :

interface IBankEmployee
{
    IBankEmployee LineManager { get; }
    void HandleWithdrawRequest(BankAccount account, decimal amount);
}

abstract class BankEmployee : IBankEmployee
{
    public IBankEmployee LineManager { get; private set; }

    public void SetLineManager(IBankEmployee lineManager)
    {
        this.LineManager = lineManager;
    }

    public void HandleWithdrawRequest(BankAccount account, decimal amount)
    {
        if (CanHandleRequest(account, amount))
        {
            Withdraw(account, amount);
        } else
        {
            LineManager.HandleWithdrawRequest(account, amount);
        }
    }

    protected abstract bool CanHandleRequest(BankAccount account, decimal amount);

    protected abstract void Withdraw(BankAccount account, decimal amount);
}

Next we need to modify our employee classes to inherit from this BankEmployee class.

class Teller : BankEmployee, IBankEmployee
{
    protected override bool CanHandleRequest(BankAccount account, decimal amount)
    {
        if (amount > 10000)
        {
            return false;
        }
        return true;
    }

    protected override void Withdraw(BankAccount account, decimal amount)
    {
        Console.WriteLine("Amount withdrawn by Teller");
    }
}

class Supervisor : BankEmployee, IBankEmployee
{
    protected override bool CanHandleRequest(BankAccount account, decimal amount)
    {
        if (amount > 100000)
        {
            return false;
        }
        return true;
    }

    protected override void Withdraw(BankAccount account, decimal amount)
    {
        if (!account.idOnRecord)
        {
            throw new Exception("Account holder does not have ID on record.");
        }

        Console.WriteLine("Amount withdrawn by Supervisor");
    }
}

class BankManager : BankEmployee, IBankEmployee
{
    protected override bool CanHandleRequest(BankAccount account, decimal amount)
    {
        return true;
    }

    protected override void Withdraw(BankAccount account, decimal amount)
    {
        Console.WriteLine("Amount withdrawn by Bank Manager");
    }
}

So notice that in all cases, the public "HandleWithdrawRequest" method from the abstract class is called. It then calls the subclass's "CanHandleRequest", which contains our logic on whether this employee is good to go or not. If they are, their local "Withdraw" method is called, otherwise the request is passed on to the next employee.

We just need to change how we create the chain of employees like so :

var bankManager = new BankManager();

var bankSupervisor = new Supervisor();
bankSupervisor.SetLineManager(bankManager);

var frontLineStaff = new Teller();
frontLineStaff.SetLineManager(bankSupervisor);

Again, I prefer not to use the “SetX” methods, but it’s what a lot of examples use so I thought I would include it.

Other examples also put the logic of whether an employee can handle the request or not inside the actual abstract class. I personally prefer not to do this, as it means all our handlers have to have very similar logic. For example, at the moment they are all checking the amount to be withdrawn, but what if we had a particular handler that was looking for something else entirely (like a VIP flag)? Adding that logic into the abstract class for some handlers but not others would just take us back to If/Else hell.
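
As a quick sketch of that scenario, a handler checking a VIP flag could slot into the abstract class approach like so. Note that the IsVip property is purely hypothetical and doesn't exist on the BankAccount class in this post :

// Hypothetical handler : a VIP desk that handles any request for a VIP customer,
// regardless of the amount. Assumes a (hypothetical) IsVip flag on BankAccount.
class VipDesk : BankEmployee, IBankEmployee
{
    protected override bool CanHandleRequest(BankAccount account, decimal amount)
    {
        return account.IsVip;
    }

    protected override void Withdraw(BankAccount account, decimal amount)
    {
        Console.WriteLine("Amount withdrawn by VIP Desk");
    }
}

Because each subclass owns its own CanHandleRequest logic, a handler like this drops into the chain (via SetLineManager) without touching the abstract class or any of the other employees.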

When To Use The “Chain Of Responsibility” Design Pattern?

The best use cases of this pattern are where you have a very logical “chain” of handlers that should be run in order every time. I would note that forking of the chain is a variation on this pattern, but quickly becomes extremely complex to handle. For that reason, I typically end up using this pattern when I am modelling real world “chain of command” scenarios. It’s the entire reason I use a bank as an example, because it’s a real world “Chain Of Responsibility” that can be modelled in code.


As countries head into lockdown/quarantine/#StayHomeSaveLives, we find ourselves with a fair bit of free time on our hands. While I'm sure Netflix, Disney Plus, and Amazon Prime will be getting a thorough workout during this time, it's also a great opportunity to upskill and maybe gain a couple of those certifications you've been meaning to get for the past few years. Even if your aim isn't to actually sit an exam, there's no harm in using all this time to upskill and hit the ground running when this is all over.

So here’s a select few courses that I’ve personally done and can recommend. Some are programming, some are devops, some are Scrum (ugh, I know but.. hear me out). A few have some free study options but even if there is a cost, typically it’s only going to be ~$10 or so. And even if you don’t use my recommended course, often there are free alternatives out there. So let’s jump right in!

Azure Exam AZ-204 – Developing Solutions For Microsoft Azure

So let's start off with one that I'm sure many readers will think about doing in their career. The Microsoft exam AZ-204 is made for developers to expand their knowledge on everything Azure. AZ-204 used to be called AZ-203, and before that 70-532, but they are all roughly the same thing. The only difference is that as Azure keeps adding new services and features, the exam has to keep getting updated. Because the exam outline is only released once (when the exam is released), Microsoft can't just keep adding to the same exam and instead has to release a new exam number. Regardless, AZ-204, AZ-203 and 70-532 are all about developers getting to grips with Azure services.

Who Is This For?

If you are a developer and use Azure for work, sit this exam. It's that simple. I would even go as far as to say that even if you are a tester/QA, or someone that is an "almost" developer (like DevOps, report writer, SQL developer etc.), you should sit this exam. If you aren't someone who likes exams, following the study materials is still an amazing way to learn the ins and outs of a tonne of Azure services. I can't even begin to describe the number of times I've been asked a question at work and thought "huh… I have the exact answer just from flicking through that Azure book for the past few weeks".

Exam Cost

The exam itself is around $165 USD and, as of right now, you can sit the exam from your home. You do not need to go into a testing center to complete it. So you can actually do all your studying and sit the exam during the lockdown!

Check out the exam here : https://docs.microsoft.com/en-gb/learn/certifications/exams/az-204

Study Materials

There are a multitude of ways to study for this exam, so I’ll give a quick rundown of them all.

If you like studying from books, there are Microsoft Exam Ref books available on Amazon. Unfortunately a book for AZ-204 hasn't been released yet, so you would have to use the AZ-203 reference book (Kindle/Paperback on Amazon here). I personally bought the Exam Ref 70-532 book (Kindle/Paperback on Amazon here), and thought it was a bit of a mixed bag. There were typos and references to non-existent diagrams all over the place, but it was still a pretty solid book. The bonus of the Exam Ref books is that they give you a structured intro to each topic, and from there I always just went and read more on the official Azure documentation.

If you prefer online video courses, then for pretty much all Azure exams, people use the series from Scott Duffy which is available on Udemy here. I've personally used Scott Duffy's courses for both the developer and architecture exams and found them pretty much spot on. For the cost (~$14), it's kinda a no brainer. The other thing I'll mention is that I bought the course when the exam was still 70-532 and I still have access to the AZ-203 material. Scott gives all existing students access to the updated courses, so even if you buy the AZ-203 course, when that gets rolled into AZ-204 you will get access to that as well. That's not a guarantee of course, but I purchased the course in February of 2017 and I still have access right now. It's crazy good value.

Finally, Microsoft have free learning resources available that are typically a mix of articles/videos available at https://docs.microsoft.com/en-gb/learn/browse/ and https://channel9.msdn.com/Series/Microsoft-Azure-Tutorials. These are great free resources but I’ve never really found them to be that aligned to the exam. In some ways that’s a good thing since you aren’t learning specifically what’s in the exam and instead just working your way through Azure resources. But if you intend on sitting the exam, I recommend trying to stick with an Exam Ref course/book so that you know exactly what it is you should be learning.

Professional Scrum Master (PSM) – Scrum.org

Professional Scrum Master is the certification available from scrum.org. There is another certification in the scrum world called "Certified Scrum Master" that you will probably see scrum masters tack onto the end of their email signatures like it's some sort of PhD, but they test the exact same thing. PSM is just an online exam for when you think you know the material; CSM is a course you have to do in person, paying a few grand for the badge of honor. In the grand scheme of things, knowing scrum is highly advantageous in the programming world right now, irrespective of the letters you want to put in your LinkedIn name.

Who Is This For?

I actually recommend sitting the PSM exam for anyone who works in a team (or even near a team) that uses any agile methodology. You don't have to be angling for a scrum master's job to sit this exam; it really is for anyone looking to gain a broader knowledge of exactly how scrum works. I found it extremely helpful in understanding the "why" of Scrum. Why do we even bother with a sprint retrospective? If someone is away, why should we even bother doing a standup that day?

Another reason I found this exam helpful was that you learn exactly what the scrum guide actually says, and only that. The scrum guide itself is 15-odd pages of info and that's it. It seems like every workplace has its own "additions" to Scrum or its own ScrumButs, so it's good to separate workplace frameworks from what the scrum guide actually says.

Exam Cost

The exam is $150 and is purely an online exam. You can purchase it and sit it whenever you want from the comfort of your own home.

Study Materials

The incredible thing about learning scrum is you can download the scrum guide here, read it 3 times over, and sit the exam and pass. I personally think that if you have *never* used Scrum before it’s actually easier to sit the exam and pass because you only know what the guide told you. I’ve seen colleagues actually fail the exam because they think “here’s how we solve this problem at my work” rather than “the scrum guide says”.

I also used an online video course on Udemy called "Scrum Certification Prep". While there are hours of lectures, I kinda felt like they were all over the place. When I did it, there were two different instructors, with one being really hard to hear. It also seemed like a lot of the content was repetitive, which… is kinda unique to scrum in a way. Again, the scrum guide is so short and concise, but people have made careers out of running workshops for Scrum, and the only way to do that is to repeat the same things over and over.

AWS Certified Solutions Architect Associate

While I recommended the developer exam for Azure, I actually recommend the architecture exam when it comes to AWS. I found the AWS Architecture Associate exam to be more of an AWS 101 rather than getting down into the nitty gritty. Personally, most of my work ends up on Azure just because of the synergy between the .NET ecosystem and things like Azure Web Apps. But I always want to keep up to date with what's happening on AWS, so that if we run into a roadblock in Azure, there's always the option of "let's check out what AWS offers". Even if you never work with AWS, I think it's helpful to know what else is out there at a high level, and the AWS Architecture Associate exam gives you exactly that.

Who Is This For?

If you're working on AWS day to day, I think this exam is a no brainer for everyone from developers, to testers, to architects. If you are working mostly on Azure, then I think this exam/study is still beneficial, but mostly if you are in a tech lead/architect sort of role where you always want a comparison of "this is what the other cloud offers".

Exam Cost

All associate exams at AWS are $150 USD each. Again, AWS offers online exams, so you can sit the exam from the comfort of your own home, even during lockdown.

Study Materials

I've spoken to numerous people who have sat AWS exams and they all say one thing: "We used A Cloud Guru". To give you an idea of how popular A Cloud Guru's study material is, there are currently 520k students enrolled in the Udemy course here. That's insane. Again, I purchased this course in November of 2016, and I still have access to the updated materials, which is just crazy value for money.

A Cloud Guru also offer practice tests with over 200 questions for around $20. You can grab the practice exams here. If you are looking to sit the exam, I highly recommend doing the practice questions, as they may hit on some topics that you don't actually know that well, giving you the opportunity to go and study that area a bit more.


I’ve debated about posting an article on this for a long long time. Mostly because I think if you code any C# MVC/API project correctly, you should almost never run into this issue, but I’ve seen people run into the problem over and over again. What issue? Well it’s when you end up with two very interesting exceptions. I say two because depending on whether you are using System.Text.Json or Newtonsoft.JSON in your API, you will get two different error messages.

For System.Text.Json (Default for .NET Core 3+), you would see something like :

JsonException: A possible object cycle was detected which is not supported. This can either be due to a cycle or if the object depth is larger than the maximum allowed depth of 32.

And for Newtonsoft.Json (Or JSON.NET as it’s sometimes called, default for .NET Core 2.2 and lower) :

JsonSerializationException: Self referencing loop detected with type

They mean essentially the same thing, that you have two models that reference each other and will cause an infinite loop of serializing doom.

Why Does This Happen?

Before we get into how to resolve the problem, let’s have a good dig into why this happens.

Let’s assume I have an API that contains the following models :

public class StaffMember
{
    public string FirstName { get; set; }
    public Department Department { get; set; }
}

public class Department
{
    public List<StaffMember> StaffMembers { get; set; }
}

Already we can see that there is a small problem. The class StaffMember references Department, and Department references StaffMember. In normal C# code this isn't such an issue because they are simply references in memory. But when we return this model from an API, the serializer has to traverse the full object graph to produce our JSON output.

So if for example we had an API endpoint that looked like this :

[HttpGet]
public ActionResult Get()
{
    var staff = new StaffMember { FirstName = "John Smith" };
    var department = new Department();
    staff.Department = department;
    department.StaffMembers = new List<StaffMember> { staff };

    return Ok(staff);
}

We are gonna blow up.

The serializer takes our StaffMember and tries to serialize it, which points to the Department. It goes to the Department and tries to serialize it, and it finds a StaffMember. It follows that StaffMember and… goto: Step 1.

But you’re probably sitting there thinking “When have I ever created models that reference each other in such a way?!”. And you’d probably be pretty right. It’s exceedingly rare that you create these sorts of two way relationships… Except of course… when using Entity Framework relationships.

If these two classes were part of an EntityFramework model, it’s highly likely that it would look like so :

public class StaffMember
{
    public string FirstName { get; set; }
    public virtual Department Department { get; set; }
}

public class Department
{
    public virtual ICollection<StaffMember> StaffMembers { get; set; }
}

In EntityFramework (Or many other ORMs), we create two way relationships because we want to be able to traverse models both ways, and often this can create reference loops.

The Real Solution

So putting "real" in the subtitle may trigger some people, because this isn't so much a "here's the line of code to fix this problem" solution as it is a "don't do it in the first place" one.

The actual problem with the above example is that we are returning a model (our data model) that isn't fit to be serialized in the first place. In general terms, you *should not* be returning your data model directly from an API. In almost all cases, you should be returning a ViewModel from an API. And when returning a view model, you wouldn't run into the same self referencing issue.

For example (Sorry for the long code)

[HttpGet]
public ActionResult Get()
{
    var staffMember = new StaffMember { Department = new Department() }; //(Really this should actually be calling a repository etc). 

    var viewModel = new StaffMemberViewModel
    {
        FirstName = staffMember.FirstName,
        Department = new StaffMemberViewModelDepartment
        {
            DepartmentName = staffMember.Department.DepartmentName
        }
    };

    return Ok(viewModel);
}

public class StaffMemberViewModel
{
    public string FirstName { get; set; }
    public StaffMemberViewModelDepartment Department { get; set; }
}

public class StaffMemberViewModelDepartment
{
    public string DepartmentName { get; set; }
}

public class StaffMember
{
    public string FirstName { get; set; }
    public virtual Department Department { get; set; }
}

public class Department
{
    public string DepartmentName { get; set; }
    public virtual ICollection<StaffMember> StaffMembers { get; set; }
}

Here we can see we map the StaffMember data model into our fit for purpose ViewModel. It may seem overkill to create new models to get around this issue, but the reality is that this is best practice in any case. Not using ViewModels and instead returning your exact DataModel is actually going to cause a whole heap of other issues, so even if you solve the reference loop another way, you are still gonna have issues.

Global API Configuration Settings

So I've just ranted about how you really shouldn't run into this issue if you use proper view models, but let's say that you have to use a model that contains a reference loop and you have no other option. What can you do?

Newtonsoft.Json (JSON.NET)

Let’s start with if you are using Newtonsoft.Json first (If you are using .NET Core 3+, there is a guide on adding Newtonsoft back as the default JSON serializer here : https://dotnetcoretutorials.com/2019/12/19/using-newtonsoft-json-in-net-core-3-projects/).

You can then edit your startup.cs where you add in Newtonsoft to configure the ReferenceLoopHandling :

public void ConfigureServices(IServiceCollection services)
{
    services.AddControllers().AddNewtonsoftJson(x => x.SerializerSettings.ReferenceLoopHandling = Newtonsoft.Json.ReferenceLoopHandling.Ignore);
}

Now when Newtonsoft.Json runs into a loop, it simply stops serializing that tree. To me it’s still not pretty as your output essentially ends up like :

{"firstName":"John Smith","department":{"departmentName":null,"staffMembers":[]}}

Anyone reading this would at first think that the Department has no Staff Members, but in reality it just stopped serializing at that point because it detected it was about to loop.

Another thing to note : if you instead set the serializer to Serialize, like so :

public void ConfigureServices(IServiceCollection services)
{
    services.AddControllers().AddNewtonsoftJson(x => x.SerializerSettings.ReferenceLoopHandling = Newtonsoft.Json.ReferenceLoopHandling.Serialize);
}

Your website will just straight up crash. Locally, you will get an error like

The program '[23496] iisexpress.exe' has exited with code -1073741819 (0xc0000005) 'Access violation'.

But it’s essentially crashing because it’s looping forever (Like you’ve told it to!)

System.Text.Json

If you are using System.Text.Json, you are basically out of luck when it comes to reference loop support. There are numerous tickets created around this issue, but the actual feature is being tracked here : https://github.com/dotnet/runtime/issues/30820

Now the gist of it is that the feature has been added to the serializer, but it's not in a general release yet (only a preview release), and on top of that, even if you add the preview release to your project, it doesn't work as a global setting (see the last comments on the issue). It only works if you manually create a serializer (for example, serializing a model for an HttpClient call). So in general, if you are running into these issues and you don't want to edit your ViewModel, for now it looks like you will have to use Newtonsoft.Json. I will update this post when System.Text.Json gets its act together!
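
For completeness, if you do pull in the preview bits and are happy to create the serializer manually, the usage being tracked looks roughly like the sketch below. This is based on the ReferenceHandler API that System.Text.Json later shipped, so treat the exact names as an assumption rather than gospel :

var staff = new StaffMember { FirstName = "John Smith" };
var department = new Department();
staff.Department = department;
department.StaffMembers = new List<StaffMember> { staff };

// Manually created options - note this is per serializer call, not a global MVC setting.
var options = new JsonSerializerOptions
{
    ReferenceHandler = ReferenceHandler.Preserve
};

// Reference loops are written out as $id/$ref metadata instead of recursing forever.
var json = JsonSerializer.Serialize(staff, options);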

Ignoring Properties

This one is weird because if you can access the model to make these changes, then just create a damn viewmodel! But in any case, there is another way to avoid reference loops, and that is to tell the serializer not to serialize a property at all. In *both* Newtonsoft.Json and System.Text.Json there is an attribute called JsonIgnore :

public class StaffMember
{
    public string FirstName { get; set; }
    public virtual Department Department { get; set; }
}

public class Department
{
    public string DepartmentName { get; set; }
    [JsonIgnore]
    public virtual ICollection<StaffMember> StaffMembers { get; set; }
}

This means that the StaffMembers property on the Department object will simply not be output. To my mind it's a slightly better option than the one above, because instead of seeing a StaffMembers property with an empty array, you don't see it at all!
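
With that attribute in place, the output for the earlier example ends up looking roughly like this :

{"firstName":"John Smith","department":{"departmentName":null}}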
