This is a 4 part series on working with Protobuf in C# .NET. While you can start anywhere in the series, it’s always best to start at the beginning!

Part 1 – Getting Started
Part 2 – Serializing/Deserializing
Part 3 – Using Length Prefixes
Part 4 – Performance Comparisons


We’ve mentioned in previous posts that Protobuf (supposedly) will outperform many other data formats, namely JSON. And while we’ve kind of alluded to the fact that it’s “fast” and it’s “small”, we haven’t really jumped into the actual numbers.

This post will take a look across three different metrics :

  • File Size – So just how lightweight is Protobuf?
  • Serialization – How fast can we take a C# object and serialize it into Protobuf or JSON?
  • Deserialization – Given a Protobuf/JSON data format, how fast can we turn it into a C# object?

Let’s jump right in!

File Size Comparisons

Before looking at read/write performance, I actually wanted to compare how large the actual output is between Protobuf and JSON. I set up a really simple test that used the following model :

[ProtoContract]
class Person
{
    [ProtoMember(1)]
    public string FirstName { get; set; }

    [ProtoMember(2)]
    public string LastName { get; set; }

    [ProtoMember(3)]
    public List<string> Emails { get; set; }
}

And I used the following code to create an object, and write it twice. Once with Protobuf and once with JSON :

var person = new Person
{
    FirstName = "Wade",
    LastName = "G", 
    Emails = new List<string>
    {
        "[email protected]", 
        "[email protected]"
    }
};

using (var fileStream = File.Create("person.buf"))
{
    Serializer.Serialize(fileStream, person);
}

var personString = JsonConvert.SerializeObject(person);
File.WriteAllText("person.json", personString);
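
If you want to double check the sizes yourself, they can be read straight off the files on disk. A tiny sketch of my own (not part of the original test code) :

Console.WriteLine($"Protobuf : {new FileInfo("person.buf").Length} bytes");
Console.WriteLine($"JSON : {new FileInfo("person.json").Length} bytes");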

The results were :

Format     File Size
Protobuf   46 bytes
JSON       85 bytes

So just by default, Protobuf is almost half the size. Obviously your mileage may vary depending on your data types and even your property names.

That last point is important because while Protobuf has other mechanisms keeping the size down, a big part of it is that fields are serialized with integer identifiers rather than their string property names. To illustrate this, I modified the model to look like so :

[ProtoContract]
class Person
{
    [ProtoMember(1)]
    [JsonProperty("1")]
    public string FirstName { get; set; }

    [ProtoMember(2)]
    [JsonProperty("2")]
    public string LastName { get; set; }

    [ProtoMember(3)]
    [JsonProperty("3")]
    public List<string> Emails { get; set; }
}

So now our JSON will be serialized with single digit names as well. When running this, our actual comparison table looks like so :

Format                       File Size
Protobuf                     46 bytes
JSON                         85 bytes
JSON With Digit Properties   65 bytes

So half of Protobuf’s size advantage instantly disappears! I’m not going to use the single digit properties going forward because it’s not illustrative of what happens in the real world with JSON, but it’s an interesting little footnote that you can shrink your disk footprint with just this one simple hack that storage providers hate.

So overall, Protobuf has JSON beat when it comes to file size. That’s no surprise, but what about actual performance when working with objects?

Serialization Performance

Next, let’s take a look at serialization performance. There are a couple of notes on the methodology behind this :

  • Because Protobuf serializes to bytes and JSON to strings, I wanted to leave them like that. e.g. I did not take the JSON string and convert it into bytes, as this would artificially add overhead where there is no need.
  • I kept everything in memory (I did not write to a file etc)
  • I wanted to try and use *both* JSON.NET and Microsoft’s JSON Serializer. The latter is almost certainly going to be faster, but the former probably has more use cases out there in the wild.
  • For now, I’m just using the Protobuf.NET library for everything related to Protobuf
  • Use Protobuf as the “baseline”, so everything will be compared on how much slower (Or faster, you never know!) it is than Protobuf

With that in mind, here’s the benchmark using BenchmarkDotNet (Quick guide if you haven’t seen it before here : https://dotnetcoretutorials.com/2017/12/04/benchmarking-net-core-code-benchmarkdotnet/)

public class ProtobufVsJSONSerializeBenchmark
{
    static Person person = new Person
    {
        FirstName = "Wade",
        LastName = "G",
        Emails = new List<string>
        {
            "[email protected]",
            "[email protected]"
        }
    };

    [Benchmark(Baseline = true)]
    public byte[] SerializeProtobuf()
    {
        using(var memoryStream = new MemoryStream())
        {
            ProtoBuf.Serializer.Serialize(memoryStream, person);
            return memoryStream.ToArray();
        }
    }

    [Benchmark]
    public string SerializeJsonMicrosoft()
    {
        return System.Text.Json.JsonSerializer.Serialize(person);
    }

    [Benchmark]
    public string SerializeJsonDotNet()
    {
        return Newtonsoft.Json.JsonConvert.SerializeObject(person);
    }
}

And the results?

Format           Average Time   Baseline Comparison
Protobuf         680ns          (baseline)
Microsoft JSON   743ns          9% Slower
JSON.NET         1599ns         135% Slower

So we can see that Protobuf is indeed faster, but not by a heck of a lot. And of course, I’m willing to bet a keen eyed reader will drop a comment below and tell me how the benchmark could be improved to make Microsoft’s JSON serializer even faster.

Of course JSON.NET is slower, and that is to be expected, but again I’m surprised that Protobuf, while fast, isn’t *that* much faster. How about deserialization?

Deserialization Performance

We’ve done serialization, so let’s take a look at the reverse – deserialization.

I do want to point out one thing before we even start, and that is that JSON.NET and Microsoft’s JSON library handle case sensitivity with JSON *very* differently. In fact, JSON.NET is case insensitive by default, and that’s the *only* way it can run. Microsoft’s JSON library is case sensitive by default and must be switched to handle case insensitivity at a huge cost. I have an entire article dedicated to the subject here : https://dotnetcoretutorials.com/2020/01/25/what-those-benchmarks-of-system-text-json-dont-mention/

In some ways, that somewhat invalidates our entire test (At least when comparing JSON.NET to Microsoft’s JSON), because it actually entirely depends on whether your JSON is in the exact casing you require (In most cases that’s going to be PascalCase), or if it’s in camelCase (In which case you take a performance hit). But for now, let’s push that aside and try our best to create a simple benchmark.
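
For reference, flipping System.Text.Json into case insensitive mode is done through JsonSerializerOptions. A minimal sketch of my own (assuming camelCaseJson holds a camelCase JSON string, this is not part of the benchmark below) :

var options = new System.Text.Json.JsonSerializerOptions
{
    PropertyNameCaseInsensitive = true
};

// Now camelCase JSON maps onto our PascalCase properties, at the performance cost mentioned above
var person = System.Text.Json.JsonSerializer.Deserialize<Person>(camelCaseJson, options);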

Other things to note :

  • Again, I want to work with the format that suits each serializer. So Protobuf will be deserializing from a byte array, and JSON will be deserializing from a string
  • I *had* to create a memory stream for Protobuf, at least without making the test more complicated than it needed to be.

public class ProtobufVsJSONDeserializeBenchmark
{
    public static Person person = new Person
    {
        FirstName = "Wade",
        LastName = "G",
        Emails = new List<string>
        {
            "[email protected]",
            "[email protected]"
        }
    };

    static byte[] PersonBytes;
    static string PersonString;

    [GlobalSetup]
    public void Setup()
    {
        using (var memoryStream = new MemoryStream())
        {
            ProtoBuf.Serializer.Serialize(memoryStream, person);
            PersonBytes = memoryStream.ToArray();
        }

        PersonString = JsonConvert.SerializeObject(person);
    }

    [Benchmark(Baseline = true)]
    public Person DeserializeProtobuf()
    {
        using (var memoryStream = new MemoryStream(PersonBytes))
        {
            return ProtoBuf.Serializer.Deserialize<Person>(memoryStream);
        }
    }

    [Benchmark]
    public Person DeserializeJsonMicrosoft()
    {
        return System.Text.Json.JsonSerializer.Deserialize<Person>(PersonString);
    }

    [Benchmark]
    public Person DeserializeJsonDotNet()
    {
        return Newtonsoft.Json.JsonConvert.DeserializeObject<Person>(PersonString);
    }
}

I know it’s a big bit of code to sift through but it’s all relatively simple. We are just deserializing back into a Person object. And the results?

Format           Average Time   Baseline Comparison
Protobuf         1.019us        (baseline)
Microsoft JSON   1.238us        21% Slower
JSON.NET         2.598us        155% Slower

So overall, Protobuf wins again and by a bigger margin this time than our Serialization effort (When it comes to percentage). But again, your mileage will vary heavily depending on what format your JSON is in.

Conclusion

The overall conclusion is that indeed, Protobuf is faster than JSON by a reasonable margin, or a huge margin if comparing it to JSON.NET. However, in some respects a big part of the difference is likely to lie in how JSON is always serialized as strings versus the direct byte serialization of Protobuf. But that’s just a hunch of mine.

When it comes to file size, Protobuf wins out again, *especially* when JSON is serialized with its full property names. Obviously here we are talking about a difference of a few bytes, but when you are storing say 500GB of data in Protobuf, that same data would be roughly 1000GB in JSON, so it definitely adds up.

That’s all I’m doing on Protobuf for a bit and I hope you’ve learnt something a bit new. Overall, just in my personal view, don’t get too dragged into the hype. Protobuf is great and it does what it says on the tin. But it’s just another data format, nothing to be afraid of!


This is a 4 part series on working with Protobuf in C# .NET. While you can start anywhere in the series, it’s always best to start at the beginning!

Part 1 – Getting Started
Part 2 – Serializing/Deserializing
Part 3 – Using Length Prefixes
Part 4 – Performance Comparisons


In the last post in this series, we looked at how we can serialize and deserialize a single piece of data to and from Protobuf. For the most part, this is going to be your bread and butter way of working with Protocol Buffers. But there’s actually a slightly “improved” way of serializing data that might come in handy in certain situations, and that’s using “Length Prefixes”.

What Are Length Prefixes In Protobuf?

Length Prefixes sound a bit scary but really it’s super simple. Let’s first start with a scenario of “why” we would want to use length prefixes in the first place.

Imagine that I have multiple objects that I want to push into a single Protobuf stream. Let’s say using our example from previous posts, I have multiple “Person” objects that I want to push across the wire to another application.

Because we are sending multiple objects at once, and they are all encoded as bytes, we need to know when one person ends, and another begins. There are really two ways to solve this :

  • Have a unique byte code that won’t appear in your data, but can be used as a “delimiter” between items
  • Use a “Length Prefix” whereby the first byte (Or bytes) in the stream says how long the first object is; after reading that many bytes, you read the next prefix to figure out how long the next item is.

I’ve actually seen *both* options used with Protobuf, but the more common one these days is the latter. Mostly because it’s pretty fail safe (You don’t have to pick some special delimiter character), but also because you know ahead of time how large the upcoming object is (You don’t have to just keep reading blindly until you reach a special byte character).

I’m not much of a photoshop guy, so here’s how the stream of data might look in MS Paint :

When reading this data, it might work like so :

  • Read the first 4 bytes to understand how long Message 1 will be
  • Read exactly that many bytes and store as Message 1
  • We can now read the next 4 bytes to understand exactly how long Message 2 will be
  • Read exactly that many bytes and store as Message 2

And so on, and we could actually do this forever if the stream was a constant pump of data. As long as we read the length prefix first to know how long the next message is, we don’t need any other way of breaking the messages up. And again, it’s a boon to us to use this method because we always know how much data is coming before we read the message itself.
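
To make that a little more concrete, here’s a rough sketch of what reading one frame by hand might look like. This is purely illustrative (it is not what Protobuf.NET does internally, line for line), and it assumes a little endian 4 byte prefix, which is (as far as I know) what PrefixStyle.Fixed32 uses :

static byte[] ReadNextMessage(Stream stream)
{
    // Read the 4 byte length prefix
    var lengthBytes = new byte[4];
    if (stream.Read(lengthBytes, 0, 4) < 4)
        return null; // End of stream, no more messages

    var length = BitConverter.ToInt32(lengthBytes, 0);

    // Read exactly that many bytes as the message body
    var messageBytes = new byte[length];
    var totalRead = 0;
    while (totalRead < length)
    {
        var read = stream.Read(messageBytes, totalRead, length - totalRead);
        if (read == 0)
            throw new EndOfStreamException("Stream ended mid message");
        totalRead += read;
    }

    return messageBytes;
}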

In all honesty, Length Prefixing is not Protobuf specific. After all, the data following the prefix could be in *any* format, but Protobuf is probably one of the few data formats that has it really baked in. So much so that of course our Protobuf.NET library from earlier posts has out of the box functionality to handle it! So let’s jump into that now.

Using Protobuf Length Prefixes In C# .NET

As always, if you’re only just jumping into this post without reading the previous ones in the series, you’ll need to install the Protobuf.NET library by using the following command on your package manager console.

Install-Package protobuf-net

Then the code to serialize multiple items to the same data stream might look like so :

var person1 = new Person
{
    FirstName = "Wade",
    LastName = "G"
};

var person2 = new Person
{
    FirstName = "John",
    LastName = "Smith"
};

using (var fileStream = File.Create("persons.buf"))
{
    Serializer.SerializeWithLengthPrefix(fileStream, person1, PrefixStyle.Fixed32);
    Serializer.SerializeWithLengthPrefix(fileStream, person2, PrefixStyle.Fixed32);
}

This is a fairly verbose example to write to a file, but obviously you could be writing to any data stream, looping through a list of people etc. The important thing is that our Serialize call changes to “SerializeWithLengthPrefix”.

Nice and easy!

And then to deserialize, there are some tricky things to look out for. But our basic code might look like so :

using (var fileStream = File.OpenRead("persons.buf"))
{
    Person person = null;
    do
    {
        person = Serializer.DeserializeWithLengthPrefix<Person>(fileStream, PrefixStyle.Fixed32);
    } while (person != null);
}

Notice how we actually *loop* the DeserializeWithLengthPrefix call. This is because if there are multiple items within the stream, calling this method will return *one* item each time it’s called (And also move the stream to the start of the next item). If we reach the end of the stream and call it again, we instead get back null.

Alternatively, you can call DeserializeItems to instead return an IEnumerable of items. This is actually very similar to deserializing one at a time because the IEnumerable is lazy loaded.

using (var fileStream = File.OpenRead("persons.buf"))
{
    var persons = Serializer.DeserializeItems<Person>(fileStream, PrefixStyle.Fixed32, -1);
}
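
Because the enumerable is lazy, you just enumerate it while the stream is still open, something like :

using (var fileStream = File.OpenRead("persons.buf"))
{
    // Each iteration reads one length prefix plus one Person from the stream
    foreach (var person in Serializer.DeserializeItems<Person>(fileStream, PrefixStyle.Fixed32, -1))
    {
        Console.WriteLine(person.FirstName);
    }
}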

Because the Protobuf.NET library is so easy to use, I don’t want to really dive into every little overloaded method. But the important thing to understand is that when using Length Prefixes, we can push multiple pieces of data to the same stream without any additional legwork required. It’s really great!

Of course, all of this isn’t really worth it unless there is some sort of performance gains right? And that’s what we’ll be looking at in the next part of this series. Just how does ProtoBuf compare to something like JSON? Take a look at all of that and more here : https://dotnetcoretutorials.com/2022/01/18/protobuf-in-c-net-part-4-performance-comparisons/


This is a 4 part series on working with Protobuf in C# .NET. While you can start anywhere in the series, it’s always best to start at the beginning!

Part 1 – Getting Started
Part 2 – Serializing/Deserializing
Part 3 – Using Length Prefixes
Part 4 – Performance Comparisons


In our last post, we spent much of the time talking about how proto contracts work. But obviously that’s all for nothing if we don’t start serializing some data. Thankfully for us, the Protobuf.NET library takes almost all of the leg work out of it, and we more or less follow the same paradigms that we did when working with XML or JSON in C#.

Of course, if you haven’t already, install Protobuf.NET into your application using the following package manager console command :

Install-Package protobuf-net

I’m going to be using the same C# contract we used in the last post. But for reference, here it is again.

[ProtoContract]
class Person
{
    [ProtoMember(1)]
    public string FirstName { get; set; }

    [ProtoMember(2)]
    public string LastName { get; set; }

    [ProtoMember(3)]
    public List<string> Emails { get; set; }
}

And away we go!

Serializing Data

To serialize or write our data in protobuf format, we simply need to take our object and push it into a stream. An in memory example (For example if you needed a byte array to send somewhere else), would look like this :

var person = new Person
{
    FirstName = "Wade",
    LastName = "Smith",
    Emails = new List<string>
    {
        "[email protected]", 
        "[email protected]"
    }
};

using(var memoryStream = new MemoryStream())
{
    Serializer.Serialize(memoryStream, person);
    var byteArray = memoryStream.ToArray();
}

So ignoring our set up code there for the Person object, we’ve basically serialized in 1 line of code (Or 5, if you want to count setting up the memory stream). Pretty trivial, and it makes all that talk about Protobuf being some sort of voodoo really just melt away.

If we wanted to, we could instead serialize directly to a file like so :

using (var fileStream = File.Create("person.buf"))
{
    Serializer.Serialize(fileStream, person);
}

This leaves us with a person.buf file locally. Of course, if we open this file in a text editor it’s unreadable (Protobuf is not human readable when serialized), but we can use a tool such as https://protogen.marcgravell.com/decode to open the file and tell us what’s inside of it.

Doing that, we get :

Field #1: 0A String Length = 4, Hex = 04, UTF8 = “Wade”
Field #2: 12 String Length = 5, Hex = 05, UTF8 = “Smith”
Field #3: 1A String Length = 20, Hex = 14, UTF8 = “[email protected] …” (total 20 chars)
Field #3: 1A String Length = 18, Hex = 12, UTF8 = “[email protected] …” (total 18 chars)

Notice that the fields within our protobuf file are identified by their integer identifier, *not* by their string property name. Again, this is important to understand because we need the same proto contract identifiers on both ends to know that Field 1 is actually a person’s first name.
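
As a small side note for the curious, the leading bytes in that dump (0A, 12, 1A) are the field “tags”. In the protobuf wire format, a tag is the field number shifted left by 3 bits, combined with a 3 bit wire type (2 means length-delimited, which is what strings use). Pulling one apart looks like this :

byte tag = 0x0A;             // First byte of our person.buf
var fieldNumber = tag >> 3;  // 0x0A >> 3 = 1   -> FirstName
var wireType = tag & 0x07;   // 0x0A & 0x07 = 2 -> length-delimited (string)
Console.WriteLine($"Field {fieldNumber}, wire type {wireType}");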

Well that’s serialization done, how about deserializing?

Deserializing Data

Of course if serializing data can be done in 1 line of code, deserializing or reading back the data is going to be just as easy.

using (var fileStream = File.OpenRead("person.buf"))
{
    var myPerson = Serializer.Deserialize<Person>(fileStream);
    Console.WriteLine(myPerson.FirstName);
}

This is us simply reading a file and deserializing it into our myPerson object. It’s trivial and really straightforward if I’m being honest, and there actually isn’t too much to deep dive into.
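
If your data is already sitting in memory as a byte array rather than a file, the exact same call works, you just wrap the bytes in a MemoryStream first. A quick sketch of my own, reusing the person object from above :

byte[] personBytes;
using (var memoryStream = new MemoryStream())
{
    Serializer.Serialize(memoryStream, person);
    personBytes = memoryStream.ToArray();
}

using (var memoryStream = new MemoryStream(personBytes))
{
    var myPerson = Serializer.Deserialize<Person>(memoryStream);
    Console.WriteLine(myPerson.FirstName);
}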

That is.. Until we start talking about length prefixes. Length prefixes are Protobuf’s way of serializing several pieces of data into the same data stream. So imagine we have 5 people, how can we store 5 people in the same file or data stream and know when one person’s data ends, and another begins? In the next part of this series we’ll be taking a look at just how that works with Protobuf.NET! Check it out : https://dotnetcoretutorials.com/2022/01/14/protobuf-in-c-net-part-3-using-length-prefixes/


This is a 4 part series on working with Protobuf in C# .NET. While you can start anywhere in the series, it’s always best to start at the beginning!

Part 1 – Getting Started
Part 2 – Serializing/Deserializing
Part 3 – Using Length Prefixes
Part 4 – Performance Comparisons


I had just started programming in C# when JSON started gaining steam as the “XML Killer”. Being new to software development, I didn’t really have a horse in the race, but I was fascinated by the almost tribal level of care people put into such a simple thing as competing data formats.

Surprisingly, Google actually released Protobuf (Or Protocol Buffers) in 2008, but I think it’s only just started to pick up steam (Or maybe that’s just in the .NET world). I recently worked on a project that used it, and while not to the level of JSON vs XML, I still saw some similarities in how Protobuf was talked about. Mainly that it was almost made out to be some voodoo world changing piece of technology. All I could think was “But.. It’s just a data serialization format right?”.

The Protobuf docs (just in my view) are not exactly clear in spelling out just what Protobuf is and how it works. Mainly I think that’s because much of the documentation out there takes a language neutral approach to describing how it works. But imagine if you were just learning XML, and you learnt all of the intricacies of XML namespaces, declarations, or entities before actually doing the obvious and serializing a simple piece of data down, looking at it, then deserializing it back up.

That’s what I want to do with this article series. Take Protobuf and give you a dead simple overview with C# in mind and show you just how obvious and easy it really is.

Defining Proto Message Contracts

The first thing we need to understand is the Proto Message Contract. These look scary and maybe even a bit confusing as to what they are used for, but they are actually dead simple. A proto message definition would look like this (In proto format) :

syntax="proto3";

message Person {
  string firstName = 1;
  string lastName = 2;
  repeated string emails = 3;
}

Just look at this like any other class definition in any language :

  • We have our message name (Person)
  • We have our fields and their types (For example firstName is a string)
  • We can have “repeated” fields (Basically arrays/lists in C#)
  • We have an integer identifier for each field. This integer is used *instead* of the field name when we serialize. For example, if we serialized someone with the first name Bob, the serialized content would not have “firstName=’bob'”, it would have “1=’bob'”.

The last point there may be tricky at first, but just think about it like this : using numerical identifiers for each field means you can save a lot of space when dealing with big data, because you don’t have to store the entire field name every time you serialize.

These contracts are nothing more than a universal way to describe what a message looks like when we serialize it. In my view, it’s no different than an XML or JSON schema. Put simply, we can take this contract and give it to anyone and they will know what the data will look like when we send it to them.

If we take this proto message and paste it into a converter like the one by Marc Gravell here : https://protogen.marcgravell.com/ , we can see what a generated C# representation of this data model will look like (And a bit more on this later!).

The fact is, if you are talking between two systems with Protobuf, you may not even need to worry about ever writing or seeing contracts in this format. It’s really no different than someone flicking you an email with something like :

Hey about that protobuf message, it’s going to be in this format :

Firstname will be 1. It’s a string.
LastName will be 2. It’s also a string.
Emails will be 3, and it’s going to be an array of strings

It’s that simple.

Proto Message Contracts In C#

When it comes to working with JSON in C# .NET, you have JSON.NET, so it only makes sense when you are working with Protobuf in C# .NET you have… Protobuf.NET (Again by the amazing Marc Gravell)! Let’s spin up a dead simple console application and add the following package via the package manager console :

Install-Package protobuf-net

Now I will say there are actually a few Protobuf C# libraries floating around, including one by Google. But what I typically find is that these are converted Java libraries, and as such they don’t really conform to how C# is typically written. Protobuf.NET on the other hand is very much a C# library from the bottom up, which makes it super easy and intuitive to work with.

Let’s then take our person class, and use a couple of special attributes given to us by the Protobuf.NET library :

[ProtoContract]
class Person
{
    [ProtoMember(1)]
    public string FirstName { get; set; }

    [ProtoMember(2)]
    public string LastName { get; set; }

    [ProtoMember(3)]
    public List<string> Emails { get; set; }
}

If we compare this to our proto contract from earlier, it’s a little less scary right? It’s just a plain old C# class, but with a couple of attributes to ensure that we are serializing to the correct identifiers.

I’ll also point something else out here : because we are using integer identifiers, the casing of our properties no longer matters at all. Coming from the C# world where we love PascalCase, this is enormously easy on the eyes. But even more so, when we take a look at performance a bit later on in this series, it will become even clearer what a good decision this is, because we no longer have to fiddle around parsing strings, or worrying about whether the casing is right or not.

I’ll say it again : if you have an existing proto message contract given to you (For example, someone else is building an application in Java and they have given you the contract only), you can simply run it through Marc Gravell’s Protogen tool here : https://protogen.marcgravell.com/

It does generate a bit of a verbose output :

[global::ProtoBuf.ProtoContract()]
public partial class Person : global::ProtoBuf.IExtensible
{
    private global::ProtoBuf.IExtension __pbn__extensionData;
    global::ProtoBuf.IExtension global::ProtoBuf.IExtensible.GetExtensionObject(bool createIfMissing)
        => global::ProtoBuf.Extensible.GetExtensionObject(ref __pbn__extensionData, createIfMissing);

    [global::ProtoBuf.ProtoMember(1)]
    [global::System.ComponentModel.DefaultValue("")]
    public string firstName { get; set; } = "";

    [global::ProtoBuf.ProtoMember(2)]
    [global::System.ComponentModel.DefaultValue("")]
    public string lastName { get; set; } = "";

    [global::ProtoBuf.ProtoMember(3, Name = @"emails")]
    public global::System.Collections.Generic.List<string> Emails { get; } = new global::System.Collections.Generic.List<string>();

}

But for larger contracts it may just work well as a scaffolding tool for you!

So defining contracts is all well and good, how do we go about Serializing the data? Let’s check that out in Part 2! https://dotnetcoretutorials.com/2022/01/13/protobuf-in-c-net-part-2-serializing-deserializing/


Now that the flames have simmered down on the Hot Reload Debacle, maybe it’s time again to revisit this feature!

I legitimately feel this is actually one of the best things to be released with .NET in a while. The amount of frustrating times I’ve had to restart my entire application because of one small typo… whereas now it’s Hot Reload to the rescue!

It’s actually a really simple feature so this isn’t going to be too long. You’ll just have to give it a crack and try it out yourself. In short, it looks like this when used :

In short, I have a console application that is inside a never ending loop. I can change the Console.WriteLine text, and immediately see the results of my change *without* restarting my application. That’s the power of Hot Reload!

And it isn’t just limited to Console Applications. It (should) work with Web Apps, Blazor, WPF applications, really anything you can think of. Obviously there are some limitations. Notably, if you edit your application startup (Or other run-once type code), your application will hot reload, but it won’t re-run code that has already executed, meaning you’ll need to restart your application to get that startup code running again. I’ve also at times had Hot Reload fail with various errors, usually meaning I just restart and we are away again.

Honestly, one of the biggest things to get used to is the mentality of Hot Reload actually doing something. It’s very hard to “feel” like your changes have been applied. If I’m fixing a bug, and I do a reload and the bug still exists…. It’s hard for me to not stop the application completely and restart just to be sure!

Hot Reload In Visual Studio 2022

Visual Studio 2019 *does* have hot reload functionality, but it’s less featured (At least for me). Thus I’m going to show off Visual Studio 2022 instead!

All we need to do is edit our application while it’s running, then look to our nice little task bar in Visual Studio for the following icon :

That little icon with two fishes swimming after each other (Or.. at least that’s what it looks like to me) is Hot Reload. Press it, and you are done!

If that’s a little too labour intensive for you, there is even an option to Hot Reload on file save.

If you’re coming from a front end development background you’ll be used to file watchers recompiling your application on save. On larger projects I’ve found this to maybe be a little bit more pesky (If Hot Reload is having issues, having popups firing off every save is a bit annoying), but on smaller projects I’ve basically run this without a hitch every time.

Hot Reload From Terminal

Hot Reload from a terminal or command line is just as easy. Simply run the following from your project directory :

dotnet watch

Note *without* typing run after it (Just in case you used to use “dotnet watch run”). And that’s it!

Your application will now run with Hot Reload on file save switched on! Doing this you’ll see output looking something like

watch : Files changed: F:\Projects\Core Examples\HotReload\Program.cs~RF1f7ccc54.TMP, F:\Projects\Core Examples\HotReload\Program.cs, F:\Projects\Core Examples\HotReload\qiprgg31.zfd~
watch : Hot reload of changes succeeded.

And then you’re away laughing again!

TL;DR; Check out the new Q&A Section here : https://qna.dotnetcoretutorials.com/

It’s almost 5 years to the day that I started .NET Core Tutorials. I actually went back and checked and the first ever post was on the 26th of December. Maybe that gives away just how much I do xmas!

One of the first things I did all those years ago was set up an email ([email protected]), and start fielding questions from people. After all, my writing was literally just me figuring things out as I tried to get to grips on the differences between .NET Core and .NET Framework. Over the years, I’ve probably had an ungodly amount of emails from students to 30 year veterans, just asking questions about .NET and C# in general. Some were clearly homework questions, and others were about bugs in .NET Core that I had a hell of a time debugging, but I treated them all the same and gave replies as best I could.

Some of those questions got turned into blog posts of their own where I more or less shared my reply. But other times, the answer was simple enough that dedicating an entire post to what was almost a one word answer or a 5 line code snippet seemed somewhat dumb. That being said, it always annoyed me that while I was willing to help anyone and everyone who emailed me, me replying to that person one on one and not sharing it wasn’t helping anyone else at the same time.

So, with a bit of time on my hands I’ve spun up a Q&A section right here : https://qna.dotnetcoretutorials.com/

What are the rules? Well.. I’m not really sure. For now, I’m just slowly going back and posting questions that I have been emailed over the years, and pasting in my answer to the question. But *anyone* is able to post a question (You don’t even have to register/login), and *anyone* can post an answer, even on questions that already have answers. I know it’s a bit redundant when things like stackoverflow exist, but again, I’m just trying to share what can’t be turned into its own fully fledged post.

Will it be overrun with spam in 3 months time? Who knows. But for now, feel free to jump in, post a question, lend a helping hand with an answer, and let’s see how we go.


An under the radar feature introduced in SQL Server 2016 (And available in Azure SQL also) is Temporal Tables. Temporal tables allow you to keep a “history” of all data within a SQL table, in a separate (but connected) history table. In very basic terms, every time data in a SQL table is updated, a copy of the original state of the data is cloned into the history table.

The use cases of this are pretty obvious, but include :

  • Ability to query what the state of data was at a specific time
  • Ability to select multiple sets of data between two time periods
  • Viewing how data changes over time (For example, feeding into an analytics or machine learning model)
  • An off the shelf, easy to use, auditing solution for tracking what changed when
  • And finally, a somewhat basic, but still practical disaster recovery scenario for applications going haywire

A big reason for me doing this post is that EF Core 6 has just been released, and includes built in support for temporal tables. While this post will just be a quick intro in how temporal tables work, in the future I’ll be giving a brief intro on getting set up with Entity Framework too!

Getting Started

When creating a new table, it’s almost trivial to add in temporal tables. If I was to create a Person table with two columns, a first and last name, it would look something like so :

CREATE TABLE Person
(  
    [Id] int NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED, 
    FirstName NVARCHAR(250) NOT NULL,
    LastName NVARCHAR(250) NOT NULL , 
    -- The below is how we turn on Temporal. 
    [ValidFrom] datetime2 (0) GENERATED ALWAYS AS ROW START,
    [ValidTo] datetime2 (0) GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
 )  
 WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.PersonHistory));

There are a couple of things to note here. The first is the last three lines of the CREATE TABLE statement. We need to add the ValidFrom and ValidTo columns, and the PERIOD line, for everything to work nicely.

Second, it’s very important to note the HISTORY_TABLE statement. When I first started with temporal tables I assumed that there would be a naming convention along the lines of {{TableName}}History. But in fact, if you don’t specify what the history table should be called, you just end up with a semi randomly generated name that doesn’t look great.

With this statement run, we end up with a table within a table when looking via SQL Management Studio. Something like so :

I will note that you can turn on temporal tables on an existing table too with an ALTER TABLE statement which is great for projects already on the go.

But here’s the most amazing part about all of this. Nothing about how you use a SQL table changes. For example, inserting a Person record is the same old insert statement as always :

INSERT INTO Person (FirstName, LastName)
VALUES ('Wade', 'Smith')

Our SQL statements for the most part don’t even have to know this is a temporal table at all. And that’s important because if we have an existing project, we aren’t going to run into consistency issues when trying to turn temporal tables on.

With the above insert statement, we end up with a record that looks like so :

The ValidFrom is the datetime we inserted, and obviously the ValidTo is set to maximum because for this particular record, it is valid for all of time (That will become important shortly).

Our PersonHistory table at this point is still empty. But let’s change that! Let’s do an Update statement like so :

UPDATE Person
SET LastName = 'G'
WHERE FirstName = 'Wade'

If we check our Person table, it looks largely the same as before, our ValidFrom date has shifted forward and Wade’s last name is G. But if we check our PersonHistory table :

We now have a record in here that tells us that between our two datetimes, the record with ID 1 had a last name of Smith.

Again, our calling code that updates our Person record doesn’t even have to know that temporal tables is turned on, and everything just works like clockwork encapsulated with SQL Server itself.

Modifying Table Structure

I wanted to point out another real convenience with temporal tables that you might not get if you decided to roll your own history table. After a table creation, what happens if you wanted to add a column to the table?

For example, let’s take our Person table and add a DateOfBirth column.

ALTER TABLE Person
ADD DateOfBirth DATE NULL

You’ll notice that I am only altering the Person table, and not touching the PersonHistory table. That’s because temporal tables automatically handle the alter table statements to also modify the underlying history table. So if I run the above, my history table also receives the update :

This is a huge feature because it means your two tables never get out of sync, and yet, it’s all abstracted away for you and you’ll never have to think about it!

Querying Temporal Tables

Of course, what happens if we actually want to query the history of our Person record? If we were rolling our own, we might have to do a union of our current Person table, and our PersonHistory table. But with temporal tables, it’s a single select statement and SQL Server will work out under the hood which table the data should come from.

Confused? Maybe these examples will help :

SELECT *
FROM Person 
FOR SYSTEM_TIME AS OF '2021-12-10 23:19:25'
WHERE Id = 1

I run the above statement to ask for the state of the Person record, with Id 1, at exactly a particular time. The code executes, and in my case, it pulls the record from the History table.

But let’s say I run it again with a different time :

SELECT *
FROM Person 
FOR SYSTEM_TIME AS OF '2022-01-01'
WHERE Id = 1

Here I’ve made it in the future, just to illustrate a point, but in this case I know it will pull the record from the Person table because it will be the very latest.

What I’m trying to demonstrate is that there is no switching between tables to try and work out which version was correct at the right time. SQL Server does it all for you!

Better yet, you’ll probably end up showing an audit history page somewhere on your web app if using temporal tables. For that we can use the BETWEEN statement like so :

SELECT *
FROM Person 
FOR SYSTEM_TIME BETWEEN '2021-01-01' AND '2022-01-01'
WHERE Id = 1

This then fetches all audit history *and* the current record if applicable between those time periods. Again, all hidden away under the hood for you and exposed as a very simple SYSTEM_TIME query statement.

Size Considerations

While all of this sounds amazing, there is one little caveat to a lot of this. And that’s data size footprint.

In general, you’ll have to think about how much data you are storing if your system generates many updates across a table. Due to the nature of temporal tables storing a copy of the data, many small updates could explode the size of your database. However, in a somewhat ironic twist, tables that receive many updates may be a good candidate for temporal table anyway for auditing history.

Another thing to think about is use of blob data types (text, nvarchar(max)), and even things such as nvarchar vs varchar. Considerations around these data types upfront could save a lot of data space in the long run when it’s duplicated across many historic rows.

There is no one size all approach that fits perfectly, but it is something to keep in mind!

Temporal Tables vs Event Sourcing

Let’s just get this out of the way, temporal tables and event sourcing are not drop in replacements for each other, nor are they really competing technologies.

A temporal table is somewhat rudimentary. It takes a copy of your data and stores it elsewhere on every update/delete operation. If we ask for a row at a specific point in time, we will receive what the data looked like at that point. And if we give a timeframe, we will be returned several copies of that data.

Event sourcing is more of a series of deltas that describe how the data was changed. The hint is the name (event), and it functions much the same as receiving events on a queue. Given a point in time, event sourcing can recreate the data by applying deltas up to that point, and given a timeframe, instead of receiving several copies of the data, we instead receive the deltas that were applied.

I think temporal tables work best when a simple copy of the data will do. For pure viewing purposes, maybe as a data administrator looking at how data looked at a certain point of time for application debugging and the like. Whereas event sourcing really is about structuring your application in an event driven way. It’s not a simple “switch” that you flick on to suddenly make your application work via event sourcing.

Temporal Tables vs Roll Your Own

Of course, history tables are a pretty common practice already. So why use Temporal Tables if you’ve already got your own framework set up?

I think it really comes down to ease of use and a real “switch and forget” mentality with temporal tables. Your application logic does not have to change at all, nor do you have to deal with messy triggers. It almost is an audit in a box type solution with very little overhead to set up and maintain. If you are thinking of adding an audit trail/historic log to an application, temporal tables will likely be the solution 99% of the time.

Entity Framework Support

As mentioned earlier, EF Core 6.0 shipped with temporal table support. That includes code first migrations for turning on temporal tables, and LINQ query extensions to make querying temporal tables a breeze. In the next post, we’ll dive head first into how that works!
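
As a tiny teaser, here’s a rough sketch based on the EF Core 6 APIs (assuming a Person entity with a People DbSet, this is not the full walkthrough) :

// In your DbContext's OnModelCreating, mark the table as temporal
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Person>()
        .ToTable(tableBuilder => tableBuilder.IsTemporal());
}

// Querying history then uses new LINQ operators, for example "as of" a point in time
var personAtThatTime = context.People
    .TemporalAsOf(new DateTime(2021, 12, 10, 23, 19, 25, DateTimeKind.Utc))
    .FirstOrDefault(x => x.Id == 1);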


I’ve run into this issue not only when migrating legacy projects to use async/await in C# .NET, but even just day to day on greenfields projects. The issue I’m talking about involves code that looks like so :

static async Task Main(string[] args)
{
    MyAsyncMethod(); // Oops I forgot to await this!
}

static async Task MyAsyncMethod()
{
    await Task.Yield();
}

It can actually be much harder to diagnose than you may think. Due to the way async/await works in C#, an un-awaited async method may still run to completion anyway. If the async method finishes before it ever actually has to wait on anything, then your code will work much the same as you expect. I have had this happen often in development scenarios, only for things to break in test. And the excuse of “but it worked on my machine” just doesn’t cut it anymore!

In recent versions of .NET and Visual Studio, there is now a warning that will show to tell you your async method is not awaited. It gives off the trademark green squiggle :

And you’ll receive a build warning with the text :

CS4014 Because this call is not awaited, execution of the current method continues before the call is completed. Consider applying the ‘await’ operator to the result of the call.

The problem with this is that the warning isn’t always immediately noticeable. On top of this, a junior developer may not take heed of the warning anyway.

What I prefer to do is add a line to my csproj that looks like so :

<PropertyGroup>
    <WarningsAsErrors>CS4014;</WarningsAsErrors>
</PropertyGroup>

This means that every async method that is not awaited will actually stop the build entirely.

Disabling Errors By Line

But what if it’s one of those rare times you actually do want to fire and forget (Typically for desktop or console applications), but now you’ve just set everything up to blow up? Worse still, the error will show if you are inside an async method calling a method that returns a Task, even if the called method is not itself async.

But we can disable this on a line by line basis like so :

static async Task Main(string[] args)
{
    #pragma warning disable CS4014 
    MyAsyncMethod(); // I don't want to await this for whatever reason, it's not even async!
    #pragma warning restore CS4014
}

static Task MyAsyncMethod()
{
    return Task.CompletedTask;
}
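
As an aside, if you genuinely do want fire and forget, you can also assign the call to a discard. As we’ll see in the next section, assigning the result suppresses the warning anyway, so the discard at least makes the intent explicit without pragmas :

static async Task Main(string[] args)
{
    // Deliberate fire and forget - the discard suppresses CS4014 and signals intent
    _ = MyAsyncMethod();
}

static Task MyAsyncMethod()
{
    return Task.CompletedTask;
}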

Non-Awaited Tasks With Results

Finally, the one thing I have not found a way around is like so :

static async Task Main(string[] args)
{
    var result = MyAsyncMethodWithResult();
    var newResult = result + 10; // Error because result is actually a Task<int>, not an integer.
}

static async Task<int> MyAsyncMethodWithResult()
{
    await Task.Yield();
    return 0;
}

This code will actually fail to compile. The reason being that we expect the value of result to be an integer, but because we did not await the method, it's actually a Task<int>. But what if we pass the result to a method that doesn't care about the type, like so :

static async Task Main(string[] args)
{
    var result = MyAsyncMethodWithResult();
    DoSomethingWithAnObject(result);
}

static async Task MyAsyncMethodWithResult()
{
    await Task.Yield();
    return 0;
}

static void DoSomethingWithAnObject(object myObj)
{
}

This will not cause any compiler warnings or errors (But it will cause runtime errors depending on what DoSomethingWithAnObject does with the value).

Essentially, I found that the warning/error for non awaited tasks is not shown if you assign the value to a variable. This is even the case with Tasks that don’t return a result like so :

static async Task Main(string[] args)
{
    var result = MyAsyncMethod(); // No error
}

static async Task MyAsyncMethod()
{
    await Task.Yield();
}

I have searched high and low for a solution for this but most of the time it leads me to stack overflow answers that go along the lines of “Well, if you assigned the value you MIGHT actually want the Task as a fire and forget”. Which I agree with, but 9 times out of 10, is not going to be the case.

That being said, turning the compiler warnings to errors will catch most of the errors in your code, and the type check system should catch 99% of the rest. For everything else… “Well it worked on my machine”.


I was recently helping another developer understand the various “OnDelete” behaviors of Entity Framework Core. That is, when a parent entity in a parent/child relationship is deleted, what should happen to the child?

I thought this was actually all fairly straightforward. The way I understood things was :

  • DeleteBehavior.Cascade – Delete the child when the parent is deleted (e.g. Cascading deletes)
  • DeleteBehavior.SetNull – Set the FK on the child to just be null (So allow orphans)
  • DeleteBehavior.Restrict – Don’t allow the parent to be deleted at all

I’m pretty sure if I asked 100 .NET developers what these meant, there is a fairly high chance that all of them would answer the same way. But in reality, DeleteBehavior.Restrict is actually dependent on what you’ve done in that DbContext up until the delete… Let me explain.

Setting Up

Let’s imagine that I have two models in my database, they look like so :

class BlogPost
{
	public int Id { get; set; }
	public string PostName { get; set; }
	public ICollection<BlogImage> BlogImages { get; set; }
}

class BlogImage
{
	public int Id { get; set; }
	public int? BlogPostId { get; set; }
	public BlogPost? BlogPost { get; set; }
	public string ImageUrl { get; set; }
}

Then imagine the relationship in EF Core is set up like so :

modelBuilder.Entity<BlogImage>()
    .HasOne(x => x.BlogPost)
    .WithMany(x => x.BlogImages)
    .OnDelete(DeleteBehavior.Restrict);

Any developer looking at this at first glance would say, if I delete a blog post that has images pointing to it, it should stop me from deleting the blog post itself. But is that true?

Testing It Out

Let’s imagine I have a simple set of code that looks like so :

var context = new MyContext();
context.Database.Migrate();

var blogPost = new BlogPost
{
	PostName = "Post 1", 
	BlogImages = new List<BlogImage>
	{
		new BlogImage
		{
			ImageUrl = "/foo.png"
		}
	}
};

context.Add(blogPost);
context.SaveChanges();

Console.WriteLine("Blog Post Added");

var getBlogPost = context.Find<BlogPost>(blogPost.Id);
context.Remove(getBlogPost);
context.SaveChanges(); //Does this error here? We are deleting the blog post that has images

Console.WriteLine("Blog Post Removed");

Do I receive an exception? The answer is.. No. When this code is run, and I check the database I end up with a BlogImage that looks like so :

So instead of restricting the delete, EF Core has gone ahead and set the BlogPostId to be null, and essentially given me an orphaned record. But why?!

Diving headfirst into the documentation we can see that DeleteBehavior.Restrict has the following description :

For entities being tracked by the DbContext, the values of foreign key properties in dependent entities are set to null when the related principal is deleted. This helps keep the graph of entities in a consistent state while they are being tracked, such that a fully consistent graph can then be written to the database. If a property cannot be set to null because it is not a nullable type, then an exception will be thrown when SaveChanges() is called.

Emphasis mine.

This doesn’t really make that much sense IMO. But I wanted to test it out further. So I used the following test script, which is exactly the same as before, except half way through I recreate the DB Context. Given the documentation, the entity I pull back for deletion will not have the blog images themselves being tracked.

And sure enough given this code :

var context = new MyContext();
context.Database.Migrate();

var blogPost = new BlogPost
{
	PostName = "Post 1", 
	BlogImages = new List<BlogImage>
	{
		new BlogImage
		{
			ImageUrl = "/foo.png"
		}
	}
};

context.Add(blogPost);
context.SaveChanges();

Console.WriteLine("Blog Post Added");

context = new MyContext(); // <-- Create a NEW DB context

var getBlogPost = context.Find<BlogPost>(blogPost.Id);
context.Remove(getBlogPost);
context.SaveChanges();

Console.WriteLine("Blog Post Removed");

I *do* get the exception I was expecting all along :

SqlException: The DELETE statement conflicted with the REFERENCE constraint “FK_BlogImages_BlogPosts_BlogPostId”.

Even now, writing this, I’m struggling to understand the logic here. If by some chance you’ve already loaded the child entities (By accident or not), your delete restriction suddenly behaves completely differently. That doesn’t make sense to me.

I’m sure some of you are ready to jump through your screens and tell me that this sort of ambiguity is because I am using a nullable FK on my BlogImage type. Which is true, and does mean that I expect that a BlogImage entity *can* be an orphan. If I set this to be a non-nullable key, then I will always get an exception because it cannot set the FK to null. However, the point I’m trying to make is that if I have a nullable key, but I set the delete behavior to restrict, I should still see some sort of consistent behavior.
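
For completeness, the non-nullable version of the model, which per the documentation quoted above will always throw at SaveChanges under Restrict, would look something like this :

class BlogImage
{
	public int Id { get; set; }
	public int BlogPostId { get; set; } // Non-nullable FK - EF can't "fix up" this value to null
	public BlogPost BlogPost { get; set; }
	public string ImageUrl { get; set; }
}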

What About DeleteBehavior.SetNull?

Another interesting thing to note is that the documentation for DeleteBehavior.SetNull is actually identical to that of Restrict :

For entities being tracked by the DbContext, the values of foreign key properties in dependent entities are set to null when the related principal is deleted. This helps keep the graph of entities in a consistent state while they are being tracked, such that a fully consistent graph can then be written to the database. If a property cannot be set to null because it is not a nullable type, then an exception will be thrown when SaveChanges() is called.

And yet, in my testing, using SetNull does not depend on which entities are being tracked by the DbContext, and works the same every time (Although, I did consider that possibly this is a SQL Server function using the default value rather than EF Core doing the leg work).

I actually spent a long time using Google-Fu to try and find anyone talking about the differences between SetNull and Restrict, but many just go along with what I described in the intro : SetNull sets the FK to null, and Restrict always stops you from deleting.

Conclusion

Maybe I’m in the minority here, or maybe there is a really good reason for the restrict behavior acting as it does, but I really do think that for the majority of developers, when they use DeleteBehavior.Restrict, they are expecting the parent to be blocked from being deleted in any and all circumstances. I don’t think anyone expects an accidental load of an entity into the DbContext to suddenly change the behavior. Am I alone in that?

Update

I opened an issue on Github asking if all of the above is intended behavior : https://github.com/dotnet/efcore/issues/26857

It’s early days yet but the response is :

EF performs “fixup” to keep the graph of tracked entities consistent when operations are performed on those entities. This includes nulling nullable foreign key properties when the principal that they reference is marked as Deleted. [..]

It is uncommon, but if you don’t want EF to do this fixup to dependent entities when a principal is deleted, then you can set DeleteBehavior.ClientNoAction. Making this change in the code you posted above will result in the database throwing with the message above in both cases, since an attempt is made to delete a principal while a foreign key constraint is still referencing it.

Further on, this is explained more :

Setting Restrict or NoAction in EF Core tells EF Core that the database foreign key constraint is configured this way, and, when using migrations, causes the database foreign key constraint to be created in this way. What it doesn’t do is change the fixup behavior of EF Core; that is, what EF does to keep entities in sync when the graph of tracked entities is changed. This fixup behavior has been the same since legacy EF was released in 2008. For most, it is a major advantage of using an OR/M.

Starting with EF Core, we do allow you to disable this fixup when deleting principal entities by specifying ClientNoAction. The “client” here refers to what EF is doing to tracked entities on the client, as opposed to the behavior of the foreign key constraint in the database. But is is uncommon to do this; most of the time the fixup behavior helps keep changes to entities in sync.

This actually does make a little bit of sense, the “fixup” being disconnected from what is happening in the database. Do I think it’s “intuitive”? Absolutely not. But at least we have some reasoning for the way it is.
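
For reference, the configuration the EF team is suggesting, letting the database constraint have the final say, is just a matter of swapping Restrict for ClientNoAction in the earlier setup :

modelBuilder.Entity<BlogImage>()
    .HasOne(x => x.BlogPost)
    .WithMany(x => x.BlogImages)
    .OnDelete(DeleteBehavior.ClientNoAction);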


A big reason people who develop in .NET languages rave about Visual Studio being the number one IDE is the auto complete and intellisense features. Being able to see what methods/properties are available on a class, or scrolling through overloads of a particular method, is invaluable. While more lightweight IDEs like VS Code are blazingly fast.. I usually end up spending a couple of hours setting up extensions to have them function more like Visual Studio anyway!

That being said, when I made the switch to Visual Studio 2022, there was something off, but I couldn’t quite put my finger on it. I actually switched back to Visual Studio 2019 a couple of times because I felt more “productive”. I couldn’t quite place it until today.

What I saw was this :

Notice that intellisense has been extended to also predict entire lines, not just the completion of the method/type/class I am currently typing. At first this felt amazing, but then I started realizing why this was frustrating to use.

  1. The constant flashing of the entire line subconsciously makes me stop and read what it’s suggesting to see if I’ll use it. Maybe this is just something I would get used to but I noticed myself repeatedly losing my flow or train of thought to read the suggestions. Now that may not be that bad until you realize…
  2. The suggestions are often completely nonsensical when working on business software. Take the above suggestion, there is no type called “Category”. So it’s actually suggesting something that, should I accept it, will actually break anyway.
  3. Even if you don’t accept the suggestions, my brain subconsciously starts typing what they suggest, and I therefore end up with broken code regardless.
  4. And all of the above is made even worse because the suggestions completely flip non-stop. In a single line, even at times following its lead, I get suggested no less than 4 different types.

Here’s a gif of what I’m talking about with all 4 of the issues present.

Now maybe I’ll get used to the feature but until then, I’m going to turn it all off. So if you are like me and want the same level of intellisense that Visual Studio 2019 had, you need to go :

Tools -> Options -> Intellicode (Not intellisense!)

Then disable the following :

  • Show completions for whole lines of code
  • Show completions on new lines

After disabling these, restart Visual Studio and you should be good to go!

Again, this only affects the auto-complete suggestions for complete lines. It doesn’t affect completing the type/method, or showing you a method summary etc.
