In any database schema, it’s extremely common to have the fields “DateCreated, DateUpdated and DateDeleted” on almost every entity. At the very least, they provide helpful debugging information, but further, the DateDeleted affords a way to “soft delete” entities without actually deleting them.

That being said, over the years I’ve seen some pretty interesting ways in which these have been implemented. The worst, in my view, is writing C# code that manually sets the timestamp on every create or update. While simple, one clumsy developer later and you aren’t recording any timestamps at all; it relies entirely on everyone remembering to set the timestamp every single time. Other times, I’ve seen database triggers used, which.. works.. But then you have another problem in that you’re using database triggers!

There’s a fairly simple method I’ve been using for years and it involves utilizing the ability to override the save behaviour of Entity Framework.

Auditable Base Model

The first thing we want to do is actually define a “base model” that all entities can inherit from. In my case, I use a base class called “Auditable” that looks like so :

public abstract class Auditable
{
	public DateTimeOffset DateCreated { get; set; }
	public DateTimeOffset? DateUpdated { get; set; }
	public DateTimeOffset? DateDeleted { get; set; }
}

And a couple of notes here :

  • It’s an abstract class because it should only ever be inherited from
  • We use DateTimeOffset because we will then store the timezone along with the timestamp. This is a personal preference but it just removes all ambiguity around “Is this UTC?”
  • DateCreated is not null (Since anything created will have a timestamp), but the other two dates are! Note that if this is an existing database, you will need to allow nulls (And work out a migration strategy, a rough sketch of which follows below) as your existing records will not have a DateCreated.
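
To illustrate what such a migration strategy might look like, here is a minimal sketch of an EF Core migration that adds the column as nullable, backfills existing rows, and then makes it required. The table name and the SQL Server SYSDATETIMEOFFSET() backfill are assumptions for the sake of the example, not something prescribed by this post :

protected override void Up(MigrationBuilder migrationBuilder)
{
	//1. Add the new column as nullable so existing rows don't violate a NOT NULL constraint. 
	migrationBuilder.AddColumn<DateTimeOffset>(
		name: "DateCreated",
		table: "Customers",
		nullable: true);

	//2. Backfill existing rows with a sensible value (here, the time the migration runs). 
	migrationBuilder.Sql("UPDATE Customers SET DateCreated = SYSDATETIMEOFFSET() WHERE DateCreated IS NULL");

	//3. Now that every row has a value, make the column required. 
	migrationBuilder.AlterColumn<DateTimeOffset>(
		name: "DateCreated",
		table: "Customers",
		nullable: false);
}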

To use the class, we just need to inherit from it with any Entity Framework model. For example, let’s say we have a Customer object :

public class Customer : Auditable
{
	public int Id { get; set; }
	public string Name { get; set; }
}

So all the base class does is save us from copying and pasting the same 3 date fields everywhere, and it enforces that every entity has them. Nice and simple!

Overriding Context SaveChanges

The next thing is maybe controversial, and I know there are a few different ways to do this. Essentially we are looking for a way to say to Entity Framework “Hey, if you insert a new record, can you set the DateCreated please?”. There are things like Entity Framework hooks and a few NuGet packages that do similar things, but I’ve found the absolute easiest way is to simply override the save method of your database context.

The full code looks something like :

public class MyContext: DbContext
{
	public override Task<int> SaveChangesAsync(CancellationToken cancellationToken = default)
	{
		var insertedEntries = this.ChangeTracker.Entries()
							   .Where(x => x.State == EntityState.Added)
							   .Select(x => x.Entity);

		foreach(var insertedEntry in insertedEntries)
		{
			var auditableEntity = insertedEntry as Auditable;
			//If the inserted object is an Auditable. 
			if(auditableEntity != null)
			{
				auditableEntity.DateCreated = DateTimeOffset.UtcNow;
			}
		}

		var modifiedEntries = this.ChangeTracker.Entries()
				   .Where(x => x.State == EntityState.Modified)
				   .Select(x => x.Entity);

		foreach (var modifiedEntry in modifiedEntries)
		{
			//If the updated object is an Auditable. 
			var auditableEntity = modifiedEntry as Auditable;
			if (auditableEntity != null)
			{
				auditableEntity.DateUpdated = DateTimeOffset.UtcNow;
			}
		}

		return base.SaveChangesAsync(cancellationToken);
	}
}

Now your context may have additional code, but this is the bare minimum to get things working. What this does is :

  • Gets all entities that are being inserted, checks if they inherit from Auditable, and if so sets the DateCreated.
  • Gets all entities that are being updated, checks if they inherit from Auditable, and if so sets the DateUpdated.
  • Finally, call the base SaveChanges method that actually does the saving.

Using this, we are essentially intercepting when Entity Framework would normally save all changes, and updating all timestamps at once with whatever is in the batch.
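
One small caveat worth noting : the override above only covers SaveChangesAsync. If anything in your application calls the synchronous SaveChanges, those timestamps would be skipped, so it can be worth overriding that too. A minimal sketch (pulling the shared logic into a helper) might look like :

public override int SaveChanges()
{
	ApplyAuditTimestamps();
	return base.SaveChanges();
}

public override Task<int> SaveChangesAsync(CancellationToken cancellationToken = default)
{
	ApplyAuditTimestamps();
	return base.SaveChangesAsync(cancellationToken);
}

private void ApplyAuditTimestamps()
{
	foreach (var entry in ChangeTracker.Entries())
	{
		//Only touch entities that inherit from our Auditable base class. 
		if (entry.Entity is not Auditable auditableEntity)
		{
			continue;
		}

		if (entry.State == EntityState.Added)
		{
			auditableEntity.DateCreated = DateTimeOffset.UtcNow;
		}
		else if (entry.State == EntityState.Modified)
		{
			auditableEntity.DateUpdated = DateTimeOffset.UtcNow;
		}
	}
}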

Handling Soft Deletes

Deletes are a special case for one big reason. If we actually try and call delete on an entity in Entity Framework, it gets added to the ChangeTracker as… well… a delete. And to unwind this at the point of saving and change it to an update would be complex.

What I tend to do instead is on my BaseRepository (Because.. You’re using one of those right?), I check if an entity is Auditable and if so, do an update instead. The copy and paste from my BaseRepository looks like so :

public async Task<T> Delete(T entity)
{
	//If the type we are trying to delete is auditable, then we don't actually delete it but instead set it to be updated with a delete date. 
	if (typeof(Auditable).IsAssignableFrom(typeof(T)))
	{
		(entity as Auditable).DateDeleted = DateTimeOffset.UtcNow;
		_dbSet.Attach(entity);
		_context.Entry(entity).State = EntityState.Modified;
	}
	else
	{
		_dbSet.Remove(entity);
	}

	return entity;
}

Now your mileage may vary, especially if you are not using the Repository Pattern (Which you should be!). But in short, you must handle soft deletes as updates *instead* of simply calling Remove on the DbSet.

Taking This Further

What’s not shown here is that we can use this same methodology to update many other “automated” fields. We use this same system to track the last user to Create, Update and Delete entities. Once this is up and running, it’s often just a couple more lines to instantly gain traceability across every entity in your database!
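
As a rough illustration of that idea (the ICurrentUserService interface and the CreatedBy/UpdatedBy property names here are assumptions for the example, not code from this post), you could extend the base class and have the context pull the current user from an injected service :

public abstract class Auditable
{
	public DateTimeOffset DateCreated { get; set; }
	public DateTimeOffset? DateUpdated { get; set; }
	public DateTimeOffset? DateDeleted { get; set; }

	//New "who" fields to sit alongside the "when" fields. 
	public string CreatedBy { get; set; }
	public string UpdatedBy { get; set; }
}

public interface ICurrentUserService
{
	string UserName { get; }
}

public class MyContext : DbContext
{
	private readonly ICurrentUserService _currentUser;

	public MyContext(DbContextOptions<MyContext> options, ICurrentUserService currentUser) : base(options)
	{
		_currentUser = currentUser;
	}

	public override Task<int> SaveChangesAsync(CancellationToken cancellationToken = default)
	{
		foreach (var entry in ChangeTracker.Entries())
		{
			if (entry.Entity is not Auditable auditableEntity)
			{
				continue;
			}

			if (entry.State == EntityState.Added)
			{
				auditableEntity.DateCreated = DateTimeOffset.UtcNow;
				auditableEntity.CreatedBy = _currentUser.UserName;
			}
			else if (entry.State == EntityState.Modified)
			{
				auditableEntity.DateUpdated = DateTimeOffset.UtcNow;
				auditableEntity.UpdatedBy = _currentUser.UserName;
			}
		}

		return base.SaveChangesAsync(cancellationToken);
	}
}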


This past week, .NET 7 Preview 1 was released! By extension, this also means that Entity Framework 7 and ASP.NET Core 7 preview versions shipped at the same time.

So what’s new? In all honesty not a heck of a lot that will blow your mind! As with most Preview 1 releases, it’s mostly about getting that first version bump out of the way and any major blockers from the previous release sorted. So with that in mind, skimming the release notes I can see :

  • Progress continues on MAUI (The multi-platform UI components for .NET), but we are still not at an RC (Although an RC should be shipping with .NET 7)
  • Entity Framework changes are almost entirely bug fixes from the previous release
  • There is a slight push (And I’ve also seen this on Twitter) to merge in concepts from Orleans, or more broadly, having .NET 7 focus on quality of life improvements that lend themselves to microservices or independent distributed applications (Expect to hear more about this as we get closer to the .NET 7 release)
  • Further support for nullable reference types in various .NET libraries
  • Further support for file uploads and streams when building APIs using the Minimal API framework
  • Support for nullable reference types in MVC Views/Razor Pages
  • Performance improvements for header parsing in web applications

So nothing too ground breaking here. Importantly, .NET 7 is labelled as a “Current” release which means it only receives 18 months of support. This is normal, as Microsoft tend to alternate releases between Long Term Support (LTS) and Current.

You can download .NET 7 Preview 1 here : https://dotnet.microsoft.com/en-us/download/dotnet/7.0

And you will require Visual Studio 2022 *Preview*!


Such is life on Twitter, I’ve been watching from afar .NET developers argue about a particular upcoming C# 11 feature, Parameter Null Checks. It’s actually just a bit of syntactic sugar to make it easier to throw argument null exceptions, but it’s caused a bit of a stir for two main reasons.

  1. People don’t like the syntax full stop. Which I understand, but other features such as some of the switch statement pattern matching and tuples look far worse! So in for a penny in for a pound!
  2. It somewhat clashes with another recent C# feature of Nullable Reference Types (We’ll talk more about this later).

The Problem

First let’s look at the problem this is trying to solve.

I may have a very simple method that takes a list of strings (As an example, but it could be any nullable type). I may want to ensure that whatever the method is given is not null. So we would typically write something like :

void MyMethod(List<string> input)
{
    if(input == null)
    {
        throw new ArgumentNullException(nameof(input));
    }
}

Nothing too amazing here. If the list is null, throw an ArgumentNullException!

In .NET 6 (Specifically .NET 6, not a specific version of C#), a shorthand was added to save a few lines. So we could now do :

void MyMethod(List<string> input)
{
    ArgumentNullException.ThrowIfNull(input);
}

There is no magic here. It’s just doing the same thing we did before with the null check, but wrapping it all up into a nice helper.

So what’s the problem? Well.. There isn’t one really. The only real issue is that should you have a method with many parameters, and all of them nullable, and yet you want to throw an ArgumentNullException, you might have an additional few lines at the start of your method. I guess that’s a problem to be solved, but it isn’t too much of a biggie.
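
Just to make that concrete, a method with several parameters ends up starting with a small wall of guard clauses (this is purely an illustrative sketch) :

void MyMethod(List<string> input, string name, Stream destination)
{
    ArgumentNullException.ThrowIfNull(input);
    ArgumentNullException.ThrowIfNull(name);
    ArgumentNullException.ThrowIfNull(destination);

    //...actual method body...
}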

Parameter Null Checking In C# 11

I put C# 11 here, but actually you can turn on this feature in C# 10 by adding the following to your csproj file :

<EnablePreviewFeatures>True</EnablePreviewFeatures>

Now we have a bit of sugar around null checking by doing the following :

void MyMethod(List<string> input!!)
{
}

Adding the “!!” operator to a parameter name immediately adds an argument null check to it, skipping the need for the first few lines to be boilerplate null checks.

Just my personal opinion, it’s not… that bad. I think people see the use of symbols, such as ? or ! and they immediately get turned off. When using a symbol like this, especially one that isn’t universal across different languages (such as a ternary ?), it’s not immediately clear what it does. I’ve even seen some suggest just adding another keyword such as :

void MyMethod(notnull List<string> input)
{
}

I don’t think this is really any better to be honest.

Overall, it’s likely to see a little bit of use. But the interesting context of some of the arguments against this is….

Nullable Reference Types

For why I am totally wrong in all of the below, check this great comment from David. It explains why, while the below is true, it’s also not the full story and I am wrong in suggesting that you only need one or the other!

C# 8 introduced the concept of Nullable Reference Types. Before this, all reference types were nullable by default, and so the above checks were essentially required. C# 8 came along and gave a flag to say, if I want something to be nullable, I’ll let you know, otherwise treat everything as non nullable. You can read more about the feature here : https://dotnetcoretutorials.com/2018/12/19/nullable-reference-types-in-c-8/

The interesting point here is that if I switch this flag on (And from .NET 6, it’s switched on by default in new projects), then there is no need for ArgumentNullExceptions because either the parameter is not null by default, or I specify that it can be null (And therefore won’t need the check).

Just as an example, with nullable reference types switched on in code :

#nullable enable
void MyMethod(List<string> input)
{
    //Input cannot be null anyway. So no need for the check. 
}


void MyMethod2(List<string>? input)
{
    //Using ? I've specified it can be null, and if I'm saying it can be null...
    //I won't be throwing exceptions when it is null right? 
}

There are arguments that nullable reference types are a compile time check whereas throwing an exception is a runtime check. But the reality is they actually solve the same problem just in different ways, and if there is a push to do things one way (nullable reference types), then there’s no need for the other.

With all of that being said. Honestly, it’s a nice feature and I’m really not that fussed over it. The extent of my thinking is that it’s a handy little helper. That’s all.


A number of times in recent years, I’ve had the chance to work in companies that completely design out entire APIs using OpenAPI, before writing a single line of code. Essentially writing YAML to say which endpoints will be available, and what each API should accept and return.

There are pros and cons to doing this of course. A big pro is that by putting in upfront time to really think about API structure, we can often uncover issues well before we get halfway through a build. But a con is that after spending a bunch of time defining things like models and endpoints in YAML, we then need to spend days doing nothing but creating C# classes as clones of their YAML counterparts, which can be tiresome and frankly, demoralizing at times.

That’s when I came across Open API Generator : https://openapi-generator.tech/

It’s a tool to take your API definitions, and scaffold out APIs and Clients without you having to lift a finger. It’s surprisingly configurable, but at the same time it isn’t too opinionated and allows you to do just the basics of turning your definition into controllers and models, and nothing more.

Let’s take a look at a few examples!

Installing Open API Generator

If you read the documentation here https://github.com/OpenAPITools/openapi-generator, it would look like installing is a huge ordeal of XML files, Maven and JAR files. But for me, using NPM seemed to be simple enough. Assuming you have NPM installed already (Which you should!), then you can simply run :

npm install @openapitools/openapi-generator-cli -g

And that’s it! Now from a command line you can run things like :

openapi-generator-cli version

Scaffolding An API

For this example, I actually took the PetStore API available here : https://editor.swagger.io/

It’s just a simple YAML definition that has CRUD operations on an example API for a pet store. I took this YAML and stored it as “petstore.yaml” locally. Then I ran the following command in the same folder  :

openapi-generator-cli generate -i petstore.yaml -g aspnetcore -o PetStore.Web --package-name PetStore.Web

Pretty self explanatory, but one thing I do want to point out is the -g flag. I’m passing in aspnetcore here but in reality, Open API Generator has support to generate APIs for things like PHP, Ruby, Python etc. It’s not C# specific at all!

Our project is generated and overall, it looks just like any other API you would build in .NET

Notice that for each group of APIs in our definition, it has generated a controller, along with the models as well.

The controllers themselves are well decorated, but are otherwise empty. For example here is the AddPet method :

/// <summary>
/// Add a new pet to the store
/// </summary>
/// <param name="body">Pet object that needs to be added to the store</param>
/// <response code="405">Invalid input</response>
[HttpPost]
[Route("/v2/pet")]
[Consumes("application/json", "application/xml")]
[ValidateModelState]
[SwaggerOperation("AddPet")]
public virtual IActionResult AddPet([FromBody]Pet body)
{
    //TODO: Uncomment the next line to return response 405 or use other options such as return this.NotFound(), return this.BadRequest(..), ...
    // return StatusCode(405);

    throw new NotImplementedException();
}

I would note that this is obviously rather verbose (With the comments, Consumes attribute etc), but a lot of that is because that’s what we decorated our OpenAPI definition with, therefore it tries to generate a controller that should act and function identically.

But also notice that it hasn’t generated a service or data layer. It’s just the controller and the very basics of how data gets in and out of the API. It means you can basically scaffold things and away you go.
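
As a rough sketch of what filling in one of those stubs might look like, you could inject your own service and call it from the generated action. The IPetService interface here is a hypothetical application service of your own, not something the generator creates :

public class PetApiController : ControllerBase
{
    //A hypothetical application service of your own - the generator only scaffolds the controller layer. 
    private readonly IPetService _petService;

    public PetApiController(IPetService petService)
    {
        _petService = petService;
    }

    [HttpPost]
    [Route("/v2/pet")]
    public virtual IActionResult AddPet([FromBody] Pet body)
    {
        //Hand the model off to your own service/data layer, then return the created pet. 
        var created = _petService.Add(body);
        return Ok(created);
    }
}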

The models themselves also get generated, but they can be rather verbose. For example, each model gets an override of the ToString method that looks a bit like so :

/// <summary>
/// Returns the string presentation of the object
/// </summary>
/// <returns>String presentation of the object</returns>
public override string ToString()
{
    var sb = new StringBuilder();
    sb.Append("class Pet {\n");
    sb.Append("  Id: ").Append(Id).Append("\n");
    sb.Append("  Category: ").Append(Category).Append("\n");
    sb.Append("  Name: ").Append(Name).Append("\n");
    sb.Append("  PhotoUrls: ").Append(PhotoUrls).Append("\n");
    sb.Append("  Tags: ").Append(Tags).Append("\n");
    sb.Append("  Status: ").Append(Status).Append("\n");
    sb.Append("}\n");
    return sb.ToString();
}

It’s probably overkill, but you can always delete it if you don’t like it.

Obviously there isn’t much more to say about the process. One command and you’ve got yourself a great starting point for an API. I would like to say that you should definitely dig into the docs for the generator as there is actually a tonne of flags to use that likely solve a lot of hangups you might have about what it generates for you. For example there is a flag to use NewtonSoft.JSON instead of System.Text.Json if that is your preference!

I do want to touch on a few pros and cons on using a generator like this though…

The first con is that updates to the original Open API definition really can’t be “re-generated” into the API. There are ways to do it using the tool but in reality, I find it unlikely that you would do it like this. So for the most part, the generation of the API is going to be a one time thing.

Another con is, as I’ve already pointed out, that the generator has its own style which may or may not suit the way you like to develop software. On larger APIs, fixing some of these quirks of the generator can be annoying. But I would say that for the most part, fixing any small style issues is still likely to take less time than writing the entire API from scratch by hand.

Overall however, the pro of this is that you have a very consistent style. For example, I was helping out a professional services company with some of their code practices recently. What I noticed is that they spun up new APIs every month for different customers. Each API was somewhat beholden to the tech lead’s style and preferences. By using an API generator as a starting point, it meant that everyone had a somewhat similar starting point for the style we wanted to go for, and the style that we should use going forward.

Generating API Clients

I want to quickly touch on another functionality of the Open API Generator, and that is generating clients for an API. For example, if you have a C# service that needs to call out to a web service, how can you quickly whip up a client to interact with that API?

We can use the following command to generate a Client library project :

openapi-generator-cli generate -i petstore.yaml -g csharp -o PetStore.Client --package-name PetStore.Client

This generates a very simple PetApi interface/class that has all of our methods to call the API.

For example, take a look at this simple code :

var petApi  = new PetApi("https://myapi.com");
var myPet = petApi.GetPetById(123);
myPet.Name = "John Smith";
petApi.UpdatePet(myPet);

Unlike the server code we generated, I find that the client itself is often able to be regenerated as many times as needed, and over long periods of time too.

As I mentioned, the client code is very handy when two services need to talk to each other, but I’ve also found it useful for writing large scale integration tests without having to copy and paste large models between projects or be mindful about what has changed in an API, and copy those changes over to my test project.
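
For example, an integration test using the generated client might look something like the sketch below. It assumes the same PetApi class and methods shown above, plus xUnit as the test framework; the URL and IDs are made up :

public class PetApiIntegrationTests
{
    [Fact]
    public void CanRoundTripAPet()
    {
        //Point the generated client at the environment under test (URL is illustrative). 
        var petApi = new PetApi("https://test.myapi.com");

        var myPet = petApi.GetPetById(123);
        myPet.Name = "John Smith";
        petApi.UpdatePet(myPet);

        //Re-fetch and assert the change stuck - no hand rolled HttpClient code or copied models required. 
        var updated = petApi.GetPetById(123);
        Assert.Equal("John Smith", updated.Name);
    }
}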


I cannot tell you how many times I’ve had the following conversation

“Hey I’m getting an error”

“What’s the error?”

“DBUpdateException”

“OK, what’s the message though, that could be anything”

“ahhh.. I didn’t see…..”

Frustratingly, when doing almost anything with Entity Framework including updates, deletes and inserts, if something goes wrong you’ll be left with the generic exception of :

Microsoft.EntityFrameworkCore.DbUpdateException: ‘An error occurred while saving the entity changes. See the inner exception for details.’

It can be extremely annoying if you’re wanting to catch a particular database exception (e.g. It’s to be expected that duplicates might be inserted), and handle them in a different way than something like being unable to connect to the database full stop. Let’s work up a quick example to illustrate what I mean.

Let’s assume I have a simple database model like so :

class BlogPost
{
    public int Id { get; set; }
    public string PostName { get; set; }
}

And I have configured my entity to have a unique constraint, meaning that every BlogPost must have a unique name :

modelBuilder.Entity<BlogPost>()
    .HasIndex(x => x.PostName)
    .IsUnique();

If I do something as simple as :

context.Add(new BlogPost
{
    PostName = "Post 1"
});

context.Add(new BlogPost
{
    PostName = "Post 1"
});

context.SaveChanges();

The *full* exception would be along the lines of :

Microsoft.EntityFrameworkCore.DbUpdateException: ‘An error occurred while saving the entity changes. See the inner exception for details.’
Inner Exception
SqlException: Cannot insert duplicate key row in object ‘dbo.BlogPosts’ with unique index ‘IX_BlogPosts_PostName’. The duplicate key value is (Post 1).

Let’s say that we want to handle this exception in a very specific way, for us to do this we would have to have a bit of a messy try/catch statement :

try
{
    context.SaveChanges();
}catch(DbUpdateException exception) when (exception?.InnerException?.Message.Contains("Cannot insert duplicate key row in object") ?? false)
{
    //We know that the actual exception was a duplicate key row
}

Very ugly and there isn’t much reusability here. If we want to catch a similar exception elsewhere in our code, we’re going to be copy and pasting this long catch statement everywhere.

And that’s where I came across the EntityFrameworkCore.Exceptions library!

Using EntityFrameworkCore.Exceptions

The EntityFrameworkCore.Exceptions library is extremely easy to use and I’m actually somewhat surprised that it hasn’t made its way into the core EntityFramework libraries already.

To use it, all we have to do is run the following on our Package Manager Console :

Install-Package EntityFrameworkCore.Exceptions.SqlServer

And note that there are packages for things like Postgres and MySQL if that’s your thing!

Then with a single line for our DB Context we can set up better error handling :

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder.UseExceptionProcessor();
}

If we run our example code from above, instead of our generic DbUpdateException we get :

EntityFramework.Exceptions.Common.UniqueConstraintException: ‘Unique constraint violation’

Meaning we can change our Try/Catch to be :

try
{
    context.SaveChanges();
}catch(UniqueConstraintException ex)
{
    //We know that the actual exception was a duplicate key row
}

Much cleaner, much tidier, and far more reusable!
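
As a quick usage sketch of why this is nicer, a service method could now turn an expected duplicate into a simple result rather than letting a raw DbUpdateException bubble up. The method and _context names here are illustrative :

public async Task<bool> TryCreatePost(BlogPost post)
{
    _context.Add(post);

    try
    {
        await _context.SaveChangesAsync();
        return true;
    }
    catch (UniqueConstraintException)
    {
        //A post with this name already exists - an expected case we can handle gracefully. 
        return false;
    }
}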


This is a 4 part series on working with Protobuf in C# .NET. While you can start anywhere in the series, it’s always best to start at the beginning!

Part 1 – Getting Started
Part 2 – Serializing/Deserializing
Part 3 – Using Length Prefixes
Part 4 – Performance Comparisons


We’ve made mention in previous posts of the fact that Protobuf (supposedly) will outperform many other data formats, namely JSON. And while we’ve kind of alluded to the fact it’s “fast” and it’s “small”, we haven’t really jumped into the actual numbers.

This post will take a look across three different metrics :

  • File Size – So just how lightweight is Protobuf?
  • Serialization – How fast can we take a C# object and serialize it into Protobuf or JSON?
  • Deserialization – Given a Protobuf/JSON data format, how fast can we turn it into a C# object?

Let’s jump right in!

File Size Comparisons

Before looking at read/write performance, I actually wanted to compare how large the actual output is between Protobuf and JSON. I set up a really simple test that used the following model :

[ProtoContract]
class Person
{
    [ProtoMember(1)]
    public string FirstName { get; set; }

    [ProtoMember(2)]
    public string LastName { get; set; }

    [ProtoMember(3)]
    public List<string> Emails { get; set; }
}

And I used the following code to create an object, and write it twice. Once with Protobuf and once with JSON :

var person = new Person
{
    FirstName = "Wade",
    LastName = "G", 
    Emails = new List<string>
    {
        "[email protected]", 
        "[email protected]"
    }
};

using (var fileStream = File.Create("person.buf"))
{
    Serializer.Serialize(fileStream, person);
}

var personString = JsonConvert.SerializeObject(person);
File.WriteAllText("person.json", personString);

The results were :

Format      File Size
Protobuf    46 bytes
JSON        85 bytes

So just by default, Protobuf is almost half the size. Obviously your mileage may vary depending on your data types and even your property names.

That last point is important because while Protobuf has other mechanisms keeping the size down, a big part of it is that all property names are serialized as integers rather than their string form. To illustrate this, I modified the model to look like so :

[ProtoContract]
class Person
{
    [ProtoMember(1)]
    [JsonProperty("1")]
    public string FirstName { get; set; }

    [ProtoMember(2)]
    [JsonProperty("2")]
    public string LastName { get; set; }

    [ProtoMember(3)]
    [JsonProperty("3")]
    public List<string> Emails { get; set; }
}

So now our JSON will be serialized with single digit names as well. When running this, our actual comparison table looks like so :

Format                        File Size
Protobuf                      46 bytes
JSON                          85 bytes
JSON With Digit Properties    65 bytes

So half of the benefits of using Protobuf when it comes to size instantly disappears! For now, I’m not going to use the single digit properties going forward because it’s not illustrative of what happens in the real world with JSON, but it’s an interesting little footnote that you can shrink your disk footprint with just this one simple hack that storage providers hate.

So overall, Protobuf has JSON beat when it comes to file size. That’s no surprise, but what about actual performance when working with objects?

Serialization Performance

Next, let’s take a look at serializing performance. There are a couple of notes on the methodology behind this

  • Because Protobuf serializes to bytes and JSON to strings, I wanted to leave them like that. e.g. I did not take the JSON string, and convert it into bytes as this would artificially create an overhead when there is no need.
  • I kept everything in memory (I did not write to a file etc)
  • I wanted to try and use *both* JSON.NET and Microsoft’s JSON Serializer. The latter is almost certainly going to be faster, but the former probably has more use cases out there in the wild.
  • For now, I’m just using the Protobuf.NET library for everything related to Protobuf
  • Use Protobuf as the “baseline”, so everything will be compared to how much slower (Or faster, you never know!) it is than Protobuf

With that in mind, here’s the benchmark using BenchmarkDotNet (Quick guide if you haven’t seen it before here : https://dotnetcoretutorials.com/2017/12/04/benchmarking-net-core-code-benchmarkdotnet/)

public class ProtobufVsJSONSerializeBenchmark
{
    static Person person = new Person
    {
        FirstName = "Wade",
        LastName = "G",
        Emails = new List<string>
        {
            "[email protected]",
            "[email protected]"
        }
    };

    [Benchmark(Baseline = true)]
    public byte[] SerializeProtobuf()
    {
        using(var memoryStream = new MemoryStream())
        {
            ProtoBuf.Serializer.Serialize(memoryStream, person);
            return memoryStream.ToArray();
        }
    }

    [Benchmark]
    public string SerializeJsonMicrosoft()
    {
        return System.Text.Json.JsonSerializer.Serialize(person);
    }

    [Benchmark]
    public string SerializeJsonDotNet()
    {
        return Newtonsoft.Json.JsonConvert.SerializeObject(person);
    }
}

And the results?

Format            Average Time    Baseline Comparison
Protobuf          680ns           (baseline)
Microsoft JSON    743ns           9% Slower
JSON.NET          1599ns          135% Slower

So we can see that Protobuf is indeed faster, but not by a heck of a lot. And of course, I’m willing to bet a keen eyed reader will drop a comment below and tell me how the benchmark could be improved to make Microsoft’s JSON serializer even faster.

Of course JSON.NET is slower, and that is to be expected, but again I’m surprised that Protobuf, while fast, isn’t *that* much faster. How about deserialization?

Deserialization Performance

We’ve done serialization, so let’s take a look at the reverse – deserialization.

I do want to point out one thing before we even start, and that is that JSON.NET and Microsoft’s JSON library handle case sensitivity with JSON *very* differently. In fact, JSON.NET is case insensitive by default, and that is the *only* way it can run. Microsoft’s JSON library is case sensitive by default and must be switched to handle case insensitivity at a huge cost. I have an entire article dedicated to the subject here : https://dotnetcoretutorials.com/2020/01/25/what-those-benchmarks-of-system-text-json-dont-mention/
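
For reference, the switch being described is the option below in System.Text.Json; turning it on is what carries the extra cost discussed in that article (this snippet is just for illustration and isn't part of the benchmark) :

var options = new JsonSerializerOptions
{
    //Off by default - switching this on makes System.Text.Json tolerate camelCase input like JSON.NET does, at a cost. 
    PropertyNameCaseInsensitive = true
};

var person = System.Text.Json.JsonSerializer.Deserialize<Person>("{\"firstName\":\"Wade\",\"lastName\":\"G\"}", options);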

In some ways, that somewhat invalidates our entire test (At least when comparing JSON.NET to Microsoft’s JSON), because it actually entirely depends on whether your JSON is in the exact casing you require (In most cases that’s going to be PascalCase), or if it’s in camelCase (In which case you take a performance hit). But for now, let’s push that aside and try our best to create a simple benchmark.

Other things to note :

  • Again, I want to work with the native format of each serializer. So Protobuf will be deserializing from a byte array, and JSON will be deserializing from a string
  • I *had* to create a memory stream for Protobuf. At least without making the test more complicated than it needed to be.

public class ProtobufVsJSONDeserializeBenchmark
{
    public static Person person = new Person
    {
        FirstName = "Wade",
        LastName = "G",
        Emails = new List<string>
        {
            "[email protected]",
            "[email protected]"
        }
    };

    static byte[] PersonBytes;
    static string PersonString;

    [GlobalSetup]
    public void Setup()
    {
        using (var memoryStream = new MemoryStream())
        {
            ProtoBuf.Serializer.Serialize(memoryStream, person);
            PersonBytes = memoryStream.ToArray();
        }

        PersonString = JsonConvert.SerializeObject(person);
    }

    [Benchmark(Baseline = true)]
    public Person DeserializeProtobuf()
    {
        using (var memoryStream = new MemoryStream(PersonBytes))
        {
            return ProtoBuf.Serializer.Deserialize<Person>(memoryStream);
        }
    }

    [Benchmark]
    public Person DeserializeJsonMicrosoft()
    {
        return System.Text.Json.JsonSerializer.Deserialize<Person>(PersonString);
    }

    [Benchmark]
    public Person DeserializeJsonDotNet()
    {
        return Newtonsoft.Json.JsonConvert.DeserializeObject<Person>(PersonString);
    }
}

I know it’s a big bit of code to sift through but it’s all relatively simple. We are just deserializing back into a Person object. And the results?

Format            Average Time    Baseline Comparison
Protobuf          1.019us         (baseline)
Microsoft JSON    1.238us         21% Slower
JSON.NET          2.598us         155% Slower

So overall, Protobuf wins again and by a bigger margin this time than our Serialization effort (When it comes to percentage). But again, your mileage will vary heavily depending on what format your JSON is in.

Conclusion

The overall conclusion is that indeed, Protobuf is faster than JSON by a reasonable margin, or a huge margin if comparing it to JSON.NET. However, in some respects a big part of the difference is likely to lie in how JSON is always serialized as strings versus the direct byte serialization of Protobuf. But that’s just a hunch of mine.

When it comes to file size, Protobuf wins out again, *especially* when serializing full JSON property names. Obviously here we are talking about the difference between a few bytes, but when you are storing say 500GB of data in Protobuf, that same data would be 1000GB in JSON, so it definitely adds up.

That’s all I’m doing on Protobuf for a bit and I hope you’ve learnt something a bit new. Overall, just in my personal view, don’t get too dragged into the hype. Protobuf is great and it does what it says on the tin. But it’s just another data format, nothing to be afraid of!


This is a 4 part series on working with Protobuf in C# .NET. While you can start anywhere in the series, it’s always best to start at the beginning!

Part 1 – Getting Started
Part 2 – Serializing/Deserializing
Part 3 – Using Length Prefixes
Part 4 – Performance Comparisons


In the last post in this series, we looked at how we can serialize and deserialize a single piece of data to and from Protobuf. For the most part, this is going to be your bread and butter way of working with Protocol Buffers. But there’s actually a slightly “improved” way of serializing data that might come in handy in certain situations, and that’s using “Length Prefixes”.

What Are Length Prefixes In Protobuf?

Length Prefixes sound a bit scary but really it’s super simple. Let’s first start with a scenario of “why” we would want to use length prefixes in the first place.

Imagine that I have multiple objects that I want to push into a single Protobuf stream. Let’s say using our example from previous posts, I have multiple “Person” objects that I want to push across the wire to another application.

Because we are sending multiple objects at once, and they are all encoded as bytes, we need to know when one person ends, and another begins. There are really two ways to solve this :

  • Have a unique byte code that won’t appear in your data, but can be used as a “delimiter” between items
  • Use a “Length Prefix” whereby the first byte (Or bytes) in a stream say how long the first object is, and you know after that many bytes, you can then read the next byte to figure out how long the next item is.

I’ve actually seen *both* options used with Protobuf, but the more common one these days is to use the latter. Mostly because it’s pretty fail-safe (You don’t have to pick some special delimiter character), but also because you can know ahead of time how large the upcoming object is (You don’t have to just keep reading blindly until you reach a special byte character).

I’m not much of a Photoshop guy, but the stream of data essentially looks like alternating length prefixes and message payloads laid end to end : [length][message 1][length][message 2], and so on.

When reading this data, it might work like so :

  • Read the first 4 bytes to understand how long Message 1 will be
  • Read exactly that many bytes and store as Message 1
  • We can now read the next 4 bytes to understand exactly how long Message 2 will be
  • Read exactly that many bytes and store as Message 2

And so on, and we could actually do this forever if the stream was a constant pump of data. As long as we read the first set of bytes to know how long the next message is, we don’t need any other breaking up of the messages. And again, it’s a boon to us to use this method as we never have to pre-fetch data to know what we are getting.
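
To make the idea concrete before we reach for the library, here is a rough sketch of reading framed messages by hand, assuming a 4 byte little-endian length prefix in front of each payload (Protobuf.NET’s Fixed32 style is similar in spirit, but the exact layout here is illustrative, and we’d normally let the library do this for us) :

static IEnumerable<byte[]> ReadFramedMessages(Stream stream)
{
    var lengthBuffer = new byte[4];

    //Keep reading frames until the stream runs out of data. 
    while (stream.Read(lengthBuffer, 0, 4) == 4)
    {
        //The first 4 bytes tell us exactly how long the next message is. 
        var length = BitConverter.ToInt32(lengthBuffer, 0);

        var messageBuffer = new byte[length];
        var read = 0;

        //Read exactly that many bytes - this is one complete message. 
        while (read < length)
        {
            var chunk = stream.Read(messageBuffer, read, length - read);
            if (chunk == 0) throw new EndOfStreamException();
            read += chunk;
        }

        yield return messageBuffer;
    }
}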

In all honesty, Length Prefixing is not Protobuf specific. After all, the data following could be in *any* format, but it’s probably one of the few data formats that seem to have it really baked in. So much so that of course our Protobuf.NET library from earlier posts has out of the box functionality to handle it! So let’s jump into that now.

Using Protobuf Length Prefixes In C# .NET

As always, if you’re only just jumping into this post without reading the previous ones in the series, you’ll need to install the Protobuf.NET library by using the following command on your package manager console.

Install-Package protobuf-net

Then the code to serialize multiple items to the same data stream might look like so :

var person1 = new Person
{
    FirstName = "Wade",
    LastName = "G"
};

var person2 = new Person
{
    FirstName = "John",
    LastName = "Smith"
};

using (var fileStream = File.Create("persons.buf"))
{
    Serializer.SerializeWithLengthPrefix(fileStream, person1, PrefixStyle.Fixed32);
    Serializer.SerializeWithLengthPrefix(fileStream, person2, PrefixStyle.Fixed32);
}

This is a fairly verbose example to write to a file, but obviously you could be writing to any data stream, looping through a list of people etc. The important thing is that our Serialize call changes to “SerializeWithLengthPrefix”.

Nice and easy!

And then to deserialize, there are some tricky things to look out for. But our basic code might look like so :

using (var fileStream = File.OpenRead("persons.buf"))
{
    Person person = null;
    do
    {
        person = Serializer.DeserializeWithLengthPrefix<Person>(fileStream, PrefixStyle.Fixed32);
    } while (person != null);
}

Notice how we actually *loop* the DeserializeWithLengthPrefix. This is because if there are multiple items within the stream, calling this method will return *one* item each time it’s called (And also move the stream to the start of the next item). If we reach the end of the stream and call this again, we will instead return a null object.

Alternatively, you can call DeserializeItems to instead return an IEnumerable of items. This is actually very similar to serializing one at a time because the IEnumerable is lazy loaded.

using (var fileStream = File.OpenRead("persons.buf"))
{
    var persons = Serializer.DeserializeItems<Person>(fileStream, PrefixStyle.Fixed32, -1);
}
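
Because that enumerable is lazily evaluated, you will generally want to consume it while the stream is still open, for example :

using (var fileStream = File.OpenRead("persons.buf"))
{
    //DeserializeItems is lazy, so enumerate before the stream is disposed. 
    foreach (var person in Serializer.DeserializeItems<Person>(fileStream, PrefixStyle.Fixed32, -1))
    {
        Console.WriteLine(person.FirstName);
    }
}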

Because the Protobuf.NET library is so easy to use, I don’t want to really dive into every little overloaded method. But the important thing to understand is that when using Length Prefixes, we can push multiple pieces of data to the same stream without any additional legwork required. It’s really great!

Of course, all of this isn’t really worth it unless there is some sort of performance gains right? And that’s what we’ll be looking at in the next part of this series. Just how does ProtoBuf compare to something like JSON? Take a look at all of that and more here : https://dotnetcoretutorials.com/2022/01/18/protobuf-in-c-net-part-4-performance-comparisons/


This is a 4 part series on working with Protobuf in C# .NET. While you can start anywhere in the series, it’s always best to start at the beginning!

Part 1 – Getting Started
Part 2 – Serializing/Deserializing
Part 3 – Using Length Prefixes
Part 4 – Performance Comparisons


In our last post, we spent much of the time talking about how proto contracts work. But obviously that’s all for nothing if we don’t start serializing some data. Thankfully for us, the Protobuf.NET library takes almost all of the leg work out of it, and we more or less follow the same paradigms that we did when working with XML or JSON in C#.

Of course, if you haven’t already, install Protobuf.NET into your application using the following package manager console command :

Install-Package protobuf-net

I’m going to be using the same C# contract we used in the last post. But for reference, here it is again.

[ProtoContract]
class Person
{
    [ProtoMember(1)]
    public string FirstName { get; set; }

    [ProtoMember(2)]
    public string LastName { get; set; }

    [ProtoMember(3)]
    public List<string> Emails { get; set; }
}

And away we go!

Serializing Data

To serialize or write our data in protobuf format, we simply need to take our object and push it into a stream. An in memory example (For example if you needed a byte array to send somewhere else), would look like this :

var person = new Person
{
    FirstName = "Wade",
    LastName = "Smith",
    Emails = new List<string>
    {
        "[email protected]", 
        "[email protected]"
    }
};

using(var memoryStream = new MemoryStream())
{
    Serializer.Serialize(memoryStream, person);
    var byteArray = memoryStream.ToArray();
}

So ignoring our set up code there for the Person object, we’ve basically serialized in 1 or 5 lines of code depending on if you want to count the setup of the memory stream. Pretty trivial and it makes all that talk about Protobuf being some sort of voodoo really just melt away.

If we wanted to, we could instead serialize directly to a file like so :

using (var fileStream = File.Create("person.buf"))
{
    Serializer.Serialize(fileStream, person);
}

This leaves us with a person.buf file locally. Of course, if we open this file in a text editor it’s unreadable (Protobuf is not human readable when serialized), but we can use a tool such as https://protogen.marcgravell.com/decode to open the file and tell us what’s inside of it.

Doing that, we get :

Field #1: 0A String Length = 4, Hex = 04, UTF8 = “Wade”
Field #2: 12 String Length = 5, Hex = 05, UTF8 = “Smith”
Field #3: 1A String Length = 20, Hex = 14, UTF8 = “[email protected] …” (total 20 chars)
Field #3: 1A String Length = 18, Hex = 12, UTF8 = “[email protected] …” (total 18 chars)

Notice that the fields within our protobuf file are identified by their integer identifier, *not* by their string property name. Again, this is important to understand because we need the same proto contract identifiers on both ends to know that Field 1 is actually a person’s first name.

Well that’s serialization done, how about deserializing?

Deserializing Data

Of course if serializing data can be done in 1 line of code, deserializing or reading back the data is going to be just as easy.

using (var fileStream = File.OpenRead("person.buf"))
{
    var myPerson = Serializer.Deserialize<Person>(fileStream);
    Console.WriteLine(myPerson.FirstName);
}

This is us simply reading a file and deserializing it into our myPerson object. It’s somewhat trivial and really straightforward if I’m being honest, and there actually isn’t too much to deep dive into.

That is.. Until we start talking about length prefixes. Length prefixes are Protobuf’s way of serializing several pieces of data into the same data stream. So imagine we have 5 people : how can we store 5 people in the same file or data stream and know when one person’s data ends, and another begins? In the next part of this series we’ll be taking a look at just how that works with Protobuf.NET! Check it out : https://dotnetcoretutorials.com/2022/01/14/protobuf-in-c-net-part-3-using-length-prefixes/


This is a 4 part series on working with Protobuf in C# .NET. While you can start anywhere in the series, it’s always best to start at the beginning!

Part 1 – Getting Started
Part 2 – Serializing/Deserializing
Part 3 – Using Length Prefixes
Part 4 – Performance Comparisons


I had just started programming in C# when JSON started gaining steam as the “XML Killer”. Being new to software development, I didn’t really have a horse in the race, but I was fascinated by the almost tribal level of care people put into such a simple thing as competing data formats.

Surprisingly, Google actually released Protobuf (Or Protocol Buffers) in 2008, but I think it’s only just started to pick up steam (Or maybe that’s just in the .NET world). I recently worked on a project that used it, and while not to the level of JSON vs XML, I still saw some similarities in how Protobuf was talked about. Mainly that it was almost made out to be some voodoo world changing piece of technology. All I could think was “But.. It’s just a data serialization format right?”.

The Protobuf docs (just in my view) are not exactly clear in spelling out just what Protobuf is and how it works. Mainly I think that’s because much of the documentation out there takes a language neutral approach to describing how it works. But imagine if you were just learning XML, and you learnt all of the intricacies of XML namespaces, declarations, or entities before actually doing the obvious and serializing a simple piece of data down, looking at it, then deserializing it back up.

That’s what I want to do with this article series. Take Protobuf and give you a dead simple overview with C# in mind and show you just how obvious and easy it really is.

Defining Proto Message Contracts

The first thing we need to understand is the Proto Message Contract. These look scary and maybe even a bit confusing as to what they are used for, but they are actually dead simple. A proto message definition would look like this (In proto format) :

syntax="proto3";

message Person {
  string firstName = 1;
  string lastName = 2;
  repeated string emails = 3;
}

Just look at this like any other class definition in any language :

  • We have our message name (Person)
  • We have our fields and their types (For example firstName is a string)
  • We can have “repeated” fields (Basically arrays/lists in C#)
  • We have an integer identifier for each field. This integer is used *instead* of the field name when we serialize. For example, if we serialized someone with the first name Bob, the serialized content would not have “firstName=’bob'”, it would have “1=’bob'”.

The last point there may be tricky at first but just think about it like this. Using numerical identifiers for each field means you can save a lot of space when dealing with big data because you aren’t subject to storing the entire field name when you serialize.

These contracts are nothing more than a universal way to describe what a message looks like when we serialize it. In my view, it’s no different than an XML or JSON schema. Put simply, we can take this contract and give it to anyone and they will know what the data will look like when we send it to them.

If we take this proto message, and paste it into a converter like the one by Marc Gravell here : https://protogen.marcgravell.com/ We can get what a generated C# representation of this data model will look like (And a bit more on this later!).

The fact is, if you are talking between two systems with Protobuf, you may not even need to worry about ever writing or seeing contracts in this format. It’s really no different than someone flicking you an email with something like :

Hey about that protobuf message, it’s going to be in this format :

Firstname will be 1. It’s a string.
LastName will be 2. It’s also a string.
Emails will be 3, and it’s going to be an array of strings

It’s that simple.

Proto Message Contracts In C#

When it comes to working with JSON in C# .NET, you have JSON.NET, so it only makes sense when you are working with Protobuf in C# .NET you have… Protobuf.NET (Again by the amazing Marc Gravell)! Let’s spin up a dead simple console application and add the following package via the package manager console :

Install-Package protobuf-net

Now I will say there are actually a few Protobuf C# libraries floating around, including one by Google. But what I typically find is that these are converted Java libraries, and as such they don’t really conform to how C# is typically written. Protobuf.NET on the other hand is very much a C# library from the bottom up, which makes it super easy and intuitive to work with.

Let’s then take our person class, and use a couple of special attributes given to us by the Protobuf.NET library :

[ProtoContract]
class Person
{
    [ProtoMember(1)]
    public string FirstName { get; set; }

    [ProtoMember(2)]
    public string LastName { get; set; }

    [ProtoMember(3)]
    public List<string> Emails { get; set; }
}

If we compare this to our Proto contract from earlier, it’s a little less scary right? It’s just a plain old C# class, but with a couple of attributes to ensure that we are serializing to the correct identifiers.

I’ll also point something else out here : because we are using integer identifiers, the casing of our properties no longer matters at all. Coming from the C# world where we love PascalCase, this is enormously easy on the eyes. But even more so, when we take a look at performance a bit later on in this series, it will become even clearer what a good decision this is, because we no longer have to fiddle around parsing strings, including whether the casing is right or not.

I’ll say it  again that if you have an existing Proto message contract given to you (For example, someone else is building an application in Java and they have given you the contract only), you can simply run it through Marc Gravell’s Protogen tool here : https://protogen.marcgravell.com/

It does generate a bit of a verbose output :

[global::ProtoBuf.ProtoContract()]
public partial class Person : global::ProtoBuf.IExtensible
{
    private global::ProtoBuf.IExtension __pbn__extensionData;
    global::ProtoBuf.IExtension global::ProtoBuf.IExtensible.GetExtensionObject(bool createIfMissing)
        => global::ProtoBuf.Extensible.GetExtensionObject(ref __pbn__extensionData, createIfMissing);

    [global::ProtoBuf.ProtoMember(1)]
    [global::System.ComponentModel.DefaultValue("")]
    public string firstName { get; set; } = "";

    [global::ProtoBuf.ProtoMember(2)]
    [global::System.ComponentModel.DefaultValue("")]
    public string lastName { get; set; } = "";

    [global::ProtoBuf.ProtoMember(3, Name = @"emails")]
    public global::System.Collections.Generic.List<string> Emails { get; } = new global::System.Collections.Generic.List<string>();

}

But for larger contracts it may just work well as a scaffolding tool for you!

So defining contracts is all well and good, how do we go about Serializing the data? Let’s check that out in Part 2! https://dotnetcoretutorials.com/2022/01/13/protobuf-in-c-net-part-2-serializing-deserializing/


Now that the flames have simmered down on the Hot Reload Debacle, maybe it’s time again to revisit this feature!

I legitimately feel this is actually one of the best things to be released with .NET in a while. The number of frustrating times I’ve had to restart my entire application because of one small typo… whereas now it’s Hot Reload to the rescue!

It’s actually a really simple feature so this isn’t going to be too long. You’ll just have to give it a crack and try it out yourself. In short, it looks like this when used :

In short, I have a console application that is inside a never ending loop. I can change the Console.WriteLine text, and immediately see the results of my change *without* restarting my application. That’s the power of Hot Reload!

And it isn’t just limited to Console Applications. It (should) work with Web Apps, Blazor, WPF applications, really anything you can think of. Obviously there are some limitations. Notably, if you edit your application startup (Or other run-once type code), your application will still hot reload, but it doesn’t re-run those code blocks, meaning you’ll need to restart your application to get that startup code to run again. I’ve also at times had Hot Reload fail with various errors, usually meaning I just restart and we are away again.

Honestly, one of the biggest things to get used to is the mentality of Hot Reload actually doing something. It’s very hard to “feel” like your changes have been applied. If I’m fixing a bug, and I do a reload and the bug still exists…. It’s hard for me to not stop the application completely and restart just to be sure!

Hot Reload In Visual Studio 2022

Visual Studio 2019 *does* have hot reload functionality, but it’s less featured (At least for me). Thus I’m going to show off Visual Studio 2022 instead!

All we need to do is edit our application while it’s running, then look to our nice little task bar in Visual Studio for the following icon :

That little icon with two fishes swimming after each other (Or.. at least that’s what it looks like to me) is Hot Reload. Press it, and you are done!

If that’s a little too labour intensive for you, there is even an option to Hot Reload on file save.

If you’re coming from a front end development background you’ll be used to file watchers recompiling your applications on save. On larger projects I’ve found this to maybe be a little bit more pesky (If Hot Reload is having issues, having popups firing off every save is a bit annoying), but on smaller projects I’ve basically run this without a hitch every time.

Hot Reload From Terminal

Hot Reload from a terminal or command line is just as easy. Simply run the following from your project directory :

dotnet watch

Note *without* typing run after (Just in case you used to use “dotnet watch run”). And that’s it!

Your application will now run with Hot Reload on file save switched on! Doing this you’ll see output looking something like

watch : Files changed: F:\Projects\Core Examples\HotReload\Program.cs~RF1f7ccc54.TMP, F:\Projects\Core Examples\HotReload\Program.cs, F:\Projects\Core Examples\HotReload\qiprgg31.zfd~
watch : Hot reload of changes succeeded.

And then you’re away laughing again!

 
