This post is part of a series on Channels in .NET. Of course, it’s always better to start at Part 1, but you can skip anywhere you’d like using the links below.

Part 1 – Getting Started
Part 2 – Advanced Channels
Part 3 – Understanding Back Pressure


Up until this point, we have been using what’s called an “Unbounded” Channel. You’ll notice that when we create the channel, we do something like so :

var myChannel = Channel.CreateUnbounded<int>();

But actually, we can do something like :

var myChannel = Channel.CreateBounded<int>(1000);

This isn’t too dissimilar from creating another collection type such as a List or an Array that has a limited capacity. In our example, we’ve created a channel that will hold at most 1000 items. But why limit ourselves? Well.. That’s where Back Pressure comes in.

What Is Back Pressure?

Back Pressure in computing terms (especially when it comes to messaging/queuing) is the idea that resources (whether that’s memory, network capacity, or for example a rate limit on a required external API) are limited, and that we should be able to apply “pressure” back up the chain to try and relieve some of that load. At the very least, we should let others in the ecosystem know that we are under load and may take some time to process their requests.

Generally speaking, when we talk about back pressure with queues, almost universally we are talking about a way to tell anyone trying to add more items to the queue that either they simply cannot enqueue any more items, or that they need to back off for a period of time. More rarely, we are talking about queues purely dropping messages once we reach a certain capacity. These cases are rare (since generally you don’t want messages to simply die), but we do have the option.

So how does that work with .NET channels?

Back Pressure Options For Channels

We actually have a very simple way of adding back pressure when using Channels. The code looks like so :

var channelOptions = new BoundedChannelOptions(5)
{
    FullMode = BoundedChannelFullMode.Wait
};

var myChannel = Channel.CreateBounded<int>(channelOptions);

We can specify the following Full Modes :

Wait
Simply make the caller wait on their WriteAsync() call until there is space in the channel.

DropNewest/DropOldest
Either drop the oldest or the newest items in the channel to make room for the item we want to add.

DropWrite
Simply dump the message that we were supposed to write.
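To illustrate the drop modes, here’s a minimal sketch (the capacity of 2 and the int payloads are purely for demonstration). With DropOldest, writes never wait; the channel silently evicts its head to make room:

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // Capacity of 2. When full, throw away the oldest item to make room.
        var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(2)
        {
            FullMode = BoundedChannelFullMode.DropOldest
        });

        for (int i = 1; i <= 3; i++)
        {
            await channel.Writer.WriteAsync(i); // never blocks in this mode
        }
        channel.Writer.Complete();

        // Item 1 was dropped when item 3 arrived, so only 2 and 3 remain.
        while (channel.Reader.TryRead(out var item))
        {
            Console.WriteLine(item); // prints 2, then 3
        }
    }
}
```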

There are also two extra pieces of code you should be aware of.

You can call WaitToWriteAsync() :

await myChannel.Writer.WaitToWriteAsync();

This lets us “wait out” the bounded limits of the channel. e.g. While the channel is full, we can call this to simply wait until there is space. This means that even if the DropWrite FullMode is turned on, we can limit the number of messages we are dropping on the ground by simply waiting until there is capacity.

The other piece of code we should be aware of is :

var success = myChannel.Writer.TryWrite(i);

This allows us to try and write to the channel, and returns whether or not we were successful. It’s important to note that this method is not async. Either we can write to the channel or we can’t; there is no “Well.. You maybe could if you waited a bit longer”.

ENJOY THIS POST?
Join over 3,000 subscribers who are receiving our weekly post digest, a roundup of this week’s blog posts.
We hate spam. Your email address will not be sold or shared with anyone else.



In our previous post we looked at some dead simple examples of how Channels work, and we saw some pretty nifty features, but for the most part it was pretty similar to any other queue implementation. So let’s dive into some more advanced topics. Well.. I say advanced, but so much of this is dead simple. This might read like a bit of a feature run through, but there is a lot to love!

Separation Of Read/Write Concerns

If you’ve ever shared a Queue between two classes, you’ll know that either class can read/write, even if they aren’t supposed to. For example :

class MyProducer
{
    private readonly Queue<int> _queue;

    public MyProducer(Queue<int> queue)
    {
        _queue = queue;
    }
}

class MyConsumer
{
    private readonly Queue<int> _queue;

    public MyConsumer(Queue<int> queue)
    {
        _queue = queue;
    }
}

So while a Producer is supposed to only write to the queue, and a Consumer is supposed to only read, in both cases they can do all operations on the queue. While you might in your own head want the Consumer to only read, another developer might come along and quite happily start calling Enqueue and there’s nothing but a code review to stop them making that mistake.

But with Channels, we can do things differently.

class Program
{
    static async Task Main(string[] args)
    {
        var myChannel = Channel.CreateUnbounded<int>();
        var producer = new MyProducer(myChannel.Writer);
        var consumer = new MyConsumer(myChannel.Reader);
    }
}

class MyProducer
{
    private readonly ChannelWriter<int> _channelWriter;

    public MyProducer(ChannelWriter<int> channelWriter)
    {
        _channelWriter = channelWriter;
    }
}

class MyConsumer
{
    private readonly ChannelReader<int> _channelReader;

    public MyConsumer(ChannelReader<int> channelReader)
    {
        _channelReader = channelReader;
    }
}

In this example I’ve added a Main method to show you how the creation of the writer/reader happens, but it’s dead simple. So here we can see that for our Producer, I’ve passed it only a ChannelWriter, so it can only do write operations. And for our Consumer, we’ve passed it a ChannelReader, so it can only read.

Of course it doesn’t mean that another developer can’t just modify the code and start injecting the root Channel object, or passing in both the ChannelWriter/ChannelReader, but it at least lays out much better what the intention of the code is.
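To make the split concrete, here’s a sketch of what a method on each class might look like (the ProduceAsync/ConsumeAsync names are my own invention): the writer half simply has no read operations to call, and vice versa.

```csharp
using System.Threading.Channels;
using System.Threading.Tasks;

class MyProducer
{
    private readonly ChannelWriter<int> _channelWriter;

    public MyProducer(ChannelWriter<int> channelWriter) => _channelWriter = channelWriter;

    public async Task ProduceAsync(int item)
    {
        // _channelWriter exposes write operations only - there is no ReadAsync here.
        await _channelWriter.WriteAsync(item);
    }
}

class MyConsumer
{
    private readonly ChannelReader<int> _channelReader;

    public MyConsumer(ChannelReader<int> channelReader) => _channelReader = channelReader;

    public async Task<int> ConsumeAsync()
    {
        // And _channelReader exposes read operations only - no WriteAsync in sight.
        return await _channelReader.ReadAsync();
    }
}
```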

Completing A Channel

We saw earlier that when we call ReadAsync() on a channel, it will actually sit there waiting for messages, but what if there aren’t any more messages coming? Maybe this is a one time batch job and the batch is completed. Normally with other queues in .NET, we would have to have some sort of shared boolean and/or a CancellationToken passed around. But with Channels, it’s even easier.

Consider the following :

static async Task Main(string[] args)
{
    var myChannel = Channel.CreateUnbounded<int>();

    _ = Task.Factory.StartNew(async () =>
    {
        for (int i = 0; i < 10; i++)
        {
            await myChannel.Writer.WriteAsync(i);
        }

        myChannel.Writer.Complete();
    });

    try
    {
        while (true)
        {
            var item = await myChannel.Reader.ReadAsync();
            Console.WriteLine(item);
            await Task.Delay(1000);
        }
    }
    catch (ChannelClosedException e)
    {
        Console.WriteLine("Channel was closed!");
    }
}

I’ve made it so that our second thread writes to our channel as fast as possible, then completes it. Then our reader slowly reads with a delay of 1 second between reads. Notice that we catch the ChannelClosedException, which is thrown when you try and read from the closed channel *after* the final message.

I just want to make that clear. Calling Complete() on a channel does not immediately close the channel and kill everyone reading from it. It’s instead a way to notify any readers that once the last message is read, we’re done. That’s important because it means it doesn’t matter if Complete() is called while we are waiting for new items, while the queue is empty, or while it’s full etc. We can be sure that we will complete all available work and then finish up.
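A small sketch to demonstrate: after Complete(), new writes fail, but any items already in the channel can still be read out.

```csharp
using System.Threading.Channels;

var channel = Channel.CreateUnbounded<int>();
channel.Writer.TryWrite(42);
channel.Writer.Complete();

// No new writes are accepted once the channel is completed...
var wrote = channel.Writer.TryWrite(43);         // false

// ...but readers can still drain what was already written.
var read = channel.Reader.TryRead(out var item); // true, item is 42
```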

Using IAsyncEnumerable With Channels

If we take our example when we try and close a channel, there are two things that stick out to me.

  1. We have a while(true) loop. And this isn’t really that bad, but it’s a bit of an eyesore.
  2. To break out of this loop, and to know that the channel is completed, we have to catch an exception and essentially swallow it.

These problems are solved using the method ReadAllAsync(), which returns an IAsyncEnumerable (a bit more on how IAsyncEnumerable works right here). The code looks a bit like so :

static async Task Main(string[] args)
{
    var myChannel = Channel.CreateUnbounded<int>();

    _ = Task.Factory.StartNew(async () =>
    {
        for (int i = 0; i < 10; i++)
        {
            await myChannel.Writer.WriteAsync(i);
        }

        myChannel.Writer.Complete();
    });

    await foreach(var item in myChannel.Reader.ReadAllAsync())
    {
        Console.WriteLine(item);
        await Task.Delay(1000);
    }
}

Now the code reads a lot better and removes some of the extra gunk around catching the exception. Because we are using an IAsyncEnumerable, we can still wait on each item like we previously did, but we no longer have to catch an exception because when the channel completes, it simply says it has nothing more and the loop exits.

Again, this gets rid of some of the messy code you used to have to write when dealing with queues. Where previously you had to write some sort of infinite loop with a breakout clause, now it’s just a real tidy loop that handles everything under the hood.

What’s Next

So far, we’ve been using “Unbounded” channels. And as you’ve probably guessed, of course there is an option to use a Bounded Channel instead. But what is this? And how does the term “back pressure” relate to it? Check out the next part of this series for a better understanding of back pressure.




I’ve recently been playing around with the new Channel<T> type that was introduced in .NET Core 3.X. I think I played around with it when it was first released (along with Pipelines), but the documentation was very, very sparse and I couldn’t understand how they were different from any other queue.

After playing around with them, I can finally see the appeal and the real power they possess. Most notably with large asynchronous background operations that need almost two way communication to synchronize what they are doing. That sentence is a bit of a mouthful, but hopefully by the end of this series it will be clear when you should use Channel<T>, and when you should use something more basic like Queue<T>.

What Are Channels?

At its heart, a Channel is essentially a new collection type in .NET that acts very much like the existing Queue<T> type (and its siblings like ConcurrentQueue), but with additional benefits. The problem I found when really trying to research the subject is that many existing external queuing technologies (IBM MQ, RabbitMQ etc) have a concept of a “channel”, and they range from describing it as a completely abstract concept to it being an actual physical type in their system.

Now maybe I’m completely off base here, but if you think about a Channel in .NET as simply being a queue with additional logic around it: the ability to wait on new messages, to tell the producer to hold up because the queue is getting large and the consumer can’t keep up, and great thread safe support, then I think it’s hard to go wrong.

Now I mentioned a bit of a keyword there: Producer/Consumer. You might have heard of this before, and its sibling Pub/Sub. They are not interchangeable.

Pub/Sub describes the act of someone publishing a message, and one or many “subscribers” listening for that message and acting on it. There is no distributing of load, because as you add subscribers, they essentially get a copy of the same messages as everyone else.

In diagram form, Pub/Sub looks a bit like this :

Producer/Consumer describes the act of a producer publishing a message, and there being one or more consumers who can act on that message, but each message is only read once. It is not duplicated out to each subscriber.

And of course in diagram form :

Another way to think about Producer/Consumer is to think about you going to a supermarket checkout. As customers try to checkout and the queue gets longer, you can simply open more checkouts to process those customers. This little thought process is actually important because what happens if you can’t open any more checkouts? Should the queue just keep getting longer and longer? What about if a checkout operator is sitting there but there are no customers? Should they just pack it in for the day and go home, or should they be told to just sit and wait until there are customers?

This is often called the Producer-Consumer problem, and one that Channels aim to solve.

Basic Channel Example

Everything to do with Channels lives inside the System.Threading.Channels namespace. In later versions of .NET this is bundled with your standard .NET Core project, but if not, a nuget package lives here : https://www.nuget.org/packages/System.Threading.Channels.

An extremely simple example of channels would look like so :

static async Task Main(string[] args)
{
    var myChannel = Channel.CreateUnbounded<int>();

    for(int i=0; i < 10; i++)
    {
        await myChannel.Writer.WriteAsync(i);
    }

    for(int i=0; i < 10; i++)
    {
        var item = await myChannel.Reader.ReadAsync();
        Console.WriteLine(item);
    }
}

There’s not a whole heap to talk about here. We create an “Unbounded” channel (which means it can hold infinite items, but more on that further in the series), and we write 10 items and read 10 items. At this point it’s not a lot different from any other queue we’ve seen in .NET.

Channels Are Threadsafe

That’s right, Channels are threadsafe, meaning that multiple threads can be reading/writing to the same channel without issue. If we take a peek at the Channels source code here, we can see that it’s threadsafe because it uses a combination of locks and an internal “queue” to synchronise readers/writers to read/write one after the other.

In fact, the intended use case of Channels is multi threaded scenarios. For example, if we take our basic code from above, there is actually a bit of overhead in maintaining our thread safety when we don’t actually need it, so we are probably better off just using a Queue<T> in that instance. But what about this code?

static async Task Main(string[] args)
{
    var myChannel = Channel.CreateUnbounded<int>();

    _ = Task.Factory.StartNew(async () =>
    {
        for (int i = 0; i < 10; i++)
        {
            await myChannel.Writer.WriteAsync(i);
            await Task.Delay(1000);
        }
    });

    while(true)
    {
        var item = await myChannel.Reader.ReadAsync();
        Console.WriteLine(item);
    }
}

Here we have a separate thread pumping messages in, while our main thread reads the messages out. The interesting thing you’ll notice is that we’ve added a delay between messages. So how come we can call ReadAsync() and things just…. work? There is no TryDequeue or Dequeue that returns null if there are no messages in the queue, right?

Well, the answer is that a Channel Reader’s ReadAsync() method will actually *wait* for a message (but not *block*). So you don’t need some ridiculously tight loop while you wait for messages, and you don’t need to block a thread entirely while waiting. We’ll talk about this more in upcoming posts, but just know you can use ReadAsync() to essentially await a new message coming through instead of writing some custom tightly wound code to do the same.
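For completeness, the reader also exposes TryRead() and WaitToReadAsync(), which you can combine to drain everything currently available and then wait efficiently for more. A small self-contained sketch:

```csharp
using System;
using System.Threading.Channels;

var myChannel = Channel.CreateUnbounded<int>();
myChannel.Writer.TryWrite(1);
myChannel.Writer.TryWrite(2);
myChannel.Writer.Complete();

// The inner loop empties whatever is available without awaiting per item.
// The outer loop waits (without blocking a thread) for more to arrive,
// and WaitToReadAsync returns false once the channel is completed and drained.
while (await myChannel.Reader.WaitToReadAsync())
{
    while (myChannel.Reader.TryRead(out var item))
    {
        Console.WriteLine(item); // prints 1, then 2, then the loop exits
    }
}
```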

What’s Next?

Now that you’ve got the basics down, let’s look at some more advanced scenarios using Channels.


Azure’s Key Vault is a great secret store with excellent support in .NET (obviously!). But I recently ran into an issue that sent me in circles trying to work out how to load certificates that have been uploaded to Key Vault from .NET. Or more specifically, there are a bunch of gotchas when loading certificates into a .NET/.NET Core application running in Azure App Service. Which, given both are Azure services, you’ll probably run into one time or another.

But first, let’s just talk about the code to load a certificate from Key vault in general.

C# Code To Load Certificates From Keyvault

If you’ve already got a Key vault instance (or have newly created one), you’ll need to ensure that you, as in your login to Azure, have been added to the access policy for the Key vault.

A quick note on Access Policies in general. They can become pretty complex and Key vault recently added the ability to use role based authentication and a few other nifty features. You can even authenticate against KV using a local certificate on the machine. I’m going to describe how I generally use it for my own projects and other small teams, which involves Managed Identity, but if this doesn’t work for you, you’ll need to investigate the best way of authenticating individuals against Key vault.

Back to the guide. If you created the Key vault yourself, then generally speaking you are automatically added to the access policy. But you can check by looking at the Key vault instance, and checking Access Policies under Settings and ensuring that your user has access.

Next, on your local machine, you need to login to Azure because the .NET code actually uses your Azure identity to gain access to Keyvault. To do that we need to run a couple of PowerShell commands.

First, run the command

az login

This will pop up a browser window asking you to login to Azure. Complete this, and your PowerShell window should update with the following :

You have logged in. Now let us find all the subscriptions to which you have access...
[.. All your subscriptions listed here..]

Now if you only have one subscription, you’re good to go. If you have multiple then you need to do something else :

az account set --subscription "YOUR SUBSCRIPTION NAME THAT HAS KEYVAULT"

The reason you need to do this is once logged into Azure, you only have access to one subscription at a time. If you have multiple subscriptions you need to set the subscription that contains your keyvault instance as your “current” one.

Finally onto the C# code.

Now obviously you’ll want to turn this into a helpful service with a re-useable method, but the actual C# code is simple. Here it is in one block :

var _keyVaultName = "https://YOURKEYVAULTNAME.vault.azure.net/";
var secretName = "YOURCERTIFICATENAME";
var azureServiceTokenProvider = new AzureServiceTokenProvider();
var _client = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(azureServiceTokenProvider.KeyVaultTokenCallback));
var secret = await _client.GetSecretAsync(_keyVaultName, secretName);
var privateKeyBytes = Convert.FromBase64String(secret.Value);
var certificate = new X509Certificate2(privateKeyBytes, string.Empty);

Again, people often leave comments like “Why don’t you load your keyvault name from App Settings?”. Well I do! But when I’m giving example code I want to break it down to the simplest possible example so that you don’t have to deconstruct it, and rebuild it to suit your own application.

With that out of the way, notice that when we call Key Vault, we don’t actually call “GetCertificate”. We just ask to get a secret. If that secret is a text secret, then it will come through as plain text. If it’s a certificate, then it will actually be a Base64 encoded string, which we can then turn into a certificate.

Also note that we aren’t providing any sort of “authentication” to this code, that’s because it uses our managed identity to talk to Key vault.
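If you’re not sure whether a given secret is plain text or a certificate, the returned secret bundle exposes a ContentType you can inspect. A sketch under the assumption that your certificate was stored by Key Vault itself (which uses “application/x-pkcs12” as the content type for pfx certificates):

```csharp
var secret = await _client.GetSecretAsync(_keyVaultName, secretName);

if (secret.ContentType == "application/x-pkcs12")
{
    // Certificate secrets come back as a Base64 encoded pfx blob.
    var certificate = new X509Certificate2(Convert.FromBase64String(secret.Value), string.Empty);
}
else
{
    // A plain text secret - just use the value directly.
    var value = secret.Value;
}
```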

And we are done! This is all the C# code you need. Now if you’re hosting on Azure App Service.. then that’s a different story.

Getting It Working On Azure App Service

Now I thought that deploying everything to an Azure App Service would be the easy part. But as it turns out, it’s a minefield of gotchas.

The first thing is that you need to turn on Managed Identity for the App Service. You can do this by going to your App Service, then Settings => Identity. Turn on System Assigned identity and save.

Now when you go back to your Key vault, go to Access Policies and search for the name of your App Service. Then you can add permissions for your App Service as if it was an actual user getting permissions.

So if you’re loading certificates, there are 3 main gotchas, and all 3 will generate this error :

Internal.Cryptography.CryptoThrowHelper+WindowsCryptographicException: The system cannot find the file specified.

App Service Plan Level

You must be on a Basic plan or above for the App Service. It cannot be a shared instance (Either F1 or D1). The reason behind this is that behind the scenes, on windows, there is some “User Profile” gunk that needs to be loaded for certificates of any type to be loaded. This apparently does not work on shared plans.

Extra Application Setting

You must add an Application setting on the App Service called “WEBSITE_LOAD_USER_PROFILE” and set this to 1. This is similar to the above and is about the gunk that windows needs, but is apparently not loaded by default in Azure.


Extra Certificate Flags In C#

In your C# code, the only thing that seemed to work for me was adding a couple of extra flags when loading your certificate from the byte array. So we change this :

var certificate = new X509Certificate2(privateKeyBytes, string.Empty);

To this :

var certificate = new X509Certificate2(privateKeyBytes, string.Empty, X509KeyStorageFlags.MachineKeySet | X509KeyStorageFlags.PersistKeySet | X509KeyStorageFlags.Exportable);

Finally, with all of these extras, you should be able to load a certificate from Key vault into your Azure App Service!

Further Troubleshooting

If you get the following error :

Microsoft.Azure.KeyVault.Models.KeyVaultErrorException: Access denied

In almost all cases, the managed identity you are running under (either locally or in Azure App Service) does not have access to the Key vault instance. If you’re getting this when trying to develop locally, generally I find it’s because you’ve selected the wrong subscription after using az login. If you’re running this in an App Service, I find it’s typically because you haven’t set up the managed identity between the App Service and Key vault.
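Both checks can be scripted with the Azure CLI. A sketch (the resource names are placeholders you’d substitute for your own):

```shell
# Confirm which identity and subscription az is currently using.
az account show --output table

# Grab the App Service's system assigned identity...
principalId=$(az webapp identity show \
    --name YOURAPPNAME \
    --resource-group YOURRESOURCEGROUP \
    --query principalId --output tsv)

# ...and grant it read access to secrets in the vault.
az keyvault set-policy \
    --name YOURKEYVAULTNAME \
    --object-id "$principalId" \
    --secret-permissions get list
```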


I recently wrote a piece of code that used a self signed certificate to then sign a web request. All went well until it came to unit testing this particular piece of code. I (incorrectly) assumed that I could just write something like so :

var certificate = new X509Certificate2();

And on the face of it, it works. But as soon as the certificate actually starts getting used, there are a lot of internals that suddenly break at runtime.

Luckily there’s actually a simple solution. We can generate our own self signed certificate, and simply “hard code” it in code to be used anywhere we need a certificate.

Generating The Self Signed Certificate Using Powershell

To get this working, we need to use Powershell. If you are on a non-windows machine, then you’ll need to work out how to generate a self signed cert (And get the Base64 encoded string) yourself, and then skip to step 2.

Back to powershell. In an *Administrator* powershell prompt, run the following :

New-SelfSignedCertificate -certstorelocation cert:\localmachine\my -dnsname MySelfSignedCertificate -NotAfter (Get-Date).AddYears(10)

Note that this generates a certificate and installs it into your local machine (Which you can later remove). I couldn’t find a way in powershell to generate the certificate and *immediately* export it without having to first install it into a store location.

The result of this action should print a thumbprint. Save the thumbprint into a notepad because we’ll need it shortly.

The next piece of powershell we need to run is the following :

$password = ConvertTo-SecureString -String "Password" -Force -AsPlainText
Export-PfxCertificate -cert cert:\localMachine\my\{YOURTHUMBPRINTHERE} -FilePath $env:USERPROFILE\Desktop\MySelfSignedCertificate.pfx -Password $password

It creates a secure password (change it if you like, but remember it!), and exports the certificate to your desktop as a pfx file. Almost there!

Finally, CD to your desktop and run :

$pfx_cert = get-content 'MySelfSignedCertificate.pfx' -Encoding Byte
$base64 = [System.Convert]::ToBase64String($pfx_cert)
Write-Host $base64

This will read the exported certificate file and print out a long Base64 string. Save this into notepad and move onto the next step.

Loading Your Base64 Encoded Certificate

Once you have the certificate as a Base64 string, then actually, the rest is easy. Here’s the C# code you need.

public static X509Certificate2 GetClientTestingCertificate()
{
    string certificateString = "PUTYOURCERTIFICATESTRINGINHERE";
    var privateKeyBytes = Convert.FromBase64String(certificateString);
    var pfxPassword = "Password";
    return new X509Certificate2(privateKeyBytes, pfxPassword);
}

And that’s literally it. You will now be using a real certificate that doesn’t need to be installed on every single machine for unit tests to run.

I do want to stress that in most cases, this is purely for unit tests. You should not do this if you actually intend to be signing requests or using a certificate in any other “real” way. Unit tests only!

Cleaning Up (Windows)

If you followed the above steps to generate a certificate, then you’ll also want to go and delete your testing certificate from your local store. To do so, simply open MMC, view your certificate stores for your local computer, and delete the certificate named “MySelfSignedCertificate”.


I was recently investigating a piece of non-performant code that essentially boiled down to a loop that pulled items off a “working list”, and when the work was done, removed them from the list. Even as I describe this code, you’re probably thinking.. Well, that sounds like a “Queue” type. And you would be right; it basically is a queue, but written using a List object.

Maybe this sounds like a real rookie mistake, but there is definitely a common theme among C# developers (myself included) of using List as basically the catch all for any sort of collection. Arrays, queues, stacks and pipelines are all pretty much non-existent in any sort of business logic code I write, and as we’re about to find out, maybe that’s to my own detriment.

Back to the code I was talking about, it looked something like this :

//Somewhere in the code we add a bunch of items to the working list. 
var workingList = new List<object>();
workingList.Add(new object());

//And then later on we have a loop that looks like so : 
while(workingList.Any())
{
    var item = workingList.First();
    workingList.Remove(item);
    //do some work
}

Immediately looking at this code I realized it’s basically a makeshift queue built on top of a list type, but with some clear performance issues. Just how much performance can be squeezed out of this we will certainly delve into, but let’s look at the problems of this code in a nutshell.

  1. We have a call to “Any()” to check if we have items. This is probably pretty fast, but we do this as an additional check rather than just trying to “pop” an item immediately.
  2. We remove the item from the front of the list. This actually means that every remaining item in the list shifts up 1 place, a much larger operation than I first thought.
  3. Our removing code requires us to compare object references within the list rather than remove it by index (Although this is the least of our problems).

Just touching on number 3, we could instead change the remove line to be :

workingList.RemoveAt(0);

But that’s not our actual issue.

The problem is we are trying to use a List type for a job that is clearly suited to a Queue. But how much faster would a Queue actually be? What if we only have a hundred items? Is there actually any noticeable difference?
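Switching is a near one line change. The working loop from the top of the post, rewritten with a Queue (a sketch):

```csharp
using System;
using System.Collections.Generic;

// Somewhere in the code we add a bunch of items to the working queue.
var workingQueue = new Queue<object>();
workingQueue.Enqueue(new object());

// TryDequeue checks for items and removes the head in a single operation -
// no Any(), no First(), no Remove() shifting the rest of the collection.
while (workingQueue.TryDequeue(out var item))
{
    // do some work
}
```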

Well, there’s an easy way to solve this using BenchmarkDotNet.

Here’s my benchmark that I set up :

[SimpleJob(RuntimeMoniker.NetCoreApp31)]
public class QueueVsListBenchmark
{
    [Params(100, 1000, 10000)]
    public int size;

    private Queue<object> queue = new Queue<object>();
    private List<object> list = new List<object>();

    [IterationSetup]
    public void Setup()
    {
        queue = new Queue<object>();
        list = new List<object>();
    }

    [Benchmark]
    public void ComputeQueue()
    { 
        for(int i=0; i < size; i++)
        {
            queue.Enqueue(new object());
        }

        while(queue.TryDequeue(out var item))
        {
        }
    }

    [Benchmark(Baseline = true)]
    public void ComputeList()
    {
        for (int i = 0; i < size; i++)
        {
            list.Add(new object());
        }

        while(list.Any())
        {
            var item = list.First();
            list.Remove(item);
        }
    }
}

Not too complicated. We simply insert a number of items, and then dequeue until we are done. We also use different sizes to compare how the size of the collection may affect the performance.

And the results?

|       Method |  size |          Mean | Ratio |
|------------- |------ |--------------:|------:|
| ComputeQueue |   100 |      4.080 us |  0.19 |
|  ComputeList |   100 |     21.917 us |  1.00 |
|              |       |               |       |
| ComputeQueue |  1000 |     28.066 us |  0.12 |
|  ComputeList |  1000 |    242.496 us |  1.00 |
|              |       |               |       | 
| ComputeQueue | 10000 |    205.809 us |  0.02 |
|  ComputeList | 10000 | 10,033.700 us |  1.00 |

So swapping to a queue, even at a relatively small size of 100 items, is 5x faster. And the larger the queue, the more performance we can squeeze out of not using a list. That makes sense: removing the first item of a List shifts every remaining element, an O(n) operation that makes the whole loop O(n²), while a Queue dequeues in O(1).

Now is this over thinking everything? Maybe. But it’s one I’m very comfortable with. Not only is performance that much better using a queue, but it’s actually the right data type anyway. We aren’t doing some “hack” to squeeze juice out of it. We are actually using the correct data type for the correct use. Anyone looking at our code and seeing a “Queue” type is going to know exactly what it’s doing.


Some time back I wrote a post specifically on writing PDFs in C#/.NET Core. At the time, I had very specific requirements for turning an HTML file into a PDF and was really looking for the cheapest library that fit the bill, regardless of how much effort I had to put in to get it working. I think I probably underestimated how bad support for generating PDFs actually was in C#. It turned into a maze of open source libraries that “kinda” worked, complete with GitHub issue trackers that hadn’t been answered in years, all the way to paid libraries with seemingly black hole support.

Well, a continual thing that came out of that post was people asking me "But.. doesn't Adobe do this for you?". At the time, as I understood it, Adobe did not have a programming interface for PDFs at all; they only dealt in desktop software. But as it turns out, that's because they partnered with a company called Datalogics, who are their official vendor for a C# PDF SDK.

Better yet, Datalogics just announced support for .NET Core and actual support for non-Windows environments, e.g. Linux. I put that in bold because people left comments on my previous post saying that even though a company claimed to support Linux, they were really just guessing and didn't actually provide any support for it. So with my past post covering a few open source options and third party companies selling their own SDKs, it makes sense to try the real deal.

Getting Started

You can head over to Datalogics and grab a free trial here : https://www.datalogics.com/products/pdf-sdks/adobe-pdf-library/. It looks scary filling in the contact form, but don't worry, you get an automated email back with all the details on how to grab the library immediately, without someone calling you 15 seconds after you hit submit.

As part of the download, you are given a package with over 70 sample projects that demonstrate how the library works! It's actually somewhat refreshing to be given that many sample apps that work right out of the box without having to muck about working out how all the pieces fit together. Not only are all the sample applications there, but all the sample PDFs showcasing individual features (for example, a PDF with images to show image extraction) are all there, ready for you to just run and step through the code.

Better yet, chances are one of these sample apps does exactly what you're looking for in a PDF library. Just an example of the sort of samples you get :

  • A WinForms application showing opening PDFs, viewing them, and editing them inside a WinForms control.
  • A simple console application that showcases true text redaction (not just dark rectangles drawn over the top of a PDF)
  • An application to extract text from PDFs that is among the best I've seen
  • An OCR application that reads text from images and adds it to a PDF file (to be honest, forget the PDF writing, this was easily the most impressive sample here!)

If you are interested, you can even view the sample code/applications here : https://dev.datalogics.com/adobe-pdf-library/sample-program-descriptions/net-sample-programs/working-with-actions-in-pdf-files/ to see if it does what you need before even jumping in.

While Datalogics do have a library for turning an HTML file into a PDF, I want to look at some of the more complex sample applications and see just how good they are.

PDF Visual Editor

Yes, OK, this is .NET Framework (for now), as most WinForms apps are, but I do want to touch on it because I found it pretty impressive to use, and it was probably the one feature I hadn't seen before in a PDF library for C#. I think most SDKs focus on manipulating PDFs from code, and any samples given are all in the form of console applications. So it was a bit of a change of pace to fire up an actual GUI application that showcases PDF editing functionality.

(Ignore the actual PDF in the screenshot, my partner has been getting a bit too much into cross stitch lately!)

This is the sample application that comes with the library, built in .NET, with the ability to both read PDFs and even edit them on the fly. See what I mean when I say the samples are pretty good? This is a fully fledged application, so even if you need just one of these features, be it viewing a PDF in a WinForms application, editing a PDF, printing, whatever it may be, the code is right there in front of you to debug and actually use.

Redacting Text Inside A PDF

A really impressive example app in the sample suite is a small console application that redacts text. I actually went back and checked other PDF libraries for C#, both paid and free, and I couldn’t find any tools for redacting text inside a PDF. More importantly, redacting text is *not* simply drawing a rectangle over a word and calling it a day.

Quite famously, in the Mueller investigation's case against Paul Manafort, a "redacted" legal response was released to the public in which you could literally copy and paste the "black box" into a notepad document and read all of the "redacted" text. (Further reading on that fail here : https://www.vice.com/en_us/article/8xpye3/paul-manafort-russia-case-redaction-fail).

The sample application uses a simple weather PDF and redacts all instances of the word "cloudy". Impressively, it actually outputs two sample documents: the first finds all instances of the word and draws a box where it found them, and the second is the completely redacted version.

Before

After Identifying

After Redaction

And again, this is not simply drawing a rectangle over it, MS Paint style; this is truly identifying the text and redacting it completely.

Extracting Text From A PDF

I guess this is basically a staple of all PDF SDKs, but I was really impressed with the quality from the Datalogics SDK. Again, there's a great sample application to show you what it can do.

Take this PDF section for example; it's on page 5 of a sample PDF.

And the text output :

Now I kinda get that all this is doing is reading text from a PDF, and that doesn't sound that impressive. But I found that other libraries (especially free ones) basically read the text as one big long string and gave you that. The entire structure of the page was lost, especially the line breaks.

On a previous project of mine, I was tasked with building an application that, given the page and line number of a PDF, extracted the text. Can you imagine the headaches when the entire structure of the PDF is lost and I have to come up with all sorts of crazy ways of reverse engineering the line numbers? I love a library that does what it says on the tin.

Image PDF OCR

Is it weird that a PDF library impresses me the most by showcasing its optical character recognition (OCR) powers? This thing actually blew my mind.

In the sample, they give you a PDF that is one big image, including tables :

Again, this is an image inside a PDF. You cannot highlight this text or copy it out. Nor can any old PDF reader actually "read" the text, because the text isn't actually there; it's an image. The sample application then reads all the text inside the PDF, re-does the entire PDF, and resaves it, table included, but with the text now OCR'd and inserted as actual text rather than an image.

So you can now copy out the text just like you could if it had been text inside the PDF in the first place. There's just something crazy about the fact that it can not only OCR the text, but also recognize that the text sits inside a table and basically redraw everything pixel perfect, now with the text perfectly selectable.

In the screenshot above, the text does look sort of disjointed, but I can assure you that when you copy it out, it's a complete sentence in order :

Required)The type of annotation that this dictionary describes;
must be Redact for a redaction ann

Part of the reason this is so impressive to me is because I’ve actually been part of teams that have attempted to OCR billing invoices. Where there are tables of charges as an image, inside a PDF, and we are trying to read the line items from invoices to input them into a database. It took months of work and in some cases, we just said “can’t be done”. But I would love to give it another try with this library.

Who Is This Library For?

So here's the thing. It's not free. Let's just get that out of the way. But this is easily one of the most comprehensive, if not *the* most comprehensive, PDF libraries I've used. For extremely simple applications that turn a fairly plain HTML file into a PDF, maybe it's not for you. But there were features in this library that I hadn't seen anywhere else, with actual working examples for you to just copy and paste into your own application.

The thing that most surprised me about my previous post on PDF support in C#/.NET Core was just the sheer number of libraries that didn't even do what they said on the tin. It was this sort of mish-mash of partly working features and zero support. If you're working on a business-critical piece of functionality that involves PDFs (e.g. generating thousands of invoices that cannot be delayed while you wait for a response on your GitHub issue), then the Datalogics C# SDK might just be right for you.


This is a sponsored post; however, all opinions are mine and mine alone.

This little "warning" has been the bane of my life recently…

warning CS1998: This async method lacks 'await' operators and will run synchronously. Consider using the 'await' operator to await non-blocking API calls, or 'await Task.Run(...)' to do CPU-bound work on a background thread.

And after a bit of searching around, I just wanted to touch on it because I think it’s a fairly common warning in C# code, and one that can either point to imminent danger, or be absolutely meaningless depending on your code. And it seems like there are all sorts of hokey “fixes” that don’t address why this happens in the first place.

Why/Where This Happens

The most common place I've seen code like this is where I've had to implement an interface whose method signature expects async (it returns a Task), but my particular implementation doesn't actually need anything async.

As an example, I had an interface for “validating” accounts. Something like so :

public interface IAccountValidator
{
    Task Validate(Account account);
}

But one of my implementations for this code was simply checking if the username was “admin” or not.

public class AccountValidatorRestrictedAdmin : IAccountValidator
{
    public async Task Validate(Account account)
    {
        if (account.Username == "admin")
            throw new Exception("Unable to use username admin");
    }
}

Now this particular piece of code is not async and does not contain async code. But other validators *are* async. So I run into a warning for this method. Oof. There isn’t really a great way to avoid this sort of implementation because

  1. Making the code non-async upfront is likely to come back and bite me later if I actually do need it to be async. Since we try and preach "async all the way down", breaking that later will be a pain.
  2. I don’t want to have two different methods/interfaces, one async and the other sync to do the same thing.

Now actually, this warning can be "ignored" (well… kinda) in most cases…

The Imminent Danger Code

Before we get into the talk of where this warning isn’t a problem, let’s talk about the warning where your code is going to blow up.

Suppose I have the following method :

public async Task Validate(Account account)
{
    CheckSomethingAsync();
}

Where CheckSomethingAsync() truly is async. Then I'm going to get the same warning again, but I'm also going to get another.

Warning CS4014 Because this call is not awaited, execution of the current method continues before the call is completed. Consider applying the ‘await’ operator to the result of the call.

These often come in twos, because one is saying that your method is async but doesn't await anything, and the second tells you exactly why: you're actually *calling* an async method, but not awaiting it. This is, generally speaking, a warning that *cannot* be ignored.

I’ll repeat, if you get warning CS4014 saying that an actual call is not awaited, in almost all cases your code requires an await somewhere and you should fix it!
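And the fix is exactly what the warning says. Here's a small self-contained sketch of mine (using a string username in place of the Account type, with CheckSomethingAsync as my own stand-in for genuinely async work) showing the corrected version:

```csharp
using System;
using System.Threading.Tasks;

class AccountValidator
{
    public async Task Validate(string username)
    {
        // Awaiting the call fixes both warnings: execution now genuinely
        // pauses until CheckSomethingAsync finishes, and any exception it
        // throws surfaces here instead of being silently lost.
        await CheckSomethingAsync(username);
    }

    // My own stand-in for a method that truly is async.
    private async Task CheckSomethingAsync(string username)
    {
        await Task.Delay(10);
        if (username == "admin")
            throw new Exception("Unable to use username admin");
    }
}

class Program
{
    static async Task Main()
    {
        var validator = new AccountValidator();
        await validator.Validate("bob"); // completes normally

        try
        {
            await validator.Validate("admin");
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message); // the exception is observed, not lost
        }
    }
}
```

Without the await, that "admin" exception would have disappeared into an unobserved task; with it, the caller sees the failure where it expects to.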

The Overheads of Async/Await

Now if you’ve made it this far, and your code isn’t in the danger zone, we can talk about why exactly you get the warning on an async method running synchronously. After all, if a method is marked as async, and it runs sync, is there really a problem?

I really don't profess to be an expert in all things async. In fact, all I really know on the subject comes from a Pluralsight course by Jon Skeet dating back to ~2016 (which, by the way, has since been taken down, a great shame because it was incredible!). But from what I can tell, there is no actual issue with a method that is marked as async but does not call async methods.

The only thing that I could find is that when you have a method like so :

public async Task Validate(Account account)
{
    if (account.Username == "admin")
        throw new Exception("Unable to use username admin");
}

There is an overhead in creating the state machine for an asynchronous method that won't ever be used. I really couldn't find much by way of measuring this overhead, but I think it's relatively safe to say that it's minimal, since it's the same overhead every genuinely async method already pays.

So I feel nervous coming to this conclusion, but generally speaking, other than the annoying warning, I can't find anything dangerous about this code, and it seems relatively safe to ignore (as long as you don't also have the second warning described above!).

Using Task.FromResult/Task.CompletedTask

Many guides on this warning point to the usage of Task.FromResult to hide it. For example :

public Task Validate(Account account)
{
    if (account.Username == "admin")
        throw new Exception("Unable to use username admin");

    return Task.CompletedTask;
}

Instead of the method being async, it still returns a Task (to conform with an interface, for example), but it returns Task.CompletedTask. If you have to return a result, then returning Task.FromResult() is also a valid way to do this.

This works, but to me it makes the code look bizarre. Returning empty tasks, or values wrapped in tasks, implies that my code is more complicated than it actually is. Just my opinion of course, but it is a suitable way of getting rid of the warning.
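For the Task.FromResult case, here's a small sketch of mine (the IUsernameCheck interface and its names are hypothetical, just to illustrate the shape):

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical interface of mine: the contract demands Task<bool>,
// but this implementation has no actual async work to do.
interface IUsernameCheck
{
    Task<bool> IsValidUsername(string username);
}

class RestrictedAdminCheck : IUsernameCheck
{
    public Task<bool> IsValidUsername(string username)
    {
        // No async keyword, no state machine, no CS1998:
        // we hand back an already-completed Task<bool>.
        return Task.FromResult(username != "admin");
    }
}

class Program
{
    static async Task Main()
    {
        IUsernameCheck check = new RestrictedAdminCheck();
        Console.WriteLine(await check.IsValidUsername("admin")); // False
        Console.WriteLine(await check.IsValidUsername("bob"));   // True
    }
}
```

Callers still await it like any other async method; they never need to know the task was already complete.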

Suppressing Warnings

This is going to see me eaten alive on Twitter, I'm sure, but you can also suppress the warnings in your project if you are sure they are of no help to you.

To do so, edit your csproj file and add the following :

<Project Sdk="Microsoft.NET.Sdk.Web">
  <PropertyGroup>
    <NoWarn>1998</NoWarn>
  </PropertyGroup>
...
</Project>

Where 1998 is the original warning code. Again, *do not add 4014* as this, in almost all cases, is an example of code that’s about to blow up in your face.

Do Not Task.Yield()

Finally, the absolute worst thing to do is to add a “do nothing” Task.Yield().

public async Task Validate(Account account)
{
    if (account.Username == "admin")
        throw new Exception("Unable to use username admin");

    await Task.Yield();
}

This also gets rid of the warning, but is incredibly bad for performance. You see, when you await something, in some cases the runtime may not need to reschedule your code at all; if the awaited task is already complete, execution just continues along the same path synchronously. But when you await Task.Yield(), you force a genuine reschedule every single time. It's no longer an "option".

If that doesn't make too much sense, that's OK; it doesn't really make much sense when I say it out loud either. But just know that adding a Task.Yield() for the sole purpose of bypassing a warning in your code is up there among the worst things you can do.
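To make the difference concrete, here's a small sketch of mine reusing the validator example. An async method with a fully synchronous body hands back an already-completed Task, so awaiting it costs next to nothing; one containing Task.Yield() never gets to skip the reschedule.

```csharp
using System;
using System.Threading.Tasks;

class Program
{
    // Fully synchronous body: the caller gets back an already-completed
    // Task, so awaiting it never has to reschedule anything.
    static async Task ValidateSync(string username)
    {
        if (username == "admin")
            throw new Exception("Unable to use username admin");
    }

    // Task.Yield() forces the remainder of the method to be rescheduled,
    // so the caller receives an incomplete Task and must genuinely await it.
    static async Task ValidateWithYield(string username)
    {
        if (username == "admin")
            throw new Exception("Unable to use username admin");
        await Task.Yield();
    }

    static async Task Main()
    {
        var syncTask = ValidateSync("bob");
        Console.WriteLine(syncTask.IsCompleted); // True: no reschedule was ever needed

        await ValidateWithYield("bob"); // works, but paid for a forced reschedule
    }
}
```

(ValidateSync will itself trigger CS1998 when compiled, which is exactly the situation this whole post is about.)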

Since the early days of C# 9, I’ve been doing writeups on all of the new features including init-only properties, record types, top level programs and more. And ever since those writeups, I’ve had people say “Yeah, I get I can read that, but can you just give me the gist of how to use the feature?”. I thought that a single page writeup would be short enough but I do get that it’s sometimes easier to explain in a 1 minute video. And so that’s what I’ve done!

I've created a completely free C# 9 video course that walks you through all of the features at a high level, to get you up and running as soon as possible. There's no waffle. There's no me writing screeds of code on video while you wait for me to get to the point. They are all one-take wonders of me doing my best explanation of every C# 9 feature I can think of. And best of all, you can watch the entire series on your lunch break and come back in the afternoon ready to put C# 9 into action.

Interested? Grab the videos here for free!





If you are more of a text person, again, all of the information is available in the posts below :

So anyone who uses Entity Framework/EF Core knows that you can run migrations from the PowerShell/Package Manager Console like so :

Update-Database

And you probably also know that with EF Core, you can also use the dotnet ef command like so :

dotnet ef database update

But in rare cases, you may need to run migrations on demand from your C#/.NET Core code, possibly by calling an API endpoint or from an admin screen. This was generally more of an issue in older versions of Entity Framework, which had real problems when the "version" of the database didn't match the "version" the EF code thought it should be. In fact, put simply, it would bomb out.

In EF Core, this is less of an issue as your code won’t die until it actually tries to do something that it can’t complete (e.g. Select a column that doesn’t exist yet). But there are still some cases where you want to deploy code, test it works in a staging environment against the live database, *then* run database migrations on demand.

Migrating EF Core Database From C#

It’s actually very simple.

// Requires: using Microsoft.EntityFrameworkCore.Infrastructure; (for GetService)
// and using Microsoft.EntityFrameworkCore.Migrations; (for IMigrator)
var migrator = _context.Database.GetService<IMigrator>();
await migrator.MigrateAsync();

Where _context is simply your database context. That’s it! Crazy crazy simple!

Checking Pending Migrations

It can also be extremely handy to check which migrations need to be run before attempting to run them. It can likewise be useful to know what state the database is in from an admin panel or similar, just to diagnose production issues. For example, if you have a manual process for updating the production database, it can be useful to see whether it's actually up to date.

var pendingMigrations = await _context.Database.GetPendingMigrationsAsync();

Really simple stuff.

Migrating EF Core On App Startup

In some cases, you really don't care when migrations are run, you just want them to migrate the database when the app starts. This is good for projects where the timing of the database migration really doesn't matter, or where the rollout window is incredibly small. For example, a single-machine, low-use web app probably doesn't need all the bells and whistles of a separate database rollout; it just needs to be on the latest version at any given time.

For that, .NET Core has this new paradigm of a “StartupFilter”. The code looks like so :

public class MigrationStartupFilter<TContext> : IStartupFilter where TContext : DbContext
{
    public Action<IApplicationBuilder> Configure(Action<IApplicationBuilder> next)
    {
        return app =>
        {
            using (var scope = app.ApplicationServices.CreateScope())
            {
                foreach (var context in scope.ServiceProvider.GetServices<TContext>())
                {
                    context.Database.SetCommandTimeout(160);
                    context.Database.Migrate();
                }
            }
            next(app);
        };
    }
}

Startup Filters in .NET Core are basically like Filters in MVC. They intercept the startup process and do “something” before the application starts, and only on startup. I actually haven’t made much use of them in the past but recently I’ve found them to be incredibly handy. If you ever made use of the global.asax startup methods in Full Framework .NET, then this is pretty similar.

We can then add this filter to our startup pipeline by editing our startup.cs file like so :

services.AddTransient<IStartupFilter, MigrationStartupFilter<Context>>();

Where Context is our database context. Again, super simple stuff but something that you’ll probably end up using in every new project from now on!
