Getting a mime type based on a file name (Or file extension) is one of those weird things you never really think about until you really, really need it. I recently ran into the issue when trying to return a file from an API (Probably not the best practice, but I had to make it happen), and I wanted to specify the mime type to the caller. I was amazed at how things “used” to be done both in the .NET Framework, and in people’s answers on Stack Overflow.

How We Used To Work Out The Mime Type Based On a File Name (aka The Old Way)

If you were using the .NET Framework, you had two ways to get going. Now I know this is a .NET Core blog, but I still found it interesting to see how we got to where we are now.

The first way is that you build up a huge dictionary yourself of mappings between file extensions and mime types. This actually isn’t a bad way of doing things if you only expect a few different types of files to need mapping.

The second was that in the System.Web namespace of the .NET Framework there is a static MimeMapping class. We can actually see the source code for this mapping here : https://referencesource.microsoft.com/#system.web/MimeMapping.cs. If you were expecting some sort of mime mapping magic to be happening, well, just take a look at that source.

400+ lines of manual mappings that were copied and pasted from the default IIS7 list. So, not that great.

But the main issue with all of this is that it’s too hard (close to impossible) to add and remove custom mappings. So if your file extension isn’t in the list, you are out of luck.

The .NET Core Way

.NET Core obviously has its own way of doing things that may seem a bit more complicated but does work well.

First, we need to install the following nuget package :
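The class we’re after ships in the static files package, so from the package manager console it’s something along these lines :

Install-Package Microsoft.AspNetCore.StaticFiles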

Annoyingly the class we want to use lives inside this static files nuget package. I would say if that becomes an issue for you, to look at the source code and make it work for you in whatever way you need. But for now, let’s use the package.

Now we have access to a nifty little class called FileExtensionContentTypeProvider . Here’s some example code using it. I’ve created a simple API action that takes a filename, and returns the mime type :
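Roughly like the following (the route and the octet-stream fallback are just how I’d sketch it) :

[HttpGet("mimetype")]
public IActionResult GetMimeType(string fileName)
{
    var provider = new FileExtensionContentTypeProvider();

    // TryGetContentType returns false when it has no mapping for the extension.
    if (!provider.TryGetContentType(fileName, out var contentType))
    {
        contentType = "application/octet-stream";
    }

    return Ok(contentType);
}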

Nothing too crazy and it works! We also catch the case where it doesn’t manage to map the extension, and just map it ourselves to a default content type. This is one thing the .NET Framework MimeMapping class did do for you : if it couldn’t find the correct mapping, it returned application/octet-stream. But I can see how handling it explicitly is far more definitive as to what’s going on.

But here’s the thing, if we look at the source code of this here, we can see we are no better off in terms of doing things by “magic”, it’s still one big dictionary under the hood. And the really interesting part? We can actually add our own mappings! Let’s modify our code a bit :
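Something like this, where the .dnct mime type value is entirely made up :

[HttpGet("mimetype")]
public IActionResult GetMimeType(string fileName)
{
    var provider = new FileExtensionContentTypeProvider();

    // Add our own custom mapping on top of the built in dictionary.
    provider.Mappings.Add(".dnct", "application/dotnetcoretutorials");

    if (!provider.TryGetContentType(fileName, out var contentType))
    {
        contentType = "application/octet-stream";
    }

    return Ok(contentType);
}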

I’ve gone mad with power and created a new file extension called .dnct and mapped it to its own mime type. Everything is a cinch!

But one last problem. What if we want to use this in multiple places? What if we need better control for unit testing than instantiating the provider every time will really give us? Let’s create a nice mime type mapping service!

We could create this static, but then we lose a little flexibility around unit testing. So I’m going to create an interface too. Our service looks like so :
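A rough sketch of the service (the interface and method names are just how I’d lay it out) :

public interface IMimeMappingService
{
    string Map(string fileName);
}

public class MimeMappingService : IMimeMappingService
{
    private readonly FileExtensionContentTypeProvider _contentTypeProvider;

    public MimeMappingService(FileExtensionContentTypeProvider contentTypeProvider)
    {
        _contentTypeProvider = contentTypeProvider;
    }

    public string Map(string fileName)
    {
        if (!_contentTypeProvider.TryGetContentType(fileName, out var contentType))
        {
            contentType = "application/octet-stream";
        }

        return contentType;
    }
}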

So we provide a single method called “Map”. And when creating our MimeMappingService, we take in a content type provider.

Now we need to head to our startup.cs and in our ConfigureServices method we need to wire up the service. That looks a bit like this :
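Something along these lines :

public void ConfigureServices(IServiceCollection services)
{
    var provider = new FileExtensionContentTypeProvider();
    provider.Mappings.Add(".dnct", "application/dotnetcoretutorials");

    services.AddSingleton<IMimeMappingService>(new MimeMappingService(provider));

    services.AddMvc();
}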

So we instantiate our FileExtensionContentTypeProvider, give it our extra mappings, then bind our MimeMappingService all up so it can be injected.

In our controller we change our code to look a bit like this :
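Roughly :

public class MimeTypeController : Controller
{
    private readonly IMimeMappingService _mimeMappingService;

    public MimeTypeController(IMimeMappingService mimeMappingService)
    {
        _mimeMappingService = mimeMappingService;
    }

    [HttpGet("mimetype")]
    public IActionResult GetMimeType(string fileName)
    {
        return Ok(_mimeMappingService.Map(fileName));
    }
}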

Nice and clean. And it means that any time we inject our MimeMappingService around, it has all our custom mappings contained within it!

.NET Core Static Files

There is one extra little piece of info I should really give out too. And that is if you are using the .NET Core static files middleware to serve raw files, you can also use this provider to return the correct mime type. So for example you can do things like this :
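For example, something like this in the Configure method :

public void Configure(IApplicationBuilder app)
{
    var provider = new FileExtensionContentTypeProvider();
    provider.Mappings.Add(".dnct", "application/dotnetcoretutorials");

    app.UseStaticFiles(new StaticFileOptions
    {
        ContentTypeProvider = provider
    });

    app.UseMvc();
}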

So now, even outside of C# code, when we are just serving the raw file of type .dnct, we will still return the correct mime type.

It’s almost a rite of passage for a junior developer to kludge together their own CSV parser using a simple string.Split(‘,’), and then subsequently work out that there is a bit more to this whole CSV thing than just separating out values by a comma. I recently took a look at my options for parsing CSVs in .NET Core, here’s what I found!

CSV Gotchas

There are a couple of CSV gotchas that have to be brought up before we dive deeper. Hopefully they explain why rolling your own is sometimes more pain than it’s worth.

  • A CSV may or may not have a header row. If there is a header row, then the order of the columns is not important since you can detect what is actually in each column. If there is no header row, then you rely on the order of the columns being the same. Any CSV parser should be able to both read columns based on a “header” value, and by index.
  • Any field may be contained in quotes. However fields that contain a line-break, comma, or quotation marks must be contained in quotes.
  • To re-emphasize the above, line breaks within a field are allowed within a CSV as long as they are wrapped in quotation marks, this is what trips most people up who are simply reading line by line like it’s a regular text file.
  • Quote marks within a field are escaped by doubling them up (As opposed to say an escape character like a backslash).
  • Each row should have the same number of columns, however in the RFC this is labelled as a “should” and not a “must”.
  • While yes, the C in CSV stands for comma, ideally a CSV parser can also handle things like TSV (That is using tabs instead of commas).

And obviously this is just for parsing CSV files into primitives, but in something like .NET we will also be needing :

  • Deserializing into a list of objects
  • Handling of Enum values
  • Custom mappings (So the header value may or may not match the name of the class property in C#)
  • Mapping of nested objects

Setup For Testing

For testing out these libraries, I wanted a typical scenario for importing. So this included the use of different primitive types (strings, decimals etc), the usage of a line break within a string (Valid as long as it’s contained within quotes), the use of quotes within a field, and the use of a “nested” object in C#.

So my C# model ended up looking like :
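Along these lines (the class and property names here are just my stand-ins for the real thing) :

public class Automobile
{
    public string Make { get; set; }
    public string Model { get; set; }
    public AutomobileType Type { get; set; }
    public decimal Price { get; set; }
    public AutomobileComment Comment { get; set; }
}

public enum AutomobileType
{
    Car,
    Truck
}

public class AutomobileComment
{
    public string Text { get; set; }
}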

And our CSV file ended up looking like :
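Something in the vein of the following (illustrative only, matching the points below) :

Make,Model,Type,Price,Comment
"Toyota",Corolla,Car,20000,"This car is ""fairly"" priced.
It also has a line break inside the comment."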

So a couple of things to point out :

  • Our “Make” is enclosed in quotes but our model is not. Both are valid.
  • The “Type” is actually an enum and should be deserialized as such
  • The “Comment” field is a bit of a pain. It contains quotes, it has a line break, and in our C# code it’s actually a nested object.

All of this in my opinion is a pretty common setup for a CSV file import, so let’s see how we go.

CSV Libraries

So we’ve read all of the above and we’ve decided that rolling our own library is just too damn hard. Someone’s probably already solved these issues for us right? As it happens, yes they have. Let’s take a look at what’s out there in terms of libraries. Realistically, I think there are only two that really matter, CSVHelper and TinyCSVParser, so let’s narrow our focus down to them.

CSVHelper

Website : https://joshclose.github.io/CsvHelper/
CSVHelper is the granddaddy of working with CSV in C#. I figured I would have to do a little bit of manual mapping for the comment field, but hopefully the rest would all work out of the box. Well… I didn’t even have to do any mapping at all. It managed to work everything out itself with nothing but the following code :
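Something like this (note that newer versions of CsvHelper also want a CultureInfo passed to the CsvReader constructor, e.g. CultureInfo.InvariantCulture) :

using (var reader = new StreamReader("import.csv"))
using (var csv = new CsvReader(reader))
{
    var records = csv.GetRecords<Automobile>().ToList();
}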

Really really impressive. And it handled double quotes, new lines, and enum parsing all on its own.

The thing that wow’d me the most about this library is how extensible it is. It can handle completely custom mappings, custom type conversion (So you could split a field into say a dictionary or a child list), and it even has really impressive error handling options.

The fact that in 3 lines of code, I’m basically done with the CSV parsing really points out how good this thing is. I could go on and on about its features, but we do have one more parser to test!

Tiny CSV Parser

Website : http://bytefish.github.io/TinyCsvParser/index.html
Next up was Tiny CSV Parser. Someone had pointed me to this one because its speed was supposed to be pretty impressive. We’ll take a look at that in a bit, but for now we just wanted to see how it handled our file.

That’s where things started to fall down. It seems that Tiny CSV doesn’t have “auto” mappings, and instead you have to create a class to map them yourself. Mine ended up looking like this :
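Roughly like this (I’m going from memory on the exact converter interface, so treat it as a sketch) :

public class CsvAutomobileMapping : CsvMapping<Automobile>
{
    public CsvAutomobileMapping()
    {
        MapProperty(0, x => x.Make);
        MapProperty(1, x => x.Model);
        MapProperty(2, x => x.Type, new EnumConverter<AutomobileType>());
        MapProperty(3, x => x.Price);
        MapProperty(4, x => x.Comment, new AutomobileCommentConverter());
    }
}

// The custom type converter for the nested comment object.
public class AutomobileCommentConverter : ITypeConverter<AutomobileComment>
{
    public Type TargetType => typeof(AutomobileComment);

    public bool TryConvert(string value, out AutomobileComment result)
    {
        result = new AutomobileComment { Text = value };
        return true;
    }
}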

So, right from the outset there is a bit of an overhead here. I also don’t like how I have to map a specific row index instead of a row heading. You will also notice that I had to create my own type converter for the nested comment object. I feel like this was expected, but the documentation doesn’t specify how to actually create this and I had to delve into the source code to work out what I needed to do.

And once we run it, oof! While it does handle almost all scenarios, it doesn’t handle line breaks in a field. Removing the line break did mean that we could parse everything else, including the enums, double quotes, and (eventually) the nested object.

The code to actually parse the file (Together with the mappings above) was :
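Something along the lines of :

var csvParserOptions = new CsvParserOptions(true, ',');
var csvMapper = new CsvAutomobileMapping();
var csvParser = new CsvParser<Automobile>(csvParserOptions, csvMapper);

var records = csvParser
    .ReadFromFile("import.csv", Encoding.UTF8)
    .Select(x => x.Result)
    .ToList();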

So not a hell of a lot to actually parse the CSV, but it does require a bit of setup. Overall, I didn’t think it was even in the same league as CSVHelper.

Benchmarking

The thing is, while I felt CSVHelper was miles more user friendly than Tiny CSV, the entire reason the latter was recommended to me was because it’s supposedly faster. So let’s put that to the test.

I’m using BenchmarkDotNet (Guide here) to do my benchmarking. And my code looks like the following :
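A rough version of the benchmark class (run via BenchmarkRunner.Run<CsvBenchmarking>()) :

[MemoryDiagnoser]
public class CsvBenchmarking
{
    [Benchmark(Baseline = true)]
    public List<Automobile> CsvHelperBenchmark()
    {
        using (var reader = new StreamReader("import.csv"))
        using (var csv = new CsvReader(reader))
        {
            // ToList() so the IEnumerable is actually enumerated.
            return csv.GetRecords<Automobile>().ToList();
        }
    }

    [Benchmark]
    public List<Automobile> TinyCsvParserBenchmark()
    {
        var csvParser = new CsvParser<Automobile>(new CsvParserOptions(true, ','), new CsvAutomobileMapping());

        return csvParser
            .ReadFromFile("import.csv", Encoding.UTF8)
            .Select(x => x.Result)
            .ToList();
    }
}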

A quick note on this : I tried to keep it fairly simple, but I also had to ensure that I used “ToList()” to make sure that I was actually getting the complete list back, even though this adds quite a bit of overhead. Without it, I get returned an IEnumerable that might not actually be enumerated at all.

First I tried a 100,000 line CSV file that used our “complicated” format above (Without line breaks in fields however as TinyCSVParser does not handle these).

TinyCSVParser was quite a bit faster (4x in fact). And when it came to memory usage, it was way down on CSVHelper.

Let’s see what happens when we up the size to a million rows :

It seems pretty linear here in terms of both memory allocated, and the time it takes to complete. What’s mostly interesting here is that the file itself is only 7.5MB big at a million rows, and yet we are allocating close to 2.5GB to handle it all, pretty crazy!

So, the performance claims of TinyCSV definitely hold up, it’s much faster and uses far less memory, but to my mind CSVHelper is still the much more user friendly library if you are processing smaller files.

I was recently looking into the new Channel<T>  API in .NET Core (For an upcoming post), but while writing it up, I wanted to do a quick refresher of all the existing “queues” in .NET Core. These queues are also available in full framework (And possibly other platforms), but all examples are written in .NET Core so your mileage may vary if you are trying to run them on a different platform.

FIFO vs LIFO

Before we jump into the .NET specifics, we should talk about the concept of FIFO or LIFO, or “First In, First Out” and “Last In, First Out”. For the concept of queues, we typically think of FIFO. So the first message put into the queue, is the first one that comes out. Essentially processing messages as they go into a queue. The concept of LIFO, is typically rare when it comes to queues, but in .NET there is a type called Stack<T>  that works with LIFO. That is, after filling the stack with messages/objects, the last one put in would then be the first one out. Essentially the order would be reversed.

Queue<T>

Queue<T>  is going to be our barebones simple queue in .NET Core. It takes messages, and then pops them out in order. Here’s a quick code example :
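Something like :

Queue<string> queue = new Queue<string>();
queue.Enqueue("Hello");
queue.Enqueue("World!");

Console.WriteLine(queue.Dequeue());
Console.WriteLine(queue.Dequeue());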

Pretty stock standard and not a lot of hidden meaning here. The Enqueue  method puts a message on our queue, and the Dequeue  method takes one off (In a FIFO manner). Our console app obviously prints out two lines, “Hello” then “World!”.

Barring multi threaded scenarios (Which we will talk about shortly), you’re not going to find too many reasons to use this barebones queue. In a single threaded app, you might pass around a queue to process a “list” of messages, but you may find that using a List<T>  within a loop is a simpler way of achieving the same result. In fact if you look at the source code of Queue, you will see it’s actually just an implementation of IEnumerable anyway!

So how about multi threaded scenarios? It kind of makes sense that you may want to load up a queue with items, and then have multiple threads all trying to process the messages. Well using a queue in this manner is actually not threadsafe, but .NET has a different type to handle multi threading…

ConcurrentQueue<T>

ConcurrentQueue<T>  is pretty similar to Queue<T> , but is made threadsafe by a copious amount of spinlocks. A common misconception is that ConcurrentQueues are just a wrapper around a queue with the use of the lock  keyword. A quick look at the source code here shows that’s definitely not the case. Why do I feel the need to point this out? Because I often see people try and make their use of Queue<T>  threadsafe by using locks, thinking that they are doing what Microsoft does when using ConcurrentQueue, but that’s pretty far from the truth and actually takes a pretty big performance hit when doing so.

Here’s a code sample of a ConcurrentQueue :
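Roughly :

ConcurrentQueue<string> queue = new ConcurrentQueue<string>();
queue.Enqueue("Hello");
queue.Enqueue("World!");

// TryDequeue returns true if there was a message to pop, false otherwise.
while (queue.TryDequeue(out var message))
{
    Console.WriteLine(message);
}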

So you’ll notice we can no longer just dequeue a message, we need to TryDequeue. It will return true if we managed to pop a message, and false if there is no message to pop.

Again, the main point of using a ConcurrentQueue over a regular Queue is that it’s threadsafe to have multiple consumers (Or producers/enqueuers) all using it at the same time.

BlockingCollection<T>

A blocking collection is an interesting “wrapper” type that can go over the top of any IProducerConsumerCollection<T>  type (ConcurrentQueue<T>  and ConcurrentStack<T>  both implement this – the plain Queue<T>  does not). This can be handy if you have your own implementation of a queue, but for most cases you can roll with the default constructor of BlockingCollection. When doing this, it uses a ConcurrentQueue<T> under the hood making everything threadsafe (See source code here). The main reason to use a BlockingCollection is that it has a limit to how many items can sit in the queue/collection. Obviously this is beneficial if your producer is much faster than your consumers.

Let’s take a quick look :
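Something along these lines :

BlockingCollection<string> messages = new BlockingCollection<string>(2);

messages.Add("Hello");
Console.WriteLine("Adding Hello");

messages.Add("World!");
Console.WriteLine("Adding World!");

messages.Add("Again!");
Console.WriteLine("Adding Again!");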

What will happen with this code? You will see “Adding Hello”, “Adding World!”, and then nothing… Your application will just hang. The reason is this line :
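BlockingCollection<string> messages = new BlockingCollection<string>(2);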

We’ve initialized the collection to be a max size of 2. If we try and add an item where the collection is already at this size, we will just wait until a message is dequeued. How long will we wait? Well by default, forever. However we can change our add line to be :
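Something like the following (the actual timespan is arbitrary) :

if (!messages.TryAdd("Again!", TimeSpan.FromSeconds(10)))
{
    Console.WriteLine("Could not add the message within 10 seconds!");
}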

So we’ve changed our Add call to TryAdd, and we’ve specified a timespan to wait. If this timespan is hit, then the TryAdd method will return false to let us know we weren’t able to add the item to the collection. This is handy if you need to alert someone that your queue is overloaded (e.g. the consumers are stalled for whatever reason).

Stack<T>

As we talked about earlier, a Stack<T> type allows for a Last In, First Out (LIFO) queuing style. Consider the following code :
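Roughly :

Stack<string> stack = new Stack<string>();
stack.Push("Hello");
stack.Push("World!");

Console.WriteLine(stack.Pop());
Console.WriteLine(stack.Pop());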

The output would be “World!” then “Hello”. It’s rare that you would need this reversal of messages, but it does happen. Stack<T>  also has its companion in ConcurrentStack<T> , and you can initialize BlockingCollection with a ConcurrentStack within it.

Channel<T>

There is a brand new Channel<T> type released with .NET Core. Because it’s just so different from the existing queue types in .NET, I’ll have an upcoming post discussing a tonne more about how they work, and why you might use them. In the meantime the only documentation I can find is on Github from Stephen Toub here. Have a look, and see if it works for you until the next post!

While working on an API that was built specifically for mobile clients, I ran into an interesting problem that I couldn’t believe I hadn’t found before. When working on a REST API that deals exclusively in JSON payloads, how do you upload images? Which naturally leads onto the next question, should a JSON API then begin accepting multipart form data? Is that not going to look weird that for every endpoint, we accept JSON payloads, but then for this one we accept a multipart form? Because we will want to be uploading metadata with the image, we are going to have to read out formdata values too. That seems so 2000’s! So let’s see what we can do.

While some of the examples within this post are going to be using .NET Core, I think this really applies to any language that is being used to build a RESTful API. So even if you aren’t a C# guru, read on!

Initial Thoughts

My initial thoughts sort of boiled down to a couple of points.

  • The API I was working on had been hardcoded in a couple of areas to really be forcing the whole JSON payload thing. Adding in the ability to accept formdata/multipart forms would be a little bit of work (and regression testing).
  • We had custom JSON serializers for things like decimal rounding that would somehow manually need to be done for form data endpoints if required. We are even using snake_case as property names in the API (dang iOS developers!), which would have to be done differently in the form data post.
  • And finally, is there any way to just serialize what would have been sent under a multi-part form post, and include it in a JSON payload?

What Else Is Out There?

It really became clear that I didn’t know what I was doing. So like any good developer, I looked to copy. So I took a look at the public API’s of the three social media giants.

Twitter

API Doc : https://developer.twitter.com/en/docs/media/upload-media/api-reference/post-media-upload

Twitter has two different ways to upload files. The first is sort of a “chunked” way, which I assume is because you can upload some pretty large videos these days. And a more simple way for just uploading general images; let’s focus on the latter.

It’s a multi-part form, but returns JSON. Boo.

The very very interesting part about the API however, is that it allows uploading the actual data in two ways. Either you can upload the raw binary data as you typically would in a multipart form post, or you could actually serialise the file as a Base64 encoded string, and send that as a parameter.

Base64 encoding a file was interesting to me because theoretically (And as we will see later, definitely), we can send this string data any way we like. I would say that of all the C# SDKs I looked at, I couldn’t find any actually using this Base64 method, so there weren’t any great examples to go off.

Another interesting point about this API is that you are uploading “media”, and then at a later date attaching that to an actual object (For example a tweet). So if you wanted to tweet out an image, it seems like you would (correct me if I’m wrong) upload an image, get the ID returned, and then create a tweet object that references that media ID. For my use case, I certainly didn’t want to do a two step process like this.

LinkedIn

API Doc : https://developer.linkedin.com/docs/guide/v2/shares/rich-media-shares#upload

LinkedIn was interesting because it’s a pure JSON API. All data POSTs contain JSON payloads, similar to the API I was creating. Wouldn’t you guess it, they use a multipart form data too!

Similar to Twitter, they also have this concept of uploading the file first, and attaching it to where you actually want it to end up second. And I totally get that, it’s just not what I want to do.

Facebook

API Doc : https://developers.facebook.com/docs/graph-api/photo-uploads

Facebook uses a Graph API. So while I wanted to take a look at how they did things, so much of their API is not really relevant in a RESTful world. They do use multi-part forms to upload data, but it’s kinda hard to say how or why that is the case. Also at this point, I couldn’t get my mind off how Twitter did things!

So Where Does That Leave Us?

Well, in a weird way I think I got what I expected : that multipart forms were well and truly alive. It didn’t seem like there was any great innovation in this area. In some cases, the use of multipart forms didn’t look so brutal because they didn’t need to upload metadata at the same time. Therefore simply sending a file with no attached data didn’t look so out of place in a JSON API. However, I did want to send metadata in the same payload as the image, not have it as a two step process.

Twitter’s use of Base64 encoding intrigued me. It seemed like a pretty good option for sending data across the wire irrespective of how you were formatting the payload. You could send a Base64 string as JSON, XML or Form Data and it would all be handled the same. It’s definitely proof of concept time!

Base64 JSON API POC

What we want to do is just test that we can upload images as a Base64 string, and we don’t have any major issues within a super simple scenario. Note that these examples are in C# .NET Core, but again, if you are using any other language it should be fairly simple to translate these.

First, we need our upload JSON Model. In C# it would be :
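Something as simple as this (names are just my placeholders) :

public class UploadImageModel
{
    public string Description { get; set; }
    public string ImageData { get; set; }
}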

Not a whole lot to it. Just a description field that can be freetext for a user to describe the image they are uploading, and an imagedata field that will hold our Base64 string.

For our controller :
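A rough sketch, hardcoding the content type to jpeg for the echo :

[HttpPost("upload")]
public IActionResult UploadImage([FromBody] UploadImageModel model)
{
    // Convert the Base64 string into raw bytes.
    byte[] imageBytes = Convert.FromBase64String(model.ImageData);

    // Proof of concept only : echo the image straight back out.
    return File(imageBytes, "image/jpeg");
}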

Again, fairly damn simple. We take in the model, then C# has a great way to convert that string into a byte array, or to read it into a memory stream. Also note that as we are just building a proof of concept, I echo out the image data to make sure that it’s been received, read, and output like I expect it would, but not a whole lot else.

Now let’s open up postman, our JSON payload is going to look a bit like :
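Along the lines of :

{
    "description" : "A picture of my cat",
    "imageData" : "/9j/4AAQSkZJRgABAQEASABIAAD..."
}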

I’ve obviously truncated imagedata down here, but a super simple tool to turn an image into a Base64 is something like this website here. I would also note that when you send your payload, it should be without the data:image/jpeg;base64, prefix that you sometimes see with online tools that convert images to strings.

Hit send in Postman and :

Great! So my image upload worked and the picture of my cat was echo’d back to me! At this point I was actually kinda surprised that it could be that easy.

Something that became very evident while doing this though, was that the payload size was much larger than the original image. In my case, the image itself is 109KB, but the Base64 version was 149KB. So about 136% of the original image. In having a quick search around, it seems expected that a Base64 version of a file would be about 33% bigger than the original. When it comes to larger files, I worry less about sending 33% more across the wire, and more about reading the file into memory, converting it into a huge string, and then writing that out… It could cause a few issues. But for a few basic images, I’m comfortable with a 33% increase.

I will also note that there were a few code snippets around for using BSON or Protobuf to do the same thing, which may actually cut down the payload size substantially. The mentality would be the same, a JSON payload with a “stringify’d” file.

Cleaning Up Our Code Using JSON Converters

One thing that I didn’t like in our POC was that we are using a string that almost certainly will be converted to a byte array every single time. The great thing about using a JSON library such as JSON.net in C#, is that how the client sees the model and how our backend code sees the model doesn’t necessarily have to be the exact same. So let’s see if we can turn that string into a byte array on an API POST automagically.

First we need to create a “Custom JSON Converter” class. That code looks like :
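A sketch of what that converter can look like (the class name is mine) :

public class Base64FileJsonConverter : JsonConverter
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(string);
    }

    // We never write the image back out as Base64 (yet), so reading is all we care about.
    public override bool CanWrite => false;

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        return Convert.FromBase64String((string)reader.Value);
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        throw new NotImplementedException();
    }
}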

Fairly simple, all we are doing is taking a value and converting it from a string into a byte array. Also note that we are only worried about reading JSON payloads here, we don’t care about writing as we never write out our image as Base64 (yet).

Next, we head back to our model and apply the custom JSON Converter.
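So the model ends up looking something like :

public class UploadImageModel
{
    public string Description { get; set; }

    [JsonConverter(typeof(Base64FileJsonConverter))]
    public byte[] ImageData { get; set; }
}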

Note we also change the “type” of our ImageData field to a byte array rather than a string. So even though our postman test will still send a string, by the time it actually gets to us, it will be a byte array.

We will also need to modify our Controller code too :
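Roughly :

[HttpPost("upload")]
public IActionResult UploadImage([FromBody] UploadImageModel model)
{
    // ImageData is already a byte array by the time it reaches us.
    return File(model.ImageData, "image/jpeg");
}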

So it becomes even simpler. We no longer need to bother handling the Base64 encoded string anymore, the JSON converter will handle it for us.

And that’s it! Sending the exact same payload will still work and we have one less piece of plumbing to do if we decide to add more endpoints to accept file uploads. Now you are probably thinking “Yeah but if I add in a new endpoint with a model, I still need to remember to add the JsonConverter attribute”, which is true. But at the same time, it means if in the future you decide to swap to BSON instead of Base64, you aren’t going to have to go to a tonne of places and work out how you are handling the incoming strings, it’s all in one handy place.

For the past few years, every time I’ve started a new project there has been one surefire class that I will copy and paste in on the first day. That has been my “TestingContext”. It’s sort of this one class unit testing helper that I can’t do without. Today, I’m going to go into a bit of detail about what it is, and why I think it’s so damn awesome.

First off, let’s start about what I think beginners get wrong about unit testing, and what veterans begin to hate about unit testing.

The Problem

The number one mistake I see junior developers making when writing unit tests is that they go “outside the scope” of the class they are testing. That is, if they are testing one particular class, let’s say ServiceA, and that has a method that calls ServiceB, your test actually should never ever enter ServiceB (There are always exceptions, but they’re very very rare). You are testing the logic for ServiceA, so why should it go and actually run code and logic for ServiceB. Furthermore, if your tests are written for ServiceA, and ServiceB’s logic is changed, will that affect any of your tests? The answer should be no, but it often isn’t the case. So my first goal was :

Any testing helper should limit the testing scope to a single class only.

A common annoyance with unit tests is that when a constructor changes, all unit tests are probably going to go bust even if the new parameter has nothing to do with that particular unit test. I’ve seen people argue that if a constructor argument changes, then all unit tests should have to change, otherwise the service itself is obviously doing too many things. I really disagree with this. Unless you are writing pretty close to one method per class, there are always going to be times where a new service or repository is injected into a constructor that doesn’t really change anything about the existing code. If anything, sticking to SOLID principles, the class is open for extension but not modification. So the next goal was :

Changes to a constructor of a class should not require editing all tests for that class.

Next, when writing tests, you should try and limit the amount of boilerplate code in the test class itself. It’s pretty common to see a whole heap of Mock instantiations clogging up a test setup class. So much so it becomes hard to see exactly what is boilerplate and going through the motions, and what is important setup code that needs to be there. On top of that, as the class changes, I’ll often find old test classes where half of the private class level variables are assigned, but are actually never used in tests as the class was modified. So finally :

Boilerplate code within the test class should be kept to a minimum.

Building The Solution

I could explain the hell out of why I did what I did and the iterations I went through to get here, but let’s just see the code.

First, you’re gonna need to install the following Nuget package:
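The auto mocking flavour I use is the AutoMoq one, so the install is something like :

Install-Package AutoFixture.AutoMoq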

This actually does most of the work for us. It’s an auto mocking package that means you don’t have to create mocks for every single constructor variable regardless of whether it’s used or not. You can read more about the specific package here : https://github.com/AutoFixture/AutoFixture. On its own it gets us pretty close to solving our problem set, but it doesn’t get us all the way there. For that we need just a tiny bit of a wrapper.

And that wrapper code :
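A sketch of it (the method names GetMockFor, InjectClassFor and ClassUnderTest are the ones I’ll use in the tests below) :

public abstract class TestingContext<T> where T : class
{
    private IFixture _fixture;

    // Resets the fixture (and any mocks) so every test starts from a clean slate.
    protected void Setup()
    {
        _fixture = new Fixture().Customize(new AutoMoqCustomization());
    }

    // Returns the same mock every time it's asked for a given interface,
    // and that mock is what ends up injected into the class under test.
    protected Mock<TMockType> GetMockFor<TMockType>() where TMockType : class
    {
        return _fixture.Freeze<Mock<TMockType>>();
    }

    // Allows a concrete class (or a fake) to be injected instead of a mock.
    protected void InjectClassFor<TClassType>(TClassType instance) where TClassType : class
    {
        _fixture.Inject(instance);
    }

    // Builds the class under test using whatever mocks/instances have been set up.
    protected T ClassUnderTest => _fixture.Create<T>();
}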

Essentially it’s an abstract class that your test class can inherit from, that provides a simple way to abstract away everything about mocking. It means our tests require minimal boilerplate code, and rarely has to change based on class extension. But let’s take an actual look how this thing goes.

Testing Context In Action

To show you how the testing context works, we’ll create a quick couple of test classes.
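Something like the following (the method bodies are just filler to give us something to test) :

public interface ITestRepository
{
    List<string> GetNames();
}

public class TestRepository : ITestRepository
{
    public List<string> GetNames()
    {
        return new List<string> { "John", "Jane", "Bob" };
    }
}

public class UtilityService
{
    public string Capitalize(string input)
    {
        return input.ToUpperInvariant();
    }
}

public class TestService
{
    private readonly ITestRepository _testRepository;
    private readonly UtilityService _utilityService;

    public TestService(ITestRepository testRepository, UtilityService utilityService)
    {
        _testRepository = testRepository;
        _utilityService = utilityService;
    }

    // Doesn't touch the UtilityService at all.
    public List<string> GetNamesExceptJohn()
    {
        return _testRepository.GetNames().Where(x => x != "John").ToList();
    }

    // Uses the injected (concrete) UtilityService.
    public List<string> GetCapitalizedNames()
    {
        return _testRepository.GetNames().Select(x => _utilityService.Capitalize(x)).ToList();
    }
}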

First we just have a repository that returns names, then we have a service that has a couple of methods on it that interact with the repository, or in some cases a utility class.

Now two things here, the first being that the TestService takes in an ITestRepository (Interface) and UtilityService (class), so this could get a bit gnarly under normal circumstances because you can’t mock a concrete class. And second, the first method in the service, “GetNamesExceptJohn” doesn’t actually use this UtilityService at all. So I don’t want to have to mess about injecting in the class when it’s not going to be used at all. I would normally say you should always try and inject an interface, but in some cases if you are using a third party library that isn’t possible. So it’s more here as an example of how to get around that problem.

Now onto the tests. Our first test looks like so :
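Roughly :

[TestFixture]
public class TestServiceTests : TestingContext<TestService>
{
    [SetUp]
    public void TestSetup()
    {
        base.Setup();
    }

    [Test]
    public void GetNamesExceptJohn_FiltersOutJohn()
    {
        GetMockFor<ITestRepository>()
            .Setup(x => x.GetNames())
            .Returns(new List<string> { "John", "Jane" });

        var result = ClassUnderTest.GetNamesExceptJohn();

        Assert.False(result.Contains("John"));
    }
}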

The first thing you’ll notice that we inherit from our TestingContext, and pass in exactly what class we are going to be testing. This means that it feels intuitive that the only thing we are going to be writing tests for is this single class. While it’s not impossible to run methods from other classes in here, it sort of acts as blinders to keep you focused on one single thing.

Our test setup calls the base.Setup() method which just preps up our testing context. More so, it clears out all the data from previous tests so everything is stand alone.

And finally, our actual test. We simply ask the context to get a mock for a particular interface. In the background it’s either going to return one that we created earlier (More on that shortly), or it will return a brand new one for us to setup. Then we run “ClassUnderTest” with the method we are looking to test, and that’s it! In the background it takes any mocks we have requested, and creates an instance of our class for us. We don’t have to run any constructors at all! How easy is that.

Let’s take a look at another test :
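Something like :

[Test]
public void GetCapitalizedNames_UsesUtilityService()
{
    GetMockFor<ITestRepository>()
        .Setup(x => x.GetNames())
        .Returns(new List<string> { "jane" });

    // Inject an actual instance of the concrete UtilityService rather than a mock.
    InjectClassFor(new UtilityService());

    var result = ClassUnderTest.GetCapitalizedNames();

    Assert.AreEqual("JANE", result.First());
}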

In this test, we are doing everything pretty similar, but instead are injecting in an actual class. Again, this is not a good way to do things, you should try and be mocking as much as possible, but there are two reasons that you may have to inject a concrete class.

  1. You want to write a fake for this class instead of mocking it.
  2. You are using a library that doesn’t have an interface and so you are forced to inject the concrete class.

I think for that second one, this is more a stop gap measure. I’ve dabbled with taking out the ability to inject in classes, but there has always been at least one test per project that just needed that little bit of extensibility, so I left it in.

The Niceties

There are a couple of really nice things about the context that I haven’t really pointed out too much yet, so I’ll just highlight them here.

Re-using Mocks

Under the hood, the context keeps a list of mocks you’ve already generated. This means you can reuse them without having to have private variables fly all over the place. You might have had code that looked a bit like this in the past :
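Something like this old chestnut :

private Mock<ITestRepository> _testRepositoryMock;
private TestService _testService;

[SetUp]
public void Setup()
{
    _testRepositoryMock = new Mock<ITestRepository>();
    _testService = new TestService(_testRepositoryMock.Object, new UtilityService());
}

[Test]
public void GetNamesExceptJohn_FiltersOutJohn()
{
    _testRepositoryMock.Setup(x => x.GetNames()).Returns(new List<string> { "John", "Jane" });

    var result = _testService.GetNamesExceptJohn();

    Assert.False(result.Contains("John"));
}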

You can now rewrite like :
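Roughly :

[Test]
public void GetNamesExceptJohn_FiltersOutJohn()
{
    GetMockFor<ITestRepository>().Setup(x => x.GetNames()).Returns(new List<string> { "John", "Jane" });

    var result = ClassUnderTest.GetNamesExceptJohn();

    Assert.False(result.Contains("John"));
}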

This really comes into its own when your setup method typically contained a huge list of mocks that you setup at the start, then you would set a class level variable to be re-used in a method. Now you don’t have to do that at all. If you get a mock in a setup method, you can request that mock again in the actual test method.

Constructor Changes Don’t Affect Tests

You might have seen on the first test we wrote above, even though the constructor required 2 arguments, we only bothered mocking the one thing we cared about for the method under test. Everything else we can let the fixture handle.

How often do you see things like this in your tests?
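Something along the lines of (purely illustrative) :

var service = new ServiceA(null, null, _repositoryMock.Object, null);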

And then someone comes along and adds a new constructor argument, and you just throw a null onto the end again? It’s a huge pain point of mine and makes tests almost unreadable at times.

Test Framework Agnostic

While in the above tests I used NUnit to write my tests, the context itself doesn’t require any particular testing framework. It can work with NUnit, MSTest and XUnit.

Revisiting Our Problems

Let’s go full circle, and revisit the problems that I found with Unit Testing.

Any testing helper should limit the testing scope to a single class only.

I think we covered this pretty well! Because we pass the class we are looking to test into our testing context, it basically abstracts away being able to call other classes.

Changes to a constructor of a class should not require editing all tests for that class.

We definitely have this. Changes to the constructor don’t affect our test, and we don’t have to setup mocks for things we are uninterested in.

Boilerplate code within the test class should be kept to a minimum.

This is definitely here. No more 50 line setup methods just setting up mocks in case we need them later. We only setup what we need, and the re-usability of mocks means that we don’t even need a bag full of private variables to hold our mock instances.

What’s Your Thoughts?

Drop a comment below with your thoughts on the testing context!

I want to start off this post by saying if you are starting a new .NET Core project and you are looking to use a ServiceLocator. Don’t. There are numerous posts out there discussing how using a ServiceLocator is an “anti-pattern” and what not, and frankly I find anything that uses a ServiceLocator a right pain in the ass to test. Realistically in a brand new .NET Core project, you have dependency injection out of the box without the need to use any sort of static ServiceLocator. It’s simply not needed.

But, if you are trying to port across some existing code that already uses a ServiceLocator, it may not be as easy to wave a magic wand across it all and make everything work within the confines of .NET Core’s dependency injection model. And for that, we will have to work out a new way to “shim” the ServiceLocator in.

An important thing to note is that the “existing” code I refer to in this post is the “ServiceLocator” class inside the “Microsoft.Practices” library. Which itself is also part of the “Enterprise Library”. It’s a little confusing because this library is then dragged along with DI frameworks like Unity back in the day, so it’s hard to pinpoint exactly what ServiceLocator you are using. The easiest way is, are you calling something that looks like this :
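Something like this (IMyService being whatever your own interface is) :

var myService = ServiceLocator.Current.GetInstance<IMyService>();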

If the answer is yes, then you are 99% likely using the Microsoft.Practices ServiceLocator. If you are using a different service locator but it’s still a static class, you can probably still follow along but change the method signature to your needs.

Creating Our Service Locator Shim

The first thing we are going to do is create a class that simply matches our existing ServiceLocator structure and method signatures. We want to create it so it’s essentially a drop in for our existing ServiceLocator so all the method names and properties should match up perfectly. The class looks like :
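A sketch of the shim (the generic GetService<T>() call needs a using for Microsoft.Extensions.DependencyInjection) :

public class ServiceLocator
{
    private static IServiceProvider _serviceProvider;

    // Called once at startup with the built in .NET Core service provider.
    public static void SetLocatorProvider(IServiceProvider serviceProvider)
    {
        _serviceProvider = serviceProvider;
    }

    // Mirrors the old static "Current" property, handing back an instance to call GetInstance on.
    public static ServiceLocator Current => new ServiceLocator();

    public object GetInstance(Type serviceType)
    {
        return _serviceProvider.GetService(serviceType);
    }

    public TService GetInstance<TService>()
    {
        return _serviceProvider.GetService<TService>();
    }
}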

It can be a bit confusing because we are mixing in static methods with instance ones. But let’s walk through it.

On the static end, we only have one method, SetLocatorProvider, which allows us to pass in a ServiceProvider instance that will be used for all service location requests. ServiceProvider is the built in DI that comes with .NET Core (We’ll take a look at how we hook it up in a second). We also have a static property called Current that simply creates an actual instance of ServiceLocator, providing us with access to the “instance” methods.

Once we have an instance of the ServiceLocator class, we then gain access to the GetInstance  methods, which perfectly match the existing ones of the old ServiceLocator class. Awesome!

Wiring It Up To .NET Core Service Provider

The next part is easy! In our ConfigureServices method of our startup.cs. We need to set the LocatorProvider. It looks like so :
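Roughly :

public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc();

    // Build the provider and hand it to our ServiceLocator shim.
    ServiceLocator.SetLocatorProvider(services.BuildServiceProvider());
}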

So all we are doing is passing in an instance of our ServiceProvider and this will be used to fetch any instances that are required going forward.

This is actually all we need to do. If you have existing code that utilizes the ServiceLocator, barring a change to any “Using” statements to be swapped over, you should actually be all ready to go!

Testing It Out

Let’s give things a quick test to make sure it’s all working as intended.

I’m going to create a simple test class with a matching interface.
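Something simple like :

public interface ITestService
{
    string HelloWorld();
}

public class TestService : ITestService
{
    public string HelloWorld()
    {
        return "Hello World!";
    }
}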

I need to wire this up in my ConfigureServices method of startup.cs
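Along the lines of :

services.AddTransient<ITestService, TestService>();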

Then I’m just going to test everything on a simple API endpoint in my .NET Core web app.
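Something like :

[HttpGet("hello")]
public IActionResult Hello()
{
    var testService = ServiceLocator.Current.GetInstance<ITestService>();
    return Ok(testService.HelloWorld());
}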

Give it a run and…

All up and running!

I’m currently living the whole snake case vs camel case argument all over again. Me being a web developer, I prefer camel case (myVariable) as it fits nicely with any javascript code. Others, in what seems to be predominantly iOS developers, prefer to use snake case (my_variable). Typically it’s going to be personal preference and on any internal API, you can do whatever floats your boat. But what about for public APIs? Can we find a way in which we can let the consumer of the API decide how they want the API to return payloads? Why yes, yes we can.

Serialization Settings In .NET Core

But first, let’s look at how we can control JSON serialization in .NET Core if we wanted to go one way over another.

We can override any particular property on a model to say it should always be serialized with a particular name. This is really heavy handed and I think is probably the worst case scenario. When you do this, it’s basically telling the global settings to get stuffed and you know better.

So for example if I have a model like so :
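Something like this (the property names are just examples) :

public class PersonModel
{
    [JsonProperty("FirstName")]
    public string FirstName { get; set; }

    public string LastName { get; set; }
}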

It will be serialized like :
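Roughly :

{
    "FirstName" : "John",
    "lastName" : "Smith"
}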

So you’ll notice that the first property has the first letter capitalized (essentially PascalCase), but the second property without the JsonProperty attribute is using the default .NET Core serialization which is camelCase.

This may seem OK, but what if you then change the default serialization for .NET Core? What’s it going to look like then?

Let’s head over to our ConfigureServices method inside our Startup.cs. There we can change a couple of settings on our “AddMvc” call. Let’s say we now want to go with Snakecase for everything, so we change our JsonOptions to the following :
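Something like :

services.AddMvc()
    .AddJsonOptions(options =>
    {
        options.SerializerSettings.ContractResolver = new DefaultContractResolver
        {
            NamingStrategy = new SnakeCaseNamingStrategy()
        };
    });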

Running this, we now get :
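Something like :

{
    "FirstName" : "John",
    "last_name" : "Smith"
}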

Well… it did what we told it to do, but it’s definitely done us over a little bit. The property that we have overridden the naming of has won out against our “default” naming strategy. It’s what we expect to see, but often before people realize this it’s too late to go all the way back through the code and change it.

This will often become a problem when you’ve renamed a property not because of any naming strategy dogma, but because you actually want it to be called something else. So imagine the following :
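For example (“GivenName” here being a deliberate rename, not a casing thing) :

public class PersonModel
{
    // Renamed because the outside world knows this as "GivenName", not because of any naming strategy.
    [JsonProperty("GivenName")]
    public string FirstName { get; set; }

    public string LastName { get; set; }
}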

You’ve renamed it not because you wanted to override the naming strategy, but because you literally wanted it to be serialized under a different name. Usually because a name that makes sense internally may not make total sense to the outside world.

At this point there is no getting around this, you can make a completely custom attribute and have that handled in a custom contract resolver (Which we will investigate shortly!), but there is nothing inside the framework to help you be able to rename a property that will still respect the overall naming strategy.

Dynamic Serialization At Runtime

So far we have seen how to hard code a particular naming strategy at compile time, but nothing that can be dynamically changed. So let’s work that out!

Within JSON.net, there isn’t any ability to do this out of the box. In my case I’m looking to use a naming strategy based on a particular header being passed through an API. With nothing built in, I had to roll my own. Now the next part of this will be sort of like a brain dump of how I got up and running. And to be fair, I worked this as a proof of concept on a small API I was working on, so it may not be great at scale, but it’s a great starting point.

So to begin with, we have to create a custom “naming” strategy that inherits from JSON.net’s abstract class “NamingStrategy”. Inside this we need to override the method called “ResolvePropertyName” to instead return the name we want to use based on an API header. Phew. OK, here’s how I went about that :
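The whole thing, roughly (the class names and the public GetHeaderValue helper are how I’ll refer to the pieces below) :

public class ApiHeaderJsonNamingStrategyOptions
{
    public string HeaderName { get; set; }
    public Dictionary<string, NamingStrategy> NamingStrategies { get; set; }
    public NamingStrategy DefaultStrategy { get; set; }
    public Func<IHttpContextAccessor> HttpContextAccessorProvider { get; set; }
}

public class ApiHeaderJsonNamingStrategy : NamingStrategy
{
    private readonly ApiHeaderJsonNamingStrategyOptions _options;

    public ApiHeaderJsonNamingStrategy(ApiHeaderJsonNamingStrategyOptions options)
    {
        _options = options;
    }

    protected override string ResolvePropertyName(string name)
    {
        return GetNamingStrategy().GetPropertyName(name, false);
    }

    // Also used by the contract resolver further down to share the "get a valid header" logic.
    public string GetHeaderValue()
    {
        var headerValue = _options.HttpContextAccessorProvider()
            .HttpContext?.Request.Headers[_options.HeaderName]
            .FirstOrDefault();

        return !string.IsNullOrEmpty(headerValue) && _options.NamingStrategies.ContainsKey(headerValue)
            ? headerValue
            : string.Empty;
    }

    private NamingStrategy GetNamingStrategy()
    {
        var headerValue = GetHeaderValue();

        return string.IsNullOrEmpty(headerValue)
            ? _options.DefaultStrategy
            : _options.NamingStrategies[headerValue];
    }
}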

OK so it’s kinda big, so let’s break it down into chunks.

First we have our settings class :

This contains a HeaderName for what the header is going to be called, a dictionary called NamingStrategies where the keys are the header values we might expect, and their corresponding naming strategies (More on this later). We also have our DefaultStrategy in case someone doesn’t pass a header at all. Next we have a func that will return an HttpContextAccessor. We need this because in .NET Core, HttpContext is no longer an application wide static property available everywhere, it actually needs to be injected. Because we aren’t actually using DI here, we instead need to pass in a “function” that will return an HttpContextAccessor. We’ll delve more into this when we get to our configuration.

The rest of the code should be pretty straight forward. We get the header, we check if it’s valid (Matches anything in our dictionary), and if it does, we use that strategy to get the property name. If it doesn’t we then just use the default naming strategy.

Now, at this point I actually thought I was done. But as it turns out, JSON.net has aggressive caching so it doesn’t have to work out how to serialize that particular type every single request. From what I’ve seen so far, this is more about the actual custom serialization of the values, not the names, but the naming sort of gets caught up in it all anyway. The caching itself is done in what’s called a “ContractResolver”. Usually you end up using the “DefaultContractResolver” 99% of the time, but you can actually create your own and within that, setup your own caching.

Here’s mine that I created to try and overcome this caching issue:
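A sketch of that resolver (GetOrCreate comes from the memory cache extensions) :

public class ApiHeaderJsonContractResolver : DefaultContractResolver
{
    private readonly ApiHeaderJsonNamingStrategy _namingStrategy;
    private readonly Func<IMemoryCache> _memoryCacheProvider;

    public ApiHeaderJsonContractResolver(ApiHeaderJsonNamingStrategyOptions namingStrategyOptions, Func<IMemoryCache> memoryCacheProvider)
    {
        _namingStrategy = new ApiHeaderJsonNamingStrategy(namingStrategyOptions);
        _memoryCacheProvider = memoryCacheProvider;

        NamingStrategy = _namingStrategy;
    }

    // Cache contracts per header value *and* type, rather than per type only.
    public override JsonContract ResolveContract(Type type)
    {
        var cacheKey = $"{_namingStrategy.GetHeaderValue()}:{type.FullName}";

        return _memoryCacheProvider().GetOrCreate(cacheKey, entry => CreateContract(type));
    }
}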

So what does this actually do? Well because we inherited from the DefaultContractResolver, for the most part it actually does everything the same. With some key differences, let’s start at the top.

When we construct the resolver, we pass in our naming strategy options (Truth be told, I’m not sure I like it this way, but I wasn’t originally intending to have to do a resolver so the options are for the “namingstrategy” not the resolver. Bleh). And we also pass in a “function” that can return a memory cache, because again in .NET Core, memory caches are not application wide statics. JSON.NET actually just uses a static dictionary which also seemed OK, but I like MemoryCache’s wrappers a bit more.

The only thing we override is the ResolveContract method which is where it’s doing the aggressive caching. We actually want to cache things too! But instead of caching based purely on the type (Which is what the default does), we want to also find what the header was that was passed in. This way we cache for the combination of both the header value and the type. To get the header value, I actually reach out to the naming strategy which I probably shouldn’t be doing, but it was an easy way to “share” the logic of getting a valid header.

Now it’s time to set everything up. Here’s how it looks inside our ConfigureServices method of our startup.cs :
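Roughly (the header values “camelcase” and “snakecase” are just the keys I’ve picked) :

public void ConfigureServices(IServiceCollection services)
{
    services.AddSingleton<IHttpContextAccessor, HttpContextAccessor>();
    services.AddMemoryCache();

    services.AddMvc()
        .AddJsonOptions(options =>
        {
            options.SerializerSettings.ContractResolver = new ApiHeaderJsonContractResolver(
                new ApiHeaderJsonNamingStrategyOptions
                {
                    DefaultStrategy = new CamelCaseNamingStrategy(),
                    HeaderName = "json-naming-strategy",
                    HttpContextAccessorProvider = () => services.BuildServiceProvider().GetService<IHttpContextAccessor>(),
                    NamingStrategies = new Dictionary<string, NamingStrategy>
                    {
                        { "camelcase", new CamelCaseNamingStrategy() },
                        { "snakecase", new SnakeCaseNamingStrategy() }
                    }
                },
                () => services.BuildServiceProvider().GetService<IMemoryCache>());
        });
}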

A bit of a dog’s breakfast but it works. We set up the contract resolver to use our new ApiHeaderJsonContractResolver. We pass in our naming strategy options, saying that the default should be CamelCase and the header should be “json-naming-strategy”, for our HttpContextAccessorProvider we tell it to use our ServiceCollection to get the service, and we pass in a list of valid naming strategies. Finally we also pass in our function that should be used to pull the memory cache.

Let’s test it out!

First let’s try calling it with no header at all :

Cool, it used our default camelcase naming strategy.

Let’s tell it to use snake case now!

Perfect!

And forcing things back to camel case again!

Awesome!

So here we have managed to create our own resolver and naming strategy to allow clients to specify which naming convention they want. To be honest, it’s still a work in progress and this thing is definitely at the proof of concept stage, I’m still getting to grips with the internal workings of JSON.net, but it’s definitely a good start and OK to use on any small project you have going!

I’ve seen some fierce office arguments about how to use HttpClient  in .NET since I’ve been programming. And it’s always about one thing. When exactly do you dispose of an HttpClient instance?

You see there is one train of thought that looks like this :
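Roughly :

using (var client = new HttpClient())
{
    var result = await client.GetStringAsync("http://www.example.com");
}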

So you are creating a new instance every time you make an outbound call. Certainly when I first started using the HttpClient class, this seemed logical. But within the past couple of years, this particular article has become pretty infamous : https://aspnetmonsters.com/2016/08/2016-08-27-httpclientwrong/. With the key quotes being :

If we share a single instance of HttpClient then we can reduce the waste of sockets by reusing them

and

In the production scenario I had the number of sockets was averaging around 4000, and at peak would exceed 5000, effectively crushing the available resources on the server, which then caused services to fall over. After implementing the change, the sockets in use dropped from an average of more than 4000 to being consistently less than 400, and usually around 100.

Pretty damning stuff. So in this example, it’s instead favoured to re-use HttpClient instances across the application. And I have to admit, before this article was sent my way, I was definitely in the “always wrap it in a using statement” camp, and that’s generally all I saw out in the wild. These days it’s gone completely the other way, and you would now expect a “static” instance of HttpClient to be created and reused for the lifetime of the application. (There are actually now articles telling you *not* to use a single instance!)

But of course, in comes .NET Core with a new way to manage HttpClient lifetimes, and it’s an interesting one! This guide revolves around using .NET Core 2.1. If you aren’t using version 2.1 yet, there is a handy guide here to get up and running.

HttpClient Factories

We are working with .NET Core, and Core has fallen in love with “Dependency Inject all the things”! So of course Microsoft’s solution for the HttpClient messiness is a DI solution. Let’s imagine that I’m creating an API wrapper for Twitter. So I’m going to create a “TwitterApiClient” class to encapsulate all of this work.
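A stripped down version of that wrapper (the fake tweet list is obviously standing in for a real call) :

public class TwitterApiClient
{
    public async Task<List<string>> GetTweets()
    {
        // The "old" way : new up (and dispose) an HttpClient for every single call.
        using (var httpClient = new HttpClient())
        {
            // Imagine an actual call to the Twitter API and some deserialization happening here.
            return await Task.FromResult(new List<string> { "Tweet 1", "Tweet 2" });
        }
    }
}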

For the sake of brevity, I’m not actually calling out to the Twitter API. But you get the idea that I’ve created a nice wrapper for the API, that has a method called “GetTweets” that would, if I wanted it to, reach out and get some tweets and return them as a list. Let’s just use our imagination here! You’ll also notice I did this the crap way where we wrap everything in a using statement. This is intentional for now, just to show how things “might have been” before we knew better!

In my ConfigureServices method I’m going to register my TwitterApiClient like so :
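Along the lines of :

services.AddTransient<TwitterApiClient>();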

Now I’m going to go ahead and create a controller that just gets these tweets and writes them on the screen :
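Something like :

public class HomeController : Controller
{
    private readonly TwitterApiClient _twitterApiClient;

    public HomeController(TwitterApiClient twitterApiClient)
    {
        _twitterApiClient = twitterApiClient;
    }

    public async Task<IActionResult> Index()
    {
        var tweets = await _twitterApiClient.GetTweets();
        return Content(string.Join(Environment.NewLine, tweets));
    }
}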

Run this bad boy and what do we see?

OK so in theory we have everything working, but it’s disposing of our HttpClient each time which as we know from the above article, is a bad idea. So we could create a static instance of HttpClient, but to be perfectly honest, I hate static instances. But we are cutting edge so we are using .NET Core 2.1 (Again if you aren’t yet, you can read our guide to getting up and running with 2.1 here), and now we can use an injected instance of HttpClient. So let’s do that!

First we change our class around. We instead inject in an instance of HttpClient and use this instead.
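Roughly :

public class TwitterApiClient
{
    private readonly HttpClient _httpClient;

    public TwitterApiClient(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    public async Task<List<string>> GetTweets()
    {
        // Again, imagine a real call via _httpClient here.
        return await Task.FromResult(new List<string> { "Tweet 1", "Tweet 2" });
    }
}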

Now if you run this at this point, you are gonna see an error close to :

InvalidOperationException: Unable to resolve service for type ‘System.Net.Http.HttpClient’ while attempting to activate […]

This is because Core doesn’t just inject in HttpClient’s by default, there is a tiny bit of configuration needed.

First, we need to install the Microsoft.Extensions.Http nuget package. At the time of writing this is in preview so you will need the full version install command. So from your package manager console it will be something like:
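Something like the following (substituting whatever the current preview version number happens to be) :

Install-Package Microsoft.Extensions.Http -IncludePrerelease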

Now back in our ConfigureServices method in our startup.cs. We are going to add a call to AddHttpClient like so but most importantly, we remove our original call to add a transient instance of our original client. This is super important. I banged my head against a wall for a long time trying to work out what was going wrong with my code. And it turns out when you call AddHttpClient, it actually does a bunch of wiring up for you. If you then call AddTransient yourself, you just overwrite the lot!
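So something like :

public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc();

    // This wires up both the HttpClient *and* the TwitterApiClient itself.
    // Note : no more services.AddTransient<TwitterApiClient>() call!
    services.AddHttpClient<TwitterApiClient>();
}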

Give it a run and we should now be all up and running! Now what this code actually does is tell .NET Core to inject in an HttpClient instance into your nice little API wrapper, and it will handle the lifetimes for it. That last part is important. It’s not going to be a singleton, but it’s not going to be a per request type thing either. .NET Core has magic sauce under the hood that means it will at times recycle the underlying connections when it thinks it should.

I’ll admit it’s sort of hazy in a way that Microsoft says “trust us. We’ll sort this for you”. But it’s probably a whole lot better than what you were doing on your own.

Setting Defaults

Taking things a step further, we can actually set up some defaults in our configure method that mean we are configuring our application all in one place (And it’s not hardcoded in our services). As an example, I can do this :
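For example (the base address and header are just illustrative defaults) :

services.AddHttpClient<TwitterApiClient>(client =>
{
    client.BaseAddress = new Uri("https://api.twitter.com/");
    client.DefaultRequestHeaders.Add("Accept", "application/json");
});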

If we decide to read these settings from a config store like appSettings.json, we don’t have to pollute any sort of IOptions throughout our actual Twitter client. Nice!

Named HttpClient Factories

Something else you can do is create named instances of HttpClient that can be created from an HttpClientFactory. This is handy if the class you want to inject HttpClient instances into needs more than one default. So for example a “SocialMediaApiClient” that talks to both Twitter and Facebook.

The setup is slightly different. Instead of saying which class we want to inject our HttpClient into, we just add an instance of HttpClient to the factory with a particular name, and the defaults we want.
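Something along these lines :

services.AddHttpClient("Twitter", client =>
{
    client.BaseAddress = new Uri("https://api.twitter.com/");
});

services.AddHttpClient("Facebook", client =>
{
    client.BaseAddress = new Uri("https://graph.facebook.com/");
});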

Then when it comes to our actual service we first inject an instance of IHttpClientFactory, and then we can get that specific instance of HttpClient by calling CreateClient with the client name as a parameter. Once again, the lifecycle is managed for us as to when it’s disposed of or reused.
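Roughly (the endpoint path is a placeholder) :

public class SocialMediaApiClient
{
    private readonly IHttpClientFactory _httpClientFactory;

    public SocialMediaApiClient(IHttpClientFactory httpClientFactory)
    {
        _httpClientFactory = httpClientFactory;
    }

    public async Task<string> GetTwitterContent()
    {
        var client = _httpClientFactory.CreateClient("Twitter");
        return await client.GetStringAsync("/sometwitterendpoint");
    }
}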

Generic HttpClient

Finally, the HttpClient factory comes with the ability to generate a new HttpClient on demand which will be managed for you. With this, there shouldn’t ever be a reason to “new up” an instance of HttpClient ever again.

First we just call “AddHttpClient” in our ConfigureServices method, passing in absolutely nothing.
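services.AddHttpClient();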

And whenever we want to actually get a new instance of an HttpClient, we inject in an instance of IHttpClientFactory and call CreateClient, passing in nothing extra.
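Something like :

public class MyService
{
    private readonly IHttpClientFactory _httpClientFactory;

    public MyService(IHttpClientFactory httpClientFactory)
    {
        _httpClientFactory = httpClientFactory;
    }

    public async Task<string> GetSomething(string url)
    {
        var client = _httpClientFactory.CreateClient();
        return await client.GetStringAsync(url);
    }
}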

 

While working on another post about implementing your own custom ILogger implementation, I came across a method on the ILogger interface that I had never actually used before. It looks something like _logger.BeginScope("Message Here"); . Or, correction, it didn’t just take a string, it could take any type and use this for a “scope”. It intrigued me, so I got playing.

The Basics Of Scopes In ILogger

I’m not really sure on the use of “scope” as Microsoft has used it here. Scope when talking about logging seems to imply either the logging level, or more likely which classes use a particular logger. But scope as Microsoft has defined it in ILogger is actually to do with adding extra messaging onto a log entry. It’s somewhat metadata-ish, but it tends to read more like a stacktrace from within a single method.

To turn on scopes, first you need to actually be using a logger that implements them – not all do. And secondly you need to know if that particular logger has a setting to then turn them on if they are not already on by default. I know popular loggers like NLog and Serilog do use scopes, but for now we are just going to use the Console logger provided by Microsoft. To turn it on, when configuring our logger we just use the “IncludeScopes” flag.
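Depending on your exact version, that’s either a Logging:Console:IncludeScopes setting in appsettings.json, or something like this when building the host :

WebHost.CreateDefaultBuilder(args)
    .ConfigureLogging(logging =>
    {
        logging.AddConsole(options => options.IncludeScopes = true);
    })
    .UseStartup<Startup>();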

First I’ll give a bit of code to look at as it’s probably easier than me rambling on trying to explain.
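Something like this inside a controller action, with an injected ILogger<T> :

using (_logger.BeginScope("Starting to do some work"))
{
    _logger.LogInformation("Inside the first scope");

    using (_logger.BeginScope("A nested scope"))
    {
        _logger.LogInformation("Inside the nested scope");
    }
}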

Now when we run this, we end up with a log like so :

So for this particular logger, what it does is it appends it before the logs. But it’s important to note that the scope messages are stored separately, it just so happens that this loggers only way of reporting is to push it all into text. You can see the actual code for the Console logger by Microsoft here : https://github.com/aspnet/Logging/blob/dev/src/Microsoft.Extensions.Logging.Console/ConsoleLogScope.cs. So we can see it basically appends it in a hierarchical fashion and doesn’t actually turn it into a string message until we actually log a message.

If for example you were using a Logger that wrote to a database, you could have a separate column for scope data rather than just appending it to the log message. Same thing if you were using a logging provider that allowed some sort of metadata field etc.

Note :  I know that the scope message is doubled up. This is because of the way the ConsoleLogger holds these messages. It actually holds them as a key value pair of string and object. The way it gets the key is by calling “ToString()” on the message… which basically calls ToString() on a string. It then writes out the Key followed by the Value, hence the doubling up. 

So It’s Basically Additional Metadata?

Yes and no. The most important thing about scopes is their hierarchy. Every library I’ve looked at that implements scopes implements them in a way that there is a very clear hierarchy to the messages. Plenty of libraries allow you to add additional information with an exception/error message, but through scopes we can determine where the general area of our code got up to and the data around it without having to define private variables to hold this information.

Using Non String Values

The interesting thing is that BeginScope actually accepts any object as a scope, but it’s up to the actual Logger to decide how these logs are written. So for example, if I pass a Dictionary of values to a BeginScope statement, some libraries may just ToString() anything that’s given to them, others may check the type and decide that it’s better represented in a different format.

If we take a look at something like NLog. We can see how they handle scopes here : https://github.com/NLog/NLog.Extensions.Logging/blob/master/src/NLog.Extensions.Logging/Logging/NLogLogger.cs#L269. Or if I take the code and paste it up :

We can see it checks if the state message is a type of IEnumerable<KeyValuePair<string, object>> and then throws it off to another method (Which basically splits open the list and creates a nicer looking message). If it’s anything else (Likely a string), we just push the message as is. There isn’t really any defined way provided by Microsoft on how different types should be handled, it’s purely up to the library author.

How About XYZ Logging Library

Because logging scopes are pretty loosely defined within the ASP.net Core framework, it pays to always test how scopes are handled within the particular logging library you are using. There is no way for certain to say that scopes are even supported, or that they will be output in a way that makes any sense. It’s also important to look at the library code or documentation, and see if there is any special handling for scope types that you can take advantage of.

There is a current proposal that’s getting traction (By some) to make it into C# 8. That’s “default interface methods”. Because at the time of writing, C# 8 is not out, nor has this new language feature even “definitely” made it into that release, this will be more about the proposal and my two cents on it rather than any specific tutorial on using them.

What Are Default Interface Methods?

Before I start, it’s probably worth reading the summary of the proposal on Github here : https://github.com/dotnet/csharplang/issues/288. It’s important to remember that this is just a proposal so not everything described will actually be implemented (Or implemented in the way it’s initially being described). Take everything with a grain of salt.

The general idea is that an interface can provide a method body. So :
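For example :

public interface IGreeter
{
    void SayCustomHello();

    // A default implementation, right inside the interface.
    void SayHello()
    {
        Console.WriteLine("Hello from the interface!");
    }
}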

A class that implements this interface does not have to implement methods where a body has been provided. So for example :
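Something like :

public class Greeter : IGreeter
{
    // Only the member without a default body has to be implemented.
    public void SayCustomHello()
    {
        Console.WriteLine("A custom hello!");
    }
}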

The interesting thing is that the class itself does not have the ability to run methods that have been defined and implemented on an interface. So :
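Roughly :

var greeter = new Greeter();
// greeter.SayHello();             // Does not compile - the default method isn't on the class itself.
((IGreeter)greeter).SayHello();    // Works - the default method is available via the interface.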

The general idea is that this will now allow you to do some sort of multiple inheritance of behaviour/methods (Which was previously unavailable in C#).

There are a few other things that then are required (Or need to become viable) when opening this up. For example allowing private level methods with a body inside an interface (To share code between default implementations).

Abstract Classes vs Default Interface Methods

While it does start to blur the lines a bit, there are still some pretty solid differences. The best quote I heard about it was :

Interfaces define behaviour, classes define state.

And that does make some sense. Interfaces still can’t define a constructor, so if you want to share constructor logic, you will need to use an abstract/base class. An interface also cannot define class level variables/fields.

Classes also have the ability to define accessibility of their members (For example making a method protected), whereas with an interface everything is public. Although part of the proposal is extending interfaces with things like the static keyword and protected, internal etc (I really don’t agree with this).

Because the methods themselves are only available when you cast back to the interface, I can’t really see it being a drop in replacement for abstract classes (yet), but it does blur the lines just enough to ask the question.

My Two Cents

This feels like one of those things that just “feels” wrong. And that’s always a hard place to start because it’s never going to be black and white. I feel like interfaces are a very “simple” concept in C# and this complicates things in ways which I don’t see a huge benefit. It reminds me a bit of the proposal of “Primary Constructors” that was going to make it into C# 6 (See more here : http://www.alteridem.net/2014/09/08/c-6-0-primary-constructors/). Thankfully that got dumped but it was bolting on a feature that I’m not sure anyone was really clamoring for.

But then again, there are some merits to the conversation. One “problem” with interfaces is that you have to implement every single member. This can at times lock down any ability to extend the interface because you will immediately blow up any class that has already inherited that interface (Or it means your class becomes littered with throw new NotImplementedException() ).

There are even times when you implement an interface for the first time, and you have to pump out 20 or so method implementations that are almost boilerplate in content. A good example given in the proposal is that of IEnumerable. Each time you implement this you are required to get RSI on your fingers by implementing every detail. Where if there were default implementations, there might be only a couple of methods that you truly want to make your own, and the default implementations do the job just fine for everything else.

All in all, I feel like the proposal should be broken down and simplified. It’s almost like an American political bill in that there seems to be a lot in there that’s causing a bit of noise (allowing static, protected members on interfaces etc). It needs to be simplified down and just keep the conversation on method bodies in interfaces and see how it works out because it’s probably a conversation worth having.

What do you think? Drop a comment below.