Why C++ is My Language of Choice

2011-02-26

I had a discussion with a friend of mine that came from a, what I would call, Microsoft indoctrination camp. He tried to persuade me of the benefits of .Net. As a matter of fact, I am programming C# for a living since a good while, but I just can't see the advantage. The conversation was unfortunately cut short and I was unable to really bring my points across. So I decided to take the time and write them down in as much detail as possible.

First off, I want to say I am not anti-Microsoft per se. I try addressing technical problems with a pure technical analysis. Just to be clear, stating that the decision for any technology is a business decision, is pure and utter nonsense. True if you happen to have an existing body of work, you don't want to change the technology just because a more adequate one came along. That is just wasting money. But when you start with a clean slate, it is purely a technical decision.

As a side note, I will focus on C#. This is because .Net is basically used only with C#. There are a few Visual Basic programmers that never learned a better language and a few C++/CLI people who desperately try to make their old program a better thing by using .Net. But anybody who starts out will use C#.

To understand the discussion further, I first have to clarify what C++ and .Net is.

What is C++

C++ is a language standardized by the ISO. You can pick up many compilers from different vendors and compile an executable program. C++ also specifies a small standard library for common tasks, such as strings, reading files and some math support. But it is so small that other common tasks such as threading or networking are not implemented. C++ by itself is not very usable, but can be extended through a myriad of third party libraries.

What is .Net

.Net is a framework that provides a number of languages, such as C#, VisualBasic.Net or C++/CLI. .Net is specified by the ECMA, but the standard is lead by Microsoft and almost only Microsoft. This is natural since they developed the technology. The framework provided by .Net is huge. You have basically everything from XML, over web services and COM, passing by database access and finally ending at fancy user interface components. The .Net runtime also provides a number of heigh level services for the languages, notably reflection, garbage collection and access validation.

Lots of Features

When people try to sell .Net they often will point to the many features that the framework and runtime provide. Almost every time they would say something like "and with C++ you would have to develop it all yourself" or "the runtime protects you from invalidating your memory". There is a certain little truth in these statements but even significantly more ignorance. This comes from two misconceptions. The first is their utter lack of knowledge about other technologies. The second is the fallacy that technology can accommodate for bad programming.

One Framework

Three Classes for the Software Engineers under the sky,
Seven for the Programmers in their halls of stone,
Nine for Code Monkeys doomed to die,
One for the Dark Lord on his dark throne
In the Land of Microsoft where the Shadows lie.
One Framework to rule them all, One Framework to find them,
One Framework to bring them all and in the darkness bind them
In the Land of Microsoft where the Shadows lie.

Ignoring the implications of vendor lock in, having one huge framework supplied with your language is not the blessing you might have thought. At first glance it seems like this is a huge boost in productivity, since you do not need to implement that much. To a certain degree this is the case. The framework is built to accommodate all common use cases and some more special ones. The really big problem arises when your problem domain just will not map into the solution domain of the framework. Now you will notice that you are programming around idiosyncrasies of the framework, since it will just not bend that way. Throwing out the framework and looking for a different one or implementing the functionality from scratch is hard to justify, since your managers bought into the cost saving idea of having an extensive framework.

With C++ that just will not happen, since there is no framework to start with. You can go out there and look for a framework or library that helps you in your task. There are literally like sand on the beach. Try them out, see which matches your use case best. You even have the option to implement it yourself. This is a really valuable options, since there are quite a few cases where implementing just what you need is less effort and code than translating your problem to a framework. Even the standard library is optional. As it happens, it is just a library like any other and you can decide not to use it.

The only drawback that C++ has is that there is no central repository of libraries. Technologies such as Python or Pearl certainly nailed this aspect. On the other hand there is Google...

Do it with Style

Many languages and technologies bind you to a specific programming paradigm. The only exception I know is Pearl and C++. A critique of C++ is often that it is not a "real" or "pure" object oriented language and I think this is great. With C++ you can program procedural, functional, object oriented, aspect oriented or service oriented. The language just does not bog you down by applying stupid restrictions. True some more esoteric features are provided by third party frameworks such AspectC++ or sigc++'s lambda expressions. (Yes, there is also' boost's lambda expressions, I know.)

The important aspect to remember is that certain problems just lend themselves to certain programming paradigms. In the end of the day you need to solve problems, not implement pure or fancy solutions. Limiting your solution space will hinder you, not empower you.

Alzheimer

I think one of the features touted by many "modern" languages is garbage collection. But I think that in many cases this is more of an anti-feature.

First off, I want to point out that if you really want you can also have garbage collection for C++. There are multiple implementations, but the most famous is Boehm-Demers-Weiser conservative garbage collector. It actually is collector for pure C. But because of C++ is compatible to C the collector is trivially adapted to C++ and in the case of that collector it is done for you.

As a result you have many options when programming C++. You can put the memory on the stack, explicitly allocate it on the heap, wrap it into a reference counting pointer scheme, extend that scheme to add copy on write semantic, full blown garbage collection and you can even write your own memory management. With a language like C# you just have garbage collection and that is it.

If you listen to .Net marketing you will hear that object construction and destruction is so much faster than C++. Here is the scoop: that is utter nonsense. This is only true if you artificially narrow your sight. Create a bunch of objects of one type and destroy them. Here C# will outperform C++ by an order of a magnitude. But that is comparing a kitchen knife with a katana when chopping vegetables. But you still want the katana when you are standing on the battlefield.

To understand why this is the way it is and why it fails in real applications you need to understand a bit how garbage collection works. The basics of garbage collection are that you allocate an object and just let it leak when you are done using it. It is then the task of the garbage collector to figure out which objects are used and which not. There are different heuristics to do this, but that is not so important. The important part is that memory that is unused is not immediately (in some implementations never) released to the operating system. Now when a new object is created the garbage collector looks in it's unused memory if there happens to be a chunk of unused memory that fits the size. In this case no new memory is allocated. This is done as a performance measure, since allocation and release of memory is quite expensive, since the operating system needs to track it all.

When looking at our example you can clearly see why the implementation using the garbage collector is faster, only a few objects are allocated and then reused. But this example is an unfair competition, since C++ has other ways to deal with creating and destroying many objects of the same type. First creating the object on the stack is simple, easy and here not one byte of memory is allocated or freed. Even if the problem is more difficult and you need to implement allocation, you can simply implement the flyweight pattern. Here the memory management is explicitly handled and you pool all small objects into one large memory pool that is allocated once.

If you make the example even more complex with objects of different sizes, you will start to see the garbage collector stating to struggle along. There are two problems. First the collector can not reuse as many memory chunks since the objects are different sizes. Second you are still paying for the luxury of garbage collection, the collector must figure out which object are not used anymore and manage it's internal data structure to keep track of everything. When you have high memory fluctuation, you will end up also paying for the memory release penalty, since the collector will need to free some memory. The only difference is that this ocures at an unspecified time later in the program.

But the biggest problem about managed memory is not the technical aspects, but poor programmers. The advantage about garbage collection is that you don't need to worry about memory; the disadvantage about garbage collection is that you don't need to worry about memory. With unmanaged programs finding memory leaks is relatively simple, run the program through a leak analyser and it will point out where you lost your memory. Manages programs are impossible to analyse with such simple means. Leave dangling references and your memory will balloon out of control. I have seen really weird ways in which the garbage collector was broken, that started as simple programs. The irony is that many applications are required to use a form of explicit memory management to work around these problems, such as the disposable "pattern" used in C#.

Got Properties?

The syntactic sugar that C# added in the form of properties is quite evil. Although used in moderation properties have the potential to lighten code a bit. The problem is that more often than you would wish, they are more trouble that good. When using a method to get or set a value you know you are using a method. Because it is a method you can call is anyway you like. It is clear that "obj.get_value()", "obj.retrive_value()" or "obj.compute_value()" have different expected runtime and "obj.value" implies near to no runtime. But in practice you just don't know. Although there is a certain consensus only to use properties for trivial operations, you can never rally be sure about it.

Mirror Mirror on the Wall

In my opinion the biggest anti-feature is reflection in a strongly typed language. One of the strengths of strongly typed languages is that the interface and layout of objects is well known at compile-time. This has the main benefit that code has as little ambiguity as possible. But this is completely circumvented when you just pass in an "object" and use reflection to interact with it.

The main goal of reflection is to decouple the concrete type and algorithms that work on them. But more often you need ways to flag certain properties, methods or classes to have different behaviour than others. In C# this is solved with attributes. But now you actually added coupling to achieve this. The class now needs to know in what context it will be used and add the appropriate attributes.

It gets even more difficult to trace when reflection is not enough and you need to influence the reflection directly. In the case of C# this would be the use of ICustomTypeDescriptor. I mean honestly, "I put generic implementation into your generic implementation so you can reflect while you reflect"? At this point you should just throw reflection out of the window and properly think about what you want to accomplish and properly design you code.

Not so fast buster!

But at least C# is as fast as C++, right? Wrong! True, you got to hand to the people of Microsoft, they did allot of work trying to make the .Net runtime as fast as humanly possible. (At least we are lead to believe that.) But here is the scoop, no matter what they do, they will never achieve the same speed. Why? Because of all the extra safety built in. Each time you access a reference it is checked for validity. Each time you call along assembly boarders your call is marshalled. Each time you allocate something the garbage collector needs to update its state. Each time you do something with the operating system it is checked against security settings. Each time you add a variable it is initialized to something neutral. All in all it is this comfort you are paying for.

So it is easy to use?

This is something that Microsoft is trying to sell you, C# and .Net is really easy to use and learn. This is basically pitched with the arguments that all the safety means are in place. This is like giving novice drivers a tank, sure they don't get hurt but the damage is till there. It basically makes no real difference if you get an access violation or a NullReference exception. In the end of the day they need to learn to program properly, if they want to build any application that is more complex than "calc.exe". In my experience novice programmers will probably dig themselves a hole they can't get out of.

Dependencies Dependencies Dependencies

When you bundle a C++ application you need all dependencies. In my experience this is around a dozen DLLs accounting to something like a few MB of disk space. They are all copied to the folder with your program, thus avoiding DLL-hell.

With .Net applications this is more complicated. First you need to ensure that the .Net framework is installed. Assuming that the framework is already installed is just rude and there are enough times where this really this is not the case. Let's face it, the .Net framework is huge and you need to bundle the .Net framework in your installer. The tragic part is that you are probably only are using 10% of the actual framework, but you simply do not have a choice. It is almost technically impossible to install only parts of the framework.

Additionally it is even more complicated to install .Net assemblies. Simple DLLs are just copied next to your program or put somewhere in the PATH. But .Net assemblies need to be correctly registered. Although not such a big deal, it is just one more unnecessary hurdle.

The Right Tool

I like C++ because it is such versatile tool, but it definitely is not the right tool for every task. I have written quite some web applications in my time and used technologies such as Pearl, PHP and Python. They are highly light weight languages that mend themselves to the needs of quickly putting things together. Sure they are slow like hell, but that is not an issue since the program is waiting on the database most of the time.

For simple height level scripting task I will always use a lightweight scripting language and for serious heavy duty tasks I will almost always use C++. But the important part is to look at the available technologies and ask yourself which will get the task done in an efficient and proper way.

Technologies that don't really know what they want, such as .Net or EJB just do not have a place in my tool belt. (And don't get me started about Java...)

Do You Even Know What You Are Talking About?!

To clarify my position let me explain a bit of background on myself. I have developed on both sides of the spectrum and in many places in between. I started programming web applications, first with Pearl, then PHP. I then moved over to program tools for CFD applications in C++, including a fully fledged graphical editor using OpenGL. I then moved on to program and maintain a compiler and interpreter for an interpreted language for industry automation using C++. Along all this delved into technology for games as a hobby which where mostly C++ and kept developing websites for personal use, including Python with Django and the Google Apps Engine. And Finally I am working in a team developing an editor for panels and control room displays for industry and plant automation which is written in C#. As it happens, this editor is also the largest C# application ever developed (at least so I am told). It takes a certain size of application to see where C# is failing.