Object-Oriented C Programming - Part I

To say that the object-oriented programming paradigm is the most popular and widely supported method of creating software today would be an extreme understatement. Not only entire sets of tools but entire organizational development processes are built around the object-oriented model. The popularity of languages like C++, Java, C#, and other object-oriented languages has skyrocketed over the last two decades, while the popularity of the venerable C programming language has slowly but surely dwindled.

Introduction

As object-oriented programming has become more popular in the game industry, C++ has overtaken C by a wide margin. This has happened despite a number of deficiencies in the C++ programming language, many of which are actually quite harmful and problematic for game developers. It's not uncommon for game developers to disable a swath of C++ features such as exceptions and even RTTI, while simply avoiding other features such as the STL in favor of alternative libraries with better allocator designs or other performance improvements. Developers in other parts of the CS landscape often disable or simply avoid the same or even more parts of C++, such as operator overloading and virtual methods, due to a need to retain tighter control over the generated code as well as desiring more readability and safety in source code.

C is avoided because it is not designed as an object-oriented language. It lacks concepts like classes with inheritance, messages (virtual function calls), polymorphism, encapsulation, and generic programming. However, the lack of language support for these features should not lead developers into thinking that C is not capable of being used in an object-oriented manner.

The first thing to remember is that the early C++ compilers -- and in fact, even a number of modern compilers for alternative object-oriented languages -- acted merely as veneers or macro languages on top of a regular old-fashioned C compiler. Any feature implemented in C++ (aside from exceptions) can be implemented in C. Languages like Python or Ruby are themselves implemented in C. The difference between a language like C++ and a language like C is merely that the former offers syntactical "sugar" to ease the development of object-oriented code; this is not the same thing as saying that C++ made object-oriented programming possible while C did not.

I'm going to clear something up now: nothing's wrong with using C++. I myself vastly prefer using C++ when writing any kind of large application, such as a game engine. However, there are a variety of cases when pure C is a preferable language. For those cases, it's best if you know how to make the most out of the language. Plus, knowing how to write good object-oriented C code can actually help you write better C++ object-oriented code.

In this article series, I'll lay out some ground rules and give examples of how to write solid, maintainable, and easy-to-use object-oriented code in pure C.

Of Structs and Classes

By far the most obvious and striking feature of C++ not found in C is the concept of a class, or object. At its most basic level, a C++ class is nothing more than a C struct. Consider the two declarations below:

    // C++
    class MyClass {
    public:
        int foo;
        double bar;
        MyType baz;
    };

    /* C */
    typedef struct MyStruct {
        int foo;
        double bar;
        MyType baz;
    } MyStruct;

Both declarations are almost entirely equivalent in terms of functionality and purpose. For the vast majority of compilers, both of those types will have identical size, alignment, and layout. The only striking difference is that C++ adds member protection and defaults to private access for the members of a class. That's it. The implication is that a simple C struct can easily replace a simple C++ class in any API. In these basic cases, they are conceptually interchangeable.

Note that I am not implying that a single API can interchangeably use "identical looking" classes and structs. This is not true and is most definitely against the language semantics. If an API is written with classes, use those classes; if it's written using structs, use those structs.

It's worth taking a look at how I declared the MyStruct type in the example above. In C, struct names live in a different "namespace" than variables, functions, or typedefs. Check the following example:

    struct Foo {
        int bar;
    };

    Foo foo;        /* error: Foo is not a known type */
    struct Foo foo; /* compiles successfully */

It's possible to use a typedef to inject a struct's name into the ordinary identifier namespace in C, the one shared by variables, functions, and typedefs. This can be done as a separate declaration if desired.

    struct Foo {
        int bar;
    };
    typedef struct Foo Foo;

This is perfectly acceptable, and whether you use that format or the one I use in the rest of the article is purely a stylistic choice.

There is also a third option, which is to embed the struct definition into the typedef as I did above, but to make the struct anonymous, like so:

    typedef struct {
        int bar;
    } Foo;

I recommend avoiding doing that for two reasons. The first reason is that sometimes you need a type to reference itself, such as when creating node types for a linked list. This requires that the type name be known to the compiler before it reads the members of the struct. The following code for example will not compile:

    /* forward declarations are necessary to avoid this error */
    typedef struct {
        int bar;
        Foo* next; /* error: Foo is unknown */
        Foo* prev; /* error: Foo is unknown */
    } Foo;
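For contrast, here is a minimal sketch of a variant that does compile, because the struct tag is named and the members refer to `struct Foo` rather than to the typedef name. The `foo_link` helper is hypothetical and exists only to show the pointers in use.

```c
#include <assert.h>
#include <stddef.h>

/* Naming the tag makes "struct Foo" known before the members are read,
   so the type can point at itself. */
typedef struct Foo {
    int bar;
    struct Foo* next;
    struct Foo* prev;
} Foo;

/* Hypothetical helper (not from the article): place b directly after a. */
void foo_link(Foo* a, Foo* b)
{
    assert(a != NULL && b != NULL);
    a->next = b;
    b->prev = a;
}
```

Inside the struct body the typedef name `Foo` still does not exist, which is why the members are spelled `struct Foo*`; after the closing brace, either spelling works.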

The second reason to avoid that format is that it makes forward declaration of the type in headers impossible. You will often want to have some functions in a header that take pointers to your types, but those headers have no reason to need the layout of the type or any other declarations from the type's header. In both C++ and C, you can use a forward declaration to achieve this. If the base struct is anonymous and is only named in a typedef, however, this won't work.

Here's the example using forward declarations. Note how Functions.h does not need to include Foo.h.

    /* Foo.h */
    typedef struct Foo {
        int bar;
    } Foo;

    /* Functions.h */
    extern void do_stuff(struct Foo* foo);

Now here's the example using the anonymous struct. Note that there's no way to use forward declarations in Functions1.h, and Functions2.h must include Foo.h to work.

    /* Foo.h */
    typedef struct {
        int bar;
    } Foo;

    /* Functions1.h */
    /* error: 'struct Foo' is a different type than 'typedef struct {} Foo' */
    extern void do_stuff(struct Foo* foo);

    /* Functions2.h */
    #include "Foo.h"

    extern void do_stuff(Foo* foo);
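To make the working arrangement concrete, the sketch below folds the headers and an implementation file into one translation unit, with comments marking where each piece would live. The `Functions.c` file, the `void` return type, and the body of `do_stuff` are assumptions made for illustration; the article does not say what `do_stuff` does.

```c
#include <assert.h>
#include <stddef.h>

/* --- Functions.h: a forward declaration is all this header needs --- */
struct Foo;                      /* incomplete type: layout unknown here */
void do_stuff(struct Foo* foo);  /* pointers to incomplete types are fine */

/* --- Foo.h: the full definition, with a named tag --- */
typedef struct Foo {
    int bar;
} Foo;

/* --- Functions.c (assumed): includes Foo.h, so it may touch members --- */
void do_stuff(struct Foo* foo)
{
    assert(foo != NULL);
    foo->bar += 1;  /* placeholder behavior, chosen only for this sketch */
}
```

Only code that actually reads or writes the members has to see Foo.h. Everything that merely passes `struct Foo*` around can get by with the one-line forward declaration.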

Inheritance

The first real difference between C++ and C is that C++ offers the notion of inheritance. This allows the members and methods of a base class to be inserted into a derived class, and also allows implicit casting from pointers to the derived type into pointers to the base type.

From a technical standpoint, inheritance is easily implemented in C.

    // C++
    class MyClass : public BaseClass {
    public:
        int foo;
    };

    /* C */
    typedef struct MyStruct {
        BaseStruct base;
        int foo;
    } MyStruct;

Again, in most C++ compilers, both of those declarations will have identical size, alignment, and layout. Functionally, these two definitions are equivalent.
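The C side of that claim can be checked directly. C guarantees that a pointer to a struct, suitably converted, points to its first member and vice versa, which is what makes the embedded-base pattern behave like single inheritance. In the sketch below, the contents of `BaseStruct` and both helper functions are assumptions, since the article does not define them.

```c
#include <assert.h>
#include <stddef.h>

/* BaseStruct is not defined in the article; this member is assumed. */
typedef struct BaseStruct {
    int id;
} BaseStruct;

typedef struct MyStruct {
    BaseStruct base;  /* must be the FIRST member for the casts below */
    int foo;
} MyStruct;

/* Hypothetical helper: "upcast" by taking the address of the base member. */
BaseStruct* my_struct_as_base(MyStruct* ms)
{
    assert(ms != NULL);
    return &ms->base;
}

/* Hypothetical helper: "downcast". Only valid when bs really is the base
   member of a MyStruct; nothing in the language checks this for you. */
MyStruct* base_as_my_struct(BaseStruct* bs)
{
    assert(bs != NULL);
    return (MyStruct*)bs;
}
```

The upcast is always safe. The downcast is the C counterpart of a `static_cast` to a derived type in C++: it is correct only if the caller already knows the object's real type.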

The syntactic differences between C and C++ get much more obvious in this case, however. Accessing the base class's members in C requires a bit more typing than in C++. Additionally, there is no automatic casting from derived pointer types to base pointer types, so again some additional typing is necessary.

    // C++ (here foo is taken to be a member of the base type)
    BaseClass* bc;
    MyClass* mc;

    mc->foo = 123; // access base class member in C++
    bc = mc;       // implicit cast to base type pointer in C++

    /* C */
    BaseStruct* bs;
    MyStruct* ms;

    ms->base.foo = 123; /* access base class member in C */
    bs = &ms->base;     /* explicit conversion to base type pointer in C */
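The explicit conversion pays off once a function is written against the base type, because every derived type can then reuse it. The sketch below uses hypothetical `Shape`, `Circle`, and `Rect` types that do not appear in the article; they are only there to show one base-type function serving two derived types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical base type. */
typedef struct Shape {
    int x;
    int y;
} Shape;

/* Two hypothetical derived types, each embedding the base first. */
typedef struct Circle {
    Shape base;
    int radius;
} Circle;

typedef struct Rect {
    Shape base;
    int width;
    int height;
} Rect;

/* Written once against the base type, usable with any derived type. */
void shape_move(Shape* s, int dx, int dy)
{
    assert(s != NULL);
    s->x += dx;
    s->y += dy;
}
```

A caller writes `shape_move(&circle.base, 10, 20)` where C++ would accept `shape_move(&circle, 10, 20)`. The extra `.base` is the whole cost of the explicit conversion.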

The C version is certainly a bit uglier, but this turns out to be less of an issue than you might think. The C syntax only becomes a real problem with deep type hierarchies, which can easily result in code like the following:

/* the deep hierarchy of the API is the problem, not the C syntax */
mystruct->base.base.base.base.base.member = value;

That's definitely not good. The major part of the solution to this problem is to simply avoid deep hierarchies, which is very good advice whether you're working in C, C++, Java, or any other language. Where possible, use has-a relationships rather than is-a relationships. In most large projects that are well architected, there should rarely be any class hierarchy deeper than three levels.
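As a rough illustration of the has-a advice, the sketch below builds a type out of independent parts instead of a chain of bases. All of the names (`Transform`, `Health`, `Player`, `health_damage`) are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Small, independent building blocks (all hypothetical). */
typedef struct Transform {
    float x;
    float y;
} Transform;

typedef struct Health {
    int current;
    int max;
} Health;

/* Player HAS a transform and HAS health; it does not derive from either. */
typedef struct Player {
    Transform transform;
    Health health;
} Player;

/* Operates on the part, so any type that contains a Health can reuse it. */
void health_damage(Health* h, int amount)
{
    assert(h != NULL);
    h->current -= amount;
    if (h->current < 0) {
        h->current = 0;  /* clamp at zero */
    }
}
```

Member access never goes more than one level deep (`player.health.current`), however many parts the type is assembled from.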

The second part of the solution is to use macros where necessary to make casting easier to type and easier to manipulate in the event that the hierarchy changes later on in the project.

    /* example C struct hierarchy */
    typedef struct BaseType {
        int foo;
    } BaseType;

    typedef struct Middle {
        BaseType base;
        int bar;
    } Middle;

    typedef struct Derived {
        Middle base;
        int baz;
    } Derived;

    /* macros for casting to parent types */
    #define MIDDLETOBASETYPE(ptr) (&(ptr)->base)
    #define DERIVEDTOMIDDLE(ptr) (&(ptr)->base)
    #define DERIVEDTOBASETYPE(ptr) (MIDDLETOBASETYPE(DERIVEDTOMIDDLE(ptr)))

The resulting code will still be more verbose and explicit than C++, but can actually be a little more readable due to that explicitness. Purely in terms of C, the macros are superior to simply dereferencing the base member of the derived type because they allow you to merge the Middle type into either its parent or child type and then simply update the DERIVEDTOBASETYPE macro, without needing to update any other code that converts between Derived and BaseType.
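A short sketch of the macros in use, assuming the composite macro applies DERIVEDTOMIDDLE first and MIDDLETOBASETYPE second. The `basetype_set_foo` and `derived_init` functions are hypothetical and stand in for real API code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct BaseType { int foo; } BaseType;
typedef struct Middle   { BaseType base; int bar; } Middle;
typedef struct Derived  { Middle base; int baz; } Derived;

#define MIDDLETOBASETYPE(ptr)  (&(ptr)->base)
#define DERIVEDTOMIDDLE(ptr)   (&(ptr)->base)
#define DERIVEDTOBASETYPE(ptr) (MIDDLETOBASETYPE(DERIVEDTOMIDDLE(ptr)))

/* Hypothetical function written against the root type. */
void basetype_set_foo(BaseType* b, int value)
{
    assert(b != NULL);
    b->foo = value;
}

/* Hypothetical initializer: no ".base.base" chains at the call sites. */
void derived_init(Derived* d)
{
    assert(d != NULL);
    basetype_set_foo(DERIVEDTOBASETYPE(d), 1);
    DERIVEDTOMIDDLE(d)->bar = 2;
    d->baz = 3;
}
```

If Middle were later merged into BaseType, DERIVEDTOBASETYPE would shrink to `(&(ptr)->base)` and the `basetype_set_foo` call would compile unchanged. Only code that names Middle's own members, such as the `bar` assignment here, would need updating.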

Next Time

In the next part of this series, I'll go over how to implement C++'s method support in C, including both virtual and non-virtual methods.