The Warts of C

I am surprised how much I like the C language. I learned it because I felt I had to in order to make some fast code for OpenGL programming, and in the beginning I struggled to feel comfortable with what was going on. I believe you have to have a feel for what is happening "under the hood" to feel really comfortable with any language, and with C that meant learning about machine code. Once I had an imaginary physical CPU, stack and heap in my mind I found understanding C a lot easier.

And then I read an article explaining why C is not a low-level language and my computer is not a fast PDP-11, which kinda shattered my belief...

Part of understanding C for me was identifying its warts. In the beginning I didn't know what was a wart and what wasn't, so I couldn't tell whether something felt wrong because I hadn't grasped the concept properly or because it was a clunky part of the language. So here is my current understanding of what is clunky and what isn't.

Numeric types are messed up

How many bits in an int? How many in an unsigned long long int? The C language comes from a time when it was not a given that a machine's word size was a multiple of eight bits. This seems to be why the Wikipedia page on C types uses words like "usually" and "at least" when referring to sizes. It's why I like to use stdint.h: it gives me types like uint32_t. Vim even syntax-highlights these types, which I take as a signal that they are standard.
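A small sketch of the difference (the sizes of the plain types below depend on the platform; the stdint.h ones don't):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    /* The standard only guarantees minimum sizes for the plain
       types, so these can differ between platforms. */
    printf("int:                %zu bytes\n", sizeof(int));
    printf("unsigned long long: %zu bytes\n", sizeof(unsigned long long));

    /* Fixed-width types from stdint.h are exactly what they say. */
    printf("uint32_t:           %zu bytes\n", sizeof(uint32_t));
    printf("uint64_t:           %zu bytes\n", sizeof(uint64_t));
    return 0;
}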

To typedef or not to typedef

Many libraries do not typedef their structs, meaning something like this:

struct my_awesome_data {
    uint8_t field1;
    uint16_t field2;
};

Gets used like so:

struct my_awesome_data* data = awesome_func();

Whereas some libraries typedef their structs like this:

typedef struct my_awesome_data {
    uint8_t field1;
    uint16_t field2;
} my_awesome_data_t;

Or anonymously like this:

typedef struct {
    uint8_t field1;
    uint16_t field2;
} my_awesome_data_t;

Which gets used like this:

my_awesome_data_t* data = awesome_func();

To me this is a wart: how do you know which way is better? Some casual googling turns up strong opinions either way. As a beginner C programmer, how do you know what to do? Better to pick one way and stick to it.

Include files

I end up wrapping all my includes in #ifndef <MODULE>_H include guards so I don't waste a lot of time sorting out the dependency graph of my includes. Something there is a wart.
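That is, every header gets the same boilerplate (my_module is a made-up name here):

/* my_module.h -- guarded so that including it twice is harmless. */
#ifndef MY_MODULE_H
#define MY_MODULE_H

#include <stdint.h>

uint32_t my_module_do_thing(uint32_t input);

#endif /* MY_MODULE_H */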

I find include files in general to be awkward for a bunch of reasons. One of the more obscure ones: if I want to use libffi to dynamically call some function, I have to "understand" the parameters to that function, and the only real way to do that is to parse the C header files, which is a completely non-trivial task.

Interpreted language runtimes that interact with C libraries usually need some third thing, a binding layer where the function signatures are mapped out for that module. That means you can't just drop in a C library; you (probably) have to manually inspect the C header to work out how to assemble arguments for a library function call.
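To make the libffi point concrete, here is a minimal sketch (add_u32 is a made-up function standing in for something from a library): the signature has to be described by hand, because nothing in the compiled code records it.

#include <ffi.h>
#include <stdio.h>
#include <stdint.h>

static uint32_t add_u32(uint32_t a, uint32_t b) { return a + b; }

int main(void) {
    ffi_cif cif;
    ffi_type *arg_types[2] = { &ffi_type_uint32, &ffi_type_uint32 };
    uint32_t a = 2, b = 40;
    void *arg_values[2] = { &a, &b };
    ffi_arg result;

    /* The caller has to spell out the argument and return types;
       libffi cannot discover them from a header or object file. */
    if (ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 2,
                     &ffi_type_uint32, arg_types) == FFI_OK) {
        ffi_call(&cif, FFI_FN(add_u32), &result, arg_values);
        printf("%u\n", (uint32_t)result);
    }
    return 0;   /* build with -lffi */
}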

Modules

You want your code to be modular: first because it helps you organise your thoughts, and second because some of those modules might even come in useful in other projects. For the first use-case I usually have a bunch of *.h and *.c files in my project which I compile to *.o. My main program build then looks like gcc -o prog obja.o objb.o objc.o prog.c. But what if objb.o turns out to be super useful for other projects? How do I organise my code so that objb.o is "just there"?
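As a concrete sketch (using the objb module from the gcc line above; objb_compute is a made-up function), each module is just a source file compiled to an object file before the link step:

/* objb.c -- hypothetical module implementation.
   Built on its own with:  gcc -c objb.c -o objb.o
   and then linked into the program with the gcc line above. */
#include "objb.h"   /* declares objb_compute(), guarded like the header sketch above */

int objb_compute(int x) {
    return x * 2;
}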

The Debian guidelines on packaging shared libraries are daunting to say the least, and I get the feeling that packaging libraries is only for "real programmers" and not for mere mortals like myself. I notice that "single header libraries" are a bit of a thing, and I don't think I am alone in suspecting this is because distributing modules in C is a pain in the butt.

Furthermore, if I intend to include objb.o in a shared library I need to compile it with -fPIC so the code is position-independent. Does that mean I need to produce both an objb.o AND an objb_pic.o, just in case?

Any "system" I can come up with to manage this for myself feels like I am working against the tide.

Desires

I would love to see a modern C that doesn't try to do anything smarter than making the language simpler by removing some of the cruft. I am pretty convinced that CMake and automake do not (and cannot) tackle the underlying problem; they just add extra complexity on top to try to hide the warts.

I would also like to see self-describing binary object formats, so that code can read the contents of an object file and determine function parameters dynamically. Like an ELF symbol table on steroids. Maybe then we could move closer to doing away with includes. Something like DWARF I guess.

Awesomeness

Some things I really like about C:

No exceptions

A segfault is always a bug that should be fixed. Not having permission to write to a file is not an exception; it's an ordinary error you check for.
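A minimal sketch of what I mean (the path is just an example): the failure comes back as a return value plus errno, and you decide what to do with it.

#include <stdio.h>
#include <errno.h>
#include <string.h>

int main(void) {
    /* Opening a file we may not have permission for is an
       ordinary, checkable failure -- not an exception. */
    FILE *f = fopen("/root/notes.txt", "a");
    if (f == NULL) {
        fprintf(stderr, "fopen failed: %s\n", strerror(errno));
        return 1;
    }
    fclose(f);
    return 0;
}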