

The downsides of compile time evaluation

Originally from: https://c3.handmade.network/blog/p/8590-the_downsides_of_compile_time_evaluation

Macros and compile time evaluation are popular ways to extend a language. While macros fell out of favour by the time Java was created, they've returned to the mainstream in Nim and Rust. Zig has compile time execution, and Jai has both compile time execution and macros.

At one point I assumed that the more power macros and compile time execution provided, the better. I'll try to break down why I no longer think so.

Code with meta programming is hard to read

Macros and compile time execution form a set of meta programming tools, and in general meta programming has very strong downsides when it comes to maintaining and refactoring code. To understand code with meta programming, you must first resolve the meta program in your head; not until you do so can you think about the runtime code. This is exponentially harder than reading normal code.

Bye bye, refactoring tools

It's not just you as a programmer who needs to resolve the meta programming – any refactoring tool needs to do the same in order to perform refactorings safely, even simple ones such as variable renames.

And if a name is created through some meta code, the refactoring tool would basically need to reprogram your meta program to stay correct, which is unreasonably complex. This is why everything from preprocessor macros to reflection simply won't refactor correctly with tools.
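
To make this concrete, here's a minimal sketch (DECLARE_GETTER and Point are made up for illustration; the snippet is valid C and C++) of how token pasting creates identifiers that no rename tool can find in the source:

#include <stdio.h>

struct Point { int x; int y; };

/* Token pasting: get_<field> only exists after preprocessing; the
   identifier get_x is never spelled out anywhere in the source. */
#define DECLARE_GETTER(field) \
    int get_##field(const struct Point *p) { return p->field; }

DECLARE_GETTER(x) /* expands to: int get_x(const struct Point *p) { ... } */
DECLARE_GETTER(y)

int main(void)
{
    struct Point p = { 1, 2 };
    /* A tool asked to rename get_x would have to rewrite the macro
       itself; a plain textual rename cannot work. */
    printf("%d %d\n", get_x(&p), get_y(&p));
    return 0;
}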

Making it worse: arbitrary type creation

Some languages allow arbitrary types to be created at compile time. Now the IDE can't even know what the types look like unless it runs the meta code, and if the meta code is arbitrarily complex, the IDE must be equally complex to "understand" the code. While the meta programming evaluation might be nicely ordered when running the compiler, a responsive IDE will try to compile source files incrementally, which means it has to compile more code than it otherwise would just to get the ordering correct.

Code and meta code living together

Many languages try to make code and meta code look very similar, which leads to lots of potential confusion. Is a given identifier a compile time variable (which may change during compilation, so any expression containing it might be compile time resolved) or is it a real runtime variable?

Here's some Zig code (cmd_fns is assumed to be a comptime-known list of name/function pairs) – how easy is it to identify the meta code?

fn performFn(comptime prefix_char: u8, start_value: i32) i32  {
    var result: i32 = start_value;
    comptime var i = 0;
    inline while (i < cmd_fns.len) : (i += 1) {
        if (cmd_fns[i].name[0] == prefix_char) {
            result = cmd_fns[i].func(result);
        }
    } 
    return result;
}

I've tried to make it easier in C3 by not mixing meta and runtime code syntax. This is similar to how macros in C are conventionally written in all upper case to avoid confusion:

macro int performFn(char $prefix_char, int start_value)
{
    int result = start_value;
    // Prefix $ all compile time vars and statements
    $for (var $i = 0; $i < CMD_FNS.len; $i++):
        $if (CMD_FNS[$i].name[0] == $prefix_char):
            result = CMD_FNS[$i].func(result);
        $endif;   
    $endfor;   
    return result;
}

The intention with the separate C3 syntax is that the approximate runtime code can be found by removing all lines starting with $:

macro int performFn(char $prefix_char, int start_value)
{
    int result = start_value;


            result = CMD_FNS[$i].func(result);


    return result;
}

Not elegant, but the intention is to maximize readability. In particular, look at the "if/$if" statement. In the top example you can only infer that it is compile time evaluated and folded by looking at the definitions of i and prefix_char. In the C3 example, the $if itself guarantees the constant folding, and the compiler will report an error if the boolean expression inside the () isn't compile time folded.

Extending syntax for the win?

A popular use for macros is extending syntax, but this often goes wrong. Even if you have a language with a macro system that does this well, what does it mean? It means that suddenly you can't look at something like foo(x) and make assumptions about it. In C without macros we can assume that neither x nor any other local variable will change (unless they have been passed by reference to some function prior to this), and that the code will resume running after the foo call (except if setjmp/longjmp is used). With C++ we can assume less, since foo may throw an exception, and x might implicitly be passed by reference.

The more powerful the macro system, the less we can assume. Maybe it's pulling variables from the calling scope and changing them? Maybe it's returning from the current context? Maybe it's formatting the drive? Who knows. You need to know the exact definition or you can't read the local code, and this undermines the idea of most languages.
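
Even plain C macros can break these assumptions; a toy sketch (RESET_AND_BAIL is a made-up macro):

/* A macro can silently mutate the caller's locals and even return
   from the enclosing function - nothing in the call syntax reveals it. */
#define RESET_AND_BAIL(x) do { (x) = 0; return -1; } while (0)

int process(int value)
{
    if (value < 0)
        RESET_AND_BAIL(value); // looks like a function call, but assigns and returns
    return value * 2;
}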

In a typical language you know what "breaks the rules": all the built-in statements like if, for and return. Then there is a way to extend the language that follows certain rules: functions and types. This forms the common language understood by a developer – what "knowing a language" is about: you know the syntax and semantics of the built-in statements.

If the language extends its syntax, then every code base becomes a DSL which you have to learn from scratch. This is similar to having to buy into some huge framework in the JS/Java-space, just worse.

The point is that while we're always extending the language in some sense, doing this through certain limited mechanisms like functions works well, but the more unbounded the extension mechanism, the harder the code becomes to read and understand.

When meta programming is needed

In some cases meta programming can make code more readable. If the problem is something like having a pre-calculated list for fast calculations, or types defined from a protocol, then code generation can often solve it. Languages could improve this with better compiler support for triggering codegen.
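
As a sketch of the codegen route (the file names and the table are invented for illustration), a standalone generator run as a pre-build step can emit a plain header, keeping the main program free of meta code:

// gen_table.cpp - hypothetical pre-build step that writes sin_table.h,
// so the program includes a plain, readable array instead of computing
// it with compile time evaluation.
#include <cmath>
#include <cstdio>

int main(void)
{
    const double pi = 3.141592653589793;
    std::printf("// Generated by gen_table.cpp - do not edit.\n");
    std::printf("static const double SIN_TABLE[256] = {\n");
    for (int i = 0; i < 256; i++)
        std::printf("    %.17g,\n", std::sin(i * 2.0 * pi / 256.0));
    std::printf("};\n");
    return 0;
}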

In other cases meta programming can be replaced by running code at startup. Having "static init", like Java's static blocks, helps for cases where libraries need to do initialization.
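
A minimal C++ sketch of the startup alternative (the same invented table as above): the logic stays ordinary, debuggable runtime code that runs once before main().

#include <array>
#include <cmath>

static std::array<double, 256> make_sin_table()
{
    std::array<double, 256> t{};
    const double pi = 3.141592653589793;
    for (int i = 0; i < 256; i++)
        t[i] = std::sin(i * 2.0 * pi / 256.0);
    return t;
}

// A dynamic initializer of a global runs at startup, before main(),
// much like a Java static block runs when the class is loaded.
static const std::array<double, 256> SIN_TABLE = make_sin_table();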

If none of those options work, there is always copy-paste.

Summary

So to summarize:

  • Code with meta programming is hard to read (so minimize and support readability).
  • Meta programming is hard to refactor (so adopt a subset that can work with IDEs).
  • Arbitrary type creation is hard for tools (so restrict it to generics).
  • Same syntax is bad (so make meta code distinct).
  • Extending syntax with macros is bad (so don't do it).
  • Codegen and init at runtime can replace some use of compile time.

Macros and compile time execution can be made extremely powerful, but this power is tempered by huge drawbacks. A good macro system is not measured by what you can do with it, but by whether it manages to balance readability with the necessary features.

Comments




Comment by Christoffer Lernö

Having macro meta syntax that is different from the regular syntax helps, but whenever compile time and runtime code mix, the readability goes down. Trying to keep two sets of states in your head at the same time is not trivial and affects code reading.

If you do plain code generation with a code generator (and actually produce a final source file), you have fewer restrictions on how you express the code generation than when the generating code lives in the same source as the code it produces.

If you run code at startup rather than at compile time, you will have an easier time understanding it and inspecting what it produces.

And so on.

Using compile time evaluation for these things creates a very generic in-language tool, and such a tool will by necessity be less easy to work with than a specialized one (such as a custom code generator). These drawbacks need to be taken into account and balanced against the advantages.

"auto" is a language design smell

Originally from: https://c3.handmade.network/blog/p/8587-auto_is_a_language_design_smell

It's increasingly popular to use type inference for variable declarations.

– and it's understandable; after all, who wants to write something like Foobar<Baz<Double, String>, Bar> more than once?

I would argue that "auto" (or your particular language's equivalent) is an anti-pattern when the type is fully known.

When is type inference used?

Few are arguing for replacing:

int i = get_val();

by

auto i = get_val();

The latter is longer and gives less information. Still, some "auto all the things!" fanatics argue that it is right: maybe some day you change what get_val() returns, and then you need to change one less place – so rather than getting a type error where the function is invoked, you get it later at some other place, making it extra hard to debug...

But most people will argue it's mainly for when the type gets complex. For example:

std::map<std::string,std::vector<int> >::iterator it = myMap.begin();
// vs
auto it = myMap.begin();

Another important use is when you write macros or templates and the type has to be inferred. Here's a C3 example:

// No type inference
macro @swap1(&a, &b)
{
  $typeof(a) temp = a;
  a = b;
  b = temp; 
}
// vs
macro @swap2(&a, &b)
{
  var temp = a;
  a = b;
  b = temp; 
}

So we have two common cases:

  • When the type is unknown
  • When the type name grows long and complex.

Where do long type names come from?

No one is arguing against the use of type inference when the type isn't known or generic – this use makes perfect sense.

– But there is a problem with the auto it = myMap.begin() use, where type inference is a desired shorthand only because the type names are too long.

Type names only become long because parameterized types usually carry their parameterization in their name (well, some Java "enterprise" code manages long type names anyway, but that's beside the point).

This inevitably causes type signatures to blow up. It's usually possible to write typedefs to make the types shorter, but few do, because it's convenient to just use the type directly with its parameters rather than adding type definitions – plus the parameterization is sometimes actually helpful for determining whether the type matches a particular generic function.
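
For instance, in C++ a single alias declaration removes the pressure to write auto at every use site (IntListMap and count_entries are made-up names):

#include <map>
#include <string>
#include <vector>

// One alias at the declaration site keeps every later mention short
// and explicit, without inferring the type away.
using IntListMap = std::map<std::string, std::vector<int>>;

int count_entries(const IntListMap &myMap)
{
    int n = 0;
    for (IntListMap::const_iterator it = myMap.begin(); it != myMap.end(); ++it)
        n += (int)it->second.size();
    return n;
}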

So basically, the way we parameterize types in most languages causes the type name blowup that is then mitigated with type inference.

Again, the problem with type inference

I'm not going to rehash the arguments made here: https://austinhenley.com/blog/typeinference.html. I am mostly in agreement with them.

I think the most important thing is that the type declarations locally document the assumptions in the code. If I ever need to "hover over a variable in the IDE to find the type" (as some suggest as a solution), it means that the type is unclear from the local code context. Since the type of a variable is fundamental to how the code works, this should never be unclear – which is why the type declaration serves as strong support for code reading. (Explicit variable types also make it easy to text search for type usage and for the IDE to track types.)

While this is bad, the problem with long type signatures often makes up for it. Type inference becomes a necessity because of how parameterized types work.

I would strongly object to introducing type inference in languages that don't have issues with long type names, such as C (or C3), because fundamentally it makes code less clear to read and, consequently, bugs harder to catch.

The design smell

"auto" is a language design smell because it is typically a sign of the language having types parameterized in a way that makes them inconveniently long.

The type inference thus becomes a language design band-aid which lets people ignore tackling the very real issue of long type names.

If long type names are bad, why is everyone doing it?

Unfortunately there is an added complication: there aren't many good alternatives. Requiring typedefs in order to use parameterized types works, but is not particularly elegant.

There are other possibilities that could be explored, such as eliding the parameterization completely but retaining the rest of the type (e.g. iterator it = myMap.begin()), and similar ideas that straddle both inference and explicit types, trying to get the best of both worlds.

Such explorations are uncommon though, which the "auto" style type inference is probably to blame for. A popular band-aid is easier to apply than to find a more innovative solution.


The case against a C alternative

Originally from: https://c3.handmade.network/blog/p/8486-the_case_against_a_c_alternative

Like several others, I am writing an alternative to the C language (if you've read this blog before, this shouldn't be news!). My language (C3) is fairly recent; there are others: Zig, Odin, Jai and older languages like eC. Looking at C++ alternatives there are languages like D, Rust, Nim, Crystal, Beef, Carbon and others.

But is it possible to replace C? Let's consider some arguments against.

1. C language toolchain

The C language is not just the language itself but all the developer tools developed for it. Do you want to do static analysis on your source code? There are a lot of people working on that for C. Tools for detecting memory leaks, data races and other bugs? There are a lot of those too, even if your language has better tooling out of the box.

If you want to target some obscure platform, it most likely assumes you're using C.

The status of C as the lingua franca of today's computing makes it worthwhile to write tools for it, so there are many tools being written.

If someone has a working toolchain set up, why risk switching languages? A "better C" must bring a lot of added productivity to justify the time spent setting up a new toolchain – if that is even possible.

2. The uncertainties of a new language

Before a language has matured, it's likely to have bugs and might change significantly to address problems with the language semantics. And is the language even as advertised? Maybe it promises something like "great compile times" or "faster than C" – only for these goals to turn out to be hard to reach as the language adds its full set of features.

And what about maintainers? Sure, an open source language can be forked, but I doubt many companies are interested in using a language that they further down the line might be forced to maintain.

Betting on a new language is a big risk.

3. The language might just not be good enough

Is the language even addressing the real pain points of C? It turns out that people don't always agree on what the pain points of C are. Memory allocation, array and string handling are often tricky, but with the right libraries and a sound memory strategy the pain can be minimized. Is the language possibly addressing problems that advanced users don't really worry about? If so, its actual value might be much lower than expected.

And worse, what if the language omits crucial features that are present in C – features that advanced C programmers rely on? This risk increases if the language designer hasn't used C a great deal but comes from C++, Java etc.

4. No experienced developers

A new language will naturally have a much smaller pool of experienced developers. For any midsize to large company that's a huge problem: the more developers there are available, the better companies like it.

Also, while the company has experience recruiting for C developers, it doesn't know how to recruit for this new language.

5. The C ABI is the standard for interoperability

If the language can't easily call – or be called by – C code, then anyone using the language will have to do extra work for pretty much anything that interfaces with outside code. This is potentially a huge disadvantage.

"Better X" doesn't matter

So those are some of the downsides of not picking C, to be offset by the advantages of picking the alternative. However, language designers often overestimate how big an advantage their added "features" bring. Here are some common "false advantages":

1. Better syntax

Having a "better syntax" than C is mostly subjective. Different syntax is also a huge disadvantage: now you can't copy code from C; you might even have to rewrite every single line. No company will adopt a language because it has slightly better syntax than C.

2. Safer than C

Any C alternative will be expected to be on par with C in performance. The problem is that C has practically no checks, so any safety checks put into the competing language will have a runtime cost, which is often unacceptable. This leads to a strategy of only having checks in "safe" mode, where the "fast" mode is just as "unsafe" as C.

There are some exceptions: "foreach" avoids manually written bounds checks and so is automatically safer. Similarly, slices help in writing checks compared to "pointer + len" (or worse: null terminated arrays).
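
A hedged sketch of why slices make such checks cheap to get right (IntSlice is an invented type, not any particular language's):

#include <cassert>
#include <cstddef>

// Pointer and length travel together, so the bounds check lives in
// one place instead of being repeated ad hoc at every call site.
struct IntSlice
{
    int *ptr;
    std::size_t len;

    int &operator[](std::size_t i)
    {
        assert(i < len); // active in "safe" builds, compiled out with NDEBUG
        return ptr[i];
    }
};

int sum(IntSlice s)
{
    int total = 0;
    for (std::size_t i = 0; i < s.len; i++)
        total += s[i];
    return total;
}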

3. Programmer productivity

First of all, pretty much every language ever has made vacuous claims of "higher programmer productivity". The problem is that for a business this usually doesn't matter. Why? Because the actual programming is not the main time sink. In a business, what takes time is figuring out what the task really is. So something like a 10% or 20% "productivity boost" won't even register. A 100% increase in productivity might show, but even that isn't guaranteed.

What matters?

So if these don't matter, what does? For a business it's whether despite the downsides the language can help the bottom line: "Is this valuable enough to outweigh the downsides?"

But if "better x" doesn't help - what does? Well... "killer features": having unique selling points that C can't match.

Look at Java: when it was released it offered the following features that most of the competing languages couldn't give you:

  • OO done cleanly (OO was hot at the time)
  • Threading out of the box (uncommon at the time)
  • "Write once run anywhere"
  • "Run your code in the browser"
  • Garbage collection built in
  • Network programming
  • Good standard library
  • Free to use

That's not just one but eight(!) killer features. How many of those unique selling points do the C alternatives have? Fewer than Java had, at least!

The next killer feature

So my argument is that a common way for a language to get adopted is by being the only language you can use for something: Dart for Flutter, JS for scripting the browser, Java for applets, ObjC for Mac and iOS apps.

Even if those monopolies disappear over time, they help the language become known and used.

Similarly there are examples where frameworks have been popular enough to boost languages, Ruby and Python are good examples.

So looking at our example languages, Jai's strategy of bundling a game engine seems good: anyone using it will have to learn Jai, so if the engine is good enough people will learn the language too.

But aside from Jai, is any C alternative really looking to pursue killer features? And if a language doesn't have one, how does it prove the switch from C is worth it? It can't.

Conclusion

The "build it and they will come" idea is tempting to believe in, but there is frankly little reason to drop C unless the alternative has important unique features and/or products that C can't hope to match.

While popularity and enthusiasm are helpful, they cannot replace proven value. In the end, all that matters is whether using a language can produce more tangible value for developers than C, for at least a large subset of what C is used for. While developers may be excited by new languages, that enthusiasm doesn't translate to business value.

So no matter how exciting that C alternative may look, it probably will fail.


Optional syntax

Originally from: https://c3.handmade.network/blog/p/8460-optional_syntax

In C3, optionals are built into the language. They're not run of the mill optionals, as they carry an "optional result value". This makes them more like "Result" types than plain optionals.

In C3 you declare a variable holding an optional using the ! suffix:

int! x = ...

We can now assign either to the real value, or to the optional result:

int! x = 1; // x is a real value
x = MyRes.MISSING!; // x is assigned an optional
// x = MyRes.MISSING; <- Error: cannot assign "MyRes" to int

If we think of it in terms of a "Result":

Result<int> x;
x.result = 1; // x = 1
x.error = MyRes.ERR; // x = MyRes.ERR!
x.result = MyRes.ERR; // x = MyRes.ERR - fails

So the "clever" ! suffix here is used to assign to the "error" part of the Result. Unfortunately, the suffix is hard to read at the end of a line, where ! and ; often blur together. For that reason I regularly revisit this syntax to see if I can improve on it.

It's used in two cases:

  1. assignment: x = MyRes.ERR!
  2. return: return MyRes.ERR!

While (2) could be replaced by something like return! MyRes.ERR or even raise MyRes.ERR, the assignment is not as easily tackled.

Naive ideas could be to use some symbol salad like:

int! x !!= MyRes.MISSING;
int! x <!= MyRes.MISSING;
int! x <- MyRes.MISSING;
// I'm going to exclude
// int! x := MyRes.MISSING
// as it is used as regular assign in most languages.

Or we could allow those return statements to have a different meaning in an assignment:

int! x = raise MyRes.MISSING;
int! x = return! MyRes.MISSING;

While it's possible, it creates an odd effect if we consider this example:

int! x;
return x = return! MyRes.MISSING;

This also illustrates that x = MyRes.MISSING! should be thought of as implicitly doing x = { 0, MyRes.MISSING }.

With that understanding, we can see how it works:

x = MyRes.MISSING!; // x = { 0, MyRes.MISSING }
return MyRes.MISSING!; // return { 0, MyRes.MISSING }
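
A hedged C++ sketch of this mental model (Fault and OptionalInt are invented names):

// The optional is a value plus a fault slot; assigning a fault
// clears the value part, mirroring x = { 0, MyRes.MISSING }.
enum class Fault { NONE, MISSING, ERR };

struct OptionalInt
{
    int value = 0;
    Fault fault = Fault::NONE;
};

void examples(OptionalInt &x)
{
    x = { 1, Fault::NONE };    // x = 1;
    x = { 0, Fault::MISSING }; // x = MyRes.MISSING!;
}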

So really the proper way would be to always translate the !, like this:

x = fault MyRes.ERR;
return fault MyRes.ERR;

Which is a mouthful. One could of course contract that return fault into something like:

x = fault MyRes.ERR;
throw MyRes.ERR;

Unfortunately, this builds in the assumption that a return may not return an optional, which of course it can:

int! x = ...
return x; // Optional, so it may be like a "throw" or not

If we want to be super clear we can do something like this:

int! x = ...
if (y) return? x; // Might return an optional
if (z) return! MyRes.MISSING; // Will return an optional.
if (w) return w; // Will not return an optional

Due to the ? being a rethrow, we could require this:

int! x = ...
if (y) return x?; // Use rethrow to make the type int
if (z) return! MyRes.MISSING; // Will return an optional.
if (w) return w; // Will not return an optional

So the question here is whether this adds anything over the original:

int! x = ...
if (y) return x;
if (z) return MyRes.MISSING!; 
if (w) return w;

These are questions that need quite a bit of C3 error handling code to decide, so for now things have to stay as they are.


Why implicit imports fails

Originally from: https://c3.handmade.network/blog/p/8448-why_implicit_imports_fails

As previously discussed, it might be possible to do implicit imports, so that using Foo would implicitly import it. In C3, thanks to the overall rules, this leads to few ambiguities (go back to that blog post to review how it works).

After using this for quite a while, I ended up concluding that fully implicit imports are bad. You want enough high level importing that there is some documentation of what is included, hinting at the possible origin of types.

An example is when you read some code that relies on an external graphics library and encounter a type like Point or Vector2. At that point you can't be sure whether the type comes from the external library or from some obscure part of the standard library. The same goes for something like Socket or Connection: is that Socket from the standard library's networking module, or from some external imported library? If the standard library is big enough, you can't know for sure – and finding out is not easy.

So you want at least a high level import – though perhaps not at import std::net::socket granularity, but rather something like import std::net or import raylib at the top of the file – enough to make it easy to find the types and functions.

So the new updated scheme has wildcard inclusion by default (so import std::net would include all the sub modules).

In addition, I've also made modules implicitly import any other module with the same top domain. So code in std::net::socket would see the code in std::net::http without the need for an explicit import.

This means that if you start a project with some top module, for example mygame, then in the module mygame::gameloop you'll automatically import mygame::maths and mygame::data.

There are some issues with the latter. In particular, all of the standard library modules would see all other standard library modules! It's quite possible to address that, but first I want to make sure it's a problem in practice. Even completely implicit imports "almost worked", so maybe this isn't much of a problem.


Imports and modules

Originally from: https://c3.handmade.network/blog/p/8417-imports_and_modules

When talking about packages / modules, I think it's useful to start with Java. As a language with C/C++-style syntax but with an import / module system from the beginning, it ended up being very influential.

Importing a namespace or a graph

Interestingly, the import statement in Java doesn't actually import anything. It's a simple namespace folding mechanism, allowing you to use something like java.util.Random as just Random. The fact that you can use a fully qualified name later in the source code to implicitly use another package means that the imports do not fully define the dependencies of a Java source file.

In Java, given a collection of source files, all must be compiled to determine the actual dependencies. However, we can imagine a different model where the import statements create a dependency graph, starting from the source file that is the main entry point. In this model we may have N source files, but not all are even compiled, since only a subset M can be reached through the import graph.

This latter model allows some extra features. For example, we can build the feature where importing a source file may also implicitly cause a dynamic or static library to be linked. Because only the source code in the graph is compiled, we'll only get the extra link parameter if the imports reach the source file carrying that parameter.

The disadvantage is that the imports need to have a clear way of finding the additional dependencies. This is typically done with a file hierarchy or strict naming scheme, so that importing foo.bar allows the compiler to easily find the file or files that define that particular module.

Folding the import

For module systems that allow sub modules, so that there's both foo.bar and foo.baz, the problem of verbosity appears: do we really want to type std.io.net.Socket everywhere? I think the general consensus is that this is annoying.

The two common ways to solve this are namespace folding and namespace renaming, but I'm going to present one more which I term namespace shortening.

Namespace folding is the easiest: you import std.io.net and can now use Socket unqualified. This is how it works in Java. However, we should note that in Java any global or function is actually prefixed with its class name, which means that even when folding the namespace, your globals and "functions" (static methods) end up having a prefix.

To overcome collisions and the shortcomings of namespace folding, there's namespace renaming, where the import explicitly renames the module in file scope: std.io.net might become n, and you then use n.Socket rather than the fully folded or fully qualified name. The downside is having to name this namespace alias. Naming things well is known to be one of the harder problems in programming, and it can also add to the confusion if the alias is chosen differently in different parts of the program, e.g. n.Socket in one file and netio.Socket in another.

A way to address the renaming problem is to recognize that usually only the last namespace element is sufficient to distinguish one function from another, so we can allow an abbreviated namespace, allowing the shortened namespace to be used in place of the full one. With this scheme std.io.net.open_socket(), io.net.open_socket() and net.open_socket() are all valid as long as there is no ambiguity (for example, if an import made foo.net.open_socket() available in the current scope, then net.open_socket() would be ambiguous and a longer path, like io.net.open_socket() would be required). C3 uses this scheme for all globals, functions and macros and it seems successful so far.

Lots of imports

In Java, imports quickly became fairly onerous to write, since using a class foo.bar.Baz would often mean using another class like foo.bar.Bar, and now both needed to be imported. While wildcard imports helped a bit, they would pull in more classes than necessary, so inspecting the import statements would obscure the actual dependencies.

As a workaround, languages like D added the concept of re-exported imports (D calls this feature "public imports"). So in our foo.bar.Baz case, it could import foo.bar.Bar and re-export it. So that an import of foo.bar.Baz implicitly imports foo.bar.Bar as well. The downside here again is that it's not possible from looking at the imports to see what the actual dependencies are.

A related feature is implicit imports determined by the namespace hierarchy. So for example in Java, any source file in the package foo.bar.baz has all the classes of foo.bar implicitly folded into its namespace. This folding goes bottom up, but not the other way around. So while foo.bar.baz.AbcClass sees foo.bar.Baz, Baz can't access foo.bar.baz.AbcClass without an explicit import.

An experiment: no imports

For C3 I wanted to try going completely without imports. This was feasible mainly due to two observations: (1) type names tend to be fairly universally unique, and (2) methods and globals are usually unique given a shortened namespace. So given Foo and foo::some_function(), these should mostly be unique without the need for imports. This is a completely implicit import scheme.

This is complemented by the compiler requiring the programmer to explicitly say which libraries should be used for compilation. So imports could be said to be done globally for the whole program in the build settings.

This certainly works, but has a drawback: let's say a program relies on a library like Raylib. Raylib in itself will introduce a lot of types and functions, and while it's no problem to resolve them, it could be confusing for a casual reader ("Oh, a Vector2 – is this part of the C3 standard library?"), whereas having an import raylib; at the top would immediately hint to the reader where Vector2 might be found.

Wildcard imports for all?

The problem with zero imports suggests an alternative: wildcard imports as the default. import raylib; would be the standard kind of import and would recursively import everything in raylib, and similarly import std; would get the whole standard library. This would be more for the reader of the code to find the dependencies than necessary for the compiler.

One problem with this design is the sub module visibility rules: what do foo::bar::baz and foo::bar see?

Java would allow foo::bar::baz to see the foo::bar parent module, but not vice versa. However, looking at the actual usage patterns, it seems to make sense to make this bidirectional, so that all are visible to each other.

But if parent and child modules are visible to each other, what about sibling modules? E.g. does foo::bar::baz see foo::bar::abc? In actual use cases there are arguments both for and against. And if we have sibling visibility, what about foo::def and foo::bar::abc? Should they be visible to each other? And if not, do such rules get complicated?

To create a more practical scenario, imagine that we have the following:

  1. std::io::file::open_filename_for_read() a function to open a file for reading
  2. std::io::Path representing a general path.
  3. std::io::OpenMode a distinct type for a mask value for file or resource opening
  4. std::io::readoptions::READ_ONLY a constant of type OpenMode

Let's say this is the implementation of (1):

fn File* open_filename_for_read(char[] filename)
{
  Path* p = io::path_from_string(filename);
  defer io::path_free(p);
  return file::open_file(p, readoptions::READ_ONLY);
}

Here we see that std::io::file must be able to use std::io and std::io::readoptions. The readoptions sub module needs std::io but not the file sub module. Note how C3 uses functions in sub modules the way other languages typically use static methods. If we want to avoid excessive imports in this case, then file needs sibling and parent visibility, whereas the readoptions use only requires parent visibility.

Excessive rules around visibility are hard to implement well, hard to test and hard to remember, so it might be preferable to simply say that a module has visibility to any other module under the same top module. The downside would of course be that visibility is much wider than what's probably desired (e.g. std::math having visibility to std::io).
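
Under that rule, visibility would look something like this (module names are only for illustration):

module std::math; // Same top module as std::io, so it sees std::io, std::io::file, …
module std::io;   // Likewise sees std::math, whether that is desired or not.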

Conclusions and further research for C3

Like everything in language design, imports and modules come with a lot of trade-offs. Import statements may be used to narrow down the dependency graph, but at the same time a language with a lot of imports doesn't necessarily use them in that manner. For namespace folding it matters a lot whether functions are usually grouped as static methods or free functions. Imports can be used to implicitly determine things like linking arguments, in which case the actual import graph matters.

For C3, the scheme with implicit imports works thanks to library imports also being restricted by build scripts, but high level imports could still improve readability. However, such a scheme would probably need recursive imports, which raises the question of implicit imports between sub modules. For C3 in particular this is an important usability concern, as sub modules are used to organize functions and constants more than is common in many other languages. This is the area I'm currently researching, but I hope that within a few weeks I can have a design candidate.


Do you know why your language will fail?

Originally from: https://c3.handmade.network/blog/p/8341-do_you_know_why_your_language_will_fail

Looking at old presentations of programming languages that never managed to catch on, I am often very interested in figuring out just why they failed.

Why is this useful?

I think it's useful for language designers to consider why some things fail and why some things succeed. In the end a language is serving some intended group of users [1], so ask the question "why didn't it succeed in doing that?"

I believe it's an important thing to ask, because the answer often isn't that "the language was bad". It often wasn't a bad language, but there was still something it failed to do which prevented people from using it.

It also implies that in order to actually serve a group of users (the presumed goal of a language), we not only need to create a good language, but also a language which succeeds in reaching those users.

In order to succeed at language design we must not only make sure that the language is good, but also ensure that there is a way for the intended users to make use of it.

Why do languages fail?

The obvious and most common way a language can fail is by never being completed. It doesn't matter how good the features are if the language can't be implemented.

The second big thing I see is the "build it and they will come" thinking. That is, the idea that all you need to do is to write a sufficiently good language and then somehow that should be enough to make the language universally adopted by everyone. Unfortunately reality does not work like that.

Looking at successful languages, there is no real common pattern. Corporate backing helps, but isn't a guarantee. Lots of initial interest is good, but doesn't mean the language will be a success, and so on.

But while it's difficult to clearly make out the road to success, we can still study other languages for clues on how to avoid the paths leading to failure.

In the end the failure to understand why languages fail is the biggest reason why languages fail.


[1] Zig has "together we serve the users" in its mission statement – addressing exactly this.


Are modules without imports "considered harmful"?

Originally from: https://c3.handmade.network/blog/p/8337-are_modules_without_imports_considered_harmful

Can you really do a module system without import statements? And should you? If you’re like me you’d probably initially dismiss the idea: “surely that can only work for very simple examples!”

But someone filed an issue to add it to C3, so I had to explain why it would be difficult / impossible to do well (this actually ended with me redesigning the module system quite a bit). – But the question of whether it's possible stuck with me.

Why it shouldn't work

Let’s quickly review the problems with no imports (where modules are loaded automatically).

1. Ambiguities

The classic example is the function “open”, which would clash with open in all other modules, making it necessary to use the full module names:

module foo;
fn File* open(char* filename) { … }

module bar;
fn Connection* open(char* url) { … }

module baz;
fn void test()
{
   open("foo.txt"); // Which one is intended?
}

2. Bad search & code completion

When all files are basically importing everything, every public function has to be listed for code completion or search.

If you'd just match your own code it wouldn’t be so bad, but add to that the whole standard library + any library you’re importing… you'll get a lot of matches.

3. Compiling more than necessary

Some languages use imports to figure out exactly which files to compile. Implicitly having everything imported means everything needs to be analyzed during compilation.

4. Dependencies are not obvious

Explicit imports help both readers of the source code and tools like IDEs by limiting, in a simple way, which files the current file depends on.

Summing it up

All in all the situation looks pretty grim, so there's a reason why we don't see this.

There are outliers: pre-namespace PHP, and from what I’ve heard there’s a Prolog variant which has a form of auto import as well. Unfortunately these examples offer very little in terms of encouragement.

Making a try

Despite this I found that I personally couldn't really dismiss the idea entirely; for my own peace of mind I had to make sure it wasn't possible. Let's revisit the problems:

1. Ambiguities

In this case I actually had the problem halfway solved: in C3 all functions are expected to be called with at least a partial path qualifier.

To call the function foo() from module std::bar in another module you have to write bar::foo() to call it (std::bar::foo() works as well).
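
As a minimal sketch (the calling module is invented for illustration):

module std::bar;
fn void foo() { … }

module my_app;
fn void test()
{
    bar::foo();      // Shortened path: the trailing part of the module path.
    std::bar::foo(); // The full path works as well.
}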

I haven't seen the idea of using abbreviated module paths elsewhere, and so it seems to be a novel invention. It should be possible to implement in any namespace scheme where namespaces are separate from types.

However, in C3 structs and other user defined types do not require any qualifiers. The reasoning is that type names in general tend to be fairly unique, except where two libraries try to abstract the same thing (for example, two file IO libraries will probably both use File as a type name somewhere).

Name collisions are rare with explicit imports, but for implicit imports this might become a real issue.

module foo::io;
struct File { ... }

module std::io;
struct File { ... }

module bar;
File *f; // Which File is this?

Fortunately we can introduce a simple feature to help us out: we reintroduce import, but change its meaning so that it simply makes the imported module’s types and functions preferred over modules that aren’t imported when doing type resolution.

So returning to the example with File: rather than having to type foo::io::File to disambiguate it from std::io::File, we simply add import foo::io to the start of the file:

module bar;
import foo::io;

File *f; // This is foo::io::File

If we sort of squint at it, this is actually a little like how Java’s imports work: they only add the possibility to use the imported classes without qualifiers.

So it seems that (1) can be considered solvable for any language that is fine with path qualifiers in front of functions and globals, like C3.

3. Compiling more than necessary

For reasons that will become apparent later, let's jump to this point first.

Trying to solve this requires us to look at our compilation model in general. For the more extreme version of this, let’s assume that all our libraries are in source form rather than precompiled. We can say we roughly have 3 types of source code: the application code, external libraries and the standard library.

In C3 you already specify the libraries you want to add in the project settings. The problem here is libraries that bring in their own dependencies.

There’s a simple model we could use here:

  • the application code only sees what is public in the libraries actually imported;
  • the external libraries are resolved seeing only the dependencies they have, and not the application code.

Let’s say you have a library which allows you to set up an HTTPS service, which in turn uses a crypto library: your application code will not see the crypto library and the HTTPS service will not see other libraries that the application code uses.

To summarize:

  1. Application code: sees library and standard library public types, variables and functions.
  2. Library: sees only public declarations of its own dependencies and the standard library.
  3. Standard library: only sees itself.

Here we're moving dependencies and imports from the source files into the build configuration.
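
As a sketch of what that could look like, here is a hypothetical build configuration (the format and keys are invented for illustration, not an actual c3c schema):

application project file (hypothetical):
    dependencies = [ "https_service" ]

https_service library manifest (hypothetical):
    dependencies = [ "crypto" ]

The application resolves only against the public declarations of https_service; crypto remains visible to https_service alone.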

Unfortunately, in practice we will likely still parse most of the code and only decide after analysis what needs to be lowered into actual code. In other words, this is not necessarily a win. Parsing and semantic analysis are a small part of the compile time, so avoiding them for some code doesn't necessarily help much.

Java "modules"

Taking a detour now: Java has a large standard library, and frameworks typically have a fair amount of additional dependencies. To address this, Java introduced “modules” in Project Jigsaw (not to be confused with the Java packages that are used with import). Jigsaw modules essentially group Java packages, described in a special file (module-info.java) that also specifies dependencies on other “modules”. The idea is to drastically reduce the number of packages that need to be bundled for an application.

This is very similar to the compilation issue above. By providing a file which describes in detail what parts of the libraries the application uses, the compiler can actually begin with those library definitions before lexing and parsing starts. So in your app you could perhaps not just define the libraries you want to use, but also specify the subset of the modules you actually depend on. In practical terms, we define in a single place what our imports are, and the compiler just needs to work with this subset. This is sort of an analogue of keeping a precompiled header in C with all the external library headers you want to use in the project. While we're not necessarily reducing the compile time further, we're making the job a lot simpler for the compiler.

2. Bad search & code completion

Armed with this we can go back to the question of search: if we use these package dependency files, we've suddenly reduced the lookup for code completion to the subset of packages we actually use in our project, which effectively resolves this issue.

4. Dependencies are not obvious

We’re also ready to tackle the dependencies, because we're now in a much better situation than with per-file imports: by inspecting a few files we can see all the dependencies our project has, and also what dependencies the libraries we depend on have.

If libraries split their dependencies into multiple groups we can also get a reduction in the number of libraries we need for compilation.

As an example, let us envision an http server library which supports both http and https, where the latter depends on a cryptography library containing multiple types of algorithms. If the library is split into multiple modules, then we can perhaps let the http part depend only on a TCP library, whereas the https part also depends on the cryptography algorithms it actually uses.

Depending on how much granularity there is, something not using https might avoid downloading the cryptography library entirely, and even when https is included, packages with deprecated hash and crypto algorithms do not need to be included to compile the https library.
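
Continuing the hypothetical manifest sketch from above, the split could look like this (library names invented for illustration):

http library manifest (hypothetical):
    dependencies = [ "tcp" ]

https library manifest (hypothetical):
    dependencies = [ "tcp", "crypto_tls" ]

A project using only http never pulls in crypto_tls, and a package of deprecated algorithms, say crypto_legacy, is never needed to compile the https library.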

Does this mean it works?

It seems like for most module systems it could work – given that the caveats listed are satisfied.

But should one do it? I would hedge my bets and say "possibly". Regular imports require less of the language and are the proven approach, but I believe I've shown that "modules without imports" could still be up for consideration when designing a language.
