The C3 Blog

Why does C3 use 'fn'?

Originally from: https://c3.handmade.network/blog/p/8886-why_does_c3_use_%2527fn%2527

Alongside the removal of goto, adding an fn in front of function declarations seems superfluous and an arbitrary deviation from C.

Originally, C3 inherited this from C2, which at the time used func (it later simplified it to fn).

In C2 the keyword simplified the grammar and made the code easier to search. In C3, however, it also makes macro and function declarations symmetric:

macro int foo(int a) { ... }
fn int bar(int b) { ... }

Furthermore, it gives a simple and readable syntax for defining function types and lambdas:

def Foo = fn int();
Foo lambda = fn int() { return 1; };
Foo lambda_short = fn () => 1;

Because fn starts a declaration, it's also easy to search for with a regex, or to have some simple tool extract function names.

So those are all straight-up usability reasons for fn.

But there is one more:

// Which one is C3 code?

fn int my_function(int a)
{
    return a * a;
}

int my_function(int a)
{
    return a * a;
}

Yes, the fact that a C3 snippet is trivially simple to identify really is an advantage as well as a disadvantage.

If you are converting C to C3, you might copy-paste the C code and convert it piecemeal. So how do you see what's C3 and what's C? Well, the fn of course.

It might seem almost flippant to list this as something positive, but reinforcing a shift in interpretation can be surprisingly helpful.


A new site and v0.5.5

Originally from: https://c3.handmade.network/blog/p/8876-a_new_site_and_v0.5.5

Another month and another C3 0.5.x release (read the 0.5.4 announcement here); grab it here: https://github.com/c3lang/c3c/releases/tag/v0.5.5. As work on 0.6.0 is underway, 0.5.5 contains far fewer language updates and instead consists mostly of bug fixes.

In other news, the C3 site has gotten a face-lift: https://c3-lang.org. It's still a work in progress with more extensive guides planned.

For 0.5.5 the biggest feature is the new @link attribute. It works similarly to #pragma comment(lib, ...) supported by MSVC and Clang.

module std::os::macos::cf @link(env::DARWIN, "CoreFoundation.framework");

// Use of any functions in this module section
// will implicitly link the CoreFoundation framework

...

While library dependencies can still be specified in project and library settings, this feature allows fine-grained dependency tracking and avoids superfluous linking. You link what you use, no more.

0.5.5 also sees a lot of important fixes, such as the broken output directory setting for projects (and fixes the project template for the corresponding setting!).

The standard library has gotten new_aligned and alloc_aligned, as new and alloc would not work correctly on over-aligned types, such as vectors wider than 16 bytes. The mem copy/clear/set functions now have separate inline variants, which is important as inlining requires a compile-time length.

Previously, aligned allocation using libc carried extra overhead to support it, but native aligned allocations are now used on POSIX and Windows, avoiding this problem.

Here is the full change list:

Changes / improvements
  • Disallow multiple _ in a row in digits, e.g. 1__000.
  • Added @link attribute.
  • New 'linker' build option.
  • "linker" project setting updated, "system-linker" removed.
Fixes
  • Struct/union members without storage size are now correctly rejected #1147.
  • math::pow will now correctly promote integer arguments.
  • Pointer difference would fail where alignment != size (structs etc) #1150.
  • Fixed array calculation for npot2 vectors.
  • $$memcpy_inline and $$memset_inline fixed.
  • .$Type = ... and .$foo = ... now works #1156.
  • int.min incorrect behaviour #1154.
  • Bitstruct cast to other bitstruct by way of underlying type would fail #1159.
  • Bug in time.add_seconds #1162.
  • Remove initial './' in Win32 paths when running a binary.
  • 'output' directory for projects was incorrect in templates.
  • Regression: no stacktrace.
  • For MacOS, running with higher optimization would crash as initializers were removed.
  • compile-run and run now returns the proper return code.
  • Allow String constants -> ichar*, and allow integer pointers to explicitly convert between unsigned and signed.
  • Bug in unaligned return value lowering for Aarch64.
Stdlib changes
  • Added new_aligned and alloc_aligned functions to prevent accidental under-alignment when allocating simd.
  • Fixes to realloc of aligned allocations.
  • Use native Windows calls on aligned allocations on Windows.
  • mem::copy_inline, mem::clear_inline and mem::set_inline added.
  • mem::copy / clear / set no longer has an $inline attribute.
  • Native aligned libc malloc on Windows & POSIX.
  • Simplification of the allocator interface.
  • CoreFoundation only linked on MacOS when used.

0.5 has feature stability guarantees, so code written for 0.5.0-0.5.4 will work with 0.5.5.

If you want to read more about C3, check out the documentation: https://c3-lang.org or download it and try it out: https://github.com/c3lang/c3c


C3 0.5.4 is out!

Originally from: https://c3.handmade.network/blog/p/8864-c3_0.5.4_is_out%2521

It's been about one month since C3 0.5.3 (announcement) was released. Since then there have been quite a lot of non-breaking additions to the compiler, and I'm happy to announce the release of 0.5.4 of the C3 programming language. (Grab the downloads here: https://github.com/c3lang/c3c/releases/tag/v0.5.4)

In terms of changes to the language, bitstructs have gotten some additional love: == and != are now supported, and bit operations on constant bitstructs are folded at compile time.

Startup initialization / finalization for macOS got an overhaul and is now guaranteed to be ordered, despite the OS not supporting ordering. This finally made dynamic calls safe to use with init functions.

For the stdlib, memory functions are changing. The family of new functions now zero-initializes by default. The reason is subtle: with implicit zeroing of locals, it's natural to start assuming everything is zero by default, even with heap allocations. So having mem::new be non-initializing is the wrong default. The new convention is that new means a zeroing allocation and alloc means a non-zeroing allocation. So all the functions ending in "zero" and "clear" are deprecated in favor of just the default new.

Also, similarly to how stream methods like file.printf(...) were abandoned for io::fprintf(file, ...), the many allocator methods are being deprecated for removal in 0.6. To replace them, std::mem::allocator is getting malloc, free, new and other functions that take an allocator.

Some examples:

int* x = mem::new_zero(int);
int* y = mem::new(int);
// replaced by:
int* x = mem::new(int);
int* y = mem::alloc(int);

Foo* f = my_allocator.new(Foo);
// replaced by:
Foo* f = allocator::new(my_allocator, Foo);

It might seem counterintuitive that allocator::new is preferable given that it's longer. However, it turns out that a consistent set of functions to call is much easier to work with than methods.

In any case, most applications should prefer standard heap and temp allocations using the functions in std::mem.

Finally, there's the addition of the experimental "GenericList", which is a tentative name. It can hold a heterogeneous list of objects. The downside is that it requires more memory management, as it's based around any*.

Changes / improvements

  • Hash variables may now take a designated initializer.
  • Added @safemacro to override the @ requirement for non-function-like macros.
  • More information available with debug log in non debug builds.
  • Removed install_win_reqs.bat which didn't work well.
  • Support ** to mean ./**
  • MacOS init/finalizer now respects priority.
  • Bitstructs supports != and ==.
  • Support Windows .def files using --windef.
  • Bitstructs now fold compile time constant bit ops.
  • Fix issue where in some cases a constant global with a string wasn't folded (e.g. in asm statements).
  • Lateral implicit imports removed.
  • Default to '.' if no libdir is specified.
  • Improved error messages for --lib.
  • Added --linker to set the linker #1067.

Fixes

  • Fixes to macro context evaluation with macro varargs.
  • Dynamic methods registered before init functions on MacOS.
  • Fixed clobber on x86 cpuid instruction.
  • Removed invalid syntax from grammar.y.
  • output project setting now respected.
  • Aliased declarations caused errors when used in initializers.
  • Aliased consts used as constant initializers caused errors.
  • Exported module names replace :: by _.
  • Const ternary would evaluate incorrectly for ?:
  • $$MODULE would report the incorrect module name in macros.
  • Fixed debug info for globals and for/switch scopes.
  • out now correctly detects subscript[] use.
  • Ambiguous recursive imports are now correctly detected.
  • Overzealous local escape check corrected #1127.
  • Fixes to the matrix functions #1130.

Stdlib changes

  • Deprecated Allocator helper functions.
  • Added mem::allocator functions corresponding to removed allocator functions.
  • Changed mem::new / mem::temp_new to accept an optional initializer, and will clear by default.
  • Mem _clear and _zero variants deprecated. "new_*" functions will clear by default.
  • Mem "alloc_" functions replace old "new_" behaviour.
  • Fixed temp memory issue with formatter.
  • Added temp_push and temp_pop for pushing / popping the temp allocator manually (or from C).
  • Added byte_size to List
  • Added GenericList.

0.5 has feature stability guarantees, so any code written for 0.5.0 will work on all of 0.5.x.

If you want to read more about C3, check out the documentation: https://c3-lang.org or download it and try it out: https://github.com/c3lang/c3c


Regarding programming forums and such

Originally from: https://c3.handmade.network/blog/p/8863-regarding_programming_forums_and_such

An observation: I notice that by virtue of people being mostly anonymous, a curious effect occurs on programming discords (and by extension elsewhere):

People who are "chat savvy" (or whatever we should call being good at writing in a way that is similar to being good at social interactions in real life) are able to dominate discussions by virtue of this.

In addition, people tend to "cluster" when it comes to opinions, so that followers of such persons may sway others ("everyone else agrees on this").

However, such savvy has nothing to do with actual programming skill or knowledge. Many of these "leaders" are in fact fairly inexperienced, if not outright beginners. Age is similarly obfuscated, so that teenagers might be seen as old and middle-aged persons may be taken for teenagers.

The most obvious example is when someone new comes to a Discord and enthusiastically starts presenting ideas. If these ideas are not "approved" by the leaders, the person might immediately be mocked and treated as a beginner / idiot.

I've seen this play out several times; on one occasion a 60+ year old gentleman presented a language and compiler in which he had solved several long-standing practical problems he'd encountered over the course of his career. He was laughed at and ridiculed as knowing nothing about programming or real-world problems by kids a third his age, because he wasn't presenting it in the way that was "expected" in that community. It was painful to see.

Outside of direct bullying, criticizing well-known people in the business is a favorite pastime. You often find people confidently deriding others as "not having any experience", "not knowing what they're talking about", "just making things up", etc. These critics are very sure of themselves, with the aforementioned followers to echo those feelings. So you end up with a bunch of 16-year-olds deriding 50+ year old programmers with multiple hit games under their belt as "not knowing anything about programming", and collecting pats on the back from the crowd for saying something so profound!

So what is my point? Nothing, really, except for these observations, and to conclude that there is no wisdom of the crowds online; you need to find truth on your own.


How bad is LLVM really?

Originally from: https://c3.handmade.network/blog/p/8852-how_bad_is_llvm_really

LLVM used to be hailed as a great thing, but with language projects such as Rust, Zig and others complaining it's bad and slow and they're moving away from it – how bad is LLVM really?

What is LLVM?

LLVM of today is not just a compiler backend, it's a whole toolchain, and the project also provides a linker (lld), a C compiler available as a library (Clang) and much more.

Except for the anomaly of Zig (Zig also uses the entire Clang as a library), most language projects simply use the LLVM backend, and possibly also the lld linker.

LLVM, Clang, lld and most other parts of the project are written in C++, with a C API available for LLVM, but not for most of the other libraries.

The speed problem

When Clang was released, a selling point was that it compiled faster than GCC. Since then this has slipped a bit, and GCC and Clang are now about equally slow.

The problem is not in optimized builds - most people accept that optimized builds will compile slowly. No, the problem is that unoptimized builds compile slowly. How slow? LLVM codegen and linking takes over 98% of the total compilation time for the C3 compiler when codegen is single threaded with no optimizations.

If codegen is two orders of magnitude slower than parsing, lexing and semantic checking combined, then you can see why compiler writers might not be totally happy with LLVM's performance.

Why is LLVM slow?

First a disclaimer: I have only read the LLVM source code a bit, and I haven't contributed anything beyond a few small fixes, so I'm not an expert.

However, it seems to me that LLVM has a fairly traditional C++ OO design. One thing this results in is an abundance of heap allocations. An early experiment switching the C3 compiler to mimalloc improved LLVM running times by a whopping 10%, which could only be true if memory allocations were a large contributor to the runtime cost. I would have expected LLVM to use arena allocators, but that doesn't seem to be the case for most of the code.

Heap allocations aside, using C++ or similar languages often invites certain inefficient patterns. It's easy to just rely on high-level constructs to solve problems:

Need to check if a list has duplicates? No problem, just grab a hash set and check!

Except if the list typically has only 2-3 entries, just doing a linear search might be much faster and require no setup. It doesn't matter how clever and fast the hash set is. And they're usually fast – LLVM has lots of optimized containers – but if no container was needed, then it doesn't matter how fast it was.

It's not necessarily bad code, but it's not code that is likely to be highly performant.

Why is LLVM "bad"?

LLVM has other warts as well. First up, the documentation isn't particularly great. It's not worse than that of many other libraries I've used, so this is more of a "we all wish it could be better, because understanding the backend is hard enough as it is".

More fundamentally though, LLVM is very much a backend for C/C++. While LLVM has test suites, Clang is ultimately the product in the LLVM umbrella that really tests the backend. This results in codegen not used by Clang being notoriously unreliable, as well as often poorly optimized (passing structs around by value for instance).

Another consequence is that LLVM often has mandatory UB where C/C++ does. For example, integer division by zero is currently an inescapable undefined behaviour in LLVM – which is bad if your language wanted to define x / 0 to be 0, for example. Another example is when i << x overflows due to x being the same as or larger than the bit width of i. This yields a poison value in LLVM, so if you wanted it to be, say, 0, you would have to add a select on every such shift, as there is no way to request well-defined behaviour. At least in this case the result is a poison value and not UB. C/C++, of course, considers i << x undefined behaviour in these overflow cases.

So: bugs, not-so-great documentation and assumption of C/C++ semantics are probably the main complaints I've seen.

The problem with alternatives

Alternatives to LLVM that pop up are Cranelift, QBE etc. However, at the moment none of those offers the same kind of complete solution that LLVM provides - and some of them are slower than using LLVM! If you already started using LLVM's advanced features, you will struggle with feature parity, not to mention the limited platform support.

Integrating with GCC is an alternative, but it doesn't solve the compilation speed problem, nor the other "bad" things about LLVM.

At this point, a lot of projects will start thinking about writing their own backend, and honestly this is probably a better alternative than using anything incomplete off the shelf at the moment, as this ensures there isn't some missing functionality that is impossible to handle later.

So while there are some promising upcoming backends (Tilde Backend comes to mind), there isn't really a drop-in replacement for LLVM today.

LLVM the good parts

While there are these downsides to LLVM, we shouldn't lose track of what it actually brings to the table. It's a full-fledged backend that is far more field-tested than anything one could hope to write oneself. It's reliable in the sense that it's not going away tomorrow or in five years. Buried in LLVM + Clang is a treasure trove of domain knowledge that a single developer can't be expected to accumulate on their own.

Being able to use LLVM is a huge service to language developers. What it lacks in speed it wins back in completeness.

Final words

We all love to complain about LLVM. It's far from perfect, not least in regard to speed. But at the same time, it allows language designers to build compilers that produce production-quality machine code on a wide variety of platforms. So really, starting out with LLVM is a good idea. Once there is a backend that works, there is plenty of time to explore other backends without any pressure.

So is LLVM bad? Well, it has its bad parts, but it's also probably the best backend you can pick for your compiler when you start out (not counting transpiling to C).

You can worry about the bad parts later.

Comments


Comment by Christoffer Lernö

LLVM used to be hailed as a great thing, but with language projects such as Rust, Zig and others complaining it's bad and slow and they're moving away from it – how bad is LLVM really?

What is LLVM?

LLVM of today is not just a compiler backend, it's a whole toolchain, and the project also provides a linker (lld), a C compiler available as a library (Clang) and much more.

Except for the anomaly of Zig (Zig also uses the entire Clang as a library), most language projects simply use the LLVM backend, and possibly also the lld linker.

LLVM, Clang, lld and most other parts of the project are written in C++, with a C API available for LLVM, but not for most of the other libraries.

The speed problem

When Clang was released, a selling point was that it compiled faster than GCC. Since then this has slipped a bit and GCC and Clang is about equally slow.

The problem is not in optimized builds - most people accept that optimized builds will compile slowly. No, the problem is that unoptimized builds compile slowly. How slow? LLVM codegen and linking takes over 98% of the total compilation time for the C3 compiler when codegen is single threaded with no optimizations.

If codegen is 2 magnitudes slower than parsing, lexing and semantic checking combined, then you can see why compiler writers might not be totally happy with LLVM's performance.

Why is LLVM slow?

First a disclaimer: I have only read the LLVM source code a bit, I haven't contributed anything beyond a few small fixes so I'm not an expert.

However, it seems to me that LLVM has a fairly traditional C++ OO design. One thing this results in is an abundance of heap allocations. An early experiment switching the C3 compiler to mimalloc improved LLVM running times with a whopping 10%, which could only be true if memory allocations were a large contributor to the runtime cost. I would have expected LLVM to use arena allocators, but that doesn't seem to be the case for most code.

Heap allocations aside, using C++ or similar languages often invites certain inefficient patterns. It's easy to just rely on high level constructs to solve problems:

Need to check if a list has duplicates? No problem, just grab a hash set and check!

Except if the list is typically only 2-3 entries, then just doing a linear search might be much faster and require no setup. It doesn't matter how clever and fast the hash set is. And they're usually fast – LLVM has lots of optimized containers, but if no container was needed, then it doesn't matter how fast it was.

It's not necessarily bad code, but it's not code that is likely to be highly performant.
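To make the duplicate-check example concrete, here is a hedged sketch in C: for a handful of elements, a quadratic scan with no allocations and no setup is often faster in practice than building a hash set. The function name is made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* For tiny lists (a common case inside a compiler), an O(n^2) scan
 * needs no allocation, no hashing and no container setup, which for
 * 2-3 entries usually beats even a well-optimized hash set. */
static bool has_duplicates(const int *items, size_t count) {
    for (size_t i = 0; i < count; i++) {
        for (size_t j = i + 1; j < count; j++) {
            if (items[i] == items[j]) return true;
        }
    }
    return false;
}
```

The point is not that linear search is clever, but that for the typical small input the fastest container is no container at all.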

Why is LLVM "bad"?

There are other warts LLVM has. First up, the documentation isn't particularly great. It's not worse than many other libraries I've used, so this is more of a "we all wish it could be better because understanding the backend is hard enough as it is".

More fundamentally though, LLVM is very much a backend for C/C++. While LLVM has test suites, Clang is ultimately the product in the LLVM umbrella that really tests the backend. This results in codegen not used by Clang being notoriously unreliable, as well as often poorly optimized (passing structs around by value for instance).

Another consequence is that LLVM often has mandatory UB where C/C++ does. For example, integer division by zero is currently an inescapable undefined behaviour in LLVM – which is bad if your language wanted to define x / 0 to be 0, for example. Another example is when i << x overflows due to x being the same as or larger than the bit width of i. This yields a poison value in LLVM, so if you wanted it to be, say, 0, you would have to add a select on every such shift, as there is no way to request well defined behaviour. At least in this case the result is a poison value and not UB. C/C++ of course considers i << x undefined behaviour for these overflow cases.
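To illustrate the shift case: a frontend that wants oversized shift amounts to produce 0 has to emit the guard itself (the "select" mentioned above). Written as C, the guard a compiler would have to generate for every such shift looks like this; the function name is hypothetical.

```c
#include <stdint.h>

/* Shifting by >= the bit width is UB in C and yields poison in LLVM
 * IR. A language that defines "x << s == 0" for oversized s must
 * emit this comparison-and-select around every shift itself. */
static uint32_t shl_defined(uint32_t x, uint32_t s) {
    return s >= 32 ? 0 : x << s;
}
```

This is exactly the cost being complained about: a branchless select is cheap, but it is pure overhead on every shift for languages whose semantics differ from C's.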

So: bugs, not-so-great documentation and assumption of C/C++ semantics are probably the main complaints I've seen.

The problem with alternatives

Alternatives to LLVM that pop up are Cranelift, QBE etc. However, at the moment none of those offers the same kind of complete solution that LLVM provides - and some of them are slower than using LLVM! If you have already started using LLVM's advanced features, you will struggle with feature parity, not to mention the limited platform support.

Integrating with GCC is an alternative, but it doesn't solve the compilation speed problem, nor the other "bad" things about LLVM.

At this point, a lot of projects will start thinking about writing their own backend, and honestly this is probably a better alternative than using anything incomplete off the shelf at the moment, as this ensures there isn't some missing functionality that is impossible to handle later.

So while there are some promising upcoming backends (Tilde Backend comes to mind), there isn't really a drop-in replacement for LLVM today.

LLVM the good parts

While there are these downsides to LLVM, we shouldn't lose track of what it actually brings to the table. It's a full-fledged backend that is way more field tested than anything one could hope to write by oneself. It's reliable in the sense that it's not going away tomorrow or in five years. Buried in LLVM + Clang is a treasure trove of domain knowledge that a single developer can't be expected to accumulate on their own.

Being able to use LLVM is a huge service to language developers. What it lacks in speed it wins back in completeness.

Final words

We all love to complain about LLVM. It's far from perfect, not least in regard to speed. But at the same time, it is allowing language designers to build compilers that produce production quality machine code on a wide variety of platforms. So really, starting out with LLVM is a good idea. Once there is a backend that works there is plenty of time to explore other backends without any pressure.

So is LLVM bad? Well it has its bad parts, but it's also probably the best backend you can pick for your compiler when you start out (not counting transpiling to C).

You can worry about the bad parts later.


Comment by Christoffer Lernö

Lexing, parsing and analysis is about 1-2% of the entire compile time when compiling C3 code with no optimizations. The rest is LLVM + linking, where linking is a small part of the time.

You can compare some compiler benchmarks here: https://github.com/nordlow/compiler-benchmark

Not that such a benchmark really gives the real time compilation will take on general code, but it gives a rough order of magnitude between different compilers (and consequently compiler backends, as this is where most of the time is spent for a language like C).

Syntax - when in doubt, don't innovate

Originally from: https://c3.handmade.network/blog/p/8851-syntax_-_when_in_doubt%252C_don%2527t_innovate

One of the most attractive things about language design is to be able to tweak the syntax of a language on its fundamental level, so not surprisingly you'll see language designers coming up with all sorts of alternatives to conventional syntax.

The problem is that it takes a while – I'd say a year at least – to figure out if some particular new syntax is good. It often takes less time to figure out if it's bad, but in some cases it might not be obvious until very late. So just because it's not immediately bad, it doesn't mean you won't find out something later.

Even worse, it's hard to weed out false negatives: sometimes syntax might appear to be "bad" simply because it is unfamiliar.

For that reason I think a good rule of thumb when working on syntax might be "when in doubt, do not innovate".

Just like other language features should "carry their weight" (that is, their value should outweigh their cost), so should syntax. "It's setting the language apart" or "I like how it looks" is fairly low on the value scale if the language is intended for use by others. If you're not sure whether some new syntax is necessary then it's better to wait until you know if it is. Meanwhile, there are established syntax conventions out there you can lean on.

New syntax shines where it enables (possibly new and innovative) language features to be expressed cleanly and clearly. It is probably better to prioritize such syntax innovations than, say, innovate new symbol combinations for arithmetics.


C3 0.5.3 Released

Originally from: https://c3.handmade.network/blog/p/8848-c3_0.5.3_released

It's almost 2 months since 0.5.0 was released and we're now at 0.5.3. This is the change list from 0.5.2:

Changes / improvements

  • Migrate from using actual type with GEP, use i8 or i8 array instead.
  • Optimize foreach for single element arrays.
  • Move all calls to panic due to checks to the end of the function.

Fixes

  • Single module command line option was not respected.
  • Fixed issue with compile time defined types (String in this case), which would crash the compiler in certain cases.
  • Projects now correctly respect optimization directives.
  • Generic modules now correctly follow the implicit import rules of regular modules.
  • Passing an untyped list to a macro and then using it as a vaarg would crash the compiler.
  • Extern const globals now work correctly.

Stdlib changes

  • init_new/init_temp deprecated, replaced by new_init and temp_init.

What about 0.5.1 and 0.5.2?

Unfortunately I never blogged about those. So here is a short recap on what happened in 0.5.1 and 0.5.2:

Changes / improvements

  • Allow trailing comma in calls and parameters #1092.
  • Improved error messages for const errors.
  • Do not link with debug libraries unless using static libraries.
  • Add 'print-linking' build option.
  • System linker may be used even if the target arch is different from current.
  • Slice -> array/vector works for constant slice lengths.

Fixes

  • Fixes issue where single character filenames like 'a.c3' would be rejected.
  • Better errors when index type doesn't match len() when doing user defined foreach.
  • Fixes to to_int for hexadecimal strings.
  • Fixed issue when using a generic type from a generic type.
  • Bug with vector parameters when the size > 2 and modified.
  • Missing error on assigning to in-parameters through subscripting.
  • Inference of a vector on the lhs of a binary expression would cause a crash.
  • Fixes to PriorityQueue
  • On Aarch64 use the correct frame pointer type.
  • On Aarch64 macOS, ensure the minimum version is 11.0 (Big Sur)
  • Fixes to the yacc grammar.
  • Dsym generation on macOS will correctly emit -arch.
  • Stacktrace on signals on Linux when backtrace is available.

Stdlib changes

  • Allow the to_int family of functions to take a base, parsing bases 2-10 and 16.
  • delete and delete_range added to DString.
  • Splitter iterator added.
  • splitter and iterator String methods.
  • load_new, load_buffer and load_temp std::io::file functions.

0.5 has feature stability guarantees, so any code written for 0.5.0 will work on all of 0.5.x.

If you want to read more about C3, check out the documentation: https://c3-lang.org or download it and try it out: https://github.com/c3lang/c3c


Say hello to C3 0.5

Originally from: https://c3.handmade.network/blog/p/8824-say_hello_to_c3_0.5

C3 is a programming language that builds on the syntax and semantics of the C language, with the goal of evolving it while still retaining familiarity for C programmers. It's an evolution, not a revolution: the C-like language for programmers who like C.

It is finally time to release C3 0.5. This version is the first version of the C3 compiler (and by extension, the C3 language) which is feature-stable.

Before 0.5, the language could change within the same minor version, so the 0.4.1 version of the compiler might not compile code written for 0.4.20 and vice versa.

From 0.5 and forward this changes: each future version will have its own branch where bug fixes will happen, but otherwise the features are frozen. New features will be reserved for the dev and master branches. Consequently, as we announce 0.5, work will actually move on to 0.6 which is where the active development will happen.

This allows people to pick a version to confidently work with, knowing that there will be no changes to language semantics or the standard library.

Feature complete

With 0.5, the C3 language itself can also be considered feature complete, and for 0.6, 0.7, 0.8 and 0.9 the focus will be on the standard library. A good standard library should address real-life use cases, solving the issues users commonly encounter.

In order to properly know what those use-cases are, a diverse set of projects must be written in C3. And for people to build non-trivial projects in C3 without problems there must be some stability guarantees to the compiler itself. This is what 0.5 provides, and why we now switch forward to refining the standard library.

Explore C3

Interested in trying out C3 0.5? Learn more on the language's official site: https://c3-lang.org. Obtain the compiler from GitHub at https://github.com/c3lang/c3c and join the community shaping the future of the C3 programming language.

Comments




Comment by Christoffer Lernö

The change list for 0.5:

Changes / improvements

  • Trackable allocator with leak allocation backtraces.
  • $defined can take a list of expressions.
  • $and, a compile time "and", which does not check expressions after the first one that is an error.
  • $is_const returns true if an expression is compile time const.
  • $assignable returns true if an expression may be implicitly cast to a type.
  • $checks and @checked removed, replaced by an improved $defined
  • Asm string blocks use AT&T syntax for better reliability.
  • Distinct methods changed to separate syntax.
  • 'exec' directive to run scripts at compile time.
  • Project key descriptions in --list command.
  • Added init-lib to simplify library creation.
  • Local const work like namespaced global const.
  • Added $$atomic_fetch_* builtins.
  • vectors may now contain pointers.
  • void! does not convert to anyfault.
  • $$masked_load / $$masked_store / $$gather / $$scatter for vector masked load/store.
  • $$select builtin for vector masked select.
  • Added builtin benchmarks by benchmark, compile-benchmark commands and @benchmark attribute.
  • Subtype matching in type switches.
  • Added parentof typeid property.
  • Slice assignment is expanded.
  • Enforced optional handling.
  • Better dead code analysis, and added dead code errors.
  • Exhaustive switches with enums have better analysis.
  • Globals may now be initialized with optional values.
  • New generic syntax.
  • Slice initialization.
  • $feature for feature flags.
  • Native stacktrace for Linux, MacOS and Windows.
  • Macro ref parameters are now of pointer type and ref parameters are not assignable.
  • Added nextcase default.
  • Added $embed to embed binary data.
  • Ad hoc generics are now allowed.
  • Allow inferred type on method first argument.
  • Fix to void expression blocks
  • Temporary objects may now invoke methods using ref parameters.
  • Delete object files after successful linking.
  • Compile time subscript of constant strings and bytes.
  • @if introduced, other top level conditional compilation removed.
  • Dynamically dispatched interfaces with optional methods.
  • $if now uses $if <expr>: syntax.
  • $assert now uses $assert <expr> : <optional message>
  • $error is syntax sugar for $assert false : "Some message"
  • $include, $echo no longer have mandatory () around the arguments.
  • $exec for including the output of files.
  • assert no longer allows "try unwrap"
  • Updated cpu arguments for x86
  • Removed support for ranged case statements that were floats or enums, or non-constant.
  • nextcase with a constant expression that does not match any case is an error.
  • Dropped support for LLVM 13-14.
  • Updated grammar and lexer definition.
  • Removal of $elif.
  • any / anyfault may now be aliased.
  • @stdcall etc removed in favor of @callconv
  • Empty fault definitions are now an error.
  • Better errors on incorrect bitstruct syntax.
  • Internal use wildcard type rather than optional wildcard.
  • Experimental scaled vector type removed.
  • Disallow parameterized attributes without parameters, e.g. define @Foo() = { @inline }.
  • Handle @optreturn contract, renamed @return!.
  • Restrict interface style functions.
  • Optional propagation and assignment '!' and '?' are flipped.
  • Add l suffix (alias for i64).
  • Allow getting the underlying type of anyfault.
  • De-duplicate string constants.
  • Change @extname => @extern.
  • define and typedef removed.
  • define is replaced by def.
  • LLVM "wrapper" library compilation is exception free.
  • private is replaced by attribute @private.
  • Addition of @local for file local visibility.
  • Addition of @public for overriding default visibility.
  • Default visibility can be overridden per module compile unit. Eg module foo @private.
  • Optimized macro codegen for -O0.
  • Addition of unary +.
  • Remove possibility to elide length when using ':' for slices.
  • Remove the : and ; used in $if, $switch etc.
  • Faults have an ordinal.
  • Generic module contracts.
  • Type inference on enum comparisons, e.g foo_enum == ABC.
  • Allow {} to initialize basic types.
  • String literals default to String.
  • More const modification detection.
  • C3L zip support.
  • Support printing object files.
  • Downloading of libraries using vendor "fetch".
  • Structural casts removed.
  • Added "native" option for vector capability.
  • $$shufflevector replaced with $$swizzle and $$swizzle2.
  • Builtin swizzle accessors.
  • Lambdas, e.g a = int(x, y) => x + y.
  • $$FILEPATH builtin constant.
  • variant renamed any.
  • anyerr renamed anyfault.
  • Added $$wasm_memory_size and $$wasm_memory_grow builtins.
  • Add "link-args" for project.
  • Possible to suppress entry points using --no-entry.
  • Added memory-env option.
  • Use the .wasm extension on WASM binaries.
  • Update precedence clarification rules for ^|&.
  • Support for casting any expression to void.
  • Win 32-bit processor target removed.
  • Insert null-check for contracts declaring & params.
  • Support user defined attributes in generic modules.
  • --strip-unused directive for small binaries.
  • $$atomic_store and $$atomic_load added.
  • usz/isz replaces usize and isize.
  • @export attribute to determine what is visible in precompiled libraries.
  • Disallow obviously wrong code returning a pointer to a stack variable.
  • Add &^| operations for bitstructs.
  • @noinit replaces = void to opt-out of implicit zeroing.
  • Multiple declarations are now allowed in most places, eg int a, b;.
  • Allow simplified (boolean) bitstruct definitions.
  • Allow @test to be placed on module declarations.
  • Updated name mangling for non-exports.
  • defer catch and defer try statements added.
  • Better errors from $assert.
  • @deprecated attribute added.
  • Allow complex array length inference, eg int[*][2][*] a = ....
  • Cleanup of cast code.
  • Removal of generic keyword.
  • Remove implicit cast enum <-> int.
  • Allow enums to use a distinct type as the backing type.
  • Update addition and subtraction on enums.
  • @ensure checks only non-optional results.
  • assert may now take varargs for formatting.

Stdlib changes

  • Tracking allocator with location.
  • init_new/init_temp for allocating init methods.
  • DString.printf is now DString.appendf.
  • Tuple and Maybe types.
  • .as_str() replaced by .str_view()
  • Added math::log(x , base) and math::ln(x).
  • Hashmap keys implicitly copied if copy/free are defined.
  • Socket handling.
  • csv package.
  • Many random functions.
  • Updated posix/win32 stdlib namespacing
  • process stdlib
  • Stdlib updates to string.
  • Many additions to List: remove, array_view, add_all, compact etc
  • Added dstringwriter.
  • Improved printf formatting.
  • is_finite/is_nan/is_inf added.
  • OnStack allocator to easily allocate a stack buffer.
  • File enhancements: mkdir, rmdir, chdir.
  • Path type for file path handling.
  • Distinct String type.
  • VarString replaced by DString.
  • Removal of std::core::str.
  • JSON parser and general Object type.
  • Addition of EnumMap.
  • RC4 crypto.
  • Matrix identity macros.
  • compare_exchange added.
  • printfln and println renamed printfn and printn.
  • Support of roundeven.
  • Added easings.
  • Updated complex/matrix, added quaternion maths.
  • Improved support for freestanding.
  • Improved windows main support, with @winmain annotations.
  • SimpleHeapAllocator added.
  • Added win32 standard types.
  • Added saturated math.
  • Added @expect, @unlikely and @likely macros.
  • Temp allocator uses memory-env to determine starting size.
  • Temp allocator is now accessed using mem::temp(), heap allocator using mem::heap().
  • Float parsing added.
  • Additions to std::net, ipv4/ipv6 parsing.
  • Stream api.
  • Random api.
  • Sha1 hash function.
  • Extended enumset functionality.
  • Updated malloc/calloc/realloc/free removing old helper functions.
  • Added TrackingAllocator.
  • Add checks to prevent incorrect alignment on malloc.
  • Updated clamp.
  • Added Clock and DateTime.
  • Added posix socket functions.

Fixes

  • Structs returned from macros and then indexed into directly could previously be miscompiled.
  • Naked functions now correctly handle asm.
  • Indexing into arrays would not always widen the index safely.
  • Macros with implicit return didn't correctly deduct the return type.
  • Reevaluating a bitstruct (due to checked) would break.
  • Fix missing comparison between any.
  • Fix issue of designated initializers containing bitstructs.
  • Fix issue of designated initializers that had optional arguments.
  • Fixed ++ and -- for bitstructs.
  • Fix to bug where library source files were sometimes ignored.
  • Types of arrays and vectors are consistently checked to be valid.
  • Anonymous bitstructs check of duplicate member names fixed.
  • Assignment to anonymous bitstruct members in structs.
  • Fix casts on empty initializers.
  • Fix to DString reserve.
  • Fix where aliases did not do arithmetic promotion.
  • @local declarations in generic modules available by accident.
  • Fixes missing checks to body arguments.
  • Do not create debug declaration for value-only parameter.
  • Bug in alignment for atomics.
  • Fix to bug when comparing nested arrays.
  • Fix to bug when a macro is using rethrow.
  • Fixes bug initializing a const struct with a const struct value.
  • Fixes bug when void is passed to an "any"-vararg.
  • Fixed defer/return value ordering in certain cases.
  • Fixes to the x64 ABI.
  • Updates to how variadics are implemented.
  • Fixes to shift checks.
  • Fixes to string parsing.
  • Bug when rethrowing an optional from a macro which didn't return an optional.
  • Fixed issues with ranged cases.
  • Disallow trailing ',' in function parameter list.
  • Fixed errors on flexible array slices.
  • Fix of readdir issues on macOS.
  • Fix to slice assignment of distinct types.
  • Fix of issue casting subarrays to distinct types.
  • Fixes to split, rindex_of.
  • List no longer uses the temp allocator by default.
  • Remove test global when not in test mode.
  • Fix sum/product on floats.
  • Fix error on void! return of macros.
  • Removed too permissive casts on subarrays.
  • Using C files correctly places objects in the build folder.
  • Fix of overaligned deref.
  • Fix negating a float vector.
  • Fix where $typeof(x) { ... } would not be a valid compound literal.
  • Fix so that using var in if (var x = ...) works correctly.
  • Fix int[] -> void* casts.
  • Fix in utf8to16 conversions.
  • Updated builtin checking.
  • Reduce formatter register memory usage.
  • Fixes to the "any" type.
  • Fix bug in associated values.
  • More RISC-V tests and fixes to the ABI.
  • Fix issue with hex floats assumed being double despite f suffix.
  • Fix of the tan function.
  • Fixes to the aarch64 ABI when passing invalid vectors.
  • Fix creating typed compile time variables.
  • Fix bug in !floatval codegen.
  • Fix of visibility issues for generic methods.
  • Fixes to $include.
  • Fix of LLVM codegen for optionals in certain cases.
  • Fix of $vasplat when invoked repeatedly.
  • Fix to $$DATE.
  • Fix of attributes on nested bitstructs.
  • Fix comparing const values > 64 bits.
  • Defer now correctly invoked in expressions like return a > 0 ? Foo.ABC! : 1.
  • Fix conversion in if (int x = foo()).
  • Delay C ABI lowering until requested to prevent circular dependencies.
  • Fix issue with decls accidentally invalidated during $checked eval.
  • Fold optional when casting slice to pointer.
  • Fixed issue when using named arguments after varargs.
  • Fix bug initializing nested struct/unions.
  • Fix of bool -> vector cast.
  • Correctly widen C style varargs for distinct types and optionals.
  • Fix of too aggressive codegen in ternary codegen with array indexing.

Comment by Christoffer Lernö

It allows the language to be easily parsable. The classic problem in a C-like grammar is that it is ambiguous with respect to types vs variables. In C this is typically solved using the "lexer hack", where the parser feeds types back into the lexer. Other methods include outlawing certain types of expressions and using infinite lookahead; this is the method D uses, for example.

In C3, the distinct naming rules for types disambiguate the grammar, making it LL(1). Also see here: https://c3-lang.org/faq/#syntax-language-design

So to be clear, it's not about trying to enforce some arbitrary name standards, but rather to simplify the grammar. Picking PascalCase for the types was pretty much the only possible choice. I might write a blog post about this some time.
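The ambiguity being referred to can be demonstrated directly in C, where the same token sequence parses differently depending on whether a name denotes a type. The typedef and function below are illustrative only; C3's PascalCase rule for type names resolves this at the lexical level instead.

```c
/* The classic C ambiguity: "a * b;" is either a multiplication
 * expression or a declaration of b as pointer-to-a, and the parser
 * cannot decide without symbol table information ("the lexer hack"). */
typedef int a;          /* 'a' now names a type... */

static int deref_demo(void) {
    a * b;              /* ...so this declares b; it does not multiply */
    a value = 7;
    b = &value;
    return *b;          /* returns 7 through the pointer */
}
```

Remove the typedef and give `a` and `b` values, and the exact same line becomes an expression statement, which is why a C parser cannot be LL(1) without feedback from the symbol table.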

Too much power, too poor accuracy - the story of $checks in C3

Originally from: https://c3.handmade.network/blog/p/8810-too_much_power%252C_too_poor_accuracy_-_the_story_of_checks_in_c3

Recently C3 lost its $checks() function. It would take any sequence of declarations and expressions, and if it failed to semantically check anywhere, return false.

It was an extremely powerful and flexible way of testing pretty much anything at compile time. Some examples:

// Test if a value may be indexed:
$checks(a[0]);
// Test if something supports addition:
$checks(a + a);
// Test if you can assign something to the type of another variable
$checks(b = a);
// Test if you can call a function with the values of two variables
$checks(foo(a, b));
// Check if a type has a particular field
$checks(Foo x, x.my_field);
// Check if a type is ordered
$checks(Foo x, x < x);

In essence, $checks was a Swiss Army knife for compile-time validation, making it redundant to employ multiple compile-time functions like $defined(x). So, why did we part ways with $checks (and its contract counterpart @checked)?

Well, it turns out that with power also comes a lack of clarity. Take, for example, the $checks(foo(a, b)) call – it could potentially fail for a multitude of reasons:

  1. foo might not be visible in the scope.
  2. foo needs to be called with the module name, e.g. my_module::foo
  3. foo might not be a callable variable pointer or function.
  4. a might not be visible in the scope.
  5. b might not be visible in the scope.
  6. foo might take fewer or more than 2 arguments.
  7. There could be a type mismatch between a and the first parameter of foo.
  8. There could be a type mismatch between b and the second parameter of foo.

So while we might have wanted to test for only some of these, the call might fail for any of the listed cases, and there is no way to determine which one unless we move the expression out of $checks and test it so that it errors just the same way.

This is a problem when writing the $checks, but it poses an even bigger problem when refactoring: it is hard to tell when an accidental change breaks something inside a $checks, silently causing it to reject legitimate parameters.

So $checks unfortunately combines power with inexactness. In fact, its power comes from being inexact and just bundling all the implicit checks together.

The alternative solution

C3 already had $defined(...), which did a lightweight check of whether a variable or a field was defined. Its functionality had been almost completely eclipsed by $checks(...), but it now got a new life: $defined would semantically check all but the outermost part of a nested expression, and only that final, outermost operation would be conditionally checked.

The new behaviour was reminiscent of $checks, but with only a single "tested" semantic check. For example, $defined(foo(a, b)) would return true if the call checked correctly, and false only if "foo" wasn't callable or didn't accept 2 arguments.
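As a sketch of how this plays out in practice – the call_or_default macro and foo function are invented here for illustration, and the $if/$else/$endif compile-time syntax is assumed to match current C3 – code can branch on whether the outermost call would check:

```c3
// Hypothetical helper: call foo(a, b) if that call would semantically
// check, otherwise fall back to a default. Only the outermost operation
// (the call itself) is conditionally checked by $defined; the arguments
// themselves must still check normally.
macro call_or_default(a, b, default_value)
{
    $if $defined(foo(a, b)):
        return foo(a, b);
    $else
        return default_value;
    $endif
}
```

Because only the outermost check is "tested", a typo in a or b is still reported as a hard error instead of silently flipping the branch – which is exactly the precision $checks lacked.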

The downside is that $defined must be carefully crafted to correctly do each "test" it supports.

But all in all, this is a substantial upgrade for correct compile-time checking, which is very important in C3.


Addition: without $checks the various examples instead become:

// Test if a value may be indexed:
$defined(a[0]);
// Test if something supports addition:
types::is_numerical($typeof(a));
// Test if you can assign something to the type of another variable
$assignable(a, $typeof(b));
// Test if you can call a function with the values of two variables
$defined(foo(a, b));
// Check if a type has a particular field
$defined(Foo{}.my_field);
// Check if a type is ordered
Foo.is_ordered;
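As one hedged example of the new style in context – the Foo struct and get_my_field_or_zero macro are invented here, and the $if/$endif compile-time syntax is assumed – the field check from the list above can drive a compile-time fallback:

```c3
struct Foo { int my_field; }

// Hypothetical macro: yield x.my_field if the type of x has that field,
// otherwise yield 0. Only the outermost member access is conditionally
// checked by $defined, so an undefined x is still a hard error.
macro get_my_field_or_zero(x)
{
    $if $defined(x.my_field):
        return x.my_field;
    $else
        return 0;
    $endif
}
```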


Some guidelines to new syntax design

Originally from: https://c3.handmade.network/blog/p/8778-some_guidelines_to_new_syntax_design

Syntax discussions tend to be highly contextual. The syntax of a language is not a standalone, separate entity, but rather interacts with the type of algorithmic solutions you envision users employing. On top of that, one must be aware that syntax shapes the solutions users will prefer in sometimes unpredictable ways.

This makes completely new syntax very hard to analyze. And also hard to write any guidelines for.

That said, I think there are some things we can say about syntax design, to form some very simple (and obvious) guidelines:

  1. In general, an easy-to-parse syntax tends to be quicker for a user to read than a complex-to-parse syntax.
  2. Newly invented syntax will initially be harder for people to grok than established syntax. So it is a drawback if you want experienced programmers to understand the language "at a glance".
  3. Newly invented syntax makes the language feel more "different" (unique, inventive etc.) than established syntax. So it is an advantage if you want the language to stand out as being different at a glance.
  4. It's harder to know the downsides of newly invented syntax. So much more research is needed, and it's important to be ready to change it down the line if it doesn't work out.
  5. One's personal opinion of what makes "nice-looking syntax" is very unlikely to be objectively accurate, so be aware that a syntax "beautiful" to you might be hideous to someone else.

Happy hacking!
