
The C3 Blog

C3: Handling casts and overflows part 2

Originally from: https://c3.handmade.network/blog/p/7661-c3__handling_casts_and_overflows_part_2

Continuing from our previous article, let's look at some more possibilities and maybe find a solution.

An even more ambitious widening strategy

The previous idea with both a maximum and minimum promotion type didn't pan out, but what if we went much further?

We use the idea of a maxint, a signed-only integer type that is wider than the widest ordinary type. If i128 is supported natively on the target, then maxint is i256; if the widest normal int type is long (defined as 64 bits in C3), maxint becomes i128.

We also have a minimum arithmetic integer type, which will usually be the same size as C's int. The algorithm is then as follows:

  1. Determine the type of the left hand side. This is the target type. If the target type is unsigned, promote it to the next power-of-two type. E.g. uint -> long.

  2. Check the leaf operands (variables, integer literals, casts, functions). If they exceed the target type, this is a compile-time error.

  3. Promote the operands to at least the minimum arithmetic integer type.

  4. If the resulting type is more narrow than the target type, promote to the target type.

  5. Trap all operations.

  6. Trap the truncation of the right hand side down to the actual type of the left hand side.

Our u64 = u64 + i32 works now: it is implicitly converted into something like u64 = safeCastWithTrap(addWithTrap(cast(u64 as i128), cast(i32 as i128)), ulong).

So far so good, but something like u64 = u64 + u64 will also need i128. This adds a certain level of safety at a potential cost of performance, even though usually a compiler will optimize away the use of the extra 64 bits for simpler expressions.

This looks promising, but what about wrapping behaviour? We can use the wrapping operators to resolve this, checking both sides against each other, so that in the case of u64 = i32 +% i16 this becomes u64 = safeCastWithTrap(addWrap(i32, cast(i16 as i32)), u64). We prevent implicit mixing of unsigned and signed without an explicit cast if we cannot convert to either operand: i16 +% u32 would for example be an error.

We allow u32 = i32. This converts into u32 = safeCastWithTrap(i32, u32).

Back to basics, no overflow trapping

Another avenue of exploration is to simplify and go back to C. There are some simple changes one can make: use the left hand side for implicit widening, make signed integer overflow wrap, and require casts for mixed unsigned and signed arithmetic to make the mixing more obvious.

We can allow implicit conversion back from the automatically promoted type: in the case of i16 = i16 + i16, using C arithmetic promotion rules the operands would for example be promoted to i32 before the calculation. We can allow implicit truncation back in those cases, but perhaps require explicit casts in all other cases. So i16 = i16 + i32 would require some kind of cast. What simplifies things is that we only need a truncating cast here.

We can introduce trapping arithmetic as well as a trapping assign that requires both arguments to already have the same type:

short x = 1;
char b = 16;
int y = 0x7FFF;
// x = x + y; // ERROR, must have explicit cast.
x = x + b;

// The following are not necessary but may be interesting:
x = b ~* b; // This will trap due to 16 * 16 > MAX_CHAR
// x = x ~+ b; // ERROR both operands must be the same
x ~= y + 1; // Will trap as the value would overflow
x ~= y; // Would work

Some evaluation

Let us evaluate the different strategies using some code examples that contain vulnerabilities:

func void readpgm(char* name, Pixmap* p)
{
  /* ... */
  pnm_readpaminit(fp, &inpam);
  p.x = inpam.width;
  p.y = inpam.height;
  // Possible overflow
  if (!(p.p = malloc(p.x * p.y)))
  {
    @report("Error at malloc");
  }
  for (int i = 0; i < inpam.height; i++)
  {
    pnm_readpamrow(&inpam, tuplerow);
    for (int j = 0; j < inpam.width; j++)
    {
      // Possible overflow
      p.p[i * inpam.width + j] = sample;
    }
  }
}
func void getComm(uint len, char* src)
{
  uint size;
  // Underflow if len < 2
  size = len - 2;
  // Overflow if size = UINT_MAX
  char* comm = malloc(size + 1);
  memcpy(comm, src, size);
}
func uint* decode_fh(uint* p, SvcFh* fhp)
{
  int size;
  fh_init(fhp, NFS3_FHSIZE);
  size = ntohl(*p++);
  if (size > NFS3_FHSIZE) return NULL;
  // size may be < 0, this converts into a huge size
  memcpy(&fhp.fh_handle.fh_base, p, size);
  fhp.fh_handle.fh_size = size;
  return p + XDR_QUADLEN(size);
}

With those examples presented, let's see how they work out using the two strategies:

The widening strategy

func void readpgm(char* name, Pixmap* p)
{
  /* ... */
  pnm_readpaminit(fp, &inpam);
  p.x = inpam.width;
  p.y = inpam.height;
  // Will trap in debug:
  if (!(p.p = malloc(p.x * p.y)))
  {
    @report("Error at malloc");
  }
  for (int i = 0; i < inpam.height; i++)
  {
    pnm_readpamrow(&inpam, tuplerow);
    for (int j = 0; j < inpam.width; j++)
    {
      // Will trap in debug
      p.p[i * inpam.width + j] = sample;
    }
  }
}
func void getComm(uint len, char* src)
{
  uint size;
  // Will trap in debug
  size = len - 2;
  char* comm = malloc(size + 1);
  memcpy(comm, src, size);
}
func uint* decode_fh(uint* p, SvcFh* fhp)
{
  int size;
  fh_init(fhp, NFS3_FHSIZE);
  size = ntohl(*p++);
  if (size > NFS3_FHSIZE) return null;
  // Will trap in debug
  memcpy(&fhp.fh_handle.fh_base, p, size);
  fhp.fh_handle.fh_size = size;
  return p + XDR_QUADLEN(size);
}

The widening strategy clearly works in these examples: all overflows are correctly trapped, with no need for additional annotations. It "just works".

2s complement wrapping

func void readpgm(char* name, Pixmap* p)
{
  /* ... */
  pnm_readpaminit(fp, &inpam);
  p.x = inpam.width;
  p.y = inpam.height;
  // Unchecked, except for negative values, 
  // use "p.p = malloc(convert(p.x ~* p.y as usize))"
  // to trap in both release and debug
  if (!(p.p = malloc(convert(p.x * p.y as usize))))
  {
    @report("Error at malloc");
  }
  for (int i = 0; i < inpam.height; i++)
  {
    pnm_readpamrow(&inpam, tuplerow);
    for (int j = 0; j < inpam.width; j++)
    {
      // Unchecked, use "i ~* inpam.width ~+ j"
      // to trap in both release and debug
      // but the problem lies in the earlier malloc check.
      p.p[i * inpam.width + j] = sample;
    }
  }
}
func void getComm(uint len, char* src)
{
  uint size;
  // Not checked, use
  // len ~- 2 to trap in both release and debug
  // but it is more natural to check this with
  // preconditions, e.g. assert(len >= 2)
  // So trapping is probably the wrong decision
  size = len - 2;
  char* comm = malloc(size + 1);
  memcpy(comm, src, size);
}
func uint* decode_fh(uint* p, SvcFh* fhp)
{
  int size;
  fh_init(fhp, NFS3_FHSIZE);
  size = ntohl(*p++);
  if (size > NFS3_FHSIZE) return null;
  // Here we are forced to convert
  // it's natural to use a trapping conversion
  // that protects in both release and debug
  memcpy(&fhp.fh_handle.fh_base, p, convert(size as usize));
  fhp.fh_handle.fh_size = size;
  return p + XDR_QUADLEN(size);
}

In these cases we see that the widening strategy has a very clear advantage in detecting errors. This is not surprising, since that's exactly what it's built for. On the other hand, the explicit trapping operators give us stronger protection where they're used, and the type model is much simpler.

That last argument is the strongest one in favour of "wrapping". Although in this particular example we manage to demonstrate that "widening" works, it introduces a lot of machinery to add the traps, and there are enough implicit checks that we can't prove for sure that the behaviour is without problems. And yes, u32 = u32 + i32 will get implicit checks for overflow, but maybe that should be checked before the assignment? The traps are very useful during testing, but they're no replacement for code that correctly checks its arguments. In C3's case there is already support for contracts, which catches many of the errors above as long as the functions are properly documented, so unlike other languages it already has a somewhat higher level of security.

Summary

I've presented two quite different models, each with its own advantages and disadvantages. Ultimately, what decides which one to use is not just the security concerns, but how well it fits with the language in general. Other features may interact in surprising ways, so it's only after working with a model in practice that one can see its real effects on the language as a whole.


How to procrastinate while working hard

Originally from: https://c3.handmade.network/blog/p/7657-how_to_procrastinate_while_working_hard

Refactoring is an important part of programming. If you are maintaining a non-trivial code base you need to constantly remove cruft and improve on solutions, otherwise the code will slowly rot.

When working on improving abstractions and code quality there is also a lure that is mostly ignored: over-engineering. The urge to add code that feels “magical” and just does things in an extremely elegant way. You can find examples in amazing C++ templates, or in some awesomely elegant Swift code that uses a combination of operator overloading, generics and pattern matching. It might look cool, but over-engineering is dangerous.

It’s dangerous because you can spend days on that “perfect abstraction” which might be elegant on the surface — but your teammates will have a less pleasant time trying to figure out how to debug or extend it later on.

It’s dangerous because all that time you spent might make you reluctant to find easier solutions, or throw it away when it’s no longer needed.

It’s dangerous because that complexity disguised as abstraction makes your code less maintainable and harder to understand.

It’s dangerous because you might have thrown away bug-free code and replaced it with something new and untested because you thought it looked more elegant.

But most of all it’s dangerous because it’s so damned satisfying to just find those beautiful abstractions. It’s so much fun that we forget how dangerous it is.

So when you feel the urge — remember restraint. The “magically cool things” your language can do are usually exactly those parts that you should stay clear of.

(Previously posted on dev.to)


C3: Handling casts and overflows part 1

Originally from: https://c3.handmade.network/blog/p/7656-c3__handling_casts_and_overflows_part_1

Previously I've been writing about various ways of handling integer overflows. The short summary is that the best you get will be some kind of trade off. I've discussed Go, Swift, Zig and Rust and looked at consequences of their particular choices.

At the time of writing, C3 has a Zig-like system with somewhat stronger left hand inference. That said, it exhibits mostly the same behaviour as Zig currently does.

In this article series I'm going to investigate a few solutions I considered for C3, but first I need to explain what triggered the investigation in the first place.

A common occurrence is the following code:

uptrdiff a = getIndex();
int value = ptr[a - 1];

In C3, like in Go, Rust, Zig etc, the 1 in this expression is not concretely typed, so the type has to derive from somewhere. Unlike Zig, which would derive the type from a, C3 prefers to push down the type: in this case the preferred type is iptrdiff (coming from the array index), which is then pushed down into the operands. Unfortunately in this case that results in the operands having different signedness. Obviously we would have preferred to find the type of 1 from the type of a here. However, it might be that a = 0, in which case the unsigned number would underflow, which with trapping unsigned underflow would be an error. The correct thing would therefore be to cast a to an iptrdiff if underflow is expected.

The situation becomes even more complex if the value is a variable:

uptrdiff a = getIndex();
ushort offset = getOffset();
int value = ptr[a - offset];

With implicit widening, this expression happily converts offset to uptrdiff (typically a 64 bit number), and then proceeds to completely break on a = 0, offset = 1. If ptr is a pointer, then a negative value might be completely reasonable (which is why indexing uses iptrdiff as default).

This demonstrates that there is no really optimal way through this. We can note that

int value = ptr[cast(a - offset as iptrdiff)];

does not fix anything, since the trapping happens before the conversion. We need the awkward:

int value = ptr[cast(a as iptrdiff) - cast(offset as iptrdiff)];

An unrelated, but also problematic behaviour is this:

char x = 16;
int y = x * x;

If we only use the operands to determine the type, then this will overflow, as "16 * 16" overflows the 8-bit char type. It was for this reason C3 added bi-directional typing, pushing down the type into the operands, so that the expression implicitly becomes:

int y = cast(x as int) * cast(x as int);

Unfortunately this works poorly with casts:

ichar s = -128;
uint z = cast(s * s as uint);

What does the above mean? If we do the conventional sign extension and then bitcast, the result is a very large uint, which then overflows. If we instead try the conversion after performing the multiplication in 8 bits, that multiplication will overflow.

There are more examples, but this should be enough to illustrate that trapping behaviour – and especially unsigned overflow – creates huge headaches when mixing types.

To summarize

  1. Bi-directional typing does not work well in a case like a ptr index, where one needs to pick unsigned or signed depending on the operands.
  2. Overflow trapping creates problems when using the left hand side to infer widening.
  3. Overflow trapping is likewise problematic when not using the left hand side and then doing implicit widening at assignment.

Initial attempts

My initial attempts tried to introduce clever ways to push down both iptrdiff and uptrdiff and pick the best alternative. But the biggest problem in doing so lies not in the technical challenge, but in creating rules that are intuitive to the programmer. Eventually this led me to investigate other solutions.

Seeing how C# promotes int + uint to long inspired an idea: have some default int promotion and then promote using the left hand side, up to a maximum integer type.

It's similar to the "Widen and narrow" strategy discussed in a previous article, but unlike that solution the maximum integer is typically the max register size. This meant that on most platforms common things like u64 = u64 + i32 would not work. And remember that overflow would trap, so the trick in C that relies on 2s complement to emulate negative numbers for unsigned values simply would not work.

The second change was in how casts should behave: the idea was that inside a cast we would basically have only wrapping semantics at that particular size, and it would not matter whether the wrap is done late or early. That is true for addition and subtraction, but not for division: cast(a / (x + y) as u32) is not the same as cast(a as u32) / (cast(x as u32) + cast(y as u32)). With limited integer types the latter is the best we can implement, but this may run counter to what we would intuitively expect from a cast, or even from an alternative wrap operator.

Although promising at the outset, these examples show that the model breaks down both with mixing integers and trying to replicate wrapping behaviour in a predictable way.

The next blog post will continue discussing solutions considered and their various advantages and drawbacks.


Overflow trapping in practice

Originally from: https://c3.handmade.network/blog/p/7651-overflow_trapping_in_practice

In the last post about arithmetic I listed the following problems related to overflow and arithmetic:

  1. Overflow traps on unsigned integers makes the mixed signedness case very hard to get right.
  2. Overflow traps cause unexpected and hard-to-spot non-commutative and non-associative behaviour in expressions that otherwise would have been fine.
  3. Wrapping behaviour enables buffer overflows and similar exploits.
  4. In most languages it doesn't matter whether the left hand side can contain the number; what matters are the types on the right hand side, which isn't always intuitive or desired.

Let's look at some attempts to fix these problems, starting with handling mixed signedness.

Mixed signedness

As we saw in the previous post, mixed signedness is not trivial. The choices made by C are actually fairly reasonable if the idea is to make most situations "just work" as the user intended. However, whatever the solution, each comes with its own set of problems. Looking at these first will allow us to understand the ramifications of other choices later on.

1. "Widen and narrow"

This method relies on widening the operands to a common super type. Let us look at our old example:

unsigned x = 10;
int change = -1;
x = x + change;

The idea is that we promote all operands to a type that can represent the full range of both operand types, in this case likely a long long, and perform the operation using that type. The trap then happens in the implicit cast back to the original size.

The code above would implicitly be turned into:

unsigned x = 10;
int change = -1;
long long _temp = (long long)x + (long long)change;
if (_temp < 0 || _temp > UINT_MAX) panic();
x = (unsigned)_temp;

This strategy works, with the caveat that we need to always be able to find a wider type in order to do this widening.

In a language like Zig, which allows arbitrarily wide integer types, this is not a problem, but what do we do with a fixed set of types? In C# the solution is basically to say that this works up to 32 bits: signed and unsigned 64 bit ints cannot be combined, even though unsigned 64 bit integers should be fairly common.

If we insist that all integer types need an unsigned counterpart, then we either need to stop at some arbitrary place, like C# does, or provide arbitrarily wide integers. Another solution is to provide a signed "maxint" that has no unsigned counterpart.

Pros:

  • Conceptually simple.

  • unsigned + signed has a definite type.

Cons:

  • Widening may have a performance cost.

  • The listed strategies to pick a wider integer (C# / Zig / maxint) all have drawbacks.

2. "Prefer signed"

This method relies on always converting to the signed type when the two types have the same size.

The code above would implicitly be turned into:

unsigned x = 10;
int change = -1;
// Possibly insert trap here:
// if (x > INT_MAX) panic();
int _temp = (int)x + change;
if (_temp < 0) panic();
x = (unsigned)_temp;

If signed addition overflow is a trapping error, we need the check on x; otherwise it would be possible to add a large unsigned to a positive signed and get a negative result, which would be inconsistent. However, if signed overflow is not an error, then the check on x isn't needed.

Unfortunately, with the overflow check we essentially forbid the unsigned value from having the top bit set, which both adds a gotcha and makes it unusable in some cases.

Pros:

  • Conceptually simple.

  • Definite result type.

  • Efficient.

Cons:

  • Broken if signed overflow is an error.

3. "Prefer unsigned"

This is what C does. With wrapping unsigned arithmetics this often yields the expected result, but there are counterexamples such as:

unsigned x = 2;
int y = -100;
unsigned z = 10;
if (z < x + y) unexpectedCall();

In the above code, the right hand side becomes a large number, as y is converted into a large unsigned value. Division has a similar issue, as unsigned and signed division work differently.

If conversion of the signed number instead traps on negative numbers, we're forced to cast, but even that won't help if unsigned overflow is an error.

Pros:

  • Fairly simple.

  • Efficient.

Cons:

  • Does not work if unsigned overflow is an error.

  • The unsigned result may give unexpected results when it is used for comparison or division.

  • The resulting type might not always be obvious.

4. "Use a function"

This solution instead provides functions for the various combinations; it might look like this:

unsigned x = 10;
int change = -1;
x = addSignedToUnsignedWithTrap(x, change);

How the function works doesn't matter to us here; we just know it will trap if the addition didn't work and give us the correct answer otherwise.

Pros:

  • Well defined and easy to understand.

  • Definite result type.

Cons:

  • Clunky.

  • Programmers might look for easier but "wrong" solutions.

5. "Require explicit casts"

A common solution is to require explicit casts, but as we saw above in the "Prefer unsigned" and "Prefer signed" variants, this doesn't necessarily work if there are overflow traps. Worse, it might potentially obscure issues:

unsigned x = 10;
int y = -50;
unsigned z = x + (unsigned)y; // No unsigned overflow, so ok?

Here the problem isn't that we're casting a negative value to an unsigned one; in many cases this is actually what we want. It's that we can't determine when the addition underflows and when it doesn't.

Since explicit casts can express all of the strategies 1-3, we don't need to tie ourselves to a particular (possibly inferior) solution. On the other hand, it may take quite a bit of effort and a deep understanding of the possible issues to do "the right thing". The whole idea behind trapping more overflows is to make the default safer. If that is the intention, then forcing the programmer to pick just the right set of casts to avoid bugs and potential attack vectors seems inconsistent.

Pros:

  • Well defined and easy to understand.

  • Definite result type.

  • Free to pick the optimal strategy.

Cons:

  • Does not work well when overflow is an error.

  • May implicitly remove safeguards.

  • Picking the optimal set of casts is not always obvious.

6. "Mixed signedness type"

A more utopian idea is for unsigned/signed mixes to have no definite type, instead being inferred or creating multiple code paths depending on how the result is used. The complexity and unpredictability, in particular in non-trivial examples, make it hard to view this as a reasonable solution:

unsigned x = ...;
int y = ...;
z = (x + y) / (x - y); // Signed or unsigned division here?

The forgotten left hand side

As an example, I mentioned that Zig currently uses only the binary operands' types when resolving arithmetics, so for example this will overflow:

var num1 : i8 = 16;
var num2 : i32 = 3;
var res : i32 = num2 + num1 * num1; // Boom!

It might feel odd that Zig chose this behaviour, but it's not strange. The reason we don't see this in C is that C implicitly widens all operands to int first. This is why adding two ints and assigning the result to a long long in C cannot produce a value larger than INT_MAX, even though the long long might be able to hold values several billion times as large.

The only difference between C and Zig is that we run into this problem more often in the latter. Rust also allows i8 * i8, but since there is no implicit widening it's more obvious what's happening.

Left hand side widening

A simple algorithm for widening using the left hand side, given that all operands have the same signedness is the following:

  1. Store the left hand side type, and assign this as the target type.
  2. Depth first, look at the first leaf operand, which is any operand that already has a definite type (e.g. variable, explicit cast, call). If the type is larger than the target type this is a type error. If it is smaller, promote it to the target type.
  3. Repeat on all leaf nodes.
  4. Derive types of the branch nodes.
  5. At this point all branch nodes should have the same type as the target type already (or there was a type error).

This algorithm can be modified to handle implicit conversion of mixed signedness as well.

If implicit widening is used, then it is probably a good idea to widen using the left hand side regardless of whether the overall behaviour is trapping or not.

Reviewing the options

If we go back and look at overflow in general, it's instructive to look at what some other languages do:

C

  • Implicit casts between all integer types.

  • Implicit widening to int before arithmetics.

  • Unsigned arithmetic wraps.

  • Signed overflow is undefined behaviour.

Rust

  • No implicit casts.

  • Trap in debug on unsigned overflow, wrap in release.

  • Trap in debug on signed overflow, wrap in release.

Go

  • No implicit casts.

  • Wrapping unsigned overflow.

  • Wrapping signed overflow.

Zig

  • Only safe, widening casts.

  • Trap in debug on unsigned overflow, UB in release.

  • Trap in debug on signed overflow, UB in release.

Swift

  • No implicit casts.

  • Trap on unsigned overflow.

  • Trap on signed overflow.

Downsides

The problems with C's system are well known: the lack of trapping overflow, together with the lack of bounds checking, enables buffer overflows and other attacks. This is made worse by the undefined behaviour of signed overflow. While C compilers often give wrapping semantics in unoptimized builds, optimized builds may assume overflow never happens, giving different behaviour with optimization on. Implicit casts mean that potentially harmful truncating casts are extremely difficult to spot. With UB sanitizers it's easy to turn such UB into traps, giving checks similar to Rust's or Zig's, but only for signed overflow.

On the other hand, C mostly preserves associativity and commutativity, and has few problems working with mixed signedness. The implicit widening means that overflow rarely happens in normal cases. Usually C "just works" – except when it doesn't.

With languages that trap, such as Rust, Swift and Zig, we are much more likely to run into trapping behaviour for intermediate values. For Zig in particular this is problematic, since there is implicit widening and it's not obvious that the computation on the right hand side might use a much smaller bit width than the left hand side would indicate. In Rust and Swift it is at least obvious where the widening occurs. On the other hand, it can be argued that requiring explicit widening where it is safe is fairly noisy. If Zig used the left hand side to widen all operands before calculation, the problem with implicit widening would largely go away.

Both Rust and Zig restrict overflow trapping to debug builds. While fuzz testing and regular testing are likely to use the debug build, programs are not protected where it matters most: in production. Zig's choice of UB is daring – the behaviour of overflow bugs will be hard to predict.

Go, with its wrapping behaviour, lacks the tools to check overflow with the normal operators, but at the same time this means that, like Swift, the code is guaranteed to work the same regardless of debug or release mode. We can also safely use casts to mix signed and unsigned numbers, something that isn't easy in the other languages.

A brief summary

Wrapping semantics offer the least protection against exploits, but work well with types of mixed signedness. Both Rust and Zig have some checks, but against undiscovered exploits in the wild they leave the protection off, trading safety for speed. Swift is the most consistent in preventing overflows, but like all the languages except Go and C it works poorly with mixed signedness in operands.

Ideally we would like both good safety and good mixed-signedness ergonomics, but none of the languages above provides this combination. Also, trapping is not always desired, especially for unsigned integers, as it sometimes gives unexpected traps when intermediate values underflow even though the expression as a whole is valid. Such traps can in themselves be used for DoS attacks in languages like Swift, while worse exploits are potentially possible in languages with undefined behaviour.

Looking for solutions

Trapping overflow is the last line of defense against exploits. Since the result of a trap is often a panic, the best one can achieve is a crash. In testing this is useful for finding bugs, but in production we only gain something if the overflow would have led to something worse than crashing. As mentioned above, traps in unsigned trapping arithmetics might actually be false errors, creating unnecessary opportunities to crash an application.

In an ideal situation without any constraints, we would prefer to use infinitely ranged integers to perform our calculations and then trap only if the result does not fit in the target type. Unfortunately that is not practical in most cases.

One mitigating strategy against intermediate underflow could be to promote all unsigned operands to a signed integer of twice the bit width; for example, a u32 would be evaluated as an i64. This would also allow us to safely mix signed and unsigned arithmetics in a natural way, working like the "Widen and narrow" strategy discussed for mixed arithmetics. As we recall, the downsides are the need for a signed maximum type and doing arithmetics with a wider type.

With addition and subtraction we can then be sure that reordering won't affect the result:

unsigned a, b, c, d, e, f;
...
// All the following would then behave
// identically regardless of the values of a, b and c.
d = a + b - c;
e = a + (b - c);
f = (a + b) - c;

The actual trap here happens during the implicit narrowing before the assignment.

If we desire trapping in release builds, the cost can be prohibitive, even when it is just a single "branch on overflow" jump. Here the "As-if Infinitely Ranged" (AIR) integer model offers a way to get cheap release-build traps. The underlying idea is that overflows do not need to be detected immediately. For example, in d = a + b + c we only need to check the overflow flag before the assignment. In this case the number of checks is cut in half, but even greater savings are possible. The paper reports an approximate slowdown of around 6% for a simple implementation.

What if all integers wrapped? A non-trivial number of errors in C are due to developers erroneously assuming wrapping. While wrapping allows under-checked parameters to cause follow-up errors, the behaviour is well understood and matches the underlying hardware. To simplify certain need-to-be-safe expressions without having to use functions, it's possible to introduce trapping operators, e.g. a ~+ b, or checked expressions, e.g. checked(a + b). Operators in particular avoid the problem where unexpected trapping happens for intermediate results.

Adding implicit conversions

Zig and C both have implicit conversions. While C has pervasive implicit conversions, Zig only allows "safe" widenings that preserve all values. Other languages use explicit casts.

If we decide on a solution that sometimes does implicit widening, such as the idea of widening unsigned types to a signed type before arithmetics, then it is fairly natural to combine this with implicit widening in general, and possibly even implicit narrowing with checks.

Because the sizes matter for trapping purposes, it is likely better to always require explicit widening unless all operands are promoted to the same type.

char a = 16;
// Surprising if a * a would trap on
// 8-bit multiplication overflow
int b = a * a;

It is possible, however, to use the left hand side for implicit operand widening rather than just considering the other operand. Zig could likely gain a lot by adopting such a scheme.

If integers instead wrap, then converting between unsigned and signed is safe, and implicit widening is largely safe. Even though wrapping integers allow lossless conversion between signed and unsigned, it is still useful to clearly indicate which types are used in expressions like (a + b) / (c + d) where types are mixed. Writing (a + (unsigned)b) / (c + (unsigned)d) clearly shows the intent, unlike a solution that implicitly picks signed or unsigned by default.

Summing it up

There are various ways to go about overflow trapping and mixed-type arithmetics. In general we can split cleanly between "wrap by default" and "trap by default". Even within each group there is a wide range of solutions, each with its own set of advantages and drawbacks. Picking a strategy is by necessity a trade-off, not only in runtime cost but also in what sorts of vulnerabilities remain.

Hopefully this article has provided a rough overview of the problem space as well as some novel ideas that can be explored further, such as the maxint promotion.

Comments


Comment by Christoffer Lernö

Last post about arithmetics I listed the following problems related to overflow and arithmetics:

  1. Overflow traps on unsigned integers makes the mixed signedness case very hard to get right.
  2. Overflow traps cause unexpected and hard-to-spot non commutative and associative behaviour in expressions that otherwise would have been fine.
  3. Wrapping behaviour enables buffer overflows and similar exploits.
  4. In most languages it doesn't matter if the left hand sign can contain the number, what matters are the types on the right hand side, which isn't always intuitive or desired.

Let's look at some attempts to fix these problems, starting with handling mixed signedness.

Mixed signedness

As we saw in the previous post, mixed signedness is not trivial. The choices made by C are actually fairly reasonable if the idea is to make most situations "just work" as the user intended. However whatever the solution, they all have their own set of problems. Looking at these first will allow us to understand ramifications of other choices later on.

1. "Widen and narrow"

This method relies on widening the operands to a common super type. Let us look at our old example:

unsigned x = 10;
int change = -1;
x = x + change;

The idea is that we promote all operands to the biggest type that can contain both operands, in this case it's likely a long long and perform the operation using that type. The trap is then in an implicit cast back to the original size.

The code above would implicitly be turned into:

unsigned x = 10;
int change = -1;
long long _temp = (long long)x + (long long)change;
if (_temp < 0 || _temp > UINT_MAX) panic();
x = (unsigned)_temp;

This strategy works, with the caveat that we need to always be able to find a wider type in order to do this widening.

In a language like Zig, which allows arbitrarily wide operands, this is not a problem but if there are a fixed number of types – what do we do? In C# the solution is basically to say that this works up to 32 bits, but signed and unsigned 64 bit ints cannot be combined, even though unsigned 64 bit integers should be fairly common.

If we insist that all integer types need to have an unsigned counterpart then we either need to stop at some arbitrary place - like C# does – or provide arbitrarily wide integers. Another solution is to provide a signed "maxint" that has no unsigned counterpart.

Pros:

  • Conceptually simple.

  • unsigned + signed has a definite type.

Cons:

  • Widening may have a performance cost.

  • The listed strategies to pick a wider integer (C# / Zig / maxint) all have drawbacks.

2. "Prefer signed"

This method relies on always converting to the signed type, given two equal types:

The code above would implicitly be turned into:

unsigned x = 10;
int change = -1;
// Possibly insert trap here:
// if (x > INT_MAX) panic();
int _temp = (int)x + change;
if (_temp < 0) panic();
x = (unsigned)_temp;

If signed addition overflow is a trapping error we need to have the signed int overflow check on x or otherwise it would be possible to add a large unsigned to a positive signed and get a negative result which would be inconsistent. However, if signed overflow is no error then the unsigned check isn't needed.

Unfortunately, with the overflow check we essentially forbid the unsigned value from having the top bit set which both adds add gotcha as well as making it unusable in some cases.

Pros:

  • Conceptually simple.

  • Definite result type.

  • Efficient.

Cons:

  • Broken if signed overflow is an error.

3. "Prefer unsigned"

This is what C does. With wrapping unsigned arithmetics this often yields the expected result, but there are counter examples such as:

unsigned x = 2;
int y = -100;
unsigned z = 10
if (z < x + y) unexpectedCall();

In the above code, the right hand side becomes a large number as y is converted into a large unsigned number. Division has a similar issue as unsigned and signed division works differently.

If conversion of the signed number instead traps on negative numbers, we're forced to cast, but even that won't help if unsigned overflow is an error.

Pros:

  • Fairly simple.

  • Efficient.

Cons:

  • Does not work if unsigned overflow is an error.

  • The unsigned result may give unexpected results when it is used for comparison or division.

  • The resulting type might not always be obvious.

4. "Use a function"

This solution instead provides functions for various combinations, it might look like this:

unsigned x = 10;
int change = -1;
x = addSignedToUnsignedWithTrap(x, change);

How the function works doesn't matter to us here, we just know it will trap if the addition didn't work and give us the correct answer otherwise.

Pros:

  • Well defined and easy to understand.

  • Definite result type.

Cons:

  • Clunky.

  • Programmers might look for easier but "wrong" solutions.

5. "Require explicit casts"

A common solution is to require explicit casts, but as we see above in the "Use unsigned" and "Use signed" variants this doesn't necessarily work if there are overflows traps. Worse, it might potentially obscure issues:

unsigned x = 10;
int y = -50;
unsigned z = x + (unsigned)y; // No unsigned overflow, so ok?

Here the problem isn't that we're casting a negative value to a unsigned one – this is actually in many cases what we want. It's that we can't determine when the addition underflows and when it doesn't.

Since explicit casts encompass all of the strategies 1-3, we don't need to tie us to a particular (possibly inferior) solution. On the other hand it may take quite a bit of effort and deep understanding of the possible issues to do "the right thing". The whole idea behind making more overflow trapping is to make the default more safe. If that is the intention, then forcing the programmer to pick just the right set of casts to avoid bugs and potential attack vectors seem inconsistent.

Pros:

  • Well defined and easy to understand.

  • Definite result type.

  • Free to pick the optimal strategy.

Cons:

  • Does not work well when overflow is an error.

  • May implicity remove safeguards.

  • Picking the optimal set of casts is not always obvious.

5. "Mixed signedness type"

A more utopian idea is for unsigned/signed mixes to have no definite type, instead being inferred or creating multiple code paths depending on the result. The complexity and unpredictability, in particular in non-trivial examples makes it hard to view it as a reasonable solution:

unsigned x = ...;
int y = ...;
z = (x + y) / (x - y); // Signed or unsigned division here?

The forgotten left hand side

As an example I mentioned that Zig currently uses the binary operand types when resolving arithmetics, so for example this will overflow:

var num1 : i8 = 16;
var num2 : i32 = 3;
var res : i32 = num2 + num1 * num1; // Boom!

It might feel odd that Zig chose this behaviour, but it's not strange. The reason we don't have this is because C makes an implicit widening to int for all operands first. This is why adding two ints and assigning it to a long long in C cannot result in a value larger than INT_MAX, even though the long long might be able to contain several billion times as big.

The only difference between C and Zig is that we run into this problem more often in the latter. Rust also allows i8 * i8, but there is no implicit widening, so it's more obvious as to what's happening.

Left hand side widening

A simple algorithm for widening using the left hand side, given that all operands have the same signedness is the following:

  1. Store the left hand side type, and assign this as the target type.
  2. Depth first, look at the first leaf operand, which is any operand that already has a definite type (e.g. variable, explicit cast, call). If the type is larger than the target type this is a type error. If it is smaller, promote it to the target type.
  3. Repeat on all leaf nodes.
  4. Derive types of the branch nodes.
  5. At this point all branch nodes should have the same type as the target type already (or there was a type error).

This algorithm can be modified to handle implicit conversion of mixed signedness as well.

If implicit widening is used, then it is probably a good idea to widen using the left hand side regardless of whether the overall behaviour is trapping or not.

Reviewing the options

If we go back and look at overflow in general, it's instructive to look at what some other languages do:

C

  • Implicit casts between all integer types.

  • Implicit widening to int before arithmetics.

  • Unsigned arithmetic wraps.

  • Signed overflow is undefined behaviour.

Rust

  • No implicit casts.

  • Trap in debug on unsigned overflow, wrap in release.

  • Trap in debug on signed overflow, wrap in release.

Go

  • No implicit casts.

  • Wrapping unsigned overflow.

  • Wrapping signed overflow.

Zig

  • Only safe, widening casts.

  • Trap in debug on unsigned overflow, UB in release.

  • Trap in debug on signed overflow, UB in release.

Swift

  • No implicit casts.

  • Trap on unsigned overflow.

  • Trap on signed overflow.

Downsides

The problems with C's system is well known: the lack of trapping overflow together with its lack of bounds checking enables buffer overflows and other attacks. This is made worse by the undefined behaviour on signed integers. While C compilers often give wrapping semantics in unoptimized builds, optimized builds may assume overflows never happen, giving different behaviour with optimization on. Implicit casts means that potentially harmful truncating casts are extremely difficult to spot. With UB sanitation it's easy to turn such UB into traps, giving similar to checks as Rust or Zig, but only for signed overflow.

On the other hand, C mostly preserves associativity and commutativity, and has few problems working with mixed signedness. The implicit widening means that overflow rarely happen in normal cases. Usually C "just works" – except when it doesn't.

With languages that trap, such as Rust, Swift and Zig we are much more likely to run into trapping behaviour for intermediate values. For Zig in particular this is problematic, since there is implicit widening and it's not obvious that the computation on the right hand side might use a much smaller bit width than the left hand side would indicate. In Rust and Swift at least it's obvious where the widening occurs. On the other hand, it can be argued that requiring explicit widening where it is safe is fairly noisy. If Zig would use the left hand side to widen all operands before calculation, the problem with implicit widening would largely go away.

Both Rust and Zig restricts overflow trapping to debug versions. While fuzz testing and regular testing is likely to use the debug version, they're not protected where it is the most sensitive: in production. Zig's choice of UB is daring – the behaviour of overflow bugs will be hard to predict.

Go with its wrapping behaviour is lacking the tools to check overflow with normal operators, but at the same time this means that like Swift the code is guaranteed to work the same regardless of debug or release mode. We can also safely use casts to use signed and unsigned numbers together, something that isn't easy in the other languages.

A brief summary

Wrapping semantics offers the least protection against exploits, but works well with types of mixed signeness. Both Rust and Zig has some checks, but against undiscovered exploits in the wild they leave the protection off, trading safety for speed. Swift is the most consistent in preventing overflows, but like all the languages except for Go and C works poorly with mixed signedness in operands.

Ideally we would like both good safety and good mixed signedness use, but none of the languages above is able to provide this combination. Also, trapping is not always desired, especially for unsigned integers as this sometimes give unexpected traps when intermediate values underflow even though the expression as a whole is valid. Triggering such traps can in themselves be used to trigger DoS attacks in languages like Swift, while worse exploits are potentially possible in languages with undefined behaviour.

Looking for solutions

Trapping overflow is the last line of defense against exploits. Since the result of a trap often is a panic, the best one can achieve is a crash. In test, this is useful for finding bugs, but when used "in production", we only gain something if the overflow is leading to something worse than crashing. As mentioned above, overflow in unsigned trapping arithmetics might actually be false errors, creating unnecessary possibilities of crashing an application.

In an ideal situation without any constraints, we would prefer to use infinitely ranged integers to perform our calculations, and then only trap if the the result does not fit in the target type. But unfortunate it is not practical to do so in most cases.

One mitigating strategy against intermediate underflow could be to promote all unsigned operands to a signed integer twice the bit width. For example an u32 could be evaluated as i64. This would also allow us to safely mix signed and unsigned arithmetics in a natural way, working like the "Widen and narrow" strategy discussed for mixed arithmetics. As we recall the downside is the need for a signed maximum type and doing arithmetics with a wider type.

With addition and subtraction we can be sure that reordering won't affect the result.

unsigned a, b, c, d, e, f;
...
// All the following would then behave
// identical regardless of values of a, b and c.
d = a + b - c;
e = a + (b - c);
f = (a + b) - c;

The actual trap here happens during the implicit narrowing before the assignment.

If we desire trapping in release builds, such can be prohibitive, even when it is just a single "branch on overflow" jump. Here the "As-if Infinitely Ranged Integer" Model offers a way for cheap release build traps. The underlying idea is that overflows do not immediately need to be detected. For example, in d = a + b + c we only need to check the overflow flag before the assignment. In this case the number of checks are cut in half, but even greater savings are possible. The paper reports an approximate slowdown around 6% for a simple implementation.

What if all integers wrapped? There are a non-trivial amount of errors in C that are due to developers erronously assuming wrapping. While wrapping allows under-checked parameters to cause follow up errors, the behaviour is well understood and matches the underlying hardware. To simplify certain need-to-be-safe expressions, without having to use functions it's possible to introduce trapping operators e.g. a ~+ b, or checked expressions e.g. checked(a + b). Operators in particular avoid the problem where unexpected trapping happens for intermediate results.

Adding implicit conversions

Zig and C both have implicit conversions. While C has pervasive implicit conversions, Zig only allows "safe" widenings that preserves all values. Other languages use explicit casts.

If we decide to use a solution that sometimes do implicit widenings, such as the idea of to widen unsigned types to a signed before arithmetics, then it is fairly natural to combine this with implicit widening and even possibly implicit narrowing with tests.

Because the sizes matter for trapping purposes, it is likely better to always require explicit widenings unless all operands are promoted to the same type.

char a = 16;
// Surprising if a * a would trap on
// 8 multiplication overflow
int b = a * a;

It is possible however to use the left hand side for implicit operand widening rather than just considering the other operand. Zig could likely gain a lot by adopting such a scheme.

If instead integers wrap, then converting between unsigned and signed is safe, and implicit widening is largely safe. Even though wrapping integers allow lossless conversion between signed and unsigned, it is still useful to clearly indicate what sort of types are used in expressions like (a + b) / (c + d) where types are mixed. Writing (a + (unsigned)b) / (c + (unsigned)d) clearly shows the intent, unlike a solution that implicitly picks signed or unsigned by default.

Summing it up

There are various ways to go about overflow trapping and mixed-type arithmetic. In general we can split solutions cleanly between "wrap by default" and "trap by default". Even within each subgroup there is a wide range of solutions, each with its own set of advantages and drawbacks. Picking a strategy is by necessity a trade-off, not only in runtime cost but also in what sort of vulnerabilities each remains susceptible to.

Hopefully this article has provided a rough overview of the problem space as well as some novel ideas that can be explored further, such as the maxint promotion.

On arithmetics and overflow

Originally from: https://c3.handmade.network/blog/p/7640-on_arithmetics_and_overflow

It is generally understood that overflowing an add or a multiplication is usually a bad thing. This seems to imply that the solution is to detect and trap (quit the program or throw an exception) on such errors. But as we will see, the answer isn't that clear cut.

Commutative and associative addition?

Typically when we work with integers, we prefer that the ordering of the operands doesn't matter. For example, if we see a + b + c we'd prefer that (a + b) + c is the same as a + (b + c).

Unfortunately, if we trap for overflow this does not hold. Here is an example:

int a = INT_MAX;
int b = 20;
int c = -20;
int d = a + b + c;

If we trap on overflow then (a + b) + c would trap, but a + (b + c) would not. Oops.
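The asymmetry can be made concrete. This sketch assumes GCC or Clang for the `__builtin_add_overflow` intrinsic; the helper names are invented for illustration:

```c
#include <limits.h>
#include <stdbool.h>

/* (a + b) + c with trapping semantics: returns false if any step
   would have trapped on overflow. */
static bool trapping_add3_left(int a, int b, int c, int *out)
{
    int t;
    return !__builtin_add_overflow(a, b, &t)
        && !__builtin_add_overflow(t, c, out);
}

/* a + (b + c) with trapping semantics. */
static bool trapping_add3_right(int a, int b, int c, int *out)
{
    int t;
    return !__builtin_add_overflow(b, c, &t)
        && !__builtin_add_overflow(a, t, out);
}
```

With a = INT_MAX, b = 20, c = -20, the left-associated version reports an overflow on the intermediate a + b, while the right-associated one never leaves the int range.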

In C the former is even undefined behaviour. Let's pick an unsigned example:

unsigned a = UINT_MAX;
unsigned b = 20;
unsigned c = 20;
unsigned d = a + b - c;

Because unsigned overflow wraps in C, this is well defined and gives the expected result. In languages where even unsigned overflow is trapped, such as Swift, this will trap or not trap depending on evaluation order. For unsigned arithmetic we can also construct a common overflow (sometimes called an underflow) via a possibly negative intermediate value:

unsigned a = 20;
unsigned b = 0;
unsigned c = 20;
unsigned d = a + b - c;

In this case the overflow happens if we evaluate a + (b - c). Again this is not a problem in C, but it will be a problem in a language that traps the overflow.

Trapping overflow fixes actual bugs

So is trapping overflow bad? If it creates subtle problems (particularly in C, where signed overflow is undefined behaviour), shouldn't we always wrap?

Again, the problem is not as simple as that. With trapping overflow we catch this exploit:

char *ptr = malloc(a * b);
...
ptr[a * i + j + offset] = some_value;

We can see here that if there is no trap on overflow, we can pick a and b so that the allocated size overflows into a small value, after which some_value will actually be written out of bounds.

(This is from an actual exploited overflow in the wild.)

Some people will point out that with proper bounds checking then the exploit cannot occur either. But that relies on proper bounds being known. It's probably possible to rewrite the code in the example to use slices (pointer + length) with bounds checking, but in general we can't rely on that to fix all the various overflows.
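The standard mitigation for this class of bug is to check the multiplication before allocating. A sketch, assuming GCC/Clang's `__builtin_mul_overflow`; `malloc_array` is a hypothetical helper, not a standard function:

```c
#include <stddef.h>
#include <stdlib.h>

/* Overflow-checked allocation: refuse the allocation instead of
   letting a * b silently wrap to a small size, which is the root of
   the malloc(a * b) exploit described above. */
void *malloc_array(size_t a, size_t b)
{
    size_t bytes;
    if (__builtin_mul_overflow(a, b, &bytes))
        return NULL; /* a * b would wrap: reject rather than under-allocate */
    return malloc(bytes);
}
```

This is essentially what calloc is required to do internally for its two size arguments.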

A detour: mixed signedness

A quick question: "What should the type of an unsigned int added to a signed int be?"

For C the answer is "an unsigned int", in C# the answer is "a long".

The C answer seems bad – surely we want something signed to be safe? But there is a reason for this apparent madness:

unsigned a = INT_MAX + 10U;
int b = -10;
...
int d = a + b;

In the example above, is the result well defined? – Well, yes it is! All the operands are converted to unsigned, wrapping addition is performed, and the result is then implicitly converted back to a signed int.

What would have happened if C instead had cast the unsigned operand to a signed integer? Well, INT_MAX + 10 wraps to a large negative number; we then subtract 10 from that value, producing a value less than INT_MIN – which in C is undefined behaviour due to the overflow.

Because signed ints have undefined behaviour for overflow, it's safer to cast to unsigned in C. Implicit conversion between signed and unsigned representations means that the code looks very simple and is mostly right, even though it does clever things.
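A small program confirming that the wrap-and-wrap-back really does produce the right answer, assuming 32-bit int and unsigned (mixed_add is just an illustrative wrapper around the example above):

```c
#include <limits.h>

/* Reproduces the example: b converts to a huge unsigned value, the
   addition wraps, and the two wraps cancel out, so the final value
   is exactly INT_MAX. Assumes 32-bit int/unsigned. */
int mixed_add(void)
{
    unsigned a = (unsigned)INT_MAX + 10U; /* deliberately > INT_MAX */
    int b = -10;
    /* b converts to UINT_MAX - 9; the add wraps back down to INT_MAX. */
    return (int)(a + b);
}
```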

So what about languages with trapping overflow?

The usual case is that such languages require explicit casts. Let's try that:

unsigned a = INT_MAX + 10U;
int b = -10;
...
int d = (int)a + b; // BOOM!

Ok, that didn't work. What about this:

unsigned a = INT_MAX + 10U;
int b = -10;
...
int d = (int)(a + (unsigned)b); // Also BOOM!

In the second case the unsigned version of b becomes a large positive number, causing the unsigned add to overflow as well.

If you think about it this is quite natural: unsigned maths uses two's complement representation and wrapping overflow to express negative numbers, so without wrapping behaviour, adding negative numbers can't work.

Let's look at another common example, now with unsigned:

unsigned x = 10;
int change = -1;
x = x + change;

Again this "just works": C gladly converts change to an unsigned value even though it's negative, and the wrapping addition produces the expected result. With trapping unsigned overflow it again won't work.

Let's look at how we currently could solve this in Zig:

var x : u32 = 10;
var change : i32 = -1;
x = x +% @bitCast(u32, change);

Let's break it down: @bitCast is needed to convert the i32 in the same way (unsigned)change would do in C.

Now once we have this two's complement unsigned value, we request wrapping addition using the +% operator (Zig has wrapping counterparts for all arithmetic operators).

Problem solved!

... Or wait? What happened to our overflow check? For example, we could set x = 0 and change = -1 and this would happily wrap around without batting an eye.

The real solution that works and traps overflow looks like this:

x = @intCast(u32, @as(i64, x) + @as(i64, change));

So we first widen to the larger type (i64), perform the add, then try to narrow the result to a u32, catching any negative value or any value that exceeds the maximum of u32.
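The same widen-check-narrow pattern written out in C, with a hypothetical checked_mixed_add helper standing in for the Zig expression:

```c
#include <stdint.h>
#include <stdlib.h>

/* C version of the Zig widen-then-narrow pattern: perform the
   mixed-sign add in i64, where it cannot overflow, then range-check
   the result before narrowing back to u32. */
uint32_t checked_mixed_add(uint32_t x, int32_t change)
{
    int64_t wide = (int64_t)x + change; /* exact: no overflow possible */
    if (wide < 0 || wide > (int64_t)UINT32_MAX)
        abort(); /* the trap the wrapping +% version silently skipped */
    return (uint32_t)wide;
}
```

With x = 0 and change = -1 this aborts, which is exactly the check the +% / @bitCast version lost.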

Now we've come full circle:

  1. Introduce trapping "by default" so it's easy to do the right thing.
  2. Realize that some cases need to circumvent the trapping to be correct.
  3. Arrive at a "correct" solution that is so complicated that people can't be expected to consistently follow it.

Enter the left hand side

Unfortunately we're not done yet listing the problems, what about this:

int a = INT_MAX;
long long b = a + a / 2;

Is that UB, assuming long long is larger than int? In C it sure is. What the person writing this probably wanted was the following:

int a = INT_MAX;
long long b = (long long)a + (long long)a / 2;

Mostly people don't run into this due to C's promotion rules. For example:

short a = 0x7FFF;
int b = a + a / 2;
This will do "the right thing" on all platforms where int is at least 32 bits. This is because C implicitly promotes the operands to int before performing any arithmetic operation.
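A quick illustration of the promotion at work in C, using int8_t to mirror the Zig i8 below (promoted_square_plus is an illustrative name):

```c
#include <stdint.h>

/* In C the int8_t operands are promoted to int before the multiply,
   so 16 * 16 is computed as 256 in int arithmetic and nothing
   overflows. A language that multiplies at native i8 width would
   overflow on the same inputs. */
int promoted_square_plus(int8_t n, int32_t add)
{
    return add + n * n; /* n promoted to int: 16 * 16 == 256 */
}
```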

But as demonstrated by the examples above, that only gets you so far. There is a recent language trend to perform arithmetic at the native bit width of the operands, leading to new unexpected results (this example is Zig):

var num1 : i8 = 16;
var num2 : i32 = 3;
var res : i32 = num2 + num1 * num1; // BOOM!

Here the multiplication is performed using signed 8 bits, which will overflow on 16 * 16, even though the result would later have been promoted to i32 for the addition.

These are hard-to-spot errors that yield runtime failures while looking fine to the compiler.

It would be reasonable that the left hand side guides the type used on the right hand side in the assignment, but that adds complexity that few languages want to take on.

Summary

  1. Overflow traps cause unexpected and hard-to-spot non-commutative and non-associative behaviour in expressions that otherwise would have been fine.
  2. Wrapping behaviour enables buffer overflows and similar exploits.
  3. Overflow traps on unsigned integers make the mixed-signedness case very hard to get right.
  4. In most languages it doesn't matter whether the left-hand side can contain the number; what matters are the types on the right-hand side, which isn't always intuitive or desired.

In a following post I'll investigate ways to try to improve on the status quo.

Comments



Comment by Christoffer Lernö

notnullnotvoid

there are no signed or unsigned int types, just int

So like Java? There are obvious advantages to this, but it also means that some arithmetic must be done on signed values even though the values are conceptually unsigned. This is why Java has two right-shift operators (>> and >>>) – so that you can emulate unsigned types.

all basic operators convert the arguments to CPU-word-sized ints, do signed arithmetic on them, and trap on overflow

There are often two useful sizes on a 64 bit CPU: 32 and 64 bit. Using 32 bits we gain double the throughput compared to 64 on vectorization. Which one would you pick?

If there is no trapping overflow, or the top bit is not set, then we can pretend unsigned and signed are interchangeable. If neither holds, we run the risk of triggering an overflow trap on add, so we can't use signed values as if they were unsigned. For this reason Java specifies wrapping behaviour for its integers. So basically, if we have trapping overflow we cannot get by with only signed: either remove the trap or add an unsigned type.


Comment by Christoffer Lernö

notnullnotvoid
Nope. The whole point is that you actually don't need signed and unsigned types, only signed and unsigned operations. If you read my original post, you will see that I am not suggesting having signed ints!

If you do not have signedness, how can you determine trapping in addition, subtraction and multiplication? The equivalence between signed and unsigned in two's complement is limited to non-trapping operations. Also, signed and unsigned division work differently, so for division you explicitly need to know the type. Or do you propose that the programmer would have two division operators? That seems a bit onerous.

Maybe you need to provide some examples, because I have some difficulties understanding how your proposal could work.

In regards to vectors, I'm talking about auto-vectorization done by optimizing compilers. It is guided by the semantics the backend can infer, and if it is hard to determine such a width, the optimization won't happen. That is what I'm pointing at.


Comment by Christoffer Lernö

Do you propose something like a +signed b? Maybe you could show an example.

Implementing "defer"

Originally from: https://c3.handmade.network/blog/p/7641-implementing_defer

The defer statement is going mainstream. Go has its own special defer which only fires at function end; otherwise defer has consistent "execute at scope end" semantics – Swift, Zig, Jai, Nim and Odin all use defer in this manner.

The problems with implementing defer are similar to those of implementing destructors for stack-allocated objects in C++, although there the presence of virtual functions complicates things.

I couldn't find anyone describing how defer is done in other compilers so when working on a version of it for C2 I had to make it up as I went along.

For posterity's sake I thought it might be interesting to do a writeup on how defer was implemented.

Setting the rules

First up, there are many different possible rules to adapt defer to. The original draft of this article also handled goto across defers. C2 retains goto, and for a long time so did C3 – so this was important to make the article complete. However, goto adds much complexity to defer, which made the article both much longer and harder to follow.

For that reason we'll limit ourselves to return, continue and break, plus labelled versions of the latter two. If there is interest I can go into detail on how to add defer for goto in another article.

Handling early exit

The first issue in defer is the early exit:

void test()
{
  defer printf("A");
  if (rand() % 2 == 0) return;
  printf("B");
}

Every return needs to inline the defer at the end, so this is lowered to:

void test()
{
  if (rand() % 2 == 0) 
  {
    printf("A");
    return;
  }
  printf("B");
  printf("A");
}

For break and continue this is handled similarly to return, but only some of the defers may be inlined at the point of the break:

void test()
{
  defer printf("A");
  while (true)
  {
    defer printf("B");
    {
      defer printf("C");
      if (rand() % 2 == 0) break;
    }
  }
}

And the inlined version:

void test()
{
  while (true)
  {
    {
      if (rand() % 2 == 0) 
      {
        printf("C");
        printf("B");
        break;
      }
      printf("C");
    }
    printf("B");
  }
  printf("A");
}

We also have the labelled version of break. (I'll stick to the Java-style labelled break syntax here, even though C3 uses a different variant.)

void test()
{
  defer printf("A");
  FOO: while (true)
  {
    defer printf("B");
    while (true)
    {
      defer printf("C");
      if (rand() % 2 == 0) break FOO;
    }
  }
}

This is again lowered to:

void test()
{
  FOO: while (true)
  {
    while (true)
    {
      if (rand() % 2 == 0) 
      {
        printf("C");
        printf("B");
        break FOO;
      }
      printf("C");
    }
    printf("B");
  }
  printf("A");
}

So as we can see, it is sufficient to keep a list of the defers and then inline the defer statements in reverse order wherever we encounter a break, continue or return.
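The bookkeeping described above can be sketched as a plain linked-list walk. The struct and function names below are illustrative, not the actual compiler's:

```c
#include <stddef.h>

/* Each defer points back at the previously active one, so emitting
   the defers for an exit point is just walking the chain until we
   reach the defer that was active at the jump target. */
typedef struct Defer {
    char label;               /* stands in for the deferred statement */
    struct Defer *prev_defer; /* previously active defer, or NULL */
} Defer;

/* Walk from `current` (the exit point) back to `last` (the jump
   target), recording the emit order of the deferred statements. */
static int emit_defers(Defer *current, Defer *last, char *out)
{
    int n = 0;
    while (current != last) {
        out[n++] = current->label;
        current = current->prev_defer;
    }
    out[n] = '\0';
    return n;
}
```

With a chain A <- B <- C, a return at C emits "CBA", while a break whose target scope already had A active emits only "CB" – exactly the partial inlining seen in the lowered examples.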

Putting it all together

So now we've listed all the things we need to solve. How do we put it together? Here's the algorithm I used:

  1. For each defer, keep a pointer back to the previously active defer.
  2. For each dynamic scope, keep track of the current defer.
  3. On break/continue/return AST nodes, store 2 AST nodes.
  4. On each scoping AST node (while, for, compound statement etc) store 2 AST nodes.

Algorithm

  1. Set current_scope->current_defer = NULL
  2. Traverse the AST-tree.
  3. When pushing a new dynamic scope:

current_scope->current_defer = prev_scope->current_defer;

  4. When encountering a defer:

defer->prev_defer = current_scope->current_defer;
current_scope->current_defer = defer;

  5. When encountering a scoped statement:

scoped_stmt->start_defer = current_scope->current_defer;
push_new_current_scope();
... recursively process nodes ...
scoped_stmt->end_defer = current_scope->current_defer;
pop_current_scope();

  6. When encountering a break or continue:

Ast* target_ast = find_target(break_stmt);
break_stmt->end_defer = current_scope->current_defer;
break_stmt->start_defer = target_ast->start_defer;

  7. When encountering a return:

return_stmt->defer = current_scope->current_defer;

This results in us being able to use each defer as the top of a linked list:

current_defer

current_defer->prev_defer

current_defer->prev_defer->prev_defer

NULL

Codegen is now easy.

We introduce a helper function to inline defers:

void codegen_defers(Defer *current_defer, Defer *last)
{
  while (current_defer != last)
  {
    codegen(current_defer);
    current_defer = current_defer->prev_defer;
  }
}
  1. When doing codegen for a scoped statement:

codegen(scoped_stmt->inner_stmt);
codegen_defers(scoped_stmt->end_defer, scoped_stmt->start_defer);

  2. When doing codegen for break or continue:

codegen_defers(break_stmt->end_defer, break_stmt->start_defer);
codegen_break(break_stmt);

  3. When doing codegen for return:

codegen_defers(return_stmt->defer, NULL);
codegen_return(return_stmt);

Going further

Ok, so now we're done? Not quite – if we want to go beyond C syntax, we can imagine something looking a bit like this:

if (File *f = getFile(), defer close(f)) { ... }

In this case we actually have two scopes: one inner scope (between {}) and the outer one that starts in the conditional.

The principle is the same so we can reuse the same solution as above, but it's worth taking note of this case.

Defer after if

We have other questions to answer as well. What does this code do:

if (x == 0) defer printf("x was 0\n");

Some people have suggested that this should be treated as:

defer
{
  if (x == 0) printf("x was 0\n");
}

I am strongly against that idea, as it would mean that compound statements suddenly have a different meaning than regular statements.

Defer as part of the function

Another interesting thing one can do with defer is the idea that a function may contain an implicit defer that is added to the scope which invokes it. Odin has that feature using "deferred attributes" (see further down from this link). This is simple to tie into the defer machinery.

Defers & goto

Handling goto with defers is a bit more complicated, as one needs to conditionally invoke defers:

void test(int x)
{
  if (x > 0) goto FOO;
  // When is this called?
  defer printf("A");
  FOO:
  printf("B");
}

The lowered code needs to look like this:

void test(int x)
{
  bool _defer_1 = false;
  if (x > 0) goto FOO;
  _defer_1 = true;
  FOO:
  printf("B");
  if (_defer_1) 
  {
    printf("A");
  }
}

Since goto can jump into scopes as well as out of scopes, this adds another dimension to the analysis. The solution is not hard, but it is definitely not as straightforward as the structured jumps of break and continue.

The non-local jumps of setjmp/longjmp are not possible to handle at all.

Go style defers

Go has a different style of defer. Go's defers store the deferred code as a closure that is queued and invoked at function end rather than at scope end. This means a defer actually needs to allocate memory for itself. A loop like this:

for() {
  ...
  defer ...
  ...
}

would queue up all the defers generated in the loop in a long list and release them at function end. If the defer is releasing something limited, like DB connections, then this is a bad idea. For various "gotchas" in Go due to this style of defer, see this.

While Go's defer works nicely with exceptions and goto, it has quite a few quirks, as well as the need to reserve memory to store the defers.
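Go's function-end queue can be mimicked with a simple LIFO structure; a fixed array stands in here for Go's heap-allocated closures, and all names are made up for illustration:

```c
/* Sketch of Go-style function-scoped defers: every defer in a loop
   is queued, and nothing runs until function end, in reverse order.
   A fixed array stands in for Go's heap-allocated closure list. */
enum { MAX_DEFERS = 16 };

typedef struct {
    int queued[MAX_DEFERS]; /* stand-ins for the deferred closures */
    int count;
} DeferQueue;

static void go_defer(DeferQueue *q, int id)
{
    q->queued[q->count++] = id; /* resource stays held until return */
}

/* Runs only at function end: pop in reverse (LIFO) order. */
static int run_deferred(DeferQueue *q, int *order)
{
    int n = 0;
    while (q->count > 0)
        order[n++] = q->queued[--q->count];
    return n;
}
```

Queuing three defers from a loop releases nothing until run_deferred is called, which is precisely why a Go-style defer holding a scarce resource inside a loop is problematic.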

Defers and errors

Sometimes one would prefer for defers to only occur on error:

File *getAndCheckFile()
{
   File *f = getFile();
   if (!f) return NULL;
   // We want to close if we return with error.
   defer close(f);
   if (!fileIsValid(f)) return NULL;
   if (readHeader(f) != 0xdeadbeef) return NULL;
   // oops, we will be closing f!
   return f;
}

For this reason Zig introduces errdefer, and C3 has defer catch / defer try statements.

Being able to cancel defers

As an alternative (and complement) to special forms of defer, one could allow defers to be cancelled. So far I've only seen this functionality implemented via RAII. Theoretically it could look something like:

File *getAndCheckFile()
{
   File *f = getFile();
   if (!f) return NULL;
   FOO: defer close(f);
   if (!fileIsValid(f)) return NULL;
   if (readHeader(f) != 0xdeadbeef) return NULL;
   undefer FOO;
   return f;
}

Summary

Defer is useful functionality for languages that lack both finally and RAII. With structured jumps it is straightforward to implement with zero overhead.


While Go defers work nicely with exceptions and goto, it has quite a bit of quirks as well as the need to reserve memory to store the defers.

Defers and errors

Sometimes one would prefer for defers to only occur on error:

File *getAndCheckFile()
{
   File *f = getFile();
   if (!f) return NULL;
   // We want to close if we return with error.
   defer close(f);
   if (!fileIsValid(f)) return NuLL;
   if (readHeader(f) != 0xdeadbeef) return NULL;
   // oops, we will be closing f!
   return f;
}

For this reason Zig introduces errdefer, and C3 has defer catch / defer try statements.

Being able to cancel defers

As an alternative (and complement) to special forms of defer is being able to cancel defers. So far I've only seen this functionality on defer implemented as RAII. Theoretically it could look something like:

File *getAndCheckFile()
{
   File *f = getFile();
   if (!f) return NULL;
   FOO: defer close(f);
   if (!fileIsValid(f)) return NuLL;
   if (readHeader(f) != 0xdeadbeef) return NULL;
   undefer FOO;
   return f;
}

Summary

Defer is useful functionality for languages that lack both finally and RAII. With structured jumps it is straightforward to implement with zero overhead.

On arithmetics and overflow

Originally from: https://dev.to/lerno/on-arithmetics-and-overflow-1kob

It is generally understood that overflowing an add or a multiplication is usually a bad thing. This seems to imply that the solution is to detect and trap (quit the program or throw an exception) on such errors. But as we will see, the answer isn't that clear-cut.

Commutative and associative addition?

Typically when we work with integers, we prefer that the ordering of the operands doesn't matter. For example, if we see a + b + c we'd prefer that (a + b) + c is the same as a + (b + c).

Unfortunately, if we trap for overflow this does not hold. Here is an example:

int a = INT_MAX;
int b = 20;
int c = -20;
int d = a + b + c;

If we trap on overflow then (a + b) + c would trap, but a + (b + c) would not. Ooops.

In C, the signed case (a + b) + c is even undefined behaviour. Let's pick an unsigned example:

unsigned a = UINT_MAX;
unsigned b = 20;
unsigned c = 20;
unsigned d = a + b - c;

Because overflow is wrapping in C, this is well defined and gives the expected result. In languages where even unsigned overflow is trapped, such as Swift, this will trap or not trap depending on evaluation order. For unsigned integers we can also construct a common overflow (sometimes called an underflow) by having a possibly negative intermediate value:

unsigned a = 20;
unsigned b = 0;
unsigned c = 20;
unsigned d = a + b - c;

In this case the overflow happens if we evaluate a + (b - c). Again this is not a problem in C, but it will be a problem if the language traps the overflow.

Trapping overflow fixes actual bugs

So is trapping overflow bad? If it creates subtle problems (in particular in C, where signed overflow is undefined behaviour), shouldn't we always wrap?

Again, the problem is not as simple as that. With trapping overflow we catch this exploit:

char *ptr = malloc(a * b);
...
ptr[a * i + j + offset] = some_value;

We can see here that if there is no trap on overflow, then we can pick a and b so that the allocated size overflows into a small value, and then some_value actually will be written out of bounds.

(This is from an overflow that was actually exploited in the wild.)

Some people will point out that with proper bounds checking then the exploit cannot occur either. But that relies on proper bounds being known. It's probably possible to rewrite the code in the example to use slices (pointer + length) with bounds checking, but in general we can't rely on that to fix all the various overflows.

A detour: mixed signedness

A quick question: "What should the type of an unsigned int added to a signed int be?"

For C the answer is "an unsigned int"; in C# the answer is "a long".

The C answer seems bad – surely we want something signed to be safe? But there's a reason for this apparent madness:

unsigned a = INT_MAX + 10U;
int b = -10;
...
int d = a + b;

In the example above, is the result well defined? Well, yes it is! All the operands are converted to unsigned, wrapping addition is performed, and the result is implicitly converted back to a signed int.

What would have happened if C instead had cast the unsigned operand to a signed integer? Well, INT_MAX + 10U converts to a large negative number; we then subtract 10 from that value, which results in a value less than INT_MIN, which in C is undefined behaviour due to the overflow.

Because signed ints have undefined behaviour for overflow, it's safer to cast to unsigned in C. Implicit conversion between signed and unsigned representations means that the code looks very simple and is mostly right, even though it does clever things.

So what about languages with trapping overflow?

The usual case is that such languages require explicit casts. Let's try that:

unsigned a = INT_MAX + 10U;
int b = -10;
...
int d = (int)a + b; // BOOM!

Ok, that didn't work. What about this:

unsigned a = INT_MAX + 10U;
int b = -10;
...
int d = (int)(a + (unsigned)b); // Also BOOM!

In the second case the unsigned version of b becomes a large positive number, causing the unsigned add to overflow as well.

If you think about it, this is quite natural: unsigned maths uses two's complement numbers and wrapping overflow to represent negative numbers, so without wrapping behaviour, adding negative numbers can't work.

Let's look at another common example, now with unsigned:

unsigned x = 10;
int change = -1;
x = x + change;

Again this "just works" since C gladly converts change to an unsigned value even though it's negative. With trapping overflow on unsigned, it again won't work.

Let's look at how we currently could solve this in Zig:

var x : u32 = 10;
var change : i32 = -1;
x = x +% @bitCast(u32, change);

Let's break it down: @bitCast is needed to convert the i32 in the same way (unsigned)change would do in C.

Now once we have this two's complement unsigned value, we request wrapping addition using the +% operator (Zig has wrapping counterparts for all arithmetic operators).

Problem solved!

... Or wait? What happened to our overflow check? For example, we could set x = 0 and change = -1 and this would happily wrap around without batting an eye.

The real solution that works and traps overflow looks like this:

x = @intCast(u32, @as(i64, x) + @as(i64, change));

So we first go to the larger type (i64), perform the add, then try to narrow that number to a u32, catching any negative numbers or numbers that exceed the maximum of u32.

Now we've come full circle:

  1. Introduce trapping "by default" so it's easy to do the right thing.
  2. Realize that some cases need to circumvent the trapping to be correct.
  3. Arrive at a "correct" solution that people are supposed to follow to do the right thing, but which is very complicated.

Enter the left hand side

Unfortunately we're not done listing the problems yet. What about this:

int a = INT_MAX;
long long b = a + a / 2;

Is that UB assuming long long is larger than int? In C, sure it is. What the person writing this probably wanted was the following:

int a = INT_MAX;
long long b = (long long)a + (long long)a / 2;

Mostly people don't run into this due to C's promotion rules. For example:

short a = 0x7FFF;
int b = a + a / 2;

This will do "the right thing" on all platforms where int is at least 32 bits, because C implicitly promotes both operands to int before performing any arithmetic operation.

But as the examples above demonstrate, that only gets you so far. There is a recent trend for languages to perform arithmetic at the operands' native bit width, leading to new unexpected results (this example is Zig):

var num1 : i8 = 16;
var num2 : i32 = 3;
var res : i32 = num2 + num1 * num1; // BOOM!

Here the multiplication is performed using signed 8 bits, which will overflow on 16 * 16, even though the result would later have been promoted to i32 for the addition.

These are hard-to-spot errors that yield runtime failures even though the code looks fine to the compiler.

It would be reasonable that the left hand side guides the type used on the right hand side in the assignment, but that adds complexity that few languages want to take on.

Summary

  1. Overflow traps cause unexpected and hard-to-spot non-commutative and non-associative behaviour in expressions that otherwise would have been fine.
  2. Wrapping behaviour enables buffer overflows and similar exploits.
  3. Overflow traps on unsigned integers make the mixed-signedness case very hard to get right.
  4. In most languages it doesn't matter whether the left hand side can contain the number; what matters are the types on the right hand side, which isn't always intuitive or desired.

In my next post I'll investigate ways to try to improve on the status quo.

Implementing "defer"

Originally from: https://dev.to/lerno/implementing-defer-3l76

The defer statement is going mainstream. Go has its own special defer which only fires on function end; otherwise defer has consistent "execute at scope end" semantics. Swift, Zig, Jai, Nim and Odin all use defer in this manner.

The problems with implementing defer are similar to those of implementing destructors for stack-allocated objects in C++, although there the presence of virtual functions complicates things.

I couldn't find anyone describing how defer is done in other compilers, so when working on a version of it for C2 I had to make it up as I went along.

For posterity's sake I thought it might be interesting to do a writeup on how defer was implemented.

Setting the rules

First up, there are many different possible rule sets for defer. The original draft of this article also handled goto across defers. C2 retains goto, and for a long time so did C3 – so this was important for completeness. However, goto adds so much complexity to defer that it made the article both much longer and harder to follow.

For that reason we'll limit ourselves to return, continue, break and labelled versions of the latter two. If there is interest I can go into details on how to add defer for goto in another article.

Handling early exit

The first issue in defer is the early exit:

void test()
{
  defer printf("A");
  if (rand() % 2 == 0) return;
  printf("B");
}

Every return needs to inline the defer at the end, so this is lowered to:

void test()
{
  if (rand() % 2 == 0) 
  {
    printf("A");
    return;
  }
  printf("B");
  printf("A");
}

For break and continue this is handled similarly to return, but only part of the defers may be inlined at the point of the break:

void test()
{
  defer printf("A");
  while (true)
  {
    defer printf("B");
    {
      defer printf("C");
      if (rand() % 2 == 0) break;
    }
  }
}

And the inlined version:

void test()
{
  while (true)
  {
    {
      if (rand() % 2 == 0) 
      {
        printf("C");
        printf("B");
        break;
      }
      printf("C");
    }
    printf("B");
  }
  printf("A");
}

We also have the labelled version of break. (I'll stick to the Java-style labelled break syntax here.)

void test()
{
  defer printf("A");
  FOO: while (true)
  {
    defer printf("B");
    while (true)
    {
      defer printf("C");
      if (rand() % 2 == 0) break FOO;
    }
  }
}

This is again lowered to:

void test()
{
  FOO: while (true)
  {
    while (true)
    {
      if (rand() % 2 == 0) 
      {
        printf("C");
        printf("B");
        break FOO;
      }
      printf("C");
    }
    printf("B");
  }
  printf("A");
}

So as we can see, it's sufficient to keep a list of the defers and then inline the defer statements in reverse order wherever we encounter a break, continue or return.

Putting it all together

So now we've listed all the things we need to solve. How do we put it together? Here's the algorithm I used:

  1. For each defer, keep a pointer back to the previously active defer.
  2. For each dynamic scope, keep track of the current defer.
  3. On break/continue/return AST nodes, store 2 AST nodes.
  4. On each scoping AST node (while, for, compound statement etc) store 2 AST nodes.

Algorithm

  1. Set current_scope->current_defer = NULL
  2. Traverse the AST-tree.
  3. When pushing a new dynamic scope
    current_scope->current_defer = prev_scope->current_defer;
    
  4. When encountering a defer:
    defer->prev_defer = current_scope->current_defer;
    current_scope->current_defer = defer;
    
  5. When encountering a scoped statement:
    scoped_stmt->start_defer = current_scope->current_defer;
    push_new_current_scope();
    ... recursively process nodes ...
    scoped_stmt->end_defer = current_scope->current_defer;
    pop_current_scope();
    
  6. When encountering a break or continue:
    Ast* target_ast = find_target(break_stmt);
    break_stmt->end_defer = current_scope->current_defer;
    break_stmt->start_defer = target_ast->start_defer;
    
  7. When encountering a return:
    return_stmt->defer = current_scope->current_defer;
    

This results in us being able to use each defer as the top of a linked list:

current_defer
   |
   v
current_defer->prev_defer
   |
   v
current_defer->prev_defer->prev_defer
   |
   v
  NULL

Codegen is now easy.

We introduce a helper function to inline defers:

void codegen_defers(Defer *current, Defer *last)
{
  while (current != last)
  {
    codegen(current);
    current = current->prev_defer;
  }
}

  1. When doing codegen for a scoped statement:
    codegen(scoped_stmt->inner_stmt);
    codegen_defers(scoped_stmt->end_defer, scoped_stmt->start_defer);
    
  2. When doing codegen for break or continue:
    codegen_defers(break_stmt->end_defer, break_stmt->start_defer);
    codegen_break(break_stmt);
    
  3. Codegen for return:
    codegen_defers(return_stmt->defer, NULL);
    codegen_return(return_stmt);

Going further

Ok, so now we're done? Not quite – at least not if we want to go beyond C syntax. We can imagine something looking a bit like this:

if (File *f = getFile(), defer close(f)) { ... }

In this case we actually have two scopes: one inner scope (between {}) and the outer one that starts in the conditional.

The principle is the same so we can reuse the same solution as above, but it's worth taking note of this case.

Defer after if

We have other questions to answer as well. What does this code do:

if (x == 0) defer printf("x was 0\n");

Some people have suggested that this should be treated as:

defer
{
  if (x == 0) printf("x was 0\n");
}

I am strongly against that idea, as it would mean that compound statements suddenly have a different meaning than regular statements.

Defer as part of the function

Another interesting thing one can do with defer is the idea that a function may contain an implicit defer that is added to the scope which invokes it. Odin has that feature using "deferred attributes" (see further down from this link). This is simple to tie into the defer machinery.

Defers & goto

Handling goto with defers is a bit more complicated, as one needs to conditionally invoke defers:

void test(int x)
{
  if (x > 0) goto FOO;
  // When is this called?
  defer printf("A");
  FOO:
  printf("B");
}

The lowered code needs to look like this:

void test(int x)
{
  bool _defer_1 = false;
  if (x > 0) goto FOO;
  _defer_1 = true;
  FOO:
  printf("B");
  if (_defer_1) 
  {
    printf("A");
  }
}

Since goto can jump into scopes as well as out of them, this adds another dimension to the analysis. The solution is not hard, but it is definitely not as straightforward as the structured jumps of break and continue.

The non-local jumps of setjmp/longjmp are not possible to handle at all.

Go style defers

Go has a different style of defer. Go's defers actually store the defer code like a closure that is queued and invoked at function end rather than at scope end. This means a defer actually needs to allocate memory for itself. A loop like this:

for() {
  ...
  defer ...
  ...
}

Would queue up all the defers generated in the loop in a long list and release them at function end. If the defer is releasing something limited, like db connections, then this is a bad idea. For various "gotchas" in Go due to this style of defer, see this.

While Go's defer works nicely with exceptions and goto, it has quite a few quirks, as well as the need to reserve memory to store the defers.

Defers and errors

Sometimes one would prefer for defers to only occur on error:

File *getAndCheckFile()
{
   File *f = getFile();
   if (!f) return NULL;
   // We want to close if we return with error.
   defer close(f);
   if (!fileIsValid(f)) return NULL;
   if (readHeader(f) != 0xdeadbeef) return NULL;
   // oops, we will be closing f!
   return f;
}

For this reason Zig introduces errdefer, and C3 has defer catch / defer try statements.

Being able to cancel defers

As an alternative (and complement) to special forms of defer, one could allow defers to be cancelled. So far I've only seen this functionality where defer is implemented on top of RAII. Theoretically it could look something like:

File *getAndCheckFile()
{
   File *f = getFile();
   if (!f) return NULL;
   FOO: defer close(f);
   if (!fileIsValid(f)) return NULL;
   if (readHeader(f) != 0xdeadbeef) return NULL;
   undefer FOO;
   return f;
}

Summary

Defer is useful functionality for languages that lack both finally and RAII. With structured jumps it is straightforward to implement with zero overhead.

Macros in C3 - a status update

Originally from: https://dev.to/lerno/macros-in-c3-a-status-update-1m1n

I'm going to share a bit of the C3 design process here for people who might be interested.

Like error handling, macros are one of the few truly new things in C3 compared to C. Consequently I've been going back and forth with the design trying to cover all angles.

I always wanted to make macros sufficiently safe that people could use them without worries, which means that some macro uses from C would have to go – but which ones?

After doing an inventory of what macros can do, I roughly ended up with this "feature ladder" for macros – from easily understandable and readable to more "dangerous" in terms of how easy it would be to abuse:

  1. Inlining
  2. Lazy evaluation of arguments
  3. Polymorphic parameters
  4. Non-local jumps
  5. Implicit capture
  6. Declarations escaping scope
  7. Arbitrary code generation
  8. Code fragment replacement

One has to make the cut somewhere, and for C3 I think it's reasonable to either stop at (4) or (5).

(5) - implicit capture - is a bit related to (8) but can often be extremely useful in local code.

One hard-to-place feature is taking a name or a function invocation and then generating statements from that.

Consider the following:

#define FOO(X) do { X(0); X(1); X(2); } while(0)
void doX(int i) { ... }
FOO(doX);

In C3 this is sort of covered at the (2) level, even though for C that would be (7).

Because macros are mainstream tools in C3 rather than advanced tools it’s important that the syntax is geared towards writing code for 1-3 in particular.

This makes it natural to require that macros should resemble functions as much as possible.

(6), (7) and (8) are, when used, usually clever ways to twist C into being more brief or to have an in-code DSL.

This flexibility can create pretty neat hacks, but it's unclear whether this is a good idea in the large. Are these just clever solutions or are they important ones? My bet is on the former: the legitimate uses are more about closing holes in C. And if that is so, then such macros are basically a poor man's syntax extensions.

If syntax extensions are desired, Kit shows how that can be done in a very elegant manner. However, syntax extensions will always sacrifice readability for power, and here C3 makes a different tradeoff so that no matter what macro you see, you should be able to make a good guess as to what it could be doing.

For comparison, here are some C macros and their counterpart in C3 (as the design currently stands):

C:

#define nodesGet(nodes, index) ((INode**)((nodes)+1))[index]

C3:

macro INode *nodesGet(nodes, index)
{
  return cast(nodes + 1, INode**)[index];
}

C3 allows trailing body in macros, which makes for slightly different look from C in "foreach" style macros:

#define namespaceFor(ns) for (size_t __i = 0; __i < (ns)->avail; ++__i)

namespaceFor(ns) {
  NameNode *nn = &ns->namenodes[__i];
  if (nn->name == NULL)
    continue;
  nametblHookNode(nn->name, nn->node);
}

C3 (note that the declaration for trailing body is very much undecided):

macro namespaceFor(ns; void(usize i) $body)
{
  for (usize index = 0; index < ns.avail; index++)
  {
    body(index);
  }
}

@namespaceFor(ns; usize i) 
{
  NameNode *nn = &ns->namenodes[i];
  if (nn->name == NULL) continue;
  nametblHookNode(nn->name, nn->node);
}

Using implicit capture of variables from the surrounding scope:

#define lexReturnPuncTok(tok, skip) { \
  lex->toktype = tok; \
  lex->tokp = srcp; \
  lex->srcp = srcp + (skip); \
  return; \
}

C3:

macro lexReturnPuncTok!(tok, skip, implicit lex, implicit srcp)
{
  lex.toktype = tok;
  lex.tokp = srcp;
  lex.srcp = srcp + skip;
  return;
}

Creating a good macro system that is simple enough not to be dangerous requires difficult trade offs, and it's easy to just make it as flexible as possible. That might be a mistake though, with macros becoming an advanced feature reserved for special situations instead of a regular tool in the toolkit.

A zoo of casts

Originally from: https://dev.to/lerno/a-zoo-of-casts-4bob

I recently made a post on Reddit to ask about various types of cast syntax. For posterity's sake I'm recording them here.

Note that I'm ignoring the behaviour of the cast. Some languages have different syntax for upcasts, downcasts, bitcasts etc. I'm not concerned with that here; this is merely a list of variants of visual syntax. Consequently I list :> even though that's only a special form of cast for F#, and just one of the many keyword<type>(x) casts that C++ offers.

Also I apologize in advance if the attribution is incorrect somewhere. I don't know all the languages I list here.

(The list of languages for each is also incomplete – it's just a sample)

cast(x, int)        MATLAB
int(x)              Pascal
<int>x              Typescript
(int)x              C/C++/Java/Beef/C#
static_cast<int>(x) C++
x as int            C#/Swift/Rust
x as! int           Swift
cast(x as int)      SQL
cast(int)x          D, Jai
@as(int, x)         Zig
[int]x              Pike
(int)(x)            Go
x :> int            F#
cast[int](x)        Nim
x.as(int)           Crystal/Ecstasy
x->(int)            Frost
(x: int)            Flow
cast<int>(x)        C2
x.asInstanceOf(int) Scala
x.(int)             Go
x $ int             ChucK
int'(x)             Verilog

For fun, here are other permutations of the cast syntax that may or may not be useful:

(x as int)       
(x, int)
x<int>
x::int
(int : x)
(int x)
int::x
(x :: int)
cast(x -> int)
x to int
x#int
int:x
x.as[int]
x[int]
x.int
(int >> x)

C3 is currently using cast(x, int) but that might change.

When evaluating syntax, readability is important and it is always nice if the precedence is crystal clear.

As an example: x as Foo[4] – would that be x as (Foo[4]) or (x as Foo)[4]? Precedence rules will obviously decide, but if we compare with cast<Foo[4]>(x) the latter is much clearer because there is no need to know the precedence.

But length also matters: x = int(y) + int(z) is succinct while x = cast(y, int) + cast(z, int) feels quite a bit more wordy.

Picking a good cast syntax for a language is clearly a matter of difficult trade-offs.