Type System
Overview¶
As usual, types are divided into basic types and user-defined types (enum, union, struct, typedef, bitstruct). All types are defined on a global level.
Naming¶
All user-defined types in C3 start with upper-case. So MyStruct or Mystruct would be fine, mystruct_t or mystruct would not. This naming requirement ensures that the language is easy to parse for tools. It is possible to use attributes to change the external name of a type:
This affects generated C headers, but little else.
Differences from C¶
Unlike C, C3 does not use type qualifiers. const exists, but is a storage class modifier, not a type qualifier. Instead of volatile, volatile loads and stores are implemented using @volatile_load and @volatile_store. Restrictions on function parameter usage are implemented through parameter preconditions.
C3's equivalent of C's typedef is called alias and has a slightly different syntax. C3's own typedef keyword instead creates a distinct type. Take care not to confuse C3's alias and typedef keywords with C's typedef.
C3 also requires all function pointers to be used with an alias. For example:
alias Callback = fn void();
Callback a = null; // Ok!
fn Callback getCallback() { /* ... */ } // Ok!
// fn fn void() getCallback() { /* ... */ } - ERROR!
// fn void() a = null; - ERROR!
Compile time properties¶
Types have built-in type properties available through ::property syntax. The following properties are common to all C3 runtime types:
- alignment - The standard alignment of the type in bytes. For example, int::alignment will typically be 4.
- kind - The category of type, e.g. TypeKind.POINTER or TypeKind.STRUCT (see std::core::types).
- cname - Returns a string with the extern name of the type, rarely used.
- name - Returns a string with the unqualified name of the type.
- qname - Returns a string with the qualified (using the full path) name of the type.
- size - Returns the storage size of the type in bytes.
- typeid - Returns a runtime typeid for the type.
- methods - Returns the methods implemented for a type.
- get_tag(tagname) - Retrieves the tag defined on the type.
- has_tag(tagname) - Returns true if the type has a particular tag.
- has_equals - True if the type implements ==.
- is_ordered - True if the type implements comparisons.
- is_substruct - True if the type has an inline member.
Basic types¶
Basic types are divided into floating point types and integer types.
Integer types are either signed or unsigned.
Integer types¶
| Name | bit size | signed |
|---|---|---|
| bool† | 1 | no |
| ichar | 8 | yes |
| char | 8 | no |
| short | 16 | yes |
| ushort | 16 | no |
| int | 32 | yes |
| uint | 32 | no |
| long | 64 | yes |
| ulong | 64 | no |
| int128 | 128 | yes |
| uint128 | 128 | no |
| iptr‡ | varies | yes |
| uptr‡ | varies | no |
| sz‡ | varies | yes |
| usz‡ | varies | no |
†: bool will be stored as a byte.
‡: Size, pointer and pointer-sized types depend on the target platform.
Integer type properties¶
Integer types (except for bool) also have the following type properties:
- max - The maximum value for the type.
- min - The minimum value for the type.
Integer arithmetics¶
All signed integer arithmetic uses 2's complement.
Integer constants¶
Integer constants are written as plain numbers, e.g. 1293832 or -918212.
Integers may be written in decimal, but also
- in binary with the prefix 0b, e.g. 0b0101000111011, 0b011
- in octal with the prefix 0o, e.g. 0o0770, 0o12345670
- in hexadecimal with the prefix 0x, e.g. 0xdeadbeef, 0x7f7f7f
In the case of binary, octal and hexadecimal, the type is assumed to be unsigned.
Furthermore, underscore _ may be used to add space between digits to improve readability, e.g. 0xFFFF_1234_4511_0000, 123_000_101_100.
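The notations above can be sketched together as follows. This is a minimal illustration; the variable names are made up for the example, and the declared types are chosen to be wide enough for each constant.

```c3
import std::io;

fn void main()
{
    // Plain decimal constants.
    int positive = 1293832;
    int negative = -918212;

    // Binary, octal and hexadecimal constants are unsigned.
    uint from_binary = 0b011;
    uint from_octal = 0o0770;
    uint from_hex = 0xdeadbeef;

    // Underscores only group digits, they do not change the value.
    ulong grouped_hex = 0xFFFF_1234_4511_0000;
    long grouped_decimal = 123_000_101_100;

    io::printfn("%d %d", positive, negative);
    io::printfn("%d %d %d", from_binary, from_octal, from_hex);
    io::printfn("%d %d", grouped_hex, grouped_decimal);
}
```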
Integer literal suffix and type¶
Integer literals follow C's rules:
1. A decimal literal is by default int. If it does not fit in an int, the type is long or int128, picking the smallest type that fits the literal.
2. If the literal is suffixed by u or U it is instead assumed to be a uint, but will be ulong or uint128 if it doesn't fit, like in (1).
3. Binary, octal and hexadecimal will implicitly be unsigned.
4. If an l or L suffix is given, the type is assumed to be long. If ll or LL is given, it is assumed to be int128.
5. If the ul or UL suffix is given, the type is assumed to be ulong. If ull or ULL is given, it is assumed to be uint128.
6. If a binary, octal or hexadecimal literal starts with zeros, the type size is inferred from the number of bits that would be needed if all digits were the maximum for the base.
$Typeof(1); // int
$Typeof(1u); // uint
$Typeof(1L); // long
$Typeof(0x11); // uint, hex is unsigned by default
$Typeof(0x1ULL); // uint128
$Typeof(4000000000); // long, since the number exceeds int.max
$Typeof(0x000000000000); // ulong: 12 hex chars indicate a 48 bit value
$Typeof(0b000000000000); // uint: 12 binary chars indicate a 12 bit value
TwoCC, FourCC and EightCC literals¶
FourCC codes are often used to identify binary format types. C3 adds direct support for 4 character codes, but also 2 and 8 characters:
- 2 character strings, e.g. 'C3', convert to a ushort or short.
- 4 character strings, e.g. 'TEST', convert to a uint or int.
- 8 character strings, e.g. 'FOOBAR11', convert to a ulong or long.
Conversion is always done so that the character string has the correct ordering in memory. This means that the same characters may have different integer values on different architectures due to endianness.
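As a rough sketch of what this means in practice, the example below stores the three kinds of character codes and then reads the bytes of the FourCC back in memory order. The variable names are illustrative only. Going by the rule above, the bytes are expected to read T, E, S, T regardless of architecture, while the printed hexadecimal values may differ between little-endian and big-endian targets.

```c3
import std::io;

fn void main()
{
    ushort two = 'C3';
    uint four = 'TEST';
    ulong eight = 'FOOBAR11';

    // View the 4 bytes of `four` in memory order.
    char* bytes = (char*)&four;
    io::printfn("%c%c%c%c", bytes[0], bytes[1], bytes[2], bytes[3]);

    // The numeric values depend on the endianness of the target.
    io::printfn("%x %x %x", two, four, eight);
}
```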
Base64 and hex data literals¶
Base64 encoded values work like TwoCC/FourCC/EightCC, in that they are laid out in byte order in memory. They use the format b64'<base64>'. Hex encoded values work like base64 but use the format x'<hex>'. In data literals any whitespace is ignored, so x'00 00 11' encodes to the same value as x'000011'.
For example, the bytes of 'FOOBAR11' could be written as b64'Rk9PQkFSMTE='.
Base64 and hex data literals initialize arrays of the char type:
char[*] hello_world_base64 = b64"SGVsbG8gV29ybGQh";
char[*] hello_world_hex = x"4865 6c6c 6f20 776f 726c 6421";
String literals, and raw strings¶
Regular string literals are text enclosed in " ... " just like in C. C3 also offers another type of literal: raw strings.
Raw strings use text between ` `. Inside a raw string, no escapes are available, and it can span multiple lines. To write a ` double the character:
String foo = `C:\foo\bar.dll`;
ZString bar = `"Say ``hello``"`;
String baz =
`pushq %rax;
addq $1, %rax;
popq %rax;`;
// Same as
String foo = "C:\\foo\\bar.dll";
String bar = "\"Say `hello`\"";
String baz = "pushq %rax;\naddq $1, %rax;\npopq %rax;";
Floating point types¶
| Name | bit size |
|---|---|
| bfloat16† | 16 |
| float16† | 16 |
| float | 32 |
| double | 64 |
| float128† | 128 |
†: Support is still incomplete and not all systems have native support.
Floating point type properties¶
On top of the regular properties, floating point types also have the following properties:
- max - The maximum value for the type.
- min - The minimum value for the type.
- inf - Infinity.
- nan - Float NaN.
Floating point constants¶
Floating point constants will at least use 64 bit precision. Just like for integer constants, it is allowed to use underscore, but it may not occur immediately before or after a dot or an exponential.
Floating point values may be written in decimal or hexadecimal. For decimal, the exponential symbol is e (or E, both are acceptable), for hexadecimal p (or P) is used, e.g. -2.22e-21 and -0x21.93p-10.
By default a floating point literal is of type double, but if the suffix f is used (e.g. 1.0f), it is instead of float type.
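A short sketch of the literal forms described above. The variable names are illustrative, and the underscore placement in the last literal is one choice that keeps underscores away from the dot.

```c3
import std::io;

fn void main()
{
    double decimal = -2.22e-21;         // decimal with exponent
    double hexadecimal = -0x21.93p-10;  // hexadecimal with binary exponent
    float single = 1.0f;                // the f suffix gives a float
    double grouped = 1_000_000.000_5;   // underscores group digits

    io::printfn("%s %s %s %s", decimal, hexadecimal, single, grouped);
}
```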
C compatibility¶
For C compatibility the following types are also defined in std::core::cinterop:
| Name | C type |
|---|---|
| CChar | char |
| CShort | short int |
| CUShort | unsigned short int |
| CInt | int |
| CUInt | unsigned int |
| CLong | long int |
| CULong | unsigned long int |
| CLongLong | long long |
| CULongLong | unsigned long long |
| CLongDouble | long double |
float and double will always match their C counterparts.
Note that C's signed char and unsigned char correspond to ichar and char respectively. CChar is only available to match the default signedness of char on the platform.
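One place these types are useful is when mirroring a C struct. The sketch below assumes a hypothetical C declaration `struct point2 { int x; long y; };` and assumes the cinterop types are visible without an explicit import. Point2 and its fields are made up for the example.

```c3
import std::io;

// Hypothetical mirror of a C struct with an int and a long member.
struct Point2
{
    CInt x;   // matches C's int on the target
    CLong y;  // matches C's long int on the target
}

fn void main()
{
    Point2 p = { 3, 4 };
    io::printfn("%d %d", p.x, p.y);
}
```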
Other built-in types¶
Pointer types¶
Pointers mirror C: Foo* is a pointer to a Foo, while Foo** is a pointer to a pointer to a Foo.
Pointer type properties¶
In addition to the standard properties, pointers also have the inner property. It returns the type of the object pointed to as a typeid.
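A minimal sketch of single and double pointers, using int in place of Foo. The variable names are illustrative.

```c3
import std::io;

fn void main()
{
    int value = 5;
    int* ptr = &value;        // pointer to an int
    int** ptr_to_ptr = &ptr;  // pointer to a pointer to an int

    **ptr_to_ptr = 7;         // writes through both levels to `value`
    io::printfn("%d %d", value, *ptr); // prints "7 7"
}
```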
Optional¶
An Optional type is created by taking a type and appending ?. An Optional type behaves like a tagged union, containing either the Result or an Empty, which also carries a fault.
Once extracted, a fault can be converted to another fault.
faultdef MISSING; // define a fault
int? i;
i = 5; // Assigning a real value to i.
i = io::EOF~; // Assigning an Empty carrying the fault io::EOF to i.
fault b = MISSING; // Assign a fault to b
b = @catch(i); // Assign the Excuse in i to b (EOF)
Only variables, expressions and function returns may be Optionals. Function and macro parameters in their definitions may not be optionals.
fn Foo*? getFoo() { /* ... */ } // ✅ Ok!
int? x = 0; // ✅ Ok!
fn void processFoo(Foo*? f) { /* ... */ } // ❌ fn parameter
An Optional value can use the special if-try and if-catch to unwrap its result or its Empty. It is also possible to implicitly return if it is Empty using !, or to panic using !!.
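The four mechanisms can be sketched together as follows. This is an illustration under the syntax used on this page (? on the type, ~ after a fault). The fault NEGATIVE_INPUT and the functions half_of and quarter_of are hypothetical names introduced for the example.

```c3
import std::io;

faultdef NEGATIVE_INPUT; // hypothetical fault for this sketch

// Returns half of `value`, or an Empty carrying NEGATIVE_INPUT.
fn int? half_of(int value)
{
    if (value < 0) return NEGATIVE_INPUT~;
    return value / 2;
}

// `!` returns early with the same fault if half_of gives an Empty.
fn int? quarter_of(int value)
{
    int half = half_of(value)!;
    return half_of(half);
}

fn void main()
{
    // if-try: the body runs only when a result is present.
    if (try q = quarter_of(20))
    {
        io::printfn("quarter: %d", q);
    }
    // if-catch: the body runs only when the Optional is Empty.
    if (catch excuse = quarter_of(-4))
    {
        io::printfn("failed: %s", excuse);
    }
    // `!!` panics instead of returning when the Optional is Empty.
    int forced = half_of(8)!!;
    io::printfn("forced: %d", forced);
}
```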
To learn more about the Optional type and error handling in C3, read the page on Optionals and error handling.
Note
If you want a more regular "optional" value, to store in structs, then you can use the generic Maybe type in std::collections.
The fault type¶
When an Optional does not contain a result, it is Empty, but contains a fault which explains why there was no normal value. A fault has the special property that together with the ~ suffix it creates an Empty value:
int? x = IO_ERROR~; // 'IO_ERROR~' is an Optional Empty.
fault y = IO_ERROR; // Here IO_ERROR is just a regular
// value, since it isn't followed by '~'
A new fault value can only be defined using the faultdef statement:
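A minimal sketch, following the faultdef form shown in the Optional section. The fault names are hypothetical, and defining several faults in one comma-separated statement is assumed to be accepted.

```c3
import std::io;

// Hypothetical fault names, used only for illustration.
faultdef BAD_HEADER, BAD_CHECKSUM;

fn void main()
{
    fault first = BAD_HEADER;
    fault second = BAD_CHECKSUM;
    // Each value defined by faultdef is distinct from the others.
    io::printfn("%s", first == second);
}
```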
Like the typeid type, a fault is pointer sized and each value defined by faultdef is globally unique. This is true even when faults are separately compiled.
Note
The underlying unique value assigned to a fault may vary each time a program is run.
Fault description¶
The fault type only has one field: description, which returns the name of the fault, namespaced with the last module path, e.g. "io::EOF".
The typeid type¶
The typeid holds the runtime representation of a type. Using <typename>.typeid a type may be converted to its unique runtime id, e.g. typeid a = Foo.typeid;. The value itself is pointer-sized.
Typeid fields¶
At compile time, a typeid value has all the properties of its underlying type:
However, at runtime only a few are available:
- size - always supported.
- kind - always supported.
- parent - supported on distinct and struct types, returning the inline member type.
- inner - supported on types implementing it.
- names - supported on enum types.
- len - supported on arrays, vectors and enums.
The any type¶
C3 contains a built-in variant type, any, which is essentially a struct containing a typeid plus a void* pointer to a value. While it is possible to cast an any to any pointer type, it is recommended to use the anycast macro or to check the type explicitly first. With the anycast macro, the return value is an Optional, which is Empty if there is a mismatch.
fn void main()
{
    int x;
    any y = &x;
    int* w = (int*)y; // Returns the pointer to x
    double* z_bad = (double*)y; // Don't do this!
    double*? z = anycast(y, double); // The safe way to get a value
    if (y.type == int.typeid)
    {
        // Do something if y contains an int*
    }
    if (try v = anycast(y, int))
    {
        // same as above, but v holds the unwrapped int*
    }
}
You can use a switch to check an any's type, as well. After the type has been confirmed, it is safe to dereference.
fn void test(any z)
{
    // Switch
    switch (z.type)
    {
        case int:
            // This is safe here:
            int* y = (int*)z;
        case double:
            // This is safe here:
            double* y = (double*)z;
    }
    // Assignment switch
    switch (y = z, y.type)
    {
        case int:
            // This is safe here:
            int* x = (int*)y;
    }
    // Finally, if we just want to deal with the case
    // where it is a single specific type:
    if (z.type == int.typeid)
    {
        // This is safe here:
        int* a = (int*)z;
    }
    if (try b = *anycast(z, int))
    {
        // b is an int:
        foo(b * 3);
    }
}
Note that in switches, if a substruct type is passed in and its parent matches first, the parent case will take priority.
fn void test(any z)
{
    // Will always be seen as the parent type.
    switch (z.type)
    {
        case Parent:
            // code...
        case Subtype:
            // code that will never execute...
    }
    // So order the subtypes first
    // if you're comparing them against their parent.
    // Of course, this is still useful in cases
    // of inherited types where the parent isn't in the switch.
    switch (z.type)
    {
        case Parent:
            // modify data both Parent and Subtype have
        case SomethingElse:
            // completely different type code
    }
}
If you don't want the child type detected as the parent type, a typedef can be used to create a distinct type without changing any data.
any fields¶
At runtime, any gives you access to two fields:
- some_any.type - returns the underlying pointee typeid of the contained value.
- some_any.ptr - returns the raw void* pointer to the contained value.
Advanced use of any¶
The standard library has several helper macros to manipulate any types:
- anycast(some_any, Type) returns a Type* pointer, or TYPE_MISMATCH if the types don't match.
- any_make(ptr, some_typeid) creates an any to a given typeid using a void*.
- some_any.retype_to(some_typeid) changes the type of an any to the given typeid.
- some_any.as_inner() retypes the any to the "inner" (see the inner type property) of the current type.
void* some_ptr = foo();
// Essentially (any)(int*)(some_ptr)
any some_int = any_make(some_ptr, int.typeid);
// Same as any_make(some_int.ptr, uint.typeid)
any some_uint = some_int.retype_to(uint.typeid);
typedef SomeType = int;
SomeType s = 3;
any any_val = &s;
// Result is the same as (any)(int*)&s
any some_inner_int = any_val.as_inner();
Array types¶
Arrays are indicated by [size] after the type, e.g. int[4]. Slices use [] after the type, e.g. int[]. For initialization, the wildcard [*] can be used to infer the size from the initializer. See the chapter on arrays.
Vector types¶
Vectors use [<size>] after the type, e.g. float[<3>], with the restriction that vectors may only be formed from integers, floats and booleans. Similar to arrays, a wildcard can be used to infer the size of a vector: int[<*>] a = { 1, 2 }.
Array and vector type properties¶
Array and vector types also support:
- inner - Returns the type of each element.
- len - Gives the length of the type.
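The array, slice and vector forms described in the two sections above can be sketched side by side. The variable names are illustrative, and taking a slice from the address of an array is one of several ways to create one.

```c3
import std::io;

fn void main()
{
    int[4] fixed = { 1, 2, 3, 4 };        // array of 4 ints
    int[*] inferred = { 10, 20, 30 };     // size inferred from the initializer
    int[] slice = &fixed;                 // slice viewing the whole array
    float[<3>] vec = { 1.0f, 2.0f, 3.0f }; // vector of 3 floats
    int[<*>] ivec = { 1, 2 };             // vector size inferred

    io::printfn("%d %d %d", fixed.len, inferred.len, slice.len);
    io::printfn("%s %s", vec + vec, ivec);
}
```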
User defined types¶
Type aliases (C's typedef)¶
C3 has a construct that behaves essentially the same as C's "typedef", an alias, and it is declared using the syntax alias <new_name> = <old_name>. For example:
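A minimal sketch of the syntax. CharPtr and Numbers are illustrative alias names, not names from the standard library.

```c3
import std::io;

alias CharPtr = char*;
alias Numbers = int[10];

fn void main()
{
    Numbers values;          // same type as int[10]
    int[10] plain = values;  // no cast needed, an alias is not a new type
    CharPtr text = null;

    io::printfn("%d %s", plain.len, text == null);
}
```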
These are not proper types, just aliases, and querying their properties will query the properties of the aliased type.
Function pointer types¶
Function pointers are always used through an alias.
To form a function pointer type, write a normal function declaration but skip the function name: fn int foo(double x) becomes fn int(double x).
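The rule can be sketched as follows. The function round_down and the alias Rounder are hypothetical names chosen for the example.

```c3
import std::io;

fn int round_down(double x)
{
    return (int)x;
}

// `fn int round_down(double x)` without the name gives the pointer type.
alias Rounder = fn int(double x);

fn void main()
{
    Rounder rounder = &round_down;
    io::printfn("%d", rounder(2.75)); // calls round_down through the pointer
}
```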
Function pointers can have default arguments, e.g. alias Callback = fn void(int value = 0) but default arguments and parameter names are not taken into account when determining function pointer assignability:
alias Callback = fn void(int value = 1);
fn void test(int a = 0) { /* ... */ }
Callback callback = &test; // Ok
fn void main()
{
    callback(); // Works, same as test(1);
    test(); // Works, same as test(0);
    callback(value: 3); // Works, same as test(3)
    test(a: 4); // Works, same as test(4)
    // callback(a: 3); // ERROR!
}
Function pointer type properties¶
Function pointer types also support:
- paramsof - Returns a list of ReflectedParam for each parameter.
- returns - Returns the return type.
Typedef - Distinct type definitions¶
typedef creates a new type that has the same properties as the original type but is distinct from it, using the syntax typedef <name> = <type>. The new type cannot implicitly convert to the original type.
typedef MyId = int;
typedef MyId2 @constinit = int;
fn void* get_by_id(MyId id) { ... }
fn void* get_by_id2(MyId2 id) { ... }
fn void test(MyId id)
{
    void* val = get_by_id(id); // Ok
    // void* val2 = get_by_id(1); // ERROR expected a MyId
    // Use `@constinit` to allow implicit conversion from
    // literals
    void* val2 = get_by_id2(1);
    int a = 1;
    // void* val3 = get_by_id(a); // ERROR expected a MyId
    // `@constinit` doesn't work on non-literals
    // void* val3 = get_by_id2(a); // ERROR expected a MyId2
    void* val4 = get_by_id((MyId)a); // Works
    // a = id; // ERROR can't assign 'MyId' to 'int'
}
Inline typedef¶
Using inline in the typedef declaration allows a newly created typedef type to implicitly convert to its underlying type:
typedef Abc @constinit = int;
typedef Bcd @constinit = inline int;
fn void test()
{
    Abc a = 1;
    Bcd b = 1;
    // int i = a; Error: Abc cannot be implicitly converted to 'int'
    int i = b; // This is valid
    // However, 'inline' does not allow implicit conversion from
    // the inline type to the typedef type:
    // a = i; Error: Can't implicitly convert 'int' to 'Abc'
    // b = i; Error: Can't implicitly convert 'int' to 'Bcd'
}
Aligned typedefs¶
It's possible to use typedef to create underaligned types. For example, typically an int will be 4 byte aligned, but we can create a 2-byte aligned type using typedef IntAlign2 = int @align(2);.
Storage SIMD types¶
Vectors are normally stored and passed as arrays to prevent SIMD alignment overhead. However, it's possible to define types that exactly match the SIMD types in C and other languages for storage and argument passing. These types are defined with typedef and the @simd attribute, similar to aligned typedefs: typedef Float4 = float[<4>] @simd;.
Typedef type properties¶
In addition to the normal properties, typedef also supports:
- inner - Returns the type this is based on as a typeid.
- parentof - If this is an inline typedef, returns the same as inner.
Generic types¶
import generic_list; // Contains the generic MyList
struct Foo
{
    int x;
}
// ✅ alias for each type used with a generic module.
alias MyListFoo = MyList {Foo};
MyListFoo working_example;
fn void main()
{
    // ❌ A nested inline type definition in a function context
    // will yield an error, it's only available on the top
    // level or in macros. Prefer aliases.
    MyList {MyList {int}} failing_example;
}
Enum and constdefs¶
These correspond to C's enum. See enums and constdefs.
Struct types¶
Read more about unions and structs and bitstructs.