Types
Overview
As usual, types are divided into basic types and user-defined types (enum, union, struct, fault, def). All types are defined at the global level.
Naming
All user-defined types in C3 start with an uppercase letter. So MyStruct or Mystruct would be fine, while mystruct_t or mystruct would not.
This naming requirement ensures that the language is easy to parse for tools.
It is possible to use attributes to change the external name of a type:
This would affect things like generated C headers.
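As a minimal sketch of the renaming described above, the hypothetical struct below is called Point in C3 but is given the external name point_t. The @extern spelling of the attribute is an assumption based on the C3 release this page appears to describe; other releases may spell it differently.

```c3
import std::io;

// Hypothetical type for illustration: named Point inside C3,
// emitted under the external name "point_t" (assumed @extern spelling).
struct Point @extern("point_t")
{
    int x;
    int y;
}

fn void main()
{
    Point p = { 1, 2 };
    io::printfn("%d %d", p.x, p.y);
}
```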
Differences from C
Unlike C, C3 does not use type qualifiers. const exists, but is a storage class modifier, not a type qualifier.
Instead of volatile, volatile loads and stores are used.
Restrictions on function parameter usage are instead described by parameter preconditions.
typedef has a slightly different syntax and is renamed def.
C3 also requires all function pointers to be used through a def, for example:
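A minimal sketch of the rule, assuming the def syntax described on this page. Callback and report are hypothetical names chosen for the example.

```c3
import std::io;

// The function pointer type gets a name through def.
def Callback = fn void(int value);

fn void report(int value)
{
    io::printfn("got %d", value);
}

fn void main()
{
    Callback cb = &report;   // The variable is declared with the def'd type.
    cb(42);
}
```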
Basic types
Basic types are divided into floating point types and integer types. Integer types are either signed or unsigned.
Integer types
Name | bit size | signed |
---|---|---|
bool * | 1 | no |
ichar | 8 | yes |
char | 8 | no |
short | 16 | yes |
ushort | 16 | no |
int | 32 | yes |
uint | 32 | no |
long | 64 | yes |
ulong | 64 | no |
int128 | 128 | yes |
uint128 | 128 | no |
iptr ** | varies | yes |
uptr ** | varies | no |
isz ** | varies | yes |
usz ** | varies | no |
* bool will be stored as a byte.
** size, pointer and pointer-sized types depend on the platform.
Integer arithmetics
All signed integer arithmetic uses 2’s complement.
Integer constants
Integer constants are written like 1293832 or -918212. Without a suffix, the type is assumed to be the signed integer of arithmetic promotion width. Adding the u suffix gives an unsigned integer of the same width. Use ixx and uxx, where xx is the bit width, for typed integers, e.g. 1234u16.
Integers may be written in decimal, but also
- in binary with the prefix 0b, e.g. 0b0101000111011, 0b011
- in octal with the prefix 0o, e.g. 0o0770, 0o12345670
- in hexadecimal with the prefix 0x, e.g. 0xdeadbeef, 0x7f7f7f
In the case of binary, octal and hexadecimal, the type is assumed to be unsigned.
Furthermore, underscore _ may be used to add space between digits to improve readability, e.g. 0xFFFF_1234_4511_0000, 123_000_101_100.
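The literal forms above can be sketched as follows. The variable names are arbitrary, and the i64 suffix is assumed to follow the ixx pattern described earlier.

```c3
import std::io;

fn void main()
{
    int a = 1_000_000;        // Decimal with digit separators.
    uint b = 0b1010;          // Binary, assumed unsigned.
    uint c = 0o17;            // Octal, assumed unsigned.
    uint d = 0xFFFF_1234;     // Hexadecimal with a digit separator.
    ushort e = 1234u16;       // Explicitly typed 16-bit unsigned constant.
    long f = -918212i64;      // Explicitly typed 64-bit signed constant.
    io::printfn("%d %d %d %d %d %d", a, b, c, d, e, f);
}
```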
TwoCC, FourCC and EightCC
FourCC codes are often used to identify binary format types. C3 adds direct support for 4 character codes, but also 2 and 8 characters:
- 2 character strings, e.g. 'C3', convert to a ushort or short.
- 4 character strings, e.g. 'TEST', convert to a uint or int.
- 8 character strings, e.g. 'FOOBAR11', convert to a ulong or long.
Conversion is always done so that the character string has the correct ordering in memory. This means that the same characters may have different integer values on different architectures due to endianness.
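A small sketch of what this means in practice, assuming a 4 character literal assigned to a uint. Reading the value back byte by byte shows the in-memory order, while the printed integer value is the part that may differ between architectures.

```c3
import std::io;

fn void main()
{
    uint tag = 'TEST';
    char* bytes = (char*)&tag;
    // The bytes appear in memory in the order they were written in the literal.
    io::printfn("%c%c%c%c", bytes[0], bytes[1], bytes[2], bytes[3]);
    // The numeric value depends on the endianness of the target.
    io::printfn("%x", tag);
}
```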
Base64 and hex data literals
Base64 encoded values work like TwoCC/FourCC/EightCC, in that they are laid out in byte order in memory. They use the format b64'<base64>'. Hex encoded values work like base64 but with the format x'<hex>'. In data literals any whitespace is ignored, so x'00 00 11' encodes to the same value as x'000011'.
For example, b64'Rk9PQkFSMTE=' encodes the same bytes as 'FOOBAR11'.
Base64 and hex data literals initialize arrays of the char type:
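A minimal sketch, reusing the FOOBAR11 bytes from the text above. The variable names are arbitrary; both literals are expected to describe the same eight bytes.

```c3
import std::io;

fn void main()
{
    // Both literals initialize char arrays; the size is inferred with [*].
    char[*] from_b64 = b64'Rk9PQkFSMTE=';
    char[*] from_hex = x'464f4f42 41523131';   // Whitespace is ignored.
    io::printfn("%d %d", from_b64.len, from_hex.len);
}
```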
String literals, and raw strings
A regular string literal is text enclosed in " ... " just like in C. C3 also offers two other types of literals: multi-line strings and raw strings.
Raw strings use text between ` `. Inside a raw string, no escapes are available. To write a `, double the character:
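A short sketch comparing raw strings with their escaped equivalents. The variable names and contents are only illustrative.

```c3
import std::io;

fn void main()
{
    String raw_path = `C:\foo\bar.dll`;       // Backslashes are kept literally.
    String esc_path = "C:\\foo\\bar.dll";     // The same text with escapes.
    String raw_tick = `"Say ``hello``"`;      // A doubled backtick yields one backtick.
    String esc_tick = "\"Say `hello`\"";      // The same text as a regular literal.
    io::printfn("%s", raw_path);
    io::printfn("%s", esc_path);
    io::printfn("%s", raw_tick);
    io::printfn("%s", esc_tick);
}
```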
Floating point types
Name | bit size |
---|---|
bfloat16 * | 16 |
float16 * | 16 |
float | 32 |
double | 64 |
float128 * | 128 |
*support is still incomplete.
Floating point constants
Floating point constants will at least use 64 bit precision. Just like for integer constants, it is allowed to use underscore, but it may not occur immediately before or after a dot or an exponential.
Floating point values may be written in decimal or hexadecimal. For decimal, the exponential symbol is e (or E, both are acceptable), for hexadecimal p (or P) is used, e.g. -2.22e-21 or -0x21.93p-10.
It is possible to type a floating point by adding a suffix:
Suffix | type |
---|---|
bf16 | bfloat16 |
f16 | float16 |
f32 or f | float |
f64 | double |
f128 | float128 |
C compatibility
For C compatibility the following types are also defined in std::core::cinterop:
Name | c type |
---|---|
CChar | char |
CShort | short int |
CUShort | unsigned short int |
CInt | int |
CUInt | unsigned int |
CLong | long int |
CULong | unsigned long int |
CLongLong | long long |
CULongLong | unsigned long long |
CLongDouble | long double |
float and double will always match their C counterparts.
Note that signed C char and unsigned char will correspond to ichar and char. CChar is only available to match the default signedness of char on the platform.
Other built-in types
Pointer types
Pointers mirror C: Foo* is a pointer to a Foo, while Foo** is a pointer to a pointer to a Foo.
The typeid type
The typeid can hold a runtime identifier for a type. Using <typename>.typeid a type may be converted to its unique runtime id, e.g. typeid a = Foo.typeid;. This value is pointer-sized.
The any type
C3 contains a built-in variant type, which is essentially a struct containing a typeid plus a void* pointer to a value.
While it is possible to cast the any pointer to any pointer type, it is recommended to use the anycast macro or to check the type explicitly first.
Switching over the any type is another method to unwrap the pointer inside:
any.type returns the underlying pointee typeid of the contained value. any.ptr returns the raw void* pointer.
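The two recommended ways of getting at the value can be sketched as below. This assumes that anycast yields an Optional pointer to the requested type, which is how the macro is understood to behave in the standard library; treat the exact signature as an assumption.

```c3
import std::io;

fn void main()
{
    int x = 5;
    any y = &x;

    // Option 1: check the stored typeid explicitly, then cast the raw pointer.
    if (y.type == int.typeid)
    {
        int* p = (int*)y.ptr;
        io::printfn("checked: %d", *p);
    }

    // Option 2: let anycast do the check (assumed to return an Optional int*).
    if (try int* q = anycast(y, int))
    {
        io::printfn("anycast: %d", *q);
    }
}
```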
Array types
Arrays are indicated by [size] after the type, e.g. int[4]. Slices use type[]. For initialization the wildcard type[*] can be used to infer the size from the initializer. See the chapter on arrays.
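A brief sketch of the three forms, with arbitrary variable names. The slice is created from a pointer to the fixed array, which is assumed to convert implicitly.

```c3
import std::io;

fn void main()
{
    int[4] fixed = { 1, 2, 3, 4 };       // Fixed-size array.
    int[*] inferred = { 10, 20, 30 };    // Size inferred from the initializer.
    int[] slice = &fixed;                // Slice viewing the whole array.
    io::printfn("%d %d %d", fixed.len, inferred.len, slice.len);
}
```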
Vector types
Vectors use [<size>] after the type, e.g. float[<3>], with the restriction that vectors may only be formed out of integers, floats and booleans. Similar to arrays, a wildcard can be used to infer the size of a vector: int[<*>] a = { 1, 2 }.
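A minimal sketch of declaring and using vectors. The element-wise addition is included only to show that a vector is used as a single value; the names are arbitrary.

```c3
import std::io;

fn void main()
{
    float[<3>] a = { 1.0, 2.0, 3.0 };
    float[<3>] b = { 0.5, 0.5, 0.5 };
    float[<3>] sum = a + b;              // Arithmetic is applied per element.
    int[<*>] inferred = { 1, 2 };        // Vector size inferred from the initializer.
    io::printfn("%f %f %f", sum[0], sum[1], sum[2]);
    io::printfn("%d", inferred.len);
}
```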
Types created using def
“typedef”
Like in C, C3 has a “typedef” construct: def <typename> = <type>.
Function pointer types
Function pointers are always used through a def:
To form a function pointer type, write a normal function declaration but skip the function name: fn int foo(double x) -> fn int(double x).
Function pointers can have default arguments, e.g. def Callback = fn void(int value = 0), but default arguments and parameter names are not taken into account when determining function pointer assignability:
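A sketch of the assignability rule. Callback and show are hypothetical names; note that the parameter names and default values differ between the two while the types match.

```c3
import std::io;

def Callback = fn void(int value = 0);

fn void show(int a = 1)
{
    io::printfn("%d", a);
}

fn void main()
{
    Callback cb = &show;  // Allowed: only the parameter and return types must match.
    cb();                 // Uses the default declared on Callback.
    show();               // Uses the default declared on show itself.
}
```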
Distinct types
A distinct type is a kind of type alias which creates a new type that has the same properties as the original type
but is, as the name suggests, distinct from it. It cannot implicitly convert into the other type. Distinct types are declared using the syntax
distinct <name> = <type>
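A minimal sketch of the conversion rules, using a hypothetical Meters type. The commented-out line marks the kind of statement expected to be rejected.

```c3
import std::io;

distinct Meters = int;

fn void main()
{
    Meters m = 5;            // A literal converts to the distinct type.
    m = m + 1;
    int i = 2;
    // m = m + i;            // Error: int does not implicitly convert to Meters.
    m = m + (Meters)i;       // An explicit cast is required.
    io::printfn("%d", (int)m);
}
```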
Inline distinct
Using inline in the distinct declaration allows a distinct type to implicitly convert to its underlying type:
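The difference can be sketched with two hypothetical types, one plain and one inline. Only the inline one converts to int without a cast.

```c3
import std::io;

distinct Abc = int;
distinct Bcd = inline int;

fn void main()
{
    Abc a = 1;
    Bcd b = 1;
    int i = b;               // Valid: Bcd is inline and converts to int.
    // int j = a;            // Error: Abc needs an explicit cast.
    int j = (int)a;
    io::printfn("%d %d", i, j);
}
```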
Generic types
Find out more about generic types.
Enum
Enum or enumerated types use the following syntax:
Access requires referencing the enum’s name, as in State.WAITING, because an enum like State is a separate namespace by default, just like C++’s enum class.
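A minimal sketch of declaring an enum and accessing its values through the enum name. The State enum mirrors the names used in the text; the explicit int backing type is an assumption.

```c3
import std::io;

enum State : int
{
    WAITING,
    RUNNING,
    TERMINATED
}

fn void main()
{
    State current = State.WAITING;   // Values are reached through the enum name.
    if (current == State.WAITING) io::printn("waiting");
}
```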
Enum associated values
It is possible to associate each enum value with one or more static values.
Multiple static values can be associated with an enum value, for example:
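A sketch with two associated values per enum value. The `= { ... }` form is assumed to be the associated-value syntax of the C3 release this page describes; earlier releases used a different form. The description and active names are hypothetical.

```c3
import std::io;

enum State : int (String description, bool active)
{
    WAITING    = { "waiting", false },
    RUNNING    = { "running", true },
    TERMINATED = { "ended", false },
}

fn void main()
{
    State process = State.RUNNING;
    // Associated values are read like members of the enum value.
    io::printfn("%s (active: %s)", process.description, process.active);
}
```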
Enum type inference
When an enum is used where the type can be inferred, like in switch case-clauses or in variable assignment, the enum name is not required:
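A minimal sketch of inference in an assignment and in case clauses, using the same hypothetical State enum as elsewhere on this page.

```c3
import std::io;

enum State
{
    WAITING,
    RUNNING,
    TERMINATED
}

fn void main()
{
    State process = WAITING;         // State.WAITING is inferred from the variable type.
    switch (process)
    {
        case WAITING:                // Inferred from the switched value.
            io::printn("waiting");
        case RUNNING:
            io::printn("running");
        case TERMINATED:
            io::printn("terminated");
    }
}
```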
If the enum value without its name matches a global in the same scope, the enum name needs to be added as a qualifier, for example:
Optional Type
An Optional type is created by taking a type and appending !.
An Optional type behaves like a tagged union, containing either the result or an Excuse that is of a fault type.
Once extracted, any specific fault can be converted to an anyfault.
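A short sketch of an Optional return value, assuming the fault and `?` syntax of the C3 release this page describes. MathError and divide are hypothetical names.

```c3
import std::io;

fault MathError
{
    DIVISION_BY_ZERO
}

// Returns either an int result or an Excuse of a fault type.
fn int! divide(int a, int b)
{
    if (b == 0) return MathError.DIVISION_BY_ZERO?;
    return a / b;
}

fn void main()
{
    int! result = divide(10, 0);
    if (catch excuse = result)
    {
        // The extracted fault is held as an anyfault here.
        io::printfn("failed: %s", excuse);
        return;
    }
    io::printfn("%d", result);
}
```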
Only variables, expressions and function returns may be Optionals. Function and macro parameters in their definitions may not be optionals.
Read more about the Optional types on the page about Optionals and error handling.
Optional Excuses are of type Fault
When an Optional does not contain a result, it is empty, and has an Excuse, which is a fault.
The anyfault type may contain any such fault.
Like the typeid type, the constants are pointer-sized and each value is globally unique. For example, the underlying value of MapResult.NOT_FOUND is guaranteed to be different from IOResult.IO_ERROR.
This is true even if they are separately compiled.
A fault may be stored as a normal value, but is also unique so that it may be passed in an Optional as a function return value using the rethrow ! operator.
Struct types
Structs are always named:
A struct’s members may be accessed using dot notation, even for pointers to structs.
(One might wonder whether it’s possible to take a Person** and use dot access. It’s not allowed: only one level of dereference is done.)
To change alignment and packing, attributes such as @packed may be used.
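The points above can be sketched with a hypothetical Person struct, showing dot access both on a value and through a single pointer level.

```c3
import std::io;

struct Person
{
    char age;
    String name;
}

fn void main()
{
    Person p;
    p.age = 21;
    p.name = "John Doe";
    Person* p_ptr = &p;
    p_ptr.age = 20;          // Dot access works through one level of pointer.
    io::printfn("%s is %d years old.", p.name, p.age);
}
```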
Struct subtyping
C3 allows creating struct subtypes using inline:
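A minimal sketch of subtyping with inline. Person, ImportantPerson and print_person are hypothetical names; the subtype's inline member is passed where the parent type is expected.

```c3
import std::io;

struct Person
{
    char age;
    String name;
}

struct ImportantPerson
{
    inline Person person;    // Makes ImportantPerson a subtype of Person.
    String title;
}

fn void print_person(Person p)
{
    io::printfn("%s is %d years old.", p.name, p.age);
}

fn void main()
{
    ImportantPerson vip;
    vip.age = 25;            // Members of the inline struct are reachable directly.
    vip.name = "Jane Doe";
    vip.title = "Rockstar";
    print_person(vip);       // Converts to Person; only the Person part is passed.
}
```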
Union types
Union types are defined just like structs and are fully compatible with C.
As usual unions are used to hold one of many possible values:
Note that unions only take up as much space as their largest member, so Integral.sizeof is equivalent to long.sizeof.
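A sketch of a union matching the Integral name used above. The member names are assumptions, chosen so that long is the largest member.

```c3
import std::io;

union Integral
{
    char as_byte;
    short as_short;
    int as_int;
    long as_long;
}

fn void main()
{
    Integral i;
    i.as_long = 0;
    i.as_byte = 40;          // All members share the same storage.
    io::printfn("%d", i.as_byte);
    // The union is as large as its largest member.
    io::printfn("%d %d", Integral.sizeof, long.sizeof);
}
```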
Nested sub-structs / unions
Just like in C99 and later, nested anonymous sub-structs / unions are allowed. Note that the placement of struct / union names is different to match the difference in declaration.
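A sketch of both forms, assuming the name of a nested union is written directly after the union keyword. All member names are hypothetical.

```c3
import std::io;

struct Person
{
    char age;
    String name;
    union                    // Anonymous: members are accessed directly.
    {
        int employee_nr;
        uint other_nr;
    }
    union subname            // Named: members are accessed through subname.
    {
        bool flag;
        int code;
    }
}

fn void main()
{
    Person p;
    p.employee_nr = 1234;
    p.subname.code = 7;
    io::printfn("%d %d", p.employee_nr, p.subname.code);
}
```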
Bitstructs
Bitstructs allow storing fields in a specific bit layout. A bitstruct may only contain integer types and booleans; in most other respects it works like a struct.
The main differences are that the bitstruct has a backing type and each field has a specific bit range. In addition, it’s not possible to take the address of a bitstruct field.
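A minimal sketch of a bitstruct with a char backing type. The Flags name, the field names and the bit ranges are all hypothetical.

```c3
import std::io;

bitstruct Flags : char
{
    uint low : 0..2;         // Bits 0 to 2.
    uint high : 4..6;        // Bits 4 to 6.
    bool enabled : 7;        // A single bit.
}

fn void main()
{
    Flags f = { .low = 3, .high = 5, .enabled = true };
    io::printfn("%d %d %s", f.low, f.high, f.enabled);
    // The whole bitstruct can be viewed as its backing type.
    io::printfn("%x", (char)f);
}
```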
The bitstruct will follow the endianness of the underlying type:
It is however possible to pick a different endianness, in which case the entire representation will internally assume big endian layout:
In this case the same example yields CDAB9A78 and 789AABCD respectively.
Bitstruct backing types may be integers or char arrays. The difference in layout is somewhat subtle:
Bitstructs can be made to have overlapping bit fields. This is useful when modelling a layout which has multiple different layouts depending on flag bits:
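A sketch of overlapping fields, assuming the @overlap attribute is what enables them in the C3 release this page describes. The Instruction name and its fields are hypothetical.

```c3
import std::io;

// Without the (assumed) @overlap attribute, the shared bit ranges
// below would be expected to be rejected.
bitstruct Instruction : char @overlap
{
    uint opcode : 0..3;
    uint low_bits : 0..1;    // Shares bits 0 and 1 with opcode.
}

fn void main()
{
    Instruction ins;
    ins.opcode = 0b1011;
    // low_bits reads the two lowest bits that opcode just wrote.
    io::printfn("%d %d", ins.opcode, ins.low_bits);
}
```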