Basic Types
C3 provides a set of fundamental data types similar to C's: integers, floats, arrays and pointers. It expands on this set by adding slices and vectors, as well as the any and typeid types for advanced use.
Integers
C3 has signed and unsigned integer types. The built-in signed integer types are ichar, short, int, long, int128, iptr and sz. The types from ichar to int128 all have well-defined power-of-two bit sizes, whereas iptr has the same number of bits as a void* and sz has the same number of bits as the maximum difference between two pointers. For each signed integer type there is a corresponding unsigned integer type: char, ushort, uint, ulong, uint128, uptr and usz.
| type | signed? | min | max | bits |
|---|---|---|---|---|
| ichar | yes | -128 | 127 | 8 |
| short | yes | -32768 | 32767 | 16 |
| int | yes | -2^31 | 2^31 - 1 | 32 |
| long | yes | -2^63 | 2^63 - 1 | 64 |
| int128 | yes | -2^127 | 2^127 - 1 | 128 |
| iptr | yes | varies | varies | varies |
| sz | yes | varies | varies | varies |
| char | no | 0 | 255 | 8 |
| ushort | no | 0 | 65535 | 16 |
| uint | no | 0 | 2^32 - 1 | 32 |
| ulong | no | 0 | 2^64 - 1 | 64 |
| uint128 | no | 0 | 2^128 - 1 | 128 |
| uptr | no | 0 | varies | varies |
| usz | no | 0 | varies | varies |
On 64-bit machines iptr/uptr and sz/usz are usually 64 bits wide, like long/ulong. On 32-bit machines, on the other hand, they are generally 32 bits wide, like int/uint.
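The limits in the table can be checked from code. The following is a minimal sketch, assuming the built-in .min, .max and .sizeof type properties; the printed size of uptr depends on the target.

```c3
import std::io;

fn void main()
{
    // Fixed-size types have the same limits on every target.
    io::printfn("ichar: %d..%d", ichar.min, ichar.max);  // ichar: -128..127
    io::printfn("ushort max: %d", ushort.max);           // ushort max: 65535
    // Pointer-sized types vary: typically 8 bytes on 64-bit targets.
    io::printfn("uptr is %d bytes", uptr.sizeof);
}
```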
Integer constants
Numeric constants typically use decimal, e.g. 234, but may also use hexadecimal (base 16) numbers by prefixing the number with 0x or 0X, e.g. int a = 0x42edaa02;. There is also octal (base 8) using the 0o or 0O prefix, and 0b for binary (base 2) numbers:
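As a small illustration, the sketch below writes the same value in all four notations. The variable names are chosen only for this example.

```c3
import std::io;

fn void main()
{
    // The value 234 written in each supported base.
    int dec = 234;
    int hex = 0xEA;
    int oct = 0o352;
    int bin = 0b11101010;
    io::printfn("%d %d %d %d", dec, hex, oct, bin);  // prints "234 234 234 234"
}
```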
Numbers may also insert underscore _ between digits to improve readability, e.g. 1_000_000.
For decimal numbers, the value is assumed to be a signed int, unless the number doesn't fit in an int, in which case it is assumed to be the smallest signed type it does fit in (long or int128).
For hexadecimal, octal and binary, the type is assumed to be unsigned.
An integer literal can implicitly convert to a floating point value, or to an integer of a different type, provided the number fits in that type.
Constant suffixes
If you want to ensure that a constant is of a certain type, you can either add an explicit cast like: (ulong)345, or use an integer suffix: 345ul.
The following integer suffixes are available:
| suffix | type |
|---|---|
| l | long |
| ll | int128 |
| u | uint |
| ul | ulong |
| ull | uint128 |
Suffixes may be uppercase or lowercase.
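A short sketch of the two ways to fix the type of a constant, using the suffixes from the table above:

```c3
import std::io;

fn void main()
{
    ulong a = (ulong)345;  // explicit cast
    ulong b = 345ul;       // integer suffix
    ulong c = 345UL;       // the same suffix in uppercase
    io::printfn("%d %d %d", a, b, c);  // prints "345 345 345"
}
```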
Booleans
A bool is either true or false. Although a bool carries only a single bit of information, it is stored in a byte.
Character literals
A character literal is a value enclosed in single quotes (''). A literal containing a single character evaluates to that character's ASCII value.
It is also possible to use character literals that are 2, 4 or 8 characters wide. These are interpreted as ushort, uint and ulong respectively and are laid out in memory from left to right. This means that the actual value depends on the endianness of the target.
- 2 character literals, e.g. 'C3', convert to a ushort.
- 4 character literals, e.g. 'TEST', convert to a uint.
- 8 character literals, e.g. 'FOOBAR11', convert to a ulong.
The 4 character literals correspond to the layout of FourCC codes. Unicode characters are also arranged correctly in memory, e.g. Char32 smiley = '\U0001F603';
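The following sketch declares one literal of each width. The variable names are illustrative, and no output is shown for the multi-character literals because their numeric values depend on the endianness of the target.

```c3
import std::io;

fn void main()
{
    char single = 'A';         // a single character: its ASCII value
    ushort two = 'C3';         // 2 characters
    uint four = 'TEST';        // 4 characters, laid out like a FourCC code
    ulong eight = 'FOOBAR11';  // 8 characters
    io::printfn("%d", single); // prints "65"
    io::printfn("%x %x %x", two, four, eight);
}
```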
Floating point types
As is common, C3 has two floating point types: float and double. float is the 32 bit floating point type and double is 64 bits.
Floating point constants
Floating point constants use at least 64 bit precision. Just like for integer constants, it is possible to use _ to improve readability, but it may not occur immediately before or after a dot or an exponent.
C3 supports floating point values written in either decimal or hexadecimal format. For decimal, the exponent symbol is e (or E, both are acceptable); for hexadecimal, p (or P) is used. Examples are -2.22e-21 and -0x21.93p-10.
While floating point constants default to double, it is possible to give a constant an explicit type by adding a suffix:
| Suffix | type |
|---|---|
| f32 or f | float |
| f64 | double |
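A brief sketch of the notations and suffixes described above; the variable names are only for illustration.

```c3
import std::io;

fn void main()
{
    double d = -2.22e-21;      // decimal notation, defaults to double
    double h = 0x1.8p1;        // hexadecimal notation: 1.5 * 2^1
    float f = 1.5f;            // the f suffix gives a float constant
    double big = 1_000_000.5;  // underscores between digits
    io::printfn("%s %s %s %s", d, h, f, big);
}
```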
Arrays
Arrays have the format Type[size], so for example: int[4]. An array is a type consisting of the same element repeated a number of times. Our int[4] is essentially four int values packed together.
For initialization it's sometimes convenient to use the wildcard Type[*] declaration, which infers the length from the number of elements:
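A minimal sketch of both forms of declaration:

```c3
import std::io;

fn void main()
{
    int[4] a = { 1, 2, 3, 4 };  // explicit length
    int[*] b = { 10, 20, 30 };  // length inferred: the type is int[3]
    io::printfn("%d %d", a.len, b.len);  // prints "4 3"
}
```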
Slices
Slices have the format Type[]. Unlike the array, a slice does not hold the values themselves but instead presents a view of some underlying array or vector.
Slices have two properties: .ptr, which retrieves the array it points to, and .len which is the length of the slice - that is, the number of elements it is possible to index into.
Usually we can get a slice by taking the address of an array:
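For example, the sketch below creates a slice over an array and writes through it, showing that the slice is a view rather than a copy:

```c3
import std::io;

fn void main()
{
    int[4] arr = { 1, 2, 3, 4 };
    int[] slice = &arr;  // a view over the whole array
    slice[0] = 10;       // writes through to arr
    io::printfn("%d %d", slice.len, arr[0]);  // prints "4 10"
}
```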
Because indexing into slices is range checked in safe mode, slices are far safer than passing a pointer and a length separately.
The internal representation of a slice is a two element struct:
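The layout can be sketched as follows. This is an assumption based on the two properties described above; the struct name SliceRaw and the exact field types should be checked against the standard library source.

```c3
// Assumed layout of a slice: a pointer to the data plus an element count.
struct SliceRaw
{
    void* ptr;
    usz len;
}
```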
This definition can be found in the module std::core::runtime.
Vectors
Vectors, similar to arrays, use the format Type[<size>], with the restriction that vectors may only be formed from integers, floats and booleans. As with arrays, a wildcard can be used to infer the size of a vector:
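A minimal sketch, assuming the wildcard is written Type[<*>] by analogy with arrays:

```c3
import std::io;

fn void main()
{
    int[<4>] a = { 1, 2, 3, 4 };  // explicit size
    int[<*>] b = { 1, 2, 3, 4 };  // size inferred: the type is int[<4>]
    io::printn(a);
    io::printn(b);
}
```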
Vectors are based on hardware SIMD vectors, and support many different operations that work on all elements in parallel, including arithmetic:
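As an illustration, the operators below are applied element by element:

```c3
import std::io;

fn void main()
{
    int[<2>] a = { 23, 11 };
    int[<2>] b = { 2, 1 };
    int[<2>] c = a * b;  // element-wise product: { 46, 11 }
    int[<2>] d = a + b;  // element-wise sum: { 25, 12 }
    io::printn(c);
    io::printn(d);
}
```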
Vector initialization and literals work the same way as for arrays, using { ... }. However, it's also possible to use swizzling arguments in designated initialization:
String literals
String literals are special and can convert to several different types: String, char and ichar arrays and slices, and finally ichar* and char*.
String literals are text enclosed in " " just like in C. These support escape sequences like \n for line break and need to use \" for any " inside of the string.
C3 also offers raw strings which are enclosed in ` `. A raw string may span multiple lines. Inside of a raw string, no escapes are available, and to write a `, simply double the character:
// Note: String is a typedef inline char[]
String three_lines =
`multi
line
string`;
String foo = `C:\foo\bar.dll`;
String bar = `"Say ``hello``"`;
// Same as
String foo = "C:\\foo\\bar.dll";
String bar = "\"Say `hello`\"";
String is a typedef inline char[], which can implicitly convert to char[] when required.
ZString is a typedef inline char*. It is a C-compatible null-terminated string, which can implicitly convert to char* when required.
Base64 and hex data literals
Base64 literals are strings prefixed with b64 containing Base64 encoded data, which is converted into a char array at compile time:
// The array below contains the characters "Hello World!"
char[*] hello_world_base64 = b64"SGVsbG8gV29ybGQh";
The corresponding hex data literals convert a hexadecimal string rather than Base64:
// The array below contains the characters "Hello world!"
char[*] hello_world_hex = x"4865 6c6c 6f20 776f 726c 6421";
Pointer types
Pointers have the syntax Type*. A pointer is a memory address where one or possibly more elements of the underlying type are stored. Pointers can be stacked: Foo* is a pointer to a Foo while Foo** is a pointer to a pointer to Foo.
The pointer type has a special literal called null, which is an invalid, empty pointer.
void*
The void* type is a special pointer which implicitly converts to any other pointer. It is not "a pointer to void", but rather a wildcard pointer which matches any other pointer.
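A small sketch of pointers, null and void* working together; the variable names are illustrative.

```c3
import std::io;

fn void main()
{
    int x = 5;
    int* p = &x;
    void* raw = p;     // any pointer converts to void*
    int* back = raw;   // and void* converts back without a cast
    int* none = null;  // the empty pointer
    io::printfn("%d", *back);  // prints "5"
    if (none == null) io::printn("none is null");
}
```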
Printing values
Printing values can be done using io::print, io::printn, io::printf and io::printfn. This requires importing the module std::io.
Note
The n variants of the print functions will add a newline after printing, which is what we'll often use in the examples, but print and printf work the same way.
import std::io; // Get the io functions.
fn void main()
{
    int a = 1234;
    ulong b = 0xFFAABBCCDDEEFF;
    double d = 13.03e-04;
    char[*] hex = x"4865 6c6c 6f20 776f 726c 6421";
    io::printn(a);
    io::printn(b);
    io::printn(d);
    io::printn(hex);
}
If you run this program you will get:
To get more control we can format the output using printf and printfn:
import std::io;
fn void main()
{
    int a = 1234;
    ulong b = 0xFFAABBCCDDEEFF;
    double d = 13.03e-04;
    char[*] hex = x"4865 6c6c 6f20 776f 726c 6421";
    io::printfn("a was: %d", a);
    io::printfn("b in hex was: %x", b);
    io::printfn("d in scientific notation was: %e", d);
    io::printfn("Bytes as string: %s", (String)&hex);
}
We can apply the standard printf formatting rules, but unlike in C/C++ there is no need to indicate the type when using %d: it prints both signed and unsigned integers up to int128. In fact, io::printf has no support for %u, %lld and similar specifiers. Furthermore, %s works not just on strings but on any type:
import std::io;
enum Foo
{
    ABC,
    BCD,
    EFG,
}
fn void main()
{
    int a = 1234;
    uint128 b = 0xFFEEDDCC_BBAA9988_77665544_33221100;
    Foo foo = BCD;
    io::printfn("a: %s, b: %d, foo: %s", a, b, foo);
}
This prints: