Introduction
Want To Download C3?
Download C3, available on Mac, Windows and Linux.
C3 is an evolution of C and a minimalist systems programming language.
🦺 Ergonomics and Safety¶
- Optionals to safely and quickly handle errors and null.
- Defer to clean up resources.
- Slices and foreach for safe iteration.
- Contracts in comments, to add constraints to your code.
- Automatically free memory after use in a @pool context.
⚡ Performance by default¶
- Write SIMD vectors to program the hardware directly.
- Access to different memory allocators to fine tune performance.
- Zero overhead errors.
- Fast compilation times.
- LLVM backend for industrial strength optimisations.
- Easy to use inline assembly.
🔋 Batteries included standard library¶
- Dynamic containers and strings.
- Cross-platform abstractions for ease of use.
- Access to the native platform when you need it.
🔧 Leverage existing C or C++ libraries¶
- Full C ABI compatibility.
- C3 can link C code, C can link C3 code.
📦 Modules are simple¶
- Modules namespace code.
- Modules make encapsulation simple with explicit control.
- Interfaces define shared behaviour to write robust libraries.
- Generic modules make extending code easier.
- Simple struct composition and reuse with struct subtyping.
🎓 Macros without a PhD¶
- Macros can be similar to normal functions.
- Or write code that understands the types in your code.
⚠️ Warning: Docs may not reflect current language state. The C3 standard library and compiler are still evolving. Please verify examples against the compiler and standard library directly. If you spot mismatches, open a GitHub issue or help us fix them!
Getting Started
Hello World
👋 Hello world¶
Let's start with the traditional first program, Hello World in C3:
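A minimal sketch of the program that the following paragraphs walk through, reconstructed from their description (an import of std::io, a void main with an empty parameter list, and a single io::printn call); the original listing may differ in detail:

```c3
import std::io;

// Entry point: takes no parameters and returns nothing.
fn void main()
{
    // Print the text followed by a newline.
    io::printn("Hello, World!");
}
```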
The import statement imports other modules, and we want printn which
is in std::io.
Next we define a function which starts with the fn keyword followed by the return type. We don't need to return anything, so return void. The function name main then follows, followed by the function's parameter list, which is empty.
Note
The function named main is a bit special, as it is where the program starts, or the entry point of the program.
For Unix-like OSes there are a few different variants, for example we might declare it as fn void main(String[] args). In that case the parameter "args" contains a slice of strings, of the program's command line arguments, starting with the name of the program itself.
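As an illustration of the fn void main(String[] args) variant mentioned above, the following sketch lists every command line argument together with its index. The names index and arg are chosen here for illustration only:

```c3
import std::io;

// Entry point variant that receives the command line arguments as a slice.
fn void main(String[] args)
{
    // args[0] is the name of the program itself, the rest are its arguments.
    foreach (index, arg : args)
    {
        io::printfn("%d: %s", index, arg);
    }
}
```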
🔭 Function scope¶
{ and } signify the start and end of the function respectively;
we call this the function's scope. Inside the function scope we have a single function
call to printn, which lives in std::io. We use the last part of the module path, io, in front of
the function name to identify which module it belongs to.
📏 Imports can use a shorthand¶
We could have used the full path, std::io::printn,
if we wanted, but we can shorten it to just the lowest-level module: io::printn. This is the convention in C3 and is known as "path-shortening". It avoids writing long module paths that can make code harder to read.
The io::printn function takes a single argument and prints it, followed by a newline, then the function ends and the program terminates.
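To make the two spellings concrete, here is a small sketch that calls the same function once through its full path and once through the shortened path, relying on the statement above that both forms are accepted:

```c3
import std::io;

fn void main()
{
    // Full module path.
    std::io::printn("Printed using the full path");
    // Path-shortened form, the usual C3 convention.
    io::printn("Printed using the shortened path");
}
```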
🔧 Compiling the program¶
Let's take the above program and put it in a file called hello_world.c3.
We can then compile it with:
$ c3c compile hello_world.c3
And run it:
$ ./hello_world
It should print Hello, World! and return to the command line prompt.
If you are on Windows, you will have hello_world.exe instead. Call it in the same way.
🏃 Compiling and running¶
When we start out it can be useful to compile and then have the compiler start the
program immediately. We can do that with compile-run:
$ c3c compile-run hello_world.c3
> Program linked to executable 'hello_world'.
> Launching hello_world...
> Hello, World!
Want more options when compiling? Check the c3c compiler build options.
🎉 Successfully working?¶
Congratulations! You're now up and running with C3.
❓ Need help?¶
We're happy to help on the C3 Discord.
How to compile
Want To Download Pre-Built C3 Binaries?
Download C3, available on Mac, Windows and Linux.
For other platforms it should be possible to compile it on any platform LLVM can compile to. You will need git and CMake installed.
1. Install LLVM¶
See the LLVM documentation on how to set up LLVM for development.
- On MacOS, installing through Homebrew or MacPorts works fine.
- Using apt-get on Linux should work fine as well.
- For Windows you can download suitable pre-compiled LLVM binaries from here.
2. Clone the C3 compiler source code from Github¶
This should be as simple as running:
$ git clone https://github.com/c3lang/c3c.git
... from the command line.
3. Build the compiler¶
Create the build directory:
$ mkdir build
$ cd build
Use CMake to set up:
$ cmake ..
Build the compiler:
$ cmake --build .
4. Test it out¶
Building via Docker¶
You can build c3c using either an Ubuntu 18.04 or 20.04 container:
Replace 18 with 20 to build through Ubuntu 20.04.
For a release build specify:
A c3c executable will be found under bin/.
Building on Mac using Homebrew¶
- Install CMake: brew install cmake
- Install LLVM 17+: brew install llvm
- Clone the C3C GitHub repository: git clone https://github.com/c3lang/c3c.git
- Enter the C3C directory: cd c3c
- Create a build directory: mkdir build
- Change directory to the build directory: cd build
- Set up CMake build for debug: cmake ..
- Build: cmake --build .
Building on Mac using MacPorts¶
c3c may be built on Mac systems not supported by Homebrew
using the cmake, llvm-17 and clang-17
ports from MacPorts.
- Install CMake: sudo port install cmake
- Install LLVM 17: sudo port install llvm-17
- Install clang 17: sudo port install clang-17
- Clone the C3C GitHub repository: git clone https://github.com/c3lang/c3c.git
- Enter the C3C directory: cd c3c
- Create a build directory: mkdir build
- Change directory to the build directory: cd build
- ❗️Important before you run cmake❗️ Set LLVM_DIR to the directory with the llvm-17 macport .cmake files: export LLVM_DIR=/opt/local/libexec/llvm-17/lib/cmake/llvm
- Set up CMake build for debug: cmake ..
- Build: cmake --build .
See also discussion #1701
Prebuilt binaries¶
- Installing on Windows
- Installing on Mac Arm64
- Installing on Ubuntu
- Installing on Debian
- Installing on Arch
Installing on Windows¶
- Download the C3 compiler or the debug build.
- Unzip it into a folder
Optional: set c3c as a global environment variable¶
- copy the folder
- navigate to C:\Program Files
- paste the folder here
- navigate inside the folder you've pasted
- copy the path of the folder
- search for "edit the system environment variables" on your computer
- click on the "environment variables" button on the bottom right
- under "user variables" double click on "path"
- click on "new" and paste the path to the folder
- run c3c anywhere on your computer!
Installing on Mac Arm64¶
- Make sure you have XCode with command line tools installed.
- Download the C3 compiler or the debug build.
- Unzip executable and standard lib.
- Run ./c3c.
Installing on Ubuntu¶
- Download the C3 compiler or the debug build.
- Unpack executable and standard lib.
- Run ./c3c.
Installing on Debian¶
- Download the C3 compiler or the debug build.
- Unpack executable and standard lib.
- Run ./c3c.
Installing on Arch Linux¶
There is an AUR package for the c3c compiler: c3c-git.
You can use your AUR package manager:
Or clone it manually:
Troubleshooting¶
Note: If you get an error like No module named 'std::io' could be found, you may need to set the C3C_LIB environment variable to point to the standard library location:
Bash/Zsh:
Fish:
Windows (PowerShell):
"cc: not found"¶
On Linux and MacOS, C3 uses the available C compiler to link with the correct libraries. While C3 contains a built-in linker, it is likely that your system will lack a complete environment unless a C compiler is available.
Linux users should generally install GCC or Clang, according to their distribution's documentation. Below is a list of officially tested distributions and the minimum packages required to compile and link C3 programs:
| Distribution | Required Packages | Command |
|---|---|---|
| Ubuntu / Debian | gcc, libc6-dev | sudo apt-get install gcc libc6-dev |
| Fedora / Rocky | gcc | sudo dnf install gcc |
| Arch Linux | gcc | sudo pacman -S gcc |
| openSUSE | gcc, glibc-devel | sudo zypper install gcc glibc-devel |
| Alpine Linux | gcc, musl-dev | sudo apk add gcc musl-dev |
| Void Linux | gcc | sudo xbps-install -S gcc |
On MacOS, you can either install XCode or download the stand-alone command-line tools.
Project Setup
Projects in C3¶
Projects are optional, but are a good way to manage compiling code when there are a lot of files and modules. They also allow you to specify libraries to link, and define how your project should be built for specific targets.
💡 Creating a new project¶
The c3c init command will create a new directory containing your project structure.
It requires the name of the project; we will use myc3project as an example:
$ c3c init myc3project
You can also customize the path where the project will be created or specify a template. For more information check the init command reference.
📁 Project structure¶
If you check the directory that was created you might find it a bit confusing with a bunch of different directories, but worry not because if you expand them you will realise that most of them are actually empty!
.
├─ build/
├─ docs/
├─ lib/
├─ resources/
├─ scripts/
├─ src/
│ └─ main.c3
├─ test/
├─ LICENSE
├─ project.json
└─ README.md
Directory Overview¶
| Directory | Usage |
|---|---|
| ./build | Where your temporary files and build results will go. |
| ./docs | Code Documentation |
| ./lib | C3 libraries (with the .c3l suffix) |
| ./resources | Non-code resources like images, sound effects etc. |
| ./scripts | Scripts, including .c3 scripts that generate code at compile time. |
| ./src | Storing our code, by default contains main.c3 with "Hello World". |
| project.json | Record project information, similar to package.json in NodeJS. |
| LICENSE | Project license. |
| README.md | Help others understand and use your code. |
🔧 Building the project¶
Assuming you have successfully initialized a project as seen above, we can now look at how to compile it.
🏃 Build & run¶
C3 has a simple command to build & run our project.
c3c run
> Program linked to executable 'build/myc3project'.
> Launching ./build/myc3project...
> Hello, World!
You can also specify the target to build & run.
🔧 Build¶
If you only want to build the project, you can use the build command:
$ c3c build
This command builds the project targets defined in our project.json file.
Note
If you want to build a specific target, you can do so by specifying its name.
The default target is created with the name of the project, such as myc3project.
We will now have a binary in build, which we can run:
$ ./build/myc3project
It should print Hello, World! and return to the command line prompt.
If you are on Windows, you will have myc3project.exe instead. Call it in the same way.
If you need more detail later on check C3 project build commands and C3 project configuration to learn more.
Roadmap
C3 Roadmap¶
C3 Is Feature Stable¶
The C3 0.7.x series can be run in production with the same general caveats for using any pre-1.0 software.
While we strive to have zero bug count, there are still bugs being found. This means that anyone using it in production would need to stay updated with the latest fixes.
The focus of 0.8–0.9 will be fleshing out the cross-platform standard
library and making sure the syntax and semantics are solid. Also, the
toolchain will expand and improve. Please refer to this issue for what's
left in terms of features for 1.0.
The intended roadmap has one major 0.1 release per year:
| Date | Release |
|---|---|
| 2026-06-01 | 0.8 |
| 2027-04-01 | 0.9 |
| 2028-04-01 | 1.0 |
Compatibility¶
Minor releases in the same major release series are compatible.
For example 0.6.0, 0.6.1, ... 0.6.x are compatible and 0.7.0, 0.7.1, ... 0.7.x are compatible.
Standard library¶
The standard library is less mature than the compiler. It needs more
functionality and more tests. The compiler reaching a 1.0 release only
means a language freeze; the standard library will continue to evolve
past the 1.0 release.
Design Goals
Design goals¶
- Procedural language, with a pragmatic ethos to get work done.
- Minimalistic, no feature should be unnecessary or redundant.
- Stay close to C - only change where there is a significant need.
- Learning C3 should be easy for a C programmer.
- Seamless C integration.
- Ergonomic common patterns.
- Data is inert.
- Zero Is Initialization (ZII).*
- Avoid "big ideas".
* "Zero Is Initialization" is an idiom where types and code are written so that the zero value is a meaningful, initialized state.
Features¶
- Full C ABI compatibility
- Module system
- Operator overloading
- Generic modules
- Design by contract
- Zero overhead errors
- Semantic macro system
- First-class SIMD vector types
- Struct subtyping
- Safe array access using slices
- Safe array iteration using foreach
- Easy to use inline assembly
- Cross-platform standard library which includes dynamic containers and strings
- LLVM backend
C3 Background¶
C3 is an evolution of C, a minimalistic language designed for systems programming, enabling the same paradigms and retaining the same syntax as far as possible.
C3 started as an experimental fork of the C2 language by Bas van den Berg. It has evolved significantly, not just in syntax but also in regard to error handling, macros, generics and strings.
Language Overview
Examples
Overview¶
This is meant as a quick reference. To learn more of the details, check the relevant sections.
If Statement¶
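A minimal sketch of an if / else if chain, which follows the same form as in C. The function name sign and the surrounding main are hypothetical, added only to make the example self-contained:

```c3
import std::io;

// Returns 1, -1 or 0 depending on the sign of a.
fn int sign(int a)
{
    if (a > 0)
    {
        return 1;
    }
    else if (a < 0)
    {
        return -1;
    }
    return 0;
}

fn void main()
{
    io::printfn("%d %d %d", sign(5), sign(-3), sign(0));
}
```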
For Loop¶
fn void example_for()
{
// the for-loop is the same as C99.
for (int i = 0; i < 10; i++)
{
io::printfn("%d", i);
}
// also the same as C: an infinite loop
for (;;)
{
// ..
}
}
Foreach Loop¶
// Prints the values in the slice.
fn void example_foreach(float[] values)
{
foreach (index, value : values)
{
io::printfn("%d: %f", index, value);
}
}
// Updates each value in the slice
// by multiplying it by 2.
fn void example_foreach_by_ref(float[] values)
{
foreach (&value : values)
{
*value *= 2;
}
}
While Loop¶
fn void example_while()
{
// again exactly the same as C
int a = 10;
while (a > 0)
{
a--;
}
// Declaration
while (Point* p = getPoint())
{
// ..
}
}
Enum And Switch¶
Switches have implicit break and scope. Use "nextcase" to explicitly fallthrough or use comma:
enum Height : uint
{
LOW,
MEDIUM,
HIGH,
}
fn void demo_enum(Height h)
{
switch (h)
{
case LOW:
case MEDIUM:
io::printn("Not high");
// Implicit break.
case HIGH:
io::printn("High");
}
// This also works
switch (h)
{
case LOW:
case MEDIUM:
io::printn("Not high");
// Implicit break.
case Height.HIGH:
io::printn("High");
}
// Completely empty cases are not allowed.
switch (h)
{
case LOW:
break; // Explicit break required, since switches can't be empty.
case MEDIUM:
io::printn("Medium");
case HIGH:
break;
}
// special checking of switching on enum types
switch (h)
{
case LOW:
case MEDIUM:
case HIGH:
break;
default: // warning: default label in switch which covers all enumeration value
break;
}
// Using "nextcase" will fallthrough to the next case statement,
// and each case statement starts its own scope.
switch (h)
{
case LOW:
int a = 1;
io::printn("A");
nextcase;
case MEDIUM:
int a = 2;
io::printn("B");
nextcase;
case HIGH:
// a is not defined here
io::printn("C");
}
}
Enums are always namespaced.
Enums support various reflection properties: .values returns an array with all enums. .len or .elements returns the number
of enum values, .inner returns the storage type. .names returns an array with the names of all enums. .associated
returns an array of the typeids of the associated values for the enum.
enum State : uint
{
START,
STOP,
}
State start = State.values[0];
usz enums = State.elements; // 2
String[] names = State.names; // [ "START", "STOP" ]
Duff's Device¶
Using nextcase we can implement a version of Duff's Device:
fn void duff(int* to, int* from, int count)
{
int n = (count + 7) / 8;
switch (count % 8)
{
case 0: *to++ = *from++; nextcase;
case 7: *to++ = *from++; nextcase;
case 6: *to++ = *from++; nextcase;
case 5: *to++ = *from++; nextcase;
case 4: *to++ = *from++; nextcase;
case 3: *to++ = *from++; nextcase;
case 2: *to++ = *from++; nextcase;
case 1: *to++ = *from++; if (--n > 0) nextcase 0;
}
}
Defer¶
Defer will be invoked on scope exit.
fn void test(int x)
{
defer io::printn();
defer io::print("A");
if (x == 1) return;
{
defer io::print("B");
if (x == 0) return;
}
io::print("!");
}
fn void main()
{
test(1); // Prints "A"
test(0); // Prints "BA"
test(10); // Prints "B!A"
}
Because it's often relevant to run different defers when having an error return there is also a way to create an error defer, by using the catch keyword directly after the defer.
Similarly, using defer try can be used to only run if the scope exits in a regular way.
fn void? test(int x)
{
defer io::printn("");
defer io::print("A");
defer try io::print("X");
defer catch io::print("B");
defer (catch err) io::printf("%s", err);
if (x == 1) return NOT_FOUND~;
io::print("!");
}
fn void main()
{
(void)test(0); // Prints "!XA"
(void)test(1); // Prints "builtin::NOT_FOUNDBA" and returns a NOT_FOUND
// Note that we need to use (void) to explicitly discard the Optional result.
}
Struct Types¶
alias Callback = fn int(char c);
enum Status : int
{
IDLE,
BUSY,
DONE,
}
struct MyData
{
char* name;
Callback open;
Callback close;
Status status;
// named sub-structs (x.other.value)
struct other
{
int value;
int status; // ok, no name clash with other status
}
// anonymous sub-structs (x.value)
struct
{
int value;
int status; // error, name clash with other status in MyData
}
// anonymous union (x.person)
union
{
Person* person;
Company* company;
}
// named sub-unions (x.either.this)
union either
{
int this;
bool or;
char* that;
}
}
Function Pointers¶
module demo;
alias Callback = fn int(char* text, int value);
fn int my_callback(char* text, int value)
{
return 0;
}
Callback cb = &my_callback;
fn void example_cb()
{
int result = cb("demo", 123);
// ..
}
Error Handling¶
Errors are handled using optional results, denoted with a '?' suffix. A variable of an optional
result type may either contain the regular value or a fault value.
faultdef DIVISION_BY_ZERO;
fn double? divide(int a, int b)
{
// We return an optional result of type DIVISION_BY_ZERO
// when b is zero.
if (b == 0) return DIVISION_BY_ZERO~;
return (double)a / (double)b;
}
// Re-returning an optional result uses "!" suffix
fn void? test_may_fail()
{
divide(foo(), bar())!;
}
fn void main()
{
// ratio is an optional result.
double? ratio = divide(foo(), bar());
// Handle the fault if the optional result holds one.
if (catch err = ratio)
{
switch (err)
{
case DIVISION_BY_ZERO:
io::printn("Division by zero");
return;
default:
io::printn("Unexpected error!");
return;
}
}
// Flow typing makes "ratio"
// have the plain type 'double' here.
io::printfn("Ratio was %f", ratio);
}
fn void print_file(String filename)
{
String? file = (String)file::load_temp(filename);
// The following function is not called on error,
// so we must explicitly discard it with a void cast.
(void)io::printfn("Loaded %s and got:%s", filename, file);
if (catch err = file)
{
switch(err)
{
case io::FILE_NOT_FOUND:
io::printfn("I could not find the file %s", filename);
default:
io::printfn("Could not load %s.", filename);
}
}
}
// Note that the above is only illustrating how Optionals may skip
// call invocation. A more normal implementation would be:
fn void print_file2(String filename)
{
String? file = (String)file::load_temp(filename);
if (catch err = file)
{
// Print the error
io::printfn("Failed to load %s: %s", filename, err);
// We return, so that below 'file' will be unwrapped.
return;
}
// No need for a void cast here, 'file' is unwrapped to 'String'.
io::printfn("Loaded %s and got:\n%s", filename, file);
}
Read more about optionals and error handling here.
Contracts¶
Pre- and postconditions are optionally compiled into asserts helping to optimize the code.
<*
@param foo : "the number of foos"
@require foo > 0, foo < 1000
@return "number of foos x 10"
@ensure return < 10000, return > 0
*>
fn int test_foo(int foo)
{
return foo * 10;
}
<*
@param array : "the array to test"
@param length : "length of the array"
@require length > 0
*>
fn int get_last_element(int* array, int length)
{
return array[length - 1];
}
Read more about contracts here.
Struct Methods¶
It's possible to namespace functions with a union, struct or enum type to enable "dot syntax" calls:
struct Foo
{
int i;
}
fn void Foo.next(Foo* this)
{
if (this) this.i++;
}
fn void test()
{
Foo foo = { 2 };
foo.next();
foo.next();
// Prints 4
io::printfn("%d", foo.i);
}
Macros¶
Macro arguments may be immediately evaluated.
macro foo(a, b)
{
return a(b);
}
fn int square(int x)
{
return x * x;
}
fn int test()
{
int a = 2;
int b = 3;
return foo(&square, 2) + a + b; // 9
// return foo(square, 2) + a + b;
// Error: function should be followed by (...) or prefixed by &.
}
Macro arguments may have deferred evaluation, which is basically duplication of the expression using #var syntax.
macro @foo(#a, b, #c)
{
#c = #a(b) * b;
}
macro @foo2(#a)
{
return #a * #a;
}
fn int square(int x)
{
return x * x;
}
fn int test1()
{
int a = 2;
int b = 3;
@foo(square, a + 1, b);
return b; // 27
}
fn int printme(int a)
{
io::printn(a);
return a;
}
fn int test2()
{
return @foo2(printme(2)); // Returns 4 and prints "2" twice.
}
Improve macro errors with preconditions:
<*
@param x : "value to square"
@require types::is_numerical($typeof(x)) : "cannot multiply"
*>
macro square(x)
{
return x * x;
}
fn void test()
{
square("hello"); // Error: cannot multiply "hello"
int a = 1;
square(&a); // Error: cannot multiply '&a'
}
Read more about macros here.
Compile Time Reflection & Execution¶
Access type information and loop over values at compile time:
import std::io;
struct Foo
{
int a;
double b;
int* ptr;
}
macro print_fields($Type)
{
$foreach $field : $Type.membersof:
io::printfn("Field %s, offset: %s, size: %s, type: %s",
$field.nameof, $field.offsetof, $field.sizeof, $field.typeid.nameof);
$endforeach
}
fn void main()
{
print_fields(Foo);
}
This prints on x64:
Field a, offset: 0, size: 4, type: int
Field b, offset: 8, size: 8, type: double
Field ptr, offset: 16, size: 8, type: int*
Compile Time Execution¶
Macros with only compile time variables are completely evaluated at compile time:
macro long fib(long $n)
{
$if $n <= 1:
return $n;
$else
return fib($n - 1) + fib($n - 2);
$endif
}
const long FIB19 = fib(19);
// Same as const long FIB19 = 4181;
Note
C3 macros are designed to provide a replacement for C preprocessor macros. They extend such macros by providing compile time evaluation using constant folding, which offers an IDE friendly, limited, compile time execution.
However, if you are doing more complex compile time code generation it is recommended to use $exec and related techniques to generate code in external scripts instead.
Read more about compile time execution here.
Operator Overloading¶
struct Vec2
{
int x, y;
}
fn Vec2 Vec2.add(self, Vec2 other) @operator(+)
{
return { self.x + other.x, self.y + other.y };
}
fn Vec2 Vec2.sub(self, Vec2 other) @operator(-)
{
return { self.x - other.x, self.y - other.y };
}
fn void main()
{
Vec2 v1 = { 1, 2 };
Vec2 v2 = { 100, 4 };
Vec2 v3 = v1 + v2; // v3 = { 101, 6 }
}
Read more about operator overloading here.
Generics¶
Declarations may be generic.
module stack;
struct Stack <Type>
{
usz capacity;
usz size;
Type* elems;
}
fn void Stack.push(Stack* this, Type element)
{
if (this.capacity == this.size)
{
this.capacity = this.capacity ? this.capacity * 2 : 16;
this.elems = realloc(this.elems, Type.sizeof * this.capacity);
}
this.elems[this.size++] = element;
}
fn Type Stack.pop(Stack* this)
{
assert(this.size > 0);
return this.elems[--this.size];
}
fn bool Stack.empty(Stack* this)
{
return !this.size;
}
Testing it out:
alias IntStack = Stack{int};
fn void test()
{
IntStack stack;
stack.push(1);
stack.push(2);
// Prints pop: 2
io::printfn("pop: %d", stack.pop());
// Prints pop: 1
io::printfn("pop: %d", stack.pop());
Stack {double} dstack;
dstack.push(2.3);
dstack.push(3.141);
dstack.push(1.1235);
// Prints pop: 1.1235
io::printfn("pop: %f", dstack.pop());
}
Read more about generics here.
Dynamic Calls¶
Runtime dynamic dispatch through interfaces:
import std::io;
// Define a dynamic interface
interface MyName
{
fn String myname();
}
struct Bob (MyName) { int x; }
// Required implementation as Bob implements MyName
fn String Bob.myname(Bob*) @dynamic { return "I am Bob!"; }
// Ad hoc implementation
fn String int.myname(int*) @dynamic { return "I am int!"; }
fn void whoareyou(any a)
{
MyName b = (MyName)a;
if (!&b.myname)
{
io::printn("I don't know who I am.");
return;
}
io::printn(b.myname());
}
fn void main()
{
int i = 1;
double d = 1.0;
Bob bob;
any a = &i;
whoareyou(a);
a = &d;
whoareyou(a);
a = &bob;
whoareyou(a);
}
Read more about dynamic calls here.
Classic text games¶
Here are two classic simple text based games showcasing C3 features and the C3 standard library.
Guess a number¶
import std::io, std::math::random;
fn int main()
{
int secret = rand(20) + 1;
int tries = 6;
// game loop
while OUTER: (true)
{
io::printfn("Enter a guess between 1 and 20, "
"%d tries remaining", tries);
int? guess = io::treadline().to_int();
if (catch err = guess)
{
if (err == io::EOF) return 1; // Prevent infinite loop
io::printn("That wasn't a valid number, try again.");
continue;
}
switch
{
case guess < secret: io::printn("Too Small");
case guess > secret: io::printn("Too Large");
default: io::printn("You Win!"); break OUTER;
}
if (--tries == 0)
{
io::printfn("Game Over - the number was %s", secret);
break;
}
}
io::printn("Thank you for playing!");
return 0;
}
Rock, paper, scissors¶
import std::io, std::math::random;
enum Action : (String abbrev, String full)
{
ROCK { "r", "Rock" },
PAPER { "p", "Paper" },
SCISSORS { "s", "Scissors" },
}
const ROUNDS = 3;
fn int main()
{
int p_score;
int c_score;
int rounds = ROUNDS;
io::printfn("Let's play Rock-Paper-Scissors!");
while (rounds > 0)
{
io::printfn("Best out of %d, %d rounds remaining. ", ROUNDS, rounds);
io::printn("What is your guess? [r]ock, [p]aper, or [s]cissors?");
Action guess;
while (true)
{
String? s = io::treadline();
if (catch s) return 1;
if (try current_guess = Action.lookup_field(abbrev, s))
{
guess = current_guess;
break;
}
io::printn("input invalid.");
}
io::printfn("Player: %s", guess.full);
Action comp = Action.from_ordinal(rand(3));
io::printfn("Computer: %s", comp.full);
switch
{
case comp == ROCK && guess == SCISSORS:
case comp == SCISSORS && guess == PAPER:
case comp == PAPER && guess == ROCK:
io::printn("Computer Score!");
c_score++;
rounds--;
case guess == ROCK && comp == SCISSORS:
case guess == SCISSORS && comp == PAPER:
case guess == PAPER && comp == ROCK:
io::printn("Player Score!");
p_score++;
rounds--;
default:
io::printn("Tie!");
}
io::printfn("Score: Player: %d, Computer: %d", p_score, c_score);
}
switch
{
case p_score < c_score: io::printn("COMPUTER WINS GAME!");
case p_score > c_score: io::printn("PLAYER WINS GAME!");
default: io::printn("GAME TIED!");
}
io::printn("Thank you for playing.");
return 0;
}
Type System
Overview¶
As usual, types are divided into basic types and user defined types (enum, union, struct, typedef, bitstruct). All types are defined on a global level.
Naming¶
All user-defined types in C3 start with upper-case. So MyStruct or Mystruct would be fine, mystruct_t or mystruct would not.
This naming requirement ensures that the language is easy to parse for tools.
It is possible to use attributes to change the external name of a type:
This affects generated C headers, but little else.
Differences from C¶
Unlike C, C3 does not use type qualifiers. const exists,
but is a storage class modifier, not a type qualifier.
Instead of volatile, volatile loads and stores are implemented using @volatile_load and @volatile_store.
Restrictions on function parameter usage are implemented through parameter preconditions.
C3's equivalent of C's typedef is named alias and has a slightly different syntax. C3's own typedef keyword instead creates a distinct type. Take care not to confuse C3's alias and typedef keywords with C's typedef.
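The difference between the two keywords can be sketched as follows. The names Meters and Seconds are hypothetical, and the example assumes the 0.7-style syntax described above, in which alias introduces another name for a type and typedef introduces a distinct one:

```c3
import std::io;

alias Meters = int;       // alias: another name for int, fully interchangeable
typedef Seconds = int;    // typedef: a new, distinct type backed by int

fn void main()
{
    int plain = 10;
    Meters m = plain;            // OK: Meters is the same type as int
    Seconds s = (Seconds)plain;  // a distinct type needs an explicit cast
    io::printfn("%d %d", m, (int)s);
}
```

Assigning plain to s without the cast would be expected to fail to compile, which is the point of using a distinct type.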
C3 also requires all function pointers to be used with an alias. For example:
alias Callback = fn void();
Callback a = null; // Ok!
fn Callback getCallback() { /* ... */ } // Ok!
// fn fn void() getCallback() { /* ... */ } - ERROR!
// fn void() a = null; - ERROR!
Compile time properties¶
Types have built-in type properties available through .method syntax. The following properties
are common to all C3 runtime types:
- alignof - The standard alignment of the type in bytes. For example int.alignof will typically be 4.
- kindof - The category of type, e.g. TypeKind.POINTER, TypeKind.STRUCT (see std::core::types).
- extnameof - Returns a string with the extern name of the type, rarely used.
- nameof - Returns a string with the unqualified name of the type.
- qnameof - Returns a string with the qualified (using the full path) name of the type.
- sizeof - Returns the storage size of the type in bytes.
- typeid - Returns a runtime typeid for the type.
- methodsof - Returns the methods implemented for a type.
- has_tagof(tagname) - Returns true if the type has a particular tag.
- tagof(tagname) - Retrieves the tag defined on the type.
- is_eq - True if the type implements ==.
- is_ordered - True if the type implements comparisons.
- is_substruct - True if the type has an inline member.
*Note: 0.8.0 moves to the int::align syntax instead.
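A short sketch of reading a few of these properties, using the dot syntax described in this section (which, per the note above, may change in 0.8.0). The struct Point is a hypothetical example type:

```c3
import std::io;

struct Point
{
    int x;
    int y;
}

fn void main()
{
    // Size and alignment are reported in bytes.
    io::printfn("int:   size %d, alignment %d", int.sizeof, int.alignof);
    // nameof gives the unqualified type name as a string.
    io::printfn("Point: size %d, name %s", Point.sizeof, Point.nameof);
}
```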
Basic types¶
Basic types are divided into floating point types and integer types.
Integer types are either signed or unsigned.
Integer types¶
| Name | bit size | signed |
|---|---|---|
| bool† | 1 | no |
| ichar | 8 | yes |
| char | 8 | no |
| short | 16 | yes |
| ushort | 16 | no |
| int | 32 | yes |
| uint | 32 | no |
| long | 64 | yes |
| ulong | 64 | no |
| int128 | 128 | yes |
| uint128 | 128 | no |
| iptr‡ | varies | yes |
| uptr‡ | varies | no |
| isz‡ | varies | yes |
| usz‡ | varies | no |
†: bool will be stored as a byte.
‡: Size, pointer and pointer-sized types depend on the target platform.
Note that isz is renamed sz from 0.8.0 and onwards.
Integer type properties¶
Integer types (except for bool) also have the following type properties:
- max - The maximum value for the type.
- min - The minimum value for the type.
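As a small illustration of min and max, the following sketch prints the range of a few integer types from the table above:

```c3
import std::io;

fn void main()
{
    // char is unsigned and ichar is signed, both 8 bits wide.
    io::printfn("char:  %d to %d", char.min, char.max);
    io::printfn("ichar: %d to %d", ichar.min, ichar.max);
    io::printfn("int:   %d to %d", int.min, int.max);
}
```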
Integer arithmetics¶
All signed integer arithmetic uses 2's complement.
Integer constants¶
Integer constants are written like 1293832 or -918212.
Integers may be written in decimal, but also
- in binary with the prefix 0b, e.g. 0b0101000111011, 0b011
- in octal with the prefix 0o, e.g. 0o0770, 0o12345670
- in hexadecimal with the prefix 0x, e.g. 0xdeadbeef, 0x7f7f7f
In the case of binary, octal and hexadecimal, the type is assumed to be unsigned.
Furthermore, underscore _ may be used to add space between digits to improve readability, e.g. 0xFFFF_1234_4511_0000, 123_000_101_100.
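The notations above can be tried out with a sketch like the following; the variable names are illustrative only:

```c3
import std::io;

fn void main()
{
    int million = 1_000_000;  // decimal with underscores as digit separators
    uint binary = 0b1010;     // binary, value 10
    uint octal = 0o17;        // octal, value 15
    uint hex = 0xFF;          // hexadecimal, value 255
    io::printfn("%d %d %d %d", million, binary, octal, hex);
}
```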
Integer literal suffix and type¶
Integer literals follow C's rules:
1. A decimal literal is by default int. If it does not fit in an int, the type is long or int128, picking the smallest type that fits the literal.
2. If the literal is suffixed by u or U it is instead assumed to be a uint, but will be ulong or uint128 if it doesn't fit, like in (1).
3. Binary, octal and hexadecimal will implicitly be unsigned.
4. If an l or L suffix is given, the type is assumed to be long. If ll or LL is given, it is assumed to be int128.
5. If the ul or UL suffix is given, the type is assumed to be ulong. If ull or ULL is given, it is assumed to be uint128.
6. If a binary, octal or hexadecimal literal starts with zeros, infer the type size from the number of bits that would be needed if all digits were the maximum for the base.
$typeof(1); // int
$typeof(1u); // uint
$typeof(1L); // long
$typeof(0x11); // uint, hex is unsigned by default
$typeof(0x1ULL); // uint128
$typeof(4000000000); // long, since the number exceeds int.max
$typeof(0x000000000000); // ulong: 12 hex chars indicate a 48 bit value
$typeof(0b000000000000); // uint: 12 binary chars indicate a 12 bit value
TwoCC, FourCC and EightCC literals¶
FourCC codes are often used to identify binary format types. C3 adds direct support for 4 character codes, but also 2 and 8 characters:
- 2 character strings, e.g. 'C3', would convert to a ushort or short.
- 4 character strings, e.g. 'TEST', converts to a uint or int.
- 8 character strings, e.g. 'FOOBAR11' converts to a ulong or long.
Conversion is always done so that the character string has the correct ordering in memory. This means that the same characters may have different integer values on different architectures due to endianness.
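The memory ordering described above can be checked with a small sketch. It is illustrative only: the helper fourcc_byte is a hypothetical name, and the example assumes that viewing a uint through a char* is an acceptable way to inspect its bytes.

```c3
// Hypothetical helper: returns the byte stored at position `index`
// of a 4 character code.
fn char fourcc_byte(uint code, int index)
{
    char* bytes = (char*)&code;   // view the uint as raw bytes
    return bytes[index];
}

fn void main()
{
    uint tag = 'TEST';
    // The characters appear in source order in memory,
    // whatever the endianness of the target is.
    assert(fourcc_byte(tag, 0) == 'T');
    assert(fourcc_byte(tag, 1) == 'E');
    assert(fourcc_byte(tag, 2) == 'S');
    assert(fourcc_byte(tag, 3) == 'T');
}
```

The numeric value of tag itself is deliberately not compared against a constant, since that value differs between little-endian and big-endian targets.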
Base64 and hex data literals¶
Base64 encoded values work like TwoCC/FourCC/EightCC, in that they are laid out in byte order in memory. They use the format b64'<base64>'. Hex encoded values work like Base64 but with the format x'<hex>'. In data literals any whitespace is ignored, so x'00 00 11' encodes to the same value as x'000011'.
In our case we could encode 'FOOBAR11' as b64'Rk9PQkFSMTE='.
Base64 and hex data literals initialize to arrays of the char type:
char[*] hello_world_base64 = b64"SGVsbG8gV29ybGQh";
char[*] hello_world_hex = x"4865 6c6c 6f20 776f 726c 6421";
String literals, and raw strings¶
Regular string literals are text enclosed in " ... " just like in C. C3 also offers another type of literal: raw strings.
Raw strings use text between ` `. Inside of a raw string, no escapes are available, and it can span across multiple lines. To write a ` double the character:
String foo = `C:\foo\bar.dll`;
ZString bar = `"Say ``hello``"`;
String baz =
`pushq %rax;
addq $1, %rax;
popq %rax;`;
// Same as
String foo = "C:\\foo\\bar.dll";
String bar = "\"Say `hello`\"";
String baz = "pushq %rax;\naddq $1, %rax;\npopq %rax;";
Floating point types¶
| Name | bit size |
|---|---|
| bfloat16† | 16 |
| float16† | 16 |
| float | 32 |
| double | 64 |
| float128† | 128 |
†: Support is still incomplete and not all systems have native support.
Floating point type properties¶
On top of the regular properties, floating point types also have the following properties:
- max: The maximum value for the type.
- min: The minimum value for the type.
- inf: Infinity.
- nan: Float NaN.
Floating point constants¶
Floating point constants will at least use 64 bit precision. Just like for integer constants, it is allowed to use underscore, but it may not occur immediately before or after a dot or an exponential.
Floating point values may be written in decimal or hexadecimal. For decimal, the exponential symbol is e (or E, both are acceptable), for hexadecimal p (or P) is used: -2.22e-21 -0x21.93p-10
By default a floating point literal is of type double, but if the suffix f is used (e.g. 1.0f), it is instead of type float.
C compatibility¶
For C compatibility the following types are also defined in std::core::cinterop
| Name | C type |
|---|---|
| CChar | char |
| CShort | short int |
| CUShort | unsigned short int |
| CInt | int |
| CUInt | unsigned int |
| CLong | long int |
| CULong | unsigned long int |
| CLongLong | long long |
| CULongLong | unsigned long long |
| CLongDouble | long double |
float and double will always match their C counterparts.
Note that signed C char and unsigned char will correspond to ichar and char. CChar is only available to match the default signedness of char on the platform.
Other built-in types¶
Pointer types¶
Pointers mirror C: Foo* is a pointer to a Foo, while Foo** is a pointer to a pointer of Foo.
Pointer type properties¶
In addition to the standard properties, pointers also have the inner
property. It returns the type of the object pointed to as a typeid.
Optional¶
An Optional type is created by taking a type and appending ?.
An Optional type behaves like a tagged union, containing either the
Result or an Empty, which also carries a fault type.
Once extracted, a fault can be converted to another fault.
faultdef MISSING; // define a fault
int? i;
i = 5; // Assigning a real value to i.
i = io::EOF~; // Assigning an Empty that carries the fault io::EOF to i.
fault b = MISSING; // Assign a fault to b
b = @catch(i); // Assign the fault in i to b (io::EOF)
Only variables, expressions and function returns may be Optionals. Function and macro parameters in their definitions may not be optionals.
fn Foo*? getFoo() { /* ... */ } // ✅ Ok!
int? x = 0; // ✅ Ok!
fn void processFoo(Foo*? f) { /* ... */ } // ❌ fn parameter
An Optional value can use the special if-try and if-catch to unwrap its result or its Empty.
It is also possible to implicitly return if it is Empty using !, or to panic using !!.
To learn more about the Optional type and error handling in C3, read the page on Optionals and error handling.
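The unwrapping forms mentioned above can be sketched as follows. This is a minimal example and not taken from the standard library: the fault NOT_POSITIVE and the functions half_of and quarter_of are hypothetical, and the ~ suffix follows the syntax used on this page.

```c3
import std::io;

faultdef NOT_POSITIVE;   // hypothetical fault for this sketch

fn int? half_of(int x)
{
    if (x <= 0) return NOT_POSITIVE~;   // return an Empty carrying the fault
    return x / 2;
}

fn int? quarter_of(int x)
{
    int half = half_of(x)!;   // '!' returns the fault to our caller if Empty
    return half / 2;
}

fn void main()
{
    if (try v = half_of(10))
    {
        io::printfn("half: %d", v);         // v is a plain int here
    }
    if (catch reason = half_of(-1))
    {
        io::printfn("failed: %s", reason);  // reason is the fault
    }
    int q = quarter_of(8)!!;                // '!!' panics if the result is Empty
    io::printfn("quarter: %d", q);
}
```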
Note
If you want a more regular "optional" value, to store in structs, then you can use the generic Maybe type in std::collections.
The fault type¶
When an Optional does not contain a result, it is Empty, but contains a fault which explains why there was no
normal value. A fault has the special property that together with the ~ suffix it creates an Empty value:
int? x = IO_ERROR~; // 'IO_ERROR~' is an Optional Empty.
fault y = IO_ERROR; // Here IO_ERROR is just a regular
// value, since it isn't followed by '~'
A new fault value can only be defined using the faultdef statement:
Like the typeid type, a fault is pointer sized
and each value defined by faultdef is globally unique. This is true even when faults are separately compiled.
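As a sketch of fault definition and comparison, assuming that faultdef accepts a comma-separated list of names; PARSE_FAILED and NOT_FOUND are made-up faults for illustration:

```c3
import std::io;

faultdef PARSE_FAILED, NOT_FOUND;   // hypothetical faults

fn void main()
{
    fault a = NOT_FOUND;
    fault b = PARSE_FAILED;
    assert(a != b);                 // each fault value is unique

    int? missing = NOT_FOUND~;      // an Empty carrying NOT_FOUND
    assert(@catch(missing) == a);

    io::printfn("%s", a.nameof);    // prints the namespaced fault name
}
```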
Note
The underlying unique value assigned to a fault may vary each time a program is run.
Fault nameof¶
The fault type only has one field: nameof, which returns the name of the fault, namespaced with the last module path, e.g. "io::EOF".
The typeid type¶
The typeid holds the runtime representation of a type. Using <typename>.typeid a type may be converted to its unique runtime id,
e.g. typeid a = Foo.typeid;. The value itself is pointer-sized.
Typeid fields¶
At compile time, a typeid value has all the properties of its underlying type:
However, at runtime only a few are available:
- sizeof - always supported.
- kindof - always supported.
- parentof - supported on distinct and struct types, returning the inline member type.
- inner - supported on types implementing it.
- names - supported on enum types.
- len - supported on arrays, vectors and enums.
The any type¶
C3 contains a built-in variant type, any, which is essentially a struct containing a typeid plus a void* pointer to a value.
While it is possible to cast the any pointer to any pointer type, it is recommended to use the anycast macro or to check the type explicitly first. With the anycast macro, the return value will be
an optional, which is empty if there is a mismatch.
fn void main()
{
int x;
any y = &x;
int* w = (int*)y; // Returns the pointer to x
double* z_bad = (double*)y; // Don't do this!
double*? z = anycast(y, double); // The safe way to get a value
if (y.type == int.typeid)
{
// Do something if y contains an int*
}
if (try v = anycast(y, int))
{
// same as above, but v holds the unwrapped int*
}
}
You can use a switch to check an any's type, as well. After the type has been confirmed, it is safe to dereference.
fn void test(any z)
{
// Switch
switch (z.type)
{
case int:
// This is safe here:
int* y = (int*)z;
case double:
// This is safe here:
double* y = (double*)z;
}
// Assignment switch
switch (y = z, y.type)
{
case int:
// This is safe here:
int* x = (int*)y;
}
// Finally, if we just want to deal with the case
// where it is a single specific type:
if (z.type == int.typeid)
{
// This is safe here:
int* a = (int*)z;
}
if (try b = *anycast(z, int))
{
// b is an int:
foo(b * 3);
}
}
Note that in switches, if a substruct type is passed in and its parent matches first, the parent will take priority.
fn void test(any z)
{
// Will always be seen as the parent type.
switch (z.type)
{
case Parent:
// code...
case Subtype:
// code that will never execute...
}
// So order the subtypes first
// if you're comparing them against their parent.
// Of course, this is still useful in cases
// of inherited types where the parent isn't in the switch.
switch (z.type)
{
case Parent:
// modify data both Parent and Subtype have
case SomethingElse:
// completely different type code
}
}
If you don't want the child type detected as the parent type, a typedef can be used to create a distinct type without changing any data.
any fields¶
At runtime, any gives you access to two fields:
- some_any.type - returns the underlying pointee typeid of the contained value.
- some_any.ptr - returns the raw void* pointer to the contained value.
Advanced use of any¶
The standard library has several helper macros to manipulate any types:
- anycast(some_any, Type) returns a Type* pointer, or TYPE_MISMATCH if the types don't match.
- any_make(ptr, some_typeid) creates an any to a given typeid using a void*.
- some_any.retype_to(some_typeid) changes the type of an any to the given typeid.
- some_any.as_inner() retypes the type of the any to the "inner" (see the inner type property) of the current type.
void* some_ptr = foo();
// Essentially (any)(int*)(some_ptr)
any some_int = any_make(some_ptr, int.typeid);
// Same as any_make(some_int.ptr, uint.typeid)
any some_uint = some_int.retype_to(uint.typeid);
typedef SomeType = int;
SomeType s = 3;
any any_val = &s;
// Result is the same as any_make(any_val.ptr, int.typeid)
any some_inner_int = any_val.as_inner();
Array types¶
Arrays are indicated by [size] after the type, e.g. int[4]. Slices use the type[]. For initialization the wildcard type[*] can be used to infer the size
from the initializer. See the chapter on arrays.
Vector types¶
Vectors use [<size>] after the type, e.g. float[<3>], with the restriction that vectors may only form out
of integers, floats and booleans. Similar to arrays, wildcard can be used to infer the size of a vector: int[<*>] a = { 1, 2 }.
Array and vector type properties¶
Array and vector types also support:
- inner: Returns the type of each element.
- len: Gives the length of the type.
User defined types¶
Type aliases (C's typedef)¶
C3 has a construct that behaves essentially the same as C's "typedef", an alias, and it is declared using the syntax alias <new_name> = <old_name>. For example:
These are not proper types, just aliases, and querying their properties will query the properties of the aliased type.
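A minimal sketch of a type alias in use. The names Index and advance are hypothetical and chosen only for this illustration:

```c3
alias Index = int;   // hypothetical alias name

fn Index advance(Index i)
{
    return i + 1;
}

fn void main()
{
    int plain = 41;
    Index idx = advance(plain);   // int and Index are the same type
    plain = idx;                  // so no cast is needed in either direction
    assert(plain == 42);
    // Properties are those of the aliased type.
    assert(Index.sizeof == int.sizeof);
}
```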
Function pointer types¶
Function pointers are always used through an alias:
To form a function pointer type, write a normal function declaration but skip the function name, so that fn int foo(double x) becomes
fn int(double x).
Function pointers can have default arguments, e.g. alias Callback = fn void(int value = 0) but default arguments
and parameter names are not taken into account when determining function pointer assignability:
alias Callback = fn void(int value = 1);
fn void test(int a = 0) { /* ... */ }
Callback callback = &test; // Ok
fn void main()
{
callback(); // Works, same as test(1);
test(); // Works, same as test(0);
callback(value: 3); // Works, same as test(3)
test(a: 4); // Works, same as test(4)
// callback(a: 3); // ERROR!
}
Function pointer type properties¶
Function pointer types also support:
- paramsof - Returns a list of ReflectedParam for each parameter.
- returns - This returns the return type.
Typedef - Distinct type definitions¶
typedef creates a new type that has the same properties as the original type but is distinct from it, and it cannot implicitly convert into the other type. It is declared using the syntax
typedef <name> = <type>
typedef MyId = int;
typedef MyId2 @constinit = int;
fn void* get_by_id(MyId id) { ... }
fn void* get_by_id2(MyId2 id) { ... }
fn void test(MyId id)
{
void* val = get_by_id(id); // Ok
// void* val2 = get_by_id(1); // ERROR expected a MyId
// Use `@constinit` to allow implicit conversion from
// literals
void* val2 = get_by_id2(1);
int a = 1;
// void* val3 = get_by_id(a); // ERROR expected a MyId
// `@constinit` doesn't work on non-literals
// void* val3 = get_by_id2(a); // ERROR expected a MyId2
void* val4 = get_by_id((MyId)a); // Works
// a = id; // ERROR can't assign 'MyId' to 'int'
}
Inline typedef¶
Using inline in the typedef declaration allows a newly created typedef type to implicitly convert to its underlying type:
typedef Abc @constinit = int;
typedef Bcd @constinit = inline int;
fn void test()
{
Abc a = 1;
Bcd b = 1;
// int i = a; Error: Abc cannot be implicitly converted to 'int'
int i = b; // This is valid
// However, 'inline' does not allow implicit conversion from
// the inline type to the typedef type:
// a = i; Error: Can't implicitly convert 'int' to 'Abc'
// b = i; Error: Can't implicitly convert 'int' to 'Bcd'
}
Aligned typedefs¶
It's possible to use typedef to create underaligned types. For example, typically an int will be 4 byte aligned, but we can create a 2-byte aligned type using typedef IntAlign2 = int @align(2);.
Storage SIMD types¶
Vectors are normally stored and passed as arrays to prevent SIMD alignment overhead. However, it's possible to define types that exactly match the SIMD types in C and other languages for storage and argument passing. These types are defined with typedef and the @simd attribute, similar to aligned typedefs: typedef Float4 = float[<4>] @simd
Typedef type properties¶
In addition to the normal properties, typedef also supports:
- inner - Returns the type this is based on as a typeid.
- parentof - If this is an inline typedef, return the same as inner.
Generic types¶
import generic_list; // Contains the generic MyList
struct Foo
{
int x;
}
// ✅ alias for each type used with a generic module.
alias MyListFoo = MyList {Foo};
MyListFoo working_example;
fn void main()
{
// ❌ A nested inline type definition in a function context
// will yield an error, it's only available on the top
// level or in macros. Prefer aliases.
MyList {MyList {int}} failing_example;
}
Enum and constdefs¶
These correspond to C's enum. See enums and constdefs.
Struct types¶
Read more about unions and structs and bitstructs.
C to C3
A Guide For C Programmers
Overview¶
This primer is intended for existing C programmers.
It is a guide to how the C syntax, and in some cases C semantics, are different in C3. It should help you take a piece of C code and understand how it can be converted manually to C3.
Functions¶
Functions are declared like C, but you need to put fn in front:
Find out more about functions, including named arguments and default arguments.
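As a small illustration of the declaration syntax, with the C version shown in a comment (the function add is a made-up example):

```c3
// C:  int add(int a, int b) { return a + b; }

// C3: the same function, with `fn` in front.
fn int add(int a, int b)
{
    return a + b;
}

fn void main()
{
    assert(add(2, 3) == 5);
}
```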
Calling C Functions¶
Declare a function (or variable) with extern and it will be possible to
access it from C3:
Note that currently only the C standard library is automatically passed to the linker. In order to link with other libraries, you need to explicitly tell the compiler to link them.
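A minimal sketch of an extern declaration, using puts from the C standard library since that library is linked automatically. The parameter name message is arbitrary:

```c3
// Declare libc's puts so that it can be called from C3.
extern fn int puts(char* message);

fn void main()
{
    puts("Hello from C3");   // string literals can convert to char*
}
```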
If you want to use a different identifier inside of your C3 code compared to
the function or variable's external name, use the @cname attribute:
extern fn int _puts(char* message) @cname("puts");
...
_puts("Hello world"); // <- calls the puts function in libc
New macro system¶
The old C macro system is replaced by a new C3 macro system.
Read more about semantic macros.
Identifiers¶
Naming standards¶
Name standards are strictly enforced, to simplify the C3 grammar:
// Starting with uppercase and followed somewhere by at least
// one lower case is a user defined type:
Foo x;
M____y y;
// Starting with lowercase is a variable or a function or a member name:
x.myval = 1;
int z = 123;
fn void fooBar(int x) { ... }
// Only upper case is a constant or an enum value:
const int FOOBAR = 123;
enum Test
{
STATE_A,
STATE_B
}
Variable Declaration¶
Multiple declarations are restricted¶
Multiple declaration with initialization isn't allowed in C3:
In conditionals, a special form of multiple declarations is allowed but each must then provide its type:
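Both rules can be sketched in one hypothetical function. The function name and values are made up for illustration, and the rejected line is left as a comment:

```c3
fn int count_steps()
{
    // int i, j = 1;     // ERROR: multiple declaration with initialization
    int a = 1;           // Ok: a single declaration with an initializer
    int b, c;            // Ok: no initializers, both are zero

    int steps = 0;
    // In the for-statement each declaration states its own type.
    for (int lo = 0, int hi = 10; lo < hi; lo++, hi--)
    {
        steps++;
    }
    return steps + a + b + c;
}

fn void main()
{
    assert(count_steps() == 6);   // 5 loop iterations + a (1) + b (0) + c (0)
}
```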
Zero initialization by default¶
In C global variables are implicitly zeroed out, but local variables aren’t. In C3 both global and local variables are zeroed out by default, but may be explicitly undefined (using the @noinit attribute) if you wish to match the C behaviour.
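A short sketch of the difference, assuming @noinit is placed after the variable name; the function demo is hypothetical:

```c3
fn int demo()
{
    int zeroed;            // implicitly initialized to 0
    int scratch @noinit;   // left uninitialized, matching C behaviour
    scratch = 5;           // so it must be written before it is read
    return zeroed + scratch;
}

fn void main()
{
    assert(demo() == 5);
}
```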
Removal of the const type qualifier¶
The const qualifier is only retained for actual constant variables. C3 uses a special type of post condition for functions to indicate that they do not alter input parameters.
<*
This function ensures that foo is not changed in the function.
@param [in] foo
@param [out] bar
*>
fn void test(Foo* foo, Bar* bar)
{
bar.y = foo.x;
// foo.x = foo.x + 1 - compile time error, can't write to 'in' param.
// int x = bar.y - compile time error, can't read from an 'out' param.
}
Expressions¶
Bit operator precedence changed¶
Notably bit operations have higher precedence than +/- and comparison operators, making code like this: a & b == c evaluate like (a & b) == c instead of C's a & (b == c). The elvis operator, ?:, also binds tighter than ternary. See the page about precedence rules.
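The precedence change can be illustrated with a small, hypothetical helper:

```c3
fn bool masked_equals(int a, int b, int c)
{
    // Parsed as (a & b) == c in C3.
    return a & b == c;
}

fn void main()
{
    // (6 & 3) == 2 holds. C would instead compute 6 & (3 == 2), which is 0.
    assert(masked_equals(6, 3, 2));
    assert(!masked_equals(6, 3, 3));
}
```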
0-prefix octal syntax removed¶
The old 0777 octal syntax present in C has been removed and replaced by a 0o prefix in C3, e.g. 0o777. Strings in C3 do not support octal sequences aside from '\0'.
Member access using . even for pointers¶
The -> operator is removed, access uses dot for both direct and pointer access. Note that this is just single access: to access a pointer of a pointer (e.g. int**) an explicit dereference would be needed.
In the special case of needing to dereference and index into an array, use .[] syntax:
int[3] a;
int[3]* b = &a; // Different from C!
// b[1] = 3; ERROR: expected an int[3] but got an int.
(*b)[1] = 3; // Works
b.[1] = 3; // Same as the above
This situation does not arise in C, due to pointer decay.
Signed overflow is well-defined¶
Signed integer overflow always wraps using 2s complement. It is never undefined behaviour.
Restrictions in implicit conversion rules¶
C3 does not permit implicit narrowing. Implicit widening is only allowed when there is only a single way to widen an expression.
Take the case of long x = int_val_1 + int_val_2. In C this would widen the result of the addition:
long x = (long)(int_val_1 + int_val_2), but there is another possible
way to widen: long x = (long)int_val_1 + (long)int_val_2. So, in this case, the widening is disallowed in C3. However, long x = int_val_1 is unambiguous, so C3 permits it just like C (read more on the conversion page).
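A sketch of the rule, with the rejected form left as a comment. The function name is hypothetical:

```c3
fn long widened_sum(int a, int b)
{
    long first = a;            // Ok: unambiguous widening of a single value
    // long bad = a + b;       // Rejected: there are two ways to widen
    return first + (long)b;    // Widen explicitly before adding
}

fn void main()
{
    // The sum is computed in 64 bits, so it does not wrap.
    assert(widened_sum(2147483647, 1) == 2147483648);
}
```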
Evaluation order is well-defined¶
Evaluation order (after precedence, meaning when operators have equal precedence, a.k.a. associativity) is left-to-right. In assignment expressions, assignment happens after expression evaluation.
int a = foo() + bar(); // Always evaluates foo() before bar()
*(baz()) = foo(); // foo() evaluates before baz()
Types¶
Struct, Enum And Union Declarations¶
Don't add a ; after enum, struct and union declarations, and note the slightly
different syntax for declaring a named struct inside of a struct.
Also, user-defined types are used without a struct, union or enum keyword, as
if the name was a C typedef.
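The differences can be sketched as follows, with the C declaration shown in a comment. Foo and bar are placeholder names:

```c3
// C:
//   typedef struct { int a; struct { double x; } bar; } Foo;

// C3: no trailing ';', and the name of the nested struct follows `struct`.
struct Foo
{
    int a;
    struct bar
    {
        double x;
    }
}

fn void main()
{
    Foo f;            // used without a `struct` keyword
    f.bar.x = 1.5;
    assert(f.a == 0); // zero initialized by default
    assert(f.bar.x == 1.5);
}
```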
Arrays¶
Array sizes are written next to the type, and arrays do not decay to pointers, you need to do it manually:
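A minimal sketch of the manual conversion, assuming that a pointer to an array converts implicitly to a pointer to its element type. The C version is shown in a comment and the function name is hypothetical:

```c3
fn int second_via_pointer()
{
    // C:  int x[2] = { 1, 2 };  int *y = x;
    int[2] x = { 1, 2 };
    int* y = &x;      // take the address explicitly, there is no decay
    return y[1];
}

fn void main()
{
    assert(second_via_pointer() == 2);
}
```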
You will probably prefer slices to pointers when passing data around:
// C
int x[100] = ...;
int y[30] = ...;
int z[15] = ...;
sort_my_array(x, 100);
sort_my_array(y, 30);
// Sort part of the array!
sort_my_array(z + 1, 10);
// C3
int[100] x = {};
int[30] y = {};
int[15] z = {};
sort_my_array(&x); // Implicit conversion from int[100]* -> int[]
sort_my_array(&y); // Implicit conversion from int[30]* -> int[]
sort_my_array(z[1..10]); // Inclusive ranges!
Note that declaring an array of inferred size will look different in C3:
Arrays are trivially copyable:
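Both points can be sketched in one hypothetical function, with the C form of the inferred-size declaration shown in a comment:

```c3
fn int copy_demo()
{
    // C:  int x[] = { 1, 2, 3 };
    int[*] x = { 1, 2, 3 };   // size inferred from the initializer: int[3]
    int[3] y = x;             // copies all three elements
    y[0] = 100;
    return x[0] + y[0];       // x is unaffected by the write to y
}

fn void main()
{
    assert(copy_demo() == 101);
}
```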
Find out more about arrays.
C's typedef and #define become alias¶
C's typedef is replaced by alias:
alias also allows you to do things that C uses #define for:
// C
#define println puts
#define my_excellent_string my_string
char *my_string = "Party on";
...
println(my_excellent_string);
// C3
alias println = puts;
alias my_excellent_string = my_string;
char* my_string = "Party on";
...
println(my_excellent_string);
Find out more about alias.
typedef creates new types¶
typedef in C3 creates a new type with its own methods, and the original type cannot convert to this new type without an explicit cast.
typedef MyId = int;
fn void get_by_id(MyId id)
{
return;
}
fn void test()
{
MyId valid = (MyId)7; // explicit cast to the new type
int invalid = 7;
get_by_id(valid); // allowed
// get_by_id(invalid); // ERROR: an int cannot implicitly convert to MyId
}
Changes To enum and introducing constdef¶
C3 enums give new features, such as returning the name of the enum value at runtime. Their underlying representation always starts at 0 without gaps. For C enums with gaps, C3 uses constdef instead:
Read more about enums here.
Bitfields Are Replaced By Explicit Bitstructs¶
A bitstruct has an explicit container type, and each field has an exact bit range.
bitstruct Foo : short
{
int a : 0..2; // Exact bit ranges, bits 0-2
uint b : 3..6;
MyEnum c : 7..13;
}
There exists a simplified form for a bitstruct containing only booleans. It is the same except that the ranges are left out:
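A sketch of the boolean-only form. The type and field names are made up for illustration:

```c3
bitstruct Flags : char
{
    bool has_hyperdrive;       // one bit each, no explicit ranges
    bool has_tractorbeam;
    bool has_plasmatorpedoes;
}

fn void main()
{
    Flags f;
    f.has_tractorbeam = true;
    assert(!f.has_hyperdrive);
    assert(f.has_tractorbeam);
    assert(Flags.sizeof == 1);   // stored in the char container
}
```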
For more information see the page on bitstructs.
Fixed size basic types¶
Several C types that would be variable sized are fixed size, and others changed names:
// C3
short a; // Guaranteed 16 bits
int b; // Guaranteed 32 bits
long c; // Guaranteed 64 bits
ulong d; // Guaranteed 64 bits
int128 e; // Guaranteed 128 bits
uint128 f; // Guaranteed 128 bits
usz g; // Same as C size_t, depends on target
sz h; // Same as C ptrdiff_t
iptr i; // Same as intptr_t depends on target
uptr j; // Same as uintptr_t depends on target
Find out more about types.
Type Qualifiers¶
Qualifiers like const and volatile are removed, but const before a constant
will make it treated as a compile time constant. The constant does not need to be typed.
const A = false;
// Compile time
$if A:
// This will not be compiled
$else
// This will be compiled
$endif
volatile is replaced by macros for volatile load and store.
Modules¶
Modules And Import Instead Of #include¶
Declaring the module name is not mandatory, but if you leave it out the file name will be used as the module name. Imports are recursive.
module otherlib::foo;
fn void test() { ... }
struct FooStruct { ... }
module mylib::bar;
import otherlib;
fn void myCheck()
{
foo::test(); // foo prefix is mandatory.
otherlib::foo::test(); // This also works;
FooStruct x; // But user defined types don't need the prefix.
otherlib::foo::FooStruct y; // But it is allowed.
}
No mandatory header files¶
There is a C3 interchange header format for declaring interfaces of libraries, but it is only used in special cases.
Comments¶
The /* */ comments can be nested.
Note that doc contracts starting with <* and ending with *>, have special rules for parsing them, and are not considered a regular comment. Find out more about contracts.
C3 also treats #! on the first line as a line comment //.
Statements¶
goto Removed¶
goto is removed, but there is labelled break and continue as well as defer
to handle the cases when it is commonly used in C.
// C
Foo *foo = malloc(sizeof(Foo));
if (tryFoo(foo)) goto FAIL;
if (modifyFoo(foo)) goto FAIL;
free(foo);
return true;
FAIL:
free(foo);
return false;
// C3, direct translation:
Foo* foo = malloc(Foo.sizeof); // declared outside the block so it can be freed after it
do FAIL:
{
if (tryFoo(foo)) break FAIL;
if (modifyFoo(foo)) break FAIL;
free(foo);
return true;
};
free(foo);
return false;
// C3, using defer:
Foo* foo = malloc(Foo.sizeof);
defer free(foo);
if (tryFoo(foo)) return false;
if (modifyFoo(foo)) return false;
return true;
Changes To switch¶
- case statements automatically break.
- Use nextcase to fallthrough to the next statement.
- Empty case statements have implicit fallthrough.
Implicit break in switches¶
Empty case statements have implicit fall through in C3, otherwise the nextcase statement is needed. nextcase can also be used to jump to any other case statement in the switch.
For example:
We can jump to an arbitrary switch-case label in C3:
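The rules above can be sketched in one hypothetical function. The name classify and the values are arbitrary, chosen so that each path gives a distinct result:

```c3
fn int classify(int h)
{
    int result = 0;
    switch (h)
    {
        case 1:
            result = 10;      // implicit break, no fallthrough
        case 2:               // empty case: falls through to case 3
        case 3:
            result = 20;
        case 4:
            result += 1;
            nextcase;         // explicit fallthrough into case 5
        case 5:
            result += 100;
        case 6:
            nextcase 1;       // jump to an arbitrary case, here case 1
        default:
            result = -1;
    }
    return result;
}

fn void main()
{
    assert(classify(1) == 10);
    assert(classify(2) == 20);
    assert(classify(4) == 101);
    assert(classify(6) == 10);
    assert(classify(7) == -1);
}
```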
Undefined Behaviour¶
C3 has less undefined behaviour, in particular integers are defined as using 2s complement and signed overflow is wrapping. Find out more about undefined behaviour.
Other Changes¶
The following things are enhancements to C, that don't have an equivalent in C.
- Defer
- Methods
- Optionals
- Generic modules
- Contracts
- Compile time evaluation
- Reflection
- Operator overloading
- Macro methods
- Static initialize and finalize functions
- Dynamic interfaces
For the full list of all new features see the feature list.
Finally, the FAQ answers many questions you might have as you start out.
Language Fundamentals
Basic Types
C3 provides a similar set of fundamental data types as C: integers, floats, arrays and pointers. It
expands on this set by adding slices and vectors, as well as the any and typeid types for advanced use.
Integers¶
C3 has signed and unsigned integer types. The built-in signed integer types are ichar, short, int, long,
int128, iptr and isz. ichar to int128 all have well-defined power-of-two bit sizes, whereas iptr
has the same number of bits as a void* and isz has the same number of bits as the maximum difference
between two pointers. For each signed integer type there is a corresponding unsigned integer type: char,
ushort, uint, ulong, uint128, uptr and usz.
| type | signed? | min | max | bits |
|---|---|---|---|---|
| ichar | yes | -128 | 127 | 8 |
| short | yes | -32768 | 32767 | 16 |
| int | yes | -2^31 | 2^31 - 1 | 32 |
| long | yes | -2^63 | 2^63 - 1 | 64 |
| int128 | yes | -2^127 | 2^127 - 1 | 128 |
| iptr | yes | varies | varies | varies |
| isz | yes | varies | varies | varies |
| char | no | 0 | 255 | 8 |
| ushort | no | 0 | 65535 | 16 |
| uint | no | 0 | 2^32 - 1 | 32 |
| ulong | no | 0 | 2^64 - 1 | 64 |
| uint128 | no | 0 | 2^128 - 1 | 128 |
| uptr | no | 0 | varies | varies |
| usz | no | 0 | varies | varies |
On 64-bit machines iptr/uptr and isz/usz are usually 64-bits, like long/ulong.
On 32-bit machines on the other hand they are generally int/uint.
Note: from 0.8.0 and onward, isz is renamed sz and is used as the default.
Integer constants¶
Numeric constants typically use decimal, e.g. 234, but may also use hexadecimal (base 16) numbers by prefixing
the number with 0x or 0X, e.g. int a = 0x42edaa02;. There is also octal (base 8) using the
0o or 0O prefix, and 0b for binary (base 2) numbers:
Numbers may also insert underscore _ between digits to improve readability, e.g. 1_000_000.
For decimal numbers, the value is assumed to be a signed int, unless the number doesn't fit in an
int, in which case it is assumed to be the smallest signed type it does fit in (long or int128).
For hexadecimal, octal and binary, the type is assumed to be unsigned.
An integer literal can implicitly convert to a floating point literal, or an integer of a different type provided the number fits in the type.
Constant suffixes¶
If you want to ensure that a constant is of a certain type, you can either add an explicit cast
like: (ushort)345, or use an integer suffix: 345u16.
The following integer suffixes are available:
| suffix | type |
|---|---|
| i8 | ichar |
| i16 | short |
| i32 | int |
| i64 | long |
| i128 | int128 |
| u8 | char |
| u16 | ushort |
| u32 | uint |
| u | uint |
| u64 | ulong |
| u128 | uint128 |
Note how uint also has the u suffix.
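The literal forms and suffixes described above can be tried out with a short sketch. The function name and values are arbitrary:

```c3
fn int literal_sum()
{
    int hex = 0x10;        // 16
    int octal = 0o770;     // 504
    int binary = 0b1010;   // 10
    int grouped = 1_000;   // underscores are ignored
    return hex + octal + binary + grouped;
}

fn void main()
{
    assert(literal_sum() == 1530);

    ushort small = 345u16;                 // the suffix fixes the type
    assert(small == 345);
    assert($typeof(345u16).sizeof == 2);   // ushort is 16 bits
    assert($typeof(345i64).sizeof == 8);   // long is 64 bits
}
```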
Booleans¶
A bool will be either true or false. Although a bool is only a single bit of data,
it should be noted that it is stored in a byte.
Character literals¶
A character literal is a value enclosed in ''. Its value is interpreted as being its
ASCII value for a single character.
It is also possible to use 2, 4 or 8 character wide character literals. Such are interpreted
as ushort, uint and ulong respectively and are laid out in memory from left to right.
This means that the actual value depends on the endianness
of the target.
- 2 character literals, e.g. 'C3', would convert to a ushort.
- 4 character literals, e.g. 'TEST', converts to a uint.
- 8 character literals, e.g. 'FOOBAR11' converts to a ulong.
The 4 character literals correspond to the layout of FourCC
codes. It will also correctly arrange unicode characters in memory. E.g. Char32 smiley = '\U0001F603'
Floating point types¶
As is common, C3 has two floating point types: float and double. float is the 32 bit floating
point type and double is 64 bits.
Floating point constants¶
Floating point constants will at least use 64 bit precision.
Just like for integer constants, it is possible to use _ to improve
readability, but it may not occur immediately before or after a dot or an exponential.
C3 supports floating point values either written in decimal or hexadecimal formats.
For decimal, the exponential symbol is e (or E, both are acceptable),
for hexadecimal p (or P) is used: -2.22e-21 -0x21.93p-10
While floating point numbers default to double it is possible to type a
floating point by adding a suffix:
| Suffix | type |
|---|---|
| f32 or f | float |
| f64 | double |
Arrays¶
Arrays have the format Type[size], so for example: int[4]. An array is a type consisting
of the same element repeated a number of times. Our int[4] is essentially four int values
packed together.
For initialization it's sometimes convenient to use the wildcard Type[*] declaration, which
infers the length from the number of elements:
Slices¶
Slices have the format Type[]. Unlike the array, a slice does not hold the values themselves
but instead presents a view of some underlying array or vector.
Slices have two properties: .ptr, which retrieves the array it points to, and .len which
is the length of the slice - that is, the number of elements it is possible to index into.
Usually we can get a slice by taking the address of an array:
Because indexing into slices is range checked in safe mode, slices are much safer than passing a pointer and a length separately.
The internal representation of a slice is a two element struct:
This definition can be found in the module std::core::runtime.
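A minimal sketch of arrays and slices working together. The function sum and the variable names are hypothetical:

```c3
// Works for any length, since a slice carries its own .len.
fn int sum(int[] values)
{
    int total = 0;
    foreach (v : values)
    {
        total += v;
    }
    return total;
}

fn void main()
{
    int[4] arr = { 1, 2, 3, 4 };
    int[] whole = &arr;          // slice over the entire array
    int[] middle = arr[1..2];    // inclusive range: the elements 2 and 3
    assert(whole.len == 4);
    assert(middle.len == 2);
    assert(sum(whole) == 10);
    assert(sum(middle) == 5);
}
```

Passing a slice keeps the length attached to the data, so the callee needs no separate length parameter.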
Vectors¶
Vectors, similar to arrays, use the format
Type[<size>], with the restriction that vectors may only form out
of integers, floats and booleans. Similar to arrays, wildcard can be
used to infer the size of a vector:
Vectors are based on hardware SIMD vectors, and support many different operations that work on all elements in parallel, including arithmetics:
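As a sketch of element-wise arithmetic, where the helper scale is a hypothetical name:

```c3
fn int[<2>] scale(int[<2>] v, int[<2>] factor)
{
    return v * factor;   // multiplies element by element
}

fn void main()
{
    int[<2>] a = { 23, 11 };
    int[<2>] b = { 2, 1 };
    int[<2>] c = scale(a, b);
    assert(c[0] == 46);
    assert(c[1] == 11);
}
```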
Vector initialization and literals work the same way as arrays, using { ... }, however, it's also possible to use
swizzling arguments to designated initialization:
String literals¶
String literals are special and can convert to several different types:
String, char and ichar arrays and slices and finally ichar* and char*.
String literals are text enclosed in " " just like in C. These support
escape sequences like \n for line break and need to use \" for any " inside of the
string.
C3 also offers raw strings which are enclosed in ` `.
A raw string may span multiple lines.
Inside of a raw string, no escapes are available, and to write a `, simply double the character:
// Note: String is a typedef inline char[]
String three_lines =
`multi
line
string`;
String foo = `C:\foo\bar.dll`;
String bar = `"Say ``hello``"`;
// Same as
String foo = "C:\\foo\\bar.dll";
String bar = "\"Say `hello`\"";
String is a typedef inline char[], which can implicitly convert to char[] when required.
ZString is a typedef inline char*. ZString is a C compatible null terminated string, which can implicitly convert to char* when required.
Base64 and hex data literals¶
Base64 literals are strings prefixed with b64 containing
Base64 encoded data, which
is converted into a char array at compile time:
// The array below contains the characters "Hello World!"
char[*] hello_world_base64 = b64"SGVsbG8gV29ybGQh";
The corresponding hex data literals convert a hexadecimal string rather than Base64:
// The array below contains the characters "Hello World!"
char[*] hello_world_hex = x"4865 6c6c 6f20 776f 726c 6421";
Pointer types¶
Pointers have the syntax Type*. A pointer is a memory address where one or possibly more
elements of the underlying type are stored. Pointers can be stacked: Foo* is a pointer to a Foo
while Foo** is a pointer to a pointer to Foo.
The pointer type has a special literal called null, which is an invalid, empty pointer.
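A minimal sketch of stacked pointers and null, with variable names chosen only for illustration:

```c3
import std::io;

fn void main()
{
    int value = 10;
    int* ptr = &value;        // pointer to an int
    int** ptr_to_ptr = &ptr;  // pointer to a pointer to an int
    **ptr_to_ptr = 20;        // writes through both levels
    io::printfn("%d", value); // prints 20

    int* nothing = null;      // null marks a pointer that points nowhere
    if (nothing == null) io::printn("nothing is null");
}
```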
void*¶
The void* type is a special pointer which implicitly converts to any other pointer. It is not "a pointer to void",
but rather a wildcard pointer which matches any other pointer.
Printing values¶
Printing values can be done using io::print, io::printn, io::printf and io::printfn. This requires
importing the module std::io.
Note
The n variants of the print functions will add a newline after printing, which is what we'll often
use in the examples, but print and printf work the same way.
import std::io; // Get the io functions.
fn void main()
{
int a = 1234;
ulong b = 0xFFAABBCCDDEEFF;
double d = 13.03e-04;
char[*] hex = x"4865 6c6c 6f20 776f 726c 6421";
io::printn(a);
io::printn(b);
io::printn(d);
io::printn(hex);
}
If you run this program you will get:
To get more control we can format the output using printf and printfn:
import std::io;
fn void main()
{
int a = 1234;
ulong b = 0xFFAABBCCDDEEFF;
double d = 13.03e-04;
char[*] hex = x"4865 6c6c 6f20 776f 726c 6421";
io::printfn("a was: %d", a);
io::printfn("b in hex was: %x", b);
io::printfn("d in scientific notation was: %e", d);
io::printfn("Bytes as string: %s", (String)&hex);
}
We can apply the standard printf formatting rules, but
unlike in C/C++ there is no need to indicate the type when using %d: it will print unsigned and
signed integers up to int128. In fact, there is no support for %u, %lld etc. in io::printf. Furthermore,
%s works not just on strings but on any type:
import std::io;
enum Foo
{
ABC,
BCD,
EFG,
}
fn void main()
{
int a = 1234;
uint128 b = 0xFFEEDDCC_BBAA9988_77665544_33221100;
Foo foo = BCD;
io::printfn("a: %s, b: %d, foo: %s", a, b, foo);
}
This prints:
Variables
Zero init by default¶
Unlike C, C3 local variables are zero-initialized by default. To avoid zero initialization, you need to explicitly opt-out.
int x; // x = 0
int y @noinit; // y is explicitly undefined and must be assigned before use.
AStruct foo; // foo is implicitly zeroed
AStruct bar = {}; // bar is explicitly zeroed
AStruct baz @noinit; // baz is explicitly undefined
Using a variable that is explicitly undefined before assignment will trap or be initialized to a specific value when compiling "safe" and is undefined behaviour in "fast" builds.
Functions
C3 has both regular functions and methods. Methods are functions namespaced using type names, and allow invocation using the dot syntax.
Regular functions¶
Regular functions are the same as C aside from the keyword fn, which is followed by the conventional C declaration of <return type> <name>(<parameter list>).
Function arguments¶
C3 allows the use of default arguments as well as named arguments. Note that any unnamed arguments must appear before any named arguments.
fn int test_with_default(int foo = 1)
{
return foo;
}
fn void test()
{
test_with_default();
test_with_default(100);
}
Named arguments
fn void test_named(int times, double data)
{
for (int i = 0; i < times; i++)
{
io::printf("Hello %f\n", i + data);
}
}
fn void test()
{
// Named only
test_named(times: 1, data: 3.0);
// Unnamed only
test_named(3, 4.0);
// Mixing named and unnamed
test_named(15, data: 3.141592);
}
Named arguments with defaults:
fn void test_named_default(int times = 1, double data = 3.0, bool dummy = false)
{
for (int i = 0; i < times; i++)
{
io::printfn("Hello %f", i + data);
}
}
fn void test()
{
// Named only
test_named_default(times: 10, data: 3.5);
// Unnamed and named
test_named_default(3, dummy: false);
// Overwriting an unnamed argument with a named argument is an error:
// test_named_default(2, times: 3); ERROR!
// Unnamed may not follow named arguments.
// test_named_default(times: 3, 4.0); ERROR!
}
Vaargs¶
There are four types of vaargs:
- single typed
- explicitly typed any: pass non-any arguments as references
- implicitly typed any: arguments are implicitly converted to references (use with care)
- untyped C-style
Examples:
fn void va_singletyped(int... args)
{
/* args has type int[] */
}
fn void va_variants_explicit(any... args)
{
/* args has type any[] */
}
fn void va_variants_implicit(args...)
{
/* args has type any[] */
}
extern fn void va_untyped(...); // only used for extern C functions
fn void test()
{
va_singletyped(1, 2, 3);
int x = 1;
any v = &x;
va_variants_explicit(&&1, &x, v); // pass references for non-any arguments
va_variants_implicit(1, x, "foo"); // arguments are implicitly converted to anys
va_untyped(1, x, "foo"); // extern C-function
}
For typed vaargs, we can pass a slice instead of the individual arguments by using the splat ... operator, for example va_singletyped(...x) where x is an int[].
Splat¶
- Splat ... unknown size slice ONLY in a typed vaarg slot.
fn void va_singletyped(int... args) {
io::printfn("%s", args);
}
fn void main()
{
int[2] arr = {1, 2};
va_singletyped(...arr); // arr is splatting two arguments
}
- Splat ... any array anywhere
fn void foo(int a, int b, int c)
{
io::printfn("%s, %s, %s", a, b, c);
}
fn void main()
{
int[2] arr = {1, 2};
foo(...arr, 7); // arr is splatting two arguments
}
- Splat ... known size slices anywhere
fn void foo(int a, int b, int c)
{
io::printfn("%s, %s, %s", a, b, c);
}
fn void main()
{
int[5] arr = {1, 2, 3, 4, 5};
foo(...arr[:3]); // slice is splatting three arguments
}
Named arguments and vaargs¶
Usually, a parameter after vaargs would never be assigned to:
fn void testme(int a, double... x, double rate = 1.0) { /* ... */ }
fn void test()
{
// x is { 2.0, 5.0, 6.0 } rate would be 1.0
testme(3, 2.0, 5.0, 6.0);
}
However, named arguments can be used to set this value:
fn void testme(int a, double... x, double rate = 1.0) { /* ... */ }
fn void test()
{
// x is { 2.0, 5.0 } rate would be 6.0
testme(3, 2.0, 5.0, rate: 6.0);
}
Functions and Optional returns¶
Function return values may be Optionals – denoted by <type>? indicating that this
function might either return an Optional with a result, or an Optional with an Excuse.
For example, this function might return BAD_JOSS_ERROR or BAD_LUCK_ERROR if it fails to produce a valid value.
faultdef BAD_LUCK_ERROR, BAD_JOSS_ERROR;
fn double? test_error()
{
double val = random_value();
if (val > 0.5) return BAD_LUCK_ERROR~;
if (val >= 0.2) return BAD_JOSS_ERROR~;
return val;
}
A function call which is passed one or more Optional arguments will only execute if all Optional values contain a result, otherwise the first Excuse found is returned.
fn void test()
{
// The following line either prints a value less than 0.2
// or does not print at all. The (void) is needed
// to let the compiler know we're deliberately
// ignoring the Optional result.
(void)io::printfn("%f", test_error());
// ?? sets a default value if an Excuse is found
double x = (test_error() + test_error()) ?? 100;
// This prints either a value less than 0.4 or 100:
io::printfn("%f", x);
}
This allows us to chain functions:
fn void print_input_with_explicit_checks()
{
String? line = io::treadline();
if (try line)
{
// line is a regular "string" here.
int? val = line.to_int();
if (try val)
{
io::printfn("You typed the number %d", val);
return;
}
}
io::printn("You didn't type an integer :/ ");
}
fn void print_input_with_chaining()
{
if (try int val = io::treadline().to_int())
{
io::printfn("You typed the number %d", val);
return;
}
io::printn("You didn't type an integer :/ ");
}
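The rule that a call only executes when all Optional arguments contain a result can be tried out in isolation. The sketch below is a self-contained illustration; the fault NEGATIVE_INPUT and the functions non_negative and twice are hypothetical names, and it follows the ~ suffix used in the examples above.

```c3
import std::io;

faultdef NEGATIVE_INPUT;

// Returns x, or the Excuse NEGATIVE_INPUT when x is below zero.
fn double? non_negative(double x)
{
    if (x < 0) return NEGATIVE_INPUT~;
    return x;
}

fn double twice(double x) => x * 2;

fn void main()
{
    // twice() only runs when its argument holds a result.
    double a = twice(non_negative(4.0)) ?? -1.0;  // a is 8.0
    double b = twice(non_negative(-4.0)) ?? -1.0; // b is -1.0, twice() was skipped
    io::printfn("%s %s", a, b);

    // The Excuse itself can be inspected with catch.
    if (catch excuse = non_negative(-1.0))
    {
        io::printfn("Failed with %s", excuse);
    }
}
```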
Methods¶
Methods look exactly like functions, but are prefixed with a type name and are (usually) invoked using dot syntax, on an instance of the type.
struct Point
{
int x;
int y;
}
fn void Point.add(Point* p, int x)
{
p.x += x;
}
fn void example()
{
Point p = { 1, 2 };
// with struct-functions
p.add(10);
// Also callable as:
Point.add(&p, 10);
}
The target object may be passed by value or by pointer:
enum State
{
STOPPED,
RUNNING
}
fn bool State.may_open(State state)
{
switch (state)
{
case STOPPED: return true;
case RUNNING: return false;
}
}
You can add methods to all runtime types, including built-in types:
fn int int.add(int i, int other)
{
return i + other;
}
fn void test()
{
int i = 3;
int j = i.add(4);
}
Implicit first parameters¶
Because the type of the first argument is known, it may be left out. To indicate a non-null pointer, & is used.
fn int Foo.test(&self) { /* ... */ }
// (almost) equivalent to
fn int Foo.test(Foo* self) { /* ... */ }
fn int Bar.test(self) { /* ... */ }
// equivalent to
fn int Bar.test(Bar self) { /* ... */ }
This means that in order to express a nullable first parameter, one must use the explicit form (e.g. Foo* self) rather than the untyped &self form.
It is customary to use self as the name of the first parameter, but it is not required.
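As a rough sketch of both implicit forms, the example below defines one method taking &self and one taking self by value. Counter and its methods are hypothetical names used only for illustration.

```c3
import std::io;

struct Counter
{
    int count;
}

// &self stands for a non-null Counter* first parameter.
fn void Counter.increment(&self)
{
    self.count++;
}

// self by value receives a copy of the Counter.
fn int Counter.doubled(self) => self.count * 2;

fn void main()
{
    Counter c = { 3 };
    c.increment();
    io::printfn("%d %d", c.count, c.doubled()); // prints 4 8
}
```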
Restrictions on methods¶
- Methods on a struct/union may not have the same name as a member.
- Methods on enums may not have the same name as an associated value.
- When taking a function pointer of a method, use the full name.
- Using subtypes, overlapping function names will be shadowed.
Guidelines on method use¶
Methods are customarily associated with Object-Oriented programming.
In this style one will often encounter code like some_object.run_everything().
C3 is not accommodating to this style, instead one should prefer task::run_everything(some_object).
Both the standard library and the design of the language instead follow
the principle that functions are used whenever the system is mutating
global data, whereas methods are used for mutating a particular value, or
extracting data from it. foo.add(bar), foo.to_list() and foo.push(x)
are all good uses of methods. On the flip side, methods usage like
context.parse_data(data), game.run(settings) and url.make_request()
are emphatically not recommended.
Contracts¶
C3's error handling is not intended to use errors to signal invalid data or to check invariants and post conditions. Instead C3's approach is to add annotations to the function, that conditionally will be compiled into asserts.
As an example, the following code:
<*
@param foo : `the number of foos`
@require foo > 0, foo < 1000
@return `number of foos x 10`
@ensure return < 10000, return > 0
*>
fn int test_foo(int foo)
{
return foo * 10;
}
Will in debug builds be compiled into something like this:
fn int test_foo(int foo)
{
assert(foo > 0);
assert(foo < 1000);
int _return = foo * 10;
assert(_return < 10000);
assert(_return > 0);
return _return;
}
The compiler is allowed to use the contracts for optimizations. For example this:
fn int test_example(int bar)
{
// The following is always false due to the `@ensure`
if (test_foo(bar) == 0) return -1;
return 1;
}
May be optimized to a function that simply returns 1.
In this case the compiler can look at the postcondition return > 0 to determine that test_foo(bar) == 0 must always be false.
Looking closely at this code, we note that nothing guarantees that bar is not violating the preconditions. In safe builds this will usually be checked at runtime, but a sufficiently smart compiler will warn about the lack of checks on bar. Execution of code violating pre and post conditions has unspecified behaviour.
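To experiment with contracts, a small self-contained program can help. The sketch below is not from the original text; safe_div and its contract are hypothetical and only meant to show where @require and @ensure go.

```c3
import std::io;

<*
 @require divisor != 0 : "divisor must be non-zero"
 @ensure return <= dividend
*>
fn uint safe_div(uint dividend, uint divisor)
{
    return dividend / divisor;
}

fn void main()
{
    io::printfn("%d", safe_div(10, 2)); // prints 5
    // safe_div(10, 0) would violate the @require above,
    // so the caller is responsible for never passing 0.
}
```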
Short function declaration syntax¶
For very short functions, C3 offers a "short declaration" syntax using =>:
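A minimal sketch comparing the regular and the short form; square and square_short are illustrative names:

```c3
import std::io;

// Regular declaration.
fn int square(int x)
{
    return x * x;
}

// Short declaration with the same behaviour.
fn int square_short(int x) => x * x;

fn void main()
{
    io::printfn("%d %d", square(4), square_short(4)); // prints 16 16
}
```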
Lambdas¶
It's possible to create anonymous functions using the regular fn syntax. Anonymous
functions are identical to regular functions and do not capture variables from the
surrounding scope:
alias IntTransform = fn int(int);
fn void apply(int[] arr, IntTransform t)
{
foreach (&i : arr) *i = t(*i);
}
fn void main()
{
int[] x = { 1, 2, 5 };
// Short syntax with inference:
apply(x, fn (i) => i * i);
// Regular syntax without inference:
// apply(x, fn int(int i) { return i * i; });
// Prints [1, 4, 25]
io::printfn("%s", x);
}
Static initializers and finalizers¶
It is sometimes useful to run code at startup and shutdown of a program.
Static initializers and finalizers are regular functions annotated with
@init and @finalizer that are run at startup and shutdown respectively.
(Note: this should not be confused with constructors and destructors
in object-oriented languages.)
fn void run_at_startup() @init
{
// Run at startup
some_function.init(512);
}
fn void run_at_shutdown() @finalizer
{
some_thing.shutdown();
}
Note that running @finalizer functions is a best effort attempt by the OS, and they may not
be called during abnormal shutdown.
Changing priority of static initializers and finalizers¶
It is possible to provide an argument to the attributes to set the actual priority. It is recommended that programs use a priority of 1024 or higher. The higher the value, the later it will be called. The lowest priority is 65535.
// Print "Hello World" at startup.
fn void start_world() @init(3000)
{
io::printn("World");
}
fn void start_hello() @init(2000)
{
io::print("Hello ");
}
Implementing parameter access constraints¶
<*
A read-only function
@param [in] value
*>
fn void read(int* value)
{
io::printf("%d",*value);
// (*value)++; <- Error: 'in' parameters may not be assigned to.
}
<*
A write-only function
@param [out] buffer
*>
fn void write(int* buffer)
{
(*buffer)++;
// int test = *buffer; <- Error: 'out' parameters may not be read.
}
See the contracts for more details.
Statements
Statements largely work like in C, but with some additions.
Labelled break and continue¶
Labelled break and continue let you break out of an outer scope. Labels can be put on if,
switch, while and do statements.
fn void test(int i)
{
if FOO: (i > 0)
{
while (1)
{
io::printfn("%d", i);
// Break out of the top if statement.
if (i++ > 10) break FOO;
}
}
}
Do-without-while¶
Do-while statements can skip the ending while. In that case it acts as if the while was while(0), so the body runs once and break leaves it early.
The function below prints World! if x is zero, otherwise it prints Hello World!.
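A minimal sketch of such a function, using the hypothetical name test, where break leaves the do block before "Hello " is printed:

```c3
import std::io;

fn void test(int x)
{
    do
    {
        // Leaving the block early skips the "Hello " part.
        if (x == 0) break;
        io::print("Hello ");
    };
    io::printn("World!");
}

fn void main()
{
    test(0); // prints "World!"
    test(1); // prints "Hello World!"
}
```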
Nextcase and labelled nextcase¶
The nextcase statement is used in switch and if-catch to jump to the next statement:
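As a small sketch of the plain form, the hypothetical function describe below uses nextcase without an argument to continue with the case that follows. Without it, a non-empty case does not fall through.

```c3
import std::io;

fn void describe(int i)
{
    switch (i)
    {
        case 1:
            io::printn("one");
            nextcase; // Continue with the case that follows (case 2).
        case 2:
            io::printn("two");
        default:
            io::printn("other");
    }
}

fn void main()
{
    describe(1); // prints "one" then "two"
    describe(2); // prints "two"
}
```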
It's also possible to use nextcase with an expression, to jump to an arbitrary case or between labeled switch statements:
switch MAIN: (enum_var)
{
case FOO:
switch (i)
{
case 1:
doSomething();
nextcase 3; // Jump to case 3
case 2:
doSomethingElse();
case 3:
nextcase rand(); // Jump to random case
default:
io::printn("Ended");
nextcase MAIN: BAR; // Jump to outer (MAIN) switch
}
case BAR:
io::printn("BAR");
default:
break;
}
This can be used as a structured goto when creating state machines.
Switch cases with runtime evaluation¶
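It's possible to use switch as an enhanced if-else chain: switch on true and give each case a boolean expression that is evaluated at runtime.
Such a switch is equivalent to writing the matching chain of if, else if and else statements, with default taking the role of the final else.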
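To illustrate the equivalence, the sketch below writes the same sign check once as a switch on true and once as an if-else chain. sign_switch and sign_if are names made up for this example.

```c3
import std::io;

// Returns -1, 1 or 0 using switch as an if-else chain.
fn int sign_switch(int x)
{
    int result;
    switch (true)
    {
        case x < 0:
            result = -1;
        case x > 0:
            result = 1;
        default:
            result = 0;
    }
    return result;
}

// The same logic written as a plain if-else chain.
fn int sign_if(int x)
{
    int result;
    if (x < 0)
    {
        result = -1;
    }
    else if (x > 0)
    {
        result = 1;
    }
    else
    {
        result = 0;
    }
    return result;
}

fn void main()
{
    io::printfn("%d %d", sign_switch(-5), sign_if(-5)); // prints -1 -1
    io::printfn("%d %d", sign_switch(7), sign_if(7));   // prints 1 1
}
```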
Note that because of this, the first match is always picked. Consider a switch whose first case is x > 0 and calls foo(), followed by a case x > 2 that calls bar().
Because of the evaluation order, only foo() will be invoked for x > 0, even when x is greater than 2.
It's also possible to omit the conditional after switch. In that case it is implicitly assumed to be the same as writing (true).
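The first-match behaviour and the omitted conditional can be combined in one small sketch. The function first_match and the bodies of foo and bar are assumptions made for illustration.

```c3
import std::io;

fn void foo() { io::printn("foo"); }
fn void bar() { io::printn("bar"); }

fn void first_match(int x)
{
    // No conditional after switch: treated like switch (true).
    switch
    {
        case x > 0:
            foo();
        case x > 2:
            bar(); // never reached, since x > 0 always matches first
    }
}

fn void main()
{
    first_match(1); // prints "foo"
    first_match(5); // also prints "foo", not "bar"
}
```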
Jumptable switches with @jump¶
Regular switch statements with only enum or integer cases may use the @jump
attribute. This attribute ensures that the switch is implemented as
a jump using a jumptable. In C this is possible to do manually using labels and
calculated gotos which are extensions available in GCC/Clang.
The behaviour of the switch itself does not change with a jumptable,
but some restrictions will apply. Typically used for situations
like bytecode interpreters, it might perform worse
or better than a regular switch depending on the situation.
nextcase statements will also use jumptable dispatch when
@jump is used.
Expressions
Temporary address¶
Expressions work like in C, with one exception: it is possible to take the address of a temporary. This uses the operator && rather than &.
Consequently, this is valid:
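A minimal sketch of taking the address of a temporary, with print_value as a hypothetical helper:

```c3
import std::io;

fn void print_value(int* x)
{
    io::printfn("%d", *x);
}

fn void main()
{
    // &&1 creates a temporary holding 1 and passes its address.
    print_value(&&1); // prints 1

    // Without &&, a named variable would be needed, as in C:
    int one = 1;
    print_value(&one); // prints 1
}
```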
A pointer created with && is only valid until the end of the
current function. In other words, you should never return the
pointer created by && from a function as it will never be safe
to use.
Well-defined evaluation order¶
Expressions have a well-defined evaluation order:
- Binary expressions are evaluated from left to right.
- Assignment occurs right to left, so a = a++ would result in a being unchanged.
- Call arguments are evaluated in parameter order.
Compound literals¶
C3 has C's compound literals:
Arrays follow the same syntax:
Note that when it's possible, inferring the type is allowed and preferred, so we have for the above examples:
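The sketch below shows explicitly typed compound literals next to the inferred form, using the (Type){ ... } syntax that appears in the slice examples of this section. Foo, sum_foo and sum_array are hypothetical names.

```c3
import std::io;

struct Foo
{
    int a;
    double b;
}

fn double sum_foo(Foo x) => (double)x.a + x.b;

fn int sum_array(int[3] x) => x[0] + x[1] + x[2];

fn void main()
{
    // Explicitly typed compound literals.
    io::printfn("%s", sum_foo((Foo){ 1, 2.0 }));
    io::printfn("%s", sum_array((int[3]){ 1, 2, 3 }));

    // The same calls with the type inferred from the parameter.
    io::printfn("%s", sum_foo({ 1, 2.0 }));
    io::printfn("%s", sum_array({ 1, 2, 3 }));
}
```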
One may take the address of temporaries, using && (rather than & for normal variables). This allows the following:
Passing a slice
fn void test(int[] y) { ... }
// Using &&
test(&&(int[3]){ 1, 2, 3 });
// Explicitly slicing:
test(((int[3]){ 1, 2, 3 })[..]);
// Using a slice directly as a temporary:
test((int[]){ 1, 2, 3 });
// Same as above but with inferred type:
test({ 1, 2, 3 });
Passing the pointer to an array
fn void test1(int[3]* z) { ... }
fn void test2(int* z) { ... }
test1(&&(int[3]){ 1, 2, 3 });
test2(&&(int[3]){ 1, 2, 3 });
Constant expressions¶
In C3 all constant expressions are guaranteed to be calculated at compile time. The following are considered constant expressions:
- The null literal.
- Boolean, floating point and integer literals.
- The result of arithmetic on constant expressions.
- Compile time variables (prefixed with $).
- Global constant variables with initializers that are constant expressions.
- The result of macros that do not generate code and only use constant expressions.
- The result of a cast if the value is cast to a boolean, floating point or integer type and the value that is converted is a constant expression.
- String literals.
- Initializer lists containing constant values.
Some things that are not constant expressions:
- Any pointer that isn't the null literal, even if it's derived from a constant expression.
- The result of a cast except for casts of constant expressions to a numeric type.
- Compound literals - even when values are constant expressions.
Including binary data¶
The $embed(...) function includes the contents of a file into the compilation as a
constant array of bytes:
The result of an embed works similarly to a string literal and may implicitly convert to a char*,
void*, char[], char[*] or String.
Limiting length¶
It's possible to limit the length of what is included using the optional second parameter.
Failure to load at compile time and defaults¶
Usually it's a compile time error if the file can't be included, but sometimes it's useful to only optionally include it. If this is desired, declare the left hand side an Optional:
my_image will be an optional io::FILE_NOT_FOUND~ if the image is missing.
This also allows us to pass a default value using ??:
Modules
C3 groups functions, types, variables and macros into namespaces called modules. When doing builds, any C3 file must start with the module keyword, specifying the module. When compiling single files, the module is not needed and the module name is assumed to be the file name, converted to lower case, with any invalid characters replaced by underscore (_).
A module can consist of multiple files, e.g.
file_a.c3
file_b.c3
file_c.c3
Here file_a.c3 and file_b.c3 belong to the same module, foo, while file_c.c3 belongs to bar.
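A rough sketch of such a layout is shown below. The file contents are assumptions made for illustration, and the three parts are shown in one listing with comments marking where each file would start. Since a single file may contain several module sections, the listing can also be compiled as one file.

```c3
// file_a.c3 (hypothetical contents)
module foo;
fn int double_it(int x) => x * 2;

// file_b.c3 (hypothetical contents), same module as file_a.c3
module foo;
fn int triple_it(int x) => x * 3;

// file_c.c3 (hypothetical contents), a different module
module bar;
import foo, std::io;

fn void main()
{
    // Functions from the imported module need the module prefix.
    io::printfn("%d %d", foo::double_it(2), foo::triple_it(2)); // prints 4 6
}
```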
Details¶
Some details about the C3 module system:
- Modules can be arbitrarily nested, e.g. module foo::bar::baz; to create the sub module baz in the sub module bar of the module foo.
- Module names must be alphanumeric lower case letters, and may contain an underscore _.
- Module names are limited to 31 characters.
- Modules may be spread across multiple files.
- A single file may have multiple module declarations.
- Each declaration of a distinct module is called a module section.
Importing Modules¶
Modules are imported using the import statement. Imports always recursively import sub-modules. Any module
will automatically import all other modules with the same parent module.
foo.c3
bar.c3
module bar;
import some;
// import some::foo; <- not needed, as it is a sub module to "some"
fn void test()
{
foo::test();
// some::foo::test() also works.
}
In some cases there may be ambiguities, in which case the full path can be used to resolve the ambiguity:
abc.c3
de.c3
test.c3
Implicit Imports¶
The module system will also implicitly import:
- The std::core module (and sub modules).
- Any other module sharing the same top module. E.g. the module foo::abc will implicitly also import modules foo and foo::cde if they exist.
Visibility¶
All files in the same module share the same global declaration namespace. By default a symbol is visible to all other modules.
To make a symbol only visible inside the module, use the @private attribute.
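A minimal sketch, assuming a module foo with a public init() and a private open(). The function bodies and the module app are made up for illustration.

```c3
module foo;
import std::io;

fn void init()
{
    open(); // Fine: open() is used inside its own module.
}

fn void open() @private
{
    io::printn("opened");
}

module app;
import foo;

fn void main()
{
    foo::init();
    // foo::open(); <- Error: open() is private to foo
}
```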
In this example, the other modules can use the init() function after importing foo, but only files in the foo module can use open(), as it is specified as private.
It's possible to further restrict visibility: @local works like @private except it's only visible in the
local context.
// File foo.c3
module foo;
fn void abc() @private { }
fn void de() @local { }
// File foo2.c3
module foo;
fn void test()
{
abc(); // Access of private in the same module is ok
// de(); <- Error: function is local to foo.c3
}
Overriding Symbol Visibility Rules¶
By using import <module> @public, it's possible to access another module's private symbols.
Many other module systems have hierarchical visibility rules, but the import @public feature allows
visibility to be manipulated in a more ad-hoc manner without imposing hard rules.
For example, you may provide a library with two modules: "mylib::net" and "mylib::file" - which both use functions
and types from a common "mylib::internals" module. The two modules use import mylib::internals @public
to access this module's private functions and types. To an external user of the library, the "mylib::internals"
module does not seem to exist, but inside of your library you use it as a shared dependency.
A simple example:
// File a.c3
module a;
fn void a_function() @private { ... }
// File b.c3
module b;
fn void b_function() @private { ... }
// File c.c3
module c;
import a;
import b @public;
fn void test()
{
// Error! a_function() is private
a::a_function();
// Allowed since `import b @public` makes the private
// symbols of `b` accessible in this context.
b::b_function();
}
Note: @local visibility cannot be overridden using a "@public" import.
Changing The Default Visibility¶
In a normal module, global declarations will be public by default. If some other
visibility is desired, it's possible to declare @private or @local after the module name.
It will affect all declarations in the same section.
module foo @private;
fn void ab_private() { ... } // Private
module foo;
fn void ab_public() { ... } // Public
module bar;
import foo;
fn void test()
{
foo::ab_public(); // Works
// foo::ab_private(); <- Error, private method
}
If the default visibility is @private or @local, using @public sets the visibility to public:
module foo @private;
fn void ab_private() { ... } // Private
fn void ab_public() @public { ... } // Public
Linker Visibility and Exports¶
A function or global prefixed extern will be assumed to be linked in later.
An "extern" function may not have a body, and global variables are prohibited
from having an init expression.
The attribute @export explicitly marks a function as being exported when
creating a (static or dynamic) library. It can also change the linker name of
the function.
Using Functions and Types From Other Modules¶
As a rule, functions, macros, constants, variables and types in the same module do not need any namespace prefix. For imported modules the following rules hold:
- Functions, macros, constants and variables require at least the (sub-) module name.
- Types do not require the module name unless the name is ambiguous.
- In case of ambiguity, only so many levels of module names are needed as to make the symbol unambiguous.
// File a.c3
module a;
struct Foo { ... }
struct Bar { ... }
struct TheAStruct { ... }
fn void a_function() { ... }
// File b.c3
module b;
struct Foo { ... }
struct Bar { ... }
struct TheBStruct { ... }
fn void b_function() { ... }
// File c.c3
module c;
import a, b;
struct TheCStruct { ... }
struct Bar { ... }
fn void c_function() { ... }
fn void test()
{
TheAStruct stA;
TheBStruct stB;
TheCStruct stC;
// Name required to avoid ambiguity;
b::Foo stBFoo;
// Will always pick the current module's
// name.
Bar bar;
// Namespace required:
a::a_function();
b::b_function();
// A local symbol does not require it:
c_function();
}
This means that the rule for the common case can be summarized as
Types are used without prefix; functions, variables, macros and constants are prefixed with the sub module name.
Module Sections¶
A single file may have multiple module declarations, even for the same module. This allows us to write for example:
// File foo.c3
module foo;
fn int hello_world()
{
return my_hello_world();
}
module foo @private;
import std::io; // The import is only visible in this section.
fn int my_hello_world() // @private by default
{
io::printn("Hello, world\n");
return 0;
}
module foo @test;
fn void test_hello() // @test by default
{
assert(hello_world() == 0);
}
Versioning and Dynamic Inclusion¶
NOTE: This feature may significantly change.
When including dynamic libraries, it is possible to use optional functions and globals. This is done using the
@dynamic attribute.
An example library could have this:
dynlib.c3i
module dynlib;
fn void do_something() @dynamic(4.0);
fn void do_something_else() @dynamic(0, 5.0);
fn void do_another_thing() @dynamic(0, 2.5);
Importing the dynamic library and setting the base version to 4.5 and minimum version to 3.0, we get the following:
test.c3
import dynlib;
fn void test()
{
if (@available(dynlib::do_something))
{
dynlib::do_something();
}
else
{
dynlib::do_something_else();
}
}
In this example the code would run do_something if available
(that is, when the dynamic library is 4.0 or higher), or
fall back to do_something_else otherwise.
If we tried to conditionally add something not available in the compilation itself, that is a compile time error:
if (@available(dynlib::do_another_thing))
{
// Error: This function is not available with 3.0
dynlib::do_another_thing();
}
Versionless dynamic loading is also possible:
maybe_dynlib.c3i
test2.c3
import maybe_dynlib;
fn void testme2()
{
if (@available(maybe_dynlib::testme))
{
maybe_dynlib::testme();
}
}
This allows things like optionally loading dynamic libraries on the platforms where this is available.
Textual Includes¶
$include¶
It's sometimes useful to include an entire file. This is done with the $include function.
Includes are only valid at the top level.
File Foo.c3
File Foo.x
The result is as if Foo.c3 contained the following:
The include may use an absolute or relative path. A relative path is always relative to the source file in which the include appears.
Note that to use it, the trust level of the compiler must be set to at least 2 with
the --trust option (i.e. use --trust=include or --trust=full from the command line).
$exec¶
An alternative to $include is $exec which is similar to include, but instead includes the output of an external
program as the included text.
An example:
import std::io;
// On Linux or MacOS this will insert 'String a = "Hello world!";'
$exec("echo", { "String a = \\\"Hello world!\\\"\\;" });
fn void main()
{
io::printn(a);
}
$exec can take in 1 to 3 arguments:
- command/scriptname (String)
- args (String[]): arguments passed on commandline to the command/script
- stdin (String): text that the command/script can read from stdin
Using $exec requires full trust level, which is enabled with --trust=full from the command line.
$exec will by default run from the /scripts directory for projects. For non-project builds,
the current directory is used as well.
$exec Scripting¶
$exec allows a special scripting mode, where one or more C3 files are compiled on the fly and
run by $exec.
import std::io;
// Compile foo.c3 and bar.c3 in the /scripts directory, invoke the resulting binary
// with the argument 'test'
$exec("foo.c3;bar.c3", { "test" });
fn void main()
{
...
}
Non-Recursive Imports¶
In specific circumstances you only wish to import a module without its submodules. This can be helpful in certain situations where otherwise unnecessary name-collisions would occur, but should not be used in the general case.
The syntax for non-recursive imports is import <module_name> @norecurse;
For example, my_code below imports only "mylib" and does not import its sub module "submod".
module mylib;
import std::io;
fn void only_want_this()
{
io::printn("only_want_this");
}
module mylib::submod;
import std::io;
fn void undesired_fn()
{
io::printn("undesired_fn");
}
module my_code;
// With the non-recursive import, undesired_fn is not found.
import mylib @norecurse;
// With the recursive import, undesired_fn is found.
// import mylib;
fn void main()
{
mylib::only_want_this();
submod::undesired_fn(); // This should error
}
Naming
C3 introduces fairly rigid naming rules to reduce ambiguity and make the language easy to parse for tools.
As a basic rule, all identifiers are limited to a-z, A-Z, 0-9 and _. The initial character cannot be a number. Furthermore, all identifiers are limited to 127 characters.
Module sub-paths are limited to 31 characters, and a full module path must be 63 characters or less.
Structs, unions, enums, typedefs and aliases¶
All user-defined types must start with A-Z after any optional initial _ and include at least 1 lower case letter. Bar, _T_i12 and TTi are all valid names. _1, bAR and BAR are not. For C-compatibility it's possible to alias the type to an external name using the attribute "cname".
struct Foo @cname("foo")
{
int x;
Bar bar;
}
union Bar
{
int i;
double d;
}
enum Baz
{
VALUE_1,
VALUE_2
}
Variables and parameters¶
All variables and parameters except for global constant variables must start with a-z after any optional initial _. ___a, fooBar and _test_ are all valid variable / parameter names. _, _Bar and X are not.
Global constants¶
Global constants must start with A-Z after any optional initial _. _FOO2, BAR_FOO and X are all valid global constants; _, _bar and x are not.
Enum members / Faults¶
Enum members and faults defined with faultdef follow the same naming standard as global constants.
Struct / union members¶
Struct and union members follow the same naming rules as variables.
Modules¶
Module names may contain a-z, 0-9 and _; no upper case characters are allowed.
Functions and macros¶
Functions and macros must start with a-z after any optional initial _.
C3 recommended code style¶
While C3 doesn't mandate a particular style of naming, the standard library nonetheless uses naming conventions which are recommended for official bindings and standard library contributions:
alias MyInt = int; // Types use PascalCase
struct SomeStructType
{
int a_field; // Members use snake_case
double foo_baz;
}
int some_global = 1; // Globals use snake_case
fn void some_function(int a_param) // Functions and parameters use snake_case
{
int foo_bar = 4; // Locals use snake_case
}
// Methods use snake_case, and the first parameter is usually called "self"
fn void SomeStructType.call_me(self, int a)
{
some_function(self.a_field + a);
}
// Macros use snake_case
macro @some_macro(a)
{
return a + a;
}
const MY_FOO = 123; // Constants use SCREAMING_SNAKE_CASE
So in short:
1. Types use PascalCase
2. Constants use SCREAMING_SNAKE_CASE
3. Everything else uses snake_case
Brace style is often a controversial topic. The C3 standard library uses Allman brace style:
For canonical C3 code outside of the stdlib and vendor (the official binding repository), prefer either Allman or K&R:
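As a rough sketch of the two brace styles, the two hypothetical functions below are identical apart from brace placement. K&R is shown here in the common variant where function bodies still open on their own line.

```c3
import std::io;

// Allman: every opening brace goes on its own line.
fn void allman(bool ok)
{
    if (ok)
    {
        io::printn("ok");
    }
}

// K&R: opening braces of statements stay on the statement's line.
fn void k_and_r(bool ok)
{
    if (ok) {
        io::printn("ok");
    }
}

fn void main()
{
    allman(true);
    k_and_r(true);
}
```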
Regarding tab-vs-spaces, contributions to the C3 stdlib or vendor should use tabs for indentation and spaces for formatting.
Comments
C3 has four distinct comment types:
- The normal // single line comment.
- The classic /* ... */ multi-line C style comment, but unlike in C they are allowed to nest.
- Documentation comments <* ... *>: the text within these comments will be parsed as documentation and optional Contracts on the following code.
- Shebang comment #!, which works like a single line comment, but is only valid as the first two characters in a file.
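The first three comment types can be seen together in a short sketch. The function add and its contract are hypothetical, and the shebang form is left out since it only applies to the first line of a file.

```c3
import std::io;

// A single line comment.

/* A multi-line comment
   /* with a nested comment inside */
   that continues after the nested one. */

<*
 Returns the sum of a and b.
 @require a >= 0, b >= 0
*>
fn int add(int a, int b) => a + b;

fn void main()
{
    io::printfn("%d", add(1, 2)); // prints 3
}
```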
Doc contracts¶
Documentation contracts start with <* and must be terminated using *>.
Any initial text up until the first @-directive on a new line will be interpreted as
free text documentation.
For example:
<*
Here are some docs.
@param num_foo : `The number of foos.`
@require num_foo > 4
@require num_foo <= 100 : "Prevent too many foos."
@deprecated
@mycustom "2"
*>
fn void bar(int num_foo)
{
io::printfn("%d", num_foo);
}
Doc Contracts Are Parsed¶
The following was extracted:
- The function description: "Here are some docs."
- The num_foo parameter has the description: "The number of foos".
- A Contract annotation for the compiler: @require num_foo > 4 which tells the compiler and a user of the function that a precondition is that num_foo must be greater than 4.
- A second contract annotation with the description: "Prevent too many foos".
- A function Attribute marking it as @deprecated, which displays warnings.
- A custom function Attribute @mycustom. The compiler is free to silently ignore custom Attributes; they can optionally be used to emit warnings, but are otherwise ignored.
Available annotations¶
| Name | Format |
|---|---|
| @param | @param [<ref>] <param> [ : <description>] |
| @return | @return <description> |
| @return? | @return? [<func>!], [<fault1>, <fault2>, ..., [: <description>]] |
| @require | @require <expr1>, <expr2>, ..., [: <description>] |
| @ensure | @ensure <expr1>, <expr2>, ..., [: <description>] |
| @deprecated | @deprecated [<description>] |
| @pure | @pure |
Fault inheritance¶
It is possible to reference the faults of another function or macro by using the syntax @return? some_func!. This will include all faults returned by some_func. This can be combined with other faults.
<*
@return? check_triangle!, io::EOF
*>
fn TriangleKind? get_triangle_kind(Triangle* triangle)
{
check_triangle(triangle)!;
// ...
}
See Contracts for information regarding @require, @ensure, @const, @pure.
*[<ref>] is an optional mutability description e.g. [&in]
*[<description>] denotes that a description is optional.
Language Common
Arrays
Arrays have a central role in programming. C3 offers built-in arrays, slices and vectors. The standard library enhances this further with dynamically sized arrays and other collections.
Fixed Size 1D Arrays¶
These are declared as <type>[<size>], e.g. int[4]. Fixed arrays are treated as values and will be copied if passed as a parameter. Unlike C, the size is part of the array's type. Taking a pointer to a fixed array will create a pointer to a fixed array, e.g. int[4]*.
Unlike C, fixed arrays do not decay into pointers. Instead, an int[4]* may be implicitly converted into an int*.
// C
int foo(int *a) { ... }
int x[3] = { 1, 2, 3 };
foo(x);
// C3
fn int foo(int* a) { ... }
int[3] x = { 1, 2, 3 };
foo(&x);
When you want to initialize a fixed array without specifying the size, use the [*] array syntax:
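As a minimal sketch of size inference, the example below declares one array with an explicit size and one with [*]. The variable names are chosen only for illustration.

```c3
import std::io;

fn void main()
{
	int[3] a = { 1, 2, 3 };   // Size written out explicitly
	int[*] b = { 4, 5, 6 };   // Size inferred from the initializer, so b is an int[3]
	io::printfn("%d %d", a.len, b.len);
}
```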
You can get the length of an array using the .len property:
int len1 = int[4].len; // 4
int[3] a = { 1, 2, 3 };
int len2 = a.len; // 3
int[*] b = { 1, 2 };
int len3 = b.len; // 2
Indexing into pointers of arrays¶
A source of confusion going from C to C3 is that indexing into, for example, a pointer int[3]* would yield an int[3], rather than an int.
To get the integer inside of the array that is pointed to, we need to do a dereference:
int[3] a = { 1, 2, 3 };
int[3]* b = &a;
int x = (*b)[1]; // Correctly returns 2
// Broken: int x = b[1]
A convenient shorthand for (*b)[1] is to use implicit subscript dereference: b.[1]. Here the . is only doing a dereference if the variable
is a pointer. So given the example above we have:
a[1]; // Returns 2
a.[1]; // Returns 2
b[1]; // BROKEN! Out of bounds access
(*b)[1]; // Returns 2
b.[1]; // Returns 2
This feature is mainly useful in generic modules and macros.
Slice¶
The final type is the slice <type>[] e.g. int[]. A slice is a view into either a fixed or variable array. Internally it is represented as a struct containing a pointer and a size. Both fixed and variable arrays may be converted into slices, and slices may be implicitly converted to pointers.
fn void test()
{
int[4] arr = { 1, 2, 3, 4 };
int[4]* ptr = &arr;
// Assignments to slices
int[] slice1 = &arr; // Implicit conversion
int[] slice2 = ptr; // Implicit conversion
// Assignments from slices
int[] slice3 = slice1; // Assign slices from other slices
int* int_ptr = slice1; // Assign from slice
int[4]* arr_ptr = (int[4]*)slice1; // Cast from slice
}
Slicing Arrays¶
It's possible to use the range syntax to create slices from pointers, arrays, and other slices.
This is written arr[<start-index> .. <end-index>], where end-index is inclusive.
fn void test()
{
int[5] a = { 1, 20, 50, 100, 200 };
int[] b = a[0 .. 4]; // The whole array as a slice.
int[] c = a[2 .. 3]; // { 50, 100 }
}
You can also use arr[<start-index> : <slice-length>]
fn void test()
{
int[5] a = { 1, 20, 50, 100, 200 };
int[] b2 = a[0 : 5]; // { 1, 20, 50, 100, 200 } start-index 0, slice-length 5
int[] c2 = a[2 : 2]; // { 50, 100 } start-index 2, slice-length 2
}
It’s possible to omit the first and last indices of a range:
- arr[..<end-index>] Omitting the start index will default it to 0
- arr[<start-index>..] Omitting the end index will assign it to arr.len-1 (this is not allowed on pointers)
Equivalently, with the index offset syntax arr[:<slice-length>] you can omit the start-index.
The following are all equivalent and slice the whole array:
fn void test()
{
int[5] a = { 1, 20, 50, 100, 200 };
int[] b = a[0 .. 4];
int[] c = a[..4];
int[] d = a[0..];
int[] e = a[..];
int[] f = a[0 : 5];
int[] g = a[:5];
}
You can also slice in reverse from the end with ^i where the index is len-i for example:
- ^1 means len-1
- ^2 means len-2
- ^3 means len-3
Again, this is not allowed for pointers since the length is unknown.
fn void test()
{
int[5] a = { 1, 20, 50, 100, 200 };
int[] b1 = a[1 .. ^1]; // { 20, 50, 100, 200 } a[1 .. (a.len-1)]
int[] b2 = a[1 .. ^2]; // { 20, 50, 100 } a[1 .. (a.len-2)]
int[] b3 = a[1 .. ^3]; // { 20, 50 } a[1 .. (a.len-3)]
int[] c1 = a[^1..]; // { 200 } a[(a.len-1)..]
int[] c2 = a[^2..]; // { 100, 200 } a[(a.len-2)..]
int[] c3 = a[^3..]; // { 50, 100, 200 } a[(a.len-3)..]
int[] d = a[^3 : 2]; // { 50, 100 } a[(a.len-3) : 2]
// Slicing the whole array: '..' takes an inclusive end index, ':' takes a length
int[] e = a[0 .. ^1]; // a[0 .. a.len-1]
int[] f = a[0 : ^0]; // a[0 : a.len]
}
One may also assign to slices:
Or copy slices to slices:
Copying between two overlapping ranges, e.g. a[1..2] = a[0..1] is unspecified behaviour.
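The two operations mentioned above, assigning to a slice and copying between slices, might look like the following sketch. The values are arbitrary, and the ranges are chosen so that source and destination do not overlap.

```c3
import std::io;

fn void main()
{
	int[3] a = { 1, 20, 50 };
	a[1..2] = 0;            // Assign one value to a range: a is now { 1, 0, 0 }

	int[3] b = { 2, 4, 5 };
	a[1..2] = b[0..1];      // Copy a range of equal length: a is now { 1, 2, 4 }
	io::printfn("%d %d %d", a[0], a[1], a[2]);
}
```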
Conversion List¶
| | int[4] | int[] | int[4]* | int* |
|---|---|---|---|---|
| int[4] | copy | - | - | - |
| int[] | - | assign | assign | - |
| int[4]* | - | cast | assign | cast |
| int* | - | assign | assign | assign |
Note that all casts above are inherently unsafe and will only work if the type cast is indeed compatible.
For example:
int[4] a;
int[4]* b = &a;
int* c = b;
// Safe cast:
int[4]* d = (int[4]*)c;
int e = 12;
int* f = &e;
// Incorrect, but not checked
int[4]* g = (int[4]*)f;
// Also incorrect but not checked.
int[] h = f[0..2];
Internals¶
Internally the layout of a slice is guaranteed to be struct { <type>* ptr; usz len; }. Note that in 0.8+, the length is sz.
There is a built-in struct std::core::runtime::SliceRaw which
has the exact data layout of the fat array pointers. It is defined to be
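The definition itself is not reproduced here. As a hedged sketch of the layout described above, the example below uses a hypothetical local struct, RawView, that mirrors the guaranteed pointer-plus-length layout, and reads a slice's length through it. RawView and raw_len are invented names; this is not the library's own declaration.

```c3
import std::io;

// Hypothetical mirror of the guaranteed slice layout: a pointer plus a length.
struct RawView
{
	void* ptr;
	usz len;
}

fn usz raw_len(int[] slice)
{
	// Reinterpret the slice's storage through the mirrored struct.
	RawView* raw = (RawView*)&slice;
	return raw.len;
}

fn void main()
{
	int[4] arr = { 1, 2, 3, 4 };
	io::printfn("%d", raw_len(arr[1..2]));
}
```

In ordinary code the .len property is the right tool; a raw view like this is only of interest when interfacing with code that needs the underlying representation.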
Dynamically allocated slices¶
Standard library provides utilities for allocating multiple elements into a slice:
// uses calloc under the hood (memory is zeroed out)
int[] arr1 = mem::new_array(int, 10);
defer mem::free(arr1);
// uses malloc under the hood (memory is undefined)
int[] arr2 = mem::alloc_array(int, 10);
defer mem::free(arr2);
Iteration Over Arrays¶
foreach element by copy¶
You may iterate over slices, arrays and vectors using foreach (Type x : array).
Using compile-time type inference this can be abbreviated
to foreach (x : array) for example:
fn void test()
{
int[4] arr = { 1, 2, 3, 5 };
foreach (item : arr)
{
io::printfn("item: %s", item);
}
// Or equivalently, writing the type:
foreach (int x : arr)
{
/* ... */
}
}
foreach element by reference¶
Using & it is possible to get an element by reference rather than by copy.
Providing two variables to foreach, the first is assumed to be the index and the second the value:
fn void test()
{
int[4] arr = { };
foreach (idx, &item : arr)
{
*item = 7 + (int)idx; // Mutates the array element
// index is usz when not specified, requiring an explicit
// cast on platforms where usz is larger than int.
// 0.8+, "sz" rather than usz is used.
}
// Or equivalently, writing the types
foreach (int idx, int* &item : arr)
{
*item = 7 + idx; // Mutates the array element
}
}
foreach_r reverse iterating¶
With foreach_r arrays or slices can be iterated over in reverse order
fn void test()
{
float[2] arr = { 1.0, 2.0 };
foreach_r (idx, item : arr)
{
// Prints 2.0, 1.0
io::printfn("item: %s", item);
}
// Or equivalently, writing the types
foreach_r (int idx, float item : arr)
{
// Prints 2.0, 1.0
io::printfn("item: %s", item);
}
}
Iteration Over Array-Like types¶
It is possible to enable foreach on any custom type
by implementing .len and [] methods and annotating them using the @operator attribute:
struct DynamicArray
{
usz count;
usz capacity;
int* elements;
}
macro int DynamicArray.get(DynamicArray* arr, usz element) @operator([])
{
return arr.elements[element];
}
macro usz DynamicArray.count(DynamicArray* arr) @operator(len)
{
return arr.count;
}
fn void DynamicArray.push(DynamicArray* arr, int value)
{
arr.ensure_capacity(arr.count + 1); // Function not shown in example.
arr.elements[arr.count++] = value;
}
fn void test()
{
DynamicArray v;
v.push(3);
v.push(7);
// Will print 3 and 7
foreach (int i : v)
{
io::printfn("%d", i);
}
}
For more information, see operator overloading
Dynamic Arrays and Lists¶
The standard library offers dynamic arrays and other collections in the std::collections module.
alias ListStr = List {String};
fn void test()
{
ListStr list_str;
// Initialize the list on the heap.
list_str.init(mem);
list_str.push("Hello"); // Add the string "Hello"
list_str.push("World");
foreach (str : list_str)
{
io::printn(str); // Prints "Hello", then "World"
}
String str = list_str[1]; // str == "World"
list_str.free(); // Free all memory associated with list.
}
Fixed Size Multi-Dimensional Arrays¶
Declare two-dimensional fixed arrays as <type>[<inner-size>][<outer-size>] arr, like int[4][2] arr. Below you can see how this compares to C:
// C
// Uses: name[<outer-size>][<inner-size>]
int array_in_c[4][2] = {
{1, 2},
{3, 4},
{5, 6},
{7, 8},
};
// C3
// Uses: <type>[<inner-size>][<outer-size>]
// C3 declares the dimensions, inner-most to outer-most
int[4][2] array = {
{1, 2, 3, 4},
{5, 6, 7, 8},
};
// To match C we must invert the order of the dimensions
int[2][4] array = {
{1, 2},
{3, 4},
{5, 6},
{7, 8},
};
// C3 also supports Irregular arrays, for example:
int[][4] array = {
{ 1 },
{ 2, 3 },
{ 4, 5, 6 },
{ 7, 8, 9, 10 },
};
Note
Accessing the multi-dimensional fixed array has inverted array index order compared to when the array was declared.
Strings
In C3, multiple string types are available, each suited to different use cases.
String¶
String is the type to use in most cases; it can be sliced, compared, and so on.
It is possible to access the length of a String instance through the .len property.
ZString¶
ZString is used when working with C code, which expects null-terminated C-style strings of type char*.
It is a typedef so converting to a ZString requires an explicit cast. This helps to remind the user to check there is appropriate \0 termination of the string data.
Caution
Ensure the data ends with a terminating \0 when converting from String to ZString.
WString¶
The WString type is similar to ZString but uses Char16*, typically for UTF-16 encoded strings. This type is useful for applications where 16-bit character encoding is required.
DString¶
DString is a dynamic string builder that supports various string operations at runtime, allowing for flexible manipulation without the need for manual memory allocation.
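A short sketch of String and ZString in use might look as follows. It assumes the standard library method String.zstr_tcopy, which makes a zero-terminated copy with the temp allocator; verify it against your compiler version. The sample text is arbitrary.

```c3
import std::io;

fn void main()
{
	String s = "Hello, World";
	io::printfn("%d", s.len);          // Length in bytes
	String hello = s[0..4];            // Slicing, the end index is inclusive
	if (hello == "Hello")
	{
		io::printn(hello);
	}
	@pool()
	{
		// Assumed helper: a zero-terminated temp copy suitable for C APIs.
		ZString z = s.zstr_tcopy();
		io::printfn("%s", z);
	};
}
```

Making a copy avoids casting a String slice that has no terminating \0 of its own, which is the risk the caution above describes.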
Enums
Enums¶
Enums use the following syntax:
enum State : int
{
WAITING,
RUNNING,
TERMINATED
}
// Access enum values via:
State current_state = WAITING; // or '= State.WAITING'
Access requires referencing the enum's name, as in State.WAITING, because
an enum like State is a separate namespace by default, just like C++'s enum class. The name may be left out where the type can be inferred.
Standard enums are always backed by an ordinal value running from zero and up, without any gaps. For enums with non-consecutive values, see constdef. To create enums that implement a bit-mask, you can also consider using bitstructs.
Enum associated values¶
It is possible to associate each enum value with one or more static values.
enum State : int (String description)
{
WAITING { "waiting" },
RUNNING { "running" },
TERMINATED { "ended" },
}
fn void main()
{
State process = State.RUNNING;
io::printfn("%s", process.description);
}
Multiple static values can be associated with an enum value, for example:
struct Position
{
int x;
int y;
}
enum State : int (String desc, bool active, Position pos)
{
WAITING { "waiting", false, { 1, 2} },
RUNNING { "running", true, {12,22} },
TERMINATED { "ended", false, { 0, 0} },
}
fn void main()
{
State process = RUNNING;
if (process.active)
{
io::printfn("Process is: %s", process.desc);
io::printfn("Position x: %d", process.pos.x);
}
}
Enum type inference¶
When an enum is used where the type can be inferred, like in switch case-clauses or in variable assignment, the enum name is not required:
State process = WAITING; // State.WAITING is inferred.
switch (process)
{
case RUNNING: // State.RUNNING is inferred
io::printfn("Position x: %d", process.pos.x);
default:
io::printfn("Process is: %s", process.desc);
}
fn void test(State s) { ... }
test(RUNNING); // State.RUNNING is inferred
If the enum without its name matches with a global in the same scope, it needs the enum name to be added as a qualifier, for example:
module test;
// Global variable
// ❌ Don't do this!
const State RUNNING = State.TERMINATED;
test(RUNNING); // Ambiguous
test(test::RUNNING); // Uses global variable.
test(State.RUNNING); // Uses enum constant.
Enum to and from ordinal¶
You can convert an enum to its ordinal with .ordinal, and convert it
back with EnumName.from_ordinal(...):
fn void store_enum(State s)
{
write_int_to_file(s.ordinal);
}
fn State read_enum()
{
return State.from_ordinal(read_int_from_file());
}
However, a plain cast is also valid:
Enum type properties 0.7.x¶
Enum types have the following additional properties in addition to the usual properties for user-defined types:
- membersof returns a list of member references, similar to struct's membersof.
- inner returns the type of the ordinal as a typeid.
- lookup_field(field_name, value) lookup an enum by associated value.
- names returns a list containing the names of all enums.
- from_ordinal(value) convert an integer to an enum.
- values return a list containing all the enum values of an enum.
- len return the number of enum values.
Enum type properties 0.8.0+¶
Enum types have the following additional properties in addition to the usual properties for user-defined types:
- members returns a list of member references, similar to struct's members.
- inner returns the type of the ordinal as a typeid.
- lookup_field(field_name, value) lookup an enum by associated value.
- names returns a list containing the names of all enums.
- from_ordinal(value) convert an integer to an enum.
- values return a list containing all the enum values of an enum.
Constdef¶
When interfacing with C code, you may encounter enums that are not sequential. For situations like this, you can use a constdef in C3:
extern fn KeyCode get_key_code();
constdef KeyCode
{
UNKNOWN = 0,
RETURN = 13,
ESCAPE = 27,
BACKSPACE = 8,
TAB = 9,
SPACE = 32,
EXCLAIM, // automatically incremented to 33
QUOTEDBL,
HASH,
}
fn void main()
{
int a = (int)KeyCode.SPACE; // assigns 32 to a
// constdefs behave like typedefs and do not enforce
// that every value has been declared beforehand
KeyCode b = (KeyCode)2;
// can safely interact with a C function that returns the same enum
KeyCode key = get_key_code();
// Use a cast to convert from the underlying type.
KeyCode conv = (KeyCode)a;
}
Inline constdef and @constinit¶
If you need a constdef to be converted to its assigned value without using a cast, inline can be used:
constdef ConstInline : inline String
{
A = "Hello",
B = "World",
}
fn void main()
{
// implicitly converted to string due to inline
String a = ConstInline.A;
ConstInline b = B;
String b_str = b;
io::printfn("%s, %s!", a, b_str); // Prints "Hello, World!"
}
We can use @constinit to allow the constdef to implicitly convert from a literal:
constdef ConstInline2 : String @constinit
{
A = "Hello",
B = "World",
}
fn void main()
{
ConstInline2 a = "Bye";
}
These conversion rules are the same as for typedef.
Structs and unions
Structs¶
Structs are always named:
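A minimal sketch of such a declaration is shown below. The member types follow the Person struct used elsewhere on this page; the main function is only there to make the example complete.

```c3
import std::io;

struct Person
{
	char age;
	String name;
}

fn void main()
{
	Person p;                   // Locals are zero initialized by default
	io::printfn("%d", p.age);
}
```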
A struct's members may be accessed using dot notation, even for pointers to structs.
fn void test()
{
Person p;
p.age = 21;
p.name = "John Doe";
io::printfn("%s is %d years old.", p.name, p.age);
Person* p_ptr = &p;
p_ptr.age = 20; // Ok!
io::printfn("%s is %d years old.", p_ptr.name, p_ptr.age);
}
(One might wonder whether it is possible to take a Person** and use dot access. It is not allowed; only one level of dereference is done.)
To change alignment and packing, attributes such as @packed may be used.
Initializing structs¶
Structs are typically initialized with an initializer list, which is a list of arguments inside of { }. For example, we can initialize our Person struct above like this:
But we can also use so-called designated initialization, where the explicit names of the members are assigned to, with a leading .:
With designated initializers we do not need to initialize all fields. The rest of the fields will automatically be zeroed out:
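The three forms described above can be sketched side by side. The Person struct is repeated so the example stands alone; the variable names p1, p2 and p3 are illustrative.

```c3
import std::io;

struct Person
{
	char age;
	String name;
}

fn void main()
{
	Person p1 = { 21, "John Doe" };                  // Positional initializer list
	Person p2 = { .age = 21, .name = "John Doe" };   // Designated initializers
	Person p3 = { .name = "John Doe" };              // age is left out and zeroed
	io::printfn("%d %d %d", p1.age, p2.age, p3.age);
}
```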
If a type contains members which in turn are structs or unions or arrays, then their members may be initialized using repeated .name syntax:
struct Test
{
Person owner;
Person subscriber;
}
Test t = { .owner = { 21, "John Doe" }, .subscriber.age = 42, .subscriber.name = "Test Person" };
Struct initializer splatting¶
It's possible to use the ... operator together with designated initializers to provide defaults that are overwritten by later assignments:
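A hedged sketch of what splatting might look like follows. The Settings struct and its fields are hypothetical, and the exact syntax should be checked against your compiler version, since this feature is comparatively recent.

```c3
import std::io;

// Hypothetical struct used only for this illustration.
struct Settings
{
	int width;
	int height;
	bool fullscreen;
}

fn void main()
{
	Settings defaults = { .width = 800, .height = 600 };
	// Start from 'defaults', then override a single field.
	Settings custom = { ...defaults, .fullscreen = true };
	io::printfn("%d %d", custom.width, custom.height);
	if (custom.fullscreen)
	{
		io::printn("fullscreen");
	}
}
```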
Struct subtyping¶
C3 allows creating struct subtypes using inline:
struct ImportantPerson
{
inline Person person;
String title;
}
fn void print_person(Person p)
{
io::printfn("%s is %d years old.", p.name, p.age);
}
fn void test()
{
ImportantPerson important_person;
important_person.age = 25;
important_person.name = "Jane Doe";
important_person.title = "Rockstar";
// Only the first part of the struct is copied.
print_person(important_person);
}
Union types¶
Union types are defined just like structs and are fully compatible with C.
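As a sketch, a union such as the Integral type used in this section could be declared like this. The members as_byte and as_int appear in the text; as_short and as_long are assumed names, the latter chosen because the size is stated to match long.

```c3
import std::io;

union Integral
{
	char as_byte;
	short as_short;   // Assumed member name
	int as_int;
	long as_long;     // Assumed member name
}

fn void main()
{
	Integral i;
	i.as_int = 500;
	io::printfn("%d", i.as_int);
	io::printfn("%d", Integral.sizeof);
}
```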
As usual, unions are used to hold one of many possible values:
fn void test()
{
Integral i;
i.as_byte = 40; // Setting the active member to as_byte
i.as_int = 500; // Changing the active member to as_int
// Undefined behaviour: as_byte is not the active member,
// so this will probably print garbage.
io::printfn("%d\n", i.as_byte);
}
Note that unions only take up as much space as their largest member, so Integral.sizeof is equivalent to long.sizeof.
Nested sub-structs / unions¶
Just like in C99 and later, nested anonymous sub-structs / unions are allowed. Note that the placement of struct / union names is different to match the difference in declaration.
struct Person
{
char age;
String name;
union
{
int employee_nr;
uint other_nr;
}
union subname
{
bool b;
Callback cb;
}
}
Union and structs type properties¶
Structs and unions also support the membersof property (members in 0.8.0+), which returns a list of struct/union members.
Bitstructs
Bitstructs¶
Bitstructs allow storing fields in a specific bit layout. A bitstruct may only contain integer types and booleans; in most other respects it works like a struct.
The main difference is that the bitstruct has a backing type and each field has a specific bit range. In addition, it's not possible to take the address of a bitstruct field.
bitstruct Foo : char
{
int a : 0..2;
int b : 4..6;
bool c : 7;
}
fn void test()
{
Foo f;
f.a = 2;
io::printfn("%d", (char)f); // prints 2
f.b = 1;
io::printfn("%d", (char)f); // prints 18
f.c = true;
io::printfn("%d", (char)f); // prints 146
// Normal designated initializers are supported
f = { .a = 1, .b = 3, .c = false };
// As a special case, boolean fields may drop
// the initializer value, this implicitly sets them
// to true. Below the '.c' is the same as '.c = true'
f = { .a = 2, .b = 2, .c };
}
Bitstruct endianness¶
The bitstruct will follow the endianness of the underlying type:
bitstruct Test : uint
{
ushort a : 0..15;
ushort b : 16..31;
}
fn void test()
{
Test t;
t.a = 0xABCD;
t.b = 0x789A;
char* c = (char*)&t;
// Prints 789AABCD
io::printfn("%X", (uint)t);
for (int i = 0; i < 4; i++)
{
// Prints CDAB9A78
io::printf("%X", c[i]);
}
io::printn();
}
It is, however, possible to pick a different endianness, in which case the entire representation will internally assume big-endian layout:
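A sketch of the same Test bitstruct with an explicit endianness is given below. It assumes the @bigendian attribute is placed after the backing type, as in the char[4] example in the next section.

```c3
import std::io;

bitstruct Test : uint @bigendian
{
	ushort a : 0..15;
	ushort b : 16..31;
}

fn void main()
{
	Test t;
	t.a = 0xABCD;
	t.b = 0x789A;
	// Print the backing integer, then the individual bytes in memory order.
	io::printfn("%X", (uint)t);
	char* c = (char*)&t;
	for (int i = 0; i < 4; i++)
	{
		io::printf("%X", c[i]);
	}
	io::printn();
}
```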
In this case the same example yields CDAB9A78 and 789AABCD respectively.
Bitstruct backing types¶
Bitstruct backing types may be integers or char arrays. The difference in layout is somewhat subtle:
bitstruct Test1 : char[4]
{
ushort a : 0..15;
ushort b : 16..31;
}
bitstruct Test2 : char[4] @bigendian
{
ushort a : 0..15;
ushort b : 16..31;
}
fn void test()
{
Test1 t1;
Test2 t2;
t1.a = t2.a = 0xABCD;
t1.b = t2.b = 0x789A;
char* c = (char*)&t1;
for (int i = 0; i < 4; i++)
{
// Prints CDAB9A78 on x86
io::printf("%X", c[i]);
}
io::printn();
c = (char*)&t2;
for (int i = 0; i < 4; i++)
{
// Prints ABCD789A
io::printf("%X", c[i]);
}
io::printn();
}
Bitstructs with overlapping fields¶
Bitstructs can be made to have overlapping bit fields. This is useful when modeling a layout which has multiple different layouts depending on flag bits:
bitstruct Foo : char @overlap
{
int a : 2..5;
// "b" is valid due to the @overlap attribute
int b : 1..3;
}
Boolean-only bitstructs¶
When a bitstruct consists of only bool fields, the bit position may be dropped, and the bit position is inferred:
// The following produce exactly the same layout:
bitstruct Explicit : int
{
bool a : 0;
bool b : 1;
bool c : 2;
}
bitstruct Implicit : int
{
bool a;
bool b;
bool c;
}
Bitstructs as bit masks¶
It is possible to use bitstructs to implement bitmasks without using the explicit masking values, see the following example:
constdef BitMaskEnum : uint
{
ABC = 1 << 0,
DEF = 1 << 1,
ACTIVE = 1 << 5,
}
bitstruct BitMask : uint
{
bool abc : 0;
bool def : 1;
bool active: 5;
}
fn void test()
{
// Classic bit mask:
BitMaskEnum foo = BitMaskEnum.ABC | BitMaskEnum.DEF;
BitMaskEnum bar = BitMaskEnum.ACTIVE | BitMaskEnum.ABC;
BitMaskEnum baz = foo & bar;
if (baz & BitMaskEnum.ACTIVE) { ... }
// Using a bitstruct
BitMask a = { .abc, .def }; // Just .abc is the same as .abc = true
BitMask b = { .active, .abc };
BitMask c = a & b;
if (c.active) { ... }
assert((uint)b == (uint)bar, "Layout is the same");
}
Bitstruct type properties¶
Bitstructs also support:
- membersof - Return a list of all bitstruct members (use members in 0.8.0+).
- inner - Return the type of the bitstruct "container" type.
Vectors
Vectors are, where possible, based on underlying hardware vector implementations. A vector is similar to an array, but with additional functionality. The restriction is that a vector may only consist of elements that are numerical types, booleans or pointers.
A vector is declared similar to an array but uses [<>] rather than [], e.g. int[<4>].
(If you are searching for the counterpart of C++'s std::vector, look instead at the standard
library List type.)
Arithmetics on vectors¶
Vectors support all arithmetics and other operations supported by its underlying type. The operations are always performed elementwise.
For integer and boolean types, bit operations such as ^ | & << >> are available, and for pointers, pointer arithmetic
is supported.
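A small sketch of elementwise arithmetic and bit operations on integer vectors follows. The values are arbitrary.

```c3
import std::io;

fn void main()
{
	int[<2>] a = { 23, 11 };
	int[<2>] b = { 2, 1 };
	int[<2>] c = a * b;     // Elementwise multiply: { 46, 11 }
	int[<2>] d = a ^ b;     // Elementwise xor: { 21, 10 }
	io::printfn("%d %d %d %d", c[0], c[1], d[0], d[1]);
}
```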
Scalar values¶
Scalar values will implicitly widen to vectors when used with vectors:
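For instance, multiplying a vector by a scalar might be sketched like this, with the scalar applied to every element. The values are arbitrary.

```c3
import std::io;

fn void main()
{
	int[<2>] d = { 21, 14 };
	int[<2>] e = d * 2;     // The scalar 2 widens to { 2, 2 }, giving { 42, 28 }
	io::printfn("%d %d", e[0], e[1]);
}
```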
Additional operations¶
The std::math module contains a wealth of additional operations available on vectors using dot-method syntax.
- .sum() - sum all vector elements.
- .product() - multiply all vector elements.
- .max() - get the maximum element.
- .min() - get the minimum element.
- .dot(other) - return the dot product with the other vector.
- .length() - return the square root of the dot product (not available on integer vectors).
- .distance(other) - return the length of the difference of the two vectors (not available on integer vectors).
- .normalize() - return a normalized vector (not available on integer vectors).
- .lerp(other, t) - linearly interpolate toward other by t.
- .reflect(other) - reflect vector about other (assumes other is normalized).
- .comp_lt(other) - return a boolean vector with a component wise "<"
- .comp_le(other) - return a boolean vector with a component wise "<="
- .comp_eq(other) - return a boolean vector with a component wise "=="
- .comp_gt(other) - return a boolean vector with a component wise ">"
- .comp_ge(other) - return a boolean vector with a component wise ">="
- .comp_ne(other) - return a boolean vector with a component wise "!="
Dot methods available for scalar values, such as ceil, fma etc are in general also available for vectors.
Swizzling¶
Swizzling using dot notation is supported, using x, y, z, w or r, g, b, a:
int[<3>] a = { 11, 22, 33 };
int[<4>] b = a.xxzx; // b = { 11, 11, 33, 11 }
int c = b.w; // c = 11;
char[<4>] color = { 0x11, 0x22, 0x33, 0xFF };
char red = color.r; // red = 0x11
b.xy = b.zw;
color.rg += { 1, 2 };
Array-like operations¶
Like arrays, it's possible to make slices and iterate over vectors.
Note
The storage alignment of vectors is often different from that of arrays, which should be taken into account when storing vectors.
Memory Management
Like in C, memory is manually managed in C3. An object can either be passed as a value on the stack, or it can be separately allocated.
fn void test()
{
int a = 12; // This variable is allocated on the stack.
int b = a; // This copies the value from a to the stack variable b.
int[2] c = { 1, 2 };
int[2] d = c; // In C3 arrays are values and are copied by value.
io::printn(d); // Prints "{ 1, 2 }"
c[0] = 10;
io::printn(c); // Prints "{ 10, 2 }"
io::printn(d); // Prints "{ 1, 2 }"
}
Allocating on the heap¶
The problem with stack allocations is that the length and sizes must be known up front. Imagine if we wanted to create an array with n number of entries and return that as a slice.
A first attempt might be:
const MAX_NUMBER = 100;
<* @require n >= 0 && n <= MAX_NUMBER *>
fn int[] create_array(int n)
{
int[MAX_NUMBER] arr;
for (int i = 0; i < n; i++)
{
arr[i] = i;
}
return arr[:n]; // Error: returns a pointer to a stack allocated variable
}
Aside from the problem with having a MAX_NUMBER, we can't return a pointer to this array, even as a slice, because the stack memory where arr is stored is released when the call to create_array returns.
The normal solution here is to allocate memory on the heap instead, the code might look like this:
<* @require n >= 0 *>
fn int[] create_array(int n)
{
int* arr = malloc(n * int.sizeof);
for (int i = 0; i < n; i++)
{
arr[i] = i;
}
return arr[:n]; // Turn the pointer into a slice with length "n"
}
This allocates enough memory to hold n ints, and returns the result.
The downside is that we must make sure that we release the memory back when we're done:
fn void test()
{
int[] array = create_array(3);
do_things(array);
free(array); // Release memory back to the OS
}
Note
There are convenience functions in the standard library to allocate arrays on the heap. Use mem::new_array(int, n) - zero initialized - or mem::alloc_array(int, n) - not initialized - rather than malloc directly.
Temporary allocations¶
Having to clean up heap allocations is not always convenient. For example, what if we wanted to do this:
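The situation can be sketched as below: the array is created and handed over in a single expression. Here create_array is rewritten with mem::new_array for brevity, and do_things is a hypothetical consumer invented for this illustration. The sketch leaks memory on purpose, to show the problem.

```c3
import std::io;

fn int[] create_array(int n)
{
	int[] arr = mem::new_array(int, n);
	foreach (i, &item : arr)
	{
		*item = (int)i;
	}
	return arr;
}

// Hypothetical consumer that only reads the data.
fn void do_things(int[] array)
{
	io::printfn("%d", array.len);
}

fn void test()
{
	// The slice is never stored in a variable here,
	// so test() has nothing it could pass to free().
	do_things(create_array(3));
}

fn void main()
{
	test();
}
```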
In this example do_things would need to release the data, or we leak memory. But we're just using this temporarily – we always just create it and then delete it. Isn't there any simpler way?
In C3, the solution is to use the temporary allocator. Allocating with the temporary allocator works just like allocating with the heap allocator, but the @pool macro frees all temporary allocations made inside its scope, including those made deeper down in the call tree:
fn void some_function()
{
@pool()
{
do_calculations();
};
// All temporary allocations inside of do_calculations
// and deeper down are freed when exiting the `@pool` scope.
}
To allocate we use tmalloc, which works the same as malloc, but uses the temporary allocator.
<* @require n >= 0 *>
fn int[] create_temp_array(int n)
{
int* arr = tmalloc(n * int.sizeof);
for (int i = 0; i < n; i++)
{
arr[i] = i;
}
return arr[:n];
}
fn void test_temp()
{
do_things(create_temp_array(3)); // Creates a temporary array
}
fn void a_function()
{
@pool()
{
test_temp();
void* data = tmalloc(1000);
};
// All temporary memory is released when exiting `@pool()`
}
Using single line function body syntax => we can write this even more compactly as:
We can even nest @pools:
fn void nested()
{
@pool()
{
int* a = tmalloc(int.sizeof);
*a = 123;
// Only 'a' is valid
@pool()
{
int* b = tmalloc(int.sizeof);
*b = *a;
// Both 'b' and 'a' are valid
};
// 'b' is released, only 'a' is valid
io::printn(*a);
};
// 'a' is released
}
Temp allocator pitfalls
Because temporary allocations are released using @pool, you should never pass temporarily allocated data to other threads or store it in variables that outlive the @pool scope.
The compiler will try to detect using temporary data after free, but the ability to do so depends on whether the code is compiled with safety checks / address sanitizer or not. Support will also differ between OS and architectures.
Always make sure that temporary allocations aren't used beyond the scope of their @pool.
Functions that allocate¶
Standard library functions that allocate generally require you to pass an allocator. This allows you to use the standard heap allocator, mem, the temp allocator tmem or some other Allocator you might be using instead:
List{int} list;
list.init(mem); // "list" will use the heap allocator
list.push(1);
list.push(42);
io::printn(list); // Prints "{ 1, 42 }"
list.free(); // Free the memory in the list
If you are using mem, then in general you will need to free it in some way. Either it's built into the type, such as in the List example above, or else you will need to handle it yourself, like in this case:
String s = string::format(mem, "Hello %s", "World");
// The string "s" is allocated on the heap
io::printn(s);
// Prints "Hello World"
free(s);
// Frees the string
On the other hand, if you use the temp allocator, you only need to make sure it's wrapped in a @pool:
@pool()
{
List{int} list;
list.init(tmem); // "list" will use the temp allocator
list.push(1);
list.push(42);
io::printn(list);
String s = string::format(tmem, "Hello %s", "World");
io::printn(s);
}; // s and list are freed here, because they used temp memory
Because of the usefulness of the temp allocator idiom, there are often temp allocator versions of functions, prefixed "t" or "temp_":
@pool()
{
List{int} list;
list.tinit(); // Use the temp allocator
list.push(1);
list.push(42);
String s = string::tformat("Hello %s", "World"); // Use the temp allocator
};
Implicit initialization¶
Some types, such as List, HashMap and DString will use the temp allocator by default if they are not initialized.
@pool()
{
List{int} list;
list.push(1); // Implicitly initialize with the temp allocator
list.push(42);
DString str; // DString is a dynamic string
str.appendf("Hello %s", "World");
// The "appendf" implicitly initializes "str" with the temp allocator
str.insert_at(5, ",");
str.append("!");
io::printn(str); // Prints Hello, World!
}; // list and str are freed here
This is often useful for locals, but in the case of globals, you might want the container
to use the heap allocator by default. For most containers there is a ONHEAP constant which
allows you to statically initialize globals to use the heap allocator:
List {int} l = list::ONHEAP {int};
fn void main()
{
l.push(1); // Implicitly allocates on the heap, not the temp allocator.
}
Beyond allocating raw memory¶
In C, memory is allocated with plain malloc (uninitialized memory) and calloc (zero-initialized memory). The C3 standard library provides those, but also additional convenience functions:
new and alloc macros¶
The new and alloc macros take a type and allocate just enough memory for that value. This is often more convenient and clearer than Foo* f = malloc(Foo.sizeof).
Foo* f = mem::new(Foo); // Returns a zero initialized pointer for a type
int* p = mem::alloc(int); // Same as 'new' but memory is uninitialized
Foo* t = mem::tnew(Foo); // Same as 'new' but using the temp allocator
new and tnew also take an optional initializer, allowing you to allocate and initialize in a single call.
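As a minimal sketch of the optional initializer, assuming a hypothetical Point struct (not part of the standard library) and that the initializer is passed as the second argument:

```c3
import std::io;

// Hypothetical type used only for this illustration.
struct Point
{
    int x;
    int y;
}

fn int sum_new_point()
{
    // Allocate on the heap and initialize in the same call
    // (assumed form: the initializer is the second argument).
    Point* p = mem::new(Point, { .x = 3, .y = 4 });
    defer free(p); // Heap memory must be released by us
    return p.x + p.y;
}

fn void main()
{
    io::printfn("%d", sum_new_point());
}
```

Without the initializer, mem::new(Point) would return zero-initialized memory, as described above.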
There are also more specialized functions such as new_with_padding and new_aligned, the former when you need to add additional memory at the end of the allocation, and new_aligned for when you have overaligned types – typically vectors with alignment greater than 16.
new_array and alloc_array for creating arrays¶
// Returns a slice of 3 Foo elements, zero initialized
Foo[] arr = mem::new_array(Foo, 3);
// Same but memory is uninitialized
Foo[] a2 = mem::alloc_array(Foo, 3);
// Same as new_array, but using the temp allocator
Foo[] tarr = mem::temp_array(Foo, 3);
@clone¶
@clone allocates a copy of a value and returns a pointer to it.
// Creates an int pointer, initialized to 33
int* x = @clone(33);
// Same as @clone but using the temp allocator
int* y = @tclone(33);
int[] z = { 1, 2 };
// This clones the elements of a slice or array, in this case "z"
int[] a = @clone_slice(z);
// Same as @clone_slice, but using the temp allocator
int[] t = @tclone_slice(z);
Optionals (Essential)
In this section we will go over the essential information about Optionals and safe methods for working with them, for example if (catch optional_value) and the Rethrow operator !.
The advanced section covers other nice-to-have features, such as an alternative way to safely unwrap a result from an Optional using if (try optional_value), an unsafe method to force unwrap a result with !!, returning a default value with ?? when an Optional is empty, and other more specialised concepts.
What is an Optional?¶
An Optional is a safer alternative to returning -1 or null from
a function when a valid value can't be returned. An Optional
has either a result or is empty. When an Optional
is empty it has an Excuse explaining what happened.
- For example, trying to open a missing file returns the Excuse of io::FILE_NOT_FOUND.
- Optionals are declared by adding ? after the type.
- An Excuse is of type fault. The Optional Excuse is set with ~ after the value.
🎁 Unwrapping an Optional¶
Note
Unwrapping an Optional is safe because it checks it has a result present before trying to use it.
After unwrapping, the variable then behaves like a normal variable, a non-Optional.
Checking if an Optional is empty¶
import std::io;
fn void? test()
{
// Return an Excuse by adding '~' after the fault.
return io::FILE_NOT_FOUND~;
}
fn void main(String[] args)
{
// If the Optional is empty, assign the
// Excuse to a variable:
if (catch excuse = test())
{
io::printfn("test() gave an Excuse: %s", excuse);
}
}
Automatically unwrapping an Optional result¶
If we escape the current scope from an if (catch my_var) using a return, break, continue or Rethrow !, then the variable is automatically unwrapped to a non-Optional:
fn void? test()
{
int? foo = unreliable_function();
if (catch excuse = foo)
{
// Return the excuse with `~` operator
return excuse~;
}
// Because the compiler knows 'foo' cannot
// be empty here, it is unwrapped to non-Optional
// 'int foo' in this scope:
io::printfn("foo: %s", foo); // 7
}
Using the Rethrow operator ! to unwrap an Optional value¶
- The Rethrow operator ! will return from the function with the Excuse if the Optional result is empty.
- The resulting value will be unwrapped to a non-Optional.
import std::io;
// Function returning an Optional
fn int? maybe_function() { /* ... */ }
fn void? test()
{
// ❌ This will be a compile error
// maybe_function() returns an Optional
// and 'bar' is not declared Optional:
// int bar = maybe_function();
int bar = maybe_function()!;
// ✅ The above is equivalent to:
// int? temp = maybe_function();
// if (catch excuse = temp) return excuse~;
// Now temp is unwrapped to a non-Optional
// int bar = temp; // ✅ This is OK
}
⚠️ Optionals affect types and control flow¶
Optionals in expressions produce Optionals¶
If you use an Optional anywhere in an expression, the resulting expression will be an Optional too.
import std::io;
fn void main(String[] args)
{
// Returns Optional with result of type `int` or an Excuse
int? first_optional = 7;
// This is Optional too:
int? second_optional = first_optional + 1;
}
Optionals affect function return types¶
import std::io;
fn int test(int input)
{
io::printn("test(): inside function body");
return input;
}
fn void main(String[] args)
{
int? optional_argument = 7;
// `optional_argument` makes returned `returned_optional`
// Optional too:
int? returned_optional = test(optional_argument);
}
Functions conditionally run when called with Optional arguments¶
When calling a function with Optionals as arguments, the result will be the first Excuse found looking left-to-right. The function is only executed if all Optional arguments have a result.
import std::io;
fn int test(int input, int input2)
{
io::printn("test(): inside function body");
return input;
}
fn void main(String[] args)
{
int? first_optional = io::FILE_NOT_FOUND~;
int? second_optional = 7;
// Return first excuse we find
int? third_optional = test(first_optional, second_optional);
if (catch excuse = third_optional)
{
// excuse == io::FILE_NOT_FOUND
io::printfn("third_optional's Excuse: %s", excuse);
}
}
Interfacing with C¶
For C the interface to C3:
- The Excuse in the Optional of type fault is returned as the regular return.
- The result in the Optional is passed by reference.
For example:
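A sketch of that mapping, assuming a hypothetical exported function next_value and a hypothetical fault NO_MORE_VALUES. The C prototype in the comment is an assumption derived from the two rules above, not a verified declaration:

```c3
import std::io;

// Hypothetical fault used only for this illustration.
faultdef NO_MORE_VALUES;

int counter;

// C3 side: an exported function returning an Optional int.
fn int? next_value() @export("next_value")
{
    if (counter >= 3) return NO_MORE_VALUES~;
    return counter++;
}

// Assumed C-side declaration following the two rules above:
// the Excuse is the regular return, the result is passed by reference.
//   c3fault_t next_value(int *value_ref);

fn void main()
{
    for (int i = 0; i < 5; i++)
    {
        if (try value = next_value())
        {
            io::printfn("value: %d", value);
        }
    }
}
```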
The c3fault_t is guaranteed to be a pointer sized value.
Optionals (Advanced)
Optionals are only defined in certain code¶
✅ Variable declarations
✅ Function return signature
Handling an empty Optional¶
File reading example¶
- If the file is present the Optional result will be the first 100 bytes of the file.
- If the file is not present the Optional Excuse will be io::FILE_NOT_FOUND.
Try running this code below with and without a file called file_to_open.txt in the same directory.
import std::io;
<*
Function modifies 'buffer'
Returns an Optional with a 'char[]' result
OR an empty Optional with an Excuse
*>
fn char[]? read_file(String filename, char[] buffer)
{
// Return Excuse if opening a file failed, using Rethrow `!`
File file = file::open(filename, "r")!;
// At scope exit, close the file.
// Discard the Excuse from file.close() with (void) cast
defer (void)file.close();
// Return Excuse if reading failed, using Rethrow `!`
file.read(buffer)!;
return buffer; // return a buffer result
}
fn void? test_read()
{
char[] buffer = mem::new_array(char, 100);
defer free(buffer); // Free memory on scope exit
char[]? read_buffer = read_file("file_to_open.txt", buffer);
// Catch the empty Optional and assign the Excuse
// to `excuse`
if (catch excuse = read_buffer)
{
io::printfn("Excuse found: %s", excuse);
// Returning Excuse using the `~` suffix
return excuse~;
}
// `read_buffer` behaves like a normal variable here
// because the Optional being empty was handled by 'if (catch)'
// which automatically unwrapped 'read_buffer' at this point.
io::printfn("read_buffer: %s", read_buffer);
}
fn void main()
{
test_read()!!; // Panic on failure.
}
Return a default value if Optional is empty¶
The ?? operator allows us to return a default value if the Optional is empty.
import std::io;
fn void test_bad()
{
int regular_value;
int? optional_value = function_may_error();
// An empty Optional found in optional_value
if (catch optional_value)
{
// Assign default result when empty.
regular_value = -1;
}
// A result was found in optional_value
if (try optional_value)
{
regular_value = optional_value;
}
io::printfn("The value was: %d", regular_value);
}
fn void test_good()
{
// Return '-1' when `function_may_error()` is empty.
int regular_value = function_may_error() ?? -1;
io::printfn("The value was: %d", regular_value);
}
Modifying the returned Excuse¶
A common use of ?? is to catch an empty Optional and change
the Excuse to another more specific Excuse, which
allows us to distinguish one failure from the other,
even when they had the same Excuse originally.
import std::io;
faultdef DOG_ATE_HOMEWORK, TEXTBOOK_ON_FIRE;
fn int? test()
{
return io::FILE_NOT_FOUND~;
}
fn void? examples()
{
int? a = test(); // io::FILE_NOT_FOUND
int? b = test(); // io::FILE_NOT_FOUND
// We can tell these apart by default assigning our own unique
// Excuse. Our custom Excuse is assigned only if an
// empty Optional is returned.
int? c = test() ?? DOG_ATE_HOMEWORK~;
int? d = test() ?? TEXTBOOK_ON_FIRE~;
// If you want to immediately return with an Excuse,
// use the "~" and "!" operators together, see the code below:
int e = test() ?? DOG_ATE_HOMEWORK~!;
int f = test() ?? TEXTBOOK_ON_FIRE~!;
}
Force unwrapping expressions¶
The force unwrap operator !! will
make the program panic and exit if the expression is an empty optional.
This is useful when the error should – in normal cases – not happen
and you don't want to write any error handling for it.
That said, it should be used with great caution in production code.
fn void find_file_and_test()
{
find_file()!!;
// Force unwrap '!!' is roughly equal to:
// if (catch find_file()) unreachable("Unexpected excuse");
}
Find empty Optional without reading the Excuse¶
import std::io;
fn void test()
{
int? optional_value = io::FILE_NOT_FOUND~;
// Find empty Optional, then handle inside scope
if (catch optional_value)
{
io::printn("Found empty Optional, the Excuse was not read");
}
}
Run code if the Optional has a result¶
This is a convenience method, the logical inverse of if (catch), and is helpful when you don't care about the empty branch of the code or you wish to perform an early return.
fn void test()
{
// 'optional_value' is a non-Optional variable inside the scope
if (try optional_value)
{
io::printfn("Result found: %s", optional_value);
}
// The Optional result is assigned to 'unwrapped_value' inside the scope
if (try unwrapped_value = optional_value)
{
io::printfn("Result found: %s", unwrapped_value);
}
}
Another example:
import std::io;
// Returns Optional result with `int` type or empty with an Excuse
fn int? reliable_function()
{
return 7; // Return a result
}
fn void main(String[] args)
{
int? reliable_result = reliable_function();
// Unwrap the result from reliable_result
if (try reliable_result)
{
// reliable_result is unwrapped in this scope, can be used as normal
io::printfn("reliable_result: %s", reliable_result);
}
}
Additional conditions can be added to an if (try), but they must be
joined with &&. However you cannot use logical OR (||) conditions:
import std::io;
// Returns Optional with an 'int' result or empty with an Excuse
fn int? reliable_function()
{
return 7; // Return an Optional result
}
fn void main(String[] args)
{
int? reliable_result1 = reliable_function();
int? reliable_result2 = reliable_function();
// Unwrap the result from reliable_result1 and reliable_result2
if (try reliable_result1 && try reliable_result2 && 5 > 2)
{
// `reliable_result1` can be used as a normal variable here
io::printfn("reliable_result1: %s", reliable_result1);
// `reliable_result2` can be used as a normal variable here
io::printfn("reliable_result2: %s", reliable_result2);
}
// ERROR cannot use logical OR `||`
// if (try reliable_result1 || try reliable_result2)
// {
// io::printn("this can never happen");
// }
}
Shorthands to work with Optionals¶
Getting the Excuse¶
Retrieving the Excuse with if (catch excuse = optional_value) {...} is not the only way to get the Excuse from an Optional. We can use the macro @catch instead.
Unlike if (catch), this will never cause automatic unwrapping.
fn void main(String[] args)
{
int? optional_value = io::FILE_NOT_FOUND~;
fault excuse = @catch(optional_value);
if (excuse)
{
io::printfn("Excuse found: %s", excuse);
}
}
Checking if an Optional has a result without unwrapping¶
The @ok macro will return true if an Optional result is present and
false if the Optional is empty.
Functionally this is equivalent to !@catch, meaning no Excuse was found, for example:
fn void main(String[] args)
{
int? optional_value = 7;
bool result_found = @ok(optional_value);
assert(result_found == !@catch(optional_value));
}
No void? variables¶
The void? type has no possible representation as a variable, and may
only be a function return type.
Note
The main function cannot return an optional.
To store the Excuse returned from a void? function without if (catch foo = optional_value), use the @catch macro to convert the Optional to a fault:
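A minimal sketch of this pattern, using a hypothetical save_settings function that always fails:

```c3
import std::io;

// Hypothetical function that always returns an Excuse.
fn void? save_settings()
{
    return io::FILE_NOT_FOUND~;
}

fn void main()
{
    // No void? variable is needed: @catch yields the fault directly.
    fault excuse = @catch(save_settings());
    if (excuse)
    {
        io::printfn("Excuse found: %s", excuse);
    }
}
```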
C Interop
C3 is C ABI compatible. That means you can call C from C3, and call C3 from C without having to
do anything special. As a quick way to call C, you can simply declare the function as a
C3 function but with extern in front of it. As long as the function is linked, it will work:
extern fn void puts(char*); // C "puts"
fn void main()
{
// This will call the "puts"
// function in the standard c lib.
puts("Hello, world!");
}
To use a different identifier inside of your C3 code compared to the function or variable’s external name, use the @cname attribute:
extern fn void foo_puts(char*) @cname("puts"); // C "puts"
fn void main()
{
foo_puts("Hello, world!"); // Still calls C "puts"
}
While C3 functions are available from C using their external name, it's often useful to
define an external name using @cname or @export with a name to match C usage.
module foo;
fn int square(int x) @export // @export ensures external visibility
{
return x * x;
}
fn int square2(int x) @export("square")
{
return x * x;
}
Calling from C:
extern int square(int);
int foo_square(int) __attribute__ ((weak, alias ("foo__square")));
void test()
{
// This would call square2
printf("%d\n", square(11));
// This would call square
printf("%d\n", foo_square(11));
}
Linking static and dynamic libraries¶
If you have a library foo.a or foo.so or foo.obj (depending on type and OS), just add
-l foo on the command line, or in the project file add it to the linked-libraries value, e.g.
"linked-libraries": ["foo"].
To add library search paths, use -L <directory> from the command line and linker-search-paths
in the project file (e.g. "linker-search-paths": ["../mylibs/", "/extra-libs/"]).
Gotchas¶
- Bitstructs will be seen as their backing type when used from C.
- C bit fields must be manually converted to a C3 bitstruct with the correct layout for each target platform.
- C assumes the enum size is CInt.
- C3 uses fixed integer sizes. This means that int and CInt do not need to be the same, though in practice on 32/64 bit machines, long is usually the only type that differs in size between C and C3.
- Atomic types are not supported by C3.
  - In C3 there are generic Atomic types instead.
- There are no volatile and const qualifiers like in C.
  - C3 has global constants declared with const.
  - Instead of the volatile type qualifier, there are standard library macros @volatile_load and @volatile_store.
- Passing arrays by value, as C3 does, must be represented in C as passing a struct containing the array.
- In C3, fixed arrays do not decay into pointers like in C.
  - When defining a C function that has an array argument, replace the array type with a pointer. E.g. void test(int[] a) should become extern fn void test(int* a). If the function has a sized array, like void test2(int[4] b), replace it with a pointer to a sized array: extern fn void test2(int[4]* b);
  - Note that a pointer to an array is always implicitly convertible to a pointer to the first element. For example, int[4]* may be implicitly converted to int*.
- The C3 names of functions are name-spaced with the module by default when using @export, so when exporting a function with @export that is to be used from C, specify an explicit external name. E.g. fn void myfunc() @export("myfunc") { ... }.
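The array-to-pointer conversion from the list above can be sketched in plain C3. Here sum is a hypothetical stand-in for a C function taking (int*, int). In real interop it would be declared with extern fn instead of being defined in C3:

```c3
import std::io;

// Stand-in for a C function such as: int sum(int *values, int count);
fn int sum(int* values, int count)
{
    int total;
    for (int i = 0; i < count; i++) total += values[i];
    return total;
}

fn void main()
{
    int[4] arr = { 1, 2, 3, 4 };
    // The array does not decay by itself, so take its address explicitly.
    int[4]* ptr = &arr;
    // int[4]* converts implicitly to int* at the call site.
    io::printfn("%d", sum(ptr, 4));
}
```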
Contracts
Contracts are optional pre- and post-condition checks that the compiler may use for static analysis, runtime checks and optimization. Note that conforming C3 compilers are not obliged to use contracts.
However, violating either pre- or post-conditions is unspecified behaviour, and a compiler may optimize code as if they are always true – even if a potential bug may cause them to be violated.
In safe mode, pre- and post-conditions are checked using runtime asserts.
Why is contract analysis optional for compilers?¶
A frequent question is: "why are contracts opt-in rather than mandatory"? The answer to this is that it allows C3 compilers to be built for resource-constrained environments where it is challenging to fit static analysis. It also makes it simpler to build simple C3 compilers for learning purposes.
Conversely, it should be easy for advanced compilers to have enough information to do advanced static analysis as part of the regular compilation step, so it is important that the constraints are explicit and available.
Pre-conditions¶
Pre-conditions are usually used to validate incoming arguments.
Each condition must be an expression that can be evaluated to a boolean.
Pre-conditions use the @require annotation, and optionally can have an
error message to display after them.
<*
@require foo > 0, foo < 1000 : "optional error msg"
*>
fn int test_foo(int foo)
{
return foo * 10;
}
If we now write the following code:
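The kind of call meant here can be sketched as follows. The violating call is commented out so that the sketch still compiles, and the argument values are chosen purely for illustration:

```c3
import std::io;

<*
 @require foo > 0, foo < 1000 : "optional error msg"
*>
fn int test_foo(int foo)
{
    return foo * 10;
}

fn void main()
{
    // test_foo(0); // Violates "foo > 0" with a constant argument
    io::printfn("%d", test_foo(5)); // Satisfies both conditions
}
```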
With c3c (the standard C3 compiler) we will get a compile time error, saying that the contract is violated. However, expressions requiring more static analysis are often only caught at runtime.
Post conditions¶
Post-conditions are evaluated to make checks on the resulting state after passing through the function.
The post-condition uses the @ensure annotation, where return represents the return value of the function.
<*
@require foo != null
@ensure return > foo.x
*>
fn uint check_foo(Foo* foo)
{
uint y = abs(foo.x) + 1;
// If foo.x were 0 here, the result would be 0 and this would be a runtime contract error.
return y * abs(foo.x);
}
Parameter annotations¶
@param supports [in] [out] and [inout]. These are only applicable
for pointer arguments. [in] disallows writing to the variable,
[out] disallows reading from the variable. Without an annotation,
pointers may both be read from and written to without checks. If an & is placed
in front of the annotation (e.g. [&in]), then this means the pointer must be non-null
and is checked for null.
| Type | readable? | writable? | use as "in"? | use as "out"? | use as "inout"? |
|---|---|---|---|---|---|
| no annotation | Yes | Yes | Yes | Yes | Yes |
| in | Yes | No | Yes | No | No |
| out | No | Yes | No | Yes | No |
| inout | Yes | Yes | Yes | Yes | Yes |
However, it should be noted that the compiler might not detect whether the annotation is correct or not! This program might compile, even though it's strictly incorrect.
<*
@param [&in] i
*>
fn void lying_func(int* i)
{
int* b = i;
*b = 2; // Circumvent checks!
}
fn void test()
{
int a = 1;
lying_func(&a);
io::printfn("%d", a); // Might print 2!
}
However, compilers will detect this (*):
<*
@param [&in] i
*>
fn void bad_func(int* i)
{
*i = 2; // <- Compiler error: cannot write to "in" parameter
}
* The spec allows a barebones compiler to completely ignore contracts. Using such a compiler even this check might be ignored.
Pure in detail¶
The pure annotation allows a program to make assumptions in regard to how the function treats global variables.
Unlike for const, a pure function is not allowed to call a function which is known to be impure.
However, just like for const the compiler might not detect whether the annotation
is correct or not! This program might compile, but will behave strangely:
int i = 0;
fn void bad_func()
{
i = 2;
}
<*
@pure
*>
fn void lying_func()
{
bad_func() @pure; // Call bad_func by assuring it is pure!
}
fn void main()
{
i = 1;
lying_func();
io::printfn("%d", i); // Might print 2!
}
Circumventing "pure" annotations will cause the compiler to optimize under the assumption that globals are not affected, even if this isn't true.
Pre-conditions for macros¶
In order to check macros, it's often useful to use the builtin $defined
function which returns true if the code inside would pass semantic checking.
<*
@require $defined(resource.open, resource.open()) : `Expected resource to have an "open" function`
@require resource != null
@require $assignable(resource.open(), void*)
*>
macro open_resource(resource)
{
return resource.open();
}
Contract support¶
A C3 compiler may have different levels of contract use:
| Level | Behaviour |
|---|---|
| 0 | Contracts are only semantically checked |
| 1 | @require may be compiled into asserts inside of the function. Compile time violations detected through constant folding should not compile |
| 2 | As Level 1, but @ensures are also checked |
| 3 | @require is added at caller side as well |
| 4 | Static analysis is extended beyond compile time folding |
The c3c compiler currently does level 3 checking.
Defer¶
A defer always runs at the end of a scope, at any point after it is declared. defer is commonly used to simplify code that needs clean-up, like closing unix file descriptors, freeing dynamically allocated memory or closing database connections.
End of a scope¶
The end of a scope also includes return, break, continue or rethrow !.
fn void test()
{
io::printn("print first");
defer io::printn("print third, on function return");
io::printn("print second");
return;
}
defer runs after the other print statements, at the function return.
Note
Rethrow ! unwraps the Optional result if present; afterwards the previously Optional variable is a normal variable again. If the Optional result is empty, then the Excuse is returned from the function back to the caller.
Defer Execution order¶
When there are multiple defer statements they are executed in reverse order of their declaration, last-to-first declared.
fn void test()
{
io::printn("print first");
defer io::printn("print third, defers execute in reverse order");
defer io::printn("print second, defers execute in reverse order");
return;
}
Example defer¶
import std::io;
fn char[]? file_read(String filename, char[] buffer)
{
// return Excuse if failed to open file
File file = file::open(filename, "r")!;
defer {
io::printn("File was found, close the file");
if (catch excuse = file.close())
{
io::printfn("Fault closing file: %s", excuse);
}
}
// return if fault reading the file into the buffer
file.read(buffer)!;
return buffer;
}
If the file named filename is found, the function will read the content into a buffer;
defer will then make sure that any open File handles are closed.
Note that if a scope exit happens before the defer declaration, the defer will not run. This is a useful property because if the file failed to open, we don't need to close it.
defer try¶
A defer try is called at end of a scope when the returned Optional contained a result value.
Examples¶
fn void? test()
{
defer try io::printn("✅ defer try run");
// Returned an Optional result
return;
}
fn void main(String[] args)
{
(void)test();
}
defer try runs on scope exit.
fn void? test()
{
defer try io::printn("❌ defer try not run");
// Returned an Optional Excuse
return io::FILE_NOT_FOUND~;
}
fn void main(String[] args)
{
if (catch err = test())
{
io::printfn("test() returned a fault: %s", err);
}
}
defer try does not run on scope exit.
defer catch¶
A defer catch is called at end of a scope when exiting with an
Optional Excuse, and is helpful for logging, cleanup and freeing resources.
Memory allocation example¶
import std::core::mem;
import std::io;
fn char[]? test()
{
char[] data = mem::new_array(char, 12);
defer (catch err)
{
io::printfn("Excuse found: %s", err);
free(data);
}
// Returns Excuse, memory gets freed
if (!test_something(data)) return io::FILE_NOT_FOUND~;
// Returns data, defer catch doesn't run.
return data;
}
Pitfalls with defer and defer catch
When cleaning up memory allocations or resources, make sure the defer or defer catch
is declared as close to the resource declaration as possible.
This helps to avoid unwanted memory leaks or unwanted resource usage from other code rethrowing ! before the defer catch was even declared.
fn void? function_throws()
{
return io::FILE_NOT_FOUND~;
}
fn String? test()
{
char[] data = mem::new_array(char, 12);
// ❌ Before the defer catch declaration
// memory was NOT freed
// function_throws()!;
defer (catch err)
{
io::printn("freeing memory");
free(data);
}
// ✅ After the defer catch declaration
// memory freed correctly
function_throws()!;
return (String)data;
}
Attributes
Attributes are compile-time annotations on functions, types, global constants and variables. Similar to Java annotations, an attribute may also take arguments. An attribute can also represent a bundle of attributes.
Built in attributes¶
@align(alignment)¶
Used for: struct, bitstructs, union, var, function
This attribute sets the minimum alignment for a field or a variable, for example:
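A minimal sketch, assuming a hypothetical struct Foo. The alignment values are arbitrary and chosen only for illustration:

```c3
import std::io;

// The whole struct is aligned to 32 bytes,
// and the member "b" to at least 16 bytes.
struct Foo @align(32)
{
    int a;
    int b @align(16);
}

fn void main()
{
    io::printfn("Foo.alignof = %d", Foo.alignof);
}
```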
Note that following C behaviour, @align is only able to increase
the alignment. If setting a smaller alignment than default is
desired, then use @packed (which sets the alignment to 1 for all members)
and then @align.
@benchmark¶
Used for: function
Marks the function as a benchmark function. Will be added to the list of benchmark functions when the benchmarks are run, otherwise the function will not be included in the compilation.
@bigendian¶
Used for: bitstruct
Lays out the bits as if the data was stored in a big endian type, regardless of host system endianness.
@builtin¶
Used for: function, macro, global, const
Allows a macro, function, global or constant to be used from another module without the module path prefixed. Should be used sparingly.
@callconv¶
Used for: function
Sets the calling convention, which may be ignored if the convention is not supported on the target.
Valid arguments are "veccall", "cdecl", "stdcall". Any function without an explicit @callconv will use
"cdecl" which is the normal C calling convention.
Caution
On Windows, many calls are tagged stdcall in the C
headers. However, this calling convention is only ever used on 32-bit Windows,
and is a no-op on 64-bit Windows.
@compact¶
Used for: struct, union
This attribute works like @nopadding, but is applied recursively for any sub-elements, ensuring that there is no padding anywhere in the struct.
@const¶
Used for: macro
This attribute will ensure that the macro is always compile time folded (to a constant). Otherwise, a compile time error will be issued.
@deprecated¶
Used for: types, function, macro, global, const, member
Marks the particular type, function, macro, global, const or member as deprecated, so that using it triggers a warning.
@dynamic¶
Used for: methods
Mark a method for dynamic invocation. This allows the method to be invoked through interfaces.
@export¶
Used for: function, global, const, enum, union, struct, faultdef
Marks this declaration as an export. This ensures it is never removed and exposes it as public when linking.
The attribute takes an optional string value, which is the external name. This acts as if @cname had been
added with that name.
@cname¶
Used for: function, global, const, enum, union, struct, faultdef
Sets the external (linkage) name of this declaration.
Caution
Do not confuse this with @export, which is required
to export a function or global.
@finalizer¶
Used for: function
Make this function run at shutdown. See @init for the optional priority. Note that running a
finalizer is a "best effort" attempt by the OS. During abnormal termination it is not guaranteed to run.
The function must be a void function taking no arguments.
@if¶
Used for: all declarations
Conditionally includes the declaration in the compilation. It takes a constant compile time value argument: if this
value is true the declaration is retained, and if it is false the declaration is removed.
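As an illustration, the sketch below defines a hypothetical function platform_name twice and lets @if select one definition. It assumes the env::WIN32 constant from the standard library's env module:

```c3
import std::io;

// Exactly one of these two definitions survives compilation.
fn String platform_name() @if(env::WIN32)
{
    return "Windows";
}

fn String platform_name() @if(!env::WIN32)
{
    return "not Windows";
}

fn void main()
{
    io::printn(platform_name());
}
```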
@init¶
Used for: function
Make this function run at startup before main. It has an optional priority 1 - 65535, with lower being executed earlier. It is not recommended to use values less than 128 as they are generally reserved and using them may interfere with standard program initialization.
The function must be a void function taking no arguments.
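A small sketch of @init together with @finalizer. The names setup, teardown and ready are hypothetical and no explicit priority is given:

```c3
import std::io;

int ready; // Set by the @init function before main starts

fn void setup() @init
{
    ready = 1;
}

fn void teardown() @finalizer
{
    // Runs at shutdown on a best-effort basis.
    io::printn("teardown ran");
}

fn void main()
{
    io::printfn("ready = %d", ready);
}
```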
@inline¶
Used for: function, call
Declares a function to always be inlined or if placed on a call, that the call should be inlined.
@link¶
Used for: module, function, macro, global, const
Syntax for this attribute is @link(cond, link1, link2, ...),
where "link1" etc. are string names for libraries to implicitly
link to when this symbol is used.
In the case of a module section, adding @link implicitly places the
attribute on all of its symbols.
@littleendian¶
Used for: bitstruct
Lays out the bits as if the data was stored in a little endian type, regardless of host system endianness.
@local¶
Used for: any declaration
Sets the visibility to "local", which means it's only visible in the current module section.
@maydiscard¶
Used for: function, macro
Allows the return value of the function or macro to be discarded even if it is an optional. Should be used sparingly.
@mustinit¶
Used for: user-defined types
Prevents the use of the @noinit tag on a variable of the specified type.
@naked¶
Used for: function
This attribute disables prologue / epilogue emission for the function.
The body of the function should be a text asm statement.
@noalias¶
Used for: function parameters
This is similar to restrict in C. A parameter with @noalias should
be a pointer type, and the pointer is assumed not to alias to any other
pointer.
@nodiscard¶
Used for: function, macro
The return value may not be discarded.
@noinit¶
Used for: global, local variable
Prevents the compiler from zero initializing the variable.
@noinline¶
Used for: function, function call
Prevents the compiler from inlining the function or a particular function call.
@nopadding¶
Used for: struct, union
Ensures that a struct or union has no padding, emits a compile time error otherwise.
@norecurse¶
Used for: import
Import the module but not sub-modules or parent-modules, see Modules Section.
@noreturn¶
Used for: function, macro
Declares that the function will never return.
@nosanitize¶
Used for: function
This prevents sanitizers from being added to this function.
@nostrip¶
Used for: any declaration
This causes the declaration never to be stripped from the executable, even if it's not used. This also transitively applies to any dependencies the declaration might have.
@obfuscate¶
Used for: any declaration
Removes any string values that would identify the declaration in some way. Mostly this is used on faults and enums to remove the stored names.
@operator¶
Used for: method, macro method
This attribute has arguments [] []= &[] and len allowing subscript operator overloading for [] and foreach.
By implementing [] and len, foreach and foreach_r are enabled. In order to do foreach by reference,
&[] must be implemented as well.
Furthermore ==, !=, bit operations and arithmetics can all be overloaded.
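A sketch of enabling foreach on a user-defined type by implementing [] and len. The Samples type and its method names are hypothetical, and the usz index type is an assumption about the current standard library:

```c3
import std::io;

// Hypothetical wrapper around a slice of doubles.
struct Samples
{
    double[] values;
}

fn double Samples.get(&self, usz i) @operator([])
{
    return self.values[i];
}

fn usz Samples.len(&self) @operator(len)
{
    return self.values.len;
}

fn double total(Samples s)
{
    double sum;
    // foreach works because both [] and len are implemented.
    foreach (v : s) sum += v;
    return sum;
}

fn void main()
{
    double[3] backing = { 1.0, 2.0, 3.5 };
    Samples s = { .values = &backing };
    io::printfn("%s", total(s));
}
```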
@optional¶
Used for: interface methods
Placed on an interface method, this makes the method optional to implement for types that implement the interface.
See the Printable interface for an example.
@overlap¶
Used for: bitstruct
Allows bitstruct fields to have overlapping bit ranges.
@packed¶
Used for: struct, union
Causes all members to be packed as if they had alignment 1. The alignment of the struct/union is set to 1.
This alignment can be overridden with @align.
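To make the effect concrete, the sketch below compares a packed and an unpacked struct with the same members. Both types are hypothetical, and the exact size of the unpacked one depends on the target:

```c3
import std::io;

// Default layout: padding is typically inserted after "tag".
struct Loose
{
    char tag;
    int value;
}

// Packed layout: members are laid out with alignment 1.
struct Tight @packed
{
    char tag;
    int value;
}

fn void main()
{
    io::printfn("Loose: %d bytes, Tight: %d bytes", Loose.sizeof, Tight.sizeof);
}
```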
@private¶
Used for: any declaration
Sets the visibility to "private", which means it is visible in the same module, but not from other modules.
@pure¶
Used for: call
Used to annotate a non pure function as "pure" when checking for conformance to @pure on
functions.
@reflect¶
Used for: any declaration
Adds additional reflection information. Has no effect currently.
@section(name)¶
Used for: function, const, global
Declares that a global variable or function should appear in a specific section.
@tag(name, value)¶
Used for: function, macro, user defined type, struct/union/bitstruct member
Adds a compile time tag to a type, function or member which can be retrieved
at compile time using reflection: .has_tagof and .tagof.
Example: Foo.has_tagof("bar") will return true if Foo has a tag "bar".
Foo.tagof("bar") will return the value associated with that tag.
@test¶
Used for: function
Marks the function as a test function. Will be added to the list of test functions when the tests are run, otherwise the function will not be included in the compilation.
@unused¶
Used for: any declaration
Marks the declaration as possibly unused (but should not emit a warning).
@used¶
Used for: any declaration
Marks a parameter, value etc. as one that must be used.
@wasm¶
Used for: function, global, const
This attribute may take 0, 1 or 2 arguments. With 0 or 1 arguments
it behaves identically to @export if it is non-extern. For extern
symbols it behaves like @cname.
When used with 2 arguments, the first argument is the wasm module,
and the second is the name. It can only be used for extern symbols.
@winmain¶
Used for: function
This attribute is ignored on non-windows targets. On Windows,
it will create a WinMain entry point which calls
the main function. This will give other options for the main
argument, and is recommended for Windows GUI applications.
It is only valid for the main function.
@weak¶
Used for: function, const, global
Like @weaklink, but if the same definition occurs in the same compilation, the non-weak one is preferred.
@weaklink¶
Used for: function, const, global
Emits a weak symbol rather than a global.
User defined attributes¶
User defined attributes are intended for conditional application of built-in attributes.
attrdef @MyAttribute = @noreturn, @inline;
attrdef @MyCname(x) = @cname(x);
// The following two are equivalent:
fn void foo() @MyAttribute { /* */ }
fn void foo() @noreturn @inline { /* */ }
An attribute may also take parameters:
attrdef @MyAttr(val) = @tag("foo", val);
struct Test
{
int foo @MyAttr("test");
}
$echo Test.foo.tagof("foo"); // Will echo "test" at compile time
The attribute may also be completely empty:
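A minimal sketch of an empty attribute, assuming the hypothetical name @Experimental; it expands to no built-in attributes and so only acts as a marker:

```c3
// @Experimental is an illustrative name; it expands to nothing.
attrdef @Experimental;

fn int new_feature() @Experimental
{
    return 1;
}
```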
The alias statement¶
The alias statement in C3 is intended for making new names for function pointers, identifiers and types.
Defining a type alias¶
alias <type alias> = <type> creates a type alias. A type alias needs to follow the naming convention of user defined types (i.e. a capitalized
name with at least one lower case letter).
Function pointers must be aliased in C3. The syntax is somewhat different from C:
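As a minimal sketch, with Callback being an illustrative name rather than something from the standard library, such an alias can be written as:

```c3
// Callback is a hypothetical alias name chosen for this example.
alias Callback = fn void(int a, bool b);
```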
This defines an alias for the pointer type of a function that returns nothing and requires two arguments: an int and a bool. Here is a sample to illustrate usage:
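A small sketch of such usage, assuming the hypothetical names Callback and my_callback:

```c3
import std::io;

// Callback and my_callback are illustrative names.
alias Callback = fn void(int a, bool b);

fn void my_callback(int a, bool b)
{
    io::printfn("a = %d, b = %s", a, b);
}

fn void main()
{
    // Store the function's address in a variable of the alias type.
    Callback cb = &my_callback;
    // Call through the function pointer.
    cb(10, false);
}
```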
typedef types¶
A typedef creates a new type.
A typedef does not implicitly convert to or from any other type, unlike a type alias.
Literals will convert to the typedef types if they would convert to the underlying type.
Because a typedef type is a new type, it can have its own methods, like any other user-defined type.
typedef Foo = int;
Foo f = 0; // Valid since 0 converts to an int.
f = f + 1;
int i = 1;
// f = f + i Error!
f = f + (Foo)i; // Valid
typedef inline¶
When interacting with various APIs it is sometimes desirable for typedef types to implicitly convert to
their base type, but not from that type.
The behaviour here is analogous to how structs may use inline to create struct subtypes.
typedef CString = char*;
typedef ZString = inline char*;
//...
CString cstr = "cstr";
ZString zstr = "zstr";
//...
// char* from_cstr = cstr; // Error!
char* from_zstr = zstr; // Valid!
Function and variable aliases¶
alias can also be used to create aliases for functions and variables.
The syntax is alias <alias> = <original identifier>.
fn void foo() { ... }
int foo_var;
alias bar = foo;
alias bar_var = foo_var;
fn void test()
{
// These are the same:
foo();
bar();
// These access the same variable:
int x = foo_var;
int y = bar_var;
}
Using alias to create generic types, functions and variables¶
It is recommended to favour using alias to create aliases for parameterized types, but it can also be used for parameterized functions and variables:
import generic_foo;
// Parameterized function aliases
alias int_foo_call = generic_foo::foo_call {int};
alias double_foo_call = generic_foo::foo_call {double};
// Parameterized type aliases
alias IntFoo = Foo {int};
alias DoubleFoo = Foo {double};
// Parameterized global aliases
alias int_max_foo = generic_foo::max_foo {int};
alias double_max_foo = generic_foo::max_foo {double};
For more information, see the chapter on generics.
Function pointer default arguments and named parameters¶
It is possible to attach default arguments to function pointer aliases. There is no requirement that the function has the same default arguments. In fact, the function pointer may have default arguments where the function doesn't have it and vice-versa. Calling the function directly will then use the function's default arguments, whereas calling through the function pointer will use the function pointer alias's default argument.
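A minimal sketch of this behaviour, assuming the hypothetical names TestFn, test and test2; the values 5 and 123 are arbitrary:

```c3
import std::io;

// The alias carries its own default argument, independent of the function's.
alias TestFn = fn int(int y = 123);

fn int test(int x = 5)
{
    return x;
}

fn void main()
{
    TestFn test2 = &test;
    io::printn(test());  // Direct call uses the function's default: 5
    io::printn(test2()); // Call through the alias uses its default: 123
}
```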
Similarly, named parameter arguments follow the alias definition when calling through the function pointer:
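A minimal sketch, assuming the hypothetical names TwiceFn and twice; the function names its parameter x while the alias names it y:

```c3
import std::io;

// The alias names the parameter y, the function names it x.
alias TwiceFn = fn int(int y);

fn int twice(int x)
{
    return x * 2;
}

fn void main()
{
    TwiceFn twice_ptr = &twice;
    io::printn(twice(x: 3));     // Direct call uses the function's name: 6
    io::printn(twice_ptr(y: 4)); // Call through the pointer uses the alias's name: 8
}
```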
Generic Programming
Generics
NOTE: This section is updated for 0.7.9 and later. If you use a version before 0.7.9, use generic modules instead, which offer the same functionality but less granularity.
Generics allow you to create code that works with arbitrary types.
// If the module section is generic,
// then all its declarations are as well
// Note that previous to 0.7.9, this would be written "module my_module {Type};"
module my_module <Type>;
// Parameterized struct
struct MyStruct
{
Type a, b;
}
// Parameterized function
fn Type square(Type t)
{
return t * t;
}
We can rewrite this with individual generic declarations (note that this is not available before 0.7.9):
module my_module;
struct MyStruct <Type>
{
Type a, b;
}
fn Type square(Type t) <Type>
{
return t * t;
}
Parameter types¶
Generic parameters may be types or int, bool and enum constants. Type parameters are written as if they were regular type aliases, e.g. Type. Constant parameters are written as if they were constant aliases, e.g. MY_CONST, COUNT etc.
An example parameterized by a constant as well as a type:
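A minimal sketch, assuming the hypothetical names buffer_demo, FixedBuffer and IntBuf4, and the 0.7.9+ declaration syntax shown above:

```c3
module buffer_demo;
import std::io;

// Type is a type parameter, SIZE is a constant parameter.
struct FixedBuffer <Type, SIZE>
{
    Type[SIZE] data;
}

alias IntBuf4 = FixedBuffer {int, 4};

fn void main()
{
    IntBuf4 buf;
    buf.data[0] = 42;
    io::printn(buf.data.len); // The array length comes from SIZE: 4
}
```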
Using generic parameters¶
The code in a generic declaration uses the parameters as if they were types / constant aliases in the scope:
module foo_test <Type1, Type2>;
struct Foo
{
Type1 a;
}
fn Type2 test(Type2 b, Foo* foo)
{
return foo.a + b;
}
Using generics¶
To use a generic function or type, we can either define an alias for it, or invoke it directly with its parameters:
import foo_test;
alias FooFloat = Foo {float, double};
alias test_float = foo_test::test {float, double};
...
FooFloat f;
Foo{int, double} g;
...
test_float(1.0, &f);
foo_test::test{int, double} (1.0, &g);
Generics are grouped¶
All generics that are defined in the same parameterized module section are instantiated together, but so are any other generics in the same module that have identical parameters:
module abc <Test>;
// Belongs to generic 1
fn Test test1(Test a)
{
return a + 1;
}
module abc;
// Belongs to generic 1
struct Foo <Test>
{
Test a;
}
// Belongs to generic 1
fn Foo test2(Test b) <Test>
{
return (Foo) { .a = b };
}
// Different parameter name, defines a new generic 2
fn Test2 test3(Test2 a) <Test2>
{
return a * a;
}
fn void main()
{
// This will instantiate Foo, test2 and test1,
// but not test3
Foo{int} a;
}
Generic contracts¶
Just like for macros, optional constraints may be added to improve compile errors:
<*
@require $assignable(1, TypeB) && $assignable(1, TypeC)
@require $assignable((TypeB)1, TypeA) && $assignable((TypeC)1, TypeA)
*>
module vector <TypeA, TypeB, TypeC>;
/* .. code .. */
alias test_function = vector::test_func {Bar, float, int};
// This would give the error
// --> Parameter(s) failed validation:
// @require "$assignable((TypeB)1, TypeA) && $assignable((TypeC)1, TypeA)" violated.
In general, contracts placed on types and identifiers will combine. However, contracts on generic functions and macros do not carry over to the aggregated generic contract:
module foo;
<* @require Test.kindof == INTEGER *>
struct Foo <Test>
{
Test a;
}
<* @require Test.sizeof < 4 *>
fn Test testme(Test t) <Test>
{
return t * 2;
}
fn void main()
{
// This would trigger the generic contract, placed on Foo:
// testme{float}(2.0f);
// However this is fine, since
// the function contract is not checked unless invoked:
Foo{long} x;
}
Methods on generic types¶
Adding methods to a generic type extends every instantiation of that type with the method, and the method may use the generic parameters the type was declared with:
module foo;
struct Foo <Type>
{
Type a;
}
module bar;
import foo, std::io;
fn Type Foo.add(self, Type b) => self.a + b;
fn void main()
{
Foo{int} f1 = { 3 };
Foo{double} f2 = { 3.4 };
io::printn(f1.add(5));
io::printn(f2.add(5));
}
We can also extend a particular instance, but in that case we do not access the parameterization.
module foo;
struct Foo <Type> { Type a; }
module bar;
import foo, std::io;
fn int Foo{int}.add(self, int b) => self.a + b;
// The below code would print "Error: 'Type' could not be found, did you spell it right?"
// fn Type Foo{int}.sub(self, Type b) => self.a - b;
fn void main()
{
Foo{int} f1 = { 3 };
Foo{double} f2 = { 3.4 };
io::printn(f1.add(5));
// io::printn(f2.add(5)); ERROR - There is no field or method 'Foo{double}.add'
}
Macros
The macro capabilities of C3 reach across several constructs:
macros, generic functions, generic modules, and compile time variables
(prefixed with $), macro compile time execution (using $if, $for, $foreach, $switch) and attributes.
A quick comparison of C and C3 macros¶
Conditional compilation¶
Macros¶
// C Macro
#define M(x) ((x) + 2)
#define UInt32 unsigned int
// Use:
int y = M(foo() + 2);
UInt32 b = y;
// C3 Macro
macro m(x)
{
return x + 2;
}
alias UInt32 = uint;
// Use:
int y = m(foo() + 2);
UInt32 b = y;
Dynamic scoping¶
Expression arguments¶
First class types¶
Trailing blocks for macros¶
// C Macro
#define FOR_EACH(x, list) \
for (x = (list); x; x = x->next)
// Use:
Foo *it;
FOR_EACH(it, list)
{
if (!process(it)) return;
}
// C3 Macro
macro @for_each(list; @body(it))
{
for ($typeof(list) x = list; x; x = x.next)
{
@body(x);
}
}
// Use:
@for_each(list; Foo* x)
{
if (!process(x)) return;
}
First class names¶
Declaration attributes¶
Consider these two examples comparing declaration attribute syntax in C vs C3:
// C Macro
#define DEPRECATED_INLINE __attribute__((deprecated)) __attribute__((always_inline))
int foo(int x) DEPRECATED_INLINE { ... }
// C3 Macro
attrdef @DeprecatedInline = @deprecated, @inline;
fn int foo(int) @DeprecatedInline { ... }
Declaration macros¶
Stringification¶
Top level evaluation¶
Scripting languages usually have unbounded top level evaluation. The flexibility of this style of meta programming has a trade-off in making the code more challenging to understand.
In C3, top level compile time evaluation is limited to @if attributes to conditionally enable or disable declarations and a handful of other somewhat limited compile time evaluation features (e.g. $assert, etc). This makes the code easier to read, but comes at the cost of expressive power. However, C3 makes this tradeoff for a reason:
Preventing top level compile time evaluation helps prevent lots of declarations from popping into existence seemingly by magic, which is a common source of codebase intelligibility degrading over time in C and C++. By restricting the system to only either including or removing those declarations that are or aren't applicable, via @if, C3 makes it so that you still get conditional compilation and macros but with much less bewildering "magic". This also allows IDEs to effectively work with C3 source code despite its extensive macro system.
In effect, top level declarations become always visible in C3, regardless of whether they are included or removed, whereas in C and C++ unbounded invisible declarations may occur, causing code to become increasingly opaque and riddled with seemingly indecipherable "magic" and numerous variables and constants seemingly coming from nowhere.
Local function scopes in contrast have the full range of C3's compile time evaluation features available though, which are arguably often more expressive and pleasant to use than C and C++'s equivalents for many use cases.
Macro declarations¶
A macro is defined using the syntax macro <return_type> <name>(<parameters>). Specifying the return type of a macro is optional and if omitted the return type is inferred but must always be well-defined (hence different paths cannot return different types, etc).
The parameters have different sigils that must prefix their names where applicable: $ means compile time evaluated (constant expression or type). # indicates an expression that is not yet evaluated, but is bound to where it was defined.
Macros that use any expression parameters (#) or trailing macro bodies (@body(...)) must have a name that begins with @. The reason is that macros which don't use such features can be thought of as being essentially function-like, without any surprising behavior such as lazily implementing expressions or (as is the case of macros with trailing bodies) essentially creating a new type of statement.
The @ warns the reader of a macro call of the possibility that the call may be doing more "magic" or may be more prone to bugs than if the macro lacked the @. Thus, unlike most languages, C3 enables the programmer to choose between more safe or more expressive macros and to make that choice immediately clear to the reader.
Note that $ parameters (unlike # and @body parameters) do not cause a macro to need a @ prefix.
For example, here's a basic swap written as a macro instead of using pointers, which makes it potentially more efficient by avoiding pointer indirection overhead:
<*
@require $defined(#a = #b, #b = #a)
*>
macro void @swap(#a, #b)
{
var temp = #a;
#a = #b;
#b = temp;
}
This expands on usage like this:
fn void test()
{
int a = 10;
int b = 20;
@swap(a, b);
}
// Equivalent to:
fn void test()
{
int a = 10;
int b = 20;
{
int __temp = a;
a = b;
b = __temp;
}
}
Note the necessary #. Here is an incorrect swap and what it would expand to:
macro void badswap(a, b)
{
var temp = a;
a = b;
b = temp;
}
fn void test()
{
int a = 10;
int b = 20;
badswap(a, b);
}
// Equivalent to:
fn void test()
{
int a = 10;
int b = 20;
{
int __a = a;
int __b = b;
int __temp = __a;
__a = __b;
__b = __temp;
}
}
Macro methods¶
Similar to regular methods a macro may also be associated with a particular type:
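A minimal sketch of a macro method, assuming a hypothetical Counter type; the macro is invoked with method syntax and expanded inline at the call site:

```c3
import std::io;

// Counter is an illustrative type.
struct Counter
{
    int value;
}

// A macro method: amount is untyped, so any type that can be added to an int works.
macro void Counter.add(&self, amount)
{
    self.value += amount;
}

fn void main()
{
    Counter c = { 1 };
    c.add(41);
    io::printn(c.value); // 1 + 41 = 42
}
```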
See the chapter on functions for more details.
Capturing a trailing block¶
It is often useful for a macro to take a trailing compound statement as an argument. In C++ this pattern is usually expressed with a lambda, but in C3 this is completely inlined.
To accept a trailing block, ; @name(param1, ...) is placed after declaring the regular macro parameters.
Here's an example to illustrate its use:
<*
A macro looping through a list of values, executing the body once
every pass.
@require $defined(a.len) && $defined(a[0])
*>
macro @foreach(a; @body(index, value))
{
for (int i = 0; i < a.len; i++)
{
@body(i, a[i]);
}
}
fn void test()
{
double[] a = { 1.0, 2.0, 3.0 };
@foreach(a; int index, double value)
{
io::printfn("a[%d] = %f", index, value);
};
}
// Expands to code similar to:
fn void test()
{
double[] a = { 1.0, 2.0, 3.0 };
{
double[] __a = a;
for (int __i = 0; __i < __a.len; __i++)
{
int __index = __i;
double __value = __a[__i];
io::printfn("a[%d] = %f", __index, __value);
}
}
}
Macros returning values¶
A macro may return a value, in which case it is then considered an expression rather than a statement:
macro square(x)
{
return x * x;
}
fn int getTheSquare(int x)
{
return square(x);
}
fn double getTheSquare2(double x)
{
return square(x);
}
Calling macros¶
It's perfectly fine for a macro to invoke another macro or itself.
macro square(x) { return x * x; }
macro squarePlusOne(x)
{
return square(x) + 1; // Expands to "return x * x + 1;"
}
The maximum recursion depth is limited to the macro-recursion-depth build setting.
Macro vaargs¶
Macros support the typed vaargs used by C3 functions: macro void foo(int... args) and macro void bar(args...)
but also support a unique set of macro vaargs that look like C-style vaargs: macro void baz(...).
To access the arguments there is a family of $va-* built-in functions to retrieve the arguments:
macro compile_time_sum(...)
{
var $x = 0;
$for var $i = 0; $i < $vacount; $i++:
$x += $vaconst[$i];
$endfor
return $x;
}
$if compile_time_sum(1, 3) > 2: // Will compile to $if 4 > 2
...
$endif
$vacount¶
Returns the number of arguments passed into the macro's vaarg list.
$vaarg¶
Returns the argument as a regular parameter. The argument is guaranteed to be evaluated once, even if the argument is used multiple times.
$vaconst¶
Returns the argument as a compile time constant, this is suitable for
placing in a compile time variable or use for compile time evaluation,
e.g. $foo = $vaconst[1]. This corresponds to $ parameters.
$vaexpr¶
Returns the argument as an unevaluated expression. Multiple uses will
evaluate the expression multiple times. This corresponds to # parameters.
$vatype¶
Returns the argument as a type. This corresponds to $Type style parameters,
e.g. $vatype[2] a = 2.
$vasplat¶
$vasplat allows you to paste the vaargs in the call into another call. For example,
if the macro was called with values "foo" and 1, the code foo($vasplat), would become foo("foo", 1).
You can even extract a range of arguments from the splat: $vasplat[2..4]. In this case, doing so would paste in arguments 2, 3 and 4.
Nor is $vasplat limited to function arguments. You can also use $vasplat within initializers. For example:
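A minimal sketch of $vasplat inside an initializer, assuming a hypothetical macro sum_with_bounds that surrounds its arguments with the fixed values 0 and 99:

```c3
import std::io;

// sum_with_bounds is an illustrative name.
macro sum_with_bounds(...)
{
    // The vaargs are pasted between the two fixed elements.
    int[*] values = { 0, $vasplat, 99 };
    int total = 0;
    foreach (v : values) total += v;
    return total;
}

fn void main()
{
    // The initializer becomes { 0, 1, 2, 99 }.
    io::printn(sum_with_bounds(1, 2)); // 0 + 1 + 2 + 99 = 102
}
```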
Untyped lists¶
Compile time variables may hold untyped lists. Such lists may be iterated over or implicitly converted to initializer lists:
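A minimal sketch of both uses, assuming the illustrative variable names $list and values:

```c3
import std::io;

fn void main()
{
    // A compile time variable holding an untyped list.
    var $list = { 1, 2 };

    // Iterate over the list at compile time.
    $foreach $x : $list:
        io::printfn("%d", $x);
    $endforeach

    // Implicit conversion to an initializer list.
    int[2] values = $list;
    io::printfn("%d", values[1]);

    // Elements can also be indexed at compile time.
    io::printfn("%d", $list[1]);
}
```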
Compile Time
During compilation, constant expressions will automatically be folded. Together with the compile
time conditional statements $if and $switch and the compile time iteration statements $for and $foreach,
it is possible to perform limited compile time execution.
Compile time values¶
During compilation, global constants are considered compile time values, as are any derived constant values, such as type names and sizes, variable alignments, etc.
Inside of a macro or a function, it is possible to define mutable compile time variables. Such local variables are prefixed with $ (e.g. $foo). It is also possible to define local type variables, which are also prefixed using $ (e.g. $MyType, $ParamType, etc).
Mutable compile time variables are not allowed in the global scope.
Concatenation¶
The compile time concatenation operator +++ can be used at compile
time to concatenate arrays and strings:
macro int[3] @foo(int $y)
{
int[2] $z = { 1, 2 };
return $z +++ $y;
}
fn void main()
{
io::printn(@foo(4)); // prints "{ 1, 2, 4 }"
}
Compile time && and ||¶
The operators &&& and ||| perform compile time versions of && and
||. The difference from the runtime operators is that the right hand side is not type
checked if the left hand side is false in the case of &&& and true in the case of |||.
This allows us to safely write this macro code:
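A minimal sketch of such code, assuming a hypothetical macro @foo that may or may not be declared, and an illustrative wrapper named foo_or_zero:

```c3
// foo_or_zero is an illustrative name; @foo is deliberately not declared here.
macro int foo_or_zero()
{
    // The right hand side is only type checked when $defined(...) is true.
    $if $defined(@foo()) &&& @foo() > 0:
        return @foo();
    $else
        return 0;
    $endif
}
```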
If @foo() doesn't exist, then this still compiles. However, if we had used && instead this would have been an error:
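A sketch of the failing variant, again assuming the hypothetical undeclared macro @foo; the offending lines are left commented out so that the snippet itself still compiles:

```c3
// broken_foo_or_zero is an illustrative name.
macro int broken_foo_or_zero()
{
    // With a plain && both sides are type checked, so uncommenting the
    // lines below is expected to fail when @foo is not declared:
    // $if $defined(@foo()) && @foo() > 0:
    //     return @foo();
    // $endif
    return 0;
}
```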
$if and $switch¶
$if <const expr>: takes a compile time constant value and evaluates it to see if it is true or false. If it is true, then the code in the "then" branch is retained and semantically checked, while the $else branch – if present – is discarded. And conversely, if the result is false, then the "then" branch is discarded and the $else branch is retained. Here are some basic usage examples:
macro @foo($x, #y)
{
$if $x > 3:
#y += $x * $x;
$else
#y += $x;
$endif
}
const int FOO = 10;
fn void test()
{
int a = 5;
int b = 4;
@foo(1, a); // Allowed, expands to a += 1;
// @foo(b, a); // Error: b is not a compile time constant.
@foo(FOO, a); // Allowed, expands to a += FOO * FOO;
}
For switching between multiple possibilities, use $switch.
macro @foo($x, #y)
{
$switch $x:
$case 1:
#y += $x * $x;
$case 2:
#y += $x;
$case 3:
#y *= $x;
$default:
#y -= $x;
$endswitch
}
Switching without passing a value argument to $switch itself is also allowed (much like normal switch), which works like an if-else chain in that it permits arbitrary conditional expressions per case instead of only allowing a specific constant per case:
macro @foo($x, #y)
{
$switch:
$case $x > 10:
#y += $x * $x;
$case $x < 0:
#y += $x;
$default:
#y -= $x;
$endswitch
}
Loops using $foreach and $for¶
$for ... $endfor works analogously to for, only it is limited to using compile time variables. $foreach ... $endforeach similarly
matches the behaviour of foreach.
Compile time looping:
macro foo($a)
{
$for var $x = 0; $x < $a; $x++:
io::printfn("%d", $x);
$endfor
}
fn void test()
{
foo(2);
// Expands to ->
// io::printfn("%d", 0);
// io::printfn("%d", 1);
}
Looping over enums:
macro foo_enum($SomeEnum)
{
$foreach $x : $SomeEnum.values:
io::printfn("%d", (int)$x);
$endforeach
}
enum MyEnum
{
A,
B,
}
fn void test()
{
foo_enum(MyEnum);
// Expands to ->
// io::printfn("%d", (int)MyEnum.A);
// io::printfn("%d", (int)MyEnum.B);
}
Note
The content of the $foreach or $for body must be at least a complete statement.
It's not possible to compile partial statements.
Compile time macro execution¶
If a macro only takes compile time parameters, that is only $-prefixed parameters, and then does not generate any other statements than returns, then the macro will be completely compile time executed.
This constant evaluation allows us to write some limited compile time code. For example, this macro will compute Fibonacci numbers at compile time:
macro long @fib(long $n)
{
$if $n <= 1:
return $n;
$else
return @fib($n - 1) + @fib($n - 2);
$endif
}
It is important to remember that if we had replaced $n with n the compiler would have complained. n <= 1 is not considered to be a constant expression, even if the actual argument to the macro was a constant. This limitation is deliberate, to offer control over what is compiled out and what isn't.
Conditional compilation at the top level using @if¶
At the top level (where globals are declared; such as functions, variables, etc), conditional compilation is controlled by appending @if attributes onto declarations:
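A minimal sketch, assuming the standard library constants env::WIN32 and env::NO_LIBC and the illustrative names platform_name and Config:

```c3
import std::io;

// Exactly one of these two definitions survives, depending on the target.
fn String platform_name() @if(env::WIN32)
{
    return "Windows build";
}

fn String platform_name() @if(!env::WIN32)
{
    return "Non-Windows build";
}

struct Config
{
    int a;
    // This member only exists when building without libc.
    int b @if(env::NO_LIBC);
}

fn void main()
{
    io::printn(platform_name());
}
```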
The argument to @if must be resolvable to a constant at compile time. This means that the argument may also be a compile time evaluated macro:
macro bool @foo($x) => $x > 2;
int x @if(@foo(5)); // Will be included
int y @if(@foo(0)); // Will not be included
In contrast though, attempts to use more general-purpose compile-time features such as $if at the top level will cause compilation failure. Compare:
// Compiles:
fn void func_a() @if(true)
{
//...
}
// Doesn't compile:
$if true:
fn void func_b()
{
//...
}
$endif
For the motivation and rationale behind using @if (and a limited subset of other compile-time constructs such as $assert) at the top level, instead of allowing arbitrary compile-time evaluation, see the discussion of top level evaluation on the macro page.
Evaluation order of top level conditional compilation¶
Conditional compilation at the top level can cause unexpected ordering issues, especially when combined with
$defined. At a high level, there are three phases of evaluation:
1. Non-conditional declarations are registered.
2. Conditional module sections are either discarded or have all of their non-conditional declarations registered.
3. Each module in turn will evaluate @if attributes for each module section.
The order of module and module section evaluation in (2) and (3) is not deterministic and any use of $defined should not
rely on this ordering.
Compile time introspection¶
At compile time, full type information is available. This allows for creation of reusable, code generating macros for things like serialization.
usz foo_alignment = Foo.alignof;
usz foo_member_count = Foo.membersof.len;
String foo_name = Foo.nameof;
// Note: In 0.8+ use:
// sz foo_alignment = Foo::alignof;
// sz foo_member_count = Foo::members.len;
// String foo_name = Foo::name;
To read more about all the fields available at compile time, see the page on reflection.
Compile time functions¶
The following is a list of functions available at compile time:
$alignof¶
Get the alignment of something. See reflection.
Note: this function is removed in 0.8+, use $reflect instead.
$assert¶
Check a condition at compile time.
$defined¶
This highly versatile compile time function returns true if a type or identifier is defined. It can also be used on an expression, returning "true" if the outermost expression is valid. Similarly, it can be used with a declaration, e.g. $defined(int a = foo) to verify that it's valid to declare a variable with the given argument.
However, be aware that $defined is for handling well-defined expressions, not arbitrary syntax. Invalid code placed inside $defined will cause compilation to fail, not return false.
See reflection.
$echo¶
Print a message to stdout when compiling the code.
$embed¶
Embed binary data from a file. See the "including binary data" section of the expressions page to see a few different usage examples.
This is useful for bundling any necessary data inside the executable or library itself so that there is no need for managing separate files when the program is redistributed to users. Such embedded data is fixed at compile time though, and so $embed shouldn't be used for files that need to persist changes between invocations of the program (e.g. work documents, saved games, etc). However, once loaded, $embed data is just arbitrary run-time data and thus you can still create and modify whatever other data you want based on it during each program run.
For example:
import std::io;
char[*] img_data = $embed("some_image.png");
fn void main()
{
io::printn(img_data);
// Prints an image's raw data
// as an array of unsigned bytes.
}
$error¶
When this is compiled, issue a compile time error.
$eval¶
Converts a compile time string to the corresponding variable or function. See reflection.
$exec¶
Execute a script at compile time and include the result in the source code. See more.
$extnameof, $qnameof and $nameof¶
Get the external name of a symbol.
Note: this function is removed in 0.8+, use $reflect instead.
See reflection.
External names are the names written into the symbol table of the executable or library binary, which subsequently may later be used by other programs to call into the binary by linking to those names, such as via foreign function interfaces (FFI) from another language or via direct use of the binary interface (such as enabled by the ABI and library compatibility of C and C3).
The external name of a symbol in the built binary can be set by attaching an @export("<intended_symbol_name>") attribute.
On Linux, the nm shell command can be used to view the symbol table of a binary directly, thus enabling determination of what names a foreign program would see when looking at the binary. For example, try running nm path/to/binary &> nm_out.txt then viewing the nm_out.txt file. The &> combines both normal (stdout) and error (stderr) output into the file, whereas just > would redirect only normal (stdout) output.
On Windows, you can try dumpbin /SYMBOLS for debug builds, dumpbin /EXPORTS for libraries, or dumpbin /IMPORTS for executables, but it may not help as much since large parts of the symbol table may be missing and hence misleading. There may also be tools available only in Visual Studio or associated with it, since Microsoft designs it that way intentionally to encourage programs to be built the way Microsoft wants.
On Mac, try otool, nm, or objdump. Running brew install binutils before may help.
$feature¶
Check if a given feature is enabled. Features are passed using -D <FEATURE_NAME> on the command line.
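A minimal sketch, assuming a hypothetical feature name VERBOSE that would be enabled by passing -D VERBOSE to the compiler:

```c3
import std::io;

// verbose_enabled is an illustrative helper name.
fn bool verbose_enabled()
{
    // VERBOSE is a hypothetical feature; without -D VERBOSE the $else branch is kept.
    $if $feature(VERBOSE):
        return true;
    $else
        return false;
    $endif
}

fn void main()
{
    if (verbose_enabled()) io::printn("verbose logging enabled");
    io::printn("running");
}
```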
$include¶
Includes a file into the current file at the top level as raw text, resulting in that file's text being compiled as if directly written into the location of the $include.
As an important limitation, the text may not include a module statement.
Note that if pure data inclusion is what you want then $embed may be more helpful than $include, and if you want dynamic data, $exec may be better.
$nameof¶
Get the local name of a symbol. See reflection.
Note: this function is removed in 0.8+, use $reflect instead.
Local names (a.k.a. unqualified names) are the "leaf nodes" (the very last item) of the full namespace path to a symbol.
For example, $nameof(io::printn) is printn.
$offsetof¶
Get the offset of a member. See reflection.
Note: this function is removed in 0.8+, use $reflect instead.
$qnameof¶
Get the qualified name of a symbol. See reflection.
Qualified names are the full ("absolute") namespace paths needed to reach a symbol.
For example, $qnameof(io::printn) is std::io::printn.
Note: this function is removed in 0.8+, use $reflect instead.
$vacount¶
Return the number of macro vaarg arguments. For this and other vaarg compile-time functions, see the macro vaargs section of the macro page.
$vaconst¶
Return a vaarg as a $constant parameter.
$vaexpr¶
Return a vaarg as an #expr parameter.
$vasplat¶
Expand the vaargs into an initializer list or function call, thus providing a way of passing part or all of the vaarg list's arguments onward.
To expand only part of a vaarg list rather than all of it, use $vasplat[<min>..<max>] with the intended indices instead of just $vasplat. See the section on slicing arrays to learn more about the wide variety of ways that such index ranges can be formed.
$vatype¶
Get a vaarg as a $Type parameter.
$sizeof¶
Return the size of an expression.
Note: this function is removed in 0.8+, use $reflect instead.
$stringify¶
Turn an expression into a string. This is typically used with expression parameters (# prefixed parameters) in macros.
Such stringification is very useful for debug printing and code generation, among other things. For example, just to illustrate why:
import std::io;
macro @show(#expr)
{
io::printfn("%s == %s", $stringify(#expr), #expr);
}
macro @announce(#expr)
{
io::printn($stringify(#expr));
#expr;
}
fn void main()
{
int num = 0;
@show(num);
@announce(num += 5);
@show(num);
}
This eliminates redundancy when print debugging. This code could be refined, such as by making @show handle Optionals correctly, but the simple version above is less distracting. As you can see, code can be annotated for temporary print debugging very easily by using $stringify based expression macros.
$typeof¶
Get the type of an expression at compile time, without ever evaluating it at run time and thus without causing side effects.
For example, the following C3 test passes:
fn void typeof_has_no_side_effects() @test
{
int minutes_left = 20;
$assert($typeof(minutes_left += 10).nameof == "int");
assert(minutes_left == 20);
// The state of `minutes_left` above never changes.
}
$typefrom¶
Get a type from a compile time constant typeid. It can also convert a compile-time string to the corresponding type.
See reflection.
Reflection
C3 allows both compile time and runtime reflection.
During compile time, some type information is available in the form of compile time constants associated with each type.
Runtime type information is also available by retrieving a typeid from a runtime object (such as from an object of type any via <runtime_obj>.type most commonly) and then comparing the properties of the returned runtime typeid against the corresponding properties (if any) of the compile time equivalent <type>.typeid. Note however that run time typeids currently have a much smaller set of available properties.
See the documentation about the any type for more information if you want or need runtime reflection. Such runtime info can be switched on or conditionally checked (e.g. via <runtime_obj>.type == <type>.typeid) to implement runtime polymorphism.
Compile time reflection in 0.7.x¶
During compile time there are many compile time fields that may be accessed using "dot notation" of the form <type>.<property>. That works for types, but in contrast when you want to retrieve type information about values or other expressions then try the $ functions instead.
For example, notice that <type>.sizeof and $sizeof(<value>) do not operate on the same kinds of entities. The former is for types whereas the latter is for values.
They can nonetheless be used to achieve similar effects. For example, the following assertions all pass:
$assert(short.sizeof == $sizeof((short)0));
short sh = 0;
$assert($sizeof(sh) == $typeof(sh).sizeof);
Type properties¶
Here are the property-like ("dot notation") constants associated with each type:
- alignof
- associated
- elements
- extnameof
- inf
- inner
- kindof
- len
- max
- membersof
- methodsof
- min
- nan
- nameof
- names
- paramsof
- parentof
- qnameof
- returns
- sizeof
- typeid
- values
Many of these properties are very useful for writing generic macros and contracts.
alignof¶
Returns the alignment in bytes needed for the type.
associated¶
Only available for enums. Returns an array containing the types of associated values if any.
enum Foo : int (double d, String s)
{
BAR { 1.0, "normal" },
BAZ { 2.0, "exceptional" }
}
String s = Foo.associated[0].nameof; // "double"
inf¶
Only available for floating point types.
Returns a representation of floating point "infinity".
inner¶
This returns a typeid to an "inner" type. What this means is different for each type:
- Array -> the array base type.
- Bitstruct -> underlying base type.
- Distinct -> the underlying type.
- Enum -> underlying enum base type.
- Pointer -> the type being pointed to.
- Vector -> the vector base type.
It is not defined for other types.
kindof¶
Returns the underlying TypeKind as defined in std::core::types.
len¶
Returns the length of the array. For enums and constdefs, it will return the number of constants.
max¶
Returns the maximum value of the type (only valid for integer and float types).
membersof¶
Only available for bitstruct, struct and union types.
Returns a compile time list containing the fields in a bitstruct, struct or union. The
elements have the compile time only type of member_ref.
Note: As the list is an "untyped" list, you are limited to iterating and accessing it at compile time.
A member_ref has properties alignof, kindof, membersof, nameof, offsetof, sizeof and typeid.
methodsof¶
This property returns the methods associated with a type as a constant array of strings.
Methods are generally registered after types are registered, which means that the use of "methodsof" may return inconsistent results depending on where in the resolution cycle it is invoked. It is always safe to use inside a function.
min¶
Returns the minimum value of the type (only valid for integer and float types).
nameof¶
Returns the name of the type.
names¶
Returns a slice containing the names of an enum.
paramsof¶
Only available for function pointer types. Returns a ReflectedParam struct for all function pointer parameters.
alias TestFunc = fn int(int x, double f);
String s = TestFunc.paramsof[1].name; // "f"
typeid t = TestFunc.paramsof[1].type; // double.typeid
parentof¶
Only available for bitstruct and struct types. Returns the typeid of the parent type.
returns¶
Only available for function types. Returns the typeid of the return type.
sizeof¶
Returns the size in bytes for the given type, like C sizeof.
typeid¶
Returns the typeid for the given type. An alias will return the typeid of the underlying type. The typeid size is the same as that of an iptr.
values¶
Returns a slice containing the values of an enum.
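To make a few of these properties concrete, here is a hedged sketch using the 0.7.x dot notation. The types Pair, Color and FourInts and the test name are hypothetical, and the size assertion assumes the usual 4-byte int.

```c3
struct Pair
{
    int first;
    double second;
}

enum Color
{
    RED,
    GREEN,
    BLUE
}

alias FourInts = int[4];

fn void type_properties_example() @test
{
    // Sizes, limits and names are compile time constants.
    $assert(int.sizeof == 4);
    $assert(char.max == 255);
    $assert(ichar.min == -128);
    $assert(Pair.nameof == "Pair");
    $assert(FourInts.len == 4);
    // Enum names and values are exposed as lists.
    assert(Color.names[1] == "GREEN");
    assert(Color.values[2] == Color.BLUE);
}
```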
Compile time reflection in 0.8+¶
Starting from 0.8.0, compile time information about types is accessed using ::, e.g. MyType::size.
For values use $reflect(<value>) to access the reflected properties for the underlying value.
The exception is $typeof(<value>), which creates a type from the type of the value. There are convenience macros like @sizeof(<value>), @kindof(<value>) for immediately accessing reflection data without explicitly invoking $reflect.
Type properties & functions¶
The following type properties and functions are available:
- alignment (all runtime types)
- from_ordinal (constdef and enum only)
- has_equals
- is_ordered
- is_substruct (struct only)
- len (array, vector, enum, constdef - runtime available)
- lookup_field (enum)
- max / min (int and float types)
- members (struct, union, enum, bitstruct)
- methods (all non-optional runtime types)
- nan / inf (float types)
- inner (runtime types except int, float, struct and union types)
- kind (runtime available)
- name / qname / cname (cname is limited to all user-defined types)
- params (function types)
- parent (constdef, struct, typedef - runtime available)
- returns (function types)
- size (runtime available)
- typeid (all runtime types + untypedlist)
- get_tag / has_tag (user-defined types)
- values (constdef, enum)
alignment¶
Returns the alignment in bytes needed for the type.
from_ordinal¶
Only available for constdef and enum. Converts an integer value to the enum/constdef of that ordinal. In the case of constdef it might be different from the actual value.
has_equals¶
True if == and != are supported for the type.
is_ordered¶
True if all comparisons are supported, either because they are built in for the type or because they were added through operator overloading.
is_substruct¶
Only available for structs.
True if the struct has an inline member.
len¶
Returns the length of the array or vector. For enums and constdefs, it will return the number of constants.
lookup_field¶
Only available for enums.
Look up the enum value by matching the first associated value:
enum Foo : (int val)
{
ABC { 3 },
LIFF { 42 }
}
...
Foo? foo = Foo::lookup_field(val, 42); // Returns Foo.LIFF
max / min¶
Only available for integer and floating point types.
Returns the maximum / minimum value of the type.
members¶
Only available for enum, bitstruct, struct and union types.
Returns a compile time list containing the fields in a bitstruct, struct or union. For enums it's the associated value declarations. The elements are of type reflected_ref, as if you had done $reflect on the element.
Note: As the list is an "untyped" list, you are limited to iterating and accessing it at compile time.
methods¶
This property returns the methods associated with a type as a constant array of strings.
Warning: Methods are generally registered after types are registered, which means that the use of "methods" may return inconsistent results depending on where in the resolution cycle it is invoked. It is always safe to use inside a function.
nan / inf¶
Only available for floating point types.
Returns a representation of floating point "NaN" / "infinity".
inner¶
This returns a typeid to an "inner" type. What this means is different for each type:
- Array -> the array base type.
- Bitstruct -> underlying base type.
- Distinct -> the underlying type.
- Enum -> underlying enum base type.
- Pointer -> the type being pointed to.
- Vector -> the vector base type.
It is not defined for other types.
kind¶
Returns the underlying TypeKind as defined in std::core::types.
name / qname / cname¶
Returns the name of the type. qname is the qualified name, which adds the module path before the name. cname returns the external name, and as such isn't available for built-in types.
params¶
Only available for function pointer types. Returns a ReflectedParam struct for all function pointer parameters.
alias TestFunc = fn int(int x, double f);
String s = TestFunc::params[1].name; // "f"
typeid t = TestFunc::params[1].type; // double.typeid
parent¶
Only available for typedef, constdef, bitstruct and struct types.
Returns the typeid of the inline field.
returns¶
Only available for function types. Returns the typeid of the return type.
size¶
Returns the size in bytes for the given type, like C sizeof.
get_tag / has_tag¶
get_tag retrieves the value of a @tag defined on the type, has_tag is used to check if the tag exists.
typeid¶
Returns the typeid for the given type. An alias will return the typeid of the underlying type. The typeid size is the same as that of an iptr.
values¶
Returns a slice containing the values of an enum or constdef.
Compile time functions¶
There are several built-in functions to inspect the code during compile time.
- $alignof (0.7.x only)
- $defined
- $eval
- $evaltype (0.7.x only)
- $extnameof (0.7.x only)
- $nameof (0.7.x only)
- $offsetof (0.7.x only)
- $qnameof (0.7.x only)
- $sizeof (0.7.x only)
- $stringify
- $typeof
- $typefrom
- $reflect (0.8+)
$defined¶
Returns true when the expression(s) inside are defined and all sub expressions
are valid.
$defined(Foo);     // => true
$defined(Foo.x);   // => true
$defined(Foo.baz); // => false
Foo foo = {};
// Check if a method exists:
$if $defined(foo.call):
    // Check what the method accepts:
    $switch :
        $case $defined(foo.call(1)) :
            foo.call(1);
        $default :
            // do nothing
    $endswitch
$endif
// Other way to write that:
$if $defined(foo.call, foo.call(1)):
    foo.call(1);
$endif
The full list of what $defined can check:
- SomeType a = <expr> - checks if <expr> can be used to initialize a variable of type SomeType
- var $a = <expr> - checks if <expr> can be compile-time evaluated.
- *<expr> - checks if <expr> can be dereferenced, <expr> must already be valid
- <expr>[<index>] - checks if indexing is valid, <expr> and <index> must already be valid, and when possible to check at compile-time if <index> is out of bounds this will return false
- <expr>[<index>] = <value> - same as above, but also checks if <value> can be assigned, <expr>, <index> and <value> must already be valid
- <expr>.<ident1>.<ident2> - check if .<ident2> is valid, <expr>.<ident1> must already be valid ("ident" is short for "identifier")
- ident, #ident, @ident, IDENT, $$IDENT, $ident - check if identifier exists
- Type - check if the type exists
- &<expr> - check if you can take the address of <expr>, <expr> must already be valid
- &&<expr> - check if you can take the temporary address of <expr>, <expr> must already be valid
- $eval(<expr>) - check if the $eval evaluates to something valid, <expr> must already be valid
- <expr>(<arg0>, ...) - check that the arguments are valid for the <expr> macro/function, <expr> and all args must already be valid
- <expr>!! and <expr>! - check that <expr> is an optional, <expr> must already be valid
- <expr>? - check that <expr> is a fault, <expr> must already be valid
- <expr1> binary_operator <expr2> - check if the binary_operator (+, -, ...) is defined between the two expressions, both expressions must already be valid
- (<Type>)<expr> - check if <expr> can be casted to <Type>, both <Type> and <expr> must already be valid
If for example <expr> is not defined when trying (<Type>)<expr> this will
result in a compile-time error.
$eval¶
Converts a compile time string into the corresponding variable or function:
int a = 123; // => a is now 123
$eval("a") = 222; // => a is now 222
$eval("mymodule::fooFunc")(a); // => same as mymodule::fooFunc(a)
$eval is limited to a single, optionally path prefixed, identifier.
Consequently methods cannot be evaluated directly:
struct Foo { ... }
fn int Foo.test(Foo* f) { ... }
fn void test()
{
void* test1 = &$eval("test"); // Works
void* test2 = &Foo.$eval("test"); // Works
// void* test3 = &$eval("Foo.test"); // Error
}
$reflect¶
Returns a reflection_ref of the expression. It can be queried for properties such as name, size, offset, alignment etc.
More information is forthcoming.
$stringify¶
Returns the expression as a string. $stringify has a special behaviour for handling macro expression parameters, where $stringify(#foo) will return the expression contained in #foo as a string, exactly as written in the macro call's arguments, rather than simply return "#foo".
Thus, for example:
import std::io;
macro @describe(#expr)
{
io::printfn("The value of `%s` is `%s`.", $stringify(#expr), #expr);
}
fn void main()
{
@describe(isz.sizeof);
//Prints:
// The value of `isz.sizeof` is `8`.
}
$typeof¶
Returns the type of an expression or variable.
$typefrom¶
Get a type from a compile time constant typeid. It can also convert a compile-time string to the corresponding type.
0.7.x only¶
$alignof¶
Returns the alignment in bytes needed for the type or member.
module test::bar;
struct Foo
{
int x;
char[] y;
}
int g = 123;
$alignof(Foo.x); // => returns 4
$alignof(Foo.y); // => returns 8 on 64 bit
$alignof(Foo); // => returns 8 on 64 bit
$alignof(g); // => returns 4
$extnameof¶
Returns the external name of a type, variable or function. The external name is the one used by the linker.
fn void testfn(int x) { }
String a = $extnameof(g); // => "test.bar.g";
String b = $extnameof(testfn); // => "test.bar.testfn"
$nameof¶
Returns the name of a function or variable as a string without module prefixes.
fn void test() { }
int g = 1;
String a = $nameof(g); // => "g"
String b = $nameof(test); // => "test"
$offsetof¶
Returns the offset of a member in a struct.
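A hedged sketch for 0.7.x, assuming $offsetof accepts the same Type.member form that $alignof uses in the example above. The struct Header and the test name are hypothetical, and the offsets assume a 4-byte int.

```c3
struct Header
{
    int id;
    int flags;
}

fn void offsetof_example() @test
{
    // `id` starts the struct; `flags` follows the 4-byte int.
    $assert($offsetof(Header.id) == 0);
    $assert($offsetof(Header.flags) == 4);
}
```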
$qnameof¶
Returns the same as $nameof, but with the full module name prepended.
module abc;
fn void test() { }
int g = 1;
String a = $qnameof(g); // => "abc::g"
String b = $qnameof(test); // => "abc::test"
$sizeof¶
This is used on a value to determine the allocation size needed. $sizeof(a) is equivalent
to doing $typeof(a).sizeof. Note that this is only used on values and not on types.
Any & Interfaces
Working with the type of any at runtime.¶
The any type is recommended for writing code that is polymorphic at runtime where macros are not appropriate.
It can be thought of as a typed void*.
An any can be created by assigning any pointer to it. You can then query the any type for the typeid of
the enclosed type (the type the pointer points to) using the type field.
This allows switching over the typeid, using a normal switch:
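A minimal sketch of such a switch, using the 0.7.x property syntax (<type>.typeid). The function name describe is hypothetical.

```c3
import std::io;

fn void describe(any a)
{
    switch (a.type)
    {
        case int.typeid:
            // The any holds a pointer to an int, so read it as one.
            io::printfn("int: %d", *(int*)a.ptr);
        case double.typeid:
            io::printfn("double: %f", *(double*)a.ptr);
        default:
            io::printn("something else");
    }
}

fn void main()
{
    int i = 42;
    double d = 0.5;
    char c = 'x';
    describe(&i); // Takes the int branch
    describe(&d); // Takes the double branch
    describe(&c); // Falls to the default branch
}
```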
Sometimes one needs to manually construct an any-pointer, which
is typically done using the any_make function: any_make(ptr, type)
will create an any pointing to ptr and with typeid type.
Since the runtime typeid is available, we can query for any runtime typeid property available
at runtime, for example the size, e.g. my_any.type.sizeof. This allows us to do a lot of work
with the enclosed data without knowing the details of its type.
For example, this would make a copy of the data and place it in the variable any_copy:
void* data = malloc(a.type.sizeof);
mem::copy(data, a.ptr, a.type.sizeof);
any any_copy = any_make(data, a.type);
Variable argument functions with implicit any¶
Regular typed vaargs are of a single type, e.g. fn void abc(int x, double... args).
In order to take variable arguments that are of multiple types, any may be used.
There are two variants:
Explicit any vararg functions¶
This type of function has a format like fn void vaargfn(int x, any... args). Because only
pointers may be passed to an any, the arguments must explicitly be pointers (e.g. vaargfn(2, &b, &&3.0)).
While explicit, this may be somewhat less user-friendly than implicit vararg functions:
Implicit any vararg functions¶
The implicit any vararg function has instead a format like fn void vaanyfn(int x, args...).
Calling this function will implicitly cause taking the pointer of the values (so for
example in the call vaanyfn(2, b, 3.0), what is actually passed are &b and &&3.0).
Because this passes values implicitly by reference, care must be taken not to mutate any values passed in this manner. Doing so would very likely break user expectations.
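The two variants can be compared side by side in a small sketch. The helper names count_ints, explicit_fn and implicit_fn are hypothetical, and the code assumes that both kinds of vaarg list can be passed on as an any[] slice.

```c3
import std::io;

// Hypothetical helper: counts how many of the arguments are ints.
fn int count_ints(any[] args)
{
    int count = 0;
    foreach (arg : args)
    {
        if (arg.type == int.typeid) count++;
    }
    return count;
}

// Explicit form: callers must pass pointers themselves.
fn int explicit_fn(int x, any... args) => count_ints(args);

// Implicit form: the compiler takes the addresses at the call site.
fn int implicit_fn(int x, args...) => count_ints(args);

fn void main()
{
    int b = 1;
    io::printfn("%d", explicit_fn(2, &b, &&3.0));
    io::printfn("%d", implicit_fn(2, b, 3.0));
}
```

In both calls one int and one double are passed after x, so both functions see the same two any values.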
Interfaces¶
Most statically typed object-oriented languages implement extensibility using virtual pointer tables (vtables). In C, and by extension C3, this is possible to emulate by passing around structs containing a pointer to a list of function pointers in addition to the data.
While this is efficient and often the best solution, it puts certain assumptions on the code and makes interfaces more challenging to evolve over time.
As an alternative there are languages (such as Objective-C) which instead use message passing to dynamically typed objects, where the availability of functionality may be queried at runtime.
C3 provides this latter functionality over the any type using interfaces.
Defining an interface¶
The first step is to define an interface:
While myname will behave as a method, we declare it without a type. Note here that unlike normal methods we leave
out the first "self" argument.
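A sketch of what such a declaration might look like, inferred from the Baz.myname implementation shown in the next section (the exact original listing is an assumption):

```c3
// Assumed declaration of the MyName interface.
interface MyName
{
    // No "self" parameter here, unlike a normal method.
    fn String myname();
}
```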
Implementing the interface¶
To declare that a type implements an interface, add it after the type name:
struct Baz (MyName)
{
int x;
}
// Note how the first argument differs from the interface.
fn String Baz.myname(Baz* self) @dynamic
{
return "I am Baz!";
}
If a type declares an interface but does not implement its methods then that is a compile time error.
A type may implement multiple interfaces by placing them all inside of (), e.g. struct Foo (VeryOptional, MyName) { ... }.
A limitation is that only user-defined types may declare that they implement interfaces. Making existing types implement interfaces is possible, but does not provide compile time checks.
One of the interfaces available in the standard library is Printable, which contains to_format and to_new_string.
If we implemented it for our struct above it might look like this:
fn String Baz.to_new_string(Baz* baz, Allocator allocator) @dynamic
{
return string::printf("Baz(%d)", baz.x, allocator: allocator);
}
@dynamic methods¶
A method must be declared @dynamic to implement an interface, but a method may also be declared @dynamic without the type declaring it implementing a particular interface. For example, this allows us to write:
// This will make "int" satisfy the MyName interface
fn String int.myname(int*) @dynamic
{
return "I am int!";
}
@dynamic methods have their reference retained in the runtime code and can also be searched for at runtime and invoked
from the any type.
Referring to an interface by pointer¶
An interface, e.g. MyName, can be cast back and forth to any, but only types which
implement the interface completely may implicitly be cast to the interface.
So for example:
Baz b = { 1 };
double d = 0.5;
int i = 3;
MyName a = &b; // Valid, Baz implements MyName.
// MyName c = &d; // Error, double does not implement MyName.
MyName c = (MyName)&d; // Would break at runtime as double doesn't implement MyName
// MyName z = &i; // Error, no implicit conversion because int doesn't explicitly implement it.
MyName z = (MyName)&i; // Explicit conversion works and is safe at runtime if int implements "myname"
Calling dynamic methods¶
Methods implementing interfaces are like normal methods, and if called directly, they are just normal function calls. The difference is that they may be invoked through the interface:
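For instance, here is a self-contained sketch that repeats the MyName interface and Baz type from the sections above. The function name whoareyou is hypothetical.

```c3
import std::io;

interface MyName
{
    fn String myname();
}

struct Baz (MyName)
{
    int x;
}

fn String Baz.myname(Baz* self) @dynamic
{
    return "I am Baz!";
}

// Invokes the method through the interface rather than on a concrete type.
fn void whoareyou(MyName a)
{
    io::printn(a.myname());
}

fn void main()
{
    Baz b = { 1 };
    whoareyou(&b); // Dispatches dynamically to Baz.myname
}
```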
If we have an optional method we should first check that it is implemented:
interface VeryOptional
{
fn void do_something(int x, void* ptr) @optional;
}
fn void do_something(VeryOptional z)
{
if (&z.do_something)
{
z.do_something(1, null);
}
}
We first query if the method exists on the value. If it does we actually run it.
Here is another example, showing how the correct function will be called depending on type, checking
for methods on an any:
fn void whoareyou2(any a)
{
MyName b = (MyName)a;
// Query if the function exists
if (!&b.myname)
{
io::printn("I don't know who I am.");
return;
}
// Dynamically call the function
io::printn(b.myname());
}
fn void main()
{
int i;
double d;
Baz baz;
any a = &i;
whoareyou2(a); // Prints "I am int!"
a = &d;
whoareyou2(a); // Prints "I don't know who I am."
a = &baz;
whoareyou2(a); // Prints "I am Baz!"
}
Subtype inheritance¶
A struct with an "inline" member or a typedef which is declared with "inline", will
inherit dynamic methods from its inline "parent". This inheritance is not
available for "inline" enums.
struct BazChild
{
inline Baz b;
int x;
}
fn void main()
{
BazChild bp;
any a = &bp;
whoareyou2(a); // Prints "I am Baz!"
}
Reflection invocation¶
This functionality is not yet implemented and may see syntax changes.
It is possible to retrieve any @dynamic function by name and invoke it:
alias VoidMethodFn = fn void(void*);
fn void int.test_something(&self) @dynamic
{
io::printfn("Testing: %d", *self);
}
fn void main()
{
int z = 321;
any a = &z;
VoidMethodFn test_func = a.reflect("test_something");
test_func(a); // Will print "Testing: 321"
}
This feature allows methods to be linked up at runtime.
Operator Overloading
C3 has operator overloading for working with containers and for creating numerical types.
Overloads for containers¶
"Element at" operator []¶
Implementing [] allows a type to use the my_type[<value>] syntax:
It's possible to use any type as the argument, such as a string:
Only a single [] overload is allowed.
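As a minimal sketch of the integer-indexed case, the following reuses the Foo.get method from the foreach example later in this section and assumes a Foo struct that wraps a slice of doubles. The test name is hypothetical.

```c3
// Assumed definition: Foo wraps a slice of doubles.
struct Foo
{
    double[] x;
}

fn double Foo.get(&self, usz i) @operator([])
{
    return self.x[i];
}

fn void index_example() @test
{
    double[3] data = { 1.5, 2.5, 3.5 };
    Foo f = { .x = &data };
    // f[1] is rewritten into a call to Foo.get(&f, 1).
    assert(f[1] == 2.5);
}
```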
"Element ref" operator &[]¶
Similar to [], the &[] operator returns a value for &my_type[<value>], which may
be retrieved in a different way. If this overload isn't defined, then &my_type[<value>] would
be a syntax error.
"Element set" operator []=¶
This operator, the assignment counterpart of [], allows setting an element using my_type[<index>] = <value>.
"len" operator¶
Unlike the previous operator overloads, the "len" operator simply enables functionality
which augments the []-family of operators: you can use the "from end" syntax, e.g. my_type[^1],
to get the last element, assuming the indexing uses integers.
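The whole []-family can be sketched together as follows. The struct Foo (a wrapper around a slice of doubles) and the method names get_ref and set are assumptions made for illustration.

```c3
struct Foo
{
    double[] x;
}

fn double Foo.get(&self, usz i) @operator([])
{
    return self.x[i];
}

fn double* Foo.get_ref(&self, usz i) @operator(&[])
{
    return &self.x[i];
}

fn void Foo.set(&self, usz i, double new_val) @operator([]=)
{
    self.x[i] = new_val;
}

fn usz Foo.len(&self) @operator(len)
{
    return self.x.len;
}

fn void overload_family_example() @test
{
    double[3] data = { 1.5, 2.5, 3.5 };
    Foo f = { .x = &data };
    f[0] = 10.0;            // Uses Foo.set via []=
    double* second = &f[1]; // Uses Foo.get_ref via &[]
    *second = 20.0;
    assert(data[0] == 10.0);
    assert(f[1] == 20.0);
    assert(f[^1] == 3.5);   // "From end" indexing relies on Foo.len
}
```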
Enabling foreach¶
In order to use a type with foreach, e.g. foreach(d : foo), at a minimum, methods
with overloads for [] (@operator([])) and len (@operator(len)) need to be added.
If &[] is implemented, foreach by reference will be enabled (e.g. foreach(double* &d : foo)).
fn double Foo.get(&self, usz i) @operator([])
{
return self.x[i];
}
fn usz Foo.len(&self) @operator(len)
{
return self.x.len;
}
fn void test(Foo f)
{
// Print all elements in f
foreach (d : f)
{
io::printfn("%f", d);
}
}
Operator overloading for numerical types¶
+ - * / % together with unary minus and plus, bit operators ^ | & and << >> are available for overloading
numerical types. These overloads are limited to user-defined types.
Symmetric and reverse operators¶
For numerical types, @operator_s (defining a symmetric operator)
and @operator_r (defining a reverse operator) are available.
These are only available when matching different types. For example,
defining + between a Complex number and a double can look like this:
macro Complex Complex.add_double(self, double d) @operator_s(+)
{
    return self.add(complex_from_real(d));
}
The above would match both "Complex + double" and "double + Complex",
with the actual evaluation order of the arguments happening in
the expected order, meaning something like get_double() + get_complex()
would always evaluate the arguments from left to right.
As for @operator_r, it is useful in the case where the evaluation isn't symmetric:
macro Complex Complex.double_sub_this(self, double d) @operator_r(-)
{
return complex_from_real(d).sub(self);
}
The above would define "double - Complex" but not "Complex - double".
Resolving overloads¶
Numerical operators that take more than one operand can be overloaded several times,
so we can for example write a different + for adding Complex to int
as opposed to "Complex + double".
However, if "Complex + int" doesn't exist then the integer value will follow the normal conversion rules to implicitly cast it to a double!
More formally the resolution works in this manner:
- Is there an exact match to the second argument? If so, then this is picked.
- Is there a way to match by implicitly casting the second argument? If there is only one match, then this is picked. If there are multiple matches, then the operation is ambiguous and will be considered an error.
struct Foo
{
float a;
}
fn Foo Foo.minus_float(self, float f) @operator(-) => { .a = self.a - f };
fn Foo Foo.minus_double(self, double d) @operator(-) => { .a = self.a - d };
fn void main()
{
Foo x = { 1.0f };
Foo y = { 2.2f };
Foo zf = x - 2.0f; // Uses Foo.minus_float
Foo zi = x - 2; // ERROR: Ambiguous, implicitly cast value matches both overloads.
}
Bitstructs and bit operations¶
As a special rule, bitstructs may not overload ^ & |, as these operations are already
defined on bitstructs.
Combined assignment operators¶
If + is defined for a type, then += is defined as well, and similarly for the
other operators. However, it is also possible to explicitly override the combined assignment
operators to optimize those cases.
struct Foo
{
int a;
}
fn Foo Foo.add(self, Foo other) @operator(+) => { .a = self.a + other.a };
fn Foo Foo.add_self(&self, Foo other) @operator(+=)
{
self.a += other.a;
return *self;
}
fn void main()
{
Foo x = { 1 };
Foo y = { 2 };
Foo z = x + y; // Uses Foo.add
x += y; // Uses Foo.add_self
}
Operator overloading for ==¶
Overloading == is, like overloading arithmetic operators, only allowed on user-defined types.
Note: 0.8.x removes the != overload.
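A hedged sketch of an equality overload follows. The struct Point, the method name equals and the test name are hypothetical. Only == is exercised, so the example does not depend on how != is handled in a given version.

```c3
struct Point
{
    int x;
    int y;
}

fn bool Point.equals(self, Point other) @operator(==)
{
    return self.x == other.x && self.y == other.y;
}

fn void equality_example() @test
{
    Point a = { 1, 2 };
    Point b = { 1, 2 };
    Point c = { 3, 4 };
    assert(a == b);
    assert(!(a == c));
}
```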
Operator overloading for < (0.8+)¶
Overloading < is supported for 0.8.0 and onwards. If < and == are implemented, then the type supports all ordering operations: < <= == != >= >.
Note
Some words of caution
Operator overloading should always be written to behave in the same manner
as the operators behave when used with builtin types. For example: + should be used for addition, not concatenation. << should be used for left bitshift, not to append values to an array or print things to stdout.
Violating the expected behaviour of operators is why operator overloading is often frowned upon despite its usefulness. Operator overloading that follows expectations can make the code clearer and easier to read. Violating expectations on the other hand obfuscates the code and makes it harder to read and understand and hence also harder to safely share and reuse. It is bad style and poor taste.
Build Your Project
Build Commands
Building a project is done by invoking the C3 compiler with the build or run command inside of the project structure. The compiler will search upwards in the file hierarchy until a project.json file is found.
You can also customize the project build config.
Compile Individual Files¶
By default the compiler will compile a stand-alone file as an executable binary, rather than as a static or dynamic library.
The resulting executable binary will be given the same name as whichever C3 file contains the main function.
Alternatively, libraries can be compiled via c3c static-lib or c3c dynamic-lib or by creating a project configured as such and built via c3c build and c3c run and so on.
Run¶
When starting out with C3, it's natural to use compile-run to try things out. For larger projects, the built-in build system is recommended instead.
The compile-run command works the same as normal compilation (via compile, build, etc), but also immediately runs the resulting executable.
Common additional parameters¶
Additional parameters:
- --lib <path> add a library to search.
- --output <path> override the output directory.
- --path <path> execute as if standing at <path>
Init a new project¶
Create a new project structure in the current directory.
Use the --template command option to select a template. The following are built in:
- exe - the default template, produces an executable.
- static-lib - template for producing a static library.
- dynamic-lib - template for producing a dynamic library.
It is also possible to give the path to a custom template.
Additional parameters:
- --template <path> indicate an alternative template to use.
For example, c3c init hello_world creates the following structure:
.
├─ build/
├─ docs/
├─ lib/
├─ resources/
├─ scripts/
├─ src/
│ └─ main.c3
├─ test/
├─ LICENSE
├─ project.json
└─ README.md
Check the project configuration docs to learn more about configuring your project.
Test¶
Runs any tests found in the "sources" directory defined in your project.json, for example by invoking c3c test from inside the project.
Tests are defined with a @test attribute. For example:
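A minimal sketch of such a test, where add and test_add are hypothetical names:

```c3
// Hypothetical function under test.
fn int add(int a, int b)
{
    return a + b;
}

// Picked up by the test runner because of the @test attribute.
fn void test_add() @test
{
    assert(add(1, 2) == 3);
}
```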
Build¶
Build the project in the current path. It doesn't matter where in the project structure you are.
The built-in templates define two targets: debug (which is the default) and release.
Clean¶
Removes some of the generated build artifacts of the previous builds of the C3 project. In most cases this is unnecessary.
Build and Run¶
Build the target (if needed) and run the executable.
Clean and Run¶
Clean, build and run the target.
Dist¶
Clean, build and package the target for distribution to end users.
For convenience, the c3c dist command will also run the target afterwards if it is an executable, as it is likely you will want to check that the program is still working.
You should also transfer the distribution package to a clean machine and test that the application works correctly there too at a minimum. Otherwise, there is a high risk that your application will be broken due to some dependencies existing on your machine that don't exist on end users' machines. Developers' machines often have many more libraries already installed than users' machines, hence users' machines are far more likely to lack necessary dependencies. It is hard to reliably discern without testing.
Caution
c3c dist has not been properly added yet!
Docs¶
Not added yet!
Rebuilds the documentation based upon whatever documentation comments and contracts have been written into your C3 code, so that you and other programmers working on your project can easily get a more expedient and more readily navigable overview of what things are available and what they do and how to use them.
This is what is known as a "documentation generation" or "docgen" system. The most common example of a documentation generator in the C and C++ ecosystem is perhaps Doxygen (a 3rd party tool) but many other languages have their own built-in documentation generators.
Alternatively, if you do not want to maintain documentation for your project, such as if your project is too transient for documentation to matter or if you want to potentially iterate faster, then consider instead at least ensuring that you have lots of unit tests (via the @test attribute) and assertions ($assert and assert) in your code so that the "self documenting" qualities of your codebase are maximized. Be aware however that some things can never be adequately expressed in any amount of "self documenting" code. Each approach has tradeoffs. A balanced mix is also a good approach.
Benchmark¶
Runs benchmarks on a target, meaning that every function that has been annotated with @benchmark will be run and have its performance profiled (including time spent and a CPU cycle count) so that you can easily monitor opportunities for optimization and avoid computational waste.
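As a rough sketch, a benchmark is an ordinary function carrying the @benchmark attribute. The names sum_below and bench_sum_below are hypothetical, and the exact signature expected by the benchmark runner should be verified against your compiler version.

```c3
// Hypothetical workload: sums the integers 0 .. n - 1.
fn int sum_below(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) total += i;
    return total;
}

// Timed by the benchmark runner because of the @benchmark attribute.
fn void bench_sum_below() @benchmark
{
    sum_below(1000);
}
```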
Customizing A Project¶
This is a description of the configuration options in project.json:
(you can see the full list by executing c3c --list-project-properties)
{
// Language version of C3.
"langrev": "1",
// Warnings used for all targets.
"warnings": [ "no-unused" ],
// Directories where C3 library files may be found.
"dependency-search-paths": [ "lib" ],
// Libraries to use for all targets.
"dependencies": [ ],
// Authors, optionally with email.
"authors": [ "John Doe <john.doe@example.com>" ],
// Version using semantic versioning.
"version": "0.1.0",
// Sources compiled for all targets.
"sources": [ "src/**" ],
// C sources if the project also compiles C sources
// relative to the project file.
// "c-sources": [ "csource/**" ],
// Include directories for C sources relative to the project file.
// "c-include-dirs": [ "csource/include" ],
// Output location, relative to project file.
"output": "../build",
"targets": {
"my_app": {
// Executable or library.
"type": "executable",
// Architecture and OS target.
// You can use 'c3c --list-targets' to list all valid targets,
// "target": "linux-x64",
// Current Target options:
// android-aarch64
// elf-aarch64 elf-riscv32 elf-riscv64 elf-x86 elf-x64 elf-xtensa
// mcu-x86 mingw-x64 netbsd-x86 netbsd-x64 openbsd-x86 openbsd-x64
// freebsd-x86 freebsd-x64 ios-aarch64
// linux-aarch64 linux-riscv32 linux-riscv64 linux-x86 linux-x64
// macos-aarch64 macos-x64
// wasm32 wasm64
// windows-aarch64 windows-x64
// Additional libraries, sources
// and overrides of global settings here.
},
},
// Global settings.
// C compiler if the project also compiles C sources
// defaults to 'cc'.
"cc": "cc",
// C compiler flags
"cflags": "",
// Set the include directories for C sources.
"c-include-dirs": "",
// CPU name, used for optimizations in the LLVM backend.
"cpu": "generic",
// Debug information, may be "none", "full" and "line-tables".
"debug-info": "full",
// FP math behaviour: "strict", "relaxed", "fast".
"fp-math": "strict",
// Link libc and other default libraries.
"link-libc": true,
// Memory environment: "normal", "small", "tiny", "none".
"memory-env": "normal",
// Optimization: "O0", "O1", "O2", "O3", "O4", "O5", "Os", "Oz".
"opt": "O0",
// Code optimization level: "none", "less", "more", "max".
"optlevel": "none",
// Code size optimization: "none", "small", "tiny".
"optsize": "none",
// Relocation model: "none", "pic", "PIC", "pie", "PIE".
"reloc": "none",
// Trap on signed and unsigned integer wrapping for testing.
"trap-on-wrap": false,
// Turn safety checks (contracts, runtime bounds checking, null pointer checks etc.) on or off.
"safe": true,
// Compile all modules together, enables more inlining.
"single-module": true,
// Use / don't use soft float, value is otherwise target default.
"soft-float": false,
// Strip unused code and globals from the output.
"strip-unused": true,
// The size of the symtab, which limits the amount
// of symbols that can be used. Should usually not be changed.
"symtab": 1048576,
// Use the system linker.
"linker": "cc",
// Include the standard library.
"use-stdlib": true,
// Set general level of x64 cpu: "baseline", "ssse3", "sse4", "avx1", "avx2-v1", "avx2-v2", "avx512", "native".
"x86cpu": "native",
// Set max type of vector use: "none", "mmx", "sse", "avx", "avx512", "native".
"x86vec": "sse",
// Enable sanitizer: none, address, memory, thread.
"sanitize": "none",
// Features enabled for all targets.
"features": "",
}
By default, an executable is assumed, but changing the type to "static-lib" or "dynamic-lib"
creates static library and dynamic library targets respectively.
Compilation options¶
The project file contains common settings at the top level that can be overridden by each
target by simply assigning that particular key. So if the top level defines target
to be macos-x64 and the actual target defines it to be windows-x64, then the windows-x64 target will be used for compilation.
Similarly, compiler command line parameters can be used in turn to override the target setting.
targets¶
The list of targets that can be built.
dependencies¶
List of C3 libraries (".c3l") to use when compiling the target.
sources¶
List of source files to compile.
test-sources¶
List of additional source files to compile when running tests.
cc¶
C compiler to use for compiling C sources (if C sources are compiled together with C3 files).
c-sources¶
List of C sources to compile, using the default C compiler.
linker-search-paths¶
This adds paths for the linker to search, when linking normal C libraries.
linked-libraries¶
This is a list of C libraries to link to. The names need to follow the normal
naming standard for how libraries are provided to the system linker.
So, for example, on Linux libraries have names like libfoo.a but when
presented to the linker the name is foo. As an example "linked-libraries": ["curl"]
would on Linux look for the library libcurl.a and libcurl.so in the
paths given by "linker-search-paths".
version¶
Not handled yet.
Version for the library. Will also be provided as a compile time constant.
authors¶
List of authors who are credited with creating and/or working on the project.
These can be accessed as lists using env::AUTHORS, which gives a list of names, and env::AUTHOR_EMAILS, which gives a list of their e-mails (where available).
Each entry is expected to be in the format "First Last <e-mail>", for example "John Doe <[email protected]>".
langrev¶
Not handled yet.
The language revision to use.
features¶
This is a list of upper-case constants that can be tested for
in the source code using $feature(NAME_OF_FEATURE).
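A minimal sketch of such a test, assuming a hypothetical feature named MY_FEATURE has been listed under "features" in project.json:

```c3
import std::io;

fn void main()
{
    // MY_FEATURE is a hypothetical feature name used for illustration.
    $if $feature(MY_FEATURE):
        io::printn("MY_FEATURE is enabled");
    $else
        io::printn("MY_FEATURE is not enabled");
    $endif
}
```

Because the check happens at compile time, only one of the two branches ends up in the compiled program.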
warnings¶
Not completely supported yet.
List of warnings to enable during compilation.
opt¶
Optimization setting: O0, O1, O2, O3, O4, O5, Os, Oz.
Target options¶
type¶
This mandatory option should be one of:
- "executable" - a normal executable application.
- "dynamic-lib" - a dynamic library.
- "static-lib" - a static library.
- "benchmark" - target that only runs benchmarks.
- "test" - target that only runs tests.
- "object-files" - compile to object files, but does not perform any linking.
- "prepare" - target that does not perform any compilation, but may do things like invoking other scripts using "exec".
Using environment variables¶
Not supported yet.
In addition to constants, any values starting with $ will be assumed to be environment variables.
For example "$HOME" would on Unix-like systems (e.g. Linux, the BSDs, Mac) return the home directory. For strings that start with $ but should not be interpreted as an environment variable you need to escape it with a backslash (\). For example, the string "\$HOME" would be interpreted as the plain string "$HOME".
Language Rules
Implicit Conversions
Conversion Rules For C3¶
C3 differs in some crucial respects when it comes to number conversions and promotions. These are the rules for C3:
- float to int conversions require a cast.
- int to float conversions do not require a cast.
- bool to float converts to 0.0 or 1.0.
- Widening float conversions are only conditionally allowed*.
- Narrowing conversions require a cast*.
- Widening int conversions are only conditionally allowed*.
- Signed <-> unsigned conversions of the same type do not require a cast. Note: As of 0.8.0 it requires a cast.
- In conditionals float to bool does not require a cast; any non-zero float value is considered true.
- Implicit conversion to bool only occurs in conditionals or when the value is enclosed in (), e.g. bool x = (1.0) or if (1.0) { ... }
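A few of these rules can be sketched as follows. This is an illustrative example only; the variable names are made up, and the signed/unsigned rule is left out because it differs between versions:

```c3
import std::io;

fn void main()
{
    int i = 3;
    float f = i;          // int to float: no cast required.
    int j = (int)2.7f;    // float to int: an explicit cast is required.

    // In a conditional, a non-zero float is considered true.
    if (f) io::printn("f is non-zero");

    io::printfn("%s %s", f, j);
}
```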
C3 uses two's complement arithmetic for all integer math.
Note
These abbreviations are used in the text below:
- "lhs" means "left hand side".
- "rhs" means "right hand side".
Target type¶
The left hand side of an assignment, or the parameter type in a call is known as the target type. The target type is used for implicit widening and inferring struct initialization.
Common arithmetic promotion¶
Like C, C3 uses implicit arithmetic promotion of integer and floating point variables before arithmetic operations:
- For any floating point type with a bitwidth smaller than 32 bits, widen to float. For example, in C3 float16 converts to float before arithmetic is performed.
- For an integer type smaller than the minimum arithmetic width, promote the value to a same-signed integer of the minimum arithmetic width. This usually corresponds to a C int or unsigned int. For example, in C3 ushort converts to uint before arithmetic is performed.
Implicit narrowing¶
An expression with an integer type may implicitly narrow to a smaller integer type, and similarly a float type may implicitly narrow to a less wide floating point type. This is determined from the following algorithm:
- Shifts and assign look at the lhs expression.
- ++, --, ~, -, !!, ! - check the inner type.
- +, -, *, /, %, ^, |, &, ??, ?: - check both lhs and rhs.
- Narrowing int/float cast, assume the type is the narrowed type.
- Widening int/float cast, look at the inner expression, ignoring the cast.
- In the case of any other cast, assume it is opaque and the type is that of the cast.
- In the case of an integer literal, instead of looking at the type, check that the integer would fit the type to narrow to.
- For .len access, allow narrowing to C int width.
- For all other expressions, check against the size of the type.
As a rough guide: if all the sub expressions originally are small enough, it's ok to implicitly convert the result.
Examples:
float16 h = 12.0;
float f = 13.0;
double d = 22.0;
char x = 1;
short y = -3;
int z = 0xFFFFF;
ulong w = -0xFFFFFFF;
x = x + x; // => Calculated as `x = (char)((int)x + (int)x);`.
x = y + x; // => Error. Narrowing is not allowed because short y is wider than char x.
h = x * h; // => Calculated as `h = (float16)((float)x * (float)h);`.
h = f + x; // => Error. Narrowing is not allowed because float f is wider than float16 h.
Implicit widening¶
Unlike C, implicit widening will only happen on "simple expressions": if the expression is a primary expression, or a unary operation on a primary expression.
For assignment, special rules hold. For an assignment to a binary expression, if its two subexpressions are "simple expressions" and the binary expression is +, -, /, *, allow an implicit promotion of the two sub expressions.
int a = ...
short b = ...
char c = ...
long d = a; // Valid - simple expression.
int e = (int)(d + (a + b)); // Error
int f = (int)(d + ~b); // Valid
long g = a + b; // Valid
As a rule of thumb, if there is more than one possible conversion, then an explicit cast is needed.
Example:
long h = a + (b + c);
// Possible intention 1
long h = (long)(a + (b + c));
// Possible intention 2
long h = (long)a + (long)(b + c);
// Possible intention 3
long h = (long)a + ((long)b + (long)c);
Maximum type¶
The maximum type is a concept used when unifying two or more types. The algorithm follows:
- First perform implicit promotion.
- If both types are the same, the maximum type is this type.
- If one type is a floating point type, and the other is an integer type, the maximum type is the floating point type. E.g. int + float -> float.
- If both types are floating point types, the maximum type is the widest floating point type. E.g. float + double -> double.
- If both types are integer types with the same signedness, the maximum type is the widest integer type of the two. E.g. uint + ulong -> ulong.
- 0.7.0: If both types are integer types with different signedness, the maximum type is a signed integer with the same bit width as the maximum integer type. E.g. ulong + int -> long. 0.8.0: Compare the two types to the list: ichar, char, short, ushort, int, uint, long, ulong, int128, uint128. The max type is the one furthermost to the right in the list. Consequently ulong + int -> ulong, uint + int -> uint.
- If at least one side is a struct or a pointer to a struct with an inline directive on a member, check recursively if the type of the inline member can be used to find a maximum type (see below under substruct conversions).
- All other cases are errors.
Substruct conversions¶
A substruct may be used in place of its parent struct in many cases. The rules are as follows:
- A substruct pointer may implicitly convert to a parent struct pointer.
- A substruct value may be implicitly assigned to a variable with the parent struct type. This will truncate the value, copying only the parent part of the substruct. However, a parent struct value cannot be assigned to a variable of the substruct type.
- Substruct slices and arrays cannot be cast (implicitly or explicitly) to an array of the parent struct type.
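The first two rules can be sketched as below. The struct and function names (Parent, Child, get_id) are hypothetical and only serve to illustrate the conversions:

```c3
import std::io;

struct Parent
{
    int id;
}

struct Child
{
    inline Parent base; // Makes Child a substruct of Parent.
    int extra;
}

fn int get_id(Parent* p)
{
    return p.id;
}

fn void main()
{
    Child c;
    c.id = 7;            // Fields of the inline member are accessible directly.
    c.extra = 1;

    Parent* pp = &c;     // Substruct pointer converts implicitly.
    Parent p = c;        // Copies only the Parent part of c.

    io::printfn("%d %d %d", get_id(&c), pp.id, p.id);
}
```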
Pointer conversions¶
Pointer conversion between types usually needs explicit casts.
The exception is void* which any type may implicitly convert to or from.
Conversion rules from and to arrays are detailed under arrays.
Vector conversions¶
Vector conversions always need to be explicit. They work
as regular conversions with one notable exception: converting a true boolean
vector value into an int will yield a value with all bits set. So bool[<2>] { true, false }
converted to for example char[<2>] will yield { 255, 0 }.
Vectors can also implicitly be cast to the corresponding array type, for example: char[<2>] <=> char[2].
Binary conversions¶
1. Multiplication, division, remainder, subtraction / addition with both operands being numbers¶
These operations are only valid for integer and float types.
- Resolve the operands.
- Find the maximum type of the two operands.
- Promote both operands to the resulting type if both are simple expressions.
- The resulting type of the expression is the maximum type.
2. Addition with left side being a pointer¶
- Resolve the operands.
- If the rhs is not an integer, this is an error.
- If the rhs has a bit width that exceeds isz, this is an error.
- The result of the expression is the lhs type.
3. Subtraction with lhs pointer and rhs integer¶
- Resolve the operands.
- If the right hand type has a bit width that exceeds isz, this is an error.
- The result of the expression is the left hand type.
4. Subtraction with both sides pointers¶
- Resolve the operands.
- If either side is a void*, it is cast to the other type.
- If the types of the sides are different, this is an error.
- The result of the expression is isz.
- If this result exceeds the target width, this is an error.
6. Bit operations ^ & |¶
These operations are only valid for integers and booleans.
- Resolve the operands.
- Find the maximum type of the two operands.
- Promote both operands to the maximum type if they are simple expressions.
- The result of the expression is the maximum type.
7. Shift operations << >>¶
These operations are only valid for integers.
- Resolve the operands.
- In safe mode, insert a trap to ensure that rhs >= 0 and rhs < bit width of the left hand side.
- The result of the expression is the lhs type.
8. Assignment operations += -= *= /= %= ^= |= &=¶
- Resolve the lhs.
- Resolve the right operand as an assignment rhs.
- The result of the expression is the lhs type.
9. Assignment shift >>= <<=¶
- Resolve both operands.
- In safe mode, insert a trap to ensure that rhs >= 0 and rhs < bit width of the left hand side.
- The result of the expression is the lhs type.
10. && and ||¶
- Resolve both operands.
- Insert bool cast of both operands.
- The type is bool.
11. <= == >= !=¶
- Resolve the operands, left to right.
- Find the maximum type of the two operands.
- Promote both operands to the maximum type.
- The type is bool.
Unary conversions¶
1. Bit negate¶
- Resolve the inner operand.
- If the inner type is not an integer this is an error.
- The type is the inner type.
2. Boolean not¶
- Resolve the inner operand.
- The type is bool.
3. Negation¶
- Resolve the inner operand.
- If the inner type is not a number, then this is an error.
- If the inner type is an unsigned integer, cast it to the same signed type.
- The type is the type of the result from (3).
4. & and &&¶
- Resolve the inner operand.
- The type is a pointer to the type of the inner operand.
5. *¶
- Resolve the inner operand.
- If the operand is not a pointer, or is a void* pointer, this is an error.
- The type is the pointee of the inner operand's type.
Dereferencing 0 is implementation defined.
6. ++ and --¶
- Resolve the inner operand.
- If the type is not a number, this is an error.
- The type is the same as the inner operand.
Base expressions¶
1. Typed identifiers¶
- The type is that of the declaration.
- If the width of the type is less than that of the target type, widen to the target type.
- If the width of the type is greater than that of the target type, it is an error.
2. Constants and literals¶
- If the constant is an integer, it is assumed to be the arithmetic promotion width and signed. Suffixes imply the following: 'u' - unsigned, 'ul' - unsigned 64-bit, 'ull' - unsigned 128-bit, 'l' - signed 64-bit, 'll' - signed 128-bit. If a constant does not fit in the arithmetic promotion width, the following rules apply: if decimal, promote to the smallest signed integer able to contain it, if hex, binary or octal, promote to the smallest signed or unsigned integer able to contain it.
- If the constant is a floating point value, it is assumed to be a double unless suffixed with f, in which case it is assumed to be a float.
Precedence
Precedence rules in C3 differ from C/C++. Here are all precedence levels in C3, listed from highest (1) to lowest (11 in 0.7.x, 12 in 0.8+):
Version 0.7.x¶
1. (), [], ., !! postfix !, ++ and --
2. prefix -, ~, prefix *, &, prefix ++ and --
3. infix *, /, %
4. <<, >>
5. ^, |, infix &
6. +, infix -, +++
7. ==, !=, >=, <=, >, <
8. &&, &&&
9. ||, |||
10. ternary ?: ??
11. =, *=, /=, %=, +=, -=, <<=, >>=, &=, ^=, |=
Version 0.8+¶
1. (), [], ., !! postfix !, !!, ~, ++ and --
2. prefix -, ~, !, !!, *, &, ++, --
3. infix *, /, %
4. <<, >>
5. ^, |, infix &
6. ?:, ??
7. +, infix -, +++
8. ==, !=, >=, <=, >, <
9. &&, &&&
10. ||, |||
11. ?
12. =, *=, /=, %=, +=, -=, <<=, >>=, &=, ^=, |=
The main difference is that bitwise operations and shifts have higher precedence than addition/subtraction and multiplication/division in C3. Bitwise operations also have higher precedence than the relational operators. Also, there is no difference in precedence between the bitwise operators.
Examples:
a + b >> c + d
(a + b) >> (c + d) // C (+ - are evaluated before >>)
a + (b >> c) + d // C3 (>> is evaluated before + -)
a & b == c
a & (b == c) // C (bitwise operators are evaluated after relational)
(a & b) == c // C3 (bitwise operators are evaluated before relational)
a > b == c < d
(a > b) == (c < d) // C (< > binds tighter than ==)
((a > b) == c) < d // C3 Error, requires parenthesis!
a | b ^ c & d
a | ((b ^ c) & d) // C (All bitwise operators have different precedence)
((a | b) ^ c) & d // C3 Error, requires parenthesis!
The change in precedence of the bitwise operators corrects a long standing issue in the C specification. The change in precedence for shift operations goes towards making the precedence less surprising.
Conflating the precedence of relational and equality operations, and of all bitwise operations, was motivated by simplification: few remember the exact internal differences in precedence between bitwise operators. Parentheses are required for those conflated levels of precedence.
Left-to-right offers a very simple model to think about the internal order of operations, and encourages use of explicit ordering, as best practice in C is to use parentheses anyway.
Undefined Behaviour
Like C, C3 has undefined behaviour. Unlike C, C3 will trap (that is, print an error trace and abort) on undefined behaviour in debug builds. This is similar to using C with a UB sanitizer. It is only during release builds that actual undefined behaviour occurs.
In C3, undefined behaviour means that the compiler is free to assume that the undefined behaviour cannot occur.
In the example below:
The case of x == 0 would invoke undefined behaviour for 255/x. For that reason,
the compiler may assume that x != 0 and compile it into the following code:
As a contrast, the safe build will compile code equivalent to the following.
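The three variants described above can be sketched roughly as follows. This is an illustration written for this section, not the compiler's actual output; the function names are hypothetical, and the assert merely stands in for the trap a safe build inserts:

```c3
import std::io;

// The code as written: 255 / x is undefined behaviour
// in release builds when x == 0.
fn bool as_written(uint x)
{
    uint z = 255 / x;
    io::printfn("%d", z);
    return x != 0;
}

// What a release build may effectively produce:
// the compiler assumes x != 0, so the result is always true.
fn bool as_release_build(uint x)
{
    io::printfn("%d", 255 / x);
    return true;
}

// What a safe build is roughly equivalent to:
// check for the undefined case first, then continue.
fn bool as_safe_build(uint x)
{
    assert(x != 0, "Division by zero");
    io::printfn("%d", 255 / x);
    return true;
}

fn void main()
{
    io::printn(as_written(5));
    io::printn(as_release_build(5));
    io::printn(as_safe_build(5));
}
```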
List of undefined behaviours¶
The following operations cause undefined behaviour in release builds of C3:
| operation | will trap in safe builds |
|---|---|
| int / 0 | Yes |
| int % 0 | Yes |
| reading explicitly uninitialized memory | Possible* |
| array index out of bounds | Yes |
| dereference null | Yes |
| dereferencing memory not allocated | Possible* |
| dereferencing memory outside of its lifetime | Possible* |
| casting pointer to the incorrect array | Possible* |
| violating pre or post conditions | Yes |
| violating asserts | Yes |
| reaching unreachable() code | Yes |
* "Possible" indicates trapping is implementation dependent.
List of implementation dependent behaviours¶
Some behaviour is allowed to differ between implementations and platforms.
| operation | will trap in safe builds | permitted behaviour |
|---|---|---|
| comparing pointers of different provenance | Optional | Any result |
| subtracting pointers of different provenance | Optional | Any result |
| shifting by more or equal to the bit width | Yes | Any result |
| shifting by negative amount | Yes | Any result |
| conversion floating point <-> integer type is out of range | Optional | Any result |
| conversion between pointer types produces one with incorrect alignment | Optional | Any result / Error |
| calling a function through a function pointer that does not match the function | Optional | Any result / Error |
| attempt to modify a string literal | Optional | Partial modification / Error |
| modifying a const variable | Optional | Partial modification / Error |
List of undefined behaviour in C, which is defined in C3¶
Signed Integer Overflow¶
Signed integer overflow always wraps using two's complement.
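A small sketch of the wrapping behaviour, assuming default settings (in particular, that trapping on wrap has not been enabled for the build):

```c3
import std::io;

fn void main()
{
    int x = int.max;
    x += 1; // Wraps around to int.min instead of being undefined.
    io::printfn("%s", x == int.min);
}
```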
Modifying the intermediate results of an expression¶
Behaves as if the intermediate result was stored in a variable on the stack.
Misc Advanced
Inline Assembly
C3 provides two ways to insert inline assembly code: asm strings and asm blocks.
Assembly strings¶
This form takes a single compile time string and passes it directly to the underlying backend without any changes.
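A minimal sketch of an asm string, assuming a target where "nop" is a valid instruction (such as x86, aarch64 or RISC-V):

```c3
fn void main()
{
    // The string is handed to the backend unchanged.
    asm("nop");
}
```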
Assembly blocks¶
Assembly blocks use a common grammar for all types of processors. C3's asm implementation assumes that all assembly statements can be reduced to variations of the following general format:
Where an arg is:
- An identifier, e.g. FOO, x.
- A numeric constant, e.g. 1, 0xFF etc.
- A register name (always lower case with a '$' prefix), e.g. $eax, $r7.
- The address of a variable, e.g. &x.
- An indirect address: [addr] or [addr + index * <const> + offset].
- Any expression inside of "()" (will be evaluated before entering the asm block).
An example:
int aa = 3;
int g;
int* gp = &g;
int* xa = &aa;
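usz asf = 1;
int x;  // Assumed declaration: x is used by the block below.
long z; // Assumed declaration: z is used by the block below.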
asm
{
movl x, 4; // Move 4 into the variable x
movl [gp], x; // Move the value of x into the address in gp
movl x, 1; // Move 1 into x
movl [xa + asf * 4 + 4], x; // Move x into the address at xa[asf + 1]
movl $eax, (23 + x); // Move 23 + x into EAX
movl x, $eax; // Move EAX into x
movq [&z], 33; // Move 33 into the memory address of z
}
Note
The current state of inline asm is a work in progress. Only a subset of x86, aarch64 and riscv instructions are available, and other platforms have no support at all. It is likely that the grammar will be extended as more architectures are supported. More instructions can be added as they are needed, so please file issues when you encounter missing instructions you need.
Builtins
The compiler offers builtin constants and functions. Some are only available on certain targets. All builtins use the $$ name prefix.
Builtin constants¶
These constants are generated by the compiler and can safely be used by the user.
$$BENCHMARK_NAMES¶
An array of names of the benchmark functions as a String[].
The program must be run in benchmark mode (e.g. via the c3c benchmark shell command) for this array to be non-empty.
$$BENCHMARK_FNS¶
An array of addresses to the benchmark functions as a void*[].
The program must be run in benchmark mode (e.g. via the c3c benchmark command) for this array to be non-empty.
$$DATE¶
The current date (year, month, day) as a String.
In contrast, to retrieve the time of day (hours, minutes, seconds) try using $$TIME.
$$FILE¶
The current source code file name (not including any of the path) as a String.
$$FILEPATH¶
The full ("absolute") path to the current source code file as a String.
$$FUNC¶
The current function name as a String.
This will return "<GLOBAL>" if used on the global level (outside any function), such as via String global_func_name = $$FUNC;, because there is no corresponding function name in that case.
$$FUNCTION¶
The current function as an identifier, as if its name were written in place of $$FUNCTION.
As such, it may be queried for associated info (e.g. $$FUNCTION.nameof, $typeof($$FUNCTION), etc) or assigned to a function pointer and later called, etc. Thus, more info than just a String function name may be accessed this way, in contrast to $$FUNC.
$$LINE¶
The current line as an integer.
$$LINE_RAW¶
Usually the same as $$LINE, but in case of a macro inclusion it returns the line in the macro rather than
the line where the macro was included.
$$MODULE¶
The current module name as a String.
Keep in mind that there can be multiple modules per file in C3 if multiple module sections are used. In contrast, for a per file name try $$FILE or $$FILEPATH.
$$TIME¶
The current time of day (hours, minutes, seconds) as a String.
In contrast, to retrieve the calendar day (year, month, day) try using $$DATE.
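As a brief illustration, several of these constants can be combined into a simple location message. This is a sketch only; the message format is arbitrary:

```c3
import std::io;

fn void main()
{
    // $$FILE, $$FUNC and $$MODULE are Strings, $$LINE is an integer.
    io::printfn("%s:%d in %s (module %s)", $$FILE, $$LINE, $$FUNC, $$MODULE);
}
```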
Compiler builtin functions¶
The $$ namespace defines compiler builtin functions.
These special functions are not guaranteed to exist on
all platforms, and are ways to wrap compiler implemented, optimized implementations
of some particular functionality. They are mainly intended for standard
library internal use. The standard library has macros
that wrap these builtins, so they should normally not be used on their own.
$$trap¶
Emits a trap instruction.
$$unreachable¶
Inserts an "unreachable" annotation.
$$stacktrace¶
Returns the current "callstack" reference if available. OS and compiler dependent.
$$volatile_store¶
Takes a variable and a value and stores the value as a volatile store.
$$volatile_load¶
Takes a variable and returns the value using a volatile load.
$$memcpy¶
Builtin memcpy instruction.
$$memset¶
Builtin memset instruction.
$$prefetch¶
Prefetch a memory location.
$$sysclock¶
Access to the cycle counter register (or similar low latency clock) on supported
architectures (e.g. RDTSC on x86), otherwise $$sysclock will yield 0.
$$syscall¶
Makes a syscall according to the platform convention on platforms where it is supported.
Math functions¶
Functions $$ceil, $$trunc, $$sin, $$cos, $$log, $$log2, $$log10, $$rint, $$round, $$roundeven, $$floor, $$sqrt, $$pow, $$exp, $$fma, $$fabs, $$copysign and $$nearbyint can be applied to float vectors or numbers. They return the same type as their argument.
Functions $$min, $$abs and $$max can be applied to any integer or float number or vector.
Function $$pow_int takes a floating point value or vector and an integer and returns
the same type as the first parameter.
Saturated addition, subtraction and left shift for integers and integer vectors:
$$sat_add, $$sat_shl, $$sat_sub.
Bit functions¶
$$fshl and $$fshr¶
Funnel shift left and right, takes either two integers or two integer vectors.
$$ctz, $$clz, $$bitreverse, $$bswap, $$popcount¶
Bit functions work on an integer or an integer vector.
Vector functions¶
$$reduce_add, $$reduce_mul, $$reduce_and, $$reduce_or, $$reduce_xor work on integer vectors.
$$reduce_fadd, $$reduce_fmul work on float vectors.
$$reduce_max, $$reduce_min work on any vector.
$$reverse reverses the values in any vector.
$$shufflevector rearranges the values of two vectors using a fixed mask into
a resulting vector.
Debugging
C3 provides several powerful features and compiler flags to help identify memory corruption, logic errors, and performance bottlenecks.
Virtual Memory Temp Allocator (VMEM_TEMP)¶
The temporary allocator (tmem) is extremely fast but can lead to "use-after-scope" bugs if pointers to temporary data are stored in globals or long-lived structs.
To debug these issues, you can enable the Virtual Memory tracking mode by passing the -D VMEM_TEMP flag to the compiler (or adding "VMEM_TEMP" to your project.json features).
How it works:¶
When VMEM_TEMP is enabled:
1. Hardware Protection: The allocator uses the OS virtual memory system (MMU) to manage pages.
2. Instant Crash: When a @pool or test scope ends, the memory pages are removed or marked as protected. Any attempt to access "dead" temporary data will cause an immediate Segfault.
3. Large Address Space: It reserves a wide virtual address range (typically 4GB) to ensure allocations don't overlap, making corruption much easier to catch.
Tip
If your program is crashing with a Segfault only when -D VMEM_TEMP is enabled, look for pointers pointing into tmem that are being accessed after the @pool that created them has closed.
Backtraces¶
In Safe Mode (default), C3 automatically generates detailed backtraces when a panic or crash occurs.
Manual Backtraces:¶
You can capture a backtrace at any time as a string:
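import std::io;
import std::os::backtrace;
fn void log_stack() {
// '!!' panics if the backtrace cannot be captured; a plain '!' would
// require log_stack itself to return an Optional.
String bt = backtrace::get(tmem)!!;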
io::eprint(bt);
}
Sanitizers¶
C3 supports integration with LLVM's Address Sanitizer (ASAN) and Thread Sanitizer (TSAN).
Address Sanitizer (ASAN)¶
To enable ASAN, compile with:
ASAN will detect:
- Out-of-bounds access to heap, stack, and globals.
- Use-after-free bugs.
- Memory leaks.
Thread Sanitizer (TSAN)¶
For multi-threaded applications, TSAN helps find data races:
Tracking Allocator¶
The TrackingAllocator is a wrapper that can be placed around any other allocator to detect memory leaks and capture backtraces for every allocation.
fn void main() {
TrackingAllocator tracker;
tracker.init(mem); // Wrap the default 'mem' allocator
defer tracker.free();
Allocator a = &tracker;
// Use 'allocator::new' to pass a specific allocator:
int* p = allocator::new(a, int);
// If not freed, tracker.print_report() will show any leaks.
tracker.print_report();
}
Allocation Tracking Macros¶
For convenience, C3 provides macros to automatically wrap a block of code with a tracking allocator.
@report_heap_allocs_in_scope¶
This macro runs the enclosed code and automatically prints a full memory report at the end of the scope.
@assert_leak¶
Similar to the report macro, but instead of just printing, it will assert that no memory has leaked. If leaks are found, it triggers a panic with a report.
Note
This macro only performs tracking and assertions if debug symbols are enabled or the -D MEMORY_ASSERTS feature flag is used. Otherwise, it executes the code block normally with no overhead.
fn void main() {
@assert_leak()
{
// code that should not leak
void* p = mem::malloc(64);
mem::free(p);
};
}
Testing Macros¶
C3 includes a built-in testing framework in std::core::test. These macros provide descriptive failure messages, stringifying the expressions being tested.
fn void test_math() @test {
int x = 10;
int y = 20;
test::eq(x + y, 40);
// Test failed ^^^ ( example.c3:4 ) `30` != `40`
}
Assertions and Unreachable¶
assert¶
Used for runtime checks that should always be true.
- Safe Mode: Triggers a panic with a backtrace if the condition fails.
- Unchecked Mode: The condition is assumed to always be true, generating an LLVM unreachable instruction. This is an optimization hint telling the compiler this path is impossible.
Note
Use @assert_always as a drop-in replacement if the assertion should also be checked in Unchecked Mode.
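A minimal sketch of assert in use; the function half and its message are made up for illustration:

```c3
import std::io;

fn int half(int x)
{
    // In Safe Mode a failing condition panics with this message.
    assert(x % 2 == 0, "x must be even");
    return x / 2;
}

fn void main()
{
    io::printfn("%d", half(10));
}
```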
unreachable¶
Marks a code path that logically should never be hit.
- Safe Mode: Triggers a panic with the provided message and a backtrace.
- Unchecked Mode: Generates an LLVM unreachable instruction. This is an optimization hint telling the compiler this path is impossible. If the path is actually reached, the program will have undefined behavior (which often manifests as a crash or very strange execution state).
switch (state) {
case START: // ...
case END: // ...
default: unreachable("Invalid state encountered");
}
Contracts¶
C3 supports Contracts using the @require and @ensure attributes. These are checked in Safe Mode.
- @require: Pre-conditions that must be true when the function is called.
- @ensure: Post-conditions that must be true when the function returns.
<*
@require b != 0 : "Divisor must not be zero"
@ensure return == a / b
*>
fn float divide(float a, float b)
{
return a / b;
}
If a contract is violated in safe mode, the program panics with a descriptive message and a backtrace.
Safe vs. Unchecked Mode¶
Understanding the difference between modes is crucial for debugging:
| Feature | Safe Mode (-O0, -O1) | Unchecked Mode (-O2+) |
|---|---|---|
| Bounds Checking | Enabled | Disabled |
| Null Checks | Enabled | Disabled |
| Contracts | Evaluated | Ignored |
| Backtraces | Generated | Optional/None |
| Zero-Init | Guaranteed | Guaranteed |
Always perform your primary development and testing in Safe Mode. Switch to Unchecked Mode only for final releases or performance profiling once the logic is verified.
Library Packaging
Note that the library system is in early alpha. Everything below is subject to change.
C3 allows convenient packaging of C3 source files optionally with statically or dynamically linked libraries. To use such a library, simply pass the path to the library directory and add the library you wish to link to. The compiler will resolve any dependencies to other libraries and only compile those that are in use.
How it works¶
A library may be used either packaged or unpacked. If unpacked, it is simply a directory with the .c3l suffix, which contains all the necessary files. If packed, it is simply a compressed variant of a directory with the same structure.
The specification¶
At the top level of the library resides the manifest.json file, which has the following structure:
{
"provides" : "my_lib",
"execs" : [],
"targets" : {
"macos-x64" : {
"linkflags" : [],
"dependencies" : [],
"linked-libs" : ["my_lib_static", "Cocoa.framework", "c"]
},
"windows-x64" : {
"linkflags" : ["/stack:65536"],
"dependencies" : ["ms_my_extra"],
"linked-libs" : ["my_lib_static", "kernel32"],
"execs" : []
}
}
}
In the example above, this library supports two targets: macos-x64 and windows-x64. If we tried to use it with any other target, the compiler would give an error.
We see that if we use the windows-x64 target it will also load the ms_my_extra library. We also see that the linker would have a special argument on that platform.
Both targets expect my_lib_static to be available for linking. If this library provides this static or dynamic library, it will be in the target sub-directories, so it likely has the path windows-x64/my_lib_static.lib or macos-x64/libmy_lib_static.a.
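As a rough illustration of the lookup described above, the following Python sketch loads a trimmed copy of the example manifest and resolves the settings for one target. It is only a model: `resolve_target` is a hypothetical helper, not part of the C3 toolchain, and the compiler's actual resolution logic may differ.

```python
import json

# Trimmed copy of the example manifest above (the "execs" keys are omitted).
MANIFEST = """
{
  "provides": "my_lib",
  "targets": {
    "macos-x64": {
      "linkflags": [],
      "dependencies": [],
      "linked-libs": ["my_lib_static", "Cocoa.framework", "c"]
    },
    "windows-x64": {
      "linkflags": ["/stack:65536"],
      "dependencies": ["ms_my_extra"],
      "linked-libs": ["my_lib_static", "kernel32"]
    }
  }
}
"""


def resolve_target(manifest_text, target):
    """Return the settings for `target`, or raise if the library lacks it."""
    manifest = json.loads(manifest_text)
    targets = manifest.get("targets", {})
    if target not in targets:
        # Mirrors the rule that using an unsupported target is an error.
        raise ValueError(
            f"library '{manifest['provides']}' does not support target '{target}'"
        )
    return targets[target]


windows = resolve_target(MANIFEST, "windows-x64")
print(windows["dependencies"])  # ['ms_my_extra']
print(windows["linkflags"])     # ['/stack:65536']
```

Asking for a target that is not listed, such as a hypothetical "linux-x64", raises an error in this model, matching the behaviour described for unsupported targets.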
Source code¶
Aside from the manifest, C3 will read any C and C3 files in the same directory as manifest.json, as well as any files in the target subdirectory for the current target. For static libraries,
typically a .c3i file (that is, a C3 file without any implementations) is provided, similar to
how .h files are used in C.
Additional actions¶
"exec", which is available both at the top level and per-target, lists the scripts which will be
invoked when a library is used. This requires running the compiler at full trust level using the
--trust=full option.
How to – automatically – export libraries¶
This feature is not implemented yet. The documentation for this feature will materialize once it is finished.
Implementation Details
Grammar
Keywords¶
The following are reserved keywords used by C3:
void bool char double
float float16 int128 ichar
int iptr isz long
short uint128 uint ulong
uptr ushort usz float128
any fault typeid assert
asm bitstruct break case
catch const continue alias
default defer typedef do
else enum extern false
for foreach foreach_r fn
tlocal if inline import
macro module nextcase null
return static struct switch
true try union var
while attrdef
$alignof $assert $case $default
$defined $echo $embed $exec
$else $endfor $endforeach $endif
$endswitch $eval $evaltype $error
$extnameof $for $foreach $if
$include $nameof $offsetof $qnameof
$sizeof $stringify $switch $typefrom
$typeof $vaarg
The following attributes are built in:
@align @benchmark @bigendian @builtin
@cdecl @cname @deprecated @dynamic
@export @extname @inline @interface
@littleendian @local @maydiscard @mustinit
@naked @nodiscard @noinit @noinline
@noreturn @nostrip @obfuscate @operator
@overlap @packed @priority @private
@public @pure @reflect @section
@stdcall @test @unused @used
@veccall @wasm @weak @winmain
The following constants are defined:
$$BENCHMARK_FNS $$BENCHMARK_NAMES $$DATE
$$FILE $$FILEPATH $$FUNC
$$FUNCTION $$LINE $$LINE_RAW
$$MODULE $$TEST_FNS $$TEST_NAMES
$$TIME
Yacc grammar¶
%{
#include <stdio.h>
#define YYERROR_VERBOSE
int yydebug = 1;
extern char yytext[];
extern int column;
int yylex(void);
void yyerror(char *s);
%}
%token IDENT HASH_IDENT CT_IDENT CONST_IDENT
%token TYPE_IDENT CT_TYPE_IDENT
%token AT_TYPE_IDENT AT_IDENT CT_INCLUDE
%token STRING_LITERAL INTEGER
%token INC_OP DEC_OP SHL_OP SHR_OP LE_OP GE_OP EQ_OP NE_OP
%token AND_OP OR_OP MUL_ASSIGN DIV_ASSIGN MOD_ASSIGN ADD_ASSIGN
%token SUB_ASSIGN SHL_ASSIGN SHR_ASSIGN AND_ASSIGN
%token XOR_ASSIGN OR_ASSIGN VAR NUL ELVIS NEXTCASE ANYFAULT
%token MODULE IMPORT DEF EXTERN
%token CHAR SHORT INT LONG FLOAT DOUBLE CONST VOID USZ ISZ UPTR IPTR ANY
%token ICHAR USHORT UINT ULONG BOOL INT128 UINT128 FLOAT16 FLOAT128 BFLOAT16
%token TYPEID BITSTRUCT STATIC BANGBANG AT_CONST_IDENT HASH_TYPE_IDENT
%token STRUCT UNION ENUM ELLIPSIS DOTDOT BYTES
%token CT_ERROR
%token CASE DEFAULT IF ELSE SWITCH WHILE DO FOR CONTINUE BREAK RETURN FOREACH_R FOREACH
%token FN FAULT MACRO CT_IF CT_ENDIF CT_ELSE CT_SWITCH CT_CASE CT_DEFAULT CT_FOR CT_FOREACH CT_ENDFOREACH
%token CT_ENDFOR CT_ENDSWITCH BUILTIN IMPLIES INITIALIZE FINALIZE CT_ECHO CT_ASSERT CT_EVALTYPE CT_VATYPE
%token TRY CATCH SCOPE DEFER LVEC RVEC OPTELSE CT_TYPEFROM CT_TYPEOF TLOCAL
%token CT_VASPLAT INLINE DISTINCT CT_VACONST CT_NAMEOF CT_VAREF CT_VACOUNT CT_VAARG
%token CT_SIZEOF CT_STRINGIFY CT_QNAMEOF CT_OFFSETOF CT_VAEXPR
%token CT_EXTNAMEOF CT_EVAL CT_DEFINED CT_CHECKS CT_ALIGNOF ASSERT
%token ASM CHAR_LITERAL REAL TRUE FALSE CT_CONST_IDENT
%token LBRAPIPE RBRAPIPE HASH_CONST_IDENT
%start translation_unit
%%
path
: IDENT SCOPE
| path IDENT SCOPE
;
path_const
: path CONST_IDENT
| CONST_IDENT
;
path_ident
: path IDENT
| IDENT
;
path_at_ident
: path AT_IDENT
| AT_IDENT
;
ident_expr
: CONST_IDENT
| IDENT
| AT_IDENT
;
local_ident_expr
: CT_IDENT
| HASH_IDENT
;
ct_call
: CT_ALIGNOF
| CT_DEFINED
| CT_EXTNAMEOF
| CT_NAMEOF
| CT_OFFSETOF
| CT_QNAMEOF
;
ct_analyse
: CT_EVAL
| CT_SIZEOF
| CT_STRINGIFY
;
ct_arg
: CT_VACONST
| CT_VAARG
| CT_VAREF
| CT_VAEXPR
;
flat_path
: primary_expr param_path
| type
| primary_expr
;
maybe_optional_type
: optional_type
| empty
;
string_expr
: STRING_LITERAL
| string_expr STRING_LITERAL
;
bytes_expr
: BYTES
| bytes_expr BYTES
;
expr_block
: LBRAPIPE opt_stmt_list RBRAPIPE
;
base_expr
: string_expr
| INTEGER
| bytes_expr
| NUL
| BUILTIN CONST_IDENT
| BUILTIN IDENT
| CHAR_LITERAL
| REAL
| TRUE
| FALSE
| path ident_expr
| ident_expr
| local_ident_expr
| type initializer_list
| type '.' access_ident
| type '.' CONST_IDENT
| '(' expr ')'
| expr_block
| ct_call '(' flat_path ')'
| ct_arg '(' expr ')'
| ct_analyse '(' expr ')'
| CT_VACOUNT
| CT_CHECKS '(' expression_list ')'
| lambda_decl compound_statement
;
primary_expr
: base_expr
| initializer_list
;
range_loc
: expr
| '^' expr
;
range_expr
: range_loc DOTDOT range_loc
| range_loc DOTDOT
| DOTDOT range_loc
| range_loc ':' range_loc
| ':' range_loc
| range_loc ':'
| DOTDOT
;
call_inline_attributes
: AT_IDENT
| call_inline_attributes AT_IDENT
;
call_invocation
: '(' call_arg_list ')'
| '(' call_arg_list ')' call_inline_attributes
;
access_ident
: IDENT
| AT_IDENT
| HASH_IDENT
| CT_EVAL '(' expr ')'
| TYPEID
;
call_trailing
: '[' range_loc ']'
| '[' range_expr ']'
| call_invocation
| call_invocation compound_statement
| '.' access_ident
| INC_OP
| DEC_OP
| '!'
| BANGBANG
;
call_stmt_expr
: base_expr
| call_stmt_expr call_trailing
;
call_expr
: primary_expr
| call_expr call_trailing
;
unary_expr
: call_expr
| unary_op unary_expr
;
unary_stmt_expr
: call_stmt_expr
| unary_op unary_expr
;
unary_op
: '&'
| AND_OP
| '*'
| '+'
| '-'
| '~'
| '!'
| INC_OP
| DEC_OP
| '(' type ')'
;
mult_op
: '*'
| '/'
| '%'
;
mult_expr
: unary_expr
| mult_expr mult_op unary_expr
;
mult_stmt_expr
: unary_stmt_expr
| mult_stmt_expr mult_op unary_expr
;
shift_op
: SHL_OP
| SHR_OP
;
shift_expr
: mult_expr
| shift_expr shift_op mult_expr
;
shift_stmt_expr
: mult_stmt_expr
| shift_stmt_expr shift_op mult_expr
;
bit_op
: '&'
| '^'
| '|'
;
bit_expr
: shift_expr
| bit_expr bit_op shift_expr
;
bit_stmt_expr
: shift_stmt_expr
| bit_stmt_expr bit_op shift_expr
;
additive_op
: '+'
| '-'
;
additive_expr
: bit_expr
| additive_expr additive_op bit_expr
;
additive_stmt_expr
: bit_stmt_expr
| additive_stmt_expr additive_op bit_expr
;
relational_op
: '<'
| '>'
| LE_OP
| GE_OP
| EQ_OP
| NE_OP
;
relational_expr
: additive_expr
| relational_expr relational_op additive_expr
;
relational_stmt_expr
: additive_stmt_expr
| relational_stmt_expr relational_op additive_expr
;
rel_or_lambda_expr
: relational_expr
| lambda_decl IMPLIES relational_expr
;
and_expr
: relational_expr
| and_expr AND_OP relational_expr
;
and_stmt_expr
: relational_stmt_expr
| and_stmt_expr AND_OP relational_expr
;
or_expr
: and_expr
| or_expr OR_OP and_expr
;
or_stmt_expr
: and_stmt_expr
| or_stmt_expr OR_OP and_expr
;
or_expr_with_suffix
: or_expr
| or_expr '~'
| or_expr '~' '!'
;
or_stmt_expr_with_suffix
: or_stmt_expr
| or_stmt_expr '~'
| or_stmt_expr '~' '!'
;
ternary_expr
: or_expr_with_suffix
| or_expr '?' expr ':' ternary_expr
| or_expr_with_suffix ELVIS ternary_expr
| or_expr_with_suffix OPTELSE ternary_expr
| lambda_decl implies_body
;
ternary_stmt_expr
: or_stmt_expr_with_suffix
| or_stmt_expr '?' expr ':' ternary_expr
| or_stmt_expr_with_suffix ELVIS ternary_expr
| or_stmt_expr_with_suffix OPTELSE ternary_expr
| lambda_decl implies_body
;
assignment_op
: '='
| ADD_ASSIGN
| SUB_ASSIGN
| MUL_ASSIGN
| DIV_ASSIGN
| MOD_ASSIGN
| SHL_ASSIGN
| SHR_ASSIGN
| AND_ASSIGN
| XOR_ASSIGN
| OR_ASSIGN
;
empty
:
;
assignment_expr
: ternary_expr
| CT_TYPE_IDENT '=' type
| unary_expr assignment_op assignment_expr
;
assignment_stmt_expr
: ternary_stmt_expr
| CT_TYPE_IDENT '=' type
| unary_stmt_expr assignment_op assignment_expr
;
implies_body
: IMPLIES expr
;
lambda_decl
: FN maybe_optional_type fn_parameter_list opt_attributes
;
expr_no_list
: assignment_stmt_expr
;
expr
: assignment_expr
;
constant_expr
: ternary_expr
;
param_path_element
: '[' expr ']'
| '[' expr DOTDOT expr ']'
| '.' IDENT
;
param_path
: param_path_element
| param_path param_path_element
;
arg : param_path '=' expr
| type
| param_path '=' type
| expr
| CT_VASPLAT '(' range_expr ')'
| CT_VASPLAT '(' ')'
| ELLIPSIS expr
;
arg_list
: arg
| arg_list ',' arg
;
call_arg_list
: arg_list
| arg_list ';'
| arg_list ';' parameters
| ';'
| ';' parameters
| empty
;
opt_arg_list_trailing
: arg_list
| arg_list ','
| empty
;
enum_constants
: enum_constant
| enum_constants ',' enum_constant
;
enum_list
: enum_constants
| enum_constants ','
;
enum_constant
: CONST_IDENT
| CONST_IDENT '(' arg_list ')'
| CONST_IDENT '(' arg_list ',' ')'
;
identifier_list
: IDENT
| identifier_list ',' IDENT
;
enum_param_decl
: type
| type IDENT
| type IDENT '=' expr
;
base_type
: VOID
| BOOL
| CHAR
| ICHAR
| SHORT
| USHORT
| INT
| UINT
| LONG
| ULONG
| INT128
| UINT128
| FLOAT
| DOUBLE
| FLOAT16
| BFLOAT16
| FLOAT128
| IPTR
| UPTR
| ISZ
| USZ
| ANYFAULT
| ANY
| TYPEID
| TYPE_IDENT
| path TYPE_IDENT
| CT_TYPE_IDENT
| CT_TYPEOF '(' expr ')'
| CT_TYPEFROM '(' constant_expr ')'
| CT_VATYPE '(' constant_expr ')'
| CT_EVALTYPE '(' constant_expr ')'
;
type
: base_type
| type '*'
| type '[' constant_expr ']'
| type '[' ']'
| type '[' '*' ']'
| type LVEC constant_expr RVEC
| type LVEC '*' RVEC
;
optional_type
: type
| type '!'
;
local_decl_after_type
: CT_IDENT
| CT_IDENT '=' constant_expr
| IDENT opt_attributes
| IDENT opt_attributes '=' expr
;
local_decl_storage
: STATIC
| TLOCAL
;
decl_or_expr
: var_decl
| optional_type local_decl_after_type
| expr
;
var_decl
: VAR IDENT '=' expr
| VAR CT_IDENT '=' expr
| VAR CT_IDENT
| VAR CT_TYPE_IDENT '=' type
| VAR CT_TYPE_IDENT
;
initializer_list
: '{' opt_arg_list_trailing '}'
;
ct_case_stmt
: CT_CASE constant_expr ':' opt_stmt_list
| CT_CASE type ':' opt_stmt_list
| CT_DEFAULT ':' opt_stmt_list
;
ct_switch_body
: ct_case_stmt
| ct_switch_body ct_case_stmt
;
ct_for_stmt
: CT_FOR '(' for_cond ')' opt_stmt_list CT_ENDFOR
;
ct_foreach_stmt
: CT_FOREACH '(' CT_IDENT ':' expr ')' opt_stmt_list CT_ENDFOREACH
| CT_FOREACH '(' CT_IDENT ',' CT_IDENT ':' expr ')' opt_stmt_list CT_ENDFOREACH
;
ct_switch
: CT_SWITCH '(' constant_expr ')'
| CT_SWITCH '(' type ')'
| CT_SWITCH
;
ct_switch_stmt
: ct_switch ct_switch_body CT_ENDSWITCH
;
var_stmt
: var_decl ';'
;
decl_stmt_after_type
: local_decl_after_type
| decl_stmt_after_type ',' local_decl_after_type
;
declaration_stmt
: const_declaration
| local_decl_storage optional_type decl_stmt_after_type ';'
| optional_type decl_stmt_after_type ';'
;
return_stmt
: RETURN expr ';'
| RETURN ';'
;
catch_unwrap_list
: relational_expr
| catch_unwrap_list ',' relational_expr
;
catch_unwrap
: CATCH catch_unwrap_list
| CATCH IDENT '=' catch_unwrap_list
| CATCH type IDENT '=' catch_unwrap_list
;
try_unwrap
: TRY rel_or_lambda_expr
| TRY IDENT '=' rel_or_lambda_expr
| TRY type IDENT '=' rel_or_lambda_expr
;
try_unwrap_chain
: try_unwrap
| try_unwrap_chain AND_OP try_unwrap
| try_unwrap_chain AND_OP rel_or_lambda_expr
;
default_stmt
: DEFAULT ':' opt_stmt_list
;
case_stmt
: CASE expr ':' opt_stmt_list
| CASE expr DOTDOT expr ':' opt_stmt_list
| CASE type ':' opt_stmt_list
;
switch_body
: case_stmt
| default_stmt
| switch_body case_stmt
| switch_body default_stmt
;
cond_repeat
: decl_or_expr
| cond_repeat ',' decl_or_expr
;
cond
: try_unwrap_chain
| catch_unwrap
| cond_repeat
| cond_repeat ',' try_unwrap_chain
| cond_repeat ',' catch_unwrap
;
else_part
: ELSE if_stmt
| ELSE compound_statement
;
if_stmt
: IF optional_label paren_cond '{' switch_body '}'
| IF optional_label paren_cond '{' switch_body '}' else_part
| IF optional_label paren_cond statement
| IF optional_label paren_cond compound_statement else_part
;
expr_list_eos
: expression_list ';'
| ';'
;
cond_eos
: cond ';'
| ';'
;
for_cond
: expr_list_eos cond_eos expression_list
| expr_list_eos cond_eos
;
for_stmt
: FOR optional_label '(' for_cond ')' statement
;
paren_cond
: '(' cond ')'
;
while_stmt
: WHILE optional_label paren_cond statement
;
do_stmt
: DO optional_label compound_statement WHILE '(' expr ')' ';'
| DO optional_label compound_statement ';'
;
optional_label_target
: CONST_IDENT
| empty
;
continue_stmt
: CONTINUE optional_label_target ';'
;
break_stmt
: BREAK optional_label_target ';'
;
nextcase_stmt
: NEXTCASE CONST_IDENT ':' expr ';'
| NEXTCASE expr ';'
| NEXTCASE CONST_IDENT ':' type ';'
| NEXTCASE type ';'
| NEXTCASE ';'
;
foreach_var
: optional_type '&' IDENT
| optional_type IDENT
| '&' IDENT
| IDENT
;
foreach_vars
: foreach_var
| foreach_var ',' foreach_var
;
foreach_stmt
: FOREACH optional_label '(' foreach_vars ':' expr ')' statement
| FOREACH_R optional_label '(' foreach_vars ':' expr ')' statement
;
defer_stmt
: DEFER statement
| DEFER TRY statement
| DEFER CATCH statement
;
ct_if_stmt
: CT_IF constant_expr ':' opt_stmt_list CT_ENDIF
| CT_IF constant_expr ':' opt_stmt_list CT_ELSE opt_stmt_list CT_ENDIF
;
assert_expr
: try_unwrap_chain
| expr
;
assert_stmt
: ASSERT '(' assert_expr ')' ';'
| ASSERT '(' assert_expr ',' expr ')' ';'
;
asm_stmts
: asm_stmt
| asm_stmts asm_stmt
;
asm_instr
: INT
| IDENT
| INT '.' IDENT
| IDENT '.' IDENT
;
asm_addr
: asm_expr
| asm_expr additive_op asm_expr
| asm_expr additive_op asm_expr '*' INTEGER
| asm_expr additive_op asm_expr '*' INTEGER additive_op INTEGER
| asm_expr additive_op asm_expr shift_op INTEGER
| asm_expr additive_op asm_expr additive_op INTEGER
;
asm_expr
: CT_IDENT
| CT_CONST_IDENT
| IDENT
| '&' IDENT
| CONST_IDENT
| REAL
| INTEGER
| '(' expr ')'
| '[' asm_addr ']'
;
asm_exprs
: asm_expr
| asm_exprs ',' asm_expr
;
asm_stmt
: asm_instr asm_exprs ';'
| asm_instr ';'
;
asm_block_stmt
: ASM '(' expr ')'
| ASM '{' asm_stmts '}'
| ASM '{' '}'
;
/* Order here matches compiler */
statement
: compound_statement
| var_stmt
| declaration_stmt
| return_stmt
| if_stmt
| while_stmt
| defer_stmt
| switch_stmt
| do_stmt
| for_stmt
| foreach_stmt
| continue_stmt
| break_stmt
| nextcase_stmt
| asm_block_stmt
| ct_echo_stmt
| ct_assert_stmt
| ct_if_stmt
| ct_switch_stmt
| ct_foreach_stmt
| ct_for_stmt
| expr_no_list ';'
| assert_stmt
| ';'
;
compound_statement
: '{' opt_stmt_list '}'
;
statement_list
: statement
| statement_list statement
;
opt_stmt_list
: statement_list
| empty
;
switch_stmt
: SWITCH optional_label '{' switch_body '}'
| SWITCH optional_label '{' '}'
| SWITCH optional_label paren_cond '{' switch_body '}'
| SWITCH optional_label paren_cond '{' '}'
;
expression_list
: decl_or_expr
| expression_list ',' decl_or_expr
;
optional_label
: CONST_IDENT ':'
| empty
;
ct_assert_stmt
: CT_ASSERT constant_expr ':' constant_expr ';'
| CT_ASSERT constant_expr ';'
| CT_ERROR constant_expr ';'
;
ct_include_stmt
: CT_INCLUDE string_expr ';'
;
ct_echo_stmt
: CT_ECHO constant_expr ';'
;
bitstruct_declaration
: BITSTRUCT TYPE_IDENT ':' type opt_attributes bitstruct_body
;
bitstruct_body
: '{' '}'
| '{' bitstruct_defs '}'
| '{' bitstruct_simple_defs '}'
;
bitstruct_defs
: bitstruct_def
| bitstruct_defs bitstruct_def
;
bitstruct_simple_defs
: base_type IDENT ';'
| bitstruct_simple_defs base_type IDENT ';'
;
bitstruct_def
: base_type IDENT ':' constant_expr DOTDOT constant_expr ';'
| base_type IDENT ':' constant_expr ';'
;
static_declaration
: STATIC INITIALIZE opt_attributes compound_statement
| STATIC FINALIZE opt_attributes compound_statement
;
attribute_name
: AT_IDENT
| AT_TYPE_IDENT
| path AT_TYPE_IDENT
;
attribute_operator_expr
: '&' '[' ']'
| '[' ']' '='
| '[' ']'
;
attr_param
: attribute_operator_expr
| constant_expr
;
attribute_param_list
: attr_param
| attribute_param_list ',' attr_param
;
attribute
: attribute_name
| attribute_name '(' attribute_param_list ')'
;
attribute_list
: attribute
| attribute_list attribute
;
opt_attributes
: attribute_list
| empty
;
trailing_block_param
: AT_IDENT
| AT_IDENT '(' ')'
| AT_IDENT '(' parameters ')'
;
macro_params
: parameters
| parameters ';' trailing_block_param
| ';' trailing_block_param
| empty
;
macro_func_body
: implies_body ';'
| compound_statement
;
macro_declaration
: MACRO macro_header '(' macro_params ')' opt_attributes macro_func_body
;
struct_or_union
: STRUCT
| UNION
;
struct_declaration
: struct_or_union TYPE_IDENT opt_attributes struct_body
;
struct_body
: '{' struct_declaration_list '}'
;
struct_declaration_list
: struct_member_decl
| struct_declaration_list struct_member_decl
;
enum_params
: enum_param_decl
| enum_params ',' enum_param_decl
;
enum_param_list
: '(' enum_params ')'
| '(' ')'
| empty
;
struct_member_decl
: type identifier_list opt_attributes ';'
| struct_or_union IDENT opt_attributes struct_body
| struct_or_union opt_attributes struct_body
| BITSTRUCT ':' type opt_attributes bitstruct_body
| BITSTRUCT IDENT ':' type opt_attributes bitstruct_body
| INLINE type IDENT opt_attributes ';'
| INLINE type opt_attributes ';'
;
enum_spec
: ':' type enum_param_list
| empty
;
enum_declaration
: ENUM TYPE_IDENT enum_spec opt_attributes '{' enum_list '}'
;
faults
: CONST_IDENT
| faults ',' CONST_IDENT
;
fault_declaration
: FAULT TYPE_IDENT opt_attributes '{' faults '}'
| FAULT TYPE_IDENT opt_attributes '{' faults ',' '}'
;
func_macro_name
: IDENT
| AT_IDENT
;
func_header
: optional_type type '.' func_macro_name
| optional_type func_macro_name
;
macro_header
: func_header
| type '.' func_macro_name
| func_macro_name
;
fn_parameter_list
: '(' parameters ')'
| '(' ')'
;
parameters
: parameter '=' expr
| parameter
| parameters ',' parameter
| parameters ',' parameter '=' expr
;
parameter
: type IDENT opt_attributes
| type ELLIPSIS IDENT opt_attributes
| type ELLIPSIS CT_IDENT
| type CT_IDENT
| type ELLIPSIS opt_attributes
| type HASH_IDENT opt_attributes
| type '&' IDENT opt_attributes
| type opt_attributes
| '&' IDENT opt_attributes
| HASH_IDENT opt_attributes
| ELLIPSIS
| IDENT opt_attributes
| IDENT ELLIPSIS opt_attributes
| CT_IDENT
| CT_IDENT ELLIPSIS
;
func_definition
: FN func_header fn_parameter_list opt_attributes ';'
| FN func_header fn_parameter_list opt_attributes macro_func_body
;
const_declaration
: CONST CONST_IDENT opt_attributes '=' expr ';'
| CONST type CONST_IDENT opt_attributes '=' expr ';'
;
func_typedef
: FN optional_type fn_parameter_list
;
opt_distinct_inline
: DISTINCT
| DISTINCT INLINE
| INLINE DISTINCT
| INLINE
| empty
;
generic_parameters
: bit_expr
| type
| generic_parameters ',' bit_expr
| generic_parameters ',' type
;
typedef_type
: func_typedef
| type opt_generic_parameters
;
multi_declaration
: ',' IDENT
| multi_declaration ',' IDENT
;
global_storage
: TLOCAL
| empty
;
global_declaration
: global_storage optional_type IDENT opt_attributes ';'
| global_storage optional_type IDENT multi_declaration opt_attributes ';'
| global_storage optional_type IDENT opt_attributes '=' expr ';'
;
opt_tl_stmts
: top_level_statements
| empty
;
tl_ct_case
: CT_CASE constant_expr ':' opt_tl_stmts
| CT_CASE type ':' opt_tl_stmts
| CT_DEFAULT ':' opt_tl_stmts
;
tl_ct_switch_body
: tl_ct_case
| tl_ct_switch_body tl_ct_case
;
define_attribute
: AT_TYPE_IDENT '(' parameters ')' opt_attributes '=' '{' opt_attributes '}'
| AT_TYPE_IDENT opt_attributes '=' '{' opt_attributes '}'
;
opt_generic_parameters
: '{' generic_parameters '}'
| empty
;
define_ident
: IDENT '=' path_ident opt_generic_parameters
| CONST_IDENT '=' path_const opt_generic_parameters
| AT_IDENT '=' path_at_ident opt_generic_parameters
;
define_declaration
: DEF define_ident ';'
| DEF define_attribute ';'
| DEF TYPE_IDENT opt_attributes '=' opt_distinct_inline typedef_type ';'
;
tl_ct_if
: CT_IF constant_expr ':' opt_tl_stmts CT_ENDIF
| CT_IF constant_expr ':' opt_tl_stmts CT_ELSE opt_tl_stmts CT_ENDIF
;
tl_ct_switch
: ct_switch tl_ct_switch_body CT_ENDSWITCH
;
module_param
: CONST_IDENT
| TYPE_IDENT
;
module_params
: module_param
| module_params ',' module_param
;
module
: MODULE path_ident opt_attributes ';'
| MODULE path_ident '{' module_params '}' opt_attributes ';'
;
import_paths
: path_ident
| path_ident ',' path_ident
;
import_decl
: IMPORT import_paths opt_attributes ';'
;
translation_unit
: top_level_statements
| empty
;
top_level_statements
: top_level
| top_level_statements top_level
;
opt_extern
: EXTERN
| empty
;
top_level
: module
| import_decl
| opt_extern func_definition
| opt_extern const_declaration
| opt_extern global_declaration
| ct_assert_stmt
| ct_echo_stmt
| ct_include_stmt
| tl_ct_if
| tl_ct_switch
| struct_declaration
| fault_declaration
| enum_declaration
| macro_declaration
| define_declaration
| static_declaration
| bitstruct_declaration
;
%%
void yyerror(char *s)
{
fflush(stdout);
printf("\n%*s\n%*s\n", column, "^", column, s);
}
int main(int argc, char *argv[])
{
yyparse();
return 0;
}
C3 Specification¶
Notation¶
The syntax is specified using Extended Backus-Naur Form (EBNF):
production ::= PRODUCTION_NAME '::=' expression?
expression ::= alternative ("|" alternative)*
alternative ::= term term*
term ::= PRODUCTION_NAME | TOKEN | set | group | option | repetition
set ::= '[' (range | CHAR) (range | CHAR)* ']'
range ::= CHAR '-' CHAR
group ::= '(' expression ')'
option ::= expression '?'
repetition ::= expression '*'
Productions are expressions constructed from terms and the following operators, in increasing precedence:
- | alternation
- () grouping
- ? option (0 or 1 times)
- * repetition (0 or more times)
Uppercase production names are used to identify lexical tokens. Non-terminals are in lower case. Lexical tokens are enclosed in single quotes ''.
The form a..b represents the set of characters from a through b as alternatives.
Source code representation¶
A program consists of one or more translation units stored in files written in the Unicode character set, stored as a sequence of bytes using the UTF-8 encoding. Except for comments and the contents of character and string literals, all input elements are formed only from the ASCII subset (U+0000 to U+007F) of Unicode.
Carriage return¶
The carriage return (U+000D) is usually treated as white space, but may be stripped from the source code prior to lexical translation.
Bidirectional markers¶
Unbalanced bidirectional markers (such as U+202D and U+202E) are not legal.
Lexical Translations¶
A raw byte stream is translated into a sequence of tokens, from which white space and comments are discarded. The resulting input elements form the tokens that are the terminal symbols of the syntactic grammar.
The longest possible translation is used at each step, even if the result does not ultimately make a correct program while another lexical translation would.
Example:
a--b is translated as a, --, b, which does not form a grammatically correct expression, even though the tokenization a, -, -, b could form a grammatically correct expression.
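The longest-match rule can be modelled with a few lines of Python. This is a minimal sketch, not the compiler's lexer: `tokenize` and the small `OPERATORS` table are illustrative names, and only a handful of the operators from this specification are included.

```python
# A small subset of the C3 operators, enough for the example.
OPERATORS = ["--", "-", "++", "+", "-=", "+=", "->"]


def tokenize(source):
    """Split `source` into tokens, always taking the longest match."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
            continue
        if ch.isalnum() or ch == "_":
            # Identifier or number: consume the longest alphanumeric run.
            j = i
            while j < len(source) and (source[j].isalnum() or source[j] == "_"):
                j += 1
            tokens.append(source[i:j])
            i = j
            continue
        # Operator: try the longest candidates first.
        for op in sorted(OPERATORS, key=len, reverse=True):
            if source.startswith(op, i):
                tokens.append(op)
                i += len(op)
                break
        else:
            raise ValueError(f"unexpected character {ch!r} at offset {i}")
    return tokens


print(tokenize("a--b"))    # ['a', '--', 'b']
print(tokenize("a - -b"))  # ['a', '-', '-', 'b']
```

Separating the two minus signs with white space is what lets the second tokenization occur, since the lexer never backtracks to try a shorter match.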
Line Terminators¶
The C3 compiler divides the sequence of input bytes into lines by recognizing line terminators.
Lines are terminated by the ASCII LF character (U+000A), also known as "newline". A line termination specifies the termination of the // form of a comment.
Comments¶
There are two types of regular comments:
- // text a line comment. The text between // and line end is ignored.
- /* text */ block comments. The text between /* and */ is ignored. It has nesting behaviour, so for every /* discovered between the first /* and the last */ a corresponding */ must be found.
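The nesting rule for block comments amounts to keeping a depth counter. The Python sketch below models it under a simplifying assumption: string and character literals are not recognised, so comment markers inside literals would be misread. `strip_comments` is a hypothetical helper used only for illustration.

```python
def strip_comments(source):
    """Remove // line comments and nested /* */ block comments."""
    out = []
    depth = 0  # current block-comment nesting level
    i = 0
    while i < len(source):
        two = source[i:i + 2]
        if two == "/*":
            depth += 1
            i += 2
        elif two == "*/" and depth > 0:
            depth -= 1
            i += 2
        elif depth > 0:
            i += 1  # inside a block comment: skip the character
        elif two == "//":
            # Line comment: skip up to, but not including, the newline.
            end = source.find("\n", i)
            i = len(source) if end == -1 else end
        else:
            out.append(source[i])
            i += 1
    if depth > 0:
        raise ValueError("unterminated block comment")
    return "".join(out)


print(repr(strip_comments("a /* x /* y */ z */ b")))  # 'a  b'
```

Because every /* increases the depth, the first */ in the example closes only the inner comment, and " z " is still ignored.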
White Space¶
White space is defined as the ASCII horizontal tab character (U+0009), carriage return (U+000D), space character (U+0020) and the line terminator character (U+000A).
Letters and digits¶
UC_LETTER ::= [A-Z]
LC_LETTER ::= [a-z]
LETTER ::= UC_LETTER | LC_LETTER
DIGIT ::= [0-9]
HEX_DIGIT ::= [0-9a-fA-F]
BINARY_DIGIT ::= [01]
OCTAL_DIGIT ::= [0-7]
LC_LETTER_ ::= LC_LETTER | "_"
UC_LETTER_ ::= UC_LETTER | "_"
ALPHANUM ::= LETTER | DIGIT
ALPHANUM_ ::= ALPHANUM | "_"
UC_ALPHANUM_ ::= UC_LETTER_ | DIGIT
LC_ALPHANUM_ ::= LC_LETTER_ | DIGIT
Identifiers¶
Identifiers name program entities such as variables and types. An identifier is a sequence of one or more letters, digits and underscores. The first character in an identifier must be a letter or underscore.
C3 has three groups of identifiers: const identifiers, containing only underscores, upper-case letters and digits; type identifiers, starting with an upper-case letter and containing at least one lower-case letter; and regular identifiers, starting with a lower-case letter. In each group the first letter may be preceded by underscores.
Identifiers are limited to 127 characters.
IDENTIFIER ::= "_"* LC_LETTER ALPHANUM_*
CONST_IDENT ::= "_"* UC_LETTER UC_ALPHANUM_*
TYPE_IDENT ::= "_"* UC_LETTER UC_ALPHANUM_* LC_LETTER ALPHANUM_*
CT_IDENT ::= "$" IDENTIFIER
CT_BUILTIN_CONST ::= "$$" CONST_IDENT
CT_BUILTIN_FN ::= "$$" IDENTIFIER
CT_TYPE_IDENT ::= "$" TYPE_IDENT
AT_IDENT ::= "@" IDENTIFIER
AT_TYPE_IDENT ::= "@" TYPE_IDENT
HASH_IDENT ::= "#" IDENTIFIER
PATH_SEGMENT ::= "_"* LC_LETTER LC_ALPHANUM_*
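To make the three identifier groups concrete, the productions for IDENTIFIER, CONST_IDENT and TYPE_IDENT can be transcribed into regular expressions. The Python sketch below does this; `classify` is a hypothetical helper, and it does not exclude reserved keywords, which a real lexer would also have to do.

```python
import re

# Direct transcriptions of the productions above, restricted to ASCII.
PATTERNS = (
    ("TYPE_IDENT", re.compile(r"_*[A-Z][A-Z0-9_]*[a-z][A-Za-z0-9_]*")),
    ("CONST_IDENT", re.compile(r"_*[A-Z][A-Z0-9_]*")),
    ("IDENTIFIER", re.compile(r"_*[a-z][A-Za-z0-9_]*")),
)
MAX_LENGTH = 127  # identifiers are limited to 127 characters


def classify(name):
    """Return the identifier group of `name`, or None if it is not valid."""
    if len(name) > MAX_LENGTH:
        return None
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(name):
            return kind
    return None


for name in ("foo", "_bar2", "MAX_SIZE", "Foo", "HTTPRequest", "___", "2abc"):
    print(name, classify(name))
```

Note that a name consisting only of underscores matches none of the productions, and that a single upper-case letter such as A is a const identifier rather than a type identifier.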
Keywords¶
The following keywords are reserved and may not be used as identifiers:
any bfloat bool
char double fault
float float128 float16
ichar int int128
iptr isz long
short typeid uint
uint128 ulong uptr
ushort usz void
alias assert asm
attrdef bitstruct break
case catch const
continue default defer
do else enum
extern false faultdef
for foreach foreach_r
fn tlocal if
inline import macro
module nextcase null
interface return static
struct switch true
try typedef union
var while
$alignof $assert $assignable
$case $default $defined
$echo $else $embed
$endfor $endforeach $endif
$endswitch $eval $error
$exec $extnameof $feature
$for $foreach $if
$include $is_const $nameof
$offsetof $qnameof $sizeof
$stringify $switch $typefrom
$typeof $vacount $vatype
$vaconst $vaarg $vaexpr
$vasplat
Operators and punctuation¶
The following character sequences represent operators and punctuation.
& @ ~ | ^ :
, / $ . ; =
> < # { } -
( ) * [ ] %
>= <= + += -= !
? ?: && ?? &= |=
^= /= .. == [< >]
++ -- %= != || ::
<< >> !! -> => ...
<<= >>= +++ &&& ||| ???
Backslash escapes¶
The following backslash escapes are available for characters and string literals:
\0 0x00 zero value
\a 0x07 alert/bell
\b 0x08 backspace
\e 0x1B escape
\f 0x0C form feed
\n 0x0A newline
\r 0x0D carriage return
\t 0x09 horizontal tab
\v 0x0B vertical tab
\\ 0x5C backslash
\' 0x27 single quote '
\" 0x22 double quote "
\x Escapes a single byte hex value
\u Escapes a two byte unicode hex value
\U Escapes a four byte unicode hex value
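The escape table can be read as a small decoding procedure. The Python sketch below is a model under stated assumptions: \x is taken to be followed by exactly two hex digits, \u by four and \U by eight (inferred from the byte counts above), and code points are assumed to be stored as UTF-8. `decode_escapes` is a hypothetical helper, not a C3 library function.

```python
import string

SIMPLE_ESCAPES = {
    "0": 0x00, "a": 0x07, "b": 0x08, "e": 0x1B, "f": 0x0C, "n": 0x0A,
    "r": 0x0D, "t": 0x09, "v": 0x0B, "\\": 0x5C, "'": 0x27, '"': 0x22,
}
# Number of hex digits after each hex escape (assumed from the byte counts).
HEX_ESCAPES = {"x": 2, "u": 4, "U": 8}


def decode_escapes(body):
    """Decode the body of a string literal (without its quotes) to bytes."""
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out += ch.encode("utf-8")
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        code = body[i + 1]
        if code in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[code])
            i += 2
        elif code in HEX_ESCAPES:
            width = HEX_ESCAPES[code]
            digits = body[i + 2:i + 2 + width]
            if len(digits) != width or any(d not in string.hexdigits for d in digits):
                raise ValueError(f"escape needs {width} hex digits")
            value = int(digits, 16)
            if code == "x":
                out.append(value)                  # a single raw byte
            else:
                out += chr(value).encode("utf-8")  # assumed UTF-8 storage
            i += 2 + width
        else:
            raise ValueError(f"unknown escape \\{code}")
    return bytes(out)


print(decode_escapes(r"a\tb"))         # b'a\tb'
print(decode_escapes(r"\x41\u00e9"))   # b'A\xc3\xa9'
```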
Constants¶
TODO
Extern constants¶
TODO
Untyped constants¶
TODO
Constant literals¶
TODO
Variables¶
TODO
Global variables¶
TODO
Extern global variables¶
TODO
Thread local globals¶
TODO
Local variables¶
TODO
Static locals¶
TODO
Thread local locals¶
TODO
Copying declarations¶
TODO
Macro copying¶
TODO
Defer copying¶
TODO
Types¶
Types consist of built-in types and user-defined types (enums, structs, unions, bitstructs and typedef).
Boolean types¶
bool may have the two values true and false. It holds a single bit of information but is
stored in a char type.
Integer types¶
The built-in integer types:
char unsigned 8-bit
ichar signed 8-bit
ushort unsigned 16-bit
short signed 16-bit
uint unsigned 32-bit
int signed 32-bit
ulong unsigned 64-bit
long signed 64-bit
uint128 unsigned 128-bit
int128 signed 128-bit
In addition, the following type aliases exist:
uptr unsigned pointer size
iptr signed pointer size
usz unsigned pointer offset / object size
isz signed pointer offset / object size
Floating point types¶
Built-in floating point types:
float16 IEEE 16-bit*
bfloat16 Brainfloat*
float IEEE 32-bit
double IEEE 64-bit
float128 IEEE 128-bit*
(* optionally supported)
Vector types¶
A vector lowers to the platform's vector types where available. A vector has a base type and a width.
Vector base type¶
The base type of a vector must be of boolean, pointer, enum, integer or floating point type, or a distinct type wrapping one of those types.
Min width¶
The vector width must be at least 1.
Element access¶
Vector elements are accessed using []. It is possible to take the address of a single element.
Field access syntax¶
It is possible to access the indices 0-3 with field access syntax. 'x', 'y', 'z', 'w' correspond to indices 0-3. Alternatively 'r', 'g', 'b', 'a' may be used.
Swizzling¶
It is possible to form new vectors by combining field access names of individual elements. For example
foo.xz constructs a new vector with the fields from the elements with index 0 and 2 from the vector "foo". There is
no restriction on ordering, and the same field may be repeated. The width of the vector is the same as the number of
elements in the swizzle. Example: foo.xxxzzzyyy would be a vector of width 9.
Mixing the "rgba" and "xyzw" access name sets is an error. Consequently foo.rgz would be invalid as "rg" is from the "rgba" set and "z" is from the "xyzw" set.
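The swizzling rules can be modelled on plain Python lists. This is only a sketch of the semantics described above, not C3 code: `swizzle` is a hypothetical function, and the vector is represented as a list.

```python
XYZW = "xyzw"
RGBA = "rgba"


def swizzle(vector, names):
    """Return a new list built from `vector` using the swizzle `names`."""
    if names and all(n in XYZW for n in names):
        name_set = XYZW
    elif names and all(n in RGBA for n in names):
        name_set = RGBA
    else:
        # Covers unknown letters as well as mixing the two name sets.
        raise ValueError(f"invalid or mixed swizzle: {names!r}")
    indices = [name_set.index(n) for n in names]
    if max(indices) >= len(vector):
        raise IndexError(f"swizzle {names!r} is out of range for this vector")
    return [vector[i] for i in indices]


foo = [10, 20, 30, 40]
print(swizzle(foo, "xz"))              # [10, 30]
print(len(swizzle(foo, "xxxzzzyyy")))  # 9
print(swizzle(foo, "bgr"))             # [30, 20, 10]
```

Calling `swizzle(foo, "rgz")` raises an error in this model, mirroring the rule that the two name sets may not be mixed.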
Swizzling assignment¶
A swizzled vector may be a lvalue if there is no repeat of an index. Example: foo.zy is a valid lvalue, but foo.xxy is not.
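The lvalue restriction is simply a check for repeated names. The short Python model below illustrates it; both function names are hypothetical, and for brevity it handles only the "xyzw" name set and assumes the names are already known to be valid.

```python
def is_swizzle_lvalue(names):
    """A swizzle is assignable only if no element is named twice."""
    return len(set(names)) == len(names)


def swizzle_assign(vector, names, values):
    """Model of `vector.names = values` for the "xyzw" name set."""
    if not is_swizzle_lvalue(names):
        raise ValueError(f"{names!r} repeats an index and is not an lvalue")
    for name, value in zip(names, values):
        vector["xyzw".index(name)] = value


foo = [1, 2, 3]
swizzle_assign(foo, "zy", [30, 20])
print(foo)                       # [1, 20, 30]
print(is_swizzle_lvalue("xxy"))  # False
```

Without the restriction, an assignment through foo.xxy would write two different values to the same element, so the result would depend on evaluation order.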
Alignment¶
Vectors have the same alignment as arrays of the same size and type.
Vector operations¶
Vectors support the same arithmetic and bit operations as their underlying type, and will perform the operation element-wise. Vector operations ignore overloads on the underlying type.
Example:
int[<2>] a = { 1, 3 };
int[<2>] b = { 2, 7 };
int[<2>] c = a * b;
// Equivalent to
int[<2>] c = { a[0] * b[0], a[1] * b[1] };
Vectors support the ++ and -- operators, which will be applied to each element. For example, given the int vector int[<2>] x = { 1, 2 }, the expression x++ will return the vector { 1, 2 } and update the vector x to { 2, 3 }.
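The element-wise behaviour, including the post-increment case, can be modelled on Python lists. This is a sketch of the semantics only: `vec_mul` and `post_increment` are hypothetical names, and Python integers do not wrap the way fixed-width C3 integers would.

```python
def vec_mul(a, b):
    """Element-wise multiplication of two vectors of equal width."""
    if len(a) != len(b):
        raise ValueError("vector widths differ")
    return [x * y for x, y in zip(a, b)]


def post_increment(vec):
    """Model of `x++`: returns the old value and updates `vec` in place."""
    old = list(vec)
    for i in range(len(vec)):
        vec[i] += 1
    return old


print(vec_mul([1, 3], [2, 7]))  # [2, 21]

x = [1, 2]
print(post_increment(x))        # [1, 2]
print(x)                        # [2, 3]
```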
Enum vector "ordinal"¶
Enum vectors support .ordinal, which will return the ordinal of all elements. Note that the .from_ordinal method of enums may take a vector and then return an enum vector.
Vector limits¶
Vectors may have a compiler-defined maximum bit width. This will be at least as big as the largest supported SIMD vector. A typical value is 4096 bits. For the purpose of calculating the maximum width, boolean vectors are considered to be 8 bits wide per element.
Simd vectors¶
TODO
Alignment¶
TODO
Size¶
TODO
Elements¶
TODO
Array types¶
An array has the alignment of its elements. An array must have at least one element.
Slice types¶
A slice consists of a pointer followed by a usz length, and has the alignment of a pointer.
Pointer types¶
A pointer is an address to memory.
Pointee type¶
The type of the memory pointed to is the pointee type. It may be any runtime type. In the case of a void* the pointee type is unknown.
Deref¶
Dereferencing a pointer will return the value in the memory location interpreted as the pointee type.
Pointer arithmetics¶
A usz or isz offset may be added to a pointer, resulting in a new pointer of the same type. This will offset the underlying address by the offset times the pointee size. For example, the size of a long is 8 bytes, so adding 3 to a pointer to a long increases the address by 24 (3 * 8).
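The address computation can be written out as a one-line formula. The Python sketch below treats addresses as plain integers; `pointer_add` is a hypothetical name and the base address is arbitrary.

```python
LONG_SIZE = 8  # size of a long in bytes, as stated above


def pointer_add(address, offset, pointee_size):
    """New address after adding `offset` elements to a typed pointer."""
    return address + offset * pointee_size


base = 0x1000
print(hex(pointer_add(base, 3, LONG_SIZE)))   # 0x1018, i.e. base + 24
print(hex(pointer_add(base, -1, LONG_SIZE)))  # 0xff8, a signed (isz) offset
```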
Subscripting¶
Subscripting a pointer is equal to performing pointer arithmetics by adding the index, followed by a deref. Subscripts on pointers may be negative and will never do bounds checks.
iptr and uptr¶
A pointer may be losslessly cast to an iptr or uptr. An iptr or uptr may be cast to a pointer of any type.
The wildcard pointer void*¶
Any pointer type may be implicitly cast to void*, and a void* may be implicitly cast to any other pointer type.
A void* pointer may never be directly dereferenced or subscripted; it must first be cast to a non-void pointer type.
Pointer arithmetic on void*¶
Performing pointer arithmetics on void* will assume that the element size is 1.
Struct types¶
A struct may not have zero members.
Alignment¶
A non-packed struct has the alignment of the member that has the highest alignment. A packed struct has alignment 1. See align attribute for details on changing the alignment.
Flexible array member¶
The last member of a struct may be a flexible array member. This is a placeholder for an array of unknown length. A struct must have at least one member other than the flexible array member.
The syntax of the flexible array member is the same as arrays of inferred length: Type[*]. The member will contribute to alignment as if it was a one element array.
Struct memory layout and size¶
The members of a struct are laid out in memory in order of declaration. Each member will be placed at the first offset aligned to the type of the member. This may cause padding to occur between members.
Finally, the end of the struct will be padded so that the size is a multiple of its alignment.
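To illustrate the layout rules, here is a sketch with two hypothetical structs that hold the same members in a different order. It assumes the usual 4 byte size and alignment for `int`.

```c3
import std::io;

struct Padded
{
    char a;   // offset 0
    int b;    // offset 4, after 3 bytes of padding
    char c;   // offset 8, followed by 3 bytes of tail padding
}

struct Reordered
{
    int b;    // offset 0
    char a;   // offset 4
    char c;   // offset 5, followed by 2 bytes of tail padding
}

fn void main()
{
    io::printfn("Padded: size %d, align %d", Padded.sizeof, Padded.alignof);
    io::printfn("Reordered: size %d, align %d", Reordered.sizeof, Reordered.alignof);
}
```

Under the stated assumption the first struct takes 12 bytes and the second 8, both with alignment 4, so ordering members from largest to smallest alignment tends to reduce padding.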
Inline¶
TODO
Union types¶
A union may not have zero members.
Alignment¶
A union has the alignment of the member that has the highest alignment. See align attribute for details on changing the alignment.
Union size¶
The size of a union is the size of its largest member, padded so that the size is a multiple of its alignment.
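A short sketch of the size rule, using a hypothetical union and assuming `int` has size and alignment 4: the largest member is 10 bytes, which is padded up to the next multiple of the alignment.

```c3
import std::io;

union Value
{
    int id;          // 4 bytes, alignment 4
    char[10] text;   // 10 bytes, alignment 1
}

fn void main()
{
    // Largest member is 10 bytes, padded to a multiple of 4.
    io::printfn("size %d, align %d", Value.sizeof, Value.alignof);
}
```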
Bitstruct type¶
Container type¶
The container type is restricted to integer types and char arrays, or typedefs based on such types.
Fault type¶
Alignment¶
Alignment is the same as that of the uptr type.
Size¶
Size is the same as that of the uptr type.
Representation¶
In underlying representation, the fault matches that of an uptr.
Faultdef¶
faultdef will create unique instances of the fault type.
Zero value¶
The zero value of the fault type can be created by implicitly casting from null or {}.
Assigning a zero value fault¶
An empty optional constructed from a zero value fault will behave as if it were a result with an undefined value. Performing operations on an undefined value will in itself give an undefined value.
Enum type¶
TODO
Typeid type¶
The typeid type is a built-in type that represents a unique identifier for a type. In its underlying representation, it matches that of an iptr.
Associated values¶
TODO
Ordinal¶
TODO
Inline¶
Const enum type¶
TODO
Value¶
TODO
Inline¶
TODO
Typedef¶
TODO
Underlying type¶
TODO
Inline¶
TODO
Alias¶
TODO
Interface¶
TODO
Inheritance¶
TODO
Implementing interface¶
TODO
Method lookup¶
TODO
Declarations¶
TODO
Expressions¶
TODO
Assignment expression¶
assignment_expr ::= unary_expr assignment_op expr
assignment_op ::= "=" | "+=" | "-=" | "*=" | "/=" | "%=" | "<<=" | ">>=" | "&=" | "^=" | "|="
Statements¶
TODO
stmt ::= compound_stmt | non_compound_stmt
non_compound_stmt ::= assert_stmt | if_stmt | while_stmt | do_stmt | foreach_stmt | foreach_r_stmt
| for_stmt | return_stmt | break_stmt | continue_stmt | var_stmt
| declaration_stmt | defer_stmt | nextcase_stmt | asm_block_stmt
| ct_echo_stmt | ct_error_stmt | ct_assert_stmt | ct_if_stmt | ct_switch_stmt
| ct_for_stmt | ct_foreach_stmt | expr_stmt | ct_assign_stmt
Compile time assign statements¶
Type assign statement¶
This assigns a new type to a compile time type variable. The value of the expression is the type assigned.
Asm block statement¶
An asm block is either a string expression or a brace enclosed list of asm statements.
asm_block_stmt ::= "asm" ("(" constant_expr ")" | "{" asm_stmt* "}")
asm_stmt ::= asm_instr asm_exprs? ";"
asm_instr ::= ("int" | IDENTIFIER) ("." IDENTIFIER)
asm_expr ::= CT_IDENT | CT_CONST_IDENT | "&"? IDENTIFIER | CONST_IDENT | FLOAT_LITERAL
| INTEGER | "(" expr ")" | "[" asm_addr "]"
asm_addr ::= asm_expr (additive_op asm_expr asm_addr_trail?)?
asm_addr_trail ::= "*" INTEGER (additive_op INTEGER)? | (shift_op | additive_op) INTEGER
TODO
Assert statement¶
The assert statement will evaluate the expression and call the panic function if it evaluates to false.
assert_stmt ::= "assert" "(" expr ("," assert_message)? ")" ";"
assert_message ::= constant_expr ("," expr)*
Conditional inclusion¶
assert statements are only included in "safe" builds. They may turn into assume directives for
the compiler on "fast" builds.
Assert message¶
The assert message is optional. It can be followed by an arbitrary number of expressions, in which case the message is understood to be a format string, and the following arguments are passed as values to the format function.
The assert message must be a compile time constant. There are no restrictions on the format argument expressions.
Panic function¶
If the assert message has no format arguments or no assert message is included,
then the regular panic function is called. If it has format arguments then panicf is called instead.
In the case the panicf function does not exist (for example, compiling without the standard library),
then the format and the format arguments will be ignored and the assert will be treated
as if no assert message was available.
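The two message forms can be sketched as follows. The function `checked_div` is a hypothetical example, not part of the standard library.

```c3
import std::io;

// Hypothetical helper used to show both assert message forms.
fn int checked_div(int a, int b)
{
    // Plain message: on failure the regular panic function is called.
    assert(b != 0, "division by zero");
    // Format string plus arguments: on failure panicf is called.
    assert(a >= 0, "expected a non-negative dividend, got %d", a);
    return a / b;
}

fn void main()
{
    io::printfn("%d", checked_div(10, 2));   // 5
}
```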
Break statement¶
A break statement exits a while, for, do, foreach or switch scope. A labelled break
may also exit a labelled if.
Break labels¶
If a break has a label, then it will instead exit an outer scope with the label.
Unreachable code¶
Any statement following break in the same scope is considered unreachable.
Compile time echo statement¶
During parsing, the compiler will output the text in the statement when it is semantically checked. The statement will be turned into a NOP statement after checking.
The message¶
The message must be a compile time constant string.
Compile time assert statement¶
During parsing, the compiler will check the compile time expression
and create a compile time error with the optional message. After
evaluation, the $assert becomes a NOP statement.
Evaluated expression¶
The checked expression must evaluate to a boolean compile time constant.
Error message¶
The second parameter, which is optional, must evaluate to a constant string.
Compile time error statement¶
During parsing, when semantically checked this statement will output a compile time error with the message given.
Error message¶
The parameter must evaluate to a constant string.
Compile time if statement¶
If the cond expression is true, the then-branch is processed by the compiler. If it evaluates to false, the else-branch is processed if it exists.
Cond expression¶
The cond expression must be possible to evaluate to true or false at compile time.
Scopes¶
The "then" and "else" branches will add a compile time scope that is exited when reaching $endif.
It adds no runtime scope.
Evaluation¶
Statements in the branch not picked will not be semantically checked.
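A minimal sketch of a compile time if, assuming a hypothetical constant `VERBOSE`. Because the branch that is not picked is not semantically checked, the undeclared name inside it is expected to go unreported while `VERBOSE` is false.

```c3
import std::io;

const bool VERBOSE = false;   // hypothetical switch for this sketch

fn int verbosity()
{
    $if VERBOSE:
        // Not picked while VERBOSE is false, so this is not semantically checked.
        return some_undeclared_level;
    $else
        return 0;
    $endif
}

fn void main()
{
    io::printfn("%d", verbosity());
}
```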
Compile time switch statement¶
ct_switch_stmt ::= "$switch" (ct_expr_or_type)? ":"
ct_case_stmt ::= ("$default" | "$case" ct_expr_or_type) ":" stmt*
No cond expression switch¶
If the cond expression is missing, evaluation will go through each case until one case expression evaluates to true.
Type expressions¶
If a cond expression is a type, then all case statement expressions must be types as well.
Ranged cases¶
Compile time switch does not support ranged cases.
Fallthrough¶
If a case clause has no statements, then when executing the case, rather than exiting the switch, the next case clause immediately following it will be used. If that one should also be missing statements, the procedure will be repeated until a case clause with statements is encountered, or the end of the switch is reached.
Break and nextcase¶
Compile time switches do not support break nor nextcase.
Evaluation of statements¶
Only the case which is first matched has its statements processed by the compiler. All other statements are ignored and will not be semantically checked.
Continue statement¶
A continue statement jumps to the cond expression of a while, for, do or foreach.
Continue labels¶
If a continue has a label, then it will jump to the cond of the while/for/do in the outer scope
with the corresponding label.
Unreachable code¶
Any statement following continue in the same scope is considered unreachable.
Declaration statement¶
A declaration statement adds a new runtime or compile time variable to the current scope. It is available after the declaration statement.
declaration_stmt ::= const_declaration | local_decl_storage? optional_type decls_after_type ";"
local_decl_storage ::= "tlocal" | "static"
decls_after_type ::= local_decl_after_type ("," local_decl_after_type)*
local_decl_after_type ::= CT_IDENT ("=" constant_expr)? | IDENTIFIER opt_attributes ("=" expr)?
Thread local storage¶
Using tlocal allocates the runtime variable as a thread local variable. In effect this is the same as declaring
the variable as a global tlocal variable, but the visibility is limited to the function. tlocal may not be
combined with static.
The initializer for a tlocal variable must be a valid global init expression.
Static storage¶
Using static allocates the runtime variable as a function global variable. In effect this is the same as declaring
a global, but visibility is limited to the function. static may not be combined with tlocal.
The initializer for a static variable must be a valid global init expression.
Scopes¶
Runtime variables are added to the runtime scope, compile time variables to the compile time scope. See var statements.
Multiple declarations¶
If more than one variable is declared, no init expressions are allowed for any of the variables.
No init expression¶
If no init expression is provided, the variable is zero initialized.
Opt-out of zero initialization¶
Using the @noinit attribute opts out of zero initialization.
Prevent opt-out of zero initialization¶
Using the @mustinit attribute disables the use of the @noinit attribute.
Self referencing initialization¶
An init expression may refer to the address of the same variable that is declared, but not the value of the variable.
Example:
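A minimal sketch of the rule; the function name is hypothetical. Taking the address of the variable being declared is fine, while reading its value is rejected.

```c3
fn bool self_reference_works()
{
    // Valid: only the address of x is used in its own initializer.
    void* x = &x;

    // Invalid, because it reads the value of y in its own initializer:
    // int y = y + 1;

    return x == (void*)&x;
}

fn void main()
{
    assert(self_reference_works());
}
```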
Defer statement¶
The defer statements are executed at (runtime) scope exit, whether through return, break, continue or rethrow.
Defer in defer¶
The defer body (statement) may not be a defer statement. However, if the body is a compound statement then this may have any number of defer statements.
Static and tlocal variables in defer¶
Static and tlocal variables are allowed in a defer statement. Only a single variable is instantiated regardless of the number of inlining locations.
Defer and return¶
If the return has an expression, then it is evaluated before the defer statements (due to exit from the current function scope) are executed.
Example:
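The sketch below uses a hypothetical global `counter` to make the ordering visible: the return value is computed first, and only then does the defer run.

```c3
import std::io;

int counter = 0;   // hypothetical global for this sketch

fn int next_id()
{
    defer counter++;
    // The return expression is evaluated before the defer increments counter.
    return counter;
}

fn void main()
{
    int first = next_id();
    io::printfn("%d %d", first, counter);   // 0 1
}
```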
Defer and jump statements¶
A defer body may not contain a break, continue, return or rethrow that would exit the statement.
Defer execution¶
Defer statements are executed in the reverse order of their declaration, starting from the last declared defer statement.
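As a sketch of the reverse ordering, the hypothetical function below records the order in which two defers run by building up a number digit by digit.

```c3
import std::io;

fn int defer_order()
{
    int code = 0;
    {
        defer code = code * 10 + 1;   // declared first, runs last
        defer code = code * 10 + 2;   // declared second, runs first
    }
    // The inner scope has exited: first 2 was appended, then 1.
    return code;
}

fn void main()
{
    io::printfn("%d", defer_order());   // 21
}
```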
defer try¶
A defer try type of defer will only execute if the scope is left through normal fallthrough, break,
continue or a return with a result.
It will not execute if the exit is through a rethrow or a return with an optional value.
defer catch¶
A defer catch type of defer will only execute if the scope is left through a rethrow or a return with an optional value.
It will not execute if the exit is a normal fallthrough, break, continue or a return with a result.
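The difference between the two variants can be sketched like this. The fault `BAD_INPUT` and the function `parse_positive` are hypothetical names for this illustration.

```c3
import std::io;

faultdef BAD_INPUT;   // hypothetical fault

fn int? parse_positive(int value)
{
    defer try io::printn("defer try: left with a result");
    defer catch io::printn("defer catch: left with a fault");
    if (value <= 0) return BAD_INPUT?;
    return value;
}

fn void main()
{
    if (try v = parse_positive(5)) io::printfn("ok: %d", v);
    if (catch err = parse_positive(-1)) io::printfn("failed: %s", err);
}
```

Only one of the two defers runs per call: the `defer try` for the argument 5 and the `defer catch` for the argument -1.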
Non-regular returns - longjmp, panic and other errors¶
Defers will not execute when a scope is left through longjmp, or when the program terminates through a panic or other error. They
are only invoked on regular scope exits.
Expr statement¶
An expression statement evaluates an expression.
No discard¶
If the expression is a function or macro call that either returns an optional or is annotated @nodiscard, then
using it as an expression statement, which discards the result, is a compile time error. A function or macro
returning an optional can use the @maydiscard attribute to suppress this error.
If statement¶
An if statement will evaluate the cond expression, then execute the first statement (the "then clause") in the if-body if it evaluates to "true", otherwise execute the else clause. If no else clause exists, then the next statement is executed.
if_stmt ::= "if" (label ":")? "(" cond_expr ")" if_body
if_body ::= non_compound_stmt | compound_stmt else_clause? | "{" switch_body "}"
else_clause ::= "else" (if_stmt | compound_stmt)
Scopes¶
Both the "then" clause and the else clause open new scopes, even if they are non-compound statements. The cond expression scope is valid until the exit of the entire statement, so any declarations in the cond expression are available both in then and else clauses. Declarations in the "then" clause are not available in the else clause and vice versa.
Special parsing of the "then" clause¶
If the then-clause isn't a compound statement, then it must follow on the same row as the cond expression. It may not appear on a consecutive row.
Break¶
It is possible to use labelled break to break out of an if statement. Note that an unlabelled break may not
be used.
If-try¶
The cond expression may be a try-unwrap chain. In this case, the unwrapped variables are scoped to the "then" clause only.
If-catch¶
The cond expression may be a catch-unwrap. The unwrap is scoped to the "then" clause only. If one or more variables are in the catch, then the "else" clause has these variables implicitly unwrapped.
Example:
int? a = foo();
int? b = foo();
if (catch a, b)
{
// Do something
}
else
{
int x = a + b; // Valid, a and b are implicitly unwrapped.
}
If-catch implicit unwrap¶
If an if-catch's "then"-clause will jump out of the outer scope in all code paths and the catch is on one or more variables, then the variable(s) will be implicitly unwrapped in the outer scope after the if-statement.
Example:
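A sketch of the implicit unwrap, using the hypothetical fault `NO_VALUE` and the hypothetical functions `maybe_value` and `plus_one`:

```c3
import std::io;

faultdef NO_VALUE;   // hypothetical fault

fn int? maybe_value(bool ok)
{
    if (!ok) return NO_VALUE?;
    return 41;
}

fn int plus_one(bool ok)
{
    int? a = maybe_value(ok);
    if (catch a)
    {
        // Every path through the "then" clause leaves the function.
        return -1;
    }
    // From here on a is implicitly unwrapped to a plain int.
    return a + 1;
}

fn void main()
{
    io::printfn("%d %d", plus_one(true), plus_one(false));   // 42 -1
}
```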
Nextcase statement¶
Nextcase will jump to another switch case.
Labels¶
When a nextcase has a label, the jump is to the switch in an outer scope with the corresponding label.
No expression jumps¶
A nextcase without any expression jumps to the next case clause in the current switch. It is not possible
to use no expression nextcase with labels.
Jumps to default¶
Using default jumps to the default clause of a switch.
Missing case¶
If the switch has constant case values, and the nextcase expression is constant, then the value of the expression must match a case clause. Not matching a case is a compile time error.
If one or more cases are non-constant and/or the nextcase expression is non-constant, then no compile time check is made.
Variable expression¶
If the nextcase has a non-constant expression, or the cases are not all constant, then first the nextcase expression is evaluated. Next, execution will proceed as if the switch was invoked again, but with the nextcase expression as the switch cond expression. See switch statement.
If the switch does not have a cond expression, nextcase with an expression is not allowed.
Unreachable code¶
Any statement in the same scope after a nextcase is considered unreachable.
Switch statement¶
switch_stmt ::= "switch" (label ":")? ("(" cond_expr ")")? switch_body
switch_body ::= "{" case_clause* "}"
case_clause ::= default_stmt | case_stmt
default_stmt ::= "default" ":" stmt*
case_stmt ::= "case" label? expr (".." expr)? ":" stmt*
Regular switch¶
If the cond expression exists and all case statements have constant expressions, then first the cond expression is evaluated, next the case corresponding to the expression's value will be jumped to and the statement will be executed. After reaching the end of the statements and a new case clause or the end of the switch body, the execution will jump to the first statement after the switch.
If-switch¶
If the cond expression is missing or the case statements are non-constant expressions, then each case clause will be evaluated in order after the cond expression has been evaluated (if it exists):
- If a cond expression exists, calculate the case expression and execute the case if it is matching the cond expression. A default statement has no expression and will always be considered matching the cond expression reached.
- If no cond expression exists, calculate the case expression and execute the case if the expression evaluates to "true" when implicitly converted to boolean. A default statement will always be considered having the "true" result.
Any-switch¶
If the cond expression is an any type, the switch is handled as if switching was done over the type
field of the any. This field has the type typeid, and the cases follow the rules
for switching over typeid.
If the cond expression is a variable, then this variable is implicitly converted to a pointer with the pointee type given by the case statement.
Example:
any a = abc();
switch (a)
{
case int:
int b = *a; // a is int*
case float:
float z = *a; // a is float*
case Bar:
Bar f = *a; // a is Bar*
default:
// a is not unwrapped
}
Ranged cases¶
Cases may be ranged. The start and end of the range must both be constant integer values. The start must be less than or equal to the end value. Using non-integers or non-constant values is a compile time error.
Fallthrough¶
If a case clause has no statements, then when executing the case, rather than exiting the switch, the next case clause immediately following it will be executed. If that one should also be missing statements, the procedure will be repeated until a case clause with statements is encountered (and executed), or the end of the switch is reached.
Exhaustive switch¶
If a switch has a default clause, or it is switching over an enum and there exists a case for each enum value, then the switch is exhaustive.
Break¶
If an unlabelled break, or a break with the switch's label is encountered, then the execution will jump out of the switch and proceed directly after the end of the switch body.
Unreachable code¶
If a switch is exhaustive and all case clauses end with a jump instruction, containing no break statement out of the current switch, then the code directly following the switch will be considered unreachable.
Switching over typeid¶
If the switch cond expression is a typeid, then case declarations may use only the type name after the case,
which will be interpreted as having an implicit .typeid. Example: case int: will be interpreted as if
written case int.typeid.
Nextcase without expression¶
Without a value nextcase will jump to the beginning of the next case clause. It is not allowed to
put nextcase without an expression if there are no following case clauses.
Nextcase with expression¶
Nextcase with an expression will evaluate the expression and then jump as if the switch was entered with the cond expression corresponding to the value of the nextcase expression. Nextcase with an expression cannot be used on a switch without a cond expression.
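The three forms of nextcase can be sketched in one switch. The function `walk` is hypothetical; it adds a different power of ten for every clause it visits so that the path taken is visible in the result.

```c3
import std::io;

fn int walk(int x)
{
    int visited = 0;
    switch (x)
    {
        case 1:
            visited += 1;
            nextcase;           // continue with the next case clause
        case 2:
            visited += 10;
            nextcase 10;        // jump as if the switch was entered with 10
        case 10:
            visited += 100;
            nextcase default;   // jump to the default clause
        default:
            visited += 1000;
    }
    return visited;
}

fn void main()
{
    io::printfn("%d %d %d", walk(1), walk(10), walk(7));   // 1111 1100 1000
}
```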
Do statement¶
The do statement first evaluates its body (inner statement), then evaluates the cond expression. If the cond expression evaluates to true, jumps back into the body and repeats the process.
Unreachable code¶
The statement after a do is considered unreachable if the cond expression cannot ever be false
and there is no break out of the do.
Break¶
break will exit the do with execution continuing on the following statement.
Continue¶
continue will jump directly to the evaluation of the cond, as if the end of the statement had been reached.
Do block¶
If no while part exists, it will only execute the block once, as if it ended with while (false). This is
called a "do block".
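Both forms can be sketched side by side. The functions `digits` and `clamp_small` are hypothetical helpers for this illustration.

```c3
import std::io;

// Regular do statement: the body runs at least once.
fn int digits(int n)
{
    int count = 0;
    do
    {
        count++;
        n /= 10;
    } while (n != 0);
    return count;
}

// Do block: runs once, and break leaves it early.
fn int clamp_small(int x)
{
    int result = x;
    do
    {
        if (x < 10) break;
        result = 10;
    };
    return result;
}

fn void main()
{
    io::printfn("%d %d", digits(0), digits(1234));           // 1 4
    io::printfn("%d %d", clamp_small(3), clamp_small(50));   // 3 10
}
```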
For statement¶
The for statement will perform the (optional) init expression. The cond expression will then be tested. If
it evaluates to true then the body will execute, followed by the incr expression. After this, execution will
jump back to the cond expression and repeat until the cond expression evaluates to false.
for_stmt ::= "for" label? "(" init_expr ";" cond_expr? ";" incr_expr ")" stmt
init_expr ::= decl_expr_list?
incr_expr ::= expr_list?
Init expression¶
The init expression is only executed once before the rest of the for loop is executed. Any declarations in the init expression will be in scope until the for loop exits.
The init expression may optionally be omitted.
Incr expression¶
The incr expression is evaluated before evaluating the cond expr every time except for the first one.
The incr expression may optionally be omitted.
Cond expression¶
The cond expression is evaluated every loop. Any declaration in the cond expression is scoped to the current loop, i.e. it will be reinitialized at the start of every loop.
The cond expression may optionally be omitted. This is equivalent to setting the cond expression to
always return true.
Unreachable code¶
The statement after a for is considered unreachable if the cond expression cannot ever be false, or is
omitted and there is no break out of the loop.
Break¶
break will exit the for with execution continuing on the following statement after the for.
Continue¶
continue will jump directly to the evaluation of the cond, as if the end of the statement had been reached.
Equivalence of while and for¶
A while loop is functionally equivalent to a for loop without init and incr expressions.
foreach and foreach_r statements¶
The foreach statement will loop over a sequence of values. The foreach_r is equivalent to
foreach but the order of traversal is reversed.
foreach starts with element 0 and proceeds step by step to element len - 1.
foreach_r starts with element len - 1 and proceeds step by step to element 0.
foreach_stmt ::= "foreach" label? "(" foreach_vars ":" expr ")" stmt
foreach_r_stmt ::= "foreach_r" label? "(" foreach_vars ":" expr ")" stmt
foreach_vars ::= (foreach_index ",")? foreach_var
foreach_var ::= type? "&"? IDENTIFIER
Break¶
break will exit the foreach statement with execution continuing on the following statement after.
Continue¶
continue will cause the next iteration to commence, as if the end of the statement had been reached.
Iteration by value or reference¶
Normally iteration is by value: each element is copied into the foreach variable. If &
is added before the variable name, the elements will be retrieved by reference instead, and consequently
the type of the variable will be a pointer to the element type.
Foreach variable¶
The foreach variable may omit the type. In this case the type is inferred. If the type differs from the element type, then an implicit conversion will be attempted. Failing this is a compile time error.
Foreach index¶
If a variable name is added before the foreach variable, then this variable will receive the index of the element.
For foreach_r this means that the first value of the index will be len - 1.
The index type defaults to usz.
If an optional type is added to the index, the index will be converted to this type. The type must be an integer type. The conversion happens as if the conversion was a direct cast. If the actual index value would exceed the maximum representable value of the type, this does not affect the actual iteration, but may cause the index value to take on an incorrect value due to the cast.
For example, if the optional index type is char and the actual index is 256, then the index value would show 0
as (char)256 evaluates to zero.
Modifying the index variable will not affect the foreach iteration.
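A short sketch of iteration by value, by reference and with an index. The helper names `weighted` and `double_all` are hypothetical.

```c3
import std::io;

// Index with an explicit int type, element type inferred.
fn int weighted(int[] values)
{
    int total = 0;
    foreach (int i, v : values) total += i * v;
    return total;
}

// Iteration by reference: v is an int* pointing at each element.
fn void double_all(int[] values)
{
    foreach (&v : values) *v *= 2;
}

fn void main()
{
    int[3] values = { 10, 20, 30 };

    foreach (i, v : values) io::printfn("%d: %d", i, v);     // index 0, 1, 2
    foreach_r (i, v : values) io::printfn("%d: %d", i, v);   // index 2, 1, 0

    io::printfn("%d", weighted(&values));   // 0*10 + 1*20 + 2*30 = 80
    double_all(&values);
    io::printfn("%d", values[2]);           // 60
}
```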
Foreach support¶
Foreach is natively supported for any slice, array, pointer to an array, vector and pointer to a vector. These types support both iteration by value and reference.
In addition, a type with operator overload for len and [] will support iteration by value,
and a type with operator overload for len and &[] will support iteration by reference.
Return statement¶
The return statement evaluates its expression (if present) and returns the result.
Jumps in return statements¶
If the expression should in itself cause an implicit return, for example due to the rethrow operator !, then this
jump will happen before the return.
An example:
return foo()!;
// is equivalent to:
int temp = foo()!;
return temp;
Empty returns¶
An empty return is equivalent to a return with a void type. Consequently constructs like foo(); return;
and return (void)foo();
are equivalent.
Unreachable code¶
Any statement directly following a return in the same scope is considered unreachable.
While statement¶
The while statement evaluates the cond expression and executes the statement if it evaluates to true. After this the cond expression is evaluated again and the process is repeated until cond expression returns false.
Unreachable code¶
The statement after a while is considered unreachable if the cond expression cannot ever be false
and there is no break out of the while.
Break¶
break will exit the while with execution continuing on the following statement.
Continue¶
continue will jump directly to the evaluation of the cond, as if the end of the statement had been reached.
Var statement¶
A var statement declares a variable with inferred type, or a compile time type variable. It can be used both for runtime and compile time variables. The use for runtime variables is limited to macros.
Inferring type¶
In the case of a runtime variable, the type is inferred from the expression. Not providing an expression is a compile time error. The expression must resolve to a runtime type.
For compile time variables, the expression is optional. The expression may resolve to a runtime or compile time type.
Scope¶
Runtime variables will follow the runtime scopes, identical to behaviour in a declaration statement. The compile
time variables will follow the compile time scopes which are delimited by scoping compile time
statements ($if, $switch,
$foreach and $for).
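As a sketch, the hypothetical macro `square` below uses var for a runtime variable, and main uses var for a compile time type variable. This assumes that compile time variables may be declared in ordinary functions, as the restriction above is stated only for runtime variables.

```c3
import std::io;

macro square(x)
{
    var tmp = x;      // runtime variable, type inferred from x
    return tmp * tmp;
}

fn void main()
{
    var $Type = int;  // compile time type variable
    $Type value = square(7);
    io::printfn("%d", value);   // 49
}
```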
Macros¶
TODO
Attributes¶
Attributes are modifiers attached to modules, variables, type declarations etc.
| name | used with |
|---|---|
| @align | fn, const, variables, user-defined types, struct member |
| @benchmark | module, fn |
| @bigendian | bitstruct only |
| @builtin | macro, fn, global, constant |
| @callconv | fn, call |
| @deprecated | fn, macro, interface, variables, constants, user-defined types, struct member |
| @dynamic | fn |
| @export | fn, globals, constants, struct, union, enum, faultdef |
| @cname | fn, globals, constants, user-defined types, faultdef |
| @if | all except local variables and calls |
| @inline | fn, call |
| @interface | fn |
| @littleendian | bitstruct only |
| @local | module, fn, macro, globals, constants, user-defined types, attributes and aliases |
| @maydiscard | fn, macro |
| @mustinit | variables |
| @naked | fn |
| @nodiscard | fn, macro |
| @noinit | user-defined types |
| @noinline | fn, call |
| @noreturn | fn, macro |
| @nostrip | fn, globals, constants, struct, union, enum, faultdef |
| @obfuscate | enum, faultdef |
| @operator | fn, macro |
| @optional | interface methods |
| @overlap | bitstruct only |
| @packed | struct, union |
| @priority | initializer/finalizer |
| @private | module, fn, macro, globals, constants, user-defined types, attributes and aliases |
| @public | module, fn, macro, globals, constants, user-defined types, attributes and aliases |
| @pure | call |
| @reflect | fn, globals, constants, user-defined types |
| @section | fn, globals, constants |
| @test | module, fn |
| @unused | all except call and initializer/finalizers |
| @used | all except call and initializer/finalizers |
| @weak | fn, globals, constants |
| @winmain | fn |
@deprecated¶
Takes an optional constant string. If the node is in use, print the deprecation and add the optional string if present.
@optional¶
Marks an interface method as optional, and so does not need to be implemented by a conforming type.
@winmain¶
Marks a main function as a win32 winmain function, which is the entrypoint for a windowed
application on Windows. This allows the main function to take a different set of
arguments than usual.
@callconv¶
@callconv can be used with a function or a call. It takes a constant string which is either "veccall", "stdcall" or "cdecl". If more than one @callconv
is applied to a function or call, the last one takes precedence.
By default, the call convention is "cdecl".
User defined attributes¶
User defined attributes group a list of attributes.
attribute_decl ::= "attrdef" AT_TYPE_IDENT ("(" parameters ")")? attribute* "=" "{" attribute* "}" ";"
Empty list of attributes¶
The list of attributes may be empty.
Parameter arguments¶
Arguments given to user defined attributes will be passed on to the attributes in the list.
Expansion¶
When a user defined attribute is encountered, its list of attributes is copied and appended instead of the user defined attribute. Any argument passed to the attribute is evaluated and passed as a constant by the name of the parameter to the evaluation of the attribute parameters in the list.
Nesting¶
A user defined attribute can contain other user defined attributes. The definition may not be cyclic.
Methods¶
Operator overloading¶
@operator overloads may only be added to user defined types (typedef, unions, struct, enum and fault).
Indexing operator ([])¶
This requires a return type and a method parameter, which is the index.
Reference indexing operator (&[])¶
This requires a return type and a method parameter, which is the index. If [] is also implemented,
&[] should return a pointer to the type returned by [].
Assigning index operator (=[])¶
This has a void return type, and index should match that of [] and &[]. Value should match that
of [] and be the pointee of the result of &[].
Len operator (len)¶
This must have an integer return type.
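A minimal sketch of the indexing, reference indexing and len overloads on a hypothetical `Ring` struct. With these in place the type can be subscripted and used in foreach, both by value and by reference.

```c3
import std::io;

struct Ring
{
    int[4] data;
    usz count;
}

fn int Ring.get(&self, usz index) @operator([])
{
    return self.data[index];
}

// Returns a pointer to the same type that [] returns.
fn int* Ring.get_ref(&self, usz index) @operator(&[])
{
    return &self.data[index];
}

fn usz Ring.len(&self) @operator(len)
{
    return self.count;
}

fn void main()
{
    Ring r = { .data = { 1, 2, 3, 0 }, .count = 3 };
    foreach (&v : r) *v += 10;                        // uses len and &[]
    foreach (i, v : r) io::printfn("%d: %d", i, v);   // uses len and []
    io::printfn("%d", r[1]);                          // 12
}
```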
Dynamic methods¶
@dynamic may be used on methods for any type except any and interfaces.
Built-in functions¶
Modules¶
Module paths are hierarchical, with each sub-path appended with '::' + the name, for example std::io::file.
Each module declaration starts its own module section. All imports and all @local declarations
are only visible in the current module section.
module_section ::= "module" path opt_generic_params? attributes? ";"
generic_param ::= TYPE_IDENT | CONST_IDENT
opt_generic_params ::= "{" generic_param ("," generic_param)* "}"
Any visibility attribute defined in a module section will be the default visibility in all declarations in the section.
If the @benchmark attribute is applied to the module section then all function declarations
will implicitly have the @benchmark attribute.
If the @test attribute is applied to the module section then all function declarations
will implicitly have the @test attribute.
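The sketch below puts two module sections in one file, using the hypothetical module names `demo::util` and `demo::app`. Each section has its own imports, so `std::io` must be imported again in the second section.

```c3
// First module section.
module demo::util;

fn int twice(int x)
{
    return x * 2;
}

// Second module section: imports from the first section are not visible here.
module demo::app;
import std::io;
import demo::util;

fn void main()
{
    io::printfn("%d", util::twice(21));   // 42
}
```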
Generic modules¶
TODO
Program initialization¶
TODO
Optionals and faults¶
TODO
FAQ
FAQ
Standard library¶
Q: What are the most fundamental modules in the standard library?
A: By default C3 will implicitly import anything in std::core into
your files. It contains string functions, allocators and conveniences for
doing type introspection. The latter is in particular useful when writing
contracts for macros:
- std::core::array functions for working with arrays.
- std::core::builtin contains functions that are to be used without a module prefix; unreachable(), bitcast(), @catch() and @ok() are especially important.
- std::core::cinterop contains types which will match the C types on the platform.
- std::core::dstring has the dynamic string type.
- std::core::mem contains malloc etc., as well as functions for atomic and volatile load / store.
- std::core::string has all string functionality, including conversions, splitting and searching strings.
Aside from the std::core module, std::collections is important as it
holds various containers. Of those the generic List type in std::collections::list
and the HashMap in std::collections::map are very frequently used.
IO is a must, and std::io contains std::io::file for working with files,
std::io::path for working with paths. std::io itself contains
functionality for writing to streams in various ways. Useful streams can
be found in the stream sub folder.
Also of interest could be std::net for sockets, std::threads for
platform independent threads, std::time for dates and timers, libc for
invoking libc functions, std::os for working with OS specific code and
std::math for math functions and vector methods.
Q: How do strings work?
(see Strings for more info.)
A: C3 defines a native string type String, which is a typedef char[]. Because
char[] is essentially a pointer + length, some care has to be taken to
ensure that the pointer is properly managed.
For dynamic strings, or as a string builder, use DString. To get a String from
a DString you can either get a view using str_view() or make a copy using copy_str().
In the former case, the String may become invalid if DString is then mutated.
ZString is a zero-terminated typedef char*. It is used to model zero-terminated
strings like in C. It is mostly useful when interfacing with C.
WString is a Char16*, useful on those platforms, like Win32, where this
is the common unicode format. Like ZString, it is mostly useful when interfacing
with C.
Language features¶
Q: How do I use slices?
(see Arrays/Slice for more info.)
A: Slices are typically preferred in any situation where one in C would pass a pointer + length. It is a struct containing a pointer + a length.
Given an array, pointer or another slice you use either [start..end]
or [start:len] to create it:
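A minimal sketch of both forms (the array contents and the helper sum are made up for illustration; please verify against your compiler version):

```c3
// Hypothetical helper: sums any int slice, whatever storage backs it.
fn int sum(int[] values)
{
    int total = 0;
    foreach (v : values)
    {
        total += v;
    }
    return total;
}

fn void main()
{
    int[10] a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    int[] b = a[3..6];   // Elements at index 3, 4, 5 and 6 (the end is inclusive)
    int[] c = a[3:4];    // The same four elements: start 3, length 4
    int* p = &a;
    int[] d = p[3..6];   // Slicing a pointer: the end must be given explicitly
    assert(b.len == 4 && c.len == 4 && d.len == 4);
    assert(sum(b) == 18 && sum(c) == 18 && sum(d) == 18);
}
```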
You can also just pass a pointer to an array:
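For instance, in this sketch (the function first_plus_last is hypothetical) the pointer to the array converts implicitly to a slice covering the whole array:

```c3
// Hypothetical helper: ^1 indexes the last element of the slice.
fn int first_plus_last(int[] values) => values[0] + values[^1];

fn void main()
{
    int[4] a = { 10, 20, 30, 40 };
    int[] s = &a;                        // Equivalent to a[0..3]
    assert(s.len == 4);
    assert(first_plus_last(&a) == 50);   // The pointer converts to int[] at the call
}
```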
The start and/or end may be omitted:
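A short sketch with a five-element array (the function name and values are illustrative only):

```c3
// Hypothetical helper: total length of three slices with omitted bounds.
fn usz omitted_bounds_total()
{
    int[5] a = { 1, 2, 3, 4, 5 };
    int[] head = a[..2];   // Same as a[0..2]
    int[] tail = a[3..];   // Same as a[3..4]
    int[] all = a[..];     // Same as a[0..4]
    return head.len + tail.len + all.len;   // 3 + 2 + 5
}

fn void main()
{
    assert(omitted_bounds_total() == 10);
}
```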
It is possible to use ranges to assign:
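Sketched below are both kinds of range assignment, a single value to a range and one range copied to another (the function name and values are made up for illustration):

```c3
// Hypothetical helper that returns the modified array by value.
fn int[5] fill_and_copy()
{
    int[5] a = { 1, 2, 3, 4, 5 };
    a[1..2] = 9;          // a is now { 1, 9, 9, 4, 5 }
    a[3..4] = a[0..1];    // Copy a[0], a[1] into a[3], a[4]: { 1, 9, 9, 1, 9 }
    return a;
}

fn void main()
{
    int[5] r = fill_and_copy();
    assert(r[1] == 9 && r[2] == 9);
    assert(r[3] == 1 && r[4] == 9);
}
```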
It is important to remember that the lifetime of a slice is the same as the lifetime of its underlying pointer:
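The sketch below shows the pitfall in a comment, plus one possible way around it: let the caller own the storage. The function names are hypothetical and this is only one of several workable patterns.

```c3
// Unsafe (shown as a comment only): the slice points at a local array
// that no longer exists once the function has returned.
//
// fn int[] dangling()
// {
//     int[3] x = { 1, 2, 3 };
//     return x[..];
// }

// Safer sketch: the caller provides the buffer, so the returned slice
// lives exactly as long as the caller's storage does.
fn int[] fill_with_indices(int[] buffer)
{
    foreach (i, &v : buffer)
    {
        *v = (int)i;
    }
    return buffer;
}

fn void main()
{
    int[3] storage;
    int[] filled = fill_with_indices(&storage);
    assert(filled.len == 3);
    assert(filled[2] == 2);
}
```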
Q: How do I pass varargs to another function that takes varargs?
A: Use the splat operator, written ... in front of the argument:
import std::io;
fn void test(String format, args...)
{
    // Forward all received arguments to printfn
    io::printfn(format, ...args);
}
fn void main()
{
    test("Format: %s %d", "Foo", 123);
}
Q: What are vectors?
(see Vectors for more info.)
A: Vectors are similar to arrays, but declared with [< >] rather than [ ]. The element type may also only be of integer, floating point, bool or pointer types. Vectors are backed by SIMD types on supported platforms. Arithmetic operators available on the element type are also available on the vector as a whole and are performed element-wise, thus enabling more convenient vector math. For example:
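A small sketch of element-wise arithmetic (the function name advance and the values are illustrative, not part of the standard library):

```c3
// Hypothetical helper: adds the two vectors component by component.
fn int[<2>] advance(int[<2>] pos, int[<2>] speed)
{
    return pos + speed;
}

fn void main()
{
    int[<2>] pos = { 1, 3 };
    int[<2>] speed = { 5, 7 };
    pos += speed;                          // pos is now { 6, 10 }
    assert(pos[0] == 6 && pos[1] == 10);
    int[<2>] next = advance(pos, speed);   // { 11, 17 }
    assert(next[0] == 11 && next[1] == 17);
}
```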
Swizzling (shorthand for rearranging vector components, which is commonly used in graphics and game programming) is also supported:
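As a sketch, assuming the component names x and y apply to a two-element vector (the function name is hypothetical), a swizzle can reorder and repeat components:

```c3
// Hypothetical helper: builds a 3-element vector from components y, x, x.
fn int[<3>] swizzle_yxx(int[<2>] v)
{
    return v.yxx;
}

fn void main()
{
    int[<2>] pos = { 6, 10 };
    int[<3>] mixed = swizzle_yxx(pos);   // { 10, 6, 6 }
    assert(mixed[0] == 10 && mixed[1] == 6 && mixed[2] == 6);
}
```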
Any scalar value will be expanded to the vector size:
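For example, in this sketch (function names and values are illustrative) the scalar is treated as a vector with the same value in every lane:

```c3
// Same as v * { 2, 2 }
fn int[<2>] doubled(int[<2>] v) => v * 2;

// Same as { 12, 12 } / v
fn int[<2>] twelve_divided_by(int[<2>] v) => 12 / v;

fn void main()
{
    int[<2>] speed = { 3, 4 };
    int[<2>] a = doubled(speed);             // { 6, 8 }
    int[<2>] b = twelve_divided_by(speed);   // { 4, 3 }
    assert(a[0] == 6 && a[1] == 8);
    assert(b[0] == 4 && b[1] == 3);
}
```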
Memory management¶
Q: How do I work with memory?
A: There is malloc, calloc and free just like in C. The main difference is that these will invoke whatever
the current heap allocator is, which does not need to be the allocator provided by libc. You can get the current heap
allocator using mem and do allocations directly. There is also a temporary allocator.
Convenience functions are available for allocating particular types: mem::new(Type) would allocate a single Type
on the heap and zero initialize it. mem::alloc(Type) does the same but without zero initialization.
Alternatively, mem::new can take a second initializer argument:
Foo* f1 = malloc(Foo.sizeof); // No initialization
Foo* f2 = calloc(Foo.sizeof); // Zero initialization
Foo* f3 = mem::new(Foo); // Zero initialization
Foo* f4 = mem::alloc(Foo); // No initialization
Foo* f5 = mem::new(Foo, { 4, 10.0, .a = 123 }); // Initialized to argument
For arrays mem::new_array and mem::alloc_array work in corresponding ways:
Foo* foos1 = malloc(Foo.sizeof * len); // No initialization
Foo* foos2 = calloc(Foo.sizeof * len); // Zero initialization
Foo[] foos3 = mem::new_array(Foo, len); // Zero initialization
Foo[] foos4 = mem::alloc_array(Foo, len); // No initialization
Regardless of how they are allocated, they can be freed using free().
Q: How does the temporary allocator work?
A: The temporary allocator is a kind of stack allocator. tmalloc, tcalloc and trealloc correspond to malloc, calloc and realloc. There is no free, as temporary allocations are freed when the entire pool (a.k.a. arena) of temporary objects is released all at once (making it both very easy to use and extremely performant). You use the @pool() macro to create a temporary allocation scope. When execution exits this scope, the temporary objects within it are all freed automatically. For example:
@pool()
{
    void* some_mem = tmalloc(128);
    foo(some_mem);
};
// Temporary allocations are automatically freed here.
Similar to the heap allocator, there is also mem::tnew, mem::temp_alloc, mem::temp_array and mem::temp_alloc_array,
which all work like their heap counterparts.
Q: How can I return a temporarily allocated object from inside a temporary allocation scope?
A: You need to pass in a copy of the temp allocator outside of @pool and allocate explicitly
using that allocator.
// Store the temp allocator
Allocator temp = tmem;
@pool()
{
    // Note, 'temp != tmem' here!
    void* some_mem = tmalloc(128);
    // Allocate this on the external temp allocator
    Foo* result = allocator::new(temp, Foo);
    // The variable is named "result" so that it does not shadow the function foo()
    result.z = foo(some_mem);
    // Now "some_mem" will be released,
    // but the memory pointed to by "result" is still valid.
    return result;
};
Interfacing with C code¶
(see C Interoperability for more info.)
Q: How do I call a C function from C3?
A: Just copy the C function declaration and prefix it with extern (and don’t forget the fn as well).
Imagine for example that you have the function double test(int a, void* b). To call it from C3 just declare
extern fn double test(CInt a, void* b) in the C3 code.
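As a hedged, self-contained sketch of the same idea, the example below declares atoi from the C standard library instead of the hypothetical test function above. It assumes libc is linked and that a string literal converts implicitly to ZString.

```c3
import std::io;

// C declaration: int atoi(const char *s);
// CInt matches C's int on the platform, ZString models a zero-terminated char*.
extern fn CInt atoi(ZString s);

// Hypothetical wrapper, only here to show a call site.
fn int parse_answer() => (int)atoi("42");

fn void main()
{
    io::printfn("%d", parse_answer());
}
```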
Q: My C function / global has a name that doesn't conform to the C3 name requirements. Just extern fn doesn't work.
A: In this case you need to give the function a C3-compatible name and then use the @cname attribute to
indicate its actual external name. For example, the function int *ABC(void *x) could be declared in the C3 code as
extern fn int* abc(void* x) @cname("ABC").
There are many examples of this in the std::os modules.
Patterns¶
Q: When should I put functionality in a method versus a free function?
A: In the C3 standard library, free functions are preferred unless the function is only acting on the particular
type. Some exceptions exist, but prefer things like io::fprintf(file, "Hello %s", name) over
file.fprintf("Hello %s", name). The former also has the advantage that it's easier to extend to work with many
types.
Q: Are there any naming conventions in the standard library that one should know about?
A: Yes. A function or method with new in the name will in general do one or more allocations and can take an
optional allocator. A function or method with temp in the name will usually allocate using the temp allocator.
The method free will free all memory associated with a type. destroy is similar to free but also indicates
that other resources (such as file handles) are released. In some cases close is used instead of destroy.
Function and variable names use snake_case (all lower case with _ separating words).
Q: How do I create overloaded methods?
A: This can be achieved with macro methods.
Imagine you have two methods:
fn void Obj.func1(&self, String... args) @private {} // vaargs variant
fn void Obj.func2(&self, Foo* pf) @private {} // Foo pointer variant
We can now create a macro method on Obj which compiles to different calls depending on arguments:
// The macro must be vararg, since the functions take different amount of arguments
macro void Obj.func(&self, ...)
{
    // Does it have a single argument of type 'Foo*'?
    $if $vacount == 1 &&& @typeis($vaarg[0], Foo*):
        // If so, dispatch to func2
        return self.func2($vaarg[0]);
    $else
        // Otherwise, dispatch all vaargs to func1
        return self.func1($vasplat);
    $endif
}
The above would make it possible to use both obj.func("Abc", "Def") and obj.func(&my_foo). (&&& is the compile-time counterpart of &&: the right hand side is only type-checked if the left hand side is true. In this case, $vaarg[0] is only checked if $vacount is 1.)
Platform support¶
Q: How do I use WASM?
A: Currently WASM support is really incomplete.
You can try this:
c3c compile --reloc=none --target wasm32 -g0 --link-libc=no --no-entry mywasm.c3
Unless you are compiling with something that already runs initializers,
you will need to call the function runtime::wasm_initialize() early in your
main or call it externally (for example from JS) with the name _initialize(),
otherwise globals might not be set up properly.
This should yield an out.wasm file, but there is no CI running on the WASM code
and no one is really using it yet, so the quality is low.
We do want WASM to be working really well, so if you're interested in writing something in WASM please reach out to the C3 development team and we'll help you get things working.
Q: How do I conditionally compile based on compiler flags?
A: You can pass feature flags on the command line using -D SOME_FLAG or using the features key
in the project file.
You can then test for them using $feature(FLAG_NAME):
int my_var @if($feature(USE_MY_VAR));
fn int test()
{
    $if $feature(USE_MY_VAR):
        return my_var;
    $else
        return 0;
    $endif
}
Syntax & Language design¶
Q: Why does C3 require that types start with upper case but functions with lower case?
A: C's grammar is ambiguous. Usually compilers implement the so-called lexer hack, but other methods exist as well, such as delayed parsing. It is also possible to make it unambiguous using infinite lookahead.
However, all of those methods make it much harder for tools to search the source code accurately. By making the naming convention part of the grammar, C3 is straightforward to parse with a single token lookahead.
Q: Can't you relax C3's naming rules?
A: It is a common misunderstanding that the naming rules are something enforced by the semantic analyzer. This is not true: it is a lexer rule, to be able to distinguish between types and other identifiers.
It is the only way to make a C grammar parsable with only 1 token lookahead. All other approaches add significant complexity to work around this, and often they still rely on C being parsed in order, top to bottom.
Consequently, the answer is a strong NO. There is no way to "relax" the rules, because they are fundamental to making a C-like grammar parsable.
Either C3 has int a = 2; with these rules, or it gains some alternative
variable declaration syntax like var a : int = 2;. But in the latter case, the changes would not end there since
the declaration syntax also strongly shapes struct declarations, for-statements and other things to the point that
no one would recognize it as a C evolution anyway. So it's a non-starter.
Q: Why are there no closures and only non-capturing lambdas?
A: With closures, life-time management of captured variables becomes important to track. This can become arbitrarily complex, and without RAII or any other memory management technique it is fairly difficult to make code safe. Non-capturing lambdas on the other hand are fairly safe.
Q: Why is it called C3 and not something better?
A: Naming a programming language isn't easy. Most programming languages have pretty bad names, and while C3 isn't the best, no real better alternative has come along.
Q: Why are there no static methods?
A: Static methods create a tension between free functions in modules and functions namespaced by the type. Java for example, resolves this by not having free functions at all. C3 resolves it by not having static methods (nor static variables). Consequently more functions become part of the module rather than the type.
Q: Why do macros with trailing bodies require ; at the end?
A: All macro calls, including those with a trailing body, are expressions, so it would be ambiguous to let them terminate a statement without a much more complicated grammar. An example:
// How can the parser determine that the
// last `}` ends the expression? (And does it?)
int a = @test() {} + @test() {}
*b = 123;
// In comparison, the grammar for this is easy:
int a = @test() {} + @test() {};
*b = 123;
C3 strives for a simple grammar, and so the trade-off of having to use ; was a fairly low price to pay for this feature.
Q: Why does C3 choose to call Optional "Optional" and not "Result"?
A: C3's optional has properties both from the traditional "Maybe" and "Result". While it carries two possible values,
like a Result, it is trivially composable in the way optionals are. In the "Result" case, we cannot implicitly combine
Result<int, MyError> and Result<int, YourError>, which is also often reflected in the support for them.
For the "Maybe" it is trivial, so we see how languages do things like "Optional Chaining". C3 even goes beyond that, and implements implicit "flat map" for operations with its Optional.
For Result or multiple returns, it's also not guaranteed how big the error value can be. For C3 the size is defined to be pointer-sized to minimize overhead.
This means that using an Optional is not heavier than using a "Maybe" or even a boolean return in some other language.
For these reasons, the Optional leans more towards "Maybe" usage than "Result", and the name was chosen to nudge towards the correct use.
Q: Why doesn't C3 have a tagged union?
A: Tagged unions are great, but there is still discussion of what it should look like if it was included in C3.
See this issue for more details.
Q: Why is the declaration of arrays swapped compared to C?
A: C3 types are declared with the innermost part to the left and the outermost to the right. Indexing or dereferencing will peel off the rightmost part.
C does this differently: * and [] are placed not on the type but on the variable, in the order it must be unpacked.
So given int (*foo)[4] we first dereference it (from inside) to get int[4], then index from the right.
If we wanted to extract a standalone type from this, we'd have int(*)[4] for a pointer to an array of 4 integers.
For "left is innermost", the declaration would instead be int[4]*. With left-is-innermost we can easily describe a pointer
to an array of int pointers (which happens in C3 since arrays don't implicitly decay): int*[4]*. In C that would be "int*(*)[4]",
which is generally regarded as less easy to read, not least because you need to think about which of * or [] has priority.
In C3, we can have a variable List{int}[3] x, which is an array of 3 List{int}. If we do x[1] we will get the middle element
of the array, which is a List{int}. If we then further index this with [5], like x[1][5], we will get
the element at index 5 of that list.
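A minimal sketch of the "peel off the rightmost part" rule, with the C spelling of each type in comments (variable and function names are made up for illustration):

```c3
// int[4]* reads left to right: int, array of 4, pointer.
// Dereferencing peels off the '*', indexing then peels off the '[4]'.
fn int third_via_pointer(int[4]* p) => (*p)[2];

fn void main()
{
    int[4] arr = { 1, 2, 3, 4 };
    int[4]* p = &arr;                      // C: int (*p)[4]
    int*[2] ptrs = { &arr[0], &arr[3] };   // C: int *ptrs[2]
    assert(third_via_pointer(p) == 3);
    assert(*ptrs[1] == 4);                 // Index first, then dereference
}
```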
Q: Why does C3 use :: to separate namespaces and not .?
A: . is nice to type and read, but there are challenges. In particular, C3's "path shortening", where you're allowed to write
file::open("foo.txt") rather than having to use the full std::io::file::open("foo.txt") is only made possible because
the namespace is distinct at the grammar level. If we play with changing the syntax because it isn't as elegant as
file.open("foo.txt"), we'd have to pay by actually writing std.io.file.open("foo.txt") or change to a flat module system.
One can also note that if . is used, then something like file.open("foo.txt") would be ambiguous if there was both a module file
and a variable file in the scope.
Choices in tooling¶
Q: Why does C3 have comments in the default project.json?
A: This was done as a way for users to understand what the various fields were used for. While this would more properly be
a .json5 file, many json parsers could ignore comments anyway. As tooling improves, comments will be phased out. Already it's
possible to manipulate the project file from the command line.
Q: Why does C3 use JSON for project.json, why not YAML, TOML or something else?
A: JSON is a format with a parser in almost any language, plus it is straightforward to write a parser for.
Originally C3 used TOML, which is great for manual configs. However, we moved away from this exactly because the project files should be easily generated and manipulated by tools, with no strict requirement that formatting remains the same.
If tools manipulated hand-written TOML, there would be an expectation to retain formatting and comments in the same style, which would put a burden on tool writers.
Q: Will C3 have a package manager?
A: There will be some standard API for uploading and downloading C3 libraries. However, it will not be a full dependency manager. In an attempt to limit over-use of dependencies, each dependency will need to be downloaded separately, rather than automatically.
See, for example, the discussion here.
Cross-compiling To Windows From Linux¶
Q: How do I cross-compile my C3 program for Windows on Linux?
A: With the C3 compiler you can specify which target you would like to cross-compile to. For Windows the following target would be needed:
c3c compile main.c3 --target windows-x64
This requires the MSVC SDK components, which c3c automatically downloads and configures, including the Windows SDK files needed to enable cross-compilation to Windows.
Changes from C¶
Q: Why does C3 have zero initialization for local variables?
A: There are several reasons:
- In the "zero-is-initialization" paradigm, zeroing variables, in particular structs, is very common. By offering zero initialization by default this avoids a whole class of vulnerabilities.
- Another alternative that was considered for C3 was mandatory initialization, but this adds a lot of extra boilerplate.
- C3 also offers a way to opt out of zero-initialization, so the change comes at no performance loss.
Q: Why is the const qualifier removed?
A: "const correctness" requires littering const across the code base. Although const is useful, it provides weaker guarantees than it appears.
Q: Why was the 0777 octal format removed?
A: C's octal syntax looks too much like base 10 with leading zeros prepended (and is sometimes used outside of C to represent fixed-width base 10 numbers). Removing such ambiguous octal syntax prevents a common source of subtle numerical errors in C.
Q: Why is goto gone?
A: It is very difficult to make goto work well with defer and implicit unwrapping of optional results. It is not just making the compiler harder to write, but
the code is harder to understand as well. The replacements together with defer cover many if not all usages of goto in regular code.
All Features
Here is a summary of all the features of C3 and how it differs from C.
Symbols and literals¶
Changes relating to literals, identifiers etc.
Added¶
- 0o prefix for octal.
- 0b prefix for binary.
- Optional _ as digit separator.
- Hexadecimal byte data, e.g. x"abcd".
- Base64 byte data, e.g. b64"QzM=".
- Type name restrictions (PascalCase).
- Variable and function name restrictions (must start with lower case letter).
- Constant name restrictions (no lower case).
- Character literals may be 2, 4, 8, 16 bytes long. (2cc, 4cc etc).
- Raw string literals between "`".
- \e escape character.
- Source code must be UTF-8.
- Assumes \n for newlines. \r is stripped from source.
- Integer suffixes are fixed size: L, UL is 64-bit literals on all platforms, LL and ULL are guaranteed 128-bit literals.
- The null literal is a pointer value of 0.
- The true and false are boolean constants true and false.
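A few of these literal forms in one place, as a sketch (variable names and values are arbitrary; verify against your compiler version):

```c3
fn void main()
{
    int octal = 0o17;             // 15
    int binary = 0b1010;          // 10
    int million = 1_000_000;      // '_' is only a visual separator
    char[*] bytes = x"abcd";      // Two bytes: 0xab, 0xcd
    char escape = '\e';           // ESC, 0x1b
    String raw = `a\nb`;          // Raw string: backslash and 'n' are kept as-is
    assert(octal == 15 && binary == 10 && million == 1000000);
    assert(bytes.len == 2 && bytes[0] == 0xab && bytes[1] == 0xcd);
    assert(escape == 0x1b);
    assert(raw.len == 4);
}
```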
Removed¶
- Trigraphs / digraphs.
- 0123-style octal.
- z, LL and ULL suffixes.
Built-in types¶
Added¶
- Type declaration is left to right: int[4]*[2] a; instead of int (*a[2])[4];
- Simd vector types using [<>] syntax, e.g. float[<4>], use [<*>] for inferred length.
- Slice type built in, using [] suffix, e.g. int[]
- typedef is similar to C's typedef but forms a new type. (Example: the String type is a new type with char[] internal representation)
- Built-in 128-bit integer on all platforms.
- char is an unsigned 8-bit integer. ichar is its signed counterpart.
- Well-defined bitwidth for integer types: ichar/char (8 bits), short/ushort (16 bits), int/uint (32 bits), long/ulong (64 bits), int128/uint128 (128 bits)
- Pointer-sized iptr and uptr integers.
- isz and usz integers corresponding to the size_t bitwidth.
- Optional types are formed using the ? suffix.
- bool is the boolean type.
- typeid is a unique type identifier for a type, it can be used at runtime and compile time.
- any contains a typeid and void* allowing it to act as a reference to any type of value.
- fault a constant representing an error (see below).
Changed¶
- Inferred array type uses [*] (e.g. int[*] x = { 1, 2 };).
- Flexible array member uses [*].
Removed¶
- The C "spiral rule" type declaration (see above).
- Complex types (implemented as user-defined types instead).
- size_t, ptrdiff_t (see above).
- Array types do not decay.
Types¶
Added¶
- bitstruct a struct with a container type allowing precise control over bit-layout, replacing bitfields and enum masks.
- fault a constant with unique values which are used together with optional.
- Vector types.
- Optional types.
- Operator overloading for arithmetics, bit operators and equality.
- enum allows a set of unique constants to be associated with each enum value.
- Compile time reflection and limited runtime reflection on types (see "Reflection")
- All types have a typeid property uniquely referring to that particular type.
- Distinct types, which are similar to aliases, but represent distinctly different types.
- Types may have methods. Methods can be added to any type, including built-in types.
- Subtyping: using inline on a struct member allows a struct to be implicitly converted to this member type and use corresponding methods.
- Using inline on a typedef allows it to be implicitly converted to its base type (but not vice versa).
- Types may add operator overloading to support foreach and subscript operations.
- Generic types through generic modules, using { ... } for the generic parameter list (e.g. List{ int } list;).
- Interface types and any types, which allow dynamic invocation of methods.
- Types may overload arithmetic operators to implement new numerical types.
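A compact sketch of several of these features together: struct subtyping with inline, methods, a typedef that forms a new type, and a bitstruct. All type and function names are invented for illustration.

```c3
typedef Meters = int;             // A new type with int as its representation

struct Animal
{
    String name;
    int legs;
}

struct Dog
{
    inline Animal base;           // Dog converts implicitly to Animal
    bool trained;
}

bitstruct Permissions : char
{
    bool read : 0;
    bool write : 1;
    bool exec : 2;
}

fn int Animal.leg_count(&self) => self.legs;
fn int Meters.doubled(self) => (int)self * 2;
fn int count_legs(Animal a) => a.legs;

fn void main()
{
    Dog d = { .base = { "Rex", 4 }, .trained = true };
    assert(d.leg_count() == 4);   // Method reached through the inline member
    assert(count_legs(d) == 4);   // Implicit conversion from Dog to Animal
    Meters m = 21;
    assert(m.doubled() == 42);
    Permissions p = { .read = true, .exec = true };
    assert(p.read && !p.write && p.exec);
}
```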
Changed¶
- C's typedef is replaced by alias and has somewhat different syntax (e.g. alias MyTypeAlias = int;).
- Function pointer syntax is prefixed by an fn and followed by a regular function declaration without the function name. For example, fn void(int) is the type for a function that takes an int and returns nothing. Named parameters and default arguments are also permitted, such as fn void(int num = 0).
- typedef in C3 creates a new type which can have its own methods, but shares the same common internal representation as the original type.
Removed¶
- Enums, structs and unions no longer have distinct namespaces.
- Enum, struct and union declarations should not have a trailing ';'
- alias can only be used at the top level, not inside a function.
- Anonymous structs are not allowed.
- Type qualifiers are all removed, including const, restrict, and volatile. However, const may be applied to compile-time values and each such constant must have its name written in ALL_CAPS.
- Function pointers types cannot be used "raw", but must always be used through a type alias.
Introspection¶
Compile time type methods: alignof, associated, elements, extnameof, inf, inner, kindof, len,
max, membersof, min, nan, names, params, returns, sizeof, typeid, values,
qnameof, is_eq, is_ordered.
Runtime type methods: inner, kind, len, names, sizeof.
Expressions¶
Added¶
- Array initializers may use ranges. (e.g. int[256] x = { [0..128] = 1 })
- ?: operator, returning the first value if it can be converted to a boolean true, otherwise the second value is returned.
- Optionals support an "or else" operator ?? returning the first value if it is a normal (non-fault) result or else the second value if the first value is an abnormal (fault-containing) Optional value. Thus, ?? provides a mechanism for returning default values when evaluations encounter problems.
- Rethrow ! suffix operator which implicitly returns the Optional value if it was an abnormal (fault-containing) Optional value.
- Dynamic calls, allowing function calls to be made on generic data of type any or to use interfaces as a dynamic dispatching mechanism.
- Create a slice using a range subscript (e.g. a[4..8] to form a slice from element 4 to element 8).
- Two range subscript methods: [start..inclusive_end] and [start:length]. Start, end and length may be omitted for default values.
- Indexing from end: slices, arrays and vectors may be indexed from the end using ^. ^1 represents the last element. This works for ranges as well.
- Range assignment, assign a single value to an entire range e.g. a[4..8] = 1;.
- Slice assignment: copy one range to the other range, e.g. a[4..8] = b[8..12];.
- Array, vector and slice comparison: == can be used to make an element-wise comparison of two containers.
- ~ suffix operator turns a fault into an optional value.
- !! suffix panics if the Optional value contains a fault.
- $defined(...) returns true if the outermost expression contained within it is defined. Sub-expressions must also be valid.
- Compile time "and" and "or" using &&& and |||. Both sides of the operator should be compile-time constants. If the left hand side of &&& is false, the right hand side is not type-checked. For ||| the right hand side is not type-checked if the left hand side is true.
- Lambdas (anonymous functions) may be defined. They work just like functions and do not capture any state (i.e. are not "closures", unlike in some other languages). Not capturing state makes it easier for C3 to retain a simpler lifetime model.
- Simple bitstructs (only containing booleans) may be manipulated using bit operations & ^ | ~ and assignment.
- Structs may implicitly convert to their inline member if they have one.
- Pointers to arrays may implicitly convert to slices.
- Any pointer may implicitly convert to an any containing the type of the pointee.
- An optional value will implicitly invoke “flatmap” on an expression it is a subexpression of.
- Swizzling for arrays and vectors. For example, to reverse a 3-element vector vec via swizzling you can use vec.xyz = vec.zyx;.
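A hedged sketch of the Optional-related operators from the list above. It assumes String.to_int() from the standard library returns an optional int; the function names are made up for illustration.

```c3
// "or else": fall back to a default when parsing fails.
fn int parse_or(String text, int fallback) => text.to_int() ?? fallback;

// Rethrow: '!' returns the fault to the caller if parsing fails.
fn int? parse_and_double(String text)
{
    int value = text.to_int()!;
    return value * 2;
}

// if (try ...): the unwrapped value is only available on the happy path.
fn int next_number(String text)
{
    if (try value = text.to_int())
    {
        return value + 1;
    }
    return 0;
}

fn void main()
{
    assert(parse_or("12", -1) == 12);
    assert(parse_or("abc", -1) == -1);
    assert(parse_and_double("21")!! == 42);          // '!!' would panic on a fault
    assert((parse_and_double("x") ?? -1) == -1);
    assert(next_number("9") == 10);
    assert(next_number("nine") == 0);
}
```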
Changed¶
- Operator precedence of bit operations is higher than + and -.
- Well-defined evaluation order: left-to-right, assignment after expression evaluation.
- sizeof is $sizeof and only works on expressions. Use Type.sizeof on types.
- alignof is $alignof for expressions. Types use Type.alignof.
- Narrowing conversions are only allowed if all sub-expressions are as small or smaller than the type.
- Widening conversions are only allowed on simple expressions (i.e. most binary expressions and some unary may not be widened).
Removed¶
- The comma operator is removed.
Functions¶
Added¶
- Functions may be called using named arguments. The name is the same as the parameter name, but followed by : in the call. For example: foo(name: a, len: 2).
- Typed vaargs are declared Type... argument, and will take 0 or more arguments of the given type.
- It is possible to "splat" an array or slice into the location of a typed vararg using ...: foo(a, b, ...list)
- any vaargs are declared as argument... (i.e. without a type in the function parameter list). Such a vaarg list can take 0 or more arguments of any type. All passed arguments are implicitly converted to the any type when the function is called.
- The function declaration may have @inline or @noinline as a default.
- Using @inline or @noinline on a function call expression will override the function default.
- Type methods are functions defined in the form fn void Foo.my_method(Foo* foo) { ... }. They can be invoked using dot syntax.
- Type methods may be attached to any type, even arrays and vectors.
- Error handling uses Optional return types, which are similar to tagged unions that either contain a valid result or a fault state.
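The sketch below illustrates typed vaargs, splatting, a named argument with a default value, and a type method. Every name in it is hypothetical.

```c3
// Typed vaargs: zero or more ints, seen as a slice inside the function.
fn int sum(int... nums)
{
    int total = 0;
    foreach (n : nums)
    {
        total += n;
    }
    return total;
}

// 'factor' has a default and can be passed by name.
fn int scale(int value, int factor = 2) => value * factor;

struct Counter
{
    int count;
}

// A type method, invoked with dot syntax.
fn void Counter.add(&self, int amount)
{
    self.count += amount;
}

fn void main()
{
    assert(sum() == 0);
    assert(sum(1, 2, 3) == 6);
    int[3] list = { 4, 5, 6 };
    assert(sum(...list) == 15);           // Splat the array into the vaargs
    assert(scale(3) == 6);
    assert(scale(3, factor: 5) == 15);    // Named argument
    Counter c;                            // Zero initialized
    c.add(2);
    c.add(3);
    assert(c.count == 5);
}
```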
Changed¶
- Function declarations use the fn prefix.
Removed¶
- Functions with C-style vaargs may be called, and declared as external functions, but not used for C3 functions.
Attributes¶
C3 adds a long range of attributes in the form @name(...). It is possible to create custom
attribute groups using attrdef (e.g. attrdef MyAttribute(usz align) = { @aligned(align) @weak };) which
groups certain attributes. Empty attribute groups are permitted.
The complete list: @align, @benchmark, @bigendian, @builtin,
@callconv, @deprecated, @dynamic, @export,
@cname, @if, @inline, @interface,
@littleendian, @local, @maydiscard, @mustinit, @naked,
@nodiscard, @noinit, @noreturn, @nostrip,
@obfuscate, @operator, @overlap, @priority,
@private, @public, @pure, @reflect,
@section, @test, @used, @unused.
Declarations¶
Added¶
- var declaration for type inferred variables in macros. E.g. var a = some_value;.
- var declaration for new type variables in macros. E.g. var $Type = int;.
- var declaration for compile time mutable variables in functions and macros. E.g. var $foo = 1;.
- const declarations may be untyped. Such constants are not stored in the resulting binary.
Changed¶
- tlocal declares a variable to be thread local.
- static top level declarations are replaced with @local. (static in functions is unchanged)
Removed¶
- restrict removed.
- atomic should be replaced by atomic load/store operations.
- volatile should be replaced by volatile load/store operations.
Statements¶
Added¶
- Match-style variant of the switch statement, which allows each case to hold an expression to test.
- Switching over type with typeid.
- asm blocks for inline assembly.
- nextcase to fallthrough to the next case.
- nextcase <expr> to jump to the case with the expression value. This may be an expression evaluated at runtime.
- nextcase default to jump to the default clause.
- Labelled while/do/for/foreach to use with break, nextcase and continue.
- foreach to iterate over arrays, vectors, slices and user-defined containers using operator overloading.
- foreach_r to iterate in reverse.
- foreach / foreach_r may take the element by value or reference. The index may optionally be provided.
- $if, $switch, $for, $foreach statements executing at compile time.
- $echo printing a message at compile time.
- $assert compile time assert.
- defer statement to execute statements at scope exit.
- defer catch and defer try, which are similar to defer but execute only if the Optional value contains a fault or a normal result respectively.
- do statements may omit while, behaving same as while (0).
- if may have a label. Labelled if may be exited using labelled break.
- if (try ...) statements run code when an expression is a "valid"/"normal" result (i.e. to handle the "happy path" when working with Optionals).
- if (catch ...) statements run code when an expression is an "invalid"/"abnormal" (fault-containing) result (i.e. to handle the "failure path" when working with Optionals). It can be used to implicitly unwrap variables.
- Exhaustive switching on enums.
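A short sketch of three of the statements listed above: foreach by reference, nextcase with a value, and defer ordering. Function names and values are illustrative only.

```c3
// foreach by reference modifies the elements in place.
fn int double_and_sum(int[] values)
{
    foreach (&v : values)
    {
        *v *= 2;
    }
    int total = 0;
    foreach (v : values)
    {
        total += v;
    }
    return total;
}

// Cases break implicitly; nextcase jumps explicitly.
fn int classify(int x)
{
    switch (x)
    {
        case 0:
            return 0;
        case 1:
            nextcase 3;      // Jump to "case 3"
        case 2:
            return 20;
        case 3:
            return 30;
        default:
            return -1;
    }
}

// Defers run at scope exit, in reverse order of declaration.
fn int defer_order()
{
    int x = 1;
    {
        defer x *= 10;   // Runs second
        defer x += 1;    // Runs first
    }
    return x;            // (1 + 1) * 10
}

fn void main()
{
    int[3] data = { 1, 2, 3 };
    assert(double_and_sum(&data) == 12);
    assert(classify(1) == 30 && classify(2) == 20 && classify(7) == -1);
    assert(defer_order() == 20);
}
```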
Changed¶
- Switch cases will have implicit break, rather than implicit fallthrough.
- assert is an actual statement and may take a string or a format + arguments.
- static_assert from C and C++ corresponds to $assert in C3 and is a statement.
Removed¶
- goto has been removed and replaced by labelled break, continue and nextcase.
Compile time evaluation¶
Added¶
- @if(cond) to conditionally include a struct/union field, a user-defined type, etc.
- Compile time variables with $ prefix, e.g. $foo.
- $if...$else...$endif and $switch...$endswitch inside of functions to conditionally include code.
- $for and $foreach to loop over compile time variables and data.
- $typeof determines an expression type without evaluating it.
- Type properties may be accessed at compile time.
- $defined returns true if the expression (variable, function, type, etc) passed to it would compile. The expression passed to $defined is not actually executed though and thus does not have side effects.
- $error emits an error if encountered.
- $embed includes a file as binary data.
- $include includes a file as text.
- $exec includes the output of a program as code.
- $evaltype takes a compile time string and turns it into a type.
- $eval takes a string and turns it into an identifier.
- $extnameof turns an identifier into its string external name.
- $nameof turns an identifier into its local string name.
- $qnameof turns an identifier into its local string name with the module prefixed.
- Compile time constant values are always compile time folded for arithmetic operations and casts.
- $$FUNCTION returns the current function as an identifier, as if its name had been written in place of $$FUNCTION.
Changed¶
- `#define` for constants is replaced by untyped constants, e.g. `#define SOME_CONSTANT 1` becomes `const SOME_CONSTANT = 1;`.
- `#define` for variable and function aliases is replaced by `alias`, e.g. `#define native_foo win32_foo` becomes `alias native_foo = win32_foo;`
- In-function `#if...#else..#endif` is replaced by `$if` and `#if...#elif...#endif` is replaced by `$switch`.
- For converting code into a string use `$stringify`.
- Macros for date, line etc are replaced by `$$DATE`, `$$FILE`, `$$FILEPATH`, `$$FUNC`, `$$LINE`, `$$MODULE`, `$$TIME`.
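The replacements for common preprocessor idioms might look like the sketch below. `win32_foo` here is a stand-in function defined only so the example is self-contained; it is not a real platform API.

```c3
import std::io;

// C: #define SOME_CONSTANT 1
const SOME_CONSTANT = 1;

// Hypothetical function standing in for a platform-specific implementation.
fn int win32_foo()
{
    return 42;
}

// C: #define native_foo win32_foo
alias native_foo = win32_foo;

fn void main()
{
    io::printfn("%d %d", SOME_CONSTANT, native_foo());

    // $stringify yields the source text of the expression, while
    // $$LINE and $$FUNC take the place of __LINE__ and __func__.
    io::printfn("%s at line %d in %s", $stringify(1 + 2), $$LINE, $$FUNC);
}
```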
Removed¶
- Top level `#if...#endif` does not have a counterpart. Use `@if` instead.
- No `#include` directives, `$include` will include text, but normally C3 code will use `import` to access code from other modules.
Macros¶
Added¶
- `macro` for defining macros.
- “Function-like” macros have no prefix and have only regular parameters or type parameters.
- “At”-macros are prefixed with `@` and may also have compile time values, expression parameters, and a trailing body.
- Type parameters are prefixed with `$` and conform to C3's required type naming convention (e.g. `$TypeFoo`, a.k.a. "PascalCase").
- Expression parameters (i.e. macro parameters prefixed with `#`) are unevaluated expressions. This is similar to arguments to `#define` in C.
- Compile time values have a `$` prefix and must contain compile time constant values.
- Any macro that evaluates to a constant result can be used as if it was the resulting constant.
- Macros may be recursively evaluated.
- Macros are inlined at the location where they are invoked.
- Unless resulting in a single constant, macros implicitly create a runtime scope.
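The three kinds of macro parameters described above could be sketched like this. All macro names are hypothetical, and the trailing-body call syntax follows our understanding of the current compiler, so it is worth double-checking.

```c3
import std::io;

// Function-like macro: no prefix, regular parameters only.
macro square(x)
{
    return x * x;
}

// At-macro with expression parameters: #a and #b are not evaluated
// up front, they are substituted where they are used.
macro void @swap(#a, #b)
{
    var tmp = #a;
    #a = #b;
    #b = tmp;
}

// At-macro with a compile time value ($times) and a trailing body.
macro void @repeat($times; @body(int index))
{
    for (int n = 0; n < $times; n++) @body(n);
}

fn void main()
{
    int a = 3;
    int b = 4;
    @swap(a, b);
    io::printfn("%d %d %d", a, b, square(a));

    @repeat(3; int i)
    {
        io::printfn("iteration %d", i);
    };
}
```

Because macros are inlined at the call site, `@swap` can assign to the caller's variables directly without taking pointers.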
Removed¶
- No `#define` macros.
- Macros cannot be incomplete statements.
Features provided by builtins¶
Some features are provided by "builtins", which appear like normal functions and macros in the standard library but nonetheless provide unique functionality:
- `@likely(...)` / `@unlikely(...)` on branches affects compilation optimization.
- `@anycast(...)` casts an `any` with an optional result.
- `unreachable(...)` marks a path as unreachable with a panic in safe mode.
- `unsupported(...)` similar to unreachable but for functionality not implemented.
- `@expect(...)` expect a certain value with an optional probability for the optimizer.
- `@prefetch(...)` prefetches a pointer, meaning that the memory at the pointed to address will be loaded before it is necessarily required, thus possibly improving performance under the right conditions.
- `swizzle(...)` swizzles a vector.
- `@volatile_load(...)` and `@volatile_store(...)` volatile load/store.
- `@atomic_load(...)` and `@atomic_store(...)` atomic load/store.
- `compare_exchange(...)` atomic compare exchange.
- Saturating add, sub, mul, shl on integers.
- Vector reduce operations: add, mul, and, or, xor, max, min.
Modules¶
- Modules are defined using `module <name>;`, where `<name>` is of the form `foo::bar::baz`.
- Modules can be split into an unlimited number of module sections, each starting with the same module name declaration if intended to become part of the same module. Multiple differently named modules can also be defined per file.
- The `import` statement imports a given module.
- Each module section has its own set of import statements.
- Importing a module gives access to the declarations that are `@public`.
- Declarations are default `@public`, but a module section may set a different default (e.g. `module my_module @private;`).
- `@private` means the declaration is only visible in the current module.
- `@local` means the declaration is only visible in the current module section. This also implies that the declaration will not be visible outside the current file either.
- Imports are recursive. For example, `import my_lib` will implicitly also import `my_lib::net`.
- Multiple imports may be specified with the same `import`, e.g. `import std::net, std::io;`.
- Generic modules are not type checked until any of their types, functions or globals are instantiated.
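As a minimal sketch of the module rules above, the following single file defines two modules. The module names `my_lib::greeter` and `app` and the functions in them are hypothetical.

```c3
module my_lib::greeter;
import std::io;

// Public by default: visible to any module that imports this one.
fn void greet(String name)
{
    io::printfn("%s, %s!", greeting(), name);
}

// Only visible inside my_lib::greeter.
fn String greeting() @private
{
    return "Hello";
}

// A second, differently named module in the same file,
// with its own set of imports.
module app;
import my_lib::greeter;

fn void main()
{
    // Functions from other modules are called with a module prefix.
    greeter::greet("C3");
}
```

Calling `greeter::greeting()` from `app` would be rejected, since the declaration is `@private`.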
Contracts¶
- Doc contracts (starting with `<*` and ending with `*>`) are parsed for correct contract syntax and semantics. They are not inert comments, despite also serving as documentation comments.
- The first part, up until the first `@` directive on a new line, is ignored.
- The `@param` directive for pointer arguments may define usage constraints `[in]`, `[out]` and `[inout]`.
- Pointer argument constraints may add a `&` prefix to indicate that they may not be `null`, e.g. `[&inout]`.
- Contracts may be attached to generic modules, functions and macros.
- `@require` directives are evaluated given the arguments provided. Failing them may be a compile time or runtime error.
- The `@ensure` directive is evaluated at exit, if the return is a "valid"/"normal" (non-fault) result and not an "invalid"/"abnormal" (fault-containing) Optional.
- `return` can be used as a variable identifier inside of `@ensure`, and holds the return value.
- `@return?` optionally lists the errors used. This will be checked at compile time.
- `@pure` says that no writing to globals is allowed inside and only `@pure` functions may be called.
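A contract using only `@require` and `@ensure` might look like the sketch below. The function `times_ten` is a hypothetical example; the other directives (`@param`, `@return?`, `@pure`) are left out because their exact description syntax has varied between compiler versions.

```c3
import std::io;

<*
 Returns ten times the input. This free-text part is ignored by the compiler.

 @require foo > 0, foo < 1000
 @ensure return > 0, return < 10000
*>
fn int times_ten(int foo)
{
    return foo * 10;
}

fn void main()
{
    io::printn(times_ten(5));
    // times_ten(0) would violate the @require contract. Depending on
    // what the compiler can prove, that is reported at compile time
    // or checked at runtime in safe mode.
}
```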
Benchmarking¶
- Benchmarks are indicated by `@benchmark`.
- Marking a module section `@benchmark` makes all functions inside of it implicitly benchmarks.
- Benchmarks are usually not compiled.
- Benchmarks are instead only run by the compiler on request.
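A benchmark function could be declared as in this small sketch. The module and function names are hypothetical, and how benchmarks are requested from the compiler depends on your setup, so check `c3c --help` for the relevant command.

```c3
module sum_bench;

// Only compiled and run when benchmarks are requested.
fn void sum_to_thousand() @benchmark
{
    int total = 0;
    for (int i = 0; i < 1000; i++) total += i;
    assert(total == 499500);
}
```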
Testing¶
- Tests are indicated by `@test`.
- Marking a module section `@test` makes all functions inside of it implicitly tests.
- Tests are usually not compiled.
- Tests are instead only run by the compiler on request.
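As a sketch of both ways of marking tests, the first module section below tags a single function, while the second marks the whole section. All names are hypothetical. We believe commands such as `c3c compile-test` (for files) or `c3c test` (for projects) request a test run, but please confirm with `c3c --help`.

```c3
module math_utils;

fn int add(int a, int b)
{
    return a + b;
}

// A single function marked as a test.
fn void add_is_commutative() @test
{
    assert(add(2, 3) == add(3, 2));
}

// A whole module section of tests: every function here is a test.
module math_utils_tests @test;
import math_utils;

fn void add_works()
{
    assert(math_utils::add(1, 1) == 2);
}

fn void division_truncates()
{
    // assert may take a format string and arguments.
    assert(7 / 2 == 3, "expected 3, got %d", 7 / 2);
}
```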
Safe / fast¶
Compilation has two modes: “safe” and “fast”. Safe mode will insert checks for out-of-bounds access, null-pointer deref, shifting by negative numbers, division by zero, violation of contracts and asserts.
Fast mode will assume that all of those checks always pass. This means that unexpected behaviour may result from violating those checks. It is recommended to develop in "safe" mode.
If debug symbols are available, C3 will produce a stack trace in safe mode where an error occurs.
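To make the difference concrete, here is a minimal sketch of code whose behaviour depends on the mode. The function `pick` is hypothetical.

```c3
import std::io;

fn int pick(int[] values, usz index)
{
    // Safe mode inserts a bounds check here and panics on failure.
    // Fast mode assumes the index is always in range.
    return values[index];
}

fn void main()
{
    int[3] data = { 10, 20, 30 };
    io::printn(pick(&data, 1));
    // pick(&data, 5) would panic in safe mode, while in fast mode
    // the result is unspecified.
}
```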
Comparison
An important question to answer is "How does C3 compare to other similar programming languages?". Here is an extremely brief (and not yet complete) overview.
C¶
As C3 is an evolution of C, the languages are quite similar. C3 adds features, but also removes a few.
In C but not in C3
- Qualified types (`const`, `volatile` etc)
- Unsafe implicit conversions
In C3 but not in C
- Module system
- Operator overloading
- Generics
- Compile time execution and semantic macros
- Integrated build system
- Error handling
- Defer
- Value methods
- Associated enum data
- Distinct types and subtypes
- Gradual contracts
- Built-in slices
- Foreach for iteration over arrays and types
- Dynamic calls and types
C++¶
C++ is a complex object-oriented "almost superset" of C. It tries to be everything to everyone, while squeezing this into a C syntax. The language is well known for its many pitfalls and quirky corners – as well as its long compile times.
C3 is in many ways different from C++ in the same way that C is different from C++, but the semantic macro system and the generics close the gap in terms of writing reusable generic code. The C3 module system and error handling are also very different from how C++ does things.
In C++ but not in C3
- Objects and classes
- RAII
- Exceptions
In C3 but not in C++
- Module system (yet)
- Integrated build system
- Semantic macros
- Error handling
- Defer
- Associated enum data
- Built-in slices
- Dynamic calls
Rust¶
Rust is a safe systems programming language. While not quite as complex as C++, it is still a feature rich programming language with semantic macros, traits and pattern matching to mention a few.
Error handling is done using Result and Option, which is similar to how C3 works.
C3 compares to Rust much like C, although the presence of built-in slices and strings reduces the places where C3 is unsafe. Rust provides arrays and strings, but they are not built in.
In Rust but not in C3
- RAII
- Memory safety
- Safe union types with functions
- Different syntax from C
- Pattern matching
- Async built in
In C3 but not in Rust
- Same ease of programming as C
- Gradual contracts
- Familiar C syntax and behaviour
- Dynamic calls
Zig¶
Zig is a systems programming language with extensive compile time execution to enable polymorphic functions and parameterized types. It aims to be a C replacement.
Compared to C3, Zig tries to be a completely new language in terms of syntax and feel. C3 uses macros to a modest degree, whereas compile time execution is more pervasive in Zig, and C3 does not depart from C to the same degree. Like Rust, Zig features slices as a first-class type. Its standard library uses an explicit allocator to allow it to work with many different allocation strategies.
Zig is a very ambitious project, aiming to support as many types of platforms as possible.
In Zig but not in C3
- Pervasive compile time execution with type generation
- Memory allocation failure is an error
- Build toolchain is scripted using build files written in Zig
- Different syntax and behaviour compared to C
- Structs define namespace
- Async primitives built in*
- Arbitrary integer sizes
Note
*Note that as of this writing, async is temporarily missing from Zig.
In C3 but not in Zig
- Module system.
- Operator overloading
- C ABI compatibility by default
- First-class lambdas*
- Macros with lazy parameters and/or trailing bodies.
- Gradual contracts
- Dynamic interfaces
- Familiar C syntax and behaviour
- Declarative integrated build system
- Built-in benchmarks
Note
*In Zig, you can achieve a similar result by creating an anonymous struct with a single function.
Jai¶
Jai is a programming language aimed at high performance game programming. It has extensive compile time metaprogramming functionality, even to the point of being able to run programs at compile time. It also has compile-time polymorphism, a powerful macro system and uses an implicit context system to switch allocation schemes.
In Jai but not in C3
- Pervasive compile time execution
- Jai's compile time execution is the build system.
- Different syntax and behaviour compared to C
- More powerful macro system than C3
- Implicit constructors
In C3 but not in Jai
- Module system
- Declarative integrated build system
- Gradual contracts
- Familiar C syntax and behaviour
- Fairly small language
- Dynamic interfaces
Odin¶
Odin is a language built for high performance but tries to remain a simple language to learn. Superficially the syntax shares much with Jai, and some of Jai's features, like an implicit context, also show up in Odin. In contrast with both Jai and Zig, Odin uses only minimal compile time evaluation and instead relies on parametric polymorphism to ensure reuse. It also contains conveniences, like maps and arrays built into the language. For error handling it relies on Go style tuple returns.
In Odin but not in C3
- Different syntax and behaviour compared to C
- Ad hoc parametric polymorphism
- Multiple return values
- Error handling through multiple returns
- A rich built-in set of types for maths
In C3 but not in Odin
- Familiar C syntax and behaviour
- Semantic macros
- Value methods
- Gradual contracts
- Built-in error handling
- Dynamic interfaces
- Operator overloading
D¶
D is an incredibly extensive language. It covers anything C++ does and adds much more. D manages this with far fewer syntactic quirks than C++. It is a strong, feature-rich language.
In D but not in C3
- Objects and classes
- RAII
- Exceptions
- Optional GC
- Many, many more features.
In C3 but not in D
- Fairly small language
Rejected Ideas
These are ideas that will not be implemented in C3.
The rationale for each is also given below.
Constructors and destructors¶
A fundamental concept in C3 is that data is not "active". This is to say there is no code associated with the data implicitly, unlike constructors and destructors in an object oriented language. Not having constructors / destructors prevents RAII-style resource handling, but also allows the code to assume the memory can be freely allocated and initialized as it sees fit, without causing any corruption or undefined behaviour.
There is a fundamental difference between active objects and inert data. Each has its advantages and disadvantages. C3 follows the C model, which is that data is passive and does not enforce any behaviour. This has very deep implications on the semantics of the language and adding constructors and destructors would change the language greatly, requiring many parts of the language to be altered.
For that reason, constructors and destructors will not be considered for C3.
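In practice, setup and cleanup are written as ordinary methods and called explicitly, often together with `defer`. The sketch below shows that style; the `Connection` type and its methods are hypothetical and exist only to illustrate the pattern.

```c3
import std::io;

// Plain data: declaring a Connection runs no code.
struct Connection
{
    int handle;
    bool is_open;
}

fn void Connection.connect(&self, int handle)
{
    self.handle = handle;
    self.is_open = true;
}

fn void Connection.close(&self)
{
    self.is_open = false;
    io::printfn("closed handle %d", self.handle);
}

fn void main()
{
    Connection c;        // zero-initialized, no constructor involved
    c.connect(3);
    defer c.close();     // explicit cleanup at scope exit
    io::printfn("using handle %d", c.handle);
}
```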
Unicode identifiers¶
The main argument for Unicode identifiers is that "it allows people to code in their own language". However, there is no proof that this actually is used in practice. Furthermore there are practical issues, such as bidirectional text, characters with different code points that are rendered in an identical way, etc.
Given the complexity and the lack of actual proven benefit, Unicode identifiers will not happen for C3.
Builtin type-name variants¶
A common request is to change the builtin type names from char, int, long etc, to some other standard, such as u8 i32 or uint8, int32. Various rationales are usually given for each, but ultimately it is a matter of taste and habit.
Because C3 limits user-defined names to PascalCase in order to easily resolve the language grammar, it is not possible to create type aliases for such names, which leads to requests to build them into the language itself. (Int32 is fine, but int32 is not, nor are INT32 and I32).
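For users who still prefer size-explicit names, PascalCase aliases remain possible, as in this small sketch. The alias names are our own choice for illustration.

```c3
import std::io;

// Valid: user-defined type names in PascalCase.
alias Int32 = int;
alias Int64 = long;

// Not valid as user-defined type names: int32, INT32, I32.

fn void main()
{
    Int32 a = 1;
    int b = a;    // the alias names the same type, so no conversion is needed
    io::printfn("%d %d %d", b, Int32.sizeof, Int64.sizeof);
}
```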
Originally, C3 was going to have both bit-fixed type names (like today, where int is always 32 bits, long always 64 bits and so on) as well as explicit bitsize-names like u8, i32 that aliased to the same types. Ultimately this was shelved, because it would mean that libraries would end up standardizing on one style or the other, creating friction when used. Ultimately the language would end up with one style being the "accepted" way to name things anyway. So after quite a bit of deliberation, the C naming scheme was chosen. This was mainly for the following three reasons:
- It's familiar from C, so one would need to rewrite and learn less coming from C/C++/C#/Java. The code would also look more C-like.
- `i32` has readability problems when combined with `i` for index in for loops, which was considered a major drawback.
- While the `int32` scheme does not have readability issues, it is longer than the C names in almost every case.
After this decision was made and the types established, someone mentioned that s32 could have been an alternative to consider as well, and indeed it is far superior as a prefix for bitsize-names. However, it's not obvious that for example sptr is better than iptr, plus the decision was made.
Over the years, requests for builtin types have occasionally appeared, but interestingly, not always arguing for the same scheme. Some would say iXX was the only possibility, others thought such naming was out of the question and an intXX scheme the only right decision and so on. Given that, it's rather clear that the preference for any naming scheme is subjective, and one is pretty much as good as the other.
So, the C3 naming scheme will not change, although small tweaks are not ruled out.
Get Involved
Community & Contribute
Contributions Welcome!¶
The C3 language is still in its development phase, which means functionality and specification are subject to change. That also means that any contribution right now will have a big impact on the language. So if you find the project interesting, here’s what you can do to help:
💬 Discuss The Language¶
- Join us on C3 Discord.
💡 Suggest Improvements¶
- Found a bug? File an issue for C3 compiler
- Spotted a typo or broken link? File an issue for the website
💪 Contribute¶
Now that the compiler is stable, what is needed are the non-essentials, such as a documentation generator, editor plugins, language server protocol (LSP) support, etc.
Thank You¶
- A huge "thank you" goes out to all contributors and sponsors.