Introduction
Want To Download C3?
Download C3, available on Mac, Windows and Linux.
C3 is an evolution of C and a minimalist systems programming language.
🦺 Ergonomics and Safety¶
- Optionals to safely and quickly handle errors and null.
- Defer to clean up resources.
- Slices and foreach for safe iteration.
- Contracts in comments, to add constraints to your code.
- Automatically free memory after use in @pool context.
⚡ Performance by default¶
- Write SIMD vectors to program the hardware directly.
- Access to different memory allocators to fine tune performance.
- Zero overhead errors.
- Fast compilation times.
- LLVM backend for industrial strength optimisations.
- Easy to use inline assembly.
🔋 Batteries included standard library¶
- Dynamic containers and strings.
- Cross-platform abstractions for ease of use.
- Access to the native platform when you need it.
🔧 Leverage existing C or C++ libraries¶
- Full C ABI compatibility.
- C3 can link C code, C can link C3 code.
📦 Modules are simple¶
- Modules namespace code.
- Modules make encapsulation simple with explicit control.
- Interfaces define shared behaviour to write robust libraries.
- Generic modules make extending code easier.
- Simple struct composition and reuse with struct subtyping.
🎓 Macros without a PhD¶
- Macros can be similar to normal functions.
- Or write code that understands the types in your code.
⚠️ Warning: Docs may not reflect current language state. The C3 standard library and compiler are still evolving. Please verify examples against the compiler and standard library directly. If you spot mismatches, open a GitHub issue or help us fix them!
Getting Started
Hello World
👋 Hello world¶
Let's start with the traditional first program, Hello World in C3:
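A minimal version of the program, consistent with the walkthrough that follows, might look like this. It relies only on std::io and io::printn, both of which are named in the text:

```c3
import std::io;

// Entry point: returns nothing and takes no parameters.
fn void main()
{
    // Print the greeting followed by a newline.
    io::printn("Hello, World!");
}
```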
The import statement imports other modules, and we want printn which
is in std::io.
Next we define a function, which starts with the fn keyword followed by the return type. We don't need to return anything, so the return type is void. The function name main comes next, followed by the function's parameter list, which is empty.
Note
The function named main is a bit special, as it is where the program starts, or the entry point of the program.
For Unix-like OSes there are a few different variants, for example we might declare it as fn void main(String[] args). In that case the parameter "args" contains a slice of strings, of the program's command line arguments, starting with the name of the program itself.
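As a sketch of the fn void main(String[] args) variant, the program below lists its own arguments. The helper names print_args and count_user_args are hypothetical and exist only for this illustration:

```c3
import std::io;

// Hypothetical helper: number of arguments after the program name.
fn int count_user_args(String[] args)
{
    return (int)args.len - 1;
}

// Hypothetical helper: prints every argument with its index.
fn void print_args(String[] args)
{
    foreach (i, arg : args)
    {
        io::printfn("arg %d: %s", i, arg);
    }
}

fn void main(String[] args)
{
    // args[0] is the name of the program itself.
    print_args(args);
    io::printfn("%d user argument(s)", count_user_args(args));
}
```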
🔭 Function scope¶
{ and } signify the start and end of the function respectively;
we call this the function's scope. Inside the function scope we have a single
call to the printn function from std::io. We put the last part of the module path, io, in front of
the function name to identify which module it belongs to.
📏 Imports can use a shorthand¶
We could have used the full path, std::io::printn, if we wanted,
but we can shorten it to just the lowest-level module, as in io::printn. This is the convention in C3 and is known as "path-shortening". It avoids writing long module paths that can make code harder to read.
The io::printn function takes a single argument and prints it, followed by a newline, then the function ends and the program terminates.
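To make path-shortening concrete, here is a small sketch that calls the same function through both the full and the shortened path. The printed strings are arbitrary:

```c3
import std::io;

fn void main()
{
    // Full module path.
    std::io::printn("Called through std::io::printn");
    // Shortened path, the conventional form in C3.
    io::printn("Called through io::printn");
}
```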
🔧 Compiling the program¶
Let's take the above program and put it in a file called hello_world.c3.
We can then compile it with:
$ c3c compile hello_world.c3
And run it:
$ ./hello_world
It should print Hello, World! and return to the command line prompt.
If you are on Windows, you will have hello_world.exe instead. Call it in the same way.
🏃 Compiling and running¶
When we start out it can be useful to compile and then have the compiler start the
program immediately. We can do that with compile-run:
$ c3c compile-run hello_world.c3
> Program linked to executable 'hello_world'.
> Launching hello_world...
> Hello, World
Want more options when compiling? Check the c3c compiler build options.
🎉 Successfully working?¶
Congratulations! You're now up and running with C3.
❓ Need help?¶
We're happy to help on the C3 Discord.
How to compile
For other platforms, it should be possible to build the compiler on any platform that LLVM can target. You will need git and CMake installed.
1. Install LLVM¶
See the LLVM documentation on how to set up LLVM for development.
- On MacOS, installing through Homebrew or MacPorts works fine.
- Using apt-get on Linux should work fine as well.
- For Windows you can download suitable pre-compiled LLVM binaries from here.
2. Clone the C3 compiler source code from GitHub¶
This should be as simple as running the following from the command line:
$ git clone https://github.com/c3lang/c3c.git
3. Build the compiler¶
Create the build directory:
Use CMake to set up:
Build the compiler:
4. Test it out¶
Building via Docker¶
You can build c3c using either an Ubuntu 18.04 or 20.04 container:
Replace 18 with 20 to build through Ubuntu 20.04.
For a release build specify:
A c3c executable will be found under bin/.
Building on Mac using Homebrew¶
- Install CMake: brew install cmake
- Install LLVM 17+: brew install llvm
- Clone the C3C GitHub repository: git clone https://github.com/c3lang/c3c.git
- Enter the C3C directory: cd c3c
- Create a build directory: mkdir build
- Change directory to the build directory: cd build
- Set up CMake build for debug: cmake ..
- Build: cmake --build .
Building on Mac using MacPorts¶
c3c may be built on Mac systems not supported by Homebrew using the cmake, llvm-17 and clang-17 ports from MacPorts.
- Install CMake: sudo port install cmake
- Install LLVM 17: sudo port install llvm-17
- Install clang 17: sudo port install clang-17
- Clone the C3C GitHub repository: git clone https://github.com/c3lang/c3c.git
- Enter the C3C directory: cd c3c
- Create a build directory: mkdir build
- Change directory to the build directory: cd build
- ❗️Important before you run cmake❗️ Set LLVM_DIR to the directory with the llvm-17 macport .cmake files: export LLVM_DIR=/opt/local/libexec/llvm-17/lib/cmake/llvm
- Set up CMake build for debug: cmake ..
- Build: cmake --build .
See also discussion #1701
Prebuilt binaries¶
- Installing on Windows
- Installing on Mac Arm64
- Installing on Ubuntu
- Installing on Debian
- Installing on Arch
Installing on Windows¶
- Download the C3 compiler or the debug build.
- Unzip it into a folder
Optional: set c3c as a global environment variable¶
- copy the folder
- navigate to C:\Program Files
- paste the folder here
- navigate inside the folder you've pasted
- copy the path of the folder
- search for "edit the system environment variables" on your computer
- click on the "environment variables" button on the bottom right
- under "user variables" double click on "path"
- click on "new" and paste the path to the folder
- run c3c anywhere on your computer!
Installing on Mac Arm64¶
- Make sure you have XCode with command line tools installed.
- Download the C3 compiler or the debug build.
- Unzip executable and standard lib.
- Run ./c3c.
The binary is not signed
You need to approve it with: xattr -d com.apple.quarantine c3c, or go to the security settings, approve it, then run it again.
Installing on Ubuntu¶
- Download the C3 compiler or the debug build.
- Unpack executable and standard lib.
- Run ./c3c.
Installing on Debian¶
- Download the C3 compiler or the debug build.
- Unpack executable and standard lib.
- Run ./c3c.
Installing on Arch Linux¶
There is an AUR package for the c3c compiler: c3c-git.
You can use your AUR package manager:
Or clone it manually:
Troubleshooting¶
Note: If you get an error like No module named 'std::io' could be found, you may need to set the C3C_LIB environment variable to point to the standard library location:
Bash/Zsh:
Fish:
Windows (PowerShell):
"cc: not found"¶
On Linux and MacOS, C3 uses the available C compiler to link with the correct libraries. While C3 contains a built-in linker, it is likely that your system will lack a complete environment unless a C compiler is available.
Linux users should generally install GCC or Clang, according to their distribution's documentation. Below is a list of officially tested distributions and the minimum packages required to compile and link C3 programs:
| Distribution | Required Packages | Command |
|---|---|---|
| Ubuntu / Debian | gcc, libc6-dev | sudo apt-get install gcc libc6-dev |
| Fedora / Rocky | gcc | sudo dnf install gcc |
| Arch Linux | gcc | sudo pacman -S gcc |
| openSUSE | gcc, glibc-devel | sudo zypper install gcc glibc-devel |
| Alpine Linux | gcc, musl-dev | sudo apk add gcc musl-dev |
| Void Linux | gcc | sudo xbps-install -S gcc |
On MacOS, you can either install XCode or download the stand-alone command-line tools.
Project Setup
Projects in C3¶
Projects are optional, but are a good way to manage compiling code when there are a lot of files and modules. They also allow you to specify libraries to link, and define how your project should be built for specific targets.
💡 Creating a new project¶
The c3c init command will create a new directory containing your project structure.
It requires the name of the project; we will use myc3project here:
$ c3c init myc3project
You can also customize the path where the project will be created or specify a template. For more information check the init command reference.
📁 Project structure¶
If you look at the directory that was created, the number of subdirectories may seem confusing at first, but most of them start out empty:
.
├─ build/
├─ docs/
├─ lib/
├─ resources/
├─ scripts/
├─ src/
│ └─ main.c3
├─ test/
├─ LICENSE
├─ project.json
└─ README.md
Directory Overview¶
| Directory | Usage |
|---|---|
| ./build | Where your temporary files and build results will go. |
| ./docs | Code Documentation |
| ./lib | C3 libraries (with the .c3l suffix) |
| ./resources | Non-code resources like images, sound effects etc. |
| ./scripts | Scripts, including .c3 scripts that generate code at compile time. |
| ./src | Storing our code, by default contains main.c3 with "Hello World". |
| project.json | Record project information, similar to package.json in NodeJS. |
| LICENSE | Project license. |
| README.md | Help others understand and use your code. |
🔧 Building the project¶
Assuming you have successfully initialized a project as seen above, we can now look at how to compile it.
🏃 Build & run¶
C3 has a simple command to build & run our project.
c3c run
> Program linked to executable 'build/myc3project'.
> Launching ./build/myc3project...
> Hello, World
You can also specify the target to build & run.
🔧 Build¶
If you only want to build the project, you can use the build command:
$ c3c build
This command builds the project targets defined in our project.json file.
Note
If you want to build a specific target, you can do so by specifying its name.
The default target is created with the name of the project, such as myc3project.
We will now have a binary in build, which we can run:
$ ./build/myc3project
It should print Hello, World! and return to the command line prompt.
If you are on Windows, you will have myc3project.exe instead. Call it in the same way.
If you need more detail later on, check C3 project build commands and C3 project configuration.
Roadmap
C3 Roadmap¶
C3 Is Feature Stable¶
The C3 0.7.x series can be run in production with the same general caveats for using any pre-1.0 software.
While we strive for a zero bug count, bugs are still being found. This means that anyone using it in production needs to stay updated with the latest fixes.
The focus of 0.8–0.9 will be fleshing out the cross-platform standard
library and making sure the syntax and semantics are solid. Also, the
toolchain will expand and improve. Please refer to this issue for what's
left in terms of features for 1.0.
The intended roadmap has one major release (a 0.1 version increment) per year:
| Date | Release |
|---|---|
| 2026-06-01 | 0.8 |
| 2027-04-01 | 0.9 |
| 2028-04-01 | 1.0 |
Compatibility¶
Minor releases in the same major release series are compatible.
For example 0.6.0, 0.6.1, ... 0.6.x are compatible and 0.7.0, 0.7.1, ... 0.7.x are compatible.
Standard library¶
The standard library is less mature than the compiler. It needs more
functionality and more tests. The compiler reaching a 1.0 release only
means a language freeze; the standard library will continue to evolve
past the 1.0 release.
Design Goals
Design goals¶
- Procedural language, with a pragmatic ethos to get work done.
- Minimalistic, no feature should be unnecessary or redundant.
- Stay close to C - only change where there is a significant need.
- Learning C3 should be easy for a C programmer.
- Seamless C integration.
- Ergonomic common patterns.
- Data is inert.
- Zero Is Initialization (ZII).*
- Avoid "big ideas".
* "Zero Is Initialization" is an idiom where types and code are written so that the zero value is a meaningful, initialized state.
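A small sketch of the ZII idiom: the Counter type and its increment method are hypothetical, chosen so that an all-zero value is already a valid starting state. It relies on C3 zero-initializing local variables by default:

```c3
import std::io;

// Hypothetical type: every field is meaningful when zero.
struct Counter
{
    int count;      // zero means "nothing counted yet"
    bool started;   // false means "not started"
}

fn void Counter.increment(Counter* self)
{
    self.started = true;
    self.count++;
}

fn int demo_zii()
{
    Counter c;      // no explicit initializer needed
    c.increment();
    return c.count;
}

fn void main()
{
    io::printfn("count: %d", demo_zii());
}
```

Because the zero state is valid, no constructor or init call is required before the first use.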
Features¶
- Full C ABI compatibility
- Module system
- Operator overloading
- Generic modules
- Design by contract
- Zero overhead errors
- Semantic macro system
- First-class SIMD vector types
- Struct subtyping
- Safe array access using slices
- Safe array iteration using foreach
- Easy to use inline assembly
- Cross-platform standard library which includes dynamic containers and strings
- LLVM backend
C3 Background¶
C3 is an evolution of C, a minimalistic language designed for systems programming, enabling the same paradigms and retaining the same syntax as far as possible.
C3 started as an experimental fork of the C2 language by Bas van den Berg. It has evolved significantly, not just in syntax but also in regard to error handling, macros, generics and strings.
Language Overview
Examples
Overview¶
This is meant as a quick reference. To learn more of the details, check the relevant sections.
For a richer catalogue of example projects and scripts covering various themes and difficulties, check out the C3 compiler's resources/examples directory.
If Statement¶
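As a sketch, an if statement in C3 reads the same as in C. The function name sign is hypothetical and used only for illustration:

```c3
import std::io;

// Hypothetical helper: returns -1, 0 or 1 depending on the sign of a.
fn int sign(int a)
{
    int result;
    if (a > 0)
    {
        result = 1;
    }
    else if (a < 0)
    {
        result = -1;
    }
    else
    {
        result = 0;
    }
    return result;
}

fn void main()
{
    io::printfn("%d", sign(-42));
}
```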
For Loop¶
fn void example_for()
{
    // The for-loop is the same as in C99.
    for (int i = 0; i < 10; i++)
    {
        io::printfn("%d", i);
    }
    // An infinite loop, also the same as in C.
    for (;;)
    {
        // ..
    }
}
Foreach Loop¶
// Prints the values in the slice.
fn void example_foreach(float[] values)
{
    foreach (index, value : values)
    {
        io::printfn("%d: %f", index, value);
    }
}

// Updates each value in the slice
// by multiplying it by 2.
fn void example_foreach_by_ref(float[] values)
{
    foreach (&value : values)
    {
        *value *= 2;
    }
}
While Loop¶
fn void example_while()
{
    // again exactly the same as C
    int a = 10;
    while (a > 0)
    {
        a--;
    }
    // Declaration
    while (Point* p = getPoint())
    {
        // ..
    }
}
Enum And Switch¶
Switches have implicit break and scope. Use "nextcase" to explicitly fall through, or use a comma:
enum Height : uint
{
    LOW,
    MEDIUM,
    HIGH,
}

fn void demo_enum(Height h)
{
    switch (h)
    {
        case LOW:
        case MEDIUM:
            io::printn("Not high");
            // Implicit break.
        case HIGH:
            io::printn("High");
    }
    // This also works
    switch (h)
    {
        case LOW:
        case MEDIUM:
            io::printn("Not high");
            // Implicit break.
        case Height.HIGH:
            io::printn("High");
    }
    // Completely empty cases are not allowed.
    switch (h)
    {
        case LOW:
            break; // Explicit break required, since switches can't be empty.
        case MEDIUM:
            io::printn("Medium");
        case HIGH:
            break;
    }
    // special checking of switching on enum types
    switch (h)
    {
        case LOW:
        case MEDIUM:
        case HIGH:
            break;
        default: // warning: default label in switch which covers all enumeration values
            break;
    }
    // Using "nextcase" will fall through to the next case statement,
    // and each case statement starts its own scope.
    switch (h)
    {
        case LOW:
            int a = 1;
            io::printn("A");
            nextcase;
        case MEDIUM:
            int a = 2;
            io::printn("B");
            nextcase;
        case HIGH:
            // a is not defined here
            io::printn("C");
    }
}
Enums are always namespaced.
Enums support various reflection properties: .values returns an array with all enums. .len or .elements returns the number
of enum values, .inner returns the storage type. .names returns an array with the names of all enums. .associated
returns an array of the typeids of the associated values for the enum.
enum State : uint
{
    START,
    STOP,
}

State start = State::values[0];
sz enums = State::len;          // 2
String[] names = State::names;  // [ "START", "STOP" ]
Duff's Device¶
Using nextcase we can implement a version of Duff's Device:
fn void duff(int* to, int* from, int count)
{
    int n = (count + 7) / 8;
    switch (count % 8)
    {
        case 0: *to++ = *from++; nextcase;
        case 7: *to++ = *from++; nextcase;
        case 6: *to++ = *from++; nextcase;
        case 5: *to++ = *from++; nextcase;
        case 4: *to++ = *from++; nextcase;
        case 3: *to++ = *from++; nextcase;
        case 2: *to++ = *from++; nextcase;
        case 1: *to++ = *from++; if (--n > 0) nextcase 0;
    }
}
Defer¶
Defer will be invoked on scope exit.
fn void test(int x)
{
    defer io::printn();
    defer io::print("A");
    if (x == 1) return;
    {
        defer io::print("B");
        if (x == 0) return;
    }
    io::print("!");
}

fn void main()
{
    test(1); // Prints "A"
    test(0); // Prints "BA"
    test(10); // Prints "B!A"
}
Because it's often relevant to run different defers when returning an error, there is also a way to create an error defer, by using the catch keyword directly after defer.
Similarly, defer try only runs if the scope exits in a regular way.
fn void? test(int x)
{
    defer io::printn("");
    defer io::print("A");
    defer try io::print("X");
    defer catch io::print("B");
    defer (catch err) io::printf("%s", err);
    if (x == 1) return NOT_FOUND~;
    io::print("!");
}

fn void main()
{
    (void)test(0); // Prints "!XA"
    (void)test(1); // Prints "builtin::NOT_FOUNDBA" and returns a NOT_FOUND
    // Note that we need to use (void) to explicitly discard the Optional result.
}
Struct Types¶
alias Callback = fn int(char c);

enum Status : int
{
    IDLE,
    BUSY,
    DONE,
}

struct MyData
{
    char* name;
    Callback open;
    Callback close;
    Status status;
    // named sub-structs (x.other.value)
    struct other
    {
        int value;
        int status; // ok, no name clash with other status
    }
    // anonymous sub-structs (x.value)
    struct
    {
        int value;
        int status; // error, name clash with other status in MyData
    }
    // anonymous union (x.person)
    union
    {
        Person* person;
        Company* company;
    }
    // named sub-unions (x.either.this)
    union either
    {
        int this;
        bool or;
        char* that;
    }
}
Function Pointers¶
module demo;

alias Callback = fn int(char* text, int value);

fn int my_callback(char* text, int value)
{
    return 0;
}

Callback cb = &my_callback;

fn void example_cb()
{
    int result = cb("demo", 123);
    // ..
}
Error Handling¶
Errors are handled using optional results, denoted with a '?' suffix. A variable of an optional
result type may either contain the regular value or a fault value.
faultdef DIVISION_BY_ZERO;

fn double? divide(int a, int b)
{
    // We return an optional result of type DIVISION_BY_ZERO
    // when b is zero.
    if (b == 0) return DIVISION_BY_ZERO~;
    return (double)a / (double)b;
}

// Re-returning an optional result uses "!" suffix
fn void? test_may_fail()
{
    divide(foo(), bar())!;
}

fn void main()
{
    // ratio is an optional result.
    double? ratio = divide(foo(), bar());
    // Handle the optional result value if it exists.
    if (catch err = ratio)
    {
        switch (err)
        {
            case DIVISION_BY_ZERO:
                io::printn("Division by zero");
                return;
            default:
                io::printn("Unexpected error!");
                return;
        }
    }
    // Flow typing makes "ratio"
    // have the plain type 'double' here.
    io::printfn("Ratio was %f", ratio);
}

fn void print_file(String filename)
{
    String? file = (String)file::load_temp(filename);
    // The following function is not called on error,
    // so we must explicitly discard it with a void cast.
    (void)io::printfn("Loaded %s and got:%s", filename, file);
    if (catch err = file)
    {
        switch(err)
        {
            case io::FILE_NOT_FOUND:
                io::printfn("I could not find the file %s", filename);
            default:
                io::printfn("Could not load %s.", filename);
        }
    }
}

// Note that the above is only illustrating how Optionals may skip
// call invocation. A more normal implementation would be:
fn void print_file2(String filename)
{
    String? file = (String)file::load_temp(filename);
    if (catch err = file)
    {
        // Print the error
        io::printfn("Failed to load %s: %s", filename, err);
        // We return, so that below 'file' will be unwrapped.
        return;
    }
    // No need for a void cast here, 'file' is unwrapped to 'String'.
    io::printfn("Loaded %s and got:\n%s", filename, file);
}
Read more about optionals and error handling here.
Contracts¶
Pre- and postconditions are optionally compiled into asserts helping to optimize the code.
<*
 @param foo : "the number of foos"
 @require foo > 0, foo < 1000
 @return "number of foos x 10"
 @ensure return < 10000, return > 0
*>
fn int test_foo(int foo)
{
    return foo * 10;
}

<*
 @param array : "the array to test"
 @param length : "length of the array"
 @require length > 0
*>
fn int get_last_element(int* array, int length)
{
    return array[length - 1];
}
Read more about contracts here.
Struct Methods¶
It's possible to namespace functions with a union, struct or enum type to enable "dot syntax" calls:
struct Foo
{
    int i;
}

fn void Foo.next(Foo* this)
{
    if (this) this.i++;
}

fn void test()
{
    Foo foo = { 2 };
    foo.next();
    foo.next();
    // Prints 4
    io::printfn("%d", foo.i);
}
Macros¶
Macro arguments may be immediately evaluated.
macro foo(a, b)
{
    return a(b);
}

fn int square(int x)
{
    return x * x;
}

fn int test()
{
    int a = 2;
    int b = 3;
    return foo(&square, 2) + a + b; // 9
    // return foo(square, 2) + a + b;
    // Error: function should be followed by (...) or prefixed by &.
}
Macro arguments may have deferred evaluation, which is basically duplication of the expression using #var syntax.
macro @foo(#a, b, #c)
{
    #c = #a(b) * b;
}

macro @foo2(#a)
{
    return #a * #a;
}

fn int square(int x)
{
    return x * x;
}

fn int test1()
{
    int a = 2;
    int b = 3;
    @foo(square, a + 1, b);
    return b; // 27
}

fn int printme(int a)
{
    io::printn(a);
    return a;
}

fn int test2()
{
    return @foo2(printme(2)); // Returns 4 and prints "2" twice.
}
Improve macro errors with preconditions:
<*
 @param x : "value to square"
 @require types::is_numerical($Typeof(x)) : "cannot multiply"
*>
macro square(x)
{
    return x * x;
}

fn void test()
{
    square("hello"); // Error: cannot multiply "hello"
    int a = 1;
    square(&a); // Error: cannot multiply '&a'
}
Read more about macros here.
Compile Time Reflection & Execution¶
Access type information and loop over values at compile time:
import std::io;

struct Foo
{
    int a;
    double b;
    int* ptr;
}

macro print_fields($Type)
{
    $foreach $field : $Type::members:
        io::printfn("Field %s, offset: %s, size: %s, type: %s",
            $field.name, $field.offset, $field.size, $field.type.name);
    $endforeach
}

fn void main()
{
    print_fields(Foo);
}
This prints on x64:
Field a, offset: 0, size: 4, type: int
Field b, offset: 8, size: 8, type: double
Field ptr, offset: 16, size: 8, type: int*
Compile Time Execution¶
Macros with only compile time variables are completely evaluated at compile time:
macro long fib(long $n)
{
    $if $n <= 1:
        return $n;
    $else
        return fib($n - 1) + fib($n - 2);
    $endif
}

const long FIB19 = fib(19);
// Same as const long FIB19 = 4181;
Note
C3 macros are designed to provide a replacement for C preprocessor macros. They extend such macros by providing compile time evaluation using constant folding, which offers an IDE friendly, limited, compile time execution.
However, if you are doing more complex compile time code generation it is recommended to use $exec and related techniques to generate code in external scripts instead.
Read more about compile time execution here.
Operator Overloading¶
struct Vec2
{
    int x, y;
}

fn Vec2 Vec2.add(self, Vec2 other) @operator(+)
{
    return { self.x + other.x, self.y + other.y };
}

fn Vec2 Vec2.sub(self, Vec2 other) @operator(-)
{
    return { self.x - other.x, self.y - other.y };
}

fn void main()
{
    Vec2 v1 = { 1, 2 };
    Vec2 v2 = { 100, 4 };
    Vec2 v3 = v1 + v2; // v3 = { 101, 6 }
}
Read more about operator overloading here.
Generics¶
Declarations may be generic.
module stack;

struct Stack <Type>
{
    sz capacity;
    sz size;
    Type* elems;
}

fn void Stack.push(Stack* this, Type element)
{
    if (this.capacity == this.size)
    {
        this.capacity = this.capacity ? this.capacity * 2 : 16;
        this.elems = realloc(this.elems, Type::size * this.capacity);
    }
    this.elems[this.size++] = element;
}

fn Type Stack.pop(Stack* this)
{
    assert(this.size > 0);
    return this.elems[--this.size];
}

fn bool Stack.empty(Stack* this)
{
    return !this.size;
}
Testing it out:
alias IntStack = Stack{int};

fn void test()
{
    IntStack stack;
    stack.push(1);
    stack.push(2);
    // Prints pop: 2
    io::printfn("pop: %d", stack.pop());
    // Prints pop: 1
    io::printfn("pop: %d", stack.pop());
    Stack {double} dstack;
    dstack.push(2.3);
    dstack.push(3.141);
    dstack.push(1.1235);
    // Prints pop: 1.1235
    io::printfn("pop: %f", dstack.pop());
}
Read more about generics here.
Dynamic Calls¶
Runtime dynamic dispatch through interfaces:
import std::io;

// Define a dynamic interface
interface MyName
{
    fn String myname();
}

struct Bob (MyName) { int x; }

// Required implementation as Bob implements MyName
fn String Bob.myname(Bob*) @dynamic { return "I am Bob!"; }

// Ad hoc implementation
fn String int.myname(int*) @dynamic { return "I am int!"; }

fn void whoareyou(any a)
{
    MyName b = (MyName)a;
    if (!&b.myname)
    {
        io::printn("I don't know who I am.");
        return;
    }
    io::printn(b.myname());
}

fn void main()
{
    int i = 1;
    double d = 1.0;
    Bob bob;
    any a = &i;
    whoareyou(a);
    a = &d;
    whoareyou(a);
    a = &bob;
    whoareyou(a);
}
Read more about dynamic calls here.
Classic text games¶
Here are two classic simple text based games showcasing C3 features and the C3 standard library.
Guess a number¶
import std::io, std::math::random;

fn int main()
{
    int secret = rand(20) + 1;
    int tries = 6;
    // game loop
    while OUTER: (true)
    {
        io::printfn("Enter a guess between 1 and 20, "
            "%d tries remaining", tries);
        int? guess = io::treadline().to_int();
        if (catch err = guess)
        {
            if (err == io::EOF) return 1; // Prevent infinite loop
            io::printn("That wasn't a valid number, try again.");
            continue;
        }
        switch
        {
            case guess < secret: io::printn("Too Small");
            case guess > secret: io::printn("Too Large");
            default: io::printn("You Win!"); break OUTER;
        }
        if (--tries == 0)
        {
            io::printfn("Game Over - the number was %s", secret);
            break;
        }
    }
    io::printn("Thank you for playing!");
    return 0;
}
Rock, paper, scissors¶
import std::io, std::math::random;

enum Action : (String abbrev, String full)
{
    ROCK { "r", "Rock" },
    PAPER { "p", "Paper" },
    SCISSORS { "s", "Scissors" },
}

const ROUNDS = 3;

fn int main()
{
    int p_score;
    int c_score;
    int rounds = ROUNDS;
    io::printfn("Let's play Rock-Paper-Scissors!");
    while (rounds > 0)
    {
        io::printfn("Best out of %d, %d rounds remaining. ", ROUNDS, rounds);
        io::printn("What is your guess? [r]ock, [p]aper, or [s]cissors?");
        Action guess;
        while (true)
        {
            String? s = io::treadline();
            if (catch s) return 1;
            if (try current_guess = Action.lookup_field(abbrev, s))
            {
                guess = current_guess;
                break;
            }
            io::printn("input invalid.");
        }
        io::printfn("Player: %s", guess.full);
        Action comp = Action.from_ordinal(rand(3));
        io::printfn("Computer: %s", comp.full);
        switch
        {
            case comp == ROCK && guess == SCISSORS:
            case comp == SCISSORS && guess == PAPER:
            case comp == PAPER && guess == ROCK:
                io::printn("Computer Score!");
                c_score++;
                rounds--;
            case guess == ROCK && comp == SCISSORS:
            case guess == SCISSORS && comp == PAPER:
            case guess == PAPER && comp == ROCK:
                io::printn("Player Score!");
                p_score++;
                rounds--;
            default:
                io::printn("Tie!");
        }
        io::printfn("Score: Player: %d, Computer: %d", p_score, c_score);
    }
    switch
    {
        case p_score < c_score: io::printn("COMPUTER WINS GAME!");
        case p_score > c_score: io::printn("PLAYER WINS GAME!");
        default: io::printn("GAME TIED!");
    }
    io::printn("Thank you for playing.");
    return 0;
}
Type System
Overview¶
As usual, types are divided into basic types and user defined types (enum, union, struct, typedef, bitstruct). All types are defined on a global level.
Naming¶
All user-defined types in C3 start with an upper-case letter. So MyStruct or Mystruct would be fine, while mystruct_t or mystruct would not.
This naming requirement ensures that the language is easy to parse for tools.
It is possible to use attributes to change the external name of a type:
This affects generated C headers, but little else.
Differences from C¶
Unlike C, C3 does not use type qualifiers. const exists,
but is a storage class modifier, not a type qualifier.
Instead of volatile, volatile loads and stores are implemented using @volatile_load and @volatile_store.
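A minimal sketch of how the two macros might be used. The call shape, passing the variable itself rather than a pointer, is an assumption based on the builtin macros and should be checked against your compiler version. The function name poll_flag is hypothetical:

```c3
// Hypothetical example: write and read a flag with volatile semantics.
fn int poll_flag()
{
    int flag = 0;
    // Assumed call shape: @volatile_store(variable, value).
    @volatile_store(flag, 1);
    // Assumed call shape: @volatile_load(variable).
    return @volatile_load(flag);
}
```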
Restrictions on function parameter usage are implemented through parameter preconditions.
C3's equivalent of C's typedef has a slightly different syntax and is named alias. In contrast, C3's own typedef keyword creates a distinct type. Take care not to confuse C3's alias and typedef keywords with C's typedef.
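The difference can be sketched as follows. The names Index, Meters, to_int and demo_alias_typedef are hypothetical, and the explicit casts are written out conservatively:

```c3
alias Index = int;       // another name for int, like C's typedef
typedef Meters = int;    // a new, distinct type based on int

fn int to_int(Meters m)
{
    // A distinct type is cast explicitly back to its underlying type.
    return (int)m;
}

fn int demo_alias_typedef()
{
    Index i = 3;
    int plain = i;           // fine: Index and int are the same type
    Meters m = (Meters)5;    // explicit cast into the distinct type
    return plain + to_int(m);
}
```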
C3 also requires all function pointers to be used with an alias. For example:
alias Callback = fn void();
Callback a = null; // Ok!
fn Callback getCallback() { /* ... */ } // Ok!
// fn fn void() getCallback() { /* ... */ } - ERROR!
// fn void() a = null; - ERROR!
Compile time properties¶
Types have built-in type properties available through ::property syntax. The following properties
are common to all C3 runtime types:
- alignment - The standard alignment of the type in bytes. For example int::alignment will typically be 4.
- kind - The category of type, e.g. TypeKind.POINTER, TypeKind.STRUCT (see std::core::types).
- cname - Returns a string with the extern name of the type, rarely used.
- name - Returns a string with the unqualified name of the type.
- qname - Returns a string with the qualified (using the full path) name of the type.
- size - Returns the storage size of the type in bytes.
- typeid - Returns a runtime typeid for the type.
- methods - Returns the methods implemented for a type.
- get_tag(tagname) - Retrieves the tag defined on the type.
- has_tag(tagname) - Returns true if the type has a particular tag.
- has_equals - True if the type implements ==.
- is_ordered - True if the type implements comparisons.
- is_substruct - True if the type has an inline member.
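A short sketch that reads a few of these properties, assuming the ::property spelling described above (the spelling may differ in other compiler versions). The Point struct is hypothetical:

```c3
import std::io;

// Hypothetical struct used only to have a user-defined type to inspect.
struct Point
{
    int x;
    int y;
}

fn void main()
{
    io::printfn("name: %s", Point::name);
    io::printfn("size: %d", Point::size);
    io::printfn("int alignment: %d", int::alignment);
}
```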
Basic types¶
Basic types are divided into floating point types and integer types.
Integer types are either signed or unsigned.
Integer types¶
| Name | bit size | signed |
|---|---|---|
| bool† | 1 | no |
| ichar | 8 | yes |
| char | 8 | no |
| short | 16 | yes |
| ushort | 16 | no |
| int | 32 | yes |
| uint | 32 | no |
| long | 64 | yes |
| ulong | 64 | no |
| int128 | 128 | yes |
| uint128 | 128 | no |
| iptr‡ | varies | yes |
| uptr‡ | varies | no |
| sz‡ | varies | yes |
| usz‡ | varies | no |
†: bool will be stored as a byte.
‡: Size, pointer and pointer-sized types depend on the target platform.
Integer type properties¶
Integer types (except for bool) also have the following type properties:
- max - The maximum value for the type.
- min - The minimum value for the type.
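As an illustration, the limits of the 8 and 16 bit types from the table can be checked like this. The sketch assumes the ::property spelling used in this section, and check_limits is a hypothetical name:

```c3
// Hypothetical helper: asserts the limits implied by the bit sizes above.
fn void check_limits()
{
    assert(ichar::min == -128);
    assert(ichar::max == 127);
    assert(char::min == 0);
    assert(char::max == 255);
    assert(ushort::max == 65535);
}
```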
Integer arithmetics¶
All signed integer arithmetic uses 2's complement.
Integer constants¶
Integer constants are written like 1293832 or -918212.
Integers may be written in decimal, but also
- in binary with the prefix 0b e.g. 0b0101000111011, 0b011
- in octal with the prefix 0o e.g. 0o0770, 0o12345670
- in hexadecimal with the prefix 0x e.g. 0xdeadbeef, 0x7f7f7f
In the case of binary, octal and hexadecimal, the type is assumed to be unsigned.
Furthermore, underscore _ may be used to add space between digits to improve readability e.g. 0xFFFF_1234_4511_0000, 123_000_101_100.
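The notations above can be compared side by side. In this sketch the same value is written in each base, and check_literals is a hypothetical name:

```c3
// Hypothetical helper: the same value written in different notations.
fn void check_literals()
{
    // Binary, octal and hexadecimal literals are unsigned,
    // so they are compared against an unsigned decimal literal.
    assert(0b1111_1111 == 255u);
    assert(0o377 == 255u);
    assert(0xFF == 255u);
    // Underscores only group digits and do not change the value.
    assert(1_000_000 == 1000000);
}
```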
Integer literal suffix and type¶
Integer literals follow C's rules:
1. A decimal literal is by default int. If it does not fit in an int, the type is long or int128, picking the smallest type that fits the literal.
2. If the literal is suffixed by u or U it is instead assumed to be a uint, but will be ulong or uint128 if it doesn't fit, like in (1).
3. Binary, octal and hexadecimal will implicitly be unsigned.
4. If an l or L suffix is given, the type is assumed to be long. If ll or LL is given, it is assumed to be int128.
5. If the ul or UL suffix is given, the type is assumed to be ulong. If ull or ULL is given, it is assumed to be uint128.
6. If a binary, octal or hexadecimal starts with zeros, infer the type size from the number of bits that would be needed if all digits were the maximum for the base.
$Typeof(1); // int
$Typeof(1u); // uint
$Typeof(1L); // long
$Typeof(0x11); // uint, hex is unsigned by default
$Typeof(0x1ULL); // uint128
$Typeof(4000000000); // long, since the number exceeds int.max
$Typeof(0x000000000000); // ulong: 12 hex chars indicate a 48 bit value
$Typeof(0b000000000000); // uint: 12 binary chars indicate a 12 bit value
TwoCC, FourCC and EightCC literals¶
FourCC codes are often used to identify binary format types. C3 adds direct support for 4 character codes, but also 2 and 8 characters:
- 2 character strings, e.g. 'C3', convert to a ushort or short.
- 4 character strings, e.g. 'TEST', convert to a uint or int.
- 8 character strings, e.g. 'FOOBAR11', convert to a ulong or long.
Conversion is always done so that the character string has the correct ordering in memory. This means that the same characters may have different integer values on different architectures due to endianness.
Base64 and hex data literals¶
Base64 encoded values work like TwoCC/FourCC/EightCC, in that they are laid out in byte order in memory. They use the format b64'<base64>'. Hex encoded values work like Base64 but with the format x'<hex>'. In data literals any whitespace is ignored, so x'00 00 11' encodes to the same value as x'000011'.
For example, 'FOOBAR11' could be encoded as b64'Rk9PQkFSMTE='.
Base64 and hex data literals initialize to arrays of the char type:
char[*] hello_world_base64 = b64"SGVsbG8gV29ybGQh";
char[*] hello_world_hex = x"4865 6c6c 6f20 776f 726c 6421";
String literals, and raw strings¶
Regular string literals are text enclosed in " ... " just like in C. C3 also offers another type of literal: raw strings.
Raw strings use text between ` `. Inside of a raw string, no escapes are available, and it can span across multiple lines. To write a ` double the character:
String foo = `C:\foo\bar.dll`;
String bar = `"Say ``hello``"`;
String baz =
`pushq %rax;
addq $1, %rax;
popq %rax;`;
// Same as
String foo = "C:\\foo\\bar.dll";
String bar = "\"Say `hello`\"";
String baz = "pushq %rax;\naddq $1, %rax;\npopq %rax;";
Floating point types¶
| Name | bit size |
|---|---|
| bfloat16† | 16 |
| float16† | 16 |
| float | 32 |
| double | 64 |
| float128† | 128 |
†: Support is still incomplete and not all systems have native support.
Floating point type properties¶
On top of the regular properties, floating point types also have the following properties:
- max - The maximum value for the type.
- min - The minimum value for the type.
- inf - Infinity.
- nan - Float NaN.
Floating point constants¶
Floating point constants will at least use 64 bit precision. Just like for integer constants, it is allowed to use underscore, but it may not occur immediately before or after a dot or an exponential.
Floating point values may be written in decimal or hexadecimal. For decimal, the exponential symbol is e (or E, both are acceptable), for hexadecimal p (or P) is used: -2.22e-21 -0x21.93p-10
By default a floating point literal is of type double, but if the suffix f is used (e.g. 1.0f), it is instead of type float.
C compatibility¶
For C compatibility the following types are also defined in std::core::cinterop:
| Name | C type |
|---|---|
| CChar | char |
| CShort | short int |
| CUShort | unsigned short int |
| CInt | int |
| CUInt | unsigned int |
| CLong | long int |
| CULong | unsigned long int |
| CLongLong | long long |
| CULongLong | unsigned long long |
| CLongDouble | long double |
float and double will always match their C counterparts.
Note that signed C char and unsigned char will correspond to ichar and char. CChar is only available to match the default signedness of char on the platform.
Other built-in types¶
Pointer types¶
Pointers mirror C: Foo* is a pointer to a Foo, while Foo** is a pointer to a pointer of Foo.
Pointer type properties¶
In addition to the standard properties, pointers also have the inner
property. It returns the type of the object pointed to as a typeid.
Optional¶
An Optional type is created by taking a type and appending ?.
An Optional type behaves like a tagged union, containing either the
Result or an Empty, which also carries a fault type.
Once extracted, a fault can be converted to another fault.
faultdef MISSING; // define a fault
int? i;
i = 5; // Assigning a real value to i.
i = io::EOF~; // Assigning an Empty carrying the fault io::EOF to i.
fault b = MISSING; // Assign a fault to b
b = @catch(i); // Assign the fault in i to b (EOF)
Only variables, expressions and function returns may be Optionals. Function and macro parameters in their definitions may not be optionals.
fn Foo*? getFoo() { /* ... */ } // ✅ Ok!
int? x = 0; // ✅ Ok!
fn void processFoo(Foo*? f) { /* ... */ } // ❌ fn parameter
An Optional value can use the special if-try and if-catch to unwrap its result or its Empty.
It is also possible to implicitly return if it is Empty using ! and to panic with !!.
To learn more about the Optional type and error handling in C3, read the page on Optionals and error handling.
Note
If you want a more regular "optional" value, to store in structs, then you can use the generic Maybe type in std::collections.
The fault type¶
When an Optional does not contain a result, it is Empty, but contains a fault which explains why there was no
normal value. A fault has the special property that together with the ~ suffix it creates an Empty value:
int? x = IO_ERROR~; // 'IO_ERROR~' is an Optional Empty.
fault y = IO_ERROR; // Here IO_ERROR is just a regular
// value, since it isn't followed by '~'
A new fault value can only be defined using the faultdef statement:
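As a minimal sketch of the statement in use (the names NOT_FOUND and find_index are hypothetical, chosen only for illustration), a fault can be defined at the top level and then returned as an Empty with the ~ suffix:

```c3
import std::io;

// Define a new fault value. NOT_FOUND is a hypothetical name.
faultdef NOT_FOUND;

// Returns the index of 'needle' in 'values',
// or an Empty carrying NOT_FOUND if it is absent.
fn int? find_index(int[] values, int needle)
{
    foreach (i, value : values)
    {
        if (value == needle) return (int)i;
    }
    return NOT_FOUND~;
}

fn void main()
{
    int[3] data = { 10, 20, 30 };
    if (try index = find_index(&data, 20))
    {
        io::printfn("Found at index %d", index);
    }
    if (catch reason = find_index(&data, 99))
    {
        if (reason == NOT_FOUND) io::printn("Lookup failed: not found");
    }
}
```

The caller decides how to react: if-try unwraps the result, while if-catch extracts the fault for comparison.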
Like the typeid type, a fault is pointer sized
and each value defined by faultdef is globally unique. This is true even when faults are separately compiled.
Note
The underlying unique value assigned to a fault may vary each time a program is run.
Fault description¶
The fault type only has one field: description, which returns the name of the fault, namespaced with the last module path, e.g. "io::EOF".
The typeid type¶
The typeid holds the runtime representation of a type. Using <typename>.typeid a type may be converted to its unique runtime id,
e.g. typeid a = Foo.typeid;. The value itself is pointer-sized.
Typeid fields¶
At compile time, a typeid value has all the properties of its underlying type:
However, at runtime only a few are available:
- size - always supported.
- kind - always supported.
- parent - supported on distinct and struct types, returning the inline member type.
- inner - supported on types implementing it.
- names - supported on enum types.
- len - supported on arrays, vectors and enums.
The any type¶
C3 contains a built-in variant type, any, which is essentially a struct containing a typeid plus a void* pointer to a value.
While it is possible to cast an any to any pointer type, it is recommended to use the anycast macro or to check the type explicitly first. With the anycast macro, the return value is
an Optional, which is Empty if there is a mismatch.
fn void main()
{
int x;
any y = &x;
int* w = (int*)y; // Returns the pointer to x
double* z_bad = (double*)y; // Don't do this!
double*? z = anycast(y, double); // The safe way to get a value
if (y.type == int.typeid)
{
// Do something if y contains an int*
}
if (try v = anycast(y, int))
{
// same as above, but v holds the unwrapped int*
}
}
You can use a switch to check an any's type, as well. After the type has been confirmed, it is safe to dereference.
fn void test(any z)
{
// Switch
switch (z.type)
{
case int:
// This is safe here:
int* y = (int*)z;
case double:
// This is safe here:
double* y = (double*)z;
}
// Assignment switch
switch (y = z, y.type)
{
case int:
// This is safe here:
int* x = (int*)y;
}
// Finally, if we just want to deal with the case
// where it is a single specific type:
if (z.type == int.typeid)
{
// This is safe here:
int* a = (int*)z;
}
if (try b = *anycast(z, int))
{
// b is an int:
foo(b * 3);
}
}
Note that in switches, if a substruct type is passed in and its parent matches first, the parent case will take priority.
fn void test(any z)
{
// Will always be seen as the parent type.
switch (z.type)
{
case Parent:
// code...
case Subtype:
// code that will never execute...
}
// So order the subtypes first
// if you're comparing them against their parent.
// Of course, this is still useful in cases
// of inherited types where the parent isn't in the switch.
switch (z.type)
{
case Parent:
// modify data both Parent and Subtype have
case SomethingElse:
// completely different type code
}
}
If you don't want the child type detected as the parent type, a typedef can be used to create a distinct type without changing any data.
any fields¶
At runtime, any gives you access to two fields:
- some_any.type - returns the underlying pointee typeid of the contained value.
- some_any.ptr - returns the raw void* pointer to the contained value.
Advanced use of any¶
The standard library has several helper macros to manipulate any types:
- anycast(some_any, Type) returns a Type* pointer, or TYPE_MISMATCH if the types don't match.
- any_make(ptr, some_typeid) creates an any with the given typeid from a void*.
- some_any.retype_to(some_typeid) changes the type of an any to the given typeid.
- some_any.as_inner() retypes the any to the "inner" (see the inner type property) of the current type.
void* some_ptr = foo();
// Essentially (any)(int*)(some_ptr)
any some_int = any_make(some_ptr, int.typeid);
// Same as any_make(some_int.ptr, uint.typeid)
any some_uint = some_int.retype_to(uint.typeid);
typedef SomeType = int;
SomeType s = 3;
any any_val = &s;
// Result is the same as (any)(int*)&s
any some_inner_int = any_val.as_inner();
Array types¶
Arrays are indicated by [size] after the type, e.g. int[4]. Slices use the type[]. For initialization the wildcard type[*] can be used to infer the size
from the initializer. See the chapter on arrays.
Vector types¶
Vectors use [<size>] after the type, e.g. float[<3>], with the restriction that vectors may only form out
of integers, floats and booleans. Similar to arrays, wildcard can be used to infer the size of a vector: int[<*>] a = { 1, 2 }.
Array and vector type properties¶
Array and vector types also support:
- inner - Returns the type of each element.
- len - Gives the length of the type.
User defined types¶
Type aliases (C's typedef)¶
C3 has a construct that behaves essentially the same as C's "typedef", an alias, and it is declared using the syntax alias <new_name> = <old_name>. For example:
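A small sketch of such an alias, assuming nothing beyond the syntax given above (the name Numbers is illustrative):

```c3
import std::io;

// Numbers is an illustrative alias name for an existing type.
alias Numbers = int[10];

fn void main()
{
    Numbers values;          // Exactly the same type as int[10].
    int[10] copy = values;   // No conversion needed: it is the same type.
    io::printfn("%d elements", copy.len);
}
```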
These are not proper types, just aliases, and querying their properties will query the properties of the aliased type.
Function pointer types¶
Function pointers are always used through an alias:
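A minimal sketch, where IntOp and twice are hypothetical names used only for illustration:

```c3
import std::io;

// IntOp is an illustrative alias for a function pointer type.
alias IntOp = fn int(int value);

fn int twice(int value)
{
    return value * 2;
}

fn void main()
{
    IntOp op = &twice;             // Take the address of a matching function.
    io::printfn("%d", op(21));     // Calls twice(21) through the pointer.
}
```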
To form a function pointer type, write a normal function declaration but skip the function name: fn int foo(double x) becomes
fn int(double x).
Function pointers can have default arguments, e.g. alias Callback = fn void(int value = 0) but default arguments
and parameter names are not taken into account when determining function pointer assignability:
alias Callback = fn void(int value = 1);
fn void test(int a = 0) { /* ... */ }
Callback callback = &test; // Ok
fn void main()
{
callback(); // Works, same as test(1);
test(); // Works, same as test(0);
callback(value: 3); // Works, same as test(3)
test(a: 4); // Works, same as test(4)
// callback(a: 3); // ERROR!
}
Function pointer type properties¶
Function pointer types also support:
- paramsof - Returns a list of ReflectedParam for each parameter.
- returns - Returns the return type.
Typedef - Distinct type definitions¶
typedef creates a new type that has the same properties as the original type but is distinct from it, and it cannot implicitly convert into the original type. It is declared using the syntax
typedef <name> = <type>
typedef MyId = int;
typedef MyId2 @constinit = int;
fn void* get_by_id(MyId id) { ... }
fn void* get_by_id2(MyId2 id) { ... }
fn void test(MyId id)
{
void* val = get_by_id(id); // Ok
// void* val2 = get_by_id(1); // ERROR expected a MyId
// Use `@constinit` to allow implicit conversion from
// literals
void* val2 = get_by_id2(1);
int a = 1;
// void* val3 = get_by_id(a); // ERROR expected a MyId
// `@constinit` doesn't work on non-literals
// void* val3 = get_by_id2(a); // ERROR expected a MyId2
void* val4 = get_by_id((MyId)a); // Works
// a = id; // ERROR can't assign 'MyId' to 'int'
}
Inline typedef¶
Using inline in the typedef declaration allows a newly created typedef type to implicitly convert to its underlying type:
typedef Abc @constinit = int;
typedef Bcd @constinit = inline int;
fn void test()
{
Abc a = 1;
Bcd b = 1;
// int i = a; Error: Abc cannot be implicitly converted to 'int'
int i = b; // This is valid
// However, 'inline' does not allow implicit conversion from
// the inline type to the typedef type:
// a = i; Error: Can't implicitly convert 'int' to 'Abc'
// b = i; Error: Can't implicitly convert 'int' to 'Bcd'
}
Aligned typedefs¶
It's possible to use typedef to create underaligned types. For example, typically an int will be 4 byte aligned, but we can create a 2-byte aligned type using typedef IntAlign2 = int @align(2);.
Storage SIMD types¶
Vectors are normally stored and passed as arrays to prevent SIMD alignment overhead. However, it's possible to define types that exactly match the SIMD types in C and other languages for storage and argument passing. These types are defined with typedef and the @simd attribute, similar to aligned typedefs: typedef Float4 = float[<4>] @simd
Typedef type properties¶
In addition to the normal properties, typedef also supports:
- inner - Returns the type this is based on as a typeid.
- parentof - If this is an inline typedef, returns the same as inner.
Generic types¶
import generic_list; // Contains the generic MyList
struct Foo
{
int x;
}
// ✅ alias for each type used with a generic module.
alias MyListFoo = MyList {Foo};
MyListFoo working_example;
fn void main()
{
// ❌ A nested inline type definition in a function context
// will yield an error, it's only available on the top
// level or in macros. Prefer aliases.
MyList {MyList {int}} failing_example;
}
Enum and constdefs¶
These correspond to C's enum. See enums and constdefs.
Struct types¶
Read more about unions and structs and bitstructs.
C to C3
A Guide For C Programmers
Overview¶
This primer is intended for existing C programmers. It is a guide to how C syntax, and in some cases C semantics, differ in C3, and it is meant to help you take a piece of C code and understand how it can be converted manually to C3.
Functions¶
Functions are declared like C, but you need to put fn in front:
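A minimal sketch of the difference, using a hypothetical add function with the C version shown as a comment:

```c3
import std::io;

// C:
// int add(int a, int b) { return a + b; }

// C3: the same declaration, prefixed with fn.
fn int add(int a, int b)
{
    return a + b;
}

fn void main()
{
    io::printfn("%d", add(2, 3));
}
```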
Find out more about functions, including named arguments and default arguments.
Calling C Functions¶
Declare a function (or variable) with extern and it will be possible to
access it from C3:
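As a short sketch, the C standard library function puts can be declared and called like this (this assumes the C library is linked, which the text notes is the default):

```c3
// Declare the C function so that it can be called from C3.
extern fn int puts(char* message);

fn void main()
{
    // A string literal converts to char*, so it can be passed directly.
    puts("Hello from C");
}
```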
Note that currently only the C standard library is automatically passed to the linker. In order to link with other libraries, you need to explicitly tell the compiler to link them.
If you want to use a different identifier inside of your C3 code compared to
the function or variable's external name, use the @cname attribute:
extern fn int _puts(char* message) @cname("puts");
...
_puts("Hello world"); // <- calls the puts function in libc
New macro system¶
The old C macro system is replaced by a new C3 macro system.
Read more about semantic macros.
Identifiers¶
Naming standards¶
Name standards are strictly enforced, to simplify the C3 grammar:
// Starting with uppercase and followed somewhere by at least
// one lower case is a user defined type:
Foo x;
M____y y;
// Starting with lowercase is a variable or a function or a member name:
x.myval = 1;
int z = 123;
fn void fooBar(int x) { ... }
// Only upper case is a constant or an enum value:
const int FOOBAR = 123;
enum Test
{
STATE_A,
STATE_B
}
Variable Declaration¶
Multiple declarations are restricted¶
Multiple declaration with initialization isn't allowed in C3:
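A brief sketch of what is and is not accepted (sum_pair is a hypothetical function name):

```c3
import std::io;

fn int sum_pair()
{
    // int i, j = 0;  // ERROR: initializer in a multiple declaration
    int a, b;         // OK: no initializers, both start out as zero
    a = 1;
    b = 2;
    return a + b;
}

fn void main()
{
    io::printfn("%d", sum_pair());
}
```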
In conditionals, a special form of multiple declarations is allowed but each must then provide its type:
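A minimal sketch using a for loop, where each declaration in the initializer repeats its type (count_steps is a hypothetical name):

```c3
import std::io;

fn int count_steps()
{
    int steps = 0;
    // Each declaration states its own type: 'int i' and 'int j'.
    for (int i = 0, int j = 10; i < j; i++, j--)
    {
        steps++;
    }
    return steps;
}

fn void main()
{
    io::printfn("%d", count_steps());
}
```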
Zero initialization by default¶
In C global variables are implicitly zeroed out, but local variables aren’t. In C3 both global and local variables are zeroed out by default, but may be explicitly undefined (using the @noinit attribute) if you wish to match the C behaviour.
Removal of the const type qualifier¶
The const qualifier is only retained for actual constant variables. C3 uses a special type of post condition for functions to indicate that they do not alter input parameters.
<*
This function ensures that foo is not changed in the function.
@param [in] foo
@param [out] bar
*>
fn void test(Foo* foo, Bar* bar)
{
bar.y = foo.x;
// foo.x = foo.x + 1 - compile time error, can't write to 'in' param.
// int x = bar.y - compile time error, can't read from an 'out' param.
}
Expressions¶
Bit operator precedence changed¶
Notably bit operations have higher precedence than +/- and comparison operators, making code like this: a & b == c evaluate like (a & b) == c instead of C's a & (b == c). The elvis operator, ?:, also binds tighter than ternary. See the page about precedence rules.
0-prefix octal syntax removed¶
The old 0777 octal syntax present in C has been removed and replaced by a 0o prefix in C3, e.g. 0o777. Strings in C3 do not support octal sequences aside from '\0'.
Member access using . even for pointers¶
The -> operator is removed, access uses dot for both direct and pointer access. Note that this is just single access: to access a pointer of a pointer (e.g. int**) an explicit dereference would be needed.
In the special case of needing to dereference and index into an array, use .[] syntax:
int[3] a;
int[3]* b = &a; // Different from C!
// b[1] = 3; ERROR: expected an int[3] but got an int.
(*b)[1] = 3; // Works
b.[1] = 3; // Same as the above
This situation does not arise in C, due to pointer decay.
Signed overflow is well-defined¶
Signed integer overflow always wraps using 2s complement. It is never undefined behaviour.
Restrictions in implicit conversion rules¶
C3 does not permit implicit narrowing. Implicit widening is only allowed when there is only a single way to widen an expression.
Take the case of long x = int_val_1 + int_val_2. In C this would widen the result of the addition:
long x = (long)(int_val_1 + int_val_2), but there is another possible
way to widen: long x = (long)int_val_1 + (long)int_val_2. So, in this case, the widening is disallowed in C3. However, long x = int_val_1 is unambiguous, so C3 permits it just like C (read more on the conversion page).
Evaluation order is well-defined¶
Evaluation order (after precedence, meaning when operators have equal precedence, a.k.a. associativity) is left-to-right. In assignment expressions, assignment happens after expression evaluation.
int a = foo() + bar(); // Always evaluates foo() before bar()
*(baz()) = foo(); // foo() evaluates before baz()
Types¶
Struct, Enum And Union Declarations¶
Don't add a ; after enum, struct and union declarations, and note the slightly
different syntax for declaring a named struct inside of a struct.
Also, user-defined types are used without a struct, union or enum keyword, as
if the name was a C typedef.
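A short sketch of these differences, with the C version shown as a comment (Foo, bar and read_x are illustrative names):

```c3
import std::io;

// C:
// typedef struct
// {
//     int a;
//     struct
//     {
//         double x;
//     } bar;
// } Foo;

// C3: no trailing ';', and the inner struct's name follows 'struct'.
struct Foo
{
    int a;
    struct bar
    {
        double x;
    }
}

fn double read_x()
{
    Foo foo;            // Used without the 'struct' keyword.
    foo.bar.x = 1.5;
    return foo.bar.x;
}

fn void main()
{
    io::printfn("%f", read_x());
}
```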
Arrays¶
Array sizes are written next to the type, and arrays do not decay to pointers, you need to do it manually:
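A minimal sketch of the manual conversion, with the C version in a comment (second_element is a hypothetical name):

```c3
import std::io;

fn int second_element()
{
    // C:
    // int x[2] = { 1, 2 };
    // int *y = x;
    int[2] x = { 1, 2 };
    int* y = &x;    // Take the address explicitly; x does not decay.
    return y[1];
}

fn void main()
{
    io::printfn("%d", second_element());
}
```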
You will probably prefer slices to pointers when passing data around:
// C
int x[100] = ...;
int y[30] = ...;
int z[15] = ...;
sort_my_array(x, 100);
sort_my_array(y, 30);
// Sort part of the array!
sort_my_array(z + 1, 10);
// C3
int[100] x = {};
int[30] y = {};
int[15] z = {};
sort_my_array(&x); // Implicit conversion from int[100]* -> int[]
sort_my_array(&y); // Implicit conversion from int[30]* -> int[]
sort_my_array(z[1..10]); // Inclusive range: elements 1 through 10
Note that declaring an array of inferred size will look different in C3:
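A small sketch of the inferred-size form, with the C equivalent in a comment (inferred_len is a hypothetical name):

```c3
import std::io;

fn usz inferred_len()
{
    // C:  int x[] = { 1, 2, 3 };
    int[*] x = { 1, 2, 3 };   // Inferred as int[3]
    return x.len;
}

fn void main()
{
    io::printfn("%d", inferred_len());
}
```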
Arrays are trivially copyable:
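A minimal sketch showing that assignment copies the elements rather than sharing them (copy_demo is a hypothetical name):

```c3
import std::io;

fn int copy_demo()
{
    int[3] x = { 1, 2, 3 };
    int[3] y = x;    // Copies all elements of x into y.
    y[0] = 100;      // Modifies only the copy.
    return x[0];     // x is unchanged.
}

fn void main()
{
    io::printfn("%d", copy_demo());
}
```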
Find out more about arrays.
C's typedef and #define become alias¶
C's typedef is replaced by alias:
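A short sketch, with the C declaration in a comment (Foo, FooPtr and read_foo are illustrative names):

```c3
import std::io;

struct Foo
{
    int x;
}

// C:  typedef Foo *FooPtr;
alias FooPtr = Foo*;

fn int read_foo(FooPtr p)
{
    return p.x;   // Dot access works through the pointer.
}

fn void main()
{
    Foo f = { .x = 7 };
    io::printfn("%d", read_foo(&f));
}
```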
alias also allows you to do things that C uses #define for:
// C
#define println puts
#define my_excellent_string my_string
char *my_string = "Party on";
...
println(my_excellent_string);
// C3
alias println = puts;
alias my_excellent_string = my_string;
char* my_string = "Party on";
...
println(my_excellent_string);
Find out more about alias.
typedef creates new types¶
typedef in C3 creates a new type with its own methods, and the original type cannot implicitly convert to this new type; an explicit cast is required.
typedef MyId = int;
fn void get_by_id(MyId id)
{
return;
}
fn void test()
{
MyId valid = (MyId)7; // explicit cast to the new type
int invalid = 7;
get_by_id(valid); // allowed
// get_by_id(invalid); // ERROR: int does not implicitly convert to MyId
}
Changes To enum and introducing constdef¶
C3 enums give new features, such as returning the name of the enum value at runtime. Their underlying representation always starts at 0 without gaps. For C enums with gaps, C3 uses constdef instead:
Read more about enums here.
Bitfields Are Replaced By Explicit Bitstructs¶
A bitstruct has an explicit container type, and each field has an exact bit range.
bitstruct Foo : short
{
int a : 0..2; // Exact bit ranges, bits 0-2
uint b : 3..6;
MyEnum c : 7..13;
}
There exists a simplified form for a bitstruct containing only booleans. It is the same except the ranges are left out:
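A minimal sketch of the boolean-only form (the type and field names are illustrative):

```c3
import std::io;

// Each bool occupies one bit of the char container; no ranges are written.
bitstruct Flags : char
{
    bool has_hyperdrive;
    bool has_tractorbeam;
    bool has_plasmatorpedoes;
}

fn bool only_tractorbeam()
{
    Flags flags;                    // Zero-initialized: all flags are false.
    flags.has_tractorbeam = true;
    return flags.has_tractorbeam && !flags.has_hyperdrive;
}

fn void main()
{
    io::printfn("%s", only_tractorbeam());
}
```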
For more information see the page on bitstructs.
Fixed size basic types¶
Several C types that would be variable sized are fixed size, and others changed names:
// C3
short a; // Guaranteed 16 bits
int b; // Guaranteed 32 bits
long c; // Guaranteed 64 bits
ulong d; // Guaranteed 64 bits
int128 e; // Guaranteed 128 bits
uint128 f; // Guaranteed 128 bits
sz g; // Same as C ptrdiff_t, ssize_t, depends on target
usz h; // Same as C size_t, depends on target
iptr i; // Same as intptr_t depends on target
uptr j; // Same as uintptr_t depends on target
Find out more about types.
Type Qualifiers¶
Qualifiers like const and volatile are removed, but placing const before a declaration
makes it a compile time constant. The constant does not need to be typed.
const A = false;
// Compile time
$if A:
// This will not be compiled
$else
// This will be compiled
$endif
volatile is replaced by macros for volatile load and store.
Modules¶
Modules And Import Instead Of #include¶
Declaring the module name is not mandatory, but if you leave it out the file name will be used as the module name. Imports are recursive.
module otherlib::foo;
fn void test() { ... }
struct FooStruct { ... }
module mylib::bar;
import otherlib;
fn void myCheck()
{
foo::test(); // foo prefix is mandatory.
otherlib::foo::test(); // This also works;
FooStruct x; // But user defined types don't need the prefix.
otherlib::foo::FooStruct y; // But it is allowed.
}
No mandatory header files¶
There is a C3 interchange header format for declaring interfaces of libraries, but it is only used in special cases.
Comments¶
The /* */ comments can be nested.
Note that doc contracts starting with <* and ending with *>, have special rules for parsing them, and are not considered a regular comment. Find out more about contracts.
C3 also treats #! on the first line as a line comment //.
Statements¶
goto Removed¶
goto is removed, but there is labelled break and continue as well as defer
to handle the cases when it is commonly used in C.
// C
Foo *foo = malloc(sizeof(Foo));
if (tryFoo(foo)) goto FAIL;
if (modifyFoo(foo)) goto FAIL;
free(foo);
return true;
FAIL:
free(foo);
return false;
// C3, direct translation:
Foo* foo = malloc(Foo::size); // Declared before the block so it stays in scope after it
do FAIL:
{
if (tryFoo(foo)) break FAIL;
if (modifyFoo(foo)) break FAIL;
free(foo);
return true;
};
free(foo);
return false;
// C3, using defer:
Foo* foo = malloc(Foo::size);
defer free(foo);
if (tryFoo(foo)) return false;
if (modifyFoo(foo)) return false;
return true;
Changes To switch¶
- case statements automatically break.
- Use nextcase to fallthrough to the next statement.
- Empty case statements have implicit fallthrough.
Implicit break in switches¶
Empty case statements have implicit fall through in C3, otherwise the nextcase statement is needed. nextcase can also be used to jump to any other case statement in the switch.
For example:
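A minimal sketch of these rules, assuming a hypothetical classify function:

```c3
import std::io;

fn int classify(int h)
{
    int result = 0;
    switch (h)
    {
        case 1:
            result = 1;
            nextcase;       // Explicit fallthrough to case 2.
        case 2:
            result += 10;   // Implicit break after this statement.
        case 3:             // Empty case: falls through to case 4.
        case 4:
            result = 34;
        default:
            result = -1;
    }
    return result;
}

fn void main()
{
    io::printfn("%d %d %d", classify(1), classify(2), classify(3));
}
```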
We can jump to an arbitrary switch-case label in C3:
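A short sketch of jumping to a specific case by giving nextcase a value (jump is a hypothetical function name):

```c3
import std::io;

fn int jump(int h)
{
    int result = 0;
    switch (h)
    {
        case 1:
            result = 1;
            nextcase 3;     // Jump directly to case 3, skipping case 2.
        case 2:
            result = 2;
        case 3:
            result += 30;
    }
    return result;
}

fn void main()
{
    io::printfn("%d %d %d", jump(1), jump(2), jump(3));
}
```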
Undefined Behaviour¶
C3 has less undefined behaviour, in particular integers are defined as using 2s complement and signed overflow is wrapping. Find out more about undefined behaviour.
Other Changes¶
The following things are enhancements to C, that don't have an equivalent in C.
- Defer
- Methods
- Optionals
- Generic modules
- Contracts
- Compile time evaluation
- Reflection
- Operator overloading
- Macro methods
- Static initialize and finalize functions
- Dynamic interfaces
For the full list of all new features see the feature list.
Finally, the FAQ answers many questions you might have as you start out.
Language Fundamentals
Basic Types
C3 provides a similar set of fundamental data types as C: integers, floats, arrays and pointers. It
expands on this set by adding slices and vectors, as well as the any and typeid types for advanced use.
Integers¶
C3 has signed and unsigned integer types. The built-in signed integer types are ichar, short, int, long,
int128, iptr and sz. The types ichar through int128 all have well-defined power-of-two bit sizes, whereas iptr
has the same number of bits as a void* and sz has the same number of bits as the maximum difference
between two pointers. For each signed integer type there is a corresponding unsigned integer type: char,
ushort, uint, ulong, uint128, uptr and usz.
| type | signed? | min | max | bits |
|---|---|---|---|---|
| ichar | yes | -128 | 127 | 8 |
| short | yes | -32768 | 32767 | 16 |
| int | yes | -2^31 | 2^31 - 1 | 32 |
| long | yes | -2^63 | 2^63 - 1 | 64 |
| int128 | yes | -2^127 | 2^127 - 1 | 128 |
| iptr | yes | varies | varies | varies |
| sz | yes | varies | varies | varies |
| char | no | 0 | 255 | 8 |
| ushort | no | 0 | 65535 | 16 |
| uint | no | 0 | 2^32 - 1 | 32 |
| ulong | no | 0 | 2^64 - 1 | 64 |
| uint128 | no | 0 | 2^128 - 1 | 128 |
| uptr | no | 0 | varies | varies |
| usz | no | 0 | varies | varies |
On 64-bit machines iptr/uptr and sz/usz are usually 64-bits, like long/ulong.
On 32-bit machines on the other hand they are generally int/uint.
Integer constants¶
Numeric constants typically use decimal, e.g. 234, but may also use hexadecimal (base 16) numbers by prefixing
the number with 0x or 0X, e.g. int a = 0x42edaa02;. There is also octal (base 8) using the
0o or 0O prefix, and 0b for binary (base 2) numbers:
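As a small sketch, the same value written in all four bases (same_value is a hypothetical name):

```c3
import std::io;

fn bool same_value()
{
    int dec = 255;
    int hex = 0xFF;          // Hexadecimal (base 16)
    int oct = 0o377;         // Octal (base 8)
    int bin = 0b1111_1111;   // Binary (base 2), with an underscore for readability
    return dec == hex && hex == oct && oct == bin;
}

fn void main()
{
    io::printfn("%s", same_value());
}
```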
Numbers may also insert underscore _ between digits to improve readability, e.g. 1_000_000.
For decimal numbers, the value is assumed to be a signed int, unless the number doesn't fit in an
int, in which case it is assumed to be the smallest signed type it does fit in (long or int128).
For hexadecimal, octal and binary, the type is assumed to be unsigned.
An integer literal can implicitly convert to a floating point literal, or an integer of a different type provided the number fits in the type.
Constant suffixes¶
If you want to ensure that a constant is of a certain type, you can either add an explicit cast
like: (ulong)345, or use an integer suffix: 345ul.
The following integer suffixes are available:
| suffix | type |
|---|---|
| l | long |
| ll | int128 |
| u | uint |
| ul | ulong |
| ull | uint128 |
Suffixes may be uppercase or lowercase.
Booleans¶
A bool will be either true or false. Although a bool is only a single bit of data,
it should be noted that it is stored in a byte.
Character literals¶
A character literal is a value enclosed in ''. Its value is interpreted as being its
ASCII value for a single character.
It is also possible to use 2, 4 or 8 character wide character literals. Such are interpreted
as ushort, uint and ulong respectively and are laid out in memory from left to right.
This means that the actual value depends on the endianness
of the target.
- 2 character literals, e.g. 'C3', convert to a ushort.
- 4 character literals, e.g. 'TEST', convert to a uint.
- 8 character literals, e.g. 'FOOBAR11', convert to a ulong.
The 4 character literals correspond to the layout of FourCC
codes. It will also correctly arrange unicode characters in memory. E.g. Char32 smiley = '\u1F603'
Floating point types¶
As is common, C3 has two floating point types: float and double. float is the 32 bit floating
point type and double is 64 bits.
Floating point constants¶
Floating point constants will at least use 64 bit precision.
Just like for integer constants, it is possible to use _ to improve
readability, but it may not occur immediately before or after a dot or an exponential.
C3 supports floating point values either written in decimal or hexadecimal formats.
For decimal, the exponential symbol is e (or E, both are acceptable),
for hexadecimal p (or P) is used: -2.22e-21 -0x21.93p-10
While floating point numbers default to double it is possible to type a
floating point by adding a suffix:
| Suffix | type |
|---|---|
| f32 or f | float |
| f64 | double |
Arrays¶
Arrays have the format Type[size], so for example: int[4]. An array is a type consisting
of the same element repeated a number of times. Our int[4] is essentially four int values
packed together.
For initialization it's sometimes convenient to use the wildcard Type[*] declaration, which
infers the length from the number of elements:
Slices¶
Slices have the format Type[]. Unlike the array, a slice does not hold the values themselves
but instead presents a view of some underlying array or vector.
Slices have two properties: .ptr, which retrieves the array it points to, and .len which
is the length of the slice - that is, the number of elements it is possible to index into.
Usually we can get a slice by taking the address of an array:
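A minimal sketch of both ways to obtain a slice, taking the address of an array or using a range (total_len is a hypothetical name):

```c3
import std::io;

fn usz total_len()
{
    int[3] abc = { 1, 2, 3 };
    int[] whole = &abc;        // View of all three elements.
    int[] tail = abc[1..2];    // Inclusive range: elements 1 and 2.
    return whole.len + tail.len;
}

fn void main()
{
    io::printfn("%d", total_len());
}
```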
Because indexing into slices is range checked in safe mode, slices are much safer than passing a pointer and a length separately.
The internal representation of a slice is a two element struct:
This definition can be found in the module std::core::runtime.
Vectors¶
Vectors, similar to arrays, use the format
Type[<size>], with the restriction that vectors may only form out
of integers, floats and booleans. Similar to arrays, wildcard can be
used to infer the size of a vector:
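A short sketch of the wildcard form (vector_sum is a hypothetical name):

```c3
import std::io;

fn int vector_sum()
{
    int[<*>] a = { 1, 2 };   // Inferred as int[<2>]
    return a[0] + a[1];
}

fn void main()
{
    io::printfn("%d", vector_sum());
}
```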
Vectors are based on hardware SIMD vectors, and support many different operations that work on all elements in parallel, including arithmetics:
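A minimal sketch of element-wise multiplication (multiply is a hypothetical name):

```c3
import std::io;

fn int[<2>] multiply()
{
    int[<2>] a = { 23, 11 };
    int[<2>] b = { 2, 1 };
    return a * b;   // Element-wise: { 23 * 2, 11 * 1 }
}

fn void main()
{
    int[<2>] c = multiply();
    io::printfn("%d %d", c[0], c[1]);
}
```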
Vector initialization and literals work the same way as arrays, using { ... }, however, it's also possible to use
swizzling arguments to designated initialization:
String literals¶
String literals are special and can convert to several different types:
String, char and ichar arrays and slices and finally ichar* and char*.
String literals are text enclosed in " " just like in C. These support
escape sequences like \n for line break and need to use \" for any " inside of the
string.
C3 also offers raw strings which are enclosed in ` `.
A raw string may span multiple lines.
Inside of a raw string, no escapes are available, and to write a `, simply double the character:
// Note: String is a typedef inline char[]
String three_lines =
`multi
line
string`;
String foo = `C:\foo\bar.dll`;
String bar = `"Say ``hello``"`;
// Same as
String foo = "C:\\foo\\bar.dll";
String bar = "\"Say `hello`\"";
String is a
typedef inline char[], which can implicitly convert to char[] when required.
ZString is a typedef inline char*. ZString is a C compatible null terminated string, which can implicitly convert to char* when required.
Base64 and hex data literals¶
Base64 literals are strings prefixed with b64 containing
Base64 encoded data, which
is converted into a char array at compile time:
// The array below contains the characters "Hello World!"
char[*] hello_world_base64 = b64"SGVsbG8gV29ybGQh";
The corresponding hex data literals convert a hexadecimal string rather than Base64:
// The array below contains the characters "Hello World!"
char[*] hello_world_hex = x"4865 6c6c 6f20 776f 726c 6421";
Pointer types¶
Pointers have the syntax Type*. A pointer is a memory address where one or possibly more
elements of the underlying type are stored. Pointers can be stacked: Foo* is a pointer to a Foo
while Foo** is a pointer to a pointer to Foo.
The pointer type has a special literal called null, which is an invalid, empty pointer.
void*¶
The void* type is a special pointer which implicitly converts to any other pointer. It is not "a pointer to void",
but rather a wildcard pointer which matches any other pointer.
Printing values¶
Printing values can be done using io::print, io::printn, io::printf and io::printfn. This requires
importing the module std::io.
Note
The n variants of the print functions will add a newline after printing, which is what we'll often
use in the examples, but print and printf work the same way.
import std::io; // Get the io functions.
fn void main()
{
int a = 1234;
ulong b = 0xFFAABBCCDDEEFF;
double d = 13.03e-04;
char[*] hex = x"4865 6c6c 6f20 776f 726c 6421";
io::printn(a);
io::printn(b);
io::printn(d);
io::printn(hex);
}
If you run this program you will get:
To get more control we can format the output using printf and printfn:
import std::io;
fn void main()
{
int a = 1234;
ulong b = 0xFFAABBCCDDEEFF;
double d = 13.03e-04;
char[*] hex = x"4865 6c6c 6f20 776f 726c 6421";
io::printfn("a was: %d", a);
io::printfn("b in hex was: %x", b);
io::printfn("d in scientific notation was: %e", d);
io::printfn("Bytes as string: %s", (String)&hex);
}
We can apply the standard printf formatting rules, but
unlike in C/C++ there is no need to indicate the type when using %d: it will print both unsigned and
signed integers up to int128. In fact, there is no support for %u, %lld etc. in io::printf. Furthermore,
%s works not just on strings but on any type:
import std::io;
enum Foo
{
ABC,
BCD,
EFG,
}
fn void main()
{
int a = 1234;
uint128 b = 0xFFEEDDCC_BBAA9988_77665544_33221100;
Foo foo = BCD;
io::printfn("a: %s, b: %d, foo: %s", a, b, foo);
}
This prints:
Variables
Zero init by default¶
Unlike C, C3 local variables are zero-initialized by default. To avoid zero initialization, you need to explicitly opt-out.
int x; // x = 0
int y @noinit; // y is explicitly undefined and must be assigned before use.
AStruct foo; // foo is implicitly zeroed
AStruct bar = {}; // bar is explicitly zeroed
AStruct baz @noinit; // baz is explicitly undefined
Using a variable that is explicitly undefined before assignment will trap or be initialized to a specific value when compiling "safe" and is undefined behaviour in "fast" builds.
Functions
C3 has both regular functions and methods. Methods are functions namespaced using type names, and allow invocation using the dot syntax.
Regular functions¶
Regular functions are the same as C aside from the keyword fn, which is followed by the conventional C declaration of <return type> <name>(<parameter list>).
Function arguments¶
C3 allows the use of default arguments as well as named arguments. Note that any unnamed arguments must appear before any named arguments.
fn int test_with_default(int foo = 1)
{
return foo;
}
fn void test()
{
test_with_default();
test_with_default(100);
}
Named arguments
fn void test_named(int times, double data)
{
for (int i = 0; i < times; i++)
{
io::printfn("Hello %f", i + data);
}
}
fn void test()
{
// Named only
test_named(times: 1, data: 3.0);
// Unnamed only
test_named(3, 4.0);
// Mixing named and unnamed
test_named(15, data: 3.141592);
}
Named arguments with defaults:
fn void test_named_default(int times = 1, double data = 3.0, bool dummy = false)
{
for (int i = 0; i < times; i++)
{
io::printfn("Hello %f", i + data);
}
}
fn void test()
{
// Named only
test_named_default(times: 10, data: 3.5);
// Unnamed and named
test_named_default(3, dummy: false);
// Overwriting an unnamed argument with a named argument is an error:
// test_named_default(2, times: 3); ERROR!
// Unnamed may not follow named arguments.
// test_named_default(times: 3, 4.0); ERROR!
}
Vaargs¶
There are four types of vaargs:
- single typed
- explicitly typed any: pass non-any arguments as references
- implicitly typed any: arguments are implicitly converted to references (use with care)
- untyped C-style
Examples:
fn void va_singletyped(int... args)
{
/* args has type int[] */
}
fn void va_variants_explicit(any... args)
{
/* args has type any[] */
}
fn void va_variants_implicit(args...)
{
/* args has type any[] */
}
extern fn void va_untyped(...); // only used for extern C functions
fn void test()
{
va_singletyped(1, 2, 3);
int x = 1;
any v = &x;
va_variants_explicit(&&1, &x, v); // pass references for non-any arguments
va_variants_implicit(1, x, "foo"); // arguments are implicitly converted to anys
va_untyped(1, x, "foo"); // extern C-function
}
For typed vaargs, we can pass a slice instead of the individual arguments by using the splat operator (...), for example:
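A minimal sketch of splatting a slice into a typed vaarg slot. The function sum is hypothetical and not part of the standard library:

```c3
import std::io;

fn int sum(int... args)
{
    // Inside the function, args has type int[].
    int total = 0;
    foreach (x : args) total += x;
    return total;
}

fn void main()
{
    int[] values = { 1, 2, 3 };
    // Same as calling sum(1, 2, 3).
    io::printfn("%d", sum(...values));
}
```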
Splat¶
- Splat ... of an unknown size slice: ONLY in a typed vaarg slot.
fn void va_singletyped(int... args) {
io::printfn("%s", args);
}
fn void main()
{
int[2] arr = {1, 2};
va_singletyped(...arr); // arr is splatting two arguments
}
- Splat ... of any array: anywhere.
fn void foo(int a, int b, int c)
{
io::printfn("%s, %s, %s", a, b, c);
}
fn void main()
{
int[2] arr = {1, 2};
foo(...arr, 7); // arr is splatting two arguments
}
- Splat ... of known size slices: anywhere.
fn void foo(int a, int b, int c)
{
io::printfn("%s, %s, %s", a, b, c);
}
fn void main()
{
int[5] arr = {1, 2, 3, 4, 5};
foo(...arr[:3]); // slice is splatting three arguments
}
- Splat ... of a struct: anywhere. The splatted struct must have its fields in the same order as the function parameters (the names don't need to match, but the types do).
struct Foo
{
int x;
double y;
}
fn void foo(int a, double b)
{
io::printfn("foo: a=%s, b=%s", a, b);
}
fn void bar(double b, int a)
{
io::printfn("bar: b=%s, a=%s", b, a);
}
fn void main()
{
Foo f = {.x = 12, .y = 2.2};
foo(...f); // OK
bar(...f); // ERROR, not in the same order
}
Named arguments and vaargs¶
Usually, a parameter after vaargs would never be assigned to:
fn void testme(int a, double... x, double rate = 1.0) { /* ... */ }
fn void test()
{
// x is { 2.0, 5.0, 6.0 } rate would be 1.0
testme(3, 2.0, 5.0, 6.0);
}
However, named arguments can be used to set this value:
fn void testme(int a, double... x, double rate = 1.0) { /* ... */ }
fn void test()
{
// x is { 2.0, 5.0 } rate would be 6.0
testme(3, 2.0, 5.0, rate: 6.0);
}
Functions and Optional returns¶
Function return values may be Optionals – denoted by <type>? indicating that this
function might either return an Optional with a result, or an Optional with an Excuse.
For example, this function might return BAD_JOSS_ERROR or BAD_LUCK_ERROR if it fails to produce a valid value.
faultdef BAD_LUCK_ERROR, BAD_JOSS_ERROR;
fn double? test_error()
{
double val = random_value();
if (val > 0.5) return BAD_LUCK_ERROR~;
if (val >= 0.2) return BAD_JOSS_ERROR~;
return val;
}
A function call which is passed one or more Optional arguments will only execute if all Optional values contain a result, otherwise the first Excuse found is returned.
fn void test()
{
// The following line either prints a value less than 0.2
// or does not print at all. The (void) is needed
// to let the compiler know we're deliberately
// ignoring the Optional result.
(void)io::printfn("%f", test_error());
// ?? sets a default value if an Excuse is found
double x = (test_error() + test_error()) ?? 100;
// This prints either a value less than 0.4 or 100:
io::printfn("%f", x);
}
This allows us to chain functions:
fn void print_input_with_explicit_checks()
{
String? line = io::treadline();
if (try line)
{
// line is a regular "string" here.
int? val = line.to_int();
if (try val)
{
io::printfn("You typed the number %d", val);
return;
}
}
io::printn("You didn't type an integer :/ ");
}
fn void print_input_with_chaining()
{
if (try int val = io::treadline().to_int())
{
io::printfn("You typed the number %d", val);
return;
}
io::printn("You didn't type an integer :/ ");
}
Methods¶
Methods look exactly like functions, but are prefixed with a type name and are (usually) invoked using dot syntax, on an instance of the type.
struct Point
{
int x;
int y;
}
fn void Point.add(Point* p, int x)
{
p.x += x;
}
fn void example()
{
Point p = { 1, 2 };
// Invoked using dot syntax:
p.add(10);
// Also callable as:
Point.add(&p, 10);
}
The target object may be passed by value or by pointer:
enum State
{
STOPPED,
RUNNING
}
fn bool State.may_open(State state)
{
switch (state)
{
case STOPPED: return true;
case RUNNING: return false;
}
}
You can add methods to all runtime types, including built-in types:
fn int int.add(int i, int other)
{
return i + other;
}
fn void test()
{
int i = 3;
int j = i.add(4);
}
Implicit first parameters¶
Because the type of the first argument is known, it may be left out. To indicate a non-null pointer, & is used.
fn int Foo.test(&self) { /* ... */ }
// (almost) equivalent to
fn int Foo.test(Foo* self) { /* ... */ }
fn int Bar.test(self) { /* ... */ }
// equivalent to
fn int Bar.test(Bar self) { /* ... */ }
This means that in order to express a nullable first parameter, one must use the explicit form (e.g. Foo* self) rather than the untyped &self form.
It is customary to use self as the name of the first parameter, but it is not required.
Restrictions on methods¶
- Methods on a struct/union may not have the same name as a member.
- Methods on enums may not have the same name as an associated value.
- When taking a function pointer of a method, use the full name.
- Using subtypes, overlapping function names will be shadowed.
Guidelines on method use¶
Methods are customarily associated with Object-Oriented programming.
In this style one will often encounter code like some_object.run_everything().
C3 does not accommodate this style; instead one should prefer task::run_everything(some_object).
Both the standard library and the design of the language instead follow
the principle that functions are used whenever the system is mutating
global data, whereas methods are used for mutating a particular value, or
extracting data from it. foo.add(bar), foo.to_list() and foo.push(x)
are all good uses of methods. On the flip side, method usage like
context.parse_data(data), game.run(settings) and url.make_request()
are emphatically not recommended.
Contracts¶
C3's error handling is not intended to use errors to signal invalid data or to check invariants and postconditions. Instead, C3's approach is to add annotations to the function, which will conditionally be compiled into asserts.
As an example, the following code:
<*
@param foo `the number of foos`
@require foo > 0, foo < 1000
@return `number of foos x 10`
@ensure return < 10000, return > 0
*>
fn int test_foo(int foo)
{
return foo * 10;
}
Will in debug builds be compiled into something like this:
fn int test_foo(int foo)
{
assert(foo > 0);
assert(foo < 1000);
int _return = foo * 10;
assert(_return < 10000);
assert(_return > 0);
return _return;
}
The compiler is allowed to use the contracts for optimizations. For example this:
fn int test_example(int bar)
{
// The following is always false due to the `@ensure`
if (test_foo(bar) == 0) return -1;
return 1;
}
May be optimized to:
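A sketch of what the reduced function could look like. What a given compiler version actually emits is not guaranteed:

```c3
fn int test_example(int bar)
{
    // The `test_foo(bar) == 0` branch has been removed,
    // because `@ensure return > 0` rules it out.
    return 1;
}
```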
In this case the compiler can look at the postcondition return > 0 to determine that test_foo(bar) == 0 must always be false.
Looking closely at this code, we note that nothing guarantees that bar is not violating the preconditions. In safe builds this will usually be checked at runtime, but a sufficiently smart compiler will warn about the lack of checks on bar. Execution of code violating pre- and postconditions has unspecified behaviour.
Short function declaration syntax¶
For very short functions, C3 offers a "short declaration" syntax using =>:
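A minimal sketch comparing the two forms. The function names are made up for illustration:

```c3
import std::io;

// Regular declaration.
fn int square(int x)
{
    return x * x;
}

// Equivalent short declaration using =>.
fn int square_short(int x) => x * x;

fn void main()
{
    io::printfn("%d %d", square(4), square_short(4));
}
```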
Lambdas¶
It's possible to create anonymous functions using the regular fn syntax. Anonymous
functions are identical to regular functions and do not capture variables from the
surrounding scope:
alias IntTransform = fn int(int);
fn void apply(int[] arr, IntTransform t)
{
foreach (&i : arr) *i = t(*i);
}
fn void main()
{
int[] x = { 1, 2, 5 };
// Short syntax with inference:
apply(x, fn (i) => i * i);
// Regular syntax without inference:
// apply(x, fn int(int i) { return i * i; });
// Prints [1, 4, 25]
io::printfn("%s", x);
}
Static initializers and finalizers¶
It is sometimes useful to run code at startup and shutdown of a program.
Static initializers and finalizers are regular functions annotated with
@init and @finalizer that are run at startup and shutdown respectively.
(Note: this should not be confused with constructors and destructors
in object-oriented languages.)
fn void run_at_startup() @init
{
// Run at startup
some_function.init(512);
}
fn void run_at_shutdown() @finalizer
{
some_thing.shutdown();
}
Note that running @finalizer functions is a best effort attempt by the OS, and they may not
be called during abnormal shutdown.
Changing priority of static initializers and finalizers¶
It is possible to provide an argument to the attributes to set the actual priority. It is recommended that programs use a priority of 1024 or higher. The higher the value, the later it will be called. The lowest priority is 65535.
// Print "Hello World" at startup.
fn void start_world() @init(3000)
{
io::printn("World");
}
fn void start_hello() @init(2000)
{
io::print("Hello ");
}
Implementing parameter access constraints¶
<*
A read-only function
@param [in] value
*>
fn void read(int* value)
{
io::printf("%d",*value);
// (*value)++; <- Error: 'in' parameters may not be assigned to.
}
<*
A write-only function
@param [out] buffer
*>
fn void write(int* buffer)
{
*buffer = 1; // Writing is allowed; (*buffer)++ would also read the value.
// int test = *buffer; <- Error: 'out' parameters may not be read.
}
See the contracts for more details.
Statements
Statements largely work like in C, but with some additions.
Labelled break and continue¶
Labelled break and continue let you break out of, or continue, an outer scope. Labels can be put on if,
switch, while and do statements.
fn void test(int i)
{
if FOO: (i > 0)
{
while (1)
{
io::printfn("%d", i);
// Break out of the top if statement.
if (i++ > 10) break FOO;
}
}
}
Do-without-while¶
Do-while statements can skip the ending while. In that case it acts as if the while was while(0):
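A minimal sketch of the two equivalent forms. The counter function is hypothetical and only there to make the behaviour observable:

```c3
import std::io;

fn int count_runs()
{
    int runs = 0;
    // Classic form: the body runs exactly once.
    do
    {
        runs++;
    } while (0);

    // Equivalent form without the trailing while.
    do
    {
        runs++;
    };
    return runs;
}

fn void main()
{
    io::printfn("%d", count_runs());
}
```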
The function below prints World! if x is zero, otherwise it prints Hello World!.
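A sketch of such a function, using break to leave the do block early:

```c3
import std::io;

fn void test(int x)
{
    do
    {
        // Leaves the do block early, skipping "Hello ".
        if (x == 0) break;
        io::print("Hello ");
    };
    io::printn("World!");
}

fn void main()
{
    test(0);
    test(1);
}
```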
Nextcase and labelled nextcase¶
The nextcase statement is used in switch and if-catch to jump to the next case:
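A minimal sketch. The function visited_cases is hypothetical and counts how many case bodies were executed:

```c3
import std::io;

fn int visited_cases(int i)
{
    int visited = 0;
    switch (i)
    {
        case 1:
            visited++;
            nextcase; // Continue with the body of case 2.
        case 2:
            visited++; // Cases break implicitly at the end.
        default:
            visited += 10;
    }
    return visited;
}

fn void main()
{
    io::printfn("%d %d %d", visited_cases(1), visited_cases(2), visited_cases(3));
}
```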
It's also possible to use nextcase with an expression, to jump to an arbitrary case or between labeled switch statements:
switch MAIN: (enum_var)
{
case FOO:
switch (i)
{
case 1:
doSomething();
nextcase 3; // Jump to case 3
case 2:
doSomethingElse();
case 3:
nextcase rand(); // Jump to random case
default:
io::printn("Ended");
nextcase MAIN: BAR; // Jump to outer (MAIN) switch
}
case BAR:
io::printn("BAR");
default:
break;
}
This can be used as a structured goto when creating state machines.
Switch cases with runtime evaluation¶
It's possible to use switch as an enhanced if-else chain:
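A sketch of the idea, assuming a hypothetical function that classifies an integer by sign:

```c3
fn int sign_switch(int x)
{
    int result;
    // Each case is a runtime expression compared against true.
    switch (true)
    {
        case x < 0:
            result = -1;
        case x > 0:
            result = 1;
        default:
            result = 0;
    }
    return result;
}
```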
The above would be equivalent to writing:
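For a switch that classifies x by sign, the plain if-else form might look like this (sign_if_else is a hypothetical name):

```c3
fn int sign_if_else(int x)
{
    int result;
    if (x < 0)
    {
        result = -1;
    }
    else if (x > 0)
    {
        result = 1;
    }
    else
    {
        result = 0;
    }
    return result;
}
```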
Note that because of this, the first match is always picked. Consider:
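A sketch with overlapping conditions. The assignments stand in for calls to foo() and bar(), and first_match is a hypothetical name:

```c3
fn int first_match(int x)
{
    int result = 0;
    switch (true)
    {
        case x > 0:
            result = 1; // Stands in for foo().
        case x > 2:
            result = 2; // Stands in for bar(); never reached, since x > 0 matches first.
    }
    return result;
}
```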
Because of the evaluation order, only foo() will be invoked for x > 0, even when x is greater than 2.
It's also possible to omit the conditional after switch. In that case it is implicitly assumed to be the same as writing (true).
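A minimal sketch of a switch without a conditional. The function bucket is hypothetical:

```c3
fn int bucket(int x)
{
    int result;
    // No conditional: behaves like switch (true).
    switch
    {
        case x < 10:
            result = 1;
        case x < 100:
            result = 2;
        default:
            result = 3;
    }
    return result;
}
```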
Jumptable switches with @jump¶
Regular switch statements with only enum or integer cases may use the @jump
attribute. This attribute ensures that the switch is implemented as
a jump using a jumptable. In C this is possible to do manually using labels and
computed gotos, which are extensions available in GCC/Clang.
The behaviour of the switch itself does not change with a jumptable,
but some restrictions will apply. Typically used for situations
like bytecode interpreters, it might perform worse
or better than a regular switch depending on the situation.
nextcase statements will also use jumptable dispatch when
@jump is used.
Expressions
Temporary address¶
Expressions work like in C, with one exception: it is possible to take the address of a temporary. This uses the operator && rather than &.
Consequently, this is valid:
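A minimal sketch, using a hypothetical helper that takes a pointer:

```c3
import std::io;

fn int add_one(int* x)
{
    return *x + 1;
}

fn void main()
{
    // && takes the address of the temporary value 41.
    io::printfn("%d", add_one(&&41));

    // With & a named variable is needed, as in C.
    int value = 41;
    io::printfn("%d", add_one(&value));
}
```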
A pointer created with && is only valid until the end of the
current function. In other words, you should never return the
pointer created by && from a function as it will never be safe
to use.
Well-defined evaluation order¶
Expressions have a well-defined evaluation order:
- Binary expressions are evaluated from left to right.
- Assignment occurs right to left, so a = a++ would result in a being unchanged.
- Call arguments are evaluated in parameter order.
Compound literals¶
C3 has C's compound literals:
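A sketch of a struct compound literal passed directly as an argument. The names Foo and total are made up for illustration:

```c3
import std::io;

struct Foo
{
    int a;
    double b;
}

fn double total(Foo x)
{
    return x.a + x.b;
}

fn void main()
{
    // (Foo){ ... } creates a temporary Foo.
    io::printfn("%f", total((Foo){ 1, 2.0 }));
}
```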
Arrays follow the same syntax:
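The array form, sketched with a hypothetical function taking a fixed-size array:

```c3
import std::io;

fn int sum3(int[3] x)
{
    return x[0] + x[1] + x[2];
}

fn void main()
{
    // (int[3]){ ... } creates a temporary array.
    io::printfn("%d", sum3((int[3]){ 1, 2, 3 }));
}
```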
Note that when it's possible, inferring the type is allowed and preferred, so we have for the above examples:
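A sketch of the inferred form, where the parameter type determines the type of the literal. Point and the two functions are hypothetical names:

```c3
import std::io;

struct Point
{
    int a;
    double b;
}

fn double point_total(Point p) => p.a + p.b;
fn int array_total(int[3] x) => x[0] + x[1] + x[2];

fn void main()
{
    // No explicit (Point) or (int[3]) is needed here.
    io::printfn("%f", point_total({ 1, 2.0 }));
    io::printfn("%d", array_total({ 1, 2, 3 }));
}
```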
One may take the address of temporaries using && (rather than & for normal variables). This allows the following:
Passing a slice
fn void test(int[] y) { ... }
// Using &&
test(&&(int[3]){ 1, 2, 3 });
// Explicitly slicing:
test(((int[3]){ 1, 2, 3 })[..]);
// Using a slice directly as a temporary:
test((int[]){ 1, 2, 3 });
// Same as above but with inferred type:
test({ 1, 2, 3 });
Passing the pointer to an array
fn void test1(int[3]* z) { ... }
fn void test2(int* z) { ... }
test1(&&(int[3]){ 1, 2, 3 });
test2(&&(int[3]){ 1, 2, 3 });
Constant expressions¶
In C3 all constant expressions are guaranteed to be calculated at compile time. The following are considered constant expressions:
- The null literal.
- Boolean, floating point and integer literals.
- The result of arithmetic on constant expressions.
- Compile time variables (prefixed with $).
- Global constant variables with initializers that are constant expressions.
- The result of macros that do not generate code and only use constant expressions.
- The result of a cast if the value is cast to a boolean, floating point or integer type and the value that is converted is a constant expression.
- String literals.
- Initializer lists containing constant values.
Some things that are not constant expressions:
- Any pointer that isn't the null literal, even if it's derived from a constant expression.
- The result of a cast except for casts of constant expressions to a numeric type.
- Compound literals - even when values are constant expressions.
Including binary data¶
The $embed(...) function includes the contents of a file into the compilation as a
constant array of bytes:
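A minimal sketch, assuming a file named my_image.png exists at a path the compiler can resolve:

```c3
import std::io;

fn void main()
{
    // Assumption: my_image.png exists; otherwise this is a compile time error.
    char[*] my_image = $embed("my_image.png");
    io::printfn("Embedded %d bytes", my_image.len);
}
```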
The result of an embed works similarly to a string literal and may implicitly convert to a char*,
void*, char[], char[*] or String.
Limiting length¶
It's possible to limit the length of what is included using the optional second parameter.
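A sketch that embeds only the start of a file, assuming foo.txt exists and holds at least 4 bytes:

```c3
import std::io;

fn void main()
{
    // The second argument limits the embedded data to 4 bytes.
    char[*] first_four = $embed("foo.txt", 4);
    io::printfn("Embedded %d bytes", first_four.len);
}
```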
Failure to load at compile time and defaults¶
Usually it's a compile time error if the file can't be included, but sometimes it's useful to only optionally include it. If this is desired, declare the left hand side an Optional:
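A sketch of the Optional form, which compiles whether or not the (assumed) file my_image.png is present:

```c3
import std::io;

fn void main()
{
    // An Optional left hand side turns a missing file into an Excuse.
    char[]? my_image = $embed("my_image.png");
    if (catch excuse = my_image)
    {
        io::printfn("Could not embed the image: %s", excuse);
        return;
    }
    // my_image is unwrapped to a plain char[] from here on.
    io::printfn("Embedded %d bytes", my_image.len);
}
```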
my_image will be an optional io::FILE_NOT_FOUND~ if the image is missing.
This also allows us to pass a default value using ??:
Modules
C3 groups functions, types, variables and macros into namespaces called modules. When doing builds, any C3 file must start with the module keyword, specifying the module. When compiling single files, the module is not needed and the module name is assumed to be the file name, converted to lower case, with any invalid characters replaced by underscore (_).
A module can consist of multiple files, e.g.
file_a.c3
file_b.c3
file_c.c3
Here file_a.c3 and file_b.c3 belong to the same module, foo while file_c.c3 belongs to bar.
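The contents of the three files could be sketched as follows. The functions are hypothetical, and the sections are shown together, which also works as a single file with several module declarations:

```c3
// file_a.c3
module foo;
fn int one() => 1;

// file_b.c3
module foo;
// Same module as file_a.c3, so no prefix is needed.
fn int two() => one() + 1;

// file_c.c3
module bar;
import foo;
// A different module: foo must be imported and its functions prefixed.
fn int three() => foo::two() + 1;
```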
Details¶
Some details about the C3 module system:
- Modules can be arbitrarily nested, e.g. module foo::bar::baz; to create the sub module baz in the sub module bar of the module foo.
- Module names must be alphanumeric lower case letters, and may contain an underscore _.
- Module names are limited to 31 characters.
- Modules may be spread across multiple files.
- A single file may have multiple module declarations.
- Each declaration of a distinct module is called a module section.
Importing Modules¶
Modules are imported using the import statement. Imports always recursively import sub-modules. Any module
will automatically import all other modules with the same parent module.
foo.c3
bar.c3
module bar;
import some;
// import some::foo; <- not needed, as it is a sub module to "some"
fn void test()
{
foo::test();
// some::foo::test() also works.
}
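The contents of foo.c3 are not shown above. A minimal sketch, assuming it declares the sub module some::foo with the test() function that bar.c3 calls:

```c3
// foo.c3 (assumed contents)
module some::foo;
import std::io;

fn void test()
{
    io::printn("some::foo::test() was called");
}
```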
In some cases there may be ambiguities, in which case the full path can be used to resolve the ambiguity:
abc.c3
de.c3
test.c3
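A sketch of what the three files might contain. The struct and function names are assumptions, and the module in test.c3 is named test1 here purely for illustration:

```c3
// abc.c3
module abc;
struct Context
{
    int a;
}

// de.c3
module de;
struct Context
{
    void* ptr;
}

// test.c3
module test1;
import de, abc;

fn int read_a()
{
    // Context c = {}; <- Error: ambiguous, both abc and de define Context.
    abc::Context c = { .a = 1 };
    return c.a;
}
```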
Implicit Imports¶
The module system will also implicitly import:
- The std::core module (and sub modules).
- Any other module sharing the same top module. E.g. the module foo::abc will implicitly also import modules foo and foo::cde if they exist.
Visibility¶
All files in the same module share the same global declaration namespace. By default a symbol is visible to all other modules.
To make a symbol only visible inside the module, use the @private attribute.
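A minimal sketch of a module with one public and one private function. The return values are made up so the behaviour can be observed:

```c3
module foo;

fn int init()
{
    // Allowed: open() is called from inside module foo.
    return open();
}

fn int open() @private
{
    return 42;
}
```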
In this example, the other modules can use the init() function after importing foo, but only files in the foo module can use open(), as it is specified as private.
It's possible to further restrict visibility: @local works like @private except it's only visible in the
local context.
// File foo.c3
module foo;
fn void abc() @private { }
fn void de() @local { }
// File foo2.c3
module foo;
fn void test()
{
abc(); // Access of private in the same module is ok
// de(); <- Error: function is local to foo.c3
}
Overriding Symbol Visibility Rules¶
By using import <module> @public, it's possible to access another module's private symbols.
Many other module systems have hierarchical visibility rules, but the import @public feature allows
visibility to be manipulated in a more ad-hoc manner without imposing hard rules.
For example, you may provide a library with two modules: "mylib::net" and "mylib::file", which both use functions
and types from a common "mylib::internals" module. The two modules use import mylib::internals @public
to access this module's private functions and types. To an external user of the library, "mylib::internals"
does not seem to exist, but inside of your library you use it as a shared dependency.
A simple example:
// File a.c3
module a;
fn void a_function() @private { ... }
// File b.c3
module b;
fn void b_function() @private { ... }
// File c.c3
module c;
import a;
import b @public;
fn void test()
{
// Error! a_function() is private
a::a_function();
// Allowed since `import b @public` makes the private
// symbols of `b` accessible in this context.
b::b_function();
}
Note: @local visibility cannot be overridden using a "@public" import.
Changing The Default Visibility¶
In a normal module, global declarations will be public by default. If some other
visibility is desired, it's possible to declare @private or @local after the module name.
It will affect all declarations in the same section.
module foo @private;
fn void ab_private() { ... } // Private
module foo;
fn void ab_public() { ... } // Public
module bar;
import foo;
fn void test()
{
foo::ab_public(); // Works
// foo::ab_private(); <- Error, private method
}
If the default visibility is @private or @local, using @public sets the visibility to public:
module foo @private;
fn void ab_private() { ... } // Private
fn void ab_public() @public { ... } // Public
Linker Visibility and Exports¶
A function or global prefixed with extern will be assumed to be linked in later.
An "extern" function may not have a body, and global variables are prohibited
from having an init expression.
The attribute @export explicitly marks a function as being exported when
creating a (static or dynamic) library. It can also change the linker name of
the function.
Using Functions and Types From Other Modules¶
As a rule, functions, macros, constants, variables and types in the same module do not need any namespace prefix. For imported modules the following rules hold:
- Functions, macros, constants and variables require at least the (sub-) module name.
- Types do not require the module name unless the name is ambiguous.
- In case of ambiguity, only so many levels of module names are needed as to make the symbol unambiguous.
// File a.c3
module a;
struct Foo { ... }
struct Bar { ... }
struct TheAStruct { ... }
fn void a_function() { ... }
// File b.c3
module b;
struct Foo { ... }
struct Bar { ... }
struct TheBStruct { ... }
fn void b_function() { ... }
// File c.c3
module c;
import a, b;
struct TheCStruct { ... }
struct Bar { ... }
fn void c_function() { ... }
fn void test()
{
TheAStruct stA;
TheBStruct stB;
TheCStruct stC;
// Name required to avoid ambiguity;
b::Foo stBFoo;
// Will always pick the current module's
// name.
Bar bar;
// Namespace required:
a::a_function();
b::b_function();
// A local symbol does not require it:
c_function();
}
This means that the rule for the common case can be summarized as
Types are used without prefix; functions, variables, macros and constants are prefixed with the sub module name.
Module Sections¶
A single file may have multiple module declarations, even for the same module. This allows us to write for example:
// File foo.c3
module foo;
fn int hello_world()
{
return my_hello_world();
}
module foo @private;
import std::io; // The import is only visible in this section.
fn int my_hello_world() // @private by default
{
io::printn("Hello, world");
return 0;
}
module foo @test;
fn void test_hello() // @test by default
{
assert(hello_world() == 0);
}
Versioning and Dynamic Inclusion¶
NOTE: This feature may significantly change.
When including dynamic libraries, it is possible to use optional functions and globals. This is done using the
@dynamic attribute.
An example library could have this:
dynlib.c3i
module dynlib;
fn void do_something() @dynamic(4.0);
fn void do_something_else() @dynamic(0, 5.0);
fn void do_another_thing() @dynamic(0, 2.5);
Importing the dynamic library and setting the base version to 4.5 and minimum version to 3.0, we get the following:
test.c3
import dynlib;
fn void test()
{
if (@available(dynlib::do_something))
{
dynlib::do_something();
}
else
{
dynlib::do_something_else();
}
}
In this example the code would run do_something if available
(that is, when the dynamic library is 4.0 or higher), or
fall back to do_something_else otherwise.
If we tried to conditionally add something not available in the compilation itself, that is a compile time error:
if (@available(dynlib::do_another_thing))
{
// Error: This function is not available with 3.0
dynlib::do_another_thing();
}
Versionless dynamic loading is also possible:
maybe_dynlib.c3i
test2.c3
import maybe_dynlib;
fn void testme2()
{
if (@available(maybe_dynlib::testme))
{
maybe_dynlib::testme();
}
}
This allows things like optionally loading dynamic libraries on the platforms where this is available.
Textual Includes¶
$include¶
It's sometimes useful to include an entire file. This is done using the $include function.
Includes are only valid at the top level.
File Foo.c3
File Foo.x
The result is as if Foo.c3 contained the following:
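The file contents are not shown above, so this is only a sketch. It assumes Foo.c3 declares module foo, includes Foo.x with $include("Foo.x") and defines test(), while Foo.x holds the hypothetical helper test_x:

```c3
module foo;
import std::io;

// This function is assumed to come from Foo.x.
fn int test_x(int i)
{
    return i + 1;
}

// This function is assumed to be written directly in Foo.c3.
fn void test()
{
    io::printfn("%d", test_x(2));
}
```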
The include may use an absolute or relative path. A relative path is always relative to the source file in which the include appears.
Note that to use it, the trust level of the compiler must be set to at least 2 with
the --trust option (i.e. use --trust=include or --trust=full from the command line).
$exec¶
An alternative to $include is $exec which is similar to include, but instead includes the output of an external
program as the included text.
An example:
import std::io;
// On Linux or MacOS this will insert 'String a = "Hello world!";'
$exec("echo", { "String a = \\\"Hello world!\\\"\\;" });
fn void main()
{
io::printn(a);
}
$exec can take in 1 to 3 arguments:
- command/scriptname (String)
- args (String[]): arguments passed on commandline to the command/script
- stdin (String): text that the command/script can read from stdin
Using $exec requires full trust level, which is enabled with --trust=full from the command line.
By default, $exec runs from the /scripts directory for projects. For non-project builds,
the current directory is used.
$exec Scripting¶
$exec allows a special scripting mode, where one or more C3 files are compiled on the fly and
run by $exec.
import std::io;
// Compile foo.c3 and bar.c3 in the /scripts directory, invoke the resulting binary
// with the argument 'test'
$exec("foo.c3;bar.c3", { "test" });
fn void main()
{
...
}
Non-Recursive Imports¶
In specific circumstances you only wish to import a module without its submodules. This can be helpful in certain situations where otherwise unnecessary name-collisions would occur, but should not be used in the general case.
The syntax for non-recursive imports is import <module_name> @norecurse;
For example, importing only "mylib" into "my_code" without importing "submod":
module mylib;
import std::io;
fn void only_want_this()
{
io::printn("only_want_this");
}
module mylib::submod;
import std::io;
fn void undesired_fn()
{
io::printn("undesired_fn");
}
module my_code;
// Using Non-recursive import undesired_fn not found
import mylib @norecurse;
// Using Recursive import undesired_fn is found
// import mylib;
fn void main()
{
mylib::only_want_this();
submod::undesired_fn(); // This should error
}
Naming
C3 introduces fairly rigid naming rules to reduce ambiguity and make the language easy to parse for tools.
As a basic rule, all identifiers are limited to a-z, A-Z, 0-9 and _. The initial character can not be a number. Furthermore, all identifiers are limited to 127 characters.
Module sub-paths are limited to 31 characters, and a full module path must be 63 characters or less.
Structs, unions, enums, typedefs and aliases¶
All user-defined types must start with A-Z after any optional initial _ and include at least 1 lower case letter. Bar, _T_i12 and TTi are all valid names. _1, bAR and BAR are not. For C-compatibility it's possible to alias the type to an external name using the attribute "cname".
struct Foo @cname("foo")
{
int x;
Bar bar;
}
union Bar
{
int i;
double d;
}
enum Baz
{
VALUE_1,
VALUE_2
}
Variables and parameters¶
All variables and parameters except for global constant variables must start with a-z after any optional initial _. ___a, fooBar and _test_ are all valid variable / parameter names. _, _Bar, X are not.
Global constants¶
Global constants must start with A-Z after any optional initial _. _FOO2, BAR_FOO, X are all valid global constants, _, _bar, x are not.
Enum members / Faults¶
enum members and faults defined with faultdef follow the same naming standard as global constants.
Struct / union members¶
Struct and union members follow the same naming rules as variables.
Modules¶
Module names may contain a-z, 0-9 and _, no upper case characters are allowed.
Functions and macros¶
Functions and macros must start with a-z after any optional initial _.
C3 recommended code style¶
While C3 doesn't mandate a particular style of naming, the standard library nonetheless uses naming conventions which are recommended for official bindings and standard library contributions:
alias MyInt = int; // Types use PascalCase
struct SomeStructType
{
int a_field; // Members use snake_case
double foo_baz;
}
int some_global = 1; // Globals use snake_case
fn void some_function(int a_param) // Functions and parameters use snake_case
{
int foo_bar = 4; // Locals use snake_case
}
// Methods use snake_case, and the first parameter is usually called "self"
fn void SomeStructType.call_me(self, int a)
{
some_function(self.a_field + a);
}
// Macros use snake_case
macro @some_macro(a)
{
return a + a;
}
const MY_FOO = 123; // Constants use SCREAMING_SNAKE_CASE
So in short:
1. Types use PascalCase
2. Constants use SCREAMING_SNAKE_CASE
3. Everything else uses snake_case
Brace style is often a controversial topic. The C3 standard library uses Allman brace style:
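As an illustration (the function is hypothetical), Allman style puts every opening brace on its own line:

```c3
fn int clamp_allman(int x)
{
    if (x < 0)
    {
        return 0;
    }
    return x;
}
```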
For canonical C3 code outside of the stdlib and vendor (the official binding repository), prefer either Allman or K&R:
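As an illustration (the function is hypothetical), the K&R variant is sketched here with the opening brace on the same line as the declaration or statement:

```c3
fn int clamp_knr(int x) {
    if (x < 0) {
        return 0;
    }
    return x;
}
```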
Regarding tab-vs-spaces, contributions to the C3 stdlib or vendor should use tabs for indentation and spaces for formatting.
Comments
C3 has four distinct comment types:
- The normal // single line comment.
- The classic /* ... */ multi-line C style comment, but unlike in C they are allowed to nest.
- Documentation comments <* ... *>: the text within these comments will be parsed as documentation and optional Contracts on the following code.
- Shebang comment #!, which works like a single line comment, but is only valid as the first two characters in a file.
Doc contracts¶
Documentation contracts start with <* and must be terminated using *>.
Any initial text up until the first @-directive on a new line will be interpreted as
free text documentation.
For example:
<*
Here are some docs.
@param num_foo : `The number of foos.`
@require num_foo > 4
@require num_foo <= 100 : "Prevent too many foos."
@deprecated
@mycustom "2"
*>
fn void bar(int num_foo)
{
io::printfn("%d", num_foo);
}
Doc Contracts Are Parsed¶
The following was extracted:
- The function description: "Here are some docs."
- The num_foo parameter has the description: "The number of foos".
- A Contract annotation for the compiler: @require num_foo > 4 which tells the compiler and a user of the function that a precondition is that num_foo must be greater than 4.
- A second contract annotation with the description: "Prevent too many foos".
- A function Attribute marking it as @deprecated, which displays warnings.
- A custom function Attribute @mycustom. The compiler is free to silently ignore custom Attributes. They can be used to optionally emit warnings, but are otherwise ignored.
Available annotations¶
| Name | Format |
|---|---|
| @param | @param [<ref>] <param> [ : <description>] |
| @return | @return <description> |
| @return? | @return? [<func>!], [<fault1>, <fault2>, ..., [: <description>]] |
| @require | @require <expr1>, <expr2>, ..., [: <description>] |
| @ensure | @ensure <expr1>, <expr2>, ..., [: <description>] |
| @deprecated | @deprecated [<description>] |
| @pure | @pure |
Fault inheritance¶
It is possible to reference the faults of another function or macro by using the syntax @return? some_func!. This will include all faults returned by some_func. This can be combined with other faults.
<*
@return? check_triangle!, io::EOF
*>
fn TriangleKind? get_triangle_kind(Triangle* triangle)
{
check_triangle(triangle)!;
// ...
}
See Contracts for information regarding @require, @ensure, @const, @pure.
*[<ref>] is an optional mutability description e.g. [&in]
*[<description>] denotes that a description is optional.
Language Common
Arrays
Arrays have a central role in programming. C3 offers built-in arrays, slices and vectors. The standard library enhances this further with dynamically sized arrays and other collections.
Fixed Size 1D Arrays¶
These are declared as <type>[<size>], e.g. int[4]. Fixed arrays are treated as values and will be copied if given as parameter. Unlike C, the number is part of its type. Taking a pointer to a fixed array will create a pointer to a fixed array, e.g. int[4]*.
Unlike C, fixed arrays do not decay into pointers. Instead, an int[4]* may be implicitly converted into an int*.
// C
int foo(int *a) { ... }
int x[3] = { 1, 2, 3 };
foo(x);
// C3
fn int foo(int* a) { ... }
int[3] x = { 1, 2, 3 };
foo(&x);
When you want to initialize a fixed array without specifying the size, use the [*] array syntax:
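The sample for this syntax appears to be missing. A minimal sketch, using a made-up function name, could be:

```c3
fn int inferred_len()
{
    int[3] a = { 1, 2, 3 };   // size written out explicitly
    int[*] b = { 4, 5, 6 };   // size inferred from the initializer: int[3]
    int len = b.len;          // 3
    return len;
}
```

The inferred size still becomes part of the type, so b above is an ordinary int[3].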
You can get the length of an array using the .len property:
int len1 = int[4].len; // 4
int[3] a = { 1, 2, 3 };
int len2 = a.len; // 3
int[*] b = { 1, 2 };
int len3 = b.len; // 2
Indexing into pointers of arrays¶
A source of confusion going from C to C3 is that indexing into, for example, a pointer int[3]* would yield an int[3], rather than an int.
To get the integer inside of the array that is pointed to, we need to do a dereference:
int[3] a = { 1, 2, 3 };
int[3]* b = &a;
int x = (*b)[1]; // Correctly returns 2
// Broken: int x = b[1]
A convenient shorthand for (*b)[1] is to use implicit subscript dereference: b.[1]. Here the . is only doing a dereference if the variable
is a pointer. So given the example above we have:
a[1]; // Returns 2
a.[1]; // Returns 2
b[1]; // BROKEN! Out of bounds access
(*b)[1]; // Returns 2
b.[1]; // Returns 2
This feature is mainly useful in generic modules and macros.
Slice¶
The final type is the slice <type>[] e.g. int[]. A slice is a view into either a fixed or variable array. Internally it is represented as a struct containing a pointer and a size. Both fixed and variable arrays may be converted into slices, and slices may be implicitly converted to pointers.
fn void test()
{
int[4] arr = { 1, 2, 3, 4 };
int[4]* ptr = &arr;
// Assignments to slices
int[] slice1 = &arr; // Implicit conversion
int[] slice2 = ptr; // Implicit conversion
// Assignments from slices
int[] slice3 = slice1; // Assign slices from other slices
int* int_ptr = slice1; // Assign from slice
int[4]* arr_ptr = (int[4]*)slice1; // Cast from slice
}
Slicing Arrays¶
It's possible to use the range syntax to create slices from pointers, arrays, and other slices.
This is written arr[<start-index> .. <end-index>], where end-index is inclusive.
fn void test()
{
int[5] a = { 1, 20, 50, 100, 200 };
int[] b = a[0 .. 4]; // The whole array as a slice.
int[] c = a[2 .. 3]; // { 50, 100 }
}
You can also use arr[<start-index> : <slice-length>]
fn void test()
{
int[5] a = { 1, 20, 50, 100, 200 };
int[] b2 = a[0 : 5]; // { 1, 20, 50, 100, 200 } start-index 0, slice-length 5
int[] c2 = a[2 : 2]; // { 50, 100 } start-index 2, slice-length 2
}
It’s possible to omit the first and last indices of a range:
- arr[..<end-index>] Omitting the start index will default it to 0
- arr[<start-index>..] Omitting the end index will assign it to arr.len-1 (this is not allowed on pointers)
Equivalently, with the index offset form arr[:<slice-length>], you can omit the start-index.
The following are all equivalent and slice the whole array:
fn void test()
{
int[5] a = { 1, 20, 50, 100, 200 };
int[] b = a[0 .. 4];
int[] c = a[..4];
int[] d = a[0..];
int[] e = a[..];
int[] f = a[0 : 5];
int[] g = a[:5];
}
You can also slice in reverse from the end with ^i, where the index is len-i. For example:
- ^1 means len-1
- ^2 means len-2
- ^3 means len-3
Again, this is not allowed for pointers since the length is unknown.
fn void test()
{
int[5] a = { 1, 20, 50, 100, 200 };
int[] b1 = a[1 .. ^1]; // { 20, 50, 100, 200 } a[1 .. (a.len-1)]
int[] b2 = a[1 .. ^2]; // { 20, 50, 100 } a[1 .. (a.len-2)]
int[] b3 = a[1 .. ^3]; // { 20, 50 } a[1 .. (a.len-3)]
int[] c1 = a[^1..]; // { 200 } a[(a.len-1)..]
int[] c2 = a[^2..]; // { 100, 200 } a[(a.len-2)..]
int[] c3 = a[^3..]; // { 50, 100, 200 } a[(a.len-3)..]
int[] d = a[^3 : 2]; // { 50, 100 } a[(a.len-3) : 2]
// Slicing a whole array: '..' takes an inclusive end index, ':' takes a length
int[] e = a[0 .. ^1]; // a[0 .. a.len-1]
int[] f = a[0 : ^0]; // a[0 : a.len]
}
One may also assign to slices:
Or copy slices to slices:
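The samples for both operations appear to be missing. As a sketch (the function name and values are illustrative only), assigning a single value to a range and copying one range into another could look like this:

```c3
fn int[5] slice_assign_demo()
{
    int[5] a = { 1, 20, 50, 100, 200 };
    int[5] b = { 2, 4, 6, 8, 10 };
    // Assign one value to every element in the range
    a[1..2] = 0;          // a = { 1, 0, 0, 100, 200 }
    // Copy between two separate arrays; both ranges have length 2
    a[3..4] = b[0..1];    // a = { 1, 0, 0, 2, 4 }
    return a;
}
```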
Copying between two overlapping ranges, e.g. a[1..2] = a[0..1] is unspecified behaviour.
Conversion List¶
Rows give the target type and columns the source type.

| | int[4] | int[] | int[4]* | int* |
|---|---|---|---|---|
| int[4] | copy | - | - | - |
| int[] | - | assign | assign | - |
| int[4]* | - | cast | assign | cast |
| int* | - | assign | assign | assign |
Note that all casts above are inherently unsafe and will only work if the type cast is indeed compatible.
For example:
int[4] a;
int[4]* b = &a;
int* c = b;
// Safe cast:
int[4]* d = (int[4]*)c;
int e = 12;
int* f = &e;
// Incorrect, but not checked
int[4]* g = (int[4]*)f;
// Also incorrect but not checked.
int[] h = f[0..2];
Internals¶
Internally the layout of a slice is guaranteed to be struct { <type>* ptr; sz len; }.
There is a built-in struct std::core::runtime::SliceRaw which
has the exact data layout of the fat array pointers. It is defined to be
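The definition itself is missing here. Going only by the layout stated above, it presumably resembles the following sketch. The field names and the untyped pointer are assumptions, so check std::core::runtime in your compiler version for the real declaration:

```c3
// Assumed shape, inferred from the documented slice layout:
// a pointer followed by a length.
struct SliceRaw
{
    void* ptr;
    sz len;
}
```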
Dynamically allocated slices¶
The standard library provides utilities for allocating multiple elements into a slice:
// uses calloc under the hood (memory is zeroed out)
int[] arr1 = mem::new_array(int, 10);
defer mem::free(arr1);
// uses malloc under the hood (memory is undefined)
int[] arr2 = mem::alloc_array(int, 10);
defer mem::free(arr2);
Iteration Over Arrays¶
foreach element by copy¶
You may iterate over slices, arrays and vectors using foreach (Type x : array).
Using compile-time type inference this can be abbreviated
to foreach (x : array) for example:
fn void test()
{
int[4] arr = { 1, 2, 3, 5 };
foreach (item : arr)
{
io::printfn("item: %s", item);
}
// Or equivalently, writing the type:
foreach (int x : arr)
{
/* ... */
}
}
foreach element by reference¶
Using & it is possible to get an element by reference rather than by copy.
Providing two variables to foreach, the first is assumed to be the index and the second the value:
fn void test()
{
int[4] arr = { };
foreach (idx, &item : arr)
{
*item = 7 + (int)idx; // Mutates the array element
// index is sz when not specified, requiring an explicit
// cast on platforms where sz is larger than int.
}
// Or equivalently, writing the types
foreach (int idx, int* &item : arr)
{
*item = 7 + idx; // Mutates the array element
}
}
foreach_r reverse iterating¶
With foreach_r, arrays or slices can be iterated over in reverse order:
fn void test()
{
float[2] arr = { 1.0, 2.0 };
foreach_r (idx, item : arr)
{
// Prints 2.0, 1.0
io::printfn("item: %s", item);
}
// Or equivalently, writing the types
foreach_r (int idx, float item : arr)
{
// Prints 2.0, 1.0
io::printfn("item: %s", item);
}
}
Iteration Over Array-Like types¶
It is possible to enable foreach on any custom type
by implementing .len and [] methods and annotating them using the @operator attribute:
struct DynamicArray
{
sz count;
sz capacity;
int* elements;
}
macro int DynamicArray.get(DynamicArray* arr, sz element) @operator([])
{
return arr.elements[element];
}
macro sz DynamicArray.count(DynamicArray* arr) @operator(len)
{
return arr.count;
}
fn void DynamicArray.push(DynamicArray* arr, int value)
{
arr.ensure_capacity(arr.count + 1); // Function not shown in example.
arr.elements[arr.count++] = value;
}
fn void test()
{
DynamicArray v;
v.push(3);
v.push(7);
// Will print 3 and 7
foreach (int i : v)
{
io::printfn("%d", i);
}
}
For more information, see operator overloading.
Dynamic Arrays and Lists¶
The standard library offers dynamic arrays and other collections in the std::collections module.
alias ListStr = List {String};
fn void test()
{
ListStr list_str;
// Initialize the list on the heap.
list_str.init(mem);
list_str.push("Hello"); // Add the string "Hello"
list_str.push("World");
foreach (str : list_str)
{
io::printn(str); // Prints "Hello", then "World"
}
String str = list_str[1]; // str == "World"
list_str.free(); // Free all memory associated with list.
}
Fixed Size Multi-Dimensional Arrays¶
Declare two-dimensional fixed arrays as <type>[<inner-size>][<outer-size>] arr, like int[4][2] arr. Below you can see how this compares to C:
// C
// Uses: name[<outer-size>][<inner-size>]
int array_in_c[4][2] = {
{1, 2},
{3, 4},
{5, 6},
{7, 8},
};
// C3
// Uses: <type>[<inner-size>][<outer-size>]
// C3 declares the dimensions, inner-most to outer-most
int[4][2] array = {
{1, 2, 3, 4},
{5, 6, 7, 8},
};
// To match C we must invert the order of the dimensions
int[2][4] array = {
{1, 2},
{3, 4},
{5, 6},
{7, 8},
};
// C3 also supports Irregular arrays, for example:
int[][4] array = {
{ 1 },
{ 2, 3 },
{ 4, 5, 6 },
{ 7, 8, 9, 10 },
};
Note
Accessing the multi-dimensional fixed array has inverted array index order compared to when the array was declared.
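To make the inverted index order concrete, here is a small sketch (the function name is made up):

```c3
fn int last_element()
{
    // Declared as <type>[<inner-size>][<outer-size>]: 2 rows of 4 ints
    int[4][2] array = {
        { 1, 2, 3, 4 },
        { 5, 6, 7, 8 },
    };
    // Indexed outer-most first: row 1 (of 2), then element 3 (of 4)
    return array[1][3];   // 8
}
```

The first index selects one of the two int[4] rows, and the second index selects an element inside that row.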
Strings
In C3, multiple string types are available, each suited to different use cases.
String¶
Strings are usually the typical type to use; they can be sliced, compared, etc.
It is possible to access the length of a String instance through the .len property.
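A short sketch of these operations, assuming only the slicing syntax described in the Arrays section and == for comparing contents:

```c3
import std::io;

fn void main()
{
    String s = "Hello, World";
    io::printfn("%d", s.len);      // 12: the length in bytes
    String hello = s[0..4];        // inclusive range: "Hello"
    String world = s[7..];         // from index 7 to the end: "World"
    if (hello == "Hello")          // == compares the contents
    {
        io::printfn("%s and %s", hello, world);
    }
}
```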
ZString¶
ZString is used when working with C code, which expects null-terminated C-style strings of type char*.
It is a typedef so converting to a ZString requires an explicit cast. This helps to remind the user to check there is appropriate \0 termination of the string data.
Caution
Ensure the terminal \0 when converting from String to ZString.
WString¶
The WString type is similar to ZString but uses Char16*, typically for UTF-16 encoded strings. This type is useful for applications where 16-bit character encoding is required.
DString¶
DString is a dynamic string builder that supports various string operations at runtime, allowing for flexible manipulation without the need for manual memory allocation.
Enums
Enums¶
Enums use the following syntax:
enum State : int
{
WAITING,
RUNNING,
TERMINATED
}
// Access enum values via:
State current_state = WAITING; // or '= State.WAITING'
Where the type cannot be inferred, access requires qualifying the value with the enum's name, as in State.WAITING, because
an enum like State is a separate namespace by default, just like C++'s enum class.
Standard enums are always backed by an ordinal value running from zero and up, without any gaps. For enums with non-consecutive values, see constdef. To create enums that implement a bit-mask, you can also consider using bitstructs.
Enum associated values¶
It is possible to associate each enum value with one or more static values.
enum State : int (String description)
{
WAITING { "waiting" },
RUNNING { "running" },
TERMINATED { "ended" },
}
fn void main()
{
State process = State.RUNNING;
io::printfn("%s", process.description);
}
Multiple static values can be associated with an enum value, for example:
struct Position
{
int x;
int y;
}
enum State : int (String desc, bool active, Position pos)
{
WAITING { "waiting", false, { 1, 2} },
RUNNING { "running", true, {12,22} },
TERMINATED { "ended", false, { 0, 0} },
}
fn void main()
{
State process = RUNNING;
if (process.active)
{
io::printfn("Process is: %s", process.desc);
io::printfn("Position x: %d", process.pos.x);
}
}
Enum type inference¶
When an enum is used where the type can be inferred, like in switch case-clauses or in variable assignment, the enum name is not required:
State process = WAITING; // State.WAITING is inferred.
switch (process)
{
case RUNNING: // State.RUNNING is inferred
io::printfn("Position x: %d", process.pos.x);
default:
io::printfn("Process is: %s", process.desc);
}
fn void test(State s) { ... }
test(RUNNING); // State.RUNNING is inferred
If the enum without its name matches with a global in the same scope, it needs the enum name to be added as a qualifier, for example:
module test;
// Global variable
// ❌ Don't do this!
const State RUNNING = State.TERMINATED;
test(RUNNING); // Ambiguous
test(test::RUNNING); // Uses global variable.
test(State.RUNNING); // Uses enum constant.
Enum to and from ordinal¶
You can convert an enum to its ordinal with .ordinal, and convert it
back with EnumName.from_ordinal(...):
fn void store_enum(State s)
{
write_int_to_file(s.ordinal);
}
fn State read_enum()
{
return State.from_ordinal(read_int_from_file());
}
However, a plain cast is also valid:
Enum types have the following additional properties in addition to the usual properties for user-defined types:
- members returns a list of member references, similar to struct's members.
- inner returns the type of the ordinal as a typeid.
- lookup_field(field_name, value) lookup an enum by associated value.
- names returns a list containing the names of all enums.
- from_ordinal(value) convert an integer to an enum.
- values return a list containing all the enum values of an enum.
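As a sketch of how some of these properties might be used, the following iterates over names and values and converts an ordinal back to an enum. It relies only on the properties listed above:

```c3
import std::io;

enum State : int
{
    WAITING,
    RUNNING,
    TERMINATED
}

fn void main()
{
    // names: the member names as strings
    foreach (name : State.names)
    {
        io::printn(name);              // WAITING, RUNNING, TERMINATED
    }
    // values: every member of the enum, in ordinal order
    foreach (State s : State.values)
    {
        io::printfn("%d", s.ordinal);  // 0, 1, 2
    }
    // from_ordinal: ordinal 1 is the second member
    State second = State.from_ordinal(1);
    if (second == State.RUNNING) io::printn("ordinal 1 is RUNNING");
}
```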
Constdef¶
When interfacing with C code, you may encounter enums that are not sequential. For situations like this, you can use a constdef in C3:
extern fn KeyCode get_key_code();
constdef KeyCode
{
UNKNOWN = 0,
RETURN = 13,
ESCAPE = 27,
BACKSPACE = 8,
TAB = 9,
SPACE = 32,
EXCLAIM, // automatically incremented to 33
QUOTEDBL,
HASH,
}
fn void main()
{
int a = (int)KeyCode.SPACE; // assigns 32 to a
// constdef behave like typedef and will not enforce
// that every value has been declared beforehand
KeyCode b = (KeyCode)2;
// can safely interact with a C function that returns the same enum
KeyCode key = get_key_code();
// Use as cast to convert from the underlying type.
KeyCode conv = (KeyCode)a;
}
Inline constdef and @constinit¶
If you need a constdef to be converted to its assigned value without using a cast, inline can be used:
constdef ConstInline : inline String
{
A = "Hello",
B = "World",
}
fn void main()
{
// implicitly converted to string due to inline
String a = ConstInline.A;
ConstInline b = B;
String b_str = b;
io::printfn("%s, %s!", a, b_str); // Prints "Hello, World!"
}
We can use @constinit to allow the constdef to implicitly convert from a literal:
constdef ConstInline2 : String @constinit
{
A = "Hello",
B = "World",
}
fn void main()
{
ConstInline2 a = "Bye";
}
These conversion rules are the same as for typedef.
Structs and unions
Structs¶
Structs are always named:
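The declaration that belongs here appears to be missing. Judging by how Person is used in the rest of this section, it has at least these two members:

```c3
struct Person
{
    char age;
    String name;
}
```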
A struct's members may be accessed using dot notation, even for pointers to structs.
fn void test()
{
Person p;
p.age = 21;
p.name = "John Doe";
io::printfn("%s is %d years old.", p.name, p.age);
Person* p_ptr = &p;
p_ptr.age = 20; // Ok!
io::printfn("%s is %d years old.", p_ptr.name, p_ptr.age);
}
(One might wonder whether it's possible to take a Person** and use dot access. It's not allowed; only one level of dereference is done.)
To change alignment and packing, attributes such as @packed may be used.
Initializing structs¶
Structs are typically initialized with an initializer list, which is a list of arguments inside of { }. For example, we can initialize our Person struct above like this:
But we can also use so-called designated initialization, where the explicit names of the members are assigned to, with a leading .:
With designated initializers we do not need to initialize all fields. The rest of the fields will automatically be zeroed out:
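The samples for the three forms described above appear to be missing. A minimal sketch, with Person repeated so the snippet stands alone and a made-up function name:

```c3
struct Person
{
    char age;
    String name;
}

fn char default_age()
{
    // Initializer list: values in declaration order
    Person p1 = { 21, "John Doe" };
    // Designated initialization: members named explicitly, in any order
    Person p2 = { .name = "John Doe", .age = 21 };
    // Partial designated initialization: age is zeroed
    Person p3 = { .name = "John Doe" };
    return p3.age;   // 0
}
```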
If a type contains members which in turn are structs or unions or arrays, then their members may be initialized using repeated .name syntax:
struct Test
{
Person owner;
Person subscriber;
}
Test t = { .owner = { 21, "John Doe" }, .subscriber.age = 42, .subscriber.name = "Test Person" };
Struct initializer splatting¶
It's possible to use the ... operator together with designated initializers to provide defaults that are overwritten by later assignments:
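The sample is missing here. As a hedged sketch of what the text describes, assuming the splat is written as ... followed by the source value at the start of the initializer (verify the exact form against your compiler version):

```c3
struct Person
{
    char age;
    String name;
}

fn Person make_person()
{
    Person defaults = { .age = 18, .name = "Unknown" };
    // Start from "defaults", then override only the name
    Person p = { ...defaults, .name = "Jane Doe" };
    return p;   // p.age is 18, p.name is "Jane Doe"
}
```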
Struct subtyping¶
C3 allows creating struct subtypes using inline:
struct ImportantPerson
{
inline Person person;
String title;
}
fn void print_person(Person p)
{
io::printfn("%s is %d years old.", p.name, p.age);
}
fn void test()
{
ImportantPerson important_person;
important_person.age = 25;
important_person.name = "Jane Doe";
important_person.title = "Rockstar";
// Only the first part of the struct is copied.
print_person(important_person);
}
Union types¶
Union types are defined just like structs and are fully compatible with C.
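The declaration of the Integral union used in this section appears to be missing. A plausible reconstruction is sketched below: as_byte and as_int come from the example code, while as_long is an assumption based on the remark that the union has the same size as long.

```c3
union Integral
{
    char as_byte;
    int as_int;
    long as_long;   // assumed member, implied by the size remark
}
```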
As usual, unions are used to hold one of many possible values:
fn void test()
{
Integral i;
i.as_byte = 40; // Setting the active member to as_byte
i.as_int = 500; // Changing the active member to as_int
// Undefined behaviour: as_byte is not the active member,
// so this will probably print garbage.
io::printfn("%d\n", i.as_byte);
}
Note that unions only take up as much space as their largest member, so Integral::size is equivalent to long::size.
Nested sub-structs / unions¶
Just like in C99 and later, nested anonymous sub-structs / unions are allowed. Note that the placement of struct / union names is different to match the difference in declaration.
struct Person
{
char age;
String name;
union
{
int employee_nr;
uint other_nr;
}
union subname
{
bool b;
Callback cb;
}
}
Union and structs type properties¶
Structs and unions also support the members property, which returns a list of struct/union members.
Bitstructs
Bitstructs¶
Bitstructs allow storing fields in a specific bit layout. A bitstruct may only contain integer types and booleans, in most other respects it works like a struct.
The main difference is that the bitstruct has a backing type and each field has a specific bit range. In addition, it's not possible to take the address of a bitstruct field.
bitstruct Foo : char
{
int a : 0..2;
int b : 4..6;
bool c : 7;
}
fn void test()
{
Foo f;
f.a = 2;
io::printfn("%d", (char)f); // prints 2
f.b = 1;
io::printfn("%d", (char)f); // prints 18
f.c = true;
io::printfn("%d", (char)f); // prints 146
// Normal designated initializers are supported
f = { .a = 1, .b = 3, .c = false };
// As a special case, boolean fields may drop
// the initializer value, this implicitly sets them
// to true. Below the '.c' is the same as '.c = true'
f = { .a = 2, .b = 2, .c };
}
Bitstruct endianness¶
The bitstruct will follow the endianness of the underlying type:
bitstruct Test : uint
{
ushort a : 0..15;
ushort b : 16..31;
}
fn void test()
{
Test t;
t.a = 0xABCD;
t.b = 0x789A;
char* c = (char*)&t;
// Prints 789AABCD
io::printfn("%X", (uint)t);
for (int i = 0; i < 4; i++)
{
// Prints CDAB9A78
io::printf("%X", c[i]);
}
io::printn();
}
It is, however, possible to pick a different endianness, in which case the entire representation will internally assume big-endian layout:
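The sample is missing here. Presumably it is the same bitstruct with the @bigendian attribute added, along these lines:

```c3
bitstruct Test : uint @bigendian
{
    ushort a : 0..15;
    ushort b : 16..31;
}
```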
In this case the same example yields CDAB9A78 and 789AABCD respectively.
Bitstruct backing types¶
Bitstruct backing types may be integers or char arrays. The difference in layout is somewhat subtle:
bitstruct Test1 : char[4]
{
ushort a : 0..15;
ushort b : 16..31;
}
bitstruct Test2 : char[4] @bigendian
{
ushort a : 0..15;
ushort b : 16..31;
}
fn void test()
{
Test1 t1;
Test2 t2;
t1.a = t2.a = 0xABCD;
t1.b = t2.b = 0x789A;
char* c = (char*)&t1;
for (int i = 0; i < 4; i++)
{
// Prints CDAB9A78 on x86
io::printf("%X", c[i]);
}
io::printn();
c = (char*)&t2;
for (int i = 0; i < 4; i++)
{
// Prints ABCD789A
io::printf("%X", c[i]);
}
io::printn();
}
Bitstructs with overlapping fields¶
Bitstructs can be made to have overlapping bit fields. This is useful when modeling a layout which has multiple different layouts depending on flag bits:
bitstruct Foo : char @overlap
{
int a : 2..5;
// "b" is valid due to the @overlap attribute
int b : 1..3;
}
Boolean-only bitstructs¶
When a bitstruct consists of only bool fields, the bit position may be dropped, and the bit position is inferred:
// The following produce exactly the same layout:
bitstruct Explicit : int
{
bool a : 0;
bool b : 1;
bool c : 2;
}
bitstruct Implicit : int
{
bool a;
bool b;
bool c;
}
Bitstructs as bit masks¶
It is possible to use bitstructs to implement bitmasks without using the explicit masking values, see the following example:
constdef BitMaskEnum : uint
{
ABC = 1 << 0,
DEF = 1 << 1,
ACTIVE = 1 << 5,
}
bitstruct BitMask : uint
{
bool abc : 0;
bool def : 1;
bool active: 5;
}
fn void test()
{
// Classic bit mask:
BitMaskEnum foo = BitMaskEnum.ABC | BitMaskEnum.DEF;
BitMaskEnum bar = BitMaskEnum.ACTIVE | BitMaskEnum.ABC;
BitMaskEnum baz = foo & bar;
if (baz & BitMaskEnum.ACTIVE) { ... }
// Using a bitstruct
BitMask a = { .abc, .def }; // Just .abc is the same as .abc = true
BitMask b = { .active, .abc };
BitMask c = a & b;
if (c.active) { ... }
assert((uint)b == (uint)bar, "Layout is the same");
}
Bitstruct type properties¶
Bitstructs also support:
- members - Return a list of all bitstruct members.
- inner - Return the type of the bitstruct "container" type.
Vectors
Vectors are, where possible, based on underlying hardware vector implementations. A vector is similar to an array, but with additional functionality. The restriction is that a vector may only consist of elements that are numerical types, boolean or pointers.
A vector is declared similar to an array but uses [<>] rather than [], e.g. int[<4>].
(If you are searching for the counterpart of C++'s std::vector, look instead at the standard
library List type.)
Arithmetics on vectors¶
Vectors support all arithmetics and other operations supported by their underlying type. The operations are always performed elementwise.
For integer and boolean types, bit operations such as ^ | & << >> are available, and for pointers, pointer arithmetic
is supported.
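A small sketch of elementwise operations on integer vectors (the function name and values are illustrative only):

```c3
fn int[<2>] vector_math()
{
    int[<2>] a = { 23, 11 };
    int[<2>] b = { 2, 1 };
    int[<2>] sum = a + b;     // { 25, 12 }
    int[<2>] bits = a & b;    // { 2, 1 }, since 23 & 2 == 2 and 11 & 1 == 1
    int[<2>] prod = a * b;    // { 46, 11 }
    return prod;
}
```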
Scalar values¶
Scalar values will implicitly widen to vectors when used with vectors:
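The sample is missing here. A minimal sketch of the widening, with a made-up function name:

```c3
fn int[<2>] scalar_widening()
{
    int[<2>] d = { 21, 14 };
    // The scalar 7 is treated as { 7, 7 } for the division
    int[<2>] e = d / 7;   // { 3, 2 }
    return e;
}
```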
Additional operations¶
The std::math module contains a wealth of additional operations available on vectors using dot-method syntax.
- .sum() - sum all vector elements.
- .product() - multiply all vector elements.
- .max() - get the maximum element.
- .min() - get the minimum element.
- .dot(other) - return the dot product with the other vector.
- .length() - return the square root of the dot product (not available on integer vectors).
- .distance(other) - return the length of the difference of the two vectors (not available on integer vectors).
- .normalize() - return a normalized vector (not available on integer vectors).
- .lerp(other, t) - linearly interpolate toward other by t.
- .reflect(other) - reflect vector about other (assumes other is normalized).
- .comp_lt(other) - return a boolean vector with a component wise "<".
- .comp_le(other) - return a boolean vector with a component wise "<=".
- .comp_eq(other) - return a boolean vector with a component wise "==".
- .comp_gt(other) - return a boolean vector with a component wise ">".
- .comp_ge(other) - return a boolean vector with a component wise ">=".
- .comp_ne(other) - return a boolean vector with a component wise "!=".
Dot methods available for scalar values, such as ceil, fma, etc., are in general also available for vectors.
Swizzling¶
Swizzling using dot notation is supported, using x, y, z, w or r, g, b, a:
int[<3>] a = { 11, 22, 33 };
int[<4>] b = a.xxzx; // b = { 11, 11, 33, 11 }
int c = b.w; // c = 11;
char[<4>] color = { 0x11, 0x22, 0x33, 0xFF };
char red = color.r; // red = 0x11
b.xy = b.zw;
color.rg += { 1, 2 };
Array-like operations¶
Like arrays, it's possible to make slices and iterate over vectors.
Note
The storage alignment of vectors is often different from that of arrays, which should be taken into account when storing vectors.
Memory Management
Like in C, memory is manually managed in C3. An object can either be passed as a value on the stack, or it can be separately allocated.
fn void test()
{
int a = 12; // This variable is allocated on the stack.
int b = a; // This copies the value from a to the stack variable b.
int[2] c = { 1, 2 };
int[2] d = c; // In C3 arrays are values and are copied by value.
io::printn(d); // Prints "{ 1, 2 }"
c[0] = 10;
io::printn(c); // Prints "{ 10, 2 }"
io::printn(d); // Prints "{ 1, 2 }"
}
Allocating on the heap¶
The problem with stack allocations is that the length and sizes must be known up front. Imagine if we wanted to create an array with n number of entries and return that as a slice.
A first attempt might be:
const MAX_NUMBER = 100;
<* @require n >= 0 && n <= MAX_NUMBER *>
fn int[] create_array(int n)
{
int[MAX_NUMBER] arr;
for (int i = 0; i < n; i++)
{
arr[i] = i;
}
return arr[:n]; // Error: returns a pointer to a stack allocated variable
}
Aside from the problem with having a MAX_NUMBER, we can't return a pointer to this array, even as a slice, because the memory where arr is stored is released when the call to create_array returns.
The normal solution here is to allocate memory on the heap instead. The code might look like this:
<* @require n >= 0 *>
fn int[] create_array(int n)
{
int* arr = malloc(n * int::size);
for (int i = 0; i < n; i++)
{
arr[i] = i;
}
return arr[:n]; // Turn the pointer into a slice with length "n"
}
This allocates enough memory to hold n ints, and returns the result.
The downside is that we must make sure that we release the memory back when we're done:
fn void test()
{
int[] array = create_array(3);
do_things(array);
free(array); // Release memory back to the OS
}
Note
There are convenience functions in the standard library to allocate arrays on the heap. Use mem::new_array(int, n) - zero initialized - or mem::alloc_array(int, n) - not initialized - rather than malloc directly.
Temporary allocations¶
Having to clean up heap allocations is not always convenient. For example, what if we wanted to do this:
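The sample is missing here. Based on the discussion that follows, it was presumably a call along these lines, reusing create_array and do_things from the surrounding text (neither is defined in this snippet):

```c3
fn void test()
{
    // The slice returned by create_array is never stored,
    // so nothing is left to pass to free() afterwards.
    do_things(create_array(3));
}
```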
In this example do_things would need to release the data, or we leak memory. But we're just using this temporarily – we always just create it and then delete it. Isn't there any simpler way?
In C3, the solution is using the temporary allocator. Allocation with the temporary allocator is just like with the heap allocator, but it uses the @pool macro to free all temporary allocations made within its scope, including those made deeper down in the call tree:
fn void some_function()
{
@pool()
{
do_calculations();
};
// All temporary allocations inside of do_calculations
// and deeper down are freed when exiting the `@pool` scope.
}
To allocate we use tmalloc, which works the same as malloc, but uses the temporary allocator.
<* @require n >= 0 *>
fn int[] create_temp_array(int n)
{
int* arr = tmalloc(n * int::size);
for (int i = 0; i < n; i++)
{
arr[i] = i;
}
return arr[:n];
}
fn void test_temp()
{
do_things(create_temp_array(3)); // Creates a temporary array
}
fn void a_function()
{
@pool()
{
test_temp();
void* data = tmalloc(1000);
};
// All temporary memory is released when exiting `@pool()`
}
Using single line function body syntax => we can write this even more compactly as:
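The sample is missing here. As a hedged sketch, assuming the => body can be followed directly by a @pool() block, and reusing test_temp from the surrounding text, the compact form might look like this:

```c3
// Assumed form: the whole function body is a single @pool() expression
fn void a_function() => @pool()
{
    test_temp();
    void* data = tmalloc(1000);
};
```

Verify the exact syntax, including the trailing semicolon, against your compiler version.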
We can even nest @pools:
fn void nested()
{
@pool()
{
int* a = tmalloc(int::size);
*a = 123;
// Only 'a' is valid
@pool()
{
int* b = tmalloc(int::size);
*b = *a;
// Both 'b' and 'a' are valid
};
// 'b' is released, only 'a' is valid
io::printn(*a);
};
// 'a' is released
}
Temp allocator pitfalls
Because temporary allocations are released using @pool, you should never pass temporarily allocated data to other threads or store it in variables that outlive the @pool scope.
The compiler will try to detect using temporary data after free, but the ability to do so depends on whether the code is compiled with safety checks / address sanitizer or not. Support will also differ between OS and architectures.
Always make sure that temporary allocations aren't used beyond the scope of their @pool.
Functions that allocate¶
Standard library functions that allocate generally require you to pass an allocator. This allows you to use the standard heap allocator, mem, the temp allocator tmem or some other Allocator you might be using instead:
List{int} list;
list.init(mem); // "list" will use the heap allocator
list.push(1);
list.push(42);
io::printn(list); // Prints "{ 1, 42 }"
list.free(); // Free the memory in the list
If you are using mem, then in general you will need to free it in some way. Either it's built into the type, such as in the List example above, or else you will need to handle it yourself, like in this case:
String s = string::format(mem, "Hello %s", "World");
// The string "s" is allocated on the heap
io::printn(s);
// Prints "Hello World"
free(s);
// Frees the string
On the other hand, if you use the temp allocator, you only need to make sure it's wrapped in a @pool:
@pool()
{
List{int} list;
list.init(tmem); // "list" will use the temp allocator
list.push(1);
list.push(42);
io::printn(list);
String s = string::format(tmem, "Hello %s", "World");
io::printn(s);
}; // s and list are freed here, because they used temp memory
Because of the usefulness of the temp allocator idiom, there are often temp allocator versions of functions, prefixed "t" or "temp_":
@pool()
{
List{int} list;
list.tinit(); // Use the temp allocator
list.push(1);
list.push(42);
String s = string::tformat("Hello %s", "World"); // Use the temp allocator
};
Implicit initialization¶
Some types, such as List, HashMap and DString will use the temp allocator by default if they are not initialized.
@pool()
{
List{int} list;
list.push(1); // Implicitly initialize with the temp allocator
list.push(42);
DString str; // DString is a dynamic string
str.appendf("Hello %s", "World");
// The "appendf" implicitly initializes "str" with the temp allocator
str.insert_at(5, ",");
str.append("!");
io::printn(str); // Prints Hello, World!
}; // list and str are freed here
This is often useful for locals, but in the case of globals, you might want the container
to use the heap allocator by default. For most containers there is a ONHEAP constant which
allows you to statically initialize globals to use the heap allocator:
List {int} l = list::ONHEAP {int};
fn void main()
{
l.push(1); // Implicitly allocates on the heap, not the temp allocator.
}
Beyond allocating raw memory¶
In C, memory is allocated with plain malloc (uninitialized memory) and calloc (zero-initialized memory). The C3 standard library provides those, but also additional convenience functions:
new and alloc macros¶
The new and alloc macros take a type and allocate just enough memory for that value. This is often more convenient and clearer than Foo* f = malloc(Foo::size).
Foo* f = mem::new(Foo); // Returns a zero initialized pointer for a type
int* p = mem::alloc(int); // Same as 'new' but memory is uninitialized
Foo* t = mem::tnew(Foo); // Same as 'new' but using the temp allocator
new and tnew also take an optional initializer, allowing you to allocate and initialize in a single call.
There are also more specialized functions such as new_with_padding and new_aligned, the former when you need to add additional memory at the end of the allocation, and new_aligned for when you have overaligned types – typically vectors with alignment greater than 16.
new_array and alloc_array for creating arrays¶
// Returns a slice of 3 Foo elements, zero initialized
Foo[] arr = mem::new_array(Foo, 3);
// Same but memory is uninitialized
Foo[] a2 = mem::alloc_array(Foo, 3);
// Same as new_array, but using the temp allocator
Foo[] tarr = mem::temp_array(Foo, 3);
@clone¶
@clone allows you to take a value and create a pointer copy of it.
// Creates an int pointer, initialized to 33
int* x = @clone(33);
// Same as @clone but using the temp allocator
int* y = @tclone(33);
int[] z = { 1, 2 };
// This clones the elements of a slice or array, in this case "z"
int[] a = @clone_slice(z);
// Same as @clone_slice, but using the temp allocator
int[] t = @tclone_slice(z);
Optionals (Essential)
In this section we will go over the essential information about Optionals and safe methods for working with them, for example if (catch optional_value) and the Rethrow operator !.
The advanced section covers other nice-to-have features: an alternative way to safely unwrap a result from an Optional using if (try optional_value), an unsafe way to force unwrap a result with !!, returning default values with ?? when an Optional is empty, and other more specialised concepts.
What is an Optional?¶
An Optional is a safer alternative to returning -1 or null from
a function when a valid value can't be returned. An Optional
has either a result or is empty. When an Optional
is empty it has an Excuse explaining what happened.
- For example, trying to open a missing file returns the Excuse of io::FILE_NOT_FOUND.
- Optionals are declared by adding ? after the type.
- An Excuse is of type fault. The Optional Excuse is set with ~ after the value.
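The three points above can be sketched in a few lines. This is a minimal illustration only: the function half_of and the fault ODD_NUMBER are hypothetical names made up for this example and are not part of the standard library.

```c3
import std::io;

// Hypothetical fault, declared only for this example
faultdef ODD_NUMBER;

// 'int?' is an Optional int: it holds either an int result or an Excuse
fn int? half_of(int value)
{
    // Set the Excuse by adding '~' after the fault
    if (value % 2 != 0) return ODD_NUMBER~;
    return value / 2;
}

fn void main()
{
    int? result = half_of(7); // Empty, with the Excuse ODD_NUMBER
    if (catch excuse = result)
    {
        io::printfn("half_of(7) gave an Excuse: %s", excuse);
        return;
    }
    // Only reached when 'result' holds a value
    io::printfn("half_of(7) = %d", result);
}
```

Calling half_of(8) instead would skip the catch branch and reach the final print.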
🎁 Unwrapping an Optional¶
Note
Unwrapping an Optional is safe because it checks it has a result present before trying to use it.
After unwrapping, the variable then behaves like a normal variable, a non-Optional.
Checking if an Optional is empty¶
import std::io;
fn void? test()
{
// Return an Excuse by adding '~' after the fault.
return io::FILE_NOT_FOUND~;
}
fn void main(String[] args)
{
// If the Optional is empty, assign the
// Excuse to a variable:
if (catch excuse = test())
{
io::printfn("test() gave an Excuse: %s", excuse);
}
}
Automatically unwrapping an Optional result¶
If we escape the current scope from an if (catch my_var) using a return, break, continue
or Rethrow !,
then the variable is automatically unwrapped to a non-Optional:
fn void? test()
{
int? foo = unreliable_function();
if (catch excuse = foo)
{
// Return the excuse with `~` operator
return excuse~;
}
// Because the compiler knows 'foo' cannot
// be empty here, it is unwrapped to non-Optional
// 'int foo' in this scope:
io::printfn("foo: %s", foo); // 7
}
Using the Rethrow operator ! to unwrap an Optional value¶
- The Rethrow operator ! will return from the function with the Excuse if the Optional result is empty.
- The resulting value will be unwrapped to a non-Optional.
import std::io;
// Function returning an Optional
fn int? maybe_function() { /* ... */ }
fn void? test()
{
// ❌ This will be a compile error
// maybe_function() returns an Optional
// and 'bar' is not declared Optional:
// int bar = maybe_function();
int bar = maybe_function()!;
// ✅ The above is equivalent to:
// int? temp = maybe_function();
// if (catch excuse = temp) return excuse~;
// Now temp is unwrapped to a non-Optional
// int bar = temp; // ✅ This is OK
}
⚠️ Optionals affect types and control flow¶
Optionals in expressions produce Optionals¶
If you use an Optional anywhere in an expression, the resulting expression will be an Optional too.
import std::io;
fn void main(String[] args)
{
// Returns Optional with result of type `int` or an Excuse
int? first_optional = 7;
// This is Optional too:
int? second_optional = first_optional + 1;
}
Optionals affect function return types¶
import std::io;
fn int test(int input)
{
io::printn("test(): inside function body");
return input;
}
fn void main(String[] args)
{
int? optional_argument = 7;
// `optional_argument` makes returned `returned_optional`
// Optional too:
int? returned_optional = test(optional_argument);
}
Functions conditionally run when called with Optional arguments¶
When calling a function with Optionals as arguments, the result will be the first Excuse found looking left-to-right. The function is only executed if all Optional arguments have a result.
import std::io;
fn int test(int input, int input2)
{
io::printn("test(): inside function body");
return input;
}
fn void main(String[] args)
{
int? first_optional = io::FILE_NOT_FOUND~;
int? second_optional = 7;
// Return first excuse we find
int? third_optional = test(first_optional, second_optional);
if (catch excuse = third_optional)
{
// excuse == io::FILE_NOT_FOUND
io::printfn("third_optional's Excuse: %s", excuse);
}
}
Interfacing with C¶
For C the interface to C3:
- The Excuse in the Optional of type fault is returned as the regular return.
- The result in the Optional is passed by reference.
For example:
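A sketch of how the two rules above map onto a declaration. The function name get_value and the C parameter name value_ref are hypothetical, and the C prototype appears only in comments; check the header your toolchain provides for the exact definition of c3fault_t.

```c3
// C3 side: a function returning an Optional int.
// The explicit export name is an assumption made for this example.
fn int? get_value() @export("get_value")
{
    return 42;
}

// Corresponding C declaration, following the rules above:
//
//   c3fault_t get_value(int *value_ref);
//
// - the regular return value carries the Excuse
// - the int result is written through 'value_ref'
```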
The c3fault_t is guaranteed to be a pointer sized value.
Optionals (Advanced)
Optionals are only defined in certain code¶
✅ Variable declarations
✅ Function return signature
Handling an empty Optional¶
File reading example¶
- If the file is present the Optional result will be the first 100 bytes of the file.
- If the file is not present the Optional Excuse will be io::FILE_NOT_FOUND.
Try running this code below with and without a file called file_to_open.txt in the same directory.
import std::io;
<*
Function modifies 'buffer'
Returns an Optional with a 'char[]' result
OR an empty Optional with an Excuse
*>
fn char[]? read_file(String filename, char[] buffer)
{
// Return Excuse if opening a file failed, using Rethrow `!`
File file = file::open(filename, "r")!;
// At scope exit, close the file.
// Discard the Excuse from file.close() with (void) cast
defer (void)file.close();
// Return Excuse if reading failed, using Rethrow `!`
file.read(buffer)!;
return buffer; // return a buffer result
}
fn void? test_read()
{
char[] buffer = mem::new_array(char, 100);
defer free(buffer); // Free memory on scope exit
char[]? read_buffer = read_file("file_to_open.txt", buffer);
// Catch the empty Optional and assign the Excuse
// to `excuse`
if (catch excuse = read_buffer)
{
io::printfn("Excuse found: %s", excuse);
// Returning Excuse using the `~` suffix
return excuse~;
}
// `read_buffer` behaves like a normal variable here
// because the Optional being empty was handled by 'if (catch)'
// which automatically unwrapped 'read_buffer' at this point.
io::printfn("read_buffer: %s", read_buffer);
}
fn void main()
{
test_read()!!; // Panic on failure.
}
Return a default value if Optional is empty¶
The ?? operator allows us to return a default value if the Optional is empty.
import std::io;
fn void test_bad()
{
int regular_value;
int? optional_value = function_may_error();
// An empty Optional found in optional_value
if (catch optional_value)
{
// Assign default result when empty.
regular_value = -1;
}
// A result was found in optional_value
if (try optional_value)
{
regular_value = optional_value;
}
io::printfn("The value was: %d", regular_value);
}
fn void test_good()
{
// Return '-1' when `function_may_error()` is empty.
int regular_value = function_may_error() ?? -1;
io::printfn("The value was: %d", regular_value);
}
Modifying the returned Excuse¶
A common use of ?? is to catch an empty Optional and change
the Excuse to another more specific Excuse, which
allows us to distinguish one failure from the other,
even when they had the same Excuse originally.
import std::io;
faultdef DOG_ATE_HOMEWORK, TEXTBOOK_ON_FIRE;
fn int? test()
{
return io::FILE_NOT_FOUND~;
}
fn void? examples()
{
int? a = test(); // io::FILE_NOT_FOUND
int? b = test(); // io::FILE_NOT_FOUND
// We can tell these apart by default assigning our own unique
// Excuse. Our custom Excuse is assigned only if an
// empty Optional is returned.
int? c = test() ?? DOG_ATE_HOMEWORK~;
int? d = test() ?? TEXTBOOK_ON_FIRE~;
// If you want to immediately return with an Excuse,
// use the "~" and "!" operators together, see the code below:
int e = test() ?? DOG_ATE_HOMEWORK~!;
int f = test() ?? TEXTBOOK_ON_FIRE~!;
}
Force unwrapping expressions¶
The force unwrap operator !! will
make the program panic and exit if the expression is an empty optional.
This is useful when the error should – in normal cases – not happen
and you don't want to write any error handling for it.
That said, it should be used with great caution in production code.
fn void find_file_and_test()
{
find_file()!!;
// Force unwrap '!!' is roughly equal to:
// if (catch find_file()) unreachable("Unexpected excuse");
}
Find empty Optional without reading the Excuse¶
import std::io;
fn void test()
{
int? optional_value = io::FILE_NOT_FOUND~;
// Find empty Optional, then handle inside scope
if (catch optional_value)
{
io::printn("Found empty Optional, the Excuse was not read");
}
}
Run code if the Optional has a result¶
This is a convenience method, the logical inverse of if (catch), and is helpful when you don't care about the empty branch of the code or you wish to perform an early return.
import std::io;
fn void test()
{
int? optional_value = 7;
// 'optional_value' is a non-Optional variable inside the scope
if (try optional_value)
{
io::printfn("Result found: %s", optional_value);
}
// The Optional result is assigned to 'unwrapped_value' inside the scope
if (try unwrapped_value = optional_value)
{
io::printfn("Result found: %s", unwrapped_value);
}
}
Another example:
import std::io;
// Returns Optional result with `int` type or empty with an Excuse
fn int? reliable_function()
{
return 7; // Return a result
}
fn void main(String[] args)
{
int? reliable_result = reliable_function();
// Unwrap the result from reliable_result
if (try reliable_result)
{
// reliable_result is unwrapped in this scope, can be used as normal
io::printfn("reliable_result: %s", reliable_result);
}
}
You can add further conditions to an if (try), but they must be
joined with &&. However, you cannot use logical OR (||) conditions:
import std::io;
// Returns Optional with an 'int' result or empty with an Excuse
fn int? reliable_function()
{
return 7; // Return an Optional result
}
fn void main(String[] args)
{
int? reliable_result1 = reliable_function();
int? reliable_result2 = reliable_function();
// Unwrap the result from reliable_result1 and reliable_result2
if (try reliable_result1 && try reliable_result2 && 5 > 2)
{
// `reliable_result1` can be used as a normal variable here
io::printfn("reliable_result1: %s", reliable_result1);
// `reliable_result2` can be used as a normal variable here
io::printfn("reliable_result2: %s", reliable_result2);
}
// ERROR cannot use logical OR `||`
// if (try reliable_result1 || try reliable_result2)
// {
// io::printn("this can never happen");
// }
}
Shorthands to work with Optionals¶
Getting the Excuse¶
Retrieving the Excuse with if (catch excuse = optional_value) {...}
is not the only way to get the Excuse from an Optional; we can use the macro @catch instead.
Unlike if (catch) this will never cause automatic unwrapping.
fn void main(String[] args)
{
int? optional_value = io::FILE_NOT_FOUND~;
fault excuse = @catch(optional_value);
if (excuse)
{
io::printfn("Excuse found: %s", excuse);
}
}
Checking if an Optional has a result without unwrapping¶
The @ok macro will return true if an Optional result is present and
false if the Optional is empty.
Functionally this is equivalent to !@catch, meaning no Excuse was found, for example:
fn void main(String[] args)
{
int? optional_value = 7;
bool result_found = @ok(optional_value);
assert(result_found == !@catch(optional_value));
}
No void? variables¶
The void? type has no possible representation as a variable, and may
only be a function return type.
Note
The main function cannot return an optional.
To store the Excuse returned from a void? function without
if (catch foo = optional_value),
use the @catch macro to convert the Optional to a fault:
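A minimal sketch of that pattern, reusing the shape of the test() function from the earlier examples:

```c3
import std::io;

fn void? test()
{
    return io::FILE_NOT_FOUND~;
}

fn void main(String[] args)
{
    // @catch converts the void? result into a plain fault value,
    // so no void? variable is ever declared
    fault excuse = @catch(test());
    if (excuse)
    {
        io::printfn("test() gave an Excuse: %s", excuse);
    }
}
```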
C Interop
C3 is C ABI compatible. That means you can call C from C3, and call C3 from C without having to
do anything special. As a quick way to call C, you can simply declare the function as a
C3 function but with extern in front of it. As long as the function is linked, it will work:
extern fn void puts(char*); // C "puts"
fn void main()
{
// This will call the "puts"
// function in the standard c lib.
puts("Hello, world!");
}
To use a different identifier inside of your C3 code compared to the function or variable’s external name, use the @cname attribute:
extern fn void foo_puts(char*) @cname("puts"); // C "puts"
fn void main()
{
foo_puts("Hello, world!"); // Still calls C "puts"
}
While C3 functions are available from C using their external name, it's often useful to
define an external name using @cname or @export with a name to match C usage.
module foo;
fn int square(int x) @export // @export ensures external visibility
{
return x * x;
}
fn int square2(int x) @export("square")
{
return x * x;
}
Calling from C:
extern int square(int);
int foo_square(int) __attribute__ ((weak, alias ("foo__square")));
void test()
{
// This would call square2
printf("%d\n", square(11));
// This would call square
printf("%d\n", foo_square(11));
}
Linking static and dynamic libraries¶
If you have a library foo.a or foo.so or foo.obj (depending on type and OS), just add
-l foo on the command line, or in the project file add it to the linked-libraries value, e.g.
"linked-libraries" = ["foo"].
To add library search paths, use -L <directory> from the command line and linker-search-paths
in the project file (e.g. "linker-search-paths" = ["../mylibs/", "/extra-libs/"]).
Gotchas¶
- Bitstructs will be seen as their backing type when used from C.
- C bit fields must be manually converted to a C3 bitstruct with the correct layout for each target platform.
- C assumes the enum size is CInt.
- C3 uses fixed integer sizes. This means that int and CInt do not need to be the same, though in practice on 32/64 bit machines, long is usually the only type that differs in size between C and C3.
- Atomic types are not supported by C3.
  - In C3 there are generic Atomic types instead.
- There are no volatile and const qualifiers like in C.
  - C3 has global constants declared with const.
  - Instead of the volatile type qualifier, there are standard library macros @volatile_load and @volatile_store.
- Passing arrays by value like in C3 must be represented as passing a struct containing the array.
- In C3, fixed arrays do not decay into pointers like in C.
  - When defining a C function that has an array argument, replace the array type with a pointer. E.g. void test(int a[]) should become extern fn void test(int* a). If the function has a sized array, like void test2(int b[4]), replace it with a pointer to a sized array: extern fn void test2(int[4]* b);
  - Note that a pointer to an array is always implicitly convertible to a pointer to the first element. For example, int[4]* may be implicitly converted to int*.
- The C3 names of functions are name-spaced with the module by default when using @export, so when exporting a function with @export that is to be used from C, specify an explicit external name. E.g. fn void myfunc() @export("myfunc") { ... }.
Contracts
Contracts are optional pre- and post-condition checks that the compiler may use for static analysis, runtime checks and optimization. Note that conforming C3 compilers are not obliged to use contracts.
However, violating either pre- or post-conditions is unspecified behaviour, and a compiler may optimize code as if they are always true – even if a potential bug may cause them to be violated.
In safe mode, pre- and post-conditions are checked using runtime asserts.
Why is contract analysis optional for compilers?¶
A frequent question is: "why are contracts opt-in rather than mandatory"? The answer to this is that it allows C3 compilers to be built for resource-constrained environments where it is challenging to fit static analysis. It also makes it simpler to build simple C3 compilers for learning purposes.
Conversely, it should be easy for advanced compilers to have enough information to do advanced static analysis as part of the regular compilation step, so it is important that the constraints are explicit and available.
Pre-conditions¶
Pre-conditions are usually used to validate incoming arguments.
Each condition must be an expression that can be evaluated to a boolean.
Pre-conditions use the @require annotation, and optionally can have an
error message to display after them.
<*
@require foo > 0, foo < 1000 : "optional error msg"
*>
fn int test_foo(int foo)
{
return foo * 10;
}
If we now write the following code:
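A minimal sketch of such a call; test_foo is repeated from the example above so that the snippet is self-contained:

```c3
<*
 @require foo > 0, foo < 1000 : "optional error msg"
*>
fn int test_foo(int foo)
{
    return foo * 10;
}

fn void main()
{
    // The constant 0 violates the 'foo > 0' pre-condition
    test_foo(0);
}
```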
With c3c (the standard C3 compiler) we will get a compile time error, saying that the contract is violated. However, expressions requiring more static analysis are often only caught at runtime.
Post conditions¶
Post-conditions are evaluated to make checks on the resulting state after passing through the function.
The post-condition uses the @ensure annotation, where return represents the return value of the function.
<*
@require foo != null
@ensure return > foo.x
*>
fn uint check_foo(Foo* foo)
{
uint y = abs(foo.x) + 1;
// If foo.x were 0, the result would be 0 and the @ensure check would fail at runtime.
return y * abs(foo.x);
}
Parameter annotations¶
@param supports [in] [out] and [inout]. These are only applicable
for pointer arguments. [in] disallows writing to the variable,
[out] disallows reading from the variable. Without an annotation,
pointers may both be read from and written to without checks. If an & is placed
in front of the annotation (e.g. [&in]), then this means the pointer must be non-null
and is checked for null.
| Type | readable? | writable? | use as "in"? | use as "out"? | use as "inout"? |
|---|---|---|---|---|---|
| no annotation | Yes | Yes | Yes | Yes | Yes |
| in | Yes | No | Yes | No | No |
| out | No | Yes | No | Yes | No |
| inout | Yes | Yes | Yes | Yes | Yes |
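As a rough illustration of the table, the hypothetical function copy_value below marks one pointer as read-only and the other as write-only. The function name is an assumption made for this example, and how strictly the annotations are enforced depends on the compiler.

```c3
import std::io;

<*
 @param [&in] src
 @param [&out] dst
*>
fn void copy_value(int* src, int* dst)
{
    // 'src' is only read from, 'dst' is only written to
    *dst = *src;
}

fn void main()
{
    int a = 5;
    int b;
    copy_value(&a, &b);
    io::printfn("%d", b);
}
```

Swapping the two annotations would make both the read of src and the write to dst violate the contract.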
However, it should be noted that the compiler might not detect whether the annotation is correct or not! This program might compile, even though it's strictly incorrect.
<*
@param [&in] i
*>
fn void lying_func(int* i)
{
int* b = i;
*b = 2; // Circumvent checks!
}
fn void test()
{
int a = 1;
lying_func(&a);
io::printfn("%d", a); // Might print 2!
}
However, compilers will detect this (*):
<*
@param [&in] i
*>
fn void bad_func(int* i)
{
*i = 2; // <- Compiler error: cannot write to "in" parameter
}
* The spec allows a barebones compiler to completely ignore contracts. Using such a compiler even this check might be ignored.
Pure in detail¶
The pure annotation allows a program to make assumptions in regard to how the function treats global variables.
Unlike for const, a pure function is not allowed to call a function which is known to be impure.
However, just like for const the compiler might not detect whether the annotation
is correct or not! This program might compile, but will behave strangely:
int i = 0;
fn void bad_func()
{
i = 2;
}
<*
@pure
*>
fn void lying_func()
{
bad_func() @pure; // Call bad_func by assuring it is pure!
}
fn void main()
{
i = 1;
lying_func();
io::printfn("%d", i); // Might print 2!
}
Circumventing "pure" annotations will cause the compiler to optimize under the assumption that globals are not affected, even if this isn't true.
Pre-conditions for macros¶
In order to check macros, it's often useful to use the builtin $defined
function which returns true if the code inside would pass semantic checking.
<*
@require $defined(resource.open, resource.open()) : `Expected resource to have an "open" function`
@require resource != null
@require $assignable(resource.open(), void*)
*>
macro open_resource(resource)
{
return resource.open();
}
Contract support¶
A C3 compiler may have different levels of contract use:
| Level | Behaviour |
|---|---|
| 0 | Contracts are only semantically checked |
| 1 | @require may be compiled into asserts inside of the function. Compile time violations detected through constant folding should not compile |
| 2 | As Level 1, but @ensures are also checked |
| 3 | @require is added at caller side as well |
| 4 | Static analysis is extended beyond compile time folding |
The c3c compiler currently does level 3 checking.
Defer¶
A defer always runs at the end of a scope, at any exit point after it is declared. defer is commonly used to simplify code that needs clean-up, such as closing Unix file descriptors, freeing dynamically allocated memory or closing database connections.
End of a scope¶
The end of a scope also includes return, break, continue or rethrow !.
fn void test()
{
io::printn("print first");
defer io::printn("print third, on function return");
io::printn("print second");
return;
}
defer runs after the other print statements, at the function return.
Note
Rethrow ! unwraps the Optional result if present; afterwards the previously Optional variable is a normal variable again. If the Optional is empty, the Excuse is returned from the function back to the caller.
Defer Execution order¶
When there are multiple defer statements they are executed in reverse order of their declaration, last-to-first declared.
fn void test()
{
io::printn("print first");
defer io::printn("print third, defers execute in reverse order");
defer io::printn("print second, defers execute in reverse order");
return;
}
Example defer¶
import std::io;
fn char[]? file_read(String filename, char[] buffer)
{
// return Excuse if failed to open file
File file = file::open(filename, "r")!;
defer {
io::printn("File was found, close the file");
if (catch excuse = file.close())
{
io::printfn("Fault closing file: %s", excuse);
}
}
// return if fault reading the file into the buffer
file.read(buffer)!;
return buffer;
}
If the file named filename is found, the function will read the content into a buffer;
defer will then make sure that the open file handle is closed.
Note that if a scope exit happens before the defer declaration, the defer will not run. This is a useful property because if the file failed to open, we don't need to close it.
defer try¶
A defer try is called at end of a scope when the returned Optional contained a result value.
Examples¶
fn void? test()
{
defer try io::printn("✅ defer try run");
// Returned an Optional result
return;
}
fn void main(String[] args)
{
(void)test();
}
defer try runs on scope exit.
fn void? test()
{
defer try io::printn("❌ defer try not run");
// Returned an Optional Excuse
return io::FILE_NOT_FOUND~;
}
fn void main(String[] args)
{
if (catch err = test())
{
io::printfn("test() returned a fault: %s", err);
}
}
defer try does not run on scope exit.
defer catch¶
A defer catch is called at end of a scope when exiting with an
Optional Excuse, and is helpful for logging, cleanup and freeing resources.
Memory allocation example¶
import std::core::mem;
import std::io;
fn char[]? test()
{
char[] data = mem::new_array(char, 12);
defer (catch err)
{
io::printfn("Excuse found: %s", err);
free(data);
}
// Returns Excuse, memory gets freed
if (!test_something(data)) return io::FILE_NOT_FOUND~;
// Returns data, defer catch doesn't run.
return data;
}
Pitfalls with defer and defer catch
If cleaning up memory allocations or resources, make sure the defer or defer catch
are declared as close to the resource declaration as possible.
This helps to avoid unwanted memory leaks or unwanted resource usage from other code rethrowing ! before the defer catch was even declared.
fn void? function_throws()
{
return io::FILE_NOT_FOUND~;
}
fn String? test()
{
char[] data = mem::new_array(char, 12);
// ❌ Before the defer catch declaration
// memory was NOT freed
// function_throws()!;
defer (catch err)
{
io::printn("freeing memory");
free(data);
}
// ✅ After the defer catch declaration
// memory freed correctly
function_throws()!;
return (String)data;
}
Attributes
Attributes are compile-time annotations on functions, types, global constants and variables. Similar to Java annotations, an attribute may also take arguments. An attribute can also represent a bundle of attributes.
Built in attributes¶
@align(alignment)¶
Used for: struct, bitstructs, union, var, function
This attribute sets the minimum alignment for a field or a variable, for example:
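A minimal sketch, using a hypothetical struct Foo:

```c3
// The struct as a whole is aligned to at least 32 bytes
struct Foo @align(32)
{
    int a;
    // This member is aligned to at least 16 bytes
    int b @align(16);
}
```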
Note that following C behaviour, @align is only able to increase
the alignment. If setting a smaller alignment than default is
desired, then use @packed (which sets the alignment to 1 for all members)
and then @align.
@benchmark¶
Used for: function
Marks the function as a benchmark function. Will be added to the list of benchmark functions when the benchmarks are run, otherwise the function will not be included in the compilation.
@bigendian¶
Used for: bitstruct
Lays out the bits as if the data was stored in a big endian type, regardless of host system endianness.
@builtin¶
Used for: function, macro, global, const
Allows a macro, function, global or constant to be used from another module without the module path prefixed. Should be used sparingly.
@callconv¶
Used for: function
Sets the calling convention, which may be ignored if the convention is not supported on the target.
Valid arguments are "veccall", "cdecl", "stdcall". Any function without an explicit @callconv will use
"cdecl" which is the normal C calling convention.
Caution
On Windows, many calls are tagged stdcall in the C
headers. However, this calling convention is only ever used on 32-bit Windows,
and is a no-op on 64-bit Windows.
@compact¶
Used for: struct, union
This attribute works like @nopadding, but is applied recursively for any sub-elements, ensuring that there is no padding anywhere in the struct.
@const¶
Used for: macro
This attribute will ensure that the macro is always compile time folded (to a constant). Otherwise, a compile time error will be issued.
@deprecated¶
Used for: types, function, macro, global, const, member
Marks the particular type, global, const or member as deprecated, making use trigger a warning.
@dynamic¶
Used for: methods
Mark a method for dynamic invocation. This allows the method to be invoked through interfaces.
@export¶
Used for: function, global, const, enum, union, struct, faultdef
Marks this declaration as an export, this ensures it is never removed and exposes it as public when linking.
The attribute takes an optional string value, which is the external name. This acts as if @cname had been
added with that name.
@cname¶
Used for: function, global, const, enum, union, struct, faultdef
Sets the external (linkage) name of this declaration.
Caution
Do not confuse this with @export, which is required
to export a function or global.
@finalizer¶
Used for: function
Make this function run at shutdown. See @init for the optional priority. Note that running a
finalizer is a "best effort" attempt by the OS. During abnormal termination it is not guaranteed to run.
The function must be a void function taking no arguments.
@if¶
Used for: all declarations
Conditionally includes the declaration in the compilation. It takes a constant compile time value argument, if this
value is true then the declaration is retained, on false it is removed.
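A sketch of conditional compilation with @if, assuming the compile time constant env::WIN32 from the standard library's std::core::env module. The function name print_platform is hypothetical.

```c3
import std::io;

// Only one of these two declarations is kept, depending on the target
fn void print_platform() @if(env::WIN32)
{
    io::printn("Windows build");
}

fn void print_platform() @if(!env::WIN32)
{
    io::printn("Non-Windows build");
}

fn void main()
{
    print_platform();
}
```

Because the removed declaration never takes part in compilation, the two functions can share a name without conflict.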
@init¶
Used for: function
Make this function run at startup before main. It has an optional priority 1 - 65535, with lower being executed earlier. It is not recommended to use values less than 128 as they are generally reserved and using them may interfere with standard program initialization.
The function must be a void function taking no arguments.
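A minimal sketch of a start-up function together with a matching @finalizer. The names setup, teardown and start_value are hypothetical and used only for illustration.

```c3
import std::io;

int start_value;

// Runs at startup, before main
fn void setup() @init
{
    start_value = 42;
}

// Runs at shutdown on a best effort basis
fn void teardown() @finalizer
{
    // Release global resources here
}

fn void main()
{
    // setup() has already run at this point
    io::printfn("%d", start_value);
}
```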
@inline¶
Used for: function, call
Declares a function to always be inlined or if placed on a call, that the call should be inlined.
@link¶
Used for: module, function, macro, global, const
Syntax for this attribute is @link(cond, link1, link2, ...),
where "link1" etc. are string names of libraries to implicitly
link to when this symbol is used.
In the case of a module section, adding @link implicitly places the
attribute on all of its symbols.
@littleendian¶
Used for: bitstruct
Lays out the bits as if the data was stored in a little endian type, regardless of host system endianness.
@local¶
Used for: any declaration
Sets the visibility to "local", which means it's only visible in the current module section.
@maydiscard¶
Used for: function, macro
Allows the return value of the function or macro to be discarded even if it is an optional. Should be used sparingly.
@mustinit¶
Used for: user-defined types
Prevents the use of the @noinit tag on a variable of the specified type.
@naked¶
Used for: function
This attribute disables prologue / epilogue emission for the function.
The body of the function should be a text asm statement.
@noalias¶
Used for: function parameters
This is similar to restrict in C. A parameter with @noalias should
be a pointer type, and the pointer is assumed not to alias to any other
pointer.
@nodiscard¶
Used for: function, macro
The return value may not be discarded.
@noinit¶
Used for: global, local variable
Prevents the compiler from zero initializing the variable.
@noinline¶
Used for: function, function call
Prevents the compiler from inlining the function or a particular function call.
@nopadding¶
Used for: struct, union
Ensures that a struct or union has no padding, emits a compile time error otherwise.
@norecurse¶
Used for: import
Import the module but not sub-modules or parent-modules, see Modules Section.
@noreturn¶
Used for: function, macro
Declares that the function will never return.
@nosanitize¶
Used for: function
This prevents sanitizers from being added to this function.
@nostrip¶
Used for: any declaration
This causes the declaration never to be stripped from the executable, even if it's not used. This also transitively applies to any dependencies the declaration might have.
@obfuscate¶
Used for: any declaration
Removes any string values that would identify the declaration in some way. Mostly this is used on faults and enums to remove the stored names.
@operator¶
Used for: method, macro method
This attribute has arguments [] []= &[] and len allowing subscript operator overloading for [] and foreach.
By implementing [] and len, foreach and foreach_r are enabled. In order to do foreach by reference,
&[] must be implemented as well.
Furthermore ==, !=, bit operations and arithmetics can all be overloaded.
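A sketch of subscript overloading on a hypothetical Countdown type. The type and method names are assumptions made for this example; only the @operator arguments come from the description above.

```c3
import std::io;

struct Countdown
{
    int[3] values;
}

// Implementing [] and len enables indexing and foreach on Countdown
fn int Countdown.get(&self, usz index) @operator([])
{
    return self.values[index];
}

fn usz Countdown.len(&self) @operator(len)
{
    return 3;
}

fn void main()
{
    Countdown c = { .values = { 3, 2, 1 } };
    io::printfn("%d", c[0]);
    foreach (value : c)
    {
        io::printfn("%d", value);
    }
}
```

Iterating by reference with foreach (&value : c) would additionally need a method marked @operator(&[]).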
@optional¶
Used for: interface methods
Placed on an interface method, this makes the method optional to implement for types that implement the interface.
See the Printable interface for an example.
@overlap¶
Used for: bitstruct
Allows bitstruct fields to have overlapping bit ranges.
@packed¶
Used for: struct, union
Causes all members to be packed as if they had alignment 1. The alignment of the struct/union is set to 1.
This alignment can be overridden with @align.
@private¶
Used for: any declaration
Sets the visibility to "private", which means it is visible in the same module, but not from other modules.
@pure¶
Used for: call
Used to annotate a call to a non-pure function as "pure" when checking for conformance to @pure on
functions.
@reflect¶
Used for: any declaration
Adds additional reflection information. Has no effect currently.
@section(name)¶
Used for: function, const, global
Declares that a global variable or function should appear in a specific section.
@tag(name, value)¶
Used for: function, macro, user-defined type, struct/union/bitstruct member, global, local variables
Adds a compile time tag to a type, function or member which can be retrieved
at compile time using reflection: .has_tag(..) and .get_tag(...).
Example: Foo.has_tag("bar") will return true if Foo has a tag "bar".
Foo.get_tag("bar") will return the value associated with that tag. For variables and members, access
it using $reflect: $reflect(my_global).has_tag("bar").
@test¶
Used for: function
Marks the function as a test function. Will be added to the list of test functions when the tests are run, otherwise the function will not be included in the compilation.
@unused¶
Used for: any declaration
Marks the declaration as possibly unused (but should not emit a warning).
@used¶
Used for: any declaration
Marks a parameter, value etc. as something that must be used.
@wasm¶
Used for: function, global, const
This attribute may take 0, 1 or 2 arguments. With 0 or 1 arguments
it behaves identically to @export if the symbol is non-extern. For extern
symbols it behaves like @cname.
When used with 2 arguments, the first argument is the wasm module,
and the second is the name. It can only be used for extern symbols.
@winmain¶
Used for: function
This attribute is ignored on non-Windows targets. On Windows,
it will create a WinMain entry point that calls
the main function. This will give other options for the main
argument, and is recommended for Windows GUI applications.
It is only valid for the main function.
@weak¶
Used for: function, const, global
Like @weaklink, but if the same definition occurs in the same compilation, the non-weak one is preferred.
@weaklink¶
Used for: function, const, global
Emits a weak symbol rather than a global.
User defined attributes¶
User defined attributes are intended for conditional application of built-in attributes.
attrdef @MyAttribute = @noreturn, @inline;
attrdef @MyCname(x) = @cname(x);
// The following two are equivalent:
fn void foo() @MyAttribute { /* */ }
fn void foo() @noreturn @inline { /* */ }
An attribute may also take parameters:
attrdef @MyAttr(val) = @tag("foo", val);
struct Test
{
int foo @MyAttr("test");
}
$echo Test.foo.tagof("foo"); // Will echo "test" at compile time
The attribute may also be completely empty:
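A minimal sketch of an empty attribute, assuming the same attrdef form as above without a right-hand side. The name @MyAttributeEmpty is hypothetical, and the exact syntax should be verified against the compiler:

```c3
// An attribute that expands to no built-in attributes at all.
attrdef @MyAttributeEmpty;

// Applying it leaves the declaration unchanged.
fn void foo() @MyAttributeEmpty { /* */ }
```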
The alias statement¶
The alias statement in C3 is intended for making new names for function pointers, identifiers and types.
Defining a type alias¶
alias <type alias> = <type> creates a type alias. A type alias needs to follow the naming convention of user defined types (i.e. a capitalized
name with at least one lower case letter).
Function pointers must be aliased in C3. The syntax is somewhat different from C:
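A minimal sketch of such a declaration, with Callback as a hypothetical alias name. The form mirrors the function pointer alias shown in the reflection section:

```c3
// "fn" introduces the function pointer type; the parameter names are optional documentation.
alias Callback = fn void(int a, bool b);
```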
This defines an alias for the function pointer type of a function that returns nothing and takes two arguments: an int and a bool. Here is a sample to illustrate usage:
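The following sketch shows one way such an alias might be used. The names Callback and print_args are hypothetical and only serve the illustration:

```c3
import std::io;

// Hypothetical function pointer alias.
alias Callback = fn void(int a, bool b);

// A function whose signature matches the alias.
fn void print_args(int a, bool b)
{
    io::printfn("a = %d, b = %s", a, b);
}

fn void main()
{
    // Store the function's address in a variable of the alias type.
    Callback cb = &print_args;
    // Call through the function pointer.
    cb(10, false);
}
```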
typedef types¶
A typedef creates a new type.
A typedef does not implicitly convert to or from any other type, unlike a type alias.
Literals will convert to the typedef types if they would convert to the underlying type.
Because a typedef type is a new type, it can have its own methods, like any other user-defined type.
typedef Foo = int;
Foo f = 0; // Valid since 0 converts to an int.
f = f + 1;
int i = 1;
// f = f + i Error!
f = f + (Foo)i; // Valid
typedef inline¶
When interacting with various APIs it is sometimes desirable for typedef types to implicitly convert to
their base type, but not from that type.
Behaviour here is analogous to how structs may use inline to create struct subtypes.
typedef CString = char*;
typedef ZString = inline char*;
//...
CString cstr = "cstr";
ZString zstr = "zstr";
//...
// char* from_cstr = cstr; // Error!
char* from_zstr = zstr; // Valid!
Function and variable aliases¶
alias can also be used to create aliases for functions and variables.
The syntax is alias <alias> = <original identifier>.
fn void foo() { ... }
int foo_var;
alias bar = foo;
alias bar_var = foo_var;
fn void test()
{
// These are the same:
foo();
bar();
// These access the same variable:
int x = foo_var;
int y = bar_var;
}
Using alias to create generic types, functions and variables¶
It is recommended to favour using alias to create aliases for parameterized types, but it can also be used for parameterized functions and variables:
import generic_foo;
// Parameterized function aliases
alias int_foo_call = generic_foo::foo_call {int};
alias double_foo_call = generic_foo::foo_call {double};
// Parameterized type aliases
alias IntFoo = Foo {int};
alias DoubleFoo = Foo {double};
// Parameterized global aliases
alias int_max_foo = generic_foo::max_foo {int};
alias double_max_foo = generic_foo::max_foo {double};
For more information, see the chapter on generics.
Function pointer default arguments and named parameters¶
It is possible to attach default arguments to function pointer aliases. There is no requirement that the function has the same default arguments. In fact, the function pointer may have default arguments where the function has none, and vice versa. Calling the function directly will then use the function's default arguments, whereas calling through the function pointer will use the function pointer alias's default arguments.
Similarly, named parameter arguments follow the alias definition when calling through the function pointer:
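A minimal sketch of both rules, assuming a hypothetical alias TestFn whose parameter name and default differ from those of the function it points to:

```c3
import std::io;

// The alias names its parameter "y" and gives it the default 123.
alias TestFn = fn void(int y = 123);

// The function names its parameter "x" and gives it the default 5.
fn void test(int x = 5)
{
    io::printfn("X = %d", x);
}

fn void main()
{
    TestFn test2 = &test;
    test();       // Direct call: uses the function's default, 5
    test2();      // Call through the pointer: uses the alias's default, 123
    test(x: 3);   // Named argument follows the function's parameter name
    test2(y: 4);  // Named argument follows the alias's parameter name
}
```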
Generic Programming
Generics
Generics allow you to create code that works with arbitrary types.
// If the module section is generic,
// then all its declarations are as well
module my_module <Type>;
// Parameterized struct
struct MyStruct
{
Type a, b;
}
// Parameterized function
fn Type square(Type t)
{
return t * t;
}
We can rewrite this with individual generic declarations:
module my_module;
struct MyStruct <Type>
{
Type a, b;
}
fn Type square(Type t) <Type>
{
return t * t;
}
Parameter types¶
Generic parameters may be types or int, bool and enum constants. Type parameters are written as if they were regular type aliases, e.g. Type. Constant parameters are written as if they were constant aliases, e.g. MY_CONST, COUNT etc.
An example parameterized by a constant as well as a type:
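A minimal sketch, assuming a hypothetical module fixed_buffer with a type parameter Type and an integer constant parameter COUNT, written in the module-level generic form shown above:

```c3
// Type is a type parameter, COUNT is a constant parameter.
module fixed_buffer <Type, COUNT>;

struct Buffer
{
    // The constant parameter is used as the array length.
    Type[COUNT] data;
}

fn Type first(Buffer* buffer)
{
    return buffer.data[0];
}

module app;
import fixed_buffer;

// Instantiate with a type and a constant.
alias IntBuffer4 = Buffer {int, 4};
```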
Using generic parameters¶
The code in a generic declaration uses the parameters as if they were types / constant aliases in the scope:
module foo_test <Type1, Type2>;
struct Foo
{
Type1 a;
}
fn Type2 test(Type2 b, Foo* foo)
{
return foo.a + b;
}
Using generics¶
To use a generic function or type, we can either define an alias for it, or invoke it directly with its parameters:
import foo_test;
alias FooFloat = Foo {float, double};
alias test_float = foo_test::test {float, double};
...
FooFloat f;
Foo{int, double} g;
...
test_float(1.0, &f);
foo_test::test{int, double} (1.0, &g);
Generics are grouped¶
All generics that are defined in the same parameterized module section are instantiated together, but so are any other generics in the same module that have identical parameters:
module abc <Test>;
// Belongs to generic 1
fn Test test1(Test a)
{
return a + 1;
}
module abc;
// Belongs to generic 1
struct Foo <Test>
{
Test a;
}
// Belongs to generic 1
fn Foo test2(Test b) <Test>
{
return (Foo) { .a = b };
}
// Different parameter name, defines a new generic 2
fn Test2 test3(Test2 a) <Test2>
{
return a * a;
}
fn void main()
{
// This will instantiate Foo, test2 and test1,
// but not test3
Foo{int} a;
}
Generic contracts¶
Just like for macros, optional constraints may be added to improve compile errors:
<*
@require $assignable(1, TypeB) && $assignable(1, TypeC)
@require $assignable((TypeB)1, TypeA) && $assignable((TypeC)1, TypeA)
*>
module vector <TypeA, TypeB, TypeC>;
/* .. code .. */
alias test_function = vector::test_func {Bar, float, int};
// This would give the error
// --> Parameter(s) failed validation:
// @require "$assignable((TypeB)1, TypeA) && $assignable((TypeC)1, TypeA)" violated.
In general, contracts placed on types and identifiers will combine. However, contracts on generic functions and macros do not carry over to the aggregated generic contract:
module foo;
<* @require Test.kindof == INTEGER *>
struct Foo <Test>
{
Test a;
}
<* @require Test::size < 4 *>
fn Test testme(Test t) <Test>
{
return t * 2;
}
fn void main()
{
// This would trigger the generic contract, placed on Foo:
// testme{float}(2.0f);
// However this is fine, since
// the function contract is not checked unless invoked:
Foo{long} x;
}
Methods on generic types¶
Adding a method to a generic type extends every instantiation of that type with the method, and the method may use the generic parameters that the type was declared with:
module foo;
struct Foo <Type>
{
Type a;
}
module bar;
import foo, std::io;
fn Type Foo.add(self, Type b) => self.a + b;
fn void main()
{
Foo{int} f1 = { 3 };
Foo{double} f2 = { 3.4 };
io::printn(f1.add(5));
io::printn(f2.add(5));
}
We can also extend a particular instance, but in that case we do not access the parameterization.
module foo;
struct Foo <Type> { Type a; }
module bar;
import foo, std::io;
fn int Foo{int}.add(self, int b) => self.a + b;
// The below code would print "Error: 'Type' could not be found, did you spell it right?"
// fn Type Foo{int}.sub(self, Type b) => self.a - b;
fn void main()
{
Foo{int} f1 = { 3 };
Foo{double} f2 = { 3.4 };
io::printn(f1.add(5));
// io::printn(f2.add(5)); ERROR - There is no field or method 'Foo{double}.add'
}
Macros
The macro capabilities of C3 reach across several constructs:
macros, generic functions, generic modules, compile time variables
(prefixed with $), macro compile time execution (using $if, $for, $foreach, $switch) and attributes.
A quick comparison of C and C3 macros¶
Conditional compilation¶
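As a rough sketch of this comparison (the constant Y and the variable z are hypothetical), a C preprocessor conditional wrapped around a declaration corresponds to an @if attribute on the declaration in C3:

```c3
// C:
// #if Y > 3
// int z;
// #endif

// C3: the declaration stays visible at the top level and is
// included or removed through its @if attribute.
const int Y = 4;
int z @if(Y > 3);
```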
Macros¶
// C Macro
#define M(x) ((x) + 2)
#define UInt32 unsigned int
// Use:
int y = M(foo() + 2);
UInt32 b = y;
// C3 Macro
macro m(x)
{
return x + 2;
}
alias UInt32 = uint;
// Use:
int y = m(foo() + 2);
UInt32 b = y;
Dynamic scoping¶
Expression arguments¶
First class types¶
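A rough sketch of what this comparison refers to, with hypothetical macro names and assuming the :: property syntax used in the reflection section. A C macro can only paste a type name textually, while a C3 macro can take a type as a compile time parameter and query its properties:

```c3
// C:
// #define SIZE_PLUS_INT(T) (sizeof(T) + sizeof(int))

// C3: $Type is a compile time type parameter.
macro size_plus_int($Type)
{
    return $Type::size + int::size;
}
```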
Trailing blocks for macros¶
// C Macro
#define FOR_EACH(x, list) \
for (x = (list); x; x = x->next)
// Use:
Foo *it;
FOR_EACH(it, list)
{
if (!process(it)) return;
}
// C3 Macro
macro @for_each(list; @body(it))
{
for ($Typeof(list) x = list; x; x = x.next)
{
@body(x);
}
}
// Use:
@for_each(list; Foo* x)
{
if (!process(x)) return;
}
First class names¶
Declaration attributes¶
Consider these two examples comparing declaration attribute syntax in C vs C3:
// C Macro
#define DEPRECATED_INLINE __attribute__((deprecated)) __attribute__((always_inline))
int foo(int x) DEPRECATED_INLINE { ... }
// C3 Macro
attrdef @DeprecatedInline = @deprecated, @inline;
fn int foo(int) @DeprecatedInline { ... }
Declaration macros¶
Stringification¶
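A rough sketch of this comparison, with a hypothetical macro name. C stringifies a macro argument with the # operator, whereas C3 applies $stringify to an expression parameter:

```c3
// C:
// #define SHOW(x) printf("%s = %d\n", #x, (x))

// C3:
import std::io;

macro @show(#expr)
{
    // $stringify yields the source text of the expression,
    // and #expr evaluates it.
    io::printfn("%s = %d", $stringify(#expr), #expr);
}
```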
Top level evaluation¶
Scripting languages usually have unbounded top level evaluation. The flexibility of this style of meta programming has a trade-off in making the code more challenging to understand.
In C3, top level compile time evaluation is limited to @if attributes to conditionally enable or disable declarations and a handful of other somewhat limited compile time evaluation features (e.g. $assert, etc). This makes the code easier to read, but comes at the cost of expressive power. However, C3 makes this tradeoff for a reason:
Preventing top level compile time evaluation helps prevent lots of declarations from popping into existence seemingly by magic, which is a common source of codebase intelligibility degrading over time in C and C++. By restricting the system to only either including or removing those declarations that are or aren't applicable, via @if, C3 makes it so that you still get conditional compilation and macros but with much less bewildering "magic". This also allows IDEs to effectively work with C3 source code despite its extensive macro system.
In effect, top level declarations become always visible in C3, regardless of whether they are included or removed, whereas in C and C++ unbounded invisible declarations may occur, causing code to become increasingly opaque and riddled with seemingly indecipherable "magic" and numerous variables and constants seemingly coming from nowhere.
Local function scopes in contrast have the full range of C3's compile time evaluation features available though, which are arguably often more expressive and pleasant to use than C and C++'s equivalents for many use cases.
Macro declarations¶
A macro is defined using the syntax macro <return_type> <name>(<parameters>). Specifying the return type of a macro is optional and if omitted the return type is inferred but must always be well-defined (hence different paths cannot return different types, etc).
The parameters have different sigils that must prefix their names where applicable: $ means compile time evaluated (constant expression or type). # indicates an expression that is not yet evaluated, but is bound to where it was defined.
Macros that use any expression parameters (#) or trailing macro bodies (@body(...)) must have a name that begins with @. The reason is that macros which don't use such features can be thought of as being essentially function-like, without any surprising behavior such as lazily implementing expressions or (as is the case of macros with trailing bodies) essentially creating a new type of statement.
The @ warns the reader of a macro call of the possibility that the call may be doing more "magic" or may be more prone to bugs than if the macro lacked the @. Thus, unlike most languages, C3 enables the programmer to choose between more safe or more expressive macros and to make that choice immediately clear to the reader.
Note that $ parameters (unlike # and @body parameters) do not cause a macro to need a @ prefix.
For example, here's a basic swap written as a macro instead of using pointers, which makes it potentially more efficient by avoiding pointer indirection overhead:
<*
@require $defined(#a = #b, #b = #a)
*>
macro void @swap(#a, #b)
{
var temp = #a;
#a = #b;
#b = temp;
}
This expands on usage like this:
fn void test()
{
int a = 10;
int b = 20;
@swap(a, b);
}
// Equivalent to:
fn void test()
{
int a = 10;
int b = 20;
{
int __temp = a;
a = b;
b = __temp;
}
}
Note the necessary #. Here is an incorrect swap and what it would expand to:
macro void badswap(a, b)
{
var temp = a;
a = b;
b = temp;
}
fn void test()
{
int a = 10;
int b = 20;
badswap(a, b);
}
// Equivalent to:
fn void test()
{
int a = 10;
int b = 20;
{
int __a = a;
int __b = b;
int __temp = __a;
__a = __b;
__b = __temp;
}
}
Macro methods¶
Similar to regular methods a macro may also be associated with a particular type:
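A minimal sketch, assuming a hypothetical struct Counter. The macro is declared with the type name as a prefix, just like a method, and is expanded at the call site:

```c3
import std::io;

struct Counter
{
    int value;
}

// Hypothetical macro method: "amount" is untyped, so any type
// that can be added to an int is accepted.
macro void Counter.add(&self, amount)
{
    self.value += amount;
}

fn void main()
{
    Counter c = { .value = 1 };
    c.add(41);
    io::printfn("%d", c.value); // prints 42
}
```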
See the chapter on functions for more details.
Capturing a trailing block¶
It is often useful for a macro to take a trailing compound statement as an argument. In C++ this pattern is usually expressed with a lambda, but in C3 this is completely inlined.
To accept a trailing block, ; @name(param1, ...) is placed after declaring the regular macro parameters.
Here's an example to illustrate its use:
<*
A macro looping through a list of values, executing the body once
every pass.
@require $defined(a.len) && $defined(a[0])
*>
macro @foreach(a; @body(index, value))
{
for (int i = 0; i < a.len; i++)
{
@body(i, a[i]);
}
}
fn void test()
{
double[] a = { 1.0, 2.0, 3.0 };
@foreach(a; int index, double value)
{
io::printfn("a[%d] = %f", index, value);
};
}
// Expands to code similar to:
fn void test()
{
double[] a = { 1.0, 2.0, 3.0 };
{
double[] __a = a;
for (int __i = 0; __i < __a.len; __i++)
{
int __index = __i;
double __value = __a[__i];
io::printfn("a[%d] = %f", __index, __value);
}
}
}
Macros returning values¶
A macro may return a value, in which case it is then considered an expression rather than a statement:
macro square(x)
{
return x * x;
}
fn int getTheSquare(int x)
{
return square(x);
}
fn double getTheSquare2(double x)
{
return square(x);
}
Calling macros¶
It's perfectly fine for a macro to invoke another macro or itself.
macro square(x) { return x * x; }
macro squarePlusOne(x)
{
return square(x) + 1; // Expands to "return x * x + 1;"
}
The maximum recursion depth is limited to the macro-recursion-depth build setting.
Macro vaargs¶
Macros support the typed vaargs used by C3 functions: macro void foo(int... args) and macro void bar(args...)
but also support a unique set of macro vaargs that look like C-style vaargs: macro void baz(...).
macro compile_time_sum(...)
{
var $x = 0;
$for var $i = 0; $i < $vaarg.len; $i++:
$x += $vaarg[$i];
$endfor
return $x;
}
$if compile_time_sum(1, 3) > 2: // Will compile to $if 4 > 2
...
$endif
$vaarg.len¶
Returns the number of arguments passed into the macro's vaarg list.
$vaarg[idx]¶
Returns the argument at index idx as an unevaluated expression. Multiple uses will
evaluate the expression multiple times. This corresponds to # parameters.
$stringify($vaarg[idx])¶
Will return the expression at the index idx as a string. It is equivalent to $stringify(#param).
macro dbg(...)
{
$for int $i = 0; $i < $vaarg.len; $i++:
io::printfn("%s = %s", $stringify($vaarg[$i]), $vaarg[$i]);
$endfor
}
int x = 111;
double y = 22.98;
dbg(x, y);
// Will print
// x = 111
// y = 22.98
$Typefrom($vaarg[idx])¶
Will treat the argument at the index idx as a type. It is equivalent to the old $vatype and corresponds to $Type parameters.
e.g. $Typefrom($vaarg[2]) a = 2.
...$vaarg (splatting)¶
...$vaarg allows you to paste the vaargs in the call into another call. For example,
if the macro was called with values "foo" and 1, the code foo(...$vaarg), would become foo("foo", 1).
You can even extract a range of arguments from the splat: ...$vaarg[2..4]. In this case, doing so would paste in arguments 2, 3 and 4.
Nor is ...$vaarg limited to function arguments. You can also use ...$vaarg within initializers. For example:
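A minimal sketch, assuming a hypothetical macro make_array that pastes all of its arguments into an array initializer:

```c3
macro make_array(...)
{
    // The array length is inferred from the number of pasted arguments.
    int[*] values = { ...$vaarg };
    return values;
}

fn void main()
{
    // Expands to an initializer equivalent to { 1, 2, 3 }.
    int[3] a = make_array(1, 2, 3);
}
```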
Untyped lists¶
Compile time variables may hold untyped lists. Such lists may be iterated over or implicitly converted to initializer lists:
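A minimal sketch of both uses, with hypothetical variable names:

```c3
import std::io;

fn void main()
{
    // A compile time variable holding an untyped list.
    var $list = { 1, 2 };

    // Iterate over the list at compile time.
    $foreach $x : $list:
        io::printfn("%d", $x);
    $endforeach

    // Implicit conversion to an initializer list.
    int[2] values = $list;
    io::printfn("%d", values[1]);
}
```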
Compile Time
During compilation, constant expressions will automatically be folded. Together with the compile
time conditional statements $if and $switch, and the compile time iteration statements $for and $foreach,
it is possible to perform limited compile time execution.
Compile time values¶
During compilation, global constants are considered compile time values, as are any derived constant values, such as type names and sizes, variable alignments, etc.
Inside of a macro or a function, it is possible to define mutable compile time variables. Such local variables are prefixed with $ (e.g. $foo). It is also possible to define local type variables, which are also prefixed using $ (e.g. $MyType, $ParamType, etc).
Mutable compile time variables are not allowed in the global scope.
Concatenation¶
The compile time concatenation operator +++ can be used at compile
time to concatenate arrays and strings:
macro int[3] @foo(int $y)
{
int[2] $z = { 1, 2 };
return $z +++ $y;
}
fn void main()
{
io::printn(@foo(4)); // prints "{ 1, 2, 4 }"
}
Compile time && and ||¶
The operators &&& and ||| perform compile time versions of && and
||. The difference from the runtime operators is that the right hand side is not type
checked if the left hand side is false in the case of &&& and true in the case of |||.
This allows us to safely write this macro code:
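A minimal sketch, assuming a hypothetical macro @foo that may or may not be defined elsewhere in the program:

```c3
macro bool check_foo()
{
    // The right hand side is only type checked when
    // $defined(@foo()) is true.
    $if $defined(@foo()) &&& @foo() > 0:
        return true;
    $else
        return false;
    $endif
}
```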
If @foo() doesn't exist, then this still compiles. However, if we had used && instead this would have been an error:
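The failing variant can be sketched the same way, again assuming the hypothetical macro @foo is not defined. This code is intentionally invalid in that case:

```c3
macro bool check_foo_unsafe()
{
    // With &&, "@foo() > 0" is type checked even though
    // $defined(@foo()) is false, so a missing @foo is a compile error.
    $if $defined(@foo()) && @foo() > 0:
        return true;
    $else
        return false;
    $endif
}
```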
$if and $switch¶
$if <const expr>: takes a compile time constant value and evaluates it to see if it is true or false. If it is true, then the code in the "then" branch is retained and semantically checked, while the $else branch – if present – is discarded. And conversely, if the result is false, then the "then" branch is discarded and the $else branch is retained. Here are some basic usage examples:
macro @foo($x, #y)
{
$if $x > 3:
#y += $x * $x;
$else
#y += $x;
$endif
}
const int FOO = 10;
fn void test()
{
int a = 5;
int b = 4;
@foo(1, a); // Allowed, expands to a += 1;
// @foo(b, a); // Error: b is not a compile time constant.
@foo(FOO, a); // Allowed, expands to a += FOO * FOO;
}
For switching between multiple possibilities, use $switch.
macro @foo($x, #y)
{
$switch $x:
$case 1:
#y += $x * $x;
$case 2:
#y += $x;
$case 3:
#y *= $x;
$default:
#y -= $x;
$endswitch
}
Switching without passing a value argument to $switch itself is also allowed (much like normal switch), which works like an if-else chain in that it permits arbitrary conditional expressions per case instead of only allowing a specific constant per case:
macro @foo($x, #y)
{
$switch:
$case $x > 10:
#y += $x * $x;
$case $x < 0:
#y += $x;
$default:
#y -= $x;
$endswitch
}
Loops using $foreach and $for¶
$for ... $endfor works analogously to for, only it is limited to using compile time variables. $foreach ... $endforeach similarly
matches the behaviour of foreach.
Compile time looping:
macro foo($a)
{
$for var $x = 0; $x < $a; $x++:
io::printfn("%d", $x);
$endfor
}
fn void test()
{
foo(2);
// Expands to ->
// io::printfn("%d", 0);
// io::printfn("%d", 1);
}
Looping over enums:
macro foo_enum($SomeEnum)
{
$foreach $x : $SomeEnum::values:
io::printfn("%d", (int)$x);
$endforeach
}
enum MyEnum
{
A,
B,
}
fn void test()
{
foo_enum(MyEnum);
// Expands to ->
// io::printfn("%d", (int)MyEnum.A);
// io::printfn("%d", (int)MyEnum.B);
}
Note
The content of the $foreach or $for body must be at least a complete statement.
It's not possible to compile partial statements.
Compile time macro execution¶
If a macro only takes compile time parameters, that is only $-prefixed parameters, and then does not generate any other statements than returns, then the macro will be completely compile time executed.
This constant evaluation allows us to write some limited compile time code. For example, this macro will compute Fibonacci numbers at compile time:
macro long @fib(long $n)
{
$if $n <= 1:
return $n;
$else
return @fib($n - 1) + @fib($n - 2);
$endif
}
It is important to remember that if we had replaced $n with n the compiler would have complained. n <= 1 is not considered to be a constant expression, even if the actual argument to the macro was a constant. This limitation is deliberate, to offer control over what is compiled out and what isn't.
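As a usage sketch, the macro above can be called with a constant argument. The surrounding main function and variable name are only illustrative:

```c3
import std::io;

macro long @fib(long $n)
{
    $if $n <= 1:
        return $n;
    $else
        return @fib($n - 1) + @fib($n - 2);
    $endif
}

fn void main()
{
    // The call is evaluated during compilation, so this is
    // equivalent to writing "long tenth = 55;".
    long tenth = @fib(10);
    io::printfn("%d", tenth);
}
```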
Conditional compilation at the top level using @if¶
At the top level (where globals are declared; such as functions, variables, etc), conditional compilation is controlled by appending @if attributes onto declarations:
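A minimal sketch, assuming the platform constants env::WIN32 and env::POSIX from the standard library. The function and struct names are hypothetical:

```c3
// Only compiled when targeting Windows.
fn void setup_platform() @if(env::WIN32)
{
    /* ... */
}

// Only compiled when targeting a POSIX platform.
fn void setup_platform() @if(env::POSIX)
{
    /* ... */
}

struct Config
{
    int a;
    // The member only exists on Windows targets.
    int b @if(env::WIN32);
}
```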
The argument to @if must be resolvable to a constant at compile time. This means that the argument may also be a compile time evaluated macro:
macro bool @foo($x) => $x > 2;
int x @if(@foo(5)); // Will be included
int y @if(@foo(0)); // Will not be included
In contrast though, attempts to use more general-purpose compile-time features such as $if at the top level will cause compilation failure. Compare:
// Compiles:
fn void func_a() @if(true)
{
//...
}
// Doesn't compile:
$if true:
fn void func_b()
{
//...
}
$endif
For the motivation behind using @if (and a limited subset of other compile-time constructs such as $assert) at the top level instead of allowing arbitrary compile-time evaluation, see the discussion of top level evaluation on the macro page.
Evaluation order of top level conditional compilation¶
Conditional compilation at the top level can cause unexpected ordering issues, especially when combined with
$defined. At a high level, there are three phases of evaluation:
1. Non-conditional declarations are registered.
2. Conditional module sections are either discarded or have all of their non-conditional declarations registered.
3. Each module in turn will evaluate @if attributes for each module section.
The order of module and module section evaluation in (2) and (3) is not deterministic and any use of $defined should not
rely on this ordering.
Compile time introspection¶
At compile time, full type information is available. This allows for creation of reusable, code generating macros for things like serialization.
sz foo_alignment = Foo::alignment;
sz foo_member_count = Foo::members.len;
String foo_name = Foo::name;
To read more about all the fields available at compile time, see the page on reflection.
Compile time functions¶
The following is a list of functions available at compile time:
$assert¶
Check a condition at compile time.
$defined¶
This highly versatile compile time function returns true if a type or identifier is defined. It can also be used on an expression, returning "true" if the outermost expression is valid. Similarly, it can be used with a declaration, e.g. $defined(int a = foo) to verify that it's valid to declare a variable with the given argument.
However, be aware that $defined is for handling well-defined expressions, not arbitrary syntax. Invalid code placed inside $defined will cause compilation to fail, not return false.
See reflection.
$echo¶
Print a message to stdout when compiling the code.
$embed¶
Embed binary data from a file. See the "including binary data" section of the expressions page to see a few different usage examples.
This is useful for bundling any necessary data inside the executable or library itself so that there is no need for managing separate files when the program is redistributed to users. Such embedded data is fixed at compile time though, and so $embed shouldn't be used for files that need to persist changes between invocations of the program (e.g. work documents, saved games, etc). However, once loaded, $embed data is just arbitrary run-time data and thus you can still create and modify whatever other data you want based on it during each program run.
For example:
char[*] img_data = $embed("some_image.png");
import std::io;
fn void main()
{
io::printn(img_data);
// Prints an image's raw data
// as an array of unsigned bytes.
}
$error¶
When this is compiled, issue a compile time error.
$eval¶
Converts a compile time string to the corresponding variable or function. See reflection.
$exec¶
Execute a script at compile time and include the result in the source code.
$expand¶
Convert any compile time string into code at compile time.
$feature¶
Check if a given feature is enabled. Features are passed using -D <FEATURE_NAME> on the command line.
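A minimal sketch, assuming a hypothetical feature named VERBOSE that is enabled by passing -D VERBOSE on the command line:

```c3
import std::io;

fn void main()
{
    // Only one of the two branches is compiled into the program.
    $if $feature(VERBOSE):
        io::printn("Verbose mode enabled");
    $else
        io::printn("Verbose mode disabled");
    $endif
}
```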
$include¶
Includes a file into the current file at the top level as raw text, resulting in that file's text being compiled as if directly written into the location of the $include.
As an important limitation, the text may not include a module statement.
Note that if pure data inclusion is what you want then $embed may be more helpful than $include, and if you want dynamic data, $exec may be better.
$vaarg¶
This is the interface for accessing macro raw vaargs, see the section on macro vaargs.
$stringify¶
Turn an expression into a string. This is typically used with expression parameters (# prefixed parameters) in macros.
Such stringification is very useful for debug printing and code generation, among other things. For example:
import std::io;
macro @show(#expr)
{
io::printfn("%s == %s", $stringify(#expr), #expr);
}
macro @announce(#expr)
{
io::printn($stringify(#expr));
#expr;
}
fn void main()
{
int num = 0;
@show(num);
@announce(num += 5);
@show(num);
}
This eliminates redundancy when print debugging. This code could be refined, for example by making @show handle Optionals correctly, but the simple version above is less distracting. As you can see, code can be annotated for temporary print debugging very easily by using $stringify-based expression macros.
$Typeof¶
Get the type of an expression at compile time, without ever evaluating it at run time and thus without causing side effects.
For example, the following C3 test passes:
fn void typeof_has_no_side_effects() @test
{
int minutes_left = 20;
$assert($Typeof(minutes_left += 10)::name == "int");
assert(minutes_left == 20);
// The state of `minutes_left` above never changes.
}
$Typefrom¶
Get a type from a compile time constant typeid. It can also convert a compile-time string to the corresponding type.
See reflection.
Reflection
C3 allows both compile time and runtime reflection.
During compile time, some type information is available in the form of compile time constants associated with each type.
Runtime type information is also available by retrieving a typeid from a runtime object (such as from an object of type any via <runtime_obj>.type most commonly) and then comparing the properties of the returned runtime typeid against the corresponding properties (if any) of the compile time equivalent <Type>::typeid. Note however that run time typeids currently have a much smaller set of available properties.
See the documentation about the any type for more information if you want or need runtime reflection. Such runtime info can be switched on or conditionally checked (e.g. via <runtime_obj>.type == <Type>::typeid) to implement runtime polymorphism.
Compile time information about types is accessed using ::, e.g. MyType::size.
For values use $reflect(<value>) to access the reflected properties for the underlying value.
The exception is $Typeof(<value>), which creates a type from the type of the value. There are convenience macros like @sizeof(<value>), @kindof(<value>) for immediately accessing reflection data without explicitly invoking $reflect.
Type properties & functions¶
The following type properties and functions are available:
- alignment (all runtime types)
- from_ordinal (constdef and enum only)
- has_equals
- is_ordered
- is_substruct (struct only)
- len (array, vector, enum, constdef - runtime available)
- lookup_field (enum)
- max / min (int and float types)
- members (struct, union, enum, bitstruct)
- methods (all non-optional runtime types)
- nan / inf (float types)
- inner (runtime types except int, float, struct and union types)
- kind (runtime available)
- name / qname / cname (cname is limited to all user-defined types)
- params (function types)
- parent (constdef, struct, typedef - runtime available)
- returns (function types)
- size (runtime available)
- typeid (all runtime types + untypedlist)
- get_tag / has_tag (user-defined types)
- values (constdef, enum)
alignment¶
Returns the alignment in bytes needed for the type.
from_ordinal¶
Only available for constdef and enum. Converts an integer value to the enum/constdef of that ordinal. In the case of constdef it might be different from the actual value.
has_equals¶
Is == and != supported.
is_ordered¶
Are all comparisons supported, either because the type has them built in or because they were added through operator overloading.
is_substruct¶
Only available for structs.
True if a struct has an inline member.
len¶
Returns the length of the array or vector. For enums and constdefs, it will return the number of constants.
lookup_field¶
Only available for enums.
Look up the enum value by matching the first associated value:
enum Foo : (int val)
{
ABC { 3 },
LIFF { 42 }
}
...
Foo? foo = Foo::lookup_field(val, 42); // Returns Foo.LIFF
max / min¶
Only available for integer and floating point types.
Returns the maximum / minimum value of the type.
members¶
Only available for enum, bitstruct, struct and union types.
Returns a compile time list containing the fields in a bitstruct, struct or union. For enums it's the associated value declarations. The elements are of type reflected_ref, as if you had done $reflect on the element.
Note: As the list is an "untyped" list, you are limited to iterating and accessing it at compile time.
methods¶
This property returns the methods associated with a type as a constant array of strings.
Warning!
Methods are generally registered after types are registered, which means that the use of methods may return inconsistent results depending on where in the resolution cycle it is invoked. It is always safe to use inside a function.
nan / inf¶
Only available for floating point types.
Returns a representation of floating point "NaN" / "infinity".
inner¶
Returns the typeid of an "inner" type. What this means differs for each type:
- Array -> the array base type.
- Bitstruct -> underlying base type.
- Distinct -> the underlying type.
- Enum -> underlying enum base type.
- Pointer -> the type being pointed to.
- Vector -> the vector base type.
It is not defined for other types.
kind¶
Returns the underlying TypeKind as defined in std::core::types.
name / qname / cname¶
Returns the name of the type: qname is the qualified name, so adds the module path before the name. cname returns the external name, and as such isn't available for built-in types.
params¶
Only available for function pointer types. Returns a ReflectedParam struct for all function pointer parameters.
alias TestFunc = fn int(int x, double f);
String s = TestFunc::params[1].name; // "f"
typeid t = TestFunc::params[1].type; // double.typeid
parent¶
Only available for typedef, constdef, bitstruct and struct types.
Returns the typeid of the inline field.
returns¶
Only available for function types. Returns the typeid of the return type.
size¶
Returns the size in bytes for the given type, like C sizeof.
get_tag / has_tag¶
get_tag retrieves the value of a @tag defined on the type, has_tag is used to check if the tag exists.
typeid¶
Returns the typeid for the given type. Aliases will return the typeid of the underlying type. The typeid size is the same as that of an iptr.
values¶
Returns a slice containing the values of an enum or constdef.
Compile time functions¶
There are several built-in functions to inspect the code during compile time.
- $defined
- $eval
- $stringify
- $Typeof
- $Typefrom
- $reflect
$defined¶
Returns true when the expression(s) inside are defined and all sub expressions
are valid.
$defined(Foo); // => true
$defined(Foo.x); // => true
$defined(Foo.baz); // => false
Foo foo = {};
// Check if a method exists:
$if $defined(foo.call):
    // Check what the method accepts:
    $switch :
        $case $defined(foo.call(1)) :
            foo.call(1);
        $default :
            // do nothing
    $endswitch
$endif
// Other way to write that:
$if $defined(foo.call, foo.call(1)):
    foo.call(1);
$endif
The full list of what $defined can check:
- SomeType a = <expr> - checks if <expr> can be used to initialize a variable of type SomeType
- var $a = <expr> - checks if <expr> can be compile-time evaluated.
- *<expr> - checks if <expr> can be dereferenced, <expr> must already be valid
- <expr>[<index>] - checks if indexing is valid, <expr> and <index> must already be valid, and when possible to check at compile-time if <index> is out of bounds this will return false
- <expr>[<index>] = <value> - same as above, but also checks if <value> can be assigned, <expr>, <index> and <value> must already be valid
- <expr>.<ident1>.<ident2> - check if .<ident2> is valid, <expr>.<ident1> must already be valid ("ident" is short for "identifier")
- ident, #ident, @ident, IDENT, $$IDENT, $ident - check if identifier exists
- Type - check if the type exists
- &<expr> - check if you can take the address of <expr>, <expr> must already be valid
- &&<expr> - check if you can take the temporary address of <expr>, <expr> must already be valid
- $eval(<expr>) - check if the $eval evaluates to something valid, <expr> must already be valid
- <expr>(<arg0>, ...) - check that the arguments are valid for the <expr> macro/function, <expr> and all args must already be valid
- <expr>!! and <expr>! - check that <expr> is an optional, <expr> must already be valid
- <expr>? - check that <expr> is a fault, <expr> must already be valid
- <expr1> binary_operator <expr2> - check if the binary_operator (+, -, ...) is defined between the two expressions, both expressions must already be valid
- (<Type>)<expr> - check if <expr> can be casted to <Type>, both <Type> and <expr> must already be valid
Parts marked "must already be valid" are not themselves checked by $defined. If, for example, <expr> is not defined when trying (<Type>)<expr>, this will
result in a compile-time error rather than false.
$eval¶
Converts a compile time string to the corresponding variable:
int a = 123; // => a is now 123
$eval("a") = 222; // => a is now 222
$eval("mymodule::fooFunc")(a); // => same as mymodule::fooFunc(a)
$eval is limited to a single, optionally path prefixed, identifier.
Consequently methods cannot be evaluated directly:
struct Foo { ... }
fn int Foo.test(Foo* f) { ... }
fn void test()
{
void* test1 = &$eval("test"); // Works
void* test2 = &Foo.$eval("test"); // Works
// void* test3 = &$eval("Foo.test"); // Error
}
$reflect¶
Returns a reflection_ref of the expression. It can be queried for properties such as name, size, offset, alignment etc.
More information is forthcoming.
$stringify¶
Returns the expression as a string. $stringify has a special behaviour for handling macro expression parameters, where $stringify(#foo) will return the expression contained in #foo as a string, exactly as written in the macro call's arguments, rather than simply return "#foo".
Thus, for example:
import std::io;
macro @describe(#expr)
{
    io::printfn("The value of `%s` is `%s`.", $stringify(#expr), #expr);
}
fn void main()
{
    @describe(sz::size);
    // Prints:
    // The value of `sz::size` is `8`.
}
$Typeof¶
Returns the type of an expression or variable.
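As a small illustration, $Typeof can declare a temporary with the same type as an expression passed to a macro. The macro name @exchange and the variable names are hypothetical, chosen only for this sketch:

```c3
import std::io;

// Hypothetical helper: exchanges the values of two variables.
macro @exchange(#a, #b)
{
    // temp is declared with whatever type #a has at the call site.
    $Typeof(#a) temp = #a;
    #a = #b;
    #b = temp;
}

fn void main()
{
    int x = 1;
    int y = 2;
    @exchange(x, y);
    io::printfn("x = %d, y = %d", x, y);
}
```

Because the type is taken from the argument, the same macro works unchanged for double, pointers or user-defined types.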
$Typefrom¶
Get a type from a compile time constant typeid. It can also convert a compile-time string to the corresponding type.
Expression values through $reflect¶
$reflect gives access to different properties depending on the expression. To determine at compile time that some information is available use $defined($reflect(x).some_property).
alignment¶
Get the alignment of something. See reflection.
cname¶
This returns the external name of a symbol.
External names are the names written into the symbol table of the executable or library binary, which subsequently may later be used by other programs to call into the binary by linking to those names, such as via foreign function interfaces (FFI) from another language or via direct use of the binary interface (such as enabled by the ABI and library compatibility of C and C3).
The external name of a symbol in the built binary can be set by attaching an @export("<intended_symbol_name>") attribute.
On Linux, the nm shell command can be used to view the symbol table of a binary directly, which shows what names a foreign program would see when looking at the binary. For example, try running nm path/to/binary &> nm_out.txt then viewing the nm_out.txt file. The &> combines both normal (stdout) and error (stderr) output into the file, whereas just > would redirect only normal (stdout) output.
On Windows, you can try dumpbin /SYMBOLS for debug builds, dumpbin /EXPORTS for libraries, or dumpbin /IMPORTS for executables, but it may not help as much since large parts of the symbol table may be missing and hence misleading. There may also be tools available only in Visual Studio or associated with it, since Microsoft designs it that way intentionally to encourage programs to be built the way Microsoft wants.
On Mac, try otool, nm, or objdump. Running brew install binutils beforehand may help.
name¶
Get the local name of a symbol. Local names (a.k.a. unqualified names) are the "leaf nodes" (the very last item) of the full namespace path to a symbol.
For example, $reflect(io::printn).name is printn.
offset¶
Get the offset of a member.
qname¶
Get the qualified name of a symbol.
Qualified names are the full ("absolute") namespace paths needed to reach a symbol.
For example, $reflect(io::printn).qname is std::io::printn.
size¶
Return the size of an expression in bytes.
Any & Interfaces
Working with the type of any at runtime.¶
The any type is recommended for writing code that is polymorphic at runtime where macros are not appropriate.
It can be thought of as a typed void*.
An any can be created by assigning any pointer to it. You can then query the any type for the typeid of
the enclosed type (the type the pointer points to) using the type field.
This allows switching over the typeid, using a normal switch:
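A minimal sketch of such a switch follows. The function name describe is hypothetical, and the int::typeid spelling is an assumption that follows the Type::property form used elsewhere on this page; older compilers spell it int.typeid, so check against your compiler version.

```c3
import std::io;

// Hypothetical helper that reports what an `any` points to.
fn void describe(any a)
{
    switch (a.type)
    {
        case int::typeid:
            // The typeid was checked, so casting to int* is safe here.
            io::printfn("int: %d", *(int*)a);
        case double::typeid:
            io::printfn("double: %f", *(double*)a);
        default:
            io::printn("some other type");
    }
}

fn void main()
{
    int i = 42;
    double d = 1.5;
    any a = &i;
    describe(a);
    a = &d;
    describe(a);
}
```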
Sometimes one needs to manually construct an any-pointer, which
is typically done using the any_make function: any_make(ptr, type)
will create an any pointing to ptr and with typeid type.
Since the runtime typeid is available, we can query any typeid property that is available
at runtime, for example the size, e.g. my_any.type.size. This allows us to do a lot of work
on the enclosed data without knowing the details of its type.
For example, this would make a copy of the data and place it in the variable any_copy:
void* data = malloc(a.type.size);
mem::copy(data, a.ptr, a.type.size);
any any_copy = any_make(data, a.type);
Variable argument functions with implicit any¶
Regular typed varargs are of a single type, e.g. fn void abc(int x, double... args).
To accept variable arguments of multiple types, any may be used.
There are two variants:
Explicit any vararg functions¶
This type of function has a format like fn void vaargfn(int x, any... args). Because only
pointers may be passed to an any, the arguments must explicitly be pointers (e.g. vaargfn(2, &b, &&3.0)).
While explicit, this may be somewhat less user-friendly than implicit vararg functions:
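A sketch of the explicit form, using the vaargfn signature from the text. The function body is an assumption made for illustration, and it only uses the runtime size property of the typeid:

```c3
import std::io;

fn void vaargfn(int x, any... args)
{
    io::printfn("x = %d, extra arguments: %d", x, args.len);
    foreach (arg : args)
    {
        // arg.type is the typeid of the value the any points to.
        io::printfn("argument of size %d", arg.type.size);
    }
}

fn void main()
{
    int b = 7;
    // Every variadic argument must be a pointer: &b for a variable,
    // && to take the address of a temporary.
    vaargfn(2, &b, &&3.0);
}
```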
Implicit any vararg functions¶
The implicit any vararg function has instead a format like fn void vaanyfn(int x, args...).
Calling this function will implicitly cause taking the pointer of the values (so for
example in the call vaanyfn(2, b, 3.0), what is actually passed are &b and &&3.0).
Because this passes values implicitly by reference, care must be taken not to mutate any values passed in this manner. Doing so would very likely break user expectations.
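The implicit form can be sketched in the same way, using the vaanyfn signature from the text. The body is again an assumption for illustration; it only reads the arguments, in line with the advice not to mutate them:

```c3
import std::io;

fn void vaanyfn(int x, args...)
{
    // Inside the function each element of args is an any.
    foreach (arg : args)
    {
        io::printfn("argument of size %d", arg.type.size);
    }
}

fn void main()
{
    int b = 7;
    // No & or && is needed: the pointers are taken implicitly.
    vaanyfn(2, b, 3.0);
}
```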
Interfaces¶
Most statically typed object-oriented languages implement extensibility using virtual pointer tables (vtables). In C, and by extension C3, this is possible to emulate by passing around structs containing a pointer to a list of function pointers in addition to the data.
While this is efficient and often the best solution, it puts certain assumptions on the code and makes interfaces more challenging to evolve over time.
As an alternative, there are languages (such as Objective-C) which instead use message passing to dynamically typed objects, where the availability of functionality may be queried at runtime.
C3 provides this latter functionality over the any type using interfaces.
Defining an interface¶
The first step is to define an interface:
While myname will behave as a method, we declare it without a type. Note here that unlike normal methods we leave
out the first "self" argument.
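A minimal sketch of such a declaration, consistent with the Baz.myname implementation used in the rest of this section:

```c3
interface MyName
{
    // Declared without the "self" parameter that the implementing method has.
    fn String myname();
}
```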
Implementing the interface¶
To declare that a type implements an interface, add it after the type name:
struct Baz (MyName)
{
int x;
}
// Note how the first argument differs from the interface.
fn String Baz.myname(Baz* self) @dynamic
{
return "I am Baz!";
}
If a type declares an interface but does not implement its methods then that is a compile time error.
A type may implement multiple interfaces by placing them all inside of (), e.g. struct Foo (VeryOptional, MyName) { ... }.
A limitation is that only user-defined types may declare they are implementing interfaces. To make existing types implement interfaces is possible but does not provide compile time checks.
One of the interfaces available in the standard library is Printable, which contains to_format and to_new_string.
If we implemented it for our struct above it might look like this:
fn String Baz.to_new_string(Baz* baz, Allocator allocator) @dynamic
{
return string::printf("Baz(%d)", baz.x, allocator: allocator);
}
@dynamic methods¶
A method must be declared @dynamic to implement an interface, but a method may also be declared @dynamic without the type declaring it implementing a particular interface. For example, this allows us to write:
// This will make "int" satisfy the MyName interface
fn String int.myname(int*) @dynamic
{
return "I am int!";
}
@dynamic methods have their reference retained in the runtime code and can also be searched for at runtime and invoked
from the any type.
Referring to an interface by pointer¶
An interface, e.g. MyName, can be cast back and forth to any, but only types which
implement the interface completely may implicitly be cast to the interface.
So for example:
Baz b = { 1 };
double d = 0.5;
int i = 3;
MyName a = &b; // Valid, Baz implements MyName.
// MyName c = &d; // Error, double does not implement MyName.
MyName c = (MyName)&d; // Would break at runtime as double doesn't implement MyName
// MyName z = &i; // Error, no implicit conversion because int doesn't explicitly declare MyName.
MyName z = (MyName)&i; // Explicit conversion works and is safe at runtime if int implements "myname"
Calling dynamic methods¶
Methods implementing interfaces are like normal methods, and if called directly, they are just normal function calls. The difference is that they may be invoked through the interface:
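A minimal sketch of such a call, repeating the MyName and Baz definitions so that the example stands alone. The helper name whoareyou is an assumption made for illustration:

```c3
import std::io;

interface MyName
{
    fn String myname();
}

struct Baz (MyName)
{
    int x;
}

fn String Baz.myname(Baz* self) @dynamic
{
    return "I am Baz!";
}

// Hypothetical helper: the call is dispatched through the interface.
fn void whoareyou(MyName a)
{
    io::printn(a.myname());
}

fn void main()
{
    Baz b = { 1 };
    whoareyou(&b); // Baz* converts implicitly because Baz declares MyName.
}
```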
If we have an optional method we should first check that it is implemented:
interface VeryOptional
{
fn void do_something(int x, void* ptr) @optional;
}
fn void do_something(VeryOptional z)
{
if (&z.do_something)
{
z.do_something(1, null);
}
}
We first query if the method exists on the value. If it does we actually run it.
Here is another example, showing how the correct function will be called depending on type, checking
for methods on an any:
fn void whoareyou2(any a)
{
MyName b = (MyName)a;
// Query if the function exists
if (!&b.myname)
{
io::printn("I don't know who I am.");
return;
}
// Dynamically call the function
io::printn(b.myname());
}
fn void main()
{
int i;
double d;
Baz baz;
any a = &i;
whoareyou2(a); // Prints "I am int!"
a = &d;
whoareyou2(a); // Prints "I don't know who I am."
a = &baz;
whoareyou2(a); // Prints "I am Baz!"
}
Subtype inheritance¶
A struct with an "inline" member or a typedef which is declared with "inline", will
inherit dynamic methods from its inline "parent". This inheritance is not
available for "inline" enums.
struct BazChild
{
inline Baz b;
int x;
}
fn void main()
{
BazChild bp;
any a = &bp;
whoareyou2(a); // Prints "I am Baz!"
}
Reflection invocation¶
This functionality is not yet implemented and may see syntax changes.
It is possible to retrieve any @dynamic function by name and invoke it:
alias VoidMethodFn = fn void(void*);
// The method returns void so that it matches VoidMethodFn.
fn void int.test_something(&self) @dynamic
{
    io::printfn("Testing: %d", *self);
}
fn void main()
{
    int z = 321;
    any a = &z;
    VoidMethodFn test_func = a.reflect("test_something");
    test_func(a); // Will print "Testing: 321"
}
This feature allows methods to be linked up at runtime.
Operator Overloading
C3 has operator overloading for working with containers and for creating numerical types.
Overloads for containers¶
"Element at" operator []¶
Implementing [] allows a type to use the my_type[<value>] syntax:
It's possible to use any type as the argument, such as a string:
Only a single [] overload is allowed.
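As a sketch of both cases above, the following defines an integer-indexed [] and a string-indexed []. The types Foo and Bar and their fields are hypothetical. Since only one [] overload is allowed per type, the string-indexed variant lives on a second type:

```c3
import std::io;

struct Foo
{
    double[] x;
}

// Integer index: enables f[i].
fn double Foo.get(&self, sz i) @operator([])
{
    return self.x[i];
}

struct Bar
{
    double[] x;
}

// String index: enables b["..."]. Here the string length picks the element.
fn double Bar.get(&self, String str) @operator([])
{
    return self.x[str.len];
}

fn void main()
{
    double[3] values = { 1.0, 2.0, 3.0 };
    Foo f = { values[..] };
    Bar b = { values[..] };
    io::printfn("%f", f[1]);    // element at index 1
    io::printfn("%f", b["ab"]); // "ab" has length 2, so element at index 2
}
```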
"Element ref" operator &[]¶
Similar to [], the &[] operator returns a value for &my_type[<value>], which may
be retrieved in a different way. If this overload isn't defined, then &my_type[<value>] would
be a syntax error.
"Element set" operator []=¶
This operator, the assignment counterpart of [], allows setting an element using my_type[<index>] = <value>.
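The two overloads above might be added to a slice-backed type as follows. This is a sketch in which the type Foo and the method names get_ref and set are assumptions:

```c3
import std::io;

struct Foo
{
    double[] x;
}

fn double Foo.get(&self, sz i) @operator([])
{
    return self.x[i];
}

// Enables &f[i].
fn double* Foo.get_ref(&self, sz i) @operator(&[])
{
    return &self.x[i];
}

// Enables f[i] = value.
fn void Foo.set(&self, sz i, double new_val) @operator([]=)
{
    self.x[i] = new_val;
}

fn void main()
{
    double[3] values = { 1.0, 2.0, 3.0 };
    Foo f = { values[..] };
    f[0] = 10.0;        // Uses Foo.set
    double* p = &f[1];  // Uses Foo.get_ref
    *p = 20.0;
    io::printfn("%f %f", f[0], f[1]);
}
```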
"len" operator¶
Unlike the previous operator overloads, the "len" operator simply enables functionality
which augments the []-family of operators: you can use the "from end" syntax, e.g. my_type[^1],
to get the last element, assuming the indexing uses integers.
Enabling foreach¶
In order to use a type with foreach, e.g. foreach(d : foo), at a minimum, methods
with overloads for [] (@operator([])) and len (@operator(len)) need to be added.
If &[] is implemented, foreach by reference will be enabled (e.g. foreach(double* &d : foo)).
fn double Foo.get(&self, sz i) @operator([])
{
return self.x[i];
}
fn sz Foo.len(&self) @operator(len)
{
return self.x.len;
}
fn void test(Foo f)
{
// Print all elements in f
foreach (d : f)
{
io::printfn("%f", d);
}
}
Operator overloading for numerical types¶
+ - * / % together with unary minus and plus, bit operators ^ | & and << >> are available for overloading
numerical types. These overloads are limited to user-defined types.
Symmetric and reverse operators¶
For numerical types, @operator_s (defining a symmetric operator)
and @operator_r (defining a reverse operator) are available.
These are only available when matching different types. For example,
defining + between a Complex number and a double can look like this:
macro Complex Complex.add_double(self, double d) @operator_s(+)
{
    return self.add(complex_from_real(d));
}
The above would match both "Complex + double" and "double + Complex",
with the actual evaluation order of the arguments happening in
the expected order, meaning something like get_double() + get_complex()
would always evaluate the arguments from left to right.
As for @operator_r, it is useful in the case where the evaluation isn't symmetric:
macro Complex Complex.double_sub_this(self, double d) @operator_r(-)
{
return complex_from_real(d).sub(self);
}
The above would define "double - Complex" but not "Complex - double".
Resolving overloads¶
Numerical operators that take more than one operand can be properly overloaded,
so we can for example write a different + for adding Complex to int
as opposed to "Complex + double".
However, if "Complex + int" doesn't exist then the integer value will follow the normal conversion rules to implicitly cast it to a double!
More formally the resolution works in this manner:
- Is there an exact match to the second argument? If so, then this is picked.
- Is there a way to match by implicitly casting the second argument? If there is only one match, then this is picked. If there are multiple matches, then the operation is ambiguous and will be considered an error.
struct Foo
{
    float a;
}
fn Foo Foo.minus_float(self, float f) @operator(-) => { .a = self.a - f };
// The explicit cast narrows the double result back to float.
fn Foo Foo.minus_double(self, double d) @operator(-) => { .a = (float)(self.a - d) };
fn void main()
{
    Foo x = { 1.0f };
    Foo y = { 2.2f };
    Foo zf = x - 2.0f; // Uses Foo.minus_float
    Foo zi = x - 2; // ERROR: Ambiguous, implicitly cast value matches both overloads.
}
Bitstructs and bit operations¶
As a special rule, bitstructs may not overload ^ & |, as these operations are already
defined on bitstructs.
Combined assignment operators¶
If + is defined for a type, then += is defined as well, and similarly for the
other operators. However, it is also possible to explicitly override the combined assignment
operators to optimize those cases.
struct Foo
{
int a;
}
fn Foo Foo.add(self, Foo other) @operator(+) => { .a = self.a + other.a };
fn Foo Foo.add_self(&self, Foo other) @operator(+=)
{
self.a += other.a;
return *self;
}
fn void main()
{
Foo x = { 1 };
Foo y = { 2 };
Foo z = x + y; // Uses Foo.add
x += y; // Uses Foo.add_self
}
Operator overloading for ==¶
Overloading == is, like overloading arithmetic operators, only allowed on user-defined types.
Operator overloading for <¶
If < and == are implemented, then the type supports all ordering operations: < <= == != >= >.
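As a sketch of the two sections above, a type that overloads both == and < could look like this. The type Version and the method names are hypothetical, and whether @operator(<) is accepted depends on the compiler version:

```c3
struct Version
{
    int major;
    int minor;
}

fn bool Version.equals(self, Version other) @operator(==)
{
    return self.major == other.major && self.minor == other.minor;
}

fn bool Version.less(self, Version other) @operator(<)
{
    if (self.major != other.major) return self.major < other.major;
    return self.minor < other.minor;
}

fn void main()
{
    Version a = { 1, 2 };
    Version b = { 1, 3 };
    assert(a < b);   // Uses Version.less
    assert(a != b);  // Derived from Version.equals
    assert(b >= a);  // Derived from the two overloads
}
```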
Note
Some words of caution
Operator overloading should always be written to behave in the same manner
as the operators behave when used with builtin types. For example: + should be used for addition, not concatenation. << should be used for left bitshift, not to append values to an array or print things to stdout.
Violating the expected behaviour of operators is why operator overloading is often frowned upon despite its usefulness. Operator overloading that follows expectations can make the code clearer and easier to read. Violating expectations on the other hand obfuscates the code and makes it harder to read and understand and hence also harder to safely share and reuse. It is bad style and poor taste.
Build Your Project
Build Commands
Building a project is done by invoking the C3 compiler with the build or run command inside of the project structure. The compiler will search upwards in the file hierarchy until a project.json file is found.
You can also customize the project build config.
Compile Individual Files¶
By default the compiler will compile a stand-alone file as an executable binary, rather than as a static or dynamic library.
The resulting executable binary will be given the same name as whichever C3 file contains the main function.
Alternatively, libraries can be compiled via c3c static-lib or c3c dynamic-lib or by creating a project configured as such and built via c3c build and c3c run and so on.
Run¶
When starting out with C3, it's natural to use compile-run to try things out. For larger projects, the built-in build system is recommended instead.
The compile-run command works the same as normal compilation (via compile, build, etc), but also immediately runs the resulting executable.
Common additional parameters¶
Additional parameters:
- --lib <path> add a library to search.
- --output <path> override the output directory.
- --path <path> execute as if standing at <path>
Init a new project¶
Create a new project structure in the current directory.
Use the --template command option to select a template. The following are built in:
- exe - the default template, produces an executable.
- static-lib - template for producing a static library.
- dynamic-lib - template for producing a dynamic library.
It is also possible to give the path to a custom template.
Additional parameters:
- --template <path> indicate an alternative template to use.
For example, c3c init hello_world creates the following structure:
.
├─ build/
├─ docs/
├─ lib/
├─ resources/
├─ scripts/
├─ src/
│ └─ main.c3
├─ test/
├─ LICENSE
├─ project.json
└─ README.md
Check the project configuration docs to learn more about configuring your project.
Test¶
Runs any tests found in the "sources" directories defined in your project.json.
Tests are defined with a @test attribute. For example:
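A minimal sketch of a test function. The function add and the test name are hypothetical:

```c3
// Hypothetical function under test.
fn int add(int a, int b)
{
    return a + b;
}

// Picked up by the test runner because of the @test attribute.
fn void test_add() @test
{
    assert(add(1, 2) == 3);
    assert(add(-1, 1) == 0);
}
```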
Build¶
Build the project in the current path. It doesn't matter where in the project structure you are.
The built-in templates define two targets: debug (which is the default) and release.
Clean¶
Removes some of the generated build artifacts of the previous builds of the C3 project. In most cases this is unnecessary.
Build and Run¶
Build the target (if needed) and run the executable.
Clean and Run¶
Clean, build and run the target.
Dist¶
Clean, build and package the target for distribution to end users.
For convenience, the c3c dist command will also run the target afterwards if it is an executable, as it is likely you will want to check that the program is still working.
You should also transfer the distribution package to a clean machine and test that the application works correctly there too at a minimum. Otherwise, there is a high risk that your application will be broken due to some dependencies existing on your machine that don't exist on end users' machines. Developers' machines often have many more libraries already installed than users' machines, hence users' machines are far more likely to lack necessary dependencies. It is hard to reliably discern without testing.
Caution
c3c dist has not been properly added yet!
Docs¶
Not added yet!
Rebuilds the documentation based upon whatever documentation comments and contracts have been written into your C3 code, so that you and other programmers working on your project can easily get a more expedient and more readily navigable overview of what things are available and what they do and how to use them.
This is what is known as a "documentation generation" or "docgen" system. The most common example of a documentation generator in the C and C++ ecosystem is perhaps Doxygen (a 3rd party tool) but many other languages have their own built-in documentation generators.
Alternatively, if you do not want to maintain documentation for your project, such as if your project is too transient for documentation to matter or if you want to potentially iterate faster, then consider instead at least ensuring that you have lots of unit tests (via the @test attribute) and assertions ($assert and assert) in your code so that the "self documenting" qualities of your codebase are maximized. Be aware however that some things can never be adequately expressed in any amount of "self documenting" code. Each approach has tradeoffs. A balanced mix is also a good approach.
Benchmark¶
Runs benchmarks on a target, meaning that every function that has been annotated with @benchmark will be run and have its performance profiled (including time spent and a CPU cycle count) so that you can easily monitor opportunities for optimization and avoid computational waste.
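A benchmark function is annotated in the same way as a test. In this sketch the function name and the workload are assumptions chosen only to give the profiler something to measure:

```c3
// Hypothetical workload: sums the first million integers.
fn void bench_sum() @benchmark
{
    long sum = 0;
    for (int i = 0; i < 1000000; i++)
    {
        sum += i;
    }
    // Checking the result keeps the loop from being trivially unused.
    assert(sum > 0);
}
```

With optimizations enabled the compiler may simplify a loop like this heavily, so real benchmarks should measure work whose result depends on runtime data.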
Customizing A Project¶
This is a description of the configuration options in project.json:
(you can see the full list by executing c3c --list-project-properties)
{
// Language version of C3.
"langrev": "1",
// Warnings used for all targets.
"warnings": [ "no-unused" ],
// Directories where C3 library files may be found.
"dependency-search-paths": [ "lib" ],
// Libraries to use for all targets.
"dependencies": [ ],
// Authors, optionally with email.
"authors": [ "John Doe <[email protected]>" ],
// Version using semantic versioning.
"version": "0.1.0",
// Sources compiled for all targets.
"sources": [ "src/**" ],
// C sources if the project also compiles C sources
// relative to the project file.
// "c-sources": [ "csource/**" ],
// Include directories for C sources relative to the project file.
// "c-include-dirs": [ "csource/include" ],
// Output location, relative to project file.
"output": "../build",
"targets": {
"my_app": {
// Executable or library.
"type": "executable",
// Architecture and OS target.
// You can use 'c3c --list-targets' to list all valid targets.
// "target": "linux-x64",
// Current Target options:
// android-aarch64
// elf-aarch64 elf-riscv32 elf-riscv64 elf-x86 elf-x64 elf-xtensa
// mcu-x86 mingw-x64 netbsd-x86 netbsd-x64 openbsd-x86 openbsd-x64
// freebsd-x86 freebsd-x64 ios-aarch64
// linux-aarch64 linux-riscv32 linux-riscv64 linux-x86 linux-x64
// macos-aarch64 macos-x64
// wasm32 wasm64
// windows-aarch64 windows-x64
// Additional libraries, sources
// and overrides of global settings here.
},
},
// Global settings.
// C compiler if the project also compiles C sources
// defaults to 'cc'.
"cc": "cc",
// C compiler flags
"cflags": "",
// Set the include directories for C sources.
"c-include-dirs": "",
// CPU name, used for optimizations in the LLVM backend.
"cpu": "generic",
// Debug information, may be "none", "full" and "line-tables".
"debug-info": "full",
// FP math behaviour: "strict", "relaxed", "fast".
"fp-math": "strict",
// Link libc and other default libraries.
"link-libc": true,
// Memory environment: "normal", "small", "tiny", "none".
"memory-env": "normal",
// Optimization: "O0", "O1", "O2", "O3", "O4", "O5", "Os", "Oz".
"opt": "O0",
// Code optimization level: "none", "less", "more", "max".
"optlevel": "none",
// Code size optimization: "none", "small", "tiny".
"optsize": "none",
// Relocation model: "none", "pic", "PIC", "pie", "PIE".
"reloc": "none",
// Trap on signed and unsigned integer wrapping for testing.
"trap-on-wrap": false,
// Turn safety checks on or off (contracts, runtime bounds checking, null pointer checks etc).
"safe": true,
// Compile all modules together, enables more inlining.
"single-module": true,
// Use / don't use soft float, value is otherwise target default.
"soft-float": false,
// Strip unused code and globals from the output.
"strip-unused": true,
// The size of the symtab, which limits the amount
// of symbols that can be used. Should usually not be changed.
"symtab": 1048576,
// Use the system linker.
"linker": "cc",
// Include the standard library.
"use-stdlib": true,
// Set general level of x64 cpu: "baseline", "ssse3", "sse4", "avx1", "avx2-v1", "avx2-v2", "avx512", "native".
"x86cpu": "native",
// Set max type of vector use: "none", "mmx", "sse", "avx", "avx512", "native".
"x86vec": "sse",
// Enable sanitizer: none, address, memory, thread.
"sanitize": "none",
// Features enabled for all targets.
"features": "",
}
By default, an executable is assumed, but changing the type to "static-lib" or "dynamic-lib"
creates static library and dynamic library targets respectively.
Compilation options¶
The project file contains common settings at the top level that can be overridden by each
target by simply assigning that particular key. So if the top level defines target
to be macos-x64 and the actual target defines it to be windows-x64, then the windows-x64 target will be used for compilation.
Similarly, compiler command line parameters can be used in turn to override the target setting.
targets¶
The list of targets that can be built.
dependencies¶
List of C3 libraries (".c3l") to use when compiling the target.
sources¶
List of source files to compile.
test-sources¶
List of additional source files to compile when running tests.
cc¶
C compiler to use for compiling C sources (if C sources are compiled together with C3 files).
c-sources¶
List of C sources to compile, using the default C compiler.
linker-search-paths¶
This adds paths for the linker to search, when linking normal C libraries.
linked-libraries¶
This is a list of C libraries to link to. The names need to follow the normal
naming standard for how libraries are provided to the system linker.
So, for example, on Linux libraries have names like libfoo.a but when
presented to the linker the name is foo. As an example "linked-libraries": ["curl"]
would on Linux look for the library libcurl.a and libcurl.so in the
paths given by "linker-search-paths".
version¶
Not handled yet.
Version for the library. Will also be provided as a compile time constant.
authors¶
List of authors who are credited with creating and/or working on the project.
These can be accessed as lists using env::AUTHORS, which gives a list of names, and env::AUTHOR_EMAILS, which gives a list of their e-mails (where available).
Each entry is expected to have the form "First Last <e-mail>", for example "John Doe <[email protected]>".
langrev¶
Not handled yet.
The language revision to use.
features¶
This is a list of upper-case constants that can be tested for
in the source code using $feature(NAME_OF_FEATURE).
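As a sketch, a feature constant can guard code at compile time. The feature name VERBOSE_LOGGING is hypothetical and would need to be listed under "features" for the guarded line to be compiled in:

```c3
import std::io;

fn void main()
{
    $if $feature(VERBOSE_LOGGING):
        // Only compiled when the VERBOSE_LOGGING feature is enabled.
        io::printn("Verbose logging is enabled.");
    $endif
    io::printn("Program started.");
}
```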
warnings¶
Not completely supported yet.
List of warnings to enable during compilation.
opt¶
Optimization setting: O0, O1, O2, O3, O4, O5, Os, Oz.
Target options¶
type¶
This mandatory option should be one of:
- "executable" – a normal executable application.
- "dynamic-lib" - a dynamic library.
- "static-lib" - a static library.
- "benchmark" - target that only runs benchmarks.
- "test" - target that only runs tests.
- "object-files" - compile to object files, but does not perform any linking.
- "prepare" - target that does not perform any compilation, but may do things like invoking other scripts using "exec".
Using environment variables¶
Not supported yet.
In addition to constants, any values starting with $ will be assumed to be environment variables.
For example "$HOME" would on Unix-like systems (e.g. Linux, the BSDs, Mac) return the home directory. For strings that start with $ but should not be interpreted as an environment variable you need to escape it with a backslash (\). For example, the string "\$HOME" would be interpreted as the plain string "$HOME".
Language Rules
Implicit Conversions
Conversion Rules For C3¶
C3 differs in some crucial respects when it comes to number conversions and promotions. These are the rules for C3:
- float to int conversions require a cast.
- int to float conversions do not require a cast.
- bool to float converts to 0.0 or 1.0.
- Widening float conversions are only conditionally allowed*.
- Narrowing conversions require a cast*.
- Widening int conversions are only conditionally allowed*.
- In conditionals float to bool do not require a cast, any non-zero float value is considered true.
- Implicit conversion to bool only occurs in conditionals or when the value is enclosed in () e.g. bool x = (1.0) or if (1.0) { ... }
C3 uses two's complement arithmetic for all integer math.
Note
These abbreviations are used in the text below:
- "lhs" means "left hand side".
- "rhs" means "right hand side".
Target type¶
The left hand side of an assignment, or the parameter type in a call is known as the target type. The target type is used for implicit widening and inferring struct initialization.
Common arithmetic promotion¶
Like C, C3 uses implicit arithmetic promotion of integer and floating point variables before arithmetic operations:
- For any floating point type with a bitwidth smaller than 32 bits, widen to float. For example, in C3 float16 converts to float before arithmetic is performed.
- For an integer type smaller than the minimum arithmetic width, promote the value to a same-signed integer of the minimum arithmetic width. This usually corresponds to a C int or unsigned int. For example, in C3 ushort converts to uint before arithmetic is performed.
Implicit narrowing¶
An expression with an integer type may implicitly narrow to a smaller integer type, and similarly an expression with a float type may implicitly narrow to a less wide floating point type. Whether this is allowed is determined by the following algorithm:
- Shifts and assign look at the lhs expression.
- ++, --, ~, -, !!, ! - check the inner type.
- +, -, *, /, %, ^, |, &, ??, ?: - check both lhs and rhs.
- Narrowing int/float cast, assume the type is the narrowed type.
- Widening int/float cast, look at the inner expression, ignoring the cast.
- In the case of any other cast, assume it is opaque and the type is that of the cast.
- In the case of an integer literal, instead of looking at the type, check that the integer would fit the type to narrow to.
- For .len access, allow narrowing to C int width.
- For all other expressions, check against the size of the type.
As a rough guide: if all the sub expressions are originally small enough, it's ok to implicitly convert the result.
Examples:
float16 h = 12.0;
float f = 13.0;
double d = 22.0;
char x = 1;
short y = -3;
int z = 0xFFFFF;
ulong w = -0xFFFFFFF;
x = x + x; // => Calculated as `x = (char)((int)x + (int)x);`.
x = y + x; // => Error. Narrowing is not allowed because short y is wider than char x.
h = x * h; // => Calculated as `h = (float16)((float)x * (float)h);`.
h = f + x; // => Error. Narrowing is not allowed because float f is wider than float16 h.
Implicit widening¶
Unlike C, implicit widening will only happen on "simple expressions": if the expression is a primary expression, or a unary operation on a primary expression.
For assignment, special rules hold. For an assignment to a binary expression, if its two subexpressions are "simple expressions" and the binary expression is +, -, /, *, allow an implicit promotion of the two sub expressions.
int a = ...
short b = ...
char c = ...
long d = a; // Valid - simple expression.
int e = (int)(d + (a + b)); // Error
int f = (int)(d + ~b); // Valid
long g = a + b; // Valid
As a rule of thumb, if there is more than one possible conversion, then an explicit cast is needed.
Example:
long h = a + (b + c);
// Possible intention 1
long h = (long)(a + (b + c));
// Possible intention 2
long h = (long)a + (long)(b + c);
// Possible intention 3
long h = (long)a + ((long)b + (long)c);
Maximum type¶
The maximum type is a concept used when unifying two or more types. The algorithm follows:
- First perform implicit promotion.
- If both types are the same, the maximum type is this type.
- If one type is a floating point type, and the other is an integer type, the maximum type is the floating point type. E.g. int + float -> float.
- If both types are floating point types, the maximum type is the widest floating point type. E.g. float + double -> double.
- If both types are integer types with the same signedness, the maximum type is the widest integer type of the two. E.g. uint + ulong -> ulong.
- Compare the two types to the list: ichar, char, short, ushort, int, uint, long, ulong, int128, uint128, the max type is the one furthermost to the right in the list. Consequently ulong + int -> ulong, uint + int -> uint.
- If at least one side is a struct or a pointer to a struct with an inline directive on a member, check recursively if the type of the inline member can be used to find a maximum type (see below under substruct conversions).
- All other cases are errors.
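To make the algorithm concrete, the sketch below checks three of the cases listed above at compile time. It assumes that $Typeof can be applied to a binary expression and that the .typeid property can be compared in $assert; the expected types are taken directly from the examples in the list.

```c3
fn void main()
{
    int i = 1;
    uint u = 2;
    float f = 3.0f;
    double d = 4.0;

    // int + float -> float
    $assert $Typeof(i + f).typeid == float.typeid;
    // float + double -> double
    $assert $Typeof(f + d).typeid == double.typeid;
    // uint + int -> uint (uint is further right in the ordering list)
    $assert $Typeof(u + i).typeid == uint.typeid;
}
```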
Substruct conversions¶
A substruct may be used in place of its parent struct in many cases. The rules are as follows:
- A substruct pointer may implicitly convert to a parent struct.
- A substruct value may be implicitly assigned to a variable with the parent struct type. This will truncate the value, copying only the parent part of the substruct. However, a substruct value cannot be assigned its parent struct.
- Substruct slices and arrays cannot be cast (implicitly or explicitly) to an array of the parent struct type.
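A minimal sketch of the first two rules follows. Parent, Child and get_x are hypothetical names introduced for this illustration; the inline member is what makes Child a substruct of Parent.

```c3
struct Parent
{
    int x;
}

struct Child
{
    inline Parent base; // Child becomes a substruct of Parent
    int y;
}

fn int get_x(Parent* p)
{
    return p.x;
}

fn void main()
{
    Child c = { .base = { .x = 1 }, .y = 2 };
    Parent* p = &c;    // a substruct pointer converts implicitly
    Parent copy = c;   // truncating copy: only the Parent part is copied
    int x = get_x(&c); // the parameter type acts as the target type
}
```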
Pointer conversions¶
Pointer conversion between types usually needs explicit casts.
The exception is void* which any type may implicitly convert to or from.
Conversion rules from and to arrays are detailed under arrays.
Vector conversions¶
Vector conversions always need to be explicit. They work
as regular conversions with one notable exception: converting a true boolean
vector value into an int will yield a value with all bits set. So bool[<2>] { true, false }
converted to for example char[<2>] will yield { 255, 0 }.
Vectors can also implicitly be cast to the corresponding array type, for example: char[<2>] <=> char[2].
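The boolean vector case can be sketched as follows. The values in the comments restate the rule given above and are not independently measured output.

```c3
fn void main()
{
    bool[<2>] flags = { true, false };
    // Explicit vector conversion: per the rule above, true becomes all bits set.
    char[<2>] mask = (char[<2>])flags; // { 255, 0 }
    // Vector to the corresponding array type converts implicitly.
    char[2] arr = mask;
}
```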
Binary conversions¶
1. Multiplication, division, remainder, subtraction / addition with both operands being numbers¶
These operations are only valid for integer and float types.
- Resolve the operands.
- Find the maximum type of the two operands.
- Promote both operands to the resulting type if both are simple expressions
- The resulting type of the expression is the maximum type.
2. Addition with left side being a pointer¶
- Resolve the operands.
- If the rhs is not an integer, this is an error.
- If the rhs has a bit width that exceeds sz, this is an error.
- The result of the expression is the lhs type.
3. Subtraction with lhs pointer and rhs integer¶
- Resolve the operands.
- If the right hand type has a bit width that exceeds sz, this is an error.
- The result of the expression is the left hand type.
4. Subtraction with both sides pointers¶
- Resolve the operands.
- If either side is a void*, it is cast to the other type.
- If the types of the sides are different, this is an error.
- The result of the expression is sz.
- If this result exceeds the target width, this is an error.
6. Bit operations ^ & |¶
These operations are only valid for integers and booleans.
- Resolve the operands.
- Find the maximum type of the two operands.
- Promote both operands to the maximum type if they are simple expressions.
- The result of the expression is the maximum type.
7. Shift operations << >>¶
These operations are only valid for integers.
- Resolve the operands.
- In safe mode, insert a trap to ensure that rhs >= 0 and rhs < bit width of the left hand side.
- The result of the expression is the lhs type.
8. Assignment operations += -= *= /= %= ^= |= &=¶
- Resolve the lhs.
- Resolve the right operand as an assignment rhs.
- The result of the expression is the lhs type.
9. Assignment shift >>= <<=¶
- Resolve both operands
- In safe mode, insert a trap to ensure that rhs >= 0 and rhs < bit width of the left hand side.
- The result of the expression is the lhs type.
10. && and ||¶
- Resolve both operands.
- Insert bool cast of both operands.
- The type is bool.
11. <= == >= !=¶
- Resolve the operands, left to right.
- Find the maximum type of the two operands.
- Promote both operands to the maximum type.
- The type is bool.
Unary conversions¶
1. Bit negate¶
- Resolve the inner operand.
- If the inner type is not an integer this is an error.
- The type is the inner type.
2. Boolean not¶
- Resolve the inner operand.
- The type is bool.
3. Negation¶
- Resolve the inner operand.
- If the inner type is not a number, then this is an error.
- If the inner type is an unsigned integer, cast it to the same signed type.
- The type is the type of the result from (3).
4. & and &&¶
- Resolve the inner operand.
- The type is a pointer to the type of the inner operand.
5. *¶
- Resolve the inner operand.
- If the operand is not a pointer, or is a void* pointer, this is an error.
- The type is the pointee of the inner operand's type.
Dereferencing 0 is implementation defined.
6. ++ and --¶
- Resolve the inner operand.
- If the type is not a number, this is an error.
- The type is the same as the inner operand.
Base expressions¶
1. Typed identifiers¶
- The type is that of the declaration.
- If the width of the type is less than that of the target type, widen to the target type.
- If the width of the type is greater than that of the target type, it is an error.
2. Constants and literals¶
- If the constant is an integer, it is assumed to be the arithmetic promotion width and signed. Suffixes imply the following: 'u' - unsigned, 'ul' - unsigned 64-bit, 'ull' - unsigned 128-bit, 'l' - signed 64-bit, 'll' - signed 128-bit. If a constant does not fit in the arithmetic promotion width, the following rules apply: if decimal, promote to the smallest signed integer able to contain it, if hex, binary or octal, promote to the smallest signed or unsigned integer able to contain it.
- If the constant is a floating point value, it is assumed to be a double unless suffixed with f which is then assumed to be a float.
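The suffix rules can be illustrated with one declaration per case. This sketch follows only the suffixes named above; since the language is still evolving, verify them against your compiler version.

```c3
fn void main()
{
    int a = 100;        // no suffix: signed, arithmetic promotion width
    uint b = 100u;      // 'u': unsigned
    long c = 100l;      // 'l': signed 64-bit
    ulong d = 100ul;    // 'ul': unsigned 64-bit
    int128 e = 100ll;   // 'll': signed 128-bit
    uint128 f = 100ull; // 'ull': unsigned 128-bit
    double g = 1.5;     // no suffix: double
    float h = 1.5f;     // 'f': float
}
```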
Precedence rules in C3 differ from C/C++. Here are all precedence levels in C3, listed from highest (1) to lowest (12):
1. (), [], ., !! postfix !, !!, ~, ++ and --
2. prefix -, ~, !, !!, *, &, ++, --
3. infix *, /, %
4. <<, >>
5. ^, |, infix &
6. ?:, ??
7. +, infix -, +++
8. ==, !=, >=, <=, >, <
9. &&, &&&
10. ||, |||
11. ?
12. =, *=, /=, %=, +=, -=, <<=, >>=, &=, ^=, |=
The main difference is that bitwise operations and shifts have higher precedence than addition and subtraction in C3. Bitwise operations also have higher precedence than the relational operators. Also, there is no difference in precedence between the bitwise operators.
Examples:
a + b >> c + d
(a + b) >> (c + d) // C (+ - are evaluated before >>)
a + (b >> c) + d // C3 (>> is evaluated before + -)
a & b == c
a & (b == c) // C (bitwise operators are evaluated after relational)
(a & b) == c // C3 (bitwise operators are evaluated before relational)
a > b == c < d
(a > b) == (c < d) // C (< > binds tighter than ==)
((a > b) == c) < d // C3 Error, requires parenthesis!
a | b ^ c & d
a | ((b ^ c) & d) // C (All bitwise operators have different precedence)
((a | b) ^ c) & d // C3 Error, requires parenthesis!
The change in precedence of the bitwise operators corrects a long standing issue in the C specification. The change in precedence for shift operations goes towards making the precedence less surprising.
Conflating the precedence of relational and equality operations, and of all bitwise operations, was motivated by simplification: few remember the exact internal differences in precedence between bitwise operators. Parentheses are required for those conflated levels of precedence.
Left-to-right offers a very simple model to think about the internal order of operations, and encourages use of explicit ordering, as best practice in C is to use parentheses anyway.
Undefined Behaviour
Like C, C3 uses undefined behaviour. Unlike C, however, C3 will trap (that is, print an error trace and abort) on undefined behaviour in debug builds. This is similar to using C with a UB sanitizer. It is only during release builds that actual undefined behaviour occurs.
In C3, undefined behaviour means that the compiler is free to assume that the undefined behaviour cannot occur.
In the example below:
The case of x == 0 would invoke undefined behaviour for 255/x. For that reason,
the compiler may assume that x != 0 and compile it into the following code:
As a contrast, the safe build will compile code equivalent to the following.
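The three listings this passage refers to are not reproduced here. A sketch consistent with the description might look like the following, where foo() is a hypothetical external function and the three functions stand for the source as written, a possible release-build result, and a rough safe-build equivalent.

```c3
// Hypothetical helper, declared only so the sketch is complete.
extern fn uint foo();

// Source as written: x == 0 would make 255 / x undefined.
fn bool as_written()
{
    uint x = foo();
    uint z = 255 / x;
    return x != 0;
}

// What a release build may reduce it to, by assuming x != 0.
fn bool as_release_build()
{
    foo();
    return true;
}

// Roughly what a safe build does instead: check first, then trap.
fn bool as_safe_build()
{
    uint x = foo();
    if (x == 0) unreachable("Division by zero");
    return true;
}
```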
List of undefined behaviours¶
The following operations cause undefined behaviour in release builds of C3:
| operation | will trap in safe builds |
|---|---|
| int / 0 | Yes |
| int % 0 | Yes |
| reading explicitly uninitialized memory | Possible* |
| array index out of bounds | Yes |
| dereference null | Yes |
| dereferencing memory not allocated | Possible* |
| dereferencing memory outside of its lifetime | Possible* |
| casting pointer to the incorrect array | Possible* |
| violating pre or post conditions | Yes |
| violating asserts | Yes |
| reaching unreachable() code | Yes |
* "Possible" indicates trapping is implementation dependent.
List of implementation dependent behaviours¶
Some behaviour is allowed to differ between implementations and platforms.
| operation | will trap in safe builds | permitted behaviour |
|---|---|---|
| comparing pointers of different provenance | Optional | Any result |
| subtracting pointers of different provenance | Optional | Any result |
| shifting by an amount greater than or equal to the bit width | Yes | Any result |
| shifting by negative amount | Yes | Any result |
| conversion floating point <-> integer type is out of range | Optional | Any result |
| conversion between pointer types produces one with incorrect alignment | Optional | Any result / Error |
| calling a function through a function pointer that does not match the function | Optional | Any result / Error |
| attempt to modify a string literal | Optional | Partial modification / Error |
| modifying a const variable | Optional | Partial modification / Error |
List of undefined behaviour in C, which is defined in C3¶
Signed Integer Overflow¶
Signed integer overflow always wraps using two's complement.
Modifying the intermediate results of an expression¶
Behaves as if the intermediate result was stored in a variable on the stack.
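As a small illustration of the signed overflow rule, the sketch below increments the largest int value. It relies on the int.max and int.min type properties; under the wrapping rule stated above the assertion holds.

```c3
fn void main()
{
    int x = int.max;
    x += 1; // defined in C3: wraps around instead of being undefined
    assert(x == int.min);
}
```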
Misc Advanced
Inline Assembly
C3 provides two ways to insert inline assembly code: asm strings and asm blocks.
Assembly strings¶
This form takes a single compile time string and passes it directly to the underlying backend without any changes.
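A minimal sketch of the string form is shown below. The "nop" instruction is chosen only because it exists on the commonly supported architectures; any string is passed through unchanged, so its validity depends on the target.

```c3
fn void main()
{
    int x = 0;
    asm("nop"); // the string is handed to the backend unchanged
    int y = x;
}
```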
Assembly blocks¶
Assembly blocks use a common grammar for all types of processors. C3's asm implementation assumes that all assembly statements can be reduced to variations of the following general format:
Where an arg is:
- An identifier, e.g. FOO, x.
- A numeric constant, e.g. 1, 0xFF.
- A register name (always lower case with a '$' prefix), e.g. $eax, $r7.
- The address of a variable, e.g. &x.
- An indirect address: [addr] or [addr + index * <const> + offset].
- Any expression inside of "()" (will be evaluated before entering the asm block).
An example:
int aa = 3;
int g;
int* gp = &g;
int* xa = &aa;
sz asf = 1;
asm
{
movl x, 4; // Move 4 into the variable x
movl [gp], x; // Move the value of x into the address in gp
movl x, 1; // Move 1 into x
movl [xa + asf * 4 + 4], x; // Move x into the address at xa[asf + 1]
movl $eax, (23 + x); // Move 23 + x into EAX
movl x, $eax; // Move EAX into x
movq [&z], 33; // Move 33 into the memory address of z
}
Note
The current state of inline asm is a work in progress. Only a subset of x86, aarch64 and riscv instructions are available, and other platforms have no support at all. It is likely that the grammar will be extended as more architectures are supported. More instructions can be added as they are needed, so please file issues when you encounter missing instructions you need.
Builtins
The compiler offers builtin constants and functions. Some are only available on certain targets. All builtins use the $$ name prefix.
Builtin constants¶
These constants are generated by the compiler and can safely be used by the user.
$$BENCHMARK_NAMES¶
An array of names of the benchmark functions as a String[].
The program must be run in benchmark mode (e.g. via the c3c benchmark shell command) for this array to be non-empty.
$$BENCHMARK_FNS¶
An array of addresses to the benchmark functions as a void*[].
The program must be run in benchmark mode (e.g. via the c3c benchmark command) for this array to be non-empty.
$$DATE¶
The current date (year, month, day) as a String.
In contrast, to retrieve the time of day (hours, minutes, seconds) try using $$TIME.
$$FILE¶
The current source code file name (not including any of the path) as a String.
$$FILEPATH¶
The full ("absolute") path to the current source code file as a String.
$$FUNC¶
The current function name as a String.
This will return "<GLOBAL>" if used on the global level (outside any function), such as via String global_func_name = $$FUNC;, because there is no corresponding function name in that case.
$$FUNCTION¶
The current function as an identifier, as if its name were written in place of $$FUNCTION.
As such, it may be queried for associated info (e.g. $reflect($$FUNCTION).name, $Typeof($$FUNCTION), etc) or assigned to a function pointer and later called, etc. Thus, more info than just a String function name may be accessed this way, in contrast to $$FUNC.
$$LINE¶
The current line as an integer.
$$LINE_RAW¶
Usually the same as $$LINE, but in case of a macro inclusion it returns the line in the macro rather than
the line where the macro was included.
$$MODULE¶
The current module name as a String.
Keep in mind that there can be multiple modules per file in C3 if multiple module sections are used. In contrast, for a per file name try $$FILE or $$FILEPATH.
$$TIME¶
The current time of day (hours, minutes, seconds) as a String.
In contrast, to retrieve the calendar day (year, month, day) try using $$DATE.
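As a short sketch of how several of these constants might be combined, the example below prints a location string for the current call site. The message layout is arbitrary, and the output depends on where the code is placed, so none is shown.

```c3
module builtin_demo;
import std::io;

fn void main()
{
    // $$FILE, $$FUNC and $$MODULE are Strings; $$LINE is an integer.
    io::printfn("%s:%d in %s (module %s)", $$FILE, $$LINE, $$FUNC, $$MODULE);
}
```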
Compiler builtin functions¶
The $$ namespace defines compiler builtin functions.
These special functions are not guaranteed to exist on
all platforms, and are ways to wrap compiler implemented, optimized implementations
of some particular functionality. They are mainly intended for standard
library internal use. The standard library has macros
that wrap these builtins, so they should normally not be used on their own.
$$trap¶
Emits a trap instruction.
$$unreachable¶
Inserts an "unreachable" annotation.
$$stacktrace¶
Returns the current "callstack" reference if available. OS and compiler dependent.
$$volatile_store¶
Takes a variable and a value and stores the value as a volatile store.
$$volatile_load¶
Takes a variable and returns the value using a volatile load.
$$memcpy¶
Builtin memcpy instruction.
$$memset¶
Builtin memset instruction.
$$prefetch¶
Prefetch a memory location.
$$sysclock¶
Access to the cycle counter register (or similar low latency clock) on supported
architectures (e.g. RDTSC on x86), otherwise $$sysclock will yield 0.
$$syscall¶
Makes a syscall according to the platform convention on platforms where it is supported.
Math functions¶
Functions $$ceil, $$trunc, $$sin, $$cos, $$log, $$log2, $$log10, $$rint, $$round,
$$sqrt, $$roundeven, $$floor, $$pow, $$exp, $$fma, $$fabs, $$copysign and
$$nearbyint can be applied to float vectors or numbers. They return the same type.
Functions $$min, $$abs and $$max can be applied to any integer or float number or vector.
Function $$pow_int takes a floating point value or vector and an integer and returns
the same type as the first parameter.
Saturated addition, subtraction and left shift for integers and integer vectors:
$$sat_add, $$sat_shl, $$sat_sub.
Bit functions¶
$$fshl and $$fshr¶
Funnel shift left and right, takes either two integers or two integer vectors.
$$ctz, $$clz, $$bitreverse, $$bswap, $$popcount¶
Bit functions work on an integer or an integer vector.
Vector functions¶
$$reduce_add, $$reduce_mul, $$reduce_and, $$reduce_or, $$reduce_xor work on integer vectors.
$$reduce_fadd, $$reduce_fmul work on float vectors.
$$reduce_max, $$reduce_min work on any vector.
$$reverse reverses the values in any vector.
$$shufflevector rearranges the values of two vectors using a fixed mask into
a resulting vector.
Debugging
C3 provides several powerful features and compiler flags to help identify memory corruption, logic errors, and performance bottlenecks.
Virtual Memory Temp Allocator (VMEM_TEMP)¶
The temporary allocator (tmem) is extremely fast but can lead to "use-after-scope" bugs if pointers to temporary data are stored in globals or long-lived structs.
To debug these issues, you can enable the Virtual Memory tracking mode by passing the -D VMEM_TEMP flag to the compiler (or adding "VMEM_TEMP" to your project.json features).
How it works:¶
When VMEM_TEMP is enabled:
1. Hardware Protection: The allocator uses the OS virtual memory system (MMU) to manage pages.
2. Instant Crash: When a @pool or test scope ends, the memory pages are removed or marked as protected. Any attempt to access "dead" temporary data will cause an immediate Segfault.
3. Large Address Space: It reserves a wide virtual address range (typically 4GB) to ensure allocations don't overlap, making corruption much easier to catch.
Tip
If your program is crashing with a Segfault only when -D VMEM_TEMP is enabled, look for pointers pointing into tmem that are being accessed after the @pool that created them has closed.
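The kind of bug this mode is meant to expose can be sketched as follows. The global name stale is hypothetical, and the sketch assumes that string::tformat allocates its result with the temporary allocator.

```c3
module vmem_demo;
import std::io;

String stale; // a global that outlives the temporary data

fn void main()
{
    @pool()
    {
        // Assumption: string::tformat allocates from the temp allocator.
        stale = string::tformat("value: %d", 42);
    };
    // Bug: 'stale' still points into temporary memory released with the @pool.
    io::printn(stale);
}
```

Without VMEM_TEMP such a read may appear to work by accident; with it enabled, the access is expected to fault at the point of use.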
Backtraces¶
In Safe Mode (default), C3 automatically generates detailed backtraces when a panic or crash occurs.
Manual Backtraces:¶
You can capture a backtrace at any time as a string:
import std::io;
import std::os::backtrace;
fn void log_stack()
{
    // Unwrap with !!, since a function returning void cannot rethrow with !
    String bt = backtrace::get(tmem)!!;
    io::eprint(bt);
}
Sanitizers¶
C3 supports integration with LLVM's Address Sanitizer (ASAN) and Thread Sanitizer (TSAN).
Address Sanitizer (ASAN)¶
To enable ASAN, compile with:
ASAN will detect:
- Out-of-bounds access to heap, stack, and globals.
- Use-after-free bugs.
- Memory leaks.
Thread Sanitizer (TSAN)¶
For multi-threaded applications, TSAN helps find data races:
Tracking Allocator¶
The TrackingAllocator is a wrapper that can be placed around any other allocator to detect memory leaks and capture backtraces for every allocation.
fn void main() {
TrackingAllocator tracker;
tracker.init(mem); // Wrap the default 'mem' allocator
defer tracker.free();
Allocator a = &tracker;
// Use 'allocator::new' to pass a specific allocator:
int* p = allocator::new(a, int);
// If not freed, tracker.print_report() will show any leaks.
tracker.print_report();
}
Allocation Tracking Macros¶
For convenience, C3 provides macros to automatically wrap a block of code with a tracking allocator.
@report_heap_allocs_in_scope¶
This macro runs the enclosed code and automatically prints a full memory report at the end of the scope.
@assert_leak¶
Similar to the report macro, but instead of just printing, it will assert that no memory has leaked. If leaks are found, it triggers a panic with a report.
Note
This macro only performs tracking and assertions if debug symbols are enabled or the -D MEMORY_ASSERTS feature flag is used. Otherwise, it executes the code block normally with no overhead.
fn void main() {
@assert_leak()
{
// code that should not leak
void* p = mem::malloc(64);
mem::free(p);
};
}
Testing Macros¶
C3 includes a built-in testing framework in std::core::test. These macros provide descriptive failure messages, stringifying the expressions being tested.
fn void test_math() @test {
int x = 10;
int y = 20;
test::eq(x + y, 40);
// Test failed ^^^ ( example.c3:4 ) `30` != `40`
}
Assertions and Unreachable¶
assert¶
Used for runtime checks that should always be true.
- Safe Mode: triggers a panic with backtrace if the condition fails.
- Unchecked Mode: is assumed to always be true, generating an LLVM unreachable instruction, becoming an optimization hint telling the compiler this path is impossible.
Note
Use @assert_always as a drop-in replacement if the assertion should also happen in Unchecked Mode.
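A minimal sketch of assert in use is given below. The function checked_div is a hypothetical name introduced for this illustration.

```c3
fn int checked_div(int a, int b)
{
    // Safe Mode: panics with this message if b is zero.
    // Unchecked Mode: the compiler may assume the condition holds.
    assert(b != 0, "b must not be zero");
    return a / b;
}

fn void main()
{
    int q = checked_div(10, 2);
    assert(q == 5);
}
```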
unreachable¶
Marks a code path that logically should never be hit.
- Safe Mode: Triggers a panic with the provided message and a backtrace.
- Unchecked Mode: Generates an LLVM unreachable instruction. This is an optimization hint telling the compiler this path is impossible. If the path is actually reached, the program will have undefined behavior (which often manifests as a crash or very strange execution state).
switch (state) {
case START: // ...
case END: // ...
default: unreachable("Invalid state encountered");
}
Contracts¶
C3 supports Contracts using the @require and @ensure attributes. These are checked in Safe Mode.
- @require: Pre-conditions that must be true when the function is called.
- @ensure: Post-conditions that must be true when the function returns.
<*
@require b != 0 : "Divisor must not be zero"
@ensure return == a / b
*>
fn float divide(float a, float b)
{
return a / b;
}
If a contract is violated in safe mode, the program panics with a descriptive message and a backtrace.
Safe vs. Unchecked Mode¶
Understanding the difference between modes is crucial for debugging:
| Feature | Safe Mode (-O0, -O1) | Unchecked Mode (-O2+) |
|---|---|---|
| Bounds Checking | Enabled | Disabled |
| Null Checks | Enabled | Disabled |
| Contracts | Evaluated | Ignored |
| Backtraces | Generated | Optional/None |
| Zero-Init | Guaranteed | Guaranteed |
Always perform your primary development and testing in Safe Mode. Switch to Unchecked Mode only for final releases or performance profiling once the logic is verified.
Library Packaging
Note that the library system is in early alpha. Everything below is subject to change.
C3 allows convenient packaging of C3 source files optionally with statically or dynamically linked libraries. To use such a library, simply pass the path to the library directory and add the library you wish to link to. The compiler will resolve any dependencies to other libraries and only compile those that are in use.
How it works¶
A library may be used either packaged or unpacked. If unpacked, it is simply a directory with the .c3l suffix, which contains all the necessary files. If packed, it is simply a compressed variant of a directory with the same structure.
The specification¶
At the top level of the library resides the manifest.json file, which has the following structure:
{
"provides" : "my_lib",
"execs" : [],
"targets" : {
"macos-x64" : {
"linkflags" : [],
"dependencies" : [],
"linked-libs" : ["my_lib_static", "Cocoa.framework", "c"]
},
"windows-x64" : {
"linkflags" : ["/stack:65536"],
"dependencies" : ["ms_my_extra"],
"linked-libs" : ["my_lib_static", "kernel32"],
"execs" : []
}
}
}
In the example above, this library supports two targets: macos-x64 and windows-x64. If we tried to use it with any other target, the compiler would give an error.
We see that if we use the windows-x64 target it will also load the ms_my_extra library. We also see that the linker would have a special argument on that platform.
Both targets expect my_lib_static to be available for linking. If this library provides this static or dynamic library it will be in the target sub-directories, so it likely has the path windows-x64/my_lib_static.lib or macos-x64/libmy_lib_static.a.
Source code¶
Aside from the manifest, C3 will read any C and C3 files in the same directory as manifest.json, as well as any files in the target subdirectory for the current target. For static libraries,
typically a .c3i file (that is, a C3 file without any implementations) is provided, similar to
how .h files are used in C.
Additional actions¶
"exec", which is available both at the top level and per-target, lists the scripts which will be
invoked when a library is used. This requires running the compiler at full trust level using the
--trust=full option.
How to – automatically – export libraries¶
This feature is not implemented yet. The documentation for this feature will materialize once it is finished.
Implementation Details
Grammar
Keywords¶
The following are reserved keywords used by C3:
void bool char double
float float16 int128 ichar
int iptr sz long
short uint128 uint ulong
uptr ushort usz float128
any fault typeid assert
asm bitstruct break case
catch const continue alias
default defer typedef do
else enum extern false
for foreach foreach_r fn
tlocal if inline import
macro module nextcase null
return static struct switch
true try union var
while attrdef
$assert $case $default $defined
$echo $else $embed $exec
$expand $endfor $endforeach $endif
$endswitch $eval $error $for
$foreach $if $include $stringify
$switch $vaarg $Typefrom $Typeof
The following attributes are built in:
@align @benchmark @bigendian @builtin
@cdecl @cname @deprecated @dynamic
@export @extname @inline @interface
@littleendian @local @maydiscard @mustinit
@naked @nodiscard @noinit @noinline
@noreturn @nostrip @obfuscate @operator
@overlap @packed @priority @private
@public @pure @reflect @section
@stdcall @test @unused @used
@veccall @wasm @weak @winmain
The following constants are defined:
$$BENCHMARK_FNS $$BENCHMARK_NAMES $$DATE
$$FILE $$FILEPATH $$FUNC
$$FUNCTION $$LINE $$LINE_RAW
$$MODULE $$TEST_FNS $$TEST_NAMES
$$TIME
Yacc grammar¶
%{
#include <stdio.h>
#define YYERROR_VERBOSE
int yydebug = 1;
extern char yytext[];
extern int column;
int yylex(void);
void yyerror(char *s);
%}
%token IDENT HASH_IDENT CT_IDENT CONST_IDENT
%token TYPE_IDENT CT_TYPE_IDENT
%token AT_TYPE_IDENT AT_IDENT CT_INCLUDE
%token STRING_LITERAL INTEGER
%token INC_OP DEC_OP SHL_OP SHR_OP LE_OP GE_OP EQ_OP NE_OP
%token AND_OP OR_OP MUL_ASSIGN DIV_ASSIGN MOD_ASSIGN ADD_ASSIGN
%token SUB_ASSIGN SHL_ASSIGN SHR_ASSIGN AND_ASSIGN
%token XOR_ASSIGN OR_ASSIGN VAR NUL ELVIS NEXTCASE ANYFAULT
%token MODULE IMPORT DEF EXTERN
%token CHAR SHORT INT LONG FLOAT DOUBLE CONST VOID USZ ISZ UPTR IPTR ANY
%token ICHAR USHORT UINT ULONG BOOL INT128 UINT128 FLOAT16 FLOAT128 BFLOAT16
%token TYPEID BITSTRUCT STATIC BANGBANG AT_CONST_IDENT HASH_TYPE_IDENT
%token STRUCT UNION ENUM ELLIPSIS DOTDOT BYTES
%token CT_ERROR
%token CASE DEFAULT IF ELSE SWITCH WHILE DO FOR CONTINUE BREAK RETURN FOREACH_R FOREACH
%token FN FAULT MACRO CT_IF CT_ENDIF CT_ELSE CT_SWITCH CT_CASE CT_DEFAULT CT_FOR CT_FOREACH CT_ENDFOREACH
%token CT_ENDFOR CT_ENDSWITCH BUILTIN IMPLIES INITIALIZE FINALIZE CT_ECHO CT_ASSERT CT_EVALTYPE CT_VATYPE
%token TRY CATCH SCOPE DEFER LVEC RVEC OPTELSE CT_TYPEFROM CT_TYPEOF TLOCAL
%token CT_VASPLAT INLINE DISTINCT CT_VACONST CT_NAMEOF CT_VAREF CT_VACOUNT CT_VAARG
%token CT_SIZEOF CT_STRINGIFY CT_QNAMEOF CT_OFFSETOF CT_VAEXPR
%token CT_EXTNAMEOF CT_EVAL CT_DEFINED CT_CHECKS CT_ALIGNOF ASSERT
%token ASM CHAR_LITERAL REAL TRUE FALSE CT_CONST_IDENT
%token LBRAPIPE RBRAPIPE HASH_CONST_IDENT
%start translation_unit
%%
path
: IDENT SCOPE
| path IDENT SCOPE
;
path_const
: path CONST_IDENT
| CONST_IDENT
;
path_ident
: path IDENT
| IDENT
;
path_at_ident
: path AT_IDENT
| AT_IDENT
;
ident_expr
: CONST_IDENT
| IDENT
| AT_IDENT
;
local_ident_expr
: CT_IDENT
| HASH_IDENT
;
ct_call
: CT_ALIGNOF
| CT_DEFINED
| CT_EXTNAMEOF
| CT_NAMEOF
| CT_OFFSETOF
| CT_QNAMEOF
;
ct_analyse
: CT_EVAL
| CT_SIZEOF
| CT_STRINGIFY
;
ct_arg
: CT_VACONST
| CT_VAARG
| CT_VAREF
| CT_VAEXPR
;
flat_path
: primary_expr param_path
| type
| primary_expr
;
maybe_optional_type
: optional_type
| empty
;
string_expr
: STRING_LITERAL
| string_expr STRING_LITERAL
;
bytes_expr
: BYTES
| bytes_expr BYTES
;
expr_block
: LBRAPIPE opt_stmt_list RBRAPIPE
;
base_expr
: string_expr
| INTEGER
| bytes_expr
| NUL
| BUILTIN CONST_IDENT
| BUILTIN IDENT
| CHAR_LITERAL
| REAL
| TRUE
| FALSE
| path ident_expr
| ident_expr
| local_ident_expr
| type initializer_list
| type '.' access_ident
| type '.' CONST_IDENT
| '(' expr ')'
| expr_block
| ct_call '(' flat_path ')'
| ct_arg '(' expr ')'
| ct_analyse '(' expr ')'
| CT_VACOUNT
| CT_CHECKS '(' expression_list ')'
| lambda_decl compound_statement
;
primary_expr
: base_expr
| initializer_list
;
range_loc
: expr
| '^' expr
;
range_expr
: range_loc DOTDOT range_loc
| range_loc DOTDOT
| DOTDOT range_loc
| range_loc ':' range_loc
| ':' range_loc
| range_loc ':'
| DOTDOT
;
call_inline_attributes
: AT_IDENT
| call_inline_attributes AT_IDENT
;
call_invocation
: '(' call_arg_list ')'
| '(' call_arg_list ')' call_inline_attributes
;
access_ident
: IDENT
| AT_IDENT
| HASH_IDENT
| CT_EVAL '(' expr ')'
| TYPEID
;
call_trailing
: '[' range_loc ']'
| '[' range_expr ']'
| call_invocation
| call_invocation compound_statement
| '.' access_ident
| INC_OP
| DEC_OP
| '!'
| BANGBANG
;
call_stmt_expr
: base_expr
| call_stmt_expr call_trailing
;
call_expr
: primary_expr
| call_expr call_trailing
;
unary_expr
: call_expr
| unary_op unary_expr
;
unary_stmt_expr
: call_stmt_expr
| unary_op unary_expr
;
unary_op
: '&'
| AND_OP
| '*'
| '+'
| '-'
| '~'
| '!'
| INC_OP
| DEC_OP
| '(' type ')'
;
mult_op
: '*'
| '/'
| '%'
;
mult_expr
: unary_expr
| mult_expr mult_op unary_expr
;
mult_stmt_expr
: unary_stmt_expr
| mult_stmt_expr mult_op unary_expr
;
shift_op
: SHL_OP
| SHR_OP
;
shift_expr
: mult_expr
| shift_expr shift_op mult_expr
;
shift_stmt_expr
: mult_stmt_expr
| shift_stmt_expr shift_op mult_expr
;
bit_op
: '&'
| '^'
| '|'
;
bit_expr
: shift_expr
| bit_expr bit_op shift_expr
;
bit_stmt_expr
: shift_stmt_expr
| bit_stmt_expr bit_op shift_expr
;
additive_op
: '+'
| '-'
;
additive_expr
: bit_expr
| additive_expr additive_op bit_expr
;
additive_stmt_expr
: bit_stmt_expr
| additive_stmt_expr additive_op bit_expr
;
relational_op
: '<'
| '>'
| LE_OP
| GE_OP
| EQ_OP
| NE_OP
;
relational_expr
: additive_expr
| relational_expr relational_op additive_expr
;
relational_stmt_expr
: additive_stmt_expr
| relational_stmt_expr relational_op additive_expr
;
rel_or_lambda_expr
: relational_expr
| lambda_decl IMPLIES relational_expr
;
and_expr
: relational_expr
| and_expr AND_OP relational_expr
;
and_stmt_expr
: relational_stmt_expr
| and_stmt_expr AND_OP relational_expr
;
or_expr
: and_expr
| or_expr OR_OP and_expr
;
or_stmt_expr
: and_stmt_expr
| or_stmt_expr OR_OP and_expr
;
or_expr_with_suffix
: or_expr
| or_expr '~'
| or_expr '~' '!'
;
or_stmt_expr_with_suffix
: or_stmt_expr
| or_stmt_expr '~'
| or_stmt_expr '~' '!'
;
ternary_expr
: or_expr_with_suffix
| or_expr '?' expr ':' ternary_expr
| or_expr_with_suffix ELVIS ternary_expr
| or_expr_with_suffix OPTELSE ternary_expr
| lambda_decl implies_body
;
ternary_stmt_expr
: or_stmt_expr_with_suffix
| or_stmt_expr '?' expr ':' ternary_expr
| or_stmt_expr_with_suffix ELVIS ternary_expr
| or_stmt_expr_with_suffix OPTELSE ternary_expr
| lambda_decl implies_body
;
assignment_op
: '='
| ADD_ASSIGN
| SUB_ASSIGN
| MUL_ASSIGN
| DIV_ASSIGN
| MOD_ASSIGN
| SHL_ASSIGN
| SHR_ASSIGN
| AND_ASSIGN
| XOR_ASSIGN
| OR_ASSIGN
;
empty
:
;
assignment_expr
: ternary_expr
| CT_TYPE_IDENT '=' type
| unary_expr assignment_op assignment_expr
;
assignment_stmt_expr
: ternary_stmt_expr
| CT_TYPE_IDENT '=' type
| unary_stmt_expr assignment_op assignment_expr
;
implies_body
: IMPLIES expr
;
lambda_decl
: FN maybe_optional_type fn_parameter_list opt_attributes
;
expr_no_list
: assignment_stmt_expr
;
expr
: assignment_expr
;
constant_expr
: ternary_expr
;
param_path_element
: '[' expr ']'
| '[' expr DOTDOT expr ']'
| '.' IDENT
;
param_path
: param_path_element
| param_path param_path_element
;
arg
: param_path '=' expr
| type
| param_path '=' type
| expr
| CT_VASPLAT '(' range_expr ')'
| CT_VASPLAT '(' ')'
| ELLIPSIS expr
;
arg_list
: arg
| arg_list ',' arg
;
call_arg_list
: arg_list
| arg_list ';'
| arg_list ';' parameters
| ';'
| ';' parameters
| empty
;
opt_arg_list_trailing
: arg_list
| arg_list ','
| empty
;
enum_constants
: enum_constant
| enum_constants ',' enum_constant
;
enum_list
: enum_constants
| enum_constants ','
;
enum_constant
: CONST_IDENT
| CONST_IDENT '(' arg_list ')'
| CONST_IDENT '(' arg_list ',' ')'
;
identifier_list
: IDENT
| identifier_list ',' IDENT
;
enum_param_decl
: type
| type IDENT
| type IDENT '=' expr
;
base_type
: VOID
| BOOL
| CHAR
| ICHAR
| SHORT
| USHORT
| INT
| UINT
| LONG
| ULONG
| INT128
| UINT128
| FLOAT
| DOUBLE
| FLOAT16
| BFLOAT16
| FLOAT128
| IPTR
| UPTR
| ISZ
| USZ
| ANYFAULT
| ANY
| TYPEID
| TYPE_IDENT
| path TYPE_IDENT
| CT_TYPE_IDENT
| CT_TYPEOF '(' expr ')'
| CT_TYPEFROM '(' constant_expr ')'
| CT_VATYPE '(' constant_expr ')'
| CT_EVALTYPE '(' constant_expr ')'
;
type
: base_type
| type '*'
| type '[' constant_expr ']'
| type '[' ']'
| type '[' '*' ']'
| type LVEC constant_expr RVEC
| type LVEC '*' RVEC
;
optional_type
: type
| type '!'
;
local_decl_after_type
: CT_IDENT
| CT_IDENT '=' constant_expr
| IDENT opt_attributes
| IDENT opt_attributes '=' expr
;
local_decl_storage
: STATIC
| TLOCAL
;
decl_or_expr
: var_decl
| optional_type local_decl_after_type
| expr
;
var_decl
: VAR IDENT '=' expr
| VAR CT_IDENT '=' expr
| VAR CT_IDENT
| VAR CT_TYPE_IDENT '=' type
| VAR CT_TYPE_IDENT
;
initializer_list
: '{' opt_arg_list_trailing '}'
;
ct_case_stmt
: CT_CASE constant_expr ':' opt_stmt_list
| CT_CASE type ':' opt_stmt_list
| CT_DEFAULT ':' opt_stmt_list
;
ct_switch_body
: ct_case_stmt
| ct_switch_body ct_case_stmt
;
ct_for_stmt
: CT_FOR '(' for_cond ')' opt_stmt_list CT_ENDFOR
;
ct_foreach_stmt
: CT_FOREACH '(' CT_IDENT ':' expr ')' opt_stmt_list CT_ENDFOREACH
| CT_FOREACH '(' CT_IDENT ',' CT_IDENT ':' expr ')' opt_stmt_list CT_ENDFOREACH
;
ct_switch
: CT_SWITCH '(' constant_expr ')'
| CT_SWITCH '(' type ')'
| CT_SWITCH
;
ct_switch_stmt
: ct_switch ct_switch_body CT_ENDSWITCH
;
var_stmt
: var_decl ';'
;
decl_stmt_after_type
: local_decl_after_type
| decl_stmt_after_type ',' local_decl_after_type
;
declaration_stmt
: const_declaration
| local_decl_storage optional_type decl_stmt_after_type ';'
| optional_type decl_stmt_after_type ';'
;
return_stmt
: RETURN expr ';'
| RETURN ';'
;
catch_unwrap_list
: relational_expr
| catch_unwrap_list ',' relational_expr
;
catch_unwrap
: CATCH catch_unwrap_list
| CATCH IDENT '=' catch_unwrap_list
| CATCH type IDENT '=' catch_unwrap_list
;
try_unwrap
: TRY rel_or_lambda_expr
| TRY IDENT '=' rel_or_lambda_expr
| TRY type IDENT '=' rel_or_lambda_expr
;
try_unwrap_chain
: try_unwrap
| try_unwrap_chain AND_OP try_unwrap
| try_unwrap_chain AND_OP rel_or_lambda_expr
;
default_stmt
: DEFAULT ':' opt_stmt_list
;
case_stmt
: CASE expr ':' opt_stmt_list
| CASE expr DOTDOT expr ':' opt_stmt_list
| CASE type ':' opt_stmt_list
;
switch_body
: case_stmt
| default_stmt
| switch_body case_stmt
| switch_body default_stmt
;
cond_repeat
: decl_or_expr
| cond_repeat ',' decl_or_expr
;
cond
: try_unwrap_chain
| catch_unwrap
| cond_repeat
| cond_repeat ',' try_unwrap_chain
| cond_repeat ',' catch_unwrap
;
else_part
: ELSE if_stmt
| ELSE compound_statement
;
if_stmt
: IF optional_label paren_cond '{' switch_body '}'
| IF optional_label paren_cond '{' switch_body '}' else_part
| IF optional_label paren_cond statement
| IF optional_label paren_cond compound_statement else_part
;
expr_list_eos
: expression_list ';'
| ';'
;
cond_eos
: cond ';'
| ';'
;
for_cond
: expr_list_eos cond_eos expression_list
| expr_list_eos cond_eos
;
for_stmt
: FOR optional_label '(' for_cond ')' statement
;
paren_cond
: '(' cond ')'
;
while_stmt
: WHILE optional_label paren_cond statement
;
do_stmt
: DO optional_label compound_statement WHILE '(' expr ')' ';'
| DO optional_label compound_statement ';'
;
optional_label_target
: CONST_IDENT
| empty
;
continue_stmt
: CONTINUE optional_label_target ';'
;
break_stmt
: BREAK optional_label_target ';'
;
nextcase_stmt
: NEXTCASE CONST_IDENT ':' expr ';'
| NEXTCASE expr ';'
| NEXTCASE CONST_IDENT ':' type ';'
| NEXTCASE type ';'
| NEXTCASE ';'
;
foreach_var
: optional_type '&' IDENT
| optional_type IDENT
| '&' IDENT
| IDENT
;
foreach_vars
: foreach_var
| foreach_var ',' foreach_var
;
foreach_stmt
: FOREACH optional_label '(' foreach_vars ':' expr ')' statement
| FOREACH_R optional_label '(' foreach_vars ':' expr ')' statement
;
defer_stmt
: DEFER statement
| DEFER TRY statement
| DEFER CATCH statement
;
ct_if_stmt
: CT_IF constant_expr ':' opt_stmt_list CT_ENDIF
| CT_IF constant_expr ':' opt_stmt_list CT_ELSE opt_stmt_list CT_ENDIF
;
assert_expr
: try_unwrap_chain
| expr
;
assert_stmt
: ASSERT '(' assert_expr ')' ';'
| ASSERT '(' assert_expr ',' expr ')' ';'
;
asm_stmts
: asm_stmt
| asm_stmts asm_stmt
;
asm_instr
: INT
| IDENT
| INT '.' IDENT
| IDENT '.' IDENT
;
asm_addr
: asm_expr
| asm_expr additive_op asm_expr
| asm_expr additive_op asm_expr '*' INTEGER
| asm_expr additive_op asm_expr '*' INTEGER additive_op INTEGER
| asm_expr additive_op asm_expr shift_op INTEGER
| asm_expr additive_op asm_expr additive_op INTEGER
;
asm_expr
: CT_IDENT
| CT_CONST_IDENT
| IDENT
| '&' IDENT
| CONST_IDENT
| REAL
| INTEGER
| '(' expr ')'
| '[' asm_addr ']'
;
asm_exprs
: asm_expr
| asm_exprs ',' asm_expr
;
asm_stmt
: asm_instr asm_exprs ';'
| asm_instr ';'
;
asm_block_stmt
: ASM '(' expr ')'
| ASM '{' asm_stmts '}'
| ASM '{' '}'
;
/* Order here matches compiler */
statement
: compound_statement
| var_stmt
| declaration_stmt
| return_stmt
| if_stmt
| while_stmt
| defer_stmt
| switch_stmt
| do_stmt
| for_stmt
| foreach_stmt
| continue_stmt
| break_stmt
| nextcase_stmt
| asm_block_stmt
| ct_echo_stmt
| ct_assert_stmt
| ct_if_stmt
| ct_switch_stmt
| ct_foreach_stmt
| ct_for_stmt
| expr_no_list ';'
| assert_stmt
| ';'
;
compound_statement
: '{' opt_stmt_list '}'
;
statement_list
: statement
| statement_list statement
;
opt_stmt_list
: statement_list
| empty
;
switch_stmt
: SWITCH optional_label '{' switch_body '}'
| SWITCH optional_label '{' '}'
| SWITCH optional_label paren_cond '{' switch_body '}'
| SWITCH optional_label paren_cond '{' '}'
;
expression_list
: decl_or_expr
| expression_list ',' decl_or_expr
;
optional_label
: CONST_IDENT ':'
| empty
;
ct_assert_stmt
: CT_ASSERT constant_expr ':' constant_expr ';'
| CT_ASSERT constant_expr ';'
| CT_ERROR constant_expr ';'
;
ct_include_stmt
: CT_INCLUDE string_expr ';'
;
ct_echo_stmt
: CT_ECHO constant_expr ';'
;
bitstruct_declaration
: BITSTRUCT TYPE_IDENT ':' type opt_attributes bitstruct_body
;
bitstruct_body
: '{' '}'
| '{' bitstruct_defs '}'
| '{' bitstruct_simple_defs '}'
;
bitstruct_defs
: bitstruct_def
| bitstruct_defs bitstruct_def
;
bitstruct_simple_defs
: base_type IDENT ';'
| bitstruct_simple_defs base_type IDENT ';'
;
bitstruct_def
: base_type IDENT ':' constant_expr DOTDOT constant_expr ';'
| base_type IDENT ':' constant_expr ';'
;
static_declaration
: STATIC INITIALIZE opt_attributes compound_statement
| STATIC FINALIZE opt_attributes compound_statement
;
attribute_name
: AT_IDENT
| AT_TYPE_IDENT
| path AT_TYPE_IDENT
;
attribute_operator_expr
: '&' '[' ']'
| '[' ']' '='
| '[' ']'
;
attr_param
: attribute_operator_expr
| constant_expr
;
attribute_param_list
: attr_param
| attribute_param_list ',' attr_param
;
attribute
: attribute_name
| attribute_name '(' attribute_param_list ')'
;
attribute_list
: attribute
| attribute_list attribute
;
opt_attributes
: attribute_list
| empty
;
trailing_block_param
: AT_IDENT
| AT_IDENT '(' ')'
| AT_IDENT '(' parameters ')'
;
macro_params
: parameters
| parameters ';' trailing_block_param
| ';' trailing_block_param
| empty
;
macro_func_body
: implies_body ';'
| compound_statement
;
macro_declaration
: MACRO macro_header '(' macro_params ')' opt_attributes macro_func_body
;
struct_or_union
: STRUCT
| UNION
;
struct_declaration
: struct_or_union TYPE_IDENT opt_attributes struct_body
;
struct_body
: '{' struct_declaration_list '}'
;
struct_declaration_list
: struct_member_decl
| struct_declaration_list struct_member_decl
;
enum_params
: enum_param_decl
| enum_params ',' enum_param_decl
;
enum_param_list
: '(' enum_params ')'
| '(' ')'
| empty
;
struct_member_decl
: type identifier_list opt_attributes ';'
| struct_or_union IDENT opt_attributes struct_body
| struct_or_union opt_attributes struct_body
| BITSTRUCT ':' type opt_attributes bitstruct_body
| BITSTRUCT IDENT ':' type opt_attributes bitstruct_body
| INLINE type IDENT opt_attributes ';'
| INLINE type opt_attributes ';'
;
enum_spec
: ':' type enum_param_list
| empty
;
enum_declaration
: ENUM TYPE_IDENT enum_spec opt_attributes '{' enum_list '}'
;
faults
: CONST_IDENT
| faults ',' CONST_IDENT
;
fault_declaration
: FAULT TYPE_IDENT opt_attributes '{' faults '}'
| FAULT TYPE_IDENT opt_attributes '{' faults ',' '}'
;
func_macro_name
: IDENT
| AT_IDENT
;
func_header
: optional_type type '.' func_macro_name
| optional_type func_macro_name
;
macro_header
: func_header
| type '.' func_macro_name
| func_macro_name
;
fn_parameter_list
: '(' parameters ')'
| '(' ')'
;
parameters
: parameter '=' expr
| parameter
| parameters ',' parameter
| parameters ',' parameter '=' expr
;
parameter
: type IDENT opt_attributes
| type ELLIPSIS IDENT opt_attributes
| type ELLIPSIS CT_IDENT
| type CT_IDENT
| type ELLIPSIS opt_attributes
| type HASH_IDENT opt_attributes
| type '&' IDENT opt_attributes
| type opt_attributes
| '&' IDENT opt_attributes
| HASH_IDENT opt_attributes
| ELLIPSIS
| IDENT opt_attributes
| IDENT ELLIPSIS opt_attributes
| CT_IDENT
| CT_IDENT ELLIPSIS
;
func_definition
: FN func_header fn_parameter_list opt_attributes ';'
| FN func_header fn_parameter_list opt_attributes macro_func_body
;
const_declaration
: CONST CONST_IDENT opt_attributes '=' expr ';'
| CONST type CONST_IDENT opt_attributes '=' expr ';'
;
func_typedef
: FN optional_type fn_parameter_list
;
opt_distinct_inline
: DISTINCT
| DISTINCT INLINE
| INLINE DISTINCT
| INLINE
| empty
;
generic_parameters
: bit_expr
| type
| generic_parameters ',' bit_expr
| generic_parameters ',' type
;
typedef_type
: func_typedef
| type opt_generic_parameters
;
multi_declaration
: ',' IDENT
| multi_declaration ',' IDENT
;
global_storage
: TLOCAL
| empty
;
global_declaration
: global_storage optional_type IDENT opt_attributes ';'
| global_storage optional_type IDENT multi_declaration opt_attributes ';'
| global_storage optional_type IDENT opt_attributes '=' expr ';'
;
opt_tl_stmts
: top_level_statements
| empty
;
tl_ct_case
: CT_CASE constant_expr ':' opt_tl_stmts
| CT_CASE type ':' opt_tl_stmts
| CT_DEFAULT ':' opt_tl_stmts
;
tl_ct_switch_body
: tl_ct_case
| tl_ct_switch_body tl_ct_case
;
define_attribute
: AT_TYPE_IDENT '(' parameters ')' opt_attributes '=' '{' opt_attributes '}'
| AT_TYPE_IDENT opt_attributes '=' '{' opt_attributes '}'
;
opt_generic_parameters
: '{' generic_parameters '}'
| empty
;
define_ident
: IDENT '=' path_ident opt_generic_parameters
| CONST_IDENT '=' path_const opt_generic_parameters
| AT_IDENT '=' path_at_ident opt_generic_parameters
;
define_declaration
: DEF define_ident ';'
| DEF define_attribute ';'
| DEF TYPE_IDENT opt_attributes '=' opt_distinct_inline typedef_type ';'
;
tl_ct_if
: CT_IF constant_expr ':' opt_tl_stmts CT_ENDIF
| CT_IF constant_expr ':' opt_tl_stmts CT_ELSE opt_tl_stmts CT_ENDIF
;
tl_ct_switch
: ct_switch tl_ct_switch_body CT_ENDSWITCH
;
module_param
: CONST_IDENT
| TYPE_IDENT
;
module_params
: module_param
| module_params ',' module_param
;
module
: MODULE path_ident opt_attributes ';'
| MODULE path_ident '{' module_params '}' opt_attributes ';'
;
import_paths
: path_ident
| path_ident ',' path_ident
;
import_decl
: IMPORT import_paths opt_attributes ';'
;
translation_unit
: top_level_statements
| empty
;
top_level_statements
: top_level
| top_level_statements top_level
;
opt_extern
: EXTERN
| empty
;
top_level
: module
| import_decl
| opt_extern func_definition
| opt_extern const_declaration
| opt_extern global_declaration
| ct_assert_stmt
| ct_echo_stmt
| ct_include_stmt
| tl_ct_if
| tl_ct_switch
| struct_declaration
| fault_declaration
| enum_declaration
| macro_declaration
| define_declaration
| static_declaration
| bitstruct_declaration
;
%%
void yyerror(char *s)
{
fflush(stdout);
printf("\n%*s\n%*s\n", column, "^", column, s);
}
int main(int argc, char *argv[])
{
yyparse();
return 0;
}
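The layered expression rules in the grammar above (mult_expr inside shift_expr inside bit_expr inside additive_expr inside relational_expr) fix the relative precedence of the binary operators: a rule nested deeper binds tighter, and each left-recursive rule makes its operators left-associative. The following is a minimal Python sketch of that layering only, not the compiler's parser. The names `LEVELS`, `tokenize`, and `parse` are illustrative assumptions, and operands are simplified to bare words and parenthesised expressions.

```python
import re

# Binary operator layers, loosest first, mirroring how the grammar nests
# relational_expr -> additive_expr -> bit_expr -> shift_expr -> mult_expr.
LEVELS = [
    {"<", ">", "<=", ">=", "==", "!="},  # relational_op
    {"+", "-"},                          # additive_op
    {"&", "^", "|"},                     # bit_op
    {"<<", ">>"},                        # shift_op
    {"*", "/", "%"},                     # mult_op
]

TOKEN = re.compile(r"\s*(<<|>>|<=|>=|==|!=|[-+*/%&^|<>()]|\w+)")


def tokenize(text):
    """Split text into operator, parenthesis, and operand tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Return the expression with parentheses around every binary operation."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def operand():
        nonlocal pos
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        if token == "(":
            inner = level(0)
            if peek() != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return inner
        if not re.fullmatch(r"\w+", token):
            raise SyntaxError(f"expected an operand, got {token!r}")
        return token

    def level(n):
        nonlocal pos
        if n == len(LEVELS):
            return operand()
        left = level(n + 1)
        # A left-recursive grammar rule becomes a left-associative loop.
        while peek() in LEVELS[n]:
            op = tokens[pos]
            pos += 1
            right = level(n + 1)
            left = f"({left} {op} {right})"
        return left

    result = level(0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result
```

Under this layering `parse("a + b & c")` groups as `(a + (b & c))` and `parse("a << 1 + b")` as `((a << 1) + b)`, because the bit and shift rules sit below additive_expr. C groups both expressions the other way round.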
C3 Specification¶
Notation¶
The syntax is specified using Extended Backus-Naur Form (EBNF):
production ::= PRODUCTION_NAME '::=' expression?
expression ::= alternative ("|" alternative)*
alternative ::= term term*
term ::= PRODUCTION_NAME | TOKEN | set | group | option | repetition
set ::= '[' (range | CHAR) (range | CHAR)* ']'
range ::= CHAR '-' CHAR
group ::= '(' expression ')'
option ::= expression '?'
repetition ::= expression '*'
Productions are expressions constructed from terms and the following operators, in increasing precedence:
|   alternation
()  grouping
?   option (0 or 1 times)
*   repetition (0 to n times)
Uppercase production names are used to identify lexical tokens. Non-terminals are in lower case. Lexical tokens are enclosed in single quotes ''.
The form a..b represents the set of characters from a through b as alternatives.
Source code representation¶
A C3 program consists of one or more translation units. Each translation unit is stored in a file written in the Unicode character set, encoded as a sequence of bytes using UTF-8.
Except within comments and the contents of character and string literals, all input is formed from the ASCII subset (U+0000 to U+007F) of Unicode. A compiler may reject a source file that contains bytes outside the ASCII subset in any other position.
For simplicity, this document uses the unqualified term character to refer to a single Unicode code point in the source text. Each code point is distinct; in particular, an upper-case letter and its lower-case counterpart are different characters.
Characters¶
The following terms denote individual characters and classes of characters:
UNICODE_CHAR may appear only within comments and within character and string literals. In every other position the source text is restricted to the ASCII subset.
Letters and digits¶
The following terms denote the letters and digits used to form tokens:
UC_LETTER ::= [A-Z]
LC_LETTER ::= [a-z]
LETTER ::= UC_LETTER | LC_LETTER
DIGIT ::= [0-9]
HEX_DIGIT ::= [0-9a-fA-F]
BINARY_DIGIT ::= [01]
OCTAL_DIGIT ::= [0-7]
LC_LETTER_ ::= LC_LETTER | "_"
UC_LETTER_ ::= UC_LETTER | "_"
ALPHANUM ::= LETTER | DIGIT
ALPHANUM_ ::= ALPHANUM | "_"
UC_ALPHANUM_ ::= UC_LETTER_ | DIGIT
LC_ALPHANUM_ ::= LC_LETTER_ | DIGIT
The underscore character _ (U+005F) is not a LETTER; it is admitted only by the productions in which it appears explicitly.
Carriage return¶
The carriage return (U+000D) is treated as white space. A compiler may alternatively strip carriage return characters from the source text prior to lexical translation; both treatments are valid, so a program must not depend on whether carriage returns are preserved.
Bidirectional markers¶
Unbalanced Unicode bidirectional formatting markers — such as U+202D (LEFT-TO-RIGHT OVERRIDE) and U+202E (RIGHT-TO-LEFT OVERRIDE) — are not legal in C3 source text.
Lexical elements¶
This chapter describes how the source text of a translation unit is divided into tokens, and defines the lexical structure of each kind of token.
Tokens¶
A translation unit is translated into a sequence of tokens by repeatedly removing the longest prefix of the remaining input that forms a valid token. White space, line terminators, and comments are not tokens; they are discarded during translation, but may serve to separate tokens that would otherwise combine into a single token.
Because the longest valid prefix is always taken, a tokenization is chosen even when it does not yield a grammatically correct program and another tokenization would.
Example:
a--b is translated as the tokens a, --, b, which is not a valid expression, even though the tokenization a, -, -, b would be.
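The longest-prefix rule can be sketched in a few lines of Python. This is a minimal illustration, not the compiler's lexer: `OPERATORS` is a deliberately tiny, hypothetical subset of the operator table, and words are treated as a single undifferentiated token class.

```python
# Hypothetical subset of the operator table, sorted so that the longest
# candidate is tried first at every position.
OPERATORS = sorted(["+", "-", "++", "--", "+=", "-=", "+++", "+++="],
                   key=len, reverse=True)


def is_word_char(ch):
    """ASCII letters, digits, and underscore."""
    return ch.isascii() and (ch.isalnum() or ch == "_")


def tokenize(source):
    """Repeatedly remove the longest prefix that forms a token."""
    tokens, pos = [], 0
    while pos < len(source):
        ch = source[pos]
        if ch in " \t\r\n":  # separators are discarded, not emitted
            pos += 1
            continue
        if is_word_char(ch):
            end = pos
            while end < len(source) and is_word_char(source[end]):
                end += 1
            tokens.append(source[pos:end])
            pos = end
            continue
        for op in OPERATORS:
            if source.startswith(op, pos):
                tokens.append(op)
                pos += len(op)
                break
        else:
            raise SyntaxError(f"unexpected character {ch!r} at offset {pos}")
    return tokens
```

With this sketch `tokenize("a--b")` returns `['a', '--', 'b']`, while inserting white space, as in `tokenize("a - -b")`, returns `['a', '-', '-', 'b']`.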
There are four classes of tokens: identifiers, keywords, operators and punctuation, and literals.
Line terminators¶
The compiler divides the input into lines by recognizing line terminators. A line is terminated by the ASCII LF character (U+000A), the newline. A line terminator ends a line comment, and like white space it separates tokens.
White space¶
White space is any of the space character (U+0020), the horizontal tab (U+0009), and the carriage return (U+000D). White space separates tokens and is otherwise insignificant.
Comments¶
There are two forms of regular comment:
- A line comment begins with // and stops at the end of the line.
- A block comment begins with /* and ends with */. Block comments nest: every /* within a block comment must be matched by a corresponding */.
In addition, when the first line of a translation unit begins with #!, that line is treated as a line comment. The #! form is recognized only as the first line of the file; it has no special meaning anywhere else.
A comment does not begin inside a character, string, or byte literal, nor inside another comment.
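Nesting means a block comment cannot be recognised by searching for the first */; a depth counter is needed. The Python sketch below illustrates this under simplifying assumptions: it protects only double-quoted string literals, and it ignores character, raw string, and byte literals, the #! first line, and doc contracts. The function name `strip_comments` is hypothetical.

```python
def strip_comments(source):
    """Remove // and nested /* */ comments, leaving string literals intact."""
    out, pos, n = [], 0, len(source)
    while pos < n:
        if source.startswith("//", pos):
            end = source.find("\n", pos)
            pos = n if end == -1 else end  # the newline itself is kept
        elif source.startswith("/*", pos):
            depth, pos = 1, pos + 2
            while depth > 0:
                if pos >= n:
                    raise SyntaxError("unterminated block comment")
                if source.startswith("/*", pos):
                    depth += 1
                    pos += 2
                elif source.startswith("*/", pos):
                    depth -= 1
                    pos += 2
                else:
                    pos += 1
            out.append(" ")  # a comment still separates adjacent tokens
        elif source[pos] == '"':
            end = pos + 1
            while end < n and source[end] != '"':
                end += 2 if source[end] == "\\" else 1  # skip escaped characters
            out.append(source[pos:end + 1])
            pos = end + 1
        else:
            out.append(source[pos])
            pos += 1
    return "".join(out)
```

Replacing each block comment with a single space preserves its role as a token separator, so `a/* x */b` still yields two tokens rather than one.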
Doc contracts¶
A doc contract is a block comment of the form <* text *>. Doc contracts do not nest. The text between <* and *> is optionally parsed using the contract grammar specified in Contracts. A conforming compiler may instead treat a doc contract as a regular block comment and discard its contents during lexical translation; in that case the contract has no effect on the program. When a doc contract is parsed, its tokens enter the token stream as specified in Contracts.
Identifiers¶
An identifier names a program entity. C3 distinguishes several lexical classes of identifier by the case of their characters and by an optional prefix sigil.
IDENTIFIER ::= "_"* LC_LETTER ALPHANUM_*
CONST_IDENT ::= "_"* UC_LETTER UC_ALPHANUM_*
TYPE_IDENT ::= "_"* UC_LETTER UC_ALPHANUM_* LC_LETTER ALPHANUM_*
CT_IDENT ::= "$" IDENTIFIER
CT_CONST_IDENT ::= "$" CONST_IDENT
CT_TYPE_IDENT ::= "$" TYPE_IDENT
CT_BUILTIN ::= "$$" IDENTIFIER
CT_BUILTIN_CONST ::= "$$" CONST_IDENT
AT_IDENT ::= "@" IDENTIFIER
AT_TYPE_IDENT ::= "@" TYPE_IDENT
HASH_IDENT ::= "#" IDENTIFIER
PATH_SEGMENT ::= "_"* LC_LETTER LC_ALPHANUM_*
A CONST_IDENT consists only of underscores, upper-case letters, and digits. A TYPE_IDENT begins like a CONST_IDENT but additionally contains at least one lower-case letter. A plain IDENTIFIER begins with an optional run of underscores followed by a lower-case letter.
A sequence consisting solely of underscores — including a single underscore _ — is not a valid identifier; it matches none of these classes.
Identifiers are limited to 127 characters.
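The identifier classes can be checked mechanically by transcribing the productions into regular expressions. The Python sketch below does so as an illustration only; `classify` and the two tables are hypothetical names, and the sketch neither excludes keywords nor enforces the 127-character limit.

```python
import re

# The three unprefixed shapes, transcribed from the productions above.
BASE = {
    "IDENTIFIER": re.compile(r"_*[a-z][A-Za-z0-9_]*"),
    "CONST_IDENT": re.compile(r"_*[A-Z][A-Z0-9_]*"),
    "TYPE_IDENT": re.compile(r"_*[A-Z][A-Z0-9_]*[a-z][A-Za-z0-9_]*"),
}

# (sigil, unprefixed shape) -> lexical class; absent pairs are not identifiers.
CLASSES = {
    ("", "IDENTIFIER"): "IDENTIFIER",
    ("", "CONST_IDENT"): "CONST_IDENT",
    ("", "TYPE_IDENT"): "TYPE_IDENT",
    ("$", "IDENTIFIER"): "CT_IDENT",
    ("$", "CONST_IDENT"): "CT_CONST_IDENT",
    ("$", "TYPE_IDENT"): "CT_TYPE_IDENT",
    ("$$", "IDENTIFIER"): "CT_BUILTIN",
    ("$$", "CONST_IDENT"): "CT_BUILTIN_CONST",
    ("@", "IDENTIFIER"): "AT_IDENT",
    ("@", "TYPE_IDENT"): "AT_TYPE_IDENT",
    ("#", "IDENTIFIER"): "HASH_IDENT",
}


def classify(text):
    """Return the lexical class of text, or None if it matches no class."""
    for sigil in ("$$", "$", "@", "#", ""):
        if text.startswith(sigil):
            name = text[len(sigil):]
            break
    for shape, pattern in BASE.items():
        if pattern.fullmatch(name):
            return CLASSES.get((sigil, shape))
    return None
```

For example, `classify("HTTPServer")` gives `"TYPE_IDENT"` because the name contains a lower-case letter, `classify("HTTP_SERVER")` gives `"CONST_IDENT"`, and `classify("_")` gives `None`.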
Keywords¶
The following identifiers are reserved as keywords and may not be used otherwise.
alias asm assert attrdef bitstruct
break case catch const constdef
continue default defer do else
enum extern false faultdef fn
for foreach foreach_r if import
inline interface lengthof macro module
nextcase null return static struct
switch tlocal true try typedef
union var while
The built-in type names are also reserved:
any bfloat bool char double
fault float float16 float128 ichar
int int128 iptr long short
sz typeid uint uint128 untypedlist
uptr ushort usz void
The following compile-time keywords, each beginning with $, are reserved:
$assert $case $default $defined $echo
$else $embed $endfor $endforeach $endif
$endswitch $error $eval $exec $expand
$feature $for $foreach $if $include
$reflect $stringify $switch $Typefrom $Typeof
$vaarg
Operators and punctuation¶
The following character sequences are operators and punctuation:
+ - * / %
& | ^ ~ << >>
= += -= *= /= %=
&= |= ^= <<= >>=
== != < > <= >=
&& || !
? ?: ?? !!
++ --
. .. ...
, ; : ::
-> =>
( ) { } [ ]
[< >]
@ # $
&&& ||| ??? +++ +++=
The sequence $$ introduces a compile-time built-in identifier, as described under Identifiers.
Integer literals¶
An integer literal denotes an integer constant. It is written in decimal, binary, octal, or hexadecimal, and may carry a suffix fixing its type.
INTEGER ::= (DECIMAL_LIT | BINARY_LIT | OCTAL_LIT | HEX_LIT) INTEGER_SUFFIX?
DECIMAL_LIT ::= DIGIT ("_"* DIGIT)*
BINARY_LIT ::= "0" ("b" | "B") BINARY_DIGIT ("_"* BINARY_DIGIT)*
OCTAL_LIT ::= "0" ("o" | "O") OCTAL_DIGIT ("_"* OCTAL_DIGIT)*
HEX_LIT ::= "0" ("x" | "X") HEX_DIGIT ("_"* HEX_DIGIT)*
INTEGER_SUFFIX ::= ("l" | "L") ("l" | "L")?
| ("u" | "U") (("l" | "L") ("l" | "L")?)?
An underscore may appear between two digits and is insignificant; it may not appear at the start or end of the digit sequence, nor immediately after the base prefix. Underscores may be repeated.
The suffix is case-insensitive. A single l group selects a 64-bit literal, and a doubled ll group selects a 128-bit literal; a leading u selects the unsigned form. These widths are fixed on every platform.
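As an illustration of the underscore and suffix rules, the Python sketch below transcribes the INTEGER production into a regular expression and reports the value, the signedness, and the width fixed by the suffix. The name `parse_integer` and the shape of its result are assumptions made for this example; the type of an unsuffixed literal is not modelled and is reported as `None`.

```python
import re

INTEGER = re.compile(
    r"(?:0[xX](?P<hex>[0-9a-fA-F](?:_*[0-9a-fA-F])*)"
    r"|0[bB](?P<bin>[01](?:_*[01])*)"
    r"|0[oO](?P<oct>[0-7](?:_*[0-7])*)"
    r"|(?P<dec>[0-9](?:_*[0-9])*))"
    r"(?P<suffix>[uU](?:[lL]{1,2})?|[lL]{1,2})?"
)


def parse_integer(text):
    """Return (value, is_unsigned, bits) or None if text is not an INTEGER.

    bits is 64 or 128 when the suffix fixes the width, otherwise None.
    """
    match = INTEGER.fullmatch(text)
    if match is None:
        return None
    for group, base in (("hex", 16), ("bin", 2), ("oct", 8), ("dec", 10)):
        digits = match.group(group)
        if digits is not None:
            value = int(digits.replace("_", ""), base)  # underscores are insignificant
            break
    suffix = (match.group("suffix") or "").lower()
    is_unsigned = suffix.startswith("u")
    bits = {0: None, 1: 64, 2: 128}[suffix.count("l")]
    return value, is_unsigned, bits
```

Here `parse_integer("0xFFul")` returns `(255, True, 64)`, while `parse_integer("1_")` and `parse_integer("0x_FF")` return `None` because an underscore may only appear between two digits.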
Floating-point literals¶
A floating-point literal denotes a real constant, written in decimal or hexadecimal form, with an optional type suffix.
REAL ::= (DEC_FLOAT_LIT | HEX_FLOAT_LIT) REAL_SUFFIX?
DEC_FLOAT_LIT ::= DECIMAL_LIT DEC_EXPONENT
| DECIMAL_LIT "." DECIMAL_LIT DEC_EXPONENT?
HEX_FLOAT_LIT ::= "0" ("x" | "X") HEX_DIGITS ("." HEX_DIGITS)? HEX_EXPONENT
HEX_DIGITS ::= HEX_DIGIT ("_"* HEX_DIGIT)*
DEC_EXPONENT ::= ("e" | "E") ("+" | "-")? DIGIT+
HEX_EXPONENT ::= ("p" | "P") ("+" | "-")? DIGIT+
REAL_SUFFIX ::= "d" | "D" | "f" | "F"
A decimal floating-point literal either has an exponent, or has a fractional part with digits on both sides of the . and an optional exponent. A leading . or a trailing . is therefore not part of a floating-point literal. A hexadecimal floating-point literal always requires a p binary exponent.
The suffix is case-insensitive: f denotes the float type and d denotes the double type.
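The REAL production can likewise be transcribed into a regular expression to see which spellings it admits. The following Python sketch is illustrative only; `real_kind` is a hypothetical helper that returns the type selected by the suffix, the string `"unsuffixed"` when there is no suffix, or `None` when the text is not a floating-point literal under the grammar above.

```python
import re

DEC = r"[0-9](?:_*[0-9])*"
HEX = r"[0-9a-fA-F](?:_*[0-9a-fA-F])*"

REAL = re.compile(
    rf"(?:0[xX]{HEX}(?:\.{HEX})?[pP][+-]?[0-9]+"  # HEX_FLOAT_LIT: exponent required
    rf"|{DEC}[eE][+-]?[0-9]+"                     # DECIMAL_LIT DEC_EXPONENT
    rf"|{DEC}\.{DEC}(?:[eE][+-]?[0-9]+)?)"        # digits on both sides of "."
    r"(?P<suffix>[dDfF])?"
)


def real_kind(text):
    """Classify text as a floating-point literal by its suffix."""
    match = REAL.fullmatch(text)
    if match is None:
        return None
    suffix = (match.group("suffix") or "").lower()
    return {"f": "float", "d": "double", "": "unsuffixed"}[suffix]
```

The sketch accepts `1.5`, `1e10`, `1.5e-3f`, and `0x1.8p3`, and rejects `1.`, `.5`, and `0x1.8`, the last because a hexadecimal form without a p exponent is not a floating-point literal.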
Character literals¶
A character literal is one or more characters enclosed in single quotes.
UNICODE_CHAR_NO_QUOTE is any character other than a control character (U+0000–U+001F), a backslash, or a single quote. A line terminator may not appear inside a character literal. Backslash escape sequences are described under Backslash escapes.
A character literal may contain 1, 2, 4, 8, or 16 bytes of character data; the resulting constant and its type are specified in Constants.
String literals¶
A string literal is a sequence of characters enclosed in double quotes.
The same character and escape rules apply as for character literals: no raw control characters, no line terminator, backslash escapes permitted. A string literal therefore may not span multiple lines.
Adjacent string literals are concatenated into a single string constant; a character literal participates in the same concatenation.
Raw string literals¶
A raw string literal is enclosed in backticks. No escape processing is performed on its contents: every character stands for itself, and the literal may span multiple lines. A literal backtick is written by doubling it (``).
RAW_CHAR is any character other than a backtick. A raw string literal yields an ordinary string constant and concatenates with adjacent string and character literals.
Byte data literals¶
A byte data literal denotes a sequence of raw bytes. It is introduced by x (hexadecimal) or b64 (Base64) and may use double-quote, single-quote, or backtick delimiters; the backtick and single-quote forms may be broken across lines, with intervening white space ignored.
BYTES ::= HEX_BYTES | B64_BYTES
HEX_BYTES ::= "x" ( '"' ... '"' | "'" ... "'" | "`" ... "`" )
B64_BYTES ::= "b64" ( '"' ... '"' | "'" ... "'" | "`" ... "`" )
A hexadecimal byte literal contains hexadecimal digits, each pair denoting one byte. A Base64 byte literal contains Base64-encoded data. Adjacent byte data literals are concatenated. The precise content rules and the resulting constant are specified in Constants.
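A rough Python sketch of decoding the hexadecimal form is given below. It is only an approximation, since the precise content rules live in Constants: it assumes that white space is ignored inside all three delimiter styles and that an odd number of digits is an error. The name `decode_hex_bytes` is hypothetical, and Base64 literals are not handled.

```python
import re

# "x" followed by a body delimited by matching ", ' or ` characters.
HEX_BYTES = re.compile(r"x(?P<quote>[\"'`])(?P<body>[^\"'`]*)(?P=quote)", re.S)


def decode_hex_bytes(literal):
    """Decode one hexadecimal byte data literal into a bytes object."""
    match = HEX_BYTES.fullmatch(literal)
    if match is None:
        raise ValueError("not a hexadecimal byte data literal")
    digits = "".join(match.group("body").split())  # assumption: white space ignored
    if len(digits) % 2 != 0:
        raise ValueError("hex digits must come in pairs, one pair per byte")
    return bytes.fromhex(digits)  # raises ValueError on a non-hex character


def decode_adjacent(literals):
    """Adjacent byte data literals are concatenated."""
    return b"".join(decode_hex_bytes(lit) for lit in literals)
```

For instance, `decode_hex_bytes('x"DEADBEEF"')` yields the four bytes `de ad be ef`, and `decode_adjacent(['x"01"', "x'02 03'"])` yields `01 02 03`.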
Backslash escapes¶
Within character literals and string literals (but not raw string literals), the backslash introduces an escape sequence:
\0 the zero byte (0x00)
\a alert / bell (0x07)
\b backspace (0x08)
\e escape (0x1B)
\f form feed (0x0C)
\n newline (0x0A)
\r carriage return (0x0D)
\t horizontal tab (0x09)
\v vertical tab (0x0B)
\\ backslash (0x5C)
\' single quote (0x27)
\" double quote (0x22)
\xNN one byte, two hex digits
\uNNNN a two-byte Unicode value, four hex digits
\UNNNNNNNN a four-byte Unicode value, eight hex digits
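The escape table translates directly into a small decoder. The Python sketch below is a minimal illustration: the name `decode_escapes` is hypothetical, and it returns one integer per character or escape sequence, leaving open how those values are finally stored, which this section does not specify.

```python
import string

# Single-character escapes and the byte value each one denotes.
SIMPLE = {
    "0": 0x00, "a": 0x07, "b": 0x08, "e": 0x1B, "f": 0x0C, "n": 0x0A,
    "r": 0x0D, "t": 0x09, "v": 0x0B, "\\": 0x5C, "'": 0x27, '"': 0x22,
}
# Escapes followed by a fixed number of hexadecimal digits.
HEX_DIGITS = {"x": 2, "u": 4, "U": 8}


def decode_escapes(body):
    """Decode the text between the quotes into a list of integer values."""
    values, pos = [], 0
    while pos < len(body):
        ch = body[pos]
        if ch != "\\":
            values.append(ord(ch))
            pos += 1
            continue
        if pos + 1 >= len(body):
            raise ValueError("backslash at end of literal")
        kind = body[pos + 1]
        if kind in SIMPLE:
            values.append(SIMPLE[kind])
            pos += 2
        elif kind in HEX_DIGITS:
            count = HEX_DIGITS[kind]
            digits = body[pos + 2:pos + 2 + count]
            if len(digits) != count or any(d not in string.hexdigits for d in digits):
                raise ValueError(f"\\{kind} requires exactly {count} hex digits")
            values.append(int(digits, 16))
            pos += 2 + count
        else:
            raise ValueError(f"unknown escape sequence \\{kind}")
    return values
```

For example, `decode_escapes(r"a\tb")` returns `[0x61, 0x09, 0x62]`, and an unlisted sequence such as `\q` raises `ValueError`.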
The boolean literals¶
The keywords true and false are the two literals of the boolean type bool.
The null literal¶
The keyword null is the literal pointer value whose address is zero. Its type and conversions are specified in Types.
Constants¶
A constant is an immutable, named value. A constant declaration binds a CONST_IDENT to the value of its initializer.
A constant declaration binds exactly one name and is terminated by a semicolon. It may appear at module level or within the body of a function.
A constant declaration falls into one of three categories. A declaration preceded by extern declares an extern constant; the extern prefix is permitted only at module level, and the declaration has no initializer. A declaration that is not extern and specifies a type declares a typed constant. A declaration that is not extern and omits the type declares an untyped constant. A non-extern declaration must have an initializer.
A constant declared within the body of a function is a local constant; it has local visibility and static storage duration.
The initializer is an expression. The grammar does not restrict it, but it must evaluate to a constant value.
A constant value belongs to one of two categories. A compile-time constant can participate in compile-time constant folding; a runtime constant cannot. The classification rules are given in Compile-time and runtime constants.
Types, reflection values, and member references have no runtime representation. They may not be bound to a constant; they must be bound to a compile-time variable.
A constant declaration may carry attributes. The applicable attributes are listed in Attributes.
Typed constants¶
A typed constant is a non-extern constant declaration that specifies a type.
The initializer must be assignable to the specified type. Assignability is defined in Properties of types and values.
The specified type may be an optional type. It may have an inferred length, in which case the length is taken from the initializer.
A typed constant exists at runtime. Its address may be taken.
Untyped constants¶
An untyped constant is a non-extern constant declaration that omits the type.
An untyped constant differs from a typed constant in three respects:
- Its type is the type of its initializer.
- It does not exist at runtime.
- Its address may not be taken.
Extern constants¶
An extern constant is a constant declaration preceded by extern.
An extern constant must specify a type, has no initializer, and may be declared only at module level.
The value of an extern constant is not available during compilation. An extern constant is therefore not a constant expression, and may not be used where a constant value is required, including as the initializer of another constant.
Constant expressions¶
A constant expression is an expression that evaluates to a constant value.
Syntactically, a constant expression is an expression that excludes the assignment operators. Whether an expression is a constant expression, and the category of its value, is determined during semantic analysis.
The following require a compile-time constant: array lengths, vector lengths, bitstruct member ranges, enum values, attribute arguments, and the operands of $assert, $error, $switch, $case, and $embed.
A constant initializer and a global variable initializer accept a constant value of either category.
Variables¶
A variable is a named, mutable value held in storage. A variable declaration binds an IDENTIFIER to a storage location of a specified or inferred type.
A variable declaration may appear at module level or within the body of a function. A variable declared at module level is a global variable; a variable declared in a function body is a local variable.
Every variable has a storage duration that determines its lifetime: the portion of program execution during which storage is reserved for the variable. A variable has a constant address and retains its last-stored value throughout its lifetime. Accessing a variable outside its lifetime is undefined behaviour.
There are three storage durations:
- A variable with static storage duration exists for the entire execution of the program. Its initializer, if any, is evaluated once before program startup.
- A variable with automatic storage duration has storage allocated for the lifetime of the enclosing function call. The variable's name is in scope only within the block in which it is declared, but its storage remains valid until the function returns. If an initializer is given, it is evaluated each time the declaration is reached during execution; otherwise the variable is zero-initialized each time the declaration is reached.
- A variable with thread-local storage duration exists for the lifetime of the thread for which it is created. Each thread that accesses the variable has a distinct instance, and each instance's initializer is evaluated when its thread starts.
A global variable has static storage duration by default. A local variable has automatic storage duration by default. The modifiers static and tlocal change a local variable's storage duration to static or thread-local, respectively.
A variable without an initializer is implicitly zero-initialized. The @noinit attribute may be used to leave a variable uninitialized; a variable whose type is marked @mustinit may not use @noinit.
Variables and constants are mutually exclusive. A constant is declared with const and bound to a CONST_IDENT; a variable is declared without const and bound to an IDENTIFIER.
A separate kind of declaration introduces compile-time variables, which exist only during compilation. They are described in their own subsection below.
A variable declaration may carry attributes. The applicable attributes are listed in Attributes. Visibility of a global variable is controlled by @public, @private, and @local; see Modules.
Global variables¶
A global variable is declared at module level.
The type is required. Each name must be an IDENTIFIER. A single declaration may bind several names by separating them with commas; a declaration that binds multiple names may not have an initializer.
The initializer, if present, must be a constant expression (see Constant expressions). It may be either a compile-time or a runtime constant.
A global variable has static storage duration. If the declaration is preceded by tlocal, the variable instead has thread-local storage duration; see Thread-local variables.
A global variable preceded by extern is an extern global variable; see Extern global variables.
Extern global variables¶
An extern global variable is a global variable whose definition is provided elsewhere and resolved at link time. It is declared with the extern prefix and has no initializer.
An extern global variable must specify a type. It is otherwise subject to the same rules as a non-extern global variable, including the use of tlocal to give it thread-local storage duration.
Local variables¶
A local variable is declared within the body of a function.
local_var_decl ::= type IDENT ("," IDENT)* attributes? ("=" expression)? ";"
| "var" IDENT attributes? "=" expression ";"
In the first form the type is given explicitly. A single declaration may bind several names by separating them with commas; a declaration that binds multiple names may not have an initializer. In the second form the keyword var introduces the declaration and the type is inferred from the initializer; the initializer is required, and only a single name may be bound.
The initializer is an expression. It need not be constant.
A local variable has automatic storage duration. Its storage is allocated for the lifetime of the enclosing function call: the variable's name is in scope only within the block in which it is declared, but its storage remains valid until the function returns. If an initializer is given, it is evaluated each time the declaration is reached during execution; otherwise the variable is zero-initialized each time the declaration is reached.
The modifiers static and tlocal change a local variable's storage duration; see Static local variables and Thread-local variables. The two modifiers may not appear together, and neither may be combined with the var form.
Thread-local variables¶
A variable preceded by tlocal has thread-local storage duration: a distinct instance of the variable exists for each thread that accesses it, and each instance is initialized when its thread starts and persists for the lifetime of that thread.
A tlocal local variable is, in effect, a thread-local global variable whose name is visible only within the enclosing function.
The initializer of a tlocal variable, if present, must be a constant expression.
Static local variables¶
A local variable preceded by static is, in effect, a global variable whose name is visible only within the enclosing function. It has static storage duration: a single instance persists for the lifetime of the program, and its initializer is evaluated once before program startup.
static is permitted only on local variable declarations; a global variable already has static storage duration.
The initializer of a static local variable, if present, must be a constant expression.
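As an illustration of the storage rules above, the following minimal sketch contrasts an automatic local, a @noinit local, and a static local. The function name next_id is hypothetical, and io::printfn from std::io is assumed to be available.

```c3
import std::io;

// Hypothetical helper: returns 1, 2, 3, ... on successive calls.
fn int next_id()
{
    static int counter;   // static storage duration: zero-initialized once, persists across calls
    counter++;
    return counter;
}

fn void main()
{
    int zeroed;           // automatic, no initializer: implicitly zero-initialized
    int scratch @noinit;  // automatic, deliberately left uninitialized
    scratch = 5;          // must be written before it is read
    next_id();
    next_id();
    io::printfn("%d %d %d", zeroed, scratch, next_id());   // prints "0 5 3"
}
```

Because counter behaves like a global whose name is visible only inside next_id, the third call observes the increments made by the first two.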
Compile-time variables¶
A compile-time variable exists only during compilation. It has no runtime representation, and its address may not be taken.
A compile-time variable holds either a value or a type, distinguished by the case of its name:
- A compile-time value variable is named with a CT_IDENT ($name).
- A compile-time type variable is named with a CT_TYPE_IDENT ($Name).
A compile-time variable is introduced by one of the following forms:
ct_var_decl ::= "var" CT_IDENT ("=" expression)? ";"
| "var" CT_TYPE_IDENT ("=" expression)? ";"
| type CT_IDENT ("=" expression)? ";"
A compile-time value variable may be untyped (var $x), typed by the var form with the type inferred from the initializer, or typed explicitly by giving a type in place of var. A compile-time type variable may not be given an explicit type.
An initializer is optional. If present, it must be a constant expression for a value variable, or denote a type for a type variable.
Attributes may not be applied to a compile-time variable. A compile-time variable may be declared within a function body or a macro body.
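The declaration forms above can be sketched as follows. This is an illustrative example only: the variable names are hypothetical, and it assumes that the compiler in use accepts all three forms exactly as the grammar states.

```c3
import std::io;

fn void main()
{
    var $count = 2;                 // untyped compile-time value variable
    $count = $count * 21;           // reassigned during compilation; now 42
    int $limit = 100;               // explicitly typed compile-time value variable
    var $Type = long;               // compile-time type variable
    $Type total = $count + $limit;  // at runtime this is simply: long total = 142;
    io::printfn("%d", total);
}
```

None of $count, $limit, or $Type exists in the compiled program; only the runtime variable total does.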
Types¶
A type determines a set of values together with the operations applicable to those values. A type is either named or expressed as a type literal.
The built-in types — booleans, integer types, floating-point types, void, any, typeid, and fault — are predeclared. Named user-defined types are introduced by struct, union, bitstruct, enum, constdef, interface, typedef, and alias declarations, each described below. A type literal constructs a type from existing types: pointers, arrays, slices, vectors, optionals, and function types.
type ::= type_name | type_literal
type_name ::= path? TYPE_IDENT | builtin_type
path ::= IDENTIFIER "::" (IDENTIFIER "::")*
builtin_type ::= "void" | "bool" | "any" | "typeid" | "fault"
| "ichar" | "char" | "short" | "ushort"
| "int" | "uint" | "long" | "ulong"
| "int128" | "uint128"
| "iptr" | "uptr" | "sz" | "usz"
| "float" | "double" | "untypedlist"
The syntax of each type literal is given in the corresponding subsection.
Boolean types¶
The type bool represents truth values. It has two values, true and false. A bool occupies one byte of storage.
Integer types¶
C3 has fourteen built-in integer types: seven signed and seven unsigned. The first five pairs have fixed power-of-two widths; the remaining two pairs (four types) are platform-dependent.
| Signed | Unsigned | Width |
|---|---|---|
| ichar | char | 8 bits |
| short | ushort | 16 bits |
| int | uint | 32 bits |
| long | ulong | 64 bits |
| int128 | uint128 | 128 bits |
| iptr | uptr | same width as void* |
| sz | usz | width of the maximum pointer difference |
A signed type with N bits represents values in the range −2^(N−1) to 2^(N−1) − 1. An unsigned type with N bits represents values in the range 0 to 2^N − 1.
Integer arithmetic uses two's complement representation. Signed overflow wraps and does not produce undefined behaviour.
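A short sketch of the range and wrapping rules for the 8-bit pair (variable names are hypothetical):

```c3
import std::io;

fn void main()
{
    char top = 255;     // largest 8-bit unsigned value, 2^8 - 1
    ichar high = 127;   // largest 8-bit signed value, 2^7 - 1
    top++;              // wraps to 0
    high++;             // wraps to -128; this is defined behaviour
    io::printfn("%d %d", top, high);   // prints "0 -128"
}
```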
Floating-point types¶
C3 has two floating-point types: float is 32 bits and double is 64 bits. Both follow the IEEE 754 binary representation for their respective widths.
The void type¶
The type void represents the absence of a value. It is used as the return type of a function that produces no value, and as the pointed-to type of void*. A void value cannot be stored or named.
Pointer types¶
A pointer type denotes the address of an object of a given type:
A pointer holds the address of an object of the pointed-to type, or the literal value null.
The literal null converts implicitly to any pointer type. The type void* is a wildcard pointer: it converts implicitly to and from any other pointer type.
Pointer arithmetic follows the same rules as C: p + i advances p by i elements (each T::size bytes), p - i retreats by the same amount, and p - q for two pointers to the same element type yields a signed integer count of elements between them.
Pointer arithmetic on void* is supported and treats the element size as 1, identical to pointer arithmetic on char*.
Subscripting a pointer is equivalent to pointer arithmetic followed by a dereference. The index may be negative; pointer subscripting is never bounds-checked.
A pointer of any type may be converted losslessly to iptr or uptr and back.
A void* may not be directly dereferenced or subscripted; it must first be cast to a non-void pointer type.
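The pointer rules above can be illustrated with a small sketch over a four-element array (names are hypothetical). Note that nothing checks the pointer subscripts; the example stays inside the array on purpose.

```c3
import std::io;

fn void main()
{
    int[4] data = { 10, 20, 30, 40 };
    int* p = &data[1];
    int* q = &data[3];
    io::printfn("%d", *(p + 2));   // p advanced by two elements: prints 40
    io::printfn("%d", p[-1]);      // negative subscript, same as *(p - 1): prints 10
    io::printfn("%d", q - p);      // element count between the pointers: prints 2
}
```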
Array types¶
An array type holds a fixed number of values of an element type:
The expression must be a compile-time constant expression of integer type denoting the array length. The form type[*] is permitted where the length can be inferred from an initializer; the inferred length becomes part of the type.
The length is part of the type, so int[3] and int[4] are distinct. An array is a value: assignment, parameter passing, and return copy the elements.
A pointer to an array, type[N]*, implicitly converts to a pointer to the first element, type*.
An array must have at least one element; Type[0] is not a valid type.
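As a minimal sketch of length inference and value semantics (names are hypothetical):

```c3
import std::io;

fn void main()
{
    int[*] a = { 1, 2, 3 };   // length inferred from the initializer: the type of a is int[3]
    int[3] b = a;             // arrays are values: this copies all three elements
    b[0] = 99;                // modifies the copy only
    io::printfn("%d %d", a[0], b[0]);   // prints "1 99"
}
```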
Slice types¶
A slice type denotes a view into a contiguous sequence of elements:
A slice is a pair consisting of a pointer to the element sequence and an integer length. The fields .ptr and .len provide these components.
A slice is obtained by taking the address of an array, by slicing an array, slice, or vector with a range expression, or by allocating a sequence of elements at runtime.
Indexing a slice is range-checked in safe builds.
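The following sketch shows two of the ways of obtaining a slice listed above. It assumes the inclusive start..end range form; names are hypothetical.

```c3
import std::io;

fn void main()
{
    int[5] arr = { 1, 2, 3, 4, 5 };
    int[] all = &arr;        // slice covering the whole array
    int[] mid = arr[1..3];   // inclusive range: elements at indices 1, 2 and 3
    mid[0] = 20;             // a slice is a view: this writes arr[1]
    io::printfn("%d %d %d", all.len, mid.len, arr[1]);   // prints "5 3 20"
}
```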
Vector types¶
A vector type holds a fixed number of values that may be operated on using SIMD instructions:
The element type must be one of:
- bool,
- an integer type,
- a floating-point type,
- a pointer type,
- an enum type,
- a typedef whose underlying type is one of the above.
The length must be a compile-time constant expression of integer type; the form type[<*>] is permitted where the length can be inferred from an initializer.
A plain vector such as int[<3>] has the same size and ABI representation as the corresponding array type (int[3]): element alignment, no padding. When used in arithmetic or bitwise expressions, the operations are applied elementwise using SIMD instructions where available.
The @simd attribute declares a SIMD aligned vector. A SIMD aligned vector must have a length that is a power of two, and has platform SIMD alignment (typically it will match the size of the vector). As locals and globals, plain vectors and SIMD aligned vectors are treated identically in terms of alignment; the distinction arises when the vector is embedded inside a struct or an array, or appears at the ABI boundary. In those contexts a plain vector has the alignment of the corresponding array type, while a @simd vector retains its SIMD alignment.
In an arithmetic or bitwise operation between a vector and a scalar, the scalar is widened by replicating its value to every element.
A vector must have at least one element; for example, int[<0>] is not a valid type.
A vector implicitly converts to the corresponding array type and vice versa.
It is possible to take the address of a single element. Vectors can be sliced.
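A minimal sketch of elementwise arithmetic, scalar replication, array conversion, and element addressing (names are hypothetical):

```c3
import std::io;

fn void main()
{
    int[<4>] a = { 1, 2, 3, 4 };
    int[<4>] b = { 10, 20, 30, 40 };
    int[<4>] sum = a + b;       // elementwise: { 11, 22, 33, 44 }
    int[<4>] scaled = a * 2;    // scalar 2 is replicated: { 2, 4, 6, 8 }
    int[4] as_array = sum;      // implicit vector-to-array conversion
    int* third = &sum[2];       // address of a single element
    io::printfn("%d %d %d %d", sum[0], scaled[3], as_array[1], *third);   // prints "11 8 22 33"
}
```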
Field access and swizzling¶
The elements of a vector at indices 0, 1, 2, 3 may be referred to by the field names x, y, z, w, or by the alternate set r, g, b, a. A single field access denotes the corresponding element value: for a vector v, v.x is the element at index 0, v.r is also the element at index 0, and so on. Field-name indices beyond the vector's length are an error.
A swizzle expression concatenates several such field names to form a new vector whose elements are the corresponding elements of the source. The width of the result equals the number of field names in the swizzle:
int[<4>] v = { 10, 20, 30, 40 };
int[<2>] a = v.xz; // { 10, 30 }
int[<9>] b = v.xxxzzzyyy; // { 10, 10, 10, 30, 30, 30, 20, 20, 20 }
There is no restriction on the ordering of the field names within a swizzle, and the same field may be repeated. The two field name sets (xyzw and rgba) may not be mixed within a single swizzle: v.rgz is an error.
A swizzle expression is an lvalue when no index is repeated; assigning to such an lvalue writes the corresponding source elements. For example, v.zy = e is well-formed when e is a 2-element vector; v.xxy = e is not, because index 0 appears twice on the left.
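Swizzle assignment can be sketched as follows, assuming the elements of the right-hand side are written in the order the field names appear (names are hypothetical):

```c3
import std::io;

fn void main()
{
    int[<4>] v = { 10, 20, 30, 40 };
    int[<2>] e = { 1, 2 };
    v.zy = e;   // index 2 receives 1, index 1 receives 2
    io::printfn("%d %d %d %d", v.x, v.y, v.z, v.w);   // prints "10 2 1 40"
}
```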
Increment and decrement¶
The unary ++ and -- operators applied to a vector are applied elementwise. For a vector v, v++ returns the original vector and replaces each element with its incremented value; ++v returns the incremented vector and stores it back into v. The operators are valid for vectors of integer element type.
Enum vectors¶
A vector whose element type is an enum is an enum vector. An enum vector supports the accessor .ordinal, which produces a vector of the enum's backing integer type holding the ordinal of each element. The static method Ty::from_ordinal, when invoked on an integer vector, returns an enum vector of Ty whose elements correspond to the supplied ordinals.
Vector size limit¶
A compiler may impose a maximum total bit width on a vector. The limit is at least as wide as the largest SIMD vector supported by the target. A typical limit is 4096 bits. For the purpose of this limit, a boolean vector is counted as 8 bits per element.
Struct types¶
A struct type is a named sequence of fields stored in declaration order:
(The full grammar of struct_union_body is given in Declarations.)
A struct must declare at least one member. A flexible array member (see below) does not by itself satisfy this requirement: a struct that contains a flexible array member must also declare at least one preceding member.
Field access uses dot notation. The dot operator also applies to a single level of pointer-to-struct: if p is of type St* and f is a field of St, then p.f denotes the field of the pointee.
A field may be declared inline. Such a field designates an inline member: values of the struct then implicitly convert to the type of that field, and methods of that type are accessible through the enclosing struct. See Properties of types and values.
Anonymous nested structs and unions are permitted, following C99 conventions.
Layout attributes (@align, @packed, @compact, @nopadding) control storage representation. See Attributes.
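A short sketch of dot access through a pointer and of an inline member. The type and function names are hypothetical.

```c3
import std::io;

struct Base
{
    int id;
}

struct Derived
{
    inline Base base;   // inline member: Derived converts implicitly to Base
    int extra;
}

fn int get_id(Base* b)
{
    return b.id;        // dot applies through one level of pointer
}

fn void main()
{
    Derived d = { .base = { .id = 7 }, .extra = 1 };
    Derived* p = &d;
    io::printfn("%d", p.extra);      // prints 1
    io::printfn("%d", get_id(&d));   // Derived* is accepted where Base* is expected: prints 7
}
```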
Flexible array member¶
The last field of a struct may be declared as an array with no specified length, of the form Ty[*]. Such a field is a flexible array member: it contributes no size to the struct itself (the struct's size is that of the preceding fields plus any required tail padding for alignment), but the storage of an instance may extend past the struct's declared size to hold elements of the array. The number of elements is determined by the size of the allocation rather than by the type.
A struct containing a flexible array member may not be embedded as a field of another struct, used as an element of an array, or copied by value; a value of such a struct is meaningful only through a pointer to underlying storage of sufficient size.
Union types¶
A union type is declared like a struct, but its fields share storage:
A union must declare at least one member.
All fields of a union share storage beginning at the same address. The alignment of a union is the maximum alignment requirement of any of its fields; consequently, any member access through a union pointer is correctly aligned regardless of which member is read. The size of a union is the size of its largest field rounded up to the nearest multiple of the union's alignment.
Writing a member of type Ty stores that value's bit pattern in the first Ty::size bytes of the union's storage. Reading a member of type Un interprets the first Un::size bytes of the union's storage as a value of type Un.
When the most recently written member is of type Ty:
- If Un::size ≤ Ty::size, all bytes read are part of Ty's written representation; the result is those bytes reinterpreted as type Un. The result is fully defined.
- If Un::size > Ty::size, the first Ty::size bytes hold Ty's written representation; the bytes in the range [Ty::size, Un::size) hold unspecified values, and the result may be any value representable in type Un.
A union may therefore be used as a controlled way to reinterpret a bit pattern, provided the member being read is no wider than the member most recently written.
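For instance, the fully defined case (both members are 32 bits wide) can be sketched like this. The union name FloatBits is hypothetical.

```c3
import std::io;

union FloatBits
{
    float value;
    uint bits;     // same size as float, so reading it after writing value is fully defined
}

fn void main()
{
    FloatBits u;
    u.value = 1.0f;
    io::printfn("%x", u.bits);   // IEEE 754 bit pattern of 1.0f: prints 3f800000
}
```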
Anonymous nested structs and unions are permitted, as in struct types.
Bitstruct types¶
A bitstruct type is a struct whose fields occupy specified bit ranges within a backing storage:
bitstruct_decl ::= "bitstruct" TYPE_IDENT ("(" type ("," type)* ")")? ":" type attributes? "{" bitstruct_body "}"
The backing type is either an integer type, a character array, or a typedef whose underlying type is one of these. Each field of a bitstruct must be an integer type or bool; each field specifies a single bit position or an inclusive bit range within the backing storage.
A bitstruct field is not addressable.
By default, fields of a bitstruct may not overlap. The @overlap attribute permits overlapping ranges. Endianness of the underlying storage follows the host system by default, but may be set explicitly with @bigendian or @littleendian.
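A minimal sketch of a bitstruct over an 8-bit backing type, with one single-bit field and one inclusive range. The type and field names are hypothetical.

```c3
import std::io;

bitstruct Status : char
{
    bool ready : 0;      // single bit position
    uint mode  : 1..3;   // inclusive bit range, three bits wide
    bool error : 7;
}

fn void main()
{
    Status s;            // zero-initialized: all fields start cleared
    s.ready = true;
    s.mode = 5;          // fits in three bits
    io::printfn("%d", s.mode);   // reads the field back: prints 5
}
```

The fields above do not overlap, so no @overlap attribute is needed.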
Enum types¶
An enum type is a finite ordered set of named values, optionally backed by an integer type and optionally carrying associated values:
enum_decl ::= "enum" TYPE_IDENT ("(" type ("," type)* ")")? (":" "inline"? integer_type? enum_param_list?)? attributes? "{" enum_body "}"
Each enum value has an ordinal: its position in the declaration, beginning at zero. Ordinals are consecutive; an enum type defines no gaps.
If a parameter list follows the backing type, each declared value supplies a value for each associated parameter; these are accessed through the value as if they were fields.
An enum is converted to and from its ordinal via the properties .ordinal and from_ordinal; explicit casts to and from the backing integer type are also permitted.
Constdef types¶
A constdef declaration introduces a constdef type: a set of named constants of a backing type, with explicitly chosen values that need not be consecutive.
constdef_decl ::= "constdef" TYPE_IDENT ("(" type ("," type)* ")")? (":" "inline"? type)? attributes? "{" constdef_body "}"
If the backing type is omitted, it is taken to be int. Values that are not explicitly assigned take the value of the previous value plus one.
Unlike enum, a constdef has no ordinal: its values are those of its constants. Constdef values do not implicitly convert to or from the backing type; conversions are made by explicit cast unless inline is given on the backing type, in which case values convert implicitly to the backing type.
A constdef declaration may carry the attribute @constinit to permit literals of the backing type to implicitly convert to the constdef type.
Fault types¶
The type fault is the type of fault values. Fault values are declared with faultdef:
Each declared name is a value of type fault. Fault values are used as the excuse of an empty optional and are described further under Optionals and faults.
The fault type has the alignment, size, and underlying representation of uptr. The zero fault value — the absence of a fault — may be produced implicitly by casting from null or from {}. An optional constructed from the zero fault value carries the fault state but holds no useful underlying value; subsequent operations on it propagate the fault and produce unspecified underlying values.
Optional types¶
An optional type holds either a result of a base type or an empty optional carrying a fault value as its excuse:
The base type may not itself be optional. The optional void? is permitted only as a function return type; an optional may not otherwise have base type void.
An optional type has the same size and alignment as its base type. The presence or absence of a result is tracked separately; it does not add overhead to the stored value.
The use, propagation, and handling of optionals is specified in Optionals and faults.
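As a rough sketch of how a fault and an optional fit together, consider the following. The fault name and function are hypothetical, and the example assumes the suffix ? for turning a fault into an empty optional and the ?? operator for supplying a fallback; both belong to Optionals and faults rather than to this section.

```c3
import std::io;

faultdef DIVISION_BY_ZERO;   // hypothetical fault value

fn int? divide(int a, int b)
{
    if (b == 0) return DIVISION_BY_ZERO?;   // empty optional with this fault as its excuse
    return a / b;                            // optional holding a result
}

fn void main()
{
    int ok = divide(10, 2) ?? -1;       // result present: 5
    int failed = divide(1, 0) ?? -1;    // empty optional: falls back to -1
    io::printfn("%d %d", ok, failed);   // prints "5 -1"
}
```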
Function types¶
A function type describes the signature of a function:
A function type is not itself a first-class type; it is used through a typedef or alias to declare a function pointer type:
A value of an aliased function type holds the address of a function (or null). Function pointer types may carry default argument values and named parameters; see Functions and methods.
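A minimal sketch of a function pointer type declared through an alias (the names BinaryOp and add are hypothetical):

```c3
import std::io;

alias BinaryOp = fn int(int, int);   // names a function pointer type

fn int add(int a, int b)
{
    return a + b;
}

fn void main()
{
    BinaryOp op = &add;            // holds the address of add
    io::printfn("%d", op(2, 3));   // call through the pointer: prints 5
}
```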
Distinct types¶
A typedef declaration introduces a new type derived from an existing type:
typedef_decl ::= "typedef" TYPE_IDENT ("(" type ("," type)* ")")? attributes? "=" "inline"? type ";"
The new type is distinct from its underlying type: values of one do not implicitly convert to the other. Literals do not implicitly convert to a typedef type unless the typedef carries the @constinit attribute.
When the inline modifier is given, values of the typedef type implicitly convert to the underlying type, but not from it.
A typedef type has its own method set, and methods, attributes, and operator overloads may be defined for it.
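The conversion rules for distinct types can be sketched as follows. The type names are hypothetical, and explicit casts are used for initialization so that the example does not depend on @constinit.

```c3
import std::io;

typedef Meters = int;          // distinct from int in both directions
typedef UserId = inline int;   // converts implicitly to int, but not from it

fn void main()
{
    Meters m = (Meters)5;
    UserId id = (UserId)42;
    int raw = id;              // allowed: inline typedef to its underlying type
    int len = (int)m;          // a plain typedef needs an explicit cast
    // Meters bad = raw;       // rejected: int does not implicitly convert to Meters
    io::printfn("%d %d", raw, len);   // prints "42 5"
}
```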
Type aliases¶
An alias declaration introduces a new name for an existing type:
A type alias is fully equivalent to its underlying type; the two are interchangeable in every context. Unlike a typedef, an alias does not introduce a new type.
The alias keyword is also used to introduce aliases for functions, variables, and generic instantiations; those forms are described in Declarations.
Interface types¶
An interface type names a set of method signatures:
interface_decl ::= "interface" TYPE_IDENT (":" type ("," type)*)? attributes? "{" interface_body "}"
Each entry in the body is a method signature giving a name, return type, and parameter list. A signature marked @optional need not be implemented by every type that implements the interface.
An interface value has the same representation as any: a pointer paired with a typeid. Its size is twice the pointer width; its alignment is the pointer alignment. An implementing type must satisfy the method requirements of the interface and all interfaces it extends.
Any user-defined type — struct, union, bitstruct, enum, constdef, or typedef — may implement one or more interfaces. The interface list is given in parentheses after the type name in the type declaration, and each non-optional method must be provided as a @dynamic method. Aliases may not implement interfaces, as they introduce no new type. A value of an implementing type implicitly converts to the interface type. Conversion from an interface to a concrete type, or from any to an interface, is explicit and may fail at runtime.
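A rough sketch of declaring and implementing an interface. The names Shape, Square, and area are hypothetical, and the method declaration syntax is assumed to follow Functions and methods.

```c3
import std::io;

interface Shape
{
    fn double area();
}

struct Square (Shape)   // interface list in parentheses after the type name
{
    double side;
}

// Each non-optional interface method is provided as a @dynamic method.
fn double Square.area(Square* self) @dynamic
{
    return self.side * self.side;
}

fn void main()
{
    Square sq = { .side = 3.0 };
    Shape shape = &sq;                  // pointer to an implementing type converts implicitly
    io::printfn("%f", shape.area());    // dispatched at runtime through the interface
}
```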
The any type¶
The type any is a runtime-tagged reference: it pairs a pointer with a typeid identifying the type of the pointee. Its size is twice the pointer width; its alignment is the pointer alignment.
The fields .ptr and .type retrieve the pointer and the runtime type respectively. Any pointer type implicitly converts to any.
The typeid type¶
The type typeid is the type of values identifying types at runtime. The width of typeid is the same as the width of iptr.
Every type has a corresponding typeid value, obtained by the property ::typeid of the type. The typeid of the type identified by an any value is its .type field.
The untypedlist type¶
The type untypedlist is the type of compile-time lists which lack a definite type. An untyped list may be appended to and indexed into at compile time.
Because it is compile-time only, only compile-time variables may have this type. It may not exist at runtime.
Generic type instantiation¶
A type may be parameterized by a generic module. Such a type is instantiated by writing the type name followed by a brace-delimited list of type and constant arguments:
generic_instantiation ::= type_name "{" generic_arg ("," generic_arg)* "}"
generic_arg ::= type | expression
The same syntax is used to instantiate generic functions, macros, and global variables. The rules for declaring and using generics are given in Generics.
Properties of types and values¶
Underlying type¶
Every type has an underlying type.
- For a predeclared type or a type literal, the underlying type is the type itself.
- For a struct, union, bitstruct, enum, constdef, fault, or interface type, the underlying type is the declared type itself.
- For a type alias, the underlying type is the underlying type of the aliased type.
- For a typedef, the underlying type is the underlying type of the type from which it derives.
Inner type¶
For some types, an inner type is defined.
- The inner type of a pointer is the pointed-to type.
- The inner type of an array, slice, or vector is its element type.
- The inner type of an enum is its backing integer type.
- The inner type of a constdef is its backing type.
- The inner type of a bitstruct is its backing type.
- The inner type of a typedef is the type from which it derives.
Other types have no inner type.
Type identity¶
Two types are identical if they have the same name (for named types) or the same structure (for type literals). Two distinct declarations of struct, union, bitstruct, enum, constdef, interface, or typedef produce distinct types, even when their bodies are textually identical. A type alias is identical to the type it names.
Two typedef types with the same underlying type are nevertheless distinct.
Alignment¶
Every type has an alignment requirement: a positive integer power of two. An object of a given type must be stored at an address that is a multiple of the type's alignment requirement. The alignment of a type is available at compile time through its ::alignment property.
C3 distinguishes between ABI alignment — used when a type appears as a struct field, array element, or function parameter — and alloca alignment — used when a type is allocated as a local or global variable. For most types these are identical.
Alignment depends on the platform and the ABI compiled for. However, some types are derived from others, and their alignment is given below:
- bool: alignment is the same as char.
- typeid, iptr, uptr, fault, any, interface types: alignment is the same as for void*.
- void: 1 byte.
- Array types: the alignment of the element type.
- Slice types: max(alignof(void*), alignof(sz)), equal to the pointer alignment on all supported platforms.
- Plain vector types (ABI: embedded in struct or array, or passed as argument): the alignment of the element type, identical to the corresponding array type.
- Plain vector types (alloca: as a local or global variable): same as the SIMD aligned vector type.
- Struct types: the maximum alignment of any field. Fields are laid out in declaration order with padding inserted between adjacent fields as needed; trailing padding is added after the last field so that the total size is a multiple of the struct's alignment.
- Union types: the maximum alignment of any field. The size of a union is the size of its largest field, rounded up to the nearest multiple of the union's alignment. Fields share storage at the same address with no inter-field padding.
- Bitstruct types: the alignment of the backing type, unless overridden by @align.
- Enum and constdef types: the alignment of the backing integer type.
- Optional types (T?): the alignment of T. An optional type has the same size as T; the optional status is tracked separately and does not affect storage.
- Typedef and alias types: the alignment of the underlying or aliased type.
The @align(n) attribute raises the alignment of a struct, union, bitstruct, variable, or function to at least n, where n must be a compile-time constant power of two. Alignment may only be increased, not decreased. To reduce per-member alignment, @packed sets all member alignments to 1; a subsequent @align may then restore the aggregate's overall alignment.
Assignability¶
A value of type Va is assignable to a target of type Ty — for example, the right-hand side of an assignment, the initializer of a variable, an argument in a function call, or a value returned from a function — when any of the following holds:
- Va is identical to Ty.
- Va is a numeric literal whose value is representable in Ty, where Ty is a numeric type or a typedef or constdef of a numeric type with @constinit declared.
- Va is the literal null and Ty is a pointer type.
- Va is void* and Ty is any pointer type, or vice versa.
- Va is any non-void* pointer and Ty is void*.
- Va is any pointer type and Ty is any.
- Va is an interface type and Ty is any.
- Va is a numeric expression and Ty is a wider numeric type, subject to the rules in Implicit widening.
- Va is a numeric expression and Ty is a narrower numeric type, subject to the rules in Implicit narrowing.
- Va is a struct value or pointer with an inline member of type Ty (transitively), subject to the rules in Substruct conversions.
- Va is a pointer to a value whose type implements interface Ty.
- Va is an interface type that extends Ty (Ty is a parent interface of Va).
- Va is a typedef declared inline and Ty is its underlying type.
- Va is a constdef with a backing type declared inline and Ty is that backing type.
- Va is a vector type and Ty is the corresponding array type with the same element type and length, or vice versa.
- Va is a slice type and Ty is void* or a pointer to the element type of Va.
Outside these cases, conversion requires an explicit cast.
Common arithmetic promotion¶
Before arithmetic, the operands of an arithmetic operation are promoted according to the following rules:
- A floating-point operand of width less than 32 bits is promoted to float.
- An integer operand narrower than the arithmetic promotion width is promoted to an integer of the same signedness with that width.
The arithmetic promotion width is the width of a C int on the target platform. This is currently 32 bits on all supported target platforms.
Maximum type¶
When two operands of different numeric types appear in an operation that returns a single value, a maximum type is computed:
- Both operands undergo common arithmetic promotion.
- If the promoted types are identical, the maximum type is that type.
- If one is floating-point and the other is integer, the maximum type is the floating-point type.
- If both are floating-point, the maximum type is the wider type.
- If both are integer with the same signedness, the maximum type is the wider type.
- If both are integer with different signedness, the maximum type is determined by precedence in the list ichar, char, short, ushort, int, uint, long, ulong, int128, uint128; the operand later in the list wins.
If neither operand is a numeric type but at least one is a struct, or a pointer to a struct, with an inline member, the rules are applied recursively to the inline member's type. Other combinations have no defined maximum type.
Implicit widening¶
A numeric expression implicitly converts to a wider numeric type only when it is a simple expression. An expression is simple if its value is invariant under the choice of evaluation width — that is, widening its operands and then evaluating yields the same result as evaluating at the source width and then widening. Non-simple expressions require an explicit cast to widen.
When the target is a wider integer type, the non-simple forms are:
- Binary +, -, *: integer overflow at the source width gives a different value than at the wider width.
- Binary << and >>: bits shifted past the source width are lost, and sign extension depends on the source width.
- Unary -: negating the minimum signed integer overflows at the source width but is well-defined at a wider width.
- Unary ~: the high-order bits of the complemented value depend on the source width.
When the target is a floating-point type, the non-simple forms are:
- Binary +, -, *: integer overflow at the source width changes the value before conversion.
- Binary /: integer division and floating-point division produce different values.
- Unary -: same corner case as for integer targets.
A ternary expression cond ? a : b is non-simple if either branch a or b is non-simple. All other expressions are simple, including identifiers, literals, function calls, member access, subscripts, comparisons, logical operators, assignment expressions, bitwise &, |, ^, the operators %, ??, and (for integer targets) /, and the unary operators +, !, ++, --, * (dereference), and & (address-of).
Within simple expressions, the widening conversion itself is further restricted by signedness:
- Signed → wider signed: allowed.
- Unsigned → wider unsigned: allowed.
- Unsigned → wider signed: allowed (the unsigned range always fits).
- Signed → unsigned: never allowed implicitly, regardless of size; requires an explicit cast.
- Integer → float, or float → wider float: allowed.
Same-size conversions between types of different signedness (e.g., int → uint) are not widening and are not permitted implicitly.
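The widening rules above can be sketched as follows. This is a minimal illustration, not compiler output: the helper name widen_product and all variable names are hypothetical, and the commented-out lines mark what the rules in this section would reject.

```c3
import std::io;

// Hypothetical helper: casting one operand first makes the multiply run at 64 bits.
fn long widen_product(int a, int b)
{
    // return a * b;       // rejected per the rules above: binary * is non-simple
    return (long)a * b;
}

fn void main()
{
    int a = 1_000_000;
    uint u = 7;
    long x = a;            // simple expression: an identifier widens implicitly
    long z = u;            // unsigned to wider signed is allowed
    double d = a;          // integer to float is allowed
    // ulong bad = a;      // rejected: signed to unsigned is never implicit
    io::printfn("%d %d %s %d", x, z, d, widen_product(a, 3000));
}
```

Casting a single operand is enough, because the other operand is a plain identifier and therefore widens as a simple expression.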
Implicit narrowing¶
A numeric expression may implicitly convert to a narrower target type through a recursive structural analysis. The compiler walks the expression tree and verifies that every contributing value is already narrow enough to fit in the target type.
The traversal rules are:
- Arithmetic and bitwise operators (+, -, *, /, %, |, ^, &, ??): both operands are checked recursively; narrowing succeeds only if all operands pass.
- Shift operators (<<, >>): only the left operand is checked; the right operand does not affect the result type.
- Assignment operators: only the left operand is checked.
- Comparison and logical operators (==, !=, <, <=, >, >=, &&, ||): always succeed; these produce bool.
- Unary operators (+, -, ~, ++, --): the operand is checked recursively.
- Integer and float constants: succeed if the constant value fits in the target type.
- Widening casts: if the cast source type is directly compatible with the target (see below), the check succeeds; otherwise the inner expression is checked recursively.
- All other expressions (identifiers, calls, subscripts, etc.): succeed only if the expression's own type is compatible with the target type U:
- Both types are identical.
- Both are signed integers, and the source is no wider than U.
- Both are unsigned integers, and the source is no wider than U.
- Both are floats, and the source is no wider than U.
- The source is an unsigned integer and U is a signed integer of strictly greater width.
- Any other combination — including any signed-to-unsigned conversion — fails and requires an explicit cast.
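A short sketch of how the traversal rules above play out for small integer types. The function and variable names are hypothetical, and the commented-out lines are the cases these rules would reject.

```c3
// Hypothetical helper: both operands are char, so the sum narrows back to char.
fn char add_chars(char a, char b)
{
    char r = a + b;
    return r;
}

fn void main()
{
    char x = 1;       // char is an unsigned 8-bit type
    short y = 3;
    int z = 100;

    x = x + x;        // accepted: every operand is already char
    y = y + x;        // accepted: an unsigned char fits in the wider signed short
    x = 200;          // accepted: the constant fits in char
    // x = y + x;     // rejected: y is short, which is wider than char
    // x = 300;       // rejected: the constant does not fit in char
    // y = z;         // rejected: int is wider than short
    y = (short)z;     // an explicit cast is required to narrow z
}
```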
Substruct conversions¶
A struct may declare an inline member, which establishes a subtype relation between the struct and the inline member's type:
- A pointer to the substruct implicitly converts to a pointer to the inline member's type.
- A substruct value implicitly assigns to a variable of the inline member's type; the assignment copies only the inlined portion.
- The inverse conversions — from a value of the inline member's type to the substruct — are not implicit.
- Conversions between an array or slice of substructs and an array or slice of the inline member's type are not permitted, even by explicit cast.
These rules apply transitively through chains of inline members.
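The subtype relation can be sketched with two hypothetical structs, Base and Derived. The example assumes only the rules listed above: pointer and value conversions towards the inline member's type are implicit, and the inverse is not.

```c3
import std::io;

struct Base
{
    int id;
}

struct Derived
{
    inline Base base;   // inline member: Derived becomes a subtype of Base
    int extra;
}

fn int get_id(Base* b)
{
    return b.id;
}

fn void main()
{
    Derived d = { .base = { .id = 7 }, .extra = 1 };
    int via_pointer = get_id(&d);  // Derived* converts implicitly to Base*
    Base copy = d;                 // copies only the inlined Base portion
    // Derived back = copy;        // rejected: the inverse conversion is not implicit
    io::printfn("%d %d %d", via_pointer, copy.id, d.id);
}
```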
Vector conversions¶
A vector type implicitly converts to and from an array type with the same element type and length. All other vector conversions require an explicit cast.
When a boolean vector value is cast to a vector with integer element type, each true element yields a value with all bits set and each false element yields zero.
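A minimal sketch of the vector rules above, using hypothetical variable names. The value shown for the boolean cast follows from the all-bits-set rule stated in this section.

```c3
fn void main()
{
    int[<4>] v = { 1, 2, 3, 4 };
    int[4] arr = v;                    // vector to array: implicit
    int[<4>] w = arr;                  // array to vector: implicit
    bool[<4>] mask = { true, false, true, false };
    int[<4>] bits = (int[<4>])mask;    // true lanes become all bits set (-1), false lanes 0
}
```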
Casts¶
An explicit cast (type)expression produces a value of the specified type. A cast is permitted when any of the following holds:
- The source and target are numeric types.
- The source and target are pointer types.
- The source is a pointer type and the target is an integer type able to hold a pointer, or vice versa.
- The source and target are vector and array types with the same element type and length.
- The source and target differ only by chains of typedef and alias.
- The source is an interface type and the target is any, or vice versa.
A cast between numeric types whose result is not representable in the target type has implementation-defined behaviour.
Method sets¶
Every named type has a method set: the set of methods declared on that type. A method is declared by attaching a type name to the function name in a function declaration; see Functions and methods.
The method set of a typedef is distinct from the method set of the type from which it derives. A type alias shares the method set of the type it names.
A struct with an inline member makes the methods of the inline member's type accessible through values of the enclosing struct.
Method extension on built-in types¶
A method may be declared on any named type, including a built-in type. The method name is qualified by the type name in the declaration:
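As a minimal sketch, a hypothetical method squared could be attached to the built-in type int like this:

```c3
import std::io;

// Hypothetical method added to the built-in type int.
fn int int.squared(int self)
{
    return self * self;
}

fn void main()
{
    int n = 5;
    io::printfn("%d", n.squared());
}
```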
A method extension on a built-in type is visible according to the same module visibility rules as any other declaration.
Operator overloading¶
A method or macro method may participate in operator overloading by carrying an @operator attribute with one of the following arguments: [], &[], []=, len, or one of the operators +, -, *, /, %, ^, |, &, <<, >>, ==, <. The variant attributes @operator_s and @operator_r apply to binary operators between two distinct types and produce symmetric and reverse forms respectively.
Restrictions:
- Arithmetic and bitwise operator overloads are permitted only on user-defined types.
- A bitstruct type may not overload ^, |, or &, since these are predefined on bitstructs.
- Defining + implicitly defines +=, and similarly for the other arithmetic operators; an explicit overload of the compound-assignment form takes precedence when present.
- Defining == implicitly defines !=. Defining < together with == defines the full set of ordering operators <, <=, >=, >, ==, !=.
Overload resolution proceeds as follows:
- If an overload exactly matches the operand types, that overload is selected.
- Otherwise, if exactly one overload matches after applying implicit conversions to the non-self operand, that overload is selected.
- Otherwise, the operation is ambiguous and a compile-time error is reported.
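The following sketch shows overloads of + and == on a hypothetical user-defined type Vec2. It assumes the implicit definitions described above, so += and != are used without being declared.

```c3
import std::io;

struct Vec2
{
    int x;
    int y;
}

fn Vec2 Vec2.add(Vec2 self, Vec2 other) @operator(+)
{
    return (Vec2) { self.x + other.x, self.y + other.y };
}

fn bool Vec2.equals(Vec2 self, Vec2 other) @operator(==)
{
    return self.x == other.x && self.y == other.y;
}

fn void main()
{
    Vec2 a = { 1, 2 };
    Vec2 b = { 3, 4 };
    Vec2 c = a + b;           // calls Vec2.add
    a += b;                   // += follows from the + overload
    bool same = a == c;       // calls Vec2.equals
    bool differ = a != b;     // != follows from the == overload
    io::printfn("%d %d %s %s", c.x, c.y, same, differ);
}
```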
Blocks and scope¶
C3 source text is organized into nested blocks. Each block introduces a scope — a region of program text in which a declaration is visible. C3 distinguishes runtime blocks and compile-time blocks; the two form independent scope structures within any function or macro body.
Runtime blocks¶
A runtime block is a brace-delimited sequence of declarations and statements:
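A minimal sketch with a function body and one nested compound statement (the names are hypothetical):

```c3
fn int sum_to_three()
{
    int total = 0;
    {                       // nested runtime block
        int step = 3;       // visible until the closing brace of this block
        total = total + step;
    }
    return total;
}
```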
Runtime blocks appear as the bodies of functions and macros and as compound statements within those bodies. Each runtime block introduces a nested runtime block scope.
Compile-time blocks¶
A compile-time block is the body of a compile-time control structure. Compile-time blocks are not brace-delimited; each is opened by a $-prefixed keyword and closed by the matching $end keyword. For example:
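A small sketch of two compile-time blocks inside a function body. VERBOSE is a hypothetical constant, and the exact keyword forms should be checked against the grammar in Statements.

```c3
import std::io;

const bool VERBOSE = true;   // hypothetical compile-time constant

fn void main()
{
    $if VERBOSE:
        io::printn("verbose build");
    $else
        io::printn("quiet build");
    $endif

    $for var $i = 0; $i < 3; $i++:
        io::printfn("unrolled iteration %d", $i);
    $endfor
}
```

Each block is opened by its $-prefixed keyword and closed by the matching end keyword; no braces are involved.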
The full set of compile-time control structures — $if, $else, $for, $foreach, $switch, $case, $default, and related forms — and their precise grammar are given in Statements. Each compile-time block introduces a nested compile-time block scope.
A compile-time block may itself contain runtime declarations and statements as well as further compile-time blocks; conversely a runtime block may contain compile-time blocks. The two structures interleave freely in the source text but track their scopes independently.
Scopes¶
A scope is a region of program text in which a declaration is visible. C3 has four kinds of scope:
Module scope. Module-level declarations — global variables, functions, types, constants, and macros — are visible throughout every section of the module in which they appear, regardless of textual position. Mutually recursive functions therefore require no forward declarations. Visibility across module boundaries is subject to the visibility rules described in Modules.
Function scope. Each function or macro body forms a single function scope. The function scope contains all labels declared in the body; a label is visible throughout the entire body, from the beginning of the body, regardless of where the label appears textually.
Runtime block scope. Each runtime block introduces a runtime block scope. A name declared within a runtime block is visible from the point of its declaration to the closing brace of the enclosing runtime block. A declaration is not visible above itself in the same block. Runtime block scopes nest in textual order and may shadow declarations in outer runtime block scopes and at module scope.
Compile-time block scope. Each compile-time block introduces a compile-time block scope. A compile-time variable is visible from the point of its declaration to the close of its enclosing compile-time block. If no compile-time block encloses a compile-time variable, its scope extends to the end of the function or macro body. Compile-time variables may not be declared at module scope.
The runtime block scope structure and the compile-time block scope structure are independent: the boundaries of a runtime block do not end the scope of a compile-time variable, and the boundaries of a compile-time block do not end the scope of a runtime variable.
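The module-scope and runtime-block rules above can be sketched as follows. All names are hypothetical; the example relies only on the statements that module-level declarations are visible regardless of textual position and that a local may shadow a module-scope declaration.

```c3
import std::io;

int counter = 10;            // module scope

fn bool is_even(int n)       // calls is_odd before its textual declaration
{
    if (n == 0) return true;
    return is_odd(n - 1);
}

fn bool is_odd(int n)
{
    if (n == 0) return false;
    return is_even(n - 1);
}

fn void main()
{
    int counter = 1;         // shadows the module-scope counter inside main
    {
        int inner = counter + 1;   // visible only up to the closing brace
        io::printfn("%d", inner);
    }
    // inner is not visible here
    io::printfn("%s %s", is_even(4), is_odd(4));
}
```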
Module sections¶
Each module declaration in source code opens a module section. A single file may contain multiple sections, including sections for different modules, and a single module may span multiple files and multiple sections within each file.
module_section ::= "module" path module_attributes? ";"
module_attributes ::= ("@private" | "@local" | "@public" | "@if" "(" expression ")" | generic_params)+
generic_params ::= "<" generic_param ("," generic_param)* ">"
generic_param ::= TYPE_IDENT | CONST_IDENT
A module section may carry attributes that apply as defaults to every declaration within the section:
- @private: declarations are @private by default; visible only within the same module.
- @local: declarations are @local by default; visible only within the same file.
- @public: declarations are @public by default; used to restore public visibility within a file whose other sections declare a more restrictive default.
- @if(cond): declarations are conditionally compiled under cond, evaluated at compile time.
- <Ty>, <Ty, Tu>, <Ty, VALUE>, ...: opens a generic module section in which every supported declaration is parameterized over the listed parameters. A type parameter is a TYPE_IDENT; a compile-time value parameter is a CONST_IDENT. Declaration kinds that cannot be made generic, such as faultdef, may not appear in a generic section.
Multiple attributes may be combined on a single section. Within a section, an individual declaration may override the section default — for example, @public on a declaration reverses a section default of @private.
The imports declared in a section are visible only within that section. A later section of the same module, even in the same file, does not inherit those imports and must re-import as needed.
Because module sections are not bound to a single file or author, a user may extend an existing module by opening a new section of the same name elsewhere. The new section's declarations join the module under the standard visibility rules, subject to any default attributes the section declares.
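A sketch of one file containing two sections of a hypothetical module shapes. It illustrates a section default, a per-declaration override, and the need to repeat imports in each section.

```c3
module shapes @private;      // section default: declarations are private
import std::io;

fn int square(int side)      // private by the section default
{
    return side * side;
}

fn int area(int side) @public   // overrides the section default
{
    return square(side);
}

module shapes;               // second section of the same module
import std::io;              // imports are per section, so std::io is imported again

fn void report(int side)
{
    io::printfn("area = %d", area(side));
}

fn void main()
{
    report(3);
}
```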
Name spaces¶
C3 maintains several syntactically distinct name spaces. Because the token kind of an identifier encodes its category, different name spaces may share the same textual name without ambiguity:
- Ordinary identifiers (IDENTIFIER): functions, variables, parameters, and macros.
- Type names (TYPE_IDENT): user-defined types.
- Constant names (CONST_IDENT): named constants.
- Compile-time identifiers (CT_IDENT, CT_TYPE_IDENT): compile-time variables and compile-time type variables.
- Labels (CONST_IDENT in label position): a name space separate from ordinary identifiers, type names, and constant names; scoped to the enclosing function body.
- Struct and union members: each struct and union has its own name space for its members, disambiguated by the type of the object being accessed.
A declaration in one name space does not conflict with a declaration of the same text in another.
Scope nesting and shadowing¶
Scopes nest. An identifier visible in an outer scope may be shadowed by a declaration of the same identifier in an inner scope. Within the inner scope, the identifier designates the inner entity; the outer declaration is hidden until the inner scope ends.
A local variable may shadow a module-scope declaration of the same name. Shadowing operates within a single scope dimension; a runtime declaration and a compile-time declaration with the same textual name do not interact, as they occupy distinct name spaces (IDENTIFIER vs CT_IDENT).
Storage duration and scope¶
The scope of a local variable (the region in which it is accessible by name) is distinct from its storage duration (the span for which its storage persists). A local variable's scope ends at the closing brace of the runtime block in which it is declared, but its storage remains valid for the entire lifetime of the enclosing function call. A pointer to a local variable therefore remains valid anywhere within the same function call, even after the variable's name has gone out of scope. The full rules for storage duration are given in Variables.
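The distinction can be sketched as follows, with hypothetical names. The example relies entirely on the guarantee stated above that a local's storage outlives its scope within the same function call.

```c3
fn int read_after_scope()
{
    int* p;
    {
        int value = 42;
        p = &value;       // the name value goes out of scope at the closing brace
    }
    return *p;            // per the rule above, the storage is still valid here
}
```

Returning p itself to the caller would not be covered by this rule, since the storage lasts only for the enclosing function call.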
Labels¶
A label names a statement as a target for break, continue, and nextcase. Unlike C, a label is not a separate statement form that prefixes another statement; it is part of the syntax of the statement it names, written as LABEL: between the statement's introducing keyword and the rest of the statement. For example:
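A minimal sketch with a hypothetical function that uses a labelled while statement to leave two nested loops at once:

```c3
fn int find_factor(int target)
{
    int found = -1;
    int i = 1;
    while OUTER: (i < 10)
    {
        for (int j = 1; j < 10; j++)
        {
            if (i * j == target)
            {
                found = i;
                break OUTER;      // leaves both loops
            }
        }
        i++;
    }
    return found;
}
```

Here find_factor(12) stops at i = 2, j = 6, and a target with no factor pair below 10 leaves found at -1.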
A label is a CONST_IDENT. The set of statements that may carry a label is fixed: if, while, do, switch, and foreach. Compile-time control structures do not support labels.
Labels have function scope: a label is visible throughout the entire body of the function or macro in which it appears, from the beginning of the body, and may therefore be referenced before its textual position. A label may not shadow another label in the same function.
Declarations¶
A declaration binds an identifier to an entity such as a variable, a constant, a type, a function, a macro, a fault value, or an alias. C3 distinguishes the following kinds of declaration:
- Variable declarations — globals, extern globals, thread-local globals, local variables, static local variables, and compile-time variables. See Variables.
- Constant declarations — typed and untyped named constants. See Constants.
- Type declarations — struct, union, bitstruct, enum, constdef, typedef, and interface declarations. See Types.
- Function declarations and method declarations. See Functions and methods.
- Macro declarations. See Macros.
- Import declarations — bringing the entities of another module into the current section. See Modules.
- Attribute definitions — introducing user-defined attributes. See Attributes.
- Alias declarations and fault value declarations, described in this chapter.
Type declarations, function declarations, method declarations, macro declarations, alias declarations, fault value declarations, attribute definitions, and import declarations may appear only at module scope. Variable and constant declarations may appear at module scope or within a function or macro body, with the additional restrictions given in Variables and Constants. Compile-time variables may appear only within a function or macro body.
Every declaration may carry attributes. The set of attributes recognized for a declaration depends on the declaration kind and is described in Attributes.
Alias declarations¶
An alias declaration introduces an additional name for an existing entity. An alias does not introduce a new type, function, value, module, or macro; it provides an alternative name through which the same entity may be referred. C3 distinguishes three forms of alias declaration, differing in what may be aliased and in the form of the right-hand side.
Type aliases¶
A type alias gives an existing type an additional name. The aliased type may be any type expression, including a generic instantiation or a compile-time type expression.
A type alias does not introduce a new type and does not have its own method set; references through the alias are equivalent to references through the underlying type's name. A type alias may not independently implement interfaces.
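A short sketch of non-generic type aliases. The alias names and helper functions are hypothetical.

```c3
alias Celsius = float;           // same type as float, not a new type
alias IntPtr = int*;
alias Callback = fn int(int);    // alias for a function pointer type

fn int twice(int x)
{
    return x * 2;
}

fn int apply(Callback cb, int v)
{
    return cb(v);
}

fn void main()
{
    Celsius c = 21.5;
    float f = c;                 // no conversion: Celsius and float are the same type
    int n = 4;
    IntPtr p = &n;
    int r = apply(&twice, *p);
}
```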
A type alias may itself be generic, by including a generic_decl parameter list between the alias name and the =. The parameters are in scope on the right-hand side. For example:
Module aliases¶
A module alias introduces an alternate name for a module.
A module alias may be used wherever a module path is expected, including in import declarations and in qualified-name expressions.
Identifier, constant, and macro aliases¶
The remaining alias forms introduce a new name for an ordinary identifier (a function or global variable), a constant identifier (a named constant or fault value), or an @-prefixed identifier (a macro or user-defined attribute):
alias_decl ::= "alias" alias_name generic_decl? attributes? "=" alias_source ";"
alias_name ::= IDENTIFIER | CONST_IDENT | AT_IDENT
alias_source ::= (path? IDENTIFIER | path? CONST_IDENT | path? AT_IDENT) generic_parameters?
The lexical kind of alias_name must match the lexical kind of alias_source:
- An IDENTIFIER alias refers to a function, macro or global variable.
- A CONST_IDENT alias refers to a named constant or a fault value.
- An AT_IDENT alias refers to a macro.
The optional generic_parameters on the right-hand side instantiates a generic target, producing a non-generic alias. The optional generic_decl on the left-hand side declares the alias itself as generic; its parameters are in scope on the right-hand side.
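A minimal sketch of the two simplest forms, with hypothetical names. The lexical kind of each alias matches the kind of its source.

```c3
fn int add(int a, int b)
{
    return a + b;
}

const int MAX_ITEMS = 16;

alias plus = add;            // IDENTIFIER alias for a function
alias LIMIT = MAX_ITEMS;     // CONST_IDENT alias for a named constant

fn int demo()
{
    return plus(LIMIT, 1);   // same as add(MAX_ITEMS, 1)
}
```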
Fault value declarations¶
A faultdef declaration introduces one or more named values of the built-in type fault.
faultdef_decl ::= "faultdef" fault_definition ("," fault_definition)* ","? ";"
fault_definition ::= CONST_IDENT attributes?
Each fault_definition introduces a distinct value of type fault. The values are visible at module scope and obey the standard visibility rules. Each fault value may carry its own attributes. A trailing comma after the last fault_definition is permitted.
A faultdef does not introduce a new type — all values declared by faultdef have type fault, and any fault value from any module is comparable and assignable to a fault-typed variable.
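A sketch of declaring and using fault values. The fault names and the lookup function are hypothetical, and the example assumes the postfix ~ form described under Optional propagation for turning a fault into a failing optional.

```c3
import std::io;

faultdef NOT_FOUND, BAD_KEY;     // two distinct values of type fault

fn int? lookup(int key)
{
    if (key < 0) return BAD_KEY~;
    if (key > 9) return NOT_FOUND~;
    return key * 10;
}

fn void main()
{
    if (catch reason = lookup(-1))
    {
        fault f = reason;        // any fault value fits a fault-typed variable
        if (f == BAD_KEY) io::printn("bad key");
    }
}
```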
Expressions¶
An expression computes a value, possibly with side effects. Some expressions, including calls to functions returning void, have no value.
Operands¶
An operand denotes an elementary value in an expression. An operand is a literal, a named entity, a parenthesized expression, a compound literal, a type access expression, a compile-time access expression, or a lambda.
operand ::= literal
| path? entity_name
| "(" expression ")"
| compound_literal
| type_access_expr
| builtin_expr
| lambda_expr
entity_name ::= IDENTIFIER | CONST_IDENT | TYPE_IDENT
| CT_IDENT | CT_TYPE_IDENT | HASH_IDENT
| AT_IDENT
| BUILTIN_CONST | "null" | "true" | "false"
path ::= IDENTIFIER ("::" IDENTIFIER)* "::"
Literals are described in Constants. A path-prefixed name resolves through the module path; an unprefixed name resolves through the current section's import set as described in Modules. The lexical kind of the identifier (IDENTIFIER, CONST_IDENT, TYPE_IDENT, etc.) determines the kind of entity referenced; when more than one entity could match a textual name, ambiguity is resolved by additional path qualification.
A parenthesized expression has the same value, type, and lvalue-ness as the enclosed expression. Parentheses do not introduce a new scope.
Compound literals¶
A compound literal constructs a value of an aggregate type — struct, union, array, slice, vector, or bitstruct — from a brace-enclosed list of elements. The form follows C99 designated initializers, with three extensions: range initializers, vector swizzles, and struct splatting. C3 is stricter than C99 in one respect: positional and designated elements may not be mixed in a single literal.
compound_literal ::= "(" type ")" initializer_list
| initializer_list
initializer_list ::= "{" "}"
| "{" designated_form ","? "}"
| "{" positional_form ","? "}"
designated_form ::= ("..." expression ",")? designated_element ("," designated_element)*
positional_form ::= positional_element ("," positional_element)*
designated_element ::= designator_path ("=" expression)?
positional_element ::= expression
| "..." expression
designator_path ::= designator_step+
designator_step ::= "." IDENTIFIER
| "[" expression "]"
| "[" expression ".." expression "]"
The parenthesized type prefix is required when the type cannot be inferred from the surrounding context; otherwise it may be omitted.
A range designator step [a..b] may appear only as the last step of a designator path; preceding steps must be .field or [index]. So .bar.x[0..1] = 3 is well-formed, but [0..1].field = 3 is not.
Element kinds¶
An initializer list takes one of two forms:
- A designated form, optionally preceded by a single splat, followed by one or more designated elements.
- A positional form, consisting of expressions and splats in any order.
Mixing the two forms in a single literal is a compile error.
In a positional list, each expression initializes the corresponding member or position in source order. The number of elements must not exceed the number of positions; unspecified trailing positions are zero-initialized.
In a designated list, each element specifies its target by a designator path made of one or more steps:
- .field selects a struct, union, or bitstruct member.
- [index] selects an array, slice, or vector position.
- [a..b] selects the inclusive range of positions [a, b] of the enclosing array, slice, or vector. The range form may appear only as the last step of the path. The right-hand value is evaluated once and broadcast to each position.
A path may chain steps to reach a nested target: .outer.inner = expression, [0].field = expression, .bar.x[0..1] = expression.
A vector swizzle initializer .xy = expression, .xyz = expression, .yz = expression, etc., initializes consecutive components of a vector. The component names must be ordered and contiguous: .xy, .yz, .xyzw are valid; .yx (reversed) and .xz (non-contiguous) are not. A swizzle is lowered to the equivalent range initializer.
A designator may appear without = expression, in which case the designated target is set to a default appropriate to its type (for example, true for a one-bit bitstruct field). This shorthand is rarely needed.
Designators may appear in any order. Positions not initialized by any element are zero-initialized. If two elements initialize the same position, the one appearing later in the literal supersedes the earlier one.
A union literal contains exactly one designated initializer naming the active member.
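The element kinds above can be sketched as follows. Point, Shape, and manhattan are hypothetical names.

```c3
struct Point
{
    int x;
    int y;
}

struct Shape
{
    Point origin;
    int[4] weights;
}

fn int manhattan(Point p)
{
    return p.x + p.y;
}

fn void main()
{
    Point p = { 1, 2 };                     // positional form
    Point q = { .y = 5, .x = 4 };           // designated form, in any order
    int[6] a = { [0..2] = 7, [5] = 9 };     // range designator; a[3] and a[4] are zero
    Shape s = { .origin.x = 3, .weights[1..2] = 8 };   // chained designator paths
    int d = manhattan((Point) { 3, 4 });    // explicit type prefix
}
```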
Splat¶
A splat element has the form ...expression. Its meaning depends on the surrounding form:
- In a designated form, exactly one splat is permitted, and it must precede every designated element. The splatted expression must be of the same type as the initializer. Its values become the defaults for every member or position of the result; subsequent designators override individual targets.
- In a positional form, any number of splats may appear in any position. Each splat expands to the elements of its operand in order, contributing them as positional values. For example, if b has length 2 and d has length 3, then { a, ...b, c, ...d } is equivalent to { a, b[0], b[1], c, d[0], d[1], d[2] }.
Splats may not appear in a literal that mixes positional and designated elements (which is itself forbidden).
The special form ...$vaarg, valid only inside a macro body, splats the macro's variadic arguments. It is always a positional splat regardless of context, expanding each variadic argument as a positional element; using it in an initializer list therefore makes that list a positional form, and any designators in the same list are rejected by the no-mixing rule.
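A sketch of both splat forms, written against the rules as stated in this section. Config and the two helper functions are hypothetical.

```c3
struct Config
{
    int width;
    int height;
    bool fullscreen;
}

fn Config with_fullscreen(Config base)
{
    // Designated form: the splat supplies the defaults, the designator overrides one member.
    return (Config) { ...base, .fullscreen = true };
}

fn int[4] surround(int[2] middle)
{
    // Positional form: the splat expands to middle[0], middle[1].
    return (int[4]) { 1, ...middle, 4 };
}
```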
Evaluation order¶
Elements are evaluated in the order they appear in the literal, regardless of the position they initialize. For example, in { [1] = foo(), [0] = bar() } the call foo() is evaluated before bar().
Primary expressions¶
A primary expression is built from an operand by zero or more postfix operations: member access, subscript, slice, function or method call, macro invocation, generic instantiation, optional propagation, and postfix increment or decrement.
primary_expr ::= operand
| primary_expr "." access_ident
| primary_expr "[" expression "]"
| primary_expr "[" range_expr "]"
| primary_expr generic_arguments
| primary_expr "(" argument_list? ")" trailing_macro_block?
| primary_expr "++"
| primary_expr "--"
| primary_expr "!"
| primary_expr "!!"
| primary_expr "~"
access_ident ::= IDENTIFIER | CT_IDENT | CONST_IDENT | AT_IDENT | eval_expression
generic_arguments ::= "{" type_or_value ("," type_or_value)* "}"
range_expr ::= range_loc? (".." | ":") range_loc?
range_loc ::= "^"? expression
Member access¶
The expression a.b accesses the member b of the aggregate value or pointer a. If a has pointer-to-aggregate type, the pointer is implicitly dereferenced. The result has the member's declared type and is addressable if and only if a is addressable (or if a is a pointer).
The right-hand access_ident may be an IDENTIFIER (struct field, method, or property), a CONST_IDENT (nested constant), an AT_IDENT (method-style macro), an eval_expression, or a CT_IDENT.
For an eval_expression, the string it evaluates to is resolved as an IDENTIFIER, CONST_IDENT, or AT_IDENT.
For a CT_IDENT, there are two cases: (1) the identifier contains a string, in which case the behaviour is identical to using eval($ident); (2) the identifier contains a reflected member, in which case it is as if one were to use eval($ident.name), except that it also works on anonymous inner structs and unions that do not have a name.
Subscript¶
The expression a[i] selects the element at index i. The operand a must be an array, slice, vector, pointer, or a type that overloads the subscript operator. The index i must have integer type, or ^expr to count from the end of the operand (valid for arrays, slices, and vectors of known length).
In safe builds, an out-of-range index traps. In fast builds, the behaviour is undefined.
Slicing¶
The expression a[i..j] produces a slice over the inclusive index range [i, j] of the operand. The expression a[i:n] produces a slice of length n starting at index i. Either bound may be omitted to mean "from the beginning" or "to the end", and either may be expressed as ^expr to count from the end. The result has type S[] where S is the element type of the operand.
The special case j = i - 1 in the i..j form, and the special case n = 0 in the i:n form, both yield a valid empty slice. For example, a[1..0] and a[1:0] are well-formed and produce an empty slice; the bound i may equal the length of the operand in these cases (one past the last element).
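A short sketch of subscripting and slicing on a fixed array. The variable names are hypothetical; the comments list the elements each expression selects.

```c3
import std::io;

fn void main()
{
    int[5] a = { 10, 20, 30, 40, 50 };
    int second = a[1];        // 20
    int last = a[^1];         // 50, counted from the end
    int[] mid = a[1..3];      // inclusive range: 20, 30, 40
    int[] two = a[1:2];       // start 1, length 2: 20, 30
    int[] tail = a[^2..];     // last two elements: 40, 50
    int[] head = a[..1];      // from the beginning: 10, 20
    int[] none = a[5:0];      // empty slice, one past the last element
    io::printfn("%d %d %d %d %d", second, last, mid.len, two.len, none.len);
}
```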
Generic instantiation¶
The expression g{Ty, Tu, value} instantiates a generic entity g with the given type and value arguments. The result is a non-generic entity that may be used directly or further composed with calls or member access. Generic arguments may be types (TYPE_IDENT) or compile-time expressions matching the generic's value parameters.
Calls¶
A call invokes a function, a method, or a macro:
argument_list ::= argument ("," argument)* ","?
argument ::= expression
| "..." expression
| IDENTIFIER ":" "..."? expression
| "." IDENTIFIER "=" expression
trailing_macro_block ::= AT_IDENT ("(" parameter_list? ")" )? compound_statement
Arguments may be supplied positionally, by name (name: expression), or by struct-field-style designator (.field = expression) for arguments of aggregate type. A ...expression spreads a slice, array, or compile-time list into the variadic part of the parameter list. Named and designated arguments may appear in any order; positional arguments must come before them.
Each argument is converted to the corresponding parameter's declared type using the rules described in Assignability.
A macro call may carry a trailing macro block — a function-literal-like body attached after the parenthesized argument list. The trailing block becomes available inside the macro under the parameter name introduced after the closing parenthesis. The detailed rules are given in Macros.
The result type of a call is the function's, method's, or macro's return type. A call producing an optional propagates that optional status to the surrounding expression (see Optional propagation).
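The argument forms can be sketched with two hypothetical functions, one with default parameters and one variadic.

```c3
fn int volume(int width, int height = 1, int depth = 1)
{
    return width * height * depth;
}

fn int sum(int... values)
{
    int total = 0;
    foreach (v : values) total += v;
    return total;
}

fn void main()
{
    int a = volume(2, 3, 4);                  // positional: 24
    int b = volume(2, depth: 5);              // named; height keeps its default: 10
    int c = volume(2, depth: 5, height: 3);   // named arguments in any order: 30
    int[3] nums = { 1, 2, 3 };
    int d = sum(...nums);                     // spreads the array into the variadic part: 6
    int e = sum(4, 5);                        // 9
}
```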
Postfix ++ and --¶
The expressions lvalue++ and lvalue-- increment and decrement lvalue by one and yield the value before the modification. The operand must be an addressable expression of integer, floating-point, or pointer type. For pointer operands, the change is by one element.
Optional propagation¶
The postfix operators ~, !, and !! operate on optionals:
- expression~ converts a fault value into an optional carrying that fault as its excuse. The operand must be of type fault. The result has type void? and represents an optional that fails with the given fault.
- expression! evaluates the operand; if the operand is a successful optional, the result is its underlying value, and if the operand carries a fault, the enclosing function returns immediately with that same fault, propagating the optional. The operand must have optional type and the enclosing function must be allowed to return that optional.
- expression!! evaluates the operand; if the operand is successful, the result is the underlying value, otherwise the program traps. Force-unwrapping should be reserved for cases where the failure carrier is statically known to be unreachable.
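A sketch of the three postfix operators working together. The fault DIVISION_BY_ZERO and both functions are hypothetical.

```c3
faultdef DIVISION_BY_ZERO;

fn int? divide(int a, int b)
{
    if (b == 0) return DIVISION_BY_ZERO~;   // the fault becomes a failing optional
    return a / b;
}

fn int? average_ratio(int a, int b, int c)
{
    int first = divide(a, c)!;     // on failure, returns the same fault to the caller
    int second = divide(b, c)!;
    return (first + second) / 2;
}

fn void main()
{
    int ok = average_ratio(8, 4, 2)!!;             // 3; would trap on failure
    int fallback = average_ratio(8, 4, 0) ?? -1;   // -1, since the division fails
}
```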
Type access expressions¶
A type access expression uses :: to select a member of a type rather than a value:
Examples include Foo::SIZE (a named constant on a type), Foo::typeid (the type's runtime type identifier — see Properties of types and values), and Foo::alignment. The set of accessible names depends on the type and is described in Properties of types and values.
Compile-time access expressions¶
A compile-time access expression denotes a value or type known at compile time:
ct_arg_expr ::= "$vaarg" ("[" range_expr "]")?
ct_analyze_expr ::= ct_analyze_op "(" expression ")"
ct_defined_expr ::= "$defined" "(" ct_defined_check ("," ct_defined_check)* ")"
ct_feature_expr ::= "$feature" "(" CONST_IDENT ")"
ct_analyze_op ::= "$eval" | "$reflect" | "$stringify" | "$expand"
ct_defined_check ::= expression
| type IDENTIFIER ("=" expression)?
The semantics of these forms are described in Compile-time evaluation and Reflection. Each operand of $defined is either an expression (well-formed if a value of that form would be valid) or a candidate local variable declaration (well-formed if such a declaration would be valid).
Unary operators¶
unary_expr ::= unary_op expression
unary_op ::= "+" | "-" | "!" | "~" | "*" | "&" | "&&" | "++" | "--" | "(" type ")"
- +e performs integer promotion: integer operands narrower than the platform int are promoted to int (with the corresponding signedness); operands of int or wider, and operands of floating-point or vector type, are returned unchanged. The operand must be of numeric or vector type. The result type may therefore differ from the operand type for narrow integers.
- -e is the arithmetic negation of e; its operand must be of integer, floating-point, or vector type. Signed integer negation wraps on overflow (it is defined to wrap, not undefined).
- !e is the logical negation of e; its operand must be of boolean type.
- ~e is the bitwise complement of e; its operand must be of integer or vector type.
- *p is the value pointed to by p; p must be of pointer type, and may not be void*.
- &v is the address of v; the operand must be addressable (an lvalue), and the result has the type "pointer to the operand's type".
- &&e is a temporary address: it materializes the value of e in a fresh storage location whose lifetime extends to the end of the enclosing full expression, and yields a pointer to that location. The operand need not be addressable.
- ++lvalue and --lvalue increment and decrement lvalue by one and yield the value after the modification. The operand must be addressable, of integer, floating-point, or pointer type.
- (type) expression is an explicit cast; see Conversions.
The unary operators have higher precedence than any binary operator. Postfix operations (member access, subscript, call, optional propagation, postfix ++/--) have higher precedence than the prefix unary operators.
Binary operators¶
Binary operators combine two operands and produce a value of a determined type. The table below lists operator categories in decreasing order of precedence; operators in the same row have equal precedence. All binary operators are left-associative except where noted.
Precedence Category Operators
-------------- ---------------------- ------------------------------------
14 (highest) Primary literals, names, parenthesised expr
13 Postfix . () [] ++ -- !! ! ~
12 Unary (prefix) ! - + ~ * & && ++ -- (type)
11 Multiplicative * / %
10 Shift << >>
9 Bitwise & | ^
8 Or-else / Elvis ?: ??
7 Additive + - +++
6 Relational < <= > >= == !=
5 Logical AND && &&&
4 Logical OR || |||
3 Ternary ? : (right-assoc)
2 Assignment = += -= *= /= %=
&= |= ^= <<= >>= (right-assoc)
1 (lowest)
The compile-time variants +++, &&&, and ||| sit at the same precedence levels as their runtime counterparts; they operate on compile-time-known operands (see Compile-time evaluation).
Although the precedence table determines the parse of every well-formed expression, certain combinations of operators are nevertheless rejected as ambiguous to read. The three operator groups below are subject to the check:
- Group 1, binary bitwise operators: &, |, ^.
- Group 2, relational and equality operators: ==, !=, <, <=, >, >=.
- Group 3, shift operators: <<, >>.
If the operands of a binary expression with an operator in one of these groups are themselves binary expressions with an operator from the same group, the program is ill-formed. Parentheses must be used to make the intended grouping explicit.
The check applies one level deep on each side of the offending operator. A subexpression separated from the operator by an operator outside the group is not subject to the check.
An exception applies in Group 1 only: chaining the same bitwise operator on both sides is permitted, since the result is invariant under associativity. The other two groups have no such exception.
Examples:
// Well-formed
a & b == 3 // & is group 1, == is group 2 — different groups
a == b << 4 // == is group 2, << is group 3 — different groups
a & b & c // same operator in group 1 — permitted exception
// Ill-formed
a & b | c // & and | both in group 1, different operators
a == b != c // both in group 2
a << b << c // same shift operator, but group 3 has no exception
a < b == c // both in group 2
The rule is purely syntactic and does not depend on the types or values of the operands; it is checked after parsing and before further semantic analysis. Parenthesising either side suppresses the diagnostic: (a & b) | c and a & (b | c) are both well-formed.
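As a rough sketch of the rule above, the helpers below write each same-group combination with explicit parentheses, which is the form the text describes as well-formed. The module and function names are hypothetical and exist only for illustration.

```c3
module ambiguity_demo;
import std::io;

// Hypothetical helpers: every same-group pair is parenthesised explicitly.
fn int and_then_or(int a, int b, int c) => (a & b) | c;
fn int and_of_or(int a, int b, int c) => a & (b | c);
fn int shift_twice(int a, int b, int c) => (a << b) << c;
fn bool compare_twice(int a, int b, bool c) => (a == b) != c;

fn void main()
{
    io::printfn("%d", and_then_or(6, 3, 8));       // (6 & 3) | 8 = 10
    io::printfn("%d", and_of_or(6, 3, 8));         // 6 & (3 | 8) = 2
    io::printfn("%d", shift_twice(1, 2, 3));       // (1 << 2) << 3 = 32
    io::printfn("%s", compare_twice(4, 4, false)); // true
}
```

The two bitwise helpers show why the grouping matters: the same three operands give 10 or 2 depending on where the parentheses go.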
This precedence order differs from C in several places:
- Shift binds tighter than additive. a + b >> c parses as a + (b >> c).
- Bitwise &, |, ^ are all at one precedence level, between shift and or-else, tighter than relational. a & b == c parses as (a & b) == c.
- The or-else operators ?? and ?: sit between bitwise and additive, tighter than additive. a + b ?? c parses as a + (b ?? c).
- Relational and equality share one level (in C they are two levels).
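The first two differences can be observed with small integer inputs. This is an illustrative sketch that follows the precedence table in this section; the function names are hypothetical.

```c3
module precedence_demo;
import std::io;

// Parsed as a + (b >> c) according to the table above; C would parse (a + b) >> c.
fn int add_shift(int a, int b, int c) => a + b >> c;

// Parsed as (a & b) == c according to the table above; C would parse a & (b == c).
fn bool mask_equals(int a, int b, int c) => a & b == c;

fn void main()
{
    io::printfn("%d", add_shift(1, 8, 2));   // 1 + (8 >> 2) = 3
    io::printfn("%s", mask_equals(6, 3, 2)); // (6 & 3) == 2, so true
}
```

Under C's rules the same inputs would give 2 for the first expression and 0 for the second, so code ported from C may need explicit parentheses.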
For an operator op and operands a, b, the expression a op b is well-typed if a and b are of types compatible with op and with each other, according to the rules below.
Arithmetic operators¶
The arithmetic operators +, -, *, /, %, when applied to two operands of integer or floating-point type, perform arithmetic at a common type determined by arithmetic promotion (see Properties of types and values). The operators + and - are also defined for pointer arithmetic: p + i and p - i add or subtract an integer-typed offset (in element units), and p - q of two pointers to the same element type yields a signed integer count of elements between them.
The arithmetic and bitwise operators are defined elementwise on vector types.
Signed integer overflow in +, -, *, and unary - wraps modulo 2ⁿ where n is the operand width. Unsigned overflow wraps in the natural way. Division by zero in / or % on integer operands traps in safe mode and is undefined behaviour in fast mode. Floating-point division by zero, overflow, and other exceptional cases follow IEEE 754.
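A minimal sketch of the wrapping rule, assuming the int.max, int.min, and uint.max type properties; the function names are hypothetical. The operands are passed through functions so the additions happen at runtime instead of being folded as constants.

```c3
module wrap_demo;
import std::io;

// Hypothetical helpers used to keep the arithmetic at runtime.
fn int add_one(int x) => x + 1;
fn uint uadd_one(uint x) => x + 1;

fn void main()
{
    bool signed_wraps = add_one(int.max) == int.min;
    bool unsigned_wraps = uadd_one(uint.max) == 0;
    io::printfn("%s %s", signed_wraps, unsigned_wraps);
}
```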
Shift operators¶
The operators << and >> shift the left operand by the number of positions given by the right operand. Both operands must be of integer type. The right operand is interpreted as an unsigned count; shifting by a count greater than or equal to the bit-width of the left operand's type, or by a negative count, is undefined behaviour. Right shift of a signed integer is an arithmetic shift (sign extending); right shift of an unsigned integer is a logical shift.
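The difference between the two right-shift forms shows up when the top bit is set. The sketch below uses hypothetical helper names and keeps every shift count well below the operand width.

```c3
module shift_demo;
import std::io;

// Signed left operand: arithmetic shift, the sign bit is replicated.
fn int arithmetic_right(int x, int n) => x >> n;

// Unsigned left operand: logical shift, zeroes are shifted in.
fn uint logical_right(uint x, uint n) => x >> n;

fn void main()
{
    io::printfn("%d", arithmetic_right(-8, 1));      // -4
    io::printfn("%d", logical_right(uint.max, 31));  // 1
}
```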
Bitwise operators¶
The operators &, |, ^ perform bitwise AND, OR, and XOR on operands of integer or vector-of-integer type. All three share the same precedence level. The result has the common type of the operands after arithmetic promotion.
Or-else and Elvis operators¶
- a ?? b (the optional-else operator) evaluates a; if a is a successful optional, its underlying value is the result. Otherwise b is evaluated and is the result. The operand a must have optional type; b must be assignable to the underlying type of a (or itself be an optional with the same underlying type).
- a ?: b (the Elvis operator) evaluates a; if a is truthy, the result is a (after assignability conversion); otherwise b is evaluated and is the result. Both operands must be of types convertible to a common type.
Both operators short-circuit; b is evaluated only when needed.
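A short sketch of both operators. It assumes the standard library's String.to_int, which returns an optional int, as a convenient source of optionals; parse_or and first_nonzero are hypothetical names.

```c3
module orelse_demo;
import std::io;

// ?? supplies a fallback when the optional on the left holds a fault.
fn int parse_or(String text, int fallback) => text.to_int() ?? fallback;

// ?: keeps the left operand when it is truthy (non-zero here), otherwise the right.
fn int first_nonzero(int preferred, int fallback) => preferred ?: fallback;

fn void main()
{
    io::printfn("%d", parse_or("42", -1));   // 42
    io::printfn("%d", parse_or("abc", -1));  // -1
    io::printfn("%d", first_nonzero(0, 7));  // 7
    io::printfn("%d", first_nonzero(3, 7));  // 3
}
```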
Relational operators¶
The operators <, <=, >, >=, ==, != compare two operands and produce a value of type bool. They share a single precedence level. Comparison is defined for: numeric types (after arithmetic promotion), pointer types (with the usual address ordering), boolean types (with false < true), enum types (by ordinal), constdef and typedef types (per their underlying type), fault (by identity, for ==/!= only), typeid (by identity, for ==/!= only), and vector types (elementwise, yielding a vector of bool).
Two pointer values are equal if they point to the same object or are both null. Two slices are not directly comparable; use slice.ptr and slice.len if needed.
Logical operators¶
The operators && and || apply to operands of type bool. They short-circuit: in a && b the operand b is evaluated only if a is true; in a || b the operand b is evaluated only if a is false. The result has type bool.
Ternary expression¶
The expression c ? a : b evaluates c; if c is true, the result is a, otherwise b. The operand c must have type bool. The operands a and b must be of compatible types and convert to a common type. Exactly one of a and b is evaluated.
Assignment expressions¶
assign_expr ::= lvalue assign_op expression
assign_op ::= "=" | "+=" | "-=" | "*=" | "/=" | "%=" | "&=" | "|=" | "^=" | "<<=" | ">>="
An assignment stores the right-hand value into the left-hand lvalue. The right-hand operand is converted to the type of the lvalue according to the Assignability rules. The result of the assignment is the new value of the lvalue. The lvalue must be addressable.
A compound assignment lvalue op= expression is equivalent in effect to lvalue = lvalue op expression, except that the lvalue is evaluated only once.
A := form is not supported; new bindings are introduced by var name = expression or by a typed declaration (see Variables).
Cast expressions¶
An explicit cast converts a value of one type to another. The grammar of a cast expression is ( type ) expression.
A cast is permitted between any two types for which a conversion is defined; the list of permitted conversions and their semantics is given in Casts. A cast that adds information not present at runtime (for example, downcasting an any or an interface to a more specific type) may trap in safe mode if the runtime check fails.
Constant expressions¶
A constant expression is an expression whose value can be determined at compile time. The expression must not depend on runtime state and may use only operators and operand forms with defined compile-time semantics. The complete rules are described in Compile-time evaluation. Constant expressions are required in contexts such as array sizes, the values of named constants, default parameter values, attribute arguments, and the conditions of $if, $for, and other compile-time control structures.
Lambda expressions¶
A lambda expression introduces an anonymous function or macro:
lambda_expr ::= "fn" return_type? fn_parameter_list attributes? lambda_body
lambda_body ::= "{" statement* "}"
| "=>" expression
A lambda may capture compile-time values from the enclosing scope but not runtime variables. Its type is a function type. The full rules are given in Functions and methods.
Order of evaluation¶
Evaluation of an expression is fully sequenced. Every operand is evaluated and its side effects are complete before any operand whose source position lies to its right, except where short-circuiting suppresses evaluation.
The rules are:
- In a call, the called function, method, or macro expression is evaluated first; arguments are then evaluated in left-to-right source order.
- In a binary operator, the left operand is evaluated and its side effects are complete before the right operand is evaluated. The short-circuiting operators (&&, ||, ??, ?:) evaluate the right operand only when required.
- In an assignment, the left-hand lvalue (including any subexpressions used to compute its address) is evaluated, and its address is fixed, before the right-hand operand is evaluated; the converted right-hand value is then stored.
- In a compound literal, elements are evaluated in source order regardless of the position they initialize.
- The postfix operators ++ and -- read the operand, fix the expression's value as the value before modification, and write back the new value before the next operand of the surrounding expression is evaluated.
- The ternary expression c ? a : b evaluates c first; depending on its value, exactly one of a and b is then evaluated.
Because of these rules, expressions in C3 have no order-of-evaluation undefined behaviour. Constructs that are undefined in C, such as i = i++ + i++, have a defined result in C3 determined by strict left-to-right evaluation.
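The left-to-right guarantee can be made visible with a function that has a side effect. In this sketch the global counter and the helpers bump, digits, combine, and combine_args are all hypothetical.

```c3
module order_demo;
import std::io;

int counter; // module-level state mutated by bump()

// Increments the counter and returns the new value.
fn int bump()
{
    counter++;
    return counter;
}

fn int digits(int a, int b, int c) => a * 100 + b * 10 + c;

// Binary operands: the left call runs before the right call.
fn int combine()
{
    counter = 0;
    return bump() * 10 + bump();
}

// Call arguments: evaluated in source order.
fn int combine_args()
{
    counter = 0;
    return digits(bump(), bump(), bump());
}

fn void main()
{
    io::printfn("%d", combine());      // 1 * 10 + 2 = 12
    io::printfn("%d", combine_args()); // 123
}
```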
Statements¶
A statement directs the flow of execution within a function or macro body. Statements compose into sequences within blocks; each statement is terminated either by a semicolon or by the closing delimiter of a structured form.
statement ::= block_statement
| local_declaration_statement
| constant_declaration_statement
| var_statement
| expression_statement
| if_statement
| switch_statement
| while_statement
| do_statement
| for_statement
| foreach_statement
| break_statement
| continue_statement
| nextcase_statement
| return_statement
| defer_statement
| assert_statement
| asm_block_statement
| ct_statement
| ";"
Block statement¶
A block statement groups a sequence of statements into a runtime block scope (see Blocks and scope):
The statements within a block are executed in source order. The block introduces a new runtime scope; local declarations within the block are visible from their point of declaration to the closing brace.
Expression statement¶
An expression statement evaluates an expression and discards its value:
The expression's side effects are performed. If the expression has a non-void type, its value is discarded.
If the expression is a call to a function or macro whose return type is an optional, or to one that carries the @nodiscard attribute, discarding the result is a compile-time error. A function or macro that returns an optional but whose result is intended to be safely discardable may carry @maydiscard to suppress this check.
Local declaration statements¶
A local declaration statement introduces a local variable, a static or thread-local variable, an inferred-type variable, or a local constant. The full syntax and semantics are in Variables and Constants.
local_declaration_statement ::= local_storage? optional_type local_decl_after_type ("," local_decl_after_type)* ";"
local_storage ::= "static" | "tlocal"
local_decl_after_type ::= IDENTIFIER attributes? ("=" expression)?
| CT_IDENT ("=" constant_expression)?
var_statement ::= "var" (IDENTIFIER attributes? "=" expression
| CT_TYPE_IDENT ("=" expression)?
| CT_IDENT ("=" expression)?) ";"
constant_declaration_statement ::= "const" type? CONST_IDENT attributes? "=" expression ";"
A static local has function-call-independent storage; a tlocal local has per-thread storage. The var form infers the type from the initializer. A const local declares a compile-time constant; its initializer must be a constant expression.
The initializer expression of a static or tlocal local follows the same rules as the initializer of a global variable of the same form: it must be evaluable to a constant at program-image construction time. Initializers that depend on runtime values are not permitted on static or tlocal locals.
An initializer expression may reference the address of the variable being declared, but may not depend on its value. For example, void* a = &a; is well-formed, while int a = a + 1; is not: the right-hand side reads a before it has been initialized.
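As an illustration of static locals, the sketch below keeps a counter whose storage persists across calls. The function name next_id is hypothetical, and the initializer is a constant as the rules above require.

```c3
module static_demo;
import std::io;

// Returns 1, 2, 3, ... on successive calls within one program run.
fn int next_id()
{
    static int last_id = 0; // initialized once, not on every call
    last_id++;
    return last_id;
}

fn void main()
{
    int first = next_id();
    int second = next_id();
    io::printfn("%d %d", first, second); // 1 2
}
```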
Conditions¶
The statements if, while, do, and switch, together with the condition slot of for_statement, all accept a condition, which may include one or more declarations together with an optional try or catch unwrap:
condition ::= condition_repeat ("," (try_unwrap_chain | catch_unwrap))?
| try_unwrap_chain
| catch_unwrap
condition_repeat ::= decl_or_expression ("," decl_or_expression)*
decl_or_expression ::= var_decl
| optional_type local_decl_after_type
| expression
try_unwrap ::= "try" (type? IDENTIFIER "=")? expression
try_unwrap_chain ::= try_unwrap ("&&" (try_unwrap | expression))*
catch_unwrap ::= "catch" (type? IDENTIFIER "=")? expression ("," expression)*
A condition is treated as true when:
- Every plain expression in condition_repeat evaluates to true (after assignability conversion to bool).
- Every try in a try_unwrap_chain produces a successful optional. The unwrapped value is bound to the named identifier (if any), which is in scope for the body of the controlling statement.
- In a catch_unwrap, the captured optional has failed; the resulting fault is bound to the named identifier (if any).
The try and catch unwraps are the principal mechanism for handling optionals in conditional contexts; see Optionals and faults.
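The unwrap forms can be sketched as follows. The example assumes String.to_int from the standard library as a source of optional values; the function names are hypothetical.

```c3
module condition_demo;
import std::io;

// try unwrap: value is in scope only inside the body.
fn int parse_or_zero(String text)
{
    if (try int value = text.to_int())
    {
        return value;
    }
    return 0;
}

// catch unwrap without a binding: the branch is taken when the optional failed.
fn bool is_number(String text)
{
    if (catch text.to_int())
    {
        return false;
    }
    return true;
}

// try chain: the body runs only if both unwraps succeed.
fn int sum_both(String left, String right)
{
    if (try int a = left.to_int() && try int b = right.to_int())
    {
        return a + b;
    }
    return -1;
}

fn void main()
{
    io::printfn("%d", parse_or_zero("17"));  // 17
    io::printfn("%s", is_number("x"));       // false
    io::printfn("%d", sum_both("2", "3"));   // 5
}
```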
If statement¶
if_statement ::= "if" label? "(" condition ")" (block_statement else_part | statement)
else_part ::= "else" (if_statement | block_statement)
The condition is evaluated. If true, the then branch is executed; otherwise the else branch, if any, is executed. When the then branch is a single statement (not a block), an else clause is not permitted; in that case use a block.
The optional label has the form LABEL: and follows the rules in Blocks and scope. A label permits a labelled break or continue to target this statement.
When the then-clause is not a compound statement, it must appear on the same source line as the closing parenthesis of the condition. A non-compound then-clause on a separate line is a syntax error.
An if statement may be labelled and exited by a labelled break. An unlabelled break may not exit an if statement; it would otherwise be ambiguous between the enclosing if and any surrounding loop or switch.
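A labelled if with an early exit might look like the sketch below. The label CHECK and the function cap_large are hypothetical; the single-statement inner if keeps its then-clause on the same line as its condition, as required above.

```c3
module if_label_demo;
import std::io;

// Values above 10 are capped at 100, but values below 100 leave the
// labelled if early and stay unchanged.
fn int cap_large(int x)
{
    int result = x;
    if CHECK: (x > 10)
    {
        if (x < 100) break CHECK;
        result = 100;
    }
    return result;
}

fn void main()
{
    io::printfn("%d", cap_large(5));    // 5
    io::printfn("%d", cap_large(50));   // 50
    io::printfn("%d", cap_large(500));  // 100
}
```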
Switch statement¶
switch_statement ::= "switch" label? ("(" condition ")")? attributes? "{" switch_body? "}"
switch_body ::= (case_clause | default_clause)+
case_clause ::= "case" expression (".." expression)? ":" statement*
default_clause ::= "default" ":" statement*
The switch evaluates its condition and selects the first case whose value equals the condition, or whose inclusive range a..b contains the condition. If no case matches, the default clause, if present, is selected.
A switch without a condition switch { ... } is equivalent to evaluating each case as a boolean expression in source order and selecting the first that is true. Such a switch is always lowered to an if-else chain (see below).
Control reaches the end of the switch statement after the selected case executes its statements; case clauses with at least one statement do not fall through automatically. A case (or default) clause whose statement list is empty, however, falls through to the next clause: the selected case executes the statements of the next clause that has any. Successive empty clauses chain together, so case A: case B: case C: do_something(); runs do_something() for any of A, B, or C.
To transfer control to another case explicitly, use a nextcase statement.
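The selection rules can be sketched with two hypothetical functions: bucket uses an empty case, an inclusive range, and a default clause, while sign_name uses the condition-less form in which each case is a boolean expression.

```c3
module switch_demo;
import std::io;

fn int bucket(int x)
{
    int result;
    switch (x)
    {
        case 0:              // empty clause: falls through to case 1
        case 1:
            result = 1;      // no fall-through after a non-empty clause
        case 2..9:
            result = 2;      // inclusive range
        default:
            result = 3;
    }
    return result;
}

fn String sign_name(int x)
{
    String name;
    switch
    {
        case x < 0:
            name = "negative";
        case x == 0:
            name = "zero";
        default:
            name = "positive";
    }
    return name;
}

fn void main()
{
    io::printfn("%d %d %d", bucket(0), bucket(9), bucket(10)); // 1 2 3
    io::printfn("%s", sign_name(-4));                          // negative
}
```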
Lowering¶
A switch statement is lowered to one of two forms:
- A jump table, in which the switch operand directly indexes into a table of case targets. A jump table is produced only when the operand has integer type and every case value is a compile-time constant.
- An if-else chain, in which each case is tested in source order. This form is used when the switch has no condition, when the operand is neither an integer nor a boolean type, or when any case value is not a compile-time constant. An if-else chain is never further reduced to a jump table.
The attribute @jump requests jump-table lowering. A switch carrying @jump must satisfy the requirements above; otherwise the compiler rejects the program.
Exhaustive switches¶
A switch is exhaustive when control is guaranteed to enter exactly one of its clauses for every possible value of the operand. The two cases that produce this guarantee are:
- the switch has a default clause, or
- the switch operand has enum type and the switch's case clauses cover every value of the enum.
If a switch is exhaustive and every clause exits the switch through a return, nextcase, break targeting an outer construct, or other jump (rather than falling out of the switch body normally), the code following the switch is unreachable.
While and do statements¶
while_statement ::= "while" label? "(" condition ")" statement
do_statement ::= "do" label? block_statement ("while" "(" expression ")")? ";"
A while statement evaluates the condition; if true, it executes the body and repeats. A do statement executes the body once and then evaluates the trailing condition; if true, it repeats. A do { ... }; form without a trailing while clause is equivalent to { ... } with a label-aware break target.
For statement¶
for_statement ::= "for" label? "(" for_condition ")" statement
for_condition ::= init_list? ";" condition? ";" update_list?
init_list ::= decl_or_expression ("," decl_or_expression)*
update_list ::= expression ("," expression)*
The init_list is executed once; declarations within it are scoped to the entire for statement (including condition, update, and body). The condition, if present, is evaluated before each iteration; the body executes only when the condition is true. The update_list is evaluated after each iteration. An absent condition is treated as true.
Foreach statement¶
foreach_statement ::= ("foreach" | "foreach_r") label? "(" foreach_vars ":" expression ")" statement
foreach_vars ::= foreach_var ("," foreach_var)?
foreach_var ::= optional_type? "&"? IDENTIFIER
The expression must be of an iterable type. The following are natively iterable:
- arrays, slices, vectors, pointers to arrays, and pointers to vectors;
- any type that overloads len and [] (iteration by value);
- any type that overloads len and &[] (iteration by reference).
The loop binds one or two variables:
- With one variable, that variable binds the element value. If the variable name is prefixed with &, the binding is by reference and has pointer-to-element type instead.
- With two variables, the first binds the loop index and the second binds the element (with the same &-reference convention as the single-variable form).
The optional type on a foreach variable is the variable's declared type; if omitted, the type is inferred from the iterable's element type. A mismatched type triggers an implicit conversion; failure to convert is a compile-time error.
The index, when present, has type sz by default. An explicit type on the index variable causes a direct cast of the running index to that type at each iteration. The cast may truncate the visible index value but does not affect the iteration itself, and modifying the index variable inside the body has no effect on which element is bound next.
foreach_r iterates in reverse: it starts with the last element and proceeds toward the first. For foreach_r, the running index begins at len - 1 and decreases.
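The binding forms described above can be sketched over a slice of int. All function names are hypothetical, and the index variable is given an explicit int type so it can be multiplied with the element directly.

```c3
module foreach_demo;
import std::io;

// By value: v is a copy of each element.
fn int sum_values(int[] values)
{
    int total = 0;
    foreach (v : values)
    {
        total += v;
    }
    return total;
}

// By reference: v has pointer-to-element type, so the slice is modified.
fn void double_all(int[] values)
{
    foreach (&v : values)
    {
        *v *= 2;
    }
}

// Two variables: index first, element second.
fn int weighted(int[] values)
{
    int total = 0;
    foreach (int i, v : values)
    {
        total += i * v;
    }
    return total;
}

// Reverse iteration: last element first.
fn int reversed_digits(int[] digits)
{
    int number = 0;
    foreach_r (d : digits)
    {
        number = number * 10 + d;
    }
    return number;
}

fn void main()
{
    int[3] data = { 1, 2, 3 };
    int[] view = &data;
    io::printfn("%d", sum_values(view));      // 6
    io::printfn("%d", weighted(view));        // 0*1 + 1*2 + 2*3 = 8
    io::printfn("%d", reversed_digits(view)); // 321
    double_all(view);
    io::printfn("%d", sum_values(view));      // 12
}
```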
Break, continue, nextcase¶
break_statement ::= "break" CONST_IDENT? ";"
continue_statement ::= "continue" CONST_IDENT? ";"
nextcase_statement ::= "nextcase" ((CONST_IDENT ":")? (expression | "default"))? ";"
break exits the innermost enclosing while, do, for, foreach, or switch statement. An optional label names a specific labelled enclosing statement to exit.
continue skips to the next iteration of the innermost enclosing while, do, for, or foreach. An optional label names a specific labelled enclosing loop. continue may not target a switch.
nextcase transfers control to another case of the innermost enclosing switch, or of a labelled enclosing switch if a label is supplied. The forms are:
- The form nextcase; transfers control to the textually following case.
- The form nextcase expression; transfers control to the case selected by expression. In a switch lowered to a jump table or otherwise capable of direct case selection, control is transferred directly to the matching case without re-evaluating the cases. In a switch lowered to an if-else chain, the cases are re-tested against expression starting from the first; control transfers to the first matching case. This form may not be used in a switch that has no condition. When both the nextcase operand and every case value are compile-time constants, the operand must match one of the cases; a nextcase operand that matches no case is a compile-time error. When either side is non-constant, no compile-time check is performed; the operand is evaluated and matched at runtime.
- The form nextcase default; transfers control to the default clause.
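A small sketch of explicit case transfer: the hypothetical function steps counts how many clauses run when control is chained from one case to the next with a constant operand and then to the default clause.

```c3
module nextcase_demo;
import std::io;

fn int steps(int start)
{
    int count = 0;
    switch (start)
    {
        case 0:
            count++;
            nextcase 1;        // constant operand: must match an existing case
        case 1:
            count++;
            nextcase default;  // jump to the default clause
        default:
            count++;
    }
    return count;
}

fn void main()
{
    io::printfn("%d %d %d", steps(0), steps(1), steps(5)); // 3 2 1
}
```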
Return statement¶
return terminates execution of the enclosing function or macro and returns control to the caller. If the function's declared return type is void, the operand must be omitted; otherwise it is required and must be assignable to the declared return type. In a function whose return type is an optional, returning a fault propagates that fault to the caller.
Defer statement¶
defer_statement ::= "defer" defer_kind? statement
defer_kind ::= "try"
| "catch"
| "(" "catch" IDENTIFIER ")"
A defer schedules the given statement to be executed when control leaves the enclosing block, regardless of how that exit occurs — fall-through, return, break, continue, nextcase, or fault propagation. Deferred statements within a block run in reverse order of their textual occurrence.
Variants:
- defer try statement: runs only when the enclosing block is exited without a fault.
- defer catch statement: runs only when the enclosing block is exited because of a fault.
- defer (catch fault_name) statement: runs on fault exit, binding the fault value to fault_name for use inside the deferred statement.
Deferred statements may not return, break, continue, or nextcase out of the function, but may execute their own internal control flow.
A defer body may not itself be a defer statement. However, if the body is a compound statement, that compound may contain any number of inner defer statements.
A defer body may not contain a break, continue, return, or rethrow that would exit the defer body itself. Such constructs are valid only when fully contained within the defer body (for example, a break inside a loop introduced by the defer body).
When the surrounding scope exits through return, the return expression is evaluated before the deferred statements run. The returned value, including any side effects of the return expression, is fixed before any defer body executes. For example:
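A minimal sketch of both properties, using a hypothetical global named trace to record what the deferred statements do: the return expression is evaluated first, and the defers then run in reverse textual order.

```c3
module defer_demo;
import std::io;

int trace; // records the order in which the deferred statements ran

fn int ordered()
{
    trace = 0;
    defer trace = trace * 10 + 1; // declared first, runs last
    defer trace = trace * 10 + 2; // declared last, runs first
    return trace;                 // evaluated as 0 before either defer runs
}

fn void main()
{
    int returned = ordered();
    io::printfn("%d %d", returned, trace); // 0 21
}
```

The caller sees 0 because the returned value was fixed before the defers modified trace; the final trace value of 21 shows that the second defer ran before the first.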
Deferred statements are run on regular exits from the enclosing scope only. A non-regular exit — longjmp, a panic, a signal-driven termination, or any other mechanism that bypasses normal control flow — does not run pending defers.
Assert statement¶
assert(cond) evaluates cond; if cond is false, the program is terminated by calling a panic function. With no message expression, the standard library's panic function is called. With a message and additional format arguments, panicf is called with the message as a format string. If panicf is not available (for example, when compiling without the standard library), the format arguments are discarded and panic is called with the message alone.
The condition is required to be present; the message and any format arguments are optional. The message must be a compile-time constant string; the format arguments are arbitrary expressions.
Assertions are active in safe builds. In fast builds, the compiler may treat the condition as a hint (an assume directive); reaching a program point where the asserted condition is false is then undefined behaviour.
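An assertion with a constant message and one format argument might be written as in this sketch; checked_div is a hypothetical helper, and the example only calls it with a valid divisor.

```c3
module assert_demo;
import std::io;

fn int checked_div(int a, int b)
{
    // The message is a constant format string; b is an ordinary expression.
    assert(b != 0, "divisor must be non-zero, got %d", b);
    return a / b;
}

fn void main()
{
    io::printfn("%d", checked_div(12, 3)); // 4
}
```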
Compile-time control statements¶
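Compile-time control statements direct compilation rather than runtime execution. Each block form is closed by its matching end keyword ($endif, $endswitch, $endforeach, or $endfor); $assert, $error, and $echo are terminated by a semicolon. See Compile-time evaluation for full semantics.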
ct_if_statement ::= "$if" constant_expression ":" statement* ("$else" statement*)? "$endif"
ct_switch_statement ::= "$switch" constant_expression? ":" ct_case_clause+ "$endswitch"
ct_case_clause ::= ("$case" constant_expression | "$default") ":" statement*
ct_foreach_statement ::= "$foreach" CT_IDENT ("," CT_IDENT)? ":" expression ":" statement* "$endforeach"
ct_for_statement ::= "$for" for_condition ":" statement* "$endfor"
ct_assert_statement ::= "$assert" constant_expression ("," constant_expression)* ";"
ct_error_statement ::= "$error" constant_expression ("," constant_expression)* ";"
ct_echo_statement ::= "$echo" constant_expression ";"
$assert triggers a compile-time error if the condition is false. $error always triggers a compile-time error with the given message. $echo emits a compile-time diagnostic. The looping and conditional forms drive code generation at compile time and form compile-time block scopes (see Blocks and scope).
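The conditional and looping forms can be sketched as follows. USE_WIDE is a hypothetical constant standing in for a build-time switch, and both function names are illustrative only.

```c3
module ct_demo;
import std::io;

const bool USE_WIDE = true; // hypothetical compile-time switch

// Only the selected branch is compiled into the function.
fn int chosen_width()
{
    $if USE_WIDE:
        return 64;
    $else
        return 32;
    $endif
}

// The loop is expanded at compile time into three additions.
fn int sum_unrolled()
{
    int total = 0;
    $for var $i = 1; $i <= 3; $i++:
        total += $i;
    $endfor
    return total;
}

fn void main()
{
    io::printfn("%d %d", chosen_width(), sum_unrolled()); // 64 6
}
```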
Inline assembly¶
asm_block_item ::= asm_label | asm_instruction
asm_label ::= CONST_IDENT ":"
asm_instruction ::= asm_mnemonic (asm_arg ("," asm_arg)*)? ";"
asm_mnemonic ::= (IDENTIFIER | "int") ("." IDENTIFIER)?
An asm block embeds platform-specific instructions. The full syntax of instructions and operands is given in Inline assembly.
The structured form accepts a sequence of labels and instructions in a common grammar that abstracts over the underlying processor.
A label has the form NAME:, where NAME is a CONST_IDENT. Labels may be jump targets for other instructions within the same asm block; they have no visibility outside the block.
An instruction consists of a mnemonic followed by zero or more comma-separated arguments and a terminating semicolon. The mnemonic is either an IDENTIFIER or the keyword int (permitted as a mnemonic so that the x86 int instruction may be written naturally), optionally followed by a .suffix to select an instruction variant — for example, vmov.f32 or cvt.u32.
Functions and methods¶
A function is a named, callable entity with a return type, a parameter list, and an optional body. A method is a function declared in a way that associates it with a particular type and is invoked using member-access syntax on receiver values of that type.
Function declarations and definitions¶
function_declaration ::= "fn" return_type function_name fn_parameter_list generic_decl? attributes? function_body
return_type ::= type "?"? | "void"
function_name ::= (type ".")? IDENTIFIER
fn_parameter_list ::= "(" parameter_list? ","? ")"
function_body ::= compound_statement
| "=>" expression ";"
| "=>" macro_call_with_trailing_block
| ";"
A function declaration introduces a name into module scope and binds it to a function entity. The body takes one of four forms:
- A compound statement { ... } defines the function with the given body.
- A short body => expression; defines the function as immediately returning the value of the expression; the function's return type, if not explicitly given, is the type of the expression.
- A short body invoking a macro with a trailing block, => @macro(args) { ... }, defines the function as a call to a macro whose trailing block is the function's body. The terminating ; is omitted because the trailing { ... } serves as the statement terminator.
- A forward declaration ; introduces the function but does not define it. A forward declaration must carry the @extern attribute (or appear in an interface body); the definition must be supplied elsewhere.
The return type may carry the ? suffix to denote an optional return type, indicating that the function may return either a value of the underlying type or a fault.
If function_name is preceded by a type and a ., the declaration introduces a method; otherwise it introduces an ordinary function.
Methods¶
A method is a function whose name is qualified by a receiver type:
A method may be invoked on a receiver of the receiver type using member-access syntax, in which case the receiver is passed as the first argument: f.bar(7) is equivalent to Foo.bar(&f, 7) when the first parameter has pointer-to-receiver type, or to Foo.bar(f, 7) when the first parameter has value type.
Methods may extend any user-defined type and any built-in numeric, pointer, slice, vector, array, any, or typeid type. The set of methods visible on a receiver type — its method set — is defined in Properties of types and values.
The first parameter of a method declaration is the receiver. By convention it is named self; the language imposes no specific name. Its declared type must be either the receiver type or a pointer to the receiver type.
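A sketch of a method with a pointer receiver and one with a value receiver, reusing the names Foo and bar from the text above. The struct layout and the is_empty method are assumptions made for illustration.

```c3
module method_demo;
import std::io;

struct Foo
{
    int total;
}

// Pointer receiver: the method can modify the value it is called on.
fn int Foo.bar(Foo* self, int amount)
{
    self.total += amount;
    return self.total;
}

// Value receiver: the method works on a copy of the receiver.
fn bool Foo.is_empty(Foo self) => self.total == 0;

fn void main()
{
    Foo f;                              // zero-initialized
    io::printfn("%s", f.is_empty());    // true
    io::printfn("%d", f.bar(7));        // 7, same as Foo.bar(&f, 7)
    io::printfn("%d", Foo.bar(&f, 7));  // 14
}
```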
Parameters¶
parameter_list ::= parameter_decl ("," parameter_decl)*
parameter_decl ::= parameter (("=" "...") | ("=" expression))?
parameter ::= "inline"? type ("..."? IDENTIFIER attributes?
| "..."? CT_IDENT
| (HASH_IDENT | "&" IDENTIFIER) attributes?
| attributes?)
| "..."
| HASH_IDENT attributes?
| "&" IDENTIFIER attributes?
| IDENTIFIER "..."? attributes?
| CT_IDENT
| CT_IDENT "..."
The principal parameter forms are:
- type name: an ordinary by-value parameter of the given type.
- type name = expression: a parameter with a default value. The default expression is evaluated at the call site whenever no argument is supplied for this parameter.
- type ... name: a typed variadic parameter; the name binds a slice type[] over the variadic arguments. A function may declare at most one variadic parameter, and it must be the last positional parameter.
- ...: an any-typed variadic parameter; accepts any number of arguments, each converted to any. Inside the function the variadic argument is accessed as $vaarg.
- inline type name: an inline parameter, used on a method receiver to indicate that subtype dispatch follows the parameter's inline-member chain (see Properties of types and values).
- type name attributes?: a parameter with attributes that modify how the parameter is passed (e.g., @in, @out, @noalias); see Attributes.
A parameter name may be omitted in a function declaration that has no body; the declared types alone determine the function's type. Names are required in a definition.
The parameter forms involving HASH_IDENT, &IDENTIFIER, CT_IDENT, and untyped identifiers are valid only in macro definitions; their semantics are described in Macros.
Default arguments¶
A function parameter may carry a default value, written as = expression. The expression must be a constant expression and is evaluated at the call site whenever the corresponding argument is omitted. Default arguments are matched positionally: once a parameter with a default begins the trailing portion of the parameter list, every subsequent positional parameter must also have a default.
The special form = ... indicates that the parameter defaults to the value of the surrounding variadic argument pack and is valid only in macro-style contexts.
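A small sketch of an ordinary default argument follows. The function scale and its parameter names are hypothetical; the example only assumes that a trailing parameter may carry a constant default, as stated above.

```c3
import std::io;

// factor is a trailing parameter with a constant default value.
fn int scale(int value, int factor = 2)
{
    return value * factor;
}

fn void main()
{
    io::printfn("%d", scale(5));      // factor defaults to 2: prints 10
    io::printfn("%d", scale(5, 3));   // explicit factor: prints 15
}
```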
Argument splatting¶
At the call site, an argument of the form ...expression splats a slice, array, or vector into the argument list, expanding it into a sequence of positional arguments. Splats may appear at any position in the argument list, including before, between, and after other positional arguments, and may also appear among the arguments that bind to a variadic parameter. Each splat operand contributes its elements in order to the positional argument sequence; the resulting expanded sequence must match the function's parameter list according to the usual rules.
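The following sketch illustrates splatting an array into a variadic call, both alone and between other arguments. The function sum and the array middle are illustrative names; the second call relies on the rule above that a splat may appear between positional arguments, which is worth confirming against the compiler in use.

```c3
import std::io;

fn int sum(int... values)
{
    int total = 0;
    foreach (v : values)
    {
        total += v;
    }
    return total;
}

fn void main()
{
    int[3] middle = { 2, 3, 4 };

    // The array's elements become three positional arguments.
    io::printfn("%d", sum(...middle));        // prints 9

    // A splat placed between ordinary arguments.
    io::printfn("%d", sum(1, ...middle, 5));  // prints 15
}
```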
Function attributes¶
A function declaration may carry attributes that affect linkage, inlining, calling convention, or visible properties of the function. Some commonly used attributes are:
- @extern: declares an externally linked function with no body.
- @export: exports the function from a static or dynamic library.
- @inline / @noinline: request, respectively, that the function be inlined or not inlined at call sites.
- @noreturn: declares that the function never returns to its caller.
- @naked: declares that the function body is a bare sequence of instructions, with no compiler-generated prologue or epilogue; typically used together with inline assembly.
- @pure: declares that the function has no observable side effects (and may be assumed to be safely callable in contracts).
- @deprecated: emits a diagnostic at use sites.
- @callconv("name"): selects an alternative calling convention provided by the target.
The complete set of attributes and their semantics is given in Attributes.
Function types and function pointers¶
A function type has the form fn return_type fn_parameter_list. A value of function type is a function pointer and may be assigned, passed, stored, and called like any other first-class value:
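alias UnaryOp = fn int(int);

fn int square(int x) => x * x;

fn void main()
{
    // Taking the address of square yields a function pointer.
    UnaryOp op = &square;
    // Call through the pointer; y is 25.
    int y = op(5);
}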
Function types compare by structural identity: two function types are the same if their return types, parameter types in order, and attributes agree. The unary & operator applied to a function name yields a function pointer of the corresponding type.
Lambdas¶
A lambda is an expression introducing an anonymous function:
lambda_expr ::= "fn" optional_type? fn_parameter_list attributes? lambda_body
lambda_body ::= compound_statement
| "=>" expression
A lambda evaluates to a function pointer. The return type may be omitted when it can be inferred from the body. A lambda has no access to runtime variables of the enclosing scope; only compile-time values of the enclosing scope are available within its body.
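A brief sketch of both lambda body forms follows. The alias IntOp and the helper apply are hypothetical names introduced for illustration.

```c3
import std::io;

alias IntOp = fn int(int);

fn int apply(int value, IntOp op)
{
    return op(value);
}

fn void main()
{
    // Lambda with a short (=>) body, passed directly as an argument.
    int doubled = apply(21, fn int(int x) => x * 2);

    // Lambda with a compound-statement body, stored in a variable.
    IntOp negate = fn int(int x) { return -x; };

    io::printfn("%d %d", doubled, negate(5)); // prints 42 -5
}
```

Neither lambda refers to a runtime variable of main, in line with the restriction stated above.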
Methods on built-in types¶
Methods may extend any built-in numeric, pointer, slice, vector, array, any, typeid, or fault type. Such methods become available through member-access syntax wherever the declaring module is imported:
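A minimal sketch, assuming a hypothetical method named doubled declared on the built-in type int:

```c3
import std::io;

// A method on the built-in type int; self is the receiver value.
fn int int.doubled(int self)
{
    return self * 2;
}

fn void main()
{
    int x = 21;
    io::printfn("%d", x.doubled()); // prints 42
}
```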
A method declared on a built-in type does not modify the type itself; it adds an entry to the method set visible within the module's import graph.
Operator overloading¶
Methods declared with operator-overload attributes — @operator(+), @operator([]), @operator(len), and so on — extend the corresponding operator for the receiver type. The list of overloadable operators and the signatures of their methods are given in Properties of types and values.
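As a hedged illustration of the @operator(+) form mentioned above, the sketch below overloads + for a hypothetical Vec2 struct. The type and method names are assumptions made for the example; the exact signature rules are those given in Properties of types and values.

```c3
import std::io;

struct Vec2
{
    int x;
    int y;
}

// Implements a + b for two Vec2 values.
fn Vec2 Vec2.add(self, Vec2 other) @operator(+)
{
    return (Vec2) { self.x + other.x, self.y + other.y };
}

fn void main()
{
    Vec2 a = { 1, 2 };
    Vec2 b = { 3, 4 };
    Vec2 c = a + b;
    io::printfn("%d %d", c.x, c.y); // prints 4 6
}
```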
Calling¶
A function call evaluates the function expression, then the arguments in left-to-right order, then transfers control to the function with the argument values bound to the corresponding parameters. The return value of the call has the function's declared return type. Order-of-evaluation rules are given in Expressions.
If the function returns an optional (Ty?) and the call expression is used in a context that does not consume the optional status, the optional must be handled either by try / catch unwrapping (see Statements), by the postfix ! (rethrow) or !! (force-unwrap), or by the operator ?? (optional-else).
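The handling forms listed above can be sketched as follows. The example assumes the standard library method String.to_int, which returns an optional int; the function parse_and_double is hypothetical.

```c3
import std::io;

// Postfix ! rethrows: a fault from to_int becomes this function's result.
fn int? parse_and_double(String text)
{
    int value = text.to_int()!;
    return value * 2;
}

fn void main()
{
    String bad = "oops";
    String good = "21";

    // ?? supplies a fallback value when the call yields a fault.
    int fallback = bad.to_int() ?? -1;
    io::printfn("%d", fallback); // prints -1

    // catch unwraps: after the early return, result is a plain int.
    int? result = parse_and_double(good);
    if (catch err = result)
    {
        io::printn("parsing failed");
        return;
    }
    io::printfn("%d", result); // prints 42

    // !! force-unwraps and would panic on a fault.
    int forced = good.to_int()!!;
    io::printfn("%d", forced); // prints 21
}
```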
Macros¶
A macro is a callable entity defined by source text that is expanded inline at each call site, with arguments bound according to a parameter list. Unlike a function call, a macro expansion is not a runtime call: the macro's body is integrated into the caller's body, and variables and labels introduced by the macro are hygienic (do not leak into the caller's scope, and do not capture from it except through declared parameters).
Macro declarations¶
macro_declaration ::= "macro" return_type? macro_name "(" macro_params ")" generic_decl? attributes? function_body
macro_name ::= (type ".")? (IDENTIFIER | AT_IDENT)
macro_params ::= parameter_list? (";" trailing_block_param)?
trailing_block_param ::= AT_IDENT ("(" parameter_list? ")")?
The function_body non-terminal is the same as for functions; see Functions and methods.
The return type is optional. When omitted, it is inferred from the body's return expressions; all return paths within the body must agree on a single type.
A macro's name is either an IDENTIFIER or an AT_IDENT. The macro must use an AT_IDENT name if any of the following holds:
- it declares one or more expression (#) parameters,
- it declares a trailing-block parameter, or
- it declares a raw variadic parameter (the ... form described below).
This rule lets each call site signal — by the leading @ — that the call may exhibit non-function-like behaviour (lazy expression binding, insertion of a caller-supplied block, or raw access to a variadic argument pack). The presence of compile-time ($) parameters alone does not require an AT_IDENT name.
The attribute @safemacro placed on a macro declaration overrides the rule above: a macro carrying @safemacro may use an ordinary IDENTIFIER name even when it uses one of the features that would otherwise require AT_IDENT. This is provided for cases where the author has determined that the macro behaves like an ordinary function from the caller's perspective.
If macro_name is qualified by a receiver type (Foo.bar, Foo.@bar), the declaration introduces a macro method, invocable through member-access syntax on values of the receiver type. The semantics parallel ordinary methods (Functions and methods).
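The naming rule can be illustrated with a short sketch. Both macro names are hypothetical: square is function-like and may use an IDENTIFIER name, while @twice takes an expression parameter and therefore uses an AT_IDENT name.

```c3
import std::io;

// Function-like macro: an ordinary IDENTIFIER name is allowed.
macro square(x)
{
    return x * x;
}

// Uses an expression (#) parameter, so the name is an AT_IDENT.
macro @twice(#action)
{
    #action;
    #action;
}

fn void main()
{
    int count = 0;
    @twice(count++);   // the expression is evaluated twice
    io::printfn("%d %d", square(4), count); // prints 16 2
}
```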
Macro parameters¶
A macro parameter binds an argument from the call site to a name visible within the macro body. The principal parameter forms are:
- type name: a typed parameter. The argument expression is evaluated once at expansion time, converted to type, and the result is bound to name.
- name: an untyped parameter. The type is inferred from the argument; otherwise the parameter behaves as a typed parameter.
- #name: an expression parameter. The argument expression is bound to name without being evaluated. Each textual use of name within the macro body re-evaluates the expression in the caller's lexical context. Use of # parameters requires an AT_IDENT macro name (unless overridden by @safemacro).
- $name (a CT_IDENT): a compile-time value parameter. The argument must be a constant expression; the parameter is a compile-time variable in the body.
- $Name (a CT_TYPE_IDENT): a compile-time type parameter. The argument must be a type; the parameter denotes that type within the body.
A parameter may carry attributes (see Attributes). A trailing parameter may have a default value = expression, with the same rules as function defaults (Functions and methods).
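A short sketch of the compile-time parameter forms follows. The macros convert and repeat_add are hypothetical; the $for loop uses the colon-terminated syntax that the compile-time statements in this document follow.

```c3
import std::io;

// $Type is a compile-time type parameter; value is an untyped parameter.
macro convert($Type, value)
{
    return ($Type)value;
}

// $count is a compile-time value parameter; step is a typed parameter.
macro int repeat_add($count, int step)
{
    int total = 0;
    $for var $i = 0; $i < $count; $i++:
        total += step;
    $endfor
    return total;
}

fn void main()
{
    int truncated = convert(int, 3.9);
    io::printfn("%d", truncated);          // prints 3
    io::printfn("%d", repeat_add(4, 5));   // prints 20
}
```

Neither macro uses a # parameter, a trailing block, or a raw variadic, so both keep ordinary IDENTIFIER names.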
Variadic parameters¶
A macro may declare a single variadic parameter as its last positional parameter, in one of three forms:
- type... name: a typed slice variadic. The arguments are collected into a slice type[] bound to name.
- name...: an untyped slice variadic. The arguments are collected into a slice of any[]; each element preserves its original type when accessed through $Typefrom and related forms.
- ...: a raw variadic. The arguments are not collected into a slice; instead, they are accessible only through the compile-time accessors below.
The typed and untyped slice forms are also available for ordinary functions. The raw form is unique to macros, and its use requires an AT_IDENT macro name (unless overridden by @safemacro).
Compile-time access to variadic arguments¶
Within the body of a macro declared with ... (a raw vaarg), the following compile-time accessors are valid:
- $vaarg.len: the number of variadic arguments, as a compile-time constant.
- $vaarg[i]: the i-th variadic argument; i must be a compile-time constant integer.
- ...$vaarg: splats all variadic arguments, in source order, into the surrounding call or compound literal. See Expressions.
- $stringify($vaarg[i]): the textual form of the i-th argument as a string literal.
- $Typefrom($vaarg[i]): the type of the i-th argument.
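The sketch below shows one way the accessors might be combined. It uses the $vaarg.len and $vaarg[i] spellings exactly as listed above; a given compiler release may spell these accessors differently, so treat the example as an assumption to verify. The macro name @sum_all is hypothetical.

```c3
import std::io;

// Raw variadic: arguments are reached only through the $vaarg accessors.
macro int @sum_all(...)
{
    int total = 0;
    $for var $i = 0; $i < $vaarg.len; $i++:
        total += $vaarg[$i];
    $endfor
    return total;
}

fn void main()
{
    // Expands to three additions, one per argument.
    io::printfn("%d", @sum_all(1, 2, 3));
}
```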
Trailing-block parameters¶
A macro may declare a trailing-block parameter after a semicolon in the parameter list. The block parameter is an AT_IDENT optionally followed by a parameter list:
At the call site, the trailing block is supplied as a compound statement following the closing parenthesis of the macro call, optionally preceded by the names that bind the block's parameters:
Within the macro body, the trailing block is invoked using the AT_IDENT name, like a nested macro: @body(i, a[i]). Each invocation expands the supplied block with the named arguments.
A macro that declares a trailing-block parameter must have an AT_IDENT name.
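A minimal sketch of a trailing-block macro and its call site follows. The macro name @each_indexed and the parameter names are hypothetical; the block parameter is named @body to match the invocation form described above.

```c3
import std::io;

// The block parameter @body takes an index and a value.
macro @each_indexed(int[] items; @body(usz index, int value))
{
    for (usz i = 0; i < items.len; i++)
    {
        @body(i, items[i]);
    }
}

fn void main()
{
    int[3] data = { 10, 20, 30 };
    int total = 0;

    // The names i and v bind the block's parameters at the call site.
    @each_indexed(data[..]; usz i, int v)
    {
        total += v * (int)i;
    };

    io::printfn("%d", total); // prints 80
}
```

The block can read and modify total because it is expanded in the caller's scope.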
Macro body forms¶
The macro body has the same forms as a function body: a compound statement, a short => expression; body, or a short body invoking another macro with a trailing block. See Functions and methods.
A macro that produces a value uses the same return expression; syntax as a function and may be invoked in any context in which an expression of the macro's return type is valid; a macro with return type void, or with no return paths, is invoked as a statement.
Constant folding¶
A macro is constant-folded at a given call site when the expansion reduces to a single compile-time constant. The conditions are:
- The body contains exactly one runtime statement, and that statement is a return.
- The returned expression evaluates to a compile-time constant.
Any number of compile-time statements ($if, $for, $foreach, $switch, $assert, and so on) may appear in the body without affecting constant folding, since they are evaluated during compilation and do not contribute runtime statements.
The attribute @const placed on a macro asserts that the macro folds to a compile-time constant for every valid call. The compiler verifies the assertion against the body and, if it does not hold, reports the specific construct that prevents folding.
A constant-folded macro call may appear in any context that requires a compile-time constant — array sizes, the condition of $if, the value of a named constant, attribute arguments, and so on.
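As an illustration, the sketch below uses a constant-folded macro call as the value of a named constant, which in turn sizes an array. The macro doubled and the constant BUFFER_LEN are hypothetical names.

```c3
import std::io;

// One runtime statement: a return of a constant expression.
macro int doubled(int $n) @const
{
    return $n * 2;
}

// The call folds to 16, so it may initialise a named constant.
const int BUFFER_LEN = doubled(8);

fn void main()
{
    int[BUFFER_LEN] buffer;
    io::printfn("%d", buffer.len); // prints 16
}
```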
Invocation¶
A macro is invoked using the same syntax as a function call. If the macro's name is an AT_IDENT, the call site uses that same form:
If the macro's name is an IDENTIFIER, the call uses the bare name:
Argument expressions are bound to parameters according to the parameter form: typed and untyped parameters bind to values at expansion, while #, $, and $T parameters bind expressions, compile-time values, and types respectively. Argument evaluation order at the call site follows the rules in Expressions.
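The difference between value binding and expression binding can be seen in a small sketch. The helper next and both macros are hypothetical; the global counter records how many times the argument expression is evaluated.

```c3
import std::io;

int calls = 0;

fn int next()
{
    calls++;
    return calls;
}

// Typed parameter: the argument is evaluated once, then bound to x.
macro int twice_value(int x)
{
    return x + x;
}

// Expression parameter: each use of #x re-evaluates the argument.
macro int @twice_expr(#x)
{
    return #x + #x;
}

fn void main()
{
    int a = twice_value(next());   // next() runs once: 1 + 1
    int b = @twice_expr(next());   // next() runs twice: 2 + 3
    io::printfn("%d %d %d", a, b, calls); // prints 2 5 3
}
```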
Hygiene¶
Identifiers introduced inside a macro body — local variables, labels, parameter names — are renamed during expansion so that they do not collide with identifiers in the caller's scope. Conversely, the macro body does not implicitly see the caller's local variables; access to caller-side names is possible only through #, $, and $Name parameters, which establish the binding explicitly.
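A short sketch of the hygiene rule follows, using a hypothetical macro @swap_ints whose local variable shares a name with a variable in the caller.

```c3
import std::io;

macro @swap_ints(#a, #b)
{
    int temp = #a;   // local to the expansion
    #a = #b;
    #b = temp;
}

fn void main()
{
    int temp = 1;    // same name as the macro's local
    int other = 2;
    @swap_ints(temp, other);
    io::printfn("%d %d", temp, other); // prints 2 1
}
```

The caller's temp is reached only through the # parameter, so the macro's own temp does not shadow or overwrite it.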
Recursion¶
A macro may invoke itself or other macros. The maximum depth of macro expansion is bounded by the implementation's macro-recursion-depth setting; exceeding the limit is a compile-time error.
Compile-time evaluation¶
C3 supports a significant subset of the language at compile time. Compile-time evaluation drives conditional compilation, generic instantiation, the bodies of macros, the conditions of $if and $assert, the sizes of arrays, the values of named constants, and the arguments of attributes. This chapter describes the compile-time expression and value forms, the compile-time control structures, and the built-in compile-time operators.
Constant expressions¶
A constant expression is an expression whose value is determined by the compiler at compile time. The result is a value of a definite type, available wherever a constant is required.
The following expressions are constant:
- Integer, floating-point, character, boolean, and string literals.
- The constants null, true, false.
- The result of any operator from Expressions applied to constant operands (including arithmetic, bitwise, shift, comparison, logical, optional-else, Elvis, ternary, and cast), together with the compile-time-only operators +++, &&&, ||| described below.
- References to global constants whose initializer is itself a constant expression.
- Compile-time variables and compile-time type variables (see below).
- A compound literal whose elements are constant expressions.
- A member access on a compile-time-known aggregate.
- A call to a macro that folds to a constant (see Macros).
- The compile-time analysis expressions $eval, $stringify, $defined, $feature, $Typeof, $Typefrom, and $reflect.
- A $vaarg access, when the corresponding macro argument is itself a constant expression.
- Type-access expressions of the form Type::typeid, Type::alignment, Type::size, and similar (see Properties of types and values).
A constant expression is required in the following contexts:
- The size of an array type (T[N]).
- The condition of $if, $switch, and $assert.
- The value of a named constant declaration.
- An argument to an attribute.
- The offsets and widths of bitstruct members.
- The value of a $case clause within $switch.
Compile-time variables and types¶
A compile-time value variable has a CT_IDENT name ($name). It holds a value known at compile time and may be reassigned within its compile-time block scope.
A compile-time type variable has a CT_TYPE_IDENT name ($Name). It denotes a type known at compile time.
Compile-time variables may be declared inside function or macro bodies, in compile-time control structures, and in macro parameter lists. They obey compile-time block scope as described in Blocks and scope. They may not be declared at module scope.
A compile-time variable has no runtime existence: no storage is reserved, and its address may not be taken. References to compile-time variables in generated code are replaced by the variable's value at the point of reference.
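A minimal sketch of a compile-time value variable being reassigned inside a function body follows. The function name is hypothetical; the generated code contains only the final constant.

```c3
import std::io;

fn int sum_one_to_four()
{
    // $total exists only during compilation.
    var $total = 0;
    $for var $i = 1; $i <= 4; $i++:
        $total = $total + $i;
    $endfor
    // The reference is replaced by the value 10.
    return $total;
}

fn void main()
{
    io::printfn("%d", sum_one_to_four()); // prints 10
}
```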
Compile-time operators¶
Three operator variants are reserved for compile-time evaluation:
- +++: compile-time concatenation. Joins two compile-time-known arrays, slices, or strings into a new compile-time value.
- &&&: compile-time short-circuit AND. Evaluates the right operand only when the left is true. Unlike runtime &&, the right operand is not even type-checked when the left is false; this allows referring to entities that may not exist on all paths.
- |||: compile-time short-circuit OR. The dual of &&&: the right operand is not type-checked when the left is true.
The &&& and ||| operators are typically used together with $defined to guard the use of an entity by the existence of that entity:
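A minimal sketch of such a guard, assuming a hypothetical macro named @feature that is deliberately left undeclared here:

```c3
import std::io;

fn void main()
{
    // No macro named @feature is declared in this sketch, so $defined
    // yields false and the right operand of &&& is never type-checked.
    $if $defined(@feature()) &&& @feature():
        io::printn("feature enabled");
    $else
        io::printn("feature not available");
    $endif
}
```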
The above is well-formed whether or not @feature exists; with ordinary &&, the call to @feature() would be type-checked even when $defined returns false and would produce a compile error.
Compile-time control flow¶
The compile-time control structures direct what code the compiler generates rather than runtime execution. Their syntax is given in Statements; the semantics below specify what the compiler does at each form.
$if and $else¶
The condition is a constant boolean expression. If the condition is true, the body of the $if branch is compiled as part of the surrounding program; if false, the body is discarded and the $else branch (if present) is compiled instead. Discarded branches are not generated and are not subject to ordinary type checking — they need only be syntactically well-formed.
A declaration introduced within the selected branch enters the enclosing scope; declarations within a discarded branch do not.
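As a brief sketch, the function below selects one of two statements at compile time based on a hypothetical constant VERBOSE. Only the selected branch is compiled into the function.

```c3
import std::io;

const bool VERBOSE = false;

fn void report(int value)
{
    $if VERBOSE:
        io::printfn("value = %d", value);
    $else
        io::printfn("%d", value);
    $endif
}

fn void main()
{
    report(42); // prints 42
}
```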
$switch¶
The switch operand is a constant expression. Each $case value is a constant expression. The compiler selects the first matching $case (or $default if none matches), compiles its body, and discards the remaining cases. The following rules apply:
- The operand and case values are constant expressions. When the operand is a type, every case value must also be a type.
- If the operand is omitted ($switch:), each $case is a boolean constant expression; the first case whose expression is true is selected.
- A $case clause with no statements falls through to the next clause. Successive empty clauses chain together, and the first non-empty clause encountered supplies the statements that are processed.
- Only the selected clause is processed by the compiler; the statements of the other clauses are not semantically checked.
- $switch does not support ranged cases, break, or nextcase.
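A sketch of a $switch over a type operand follows. The macro describe is hypothetical; every case value is a type, as the rules above require.

```c3
import std::io;

macro String describe($Type)
{
    $switch $Type:
        $case int:
            return "int";
        $case double:
            return "double";
        $default:
            return "other";
    $endswitch
}

fn void main()
{
    io::printn(describe(int));   // prints int
    io::printn(describe(char));  // prints other
}
```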
$for¶
The compiler unrolls the loop: the body is generated once per iteration, with the loop variables (which are compile-time variables) bound to compile-time-constant values for that iteration. The control expressions are evaluated at compile time.
Each unrolled iteration produces an independent compile-time block scope. Declarations introduced in one iteration do not collide with the corresponding declarations of another iteration.
$foreach¶
$foreach unrolls over a compile-time-known sequence: a compile-time array, slice, string, or other iterable available at compile time, including the member lists exposed by reflection. With one loop variable, the variable binds the element; with two, the first binds the index and the second binds the element.
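The two-variable form can be sketched as follows, iterating over a hypothetical global constant array. The assumption here is that a constant array is an acceptable compile-time sequence; the loop unrolls into three additions.

```c3
import std::io;

const int[3] WEIGHTS = { 2, 3, 5 };

fn int weighted_index_sum()
{
    int total = 0;
    // $i binds the index, $w binds the element.
    $foreach $i, $w : WEIGHTS:
        total += (int)$i * $w;
    $endforeach
    return total;
}

fn void main()
{
    io::printfn("%d", weighted_index_sum()); // prints 13
}
```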
Compile-time diagnostics¶
The compile-time diagnostic statements take a sequence of comma-separated arguments and do not require enclosing parentheses:
$assert FOO > 0, "Invalid foo";
$error "Unsupported configuration";
$echo "Building with verbose mode";
- $assert condition, message?, ...: evaluates condition at compile time; if false, emits a compile-time error. The optional message expressions are evaluated at compile time and composed into the diagnostic.
- $error message, ...: unconditionally emits a compile-time error with the given message. Typically used in a discarded branch to flag unsupported configurations.
- $echo message: emits a compile-time informational diagnostic. No runtime effect.
Compile-time analysis builtins¶
- $defined(check, ...): yields true when every operand is well-formed in the current scope. Each operand is either a candidate expression or a candidate local variable declaration (type IDENTIFIER ("=" expression)?). See Expressions.
- $feature(NAME): yields true when the build-system feature flag named NAME is enabled.
- $eval(string): parses the compile-time string string as the name of an entity (a variable, function, or other named declaration), optionally qualified by a module path, and yields a reference to that entity in the current scope. The string may not contain an arbitrary expression; it names something already declared.
- $stringify(expression): yields the source text of expression as a compile-time string. The expression is not evaluated.
- $Typeof(expression): yields the type of expression without evaluating it.
- $Typefrom(value): yields a type. The operand is either a compile-time typeid value or a compile-time string giving the name of a type.
- $vaarg, $vaarg.len, $vaarg[i], ...$vaarg: accessors for the raw variadic arguments of a macro; see Macros.
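Two of these builtins are combined in the sketch below. The macro @show and the member name no_such_member are hypothetical; the example assumes that $defined yields false for a member access that does not resolve.

```c3
import std::io;

// $stringify captures the argument's source text; #expr is then evaluated.
macro @show(#expr)
{
    io::printfn("%s = %d", $stringify(#expr), #expr);
}

fn void main()
{
    int width = 6;
    @show(width * 7);

    // $defined checks well-formedness without evaluating anything.
    $if $defined(width.no_such_member):
        io::printn("unexpected member found");
    $else
        io::printn("width has no member named no_such_member");
    $endif
}
```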
The semantics of $reflect and the family of reflective accessors are described in Reflection.
Top-level conditional compilation¶
The @if(condition) attribute attached to a top-level declaration is the module-scope analogue of $if: a declaration carrying @if(cond) is compiled only when cond evaluates to true. A module section attribute @if(condition) applies the same effect to every declaration in the section (see Blocks and scope).
When @if-conditional declarations refer to one another, the evaluation order is consistent with module-level dependency resolution: a declaration that depends on another @if-conditional declaration sees the result of evaluating that dependency's condition.
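A small sketch of two mutually exclusive declarations follows. It assumes the standard library constant env::WIN32; the function name platform_name is hypothetical. Exactly one of the two definitions is compiled for any given target.

```c3
import std::io;
import std::core::env;

// Compiled only when targeting Windows.
fn String platform_name() @if(env::WIN32)
{
    return "windows";
}

// Compiled on every other target.
fn String platform_name() @if(!env::WIN32)
{
    return "not windows";
}

fn void main()
{
    io::printn(platform_name());
}
```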
Compile-time execution of macros¶
A macro call evaluates at compile time when invoked from a constant-expression context, provided the macro folds to a constant under the rules in Macros. The result is the constant value produced by the macro's return statement. Ordinary functions do not execute at compile time; only macros and the compile-time builtins above are usable in constant-expression contexts.
Source-text inclusion¶
The following top-level and statement-level compile-time directives bring source text into the current translation unit:
- $include("path"): includes the contents of the named file at the current point in the source, as if its text had appeared there directly. Valid only at the top level. Requires trust level include or higher.
- $exec("command", args?, stdin?): executes an external program at compile time and includes its standard output as source text. Requires trust level full.
- $expand(string): parses the compile-time string string as C3 source and inserts the resulting statements at the directive's location. When $expand appears at module scope, the string is parsed as a sequence of top-level declarations; when it appears inside a function or macro body, the string is parsed as a sequence of statements within the current scope.
- $embed("path"): embeds the contents of the named file as a compile-time byte-array value rather than as source text.
The trust level is configured by the build system; see Modules.
Reflection¶
C3 provides compile-time access to the structure, identity, and properties of types, values, and declarations. Reflection in C3 is fully compile-time: every reflective query produces a constant expression or a compile-time-known value, and may be used wherever a constant is required.
Type identity¶
Every type has a runtime type identifier of type typeid. The typeid of a type Ty is obtained as Ty::typeid. Two typeid values compare equal with == if and only if they identify the same type.
The type typeid is a built-in opaque type whose size and alignment equal the platform pointer width (see Properties of types and values).
A typeid value is itself a constant expression when its operand is a static type name; it is a runtime value when read from an any or interface value (through .type), or produced by similar runtime queries on dynamic dispatch results.
Type-property access¶
Every type supports a fixed set of compile-time accessors selected through the :: operator. Some are defined for every type; others are restricted to specific kinds of type. Accessing a property that is not defined for the receiver's kind is a compile-time error.
Some accessors yield values; others yield types or compile-time-only entities such as member lists.
The accessors defined for every type are:
- Ty::typeid: the type's typeid value.
- Ty::size: the size of the type in bytes.
- Ty::alignment: the alignment of the type in bytes.
- Ty::kind: the type's TypeKind value (an enum defined in std::core::types).
- Ty::name: the type's simple name as a compile-time string.
- Ty::qname: the type's fully qualified name (with module path) as a compile-time string.
- Ty::has_equals: true if == and != are defined on the type.
- Ty::is_ordered: true if <, <=, >, >= are defined on the type.
- Ty::methods: a compile-time array of strings giving the names of methods declared on the type.
The accessors below are restricted to the type kinds for which they make sense; using them on an unsupported kind is a compile-time error.
- Ty::min / Ty::max: the minimum and maximum representable values; defined for integer and floating-point types.
- Ty::nan / Ty::inf: NaN and infinity values; defined for floating-point types.
- Ty::len: the length of an array, vector, or enum-like type. For arrays and vectors it is the number of elements; for enums and constdefs it is the number of declared constants.
- Ty::members: a compile-time list of member descriptors. Defined for struct, union, bitstruct, and enum types. Each element is a reflective reference equivalent to $reflect applied to that member. Because the list is untyped at runtime, it may be iterated only at compile time.
- Ty::inner: the inner typeid of a composite type:
  - Array: the element type.
  - Vector: the element type.
  - Pointer: the pointee type.
  - Bitstruct: the backing type.
  - Enum: the backing integer type.
  - Typedef: the underlying type.
- Ty::parent: for typedef, constdef, bitstruct, and struct types, the typeid of the inline member; for struct, the typeid of the inlined substruct member, if any.
- Ty::is_substruct: defined for struct; true if the struct has an inline member.
- Ty::params: defined for function pointer types; a compile-time array of parameter descriptors (each with .name and .type).
- Ty::returns: defined for function pointer types; the return type as a typeid.
- Ty::cname: the external (mangled) name of the type as a compile-time string; not defined for built-in types.
- Ty::from_ordinal(i): defined for enum and constdef; produces the value with the given ordinal.
- Ty::lookup_field(field, value): defined for enum; returns an optional containing the first value whose associated field equals value, or a fault if none matches.
- Ty::values: defined for enum and constdef; a compile-time array of the declared values.
- Ty::get_tag(name) / Ty::has_tag(name): query user-defined tags attached to the type.
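The sketch below reads a few of these accessors using the :: spelling exactly as this chapter gives it. Compiler releases may spell type properties differently, so treat the accessor syntax as an assumption to check against the compiler in use. The struct Point is hypothetical.

```c3
import std::io;

struct Point
{
    int x;
    int y;
}

fn void main()
{
    // Accessors defined for every type.
    io::printfn("%s: %d bytes, alignment %d", Point::name, Point::size, Point::alignment);

    // Accessors restricted to numeric types.
    io::printfn("int range: %d .. %d", int::min, int::max);

    // Type properties are constant expressions, so they may appear in $assert.
    $assert Point::size == 2 * int::size, "unexpected padding in Point";
}
```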
Reflective references and member queries¶
The accessor Ty::members yields a compile-time list of reflective references: opaque compile-time-only handles describing each member. A reflective reference supports a fixed set of property accesses:
- .name: the member's source name as a string.
- .qname: the qualified name as a string.
- .type: the member's type as a typeid.
- .offset: the member's offset within its enclosing aggregate.
- .alignment: the member's alignment.
- .kind: the member's TypeKind.
- .get_tag(name) / .has_tag(name): user-defined tags attached to the member.
Reflective references may be iterated using $foreach (see Statements); they are usable only at compile time.
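A hedged sketch of member iteration follows, using the Ty::members accessor and the .name and .offset properties as spelled in this chapter. The struct Point is hypothetical, and the accessor spellings should be verified against the compiler in use.

```c3
import std::io;

struct Point
{
    int x;
    int y;
}

fn void main()
{
    // Unrolled at compile time: one printfn call per member.
    $foreach $member : Point::members:
        io::printfn("%s at offset %d", $member.name, $member.offset);
    $endforeach
}
```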
Compile-time reflection of expressions¶
The built-in $reflect(expression) yields a reflective reference describing the given expression. The set of accessors available depends on what the expression refers to:
- For a variable or constant: .name, .qname, .cname, .type, .alignment, .kind, .get_tag / .has_tag.
- For a function or macro: .name, .qname, .cname, .params, .returns, .get_tag / .has_tag.
- For a type: the same accessors as Type::accessor above.
A program may check whether a particular accessor is available for an expression by combining $defined with $reflect:
Type queries on values¶
- $Typeof(expression): the static type of expression, as a type usable in type contexts. The expression is not evaluated.
- $Typefrom(value): the type denoted by a compile-time typeid value or by a compile-time string giving the type's name.
- $stringify(expression): the source text of expression as a compile-time string. For an #expression macro parameter, $stringify produces the text of the argument passed for the parameter, not the parameter's own name.
Dynamic reflection through any and interfaces¶
A value of type any or an interface type carries a runtime typeid accessible as .type. This typeid may be compared against static typeid values, used as the operand of a switch, or passed to functions for runtime dispatch.
The pointer to the underlying storage is accessible as .ptr. Combining .type with .ptr enables runtime reflection on heterogeneous values.
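A minimal sketch of runtime dispatch on an any value follows. It compares .type against static typeid values written in the Ty::typeid form this chapter uses; that spelling is an assumption to verify against the compiler in use. The function describe is hypothetical.

```c3
import std::io;

fn void describe(any value)
{
    if (value.type == int::typeid)
    {
        // .ptr points at the underlying storage.
        int* p = (int*)value.ptr;
        io::printfn("int: %d", *p);
        return;
    }
    if (value.type == double::typeid)
    {
        double* p = (double*)value.ptr;
        io::printfn("double: %f", *p);
        return;
    }
    io::printn("unsupported type");
}

fn void main()
{
    int n = 42;
    describe(&n);   // a pointer converts implicitly to any
}
```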
TypeKind¶
The enumeration TypeKind, defined in std::core::types, enumerates the kinds of type that may appear in Ty::kind and reflection results. Members include SIGNED_INT, UNSIGNED_INT, FLOAT, BOOL, POINTER, STRUCT, UNION, ENUM, CONSTDEF, VECTOR, ARRAY, SLICE, BITSTRUCT, INTERFACE, ANY, TYPEID, FAULT, FUNC, and TYPEDEF, among others. The full enumeration is part of the standard library.
Restrictions¶
Reflection is a compile-time facility. Reflective references, member lists, and type-property results that are not themselves runtime values (such as Ty::members) may not be assigned to runtime variables, returned from runtime code, or stored in runtime data structures. Compile-time iteration ($foreach) and conditional logic ($if) are the mechanisms for traversing reflective data.
Method introspection through Ty::methods is subject to the ordering caveat that methods are registered into the compiler's type tables after the types themselves; reflective queries on the method set are guaranteed consistent only when performed inside a function body.
Attributes¶
An attribute is a piece of metadata attached to a declaration or, in a few cases, to a statement. Attributes influence compilation in ways that range from the purely informational (deprecation diagnostics) to the structural (layout, linkage, calling convention). Some attributes have a single canonical meaning fixed by the language; others may be combined and composed into named compounds via attribute definitions.
Attribute syntax¶
An attribute is written @name or @name(argument-list). The lexical kind of the name is AT_IDENT. Multiple attributes may be attached to a single declaration; they are written one after another at the position the declaration's grammar permits attributes.
Each declaration form specifies the precise position where its attribute list may appear (see Declarations, Functions and methods, Variables, Types). The arguments of an attribute are constant expressions; their kinds and number depend on the specific attribute. Most attributes accept zero or one argument; the attributes @link, @tag, and @wasm accept additional arguments as described in their entries below.
Built-in attributes¶
The attributes recognized by the language are grouped below by purpose. Unless noted otherwise, an attribute is valid on the declaration forms for which it is meaningful and ignored or rejected on others.
Visibility¶
- @public: the declaration is visible to importers (the default for module-level declarations).
- @private: the declaration is visible only within the same module.
- @local: the declaration is visible only within the same file.
- @builtin: the declaration is visible without qualification across all modules; reserved for standard-library declarations.
The first three may also appear on a module section to set the default visibility for all declarations in the section (see Blocks and scope).
Linkage and storage¶
- @export: the declaration is exported as a public symbol when building a library.
- @weaklink: emits the symbol with weak linkage rather than global linkage. A reference to a weak-linked symbol that is unresolved at link time resolves to null instead of producing a link error.
- @weak: like @weaklink, but in addition, if a non-weak definition of the same symbol exists in the same compilation, the non-weak definition supersedes the weak one. For example, given both a @weak and a non-@weak definition of test(), a call to test() invokes the non-@weak definition.
- @link(library): adds the named library to the link command.
- @section(name): places the declaration in the named object-file section.
- @cname(name): overrides the symbol's external name with the given string.
- @nostrip: prevents the symbol from being removed by dead-code stripping.
Inlining, calling, and control flow¶
- @inline / @noinline: request, respectively, that calls to this function be inlined or not.
- @callconv(name): selects a calling convention. The argument is a compile-time string; the recognized values are implementation dependent, but will at least contain "cdecl". The default is "cdecl". If more than one @callconv is applied to a function or call, the last takes precedence.
- @naked: the function has no compiler-generated prologue or epilogue; typically used with inline assembly.
- @noreturn: the function never returns to its caller; reaching its textual end is an error.
- @pure: the function has no observable side effects and may be assumed safe to call in contracts.
- @maydiscard / @nodiscard: explicitly allow or forbid discarding the return value at call sites.
- @finalizer / @finalizer(priority): registers the function to be called at program shutdown. The optional priority argument is an integer; lower values run earlier than higher values.
- @init / @init(priority): registers the function to be called at program startup, before main. The optional priority argument has the same convention as @finalizer.
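As a brief sketch of the startup and shutdown attributes, the program below registers one function of each kind. The function names setup and teardown are hypothetical. Per the descriptions above, setup runs before main and teardown runs at program shutdown.

```c3
import std::io;

// Registered to run at program startup, before main.
fn void setup() @init
{
    io::printn("setup");
}

// Registered to run at program shutdown.
fn void teardown() @finalizer
{
    io::printn("teardown");
}

fn void main()
{
    io::printn("main");
}
```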
Initialization and layout¶
- @noinit — suppresses default zero-initialization of a variable; the variable's initial value is indeterminate.
- @mustinit — applied to a type; declares that variables of the type may not opt out of initialization.
- @constinit — applied to a typedef; permits implicit conversion of literal values to the typedef's name.
- @safeinfer — applied to a local variable declared with var; opts into type inference for a runtime local. The var form is otherwise reserved for compile-time variables and macro parameters; var x @safeinfer = expression; permits its use for a runtime local whose type is inferred from the initializer.
- @align(n) — raises the alignment of a type, variable, or function to at least n (a power of two).
- @packed — sets all field alignments of a struct or bitstruct to 1; eliminates inter-field padding.
- @compact — uses the smallest possible layout consistent with field requirements.
- @nopadding — requires that the layout introduce no padding bytes; declarations that would require padding are rejected.
- @overlap — permits a struct's fields to overlap (advanced; see the standard library).
- @bigendian / @littleendian — fixes the byte order of a bitstruct's backing storage.
- @obfuscate — applied to an enum or fault declaration; omits member-name information from reflection and runtime introspection. Useful for size-sensitive builds.
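The effect of the layout attributes can be observed through the sizeof and alignof type properties. The sketch below is illustrative only: the struct names are hypothetical, and the size of the unpacked struct depends on the target's alignment of uint.

```c3
module layoutdemo;
import std::io;

// Default layout: padding is inserted after tag so that length is aligned.
struct Plain
{
    char tag;
    uint length;
}

// @packed sets every field alignment to 1, so no padding separates the fields.
struct Packed @packed
{
    char tag;
    uint length;
}

// @align raises the alignment of the whole type to at least 16.
struct Aligned @align(16)
{
    char tag;
}

fn void main()
{
    io::printfn("Plain.sizeof    = %d", Plain.sizeof);
    io::printfn("Packed.sizeof   = %d", Packed.sizeof);
    io::printfn("Aligned.alignof = %d", Aligned.alignof);

    // @noinit skips zero-initialization, so the variable is written before it is read.
    int scratch @noinit;
    scratch = 1;
    io::printfn("scratch = %d", scratch);
}
```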
Macros and compile-time¶
- @safemacro — overrides the AT_IDENT naming requirement for a macro that uses features (raw vaargs, expression parameters, trailing-block parameters) that would otherwise require it.
- @const — applied to a macro; asserts that the macro folds to a compile-time constant for every valid call. The compiler verifies the assertion and reports any non-constant construct that prevents folding.
- @if(condition) — conditional compilation. The declaration is compiled only when condition (a constant boolean expression) is true.
- @tag(name, value) — attaches a user-defined tag accessible through reflection (Ty::get_tag(name)).
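A common use of @if is to select between alternative definitions of the same function. The sketch below assumes the standard library constant env::WIN32; the function name `platform_name` is hypothetical. Exactly one of the two declarations is compiled for any given target.

```c3
module ifdemo;
import std::io;

// Compiled only when targeting Windows.
fn String platform_name() @if(env::WIN32)
{
    return "windows";
}

// Compiled on every other target.
fn String platform_name() @if(!env::WIN32)
{
    return "not windows";
}

fn void main()
{
    io::printn(platform_name());
}
```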
Operator overloading¶
@operator(op) declares a method as the implementation of operator op on the receiver type. The accepted operator forms are:
- Arithmetic: +, -, *, /, %, and their assignment forms +=, -=, *=, /=, %=.
- Bitwise: &, |, ^, and their assignment forms &=, |=, ^=.
- Shift: <<, >>, and their assignment forms <<=, >>=.
- Unary bitwise: ~.
- Comparison: == and <. The other relational and equality operators are derived from these and may not themselves be overloaded.
- Subscript: [] (read), []= (write), and &[] (reference). The reference form returns a pointer to the element.
- Length: len — overloads the value queried by Ty::len and by foreach length calculations.
Increment, decrement, the dot operator, and the comparisons !=, <=, >, >= are not directly overloadable. Increment and decrement are derived from += and -=; the missing comparisons are derived from == and <.
Two variants of @operator exist for binary operators where the operand order matters:
- @operator_r(op) — declares the right-hand-side variant: applies when the receiver appears on the right of the operator and the left-hand operand is of another type. Not valid for operators where this would be meaningless.
- @operator_s(op) — declares the symmetric variant: applies for either order of operands. Not valid for asymmetric operators (in particular, not for <).
The required signatures for each form are given in Properties of types and values.
Each overload form imposes signature requirements on the declaring method:
- [] — takes one parameter (the index, of integer type) and returns the element type.
- &[] — takes one parameter (the index, of integer type) and returns a pointer to the element type. When both [] and &[] are defined, the return of &[] must be a pointer to the return of [].
- []= — takes two parameters (the index, of integer type, and the new element value); the return type is void. The value parameter's type must match the return type of [].
- len — takes no parameters; the return type must be an integer type.
The arithmetic, bitwise, shift, comparison, and unary overload forms follow the natural signature of their operator: each binary overload takes one parameter (the right-hand operand, after the receiver), each unary overload takes none, and the return type is the type of the operator's result.
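The signature rules above can be sketched with two small hypothetical types: `Vec2` overloads the binary + operator, and `Triple` overloads the subscript and length forms. All type and method names here are illustrative, not part of the standard library.

```c3
module opdemo;
import std::io;

struct Vec2
{
    int x;
    int y;
}

// Binary overload: one parameter (the right-hand operand) after the receiver.
fn Vec2 Vec2.add(self, Vec2 other) @operator(+)
{
    return (Vec2) { .x = self.x + other.x, .y = self.y + other.y };
}

struct Triple
{
    int[3] items;
}

// [] takes the index and returns the element type.
fn int Triple.get(&self, usz index) @operator([])
{
    return self.items[index];
}

// &[] returns a pointer to the type returned by [].
fn int* Triple.get_ref(&self, usz index) @operator(&[])
{
    return &self.items[index];
}

// []= takes the index and the new value, and returns void.
fn void Triple.set(&self, usz index, int value) @operator([]=)
{
    self.items[index] = value;
}

// len takes no parameters and returns an integer type.
fn usz Triple.count(&self) @operator(len)
{
    return 3;
}

fn void main()
{
    Vec2 sum = (Vec2) { 1, 2 } + (Vec2) { 3, 4 };
    io::printfn("%d %d", sum.x, sum.y);

    Triple t;
    t[0] = 10;
    t[1] = 20;
    t[2] = 30;
    io::printfn("%d", t[1]);
}
```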
Switch lowering¶
- @jump — applied to a switch statement; requires that the switch be lowered to a jump table. The switch must satisfy the requirements for jump-table lowering described in Statements.
Diagnostics and optimization hints¶
- @deprecated / @deprecated(message) — emits a compile-time diagnostic at each use of the declaration. An optional string message is included in the diagnostic.
- @allow_deprecated — applied to a function; suppresses @deprecated diagnostics for declarations referenced inside that function.
- @unused / @used — suppress or force diagnostics about an unused or unreferenced declaration.
- @noalias — applied to a pointer parameter; declares that the parameter does not alias any other parameter or accessible memory for the duration of the call. The compiler may use this assumption for optimization; violating it is undefined behaviour.
- @nosanitize(check) — opts the function out of the named runtime sanitizer; check is a string such as "address", "memory", or "thread". The set of recognized checks is implementation-defined and may grow over time.
- @format(index) — marks a parameter (identified by its 1-based index) as a printf-style format string. The function must have an args... (typed variadic) parameter; the format string parameter must be of type String. Format mismatches are diagnosed by the compiler.
Testing and benchmarking¶
- @test — marks the declaration as a test function, run by the test harness. See Testing and benchmarking.
- @benchmark — marks the declaration as a benchmark function, run by the benchmark harness.
Platform-specific¶
- @wasm / @wasm(name) / @wasm(module, name) — applied to a function, acts as @export for the WebAssembly target. The one-argument form sets the exported name; the two-argument form additionally sets the WebAssembly module name (the import or export module).
- @winmain — designates a function as the Windows GUI entry point.
Interface methods¶
- @dynamic — marks a method as participating in dynamic dispatch through an interface. A type's @dynamic methods constitute its interface implementation as described in Types. @dynamic may not be applied to methods of any or to methods of an interface itself; only methods of concrete user-defined types may carry it.
- @optional — marks an interface method as not required for every implementor.
Type modifiers¶
A small number of @-prefixed forms appear in type positions rather than on declarations, and so are not strictly attributes in the grammatical sense; they are listed here for reference.
- @simd — applied to a vector type, requests SIMD alignment (a power-of-two alignment derived from the vector's total byte size) in every context, including struct fields and array elements. A @simd vector must have a power-of-two length. The contrasting plain vector has element-natural alignment when embedded in a struct or array. See Types.
Import attributes¶
A small subset of attributes appears on import declarations rather than on entity declarations.
- @public — re-export the imported module's private declarations into the importing context (see Modules).
- @norecurse — prevents the import from being recursive. By default, importing a module also imports all of its submodules; @norecurse limits the import to the named module only (see Modules).
Attribute definitions¶
An attribute definition introduces a user-defined attribute that expands to one or more built-in attributes. It is a top-level declaration:
attrdef_decl ::= "attrdef" AT_IDENT ("(" parameter_list? ")")? ("=" attribute_list)? ";"
attribute_list ::= attribute ("," attribute)*
An attribute defined by attrdef may have parameters; the parameters are substituted into the expansion when the attribute is applied at a use site.
attrdef @MyAttribute = @noreturn, @inline;
attrdef @MyCname(x) = @cname(x);
attrdef @TagFoo(value) = @tag("foo", value);
attrdef @MyAttributeEmpty;
A use of a user-defined attribute is equivalent to the textual substitution of its expansion at the use site. The two function declarations below are equivalent:
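A minimal sketch of such an equivalent pair follows. It uses a hypothetical attribute @CheckedInline that bundles @inline and @nodiscard, rather than the definitions listed earlier, so that both functions have ordinary bodies.

```c3
module attrdemo;

// Hypothetical user-defined attribute expanding to two built-in attributes.
attrdef @CheckedInline = @inline, @nodiscard;

// Declared with the user-defined attribute.
fn int twice_a(int x) @CheckedInline
{
    return x * 2;
}

// Declared with the expansion written out by hand.
fn int twice_b(int x) @inline @nodiscard
{
    return x * 2;
}
```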
A user-defined attribute with no expansion is permitted and is used purely for tagging — typically combined with @tag for reflection.
User-defined attributes may not be applied to themselves and may not be mutually recursive.
Contracts¶
A contract is a pre- or post-condition attached to a function or macro that a compiler may use for static analysis, for runtime checking, and for optimization. Contracts are written inside documentation comments delimited by <* ... *> (see Lexical elements).
Contract analysis is optional in the language. A conforming compiler may ignore contracts entirely; one may evaluate them statically and reject programs at compile time; or it may insert runtime checks. Regardless of whether the compiler verifies a contract, violating a contract is unspecified behaviour: the compiler is permitted to optimize as if every contract holds. Safe builds typically lower contract conditions to runtime assertions.
This permissive policy lets simple C3 compilers omit contract analysis entirely while still letting more sophisticated compilers exploit contracts for static checking and optimization. The language does not specify which interpretation a particular compiler must use, but the existence of a contract on a declaration is well-defined and observable through tooling.
Contract syntax¶
A doc comment preceding a function or macro declaration may contain contract clauses. Each clause begins with a contract keyword (an @-prefixed identifier) and may extend over one or more lines until the next clause keyword or the closing *>.
contract_block ::= "<*" contract_clause* "*>"
contract_clause ::= require_clause
                  | ensure_clause
                  | param_clause
                  | pure_clause
                  | return_clause
                  | deprecated_clause
require_clause ::= "@require" expression ("," expression)* (":" string_literal)?
ensure_clause ::= "@ensure" expression ("," expression)* (":" string_literal)?
param_clause ::= "@param" ("[" param_mode "]")? IDENTIFIER (":" string_literal)?
param_mode ::= "&"? ("in" | "out" | "inout")
pure_clause ::= "@pure"
return_clause ::= "@return?" return_fault ("," return_fault)* (":" string_literal)?
return_fault ::= path? CONST_IDENT
               | path? IDENTIFIER "!"
deprecated_clause ::= "@deprecated" (":" string_literal)?
Text within the doc comment that does not begin a contract clause is ordinary documentation, preserved for documentation tooling but not interpreted by the language.
Preconditions: @require¶
A @require clause introduces one or more boolean expressions evaluated at the start of each call. Each expression must evaluate to true; the optional trailing string is the message included in a contract-violation diagnostic.
<*
@require foo > 0, foo < 1000 : "foo out of range"
*>
fn int test_foo(int foo)
{
    return foo * 10;
}
Within a @require expression, the parameters of the function are in scope. The expression must be free of side effects.
Postconditions: @ensure¶
An @ensure clause introduces boolean expressions evaluated immediately before the function returns. Within an @ensure expression, the keyword return denotes the value being returned (where the return type is non-void); the parameters of the function are in scope and refer to their values on entry to the function.
<*
@require foo != null
@ensure return > foo.x
*>
fn uint check_foo(Foo* foo)
{
    return abs(foo.x) + 1;
}
The expression must be free of side effects.
Parameter annotations: @param¶
A @param clause annotates a single named parameter with access-mode constraints. The annotations apply primarily to pointer parameters and describe whether the function reads, writes, or both reads and writes through the pointer, and optionally whether the pointer is required to be non-null.
| Annotation | Read through pointer | Write through pointer | Non-null required |
|---|---|---|---|
| (none) | yes | yes | no |
| [in] | yes | no | no |
| [out] | no | yes | no |
| [inout] | yes | yes | no |
| [&in] | yes | no | yes |
| [&out] | no | yes | yes |
| [&inout] | yes | yes | yes |
When the & prefix is present, the parameter is required to be non-null on entry; a null argument is a contract violation. The clause may carry a trailing string used as the diagnostic message:
<*
@param [&in] data : "data must be a valid, non-null buffer"
*>
fn void process(char* data) { ... }
A conforming compiler may, but need not, statically verify that the function body respects the declared access mode. Violation is unspecified behaviour.
Purity: @pure¶
A @pure clause declares that the function neither reads from nor writes to global state. A pure function may call other pure functions but may not call functions known to be impure.
At a call site within a pure function, an otherwise-impure call may be marked @pure to assert that the call is, for the purposes of the surrounding contract, pure. The compiler may use the declared purity for optimization. As with @param, the compiler is not required to verify purity, and violations are unspecified behaviour.
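The call-site form can be sketched as follows. The helper `add_one` is hypothetical: it touches no global state but carries no @pure clause, so the caller asserts its purity at the call site. As the text notes, the compiler is not required to verify that assertion.

```c3
module puredemo;

// Hypothetical helper: pure in practice, but not declared @pure.
fn int add_one(int x)
{
    return x + 1;
}

<*
 @pure
*>
fn int next_square(int x)
{
    // The call-site @pure tells the compiler to treat this call as pure.
    int y = add_one(x) @pure;
    return y * y;
}
```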
Fault declarations: @return?¶
The @return? clause lists the fault values that the function may propagate through an optional return type. Each entry is one of:
- A fault constant (a CONST_IDENT, optionally module-qualified) — adds that specific fault to the set the function may return.
- A function or macro name followed by ! — inherits the @return? set of the named entity. Every fault that the referenced function or macro declares it may propagate is added to this function's set.
<*
@return? io::EOF, test! : "Returns EOF if it runs out of tokens"
*>
fn String parse(String input) { ... }
In the example, parse declares io::EOF directly and additionally inherits every fault from test's own @return? clause.
In the current implementation, the compiler statically checks that no fault is directly raised within the function — for example, by an expression of the form return io::EOF~; — unless that fault appears in the declared @return? set. This check is limited by what static analysis can determine: faults that arise from indirect calls or through dynamic dispatch are not generally tracked. A conforming compiler is not required to enforce @return? at compile time or at runtime; as with all contracts, violation is unspecified behaviour.
Deprecation: @deprecated¶
A @deprecated clause within a contract block is an alternative to the @deprecated attribute (see Attributes) and has the same effect: a compile-time diagnostic is emitted at each use of the declaration. The clause may carry a message as a trailing string:
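A sketch of the clause form, written to follow the deprecated_clause grammar given in Contract syntax. The function names and the message text are hypothetical.

```c3
module deprdemo;

<*
 Returns the area of a square. Kept for source compatibility.
 @deprecated : "use square_area() instead"
*>
fn int old_area(int side)
{
    return side * side;
}

// Hypothetical replacement that callers are directed to.
fn int square_area(int side)
{
    return side * side;
}
```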
The contract form is provided so that deprecation information may live alongside the declaration's other documentation. A declaration may carry either the contract clause or the attribute form, but not both.
Macros and contracts¶
Macros may carry the same contract clauses as functions. Because macros are expanded inline, contract conditions are particularly useful for constraining macro arguments in ways that cannot be expressed through the parameter list alone. The compile-time builtin $defined is the principal mechanism for testing argument well-formedness from within a contract:
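A minimal sketch of this pattern: a hypothetical macro `sum` whose untyped parameters are constrained by checking that the expression a + b is well-formed for the supplied arguments.

```c3
module macrodemo;
import std::io;

<*
 @require $defined(a + b) : "arguments must support the + operator"
*>
macro sum(a, b)
{
    return a + b;
}

fn void main()
{
    int total = sum(1, 2);
    double mixed = sum(1.5, 2.5);
    io::printfn("%d %f", total, mixed);
}
```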
The @require clause is evaluated at compile time when the macro is instantiated, allowing the constraint to be enforced before the macro body is processed.
Runtime evaluation¶
When the compiler chooses to lower a contract clause to a runtime check, it inserts the equivalent of an assert statement at the appropriate point: at the function's entry for @require, immediately before each return for @ensure, and at the call site for @pure-related and @param-related checks. The form of the resulting diagnostic is implementation-defined; the program traps on contract violation.
In a non-safe build, the compiler may elide all such runtime checks, leaving only the static-analysis effect of the contract.
Generics¶
C3 supports generic types, functions, and macros through parameterization by types and compile-time values. A parameterized declaration is instantiated at the point of use by supplying concrete arguments for the parameters; each unique parameterization produces a distinct entity that is compiled, type-checked, and reachable independently.
Forms of parameterization¶
A parameter list <param ("," param)*> introduces one or more parameters of these kinds:
- A type parameter (TYPE_IDENT) names a type that is supplied at instantiation. Within the parameterized scope, the parameter is usable wherever a type is valid.
- A value parameter (CONST_IDENT) names a compile-time-constant value supplied at instantiation. The value's type must be an integer, boolean, enum, or fault type. Within the parameterized scope, the parameter is usable wherever a compile-time constant of the appropriate type is valid.
A parameter list may appear in two positions:
- On a module section, immediately after the module path: module vector <Ty, Tu>;. This form is purely a shorthand: every declaration inside the section receives the same parameter list, exactly as if it had been written individually.
- On an individual declaration — a type, function, or macro — between the declared name and the rest of the form: struct Foo <Ty> { ... }, fn Ty add(Ty a, Ty b) <Ty> { ... }.
The two forms are equivalent. The following are interchangeable:
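As an illustration, the sketch below writes the same two declarations once with the parameter list on the module section and once with it on each declaration. The module names `boxes_a`, `boxes_b`, and `boxes_demo`, and the `Box` and `unwrap` declarations, are hypothetical; the syntax follows the forms described in this section.

```c3
// Form 1: the parameter list sits on the module section.
module boxes_a <Ty>;

struct Box
{
    Ty value;
}

fn Ty unwrap(Box b)
{
    return b.value;
}

// Form 2: the same declarations, each carrying its own parameter list.
module boxes_b;

struct Box <Ty>
{
    Ty value;
}

fn Ty unwrap(Box b) <Ty>
{
    return b.value;
}

// Both forms are instantiated the same way at the point of use.
module boxes_demo;
import boxes_a, boxes_b;

fn void main()
{
    boxes_a::Box{int} a = { .value = 4 };
    boxes_b::Box{int} b = { .value = 4 };
    assert(boxes_a::unwrap{int}(a) == boxes_b::unwrap{int}(b));
}
```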
Grouping of parameterized declarations¶
Within a single module, two parameterized declarations belong to the same generic unit when they share the number of parameters and the names of those parameters. Members of the same generic unit are instantiated together: a use of one member triggers instantiation of every other member of the unit.
module abc;
// Generic unit 1: parameterized by <Test>
fn Test test1(Test a) <Test> { return a + 1; }
struct Foo <Test> { Test a; }
fn Foo test2(Test b) <Test> { return (Foo) { .a = b }; }
// Generic unit 2: parameterized by <Test2> — a different parameter name
fn Test2 test3(Test2 a) <Test2> { return a * a; }
fn void main()
{
    Foo{int} a; // Instantiates Foo, test1, and test2 for <int>.
                // Does not instantiate test3, which is in a different unit.
}
This lock-step instantiation of a unit's members differs from C++, where each template is individually instantiated on demand.
Declarations in different modules do not group, even if their parameter names and counts agree.
Instantiation¶
A parameterized entity is instantiated by supplying type or value arguments inside curly braces:
generic_arguments ::= "{" type_or_value ("," type_or_value)* "}"
type_or_value ::= type | constant_expression
For a parameterized type the result is a type; for a parameterized function or macro the result is a callable entity:
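A short sketch of both cases, using a hypothetical parameterized struct `Pair` and function `larger`:

```c3
module instdemo;
import std::io;

struct Pair <Ty>
{
    Ty first;
    Ty second;
}

fn Ty larger(Ty a, Ty b) <Ty>
{
    return a > b ? a : b;
}

fn void main()
{
    // Pair{int} is a type.
    Pair{int} p = { .first = 1, .second = 2 };

    // larger{int} is a callable entity.
    int m = larger{int}(p.first, p.second);

    // A different argument tuple yields a separate instantiation.
    double d = larger{double}(1.5, 0.5);

    io::printfn("%d %f", m, d);
}
```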
Each distinct argument tuple yields a separate instantiation. Two instantiations with structurally equal argument tuples are the same entity.
The argument count must match the parameter count of the targeted unit; each type parameter must be supplied a type and each value parameter a compile-time constant of one of the permitted kinds.
Aliases¶
A non-parameterized alias may name a specific instantiation of a parameterized entity:
An alias itself may be parameterized using its own parameter list; the alias parameters are then in scope on the right-hand side and may be passed to instantiations there:
The <Ty> after the alias name declares the alias as generic. The {Ty} on the right-hand side instantiates the underlying parameterized entity. A form such as alias List {Ty} = ... is not permitted: the curly-brace form denotes instantiation of an existing parameterized entity, not the introduction of a new parameter.
A parameterized alias is rarely useful in practice, since a use of the alias's name with arguments resolves through to the underlying entity in the same way; the alias adds no abstraction over the original.
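The two alias forms can be sketched as follows, assuming a hypothetical parameterized struct `Stack`. The names `IntStack` and `Pile` are illustrative.

```c3
module aliasdemo;

struct Stack <Ty>
{
    Ty[16] items;
    usz count;
}

// Non-parameterized alias naming one specific instantiation.
alias IntStack = Stack{int};

// Parameterized alias: <Ty> declares the alias parameter,
// and {Ty} instantiates the underlying Stack.
alias Pile <Ty> = Stack{Ty};

fn void main()
{
    IntStack a;
    a.count = 1;

    // Pile{int} resolves through to Stack{int}, the same type as IntStack.
    Pile{int} b = a;
    assert(b.count == 1);
}
```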
Constraints on parameters¶
The accepted set of parameter kinds is fixed: type parameters and value parameters of integer, boolean, enum, or fault type. The language imposes no further constraint at the parameter list itself. Additional constraints are expressed through contracts (see Contracts) using compile-time predicates such as $defined, $Typeof, and the type-property accessors:
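A minimal sketch of such a constraint, using the sizeof type property on a hypothetical parameterized struct `Small`. The limit of 8 bytes is an arbitrary value chosen for illustration.

```c3
module constraintdemo;

<*
 @require Ty.sizeof <= 8 : "Ty must be at most 8 bytes"
*>
struct Small <Ty>
{
    Ty value;
}

fn void main()
{
    // int satisfies the contract, so this instantiation is accepted.
    Small{int} ok = { .value = 3 };
    assert(ok.value == 3);

    // An argument such as int[100] would violate the @require clause.
}
```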
A failed contract on a parameterized declaration produces a diagnostic at the point of instantiation; the diagnostic identifies the violated @require clause and the parameter values that caused the failure.
Contracts placed on the module section and on individual parameterized type declarations combine and are evaluated together for the generic unit. Contracts on generic functions and macros are checked only when those functions or macros are themselves invoked: a contract that constrains a function's type parameter does not propagate to instantiations of a sibling generic type in the same unit.
Methods on parameterized types¶
A method declared on a parameterized type is itself implicitly parameterized over the type's parameters; the parameters are in scope within the method's signature and body:
Foo.add is part of Foo's generic unit and is instantiated alongside Foo for each argument tuple.
A method may also be declared on a specific instantiation of a parameterized type. In that case the parameters are not in scope; the method applies only to that one instantiation:
This method is available on Foo{int} only.
Visibility, name resolution, and ordering¶
Visibility rules in Modules apply unchanged to parameterized declarations. The compiler instantiates a parameterized declaration only when it is referenced; errors that depend on the parameter values, including unresolved references inside a method body or a contract that fails to hold, are reported at the point of instantiation.
Identity and ABI¶
Two instantiations are the same entity if and only if their argument tuples are component-wise equal: types compared by type identity (see Properties of types and values), and values compared by constant-expression equality. Instantiations with different argument tuples are independent entities with independent symbol identities and may have different sizes, alignments, and ABIs.
A parameterized declaration is not itself a runtime value; only its instantiations are. A function pointer cannot bind to a parameterized function — it must bind to a specific instantiation.
Modules¶
A module is the unit of namespace, visibility, and compilation in C3. Every top-level declaration belongs to exactly one module. Modules may be spread across multiple files, may be nested hierarchically, and may import declarations from other modules.
Module names and hierarchy¶
A module name is a path of one or more lowercase identifiers separated by ::, for example foo::bar::baz.
Each component must consist of lowercase ASCII letters, digits, and underscores, and must be no longer than 31 characters. The full module name, including the :: separators between components, must not exceed 127 characters in total. A nested module such as foo::bar::baz is the submodule baz of foo::bar, which is in turn the submodule bar of foo.
A C3 source file in a normal build begins with a module declaration naming the module to which its top-level declarations belong:
A source file with no module declaration belongs entirely to an implicit module whose name is the file's stem in lowercase, with characters outside the identifier alphabet replaced by underscore. A file using the implicit module name must contain no module declaration.
Module sections¶
Each module declaration opens a module section. Multiple sections may appear in one file — for the same module or for different modules — and a single module may span multiple sections and multiple files. The full grammar and the section's attribute defaults (visibility, @if, generic parameters) are described in Blocks and scope.
A section's imports and any attribute defaults apply only within that section. A subsequent section, even of the same module in the same file, must re-declare any imports it needs.
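The sketch below places three sections in one file. The module names under `app` are hypothetical. Each section that prints imports std::io for itself, because an import made in an earlier section does not carry over.

```c3
module app::greeting;
import std::io;

fn void hello()
{
    io::printn("hello");
}

// A new section for a different module in the same file.
// The import above does not apply here, so std::io is imported again.
module app::farewell;
import std::io;

fn void bye()
{
    io::printn("bye");
}

// A third section. The explicit imports are harmless even where
// the modules would be imported implicitly.
module app;
import app::greeting, app::farewell;

fn void main()
{
    greeting::hello();
    farewell::bye();
}
```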
Visibility¶
Each declaration has one of three visibilities:
- public — visible everywhere the declaration's module is in scope through import. Public is the default for module-level declarations.
- private (@private) — visible only within the declaration's own module.
- local (@local) — visible only within the same source file.
A module section may declare a default visibility (@private or @local) that applies to every declaration within the section unless an individual declaration overrides it with an explicit @public.
A declaration's visibility may be overridden at an importer's request: writing import lib @public makes the private declarations of lib accessible in the importer's section. Declarations marked @local are never accessible across files and cannot be re-exported by import @public.
Linker visibility and exports¶
Visibility (@public, @private, @local) controls source-level access. Linker visibility — whether a symbol is exposed to other translation units and to external linkers — is a separate concern controlled by the attributes @export, @weak, @weaklink, and @cname, described in Attributes. By default, source-level public declarations have linker linkage suitable for use within the C3 program but are not exported as library symbols; @export opts a declaration into being a library export.
Imports¶
An import declaration brings the declarations of another module into the current section's name space.
import_declaration ::= "import" import_item ("," import_item)* ";"
import_item ::= module_path import_attr*
import_attr ::= "@public" | "@norecurse"
A bare import lib; is recursive: it imports lib and all of its submodules, so that names declared in lib, lib::sub, lib::sub::deep, and so on become available without further imports. To import only a specific module without its submodules, append @norecurse:
To gain access to the private declarations of an imported module, append @public:
@public may not be used to access @local declarations, which are never visible across files.
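The visibility levels and the two import attributes can be sketched together. All module and function names below are hypothetical.

```c3
module shapes;

// Public by default.
fn int area(int w, int h)
{
    return clamp_side(w) * clamp_side(h) * unit();
}

// Visible only inside module shapes, unless imported with @public.
fn int clamp_side(int v) @private
{
    return v < 0 ? 0 : v;
}

// Visible only in this source file; import @public cannot expose it.
fn int unit() @local
{
    return 1;
}

module shapes::extra;

fn int marker()
{
    return 7;
}

module client;
import shapes @norecurse;   // shapes only; shapes::extra is not imported

fn int client_area()
{
    return shapes::area(2, 3);
}

module inspector;
import shapes @public;      // recursive, with access to private declarations

fn int inspect()
{
    return shapes::clamp_side(-5) + extra::marker();
}
```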
It is a compile-time error if the compiler cannot locate an imported module, or any submodule reached through a recursive import. Inside a module section carrying @if, this check is suppressed unless the @if condition evaluates to true.
Implicit imports¶
Every section implicitly imports:
- The standard-library module std::core and its submodules. The names declared there are available without any explicit import.
- Every other module whose path shares the same top-level component as the current module. Within a module foo::abc, the modules foo, foo::cde, and so on are implicitly imported.
Implicit imports may be supplemented with explicit ones; an explicit import of an implicitly-imported module is permitted and harmless.
Name resolution¶
Inside a section, an unqualified name resolves through the following name spaces in order:
- The current section's declarations.
- Other sections of the same module.
- Declarations imported by the section's import declarations.
- Implicitly-imported modules.
When more than one entity matches an unqualified name, the reference is ambiguous and must be qualified. A name is qualified by prefixing it with a module path: foo::bar denotes the declaration bar in module foo. Only as many leading path components as necessary to disambiguate the name need be supplied.
Type names, ordinary identifiers, constant identifiers, and attribute identifiers each form a distinct name space (see Blocks and scope); a name in one space does not conflict with the same text in another. Qualification by module path applies uniformly to all of them.
Imported ordinary and constant identifiers must be qualified with at least the closest submodule path. Identifiers marked with @builtin are exempt from this rule. Type identifiers may be used unqualified when unambiguous.
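A small sketch of the three cases, assuming the standard library's io::printfn function and its max macro, which is assumed here to be marked @builtin:

```c3
module resolvedemo;
import std::io;

fn void main()
{
    // Type identifier: usable unqualified when unambiguous.
    String label = "largest";

    // @builtin identifier: exempt from the qualification rule.
    int m = max(1, 2);

    // Imported ordinary identifier: qualified with the closest submodule path.
    io::printfn("%s: %d", label, m);
}
```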
Module aliases¶
A module alias (declared by alias name = module path;, see Declarations) may stand in place of a module path in import declarations and in qualified-name expressions:
alias mc = module my::collection;
import mc; // equivalent to import my::collection
mc::Map m; // equivalent to my::collection::Map
Source-text inclusion¶
A module may incorporate text from outside its source files through three compile-time directives:
- $include("path") — splices the contents of the named file into the current source at the directive's position, as if its text had been written there. Valid only at the top level of a module section. Requires the build to be run at trust level include or higher.
- $exec("command", args?, stdin?) — runs the named external program and splices its standard output into the source. Requires trust level full.
- $embed("path") — embeds the named file's contents as a compile-time byte-array value, not as source text. Requires include trust level.
The trust level is set by the build invocation and defaults to forbidding all three directives. A failed trust check is a compile-time error.
Project structure and the compiler's view¶
The C3 compiler is invoked on a set of source files; the build system tells the compiler which files belong to which modules. There is no fixed mapping between file names and module names beyond the single-file fallback above. A module may be implemented in any number of files arranged in any directory structure permitted by the build system.
The visibility, import, and alias rules described above are evaluated independently of the on-disk layout: only the module declarations in the source files determine the module structure as seen by the compiler.
Optionals and faults¶
C3 represents recoverable error conditions through optional types — types of the form Ty? whose values are either a successful instance of Ty or a fault. A function that may fail returns an optional; callers use a small set of operators and conditional forms to handle the two cases. Optionals are first-class types in the language; they participate in expressions, propagate through arithmetic and calls, and integrate with the type system rather than being a library construct.
Faults¶
A fault is a value of the built-in type fault. Fault values are introduced into a module by faultdef declarations:
Each faultdef introduces one or more named fault values. Two fault values compare equal under == and != if and only if they are the same declared fault. Faults have no inherent ordering and no associated payload; a program that needs to attach data to a fault typically wraps the fault in a richer return type.
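A minimal sketch of a declaration and of fault comparison, using the hypothetical fault names NOT_FOUND and TIMEOUT:

```c3
module faultdemo;

// Introduces two named fault values.
faultdef NOT_FOUND, TIMEOUT;

fn void main()
{
    fault a = NOT_FOUND;
    fault b = TIMEOUT;

    // Faults compare equal only when they are the same declared fault.
    assert(a == NOT_FOUND);
    assert(a != b);
}
```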
The literal null denotes "no fault" — the absence-of-fault value carried by a successful optional. The literal may be used in fault comparisons (@catch(x) == null) and as a fault assignment.
Fault declarations follow the rules for ordinary top-level declarations and may carry attributes, visibility modifiers, and contracts (see Declarations).
Optional types¶
The type Ty? denotes an optional Ty: a value that is either a successful instance of Ty or a fault. An optional void? carries only the fault status. Types of the form Ty?? are not permitted: the underlying type of an optional may not itself be optional.
The optional type carries the same memory representation as Ty; the success/fault status is tracked alongside the value and does not change Ty's size or alignment. An optional is well-formed only when accompanied by a definite success/fault state; uninitialized optional variables follow the zero-initialization rules (the zero state is a successful zero value, not a fault, for most underlying types).
A value of type Ty is implicitly convertible to Ty? as a successful optional. A fault value is converted to an optional through the postfix ~ operator: excuse~ produces an optional whose underlying type is inferred from context, carrying excuse as its fault.
fn int? get_value(bool ok)
{
    if (ok) return 42;       // implicit success
    return IO_ERROR~;        // explicit fault
}
Propagation through expressions¶
An optional value in an expression makes the surrounding expression's type optional. Arithmetic, calls, and other operators applied to optional operands produce optional results that propagate the fault.
Evaluation of such an expression is conditional. Subexpressions are evaluated left to right; as soon as one evaluates to a faulty optional, the remaining subexpressions are not evaluated and the surrounding expression takes that fault as its value:
int? c = foo() + bar(); // if foo() is faulty, bar() is not called.
abc(foo(), bar()); // if foo() is faulty, bar() is not called and abc is not invoked.
Function parameters may not themselves be declared with an optional type. A faulty optional supplied as an argument always triggers the conditional-call propagation above.
Function return types¶
A function may declare an optional return type:
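As a minimal sketch, assuming a hypothetical fault DIVISION_BY_ZERO and a hypothetical function divide, and using the postfix ~ notation described above:

```c3
module demo;

faultdef DIVISION_BY_ZERO; // hypothetical fault, for illustration only

// Returns the quotient on success, or a fault when b is zero.
fn int? divide(int a, int b)
{
    if (b == 0) return DIVISION_BY_ZERO~; // explicit fault
    return a / b;                          // implicit success
}
```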
A call to such a function produces an optional value that must be consumed: it may not be assigned to a non-optional variable, passed as a non-optional argument, or returned from a non-optional function without going through one of the handling forms below.
A function with return type void? returns no value but may return a fault. When a call to such a function appears as a statement in a non-optional context, it must be consumed by one of the handling forms: close_file()!;, close_file()!!;, or a catch form.
Handling optionals¶
Rethrow: postfix !¶
The postfix ! operator unwraps an optional. If the optional is successful, the result is the underlying value of type Ty. If the optional is faulty, the enclosing function returns immediately, propagating the fault to its own caller. The enclosing function must therefore itself have an optional return type compatible with the fault being propagated.
The expression e! is equivalent to:
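The equivalence can be sketched by writing the same function twice, once with ! and once with a hand-written catch. The fault NOT_READY and all function names are assumptions made for this example.

```c3
module demo;

faultdef NOT_READY; // hypothetical fault

fn int? source(bool ok)
{
    if (ok) return 7;
    return NOT_READY~;
}

// Uses the rethrow operator.
fn int? with_rethrow(bool ok)
{
    int value = source(ok)!;
    return value + 1;
}

// Hand-written expansion of the same logic.
fn int? expanded(bool ok)
{
    int? tmp = source(ok);
    if (catch excuse = tmp) return excuse~;
    return tmp + 1; // tmp is implicitly unwrapped to int after the catch
}
```

Both functions return 8 when source succeeds and propagate NOT_READY otherwise.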
Force unwrap: postfix !!¶
The postfix !! operator unwraps an optional, trapping (terminating the program with a diagnostic) if the optional is faulty. !! is intended for cases where the programmer asserts that the optional cannot be faulty at that point; it is not a substitute for proper handling.
Optional-else: ??¶
The binary ?? operator yields the underlying value of a successful optional, or evaluates and returns a default expression if the optional is faulty:
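A short sketch, with a hypothetical fault MISSING_VALUE and hypothetical functions lookup and lookup_or_default:

```c3
module demo;

faultdef MISSING_VALUE; // hypothetical fault

fn int? lookup(bool present)
{
    if (present) return 42;
    return MISSING_VALUE~;
}

fn int lookup_or_default(bool present)
{
    // The default -1 is evaluated only when lookup() is faulty.
    return lookup(present) ?? -1;
}
```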
The right operand is evaluated only when the optional is faulty. The result type is the common type of the left operand's underlying type and the right operand. The right operand may itself be an optional, in which case ?? may be chained.
try and catch in conditions¶
The conditional forms in Statements permit try and catch unwraps. try extracts the underlying value of a successful optional; catch extracts the fault of a faulty optional:
if (try x = read_byte())
{
// x : int, the optional was successful
}
if (catch excuse = read_byte())
{
// excuse : fault, the optional was faulty
}
A try condition may chain multiple unwraps with &&:
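The following sketch chains two unwraps; the fault UNAVAILABLE and the functions read_value and sum_or_zero are hypothetical names used only for illustration.

```c3
module demo;

faultdef UNAVAILABLE; // hypothetical fault

fn int? read_value(bool ok, int value)
{
    if (ok) return value;
    return UNAVAILABLE~;
}

fn int sum_or_zero(bool first_ok, bool second_ok)
{
    if (try a = read_value(first_ok, 1) && try b = read_value(second_ok, 2))
    {
        // Both unwraps succeeded, so a and b have type int here.
        return a + b;
    }
    return 0;
}
```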
A catch condition may bind several optionals; the binding refers to the first one that is in the fault state.
Implicit unwrapping¶
After a catch branch that exits the surrounding scope through return, break, continue, or rethrow, the compiler statically determines that the original optional variable cannot be in the fault state in the code that follows. The variable is then implicitly unwrapped: within the unwrapped scope, references to it have the underlying non-optional type.
int? foo = unreliable_function();
if (catch excuse = foo)
{
return excuse~;
}
// foo is implicitly unwrapped to int from here to the end of the scope.
io::printfn("foo = %s", foo);
When the condition of an if is a catch unwrap binding one or more optionals, the else branch implicitly unwraps each bound optional. Reaching the else branch means that the catch did not match, so each operand is known to be in the success state in that branch:
int? a = foo();
int? b = bar();
if (catch excuse = a, b)
{
return excuse~;
}
else
{
int x = a + b; // a and b are implicitly unwrapped
}
The rethrow operator unwraps only the value of the expression it is applied to: a statement of the form int x = foo!; rethrows the fault if foo is faulty and otherwise binds the unwrapped value to x. It does not implicitly unwrap foo itself; subsequent references to foo in the same scope continue to be optional.
void? and statement-level optionals¶
A void? value carries only the fault status. A void? may not be stored in a variable: there is no value content to retain, and the fault status alone is too ephemeral to be a useful target for assignment. The compiler rejects declarations such as void? x = close_file();.
void? calls are consumed at the statement level by one of the handling forms:
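The three forms might look as follows. This is a sketch only: close_file here is a hypothetical function taking a flag so that both outcomes can be shown, and CLOSE_FAILED is a hypothetical fault.

```c3
module demo;

faultdef CLOSE_FAILED; // hypothetical fault

fn void? close_file(bool ok)
{
    if (!ok) return CLOSE_FAILED~;
}

// Rethrow: the fault, if any, is passed on to the caller.
fn void? shutdown(bool ok)
{
    close_file(ok)!;
}

// Force unwrap: traps if the call is faulty.
fn void must_shutdown()
{
    close_file(true)!!;
}

// Catch form: inspects the fault locally.
fn bool close_failed(bool ok)
{
    if (catch excuse = close_file(ok))
    {
        return excuse == CLOSE_FAILED;
    }
    return false;
}
```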
Faults and function-pointer types¶
A function-pointer type whose return type is optional matches another such type if and only if the underlying return types match. The set of faults a function may return is not part of the function-pointer type. Consequently, the set of possible faults at a call site is determined statically only for direct calls; for calls through function pointers or interface methods, the program must handle any fault propagated through the optional return.
@return? and contract checking¶
The @return? contract clause (see Contracts) documents the faults a function may return. A conforming compiler is not required to enforce @return? either at compile time or at runtime; current implementations check only direct fault raising visible through static analysis (return excuse~; and the ! operator on calls with known fault sets).
Built-in functions and intrinsics¶
The language reserves the namespace of identifiers beginning with $$ for compiler use. Within that namespace, two categories are distinguished:
- Built-in compile-time constants, listed below, are required of every conforming compiler. Each yields a value, of the specified type, at the point where the identifier appears.
- Intrinsic functions (other $$-prefixed callables) are not specified by this document. A conforming compiler may provide any number of them, none of them, or different ones across versions. User code reaches intrinsic functionality through the standard library, which exposes stable wrappers (such as mem::copy) over whatever intrinsics a given compiler supplies.
Built-in compile-time constants¶
A conforming compiler provides the following constants. Source-location values describe the textual location of the reference; inside an expanded macro body, all such values except $$LINE_RAW describe the location of the call site rather than the macro source.
- $$FILE: the basename of the current source file, as a string.
- $$FILEPATH: the full path of the current source file, as a string.
- $$LINE: the line number of the reference, as an integer constant.
- $$LINE_RAW: the line number of the reference before macro expansion, as an integer constant.
- $$FUNC: the unqualified name of the enclosing function, as a string.
- $$FUNCTION: a reflective reference (in the sense of Reflection) to the enclosing function.
- $$MODULE: the qualified name of the enclosing module, as a string.
- $$DATE: the date of compilation, as a string of the form Mon Jan 1 2025.
- $$TIME: the time of compilation, as a string of the form 12:34:56.
- $$BENCHMARK_NAMES: in a benchmark build, an array of strings giving the names of the benchmark functions; otherwise an empty array.
- $$BENCHMARK_FNS: in a benchmark build, an array of function pointers to the benchmark functions, in the same order as $$BENCHMARK_NAMES; otherwise an empty array.
- $$TEST_NAMES: in a test build, an array of strings giving the names of the test functions; otherwise an empty array.
- $$TEST_FNS: in a test build, an array of function pointers to the test functions, in the same order as $$TEST_NAMES; otherwise an empty array.
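A brief sketch of the source-location constants in use. The module name demo and the function where_am_i are hypothetical; the printed file name and line number depend on where the code is saved.

```c3
module demo;
import std::io;

fn String where_am_i()
{
    return $$FUNC; // the unqualified name of the enclosing function
}

fn void main()
{
    // Prints the file name, the line of this reference, a function name, and the module name.
    io::printfn("%s:%d (%s, module %s)", $$FILE, $$LINE, where_am_i(), $$MODULE);
}
```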
Intrinsic functions¶
Intrinsic functions in the $$ namespace — for example $$trap, $$memcpy, $$sqrt — are implementation details shared between a compiler and its standard library. The C3 language does not specify which intrinsic functions a compiler provides, nor their signatures or semantics; a conforming compiler may provide a different set, or none at all, while still implementing the language and its standard library faithfully.
User code accesses the underlying functionality through the stable, module-qualified names provided by the standard library, rather than by referencing intrinsic functions directly.
Inline assembly¶
C3 provides two forms of inline assembly: an assembly string, in which a compile-time string is passed verbatim to the backend, and an assembly block, in which a small structured grammar lets the compiler infer register clobbers and operand directions across a sequence of instructions.
asm_statement ::= asm_string_form | asm_block_form
asm_string_form ::= "asm" "(" constant_expression ")" attributes? ";"
asm_block_form ::= "asm" attributes? "{" asm_instruction* "}"
The current grammar covers a subset of x86, aarch64, and riscv; other targets may have no inline-assembly support. The instruction set accepted in the structured form is a work in progress and may be extended in later language versions.
Assembly strings¶
The string form takes a single compile-time string and passes it without further processing to the backend assembler. The string is responsible for any assembler directives, syntax, and operand referencing required by that backend; the compiler performs no operand substitution.
The string form is appropriate for self-contained, parameter-free fragments such as fence or no-op instructions, or for forwarding to backend-specific facilities not yet covered by the structured form.
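A minimal sketch of the string form, assuming a target whose assembler accepts the nop mnemonic; the function name with_nop is hypothetical.

```c3
module demo;

fn int with_nop(int x)
{
    asm("nop"); // passed verbatim to the backend assembler; no operands are substituted
    return x;
}
```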
Assembly blocks¶
The structured form accepts a sequence of instructions in a common grammar that abstracts over the underlying processor. Each instruction consists of an instruction mnemonic followed by zero or more comma-separated arguments and a terminating semicolon:
asm_instruction ::= IDENTIFIER (asm_arg ("," asm_arg)*)? ";"
asm_arg ::= IDENTIFIER
| CONST_IDENT
| INTEGER_LITERAL
| "$" IDENTIFIER
| "&" IDENTIFIER
| "[" asm_address "]"
| "(" expression ")"
asm_address ::= asm_arg ( "+" asm_arg ( "*" CONST_INTEGER )? )* ( "+" CONST_INTEGER )?
The six argument forms are:
- An identifier or constant identifier (FOO, x): a C3-level name in scope at the assembly block. The compiler resolves the name and chooses a suitable operand encoding.
- An integer literal (1, 0xFF): an immediate operand.
- A register reference $name (always lowercase, e.g. $eax, $r7): a target-specific machine register.
- An address-of expression &name: the address of the named C3 variable.
- An indirect address [addr], optionally with an index and offset [addr + index * const + offset]: a memory operand.
- A parenthesized expression (expr): any C3 expression, evaluated before the assembly block runs; its value is used as the operand.
A complete example:
int aa = 3;
int g;
int x;                        // 32-bit operand used by the movl instructions below
long z;                       // 64-bit operand used by the final movq
int* gp = &g;
int* xa = &aa;
sz asf = 1;
asm
{
movl x, 4; // Move 4 into the variable x
movl [gp], x; // Move the value of x into the address in gp
movl x, 1; // Move 1 into x
movl [xa + asf * 4 + 4], x; // Move x into the address at xa[asf + 1]
movl $eax, (23 + x); // Move 23 + x into EAX
movl x, $eax; // Move EAX into x
movq [&z], 33; // Move 33 into the memory address of z
}
Inside an asm block the compiler infers register clobbers and per-operand input/output direction from the instructions used. Operand types are checked against the requirements of each instruction.
Interaction with surrounding code¶
An asm block is a statement (see Statements). Variables in scope at the block may be referenced as operands by name; doing so is equivalent to passing them through compiler-chosen operands of the appropriate kind. Register references $name bypass C3-level analysis and refer directly to the named hardware register.
The @naked function attribute (see Attributes) suppresses the compiler-generated prologue and epilogue around a function body; combined with asm blocks, it permits writing functions whose entire body is assembly.
C interoperability¶
C3 follows the platform C ABI. A C3 function may call a C function without intermediate stubs, and a C function may call a C3 function whose external symbol is known. This chapter describes the language-level facilities for declaring foreign symbols, naming exported symbols, and the points where C3 and C semantics differ.
Declaring foreign functions¶
A function declaration introduced by the extern keyword has no body and refers to a function defined in another translation unit, typically a C library:
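A minimal sketch, assuming the C library's puts is available at link time; the parameter name s and the greeting text are illustrative.

```c3
module demo;

// Refers to the C library's puts; no body is given.
extern fn CInt puts(char* s);

fn void main()
{
    puts("Hello from C3"); // the return value is ignored here
}
```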
The function signature is given in C3 syntax; the compiler treats the call as it would any other function call but uses the platform C ABI for argument passing, return value, and stack handling. The name puts here is both the C3-side identifier and the external symbol name; both are looked up at link time.
The extern keyword may be applied to function and global variable declarations. The form does not introduce a new linkage attribute; it changes the declaration from "definition" to "external reference".
Renaming foreign symbols¶
When the C3-side name should differ from the external symbol name, the @cname attribute selects the external name explicitly:
extern fn void foo_puts(char*) @cname("puts");
fn void main()
{
foo_puts("Hello, world!"); // Calls C "puts"
}
@cname may be applied to extern declarations, to exported C3-side definitions, or to any declaration whose external linker name should differ from the source-level identifier. The argument must be a compile-time string and must form a valid identifier for the target's symbol table.
Exporting C3 functions to C¶
A C3 function with linker-visible linkage may be called from C using its external symbol. To make a C3 function callable under a stable C symbol name, attach @export (with or without an explicit external name) and ensure the function's signature uses types whose layout matches a C declaration on the other side:
module foo;
fn int square(int x) @export
{
return x * x;
}
fn int square2(int x) @export("square")
{
return x * x;
}
A bare @export exports the function under a symbol derived from the module-qualified name (e.g. foo__square for square in module foo). An @export("name") exports the function under the named symbol exactly. The C side may then declare and call the function in the usual way, for example through a prototype such as int square(int); for the function exported above as square.
Because C3 namespaces symbols by module under bare @export, an exported function whose C-side name must be unprefixed should use the explicit @export("name") form.
Type correspondence¶
The following table summarises the correspondence between C and C3 types under the platform C ABI. Unless noted otherwise, the types are identical in size, alignment, and ABI representation.
| C type | C3 type |
|---|---|
| signed char | ichar |
| unsigned char | char |
| short | short |
| unsigned short | ushort |
| int | CInt (platform-defined alias) |
| unsigned int | CUInt |
| long | CLong |
| unsigned long | CULong |
| long long | CLongLong |
| unsigned long long | CULongLong |
| float | float |
| double | double |
| void* | void* |
| T* | T* |
| T[N] (as argument) | T[N]* (see Arrays below) |
C3's fixed-width integer types (int, long, etc.) have a size determined by the language rather than the platform. The C-prefixed aliases (CInt, CLong, ...) name the platform's C integer types and should be used whenever interoperating with a C declaration.
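As a sketch of the aliases in use, the declaration below binds the C library function abs, whose C prototype is int abs(int). The C3-side names c_abs and distance are hypothetical.

```c3
module demo;

// C: int abs(int);
// CInt is used so that the parameter and return type match C's int on every platform.
extern fn CInt c_abs(CInt value) @cname("abs");

fn CInt distance(CInt a, CInt b)
{
    return c_abs(a - b);
}
```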
Arrays in C signatures¶
In C, an array parameter decays to a pointer. C3 has no such decay rule: a fixed array in a C3 signature represents the array as a value (typically passed as if by struct).
To call a C function whose declaration uses an array parameter, the C3-side declaration should use a pointer type:
// C: void test(int a[]);
extern fn void test(int* a);
// C: void test2(int b[4]);
extern fn void test2(int[4]* b);
A pointer to a fixed array (int[4]*) is implicitly convertible to a pointer to the array's first element (int*), so the C3-side declaration may also use the latter form when the function takes a pointer-to-element rather than pointer-to-array.
Other differences from C¶
- Bitstructs. A bitstruct (see Types) appears to C code as its backing integer type. C bit-fields cannot be expressed directly in a C3 declaration; an equivalent C3 bitstruct must be constructed manually with the correct layout for the target.
- Enum size. C compilers assume that an enum has the size of int (CInt). When passing enums across the boundary, ensure the C3 enum's backing type is CInt.
- Atomic types. C's _Atomic qualifier has no direct C3 counterpart; C3 provides generic atomic types in the standard library that are not ABI-compatible with C atomics.
- const and volatile qualifiers. C3 has no type qualifiers const or volatile. The C const qualifier on parameters is informational and does not affect ABI; the C volatile qualifier has no C3 equivalent at the type level, but @volatile_load and @volatile_store (see the standard library) provide the same access semantics.
- Pass-by-value arrays. A C3 function that passes a fixed array by value is ABI-equivalent to a C function that passes a struct containing the array, not to a function with an array-typed parameter.
External symbol resolution¶
An extern declaration must be matched by a definition in another translation unit at link time. A program that fails to resolve an extern symbol is ill-formed. The mechanism by which external libraries are made available to the linker — search paths, library names, link order — is part of the build system and the system linker, not of the language.
Program initialization and execution¶
A C3 program is a collection of modules linked together with a single distinguished entry point. This chapter describes how a program is started, how its global state is initialized, how user-supplied initializer and finalizer functions interact with the entry point, and how the program terminates.
Entry point¶
A program declares its entry point with a function named main at module scope. The compiler accepts several forms:
fn void main() { ... }
fn int main() { ... }
fn void main(String[] args) { ... }
fn int main(String[] args) { ... }
fn void main(int argc, char** argv) { ... }
fn int main(int argc, char** argv) { ... }
A void-returning main exits with status 0 on normal return; an int-returning main exits with the returned value.
The forms taking String[] args provide command-line arguments as a slice of strings; the C-style form taking int argc, char** argv mirrors the C entry-point signature and is intended for direct interoperation with the platform's startup conventions. The argument slice's lifetime extends for the entire run of the program.
A program has exactly one main. On a target whose system requires a different entry-point shape (for example, Windows WinMain), the @winmain attribute (see Attributes) selects that platform-specific shape; on other targets the attribute has no effect.
Global initialization¶
Global variables and constants are initialized before main is entered. Initialization proceeds in three stages, in order:
- Static constant initialization. Every named constant whose initializer is a constant expression takes its value as part of the program image. Such constants are available from the moment execution begins.
- Static variable initialization. Each global variable receives its initializer's value. Initializers are evaluated in an order consistent with their dependencies: if one global's initializer reads another, the dependee is initialized first. Globals with no dependency relationship may be initialized in any order. A global with no explicit initializer is zero-initialized unless it carries @noinit (see Attributes), in which case its initial contents are indeterminate.
- Initializer functions. After all globals have their initial values, every function declared with the @init attribute is invoked. @init accepts an optional priority argument; functions with a lower priority value run before those with a higher one. The relative order of @init functions sharing the same priority is unspecified.
main is entered after stage 3 completes. Any global that is read before its stage-2 initialization runs holds its zero value (or, with @noinit, an indeterminate value).
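The staging can be sketched with two initializer functions that record the order in which they ran. The priority values 1000 and 2000 are arbitrary, and the names are hypothetical.

```c3
module demo;

int startup_trace; // no initializer: zero-initialized in stage 2

fn void early_setup() @init(1000)
{
    startup_trace = startup_trace * 10 + 1;
}

fn void late_setup() @init(2000)
{
    startup_trace = startup_trace * 10 + 2;
}

fn int main()
{
    // Under the ordering described above (lower priority first),
    // startup_trace is 12 by the time main is entered.
    return startup_trace == 12 ? 0 : 1;
}
```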
Globals with thread-local storage (tlocal) follow the same staged process for each thread independently: stage 2 and stage 3 are performed for each newly-created thread before that thread executes any user code.
Program termination¶
A program terminates in one of the following ways:
- main returns. The return value of an int-returning main is the program's exit status; a void-returning main yields exit status 0.
- A call to a function declared @noreturn exits the program (typically through a runtime trap, a system-call wrapper, or an explicit exit function).
- A trap from a contract violation, sanitizer check, or $$trap-derived runtime check terminates the program with an implementation-defined exit status.
Before the process exits through main return, every function declared with the @finalizer attribute is invoked. @finalizer accepts an optional priority argument with the same convention as @init: lower priority values run earlier. The relative order of finalizers sharing the same priority is unspecified. Every finalizer is guaranteed to run if termination occurs through a normal return from main; finalizers are not guaranteed to run when the program exits through a trap or @noreturn call.
Static and thread-local storage lifetime¶
A static local (static storage in a function) is initialized on first entry to its declaring function. Subsequent entries see the value left by the previous execution. The variable's storage persists for the entire program lifetime.
A thread-local variable (tlocal storage) has independent storage per thread. Initialization follows the staged global model above, re-run for each thread the program creates. The storage is released when the thread terminates.
Module-section evaluation order¶
Within a translation unit, the order of declarations does not affect initialization order: dependencies between global initializers are resolved by analysis, not by source order. Two globals with mutually-dependent initializers are a compile-time error.
Across translation units and modules, the same dependency-based ordering applies. Programs that rely on a specific initialization order between independent globals are not portable.
Testing and benchmarking¶
C3 supports test and benchmark functions as a built-in feature of the language. The attributes @test and @benchmark mark functions for execution by the test and benchmark drivers; the compiler reflects the set of marked functions through compile-time constants that the standard-library driver uses to discover and invoke them.
Test functions¶
A function declared with the @test attribute is a test function. The function's signature must be:
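A minimal sketch of a conforming declaration; the module and function names are hypothetical.

```c3
module demo;

// Takes no arguments and returns void; a failed assert counts as a test failure.
fn void addition_works() @test
{
    assert(1 + 1 == 2);
}
```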
A test function takes no arguments and returns void. A clean return counts as a passed test; a runtime trap encountered during execution — from a contract violation, a failed assert, an unhandled fault, or any other source — counts as a failure. The standard-library test driver records the test's outcome and reports it.
A test function is compiled into the program only when the compiler is invoked in a test build (the precise invocation depends on the build system). In a non-test build, the function and any other declarations whose only references are from test code are omitted entirely from the compiled program.
Benchmark functions¶
A function declared with the @benchmark attribute is a benchmark function. Its signature requirements parallel those of test functions:
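A minimal sketch, assuming the same no-argument, void-returning shape as a test function; the names are hypothetical and the loop is only a stand-in for real work.

```c3
module demo;

fn void sum_to_thousand() @benchmark
{
    int total = 0;
    for (int i = 1; i <= 1000; i++) total += i;
    assert(total == 500500); // 1 + 2 + ... + 1000
}
```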
Benchmark functions are compiled into the program only in a benchmark build. The standard-library benchmark driver determines how each benchmark is timed and reported.
Section-level application¶
@test and @benchmark may also be applied to an entire module section (see Blocks and scope). When a section carries one of these attributes, every function declared within the section is treated as if it carried the same attribute individually. This is the usual way of grouping a body of tests or benchmarks without repeating the attribute on each declaration:
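A sketch of a section carrying @test, with hypothetical module and function names:

```c3
module demo_tests @test;

// Neither function repeats the attribute; both inherit it from the module section.
fn void zero_is_even()
{
    assert(0 % 2 == 0);
}

fn void one_is_odd()
{
    assert(1 % 2 == 1);
}
```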
Both functions are test functions, and both are subject to the linkage filtering described above.
Discovery through compile-time constants¶
The compiler exposes the set of marked functions through the compile-time array constants from Built-in functions and intrinsics:
- $$TEST_NAMES: an array of strings naming each @test function. Available only in test builds; in other builds the array is empty.
- $$TEST_FNS: an array of function pointers to the @test functions, in the same order as $$TEST_NAMES.
- $$BENCHMARK_NAMES: the analogous array of @benchmark function names. Available only in benchmark builds.
- $$BENCHMARK_FNS: the array of @benchmark function pointers.
The standard library uses these arrays to implement the test and benchmark drivers; user code generally need not refer to them directly.
Effect on compilation¶
@test and @benchmark are linkage filters in addition to being run-time markers. Outside the corresponding build mode:
- The marked function itself is removed from compilation.
- Code that is reachable only from marked functions is likewise removed.
- References to marked functions from non-marked code are a compile-time error.
This rule lets a project keep its tests and benchmarks alongside the production code without paying any compilation cost or binary-size cost in non-test, non-benchmark builds.
A function may carry both @test and @benchmark only when no build mode includes both; in practice the two are mutually exclusive in standard usage.
Visibility and module scope¶
@test and @benchmark functions follow ordinary visibility rules: they may be @public, @private, or @local, and may be declared inside any module. Tests defined in a @private or @local scope are still discoverable by the test driver, since discovery is by symbol-table inspection at compile time rather than by source-level reference. Visibility affects what the test body itself may access, not whether the test is enumerated.
Run-time behaviour¶
This chapter consolidates the run-time behaviour of C3 programs: what is well-defined, what is implementation-defined, what is unspecified, and what is undefined. It also describes the two principal build modes (safe and fast), the traps used to enforce contracts and bounds, and the optional sanitizer machinery.
Categories of behaviour¶
C3 distinguishes four categories of run-time behaviour. Throughout this specification:
- Well-defined behaviour — the result is fully determined by the language; every conforming compiler produces the same observable outcome.
- Implementation-defined behaviour — the result is determined by the compiler in a way that the compiler's documentation must describe. Different compilers may differ; a given compiler is consistent.
- Unspecified behaviour — the result is one of a set of permitted outcomes, chosen by the compiler without obligation to document the choice. A program must not depend on which outcome occurs.
- Undefined behaviour (UB) — the language treats the operation as a precondition that the program promises not to violate, in the same way as an unchecked contract: the compiler is permitted to assume the operation does not occur and to optimize the surrounding code on that basis (including deleting branches that lead to it as unreachable). If the operation nevertheless is reached and the compiler has not optimized it away, the run-time result is itself unspecified — the program may trap, may produce arbitrary values, may resume execution at an unrelated location, may corrupt memory, or may appear to behave correctly.
The remainder of this chapter classifies specific operations.
Build modes¶
A C3 implementation provides at minimum two build modes, controlled by the build invocation:
- Safe — runtime checks are inserted for the categories listed under Traps below. A failed check terminates the program with a diagnostic. Contracts (see Contracts) are typically lowered to runtime asserts. In safe mode, the operations that would otherwise be undefined behaviour become well-defined traps.
- Fast — runtime checks are elided. Operations that would have trapped in safe mode are undefined behaviour in fast mode; the compiler may assume they do not occur and may optimize accordingly.
A program that runs cleanly in safe mode is not guaranteed to behave the same in fast mode if it relies on an operation that trapped in safe mode. Programs intended to be portable across build modes must avoid the undefined-behaviour categories below.
The standard library may expose additional intermediate modes; the behavioural categories above are the language-level minimum.
Operations that are well-defined¶
The following operations have fully defined run-time behaviour in every build mode:
- Signed integer overflow. Arithmetic on signed integers wraps modulo 2ⁿ, where n is the operand width. This contrasts with C, where signed overflow is UB.
- Unsigned integer overflow. Arithmetic on unsigned integers wraps in the natural way (no value is invalid for an unsigned type).
- Order of evaluation. Sub-expressions evaluate strictly left-to-right with all side effects completed before the next operand is evaluated (see Expressions). C-style "unsequenced side effect" UB does not arise.
- Default initialization. A variable without an explicit initializer is zero-initialized unless it carries @noinit. A zero-initialized object has the bit pattern all-zero in every byte.
- Pointer comparison. Two pointers may always be compared for equality with == and !=. Two pointers to the same allocation may be compared for ordering with <, <=, >, >=.
- Optional propagation. A faulty optional propagates through any expression that does not handle it, in left-to-right order (see Optionals and faults). When multiple subexpressions could supply a fault, the propagation point is determined by the order in which the language requires the subexpressions to be evaluated.
- Order of struct fields. Fields of a struct are laid out in declaration order. The offset of each field is at least the offset of the preceding field plus the preceding field's size (after any padding required by alignment).
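The wrapping rules for integer overflow can be sketched as follows; the function names are hypothetical, and the values are computed at run time rather than in constant expressions.

```c3
module demo;

fn int next_signed(int value)
{
    return value + 1; // wraps modulo 2^32 for a 32-bit int
}

fn uint previous_unsigned(uint value)
{
    return value - 1; // wraps modulo 2^32 for a 32-bit uint
}
```

Under the rules above, next_signed(int.max) yields int.min and previous_unsigned(0) yields uint.max.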
Operations that trap in safe mode and are undefined in fast mode¶
The following operations are checked in safe mode (they trap the program with a diagnostic and terminate execution) and are undefined in fast mode:
- Array and slice indexing out of bounds. An index i for an array or slice of length n is in bounds when 0 <= i < n. An out-of-bounds index traps in safe mode and is UB in fast mode.
- Pointer dereference of a null pointer. *p or p[i] when p == null traps in safe mode and is UB in fast mode.
- Integer division and remainder by zero. a / 0 and a % 0 on integer operands trap in safe mode and are UB in fast mode.
- Contract violations. A failed @require, @ensure, or other contract clause traps in safe mode and is UB in fast mode (see Contracts).
- assert failure. A failed assert statement traps in safe mode. In fast mode, the compiler may treat the asserted condition as a hint and optimize accordingly; reaching a program point where the asserted condition is false is UB.
Operations that are always undefined¶
The following operations are undefined behaviour in every build mode; no implementation is required to detect them. A program containing them is ill-formed in the sense that the language imposes no constraint on its execution.
- Dereferencing a dangling pointer (a pointer whose target has been deallocated or whose underlying object's lifetime has ended).
- Concurrent unsynchronized access to the same memory location from multiple threads where at least one access is a write (a data race; see Concurrency below).
- Reaching $$unreachable() or any other declared-unreachable program point.
- Returning from a @noreturn function.
- Violating the @noalias contract on a parameter. The compiler treats @noalias-marked pointer parameters as designating disjoint memory regions and may reorder, fuse, or eliminate accesses on that basis. If two such parameters in fact alias, the resulting behaviour typically manifests as silent data corruption, stale reads, or skipped writes; no diagnostic is required.
- Calling a void-returning function through a function pointer typed to return a value, or vice versa.
Operations whose category depends on the build mode¶
A few operations are classified differently in safe and fast builds without being safe-mode traps in the strict sense above:
- Shift by an out-of-range count. a << b and a >> b are well-defined when 0 <= b < bit_width(a). Outside that range, the operation is unspecified. In safe mode the unspecified-ness is resolved by trapping the program; in fast mode it remains unspecified: the operation may produce any value, but does not invoke UB-style optimizations against surrounding code.
- Reading uninitialized memory of a variable declared @noinit before that memory has been written. The read is implementation-defined in safe mode (the compiler may, for example, fill with a recognizable bit pattern on entry to a function for diagnostic purposes) and undefined behaviour in fast mode.
Implementation-defined operations¶
The following operations have outcomes determined by the compiler and must be documented:
- The actual sizes and alignments of platform-dependent types (uptr, iptr, CInt, CLong, etc.).
- The byte order of multi-byte primitive types (controllable for bitstructs through @bigendian / @littleendian).
- The set of $$ intrinsics provided (see Built-in functions and intrinsics).
- The set of sanitizer checks recognized by @nosanitize (see Attributes).
- The form of any diagnostic printed when a safe-mode trap occurs.
- The exit status used when a trap terminates the program.
Unspecified operations¶
The following operations have outcomes drawn from a permitted set, with no requirement that the choice be the same on different runs or different compilers:
- The relative order of initializer functions sharing a priority value (see Program initialization and execution).
- The relative order of finalizer functions sharing a priority value.
- The amount of padding inserted between fields of a struct or union to satisfy alignment, where the layout is not otherwise pinned by @packed, @compact, or related attributes.
- The result of a shift by an out-of-range count in fast mode (see Operations whose category depends on the build mode).
Traps¶
A trap terminates the program at a defined point with an implementation-defined diagnostic. Traps are produced by:
- Failed safe-mode checks (the list above).
- $$trap() and any wrapper such as the standard library's unreachable macro.
- Sanitizer checks that fire (when sanitizers are enabled).
- Uncaught language-level conditions such as a !! force-unwrap on a faulty optional.
The exact diagnostic is implementation-defined. A trap is not catchable from within the program; once a trap fires, the program runs no further user code (in particular, finalizers are not guaranteed to run — see Program initialization and execution).
Sanitizers¶
A compiler may provide sanitizers — additional run-time checks beyond the safe-mode minimum. Sanitizers are enabled at build time by mechanisms outside the language. A function may opt out of a specific sanitizer through the @nosanitize(name) attribute (see Attributes).
The standard sanitizer categories conventionally recognized are "address" (memory-safety errors), "memory" (uninitialized reads), and "thread" (data races). A given implementation may support a subset, a superset, or none.
Concurrency¶
C3 adopts the memory model of C11 and C++11. The two language families share the same formal model — the same six memory orderings (relaxed, consume, acquire, release, acq_rel, seq_cst), the same sequenced-before / synchronizes-with / happens-before relations, and the same definition of a data race. C3 atomic operations interoperate with C atomic operations on the same memory location.
A data race occurs when two memory accesses to the same location
- are performed by different threads,
- are not ordered by happens-before,
- are not both atomic accesses, and
- at least one of them is a write.
A program containing a data race has undefined behaviour, in the sense defined above.
C3 does not provide an atomic type qualifier analogous to C's _Atomic. Atomic types and operations are provided by the standard library; each operation takes a memory-ordering argument drawn from the six orderings named above. The standard library also exposes fences (the analogue of atomic_thread_fence) for separating synchronization from data access.
Synchronization between threads is achieved exclusively through atomic operations, library-provided mutual-exclusion primitives, and any platform-level mechanisms exposed by the standard library. The language defines no other inter-thread visibility guarantees: in particular, ordinary loads and stores carry no implicit synchronization, and only the relations established by atomic operations and synchronization primitives provide happens-before across threads.
Summary¶
The behavioural classes above can be read as a contract between the language and the program:
- The language guarantees the well-defined outcomes regardless of build mode.
- The language traps the safe-mode-checked operations in safe mode and reserves the right to optimize aggressively against them in fast mode.
- The language assumes the never-defined operations do not occur; programs that rely on any particular outcome from them have no guaranteed behaviour.
A portable C3 program treats every category listed under "always undefined" or "trap in safe mode" as a bug to be removed, regardless of which build mode is currently in use.
FAQ
Standard library¶
Q: What are the most fundamental modules in the standard library?
A: By default C3 will implicitly import anything in std::core into
your files. It contains string functions, allocators and conveniences for
doing type introspection. The latter is in particular useful when writing
contracts for macros:
- std::core::array has functions for working with arrays.
- std::core::builtin contains functions that are to be used without a module prefix; unreachable(), bitcast(), @catch() and @ok() are especially important.
- std::core::cinterop contains types which will match the C types on the platform.
- std::core::dstring has the dynamic string type.
- std::core::mem contains malloc etc, as well as functions for atomic and volatile load / store.
- std::core::string has all string functionality, including conversions, splitting and searching strings.
Aside from the std::core module, std::collections is important as it
holds various containers. Of those the generic List type in std::collections::list
and the HashMap in std::collections::map are very frequently used.
IO is a must, and std::io contains std::io::file for working with files,
std::io::path for working with paths. std::io itself contains
functionality for writing to streams in various ways. Useful streams can
be found in the stream sub folder.
Also of interest are std::net for sockets, std::threads for
platform independent threads, std::time for dates and timers, libc for
invoking libc functions, std::os for working with OS specific code and
std::math for math functions and vector methods.
Q: How do strings work?
(see Strings for more info.)
A: C3 defines a native string type String, which is a typedef char[]. Because
char[] is essentially a pointer + length, some care has to be taken to
ensure that the pointer is properly managed.
For dynamic strings, or as a string builder, use DString. To get a String from
a DString you can either get a view using str_view() or make a copy using copy_str().
In the former case, the String may become invalid if DString is then mutated.
ZString is a zero terminated typedef char*. It is used to model zero-terminated
strings like in C. It is mostly useful when interfacing with C.
WString is a Char16*, useful on those platforms, like Win32, where this
is the common unicode format. Like ZString, it is mostly useful when interfacing
with C.
Language features¶
Q: How do I use slices?
(see Arrays/Slice for more info.)
A: Slices are typically preferred in any situation where one in C would pass a pointer + length. It is a struct containing a pointer + a length.
Given an array, pointer or another slice you use either [start..end]
or [start:len] to create it:
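A minimal sketch of both forms, assuming a local array named a (all variable names here are illustrative):

```c3
fn void main()
{
    int[100] a;
    int[] b = a[3..6]; // inclusive range: elements 3, 4, 5 and 6
    int[] c = a[3:4];  // start 3, length 4: the same four elements
    b[0] = 1;          // a slice is a view, so this writes a[3]
    assert(b.len == 4 && c.len == 4);
    assert(a[3] == 1 && c[0] == 1);
}
```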
You can also just pass a pointer to an array:
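For instance, the following sketch relies on the implicit conversion from a pointer to an array into a slice; the helper sum is hypothetical and exists only for this example:

```c3
fn int sum(int[] values)
{
    int total = 0;
    foreach (v : values) total += v;
    return total;
}

fn void main()
{
    int[4] a = { 1, 2, 3, 4 };
    int[] whole = &a; // covers the whole array, like a[0..3]
    assert(whole.len == 4);
    assert(sum(&a) == 10);
}
```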
The start and/or end may be omitted:
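A short sketch of the three shortened forms, using an illustrative 100-element array:

```c3
fn void main()
{
    int[100] a;
    int[] head = a[..6]; // same as a[0..6]
    int[] tail = a[1..]; // same as a[1..99]
    int[] all = a[..];   // same as a[0..99]
    assert(head.len == 7);
    assert(tail.len == 99);
    assert(all.len == 100);
}
```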
It is possible to use ranges to assign:
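The sketch below shows both kinds of range assignment, filling a range with one value and copying one range onto another; the array and the indices are illustrative:

```c3
fn void main()
{
    int[16] a;
    a[11..13] = 7;       // assign 7 to a[11], a[12] and a[13]
    a[1..2] = 5;         // assign 5 to a[1] and a[2]
    a[4..6] = a[11..13]; // copy elements 11-13 into elements 4-6
    assert(a[1] == 5 && a[2] == 5 && a[3] == 0);
    assert(a[4] == 7 && a[6] == 7);
}
```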
It is important to remember that the lifetime of a slice is the same as the lifetime of its underlying pointer:
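As an illustration, the first function below returns a slice into a local array and is therefore a bug, while the second leaves ownership of the storage with the caller. Both function names are hypothetical.

```c3
// BUG: the returned slice points into 'a', whose lifetime ends on return.
fn int[] dangling()
{
    int[3] a;
    int[] b = a[0..1];
    return b;
}

// One way around it: the caller owns the storage and passes it in.
fn int[] first_two(int[] storage)
{
    return storage[0..1];
}

fn void main()
{
    int[3] storage = { 1, 2, 3 };
    int[] view = first_two(&storage);
    assert(view.len == 2 && view[1] == 2);
}
```

Allocating the backing memory on the heap or with the temp allocator is another option, at the cost of having to manage that allocation.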
Q: How do I pass vaargs to another function that takes varargs?
A: Use the splat operator, ...
fn void test(String format, args...)
{
io::printfn(format, ...args);
}
fn void main()
{
test("Format: %s %d", "Foo", 123);
}
Q: What are vectors?
(see Vectors for more info.)
A: Vectors are similar to arrays, but declared with [< >] rather than [ ]. The element type may also only be of integer, floating point, bool or pointer types. Vectors are backed by SIMD types on supported platforms. Arithmetic operators available on the element type are also available on the vector as a whole and are performed element-wise, thus enabling more convenient vector math. For example:
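A minimal sketch of element-wise arithmetic, with illustrative names pos and speed:

```c3
fn void main()
{
    int[<2>] pos = { 1, 3 };
    int[<2>] speed = { 5, 7 };
    pos += speed; // element-wise: { 1 + 5, 3 + 7 }
    assert(pos[0] == 6 && pos[1] == 10);
}
```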
Swizzling (shorthand for rearranging vector components, which is commonly used in graphics and game programming) is also supported:
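A small sketch, assuming the usual x, y component names on a two-element vector:

```c3
fn void main()
{
    int[<2>] pos = { 6, 10 };
    int[<3>] test = pos.yxx; // picks y, then x, then x again
    assert(test[0] == 10 && test[1] == 6 && test[2] == 6);
}
```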
Any scalar value will be expanded to the vector size:
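For example, in the following sketch the scalar 2 is treated as { 2, 2 } on either side of the operator:

```c3
fn void main()
{
    int[<2>] speed = { 5, 7 };
    speed = speed * 2;      // same as speed * { 2, 2 }
    int[<2>] x = 2 + speed; // same as { 2, 2 } + speed
    assert(speed[0] == 10 && speed[1] == 14);
    assert(x[0] == 12 && x[1] == 16);
}
```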
Memory management¶
Q: How do I work with memory?
A: There is malloc, calloc and free just like in C. The main difference is that these will invoke whatever
the current heap allocator is, which does not need to be the allocator provided by libc. You can get the current heap
allocator using mem and do allocations directly. There is also a temporary allocator.
Convenience functions are available for allocating particular types: mem::new(Type) would allocate a single Type
on the heap and zero initialize it. mem::alloc(Type) does the same but without zero initialization.
Alternatively, mem::new can take a second initializer argument:
Foo* f1 = malloc(Foo::size); // No initialization
Foo* f2 = calloc(Foo::size); // Zero initialization
Foo* f3 = mem::new(Foo); // Zero initialization
Foo* f4 = mem::alloc(Foo); // No initialization
Foo* f5 = mem::new(Foo, { 4, 10.0, .a = 123 }); // Initialized to argument
For arrays mem::new_array and mem::alloc_array work in corresponding ways:
Foo* foos1 = malloc(Foo::size * len); // No initialization
Foo* foos2 = calloc(Foo::size * len); // Zero initialization
Foo[] foos3 = mem::new_array(Foo, len); // Zero initialization
Foo[] foos4 = mem::alloc_array(Foo, len); // No initialization
Regardless of how they are allocated, they can be freed using free()
Q: How does the temporary allocator work?
A: The temporary allocator is a kind of stack allocator. tmalloc, tcalloc and trealloc correspond to malloc, calloc and realloc. There is no free, as temporary allocations are freed when the entire pool (a.k.a. arena) of temporary objects is released all at once (making it both very easy to use and extremely performant). You use the @pool() macro to create a temporary allocation scope. When execution exits this scope, the temporary objects within it are all freed automatically. For example:
@pool()
{
void* some_mem = tmalloc(128);
foo(some_mem);
};
// Temporary allocations are automatically freed here.
Similar to the heap allocator, there is also mem::tnew, mem::temp_alloc, mem::temp_array and mem::temp_alloc_array,
which all work like their heap counterparts.
Q: How can I return a temporarily allocated object from inside a temporary allocation scope?
A: You need to pass in a copy of the temp allocator outside of @pool and allocate explicitly
using that allocator.
// Store the temp allocator
Allocator temp = tmem;
@pool()
{
// Note, 'temp != tmem' here!
void* some_mem = tmalloc(128);
// Allocate this on the external temp allocator
Foo* result = allocator::new(temp, Foo);
result.z = foo(some_mem);
// Now "some_mem" will be released,
// but the memory pointed to by "result" is still valid.
return result;
};
Interfacing with C code¶
(see C Interoperability for more info.)
Q: How do I call a C function from C3?
A: Just copy the C function declaration and prefix it with extern (and don’t forget the fn as well).
Imagine for example that you have the function double test(int a, void* b). To call it from C3 just declare
extern fn double test(CInt a, void* b) in the C3 code.
Q: My C function / global has a name that doesn't conform to the C3 name requirements. Just extern fn doesn't work.
A: In this case you need to give the function a C3-compatible name and then use the @cname attribute to
indicate its actual external name. For example, the function int *ABC(void *x) could be declared in the C3 code as
extern fn int* abc(void* x) @cname("ABC").
There are many examples of this in the std::os modules.
Patterns¶
Q: When should I put functionality in a method versus a free function?
A: In the C3 standard library, free functions are preferred unless the function is only acting on the particular
type. Some exceptions exist, but prefer things like io::fprintf(file, "Hello %s", name) over
file.fprintf("Hello %s", name). The former also has the advantage that it's easier to extend to work with many
types.
Q: Are there any naming conventions in the standard library that one should know about?
A: Yes. A function or method with new in the name will in general do one or more allocations and can take an
optional allocator. A function or method with temp in the name will usually allocate using the temp allocator.
The method free will free all memory associated with a type. destroy is similar to free but also indicates
that other resources (such as file handles) are released. In some cases close is used instead of destroy.
Function and variable names use snake_case (all lower case with _ separating words).
Q: How do I create overloaded methods?
A: This can be achieved with macro methods.
Imagine you have two methods:
fn void Obj.func1(&self, String... args) @private {} // vaargs variant
fn void Obj.func2(&self, Foo* pf) @private {} // Foo pointer variant
We can now create a macro method on Obj which compiles to different calls depending on arguments:
// The macro must be vararg, since the functions take different amount of arguments
macro void Obj.func(&self, ...)
{
// Does it have a single argument of type 'Foo*'?
$if $vacount == 1 &&& @typeis($vaarg[0], Foo*):
// If so, dispatch to func2
return self.func2($vaarg[0]);
$else
// Otherwise, dispatch all vaargs to func1
return self.func1($vasplat);
$endif
}
The above would make it possible to use both obj.func("Abc", "Def") and obj.func(&my_foo). (&&& is the compile-time counterpart of &&: when its left hand side is false, the right hand side is not type-checked. In this case, @typeis($vaarg[0], Foo*) is only checked if $vacount is 1.)
Platform support¶
Q: How do I use WASM?
A: Currently WASM support is really incomplete.
You can try this:
compile --reloc=none --target wasm32 -g0 --link-libc=no --no-entry mywasm.c3
Unless you are compiling with something that already runs initializers,
you will need to call the function runtime::wasm_initialize() early in your
main or call it externally (for example from JS) with the name _initialize(),
otherwise globals might not be set up properly.
This should yield an out.wasm file, but there is no CI running on the WASM code
and no one is really using it yet, so the quality is low.
We do want WASM to be working really well, so if you're interested in writing something in WASM please reach out to the C3 development team and we'll help you get things working.
Q: How do I conditionally compile based on compiler flags?
A: You can pass feature flags on the command line using -D SOME_FLAG or using the features key
in the project file.
You can then test for them using $feature(FLAG_NAME):
int my_var @if($feature(USE_MY_VAR));
fn int test()
{
$if $feature(USE_MY_VAR):
return my_var;
$else
return 0;
$endif
}
Syntax & Language design¶
Q: Why does C3 require that types start with upper case but functions with lower case?
A: C's grammar is ambiguous. Usually compilers implement the so-called lexer hack, but other methods exist as well, such as delayed parsing. It is also possible to make it unambiguous using infinite lookahead.
However, all of those methods make it much harder for tools to search the source code accurately. By making the naming convention part of the grammar, C3 is straightforward to parse with a single token lookahead.
Q: Can't you relax C3's naming rules?
A: It is a common misunderstanding that the naming rules are something enforced by the semantic analyzer. This is not true: it is a lexer rule, to be able to distinguish between types and other identifiers.
It is the only way to make a C grammar parsable with only 1 token lookahead. All other approaches add significant complexity to work around this, and often they still rely on C being parsed in order, top to bottom.
Consequently, the answer is a strong NO. There is no way to "relax" the rules, because they are fundamental to making a C-like grammar parsable.
Either C3 has int a = 2; with these rules, or it gains some alternative
variable declaration syntax like var a : int = 2;. But in the latter case, the changes would not end there since
the declaration syntax also strongly shapes struct declarations, for-statements and other things to the point that
no one would recognize it as a C evolution anyway. So it's a non-starter.
Q: Why are there no closures and only non-capturing lambdas?
A: With closures, life-time management of captured variables becomes important to track. This can become arbitrarily complex, and without RAII or any other memory management technique it is fairly difficult to make code safe. Non-capturing lambdas on the other hand are fairly safe.
Q: Why is it called C3 and not something better?
A: Naming a programming language isn't easy. Most programming languages have pretty bad names, and while C3 isn't the best, no real better alternative has come along.
Q: Why are there no static methods?
A: Static methods create a tension between free functions in modules and functions namespaced by the type. Java for example, resolves this by not having free functions at all. C3 resolves it by not having static methods (nor static variables). Consequently more functions become part of the module rather than the type.
Q: Why do macros with trailing bodies require ; at the end?
A: All macro calls, including those with a trailing body, are expressions, so it would be ambiguous to let them terminate a statement without a much more complicated grammar. An example:
// How can the parser determine that the
// last `}` ends the expression? (And does it?)
int a = @test() {} + @test() {}
*b = 123;
// In comparison, the grammar for this is easy:
int a = @test() {} + @test() {};
*b = 123;
C3 strives for a simple grammar, and so the trade-off of having to use ; was a fairly low price to pay for this feature.
Q: Why does C3 choose to call Optional "Optional" and not "Result"?
A: C3's optional has properties both from the traditional "Maybe" and "Result". While it carries two possible values,
like a Result, it is trivially composable in the way optionals are. In the "Result" case, we cannot implicitly combine
Result<int, MyError> and Result<int, YourError>, which is also often reflected in the support for them.
For the "Maybe" it is trivial, so we see how languages do things like "Optional Chaining". C3 even goes beyond that, and implements implicit "flat map" for operations with its Optional.
For Result or multiple returns, it's also not guaranteed how big the error value can be. For C3 the size is defined to be pointer-sized to minimize overhead.
This means that using an Optional is not heavier than using a "Maybe" or even a boolean return in some other language.
For these reasons, the Optional leans more towards "Maybe" usage than "Result", and the name was chosen to nudge towards the correct use.
Q: Why doesn't C3 have a tagged union?
A: Tagged unions are great, but there is still discussion of what it should look like if it was included in C3.
See this issue for more details.
Q: Why is the declaration of arrays swapped compared to C?
A: C3 types are declared with the innermost type to the left and the outermost to the right. Indexing or dereferencing will peel off the rightmost part.
C does this differently: * and [] are placed not on the type but on the variable, in the order in which it must be unpacked.
So given int (*foo)[4] we first dereference it (from inside) to get int[4], then index from the right.
If we wanted to extract a standalone type from this, we'd have int(*)[4] for a pointer to an array of 4 integers.
For "left is innermost", the declaration would instead be int[4]*. With left-is-innermost we can easily describe a pointer
to an array of int pointers (which happens in C3 since arrays don't implicitly decay): int*[4]*. In C that would be int*(*)[4],
which is generally regarded as less easy to read, not least because you need to think about which of * or [] has priority.
In C3, we can have a variable List{int}[3] x, which is an array of 3 List{int}. If we do x[1] we will get a
List{int}, the middle element of the array. If we then further index this with [5], as in x[1][5], we will get
the element at index 5 of that list.
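The left-is-innermost reading can be sketched with plain arrays and pointers; the variable names below are illustrative:

```c3
fn void main()
{
    int[4] arr = { 1, 2, 3, 4 };
    int[4]* p = &arr;   // pointer to an array of 4 ints
    int*[4] ptrs;       // array of 4 int pointers
    int*[4]* q = &ptrs; // pointer to that array of pointers
    (*q)[0] = &arr[0];
    assert((*p)[2] == 3);   // dereference first, then index
    assert(*(*q)[0] == 1);  // dereference, index, dereference
}
```

In each declaration the rightmost part of the type is what the first operation on the variable peels off.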
Q: Why does C3 use :: to separate namespaces and not .?
A: . is nice to type and read, but there are challenges. In particular, C3's "path shortening", where you're allowed to write
file::open("foo.txt") rather than having to use the full std::io::file::open("foo.txt") is only made possible because
the namespace is distinct at the grammar level. If we play with changing the syntax because it isn't as elegant as
file.open("foo.txt"), we'd have to pay by actually writing std.io.file.open("foo.txt") or change to a flat module system.
One can also note that if . is used, then something like file.open("foo.txt") would be ambiguous if there was both a module file
and a variable file in the scope.
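As a small sketch of path shortening, both calls below refer to the same function in std::io; the printed strings are arbitrary:

```c3
import std::io;

fn void main()
{
    io::printn("shortened path");    // last module component only
    std::io::printn("full path");    // fully qualified
}
```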
Choices in tooling¶
Q: Why does C3 have comments in the default project.json?
A: This was done as a way for users to understand what the various fields were used for. While this would more properly be
a .json5 file, many json parsers could ignore comments anyway. As tooling improves, comments will be phased out. Already it's
possible to manipulate the project file from the command line.
Q: Why does C3 use JSON for project.json, why not YAML, TOML or something else?
A: JSON is a format with a parser in almost any language, plus it is straightforward to write a parser for.
Originally C3 used TOML, which is great for manual configs. However, we moved away from this exactly because the project files should be easily generated and manipulated by tools, with no strict requirement that formatting remains the same.
If tools manipulated hand-written TOML, there would be an expectation to retain formatting and comments in the same style, which would put a burden on tool writers.
Q: Will C3 have a package manager?
A: There will be some standard API for uploading and downloading C3 libraries. However, it will not be a full dependency manager. In an attempt to limit over-use of dependencies, each dependency will need to be downloaded separately, rather than automatically.
See, for example, the discussion here.
Cross-compiling To Windows From Linux¶
Q: How do I cross-compile my C3 program for Windows on Linux?
A: With the C3 compiler you can specify which target you would like to cross-compile to. For Windows the following target would be needed:
c3c compile main.c3 --target windows-x64
This requires the MSVC SDK components, which c3c automatically downloads and configures, including the Windows SDK files needed to enable cross-compilation to Windows.
Changes from C¶
Q: Why does C3 have zero initialization for local variables?
A: There are several reasons:
- In the "zero-is-initialization" paradigm, zeroing variables, in particular structs, is very common. By offering zero initialization by default this avoids a whole class of vulnerabilities.
- Another alternative that was considered for C3 was mandatory initialization, but this adds a lot of extra boilerplate.
- C3 also offers a way to opt out of zero-initialization, so the change comes at no performance loss.
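A minimal sketch of the default and the opt-out, assuming the @noinit attribute on a local variable; the variable names are illustrative:

```c3
fn void main()
{
    int counter;              // implicitly zero
    int[4] values;            // every element is zero
    char[64] scratch @noinit; // opt out: contents are indeterminate until written
    scratch[0] = 'x';
    assert(counter == 0 && values[3] == 0);
    assert(scratch[0] == 'x');
}
```

Only memory that has been written, such as scratch[0] here, should be read from a @noinit variable.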
Q: Why is the const qualifier removed?
A: "const correctness" requires littering const across the code base. Although const is useful, it provides weaker guarantees than it appears.
Q: Why was the 0777 octal format removed?
A: C's octal syntax looks too much like base 10 with leading zeros prepended (and is sometimes used outside of C to represent fixed-width base 10 numbers). Removing such ambiguous octal syntax prevents a common source of subtle numerical errors in C.
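For comparison, a sketch of the explicit prefixes and the digit separator that C3 offers instead:

```c3
fn void main()
{
    int mode = 0o777;    // octal, explicit prefix
    int mask = 0b1010;   // binary
    int big = 1_000_000; // digit separator
    assert(mode == 511);
    assert(mask == 10);
    assert(big == 1000000);
}
```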
Q: Why is goto gone?
A: It is very difficult to make goto work well with defer and implicit unwrapping of optional results. It is not just making the compiler harder to write, but
the code is harder to understand as well. The replacements together with defer cover many if not all usages of goto in regular code.
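A common C use of goto is jumping to a shared cleanup label. The sketch below shows how defer can cover that case; process is a hypothetical function and the printed strings stand in for real resource handling:

```c3
import std::io;

fn int process(bool fail_early)
{
    io::printn("acquire");
    defer io::printn("release"); // runs on every exit path below
    if (fail_early) return -1;
    io::printn("work");
    return 0;
}

fn void main()
{
    assert(process(true) == -1);
    assert(process(false) == 0);
}
```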
All Features
Here is a summary of all the features of C3 and how it differs from C.
Symbols and literals¶
Changes relating to literals, identifiers etc.
Added¶
- 0o prefix for octal.
- 0b prefix for binary.
- Optional _ as digit separator.
- Hexadecimal byte data, e.g. x"abcd".
- Base64 byte data, e.g. b64"QzM=".
- Type name restrictions (PascalCase).
- Variable and function name restrictions (must start with lower case letter).
- Constant name restrictions (no lower case).
- Character literals may be 2, 4, 8, 16 bytes long. (2cc, 4cc etc).
- Raw string literals between "`".
- \e escape character.
- Source code must be UTF-8.
- Assumes \n for newlines. \r is stripped from source.
- Integer suffixes are fixed size: L, UL is 64-bit literals on all platforms, LL and ULL are guaranteed 128-bit literals.
- The null literal is a pointer value of 0.
- The true and false are boolean constants true and false.
Removed¶
- Trigraphs / digraphs.
- 0123-style octal.
- z, LL and ULL suffixes.
Built-in types¶
Added¶
- Type declaration is left to right: int[4]*[2] a; instead of int (*a[2])[4];
- Simd vector types using [<>] syntax, e.g. float[<4>], use [<*>] for inferred length.
- Slice type built in, using [] suffix, e.g. int[]
- typedef is similar to C's typedef but forms a new type. (Example: the String type is a new type with char[] internal representation)
- Built-in 128-bit integer on all platforms.
- char is an unsigned 8-bit integer. ichar is its signed counterpart.
- Well-defined bitwidth for integer types: ichar/char (8 bits), short/ushort (16 bits), int/uint (32 bits), long/ulong (64 bits), int128/uint128 (128 bits)
- Pointer-sized iptr and uptr integers.
- sz and usz integers corresponding to the size_t bitwidth.
- Optional types are formed using the ? suffix.
- bool is the boolean type.
- typeid is a unique type identifier for a type, it can be used at runtime and compile time.
- any contains a typeid and void* allowing it to act as a reference to any type of value.
- fault a constant representing an error (see below).
Changed¶
- Inferred array type uses [*] (e.g. int[*] x = { 1, 2 };).
- Flexible array member uses [*].
Removed¶
- The C "spiral rule" type declaration (see above).
- Complex types (implemented as user-defined types instead).
- size_t, ptrdiff_t (see above).
- Array types do not decay.
Types¶
Added¶
- bitstruct a struct with a container type allowing precise control over bit-layout, replacing bitfields and enum masks.
- fault a constant with unique values which are used together with optional.
- Vector types.
- Optional types.
- Operator overloading for arithmetics, bit operators and equality.
- enum allows a set of unique constants to be associated with each enum value.
- Compile time reflection and limited runtime reflection on types (see "Reflection")
- All types have a typeid property uniquely referring to that particular type.
- Distinct types, which are similar to aliases, but represent distinctly different types.
- Types may have methods. Methods can be added to any type, including built-in types.
- Subtyping: using inline on a struct member allows a struct to be implicitly converted to this member type and use corresponding methods.
- Using inline on a typedef allows it to be implicitly converted to its base type (but not vice versa).
- Types may add operator overloading to support foreach and subscript operations.
- Generic types through generic modules, using { ... } for the generic parameter list (e.g. List{ int } list;).
- Interface types and any types, which allow dynamic invocation of methods.
- Types may overload arithmetic operators to implement new numerical types.
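Two of the items above, methods and struct subtyping through inline, can be sketched together. The types Base and Derived and the functions get_id and read_id are hypothetical names for this example.

```c3
struct Base
{
    int id;
}

struct Derived
{
    inline Base base; // Derived can be used where a Base is expected
    int extra;
}

// A method on Base, callable with dot syntax.
fn int Base.get_id(&self)
{
    return self.id;
}

fn int read_id(Base* b)
{
    return b.get_id();
}

fn void main()
{
    Derived d;
    d.id = 7;                 // member of the inline Base, accessed directly
    assert(d.get_id() == 7);  // method of Base used through Derived
    assert(read_id(&d) == 7); // Derived* converts implicitly to Base*
}
```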
Changed¶
- C's typedef is replaced by alias and has somewhat different syntax (e.g. alias MyTypeAlias = int;).
- Function pointer syntax is prefixed by an fn and followed by a regular function declaration without the function name. For example, fn void(int) is the type for a function that takes an int and returns nothing. Named parameters and default arguments are also permitted, such as fn void(int num = 0).
- typedef in C3 creates a new type which can have its own methods, but shares the same common internal representation as the original type.
Removed¶
- Enums, structs and unions no longer have distinct namespaces.
- Enum, struct and union declarations should not have a trailing ';'
- alias can only be used at the top level, not inside a function.
- Anonymous structs are not allowed.
- Type qualifiers are all removed, including const, restrict, and volatile. However, const may be applied to compile-time values and each such constant must have its name written in ALL_CAPS.
- Function pointer types cannot be used "raw", but must always be used through a type alias.
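The alias requirement for function pointer types might look like this in practice; IntOp, twice and apply are hypothetical names:

```c3
// The function pointer type is named through an alias at the top level.
alias IntOp = fn int(int);

fn int twice(int x)
{
    return x * 2;
}

fn int apply(IntOp op, int value)
{
    return op(value);
}

fn void main()
{
    IntOp op = &twice;
    assert(op(21) == 42);
    assert(apply(&twice, 5) == 10);
}
```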
Introspection¶
Compile time type methods: alignment, cname, inf, inner, kind, len,
max, members, min, nan, names, params, returns, size, typeid, values,
qname, has_equals, is_ordered.
Runtime type methods: inner, kind, len, names, size.
Expressions¶
Added¶
- Array initializers may use ranges. (e.g.
int[256] x = { [0..128] = 1 })
- ?: operator, returning the first value if it can be converted to a boolean true, otherwise the second value is returned.
- Optionals support an "or else" operator ?? returning the first value if it is a normal (non-fault) result or else the second value if the first value is an abnormal (fault-containing) Optional value. Thus, ?? provides a mechanism for returning default values when evaluations encounter problems.
- Rethrow ! suffix operator which implicitly returns the Optional value if it was an abnormal (fault-containing) Optional value.
- Dynamic calls, allowing function calls to be made on generic data of type any or to use interfaces as a dynamic dispatching mechanism.
- Create a slice using a range subscript (e.g. a[4..8] to form a slice from element 4 to element 8).
- Two range subscript methods: [start..inclusive_end] and [start:length]. Start, end and length may be omitted for default values.
- Indexing from end: slices, arrays and vectors may be indexed from the end using ^. ^1 represents the last element. This works for ranges as well.
- Range assignment, assign a single value to an entire range e.g. a[4..8] = 1;.
- Slice assignment: copy one range to the other range, e.g. a[4..8] = b[8..12];.
- Array, vector and slice comparison: == can be used to make an element-wise comparison of two containers.
- ~ suffix operator turns a fault into an optional value.
- !! suffix panics if the value is an optional value.
- $defined(...) returns true if the outermost expression contained within it is defined. Sub-expressions must also be valid.
- Compile time "and" and "or" using &&& and |||. Both sides of the operator should be compile-time constants. If the left hand side of &&& is false, the right hand side is not type-checked. For ||| the right hand side is not type-checked if the left hand side is true.
- Lambdas (anonymous functions) may be defined. They work just like functions and do not capture any state (i.e. are not "closures", unlike in some other languages). Not capturing state makes it easier for C3 to retain a simpler lifetime model.
- Simple bitstructs (only containing booleans) may be manipulated using bit operations & ^ | ~ and assignment.
- Structs may implicitly convert to their inline member if they have one.
- Pointers to arrays may implicitly convert to slices.
- Any pointer may implicitly convert to an any containing the type of the pointee.
- An optional value will implicitly invoke “flatmap” on an expression it is a subexpression of.
- Swizzling for arrays and vectors. For example, to reverse a 3-element vector vec via swizzling you can use vec.xyz = vec.zyx;.
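The range, indexing and "or else" items above can be sketched in a few lines. This is a minimal illustration, not taken from the original page: the variable names are made up, and it assumes that the standard library's String.to_int returns an Optional int. Verify it against your compiler version.

```c3
import std::io;

fn void main()
{
    int[8] a = { 1, 2, 3, 4, 5, 6, 7, 8 };

    int[] by_end = a[2..4];   // inclusive end: elements at index 2, 3 and 4
    int[] by_len = a[2:3];    // start 2, length 3: the same three elements
    int last = a[^1];         // ^1 indexes the last element

    io::printfn("%s %s %d", by_end, by_len, last);

    a[0..1] = 0;              // range assignment: a[0] and a[1] become 0
    a[2..3] = a[6..7];        // slice assignment: copies two elements

    // Assumption: String.to_int returns an Optional int.
    // ?? supplies the default when the left side holds a fault.
    int parsed = "abc".to_int() ?? -1;
    io::printfn("%s %d", a, parsed);
}
```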
Changed¶
- Operator precedence of bit operations is higher than + and -.
- Well-defined evaluation order: left-to-right, assignment after expression evaluation.
- sizeof is @sizeof and only works on expressions. Use Type::size on types.
- alignof is @alignof for expressions. Types use Type.alignment.
- Narrowing conversions are only allowed if all sub-expressions are as small or smaller than the type.
- Widening conversions are only allowed on simple expressions (i.e. most binary expressions and some unary may not be widened).
Removed¶
- The comma operator is removed.
Functions¶
Added¶
- Functions may be called using named arguments. The name is the same as the parameter name, but followed by : in the call. For example: foo(name: a, len: 2).
- Typed vaargs are declared Type... argument, and will take 0 or more arguments of the given type.
- It is possible to "splat" an array or slice into the location of a typed vararg using ...: foo(a, b, ...list)
- any vaargs are declared as argument... (i.e. without a type in the function parameter list). Such a vaarg list can take 0 or more arguments of any type. All passed arguments are implicitly converted to the any type when the function is called.
- The function declaration may have @inline or @noinline as a default.
- Using @inline or @noinline on a function call expression will override the function default.
- Type methods are functions defined in the form fn void Foo.my_method(Foo* foo) { ... }. They can be invoked using dot syntax.
- Type methods may be attached to any type, even arrays and vectors.
- Error handling uses Optional return types, which are similar to tagged unions that either contain a valid result or a fault state.
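As a rough sketch of the items above, the following shows a type method, a typed vaarg with a splat, and a call with named arguments. Point, total and scale are hypothetical names chosen for illustration; check the exact syntax against your compiler version.

```c3
import std::io;

struct Point
{
    int x;
    int y;
}

// Type method: invoked with dot syntax as p.sum().
fn int Point.sum(Point* self)
{
    return self.x + self.y;
}

// Typed vaarg: accepts zero or more int arguments.
fn int total(int... values)
{
    int sum = 0;
    foreach (v : values) sum += v;
    return sum;
}

fn int scale(int value, int factor)
{
    return value * factor;
}

fn void main()
{
    Point p = { 3, 4 };
    int[3] list = { 1, 2, 3 };

    io::printfn("%d", p.sum());
    io::printfn("%d", total(10, 20, ...list));      // splat the array into the vaarg
    io::printfn("%d", scale(value: 5, factor: 2));  // named arguments
}
```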
Changed¶
- Function declarations use the fn prefix.
Removed¶
- Functions with C-style vaargs may be called, and declared as external functions, but not used for C3 functions.
Attributes¶
C3 adds a long range of attributes in the form @name(...). It is possible to create custom
attribute groups using attrdef (e.g. attrdef MyAttribute(usz align) = { @aligned(align) @weak };) which
groups certain attributes. Empty attribute groups are permitted.
The complete list: @align, @benchmark, @bigendian, @builtin,
@callconv, @deprecated, @dynamic, @export,
@cname, @if, @inline, @interface,
@littleendian, @local, @maydiscard, @mustinit, @naked,
@nodiscard, @noinit, @noreturn, @nostrip,
@obfuscate, @operator, @overlap, @priority,
@private, @public, @pure, @reflect,
@section, @test, @used, @unused.
Declarations¶
Added¶
- var declaration for type inferred variables in macros. E.g. var a = some_value;.
- var declaration for new type variables in macros. E.g. var $Type = int;.
- var declaration for compile time mutable variables in functions and macros. E.g. var $foo = 1;.
- const declarations may be untyped. Such constants are not stored in the resulting binary.
Changed¶
- tlocal declares a variable to be thread local.
- static top level declarations are replaced with @local. (static in functions is unchanged)
Removed¶
- restrict removed.
- atomic should be replaced by atomic load/store operations.
- volatile should be replaced by volatile load/store operations.
Statements¶
Added¶
- Match-style variant of the switch statement, which allows each case to hold an expression to test.
- Switching over type with typeid.
- asm blocks for inline assembly.
- nextcase to fallthrough to the next case.
- nextcase <expr> to jump to the case with the expression value. This may be an expression evaluated at runtime.
- nextcase default to jump to the default clause.
- Labelled while/do/for/foreach to use with break, nextcase and continue.
- foreach to iterate over arrays, vectors, slices and user-defined containers using operator overloading.
- foreach_r to iterate in reverse.
- foreach / foreach_r may take the element by value or reference. The index may optionally be provided.
- $if, $switch, $for, $foreach statements executing at compile time.
- $echo printing a message at compile time.
- $assert compile time assert.
- defer statement to execute statements at scope exit.
- defer catch and defer try, which are similar to defer but execute only if the Optional value contains a fault or a normal result respectively.
- do statements may omit while, behaving same as while (0).
- if may have a label. Labelled if may be exited using labelled break.
- if (try ...) statements run code when an expression is a "valid"/"normal" result (i.e. to handle the "happy path" when working with Optionals).
- if (catch ...) statements run code when an expression is an "invalid"/"abnormal" (fault-containing) result (i.e. to handle the "failure path" when working with Optionals). It can be used to implicitly unwrap variables.
- Exhaustive switching on enums.
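A few of the statements listed above (defer, foreach by value and by reference, and nextcase) can be combined in one small program. This is only a sketch with invented values; verify it against your compiler version.

```c3
import std::io;

fn void main()
{
    // Runs when main's scope exits, after everything below.
    defer io::printn("done");

    int[3] values = { 10, 20, 30 };

    // By value, with the optional index.
    foreach (i, v : values)
    {
        io::printfn("%d: %d", i, v);
    }

    // By reference: modifies the array in place.
    foreach (&v : values) *v += 1;

    int x = 1;
    switch (x)
    {
        case 1:
            io::printn("one");
            nextcase 2;        // explicit jump; cases break implicitly otherwise
        case 2:
            io::printn("two");
        default:
            io::printn("other");
    }
}
```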
Changed¶
- Switch cases will have implicit break, rather than implicit fallthrough.
- assert is an actual statement and may take a string or a format + arguments.
- static_assert from C and C++ corresponds to $assert in C3 and is a statement.
Removed¶
- goto has been removed and replaced by labelled break, continue and nextcase.
Compile time evaluation¶
Added¶
- @if(cond) to conditionally include a struct/union field, a user-defined type, etc.
- Compile time variables with $ prefix, e.g. $foo.
- $if...$else...$endif and $switch...$endswitch inside of functions to conditionally include code.
- $for and $foreach to loop over compile time variables and data.
- $Typeof determines an expression type without evaluating it.
- Type properties may be accessed at compile time.
- $defined returns true if the expression (variable, function, type, etc) passed to it would compile. The expression passed to $defined is not actually executed though and thus does not have side effects.
- $error emits an error if encountered.
- $embed includes a file as binary data.
- $include includes a file as text.
- $exec includes the output of a program as code.
- $expand takes a compile time string and turns it into code.
- $eval takes a string and turns it into an identifier.
- Compile time constant values are always compile time folded for arithmetic operations and casts.
- $$FUNCTION returns the current function as an identifier, as if its name had been written in place of $$FUNCTION.
Changed¶
- #define for constants is replaced by untyped constants, e.g. #define SOME_CONSTANT 1 becomes const SOME_CONSTANT = 1;.
- #define for variable and function aliases is replaced by alias, e.g. #define native_foo win32_foo becomes alias native_foo = win32_foo;
- In-function #if...#else..#endif is replaced by $if and #if...#elif...#endif is replaced by $switch.
- For converting code into a string use $stringify.
- Macros for date, line etc are replaced by $$DATE, $$FILE, $$FILEPATH, $$FUNC, $$LINE, $$MODULE, $$TIME.
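The replacements above might look as follows in practice. This is a minimal sketch: DEBUG_LEVEL is a made-up constant, and the $if form shown (condition followed by a colon, closed by $endif) should be checked against your compiler version.

```c3
import std::io;

// Replaces "#define DEBUG_LEVEL 2": an untyped constant.
const DEBUG_LEVEL = 2;

fn void main()
{
    // Replaces an in-function #if / #else / #endif.
    $if DEBUG_LEVEL > 1:
        io::printn("verbose logging enabled");
    $else
        io::printn("verbose logging disabled");
    $endif

    // Replaces the C macros for file and line.
    io::printfn("%s:%d", $$FILE, $$LINE);
}
```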
Removed¶
- Top level #if...#endif does not have a counterpart. Use @if instead.
- No #include directives, $include will include text, but normally C3 code will use import to access code from other modules.
Macros¶
Added¶
- macro for defining macros.
- “Function-like” macros have no prefix and have only regular parameters or type parameters.
- “At”-macros are prefixed with @ and may also have compile time values, expression parameters, and a trailing body.
- Type parameters are prefixed with $ and conform to C3's required type naming convention (e.g. $TypeFoo, a.k.a. "PascalCase").
- Expression parameters (i.e. macro parameters prefixed with #) are unevaluated expressions. This is similar to arguments to #define in C.
- Compile time values have a $ prefix and must contain compile time constant values.
- Any macro that evaluates to a constant result can be used as if it was the resulting constant.
- Macros may be recursively evaluated.
- Macros are inlined at the location where they are invoked.
- Unless resulting in a single constant, macros implicitly create a runtime scope.
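To make the distinction between function-like macros and at-macros concrete, here is a small sketch. The macro names square and @for_each are illustrative, and the trailing-body call syntax (including the closing semicolon) is an assumption to verify against your compiler version.

```c3
import std::io;

// Function-like macro: no prefix, only regular parameters.
macro square(x)
{
    return x * x;
}

// At-macro with a trailing body, declared after the ';'.
macro @for_each(values; @body(index, value))
{
    for (int i = 0; i < values.len; i++)
    {
        @body(i, values[i]);
    }
}

fn void main()
{
    io::printfn("%d", square(7));

    double[3] data = { 1.0, 2.0, 3.0 };
    @for_each(data; int index, double value)
    {
        io::printfn("data[%d] = %f", index, value);
    };
}
```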
Removed¶
- No #define macros.
- Macros cannot be incomplete statements.
Features provided by builtins¶
Some features are provided by "builtins" in the standard library, and appear like normal functions and macros in the standard library, but nonetheless provide unique functionality:
- @likely(...) / @unlikely(...) on branches affects compilation optimization.
- @anycast(...) casts an any with an optional result.
- unreachable(...) marks a path as unreachable with a panic in safe mode.
- unsupported(...) similar to unreachable but for functionality not implemented.
- @expect(...) expect a certain value with an optional probability for the optimizer.
- @prefetch(...) prefetches a pointer, meaning that the memory at the pointed to address will be loaded before it is necessarily required, thus possibly improving performance under the right conditions.
- swizzle(...) swizzles a vector.
- @volatile_load(...) and @volatile_store(...) volatile load/store.
- @atomic_load(...) and @atomic_store(...) atomic load/store.
- compare_exchange(...) atomic compare exchange.
- Saturating add, sub, mul, shl on integers.
- Vector reduce operations: add, mul, and, or, xor, max, min.
Modules¶
- Modules are defined using module <name>;, where <name> is of the form foo::bar::baz.
- Modules can be split into an unlimited number of module sections, each starting with the same module name declaration if intended to become part of the same module. Multiple differently named modules can also be defined per file.
- The import statement imports a given module.
- Each module section has its own set of import statements.
- Importing a module gives access to the declarations that are @public.
- Declarations are default @public, but a module section may set a different default (e.g. module my_module @private;).
- @private means the declaration is only visible in the current module.
- @local means the declaration is only visible in the current module section. This also implies that the declaration will not be visible outside the current file either.
- Imports are recursive. For example, import my_lib will implicitly also import my_lib::net.
- Multiple imports may be specified with the same import, e.g. import std::net, std::io;.
- Generic modules are not type checked until any of their types, functions or globals are instantiated.
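The module rules above can be sketched with two modules in a single file. The module names my_app and my_app::math and the functions in them are hypothetical; this is an illustration to check against your compiler version, not a recommended layout.

```c3
module my_app::math;

// Public by default: visible to importers.
fn int add(int a, int b)
{
    return a + b;
}

// Only visible inside my_app::math.
fn int secret_offset() @private
{
    return 42;
}

module my_app;
import my_app::math, std::io;

fn void main()
{
    // Functions are called with their module name as a prefix.
    io::printfn("%d", math::add(1, 2));
}
```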
Contracts¶
- Doc contracts (starting with <* and ending with *>) are parsed for correct contract syntax and semantics. They are not inert comments, despite also serving as documentation comments.
- The first part, up until the first @ directive on a new line, is ignored.
- The @param directive for pointer arguments may define usage constraints [in], [out] and [inout].
- Pointer argument constraints may add a & prefix to indicate that they may not be null, e.g. [&inout].
- Contracts may be attached to generic modules, functions and macros.
- @require directives are evaluated given the arguments provided. Failing them may be a compile time or runtime error.
- The @ensure directive is evaluated at exit – if the return is a "valid"/"normal" (non-fault) result and not an "invalid"/"abnormal" (fault-containing) Optional.
- return can be used as a variable identifier inside of @ensure, and holds the return value.
- @return? optionally lists the errors used. This will be checked at compile time.
- @pure says that no writing to globals is allowed inside and only @pure functions may be called.
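A contract with @require and @ensure might look like the sketch below. The function clamp_to is a made-up example; the free text before the first @ directive is documentation only.

```c3
import std::io;

<*
 Clamps value into the range low..high.

 @require low <= high
 @ensure return >= low
 @ensure return <= high
*>
fn int clamp_to(int value, int low, int high)
{
    if (value < low) return low;
    if (value > high) return high;
    return value;
}

fn void main()
{
    io::printfn("%d", clamp_to(15, 0, 10));
}
```

Calling clamp_to with low greater than high would violate the @require clause, which may be reported at compile time or at runtime depending on whether the arguments are known.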
Benchmarking¶
- Benchmarks are indicated by @benchmark.
- Marking a module section @benchmark makes all functions inside of it implicitly benchmarks.
- Benchmarks are usually not compiled.
- Benchmarks are instead only run by the compiler on request.
Testing¶
- Tests are indicated by @test.
- Marking a module section @test makes all functions inside of it implicitly tests.
- Tests are usually not compiled.
- Tests are instead only run by the compiler on request.
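As an illustration, a module section marked @test could look like the sketch below. The module and function names are invented. How the tests are requested depends on the compiler's command line (for example a test-oriented command such as c3c compile-test, which is an assumption to check against c3c's own help output).

```c3
module math_tests @test;

// Every function in this section is implicitly a test.
fn void addition_works()
{
    assert(1 + 1 == 2, "1 + 1 should equal 2");
}

fn void arrays_compare_elementwise()
{
    int[3] a = { 1, 2, 3 };
    int[3] b = { 1, 2, 3 };
    assert(a == b);
}
```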
Safe / fast¶
Compilation has two modes: “safe” and “fast”. Safe mode will insert checks for out-of-bounds access, null-pointer deref, shifting by negative numbers, division by zero, violation of contracts and asserts.
Fast mode will assume that all of those checks always pass. This means that unexpected behaviour may result from violating those checks. It is recommended to develop in "safe" mode.
If debug symbols are available, C3 will produce a stack trace in safe mode where an error occurs.
Comparison
An important question to answer is "How does C3 compare to other similar programming languages?". Here is an extremely brief (and not yet complete) overview.
C¶
As C3 is an evolution of C, the languages are quite similar. C3 adds features, but also removes a few.
In C but not in C3
- Qualified types (const, volatile etc)
- Unsafe implicit conversions
In C3 but not in C
- Module system
- Operator overloading
- Generics
- Compile time execution and semantic macros
- Integrated build system
- Error handling
- Defer
- Value methods
- Associated enum data
- Distinct types and subtypes
- Gradual contracts
- Built-in slices
- Foreach for iteration over arrays and types
- Dynamic calls and types
C++¶
C++ is a complex object-oriented "almost superset" of C. It tries to be everything to everyone, while squeezing this into a C syntax. The language is well known for its many pitfalls and quirky corners – as well as its long compile times.
C3 is in many ways different from C++ in the same way that C is different from C++, but the semantic macro system and the generics close the gap in terms of writing reusable generic code. The C3 module system and error handling are also very different from how C++ does things.
In C++ but not in C3
- Objects and classes
- RAII
- Exceptions
In C3 but not in C++
- Module system (yet)
- Integrated build system
- Semantic macros
- Error handling
- Defer
- Associated enum data
- Built-in slices
- Dynamic calls
Rust¶
Rust is a safe systems programming language. While not quite as complex as C++, it is still a feature rich programming language with semantic macros, traits and pattern matching to mention a few.
Error handling is handled using Result and Optional, which is similar to how C3 works.
C3 compares to Rust much like C, although the presence of built-in slices and strings reduces the places where C3 is unsafe. Rust provides arrays and strings, but they are not built in.
In Rust but not in C3
- RAII
- Memory safety
- Safe union types with functions
- Different syntax from C
- Pattern matching
- Async built in
In C3 but not in Rust
- Same ease of programming as C
- Gradual contracts
- Familiar C syntax and behaviour
- Dynamic calls
Zig¶
Zig is a systems programming language with extensive compile time execution to enable polymorphic functions and parameterized types. It aims to be a C replacement.
Compared to C3, Zig tries to be a completely new language in terms of syntax and feel. C3 uses macros to a modest degree, whereas it is more pervasive in Zig, and C3 does not depart from C to the same degree. Like Rust, it features slices as a first-class type. The standard library uses an explicit allocator to allow it to work with many different allocation strategies.
Zig is a very ambitious project, aiming to support as many types of platforms as possible.
In Zig but not in C3
- Pervasive compile time execution with type generation
- Memory allocation failure is an error
- Build toolchain is scripted using build files written in Zig
- Different syntax and behaviour compared to C
- Structs define namespace
- Async primitives built in*
- Arbitrary integer sizes
Note
*Note that as of this writing, async is temporarily missing from Zig.
In C3 but not in Zig
- Module system.
- Operator overloading
- C ABI compatibility by default
- First-class lambdas*
- Macros with lazy parameters and/or trailing bodies.
- Gradual contracts
- Dynamic interfaces
- Familiar C syntax and behaviour
- Declarative integrated build system
- Built-in benchmarks
Note
*In Zig, you can achieve a similar result by creating an anonymous struct with a single function.
Jai¶
Jai is a programming language aimed at high performance game programming. It has an extensive compile time meta programming functionality, even to the point of being able to run programs at compile time. It also has compile-time polymorphism, a powerful macro system and uses an implicit context system to switch allocation schemes.
In Jai but not in C3
- Pervasive compile time execution
- Jai's compile time execution is the build system.
- Different syntax and behaviour compared to C
- More powerful macro system than C3
- Implicit constructors
In C3 but not in Jai
- Module system
- Declarative integrated build system
- Gradual contracts
- Familiar C syntax and behaviour
- Fairly small language
- Dynamic interfaces
Odin¶
Odin is a language built for high performance but tries to remain a simple language to learn. Superficially the syntax shares much with Jai, and some of Jai's features – like an implicit context – also show up in Odin. In contrast with both Jai and Zig, Odin uses only minimal compile time evaluation and instead relies on parametric polymorphism to ensure reuse. It also contains conveniences, like maps and arrays built into the language. For error handling it relies on Go style tuple returns.
In Odin but not in C3
- Different syntax and behaviour compared to C
- Ad hoc parametric polymorphism
- Multiple return values
- Error handling through multiple returns
- A rich built-in set of types for maths
In C3 but not in Odin
- Familiar C syntax and behaviour
- Semantic macros
- Value methods
- Gradual contracts
- Built-in error handling
- Dynamic interfaces
- Operator overloading
D¶
D is an incredibly extensive language. It covers anything C++ does and adds much more. D manages this with much fewer syntactic quirks than C++. It is a strong, feature-rich language.
In D but not in C3
- Objects and classes
- RAII
- Exceptions
- Optional GC
- Many, many more features.
In C3 but not in D
- Fairly small language
Rejected Ideas
These are ideas that will not be implemented in C3.
The rationale for each is also given below.
Constructors and destructors¶
A fundamental concept in C3 is that data is not "active". This is to say there is no code associated with the data implicitly, unlike constructors and destructors in an object oriented language. Not having constructors / destructors prevents RAII-style resource handling, but also allows the code to assume the memory can be freely allocated and initialized as it sees fit, without causing any corruption or undefined behaviour.
There is a fundamental difference between active objects and inert data. Each has its advantages and disadvantages. C3 follows the C model, which is that data is passive and does not enforce any behaviour. This has very deep implications on the semantics of the language and adding constructors and destructors would change the language greatly, requiring many parts of the language to be altered.
For that reason, constructors and destructors will not be considered for C3.
Unicode identifiers¶
The main argument for unicode identifiers is that "it allows people to code in their own language". However, there is no proof that this actually is used in practice. Furthermore there are practical issues, such as bidirectional text, characters with different code points that are rendered in an identical way, etc.
Given the complexity and the lack of actual proven benefit, unicode identifiers will not happen for C3.
Builtin type-name variants¶
A common request is to change the builtin type names from char, int, long etc, to some other standard, such as u8 i32 or uint8, int32. Various rationales are usually given for each, but ultimately it is a matter of taste and habit.
Because C3 limits user-defined names to PascalCase in order to easily resolve the language grammar, it is not possible to create type aliases for such names, which leads to requests to build them into the language itself. (Int32 is fine, but int32 is not, nor are INT32 and I32).
Originally, C3 was going to have both bit-fixed type names (like today, where int is always 32 bits, long always 64 bits and so on) as well as explicit bitsize-names like u8, i32 that aliased to the same types. Ultimately this was shelved, because it would mean that libraries would end up standardizing on one style or the other, creating friction when used. Ultimately the language would end up with one style being the "accepted" way to name things anyway. So after quite a bit of deliberation, the C naming scheme was chosen. This was mainly for the following three reasons:
- It's familiar from C, so one would need to rewrite and learn less coming from C/C++/C#/Java, the code would also look more C-like.
- i32 has readability problems when combined with i for index in for loops, which was considered a major drawback.
- While the int32 scheme does not have readability issues, it is longer than the C names in almost every case.
After this decision was made and the types established, someone mentioned that s32 could have been an alternative to consider as well, and indeed it is far superior as a prefix for bitsize-names. However, it's not obvious that for example sptr is better than iptr, plus the decision was made.
Over the years, requests for builtin types have occasionally appeared, but interestingly, not always arguing for the same scheme. Some would say iXX was the only possibility, others thought such naming was out of the question and an intXX scheme the only right decision and so on. Given that, it's rather clear that the preference for any naming scheme is subjective, and one is pretty much as good as the other.
So, the C3 naming scheme will not change, although small tweaks are not ruled out.
String interpolation¶
There is sometimes the request for string interpolation in the style of:
String str1 = "hello";
String str2 = "world";
int val = 3;
float pi = 3.14;
String str = "{str1}:{str2}:{val}:{pi}";
- This is not a replacement for printf, as it assumes that the format string is always known at compile time. This is not always the case.
- This is often surprisingly hard to read compared to printf, except in simple cases or when dealing with huge string templates. In the latter case it doesn't need to be a language builtin.
- The general runtime case needs implicit allocation, which isn't compatible with C3 memory management.
- Compile time string generation already has @sprintf.
- The main downside of printf is when accidentally providing too many or too few arguments, but the C3 compiler already checks for that.
- This solution is as complex as printf and they strongly overlap – printf won't get removed, so adding it would mean there are two ways of achieving the same thing (except printf doesn't have any of its downsides).
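For comparison, the interpolation request above can be written with the existing formatted printing. This is a minimal sketch using io::printfn with the same variables:

```c3
import std::io;

fn void main()
{
    String str1 = "hello";
    String str2 = "world";
    int val = 3;
    float pi = 3.14f;

    // One format specifier per argument, in order.
    io::printfn("%s:%s:%d:%f", str1, str2, val, pi);
}
```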
For these reasons, C3 won't get string interpolation syntax.
Get Involved
Community & Contribute
Contributions Welcome!¶
The C3 language is still in its development phase, which means functionality and specification are subject to change. That also means that any contribution right now will have a big impact on the language. So if you find the project interesting, here’s what you can do to help:
💬 Discuss The Language¶
- Join us on C3 Discord.
💡 Suggest Improvements¶
- Found a bug? File an issue for C3 compiler
- Spotted a typo or broken link? File an issue for the website
💪 Contribute¶
Now that the compiler is stable, what is needed are the non-essentials, such as a documentation generator, editor plugins, language server protocol (LSP), etc.
Thank You
Thank You¶
- A huge "thank you" goes out to all contributors and sponsors.