Getting Started with Axon

Welcome to Axon — the ML/AI-first systems programming language that combines the safety of Rust, the ergonomics of Python, and first-class support for tensors, automatic differentiation, and GPU computing.

What is Axon?

Axon is a compiled, statically-typed language purpose-built for machine learning and AI workloads. It provides:

  • First-class tensor types with compile-time shape checking
  • Ownership-based memory safety — no garbage collector, no data races
  • Automatic differentiation (reverse-mode autograd)
  • Native GPU execution via CUDA, ROCm, and Vulkan backends
  • Hindley-Milner type inference — write less, know more
  • A rich standard library with neural network layers, optimizers, and data loading

Axon compiles to native code through LLVM and to GPU kernels through MLIR, delivering performance on par with C++ while remaining approachable for ML researchers.


Installation

Via Cargo

If you have Rust installed:

bash
cargo install axonc

Binary Download

Pre-built binaries are available for:

  • Linux (x86_64, aarch64)
  • macOS (x86_64, Apple Silicon)
  • Windows (x86_64)

Download from the releases page and add the binary to your PATH.
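
Adding the binary to your PATH looks like this on Linux/macOS. The unpack location `~/axon/bin` is illustrative; use wherever you placed the binary:

```shell
# Assuming the downloaded binary was unpacked to ~/axon/bin (illustrative path):
export PATH="$HOME/axon/bin:$PATH"

# Persist it for future shells (bash shown; zsh users edit ~/.zshrc instead):
echo 'export PATH="$HOME/axon/bin:$PATH"' >> ~/.bashrc
```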

From Source

bash
git clone https://github.com/axon-lang/axon.git
cd axon
cargo build --release
# Binary is at target/release/axonc

Verify the installation:

bash
axonc --version
# axonc 0.1.0

Hello, World!

Create a file named hello.axon:

axon
fn main() {
    println("Hello, Axon!");
}

Compile and run:

bash
axonc build hello.axon -o hello
./hello
# Hello, Axon!

Or use the REPL for quick experimentation:

bash
axonc repl
>>> println("Hello from the REPL!")
Hello from the REPL!

Your First Project

Axon includes a built-in package manager. Create a new project:

bash
axonc pkg new my_project
cd my_project

This generates the following structure:

my_project/
├── Axon.toml          # Project manifest
├── src/
│   └── main.axon      # Entry point
└── tests/
    └── test_main.axon  # Test file

The generated Axon.toml:

toml
[package]
name = "my_project"
version = "0.1.0"
edition = "2026"

[dependencies]

The generated src/main.axon:

axon
fn main() {
    println("Hello from my_project!");
}

Build and run the project:

bash
axonc pkg build
axonc pkg run
# Hello from my_project!

Compiling and Running

Single File

bash
# Parse and check for errors
axonc check hello.axon

# Build an optimized binary
axonc build hello.axon -O 3 -o hello

# Emit LLVM IR for inspection
axonc build hello.axon --emit-llvm

Project-Based

bash
axonc pkg build        # Build the project
axonc pkg run          # Build and run
axonc pkg test         # Run tests
axonc pkg fmt          # Format all source files
axonc pkg lint         # Lint all source files

Optimization Levels

Flag    Description
-O 0    No optimization (default, fastest compile)
-O 1    Basic optimizations
-O 2    Standard optimizations
-O 3    Aggressive optimizations

Editor Setup

The official Axon VS Code extension provides:

  • Syntax highlighting for .axon files
  • Real-time error diagnostics via the Axon LSP
  • Go-to-definition, hover types, and find references
  • Code completion with type-aware suggestions
  • Inlay hints for inferred types
  • Semantic token highlighting
  • Code snippets for common patterns

Install from the marketplace or build from source:

bash
cd editors/vscode
npm install
npm run build

Other Editors

For any editor that supports the Language Server Protocol:

bash
axonc lsp

This starts the Axon language server over stdio, compatible with Neovim (via nvim-lspconfig), Emacs (via lsp-mode), Helix, Zed, and others.


Language Tour

A quick tour of Axon's syntax and core features. This guide assumes familiarity with at least one systems or ML language (Rust, Python, C++).


Variables

Variables are declared with val. They are immutable by default.

axon
val x = 42;              // immutable, type inferred as Int32
val y: Float64 = 3.14;   // explicit type annotation
var counter = 0;          // mutable variable
counter += 1;

Type inference works across expressions — you rarely need annotations:

axon
val name = "Axon";            // String
val active = true;            // Bool
val scores = [95, 87, 92];    // Vec<Int32>

Functions

Functions are declared with fn. Parameters require type annotations; return types follow :.

axon
fn add(a: Int32, b: Int32): Int32 {
    a + b   // last expression is the return value
}

fn greet(name: String) {
    println("Hello, {}!", name);
}

fn main() {
    val sum = add(3, 4);
    greet("World");
}

Unsafe Functions

Functions performing low-level operations can be marked unsafe:

axon
unsafe fn raw_pointer_access(ptr: *mut Float32): Float32 {
    // low-level memory access
}

Basic Types

Numeric Types

Type       Description               Size
Int8       Signed 8-bit integer      1 byte
Int16      Signed 16-bit integer     2 bytes
Int32      Signed 32-bit integer     4 bytes
Int64      Signed 64-bit integer     8 bytes
UInt8      Unsigned 8-bit integer    1 byte
UInt16     Unsigned 16-bit integer   2 bytes
UInt32     Unsigned 32-bit integer   4 bytes
UInt64     Unsigned 64-bit integer   8 bytes
Float16    16-bit floating point     2 bytes
Float32    32-bit floating point     4 bytes
Float64    64-bit floating point     8 bytes

Other Primitives

Type      Description
Bool      Boolean (true / false)
Char      Unicode scalar value
String    UTF-8 encoded string

Numeric Literals

axon
val dec = 42;          // decimal
val hex = 0xFF;        // hexadecimal
val bin = 0b1010;      // binary
val oct = 0o77;        // octal
val sci = 1.5e10;      // scientific notation

Control Flow

If / Else

if is an expression — it returns a value:

axon
val max = if a > b { a } else { b };

if score >= 90 {
    println("Excellent");
} else if score >= 70 {
    println("Good");
} else {
    println("Keep trying");
}

While Loops

axon
var i = 0;
while i < 10 {
    println("{}", i);
    i += 1;
}

For Loops

axon
for item in collection {
    println("{}", item);
}

for i in 0..10 {
    println("{}", i);
}

Match Expressions

Pattern matching with exhaustiveness checking:

axon
match value {
    0 => println("zero"),
    1 => println("one"),
    n => println("other: {}", n),
}

match option_val {
    Some(x) => println("Got {}", x),
    None => println("Nothing"),
}
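
Exhaustiveness means that omitting a case is a compile-time error, not a runtime surprise. A sketch of the diagnostic (the exact wording is illustrative):

axon
match option_val {
    Some(x) => println("Got {}", x),
}   // ERROR: non-exhaustive match: pattern `None` not covered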

Models

Named product types with fields:

axon
model Point {
    x: Float64,
    y: Float64,
}

val p = Point { x: 1.0, y: 2.0 };
println("({}, {})", p.x, p.y);

Methods via extend

axon
extend Point {
    fn distance(&self, other: &Point): Float64 {
        val dx = self.x - other.x;
        val dy = self.y - other.y;
        (dx * dx + dy * dy).sqrt()
    }

    fn origin(): Point {
        Point { x: 0.0, y: 0.0 }
    }
}

Enums

Sum types with variants that can hold data:

axon
enum Shape {
    Circle(Float64),
    Rectangle(Float64, Float64),
    Triangle { base: Float64, height: Float64 },
}

fn area(shape: Shape): Float64 {
    match shape {
        Shape.Circle(r) => 3.14159 * r * r,
        Shape.Rectangle(w, h) => w * h,
        Shape.Triangle { base, height } => 0.5 * base * height,
    }
}

Traits and Extend Blocks

Traits define shared behavior:

axon
trait Printable {
    fn to_string(&self): String;
}

extend Printable for Point {
    fn to_string(&self): String {
        format("({}, {})", self.x, self.y)
    }
}

Trait Bounds

axon
fn print_all<T: Printable>(items: Vec<T>) {
    for item in items {
        println("{}", item.to_string());
    }
}

Supertraits

axon
trait Drawable: Printable {
    fn draw(&self);
}

Generics

Functions, models, and traits can be generic:

axon
fn max<T: Ord>(a: T, b: T): T {
    if a > b { a } else { b }
}

model Pair<A, B> {
    first: A,
    second: B,
}

extend<A: Display, B: Display> Pair<A, B> {
    fn show(&self) {
        println("({}, {})", self.first, self.second);
    }
}

Tensor Types and Shape Annotations

Axon's killer feature — tensors are first-class citizens with compile-time shape verification:

axon
// Tensor with known shape
val weights: Tensor<Float32, [784, 256]> = randn([784, 256]);

// Dynamic batch dimension with ?
val input: Tensor<Float32, [?, 784]> = load_batch();

// Matrix multiply — shapes checked at compile time
val output = input @ weights;   // Tensor<Float32, [?, 256]>

Shape mismatches are caught before your code ever runs:

axon
val a: Tensor<Float32, [3, 4]> = randn([3, 4]);
val b: Tensor<Float32, [5, 6]> = randn([5, 6]);
val c = a @ b;   // ERROR[E3001]: shape mismatch — inner dims 4 ≠ 5

See the Tensor Guide for the full story.


Error Handling

Axon uses Option<T> and Result<T, E> for safe error handling:

axon
fn find(haystack: Vec<Int32>, needle: Int32): Option<Int32> {
    for i in 0..haystack.len() {
        if haystack[i] == needle {
            return Some(i);
        }
    }
    None
}

fn read_config(path: String): Result<Config, IOError> {
    val file = File.open(path)?;    // propagate error with ?
    val data = file.read_all()?;
    parse_config(data)
}

See Error Handling for patterns and best practices.


Modules and Visibility

Organize code into modules with mod and use:

axon
mod math {
    pub fn square(x: Float64): Float64 {
        x * x
    }

    fn internal_helper() {
        // private — not visible outside this module
    }
}

use math.square;

fn main() {
    println("{}", square(4.0));   // 16.0
}

See Modules & Packages for the full module system.


What's Next?

Topic                     Guide
Ownership & borrowing     ownership-borrowing.md
Tensor programming        tensors.md
GPU programming           gpu-programming.md
Error handling            error-handling.md
Modules & packages        modules-packages.md
Build a neural network    Tutorial: MNIST Classifier

Tensor Programming

Tensors are first-class citizens in Axon. The type system tracks tensor shapes at compile time, catching dimension mismatches before your code ever runs.


Tensor Types and Shapes

Every tensor has a dtype and a shape encoded in its type:

axon
// Static shape — all dimensions known at compile time
val weights: Tensor<Float32, [784, 256]> = randn([784, 256]);

// Dynamic batch dimension (?)
val input: Tensor<Float32, [?, 784]> = load_batch();

// Fully dynamic shape
val dynamic: Tensor<Float32, [?, ?]> = some_function();

Shape Syntax

Syntax       Meaning
[3, 4]       Static shape: 3 rows, 4 columns
[?, 784]     Dynamic first dim, static second dim
[?, ?, 3]    Batch × height × width, 3 channels
[N]          Named dimension (generic)
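
Named dimensions let the checker tie sizes together across arguments. A sketch, using the shape-generic syntax loosely (the exact generic form is illustrative):

axon
// N and M are shape variables: the checker unifies each one
// wherever it appears, so the result shape follows from the inputs.
fn outer_product<N, M>(a: Tensor<Float32, [N]>, b: Tensor<Float32, [M]>): Tensor<Float32, [N, M]> {
    a.unsqueeze(1) @ b.unsqueeze(0)   // [N, 1] @ [1, M] → [N, M]
}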

Creating Tensors

Initialization Functions

axon
// Zeros and ones
val z = zeros([3, 4]);           // Tensor<Float32, [3, 4]>
val o = ones([256]);             // Tensor<Float32, [256]>

// Random initialization
val r = randn([128, 64]);       // normal distribution
val u = rand([10, 10]);         // uniform [0, 1)

// From data
val t = Tensor.from_vec([1.0, 2.0, 3.0, 4.0], [2, 2]);

// Range
val seq = arange(0, 10);        // [0, 1, 2, ..., 9]

// Identity matrix
val eye = Tensor.eye(4);       // 4×4 identity

// From file
val data = load_data("weights.npy");

Dtype Selection

axon
val f16: Tensor<Float16, [1024]> = zeros([1024]);    // half precision
val f32: Tensor<Float32, [1024]> = zeros([1024]);    // single precision
val f64: Tensor<Float64, [1024]> = zeros([1024]);    // double precision
val i32: Tensor<Int32, [10]> = arange(0, 10);        // integer tensor

Shape Operations

Reshape

Change the shape without changing the data:

axon
val a: Tensor<Float32, [2, 6]> = randn([2, 6]);
val b = a.reshape([3, 4]);      // Tensor<Float32, [3, 4]>
val c = a.reshape([12]);        // Tensor<Float32, [12]>
// val d = a.reshape([5, 5]);   // ERROR[E3002]: cannot reshape [2,6] (12 elements) to [5,5] (25 elements)

Transpose

axon
val m: Tensor<Float32, [3, 4]> = randn([3, 4]);
val mt = m.transpose();         // Tensor<Float32, [4, 3]>

// For higher-rank tensors, specify axes
val t: Tensor<Float32, [2, 3, 4]> = randn([2, 3, 4]);
val tp = t.permute([0, 2, 1]);  // Tensor<Float32, [2, 4, 3]>

Squeeze and Unsqueeze

axon
val a: Tensor<Float32, [1, 3, 1, 4]> = randn([1, 3, 1, 4]);
val b = a.squeeze();            // Tensor<Float32, [3, 4]>

val c: Tensor<Float32, [3, 4]> = randn([3, 4]);
val d = c.unsqueeze(0);         // Tensor<Float32, [1, 3, 4]>

Concatenation and Stacking

axon
val a: Tensor<Float32, [2, 3]> = randn([2, 3]);
val b: Tensor<Float32, [2, 3]> = randn([2, 3]);

val cat = Tensor.cat([a, b], 0);    // Tensor<Float32, [4, 3]>
val stk = Tensor.stack([a, b], 0);  // Tensor<Float32, [2, 2, 3]>

Slicing

axon
val t: Tensor<Float32, [10, 20]> = randn([10, 20]);
val row = t[0];                  // Tensor<Float32, [20]>
val sub = t[2..5];               // Tensor<Float32, [3, 20]>

Element-Wise Operations

Standard arithmetic operators work element-wise on tensors:

axon
val a = randn([3, 4]);
val b = randn([3, 4]);

val sum  = a + b;     // element-wise addition
val diff = a - b;     // element-wise subtraction
val prod = a * b;     // element-wise multiplication (Hadamard)
val quot = a / b;     // element-wise division

// Scalar broadcasting
val scaled = a * 2.0;
val shifted = a + 1.0;

Math Functions

axon
val x = randn([100]);

val s  = x.sin();
val c  = x.cos();
val e  = x.exp();
val l  = x.log();
val sq = x.sqrt();
val ab = x.abs();
val cl = x.clamp(-1.0, 1.0);

Activation Functions

axon
val h = relu(x);
val g = gelu(x);
val s = sigmoid(x);
val t = tanh(x);
val p = softmax(logits, dim: 1);   // logits: a 2-D batch of class scores

Reduction Operations

Reduce tensors along axes:

axon
val t: Tensor<Float32, [4, 5]> = randn([4, 5]);

val total = t.sum();              // scalar
val row_sum = t.sum(dim: 1);     // Tensor<Float32, [4]>
val col_mean = t.mean(dim: 0);   // Tensor<Float32, [5]>
val max_val = t.max();            // scalar
val min_idx = t.argmin(dim: 1);  // Tensor<Int64, [4]>

Common Reductions

Method             Description
.sum()             Sum of all elements
.sum(dim: N)       Sum along dimension N
.mean()            Mean of all elements
.max() / .min()    Maximum / minimum
.argmax(dim: N)    Index of maximum along dim
.argmin(dim: N)    Index of minimum along dim
.prod()            Product of all elements
.norm(p)           Lp norm

Linear Algebra

Matrix Multiplication (@ operator)

The @ operator performs matrix multiplication with compile-time shape checking:

axon
val A: Tensor<Float32, [3, 4]> = randn([3, 4]);
val B: Tensor<Float32, [4, 5]> = randn([4, 5]);
val C = A @ B;    // Tensor<Float32, [3, 5]>

// Inner dimensions must match
val D: Tensor<Float32, [5, 6]> = randn([5, 6]);
// val E = A @ D;   // ERROR[E3001]: matmul requires inner dims to match: 4 ≠ 5

Batch Matrix Multiplication

axon
val batch_a: Tensor<Float32, [?, 8, 64]> = randn([32, 8, 64]);
val batch_b: Tensor<Float32, [?, 64, 32]> = randn([32, 64, 32]);
val batch_c = batch_a @ batch_b;   // Tensor<Float32, [?, 8, 32]>

Other Linear Algebra Operations

axon
val M = randn([4, 4]);

val d = M.det();              // determinant
val inv = M.inv();            // inverse
val (Q, R) = M.qr();         // QR decomposition
val (U, S, V) = M.svd();     // singular value decomposition
val eig = M.eigenvalues();   // eigenvalues
val tr = M.trace();           // trace
val dp = a.dot(b);            // dot product (1D tensors)

Device Transfer

Tensors can be moved between CPU and GPU:

axon
val cpu_tensor = randn([1024, 1024]);

// Move to GPU
val gpu_tensor = cpu_tensor.to_gpu();

// Compute on GPU
val result = gpu_tensor @ gpu_tensor;

// Move back to CPU for I/O
val cpu_result = result.to_cpu();
println("{}", cpu_result);

See GPU Programming for details.


Compile-Time Shape Checking

Axon's shape checker catches errors at compile time:

axon
// ✓ Shapes match
val a: Tensor<Float32, [3, 4]> = randn([3, 4]);
val b: Tensor<Float32, [4, 5]> = randn([4, 5]);
val c = a @ b;    // OK: [3,4] @ [4,5] → [3,5]

// ✗ Shape mismatch
val d: Tensor<Float32, [3, 4]> = randn([3, 4]);
val e: Tensor<Float32, [5, 6]> = randn([5, 6]);
// val f = d @ e;   // ERROR[E3001]: matmul inner dim mismatch: 4 ≠ 5

// ✗ Invalid reshape
val g: Tensor<Float32, [2, 3]> = randn([2, 3]);
// val h = g.reshape([2, 2]);  // ERROR[E3002]: element count mismatch: 6 ≠ 4

// ✗ Element-wise shape mismatch
val i: Tensor<Float32, [3, 4]> = randn([3, 4]);
val j: Tensor<Float32, [3, 5]> = randn([3, 5]);
// val k = i + j;   // ERROR[E3003]: broadcast incompatible shapes [3,4] and [3,5]

Dynamic Shapes

When dimensions are dynamic (?), shape checks happen at runtime:

axon
fn process(input: Tensor<Float32, [?, 784]>): Tensor<Float32, [?, 10]> {
    val w = randn([784, 10]);
    input @ w    // batch dim (?) propagated, inner dim (784) checked statically
}
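
When two dynamic dimensions meet, the check is deferred to runtime, and a mismatch aborts execution. A sketch (the runtime error message is illustrative):

axon
val a: Tensor<Float32, [?, ?]> = load_batch();      // shapes known only at runtime
val b: Tensor<Float32, [?, ?]> = some_function();
val c = a @ b;   // if the inner dims disagree at runtime:
                 // panic: matmul shape mismatch between dynamic dims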

Summary

Feature         Example
Static shape    Tensor<Float32, [3, 4]>
Dynamic dim     Tensor<Float32, [?, 784]>
Matmul          A @ B
Element-wise    a + b, a * 2.0
Reduction       t.sum(dim: 1)
Reshape         t.reshape([6, 2])
Device          t.to_gpu(), t.to_cpu()
Shape error     Caught at compile time

Ownership and Borrowing

Axon uses an ownership system inspired by Rust to guarantee memory safety at compile time — no garbage collector, no dangling pointers, no data races.


The Three Ownership Rules

  1. Every value has exactly one owner.
  2. When the owner goes out of scope, the value is dropped.
  3. There can be either one mutable reference OR any number of immutable references to a value — never both at the same time.

Ownership and Move Semantics

By default, assigning a value moves it. The original binding becomes invalid:

axon
val tensor = randn([1024, 1024]);
val other = tensor;          // tensor is MOVED into other
// println("{}", tensor);    // ERROR[E4001]: use of moved value `tensor`
println("{}", other);        // OK

This applies to function calls as well:

axon
fn consume(t: Tensor<Float32, [3, 3]>) {
    println("{}", t);
}

val data = randn([3, 3]);
consume(data);
// consume(data);   // ERROR[E4001]: use of moved value `data`

Why Moves?

Moves prevent double-free errors and make ownership transfer explicit. When a large tensor is passed to a function, no implicit copy occurs — you always know where your data lives.
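
When you do need an independent copy, ask for one explicitly. A sketch, assuming tensors implement the Clone trait shown later in this guide:

axon
val tensor = randn([1024, 1024]);
val copy = tensor.clone();     // explicit deep copy, so the cost is visible
val other = tensor;            // the original is moved away
println("{}", copy.mean());    // copy remains valid and independent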


Borrowing: &T and &mut T

To use a value without taking ownership, borrow it:

Immutable Borrows (&T)

Multiple immutable borrows are allowed simultaneously:

axon
fn print_shape(t: &Tensor<Float32, [?, 784]>) {
    println("Shape: {}", t.shape);
}

val input = randn([32, 784]);
print_shape(&input);       // borrow, don't move
print_shape(&input);       // still valid — input wasn't moved

Mutable Borrows (&mut T)

Only one mutable borrow is allowed at a time, and no immutable borrows may coexist with it:

axon
fn scale(t: &mut Tensor<Float32, [3, 3]>, factor: Float32) {
    // modify tensor in place
}

var weights = randn([3, 3]);
scale(&mut weights, 2.0);
println("{}", weights);    // OK — mutable borrow has ended

Borrow Conflicts

The compiler rejects overlapping mutable and immutable borrows:

axon
var data = randn([10]);
val r1 = &data;
val r2 = &mut data;       // ERROR[E4003]: cannot borrow `data` as mutable
                           // because it is also borrowed as immutable
println("{}", r1);

Lifetimes

Lifetimes ensure that references never outlive the data they point to. In most cases, the compiler infers lifetimes automatically:

axon
fn first_element(v: &Vec<Int32>): &Int32 {
    &v[0]   // lifetime of return value tied to lifetime of `v`
}

When the compiler needs help, you annotate lifetimes explicitly:

axon
fn longest<'a>(a: &'a String, b: &'a String): &'a String {
    if a.len() > b.len() { a } else { b }
}

Dangling Reference Prevention

axon
fn dangling(): &String {
    val s = "hello".to_string();
    &s   // ERROR[E4005]: `s` does not live long enough
}        // `s` is dropped here

Copy Types vs Move Types

Some small, stack-allocated types implement the Copy trait and are copied instead of moved:

Copy Types                 Move Types
Int8 through Int64         String
UInt8 through UInt64       Vec<T>
Float16 through Float64    Tensor<T, S>
Bool, Char                 HashMap<K, V>
Tuples of Copy types       Models (by default)

axon
val a: Int32 = 42;
val b = a;          // copy — both a and b are valid
println("{} {}", a, b);

val s = "hello".to_string();
val t = s;          // move — only t is valid
// println("{}", s);   // ERROR

Making Models Copyable

Derive Copy and Clone for small value types:

axon
model Color: Copy, Clone {
    r: UInt8,
    g: UInt8,
    b: UInt8,
}

val red = Color { r: 255, g: 0, b: 0 };
val also_red = red;   // copy, not move
println("{}", red.r);  // OK

Tensor Device-Aware Borrowing

Tensors carry device information (@cpu / @gpu), and the borrow checker enforces device-safety rules:

Rule: No Cross-Device Aliasing

A tensor on the GPU cannot be mutably borrowed while a CPU reference exists:

axon
var t = randn([256, 256]);
val cpu_ref = &t;
val gpu_t = t.to_gpu();      // ERROR[E4007]: cannot move `t` to GPU while
                              // borrowed on CPU

Device Transfer is a Move

Transferring a tensor between devices moves it:

axon
val cpu_data = randn([1024]);
val gpu_data = cpu_data.to_gpu();    // cpu_data is moved
// println("{}", cpu_data);          // ERROR: use of moved value

val result = gpu_data.to_cpu();      // gpu_data is moved back
println("{}", result);

Safe Pattern: Borrow, Then Transfer

axon
var data = randn([256, 256]);

// Phase 1: work on CPU
val norm = data.mean();
println("Mean: {}", norm);

// Phase 2: transfer to GPU (no outstanding borrows)
val gpu_data = data.to_gpu();
val result = gpu_data @ gpu_data;

Ownership in Practice: Training Loop

A real-world example combining ownership patterns:

axon
model Trainer {
    model: NeuralNet,
    optimizer: Adam,
}

extend Trainer {
    fn train_epoch(&mut self, data: &DataLoader): Float32 {
        var total_loss = 0.0;

        for batch in data {
            val (inputs, targets) = batch;

            // model borrowed mutably through self
            val predictions = self.model.forward(inputs);
            val loss = cross_entropy(predictions, targets);

            total_loss += loss.item();

            loss.backward();
            self.optimizer.step();
            self.optimizer.zero_grad();
        }

        total_loss / data.len() as Float32
    }
}

Key ownership points:

  • &mut self — the trainer exclusively owns the model during training
  • data: &DataLoader — data is borrowed immutably (read-only)
  • loss.backward() consumes gradient information (move semantics on graph nodes)
  • No data races are possible — the type system guarantees it

Summary

Concept          Rule
Ownership        Each value has exactly one owner
Move             Assignment transfers ownership (non-Copy types)
Copy             Small primitives are implicitly copied
&T               Immutable borrow — multiple allowed
&mut T           Mutable borrow — exclusive access
Lifetimes        References cannot outlive their referent
Device safety    Cross-device aliasing is forbidden

Error Handling

Axon uses algebraic types for error handling — no exceptions, no null pointers. Every possible failure is encoded in the type system.


Option<T>

Option<T> represents a value that may or may not exist:

axon
enum Option<T> {
    Some(T),
    None,
}

Using Option

axon
fn find_index(items: &Vec<String>, target: &String): Option<Int32> {
    for i in 0..items.len() {
        if items[i] == target {
            return Some(i);
        }
    }
    None
}

fn main() {
    val names = ["Alice", "Bob", "Charlie"];

    match find_index(&names, &"Bob") {
        Some(idx) => println("Found at index {}", idx),
        None => println("Not found"),
    }
}

Option Methods

axon
val opt: Option<Int32> = Some(42);

// Unwrap (panics if None)
val x = opt.unwrap();              // 42

// Unwrap with default
val y = opt.unwrap_or(0);          // 42
val z = None.unwrap_or(0);         // 0

// Map: transform the inner value
val doubled = opt.map(|x| x * 2); // Some(84)

// is_some / is_none
if opt.is_some() {
    println("Has a value");
}

// and_then: chain optional operations
val result = opt
    .map(|x| x + 1)
    .and_then(|x| if x > 0 { Some(x) } else { None });

Result<T, E>

Result<T, E> represents an operation that can succeed (Ok) or fail (Err):

axon
enum Result<T, E> {
    Ok(T),
    Err(E),
}

Using Result

axon
fn parse_int(s: &String): Result<Int32, String> {
    // parsing logic...
    if valid {
        Ok(parsed_value)
    } else {
        Err("invalid integer: " + s)
    }
}

fn read_config(path: String): Result<Config, IOError> {
    val file = File.open(path)?;       // propagate error
    val contents = file.read_all()?;    // propagate error
    val config = parse_toml(contents)?;
    Ok(config)
}

Result Methods

axon
val ok: Result<Int32, String> = Ok(42);
val err: Result<Int32, String> = Err("oops");

// Unwrap (panics on Err)
val x = ok.unwrap();           // 42

// Unwrap with default
val y = err.unwrap_or(0);      // 0

// Map the success value
val doubled = ok.map(|x| x * 2);   // Ok(84)

// Map the error
val mapped_err = err.map_err(|e| IOError.new(e));

// is_ok / is_err
if ok.is_ok() {
    println("Success!");
}

// and_then: chain fallible operations
val result = ok
    .and_then(|x| if x > 0 { Ok(x) } else { Err("negative") });

Pattern Matching on Errors

Pattern matching is the primary way to handle errors:

axon
fn process_file(path: String) {
    match File.open(path) {
        Ok(file) => {
            match file.read_all() {
                Ok(data) => println("Read {} bytes", data.len()),
                Err(e) => eprintln("Read error: {}", e),
            }
        }
        Err(e) => eprintln("Open error: {}", e),
    }
}

Matching Specific Error Types

axon
match load_model("model.axon") {
    Ok(model) => {
        println("Model loaded: {} parameters", model.param_count());
    }
    Err(IOError.NotFound(path)) => {
        eprintln("File not found: {}", path);
    }
    Err(IOError.PermissionDenied(path)) => {
        eprintln("Permission denied: {}", path);
    }
    Err(e) => {
        eprintln("Unexpected error: {}", e);
    }
}

The ? Operator

The ? operator propagates errors to the caller, reducing boilerplate:

axon
// Without ?
fn load_data(path: String): Result<Vec<Float32>, IOError> {
    val file = match File.open(path) {
        Ok(f) => f,
        Err(e) => return Err(e),
    };
    val contents = match file.read_all() {
        Ok(c) => c,
        Err(e) => return Err(e),
    };
    parse_csv(contents)
}

// With ? — equivalent but cleaner
fn load_data(path: String): Result<Vec<Float32>, IOError> {
    val file = File.open(path)?;
    val contents = file.read_all()?;
    parse_csv(contents)
}

The ? operator:

  1. If the value is Ok(v), unwraps to v
  2. If the value is Err(e), returns Err(e) from the enclosing function
  3. Works on Option<T> too — None propagates as None
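
The Option case looks like this, reusing the `find_index` helper from earlier in this guide (the surrounding function is illustrative):

axon
fn score_of(scores: &Vec<Int32>, names: &Vec<String>, who: &String): Option<Int32> {
    val idx = find_index(names, who)?;   // a None here propagates to the caller
    Some(scores[idx])
}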

Chaining with ?

axon
fn pipeline(path: String): Result<Model, Error> {
    val config = load_config(path)?;
    val data = load_dataset(&config.data_path)?;
    val model = build_model(&config)?;
    val trained = train(model, data)?;
    Ok(trained)
}

Panic vs Recoverable Errors

Recoverable Errors

Use Result<T, E> for expected failure modes:

axon
fn connect(host: String): Result<Connection, NetworkError> {
    // network errors are expected — caller decides what to do
}

Panics

Use panic for unrecoverable programmer errors:

axon
fn get_element(v: &Vec<Int32>, idx: Int32): Int32 {
    if idx < 0 || idx >= v.len() {
        panic("index out of bounds: {} (len: {})", idx, v.len());
    }
    v[idx]
}

Panics terminate the program with a stack trace. Use them for:

  • Logic errors / violated invariants
  • unwrap() on None or Err when failure is truly unexpected
  • Debug assertions

Guidelines

Situation              Use
File not found         Result<T, IOError>
Network timeout        Result<T, NetworkError>
Parse failure          Result<T, ParseError>
Index out of bounds    panic
Division by zero       panic
Unimplemented code     panic("not implemented")

Defining Custom Error Types

axon
enum ModelError {
    LoadFailed(String),
    ShapeMismatch { expected: Vec<Int32>, actual: Vec<Int32> },
    TrainingDiverged,
}

extend Display for ModelError {
    fn to_string(&self): String {
        match self {
            ModelError.LoadFailed(path) => format("failed to load: {}", path),
            ModelError.ShapeMismatch { expected, actual } =>
                format("shape mismatch: expected {:?}, got {:?}", expected, actual),
            ModelError.TrainingDiverged => "training diverged (loss = NaN)".to_string(),
        }
    }
}

fn load_and_train(path: String): Result<Model, ModelError> {
    val model = load_model(path).map_err(|e| ModelError.LoadFailed(e.to_string()))?;
    train(model)
}

Error Handling in ML Code

A realistic training function with comprehensive error handling:

axon
fn train_model(config: &TrainConfig): Result<Model, ModelError> {
    val data = DataLoader.from_csv(&config.data_path)
        .map_err(|e| ModelError.LoadFailed(e.to_string()))?;

    var model = NeuralNet.new(config.hidden_size);
    var optimizer = Adam.new(model.parameters(), lr: config.learning_rate);

    for epoch in 0..config.epochs {
        var epoch_loss = 0.0;

        for batch in &data {
            val (inputs, targets) = batch;
            val predictions = model.forward(inputs);
            val loss = cross_entropy(predictions, targets);

            // Check for divergence
            if loss.item().is_nan() {
                return Err(ModelError.TrainingDiverged);
            }

            epoch_loss += loss.item();
            loss.backward();
            optimizer.step();
            optimizer.zero_grad();
        }

        println("Epoch {}: loss = {:.4}", epoch, epoch_loss / data.len() as Float32);
    }

    Ok(model)
}

Summary

Concept              Type            Use Case
Missing value        Option<T>       Lookup, search, optional fields
Fallible operation   Result<T, E>    I/O, parsing, network
Error propagation    ?               Clean chaining of fallible calls
Unrecoverable        panic(...)      Logic errors, invariant violations
Pattern matching     match           Exhaustive error handling

Modules and Packages

Axon provides a hierarchical module system for organizing code and a built-in package manager for dependency management.


Modules

Declaring Modules

Use mod to define a module inline:

axon
mod math {
    pub fn square(x: Float64): Float64 {
        x * x
    }

    pub fn cube(x: Float64): Float64 {
        x * x * x
    }

    fn helper() {
        // private — not visible outside `math`
    }
}

fn main() {
    println("{}", math.square(5.0));   // 25.0
    println("{}", math.cube(3.0));     // 27.0
    // math.helper();                  // ERROR: `helper` is private
}

File-Based Modules

For larger projects, modules map to files:

my_project/
├── Axon.toml
└── src/
    ├── main.axon         # crate root
    ├── model.axon         # mod model
    ├── data/
    │   ├── mod.axon       # mod data (directory module)
    │   ├── loader.axon    # mod data.loader
    │   └── transform.axon # mod data.transform
    └── utils.axon         # mod utils

In src/main.axon:

axon
mod model;
mod data;
mod utils;

fn main() {
    val net = model.build_network();
    val loader = data.loader.DataLoader.new("train.csv");
    utils.log("Training started");
}

In src/data/mod.axon:

axon
pub mod loader;
pub mod transform;

Visibility (pub)

Items are private by default. Use pub to make them visible outside their module:

axon
mod network {
    pub model Layer {
        pub size: Int32,        // public field
        weights: Vec<Float32>,  // private field
    }

    pub fn new_layer(size: Int32): Layer {
        Layer {
            size,
            weights: Vec.new(),
        }
    }

    extend Layer {
        pub fn forward(&self, input: &Vec<Float32>): Vec<Float32> {
            // public method
        }

        fn init_weights(&mut self) {
            // private method — internal use only
        }
    }
}

Visibility Rules

Declaration       Visible To
fn foo()          Current module only
pub fn foo()      Parent module and beyond
pub model Foo     Public type, fields default to private
pub field: T      Public field on a public model

Importing with use

Bring items into scope with use:

axon
use std.collections.HashMap;
use std.io.{File, Read, Write};

fn main() {
    var map = HashMap.new();
    map.insert("key", 42);
}

Path Forms

axon
// Absolute path
use std.math.sin;

// Nested imports
use std.collections.{Vec, HashMap, HashSet};

// Wildcard import (use sparingly)
use std.prelude.*;

// Aliased import
use std.collections.HashMap as Map;

Re-exports

Modules can re-export items for a cleaner public API:

axon
mod internal {
    pub fn core_function(): Int32 { 42 }
}

// Re-export so users see `my_lib.core_function`
pub use internal.core_function;

The Axon.toml Manifest

Every Axon project has an Axon.toml at its root:

toml
[package]
name = "my_ml_project"
version = "0.2.1"
edition = "2026"
authors = ["Jane Doe <jane@example.com>"]
description = "A neural network toolkit"
license = "MIT"
repository = "https://github.com/jane/my_ml_project"

[dependencies]
axon-vision = "0.3.0"
axon-nlp = { version = "0.1.0", features = ["transformers"] }
axon-data = { git = "https://github.com/axon-lang/axon-data.git", branch = "main" }

[dev-dependencies]
axon-test = "0.1.0"

[build]
opt-level = 2
gpu = "cuda"

Manifest Fields

| Section | Field | Description |
|---|---|---|
| [package] | name | Package name (lowercase, hyphens) |
| | version | Semantic version (MAJOR.MINOR.PATCH) |
| | edition | Language edition year |
| | authors | List of authors |
| | description | One-line description |
| | license | SPDX license identifier |
| [dependencies] | name = "ver" | Registry dependency |
| | name = { git = "..." } | Git dependency |
| | name = { path = "..." } | Local path dependency |
| [dev-dependencies] | | Dependencies for tests only |
| [build] | opt-level | Default optimization level |
| | gpu | Default GPU target |

Dependencies

Adding Dependencies

bash
# From registry
axonc pkg add axon-vision
axonc pkg add axon-nlp --version 0.2.0

# Remove a dependency
axonc pkg remove axon-vision

Using Dependencies

After adding a dependency, import it like any module:

axon
use axon_vision.transforms.{resize, normalize};
use axon_nlp.tokenizer.BPETokenizer;

fn preprocess(image: Tensor<Float32, [?, ?, 3]>): Tensor<Float32, [?, 224, 224, 3]> {
    val resized = resize(image, [224, 224]);
    normalize(resized, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225])
}

Lock File

Running axonc pkg build generates an Axon.lock file that pins exact dependency versions for reproducible builds. Commit this file to source control.
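For reference, a pinned entry in Axon.lock might look like the following sketch (the exact schema is an assumption — axonc generates the file, and its real format may differ):

toml
# Auto-generated by axonc pkg build — do not edit by hand
[[package]]
name = "axon-vision"
version = "0.3.0"
source = "registry"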


Package Manager Commands

| Command | Description |
|---|---|
| axonc pkg new <name> | Create a new project |
| axonc pkg init | Initialize in current directory |
| axonc pkg build | Build the project |
| axonc pkg run | Build and run |
| axonc pkg test | Run tests |
| axonc pkg add <pkg> | Add a dependency |
| axonc pkg remove <pkg> | Remove a dependency |
| axonc pkg clean | Clean build artifacts |
| axonc pkg fmt | Format all source files |
| axonc pkg lint | Lint all source files |

Standard Library Modules

Axon ships with a comprehensive standard library:

| Module | Contents |
|---|---|
| std.prelude | Auto-imported basics (println, Clone, Copy, Display, etc.) |
| std.collections | Vec, HashMap, HashSet, Option, Result |
| std.string | String with UTF-8 operations |
| std.io | File, Read, Write, formatting |
| std.math | Trigonometry, logarithms, constants |
| std.tensor | Tensor creation, shape ops, reductions, linalg |
| std.nn | Neural network layers (Linear, Conv2d, LSTM, Transformer) |
| std.autograd | Automatic differentiation |
| std.optim | Optimizers (SGD, Adam, AdamW) + LR schedulers |
| std.loss | Loss functions (CrossEntropy, MSE, BCE) |
| std.data | DataLoader, CSV/JSON loading |
| std.metrics | Accuracy, precision, recall, F1, ROC-AUC |
| std.transforms | Image and text preprocessing |
| std.sync | Mutex, RwLock, Arc, Channel |
| std.thread | spawn, JoinHandle |
| std.device | Device abstraction, GPU query |
| std.random | Random number generation |
| std.ops | Operator traits (Add, Mul, MatMul, Index) |
| std.convert | From, Into, TryFrom, TryInto |

Project Organization Best Practices

my_ml_project/
├── Axon.toml
├── Axon.lock
├── src/
│   ├── main.axon           # entry point
│   ├── lib.axon             # library root (if building a library)
│   ├── model/
│   │   ├── mod.axon
│   │   ├── encoder.axon
│   │   └── decoder.axon
│   ├── data/
│   │   ├── mod.axon
│   │   └── preprocessing.axon
│   └── utils.axon
├── tests/
│   ├── test_model.axon
│   └── test_data.axon
├── benches/
│   └── bench_model.axon
└── examples/
    └── inference.axon

See Also

GPU Programming

Axon provides first-class GPU support through device annotations, automatic kernel compilation, and device-aware tensor operations. Write GPU code in Axon — no CUDA C required.


Device Annotations

@gpu Functions

Mark a function for GPU execution:

axon
@gpu
fn vector_add(a: Tensor<Float32, [1024]>, b: Tensor<Float32, [1024]>): Tensor<Float32, [1024]> {
    a + b
}

The Axon compiler lowers @gpu functions through the MLIR backend to produce optimized GPU kernels for the target platform (CUDA, ROCm, or Vulkan).

@cpu Functions

Explicitly mark a function for CPU-only execution:

axon
@cpu
fn save_results(data: Tensor<Float32, [?, 10]>, path: String) {
    val file = File.create(path);
    file.write(data);
}

@device Annotation

Specify a device target explicitly:

axon
@device("cuda:0")
fn forward_gpu0(x: Tensor<Float32, [?, 784]>): Tensor<Float32, [?, 10]> {
    // executes on CUDA device 0
    val w = randn([784, 10]);
    x @ w
}

Device Transfer

Tensors are transferred between devices with .to_gpu() and .to_cpu():

axon
fn gpu_example() {
    // Create on CPU
    val cpu_tensor = randn([1024, 1024]);

    // Transfer to GPU
    val gpu_tensor = cpu_tensor.to_gpu();

    // Compute on GPU — fast!
    val result = gpu_tensor @ gpu_tensor;

    // Transfer back to CPU for I/O
    val cpu_result = result.to_cpu();
    println("{}", cpu_result);
}

Transfer is a Move

Device transfer follows ownership rules — the source tensor is consumed:

axon
val data = randn([256, 256]);
val gpu_data = data.to_gpu();   // data is moved
// println("{}", data);          // ERROR[E4001]: use of moved value `data`

To keep a CPU copy, clone first:

axon
val data = randn([256, 256]);
val backup = data.clone();
val gpu_data = data.to_gpu();
println("{}", backup);           // OK — backup is a separate copy

Tensor Device Placement

Tensors track their device in the type system. Operations between tensors on different devices are compile-time errors:

axon
val cpu_a = randn([100]);
val gpu_b = randn([100]).to_gpu();
// val c = cpu_a + gpu_b;  // ERROR: device mismatch — cpu and gpu tensors

Creating Tensors Directly on GPU

axon
@gpu
fn init_weights(): Tensor<Float32, [784, 256]> {
    randn([784, 256])   // created directly on GPU — no transfer needed
}

GPU Kernel Compilation

When you compile with --gpu, Axon compiles @gpu functions into GPU kernels:

bash
# Compile for NVIDIA GPUs
axonc build model.axon --gpu cuda -O 3

# Compile for AMD GPUs
axonc build model.axon --gpu rocm -O 3

# Compile for Vulkan (cross-platform)
axonc build model.axon --gpu vulkan -O 3

How It Works

  1. Frontend: Axon source → AST → Typed AST (same for all targets)
  2. MIR: Typed AST → Mid-level IR with device annotations
  3. MLIR: GPU-annotated MIR → MLIR dialects (GPU, Linalg, Tensor)
  4. Lowering: MLIR → NVVM (CUDA) / ROCDL (ROCm) / SPIR-V (Vulkan)
  5. Linking: Host code + GPU kernels → single binary

Optimization Pipeline

Axon applies GPU-specific optimizations:

  • Kernel fusion — combine adjacent operations into single kernels
  • Memory coalescing — optimize memory access patterns
  • Shared memory tiling — tile matrix multiplications for cache efficiency
  • Async transfers — overlap computation with host↔device transfers
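As a sketch of what kernel fusion buys you, consider a chain of element-wise operations (the function and shapes are illustrative; whether a given chain actually fuses is up to the optimizer):

axon
@gpu
fn fused_activation(x: Tensor<Float32, [1024]>): Tensor<Float32, [1024]> {
    // Naively this is three kernels with two intermediate buffers:
    // t1 = x * 2.0; t2 = t1 + 1.0; out = relu(t2).
    // A fusing compiler can instead emit a single kernel that computes
    // relu(x * 2.0 + 1.0) element by element in one pass.
    relu(x * 2.0 + 1.0)
}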

Multi-GPU Programming

Selecting a Device

axon
use std.device.{Device, cuda};

fn main() {
    val dev0 = cuda(0);   // first GPU
    val dev1 = cuda(1);   // second GPU

    val a = randn([1024, 1024]).to_device(dev0);
    val b = randn([1024, 1024]).to_device(dev1);
}

Data Parallelism

Split batches across GPUs:

axon
fn train_multi_gpu(model: &mut NeuralNet, data: &DataLoader) {
    val devices = [cuda(0), cuda(1)];

    for batch in data {
        val (inputs, targets) = batch;

        // Split inputs and targets across devices
        val input_chunks = inputs.chunk(devices.len(), dim: 0);
        val target_chunks = targets.chunk(devices.len(), dim: 0);

        var losses = Vec.new();
        for i in 0..devices.len() {
            val chunk = input_chunks[i].to_device(devices[i]);
            val target = target_chunks[i].to_device(devices[i]);
            val pred = model.forward(chunk);
            val loss = cross_entropy(pred, target);
            losses.push(loss);
        }

        // Sum per-device losses and backpropagate
        val total_loss = losses.sum();
        total_loss.backward();
    }
}

Device Query

axon
use std.device;

fn main() {
    val count = device.gpu_count();
    println("Available GPUs: {}", count);

    for i in 0..count {
        val dev = device.cuda(i);
        println("  GPU {}: {} ({}MB)", i, dev.name(), dev.memory_mb());
    }
}

Complete Example: GPU Matrix Multiplication

axon
use std.device.cuda;

@gpu
fn matmul_gpu(
    a: Tensor<Float32, [?, ?]>,
    b: Tensor<Float32, [?, ?]>,
): Tensor<Float32, [?, ?]> {
    a @ b
}

fn main() {
    val size = 2048;

    // Create tensors on CPU
    val a = randn([size, size]);
    val b = randn([size, size]);

    // Transfer to GPU
    val ga = a.to_gpu();
    val gb = b.to_gpu();

    // GPU matrix multiply
    val gc = matmul_gpu(ga, gb);

    // Get result
    val c = gc.to_cpu();
    println("Result shape: {}", c.shape);
    println("Result[0][0]: {}", c[0][0]);
}

Compile and run:

bash
axonc build matmul.axon --gpu cuda -O 3 -o matmul
./matmul
# Result shape: [2048, 2048]
# Result[0][0]: 12.3456

Best Practices

  1. Minimize transfers — keep data on GPU as long as possible
  2. Batch operations — GPU shines with large, parallel workloads
  3. Use @gpu functions — let the compiler handle kernel generation
  4. Profile first — not everything benefits from GPU acceleration
  5. Clone before transfer if you need the CPU copy
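To illustrate practice 1, compare a loop that bounces data between host and device with one that keeps it GPU-resident (a sketch; sizes and step counts are arbitrary):

axon
fn transfer_heavy(steps: Int32) {
    var x = randn([1024, 1024]);
    for _ in 0..steps {
        val g = x.to_gpu();       // host→device on every iteration
        x = (g @ g).to_cpu();     // device→host on every iteration
    }
}

fn gpu_resident(steps: Int32) {
    var g = randn([1024, 1024]).to_gpu();   // one transfer in
    for _ in 0..steps {
        g = g @ g;                // data stays on the device
    }
    val result = g.to_cpu();      // one transfer out
    println("{}", result.shape);
}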

See Also

Tutorial 1: Hello, Tensor!

In this tutorial you'll create, manipulate, and inspect tensors — the fundamental data type in Axon.

Prerequisites: Axon installed (Getting Started)


Step 1: Create a Project

bash
axonc pkg new hello_tensor
cd hello_tensor

Step 2: Your First Tensor

Open src/main.axon and replace its contents:

axon
fn main() {
    // Create a 1D tensor from values
    val numbers = Tensor.from_vec([1.0, 2.0, 3.0, 4.0, 5.0], [5]);
    println("Numbers: {}", numbers);
    println("Shape:   {}", numbers.shape);
    println("Sum:     {}", numbers.sum());
    println("Mean:    {}", numbers.mean());
}

Run it:

bash
axonc pkg run
# Numbers: [1.0, 2.0, 3.0, 4.0, 5.0]
# Shape:   [5]
# Sum:     15.0
# Mean:    3.0

Step 3: Creating Tensors

Axon offers several tensor constructors:

axon
fn main() {
    // Zeros and ones
    val z: Tensor<Float32, [2, 3]> = zeros([2, 3]);
    println("Zeros:\n{}", z);

    val o: Tensor<Float32, [3]> = ones([3]);
    println("Ones: {}", o);

    // Random tensors
    val r = randn([2, 2]);     // normal distribution
    println("Random:\n{}", r);

    // Range
    val seq = arange(0, 5);
    println("Range: {}", seq);   // [0, 1, 2, 3, 4]

    // Identity matrix
    val eye = Tensor.eye(3);
    println("Identity:\n{}", eye);
}

Step 4: Arithmetic Operations

Tensors support element-wise arithmetic:

axon
fn main() {
    val a = Tensor.from_vec([1.0, 2.0, 3.0], [3]);
    val b = Tensor.from_vec([4.0, 5.0, 6.0], [3]);

    println("a + b = {}", a + b);    // [5.0, 7.0, 9.0]
    println("a * b = {}", a * b);    // [4.0, 10.0, 18.0]
    println("a * 2 = {}", a * 2.0);  // [2.0, 4.0, 6.0]

    // Math functions
    val x = Tensor.from_vec([0.0, 1.5708, 3.1416], [3]);
    println("sin(x) = {}", x.sin());
    println("exp(x) = {}", x.exp());
}

Step 5: Matrix Multiplication

The @ operator performs matrix multiplication:

axon
fn main() {
    val A: Tensor<Float32, [2, 3]> = Tensor.from_vec(
        [1.0, 2.0, 3.0,
         4.0, 5.0, 6.0], [2, 3]
    );

    val B: Tensor<Float32, [3, 2]> = Tensor.from_vec(
        [7.0,  8.0,
         9.0,  10.0,
         11.0, 12.0], [3, 2]
    );

    val C = A @ B;   // [2, 3] @ [3, 2] → [2, 2]
    println("A @ B =\n{}", C);
    println("Shape: {}", C.shape);
    // A @ B =
    // [[58.0, 64.0],
    //  [139.0, 154.0]]
}

Step 6: Reshaping and Transposing

axon
fn main() {
    val t: Tensor<Float32, [2, 6]> = arange(0, 12).reshape([2, 6]);
    println("Original [2, 6]:\n{}", t);

    // Reshape
    val r = t.reshape([3, 4]);
    println("Reshaped [3, 4]:\n{}", r);

    val flat = t.reshape([12]);
    println("Flat: {}", flat);

    // Transpose
    val m: Tensor<Float32, [2, 3]> = Tensor.from_vec(
        [1.0, 2.0, 3.0,
         4.0, 5.0, 6.0], [2, 3]
    );
    val mt = m.transpose();
    println("Transposed [3, 2]:\n{}", mt);
}

Step 7: Reductions

axon
fn main() {
    val data: Tensor<Float32, [3, 4]> = Tensor.from_vec(
        [1.0, 2.0, 3.0, 4.0,
         5.0, 6.0, 7.0, 8.0,
         9.0, 10.0, 11.0, 12.0], [3, 4]
    );

    println("Sum (all):   {}", data.sum());          // 78.0
    println("Mean (all):  {}", data.mean());         // 6.5
    println("Max (all):   {}", data.max());          // 12.0
    println("Sum (dim 0): {}", data.sum(dim: 0));    // [15.0, 18.0, 21.0, 24.0]
    println("Sum (dim 1): {}", data.sum(dim: 1));    // [10.0, 26.0, 42.0]
}

Step 8: Putting It All Together

A small program that normalizes a dataset:

axon
fn normalize(data: Tensor<Float32, [?, ?]>): Tensor<Float32, [?, ?]> {
    val mean = data.mean(dim: 0);
    val std = data.std(dim: 0);
    (data - mean) / std
}

fn main() {
    // Simulate a dataset: 100 samples, 4 features
    val dataset = randn([100, 4]);

    println("Before normalization:");
    println("  Mean: {}", dataset.mean(dim: 0));

    val normed = normalize(dataset);

    println("After normalization:");
    println("  Mean: {}", normed.mean(dim: 0));   // ≈ [0, 0, 0, 0]
    println("  Std:  {}", normed.std(dim: 0));    // ≈ [1, 1, 1, 1]
}

What You Learned

  • Creating tensors with from_vec, zeros, ones, randn, arange
  • Element-wise operations (+, -, *, /)
  • Matrix multiplication with @
  • Reshaping and transposing
  • Reduction operations (sum, mean, max)
  • Compile-time shape checking

Next Steps

Tutorial 2: Linear Regression

Build a simple linear regression model from scratch using tensors and autograd. This tutorial demonstrates Axon's automatic differentiation engine.

Prerequisites: Tutorial 1: Hello, Tensor!


The Problem

We'll fit a line y = wx + b to synthetic data using gradient descent.
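It is worth seeing what one gradient-descent step does by hand before autograd takes over. For the MSE loss L = mean((wx + b − y)²), the gradients are ∂L/∂w = 2·mean(x·(ŷ − y)) and ∂L/∂b = 2·mean(ŷ − y). A manual sketch (scalar parameters for simplicity — the autograd version in Step 4 automates exactly this bookkeeping):

axon
fn manual_step(
    x: Tensor<Float32, [?, 1]>,
    y: Tensor<Float32, [?, 1]>,
    w: Float32, b: Float32, lr: Float32,
): (Float32, Float32) {
    val pred = x * w + b;                  // forward pass: ŷ = wx + b
    val err = pred - y;                    // ŷ − y
    val grad_w = (x * err).mean() * 2.0;   // ∂L/∂w
    val grad_b = err.mean() * 2.0;         // ∂L/∂b
    (w - lr * grad_w.item(), b - lr * grad_b.item())
}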


Step 1: Generate Synthetic Data

axon
fn generate_data(n: Int32): (Tensor<Float32, [?, 1]>, Tensor<Float32, [?, 1]>) {
    // True parameters: y = 3.5x + 2.0 + noise
    val x = randn([n, 1]);
    val noise = randn([n, 1]) * 0.3;
    val y = x * 3.5 + 2.0 + noise;
    (x, y)
}

Step 2: Define the Model

Linear regression is a single linear transformation:

axon
model LinearRegression {
    weight: Tensor<Float32, [1, 1]>,
    bias: Tensor<Float32, [1]>,
}

extend LinearRegression {
    fn new(): LinearRegression {
        LinearRegression {
            weight: randn([1, 1]),
            bias: zeros([1]),
        }
    }

    fn forward(&self, x: Tensor<Float32, [?, 1]>): Tensor<Float32, [?, 1]> {
        x @ self.weight + self.bias
    }
}

Step 3: Define the Loss Function

Mean Squared Error — the standard loss for regression:

axon
fn mse_loss(
    predictions: Tensor<Float32, [?, 1]>,
    targets: Tensor<Float32, [?, 1]>,
): Tensor<Float32, []> {
    val diff = predictions - targets;
    (diff * diff).mean()
}

Step 4: Training with Autograd

Now we use Axon's autograd to compute gradients and update parameters:

axon
use std.autograd.GradTensor;
use std.optim.SGD;

fn train() {
    // Generate training data
    val (x_train, y_train) = generate_data(200);

    // Initialize model
    var net = LinearRegression.new();

    // Optimizer: SGD with learning rate 0.01
    var optimizer = SGD.new(
        [&net.weight, &net.bias],
        lr: 0.01,
    );

    // Training loop
    for epoch in 0..100 {
        // Forward pass
        val predictions = net.forward(x_train);

        // Compute loss
        val loss = mse_loss(predictions, y_train);

        // Backward pass — compute gradients
        loss.backward();

        // Update parameters
        optimizer.step();
        optimizer.zero_grad();

        if epoch % 10 == 0 {
            println("Epoch {}: loss = {:.4}", epoch, loss.item());
        }
    }

    // Print learned parameters
    println("Learned weight: {:.4} (true: 3.5)", net.weight.item());
    println("Learned bias:   {:.4} (true: 2.0)", net.bias.item());
}

Step 5: Evaluate the Model

axon
fn evaluate(net: &LinearRegression) {
    // Generate test data
    val (x_test, y_test) = generate_data(50);

    // Predict
    val predictions = net.forward(x_test);

    // Compute test loss
    val test_loss = mse_loss(predictions, y_test);
    println("Test MSE: {:.4}", test_loss.item());

    // Print a few predictions
    println("\nSample predictions:");
    println("  x       | predicted | actual");
    println("  --------|-----------|-------");
    for i in 0..5 {
        println("  {:.4}  | {:.4}    | {:.4}",
            x_test[i].item(),
            predictions[i].item(),
            y_test[i].item()
        );
    }
}

Step 6: Full Program

axon
use std.autograd.GradTensor;
use std.optim.SGD;

model LinearRegression {
    weight: Tensor<Float32, [1, 1]>,
    bias: Tensor<Float32, [1]>,
}

extend LinearRegression {
    fn new(): LinearRegression {
        LinearRegression {
            weight: randn([1, 1]),
            bias: zeros([1]),
        }
    }

    fn forward(&self, x: Tensor<Float32, [?, 1]>): Tensor<Float32, [?, 1]> {
        x @ self.weight + self.bias
    }
}

fn mse_loss(
    predictions: Tensor<Float32, [?, 1]>,
    targets: Tensor<Float32, [?, 1]>,
): Tensor<Float32, []> {
    val diff = predictions - targets;
    (diff * diff).mean()
}

fn main() {
    println("=== Linear Regression in Axon ===\n");

    // Data
    val (x_train, y_train) = generate_data(200);

    // Model
    var net = LinearRegression.new();
    var optimizer = SGD.new(
        [&net.weight, &net.bias],
        lr: 0.01,
    );

    // Train
    for epoch in 0..200 {
        val pred = net.forward(x_train);
        val loss = mse_loss(pred, y_train);
        loss.backward();
        optimizer.step();
        optimizer.zero_grad();

        if epoch % 50 == 0 {
            println("Epoch {:>3}: loss = {:.6}", epoch, loss.item());
        }
    }

    println("\nFinal parameters:");
    println("  weight = {:.4} (true: 3.5)", net.weight.item());
    println("  bias   = {:.4} (true: 2.0)", net.bias.item());

    // Evaluate
    val (x_test, y_test) = generate_data(50);
    val test_pred = net.forward(x_test);
    val test_loss = mse_loss(test_pred, y_test);
    println("\nTest MSE: {:.6}", test_loss.item());
}

fn generate_data(n: Int32): (Tensor<Float32, [?, 1]>, Tensor<Float32, [?, 1]>) {
    val x = randn([n, 1]);
    val noise = randn([n, 1]) * 0.3;
    val y = x * 3.5 + 2.0 + noise;
    (x, y)
}

Run:

bash
axonc build linear_reg.axon -O 2 -o linear_reg
./linear_reg
# === Linear Regression in Axon ===
#
# Epoch   0: loss = 14.832901
# Epoch  50: loss = 0.129384
# Epoch 100: loss = 0.091203
# Epoch 150: loss = 0.089847
#
# Final parameters:
#   weight = 3.4821 (true: 3.5)
#   bias   = 1.9934 (true: 2.0)
#
# Test MSE: 0.092145

Key Concepts Covered

| Concept | Axon Feature |
|---|---|
| Model definition | model with tensor fields |
| Forward pass | @ operator, element-wise ops |
| Loss function | Tensor reductions (.mean()) |
| Backpropagation | loss.backward() |
| Parameter update | optimizer.step() |
| Gradient reset | optimizer.zero_grad() |

Next Steps

Tutorial 3: MNIST Classifier

Build a convolutional neural network to classify handwritten digits from the MNIST dataset using Axon's std.nn module.

Prerequisites: Tutorial 2: Linear Regression


Overview

We'll build a CNN with:

  • Two convolutional layers with ReLU and max pooling
  • Two fully connected layers
  • Softmax output for 10 digit classes

Step 1: Data Loading

axon
use std.data.DataLoader;
use std.transforms.{normalize, to_tensor};

fn load_mnist(): (DataLoader, DataLoader) {
    val train_loader = DataLoader.from_csv("data/mnist_train.csv")
        .batch_size(64)
        .shuffle(true)
        .transform(|img| {
            val tensor = to_tensor(img, [1, 28, 28]);
            normalize(tensor, mean: [0.1307], std: [0.3081])
        });

    val test_loader = DataLoader.from_csv("data/mnist_test.csv")
        .batch_size(256)
        .shuffle(false)
        .transform(|img| {
            val tensor = to_tensor(img, [1, 28, 28]);
            normalize(tensor, mean: [0.1307], std: [0.3081])
        });

    (train_loader, test_loader)
}

Step 2: Define the Model

axon
use std.nn.{Conv2d, Linear, MaxPool2d, Module, Sequential};

model MNISTNet {
    conv1: Conv2d,
    conv2: Conv2d,
    pool: MaxPool2d,
    fc1: Linear<3136, 128>,
    fc2: Linear<128, 10>,
}

extend MNISTNet {
    fn new(): MNISTNet {
        MNISTNet {
            conv1: Conv2d.new(in_channels: 1, out_channels: 32, kernel_size: 3, padding: 1),
            conv2: Conv2d.new(in_channels: 32, out_channels: 64, kernel_size: 3, padding: 1),
            pool: MaxPool2d.new(kernel_size: 2, stride: 2),
            fc1: Linear.new(),
            fc2: Linear.new(),
        }
    }
}

extend Module for MNISTNet {
    fn forward(&self, x: Tensor<Float32, [?, 1, 28, 28]>): Tensor<Float32, [?, 10]> {
        // Conv block 1: [?, 1, 28, 28] → [?, 32, 14, 14]
        val h = self.conv1.forward(x);
        val h = relu(h);
        val h = self.pool.forward(h);

        // Conv block 2: [?, 32, 14, 14] → [?, 64, 7, 7]
        val h = self.conv2.forward(h);
        val h = relu(h);
        val h = self.pool.forward(h);

        // Flatten: [?, 64, 7, 7] → [?, 3136]
        val batch_size = h.shape[0];
        val h = h.reshape([batch_size, 3136]);

        // Fully connected layers
        val h = relu(self.fc1.forward(h));
        self.fc2.forward(h)
    }
}

Step 3: Training Loop

axon
use std.optim.Adam;
use std.loss.cross_entropy;
use std.metrics.accuracy;

fn train_epoch(
    net: &mut MNISTNet,
    data: &DataLoader,
    optimizer: &mut Adam,
): (Float32, Float32) {
    var total_loss = 0.0;
    var correct = 0;
    var total = 0;

    for batch in data {
        val (images, labels) = batch;

        // Forward
        val logits = net.forward(images);
        val loss = cross_entropy(logits, labels);

        // Track metrics
        total_loss += loss.item();
        val predicted = logits.argmax(dim: 1);
        correct += (predicted == labels).sum().item() as Int32;
        total += labels.shape[0];

        // Backward
        loss.backward();
        optimizer.step();
        optimizer.zero_grad();
    }

    val avg_loss = total_loss / data.num_batches() as Float32;
    val acc = correct as Float32 / total as Float32;
    (avg_loss, acc)
}

Step 4: Evaluation

axon
fn evaluate(net: &MNISTNet, data: &DataLoader): (Float32, Float32) {
    var total_loss = 0.0;
    var correct = 0;
    var total = 0;

    for batch in data {
        val (images, labels) = batch;
        val logits = net.forward(images);
        val loss = cross_entropy(logits, labels);

        total_loss += loss.item();
        val predicted = logits.argmax(dim: 1);
        correct += (predicted == labels).sum().item() as Int32;
        total += labels.shape[0];
    }

    val avg_loss = total_loss / data.num_batches() as Float32;
    val acc = correct as Float32 / total as Float32;
    (avg_loss, acc)
}

Step 5: Full Training Program

axon
use std.nn.{Conv2d, Linear, MaxPool2d, Module};
use std.optim.Adam;
use std.loss.cross_entropy;
use std.data.DataLoader;
use std.transforms.{normalize, to_tensor};

fn main() {
    println("=== MNIST Classifier ===\n");

    // Load data
    val (train_loader, test_loader) = load_mnist();
    println("Train: {} samples", train_loader.len());
    println("Test:  {} samples\n", test_loader.len());

    // Create model and optimizer
    var net = MNISTNet.new();
    var optimizer = Adam.new(
        net.parameters(),
        lr: 0.001,
    );

    // Training
    val epochs = 10;
    for epoch in 0..epochs {
        val (train_loss, train_acc) = train_epoch(&mut net, &train_loader, &mut optimizer);
        val (test_loss, test_acc) = evaluate(&net, &test_loader);

        println("Epoch {:>2}/{} | Train Loss: {:.4} Acc: {:.2}% | Test Loss: {:.4} Acc: {:.2}%",
            epoch + 1, epochs,
            train_loss, train_acc * 100.0,
            test_loss, test_acc * 100.0,
        );
    }

    // Final evaluation
    val (_, final_acc) = evaluate(&net, &test_loader);
    println("\nFinal test accuracy: {:.2}%", final_acc * 100.0);
}

Expected output:

=== MNIST Classifier ===

Train: 60000 samples
Test:  10000 samples

Epoch  1/10 | Train Loss: 0.2134 Acc: 93.41% | Test Loss: 0.0712 Acc: 97.82%
Epoch  2/10 | Train Loss: 0.0583 Acc: 98.19% | Test Loss: 0.0498 Acc: 98.41%
...
Epoch 10/10 | Train Loss: 0.0089 Acc: 99.71% | Test Loss: 0.0312 Acc: 99.12%

Final test accuracy: 99.12%

Step 6: GPU Training (Optional)

To train on GPU, simply transfer data and model:

axon
fn main() {
    var net = MNISTNet.new().to_gpu();
    var optimizer = Adam.new(net.parameters(), lr: 0.001);

    for epoch in 0..10 {
        // train_loader as built in Step 1
        for batch in &train_loader {
            val (images, labels) = batch;
            val images = images.to_gpu();
            val labels = labels.to_gpu();

            val logits = net.forward(images);
            val loss = cross_entropy(logits, labels);
            loss.backward();
            optimizer.step();
            optimizer.zero_grad();
        }
    }
}

Compile with GPU support:

bash
axonc build mnist.axon --gpu cuda -O 3 -o mnist

Step 7: Save the Model

axon
use std.export.save;

// After training
save(&net, "mnist_model.axon");
println("Model saved!");

// Load later
use std.export.load;
val loaded_net: MNISTNet = load("mnist_model.axon");

Key Concepts Covered

| Concept | Axon Feature |
|---|---|
| CNN architecture | Conv2d, MaxPool2d, Linear |
| Data loading | DataLoader with transforms |
| Training loop | forward → loss → backward → step |
| Metrics | argmax, accuracy calculation |
| GPU training | .to_gpu() + --gpu cuda |
| Model saving | std.export.save / load |

Next Steps

Tutorial 4: Building a Transformer

Build a transformer encoder from scratch in Axon to understand self-attention, multi-head attention, and the full transformer architecture.

Prerequisites: Tutorial 3: MNIST Classifier


Architecture Overview

A transformer encoder block consists of:

  1. Multi-Head Self-Attention
  2. Layer Normalization + residual connection
  3. Feed-Forward Network (two linear layers with activation)
  4. Layer Normalization + residual connection

Step 1: Scaled Dot-Product Attention

The fundamental building block of transformers:

axon
fn scaled_dot_product_attention(
    query: Tensor<Float32, [?, ?, ?]>,    // [batch, seq_len, d_k]
    key: Tensor<Float32, [?, ?, ?]>,      // [batch, seq_len, d_k]
    value: Tensor<Float32, [?, ?, ?]>,    // [batch, seq_len, d_v]
): Tensor<Float32, [?, ?, ?]> {
    val d_k = query.shape[2] as Float32;

    // Attention scores: Q @ K^T / sqrt(d_k)
    val scores = (query @ key.transpose()) / d_k.sqrt();

    // Softmax over the last dimension
    val weights = softmax(scores, dim: 2);

    // Weighted sum of values
    weights @ value
}

Step 2: Multi-Head Attention

Split the model dimension into multiple heads for parallel attention:

axon
use std.nn.Linear;

model MultiHeadAttention {
    num_heads: Int32,
    d_model: Int32,
    d_k: Int32,
    w_query: Linear<512, 512>,
    w_key: Linear<512, 512>,
    w_value: Linear<512, 512>,
    w_output: Linear<512, 512>,
}

extend MultiHeadAttention {
    fn new(d_model: Int32, num_heads: Int32): MultiHeadAttention {
        val d_k = d_model / num_heads;
        MultiHeadAttention {
            num_heads,
            d_model,
            d_k,
            w_query: Linear.new(),
            w_key: Linear.new(),
            w_value: Linear.new(),
            w_output: Linear.new(),
        }
    }

    fn forward(
        &self,
        query: Tensor<Float32, [?, ?, 512]>,
        key: Tensor<Float32, [?, ?, 512]>,
        value: Tensor<Float32, [?, ?, 512]>,
    ): Tensor<Float32, [?, ?, 512]> {
        val batch_size = query.shape[0];
        val seq_len = query.shape[1];

        // Project Q, K, V
        val q = self.w_query.forward(query);
        val k = self.w_key.forward(key);
        val v = self.w_value.forward(value);

        // Reshape to [batch, num_heads, seq_len, d_k]
        val q = q.reshape([batch_size, seq_len, self.num_heads, self.d_k])
                  .permute([0, 2, 1, 3]);
        val k = k.reshape([batch_size, seq_len, self.num_heads, self.d_k])
                  .permute([0, 2, 1, 3]);
        val v = v.reshape([batch_size, seq_len, self.num_heads, self.d_k])
                  .permute([0, 2, 1, 3]);

        // Attention per head
        val attn = scaled_dot_product_attention(q, k, v);

        // Concatenate heads
        val concat = attn.permute([0, 2, 1, 3])
                         .reshape([batch_size, seq_len, self.d_model]);

        // Final projection
        self.w_output.forward(concat)
    }
}

Step 3: Feed-Forward Network

Two linear layers with GELU activation:

axon
model FeedForward {
    linear1: Linear<512, 2048>,
    linear2: Linear<2048, 512>,
}

extend FeedForward {
    fn new(d_model: Int32, d_ff: Int32): FeedForward {
        FeedForward {
            linear1: Linear.new(),
            linear2: Linear.new(),
        }
    }

    fn forward(&self, x: Tensor<Float32, [?, ?, 512]>): Tensor<Float32, [?, ?, 512]> {
        val h = gelu(self.linear1.forward(x));
        self.linear2.forward(h)
    }
}

Step 4: Transformer Encoder Block

Combine attention and feed-forward with residual connections and layer norm:

axon
use std.nn.LayerNorm;

model TransformerBlock {
    attention: MultiHeadAttention,
    feed_forward: FeedForward,
    norm1: LayerNorm,
    norm2: LayerNorm,
}

extend TransformerBlock {
    fn new(d_model: Int32, num_heads: Int32, d_ff: Int32): TransformerBlock {
        TransformerBlock {
            attention: MultiHeadAttention.new(d_model, num_heads),
            feed_forward: FeedForward.new(d_model, d_ff),
            norm1: LayerNorm.new(d_model),
            norm2: LayerNorm.new(d_model),
        }
    }

    fn forward(&self, x: Tensor<Float32, [?, ?, 512]>): Tensor<Float32, [?, ?, 512]> {
        // Self-attention + residual + norm
        val attn_out = self.attention.forward(x, x, x);
        val h = self.norm1.forward(x + attn_out);

        // Feed-forward + residual + norm
        val ff_out = self.feed_forward.forward(h);
        self.norm2.forward(h + ff_out)
    }
}

Step 5: Positional Encoding

Add position information since attention is permutation-invariant:

axon
fn positional_encoding(seq_len: Int32, d_model: Int32): Tensor<Float32, [?, ?]> {
    var pe = zeros([seq_len, d_model]);

    for pos in 0..seq_len {
        for i in 0..(d_model / 2) {
            val angle = pos as Float32 / (10000.0).pow(2.0 * i as Float32 / d_model as Float32);
            pe[pos][2 * i] = angle.sin();
            pe[pos][2 * i + 1] = angle.cos();
        }
    }

    pe
}

Step 6: Full Transformer Encoder

Stack multiple transformer blocks into a complete encoder:

axon
use std.nn.{Embedding, Linear, Module};

model TransformerEncoder {
    embedding: Embedding,
    layers: Vec<TransformerBlock>,
    classifier: Linear<512, 10>,
    d_model: Int32,
}

extend TransformerEncoder {
    fn new(
        vocab_size: Int32,
        d_model: Int32,
        num_heads: Int32,
        num_layers: Int32,
        d_ff: Int32,
        num_classes: Int32,
    ): TransformerEncoder {
        var layers = Vec.new();
        for _ in 0..num_layers {
            layers.push(TransformerBlock.new(d_model, num_heads, d_ff));
        }

        TransformerEncoder {
            embedding: Embedding.new(vocab_size, d_model),
            layers,
            classifier: Linear.new(),
            d_model,
        }
    }
}

extend Module for TransformerEncoder {
    fn forward(&self, tokens: Tensor<Int64, [?, ?]>): Tensor<Float32, [?, 10]> {
        val seq_len = tokens.shape[1];

        // Token embedding + positional encoding
        val x = self.embedding.forward(tokens);
        val pe = positional_encoding(seq_len, self.d_model);
        var h = x + pe;

        // Pass through transformer blocks
        for layer in &self.layers {
            h = layer.forward(h);
        }

        // Classification: use [CLS] token (first position)
        val cls = h[.., 0, ..];   // [batch, d_model]
        self.classifier.forward(cls)
    }
}

Step 7: Training

axon
use std.optim.AdamW;
use std.loss.cross_entropy;

fn main() {
    println("=== Transformer Encoder ===\n");

    // Hyperparameters
    val vocab_size = 10000;
    val d_model = 512;
    val num_heads = 8;
    val num_layers = 6;
    val d_ff = 2048;
    val num_classes = 10;

    // Create model
    var net = TransformerEncoder.new(
        vocab_size, d_model, num_heads, num_layers, d_ff, num_classes,
    );

    var optimizer = AdamW.new(
        net.parameters(),
        lr: 0.0001,
        weight_decay: 0.01,
    );

    println("Model: {} parameters", net.param_count());
    println("Config: d_model={}, heads={}, layers={}\n", d_model, num_heads, num_layers);

    // Training loop
    val epochs = 20;
    for epoch in 0..epochs {
        var total_loss = 0.0;
        var num_batches = 0;

        for batch in &train_loader {
            val (tokens, labels) = batch;

            val logits = net.forward(tokens);
            val loss = cross_entropy(logits, labels);

            loss.backward();
            optimizer.step();
            optimizer.zero_grad();

            total_loss += loss.item();
            num_batches += 1;
        }

        val avg_loss = total_loss / num_batches as Float32;
        println("Epoch {:>2}/{}: loss = {:.4}", epoch + 1, epochs, avg_loss);
    }
}

Step 8: Using the Built-in Transformer

Axon's stdlib includes pre-built transformer components:

axon
use std.nn.{TransformerEncoder as TE, MultiHeadAttention};

fn main() {
    // One-liner transformer encoder
    val encoder = TE.new(
        d_model: 512,
        num_heads: 8,
        num_layers: 6,
        d_ff: 2048,
        dropout: 0.1,
    );

    val input: Tensor<Float32, [?, 128, 512]> = randn([32, 128, 512]);
    val output = encoder.forward(input);
    println("Output shape: {}", output.shape);   // [32, 128, 512]
}

Architecture Diagram

Input Tokens
     │
     ▼
┌──────────────┐
│  Embedding   │
│  + Pos Enc   │
└──────┬───────┘
       │
       ▼  ×N layers
┌─────────────────────┐
│  Multi-Head Attn    │
│  + Residual + Norm  │
├─────────────────────┤
│  Feed-Forward       │
│  + Residual + Norm  │
└─────────┬───────────┘
          │
          ▼
┌──────────────┐
│  Classifier  │
│  (Linear)    │
└──────┬───────┘
       │
       ▼
   Logits [?, num_classes]

Key Concepts Covered

| Concept | Implementation |
|---------|----------------|
| Self-attention | Q @ K^T / sqrt(d_k), softmax, @ V |
| Multi-head | Reshape → parallel attention → concat |
| Residual connections | x + sublayer(x) |
| Layer normalization | LayerNorm |
| Positional encoding | Sinusoidal sin/cos |
| Classification | [CLS] token → Linear |

See Also

Tutorial 05: Models and Enums

Axon supports user-defined data types through models (structs) and enums, providing type-safe data modeling for ML pipelines and systems programming.

Models

Use model to group related data together with named fields:

axon
model Point {
    x: Float64,
    y: Float64,
}

fn distance(a: Point, b: Point): Float64 {
    val dx = a.x - b.x;
    val dy = a.y - b.y;
    return (dx * dx + dy * dy).sqrt();
}

fn main() {
    val origin = Point { x: 0.0, y: 0.0 };
    val target = Point { x: 3.0, y: 4.0 };
    println(distance(origin, target));  // 5.0
}

Model Methods

Attach functions to models using extend blocks:

axon
model ModelConfig {
    learning_rate: Float64,
    batch_size: Int32,
    epochs: Int32,
}

extend ModelConfig {
    fn default(): ModelConfig {
        return ModelConfig {
            learning_rate: 0.001,
            batch_size: 32,
            epochs: 10,
        };
    }

    fn with_lr(self, lr: Float64): ModelConfig {
        return ModelConfig {
            learning_rate: lr,
            batch_size: self.batch_size,
            epochs: self.epochs,
        };
    }
}
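The `with_lr` pattern — returning a new value rather than mutating — maps onto Python's frozen dataclasses. A rough analogue for comparison (Python semantics, not Axon's):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ModelConfig:
    learning_rate: float = 0.001
    batch_size: int = 32
    epochs: int = 10

    def with_lr(self, lr: float) -> "ModelConfig":
        # Build a new config with one field changed, like the Axon version
        return replace(self, learning_rate=lr)

cfg = ModelConfig().with_lr(0.01)
print(cfg.learning_rate, cfg.batch_size)  # 0.01 32
```

In both languages the original config is untouched; callers can chain such builders without worrying about shared mutable state.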

Models with Tensors

Models are ideal for encapsulating model parameters:

axon
model LinearLayer {
    weights: Tensor<Float32, [_, _]>,
    bias: Tensor<Float32, [_]>,
}

extend LinearLayer {
    fn forward(self, input: Tensor<Float32, [_, _]>): Tensor<Float32, [_, _]> {
        return input @ self.weights + self.bias;
    }
}

Enums

Enums define types that can be one of several variants:

axon
enum Activation {
    ReLU,
    Sigmoid,
    Tanh,
    LeakyReLU(Float64),  // variant with data
}

fn apply_activation(x: Float64, act: Activation): Float64 {
    match act {
        Activation.ReLU => if x > 0.0 { x } else { 0.0 },
        Activation.Sigmoid => 1.0 / (1.0 + (-x).exp()),
        Activation.Tanh => x.tanh(),
        Activation.LeakyReLU(alpha) => if x > 0.0 { x } else { alpha * x },
    }
}
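For comparison, the same dispatch written as plain Python (a hypothetical mirror of the `match`, with string tags standing in for the enum variants):

```python
import math

def apply_activation(x, act, alpha=None):
    """Python analogue of the Axon match over Activation variants."""
    if act == "relu":
        return x if x > 0.0 else 0.0
    if act == "sigmoid":
        return 1.0 / (1.0 + math.exp(-x))
    if act == "tanh":
        return math.tanh(x)
    if act == "leaky_relu":
        return x if x > 0.0 else alpha * x
    raise ValueError(f"unknown activation: {act}")

print(apply_activation(-2.0, "relu"))  # 0.0
```

The difference is that Python discovers an unknown tag at runtime, whereas Axon's `match` is checked for exhaustiveness at compile time.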

Pattern Matching

Use match for exhaustive enum handling — the compiler verifies all variants are covered:

axon
enum Device {
    CPU,
    GPU(Int32),  // GPU with device index
}

fn device_name(d: Device): String {
    match d {
        Device.CPU => "cpu",
        Device.GPU(idx) => format("cuda:{}", idx),
    }
}

Ownership and Models

Axon's ownership rules apply to model fields. When a model goes out of scope, its owned fields are dropped automatically:

axon
model DataBatch {
    images: Tensor<Float32, [_, 3, 224, 224]>,
    labels: Tensor<Int64, [_]>,
}

fn process_batch(batch: DataBatch) {
    // `batch` is moved here — caller can no longer use it
    val predictions = model.forward(batch.images);
    val loss = cross_entropy(predictions, batch.labels);
}

Use references (&) to borrow without transferring ownership:

axon
fn inspect_batch(batch: &DataBatch) {
    println(batch.images.shape());
    println(batch.labels.shape());
    // batch is borrowed — caller retains ownership
}

Next Steps

Tutorial 06: Error Handling

Axon uses Result and Option types for explicit, type-safe error handling. No hidden exceptions — every fallible operation returns a value you must handle.

The Option Type

Option<T> represents a value that may or may not exist:

axon
enum Option<T> {
    Some(T),
    None,
}

Using Option

axon
fn find_max(data: Tensor<Float32, [_]>): Option<Float32> {
    if data.len() == 0 {
        return None;
    }
    return Some(data.max());
}

fn main() {
    val values = tensor([1.0, 5.0, 3.0, 9.0, 2.0]);

    match find_max(values) {
        Some(max) => println("Max value: {}", max),
        None => println("Empty tensor"),
    }
}
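Python models the same idea with `None`. A rough analogue of `find_max`, with a list standing in for the tensor:

```python
def find_max(data):
    """None plays the role of Axon's Option.None here."""
    if len(data) == 0:
        return None
    return max(data)

result = find_max([1.0, 5.0, 3.0, 9.0, 2.0])
if result is not None:
    print(f"Max value: {result}")  # Max value: 9.0
else:
    print("Empty tensor")
```

The key difference: in Axon the type system forces you to handle the `None` case before using the value, while in Python forgetting the `is not None` check only fails at runtime.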

Option Combinators

axon
val maybe_value: Option<Float64> = Some(42.0);

// unwrap_or: provide a default
val value = maybe_value.unwrap_or(0.0);

// map: transform the inner value
val doubled = maybe_value.map(|x| x * 2.0);

// is_some / is_none: check presence
if maybe_value.is_some() {
    println("Got a value!");
}

The Result Type

Result<T, E> represents an operation that can succeed (Ok) or fail (Err):

axon
enum Result<T, E> {
    Ok(T),
    Err(E),
}

Using Result

axon
fn load_model(path: String): Result<Model, String> {
    if !file_exists(path) {
        return Err("Model file not found: " + path);
    }
    val data = read_file(path)?;  // ? propagates errors
    return Ok(parse_model(data));
}

fn main() {
    match load_model("weights.axon") {
        Ok(model) => println("Loaded model with {} params", model.param_count()),
        Err(e) => println("Error: {}", e),
    }
}

The ? Operator

The ? operator propagates errors up the call stack automatically:

axon
fn train_pipeline(config_path: String): Result<Float64, String> {
    val config = load_config(config_path)?;     // returns Err early if this fails
    val data = load_dataset(config.data_path)?;  // same here
    val model = build_model(config)?;             // and here

    val final_loss = train(model, data, config.epochs)?;
    return Ok(final_loss);
}

This is equivalent to writing match at every step, but much more concise.
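That desugaring can be sketched in Python, using an exception for the early return. All names here are illustrative — `q`, the tuple-tagged results, and `load_config` are stand-ins, not Axon APIs:

```python
class Err(Exception):
    """Carries an error value up the stack, like Axon's `?` propagation."""
    def __init__(self, error):
        self.error = error

def q(result):
    """Stand-in for `?`: unwrap an Ok, or propagate an Err to the caller."""
    tag, value = result
    if tag == "err":
        raise Err(value)
    return value

def load_config(path):
    return ("ok", {"epochs": 3}) if path else ("err", "missing path")

def train_pipeline(config_path):
    try:
        config = q(load_config(config_path))  # early-exits on Err
        return ("ok", config["epochs"])
    except Err as e:
        return ("err", e.error)

print(train_pipeline("cfg.toml"))  # ('ok', 3)
print(train_pipeline(""))          # ('err', 'missing path')
```

Every `q(...)` call site is a potential early return, which is exactly what each `?` marks in the Axon version.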

Custom Error Types

Define your own error types for domain-specific errors:

axon
enum TrainingError {
    DataNotFound(String),
    InvalidShape { expected: Shape, actual: Shape },
    ConvergenceFailure { epoch: Int32, loss: Float64 },
    OutOfMemory,
}

fn train(model: Model, data: Dataset): Result<Model, TrainingError> {
    if data.shape() != model.expected_input_shape() {
        return Err(TrainingError.InvalidShape {
            expected: model.expected_input_shape(),
            actual: data.shape(),
        });
    }
    // ... training logic ...
    return Ok(model);
}

Combining Option and Result

Convert between Option and Result:

axon
// Option → Result: provide an error message for the None case
val value: Result<Float64, String> = maybe_value.ok_or("value was missing");

// Result → Option: discard the error info
val maybe: Option<Float64> = result.ok();

Panics

For truly unrecoverable errors, use panic:

axon
fn assert_valid_shape(t: Tensor<Float32, [_, _]>) {
    if t.shape()[0] == 0 {
        panic("tensor must have at least one row");
    }
}

Panics terminate the program immediately with a source location and message. Use them for programming errors (invariant violations), not expected failure modes.

Best Practices

  1. Use Result for recoverable errors — file I/O, network, parsing
  2. Use Option for missing values — lookups, optional config fields
  3. Use panic for bugs — invariant violations, unreachable code
  4. Use ? operator — keeps error-handling code concise
  5. Define domain error types — makes errors self-documenting

Next Steps

Migrating from PyTorch to Axon

A side-by-side guide for PyTorch developers. Axon's ML framework is heavily inspired by PyTorch's API, with the added benefits of compile-time shape checking, ownership-based memory safety, and native GPU compilation.


Tensor Creation

python
# PyTorch
import torch

x = torch.zeros(3, 4)
y = torch.ones(5)
z = torch.randn(128, 256)
a = torch.tensor([1.0, 2.0, 3.0])
e = torch.eye(4)
r = torch.arange(0, 10)
axon
// Axon
val x = zeros([3, 4]);
val y = ones([5]);
val z = randn([128, 256]);
val a = Tensor.from_vec([1.0, 2.0, 3.0], [3]);
val e = Tensor.eye(4);
val r = arange(0, 10);

Key difference: Axon tensors carry their shape in the type system: Tensor<Float32, [3, 4]> vs PyTorch's dynamic torch.Tensor.


Tensor Operations

python
# PyTorch
c = a + b
c = a * b
c = a @ b          # matmul
c = torch.matmul(a, b)
m = x.mean(dim=0)
s = x.sum(dim=1)
r = x.reshape(3, 4)
t = x.T             # transpose
axon
// Axon
val c = a + b;
val c = a * b;
val c = a @ b;          // matmul (same!)
// no functional form needed
val m = x.mean(dim: 0);
val s = x.sum(dim: 1);
val r = x.reshape([3, 4]);
val t = x.transpose();

Model Definition

python
# PyTorch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

model = MyModel()
axon
// Axon
use std.nn.{Linear, Module};

model MyModel {
    fc1: Linear<784, 256>,
    fc2: Linear<256, 128>,
    fc3: Linear<128, 10>,
}

extend MyModel {
    fn new(): MyModel {
        MyModel {
            fc1: Linear.new(),
            fc2: Linear.new(),
            fc3: Linear.new(),
        }
    }
}

extend Module for MyModel {
    fn forward(&self, x: Tensor<Float32, [?, 784]>): Tensor<Float32, [?, 10]> {
        val h = relu(self.fc1.forward(x));
        val h = relu(self.fc2.forward(h));
        self.fc3.forward(h)
    }
}

val model = MyModel.new();

Key differences:

  • model + extend Module instead of class(nn.Module)
  • Linear layer sizes are part of the type: Linear<784, 256>
  • Input/output shapes are checked at compile time
  • No super().init() boilerplate

Training Loop

python
# PyTorch
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

for epoch in range(10):
    for inputs, targets in dataloader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch}: loss = {loss.item():.4f}")
axon
// Axon
use std.optim.Adam;
use std.loss.cross_entropy;

var optimizer = Adam.new(model.parameters(), lr: 0.001);

for epoch in 0..10 {
    for batch in &dataloader {
        val (inputs, targets) = batch;
        val outputs = model.forward(inputs);
        val loss = cross_entropy(outputs, targets);
        loss.backward();
        optimizer.step();
        optimizer.zero_grad();
    }
    println("Epoch {}: loss = {:.4}", epoch, loss.item());
}

Almost identical! The main differences:

  • model.forward(x) instead of model(x)
  • cross_entropy(outputs, targets) is a function, not a class
  • optimizer.zero_grad() typically called after step() (same effect)
  • Borrow semantics: &dataloader to iterate without consuming

CNN Layers

python
# PyTorch
self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)
self.pool = nn.MaxPool2d(2)
self.bn = nn.BatchNorm2d(32)
self.dropout = nn.Dropout(0.5)
axon
// Axon
self.conv1 = Conv2d.new(in_channels: 1, out_channels: 32, kernel_size: 3, padding: 1);
self.pool = MaxPool2d.new(kernel_size: 2, stride: 2);
self.bn = BatchNorm.new(32);
self.dropout = Dropout.new(rate: 0.5);

RNN / Transformer Layers

python
# PyTorch
lstm = nn.LSTM(input_size=128, hidden_size=256, num_layers=2, batch_first=True)
attention = nn.MultiheadAttention(embed_dim=512, num_heads=8)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=8),
    num_layers=6
)
axon
// Axon
val lstm = LSTM.new(input_size: 128, hidden_size: 256, num_layers: 2);
val attention = MultiHeadAttention.new(d_model: 512, num_heads: 8);
val encoder = TransformerEncoder.new(
    d_model: 512,
    num_heads: 8,
    num_layers: 6,
    d_ff: 2048,
    dropout: 0.1,
);

GPU / Device Management

python
# PyTorch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
x = x.to(device)

# Multi-GPU
model = nn.DataParallel(model)
axon
// Axon
use std.device.{cuda, cpu};

val model = MyModel.new().to_gpu();
val x = x.to_gpu();

// Or with explicit device
val dev = cuda(0);
val x = x.to_device(dev);

Axon difference: Device transfer is a move (ownership transfer), not a copy.


Optimizers

| PyTorch | Axon |
|---------|------|
| torch.optim.SGD(params, lr=0.01) | SGD.new(params, lr: 0.01) |
| torch.optim.Adam(params, lr=0.001) | Adam.new(params, lr: 0.001) |
| torch.optim.AdamW(params, lr=1e-4, weight_decay=0.01) | AdamW.new(params, lr: 0.0001, weight_decay: 0.01) |

Loss Functions

| PyTorch | Axon |
|---------|------|
| nn.CrossEntropyLoss()(output, target) | cross_entropy(output, target) |
| nn.MSELoss()(output, target) | mse_loss(output, target) |
| nn.BCELoss()(output, target) | bce_loss(output, target) |
| nn.L1Loss()(output, target) | l1_loss(output, target) |

Axon uses functions instead of loss classes — simpler and more direct.
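As a sketch of what a functional loss computes — here plain Python for a single example, not the stdlib implementation:

```python
import math

def cross_entropy(logits, label):
    """-log(softmax(logits)[label]) for one example."""
    m = max(logits)                          # subtract max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return -math.log(exps[label] / total)

# Uniform logits over two classes: the loss is ln(2) ~= 0.693
print(round(cross_entropy([0.0, 0.0], 0), 3))  # 0.693
```

There is no object to construct or state to carry; the loss is just a function of its inputs, which is the design point the table above is making.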


Autograd / Gradients

python
# PyTorch
x = torch.randn(3, requires_grad=True)
y = x * 2 + 1
y.sum().backward()
print(x.grad)

with torch.no_grad():
    prediction = model(x)
axon
// Axon
use std.autograd.{GradTensor, no_grad};

val x = GradTensor.new(randn([3]));
val y = x * 2.0 + 1.0;
y.sum().backward();
println("{}", x.grad());

no_grad(|| {
    val prediction = model.forward(x);
});

Data Loading

python
# PyTorch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(x_train, y_train)
loader = DataLoader(dataset, batch_size=64, shuffle=True)
axon
// Axon
use std.data.DataLoader;

val loader = DataLoader.new(x_train, y_train)
    .batch_size(64)
    .shuffle(true);
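A minimal Python sketch of what such a builder-style loader does under the hood (assumed behaviour for illustration, not the std.data source):

```python
import random

class DataLoader:
    """Toy batching iterator mirroring the builder-style API above."""
    def __init__(self, xs, ys):
        self.pairs = list(zip(xs, ys))
        self.bs = 1
        self.do_shuffle = False

    def batch_size(self, n):
        self.bs = n
        return self           # return self so calls chain

    def shuffle(self, flag):
        self.do_shuffle = flag
        return self

    def __iter__(self):
        pairs = self.pairs[:]
        if self.do_shuffle:
            random.shuffle(pairs)
        for i in range(0, len(pairs), self.bs):
            yield pairs[i:i + self.bs]

loader = DataLoader(range(5), "abcde").batch_size(2).shuffle(False)
print([len(b) for b in loader])  # [2, 2, 1]
```

Each builder method returns the loader itself, which is what makes the chained `.batch_size(64).shuffle(true)` style work.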

Model Saving / Loading

python
# PyTorch
torch.save(model.state_dict(), "model.pt")
model.load_state_dict(torch.load("model.pt"))

# ONNX export
torch.onnx.export(model, dummy_input, "model.onnx")
axon
// Axon
use std.export.{save, load, export_onnx};

save(&model, "model.axon");
val model: MyModel = load("model.axon");

// ONNX export
val dummy_input = randn([1, 784]);
export_onnx(&model, dummy_input, "model.onnx");

What Axon Adds Over PyTorch

| Feature | PyTorch | Axon |
|---------|---------|------|
| Shape checking | Runtime errors | Compile-time errors |
| Memory safety | Manual management | Ownership system |
| Type safety | Dynamic typing | Static types with inference |
| GPU compilation | Python + CUDA C | Native GPU via MLIR |
| Performance | Python overhead | Native binary, no GIL |
| Package manager | pip + setup.py | Built-in axonc pkg |
| Formatting | black (separate) | Built-in axonc fmt |

Quick Translation Table

| PyTorch | Axon |
|---------|------|
| import torch | (built-in, no import needed) |
| import torch.nn as nn | use std.nn.* |
| model(x) | model.forward(x) |
| loss.item() | loss.item() (same!) |
| .backward() | .backward() (same!) |
| .to("cuda") | .to_gpu() |
| torch.no_grad() | no_grad(\|\| { ... }) |
| model.eval() | model.eval() (same!) |
| model.train() | model.train() (same!) |

See Also

Migrating from Python to Axon

A side-by-side guide for Python developers moving to Axon. Axon will feel familiar in many ways, but adds static types, ownership, and compile-time shape checking.


Variables

| Python | Axon |
|--------|------|
| x = 42 | val x = 42; |
| x = 42 (reassign later) | var x = 42; |
| x: int = 42 | val x: Int32 = 42; |
python
# Python
name = "Alice"
age = 30
scores = [95, 87, 92]
axon
// Axon
val name = "Alice";
val age = 30;
val scores = vec![95, 87, 92];

Key difference: Axon variables are immutable by default. Use var for mutable variables.


Functions

python
# Python
def add(a: int, b: int) -> int:
    return a + b

def greet(name: str):
    print(f"Hello, {name}!")
axon
// Axon
fn add(a: Int32, b: Int32): Int32 {
    a + b    // implicit return (last expression)
}

fn greet(name: String) {
    println("Hello, {}!", name);
}

Differences:

  • fn instead of def
  • Curly braces instead of indentation
  • Type annotations are required on parameters
  • No return needed for the last expression

Types

| Python | Axon | Notes |
|--------|------|-------|
| int | Int32 / Int64 | Explicit sizes |
| float | Float32 / Float64 | Explicit sizes |
| bool | Bool | |
| str | String | |
| list[int] | Vec<Int32> | |
| dict[str, int] | HashMap<String, Int32> | |
| Optional[int] | Option<Int32> | |
| None | None | |

Control Flow

If/Else

python
# Python
if score >= 90:
    grade = "A"
elif score >= 70:
    grade = "B"
else:
    grade = "C"
axon
// Axon — if is an expression!
val grade = if score >= 90 {
    "A"
} else if score >= 70 {
    "B"
} else {
    "C"
};

Loops

python
# Python
for i in range(10):
    print(i)

for item in items:
    process(item)

while condition:
    do_work()
axon
// Axon
for i in 0..10 {
    println("{}", i);
}

for item in items {
    process(item);
}

while condition {
    do_work();
}

Pattern Matching

python
# Python 3.10+
match command:
    case "quit":
        exit()
    case "hello":
        print("Hi!")
    case _:
        print("Unknown")
axon
// Axon
match command {
    "quit" => exit(),
    "hello" => println("Hi!"),
    _ => println("Unknown"),
}

Classes → Models + Extend

Python classes map to Axon models with extend blocks:

python
# Python
class Point:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

    def distance(self, other: 'Point') -> float:
        return ((self.x - other.x)**2 + (self.y - other.y)**2)**0.5

    def __str__(self):
        return f"({self.x}, {self.y})"
axon
// Axon
model Point {
    x: Float64,
    y: Float64,
}

extend Point {
    fn new(x: Float64, y: Float64): Point {
        Point { x, y }
    }

    fn distance(&self, other: &Point): Float64 {
        val dx = self.x - other.x;
        val dy = self.y - other.y;
        (dx * dx + dy * dy).sqrt()
    }
}

extend Display for Point {
    fn to_string(&self): String {
        format("({}, {})", self.x, self.y)
    }
}

Key differences:

  • No inheritance — use traits for polymorphism
  • &self is explicit (immutable borrow)
  • Constructors are regular functions (by convention called new)

Error Handling

python
# Python
try:
    f = open("config.toml")
    data = f.read()
    config = parse(data)
except FileNotFoundError:
    print("Config not found")
except ParseError as e:
    print(f"Parse error: {e}")
axon
// Axon
match File.open("config.toml") {
    Ok(file) => {
        match file.read_all() {
            Ok(data) => {
                val config = parse(data);
                println("Loaded config");
            }
            Err(e) => eprintln("Read error: {}", e),
        }
    }
    Err(e) => eprintln("Config not found: {}", e),
}

// Or more concisely with ?
fn load_config(): Result<Config, IOError> {
    val file = File.open("config.toml")?;
    val data = file.read_all()?;
    parse(data)
}

NumPy / Tensors

python
# Python (NumPy)
import numpy as np

a = np.zeros((3, 4))
b = np.random.randn(3, 4)
c = a + b
d = np.dot(a, b.T)
mean = np.mean(c, axis=0)
axon
// Axon
val a = zeros([3, 4]);
val b = randn([3, 4]);
val c = a + b;
val d = a @ b.transpose();
val mean = c.mean(dim: 0);

Key differences from NumPy:

  • Compile-time shape checking — shape errors caught before runtime
  • @ operator for matrix multiplication (like Python 3.5+, but type-checked)
  • No import needed — tensors are built-in types
  • Shapes are part of the type: Tensor<Float32, [3, 4]>

List Comprehensions → Iterators

python
# Python
squares = [x**2 for x in range(10)]
evens = [x for x in numbers if x % 2 == 0]
axon
// Axon (iterator methods)
val squares: Vec<Int32> = (0..10).map(|x| x * x).collect();
val evens: Vec<Int32> = numbers.iter().filter(|x| x % 2 == 0).collect();
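Both spellings compute the same values; the equivalence is easy to confirm in Python itself, where the iterator-chain style also exists:

```python
# Comprehension style (Python-native)
squares_comp = [x ** 2 for x in range(10)]

# Iterator-chain style (the shape Axon's map/filter/collect follows)
squares_chain = list(map(lambda x: x * x, range(10)))

numbers = [1, 2, 3, 4, 5, 6]
evens_comp = [x for x in numbers if x % 2 == 0]
evens_chain = list(filter(lambda x: x % 2 == 0, numbers))

print(squares_comp == squares_chain)  # True
print(evens_comp == evens_chain)      # True
```

Axon standardizes on the chain style because each step is an ordinary method call that the type checker can see through.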

Modules

python
# Python — file: math_utils.py
def square(x):
    return x * x

# main.py
from math_utils import square
axon
// Axon — file: math_utils.axon
pub fn square(x: Float64): Float64 {
    x * x
}

// main.axon
mod math_utils;
use math_utils.square;

Quick Reference

| Python | Axon |
|--------|------|
| print(x) | println("{}", x) |
| len(x) | x.len() |
| type(x) | Compile-time types (use :type in REPL) |
| isinstance(x, T) | Pattern matching |
| None | None (Option type) |
| raise ValueError() | return Err(...) or panic(...) |
| assert x > 0 | assert(x > 0) |
| # comment | // comment |
| """docstring""" | /// doc comment |
| pip install | axonc pkg add |
| python script.py | axonc build script.axon && ./script |

See Also

Migrating from Rust to Axon

Axon draws significant inspiration from Rust's syntax and safety model, but is purpose-built for ML/AI workloads. If you know Rust, you'll feel at home quickly. This guide covers the key differences.

At a Glance

| Feature | Rust | Axon |
|---------|------|------|
| Ownership | Borrow checker | Simplified borrow checker |
| Generics | <T: Trait> | <T: Trait> (same syntax) |
| Tensors | External crate (ndarray) | First-class Tensor<D, Shape> |
| GPU | External crate (cuda-rs) | Built-in @device annotation |
| Macros | macro_rules! / proc macros | Not yet supported |
| Async | async/await | Not yet supported |
| Package manager | Cargo | axonc pkg (compatible workflow) |
| Strings | String / &str | String / &str (same model) |
| Error handling | Result<T, E> | Result<T, E> (same model) |

Syntax Differences

Function Declarations

rust
// Rust
fn add(x: i32, y: i32) -> i32 {
    x + y
}
axon
// Axon — return type after a colon, types use PascalCase
fn add(x: Int32, y: Int32): Int32 {
    x + y
}

Key differences:

  • Return type is written after a colon (: Int32) instead of ->
  • Primitive types use PascalCase: Int32, Int64, Float32, Float64, Bool
  • Like Rust, the last expression of a block is its return value

Type Names

| Rust | Axon |
|------|------|
| i32 | Int32 |
| i64 | Int64 |
| f32 | Float32 |
| f64 | Float64 |
| bool | Bool |
| String | String |
| Vec<T> | Vec<T> |

Variable Bindings

rust
// Rust
let x = 5;          // immutable
let mut y = 10;     // mutable
axon
// Axon — val for immutable, var for mutable
val x = 5;
var y = 10;

Ownership Model

Axon uses a simplified version of Rust's ownership model:

axon
fn take_ownership(data: Tensor<Float32, [_]>) {
    // `data` is owned here — moved from caller
    println(data.sum());
}
// `data` is dropped here

fn borrow_data(data: &Tensor<Float32, [_]>) {
    // `data` is borrowed — caller keeps ownership
    println(data.sum());
}

Simplifications vs Rust:

  • No lifetime annotations ('a) — lifetimes are inferred or scoped
  • No Rc<T> / Arc<T> in user code — runtime reference counting where needed
  • Borrow checker is less strict — focuses on preventing use-after-free and double-free

First-Class Tensors

The biggest difference from Rust: tensors are a built-in type with shape tracking:

axon
// Axon — tensors are first-class citizens
val x: Tensor<Float32, [3, 3]> = tensor([[1.0, 2.0, 3.0],
                                          [4.0, 5.0, 6.0],
                                          [7.0, 8.0, 9.0]]);

// Shape-checked matrix multiply (compiler verifies dimensions)
val y: Tensor<Float32, [3, 1]> = tensor([[1.0], [0.0], [1.0]]);
val result = x @ y;  // result: Tensor<Float32, [3, 1]>

In Rust, you'd need an external crate:

rust
// Rust — requires ndarray or nalgebra
use ndarray::Array2;
let x = Array2::<f32>::from_shape_vec((3, 3), vec![...]).unwrap();
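The check the Axon compiler performs at compile time can be sketched as a runtime shape rule in Python (illustrative only — Axon rejects the mismatch before the program ever runs):

```python
def matmul_shape(a, b):
    """Infer the result shape of a @ b, rejecting incompatible inner dims."""
    (a_rows, a_cols), (b_rows, b_cols) = a, b
    if a_cols != b_rows:
        raise TypeError(f"shape mismatch: [{a_rows}, {a_cols}] @ [{b_rows}, {b_cols}]")
    return (a_rows, b_cols)

print(matmul_shape((3, 3), (3, 1)))  # (3, 1)
try:
    matmul_shape((3, 3), (4, 1))
except TypeError as e:
    print(e)  # shape mismatch: [3, 3] @ [4, 1]
```

This is the rule `[m, k] @ [k, n] → [m, n]`; encoding shapes in the type lets Axon apply it during type checking instead of at runtime.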

Dynamic Shapes

Use _ for dimensions known only at runtime:

axon
fn process_batch(input: Tensor<Float32, [_, 784]>): Tensor<Float32, [_, 10]> {
    // batch size (_) is dynamic, feature dimensions are static
    return model.forward(input);
}

GPU Support

Axon has built-in device management — no external CUDA bindings needed:

axon
// Move tensor to GPU
val gpu_data = data.to(Device.GPU(0));

// GPU-annotated functions
@device(GPU)
fn matmul_kernel(a: Tensor<Float32, [_, _]>, b: Tensor<Float32, [_, _]>): Tensor<Float32, [_, _]> {
    return a @ b;
}

In Rust, GPU support requires unsafe FFI to CUDA/OpenCL libraries.

What's Not (Yet) in Axon

Coming from Rust, you'll miss these features (planned for future releases):

  • Macros — No compile-time metaprogramming yet
  • Async/Await — No async runtime yet
  • Crate ecosystem — Axon's package registry is young

Build & Run Comparison

bash
# Rust
cargo build
cargo run
cargo test

# Axon
axonc build main.axon
./main
axonc pkg test

Migration Strategy

  1. Start with compute kernels — Port tensor-heavy code first (biggest Axon advantage)
  2. Keep Rust for infrastructure — Build systems, CLI tools, networking stay in Rust
  3. Use Axon for models — Training loops, inference pipelines, data processing
  4. Interop via C ABI — Axon and Rust can call each other through C FFI

Further Reading

CLI Reference

Complete reference for the axonc command-line compiler.


Synopsis

axonc <COMMAND> [OPTIONS]
axonc --help
axonc --version

Commands

axonc lex

Tokenize an Axon source file and print the token stream.

bash
axonc lex <FILE>

Arguments:

  • <FILE> — Path to the .axon source file

Example:

bash
axonc lex hello.axon
# Token(Fn, 1:1)
# Token(Identifier("main"), 1:4)
# Token(LeftParen, 1:8)
# Token(RightParen, 1:9)
# Token(LeftBrace, 1:11)
# ...

axonc parse

Parse an Axon source file and print the AST.

bash
axonc parse <FILE> [OPTIONS]

Arguments:

  • <FILE> — Path to the .axon source file

Options:

| Flag | Description |
|------|-------------|
| --error-format <FORMAT> | Output format: human (default) or json |
| --errors-only | Print only errors, suppress AST output |

Example:

bash
axonc parse hello.axon
axonc parse hello.axon --error-format=json
axonc parse hello.axon --errors-only

axonc check

Type-check an Axon source file. Runs the full frontend pipeline: lex → parse → name resolution → type inference → shape checking → borrow checking.

bash
axonc check <FILE> [OPTIONS]

Arguments:

  • <FILE> — Path to the .axon source file

Options:

| Flag | Description |
|------|-------------|
| --error-format <FORMAT> | Output format: human (default) or json |
| --emit-tast | Emit the typed AST as JSON |
| --deny <CATEGORIES> | Promote diagnostic categories to errors (comma-separated) |
| --allow <CATEGORIES> | Suppress diagnostic categories (comma-separated) |
| --warn <CATEGORIES> | Override diagnostic categories to warnings (comma-separated) |
| --error-limit <N> | Stop compilation after N errors |

Diagnostic categories: parse-error, type-error, borrow-error, shape-error, lint-warning, unused-variable, unused-import, unreachable-code, deprecated-syntax

Example:

bash
axonc check model.axon
axonc check model.axon --emit-tast
axonc check model.axon --error-format=json
axonc check model.axon --deny unused-variable
axonc check model.axon --error-limit 5

Exit codes:

  • 0 — No errors
  • 1 — One or more errors found

axonc build

Compile an Axon source file to a native binary.

bash
axonc build <FILE> [OPTIONS]

Arguments:

  • <FILE> — Path to the .axon source file

Options:

| Flag | Description | Default |
|------|-------------|---------|
| -o, --output <PATH> | Output file path | Input filename without extension |
| -O, --opt-level <LEVEL> | Optimization level: 0, 1, 2, 3 | 0 |
| --emit-llvm | Emit LLVM IR text instead of compiling | — |
| --emit-mir | Emit Axon MIR (debug intermediate representation) | — |
| --emit-obj | Emit object file (.o) instead of binary | — |
| --gpu <TARGET> | GPU target: none, cuda, rocm, vulkan | none |
| --error-format <FORMAT> | Output format: human or json | human |
| --deny <CATEGORIES> | Promote diagnostic categories to errors (comma-separated) | — |
| --allow <CATEGORIES> | Suppress diagnostic categories (comma-separated) | — |
| --warn <CATEGORIES> | Override diagnostic categories to warnings (comma-separated) | — |
| --error-limit <N> | Stop compilation after N errors | — |

Examples:

bash
# Basic compilation
axonc build hello.axon

# Optimized build with custom output
axonc build model.axon -O 3 -o model

# Emit LLVM IR for inspection
axonc build model.axon --emit-llvm -o model.ll

# GPU compilation for NVIDIA
axonc build model.axon --gpu cuda -O 3

# AMD GPU target
axonc build model.axon --gpu rocm -O 2

# Emit object file for linking
axonc build model.axon --emit-obj -o model.o

Optimization levels:

| Level | Description |
|-------|-------------|
| -O 0 | No optimization — fastest compile, easiest to debug |
| -O 1 | Basic MIR optimizations: dead code elimination, constant folding |
| -O 2 | Standard optimizations (includes O1 + inlining, loop unrolling, vectorization) |
| -O 3 | Aggressive optimizations (includes O2 + LTO, auto-vectorization, FMA) |


axonc fmt

Format an Axon source file according to the standard style.

bash
axonc fmt <FILE>

Arguments:

  • <FILE> — Path to the .axon source file

The formatter modifies the file in place. Formatting is idempotent — running it twice produces the same output.

Example:

bash
axonc fmt src/main.axon

axonc lint

Run the linter on an Axon source file. Reports style and best-practice warnings.

bash
axonc lint <FILE>

Arguments:

  • <FILE> — Path to the .axon source file

Lint rules:

| Code | Rule |
|------|------|
| W5001 | Unused variable |
| W5002 | Unused import |
| W5003 | Dead code |
| W5004 | Unnecessary mutability |
| W5005 | Shadowed variable |
| W5006 | Naming convention violation |
| W5007 | Redundant type annotation |
| W5008 | Missing documentation on public items |

See Compiler Errors for details on each warning.

Example:

bash
axonc lint src/main.axon
# warning[W5001]: unused variable `temp`
#   --> src/main.axon:12:9

axonc repl

Start the interactive Read-Eval-Print Loop.

bash
axonc repl

REPL Commands:

| Command | Description |
|---------|-------------|
| :type <expr> | Show the type of an expression |
| :ast <expr> | Show the AST for an expression |
| :load <file> | Load and evaluate an Axon source file |
| :save <file> | Save REPL history to a file |
| :clear | Clear the REPL state |
| :help | Show help |
| :quit | Exit the REPL |

Example:

$ axonc repl
Axon REPL v0.1.0 (type :help for commands)

>>> val x = 42
>>> x * 2
84
>>> :type x
Int32
>>> val t = randn([3, 3])
>>> t.shape
[3, 3]
>>> :quit

axonc doc

Generate HTML documentation from doc comments in Axon source files.

bash
axonc doc <FILE> [OPTIONS]

Arguments:

  • <FILE> — Path to the .axon source file

Options:

| Flag | Description |
|------|-------------|
| -o, --output <PATH> | Output file path (default: stdout) |

Example:

bash
axonc doc src/lib.axon -o docs/api.html

axonc lsp

Start the Axon Language Server Protocol server over stdio.

bash
axonc lsp

The LSP server provides:

  • Real-time diagnostics
  • Go-to-definition
  • Hover (type information)
  • Code completion
  • Find references
  • Rename symbol
  • Signature help
  • Inlay hints
  • Semantic tokens

Configure your editor to use axonc lsp as the language server for .axon files.


axonc pkg

Package manager commands for Axon projects.

axonc pkg new <NAME>

Create a new Axon project with standard directory structure.

bash
axonc pkg new my_project
# Created project `my_project`

Generated structure:

my_project/
├── Axon.toml
├── src/
│   └── main.axon
└── tests/
    └── test_main.axon

axonc pkg init

Initialize an Axon project in the current directory.

bash
mkdir my_project && cd my_project
axonc pkg init

axonc pkg build

Build the current project (reads Axon.toml).

bash
axonc pkg build

axonc pkg run

Build and run the project.

bash
axonc pkg run

axonc pkg test

Run all tests in the tests/ directory.

bash
axonc pkg test

axonc pkg add <PACKAGE>

Add a dependency to Axon.toml.

bash
axonc pkg add axon-vision
axonc pkg add axon-nlp --version 0.2.0

axonc pkg remove <PACKAGE>

Remove a dependency from Axon.toml.

bash
axonc pkg remove axon-vision

axonc pkg clean

Remove build artifacts.

bash
axonc pkg clean

axonc pkg fmt

Format all .axon source files in the project.

bash
axonc pkg fmt

axonc pkg lint

Lint all .axon source files in the project.

bash
axonc pkg lint

Global Options

| Flag | Description |
|------|-------------|
| --help | Print help information |
| --version | Print version (axonc 0.1.0) |

Environment Variables

| Variable | Description |
|----------|-------------|
| AXON_HOME | Axon installation directory |
| AXON_PATH | Additional module search paths (colon-separated) |

See Also

Compiler Error Reference

Complete reference for all Axon compiler error codes. Each error includes its code, description, example code that triggers it, and how to fix it.


Error Code Ranges

| Range | Category | Description |
|-------|----------|-------------|
| E0001–E0099 | Lexer / Parser | Syntax errors |
| E1001–E1099 | Name Resolution | Undefined or duplicate names |
| E2001–E2099 | Type Errors | Type mismatches and inference failures |
| E3001–E3099 | Shape Errors | Tensor shape mismatches |
| E4001–E4099 | Borrow Errors | Ownership and lifetime violations |
| W5001–W5010 | Lint Warnings | Style and best-practice warnings |

E0001–E0099: Lexer / Parser Errors

E0001: Unexpected Character

axon
val x = 42$;
//         ^ ERROR[E0001]: unexpected character `$`

Fix: Remove or replace the invalid character.

E0002: Unterminated String Literal

axon
val s = "hello;
//      ^ ERROR[E0002]: unterminated string literal

Fix: Close the string with a matching ".

E0003: Unterminated Block Comment

axon
/* this comment never ends
//^ ERROR[E0003]: unterminated block comment

Fix: Close with */. Nested comments require matching pairs.

E0010: Expected Token

axon
fn foo( {
//      ^ ERROR[E0010]: expected `)`, found `{`

Fix: Add the missing token.

E0011: Expected Expression

axon
val x = ;
//      ^ ERROR[E0011]: expected expression, found `;`

Fix: Provide a value or expression.

E0012: Expected Type

axon
val x: = 42;
//     ^ ERROR[E0012]: expected type, found `=`

Fix: Provide a type annotation after :.

E0020: Invalid Integer Literal

axon
val x = 0xGG;
//      ^ ERROR[E0020]: invalid hexadecimal literal

Fix: Use valid digits for the number base (0-9, a-f for hex).

E0021: Invalid Float Literal

axon
val x = 1.2.3;
//      ^ ERROR[E0021]: invalid float literal

E0030: Invalid Escape Sequence

axon
val s = "\q";
//       ^ ERROR[E0030]: unknown escape sequence `\q`

Fix: Use valid escapes: \\, \n, \t, \r, \", \0.

E0040: Duplicate Match Arm

axon
match x {
    1 => println("one"),
    1 => println("one again"),  // ERROR[E0040]: duplicate match arm
}

E0050: Invalid Pattern

axon
match value {
    1 + 2 => println("?"),  // ERROR[E0050]: expected pattern, found expression
}

E1001–E1099: Name Resolution Errors

E1001: Undefined Variable

axon
fn main() {
    println("{}", unknown_var);
//                ^ ERROR[E1001]: undefined variable `unknown_var`
}

Fix: Declare the variable before use or check for typos.

E1002: Undefined Function

axon
fn main() {
    foo();
//  ^ ERROR[E1002]: undefined function `foo`
}

E1003: Undefined Type

axon
val x: NonExistent = 42;
//     ^ ERROR[E1003]: undefined type `NonExistent`

E1010: Duplicate Definition

axon
fn foo() {}
fn foo() {}
// ^ ERROR[E1010]: duplicate definition of `foo`

E1011: Duplicate Field

axon
model Point { x: Int32, x: Int32 }
//                       ^ ERROR[E1011]: duplicate field `x`

E1020: Unresolved Import

axon
use std.nonexistent.Module;
//  ^ ERROR[E1020]: unresolved import `std.nonexistent`

E1030: Private Item

axon
mod inner {
    fn secret() {}
}
inner.secret();
// ^ ERROR[E1030]: function `secret` is private

Fix: Add pub to the item or access it from within its module.


E2001–E2099: Type Errors

E2001: Type Mismatch

axon
val x: Int32 = "hello";
// ERROR[E2001]: type mismatch — expected `Int32`, found `String`

E2002: Binary Operator Type Error

axon
val x = "hello" + 42;
// ERROR[E2002]: cannot apply `+` to `String` and `Int32`

E2003: Return Type Mismatch

axon
fn foo(): Int32 {
    "not an integer"
// ERROR[E2003]: return type mismatch — expected `Int32`, found `String`
}

E2010: Missing Field

axon
model Point { x: Int32, y: Int32 }
val p = Point { x: 1 };
// ERROR[E2010]: missing field `y` in struct `Point`

E2011: Unknown Field

axon
model Point { x: Int32, y: Int32 }
val p = Point { x: 1, y: 2, z: 3 };
//                           ^ ERROR[E2011]: unknown field `z` on `Point`

E2020: Trait Not Implemented

axon
fn print_it<T: Display>(x: T) {}
print_it(SomeStruct {});
// ERROR[E2020]: trait `Display` not implemented for `SomeStruct`

Fix: Implement the required trait for the type.

E2021: Ambiguous Method

axon
// When multiple trait impls provide the same method
value.shared_method();
// ERROR[E2021]: ambiguous method call — candidates from `TraitA` and `TraitB`

Fix: Use fully qualified syntax: TraitA.shared_method(&value).

E2030: Cannot Infer Type

axon
val x = Vec.new();
// ERROR[E2030]: cannot infer type — add a type annotation

Fix: val x: Vec<Int32> = Vec.new();

E2040: Invalid Cast

axon
val x = "hello" as Int32;
// ERROR[E2040]: cannot cast `String` to `Int32`

E3001–E3099: Shape Errors

E3001: Matmul Shape Mismatch

axon
val a: Tensor<Float32, [3, 4]> = randn([3, 4]);
val b: Tensor<Float32, [5, 6]> = randn([5, 6]);
val c = a @ b;
// ERROR[E3001]: matmul shape mismatch — inner dimensions 4 ≠ 5
//   note: left shape [3, 4], right shape [5, 6]

Fix: Ensure the inner dimensions match: [M, K] @ [K, N].

E3002: Invalid Reshape

axon
val t: Tensor<Float32, [2, 3]> = randn([2, 3]);
val r = t.reshape([2, 2]);
// ERROR[E3002]: cannot reshape [2, 3] (6 elements) to [2, 2] (4 elements)

Fix: Ensure the total number of elements is preserved.

E3003: Broadcast Incompatible

axon
val a: Tensor<Float32, [3, 4]> = randn([3, 4]);
val b: Tensor<Float32, [3, 5]> = randn([3, 5]);
val c = a + b;
// ERROR[E3003]: shapes [3, 4] and [3, 5] are not broadcast-compatible
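The broadcast rule can be sketched compactly. Assuming NumPy-style trailing-dimension broadcasting (which the examples in this section imply), two shapes are compatible when every trailing dimension pair is equal or contains a 1. A minimal checker in Rust, the compiler's implementation language (`broadcast_compatible` is an illustrative name, not an Axon API):

```rust
/// Check whether two tensor shapes are broadcast-compatible under
/// NumPy-style rules: align shapes at the trailing dimension; each
/// aligned pair must be equal, or one side must be 1. A shorter
/// shape is implicitly left-padded with 1s (the zip simply stops).
fn broadcast_compatible(a: &[usize], b: &[usize]) -> bool {
    a.iter()
        .rev()
        .zip(b.iter().rev())
        .all(|(&x, &y)| x == y || x == 1 || y == 1)
}

fn main() {
    assert!(broadcast_compatible(&[3, 4], &[3, 4]));
    assert!(broadcast_compatible(&[3, 1], &[3, 5])); // dim of 1 broadcasts
    assert!(broadcast_compatible(&[4], &[3, 4]));    // shorter shape left-pads
    assert!(!broadcast_compatible(&[3, 4], &[3, 5])); // the E3003 case above
    println!("shape checks passed");
}
```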

E3010: Invalid Transpose Axes

axon
val t: Tensor<Float32, [2, 3, 4]> = randn([2, 3, 4]);
val p = t.permute([0, 1, 5]);
// ERROR[E3010]: axis 5 out of range for tensor with 3 dimensions

E3020: Dynamic Shape Required

axon
// When static shape info is unavailable
// ERROR[E3020]: cannot verify shape statically — consider using `?` for dynamic dims
//   note: runtime shape check will be inserted

E4001–E4099: Borrow Errors

E4001: Use After Move

axon
val data = randn([100]);
val other = data;
println("{}", data);
// ERROR[E4001]: use of moved value `data`
//   note: `data` was moved on line 2

Fix: Clone the value or restructure to avoid the move.

E4002: Borrow of Moved Value

axon
val s = "hello".to_string();
val t = s;
val r = &s;
// ERROR[E4002]: cannot borrow `s` — value has been moved

E4003: Mutable Borrow Conflict

axon
var v = vec![1, 2, 3];
val r1 = &v;
val r2 = &mut v;
// ERROR[E4003]: cannot borrow `v` as mutable — also borrowed as immutable
//   note: immutable borrow of `v` occurs on line 2

Fix: Ensure immutable borrows end before taking a mutable borrow.

E4004: Multiple Mutable Borrows

axon
var data = randn([10]);
val a = &mut data;
val b = &mut data;
// ERROR[E4004]: cannot borrow `data` as mutable more than once

E4005: Dangling Reference

axon
fn dangling(): &String {
    val s = "hello".to_string();
    &s
// ERROR[E4005]: `s` does not live long enough
//   note: borrowed value only lives until end of function
}

Fix: Return an owned value instead of a reference.

E4006: Mutability Required

axon
val data = randn([10]);
scale(&mut data, 2.0);
// ERROR[E4006]: cannot borrow `data` as mutable — declared as immutable
//   help: consider changing to `var data`

E4007: Cross-Device Borrow

axon
var t = randn([256]);
val cpu_ref = &t;
val gpu_t = t.to_gpu();
// ERROR[E4007]: cannot move `t` to GPU while borrowed on CPU

W5001–W5010: Lint Warnings

W5001: Unused Variable

axon
val x = 42;
// WARNING[W5001]: unused variable `x`
//   help: prefix with underscore: `_x`

W5002: Unused Import

axon
use std.math.sin;
// WARNING[W5002]: unused import `sin`

W5003: Dead Code

axon
fn unused_function() {}
// WARNING[W5003]: function `unused_function` is never called

W5004: Unnecessary Mutability

axon
var x = 42;
println("{}", x);
// WARNING[W5004]: variable `x` declared as mutable but never mutated

W5005: Shadowed Variable

axon
val x = 1;
val x = 2;
// WARNING[W5005]: variable `x` shadows previous declaration

W5006: Naming Convention

axon
fn MyFunction() {}
// WARNING[W5006]: function `MyFunction` should use snake_case
//   help: rename to `my_function`

W5007: Redundant Type Annotation

axon
val x: Int32 = 42;
// WARNING[W5007]: type annotation is redundant — inferred as `Int32`

W5008: Missing Documentation

axon
pub fn public_api() {}
// WARNING[W5008]: public item `public_api` is missing documentation

Error Output Formats

Human-Readable (Default)

error[E2001]: type mismatch — expected `Int32`, found `String`
  --> src/main.axon:5:15
  help: consider using `parse()` to convert the string

JSON (--error-format=json)

json
{
  "error_code": "E2001",
  "message": "type mismatch — expected `Int32`, found `String`",
  "severity": "error",
  "location": { "file": "src/main.axon", "line": 5, "column": 15 },
  "suggestion": "consider using `parse()` to convert the string"
}

See Also

Axon Compiler Architecture

Pipeline

Source → Lexer → Parser → AST → Name Resolution → Type Checking
      → Shape Checking → Borrow Checking → TAST → MIR → MIR Passes → LLVM IR → Native Binary

Overview

The Axon compiler (axonc) is structured as a multi-phase pipeline. Each phase transforms the program representation and may produce diagnostic errors. The pipeline is designed to continue after errors where possible, providing multiple diagnostics in a single compilation pass.

Modules

Core Pipeline

| Module | File | Description |
|--------|------|-------------|
| Lexer | src/lexer.rs | Tokenization of Axon source text. Handles keywords, types, operators, delimiters, literals (int, float, string, char), comments (// and /* */), attributes (@cpu, @gpu, @device), and source location tracking. |
| Parser | src/parser.rs | Recursive descent parser. Produces an AST from a token stream. Handles operator precedence with Pratt parsing, provides clear error messages with source locations, and implements error recovery for multiple diagnostics. |
| AST | src/ast.rs | Abstract Syntax Tree types. All nodes carry Span for source location. Serializable via serde for tooling integration. |
| Name Resolution | src/symbol.rs | Symbol table with lexical scoping. Resolves names to definitions, detects undefined variables, and tracks variable mutability. Part of the type checking phase. |
| Type Checker | src/typeck.rs | Hindley-Milner type inference with constraint-based unification. Registers stdlib types, resolves names, infers expression types, and checks type compatibility. |
| Shape Checker | src/shapes.rs | Tensor dimension verification. Ensures tensor operations have compatible shapes at compile time. Axon's key differentiator for ML/AI safety. |
| Borrow Checker | src/borrow.rs | Ownership, move, and borrow analysis. Tracks value lifetimes, prevents use-after-move, and enforces mutable borrow exclusivity. |
| TAST | src/tast.rs | Typed Abstract Syntax Tree. Annotates each AST node with its resolved type. Serves as the bridge between type checking and code generation. |
| MIR | src/mir/ | Mid-level Intermediate Representation. Flattened, SSA-like form suitable for optimization passes and lowering to LLVM IR. |
| MIR Passes | src/mir/transform/ | MIR optimization passes: dead code elimination and constant folding. Managed by a PassManager that runs passes based on optimization level. |
| Name Interner | src/interner.rs | Global string interning for O(1) name comparisons. Deduplicates identifier strings via NameInterner and lightweight Name handles. |
| Diagnostics | src/error.rs | Accumulative diagnostic system with categories, severity configuration (--deny/--allow/--warn), error limits, and grouped display. |
| Codegen | src/codegen/ | LLVM IR generation and native compilation. |
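The interning scheme described for src/interner.rs can be sketched as below. `NameInterner` and `Name` are named in the table; the fields and methods here are assumptions for illustration:

```rust
use std::collections::HashMap;

/// Lightweight handle: comparing two `Name`s is an integer compare, O(1).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Name(u32);

/// Sketch of a string interner in the spirit of `src/interner.rs`
/// (field and method names are illustrative).
#[derive(Default)]
struct NameInterner {
    map: HashMap<String, Name>,
    strings: Vec<String>,
}

impl NameInterner {
    /// Return the existing handle for `s`, or allocate a new one.
    fn intern(&mut self, s: &str) -> Name {
        if let Some(&name) = self.map.get(s) {
            return name; // already interned: reuse the handle
        }
        let name = Name(self.strings.len() as u32);
        self.strings.push(s.to_string());
        self.map.insert(s.to_string(), name);
        name
    }

    /// Recover the original string from a handle.
    fn resolve(&self, name: Name) -> &str {
        &self.strings[name.0 as usize]
    }
}

fn main() {
    let mut interner = NameInterner::default();
    let a = interner.intern("forward");
    let b = interner.intern("forward");
    assert_eq!(a, b); // same string, same handle
    assert_eq!(interner.resolve(a), "forward");
}
```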

Code Generation Submodules

| Module | File | Description |
|--------|------|-------------|
| LLVM IR | src/codegen/llvm.rs | Generates textual LLVM IR (.ll files) from MIR. Compiles to native code via clang. |
| ABI | src/codegen/abi.rs | Application Binary Interface definitions for calling conventions. |
| MLIR | src/codegen/mlir.rs | MLIR integration for ML-specific optimizations (future). |
| Runtime | src/codegen/runtime.rs | Runtime support declarations (memory allocation, printing, etc.). |

Standard Library

| Module | File | Description |
|--------|------|-------------|
| Stdlib | src/stdlib/ | Built-in type and function registration. Registers primitive types (Int32, Float64, Bool, String), collection types, and AI framework types into the type checker. |

The stdlib includes AI/ML framework types:

  • src/stdlib/nn.rs — Neural network layers (Linear, Conv2d, etc.)
  • src/stdlib/optim.rs — Optimizers (SGD, Adam, etc.)
  • src/stdlib/data.rs — Data loading types (Dataset, DataLoader)
  • src/stdlib/mem.rs — Memory management primitives

Tooling

ModuleFileDescription
Formattersrc/fmt.rsCode formatter. Parses source to AST and re-emits with consistent style.
Lintersrc/lint.rsStatic analysis linter. Checks for unused variables, naming conventions, complexity, and more.
REPLsrc/repl.rsInteractive Read-Eval-Print Loop for Axon expressions.
Doc Generatorsrc/doc.rsDocumentation generation from doc comments.
LSP Serversrc/lsp/Language Server Protocol implementation for IDE integration.
Package Managersrc/pkg/Package management (manifests, registry, dependency resolution).

Error System

Errors are categorized by compiler phase using numeric codes:

| Range | Category | Examples |
|-------|----------|----------|
| E0xxx | Lexer/Parser errors | E0001 unexpected character, E0002 unterminated string |
| E1xxx | Name resolution errors | E1001 undefined variable, E1010 duplicate definition |
| E2xxx | Type errors | E2001 type mismatch, E2030 cannot infer type |
| E3xxx | Shape errors | E3001 matmul shape mismatch, E3002 invalid reshape |
| E4xxx | Borrow errors | E4001 use after move, E4003 mutable borrow conflict |
| E5xxx | MIR/Codegen errors | E5009 no main function, E5010 codegen failure |
| W5xxx | Lint warnings | W5001 unused variable, W5006 naming convention |

All errors carry:

  • A source Span (file, line, column, offset)
  • A human-readable message
  • A severity level (Error, Warning, Note)
  • Optional suggestions for fixes
  • An optional diagnostic category for filtering (parse-error, type-error, borrow-error, etc.)

Diagnostics support severity overrides via CLI flags (--deny, --allow, --warn) and an error limit (--error-limit N) that stops compilation after N errors.
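A sketch of how these overrides might be resolved, in Rust. The precedence shown (`--deny` over `--warn` over `--allow`) and all names are assumptions for illustration, not the compiler's documented behavior:

```rust
/// Illustrative severity model; not the compiler's actual API.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Severity {
    Error,
    Warning,
    Allowed,
}

/// Resolve a diagnostic category's severity against CLI overrides.
/// Assumed precedence: --deny beats --warn, which beats --allow.
fn effective_severity(
    category: &str,
    default: Severity,
    deny: &[&str],
    warn: &[&str],
    allow: &[&str],
) -> Severity {
    if deny.contains(&category) {
        Severity::Error
    } else if warn.contains(&category) {
        Severity::Warning
    } else if allow.contains(&category) {
        Severity::Allowed
    } else {
        default
    }
}

fn main() {
    // e.g. axonc build m.axon --deny type-error --allow parse-error
    let deny = ["type-error"];
    let allow = ["parse-error"];
    assert_eq!(
        effective_severity("type-error", Severity::Warning, &deny, &[], &allow),
        Severity::Error
    );
    assert_eq!(
        effective_severity("parse-error", Severity::Warning, &deny, &[], &allow),
        Severity::Allowed
    );
}
```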

Data Flow

                    ┌──────────┐
    Source text ───►│  Lexer   │───► Vec<Token>
                    └──────────┘
                         │
                    ┌──────────┐
                    │  Parser  │───► AST (Program)
                    └──────────┘
                         │
                    ┌──────────┐
                    │  Names   │───► SymbolTable
                    └──────────┘
                         │
                    ┌──────────┐
                    │  TypeCk  │───► TypeInterner + Constraints
                    └──────────┘
                         │
                ┌────────┴────────┐
                │                 │
           ┌──────────┐    ┌──────────┐
           │ ShapeCk  │    │ BorrowCk │
           └──────────┘    └──────────┘
                │                 │
                └────────┬────────┘
                         │
                    ┌──────────┐
                    │   TAST   │───► TypedProgram
                    └──────────┘
                         │
                    ┌──────────┐
                    │   MIR    │───► MirProgram
                    └──────────┘
                         │
                    ┌──────────┐
                    │ MIR Pass │───► Optimized MirProgram
                    └──────────┘
                         │
                    ┌──────────┐
                    │  LLVM IR │───► .ll file
                    └──────────┘
                         │
                    ┌──────────┐
                    │  clang   │───► Native binary
                    └──────────┘

Key Design Decisions

  1. Safe Rust only — No unsafe blocks anywhere in the compiler.
  2. Arena-style type interning — Types are identified by TypeId (index), enabling O(1) lookups and avoiding lifetime complexity.
  3. Constraint-based type inference — Generates constraints during traversal, then solves via unification. Enables HM-style inference.
  4. Error recovery — Parser continues after errors to report multiple diagnostics in one pass.
  5. Textual LLVM IR — Generates .ll files rather than using LLVM C API, keeping the compiler dependency-free and simplifying builds.
  6. External clang — Uses clang as a subprocess for final compilation, avoiding LLVM library linking.
  7. Stack safety — Recursive descent functions are wrapped with stacker::maybe_grow to dynamically grow the stack for deeply nested input, preventing stack overflows.
  8. MIR optimization passes — Pluggable pass architecture (MirPass trait + PassManager) enables incremental addition of optimization passes. Dead code elimination and constant folding are built-in at -O1 and above.
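The pass architecture in point 8 can be sketched as follows. `MirPass` and `PassManager` are named above; the method signatures and the toy MIR representation are assumptions for illustration:

```rust
/// Toy stand-in for the real MIR program representation.
struct MirProgram {
    instructions: Vec<String>,
}

/// Sketch of the pluggable pass trait named in the text
/// (signatures are assumptions).
trait MirPass {
    fn name(&self) -> &'static str;
    fn run(&self, program: &mut MirProgram);
}

struct DeadCodeElimination;

impl MirPass for DeadCodeElimination {
    fn name(&self) -> &'static str {
        "dead-code-elimination"
    }
    fn run(&self, program: &mut MirProgram) {
        // Toy logic: drop instructions marked dead.
        program.instructions.retain(|i| !i.starts_with("dead:"));
    }
}

struct PassManager {
    passes: Vec<Box<dyn MirPass>>,
}

impl PassManager {
    /// Register passes according to the optimization level:
    /// DCE and constant folding kick in at -O1 and above.
    fn for_opt_level(level: u8) -> Self {
        let mut passes: Vec<Box<dyn MirPass>> = Vec::new();
        if level >= 1 {
            passes.push(Box::new(DeadCodeElimination));
            // constant folding would be registered here as well
        }
        PassManager { passes }
    }

    fn run(&self, program: &mut MirProgram) {
        for pass in &self.passes {
            println!("running pass: {}", pass.name());
            pass.run(program);
        }
    }
}

fn main() {
    let mut program = MirProgram {
        instructions: vec!["add".into(), "dead: tmp".into(), "ret".into()],
    };
    PassManager::for_opt_level(1).run(&mut program);
    assert_eq!(program.instructions, vec!["add".to_string(), "ret".to_string()]);
}
```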

Axon Compiler Security Audit

Date: 2024
Scope: Axon compiler (axonc) crate — all source under src/

1. unsafe Block Inventory

Finding: No unsafe blocks in compiler source code.

A search of the entire src/ directory confirms zero uses of unsafe { } blocks in the Axon compiler implementation. The keyword unsafe appears only in:

  • Lexer/token — as a keyword token that Axon can lex (TokenKind::Unsafe)
  • Parser — to parse unsafe fn declarations in Axon source
  • LSP — as a completion/hover entry for the unsafe keyword
  • Package manifest — in test data for lint deny lists

This means the Axon compiler itself relies entirely on Rust's safe subset, inheriting all of Rust's memory safety guarantees (no buffer overflows, use-after-free, data races, etc.).

Risk: None — Rust's type system and borrow checker enforce safety at compile time.

2. FFI Boundaries

2.1 Clang Subprocess Invocation

The only external process invocation is in src/codegen/llvm.rs:

  • compile_ir_to_binary() — invokes clang via std::process::Command
  • compile_ir_to_object() — invokes clang via std::process::Command

Risks:

  • Command injection: The output path is passed directly as a command argument. If an attacker controls the output path, they could potentially inject arguments.
  • Path traversal: No validation is performed on the output path.

Mitigations:

  • Arguments are passed as separate array elements to Command::args(), not concatenated into a shell string. This prevents shell injection.
  • The IR content is written to a file first, not passed via stdin, limiting injection vectors.
  • Only clang is invoked — no shell (sh -c / cmd /c) is used.
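The first mitigation can be demonstrated directly. Because `Command` passes each argument as a separate argv entry, a hostile output path reaches the child process as one literal string and is never shell-interpreted. In this sketch, `echo` stands in for `clang`, and the helper name is illustrative:

```rust
use std::process::Command;

/// Run a command with the untrusted string as a single argv entry.
/// No shell is involved, so metacharacters like `;` are inert.
fn run_with_literal_arg(arg: &str) -> String {
    let out = Command::new("echo")
        .arg(arg) // one argv entry, not a concatenated shell string
        .output()
        .expect("failed to run echo");
    String::from_utf8_lossy(&out.stdout).trim().to_string()
}

fn main() {
    // The semicolon and the "command" after it come back verbatim;
    // nothing besides echo was executed.
    let echoed = run_with_literal_arg("model.bin; rm -rf /tmp/x");
    assert_eq!(echoed, "model.bin; rm -rf /tmp/x");
    println!("argument passed through literally: {}", echoed);
}
```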

Recommendations:

  1. Validate/sanitize output paths before passing to clang.
  2. Use absolute paths for the clang binary or verify it on $PATH.
  3. Consider sandboxing clang invocations (e.g., seccomp, containers).
  4. Add a timeout to prevent hanging clang processes.

2.2 No Other FFI

The compiler does not use any extern "C" functions, does not link to C libraries, and does not use libc or std::ffi directly.

3. Input Validation

3.1 Source Code Parsing

The lexer and parser handle arbitrary input gracefully:

  • Lexer (src/lexer.rs): Processes input character-by-character. Unknown characters produce error tokens. Unterminated strings/comments produce error tokens with descriptive messages. No panics on any input.
  • Parser (src/parser.rs): Uses error recovery to continue parsing after syntax errors. Returns a partial AST plus a list of errors. The parse_source() function in lib.rs is the safe entry point.
  • Type checker (src/typeck.rs): Handles undefined types, recursive types, and type mismatches by producing error diagnostics. Falls through gracefully when earlier phases produce errors.

Verification: The fuzz test suite (tests/fuzz_tests.rs) exercises the compiler with 40+ edge cases including empty input, all ASCII characters, malformed syntax, deeply nested structures, and more.

3.2 Denial of Service via Input

Potential risks:

  • Extremely long identifiers: The lexer allocates a String for each identifier. A 10GB identifier would consume 10GB of memory.
  • Deeply nested expressions: The recursive descent parser uses the call stack. Extremely deep nesting (>1000 levels) may cause stack overflow.
  • Exponential type inference: Pathological type constraints could cause the unification algorithm to run for a long time.

Recommendations:

  1. Add configurable limits on identifier length (e.g., 1024 characters).
  2. Add a nesting depth limit to the parser (e.g., 256 levels).
  3. Add a timeout or iteration limit to the type inference engine.
  4. Add a maximum source file size check (e.g., 10MB).
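Recommendation 2 can be sketched as a depth counter threaded through the recursive-descent parser; the limit value and all names here are illustrative:

```rust
/// Illustrative nesting limit (the text suggests 256 levels).
const MAX_NESTING_DEPTH: usize = 256;

fn check_depth(depth: usize) -> Result<(), String> {
    if depth > MAX_NESTING_DEPTH {
        Err(format!(
            "nesting depth {} exceeds limit {}",
            depth, MAX_NESTING_DEPTH
        ))
    } else {
        Ok(())
    }
}

/// Toy recursive-descent fragment: descends on `(`, returns on `)`,
/// and rejects input once the depth guard trips, instead of
/// overflowing the call stack.
fn parse_parens(input: &[u8], pos: &mut usize, depth: usize) -> Result<(), String> {
    check_depth(depth)?;
    while *pos < input.len() {
        match input[*pos] {
            b'(' => {
                *pos += 1;
                parse_parens(input, pos, depth + 1)?;
            }
            b')' => {
                *pos += 1;
                return Ok(());
            }
            _ => *pos += 1,
        }
    }
    Ok(())
}

fn main() {
    let mut pos = 0;
    assert!(parse_parens(b"((ok))", &mut pos, 0).is_ok());

    let hostile = "(".repeat(10_000);
    let mut pos = 0;
    assert!(parse_parens(hostile.as_bytes(), &mut pos, 0).is_err());
    println!("depth guard rejected pathological input");
}
```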

4. Package Registry Security Model

The package system (src/pkg/) is in early development. Future security considerations:

4.1 Package Integrity

  • Requirement: All packages must have cryptographic signatures (ed25519).
  • Requirement: Package contents must be verified against a hash (SHA-256).
  • Requirement: Lock files must pin exact versions with hashes.

4.2 Dependency Resolution

  • Risk: Dependency confusion attacks (public package overriding private).
  • Mitigation: Support private registry priorities, namespace scoping.
  • Risk: Typosquatting.
  • Mitigation: Name similarity checks during axon pkg add.

4.3 Build Scripts

  • Risk: Arbitrary code execution during package installation.
  • Recommendation: Axon should NOT support arbitrary build scripts. Instead, provide a declarative build configuration.

4.4 Supply Chain

  • Recommendation: Support axon audit command to check for known vulnerabilities in dependencies.
  • Recommendation: Support reproducible builds.

5. REPL Security Considerations

The REPL (src/repl.rs) reads from stdin and evaluates Axon expressions.

Current state: The REPL only performs parsing and type checking — it does not execute code. This limits the attack surface.

Future risks (when execution is added):

  • File system access: Axon code in the REPL should be sandboxed.
  • Network access: Should be disabled by default in REPL mode.
  • Resource limits: CPU time and memory should be bounded.
  • History file: REPL history should be stored with restricted permissions.

Recommendations:

  1. Implement a capability-based security model for REPL execution.
  2. Default to a restricted sandbox with explicit opt-in for I/O.
  3. Add :sandbox on/off REPL command for security-conscious users.

6. Memory Safety Guarantees

6.1 Rust Safety Model

The Axon compiler is written entirely in safe Rust. This provides:

  • No buffer overflows: Bounds checking on all array/vector accesses.
  • No use-after-free: Ownership system prevents dangling references.
  • No null pointer dereference: Option<T> forces explicit handling.
  • No data races: Borrow checker prevents shared mutable state.
  • No uninitialized memory: All values must be initialized.

6.2 Allocation Patterns

  • Type interner (src/types.rs): Uses arena-style allocation via Vec<Type>. Types are identified by index (TypeId), preventing dangling references.
  • Symbol table (src/symbol.rs): Uses scoped HashMap with stack-based scope management. Scopes are pushed/popped correctly.
  • AST nodes (src/ast.rs): Heap-allocated via Box<Expr> for recursive types. Ownership is clear and single-owner.
  • Error collection: Errors are collected in Vec<CompileError> and returned to the caller. No global mutable state.

6.3 Dependencies

| Crate | Version | Purpose | Risk |
|-------|---------|---------|------|
| serde | 1.x | Serialization | Low — widely audited |
| serde_json | 1.x | JSON output | Low — widely audited |
| clap | 4.x | CLI argument parsing | Low — widely audited |

All dependencies are well-established, widely audited crates with no known vulnerabilities.

Summary

| Area | Status | Risk Level |
|------|--------|------------|
| unsafe code | None found | ✅ None |
| FFI boundaries | Clang subprocess only | ⚠️ Low |
| Input validation | Good (with fuzz tests) | ⚠️ Low |
| Package registry | Not yet implemented | 📋 Future |
| REPL security | Parse-only (no execution) | ✅ None (currently) |
| Memory safety | Full Rust safety guarantees | ✅ None |
| Dependencies | 3 well-audited crates | ✅ None |

Overall assessment: The Axon compiler has a strong security posture thanks to being written in safe Rust with minimal dependencies. The primary areas for future hardening are input size limits and the package registry security model.

Contributing to Axon

Thank you for your interest in contributing to the Axon programming language! This guide covers how to build from source, run tests, and submit changes.


Getting Started

Prerequisites

  • Rust (stable, 1.75+) — rustup.rs
  • Git
  • Clang (for the native binary backend) — optional for most development

Clone and Build

bash
git clone https://github.com/axon-lang/axon.git
cd axon
cargo build

Verify it works:

bash
cargo run -- --help
cargo run -- lex tests/examples/example1_hello.axon

Project Structure

axon/
├── Cargo.toml              # Rust project manifest
├── src/
│   ├── main.rs             # CLI entry point (axonc)
│   ├── lib.rs              # Library root — compiler pipeline
│   ├── token.rs            # Token types
│   ├── lexer.rs            # Lexer (source → tokens)
│   ├── ast.rs              # AST node definitions
│   ├── parser.rs           # Parser (tokens → AST)
│   ├── span.rs             # Source location tracking
│   ├── error.rs            # Error types and reporting
│   ├── types.rs            # Type system (Type, TypeInterner)
│   ├── symbol.rs           # Symbol table and name resolution
│   ├── typeck.rs           # Type checker (HM inference)
│   ├── shapes.rs           # Shape checker (tensor dims)
│   ├── borrow.rs           # Borrow checker (ownership)
│   ├── tast.rs             # Typed AST
│   ├── mir.rs              # Mid-level IR
│   ├── codegen/
│   │   ├── llvm.rs         # LLVM IR generation
│   │   ├── mlir.rs         # MLIR / GPU backend
│   │   ├── runtime.rs      # Runtime library
│   │   └── abi.rs          # ABI and symbol mangling
│   ├── stdlib/             # Standard library definitions
│   │   ├── prelude.rs      # Auto-imported items
│   │   ├── ops.rs          # Operator traits
│   │   ├── collections.rs  # Vec, HashMap, Option, Result
│   │   ├── tensor.rs       # Tensor operations
│   │   ├── nn.rs           # Neural network layers
│   │   ├── autograd.rs     # Automatic differentiation
│   │   ├── optim.rs        # Optimizers
│   │   ├── loss.rs         # Loss functions
│   │   └── ...             # More stdlib modules
│   ├── fmt.rs              # Code formatter
│   ├── lint.rs             # Linter
│   ├── doc.rs              # Documentation generator
│   ├── repl.rs             # REPL
│   ├── lsp/                # Language server
│   │   └── handlers.rs     # LSP request handlers
│   └── pkg/                # Package manager
│       ├── manifest.rs     # Axon.toml parsing
│       ├── resolver.rs     # Dependency resolution
│       └── commands.rs     # CLI commands
├── stdlib/                 # Axon source stubs (.axon files)
├── tests/
│   ├── integration_tests.rs
│   ├── type_tests.rs
│   ├── codegen_tests.rs
│   ├── stdlib_tests.rs
│   ├── ai_framework_tests.rs
│   ├── tooling_tests.rs
│   └── examples/*.axon     # Example programs
├── editors/
│   └── vscode/             # VS Code extension
├── benches/                # Benchmarks
├── fuzz/                   # Fuzz testing
└── docs/                   # Documentation

Running Tests

Full Test Suite

bash
cargo test

This runs 863+ tests across all compiler phases.

Specific Test Files

bash
# Lexer and parser tests
cargo test --lib lexer
cargo test --lib parser

# Type checker tests
cargo test --test type_tests

# Code generation tests
cargo test --test codegen_tests

# Standard library tests
cargo test --test stdlib_tests

# AI framework tests
cargo test --test ai_framework_tests

# Tooling tests (LSP, formatter, linter, REPL)
cargo test --test tooling_tests

Running a Single Test

bash
cargo test test_name_here -- --exact

Running Benchmarks

bash
cargo test --test compiler_bench -- --ignored

Development Workflow

1. Create a Branch

bash
git checkout -b feature/my-feature

2. Make Changes

Edit the relevant source files. The compiler pipeline flows:

Source → Lexer → Parser → AST
                           ↓
                    Name Resolution
                           ↓
                    Type Checker → Shape Checker → Borrow Checker
                           ↓
                        Typed AST
                           ↓
                          MIR
                           ↓
                    LLVM IR / MLIR
                           ↓
                      Native Binary

3. Add Tests

Every change should include tests. Add them to the appropriate test file:

  • Lexer/Parser changes: src/lexer.rs or src/parser.rs (unit tests)
  • Type system changes: tests/type_tests.rs
  • Codegen changes: tests/codegen_tests.rs
  • Stdlib additions: tests/stdlib_tests.rs
  • Tooling changes: tests/tooling_tests.rs

4. Run Tests

bash
cargo test

Ensure all tests pass before submitting.

5. Format and Lint

bash
cargo fmt
cargo clippy

6. Submit a Pull Request

Push your branch and open a PR. Include:

  • Description of what the change does
  • Related issue number (if any)
  • Test output confirming tests pass

Coding Guidelines

Style

  • Follow Rust standard style (cargo fmt)
  • Use descriptive variable names
  • Add doc comments (///) for public items
  • Keep functions focused and under 50 lines when possible

Error Handling

  • Use proper error codes (see Compiler Errors)
  • Include source locations in all errors
  • Add suggestions where helpful
  • Test both success and error cases

Testing

  • Each feature should have positive and negative tests
  • Test edge cases (empty input, deeply nested structures, etc.)
  • Integration tests should use .axon example files
  • Aim for test names that describe what they verify

Adding a New Feature

Adding a New Keyword

  1. Add the keyword to Token enum in src/token.rs
  2. Add it to the keyword map in src/lexer.rs
  3. Add parser support in src/parser.rs
  4. Add AST node in src/ast.rs
  5. Add type checking in src/typeck.rs
  6. Add tests at each level
  7. Update documentation

Adding a Stdlib Function

  1. Add the function signature in src/stdlib/<module>.rs
  2. Register it in the type checker (src/typeck.rs)
  3. Add an Axon stub in stdlib/<module>.axon
  4. Add tests in tests/stdlib_tests.rs
  5. Update documentation

Adding a New Lint Rule

  1. Add the warning code to src/lint.rs
  2. Implement detection logic
  3. Add tests in tests/tooling_tests.rs
  4. Document in docs/reference/compiler-errors.md
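Step 2's detection logic can be sketched for a simple rule such as W5006 (naming convention). The real linter walks the AST; this standalone helper only shows the shape of the check, and the names are illustrative:

```rust
/// Return a W5006-style warning if a function name is not snake_case,
/// or None if it passes. A real lint would also carry a Span.
fn check_fn_name(name: &str) -> Option<String> {
    let is_snake = name
        .chars()
        .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_');
    if is_snake {
        None
    } else {
        Some(format!(
            "warning[W5006]: function `{}` should use snake_case",
            name
        ))
    }
}

fn main() {
    assert!(check_fn_name("my_function").is_none());
    let warning = check_fn_name("MyFunction").expect("should warn");
    println!("{}", warning);
}
```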

Architecture Overview

For detailed architecture documentation, see docs/internals/architecture.md.

Key Design Principles

  1. Correctness first — the compiler should never accept invalid programs
  2. Helpful errors — every error should explain what went wrong and suggest a fix
  3. Performance — the compiler should be fast (targeting <100ms for typical files)
  4. Testability — every component should be independently testable

Communication

  • Issues: Report bugs and request features via GitHub Issues
  • Discussions: Design discussions in GitHub Discussions
  • Code Review: All changes require at least one review

License

Axon is open source. By contributing, you agree that your contributions will be licensed under the same license as the project.


See Also