Getting Started with Axon
Welcome to Axon — the ML/AI-first systems programming language that combines the safety of Rust, the ergonomics of Python, and first-class support for tensors, automatic differentiation, and GPU computing.
What is Axon?
Axon is a compiled, statically-typed language purpose-built for machine learning and AI workloads. It provides:
- First-class tensor types with compile-time shape checking
- Ownership-based memory safety — no garbage collector, no data races
- Automatic differentiation (reverse-mode autograd)
- Native GPU execution via CUDA, ROCm, and Vulkan backends
- Hindley-Milner type inference — write less, know more
- A rich standard library with neural network layers, optimizers, and data loading
Axon compiles to native code through LLVM and to GPU kernels through MLIR, delivering performance on par with C++ while remaining approachable for ML researchers.
Installation
From Cargo (Recommended)
If you have Rust installed:
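Assuming the compiler is published to crates.io under its binary's name (`axonc` here is an assumption based on the rest of this guide):

```bash
cargo install axonc   # hypothetical crate name
```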
Binary Download
Pre-built binaries are available for:
- Linux (x86_64, aarch64)
- macOS (x86_64, Apple Silicon)
- Windows (x86_64)
Download from the releases page and add the binary to your PATH.
From Source
```bash
git clone https://github.com/axon-lang/axon.git
cd axon
cargo build --release
# Binary is at target/release/axonc
```
Verify the installation:
```bash
axonc --version
# axonc 0.1.0
```
Hello, World!
Create a file named hello.axon:
```axon
fn main() {
    println("Hello, Axon!");
}
```
Compile and run:
```bash
axonc build hello.axon -o hello
./hello
# Hello, Axon!
```
Or use the REPL for quick experimentation:
```bash
axonc repl
>>> println("Hello from the REPL!")
Hello from the REPL!
```
Your First Project
Axon includes a built-in package manager. Create a new project:
```bash
axonc pkg new my_project
cd my_project
```
This generates the following structure:
```
my_project/
├── Axon.toml            # Project manifest
├── src/
│   └── main.axon        # Entry point
└── tests/
    └── test_main.axon   # Test file
```
The generated Axon.toml:
```toml
[package]
name = "my_project"
version = "0.1.0"
edition = "2026"

[dependencies]
```
The generated src/main.axon:
```axon
fn main() {
    println("Hello from my_project!");
}
```
Build and run the project:
```bash
axonc pkg build
axonc pkg run
# Hello from my_project!
```
Compiling and Running
Single File
```bash
# Parse and check for errors
axonc check hello.axon

# Build an optimized binary
axonc build hello.axon -O 3 -o hello

# Emit LLVM IR for inspection
axonc build hello.axon --emit-llvm
```
Project-Based
```bash
axonc pkg build   # Build the project
axonc pkg run     # Build and run
axonc pkg test    # Run tests
axonc pkg fmt     # Format all source files
axonc pkg lint    # Lint all source files
```
Optimization Levels
| Flag | Description |
|---|---|
| `-O 0` | No optimization (default, fastest compile) |
| `-O 1` | Basic optimizations |
| `-O 2` | Standard optimizations |
| `-O 3` | Aggressive optimizations |
Editor Setup
VS Code (Recommended)
The official Axon VS Code extension provides:
- Syntax highlighting for `.axon` files
- Real-time error diagnostics via the Axon LSP
- Go-to-definition, hover types, and find references
- Code completion with type-aware suggestions
- Inlay hints for inferred types
- Semantic token highlighting
- Code snippets for common patterns
Install from the marketplace or build from source:
```bash
cd editors/vscode
npm install
npm run build
```
Other Editors
For any editor that supports the Language Server Protocol:
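Assuming the language server ships as a compiler subcommand (the name `lsp` is an assumption), point your editor at:

```bash
axonc lsp   # assumed subcommand; check `axonc --help` for the actual name
```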
This starts the Axon language server over stdio, compatible with Neovim (via nvim-lspconfig), Emacs (via lsp-mode), Helix, Zed, and others.
What's Next?
Language Tour
A quick tour of Axon's syntax and core features. This guide assumes familiarity with at least one systems or ML language (Rust, Python, C++).
Variables
Variables are declared with val. They are immutable by default.
```axon
val x = 42;             // immutable, type inferred as Int32
val y: Float64 = 3.14;  // explicit type annotation

var counter = 0;        // mutable variable
counter += 1;
```
Type inference works across expressions — you rarely need annotations:
```axon
val name = "Axon";          // String
val active = true;          // Bool
val scores = [95, 87, 92];  // Vec<Int32>
```
Functions
Functions are declared with `fn`. Parameters require type annotations; the return type follows a `:`.
```axon
fn add(a: Int32, b: Int32): Int32 {
    a + b  // last expression is the return value
}

fn greet(name: String) {
    println("Hello, {}!", name);
}

fn main() {
    val sum = add(3, 4);
    greet("World");
}
```
Unsafe Functions
Functions performing low-level operations can be marked unsafe:
```axon
unsafe fn raw_pointer_access(ptr: *mut Float32): Float32 {
    // low-level memory access
}
```
Basic Types
Numeric Types
| Type | Description | Size |
|---|---|---|
| `Int8` | Signed 8-bit integer | 1 byte |
| `Int16` | Signed 16-bit integer | 2 bytes |
| `Int32` | Signed 32-bit integer | 4 bytes |
| `Int64` | Signed 64-bit integer | 8 bytes |
| `UInt8` | Unsigned 8-bit integer | 1 byte |
| `UInt16` | Unsigned 16-bit integer | 2 bytes |
| `UInt32` | Unsigned 32-bit integer | 4 bytes |
| `UInt64` | Unsigned 64-bit integer | 8 bytes |
| `Float16` | 16-bit floating point | 2 bytes |
| `Float32` | 32-bit floating point | 4 bytes |
| `Float64` | 64-bit floating point | 8 bytes |
Other Primitives
| Type | Description |
|---|---|
| `Bool` | Boolean (`true` / `false`) |
| `Char` | Unicode scalar value |
| `String` | UTF-8 encoded string |
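A quick sketch of these primitives in use (the single-quote character-literal syntax is an assumption, not shown elsewhere in this guide):

```axon
val flag: Bool = true;
val letter: Char = 'λ';          // a single Unicode scalar value
val greeting: String = "héllo";  // String is UTF-8, so non-ASCII is fine
println("{} {} {}", flag, letter, greeting);
```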
Numeric Literals
```axon
val dec = 42;      // decimal
val hex = 0xFF;    // hexadecimal
val bin = 0b1010;  // binary
val oct = 0o77;    // octal
val sci = 1.5e10;  // scientific notation (Float64)
```
Control Flow
If / Else
if is an expression — it returns a value:
```axon
val max = if a > b { a } else { b };

if score >= 90 {
    println("Excellent");
} else if score >= 70 {
    println("Good");
} else {
    println("Keep trying");
}
```
While Loops
```axon
var i = 0;
while i < 10 {
    println("{}", i);
    i += 1;
}
```
For Loops
```axon
for item in collection {
    println("{}", item);
}

for i in 0..10 {
    println("{}", i);
}
```
Match Expressions
Pattern matching with exhaustiveness checking:
```axon
match value {
    0 => println("zero"),
    1 => println("one"),
    n => println("other: {}", n),
}

match option_val {
    Some(x) => println("Got {}", x),
    None => println("Nothing"),
}
```
Models
Named product types with fields:
```axon
model Point {
    x: Float64,
    y: Float64,
}

val p = Point { x: 1.0, y: 2.0 };
println("({}, {})", p.x, p.y);
```
Methods via extend
```axon
extend Point {
    fn distance(&self, other: &Point): Float64 {
        val dx = self.x - other.x;
        val dy = self.y - other.y;
        (dx * dx + dy * dy).sqrt()
    }

    fn origin(): Point {
        Point { x: 0.0, y: 0.0 }
    }
}
```
Enums
Sum types with variants that can hold data:
```axon
enum Shape {
    Circle(Float64),
    Rectangle(Float64, Float64),
    Triangle { base: Float64, height: Float64 },
}

fn area(shape: Shape): Float64 {
    match shape {
        Shape.Circle(r) => 3.14159 * r * r,
        Shape.Rectangle(w, h) => w * h,
        Shape.Triangle { base, height } => 0.5 * base * height,
    }
}
```
Traits and Extend Blocks
Traits define shared behavior:
```axon
trait Printable {
    fn to_string(&self): String;
}

extend Printable for Point {
    fn to_string(&self): String {
        format("({}, {})", self.x, self.y)
    }
}
```
Trait Bounds
```axon
fn print_all<T: Printable>(items: Vec<T>) {
    for item in items {
        println("{}", item.to_string());
    }
}
```
Supertraits
```axon
trait Drawable: Printable {
    fn draw(&self);
}
```
Generics
Functions, models, and traits can be generic:
```axon
fn max<T: Ord>(a: T, b: T): T {
    if a > b { a } else { b }
}

model Pair<A, B> {
    first: A,
    second: B,
}

extend<A: Display, B: Display> Pair<A, B> {
    fn show(&self) {
        println("({}, {})", self.first, self.second);
    }
}
```
Tensor Types and Shape Annotations
Axon's killer feature — tensors are first-class citizens with compile-time shape verification:
```axon
// Tensor with known shape
val weights: Tensor<Float32, [784, 256]> = randn([784, 256]);

// Dynamic batch dimension with ?
val input: Tensor<Float32, [?, 784]> = load_batch();

// Matrix multiply — shapes checked at compile time
val output = input @ weights;  // Tensor<Float32, [?, 256]>
```
Shape mismatches are caught before your code ever runs:
```axon
val a: Tensor<Float32, [3, 4]> = randn([3, 4]);
val b: Tensor<Float32, [5, 6]> = randn([5, 6]);
val c = a @ b;  // ERROR[E3001]: shape mismatch — inner dims 4 ≠ 5
```
See the Tensor Guide for the full story.
Error Handling
Axon uses Option<T> and Result<T, E> for safe error handling:
```axon
fn find(haystack: Vec<Int32>, needle: Int32): Option<Int32> {
    for i in 0..haystack.len() {
        if haystack[i] == needle {
            return Some(i);
        }
    }
    None
}

fn read_config(path: String): Result<Config, IOError> {
    val file = File.open(path)?;  // propagate error with ?
    val data = file.read_all()?;
    parse_config(data)
}
```
See Error Handling for patterns and best practices.
Modules and Visibility
Organize code into modules with mod and use:
```axon
mod math {
    pub fn square(x: Float64): Float64 {
        x * x
    }

    fn internal_helper() {
        // private — not visible outside this module
    }
}

use math.square;

fn main() {
    println("{}", square(4.0));  // 16.0
}
```
See Modules & Packages for the full module system.
What's Next?
Tensor Programming
Tensors are first-class citizens in Axon. The type system tracks tensor shapes at compile time, catching dimension mismatches before your code ever runs.
Tensor Types and Shapes
Every tensor has a dtype and a shape encoded in its type:
```axon
// Static shape — all dimensions known at compile time
val weights: Tensor<Float32, [784, 256]> = randn([784, 256]);

// Dynamic batch dimension (?)
val input: Tensor<Float32, [?, 784]> = load_batch();

// Fully dynamic shape
val dynamic: Tensor<Float32, [?, ?]> = some_function();
```
Shape Syntax
| Syntax | Meaning |
|---|---|
| `[3, 4]` | Static shape: 3 rows, 4 columns |
| `[?, 784]` | Dynamic first dim, static second dim |
| `[?, ?, 3]` | Batch × height × width, 3 channels |
| `[N]` | Named dimension (generic) |
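The `[N]` form lets a function be generic over a dimension while still relating shapes to each other. The signature below is an illustrative sketch, not taken from the reference:

```axon
// Hypothetical: both arguments must share the same length N,
// which the compiler can verify wherever N is known.
fn weighted_sum<N>(values: Tensor<Float32, [N]>, weights: Tensor<Float32, [N]>): Float32 {
    (values * weights).sum()
}
```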
Creating Tensors
Initialization Functions
```axon
// Zeros and ones
val z = zeros([3, 4]);  // Tensor<Float32, [3, 4]>
val o = ones([256]);    // Tensor<Float32, [256]>

// Random initialization
val r = randn([128, 64]);  // normal distribution
val u = rand([10, 10]);    // uniform [0, 1)

// From data
val t = Tensor.from_vec([1.0, 2.0, 3.0, 4.0], [2, 2]);

// Range
val seq = arange(0, 10);  // [0, 1, 2, ..., 9]

// Identity matrix
val eye = Tensor.eye(4);  // 4×4 identity

// From file
val data = load_data("weights.npy");
```
Dtype Selection
```axon
val f16: Tensor<Float16, [1024]> = zeros([1024]);  // half precision
val f32: Tensor<Float32, [1024]> = zeros([1024]);  // single precision
val f64: Tensor<Float64, [1024]> = zeros([1024]);  // double precision
val i32: Tensor<Int32, [10]> = arange(0, 10);      // integer tensor
```
Shape Operations
Reshape
Change the shape without changing the data:
```axon
val a: Tensor<Float32, [2, 6]> = randn([2, 6]);
val b = a.reshape([3, 4]);  // Tensor<Float32, [3, 4]>
val c = a.reshape([12]);    // Tensor<Float32, [12]>
// val d = a.reshape([5, 5]);  // ERROR[E3002]: cannot reshape [2,6] (12 elements) to [5,5] (25 elements)
```
Transpose
```axon
val m: Tensor<Float32, [3, 4]> = randn([3, 4]);
val mt = m.transpose();  // Tensor<Float32, [4, 3]>

// For higher-rank tensors, specify axes
val t: Tensor<Float32, [2, 3, 4]> = randn([2, 3, 4]);
val tp = t.permute([0, 2, 1]);  // Tensor<Float32, [2, 4, 3]>
```
Squeeze and Unsqueeze
```axon
val a: Tensor<Float32, [1, 3, 1, 4]> = randn([1, 3, 1, 4]);
val b = a.squeeze();  // Tensor<Float32, [3, 4]>

val c: Tensor<Float32, [3, 4]> = randn([3, 4]);
val d = c.unsqueeze(0);  // Tensor<Float32, [1, 3, 4]>
```
Concatenation and Stacking
```axon
val a: Tensor<Float32, [2, 3]> = randn([2, 3]);
val b: Tensor<Float32, [2, 3]> = randn([2, 3]);

val cat = Tensor.cat([a, b], 0);    // Tensor<Float32, [4, 3]>
val stk = Tensor.stack([a, b], 0);  // Tensor<Float32, [2, 2, 3]>
```
Slicing
```axon
val t: Tensor<Float32, [10, 20]> = randn([10, 20]);
val row = t[0];     // Tensor<Float32, [20]>
val sub = t[2..5];  // Tensor<Float32, [3, 20]>
```
Element-Wise Operations
Standard arithmetic operators work element-wise on tensors:
```axon
val a = randn([3, 4]);
val b = randn([3, 4]);

val sum = a + b;   // element-wise addition
val diff = a - b;  // element-wise subtraction
val prod = a * b;  // element-wise multiplication (Hadamard)
val quot = a / b;  // element-wise division

// Scalar broadcasting
val scaled = a * 2.0;
val shifted = a + 1.0;
```
Math Functions
```axon
val x = randn([100]);

val s = x.sin();
val c = x.cos();
val e = x.exp();
val l = x.log();
val sq = x.sqrt();
val ab = x.abs();
val cl = x.clamp(-1.0, 1.0);
```
Activation Functions
```axon
val h = relu(x);
val g = gelu(x);
val s = sigmoid(x);
val t = tanh(x);
val p = softmax(logits, dim: 1);
```
Reduction Operations
Reduce tensors along axes:
```axon
val t: Tensor<Float32, [4, 5]> = randn([4, 5]);

val total = t.sum();             // scalar
val row_sum = t.sum(dim: 1);     // Tensor<Float32, [4]>
val col_mean = t.mean(dim: 0);   // Tensor<Float32, [5]>
val max_val = t.max();           // scalar
val min_idx = t.argmin(dim: 1);  // Tensor<Int64, [4]>
```
Common Reductions
| Method | Description |
|---|---|
| `.sum()` | Sum of all elements |
| `.sum(dim: N)` | Sum along dimension N |
| `.mean()` | Mean of all elements |
| `.max()` / `.min()` | Maximum / minimum |
| `.argmax(dim: N)` | Index of maximum along dim |
| `.argmin(dim: N)` | Index of minimum along dim |
| `.prod()` | Product of all elements |
| `.norm(p)` | Lp norm |
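The remaining entries follow the same call pattern; for example (assuming `.norm(p)` takes the exponent as its argument):

```axon
val v = randn([16]);
val product = v.prod();  // product of all 16 elements
val l2 = v.norm(2.0);    // Euclidean (L2) norm
val l1 = v.norm(1.0);    // L1 norm: sum of absolute values
```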
Linear Algebra
Matrix Multiplication (@ operator)
The @ operator performs matrix multiplication with compile-time shape checking:
```axon
val A: Tensor<Float32, [3, 4]> = randn([3, 4]);
val B: Tensor<Float32, [4, 5]> = randn([4, 5]);
val C = A @ B;  // Tensor<Float32, [3, 5]>

// Inner dimensions must match
val F: Tensor<Float32, [5, 6]> = randn([5, 6]);
// val G = A @ F;  // ERROR[E3001]: matmul requires inner dims to match: 4 ≠ 5
```
Batch Matrix Multiplication
```axon
val batch_a: Tensor<Float32, [?, 8, 64]> = randn([32, 8, 64]);
val batch_b: Tensor<Float32, [?, 64, 32]> = randn([32, 64, 32]);
val batch_c = batch_a @ batch_b;  // Tensor<Float32, [?, 8, 32]>
```
Other Linear Algebra Operations
```axon
val M = randn([4, 4]);

val d = M.det();            // determinant
val inv = M.inv();          // inverse
val (Q, R) = M.qr();        // QR decomposition
val (U, S, V) = M.svd();    // singular value decomposition
val eig = M.eigenvalues();  // eigenvalues
val tr = M.trace();         // trace

val a = randn([8]);
val b = randn([8]);
val dp = a.dot(b);          // dot product (1D tensors)
```
Device Transfer
Tensors can be moved between CPU and GPU:
```axon
val cpu_tensor = randn([1024, 1024]);

// Move to GPU
val gpu_tensor = cpu_tensor.to_gpu();

// Compute on GPU
val result = gpu_tensor @ gpu_tensor;

// Move back to CPU for I/O
val cpu_result = result.to_cpu();
println("{}", cpu_result);
```
See GPU Programming for details.
Compile-Time Shape Checking
Axon's shape checker catches errors at compile time:
```axon
// ✓ Shapes match
val a: Tensor<Float32, [3, 4]> = randn([3, 4]);
val b: Tensor<Float32, [4, 5]> = randn([4, 5]);
val c = a @ b;  // OK: [3,4] @ [4,5] → [3,5]

// ✗ Shape mismatch
val d: Tensor<Float32, [3, 4]> = randn([3, 4]);
val e: Tensor<Float32, [5, 6]> = randn([5, 6]);
// val f = d @ e;  // ERROR[E3001]: matmul inner dim mismatch: 4 ≠ 5

// ✗ Invalid reshape
val g: Tensor<Float32, [2, 3]> = randn([2, 3]);
// val h = g.reshape([2, 2]);  // ERROR[E3002]: element count mismatch: 6 ≠ 4

// ✗ Element-wise shape mismatch
val i: Tensor<Float32, [3, 4]> = randn([3, 4]);
val j: Tensor<Float32, [3, 5]> = randn([3, 5]);
// val k = i + j;  // ERROR[E3003]: broadcast incompatible shapes [3,4] and [3,5]
```
Dynamic Shapes
When dimensions are dynamic (?), shape checks happen at runtime:
```axon
fn process(input: Tensor<Float32, [?, 784]>): Tensor<Float32, [?, 10]> {
    val w = randn([784, 10]);
    input @ w  // batch dim (?) propagated, inner dim (784) checked statically
}
```
Summary
| Feature | Example |
|---|---|
| Static shape | `Tensor<Float32, [3, 4]>` |
| Dynamic dim | `Tensor<Float32, [?, 784]>` |
| Matmul | `A @ B` |
| Element-wise | `a + b`, `a * 2.0` |
| Reduction | `t.sum(dim: 1)` |
| Reshape | `t.reshape([6, 2])` |
| Device | `t.to_gpu()`, `t.to_cpu()` |
| Shape error | Caught at compile time |
See Also
Ownership and Borrowing
Axon uses an ownership system inspired by Rust to guarantee memory safety at compile time — no garbage collector, no dangling pointers, no data races.
The Three Ownership Rules
- Every value has exactly one owner.
- When the owner goes out of scope, the value is dropped.
- There can be either one mutable reference OR any number of immutable references to a value — never both at the same time.
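A minimal sketch of rules 1 and 2 in action:

```axon
fn scope_demo() {
    val t = randn([512, 512]);  // `t` is the sole owner of this tensor
    println("{}", t.sum());
}  // `t` goes out of scope here; the tensor is dropped and its memory freed
```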
Ownership and Move Semantics
By default, assigning a value moves it. The original binding becomes invalid:
```axon
val tensor = randn([1024, 1024]);
val other = tensor;  // tensor is MOVED into other

// println("{}", tensor);  // ERROR[E4001]: use of moved value `tensor`
println("{}", other);      // OK
```
This applies to function calls as well:
```axon
fn consume(t: Tensor<Float32, [3, 3]>) {
    println("{}", t);
}

val data = randn([3, 3]);
consume(data);
// consume(data);  // ERROR[E4001]: use of moved value `data`
```
Why Moves?
Moves prevent double-free errors and make ownership transfer explicit. When a large tensor is passed to a function, no implicit copy occurs — you always know where your data lives.
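When an independent copy is actually wanted, it has to be explicit. Assuming tensors implement the `Clone` trait listed in the standard-library prelude, the pattern would be:

```axon
val original = randn([1024, 1024]);
val copy = original.clone();    // hypothetical explicit deep copy via Clone
println("{}", original.sum());  // `original` is still valid; nothing was moved
```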
Borrowing: &T and &mut T
To use a value without taking ownership, borrow it:
Immutable Borrows (&T)
Multiple immutable borrows are allowed simultaneously:
```axon
fn print_shape(t: &Tensor<Float32, [?, 784]>) {
    println("Shape: {}", t.shape);
}

val input = randn([32, 784]);
print_shape(&input);  // borrow, don't move
print_shape(&input);  // still valid — input wasn't moved
```
Mutable Borrows (&mut T)
Only one mutable borrow is allowed at a time, and no immutable borrows may coexist with it:
```axon
fn scale(t: &mut Tensor<Float32, [3, 3]>, factor: Float32) {
    // modify tensor in place
}

var weights = randn([3, 3]);
scale(&mut weights, 2.0);
println("{}", weights);  // OK — mutable borrow has ended
```
Borrow Conflicts
The compiler rejects overlapping mutable and immutable borrows:
```axon
var data = randn([10]);
val r1 = &data;
val r2 = &mut data;  // ERROR[E4003]: cannot borrow `data` as mutable
                     //               because it is also borrowed as immutable
println("{}", r1);
```
Lifetimes
Lifetimes ensure that references never outlive the data they point to. In most cases, the compiler infers lifetimes automatically:
```axon
fn first_element(v: &Vec<Int32>): &Int32 {
    &v[0]  // lifetime of return value tied to lifetime of `v`
}
```
When the compiler needs help, you annotate lifetimes explicitly:
```axon
fn longest<'a>(a: &'a String, b: &'a String): &'a String {
    if a.len() > b.len() { a } else { b }
}
```
Dangling Reference Prevention
```axon
fn dangling(): &String {
    val s = "hello".to_string();
    &s  // ERROR[E4005]: `s` does not live long enough
}       // `s` is dropped here
```
Copy Types vs Move Types
Some small, stack-allocated types implement the Copy trait and are copied instead of moved:
| Copy Types | Move Types |
|---|---|
| `Int8` through `Int64` | `String` |
| `UInt8` through `UInt64` | `Vec<T>` |
| `Float16` through `Float64` | `Tensor<T, S>` |
| `Bool`, `Char` | `HashMap<K, V>` |
| Tuples of Copy types | Models (by default) |
```axon
val a: Int32 = 42;
val b = a;  // copy — both a and b are valid
println("{} {}", a, b);

val s = "hello".to_string();
val t = s;  // move — only t is valid
// println("{}", s);  // ERROR
```
Making Models Copyable
Derive Copy and Clone for small value types:
```axon
model Color: Copy, Clone {
    r: UInt8,
    g: UInt8,
    b: UInt8,
}

val red = Color { r: 255, g: 0, b: 0 };
val also_red = red;    // copy, not move
println("{}", red.r);  // OK
```
Tensor Device-Aware Borrowing
Tensors carry device information (@cpu / @gpu), and the borrow checker enforces device-safety rules:
Rule: No Cross-Device Aliasing
A tensor on the GPU cannot be mutably borrowed while a CPU reference exists:
```axon
var t = randn([256, 256]);
val cpu_ref = &t;
val gpu_t = t.to_gpu();  // ERROR[E4007]: cannot move `t` to GPU while
                         //               borrowed on CPU
```
Device Transfer is a Move
Transferring a tensor between devices moves it:
```axon
val cpu_data = randn([1024]);
val gpu_data = cpu_data.to_gpu();  // cpu_data is moved
// println("{}", cpu_data);  // ERROR: use of moved value

val result = gpu_data.to_cpu();  // gpu_data is moved back
println("{}", result);
```
Safe Pattern: Borrow, Then Transfer
```axon
var data = randn([256, 256]);

// Phase 1: work on CPU
val norm = data.mean();
println("Mean: {}", norm);

// Phase 2: transfer to GPU (no outstanding borrows)
val gpu_data = data.to_gpu();
val result = gpu_data @ gpu_data;
```
Ownership in Practice: Training Loop
A real-world example combining ownership patterns:
```axon
model Trainer {
    model: NeuralNet,
    optimizer: Adam,
}

extend Trainer {
    fn train_epoch(&mut self, data: &DataLoader): Float32 {
        var total_loss = 0.0;
        for batch in data {
            val (inputs, targets) = batch;

            // model borrowed mutably through self
            val predictions = self.model.forward(inputs);
            val loss = cross_entropy(predictions, targets);
            total_loss += loss.item();

            loss.backward();
            self.optimizer.step();
            self.optimizer.zero_grad();
        }
        total_loss / data.len() as Float32
    }
}
```
Key ownership points:
- `&mut self` — the trainer exclusively owns the model during training
- `data: &DataLoader` — data is borrowed immutably (read-only)
- `loss.backward()` consumes gradient information (move semantics on graph nodes)
- No data races are possible — the type system guarantees it
Summary
| Concept | Rule |
|---|---|
| Ownership | Each value has exactly one owner |
| Move | Assignment transfers ownership (non-Copy types) |
| Copy | Small primitives are implicitly copied |
| `&T` | Immutable borrow — multiple allowed |
| `&mut T` | Mutable borrow — exclusive access |
| Lifetimes | References cannot outlive their referent |
| Device safety | Cross-device aliasing is forbidden |
See Also
Error Handling
Axon uses algebraic types for error handling — no exceptions, no null pointers. Every possible failure is encoded in the type system.
Option&lt;T&gt;
Option<T> represents a value that may or may not exist:
```axon
enum Option<T> {
    Some(T),
    None,
}
```
Using Option
```axon
fn find_index(items: &Vec<String>, target: &String): Option<Int32> {
    for i in 0..items.len() {
        if items[i] == target {
            return Some(i);
        }
    }
    None
}

fn main() {
    val names = vec!["Alice", "Bob", "Charlie"];
    match find_index(&names, "Bob") {
        Some(idx) => println("Found at index {}", idx),
        None => println("Not found"),
    }
}
```
Option Methods
```axon
val opt: Option<Int32> = Some(42);

// Unwrap (panics if None)
val x = opt.unwrap();  // 42

// Unwrap with default
val y = opt.unwrap_or(0);   // 42
val z = None.unwrap_or(0);  // 0

// Map: transform the inner value
val doubled = opt.map(|x| x * 2);  // Some(84)

// is_some / is_none
if opt.is_some() {
    println("Has a value");
}

// and_then: chain optional operations
val result = opt
    .map(|x| x + 1)
    .and_then(|x| if x > 0 { Some(x) } else { None });
```
Result&lt;T, E&gt;
Result<T, E> represents an operation that can succeed (Ok) or fail (Err):
```axon
enum Result<T, E> {
    Ok(T),
    Err(E),
}
```
Using Result
```axon
fn parse_int(s: &String): Result<Int32, String> {
    // parsing logic...
    if valid {
        Ok(parsed_value)
    } else {
        Err("invalid integer: " + s)
    }
}

fn read_config(path: String): Result<Config, IOError> {
    val file = File.open(path)?;      // propagate error
    val contents = file.read_all()?;  // propagate error
    val config = parse_toml(contents)?;
    Ok(config)
}
```
Result Methods
```axon
val ok: Result<Int32, String> = Ok(42);
val err: Result<Int32, String> = Err("oops");

// Unwrap (panics on Err)
val x = ok.unwrap();  // 42

// Unwrap with default
val y = err.unwrap_or(0);  // 0

// Map the success value
val doubled = ok.map(|x| x * 2);  // Ok(84)

// Map the error
val mapped_err = err.map_err(|e| IOError.new(e));

// is_ok / is_err
if ok.is_ok() {
    println("Success!");
}

// and_then: chain fallible operations
val result = ok
    .and_then(|x| if x > 0 { Ok(x) } else { Err("negative") });
```
Pattern Matching on Errors
Pattern matching is the primary way to handle errors:
```axon
fn process_file(path: String) {
    match File.open(path) {
        Ok(file) => {
            match file.read_all() {
                Ok(data) => println("Read {} bytes", data.len()),
                Err(e) => eprintln("Read error: {}", e),
            }
        }
        Err(e) => eprintln("Open error: {}", e),
    }
}
```
Matching Specific Error Types
```axon
match load_model("model.axon") {
    Ok(model) => {
        println("Model loaded: {} parameters", model.param_count());
    }
    Err(IOError.NotFound(path)) => {
        eprintln("File not found: {}", path);
    }
    Err(IOError.PermissionDenied(path)) => {
        eprintln("Permission denied: {}", path);
    }
    Err(e) => {
        eprintln("Unexpected error: {}", e);
    }
}
```
The ? Operator
The ? operator propagates errors to the caller, reducing boilerplate:
```axon
// Without ?
fn load_data(path: String): Result<Vec<Float32>, IOError> {
    val file = match File.open(path) {
        Ok(f) => f,
        Err(e) => return Err(e),
    };
    val contents = match file.read_all() {
        Ok(c) => c,
        Err(e) => return Err(e),
    };
    parse_csv(contents)
}

// With ? — equivalent but cleaner
fn load_data(path: String): Result<Vec<Float32>, IOError> {
    val file = File.open(path)?;
    val contents = file.read_all()?;
    parse_csv(contents)
}
```
The `?` operator:
- If the value is `Ok(v)`, unwraps to `v`
- If the value is `Err(e)`, returns `Err(e)` from the enclosing function
- Works on `Option<T>` too — `None` propagates as `None`
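The `Option<T>` case from the last bullet, sketched with a hypothetical `first()` helper that returns an `Option`:

```axon
fn first_even(v: &Vec<Int32>): Option<Int32> {
    val first = v.first()?;  // returns None immediately if `v` is empty
    if first % 2 == 0 { Some(first) } else { None }
}
```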
Chaining with ?
```axon
fn pipeline(path: String): Result<Model, Error> {
    val config = load_config(path)?;
    val data = load_dataset(&config.data_path)?;
    val model = build_model(&config)?;
    val trained = train(model, data)?;
    Ok(trained)
}
```
Panic vs Recoverable Errors
Recoverable Errors
Use Result<T, E> for expected failure modes:
```axon
fn connect(host: String): Result<Connection, NetworkError> {
    // network errors are expected — caller decides what to do
}
```
Panics
Use panic for unrecoverable programmer errors:
```axon
fn get_element(v: &Vec<Int32>, idx: Int32): Int32 {
    if idx < 0 || idx >= v.len() {
        panic("index out of bounds: {} (len: {})", idx, v.len());
    }
    v[idx]
}
```
Panics terminate the program with a stack trace. Use them for:
- Logic errors / violated invariants
- `unwrap()` on `None` or `Err` when failure is truly unexpected
- Debug assertions
Guidelines
| Situation | Use |
|---|---|
| File not found | `Result<T, IOError>` |
| Network timeout | `Result<T, NetworkError>` |
| Parse failure | `Result<T, ParseError>` |
| Index out of bounds | `panic` |
| Division by zero | `panic` |
| Unimplemented code | `panic("not implemented")` |
Defining Custom Error Types
```axon
enum ModelError {
    LoadFailed(String),
    ShapeMismatch { expected: Vec<Int32>, actual: Vec<Int32> },
    TrainingDiverged,
}

extend Display for ModelError {
    fn to_string(&self): String {
        match self {
            ModelError.LoadFailed(path) => format("failed to load: {}", path),
            ModelError.ShapeMismatch { expected, actual } =>
                format("shape mismatch: expected {:?}, got {:?}", expected, actual),
            ModelError.TrainingDiverged => "training diverged (loss = NaN)".to_string(),
        }
    }
}

fn load_and_train(path: String): Result<Model, ModelError> {
    val model = load_model(path).map_err(|e| ModelError.LoadFailed(e.to_string()))?;
    train(model)
}
```
Error Handling in ML Code
A realistic training function with comprehensive error handling:
```axon
fn train_model(config: &TrainConfig): Result<Model, ModelError> {
    val data = DataLoader.from_csv(&config.data_path)
        .map_err(|e| ModelError.LoadFailed(e.to_string()))?;

    var model = NeuralNet.new(config.hidden_size);
    var optimizer = Adam.new(model.parameters(), lr: config.learning_rate);

    for epoch in 0..config.epochs {
        var epoch_loss = 0.0;
        for batch in &data {
            val (inputs, targets) = batch;
            val predictions = model.forward(inputs);
            val loss = cross_entropy(predictions, targets);

            // Check for divergence
            if loss.item().is_nan() {
                return Err(ModelError.TrainingDiverged);
            }

            epoch_loss += loss.item();
            loss.backward();
            optimizer.step();
            optimizer.zero_grad();
        }
        println("Epoch {}: loss = {:.4}", epoch, epoch_loss / data.len() as Float32);
    }
    Ok(model)
}
```
Summary
| Concept | Type | Use Case |
|---|---|---|
| Missing value | `Option<T>` | Lookup, search, optional fields |
| Fallible operation | `Result<T, E>` | I/O, parsing, network |
| Error propagation | `?` | Clean chaining of fallible calls |
| Unrecoverable | `panic(...)` | Logic errors, invariant violations |
| Pattern matching | `match` | Exhaustive error handling |
See Also
Modules and Packages
Axon provides a hierarchical module system for organizing code and a built-in package manager for dependency management.
Modules
Declaring Modules
Use mod to define a module inline:
```axon
mod math {
    pub fn square(x: Float64): Float64 {
        x * x
    }

    pub fn cube(x: Float64): Float64 {
        x * x * x
    }

    fn helper() {
        // private — not visible outside `math`
    }
}

fn main() {
    println("{}", math.square(5.0));  // 25.0
    println("{}", math.cube(3.0));    // 27.0
    // math.helper();  // ERROR: `helper` is private
}
```
File-Based Modules
For larger projects, modules map to files:
```
my_project/
├── Axon.toml
└── src/
    ├── main.axon           # crate root
    ├── model.axon          # mod model
    ├── data/
    │   ├── mod.axon        # mod data (directory module)
    │   ├── loader.axon     # mod data.loader
    │   └── transform.axon  # mod data.transform
    └── utils.axon          # mod utils
```
In src/main.axon:
```axon
mod model;
mod data;
mod utils;

fn main() {
    val net = model.build_network();
    val loader = data.loader.DataLoader.new("train.csv");
    utils.log("Training started");
}
```
In src/data/mod.axon:
```axon
pub mod loader;
pub mod transform;
```
Visibility (pub)
Items are private by default. Use pub to make them visible outside their module:
```axon
mod network {
    pub model Layer {
        pub size: Int32,        // public field
        weights: Vec<Float32>,  // private field
    }

    pub fn new_layer(size: Int32): Layer {
        Layer {
            size,
            weights: Vec.new(),
        }
    }

    extend Layer {
        pub fn forward(&self, input: &Vec<Float32>): Vec<Float32> {
            // public method
        }

        fn init_weights(&mut self) {
            // private method — internal use only
        }
    }
}
```
Visibility Rules
| Declaration | Visible To |
|---|---|
| `fn foo()` | Current module only |
| `pub fn foo()` | Parent module and beyond |
| `pub model Foo` | Public type, fields default to private |
| `pub field: T` | Public field on a public model |
Importing with use
Bring items into scope with use:
```axon
use std.collections.HashMap;
use std.io.{File, Read, Write};

fn main() {
    var map = HashMap.new();
    map.insert("key", 42);
}
```
```axon
// Absolute path
use std.math.sin;

// Nested imports
use std.collections.{Vec, HashMap, HashSet};

// Wildcard import (use sparingly)
use std.prelude.*;

// Aliased import
use std.collections.HashMap as Map;
```
Re-exports
Modules can re-export items for a cleaner public API:
```axon
mod internal {
    pub fn core_function(): Int32 { 42 }
}

// Re-export so users see `my_lib.core_function`
pub use internal.core_function;
```
The Axon.toml Manifest
Every Axon project has an Axon.toml at its root:
```toml
[package]
name = "my_ml_project"
version = "0.2.1"
edition = "2026"
authors = ["Jane Doe <jane@example.com>"]
description = "A neural network toolkit"
license = "MIT"
repository = "https://github.com/jane/my_ml_project"

[dependencies]
axon-vision = "0.3.0"
axon-nlp = { version = "0.1.0", features = ["transformers"] }
axon-data = { git = "https://github.com/axon-lang/axon-data.git", branch = "main" }

[dev-dependencies]
axon-test = "0.1.0"

[build]
opt-level = 2
gpu = "cuda"
```
Manifest Fields
| Section | Field | Description |
|---|---|---|
| `[package]` | `name` | Package name (lowercase, hyphens) |
| | `version` | Semantic version (MAJOR.MINOR.PATCH) |
| | `edition` | Language edition year |
| | `authors` | List of authors |
| | `description` | One-line description |
| | `license` | SPDX license identifier |
| `[dependencies]` | `name = "ver"` | Registry dependency |
| | `name = { git = "..." }` | Git dependency |
| | `name = { path = "..." }` | Local path dependency |
| `[dev-dependencies]` | | Dependencies for tests only |
| `[build]` | `opt-level` | Default optimization level |
| | `gpu` | Default GPU target |
Dependencies
Adding Dependencies
```bash
# From registry
axonc pkg add axon-vision
axonc pkg add axon-nlp --version 0.2.0

# Remove a dependency
axonc pkg remove axon-vision
```
Using Dependencies
After adding a dependency, import it like any module:
```axon
use axon_vision.transforms.{resize, normalize};
use axon_nlp.tokenizer.BPETokenizer;

fn preprocess(image: Tensor<Float32, [?, ?, 3]>): Tensor<Float32, [224, 224, 3]> {
    val resized = resize(image, [224, 224]);
    normalize(resized, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225])
}
```
Lock File
Running axonc pkg build generates an Axon.lock file that pins exact dependency versions for reproducible builds. Commit this file to source control.
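For illustration only, a pinned entry might look roughly like this (the actual Axon.lock schema is not specified here):

```toml
# Illustrative sketch, not the real lock-file schema
[[package]]
name = "axon-vision"
version = "0.3.0"
```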
Package Manager Commands
| Command | Description |
|---|---|
| `axonc pkg new <name>` | Create a new project |
| `axonc pkg init` | Initialize in the current directory |
| `axonc pkg build` | Build the project |
| `axonc pkg run` | Build and run |
| `axonc pkg test` | Run tests |
| `axonc pkg add <pkg>` | Add a dependency |
| `axonc pkg remove <pkg>` | Remove a dependency |
| `axonc pkg clean` | Clean build artifacts |
| `axonc pkg fmt` | Format all source files |
| `axonc pkg lint` | Lint all source files |
Standard Library Modules
Axon ships with a comprehensive standard library:
| Module | Contents |
|---|---|
| `std.prelude` | Auto-imported basics (println, Clone, Copy, Display, etc.) |
| `std.collections` | Vec, HashMap, HashSet, Option, Result |
| `std.string` | String with UTF-8 operations |
| `std.io` | File, Read, Write, formatting |
| `std.math` | Trigonometry, logarithms, constants |
| `std.tensor` | Tensor creation, shape ops, reductions, linalg |
| `std.nn` | Neural network layers (Linear, Conv2d, LSTM, Transformer) |
| `std.autograd` | Automatic differentiation |
| `std.optim` | Optimizers (SGD, Adam, AdamW) + LR schedulers |
| `std.loss` | Loss functions (CrossEntropy, MSE, BCE) |
| `std.data` | DataLoader, CSV/JSON loading |
| `std.metrics` | Accuracy, precision, recall, F1, ROC-AUC |
| `std.transforms` | Image and text preprocessing |
| `std.sync` | Mutex, RwLock, Arc, Channel |
| `std.thread` | spawn, JoinHandle |
| `std.device` | Device abstraction, GPU query |
| `std.random` | Random number generation |
| `std.ops` | Operator traits (Add, Mul, MatMul, Index) |
| `std.convert` | From, Into, TryFrom, TryInto |
Project Organization Best Practices
my_ml_project/
├── Axon.toml
├── Axon.lock
├── src/
│ ├── main.axon # entry point
│ ├── lib.axon # library root (if building a library)
│ ├── model/
│ │ ├── mod.axon
│ │ ├── encoder.axon
│ │ └── decoder.axon
│ ├── data/
│ │ ├── mod.axon
│ │ └── preprocessing.axon
│ └── utils.axon
├── tests/
│ ├── test_model.axon
│ └── test_data.axon
├── benches/
│ └── bench_model.axon
└── examples/
└── inference.axon
GPU Programming
Axon provides first-class GPU support through device annotations, automatic kernel compilation, and device-aware tensor operations. Write GPU code in Axon — no CUDA C required.
Device Annotations
@gpu Functions
Mark a function for GPU execution:
axon@gpu
fn vector_add(a: Tensor<Float32, [1024]>, b: Tensor<Float32, [1024]>): Tensor<Float32, [1024]> {
a + b
}
The Axon compiler lowers @gpu functions through the MLIR backend to produce optimized GPU kernels for the target platform (CUDA, ROCm, or Vulkan).
@cpu Functions
Explicitly mark a function for CPU-only execution:
axon@cpu
fn save_results(data: Tensor<Float32, [?, 10]>, path: String) {
val file = File.create(path);
file.write(data);
}
@device Annotation
Specify a device target explicitly:
axon@device("cuda:0")
fn forward_gpu0(x: Tensor<Float32, [?, 784]>): Tensor<Float32, [?, 10]> {
// executes on CUDA device 0
val w = randn([784, 10]);
x @ w
}
Device Transfer
Tensors are transferred between devices with .to_gpu() and .to_cpu():
axonfn gpu_example() {
// Create on CPU
val cpu_tensor = randn([1024, 1024]);
// Transfer to GPU
val gpu_tensor = cpu_tensor.to_gpu();
// Compute on GPU — fast!
val result = gpu_tensor @ gpu_tensor;
// Transfer back to CPU for I/O
val cpu_result = result.to_cpu();
println("{}", cpu_result);
}
Transfer is a Move
Device transfer follows ownership rules — the source tensor is consumed:
axonval data = randn([256, 256]);
val gpu_data = data.to_gpu(); // data is moved
// println("{}", data); // ERROR[E4001]: use of moved value `data`
To keep a CPU copy, clone first:
axonval data = randn([256, 256]);
val backup = data.clone();
val gpu_data = data.to_gpu();
println("{}", backup); // OK — backup is a separate copy
Tensor Device Placement
Tensors track their device in the type system. Operations between tensors on different devices are compile-time errors:
axonval cpu_a = randn([100]);
val gpu_b = randn([100]).to_gpu();
// val c = cpu_a + gpu_b; // ERROR: device mismatch — cpu and gpu tensors
Creating Tensors Directly on GPU
axon@gpu
fn init_weights(): Tensor<Float32, [784, 256]> {
randn([784, 256]) // created directly on GPU — no transfer needed
}
GPU Kernel Compilation
When you compile with --gpu, Axon compiles @gpu functions into GPU kernels:
bash# Compile for NVIDIA GPUs
axonc build model.axon --gpu cuda -O 3
# Compile for AMD GPUs
axonc build model.axon --gpu rocm -O 3
# Compile for Vulkan (cross-platform)
axonc build model.axon --gpu vulkan -O 3
How It Works
- Frontend: Axon source → AST → Typed AST (same for all targets)
- MIR: Typed AST → Mid-level IR with device annotations
- MLIR: GPU-annotated MIR → MLIR dialects (GPU, Linalg, Tensor)
- Lowering: MLIR → NVVM (CUDA) / ROCDL (ROCm) / SPIR-V (Vulkan)
- Linking: Host code + GPU kernels → single binary
Optimization Pipeline
Axon applies GPU-specific optimizations:
- Kernel fusion — combine adjacent operations into single kernels
- Memory coalescing — optimize memory access patterns
- Shared memory tiling — tile matrix multiplications for cache efficiency
- Async transfers — overlap computation with host↔device transfers
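To see why kernel fusion helps, here is a CPU-side sketch in plain Python (illustrative only — the real pass operates on MLIR, not Python lists): two elementwise operations executed as separate passes versus one fused pass over the data.

```python
def unfused(xs):
    # Two "kernels": each makes a full pass and materializes an intermediate buffer
    tmp = [x * 2.0 for x in xs]       # kernel 1: scale
    return [t + 1.0 for t in tmp]     # kernel 2: shift

def fused(xs):
    # One "kernel": a single pass, no intermediate buffer
    return [x * 2.0 + 1.0 for x in xs]

data = [1.0, 2.0, 3.0]
assert unfused(data) == fused(data)   # same result, half the memory traffic
```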
Multi-GPU Programming
Selecting a Device
axonuse std.device.{Device, cuda};
fn main() {
val dev0 = cuda(0); // first GPU
val dev1 = cuda(1); // second GPU
val a = randn([1024, 1024]).to_device(dev0);
val b = randn([1024, 1024]).to_device(dev1);
}
Data Parallelism
Split batches across GPUs:
axonfn train_multi_gpu(model: &mut NeuralNet, data: &DataLoader) {
val devices = [cuda(0), cuda(1)];
for batch in data {
val (inputs, targets) = batch;
// Split the batch — and its labels — across devices
val input_chunks = inputs.chunk(devices.len(), dim: 0);
val target_chunks = targets.chunk(devices.len(), dim: 0);
var losses = Vec.new();
for i in 0..devices.len() {
val chunk = input_chunks[i].to_device(devices[i]);
val target = target_chunks[i].to_device(devices[i]);
val pred = model.forward(chunk);
val loss = cross_entropy(pred, target);
losses.push(loss);
}
// Sum the per-device losses and backpropagate
val total_loss = losses.sum();
total_loss.backward();
}
}
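The chunking step splits dim 0 into near-equal contiguous pieces. The index arithmetic behind such a `chunk` operation can be sketched in plain Python (a reference sketch, not the Axon implementation):

```python
def chunk(items, n):
    """Split a list into n near-equal contiguous chunks (like chunking dim 0)."""
    base, extra = divmod(len(items), n)
    out, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)  # early chunks absorb the remainder
        out.append(items[start:start + size])
        start += size
    return out

batch = list(range(10))
print(chunk(batch, 2))  # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
```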
Device Query
axonuse std.device;
fn main() {
val count = device.gpu_count();
println("Available GPUs: {}", count);
for i in 0..count {
val dev = device.cuda(i);
println(" GPU {}: {} ({}MB)", i, dev.name(), dev.memory_mb());
}
}
Complete Example: GPU Matrix Multiplication
axonuse std.device.cuda;
@gpu
fn matmul_gpu(
a: Tensor<Float32, [?, ?]>,
b: Tensor<Float32, [?, ?]>,
): Tensor<Float32, [?, ?]> {
a @ b
}
fn main() {
val size = 2048;
// Create tensors on CPU
val a = randn([size, size]);
val b = randn([size, size]);
// Transfer to GPU
val ga = a.to_gpu();
val gb = b.to_gpu();
// GPU matrix multiply
val gc = matmul_gpu(ga, gb);
// Get result
val c = gc.to_cpu();
println("Result shape: {}", c.shape);
println("Result[0][0]: {}", c[0][0]);
}
Compile and run:
bashaxonc build matmul.axon --gpu cuda -O 3 -o matmul
./matmul
# Result shape: [2048, 2048]
# Result[0][0]: 12.3456
Best Practices
- Minimize transfers — keep data on GPU as long as possible
- Batch operations — GPU shines with large, parallel workloads
- Use @gpu functions — let the compiler handle kernel generation
- Profile first — not everything benefits from GPU acceleration
- Clone before transfer if you need the CPU copy
Tutorial 1: Hello, Tensor!
In this tutorial you'll create, manipulate, and inspect tensors — the fundamental data type in Axon.
Prerequisites: Axon installed (Getting Started)
Step 1: Create a Project
bashaxonc pkg new hello_tensor
cd hello_tensor
Step 2: Your First Tensor
Open src/main.axon and replace its contents:
axonfn main() {
// Create a 1D tensor from values
val numbers = Tensor.from_vec([1.0, 2.0, 3.0, 4.0, 5.0], [5]);
println("Numbers: {}", numbers);
println("Shape: {}", numbers.shape);
println("Sum: {}", numbers.sum());
println("Mean: {}", numbers.mean());
}
Run it:
bashaxonc pkg run
# Numbers: [1.0, 2.0, 3.0, 4.0, 5.0]
# Shape: [5]
# Sum: 15.0
# Mean: 3.0
Step 3: Creating Tensors
Axon offers several tensor constructors:
axonfn main() {
// Zeros and ones
val z: Tensor<Float32, [2, 3]> = zeros([2, 3]);
println("Zeros:\n{}", z);
val o: Tensor<Float32, [3]> = ones([3]);
println("Ones: {}", o);
// Random tensors
val r = randn([2, 2]); // normal distribution
println("Random:\n{}", r);
// Range
val seq = arange(0, 5);
println("Range: {}", seq); // [0, 1, 2, 3, 4]
// Identity matrix
val eye = Tensor.eye(3);
println("Identity:\n{}", eye);
}
Step 4: Arithmetic Operations
Tensors support element-wise arithmetic:
axonfn main() {
val a = Tensor.from_vec([1.0, 2.0, 3.0], [3]);
val b = Tensor.from_vec([4.0, 5.0, 6.0], [3]);
println("a + b = {}", a + b); // [5.0, 7.0, 9.0]
println("a * b = {}", a * b); // [4.0, 10.0, 18.0]
println("a * 2 = {}", a * 2.0); // [2.0, 4.0, 6.0]
// Math functions
val x = Tensor.from_vec([0.0, 1.5708, 3.1416], [3]);
println("sin(x) = {}", x.sin());
println("exp(x) = {}", x.exp());
}
Step 5: Matrix Multiplication
The @ operator performs matrix multiplication:
axonfn main() {
val A: Tensor<Float32, [2, 3]> = Tensor.from_vec(
[1.0, 2.0, 3.0,
4.0, 5.0, 6.0], [2, 3]
);
val B: Tensor<Float32, [3, 2]> = Tensor.from_vec(
[7.0, 8.0,
9.0, 10.0,
11.0, 12.0], [3, 2]
);
val C = A @ B; // [2, 3] @ [3, 2] → [2, 2]
println("A @ B =\n{}", C);
println("Shape: {}", C.shape);
// A @ B =
// [[58.0, 64.0],
// [139.0, 154.0]]
}
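You can verify the expected output by hand. The same matrix product in plain Python:

```python
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
B = [[7.0, 8.0],
     [9.0, 10.0],
     [11.0, 12.0]]

# C[i][j] = sum over k of A[i][k] * B[k][j]
C = [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(2)]
     for i in range(2)]
print(C)  # [[58.0, 64.0], [139.0, 154.0]]
```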
Step 6: Reshaping and Transposing
axonfn main() {
val t: Tensor<Float32, [2, 6]> = arange(0, 12).reshape([2, 6]);
println("Original [2, 6]:\n{}", t);
// Reshape
val r = t.reshape([3, 4]);
println("Reshaped [3, 4]:\n{}", r);
val flat = t.reshape([12]);
println("Flat: {}", flat);
// Transpose
val m: Tensor<Float32, [2, 3]> = Tensor.from_vec(
[1.0, 2.0, 3.0,
4.0, 5.0, 6.0], [2, 3]
);
val mt = m.transpose();
println("Transposed [3, 2]:\n{}", mt);
}
Step 7: Reductions
axonfn main() {
val data: Tensor<Float32, [3, 4]> = Tensor.from_vec(
[1.0, 2.0, 3.0, 4.0,
5.0, 6.0, 7.0, 8.0,
9.0, 10.0, 11.0, 12.0], [3, 4]
);
println("Sum (all): {}", data.sum()); // 78.0
println("Mean (all): {}", data.mean()); // 6.5
println("Max (all): {}", data.max()); // 12.0
println("Sum (dim 0): {}", data.sum(dim: 0)); // [15.0, 18.0, 21.0, 24.0]
println("Sum (dim 1): {}", data.sum(dim: 1)); // [10.0, 26.0, 42.0]
}
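The per-dimension results follow directly from which axis is collapsed; the same reductions in plain Python:

```python
data = [[1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0]]

total = sum(sum(row) for row in data)
dim0 = [sum(row[j] for row in data) for j in range(4)]  # collapse rows (dim 0)
dim1 = [sum(row) for row in data]                        # collapse columns (dim 1)
print(total)  # 78.0
print(dim0)   # [15.0, 18.0, 21.0, 24.0]
print(dim1)   # [10.0, 26.0, 42.0]
```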
Step 8: Putting It All Together
A small program that normalizes a dataset:
axonfn normalize(data: Tensor<Float32, [?, ?]>): Tensor<Float32, [?, ?]> {
val mean = data.mean(dim: 0);
val std = data.std(dim: 0);
(data - mean) / std
}
fn main() {
// Simulate a dataset: 100 samples, 4 features
val dataset = randn([100, 4]);
println("Before normalization:");
println(" Mean: {}", dataset.mean(dim: 0));
val normed = normalize(dataset);
println("After normalization:");
println(" Mean: {}", normed.mean(dim: 0)); // ≈ [0, 0, 0, 0]
println(" Std: {}", normed.std(dim: 0)); // ≈ [1, 1, 1, 1]
}
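The normalization math is the same in any language. A single-feature reference check in plain Python, confirming that subtracting the mean and dividing by the standard deviation yields mean ≈ 0 and std ≈ 1:

```python
import math, random

random.seed(0)
# 100 samples drawn from N(5, 2), so the shift and scale are visible
data = [random.gauss(5.0, 2.0) for _ in range(100)]

mean = sum(data) / len(data)
std = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))
normed = [(x - mean) / std for x in data]

new_mean = sum(normed) / len(normed)
new_std = math.sqrt(sum((x - new_mean) ** 2 for x in normed) / len(normed))
print(f"mean ~ {new_mean:.6f}, std ~ {new_std:.6f}")  # approximately 0 and 1
```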
What You Learned
- Creating tensors with from_vec, zeros, ones, randn, arange
- Element-wise operations (+, -, *, /)
- Matrix multiplication with @
- Reshaping and transposing
- Reduction operations (sum, mean, max)
- Compile-time shape checking
Tutorial 2: Linear Regression
Build a simple linear regression model from scratch using tensors and autograd. This tutorial demonstrates Axon's automatic differentiation engine.
Prerequisites: Tutorial 1: Hello, Tensor!
The Problem
We'll fit a line y = wx + b to synthetic data using gradient descent.
Step 1: Generate Synthetic Data
axonfn generate_data(n: Int32): (Tensor<Float32, [?, 1]>, Tensor<Float32, [?, 1]>) {
// True parameters: y = 3.5x + 2.0 + noise
val x = randn([n, 1]);
val noise = randn([n, 1]) * 0.3;
val y = x * 3.5 + 2.0 + noise;
(x, y)
}
Step 2: Define the Model
Linear regression is a single linear transformation:
axonmodel LinearRegression {
weight: Tensor<Float32, [1, 1]>,
bias: Tensor<Float32, [1]>,
}
extend LinearRegression {
fn new(): LinearRegression {
LinearRegression {
weight: randn([1, 1]),
bias: zeros([1]),
}
}
fn forward(&self, x: Tensor<Float32, [?, 1]>): Tensor<Float32, [?, 1]> {
x @ self.weight + self.bias
}
}
Step 3: Define the Loss Function
Mean Squared Error — the standard loss for regression:
axonfn mse_loss(
predictions: Tensor<Float32, [?, 1]>,
targets: Tensor<Float32, [?, 1]>,
): Tensor<Float32, []> {
val diff = predictions - targets;
(diff * diff).mean()
}
Step 4: Training with Autograd
Now we use Axon's autograd to compute gradients and update parameters:
axonuse std.autograd.GradTensor;
use std.optim.SGD;
fn train() {
// Generate training data
val (x_train, y_train) = generate_data(200);
// Initialize model
var net = LinearRegression.new();
// Optimizer: SGD with learning rate 0.01
var optimizer = SGD.new(
[&net.weight, &net.bias],
lr: 0.01,
);
// Training loop
for epoch in 0..100 {
// Forward pass
val predictions = net.forward(x_train);
// Compute loss
val loss = mse_loss(predictions, y_train);
// Backward pass — compute gradients
loss.backward();
// Update parameters
optimizer.step();
optimizer.zero_grad();
if epoch % 10 == 0 {
println("Epoch {}: loss = {:.4}", epoch, loss.item());
}
}
// Print learned parameters
println("Learned weight: {:.4} (true: 3.5)", net.weight.item());
println("Learned bias: {:.4} (true: 2.0)", net.bias.item());
}
Step 5: Evaluate the Model
axonfn evaluate(net: &LinearRegression) {
// Generate test data
val (x_test, y_test) = generate_data(50);
// Predict
val predictions = net.forward(x_test);
// Compute test loss
val test_loss = mse_loss(predictions, y_test);
println("Test MSE: {:.4}", test_loss.item());
// Print a few predictions
println("\nSample predictions:");
println(" x | predicted | actual");
println(" --------|-----------|-------");
for i in 0..5 {
println(" {:.4} | {:.4} | {:.4}",
x_test[i].item(),
predictions[i].item(),
y_test[i].item()
);
}
}
Step 6: Full Program
axonuse std.autograd.GradTensor;
use std.optim.SGD;
model LinearRegression {
weight: Tensor<Float32, [1, 1]>,
bias: Tensor<Float32, [1]>,
}
extend LinearRegression {
fn new(): LinearRegression {
LinearRegression {
weight: randn([1, 1]),
bias: zeros([1]),
}
}
fn forward(&self, x: Tensor<Float32, [?, 1]>): Tensor<Float32, [?, 1]> {
x @ self.weight + self.bias
}
}
fn mse_loss(
predictions: Tensor<Float32, [?, 1]>,
targets: Tensor<Float32, [?, 1]>,
): Tensor<Float32, []> {
val diff = predictions - targets;
(diff * diff).mean()
}
fn main() {
println("=== Linear Regression in Axon ===\n");
// Data
val (x_train, y_train) = generate_data(200);
// Model
var net = LinearRegression.new();
var optimizer = SGD.new(
[&net.weight, &net.bias],
lr: 0.01,
);
// Train
for epoch in 0..200 {
val pred = net.forward(x_train);
val loss = mse_loss(pred, y_train);
loss.backward();
optimizer.step();
optimizer.zero_grad();
if epoch % 50 == 0 {
println("Epoch {:>3}: loss = {:.6}", epoch, loss.item());
}
}
println("\nFinal parameters:");
println(" weight = {:.4} (true: 3.5)", net.weight.item());
println(" bias = {:.4} (true: 2.0)", net.bias.item());
// Evaluate
val (x_test, y_test) = generate_data(50);
val test_pred = net.forward(x_test);
val test_loss = mse_loss(test_pred, y_test);
println("\nTest MSE: {:.6}", test_loss.item());
}
fn generate_data(n: Int32): (Tensor<Float32, [?, 1]>, Tensor<Float32, [?, 1]>) {
val x = randn([n, 1]);
val noise = randn([n, 1]) * 0.3;
val y = x * 3.5 + 2.0 + noise;
(x, y)
}
Run:
bashaxonc build linear_reg.axon -O 2 -o linear_reg
./linear_reg
# === Linear Regression in Axon ===
#
# Epoch 0: loss = 14.832901
# Epoch 50: loss = 0.129384
# Epoch 100: loss = 0.091203
# Epoch 150: loss = 0.089847
#
# Final parameters:
# weight = 3.4821 (true: 3.5)
# bias = 1.9934 (true: 2.0)
#
# Test MSE: 0.092145
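The same optimization can be reproduced outside Axon. Here is a minimal plain-Python sketch of the gradient updates — the MSE gradients are written out by hand instead of using autograd, which is exactly what `loss.backward()` automates:

```python
import random

random.seed(42)
# Synthetic data: y = 3.5x + 2.0 + noise, as in the tutorial
n = 200
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [3.5 * x + 2.0 + random.gauss(0.0, 0.3) for x in xs]

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    preds = [w * x + b for x in xs]
    # Hand-derived gradients of MSE: d/dw = 2*mean((pred-y)*x), d/db = 2*mean(pred-y)
    dw = 2.0 * sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / n
    db = 2.0 * sum(p - y for p, y in zip(preds, ys)) / n
    w -= lr * dw
    b -= lr * db

print(round(w, 2), round(b, 2))  # close to the true values 3.5 and 2.0
```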
Key Concepts Covered
| Concept | Axon Feature |
|---|---|
| Model definition | model with tensor fields |
| Forward pass | @ operator, element-wise ops |
| Loss function | Tensor reductions (.mean()) |
| Backpropagation | loss.backward() |
| Parameter update | optimizer.step() |
| Gradient reset | optimizer.zero_grad() |
Tutorial 3: MNIST Classifier
Build a convolutional neural network to classify handwritten digits from the MNIST dataset using Axon's std.nn module.
Prerequisites: Tutorial 2: Linear Regression
Overview
We'll build a CNN with:
- Two convolutional layers with ReLU and max pooling
- Two fully connected layers
- Softmax output for 10 digit classes
Step 1: Data Loading
axonuse std.data.DataLoader;
use std.transforms.{normalize, to_tensor};
fn load_mnist(): (DataLoader, DataLoader) {
val train_loader = DataLoader.from_csv("data/mnist_train.csv")
.batch_size(64)
.shuffle(true)
.transform(|img| {
val tensor = to_tensor(img, [1, 28, 28]);
normalize(tensor, mean: [0.1307], std: [0.3081])
});
val test_loader = DataLoader.from_csv("data/mnist_test.csv")
.batch_size(256)
.shuffle(false)
.transform(|img| {
val tensor = to_tensor(img, [1, 28, 28]);
normalize(tensor, mean: [0.1307], std: [0.3081])
});
(train_loader, test_loader)
}
Step 2: Define the Model
axonuse std.nn.{Conv2d, Linear, MaxPool2d, Module, Sequential};
model MNISTNet {
conv1: Conv2d,
conv2: Conv2d,
pool: MaxPool2d,
fc1: Linear<3136, 128>,
fc2: Linear<128, 10>,
}
extend MNISTNet {
fn new(): MNISTNet {
MNISTNet {
conv1: Conv2d.new(in_channels: 1, out_channels: 32, kernel_size: 3, padding: 1),
conv2: Conv2d.new(in_channels: 32, out_channels: 64, kernel_size: 3, padding: 1),
pool: MaxPool2d.new(kernel_size: 2, stride: 2),
fc1: Linear.new(),
fc2: Linear.new(),
}
}
}
extend Module for MNISTNet {
fn forward(&self, x: Tensor<Float32, [?, 1, 28, 28]>): Tensor<Float32, [?, 10]> {
// Conv block 1: [?, 1, 28, 28] → [?, 32, 14, 14]
val h = self.conv1.forward(x);
val h = relu(h);
val h = self.pool.forward(h);
// Conv block 2: [?, 32, 14, 14] → [?, 64, 7, 7]
val h = self.conv2.forward(h);
val h = relu(h);
val h = self.pool.forward(h);
// Flatten: [?, 64, 7, 7] → [?, 3136]
val batch_size = h.shape[0];
val h = h.reshape([batch_size, 3136]);
// Fully connected layers
val h = relu(self.fc1.forward(h));
self.fc2.forward(h)
}
}
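The flatten size of 3136 follows from standard convolution and pooling shape arithmetic; a quick check in Python:

```python
def conv2d_out(size, kernel, padding=0, stride=1):
    # Standard convolution output-size formula
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel, stride):
    return (size - kernel) // stride + 1

h = 28
h = conv2d_out(h, kernel=3, padding=1)  # conv1: 28 -> 28 (padding preserves size)
h = pool_out(h, kernel=2, stride=2)     # pool:  28 -> 14
h = conv2d_out(h, kernel=3, padding=1)  # conv2: 14 -> 14
h = pool_out(h, kernel=2, stride=2)     # pool:  14 -> 7
flat = 64 * h * h                       # 64 channels x 7 x 7
print(flat)  # 3136
```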
Step 3: Training Loop
axonuse std.optim.Adam;
use std.loss.cross_entropy;
use std.metrics.accuracy;
fn train_epoch(
net: &mut MNISTNet,
data: &DataLoader,
optimizer: &mut Adam,
): (Float32, Float32) {
var total_loss = 0.0;
var correct = 0;
var total = 0;
for batch in data {
val (images, labels) = batch;
// Forward
val logits = net.forward(images);
val loss = cross_entropy(logits, labels);
// Track metrics
total_loss += loss.item();
val predicted = logits.argmax(dim: 1);
correct += (predicted == labels).sum().item() as Int32;
total += labels.shape[0];
// Backward
loss.backward();
optimizer.step();
optimizer.zero_grad();
}
val avg_loss = total_loss / data.num_batches() as Float32;
val acc = correct as Float32 / total as Float32;
(avg_loss, acc)
}
Step 4: Evaluation
axonfn evaluate(net: &MNISTNet, data: &DataLoader): (Float32, Float32) {
var total_loss = 0.0;
var correct = 0;
var total = 0;
for batch in data {
val (images, labels) = batch;
val logits = net.forward(images);
val loss = cross_entropy(logits, labels);
total_loss += loss.item();
val predicted = logits.argmax(dim: 1);
correct += (predicted == labels).sum().item() as Int32;
total += labels.shape[0];
}
val avg_loss = total_loss / data.num_batches() as Float32;
val acc = correct as Float32 / total as Float32;
(avg_loss, acc)
}
Step 5: Full Training Program
axonuse std.nn.{Conv2d, Linear, MaxPool2d, Module};
use std.optim.Adam;
use std.loss.cross_entropy;
use std.data.DataLoader;
use std.transforms.{normalize, to_tensor};
fn main() {
println("=== MNIST Classifier ===\n");
// Load data
val (train_loader, test_loader) = load_mnist();
println("Train: {} samples", train_loader.len());
println("Test: {} samples\n", test_loader.len());
// Create model and optimizer
var net = MNISTNet.new();
var optimizer = Adam.new(
net.parameters(),
lr: 0.001,
);
// Training
val epochs = 10;
for epoch in 0..epochs {
val (train_loss, train_acc) = train_epoch(&mut net, &train_loader, &mut optimizer);
val (test_loss, test_acc) = evaluate(&net, &test_loader);
println("Epoch {:>2}/{} | Train Loss: {:.4} Acc: {:.2}% | Test Loss: {:.4} Acc: {:.2}%",
epoch + 1, epochs,
train_loss, train_acc * 100.0,
test_loss, test_acc * 100.0,
);
}
// Final evaluation
val (_, final_acc) = evaluate(&net, &test_loader);
println("\nFinal test accuracy: {:.2}%", final_acc * 100.0);
}
Expected output:
=== MNIST Classifier ===
Train: 60000 samples
Test: 10000 samples
Epoch 1/10 | Train Loss: 0.2134 Acc: 93.41% | Test Loss: 0.0712 Acc: 97.82%
Epoch 2/10 | Train Loss: 0.0583 Acc: 98.19% | Test Loss: 0.0498 Acc: 98.41%
...
Epoch 10/10 | Train Loss: 0.0089 Acc: 99.71% | Test Loss: 0.0312 Acc: 99.12%
Final test accuracy: 99.12%
Step 6: GPU Training (Optional)
To train on GPU, simply transfer data and model:
axonfn main() {
var net = MNISTNet.new().to_gpu();
var optimizer = Adam.new(net.parameters(), lr: 0.001);
for epoch in 0..10 {
for batch in &train_loader {
val (images, labels) = batch;
val images = images.to_gpu();
val labels = labels.to_gpu();
val logits = net.forward(images);
val loss = cross_entropy(logits, labels);
loss.backward();
optimizer.step();
optimizer.zero_grad();
}
}
}
Compile with GPU support:
bashaxonc build mnist.axon --gpu cuda -O 3 -o mnist
Step 7: Save the Model
axonuse std.export.save;
// After training
save(&net, "mnist_model.axon");
println("Model saved!");
// Load later
use std.export.load;
val loaded_net: MNISTNet = load("mnist_model.axon");
Key Concepts Covered
| Concept | Axon Feature |
|---|---|
| CNN architecture | Conv2d, MaxPool2d, Linear |
| Data loading | DataLoader with transforms |
| Training loop | forward → loss → backward → step |
| Metrics | argmax, accuracy calculation |
| GPU training | .to_gpu() + --gpu cuda |
| Model saving | std.export.save / load |
Tutorial 4: Building a Transformer
Build a transformer encoder from scratch in Axon to understand self-attention, multi-head attention, and the full transformer architecture.
Prerequisites: Tutorial 3: MNIST Classifier
Architecture Overview
A transformer encoder block consists of:
- Multi-Head Self-Attention
- Layer Normalization + residual connection
- Feed-Forward Network (two linear layers with activation)
- Layer Normalization + residual connection
Step 1: Scaled Dot-Product Attention
The fundamental building block of transformers:
axonfn scaled_dot_product_attention(
query: Tensor<Float32, [?, ?, ?, ?]>, // [batch, heads, seq_len, d_k]
key: Tensor<Float32, [?, ?, ?, ?]>, // [batch, heads, seq_len, d_k]
value: Tensor<Float32, [?, ?, ?, ?]>, // [batch, heads, seq_len, d_v]
): Tensor<Float32, [?, ?, ?, ?]> {
val d_k = query.shape[3] as Float32;
// Attention scores: Q @ K^T / sqrt(d_k) — @ and transpose act on the last two dims
val scores = (query @ key.transpose()) / d_k.sqrt();
// Softmax over the last (key) dimension
val weights = softmax(scores, dim: 3);
// Weighted sum of values
weights @ value
}
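The attention math itself is small enough to trace by hand. A plain-Python reference on a tiny example (one batch, one head, seq_len = 2, d_k = 2), checking that each row of attention weights is a proper probability distribution:

```python
import math

def softmax(row):
    m = max(row)                                  # subtract max for stability
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
d_k = 2

# scores = Q @ K^T / sqrt(d_k)
scores = [[sum(q * k for q, k in zip(qr, kr)) / math.sqrt(d_k) for kr in K]
          for qr in Q]
weights = [softmax(row) for row in scores]
# output = weights @ V — each output row is a convex combination of V's rows
out = [[sum(w * V[j][c] for j, w in enumerate(wr)) for c in range(2)]
       for wr in weights]
print([round(sum(wr), 6) for wr in weights])  # each row sums to 1.0
```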
Step 2: Multi-Head Attention
Split the model dimension into multiple heads for parallel attention:
axonuse std.nn.Linear;
model MultiHeadAttention {
num_heads: Int32,
d_model: Int32,
d_k: Int32,
w_query: Linear<512, 512>,
w_key: Linear<512, 512>,
w_value: Linear<512, 512>,
w_output: Linear<512, 512>,
}
extend MultiHeadAttention {
fn new(d_model: Int32, num_heads: Int32): MultiHeadAttention {
val d_k = d_model / num_heads;
MultiHeadAttention {
num_heads,
d_model,
d_k,
w_query: Linear.new(),
w_key: Linear.new(),
w_value: Linear.new(),
w_output: Linear.new(),
}
}
fn forward(
&self,
query: Tensor<Float32, [?, ?, 512]>,
key: Tensor<Float32, [?, ?, 512]>,
value: Tensor<Float32, [?, ?, 512]>,
): Tensor<Float32, [?, ?, 512]> {
val batch_size = query.shape[0];
val seq_len = query.shape[1];
// Project Q, K, V
val q = self.w_query.forward(query);
val k = self.w_key.forward(key);
val v = self.w_value.forward(value);
// Reshape to [batch, num_heads, seq_len, d_k]
val q = q.reshape([batch_size, seq_len, self.num_heads, self.d_k])
.permute([0, 2, 1, 3]);
val k = k.reshape([batch_size, seq_len, self.num_heads, self.d_k])
.permute([0, 2, 1, 3]);
val v = v.reshape([batch_size, seq_len, self.num_heads, self.d_k])
.permute([0, 2, 1, 3]);
// Attention per head
val attn = scaled_dot_product_attention(q, k, v);
// Concatenate heads
val concat = attn.permute([0, 2, 1, 3])
.reshape([batch_size, seq_len, self.d_model]);
// Final projection
self.w_output.forward(concat)
}
}
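Per token, the reshape/permute round trip above reduces to slicing the model dimension into contiguous per-head chunks and concatenating the head outputs back. A minimal illustration in Python (d_model = 8, num_heads = 2 are illustrative values):

```python
# One token's d_model-dimensional representation
vec = list(range(8))
num_heads, d_k = 2, 4

# Split: each head sees a contiguous d_k-slice of the model dimension
heads = [vec[h * d_k:(h + 1) * d_k] for h in range(num_heads)]
# ... each head attends independently over the sequence ...
# Concat: stitching the head outputs back recovers a d_model-dim vector
merged = [x for head in heads for x in head]
assert merged == vec
print(heads)  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```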
Step 3: Feed-Forward Network
Two linear layers with GELU activation:
axonmodel FeedForward {
linear1: Linear<512, 2048>,
linear2: Linear<2048, 512>,
}
extend FeedForward {
fn new(d_model: Int32, d_ff: Int32): FeedForward {
FeedForward {
linear1: Linear.new(),
linear2: Linear.new(),
}
}
fn forward(&self, x: Tensor<Float32, [?, ?, 512]>): Tensor<Float32, [?, ?, 512]> {
val h = gelu(self.linear1.forward(x));
self.linear2.forward(h)
}
}
Step 4: The Transformer Block
Combine attention and feed-forward with residual connections and layer norm:
axonuse std.nn.LayerNorm;
model TransformerBlock {
attention: MultiHeadAttention,
feed_forward: FeedForward,
norm1: LayerNorm,
norm2: LayerNorm,
}
extend TransformerBlock {
fn new(d_model: Int32, num_heads: Int32, d_ff: Int32): TransformerBlock {
TransformerBlock {
attention: MultiHeadAttention.new(d_model, num_heads),
feed_forward: FeedForward.new(d_model, d_ff),
norm1: LayerNorm.new(d_model),
norm2: LayerNorm.new(d_model),
}
}
fn forward(&self, x: Tensor<Float32, [?, ?, 512]>): Tensor<Float32, [?, ?, 512]> {
// Self-attention + residual + norm
val attn_out = self.attention.forward(x, x, x);
val h = self.norm1.forward(x + attn_out);
// Feed-forward + residual + norm
val ff_out = self.feed_forward.forward(h);
self.norm2.forward(h + ff_out)
}
}
Step 5: Positional Encoding
Add position information since attention is permutation-invariant:
axonfn positional_encoding(seq_len: Int32, d_model: Int32): Tensor<Float32, [?, ?]> {
val pe = zeros([seq_len, d_model]);
for pos in 0..seq_len {
for i in 0..(d_model / 2) {
val angle = pos as Float32 / (10000.0).pow(2.0 * i as Float32 / d_model as Float32);
pe[pos][2 * i] = angle.sin();
pe[pos][2 * i + 1] = angle.cos();
}
}
pe
}
Step 6: The Complete Encoder
Stack multiple transformer blocks into a complete encoder:
axonuse std.nn.{Embedding, Linear, Module};
model TransformerEncoder {
embedding: Embedding,
layers: Vec<TransformerBlock>,
classifier: Linear<512, 10>,
d_model: Int32,
}
extend TransformerEncoder {
fn new(
vocab_size: Int32,
d_model: Int32,
num_heads: Int32,
num_layers: Int32,
d_ff: Int32,
num_classes: Int32,
): TransformerEncoder {
var layers = Vec.new();
for _ in 0..num_layers {
layers.push(TransformerBlock.new(d_model, num_heads, d_ff));
}
TransformerEncoder {
embedding: Embedding.new(vocab_size, d_model),
layers,
classifier: Linear.new(),
d_model,
}
}
}
extend Module for TransformerEncoder {
fn forward(&self, tokens: Tensor<Int64, [?, ?]>): Tensor<Float32, [?, 10]> {
val seq_len = tokens.shape[1];
// Token embedding + positional encoding
val x = self.embedding.forward(tokens);
val pe = positional_encoding(seq_len, self.d_model);
var h = x + pe;
// Pass through transformer blocks
for layer in &self.layers {
h = layer.forward(h);
}
// Classification: use [CLS] token (first position)
val cls = h[.., 0, ..]; // [batch, d_model]
self.classifier.forward(cls)
}
}
Step 7: Training
axonuse std.optim.AdamW;
use std.loss.cross_entropy;
fn main() {
println("=== Transformer Encoder ===\n");
// Hyperparameters
val vocab_size = 10000;
val d_model = 512;
val num_heads = 8;
val num_layers = 6;
val d_ff = 2048;
val num_classes = 10;
// Create model
var net = TransformerEncoder.new(
vocab_size, d_model, num_heads, num_layers, d_ff, num_classes,
);
var optimizer = AdamW.new(
net.parameters(),
lr: 0.0001,
weight_decay: 0.01,
);
println("Model: {} parameters", net.param_count());
println("Config: d_model={}, heads={}, layers={}\n", d_model, num_heads, num_layers);
// Training loop
val epochs = 20;
for epoch in 0..epochs {
var total_loss = 0.0;
var num_batches = 0;
for batch in &train_loader {
val (tokens, labels) = batch;
val logits = net.forward(tokens);
val loss = cross_entropy(logits, labels);
loss.backward();
optimizer.step();
optimizer.zero_grad();
total_loss += loss.item();
num_batches += 1;
}
val avg_loss = total_loss / num_batches as Float32;
println("Epoch {:>2}/{}: loss = {:.4}", epoch + 1, epochs, avg_loss);
}
}
Using the Standard Library
Axon's standard library also includes pre-built transformer components:
axonuse std.nn.{TransformerEncoder as TE, MultiHeadAttention};
fn main() {
// One-liner transformer encoder
val encoder = TE.new(
d_model: 512,
num_heads: 8,
num_layers: 6,
d_ff: 2048,
dropout: 0.1,
);
val input: Tensor<Float32, [?, 128, 512]> = randn([32, 128, 512]);
val output = encoder.forward(input);
println("Output shape: {}", output.shape); // [32, 128, 512]
}
Architecture Diagram
Input Tokens
│
▼
┌─────────────┐
│ Embedding │
│ + Pos Enc │
└─────┬───────┘
│
▼ ×N layers
┌─────────────────────┐
│ Multi-Head Attn │
│ + Residual + Norm │
├─────────────────────┤
│ Feed-Forward │
│ + Residual + Norm │
└─────────┬───────────┘
│
▼
┌─────────────┐
│ Classifier │
│ (Linear) │
└─────────────┘
│
▼
Logits [?, num_classes]
Key Concepts Covered
| Concept | Implementation |
|---|---|
| Self-attention | Q @ K^T / sqrt(d_k), softmax, @ V |
| Multi-head | Reshape → parallel attention → concat |
| Residual connections | x + sublayer(x) |
| Layer normalization | LayerNorm |
| Positional encoding | Sinusoidal sin/cos |
| Classification | [CLS] token → Linear |
Tutorial 5: Models and Enums
Axon supports user-defined data types through models (structs) and enums, providing type-safe data modeling for ML pipelines and systems programming.
Models
Use model to group related data together with named fields:
axonmodel Point {
x: Float64,
y: Float64,
}
fn distance(a: Point, b: Point): Float64 {
val dx = a.x - b.x;
val dy = a.y - b.y;
return (dx * dx + dy * dy).sqrt();
}
fn main() {
val origin = Point { x: 0.0, y: 0.0 };
val target = Point { x: 3.0, y: 4.0 };
println(distance(origin, target)); // 5.0
}
Model Methods
Attach functions to models using extend blocks:
axonmodel ModelConfig {
learning_rate: Float64,
batch_size: Int32,
epochs: Int32,
}
extend ModelConfig {
fn default(): ModelConfig {
return ModelConfig {
learning_rate: 0.001,
batch_size: 32,
epochs: 10,
};
}
fn with_lr(self, lr: Float64): ModelConfig {
return ModelConfig {
learning_rate: lr,
batch_size: self.batch_size,
epochs: self.epochs,
};
}
}
Models with Tensors
Models are ideal for encapsulating model parameters:
axonmodel LinearLayer {
weights: Tensor<Float32, [?, ?]>,
bias: Tensor<Float32, [?]>,
}
extend LinearLayer {
fn forward(&self, input: Tensor<Float32, [?, ?]>): Tensor<Float32, [?, ?]> {
return input @ self.weights + self.bias;
}
}
Enums
Enums define types that can be one of several variants:
axonenum Activation {
ReLU,
Sigmoid,
Tanh,
LeakyReLU(Float64), // variant with data
}
fn apply_activation(x: Float64, act: Activation): Float64 {
match act {
Activation.ReLU => if x > 0.0 { x } else { 0.0 },
Activation.Sigmoid => 1.0 / (1.0 + (-x).exp()),
Activation.Tanh => x.tanh(),
Activation.LeakyReLU(alpha) => if x > 0.0 { x } else { alpha * x },
}
}
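The four match arms compute standard activation functions. As a reference, the same dispatch in Python (string tags stand in for the enum variants):

```python
import math

def apply_activation(x, act, alpha=0.01):
    # Mirrors the Axon match arms above
    if act == "relu":
        return x if x > 0.0 else 0.0
    if act == "sigmoid":
        return 1.0 / (1.0 + math.exp(-x))
    if act == "tanh":
        return math.tanh(x)
    if act == "leaky_relu":
        return x if x > 0.0 else alpha * x
    raise ValueError(act)

print(apply_activation(-2.0, "relu"))             # 0.0
print(apply_activation(-2.0, "leaky_relu", 0.1))  # -0.2
print(round(apply_activation(0.0, "sigmoid"), 3)) # 0.5
```

Unlike the Python version, the Axon `match` is checked at compile time: adding a new `Activation` variant makes every non-exhaustive `match` a compile error.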
Pattern Matching
Use match for exhaustive enum handling — the compiler verifies all variants are covered:
axonenum Device {
CPU,
GPU(Int32), // GPU with device index
}
fn device_name(d: Device): String {
match d {
Device.CPU => "cpu",
Device.GPU(idx) => format("cuda:{}", idx),
}
}
Ownership and Models
Axon's ownership rules apply to model fields. When a model goes out of scope, its owned fields are dropped automatically:
axonmodel DataBatch {
images: Tensor<Float32, [?, 3, 224, 224]>,
labels: Tensor<Int64, [?]>,
}
fn process_batch(net: &Model, batch: DataBatch) {
// `batch` is moved here — the caller can no longer use it
val predictions = net.forward(batch.images);
val loss = cross_entropy(predictions, batch.labels);
}
Use references (&) to borrow without transferring ownership:
axonfn inspect_batch(batch: &DataBatch) {
println("{}", batch.images.shape);
println("{}", batch.labels.shape);
// batch is borrowed — the caller retains ownership
}
Tutorial 6: Error Handling
Axon uses Result and Option types for explicit, type-safe error handling. No hidden exceptions — every fallible operation returns a value you must handle.
The Option Type
Option<T> represents a value that may or may not exist:
axonenum Option<T> {
Some(T),
None,
}
Using Option
axonfn find_max(data: Tensor<Float32, [?]>): Option<Float32> {
if data.len() == 0 {
return None;
}
return Some(data.max());
}
fn main() {
val values = tensor([1.0, 5.0, 3.0, 9.0, 2.0]);
match find_max(values) {
Some(max) => println("Max value: {}", max),
None => println("Empty tensor"),
}
}
Option Combinators
```axon
val maybe_value: Option<Float64> = Some(42.0);

// unwrap_or: provide a default
val value = maybe_value.unwrap_or(0.0);

// map: transform the inner value
val doubled = maybe_value.map(|x| x * 2.0);

// is_some / is_none: check presence
if maybe_value.is_some() {
    println("Got a value!");
}
```
The Result Type
Result<T, E> represents an operation that can succeed (Ok) or fail (Err):
```axon
enum Result<T, E> {
    Ok(T),
    Err(E),
}
```
Using Result
```axon
fn load_model(path: String): Result<Model, String> {
    if !file_exists(path) {
        return Err("Model file not found: " + path);
    }
    val data = read_file(path)?; // ? propagates errors
    return Ok(parse_model(data));
}

fn main() {
    match load_model("weights.axon") {
        Ok(model) => println("Loaded model with {} params", model.param_count()),
        Err(e) => println("Error: {}", e),
    }
}
```
The ? Operator
The ? operator propagates errors up the call stack automatically:
```axon
fn train_pipeline(config_path: String): Result<Float64, String> {
    val config = load_config(config_path)?;     // returns Err early if this fails
    val data = load_dataset(config.data_path)?; // same here
    val model = build_model(config)?;           // and here
    val final_loss = train(model, data, config.epochs)?;
    return Ok(final_loss);
}
```
This is equivalent to writing match at every step, but much more concise.
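As a sketch of that desugaring (illustrative; the exact expansion is internal to the compiler), the first line of the pipeline behaves like:

```axon
// Hand-written equivalent of `val config = load_config(config_path)?;`
val config = match load_config(config_path) {
    Ok(c) => c,
    Err(e) => return Err(e), // early return propagates the error to the caller
};
```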
Custom Error Types
Define your own error types for domain-specific errors:
```axon
enum TrainingError {
    DataNotFound(String),
    InvalidShape { expected: Shape, actual: Shape },
    ConvergenceFailure { epoch: Int32, loss: Float64 },
    OutOfMemory,
}

fn train(model: Model, data: Dataset): Result<Model, TrainingError> {
    if data.shape() != model.expected_input_shape() {
        return Err(TrainingError.InvalidShape {
            expected: model.expected_input_shape(),
            actual: data.shape(),
        });
    }
    // ... training logic ...
    return Ok(model);
}
```
Combining Option and Result
Convert between Option and Result:
```axon
// Option → Result: provide an error message for the None case
val value: Result<Float64, String> = maybe_value.ok_or("value was missing");

// Result → Option: discard the error info
val maybe: Option<Float64> = result.ok();
```
Panics
For truly unrecoverable errors, use panic:
```axon
fn assert_valid_shape(t: Tensor<Float32, [_, _]>) {
    if t.shape()[0] == 0 {
        panic("tensor must have at least one row");
    }
}
```
Panics terminate the program immediately with a source location and message. Use them for programming errors (invariant violations), not expected failure modes.
Best Practices
- Use Result for recoverable errors — file I/O, network, parsing
- Use Option for missing values — lookups, optional config fields
- Use panic for bugs — invariant violations, unreachable code
- Use the ? operator — keeps error-handling code concise
- Define domain error types — makes errors self-documenting
Next Steps
Migrating from PyTorch to Axon
A side-by-side guide for PyTorch developers. Axon's ML framework is heavily inspired by PyTorch's API, with the added benefits of compile-time shape checking, ownership-based memory safety, and native GPU compilation.
Tensor Creation
```python
# PyTorch
import torch

x = torch.zeros(3, 4)
y = torch.ones(5)
z = torch.randn(128, 256)
a = torch.tensor([1.0, 2.0, 3.0])
e = torch.eye(4)
r = torch.arange(0, 10)
```

```axon
// Axon
val x = zeros([3, 4]);
val y = ones([5]);
val z = randn([128, 256]);
val a = Tensor.from_vec([1.0, 2.0, 3.0], [3]);
val e = Tensor.eye(4);
val r = arange(0, 10);
```
Key difference: Axon tensors carry their shape in the type system: Tensor<Float32, [3, 4]> vs PyTorch's dynamic torch.Tensor.
Tensor Operations
```python
# PyTorch
c = a + b
c = a * b
c = a @ b  # matmul
c = torch.matmul(a, b)
m = x.mean(dim=0)
s = x.sum(dim=1)
r = x.reshape(3, 4)
t = x.T  # transpose
```

```axon
// Axon
val c = a + b;
val c = a * b;
val c = a @ b; // matmul (same!)
// no functional form needed
val m = x.mean(dim: 0);
val s = x.sum(dim: 1);
val r = x.reshape([3, 4]);
val t = x.transpose();
```
Model Definition
```python
# PyTorch
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

model = MyModel()
```

```axon
// Axon
use std.nn.{Linear, Module};

model MyModel {
    fc1: Linear<784, 256>,
    fc2: Linear<256, 128>,
    fc3: Linear<128, 10>,
}

extend MyModel {
    fn new(): MyModel {
        MyModel {
            fc1: Linear.new(),
            fc2: Linear.new(),
            fc3: Linear.new(),
        }
    }
}

extend Module for MyModel {
    fn forward(&self, x: Tensor<Float32, [?, 784]>): Tensor<Float32, [?, 10]> {
        val h = relu(self.fc1.forward(x));
        val h = relu(self.fc2.forward(h));
        self.fc3.forward(h)
    }
}

val model = MyModel.new();
```
Key differences:
- model + extend Module instead of class MyModel(nn.Module)
- Linear layer sizes are part of the type: Linear<784, 256>
- Input/output shapes are checked at compile time
- No super().__init__() boilerplate
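To make the compile-time checking concrete, here is the kind of mistake the compiler rejects before the program ever runs (the diagnostic wording below is illustrative):

```axon
val x = randn([32, 100]);  // wrong feature dimension: 100, not 784
val model = MyModel.new();
val y = model.forward(x);
// ERROR: expected `Tensor<Float32, [?, 784]>`, found `Tensor<Float32, [32, 100]>`
```

The equivalent mistake in PyTorch only fails at runtime, inside the first linear layer.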
Training Loop
```python
# PyTorch
import torch
import torch.nn as nn

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

for epoch in range(10):
    for inputs, targets in dataloader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch}: loss = {loss.item():.4f}")
```

```axon
// Axon
use std.optim.Adam;
use std.loss.cross_entropy;

var optimizer = Adam.new(model.parameters(), lr: 0.001);

for epoch in 0..10 {
    var last_loss = 0.0; // keep the most recent loss in scope for logging
    for batch in &dataloader {
        val (inputs, targets) = batch;
        val outputs = model.forward(inputs);
        val loss = cross_entropy(outputs, targets);
        loss.backward();
        optimizer.step();
        optimizer.zero_grad();
        last_loss = loss.item();
    }
    println("Epoch {}: loss = {:.4}", epoch, last_loss);
}
```
Almost identical! The main differences:
- model.forward(x) instead of model(x)
- cross_entropy(outputs, targets) is a function, not a class
- optimizer.zero_grad() is typically called after step() (same effect)
- Borrow semantics: &dataloader to iterate without consuming
CNN Layers
```python
# PyTorch
self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)
self.pool = nn.MaxPool2d(2)
self.bn = nn.BatchNorm2d(32)
self.dropout = nn.Dropout(0.5)
```

```axon
// Axon
self.conv1 = Conv2d.new(in_channels: 1, out_channels: 32, kernel_size: 3, padding: 1);
self.pool = MaxPool2d.new(kernel_size: 2, stride: 2);
self.bn = BatchNorm.new(32);
self.dropout = Dropout.new(rate: 0.5);
```
RNN / Transformer Layers

```python
# PyTorch
lstm = nn.LSTM(input_size=128, hidden_size=256, num_layers=2, batch_first=True)
attention = nn.MultiheadAttention(embed_dim=512, num_heads=8)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=8),
    num_layers=6
)
```

```axon
// Axon
val lstm = LSTM.new(input_size: 128, hidden_size: 256, num_layers: 2);
val attention = MultiHeadAttention.new(d_model: 512, num_heads: 8);
val encoder = TransformerEncoder.new(
    d_model: 512,
    num_heads: 8,
    num_layers: 6,
    d_ff: 2048,
    dropout: 0.1,
);
```
GPU / Device Management
```python
# PyTorch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
x = x.to(device)

# Multi-GPU
model = nn.DataParallel(model)
```

```axon
// Axon
use std.device.{cuda, cpu};

val model = MyModel.new().to_gpu();
val x = x.to_gpu();

// Or with explicit device
val dev = cuda(0);
val x = x.to_device(dev);
```
Axon difference: Device transfer is a move (ownership transfer), not a copy.
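A short sketch of what the move implies (the error mirrors E4001, use after move; wording illustrative):

```axon
val data = randn([1024]);
val gpu_data = data.to_gpu();   // ownership of the buffer moves to the GPU tensor
println("{}", data.sum());
// ERROR[E4001]: use of moved value `data`
```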
Optimizers
| PyTorch | Axon |
|---|---|
| torch.optim.SGD(params, lr=0.01) | SGD.new(params, lr: 0.01) |
| torch.optim.Adam(params, lr=0.001) | Adam.new(params, lr: 0.001) |
| torch.optim.AdamW(params, lr=1e-4, weight_decay=0.01) | AdamW.new(params, lr: 0.0001, weight_decay: 0.01) |
Loss Functions
| PyTorch | Axon |
|---|---|
| nn.CrossEntropyLoss()(output, target) | cross_entropy(output, target) |
| nn.MSELoss()(output, target) | mse_loss(output, target) |
| nn.BCELoss()(output, target) | bce_loss(output, target) |
| nn.L1Loss()(output, target) | l1_loss(output, target) |
Axon uses functions instead of loss classes — simpler and more direct.
Autograd / Gradients
```python
# PyTorch
x = torch.randn(3, requires_grad=True)
y = x * 2 + 1
y.sum().backward()
print(x.grad)

with torch.no_grad():
    prediction = model(x)
```

```axon
// Axon
use std.autograd.{GradTensor, no_grad};

val x = GradTensor.new(randn([3]));
val y = x * 2.0 + 1.0;
y.sum().backward();
println("{}", x.grad());

no_grad(|| {
    val prediction = model.forward(x);
});
```
Data Loading
```python
# PyTorch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(x_train, y_train)
loader = DataLoader(dataset, batch_size=64, shuffle=True)
```

```axon
// Axon
use std.data.DataLoader;

val loader = DataLoader.new(x_train, y_train)
    .batch_size(64)
    .shuffle(true);
```
Model Saving / Loading
```python
# PyTorch
torch.save(model.state_dict(), "model.pt")
model.load_state_dict(torch.load("model.pt"))

# ONNX export
torch.onnx.export(model, dummy_input, "model.onnx")
```

```axon
// Axon
use std.export.{save, load, export_onnx};

save(&model, "model.axon");
val model: MyModel = load("model.axon");

// ONNX export
val dummy_input = randn([1, 784]);
export_onnx(&model, dummy_input, "model.onnx");
```
What Axon Adds Over PyTorch
| Feature | PyTorch | Axon |
|---|---|---|
| Shape checking | Runtime errors | Compile-time errors |
| Memory safety | Manual management | Ownership system |
| Type safety | Dynamic typing | Static types with inference |
| GPU compilation | Python + CUDA C | Native GPU via MLIR |
| Performance | Python overhead | Native binary, no GIL |
| Package manager | pip + setup.py | Built-in axonc pkg |
| Formatting | black (separate) | Built-in axonc fmt |
Quick Translation Table
| PyTorch | Axon |
|---|---|
| import torch | (built-in, no import needed) |
| import torch.nn as nn | use std.nn.* |
| model(x) | model.forward(x) |
| loss.item() | loss.item() (same!) |
| .backward() | .backward() (same!) |
| .to("cuda") | .to_gpu() |
| torch.no_grad() | no_grad(\|\| { ... }) |
| model.eval() | model.eval() (same!) |
| model.train() | model.train() (same!) |
See Also
Migrating from Python to Axon
A side-by-side guide for Python developers moving to Axon. Axon will feel familiar in many ways, but adds static types, ownership, and compile-time shape checking.
Variables
| Python | Axon |
|---|---|
| x = 42 | val x = 42; |
| x = 42 (reassigned later) | var x = 42; |
| x: int = 42 | val x: Int32 = 42; |
```python
# Python
name = "Alice"
age = 30
scores = [95, 87, 92]
```

```axon
// Axon
val name = "Alice";
val age = 30;
val scores = vec![95, 87, 92];
```
Key difference: Axon variables are immutable by default. Use var for mutable variables.
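A minimal sketch of the difference (diagnostic text illustrative):

```axon
val x = 42;
x = 43;    // ERROR: cannot assign to immutable binding `x`

var y = 42;
y = 43;    // OK: `var` bindings may be reassigned
```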
Functions
```python
# Python
def add(a: int, b: int) -> int:
    return a + b

def greet(name: str):
    print(f"Hello, {name}!")
```

```axon
// Axon
fn add(a: Int32, b: Int32): Int32 {
    a + b // implicit return (last expression)
}

fn greet(name: String) {
    println("Hello, {}!", name);
}
```
Differences:
- fn instead of def
- Curly braces instead of indentation
- Type annotations are required on parameters
- No return needed for the last expression
Types
| Python | Axon | Notes |
|---|---|---|
| int | Int32 / Int64 | Explicit sizes |
| float | Float32 / Float64 | Explicit sizes |
| bool | Bool | |
| str | String | |
| list[int] | Vec<Int32> | |
| dict[str, int] | HashMap<String, Int32> | |
| Optional[int] | Option<Int32> | |
| None | None | |
Control Flow
If/Else
```python
# Python
if score >= 90:
    grade = "A"
elif score >= 70:
    grade = "B"
else:
    grade = "C"
```

```axon
// Axon — if is an expression!
val grade = if score >= 90 {
    "A"
} else if score >= 70 {
    "B"
} else {
    "C"
};
```
Loops
```python
# Python
for i in range(10):
    print(i)

for item in items:
    process(item)

while condition:
    do_work()
```

```axon
// Axon
for i in 0..10 {
    println("{}", i);
}

for item in items {
    process(item);
}

while condition {
    do_work();
}
```
Pattern Matching
```python
# Python 3.10+
match command:
    case "quit":
        exit()
    case "hello":
        print("Hi!")
    case _:
        print("Unknown")
```

```axon
// Axon
match command {
    "quit" => exit(),
    "hello" => println("Hi!"),
    _ => println("Unknown"),
}
```
Classes → Models + Extend
Python classes map to Axon models with extend blocks:
```python
# Python
class Point:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

    def distance(self, other: 'Point') -> float:
        return ((self.x - other.x)**2 + (self.y - other.y)**2)**0.5

    def __str__(self):
        return f"({self.x}, {self.y})"
```

```axon
// Axon
model Point {
    x: Float64,
    y: Float64,
}

extend Point {
    fn new(x: Float64, y: Float64): Point {
        Point { x, y }
    }

    fn distance(&self, other: &Point): Float64 {
        val dx = self.x - other.x;
        val dy = self.y - other.y;
        (dx * dx + dy * dy).sqrt()
    }
}

extend Display for Point {
    fn to_string(&self): String {
        format("({}, {})", self.x, self.y)
    }
}
```
Key differences:
- No inheritance — use traits for polymorphism
- &self is explicit (immutable borrow)
- Constructors are regular functions (by convention named new)
Error Handling
```python
# Python
try:
    f = open("config.toml")
    data = f.read()
    config = parse(data)
except FileNotFoundError:
    print("Config not found")
except ParseError as e:
    print(f"Parse error: {e}")
```

```axon
// Axon
match File.open("config.toml") {
    Ok(file) => {
        match file.read_all() {
            Ok(data) => {
                val config = parse(data);
                println("Loaded config");
            }
            Err(e) => eprintln("Read error: {}", e),
        }
    }
    Err(e) => eprintln("Config not found: {}", e),
}

// Or more concisely with ?
fn load_config(): Result<Config, IOError> {
    val file = File.open("config.toml")?;
    val data = file.read_all()?;
    parse(data)
}
```
NumPy / Tensors
```python
# Python (NumPy)
import numpy as np

a = np.zeros((3, 4))
b = np.random.randn(3, 4)
c = a + b
d = np.dot(a, b.T)
mean = np.mean(c, axis=0)
```

```axon
// Axon
val a = zeros([3, 4]);
val b = randn([3, 4]);
val c = a + b;
val d = a @ b.transpose();
val mean = c.mean(dim: 0);
```
Key differences from NumPy:
- Compile-time shape checking — shape errors are caught before runtime
- @ operator for matrix multiplication (like Python 3.5+, but type-checked)
- No import needed — tensors are built-in types
- Shapes are part of the type: Tensor<Float32, [3, 4]>
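For instance, a mismatched elementwise add that NumPy would only reject at runtime is caught during compilation (this is error E3003 in the compiler error reference):

```axon
val a = zeros([3, 4]);
val b = randn([3, 5]);
val c = a + b;
// ERROR[E3003]: shapes [3, 4] and [3, 5] are not broadcast-compatible
```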
List Comprehensions → Iterators
```python
# Python
squares = [x**2 for x in range(10)]
evens = [x for x in numbers if x % 2 == 0]
```

```axon
// Axon (iterator methods)
val squares: Vec<Int32> = (0..10).map(|x| x * x).collect();
val evens: Vec<Int32> = numbers.iter().filter(|x| x % 2 == 0).collect();
```
Modules
```python
# Python — file: math_utils.py
def square(x):
    return x * x

# main.py
from math_utils import square
```

```axon
// Axon — file: math_utils.axon
pub fn square(x: Float64): Float64 {
    x * x
}

// main.axon
mod math_utils;
use math_utils.square;
```
Quick Reference
| Python | Axon |
|---|---|
| print(x) | println("{}", x) |
| len(x) | x.len() |
| type(x) | Compile-time types (use :type in REPL) |
| isinstance(x, T) | Pattern matching |
| None | None (Option type) |
| raise ValueError() | return Err(...) or panic(...) |
| assert x > 0 | assert(x > 0) |
| # comment | // comment |
| """docstring""" | /// doc comment |
| pip install | axonc pkg add |
| python script.py | axonc build script.axon && ./script |
See Also
Migrating from Rust to Axon
Axon draws significant inspiration from Rust's syntax and safety model, but is purpose-built for ML/AI workloads. If you know Rust, you'll feel at home quickly. This guide covers the key differences.
At a Glance
| Feature | Rust | Axon |
|---|---|---|
| Ownership | Borrow checker | Simplified borrow checker |
| Generics | <T: Trait> | <T: Trait> (same syntax) |
| Tensors | External crate (ndarray) | First-class Tensor<D, Shape> |
| GPU | External crate (cuda-rs) | Built-in @device annotation |
| Macros | macro_rules! / proc macros | Not yet supported |
| Async | async/await | Not yet supported |
| Package manager | Cargo | axonc pkg (compatible workflow) |
| Strings | String / &str | String / &str (same model) |
| Error handling | Result<T, E> | Result<T, E> (same model) |
Syntax Differences
Function Declarations
```rust
// Rust
fn add(x: i32, y: i32) -> i32 {
    x + y
}
```
```axon
// Axon — return type follows `:`, types use PascalCase
fn add(x: Int32, y: Int32): Int32 {
    x + y // like Rust, the last expression is implicitly returned
}
```

Key differences:
- The return type is written with : instead of ->
- Primitive types use PascalCase: Int32, Int64, Float32, Float64, Bool
- Semicolons are required on all statements, as in Rust
Type Names
| Rust | Axon |
|---|---|
| i32 | Int32 |
| i64 | Int64 |
| f32 | Float32 |
| f64 | Float64 |
| bool | Bool |
| String | String |
| Vec<T> | Vec<T> |
Variable Bindings
```rust
// Rust
let x = 5;      // immutable
let mut y = 10; // mutable
```

```axon
// Axon — val for immutable, var for mutable
val x = 5;
var y = 10;
```
Ownership Model
Axon uses a simplified version of Rust's ownership model:
```axon
fn take_ownership(data: Tensor<Float32, [_]>) {
    // `data` is owned here — moved from caller
    println(data.sum());
    // `data` is dropped here
}

fn borrow_data(data: &Tensor<Float32, [_]>) {
    // `data` is borrowed — caller keeps ownership
    println(data.sum());
}
```
Simplifications vs Rust:
- No lifetime annotations ('a) — lifetimes are inferred or scoped
- No Rc<T> / Arc<T> in user code — runtime reference counting where needed
- Borrow checker is less strict — focuses on preventing use-after-free and double-free
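As an illustrative sketch of life without lifetime annotations (the `norm` method here is assumed for the example):

```axon
// In Rust, returning a reference tied to one of two inputs requires
// explicit `<'a>` annotations; Axon infers the borrow relationship.
fn larger(a: &Tensor<Float32, [_]>, b: &Tensor<Float32, [_]>): &Tensor<Float32, [_]> {
    return if a.norm() > b.norm() { a } else { b };
}
```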
First-Class Tensors
The biggest difference from Rust: tensors are a built-in type with shape tracking:
```axon
// Axon — tensors are first-class citizens
val x: Tensor<Float32, [3, 3]> = tensor([[1.0, 2.0, 3.0],
                                         [4.0, 5.0, 6.0],
                                         [7.0, 8.0, 9.0]]);

// Shape-checked matrix multiply (compiler verifies dimensions)
val y: Tensor<Float32, [3, 1]> = tensor([[1.0], [0.0], [1.0]]);
val result = x @ y; // result: Tensor<Float32, [3, 1]>
```

In Rust, you'd need an external crate:

```rust
// Rust — requires ndarray or nalgebra
use ndarray::Array2;
let x = Array2::<f32>::from_shape_vec((3, 3), vec![...]).unwrap();
```
Dynamic Shapes
Use _ for dimensions known only at runtime:
```axon
fn process_batch(model: &Model, input: Tensor<Float32, [_, 784]>): Tensor<Float32, [_, 10]> {
    // batch size (_) is dynamic, feature dimensions are static
    return model.forward(input);
}
```
GPU Support
Axon has built-in device management — no external CUDA bindings needed:
```axon
// Move tensor to GPU
val gpu_data = data.to(Device.GPU(0));

// GPU-annotated functions
@device(GPU)
fn matmul_kernel(a: Tensor<Float32, [_, _]>, b: Tensor<Float32, [_, _]>): Tensor<Float32, [_, _]> {
    return a @ b;
}
```
In Rust, GPU support requires unsafe FFI to CUDA/OpenCL libraries.
What's Not (Yet) in Axon
Coming from Rust, you'll miss these features (planned for future releases):
- Traits/Interfaces — Use structural typing for now
- Macros — No compile-time metaprogramming yet
- Async/Await — No async runtime yet
- Closures — Limited closure support (planned)
- Pattern matching in val — val (a, b) = tuple; is not yet supported
- Crate ecosystem — Axon's package registry is young
Build & Run Comparison
```bash
# Rust
cargo build
cargo run
cargo test

# Axon
axonc build main.axon
./main
axonc pkg test
```
Migration Strategy
- Start with compute kernels — Port tensor-heavy code first (biggest Axon advantage)
- Keep Rust for infrastructure — Build systems, CLI tools, networking stay in Rust
- Use Axon for models — Training loops, inference pipelines, data processing
- Interop via C ABI — Axon and Rust can call each other through C FFI
Further Reading
CLI Reference
Complete reference for the axonc command-line compiler.
Synopsis
axonc <COMMAND> [OPTIONS]
axonc --help
axonc --version
Commands
axonc lex
Tokenize an Axon source file and print the token stream.
Arguments:
<FILE> — Path to the .axon source file
Example:
```bash
axonc lex hello.axon
# Token(Fn, 1:1)
# Token(Identifier("main"), 1:4)
# Token(LeftParen, 1:8)
# Token(RightParen, 1:9)
# Token(LeftBrace, 1:11)
# ...
```
axonc parse
Parse an Axon source file and print the AST.
```bash
axonc parse <FILE> [OPTIONS]
```
Arguments:
<FILE> — Path to the .axon source file
Options:

| Flag | Description |
|---|---|
| --error-format <FORMAT> | Output format: human (default) or json |
| --errors-only | Print only errors, suppress AST output |
Example:
```bash
axonc parse hello.axon
axonc parse hello.axon --error-format=json
axonc parse hello.axon --errors-only
```
axonc check
Type-check an Axon source file. Runs the full frontend pipeline: lex → parse → name resolution → type inference → shape checking → borrow checking.
```bash
axonc check <FILE> [OPTIONS]
```
Arguments:
<FILE> — Path to the .axon source file
Options:

| Flag | Description |
|---|---|
| --error-format <FORMAT> | Output format: human (default) or json |
| --emit-tast | Emit the typed AST as JSON |
| --deny <CATEGORIES> | Promote diagnostic categories to errors (comma-separated) |
| --allow <CATEGORIES> | Suppress diagnostic categories (comma-separated) |
| --warn <CATEGORIES> | Override diagnostic categories to warnings (comma-separated) |
| --error-limit <N> | Stop compilation after N errors |
Diagnostic categories: parse-error, type-error, borrow-error, shape-error, lint-warning, unused-variable, unused-import, unreachable-code, deprecated-syntax
Example:
```bash
axonc check model.axon
axonc check model.axon --emit-tast
axonc check model.axon --error-format=json
axonc check model.axon --deny unused-variable
axonc check model.axon --error-limit 5
```
Exit codes:
- 0 — No errors
- 1 — One or more errors found
axonc build
Compile an Axon source file to a native binary.
```bash
axonc build <FILE> [OPTIONS]
```
Arguments:
<FILE> — Path to the .axon source file
Options:

| Flag | Description | Default |
|---|---|---|
| -o, --output <PATH> | Output file path | Input filename without extension |
| -O, --opt-level <LEVEL> | Optimization level: 0, 1, 2, 3 | 0 |
| --emit-llvm | Emit LLVM IR text instead of compiling | — |
| --emit-mir | Emit Axon MIR (debug intermediate representation) | — |
| --emit-obj | Emit object file (.o) instead of binary | — |
| --gpu <TARGET> | GPU target: none, cuda, rocm, vulkan | none |
| --error-format <FORMAT> | Output format: human or json | human |
| --deny <CATEGORIES> | Promote diagnostic categories to errors (comma-separated) | — |
| --allow <CATEGORIES> | Suppress diagnostic categories (comma-separated) | — |
| --warn <CATEGORIES> | Override diagnostic categories to warnings (comma-separated) | — |
| --error-limit <N> | Stop compilation after N errors | — |
Examples:
```bash
# Basic compilation
axonc build hello.axon

# Optimized build with custom output
axonc build model.axon -O 3 -o model

# Emit LLVM IR for inspection
axonc build model.axon --emit-llvm -o model.ll

# GPU compilation for NVIDIA
axonc build model.axon --gpu cuda -O 3

# AMD GPU target
axonc build model.axon --gpu rocm -O 2

# Emit object file for linking
axonc build model.axon --emit-obj -o model.o
```
Optimization levels:

| Level | Description |
|---|---|
| -O 0 | No optimization — fastest compile, easiest to debug |
| -O 1 | Basic MIR optimizations: dead code elimination, constant folding |
| -O 2 | Standard optimizations (includes O1 + inlining, loop unrolling, vectorization) |
| -O 3 | Aggressive optimizations (includes O2 + LTO, auto-vectorization, FMA) |
axonc fmt
Format an Axon source file according to the standard style.
Arguments:
<FILE> — Path to the .axon source file
The formatter modifies the file in place. Formatting is idempotent — running it twice produces the same output.
Example:
```bash
axonc fmt src/main.axon
```
axonc lint
Run the linter on an Axon source file. Reports style and best-practice warnings.
Arguments:
<FILE> — Path to the .axon source file
Lint rules:

| Code | Rule |
|---|---|
| W5001 | Unused variable |
| W5002 | Unused import |
| W5003 | Dead code |
| W5004 | Unnecessary mutability |
| W5005 | Shadowed variable |
| W5006 | Naming convention violation |
| W5007 | Redundant type annotation |
| W5008 | Missing documentation on public items |
See Compiler Errors for details on each warning.
Example:
```bash
axonc lint src/main.axon
# warning[W5001]: unused variable `temp`
#   --> src/main.axon:12:9
```
axonc repl
Start the interactive Read-Eval-Print Loop.
REPL Commands:

| Command | Description |
|---|---|
| :type <expr> | Show the type of an expression |
| :ast <expr> | Show the AST for an expression |
| :load <file> | Load and evaluate an Axon source file |
| :save <file> | Save REPL history to a file |
| :clear | Clear the REPL state |
| :help | Show help |
| :quit | Exit the REPL |
Example:
```
$ axonc repl
Axon REPL v0.1.0 — type :help for commands
>>> val x = 42
>>> x * 2
84
>>> :type x
Int32
>>> val t = randn([3, 3])
>>> t.shape
[3, 3]
>>> :quit
```
axonc doc
Generate HTML documentation from doc comments in Axon source files.
```bash
axonc doc <FILE> [OPTIONS]
```
Arguments:
<FILE> — Path to the .axon source file
Options:

| Flag | Description |
|---|---|
| -o, --output <PATH> | Output file path (default: stdout) |
Example:
```bash
axonc doc src/lib.axon -o docs/api.html
```
axonc lsp
Start the Axon Language Server Protocol server over stdio.
The LSP server provides:
- Real-time diagnostics
- Go-to-definition
- Hover (type information)
- Code completion
- Find references
- Rename symbol
- Signature help
- Inlay hints
- Semantic tokens
Configure your editor to use axonc lsp as the language server for .axon files.
axonc pkg
Package manager commands for Axon projects.
axonc pkg new <NAME>
Create a new Axon project with standard directory structure.
```bash
axonc pkg new my_project
# Created project `my_project`
```
Generated structure:
my_project/
├── Axon.toml
├── src/
│ └── main.axon
└── tests/
└── test_main.axon
axonc pkg init
Initialize an Axon project in the current directory.
```bash
mkdir my_project && cd my_project
axonc pkg init
```
axonc pkg build
Build the current project (reads Axon.toml).
axonc pkg run
Build and run the project.
axonc pkg test
Run all tests in the tests/ directory.
axonc pkg add <PACKAGE>
Add a dependency to Axon.toml.
```bash
axonc pkg add axon-vision
axonc pkg add axon-nlp --version 0.2.0
```
axonc pkg remove <PACKAGE>
Remove a dependency from Axon.toml.
```bash
axonc pkg remove axon-vision
```
axonc pkg clean
Remove build artifacts.
axonc pkg fmt
Format all .axon source files in the project.
axonc pkg lint
Lint all .axon source files in the project.
Global Options
| Flag | Description |
|---|---|
| --help | Print help information |
| --version | Print version (axonc 0.1.0) |
Environment Variables
| Variable | Description |
|---|---|
| AXON_HOME | Axon installation directory |
| AXON_PATH | Additional module search paths (colon-separated) |
See Also
Compiler Error Reference
Complete reference for all Axon compiler error codes. Each error includes its code, description, example code that triggers it, and how to fix it.
Error Code Ranges
| Range | Category | Description |
|---|---|---|
| E0001–E0099 | Lexer / Parser | Syntax errors |
| E1001–E1099 | Name Resolution | Undefined or duplicate names |
| E2001–E2099 | Type Errors | Type mismatches and inference failures |
| E3001–E3099 | Shape Errors | Tensor shape mismatches |
| E4001–E4099 | Borrow Errors | Ownership and lifetime violations |
| W5001–W5010 | Lint Warnings | Style and best-practice warnings |
E0001–E0099: Lexer / Parser Errors
E0001: Unexpected Character
```axon
val x = 42$;
//        ^ ERROR[E0001]: unexpected character `$`
```

Fix: Remove or replace the invalid character.

E0002: Unterminated String Literal

```axon
val s = "hello;
//      ^ ERROR[E0002]: unterminated string literal
```

Fix: Close the string with a matching ".

E0003: Unterminated Block Comment

```axon
/* this comment never ends
// ^ ERROR[E0003]: unterminated block comment
```

Fix: Close with */. Nested comments require matching pairs.

E0010: Expected Token

```axon
fn foo( {
//      ^ ERROR[E0010]: expected `)`, found `{`
```

Fix: Add the missing token.

E0011: Expected Expression

```axon
val x = ;
//      ^ ERROR[E0011]: expected expression, found `;`
```

Fix: Provide a value or expression.

E0012: Expected Type

```axon
val x: = 42;
//     ^ ERROR[E0012]: expected type, found `=`
```

Fix: Provide a type annotation after :.

E0020: Invalid Integer Literal

```axon
val x = 0xGG;
//      ^ ERROR[E0020]: invalid hexadecimal literal
```

Fix: Use valid digits for the number base (0-9, a-f for hex).

E0021: Invalid Float Literal

```axon
val x = 1.2.3;
//      ^ ERROR[E0021]: invalid float literal
```

Fix: A float literal may contain at most one decimal point.

E0030: Invalid Escape Sequence

```axon
val s = "\q";
//       ^ ERROR[E0030]: unknown escape sequence `\q`
```

Fix: Use valid escapes: \\, \n, \t, \r, \", \0.

E0040: Duplicate Match Arm

```axon
match x {
    1 => println("one"),
    1 => println("one again"), // ERROR[E0040]: duplicate match arm
}
```

Fix: Remove or change the duplicate arm.

E0050: Invalid Pattern

```axon
match value {
    1 + 2 => println("?"), // ERROR[E0050]: expected pattern, found expression
}
```

Fix: Match against a literal, binding, or destructuring pattern, not an arbitrary expression.
E1001–E1099: Name Resolution Errors
E1001: Undefined Variable
```axon
fn main() {
    println("{}", unknown_var);
    //            ^ ERROR[E1001]: undefined variable `unknown_var`
}
```

Fix: Declare the variable before use or check for typos.

E1002: Undefined Function

```axon
fn main() {
    foo();
    // ^ ERROR[E1002]: undefined function `foo`
}
```

E1003: Undefined Type

```axon
val x: NonExistent = 42;
//     ^ ERROR[E1003]: undefined type `NonExistent`
```

E1010: Duplicate Definition

```axon
fn foo() {}
fn foo() {}
// ^ ERROR[E1010]: duplicate definition of `foo`
```

E1011: Duplicate Field

```axon
model Point { x: Int32, x: Int32 }
//                      ^ ERROR[E1011]: duplicate field `x`
```

E1020: Unresolved Import

```axon
use std.nonexistent.Module;
//  ^ ERROR[E1020]: unresolved import `std.nonexistent`
```

E1030: Private Item

```axon
mod inner {
    fn secret() {}
}

inner.secret();
//    ^ ERROR[E1030]: function `secret` is private
```

Fix: Add pub to the item or access it from within its module.
E2001–E2099: Type Errors
E2001: Type Mismatch
```axon
val x: Int32 = "hello";
// ERROR[E2001]: type mismatch — expected `Int32`, found `String`
```

E2002: Binary Operator Type Error

```axon
val x = "hello" + 42;
// ERROR[E2002]: cannot apply `+` to `String` and `Int32`
```

E2003: Return Type Mismatch

```axon
fn foo(): Int32 {
    "not an integer"
    // ERROR[E2003]: return type mismatch — expected `Int32`, found `String`
}
```

E2010: Missing Field

```axon
model Point { x: Int32, y: Int32 }
val p = Point { x: 1 };
// ERROR[E2010]: missing field `y` in struct `Point`
```

E2011: Unknown Field

```axon
model Point { x: Int32, y: Int32 }
val p = Point { x: 1, y: 2, z: 3 };
//                          ^ ERROR[E2011]: unknown field `z` on `Point`
```

E2020: Trait Not Implemented

```axon
fn print_it<T: Display>(x: T) {}
print_it(SomeStruct {});
// ERROR[E2020]: trait `Display` not implemented for `SomeStruct`
```

Fix: Implement the required trait for the type.

E2021: Ambiguous Method

```axon
// When multiple trait impls provide the same method
value.shared_method();
// ERROR[E2021]: ambiguous method call — candidates from `TraitA` and `TraitB`
```

Fix: Use fully qualified syntax: TraitA.shared_method(&value).

E2030: Cannot Infer Type

```axon
val x = Vec.new();
// ERROR[E2030]: cannot infer type — add a type annotation
```

Fix: val x: Vec<Int32> = Vec.new();

E2040: Invalid Cast

```axon
val x = "hello" as Int32;
// ERROR[E2040]: cannot cast `String` to `Int32`
```
E3001–E3099: Shape Errors
E3001: Matmul Shape Mismatch
```axon
val a: Tensor<Float32, [3, 4]> = randn([3, 4]);
val b: Tensor<Float32, [5, 6]> = randn([5, 6]);
val c = a @ b;
// ERROR[E3001]: matmul shape mismatch — inner dimensions 4 ≠ 5
// note: left shape [3, 4], right shape [5, 6]
```

Fix: Ensure the inner dimensions match: [M, K] @ [K, N].

E3002: Invalid Reshape

```axon
val t: Tensor<Float32, [2, 3]> = randn([2, 3]);
val r = t.reshape([2, 2]);
// ERROR[E3002]: cannot reshape [2, 3] (6 elements) to [2, 2] (4 elements)
```

Fix: Ensure the total number of elements is preserved.

E3003: Broadcast Incompatible

```axon
val a: Tensor<Float32, [3, 4]> = randn([3, 4]);
val b: Tensor<Float32, [3, 5]> = randn([3, 5]);
val c = a + b;
// ERROR[E3003]: shapes [3, 4] and [3, 5] are not broadcast-compatible
```

E3010: Invalid Transpose Axes

```axon
val t: Tensor<Float32, [2, 3, 4]> = randn([2, 3, 4]);
val p = t.permute([0, 1, 5]);
// ERROR[E3010]: axis 5 out of range for tensor with 3 dimensions
```

E3020: Dynamic Shape Required

```axon
// When static shape info is unavailable
// ERROR[E3020]: cannot verify shape statically — consider using `?` for dynamic dims
// note: runtime shape check will be inserted
```
E4001–E4099: Borrow Errors
E4001: Use After Move
axonval data = randn([100]);
val other = data;
println("{}", data);
// ERROR[E4001]: use of moved value `data`
// note: `data` was moved on line 2
Fix: Clone the value or restructure to avoid the move.
E4002: Borrow of Moved Value
```axon
val s = "hello".to_string();
val t = s;
val r = &s;
// ERROR[E4002]: cannot borrow `s` — value has been moved
```
E4003: Mutable Borrow Conflict
```axon
var v = vec![1, 2, 3];
val r1 = &v;
val r2 = &mut v;
// ERROR[E4003]: cannot borrow `v` as mutable — also borrowed as immutable
// note: immutable borrow of `v` occurs on line 2
```
Fix: Ensure immutable borrows end before taking a mutable borrow.
E4004: Multiple Mutable Borrows
```axon
var data = randn([10]);
val a = &mut data;
val b = &mut data;
// ERROR[E4004]: cannot borrow `data` as mutable more than once
```
E4005: Dangling Reference
```axon
fn dangling(): &String {
    val s = "hello".to_string();
    &s
    // ERROR[E4005]: `s` does not live long enough
    // note: borrowed value only lives until end of function
}
```
Fix: Return an owned value instead of a reference.
E4006: Mutability Required
```axon
val data = randn([10]);
scale(&mut data, 2.0);
// ERROR[E4006]: cannot borrow `data` as mutable — declared as immutable
// help: consider changing to `var data`
```
E4007: Cross-Device Borrow
```axon
var t = randn([256]);
val cpu_ref = &t;
val gpu_t = t.to_gpu();
// ERROR[E4007]: cannot move `t` to GPU while borrowed on CPU
```
W5001–W5010: Lint Warnings
W5001: Unused Variable
```axon
val x = 42;
// WARNING[W5001]: unused variable `x`
// help: prefix with underscore: `_x`
```
W5002: Unused Import
```axon
use std.math.sin;
// WARNING[W5002]: unused import `sin`
```
W5003: Dead Code
```axon
fn unused_function() {}
// WARNING[W5003]: function `unused_function` is never called
```
W5004: Unnecessary Mutability
```axon
var x = 42;
println("{}", x);
// WARNING[W5004]: variable `x` declared as mutable but never mutated
```
W5005: Shadowed Variable
```axon
val x = 1;
val x = 2;
// WARNING[W5005]: variable `x` shadows previous declaration
```
W5006: Naming Convention
```axon
fn MyFunction() {}
// WARNING[W5006]: function `MyFunction` should use snake_case
// help: rename to `my_function`
```
W5007: Redundant Type Annotation
```axon
val x: Int32 = 42;
// WARNING[W5007]: type annotation is redundant — inferred as `Int32`
```
W5008: Missing Documentation
```axon
pub fn public_api() {}
// WARNING[W5008]: public item `public_api` is missing documentation
```
Human-Readable (Default)
```
error[E2001]: type mismatch — expected `Int32`, found `String`
 --> src/main.axon:5:15
help: consider using `parse()` to convert the string
```
```json
{
  "error_code": "E2001",
  "message": "type mismatch — expected `Int32`, found `String`",
  "severity": "error",
  "location": { "file": "src/main.axon", "line": 5, "column": 15 },
  "suggestion": "consider using `parse()` to convert the string"
}
```
See Also
Axon Compiler Architecture
Pipeline
```
Source → Lexer → Parser → AST → Name Resolution → Type Checking
       → Shape Checking → Borrow Checking → TAST → MIR → MIR Passes → LLVM IR → Native Binary
```
Overview
The Axon compiler (axonc) is structured as a multi-phase pipeline. Each phase transforms the program representation and may produce diagnostic errors. The pipeline is designed to continue after errors where possible, providing multiple diagnostics in a single compilation pass.
Modules
Core Pipeline
| Module | File | Description |
|---|---|---|
| Lexer | src/lexer.rs | Tokenization of Axon source text. Handles keywords, types, operators, delimiters, literals (int, float, string, char), comments (// and / /), attributes (@cpu, @gpu, @device), and source location tracking. |
| Parser | src/parser.rs | Recursive descent parser. Produces an AST from a token stream. Handles operator precedence with Pratt parsing, provides clear error messages with source locations, and implements error recovery for multiple diagnostics. |
| AST | src/ast.rs | Abstract Syntax Tree types. All nodes carry Span for source location. Serializable via serde for tooling integration. |
| Name Resolution | src/symbol.rs | Symbol table with lexical scoping. Resolves names to definitions, detects undefined variables, and tracks variable mutability. Part of the type checking phase. |
| Type Checker | src/typeck.rs | Hindley-Milner type inference with constraint-based unification. Registers stdlib types, resolves names, infers expression types, and checks type compatibility. |
| Shape Checker | src/shapes.rs | Tensor dimension verification. Ensures tensor operations have compatible shapes at compile time. Axon's key differentiator for ML/AI safety. |
| Borrow Checker | src/borrow.rs | Ownership, move, and borrow analysis. Tracks value lifetimes, prevents use-after-move, and enforces mutable borrow exclusivity. |
| TAST | src/tast.rs | Typed Abstract Syntax Tree. Annotates each AST node with its resolved type. Serves as the bridge between type checking and code generation. |
| MIR | src/mir/ | Mid-level Intermediate Representation. Flattened, SSA-like form suitable for optimization passes and lowering to LLVM IR. |
| MIR Passes | src/mir/transform/ | MIR optimization passes: dead code elimination and constant folding. Managed by a PassManager that runs passes based on optimization level. |
| Name Interner | src/interner.rs | Global string interning for O(1) name comparisons. Deduplicates identifier strings via NameInterner and lightweight Name handles. |
| Diagnostics | src/error.rs | Accumulative diagnostic system with categories, severity configuration (--deny/--allow/--warn), error limits, and grouped display. |
| Codegen | src/codegen/ | LLVM IR generation and native compilation. |
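The interning pattern behind the name interner can be sketched in a few lines; this is a minimal version of the standard technique, and the actual API in src/interner.rs may differ:

```rust
use std::collections::HashMap;

/// A minimal string interner: each distinct string is stored exactly
/// once and identified by a small integer handle, so comparing two
/// names is an O(1) integer compare instead of a string compare.
#[derive(Default)]
struct NameInterner {
    map: HashMap<String, u32>,
    strings: Vec<String>,
}

impl NameInterner {
    /// Return the existing handle for `s`, or allocate a new one.
    fn intern(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.map.get(s) {
            return id;
        }
        let id = self.strings.len() as u32;
        self.strings.push(s.to_string());
        self.map.insert(s.to_string(), id);
        id
    }

    /// Look the string back up from its handle.
    fn resolve(&self, id: u32) -> &str {
        &self.strings[id as usize]
    }
}

fn main() {
    let mut interner = NameInterner::default();
    let a = interner.intern("forward");
    let b = interner.intern("backward");
    let c = interner.intern("forward"); // deduplicated: same handle as `a`
    assert_eq!(a, c);
    assert_ne!(a, b);
    assert_eq!(interner.resolve(a), "forward");
}
```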
Code Generation Submodules
| Module | File | Description |
|---|---|---|
| LLVM IR | src/codegen/llvm.rs | Generates textual LLVM IR (.ll files) from MIR. Compiles to native code via clang. |
| ABI | src/codegen/abi.rs | Application Binary Interface definitions for calling conventions. |
| MLIR | src/codegen/mlir.rs | MLIR integration for ML-specific optimizations (future). |
| Runtime | src/codegen/runtime.rs | Runtime support declarations (memory allocation, printing, etc.). |
Standard Library
| Module | File | Description |
|---|---|---|
| Stdlib | src/stdlib/ | Built-in type and function registration. Registers primitive types (Int32, Float64, Bool, String), collection types, and AI framework types into the type checker. |
The stdlib includes AI/ML framework types:
- `src/stdlib/nn.rs` — Neural network layers (Linear, Conv2d, etc.)
- `src/stdlib/optim.rs` — Optimizers (SGD, Adam, etc.)
- `src/stdlib/data.rs` — Data loading types (Dataset, DataLoader)
- `src/stdlib/mem.rs` — Memory management primitives
Tooling
| Module | File | Description |
|---|---|---|
| Formatter | src/fmt.rs | Code formatter. Parses source to AST and re-emits with consistent style. |
| Linter | src/lint.rs | Static analysis linter. Checks for unused variables, naming conventions, complexity, and more. |
| REPL | src/repl.rs | Interactive Read-Eval-Print Loop for Axon expressions. |
| Doc Generator | src/doc.rs | Documentation generation from doc comments. |
| LSP Server | src/lsp/ | Language Server Protocol implementation for IDE integration. |
| Package Manager | src/pkg/ | Package management (manifests, registry, dependency resolution). |
Error System
Errors are categorized by compiler phase using numeric codes:
| Range | Category | Examples |
|---|---|---|
| E0xxx | Lexer/Parser errors | E0001 unexpected character, E0002 unterminated string |
| E1xxx | Name resolution errors | E1001 undefined variable, E1002 duplicate definition |
| E2xxx | Type errors | E2001 type mismatch, E2030 cannot infer type |
| E3xxx | Shape errors | E3001 matmul shape mismatch, E3002 invalid reshape |
| E4xxx | Borrow errors | E4001 use after move, E4003 mutable borrow conflict |
| E5xxx | MIR/Codegen errors | E5009 no main function, E5010 codegen failure |
| W5xxx | Lint warnings | W5001 unused variable, W5006 naming convention |
All errors carry:
- A source `Span` (file, line, column, offset)
- A human-readable message
- A severity level (Error, Warning, Note)
- Optional suggestions for fixes
- An optional diagnostic category for filtering (`parse-error`, `type-error`, `borrow-error`, etc.)
Diagnostics support severity overrides via CLI flags (`--deny`, `--allow`, `--warn`) and an error limit (`--error-limit N`) that stops compilation after N errors.
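The override lookup can be sketched as a map from diagnostic category to severity, with CLI flags taking precedence over per-category defaults. This is a minimal sketch; the real representation in src/error.rs may differ:

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Debug)]
enum Severity {
    Allow, // suppress entirely
    Warn,  // report, keep compiling
    Deny,  // report, fail the build
}

/// Resolve the effective severity of a diagnostic category: an explicit
/// CLI override (--allow/--warn/--deny <category>) wins over the default.
fn effective_severity(
    category: &str,
    default: Severity,
    overrides: &HashMap<String, Severity>,
) -> Severity {
    overrides.get(category).copied().unwrap_or(default)
}

fn main() {
    let mut overrides = HashMap::new();
    // e.g. the user passed `--deny unused-variable`
    overrides.insert("unused-variable".to_string(), Severity::Deny);

    assert_eq!(
        effective_severity("unused-variable", Severity::Warn, &overrides),
        Severity::Deny // lint promoted to a hard error
    );
    assert_eq!(
        effective_severity("type-error", Severity::Deny, &overrides),
        Severity::Deny // untouched categories keep their default
    );
}
```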
Data Flow
```
                ┌──────────┐
Source text ───►│  Lexer   │───► Vec<Token>
                └──────────┘
                     │
                ┌──────────┐
                │  Parser  │───► AST (Program)
                └──────────┘
                     │
                ┌──────────┐
                │  Names   │───► SymbolTable
                └──────────┘
                     │
                ┌──────────┐
                │  TypeCk  │───► TypeInterner + Constraints
                └──────────┘
                     │
            ┌────────┴────────┐
            │                 │
       ┌─────────┐      ┌──────────┐
       │ ShapeCk │      │ BorrowCk │
       └─────────┘      └──────────┘
            │                 │
            └────────┬────────┘
                     │
                ┌──────────┐
                │   TAST   │───► TypedProgram
                └──────────┘
                     │
                ┌──────────┐
                │   MIR    │───► MirProgram
                └──────────┘
                     │
                ┌──────────┐
                │ MIR Pass │───► Optimized MirProgram
                └──────────┘
                     │
                ┌──────────┐
                │ LLVM IR  │───► .ll file
                └──────────┘
                     │
                ┌──────────┐
                │  clang   │───► Native binary
                └──────────┘
```
Key Design Decisions
- Safe Rust only — No `unsafe` blocks anywhere in the compiler.
- Arena-style type interning — Types are identified by `TypeId` (index), enabling O(1) lookups and avoiding lifetime complexity.
- Constraint-based type inference — Generates constraints during traversal, then solves via unification. Enables HM-style inference.
- Error recovery — Parser continues after errors to report multiple diagnostics in one pass.
- Textual LLVM IR — Generates `.ll` files rather than using the LLVM C API, keeping the compiler dependency-free and simplifying builds.
- External `clang` — Uses `clang` as a subprocess for final compilation, avoiding LLVM library linking.
- Stack safety — Recursive descent functions are wrapped with `stacker::maybe_grow` to dynamically grow the stack for deeply nested input, preventing stack overflows.
- MIR optimization passes — Pluggable pass architecture (`MirPass` trait + `PassManager`) enables incremental addition of optimization passes. Dead code elimination and constant folding are built-in at `-O1` and above.
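The pluggable-pass architecture can be sketched as follows. This is a minimal sketch: the real `MirPass` trait and `PassManager` in src/mir/transform/ will differ in detail, and `MirProgram` here is a stand-in for the real MIR type:

```rust
/// Stand-in for the real MIR program type in src/mir/.
struct MirProgram {
    instrs: Vec<i64>,
}

/// A pass transforms the program in place; the manager decides which
/// passes run at which optimization level.
trait MirPass {
    fn name(&self) -> &'static str;
    fn run(&self, program: &mut MirProgram);
}

/// Toy "dead code elimination": drop instructions encoded as 0.
struct DeadCodeElim;
impl MirPass for DeadCodeElim {
    fn name(&self) -> &'static str {
        "dce"
    }
    fn run(&self, program: &mut MirProgram) {
        program.instrs.retain(|&i| i != 0);
    }
}

struct PassManager {
    passes: Vec<Box<dyn MirPass>>,
}

impl PassManager {
    /// Build the pass list for an optimization level; new passes are
    /// added here incrementally without touching the pipeline.
    fn for_opt_level(level: u8) -> Self {
        let mut passes: Vec<Box<dyn MirPass>> = Vec::new();
        if level >= 1 {
            passes.push(Box::new(DeadCodeElim)); // enabled at -O1 and above
        }
        PassManager { passes }
    }

    fn run_all(&self, program: &mut MirProgram) {
        for pass in &self.passes {
            pass.run(program);
        }
    }
}

fn main() {
    let mut prog = MirProgram { instrs: vec![1, 0, 2, 0, 3] };
    PassManager::for_opt_level(1).run_all(&mut prog);
    assert_eq!(prog.instrs, vec![1, 2, 3]); // zeros eliminated
}
```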
Axon Compiler Security Audit
Date: 2024
Scope: Axon compiler (`axonc`) crate — all source under `src/`
1. unsafe Block Inventory
Finding: No unsafe blocks in compiler source code.
A search of the entire `src/` directory confirms zero uses of `unsafe { }` blocks in the Axon compiler implementation. The keyword `unsafe` appears only in:
- Lexer/token — as a keyword token that Axon can lex (`TokenKind::Unsafe`)
- Parser — to parse `unsafe fn` declarations in Axon source
- LSP — as a completion/hover entry for the `unsafe` keyword
- Package manifest — in test data for lint deny lists
This means the Axon compiler itself relies entirely on Rust's safe subset, inheriting all of Rust's memory safety guarantees (no buffer overflows, use-after-free, data races, etc.).
Risk: None — Rust's type system and borrow checker enforce safety at compile time.
2. FFI Boundaries
2.1 Clang Subprocess Invocation
The only external process invocation is in `src/codegen/llvm.rs`:
- `compile_ir_to_binary()` — invokes clang via `std::process::Command`
- `compile_ir_to_object()` — invokes clang via `std::process::Command`
Risks:
- Command injection: The output path is passed directly as a command argument. If an attacker controls the output path, they could potentially inject arguments.
- Path traversal: No validation is performed on the output path.
Mitigations:
- Arguments are passed as separate array elements to `Command::args()`, not concatenated into a shell string. This prevents shell injection.
- The IR content is written to a file first, not passed via stdin, limiting injection vectors.
- Only `clang` is invoked — no shell (`sh -c` / `cmd /c`) is used.
Recommendations:
- Validate/sanitize output paths before passing them to `clang`.
- Use an absolute path for the `clang` binary or verify it on `$PATH`.
- Consider sandboxing `clang` invocations (e.g., seccomp, containers).
- Add a timeout to prevent hanging `clang` processes.
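The recommended invocation pattern can be sketched with a generic runner (the name `run_with_timeout` is hypothetical, not the compiler's actual function): arguments go to `Command::args()` as separate elements so no shell ever parses them, and the child is polled against a deadline instead of blocking forever:

```rust
use std::process::Command;
use std::time::{Duration, Instant};

/// Run an external tool (e.g. clang) with arguments passed as separate
/// array elements, so shell injection is impossible because no shell is
/// involved, and kill the child if it outlives the deadline.
fn run_with_timeout(
    program: &str,
    args: &[&str],
    timeout: Duration,
) -> std::io::Result<bool> {
    let mut child = Command::new(program)
        .args(args) // separate argv entries, never a concatenated string
        .spawn()?;
    let deadline = Instant::now() + timeout;
    loop {
        // Non-blocking check: has the child exited yet?
        if let Some(status) = child.try_wait()? {
            return Ok(status.success());
        }
        if Instant::now() >= deadline {
            child.kill()?; // prevent a hanging subprocess
            let _ = child.wait(); // reap the killed child
            return Ok(false);
        }
        std::thread::sleep(Duration::from_millis(25));
    }
}

fn main() {
    // e.g. run_with_timeout("clang", &["out.ll", "-o", "out"], Duration::from_secs(60))
    assert!(run_with_timeout("true", &[], Duration::from_secs(10)).unwrap());
}
```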
2.2 No Other FFI
The compiler does not use any `extern "C"` functions, does not link to C libraries, and does not use `libc` or `std::ffi` directly.
3. Input Validation
3.1 Source Code Parsing
The lexer and parser handle arbitrary input gracefully:
- Lexer (`src/lexer.rs`): Processes input character-by-character. Unknown characters produce error tokens. Unterminated strings/comments produce error tokens with descriptive messages. No panics on any input.
- Parser (`src/parser.rs`): Uses error recovery to continue parsing after syntax errors. Returns a partial AST plus a list of errors. The `parse_source()` function in `lib.rs` is the safe entry point.
- Type checker (`src/typeck.rs`): Handles undefined types, recursive types, and type mismatches by producing error diagnostics. Falls through gracefully when earlier phases produce errors.
Verification: The fuzz test suite (tests/fuzz_tests.rs) exercises the compiler with 40+ edge cases including empty input, all ASCII characters, malformed syntax, deeply nested structures, and more.
Potential risks:
- Extremely long identifiers: The lexer allocates a `String` for each identifier. A 10GB identifier would consume 10GB of memory.
- Deeply nested expressions: The recursive descent parser uses the call stack. Extremely deep nesting (>1000 levels) may cause stack overflow.
- Exponential type inference: Pathological type constraints could cause the unification algorithm to run for a long time.
Recommendations:
- Add configurable limits on identifier length (e.g., 1024 characters).
- Add a nesting depth limit to the parser (e.g., 256 levels).
- Add a timeout or iteration limit to the type inference engine.
- Add a maximum source file size check (e.g., 10MB).
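A parser depth limit can be sketched as an explicit counter threaded through the recursive descent; the names and the 256-level limit below are illustrative, not the compiler's actual values:

```rust
/// Illustrative nesting limit; the audit suggests something around 256.
const MAX_DEPTH: usize = 256;

/// Fail cleanly once nesting exceeds the limit, instead of letting the
/// recursion run until the call stack overflows.
fn check_depth(depth: usize) -> Result<(), String> {
    if depth > MAX_DEPTH {
        Err(format!("expression nesting exceeds {} levels", MAX_DEPTH))
    } else {
        Ok(())
    }
}

/// Toy recursive-descent fragment: count nested `(` prefixes, guarding
/// every level of recursion with the depth check.
fn parse_nested(input: &str, depth: usize) -> Result<usize, String> {
    check_depth(depth)?;
    match input.strip_prefix('(') {
        Some(rest) => parse_nested(rest, depth + 1),
        None => Ok(depth),
    }
}

fn main() {
    // Shallow input parses fine; pathological input is rejected
    // long before the stack is at risk.
    assert!(parse_nested(&"(".repeat(10), 0).is_ok());
    assert!(parse_nested(&"(".repeat(100_000), 0).is_err());
}
```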
4. Package Registry Security Model
The package system (src/pkg/) is in early development. Future security considerations:
4.1 Package Integrity
- Requirement: All packages must have cryptographic signatures (ed25519).
- Requirement: Package contents must be verified against a hash (SHA-256).
- Requirement: Lock files must pin exact versions with hashes.
4.2 Dependency Resolution
- Risk: Dependency confusion attacks (public package overriding private).
- Mitigation: Support private registry priorities, namespace scoping.
- Risk: Typosquatting.
- Mitigation: Name similarity checks during `axon pkg add`.
4.3 Build Scripts
- Risk: Arbitrary code execution during package installation.
- Recommendation: Axon should NOT support arbitrary build scripts. Instead, provide a declarative build configuration.
4.4 Supply Chain
- Recommendation: Support an `axon audit` command to check for known vulnerabilities in dependencies.
- Recommendation: Support reproducible builds.
5. REPL Security Considerations
The REPL (src/repl.rs) reads from stdin and evaluates Axon expressions.
Current state: The REPL only performs parsing and type checking — it does not execute code. This limits the attack surface.
Future risks (when execution is added):
- File system access: Axon code in the REPL should be sandboxed.
- Network access: Should be disabled by default in REPL mode.
- Resource limits: CPU time and memory should be bounded.
- History file: REPL history should be stored with restricted permissions.
Recommendations:
- Implement a capability-based security model for REPL execution.
- Default to a restricted sandbox with explicit opt-in for I/O.
- Add a `:sandbox on/off` REPL command for security-conscious users.
6. Memory Safety Guarantees
6.1 Rust Safety Model
The Axon compiler is written entirely in safe Rust. This provides:
- No buffer overflows: Bounds checking on all array/vector accesses.
- No use-after-free: Ownership system prevents dangling references.
- No null pointer dereferences: `Option<T>` forces explicit handling.
- No data races: Borrow checker prevents shared mutable state.
- No uninitialized memory: All values must be initialized.
6.2 Allocation Patterns
- Type interner (`src/types.rs`): Uses arena-style allocation via `Vec<Type>`. Types are identified by index (`TypeId`), preventing dangling references.
- Symbol table (`src/symbol.rs`): Uses a scoped `HashMap` with stack-based scope management. Scopes are pushed/popped correctly.
- AST nodes (`src/ast.rs`): Heap-allocated via `Box<Expr>` for recursive types. Ownership is clear and single-owner.
- Error collection: Errors are collected in `Vec<CompileError>` and returned to the caller. No global mutable state.
6.3 Dependencies
| Crate | Version | Purpose | Risk |
|---|---|---|---|
| `serde` | 1.x | Serialization | Low — widely audited |
| `serde_json` | 1.x | JSON output | Low — widely audited |
| `clap` | 4.x | CLI argument parsing | Low — widely audited |
All dependencies are well-established, widely audited crates with no known vulnerabilities.
Summary
| Area | Status | Risk Level |
|---|---|---|
| `unsafe` code | None found | ✅ None |
| FFI boundaries | Clang subprocess only | ⚠️ Low |
| Input validation | Good (with fuzz tests) | ⚠️ Low |
| Package registry | Not yet implemented | 📋 Future |
| REPL security | Parse-only (no execution) | ✅ None (currently) |
| Memory safety | Full Rust safety guarantees | ✅ None |
| Dependencies | 3 well-audited crates | ✅ None |
Overall assessment: The Axon compiler has a strong security posture thanks to being written in safe Rust with minimal dependencies. The primary areas for future hardening are input size limits and the package registry security model.
Contributing to Axon
Thank you for your interest in contributing to the Axon programming language! This guide covers how to build from source, run tests, and submit changes.
Getting Started
Prerequisites
- Rust (stable, 1.75+) — rustup.rs
- Git
- Clang (for the native binary backend) — optional for most development
Clone and Build
```bash
git clone https://github.com/axon-lang/axon.git
cd axon
cargo build
```
Verify it works:
```bash
cargo run -- --help
cargo run -- lex tests/examples/example1_hello.axon
```
Project Structure
```
axon/
├── Cargo.toml                 # Rust project manifest
├── src/
│   ├── main.rs                # CLI entry point (axonc)
│   ├── lib.rs                 # Library root — compiler pipeline
│   ├── token.rs               # Token types
│   ├── lexer.rs               # Lexer (source → tokens)
│   ├── ast.rs                 # AST node definitions
│   ├── parser.rs              # Parser (tokens → AST)
│   ├── span.rs                # Source location tracking
│   ├── error.rs               # Error types and reporting
│   ├── types.rs               # Type system (Type, TypeInterner)
│   ├── symbol.rs              # Symbol table and name resolution
│   ├── typeck.rs              # Type checker (HM inference)
│   ├── shapes.rs              # Shape checker (tensor dims)
│   ├── borrow.rs              # Borrow checker (ownership)
│   ├── tast.rs                # Typed AST
│   ├── mir.rs                 # Mid-level IR
│   ├── codegen/
│   │   ├── llvm.rs            # LLVM IR generation
│   │   ├── mlir.rs            # MLIR / GPU backend
│   │   ├── runtime.rs         # Runtime library
│   │   └── abi.rs             # ABI and symbol mangling
│   ├── stdlib/                # Standard library definitions
│   │   ├── prelude.rs         # Auto-imported items
│   │   ├── ops.rs             # Operator traits
│   │   ├── collections.rs     # Vec, HashMap, Option, Result
│   │   ├── tensor.rs          # Tensor operations
│   │   ├── nn.rs              # Neural network layers
│   │   ├── autograd.rs        # Automatic differentiation
│   │   ├── optim.rs           # Optimizers
│   │   ├── loss.rs            # Loss functions
│   │   └── ...                # More stdlib modules
│   ├── fmt.rs                 # Code formatter
│   ├── lint.rs                # Linter
│   ├── doc.rs                 # Documentation generator
│   ├── repl.rs                # REPL
│   ├── lsp/                   # Language server
│   │   └── handlers.rs        # LSP request handlers
│   └── pkg/                   # Package manager
│       ├── manifest.rs        # Axon.toml parsing
│       ├── resolver.rs        # Dependency resolution
│       └── commands.rs        # CLI commands
├── stdlib/                    # Axon source stubs (.axon files)
├── tests/
│   ├── integration_tests.rs
│   ├── type_tests.rs
│   ├── codegen_tests.rs
│   ├── stdlib_tests.rs
│   ├── ai_framework_tests.rs
│   ├── tooling_tests.rs
│   └── examples/*.axon        # Example programs
├── editors/
│   └── vscode/                # VS Code extension
├── benches/                   # Benchmarks
├── fuzz/                      # Fuzz testing
└── docs/                      # Documentation
```
Running Tests
Full Test Suite
```bash
cargo test
```
This runs 863+ tests across all compiler phases.
Specific Test Files
```bash
# Lexer and parser tests
cargo test --lib lexer
cargo test --lib parser

# Type checker tests
cargo test --test type_tests

# Code generation tests
cargo test --test codegen_tests

# Standard library tests
cargo test --test stdlib_tests

# AI framework tests
cargo test --test ai_framework_tests

# Tooling tests (LSP, formatter, linter, REPL)
cargo test --test tooling_tests
```
Running a Single Test
```bash
cargo test test_name_here -- --exact
```
Running Benchmarks
```bash
cargo test --test compiler_bench -- --ignored
```
Development Workflow
1. Create a Branch
```bash
git checkout -b feature/my-feature
```
2. Make Changes
Edit the relevant source files. The compiler pipeline flows:
```
Source → Lexer → Parser → AST
              ↓
       Name Resolution
              ↓
Type Checker → Shape Checker → Borrow Checker
              ↓
          Typed AST
              ↓
             MIR
              ↓
       LLVM IR / MLIR
              ↓
        Native Binary
```
3. Add Tests
Every change should include tests. Add them to the appropriate test file:
- Lexer/Parser changes: `src/lexer.rs` or `src/parser.rs` (unit tests)
- Type system changes: `tests/type_tests.rs`
- Codegen changes: `tests/codegen_tests.rs`
- Stdlib additions: `tests/stdlib_tests.rs`
- Tooling changes: `tests/tooling_tests.rs`
4. Run Tests
Ensure all tests pass before submitting.
```bash
cargo test
```
5. Format and Lint
```bash
cargo fmt
cargo clippy
```
6. Submit a Pull Request
Push your branch and open a PR. Include:
- Description of what the change does
- Related issue number (if any)
- Test output confirming tests pass
Coding Guidelines
Style
- Follow Rust standard style (`cargo fmt`)
- Use descriptive variable names
- Add doc comments (`///`) for public items
- Keep functions focused and under 50 lines when possible
Error Handling
- Use proper error codes (see Compiler Errors)
- Include source locations in all errors
- Add suggestions where helpful
- Test both success and error cases
Testing
- Each feature should have positive and negative tests
- Test edge cases (empty input, deeply nested structures, etc.)
- Integration tests should use `.axon` example files
- Aim for test names that describe what they verify
Adding a New Feature
Adding a New Keyword
- Add the keyword to the `Token` enum in `src/token.rs`
- Add it to the keyword map in `src/lexer.rs`
- Add parser support in `src/parser.rs`
- Add an AST node in `src/ast.rs`
- Add type checking in `src/typeck.rs`
- Add tests at each level
- Update documentation
Adding a Stdlib Function
- Add the function signature in `src/stdlib/<module>.rs`
- Register it in the type checker (`src/typeck.rs`)
- Add an Axon stub in `stdlib/<module>.axon`
- Add tests in `tests/stdlib_tests.rs`
- Update documentation
Adding a New Lint Rule
- Add the warning code to `src/lint.rs`
- Implement detection logic
- Add tests in `tests/tooling_tests.rs`
- Document in `docs/reference/compiler-errors.md`
Architecture Overview
For detailed architecture documentation, see docs/internals/architecture.md.
Key Design Principles
- Correctness first — the compiler should never accept invalid programs
- Helpful errors — every error should explain what went wrong and suggest a fix
- Performance — the compiler should be fast (targeting <100ms for typical files)
- Testability — every component should be independently testable
Communication
- Issues: Report bugs and request features via GitHub Issues
- Discussions: Design discussions in GitHub Discussions
- Code Review: All changes require at least one review
License
Axon is open source. By contributing, you agree that your contributions will be licensed under the same license as the project.
See Also