In this guidebook we will learn how the compiler works and how to contribute to it by adding new features or fixing bugs. We will cover the CLI interface, the compiler architecture, how to create builtins, the standard library and tests.
CLI Interface
The entire CLI interface is defined in main.rs, using clap for argument parsing. The main function initializes the AmberCompiler struct (defined in src/compiler.rs), which serves as the main driver for the compilation process.
Available subcommands include:
- Run: Compiles and executes an Amber script immediately.
- Build: Compiles an Amber script to a Bash script.
- Eval: Executes a snippet of Amber code passed as a string.
- Check: Parses and type-checks the code without generating output.
- Docs: Parses the code and generates documentation for it.
- Test: Runs tests defined in the Amber project.
- Completion: Generates shell completion scripts.
When a command is executed, main.rs configures the AmberCompiler with the appropriate options and calls its methods (e.g., compile(), execute(), generate_docs()) to perform the requested task.
Compiler
The compiler consists of:
- src/compiler.rs - Main entry point for the compiler
- src/rules.rs - Syntax rules that are used by the Heraclitus framework to correctly output tokens
- src/utils - Contains parsing environments, caches, contexts and Amber's implementations of metadata
- src/modules - Syntax modules that parse Amber syntax and also handle the translation process
- src/translate - Contains the definition of the TranslateModule trait, which is used to translate the previously mentioned modules
The AmberCompiler struct by itself is just a bootstrapper for all the syntax modules. Here we will learn some practical facts about the compiler. For a more in-depth guide, visit our compiler guide.
Parser & Tokenizer
Thanks to Heraclitus, we can use simple abstractions to go through tokens.
Open any syntax module source file and find a line that says:
impl SyntaxModule<ParserMetadata> for MODULE_NAME_HERE
It will have a parse() function, where all the magic happens. You can either dig into the code yourself or look at the example below to understand how it works.
Example parser
Important: this is pseudocode. Its purpose is to demonstrate what a parser should look like.
// This code parses the following: `1 + 2`
fn parse(&mut self, meta: &mut ParserMetadata) -> SyntaxResult {
    let digit_1 = meta.get_current_token(); // gets the text (as an Option)
    token(meta, "+")?; // checks that there is a "+" and skips it
    let digit_2 = meta.get_current_token();
    self.digit_1 = digit_1.unwrap();
    self.digit_2 = digit_2.unwrap();
    Ok(())
}
Parsing Logic & Failures
The parsing process in Heraclitus revolves around the SyntaxResult type, which is an alias for Result<(), Failure>. The Failure type is critical for control flow and offers two distinct error modes:
- Quiet Error: which means "This is not the syntax validation you are looking for."
  - Returned when a syntax module doesn't match the current code (e.g., looking for a let keyword but finding if).
  - The compiler catches this error and backtracks to try the next available syntax module.
- Loud Error: which means "This IS the correct module, but the code is wrong."
  - Returned when the compiler is certain it's parsing the correct construct but encounters invalid syntax (e.g., missing semicolon after variable declaration).
  - This error halts the entire compilation process and reports a failure to the user.
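A minimal, self-contained sketch of this control flow follows. The `Failure` enum, the `Module` struct and the two toy parsers are simplified stand-ins invented for illustration (the real Heraclitus `Failure` carries more detail, and real modules implement `SyntaxModule`); only the Quiet/Loud distinction itself is taken from the description above.

```rust
// Simplified stand-in for the Heraclitus `Failure` type (assumption: the
// real type carries position and message details that are omitted here).
#[derive(Debug, PartialEq)]
enum Failure {
    Quiet,
    Loud(String),
}

type SyntaxResult = Result<(), Failure>;

// Hypothetical syntax module: a name plus a parse function over raw tokens.
struct Module {
    name: &'static str,
    parse: fn(&[&str]) -> SyntaxResult,
}

// Matches `let <name> = <value>`.
fn parse_let(tokens: &[&str]) -> SyntaxResult {
    // Wrong keyword: this is not our construct, so fail quietly.
    if tokens.first() != Some(&"let") {
        return Err(Failure::Quiet);
    }
    // The keyword matched, so from here on malformed input is a Loud error.
    if tokens.len() != 4 || tokens[2] != "=" {
        return Err(Failure::Loud("expected `let <name> = <value>`".to_string()));
    }
    Ok(())
}

// Matches `if <condition...>`.
fn parse_if(tokens: &[&str]) -> SyntaxResult {
    if tokens.first() != Some(&"if") {
        return Err(Failure::Quiet);
    }
    if tokens.len() < 2 {
        return Err(Failure::Loud("expected a condition after `if`".to_string()));
    }
    Ok(())
}

// Tries each module in order: Quiet errors move on to the next module,
// Loud errors abort immediately.
fn parse_statement(tokens: &[&str]) -> Result<&'static str, Failure> {
    let modules = [
        Module { name: "let", parse: parse_let },
        Module { name: "if", parse: parse_if },
    ];
    for module in &modules {
        match (module.parse)(tokens) {
            Ok(()) => return Ok(module.name),
            Err(Failure::Quiet) => continue,
            Err(loud) => return Err(loud),
        }
    }
    // No module recognised the input.
    Err(Failure::Quiet)
}
```

With these definitions, `parse_statement(&["if", "x"])` skips the `let` module after its Quiet error and succeeds with `if`, while `parse_statement(&["let", "x"])` stops immediately with a Loud error.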
Heraclitus Functions
Heraclitus provides a set of helper functions and macros to streamline parsing and error reporting:
- token(meta, "keyword"): Attempts to consume a specific text token. Returns a Quiet error if the token doesn't match.
- token_by(meta, pattern): Matches a token using a boolean predicate function. Returns a Quiet error if the predicate rejects the token.
- syntax(meta, &mut submodule): Recursively parses a nested syntax module. It propagates whatever error the submodule returns (Quiet or Loud).
- error!(meta, tok, ...): A macro that halts compilation with a Loud error at the position of the provided token.
- error_pos!(meta, pos => ...): A macro that halts compilation with a Loud error at a specific position, which can span more than a single token, allowing for detailed error messages with context.
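To make the behaviour of `token` and `token_by` concrete, here is a toy re-implementation over a whitespace-split token list. Everything in it (`Meta`, `Add`, `is_number`, and the helper signatures) is a hypothetical simplification and not the Heraclitus API; it only mirrors the rule that a failed match is Quiet and that a module can escalate to Loud once it has committed.

```rust
// Toy stand-ins for illustration only; the real helpers live in Heraclitus
// and operate on richer token and metadata types.
#[derive(Debug, PartialEq)]
enum Failure {
    Quiet,
    Loud(String),
}

// Minimal parser state: a token list and a cursor.
struct Meta {
    tokens: Vec<String>,
    index: usize,
}

impl Meta {
    fn new(code: &str) -> Self {
        Meta {
            tokens: code.split_whitespace().map(String::from).collect(),
            index: 0,
        }
    }
}

// Consumes the next token if the predicate accepts it, otherwise fails quietly.
fn token_by(meta: &mut Meta, pred: impl Fn(&str) -> bool) -> Result<String, Failure> {
    match meta.tokens.get(meta.index) {
        Some(tok) if pred(tok.as_str()) => {
            let text = tok.clone();
            meta.index += 1;
            Ok(text)
        }
        _ => Err(Failure::Quiet),
    }
}

// Consumes the next token only if it equals `text`.
fn token(meta: &mut Meta, text: &str) -> Result<String, Failure> {
    token_by(meta, |tok| tok == text)
}

fn is_number(tok: &str) -> bool {
    !tok.is_empty() && tok.chars().all(|c| c.is_ascii_digit())
}

// Hypothetical module that parses `<number> + <number>`.
#[derive(Debug, Default, PartialEq)]
struct Add {
    left: String,
    right: String,
}

impl Add {
    fn parse(&mut self, meta: &mut Meta) -> Result<(), Failure> {
        // Before the `+` is seen, a mismatch only means "not this module".
        self.left = token_by(meta, is_number)?;
        token(meta, "+")?;
        // After the `+` we are committed, so a missing operand is Loud.
        self.right = token_by(meta, is_number)
            .map_err(|_| Failure::Loud("expected a number after `+`".to_string()))?;
        Ok(())
    }
}
```

Parsing `1 + 2` fills both fields, `1 - 2` yields a Quiet error, and `1 +` yields a Loud one. Unlike a real parser, this toy does not rewind the cursor after a Quiet error.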
Translator
As with the parser, open a syntax module and find a line that says impl TranslateModule for MODULE_NAME_HERE; it should contain a translate function.
As before, you can either dig into the code you opened or look at the example below.
Example translator
Important: this is pseudocode. Its purpose is to demonstrate what a translator should look like.
// This will translate `1 + 2` into `(( 1 + 2 ))`
fn translate(&self) -> String {
    // self.digit_1 and self.digit_2 are set earlier by the parser
    format!("(( {} + {} ))", self.digit_1, self.digit_2)
}
In the actual compiler, the translate() method returns a FragmentKind rather than a String. A FragmentKind represents a piece of the compiled shell script.
Fragments
Amber compiles to shell script fragments. The FragmentKind enum encapsulates these different types of output. You can find available fragment modules in src/translate/fragments/. Common ones include:
- RawFragment (raw): Represents a raw string of shell code (e.g., echo "hello").
- BlockFragment (block): Represents a block of code, often used for bodies of functions or loops.
- ListFragment (list): A list of fragments properly joined together.
- SubprocessFragment (subprocess): For command substitutions $(...).
- VarExprFragment / VarStmtFragment: For handling variable usage and definition.
To construct these fragments easily, Amber provides helper macros:
- raw_fragment!("echo {}", value): Creates a RawFragment with formatted text.
- fragments!(a, b, c): Joins multiple fragments into a ListFragment.
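The idea of composing output from fragments can be illustrated with a small stand-alone model. The `Fragment` enum, its `render` method and the `raw` helper below are hypothetical and much simpler than the real `FragmentKind` in src/translate/fragments/; in particular, joining list items with a single space is an assumption made for this sketch.

```rust
// Toy model of the fragment idea; the real `FragmentKind` has more
// variants and richer rendering rules.
#[derive(Debug, Clone, PartialEq)]
enum Fragment {
    // A raw piece of shell code.
    Raw(String),
    // Several fragments rendered one after another.
    List(Vec<Fragment>),
    // A command substitution wrapping another fragment.
    Subprocess(Box<Fragment>),
}

impl Fragment {
    // Renders the fragment tree into shell text.
    fn render(&self) -> String {
        match self {
            Fragment::Raw(text) => text.clone(),
            // Assumption: items are joined with a single space.
            Fragment::List(items) => items
                .iter()
                .map(Fragment::render)
                .collect::<Vec<_>>()
                .join(" "),
            Fragment::Subprocess(inner) => format!("$({})", inner.render()),
        }
    }
}

// Convenience constructor, loosely playing the role of `raw_fragment!`.
fn raw(text: &str) -> Fragment {
    Fragment::Raw(text.to_string())
}
```

For example, a list containing `raw("echo")` and a subprocess that wraps the list `date`, `+%Y` renders as `echo $(date +%Y)`. Keeping the output as a tree until the last step lets each module decide how its own piece is rendered.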
Creating Builtins
In this guide we will see how to create a basic built-in function that looks like this in Amber syntax:
example "Hello World"
And compiles to:
echo "Hello World"
For a real example based on this guide, see the cd builtin, which is also Failable.
Let's start!
Create a src/modules/builtin/builtin.rs file with the following content:
// Import the core Heraclitus framework traits and types required for defining syntax modules
use heraclitus_compiler::prelude::*;
// Import the Expression module to parse arguments as expressions
use crate::modules::expression::expr::Expr;
// Import the TranslateModule trait to define how this syntax translates to shell code
use crate::translate::module::TranslateModule;
// Import metadata structures:
// - `ParserMetadata`: Tracks parsing state (declared variables, functions, warnings, current scope).
// - `TranslateMetadata`: Tracks translation state (indentation level, silent/eval modes).
use crate::utils::{ParserMetadata, TranslateMetadata};
// Import DocumentationModule (required trait, even if unused for internal builtins)
use crate::docs::module::DocumentationModule;
// Import the `raw_fragment` macro for easy construction of shell script fragments
use crate::raw_fragment;
// This struct represents the parsed state of our builtin.
// It holds the data extracted during parsing.
#[derive(Debug, Clone)]
pub struct Example {
    // We expect this builtin to take one argument, which is an expression.
    value: Expr,
}
// Implement the SyntaxModule trait to define how to parse this construct.
impl SyntaxModule<ParserMetadata> for Example {
    // Defines the name used for this module in compiler debug logs and traces.
    syntax_name!("Example");
    // Returns a default instance of the struct.
    fn new() -> Self {
        Example {
            value: Expr::new()
        }
    }
    // The core parsing logic.
    // Returns `SyntaxResult`, which is `Result<(), Failure>`.
    // See "Parsing Logic & Failures" for details on Quiet vs Loud errors.
    fn parse(&mut self, meta: &mut ParserMetadata) -> SyntaxResult {
        // 1. Match the keyword "example".
        // `token(...)` attempts to consume the specific token. If it fails, it returns a `Quiet` error.
        // The `?` operator propagates this error, allowing the compiler to try other modules.
        token(meta, "example")?;
        // 2. Parse the argument.
        // Once we've matched the keyword "example", we are committed to this syntax.
        // `syntax(...)` will return either a `Loud` or `Quiet` error from the submodule.
        syntax(meta, &mut self.value)?;
        Ok(())
    }
}
// Implement TypeCheckModule to validate types before translation.
impl TypeCheckModule for Example {
    fn typecheck(&mut self, meta: &mut ParserMetadata) -> SyntaxResult {
        // 1. Recursively typecheck the argument expression first.
        self.value.typecheck(meta)?;
        // 2. Validate that the argument is of the expected type (Text).
        if self.value.get_type() != Type::Text {
            let pos = self.value.get_position();
            // `error_pos!` creates a formatted `Loud` error message pointing to the specific
            // location in the user's code.
            return error_pos!(meta, pos => {
                message: "Builtin function `example` can only be used with values of type Text"
            });
        }
        Ok(())
    }
}
// Implement TranslateModule to convert the AST into the target shell script.
impl TranslateModule for Example {
    fn translate(&self, meta: &mut TranslateMetadata) -> FragmentKind {
        // 1. Translate the argument expression into a shell string.
        let value = self.value.translate(meta);
        // 2. Construct the final shell command.
        // `raw_fragment!` creates a code fragment that is inserted directly into the output script.
        raw_fragment!("echo {}", value)
    }
}
// Implement DocumentationModule.
// For internal builtins not exposed in standard docs, we return an empty string.
impl DocumentationModule for Example {
    fn document(&self, _meta: &ParserMetadata) -> String {
        String::new()
    }
}
Now let's import it in the main module for built-ins, src/modules/builtin/mod.rs:
pub mod echo;
pub mod nameof;
// ...
pub mod builtin;
Now we have to integrate this syntax module with either statement Stmt or expression Expr. Since this is a statement module, we'll add it to the list of statement syntax modules. Let's modify src/modules/statement/stmt.rs:
// 1. Import your new module
use crate::modules::builtin::builtin::Example;
// 2. Add it to the StmtType enum
// This allows the AST (Abstract Syntax Tree) to hold your new construct.
pub enum StmtType {
    // ...
    Example(Example)
}
// 3. Register it in the parsing loop
impl SyntaxModule<ParserMetadata> for Statement {
    // ...
    fn parse(&mut self, meta: &mut ParserMetadata) -> SyntaxResult {
        // `parse_statement!` iterates through the provided modules in order.
        // The order determines precedence (though keywords usually disambiguate).
        parse_statement!([
            // ...
            Example,
            // ...
        ], |module, cons| {
            // ...
        })
    }
}
Don't forget to add a test in the validity folder and to add the new builtin to the list of the reserved keywords.
Standard Library
The Amber Standard Library (stdlib) is a collection of essential modules written in Amber itself, located in the src/std directory. It provides foundational capabilities that are available to every Amber program.
Modules include:
- Text (text.ab): String manipulation functions (splitting, joining, trimming).
- Math (math.ab): Mathematical constants and functions.
- Array (array.ab): Utilities for handling arrays and lists.
- FS (fs.ab): File system operations (reading, writing, checking existence).
- Env (env.ab): Environment variable access and manipulation.
- Date (date.ab): Date and time utilities.
- Http (http.ab): Basic HTTP request capabilities.
and more...
Every function in the standard library is rigorously tested. You can find these tests in src/tests/stdlib/. When adding new standard library features, you must add corresponding tests to ensure correctness and prevent regressions.
Tests
Amber uses cargo test for testing. The tests are split into the following groups:
- validity - the validity of the compiler output (src/tests/validity/)
- erroring - the error handling of the compiler (src/tests/erroring/)
- stdlib - the standard library functions (src/tests/stdlib/)
For every test written in Amber there are 3 ways to check the result, applied in this order:
- If there is a // Output comment at the top of the file, the output it contains is matched against the test output.
- The word Succeeded is matched against the test output.
Tests will be executed without recompilation. Amber will load the scripts and verify the output in the designated file to determine if the test passes.
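The lookup of the expected output can be sketched as a small function. This is not the actual test harness: the function name `expected_output` is hypothetical, and the comment layout it parses (a first line reading `// Output`, followed by `//` lines carrying the expected text) is an assumption. It only covers the `// Output` comment and the `Succeeded` fallback mentioned above.

```rust
// Returns the output a test script is expected to print.
// Assumption: the expected output is written as `//` comment lines directly
// below a first line reading `// Output`; the real harness may differ.
fn expected_output(script: &str) -> String {
    let mut lines = script.lines();
    // Without an `// Output` header, fall back to the default marker word.
    if lines.next().map(str::trim) != Some("// Output") {
        return "Succeeded".to_string();
    }
    // Collect the comment lines that follow the header, stopping at the
    // first line that is not a comment.
    lines
        .map_while(|line| line.strip_prefix("//"))
        .map(str::trim)
        .collect::<Vec<_>>()
        .join("\n")
}
```

A script starting with `// Output` and `// hi!` is therefore expected to print `hi!`, while a script without the header is expected to print `Succeeded`.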
Some tests require additional setup, such as those for download, which need Rust to start a web server. These functions require special tests written in Rust, which can be found in the src/tests/stdlib.rs file.
Let's write a simple test:
#[test]
fn prints_hi() {
    let code = "
        echo \"hi!\"
    ";
    test_amber(code, "hi!", TestOutcomeTarget::Success);
}