The Diatom Reference

This book is the reference for the Diatom Programming Language.

About

Diatom is an embedded scripting language that aims to make scripting easy, secure and maintainable without compromising on speed.

Try Diatom

Use our online playground to try Diatom without installing anything.

Get Diatom

Diatom is available at crates.io.

The latest API documentation is available here.

Grammar

This chapter is about the grammar of diatom.

General Grammar Rules

Diatom's grammar is designed with the following concepts in mind:

  • No delimiter: Statements and expressions do not need a newline or ; to separate them from each other. That means a = 1 b = 2 is a perfectly fine line of code.
  • Fewer punctuation marks: Diatom significantly reduces the need for redundant punctuation such as () and {} pairs. Blocks use keywords instead of {}, and tuples do not need to be surrounded by (); writing a,b is enough to make a tuple. [1 2 3] is a valid list in Diatom; no comma is required or allowed.
  • Format-independent syntax: Diatom does not use so-called significant whitespace in its grammar. You are free to format your code without changing its meaning. To prevent unintended behavior, most constructs require some kind of terminator, as in if ... then ... end, so that users do not accidentally write to the wrong scope.
  • Expression based: Every expression (including a block) has a return value. Placing an expression at the end of a block automatically returns its value. For example, fn x = x + 1 returns x + 1; no return is needed.

Tokens

Tokens are the result of lexical analysis performed by the compiler.

Identifier

Syntax
Identifier: [_a-zA-Z][_0-9a-zA-Z]*

Note: Unicode identifiers are allowed but discouraged.

A valid identifier is a sequence of characters that satisfies the following conditions:

  • The sequence must not start with '0'..'9'.
  • The sequence must not contain whitespace characters, namely ' ', '\r', '\n', '\t'.
  • The sequence must not contain any ASCII symbols such as ?#$@%... .
  • Unicode characters may be used. For example, 🐱🐶 is a valid identifier.
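
As a rough illustration, the ASCII pattern from the Syntax block can be checked with a regular expression. This is a Python sketch, not part of Diatom: the helper name is_ascii_identifier is hypothetical, and the sketch deliberately ignores the Unicode identifiers mentioned in the note as well as reserved keywords.

```python
import re

# The ASCII identifier pattern from the Syntax block above.
IDENTIFIER = re.compile(r"[_a-zA-Z][_0-9a-zA-Z]*")

def is_ascii_identifier(text: str) -> bool:
    """Return True if the whole string matches the ASCII identifier pattern."""
    return IDENTIFIER.fullmatch(text) is not None

print(is_ascii_identifier("_tmp1"))  # True
print(is_ascii_identifier("1abc"))   # False: starts with a digit
print(is_ascii_identifier("a b"))    # False: contains whitespace
```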

Keywords

Syntax
Keyword:
"until"
| "end"
| "if"
| "then"
| "else"
| "elsif"
| "in"
| "for"
| "do"
| "return"
| "break"
| "continue"
| "loop"
| "def"
| "fn"
| "begin"
| "and"
| "or"
| "not"
| "is"
| "require"

Numeric Literal

Syntax
Number:
( [ 0 ] [ Xx ] [ _0-9a-fA-F ]+)
|( [ 0 ] [ Bb ] [ _0-1 ]+)
|( [ 0 ] [ Oo ] [ _0-7 ]+)
|( [ 0-9 ] [ _0-9 ]* ( \.[ _0-9 ]+ ){0,1} ( [ Ee ] [ \+\- ]{0,1} [ 0-9_ ]* ){0, 1} )

Literals*        Example       Exponentiation
Decimal integer  123_456       N/A
Hex integer      0xff1         N/A
Octal integer    0o72          N/A
Binary integer   0b1111_0100   N/A
Floating point   12.23e-5      Optional

*: All number literals allow _ as a visual separator: 1_234.0E+18
*: All characters are case insensitive.
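
To make the grammar concrete, here is a small Python sketch that recognises the four literal forms and converts them to a value. It is only a model: the function name parse_number is hypothetical, the regular expressions are transcribed from the Number grammar above with the layout whitespace removed, and the choice to return an integer unless a fraction or exponent is present is an assumption rather than documented Diatom behavior.

```python
import re

# Patterns transcribed from the Number grammar above.
HEX = re.compile(r"0[Xx][_0-9a-fA-F]+")
BIN = re.compile(r"0[Bb][_01]+")
OCT = re.compile(r"0[Oo][_0-7]+")
DEC = re.compile(r"[0-9][_0-9]*(\.[_0-9]+)?([Ee][+-]?[0-9_]*)?")

def parse_number(text: str):
    """Parse a literal into an int or float; raise ValueError otherwise."""
    digits = text.replace("_", "")  # '_' is only a visual separator
    if HEX.fullmatch(text):
        return int(digits[2:], 16)
    if BIN.fullmatch(text):
        return int(digits[2:], 2)
    if OCT.fullmatch(text):
        return int(digits[2:], 8)
    if DEC.fullmatch(text):
        # Assumption: a fraction or exponent makes the literal a float.
        if any(c in digits for c in ".eE"):
            return float(digits)
        return int(digits)
    raise ValueError(f"not a numeric literal: {text!r}")

print(parse_number("123_456"))      # 123456
print(parse_number("0xff1"))        # 4081
print(parse_number("0o72"))         # 58
print(parse_number("0b1111_0100"))  # 244
```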

String Literal

A string literal is any sequence of characters surrounded by a pair of " or '. Any UTF-8 characters may be included. To write special characters, use an escape sequence introduced by \. All escape sequences are listed below.

Syntax
String:
' ( [^\\] | EscapeSequence | \\' )* '
| " ( [^\\] | EscapeSequence | \\" )* "

EscapeSequence:
\\x [0-7][0-9a-fA-F]
| \\u [0-9a-fA-F]{4}
| \\U [0-9a-fA-F]{8}
| \\r | \\n | \\t | \\

Escapes      Character
\\           backslash
\r           Carriage return
\n           Newline
\t           Tab
\x41         7-bit character code (exactly 2 hex numbers*)
\u00E9       16-bit unicode escape (exactly 4 hex numbers*)
\U000000E9   32-bit unicode escape (exactly 8 hex numbers*)

*: All hex numbers ignore case.
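
Python happens to spell these three numeric escapes the same way, so the example escapes from the table can be checked there. The snippet below is only an illustration in Python and says nothing about how Diatom implements escapes; the variable names are made up for the example.

```python
# Raw strings show the escape as written; normal strings show the decoded character.
examples = {
    r"\x41": "\x41",              # two hex digits
    r"\u00E9": "\u00E9",          # four hex digits
    r"\U000000E9": "\U000000E9",  # eight hex digits
}

for written, value in examples.items():
    print(written, "->", value, hex(ord(value)))
```

The last two escapes name the same code point, so they decode to the same character.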

Operators

Syntax
Operator:
<singular operator>
| $ \( <expression>* \)
| $ \[ <expression>* \]
| $ \{ (<expression>*) | (<identifier> : <expression>)* \}

Here is a list of all operators.

Operator  Meaning                      Example             Precedence (left and right)
.         access member                [1,2].len$()        23,24
::        access member                List::len$([1,2])   23,24
<-        set meta-table               {} <- MetaTable     21,22
not       logic not                    not false           19 (prefix)
$()       function call                f$()                20 (postfix)
$[]       index array                  list$[0]            20 (postfix)
${}       data constructor             Just${1}            20 (postfix)
**        power                        2**3                17,18
*         multiply                     2*3                 15,16
//        integer division             7//3                15,16
/         division                     7/3                 15,16
%         modulo                       5%2                 15,16
+         addition                     1+2                 13,14
-         subtract or negative value   1-2 -2              19 (prefix); 13,14 (infix)
>         greater than                 5>4                 11,12
>=        greater than or equal to     4>=3                11,12
<         less than                    2<6                 11,12
<=        less than or equal to        5<=5                11,12
==        equal to                     3==3                11,12
is        compare reference            {} is []            11,12
<>        not equal to                 5<>4                11,12
and       logic and                    true and false      9,10
or        logic or                     true or false       7,8
..        range                        1..10               5,6
,         create a tuple               1,2,3               1,2
=         assign value to variable     a=5                 4,3
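
One way to read the precedence pairs is as left and right binding powers in the style of a Pratt parser. That reading is an assumption, as is everything about the sketch below: it is written in Python, handles only a handful of infix operators on pre-split tokens, and is not Diatom's actual parser. It shows how a pair such as 13,14 groups to the left while 4,3 groups to the right.

```python
# (left, right) binding powers copied from the table above for a few infix operators.
BINDING_POWER = {
    "=": (4, 3),
    "+": (13, 14),
    "-": (13, 14),
    "*": (15, 16),
    "**": (17, 18),
}

def parse(tokens, min_bp=0):
    """Consume tokens from the front and return a fully parenthesised string."""
    lhs = tokens.pop(0)  # operands are plain names or numbers in this sketch
    while tokens:
        op = tokens[0]
        left_bp, right_bp = BINDING_POWER[op]
        if left_bp < min_bp:
            break  # the operator binds less tightly than the caller requires
        tokens.pop(0)
        rhs = parse(tokens, right_bp)
        lhs = f"({lhs} {op} {rhs})"
    return lhs

print(parse("1 + 2 * 3".split()))  # (1 + (2 * 3))
print(parse("1 - 2 - 3".split()))  # ((1 - 2) - 3)
print(parse("a = b = 5".split()))  # (a = (b = 5))
```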

Comment

A comment starts with -- and lasts until the end of the current line.

Syntax
Comment:
\-\- [^\n]*

Expressions

All expressions in Diatom have a return value.

Syntax
Expression:
BlockExpression
| CaseExpression
| ConstantExpression
| IfExpression
| LambdaExpression
| OperatorExpression
| RequireExpression

Constant values

Constant expressions are literal values or built-in data structures. Diatom has dictionary, set, tuple and list as built-in data structures.

Syntax
ConstantExpression:
<number> | <string> | true | false | \( \)
| \{ <expression>* }
| \{ : \} | \{ (<expression>:<expression>)* \}
| \[ (<expression>)* \]
| <identifier>

Examples

0x123 -- numerical literal

"abc" -- string literal

true false -- boolean value

() -- unit type value
println$(())

-- list
println$([1, 2, 3], [])

-- Table
println$({key1 = 1, key2 = 'key'}, {})

-- tuple (must contain at least two elements)
t1 = (1, 3, '?')
t2 = (1.2, (4, 5)) -- use parentheses for precedence
println$(t1, t2)

Operator Expressions

For operators and precedences, see Operators.

Syntax
OperatorExpression:
<prefix> <expression>
| <expression> <infix> <expression>
| <expression> <postfix>

Examples

a = 12345

println$(
    a // 3,
    a > 12344,
    a + 5,
    a ** 4,
    a - -2,
    a.float$(),
    a / Float::INF
)

If Expression

Conditional code execution.

Syntax
IfExpression:
if <expression> then (<statement>)*
(elsif <expression> then (<statement>)*)*
(else (<statement>)*)?
end

Examples

def test x =
    if x > 0 then
        println$('positive')
    elsif x == 0 then 
        println$('zero')
    else
        println$('negative')
    end
end

test$(1)
test$(0)
test$(-1)

Block Expression

Multiple statements can be grouped as a block. A block also has its own scope.

Syntax
BlockExpression:
begin ( <statement> )* end

Examples

begin
    a = 1
    if a > 0 then
        return 0
    else 
        return 1
    end
end

Lambda Expression

A lambda expression is an anonymous function.

Syntax
LambdaExpression:
fn ( <identifier> )* = <expression>

Examples:

-- lambda takes a single expression
add3 = 
    fn a b c = a + b + c
println$( add3$(1, 2, 3) )

-- To use multiple expression/statements
-- Group them in a block expression
(fn x = begin
    loop 
        if x == 0 then
            break
        else 
            x = x//2
        end
    end
    println$('loop finished')
end) $(Int::MAX) -- Directly call lambda

Require Expression (Not Implemented)

Diatom can import another file with a require expression. The expression returns a module containing all variables defined by executing that file. It does not introduce any variables into the global scope.

Syntax
RequireExpression:
require <string literal>

Note that the string literal must have the form a.b.c or a. It may only contain characters from [a-zA-Z0-9_\\-\\.].

Examples

util = require "myModule.util"
lib = require "lib"

Statements

Statements in Diatom always return () (the unit type).

Syntax
Statement:
Expression
| DataStatement
| FunctionStatement
| LoopStatement
| FlowStatement

Loop Statement

There are three types of loops in diatom:

  • loop: infinite loop
  • until: loop until a break condition is met
  • for: loop through an Iterator

Syntax
LoopStatement:
loop <statement>* end
| until <expression> do <statement>* end
| for <expression> in <expression> do
<statement>*
end

Examples

a = 0
loop 
    if a > 5 then
        break
    else 
        a = a + 1
    end
end
println$(a)

a = 0
until a > 5 do
    a = a + 1
end
println$(a)

a = 0
for _ in 0..6 do
    a = a + 1
end
println$(a)

Function Statement

A function statement declares a function and binds its name to that function in the current scope.

Syntax
FunctionStatement:
def <identifier> <identifier>* =
<statement>*
end

Examples

def add3 a b c =
    a + b + c 
end

println$( add3$(1, 2, 3) )

Flow Control Statement

These statements alter the execution flow in a function or a lambda expression.

Syntax
FlowStatement:
break
| continue
| return <expression>?

Examples

def f a = 
    if a > 3 then
        return true -- return a value
    end
    return -- Return nothing (Unit)
end

println$( f$(5), f$(-1) )

for i in 1..10 do
    if i < 5 then
        print$(i, '')
        continue
    else 
        println$('Break: i =', i)
        break
    end
end

Table & Meta Table

This chapter is about syntax related to table and meta table.

Member Resolution

The operators . and :: are used to access members of a table.
The Diatom interpreter looks up the table's own entries first; if that fails, it looks up the entries of the table's meta table.

meta_table = {meta_key = 2}
-- Create a table with meta table
table = {key1 = 0, key2 = 1} <- meta_table

println$('table.key1 =', table.key1)
println$('table.meta_key =', table.meta_key)

However, assignment to a table key is NOT resolved through the meta table. Instead, it creates a new key inside the table itself.

meta_table = {meta_key = 2}
-- Create a table with meta table
table = {key1 = 0, key2 = 1} <- meta_table

table.meta_key = 3

println$('table.meta_key =', table.meta_key)
println$('meta_table.meta_key =', meta_table.meta_key)
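
The read and write rules can be modelled in a few lines of Python. This is a sketch under stated assumptions, not Diatom's implementation: the Table class and its get and set methods are hypothetical names, and only a single level of meta table is modelled because the text does not say whether meta tables chain.

```python
class Table:
    """Hypothetical model of a Diatom table with an optional meta table."""

    def __init__(self, entries, meta=None):
        self.entries = dict(entries)
        self.meta = meta  # fixed at creation, mirroring the `<-` operator

    def get(self, key):
        # Reads try the table's own entries first, then the meta table.
        if key in self.entries:
            return self.entries[key]
        if self.meta is not None and key in self.meta.entries:
            return self.meta.entries[key]
        raise KeyError(key)

    def set(self, key, value):
        # Writes never reach the meta table; they create or update a local key.
        self.entries[key] = value

meta_table = Table({"meta_key": 2})
table = Table({"key1": 0, "key2": 1}, meta=meta_table)

print(table.get("meta_key"))       # 2, found through the meta table
table.set("meta_key", 3)
print(table.get("meta_key"))       # 3, now a key of `table` itself
print(meta_table.get("meta_key"))  # 2, the meta table is unchanged
```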

Meta Table

A meta table is just a normal table that can contain any keys.

A meta table is used to abstract common behavior shared by multiple tables. It can ONLY be set when a table is created. After that, you cannot set a new meta table or retrieve the current one.

Modifying a meta table through a table is also limited, because assignments to table keys are not resolved through the meta table.

This example will NOT work.

table = {}

meta = {
    my_method = 
        fn x = x + 1
}

table <- meta

Method Call

If an entry of a table or of its meta table is a closure, the closure can be called as a method (member function).

A method call implicitly passes the caller itself as the first argument.

def my_add self x =
    self.value + x
end

MyInt = { 
    my_add = my_add
}

my_int = { value = 1, my_add2 = my_add } <- MyInt
println$( my_int.my_add$(10) )
println$( my_int.my_add2$(10) )

Sometimes you may want to call a static method without passing the caller, in which case you can use the :: operator instead of the . operator. It is also recommended to access static values with ::, as in Int::MAX, rather than Int.MAX.

def my_add self x =
    self.value + x
end

MyInt = { 
    my_add = my_add
}

my_int = { value = 1, my_add2 = my_add } <- MyInt
println$( MyInt::my_add$( my_int, 10 ) )
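
The difference between the two call forms can be sketched in Python with plain dictionaries standing in for tables. The helper names method_call and static_call are hypothetical and exist only for this illustration; the lookup order follows the Member Resolution section.

```python
def my_add(self, x):
    return self["value"] + x

MyInt = {"my_add": my_add}                # plays the role of the meta table
my_int = {"value": 1, "my_add2": my_add}  # table created with `<- MyInt`

def method_call(table, meta, name, *args):
    """Model of table.name$(args): the caller is passed implicitly as the first argument."""
    func = table[name] if name in table else meta[name]
    return func(table, *args)

def static_call(table, name, *args):
    """Model of table::name$(args): only the explicit arguments are passed."""
    return table[name](*args)

print(method_call(my_int, MyInt, "my_add", 10))   # 11
print(method_call(my_int, MyInt, "my_add2", 10))  # 11
print(static_call(MyInt, "my_add", my_int, 10))   # 11
```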

Names

This chapter is about diatom's namespace and name resolution.

Scope

Diatom uses lexical scope. That means all names are resolved according to their lexical context.

Variable Declaration

Diatom does not require variables to be declared before use. A variable is automatically declared the first time a value is assigned to it.

Variable Shadowing

Diatom does not allow a local variable to shadow an outer one. A name always resolves to the outer variable if one is available. This also applies to function parameters: they must not have the same name as an outer variable.

For example,

a = 1
def f x =
    a = x
end
f$(2)

println$('a =', a)

a will have value 2 after execution.

Local Scope

The following constructs introduce a local scope:

  • Block: begin ... end
  • Function: def ... end
  • Lambda: fn
  • Loop/For: loop ... end, for ... end, until ... do ... end
  • If: if ... then ... else ... end

A variable declared in a local scope is not available in outer scopes.

The following code will NOT work.

begin
    a = 10
end
println$(a)

Binding and Closure

Assignment Rules

All primitive types in Diatom are unboxed and are passed by value. This includes:

  • Unit
  • Int
  • Float
  • Bool
  • String

i = 1
alias = i
alias = 2

println$('i =', i)
println$('alias =', alias)

Lists, tuples and tables are stored on the heap, so these values are passed by reference. External functions and closures also follow this rule.

l = [1, 2]
alias = l

alias.append$(3)

println$('l =', l)
println$('alias =', alias)

Closure

A closure captures variables not by value but by directly aliasing the variable. The captures remain valid even after the variables go out of scope.

-- declare variable x1 & x2
x1 = () x2 = ()
f = 
    fn = begin
        a = 1
        -- x1 captures `a`
        x1 = 
            fn = a
        -- x2 also captures `a`
        x2 = 
            fn x = begin 
                a = x
            end
    end
-- initialize x1 & x2 
f$()

-- modify `a` (declared in `f`), which has
-- already gone out of scope, via `x2`
x2$(10)

-- `a` captured by x1 is also modified
println$(x1$())
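
Python closures capture variables in a comparable way, so the same pattern can be sketched there. This is an analogy, not Diatom code: the inner function names read and write are hypothetical, and Python needs an explicit nonlocal declaration where Diatom resolves the assignment to the outer variable automatically.

```python
def f():
    a = 1

    def read():
        return a    # reads the captured variable, not a copy of it

    def write(x):
        nonlocal a  # rebind the captured variable itself
        a = x

    return read, write

x1, x2 = f()  # `a` is no longer reachable by name from here
x2(10)
print(x1())   # 10
```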

Module (Unstable Feature)

Diatom can load external diatom source code files as modules.

Search Path

The search path is the set of paths in which the Diatom compiler looks for modules. There are several ways to add a search path.

  • Relative path: When parsing a file, the file's directory is automatically added to the search path and takes the highest priority.
  • Other search paths: Search paths can also be specified by the host.
  • Console: The Diatom console adds the current working directory to the search path.

Resolve Rule

A module string such as "a.b.c" is resolved to <search path>/a/b/c.dm or <search path>/a/b/c/mod.dm. Diatom searches the search paths in priority order and returns as soon as the first matching module is found.

Diatom executes a module only once and uses the cached value when the same module is loaded again.
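
The resolve rule can be illustrated by listing the files that would be tried for a module string. The Python sketch below is hypothetical: candidate_paths is not a Diatom API, and trying c.dm before c/mod.dm within a single search path is an assumption, since the text only fixes the order between search paths.

```python
from pathlib import PurePosixPath

def candidate_paths(module: str, search_paths: list[str]) -> list[str]:
    """List the files that would be tried, highest-priority search path first."""
    parts = module.split(".")
    candidates = []
    for root in search_paths:
        base = PurePosixPath(root).joinpath(*parts)
        # Assumed order inside one search path: the file form, then the directory form.
        candidates.append(str(base.with_name(base.name + ".dm")))
        candidates.append(str(base / "mod.dm"))
    return candidates

for path in candidate_paths("a.b.c", ["/project/src", "/opt/diatom/lib"]):
    print(path)
```

The two directory names passed in the usage line are placeholders chosen for the example.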

Access Module Members

A require expression returns a module. All variables declared in the module file become members of the module and can be accessed via the . operator.

-- lib.dm
version = "0.1.1"

def sum x y z =
    x + y + z
end

-- main.dm
lib = require "lib"
lib.version
lib.sum$( 1 2 3 )

Standard Library

This part is about diatom's standard library.

Built-in Methods

Built-in methods that are implemented by the interpreter.

println

Print anything to the output buffer (usually managed by the host program). It accepts any number and any kind of arguments. It will NOT run into an infinite loop if the data to be printed contains cycles.

If the output buffer cannot be written (for example, because a TCP stream is closed), it will panic.

Example

table = { ref = () }
table.ref = table

println$(table, 1, "asm", false, 1.234, ())

print

Same as println but does not print a new-line character at the end.

assert

Assert an expression.

Accepts a single Bool argument. If the value is false, it panics.

Example

a = (1..).take$(6).collect$()
assert$(a.len$() == 6)

panic

Immediately panic. Accepts nothing or a string as its argument.

Example

panic$()
panic$('look at me')

unreachable

Causes a panic if executed. Useful for marking unreachable code.

def my_func =
    if false then
        unreachable$()
    end
end

my_func$()
println$('success')

todo

Causes a panic if executed. Useful for marking something that is not completed yet.

def my_func =
    todo$()
end

my_func$()

Unit (primitive type)

A tuple with 0 items, has only one possible value (). This value is implicitly returned if no return value is provided.

Type

() :: ()

Examples

def ret_unit = end

println$( ret_unit$() )

value = begin
    -- statement always return ()
    loop 
        break
    end
end

println$(value)

Int (primitive type)

A platform-independent 64-bit signed integer type. Arithmetic is guaranteed to wrap if overflow happens.

Meta table

Int type's meta table is stored in variable named Int.

Examples

println$( Int::MAX )
println$( Int::abs$(-10) )

Int::MAX

Maximum possible value for Int type.

Type

MAX :: Int

Examples

println$( Int::MAX )

Int::MIN

Minimum possible value for Int type.

Type

MIN :: Int

Examples

println$( Int::MIN )

Int::abs

Calculate absolute value of an integer.
Note that the absolute value of Int::MIN always overflows, so the result wraps and Int::MIN is returned.

Type

abs :: Int -> Int

Examples

println$( (-10).abs$() )
println$( Int::MIN.abs$() )
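
Why the absolute value of Int::MIN comes back as Int::MIN can be seen by modelling 64-bit two's-complement wrapping. The Python sketch below is only a model of that arithmetic: wrap_i64 and wrapping_abs are hypothetical helper names, and Python's own integers are unbounded, which is why the wrap has to be applied by hand.

```python
INT_MIN = -(2 ** 63)
INT_MAX = 2 ** 63 - 1

def wrap_i64(value: int) -> int:
    """Reduce an unbounded Python int to a two's-complement 64-bit value."""
    return (value + 2 ** 63) % 2 ** 64 - 2 ** 63

def wrapping_abs(value: int) -> int:
    """Absolute value followed by a 64-bit wrap."""
    return wrap_i64(abs(value))

print(wrapping_abs(-10))      # 10
print(wrapping_abs(INT_MIN))  # -9223372036854775808
print(wrap_i64(INT_MAX + 1))  # -9223372036854775808
```

The true absolute value of the minimum is one larger than the maximum, so it wraps around to the minimum again.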

Int::float

Convert this integer to a Float type.

Type

float :: Int -> Float

Examples

println$( 100.float$() )

Float (primitive type)

A platform-independent 64-bit floating-point type.

Meta table

Float type's meta table is stored in variable named Float.

Examples

println$( Float::INF )
println$( Float::abs$(-10.0) )

Float::MAX

Maximum possible value for a Float.

Type

MAX :: Float

Examples

println$( Float::MAX )

Float::MIN

Minimum possible value for a Float.

Type

MIN :: Float

Examples

println$( Float::MIN )

Float::INF

Infinity (∞).

Type

INF :: Float

Examples

println$( Float::INF )

Float::NEG_INF

Negative infinity (−∞).

Type

NEG_INF :: Float

Examples

println$( Float::NEG_INF )

Float::NAN

Not a Number (NaN).

The value of this depends on Rust version and platform.

Description from Rust's std
Note that IEEE 754 doesn’t define just a single NaN value; a plethora of bit patterns are considered to be NaN. Furthermore, the standard makes a difference between a “signaling” and a “quiet” NaN, and allows inspecting its “payload” (the unspecified bits in the bit pattern). This constant isn’t guaranteed to equal to any specific NaN bitpattern, and the stability of its representation over Rust versions and target platforms isn’t guaranteed.

Type

NAN :: Float

Examples

println$( Float::NAN )
println$( Float::NAN.is_nan$() )

Float::floor

Calculate the largest integer less than or equal to a Float.

Type

floor :: Float -> Float

Examples

println$( (-1.23).floor$() )
println$( 1.23.floor$() )

Float::ceil

Calculate the smallest integer greater than or equal to a Float.

Type

ceil :: Float -> Float

Examples

println$( (-1.23).ceil$() )
println$( 1.23.ceil$() )

Float::round

Calculate the nearest integer to self. Round half-way cases away from 0.0.

Type

round :: Float -> Float

Examples

println$( (-1.23).round$() )
println$( 1.23.round$() )
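
The tie-breaking rule matters only for values exactly half-way between two integers. The Python sketch below models rounding half away from zero and contrasts it with Python's built-in round, which sends ties to the nearest even integer. The function name round_half_away is hypothetical and the code is a model, not Diatom's implementation.

```python
import math

def round_half_away(x: float) -> float:
    """Round to the nearest integer; ties go away from zero."""
    if math.isnan(x) or math.isinf(x):
        return x
    whole = math.trunc(x)
    if abs(x - whole) >= 0.5:
        whole += 1 if x > 0 else -1
    return float(whole)

print(round_half_away(1.23), round_half_away(-1.23))  # 1.0 -1.0
print(round_half_away(2.5), round_half_away(-2.5))    # 3.0 -3.0
print(round(2.5), round(-2.5))                        # 2 -2 (Python rounds ties to even)
```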

Float::int

Cast a Float to Int type by truncating.

Type

int :: Float -> Int

Examples

println$( 1.7.int$() )
println$( 1.2.int$() )
println$( (-1.7).int$() )
println$( (-1.2).int$() )
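
Truncation differs from floor only for negative values. The short Python comparison below uses the same four inputs as the examples above; math.trunc stands in for the truncating cast and is an analogy, not the Diatom implementation.

```python
import math

values = [1.7, 1.2, -1.7, -1.2]
truncated = [math.trunc(v) for v in values]  # rounds toward zero
floored = [math.floor(v) for v in values]    # rounds toward negative infinity

print(truncated)  # [1, 1, -1, -1]
print(floored)    # [1, 1, -2, -2]
```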

Float::abs

Calculate absolute value of a Float.

Type

abs :: Float -> Float

Examples

println$( (-1.23).abs$() )

Float::is_nan

Test if a Float is NaN.

Type

is_nan :: Float -> Bool

Examples

println$( 0.1235.is_nan$() )
println$( Float::NAN.is_nan$() )

Float::is_inf

Test if a Float is positive or negative infinity.

Type

is_inf :: Float -> Bool

Examples

println$( Float::INF.is_inf$() )
println$( Float::NEG_INF.is_inf$() )
println$( 1.2345.is_inf$() )

Bool (primitive type)

Boolean value, either true or false. The conditions of if and until must evaluate to a Bool value.

Type

Bool :: Bool

Examples

This example will NOT work.

if 0 then
    println$(' 0 is not false ')
end

Tuple

A tuple of items, useful for returning multiple values or as an anonymous table.
Access tuple items with the . operator and an Int literal.

Examples

def multi_ret x y =
    (x, y)
end

tuple = multi_ret$(1, 3)
println$( tuple )
println$( tuple.0 )

tuple.0 = 'changed'
println$( tuple.0 )

List

A dynamic array that can contain any items.

-- Create an empty list
list = []
println$(list)

-- List with a single item
list = [1]
println$(list)

-- List with multiple items
list = [1, 'string', false]
println$(list)

Type

List :: [a]

Meta Table

List type's meta table is stored in variable named List.

l = [1, 2, 3]
List::append$(l, 4)
println$(l)

Index

A list can be indexed by an Int. Both positive and negative indices work.

l = [1, 2, 3, 4]
println$( l$[0], l$[3] )
println$( l$[-4], l$[-1] )
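
The examples suggest that a negative index counts from the end of the list, so -1 is the last item. A minimal Python sketch of that mapping follows; normalize_index is a hypothetical helper, and raising an error for an out-of-range index is an assumption, since the text does not say what Diatom does in that case.

```python
def normalize_index(index: int, length: int) -> int:
    """Map an index in [-length, length) to a position in [0, length)."""
    if not -length <= index < length:
        raise IndexError(f"index {index} out of range for length {length}")
    return index + length if index < 0 else index

l = [1, 2, 3, 4]
print(l[normalize_index(0, len(l))], l[normalize_index(3, len(l))])    # 1 4
print(l[normalize_index(-4, len(l))], l[normalize_index(-1, len(l))])  # 1 4
```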

Iterate

A list can be used in a for loop or as an iterator.

-- for loop
for i in [1, 2, 3] do
    print$(i, '')
end
println$('\n')

-- get an iterator
list = [1, 2, 3]
iterator = list.iter$()
println$( iterator.__next$() )
println$( iterator.__next$() )
println$( iterator.__next$() )
println$( iterator.__next$() is Option::None )
println$(list)

List::len

Return the length of a list.

Type

len :: [a] -> Int

Examples

l = [1, 2, 3]
println$( l.len$() )

List::clear

Remove all items from a list and return self.

Type

clear :: [a] -> [a]

Examples

l = [1, 2, 3].clear$()
println$(l)

List::reverse

Reverse a list and return self.

Type

reverse :: [a] -> [a]

Examples

l = [1, 2, 3].reverse$()
println$(l)

List::append

Append a new item and return self.

Type

append :: [a] -> a -> [a]

Examples

l = [1, 2, 3].append$(4)
println$(l)

List::insert

Insert an item at a given index.

Type

insert :: [a] -> Int -> a -> [a]

Examples

l = [1, 2, 3].insert$(-2, 5)
println$(l)

List::remove

Remove the item at a given index and return self.

Type

remove :: [a] -> Int -> [a]

Examples

l = [1, 2, 3]
println$( l.remove$(2) )

List::iter

Get an iterator of this list.

Type

iter :: [a] -> Iter a

Examples

iter = [1, 2, 3].iter$()
println$( iter.__next$().value )

Iter

This is the standard library's iterator API. It can be set as the meta table of any iterator that implements __next, which returns an Option. It is accessed via the variable Iter.
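
The idea of building every combinator on a single __next-style method can be modelled in Python. Everything in this sketch is hypothetical: the Some class and NONE marker stand in for Option, ListIter stands in for a list iterator, and collect and take only imitate the behavior described for Iter::collect and Iter::take below.

```python
class Some:
    """Stand-in for an Option holding a value."""
    def __init__(self, value):
        self.value = value

NONE = None  # stand-in for the empty Option

class ListIter:
    """Hypothetical iterator over a Python list with a __next-style method."""
    def __init__(self, items):
        self.items = items
        self.position = 0

    def next(self):
        if self.position >= len(self.items):
            return NONE
        item = self.items[self.position]
        self.position += 1
        return Some(item)

def collect(iterator):
    """Drain the iterator into a list."""
    result = []
    while (option := iterator.next()) is not NONE:
        result.append(option.value)
    return result

def take(iterator, n):
    """Wrap an iterator so that it yields at most n items."""
    class Take:
        def __init__(self):
            self.remaining = n

        def next(self):
            if self.remaining <= 0:
                return NONE
            self.remaining -= 1
            return iterator.next()

    return Take()

print(collect(take(ListIter([1, 2, 3, 4]), 2)))  # [1, 2]
```

Because take only forwards calls to the wrapped iterator, no element is produced until collect asks for it.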

Iter::all

Test if all elements satisfy a predicate. Evaluation stops after the first false is found.

Type

all :: Iter a -> (a -> Bool) -> Bool

Iter::any

Test if any element satisfies a predicate. Evaluation stops after the first true is found.

Type

any :: Iter a -> (a -> Bool) -> Bool

Iter::collect

Collect all items into a list.

Type

collect :: Iter a -> [a]

Iter::count

Count all elements in an iterator.

Type

count :: Iter a -> Int

Iter::sum

Calculate sum over all elements. If there is no element, 0 is returned.

Type

sum :: Iter a -> a

Iter::max

Calculate maximum element of an iterator. Panic if there is no element.

Type

max :: Iter a -> a

Iter::min

Calculate minimum element of an iterator. Panic if there is no element.

Type

min :: Iter a -> a

Iter::reduce

Reduce all elements by a given function. Panic if there is no element available.

Type

reduce :: Iter a -> (a -> a -> a) -> a

Iter::for_each

Apply a function to every element.

Type

for_each :: Iter a -> (a -> ()) -> ()

Iter::fold

Fold an iterator with an initial value and a given function.

Type

fold :: Iter a -> b -> (b -> a -> b) -> b
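
The type signatures show the difference between reduce and fold: reduce seeds the accumulator with the first element, so the result has the element type, while fold takes an explicit initial value that may have a different type. Python's functools.reduce covers both cases and can serve as a model; the wrapper names iter_reduce and iter_fold are hypothetical.

```python
from functools import reduce

def iter_reduce(items, func):
    """Model of reduce: the first element seeds the accumulator."""
    return reduce(func, items)  # raises TypeError on an empty sequence

def iter_fold(items, initial, func):
    """Model of fold: an explicit initial value seeds the accumulator."""
    return reduce(func, items, initial)

print(iter_reduce([1, 2, 3], lambda acc, x: acc + x))         # 6
print(iter_fold([1, 2, 3], "", lambda acc, x: acc + str(x)))  # 123
print(iter_fold([], 0, lambda acc, x: acc + x))               # 0
```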

Iter::map

Map a function over an iterator.

Type

map :: Iter a -> (a -> b) -> Iter b

Iter::filter

Filter an iterator by a predicate.

Type

filter :: Iter a -> (a -> Bool) -> Iter a

Iter::skip

Skip the first n elements of an iterator.

Type

skip :: Iter a -> Int -> Iter a

Iter::take

Take n elements from iterator.

Type

take :: Iter a -> Int -> Iter a

Iter::zip

Zip two iterators.

Type

zip :: Iter a -> Iter b -> Iter (a, b)

Iter::step_by

Skip some elements by a fixed step.

Type

step_by :: Iter a -> Int -> Iter a

Iter::take_until

Take elements until the predicate is no longer satisfied.

Type

take_until :: Iter a -> (a -> Bool) -> Iter a

Iter::enum

Enumerate an iterator from 0.

Type

enum :: Iter a -> Iter (Int, a)

Gc

Control the garbage collector. Can be accessed via variable Gc.

Gc::collect

Immediately start collecting garbage. Accepts no arguments.

Gc::pause

Pause the garbage collector. Accepts no arguments.

Gc::resume

Resume the garbage collector. Accepts no arguments.