The Diatom Reference
This book is the reference for the Diatom Programming Language.
About
Diatom is an embedded scripting language that aims to make scripting easy, secure and maintainable without compromising on speed.
Try Diatom
Use our online playground to try Diatom without installing anything.
Get Diatom
Diatom is available at crates.io.
The latest API documentation is available here.
Grammar
This chapter is about the grammar of diatom.
General Grammar Rules
Diatom's grammar is designed with the following concepts in mind:
- No delimiter: Statements and expressions do not require a newline or ; to be separated from each other. That means a = 1 b = 2 is a perfectly fine line of code.
- Fewer punctuation marks: Diatom significantly reduces the need for redundant punctuation such as () and {} pairs. Blocks use keywords instead of {}, and tuples do not need to be surrounded by (). Writing a,b is enough to make a tuple. [1 2 3] is a valid list in Diatom; no comma is required or allowed.
- Format-independent syntax: Diatom does not use so-called significant whitespace in its grammar. You are free to format your code without changing its meaning. To prevent unintended behavior, most of the grammar requires some kind of terminator, as in if ... then ... end, so users will not accidentally write to the wrong scope.
- Expression based: Every expression (including a block) has a return value. Putting an expression at the end of a block automatically returns its value. For example, fn x = x + 1 will just return x + 1; no return is needed.
Tokens
Tokens are the result of lexical analysis performed by the compiler.
Identifier
Syntax
Identifier: [_a-zA-Z][_0-9a-zA-Z]*
Note: Unicode identifiers are allowed but discouraged.
A valid identifier is a sequence of characters that satisfies the following conditions:
- The sequence must not start with '0'..'9'.
- The sequence must not contain whitespace characters, namely ' ', '\r', '\n', '\t'.
- The sequence must not contain any ASCII symbols like ?#$@%...
- Unicode characters may be used. For example, 🐱🐶 is a valid identifier.
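The four conditions above can be checked mechanically. The following is a minimal Python sketch, not Diatom's lexer: the name is_identifier is hypothetical, and two details are assumptions, namely that the empty string is rejected and that the underscore is exempt from the symbol rule (as the Syntax line suggests). It does not check for keywords.

```python
import string

# ASCII symbols are rejected; the underscore is assumed to be the exception.
FORBIDDEN_SYMBOLS = set(string.punctuation) - {"_"}
WHITESPACE = {" ", "\r", "\n", "\t"}


def is_identifier(text: str) -> bool:
    """Return True if `text` satisfies the four identifier conditions."""
    if not text or text[0] in "0123456789":
        return False  # empty (assumption) or starts with a digit
    # No whitespace and no ASCII symbols anywhere in the sequence.
    return not any(ch in WHITESPACE or ch in FORBIDDEN_SYMBOLS for ch in text)


print(is_identifier("_a1"), is_identifier("1a"), is_identifier("🐱🐶"))
```

The check follows the prose rules rather than the ASCII-only pattern, which is why the Unicode example is accepted.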
Keywords
Syntax
Keyword:
"until"
| "end"
| "if"
| "then"
| "else"
| "elsif"
| "in"
| "for"
| "do"
| "return"
| "break"
| "continue"
| "loop"
| "def"
| "fn"
| "begin"
| "and"
| "or"
| "not"
| "is"
| "require"
Numeric Literal
Syntax
Number:
( [ 0 ] [ Xx ] [ _0-9a-fA-F ]+)
|( [ 0 ] [ Bb ] [ _0-1 ]+)
|( [ 0 ] [ Oo ] [ _0-7 ]+)
|( [ 0-9 ] [ _0-9 ]* ( \.[ _0-9 ]+ ){0,1} ( [ Ee ] [ \+\- ]{0,1} [ 0-9_ ]* ){0, 1} )
Literals* | Example | Exponentiation |
---|---|---|
Decimal integer | 123_456 | N/A |
Hex integer | 0xff1 | N/A |
Octal integer | 0o72 | N/A |
Binary integer | 0b1111_0100 | N/A |
Floating point | 12.23e-5 | Optional |
*: All number literals allow _ as a visual separator, for example 1_234.0E+18.
*: Prefixes, hex digits and the exponent marker are case-insensitive.
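As an illustration of the Number grammar, here is a small Python sketch that validates a literal against a direct transcription of the four alternatives and converts it to a value. The names NUMBER and parse_number are made up for this example; it models only the literal syntax, not how Diatom's compiler actually lexes numbers.

```python
import re

# Transcription of the four alternatives: hex, binary, octal, decimal/float.
NUMBER = re.compile(
    r"0[Xx][_0-9a-fA-F]+"
    r"|0[Bb][_01]+"
    r"|0[Oo][_0-7]+"
    r"|[0-9][_0-9]*(\.[_0-9]+)?([Ee][+-]?[0-9_]*)?"
)


def parse_number(literal: str):
    """Convert a numeric literal to a Python int or float."""
    if NUMBER.fullmatch(literal) is None:
        raise ValueError(f"not a numeric literal: {literal!r}")
    digits = literal.replace("_", "")  # "_" is only a visual separator
    prefix = digits[:2].lower()
    if prefix == "0x":
        return int(digits[2:], 16)
    if prefix == "0b":
        return int(digits[2:], 2)
    if prefix == "0o":
        return int(digits[2:], 8)
    if "." in digits or "e" in digits.lower():
        return float(digits)
    return int(digits)


print(parse_number("0xff1"), parse_number("0b1111_0100"), parse_number("12.23e-5"))
```

Degenerate inputs that the pattern admits, such as 0x_ or 1e, still raise ValueError here because the conversion step finds no digits.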
String Literal
String literals are any characters surrounded by a pair of " or '. Any UTF-8 characters may be included. To write a special character, use an escape sequence introduced by \. All escape sequences are listed below.
Syntax
String:
' ( [^\\] | EscapeSequence | \\' )* '
| " ( [^\\] | EscapeSequence | \\" )* "
EscapeSequence:
\\x [0-7]{2}
| \\u [0-9a-fA-F]{4}
| \\U [0-9a-fA-F]{8}
| \\r | \\n | \\t | \\
Escapes | Character |
---|---|
\\ | Backslash |
\r | Carriage return |
\n | Newline |
\t | Tab |
\x41 | 7-bit character code (exactly 2 hex digits*) |
\u00E9 | 16-bit Unicode escape (exactly 4 hex digits*) |
\U000000E9 | 32-bit Unicode escape (exactly 8 hex digits*) |
*: All hex digits ignore case.
Operators
Syntax
Operator:
<singular operator>
| $ \( <expression>* \)
| $ \[ <expression>* \]
| $ \{ (<expression>*) | (<identifier> : <expression>)* \}
Here is a list of all operators.
Operator | Meaning | Example | Precedence (left and right) |
---|---|---|---|
. | access member | [1,2].len$() | 23,24 |
:: | access member | List::len$([1,2]) | 23,24 |
<- | set meta-table | {} <- MetaTable | 21,22 |
not | logic not | not false | 19 (prefix) |
$() | function call | f$() | 20 (postfix) |
$[] | index array | list$[0] | 20 (postfix) |
${} | data constructor | Just${1} | 20 (postfix) |
** | power | 2**3 | 17,18 |
* | multiply | 2*3 | 15,16 |
// | integer division | 7//3 | 15,16 |
/ | division | 7/3 | 15,16 |
% | modulo | 5%2 | 15,16 |
+ | addition | 1+2 | 13,14 |
- | subtraction or negation | 1-2, -2 | 13,14 (infix), 19 (prefix) |
> | greater than | 5>4 | 11,12 |
>= | greater than or equal to | 4>=3 | 11,12 |
< | less than | 2<6 | 11,12 |
<= | less than or equal to | 5<=5 | 11,12 |
== | equal to | 3==3 | 11,12 |
is | compare reference | {} is [] | 11,12 |
<> | not equal to | 5<>4 | 11,12 |
and | logic and | true and false | 9, 10 |
or | logic or | true or false | 7,8 |
.. | range | 1..10 | 5,6 |
, | create a tuple | 1,2,3 | 1,2 |
= | assign value to variable | a=5 | 4,3 |
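One way to read the Precedence column is as a pair of left and right binding powers for a Pratt (precedence-climbing) parser. The Python sketch below illustrates that reading and is not Diatom's actual parser: it copies a subset of the numbers from the table, and the names INFIX, PREFIX, tokenize and parse, as well as the tuple-based tree, are invented for the example.

```python
import re

# (left, right) binding powers copied from a subset of the operator table.
INFIX = {
    "**": (17, 18), "*": (15, 16), "//": (15, 16), "/": (15, 16),
    "%": (15, 16), "+": (13, 14), "-": (13, 14), "=": (4, 3),
}
PREFIX = {"-": 19}

TOKEN = re.compile(r"\s*(\*\*|//|[-+*/%=()]|[0-9]+|[_a-zA-Z][_0-9a-zA-Z]*)")


def tokenize(source):
    """Split source text into operator, number and identifier tokens."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens, min_power=0):
    """Consume tokens from the front and return a nested tuple tree."""
    token = tokens.pop(0)
    if token == "(":
        left = parse(tokens, 0)
        tokens.pop(0)  # discard the closing ")"
    elif token in PREFIX:
        left = ("neg", parse(tokens, PREFIX[token]))
    else:
        left = token  # a number or identifier
    while tokens and tokens[0] in INFIX:
        left_power, right_power = INFIX[tokens[0]]
        if left_power < min_power:
            break  # the operator binds too weakly for this sub-expression
        operator = tokens.pop(0)
        left = (operator, left, parse(tokens, right_power))
    return left


print(parse(tokenize("1 - 2 - 3")))
print(parse(tokenize("a = b = 1")))
```

Under this reading, an operator whose left number is smaller than its right one (such as - with 13,14) groups to the left, while = with 4,3 groups to the right, so a = b = 1 parses as a = (b = 1).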
Comment
A comment starts with -- and lasts until the end of the current line.
Syntax
Comment:
\-\- [^\n]*
Expressions
All expressions in Diatom have a return value.
Syntax
Expression:
BlockExpression
| CaseExpression
| ConstantExpression
| IfExpression
| LambdaExpression
| OperatorExpression
| RequireExpression
Constant values
Constant values or built-in data structures. Diatom has dictionary, set, tuple and list as built-in data structures.
Syntax
ConstantExpression:
<number>
| <string>
| true | false | \( \)
| \{ <expression>* \}
| \{ : \} | \{ (<expression> : <expression>)* \}
| \[ (<expression>)* \]
| <identifier>
Examples
0x123 -- numerical literal
"abc" -- string literal
true false -- boolean value
() -- unit type value
println$(())
-- list
println$([1, 2, 3], [])
-- Table
println$({key1 = 1, key2 = 'key'}, {})
-- tuple (must contain at least two elements)
t1 = (1, 3, '?')
t2 = (1.2, (4, 5)) -- use parentheses for precedence
println$(t1, t2)
Operator Expressions
For operators and precedences, see Operators.
Syntax
OperatorExpression:
<prefix> <expression>
| <expression> <infix> <expression>
| <expression> <postfix>
Examples
a = 12345
println$(
a // 3,
a > 12344,
a + 5,
a ** 4,
a - -2,
a.float$(),
a / Float::INF
)
If Expression
Conditional code execution.
Syntax
IfExpression:
if <expression> then (<statement>)*
(elsif <expression> then (<statement>)*)*
(else (<statement>)*)?
end
Examples
def test x =
if x > 0 then
println$('positive')
elsif x == 0 then
println$('zero')
else
println$('negative')
end
end
test$(1)
test$(0)
test$(-1)
Block Expression
Multiple statements can be grouped as a block. A block also has its own scope.
Syntax
BlockExpression:
begin (<statement>)* end
Examples
begin
a = 1
if a > 0 then
return 0
else
return 1
end
end
Lambda Expression
A lambda expression defines an anonymous function.
Syntax
LambdaExpression:
fn (<identifier>)* = <expression>
Examples
-- lambda takes a single expression
add3 =
fn a b c = a + b + c
println$( add3$(1, 2, 3) )
-- To use multiple expression/statements
-- Group them in a block expression
(fn x = begin
loop
if x == 0 then
break
else
x = x//2
end
end
println$('loop finished')
end) $(Int::MAX) -- Directly call lambda
Require Expression (Not Implemented)
Diatom can import another file with a require expression. The expression returns a module that contains all variables defined after executing that file. It does not introduce any variables into the global scope.
Syntax
RequireExpression:
require <string literal>
Note that the string literal must be in a.b.c or a form. It may contain only the characters [a-zA-Z0-9_\\-\\.].
Examples
util = require "myModule.util"
lib = require "lib"
Statements
A statement in Diatom always returns () (the unit type).
Syntax
Statement:
Expression
| DataStatement
| FunctionStatement
| LoopStatement
| FlowStatement
Loop Statement
There are three types of loops in Diatom:
- loop: infinite loop
- until: loop until a break condition is met
- for: loop through an iterator
Syntax
LoopStatement:
loop <statement>* end
| until <expression> do <statement>* end
| for <expression> in <expression> do <statement>* end
Examples
a = 0
loop
if a > 5 then
break
else
a = a + 1
end
end
println$(a)
a = 0
until a > 5 do
a = a + 1
end
println$(a)
a = 0
for _ in 0..6 do
a = a + 1
end
println$(a)
Function Statement
A function statement declares a function and binds its name to that function in the current scope.
Syntax
FunctionStatement:
def <identifier> <identifier>* =
<statement>*
end
Examples
def add3 a b c =
a + b + c
end
println$( add3$(1, 2, 3) )
Flow Control Statement
These statements alter execution flow in a function or a lambda expression.
Syntax
FlowStatement:
break
| continue
| return <expression>?
Examples
def f a =
if a > 3 then
return true -- return a value
end
return -- Return nothing (Unit)
end
println$( f$(5), f$(-1) )
for i in 1..10 do
if i < 5 then
print$(i, '')
continue
else
println$('Break: i =', i)
break
end
end
Table & Meta Table
This chapter is about syntax related to table and meta table.
Member Resolution
The operators . and :: are used to access a member of a table.
The Diatom interpreter looks up the table's own entries first; if that fails, it looks up the entries of the table's meta table.
meta_table = {meta_key = 2}
-- Create a table with meta table
table = {key1 = 0, key2 = 1} <- meta_table
println$('table.key1 =', table.key1)
println$('table.meta_key =', table.meta_key)
However, assigning to a table is NOT resolved to its meta table. Instead, the assignment creates a new key inside the table itself.
meta_table = {meta_key = 2}
-- Create a table with meta table
table = {key1 = 0, key2 = 1} <- meta_table
table.meta_key = 3
println$('table.meta_key =', table.meta_key)
println$('meta_table.meta_key =', meta_table.meta_key)
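The lookup and assignment rules can be modeled in a few lines of Python. This is a toy model for illustration, not Diatom's implementation: the class Table and its get and set methods are hypothetical names, and the model deliberately looks only one level up, since whether lookups chain through a meta table's own meta table is not stated here.

```python
class Table:
    """Toy model of a table with an optional meta table."""

    def __init__(self, entries, meta=None):
        self.entries = dict(entries)
        self.meta = meta  # fixed at creation, mirroring `{...} <- meta_table`

    def get(self, key):
        # Reads: the table's own entries first, then the meta table's entries.
        if key in self.entries:
            return self.entries[key]
        if self.meta is not None and key in self.meta.entries:
            return self.meta.entries[key]
        raise KeyError(key)

    def set(self, key, value):
        # Writes never reach the meta table; they always land in this table.
        self.entries[key] = value


meta_table = Table({"meta_key": 2})
table = Table({"key1": 0, "key2": 1}, meta=meta_table)
table.set("meta_key", 3)
print(table.get("meta_key"), meta_table.get("meta_key"))
```

After the write, the table has its own meta_key entry that hides the one in the meta table, which itself is unchanged.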
Meta Table
A meta table is just a normal table that can contain any keys.
A meta table is used to abstract common behavior shared by multiple tables. It can ONLY be set when a table is created. After that, you cannot set a new meta table or retrieve the current one.
Modifying a meta table through a table is also limited, because assignments to table keys are not resolved to the meta table.
This example will NOT work.
table = {}
meta = {
my_method =
fn x = x + 1
}
table <- meta
Method Call
If a table's entry or its meta table's entry is a closure, it is possible to call this closure as a method or member function.
A method call implicitly passes the caller itself as the first parameter.
def my_add self x =
self.value + x
end
MyInt = {
my_add = my_add
}
my_int = { value = 1, my_add2 = my_add } <- MyInt
println$( my_int.my_add$(10) )
println$( my_int.my_add2$(10) )
Sometimes you may want to call a static method without passing the caller, in which case you can use the :: operator instead of the . operator. It is also recommended to access static values such as Int::MAX with :: instead of writing Int.MAX.
def my_add self x =
self.value + x
end
MyInt = {
my_add = my_add
}
my_int = { value = 1, my_add2 = my_add } <- MyInt
println$( MyInt::my_add$( my_int, 10 ) )
Names
This chapter is about diatom's namespace and name resolution.
Scope
Diatom uses lexical scope. That means all names are resolved according to their lexical context.
Variable Declaration
Diatom does not require a variable to be declared before use. A variable is automatically declared the first time a value is assigned to it.
Variable Shadowing
Diatom does not allow a local variable to shadow an outer one. A name always resolves to the outer variable if one is available. This also applies to function parameters: they must not have the same name as an outer variable.
For example,
a = 1
def f x =
a = x
end
f$(2)
println$('a =', a)
a will have the value 2 after execution.
Local Scope
The following constructs introduce a local scope:
- Block: begin ... end
- Function: def ... end
- Lambda: fn
- Loop/For: loop ... end, for ... end, until ... do ... end
- If: if ... then ... else ... end
A variable declared in a local scope is not available in outer scopes.
The following code will NOT work.
begin
a = 10
end
println$(a)
Binding and Closure
Assignment Rules
All primitive types in Diatom are unboxed and passed by value. These are:
- Unit
- Int
- Float
- Bool
- String
i = 1
alias = i
alias = 2
println$('i =', i)
println$('alias =', alias)
Lists, tuples and tables are stored on the heap, so these values are passed by reference. External functions and closures also follow this rule.
l = [1, 2]
alias = l
alias.append$(3)
println$('l =', l)
println$('alias =', alias)
Closure
A closure may capture variables. A capture is not a copy of the value; it directly aliases the variable. Captures remain valid even after the captured variables go out of scope.
-- declare variable x1 & x2
x1 = () x2 = ()
f =
fn = begin
a = 1
-- x1 captures `a`
x1 =
fn = a
-- x2 also captures `a`
x2 =
fn x = begin
a = x
end
end
-- initialize x1 & x2
f$()
-- modify `a` (declared in `f`), which has
-- already gone out of scope, via `x2`
x2$(10)
-- `a` captured by x1 is also modified
println$(x1$())
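For readers more familiar with other languages, the same capture-by-alias behavior can be reproduced in Python. This is an analogy only, mirroring the names from the example above; in Python the aliasing has to be requested explicitly with nonlocal, whereas the text above describes it as Diatom's default.

```python
def f():
    a = 1

    def x1():
        return a  # reads the captured variable, not a copy of its value

    def x2(x):
        nonlocal a  # write to the same captured variable
        a = x

    return x1, x2


x1, x2 = f()       # `a` is no longer reachable by name here
before = x1()
x2(10)             # modify `a` through the second closure
after = x1()       # the first closure observes the change
print(before, after)
```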
Module (Unstable Feature)
Diatom can load external diatom source code files as modules.
Search Path
A search path is a set of paths where the Diatom compiler looks for modules. There are several ways to add a search path.
- Relative path: When parsing a file, the file's directory is automatically added to the search path and takes the highest priority.
- Other search paths: Search paths can also be specified by the host.
- Console: The Diatom console adds the current working directory to the search path.
Resolve Rule
A module string like "a.b.c" is resolved to <search path>/a/b/c.dm or <search path>/a/b/c/mod.dm. Diatom searches for the module in priority order and returns as soon as the first match is found.
Diatom executes a module only once and uses the cached value when the same module is loaded again.
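The resolve rule can be sketched as a function that lists candidate files in priority order. This Python sketch is illustrative only: the names candidate_paths and resolve are hypothetical, search paths are assumed to be given highest priority first, and trying c.dm before c/mod.dm within one search path is an assumption that the rule above does not settle. Module caching is not modeled.

```python
from pathlib import PurePosixPath


def candidate_paths(module, search_paths):
    """List the files a module string could resolve to, in priority order."""
    parts = module.split(".")  # "a.b.c" -> ["a", "b", "c"]
    candidates = []
    for root in search_paths:  # assumed: highest priority first
        base = PurePosixPath(root).joinpath(*parts)
        candidates.append(str(base.with_name(base.name + ".dm")))  # a/b/c.dm
        candidates.append(str(base / "mod.dm"))                    # a/b/c/mod.dm
    return candidates


def resolve(module, search_paths, exists):
    """Return the first candidate for which `exists(path)` is true, else None."""
    for path in candidate_paths(module, search_paths):
        if exists(path):
            return path
    return None


files = {"/lib/a/b/c/mod.dm"}
print(candidate_paths("a.b.c", ["/proj", "/lib"]))
print(resolve("a.b.c", ["/proj", "/lib"], files.__contains__))
```

Passing the existence check as a callable keeps the sketch independent of the real file system.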
Access Module Members
A require expression returns a module. All variables declared in the module file become members of the module and can be accessed via the . operator.
-- lib.dm
version = "0.1.1"
def sum (x y z) =
x + y + z
end
-- main.dm
lib = require "lib"
lib.version
lib.sum$( 1 2 3 )
Standard Library
This part is about diatom's standard library.
Built-in Methods
Built-in methods that are implemented by the interpreter.
println
Print anything to the output buffer (usually managed by the host program). It accepts any number and any kind of parameters. It will NOT run into an infinite loop if the data to be printed is cyclic.
If the output buffer cannot be written to (for example, because a TCP stream is closed), it will panic.
Example
table = { ref = () }
table.ref = table
println$(table, 1, "asm", false, 1.234, ())
print
Same as println but does not print a new-line character at the end.
assert
Assert an expression.
Accepts a single Bool as its argument. If the value is false, it panics.
Example
a = (1..).take$(6).collect$()
assert$(a.len$() == 6)
panic
Immediately panic. Accepts nothing or a string as its argument.
Example
panic$()
panic$('look at me')
unreachable
Causes a panic if executed. Useful for marking unreachable code.
def my_func =
if false then
unreachable$()
end
end
my_func$()
println$('success')
todo
Causes a panic if executed. Useful for marking something that is not completed yet.
def my_func =
todo$()
end
my_func$()
Unit (primitive type)
A tuple with 0 items. It has only one possible value, (). This value is implicitly returned if no return value is provided.
Type
() :: ()
Examples
def ret_unit = end
println$( ret_unit$() )
value = begin
-- statement always return ()
loop
break
end
end
println$(value)
Int (primitive type)
A platform-independent 64-bit signed integer type, guaranteed to wrap on overflow.
Meta table
The Int type's meta table is stored in a variable named Int.
Examples
println$( Int::MAX )
println$( Int::abs$(-10) )
Int::MAX
Maximum possible value for the Int type.
Type
MAX :: Int
Examples
println$( Int::MAX )
Int::MIN
Minimum possible value for the Int type.
Type
MIN :: Int
Examples
println$( Int::MIN )
Int::abs
Calculate the absolute value of an integer.
Note that the absolute value of Int::MIN always overflows and therefore returns Int::MIN.
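The overflow in the note follows from two's-complement wrapping: the range is asymmetric, so the magnitude of the minimum value is one larger than the maximum. The Python sketch below models this with unbounded integers; the names INT_MIN, INT_MAX, wrap_i64 and int_abs are hypothetical helpers for the illustration, not part of Diatom.

```python
INT_MIN = -(2 ** 63)
INT_MAX = 2 ** 63 - 1


def wrap_i64(value: int) -> int:
    """Reduce an unbounded Python int to the signed 64-bit range."""
    return (value + 2 ** 63) % 2 ** 64 - 2 ** 63


def int_abs(value: int) -> int:
    """Absolute value with 64-bit wrapping, as described in the note."""
    return wrap_i64(abs(value))


print(int_abs(-10))                  # an ordinary case
print(int_abs(INT_MIN) == INT_MIN)   # 2**63 does not fit, so it wraps around
print(wrap_i64(INT_MAX + 1) == INT_MIN)
```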
Type
abs :: Int -> Int
Examples
println$( (-10).abs$() )
println$( Int::MIN.abs$() )
Int::float
Convert this integer to a Float.
Type
float :: Int -> Float
Examples
println$( 100.float$() )
Float (primitive type)
A platform-independent 64-bit floating-point type.
Meta table
The Float type's meta table is stored in a variable named Float.
Examples
println$( Float::INF )
println$( Float::abs$(-10.0) )
Float::MAX
Maximum possible value for a Float.
Type
MAX :: Float
Examples
println$( Float::MAX )
Float::MIN
Minimum possible value for a Float.
Type
MIN :: Float
Examples
println$( Float::MIN )
Float::INF
Infinity (∞).
Type
INF :: Float
Examples
println$( Float::INF )
Float::NEG_INF
Negative infinity (−∞).
Type
NEG_INF :: Float
Examples
println$( Float::NEG_INF )
Float::NAN
Not a Number (NaN).
The value of this depends on Rust version and platform.
Description from Rust's std
Note that IEEE 754 doesn’t define just a single NaN value; a plethora of bit patterns are considered to be NaN. Furthermore, the standard makes a difference between a “signaling” and a “quiet” NaN, and allows inspecting its “payload” (the unspecified bits in the bit pattern). This constant isn’t guaranteed to equal to any specific NaN bitpattern, and the stability of its representation over Rust versions and target platforms isn’t guaranteed.
Type
NAN :: Float
Examples
println$( Float::NAN )
println$( Float::NAN.is_nan$() )
Float::floor
Calculate the largest integer less than or equal to a Float.
Type
floor :: Float -> Float
Examples
println$( (-1.23).floor$() )
println$( 1.23.floor$() )
Float::ceil
Calculate the smallest integer greater than or equal to a Float.
Type
ceil :: Float -> Float
Examples
println$( (-1.23).ceil$() )
println$( 1.23.ceil$() )
Float::round
Calculate the nearest integer to self. Round half-way cases away from 0.0.
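Rounding half-way cases away from 0.0 differs from the round-half-to-even rule that some languages use by default. The Python sketch below spells out the rule as described; round_half_away is a hypothetical name, and the code illustrates the stated behavior rather than reproducing Diatom's implementation.

```python
import math


def round_half_away(value: float) -> float:
    """Round to the nearest integer, with ties going away from 0.0."""
    if math.isnan(value) or math.isinf(value):
        return value  # nothing to round
    whole = math.trunc(value)  # drop the fractional part (toward zero)
    if abs(value - whole) >= 0.5:
        whole += 1 if value > 0 else -1  # step one unit away from zero
    return math.copysign(float(whole), value)  # keep the sign of the input


print(round_half_away(2.5), round_half_away(-2.5))
print(round_half_away(1.23), round_half_away(-1.23))
```

For comparison, Python's built-in round(2.5) returns 2 because it rounds ties to the nearest even integer.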
Type
round :: Float -> Float
Examples
println$( (-1.23).round$() )
println$( 1.23.round$() )
Float::int
Cast a Float to the Int type by truncating.
Type
int :: Float -> Int
Examples
println$( 1.7.int$() )
println$( 1.2.int$() )
println$( (-1.7).int$() )
println$( (-1.2).int$() )
Float::abs
Calculate the absolute value of a Float.
Type
abs :: Float -> Float
Examples
println$( (-1.23).abs$() )
Float::is_nan
Test if a Float is NaN.
Type
is_nan :: Float -> Bool
Examples
println$( 0.1235.is_nan$() )
println$( Float::NAN.is_nan$() )
Float::is_inf
Test if a Float is positive or negative infinity.
Type
is_inf :: Float -> Bool
Examples
println$( Float::INF.is_inf$() )
println$( Float::NEG_INF.is_inf$() )
println$( 1.2345.is_inf$() )
Bool (primitive type)
Boolean value, either true or false.
The condition of if and until must evaluate to a Bool value.
Type
Bool :: Bool
Examples
This example will NOT work.
if 0 then
println$(' 0 is not false ')
end
Tuple
A tuple of items, useful for returning multiple values or as an anonymous table.
Access tuple items via the . operator and an Int literal.
Examples
def multi_ret x y =
(x, y)
end
tuple = multi_ret$(1, 3)
println$( tuple )
println$( tuple.0 )
tuple.0 = 'changed'
println$( tuple.0 )
List
A dynamic array that can contain any items.
-- Create an empty list
list = []
println$(list)
-- List with a single item
list = [1]
println$(list)
-- List with multiple items
list = [1, 'string', false]
println$(list)
Type
List :: [a]
Meta Table
The List type's meta table is stored in a variable named List.
l = [1, 2, 3]
List::append$(l, 4)
println$(l)
Index
A list may be indexed by an Int. Both positive and negative indices work.
l = [1, 2, 3, 4]
println$( l$[0], l$[3] )
println$( l$[-4], l$[-1] )
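The example suggests that a negative index counts from the end of the list, so that -1 is the last item and -4 is the first item of a four-item list. A minimal Python sketch of that mapping follows; normalize_index is a hypothetical helper, and raising an error for out-of-range indices is an assumption, since the behavior in that case is not described here.

```python
def normalize_index(index: int, length: int) -> int:
    """Map a possibly negative index onto the range 0..length-1."""
    if -length <= index < length:
        return index % length  # e.g. -1 -> length - 1
    raise IndexError(f"index {index} out of range for length {length}")


items = [1, 2, 3, 4]
print(items[normalize_index(0, len(items))], items[normalize_index(3, len(items))])
print(items[normalize_index(-4, len(items))], items[normalize_index(-1, len(items))])
```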
Iterate
A list can be used in a for loop or as an iterator.
-- for loop
for i in [1, 2, 3] do
print$(i, '')
end
println$('\n')
-- get an iterator
list = [1, 2, 3]
iterator = list.iter$()
println$( iterator.__next$() )
println$( iterator.__next$() )
println$( iterator.__next$() )
println$( iterator.__next$() is Option::None )
println$(list)
List::len
Return the length of a list.
Type
len :: [a] -> Int
Examples
l = [1, 2, 3]
println$( l.len$() )
List::clear
Remove all items from a list and return self.
Type
clear :: [a] -> [a]
Examples
l = [1, 2, 3].clear$()
println$(l)
List::reverse
Reverse a list and return self.
Type
reverse :: [a] -> [a]
Examples
l = [1, 2, 3].reverse$()
println$(l)
List::append
Append a new item and return self.
Type
append :: [a] -> a -> [a]
Examples
l = [1, 2, 3].append$(4)
println$(l)
List::insert
Insert an item at a given index.
Type
insert :: [a] -> Int -> a -> [a]
Examples
l = [1, 2, 3].insert$(-2, 5)
println$(l)
List::remove
Remove the item at a given index and return self.
Type
remove :: [a] -> Int -> [a]
Examples
l = [1, 2, 3]
println$( l.remove$(2) )
List::iter
Get an iterator of this list.
Type
iter :: [a] -> Iter a
Examples
iter = [1, 2, 3].iter$()
println$( iter.__next$().value )
Iter
This is the standard library's iterator API. It can be set as the meta table for any iterator that implements __next, which returns an Option. Access it via the variable Iter.
Iter::all
Test if all elements satisfy a predicate. Evaluation stops at the first false.
Type
all :: Iter a -> (a -> Bool) -> Bool
Iter::any
Test if any element satisfies a predicate. Evaluation stops at the first true.
Type
any :: Iter a -> (a -> Bool) -> Bool
Iter::collect
Collect all items into a list.
Type
collect :: Iter a -> [a]
Iter::count
Count all elements in an iterator.
Type
count :: Iter a -> Int
Iter::sum
Calculate sum over all elements. If there is no element, 0 is returned.
Type
sum :: Iter a -> a
Iter::max
Calculate maximum element of an iterator. Panic if there is no element.
Type
max :: Iter a -> a
Iter::min
Calculate minimum element of an iterator. Panic if there is no element.
Type
min :: Iter a -> a
Iter::reduce
Reduce all elements by a given function. Panic if there is no element available.
Type
reduce :: Iter a -> (a -> a -> a) -> a
Iter::for_each
Apply a function to every element.
Type
for_each :: Iter a -> (a -> ()) -> ()
Iter::fold
Fold an iterator with an initial value and a given function.
Type
fold :: Iter a -> b -> (b -> a -> b) -> b
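The type signature reads as: start from an initial value of type b, then combine it with each element in turn. The Python sketch below shows the conventional meaning of a fold with that signature; the function fold here is an illustration written for this note, not Diatom's implementation.

```python
def fold(iterator, initial, function):
    """Combine all items into one value, starting from `initial`."""
    accumulator = initial
    for item in iterator:
        # `function` takes the running result first, then the next item,
        # matching the (b -> a -> b) argument order in the signature.
        accumulator = function(accumulator, item)
    return accumulator


total = fold(iter([1, 2, 3]), 0, lambda acc, x: acc + x)
joined = fold(iter(["a", "b"]), "", lambda acc, s: acc + s)
empty = fold(iter([]), 42, lambda acc, x: acc + x)
print(total, joined, empty)
```

Unlike reduce, a fold has a well-defined result for an empty iterator: the initial value is returned unchanged.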
Iter::map
Map a function over an iterator.
Type
map :: Iter a -> (a -> b) -> Iter b
Iter::filter
Filter an iterator by a predicate.
Type
filter :: Iter a -> (a -> Bool) -> Iter a
Iter::skip
Skip the first n elements of an iterator.
Type
skip :: Iter a -> Int -> Iter a
Iter::take
Take n elements from iterator.
Type
take :: Iter a -> Int -> Iter a
Iter::zip
Zip two iterators.
Type
zip :: Iter a -> Iter b -> Iter (a, b)
Iter::step_by
Skip some elements by a fixed step.
Type
step_by :: Iter a -> Int -> Iter a
Iter::take_until
Take elements until the predicate is no longer satisfied.
Type
take_until :: Iter a -> (a -> Bool) -> Iter a
Iter::enum
Enumerate an iterator, starting from 0.
Type
enum :: Iter a -> Iter (Int, a)
Gc
Control the garbage collector. It can be accessed via the variable Gc.
Gc::collect
Immediately start collecting garbage. Accepts no arguments.
Gc::pause
Pause the garbage collector. Accepts no arguments.
Gc::resume
Resume the garbage collector. Accepts no arguments.