Welcome to the SOF Programming Language documentation!
This documentation details the Stack with Objects and Functions programming language, an experimental stack-based reverse-polish-notation functional programming language created by kleines Filmröllchen.
While the README is comprehensive on basic concepts and a good starting point for interested people (like you), the docs shall provide the most thorough information on SOF, including a full language description/specification and a from-scratch tutorial (no programming knowledge required).
This documentation, like all of SOF, is a Work In Progress (WIP). As much as possible, non-implemented features are marked as such. It is licensed under GNU FDL 1.3.
License
Copyright (C) 2019-2021 kleines Filmröllchen. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
Getting started
This section is a tutorial and a more user-focused guide of SOF's features. It's recommended that you have some experience programming in common paradigms (imperative/OOP, functional).
Basics
SOF is an interpreted language. Use the installation steps described in the README to install SOF and launch the REPL interpreter. Note that the results of programs you type in might not be visible; they are probably placed on the stack. This is different from a lot of REPL interpreters (such as Python or JShell), which print the result of the last expression, whether you put in a println or not. I recommend typing along with this tutorial and modifying the examples in your own creative ways to learn more about how things are done in SOF. All example code shows the input and output of the interpreter, where >>> is a user input line, ... is an input continuation line and !!! is an error information starting line.
Let's get the Hello World out of the way:
>>> "Hello, world!" writeln
Hello, world!
This introduces both string literals (double quotes, escape sequences coming soon!) and the most basic I/O command: writeln, which takes a string and prints it to standard output, ending the line. Most languages call this command println, a leftover term from when such a command would actually instruct a printer to print the text on paper. But I'm getting off track here.
You may have noticed something weird here. Don't believe me? Let's try some arithmetic:
>>> 3 12 + writeln
15
>>> 26 18 * writeln
468
>>> 378 9 / writeln
42
>>> 1 2 + 3 + writeln
6
You can see that the first line clearly computes 3 + 12, but why is the operator (+, addition) after the numbers it operates on (3 and 12)? The reason is that the stack-based nature of SOF causes it to have postfix notation: all the operations come after the operands they operate on. Most famously, the document description language PostScript by Adobe uses this notation, and - surprise, surprise - it works off a stack as well.
Each operation or function takes a specific number of arguments that come before it. As we saw, writeln only takes one argument: the thing to be printed, while + takes two arguments: the two numbers to be added. The same goes for the other arithmetic operations, including -, which is not shown here.
With this new knowledge, take a look at the last line. Where is the second operand to the second + instruction? The 3 is one of them, but the other seems to be missing. Except, of course, it isn't.
Any operation that occurs in SOF only ever has the Stack to work with. This means that not only do literals place their values onto the stack, but operations place their results back onto the stack as well. When the interpreter sees the first +, it retrieves the top two elements on the stack, which in this case happen to be the numbers 1 and 2 put there by the literals, and computes their sum. The result, 3, is placed back onto the stack, which can be imagined as shortening the code to 3 3 +. The second + doesn't even know where its numbers come from - they may be from a user input function, they may be literals, they may be the result of an operation or they may be a duplication of another value (which, in this case, is quite possible). As long as they are both numbers, the + sums them and places that value onto the stack, ready for use in the next computation (which happens to be writeln).
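The evaluation order described above can be modelled in a few lines of Python. This is only a sketch of the idea, not how the SOF interpreter is implemented; the function name evaluate and the restriction to integer literals and three operators are assumptions made for illustration.

```python
def evaluate(source):
    """Evaluate a whitespace-separated postfix expression of integers.

    Hypothetical helper: it only models literals and three operators.
    """
    operations = {
        "+": lambda left, right: left + right,
        "-": lambda left, right: left - right,
        "*": lambda left, right: left * right,
    }
    stack = []
    for token in source.split():
        if token in operations:
            right = stack.pop()  # the topmost element is the right operand
            left = stack.pop()   # the element below it is the left operand
            stack.append(operations[token](left, right))
        else:
            stack.append(int(token))  # a literal places its value on the stack
    return stack


print(evaluate("1 2 + 3 +"))  # [6]
```

After the first + the model's stack holds a single 3, exactly the "3 3 +" shortening described above once the literal 3 is pushed.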
The Stack
You can't understand SOF without understanding the stack. On the flip side, once you do, SOF should feel much more intuitive.
What is a stack?
This section will briefly explain the notion of a stack as used in computer science. Feel free to skip it if you know about stacks and the LIFO principle.
A stack, in computer science, is just like a stack in the real world. Imagine a stack of books:
----------
| book |
----------
| book |
----------
| book |
----------
| book |
----------
These books are heavy - you cannot lift more than one at a time. And because they are placed on top of each other, you can only access the topmost one. For these reasons, you can only do one of two basic things: put another book onto the stack or remove one book from the stack, making the book below that the new topmost book. (Technically, you could also count looking at the topmost book as a basic operation). These operations are called push and pop (and peek), respectively.
The same thing goes for stacks in computer science, but now the books are stored in electronic memory and the books are data: in the case of SOF, these are numbers, strings, commands, code etc. You will often see a stack being referred to as a LIFO queue, which stands for "last in - first out", i.e. the last element that went into the "queue" (on top of the stack) is the first element that will be retrieved back with a pop operation.
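For readers who know Python, the three basic operations can be tried out with a plain list, using the end of the list as the top of the stack. The variable names are made up for this illustration.

```python
books = []

# push: put three books onto the stack, one at a time
books.append("first book")
books.append("second book")
books.append("third book")

# peek: look at the topmost book without removing it
topmost = books[-1]

# pop: remove the topmost book; the one below becomes the new top
removed = books.pop()

print(topmost, "|", removed, "|", books[-1])
# third book | third book | second book
```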
Advanced stacking
Here are some cool things to do with a stack:
>>> "world" dup writeln writeln
world
world
The dup operator, short for duplicate, creates an identical copy of the topmost element on the stack.
>>> 6 7 8 pop writeln
7
The pop operator does exactly what it says: it discards the topmost value from the stack. In this case, this makes the 7 the topmost element of the stack, which is then printed.
>>> 4 5 swap writeln
4
The swap operator swaps the topmost two elements of the stack.
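As a rough model of the three operators just shown, here is what they do to a Python list whose last element is the top of the stack. The function names are chosen for this sketch only.

```python
def dup(stack):
    """Copy the topmost element."""
    stack.append(stack[-1])


def pop(stack):
    """Discard the topmost element."""
    stack.pop()


def swap(stack):
    """Exchange the two topmost elements."""
    stack[-1], stack[-2] = stack[-2], stack[-1]


after_dup = ["world"]
dup(after_dup)

after_pop = [6, 7, 8]
pop(after_pop)

after_swap = [4, 5]
swap(after_swap)

print(after_dup, after_pop, after_swap)
# ['world', 'world'] [6, 7] [5, 4]
```

In each case writeln would then print the last list element: world, 7 and 4, matching the interpreter sessions above.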
Names and Errors
Let's get into more advanced topics. Variables, branching and of course the programmer's favorite: Errors.
Naming things
Right now, we are only using the stack for storing things. We can duplicate, remove and operate on these values, but if we need lower values that were put on the stack previously, we are going to run into some issues quickly. For this reason, we can use the mighty def operator:
>>> 3 x def
>>> # do some stuff here
>>> x . writeln
3
def is short for "define". In this case it defines that the name x should represent the number value 3. Nothing fancy seems to happen, but in the background, SOF created what is essentially a variable named "x" and gave it the value of the number 3. We can now go off and do something else (also note the use of basic # comments) and come back later to retrieve x's value. This is done by giving the variable's name followed by a . (dot). That little dot, the Call operator, is incredibly powerful (and powers, like, everything in SOF), but let's not get ahead of ourselves. Here, it is just used to retrieve the value associated with a name. Also, from now on, we will use the proper SOF terminology and call this simple x an Identifier. It doesn't do anything on its own; it is just a piece of data that can be used to identify (hence the name) variables and other named things.
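To make the interplay of def and the call operator more concrete, here is a small Python model. It assumes a single dictionary as the nametable and a hypothetical Identifier class; the real interpreter is free to organise this differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """A name that does nothing on its own (hypothetical model class)."""
    name: str


def define(stack, nametable):
    """Model of def: bind the value below the identifier to its name."""
    identifier = stack.pop()
    value = stack.pop()
    nametable[identifier.name] = value


def call(stack, nametable):
    """Model of the call operator: identifiers are looked up, other values return themselves."""
    target = stack.pop()
    if isinstance(target, Identifier):
        stack.append(nametable[target.name])
    else:
        stack.append(target)


stack, nametable = [], {}

# 3 x def
stack.extend([3, Identifier("x")])
define(stack, nametable)

# x .
stack.append(Identifier("x"))
call(stack, nametable)

print(stack)  # [3]
```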
Making decisions
You are probably already waiting for conditional execution and all that Turing-complete stuff. Here it is:
>>> { "That number is small" writeln } 4 1000 < { "That number is large" writeln } ifelse
That number is small
>>> { "That number is small" writeln } 2000 1000 < { "That number is large" writeln } ifelse
That number is large
There are a bunch of things here that need discussion. The braces delimit code blocks, which are a way of grouping instructions (in this case, two simple output writes) to be executed later. More on them in a bit. The < is a basic comparison operator, a less than operator that needs two numbers and returns a boolean (true/false) according to the comparison result. Finally, the ifelse is an operation that takes in two executable things, in this case the code blocks, and executes one of them based on the boolean result that lies in between them: if it is true (the number is smaller than 1000), it executes the first block; if it is false (the number is 1000 or greater), it executes the second block. A simple if would omit the alternative block and just execute the first block if the condition was true:
>>> { "Primary school completed!" writeln } 2 1 > if
Primary school completed!
>>> { "Primary school completed!" writeln } 2 4 > if
>>>
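The argument order of if and ifelse can be modelled in Python by treating code blocks as ordinary callables that sit on the stack. The names sof_if and sof_ifelse and the output list are inventions of this sketch.

```python
def sof_if(stack):
    """Model of if: the condition is on top, the block below it."""
    condition = stack.pop()
    to_execute = stack.pop()
    if condition:
        to_execute()


def sof_ifelse(stack):
    """Model of ifelse: alternative block on top, then condition, then first block."""
    otherwise = stack.pop()
    condition = stack.pop()
    to_execute = stack.pop()
    if condition:
        to_execute()
    else:
        otherwise()


output = []

# { "That number is small" writeln } 4 1000 < { "That number is large" writeln } ifelse
sof_ifelse([
    lambda: output.append("That number is small"),
    4 < 1000,
    lambda: output.append("That number is large"),
])

# { "Primary school completed!" writeln } 2 4 > if
sof_if([lambda: output.append("Primary school completed!"), 2 > 4])

print(output)  # ['That number is small']
```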
Errors
Up until now, we have only written simple programs that do not crash, because they conform to SOF's syntax and other rules. But let's say you screwed up while typing a string and forgot the closing quote:
>>> "a string
...
!!! Syntax Error in line 1 at index 1:
"a string
^
No closing '"' for string literal.
The first thing that will happen when you try this is that a line with three dots will appear. This is because SOF found the error, but many errors can be corrected on the next line, so it gives you another chance. This error, however, is not resolvable: Ending the continuation line with another press of the enter key will make SOF scream at you. But in a good way.
First, there is the !!! bit, which signals an error. Then, the name of the Error is given. The Errors section in the reference has information on all errors, but the most common ones are Syntax, your code is sh*t, and Type, your data is sh*t. After the error type comes the information on where the error occurred (possibly incorrect) as well as the segment of code where the error occurred combined with a pointer to the exact character (mostly correct, no guarantees). This helps you find the place of mishap. Finally, there is some additional information on what went wrong: in this case, no closing quote for the string literal was found, which is exactly the error. Note that as with every language, the actual mistake might be somewhere else, but wasn't detected because the code up to that point was legal SOF. For example, you might be passing a wrong parameter to a function, which will only be detected when some operation tries to act on that parameter and finds it to be of a wrong type.
It's a bird! It's a plane! No wait - It's a function!
Functions are essentially code blocks on steroids. They can't do anything more or less, they are just safer and more convenient. To create a function, this pattern is generally used:
# create the function
{ 3 + return } 1 function addThree globaldef
# and call it
1 addThree : writeln # 4
The important bit is the primitive token (keyword) function. It takes in a piece of executable SOF code, a "Callable", as well as an argument count, which is 1 in this case. The code for the function here is a simple code block, the only way you can specify a Callable literally. Code block data behaves like any other data: it lives on the stack and can be assigned a name. Its superpower is delayed execution, that is, the SOF instructions you place inside it are not executed immediately; they are stored, to be run later. In this case, unlike with the if and ifelse commands you saw earlier, we don't even use the stored code directly; instead, we use it as the logic of a newly created function. But the function itself is also just a Callable, some data, now sitting on the stack. To use it repeatedly from anywhere, we must def it like any other variable and give it a name. We use globaldef in this case, which will always def globally (duh) (Python programmers: compare this to the global keyword). This pattern is SOF convention and allows you to define functions globally even inside other functions and classes.
We can now call the function, like we called the variables beforehand. But behold! This time around, we don't simply retrieve the Identifier's "value" with a single . (dot). That would just give us the function itself, which we want to call! For this reason, the double-call operator : exists. It simply executes two calls, the first time retrieving the function, the second time executing it. It is both more performant and more compact than . . but identical in function.
Now, what happens when we call the function? If you know any programming language, you know that functions (or methods, procedures, lambdas or whatever they're called) can receive one or more arguments. SOF, of course, is no different. If you know hardware/assembly-level programming, this may sound familiar: Arguments to an SOF function are expected to be on the stack. The function definition specifies the number of arguments, and SOF places a "stack protector" under all of a function's arguments. This means that a function's body cannot change anything that is on the stack, except for its arguments, of course. Also, anything that is on the stack when the function exits will be deleted, up to this "stack protector". Think of the stack protector as a specially-marked return address in assembler. The function cannot mess with the caller's stack and cannot clobber it. You can be absolutely sure how many elements from the stack are consumed, according to the function's argument count.
The code block looks familiar, but it now uses the new-fangled operation return. This will break out of the function body instantly and return the topmost value on the stack from the function. Alternatively, you can use the return:nothing primitive token (PT), which will end the function without returning anything. This is the default behavior when the function's end is reached without a return. The return value is placed onto the stack, obviously. In the example we use it as an argument to writeln.
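The stack protector behaviour can be approximated in Python as follows. This is a simplified sketch under several assumptions: the marker object PROTECTOR, the helper call_function and the convention that a body returns its result (or None for return:nothing) are all made up here, and the sketch only models the cleanup on exit, not the ban on reaching below the protector.

```python
PROTECTOR = object()  # hypothetical marker placed under the arguments


def call_function(stack, body, argument_count):
    """Run body with a protector below its arguments, then clean up."""
    position = len(stack) - argument_count
    if position < 0:
        raise IndexError("not enough arguments on the stack")
    stack.insert(position, PROTECTOR)
    result = body(stack)
    # Everything from the protector upwards is deleted on exit.
    del stack[position:]
    if result is not None:
        stack.append(result)


def add_three(stack):
    """Model of { 3 + return }."""
    stack.append(3)
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)
    return stack.pop()  # return: hand back the topmost value


def leave_garbage(stack):
    """A body that ends without return and leaves values behind."""
    stack.extend(["junk", "more junk"])
    return None


stack = ["caller data", 1]
call_function(stack, add_three, 1)
print(stack)  # ['caller data', 4]

stack = ["caller data", 1]
call_function(stack, leave_garbage, 1)
print(stack)  # ['caller data']
```

In both runs exactly one element, the single declared argument, is consumed from the caller's point of view.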
Language Reference
This section serves as the specification of the SOF language itself and its outer components.
SOF Language Specification
This section specifies the Stack with Objects and Functions programming language.
The SOF source file
An SOF source file is a program or part of a program in the SOF programming language. Source file names shall end with .sof or .stackof; the latter exists mainly to avoid extension collisions. Every SOF source file must adhere to the following Extended Backus-Naur Form specification:
(* A program is a series of tokens and comments, where tokens MUST be separated by whitespace. *)
SofProgram = [Token] { [Comments] ?Whitespace? [Comments] Token } { [Comments] ?Whitespace? } ;
Token = "def" | "globaldef" | "dexport" | "use" | "export"
| "dup" | "pop" | "swap"
| "write" | "writeln" | "input" | "inputln"
| "if" | "ifelse" | "while" | "dowhile" | "switch"
| "function" | "constructor"
| "+" | "-" | "*" | "/" | "%" | "<<" | ">>" | "cat"
| "and" | "or" | "xor" | "not"
| "<" | "<=" | ">" | ">=" | "=" | "/="
| "." | ":" | "," | ";" | "nativecall"
| "[" | "]" | "|"
| "describe" | "describes" | "assert"
| Number | String | Boolean
| Identifier | CodeBlock ;
Identifier = ?Unicode Letter? { ?Unicode Letter? | DecimalDigits | "_" | "'" | ":" } ;
(* A code block recursively contains SOF code, i.e. an SofProgram. *)
CodeBlock = "{" SofProgram "}" ;
(* Literals *)
String = '"' { ?any character except "? | '\"' } '"' ;
Boolean = "true" | "false" | "True" | "False" ;
Number = [ "+" | "-" ] ( Integer | Decimal ) ;
Integer = "0" ( "h" | "x" ) HexDigits { HexDigits }
| [ "0d" ] DecimalDigits { DecimalDigits }
| "0o" OctalDigits { OctalDigits }
| "0b" BinaryDigits { BinaryDigits } ;
Decimal = DecimalDigits { DecimalDigits } "." DecimalDigits { DecimalDigits }
[ ("e" | "E") ( "+" | "-" ) DecimalDigits { DecimalDigits } ] ;
BinaryDigits = "0" | "1" ;
OctalDigits = BinaryDigits | "2" | "3" | "4" | "5" | "6" | "7" ;
DecimalDigits = OctalDigits | "8" | "9" ;
HexDigits = DecimalDigits | "a" | "b" | "c" | "d" | "e" | "f" | "A" | "B" | "C" | "D" | "E" | "F" ;
(* Comments are ignored. *)
Comments = Comment { Comment } ;
Comment = ( "#" { ?any? } ?Unicode line break/newline? )
| ( "#*" { ?any? } "*#" ) ;
Here, SofProgram is the syntax specification for an entire SOF source file. The source file consists of two types of syntactical constructs: Comments and Tokens.
Comments are purely for the benefit of the programmer and are entirely ignored. Tooling is encouraged to place programming-language-like information about the code inside comments. There are two types of comments: Single-line comments start with # and extend to the end of the line, while multi-line comments start with #* and end with *# and can span multiple lines. In the following and the language reference in general, comments are treated as non-existent, and it is recommended to remove comments from the source code as one of the first steps of code processing.
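Removing comments as an early processing step could look roughly like the following Python sketch. It is not taken from any SOF implementation; the function name strip_comments is hypothetical, only "\n" is treated as a line break, and string literals are skipped so that a # inside a string is not mistaken for a comment.

```python
def strip_comments(source):
    """Return source with single-line and multi-line comments removed."""
    result = []
    index = 0
    while index < len(source):
        char = source[index]
        if char == '"':
            # Copy a string literal verbatim, stepping over \" escapes.
            end = index + 1
            while end < len(source) and source[end] != '"':
                end += 2 if source[end] == "\\" else 1
            result.append(source[index:end + 1])
            index = end + 1
        elif source.startswith("#*", index):
            # Multi-line comment: skip to just past the closing *#.
            end = source.find("*#", index + 2)
            if end == -1:
                raise ValueError("unclosed block comment")
            index = end + 2
        elif char == "#":
            # Single-line comment: skip to the line break, which is kept.
            end = source.find("\n", index)
            index = len(source) if end == -1 else end
        else:
            result.append(char)
            index += 1
    return "".join(result)


print(strip_comments('"a # b" writeln # prints the string'))
```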
Tokens are the core of the SOF program. The tokens are ordered in a linear sequence. The only exception is the code block token: A code block recursively nests another sequence of tokens. The other major distinction between token types is between the Literal Tokens, which behave and look like the literals in other programming languages, and the Primitive Tokens, a.k.a. keywords, which execute program logic.
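As an illustration of how a literal token's value is derived from its text, the Integer rule of the grammar can be turned into a small Python parser. The names INTEGER and parse_integer are hypothetical, and the sketch assumes that octal literals may only contain octal digits.

```python
import re

# One alternative per prefix of the Integer rule; the sign comes from Number.
INTEGER = re.compile(
    r"(?P<sign>[+-])?(?:"
    r"0[hx](?P<hex>[0-9a-fA-F]+)"
    r"|0o(?P<oct>[0-7]+)"
    r"|0b(?P<bin>[01]+)"
    r"|(?:0d)?(?P<dec>[0-9]+)"
    r")"
)


def parse_integer(text):
    """Convert the text of an Integer literal to a Python int."""
    match = INTEGER.fullmatch(text)
    if match is None:
        raise ValueError(f"not an Integer literal: {text!r}")
    for group, base in (("hex", 16), ("oct", 8), ("bin", 2), ("dec", 10)):
        digits = match.group(group)
        if digits is not None:
            value = int(digits, base)
            break
    return -value if match.group("sign") == "-" else value


print(parse_integer("0x1F"), parse_integer("-0o17"), parse_integer("0b101"))
# 31 -15 5
```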
Executing
Executing an SOF program is as simple as executing all tokens in the order that they are given in the source code. What each token does can vary wildly and each token is specified precisely in the main language reference. In general, executed tokens manipulate the Stack and the Nametables, and while Primitive Tokens have a predefined constant action, Literals vary depending on the exact token text given.
The SOF program environment
Every SOF program has an environment in which it is executed. This mainly consists of the Stack and the Nametable (or Nametables). The stack is a LIFO stack/queue (last in, first out) that can contain any kind of value. Of these values, there are types that the user can place and read via the use of certain tokens, and there are the "hidden types" that are used to make the program execute correctly. The form of these types is less precisely defined because they heavily depend on the implementation. The following hidden types are used:
- Global nametable: Sits on the bottom of the stack and acts as the global nametable into which bindings and lookups are done if no other nametable is present, or if globaldef or similar PTs are executed, which explicitly always operate on the global nametable.
- Function environment with local nametable: This is a nametable that hides the nametables below it on the stack when normal bindings and lookups are done. This nametable is placed on the stack when a function enters and can also serve as an indicator for the implementation on where to return execution to, similar to return addresses on assembly-level stacks. This nametable also holds a return value that is set by some tokens and returned to the caller (on the stack) when exiting the function. There are minor variations on the local nametables, for example for object constructors, but these behave almost the same.
The literal tokens
The literal tokens are all used to specify a value of a built-in type literally. They most commonly come in the form of basic data types like numbers and literal strings, but technically, code blocks are also literals. When a literal token is encountered, its value is placed on the stack, and the value of the token is derived from its physical form. This is self-explanatory for the literals of type Integer, Float, String (escape processing is explained elsewhere), and Boolean.
For code blocks, the code block's contained tokens must be stored in the data type in some form such that the contained tokens, and even additional code blocks, may be fully reconstructed by the implementation when the code block is later needed. This internal representation is deliberately kept unspecified so that implementations can choose any representation (or even multiple) that is most efficient in their circumstance. The code block, despite its appearance, therefore also just puts a data object on the stack that can later be executed or transformed as specified by the code block semantics. Most of these are given with the PTs that manipulate code blocks and can be found in the language reference. Because code blocks are considered to be immutable, implementations can take appropriate data-sharing measures to reduce these large on-stack data structures in size.
Exiting a program
The SOF program exits when the last token is executed or an error occurs. The kinds of errors and when they are thrown are specified in the language reference. The implementation must guarantee that no memory leak or similar resource issue results from a normal or abnormal termination.
Types
SOF currently has the following types:
Value
Base type of all other types; the reference implementation calls this Stackable
(because it can sit on the stack).
Callable
Technically speaking: Any type that can be operated on with the call operator (.). But as this applies to every type, Callable is mostly used as a term to mean more complex types that actually execute logic when called. The most important callables are Identifier (performs nametable lookup), CodeBlock, and Function.
Identifier
A specialized string-like type that can only contain specific characters (mostly letters and numbers). Used for name binding, i.e. the def and call operator families.
Primitive
Any type that returns itself when called and can be specified with a simple literal. All the following basic data types are primitives.
Number
Base type for number types. Arithmetic operations only operate on Numbers.
Integer
Normal integral positive/negative Number. The reference implementation uses 64-bit storage, but this is not mandatory. Ideally, Integers have no limit on their size other than memory.
Float/Decimal
Floating-point Number, 64 bits in the reference implementation. Implementations may offer arbitrary precision.
Boolean
Truth value, either true or false, the basis of program control flow. When called (future) it takes two elements from the stack and returns the lower one if it is true, or the higher one if it is false.
String
Piece of text of arbitrary length (limited only by memory); all Unicode characters/code points are supported.
CodeBlock
A code block created with curly braces; contains SOF code to be executed without a safe environment or arguments/return values. The most basic Callable that has user-definable behavior; therefore, it is often used to compose operations and Callables that require other Callables.
Function
A combination of a CodeBlock to be executed and a non-negative integer number of arguments (possibly 0). When called, protects its internals through the use of an FD and places its arguments above that on the stack to be used by its code.
Object
A data collection like a nametable, with the difference that objects are user-creatable and do not serve the role of stack delineation or as targets for def.
Constructor
A special kind of function that, when called, creates a new object and is able to initialize that object.
Errors
SOF throws a variety of errors when you mess something up. Currently, errors cannot be caught; to allow for adding this feature later, the except PT is reserved.
SyntaxError
The input is not correct SOF syntax. This will occur:
- on unclosed strings, block comments and braces
- on too many closing braces
- on wrong Integer, Boolean and Decimal literals
- on invalid identifiers
Syntax errors are unrecoverable, that is, they may never be caught.
TypeError
An operation was attempted on incompatible or unsupported types. This will occur:
- on any native operation that has typed arguments
- on PTs that require specific types, e.g. def and the call operator
NameError
An attempt was made to retrieve the value of an identifier that is not defined in the current scope(s).
ArithmeticError
An illegal mathematical operation was attempted: mostly divide by zero or its derivatives.
StackAccessError
An operation attempted an illegal modification of the Stack:
- The end of the stack was reached; accessing the GNT/NNT is not allowed.
- An FD was reached; Stack access beyond it is not allowed.
StackSizeError
(future) The Stack has reached the maximum feasible size. This will most likely only occur on recursive programs with deep recursion levels.
Primitive Tokens
Every primitive stack operation is called a primitive token. It is listed with its arguments, the argument lowest on the stack first, and a description of its return value. No return value section means that this operation places nothing on the stack. Get familiar with this argument and return type shorthand; it is used throughout the documentation.
Operation special cases for identifiers
In order to make in-place modification of defined values easier, it's possible to combine any of the binary operations +, -, *, /, %, <<, >>, <, >, <=, >=, =, /=, and, or, and xor with an identifier as the first (right, lower) argument. This will retrieve the value from the identifier through a .-like call, perform the operation on that value as the right argument instead, and then store the result back into the identifier like a normal def. The result does not remain on the stack. This is equivalent to the +=, -= etc. operators found in many programming languages.
Miscellaneous tokens
input
(string input function)
Return value < the token read from stdin: String
Reads one word, i.e. everything in the standard input up to the first (Unicode) whitespace character, without trailing or leading whitespace characters.
inputln
(line string input function)
Return value < the line read from stdin without line terminator(s): String
Reads one line (any combination of line separators ends a line) from standard input.
write
(output function)
Arguments < output: String
Writes the argument to standard output.
writeln
(output function w/ line break)
Arguments < output: String
Writes the argument to standard output and terminates the line.
Arithmetic and Logic PTs
+
(add operator)
Arguments < left: Number < right: Number
Return value < Mathematically: left + right: Number
Computes the sum of the two arguments. The result is an Integer if both arguments are Integers and a Decimal if any argument is a Decimal. Throws TypeError if any of the arguments has a non-number type.
-
(subtract operator)
Arguments < left: Number < right: Number
Return value < Mathematically: left - right: Number
Computes the difference between the two arguments. The result is an Integer if both arguments are Integers and a Decimal if any argument is a Decimal. Throws TypeError if any of the arguments has a non-number type.
*
(multiply operator)
Arguments < left: Number < right: Number
Return value < Mathematically: left · right: Number
Computes the product of the two arguments. The result is an Integer if both arguments are Integers and a Decimal if any argument is a Decimal. Throws TypeError if any of the arguments has a non-number type.
/
(divide operator)
Arguments < left: Number < right: Number
Return value < Mathematically: left ÷ right: Number
Computes the result of the first argument divided by the second argument. The result is an Integer (division with remainder) if both arguments are Integers and a Decimal (division) if any argument is a Decimal. Throws TypeError if any of the arguments has a non-number type. Throws ArithmeticError if the right argument is zero.
%
(modulus operator)
Arguments < left: Number < right: Number
Return value < Mathematically: left mod right: Number
Computes the first argument modulo the second argument. First, any Decimals are converted to Integers. Then, the remainder of the integer division of the two arguments is computed and returned. Throws TypeError if any of the arguments has a non-number type. Throws ArithmeticError if the right argument is zero.
<<, >>
(logical bit shift operators)
Arguments < base: Number < amount: Number
Return value < base (<< or >>) amount: Integer
Computes the logical bit shift, that is the base (first argument) shifted left (<<) or right (>>) amount number of bits. Because this is a logical shift, it does not sign-extend the base. If this operation receives Floats as arguments, it truncates them to Integers.
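The difference between a logical and a sign-extending shift is easiest to see on a negative number. The following Python sketch assumes 64-bit Integers as in the reference implementation and truncates Floats with int(); the helper names are hypothetical, and the behaviour for negative or very large shift amounts is not modelled.

```python
BITS = 64  # assumption: Integer width of the reference implementation
MASK = (1 << BITS) - 1


def to_signed(value):
    """Interpret the low 64 bits of value as a two's-complement number."""
    value &= MASK
    return value - (1 << BITS) if value >= (1 << (BITS - 1)) else value


def shift_left(base, amount):
    return to_signed(int(base) << int(amount))


def shift_right(base, amount):
    # Masking first makes the shift logical: zeros, not sign bits, move in.
    return to_signed((int(base) & MASK) >> int(amount))


print(shift_left(1, 3))      # 8
print(shift_right(-1, 60))   # 15
print(shift_right(16.9, 2))  # 4
```

A sign-extending shift of -1 by 60 bits would still yield -1; the logical shift instead leaves only the four topmost one bits, giving 15.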
<, >, >=, <=
(comparison operators)
Arguments < left: Number < right: Number
Return value < Result of the comparison: Boolean
Compares the two arguments, always in the form left <comp> right. The operators are less than, greater than, greater than or equal, and less than or equal, respectively. This operation throws a TypeError if any of the arguments is not a Number.
=, /=
(equality operators)
Arguments < left < right
Return value < Whether the values are equal/unequal: Boolean
Checks whether the two arguments are equal or not equal, respectively. Two arguments are compared using the following algorithm:
- If both arguments are Numbers: Check whether their numeric value is equal. Integers are converted to Floats if at least one of the arguments is a Float.
- If both arguments are Booleans: Check whether they represent the same truth value.
- If both arguments are Strings: Check if every single one of their characters matches in order.
- If both arguments are Objects: Check if their nametables contain the same value for each key and whether they contain the same list of keys. The values are checked with this same algorithm.
- If both arguments are any other builtin value: Return false. This applies most importantly to CodeBlocks and Functions, as there is no simple way of determining their equality.
If the arguments aren't of the same type, upcasting is done, where Booleans upcast to Numbers and all other types upcast to Strings. This means that, for example, "2" 2 = holds true. For stricter equality, check the types first.
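The comparison rules can be sketched in Python as below. Several points are assumptions of this model rather than part of the specification: Objects are represented as dictionaries, true upcasts to the number 1, and the String form of a value is whatever Python's str produces.

```python
def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def sof_equals(left, right):
    """Model of the = operator; /= would be its negation."""
    if is_number(left) and is_number(right):
        return left == right  # int/float comparison converts as needed
    if isinstance(left, bool) and isinstance(right, bool):
        return left == right
    if isinstance(left, str) and isinstance(right, str):
        return left == right
    if isinstance(left, dict) and isinstance(right, dict):
        # Objects: same keys, and equal values under this same algorithm
        return left.keys() == right.keys() and all(
            sof_equals(left[key], right[key]) for key in left
        )
    if type(left) is type(right):
        return False  # other builtin values, e.g. code blocks
    # Mixed types: Booleans upcast to Numbers (assumed: true is 1) ...
    if (isinstance(left, bool) and is_number(right)) or (
        is_number(left) and isinstance(right, bool)
    ):
        return left == right
    # ... and everything else upcasts to Strings (assumed: via str()).
    return str(left) == str(right)


block = lambda: None  # stands in for a code block
print(sof_equals("2", 2), sof_equals(2, 2.0), sof_equals(block, block))
# True True False
```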
and, or, xor
(binary logic operators)
Arguments < left < right
Return value < result of the operation: Boolean
Combines the two operands according to their Boolean value. The algorithm for finding the Boolean value is exactly the same as the one convert:bool uses.
not
(negation operator)
Arguments < arg
Return value < result of the negation: Boolean
Negates arg's value; if it was true, it is now false, if it was false, it is now true. If the argument is not a Boolean, its truthiness value is determined according to convert:bool.
Control flow
dowhile
(loop with at least one iteration)
Identical to while, but will execute the body callable before checking the condition, which results in at least one call to the body.
if
(conditional execution operator)
Arguments < to execute: Callable < condition: Boolean
Executes the callable if condition is true.
ifelse
(conditional execution operator with alternative)
Arguments < to execute: Callable < condition: Boolean < to execute otherwise: Callable
Executes the first callable if condition is true. Otherwise, executes the second callable.
switch
(multi-ifelse conditional execution operator)
Arguments < "switch::" (Identifier) < ( case body: Callable < case condition: Callable ) * any number of times < default body: Callable
Compact alternative to nested ifelses. The behavior of this token is as follows:
The default body, the element last on the stack, is stored for later use. Then, the entire stack is traversed two elements at a time. If the first element is the identifier "switch::", the beginning/end of the switch has been reached; this special identifier serves as a sort of label to delineate the statement from the other, likely important stuff on the stack. As no case has been executed yet, the default body is executed.
If, however, the first element is a Callable, it is executed and the algorithm expects a Boolean value to be situated on top of the stack afterward. If this Boolean is true, the second element, the corresponding case body, is executed. Otherwise, the search continues.
After any body was executed, the stack has all elements up to the switch:: label removed, meaning that the bodies cannot store values onto the stack, but should use def
s to save them to the NT.
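The traversal described above can be sketched in Python. This is only an illustrative model, not the reference implementation: the stack is a Python list whose last item is the top, callables are Python functions that receive the stack, and the names SWITCH_MARKER and run_switch are made up for this sketch.

```python
# Hypothetical stand-in for the "switch::" identifier on the stack.
SWITCH_MARKER = object()


def run_switch(stack):
    """Model of the switch traversal; the top of the stack is stack[-1]."""
    default_body = stack.pop()            # the default body is topmost
    chosen = default_body                 # used when no case condition is true
    while stack[-1] is not SWITCH_MARKER:
        condition = stack.pop()           # first element of the pair
        case_body = stack.pop()           # second element of the pair
        condition(stack)                  # must leave a Boolean on top
        if stack.pop() is True:
            chosen = case_body
            break
    chosen(stack)
    # Remove everything up to and including the label, which also
    # discards anything the body may have left on the stack.
    while stack.pop() is not SWITCH_MARKER:
        pass
```

Because everything down to the label is discarded, a body in this model can only communicate through side effects, which mirrors the advice to use definitions instead of stack values.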
while
(loop function)
Arguments < body: Callable < condition: Callable
Before every execution of the body, executes the condition callable, which should place a Boolean value onto the stack. If the Boolean value is true, the body is executed once and the cycle repeats; if it is false, the loop ends.
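The difference between the two loop PTs can be modelled in a few lines of Python. This is a sketch under simple assumptions: the stack is a list, callables are Python functions taking the stack, and the function names sof_while and sof_dowhile are invented for illustration.

```python
def sof_while(stack, body, condition):
    """Check the condition first; the body may run zero times."""
    while True:
        condition(stack)          # pushes a Boolean
        if not stack.pop():
            break
        body(stack)


def sof_dowhile(stack, body, condition):
    """Run the body first; it always runs at least once."""
    while True:
        body(stack)
        condition(stack)          # pushes a Boolean
        if not stack.pop():
            break
```

With a condition that is false from the start, sof_while never runs the body, while sof_dowhile runs it exactly once.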
Stack & Naming
def
(definition operator)
Arguments < value: Value < name: Identifier
Modifies the LNT by setting the key-value pair name: value
. This means that now the value of the identifier name
is "defined" to be the value value
, hence the name. Will overwrite any existing binding to name
.
dup
Arguments < elmt: Value
Return value < elmt < elmt
Duplicates the topmost element on the stack.
globaldef
(global definition operator)
The same as def
, but always defines into the GNT.
pop
(stack remove operator)
Arguments < any
Removes the topmost element from the stack and discards it.
swap
(stack exchange operator)
Arguments < x < y
Return value < y < x
Exchanges the position of the two top-most elements on the stack.
Functional
.
(call operator)
Arguments < 0 or more < tocall: Callable
Return value < any value or none
Calls the topmost element of the stack. Each type exhibits its own behavior when called, but the most basic are:
- Most primitives return themselves.
- Calling an identifier looks up that identifier's value in the innermost scope that has the identifier defined or throws a NameError if that fails.
- Calling a namespace looks up the next stack element, an identifier, in that namespace. This process is recursive, although currently namespaces cannot be nested.
- Calling a function executes it and consumes the specified number of arguments from the stack. The return value of the function is placed on the stack. An exception to this is when the function arguments are "blocked off" with a currying operator (see below). In that case, the function is not called; instead, it and the curried arguments are replaced by the curried function.
- Calling a code block executes it. There are neither arguments nor a return value.
:
(doublecall/function invoke operator)
Arguments < 0 or more < tocall: Callable
Return value < any value or none
Calls the topmost element of the stack twice. I.e., after the first call, the now topmost element is immediately called again. Therefore, this PT is a shortcut that is exactly equivalent to (and faster than) . .
. This is intended for convenient use of named functions, i.e. functions defined into a namespace. The first call will retrieve the function itself onto the stack, and the second call will execute it. Therefore, the normal way you will see functions be used is with :
, which also makes it easy to distinguish a variable lookup from a function call.
|
(currying marker)
This is a pseudo-value on the stack that does nothing and is invisible to almost all operations. It is used to limit the number of arguments a function receives. If a function, while retrieving arguments from the stack for calling, encounters a currying marker before all necessary arguments are found, the function is not called; instead, a curried function is created that has the arguments found so far "pre-stored". The curried function can be called later and takes the number of remaining arguments. These appear on the stack below the curried arguments, so that the actual argument order inside the function doesn't change.
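A small Python model may make the currying rule easier to follow. It is a sketch only: CURRY_MARKER, SofFunction and call_function are hypothetical names, and the sketch assumes that the marker is consumed when a curried function is created, which the text above does not state.

```python
# Hypothetical stand-in for the | pseudo-value.
CURRY_MARKER = object()


class SofFunction:
    def __init__(self, argcount, code, prestored=()):
        self.argcount = argcount
        self.code = code                    # plain Python callable
        self.prestored = tuple(prestored)   # arguments fixed by currying


def call_function(stack, fn):
    """Collect arguments from the top of the stack, or curry at a marker."""
    needed = fn.argcount - len(fn.prestored)
    taken = []
    while len(taken) < needed:
        top = stack.pop()
        if top is CURRY_MARKER:
            # Too few arguments before the marker: build a curried function.
            curried = SofFunction(fn.argcount, fn.code,
                                  tuple(taken) + fn.prestored)
            stack.append(curried)
            return
        taken.insert(0, top)                # keep stack order (lowest first)
    # Remaining arguments come first, pre-stored ones last.
    stack.append(fn.code(*taken, *fn.prestored))
```

In this model, currying a two-argument subtraction with 3 above the marker yields a one-argument function that subtracts 3 from whatever is supplied later.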
function
(function definition operator)
Arguments < code: CodeBlock < argcount: Integer
Creates a function with argcount
arguments. Usually, it is then def
'd or globaldef
'd with a name.
nativecall
(native function invocation operator)
Arguments < any number of arguments < native function identifier: String
Return value < any
Calls a function defined natively in the interpreter. See the section "Native calls" for details.
Typing-related PTs
This includes PTs that create or handle complex types.
[
(list marker)
This is a pseudo-value on the stack that does nothing and is invisible to almost all operations. It is used as a delimiter for the start of a list when the list creator ]
is used.
]
(list creator)
Arguments < [
< any number of elements (< ]
)
Return value < literal: List
Creates a new list from a number of literal values. This operation traverses the stack downwards until it hits the list start marker [
; this means that the list creator operation is the only one that actually handles the start marker and doesn't just ignore it. All the elements in between are used as the initial values of the list, and they are ordered such that the lowest values are the first in the list. For example, the code [ 1 2 3 ]
creates a list with the Integer elements 1, 2, and 3, in that exact order. This entire literal list creation system is therefore very intuitive while still respecting SOF's orthogonality.
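The interplay of the two list PTs can be sketched in Python as follows. The stack is modelled as a list with its top at the end, and LIST_START and create_list are names invented for this illustration.

```python
# Hypothetical stand-in for the [ pseudo-value.
LIST_START = object()


def create_list(stack):
    """Model of ]: gather everything above the nearest list start marker."""
    elements = []
    while True:
        top = stack.pop()
        if top is LIST_START:
            break
        elements.append(top)
    elements.reverse()          # lowest stack element becomes the first item
    stack.append(elements)
```

Values below the marker are untouched, so a list literal can appear in the middle of other stack work.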
,
(fieldcall operator)
Arguments < object: Object < field: Callable
Return value < object < value
Executes a call on the object's nametable with the given field as a Callable to execute. The field is usually an identifier identifying a field value on the object in question. It can however be any sort of callable data, including callables that do not interact with their nametable. The object is left on the stack below the return value, making it easily available for further processing.
;
(methodcall operator)
Arguments < object: Object < 0 or more < tocall: Callable
Return value < object < any value or none
Calls the topmost element of the stack twice, like :
. Additionally, remembers the object before the calls occur and places it back onto the stack below the return value of the second call. Furthermore, the function call actually happens with the object nametable as the function nametable, so the function can use object attributes as variables and modify the object with def
's. It is also possible to add new attributes to an object; for this reason, if you need named temporaries you should use an inner dummy function.
This operator is intended to be used with method-like named functions, functions that expect an object to operate on. SOF technically has no bound functions (you can emulate them by attaching function values to object attributes, but using those is a lot more cumbersome), so all functions that act like methods are free functions expecting some object to operate on. The big advantage is that as long as you pass an object that behaves like the one the function expects, the function will operate just fine. The method call operation will leave the program with the result of the operation (possibly none), and the original object below it, which allows for it to be used further.
constructor
(constructor creation operator)
Arguments < code: CodeBlock < args: Integer
Return value < constructor: Constructor
Creates a new object creation template by turning the code block into a constructor function. Inside the constructor's code block, def
s can be used to initialize fields on the object's nametable. A new object can be created by executing the constructor; this is the earliest time that the code body is executed. Just like other functions, the constructor can obtain any number of arguments when called.
Modules
dexport
(definition+export operator)
Arguments < value: Value < name: Identifier
name dexport
is syntactic sugar for name globaldef name export
. This operator simply binds the value to name in the GNT and also exports it.
export
Arguments < name: Identifier
Exports the value bound to name in the LNT. Exporting is the method of making data visible to other SOF modules that import this module. Only exported names, not all names in the GNT, will be available to the module after import.
use
(import module)
Arguments < module name: String
This PT is part of the module system, documented in the section "The SOF Module System". It executes modules and imports their exported definitions into the global namespace.
Builtin functions
These functions are always available to the user, and part of the prelude
file in the standard library. The difference to other modules is that the prelude file is executed as if its text was in the file itself, so normal module mechanisms don't apply unless you explicitly "prelude" use
.
convert:bool
(Not implemented)
Arguments < toConvert
Return value: converted: Boolean
Converts the argument to a Boolean. If the value is not already a Boolean, it uses the "truthiness" of the argument, which is almost always true. When the argument is 0 or 0.0, it is false.
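The truthiness rule is short enough to write down as a Python sketch. The function name convert_bool is only a stand-in, and the sketch assumes that 0 and 0.0 are the only falsy non-Boolean values, since those are the only ones named above.

```python
def convert_bool(value):
    """Model of the truthiness rule: only False, 0 and 0.0 are falsy."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)) and value == 0:
        return False
    return True
```

Note that, unlike in Python, an empty string or an empty list would be truthy under this reading.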
convert:float
Arguments < toConvert
Return value: converted: Float
Converts the argument to a float. The argument can either be a string containing a valid SOF float literal (plus any leading/trailing whitespace), an integer or a float already. The result is the corresponding float value; the function fails with a TypeError if conversion fails, e.g. wrong number format, unsupported origin type.
convert:int
Arguments < toConvert
Return value: converted: Integer
Converts the argument to an integer. The argument can either be a string containing a valid SOF integer literal (plus any leading/trailing whitespace), an integer already or a float to be rounded. The result is the corresponding integer value; the function fails with a TypeError if conversion fails, e.g. wrong number format, unsupported origin type.
convert:string
Arguments < toConvert
Return value: converted: String
Converts the argument to its string representation. This is the same process used by the output methods. The argument can be of any type, as any SOF type has a string representation, but the result might not be beautiful.
convert:callable
Arguments < toConvert
Return value: converted: Callable
Converts the argument to its callable equivalent. This has the following result:
- "Real" callables are unchanged. This affects functions, code blocks and identifiers.
- Primitives are converted to a Church encoding version of themselves. (Not implemented) This means:
  - Natural numbers n >= 0 are converted to a callable that, when called with another callable f, will call f n times. If f returns a value and receives an argument, this is exactly equivalent to the notion of Church numerals.
  - Booleans are converted to a callable that, when called with two arguments, will return the first (stack-lowest) argument if it is true; otherwise, it will return the second argument. This also means that ca cb cond if (ca, cb Callables, cond Boolean) is equivalent to ca cb cond convert:callable : .
  - Other Integers x are converted to a two-element list [a, b] where a, b ∈ ℕ are Church numerals as described above and x = a - b.
  - Floats x are first converted to the most accurate rational representation. Then, a two-element list [k, a] is created where k is a Church numeral (integer), a is a Church-encoded natural number and x = k / ( 1 + a ).
The conversion fails on Strings and other more complex types and throws a TypeError.
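Since the Church encoding is not implemented yet, the following Python sketch only illustrates the arithmetic behind the description. All function names are invented, Python closures stand in for SOF callables, and for uniformity the sketch uses the [a, b] pair form for every signed integer, including the k of a float.

```python
from fractions import Fraction


def church(n):
    """Church numeral for a natural number: apply f exactly n times."""
    def numeral(f):
        def apply_n_times(x):
            for _ in range(n):
                x = f(x)
            return x
        return apply_n_times
    return numeral


def unchurch(numeral):
    """Recover the natural number by counting applications."""
    return numeral(lambda k: k + 1)(0)


def encode_integer(x):
    """Pair [a, b] of Church numerals with x = a - b."""
    return [church(x), church(0)] if x >= 0 else [church(0), church(-x)]


def decode_integer(pair):
    a, b = pair
    return unchurch(a) - unchurch(b)


def encode_float(x):
    """Pair [k, a] with x = k / (1 + a), using the float's exact fraction."""
    q = Fraction(x)
    return [encode_integer(q.numerator), church(q.denominator - 1)]


def decode_float(pair):
    k, a = pair
    return decode_integer(k) / (1 + unchurch(a))
```

The 1 + a in the denominator guarantees that the encoded denominator can never be zero. For most floats the exact denominator is a large power of two, so this representation is of theoretical rather than practical interest.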
random:01
Return value: random number: Float
Generates a pseudo-random number between 0 (inclusive) and 1 (exclusive), optimally using a system-provided RNG (such as /dev/urandom
on Linux). THIS PSEUDO-RANDOM NUMBER GENERATOR IS NOT GUARANTEED TO BE CRYPTOGRAPHICALLY SAFE.
random:int
Arguments < start: Integer < end: Integer
Return value: random number: Integer
Generates a pseudo-random number between start and end, inclusive. Uses random:01
as the initial source of randomness (and, therefore, is NOT CRYPTOGRAPHICALLY SAFE).
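One common way to derive an inclusive integer range from a [0, 1) source is shown below as a Python sketch. Whether the prelude uses exactly this scaling is an assumption; random_int and its random01 parameter are names chosen for illustration, with Python's random.random standing in for the SOF source of randomness.

```python
import math
import random


def random_int(start, end, random01=random.random):
    """Scale a float in [0, 1) to an integer in [start, end], inclusive."""
    span = end - start + 1              # number of possible results
    return start + math.floor(random01() * span)
```

Because the source never returns 1, the product stays strictly below span and the result can never exceed end.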
random:float
Arguments < start: Float < end: Float
Return value: random number: Float
Floating-point variant of random:int
with equivalent behavior. Returns any floating point number between start and end, inclusive.
fmt'x
Arguments < format: String < x format arguments
Return value: formatted: String
Formats a string with a number of format arguments. fmt'x
is just a placeholder name; the actual functions are called fmt'0
through fmt'9
with 0-9 arguments, respectively. The exact format specifier syntax is not well-documented and can be found in the relevant native implementations. It's similar to Java's format string syntax though, and some tests exist for it.
The SOF Module System
SOF's module system is intended to be simple, but flexible and practical. It is very reminiscent of Python's module system.
Modules, files, and folders
Each SOF source code file is a separate module. Folders are not special, they can just serve to group modules and avoid naming conflicts. There are no special module names such as __init__ or __main__ in Python, all files ending in .sof
are accessible equivalent modules.
Modules are named hierarchically with familiar dot syntax. Modules starting with a dot .
are relative modules, and modules starting with any other character are absolute modules.
Relative modules import relative to the file location. Single dots (except the leading dot) are used to import one directory lower, i.e. the name between this dot and the one before it is considered a directory in which to look for the module. Double dots ..
are used to import a directory higher (cf. directory navigation in all major operating systems). The highest directory possible is the directory of the base module of the program if the relative import chain originates from the base module, or the directory of the libraries if the relative import chain originates from a library module that was imported absolutely. This distinction prevents nonsensical and dangerous "upwards" imports while allowing for useful features like sibling folder importing.
Absolute modules import in the library directory. This is a runtime-constant directory which will later be accessible with command-line arguments and/or environment variables. It usually sits in a related directory to the SOF executable itself. The library directory contains not only the SOF standard library modules but also any modules added manually by the user or by package managers. Modules imported absolutely can import relatively themselves, which again allows for submodule structures even in the libraries. Within an absolute module, single dots can also be used to import in sub-directories of the library directory.
The module name, i.e. the name after the final dot, never contains the .sof
ending. This allows for the alternative endings and special file formats which are treated specially by the module system, like .soflib
.
Each naming segment in any module specification, which represents either folders or the final file, can contain all characters except the two slash characters (used by the operating systems for directory structure) and dots, of course.
Given this detailed description, the method of resolving modules is unambiguous and straightforward. Modules are always treated with UTF-8 encoding, just as all SOF files are.
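The resolution rules can be sketched in Python under one reading of the text. The sketch assumes that a leading dot marks a relative module, that every further pair of adjacent dots means "one directory up", and that paths use forward slashes. It does not enforce the "highest directory possible" limit, and the function name and example directories are hypothetical.

```python
import posixpath


def resolve_module(spec, importing_dir, library_dir):
    """Map a module specification to a .sof file path."""
    segments = spec.split(".")
    if segments[0] == "":                 # leading dot: relative module
        base, segments = importing_dir, segments[1:]
    else:                                 # anything else: absolute module
        base = library_dir
    # An empty segment comes from two adjacent dots and means "go up".
    parts = [".." if segment == "" else segment for segment in segments]
    return posixpath.normpath(posixpath.join(base, *parts)) + ".sof"
```

Under these assumptions, ".helpers.text" imported from /project/src resolves to /project/src/helpers/text.sof, while "math" resolves inside the library directory.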
Names
As SOF, like C, has no namespaces, care needs to be taken when naming functions and other exports of a module. Because exports overwrite all GNT entries of the same name upon import, duplicate definitions are technically allowed (though the interpreter might issue a warning). The convention, as outlined in the programming conventions, is to use underscores for separating pseudo-namespaces where necessary.
The use
, export
and dexport
primitive tokens
The use
primitive token is used to import a module. The module specification, its behavior explained above in detail, is given by a string. The SOF module system imports the specified module, which may come from the internal cache if it was already imported. Then, all of the bindings defined by export
or dexport
are imported into the importing file's global namespace. This means that you don't have to worry about cluttering global namespaces with unnecessary names: only the names you export in a module are visible to use
rs of that module.
Note that of course, use
is recursive. SOF code that is currently executed as part of a module import can use
other modules without any different rules or exceptions. The only impossible module connection is any sort of circular import. The reason is the same as in Python: importing a module always involves executing the entire imported module's source code. However, given the huge ecosystem of Python libraries, it is clear that this is not a real limitation and that all circular dependencies can be reworked into strictly hierarchical dependencies.
Running functions in other modules
There is special treatment given to exported functions, which technically is a special rule about all functions but only becomes relevant with cross-module functions. All functions store the global nametable at the time of definition; i.e. the global nametable of their module or file. As each module gets its own global nametable, this means that functions in different modules refer to different GNTs, but functions in the same module refer to the same GNT. When a function is run, the global nametable is in fact temporarily replaced with the function's global nametable at definition time (if necessary) and restored afterwards. This means that a function can access global values of its module like one would expect. To keep the orthogonality, the global nametable exchanging can be thought of as a stack of global nametables at the very bottom of the real stack. The actual global nametable is just the top of this sub-stack, and global nametables are pushed to and popped from the stack on function entry and exit.
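The "stack of global nametables" picture can be modelled in Python like this. It is a sketch with invented class names (ModuleFunction, Interpreter) and plain dicts as nametables, and it does not reflect how the Java implementation is organized.

```python
class ModuleFunction:
    def __init__(self, code, defining_gnt):
        self.code = code              # Python callable taking the interpreter
        self.gnt = defining_gnt       # GNT captured at definition time


class Interpreter:
    def __init__(self, gnt):
        self.gnt_stack = [gnt]        # the bottom entry is the running file's GNT

    @property
    def gnt(self):
        return self.gnt_stack[-1]     # the active global nametable

    def call(self, fn):
        self.gnt_stack.append(fn.gnt)     # function entry: push its GNT
        try:
            return fn.code(self)
        finally:
            self.gnt_stack.pop()          # function exit: restore the caller's
```

A function defined in a library module therefore sees the library's globals even when it is called from a file that defines the same names differently.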
The SOF standard library
This is the official collection of library functions and classes provided by the SOF system.
Files in the standard library
- math: Usual mathematical operations.
- op: (Not implemented) Built-in operations as callables.
- io: (Not implemented) File input/output.
- list: (Not implemented) List functions.
- fp: (Not implemented) Helpers and tools for functional programming.
math
abs
: Absolute value
Arguments < a: Number
Return value < |a|
: Number
Returns the absolute value of the given number.
sin
: Sine
Arguments < a: Float
Return value < sin(a)
: Float
Returns the mathematical sine of the input. The input angle is treated as radians.
cos
: Cosine
Arguments < a: Float
Return value < cos(a)
: Float
Returns the mathematical cosine of the input. The input angle is treated as radians.
tan
: Tangent
Arguments < a: Float
Return value < tan(a)
: Float
Returns the mathematical tangent of the input. The input angle is treated as radians.
exp
: Exponent
Arguments < a: Float
Return value < e^a
: Float
Returns e (Euler's number, approximately 2.718281) to the power of a. This is the most accurate power function.
ln
: Natural logarithm
Arguments < a: Float
Return value < ln(a)
: Float
Returns the natural logarithm of a. This is the most accurate logarithm function.
log
: Logarithm
Arguments < n: Float < a: Float
Return value < log_n(a)
: Float
Returns the logarithm with the base of n of a. This is mathematically equivalent to ln(a) / ln(n)
.
hypot
: Hypotenuse
Arguments < a: Float < b: Float
Return value < hypot(a, b)
: Float
Returns the length of the hypotenuse of a right triangle whose two other sides (adjacent and opposite) are a and b. This is the value sqrt(a*a + b*b)
as calculated by Pythagoras' formula, but it avoids overflows and imprecisions caused by large intermediary values.
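The overflow problem mentioned above is easy to reproduce. The following Python sketch uses Python's own math.hypot as a stand-in for a robust implementation; it says nothing about how the SOF math module computes the value internally.

```python
import math


def naive_hypot(a, b):
    """Direct use of the formula; the squares can overflow."""
    return math.sqrt(a * a + b * b)


big = 1e200
naive = naive_hypot(big, big)     # big * big exceeds the float range
robust = math.hypot(big, big)     # avoids the oversized intermediate value
```

Here naive ends up as infinity even though the true result, roughly 1.414e200, is representable as a float.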
list
(Not implemented)
List functions are prefixed with list_
; this prefix is omitted here. These functions are method-like functions to be called on a list.
element
: Index into a list
Arguments < list: List < index: Integer
Return value < element: the value at index index
in the list
This function implements normal indexing into a list. Any element can be retrieved from a list by means of its index. Indices are zero-based as in most programming languages. This means that the first element in a list is referred to by an index of zero. Using a negative index will retrieve elements from the end of the list, i.e. an index of -1 refers to the last element of the list (regardless of the list's size), -2 to the second to last element and so on. This is very useful for list-size-independent indexing from the end.
An index that would reach past the limits of the list throws an IndexError
. The element
function will always throw for empty lists. The indexing and throwing behavior is inherited by all list functions that take indices, unless noted otherwise.
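The indexing rule, including negative indices and the error case, can be sketched in Python. The name list_element is a stand-in, and Python's built-in IndexError plays the role of the SOF error.

```python
def list_element(items, index):
    """Zero-based indexing; negative indices count from the end."""
    length = len(items)
    if not -length <= index < length:
        raise IndexError(
            f"index {index} out of range for list of length {length}")
    return items[index % length]      # maps -1 to length - 1, and so on
```

For an empty list the valid range is empty, so every index raises, matching the rule above.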
length
: Length of a list
Arguments < list: List
Return value < List length: Integer
Returns the number of elements in the list, always nonnegative. Returns zero for the empty list.
head
: First element of a list
Arguments < list: List
Return value < First element in the list: Any value
This function returns the first value of the list. It throws IndexError
if the list is empty.
tail
: List tail
Arguments < list: List
Return value < The list's tail: List
Returns a new list that has the first element of the old list removed. Together with head
, it can be used to split a list into its first element and its remainder.
Returns an empty list for the empty list.
reverse
: Reverse a list
Arguments < list: List
Return value < The reverse of the list: List
Returns a new list which has all the elements of the old list, but at reversed positions. For example, the last element is now the first and the second element is now the second-to-last.
Returns the empty list for the empty list.
split
: Split up a list
Arguments < list: List < index: Integer
Return value < A two-element list with the first and second portion of the list, in that order.
This function splits up a list into two halves. The first half contains all elements up to the index (including the element at the index), and the second half contains all elements after the index. The two halves are returned as a list containing two elements. It is a more efficient combination of the take
and after
functions.
take
: First n elements of a list
Arguments < list: List < n: Integer
Return value < List of length n: List
Returns a new list that contains the elements of the given list up to the given index, exclusive. For positive n up to the list's length, the new list has exactly n elements. Returns the empty list for n=0 and the entire list if n is greater than or equal to the list's length.
after
: Elements after an index
Arguments < list: List < n: Integer
Return value < List with elmts after n: List
Inverse of take
; returns the elements that take
dropped from the list. For positive n, this is equivalent to dropping the first n elements from the list, for negative n, it is equivalent to taking |n|
elements from the end of the list (possibly the entire list if |n| > length(list)
). For n=0, the entire list is returned. For n greater than the list length, the empty list is returned.
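The described behavior of the two functions happens to coincide with Python slicing, which makes for a compact sketch. The names list_take and list_after are stand-ins, and the behavior of take for negative n is inferred from after being its inverse rather than stated in the text.

```python
def list_take(items, n):
    """Elements before position n (for negative n: all but the last |n|)."""
    return items[:n]


def list_after(items, n):
    """Elements from position n on (for negative n: the last |n|)."""
    return items[n:]
```

For every n, concatenating the two results gives back the original list, which is the sense in which after is the inverse of take.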
first
: First element of the list
Arguments < list: List
Return value < element: Any value
Returns the first element in the list, equivalent to list 1 take
. This is intended for use with tuple-like lists.
second
: Second element of a list
Arguments < list: List
Return value < second element: Any value
Returns the second element of the list, similar to first
, and equivalent to list 2 take
.
pair
: Create a tuple
Arguments < a: Any value < b: Any value
Return value < [ a, b ] : List
Creates a two-element list from the two arguments. Main function for creating tuple-like lists (short lists of known length) and returning two values.
filter
: Filter a list
Arguments < list: List < filter predicate: Callable
Return value < filtered list: List
This is the first of the higher-order list processing functions. filter
takes a list and a callable. The callable is then provided with each element at a time in order (the element is placed on the stack and the callable is invoked, therefore at least both functions and codeblocks will work). For each element where the callable returns a truthy value, the element is retained in the resulting list. For each element where the callable returns a falsy value, the element is discarded. The result is a list where only the elements that the filtering function deemed necessary are retained.
How does SOF actually work?
This page shall describe the way that SOF works internally, while staying language-independent, so as to accommodate other implementations of SOF compilers and interpreters. Nevertheless, examples of the reference Java implementation shall be given, as it currently is the only existing implementation of SOF.
Data Structures
SOF is a pure stack-based language. That means: All data always resides on the Stack, a linear unit of memory cells that contain data. Although we know a stack as being only LIFO and having one single visible element (the top, head or first element of the stack), in practice the SOF stack should be a Deque. If it wasn't, one would need two independent stacks for many operations (but both of them could be real, pure stacks).
But what is a Deque? This term, pronounced "deck", is short for "double ended queue" and describes a data structure with arbitrary access on both ends. The top of the deque is accessed with peek, pop and push, while the bottom of the deque is accessed with peekLast, popLast, pushLast. Java provides not only a Deque in its Collection framework, but also many specialized implementations, such as the currently used ConcurrentLinkedDeque
, which is a thread-safe doubly-linked-list implementation.
As already stated, all data lives on the stack. But how does this allow for named variables, namespaces and function calling? The answer is the second most important data structure of SOF, the Nametable.
A Nametable is simply a list of key-value mappings (Map
in Java and JavaScript, dict
in Python) that maps identifiers to any SOF data. Pretty simple, but this powers all of SOF. All definitions made by def
are simply entries into nametables, and the Call operator .
simply accesses entries in nametables.
The Global Nametable
The global nametable (GNT) is always the lowest element on the stack; when it is missing, something serious has gone wrong. The SOF programmer can never inspect, modify or remove the GNT, but it is being used all the time:
- All definitions made on a global level enter the GNT
- All imported NNTs (see below) are placed in the GNT
The GNT will be discussed in further detail with its use cases.
Scoping, the Call and Def operators
A Scope is created whenever a function starts. The scope is signaled by a special NT on the stack, called a function delimiter (FD). FDs hold special information on where to return execution when the function ends and what the return value is. Also, the FD cannot be taken off the stack by the program in any other way than returning from executing the current code block or function.
As FDs are NTs, this means that at any point in the program there could be many NTs on the stack at once. To figure out which NT is to be used for the call operator .
with an Identifier, the following simple rule is applied: Walk down the stack from top (last) to bottom (first). Whenever a Nametable is encountered, determine whether it contains the identifier that .
wants to call. If so, retrieve this identifier's value from this Nametable, if not, continue the search all the way to the NNT/GNT. If nothing is found, throw a NameError. This ensures that definitions made in a "more local" NT are more important and hide those in a "more global" NT.
Normally, def
operates on the Local Nametable (LNT), which is simply the highest NT on the stack. This may be an FD, or the GNT if there is no FD. This importantly means that code inside a function cannot modify nametables outside unless using the globaldef
operator.
Sometimes, the user wants to define into the GNT. For this, the operator globaldef
is provided, which exhibits the same behavior as def
except for always defining into the GNT. This is useful for defining functions in an enclosed scope, modifying global variables and so on. It is also convention to always globaldef
global functions, constructors, etc.
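The lookup and definition rules of this section can be modelled in Python. This is a sketch only: the stack is a list with the GNT at index 0, Nametable is a dict subclass used as a marker type, the function names are invented, and Python's built-in NameError plays the role of the SOF error.

```python
class Nametable(dict):
    """A dict that lives on the stack as a nametable (GNT or FD)."""


def lookup(stack, name):
    """Walk from the top of the stack down; the first NT with the name wins."""
    for element in reversed(stack):
        if isinstance(element, Nametable) and name in element:
            return element[name]
    raise NameError(name)


def sof_def(stack, name, value):
    """Define into the LNT, i.e. the highest nametable on the stack."""
    for element in reversed(stack):
        if isinstance(element, Nametable):
            element[name] = value
            return


def sof_globaldef(stack, name, value):
    """Define into the GNT, which is always the lowest stack element."""
    stack[0][name] = value
```

A definition made inside a function lands in the function delimiter and shadows the global binding without changing it.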
Native calls
The PT nativecall
executes a call to a natively implemented function. The only explicit argument of the native call is a string identifying the native function to call. Because the reference implementation is Java-based, the way in which native functions are identified is very similar to Java method signatures. The general form is NativeFunction = Package { "." Package } "." Class "#" Method "(" [ ArgumentType { "," ArgumentType } ] ")"
, where Package represents a legal Java package name, Class represents a legal Java class name and Method represents a legal Java method name. The argument types must be the internal SOF type names that the reference implementation uses, like StringPrimitive, FloatPrimitive etc. There may be any number of arguments separated by a comma (but no spaces!), and possibly none. The arguments are taken from the stack when nativecall
runs, where the last argument taken from the stack is the first argument passed to the native function. The native function may return an SOF typed result, in which case it is placed on the stack, or it may return nothing (void), in which case the stack is not modified. Native functions may throw (incomplete) compiler exceptions, in which case they propagate from the nativecall
as normal SOF errors of type Native
.
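To illustrate the shape of a native function identifier, here is a Python sketch that splits one into its parts. It follows the intent of the general form given above (comma-separated argument types without spaces, possibly none). The regular expression is a simplification of Java's identifier rules, and both the parser and the example identifier com.example.Natives#concat(...) are hypothetical.

```python
import re

IDENT = r"[A-Za-z_$][\w$]*"           # simplified Java identifier
NATIVE_SIGNATURE = re.compile(
    rf"(?P<owner>{IDENT}(?:\.{IDENT})+)"          # packages, then the class
    rf"#(?P<method>{IDENT})"
    rf"\((?P<args>{IDENT}(?:,{IDENT})*)?\)"       # no spaces allowed
)


def parse_native_signature(signature):
    """Split a native function identifier into its components."""
    match = NATIVE_SIGNATURE.fullmatch(signature)
    if match is None:
        raise ValueError(f"not a native function identifier: {signature!r}")
    names = match["owner"].split(".")
    return {
        "package": ".".join(names[:-1]),
        "class": names[-1],
        "method": match["method"],
        "arguments": match["args"].split(",") if match["args"] else [],
    }
```

An identifier with a space after the comma is rejected by this sketch, in line with the "no spaces" rule.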
How to write readable SOF code
This page will outline the conventions, idioms and common practices used when writing SOF code. Following this guide leads to good, idiomatic SOF code and APIs that other developers can use with ease. Most sections are in no particular order.
Whitespace use
As SOF is very whitespace insensitive, good code uses whitespace to logically structure the otherwise pretty one-dimensional series of tokens.
Token separation
Tokens on one line are separated with one single space. An exception is made when you want to align groups of tokens in multiple sequential lines: then, use of multiple spaces between tokens for alignment is encouraged.
Example: Defining a number of variables in sequence.
Don't do this:
3 x def
"string" msg def
2 4 * eight def
Do this:
3        x     def
"string" msg   def
2 4 *    eight def
Line breaks and token grouping
Each line should contain one single action, or a logical unit of actions, which might require multiple tokens. In general, PTs should appear on the same line as their arguments, except when these arguments are already on the stack or require a lot of steps to be prepared.
For PTs that take code blocks, such as if
, the last closing brace and the PT itself should be on the same line, except when line length would be a problem.
Code blocks should be split up into lines, where the opening and closing brace are on their own line, mimicking the "braces on next line" code style found in C-like languages. Exceptions are when the code block is very short (1-2 tokens) or contains only a single logical action (such as a function call with many parameters).
Example: Don't do this:
# define function
{ pop 15 + return } 2 function someFunction globaldef
# take user input, process it, store it, store a modified version, print one of two messages
input convert:int : true someFunction : dup x def 3 + y def { "large" } { "small" } y . 33 < ifelse
Do this:
# define function
{
    # discard first argument
    pop
    # compute something
    15 + return
} 2 function someFunction globaldef
# take user input
input convert:int :
# process it
true someFunction : dup
# store it
x def
# store a modified version
3 + y def
# print one of two messages
{ "large" } { "small" } y . 33 < ifelse
Line indentation
Indenting one level should only be done inside code blocks; this also includes functions and methods. The braces of the code blocks themselves should be on the original indentation level, mimicking the "braces on next line" code style found in C-like languages. Whole-line comments are indented as code would be.
Example: Don't do this:
3 x def
{
# a comment
3 someComputation :
2 someComputation :
4 writeln
} 3 4 < if
Do this:
3 x def
{
    # a comment
    3 someComputation :
    2 someComputation :
    4 writeln
} 3 4 < if
Methods are an exception to the indentation rule: they, together with the constructor, should be indented one level further than the surrounding code. The constructor defining calls themselves ( 3 constructor <classname> globaldef <classname> :
) should be on the same indentation level as the surrounding code.
Naming conventions
General naming in SOF is done with CamelCase. All names except for constructors (classes) should start with a lowercase letter; constructors start with an uppercase letter. Examples: fooFunc, connectToWebservice, doCoolComputation, myVariable, vector1, MyClass, Circle, FileCommunicator. Names in general should be self-explanatory and human-readable; avoid abbreviations and name collisions. (There is no significant speed benefit from shorter names, as all names are identified through some sort of hash.)
Special naming conventions are used for functions that provide similar functionality (there is, of course, no function overloading in SOF): functions of this sort should be of the form fname'args:variant. fname is the general name of the function collection. args is either the number of arguments or the arguments' types, separated by additional ' characters. variant is either the variation on the base functionality or the return type.
Examples: The functions random:01, random:int and random:float all provide randomness, but :01 returns values between 0 and 1, :int returns integers in a range and :float returns floats in a range. Similarly, the function collection convert:<type> includes a lot of functions that convert to a specific type from other supported types. The writef'<argc> functions all provide formatted standard output writing, but with a different number of formatting arguments specified by argc.
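To see how the fname'args:variant scheme decomposes, here is a minimal sketch in Python (not SOF, and not part of the SOF toolchain). The helper split_function_name and the FunctionName type are hypothetical names made up for this illustration, and the sketch assumes that the first : starts the variant and that every ' before it separates one argument entry. The name area'int'int:float is likewise invented to show typed arguments.

```python
from typing import NamedTuple, Optional, Tuple


class FunctionName(NamedTuple):
    """Parts of a name following the fname'args:variant convention."""
    fname: str                 # general name of the function collection
    args: Tuple[str, ...]      # argument count or argument types, if given
    variant: Optional[str]     # variation or return type, if given


def split_function_name(name: str) -> FunctionName:
    # Assumption: everything after the first ':' is the variant.
    head, sep, variant = name.partition(":")
    # Assumption: the rest is the base name followed by '-separated args.
    fname, *args = head.split("'")
    return FunctionName(fname, tuple(args), variant if sep else None)


# Names from the text above, plus one invented name with typed arguments.
for example in ("random:01", "convert:int", "writef'2", "area'int'int:float"):
    print(example, "->", split_function_name(example))
```

Here writef'2 stands for the writef'<argc> form with argc set to 2. Names without a ' yield an empty args tuple, and names without a : yield a variant of None.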
Cross-module naming
Names in modules can use an underscore to separate pseudo-namespaces (identical to the module name) from actual names. If the function names are reasonably generic and the number of exports in a module is small, this can be omitted.
For example, all the list functions in the standard library list have a list_ prefix, such as list_elem.
Glossary
- PT: Primitive token. A special token that has the syntax of an identifier (i.e. if it wasn't special, it would be treated as an identifier) but executes a special operation that (for the most part) cannot be accomplished by other means.
- Nametable: A key-value mapping structure (compare to Java's and JavaScript's Map, Python's dict) that is the second most important data structure of SOF internally after the Stack. All Nametables live on the Stack.
- GNT: Global nametable. Lowest element on the stack, used for top-level lookups and globaldef. Exported functions keep the GNT at export time.