= Expressions = 0 1 2... -> just push themselves on the stack "string" -> pushes "string" on the stack -> "bareword" /bareword -> "bareword" / |bareword -> "bareword" | \bareword -> "bareword" \ bareword -> lookup "bareword" in current scopes -> passive -> push value on the stack -> active -> call value on current stack -> quote -> call value on current stack, even in quoted mode / -> nop "string" | -> resolve "string" in current scope, push value "string" \ -> resolve "string" in current scope, call value on current stack [ -> push array begin marker on stack ] -> since the topmost array marker, create array from stack values ( -> push tuple begin marker on stack ) -> since the topmost tuple marker, create tuple from stack values { -> push quote begin marker on stack, increase parser quote count } -> since the topmost quote marker, everything becomes one closure (deduce type), decrease parser quote count dup -> dups |dup -> push dup function pointer \dup -> dups |f |g ; -> { f g } { ...1 } variable ; { ...2 } ; -> { ...1 variable } { ...2 } ; -> { ...1 variable ...2 } 1 2 |add * -> 3 |f * -> f |f *10 g -> f g StructuredData.field -> StructuredData "field" . -> dereference field in structured data, if passive push, if active do SD "field" .| -> dereference, push value SD "field" .\ -> dereference, call x "string" = -> dereference, create (passive/active depends on type) field if not existent, set value to x x pointer = -> dereference, set value to x x /pointervar = -> sets the pointer itself x |pointervar = -> sets the value the pointer points to x \pointervar = -> if the pointer points to a value which is a pointer, assign where that pointer points to v "name" deff -> define name as active name, assign v v "name" defv -> define name as passive name, assign v < -> push new scope > -> pop scope, structured data object now on stack < 0 =a 0 =b > -> something which has a and b is on top of stack 0 1 2 _0 -> 0 1 2 2 0 1 2 _1 -> 0 1 2 1 0 1 2 _2 -> 0 1 2 0 0 1 2 -0 -> 2 0 1 2 -1 -> 1 0 1 2 -2 -> 0 exch = -01 pop = - nop = _ dup = _0 _ -> copy stack contents according to digits - -> delete stack contents up to largest digit or count of "-" whichever is larger, recreate according to digits 1 [ 2 3 ] add -> [ 3 4 ] [ [ 1 ] [ 2 ] ] length -> 2 # scanning for applicable base type from top A->int B->int add -> B->A->int A->int A->int add -> A->int = Characters = !: co-routines and threads ": string quote #: line comment $: %: &: ': (: tuple begin ): tuple end *: apply function +: ,: -: stack manipulation .: field dereference /: stringify 0-9: numbers :: ;: function composition <: scope start =: assignment >: scope end ?: ternary operator @: A-Z: bareword characters [: array begin \: callify ]: array end ^: _: stack manipulation `: a-z: bareword characters {: quote begin |: passify }: quote end ~: = Memory Management = Inspiration: http://wiki.luajit.org/new-garbage-collector 0x6???????????: Heap memory ("16 TB should be enough for everyone.") 0x5???????????: GC block bitmap 0x4???????????: GC mark bitmap Large set of reachable, old objects Large set of unreachable, new objects Small set in between == Object Memory Layout == === Int === * Length in bytes (including header, always 16) bit 63-60: 0 0 0 0 bit 59: reserved for GC * value === String === * Length in bytes (including header) bit 63-60: 0 0 0 1 bit 59: reserved for GC * hash (0 if not yet calculated) * Exact length * data (UTF-8) === Scope === * Length in bytes (including header) bit 63-60: 0 0 1 0 bit 59: reserved for GC bit 58: parent pointer exists (for scopes) (always 1 for now) bit 57: extension area pointer exists (always 1 for now) * name table (mapping entry names to offsets) * parent scope (0 if no parent) * extension area pointer (0 if no extra members (yet)) * data [ ]* === Name Table === ( TODO: create a hash-table based implementation here ) * Length in bytes (including header) bit 63-60: 0 0 1 1 bit 59: reserved for GC * End of fill in bytes (i.e. if this field == length, table is full) * data [ , ]* ( TODO: Oh, the performance... ) The 0th string maps to 0 and so on. === Extension Area === * Length in bytes (including header) bit 63-60: 0 1 0 0 bit 59: reserved for GC * data [ ]* === Function === * Length in bytes (including header) bit 63-60: 0 1 0 1 bit 59: reserved for GC * captured scope pointer (0 if non-capturing function) * type pointer (0 if untyped) * code pointer === Function Code === * Length in bytes (including header) bit 63-60: 0 1 1 0 bit 59: reserved for GC * [ ]* === Array === * Length in bytes (including header) bit 63-60: 0 1 1 1 bit 59: reserved for GC * [ ]* === Function Type === * Length in bytes (including header) bit 63-60: 1 0 0 0 bit 59: reserved for GC * Number of input stack elements * [ ]* * Number of output stack elements * [ ]* ( TODO: Think about representing this directly as array pointers ) ( TODO: Think about moving the counts to the beginning ) = Musings About The Optimizer = Main problem: How do we find out if the names mean what we believe they do? == Runtime Checking == * Optimize for the _current_ values of names * Check whether changes have occured * maybe some kind of version numbering in the scopes + semantically identical to fully dynamic scopes - slow == Dynamic / Static / Constant Scopes, Guessing == * A scope can be static (i.e. names don't change activation mode and no new names) * A scope can be constant (i.e. names don't change values and it is static) * The compiler guesses what is correct * The user can override * Conflicting runtime-changes raise errors + should be doable as long as no joker redefines =, defv, deff + should result in fast code for most methods - surprising runtime errors == Dynamic / Static / Constant Scopes, User Specified == * A scope can be static (i.e. names don't change activation mode and no new names) * A scope can be constant (i.e. names don't change values and it is static) * Default scopes are dynamic, but the user can specify otherwise - many methods would not get optimized because the user forgot to flag the scope == Dynamic / Static / Constant Names == * A name can be static (i.e. it is never overwritten after being resolved in a child scope and remains at the same scope index) * A name can be constant (i.e. its value never changes after being resolved in a child scope) * An optimized version of the code can be generated whenever it is called (using current name values) + fine grained control, sensible defaults possible - even more assignment semantics === Name Properties === * [d] deeply constant: objects referenced (indirectly) from this name can be assumed constant (constant implied) * [c] constant: the object reference of this name can be assumed constant (code constant implied) * [t] code / type constant: the function type can be assumed constant (including whether this function has no captured scope) * [s] static: the path from local scope to where the name resolves can be assumed constant the name index within the scope can be assumed constant the activation level can be assumed constant * [] dynamic: nothing can be assumed * [v] inactive: push value * [f] active: execute unless quoted * [q] quote-active: execute on first resolution (execution is instant) +---------------------------------------------------+ | | | s | t | st | c | d | +---+------+-------+-------+--------+-------+-------+ | v | defv | defvs | defvt | defvst | defvc | defvd | | f | deff | deffs | defft | deffst | deffc | deffd | | q | defq | | | | | | +---+------+-------+-------+--------+-------+-------+ * defaults: |defvs "==" deffd |deffs "=*" deffd * it is possible to globally turn checking (after dynamic resolution) on == Sticky Resolution == * Active names are considered constant after being resolved for the first time * Passive names are considered static after being resolved for the first time - fails to correctly handle the { =*array ... } use case = Musings about Types = [ /foo /bar ] [ 2 3 ] { "Key %s -> %d" format sys .out .writeall } [ "" "" ] [ ] typed * [ /a /b /c ] _ { cat } * # -> [ /aa /bb /cc ] [ /a /b /c ] _ { cat } [ "" "" ] [ "" ] typed * # -> [ /aa /bb /cc ] [ /a /b /c ] _ [ 0 ] [ "" ] typed cat # -> [ [ /aa /ba /ca ] [ /ab /bb /cb ] [ /ac /bc /cc ] ] [ /a /b /c ] _ |cat cross Types are represented by concrete values of that type. Co-iterability is decided by type-match and * integers: co-iterated type is bitwise and, if zero, no co-iteration * strings: co-iterated type is longest common substring, if epsilon, no co-iteration == Musings about function composition operator == Maybe ; should act differently when getting a string, or use another single char for category operations Uses: Categorical operators over Set Anyway: This should be handled by a library, use escape mechanism for ; == Musings about the API == sys .file sys .net sys .net .tcpip "127.0.0.1" 80 connect "www.google.de" 80 connect sys .poll create -> p file userdata p .add file p .remove timeout p .poll sys .linux str ... "format" sprintf string "format" sscanf end string prefix { 0 end string infix } start string postfix { start string len infix } start end string infix { start end range string * } str .encode string from to encode str .utf8 bin ... "format" bprintf string "format" bscanf math [ 1 2 3 ] 16 unbase -> 1+2*16+3*16*16 12345 10 base -> [ 5 4 3 2 1 ]