Non syntactical meta programmingThe outstanding part of Pliant language is that it does not answer the classical question: In the end, the meta programming introduced below, not to be mistaken with syntactical meta programming introduced long ago by LISP language, is nothing less than a generalization of most good language features such as functional or object programming. This article explains how Pliant meta programming works, from a high level point of view, and the consequences on software development, then illustrates presented concepts through an explained working sample. Pliant 4 stages representation of the programWhile compiling, a Pliant program exists under four different representations: Stage 1: The program as a text fileHere is a sample: function fact x -> y Stage 2: The program as a treeHere is a sample (this sample is oversimplified and does not match the sample provided at stage 1; see demonstration link bellow to get real examples): Each node contains either an identifier, or a constant value. Stage 3: The program as a list of instructionsIt is bit like a C program, with no nested functions, and gotos instead of high level controls. Here is a sample: 1: copy 1 y For C fluent users, the above notation has the same meaning as this unusually written but (probably) valid C code: int x,y,c1; In the list of instructions, '0' '1' 'x' 'y' 'c1' are what I call arguments. They are typed. Stage 4: The program is encoded as a native processor instructions setThe program is just a set of bytes encoding native processor instructions. TransitionsParserThe transition from stage 1 to stage 2 is named "the parser". CompilerThe transition from stage 2 to stage 3 is named "compilin"g, or "meta compiling". Code generatorTransition from stage 3 to stage 4 is named code generation or code optimization. Understanding the Pliant language central conceptThe last part of the Pliant compiler machinery is the global dictionary. It is just a dictionary associating definitions to identifiers. One identifier can have several definitions, each definition is associated to a module, and each program only sees definitions associated with modules it imported using 'module' instruction. Now that I have presented all key pieces of the Pliant machinery, let me explain the stage 2 to stage 3 transition (compiler) step by step. As a summary, successfully compiling a sub expression in the stage 2 tree exactly means attaching a stage 3 set of instructions and a result argument. Be sure to properly understand this because it is the central concept of the Pliant language. An interactive tool is provided that enables you to view a program at various stages in order to help understand how Pliant compiler machinery works. In the demonstration tool, when you click on a stage 2 node, the stage 3 section bellow will be updated to display the instructions list and result argument attached to this node as a result of compile transition. An extra very interesting information is provided under 'Successfully compiled by' label: it is a link in the online source code browser to the meta or function that did the attachment. Still in the demonstration tool, the 'Stage 3 to 4 transition' final section is intended to help understand how the Pliant code generation transition is performed. Each of the rewriting filter is listed in small letters, and when, if it changed something in the instructions list, it is followed by the instructions list with old instructions displayed in red and new ones in green. Some extra informations reported directly by the filter are displayed in light brown. SamplesHere is a small sample to help you better understand the tree first represtentations of the program, and the stage 2 to stage 3 transition which is the core of Pliant compiler machinery. Stage 1 : the source codevar Int i := 3 Stage 2 : the treeStage 3 : a list of intructions and a result argument has been attached to each nodePlease notice that this drawing is ignoring Pliant constants evaluation mechanism. I mean, at the '+' node, the real Pliant compiler would notice that '1' is constant, '2' is a constant, '+' is a function with no side effect, so it would immediately execute 1+2 and attach an empty instructions list and constant '3' as the result of the node, instead of attaching instruction '+ 1 2 t1' (which you can read as t1 := 1+2) as displayed on this drawing. More samplesHere is a link to a small interractive graphic tool that might also help you better understand the stage2 to stage 3, then stage 3 to stage 4 stransition. ↣ Meta programmation illustration When executing online, you can only use the tool only on a small set of predefined programs (use select box at the top). When executing Pliant locally, you can apply the tool to any small program you type in. Design choices discussionIn very few words, the core of Pliant design is:
Let's study how necessary each of the stages is, and how much design flexibility is available when defining it. Stage 1: The program as a text fileSome programming systems exist that start directly at stage 2 tree because the program is edited using a graphic schema editor. The advantage of directly editing a graphic schema is that the programmer can be assisted easierly though a set of contextual menus, and the representation is cleaner. The advantage of plain text is that it's a lot more compact. About flexibility, the question is: should the source file be a standard plain text file just like 40 years ago, or should it have styling enhancements such as HTML or the old RTF (rich text format) ? My general conclusion about stage 1 is that it's necessary because it is the best encoding for the programmer to read the program, and there is no significant design choice to make here. Stage 2: The program as a treeIt is the encoding that is convenient for implementing language features because it's the easiest to travel using a program. So, it exists in nearly any language as the output of the parser. In many languages such as C or Java, the application cannot access it, as opposed to LISP where it can. About flexibility, even more than for stage 1, the lack of innovation comes from the lack of needs. When you have a tree where each node can be either an identifier or a constant, then all possible features fit nicely on the model, so why make it more complex ? Stage 4: The program encoded as a native processor instructions setThis stage also exists and is 100% constrained because the hardware says so, so there is not much to discuss here. (1) Stage 3: The program as a list of instructionsLet's start by the easy part about stage 3 encoding. What is it useful for ? First of all, it is an encoding that is very convenient for algorithms that extract properties about the program. The reason is that the structure is simple to travel (a list of instructions) and the semantic of each instruction is also simple, as opposed to stage 2 where the tree is simple to travel but the semantic is mostly undefined. Extracting properties about the program can have three different usage:
I'm now ready to express the second important notion required to understand the Pliant language design layout. Stage 2 (program as a tree) is the semantic view, whereas stage 3 (program as an instructions list) is the procedural (execution) view. What it means is that when you have the program as a tree, all freedom is possible through defining new meta functions. You can change anything, so you can get maximum expressivity, but the drawback is that you can just say nothing about the program because you cannot be aware of all the meta others have or might define. On the other hand, the semantic of the stage 3 instructions list is completely fixed because it is just a high level abstraction of the real hardware, so that it will not be disturbed by nice extensions other guys you are not aware of might have added through meta programming. Now, we get the central point justifying the overall Pliant design. At time you write the code that defines a meta function for 'C' identifier, the two important properties you expect are:
This is achieved in Pliant through not defining a meta function as a function that turns a stage 2 tree to another stage 2 tree just as in LISP, but as function that turns a stage 2 tree as a stage 3 instructions list. In the end, through saying that all transitions (parser between stage 1 and stage 2, meta programming between stage 2 and stage 3, code generator between stage 3 and stage 4) can be customized, Pliant brings close to unlimited flexibility at expressivity level. Then, avoiding nasty side effects between various extensions, and keeping tight control on final execution (predictable and mostly optimal execution speed) is provided through introducing stage 3 that has well defined fixed semantic, and stating that any semantic extension is a custom transition from stage 2 tree to stage 3 instructions list. As a final summary of Pliant language core design, through introducing stage 3 instructions list that have fixed semantic, and defining meta programming as a transition from stage 2 expressions tree to stage 3 instructions list, Pliant enables meta programming to provide mostly unrestricted semantic extensions as usual, but also two more important properties:
I hope to have convinced you about the necessity for stage 3 instructions list, so that you would now be able to understand why, in languages history, I position Pliant as the connection point between LISP and C: What I have not covered yet in this discussion is: what design flexibility do we have for stage 3 definition ? Once again, the design choice in Pliant is minimality. Also I don't claim proved minimality. I just tried to design it so, and what is important in the end is not that it would be absolutely minimal or not, but that it's simple and it brought the two just explained extra properties (efficiency and side effects control). Just for information, the only feature I see that could be removed in Pliant stage 3 instructions list is the offset in indirect arguments. In Pliant, an indirect argument is a data the address of is provided by another argument value, plus an offset to apply on the address argument value. This 'offset' extra field can always be avoided through computing the effective address in a temporary variable. I mean: 1: foo [a+3] is the same as: 1: + a 3 t1 The choice here has been to say that when we have a variable containing the address of a record, we want to be able to access the fields without creating temporary address variables. Anyway, since I'm entering probably too specific issues for an introduction about Pliant meta programming, I'll now rather switch to practice than try to be exhaustive at design level. Please just notice before that the subtle casting machinery details are also just left on the side of this introduction article. All of it is hidden under the high level 'cast' method that we will see in the practical section. Using meta programmingLet's study meta programming through a simple sample: the 'max' function. First, in C, we would define it as a macro: #define max(x,y) ((x)>(y) ? (x) : (y)) It has two limits:
Here is a well known side effect example. If we write: int x,y; we get x being sometime incremented twice. Here is a possible Pliant implementation: module "/pliant/language/compiler.pli" Let's now explain this listing step by step. A Pliant meta function is a function that will receive an expressions tree as a parameter, and attach an instructions list and a result argument to it. Defining a meta function just require using 'meta' keyword: meta max e 'e' is used to provide the name of the variable that will be used to access the expressions tree. Now the general explanation of the listing is: first we test that we have at least three arguments (P1), because 'max' with two arguments is already defined and we don't want to conflict with it. Then in the 'find' part, we try to find the type 't' of the result. We will try the type of each of the arguments one after the other (P2), and check if other can cast to it (P3). In the example at the end of the listing, the first argument is 3 and has type uInt, the second is 6.2 and has type Float and the third is -8 and has type Int, so that the final type can only be Float. Here are more detailed explanations about various meta programming features involved: Scanning the expressions treeThe number of sons of a node is obtained through calling 'size' method: if e:size<3 Accessing one son is obtained through calling the empty method with the index of the son as a parameter, with as usual in computing field the first son having index 0: var Pointer:Expression ei :> e 0 The address of value associated with a node is provided by 'value' field: var Address a := ei value Please note that if the node is an identifier, it will have type 'Ident': if (entry_type ei:value)=Ident Scanning definitions in the global dictionaryThe following sample scans all definitions associated with 'compare' keyword, that are seen by the current module, and that are functions: var Pointer:Arrow c :> e:module first "compare" Scanning all definitions without applying current module visibility would be: var Pointer:Arrow c :> pliant_general_dictionary "compare" So, in the sample meta functions, the not easy to read following sequence: var Pointer:Arrow c :> e:module first "compare" does a bit the same as: var Link:Function compare :> the_function compare t t -> Int with the significant difference that it works whereas the simpler version does not because 'the_function' meta expects constant arguments whereas 't' is not a constant. Compiling argumentsAssuming that 'ei' is a pointer to an (sub)expression, we can try to compile if through calling 'compile' method: ei compile What is strange about 'compile' is that it will either succeed or ... raise an error, instead of returning a success boolean. In the Pliant early days, I had troubles deciding between each function or method returning a status result or try/catch style errors handling style. 'compile' is some reminds from this hesitations, so that most of the time, you will use: ei compile ? which is just the compact version of: ei compile Anyway, most of the time, you will not use 'compile' method, but 'cast' that returns a boolean specifying if the node could be compiled and casted to the specified data type: if not (ei cast Str) Building the instructions listBefore we can use any sub expression result, we have to copy to the instructions list all instructions necessary to compute the sub expression result. This is achieved through calling 'suckup' method: e suckup e:0 Here is the high level function to create a temporary variable with type address: var Link:Argument adr :> argument local Address And the high level function to create a stage 3 instruction: var Link:Instruction instr :> instruction (the_function 'address Universal' Universal -> Address) e:0:result adr the number of arguments provided to 'instruction' meta must be consistent with the number of arguments in the function prototype, including the result. A jumping instruction (the are only three of them that are 'jump anyway', 'jump if' and 'jump if not') will get two more parameters; the first one is 'jump' keyword, and the second one is the target instruction: var Link:Argument cond :> argument local CBool Adding the instruction at the end of the instructions list of the node: e add instr Attaching the resultvar Link:Argument r The second parameter specifies the kind of access the calling function is allowed on the result. Most of the time, it is 'access_read'. Sometime it can be 'access_read+access_write' just like in 'ovar' meta. 'access_object' flag can also be added to specify that what is returned is a Pliant object so that generic methods can be called directly. When the function returns no result, just use: e set_void_result Please remind that in Pliant, attaching the result argument to the expression is the synonym of successfully compiling it. Advanced meta programmingWith LISP, we would get the ability to define 'max' accepting any number of arguments and the ability to avoid nasty side effects through creating temporary variables, so this first example was not sufficient to demonstrate why Pliant meta programming engine enables to go beyond LISP. A more advanced example is 'thread' instruction defined in /pliant/language/schedule/thread.pli function thread_must_succeed ok Let's start through explaining how to use 'thread': function test would display (even if it's a over simplistic buggy code): a = 1 What is interesting about it is that 'thread' instructions automatically detected that 'a' and 'b' local variables are used by the thread code, so that they must and will be copied to the new thread. Moreover 'share b' enables to force the 'b' variable to be passed by address rather than being copied so that modifying it in the thread body change the value in the main thread. So, the resulting code will roughly be: type TestEnv and this is basically what a C programmer would have to write. So, back to the proposed implementation of 'thread' meta, the end of it is using the same set of features as introduced with 'max' meta: var Link:Argument ok :> argument local CBool The stange part of the 'thread' meta is: var Link:List expressions :> new List Let's start through explaining 'mark' 'rewind': var Address mark := module mark 'mark / rewind' is just a way to add temporary definitions to the general dictionary. We call 'mark' to specify when temporary definitions start, then 'define' to provide each new definition, and finally 'rewind' to discard them. Now, the few remaining unexplained lines of the 'thread' meta are: var Link:List expressions :> new List From a very high level point of view, you can think at 'freeze' as a mechanism for introducing lazy evaluation. The code is probably too complicated to fully explain as an introduction to meta programming, but it is still interesting to point out the following section of it: var Pointer:Arrow c :> intern_locals first Basically, the meaning is: for each local variable, we call 'uses' method to test if the variable is used in the provided bloc. I mean, in the usage sample above, it is 'uses' that determine that 'a' and 'b' are used in the new thread code, whereas 'c' is not. method e uses a -> answer Through scanning the stage 3 instructions list instead of stage 2 expressions tree, we get the very important 'controlled extensions side effects' properties expressed in the end of the design choice discussion. Basically, 'thread' does not care about extensions used in it's body because it will scan the result as stage 3 instructions list where the semantic is fixed and simple. Please notice that expression freezing notion introduce by this simplified 'thread' implementation is intensively used by Pliant UI to implement 'button' and 'link' instructions. Lastly, for programming experts, please also notice that the meta programming features presented in this second sample with 'freeze' show you that Pliant meta programming is powerfull enough to implement a (logical) programming feature like closure. ConclusionYou found even the first 'max' sample meta function too complex ? Ok, here is a more LISP like version: meta max e It will just do syntactical rewrite, so turn: max a b c d to: max a (max b (max c d)) The goal of this article was not to list all possibilities of Pliant meta programming, but just to explain precisely, then illustrate on a not trivial sample, how Pliant engine design enables to get three interesting properties all at once, resulting in a language with much larger decent usage scope:
|