Language

Pliant syntax

Orthogonal hard to read syntax

Let's start from the most orthogonal encoding:

('function' 'fact' 'x' '->' 'y' ('{}' ('arg' 'Int' 'x' 'y') (':=' 'y' ('shunt' ('=' 'x' 0) 1 ('*' 'x' ('fact' ('-' 'x' 1)))))))

This is a LISP-like encoding of the factorial function definition (LISP being the ancestor of functional programming languages): very orthogonal, but not easy to read.
Each identifier is delimited by single quotes.

Identifiers

First of all, we can remove many of the quotes surrounding identifiers because a set of letters is assumed to be an identifier:

(function fact x '->' y ('{}' (arg Int x y) (':=' y (shunt ('=' x 0) 1 ('*' x (fact ('-' x 1)))))))

Operators

Then, some signs (such as minus or multiply) or sets of signs (such as := or ->) are also assumed to be identifiers, and moreover they are folded automatically according to folding priorities.
As an example:

a*b+c*d

will be automatically folded as:

('+' ('*' a b) ('*' c d))

So, back to our factorial function definition, we can rewrite it as:

(function fact x -> y { (arg Int x y) (y := shunt x=0 1 x*(fact x-1)) } )
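
To make the folding by priorities concrete, here is a minimal sketch in Python (not Pliant's actual parser) that folds a flat list of tokens into a prefix tree. The BINDING table and the fold function are illustrative assumptions: the numbers are made up and are not Pliant's real priorities, and only binary operators between single operands are handled.

```python
# Hypothetical binding strengths (NOT Pliant's real priority values):
# a larger number binds tighter.
BINDING = {"+": 1, "-": 1, "*": 2}


def fold(tokens):
    """Fold a flat list like [a, *, b, +, c] into nested prefix tuples."""
    pos = 0

    def parse(min_binding):
        nonlocal pos
        left = tokens[pos]  # an operand
        pos += 1
        # Keep absorbing operators that bind at least as tightly as required.
        while pos < len(tokens) and BINDING[tokens[pos]] >= min_binding:
            op = tokens[pos]
            pos += 1
            # Requiring a strictly tighter binding on the right makes
            # operators of equal strength left associative.
            right = parse(BINDING[op] + 1)
            left = (op, left, right)
        return left

    return parse(0)


print(fold(["a", "*", "b", "+", "c", "*", "d"]))
# prints ('+', ('*', 'a', 'b'), ('*', 'c', 'd'))
```

The printed tuple has the same shape as the folded form ('+' ('*' a b) ('*' c d)) shown above.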

Indentation

Lastly, the Pliant parser also uses indentation as a folding mechanism. For example:

a b c
  d e f
    g h
  i j

is automatically folded as:

(a b c { (d e f { (g h) } ) (i j) } )

The rules are: a parenthesis is opened at the beginning of each line, and is closed either at the end of the line or, if the next line is indented further than the current one, at the end of the indented block that starts on the next line.
When the next line is indented further than the current one, a brace is also automatically opened at the end of the current line, and closed at the end of the indented block.
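
The two rules above can be modelled in a few lines of Python. This is only a sketch of the rules as stated, not the Pliant parser itself; the function name fold_indentation is an assumption, and the input is assumed to be indented consistently with spaces only.

```python
def fold_indentation(source):
    """Return the explicit parenthesis/brace form of an indented text."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    indents = [len(ln) - len(ln.lstrip(" ")) for ln in lines]
    out = []
    open_blocks = []  # indentation levels of the blocks currently open

    for i, line in enumerate(lines):
        # Rule 1: a parenthesis is opened at the beginning of each line.
        out.append("(" + line.strip())
        next_indent = indents[i + 1] if i + 1 < len(lines) else 0
        if next_indent > indents[i]:
            # Rule 2: the next line is indented further, so a brace is
            # opened and the parenthesis stays open until the block ends.
            out.append(" { ")
            open_blocks.append(next_indent)
        else:
            out.append(")")
            # Close every block (brace, then the pending parenthesis)
            # that ends before the next line.
            while open_blocks and open_blocks[-1] > next_indent:
                open_blocks.pop()
                out.append(" } )")
            if i + 1 < len(lines):
                out.append(" ")
    return "".join(out)


print(fold_indentation("a b c\n  d e f\n    g h\n  i j"))
# prints (a b c { (d e f { (g h) } ) (i j) } )
```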

So, we can finally rewrite our factorial function as:

function fact x -> y
  arg Int x y
  y := shunt x=0 1 x*(fact x-1)

Colon

There is one last way to remove parentheses in Pliant: the colon.

a:b:c:d

is the same as

(((a b) c) d)

If you know other programming languages, you will notice that the Pliant colon is a purely syntactical convention, whereas most languages use a dot instead of a colon and restrict it to fetching a subpart of an object or calling a method.
So, in Pliant, I can write:

var Str s
console s:len eol

just like we would write s.length() in C++,
but I can also write:

console fact:5 eol

as an equivalent of:

console (fact 5) eol
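
Since the colon is purely syntactical, its effect can be sketched as a simple left fold. The Python below is an illustration of that rule only (fold_colons and show are hypothetical helper names, not part of Pliant), and it assumes the expression contains nothing but names separated by colons.

```python
def fold_colons(expr):
    """Fold 'a:b:c' from the left into nested pairs: ((a b) c)."""
    parts = expr.split(":")
    tree = parts[0]
    for part in parts[1:]:
        tree = (tree, part)
    return tree


def show(tree):
    """Render nested tuples with the parenthesized notation of this article."""
    if isinstance(tree, tuple):
        return "(" + " ".join(show(t) for t in tree) + ")"
    return tree


print(show(fold_colons("a:b:c:d")))  # prints (((a b) c) d)
print(show(fold_colons("s:len")))    # prints (s len)
print(show(fold_colons("fact:5")))   # prints (fact 5)
```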

Semicolon

a b c d ; e f g

is the same as

(a b c d) (e f g)

Orthogonal notations

The two following programs are equivalent:

var Int i
for i 1 5
  console "hello" eol

and

for (var Int i) 1 5
  console "hello" eol

Basically, the 'var' instruction declares one or more variables by specifying their data type, and 'returns' the last one. The same feature is used when we write:

var Int i := 3

which is the same as:

var Int i ; i := 3

or:

var Int i
i := 3

If you are used to C programming, you will probably also be unfamiliar with the following notation:

if { var Int j := shunt i>0 i%8 i%9 ; j=3 or j=5 }
  console "bingo" eol

The semantics are that a brace 'returns' the result of the last instruction it contains, so the program is equivalent to:

var Int j := shunt i>0 i%8 i%9
if j=3 or j=5
  console "bingo" eol

Comments

Comments start with a '#' sign and end at the end of the line:

var Int i # here is a comment

'document' is a deprecated control that accepts any block as a parameter and is simply ignored by the Pliant language. It is used to embed XML-like information (rich text) in the middle of plain ASCII program files:

document
  title
    text "My new program"
  para
    text "With embedded documentation"

The 'document' part is interpreted by the Pliant source browser /pliant/site/source.ui, and can be edited from the Pliant text editor by clicking on the 'Wysiwyg' button while the cursor is on the line with the 'document' keyword.

Discussion

New Pliant users are often disappointed that Pliant syntax is so different from C syntax, with two main criticisms:

   •   using indentation rather than {}
   •   using (f x) instead of f(x)

About using indentation for blocks: the reason is that indentation is what the human reader really uses to understand the program. So, as soon as we introduce {}, the computer relies on one concept and the human on another, with nasty bugs when they don't match.
So, the Pliant approach is just to make the {} implicit, and implicit is all it is, because the result of parsing is still a '{}' node (see the folded form in the Indentation section).

About using f x rather than f(x): first, it makes it possible to write simple lines with no parentheses at all:

y := fun x

and simple programs that look like this:

instr1 arg arg arg
instr2 arg
instr3 arg arg
...

and this is important to make it easy for beginners and non-professional programmers. It has been part of the Pliant philosophy to try to provide a smooth (continuous) learning curve from beginner to advanced programmer, and this has been favored at the expense of using what is currently perceived as mainstream.

Then, the selected notation provides the very important property that accessing a field is written the same way as calling a method with no argument:

var Str s
console (s len) eol

or

console s:len eol

in both cases, 'len' can be either a field or a method.

In other words, with the (f x) notation, parentheses are no longer related to computing the function result but are just a way to fold the expression tree, and that is good because in the computer field, as opposed to math, there is just no reason to treat functions as something special.
Moreover, if you look at:

var Str s

then, with the f(x) notation convention, either 'var' is something special at syntax level, or it should be written as:

var(Str,s)

Neither would be satisfying, and the (f x) notation is the way to avoid these issues.

Extending the parser

- the remaining part of this article is intended for advanced programmers -

The Pliant parser is made of two kinds of entities:

   •   token filters, which are responsible for recognizing tokens in the text source file
   •   operators, which are responsible for folding the list of tokens into a tree

The following section does not explain how the Pliant parser works, but just how to add tiny extensions to it. If you want to learn how the core parser works, read the Pliant compiler machinery article.

token filter example

- This sample assumes you are running Pliant release 106 or later, since I added the 'token_filter' helper function in release 106 to make this higher level -

Here is a sample token filter from /pliant/language/type/number/int.pli
It is responsible for recognizing base 2 encoded integer numbers, such as '10011000b'

function parse_bin context line parameter
  arg_rw ParserContext context ; arg Str line ; arg Address parameter
  var uInt value := 0
  for (var Int i) 0 line:len-1
    var Char c := line i
    if c="0" or c="1"
      var uInt value2 := value .*. (cast 2 uInt) .+. (cast c:number-"0":0:number uInt)
      if value2\(cast 2 uInt)<>value
        return
      value := value2
    eif c="b" and i<>0
      if i+1<line:len and (line i+1):isidentcharacter
        return
      var Link:uInt t :> new uInt
      t := value
      context add_token addressof:t
      context forward i+1
      return
    else
      return

token_filter parse_bin "pliant parser basic types"

The 'parse_bin' token filter we define checks the beginning of the 'line' parameter to see whether it is what the filter is expected to recognize. If so, it uses 'add_token' to provide the new token it constructed, then 'forward' to move the parsing cursor forward.
If what it finds at the beginning of the line is not what it expects, it just returns without calling 'add_token' and 'forward'.
Then the filter is registered by calling the 'token_filter' meta. The second parameter is the class of the filter.
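
For readers who do not read Pliant fluently yet, the same logic can be restated in Python. This is a model of the control flow only, under several assumptions that are mine and not taken from the Pliant sources: uInt is taken to be 32 bits wide, '.*.' and '.+.' are taken to wrap around on overflow (which the overflow test in the original suggests), and is_ident_char is a guess at what 'isidentcharacter' accepts. Instead of calling 'add_token' and 'forward', the model returns the value and the number of characters consumed, or None when the filter does not apply.

```python
UINT_BITS = 32  # assumption: width of uInt on the target machine


def is_ident_char(c):
    # Assumption: letters, digits and underscore count as identifier characters.
    return c.isalnum() or c == "_"


def parse_bin(line):
    """Return (value, consumed) if line starts with a binary literal, else None."""
    value = 0
    for i, c in enumerate(line):
        if c in "01":
            # Shift in one more bit using wrapping arithmetic.
            value2 = (value * 2 + int(c)) % (1 << UINT_BITS)
            if value2 // 2 != value:
                return None  # overflow: the literal does not fit
            value = value2
        elif c == "b" and i != 0:
            # Reject things like '101bc', which is not a binary number.
            if i + 1 < len(line) and is_ident_char(line[i + 1]):
                return None
            return value, i + 1
        else:
            return None
    return None  # ran out of characters before finding the 'b' suffix


print(parse_bin("10011000b"))  # prints (152, 9)
print(parse_bin("101bc"))      # prints None
```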

Suppose we have a token filter that recognizes the '>' token, another that recognizes '=' and a third one that recognizes '>='. How do we guarantee that '>=' will be recognized as a single '>=' token instead of two tokens, '>' and '='?
Through a notion of class in token filters, with filters of a higher class applied first. The predefined classes are:

   •   pliant parser housekeeping
   •   pliant parser priority user operators
   •   pliant parser basic signs
   •   pliant parser several chars operators
   •   pliant parser 2 chars operators
   •   pliant parser 1 char operators
   •   pliant parser basic types

'>=' is assigned class 'pliant parser 2 chars operators' whereas '>' is assigned class 'pliant parser 1 char operators' so that '>=' will be tried first.

Please notice that 'token_filter' is a high level meta. At the low level, it adds a definition to the Pliant dictionary and exports it. As a result, a given token filter will be tried if and only if the module it has been defined in has been included by the current module. This makes it possible to provide syntactical extensions that do not disturb code that does not explicitly want to use them.
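
The effect of filter classes can be sketched with a toy tokenizer in Python. Everything here is a model, not Pliant's implementation: token_filter, literal and next_token are hypothetical Python functions, and I am assuming that the predefined classes are listed above from first tried to last tried, which is consistent with '>=' being tried before '>'.

```python
# Class names from the list above; assumption: listed from first tried to last.
CLASSES = [
    "pliant parser housekeeping",
    "pliant parser priority user operators",
    "pliant parser basic signs",
    "pliant parser several chars operators",
    "pliant parser 2 chars operators",
    "pliant parser 1 char operators",
    "pliant parser basic types",
]

FILTERS = []  # registered (class_name, filter_function) pairs


def token_filter(func, class_name):
    """Register a filter under one of the predefined classes."""
    FILTERS.append((class_name, func))


def literal(text):
    """Build a filter that recognizes one fixed piece of text."""
    def match(line):
        return (text, len(text)) if line.startswith(text) else None
    return match


def next_token(line):
    """Try the filters class by class; the first one that matches wins."""
    ordered = sorted(FILTERS, key=lambda entry: CLASSES.index(entry[0]))
    for _class_name, func in ordered:
        result = func(line)
        if result is not None:
            return result
    return None


# Registration order does not matter: the class decides who is tried first.
token_filter(literal(">"), "pliant parser 1 char operators")
token_filter(literal("="), "pliant parser 1 char operators")
token_filter(literal(">="), "pliant parser 2 chars operators")

print(next_token(">= b"))  # prints ('>=', 2)
print(next_token("> b"))   # prints ('>', 1)
```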

operator example

This is much simpler:

operator '.xor.' 656 1 1

656 is the priority of the operator; the next two parameters specify how many arguments it expects in front, and how many after.
So, this first sample will turn:

a b c .xor. d e f

to:

a b ('.xor.' c d) e f

The second sample provides an operator that will fold any number of arguments on each side:

operator '?' 518h 1000000000 1000000000

so, this second sample will turn:

a b c ? d e f

to:

('?' (a b c) (d e f))
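
Both samples follow the same mechanism, which can be modelled in Python as below. This is a sketch of the folding of one operator in isolation (priorities are ignored), and fold_operator is a hypothetical name. The choice to leave a single argument bare and to wrap several arguments in their own list is inferred from the two examples above, not from the Pliant sources.

```python
def fold_operator(tokens, name, before, after):
    """Fold the first occurrence of operator 'name' in a flat token list.

    'before' and 'after' are the maximum numbers of arguments taken in
    front of the operator and after it.
    """
    i = tokens.index(name)
    start = max(0, i - before)
    left = tokens[start:i]
    right = tokens[i + 1:i + 1 + after]

    def group(args):
        # One argument stays bare; several are wrapped together.
        return args[0] if len(args) == 1 else tuple(args)

    node = (name, group(left), group(right))
    return tokens[:start] + [node] + tokens[i + 1 + after:]


line = ["a", "b", "c", ".xor.", "d", "e", "f"]
print(fold_operator(line, ".xor.", 1, 1))
# prints ['a', 'b', ('.xor.', 'c', 'd'), 'e', 'f']

line = ["a", "b", "c", "?", "d", "e", "f"]
print(fold_operator(line, "?", 1000000000, 1000000000))
# prints [('?', ('a', 'b', 'c'), ('d', 'e', 'f'))]
```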

If you want to know the priority of default operators, just search for lines that look like this in /pliant/language/startup/startup.c :

C_operator_t1("pliant parser 1 char operators","*",false,"*",0x320,1,null,1,null);

this sample C line is declaring "*" operator, with folding priority 320h.

exceptions to the indentation rules

When you write:

if a
  b c d
else
  e f

the result is not

('{}' ('if' a ('{}' b c d)) ('else' ('{}' e f)))

but rather

('if' a ('{}' b c d) 'else' ('{}' e f))

It means that 'else' has not been handled as a new instruction, but rather as an extra set of parameters of the 'if' instruction.

This is necessary each time an instruction is expected to receive several block parameters, and it is declared through the 'dual_keyword' instruction:

dual_keyword if 1 100 else 1 1

You can read it as: when an 'else' instruction has exactly one argument, and follows an 'if' instruction that has between 1 and 100 arguments, then the 'else' instruction will be added at the end of the 'if' instruction.
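
The reading above can be modelled in Python as a pass over the instructions of a block. This is a sketch of the rule as described, not of how Pliant implements it; apply_dual_keyword is a hypothetical name, and each instruction is represented as a list whose first element is the keyword and whose remaining elements are its arguments.

```python
def apply_dual_keyword(block, first, first_min, first_max,
                       second, second_min, second_max):
    """Merge each 'second' instruction into a directly preceding 'first' one."""
    result = []
    for instr in block:
        prev = result[-1] if result else None
        if (prev is not None
                and instr[0] == second
                and second_min <= len(instr) - 1 <= second_max
                and prev[0] == first
                and first_min <= len(prev) - 1 <= first_max):
            # The keyword and its arguments become extra arguments
            # at the end of the previous instruction.
            result[-1] = prev + instr
        else:
            result.append(instr)
    return result


block = [
    ["if", "a", ("{}", "b", "c", "d")],
    ["else", ("{}", "e", "f")],
]
print(apply_dual_keyword(block, "if", 1, 100, "else", 1, 1))
# prints [['if', 'a', ('{}', 'b', 'c', 'd'), 'else', ('{}', 'e', 'f')]]
```

An 'else' that does not directly follow an 'if' is left alone by this model, so it stays a separate instruction.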