Text handling

Literal strings

Literal strings are written by enclosing the content in double quotes:

    console "abc" eol

Special characters are written as bracket sequences:

    console "abc[lf]def[rb]" eol

Basic operations on strings

The data type for storing a text string is 'Str'.

    var Str s := "abc"

Concatenation uses the plus sign:

    var Str s2 := s+"def"

A substring is obtained by providing the index of the first selected character (the first character of the string has index 0, not 1) followed by the number of selected characters.

    var Str s := "abcde"

Characters

A character can be constructed from its ASCII number with the 'character' instruction:

    var Char c := character 65

A character can be extracted from a string, or changed, by providing its index. The index must be non-negative and smaller than the string length:

    var Str s := "abcde"

A literal string with only one character can also be used as a character:

    var Char c := "A" # ok

The ASCII number of a character can be obtained with the 'number' method:

    var Char c := "A"

UTF8 encoding issues

Recent releases of Pliant assume that a 'Str' value is UTF8 encoded. This has nasty consequences that you have to keep in mind. Let's start with a sample:

    var Str s := "rêve"

The result is 5, not 4, because 'ê' takes two bytes in UTF8. Let's continue with encoding related issues:

    module "/pliant/language/type/text/str32.pli" # we need this because the 'Str32' data type is not part of the default Pliant dialect: we need an extension

This time the result is 4. Of course, 'character32' and 'number' are also available for 'Str32' strings:

    module "/pliant/language/type/text/str32.pli"

As a summary, Pliant uses UTF8 encoding for the default string data type, because it is more memory efficient than using 16 or 32 bits per character. The drawback is that one character does not always match one position. In many situations this is not really a problem, because UTF8 is a smart encoding: you can, for example, search for substrings, and it will always work.
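To make the difference between byte positions and character positions concrete, here is a minimal sketch written in Python rather than Pliant. This is an assumption made purely for illustration: none of the calls below are Pliant functions, and the variable names are hypothetical. It only shows the general UTF8 behaviour described above: the string "rêve" holds 4 characters but 5 bytes, and a substring search on the encoded bytes succeeds but reports a byte index.

```python
# Illustration in Python (NOT Pliant): characters versus bytes in UTF-8.
text = "r\u00eave"                 # "rêve", with 'ê' as a single code point
encoded = text.encode("utf-8")     # the UTF-8 byte sequence

char_count = len(text)             # number of characters (code points)
byte_count = len(encoded)          # number of bytes in the UTF-8 encoding

# Searching the encoded bytes still finds the substring, but the
# position returned is a byte index, not a character index.
char_index = text.find("ve")
byte_index = encoded.find("ve".encode("utf-8"))

print(char_count, byte_count)      # 4 5
print(char_index, byte_index)      # 2 3
```

The two indexes differ by one because 'ê' occupies two bytes, so every position after it is shifted when counting in bytes.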
A 'Str8' data type also exists; it encodes each character on exactly 8 bits:

    module "/pliant/language/type/text/str8.pli"

Please notice that there is currently no 'Char8' data type and no 'character8' function, because they are in fact 'Char' and 'character'.

Searching for a substring

The first parameter of the 'search' method is the substring to search for. It must not be the empty string. The second is the value to return if the substring is not found. Once more, the result is not the index of the character in the string where the substring has been found, but the index of the byte in the UTF8 encoding.

    var Str s := "abcde"

Parsing

The 'parse' method is a very powerful way to parse a string. It scans the string from left to right and tries to match the provided arguments one after the other. The result is true if all arguments have been found and the end of the string has been reached.

    var Str s := "abc12def"

In this first sample, we have seen two kinds of elements: literal strings and a variable. A literal string must be matched literally, or parse will fail. A variable must find characters that provide a valid value for the variable data type, and the variable will be set to that value, or parse will fail.

The 'any' keyword matches any sequence of characters. If it has a variable as an argument, the matched substring is returned in the variable:

    var Str s := "abcdef"

A matching sequence of characters for a string variable is just like a Pliant literal string: it starts with a double quote, ends with a double quote, and in the middle, the bracket character introduces a special character:

    var Str s := "ab[dq]cd[dq]ef"

The 'pattern' keyword forces its argument to be handled as a literal string to be matched instead of a variable to be filled.

    var Str s := "abcdef"

'lpattern' searches for the last instance, whereas 'pattern' searches for the first one.
    var Str s := "a/b/c/d"

The 'word' keyword checks that what is matched is a whole word, that is, the previous and next characters are not letters.

    var Str s := "abcdef"

Case can be ignored by using 'acpattern' or 'acword' instead of 'pattern' or 'word'. 'ac' means 'any case'.

    var Str s := "ab cd ef"

The underscore keyword matches any number of spaces.

    var Str s := "ab cd ef"

All spaces between matched elements are automatically dropped. If you do not want spaces to be dropped automatically, use the exact parsing variant instead of 'parse':

    var Str s := "abc12def"

Options

Options are a way to store a whole dictionary (a set of keyword -> value pairs) in a single variable. Let's take an example:

    var Str s := "id [dq]r1[dq] name [dq]Dupont[dq] count 3 mini 2 maxi 10 country [dq]Spain[dq]"

We can test whether a keyword is defined:

    if (s option "name")

We can also pick the value associated with the keyword:

    console "name is " (s option "name" Str) eol

A default value can be provided; it is returned if the requested keyword is not found or if the corresponding value does not match the requested data type:

    console "name is " (s option "name" Str "nobody") eol

We can also query the position of the keyword in the string:

    console "name keyword found at position " (s option_position "name" -1) eol

The second parameter of 'option_position' is the value to return if the keyword is not found in the string. By using both 'option_position' and 'parse', we can also find two values following a keyword:

    var Str s := "id [dq]r1[dq] name [dq]Dupont[dq] count 3 range 2 10 country [dq]Spain[dq]"

Lastly, let's assume that we want to pass the same keyword several times. Several of the methods we have seen previously accept an extra parameter, just after the keyword parameter, that specifies the index of the keyword instance we want to consider. When this parameter is omitted, it is the same as setting it to zero.
Here is a sample:

    var Str s := "value 2 value 5 value 10"

Other string related functions

No explanation needed, is there?

    console (repeat 5 "abc") eol
    console upper:"Hello" eol