Pointers and memory allocationHandling memory addressesFrom an just a bit oversimple point of view, a computer memory is a continuous set of byte. Each byte has an index called it's address. Each data has an address that we can get through 'addressof' keyword: var Int i On a 32 bits system, an integer, 'i' in this sample is stored on 4 bytes since the integer is 32 bits long and each byte is 8 bits long. The address returned by 'addressof' is the address of the first of the four bytes containing 'i' value. An address is a bit like an integer value, but has type 'Address' instead of 'Int': var Address a This might seem strange at first. If the address is the index of a memory byte, then it's an integer value, so why do we handle it as a data with type 'Address' instead of handling it as a data with type 'Int' ? Here are some operations we can do on an address: var Address b means that 'b' will receive the address of the memory byte which is 12 bytes after the one 'a' contains the address of. 12 is 3 times 4, where 4 is the number of bytes of an integer on a 32 bits plateform. If translate is given only a type parameter but no count (3 in our sample), then count 1 is assumed. So: b := a translate Int is the same as: b := a translate Int 1 Then we can access the data at any given address: i := b map Int If the memory bytes specified by b don't contain an integer address, the result is unspecified and might just crash the program. Here is a first shortcut: b := a translate Int 3 is the same as: i := a map Int 3 Finally, in some rare situations, we might want to do real computations on some address, so we can cast it to a real integer: i := cast a Int then do some computations, and finally cast back: a := cast i Address So, when we write: b := a translate Int 3 with an absolute lack of taste, we could instead write it as: b := cast (cast a Int)+3*Int:size Address PointersIf your program has a lot to do on the interger data at address 'a', then explicitely dereferencing each time using 'map' might be boring in the end: Address a So, we introduction pointers, as a way to make dereferencing lighter (from the writing point of view only since from the computational point of view it's just the same) since implicit: var Pointer:Int p In this first example, we see that a pointer uses two set instruction, one being ':>' and the other beeing ':=' var Int i j ':>' means starting from now, 'p' points to the address of 'i'. Of course, we could get the exact same result, with ugly writing, using an address variable 'a' instead of a pointer variable 'p': var Int i j For readers used to C language, the program would be written as: int i,j; In other words, Pliant pointers do automatic dereferencing, whereas C pointers require explicit dereferencing. C notation is closer to the computational reality since dereferencing is an operation that the processor really has to do so showing it as a '*' sign can be considered a good thing; on the other hand, it makes program much less clear to read. Null addressThere is a 'null' constant address that points to memory byte with index 0, and it is used in many applications as a special value. See other programming languages for extra details. var Address a When dealing with pointers, testing for null address is done through 'exists' function: var Pointer:Int p which is equivalent to: if addressof:p<>null Double indirectionSome algorithms need to handle a pointer on a pointer. As an example, in an algorithm dealing with a list of elements, we might need to have 'l' variable beeing a pointer to the last pointer of the list so that we can modify it to add a new element in the end of the list: var Int i The first instruction 'l := i' means full dereferencation, so the integer data pointed by the pointer pointed by 'l' will receive the value of 'i'.. If we think to addresses: a := addressof i The first instruction 'a := addressof i' means 'a' address variable will now contain the address of the 'i' variable content. Memory allocationReserving some raw bytes of memory is done trough 'memory_allocate' function which is an equivalent of C language 'malloc' function: var Address a The first parameter is the number of consecutive bytes requested, and the second is an address that recommands to provide an address near the provided one. This second does not exist is C language 'malloc' and is ... not used by Pliant at the moment. Releasing the memory is done through calling 'memory_free': memory_free a Then there are the classical variants. a := memory_zallocate 100 null and changing the size of an allocated area: a := memory_resize a 200 null and, if extra bytes have to be set to zero: a := memory_zresize a 200 null We can get the exact size allocated through: var Int i We can also get global informations or configuration about memory usage through several not frequently used functions. First the amount of memory consumed by Pliant process through allocating pages (at operating system level, memory is not assigned to application as bytes, but as pages) can be obtained through: i := memory_currently_consumed then, we can now the maximum amount of memory consumed by Pliant since the start of the process: i := memory_maximum consumed We can also know the current amount of memory really used (the consumed memory is bigger than the used one because of holes resulting from various allocations and releases): i := memory_currently_used It is possible to check that the memory data structure has not been corrupted as a result of poking to a wrong place. Of course, it's just a debugging helper function which is not granted to work: memory_checkup How memory allocation is implementedThe general picture is that it's a modification of the Doug Lea well known algorithm. Then, in /pliant/language/schedule/multi_threads_locks.pli which is implementing the locking to turn Pliant to multithread safe (the C written Pliant bootstrap code is not), I added a mechanism to provide better allocation performances in multithreaded applications than using a single lock. The machinism consists of a small list of free chunks for each possible small number of bytes and is actived through setting 'alloc_steedup' constant. See 'memory_allocate_mt' and 'memory_free_mt' functions in the code for extra details. - a more step by step explaination of the memory allocation mechanism would be required, starting by the operating system glue page allocation functions, but it's out the scope of this first generation of documention - Objects and references countThis paragraph is explaining how to use references count allocation mechanism, not why and when to use it. For a general introduction to various memory allocation techniques, see external literature, and also the 'An introduction to Pliant storage' article. A Pliant object is a data with a header containing two special fields in front: the references counter and the type pointer. So, on a 32 bits plateform, an 'Int' data consumes 4 bytes, but an object with type 'Int' consomes 12 bytes (4 for the references counter, 4 for the type pointer, and 4 for the integer value). We can create an object with type 'Int' through using the low level function (please notice that it's not recommanded at application level unless you know what you do): var Address a the returned address is not address of the first byte of the object header but the address of the first byte of the data. So, we can use an object as if it was a standard data through something like: var Int i When the address is pointing a true Pliant object, you can also get access to the object using 'omap': i := a omap Int The difference between 'map' and 'omap' is just that 'omap' says to Pliant compiler that the underlying data is an object, so it can be passed to a function that requires an object as a parameter. Here is a sample: function foo i Then, an 'Arrow' is just like an 'Address' but it MUST contain the address of an object. Setting an arrow to an address which is not the one of an object will corrupt Pliant process memory layout, so probably turn it to unpredictable and finally crash. Such a mistake is nearly always catched immediately when running Pliant at debugging level 2, and it is one of the major reasons for running your software at debugging level 2 from time to time. Now, the magic feature of an arrow is that it automatically adjusts the references counter of the object it contains the address of: var Arrow r With greater details, when the address contained in a arrow is changed:
The last rule is that 'null' address is supported and will just skip the corresponding increase or decrease operations. The operations related to arrows are: var Pointer:Type t :> entry_type r What getting the pointer to the effective type might be usefull for is outside the scope of this article and will be later explained in an article about typing. Get the references counter of the object the arrow contains the address of: var Int i := entry_counter r Please notice that calling 'entry_type' or 'entry_counter' on an arrow with null address is a bug that will immediately crash the Pliant process. Now, just like we have 'Arrow' as a special case of 'Address' that automatically handles adjusting the references count of the pointed object, we have 'Link' as a special case of 'Pointer' that also automatically handles adjusting the references count: var Link:Int l 'new' is a function that allocates and builds an object with the specified type, just like 'entry_new', but returns a pointer to it instead of returning it's address. So, the previous code does the same as: var Arrow r and the difference is that, in the first case, we could increment the object integer data value by one through: l += 1 but in the second case, we would have to use the ugly explicit dereferencing notation: r map Int += 1 Just like an arrow, a link must point to an object, or the Pliant memory layout will be corrupted. Here is one of the most classical mistakes in Pliant: var Int i Since 'i' is a local variable, it is not an object, so setting a link to it is a bug immediately caught only when running at debugging level 2. |