
pdx Reference

The creation of reports and diagrams

Reports are generated from report templates. pdx searches each template for sections containing pdx function calls, most often calls to the format function. Each such section is cut out of the template, evaluated and replaced by its result. A report template can contain multiple sections with function calls.

Report templates are either plain ASCII text, text in a formatting language like HTML or XML, or text in a programming language like C or SQL. We call this the host language. pdx does not restrict which host language is used, but it must know how to find the sections with the function calls. That is why these sections are placed in comments of the host language, for example between <!-- and --> in HTML or XML, or between /* and */ in C. This way the template remains an incomplete but valid file of its type, so tools like HTML or XML editors can still be used on it. It is wise to mark such pdx comments a bit further to distinguish them from other comments in the file, for example with <!--- and ---> or /** and **/. A small but complete HTML template file containing pdx instructions looks like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html lang="de-ch">
<head>
    <meta http-equiv="CONTENT-TYPE" content="text/html; charset=iso-8859-1">
    <title>MyTitle</title>
</head>

<body style="direction: ltr;" lang="de-DE">

<!--- (now) --->, <b>pdx</b> <!--- (version) ---> (<!--- (build) --->)                          *

</body>
</html>

The line marked with * contains three small pdx sections, each calling one of the functions now, version and build. The line produces output like this:

2009-12-27 15:14:51, pdx 0.3.0 (2009-12-27 10:28:14 on castor, GNU/Linux 2.6.31-ARCH x86_64)

One can see that all parts of the line that are not placed in pdx sections (and also the whole surrounding text) are transferred into the output without any change. By the way, this line could also be written using the format function:

<!--- (format   (now)   ", <b>pdx</b> "   (version)   " ("   (build)   ")") --->
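
The cut, evaluate and replace cycle described above can be pictured with a small Python sketch. It is not pdx code: the expand helper, the regular expression for the <!--- and ---> markers and the stub evaluator with fixed values are all assumptions made for illustration.

```python
import re

# Matches one marked section: <!--- ... --->
SECTION = re.compile(r"<!---\s*(.*?)\s*--->", re.DOTALL)

def expand(template, evaluate):
    """Replace every marked section by the result of evaluate(expression)."""
    return SECTION.sub(lambda m: evaluate(m.group(1)), template)

# Stub evaluator with fixed values; the real evaluation is done by pdx.
STUB = {"(now)": "2009-12-27 15:14:51", "(version)": "0.3.0"}

line = "<!--- (now) --->, <b>pdx</b> <!--- (version) --->"
result = expand(line, lambda expression: STUB[expression])
print(result)
```

Everything outside the markers passes through unchanged, which is why the template stays a valid file of its host language.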

To fill a complete HTML-table with values we could write a template file like this:

[...]

<table style="page-break-before: avoid; page-break-inside: avoid;
             width: 800px;" border="1" cellpadding="1" cellspacing="1">

  <tbody>
    <tr valign="top">
      <td>Datum/Zeit</td>
      <td>*</td>
      <td>n</td>
      <td>l</td>
      <td>m</td>
      <td>x</td>
      <td>Kommentar</td>
    </tr>
<!---
(format
    (empty "<br>")
    "<tr valign=top>"
    "<td>"    datetime                         "</td>"
    "<td>"    (select "*" (days 7))   <1.1>    "</td>"
    "<td>"    (select "n" (days 7))   <1>      "</td>"
    "<td>"    (select "l" (days 7))   <1>      "</td>"
    "<td>"    (select "m" (days 7))   <1.0>    "</td>"
    "<td>"    (select "x" (days 7))   <1.1>    "</td>"
    "<td>"    (select "#" (days 7))            "</td>"
    "</tr>" newline
)
--->
  </tbody>
</table>

[...]

The table starts and ends outside the pdx section, but its rows (except the header) are generated completely by pdx. All generated rows have the same structure.

Diagrams are made from diagram definition files. Each file contains exactly one diagram definition, that is, exactly one call to the diagram function. Comment markers are not needed in diagram definition files because there is no host language. A complete diagram definition looks like this:

(diagram 400 300 #FFFDFD

    (axes (month 1) 3.0 9.0 1.0 #0)

    (hline 5.0 #C0C0C0)
    (hline 6.0 #C0C0C0)
    (hline 7.0 #C0C0C0)

    (curve (sum (select "*" (month 1)) day  3:30  9:30)    #FF0000)
    (curve (sum (select "*" (month 1)) day 11:00 14:30)    #00FF00)
    (curve (sum (select "*" (month 1)) day 17:30 20:30)    #0000FF)
    (curve (sum (select "*" (month 1)) day 21:00  2:00)    #FFFF00)
    (curve (avg (select "*" (month 1)) day        2:00)    #0       1.0)
)

When developing new report templates or diagram definitions, it is wise to take existing files as a base and modify them step by step.

Syntax

We assume the following syntactic rules:
{int}, {double} signed numbers
5, 3.14
{string} character strings
"Hugo"
{time} usually a time duration, rarely an actual time of day
09:13
{timestamp} a concrete point in time containing date and time
2009-12-31-7:30:01
{selection} a set of timestamp-value pairs
{color} a color in hexadecimal RGB notation
#00FF00
{nothing} for function results only: the result of the function is empty and can't be used for further calculations

Note: some functions take timestamps as parameters. Timestamps are normally written as CCYY-MM-DD hh:mm[:ss], with a space between date and time. Because the space also separates function parameters, a timestamp used as a parameter is written with a - (minus) in place of the space. This tells pdx that the timestamp is one parameter, not two.
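
As an illustration of the hyphen-joined notation, the following Python sketch parses such a parameter into a datetime value. It is not part of pdx; the function name parse_timestamp and the exact pattern (one or two hour digits, optional seconds) are assumptions derived from the examples in this document.

```python
import re
from datetime import datetime

# CCYY-MM-DD-hh:mm[:ss], date and time joined by "-" instead of a space
PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})-(\d{1,2}):(\d{2})(?::(\d{2}))?")

def parse_timestamp(text):
    """Parse a hyphen-joined timestamp parameter into a datetime."""
    match = PATTERN.fullmatch(text)
    if match is None:
        raise ValueError("not a timestamp parameter: " + text)
    # Missing seconds default to 0.
    parts = [int(group) if group else 0 for group in match.groups()]
    return datetime(*parts)

print(parse_timestamp("2009-12-31-7:30:01"))
```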

Functions

Time spans
Each of the following functions calculates a concrete time span. The optional {int} argument is a factor; if it is omitted, the factor is 1. The plural forms have no special meaning. They are just a more intuitive notation.
(second)         →  {time}
(second {int})   →  {time}
(seconds {int})  →  {time}
(minute)         →  {time}
(minute {int})   →  {time}
(minutes {int})  →  {time}
(hour)           →  {time}
(hour {int})     →  {time}
(hours {int})    →  {time}
(day)            →  {time}
(day {int})      →  {time}
(days {int})     →  {time}
(week)           →  {time}
(week {int})     →  {time}
(weeks {int})    →  {time}
The names of these functions are self-explanatory.
(month)          →  {time}
(month {int})    →  {time}
(months {int})   →  {time}
Note: a month is always 30 days here. The specifications (month) and (days 30) are synonymous.
(year)           →  {time}
(year {int})     →  {time}
(years {int})    →  {time}
Note: a year is always 365 days here. The specifications (year) and (days 365) are synonymous.
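
The fixed lengths of month and year can be made concrete with a short Python sketch. It only mirrors the rules stated above; the names UNIT and span are hypothetical and do not come from pdx.

```python
from datetime import timedelta

# Length of one unit; month and year are fixed at 30 and 365 days.
UNIT = {
    "second": timedelta(seconds=1),
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
    "year": timedelta(days=365),
}

def span(unit, factor=1):
    """Return the time span for e.g. (days 7) as span("days", 7)."""
    # Plural and singular names are treated alike.
    return factor * UNIT[unit.rstrip("s")]

print(span("month") == span("days", 30))
```

Because the lengths are fixed, (months 12) covers 360 days and is therefore five days shorter than (year).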


now
(now)            →  {timestamp} The function now returns the current timestamp.
Note: if you use the command line option -f, the returned value is the time at which the application started, even if that is already some seconds in the past. So with -f all calls to now return the same timestamp.

Note: you can set the returned value with the command line option -n. This lets you (given well-designed report templates and diagram definitions) produce reports for any point in the past without complicated configuration.


Selection
The five implementations of the select function get data values from a collection and put them into a selection. A selection is a time-limited, not necessarily gapless part of a collection. The mandatory {string} parameter names the collection. If the collection is empty or if no data value can be found within the time limits, the resulting selection is empty.
(select {string})                          →  {selection}
The function gets all data values of the collection.
(select {string} {timestamp})              →  {selection} The function gets the one data value from the collection that matches the specified {timestamp}.
(select {string} {timestamp} {timestamp})  →  {selection} The function gets all data values from the collection according to the time span between the two {timestamp} parameters.
(select {string} {time})                   →  {selection} The function gets all data values from the collection according to the time span between (now)-{time} and (now).
(select {string} {time} {timestamp})       →  {selection} The function gets all data values from the collection according to the time span between {timestamp}-{time} and {timestamp}.
Examples:

(select "*")
get all data of the default collection

(select "*" 2009-12-01-12:34)
get the data value of the default collection recorded at Dec 01 2009 12:34

(select "n" 2009-01-01-0:00 2010-01-01-0:00)
get all data of the collection n of the year 2009
 
(select "l" (weeks 2))
get all data of the collection l of the last two weeks

(select "l" (months 3) 2009-06-01-0:00)
get all data of the collection l of the last three months before June 01 2009
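
The time-span variants of select can be sketched in Python as a simple window filter. This is an illustration only: select_window is a hypothetical name, a selection is modelled as a list of (timestamp, value) tuples, and treating both bounds as inclusive is an assumption.

```python
from datetime import datetime, timedelta

def select_window(rows, span, end):
    """Keep the (timestamp, value) rows between end - span and end."""
    start = end - span
    return [(ts, value) for ts, value in rows if start <= ts <= end]

rows = [
    (datetime(2009, 5, 1, 8, 0), 5.0),
    (datetime(2009, 5, 20, 8, 0), 6.0),
    (datetime(2009, 6, 2, 8, 0), 7.0),
]
# Counterpart of a call with one month (30 days) before June 01 2009.
window = select_window(rows, timedelta(days=30), datetime(2009, 6, 1))
print(window)
```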


merge
The function merge merges selections, that is, values originally coming from different collections. This is useful for categorized values that are stored in several collections but belong together anyway.
(merge keyword ...)  →  {selection} keyword names the function used when two values from different collections collide because they have the same timestamp. keyword can be one of:
  • avg (calculate the average)
  • min (take the lesser value)
  • max (take the greater value)
  • sum (add both values)
The open parameter list after keyword accepts selections only, and at least two of them.
Example:

(merge avg (select "*" (days 7)) (select "x" (days 5)))
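
How colliding timestamps are resolved can be sketched in Python. The sketch is restricted to two selections and models each one as a dictionary from timestamp to value; the names COMBINE and merge are illustrative and say nothing about how pdx implements the function.

```python
# One combining function per keyword, applied only when timestamps collide.
COMBINE = {
    "avg": lambda a, b: (a + b) / 2,
    "min": min,
    "max": max,
    "sum": lambda a, b: a + b,
}

def merge(keyword, first, second):
    """Merge two selections given as {timestamp: value} dictionaries."""
    combine = COMBINE[keyword]
    merged = dict(first)
    for ts, value in second.items():
        merged[ts] = combine(merged[ts], value) if ts in merged else value
    return dict(sorted(merged.items()))

a = {"2009-12-01 13:00": 5.0, "2009-12-02 13:00": 6.0}
b = {"2009-12-02 13:00": 8.0, "2009-12-03 13:00": 4.0}
print(merge("avg", a, b))
```

Values with a timestamp that occurs in only one selection are taken over unchanged.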


fold
The function fold allows "folding" the time axis of a selection. Imagine a collection written on a paper strip and fold it in your mind so that periods of time lie on top of each other. With this technique you can compare days or months based on the data of several days or months.
(fold keyword1 keyword2 {selection})  →  {selection} keyword1 controls the folding interval and can be one of: year, month, day, hour or minute. For example, with day the timestamps of the selection are cut off down to and including their day component, so only the time of day remains. All values then get timestamps within a single 24-hour period.

keyword2 has the same meaning as keyword in the merge function. You can use: avg, min, max, sum, first, last.
Note: the selection resulting from a folding operation must still have valid timestamps. But an absolute timestamp cannot be determined after folding, because several timestamps lie on top of each other. That is why these timestamps get the year 9999. Depending on the interval used, other parts of these timestamps (month, day and so on) are syntactically valid as well but carry no meaning.
Example:

selection a           (fold day avg (select "a"))    (fold day first (select "a"))   (fold day last (select "a"))
--------------------  ---------------------------    -----------------------------   ----------------------------
2009-12-01 13:01 5.2  9999-01-01 13:01 5.45 <- avg!  9999-01-01 13:01 5.2 <- first!  9999-12-05 13:01 5.7 <- last!
2009-12-02 13:02 5.7  9999-01-01 13:02 5.7           9999-01-01 13:02 5.7            9999-01-01 13:02 5.7
2009-12-03 13:03 3.2  9999-01-01 13:03 3.2           9999-01-01 13:03 3.2            9999-01-01 13:03 3.2
2009-12-04 13:04 4.8  9999-01-01 13:04 4.8           9999-01-01 13:04 4.8            9999-01-01 13:04 4.8
2009-12-05 13:01 5.7
2009-12-06 13:06 5.3  9999-01-01 13:06 5.3           9999-01-01 13:06 5.3            9999-01-01 13:06 5.3
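
Folding by day can be sketched in Python as grouping by time of day. The sketch uses the rows of selection a from the example; fold_day is a hypothetical name, and instead of building timestamps in the year 9999 it simply uses the time of day as the key.

```python
from datetime import datetime, time

def fold_day(rows, keyword):
    """Fold (timestamp, value) rows onto one day, keyed by time of day."""
    groups = {}
    for ts, value in sorted(rows):          # oldest row first
        groups.setdefault(ts.time(), []).append(value)
    pick = {
        "avg": lambda values: sum(values) / len(values),
        "first": lambda values: values[0],
        "last": lambda values: values[-1],
        "min": min,
        "max": max,
        "sum": sum,
    }[keyword]
    return {key: pick(values) for key, values in groups.items()}

rows = [
    (datetime(2009, 12, 1, 13, 1), 5.2),
    (datetime(2009, 12, 2, 13, 2), 5.7),
    (datetime(2009, 12, 3, 13, 3), 3.2),
    (datetime(2009, 12, 4, 13, 4), 4.8),
    (datetime(2009, 12, 5, 13, 1), 5.7),
    (datetime(2009, 12, 6, 13, 6), 5.3),
]
print(fold_day(rows, "first")[time(13, 1)])
```

Only the two rows recorded at 13:01 collide, so the six input rows fold into five result rows.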


Statistics
The statistical functions perform typical statistical calculations on a selection. The functions calculate:
  • avg:   the arithmetic average
  • count: the number of values
  • first: the first (oldest) value in time
  • last:  the last (youngest) value in time
  • max:   the greatest value
  • min:   the smallest value
  • sdv:   the standard deviation
  • sum:   the sum of all values
These functions have five implementations each (func is just a placeholder for the function name).
(func {selection})                        →  {selection} execute the calculation based on the whole selection, the result contains a single row
(func {selection} {time} {time})          →  {selection} execute the calculation based on the whole selection but take only values between the two times of day, the result contains a single row
The following implementations use a keyword to name the aggregation interval. This keyword can be: year, month, day, hour, minute, second. The result has one row per interval. For instance, if you use day as aggregation interval the result will contain one row per day of the original selection.
(func {selection} keyword)                →  {selection} execute the calculation based on the whole selection, aggregate according to keyword
(func {selection} day {time})             →  {selection} execute the calculation based on the whole selection, aggregate per day, use the specified time as midnight; this allows values after 0:00 to still belong to the previous day
(func {selection} keyword {time} {time})  →  {selection} execute the calculation based on the whole selection, aggregate according to keyword, use only values between the two times
The functions avg and sdv have a sixth, floating implementation. It uses previous and following values and leads to very smooth curves.
(func {selection} {int} {int})            →  {selection} execute the calculation based on the whole selection; for each value, use {int} previous and {int} following values and calculate within this floating window only; the result has as many values as the original selection
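
The floating window can be sketched in Python as follows. The name floating_avg is hypothetical. Since the result has as many values as the input, the window must be handled somehow at both ends; shortening it there, as done here, is an assumption and may differ from what pdx does.

```python
def floating_avg(values, before, after):
    """Average each value with up to `before` previous and `after` following values."""
    result = []
    for i in range(len(values)):
        # The window is cut short at the start and end of the list.
        window = values[max(0, i - before): i + after + 1]
        result.append(sum(window) / len(window))
    return result

print(floating_avg([1.0, 2.0, 3.0, 4.0, 5.0], 1, 1))
```

With 5 previous and 5 following values, a full window covers 11 values.
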
Examples:

(avg (select "*"))
compute the average over all values of the default collection

(max (select "*") day 2:00)
get the daily maximum of the default collection, assume day change at 2:00

(sum (select "n") day 3:30 9:00)
get the daily sum of values of the collection n, sum only values between 3:30 and 9:00

(avg (select "l" (month)) 5 5)
get the floating average over 11 values of the collection l for the last month

(first (select "*" (month)) day 2:00)
get the first line of each day of the last month from the default collection

(last (select "*" (day)) hour)
get the last line of each hour of the last day from the default collection
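
The effect of a shifted day change, as in the daily maximum with day change at 2:00, can be sketched in Python. The name daily_max and the representation of a selection as (timestamp, value) tuples are assumptions for illustration.

```python
from datetime import date, datetime, timedelta

def daily_max(rows, day_change=timedelta(0)):
    """Daily maximum; a value before `day_change` counts for the previous day."""
    result = {}
    for ts, value in rows:
        # Shifting the timestamp back moves early-morning values to the day before.
        day = (ts - day_change).date()
        result[day] = max(value, result.get(day, value))
    return result

rows = [
    (datetime(2009, 12, 1, 23, 0), 5.0),
    (datetime(2009, 12, 2, 1, 30), 7.0),
    (datetime(2009, 12, 2, 8, 0), 6.0),
]
print(daily_max(rows, timedelta(hours=2)))
```

With a day change at 2:00, the value recorded at 1:30 on December 2 is counted for December 1.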


Comparison
(== {selection} {double})  →  {selection}
(!= {selection} {double})  →  {selection}
(<  {selection} {double})  →  {selection}
(>  {selection} {double})  →  {selection}
(<= {selection} {double})  →  {selection}
(>= {selection} {double})  →  {selection}
All these functions compare each value of a selection with a specified constant. The result contains only those values that satisfy the comparison. So the resulting selection has as many rows as the original selection, or fewer.
Examples:

(< (select "*") 5.0)
get all values from the default collection being less than 5.0

(>= (select "*") 7.0)
get all values from the default collection being greater than or equal to 7.0
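
In Python terms, a comparison is a filter over the rows of a selection. The following sketch uses hypothetical names (OPS, compare) and models a selection as a list of (timestamp, value) tuples.

```python
import operator

OPS = {"==": operator.eq, "!=": operator.ne, "<": operator.lt,
       ">": operator.gt, "<=": operator.le, ">=": operator.ge}

def compare(op, rows, constant):
    """Keep only the (timestamp, value) rows whose value satisfies the comparison."""
    return [(ts, value) for ts, value in rows if OPS[op](value, constant)]

rows = [("2009-12-01 13:00", 4.8), ("2009-12-02 13:00", 7.0), ("2009-12-03 13:00", 7.4)]
print(compare(">=", rows, 7.0))
```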


Arithmetic
(+ {double}    {double})     →  {selection}
(- {double}    {double})     →  {selection}
(* {double}    {double})     →  {selection}
(/ {double}    {double})     →  {selection}
These four functions calculate with two simple numbers. The result is nevertheless a selection, which makes it simpler to use in further calculations. This selection has only one row, and that row has no timestamp.
(+ {selection} {double})     →  {selection}
(- {selection} {double})     →  {selection}
(* {selection} {double})     →  {selection}
(/ {selection} {double})     →  {selection}
This group of functions combines each value of the selection with a number. The result has as many rows as the specified selection.
(+ {selection} {selection})  →  {selection}
(- {selection} {selection})  →  {selection}
(* {selection} {selection})  →  {selection}
(/ {selection} {selection})  →  {selection}
These four functions handle two selections. The timestamps of the two selections are compared row by row and used as the key. The two selections do not need to have the same number of rows. If the second selection has no row with the same timestamp as a row of the first selection, the next older row is used. The result has as many rows as the first selection, and its timestamps also come from the first selection. It is important that the first row of the first selection also finds a matching row in the second one, that is, a row with the same timestamp or an older one. pdx produces an error if this condition is not fulfilled.

These four implementations are especially useful when two selections are the numerator and denominator of a quotient, for example when values are based on other, specific values.
(+ {timestamp} {time})       →  {timestamp}
(- {timestamp} {time})       →  {timestamp}
These functions add a {time} to a {timestamp} or subtract it from a {timestamp}.
Example 1:

(+ 2010-12-17-00:00:00 (days 3))
results in Dec 20, 2010 0:00

Example 2:

selection a                 selection b                                    (* (select "a") (select "b"))
--------------------        --------------------                           -----------------------------
                            2009-11-17 12:38 9.3
2009-12-01 13:00 5.2                                ->   5.2 * 9.3 =       2009-12-01 13:00 48.36
2009-12-02 13:00 5.7                                ->   5.7 * 9.3 =       2009-12-02 13:00 53.01
2009-12-03 13:00 3.2                                ->   3.2 * 9.3 =       2009-12-03 13:00 29.76
                            2009-12-03 19:17 8.4
2009-12-04 13:00 4.8                                ->   4.8 * 8.4 =       2009-12-04 13:00 40.32
2009-12-05 13:00 5.7        2009-12-05 13:00 4.7    ->   5.7 * 4.7 =       2009-12-05 13:00 26.79
2009-12-06 13:00 5.3                                ->   5.3 * 4.7 =       2009-12-06 13:00 24.91
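
The matching rule for two selections (same timestamp, otherwise the next older row) can be sketched in Python with a binary search. The sketch uses the input rows of selections a and b from Example 2; the name multiply, the list-of-tuples representation and the rounding to two decimals are assumptions made for illustration.

```python
from bisect import bisect_right

def multiply(first, second):
    """Multiply each row of `first` with the same-or-next-older row of `second`.

    Both arguments are lists of (timestamp, value) sorted by timestamp.
    """
    keys = [ts for ts, _ in second]
    result = []
    for ts, value in first:
        i = bisect_right(keys, ts) - 1      # last row of `second` not younger than ts
        if i < 0:
            raise ValueError("no matching or older row for " + ts)
        result.append((ts, round(value * second[i][1], 2)))
    return result

a = [("2009-12-01 13:00", 5.2), ("2009-12-02 13:00", 5.7), ("2009-12-03 13:00", 3.2),
     ("2009-12-04 13:00", 4.8), ("2009-12-05 13:00", 5.7), ("2009-12-06 13:00", 5.3)]
b = [("2009-11-17 12:38", 9.3), ("2009-12-03 19:17", 8.4), ("2009-12-05 13:00", 4.7)]
for row in multiply(a, b):
    print(row)
```

The timestamps are compared as strings here, which works because the fixed-width notation sorts chronologically.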


HbA1c
These functions are specific to diabetics. They calculate the HbA1c value based on the blood sugar values of the last 90 days; the collection must contain this amount of data. The function HbA1c weights all values in the collection equally, while HbA1c2 gives more recent values a higher weight, so its result varies more than that of HbA1c. The common {string} parameter names the collection containing the blood sugar values.
(HbA1c  {string})                          →  {selection} calculates HbA1c from (now), the result has only one row
(HbA1c  {string} {timestamp})              →  {selection} calculates HbA1c from the specified {timestamp}, the result has only one row
(HbA1c  {string} {timestamp} {timestamp})  →  {selection} calculates HbA1c in the timespan from the first {timestamp} to the second {timestamp}, the result has as many rows as values in this timespan
(HbA1c  {string} {time})                   →  {selection} calculates HbA1c in the timespan from (now)-{time} to (now), the result has as many rows as values in this timespan
(HbA1c  {string} {time} {timestamp})       →  {selection} calculates HbA1c in the timespan from {timestamp}-{time} to {timestamp}, the result has as many rows as values in this timespan
(HbA1c2 {string})                          →  {selection} (these five implementations behave exactly like the five above)
(HbA1c2 {string} {timestamp})              →  {selection}
(HbA1c2 {string} {timestamp} {timestamp})  →  {selection}
(HbA1c2 {string} {time})                   →  {selection}
(HbA1c2 {string} {time} {timestamp})       →  {selection}


Reports
The functions of this group are needed for the creation of reports. They return a character string, often several lines of text. pdx parses the report template, finds an invocation of format, executes it and replaces it with the function result at exactly the same position. These functions can indeed be tested in interactive mode.
(format ...)  →  {string} The format function accepts an open parameter list consisting of text, function results, format specifications and keywords:

(format
    "<tr>"
    "<td>"   datetime                          "</td>"
    "<td>"   (select "*" (days 7))    <1.1>    "</td>"
    "<td>"   (select "n" (days 7))    <1>      "</td>"
    "</tr>"
    newline
)

All these expressions result in short pieces of text that are concatenated. The result is a piece of text of any length with one or more lines. The number of lines depends on the number of values in the function results. Values whose timestamps match are placed on the same line.

The keyword datetime is a placeholder for the timestamp of the line. The keyword newline inserts a physical linebreak.

The format specifications can be recognized by their angle brackets. They always apply to the value directly before them. There are three different formats:
  • <n>:   an integer number with at least n digits
  • <n.0>: a number with at least n integer digits and, if necessary, decimal digits
  • <n.m>: always n integer and m decimal digits
The result of the example above is real HTML:

[...]
<tr><td>2009-01-17 21:42:49</td><td>5.6</td><td>6</td></tr>
<tr><td>2009-01-18 05:54:41</td><td>6.8</td><td>7</td></tr>
<tr><td>2009-01-18 12:17:22</td><td>5.4</td><td>6</td></tr>
[...]
(empty {string})  →  {string} In a complex format invocation, empty values sometimes need to be represented by something visible in the output. Empty values arise when multiple selections are joined into a table (outer join). The empty function tells the format function which {string} to use for empty values in the output.
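
The joining of several function results into lines, including the placeholder for empty values, can be sketched in Python. The name format_rows, the fixed HTML cell layout and the representation of each column as a dictionary from timestamp to already formatted text are assumptions for illustration.

```python
def format_rows(columns, empty=""):
    """Build one line per timestamp from several {timestamp: text} columns."""
    # Outer join: every timestamp that occurs in any column gets a line.
    stamps = sorted(set().union(*columns))
    lines = []
    for ts in stamps:
        cells = [column.get(ts, empty) for column in columns]
        lines.append("<tr><td>" + ts + "</td><td>" + "</td><td>".join(cells) + "</td></tr>")
    return "\n".join(lines)

first = {"2009-01-17 21:42": "5.6"}
second = {"2009-01-17 21:42": "6", "2009-01-18 05:54": "7"}
print(format_rows([first, second], empty="<br>"))
```

The second line of the output has no value in the first column, so the placeholder is written in its place.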


Diagrams
The following functions draw something visible into a diagram. They can't be tested in the interactive mode of pdx. The result is always {nothing}, so it can't be used for further calculations.
(diagram {int} {int} {color} ...)
    →  {nothing}
The diagram function is a wrapper that surrounds the definition of a concrete diagram. The first {int} parameter specifies the size of the diagram in x-direction, the second one the size in y-direction. The {color} parameter specifies the background color in RGB notation. The open parameter list that follows should contain invocations of other diagram functions, in particular at least one axes and one curve function.
The four implementations of the axes function draw a complete, labelled coordinate system, so the user does not have to worry about the details. The three {double} parameters common to all of them are 1) the lower bound of the y-axis, 2) the upper bound of the y-axis, 3) the line width of the axes. The {color} parameter names the color of the axes and their labels.
(axes {timestamp} {timestamp} {double} {double} {double} {color})
    →  {nothing}
x-axis in the timespan between the two {timestamp} parameters
(axes {time} {timestamp} {double} {double} {double} {color})
    →  {nothing}
x-axis in the timespan between {timestamp}-{time} and {timestamp}
(axes {time} {double} {double} {double} {color})
    →  {nothing}
x-axis in the timespan between (now)-{time} and (now)
(axes keyword {double} {double} {double} {color})
    →  {nothing}
with keyword = year, month, day, hour or minute, especially for drawing data resulting from a call to the fold function. You need the same interval here as in the fold call.
(curve {selection} {color} ...)
    →  {nothing}
The function draws a curve in the diagram.

Without any further parameters the curve-function draws a zigzag line in the specified color just by connecting the data values of the selection. With the help of additional parameters this behaviour can be changed:
  • The keyword bars creates vertical bars instead of a zigzag line.
  • A string like "+", "|", "-", "x" or "°" doesn't create a line but single, unconnected points symbolized by the specified marker.
  • A {double}-parameter changes the thickness of the (zigzag) line.
In bar graphs one can draw multiple bars in one aggregation interval using two {int}-parameters. This sounds difficult but is easy to understand with an example: suppose we have values for four different, abstract times of day, say "in the morning", "at noon", "in the evening" and "late", and we want four bars per day representing the values at these times. The bars must then be drawn so that they don't overlap each other:

(curve (sum (select "n" (week)) day  3:30  9:30) #FF1000 bars 1 4)
(curve (sum (select "n" (week)) day 11:00 14:30) #FF5000 bars 2 4)
(curve (sum (select "n" (week)) day 17:30 20:30) #FF9000 bars 3 4)
(curve (sum (select "n" (week)) day 21:00  2:00) #FFB000 bars 4 4)

These four lines differ in the selections, the colors of the bars and the first {int}-parameter. The first {int}-parameter is the number of the bar, the second says how many bars there are. So the first line draws the first bar of four. pdx computes how wide a single bar must be drawn; in the example every bar gets a quarter of the width of a day on the x-axis. You can play with this: you do not have to draw every bar, so you can also create gaps between bars.
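
The division of one aggregation interval into bar slots can be sketched in Python. The name bar_slot is hypothetical, and placing the bars side by side from left to right is an assumption; the text above only states that each bar gets an equal share of the interval width.

```python
def bar_slot(interval_start, interval_width, number, count):
    """Return (left edge, width) of bar `number` out of `count` in one interval."""
    if not 1 <= number <= count:
        raise ValueError("bar number must be between 1 and count")
    width = interval_width / count
    return interval_start + (number - 1) * width, width

# One day of 24 hours on the x-axis, divided among four bars.
for number in range(1, 5):
    print(number, bar_slot(0.0, 24.0, number, 4))
```

Leaving out one of the four calls leaves that slot empty, which produces a gap between the remaining bars.
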
The two implementations of the hline function draw a horizontal line in the specified color.
(hline {double} {color})
    →  {nothing}
{double} specifies the position of the line on the y-axis
(hline {double} {double} {color})
    →  {nothing}
the first {double} parameter specifies the position of the line on the y-axis, the second one determines the line width
The four implementations of the vline function draw a vertical line in the specified color.
(vline {timestamp} {color})
    →  {nothing}
{timestamp} specifies the position of the line on the x-axis
(vline {timestamp} {double} {color})
    →  {nothing}
{timestamp} specifies the position of the line on the x-axis, {double} determines the line width
(vline {time} {color})
    →  {nothing}
{time} specifies the position of the line on the x-axis, especially for drawing data resulting from a call to the fold function
(vline {time} {double} {color})
    →  {nothing}
{time} specifies the position of the line on the x-axis, {double} determines the line width, especially for drawing data resulting from a call to the fold function
Example:

(diagram 500 375 #FFFDFD

    (axes day 3.0 9.0 1.0 #0)

    (hline 4.5 #C0C0C0)
    (hline 5.0 #C0C0C0)
    (hline 5.5 #C0C0C0)
    (hline 6.0 #C0C0C0)
    (hline 6.5 #C0C0C0)
    (hline 7.0 #C0C0C0)
    (hline 7.5 #C0C0C0)

    (vline  5:45 #C0C0C0)
    (vline 12:30 #C0C0C0)
    (vline 18:30 #C0C0C0)
    (vline 21:30 #C0C0C0)

    (curve      (fold day first (merge avg (select "*" (week)) (select "x" (week))))        #FF0000 "+")
    (curve (avg (fold day first (merge avg (select "*" (week)) (select "x" (week))))  3  3) #000000 2.0)
    (curve (avg (fold day first (merge avg (select "*" (year)) (select "x" (year)))) 30 30) #000000)
)


Other
(build)     →  {string} The function gets the build string of pdx. This string contains information about when and how pdx was compiled, that is, which options were used and which optional features are supported.
(database)  →  {string} The function gets the name and version of the underlying database system.
(version)   →  {string} The function gets the version of pdx.
Examples:

(build)
Dec 15 2010, 17:25:30, USE_SQLITE, USE_MYSQL, USE_READLINE, USE_BOARD, USE_CAIRO, USE_ETPAN

(database)
MySQL 5.1.51

(version)
1.2.0

