YML – Why a Markup Language?!

YML 2.7.6 of Thu 25 May 2023 – Copyleft 2007-2023, Volker Birk

Text

To output text nodes (character data), write literals. There are integer literals, floating point literals and text literals.

Literals are written as in Python: text literals are enclosed in single or double quotes, or in triple double quotes for multiline text:

"text" 'also text' """some more text"""
42 "an integer and" 42.23 "a floating point literal"
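
Because the literals follow Python's notation, Python itself can show which value each form denotes. The following is only a sketch in plain Python; the helper name literal_value is made up for illustration, and it is not claimed that YML accepts every Python literal form.

```python
import ast

def literal_value(source):
    """Return the Python value denoted by a literal written as source text."""
    return ast.literal_eval(source)

# The literal forms used in the samples above.
for source in ['"text"', "'also text'", '"""some more text"""', "42", "42.23"]:
    value = literal_value(source)
    print(type(value).__name__, repr(value))
```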

Literals are output by the text function.

Function Calls

The main idea of YML scripts is calling functions which then generate XML tags (Markup). Functions generate single tags, lists or trees of tags.

To call a function, write the name of the function, followed by a comma separated list of function parameters (C like syntax) or just attribute=value pairs. Unlike in C, you don't need to enclose the parameter list in parentheses. A simple function call can be terminated by a semicolon ; or by a period .

It does not matter whether you call a function with parentheses, with brackets, or with neither. These statements are equivalent:

foo "hello, world";
foo "hello, world".
foo("hello, world");
foo["hello, world"];

Subtrees

If you omit the trailing semicolon, you're creating a Subtree; YML Subtrees can also be opened and closed with braces:

foo {
    bar {
        something;
    }
}

If a Subtree consists of only a single subelement, you may omit the braces:

foo
    bar;

Named Parameters

To generate attributes by calling a function, you can use Named Parameters.

To do so, assign literals or symbols to attribute names as shown below. The name of the parameter is then used as the name of the generated attribute. An example:

div id=sample {
    "this is a " a href="#sample" "link sample"
}

This generates:

<div id="sample">this is a <a href="#sample">link sample</a></div>

Unnamed Parameters

Unnamed Parameters prepare values for predefined attributes. The following example is equivalent to the sample above:

decl a(href);
decl div(id);

div "sample" {
    "this is a " a "#sample" "link sample"
}

If no predefined attribute can be allocated, the value of the parameter is added to the body.

Calling with &

Especially if you have a default body for your function, calling it with a leading & can be useful: the tag itself is then omitted and only the body is output:

decl something { tag1; tag2; };

list {
    &something;
}

results in:

<list>
  <tag1/>
  <tag2/>
</list>

This has the same result as aliasing something to -.

Function Lists

Function Lists are a feature of YML to simulate a more C like syntax. Let's look at some examples. You can have a list of functions wherever you can have a function. Function Lists are comma separated:

x i, j, k

compiles to:

<x>
  <i/>
  <j/>
  <k/>
</x>

Parameter Lists

A sample together with Descending Attributes:

decl Interface @name;
decl attr @type @name;
decl func @type @name;

Interface Icecream {
    attr color flavour;
    attr long number;
    func int getPrice();
    func void addFlavour(in color flavour, in long number);
}

compiles to:

<Interface name="Icecream">
  <attr type="color" name="flavour"/>
  <attr type="long" name="number"/>
  <func type="int" name="getPrice"/>
  <func type="void" name="addFlavour">
    <parm>
      <in/>
      <color/>
      <flavour/>
    </parm>
    <parm>
      <in/>
      <long/>
      <number/>
    </parm>
  </func>
</Interface>

Note the parm tags: they're generated by default if you write a Parameter List after a Function Call. That differs from calling the function with parameters, where the parameters are text values.

The parm tags are emitted because the _parm function is called each time such a parameter is emitted.

If you want the _parm function to do something else, just declare it differently.

Generic Declarations

Using Generic Declarations is just like using Parameter Lists, except that you use angle brackets instead of parentheses. For Generic Declarations, the _generic function is called each time such a Generic Declaration is emitted, generating generic tags by default:

max<int>(x, y)

compiles to:

<max>
  <generic>
    <int/>
  </generic>
  <parm>
    <x/>
  </parm>
  <parm>
    <y/>
  </parm>
</max>

The content function

The content; Function Call has a special meaning (only in a default body): it does not generate a tag. Instead, the tags of the body supplied in a call are inserted at each place where the content; Function Call appears in the default body.

The text function

There is a special YML function named text. Usually, it's just aliased to - (and therefore generates no tag of its own). The text function is called each time a text literal is output.

If you declare the text function, you can override that behaviour. For example, YSLT declares text like this:

decl text alias xsl:text;

"test"

generates:

<xsl:text>test</xsl:text>

The text function is not called if you give text as a value for an attribute:

decl text alias xsl:text;

a "test"

generates:

<a>test</a>

But it is called using the quoting operators:

decl text alias xsl:text;

a > test

generates:

<a><xsl:text>test</xsl:text></a>

The decl, define and operator functions

The decl, define and operator functions are not defined, so they cannot be used accidentally through a syntax error, e.g. in a decl statement. If you want to use such a function, e.g. decl(), you have to declare it explicitly:

decl decl;
decl();

will result in:

<decl/>

Declaring Functions: decl

By default, each Function Call generates one XML tag with the same name. To be exact, the XML tag has dashes in its name where the YML function has underscores.
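
The naming rule can be modelled in a line of Python. This is only a sketch of the rule as stated above, not the code yml2 uses; the function name tag_name and the sample name apply_templates are made up for illustration, and special functions such as _parm are not covered.

```python
def tag_name(function_name: str) -> str:
    """Map a YML function name to the XML tag name it generates by default."""
    # Underscores in the function name become dashes in the tag name.
    return function_name.replace("_", "-")

print(tag_name("apply_templates"))  # apply-templates
print(tag_name("body"))             # body
```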

To define what the tags and attributes created by a Function Call look like, use the decl statement.

Trivial Declarations

In a trivial declaration, you're just declaring the Function Name and so the XML tag name:

decl html, head, title, body, p, a;

As seen in the example, multiple declarations can be done in a comma separated list.

Because a trivial declaration is made automatically the first time you use a function, you usually don't need to declare functions this way.

Specifying Unnamed Parameters

To specify Unnamed Parameters, give the parameter list comma separated in parentheses, or provide one or more pairs of brackets with parameter lists in them:

decl a(href), img[src];

If you use the corresponding functions a() and img() with an unnamed parameter in a call, the value is applied to the href and the src attribute, respectively:

a "http://www.ccc.de" "The Club Homepage" img "logo.png";

These Function Calls generate:

<a href="http://www.ccc.de">The Club Homepage</a><img src="logo.png"/>

Giving Default Values for parameters

To give default values for generating XML attributes, assign a literal to each named parameter in the declaration parentheses or brackets. Two examples, which do the same:

decl img(src, alt="picture");
decl img[src][alt="picture"];

Aliasing: using different YML functions for the same XML tag for different tasks

Sometimes tags are used in different ways to do different things. In this case, you can use aliasing. Aliasing means that the YML function name and the XML tag name differ. For example:

decl a(href), target(name) alias a;

Both defined YML functions then generate <a /> tags – but the Unnamed Parameter differs.

The alias name - has a special meaning: it omits the tag in the output. That is especially useful if you have a default body. An alias to - then has the same meaning as starting the function call with the & character: only the body is emitted.

Specifying Descending Attributes

Maybe you want to write something like this:

Module ERP {
    Interface Customer {
        // ...
    }
}

Without any extras, this compiles to:

<Module>
    <ERP>
        <Interface>
            <Customer />
        </Interface>
    </ERP>
</Module>

In this case, it would be practical if ERP were interpreted not as an extra tag but as the value of an attribute called name. You can achieve this with Descending Attributes:

decl Module @name, Interface @name;

With this declaration, the code sample above compiles to:

<Module name="ERP">
    <Interface name="Customer" />
</Module>

Descending attributes can also be used this way:

decl module +name;
decl element +name;

module Some {
    element {
        one;
        two;
        three;
    }
    element {
        four; five; six
    }
}

The above generates:

<?xml version='1.0' encoding='UTF-8'?>
<module name="Some">
  <element name="one"/>
  <element name="two"/>
  <element name="three"/>
  <element name="four"/>
  <element name="five"/>
  <element name="six"/>
</module>

Specifying Descending Pointers

Like with descending attributes, you can use descending pointers. Instead of preceding the name of an attribute with a + sign (like with descending attributes), precede it with an asterisk *.

Like with pointers in general, it's a good idea to combine that with a default body:

decl f *p { some tags with *p };

f value;

This generates:

<?xml version='1.0' encoding='UTF-8'?>
<f>
  <some>
    <tags>
      <with>value</with>
    </tags>
  </some>
</f>

Supplying a Default Body

Additionally, you can supply a Default Body for each tag. To do so, add a YML function block in braces to your declaration:

decl pageContent alias body {
    a name=top;
    include heading.en.yhtml2;
    div id=entries
    content;
};

The sample above is used for generating this homepage, for example.

See the content function.

Inheritance

Declarations can inherit information from previous declarations. To do so, use an is clause to name the function to inherit from.

The following is an example from the YSLT specification:

decl stylesheet(version="1.0", xmlns:xsl="http://www.w3.org/1999/XSL/Transform");

decl estylesheet is stylesheet (
    xmlns:exsl='http://exslt.org/common',
    xmlns:math='http://exslt.org/math',
    xmlns:func='http://exslt.org/functions',
    xmlns:str='http://exslt.org/strings',
    xmlns:dyn='http://exslt.org/dynamic',
    xmlns:set='http://exslt.org/sets',
    extension-element-prefixes='exsl func str dyn set math'
);

decl textstylesheet is estylesheet {
    output "text";
    const "space", !"'" + " " * 200 + "'"!;
    param "autoindent", 4;
    content;
}, tstylesheet is textstylesheet;

Here estylesheet inherits the tag name and the Default Values from stylesheet, while textstylesheet in turn inherits everything from estylesheet. textstylesheet then adds a Default Body, and tstylesheet does exactly the same as textstylesheet.

All of these YML functions output stylesheet XML tags, but with different defaults.
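
The const line in textstylesheet embeds a Python expression between ! markers. Evaluated on its own in plain Python, that expression yields a string of 200 spaces enclosed in single quotes. The sketch below only evaluates the expression; how YSLT then uses the constant is not shown here.

```python
# The expression between the ! markers in the const "space" line.
space = "'" + " " * 200 + "'"

print(len(space))           # 202: two quote characters plus 200 spaces
print(space[0], space[-1])  # ' '
```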

Shapes

Shapes are comparable to inheritance. Declaring a shape inherits every property except the name.

decl coords(x=0, y=0);
decl point <coords> (name);

point "origin";

compiles to:

<point y="0" x="0" name="origin"/>

It's possible to have more than one shape, too. Multiple shapes patch each other in the sequence in which they're listed:

decl coords(x=0, y=0);
decl named +name;
decl point <coords, named>;

point origin;

compiles to:

<point y="0" x="0" name="origin" />

Namespaces

XML namespaces can be used just by providing an alias clause. Additionally, they can be used by an in clause; these two lines are equivalent:

decl apply(select) alias xsl:apply-templates;
in xsl decl apply(select) alias apply-templates;

in clauses can also be used with a block of declarations in braces:

in xsl {
    decl template(match);
    decl apply(select) alias apply-templates;

    decl function(name) alias template;
    decl call(name) alias call-template;
}

Pointers

In some situations, it is good to have information in a Function Call which then changes the way XML tags are generated. This is what Pointers are for.

The name should not mislead you; I took it because I chose the * symbol to declare them, and that is the meaning of this symbol in the programming language C. The concept behind them is very simple.

For example, it could be a good idea to generate a small HTML document containing some content. The title of the page is a good candidate for a pointer:

decl page(*title) alias html {
    head {
        title *title;
    }
    body {
        h1 *title;
        content;
    }
};

In the example above, calling page('My Page') { p 'hello, world'; } will result in this XML output:

<html>
    <head>
        <title>My Page</title>
    </head>
    <body>
        <h1>My Page</h1>
        <p>hello, world</p>
    </body>
</html>

Pointers can be referenced anywhere in the Default Body of a decl statement, including for generating extra tags. In that case, the value of the Pointer becomes the tag name.

Additionally, you can insert the value of a pointer as text by calling it with two leading asterisks. For example, if the pointer is defined as *x, you can insert its value as text using **x.

Pointers without tags

To give a literal a name, you can define pointers to literals.

define *answer = 42;
something *answer;

will compile to:

<something>42</something>

The define keyword as well as the asterisk * can be omitted. So this is equivalent to the statements above:

answer = 42;
something *answer;

The pointer *_debug_trace

If you're calling yml2proc with --debug, then this pointer is filled with tracing info text, otherwise it's an empty string.

Macros

Macros are a way to generate values for attributes with variable content. Macros can be set like any other parameters; they're used for a text search & replace in the values of attributes when attributes are generated.

Parameters that represent macros are marked with a preceding % sign. They are evaluated before any other parameter in a function call, even if they are defined after other parameters.

An example:

decl foo(%macro, myAttr="something %macro for testing");

testing
    foo "nice";

This generates:

<testing>
  <foo myAttr="something nice for testing"/>
</testing>
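
The search & replace step can be pictured with a small Python model. It is a sketch under assumptions, not yml2's implementation: the function name expand_macros and the dictionary representation of attributes and macros are invented for illustration.

```python
def expand_macros(attributes: dict[str, str], macros: dict[str, str]) -> dict[str, str]:
    """Replace every %name occurrence in attribute values by the macro's value."""
    expanded = {}
    for attr_name, value in attributes.items():
        # Longest names first, so that %macro2 is not clobbered by %macro.
        for macro_name in sorted(macros, key=len, reverse=True):
            value = value.replace("%" + macro_name, macros[macro_name])
        expanded[attr_name] = value
    return expanded

# Mirrors the example above: foo "nice" sets %macro to "nice".
result = expand_macros({"myAttr": "something %macro for testing"}, {"macro": "nice"})
print(result)  # {'myAttr': 'something nice for testing'}
```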

The Null Function

The function with the name _ (underscore) is called the Null Function. If you define this function, you switch off the default behaviour of making trivial declarations automatically.

Instead, unknown functions now call the Null Function. This can be very useful together with Descending Attributes:

decl _ +type +name alias func;
decl interface +name;

interface Testcase {
    void f(in string input);
    long getOptions();
}

compiles to:

<interface name="Testcase">
  <func type="void" name="f">
    <parm>
      <in/>
      <string/>
      <input/>
    </parm>
  </func>
  <func type="long" name="getOptions"/>
</interface>

Quoting Operators

Five different quoting operators implement different functionality:

Quote >

The > operator quotes into text nodes, doing XML escaping of text. An example:

> this text will be put into a text node and these angle brackets <> will be quoted
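
What XML escaping does to the sample line can be shown with Python's standard library. The escape function from xml.sax.saxutils serves only as a stand-in here; which characters yml2 itself escapes is not specified in this document.

```python
from xml.sax.saxutils import escape

quoted = "this text will be put into a text node and these angle brackets <> will be quoted"

# escape() replaces &, < and > by their XML entities.
print(escape(quoted))
# this text will be put into a text node and these angle brackets &lt;&gt; will be quoted
```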

Additionally, it can be used to implement an indentation system, see YSLT below.

In that case, an integer literal can be the first part of the operator; it gives the indentation level. For example:

0> this text is indented to the actual level and then output,
 >  followed by this text.\n

1> this text is indented one indention level\n
2> two levels\n
1> one level again\n

Quote text is output by the text function.

Block Quote >>

To include several lines of text in a single quoted area, use double >>. The lines are then concatenated. An example:

p   >>
    This generates a text paragraph for HTML. All this text, which you can find in
    these lines, is being concatenated together to one single text node, and then put
    into the body of the <p> ... </p> tag.
    >>

Block quote text is output by the text function.

Line Quote |

The | operator does the same as the > operator, adding a newline character to the text node.

Additionally, it can be used to implement an indentation system, see YSLT below.

In that case, it's used together with additional > symbols indicating the level of indentation:

| not indented
|> single indent
|>> double indent
(...)

Line quote text is output by the text function.

Block Line Quote ||

The || operator opens and closes a block of lines, which are then handled as if each of them were preceded by a Line Quote operator |.

Sample:

||
this is code being quoted through
this is the second line
||

is equivalent to:

| this is code being quoted through
| this is the second line

Block line quote text is output by the text function.

Inserting Commands

Just like with a Unix shell, you can insert statements into text by using backticks:

| Click `a href="http://fdik.org/yml/" "this link"`, please!

Inside a Block Line Quote ||, you can additionally use the Line Command operator (two backticks, ``).

This is particularly useful in YSLT, for example:

||
some code
``apply "myTemplate";
some other code
||

User defined in-text Operators

You can define shortcuts for inserting commands into text by defining operators.

For this, you need a regular expression to match text and YML text to replace it with. Here is an example of how YSLT uses this:

define operator "«(.*?)»" as value "%1";

The regular expression uses Python syntax.

In this example all matches to the RegEx will be replaced by the YML text in the as clause. The text of the first group in the RegEx will replace the %1 in the resulting YML text. You can do that for more than one group – just use %2 for the second group, %3 for the third one and so on.

The define keyword can be omitted.
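
The matching and the %1 substitution can be sketched with Python's re module. This is a model under assumptions rather than yml2's code: the function name apply_operator is invented, the sample text and $name are hypothetical, and wrapping the resulting YML text in backticks is only a guess based on the Inserting Commands section.

```python
import re

def apply_operator(pattern: str, template: str, text: str) -> str:
    """Replace each match of pattern by template, with %1, %2, ... filled in."""
    def expand(match: re.Match) -> str:
        # Substitute %N by the text of group N of the current match.
        command = re.sub(r"%(\d+)", lambda ref: match.group(int(ref.group(1))), template)
        # Assumption: the resulting YML text is spliced in as a backtick command.
        return "`" + command + "`"
    return re.sub(pattern, expand, text)

print(apply_operator("«(.*?)»", 'value "%1"', "Hello, «$name»!"))
# Hello, `value "$name"`!
```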

Quote Through ]

The Apple ][ prompt operator just quotes through directly into XML what it gets.

If the first character of a command is a <, then quote through is applied automatically.

This is the preferred way to output XML tags directly in YML:

<output me="directly" />

] <!--
]     add some comment, which then appears in XML
] -->

Including YML files

You can include a second YML script file into an existing YML script file at any place using one of the following:

include something.yml2
include "something else.yml2"
include 'anything addionally.yml2'

If the filename does not start with '.' or '/', as in the examples above, and the YML_PATH environment variable is set to a colon separated list of directories, these directories are searched for the given filename. Otherwise, the local directory is searched. The system location for .yml2 and .ysl2 files is always searched afterwards.
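
One reading of this search order, written as a Python sketch. The function name search_locations and the system directory /usr/share/yml2 are placeholders invented for illustration, and the sketch assumes that names starting with '.' or '/' are used as given, without any search.

```python
import posixpath

def search_locations(filename, yml_path=None, system_dir="/usr/share/yml2"):
    """Return the candidate paths for an include, in the order described above."""
    if filename.startswith((".", "/")):
        # Assumption: explicit relative or absolute names are not searched for.
        return [filename]
    if yml_path:
        # YML_PATH is a colon separated list of directories.
        directories = yml_path.split(":")
    else:
        directories = ["."]
    # The system location (a placeholder path here) always comes last.
    directories.append(system_dir)
    return [posixpath.join(directory, filename) for directory in directories]

print(search_locations("something.yml2", yml_path="/home/me/yml:/opt/yml"))
```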

Filename globbing using * and ? placeholders is supported to include more than one file at a time:

include part*.yml2

Filename globbing can also be reversed, so that the files are included in reverse order:

include reverse part*.yml2

If there are the files part1.yml2, part2.yml2 and part3.yml2, part3.yml2 is included first now.
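
The effect of reverse can be modelled with Python's fnmatch module. This is a sketch only: the function name include_order is invented, and it assumes that without reverse the matching files are taken in sorted order.

```python
from fnmatch import fnmatch

def include_order(filenames, mask, reverse=False):
    """Return the files matching mask in the order they would be included."""
    # Assumption: without "reverse", matches are taken in sorted order.
    matches = sorted(name for name in filenames if fnmatch(name, mask))
    return matches[::-1] if reverse else matches

files = ["part2.yml2", "part1.yml2", "part3.yml2", "other.yml2"]
print(include_order(files, "part*.yml2"))
# ['part1.yml2', 'part2.yml2', 'part3.yml2']
print(include_order(files, "part*.yml2", reverse=True))
# ['part3.yml2', 'part2.yml2', 'part1.yml2']
```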

To include plain text as text nodes, you can use:

include text some.txt

To include ready made XML, use:

include xml some.xml

If a pointer holds a file mask or a filename, you can include the files indirectly:

define files = "*.yml2"
include from *files

Escaping into Python – the Escape Operator !

You can insert a Python command at any place by using the ! operator:

!class X(str): pass

Python script Operator !!

You can use the double !! to include more than one line of Python code:

!!
def fak(n):
    if n == 0:
        return 1
    else:
        return n * fak(n - 1)

def getName(id):
    return SQL("select name from customers where id='"+str(id)+"';")[0]
!!

Python generated parameters in Function Calls

You may use Python expressions to generate names and/or values in Function Calls. To do so, embed the Python expression in ! ... !:

f x=!fak(5)!;
customer name=!getName(42)!;
tag !getNextAttributeName()!=42;
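
For the first call above, the embedded expression can be checked in plain Python. The sketch repeats the fak definition from the Python script block so that it runs on its own; that yml2 then uses the result as the value of x is taken from the description above, not verified here.

```python
def fak(n):
    # Factorial, as defined in the Python script block above.
    if n == 0:
        return 1
    else:
        return n * fak(n - 1)

print(fak(5))  # 120
```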

Python generated Function Calls

You can generate text with a Python expression which represents a YML Function Call. The resulting YML function is then executed. Here too, embed the Python expression in ! ... !:

!getTagName() + " " + getAttrib() + "='" + getValue() + "'"!
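
To see what text such an expression produces, it can be evaluated in plain Python. The three helper functions are not defined in this document; the definitions below and their return values are hypothetical, chosen only to make the sketch runnable.

```python
# Hypothetical helpers; their return values are made up for illustration.
def getTagName():
    return "a"

def getAttrib():
    return "href"

def getValue():
    return "#top"

# The expression from the example above, without the ! markers.
call = getTagName() + " " + getAttrib() + "='" + getValue() + "'"
print(call)  # a href='#top'
```

With these stand-in values, the generated text is the YML Function Call a href='#top'.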

Using Pointers as values in Python function calls

Sometimes it is useful to call a generating Python function using information from a YML Function Call. For this, there is the python statement in YML. There you can call a single Python function using YML Pointers as parameters.

This is used in the YSLT specification, for example:

decl apply(select, *indent=1) alias apply-templates {
    python withIndent(*indent);
    content;
};

Comments //, /* */

Comments are written like in Java or C++:

// this is a comment
something() { // this is a comment after a tag
    /* this is some comment, too */
}

After Quoting Operators, comments are not possible. Instead, they're quoted through:

// the following line you'll find in the output document
> this text is being output // and this too