How does it work?

Martin Döring — Wednesday, 1st of February 2017

In short: a wiki is a text formatter. It loads text files from the filesystem, identifies special tokens like URLs, lists or headlines, and transforms the text into a nice-looking webpage. It also lets you edit the original text pages through the web browser.

The parser

To go into a bit more detail, I would now like to explain how the wiki syntax parser works. The function that does all this is called '''wiki_to_html'''. It is used for generating the webpages and for generating the content part of the RSS feed.

First, every line ending with carriage return and linefeed is changed to end with just a linefeed, which makes everything that follows easier.
Then we search for several patterns, such as URLs to other sites, pictures to include and links to wiki pages, each with its own syntax. The trick is that these URLs and so on are cut out and replaced by placeholders, so that no HTML-specific remnants are left in the text. The cut-out snippets are stored in a "Reference" with their placeholders as index, so that later they can be put back into the text, but as valid HTML constructs.
Now that the URLs and so on are parked and no meaningful HTML entities are part of the text anymore, the text is made HTML-conformant by the PHP function htmlentities(). For example, the '''&''' sign is converted to '''&amp;'''. Now we have normal, HTML-conformant text.
Next, all lines that end with a linefeed are converted to paragraphs. And the crazy thing is: everything is a paragraph at first! That means headings, lists, normal text and so on.
Then we search for text constructs that look like headings, lists and so on AND that are also surrounded by paragraph tags, like '''<p>text text</p>''', and convert them to the proper HTML elements.
We also scan for mail addresses. Since mail addresses contain no HTML special characters, there is no need to park them in the '''Reference''' as explained above.
Now that nearly everything is done, we swap the placeholders back for their parked snippets.
Finally, we convert italic and bold text parts to their HTML equivalents, and we are done.
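
The steps above can be sketched roughly as follows. This is only an illustration in Python, not the wiki's actual PHP code: the name wiki_to_html is taken from the text, but the placeholder format, the regular expressions and the "= Title =" heading syntax are assumptions made for the example, and html.escape() is only a rough stand-in for PHP's htmlentities().

```python
import html
import re


def wiki_to_html(text):
    # 1. Normalise CRLF line endings to plain LF.
    text = text.replace("\r\n", "\n")

    # 2. Park URLs: cut each one out and remember its HTML form under a
    #    placeholder key (the key format is an assumption for this sketch).
    reference = {}

    def park(match):
        key = "\x00%d\x00" % len(reference)
        url = html.escape(match.group(0), quote=True)
        reference[key] = '<a href="%s">%s</a>' % (url, url)
        return key

    text = re.sub(r'https?://[^\s<>"]+', park, text)

    # 3. Escape what is left (rough stand-in for PHP htmlentities()).
    text = html.escape(text, quote=False)

    # 4. Every non-empty line becomes a paragraph at first.
    lines = [line for line in text.split("\n") if line.strip()]
    text = "\n".join("<p>%s</p>" % line for line in lines)

    # 5. Re-tag paragraphs that look like headings
    #    ("= Title =" is a hypothetical heading syntax).
    text = re.sub(r"<p>=\s*(.+?)\s*=</p>", r"<h1>\1</h1>", text)

    # 6. Mail addresses contain no HTML special characters,
    #    so they can be linked directly without parking.
    text = re.sub(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b",
                  r'<a href="mailto:\g<0>">\g<0></a>', text)

    # 7. Swap the placeholders back for the parked HTML snippets.
    for key, snippet in reference.items():
        text = text.replace(key, snippet)

    # 8. Bold before italics, because ''' also contains ''.
    text = re.sub(r"'''(.+?)'''", r"<b>\1</b>", text)
    text = re.sub(r"''(.+?)''", r"<i>\1</i>", text)
    return text


if __name__ == "__main__":
    print(wiki_to_html("= Title =\r\nSee http://example.org/?a=1&b=2 & '''more'''\r\n"))
```

The point of parking is visible in step 3: the '''&''' inside the URL is escaped once when the link is built, while the free-standing '''&''' in the text is escaped by the general pass, and neither is touched twice.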

The template.html

Working with the template.html file is much simpler. It is still valid HTML, but enriched with some special placeholders. The template is read in and every placeholder is replaced by its real value, for example the page title, the wiki author, the time in ISO format and some others. Finally the page text, previously converted by our parser above, is filled in as well. The placeholder for this is called CONTENT.
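
A minimal sketch of this replacement step, again in Python rather than the wiki's PHP. Only the placeholder name CONTENT comes from the text above; the function name fill_template, the placeholder TITLE and the sample template are hypothetical. The sketch replaces all placeholders in a single pass, so that a placeholder name that happens to appear inside the page text is not replaced a second time. Whether the real wiki takes this precaution is not stated.

```python
import re


def fill_template(template, values):
    # values maps placeholder names (e.g. "CONTENT") to their replacement text.
    if not values:
        return template
    # Longest names first, so that a name which is a prefix of another
    # name does not win the match.
    names = sorted(values, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    # Single pass: inserted values are never scanned for placeholders again.
    return pattern.sub(lambda m: values[m.group(0)], template)


if __name__ == "__main__":
    # Hypothetical template; the real template.html is not shown in the text.
    template = "<html><head><title>TITLE</title></head><body>CONTENT</body></html>"
    print(fill_template(template, {
        "TITLE": "How does it work?",
        "CONTENT": "<p>A wiki is a text formatter.</p>",
    }))
```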

Well, none of this is much magic, but getting it to work more or less the way I want, and free of parsing errors, was not that easy.