Making the Ugly Elegant: Templating With DOM
- How It Works
- The Code
- Caveats
- The API
  - Instantiation
  - Shorthand XPath Syntax
  - (string)
  - repeat
    - next
  - setValue
  - set
  - addClass
  - append
  - remove
- History
Templating is easy to do in any particular way, but doing it right is hard. I can’t
count how many hip new template engines have popped up in the last few years alone. I’m about to add one to
the pile, but it is certainly not ‘hip’. It is, however, the closest I have ever gotten to the fabled golden
fleece of “100% separation”. Unlike most other forms of templating, this really doesn’t mix logic and
HTML, nor does it try to mask the blatant logic (“if this, then this”) by renaming ‘logic’ or using a
{{special syntax}}.
What we’re going to do is this: take a static (and I mean static) HTML page, load it into the DOM as
an XML tree and then use PHP as your logic, removing the bits of the template that are not needed and changing
the text about.
I got this idea from this blog post: “Your templating engine sucks and everything you have ever written is
spaghetti code (yes, you)”. The article itself is long, aggressive, rambling and fails to demonstrate the
principle concretely. I simply ignored all the text and focused on the core principle that was being noted:
instead of embedding some form of code in the HTML (even if it’s just evolved search / replace syntax), just
load the HTML into the DOM and manipulate it there so that the HTML itself is ignorant of the templating.
The reason why this is not just the same as a {{special-syntax}} is that we are not mixing two
different languages, syntaxes or programming models in one HTML file. If you change your templating engine,
it’s still HTML. If you change your logic, it’s still HTML. Special syntaxes invent another language to intermix
with HTML and thus add programmatic concepts to a declarative syntax—which is not clean
separation no matter what you name it.
By doing it this way, the HTML file itself can be designed independently of the software, and whoever does the
HTML doesn’t have to know PHP. You could change the whole server language and it wouldn’t change the template
one bit. More importantly, you can actually view the whole look of the template in the browser without running
the software. The reason I’m adopting this templating approach for NoNonsense Forum is to make it easier for
anybody to modify the look of their forum without having to learn PHP, and hopefully to encourage more
contribution from all skill levels.
It took a few revisions, two weeks and a lot of head-wracking to beat the DOM into something elegant,
but here it is, NoNonsense Templating:
How It Works
The first thing to wrap your head around is that DOM templating works on the principle of mostly taking
away rather than adding. Logic-wise this is more difficult to get used to than you would think; you will be used to
adding data according to logic rather than “if this, then remove the thing that it is not”.
Firstly, your template should be a static HTML page that contains all of the content and ‘possibilities’ of your
output, from which we will remove whatever is not relevant to the page. For example:
<p id="login" class="logged-out">
You are not logged in.
</p>
<p id="login" class="logged-in">
You are logged in as <b class="username">Bob</b>
</p>
In the PHP we can modify the HTML this way:
(Please note that templates you load must be valid XML and have a single root node, e.g. `<html>`, in order
to work; the examples in this article omit this for simplicity. See the XML caveats for more details.)
//load the template and provide an interface
$template = new DOMTemplate (file_get_contents ('test.html'));
//let's imagine the user is logged in: remove the logged-out section and set the username
$template->remove ('.logged-out');
$template->setValue ('.username', 'Alice');
The `remove` call finds all elements that have a class of `logged-out` and deletes them (you can also refer
to IDs using `#id`). The `setValue` method sets the text-content of an element, removing anything that was
within. Because the element content is replaced, you can provide dummy text to test the look and feel of your
template, and it will be replaced with the real data.
Behind the scenes `.logged-out` becomes the full XPath query `.//*[contains(@class,"logged-out")]`. The
shorthand syntax also supports specifying a required element type and/or an attribute to target, e.g.:
$template->setValue ('a.my-button@href', '/some_url');
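To make the expansion concrete, here is a minimal sketch that uses PHP's built-in DOM extension directly instead of DOMTemplate. It runs the expanded XPath by hand to do roughly what `remove` and `setValue` are described as doing. The markup and variable names are made up for illustration, and DOMTemplate's own internals may well differ.

```php
<?php
// a stand-in template with a single root node, as the DOM requires
$xml = '<div>'
     . '<p class="logged-out">You are not logged in.</p>'
     . '<p class="logged-in">You are logged in as <b class="username">Bob</b></p>'
     . '</div>';

$doc = new DOMDocument ();
$doc->loadXML ($xml);
$xpath = new DOMXPath ($doc);

// roughly what `remove ('.logged-out')` has to do: find the nodes, then detach them.
// the node list is copied to an array first so that removal cannot disturb the iteration
$matches = iterator_to_array ($xpath->query ('.//*[contains(@class,"logged-out")]'));
foreach ($matches as $node) {
    $node->parentNode->removeChild ($node);
}

// roughly what `setValue ('.username', 'Alice')` has to do: replace the element's text
foreach ($xpath->query ('.//*[contains(@class,"username")]') as $node) {
    $node->nodeValue = 'Alice';
}

$result = $doc->saveXML ($doc->documentElement);
echo $result, "\n";
```

The dummy name “Bob” only ever exists in the static file; the output carries the real value.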
You can also use full XPath syntax:
//if using HTTPS, change the Google search box to use HTTPS too
if (@$_SERVER['HTTPS'] == 'on') $template->setValue (
'//form[@action="http://google.com/search"]/@action',
'https://encrypted.google.com/search'
);
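An XPath that ends in `/@action` selects the attribute node itself rather than the element that carries it, which is why the same kind of call can rewrite either. A small sketch of that idea with the plain DOM extension follows; the form markup is a made-up stand-in and this is not DOMTemplate's actual code.

```php
<?php
$doc = new DOMDocument ();
$doc->loadXML ('<form action="http://google.com/search"><input name="q"/></form>');
$xpath = new DOMXPath ($doc);

// the query ends in `/@action`, so each result is an attribute node, not the <form>
$attributes = $xpath->query ('//form[@action="http://google.com/search"]/@action');
foreach ($attributes as $attribute) {
    $attribute->nodeValue = 'https://encrypted.google.com/search';
}

$action = $doc->documentElement->getAttribute ('action');
echo $action, "\n";
```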
Looping is always a sore point in templating. How do you take a chunk and repeat it down the page without having to
define a ton of logic in your templates?
Looping with the DOM is shockingly elegant!
$item = $template->repeat ('.list-item');
foreach ($data as $value) {
$item->setValue ('.item-name', $value);
$item->next ();
}
The `repeat` method takes an element (via shorthand/XPath) to be used as the repeating template and copies it,
then you just `set` and `remove` elements from the repeating template as if it were its own template. Once
you’ve templated that iteration you call the `next` method and the HTML is added after the previous element,
then the template repeater resets itself back to the original HTML so you can template it again!
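If you are curious how such a repeater can be built on the DOM, here is a simplified sketch using `cloneNode`. It is not DOMTemplate's implementation: instead of tracking the previously added element, it inserts each filled-in copy before an untouched pattern element and removes the pattern at the end, which gives the same sibling order. The list markup and data are invented for the example.

```php
<?php
$doc = new DOMDocument ();
$doc->loadXML ('<ul><li class="list-item"><b class="item-name">Dummy</b></li></ul>');
$xpath = new DOMXPath ($doc);

// the element to repeat; it stays untouched in the document until the very end
$pattern = $xpath->query ('.//*[contains(@class,"list-item")]')->item (0);

foreach (array ('Apples', 'Pears', 'Plums') as $value) {
    // a deep copy carries the original content, so every iteration starts fresh
    $copy = $pattern->cloneNode (true);
    // inserting before the pattern keeps the copies in data order, as siblings
    $pattern->parentNode->insertBefore ($copy, $pattern);
    // template the copy, searching relative to it rather than the whole document
    foreach ($xpath->query ('.//*[contains(@class,"item-name")]', $copy) as $node) {
        $node->nodeValue = $value;
    }
}
// finally drop the pattern so its dummy text never reaches the output
$pattern->parentNode->removeChild ($pattern);

$list = $doc->saveXML ($doc->documentElement);
echo $list, "\n";
```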
Once you’ve made all your changes to the template, just retrieve the final HTML and output.
die ($template);
See the API for details of all the functions.
The Code
If you would like to see a real-world use of this templating system, with a ton of examples drawn from real,
practical code, you can examine the source code of my forum system, NoNonsense Forum.
If you don’t like the idea of targeting classes or IDs in your HTML, have a look at v4 of DOMTemplate, which
finds elements according to data-template attributes.
Caveats
- Whitespace handling is good, but not perfect
  In the case of repeating an element the whitespace within is kept, but the whitespace outside the
  element is not. This is not a major problem, it just means that the closing and opening tags of
  your lists will be paired (e.g. “…</li><li>…”).
  The biggest issue is that when elements are removed, the whitespace around them remains, meaning
  that you get a number of blank lines in the output HTML where the elements used to be. There’s
  no direct way of handling this other than perhaps using a search/replace to remove blank lines in
  the HTML after it’s been templated.
  One benefit of using the DOM, however, is that if you want to minify the HTML a little,
  you can just add “$this->DOMDocument->preserveWhiteSpace = false;” to the
  constructor function of DOMTemplate and the markup will be returned as a big blob
  with few line-breaks.
  If you add “$this->DOMDocument->formatOutput = true;” instead, the markup
  will be ‘tidied’ for you, re-nesting the elements neatly in an easy to read fashion.
- XML woes
  DOMTemplate stores and manipulates the template internally as strict XML. Thankfully, since
  v16, DOMTemplate automatically converts your source HTML to XML on
  loading and converts from XML to HTML on output, thus alleviating most of the input-strictness
  problems with earlier versions. There are, however, still a few caveats to remember:
  - HTML must be valid
  - The automatic conversion of HTML named-entities (invalid in XML) into
    Unicode is still not comprehensive. 248 of the most common are covered, but a
    total of over 2100 exist. DOMTemplate may in a future version cover all 2100+ named
    entities, but until then ensure that your HTML source does not use any
    named-entities outside of the 248 recognised by DOMTemplate. Recent PHP
    versions appear to return the complete set now
  - HTML that you load through instantiation, or apply to the template using `setValue`,
    must have only one root node, i.e. a list of elements cannot be
    used unless wrapped by an element
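The search/replace mentioned in the whitespace caveat could look something like the sketch below. It is only one possible approach: the sample output string is invented, and the pattern strips every line that holds nothing but spaces or tabs, so it would also eat intentional blank lines inside something like a `<pre>` block.

```php
<?php
// hypothetical templated output where removed elements have left blank lines behind
$html = "<ul>\n\t<li>One</li>\n\t\n\n\t<li>Two</li>\n</ul>\n";

// in multi-line mode `^` matches at the start of every line; delete each line that
// contains only spaces and tabs, together with its line-break
$clean = preg_replace ('/^[ \t]*\R/m', '', $html);

echo $clean;
```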
The API
Instantiation
Provide the HTML to load as a string when instantiating the template class. It must be valid and have only one root
element (e.g. `<html>`).
$template = new DOMTemplate (file_get_contents ('index.html'));
If you are loading an XHTML document, or any XML file with a default namespace (e.g.
`<html xmlns="http://www.w3.org/1999/xhtml">`), you must specify a prefix (any will do)
and the namespace URL like so:
$template = new DOMTemplate (file_get_contents ('index.html'), 'html', 'http://www.w3.org/1999/xhtml');
All XPath queries you make with this template must prefix element names with the namespace, including for
the shorthand:
$template->setValue ('//html:title', 'Hello World'); //XPath
$template->setValue ('html:a#my-button@href', 'http://google.co.uk'); //shorthand
This bizarre requirement is a limitation in the design of XPath itself.
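You can see the limitation for yourself with nothing but PHP's `DOMXPath`. In the sketch below (the document is a made-up minimal XHTML page) an unprefixed query finds nothing, because in XPath 1.0 an unprefixed name only matches elements that are in no namespace at all; once a prefix is registered for the namespace URL, the prefixed query works.

```php
<?php
$doc = new DOMDocument ();
$doc->loadXML (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Old</title></head><body/></html>'
);
$xpath = new DOMXPath ($doc);

// <title> lives in the XHTML namespace, so the bare name does not match it
$plain = $xpath->query ('//title')->length;

// bind a prefix of our own choosing to the namespace URL, then use it in the query
$xpath->registerNamespace ('html', 'http://www.w3.org/1999/xhtml');
$prefixed = $xpath->query ('//html:title')->length;

echo "unprefixed: $plain, prefixed: $prefixed\n";
```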
Shorthand XPath Syntax
- All of the methods that accept a query (`setValue`, `set`, `addClass`, `append`, `remove` & `repeat`) use a
  shorthand-syntax where you only need to provide the class (`.class`) or ID (`#id`) you want to target and the
  full XPath query is built for you. E.g. `.my-button`
- An element type can be provided: `a#my-button`
- An attribute name can be provided which will be the target of the `setValue`, `set` and `remove` methods:
  `a#my-button@href`
- You can test attributes for values (the element will be selected, not the attribute): `label@for="submit"`
- You can specify the index of an element to select: `li[1]`
- You can select child elements: `#list/li/a`
- You can also just use a full XPath query, as-is: `/html/head/title`
- You can provide multiple targets by separating the queries with commas, e.g.: `.header, .body, .footer`.
  You can intermix shorthand and full XPath like this.
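To give a feel for how a shorthand like this can be expanded, here is a deliberately tiny converter. The function name `shorthand_to_xpath` is hypothetical and this is not DOMTemplate's parser: it only understands an optional element name, one `#id` or `.class`, and an optional `@attribute`, and it passes everything else through as if it were full XPath. The class expansion follows the query shown earlier in this article; the `[@id="…"]` form for IDs is my assumption.

```php
<?php
// hypothetical helper, not part of DOMTemplate: expands a small subset of the shorthand
function shorthand_to_xpath (string $query): string {
    // optional element name, optional #id or .class, optional @attribute
    $pattern = '/^([a-z][a-z0-9:-]*)?(?:([#.])([\w-]+))?(?:@([\w-]+))?$/i';
    if (!preg_match ($pattern, $query, $m)) {
        // anything else is assumed to be full XPath already and is left untouched
        return $query;
    }
    // groups that did not take part in the match may be missing, hence the `??`
    $element = ($m[1] ?? '') !== '' ? $m[1] : '*';
    $xpath = './/' . $element;
    if (($m[2] ?? '') === '#') $xpath .= '[@id="' . $m[3] . '"]';
    if (($m[2] ?? '') === '.') $xpath .= '[contains(@class,"' . $m[3] . '")]';
    if (($m[4] ?? '') !== '')  $xpath .= '/@' . $m[4];
    return $xpath;
}

echo shorthand_to_xpath ('.logged-out'), "\n";
echo shorthand_to_xpath ('a#my-button@href'), "\n";
echo shorthand_to_xpath ('/html/head/title'), "\n";
```

Index matching, attribute tests, child paths and comma-separated targets are left out to keep the sketch short.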
(string)
To get the HTML out of the template, cast the template class object to a string,
e.g.:
$template = new DOMTemplate ('<span>test</span>');
echo $template;
In instances where the intended type is ambiguous, use PHP’s casting syntax to force a string conversion:
$html = (string) $template;
repeat
`repeat (string $query)`
Takes a shorthand XPath query and returns a `DOMTemplateRepeaterArray` object instantiated with the element(s)
selected in the query. This object supports the `set`, `setValue`, `addClass`, `append` & `remove` methods, in
addition to the following method:
`next`
Takes the current HTML content of the elements within the `DOMTemplateRepeaterArray` object and appends it
as a sibling to the previously repeated template (i.e. either the element(s) you
instantiated the repeater with, or the element(s) that were added by the previous call to the `next`
method), then resets its HTML content back to the original HTML it had when it was created.
In simple terms, it adds the templated HTML to the end of a list and then resets it back to the original HTML,
to be used again. In practical terms, like this:
$item = $template->repeat ('.list-item');
foreach ($data as $value) {
$item->setValue ('.item-name', $value);
$item->next ();
}
setValue
`setValue (string $query, string $value, [bool $asHTML=false])`
Replaces the content of all elements matched with the shorthand XPath query with the
given value. The string value is HTML-encoded (unless you give `asHTML` as true), so any HTML in the
value will be displayed as literal text, rather than be rendered as HTML. This method intelligently sets the
value to elements, attributes and classes according to the XPath used. See `addClass` for
details on HTML class behaviour.
$template->setValue ('#name', 'Kroc');
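The difference between targeting an element and targeting an attribute, and the encoding of the value, can be illustrated with the plain DOM. The helper name `set_node_text` below is hypothetical and the sketch is not DOMTemplate's code; it just shows one way the same call can do the right thing for either kind of node.

```php
<?php
// hypothetical helper showing the element/attribute distinction
function set_node_text (DOMNode $node, string $value) {
    if ($node instanceof DOMAttr) {
        // the query ended in `/@name`, so rewrite that attribute on its owner element
        $node->ownerElement->setAttribute ($node->name, $value);
        return;
    }
    // otherwise empty the element and add the value as a text node; the serialiser
    // escapes it on output, so markup in the value is shown rather than rendered
    while ($node->firstChild) $node->removeChild ($node->firstChild);
    $node->appendChild ($node->ownerDocument->createTextNode ($value));
}

$doc = new DOMDocument ();
$doc->loadXML ('<p><a id="site" href="#">Dummy</a></p>');
$xpath = new DOMXPath ($doc);

set_node_text ($xpath->query ('//a[@id="site"]')->item (0), '<b>Kroc</b>');
set_node_text ($xpath->query ('//a[@id="site"]/@href')->item (0), 'http://camendesign.com');

$encoded = $doc->saveXML ($doc->documentElement);
echo $encoded, "\n";
```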
set
`set (array $queries, [bool $asHTML=false])`
Allows you to write code in a more compact way by specifying an array of shorthand XPath
queries and their associated value to set.
$template->set (array (
'#name' => 'Kroc',
'#site' => 'http://camendesign.com'
));
addClass
`addClass (string $query, string $class)`
Adds the specified HTML class name to every element matched with the shorthand XPath
query. If an element already has a class attribute, multiple class names will be separated by spaces when the
new class is added.
$template->addClass ('#section', 'open');
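Space-separated class handling is simple enough to sketch with the plain DOM. The helper name `add_class` is hypothetical, and skipping a class name that is already present is my own assumption rather than documented DOMTemplate behaviour.

```php
<?php
// hypothetical helper: add a class name to one element, keeping any it already has
function add_class (DOMElement $element, string $class) {
    // split the existing attribute on whitespace, ignoring empty pieces
    $classes = preg_split ('/\s+/', $element->getAttribute ('class'), -1, PREG_SPLIT_NO_EMPTY);
    // assumption: do not add the same name twice
    if (!in_array ($class, $classes, true)) $classes[] = $class;
    $element->setAttribute ('class', implode (' ', $classes));
}

$doc = new DOMDocument ();
$doc->loadXML ('<section id="section" class="panel"/>');
add_class ($doc->documentElement, 'open');

$classAttr = $doc->documentElement->getAttribute ('class');
echo $classAttr, "\n";
```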
append
`append (string $query, string $content)`
Appends content to the end of the inside of any element(s) matched by the
shorthand XPath query. E.g.:
<article>
Stuff here
⋮
<== Append new content here
</article>
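In plain DOM terms, appending markup to the inside of an element can be done with a document fragment. The sketch below shows that general technique on invented markup; whether DOMTemplate does it this way internally is not something this article states.

```php
<?php
$doc = new DOMDocument ();
$doc->loadXML ('<article><p>Stuff here</p></article>');
$xpath = new DOMXPath ($doc);

foreach ($xpath->query ('//article') as $article) {
    // parse the new markup into a fragment; it must itself be well-formed XML
    $fragment = $doc->createDocumentFragment ();
    $fragment->appendXML ('<p>New content</p>');
    // appendChild places the fragment's nodes after the element's last child
    $article->appendChild ($fragment);
}

$appended = $doc->saveXML ($doc->documentElement);
echo $appended, "\n";
```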
remove
`remove (string $query | array $queries)`
Deletes all the elements (and their children) matched with the shorthand XPath query.
$template->remove ('.secret-stuff');
Also accepts an array in the format of `'xpath' => true|false`.
If the value is false, the XPath will be skipped. This allows you to write compact removal code by not having to
write `if (x) $template->remove ('y');` several times in a row, e.g.:
$template->remove (array (
'.section-1' => $section == 1,
'.section-2' => $section == 2,
⋮
));
For a good example of this style of writing, see the code for NoNonsense Forum.
In addition to this behaviour, you can also remove class names from a class attribute, whilst retaining any other
class names present, by specifying the class name to remove in the value when targeting a class attribute with
the XPath, thusly:
$template->remove (array ('a@class' => 'undesired'));
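Removing a single class name while keeping the others comes down to a little string handling on the attribute. The helper name `remove_class` is hypothetical, and dropping the attribute entirely once it is empty is my assumption, not documented behaviour.

```php
<?php
// hypothetical helper: drop one class name from an element, keeping the others
function remove_class (DOMElement $element, string $class) {
    $classes = preg_split ('/\s+/', $element->getAttribute ('class'), -1, PREG_SPLIT_NO_EMPTY);
    $kept = array_values (array_diff ($classes, array ($class)));
    if ($kept) {
        $element->setAttribute ('class', implode (' ', $kept));
    } else {
        // assumption: nothing left, so remove the now-empty attribute altogether
        $element->removeAttribute ('class');
    }
}

$doc = new DOMDocument ();
$doc->loadXML ('<a class="button undesired primary" href="#">Go</a>');
remove_class ($doc->documentElement, 'undesired');

$remaining = $doc->documentElement->getAttribute ('class');
echo $remaining, "\n";
```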
History
- v20
  - Switch to PHP7 as a minimum requirement, use a namespace
  - Added `append` method
  - Fixes for empty elements when converting from XML to HTML, and a regression in `repeat`, with thanks
    to Mauskin
- v19 Add more void elements
- v18 Three community bug fixes:
  - Eric Desbiens (olace): Adding class to an element that already had a class would fail
  - Zegnat: iframes should not self-close
  - Peter: Typo with `$this::XML` should be `$this->XML`
- v17 Fixed regex bug where the same letter either side of an equals sign was being removed
- v16 Filtering of HTML on input and output, removing the strict-XML requirement for source
  text. The `html` method was removed in favour of casting the class to a String
- v15 Throw an exception for invalid XPath queries or HTML
- v14 XPaths are cached for speed
- v13 Multiple XMLNS support
- v12 Ability to remove classNames using `remove` method
- v11 Changed instantiation to use a string instead of a filename
- v10 `repeat` now works simultaneously with multiple elements instead of just one
- v9 Greatly improved shorthand XPath syntax adding index matching, child matching &
  attribute testing
- v8 Changed `setValue` to intelligently apply to elements, attributes or
  classes, with a parameter to include HTML as-is (`setHTML` was removed)
- v7 XML prolog is kept if already present and UTF-8 characters are no longer hex-encoded
- v6 XML namespace support. Also, template repeating now appends as a sibling, not as the
  last child of the parent (removes the need for a superfluous parent element).
- v5 New shorthand XPath syntax for classes and IDs instead of `data-template` attributes
- v4 Added multiple XPath targets
- v3 Added method chaining
- v2 Added HTML entity decoding
- v1 Initial release