Camen Design

c share + remix linear_recursion

Creating a Nested HTML Comment Thread From a Linear List, by Magic!

If you have ever created a comment system for a website that supports nested replies you will know that one of the messiest parts of the code is taking a flat list of comments from the database, where usually you have a field for each comment that states what the parent comment is to that comment, and outputting a nested HTML structure where replies sit within their respective comments.

Screenshot of a typical comment thread
A typical comment thread showing nested replies

Normally, the method to deal with this is to convert this linear list into a node tree, replicating the relationships of the comments in a programming object which you can then recurse through and output.

The problem with this approach is that it’s hard to do and requires lots of separate functions and objects and you end up writing a thousand lines of code in order to do it properly and cleanly.

The alternative method is to do it procedurally, limit the nesting depth to a certain amount and just write absolutely crazy, insane kitten-killing, spaghetti code (with GOTOs) that no mother could love. It’ll get the job done quickly, but it will eat at your soul all the way from the hard drive.

I was not prepared to do it either way, so I invented a hybrid approach that achieves the end result in very little code and as elegantly as possible. The down side is that you should never code this way unless you know what you are doing!

There once was a master programmer who wrote unstructured programs. A novice programmer, seeking to imitate him, also began to write unstructured programs. When the novice asked the master to evaluate his progress, the master criticized him for writing unstructured programs, saying, “What is appropriate for the master is not appropriate for the novice. You must understand the Tao before transcending structure.”

The Tao of Programming

My code linearly traverses an array using a recursive function. That, in just about any programming school of thought is bad coding and not to be done! However, I am a hacker, and transcend to a skill level where all wrongs are also tools to be used and abused for art. This code works, and that is all that matters—there is no good or bad, just code.

Initialise Data

Let’s initialise the code using an example table of data, like this. You could imagine that this has been pulled from an SQL query. Each comment has a parent field that will point to the ID of the comment it is in reply to, a parent of “0” will mean the comment is at root level.

$comments = array (
	1 => array (
		"parent" => 0,	"datetime" => "2011/01/04 15:25:00", "name" => "Alice",
		"text"   =>	"Mario is better than Sonic! Mario has always had successful games, where".
				"as Sonic’s games have been sucking for a decade"
	2 => array (
		"parent" => 1,	"datetime" => "2011/01/04 15:27:00", "name" => "Bob",
		"text"   =>	"Er, aren’t we forgetting Hotel Mario?"
	3 => array (
		"parent" => 0,	"datetime" => "2011/01/04 15:32:00", "name" => "Alex",
		"text"   =>	"*Sits in the corner curled into a ball rocking…* Don’t want Shadow… Don’t want Shadow…"
	4 => array (
		"parent" => 1,	"datetime" => "2011/01/04 16:02:00", "name" => "Bill",
		"text"   =>	"At least Sega have been trying new things; Nintendo have been making the same damn".
				"game for the last 25 years!"
	5 => array (
		"parent" => 4,	"datetime" => "2011/01/04 16:14:00", "name" => "Charlie",
		"text"   =>	"Everything after Sonic and Knuckles doesn’t exist in my eyes."

Nest and Sort

First, we need to sort these comments. Sorting by date alone won’t really work because the newest comment might be a reply to an old comment in the middle of the thread. We need to first get the comment nesting sorted and then sort each branch of the tree by date.

The answer to this that I stumbled upon is that when you sort words alphabetically the presence of two words that start with the same letter is handled by comparing the second letters, and if they are the same, so on through the letters until some result is reached. I imagined an alphabetic structure below where each reply inherited the parent’s identifier, and each new branch increased the identifier in alphabetical order:

A	<- first comment
A|B	<- child comment
A|B|C	<- another child
A|B|D	<- sibling
E	<- sibling of first comment

We achieve this by encoding the comment date/time to base 36 (A-Z, 0-9) which can easily be sorted. When a comment has a parent, we chain the “sort codes” together to make sibling comments share part of the same code, and thus appear next to each other.

function sortcode (&$comments, $comment_id) {
	$parent = @$comments[$comment_id]['parent'];
	//if a parent exists, get its sort code first
	return ($parent ? sortcode ($comments, $parent).'|' : '').
		//and then append this comment’s code (an alpha-numeric version of the comment timestamp)
		base_convert (strtotime ($comments[$comment_id]['datetime']), 10, 36);

//apply the sort codes to the comments, and also collect them into an array which we will use to sort
//the `$comments` array to have comments in the correctly nested order
foreach ($comments as $comment_id => &$comment)
	//can’t use `$comment['sort]` because you can’t add to an array whilst inside it with foreach (byref)
	$sort[] = $comments[$comment_id]['sort'] = sortcode ($comments, $comment_id)
//sort the comments using the sort codes. this works by having an array of sortable alpha-numeric keys that are in the same
//original order as the comments that they came from. `array_multisort` then sorts both, matching any change in order in
//`$sort` with `$comments` too
array_multisort ($sort, SORT_ASC, $comments);

Now we have a linear array, but sorted according to the nesting depth and time of the comment!

Output HTML

//generate the nested HTML from the flat list of comments
reset ($comments); echo template_comments ($comments);

function template_comments (&$comments) {
	//retrieve the current comment from the list. the first time this function is called, this will obviously be the
	//first comment--though ensure you have called `reset ()` on the array if it’s been used previously!
	$comment = current ($comments);
	//the sort code that we applied earlier strings together comment timestamps in a chain according to depth,
	//i.e. "ABC|DEF|GHI". by splitting the sort code we can determine the current nesting depth of the comment
	$this_depth = count (explode ('|', $comment['sort'])) - 1;
	//prepare the current comment. note how this is stored in a variable with no context to the previous comments.
	//this is where the recursion comes into play, appending trees of comments together whilst still operating on a
	//array, travelled linearly
	$html = <<<HTML
	<header>by:${comment['name']} on:${comment['datetime']}</header>

	//we leave `<article>` open, as any child comments will go inside
	//get the next comment: if we have run out of comments, then finish the HTML for this comment
	if (!$comment = next ($comments)) return $html . "</article>\n";
	//determine the depth of the next comment so that we know if this will be added in parallel,
	//as a child, or fold back to the parent
	$next_depth = count (explode ('|', $comment['sort'])) - 1;
	//next comment will be a child?
	if ($next_depth > $this_depth) {
		//insert the next comment within the current one (child). note that this recursive call will continue to
		//insert child and sibling comments until that branch is complete and returns back to this point here
		$html .= template_comments ($comments);
		//NOTE: when we recursed (above), it will have moved the array pointer forward 1 *OR MORE* places;
		//which means that `$comment` is no longer pointing to the current location in the comment list!
		//let’s get a new reference (if we’ve not run out of comments)
		if (!$comment = current ($comments)) return $html . "</article>\n";
		//get the depth level of this next comment; we’ve handled all children already so it could only be one of
		//two possibilities: either it’s a sibiling of the comment we started out with at the top of this function
		//or it’s part of a parent thread and this particular tree has stopped branching
		$next_depth = count (explode ('|', $comment['sort'])) - 1;
	//is the next comment a sibling of the current comment?
	//(that is, it is another reply to the same parent comment)
	if ($next_depth == $this_depth) 
		//append the next comment at the same level, no nesting
		$html .= "</article>\n\n" . template_comments ($comments)
	//lastly, when all other options have been expended, the next comment is not part of this thread
	//e.g.	A
	//	A|B
	//	A|B|C
	//	A|B|D
	//	E	<- this comment is the beginning of a new thread
	//we fold back through the recursion, this will automagically drop us back to the correct recursion level where
	//the next comment is a sibling of another comment
	return $html . "</article>\n";

The final code (attached to this article) outputs a correctly nested set of comments:

Screenshot of nested comments in a web browser

I hope this has been most educational.

Kind regards,
Kroc Camen.