ZpqrtBnk

So long, XML

Posted on January 21, 2015 in umbraco

One of the reasons for the success of Umbraco is probably that rendering pages is fast, and that, in turn, is due to all content being cached in memory at all times. Navigating and rendering content does not hit the database.

That memory cache has always been composed of one big XML document representing the whole content tree. That was fine, because content was hierarchically organized, and it allowed for fast and easy processing of the whole tree with XSLT—which was the only way to render content.

And then things started to evolve.

Microsoft introduced MVC and Razor, and some people considered it a good alternative to XSLT (though this is still a debated subject).
Umbraco's back-office adopted the latest web technologies (Angular...) and migrated to more JavaScript-oriented concepts, such as serializing content to JSON.
Content evolved to become more and more tabular and relational, and to outgrow the strict structure of the tree (see What becomes of XSLT).
Some issues with having one big XML document became more pressing, such as concurrency or preview management.

However, all of Umbraco's rendering layers were built on top of the XML cache, and the XML cache was everywhere, so it was difficult to even start thinking about changes... without breaking everything.

So for some time now, work has been in progress in the dev-v7-contentcache branch to isolate the "content cache" into a component with a well-defined contract or interface, so that it later on becomes possible to think about different implementations of that cache.

Today's Progress Status report

As of Jan. 21st, 2015, this blog is dogfooding the dev-v7-contentcache branch. The cache implementation in use is still the XML cache, but it is entirely abstracted and lives in one well-isolated component (Umbraco.Web.PublishedCache.XmlPublishedCache).

Don't get too excited: it is not ready for production, but it works. Some work remains to ensure that the cache is properly refreshed when "things" happen to content or to content types. And then, it needs to be heavily tested.

But we're getting closer, and already have a proof-of-concept, fully object-oriented cache that should land in the Git branch soon.

The cache contract

So... this is the death of XSLT, right? Not quite.

Sure, the big central XmlDocument instance does not exist anymore—in fact, the whole content class is gone. But the cache contract specifies that a content cache must be IXPathNavigable—in other words, that whatever the implementation, it is possible to invoke cache.CreateNavigator() and obtain an XPathNavigator instance.

And this is all the .NET Framework needs to execute XSLT transforms or XPath queries. The dev-v7-contentcache branch also contains a cache implementation, PublishedNoCache, that basically does not cache anything and hits the database every time content is requested. Obviously, it is insanely slow, but it runs XSLT macros just fine.
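To make the point concrete, here is a minimal sketch of an XSLT transform running against something that is only known to be IXPathNavigable. DemoContentCache and XsltOverCache are hypothetical names invented for this illustration and are not part of Umbraco; the demo backs the navigator with an XPathDocument purely for brevity, whereas a real cache could build its navigator over plain objects.

```csharp
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

// Hypothetical stand-in for a content cache (not an Umbraco type).
// The only thing it promises to callers is IXPathNavigable.
public sealed class DemoContentCache : IXPathNavigable
{
    private readonly XPathDocument _snapshot;

    public DemoContentCache(string xml)
    {
        // Assumption: the demo content is supplied as an XML string.
        using (var reader = new StringReader(xml))
            _snapshot = new XPathDocument(reader);
    }

    public XPathNavigator CreateNavigator()
    {
        return _snapshot.CreateNavigator();
    }
}

public static class XsltOverCache
{
    // Runs a stylesheet against any IXPathNavigable and returns the output.
    // The transform never asks for an XmlDocument.
    public static string Render(IXPathNavigable cache, string stylesheet)
    {
        var transform = new XslCompiledTransform();
        using (var xsltReader = XmlReader.Create(new StringReader(stylesheet)))
            transform.Load(xsltReader);

        using (var output = new StringWriter())
        {
            transform.Transform(cache, null, output);
            return output.ToString();
        }
    }
}
```

Calling XsltOverCache.Render(new DemoContentCache(xml), xslt) is enough to run a stylesheet; nothing in Render depends on how the cache stores its content.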

Want to get ready for the future? In most cases, all you have to do is to stop thinking "XmlDocument" and start thinking "XPathNavigator". So, no more:

var xml = content.Instance.GetXml();

But (in all v7 versions already):

var nav = UmbracoContext.Current.ContentCache.CreateNavigator();
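Once you hold a navigator, XPath queries look much the same as before. The following is a small sketch, not Umbraco code: NavigatorQueries and SelectNames are hypothetical helpers, and the nodeName attribute is assumed here only for the sake of the example.

```csharp
using System.Collections.Generic;
using System.Xml.XPath;

// Hypothetical helper (not part of Umbraco): it works against any
// XPathNavigator, whatever cache implementation produced it.
public static class NavigatorQueries
{
    // Returns the "nodeName" attribute of every node matched by the XPath
    // expression, in document order. The attribute name is an assumption
    // made for this example.
    public static List<string> SelectNames(XPathNavigator nav, string xpath)
    {
        var names = new List<string>();
        XPathNodeIterator matches = nav.Select(xpath);
        while (matches.MoveNext())
        {
            // An empty namespace URI means "attribute in no namespace".
            names.Add(matches.Current.GetAttribute("nodeName", ""));
        }
        return names;
    }
}
```

In a site, nav would be the navigator obtained from the content cache as shown above; for a quick experiment outside Umbraco, new XPathDocument(new StringReader(xml)).CreateNavigator() gives a navigator over any XML string.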

Get ready!
