
Telerik RadDiagram’s scalability and performance.

Windows graphics has two modes of rendering technology:
  • Retained graphics mode: drawing code is not executed immediately but updates an internal model (the scene graph). This internal model is maintained by the operating system, and all optimizations and fine details are handled transparently by the graphics pipeline. All variations of XAML (WPF, Silverlight, Windows Phone) use retained graphics. In essence, the XAML code is a partial mirror of the visual tree, which is maintained and organized by .NET. The advantages of using retained graphics are manifold:
    • You don't need to worry about pixels, about refreshing (invalidating) bits and pieces, about coordinates (except the single coordinate which positions the root of the drawn object), and so on.
    • Things are vectorial, and all sorts of manipulations (scaling, rotation…) are rather easy; in any case, far easier than in immediate mode.
    • Performance optimizations are handled by the framework. This doesn't mean the result is always perfect, but at least you can focus on application logic rather than having to dig into sometimes difficult rendering issues.
    • Device independence: you don't need different rendering logic for different form factors.

Retained Graphics Mode

The disadvantages are essentially the mirror image of the advantages of the immediate mode described below.
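
Before moving on, a minimal sketch may make the retained-mode idea concrete. The following WPF snippet (class and layout names are illustrative, not taken from any particular sample) only describes the scene; WPF's retained visual tree takes care of rendering, invalidation and hit testing:

    // Minimal WPF retained-mode sketch: we only describe the scene,
    // the framework owns rendering, invalidation and hit testing.
    using System.Windows;
    using System.Windows.Controls;
    using System.Windows.Media;
    using System.Windows.Shapes;

    public class RetainedModeWindow : Window
    {
        public RetainedModeWindow()
        {
            var canvas = new Canvas();
            Content = canvas;

            for (int i = 0; i < 100; i++)
            {
                var dot = new Ellipse { Width = 8, Height = 8, Fill = Brushes.SteelBlue };
                Canvas.SetLeft(dot, (i % 10) * 20);
                Canvas.SetTop(dot, (i / 10) * 20);

                // Interactivity comes free of charge: each Ellipse is a full
                // UIElement with events, styling and templating support.
                dot.MouseLeftButtonDown += (s, e) => ((Ellipse)s).Fill = Brushes.OrangeRed;

                canvas.Children.Add(dot);
            }
            // No Invalidate or repaint calls: WPF redraws from its scene graph.
        }
    }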

  • Immediate (or direct) graphics mode: your code draws directly on a canvas and the operating system does not keep a scene graph of what is being drawn. Usually the application or API has its own internal model of the scene. The (older but still very much used) GDI and GDI+ rendering of WinForms is the typical example, and it closely resembles the way one uses a writeable bitmap to draw and animate things. The advantages of direct mode are:
    • The only memory consumed is the size of the drawing surface plus the scene graph (if any). The retained graphics mode consumes much more memory because the drawing instances (the Control class, the Shape class… in XAML) usually contain much more than what the application needs. In the case of XAML, everything related to styling, templating, triggers… is part of the rendering API whether or not you need it. While the application's (usually necessary) scene graph also absorbs memory, it's usually more lightweight and more efficient for the case at hand.
    • You have more control over potential optimizations and over how optimizations inherent to the business case can be introduced into the drawing process. Retained graphics is business-agnostic and relies only on purely technical (low-level rendering) knowledge. Referring to the two types of diagramming below: the retained graphics pipeline doesn't know whether you want a particle system with large-scale topology or a small-scale UML diagram with rich interactivity. Using a writeable bitmap or something similar is likely more efficient if you have thousands or millions of shapes/items in your diagram.

Immediate Graphics Mode

One of the surprising things in this context is the fact that the graphics canvas has a more or less constant memory footprint no matter how many objects are drawn on it, simply because the bitmap has no layers, only pixels, each maintaining a constant 16-bit (or 32-bit) value whatever the visual content is. In this sense you can draw millions of items on a bitmap and scale graphs to a size limited only by the way you organize their content. If the graph structure were kept in a database and objects were drawn systematically on a bitmap, you could scale the graph to quasi-infinite proportions. What is the tradeoff? The objects drawn in immediate mode are not interactive. At least, not by default; they are interactive only insofar as you (the developer) handle the mouse events, figure out what is hit and how it should react. The retained graphics mode does this free of charge: XAML has a rich API which allows you to add animations, styling and interactivity. If you wish to do this in immediate mode you need to organize your own visual tree, define tree traversals, and cull and partition the scene in order to gain traversal performance, and so on.
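
By way of contrast, here is a hedged immediate-mode sketch using a WPF WriteableBitmap (the WritePixels API is real; the particle model and hit-test helper are hypothetical application code). The bitmap's memory footprint is fixed by its dimensions no matter how many particles are drawn, and hit testing becomes the application's responsibility:

    // Immediate-mode sketch: constant-memory bitmap, manual hit testing.
    using System;
    using System.Collections.Generic;
    using System.Windows;
    using System.Windows.Media;
    using System.Windows.Media.Imaging;

    public class ParticleSurface
    {
        // The bitmap's footprint is width * height * 4 bytes, regardless
        // of how many particles we draw onto it.
        private readonly WriteableBitmap _bitmap = new WriteableBitmap(
            800, 600, 96, 96, PixelFormats.Bgra32, null);

        // The application keeps its own (hypothetical) scene model, since
        // the OS retains nothing about what was drawn.
        private readonly List<Point> _particles = new List<Point>();

        public void DrawParticle(int x, int y)
        {
            _particles.Add(new Point(x, y));
            // Write a single opaque white pixel (BGRA byte order).
            _bitmap.WritePixels(new Int32Rect(x, y, 1, 1),
                                new byte[] { 255, 255, 255, 255 }, 4, 0);
        }

        // Hit testing is our job now: a naive linear scan over the model.
        public Point? HitTest(Point mouse, double radius = 3)
        {
            foreach (var p in _particles)
                if ((p - mouse).Length <= radius) return p;
            return null; // nothing under the cursor
        }
    }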

In the past, people have elaborated various ways to combine both approaches in order to get the best of both worlds. Using a writeable bitmap in Silverlight or plugging into the CompositionTarget.Rendering event is such an example (see this article for instance). Gaming engines, on the other hand, are the prime example of homemade scene graphs and rich scene APIs on top of a direct rendering pipeline.
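
A minimal sketch of the CompositionTarget.Rendering approach mentioned above (the event is part of WPF and Silverlight; the update logic is illustrative): the callback fires once per frame, which lets you drive immediate-mode drawing inside an otherwise retained-mode application:

    // Sketch of hooking the per-frame CompositionTarget.Rendering event
    // to drive immediate-mode drawing inside a retained-mode (XAML) app.
    using System;
    using System.Windows.Media;

    public class RenderLoop
    {
        public void Start()
        {
            CompositionTarget.Rendering += OnRendering;
        }

        private void OnRendering(object sender, EventArgs e)
        {
            // Called once per frame by the composition engine; a real
            // application would update its own scene model here and blit
            // the result to a WriteableBitmap or DrawingVisual.
            UpdateAndRedraw();
        }

        private void UpdateAndRedraw() { /* application-specific drawing */ }
    }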

It's worth mentioning in this context the JavaScript HTML5 Canvas as another example of immediate graphics. There are plenty of examples on the net of very impressive graphics which nevertheless lack interactivity on the small scale. That is, it's possible to create particle systems (with JavaScript and the Canvas) and render millions of particles, but without the possibility of accessing a single particle with a mouse click, since this would require a visual tree (and its traversal). In most cases, these examples aim at giving a global effect and not fine-grained access. Alongside the HTML5 Canvas you have the SVG framework which, much like XAML, offers a vectorial and richer object-level API.

This leads to the essence of scalability and graph drawing: what is the purpose of the graph and what kind of interaction do you wish to have? There are basically two categories of graph (or diagramming) interfaces:
  • Diagrams where individual shapes and connections matter. These are diagrams where the user needs to click and select individual shapes, alter their properties, create new connections and so on. This is the Visio-like and RadDiagram paradigm. The API is rich and offers a framework which can be modelled according to the needs of the business context. Diagrams in this category benefit from the retained graphics pipeline (XAML, SVG…) since it articulates a RAD methodology and adapts to the widely different business contexts in which diagrams are used.
  • Diagrams which aim at giving a global (bird's-eye) view of a certain topic, where global topology matters more than individual links. These diagrams are about seeing clusters and broad relationships (e.g. LinkedIn and Twitter networks), about graph layout on a large scale (e.g. what does the internet look like globally?), and about how particle systems represent the dynamics of a certain (business) system (e.g. air traffic control on a global scale). This is the data-visualization paradigm, aiming at representing data and giving insight into big data sets. In this context, the result is usually more important than the creation process (see e.g. the LinkedIn diagram below; who cares how it's created? Only the resulting topology is interesting, and the data is not in the diagram but stored and edited elsewhere). This paradigm benefits from the direct rendering pipeline (Canvas, bitmap, GDI…) and the result is often quite static (tooltips being the prototypical 'interaction').

Sample LinkedIn Diagram (aka data visualization diagramming)

The problem zone is the gray area between the Visio and data-visualization paradigms: what if you want to display huge diagrams and keep interactivity to the max? Certain business domains are indeed susceptible to this dilemma: forensic data analysis, social network analysis, security and anti-terrorism agencies and the like. What can be done in this case?

Let's first and foremost be clear about the RadDiagram framework: it sits in the Visio paradigm and will not scale to the millions. There are various reasons for this:
  • By design: the RadDiagram framework was designed (and this is the general spirit of Telerik's suite of XAML controls) to let you rapidly create great diagrams with a minimum of knowledge about diagram drawing and graph theory. It was not developed for large-scale diagrams like the LinkedIn sample above. The shapes (RadDiagramShape) and connections (RadDiagramConnection) are rich controls which, on top of the already loaded .NET framework API (i.e. the ContentControl and Control classes), add an additional layer of interactivity and customization. This enables rich, interactive diagrams but sacrifices (to some extent) memory and processing. This doesn't mean performance is sacrificed lightly, but the choice consistently favors breadth and scope of applications (workflow, organization charts and so on) over scalability.
  • The XAML framework is inherently a bad choice for scaling things to the millions. Even when using virtualization techniques, each and every instance of, say, the Control class comes with a wealth of baggage (templating, styling, triggers, events…) which consumes memory whether or not you need it in your concrete business context (application). The XAML framework is rich in scope, but at a price. On top of this one could add that, even though this has become less of an argument than it was ten years ago, the managed programming paradigm may also be an issue; some situations still benefit more from a pure C++ approach than from a managed one.
  • RadDiagram can articulate a wide variety of diagramming tasks, but at the same time it does not focus on any type in particular. If your application is all about tree graphs and large hierarchies, then there are certainly ways in which the tree-layout code could be optimized for the data you wish to display. That is, the graph layout and the internal engine managing shapes and connections are not geared to anything in particular, while there are definitely shortcuts possible if some knowledge (properties) of the to-be-displayed data is available. For example, testing for graph cycles in the layout could be omitted if the data is guaranteed to be acyclic (see the sketch after this list). There are, on many levels, ways in which a custom implementation could give specific applications a performance or scalability boost. This customization is part of the Telerik consulting services and not something which can be done by customers, due to the End User License Agreement.
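
To illustrate the kind of shortcut meant in the last point, consider the following hypothetical sketch (none of these names come from the RadDiagram API): a layout routine that runs the cycle check only when the caller cannot guarantee acyclic input:

    // Hypothetical layout sketch: skip the cycle test when the caller
    // guarantees the graph is acyclic (e.g. data imported from a tree).
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class TreeLayoutEngine
    {
        public void Layout(Dictionary<int, List<int>> adjacency, bool guaranteedAcyclic)
        {
            // The O(V + E) cycle test is pure overhead when the data source
            // (e.g. an imported hierarchy) guarantees acyclicity.
            if (!guaranteedAcyclic && HasCycle(adjacency))
                throw new InvalidOperationException("Tree layout requires an acyclic graph.");

            // ...proceed with the actual tree layout...
        }

        // Depth-first cycle detection over the adjacency list.
        private static bool HasCycle(Dictionary<int, List<int>> adj)
        {
            var state = new Dictionary<int, int>(); // 0 = unvisited, 1 = on stack, 2 = done

            bool Visit(int node)
            {
                if (state.TryGetValue(node, out var s)) return s == 1; // back edge found
                state[node] = 1;
                foreach (var next in adj.TryGetValue(node, out var list) ? list : new List<int>())
                    if (Visit(next)) return true;
                state[node] = 2;
                return false;
            }

            return adj.Keys.Any(Visit);
        }
    }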

Coming back to the question of what can be done: independently of the technical discussion, it's crucial that you ask yourself:
  • What does my end-user gain from being able to see a million shapes and connections? Data is not the same as information; displaying a lot of shapes does not usually result in better information visualization. In most business applications a diagram is just another way, not the only way, to gain insight into a dataset. Usually it has to be combined with other visualizations (timeline, Gantt, pies…) in order to give a fuller picture of the question at hand. Will your user really see the needle in the huge diagram? Wouldn't it be better to guide the user to a smaller set first and then display the diagram?
  • What is the business question the user tries to answer using the diagram? A great visualization is not an application where data is just being presented; an application should invite a user to follow a certain path to answer a business-related question, and it should present a workflow (screen flow) and user experience (UX). In much the same way, it's not very useful to display a million rows in a data grid if there is no way to filter and experiment with the data; one should focus on the aim of a representation (solving a business-related question) rather than bluntly offering a lot of data and a lot of widgets.
  • Do the details matter, or only the result? In many situations data is noisy or needs pre-processing before being visualized, and this is true for diagrams as well. In various business domains it's necessary to pre-process data using SQL Server Analysis Services, StreamInsight (or the like) or OLAP techniques before sending the result to the (visualizing) client. Many situations where large diagrams and interactivity are expected can be solved by delegating the filtering and selection process to the backend. The intended UX, and a study of what the application is supposed to do, often dictate how big diagrams can be reduced to their essence. Blaming hardware and software performance is too often an excuse for not digging into the tougher (business and UX) questions.
  • What is the long-term vision of the application, and how will the data scale in the long run? The choice of data visualization controls (not just diagramming) should be made with tomorrow in mind, not just how your application is today (or has been in past years). The importance of this question lies in the difficulty one faces when trying to shift a diagramming visualization between the two paradigms: because the approaches (i.e. the rendering techniques) are so fundamentally different, it takes quite a turn to shift from one to the other.

These questions in essence also answer the question of how to proceed if you sit in the gray zone mentioned above:
  • If you have moved from a direct rendering context to a retained context (e.g. you upgraded a WinForms app to a WPF app) you might have discovered that you gained interactivity but that the scalability is not what you expected. If this is the case, you need to question the shift, look for a compromise, or consider whether a rethinking of the UX is in order.
  • If you have a big data warehouse and wish to use XAML, you need to think about how to fully exploit backend processing before naively sending terabytes of data to the client (with unrealistic scalability expectations).
  • If you do not want to compromise on the number of items in the diagram and you are considering targeting mobile platforms, you might discover that web technologies nowadays offer solid performance and scalability. The price to pay in this case is the shift in programming paradigm and the lack of a rich (diagramming and non-.NET) API.
  • As explained above, much fine-tuning is possible in RadDiagram for concrete business cases and with more knowledge of the data. Customizing RadDiagram through consulting services may be what you need.