Denodo data virtualization

Denodo is one of the leading vendors in the data virtualization platform market, and its platform supports the full range of data virtualization features and principles.

Denodo platform tools

The Denodo platform is a set of server and client components which can operate on a single machine or in cluster mode. Below is a brief overview of the Denodo software components:

  • Denodo Platform Control Center - manages all the server components, reports server status, allows modification of global parameters, and starts and stops the Denodo tools. It also shows license information and is used for upgrades and installations.
  • Virtual DataPort server - the main data virtualization server application and the heart of the Denodo platform. It manages the physical layer (wrappers for the various source systems, which translate VQL into SQL or whatever language the source system uses), implements views and the operations on them, which form the logical layer (including the optimizer, the query plan generator and the execution engine), provides a cache mechanism that stores local copies of the data when required, and exposes interfaces for reporting purposes.
  • Virtual DataPort (VDP) administration tool - the client application with a user-friendly GUI which enables communication with the VDP server.
  • ITPilot - a web automation tool which makes it possible to obtain structured data from web sources.
  • ARN Aracne - reads unstructured information (crawling, filtering and maintaining indexes) from websites, document repositories, e-mail servers, RSS sites, etc.
  • Denodo Scheduler - a server application which runs and schedules data extraction and integration jobs; a job may combine processes from VDP, ITPilot and Aracne. With Denodo Scheduler, the Denodo platform can to some extent act like a traditional ETL tool.
  • Denodo Scheduler administration tool - a very basic graphical web GUI for Denodo Scheduler.
  • Denodo Monitor - a command-line utility which runs in the background and audits and monitors the Denodo servers. It can monitor processes, sockets, resources, threads, queries and the cache, and it writes the information either to a log file or to a database.
  • Denodo Monitor Reports - a simple external tool available from Denodo which provides a graphical representation of the information logged by Denodo Monitor.
  • Denodo Dashboard - real-time VDP server and cluster monitoring (CPU, memory usage, resources, current connections, data source activity, cache requests and processes).

One thing worth mentioning is that VQL (Virtual Query Language) is used to create views and other objects in the Denodo platform. It is a variation of SQL extended with dedicated data virtualization capabilities.

Denodo advantages and nice features

  • It pretty much covers all aspects of data virtualization
  • Lots of options to do joins (merge join, hash join, nested, easy to switch the order of the join)
  • Very easy to set up a source connection, a web service or another API (an MDM system, for example) to access the data. To some extent it satisfies the needs of an MDM solution.
  • Tree view and data lineage show how the data flows
  • Short learning curve; help and tutorials are available
  • Complex data transformations are handled easily, including hierarchical data structures (JSON, XML for example).
  • Denodo is relatively easy to extend. With basic development skills, custom transformations, data connectors and data quality checks can be programmed in Java (external tools can also be invoked this way).
  • Very good quality of the help documents and online tutorials, and a responsive vendor support team
  • Monitoring of all service components in an integrated fashion. Denodo platform monitoring data is also exposed via JMX and SNMP, so it can be analyzed with such apps as JConsole, HP OpenView, Tivoli, WinRM and others.
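
The join options mentioned above (nested, hash and merge joins) differ in how they match rows, not in what they return. The sketch below illustrates the three strategies in plain Python. It is not Denodo code and does not reflect how the VDP engine is implemented; the function names and the sample data are hypothetical and serve only to show why the choice of join method matters for performance.

```python
from collections import defaultdict


def nested_loop_join(left, right, key):
    """Compare every left row with every right row (quadratic work)."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def hash_join(left, right, key):
    """Index one input by key, then probe the index with the other input."""
    buckets = defaultdict(list)
    for r in right:
        buckets[r[key]].append(r)
    return [{**l, **r} for l in left for r in buckets.get(l[key], [])]


def merge_join(left, right, key):
    """Sort both inputs on the key and walk them in lockstep."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows that share this key value.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][key] == lk:
                out.extend({**left[i], **r} for r in right[j:j_end])
                i += 1
            j = j_end
    return out


def canonical(rows):
    """Order-independent form of a result set, used only to compare outputs."""
    return sorted(tuple(sorted(row.items())) for row in rows)


# Hypothetical sample data standing in for two source views.
customers = [
    {"cust_id": 1, "name": "Ann"},
    {"cust_id": 2, "name": "Bob"},
    {"cust_id": 3, "name": "Cy"},
]
orders = [
    {"order_id": 10, "cust_id": 2},
    {"order_id": 11, "cust_id": 1},
    {"order_id": 12, "cust_id": 2},
]

results = [
    join(customers, orders, "cust_id")
    for join in (nested_loop_join, hash_join, merge_join)
]
print(all(canonical(r) == canonical(results[0]) for r in results))
```

All three functions return the same rows; only the amount of work differs. A hash join pays for building an index on one side, a merge join pays for sorting (or benefits from inputs that are already sorted), and a nested join is usually reasonable only when one side is small.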

Denodo weaknesses, common issues and things to improve

  • The solution seems immature overall. The VDP client and the Denodo Scheduler client look and behave as if they were written by junior programmers, for example:
    - two-phase login to the Denodo Scheduler tool: first you log in with admin/admin and then with the VDP user credentials.
    - the formula editor in VDP validates the formula dynamically as it is typed, in a way which makes the editor unusable in practice (an external text editor is usually required)
    - lots of errors for which the reason is unknown
    - data type handling: it is very easy to fall into the trap of joining a VARCHAR with an NVARCHAR column without even noticing it, and such a join always gives bad output.
  • Delegation of queries to the source database. This should be the key to successful use of a data virtualization tool. If a query is not pushed down to the source database, all the records are read into the Denodo server and processed there, and the rows that are not needed are then discarded. That is acceptable if all the records need to be processed anyway (when generating a full extract, for instance), but in general it is rather inefficient. Many Denodo functions and formulas do not translate properly into native SQL, which results in reading all the data, and sometimes the generated SQL query cannot be interpreted by the source engine, which results in an error.
  • Inbound and outbound interfaces in most cases use JDBC or ODBC drivers, with their good and bad features. Access to Denodo via ODBC in particular produces many errors and is very faulty (a Tableau ODBC connection to Denodo, for example).
  • Java knowledge is needed to write functions or procedures
  • Denodo Scheduler is supposed to automate tasks and make it possible to create ETL-like processes, but the tool is raw and very basic; it would be nice to make it more professional.
  • The VDP tool GUI up to Denodo version 5.5 was very simple and difficult to use. The end user experience has improved significantly with version 6, where multiple tabs can be used, the VQL shell editor was improved, the user can access multiple databases at a time, etc., but there are still things to improve.
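
The cost of a query that is not delegated to the source, as described in the list above, can be shown with a small sketch. Python's built-in sqlite3 module stands in for the source database here; Denodo itself is not involved, and the table, column and function names are hypothetical. The second value returned by each function is the number of rows that had to leave the source.

```python
import sqlite3


def make_source():
    """Create an in-memory table that plays the role of the source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales (region, amount) VALUES (?, ?)",
        [("EU" if i % 4 == 0 else "US", float(i)) for i in range(1000)],
    )
    conn.commit()
    return conn


def delegated(conn, region):
    """Filter is part of the SQL sent to the source: only matches are fetched."""
    rows = conn.execute(
        "SELECT id, amount FROM sales WHERE region = ? ORDER BY id", (region,)
    ).fetchall()
    return rows, len(rows)


def not_delegated(conn, region):
    """Filter is applied after the fetch: the whole table is transferred first."""
    fetched = conn.execute(
        "SELECT id, region, amount FROM sales ORDER BY id"
    ).fetchall()
    rows = [(row_id, amount) for row_id, reg, amount in fetched if reg == region]
    return rows, len(fetched)


conn = make_source()
pushed_rows, pushed_transferred = delegated(conn, "EU")
local_rows, local_transferred = not_delegated(conn, "EU")
print(pushed_rows == local_rows, pushed_transferred, local_transferred)
```

Both functions return the same result set, but the non-delegated variant moves every row of the table out of the source before discarding most of them. This is the situation that arises when a function or formula cannot be translated into the native SQL of the source.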

Denodo resources

Denodo vendor webpage - a complete source of information on the Denodo data virtualization platform. It provides a free downloadable Denodo Express trial version of the tool (with limited features), lots of articles, best practices documents, help guides, video tutorials and community forum information.