Recursive SQL for Hierarchical Data


There's a common need to store hierarchical data in a relational database. For example, imagine you want to model Departments, where each Department can have zero or more sub-departments (and sub-departments can also have sub-departments).

The easiest way to do this is to create a Department table with the following three fields:

department_id
name
parent_id

You would then define a self-referential foreign-key constraint linking parent_id to department_id on the same table.

So top-level departments would have a null parent_id, and all sub-departments have a parent_id pointing to the record of the parent department. Pretty straight forward...

The problem arrises when you want to start using this data. For example, suppose you have an Employee record that belongs to a particular department. You want to find out if the employee's department is a child of some higher top-level department.

Typically you could do this in your application code, by writing a recursive function to execute an SQL query to get the department parent, and keep going up the tree till you find the department you are looking for, or reach the top of the tree. There's a performance penalty for this though, as you need to execute multiple queries. For short hierarchies, this isn't much of a problem. But if you needed to do this for, say, 500 employees, then it adds up.

There's a way to do this in a single SQL query though, which in some cases may lead to substantial performance gains. This involves using CTE (Common Table Expressions), which is now standard across most popular relational databases (including MySQL and PostgreSQL).

WITH RECURSIVE temp(department_id, name, parent_id) AS (
 SELECT department_id, name, parent_id FROM department WHERE department_id = 10
 UNION SELECT department.department_id, department.name, department.parent_id
   FROM department, temp
   WHERE temp.department_id = department.parent_id)
SELECT * FROM tree;

This will return a flat table with the record for department_id=10, and records for all of its parents. This is pretty useful in itself, but you can go further by taking advantage of the recursive function to dynamically build useful information. Take the following for example:



WITH RECURSIVE tree(department_id, name, parent_id) AS (
 SELECT department_id, name, parent_id, 0 as depth,
CAST(department_id AS text) AS ids,
CAST(name AS text) AS path
FROM department WHERE department_id = 10
 UNION SELECT department.department_id, department.name, department.parent_id, depth+1 as depth,
ids||','||CAST(department.department_id AS character varying) AS ids,
CAST(department.name AS character varying)||'/'||path AS path
   FROM department, tree
   WHERE department.department_id = tree.parent_id)
SELECT * FROM tree order by depth desc, path ;

This does the same as above, in that it returns department 10 and all its parents. However, it also generates three additional computed fields:

  • depth -- this is a counter of the distance from the department=10 record. I.e. how many levels up the parent is at.
  • ids -- this is a concatenated string of all the department_id records as you go up the tree. You could take only the root parent for example, and use this ids string to construct a dynamic SELECT query with a department_id IN (ids) clause.
  • path -- this is similar to ids, just a concatenated string of department names, in order of parent/child. You could use this to print to a display, or to logically sort, etc.

Note that you can also use the same logic to get a child departments instead of parents, simply by changing the red bits:

WITH RECURSIVE temp(department_id, name, parent_id) AS (
 SELECT department_id, name, parent_id FROM department WHERE department_id = 10
 UNION SELECT department.department_id, department.name, department.parent_id
   FROM department, temp
   WHERE department.department_id = temp.parent_id)
SELECT * FROM tree;



Comments

Popular posts from this blog

Wkhtmltopdf font and sizing issues

Import Google Contacts to Nokia PC Suite

Can't delete last blank page from Word