So the first small project i’m working on in my spare time is to iterate Alfresco’s file system through code. I did some looking on the Internet and there wasn’t that many good solutions easily discoverable.  This looked like some low hanging fruit to work on and study for ACE at the same time.

Note, this is being developed in PostgreSQL but could easily be used on nearly any database platform.

Here is a basic query to get Alfresco to show a list of folders from Alfresco Share.

SELECT
alf_child_assoc.parent_node_id,
alf_child_assoc.child_node_id,
alf_child_assoc.qname_localname,
alf_qname.local_name
FROM
alf_child_assoc,
alf_node,
alf_qname
WHERE
alf_node.type_qname_id = alf_qname.id
and alf_node.id = alf_child_assoc.child_node_id
and alf_qname.local_name in ('sites','folder')
and parent_node_id =
(
select alf_child_assoc.child_node_id
from alf_child_assoc
where alf_child_assoc.qname_localname = 'company_home'
)

The output should look something like this:

 parent_node_id  | child_node_id | qname_localname  | local_name
----------------+---------------+------------------+-----------
             13 |            14 | dictionary       | folder
             13 |            25 | guest_home       | folder
             13 |            26 | user_homes       | folder
             13 |            27 | shared           | folder
             13 |            28 | Imap Attachments | folder
             13 |            29 | Imap Home        | folder
             13 |           473 | sites            | sites

The records returned obviously are comparable to the view in Share.

  • parent_node_id (13 in this case) is Company Home
  • child_node_id is the id number for each folder
  • qname_localname is the folder name displayed in Share
  • local_name is the type of object in this part of the tree.

company_home
Comparable view in Alfresco Share

Below is a view, removing the entry point to allow for iterating any part of folder tree.

CREATE OR REPLACE VIEW view_alf_folders AS
SELECT
alf_child_assoc.parent_node_id,
alf_child_assoc.child_node_id,
alf_child_assoc.qname_localname,
alf_qname.local_name
FROM
alf_child_assoc,
alf_node,
alf_qname
WHERE
alf_node.type_qname_id = alf_qname.id
and alf_node.id = alf_child_assoc.child_node_id

Next is the recursive query to provide the hierarchical view of the tables, much like dm_folder in Documentum 😛

CREATE OR REPLACE VIEW view_recurse_alf_folders AS
WITH RECURSIVE recurse_alf_folders(
        parent_node_id,
        child_node_id,
        qname_localname,
        local_name) AS (
           SELECT v.parent_node_id,
                  v.child_node_id,
                  v.qname_localname::text,
                  v.local_name
           FROM  view_alf_folders v
           UNION ALL
           SELECT s.parent_node_id,
                  r.child_node_id,
                  (s.qname_localname::text || '/' || r.qname_localname::text )::text as qname_localname,
                  r.local_name
           FROM recurse_alf_folders r,
                view_alf_folders s
           WHERE r.parent_node_id = s.child_node_id
)
SELECT local_name, child_node_id, qname_localname FROM recurse_alf_folders;

I used “/” to connect the folders together, because obviously “\” is an escape character.

Here is an example of the finished output

view_recurse_alf_folders

As you can see, we can now produce any type of report on the files in Alfresco Repository, and easily filter based on their type, such as content, or a folder in this example.  Additionally I have left the child_node_id in the query to allow easy linking to other tables to get file sizes or any other type of metadata.

I wrote this at home on my Development System.  Over the next month I will probably convert this code to work with MS SQL Server, and run it against a million+ folder system and can provide feedback on any tips to make the code scale better.

I’m pretty sure this probably won’t be on the ACE exam, but I feel like I learned a lot about the internal database behind Alfresco tonight 😛

Developed on Alfresco 5.1.1 with PostgreSQL 9.4.4