
JCR Deep Dive | Jochen Toppe's Blog


JCR Deep Dive
Introduction: Managing Content


A good content management system should be able to manage highly structured content. It needs to manage not only the content and its properties, but also associated metadata, versions, and state (such as locking/unlocking and publication state). The resulting structure is essentially an object graph of various nodes: content, properties, versions, and metadata. Let's explore this idea using a simple web page as an example. The page contains a headline, a left column that contains a teaser and a formatted text field, and a sidebar column that contains two teaser modules.


Sample Wireframe


Applying object-oriented design principles, we can identify the following major classes:

UML Diagram


- Page: contains the headline as well as properties for the left column and sidebar, each of which can reference any number of content modules.
- Content Module: a superclass for the two different types of content modules found on this page, Teaser and Formatted Text Module. This was derived using the object-oriented principle of abstraction and generalization: a Teaser is a Content Module.
- Teaser: a teaser module which inherits the title from the parent class.
- Formatted Text Module: contains a formatted text property.

Consequently, the sample page above can be depicted using the following object graph:


Sample Object Graph


Naturally, this graph becomes more complex once version nodes are added for the individual content and property objects. Modeling an entire web site in this fashion results in an even bigger object graph. Object-oriented databases have traditionally been excellent at persisting complex hierarchical data. For example, Poet, one of my former companies, had implemented an enterprise content management system on top of their object-oriented database. Most other vendors, however, embraced existing enterprise infrastructures and built their repositories on top of relational databases. The repository, as outlined previously, acts as an abstraction, providing applications access to the content (the above object graph) while insulating the application from the storage implementation details.


http://jtoee.com/jsr-170/the_jcr_primer/all/1/

4/20/2013


Other content management systems that have followed this approach are, amongst many others, CoreMedia, Fatwire, and Day Systems. Day Systems pioneered the Java Content Repository standard, which specifies a vendor-neutral API to access and store content structures. However, in the author's experience, object-oriented content modeling techniques can be applied to almost any enterprise content management system, even if the support the different frameworks offer for this architecture varies greatly.


The Java Content Repository

The Java Content Repository (JCR) standard, which is based on the Java Specification Requests JSR 170 (version 1.0) and JSR 283 (version 2.0), provides a Java-centric object-oriented storage API specifically targeted at content management scenarios. The JCR is not a content management system or a full-fledged content management system API, but rather a content repository API. A content repository provides a common API for all content-driven applications and CMS components which require access to the content. It provides methods to read, write, and query content. The primary motivation of the JCR standard is to provide a standard and vendor-neutral programmatic interface for content repositories, allowing applications of multiple vendors to interact efficiently.

This chapter provides an overview of the basic concepts of the JCR, which are a foundation for the subsequent chapters. It will give you an overview of the JCR specification, but will not cover the JCR in depth, as this would be beyond the scope of this book. There are numerous excellent resources to consult, first and foremost the JSR 170 and JSR 283 specifications and the accompanying API documentation. This chapter covers concepts found in both JSR 170 and JSR 283 and points out any features that are limited to JSR 283, as most JCR-compliant implementations currently only implement the JSR 170 specification. All sample code in this chapter has been written to work with Jackrabbit, Apache's open source implementation of JSR 170.

Repositories and Workspaces


At the core of the specification is the concept of the repository. A repository contains any number of workspaces, although a simple repository usually contains only one. A workspace is essentially a view into the repository. The same content items may appear in multiple workspaces, while some content items may be present in only one workspace. Workspaces can be thought of as work-in-progress areas of a repository, much like a scrapbook. Content is always created in a workspace and may then be propagated into another workspace. There are many potential uses for multiple workspaces. For example, one can use workspaces in a similar way to branches in a source control system (even though, as we will see later, the JCR provides versioning separately). A group of content editors works locally in a workspace, and when their work is completed, the finished work can be submitted into a central workspace. Workspaces could also be used to differentiate between in-progress, QA, and live environments, where changes are propagated from one workspace to the next as they go through the editorial workflow.

Let's consider the following example. A repository contains two workspaces, Workspace 1 and Workspace 2. Both workspaces contain the common content nodes 00, 01, 02, and 04. Each workspace also contains a new node which is not present in the other (05 and 03). Since we have not formally defined the notion of a content node yet, think of the nodes as web pages on a web site for now.

Repository with Workspaces

The JCR provides a Java API within the package javax.jcr. The main entry point is the interface Repository, which allows the caller to query the repository for its capabilities as well as to obtain a session to interact with the content. The JCR defines two levels of features. A level 1 repository is a read-only repository which allows the retrieval and traversal of content, export to XML, querying the content, as well as introspection of the defined content types. A level 2 repository adds the capability of writing content back to the repository. Either level may support a variety of optional features. Whether a vendor supports these advanced features can be queried via the method Repository.getDescriptor, as the following example illustrates using Apache Jackrabbit's TransientRepository implementation:

```java
Repository repository = new TransientRepository();
System.out.println(repository.getDescriptor(Repository.LEVEL_1_SUPPORTED));
```

The Repository provides methods to log in to the repository using credentials and an optional workspace name, yielding a Session object. The Session object is then bound to these credentials and this workspace, i.e. a session cannot span multiple workspaces. The JCR does not provide methods to manage workspaces (such as adding and removing them); these are specific to the repository implementation being used. The JCR further does not provide methods for managing users or managing authentication and authorization (e.g. content access permissions). While JSR 170 provides a basic API to check whether permissions are present on given content, JSR 283 greatly improves on the feature set in this area, as we will see later. The following code creates a simple Jackrabbit repository, obtains a session, and queries the default workspace for its name:

```java
import javax.jcr.Repository;
import javax.jcr.Session;
import javax.jcr.SimpleCredentials;
import javax.jcr.Workspace;
import org.apache.jackrabbit.core.TransientRepository;

// instantiate a Jackrabbit transient repository
Repository repository = new TransientRepository();
// provide the Jackrabbit default credentials
Session session = repository.login(
        new SimpleCredentials("username", "password".toCharArray()));
// obtain the workspace this session is bound to
Workspace ws = session.getWorkspace();
// this will yield "default"
System.out.println("We are in workspace " + ws.getName());
```

Defining Content
Content in the JCR is by nature defined in a highly structured format, meaning that individual classes and properties are identified. In object-oriented modeling, structure is expressed via the definition of classes that can be visualized in a UML class diagram. Classes contain member variables and methods and can be instantiated as objects. The JCR follows a very similar approach, except that these concepts are named differently:

| OO Concept | JCR Equivalent | Description |
|---|---|---|
| Class | Node Type | Defines the structure of the objects to be created. |
| Object | Node | Instance of a node type. |
| Member Variable | Property | The JCR uses primitive types to define properties, much like standard object-oriented programming, which also uses primitives such as integers and strings. |
| Type | Property Type | Defines atomic types, such as integers, strings, dates, and pointers (references). |
| Function | None | The JCR does not model functionality, only data. Hence there is no equivalent of a function. |

The JCR further defines a common parent type for nodes and properties, called item. Each workspace in a JCR repository contains any number of items. A node may have any number of child nodes as well as any number of properties, essentially utilizing the object-oriented composite pattern to build a graph of items in which nodes may be internal or leaf elements and properties may only be leaf elements. Each node except the root must have a parent node, meaning there cannot be free-floating, and hence inaccessible, nodes in the repository.

Items, Nodes, Properties
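The composite structure described above can be sketched in a few lines of plain Java. This is only an illustrative model and not the javax.jcr API: the class names ItemGraphSketch, Item, NodeItem, and PropertyItem, as well as the method addProperty, are made up for this sketch, and properties are reduced to simple string values.

```java
import java.util.ArrayList;
import java.util.List;

public class ItemGraphSketch {

    // Common parent of nodes and properties (hypothetical stand-in for a JCR item).
    static abstract class Item {
        final String name;
        final NodeItem parent; // null only for the root node

        Item(String name, NodeItem parent) {
            this.name = name;
            this.parent = parent;
        }

        abstract boolean isNode();

        // Build the absolute path by walking up to the root.
        String path() {
            if (parent == null) {
                return "/";
            }
            String parentPath = parent.path();
            return parentPath.equals("/") ? "/" + name : parentPath + "/" + name;
        }
    }

    // A node may be an internal or a leaf element: it can hold nodes and properties.
    static final class NodeItem extends Item {
        final List<Item> children = new ArrayList<>();

        NodeItem(String name, NodeItem parent) {
            super(name, parent);
        }

        NodeItem addNode(String childName) {
            NodeItem child = new NodeItem(childName, this);
            children.add(child);
            return child;
        }

        // Simplification: always appends, it does not replace an existing property.
        PropertyItem addProperty(String propertyName, String value) {
            PropertyItem property = new PropertyItem(propertyName, this, value);
            children.add(property);
            return property;
        }

        @Override
        boolean isNode() {
            return true;
        }
    }

    // A property is always a leaf: it carries a value and has no children.
    static final class PropertyItem extends Item {
        final String value;

        PropertyItem(String name, NodeItem parent, String value) {
            super(name, parent);
            this.value = value;
        }

        @Override
        boolean isNode() {
            return false;
        }
    }

    public static void main(String[] args) {
        NodeItem root = new NodeItem("", null);
        NodeItem page = root.addNode("index");
        PropertyItem headline = page.addProperty("headline", "Page headline");
        System.out.println(page.path());     // prints /index
        System.out.println(headline.path()); // prints /index/headline
    }
}
```

Because every item other than the root is created through its parent, the sketch cannot produce a free-floating node, which mirrors the constraint stated above.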

Let's take a step back and look at this from a bird's-eye perspective. The JCR is essentially a flexible storage engine. It uses a schema definition to describe the allowed structure of the individual instances of nodes and properties. As a matter of fact, this schema definition can be provided as a file. The JCR then provides a programmatic API for the developer to create, store, retrieve, and query content within this structure. The biggest leap from traditional object-oriented programming is that the structure is not defined in terms of class definitions but in a schema definition file, most likely written in XML. This schema definition is also referred to as the content model.

Birds-Eye View of JCR

In the JCR, the model is expressed through the concept of node types and property types. Every node must have exactly one primary node type associated with it. The node type defines the properties each node may have as well as other characteristics of the node. Node types can be placed in an inheritance relationship, i.e. the characteristics of a node type include the parent type from which properties are inherited. A node type may also be defined to be abstract (in JSR 283 only), meaning that it cannot be instantiated, i.e. no node instances of this type can be created. It is purely used to implement the object-oriented principle of abstraction.

Nodes may have unique identifiers (a UUID) if their node type inherits from the built-in mixin type mix:referenceable. This is, in essence, identical to an object's identity in standard object-oriented programming languages. As a matter of fact, Reference properties in the JCR simply store the UUID of the referenced node, meaning that a node may be copied, moved, or renamed without breaking these references. As a consequence, all nodes which may be referenced through such a property must inherit from mix:referenceable. Inheriting from this type automatically injects a property with the name jcr:uuid into the node type declaration. The method Node.getUUID() is consequently merely a shortcut for Node.getProperty("jcr:uuid").getValue().getString().

The JCR defines a fixed set of property types which are split into two major categories: primitive properties, such as integers and strings, and reference properties, which may reference other items in the repository or external URIs. Reference types may enforce the concept of referential integrity, i.e. the repository does not allow links to break. This is particularly useful for modeling references between content contained within one repository. The content editor may rename, move, or delete content, and the repository automatically ensures that no references break through these actions. There are numerous trade-offs to be considered when choosing these types, which are elaborated in the upcoming chapter on content modeling. Properties may further be defined to be mandatory or optional.

| Property Type | Category | Summary |
|---|---|---|
| String | Primitive | A string property. |
| Binary | Primitive | A binary large object, or blob. This type is used for storing unstructured data in the repository. Examples include images, CSS files, Flash files, etc. |
| Date | Primitive | A date property. |
| Long | Primitive | A long property (64-bit signed two's complement integer). |
| Double | Primitive | A double property (64-bit IEEE 754 floating point). |
| Boolean | Primitive | A boolean property (true/false). |
| Decimal* | Primitive | Arbitrary length decimal number (maps to java.math.BigDecimal). |
| Name | Primitive | Namespace-qualified string, such as the name of a node type, e.g. nt:folder or samples:teaser. This can be particularly useful if a property references a content type, such as all nodes of type samples:teaser in a particular folder. |
| Path | Reference | Represents a path within the workspace to a particular item (such as /a/b/foo). This reference property does not enforce referential integrity. |
| Reference | Reference | Refers to another node in the workspace via the id of the referenced node. The repository will enforce referential integrity. |
| WeakReference* | Reference | Identical to a Reference property with the exception that referential integrity is not enforced. Accessing a stale link will throw an exception. |
| URI* | Reference | Identical to a String property except that it only allows values that conform to the URI syntax (RFC 3986). This is predominantly used for modeling external references, such as web sites. |
| Undefined | n/a | A property of undefined type. The type can be defined at runtime, i.e. when the property is set via Node.setProperty. Two node instances of the same node type could hence have the same property (by name) with different types. Internally, the JCR stores undefined properties as String values, i.e. they are automatically converted. Implementation-specific issues may arise as a consequence when, for example, storing binary properties in undefined properties. |

* Denotes JSR 283 specific property types.

Properties may further be defined to be multi-valued. This is equivalent to defining an array property, i.e. a property that may have more than one value. It is possible to query the node type definition within the API to determine, for example, whether a property is multi-valued, as the following example code illustrates:

```java
Node node = session.getNodeByUUID("some existing uuid");
Property property = node.getProperty("sampleProperty");
if (property.getDefinition().isMultiple()) {
    // there are potentially multiple values
    Value[] values = property.getValues();
} else {
    // there is only one value
    Value value = property.getValue();
}
```

Also refer to "CND in a Nutshell", which outlines the compact node type definition language in depth.
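To make the difference between Reference and WeakReference properties more tangible, the following plain-Java sketch models a tiny store that refuses to remove a node while a hard reference still points to it, whereas a weak reference is simply allowed to go stale. It does not use the javax.jcr API; the class ReferenceSketch, its methods, and the choice of IllegalStateException are assumptions made purely for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ReferenceSketch {
    private final Set<String> nodes = new HashSet<>();
    // target node id -> ids of the nodes that hold a hard reference to it
    private final Map<String, Set<String>> referrers = new HashMap<>();

    void addNode(String id) {
        nodes.add(id);
    }

    // Record a hard reference; both ends must exist at this point.
    void addReference(String fromId, String toId) {
        dereference(fromId);
        dereference(toId);
        referrers.computeIfAbsent(toId, k -> new HashSet<>()).add(fromId);
    }

    // Removing a node that is still the target of a hard reference is refused.
    void remove(String id) {
        Set<String> sourcesOfId = referrers.getOrDefault(id, Collections.emptySet());
        if (!sourcesOfId.isEmpty()) {
            throw new IllegalStateException("node is still referenced: " + id);
        }
        nodes.remove(id);
        referrers.remove(id);
        // references held by the removed node disappear together with it
        for (Set<String> sources : referrers.values()) {
            sources.remove(id);
        }
    }

    // Weak references are not tracked, so following one may fail later on.
    String dereference(String id) {
        if (!nodes.contains(id)) {
            throw new IllegalStateException("stale reference: " + id);
        }
        return id;
    }

    public static void main(String[] args) {
        ReferenceSketch repo = new ReferenceSketch();
        repo.addNode("page");
        repo.addNode("teaser1");
        repo.addReference("page", "teaser1");
        try {
            repo.remove("teaser1");
        } catch (IllegalStateException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

The trade-off is visible in the sketch: hard references cost bookkeeping on every removal, while weak references push the failure to the moment the link is followed.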

Namespaces
The repository features native support for namespaces. These namespaces are modeled after XML namespaces and allow different aspects of the content model to be grouped together. A namespace prefix is separated from the local name by a : (colon) character. Similar to XML namespaces, the namespace prefix on the left side of the colon is a shorthand notation for the full namespace name, designated by a URI. The JCR defines the following built-in namespaces:

| Prefix | URI | Summary |
|---|---|---|
| jcr | http://www.jcp.org/jcr/1.0 | Reserved for built-in node types and properties, for example the jcr:uuid property. |
| nt | http://www.jcp.org/jcr/nt/1.0 | Reserved for built-in primary node types, for example nt:file. |
| mix | http://www.jcp.org/jcr/mix/1.0 | Reserved for built-in mixin types, for example mix:referenceable. |
| xml | http://www.w3.org/XML/1998/namespace | Reserved for reasons of compatibility with XML. According to the JSR specification, this prefix should not be used by clients of the API in the names of normal nodes or properties, since doing so will cause problems on export to XML. |
| (empty) | (empty) | The empty namespace is the default namespace. |

In a writeable repository, namespaces can be added via the javax.jcr.NamespaceRegistry interface. The following code registers the samples prefix for the URI http://www.jtoppe.com/samples. Subsequent API calls can then use the samples prefix.

```java
Workspace ws = session.getWorkspace();
// register the samples prefix with the given URI
ws.getNamespaceRegistry().registerNamespace("samples", "http://www.jtoppe.com/samples");
// use the samples prefix
Node folder = session.getRootNode().addNode("demo", "samples:folder");
```


Namespaces have to be declared in the repository before they can be referenced through the API or within the node type definition files. The same holds for the node type definition file, in which the prefixes used need to be mapped to their URIs. Notice that the URI, as the fully qualified identifier, specified in the CND file must match the URI registered via the NamespaceRegistry.

```
// CND namespace definitions
<samples = 'http://www.jtoppe.com/samples'>
```
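The prefix-to-URI mapping can be illustrated with a small plain-Java sketch. It is a model only and not the javax.jcr.NamespaceRegistry implementation: the class NamespaceSketch and its methods are hypothetical, the built-in mappings are copied from the list of built-in namespaces above, and the {uri}localName output format is chosen here simply to make the fully qualified name visible.

```java
import java.util.HashMap;
import java.util.Map;

public class NamespaceSketch {
    private final Map<String, String> prefixToUri = new HashMap<>();

    NamespaceSketch() {
        // built-in mappings as listed in the text above
        prefixToUri.put("jcr", "http://www.jcp.org/jcr/1.0");
        prefixToUri.put("nt", "http://www.jcp.org/jcr/nt/1.0");
        prefixToUri.put("mix", "http://www.jcp.org/jcr/mix/1.0");
        prefixToUri.put("xml", "http://www.w3.org/XML/1998/namespace");
        prefixToUri.put("", ""); // the empty default namespace
    }

    // Simplification for this sketch: an already known prefix cannot be re-registered.
    void register(String prefix, String uri) {
        if (prefixToUri.containsKey(prefix)) {
            throw new IllegalArgumentException("prefix already registered: " + prefix);
        }
        prefixToUri.put(prefix, uri);
    }

    // Turn a prefixed name such as samples:teaser into {uri}localName.
    String expand(String qualifiedName) {
        int colon = qualifiedName.indexOf(':');
        String prefix = colon < 0 ? "" : qualifiedName.substring(0, colon);
        String local = colon < 0 ? qualifiedName : qualifiedName.substring(colon + 1);
        String uri = prefixToUri.get(prefix);
        if (uri == null) {
            throw new IllegalArgumentException("unregistered prefix: " + prefix);
        }
        return "{" + uri + "}" + local;
    }

    public static void main(String[] args) {
        NamespaceSketch namespaces = new NamespaceSketch();
        namespaces.register("samples", "http://www.jtoppe.com/samples");
        // prints {http://www.jtoppe.com/samples}teaser
        System.out.println(namespaces.expand("samples:teaser"));
    }
}
```

The sketch also shows why a prefix must be declared first: expanding a name with an unknown prefix has no URI to resolve to and fails.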

Creating and Accessing Content

A workspace is defined to always contain exactly one root node, which can be obtained by calling Session.getRootNode(). This node is the parent for all custom nodes created via the API. The root node has no name (Session.getRootNode().getName() returns an empty string). The root node further does not allow child nodes with conflicting names, meaning that the names of all child nodes have to be unique, much like in a folder within Windows Explorer. Nodes can further be obtained from the Session object by directly specifying an identifier (via Session.getNodeByUUID(String uuid)).

Adding nodes can be achieved in a variety of ways. For one, the method Node.addNode(String relPath) adds a new node of the default type permitted by the parent node, typically nt:unstructured, at the given path, relative to the node the method is called on. The method Node.addNode(String relPath, String primaryNodeTypeName) adds a node of a given type. Properties can be added to nodes via the Node.setProperty(String propertyName, [type] value) method, and removed via Property.remove() or by setting the property to null. If the autocreated option is set in the definition of the property, the JCR will automatically create the property when the parent node is created. Properties defined as mandatory have to be present and set to a value, either explicitly or implicitly by defining a default value in the property definition, before the node can be saved.

All items can be retrieved using a path expression. Paths uniquely identify every node and property within the repository. They may be written relative to a node or in absolute notation, beginning with a / (forward slash). The segments . and .. specify the current and parent node, respectively. When a node contains child nodes of the same name, the position of the child node can be specified as part of the path using brackets; for example, /a/b[3] matches the third child node with the name b.

The method Node.getNode(String relPath) only accepts relative paths and returns the node matching the relative path given. The method Session.getItem(String path) only accepts absolute paths and returns any item (node or property) matching the given path. In the following example, the nodes a, b, c, d, and e are placed in a parent-child hierarchy, i.e. b is a child node of a, etc.

Sample Nodes

The following code example illustrates the various ways of accessing items via paths:

```java
Node root = session.getRootNode();
Node b = root.getNode("a/b");

// yields empty string (the name of the root node)
System.out.println(session.getItem("/").getName());
// yields "a"
System.out.println(root.getNode("a/b/c/../..").getName());
// yields "b"
System.out.println(root.getNode("a/b").getName());
// yields "d"
System.out.println(b.getNode("d").getName());
// first child node called a, yields "a"
System.out.println(session.getItem("/a[1]").getName());
// retrieve b's title property, yields "b's title"
Item item = session.getItem("/a/b/samples:title");
if (!item.isNode()) {
    System.out.println(((Property) item).getValue().getString());
}
```
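The path rules used above (relative paths, the . and .. segments, and the bracketed position for same-name siblings) can be mimicked with plain string handling. The following sketch is a simplified model rather than the repository's own path resolution: the class PathSketch and its methods resolve, nameOf, and positionOf are hypothetical names, and the input is assumed to be well-formed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PathSketch {

    // Resolve a relative path against an absolute base path, handling "." and "..".
    static String resolve(String base, String relPath) {
        Deque<String> segments = new ArrayDeque<>();
        for (String segment : base.split("/")) {
            if (!segment.isEmpty()) {
                segments.addLast(segment);
            }
        }
        for (String segment : relPath.split("/")) {
            if (segment.isEmpty() || segment.equals(".")) {
                continue; // "." refers to the current node
            }
            if (segment.equals("..")) {
                if (segments.isEmpty()) {
                    throw new IllegalArgumentException("path climbs above the root: " + relPath);
                }
                segments.removeLast(); // ".." refers to the parent node
            } else {
                segments.addLast(segment);
            }
        }
        return "/" + String.join("/", segments);
    }

    // Name part of a segment such as "b[3]".
    static String nameOf(String segment) {
        int open = segment.indexOf('[');
        return open < 0 ? segment : segment.substring(0, open);
    }

    // 1-based position of a segment such as "b[3]"; defaults to 1 without brackets.
    static int positionOf(String segment) {
        int open = segment.indexOf('[');
        if (open < 0) {
            return 1;
        }
        int close = segment.indexOf(']', open);
        return Integer.parseInt(segment.substring(open + 1, close));
    }

    public static void main(String[] args) {
        System.out.println(resolve("/", "a/b/c/../..")); // prints /a
        System.out.println(resolve("/a/b", "d"));        // prints /a/b/d
        System.out.println(nameOf("b[3]") + " at position " + positionOf("b[3]"));
    }
}
```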


The API also implements the visitor pattern, allowing the traversal of all property and node items in the content graph. The following example prints all child nodes and their property names when applied to a node via Node.accept(new ItemNamePrinter()).

```java
// nested inside a sample class; requires the javax.jcr.* imports
protected static class ItemNamePrinter implements ItemVisitor {

    public void visit(Property property) throws RepositoryException {
        System.out.println(" property " + property.getName());
    }

    public void visit(Node node) throws RepositoryException {
        System.out.println(" Node Path: " + node.getPath());
        // go through all properties
        for (PropertyIterator propIter = node.getProperties(); propIter.hasNext();) {
            Property property = propIter.nextProperty();
            property.accept(this);
        }
        // and child nodes
        for (NodeIterator nodeIterator = node.getNodes(); nodeIterator.hasNext();) {
            Node subNode = nodeIterator.nextNode();
            subNode.accept(this);
        }
    }
}
```

Content may further be accessed via the query APIs which are covered in-depth in section Content Search.

Creating Content: A First Example


In the previous example [see section Managing Highly Structured Content] a page with content modules was expressed in the form of this UML diagram, which is equivalent to the content model for the simple example web site.

Sample Content Model

In this example, we would define the node types Page, ContentModule, FormattedTextModule, and Teaser. All content nodes in the repository have to be of one of these types. The properties (such as the page title or the leftColumn relationship between Page and ContentModule) are modeled using property types. Given this, a content model for the above example site may define the following node types. All code samples below use the JCR compact node type definition notation (CND), which will be covered in depth later.

Node Type: content

This node type is defined as the top-most level of abstraction for all content nodes to be stored in our custom content model. It inherits from the built-in type mix:referenceable, which allows properties to reference this node and implicitly adds a jcr:uuid property to the node.

```
// base class for all content
[samples:content] > mix:referenceable, nt:base
// this is a primary node type
```

Node Type: folder

The folder node type is introduced in order to organize content pages and modules into folders. Note that the JCR technically defines a built-in type nt:folder; however, this type only allows the pre-defined types nt:folder, nt:file, and nt:linkedFile as children. It hence cannot be used as a means to organize custom content. The folder type allows for child nodes of type content:

```
// a folder type to organize our pages
[samples:folder] > samples:content
// child nodes are any node of type samples:content
+ * (samples:content) multiple
```

Node Type: contentModule

The contentModule node type serves as a common base class for the teaser and formattedText types. By building this layer of abstraction, the page type can aggregate both types by referencing this type.

```
// the content module type
[samples:contentModule] > samples:content
// headline property
- samples:title (STRING) mandatory
```

Node Type: teaser


A teaser is a simple node type which contains a headline property.

```
// the teaser type which inherits from contentModule
[samples:teaser] > samples:contentModule
// no properties (headline is inherited from contentModule)
```

Node Type: formattedText

A simple formatted text node type which contains a string field.

```
// the rich text module type which inherits from contentModule
[samples:formattedText] > samples:contentModule
// string property
- samples:formattedTextField (STRING) mandatory
```

Node Type: page

The page serves as the main content container to build pages for the example layout. It contains multi-valued properties (lists), which aggregate content modules in the left column and sidebar column. Both of these multi-valued properties are constrained to type contentModule, meaning they can only reference nodes of type contentModule or subtypes thereof:

```
// the page type
[samples:page] > samples:content
// title field of type string
- samples:headline (STRING) mandatory
// left column as a multi-valued reference property constrained to
// references of nodes of type contentModule only
- samples:leftColumn (REFERENCE) multiple < 'samples:contentModule'
// sidebar as a multi-valued reference property constrained to
// references of nodes of type contentModule only
- samples:sidebar (REFERENCE) multiple < 'samples:contentModule'
```

After importing this definition file into the content repository, we can make the example site come to life.
The following Java code creates the necessary nodes for a simple page:

```java
// requires javax.jcr.* and org.apache.jackrabbit.core.TransientRepository;
// the samples namespace and node types must already be registered
Repository repository = new TransientRepository();
Session session = repository.login(
        new SimpleCredentials("username", "password".toCharArray()));
Node root = session.getRootNode();

// the value factory is used to create Value objects for properties
ValueFactory valueFactory = session.getValueFactory();

// create a folder to store the content of the page in
Node folder = root.addNode("demo", "samples:folder");

// start creating the content, teaser1 first
Node teaser1 = folder.addNode("teaser1", "samples:teaser");
teaser1.setProperty("samples:title", "Teaser 1 Title");
Node teaser2 = folder.addNode("teaser2", "samples:teaser");
teaser2.setProperty("samples:title", "Teaser 2 Title");
Node teaser3 = folder.addNode("teaser3", "samples:teaser");
teaser3.setProperty("samples:title", "Teaser 3 Title");

Node formattedText = folder.addNode("formattedText1", "samples:formattedText");
formattedText.setProperty("samples:title", "Formatted Text Title");
formattedText.setProperty("samples:formattedTextField", "Hello my friend");

Node page = folder.addNode("index", "samples:page");
page.setProperty("samples:headline", "Page headline");

// since this property is multi-valued, it can reference any
// number of nodes. The following code utilizes the ValueFactory
// to create Value objects from the nodes.
page.setProperty("samples:leftColumn", new Value[] {
        valueFactory.createValue(teaser1),
        valueFactory.createValue(formattedText) });
page.setProperty("samples:sidebar", new Value[] {
        valueFactory.createValue(teaser2),
        valueFactory.createValue(teaser3) });

session.save();
```

Mixins


The JCR allows two kinds of node types: primary node types, which we have seen in the previous examples, and mixin node types. By definition, a mixin is a type which is solely defined to serve as a base for other types. While classic inheritance models the is-a relationship, mixins are used to inherit functionality and property definitions only. Mixin types are declared in the same way as primary node types. By definition, a mixin type cannot be instantiated on its own, for example via the addNode function. The JCR provides a number of built-in mixins that are to be used when defining custom content models. For example, the built-in mix:referenceable mixin type gives a node type the ability to be referenced from a property and adds the jcr:uuid property.

Mixins provide an easy way to define abstract node types in custom content models. As a matter of fact, JSR 170 does not provide the ability to declare abstract node types. This feature is part of JSR 283.
Unlike primary node types, mixins can be added to an existing node even if the mixin declaration is not present in the content model definition. Let's extend the previous example by a mixin type named samples:configurable. This type will allow the folder node type to accept an arbitrary number of strings as configuration options. This could, for example, be used to set properties for an entire site section on a per-folder basis. An example would be defining the color scheme to be used by the rendering stack:

```
// configurable type
[samples:configurable]
// specify this to be a mixin type
mixin
// declare a multi-valued property
- samples:option (STRING) multiple

// base class for all content
[samples:content] > mix:referenceable, nt:base

// a folder type to organize our pages
[samples:folder] > samples:content, samples:configurable
// child nodes are any node of type samples:content
+ * (samples:content) multiple
```

The options can now be accessed via the API:

```java
// create folders that carry a configuration option each
Node section1 = root.addNode("section1", "samples:folder");
section1.setProperty("samples:option", new String[] {"colorscheme=green"});
Node section2 = root.addNode("section2", "samples:folder");
section2.setProperty("samples:option", new String[] {"colorscheme=green"});
```

Storing Unstructured Content

As mentioned above, every node in the repository must be assigned a type. When creating a new sub node via the API, the type is either directly supplied or, when omitted, the API creates a node of the built-in type nt:unstructured, which is used to model unstructured data. As a matter of fact, there are multiple ways of modeling unstructured data:

- Binary properties: Properties of type Binary can be used to store arbitrary binary data.
- Unstructured node type: The built-in nt:unstructured node type can be utilized to create arbitrary node structures. It allows any number of properties of all types as well as any number of sub nodes. This approach is typically useful for storing semi-structured content, i.e. content which is stored in a structured fashion yet does not adhere to a specific schema. Example applications are XML and HTML data.
- nt:file node type: The built-in nt:file node type is designated to represent arbitrary files in the repository. Nodes of this type have one child node by the name jcr:content which contains the contents of the file. Typically, the jcr:content sub node will be of the built-in type nt:resource, which contains a binary property by the name of jcr:data.

Versioning
JSR 170 provides versioning as an optional feature. Versioning allows the repository to track different versions of the content as changes are made throughout the content lifecycle. Versioning is built on top of the existing concepts of nodes and workspaces; the JCR simply provides a number of convenience methods around these. JSR 283 enhances the versioning set forth in JSR 170 by specifying four versioning models: simple versioning, full versioning, activities, and configurations. While these different models provide an interesting intellectual excursion, this section limits itself to the versioning model built into JSR 170, providing a basic understanding of the complexities of versioning content. To verify that a particular JCR implementation supports versioning, the Repository object can be queried for the feature in the following manner: Repository.getDescriptor(Repository.OPTION_VERSIONING_SUPPORTED). A repository which supports versioning contains a special storage area called the version storage. The entire version history of a particular node is contained within this version storage.
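Since getDescriptor() returns a string rather than a boolean, the feature check needs one small conversion step. The sketch below shows that step in isolation so it runs without a repository: the Function stands in for repository::getDescriptor, the class name FeatureCheck is made up for illustration, and the key strings mirror the constants of javax.jcr.Repository.

```java
import java.util.Map;
import java.util.function.Function;

final class FeatureCheck {
    // Descriptor keys, mirroring the constants in javax.jcr.Repository.
    static final String OPTION_VERSIONING_SUPPORTED = "option.versioning.supported";
    static final String OPTION_OBSERVATION_SUPPORTED = "option.observation.supported";
    static final String OPTION_LOCKING_SUPPORTED = "option.locking.supported";

    /** Treats a missing descriptor or any value other than "true" as unsupported. */
    static boolean supports(Function<String, String> descriptors, String key) {
        return Boolean.parseBoolean(descriptors.apply(key));
    }

    public static void main(String[] args) {
        // Stand-in for a real repository; with the JCR API on the classpath
        // this would be repository::getDescriptor.
        Map<String, String> fake = Map.of(OPTION_VERSIONING_SUPPORTED, "true");
        System.out.println(supports(fake::get, OPTION_VERSIONING_SUPPORTED));
        System.out.println(supports(fake::get, OPTION_LOCKING_SUPPORTED));
    }
}
```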

For a node type to be versionable, it must inherit from the mixin type mix:versionable, either through the node type definition or by adding the mixin type at runtime. The mix:versionable type, a subtype of mix:referenceable, adds a number of properties to the node type, such as references to the version history, the base version, and predecessor versions.

The JCR can track all changes of a node, its properties and child nodes. Every check-in operation through Node.checkin() creates a new version. This naturally leads to many intermediate versions throughout the content lifecycle, which may need to be cleaned up. A common practice is to remove intermediate versions when a piece of content is published, only tracking the published versions of a particular node. Another approach is to utilize a version collector which removes older versions according to set criteria, such as age. The JCR does not provide this as built-in functionality at the moment.

The node's version history can be accessed through the method Node.getVersionHistory(), which yields an object of type VersionHistory. This provides access to all versions as well as support for adding version labels to particular versions. Version labels can be thought of as logical markers which can be used to restore a node to the version carrying the given marker. This provides a mechanism to attach a logical version name across the versions of many nodes, much like a tag in a source control system.
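Since the repository does not ship a version collector, the selection logic has to live in application code. The sketch below covers only that selection step, under stated assumptions: VersionInfo is a hypothetical summary object the application would fill from the version history, labeled versions are treated as "published" and kept, and the root version is never selected. Actually removing the selected versions (for example via VersionHistory.removeVersion) is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class VersionCollector {
    /** Hypothetical summary of one version: name, creation time, label present. */
    static final class VersionInfo {
        final String name;
        final long createdMillis;
        final boolean labeled;

        VersionInfo(String name, long createdMillis, boolean labeled) {
            this.name = name;
            this.createdMillis = createdMillis;
            this.labeled = labeled;
        }
    }

    /** Returns the names of unlabeled, non-root versions created before the cutoff. */
    static List<String> selectForRemoval(List<VersionInfo> versions, long cutoffMillis) {
        List<String> result = new ArrayList<>();
        for (VersionInfo v : versions) {
            boolean isRoot = "jcr:rootVersion".equals(v.name);
            if (!isRoot && !v.labeled && v.createdMillis < cutoffMillis) {
                result.add(v.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<VersionInfo> versions = new ArrayList<>();
        versions.add(new VersionInfo("jcr:rootVersion", 0L, false));
        versions.add(new VersionInfo("1.0", 100L, false));
        versions.add(new VersionInfo("1.1", 200L, true));
        System.out.println(selectForRemoval(versions, 500L));
    }
}
```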

Nodes in the Version Store

Under the hood, the mix:versionable node type adds the property jcr:versionHistory to the versionable content node. This property points to the version history. The version history itself is simply a node of type nt:versionHistory. The version history points to the root version, a node of type nt:version. This node contains version-specific metadata such as when the version was created and the version number, as well as reference properties for the predecessor and successor versions in the version history, essentially forming a doubly-linked list.

A new version node is created every time the Node.checkin() method is called through the API. The properties and child nodes of the original versionable content node are versioned into a subnode of the nt:version node which is referred to as a frozen node, of type nt:frozenNode. The node type descriptor gives the programmer full control over how and if properties and child nodes are versioned. The frozen node contains all copied versionable attributes of the original versionable node as well as the field jcr:frozenPrimaryType, which contains the node type name of the original node.

The exact semantics of the versioning as well as the restore operation for properties and child nodes can be greatly influenced via the node type definition, adding more complexity and choices for the developer. JSR 170 defines six versioning choices, known as the on-parent-version (OPV) attribute, which govern the versioning behavior of properties and child nodes. As a matter of fact, these attributes influence the creation of new versions via the Node.checkin() method as well as the restore semantics when a node is restored from an older version via the Node.restore() method. The possible OPV attribute values are COPY, VERSION, INITIALIZE, COMPUTE, IGNORE, and ABORT.
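The structure described above (a root version, predecessor and successor links, and a frozen copy of the properties per version) can be mimicked in a few lines. This is a toy model only, not the JCR API: ToyVersionHistory and ToyVersion are invented names, the history is strictly linear, and the version names "1.0", "1.1" are an assumed naming scheme.

```java
import java.util.HashMap;
import java.util.Map;

final class ToyVersionHistory {
    static final class ToyVersion {
        final String name;
        final Map<String, String> frozen; // plays the role of the frozen node
        ToyVersion predecessor;
        ToyVersion successor;

        ToyVersion(String name, Map<String, String> frozen) {
            this.name = name;
            this.frozen = frozen;
        }
    }

    final ToyVersion root = new ToyVersion("jcr:rootVersion", new HashMap<String, String>());
    private ToyVersion base = root;
    private int counter = 0;

    /** Copies the current properties into a new version linked after the base version. */
    ToyVersion checkin(Map<String, String> properties) {
        ToyVersion v = new ToyVersion("1." + counter++, new HashMap<>(properties));
        v.predecessor = base;
        base.successor = v;
        base = v;
        return v;
    }

    /** Walks the successor links from the root and returns a copy of the frozen state. */
    Map<String, String> restore(String versionName) {
        for (ToyVersion v = root; v != null; v = v.successor) {
            if (v.name.equals(versionName)) {
                return new HashMap<>(v.frozen);
            }
        }
        throw new IllegalArgumentException("no such version: " + versionName);
    }

    public static void main(String[] args) {
        ToyVersionHistory history = new ToyVersionHistory();
        Map<String, String> node = new HashMap<>();
        node.put("samples:title", "first draft");
        history.checkin(node);
        node.put("samples:title", "second draft");
        history.checkin(node);
        System.out.println(history.restore("1.0").get("samples:title"));
    }
}
```

The point of the copy in checkin() is that later edits to the live node must not leak into versions that were already checked in.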

The COPY Attribute

Properties of the versionable node with the COPY attribute are directly copied into the frozen node. The referential integrity constraint on properties of type Reference is lifted when they are copied. This prevents situations in which a node cannot be deleted because an older version of another node might still reference it in a Reference property. All child nodes of the versionable node are converted into frozen nodes and copied into the version store, independent of their respective node type's OPV attribute and whether they inherit from mix:versionable or not.

Copied Child Nodes

When a node is restored from a previous version via the Node.restore() method, all properties marked with the COPY attribute are restored and all child nodes attached to the node are replaced with the versioned nodes.

Use this attribute for all versionable properties. Only use it for a child node definition if the child nodes are dependent children of the parent node and need to be versioned together with the parent node, i.e. the object-oriented equivalent of a composition relationship. If used incorrectly, this can very quickly create a huge versioning graph in your repository or result in undesired behavior when restoring older versions.

The VERSION Attribute

When the OPV attribute of a property definition is set to VERSION, the semantics are identical to those of the COPY attribute: the properties are copied to the frozen node, and the referential integrity constraint is lifted for Reference properties. When specified for child nodes, the versioning operation honors the specific child node's individual versioning attributes. Instead of copying versionable child nodes, the JCR creates nodes of type nt:versionedChild in the version store, whose jcr:childVersionHistory property points to the version history of the child node, without copying the affected subtree. Note that this does not point to the specific version of the child node but to the version history, allowing the programmer to delete intermediate versions of nodes without breaking the version graph. Child nodes which are not versionable are still copied as frozen nodes into the version of the closest parent node which is versionable.

Let's explore this by an example. Assume the nodes a, b, c, and d are in the parent-child relationship as outlined on the left side of the picture below. The node types of a and b are versionable (i.e. inherit from mix:versionable) whereas those of c and d are not. When a is checked in, a frozen node representing a is created. This node contains two child nodes: a frozen node which holds the versioned content of the non-versionable child, and a node of type nt:versionedChild which points to the version history of node b, which was a child of a at the time it was checked in.

Versioning with the VERSION Attribute

When a node is restored from a particular version, the API provides two options via the removeExisting parameter of the Node.restore() method. If set to true, any children that already exist as children of the node to be restored are replaced directly from the version history. If the flag is set to false and a child already exists, an exception is thrown. In the above example, this would mean that the call Node.restore("1.0", false) on node a would throw an exception since the versioned subnode b already exists in the current version and hence cannot be replaced as per the parameter.

The INITIALIZE Attribute

When a versionable node is checked in, i.e. a new version is created, the new version will contain the child nodes and properties marked with the INITIALIZE attribute; however, their values will not be copied. Rather, the JCR will create new properties and child nodes in the version store and initialize them according to the rules set forth in the node type descriptor, such as default values. When a versionable node is restored from a previous version, child nodes and properties marked with this OPV attribute are not restored, i.e. they are ignored. This behavior hardly provides any benefit for day-to-day content modeling, yet it records the fact that certain properties and child nodes were present at the time of archival.

The COMPUTE Attribute

When the COMPUTE attribute is present for a child node definition, all child nodes which are also versionable are copied into the version store as frozen nodes. Upon the restore operation, however, they are ignored. The same holds for property definitions: the properties are versioned but not restored. This attribute essentially implements a version-only strategy, which copies versionable child nodes, ignores non-versionable nodes, and does not restore the child nodes and properties. Finding a useful scenario for these semantics is left as an exercise to the reader.

The IGNORE Attribute

When this attribute is present for the child definition of a versionable node, all children will be ignored, i.e. not versioned when the node is checked in. When the node is restored from a previous version, all existing child nodes remain, i.e. the parent-child relationship is neither versioned nor restored. When the attribute is present for properties, these properties are ignored when a version is created. When a node is restored and a property marked with this attribute is present, its value is kept. Properties and child relationships marked with this attribute are essentially ignored in the versioning and restore process. This is highly useful for properties and child relationships which do not need to be versioned.

The ABORT Attribute

The ABORT attribute provides rather peculiar semantics. Its presence for either the child definition or a property causes the versioning operation to immediately abort with a VersionException, preventing the check-in of such a node.
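The property-level behavior of the six OPV values, as described in the sections above, can be summarized in a small simulation. This is a toy model and not the JCR API: OpvDemo is an invented class, properties are plain strings, child nodes are left out entirely, and an IllegalStateException stands in for the VersionException. Properties without an explicit entry are treated as COPY, which is the default in the CND notation.

```java
import java.util.HashMap;
import java.util.Map;

final class OpvDemo {
    enum Opv { COPY, VERSION, INITIALIZE, COMPUTE, IGNORE, ABORT }

    /** Builds the frozen property map that a check-in would store. */
    static Map<String, String> checkin(Map<String, String> props,
                                       Map<String, Opv> opv,
                                       Map<String, String> defaults) {
        Map<String, String> frozen = new HashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            switch (opv.getOrDefault(e.getKey(), Opv.COPY)) {
                case ABORT:
                    throw new IllegalStateException("check-in aborted by " + e.getKey());
                case IGNORE:
                    break; // not versioned at all
                case INITIALIZE:
                    frozen.put(e.getKey(), defaults.get(e.getKey())); // default, not the live value
                    break;
                default:
                    frozen.put(e.getKey(), e.getValue()); // COPY, VERSION, COMPUTE
            }
        }
        return frozen;
    }

    /** Applies a frozen map to the live properties; only COPY and VERSION are restored. */
    static Map<String, String> restore(Map<String, String> current,
                                       Map<String, String> frozen,
                                       Map<String, Opv> opv) {
        Map<String, String> result = new HashMap<>(current);
        for (Map.Entry<String, String> e : frozen.entrySet()) {
            Opv o = opv.getOrDefault(e.getKey(), Opv.COPY);
            if (o == Opv.COPY || o == Opv.VERSION) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Opv> opv = new HashMap<>();
        opv.put("hits", Opv.IGNORE);
        Map<String, String> live = new HashMap<>();
        live.put("title", "v1");
        live.put("hits", "7");
        Map<String, String> frozen = checkin(live, opv, new HashMap<String, String>());
        live.put("title", "v2");
        live.put("hits", "9");
        System.out.println(restore(live, frozen, opv));
    }
}
```

In the usage above, the title goes back to its checked-in value while the hit counter keeps its live value, which is the typical reason to mark a property IGNORE.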

Observation
The existence of the observation feature in a particular JCR implementation, which is optional according to the JCR, can be verified by querying the repository descriptors: Repository.getDescriptor(Repository.OPTION_OBSERVATION_SUPPORTED). This feature essentially implements the object-oriented observer pattern, which allows applications to register for content-specific events which the JCR fires when one of the following events occurs:

- Creation of a node
- Deletion of a node
- Adding of a property to a node
- Removal of a property from a node
- Change of a property

The events further communicate the path of the affected item as well as the user id of the user responsible for the change. Note that the path provided in the event might point to a node or property which no longer exists because it has been removed. Listeners can be registered via the API on a per-workspace level. It is not possible to specify a repository-wide listener. The following example illustrates a listener which prints out repository events:

    // uses javax.jcr.RepositoryException and
    // javax.jcr.observation.{Event, EventIterator, EventListener}
    protected static class MyListener implements EventListener {
        public void onEvent(EventIterator eventIterator) {
            System.out.println("Caught events:");
            while (eventIterator.hasNext()) {
                Event event = eventIterator.nextEvent();
                try {
                    System.out.println(" Type: " + event.getType());
                    System.out.println(" User: " + event.getUserID());
                    System.out.println(" Path: " + event.getPath());
                } catch (RepositoryException e) {
                    e.printStackTrace();
                }
            }
        }
    }

This listener can now be easily registered to listen to the current workspace via the ObservationManager:

    Workspace ws = session.getWorkspace();
    ObservationManager observationManager = ws.getObservationManager();
    observationManager.addEventListener(
        new MyListener(),                         /* new listener */
        Event.NODE_ADDED | Event.PROPERTY_ADDED,  /* bitmask of event types */
        "/",                                      /* path to constrain the listening to */
        true,                                     /* is deep, i.e. whether to monitor subpaths of the previously defined path */
        null,                                     /* specific uuids to listen to */
        null,                                     /* particular node types to listen to */
        false                                     /* nolocal, i.e. ignore local session */
    );

The observation feature allows the architect to implement advanced event-driven architectures. This could, for example, be used to trigger changes in third party systems when content is changed and/or published, such as invalidating a web cache or content delivery network (CDN). It can be used to export content as it is changed or to synchronize two different repositories in real time. The major drawback of JSR 170, however, is that the event listeners are not persistent; they only exist within the scope of the session. There is no way to register a listener which can pick up processing where it left off at a later time, e.g. by presenting the API with a timestamp of the last event that was processed. For many use cases this may not be significant, but when using observation to synchronize the repository with an external system, this feature would be essential to ensure fault-tolerant operation.
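The event type bitmask and the integer printed by event.getType() are easier to read with the numeric values at hand. The sketch below redeclares the five JSR 170 event type constants locally, with the same values as in javax.jcr.observation.Event, so that it runs without the JCR API; the class EventMask and its helper methods are made up for illustration.

```java
final class EventMask {
    // Same values as the constants in javax.jcr.observation.Event (JSR 170).
    static final int NODE_ADDED = 1;
    static final int NODE_REMOVED = 2;
    static final int PROPERTY_ADDED = 4;
    static final int PROPERTY_REMOVED = 8;
    static final int PROPERTY_CHANGED = 16;

    /** True if a listener registered with 'mask' would be interested in 'eventType'. */
    static boolean matches(int mask, int eventType) {
        return (mask & eventType) != 0;
    }

    /** Turns the integer returned by Event.getType() into a readable name. */
    static String name(int eventType) {
        switch (eventType) {
            case NODE_ADDED: return "NODE_ADDED";
            case NODE_REMOVED: return "NODE_REMOVED";
            case PROPERTY_ADDED: return "PROPERTY_ADDED";
            case PROPERTY_REMOVED: return "PROPERTY_REMOVED";
            case PROPERTY_CHANGED: return "PROPERTY_CHANGED";
            default: return "UNKNOWN(" + eventType + ")";
        }
    }

    public static void main(String[] args) {
        int mask = NODE_ADDED | PROPERTY_ADDED; // the mask used in the registration above
        System.out.println(mask + " matches " + name(PROPERTY_ADDED) + ": " + matches(mask, PROPERTY_ADDED));
        System.out.println(mask + " matches " + name(NODE_REMOVED) + ": " + matches(mask, NODE_REMOVED));
    }
}
```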

For this reason, JSR 283 introduces the concept of journaled observation. The API adds an event journal which can be retrieved via the method ObservationManager.getEventJournal(). The EventJournal returned from this method can skip ahead in the event log to a given timestamp, allowing listeners to periodically disconnect and reconnect and resume processing where they left off.
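The resume-where-you-left-off pattern boils down to persisting a checkpoint timestamp and skipping to it on reconnect. The following toy model illustrates that pattern without the JCR API: ToyEventJournal, Entry, and resume() are invented names, timestamps are plain longs, and events sharing a timestamp with the checkpoint are not handled specially.

```java
import java.util.ArrayList;
import java.util.List;

final class ToyEventJournal {
    static final class Entry {
        final long timestamp;
        final String path;

        Entry(long timestamp, String path) {
            this.timestamp = timestamp;
            this.path = path;
        }
    }

    private final List<Entry> log;
    private int cursor = 0;

    ToyEventJournal(List<Entry> log) {
        this.log = log;
    }

    /** Skips forward past all entries recorded before 'date'. */
    void skipTo(long date) {
        while (cursor < log.size() && log.get(cursor).timestamp < date) {
            cursor++;
        }
    }

    boolean hasNext() {
        return cursor < log.size();
    }

    Entry next() {
        return log.get(cursor++);
    }

    /** Processes all entries from 'since' on and returns the checkpoint for the next run. */
    static long resume(List<Entry> log, long since, List<String> sink) {
        ToyEventJournal journal = new ToyEventJournal(log);
        journal.skipTo(since);
        long checkpoint = since;
        while (journal.hasNext()) {
            Entry e = journal.next();
            sink.add(e.path);
            checkpoint = e.timestamp + 1;
        }
        return checkpoint;
    }

    public static void main(String[] args) {
        List<Entry> log = new ArrayList<>();
        log.add(new Entry(10L, "/folder/node"));
        List<String> processed = new ArrayList<>();
        long checkpoint = resume(log, 0L, processed);
        System.out.println(processed + " next checkpoint: " + checkpoint);
    }
}
```

An external synchronization job would store the returned checkpoint durably after each run, so that a crash between runs costs at most a re-read of the journal and never a lost event.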

Locking
The concept of locking relates to locking a content node for exclusive modification, meaning only one user can modify a locked node. It is analogous to the locking semantics provided by many software versioning systems (Subversion being a notable exception, as it follows a non-locking model by default). Locking is an optional feature in the JCR, and whether the repository implementation at hand supports this feature can be determined via the following call: Repository.getDescriptor(Repository.OPTION_LOCKING_SUPPORTED).

It is important to point out that locking is not identical to check-in/check-out. The check-in/check-out semantics are a feature of versioning and are enabled by adding the mixin mix:versionable to the node type definition. A checked-out node, however, is not locked and can be modified by another user. To add locking support to a node, the mixin mix:lockable needs to be added to the node type definition. A node can then be locked via the Node.lock() method and unlocked via the Node.unlock() method. Via a parameter of the lock() method, the locking operation can be defined to be deep or shallow. A shallow lock applies only to the node the lock operation is called on. A deep lock also applies to all child nodes of the node, regardless of whether the child nodes are lockable or not. Deep locks are, for example, desired when the child nodes are owned by the parent node, i.e. a composition relationship.

The API provides two methods to determine whether nodes are locked: The method Node.isLocked() provides the locking status, either by a direct lock or by a parent node's deep lock. The method Node.holdsLock() returns whether the node itself holds a lock directly, i.e. it does not include parent nodes' deep locks in the result.

The lock method returns a Lock object which in turn contains a lock token. The concept of lock tokens might sound counter-intuitive at first. The JCR does not use the user who placed a lock on a node (also known as the lock owner) to determine whether the current session may modify a node. Instead, the lock token must be registered in the current session. When the Node.lock() method is called, the lock token is automatically added to the session. The lock token can be removed from the session and registered with another session, allowing the other session to modify the node. A token can only be registered with one session of a given repository, giving only one session at a time access to the node. The application code built on top of the content repository may implement a custom locking/unlocking strategy since it must maintain these tokens separately (the repository does not manage them). Locks can further be limited to the current session, automatically expiring when the session terminates (session-scoped lock):

    // isLocked returns whether a lock applies to the node,
    // either by being locked directly or by a parent node's deep lock
    if (!node.isLocked()) {
        node.lock(false /* isDeep */, false /* isSessionScoped */);
    } else {
        System.out.println("Node is already locked by " + node.getLock().getLockOwner());
    }

The repository restricts the following operations on a lockable node only when it is locked and the current session does not hold the lock token:

- Adding and removing properties
- Changing property values
- Adding or removing child nodes
- Adding or removing mixin types

The repository will not prevent deletion or moving of a lockable node, as these are strictly seen as operations of the parent node: both operations only change the child nodes of the parent node.
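The interplay of lock tokens, deep locks, and the isLocked()/holdsLock() distinction can be made concrete with a toy model. None of the classes below are part of the JCR API: ToyLocking and ToySession are invented, nodes are identified by path strings only, and token transfer is modeled by moving a string between two sets, which is what removing a token from one session and adding it to another amounts to.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ToyLocking {
    static final class ToySession {
        final Set<String> tokens = new HashSet<>();
    }

    private final Map<String, String> tokenByPath = new HashMap<>();
    private final Set<String> deepPaths = new HashSet<>();
    private int nextToken = 0;

    /** Locks 'path' and registers the new token with the locking session. */
    String lock(ToySession session, String path, boolean deep) {
        if (lockingPath(path) != null) {
            throw new IllegalStateException("already locked: " + path);
        }
        String token = "token-" + nextToken++;
        tokenByPath.put(path, token);
        if (deep) {
            deepPaths.add(path);
        }
        session.tokens.add(token);
        return token;
    }

    /** True only if the lock sits directly on this path. */
    boolean holdsLock(String path) {
        return tokenByPath.containsKey(path);
    }

    /** True if a direct lock or an ancestor's deep lock applies. */
    boolean isLocked(String path) {
        return lockingPath(path) != null;
    }

    /** The session may modify the node if it is unlocked or the session has the token. */
    boolean mayModify(ToySession session, String path) {
        String holder = lockingPath(path);
        return holder == null || session.tokens.contains(tokenByPath.get(holder));
    }

    private String lockingPath(String path) {
        if (tokenByPath.containsKey(path)) {
            return path;
        }
        for (String deep : deepPaths) {
            if (path.startsWith(deep + "/")) {
                return deep;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyLocking repo = new ToyLocking();
        ToySession editor = new ToySession();
        ToySession reviewer = new ToySession();
        String token = repo.lock(editor, "/folder", true);
        System.out.println(repo.mayModify(reviewer, "/folder/node")); // reviewer has no token
        editor.tokens.remove(token);
        reviewer.tokens.add(token); // hand the token over
        System.out.println(repo.mayModify(reviewer, "/folder/node"));
    }
}
```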

Shareable Nodes
In JSR170, a node can only be the child node of one other node. This makes it impossible to file nodes under multiple taxonomies, i.e. defining multiple paths to get to the same node. JSR283 hence introduces the (optional) concept of shareable nodes. A shareable node is a node which can share its properties and children with other nodes. A node type is made shareable by adding the mixin mix:shareable.

Shareable Nodes in JSR283

It is important to note that with shareable nodes, a path no longer identifies a unique node. Rather, there may be different paths to reach the same node: A/S1/Y and B/S2/Y both identify node Y in the above example. It is up to the particular implementation to define a deemed path, i.e. the default return value of Node.getPath().
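To see why one node ends up with several paths, it helps to model the sharing directly. In the toy tree below (not the JCR API; ToySharedTree, ToyNode, and shareAs() are invented names), S1 and S2 are two nodes that share one and the same set of children, so the single child Y is reachable under both parents.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ToySharedTree {
    static final class ToyNode {
        final String name;
        final Map<String, ToyNode> children;

        ToyNode(String name) {
            this(name, new LinkedHashMap<String, ToyNode>());
        }

        private ToyNode(String name, Map<String, ToyNode> children) {
            this.name = name;
            this.children = children;
        }

        ToyNode add(ToyNode child) {
            children.put(child.name, child);
            return child;
        }

        /** Creates a second node which shares this node's children. */
        ToyNode shareAs(String otherName) {
            return new ToyNode(otherName, children);
        }
    }

    /** Lists every path under 'root' that leads to the very same 'target' object. */
    static List<String> pathsTo(ToyNode root, ToyNode target) {
        List<String> result = new ArrayList<>();
        collect(root, "", target, result);
        return result;
    }

    private static void collect(ToyNode node, String path, ToyNode target, List<String> out) {
        for (ToyNode child : node.children.values()) {
            String childPath = path + "/" + child.name;
            if (child == target) {
                out.add(childPath);
            }
            collect(child, childPath, target, out);
        }
    }

    public static void main(String[] args) {
        ToyNode root = new ToyNode("");
        ToyNode s1 = root.add(new ToyNode("A")).add(new ToyNode("S1"));
        ToyNode y = s1.add(new ToyNode("Y"));
        root.add(new ToyNode("B")).add(s1.shareAs("S2"));
        System.out.println(pathsTo(root, y));
    }
}
```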

XML Import and Export

The JCR natively supports the import and export of content as XML. It defines two XML formats, the system view and the document view. While the system view is targeted to be a complete format for replicating the repository content, the document view provides a human-readable XML format at the expense of completeness. The document view is hence a subset of the system view. The two XML formats further differ greatly in structure. The system view is a namespace-aware, generic XML format which adheres to a pre-defined DTD. Nodes, for example, are expressed through the tag sv:node, as demonstrated in the following example:

    <?xml version="1.0" encoding="UTF-8"?>
    <sv:node xmlns:jcr="http://www.jcp.org/jcr/1.0"
             xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
             xmlns:sv="http://www.jcp.org/jcr/sv/1.0"
             xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns:fn="http://www.w3.org/2005/xpath-functions"
             xmlns:fn_old="http://www.w3.org/2004/10/xpath-functions"
             xmlns:mix="http://www.jcp.org/jcr/mix/1.0"
             xmlns:samples="http://www.jtoppe.com/samples"
             xmlns:rep="internal"
             sv:name="node">
      <sv:property sv:name="jcr:primaryType" sv:type="Name">
        <sv:value>samples:teaser</sv:value>
      </sv:property>
      <sv:property sv:name="jcr:uuid" sv:type="String">
        <sv:value>dac260e8-bc54-466e-8874-2da96a219c25</sv:value>
      </sv:property>
      <sv:property sv:name="jcr:created" sv:type="Date">
        <sv:value>2008-08-18T13:06:16.236-04:00</sv:value>
      </sv:property>
      <sv:property sv:name="samples:title" sv:type="String">
        <sv:value>sample title</sv:value>
      </sv:property>
    </sv:node>

The document view is explicitly not a generic format, but rather a content-specific XML format. Instead of utilizing generic tags, unique tags are created for each node, named after the node, as the following example shows:

    <?xml version="1.0" encoding="UTF-8"?>
    <jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0"
              xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
              xmlns:sv="http://www.jcp.org/jcr/sv/1.0"
              xmlns:xs="http://www.w3.org/2001/XMLSchema"
              xmlns:fn="http://www.w3.org/2005/xpath-functions"
              xmlns:fn_old="http://www.w3.org/2004/10/xpath-functions"
              xmlns:mix="http://www.jcp.org/jcr/mix/1.0"
              xmlns:samples="http://www.jtoppe.com/samples"
              xmlns:rep="internal"
              jcr:primaryType="rep:root">
      <folder jcr:primaryType="samples:folder"
              jcr:uuid="b583a22b-5af7-4db8-b880-64f841c7d09e"
              jcr:created="2008-08-18T13:39:12.769-04:00">
        <node jcr:primaryType="samples:teaser"
              jcr:uuid="ae09589f-6a76-47c7-bca8-e86c7d70f3d5"
              jcr:created="2008-08-18T13:39:12.769-04:00"
              samples:title="first title"/>
        <node jcr:primaryType="samples:teaser"
              jcr:uuid="50d454f4-5ce7-4858-af43-4e5b5a900950"
              jcr:created="2008-08-18T13:39:12.769-04:00"
              samples:title="second title"/>
      </folder>
    </jcr:root>

Export of content is provided by the Session.exportSystemView() and Session.exportDocumentView() methods, whereas import is provided by Session.importXML(). The import method automatically detects whether the passed-in XML is a system or document view.

Searching Content
The JCR provides multiple ways to discover content. The repository can be browsed through the node API by, for example, applying the visitor pattern. The JCR further sets forth a variety of ways of searching content. For one, all content can be discovered through XPath, which is a mandatory feature for all repository implementations. Optionally, the repository may allow a special SQL-like syntax for searching content.

JSR 283 deprecates the JSR 170 SQL syntax and replaces it with JCR-SQL2. It further introduces a third query mechanism, JCR-JQOM, which expresses the query as a tree of Java objects.

Query Specification

While queries can be written in a variety of ways (see above), every query consists of the following elements:

- Type constraints, which limit the query results to a particular type of node. This can be used to query for primary node types as well as mixin types.
- Property constraints, which limit the result set to nodes which contain a specific property or properties with specific values.
- Path constraints, which constrain the query result to a certain path in the repository.
- An ordering specifier, which defines the order in which nodes are returned (for example alphabetical by node name).
- Column specifiers, which determine the field names to be returned when the result is viewed in the row/column format.
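To show how these elements map onto an actual query string, the sketch below assembles a JSR 170 style XPath query from a path constraint, a type constraint, a property constraint, and an ordering specifier. The helper class XPathQueryBuilder is hypothetical and not part of any JCR implementation; column specifiers are left out, and doubling single quotes inside the value is an assumption about how the target repository expects literals to be escaped.

```java
final class XPathQueryBuilder {
    /**
     * Builds an XPath query string from the query elements listed above.
     * 'orderBy' may be null when no ordering is wanted.
     */
    static String build(String path, String nodeType, String property,
                        String value, String orderBy) {
        StringBuilder q = new StringBuilder("/jcr:root").append(path);       // path constraint
        q.append("//element(*, ").append(nodeType).append(")");             // type constraint
        q.append("[@").append(property).append(" = '")
         .append(value.replace("'", "''")).append("']");                    // property constraint
        if (orderBy != null) {
            q.append(" order by @").append(orderBy).append(" ascending");   // ordering specifier
        }
        return q.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("/folder", "samples:teaser",
                "samples:title", "second title", "samples:title"));
    }
}
```

The resulting string would be handed to the query manager together with the XPath language constant.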

Executing Queries

All queries are submitted through the QueryManager, which can be obtained from the workspace via the method Workspace.getQueryManager(). It allows the programmer to create new queries via the call QueryManager.createQuery(String query, String queryLanguage), where the query language parameter is one of Query.SQL or Query.XPATH. This method call returns an object of type Query which can then be executed.

Persistent Queries

Queries can be made persistent, i.e. stored as a node in the repository, by calling Query.storeAsNode(String path), which stores the query as a node of type nt:query. Such a persistent query can later be retrieved via QueryManager.getQuery(Node node).

Accessing Query Results

The query results can be presented in two different ways: as an iterator over all resulting nodes, or as a table in which each row represents a resulting node and the columns hold selected properties of the matching nodes. The following code outlines the simplest way of viewing a result set by iterating over the nodes:

    QueryResult queryResult = query.execute();
    NodeIterator nodes = queryResult.getNodes();
    // now iterate over the nodes

The following example utilizes a row-based approach to visualize the query result:

    String[] headers = queryResult.getColumnNames();
    for (int i = 0; i < headers.length; i++) {
        System.out.printf("%20s | ", headers[i]);
    }
    System.out.println();

    for (RowIterator rowIterator = queryResult.getRows(); rowIterator.hasNext();) {
        Row row = rowIterator.nextRow();
        Value[] values = row.getValues();
        for (int i = 0; i < values.length; i++) {
            System.out.printf("%20s | ", values[i].getString());
        }
        System.out.println();
    }

XPath Search

The XPath search is executed against the document view XML format (see section XML Import and Export). Let's explore this by an example. The following code creates three nodes underneath the root node:

    Node folder = root.addNode("folder", "samples:folder");
    Node node = folder.addNode("node", "samples:teaser");
    node.setProperty("samples:title", "first title");
    Node node2 = folder.addNode("node", "samples:teaser");
    node2.setProperty("samples:title", "second title");

    // export to XML
    session.exportDocumentView("/", System.out, true, false);

This translates to the following document view XML (the namespace declarations have been omitted for brevity's sake):

    <jcr:root jcr:primaryType="rep:root">
      <folder jcr:primaryType="samples:folder"
              jcr:uuid="b583a22b-5af7-4db8-b880-64f841c7d09e"
              jcr:created="2008-08-18T13:39:12.769-04:00">
        <node jcr:primaryType="samples:teaser"
              jcr:uuid="ae09589f-6a76-47c7-bca8-e86c7d70f3d5"
              jcr:created="2008-08-18T13:39:12.769-04:00"
              samples:title="first title"/>
        <node jcr:primaryType="samples:teaser"
              jcr:uuid="50d454f4-5ce7-4858-af43-4e5b5a900950"
              jcr:created="2008-08-18T13:39:12.769-04:00"
              samples:title="second title"/>
      </folder>
    </jcr:root>

To create a query which returns all nodes underneath the node folder which have a property named samples:title with the value second title, the following XPath query can be constructed:

    QueryManager queryManager = workspace.getQueryManager();
    Query query = queryManager.createQuery(
        "/jcr:root/folder/*[@samples:title='second title']", Query.XPATH);
    QueryResult queryResult = query.execute();
    for (NodeIterator nodes = queryResult.getNodes(); nodes.hasNext();) {
        Node n = nodes.nextNode();
        System.out.println(n.getPath());
    }

JCR-SQL

While the XPath notation is very powerful, many developers find it hard to adopt (unless of course they're XML experts). The SQL notation represents a powerful alternative. However, it is deprecated and replaced by JCR-SQL2 in JSR 283. The JCR-SQL equivalent of the above XPath query is:

    Query query = queryManager.createQuery(
        "select * from samples:teaser where samples:title='second title'", Query.SQL);
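Because the XPath search is defined against the document view, the XPath expression from the example above can be tried out on a document view export with nothing but the JDK's own XML tooling. The sketch below does exactly that. It uses the standard javax.xml.xpath engine and not a JCR query engine, so it only demonstrates how the expression selects nodes in the exported XML; the class name DocumentViewXPath is made up, and the XML is a shortened copy of the export shown earlier.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

final class DocumentViewXPath {
    static final String JCR_NS = "http://www.jcp.org/jcr/1.0";
    static final String SAMPLES_NS = "http://www.jtoppe.com/samples";

    // Shortened document view export of the folder with its two teasers.
    static final String XML =
        "<jcr:root xmlns:jcr='" + JCR_NS + "' xmlns:samples='" + SAMPLES_NS + "'>"
        + "<folder jcr:primaryType='samples:folder'>"
        + "<node jcr:primaryType='samples:teaser' samples:title='first title'/>"
        + "<node jcr:primaryType='samples:teaser' samples:title='second title'/>"
        + "</folder></jcr:root>";

    /** Evaluates the expression against the XML above and returns the number of hits. */
    static int countMatches(String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the jcr: and samples: prefixes
        Document doc = factory.newDocumentBuilder()
            .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                if ("jcr".equals(prefix)) return JCR_NS;
                if ("samples".equals(prefix)) return SAMPLES_NS;
                return XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) { return null; }
        });
        NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countMatches("/jcr:root/folder/*[@samples:title='second title']"));
    }
}
```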

Transactions

Transaction support is an optional feature in the JCR, and whether the repository implementation at hand supports this feature can be determined via the following call: Repository.getDescriptor(Repository.OPTION_TRANSACTIONS_SUPPORTED). A repository which supports transactions fully integrates with the Java Transaction API (JTA), allowing all node operations to be part of a transaction scope. When a transaction is in progress, the behavior of the JCR APIs changes slightly. For example, the Session.save() operation doesn't commit the changes to the repository (and hence make them visible to other sessions) until the transaction has been successfully committed.


Comments

9 Responses to JCR Deep Dive

Duncan Reade on January 30th, 2009 8:54 am

First, thank you for providing this very informative site. And Secondly, just in case it has not been brought to your attention: the URI links to the http://www.jcp.org/ site are out of date (used in section 3. Defining Content). Regards Duncan Reade

Thomas Einwaller on March 13th, 2009 3:20 am

Thanks for that great post. It seems node type folder is missing?

Yasser on June 12th, 2009 7:06 am

Really good post.. thanks for the effort

Bruno Dusausoy on October 22nd, 2009 6:50 am

Hi, great article. But as Thomas said previously, it seems you forgot to put the node type samples:folder definition. It seems it's a copy/paste of the samples:content definition instead.

Patrick van Kann on October 19th, 2010 11:58 am

Really great article. With regards to the missing definition for samples:folder, I believe the below works.

    [samples:folder] > samples:content
      // accept any subnodes of type samples:content
      + * (samples:content) multiple

I adapted this from the example in the CND in a nutshell section of this blog.

Jochen on October 20th, 2010 10:44 am

Thanks! I need to spend some time updating my blog at some point :-)

Serge on October 28th, 2010 12:54 pm

Great content. Are there some good interative builder for CND Types ? Would be interesting links

Jochen on October 28th, 2010 1:16 pm

Building an interactive builder shouldn't be that hard. Back when I worked for CoreMedia we had a tool that would take the XMI from any old UML tool and XSLTed it to a content definition file. Wasn't CND but XML, but same idea.

Confluence: MIS I. T. Development on September 20th, 2011 7:45 am

Java Content Repository Deep Dive I found an interesting article online about Java Content Repositories (JCR). I'm going to look into this further because it may be useful as a replacement to our current JPA workflow. JCR supports many features that should be useful,

Copyright 2008 Jochen Toppe's Blog Powered by WordPress Based on Silhouette theme created by Brian Gardner
