Steve Hannah

Ramblings about Xataface, Java, and other software development issues

April 23, 2011

Adobe CQ5 Developer Training

Filed under: Work, Software Development — shannah @ 10:37 am

I just spent the past week in a developer training course for Adobe Communiqué 5.4 - a content management system on steroids. I thought I’d jot down some of my thoughts while they’re fresh in my mind.

CQ5 is a Java-based CMS that is built around the JSR-283 (Java Content Repository) spec, which essentially defines a sophisticated object database. In CRX and Jackrabbit the content is indexed by Lucene for easy searching and cross-referencing of objects. CQ5’s JCR implementation is called CRX, but there is also an open source reference implementation named Apache Jackrabbit if you have an allergy to commercial software.

It is not entirely correct to call the JCR an object database as it isn’t used to store Java objects directly - but the fact that it defines a tree of nodes and that all content is stored and accessed in a hierarchical fashion makes its use very similar to that of an object database. As such, it is natural to draw comparisons with Zope and its object database, the ZODB.

JCR vs ZODB

Zope, a Python-based application framework, is radically different from the traditional relational database model of web application development. The ability to store Python objects directly in the database and have them indexed solved many development problems, but it also created a few problems that would make maintenance of an ever-changing web application more difficult. Namely:

1. When you make changes to a class, it can break all of the existing objects of that class in the database (you need to run a migration).
2. If you try to load an object whose class definition can’t be found, the system barfs.

This problem of class versions, managing upgrades of content types, etc., was the single biggest problem with developing on Zope - and while I’m sure that there are best practices to work around this problem, I believe that the JCR solution of storing content nodes but not actual objects is a much cleaner way of handling content.

The JCR stores a tree of content nodes, each of which has properties and its own child nodes. These structures translate well to different formats like XML (so you can dump entire branches of the repository as XML) and JSON - not so with a pure object database like the ZODB, whose structures can be far more complex and include dependencies on classes. Data in the JCR can always be browsed independent of the component libraries which may be loaded into the system. You can browse the repository using WebDAV, the web-based content explorer that is built into CRX (the JCR implementation that is packaged with CQ5), or using CRXDE (the Eclipse-based development environment that is freely available to developers).
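To make the "translates well to JSON" point concrete, here is a minimal sketch. It is not the CRX API; the node names, properties, and the `toJsonTree` helper are all made up for illustration. It only shows why a tree of nodes with plain properties serializes so easily.

```javascript
// A hypothetical branch of the repository, modeled as plain objects.
// Each node has a name, a bag of properties, and a list of child nodes.
var aboutPage = {
  name: 'about',
  properties: { 'jcr:primaryType': 'nt:unstructured', title: 'About us' },
  children: [
    {
      name: 'intro',
      properties: { 'jcr:primaryType': 'nt:unstructured', text: 'Hello' },
      children: []
    }
  ]
};

// Flatten a node into a JSON-friendly object: properties become keys,
// child nodes become nested objects keyed by node name.
function toJsonTree(node) {
  var out = {};
  Object.keys(node.properties).forEach(function (key) {
    out[key] = node.properties[key];
  });
  node.children.forEach(function (child) {
    out[child.name] = toJsonTree(child);
  });
  return out;
}

var dump = JSON.stringify(toJsonTree(aboutPage), null, 2);
console.log(dump);
```

Nothing in the dump depends on a class definition being loadable, which is exactly the property that the ZODB lacks.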

You can still define custom node types for your repository but this would merely dictate the name of the node type and perhaps which properties are required.

So, at first glance, this seems like a very stable base upon which to build web applications.

The Stack

The CQ5 stack looks like this:

- WCM - The web content management layer, consisting of a bunch of flashy UI components built using the ExtJS JavaScript library (this part is proprietary).
- Sling - A web framework that makes it easy to read from and write to the repository using HTTP requests. Very slick (this part is open source).
- CRX - The content repository itself. Handles all permissions, storage, replication, etc… This part is proprietary. It performs the same function as Apache Jackrabbit, but includes a number of enterprise-level improvements including a more powerful security model (I am told).

Author & Publish Deployment Instances

The recommended deployment is to have separate author and publish environments each running their own stack, and use the built-in replication feature to propagate authors’ changes to the publish instance whenever a piece of content is activated. This functionality, luckily, has been streamlined to hide most of the complexity. Workflow is built-in to allow you to activate each piece of content individually. Activation automatically triggers replication to the publish instance(s). This model seems to be very well suited to websites with few authors and many public viewers. It is scalable also, as you can add as many publish instances as you want to share the load.

This standard flow control (replicating changes from the author instance to the publish instances) leads me to wonder about cases where you do want the public to be able to interact with your site (e.g. through comments). We didn’t get into this scenario very much in the training, but, as I understand it, any content posted to the publish instance will go into an “outbox” for that instance that will be replicated to the author instances and await approval. It will then be re-replicated back to the publish instances once approved.

Security Model

The security model is quite different than that of most systems. Rather than having security attached to content types (because there are no content types) like with a relational database, or defining a large set of permissions corresponding to each possible action in the system as Zope does, security is 100% attached to the nodes themselves. Each node in the JCR includes an ACL (access control list) which maps only a small set of permissions to each user. There are only a few possible permissions that can be assigned or denied on each node. Basically it boils down to permission to read, write, delete, create new, set permissions, and get permissions on a node level. If there are no permissions assigned to a user on a particular node, then it will use permissions from the node’s parent.
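Here is a rough sketch of the "fall back to the parent" rule as I understand it. This is a toy model, not CRX's actual algorithm: the `makeNode` and `isAllowed` helpers, the user name, and the choice to fall back per permission are all my own assumptions.

```javascript
// Hypothetical in-memory nodes: each has a parent link and an optional ACL
// mapping a user name to explicit grants (true) or denials (false).
function makeNode(name, parent, acl) {
  return { name: name, parent: parent || null, acl: acl || {} };
}

// Walk from the node up toward the root until some node has an explicit
// entry for this user and permission; default to "denied" at the top.
function isAllowed(node, user, permission) {
  for (var current = node; current !== null; current = current.parent) {
    var entry = current.acl[user];
    if (entry && Object.prototype.hasOwnProperty.call(entry, permission)) {
      return entry[permission];
    }
  }
  return false;
}

var aclRoot = makeNode('content', null, { alice: { read: true, write: true } });
var aclSite = makeNode('site', aclRoot, { alice: { write: false } });
var aclPage = makeNode('page', aclSite); // no ACL of its own

console.log(isAllowed(aclPage, 'alice', 'read'));  // inherited from 'content'
console.log(isAllowed(aclPage, 'alice', 'write')); // denied at 'site'
```

The point is that where a node sits in the tree decides who can touch it, so the hierarchy is part of your security design.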

One implication of this security model is that you must pay attention to the content hierarchy when developing applications. You cannot treat this like a relational database!

This is important. I suspect that many developers coming from a relational database background will be tempted to try to merge the best of both worlds and create pseudo-content types in the system. After all, all properties in the JCR are indexed, so you could easily just add a property called ‘contentType’ to your nodes to identify them as a particular content type, then build functionality that allows users to add instances of this content type. You could then create view templates that aggregate these content types to treat them as a table. You could do this, *but* you must be aware that you don’t have the same level of control that you have in a relational database system over what a user can do with your content types.

If you are querying the repository solely based on a property on a node - and not based on the path - then you may be surprised by the results that you obtain. At the very least, the JCR security model, despite appearing to be simple, is actually far more difficult to work with than its relational cousin when you are trying to imitate the functionality of a relational database. You cannot control what properties are added to every node in the repository, so querying based on property values may produce undesirable results. Instead you have to fully embrace the hierarchical model of data and step very carefully when you try to import concepts from other paradigms, as they could cause you to inadvertently introduce security holes.
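A small sketch of the trap, using a made-up flat listing of nodes rather than a real JCR query. The paths, the ‘contentType’ property, and both helper functions are hypothetical. The property-only search picks up a node that somebody created in a part of the tree you did not have in mind.

```javascript
// Hypothetical flat listing of repository nodes: path plus properties.
var repoNodes = [
  { path: '/content/site/news/a', props: { contentType: 'article' } },
  { path: '/content/site/news/b', props: { contentType: 'article' } },
  { path: '/content/usergenerated/x', props: { contentType: 'article' } }
];

// Match on the property alone, wherever the node lives.
function findByProperty(list, name, value) {
  return list.filter(function (n) { return n.props[name] === value; });
}

// Match on the property, but only below a given branch of the tree.
function findUnderPath(list, root, name, value) {
  var prefix = root.replace(/\/$/, '') + '/';
  return findByProperty(list, name, value).filter(function (n) {
    return n.path.indexOf(prefix) === 0;
  });
}

var everywhere = findByProperty(repoNodes, 'contentType', 'article');
var newsOnly = findUnderPath(repoNodes, '/content/site/news', 'contentType', 'article');

console.log(everywhere.length, newsOnly.length);
```

Restricting by path ties the query back to the hierarchy, which is where the permissions actually live.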

Custom Content Types (Sort of)

While CQ doesn’t have custom content types, it does allow you to map content nodes to a set of rendering scripts, which produces something very much like a content type. By setting the “sling:resourceType” property on a node to the path to a “component” that you develop, you can dictate where CQ looks for scripts that are used to render the node when requests are made. Components can be either “page” components, which represent an entire page, or regular components, which are included inside a page.

You can register page components to show up in the list of types of pages that can be added by authors when they add a new page to the system. Similarly you can register your regular components to show up in the “sidekick” (i.e. component palette) for authors when they are editing a page, so that they can be dragged onto a page. You can define which types of components are allowed to be parents or children of other components, and you can define which parts of the site are allowed to have particular component types added.

The Component Hierarchy

You can also define a “sling:resourceSuperType” for components to allow them to inherit from other components in the system. This is handy for code reuse as there are hundreds or thousands of existing components that can be overridden or extended. We ran through several exercises creating and extending components. I’m satisfied that this process is not difficult and quite powerful.
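To picture how a resource type and its super type fit together, here is a simplified sketch. It is not Sling's real resolution logic (which also considers search paths, selectors, and extensions); the component paths, script names, and the `resolveScript` helper are invented for illustration.

```javascript
// Hypothetical component registry keyed by component path. Each component
// lists the scripts it provides and, optionally, a super type to inherit from.
var components = {
  'myapp/components/basepage': { scripts: ['body.jsp', 'head.jsp'] },
  'myapp/components/newspage': {
    superType: 'myapp/components/basepage',
    scripts: ['body.jsp']
  }
};

// Find which component supplies a script for a resource type, climbing the
// super type chain when the component itself does not provide it.
function resolveScript(resourceType, scriptName) {
  var seen = {};
  var type = resourceType;
  while (type && !seen[type] && components[type]) {
    seen[type] = true; // guard against accidental inheritance cycles
    if (components[type].scripts.indexOf(scriptName) !== -1) {
      return type + '/' + scriptName;
    }
    type = components[type].superType;
  }
  return null;
}

var contentNode = { 'sling:resourceType': 'myapp/components/newspage' };
var bodyScript = resolveScript(contentNode['sling:resourceType'], 'body.jsp');
var headScript = resolveScript(contentNode['sling:resourceType'], 'head.jsp');

console.log(bodyScript); // overridden by the news page component
console.log(headScript); // inherited from the base page component
```

Overriding one script while inheriting the rest is what made the extension exercises so quick.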

Component Dialogs

A component without a dialog is really a lame duck. Users (especially authors) need to be able to interact with your components. E.g. if you create a photo album component, you need to allow your user to add photos to it. Adding dialogs is not difficult but I suspect that the development process is slated for improvements and more automation for future releases. The dialog forms are created entirely by creating appropriately named subtrees under your component’s node. E.g. you would create a child node of a particular type named “dialog”, which contains a child node named “items”, which contains a subnode named “tabs”, etc… 6 or 7 layers deep.

Each tab, each widget, each panel, is represented by a node in the repository. This is clever but somewhat tedious. It is like building a UI using only the UI hierarchy tree in the left panel of the IDE without the visual editor. I suspect that future versions will probably include a proper WYSIWYG UI editor for developing these dialogs but for now this manual system will have to do.
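For a feel of the nesting, here is the kind of subtree involved, written as a plain JavaScript object. The node type names and widget properties are as I remember them from the exercises and should be treated as assumptions, and the photo caption field is made up. The `depth` helper just counts how many layers you end up creating by hand.

```javascript
// A hypothetical dialog subtree for a photo album component.
var dialog = {
  'jcr:primaryType': 'cq:Dialog',
  items: {
    'jcr:primaryType': 'cq:WidgetCollection',
    tabs: {
      'jcr:primaryType': 'cq:TabPanel',
      items: {
        'jcr:primaryType': 'cq:WidgetCollection',
        tab1: {
          'jcr:primaryType': 'cq:Panel',
          title: 'Photos',
          items: {
            'jcr:primaryType': 'cq:WidgetCollection',
            caption: {
              'jcr:primaryType': 'cq:Widget',
              xtype: 'textfield',
              fieldLabel: 'Caption',
              name: './caption'
            }
          }
        }
      }
    }
  }
};

// Count the layers of nested nodes, treating every object value as a child.
function depth(node) {
  var deepest = 0;
  Object.keys(node).forEach(function (key) {
    var value = node[key];
    if (value !== null && typeof value === 'object') {
      deepest = Math.max(deepest, depth(value));
    }
  });
  return deepest + 1;
}

console.log(depth(dialog));
```

One text field already costs seven layers of nodes here, which is why a visual editor would be welcome.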

Despite the tediousness of the process, in the scheme of things it is still quite efficient. In only a few minutes you can produce a multi-tab, multi-field UI with rich widgets that allows your users to add and edit a myriad of content types on your site.

April 11, 2011

TestDisk a Nifty Utility for fixing drives with bad boot sectors

Filed under: Work, Software Development — shannah @ 11:16 am

Just ran into an interesting problem with an external hard drive that was being used as a Time Machine backup for a laptop. Someone tried to connect this drive to their Windows machine and it evidently screwed up the boot bits, so not only would Windows not recognize it, Macs wouldn’t recognize the disk either.

Tried running it through Disk Utility but received a message saying “Disk cannot be repaired.”

So I loaded up TestDisk and took it for a spin. Here is a photo gallery outlining the steps that I took.

May 5, 2009

Host-alert.com

Filed under: Work — shannah @ 9:08 am

Just set up server monitoring with host-alert.com to alert me when my server goes down.

April 20, 2009

Why piracy must be stopped

Filed under: Work, Software Development, News — shannah @ 4:15 pm

I wrote this in response to a number of “pro piracy” or “piracy rationalization” comments to a CNN article:
http://scitech.blogs.cnn.com/2009/04/20/a-turning-point-for-online-piracy

This appears to be a culture war, and one that is being lost - and will eventually cost us dearly. Many of these comments are consistent with my anecdotal experience with friends and acquaintances. People who are involved in theft, be it digital or material, always try to rationalize their behavior. Nobody actually believes that they are a bad person. I have known people who earn a living by stealing car stereos. Their justification will generally include such points as “insurance will pay for it - and big insurance companies deserve to be robbed..”, or “the guy who owns the car is obviously rich and can afford to get a new stereo”. Either way there is some justification or rationalization that allows the thief to sleep at night.

Digital piracy is no different. There seem to be many well-articulated arguments to justify digital piracy, but all seem to be predicated on the assumption that since “stealing” digital content does not deprive the original owner of the content, it isn’t really like stealing at all. You wouldn’t steal your friend’s car because then your friend would be without a car (and you would be without a friend). However if such a thing as a car replicator existed that allowed you to duplicate your friend’s car for free, then you probably wouldn’t think twice about “replicating” your friend’s car.

So for pirates who otherwise are not thieves, it seems to boil down to an internal rejection of the notion that digital piracy is, in fact, theft. Fair enough. It is different enough from material theft that we might as well distinguish it from material theft and give it a different name. So piracy is not “stealing”; it is simply “piracy”.

Now that we have distinguished it, let’s look at some of the implications of piracy.

1. If a product is freely available via piracy, and in our culture, piracy is considered OK, then anyone who decides to “purchase” that product is really engaging in a form of charity because they believe in the cause of the product or the person who created it. This is why 10 years ago you thought it was OK to pay $20 for a DVD movie (because you were purchasing a product), but now you think that $20 is a rip-off, because you are now engaging in $20 of charity - more difficult to justify (people spend up to 10% of their income on charitable donations, and the other 90% on themselves - by the same formula you’d think that a pirate who likes a movie would be willing to donate $2 to the movie-maker, even though he would have been willing to purchase it from the movie maker for 10 times that).

2. Based on the economic assumption that people are inherently greedy, most people won’t choose to “purchase” a product when they can get it for free.

3. The marginal value of any product that can readily be pirated will approach zero.

4. At a value of zero the product is not worth making, so the supply of good digital products (e.g. music, movies, software, e-books) will also approach zero - you won’t be able to get them anymore.

If, as a culture, we want to preserve our rich climate of art and ideas, it is imperative that we address this issue. Simply lowering prices to reflect what “pirates” perceive as reasonable prices would result in artificially low prices (because a pirate’s perceived value of content is based on how much he would donate out of altruism, not how much the product should actually be worth to him). If we completely eliminated piracy, only then could we find out what a digital product is really worth. If prices are too high, people won’t pay them, and they will come down. If prices are too low so as to deter artists from producing product, then prices will go up until they reach equilibrium.

They cannot reach equilibrium as long as there is a free alternative to every digital product.

Attempts such as Apple’s DRM are certainly a step in the right direction, but have been met with much resistance from the “pirate” community, as they want the ability to copy anything that they buy freely. Unfortunately we’ve seen that people are not responsible enough to handle this privilege, so it is unrealistic to think that any solution without some form of DRM will solve our problem and produce a proper equilibrium.

Given the facts and the implications of those facts, it is imperative that we proceed with whatever reasonable acts are necessary to curtail piracy. It may not be stealing, but it is still bad for society.

November 25, 2008

Replacing Scriptaculous/Prototype with jQuery

Filed under: Work, Software Development — shannah @ 4:34 pm

I have used Scriptaculous in the past to sprinkle little bits of UI magic into Xataface. Specifically, I have used it to add collapsible sections, sortable sections (via drag-and-drop), and sortable tables (also via drag and drop). These worked great! The Scriptaculous library was a bit bulky and it made the initial page load time a little bit longer, but the result was worth it.

Unfortunately I have started to run into problems with Scriptaculous interfering with other scripts on the page. Scriptaculous is built on the Prototype.js library, which adds a number of handy methods and attributes to the built-in JavaScript types, like objects, arrays, DOM Elements, and strings. As a proof of concept, this is great as it shows off the dynamic features of the JavaScript programming language. However this can cause problems with scripts that count on the default behavior of these built-in types.

For example, I have made use of Kevin van Zonneveld’s php.js library, which provides pure JavaScript implementations of familiar PHP functions. One such function is count(), which is supposed to return the number of elements in a PHP array. In JavaScript, this function can take either objects or arrays as a parameter in order to provide the closest possible behavior to its PHP counterpart. Essentially, all this function does is count the number of elements in the array (or object) and return the result as an integer. Unfortunately, after including the Prototype.js library, all objects now have a number of default properties and methods whether you want them or not because they are added to Object.prototype. This effectively breaks the count() function and I can’t see a viable way to work around the problem other than removing Prototype.js from the mix.

Why does prototype.js break the count() function?

Take the following example:

var o = {0 : 'a', 1: 'b', 2: 'c'};
count(o); // should return 3 but with Prototype.js installed it returns 25

This returns the wrong result because Prototype.js adds a number of methods and properties to all objects in the system, so the count() function must count these also.
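The mechanism can be reproduced without either library. In this sketch, `naiveCount` is my own stand-in for a for...in based count() (not the actual php.js source), and `hypotheticalExtension` stands in for whatever a library might add to Object.prototype. The `ownCount` variant shows the usual hasOwnProperty guard, which only helps if you are able to change the counting code.

```javascript
// Stand-in for a for...in based count(): counts every enumerable key,
// including keys inherited through the prototype chain.
function naiveCount(value) {
  var n = 0;
  for (var key in value) {
    n++;
  }
  return n;
}

// Guarded variant: counts only the keys defined on the object itself.
function ownCount(value) {
  var n = 0;
  for (var key in value) {
    if (Object.prototype.hasOwnProperty.call(value, key)) {
      n++;
    }
  }
  return n;
}

var o = {0: 'a', 1: 'b', 2: 'c'};
var before = naiveCount(o);

// Simulate a library extending the built-in type (hypothetical method name).
Object.prototype.hypotheticalExtension = function () {};
var after = naiveCount(o);
var guarded = ownCount(o);

// Clean up so the rest of the page is not affected.
delete Object.prototype.hypotheticalExtension;

console.log(before, after, guarded);
```

Every enumerable addition to the prototype inflates the naive count by one, for every object on the page.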

jQuery to the Rescue

Luckily there is another library that does everything that I have been using Scriptaculous/Prototype.js for: jQuery. It is leaner and less intrusive. It doesn’t change any of the underlying types and it still provides the drag-and-drop sorting of sections, and collapsing/expanding of sections. And in most cases it provided a cleaner, faster solution than the one I needed with Scriptaculous.

July 4, 2007

Insulating the ZODB from bad products

Filed under: Work, Software Development — shannah @ 5:25 pm

The ZODB (Zope Object Database) is a wonderful little invention that provides Zope and Plone with a lot of flexibility. Because it uses a hierarchical format, it is intuitive and easy to move and copy objects around.

However, it seems that the proper functioning of the ZODB depends heavily on all of the objects stored therein being in good health. This means that if you inadvertently install a product that doesn’t cover all of its bases, you could be up the creek without a paddle when it comes time to copy or migrate the site.

I am currently attempting to upgrade our Faculty’s Plone web site to use the new SFU look and feel. I set up a development server a couple of months ago to work on the new skin. Now that it is ready, I would like to create a copy of our site on the same Zope instance so that I can install the skin on that copy, then just change the path so that the change can happen instantaneously.

This strategy would work perfectly if we were working directly on the file system. However, I have encountered a basket full of problems in trying to make this copy. It seems easy enough. You click the little box beside the site in the ZMI, press the "Copy" button, then click the "Paste" button. If only it were that simple.

In my first attempt, it churned for about 30 minutes before returning an error that it couldn’t find a transform for the image/pcx type. After some searching, I found an obscure fix for this issue, involving the temporary removal of one of the python source files for the PortalTransforms package.

My next attempt resulted in some errors relating to an old product (CoreBlog) that was no longer installed in the system. Apparently there were still some remnants left in the ZODB. I couldn’t find any actual CoreBlog objects, but the error seemed to indicate that there were some remnants left in the portal catalog.

So I tried updating the portal catalog to see if that would fix anything. After about 30 minutes of thinking it returned a read-write error.

Next I tried to clear and then rebuild the catalog. This worked. Now I’m back trying to make a copy of the site… It is still thinking….

Getting to the point

So the point of this post was two-fold.

  1. To rant about Plone
  2. To suggest to those who might be reading this and have a hand in the direction of Zope and Plone, that the ZMI should be insulated from bad products. Imagine if, when copying files from your hard disk to a flash drive, the operating system crashed because one of the files was corrupt. This would make computers nearly impossible to use. How about an error log to inform me that one of the files couldn’t be copied - but let the rest of the copy go through. Or better yet, let the copy go through unhindered, allowing whatever problems existed in the original file to be carried through to the copy. I could live with that.
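The "log it and keep going" behavior I am asking for can be sketched in a few lines. This is only an illustration of the idea, not anything from Zope or the ZMI; the object names, the `broken` flag, and both helper functions are hypothetical.

```javascript
// Copy a tree of objects, recording failures instead of aborting the copy.
function copyTree(node, copyOne, errors, parentPath) {
  var path = (parentPath || '') + '/' + node.name;
  var copy;
  try {
    copy = copyOne(node);
  } catch (e) {
    // Note the problem and skip this branch; siblings still get copied.
    errors.push({ path: path, message: e.message });
    return null;
  }
  copy.children = [];
  (node.children || []).forEach(function (child) {
    var childCopy = copyTree(child, copyOne, errors, path);
    if (childCopy !== null) {
      copy.children.push(childCopy);
    }
  });
  return copy;
}

// Hypothetical per-object copy that fails on objects marked as broken,
// e.g. leftovers from a product that is no longer installed.
function copyOne(node) {
  if (node.broken) {
    throw new Error('cannot load object');
  }
  return { name: node.name };
}

var originalSite = {
  name: 'site',
  children: [
    { name: 'news', children: [] },
    { name: 'oldblog', broken: true, children: [] }
  ]
};

var copyErrors = [];
var siteCopy = copyTree(originalSite, copyOne, copyErrors);

console.log(siteCopy.children.length, copyErrors);
```

One bad object costs you one line in the error log rather than the whole copy.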

April 12, 2007

OpenOffice Base Links

Filed under: Work, Software Development — shannah @ 3:49 pm

Open Office Base Message Forum

Good open office base tutorials

Filed under: Work, Software Development — shannah @ 3:41 pm

http://sheepdogguides.com/fdb/fdb1main.htm

This guy has some nice tutorials on how to use the Open Office Database tool.

Dataface to Open Office: You complete me

Filed under: Work, Software Development — shannah @ 8:34 am

I just ran across the latest release of OpenOffice.org (version 2.2), which includes the holy grail of database development: Base. This version contains a built-in database that moves into the realm of FileMaker for ease of use. It allows power users to develop tables, views, queries, forms, and reports inside of OpenOffice. What’s more, once you have registered the database, you can use it in the other parts of OpenOffice (like Writer and Calc). This is the way it ought to be.

It now looks like Open Office is a perfect development environment for DBAs that need to roll out database solutions for clients. It is available on just about every OS under the sun so there are no compatibility issues. All of the databases are stored in the Open Document format - so a database can be shared and copied.

What really interests me, however, is the fact that these great tools can work with existing SQL databases like MySQL with minimal hassle. That, and the fact that the DBs are stored in an open format.

Here’s the idea: Dataface can create .odb files (the database file format for Open Office) on the fly that will allow users to interact with the database application using the quick and easy Open Office interface. For some things, a web interface is just too clunky. I’m not sure how deep this rabbit hole goes, but I intend to explore it to its limits to see just how much Dataface can be integrated with Open Office.

January 23, 2007

CAS4PAS Mission - The further adventures of Plone 2.5

Filed under: Work, Software Development — shannah @ 4:10 pm

Okay.. I’m still at this Plone 2.5 upgrade thing. The early reports saying that the upgrade from Plone 2.1 to 2.5 is a snap are indeed correct. Unless you are using CAS for authentication.

Since Plone 2.5 uses PAS (Pluggable Authentication Service) which is very different than the way 2.1 handled authentication, it is necessary to do away with the old way to do CAS auth.

Begin rant ….

My single biggest complaint about the Plone/Zope community is that they think that backward compatibility is optional. They don’t think twice about breaking old code in favour of new features. Undoubtedly PAS is a superior way of handling authentication and permissions than the old way, but couldn’t they make it backwards compatible with the old way of doing things? It is ridiculous to think that every time I have to upgrade to a new version I have to spend a week or two feeling around the bugs and bases of the code and configuration just trying to make it work again. This paragraph is written out of anger and frustration…. bahh!!

End rant …

Okay, back to rational thinking. Here is what I have done so far:

  1. Downloaded CAS4PAS 1.0.0-1 and installed in the Products directory.
  2. Restarted Zope and added a CAS Helper to the acl_users folder of my Plone site - then copied the settings across.
  3. Downloaded and installed the new version of PloneCASLogin (because the old version doesn’t work with PAS).
  4. Went to a different browser and tried to log in using CAS.
  5. The login button takes me to the CAS login page OK, but when I return to the Plone site, it says that there was a sign-in failure.

At this point I went back to the download pages for the Plone CAS Login and CAS4PAS products to see if I was missing anything. I noticed a patch to make it compatible with CAS 2 (not sure if my server is CAS 1 or 2, but I thought I might as well download it anyway to see if that was the problem).

  1. Downloaded and ran the patch
  2. Restarted the server and tried to log in — same problem!
  3. Scoured the internet for information from others having the same problem. Found one helpful post here.
  4. It suggested that I try to disable the “Challenge” feature of the credentials_cookie_auth in acl_users.
  5. Same problem!

That is where I sit right now… No clues, so I’ll dig deeper into the code and see what I can find….
