I can answer some of your questions. Full disclosure: I'm the founder and project lead for ModeShape.
Briefly, ModeShape is a lightweight, embeddable, extensible open source JCR repository implementation that federates and unifies content from multiple systems, including file systems, databases, data grids, and other repositories. You can use the JCR API to access the information you already have, or use it like a conventional JCR system.
Here are some of the higher-level features of ModeShape:
- Supports all of the JCR 2.0 required features: repository acquisition; authentication; reading/navigating; query; export; node type discovery; permissions and capability checking
- Supports most of the JCR 2.0 optional features: writing; import; observation; workspace management; versioning; locking; node type management; same-name siblings; orderable child nodes; shareable nodes; and the mix:etag, mix:created, and mix:lastModified mixins with autocreated properties.
- Supports the JCR 1.0 and JCR 2.0 query languages (XPath, JCR-SQL, JCR-SQL2, and JCR-QOM) plus a full-text search language based upon the JCR-SQL2 full-text search expression grammar. Additionally, ModeShape supports some very useful extensions to JCR-SQL2:
  - subqueries in criteria
  - set operations (e.g., "UNION", "INTERSECT", "EXCEPT", each with optional "ALL" clause)
  - limits and offsets
  - duplicate removal (e.g., "SELECT DISTINCT")
  - additional depth, reference, and path criteria
  - set and range criteria (e.g., "IN", "NOT IN", and "BETWEEN")
  - arithmetic criteria (e.g., "SCORE(t1) + SCORE(t2)")
  - full outer joins and cross joins
  - and more
- Choose from multiple storage options, including RDBMSes (via Hibernate), data grids (e.g., Infinispan), file systems, or write your own storage connectors as needed.
- Use the JCR API to access information in existing services, file systems, and repositories. ModeShape connectors project the external information into a JCR repository, potentially federating the information from multiple systems into a single workspace. Write custom connectors to access other systems, too.
- Upload files and have ModeShape automatically parse them and derive structured information that represents what's in those files. This derived information is stored in the repository, where it can be queried and accessed just like any other content. ModeShape supports a number of file types out of the box, including: CND, XML, XSD, WSDL, DDL, CSV, ZIP/JAR/EAR/WAR, Java source, Java class files, Microsoft Office, image metadata, and Teiid models and VDBs. Writing sequencers for other file types is also very easy.
- Automated and extensible MIME type detection, with out-of-the-box detection using file extensions and content-based detection using Aperture.
- Extensible text extraction framework, with out-of-the-box support for Microsoft Office, PDF, HTML, plain text, and XML files using Tika.
- Simple clustering using JGroups.
- Embed ModeShape into your own application.
- RESTful API (requires deployment into an application server).
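To make the JCR-SQL2 extensions listed above a bit more concrete, here is a small sketch that collects a few example query strings. It only builds strings, so it runs without a repository. The node types and properties are standard JCR built-ins, but the user names ('alice', 'bob'), the dates, and the class itself are made up for illustration, and the exact grammar should be checked against the ModeShape documentation before relying on it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Illustrative JCR-SQL2 query strings that use ModeShape's extensions.
 * This class is a hypothetical helper, not part of ModeShape.
 */
class ExtendedQueryExamples {

    /** Returns example queries keyed by the extension each one demonstrates. */
    static Map<String, String> examples() {
        Map<String, String> q = new LinkedHashMap<>();

        // A subquery used as the right-hand side of an IN criterion.
        q.put("subquery",
            "SELECT * FROM [nt:file] WHERE [jcr:createdBy] IN "
            + "(SELECT [jcr:lastModifiedBy] FROM [mix:lastModified])");

        // Set operation combining two result sets; ALL keeps duplicates.
        q.put("set operation",
            "SELECT [jcr:created] FROM [nt:file] "
            + "UNION ALL SELECT [jcr:created] FROM [nt:folder]");

        // Paging through results: skip 20 rows, return the next 10.
        q.put("limit and offset",
            "SELECT * FROM [nt:file] ORDER BY [jcr:created] LIMIT 10 OFFSET 20");

        // Duplicate removal.
        q.put("duplicate removal",
            "SELECT DISTINCT [jcr:createdBy] FROM [nt:file]");

        // Set and range criteria (user names and dates are assumptions).
        q.put("set and range criteria",
            "SELECT * FROM [nt:file] WHERE [jcr:createdBy] IN ('alice', 'bob') "
            + "AND [jcr:created] BETWEEN CAST('2011-01-01T00:00:00.000Z' AS DATE) "
            + "AND CAST('2011-12-31T23:59:59.999Z' AS DATE)");
        return q;
    }

    public static void main(String[] args) {
        examples().forEach((name, sql) -> System.out.println(name + ":\n  " + sql));
    }
}
```

With a live session, each string would be passed to the standard JCR call `session.getWorkspace().getQueryManager().createQuery(sql, Query.JCR_SQL2)` and then executed.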
These are just some of the highlights. For details on these and other ModeShape features, please see the ModeShape documentation.
Now, here are some specific answers to your numbered questions:
ModeShape is hosted at JBoss.org and uses/integrates with other JBoss technology, because we thought it better to reuse the best-of-breed libraries. But ModeShape definitely is not tied to the JBoss Application Server. ModeShape can be used on other application servers in much the same way as other JCR implementations (typically embedded into a web application). Plus, ModeShape can be embedded into any application; it is, after all, just a regular Java library. It even uses SLF4J so that ModeShape log messages can be sent to the application's logging framework.
Now, having said that, we do make it easier to deploy ModeShape to a JBoss AS installation with a simple kit: unzip it, customize the configuration a bit (depending upon your needs), and start your app server. ModeShape will run as a service within the app server, allowing your deployed apps to look up, use, and share repositories. ModeShape can even be monitored using the JBoss AS console.
I believe you're referring to our plans to develop a repository visualization tool (much less than a full-fledged CMS). Work on that has just recently begun, and we'd welcome any insight, requests for functionality, and interest in collaborating with us. I know that Magnolia can be run on top of ModeShape, but I'm not sure whether other CMS apps are able to do this. The JBoss Enterprise Data Services (EDS) platform also includes ModeShape and uses it as a metadata repository. The JBoss Business Rules Management System can also use ModeShape as its JCR repository.
ModeShape and Jackrabbit both internally use Lucene for full-text search and querying. In that regard, they're pretty similar. Of course, ModeShape's implementation of search and query parsing and execution is different from Jackrabbit's, and was actually written by some of the same folks that implemented the MetaMatrix relationally-oriented integration & federation engine (now part of JBoss EDS). As a result, ModeShape has a separate parser for each of its query languages, but after that all validation, planning, and execution of queries is done in the same way, regardless of language. We're very proud of the capabilities and performance of our query engine!
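To illustrate the "one parser per language, one shared pipeline" idea, here is a toy sketch. Everything in it is hypothetical: the class names, the regex-based "parsers", and the string "plans" are stand-ins and are not ModeShape's actual internal classes. The only real values are the language names, which match the `Query.JCR_SQL2` and `Query.XPATH` constants from the JCR API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy sketch (not ModeShape code): parsers differ, later stages are shared. */
class QueryPipelineSketch {

    /** Language-neutral model that every parser produces (hypothetical). */
    static final class QueryModel {
        final String nodeType;
        QueryModel(String nodeType) { this.nodeType = nodeType; }
    }

    /** Each parser understands exactly one query language. */
    interface Parser {
        QueryModel parse(String expression);
    }

    /** Builds a toy parser that extracts the node type with a regex. */
    static Parser regexParser(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return expression -> {
            Matcher m = pattern.matcher(expression);
            if (!m.find()) {
                throw new IllegalArgumentException("Cannot parse: " + expression);
            }
            return new QueryModel(m.group(1).trim());
        };
    }

    private final Map<String, Parser> parsers = new HashMap<>();

    QueryPipelineSketch() {
        // Keys match the values of Query.JCR_SQL2 and Query.XPATH in the JCR API.
        parsers.put("JCR-SQL2", regexParser("(?i)FROM\\s+\\[([^\\]]+)\\]"));
        parsers.put("xpath", regexParser("//element\\(\\*,\\s*([^)]+)\\)"));
    }

    /** Picks a parser by language, then runs the language-independent stages. */
    String execute(String language, String expression) {
        Parser parser = parsers.get(language);
        if (parser == null) {
            throw new IllegalArgumentException("Unsupported language: " + language);
        }
        QueryModel model = parser.parse(expression);
        return run(plan(validate(model)));
    }

    private QueryModel validate(QueryModel model) {
        if (model.nodeType.isEmpty()) {
            throw new IllegalArgumentException("Missing node type");
        }
        return model;
    }

    private String plan(QueryModel model) {
        return "scan all nodes of type " + model.nodeType;
    }

    private String run(String plan) {
        return "executed: " + plan;
    }

    public static void main(String[] args) {
        QueryPipelineSketch engine = new QueryPipelineSketch();
        System.out.println(engine.execute("JCR-SQL2", "SELECT * FROM [nt:file]"));
        System.out.println(engine.execute("xpath", "//element(*, nt:file)"));
    }
}
```

In this sketch both calls in `main` end up with the same plan, because the two parsers produce the same model and everything after parsing is shared.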
ModeShape does not yet have a connector to other CMIS systems, but as you point out, that work is currently in progress (MODE-650). We'd also like to work with the Apache Chemistry team to make sure the JCR adapter works with ModeShape. We just haven't had the time to do so.
ModeShape does have a JcrTools utility class that may prove useful. But any utility class written on top of the JCR API should work just fine.
Hope that helps!