On Developing a Global Jedi Archive Format

THIS POST NO LONGER REFLECTS MY CURRENT VIEWS: It should be noted that my views have evolved since this blog first began. A lot of posts reflect views that I held at the time that I made them, but no longer hold.

Of course, such a disclaimer will not appear on every post that reflects views I have since revised – but for some posts (such as this one) I feel it is especially important to make the point clear. I find inserting such notices to be more intellectually honest than simply deleting the post.

This document about a Global Jedi Archive Format is meant to let the Jedi community know what I am doing, and also to call forth anyone who would like to participate in this project.

Some time ago, I began trying to come up with a format that a Jedi archive could use for storing information. It would be a format that could encode any kind of document or extradocumental information that might belong in the Jedi archives. When I began, though, I had no idea just how ambitious a project I was embarking on.

As I write this document about the development of a Global Jedi Archive Format, my purpose is two-fold. I want to let the Jedi community know what I am doing and how it is going. But I also do so with the intent of calling forth anyone who would like to participate in this project.

At first, I thought my project would be simple. I had the Perl scripting language with a built-in port of the Expat library for parsing XML files. I had a few more infrastructural tools that I had originally written for other purposes but that could very easily be deployed for this one. All I really had to do was figure out what kinds of information might need to go into a document of a Jedi archive, and how such information would need to be organized. Then, all I needed to do was come up with an Expat-compatible XML-based format that is good at organizing information in such a manner – and use the aforementioned tools to write parsers that convert documents in the format I designed into the necessary pre-established standard formats.

As time went on, though, I realized that there were two factors that would make the project more massive than I had imagined at first. The first factor was the sheer diversity in the kinds of information that might need to be stored in Jedi archives – not all of which could follow a common organizational scheme. The second factor was the sheer amount of information that might need to go into the Jedi archives – far more than any single Jedi or Jedi organization could possibly maintain alone.

As for the diversity in the types of information that any Jedi archives would contain, the initial plan was to come up with a single document format flexible enough to handle the different types of data elements found in the various kinds of documents, without becoming so overwhelming as to be impossible for a human being to realistically learn, let alone edit. This quickly proved to be an unrealistic goal – the various types of documents simply did not have enough common ground, as far as data elements are concerned, for this to be possible.

This problem can be solved by devising not one, but several document formats for the different kinds of documents in the archives. All of these formats would be united into a common ecosystem by yet another format with which to index the various documents along with all their basic information – including what format the documents are in.
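To make the idea a little more concrete, here is a minimal sketch of an index entry that records which format a document is in, so that the right parser can be chosen for it. It is written in Python purely for illustration (not the Perl and Expat tooling mentioned above), and every name in it (`IndexEntry`, the `essay` and `glossary` formats, the field names) is a hypothetical placeholder rather than part of any actual design.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class IndexEntry:
    """One record in the (hypothetical) index format."""
    doc_id: str       # assumed identifier for the document
    title: str
    doc_format: str   # which of the several document formats this one uses
    location: str     # where the document itself is stored


def parse_essay(text: str) -> dict:
    # Assumed toy format: paragraphs separated by blank lines.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return {"kind": "essay", "paragraphs": paragraphs}


def parse_glossary(text: str) -> dict:
    # Assumed toy format: one "term: definition" pair per line.
    terms = {}
    for line in text.splitlines():
        if ":" in line:
            term, definition = line.split(":", 1)
            terms[term.strip()] = definition.strip()
    return {"kind": "glossary", "terms": terms}


# The index only names the format; this table maps each name to its parser.
PARSERS: Dict[str, Callable[[str], dict]] = {
    "essay": parse_essay,
    "glossary": parse_glossary,
}


def load(entry: IndexEntry, raw_text: str) -> dict:
    """Parse raw_text with whichever parser the index entry's format calls for."""
    try:
        parser = PARSERS[entry.doc_format]
    except KeyError:
        raise ValueError(f"no parser registered for format {entry.doc_format!r}")
    return parser(raw_text)


entry = IndexEntry("doc-1", "Common Terms", "glossary", "terms.txt")
print(load(entry, "Archive: a store of documents\nCurator: a keeper of a section"))
```

The point of the sketch is only that adding a new kind of document means adding a new format and a new parser, while the index itself stays the same.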

As for this indexing system – one more factor that would place quite a demand on it is the amount of information that might need to go into the Jedi archives. This volume of information is likely to end up being far more than any single Jedi or Jedi organization can curate alone. For this reason, the Jedi archives cannot be all one entity's responsibility, or even be stored in one place. Rather than the whole curation being done by one Jedi organization, there would need to be an organization whose mission would be, or would include, the most centralized tasks of coordinating the Jedi archives, farming parts of the curation out to the various Jedi organizations that elect to participate in the Global Jedi Archive network. Furthermore, the central archiving organization might not work directly with the curators of all the sections of the archives. Rather, it might farm out large chunks of the archive to a few other organizations, which in turn subdivide their wards of the Global Archives into smaller chunks to be farmed out, and so forth.
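The chain of delegation described above can be pictured as a tree. The following Python sketch is only an illustration under my own assumptions: the `Curator` class, the ward names, and the organization names are all hypothetical, and a real system would need far more than this.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Curator:
    """An organization that curates a ward and may farm parts of it out."""
    name: str
    delegated: Dict[str, "Curator"] = field(default_factory=dict)

    def delegate(self, ward: str, to: "Curator") -> "Curator":
        # Hand one named chunk of this curator's ward to another organization.
        self.delegated[ward] = to
        return to

    def responsible_chain(self, ward_path: List[str]) -> List[str]:
        """Follow the delegations along ward_path, from the top down.

        The last name in the returned list is the organization that
        directly curates that part of the archives.
        """
        chain = [self.name]
        node = self
        for ward in ward_path:
            nxt = node.delegated.get(ward)
            if nxt is None:
                break  # not farmed out any further; node curates it directly
            chain.append(nxt.name)
            node = nxt
        return chain


central = Curator("Central Archive Organization")
org_a = central.delegate("ward-a", Curator("Organization A"))
org_a.delegate("ward-a1", Curator("Organization A1"))

print(central.responsible_chain(["ward-a", "ward-a1"]))
# ['Central Archive Organization', 'Organization A', 'Organization A1']
```

Note that the central organization in this sketch never needs to know about Organization A1 at all; it only knows whom it handed `ward-a` to.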

Also – it probably would not be practical to insist that the informational categories of the archives be in any way tied to the curatorial sections of the archives. This is because, at the bottom level, each organization should have direct control over the curation of whatever documents it is most qualified to curate – and those documents may span various informational categories.

Also, each Jedi organization may have in its archives both its own private material, to be kept within the organization, and public material to be integrated into the global Jedi archives. In between these two categories, some organizations might also have material that is not private, in the sense that anyone is welcome to access it should they choose to do so, but which is unlikely to be relevant to anyone outside that particular organization. It would need to be possible to store all these types of information together, in a manner that allows them to reference each other as appropriate – but which also allows the public, global material to be just as seamlessly integrated into the Global Jedi Archives.
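As a rough illustration of these three tiers, here is a small Python sketch. The tier names (`PRIVATE`, `LOCAL`, `GLOBAL`) and the two rules below are my own assumptions for the sake of the example, not a settled design.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Visibility(Enum):
    PRIVATE = "private"  # kept within the organization
    LOCAL = "local"      # open to anyone, but not pushed into the global archives
    GLOBAL = "global"    # integrated into the Global Jedi Archives


@dataclass(frozen=True)
class Document:
    doc_id: str
    visibility: Visibility


def can_read(doc: Document, is_member: bool) -> bool:
    # Members of the organization see everything; outsiders see all but private material.
    return is_member or doc.visibility is not Visibility.PRIVATE


def global_export(docs: List[Document]) -> List[Document]:
    # Only the global tier is offered to the wider archive network.
    return [d for d in docs if d.visibility is Visibility.GLOBAL]


holdings = [
    Document("minutes-01", Visibility.PRIVATE),
    Document("house-rules", Visibility.LOCAL),
    Document("essay-on-patience", Visibility.GLOBAL),
]

print([d.doc_id for d in holdings if can_read(d, is_member=False)])
# ['house-rules', 'essay-on-patience']
print([d.doc_id for d in global_export(holdings)])
# ['essay-on-patience']
```

All three tiers live in the same list here, which is the property that matters: an organization keeps one archive, and the tier marking decides what is shared outward.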

The indexing system for the archives would need to accommodate all of these needs of a decentralized system while keeping the Jedi archives robust – that is to say, not easily broken.

This robustness would need to include a number of things. For one thing, one document would need a way of referencing another without the reference breaking should the section of the archives containing the referenced document be moved from one place on the Internet to another. It would also be beneficial to include some system of redundant storage for all material that is integrated into the global Jedi archive network – so that if something unfortunate happens to one part of the network, the result will not be a breakdown that affects everyone.
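One way to picture both requirements is a reference that names a section by a stable identifier rather than by its address, with a separate registry recording where copies of that section currently live. The Python sketch below is only an illustration under that assumption; the `Reference` and `SectionRegistry` names, the section identifier, and the example.org addresses are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Reference(NamedTuple):
    """A link from one document to another that contains no address."""
    section_id: str  # assumed stable name for a section of the archives
    doc_name: str    # the document's name within that section


class SectionRegistry:
    """Records where each section currently lives, including any mirrors."""

    def __init__(self) -> None:
        self._locations: Dict[str, List[str]] = {}

    def set_locations(self, section_id: str, base_urls: List[str]) -> None:
        # Replacing this list is all that is needed when a section moves.
        self._locations[section_id] = [u.rstrip("/") for u in base_urls]

    def resolve(self, ref: Reference) -> List[str]:
        # Every known copy is returned, so a reader can fall back to a mirror.
        bases = self._locations.get(ref.section_id, [])
        return [f"{base}/{ref.doc_name}" for base in bases]


registry = SectionRegistry()
registry.set_locations("meditations", [
    "https://archive-a.example.org/meditations/",
    "https://mirror-b.example.org/meditations/",
])

ref = Reference("meditations", "on-stillness.xml")
print(registry.resolve(ref))

# The section moves to a new host; the reference itself is untouched.
registry.set_locations("meditations", ["https://new-home.example.org/med/"])
print(registry.resolve(ref))
# ['https://new-home.example.org/med/on-stillness.xml']
```

The sketch leaves open where the registry itself would live and how mirrors would be kept in sync, which are exactly the harder parts of the problem.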
