Sunday, June 26, 2011

Git Wave

Git Wave is something I started working on whilst I had some time off between jobs, but at the time of writing it has not progressed very far.  I needed a project to learn Scala with and this was the idea I selected to put into the real world.  I may continue to work on it but time will tell how much energy I have for it.

The idea of Git Wave is to combine the shared conversation idea from Google Wave with the decentralised, distributed versioning functionality of Git.  So you can have a shared conversation that is not centrally stored on someone else's server.  Effectively everyone in the conversation has their own copy and other people's changes get merged in with your own changes.  So the conversation is like a source file which you collaborate on.

I'll admit I was a big fan of Wave and was disappointed when Google stopped working on it.  The concept of the shared conversation is a powerful one even if Google didn't completely nail it.  In my opinion they gave up too early.  They needed to provide good email integration (so you could at least partially include non-Wave users), platform notification integration (so you could tell when a Wave was modified or created), notification prioritisation and control (to avoid notification overload) and third-party Wave servers (so that companies could control their own data).

The Git Wave data would end up in publicly accessible Git repositories so that others could access the data when your computer is down (though a peer-to-peer mechanism would also be possible), so encryption, identity management and notification would be an important part of the implementation.  You make your changes to the conversation in your own Git repository and push them to a public Git repository (one that you are authorised to push to).  The data would be encrypted with a generated conversation key and then the conversation key would be encrypted with each recipient's public encryption key (which of course you need to have locally stored or be able to retrieve).  Each recipient would need to be notified in some way that a new conversation involving them is available in the public repository.  Each recipient would "pull" the changes into their own repository, make their own changes, push to their own public repository and notify the other recipients of the new repository that contains their updates.
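The key handling described above can be sketched roughly as follows. This is only an illustration of the envelope structure: the names `Envelope` and `seal` are made up for the example, and the actual ciphers are passed in as functions because no particular crypto library is assumed here.

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical signatures; a real client would back these with a vetted crypto library.
SymmetricEncrypt = Callable[[bytes, bytes], bytes]  # (conversation_key, plaintext) -> ciphertext
WrapKey = Callable[[bytes, bytes], bytes]           # (recipient_public_key, conversation_key) -> wrapped key


@dataclass
class Envelope:
    ciphertext: bytes               # conversation data, encrypted once
    wrapped_keys: Dict[str, bytes]  # recipient id -> conversation key encrypted for that recipient


def seal(plaintext: bytes,
         recipient_public_keys: Dict[str, bytes],
         encrypt: SymmetricEncrypt,
         wrap_key: WrapKey,
         key_bytes: int = 32) -> Envelope:
    """Encrypt the data with a fresh conversation key, then wrap that key per recipient."""
    conversation_key = secrets.token_bytes(key_bytes)
    ciphertext = encrypt(conversation_key, plaintext)
    wrapped = {
        recipient: wrap_key(public_key, conversation_key)
        for recipient, public_key in recipient_public_keys.items()
    }
    return Envelope(ciphertext=ciphertext, wrapped_keys=wrapped)
```

The point of the two layers is that the conversation itself is encrypted only once, and adding a recipient means adding one more small wrapped key rather than re-encrypting the data.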

At first glance it seems a bit over-engineered since there are effectively two Git repositories for each recipient (a private one and a public one).  The advantage is that each recipient is in control of their own copy of the conversation and can choose what they want to merge, and the public repositories are necessary to ensure everyone can access the conversation even when some of the participants have their computers turned off.  Also it is not too different from normal mail, where an email is copied to your ISP's mail server and then the recipients' mail server(s) before getting to their computers.

Git Wave would not scale well to large numbers of active recipients, but each recipient does not really have to monitor all other recipients' public repositories if they are willing to trust other recipients to do the merging for them, since all changes will eventually appear in all repositories.  For example the creator of the conversation could monitor each of the other recipients' changes and merge them into their own public repository, so that all the other recipients would just need to monitor the creator's public repository.
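To make the scaling difference concrete, here is a small sketch that builds the "who watches whose public repository" lists for both arrangements and counts them. The function names are invented for the illustration.

```python
from typing import Dict, List


def full_mesh(recipients: List[str]) -> Dict[str, List[str]]:
    """Every recipient watches every other recipient's public repository."""
    return {r: [other for other in recipients if other != r] for r in recipients}


def via_creator(recipients: List[str], creator: str) -> Dict[str, List[str]]:
    """The creator watches everyone else; everyone else watches only the creator."""
    if creator not in recipients:
        raise ValueError(f"{creator!r} is not one of the recipients")
    others = [r for r in recipients if r != creator]
    watch = {r: [creator] for r in others}
    watch[creator] = others
    return watch


def total_watched(watch: Dict[str, List[str]]) -> int:
    """Total number of (watcher, watched repository) pairs."""
    return sum(len(repos) for repos in watch.values())
```

With n recipients the full mesh needs n(n-1) watched repositories in total, while routing everything through the creator needs 2(n-1).  For ten recipients that is 90 versus 18, at the cost of trusting the creator to merge promptly.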

In a peer-to-peer approach the users would exchange details (like public key and notification method) but a more scalable approach would require identity servers.  This could be as simple as a REST API that responded to particular HTTP queries eg http://foo.com/username/publickey - it would be simple to allow username@foo.com to be a synonym for the REST query prefix to enable a more familiar style of user id.
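The mapping from the familiar user id to the REST query is simple enough to sketch.  The function name `public_key_url` is hypothetical, and the URL layout is just the one from the example above, not a defined API.

```python
def public_key_url(user_id: str, scheme: str = "http") -> str:
    """Turn 'username@foo.com' into 'http://foo.com/username/publickey'."""
    username, separator, host = user_id.partition("@")
    if not separator or not username or not host:
        raise ValueError(f"expected username@host, got {user_id!r}")
    return f"{scheme}://{host}/{username}/publickey"


print(public_key_url("username@foo.com"))
```

The same prefix (http://foo.com/username/) could serve other details, such as the preferred notification method, under different final path segments.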

Notification could be via email, Twitter or (my preference) a new generic notification service (which could logically be integrated with the identity service).  Again it could use a simple REST API and allow users to register how they want to be notified.  I would also like to see a generic notification agent that runs on the client platform, communicates with the server and receives the notifications.  The combination of notification service and notification agent would be useful in much larger contexts to consolidate and prioritise notifications from the many and varied sources that they currently come from.

There would be a Git repository per conversation, since with Git you have to clone the whole repository, so multiple conversations in a repository would not work very well.  A conversation would be made up of multiple files, with XHTML a likely candidate for the data format of the viewable part of the conversation.  Other files containing metadata like the encrypted session key would also be required.

The gadgets were quite good in Google Wave - the data for the gadget was usually stored as JSON in the Wave with code for the gadget being separate so something similar should be possible in Git Wave.

Google Wave also had Robots.  They would be doable but would work like automated humans.  There would need to be a remote robot service that would instantiate a robot with a given public key and a URL to the code to run.  Another possibility for robot-like behaviour is to have scripting and/or plugin behaviour, defined in the conversation or in the client, that executes before a change is submitted or merged.  This scripting could also reject other people's changes (ie refuse to merge them in) if they didn't conform to certain rules.  The thing I liked most about the robot idea in Google Wave was the potential for state-change-driven workflow to automate processes eg a leave application process.
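As a rough sketch of the scripting idea, a rule could be a function that looks at an incoming change and returns a reason to reject it, or nothing if the change is acceptable.  Everything here (`Change`, the two example rules, `rejection_reasons`) is hypothetical and only shows the shape such a hook might take.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Change:
    author: str
    path: str
    new_text: str


# A rule returns a rejection reason, or None if the change is acceptable.
Rule = Callable[[Change], Optional[str]]


def allowed_authors(authors: Set[str]) -> Rule:
    def rule(change: Change) -> Optional[str]:
        if change.author not in authors:
            return f"{change.author} is not a participant"
        return None
    return rule


def max_length(limit: int) -> Rule:
    def rule(change: Change) -> Optional[str]:
        if len(change.new_text) > limit:
            return f"{change.path} exceeds {limit} characters"
        return None
    return rule


def rejection_reasons(change: Change, rules: List[Rule]) -> List[str]:
    """Run every rule; an empty result means the change may be merged."""
    reasons = []
    for rule in rules:
        reason = rule(change)
        if reason is not None:
            reasons.append(reason)
    return reasons
```

A client would run `rejection_reasons` before merging someone else's branch and refuse the merge if the list is not empty.  A workflow such as a leave application could be expressed the same way, with rules that only allow particular state changes by particular people.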

I envision multiple clients.  Nowadays you have to have a web client - that would mean not keeping your conversation data on your own machine but many people seem happy with that.  Native clients would be good as well, and I started using Eclipse RCP as a cross-platform basis for the client, which can even be turned into a web app using Eclipse RAP.

The big caveat to Git Wave is getting the merging done properly.  Merging multiple changes from multiple people could get messy.  I imagined that every recipient would have their own branch and that each user could choose to merge others' changes in or keep them separate and view them separately.  Mostly you would want to just do an automated merge and hope for the best, but I could see the use of manual merging with merge tools (eg showing colour-coded differences) being a possibility.  Google Wave had its whole Operational Transformation functionality, which probably avoids some of the messier merge scenarios.
