Everyone has a Graph Store

Try this thought experiment.

For practical purposes we often assume that everyone has a computer, a reasonable Internet connection and a modern Web browser. We know it's an inaccurate assumption, but it provides conceptual targets for technology in terms of people and environment.

Ok, now add to that list a Graph Store: a flexible database to which information can easily be added, and which can be easily queried. The data can also be easily shared over the Cloud. The data is available for any applications that might want to use it. The database is schemaless, agnostic about what you put in it: the data could be about contacts, descriptions of people & their relationships (i.e. a Social Graph), it could be about places or events, products, technical information, whatever. It can contain private information, it can contain information that you're happy to share. You control your own store and can let other people access as much or as little of its contents as you like (which they can do easily over the cloud). You can access other people's stores in the same way, according to their preferences. It's both a Personal Knowledgebase and a Federated Public Knowledgebase.
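To make "schemaless" a bit more concrete, here's a toy sketch in Java. It is not how any real triplestore works inside; the class name TinyGraphStore and the ex: identifiers are made up for illustration. Everything is a three-part statement, nothing is declared up front, and a query is just a pattern with wildcards.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory graph store: illustrative only, not a real triplestore.
final class TinyGraphStore {
    // A statement is just three strings; no schema is declared anywhere.
    static final class Triple {
        final String subject, predicate, object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return subject + " " + predicate + " " + object + " .";
        }
    }

    private final List<Triple> triples = new ArrayList<>();

    void add(String s, String p, String o) {
        triples.add(new Triple(s, p, o));
    }

    // Pattern match: null acts as a wildcard in any position.
    List<Triple> match(String s, String p, String o) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : triples) {
            if ((s == null || s.equals(t.subject))
                    && (p == null || p.equals(t.predicate))
                    && (o == null || o.equals(t.object))) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TinyGraphStore store = new TinyGraphStore();
        // Contacts, events, anything: the store doesn't care what the data is about.
        store.add("ex:alice", "foaf:knows", "ex:bob");
        store.add("ex:alice", "ex:attended", "ex:someConference");
        store.add("ex:bob", "foaf:name", "Bob");
        // Everything the store says about ex:alice.
        for (Triple t : store.match("ex:alice", null, null)) {
            System.out.println(t);
        }
    }
}
```

A real store adds indexing, persistence, named graphs and SPARQL on top, but the shape of the data stays this simple.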

So, make the assumption: everyone has a Graph Store. Now what do you want to do with yours? What can your friends and colleagues do with theirs? How can you use other people's information to improve your quality of life, and vice versa? What new tools can be developed to help them take advantage of their stores? How can you get rich quick on this? What other questions are there..?

Note that if everyone has a Graph Store, for free they automatically get the value-add of the linked data cloud.

Ok, I'm presenting this as a thought experiment, but we pretty much already have all the necessary tools and infrastructure for it to be reality. They aren't generally packaged up in a form that's user-friendly, but that part is becoming increasingly trivial (see below). If you want to run such a store on a local machine there are masses of alternatives - to pick the first three that come to mind there's 4Store, Fuseki and Stardog. If you have a server or other kind of cloudspace available then tools like these are an option there too. For an enterprise kind of environment you probably should look at OpenLink Virtuoso. If you want to leave everything to the cloud, there's Kasabi - note their free hosting option. (I can't remember offhand what other hosted cloud-based options are available, I'm pretty sure there are a few others but a quick search only yielded Dydra which is currently in private beta - please ping me if you know of others...or set up your own :)

The reason I'm prompted to post this now is because of a couple of projects I've had on the go for a while. One (Scute) is an attempt to make my hacking with RDF easier - it's essentially a glorified text editor with a bit of HTTP clientness built in. The other (Seki) was started as a demo more or less to show how a triplestore could be used as a general-purpose read/write Web server, supporting content as well as data. Neither of these is remotely mature enough for proper reuse (Scute has become bloaty/buggy and Seki doesn't do much yet, both are lacking tests and documentation, work in progress innit). But what I found interesting was that although they are approaching semweb tech from a very different direction, there's some definite convergence going on. That convergence is more or less around what I was calling the Semantic Web in a Box (SWIB) a few years ago (jeez, 2006 - tempus fuggits).

The thing is, although this Web stuff does evolve gradually over time, there are also developments that are in effect big steps forward. In the context of the Semantic Web there was the publication of the 2004 specs (solidifying the material that came before), the development of SPARQL (allowing loosely-coupled access to triplestores) and the perspective shift that the notion of linked data offers (bringing the Web back into the Semantic Web). That's not to mention the initiatives that have appeared outside semweb cognoscenti circles - things like schema.org.

I reckon SPARQL 1.1 is another big step. Yes, we already knew how to write to the Web with good old RESTful HTTP. But SPARQL Update, Graph Store Protocol etc. offer a standard, loosely-coupled way of writing to triplestores. Ok, a purist may point out that a lot of this stuff isn't RESTful, hence isn't truly Webby. But that doesn't matter - it completes the decoupling of the backend layer (arguably, paradoxically, disintermediating the layers) making it possible to commodify that layer and allow middleware to use generic interfaces, plugging in to any store at one end and potentially any client at the other.
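For a feel of what that standard write interface looks like on the wire, here's a minimal Java sketch (Java 11+, standard library only). The endpoint URLs are assumptions: they follow the sort of layout a local Fuseki install might expose, and every store documents its own. The class and method names are made up for illustration. The first request sends a SPARQL 1.1 Update directly in a POST body; the second uses the Graph Store Protocol to PUT a whole named graph.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class GraphStoreWrites {
    // Hypothetical local endpoints; real stores each document their own URLs.
    static final String UPDATE_ENDPOINT = "http://localhost:3030/ds/update";
    static final String GRAPH_STORE_ENDPOINT = "http://localhost:3030/ds/data";

    // SPARQL 1.1 Update, sent unencoded as the POST body.
    static HttpRequest sparqlUpdate(String endpoint, String update) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "application/sparql-update")
                .POST(HttpRequest.BodyPublishers.ofString(update))
                .build();
    }

    // Graph Store Protocol: PUT replaces the named graph with the payload.
    static HttpRequest putGraph(String endpoint, String graphUri, String turtle) {
        String query = "?graph=" + URLEncoder.encode(graphUri, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(endpoint + query))
                .header("Content-Type", "text/turtle")
                .PUT(HttpRequest.BodyPublishers.ofString(turtle))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest update = sparqlUpdate(UPDATE_ENDPOINT,
                "INSERT DATA { <http://example.org/me> "
                        + "<http://xmlns.com/foaf/0.1/name> \"Me\" }");
        HttpRequest put = putGraph(GRAPH_STORE_ENDPOINT,
                "http://example.org/graphs/contacts",
                "<http://example.org/me> <http://xmlns.com/foaf/0.1/name> \"Me\" .");
        // The requests are only built and printed here. Sending them needs a
        // running store, e.g.:
        // HttpClient.newHttpClient().send(update, HttpResponse.BodyHandlers.ofString());
        System.out.println(update.method() + " " + update.uri());
        System.out.println(put.method() + " " + put.uri());
    }
}
```

Nothing in there is specific to one vendor's store, which is the point: swap the endpoint URLs and the same client code should talk to any store that implements the specs.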

This means the SWIB idea just got a whole lot easier. All it needs to be at heart is a triplestore which supports read/write SPARQL. As noted above, these are already available. I do think the packaging could be improved, to totally minimise the installation effort. One click to download, one click to install, another click to run. A bit of shiny GUI is also desirable, not only to make things easier than the default HTML form for endpoint access but also to reduce the surprise to the end user. It should look a lot more like familiar tools - ideally including something general-purpose (think Microsoft Access) and one or two domain-specific apps (a FOAFish contacts/social net client is an obvious one; taking advantage of recent developments, a Rich Snippets aware bookmarking app might be nice). A little configuration tool would be good to have too, since not everyone is comfortable editing exotically-formatted text files.

Of course it would make me very happy if someone else put a SWIB together like this, dear lazyweb, as it'll probably take me another 6 years to get it together myself. But irrespective of what I say or do on the matter the personal/shared graph store is such a gaping niche that it's bound to happen in some form pretty soon anyway. Whatever, the current absence of "everyone has a graph store" is a conceptual block to imagining the possibilities. So try assuming this is already a done deal.

Comments to G+ please


danja
2012-02-26T15:02:58+01:00
swib federated semweb rdf

Once more unto the breach (again)

For the first time in ages I've had a couple of days to sit down and look at code. A lot of it was stuff I hadn't finished, dating back a few years. The typical pattern was either getting distracted from the original aims and playing with the fun stuff or aiming to do so much that I never really got past square one. So this time around I've changed my mind, decided to keep the fun stuff (playing with Agents in Scala) separate from the main app work.

The main app in mind here is the Semantic Web in a Box idea which I'm back to thinking about in a more minimal form, informed a lot by what Rob wrote on his blog - What people find hard about Linked Data - and the stuff in the Talis tutorial. Basically what I'm after is a very easy-to-use Linked Data editor/visualization tool, with support for some kind of pluggability (TBD). There are existing tools which can do this sort of stuff, but the key here is to keep things as simple as possible (and free and open source). Target users are total beginners and experienced folks that want to be able to knock simple stuff together quickly. There's really not a lot to this, and 'wait long by the river and implementation of your plans will float by' usually works, but no-one really seems to have got around to this thing.

It'll be a Java/Swing desktop app with the following features:

  • Internal triplestore(s)
  • RDF editor with various views and syntax validation
  • SPARQL editor and results viewer
  • HTTP client (for examining remote resources, crawling and publishing to remote stores/services)
  • HTTP server (for simulating live data)
  • HTTP proxy (for examining headers etc)
  • Basic HTML editor/viewer


What should also be possible is to run it headless, as a live service.

Probably more than half the people that read this are likely to have such parts living in their codebases - Java Swing components, Jena, ARQ, and Apache HTTP libs cover an awful lot. The tricky part is wiring them all up in a useful way, with a UI that doesn't confuse.

I've made a start on gathering together the bits, but I'm unlikely to get down to a good coding session for a while again, so what follows is really notes to self so I don't forget...

So, RDF editor.

Currently the main class is org.hyperdata.swing.rdftree.editor.RdfEditor

One view is a resource-centered thing, based on a JTree backed by a Jena Model. Like everything else here, it's unfinished and very buggy (notably there's something like an out-by-one error on which row expands). But this should give the general idea; the paths should expand indefinitely:

[Image: rdf tree table]
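The resource-centered view boils down to something like the following sketch. To be clear, this is not the RdfEditor code: the class name ResourceTreeSketch is made up, a plain list of string triples stands in for the Jena Model, and the tree is built eagerly to a fixed depth, where a real JTree would more likely expand nodes lazily. The depth cap is there because graphs can loop back on themselves, so paths really would expand indefinitely.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import javax.swing.tree.DefaultMutableTreeNode;

final class ResourceTreeSketch {
    // Each statement is {subject, predicate, object}; a stand-in for a Jena Model.
    static DefaultMutableTreeNode build(List<String[]> triples, String resource, int depth) {
        DefaultMutableTreeNode node = new DefaultMutableTreeNode(resource);
        if (depth == 0) {
            return node; // graphs may be cyclic, so cap the expansion
        }
        for (String[] t : triples) {
            if (t[0].equals(resource)) {
                // resource -> property -> value, with the value expanded in turn
                DefaultMutableTreeNode property = new DefaultMutableTreeNode(t[1]);
                property.add(build(triples, t[2], depth - 1));
                node.add(property);
            }
        }
        return node;
    }

    public static void main(String[] args) {
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] {"ex:alice", "foaf:knows", "ex:bob"});
        triples.add(new String[] {"ex:bob", "foaf:knows", "ex:alice"}); // a cycle
        triples.add(new String[] {"ex:bob", "foaf:name", "\"Bob\""});
        DefaultMutableTreeNode root = build(triples, "ex:alice", 3);
        // Print the tree indented by level; new JTree(root) could display the same thing.
        Enumeration<?> nodes = root.preorderEnumeration();
        while (nodes.hasMoreElements()) {
            DefaultMutableTreeNode n = (DefaultMutableTreeNode) nodes.nextElement();
            System.out.println("  ".repeat(n.getLevel()) + n.getUserObject());
        }
    }
}
```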

Right now it's only addressing the local model, but it should be reasonably straightforward to hook the HTTP client up to terminal node URIs to go and GET remote data (must check how Tabulator goes about that) and extend the drop-down paths.
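As a note on what that GET might look like, here's a sketch using the HTTP client in the Java 11+ standard library rather than the Apache libs. It isn't taken from the codebase and makes no claim about how Tabulator does it; the class name, the Accept preferences and the example URI are all assumptions. The idea is plain content negotiation: ask for an RDF serialization of whatever the terminal node's URI identifies, following redirects.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class RemoteFetchSketch {
    // Prefer Turtle, accept RDF/XML; this preference order is an assumption.
    static final String ACCEPT = "text/turtle, application/rdf+xml;q=0.9";

    // Build the GET for the document that describes the given resource.
    static HttpRequest dereference(String resourceUri) {
        // A fragment (#me) names something within the document, so only the
        // document URI itself is requested.
        int hash = resourceUri.indexOf('#');
        String doc = hash < 0 ? resourceUri : resourceUri.substring(0, hash);
        return HttpRequest.newBuilder(URI.create(doc))
                .header("Accept", ACCEPT)
                .GET()
                .build();
    }

    // Send the request, following redirects (linked data servers often use 303s).
    static String fetch(String resourceUri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        return client.send(dereference(resourceUri), HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) {
        // Only builds and prints the request; call fetch() to go over the network.
        HttpRequest request = dereference("http://example.org/people/alice#me");
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Accept: " + request.headers().firstValue("Accept").orElse(""));
    }
}
```

The returned text would then need parsing into the local model before the tree could grow new branches from it.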

Text views for Turtle and RDF/XML (with crude highlighting from JEditorPanes):

[Image: turtle editor]

[Image: xml editor]

I've only just started looking at a graph view (again!), separate from the stuff above - I just hacked at one of the JGraph demos, long way to go:

The launcher for that is org.hyperdata.swing.graph.danja.GraphEditor


[Image: graph view]

I've stuck the code over here:

source, wiki etc.


danja
2010-11-21T18:47:57+01:00
swib linkeddata semweb rdf