Schema/Vocab Mapping toolkit

Olaf Hartig has pointed me to the R2R Framework:

[[

The R2R Framework enables Linked Data applications which discover data on the Web, that is represented using unknown terms, to search the Web for mappings and apply the discovered mappings to translate Web data to the application's target vocabulary. The R2R Framework is aimed to be used by Linked Data publishers, vocabulary maintainers and Linked Data application developers. It supports them by:

1. providing the R2R Mapping Language for publishing fine-grained term mappings on the Web

2. defining best-practices on how mappings can be discovered by Linked Data applications

3. providing an open-source implementation of the R2R Mapping Engine.

]]
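
To make the term-mapping idea concrete, here's a minimal sketch of translating data from an unknown vocabulary into an application's target vocabulary. This is not the R2R Mapping Language or the R2R engine's API: triples are plain Python tuples, and every URI, the MAPPINGS table and the translate function are made up for illustration.

```python
# Triples are plain (subject, predicate, object) tuples.
# All URIs below are hypothetical examples, not real vocabularies.
SOURCE = "http://example.org/source-vocab#"
TARGET = "http://example.org/target-vocab#"

# Hypothetical term mappings: source property -> target property.
MAPPINGS = {
    SOURCE + "fullName": TARGET + "name",
    SOURCE + "homepage": TARGET + "page",
}


def translate(triples, mappings):
    """Rewrite predicates that have a known mapping; pass other triples through unchanged."""
    return {(s, mappings.get(p, p), o) for (s, p, o) in triples}


data = {
    ("http://example.org/people/alice", SOURCE + "fullName", "Alice"),
    ("http://example.org/people/alice", SOURCE + "age", "42"),
}

translated = translate(data, MAPPINGS)
for triple in sorted(translated):
    print(triple)
```

A real mapping language has to deal with more than one-to-one property renames (value transformations, structural changes and so on), which is where something like R2R would come in.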


danja
2011-07-22T09:47:48+01:00
schema linkeddata r2r lod rdf vocab mapping

Linked Data One-Liner

A lot of information is merely On the Web when it would be more useful In the Web...


danja
2011-04-04T03:44:39+01:00
linkeddata rdf

Once more unto the breach (again)

For the first time in ages I've had a couple of days to sit down and look at code. A lot of it was stuff I hadn't finished, dating back a few years. The typical pattern was either getting distracted from the original aims and playing with the fun stuff, or aiming to do so much that I never really got past square one. So this time around I've changed my mind and decided to keep the fun stuff (playing with Agents in Scala) separate from the main app work.

The main app in mind here is the Semantic Web in a Box idea, which I'm back to thinking about in a more minimal form, informed a lot by what Rob wrote on his blog - What people find hard about Linked Data - and the stuff in the Talis tutorial. Basically what I'm after is a very easy-to-use Linked Data editor/visualization tool, with support for some kind of pluggability (TBD). There are existing tools which can do this sort of stuff, but the key here is to keep things as simple as possible (and free and open source). Target users are total beginners and experienced folks who want to be able to knock simple stuff together quickly. There's really not a lot to this, and 'wait long by the river and implementation of your plans will float by' usually works, but no-one really seems to have got around to this one.

It'll be a Java/Swing desktop app with the following features:

  • Internal triplestore(s)
  • RDF editor with various views and syntax validation
  • SPARQL editor and results viewer
  • HTTP client (for examining remote resources, crawling and publishing to remote stores/services)
  • HTTP server (for simulating live data)
  • HTTP proxy (for examining headers etc)
  • Basic HTML editor/viewer


What should also be possible is to run it headless, as a live service.

Probably more than half the people who read this are likely to have such parts living in their codebases - Java Swing components, Jena, ARQ, and Apache HTTP libs cover an awful lot. The tricky part is wiring them all up in a useful way, with a UI that doesn't confuse.

I've made a start on gathering together the bits, but I'm unlikely to get down to a good coding session for a while again, so what follows is really notes to self so I don't forget...

So, RDF editor.

Currently the main class is org.hyperdata.swing.rdftree.editor.RdfEditor

One view is a resource-centered thing, based on a JTree backed by a Jena Model. Like everything else here, it's unfinished and very buggy (notably there's something like an off-by-one error in which row expands). But this should give the general idea; the paths should expand indefinitely:

rdf tree table

Right now it only addresses the local model, but it should be reasonably straightforward to hook the HTTP client up to terminal node URIs to go and GET remote data (must check how Tabulator goes about that) and so extend the drop-down paths.
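
As a note to self on the shape of that tree, here's a rough sketch of the expansion logic on its own, away from Swing and Jena. It is not the code in RdfEditor: triples are plain tuples, the ex: and foaf: names are just example data, and index_by_subject, children, is_terminal and walk are hypothetical helpers. Children are computed on demand, so a cycle in the data (alice knows bob, bob knows alice) gives paths that keep expanding, and a terminal node is one with no local triples, which is where a remote GET would be hooked in.

```python
from collections import defaultdict

# Example data only; the prefixed names stand in for full URIs.
TRIPLES = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:alice"),
    ("ex:bob", "ex:livesIn", "ex:paris"),
]


def index_by_subject(triples):
    """Group (property, value) pairs under their subject."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[s].append((p, o))
    return index


def children(node, index):
    """Child rows for one tree node, computed only when the node is expanded."""
    return sorted(index.get(node, []))


def is_terminal(node, index):
    """No local triples about this node: a candidate for fetching remote data."""
    return node not in index


def walk(start, path, index):
    """Follow a list of properties from a start node, taking the first match at each step."""
    node = start
    for prop in path:
        node = next(o for p, o in children(node, index) if p == prop)
    return node


index = index_by_subject(TRIPLES)
print(children("ex:alice", index))
print(walk("ex:alice", ["foaf:knows", "foaf:knows", "foaf:knows"], index))
print(is_terminal("ex:paris", index))
```

In the Swing version the equivalent of children would sit behind the tree model, so nothing is loaded until a row is actually opened.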

Text views for Turtle and RDF/XML (with crude highlighting from JEditorPanes):

turtle editor

xml editor

I've only just started looking at a graph view (again!), separate from the stuff above. I just hacked at one of the JGraph demos, so there's a long way to go:

The launcher for that is org.hyperdata.swing.graph.danja.GraphEditor


graph view

I've stuck the code over here:

source, wiki etc.


danja
2010-11-21T18:47:57+01:00
swib linkeddata semweb rdf

Linked Data and Hype

[in reply to John Sowa on the cg@conceptualgraphs.org list, unfortunately the mail didn't get through - something up with the server]

I reckon the activities around Linked Data are somewhat different to the typical "Next Big Thing". I'd suggest the NBT here, if anything, is the Semantic Web, which has suffered from industry hype and as yet does not live up to the promises. However, Linked Data is essentially the same idea as the Semantic Web, but with more emphasis on the "Web" side and less on the "Semantic".

The central idea of treating the Web conceptually as one big (graph-shaped) database works fine (and the LOD cloud [1] is a notable concrete manifestation), but as you note, most applications do require fast access to relevant data. Some of the more recent RDF stores/SPARQL engines do have performance comparable to traditional RDBs, but I don't think this is entirely relevant to the core paradigm. The tendency in the past has been for the creation of data silos, where each company or organization has their own discrete database. Where data is exposed to the Web it has been in the form of human-readable documents. This makes for a huge impedance mismatch for anyone wishing to use computers to make use of multiple data sources.

Where data is exposed to the Web as linked data, the material is available for direct recombination and reuse by other parties. When the appropriate standards are used (primarily URIs for identification, RDF for structure and HTTP for transfer) the notion of a database takes on a different form: a triplestore is a (fast) cache of a little chunk of the global Web of data.

Let's say electricity providers and water providers have their own databases. A company wishing to know where to lay fibre-optic cables would probably want to know where the existing (and planned) wiring/piping lies. Right now that would typically mean they'd need fairly in-depth knowledge of the database schemas and local conventions used by the utility companies. But if the data is available in a consistent form (i.e. RDF) then the work of aligning the source data and extracting the information becomes that much easier. The utilities may still have their own idiosyncratic ways of describing their systems, but then again if they happen to use some common vocabularies (e.g. for geo-location) considerably less expert knowledge of the individual systems is needed to get started. The fibre-optics company could run selective queries (or run a crawler) over the utilities' Web-exposed data, and trivially merge the results in their own, local, performant store.
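
A minimal sketch of that scenario, under heavy assumptions: triples are plain Python tuples, the elec: and water: terms are invented stand-ins for each utility's own vocabulary, and geo:lat / geo:long abbreviate the latitude and longitude properties of a shared geo vocabulary. The located_within function is a hypothetical helper, not part of any library. The point is only that merging is a set union, and the query needs to know the shared terms but nothing about either utility's private schema.

```python
# Data as it might be exposed by two utilities (all values invented).
electricity = {
    ("elec:cable17", "elec:voltageKV", 11),
    ("elec:cable17", "geo:lat", 51.5),
    ("elec:cable17", "geo:long", -0.12),
}
water = {
    ("water:pipe3", "water:diameterMM", 300),
    ("water:pipe3", "geo:lat", 51.5),
    ("water:pipe3", "geo:long", -0.12),
    ("water:pipe9", "geo:lat", 53.4),
    ("water:pipe9", "geo:long", -2.2),
}

# Merging two sets of triples is just a set union.
merged = electricity | water


def located_within(triples, lat_range, long_range):
    """Find subjects inside a bounding box, using only the shared geo terms."""
    lats = {s: o for s, p, o in triples if p == "geo:lat"}
    longs = {s: o for s, p, o in triples if p == "geo:long"}
    return sorted(
        s
        for s in lats
        if s in longs
        and lat_range[0] <= lats[s] <= lat_range[1]
        and long_range[0] <= longs[s] <= long_range[1]
    )


nearby = located_within(merged, (51.0, 52.0), (-1.0, 0.0))
print(nearby)
```

In practice the merged data would live in a proper triplestore and the bounding-box lookup would be a SPARQL query, but the shape of the work is the same.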

The adoption of linked data has to some extent slipped under the radar of industry hype, a good example being http://data.gov.uk, which aims to take (non-personal) UK government data and expose it to the Web in a reusable form. The change in paradigm and increased potential for reuse is pretty apparent when you consider that a lot of the source data is held in Excel spreadsheets or buried in documents. This government-backed project has yielded a couple of surprises. On the one hand, there's the willingness of gov departments to hand over their data and help out (the material is technically publicly available already, but in practice that can be far from the case). On the other hand, developers have been fairly clamouring to get their hands on the data to build end-user applications.

(Incidentally, some of the data.gov.uk folks are working on the Linked Data API [2], which provides interfaces to triplestores that don't require any knowledge of RDF or SPARQL. Needing that knowledge has traditionally been something of a blocker.)


danja
2010-08-29T07:00:29+01:00
linkeddata semweb hype