Introducing dork

That's Descriptions of Runtime Klasses. Some simple Java for getting RDF out of code trees.

The RDF can be used to generate class diagrams, like this:

class tree

An interesting aspect of the Web Beep project is processor pipelines. To optimize things I needed to play with parameters easily, so wound up building a system interface covering the processors and pipelines. As it stands in the source now, the configuration is set up from Java structures. But to see what the configuration is, a recursive toString() on the Java structures yields a fairly structured text description of the configuration (there's an example on the How It Works page).

This led me to think that if such descriptions could be used to describe existing configurations, they could also be used to set up those configurations. The format's ad hoc, so first it made sense to look at using something standard. The processor pipelines are essentially graphs (with annotations) so RDF was naturally the hammer I chose. The general processors/pipelines model is encoded (better word?) in the Java class structure, so if I could get that in RDF it'd be a good start. It's general-purpose stuff so I've split it off as a separate project on GitHub and given it a silly name.

This kind of thing's been done before, in fact I'm hoping to incorporate David Huynh's doclet (for use with Javadoc to generate RDF) as well in the near future. But that approach gets its data 'statically' from the source, whereas the parameters at runtime are important for Web Beep's processors etc. I've made a start on the write-up with the code (ermm, Javadoc's todo :), but one key thing is just using a describe() method in the kind of places you might use a toString(). It should return a snippet of Turtle-syntax RDF describing the object in which it appears. I've also made a start on some easy-to-use utility methods that use reflection to extract a description of objects which doesn't rely on them having a describe() method, bit of a lighter touch.
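To make the describe() idea concrete, here's a minimal sketch rather than the actual dork code. The Describable interface, the Gain processor, the Describer helper and the ex: prefix are all made up for illustration; the reflection fallback simply emits every instance field as a plain string literal and does no escaping.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical interface: anything that can describe itself as Turtle. */
interface Describable {
    String describe();
}

/** Hypothetical processor with a single runtime parameter. */
class Gain implements Describable {
    private final double level;

    Gain(double level) {
        this.level = level;
    }

    @Override
    public String describe() {
        // ex: is an invented prefix; the caller is assumed to declare it.
        return "[ a ex:Gain ; ex:level " + level + " ] .";
    }
}

/** Reflection-based fallback for objects that have no describe() method. */
class Describer {
    static String describe(Object o) {
        StringBuilder sb = new StringBuilder("[ a ex:" + o.getClass().getSimpleName());
        for (Field f : o.getClass().getDeclaredFields()) {
            // Skip statics and compiler-generated fields; keep instance state only.
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;
            }
            try {
                f.setAccessible(true);
                sb.append(" ; ex:").append(f.getName())
                  .append(" \"").append(f.get(o)).append('"');
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read field " + f.getName(), e);
            }
        }
        return sb.append(" ] .").toString();
    }
}
```

The hand-written describe() can pick sensible datatypes and property names, while the reflection version trades that precision for not having to touch the class at all.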

As a sanity check on the generated RDF I made a (pretty trivial) SPARQL query with XSLT transform to GraphViz dot format, the result of which can be used (with straightforward command-line tools) to generate images like the one above. [I remembered half way through that Redland's rapper utility can output dot format, but that's RDFy (see screenshot) and I'm after something much more app-specific.] There's a little script which shows how the image was arrived at.


danja
2012-01-19T22:12:46+01:00
java dork dot diagrams class sparql rdf

Introducing JEdwards

JEdwards is a little sub-project I've just been putting together in Java. Screenshot.

It's so named for two reasons:

  1. it's roughly a contraction of "towards a Javascript editor"
  2. it's something you probably want to ignore (like twincest :)

Having said that, it does have a couple of features that may be of interest to sane developers:

  1. a Java terminal emulator (bash shell)
  2. syntax highlighting for SPARQL/Turtle

Neither is entirely finished, but both are usable/reusable (Apache 2 license, or somesuch).

evil jedward

I've been using Eclipse for most of my dev stuff for years now. When I was doing things in Node.js I wound up configuring it to have a file explorer pane, a text editor pane (for Javascript, HTML, Turtle or SPARQL) and three terminal panes all connected to the local shell. Eclipse was basically a (slow) sledgehammer to crack a nut. I did spend a while looking for a way of setting these things up using separate apps, but was beaten by the problem of pinning the windows to the workspace. I believe it should be possible using Devil's Pie or similar, but I had no joy. But as it happened I wanted a terminal emulator in Java anyhow and had played with syntax highlighting before.

In Scute I'd put together some basic highlighting for Turtle, except when I came to look at it again it was a bit too hardcoded to reuse, and Javascript is quite complicated... Looking around I came across jsyntaxpane, which is a pluggable highlighter which takes its config from a JFlex lexer. It'd got the necessary for Javascript, so I decided to use that instead of my hacky code. I found a SPARQL/Flex file on the Web that someone had prepared for IntelliJ IDEA which, although geared to do other things, saved me a bit of time writing out the SPARQL patterns. Here's sparql.flex.

For the terminal emulator I started with the JConsole UI from BeanShell, to which I've added the bits which talk to the bash shell. It works ok on this Ubuntu machine, but I've no idea what would be needed to set it up for a different OS. The source for that is here.
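As a rough illustration of the "talking to the shell" part, here's a minimal sketch using the standard ProcessBuilder API. It isn't the JEdwards code: the ShellRunner class is hypothetical, and it runs one command line and collects the output, whereas a real terminal pane would keep a shell process open and stream to and from it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: run one command line through a shell, return its output. */
class ShellRunner {
    static String run(String shell, String commandLine) {
        ProcessBuilder pb = new ProcessBuilder(shell, "-c", commandLine);
        pb.redirectErrorStream(true); // merge stderr into stdout, as a terminal shows both
        try {
            Process p = pb.start();
            StringBuilder out = new StringBuilder();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = r.readLine()) != null) {
                    out.append(line).append('\n');
                }
            }
            p.waitFor(); // output is fully read, so this only waits for the exit
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the shell", e);
        }
    }
}
```

Something like ShellRunner.run("bash", "ls -l") would be the typical call; the shell name is passed in because which shell is available is exactly the OS-dependent part.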

I started Scute, a desktop RDF toolkit, just over a year ago. I did get some bits working fairly well - I was using the SPARQL bits for real - but then I got distracted and left it largely unusable... This JEdwards bit of coding has got me back into it, and tightened up how I was thinking about the dev process. I must write this up properly. The main idea is, while it should be built from reusable components, the way it's set up as a whole will be optimized for how I want to work. Somewhat inspired by woodcarving, where a lot of the time what's best isn't a general purpose tool (wood router or software IDE) but a highly focused tool (1/4" No.4 fishtail gouge or JEdwards). If the resulting code is useful for other people, great, but the motivation isn't to create a product, just to help my own personal workflow. Horse before cart dogfood.

The reusable components part comes from testing. I'm lazy about tests at the best of times, and Scute is all about GUI so is a bit tricky to test. But I reckon component-level functional tests make a fair substitute for unit tests. Anyhow, more about this another day.


danja
2012-01-18T19:12:14+01:00
scute terminal emulator jedwards sparql turtle syntax highlighter rdf

Hixie's Furniture

Too long; read later - here's a demo : SPARQL Sliders Test

+Ian Hickson posted a lovely semweb use case:

"I'd like a search tool for furniture that works like Google's Flight Search does for flights. That is, with sliders so I can say what type of furniture (table), what range of widths (1-2m), lengths (2-5m), and heights (1-2m), what material (wood), what thickness, what price range, etc, I'd like, with the list of available products updating in real time."

As it happens I wanted a slider thingy ages ago, so this was a good prompt to make a demo of the front end part which takes the values from slider components and uses them in a SPARQL query.

For convenience/lack of available data the demo runs against DBpedia via the SNORQL SPARQL Explorer. As data on furniture and its dimensions wasn't available, it uses cities and their populations and elevations.
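The slider-to-query step boils down to filling a template. A minimal sketch of that, with assumptions flagged: the SliderQuery class is hypothetical, and the dbo:City, dbo:populationTotal and dbo:elevation terms are my guess at suitable DBpedia ontology terms, not necessarily what the demo's query uses.

```java
/** Hypothetical helper: turn four slider values into a SPARQL query string. */
class SliderQuery {
    static String build(long minPopulation, long maxPopulation,
                        double minElevation, double maxElevation) {
        // Numeric parameters only, so nothing user-typed is pasted into the query text.
        return String.join("\n",
            "PREFIX dbo: <http://dbpedia.org/ontology/>",
            "SELECT ?city ?population ?elevation WHERE {",
            "  ?city a dbo:City ;",
            "        dbo:populationTotal ?population ;",
            "        dbo:elevation ?elevation .",
            "  FILTER (?population >= " + minPopulation
                + " && ?population <= " + maxPopulation + ")",
            "  FILTER (?elevation >= " + minElevation
                + " && ?elevation <= " + maxElevation + ")",
            "}",
            "LIMIT 50");
    }
}
```

Each time a slider moves, the front end would rebuild the string and send it to the endpoint; swapping cities for furniture is then a matter of changing the template, not the plumbing.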

So how would you get real data?

First of all, furniture vendors could either provide dumps of their data or, more Webby, mark up their sites with RDFa and/or HTML5 microdata using e.g. the GoodRelations e-commerce vocabulary.

Ultimately, for a front end like these sliders to work, the data would need to go in a store with a SPARQL endpoint. But, triplestores shouldn't be thought of as just a wacky alternative to a SQL database. A triplestore is just a cache of a little chunk of the Linked Data Web. The question of where the store resides and how the data is collected is entirely open. Following the more traditional DB model, a service might aggregate the data published by known furniture suppliers and provide the endpoint online.

But alternately, a local user agent (I think Chris Bizer had a little Java example, can't find the link...there are others) could crawl the Web to answer the query just-in-time. The advantage of this approach is that it's more thorough and the only real option for totally arbitrary queries, the downside being that its answer will probably take longer than milliseconds. But remember triplestores are caches: not every little bit of information would have to be discovered and read from every page. There are vocabs for dataset and vocab discovery (remind me of the acronyms please :) Note too that you're not limiting your client agent to a single datastore. Traditional backends (SQL or NoSQL) are effectively isolated silos; triplestores are integrated with the links of the Web.

Incidentally, this is something that might be nice to express as a Web Intent, along the lines of "make me a query from this template with these parameters and apply it to this endpoint, putting the results into this widget" (that's a bit verbose for a general-purpose intent, but you get the gist). Cf. RDFAffordances.




danja
2012-01-11T15:01:56+01:00
sparql demo goodrelations rdf hixie furniture

Another SPARQL solution

Bravo! A solution to the latest SPARQL puzzle.

@glenn_mcdonald found a way of getting the non-Roman-god solar system bodies:

PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn:    <http://www.w3.org/2006/03/wn/wn20/schema/>
PREFIX id:    <http://wordnet.rkbexplorer.com/id/>

SELECT DISTINCT ?planet WHERE {
  ?s1 wn:memberMeronymOf id:synset-solar_system-noun-1 .
  ?s1 rdfs:label ?planet .
  OPTIONAL {
    ?s1 wn:containsWordSense ?ws1 .
    ?ws1 wn:word ?w .
    ?ws2 wn:word ?w .
    ?s2 wn:containsWordSense ?ws2 .
    ?s2 wn:hyponymOf id:synset-Roman_deity-noun-1 .
  }
  FILTER (!bound(?s2))
}

Isolating just the planets looks to be out of reach using the WordNet endpoint alone, but I guess that can be left as a challenge for federated query e.g. CONSTRUCTs from different datasets into a local store before SELECTing.

Update

From RobVesse -

Here's an even simpler query for yesterdays puzzle - still doesn't isolate real planets though

PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn:  <http://www.w3.org/2006/03/wn/wn20/schema/>

SELECT DISTINCT ?label WHERE 
{
 ?s1 wn:memberMeronymOf <http://wordnet.rkbexplorer.com/id/synset-solar_system-noun-1> .
 ?s1 rdfs:label ?label.
 OPTIONAL
 {
  ?s2 wn:hyponymOf <http://wordnet.rkbexplorer.com/id/synset-Roman_deity-noun-1> .
  ?s2 rdfs:label ?label.
 }
 FILTER (!BOUND(?s2))
}

...plus...

Here's a soln using wordnet and dbpedia to show only planets not named after roman gods, requires a SPARQL 1.1 engine to run

PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn:  <http://www.w3.org/2006/03/wn/wn20/schema/>

SELECT DISTINCT ?label WHERE 
{
 SERVICE <http://wordnet.rkbexplorer.com/sparql/>
 {
   ?s1 wn:memberMeronymOf <http://wordnet.rkbexplorer.com/id/synset-solar_system-noun-1> .
   ?s1 rdfs:label ?label.
 }
 MINUS
 {
  SERVICE <http://wordnet.rkbexplorer.com/sparql/>
  {
    ?s2 wn:hyponymOf <http://wordnet.rkbexplorer.com/id/synset-Roman_deity-noun-1> .
    ?s2 rdfs:label ?label.
  }
 }
 BIND(URI(CONCAT("http://dbpedia.org/resource/", ?label)) AS ?dbpResource)

Here's a suitable engine: Leviathan (a demo of the SPARQL Engine used in dotNetRDF).


danja
2011-04-12T20:19:30+01:00
sparql puzzle rdf

Another SPARQL puzzle

Using the WordNet endpoint at http://wordnet.rkbexplorer.com/sparql/ I can get the names of the solar system bodies that are named after Roman gods with :

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn:   <http://www.w3.org/2006/03/wn/wn20/schema/>

SELECT DISTINCT ?label WHERE {
  ?s1 wn:memberMeronymOf <http://wordnet.rkbexplorer.com/id/synset-solar_system-noun-1> .
  ?s1 rdfs:label ?label .
  ?s2 wn:hyponymOf <http://wordnet.rkbexplorer.com/id/synset-Roman_deity-noun-1> .
  ?s2 rdfs:label ?label .
}

The challenge is to get the names of the solar system bodies that aren't named after Roman gods. (Ideally I'd like planets in the solar system... rather than ...bodies, but I can't see a suitable class).


danja
2011-04-11T20:42:00+01:00
sparql puzzle rdf

Pattern exclusion in SPARQL

Seconds after I twittered the last post, @LeeFeigenbaum responded.

Ok, so I have two patterns, and I want to find the statements that match either pattern but don't match both. The solution is rather a flexible little idiom for this kind of negation. The specific patterns are:

?set dbpp:wikiPageUsesTemplate  <http://dbpedia.org/resource/Template:Infobox_programming_language> .

and

?set a yagoc:ProgrammingLanguage106898352 .

(I'm running this against DBpedia)

Lee's solution is:

PREFIX dbpp:  <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>

SELECT count(?set) where {
  {
    ?set dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
    OPTIONAL {
      ?set a ?marker .
      FILTER(?marker = yagoc:ProgrammingLanguage106898352)
    }
    FILTER(!bound(?marker))
  } UNION {
    ?set a yagoc:ProgrammingLanguage106898352 .
    OPTIONAL {
      ?set dbpp:wikiPageUsesTemplate ?marker .
      FILTER(?marker = <http://dbpedia.org/resource/Template:Infobox_programming_language>)
    }
    FILTER(!bound(?marker))
  }
}

(Note that COUNT isn't (yet) standard SPARQL, but seeing the size of the result sets was handy here).

It looks convoluted, but each half of the UNION is kind-of the converse of the other (and will give interesting results independently). I was a little surprised it did work, as variables are scoped to the whole query and ?marker looked troublesome. But FILTERs are scoped to the local group, and that's where it matters here (it would produce the same results if you had a different variable for each half of the UNION).

There is something slightly odd happening in this particular case (or I'm missing something obvious). The figures I got before were 762 matches for the UNION of the two patterns, 178 for the intersection, so I'd have expected 762 - 178 = 584 results, but this gives 406. So there's a bit of sloppy QED around here. I was missing something obvious.

Lee again via twitter: the numbers look perfect to me - the 762 double-counts the 178 in the intersection. 406+178=584
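Spelling out that arithmetic as a small sketch (the Counts class and its method names are just for illustration; the three input numbers are the query results from the previous post):

```java
/** Illustrative only: the counting behind the UNION / intersection figures. */
class Counts {
    /** Rows from a UNION without DISTINCT: members of both sets appear twice. */
    static int unionRows(int a, int b) {
        return a + b;
    }

    /** Distinct members of either set. */
    static int distinctUnion(int a, int b, int both) {
        return a + b - both;
    }

    /** Members of exactly one set, i.e. either pattern but not both. */
    static int exactlyOne(int a, int b, int both) {
        return a + b - 2 * both;
    }

    public static void main(String[] args) {
        int yago = 336, infobox = 426, both = 178;
        System.out.println(unionRows(yago, infobox));           // 762
        System.out.println(distinctUnion(yago, infobox, both)); // 584
        System.out.println(exactlyOne(yago, infobox, both));    // 406
    }
}
```

So 762 - 178 = 584 is the distinct union, and taking the intersection off once more gives the 406 that the negation query returns.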

As @glenn_mcdonald and Lee have pointed out, a DISTINCT would fix my original UNION query to exclude the dupes. Glenn also offers a more concise version taking advantage of a Virtuoso feature:

PREFIX dbpp:  <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>

SELECT count(?set1) where {
  {
    ?set1 dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
    FILTER NOT EXISTS { ?set1 a yagoc:ProgrammingLanguage106898352 }
  } UNION {
    ?set1 a yagoc:ProgrammingLanguage106898352 .
    FILTER NOT EXISTS { ?set1 dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> }
  }
}


danja
2011-03-28T19:35:15+01:00
negation sparql rdf

Long multiplication

Querying http://dbpedia.org/sparql


PREFIX yagoc: <http://dbpedia.org/class/yago/>

SELECT COUNT(?set1) where {
  ?set1 a yagoc:ProgrammingLanguage106898352 .
}

result = 336

PREFIX dbpp: <http://dbpedia.org/property/>

SELECT COUNT(?set2) where {
  ?set2 dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language>
}

result = 426

disjunction, language is in set1 OR set2

PREFIX dbpp:  <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>

SELECT count(?s) where {
  {
    ?s dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
  } UNION {
    ?s a yagoc:ProgrammingLanguage106898352 .
  }
}

result = 762

conjunction, language is in set1 AND set2

PREFIX dbpp:  <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>

SELECT COUNT(?set) where {
  ?set a yagoc:ProgrammingLanguage106898352 .
  ?set dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
}

result = 178

PREFIX dbpp:  <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>

SELECT count(?and) where {
  {
    ?or dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
  } UNION {
    ?or a yagoc:ProgrammingLanguage106898352 .
  }
  ?and dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
  ?and a yagoc:ProgrammingLanguage106898352 .

  FILTER(?and = ?or)
}

result = 356!?

Took me a long while to realise what that number represents, definitely time for a break...
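For the record, one way of reading that number: every resource in the intersection turns up in both branches of the UNION, so each ?and pairs up with two ?or rows, and 2 x 178 = 356. Here's a toy simulation of that join, assuming each resource carries each triple only once; the JoinCount class and the single-letter resources are invented, not DBpedia data.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the query above: (set1 UNION set2) joined with (set1 AND set2). */
class JoinCount {
    static int joinRows(Set<String> yago, Set<String> infobox) {
        // UNION without DISTINCT: one ?or row per branch that matched.
        List<String> orRows = new ArrayList<>(infobox);
        orRows.addAll(yago);

        // The two plain triple patterns on ?and: members of both sets.
        Set<String> andRows = new HashSet<>(yago);
        andRows.retainAll(infobox);

        // FILTER(?and = ?or) keeps only the pairs that are the same resource.
        int count = 0;
        for (String or : orRows) {
            for (String and : andRows) {
                if (and.equals(or)) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

With yago = {a, b, c} and infobox = {b, c, d} the intersection is {b, c} and the join yields four rows, twice the size of the intersection.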

What I'm trying to find (if it's possible) are queries to look at the difference between the sets above, the 762 - 178 = 584 part. I'm hoping something along the lines of Finding Resources that don't have a certain property might work. If anyone knows an idiom that'll work (or knows that it isn't possible) please ping me.


danja
2011-03-28T13:59:45+01:00
sparql puzzle rdf

SPARQL Results and HTML

A thought in passing. When I've needed to display SPARQL results in a browser I've generally either used some kind of programmatic templating (as in this blog) or XSLT on XML results - which can get clunky, but when the transformation is done, it's done. Results XML is straightforward (and I'm still rather fond of XML) but the choice of syntax is pretty arbitrary. The RDF that comes back from a CONSTRUCT is grand, that's a really nice kind of query, the data is immediately ready for reuse (it might be an obsessive-compulsive thing, but DESCRIBE still feels a bit messy). I've not got around to playing with JSON results, presumably that lends itself to speedy application in most languages.

But I can't help thinking it'd be neat if SPARQL results came back directly as RDFa so by default you had something that made sense both in a browser and to an RDF agent. Is there anything you can do with a SELECT that a CONSTRUCT-to-HTML couldn't do? Is there any way the stuff could be structured to simplify templating? There's at least one results XML to HTML XSLT around somewhere, I guess that could be tweaked for experimentation.
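To give a flavour of what CONSTRUCT-to-HTML might look like, here's a very rough sketch that renders one triple with a literal object as an RDFa-annotated list item. The RdfaRow class is hypothetical, it assumes full IRIs in the about and property attributes, and it ignores datatypes, language tags and resource-valued objects.

```java
/** Hypothetical renderer: one literal-valued triple as an RDFa list item. */
class RdfaRow {
    /** Minimal HTML escaping for text and double-quoted attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    static String literalTriple(String subjectIri, String predicateIri, String literal) {
        return "<li about=\"" + escape(subjectIri) + "\">"
             + "<span property=\"" + escape(predicateIri) + "\">"
             + escape(literal)
             + "</span></li>";
    }
}
```

A browser shows the literal as ordinary list text, while an RDFa-aware agent should be able to read the subject, predicate and object back out of the same markup.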


danja
2010-12-14T07:54:28+01:00
rdfa html sparql rdf