We now know how to do this computer stuff. Networks even. The Web of Data is now mainstream, yesterday's news. But we're still at the stage like Luther pinning his notice to the church door, can't find a stapler. Out of my window I see a little electricity pylon, that will be on a DB somewhere. I also can probably see 50 trees. I should be able to describe them individually, that's actually a plum tree that I cropped last year. I should be able to identify the caterpillar crawling up the twig, on that branch, of that tree. It's totally Engelbart, augment the person. But I don't think anyone (except perhaps for Philip K. Dick) envisaged the possibility that this could be networked. The separation between human individuals can be huge, but there is a way of having a shared intellectual overlay of the whole of reality. We have all the bits of technology, easy, already. To pick a cliche, imagine there's a starving kid in Africa. We can know about that almost immediately, divert appropriate resources. Ok, just some guy with a mobile, taking a piece of bread. As a species, we have the potential to be so beautiful. First time on this planet, let's make the best of it.
Energy
There is now a critical mass of people that know about Web data. Call it semweb, whatever. You can see some frustration bleed out, language in Facebook post (alright, I still just want a taxonomy for prostitutes). The ideas of 10 years ago have been fulfilled. We have the Web of Data, in a year or two, the momentum is rolling, different than imagined, but it is here. Now we have to look at other interesting things, that might be useful for humanity. I'm not sure, but anonymous access to the Web seems a good idea. Protocol work. Next.
2012-07-04T19:33:56+01:00
semweb rdf
Seki Update
Seki is my little project intended to explore some of the space around the notion of a Linked Data Platform (bit of praxis there, I didn't envision it that way when I started). The W3C have chartered an LDP Working Group, so obviously I'll be watching over there for tie-ins. The approach I'm taking is to build a front end/bridge to a SPARQL 1.1-capable triplestore. So far I've got a rough skeleton down so it can behave essentially as a (very crude) CMS. When I was last looking at the code I hit something of a stumbling block with how best to cover authentication/authorization. On paper it looks like the modeling side of it should be straightforward, though in practice there are a lot of choices and it's not obvious which are better. Bergi (the Bergwinkl one) has been putting some time in on it recently, so I reckon I'll just follow his lead. Protocol-wise, I think for now I'll just go with HTTP Basic. Seki uses node.js and I get the sense that it'll be very straightforward to wrap the appropriate parts in HTTPS. (I think when I asked around, Hixie's suggestion was Basic over HTTPS).
My intention was once Seki was fairly usable I'd slap it on hyperdata.org, play with it live there. As it happened the DB behind the Wiki I had running there got corrupted, so a couple of days ago I pushed Seki in its place. It's far from what you'd call fully functional yet, but all I needed right away was it to serve static files, and that it's doing admirably.
Once I've got it going properly with basic CMS functionality (with auth), I plan to have a go at hooking in some of the things I saw at the Salzburg workshop the other week - Apache Stanbol, the VIE widgets and associated bits and pieces. The motivation there is in part that those things are just cool stuff, but there's a slightly deeper reason too. Their design is such that they are strongly componentized, with the primary interface everywhere being the Web. Architecturally, IMHO, that has to be the right direction.
2012-06-24T14:45:45+01:00
ldp seki semweb rdf
Three phases of the Semantic Web
The slides I presented at the IKS Workshop are now on slideshare (font messed up a bit, I'll have a go at uploading a pdf version later) and at slides.odp. Probably more useful for a skim are the preparatory notes. I think my main quasi-novel point was that historically the (Semantic) Web could be said to have been through three phases:
1. "It's all about the docs"
the traditional Document Web, with a bit of metadata
2. "No, it's all about the things"
the upper-case Semantic Web, reaching a zenith with Linked Data
3. "Ok, maybe the docs are important after all"
the current phase, not docs exor data but a synthesis of what's gone before - all the Linked Data goodness, what we've learnt about REST, with Web APIs and a variety of media types (like JSON plus JSON-LD), all the smarter CMS stuff with natural language processing bits, the search stuff, bringing in RDFa/microdata/microformats, all together with some gentle relaxation of constraints (think schema.org) - and gaining truly mainstream adoption
Apologies to anyone in Salzburg that followed the link I gave in the slides, I'd totally forgotten that the service there was broken. Just spent this morning setting up a live instance of Seki on hyperdata.org to fix that. Well, kinda live, all it's actually doing now is serving up a handful of static pages and giving the crawlers a 404. There are quite a few things I need to fix up - some thought needed around config and most of all I need to get some auth in place, like yesterday. But having it live is pretty good motivation to get things fixed up.
2012-06-22T13:09:33+01:00
cms iks salzburg semweb rdf
IKS Salzburg, Day 1
At the Semantic Enterprise Technologies workshop in Salzburg. Very good so far. Too busy listening to comment :)
2012-06-12T14:31:34+01:00
iks semantic salburg rdf
ELIZA NG
Or: mental health monitoring via social net activity textual analysis. ELIZA was a clunky 1960s AI application that mimicked a psychotherapist quite convincingly via crude scripts (there are loads of online versions). Nearly 50 years on, surely we can do better?
I've no idea what work might have been done on this already, got other things to do right now, so just a note to come back to or for someone else to look into.
From a tech point of view at least, the first step to fixing a problem in a system is understanding it, and the first step to that is observing the system's behaviour, especially the bits that fall outside desirable parameters. You hook some kind of monitor or debugger on to the system, gather data, analyse, from the results hopefully discover potential solutions. There's a broad spectrum of potential problems, from the system not performing optimally to it causing catastrophic failure across connected systems. The system in this case is a person. The unwanted behaviour is the stuff associated with mental health issues: things going on in the subject's consciousness that lead to the spectrum from unhappiness, through to self-harm and/or harm to others. But how do you monitor such behaviour?
Nowadays a lot of people interact with the Web through social networking sites. They pump out a lot of text data. So imagine a hook into that data that applies a bit of text analysis. A Facebook app, a twitter-subscriber, a blog feed aggregator etc. (ideally all of these). The lifestream stuff.
Using myself as a sample subject, I have periods of mild depression (usually expressed as lethargy, lack of motivation, fortunately not much of the Dark Thoughts stuff). Also periods of heavy drinking, which sometimes lead to a bit of mania (lack of sleep being a big factor). If you tracked the text I output to the social networks there are plenty of markers: the depressed bits would be associated with lower output for starters. As well as incoherence during the boozy spells, I also seriously ramp up on sweariness and general antagonism.
It's easy to see how you could formulate a few chart plots from specific factors, like keywords for sweariness. But we have smarter ways to look at text, analyse the stuff across different dimensions. Initially I imagine it would make sense to obtain some baselines, corresponding to societal norms (Big Data!). Then onto individual norms. These would only make sense alongside other metrics that corresponded to something like Maslow's hierarchy - physical well-being and further up things like how many items got ticked off the todo list today, state of the bank account. Ideally you'd also want to monitor various other environmental factors - a trigger for me losing it has often been travel (especially to the UK :) Care would need to be taken not to conflate deviation from societal norms with anti-social behaviour, or a bit of unhappiness with abnormality.
So ultimately, assuming you've got the data and done smart analysis, how do you fix the problem? I can't see there being any magic bullets, but given a setup like this you could at least monitor the effects of medication, therapy or lifestyle changes. General solution I guess being to keep tweaking the variables and keeping whatever causes a net improvement.
2012-05-30T19:24:30+01:00
socialnets mental ai sweariness bigdata health rdf facebook
Serendipity and Affordances
Or: how the Web works
Serendipity means a "happy accident" or "pleasant surprise"; specifically, the accident of finding something good or useful without looking for it.
An affordance is a quality of an object, or an environment, which allows an individual to perform an action. Here's a 2-minute video that clearly explains affordances.
It turns out the data is much more powerful when matched up with something else online. The real benefit of the Web is the serendipity. You find that people will use the information for all kinds of other things.
When I say hypertext, I mean the simultaneous presentation of information and controls such that the information becomes the affordance through which the user (or automaton) obtains choices and selects actions.
Engineer for serendipity.
- Roy again.
(An alignment that came up in conversation with Mike Amundsen)
2012-05-12T15:25:27+01:00
serendipity affordances rdf
Lady Gaga

it's a httpRange-14 thing...
2012-04-24T03:21:13+01:00
gaga uri tag http lady httpRange-14 rdf
SPLUCKY FSMs
The other day I suggested Lucky SPARQL (now SPLUCKY, thanks Kingsley), a little convention in which a SPARQL query with an additional parameter, if its results were a single URI, would redirect to that URI. One detail I'd overlooked is that it would also be desirable to specify the Accept: type passed to the ultimate URI, but that could be given easily enough in the query parameters.
Anyhow, insomnia just brought this thing back to mind, and it occurred to me that this form of query -
http://example.org/sparql?query=SELECT+DISTINCT+... &action=redirect
- could also appear as resources in the dataset being queried and hence appear as the result of a SPLUCKY query. So one SPLUCKY query could pull out another, which would do the auto-GET thing for another and so on, hopping from URI to URI. Rather like a finite state machine.
Ok, observant readers may have noticed that this is just a glorified redirect chain/loop, not quite an FSM. But it's not far off! [said with the enthusiasm of someone who's successfully built a nearly perpetual motion machine]. In fact the shape of it (combined with insomnia) brought to mind the Turing Machine. But as you need to write to the tape, and idempotent GETs won't allow that, implementing that with SPLUCKY seems a non-starter. Although it did occur to me that if you flip it over, the URI itself could correspond to the tape, that can be modified. Whatever, these kinds of machines are very similar to the way R/W Web activity happens, the similarity being more evident when shuffling resources/statements as with SPARQL 1.1. (A nice insomniscient thought is that the Web is just a fancy Turing Machine tape, and humans are the rules tables...).
Incidentally, although SPARQL has a media type "application/sparql-query", I couldn't see any ref in the specs to using the URI of a query in place of the query itself in the query parameters (!), i.e. something like:
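The URI-to-URI hopping described above can be sketched as a loop that follows redirects until it reaches something that doesn't redirect, or notices it has been there before. This is only a rough sketch in Python: the `REDIRECTS` table is a hypothetical stand-in for a SPLUCKY endpoint, the URIs are invented, and no real HTTP or SPARQL is involved.

```python
# Hypothetical stand-in for a SPLUCKY endpoint: each key is a request URI,
# each value is the URI it would redirect to. All URIs are invented.
REDIRECTS = {
    "http://example.org/q1": "http://example.org/q2",
    "http://example.org/q2": "http://example.org/q3",
    "http://example.org/q3": "http://example.org/doc",
    "http://example.org/loop-a": "http://example.org/loop-b",
    "http://example.org/loop-b": "http://example.org/loop-a",
}


def follow(start, redirects, max_hops=20):
    """Follow redirects from start; return (final_uri, path, looped)."""
    path = [start]
    seen = {start}
    current = start
    for _ in range(max_hops):
        nxt = redirects.get(current)
        if nxt is None:
            return current, path, False   # no further redirect: done
        path.append(nxt)
        if nxt in seen:
            return nxt, path, True        # revisiting a URI: a loop
        seen.add(nxt)
        current = nxt
    return current, path, False           # gave up after max_hops
```

Calling `follow("http://example.org/q1", REDIRECTS)` walks q1, q2, q3 and stops at the document; starting from `loop-a` it reports a loop. The `max_hops` cap is an assumption of the sketch, much like the redirect limits browsers apply.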
endpoint/query?queryURI=http://example.org/abc.rq
Thought I'd seen that set up on an endpoint somewhere, is it specified anywhere?
2012-04-13T04:55:00+01:00
fsm turing sparql machine rdf splucky
A first taste of the schema.org carbonated soft drink
<http://hyperdata.org/Hello> a sioc:Post ;
    dc:date "2012-04-02T07:24:53.676Z" ;
    dc:title "Hello World!" ;
    sioc:content "My first post." ;
    foaf:maker [ foaf:nick "danja" ] .

schema:articleBody owl:equivalentProperty sioc:content .
schema:author owl:equivalentProperty foaf:maker .
schema:name rdfs:subPropertyOf dc:title .
schema:datePublished owl:equivalentProperty dc:issued .
schema:Article rdfs:subClassOf sioc:Item .
foaf:nick rdfs:subPropertyOf schema:additionalName .
Top-level terms
I think it would be very helpful if schema.org was a bit clearer about "top-level" terms. Right now Thing has description, name, image, url. Ok, not bad as a first pass against what's needed on the Web. But url is/should be redundant (but that's just my semweb prejudices), there's slight conflict between description and content-oriented terms like articleBody which has the intermediate node of Article. (This isn't a new phenomenon, RSS history is littered with the wreckage of content vs. description, and higher up the architectural tree it's one of the features of httpRange-14). Ok, maybe description is useful enough to leave alone, similarly name is probably reasonable to cover the top level of label, title, name. image I suppose is fair enough, a pragmatic approach to something that could easily get messy if more WebArch was brought into the picture. I guess my recommendations then would be to add a term Item (for a generic Information Resource, superclass of Article etc) and date (for a superproperty of all dates).
Automatic mapping
I haven't yet decided whether or not to use the Web vocab or schema.org versions of the terms in my internal RDF, I suppose I could even use both. But my little experience above demonstrates it's not yet obvious how to map across even with these really common terms. If the starting point was something richer, the amount of work involved could easily explode. Some kind of automation is desirable, for the benefit of someone like me in the current situation, a publisher of semantically marked-up HTML that would like their material to connect with the Linked Data Cloud, or someone writing an app that consumes data across different vocabularies. A service (or two) springs to mind: give it a term and it responds with correspondences from other vocabs, or give it a lump of data and let it offer a translation to the preferred vocab(s)/format. There are at least two approaches to implementation: SPARQL CONSTRUCT and/or RDFS/OWL inference (in both cases the use of generic superclasses/properties could be useful). The front end could offer something like the Rich Snippets Testing Tool for authors together with an open API for translation by app developers, to give a leg-up for integration/mashups. It would be nice if the good folks behind schema.org would consider throwing some resources in this direction.
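The CONSTRUCT-style approach boils down to rewriting predicates according to a table of correspondences. Here's a minimal sketch of that idea in Python, working on plain tuples rather than a real RDF library. The table only contains rewrites that follow from the correspondences listed at the top of this post; the names `TO_SCHEMA_ORG` and `translate` are made up for illustration, and the blank node label `_:b0` is an assumption.

```python
# Predicate rewrites that follow from the correspondences stated in the post:
# the owl:equivalentProperty pairs work in either direction, and
# foaf:nick rdfs:subPropertyOf schema:additionalName works in this direction.
TO_SCHEMA_ORG = {
    "sioc:content": "schema:articleBody",
    "foaf:maker": "schema:author",
    "dc:issued": "schema:datePublished",
    "foaf:nick": "schema:additionalName",
}


def translate(triples, mapping):
    """Return (translated, untranslated) lists of (s, p, o) tuples."""
    translated, untranslated = [], []
    for s, p, o in triples:
        if p in mapping:
            translated.append((s, mapping[p], o))
        else:
            untranslated.append((s, p, o))
    return translated, untranslated


# A cut-down version of the post data above, as plain tuples.
post = [
    ("<http://hyperdata.org/Hello>", "dc:title", '"Hello World!"'),
    ("<http://hyperdata.org/Hello>", "sioc:content", '"My first post."'),
    ("<http://hyperdata.org/Hello>", "foaf:maker", "_:b0"),
    ("_:b0", "foaf:nick", '"danja"'),
]

done, left = translate(post, TO_SCHEMA_ORG)
```

The dc:title triple ends up in `left`: schema:name is a subproperty of dc:title, which doesn't license a rewrite in the other direction. That's the sort of gap a mapping service would have to surface rather than paper over.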
2012-04-05T15:13:53+01:00
iks seki rdfa html schema.org semantic semweb rdf
DocSpace and ThingSpace
About a week ago Simon St.Laurent posted a comment on Facebook in a thread about httpRange-14. Alas I can't find Simon's original comment, but the gist was more or less: it's a bad idea to use HTTP URIs for anything that isn't directly associated with the HTTP protocol, e.g. as names for real-world things, concepts, "non-Information Resources" etc. I've been mulling over this a bit.
I don't disagree with him that it might be conceptually more elegant to have a different namespace for identifiers of things rather than identifiers of Web documents (namespaces in the broad sense, probably talking URI schemes in practice). You can still describe and reason about things in a model like RDF using non-HTTP URNs. But for the identifiers to be useful on a global scale, you need a discovery mechanism. HTTP with hypermedia offers this. But how would you mesh the 'thing' namespace with the 'HTTP' namespace? Ok, you could use HTTP to access RDF documents that link to other documents. Mike Amundsen might disagree a little about the extent of this, but I'd suggest any RDF is inherently a kind of hypermedia simply by supporting HTTP URIs. As soon as you see the http:// prefix, the methods are implicitly available on top (they are made explicit to some extent with RDFa etc).
Ok, so to find out about my dog Basil you'd want to try and find statements that match a template like ?doc a foaf:Document; foaf:primaryTopic <urn:basil> . You're working with an indirection between real-world things and document space. A map of identifiers ThingSpace <-> DocSpace. But this is in effect what we've got when we use HTTP URIs for things, the protocol doesn't support resolving them to things any more than a URN scheme URI over HTTP would resolve to anything. The net result is the same as in what Jeni Tennison recently called the web of data view. But by allowing HTTP URIs to identify things, we can kludge past needing a separate space for thing identifiers and explicit namespace map. Yes, there is a downside - the httpRange-14 permathread - but leaving aside the philosophical niceties the concrete problem is just a matter of choosing an appropriate mechanical convention. This seems a small price to pay given the way it simplifies the publication of linked data. The linked data cloud is progress! Once again I refer you to Dan Connolly's question: Are there parts of traditional logic and databases that, if we set them aside, will result in viral growth of the Semantic Web? By the same token we might ask, are there parts of Web best practices that it might be worth setting aside, you know, as a bootstrap, like just for a little while... (cf. schema.org vs. distributed vocabularies).
PS. Simon's blogged his thoughts on this : Original sin and the ruin of HTTP URIs, and although much of that sounds critical of semweb efforts (we have been talking at cross-purposes a little) there is significant common ground on the G+ thread, around the notion that Linked Data HTTP URIs should always resolve to something useful over HTTP, i.e. that HTTP URIs for things should make sense on the Web as well as in the triplestore, to the extent that you can put them in a browser's address bar and not expect an error.
2012-04-04T23:30:33+01:00
webarch uris simonstl httpRange-14 rdf
AgentRank and serendipity
Agent Rank is described in a Google patent from last year and the implications of it (from a SEO perspective) are discussed in this post. Interesting stuff: essentially factoring people's identities into search particularly through author reputation. Appropriately enough, as evidence that Google is actively working on this a screenshot of Othar Hansson's G+ profile is used (he's listed as Engineering Lead on "The Authorship Project").
This seems a natural progression for them. Another likely example of Google implementing on a large scale ideas that have been floating around semweb activities for a long while. Now they've got the Give yourself a URI bit down (with G+ identities) the rest can follow. The article suggests that the approach will be quite nuanced, incorporating topic information as well. Bravo - anything that makes the soup of the Web more digestible is to be welcomed.
My only concern is a bit on the abstract side. Tim Berners-Lee has often praised the serendipity aspect of the Web, finding things and making connections by (apparent) chance that wouldn't otherwise be obvious. Information reuse is a cornerstone of (Semantic) Web technology, and it's there from the ground up: Roy Fielding says Engineer for serendipity. Whether it's through the uniformity of the interface (as Roy might put it) or of the graph (as Tim might put it), the Web does seem to encourage alignment of resources on similarities, without prejudice.
But any ranking of resources surely has to be done based on known parameters. Serendipity is all about seeing similarities across previously unknown axes. So doesn't AgentRank (and for that matter good old-fashioned PageRank) run counter to this idea?
Here's a recent rather trivial example of serendipity. A couple of days ago I came across this puzzle:

Now I'm pretty certain this would normally have totally foxed me. But I got the solution in a couple of minutes because the night before I'd been concentrating on this little woodcarving project:

(Now finished - end results)
I won't give any other clues to the puzzle, but working on one problem gave me direct insight into the other. Ok, the puzzle is an artificial problem, but what if the key to the Riemann Hypothesis lay in a similarly peculiar direction? The information needed could be out there on the Web already, hidden in plain sight on the blog of a mathematician and that of (say) a woodcarver. If it were, it's vaguely plausible that a semweb-style system that combined the data behind the blogs would see the connection. A text-similarity based system might see the connection too. But if the access to the information was based on the mathematician's reputation in mathematics, the woodcarver's reputation in woodcarving and the crossover of these, there would be no serendipity.
I don't know, the infrastructure of the Web supports serendipity, but how do we surface it?
2012-03-31T10:30:49+01:00
authorrank pagerank serendipity semweb rdf agentrank
Lucky SPARQL
tl;dr : how to give SPARQL endpoints an "I'm Feeling Lucky" option and hence support things like WebFinger
Take a query like:
SELECT DISTINCT ?blog WHERE {
?person foaf:name "James Snell" .
?person foaf:weblog ?blog .
}
LIMIT 1
If I'm asking something like that, then what I'm probably trying to achieve is to get to James' blog. But if I use that on an endpoint, what I'll get back is a bunch of XML (or JSON), from which I'll have to parse out the URI, then fire off another GET. So what about having the endpoint server support an additional parameter, something like:
http://example.org/sparql?query=SELECT+DISTINCT+... &action=redirect
which would tell the server to pull out the URI in the results, and return:
HTTP/1.1 302 Found
Location: http://chmod777self.blogspot.com
- thus taking me straight to my actual target.
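On the server side, the decision could look something like the following sketch in Python. It assumes the results are already in hand in the standard SPARQL JSON results layout (`results` / `bindings`, each term with a `type` and `value`); the function name `lucky_response` is made up, and the 404 for anything other than a single URI result is just one plausible reading of the idea, not an existing endpoint feature.

```python
def lucky_response(results):
    """Map SPARQL JSON results to a (status, headers) pair.

    Redirect only when there is exactly one row holding exactly one URI;
    anything else is treated as "not lucky" and gets a 404.
    """
    bindings = results.get("results", {}).get("bindings", [])
    if len(bindings) == 1 and len(bindings[0]) == 1:
        (term,) = bindings[0].values()
        if term.get("type") == "uri":
            return 302, {"Location": term["value"]}
    return 404, {}


# Example results for the query above, using the blog URI from this post.
one_hit = {
    "head": {"vars": ["blog"]},
    "results": {"bindings": [
        {"blog": {"type": "uri", "value": "http://chmod777self.blogspot.com"}}
    ]},
}
no_hit = {"head": {"vars": ["blog"]}, "results": {"bindings": []}}

status, headers = lucky_response(one_hit)
```

With `one_hit` the status is 302 and the Location header carries the blog URI; with `no_hit` the function falls through to the 404.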
WebFingering
I've had James Snell's proposal for simplifying WebFinger simmering away in the back of my mind. I'm unconvinced by the architectural style of what he suggests (Gopher?), but he does get bonus points for creativity. (See also James' response on that). In the query above I've used foaf:name which is likely to give ambiguous results. But if it was foaf:mbox_sha1sum instead, you've got a mechanism for WebFinger with James' optimization. Ok, the request URI is a bit cumbersome, but templating a short version for special cases like WebFinger would be easy enough.
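For the foaf:mbox_sha1sum variant, the client needs the SHA1 of the mailto: URI (that's how FOAF defines the property) and then a request URI carrying the query. A rough sketch in Python follows; the endpoint address and the `action=redirect` parameter are the hypothetical ones from this post, the helper names are invented, and the email address is a placeholder.

```python
import hashlib
from urllib.parse import urlencode


def mbox_sha1sum(email):
    """SHA1 hex digest of the mailto: URI, as used by foaf:mbox_sha1sum."""
    return hashlib.sha1(("mailto:" + email).encode("utf-8")).hexdigest()


def lucky_url(endpoint, email):
    """Build a Lucky SPARQL request URI that looks a person up by mbox hash."""
    query = (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/> "
        "SELECT DISTINCT ?blog WHERE { "
        '?person foaf:mbox_sha1sum "%s" . '
        "?person foaf:weblog ?blog . } LIMIT 1"
    ) % mbox_sha1sum(email)
    # action=redirect is the proposed (non-standard) parameter from the post.
    return endpoint + "?" + urlencode({"query": query, "action": "redirect"})


url = lucky_url("http://example.org/sparql", "someone@example.org")
```

A short templated form for the WebFinger case would then only need to hide the query text behind a fixed path.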
PS. A better name might be "Optimistic SPARQL" (and probably return a 404 if the query doesn't return a suitable pattern).
2012-03-29T15:51:36+01:00
sparql semweb rdf gopher webfinger
Debunking the 27 Club with SPARQL
This morning I stumbled across a Fortean Times piece about the "27 Club". The story goes that an awful lot of popular musicians have died at the age of 27. A recent new member of the club is Amy Winehouse, and there was a notable cluster with Brian Jones, Jim Morrison, Jimi Hendrix and Janis Joplin all joining around 1970. The idea of the 27 Club appears to have started soon after Kurt Cobain's death in 1994. Rock mythology being what it is, the origin of the 27 Club is now taken as being bluesman Robert Johnson's pact with the Devil (at the crossroads).
So, is there any truth in this? According to a rock star biographer quoted in Wikipedia "there is a statistical spike for musicians who die at 27", but also there's been a British Medical Journal study that showed no such spike. So, contradictory evidence, the jury's still out... But as it happens Wikipedia also has a good collection of data about musicians and that data is available in processing-friendly linked data from dbPedia. So I thought I'd look into this myself.
Long story short, does the highlighted column here look like a spike?
I started by finding the Wikipedia page for Kurt Cobain: https://en.wikipedia.org/wiki/Kurt_Cobain. Given that it's easy to get dbPedia's identifier for the man: http://dbpedia.org/resource/Kurt_Cobain. Opening Kurt's URI in a browser results in a redirect to a page about him (following the 303 convention): http://dbpedia.org/page/Kurt_Cobain. It displays the pieces of data dbPedia knows about him, the properties and their values. From that I was able to see how the relevant facts were expressed, and translate them to the following triples in Turtle notation :
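PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ont: <http://dbpedia.org/ontology/>
PREFIX db: <http://dbpedia.org/resource/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>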
db:Kurt_Cobain a ont:MusicalArtist .
db:Kurt_Cobain foaf:name "Kurt Cobain" .
db:Kurt_Cobain ont:birthDate "1967-02-20"^^xsd:date .
db:Kurt_Cobain ont:deathDate "1994-04-05"^^xsd:date .
This is enough to use as a template for a SPARQL query, putting a variable in place of Kurt's identifier. Given the Robert Johnson story it seems reasonable to filter out any musicians born before the 20th century.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ont: <http://dbpedia.org/ontology/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?name ?birth ?death WHERE {
    ?m a ont:MusicalArtist ;
       foaf:name ?name ;
       ont:birthDate ?birth ;
       ont:deathDate ?death .
    FILTER (?birth > "1900-01-01"^^xsd:date)
}
Running that query produces 4099 results, which seemed small enough to handle in a spreadsheet. Had it been a few more I'd probably have opted for the JSON representation of the results and done the processing with a little script. Had the querying been more complex (likely to cause timeouts on dbPedia) I'd probably have had to do some CONSTRUCT queries to extract the chunks of dbPedia of interest in RDF and put those in a local store, running queries against that. But it wasn't and it wasn't, so I ran the query directly, choosing the XML+XSLT stylesheet option to give me results in HTML. These I simply copied from the browser and pasted into a LibreOffice spreadsheet.
The spreadsheet automatically figured out the date format so I was able to get the musicians' ages with a trivial calculation. Sorting on this column revealed that the first 33 entries were duff data, mostly invalid format. Neither Wikipedia nor dbPedia is perfect. But 4066 values, even allowing for a few errors along the way, should be a big enough sample size to test the theory.
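For anyone who'd rather script it than use a spreadsheet, the age calculation and the tally per age might look like this in Python. It's a sketch of the same arithmetic, not what I actually ran: the function name is made up, the rows are a tiny hand-typed sample (Kurt's dates from above, plus an invented bad row), and rows with unusable dates are simply dropped.

```python
from collections import Counter
from datetime import date


def age_at_death(birth, death):
    """Whole years between two ISO dates, or None if either is unusable."""
    try:
        b = date.fromisoformat(birth)
        d = date.fromisoformat(death)
    except ValueError:
        return None
    if d < b:
        return None
    # Subtract one if the final birthday had not yet been reached.
    return d.year - b.year - ((d.month, d.day) < (b.month, b.day))


rows = [
    ("1967-02-20", "1994-04-05"),   # Kurt Cobain, dates from the post
    ("1967-02-20", "1994-04-05"),   # duplicate row, as seen in the data
    ("1950", "not a date"),         # made-up example of duff data
]
ages = [a for a in (age_at_death(b, d) for b, d in rows) if a is not None]
histogram = Counter(ages)           # age -> number of deaths at that age
```

Charting `histogram` by age gives the same kind of plot as the spreadsheet. Note that the duplicate row is counted twice here, just as it was in my sheet.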

You can see here another problem with the data - Kurt has two entries. I guess something like Google Refine could be used to tidy this up, but I went with the assumption that such problems would be reasonably evenly distributed. PS. a DISTINCT qualifier in the SELECT clause of the query would be an improvement here, and the ?name bit would be better dropped (it's not needed and introduces duplicates). I used the SNORQL endpoint of dbPedia.
Here's an online Google Spreadsheet derived from my LibreOffice original.
So, results. I'll leave the statistical significance measuring to someone else, but to my eyes at least there doesn't seem to be a spike at 27, with only 40 deaths (there's a bigger version of the chart here). If anything, there may be a spike at the top value, 95 deaths at age 74. There may well be a 27 Club of accursed musicians, but the 74 Club is more popular. I don't have the figures for normal humans, but the BMJ found that "musicians in their 20s and 30s were two to three times more likely to die prematurely than the general UK population".
Keith Richards is 68.
2012-03-07T16:21:58+01:00
linkeddata 27club sparql rdf data linked dbpedia journalism
Linked data games
This morning I was playing a linked data game. It was a multiplayer game, involving a big map with pieces on it (a bit like Risk), and actual printed cards (a bit like Top Trumps) which looked remarkably like Wikipedia info boxes. Then I woke up. I can't remember much about the game now, but it's pretty easy to imagine the linked open data cloud being used to fuel a game engine. There's plenty of open data around for geo-oriented things, and the Top Trumps bit (for my sleeping self at any rate) could be a trivial mapping from dbPedia.
There are a few precedents for Web data games. A few years ago Gunnar Grimnes hooked FOAFish data up into RDFRoom, a shoot-em-up kind of thing (I've a feeling someone else made another, can't remember who and searching didn't help, "rapid deployment force" gets in the way).
I challenge anyone that's worked with structured data to watch the film about text adventure games Get Lamp and not come away inspired to do a tie-in [e.g. GET lamp HTTP/1.1] (I am old enough to have taken part in this wave of computer games first-hand, but somehow it passed me by). It seems Liam Quin has had a go with RDF "you are standing outside on a grassy hill, with only a damp towl and a bar of soap" (Liam and I are currently working on a writing project together, so this was a delightful coincidence). I found it via a related post by Leigh.
The Web does have the navigation/exploration metaphor for traversing links, something I'd like to exploit somehow around RDF Affordances. "Surfing" the Web is quite a hip idea, but I do wonder about Tim Berners-Lee's metaphor with Weaving the Web : "Wear it like a shawl, dude."
PPS. Liam adds: "At one point you can Summon the Timbl, who talks faster and faster until he vanishes in a puff of logic."
PS. Couple more references (via Facebook) :
...and a bit of code
   99 IF(CLOSNG)GOTO 95
      YEA=YES(81+NUMDIE*2,82+NUMDIE*2,54)
      NUMDIE=NUMDIE+1
      IF(NUMDIE.EQ.MAXDIE.OR..NOT.YEA)GOTO 20000
      PLACE(WATER)=0
      PLACE(OIL)=0
      IF(TOTING(LAMP))PROP(LAMP)=0
      DO 98 J=1,100
      I=101-J
      IF(.NOT.TOTING(I))GOTO 98
      K=OLDLC2
      IF(I.EQ.LAMP)K=1
      CALL DROP(I,K)
2012-03-04T10:55:00+01:00
doom adventure moo games rdf
Consolidation
A little follow-up to my post Everyone has a Graph Store. Two main things: looking at those graphs from a different perspective and a little initiative I'm putting forward to try and advance a particular aspect of this stuff. (PS. I've gone on about the first point a lot longer than intended and the dogs need walking, so I'll leave the second thing for another day - in lieu of that check SPARQL Box).
"Graphs" are just Structured Data
Given the response I got on twitter, G+ etc. there must have been something right about that post, but the most interesting feedback I got relates to what was wrong with it. Specifically from Kingsley Idehen (@kidehen) :
do the people we need to engage really care about the facts that they've been using 'Graphs' forever? I don't think so. Why not remind them of the fact that they've been working with structured data forever, but in silos prior to the emergence of the ubiquitous Web.
I was bandwagoning the graph meme, in the sense of the Social Graph that's been talked about a lot in recent years, along with things like Tim Berners-Lee's description of the WWW as the Giant Global Graph. I also had in mind the concrete notion of the graph as found in RDF. But Kingsley's absolutely right to point out that what we're talking about here is really just structured data and how we use it.
I'll borrow a little from Kingsley's own history to help clarify the point. Go back two decades and you'll find Kingsley starting a company (which became OpenLink) focused on data integration software. Their products were middleware that allow connections to be made between various kinds of enterprise databases and applications. They were based on industry standards, allowing pluggability between systems (acronym city: SQL, XML, ODBC, JDBC, OLE, ADO...). Kingsley had recognised there was a market for this stuff because, in essence, being able to connect different systems together significantly increased the value and utility of those systems - the whole being greater than the sum of parts. Fast-forward to say a decade ago, and a new kind of data integration was becoming feasible - using the Web. Rather than using standards designed for connecting specific enterprise tools together, this exploited open, global standards, notably URLs and HTTP. While XML was (and is) useful for this purpose (and HTML also has its uses), the emerging Resource Description Framework has Web technologies as its foundations, so is ideally suited for integrating data in this environment. Seeing the advantages of using not only Web technologies as middleware but also the Web as a database in its own right, Kingsley ensured his company was an early adopter and they've been at the forefront of the development of linked data ever since.
But there's a lot more to this than enterprise databases.
Local Structured Data
Every time we use a computer we are working with structured data. Even if it's just Word documents on a file system, there are relationships and interactions between the pieces of information we're working with. Take a look at your Start Menu or whatever the OS X Toolbar is called: every one of the applications there uses data in a structured fashion. While there will be some system-wide integration of their data, e.g. in allowing intelligent search, essentially each application operates in its own little isolated world.
Back to the Web again and we see all the different companies, services and applications operating in a similar fashion, commonly referred to as data silos. But the take home here, as Kingsley puts it, is that we've all been using structured data forever. The challenge for the next generation of software, whether we interact with it on our cell phone, laptop, desktop, domestic appliance or the Web is genuine integration. The best integration capability we have to date is through Web technologies.
Here I'll quote Kingsley again (from G+). He's talking in the context of linked data advocacy, but the point he makes is a much broader, practical one:
Basically, we should be demonstrating 'Linked Data Inside' effects on existing apps (Access, File Maker, Excel, Google Spreadsheet etc..). Here's the pleasant surprise and one of my eternal Linked Data frustrations: each of the native tools above have natural bindings to Linked Data courtesy of:
1. HTTP GET support -- so each Linked Data Resource URL is a Data Source Name, easily comparable to an ODBC/JDBC Data Source Name
2. CSV output support -- meaning to make 3-tuples or 4-tuples and then save to a Text file that's practically N-Triples.
Let's take this opportunity to collectively fix the broken Linked Data narrative. Fixing that will also enable critical fixes to the broken Semantic Web narrative. Everything is a Remix, but Linked Data (the ultimate remix technology) is described or pitched as the ultimate remix facilitator.
More generally, in other words, the future is already here (it's just not very evenly distributed). Referring back to my previous blog post, you can legitimately search & replace "Graph" with "Structured Data".
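Kingsley's second point (CSV rows of 3-tuples are nearly N-Triples already) can be sketched in a few lines of Python. This is only an illustration under assumptions that aren't in his comment: the CSV has exactly three columns (subject, predicate, object), subjects and predicates are always IRIs, and an object is treated as an IRI only if it starts with http:// or https://, otherwise as a plain literal. The function names and the sample data are made up for the example.

```python
import csv
import io


def escape_literal(text):
    """Escape the characters N-Triples requires inside a quoted literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def term(value, force_iri=False):
    """Render one CSV cell as an N-Triples term.

    Assumed convention (not from the post): http(s) values are IRIs,
    everything else becomes a plain string literal.
    """
    if force_iri or value.startswith(("http://", "https://")):
        return "<" + value + ">"
    return '"' + escape_literal(value) + '"'


def csv_to_ntriples(csv_text):
    """Turn rows of subject,predicate,object into N-Triples lines."""
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        s, p, o = (cell.strip() for cell in row)
        lines.append(f"{term(s, True)} {term(p, True)} {term(o)} .")
    return "\n".join(lines)


# Invented sample rows, as they might be saved from a spreadsheet.
sample = (
    "http://example.org/danny,http://xmlns.com/foaf/0.1/name,Danny\n"
    "http://example.org/danny,http://xmlns.com/foaf/0.1/homepage,http://dannyayers.com/\n"
)
print(csv_to_ntriples(sample))
```

Going the other way (his first point) is just an HTTP GET on the resource URL, which is what makes the comparison with an ODBC/JDBC Data Source Name reasonable.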
2012-03-01T14:52:17+01:00
box kidehen sparql rdf data linked
Everyone has a Graph Store
Try this thought experiment.
For practical purposes we often assume that everyone has a computer, a reasonable Internet connection and a modern Web browser. We know it's an inaccurate assumption, but it provides conceptual targets for technology in terms of people and environment.
Ok, now add to that list a Graph Store: a flexible database to which information can easily be added, and which can be easily queried. The data can also be easily shared over the Cloud. The data is available for any applications that might want to use it. The database is schemaless, agnostic about what you put in it: the data could be about contacts, descriptions of people & their relationships (i.e. a Social Graph), it could be about places or events, products, technical information, whatever. It can contain private information, it can contain information that you're happy to share. You control your own store and can let other people access as much or as little of its contents as you like (which they can do easily over the cloud). You can access other people's stores in the same way, according to their preferences. It's both a Personal Knowledgebase and a Federated Public Knowledgebase.
So, make the assumption: everyone has a Graph Store. Now what do you want to do with yours? What can your friends and colleagues do with theirs? How can you use other people's information to improve your quality of life, and vice versa? What new tools can be developed to help them take advantage of their stores? How can you get rich quick on this? What other questions are there..?
Note that if everyone has a Graph Store, for free they automatically get the value-add of the linked data cloud.
Ok, I'm presenting this as a thought experiment, but we pretty much already have all the necessary tools and infrastructure for it to be reality. They aren't generally packaged up in a form that's user-friendly, but that part is becoming increasingly trivial (see below). If you want to run such a store on a local machine there are masses of alternatives - to pick the first three that come to mind there's 4Store, Fuseki and Stardog. If you have a server or other kind of cloudspace available then tools like these are an option there too. For an enterprise kind of environment you probably should look at OpenLink Virtuoso. If you want to leave everything to the cloud, there's Kasabi - note their free hosting option. (I can't remember offhand what other hosted cloud-based options are available, I'm pretty sure there are a few others but a quick search only yielded Dydra which is currently in private beta - please ping me if you know of others...or set up your own :)
The reason I'm prompted to post this now is because of a couple of projects I've had on the go for a while. One (Scute) is an attempt to make my hacking with RDF easier - it's essentially a glorified text editor with a bit of HTTP clientness built in. The other (Seki) was started as a demo more or less to show how a triplestore could be used as a general-purpose read/write Web server, supporting content as well as data. Neither of these is remotely mature enough for proper reuse (Scute has become bloaty/buggy and Seki doesn't do much yet, both are lacking tests and documentation, work in progress innit). But what I found interesting was that although they are approaching semweb tech from a very different direction, there's some definite convergence going on. That convergence is more or less around what I was calling the Semantic Web in a Box (SWIB) a few years ago (jeez, 2006 - tempus fuggits).
The thing is, although this Web stuff does evolve gradually over time, there are also developments that are in effect big steps forward. In the context of the Semantic Web there was the publication of the 2004 specs (solidifying the material that came before), the development of SPARQL (allowing loosely-coupled access to triplestores) and the perspective shift that the notion of linked data offers (bringing the Web back into the Semantic Web). That's not to mention the initiatives that have appeared outside semweb cognoscenti circles - things like schema.org.
I reckon SPARQL 1.1 is another big step. Yes, we already knew how to write to the Web with good old RESTful HTTP. But SPARQL Update, Graph Store Protocol etc. offer a standard, loosely-coupled way of writing to triplestores. Ok, a purist may point out that a lot of this stuff isn't RESTful, hence isn't truly Webby. But that doesn't matter - it completes the decoupling of the backend layer (arguably, paradoxically, disintermediating the layers) making it possible to commodify that layer and allow middleware to use generic interfaces, plugging in to any store at one end and potentially any client at the other.
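To make the "standard, loosely-coupled way of writing" a bit more concrete, here is a rough Python sketch of the two write routes: a SPARQL 1.1 Update sent as a direct POST, and a Graph Store Protocol PUT that replaces a named graph. It only builds the requests, it doesn't send them. The endpoint URLs (a local store at localhost:3030), the graph name and the function names are assumptions for illustration; real stores publish their own endpoint paths.

```python
from urllib.parse import urlencode
from urllib.request import Request


def sparql_update_request(endpoint, update):
    """Build a SPARQL 1.1 Update request (the 'POST directly' form)."""
    return Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


def graph_store_put_request(store, graph_iri, turtle):
    """Build a Graph Store Protocol PUT that replaces one named graph."""
    url = store + "?" + urlencode({"graph": graph_iri})
    return Request(
        url,
        data=turtle.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="PUT",
    )


# Hypothetical endpoints and data, purely for illustration.
update = sparql_update_request(
    "http://localhost:3030/ds/update",
    'INSERT DATA { <http://example.org/post1> '
    '<http://purl.org/dc/elements/1.1/title> "Hello" }',
)
put = graph_store_put_request(
    "http://localhost:3030/ds/data",
    "http://example.org/graphs/blog",
    '<http://example.org/post1> '
    '<http://purl.org/dc/elements/1.1/title> "Hello" .',
)
print(update.get_method(), update.full_url)
print(put.get_method(), put.full_url)
# To actually send either one: urllib.request.urlopen(update)
```

The point of the decoupling is that nothing in this client code knows or cares which triplestore is at the other end.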
This means the SWIB idea just got a whole lot easier. All it needs to be at heart is a triplestore which supports read/write SPARQL. As noted above, these are already available. I do think the packaging could be improved, to totally minimise the installation effort. One click to download, one click to install, another click to run. A bit of shiny GUI is also desirable, not only to make things easier than the default HTML form for endpoint access but also to reduce the surprise to the end user. It should look a lot more like familiar tools - ideally including something general-purpose (think Microsoft Access) and one or two domain-specific apps (FOAFish contacts/social net client is an obvious one, taking advantage of recent developments a Rich Snippets aware bookmarking app might be nice). A little configuration tool would be good to have too, not everyone is comfortable editing exotically-formatted text files.
Of course it would make me very happy if someone else put a SWIB together like this, dear lazyweb, as it'll probably take me another 6 years to get it together myself. But irrespective of what I say or do on the matter the personal/shared graph store is such a gaping niche that it's bound to happen in some form pretty soon anyway. Whatever, the current absence of "everyone has a graph store" is a conceptual block to imagining the possibilities. So try assuming this is already a done deal.
2012-02-26T15:02:58+01:00
swib federated semweb rdf
API Babel
Nothing new here...that's the problem :)
I posted this in a conversation with Nina Jeliazkova and Evan on G+, thought I'd put it here so I could find it again.
Let's say I was setting up an events service for musicians. Following +Evan Prodromou's ref, Portable Contacts would seem in scope for the musicians themselves. Events happen in a location, so one part of the API I'd be interested in is the address stuff. To work with that data it might be useful to use geonames too. It's events, so let me have the place stuff from eventful as well. Geo, geo and geo - with three completely different APIs:
http://portablecontacts.net/draft-spec.html#rfc.section.7.4
http://www.geonames.org/export/web-services.html
http://api.eventful.com/docs/venues/search
The data may be exposed but it's there as a, er, kind of glass silo, it doesn't exactly lend itself to reuse.
Nina remarked:
Not that technically it is impossible to merge the APIs, there is no reason (business, whatever) for them to sit down and merge the APIs. This has happened in network engineering (and other domains) couple of decades ago; there have been many incompatible network protocol/hardware vendors then. It takes time to recognise the value of synchronisation.
She's right, but that time part is an issue. A few years ago everyone was talking about mashups - didn't the value become apparent then? We've had a good modelling language for sync'ing data since (say) 2004 when the RDF specs came out. The data-handling tooling came along with SPARQL in 2008. RESTful good practice ideas have spread widely in the past few years, with linked data I suppose being their counterpart in the semweb world. So why are APIs still so difficult?
Ok, that's glass-half-empty from a semweb perspective. Awareness of this tech has spread. The stuff around Rich Snippets, schema.org and HTML5 microdata demonstrates that the ideas are reaching a wider audience. (Incidentally I was impressed by JeniT's diplomacy about HTML5 in her excellent presentation - but I'm going to start referring to the stuff as HubrisML :)
A personal data point: last week I checked my Twitter "followers" for the first time in maybe 6 months. Around 150 new people. I'd estimate that 100 of them had reference to the Semantic Web (or some closely associated tech) in their profiles. I follow this tech, but still I hardly recognised any of these new folks.
I suspect Mike Amundsen might have a point when he says RDF will languish until it goes hyper (i.e. gains affordances as a hypermedia type). JeniT's talk of using HTML/XML/JSON/RDF for what it's best at probably applies - so how do you bring interactivity to RDF without it looking like it's got a goat's head stuck on its back? Research needed (high on my list). Whatever, pragmatically the linked data API goes a long way.
Anyhow (once I get my bank balance back in the black) I intend to put a lot more effort into actually using this tech to build human-facing apps. I've a few ideas on how to operate as an Indie, a core one being that taking full advantage of what the Web has to offer (i.e. using linked data etc) offers a business advantage, everything else being equal.
(any comments to the thread on G+ please)
2012-02-22T13:28:58+01:00
apis api affordances semweb rdf
Named graph identity
On Using named graphs to model Accounts it states:
It is very important to recognize the following:
A graph's name does not identify its contents.
That is, two graphs with the same name does not imply that they contain the equivalent RDF subgraphs.
SPARQL 1.1 effectively supports that, saying:
the relationship between an IRI and a graph in an RDF dataset is indirect. The IRI identifies a resource, and the resource is represented by a graph (or, more precisely: by a document that serializes a graph).
On face value that would seem to undermine all potential for using named graphs for provenance (and many other things).
But Richard Cyganiak's response is on the nail:
it is only the social contracts and conventions around URI ownership and web architecture that...allow us to maintain the fiction that URIs in RDF actually identify specific entities
Right, it's axiomatic that URI owners get to decide what their URIs identify. But ok, it's hard to see how (say) my personal assertions of graph identity offer a neat mechanical solution that can work at Web scale. However the perspective set up by all this is misleading. Resources are conceptual entities: "my blog" (with the URI http://dannyayers.com/) is a resource that has representations that change fairly frequently, including a named graph. But this doesn't mean to say that it's rendered useless for provenance purposes. The claim "Danny said on his blog" makes sense even if the statement to which it's referring isn't immediately available at http://dannyayers.com/ (incidentally there is a bigger graph available). Linked data provides a means ("follow your nose") by which related information can be discovered. In fact the claim still makes sense if the original statement is no longer on the Web, it can still be useful (only the provenance paths have got a little twistier). Some data, especially if there's a way to find more data, is always more useful than none. As danbri commented:
A dc:source linking any factually-oriented document to its alleged sources goes a long way...
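A toy Python model may help show why "a graph's name does not identify its contents" isn't fatal for provenance. Everything here is invented for illustration: a dataset is modelled as a plain dict from graph name to a set of triples, the two snapshots and the example.org report are made up, and only the dc:source and dc:title property URIs are real.

```python
DC_SOURCE = "http://purl.org/dc/elements/1.1/source"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
BLOG = "http://dannyayers.com/"

# Two snapshots of the graph named by the blog's URI, taken at different
# times (invented data). Same name, different contents.
snapshot_monday = {BLOG: {(BLOG, DC_TITLE, "Monday's front page")}}
snapshot_friday = {BLOG: {(BLOG, DC_TITLE, "Friday's front page")}}


def same_contents(a, b, name):
    """True only if both datasets hold identical triples under this name."""
    return a.get(name) == b.get(name)


def sources_of(triples, document):
    """List everything a document claims (via dc:source) as its source."""
    return sorted(o for s, p, o in triples if s == document and p == DC_SOURCE)


# A third party's claim: "Danny said on his blog". The link still points
# somewhere useful even though the graph behind the name keeps changing.
claims = {
    ("http://example.org/report", DC_SOURCE, BLOG),
    ("http://example.org/report", DC_SOURCE, "http://example.org/other-post"),
}

print(same_contents(snapshot_monday, snapshot_friday, BLOG))
print(sources_of(claims, "http://example.org/report"))
```

The name stays stable as a hook for "follow your nose" discovery, while the contents are whatever the URI owner currently serves.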
2012-02-20T14:21:24+01:00
named provenance graphs rdf
Browsers are the New PowerPoint
Stumbled on this by Donald Norman in some of Edward Tufte's material about PowerPoint:
Technology is not neutral. Technology has properties--affordances--that make it easier to do some activities, harder to do others: The easier ones get done, the harder ones neglected. Each has its constraints, preconditions, and side effects that impose requirements and changes on the things with which it interacts, be they other technology, people, or human society at large. Finally, each technology poses a mind-set, a way of thinking about it and the activities to which it is relevant, a mind-set that soon pervades those touched by it, often unwittingly, often unwillingly. The more successful and widespread the technology, the greater its impact upon the thought patterns of those who use it, and consequently, the greater its impact upon all of society. Technology is not neutral, it dominates.
- Norman, Donald A., Things that Make Us Smart, Perseus Books, 1993, p. 243
It nicely expresses what I've been trying to say in my periodic rants about the tyranny of the browser. Tufte's application of the above to PowerPoint is lovely, now rather than handwaving I can point to something concrete that also blinkers our way of looking at information.
The Web browser as we currently know it has evolved to interact with the Web in a way that's been influenced by perceptions of what the Web is and can be (for example, that it's largely read-only). There's a feedback loop; it's self-perpetuating. There are clear advantages for Web publishers and users in the convergence in the way browsers behave, but this is at the cost of innovation.
Incidentally, Tufte is now a sculptor.
2012-02-19T01:47:43+01:00
intents browser web rdf tufte
Social nets and shared objects
Just checked back on the geek pop video I put up on Tuesday: 111 hits, 4 likes, 1 dislike - heh, satisfactory ratio.
I don't have the energy for advocacy and am not really interested in marketing, but it did get me wondering how you would actually target an audience in this day and age - talking to the right people is efficient communication, right? Clearly folks like Google believe they can target arbitrary demographics with their advertising, identifying the appropriate audience through analysis of user behaviour. Done accurately, it's no longer advertising as such but more about making a connection between some kind of provider and a willing recipient.
In this specific case, the primary target would really be perhaps a person who uses a computer a lot, but only has a minor interest in dev, if any. They probably get most of their desktop software through regular commercial channels, supplemented by dodgy copies of things from their friends. It would be in the interests of this person to know about open source if only in the sense of better software for free. But most of the people reading this will be a hop or two removed from that demographic. Exaggerating for effect, the Open Source Circle has no intersection with the Regular User Circle. How do you find paths through? Ok, maybe there's one that goes [open source user] - [open source geek] - [.net geek] - [MS Windows user]. Yeah, (social) graph problems.
There's potential around communities of interest. Again in this particular case a graphic designer that normally uses Photoshop may be in contact with a Gimp user.
There's an aspect of this I reckon is still really virgin territory, ripe for colonization: I'm sure I've heard better terms but call it "shared objects". My guitar is of generic type Stratocaster, so if someone else has a guitar of generic type Stratocaster there's a very good chance we've got other things in common. It's close to what Amazon already does around recommendations, but I reckon it could be done a whole lot smarter and in a way that's more broadly useful. It's a Semantic Web/Linked Data idea that's also entirely in scope for schema.org and RDFa/microdata work.
Uldis Bojars did some work around the "shared objects" thing a year or two back, I must pester him again for references.
2012-02-17T13:42:18+01:00
federated social semweb rdf graph
Can I Hack It
I really enjoyed myself making Get Your Data Out! back in 2008 (how time flies) and recently wanted to apply myself to some fun projects to keep the black dog at bay. So I've made another geek-pop advocacy video called:
a little celebration of Open Source
It's a public information broadcast, just under 4 minutes long, briefly describing what open source is, why it should be encouraged and introducing a bunch of open source applications. With a bouncy soundtrack to help stop you getting bored.
All the source and media files I used are here. Please feel free to rebroadcast, remix, etc.
2012-02-14T22:54:46+01:00
video music pop share source hack hacking rdf open
From Web Palaces to Reinforced Concrete
The memory palace, also known as the Method of loci or memory walk is a technique for remembering things. Basically you start by memorizing the physical layout of a place such as your own home. When you need to remember something, in your mind you put it somewhere in that place. To recall the thing, you look in the place. The trick apparently works well for things like memorizing lists. It was known by the Ancient Romans. It relies on the fact that we have fairly good spatial memory.
Many of the metaphors used around the Web reflect a spatial model: Netscape Navigator, Internet Explorer, the Information Superhighway. We do seem to be predisposed to think in this way: when talking about mathematical structures, the structure is conceptually a kind of map, comparable to the way we make maps of the physical world. With information retrieval on the Web it's quite hard not to think in terms of a library, not far removed from a physical library with books on shelves. Not to mention Files, Folders and Desktops.
So anyhow, the other day I was pondering whether there was something more around the memory palace idea that could be applied on the Web. Instead of putting a piece of information in a place in the mental model, HTTP PUT it in a place on the Web. I didn't get much further with this line of thought, you need the mental model of a familiar place to start. But another metaphor did occur to me.
What have the Romans ever done for us?
Well, the invention of concrete is usually attributed to them. It "...freed Roman construction from the restrictions of stone and brick material and allowed for revolutionary new designs in terms of both structural complexity and dimension." Concrete clearly has had a role in the development of the modern city. There's a watchable old TED talk from Steve Johnson on the Web as a City. Concrete is made from aggregate, cement and water, not such a bad analogy to HTML marked-up text. You can mould it into whatever form you like. But what's lacking there is something corresponding to links.
So...how about reinforced concrete? Concrete on its own is good under compression but not tension, it needs something to tie it together if you want to make really big structures. The answer is to embed steel bars in it. Links as rebar, ok.
While this metaphor is a bit limited regarding the flexibility of the Web, it isn't bad for explaining the role of links to bind everything together. Given that links are data, the metaphor isn't bad at explaining how the Web of Data relates to the Web of Documents. One is made of steel, the other concrete. Together they make a composite or hybrid material with properties that are greater than their sum of parts.
2012-02-13T13:22:07+01:00
reinforced palace metaphors rdf memory concrete
Knobs
Quantitative Filtering
Filtering is a core feature of information presentation on the Web. As an example, look at blogs. This post will be visible for a while on this blog's front page, along with the other most recent posts. Essentially the page is defined by a filter (by date) applied to a large collection of material. Filters can work over many different axes, e.g. date, tag, author etc. They can be combined to provide faceted views of the information. Filters like this are fairly common on the Web, often seen combined with an ordering of the data for example: Sort By Price and Show 10 Items Per Page.
Many filters operate over a continuous variable (or one that can be mapped to a continuum), the date of posts being a good example. If you've got a continuous variable then a UI component that becomes available is the knob or slider. It's pretty straightforward to hook such a UI component up to a filter to apply to a backend store (in fact I rigged up a demo of doing this not long ago).
In the context of a blog there is quite a lot of data available that could be used for a knob-controlled filter. For example, most blogs contain a mix of long and short posts. Why not filter on word count? More nuanced things are possible nearby, say link count. Or something like readability. To make knobs or sliders user-friendly you probably wouldn't want to offer the viewer the actual numbers of words or links, rather e.g. a slider that had at its extremes Short ---|- Long.
Quantitative Tagging
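The word-count knob is easy enough to sketch in Python. This isn't the demo mentioned above, just a guess at the simplest thing that could work: a slider position between 0 (Short) and 1 (Long) is mapped to a window of word counts, and posts falling inside the window are shown. The window width, the function names and the sample posts are all assumptions.

```python
def slider_to_range(position, lo, hi, width=0.25):
    """Map a slider position in [0, 1] to a window of values between lo and hi.

    `width` is the fraction of the full range the window covers (an
    arbitrary choice for this sketch).
    """
    span = hi - lo
    centre = lo + position * span
    half = width * span / 2
    return max(lo, centre - half), min(hi, centre + half)


def filter_by_word_count(posts, position):
    """Keep the posts whose word count falls in the slider's window."""
    if not posts:
        return []
    counts = [len(post["text"].split()) for post in posts]
    low, high = slider_to_range(position, min(counts), max(counts))
    return [post for post, n in zip(posts, counts) if low <= n <= high]


# Invented posts of 2, 50 and 100 words.
posts = [
    {"title": "short", "text": "two words"},
    {"title": "medium", "text": "word " * 50},
    {"title": "long", "text": "word " * 100},
]

for position in (0.0, 0.5, 1.0):
    titles = [p["title"] for p in filter_by_word_count(posts, position)]
    print(position, titles)
```

The viewer never sees the numbers, only the Short...Long slider; swapping word count for link count or a readability score just means changing how `counts` is computed.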
Sites like Amazon also exploit user-contributed data like rating (in reviews). But there's an awful lot more potential kinds of information available. To pick some at random: utility, creativity, authority, entertainment value (fun!). So someone comes along and sees a post with a set of sliders below it and sets those sliders at 4, 3, 2, 1. That data is passed back to the server. Ideally the server will store that data associated with the user in question, to allow the whole social query dimension. Or the value may simply be aggregated as numbers associated with the post.
When someone else arrives on the site they see, say, a default view of the most recent posts. But below are the same controls again. They are interested in reading material that's useful and fun, but are less interested in the other factors. So they set the sliders at 5, 1, 1, 5 and click Search. They are instantly presented with posts that fit that profile. The user may want to save that profile so it's the default next time they visit.
What happens under the hood at view time is again something that could be quick & dirty less-than/greater-than filtering on the parameters, or something more sophisticated that derived the results from the "shape" of the settings, amplifying the descriptions previously given by their friends.
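Both options can be sketched in Python. The first function is the quick & dirty greater-than filter; the second ranks posts by how closely the "shape" of their aggregated ratings matches the visitor's slider settings, using cosine similarity, which is my own pick of measure and not something settled above. The axis names follow the four examples given earlier, the ratings are invented, and the social part (weighting by what friends said) is left out entirely.

```python
import math

AXES = ("utility", "creativity", "authority", "fun")


def threshold_filter(posts, minimums):
    """Quick & dirty: keep posts whose rating meets every given minimum."""
    return [
        post for post in posts
        if all(post["ratings"][axis] >= minimums.get(axis, 0) for axis in AXES)
    ]


def shape_similarity(profile, ratings):
    """Cosine similarity between two sets of slider values."""
    dot = sum(profile[a] * ratings[a] for a in AXES)
    norm = (math.sqrt(sum(profile[a] ** 2 for a in AXES))
            * math.sqrt(sum(ratings[a] ** 2 for a in AXES)))
    return dot / norm if norm else 0.0


def rank_by_shape(posts, profile):
    """Order posts so the best match to the profile's shape comes first."""
    return sorted(
        posts,
        key=lambda post: shape_similarity(profile, post["ratings"]),
        reverse=True,
    )


# Invented aggregate ratings for two posts.
posts = [
    {"title": "A", "ratings": dict(zip(AXES, (4, 3, 2, 1)))},
    {"title": "B", "ratings": dict(zip(AXES, (5, 1, 1, 4)))},
]
# The visitor's sliders: useful and fun, not fussed about the rest.
visitor = dict(zip(AXES, (5, 1, 1, 5)))

print([p["title"] for p in threshold_filter(posts, {"utility": 4, "fun": 3})])
print([p["title"] for p in rank_by_shape(posts, visitor)])
```

Saving the visitor's profile for next time is then just a matter of storing those four numbers against the user.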
Taking the user-contributed data angle a step further, instead of having a predetermined set of controls, it wouldn't be hard (at least if you're using RDF under the hood :) to allow the users to define new sliders, just ask them for the axis over which the slider varies Foo...Bar. Working title, please change: Wiki Knobs.
Interstices
The applications for knobs like these are pretty open-ended. What I describe above is a typical-Web-site-oriented view of an idea my late wife Caroline suggested years ago, effectively an idea generation machine, working title Interstices. I'm not sure she was a big fan of the Surrealist movement per se, but she loved seeing surprising, apparently contradictory concepts combined in art. I vaguely remember (or have imagined :) her talking about it in the context of a magazine advert for PCs, where the tower cases had Friesian cattle's black and white markings (anyone remember that?).
With Interstices you'd tag images with sliders as described above, arbitrary scales, say Hard...Soft, Natural...Artificial etc. But then rather than looking for matches, the system would offer you opposites, so the PC is Hard...Soft:1, Natural...Artificial:5 whereas the cow is maybe Hard...Soft:4, Natural...Artificial:1.
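A minimal Python sketch of the "offer opposites" step, under some assumptions of mine: sliders run from 1 to 5, the opposite of a setting is its mirror image on that scale, and the system suggests whichever other image sits closest to the mirrored settings. The PC and cow values are the ones given above; the brick, the axis names and the function names are invented.

```python
SCALE_MIN, SCALE_MAX = 1, 5


def mirror(ratings):
    """Flip every slider to the other end of its scale (1 <-> 5, 2 <-> 4)."""
    return {axis: SCALE_MIN + SCALE_MAX - value for axis, value in ratings.items()}


def distance(a, b):
    """Euclidean distance between two sets of slider values."""
    return sum((a[axis] - b[axis]) ** 2 for axis in a) ** 0.5


def opposite_of(name, images):
    """Pick the other image whose tags lie closest to the mirror of `name`."""
    target = mirror(images[name])
    others = (other for other in images if other != name)
    return min(others, key=lambda other: distance(target, images[other]))


images = {
    "pc": {"hard_soft": 1, "natural_artificial": 5},
    "cow": {"hard_soft": 4, "natural_artificial": 1},
    "brick": {"hard_soft": 1, "natural_artificial": 4},  # invented extra
}

print(opposite_of("pc", images))
print(opposite_of("cow", images))
```

With these numbers the PC and the cow pick each other out, which is the Friesian tower case all over again.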
I said at the time it would be easy to build - still haven't got around to it. But I'm pretty sure at the time we were only thinking in terms of a little app one or two people might use. Imagine something like this supporting proper crowdsourcing, e.g. sliders attached to Flickr. That'd be cool.
Comments to G+ please
2012-02-12T16:18:07+01:00
sliders ideas interstices knobs social creativity gui filter interaction ui rdf data
RDF Hypermedia is Art
[there's loads of background before I get to what I mean by the title :) ]
An interesting mailing list thread on Web API design (found via twitter) led me to open the Linked Data API material in another browser tab. Mike Amundsen's points re. hypermedia-oriented APIs vs. URI-structured APIs are interesting in this light, and led me to open his hypermedia book again.
Now the RDF Affordances stuff has a really un-catchy name. I've been using "affordances" partly because it's accurate (thanks Mike), but also to be clear that while there is significant overlap with the Web Intents material, it's not quite the same. I've been thinking about RDF Affordances in a sense that (done right) they should be a superset of Web Intents, primarily because RDF is 1. a description language (so an Intent can be described) and 2. seriously linky (has potential to do the wiring of Web intents).
In Mike's book I happened to notice a quote from Roy T. Fielding I'd missed before (being a preface skipper) which has set me re-evaluating things:
"Hypermedia is defined by the presence of application control information embedded within, or as a layer above, the presentation of information."
Ok, so what does RDF look like from this pov? Well, before you get anywhere near presentation there's the representation syntax to consider. RDFa is definitely a hypermedia format, being HTML. The linked data is hooked to the application controls of (clickable) links. But what of RDF/XML and sweet little Turtle? Yes, this is the idea of RDF Affordances, but assuming a blank slate locally...I Google "RDF Hypermedia". Top hit is a blog post from Mike: API for RDF? don't do it! Heh. [A little aside - note that in itself JSON is not a hypermedia format]
While I disagree with the don't, because things on the Web are rarely exclusive-or (and if pressed I bet Mike would concede this :) the point is valid. The post is short and worth reading, the conclusion being, for a variety of reasons: [RDF should] ...explicitly support hypermedia within the message itself.
I'm quite amused at the little path I've just followed - it's a certainty Mike's expressed this to me before. But I'm often rather slow...
This all flips right back to RDF Affordances, but (more clearly in my mind) begging the question of through what mechanism RDF should support hypermedia. Taking Roy's statement in this context there are a few layers to consider: the "raw" message; the presentation layer; a control layer above presentation. Some things do come for free: named graphs provide an association between a message and a resource (the name is its URL), this is effectively what you get out of the box even if you treat e.g. a Turtle message as text/plain. But the notion of an RDF graph is tied in.
The thing is, documents have a real-world counterpart, the printed page. Unlike documents, data - especially graph-shaped data - doesn't really have a "default" presentation. Documents can be realised on a machine pretty easily as the metaphor is embedded deep in our approach to computers (the Desktop metaphor being a consequence of this). But there isn't really a real-world counterpart to hypertext. (Wikipedia dates the history of hypertext back to the annotation style of the Talmud, with a more modern example being Borges' 1941 hypertext novel "The Garden of Forking Paths").
However, while there aren't many concrete pre-computer precedents for hypertext, in use it seems very natural. Which shouldn't be a surprise, if you step back from the printed text of documents, conceptually they regularly hop out to annotation and frequently leap out of serial narrative flow to total tangents. Any remotely interesting conversation could be seen as a seriously twisty network (graph) of concepts, changing over time. In this light what we're looking at isn't just vocalizations, text or data, it's information/knowledge. On the Web it's hypertext (HTML) and increasingly hyperdata (RDF). While data as an expression of information/knowledge doesn't really have a default presentation on machines, it too can certainly be flattened to text-based documents with the added dimension of hyperlinks.
Whether you take the perspective of binary digits or interlinked concepts the phrase "Web of Data" does describe a superset of the "Web of Documents" with which we are more familiar. The text in documents is a representation of real-world concepts, and Web-based data is another kind of knowledge representation. Where the parts of speech (nouns, verbs etc) with grammar provide the first, the parts of the Web (resources, typed representations) with grammar (RDF) provide the latter. For a document-hardened Web developer there is a conceptual shift required (as I called it in my latest little Introduction to RDF) to use URLs as names of things (people, places, products, concepts) not just documents.
So I've waffled on for a few more paragraphs without really addressing what's actually needed for RDF Hypermedia (or explaining this post's title). From a technical standpoint we do already have the makings of a hypermedia interpretation of RDF. For example, wherever a resource appears it can be treated as a link, and like HTML in a browser the default affordance is to "follow your nose" (i.e. do an HTTP GET and render what comes back). This is already a de facto convention for dealing with RDF in HTML, see for example what happens with a link to http://dbpedia.org/resource/Berlin. This has a nice parallel in SPARQL's DESCRIBE, and more generally SPARQL seems a good route to describing the affordances RDF should support. So instead of (/as well as) using conventions for encoding the SPARQL constructs [lower case] in URIs as the linked data API does (with appropriate associations for the various HTTP methods) it seems reasonable to explore what you get if you wrap these constructs into User Agent actions.
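As a rough idea of what those two User Agent actions might look like, here's a Python sketch that builds (but doesn't send) the "follow your nose" GET and the parallel SPARQL DESCRIBE. The function names are made up, asking for Turtle via the Accept header is my choice for the example, and the DBpedia endpoint URL is an assumption about where the query would go.

```python
from urllib.parse import urlencode
from urllib.request import Request


def follow_your_nose(resource_uri, accept="text/turtle"):
    """Default affordance: a plain GET on the resource, asking for RDF."""
    return Request(resource_uri, headers={"Accept": accept}, method="GET")


def describe(endpoint, resource_uri):
    """The SPARQL parallel: ask an endpoint to DESCRIBE the same resource."""
    query = f"DESCRIBE <{resource_uri}>"
    return Request(
        endpoint + "?" + urlencode({"query": query}),
        headers={"Accept": "text/turtle"},
        method="GET",
    )


berlin = "http://dbpedia.org/resource/Berlin"
get_req = follow_your_nose(berlin)
# Assumed endpoint location, for illustration only.
describe_req = describe("http://dbpedia.org/sparql", berlin)

print(get_req.get_method(), get_req.full_url)
print(describe_req.get_method(), describe_req.full_url)
# To actually dereference: urllib.request.urlopen(get_req).read()
```

Either way the agent starts from nothing but the resource's URI, which is the hypermedia bit.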
I've personally got a couple of projects on the go into which I want to insert some of this exploration (Scute, which is mostly an RDF/SPARQL editor and Manuel, a semweb DSL - neither of which are usable yet, btw). I think they'll offer reasonable testbed environments for a fairly rich (editing) GUI and command-line UI. Ultimately the RDF Hypermedia sort of thing should (/will) become integrated with existing Web tools, i.e. the browser. But for me at least it makes sense to approach them away from the browser, thinking in terms of the abstraction I came up with for a generalized Web Agent. (If you've seen any of my periodic rants about the tyranny of the browser you'll know why). I'm delighted to see Mike is also planning on exploring the same general area - spending a year or so on afforded agents. Heh, I reckon now's probably a good time to get moving with this stuff, before it becomes another Hot Topic for PhDs...
Oh yeah, "RDF Hypermedia is Art"? Right, well I've already suggested hypertext/hypermedia has a real-world counterpart in our representation of knowledge in speech and text. Where else do we see representation of twisty graphs of concepts? Taking Wikipedia's definition: Art is the product or process of deliberately arranging items (often with symbolic significance) in a way that influences and affects one or more of the senses, emotions, and intellect. Whatever the medium, the influence it has depends on its ability to communicate, and what it's communicating is relationships between symbols. If that's not knowledge representation I don't know what is. Arty folks often talk about a person's individual experience or interaction with a work of art. Another example of interlinked symbols and interaction? RDF Hypermedia. Quod erat demonstrandum and seize the goldfish. For the benefit of folks who might point out the non-real, make believe aspects of art, I'll assert the fact "the Mona Lisa is an assertion of a set of facts". You don't have to believe that assertion.
2012-02-11T15:11:24+01:00
hypermedia json art rdf
Bringing the Cloud home
Bear with me, as usual I've not thought this through thoroughly (but I will quote myself a few times :)
There's been a flurry of commentary recently about the threat to the "Common Web" from things like Facebook, Google+ (e.g. see posts from John Battelle, Robert Scoble). There's also some fragmentation of the Web going on around "Apps" in Apple's mobile device sense. The main antidote proposed for this kind of thing in recent years has generally been to use non-silo'd, non-walled garden services.
But I'd argue that it's a more systematic problem, that switching service won't necessarily solve. Remember we already have an open distributed social network in the form of the blogosphere. Where is the advantage in Facebook, G+?
I'd argue that one of the main causes of this fragmentation is the tyranny of the browser. Sure, from a glass-half-full perspective the Web browser has evolved into a seriously versatile client-oriented Agent platform, with integrated processing (Javascript and/or plugins), UI facilities (HTML/DOM, tabs etc) and protocol support (the HTTP bits). But from a glass-half-empty perspective, look at how it gets used. Typically the Application is a given back-end system with some minimal, quasi-proprietary HTTP exposure with a proprietary front-end built from browser facilities. Can you use the Google Plus front-end you see in the browser and point it at Facebook? Hell no. What integration is possible is generally done through (typically site-specific) APIs. From the company's point of view, it makes sense to do what they can to make their site as good as possible. From the user's point of view, the experience of a really sophisticated, tightly-integrated site (as Google's pushing for across its subsystems) can be much more satisfying. The cost is that the server and client components are tightly-coupled.
More generally, so much current technology is geared towards that one page you have open in the browser right now, it's like looking through a narrow pipe at the Web. Web Intents and the Firefox work on push are likely to alleviate the symptoms somewhat, but still the underlying problem isn't being directly addressed. The browser has become a bottleneck.
One approach to counter the walled gardens and silos (and associated issues like privacy) is to host all your own stuff (check Steve Pemberton's Why you should have a web site for starters). Depending on your ISP setup, it may be possible to serve direct to the Web from a home host. So why not manage all your data locally? That makes it easier to manage how much control of your own stuff is handed over to other services.
Ok, you might argue that's something we're (more or less) all already doing. Maybe. But where is the Web? I suspect most people consider it conceptually (as I usually do) as out there. Why should it be here too? Ok again, there's been loads of work done over the years on P2P systems. We've seen things like Microsoft's Personal Web Server. But note: "PWS was useful in developing web applications on the localhost before deploying to a production web server.". The approach here, and with many other comparable systems (certain blog editing software, for example) is that the local material is offline and copied to a remote server (via ftp or whatever) to make it live.
I think the time might be right to look again, and consider making the localhost the live host. Where the wiring doesn't support direct serving, what's needed is a transparent, real-time connection to a remote endpoint to simulate a local/global connection to the internet. Not trivial, but then again there aren't any hurdles that haven't been leaped in other contexts. There are several ways this could be achieved, maybe the most straightforward based on standards would be using an XMPP-based bridge between the local and remote machines, with a fair bit of caching on the remote machine for performance. Should be commoditizable.
Of course this would demand a full stack locally - if you were using, say, a MySQL store behind your CMS, that'd need to be running locally, along with all the interpreters for PHP etc, together with whatever message dispatch switching you'd ordinarily use on the remote host. But I think it would be advantageous to manage information locally using Web-oriented technologies - in particular linked data. Let the Web in.
Why am I suggesting this kind of a setup, just as everything appears to be moving to the Cloud? Well this isn't anti-Cloud, quite the opposite. It's bringing the Cloud right back home, to encourage greater participation. Avoid the bottleneck of the browser (and a handful of lower-level protocols like ssh) to connect with the Web, open the floodgates with other local agents interacting with your own data and through the HTTP bridge, and allow as many parallel channels as you want, in addition to the browser. It is the 21st century, after all.
PS. within seconds of me posting the link to G+, Melvin Carvalho responded:
Already doing it. Dyndns and apache let you run a pretty decent web server. Then I have a linked data space on my desktop https://github.com/linkeddata/data.fm . I also have a little script which links my personal data space to my online presence when I'm logged on. I wonder why in 2012 more people dont run a personal server.
2012-02-07T11:09:27+01:00
cloud proxy browser rdf
Small Data
I'd just like to plant a little flag in the sand. Big Data seems to be the flavour of the month (and is undeniably extremely useful and interesting), but I've a gut feeling that might be symptomatic of not seeing the wood for the trees (or maybe vice versa).
I've not thought this through much, but surely any trends/correlations/relationships that are important enough to be of interest should be detectable without having to build a terabyte+ store? Rather than trying to capture as much raw data as possible up front, I suspect a more productive approach long-term will be to work with (maybe federated) crawler farms, with lots and lots of algorithms running in parallel over what they see. If there are appropriate training feedback loops in place, the shape of algorithms themselves could be treated as the results of the analysis.
It could be argued that once you have accumulated a corpus of raw data you can subsequently throw whatever you like at it without having to get the raw data again. But that corpus will never be complete or truly fresh - as new data appears on the Web all the time. More critically, under normal circumstances you can never be sure you've got a dataset that contains a good sample representation covering whatever unknowns you're exploring. But crawlers can be directed to favour slices of the Web that contain information relevant to your hypotheses.
So, in the context of the Web, the Web itself should be the only big data needed. Which gives a neat parallel in the other sciences: reality itself is the only database you'll ever need :)
Ok, in the same way that Big Sites (like Wikipedia/dbPedia) add big value to the Web alongside lots of small pieces, loosely joined, the same no doubt goes for Big Data. But let's not forget the vice versa, a complementary Small Data approach.
Somewhat orthogonal to this, one way in which the Web is a game changer for data is that here the relationship between pieces of data (/documents) is at least as significant as those pieces of data stacked on top of each other. Link Rank is a special case, an aggregated, flattened view of link value. If topics and entities (i.e. things in general, people, places, concepts etc) and their interrelationships are inferred and/or explicitly named, it should expose some interesting facets of how human knowledge works.
2012-01-30T10:04:06+01:00
algorithms federated ai science rdf data
Search plus Your World - fool's gold
For quite a while I've held the view that most current approaches to Web search are fundamentally flawed, because the best way to find something is not to lose it in the first place. But as the companies invested in search gradually get smarter in their use of person- and (to a lesser extent) thing-oriented data, rather than just word association (football) search results seem increasingly more focused. Google's approach in particular has grown increasingly like the model put forward in the Semantic Web initiative. Recently with G+ we see a big push to capture and exploit data associated with personal profiles (the FOAF domain) and brands (the GoodRelations domain, although maybe there's a role for an additional brand- rather than product-oriented vocab). With Rich Snippets and Schema.org there's a direct use of semweb technology (in a slightly mangled form - One True Ontology is a well-known antipattern to anyone that bothers to look at the literature).
In fact the "Your World" part of Search plus Your World (SPYW) can be seen as a reinvention of the most important part of Semantic Web technology, that of giving everything of significance a URL: people, places, things, concepts. Given that, you can start describing and leveraging relationships between those resources. To use a phrase I think originated around microformats, it's lower-case semantic web. Ok, behind the quality glitz of G+ profiles and pages this seems to have been done in a rather sloppy, ad hoc fashion, but that in itself is fine - whatever it takes. But where Google get it very wrong is by putting themselves at the heart of their system. Not only is semantic in lower-case, so is web. If you do a search with SPYW enabled, you're pointed straight back into the Google Empire. They are making themselves gatekeepers of the Web. Although there aren't any concrete entry barriers to this walled garden, by only signposting Google's footpaths in search results it's creating a system with the same characteristics as say AOL around 2000. From Google search being a vital accessory on the open Web, it's increasingly becoming a portal.
There is already a visible cost in practice to Google's echo chamber - if you want to re-find something one of your colleagues said the other day, sure SPYW is helpful. But if you're trying to do some original research, you don't want to be searching with Your World blinkers on - an engine without those preconceptions such as DuckDuckGo will be more useful.
This strategy, I'd assert, is doomed to failure for the same reason AOL's walled garden collapsed: to use another phrase I like to repeat, no matter how big any single entity becomes, the rest of the Web will always be bigger. The focus on the user/Don't Be Evil thing is absolutely right to highlight the value of non-Google resources, although it does fall short by suggesting that the rest of the Web is just a handful of other companies [G+ link] i.e. Twitter, Facebook etc. Google's own long-term survival as a market leader is absolutely dependent on their respect of the Web at large.
So what should Google do? Re-read Steve Yegge's awesome rant [G+ link] for starters. Especially the bits about Platforms. G+ and Your World should be considered in this context - as a semantic (any case) Web (upper case) Platform. For example, Google's pages appear to be aimed at providing the canonical URLs for concepts (...lower-case). But there's already an excellent source of such URLs: Wikipedia. In itself Wikipedia only provides URLs of documents whose primary topic is the thing in question, but dbPedia is a well-established mapping based on best practices from thing identifiers to Wikipedia pages (e.g. <http://dbpedia.org/resource/Berlin> foaf:isPrimaryTopicOf <http://en.wikipedia.org/wiki/Berlin> . ). If a handful of students from obscure north-European universities (heh, sorry, just for the sake of contrast), with a little community support can create and maintain - give the world - a service supporting all the concepts/things covered by Wikipedia, imagine what the mighty Google could achieve...
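As a rough illustration of that thing-to-page mapping, here's a small Python sketch that writes out the triple from the Berlin example as an N-Triples line. It assumes the dbPedia resource and the Wikipedia page share the same local name, as they do for Berlin; the function name is mine, and real dbPedia data should be fetched rather than guessed like this.

```python
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"
WIKIPEDIA_PAGE = "http://en.wikipedia.org/wiki/"
# Full URI behind the foaf:isPrimaryTopicOf shorthand.
FOAF_IS_PRIMARY_TOPIC_OF = "http://xmlns.com/foaf/0.1/isPrimaryTopicOf"


def primary_topic_triple(thing_uri):
    """Return an N-Triples line linking a dbPedia thing to its Wikipedia page.

    Assumes both URIs end in the same local name (true for the Berlin example).
    """
    if not thing_uri.startswith(DBPEDIA_RESOURCE):
        raise ValueError("not a dbPedia resource URI: %r" % thing_uri)
    local_name = thing_uri[len(DBPEDIA_RESOURCE):]
    page_uri = WIKIPEDIA_PAGE + local_name
    return "<%s> <%s> <%s> ." % (thing_uri, FOAF_IS_PRIMARY_TOPIC_OF, page_uri)


print(primary_topic_triple("http://dbpedia.org/resource/Berlin"))
```

The point is only that the thing (the city) and the document (the page about it) get different names, with an explicit link between them.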
To give a little example in the context of Personal Profiles, if I publish my definitive personal profile on my own domain (note Google already understands all the elements of this) then for queries for which "me" is the appropriate response, that page should be the first hit, not my G+ profile.
Another factor in the walled nature of G+ is the limited API. I'm sure features will be added to this in the near future, but I hope (probably unrealistically) they will use proper standards and follow known best practices. Going further into over-optimistic territory, I'll quote Tom Gruber (in an interview talking about how Siri works) :
A site that exposes RDF usually has an API that is easy to deal with, which makes our life easier. For instance, we use geonames.org as one of our geospatial information sources. It is a full-on Semantic Web endpoint, and that makes it easy to deal with. The more the API declares its data model, the more automated we can make our coupling to it.
What should we (as users and components of the Web) do? Well, basically what we're already doing...but trying not to be distracted by shiny things and keeping an eye on the long term - standards are good. When we publish data on the Web we need to consider the quality of the data first (i.e. make it 5 Star), seeing it as purely Google-fodder is missing the point.
Comments please [Google+ link, the irony is not lost on me :)]
2012-01-28T12:59:52+01:00
google semweb rdf spyw
Establishing Logical Truth on the Web
[http://purl.org/stuff/true and http://purl.org/stuff/false]
I'm sure they already had URIs somewhere before (http://dbpedia.org/resource/True is nearly there...) but it seemed a nice idea to give them some solid (?) semantics too, fortunately there's at least one media type available - so it's "true as in Javascript". Took a few minutes to set up to give that media type. Tried PHP first but it doesn't seem properly configured on this server (which is weird, I'm sure I've got PHP stuff live). Anyhow, Apache2 config for hyperdata included mod_python from who-knows-when, so I used that.
true.py is:
from mod_python import apache
def index(req):
    req.content_type = "application/javascript"
    return "true"
with .htaccess (same dir) as:
RewriteRule ^true$ true.py
- plus corresponding stuff for false.py.
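For anyone without mod_python to hand, roughly the same behaviour can be sketched as a plain WSGI app using only the Python standard library. This is a stand-in, not what's running on the server; the app name, the /true and /false paths and the little call() helper are all assumptions for illustration.

```python
# Plain WSGI stand-in for the mod_python handler above (not the live setup).
BODIES = {"/true": b"true", "/false": b"false"}


def truth_app(environ, start_response):
    """Serve 'true' or 'false' with the Javascript media type."""
    body = BODIES.get(environ.get("PATH_INFO", ""))
    if body is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    start_response("200 OK", [("Content-Type", "application/javascript"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call(path):
    """Invoke the app directly (no network), returning (status, headers, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(truth_app({"PATH_INFO": path}, start_response))
    return captured["status"], captured["headers"], body


print(call("/true"))
print(call("/false"))
```

To try it over HTTP, wsgiref.simple_server.make_server("localhost", 8000, truth_app).serve_forever() should do.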
I don't like the way the PURL redirects, can that be done transparently I wonder, keeping the same URI in the address bar?
2012-01-20T19:27:25+01:00
truth false logic uris true rdf
Different Modes of Browsing
Browsers have certainly evolved since the WorldWideWeb browser in 1990 into pretty sophisticated pieces of kit, supporting rich views of HTML and many other media types, along with a powerful version of code-on-demand through Javascript. But in certain respects they're still very primitive. It was probably unavoidable, but there's a significant conceptual gap between what the browser can do as a general-purpose tool and what it can do as a container for site-localized Web Applications. Take Gmail as an example of the latter - very much the same ballpark as desktop mail applications. But move away from that domain and all gmail's functionality becomes inaccessible. We're still a way off a genuine Web of Applications.
One obstacle to maximizing the Webiness of Web Applications is found around the way buttons are used, directly mimicking the behaviour of desktop applications. But on the Web, the best affordances are associated with links (i.e. URIs + HTTP). In this context we should expect more of Web Applications - that the application should be built primarily as a Web API, i.e. a regular Web Site, so that the affordances are available to other applications. It should be trivial for me to check the contents of my gmail inbox from the comfort of my own Home Page. However I'd hazard that the business models of the big Web brands are likely to hinder development in these directions - Google, Facebook, Amazon etc. run somewhat counter to the open Web, in that they are motivated to keep you in their domain (or in extreme cases like Apple and MS, in their own devices). Web Intents seem to me to be a good start towards enabling more flexible yet uniformly accessible interactions.
Over the years the read/write nature, or rather lack of it, has been discussed an awful lot. Even though the first browser included an in-place page editor, the current model still doesn't really support this. One big reason for this is that HTML - even HTML5 - falls short of supporting the full range of HTTP methods. The predominant approach to writing to the Web is through major intermediation by Content Management Systems. While CMSs are generally a very good idea, the fact that they're built on an effectively hobbled client means they aren't as Webby as they should be. There are genuine technical obstacles to generic writeability, notably those related to authentication and authorization, though hopefully WebID will help there.
The metaphor of the browser is itself quite limiting. Generally we only have one Web document open - visible on the screen - at a time because we can only read one thing at a time. Even with the development of tabs, the browser still essentially reflects this modal model of Web resources. I think I read about work on accessing data across tabs, but as far as I can see it doesn't exist yet. Ok, desktop applications are also under the restriction that we can only look at one thing at a time. But it's a lot more common to interact with multiple independent data sources/sinks and processing components there.
A browser can pretty much support a general-purpose HTTP client (through script), but because we're so used to thinking in terms of the Web of Documents and requirements there, the one page at a time modality is deep in the mindset. Service mashups, be they client or server-side, all really aim towards focusing down on a (primarily read-only) single-document view. A critical aspect is that traditionally the link has basically just one meaning - navigate to another page (do a GET and display the results). But while a link on a Web of Data could correspond to the same thing, it could also mean 'GET the data and merge it into the local store' or 'use this URI to filter the current view' or any number of pivot-like operations.
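A toy Python sketch of those different link meanings, with triples as plain tuples and a dictionary standing in for the Web. The Agent class, its method names and the example.org data are all made up for illustration; a real agent would do HTTP GETs and parse RDF.

```python
class Agent:
    """A toy Web agent where following a link can mean more than 'navigate'."""

    def __init__(self, fetch):
        self.fetch = fetch   # callable: URI -> iterable of triples (stands in for HTTP GET)
        self.store = set()   # local data accumulated from merges
        self.view = set()    # what is currently being looked at

    def navigate(self, uri):
        """Classic link: GET and display the result, replacing the view."""
        self.view = set(self.fetch(uri))

    def merge(self, uri):
        """GET the data and merge it into the local store."""
        self.store |= set(self.fetch(uri))

    def filter_view(self, uri):
        """Use this URI to filter the current view."""
        self.view = {triple for triple in self.view if uri in triple}


BERLIN = "http://example.org/berlin"
GERMANY = "http://example.org/germany"
WEB = {BERLIN: [(BERLIN, "ex:country", GERMANY), (BERLIN, "ex:label", "Berlin")]}

agent = Agent(lambda uri: WEB.get(uri, []))
agent.navigate(BERLIN)        # view now holds both triples
agent.merge(BERLIN)           # store now holds both triples
agent.filter_view(GERMANY)    # view narrowed to triples mentioning Germany
print(len(agent.store), sorted(agent.view))
```

The same URI is used three ways here; which affordance applies is a decision for the agent, not something carried by the link itself.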
Ok, this is in danger of turning into another rant...let me sidestep by highlighting one specific browser affordance.
The Turn-Around Button?
One link-oriented metaphor for the browser Back Button could be walking down a footpath, junctions in the footpath corresponding to the available links presented to us on a Web page. Clicking the back button has the effect of walking backwards to the previous junction. We are still facing in the same direction. But what if we metaphorically turn around? Ok, the outlinks on the page will look just the same, but other data is available, our whole history. Why not present the current page alongside a recent history page (like chrome://history/) so we can hop back further - a turn around button. Yes, the Back Button may drop down a list showing the history, but richer information could be provided in the main window such as the links followed as a tree.
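Here's one way the "links followed as a tree" idea might look as data, sketched in Python. The History class and its methods are hypothetical, and it only keeps the first route by which a page was reached.

```python
class History:
    """Browsing history kept as a tree of links followed, not a flat list."""

    def __init__(self, start):
        self.root = start
        self.parent = {start: None}   # page -> page it was first reached from
        self.current = start

    def follow(self, url):
        """Follow a link from the current page."""
        self.parent.setdefault(url, self.current)
        self.current = url

    def back(self):
        """The ordinary Back Button: step to the previous junction."""
        if self.parent[self.current] is not None:
            self.current = self.parent[self.current]

    def tree(self, node=None, depth=0):
        """The turn-around view: the whole history as indented lines."""
        node = self.root if node is None else node
        lines = ["  " * depth + node]
        for url, parent in self.parent.items():
            if parent == node:
                lines.extend(self.tree(url, depth + 1))
        return lines


history = History("home")
history.follow("a")
history.follow("a1")
history.back()
history.back()
history.follow("b")
print("\n".join(history.tree()))
```

Going back twice and then following a different link leaves the abandoned branch visible in the tree, which is exactly what the drop-down list loses.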
From a data perspective, variations on a Back Button might mean 'remove this data from the local store' or simply 'Undo'.
Dunno. Work needed on RDFAffordances.
See also: Identifying Applications State
2012-01-20T12:29:32+01:00
intents applications metaphors actions web browsers rdf
Introducing dork
That's Descriptions of Runtime Klasses. Some simple Java for getting RDF out of code trees.
The RDF can be used to generate class diagrams, like this:

An interesting aspect of the Web Beep project is processor pipelines. To optimize things I needed to play with parameters easily so wound up building a system interface covering the processors and pipelines. As it stands in the source now, the configuration is set up from Java structures. But to see what the configuration is, a recursive toString() on the Java structures yields a fairly structured text description of the configuration (there's an example on the How It Works page).
This led me to think that if such descriptions could be used to describe existing configurations, they could also be used to set up those configurations. The format's ad hoc, so first it made sense to look at using something standard. The processor pipelines are essentially graphs (with annotations) so RDF was naturally the hammer I chose. The general processors/pipelines model is encoded (better word?) in the Java class structure, so if I could get that in RDF it'd be a good start. It's general-purpose stuff so I've split it off as a separate project at github and given it a silly name.
This kind of thing's been done before, in fact I'm hoping to incorporate David Huynh's doclet (for use with Javadoc to generate RDF) as well in the near future. But that approach gets its data 'statically' from the source, whereas the parameters at runtime are important for Web Beep's processors etc. I've made a start on the write-up with the code (ermm, Javadoc's todo :), but one key thing is just using a describe() method in the kind of places you might use a toString(). It should return a snippet of Turtle-syntax RDF describing the object in which it appears. I've also made a start on some easy-to-use utility methods that use reflection to extract a description of objects which doesn't rely on them having a describe() method, bit of a lighter touch.
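The reflection idea is easy to sketch outside Java too. Below is a rough Python analogue (not dork's actual code): a describe() function that inspects an object's attributes at runtime and returns a Turtle-syntax snippet. The LowPass class, the ":" prefix and the property names are invented for the example, and the snippet assumes the prefix is declared elsewhere.

```python
def describe(obj, subject):
    """Return a Turtle snippet describing obj's public attributes, found by reflection."""
    lines = ["%s a :%s" % (subject, type(obj).__name__)]
    for name, value in sorted(vars(obj).items()):
        if name.startswith("_"):
            continue  # skip private state
        if isinstance(value, bool):
            literal = "true" if value else "false"
        elif isinstance(value, (int, float)):
            literal = repr(value)
        else:
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            literal = '"%s"' % escaped
        lines.append("    :%s %s" % (name, literal))
    return " ;\n".join(lines) + " ."


class LowPass:
    """Hypothetical pipeline processor with a couple of runtime parameters."""

    def __init__(self, cutoff_hz, enabled=True):
        self.cutoff_hz = cutoff_hz
        self.enabled = enabled


print(describe(LowPass(1200), ":p1"))
```

Because the values are read from the live object, the description reflects whatever the parameters are at the time of the call, which is the part a source-only doclet can't see.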
As a sanity check on the generated RDF I made a (pretty trivial) SPARQL query with XSLT transform to GraphViz dot format, the result of which can be used (with straightforward command-line tools) to generate images like the one above. [I remembered half way through that Redland's rapper utility can output dot format, but that's RDFy (see screenshot) and I'm after something much more app-specific.] There's a little script which shows how the image was arrived at.
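For a feel of the last step, here's a minimal Python sketch that turns a list of (source, label, target) edges into GraphViz dot text. It swaps in plain string building for the SPARQL + XSLT route described above, and the edge data is invented.

```python
def to_dot(edges, name="pipeline"):
    """Render (source, label, target) edges as a GraphViz digraph."""
    lines = ["digraph %s {" % name]
    for source, label, target in edges:
        lines.append('  "%s" -> "%s" [label="%s"];' % (source, target, label))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical class relationships, standing in for SPARQL query results.
edges = [
    ("Pipeline", "contains", "Processor"),
    ("LowPass", "subClassOf", "Processor"),
]
print(to_dot(edges))
```

Saved as pipeline.dot, the output can be rendered with something like dot -Tpng pipeline.dot -o pipeline.png.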
2012-01-19T22:12:46+01:00
java dork dot diagrams class sparql rdf
Introducing JEdwards
JEdwards is a little sub-project I've just been putting together in Java. Screenshot.
It's so named for two reasons:
- it's roughly a contraction of "towards a Javascript editor"
- it's something you probably want to ignore (like twincest :)
Having said that, it does have a couple of features that may be of interest to sane developers:
- a Java terminal emulator (bash shell)
- syntax highlighting for SPARQL/Turtle
Neither is entirely finished, but both are useable/reusable (Apache 2 license, or somesuch).

I've been using Eclipse for most of my dev stuff for years now. When I was doing things in Node.js I wound up configuring it to have a file explorer pane, a text editor pane (for Javascript, HTML, Turtle or SPARQL) and three terminal panes all connected to the local shell. Eclipse was basically a (slow) sledgehammer to crack a nut. I did spend a while looking for a way of setting these things up using separate apps, but was beaten by the problem of pinning the windows to the workspace. I believe it should be possible using Devil's Pie or similar, but I had no joy. But as it happened I wanted a terminal emulator in Java anyhow and had played with syntax highlighting before.
In Scute I'd put together some basic highlighting for Turtle, except when I came to look at it again it was a bit too hardcoded to reuse, and Javascript is quite complicated... Looking around I came across jsyntaxpane, which is a pluggable highlighter which takes its config from a JFlex lexer. It'd got the necessary for Javascript, so I decided to use that instead of my hacky code. I found a SPARQL/Flex file on the Web that someone had prepared for IntelliJ IDEA which, although geared to do other things, saved me a bit of time writing out the SPARQL patterns. Here's sparql.flex.
For the terminal emulator I started with the JConsole UI from BeanShell, to which I've added the bits which talk to the bash shell. It works ok on this Ubuntu machine, I've no idea what would be needed to set it up for a different OS. The source for that is here.
I started Scute, a desktop RDF toolkit, just over a year ago. I did get some bits working fairly well - I was using the SPARQL bits for real - but then I got distracted and left it largely unusable... This JEdwards bit of coding has got me back into it, and tightened up how I was thinking about the dev process. I must write this up properly. The main idea is, while it should be built from reusable components, the way it's set up as a whole will be optimized for how I want to work. Somewhat inspired by woodcarving, where a lot of the time what's best isn't a general purpose tool (wood router or software IDE) but a highly focused tool (1/4" No.4 fishtail gouge or JEdwards). If the resulting code is useful for other people, great, but the motivation isn't to create a product, just to help my own personal workflow. Horse before cart dogfood.
The reusable components part comes from testing. I'm lazy about tests at the best of times, and Scute is all about GUI so is a bit tricky to test. But I reckon component-level functional tests make a fair substitute for unit tests. Anyhow, more about this another day.
2012-01-18T19:12:14+01:00
scute terminal emulator jedwards sparql turtle syntax highlighter rdf
Listy Thing - aspirations
Speaking on the phone to my brother, I told him about the Listy Thing I've been working on, and he pointed me to workflowy. It's an outline/list todo thing that already does a big chunk of what I had in mind for Listy Thing (quite funny they've also got a 'y' on the end). The UI is awesome, which on the one hand is inspiring in demonstrating feasibility, on the other scary, showing how far I have to go.
It is basically what I'm after, only I want something backed by RDF so that more data can be associated with nodes (especially nodes which correspond to Web resources), the data can be reused, and many alternate views are possible.
I'm still a little stuck on the fundamental question of how best to represent lists, I guess I just have to try things out. Had some good suggestions on the G+ page - there's even an Ordered List Ontology.
The issue's a bit conflicted, because on the one hand useful ordering is generally tied to some particular property (e.g. dc:date) so the list structure can be generated on demand (via SPARQL or whatever), no additional ordering is needed in the data. But then as far as user experience is concerned, as a list is being put together the order can be totally arbitrary - i.e. there is an order, only we're not quite sure what it is yet. This might suggest using rdf:List as a general purpose mechanism.
I think I'll try some kind of low-cost property (with a numeric value). So a property, which after all is just another kind of resource, gets minted when the list is created in the UI. Ideally I suppose it'd be a bnode but a quasi-disposable URI will do. Dunno, give it an rdfs:label on the fly and associate it with user/date of creation?
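Something like the following Python sketch is what I have in mind, with triples as plain tuples. The property URI and the ex: item names are hypothetical, and giving a new item a position halfway between its neighbours is just one possible way of keeping arbitrary orderings cheap.

```python
# Hypothetical per-list ordering property, minted when the list is created.
ORDER = "http://example.org/lists#shopping-order"


def ordered_items(triples, order_property):
    """Return the items that carry order_property, sorted by its numeric value."""
    positions = [(float(value), item)
                 for item, prop, value in triples if prop == order_property]
    return [item for _, item in sorted(positions)]


def insert_between(triples, order_property, item, before, after):
    """Place item between two existing items by taking the midpoint of their positions."""
    position = {s: float(o) for s, p, o in triples if p == order_property}
    middle = (position[before] + position[after]) / 2
    return triples + [(item, order_property, str(middle))]


triples = [("ex:milk", ORDER, "1"), ("ex:eggs", ORDER, "2")]
triples = insert_between(triples, ORDER, "ex:bread", "ex:milk", "ex:eggs")
print(ordered_items(triples, ORDER))
```

With the data in a store, the same ordering could be had from a SPARQL ORDER BY on the property's value.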
I use the namespace http://purl.org/stuff# for "disposable" classes and properties (feel free to follow suit). They're Cool URIs in the sense that they'll always resolve (although I must add RDF docs to that URI), disposable in the sense that they appear in instance data but won't have any more definition.
2012-01-16T09:25:24+01:00
lists rdf
1. HTML 2. RDF
I posted the other day a note-to-self re. Listy Thing, though I haven't done any coding on that since I have been puzzling over one particular bit, which boils down to: how best to represent lists in RDF. Ok, there's rdf:List and Wikipedia has quite a good definition of list, but that's not really the end of the story, there are a few practical considerations.
The conceptual model I want is just the usual thing of a finite ordered sequence of things, for sorting out my own lists: shopping, project resources and todo, bookmarks, list of people (friends & circles) etc. etc. The items in these lists will generally be either text or links (which fits nicely with RDF literals and resources)...or other lists. Ok, that just jumped into graph-land. Also the lists/items may also be lots of other things as well - like a todo list might contain tasks (yeah, got an unfinished vocab for that as well). An item may be in multiple lists and so on. But the practical model I'm after is pretty much a direct reflection of HTML lists, to allow easy rendering/manipulation in the browser.
One of my favourite typed-item lists of all time is (appropriately enough) Enrico Franconi's Description Logics Course. I want to be able to put things like that together really easily and - the why RDF? part - reuse the data easily.
I've played with very similar stuff before with something I called XOW, XHTML Outlines for W6, where W6 was a simple vocab for adding just a bit of semantics to resources (addressing the questions who, why, what, when, where, how - I think it was Libby set me going down this path). Beh, loads of link rot around there, must fix - basically you could make lists of typed items in the browser, the result could be sent through XSLT to produce RDF.
I've come at it afresh from the motivation of wanting to sort out my own lists, sod any wider problems. The tools are much better this time around, but the modelling thing is still a quandary.
With a bit of googling I've found some good scripts to get started and have been noodling on the HTML side - drag and drop reordering of lists, with in-place editing. Current home: live here, in github here (note - dev branch of seki). It's not far off what I reckon I need.
Now there's the fun bit of expressing this material as pure data to stick it in a store and access via SPARQL (1.1). First pass at least I'll probably use XSLT in lots of places, the transformations should be pretty straightforward, but the SPARQL side is a bit tricky. Andy Seaborne has done a great post about lists with SPARQL Update, and as I'm using his Fuseki for storage it stands a good chance of working (heh).
But I want to be able to muck around with these things a lot, so I'm wondering whether it might be advantageous to also 1. overlay some old-fashioned RDF container stuff on the lists as well (i.e. rdf:Seq) and even 2. simple ordinal property values, something like :
:contains [ a :listitem; :value ; :inlist - :position "43" ] .
Dunno, maybe this might help with matching quirks like those Andy mentions, "the empty list isn't any RDF triples, so looking for lists isn't just looking for rdf:rest properties" (rdf:nil keeps running away!), and the fact that with SPARQL 1.1 property paths, list elements do not necessarily come out in order.
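To make the "ordinal property alongside the rdf:List" idea a bit more concrete, here's a rough Python sketch. It builds the usual rdf:first/rdf:rest chain and also hangs a position number on each cell, so the order can be recovered without walking rdf:rest at all. The `ex:position` property, the `http://example.org/list#` namespace and the blank node names are made up for illustration, they aren't from any real vocab.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/list#"  # hypothetical namespace, for illustration only


def list_to_triples(items, prefix="_:n"):
    """Return (head, triples) for an rdf:List whose cells also carry ex:position."""
    if not items:
        # The empty list is just rdf:nil: no triples at all.
        return RDF + "nil", []
    nodes = [f"{prefix}{i}" for i in range(len(items))]
    triples = []
    for i, (node, item) in enumerate(zip(nodes, items)):
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF + "nil"
        triples.append((node, RDF + "first", item))
        triples.append((node, RDF + "rest", rest))
        triples.append((node, EX + "position", i + 1))  # 1-based ordinal
    return nodes[0], triples


def items_in_order(triples):
    """Recover the items by sorting on ex:position, ignoring rdf:rest entirely."""
    first = {s: o for s, p, o in triples if p == RDF + "first"}
    position = {s: o for s, p, o in triples if p == EX + "position"}
    return [first[node] for node in sorted(position, key=position.get)]
```

The redundancy is the point: a query that only needs "everything in this list, in order" can sort on the position value, while anything that understands rdf:List still sees a normal list. The cost is that every reorder has to rewrite the position values as well as the rest links.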
Funnily enough, lists in RDF seem to attract a lot of caveats - there's the old stuff about how 'weak' containers are, the lovely line in the RDFS spec: "Just as a hen house may have the property that it is made of wood, that does not mean that all the hens it contains are made of wood, a property of a container is not necessarily a property of all of its members.". In the same spec, the delightful: "RDFS does not require that there be only one first element of a list-like structure, or even that a list-like structure have a first element.". But if it walks like a duck and quacks like a duck then it's made of wood, hence a witch.
Suggestions very welcome, here's a G+ Page , see if that works for comments.
PS. also related is the Linked Data API stuff re. lists, see e.g. listvalued_props
2012-01-13T20:10:36+01:00
lists html rdf
Hixie's Furniture
Too long; read later - here's a demo : SPARQL Sliders Test
+Ian Hickson posted a lovely semweb use case:
"I'd like a search tool for furniture that works like Google's Flight Search does for flights. That is, with sliders so I can say what type of furniture (table), what range of widths (1-2m), lengths (2-5m), and heights (1-2m), what material (wood), what thickness, what price range, etc, I'd like, with the list of available products updating in real time."
As it happens I wanted a slider thingy ages ago, so this was a good prompt to make a demo of the front end part which takes the values from slider components and uses them in a SPARQL query.
For convenience/lack of available data the demo runs against DBpedia via the SNORQL SPARQL Explorer. As furniture and its dimensions weren't available it uses cities and their populations and elevations.
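The slider-to-query step is simple enough to sketch. The demo itself does this in the browser; the Python below is only an illustration of the idea, and the function name and the choice of DBpedia terms (`dbo:City`, `dbo:populationTotal`, `dbo:elevation`) are my assumptions rather than a copy of what the demo uses.

```python
def build_city_query(pop_min, pop_max, elev_min, elev_max, limit=50):
    """Turn two slider ranges into a SPARQL SELECT query string."""
    # Coerce to numbers so nothing but digits ends up inside the query text.
    pop_min, pop_max = int(pop_min), int(pop_max)
    elev_min, elev_max = float(elev_min), float(elev_max)
    limit = int(limit)
    if pop_min > pop_max or elev_min > elev_max:
        raise ValueError("each slider range needs min <= max")
    lines = [
        "PREFIX dbo: <http://dbpedia.org/ontology/>",
        "SELECT ?city ?population ?elevation WHERE {",
        "  ?city a dbo:City ;",
        "        dbo:populationTotal ?population ;",
        "        dbo:elevation ?elevation .",
        f"  FILTER (?population >= {pop_min} && ?population <= {pop_max})",
        f"  FILTER (?elevation >= {elev_min} && ?elevation <= {elev_max})",
        "}",
        f"LIMIT {limit}",
    ]
    return "\n".join(lines)
```

Each time a slider moves, the front end would rebuild the query with the new bounds and send it to the endpoint. For furniture the shape would be the same, with width, length, height and price in place of population and elevation.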
So how would you get real data?
First of all, furniture vendors could either provide dumps of their data or, more Webby, mark up their sites with RDFa and/or HTML5 microdata using e.g. the GoodRelations e-commerce vocabulary.
Ultimately, for a front end like these sliders to work, the data would need to go in a store with a SPARQL endpoint. But, triplestores shouldn't be thought of as just a wacky alternative to a SQL database. A triplestore is just a cache of a little chunk of the Linked Data Web. The question of where the store resides and how the data is collected is entirely open. Following the more traditional DB model, a service might aggregate the data published by known furniture suppliers and provide the endpoint online.
But alternately, a local user agent (I think Chris Bizer had a little Java example, can't find the link...there are others) could crawl the Web to answer the query just-in-time. The advantage of this approach is that it's more thorough and the only real option for totally arbitrary queries, the downside being that its answer will probably take longer than milliseconds. But remember triplestores are caches, not every little bit of information would have to be discovered and read from every page. There are vocabs for dataset and vocab discovery (remind me of the acronyms please :) Note too that you're not limiting your client agent to a single datastore. Traditional backends (SQL or NoSQL) are effectively isolated silos, triplestores are integrated with the links of the Web.
Incidentally, this is something that might be nice to express as a Web Intent, along the lines of "make me a query from this template with these parameters and apply it to this endpoint, putting the results into this widget" (that's a bit verbose for a general-purpose intent, but you get the gist). c.f. RDFAffordances.
2012-01-11T15:01:56+01:00
sparql demo goodrelations rdf hixie furniture
The Emperor's New Client
A wee rant.
Ok, I'm totally with the consensus that the future is Cloud-based, and to be a little more specific Platform-based and to be even more specific primarily HTTP-based. To back that up, cf.
- Michael Hausenblas's new report
- Mike Amundsen's recent blog post
- Steve Yegge's awesome rant at Google
But to expand something I mentioned in passing here recently :
in one respect the emperor is stark-bollock naked. Browsers are currently a really sucky environment for client development. Sure, the HTML/CSS-based (standard!) rendering is wonderful. As shown with Node.js (and despite what Google are saying around Dart), Javascript is a reasonably pleasant, perfectly capable programming language. The growth of Ajax and JSON have shown inter-system comms is workable. There are some good dev tools and libraries. So why does working with this stuff feel like pulling your own teeth?
Here I could point to the traditional DOM API, blame the W3C for all the world's ills and an awful lot of people would nod and smile knowingly. But although that's arguably valid (heh), I reckon the problem is more systematic and can mostly be blamed on browser developers.
Ok, blame is too strong. The decisions made over the years and the directions taken have generally been perfectly rational in the context of the prevailing conditions. But there have been feedback loops at work. The flashy [sic] chrome [sic] surrounding HTML dev, from the img tag onwards, has pulled Web developers in like moths around a flame. So the browser developers act to improve that experience. Meanwhile server-side tech has developed out of the corporate legacy of silo-based systems. Let me quote Steve Yegge there: "It's a big stretch even to get most teams to offer a stubby service to get programmatic access to their data and computations.". The way services are offered over the Web, even Web 2.0 services still have a big hangover from this mentality. I'd argue that most Web APIs are only marginally better than SOAPy stubs. Largely because XML and JSON aren't particularly Web-friendly. Ok, don't bite my head off, let me qualify that.
First XML. There have been plenty of arguments over the years around XHTML, and back in the day (I wonder how old that phrase is) there were arguments about the XML nature of RSS. Postel's Law, the "Robustness Principle" got cited a lot. Let me give you some deja vu:
Be liberal in what you accept, and conservative in what you send.
What a lot of people misinterpreted was the keyword robust. A robust system is one designed to be able to fail gracefully or continue working acceptably with noisy data. That's exactly what we want for the Web, right? Well not necessarily, if I was ordering a book from Amazon, and there was a partial failure, I'd rather they didn't make a best-guess when it came to taking money off my credit card (I think paraphrasing Tim Bray there). Anyhow, XML is not robust, by design. XML is designed to bail out completely at the first sniff of anything dodgy. As it happens, the way XML is often served on the Web is without proper regard for the media type, i.e. dodgy and hence broken.
Sorry, that was gratuitous deviation, the real reason I'd say XML isn't Web friendly, like JSON, is in the way people use it. Whether data is conveyed as name-value pairs or through more complex structures, the key parts are generally just simple strings. But by itself, a string on the Web is next to useless. You or I can (maybe) read it, or even paste it into Google and get a definition. But what is a poor machine client to do? What makes the Web are links. It's 101 but somehow still manages to be overlooked: the link has two facets: a universally unambiguous name (URI/IRI) and a protocol for following it (HTTP). If a client on the Web encounters a link, it can follow its nose to find out more information about it. That's what we as humans do in browsers all the time, yet when it comes to Web services for some reason a simple string is seen as adequate to identify something.
Ok, with XML, the HTML DOM and to some extent JSON there's been some justifiable resistance to the use of URIs for names, because namespaces have traditionally been unintuitive at best and agony at worst. Using URIs instead of simple strings certainly adds a burden (it doesn't have to be that great, check Turtle syntax), but its benefits far outweigh the costs.
The thing is, you'll hear talk of snowflake APIs - only one implementation of each exists - but what gets overlooked is that by their very nature, most APIs just aren't Webby. The client must have prior knowledge that the service at endpoint X uses API Y. What you end up with is effectively a series of 1:1 client-server connections. That, while the uniform interface of REST may mean it's less brittle than an RPC connection, still means tight coupling.
Ok, you might argue that for any communication to take place, some prior knowledge is required. Sure, but that can be minimised - just like the way we follow links for more information in a browser, a service client can follow links to get more information. This is only a small conceptual step, but what it enables is hugely powerful. Above everything else, it's what Linked Data and the Semantic Web get right.
I reckon that browser developers, with their emphasis on doc-oriented HTML have a natural tendency to carry their experience in that domain across and apply it to data. Naturally namespace-less XML and JSON will seem preferable through that lens. But in practice, documents and data are apples and oranges. Browsers have been optimized over the years for the former, incidentally making the latter harder than necessary.
It's funny how you don't hear so much about service mashups these days, despite their undeniable coolness. I'll assert that it's because developing for Web data in the browser is bloody hard work, especially when there are NxN arbitrary API mappings to know.
Overall it's actually something of a miracle that the notion of cloud-based platforms has emerged.
I had planned to say more about Cloud Computing Outside of the Browser - or to put it another way, evolving old-fashioned non-browser Rich Internet Clients (as well as server-server and every other non-browser configuration). But ranting's worn me out. Anyhow, in short, I reckon that for the foreseeable future, non-browser clients in many circumstances are probably preferable to browser-based equivalents, primarily because they're easier to develop (as I keep saying, I reckon the agent model of combined client/server units is a good way to go). While I personally welcome HTML5 and the APIs as a clean-up of document markup and processing, when it comes to data it isn't even a Band-Aid.
2012-01-09T20:00:25+01:00
apis cloud browser services rdf
Dart H. Vader
I just heard about Dart (via Seth Ladd and Edd), a new Web programming language from Google. It aims to fulfil the role Javascript currently has, only doing it better. On the pro side, new languages are inherently cool, and Javascript can be a real pain. On the con side it seems unlikely that any browsers other than Chrome will support it in the foreseeable future, except potentially via translation to Javascript, i.e. This Page Best Viewed with Chrome
It's hard not to see echoes of the old Microsoft arrogantly pushing its own product here (remember VBScript?), although Google have in recent years made NIH an artform. But who cares about politics, how's this going to affect the Web?
Well, Code-on-Demand does appear in Fielding's thesis (slightly bizarrely as an 'optional constraint') and has been around since the early days. Pluggable clients are certainly a good idea, and Google have been leaders in moving Rich Internet Applications as opaque desktop apps into the browser using Javascript. The apps are still pretty opaque (View Source on gmail if you doubt that) but they do at least more-or-less run cross-browser.
I've not read much of the Dart docs yet, not tried it at all, but first impressions are that it's a nice clean syntax not unlike JS (or for that matter Java, C# or Python...) and they've already got a good bunch of libs together (even if they do include RPC, yuck!).
As an aside, it should be noted that there's a cost to the standardization of today's browser as Web client (in the process of being defined via HTML5 and associated APIs). It does mean an effective monoculture of HTTP clients. Arguably you can write whatever kind of client you like (probably in Javascript) and host it inside a browser, but they have been optimized for a fairly specific app scope. If you stray from the general model of a Web of HTML Documents you're in for an uphill journey. The arbitrary desktop client has more freedom to use HTTP more creatively, but then there won't be one on everyone's desktop. (Personally I like the notion of Web agents (where an agent = client + server + persistence + code) as an abstraction for Web components, as in "Two Webs!" [pdf - heh]. I wonder, is there a HTTP server in Dart yet?)
Looking at the "Leaked internal dart email" (as with UK politics, it's probably sensible to take the "Leaked" aspect with a pinch of salt), there does seem to be some motivation for Dart coming in response to the success of iOS. I'm pretty sure a new language isn't the best response to this, but it certainly makes a change to the usual big proprietary Flash/Silverlight kind of issues. Google are still talking of evolving Javascript, but it does raise the question of what Dart will offer that couldn't be achieved using JS. Optional typing is the feature they seem to be plugging most. So I wondered if anyone had worked on adding static types to JS. Funnily enough, the first few hits refer to iOS. Oh dear, we're really not talking iOS envy, are we?
It's a little surprising that Google haven't thrown their expertise at the JS-is-a-mess issue previously, I don't see a groundbreaking dev tool and pattern library out there (funnily enough the Dart Editor is based on Eclipse, which does seem a bit un-groundbreaking (although I'm not criticising the choice, Eclipse is my main IDE)).
Whatever, it should be interesting to watch how this pans out. Dart will almost certainly be a very cool language, albeit engendering ambivalence everywhere outside Google. Give me a shout when it includes libs for non-HTML Web languages (i.e. gimmee RDF :)
Comments (G+)
2012-01-06T20:48:18+01:00
google language programming dart rdf
Listy Thing - note to self
Spent this morning having another go at sorting out my lists and links. The aim is to keep them in a triplestore (probably Seki/Fuseki/TDB/Jena) and to be able to add, organise & edit them in a browser. I'd better leave this, have a nap then get on with something else now. So to help me remember where I'm at:
- Rearrange & in-place edit (with jQuery) worked on test page, doesn't yet work on real data (crashes browser!)
- Editory thing - four-pane CSS seems ok, CKEditor looks good for rich content, need to play (not tried with above yet, not sure about cross-list D&D)
- Did a dump of links from Chrome, ran through Tidy, XSLT (xsltproc) to ul/li and split into separate lists - basically working ok
tidy-default bookmarks_1_6_12.html > bookmarks.xml
xsltproc lib/bookmarks-split2lists.xsl bookmarks.xml
for file in *; do mv "$file" "${file}.html"; done
- Vocab - no idea for textual list items, Annotea has http://www.w3.org/2002/01/bookmark#Bookmark, also Tag Ontology
PS. got some dump-from-del.icio.us code tagliatelle. Need to try AndyS's rdf:List with SPARQL 1.1 Update stuff.
2012-01-06T15:15:08+01:00
lists bookmarks links rdf
Scutter's Mate
As I was admiring the Linked Open Vocabularies Endpoint (LOV-E) it occurred to me that the vocabs I maintain (well, create and forget...) aren't particularly discoverable. Even before saying they're vocabs, there's not necessarily anything linking in to them (yes, really forget). Ideally I suppose I should put together a proper Semantic Sitemap, but for now I've thrown together a quick and dirty directory walking script in Python: scutters-mate.py. It produces a Turtle listing of the RDF files it finds (by filename extension) containing entries like this:
<http://hyperdata.org/xmlns/meta.ttl> rdfs:seeAlso <dogmood/index.ttl> .
<dogmood/index.ttl> rdfs:seeAlso <http://hyperdata.org/xmlns/meta.ttl> .
<dogmood/index.ttl> format:format <http://purl.org/stuff/formats/text/turtle> ;
rdfs:label "text/turtle" .
Here I ran it in the /xmlns directory and saved the output to xmlns/meta.ttl.
I'm thinking I'll also run it from the root of all the domains I use, then try and remember to link to /meta.ttl wherever appropriate to give the scutters a helping hand.
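For the record, the gist of the walker is something like the sketch below. This is a reconstruction of the idea, not the actual scutters-mate.py: the extension-to-media-type table and the function names are assumptions, and it leaves out the format:format triple because that prefix isn't spelled out here.

```python
import os

# Assumed mapping from filename extension to media type.
RDF_EXTENSIONS = {
    ".ttl": "text/turtle",
    ".rdf": "application/rdf+xml",
    ".n3": "text/n3",
    ".nt": "application/n-triples",
}


def walk_rdf_files(root):
    """Return sorted (relative_path, media_type) pairs for RDF-looking files under root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order predictable
        for name in sorted(filenames):
            ext = os.path.splitext(name)[1].lower()
            if ext in RDF_EXTENSIONS:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                found.append((rel.replace(os.sep, "/"), RDF_EXTENSIONS[ext]))
    return found


def to_turtle(meta_uri, files):
    """Emit seeAlso links in both directions between the listing and each file."""
    lines = ["@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .", ""]
    for rel, media_type in files:
        lines.append(f"<{meta_uri}> rdfs:seeAlso <{rel}> .")
        lines.append(f"<{rel}> rdfs:seeAlso <{meta_uri}> .")
        lines.append(f'<{rel}> rdfs:label "{media_type}" .')
    return "\n".join(lines) + "\n"
```

Something like `print(to_turtle("http://hyperdata.org/xmlns/meta.ttl", walk_rdf_files(".")))` run in the directory would give a listing along the lines of the entries above. Going by filename extension is crude, a file could be sniffed or parsed instead, but it's good enough for a first pass.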
Comments (G+)
2012-01-04T20:16:46+01:00
sitemap scutter vocabs rdf data linked
Web Beep - where next...
Minor tweaks aside I've got Web Beep to a good milestone, basically proof-of-concept.
Boxes ticked:
- Live service
- How it works documentation
- Spec
A good point at which to put it on one side and get on with some rather more pressing bill-paying stuff for a while.
But it'd be nice to have a clue on next steps. There are a few potential directions:
Ports
The obvious one is in-browser Javascript. While the HTML5 APIs look the best route long-term, it's not so obvious right now. There are things already around like making .wav data: URIs, and also dynamicaudio.js - which looks very promising, it supplies a Flash player for browsers that don't support the API. Until very recently I expected there to be a need for DSP libraries (there is a dsp.js) but as it happens it only requires trivial stuff and there's the Java to refer to, all easily hacked. (The only "serious" DSP bit is the Goertzel algorithm, but that itself is easy-peasy, already done: goertzel.js, literally only took a couple of minutes).
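To show how little there is to the Goertzel bit, here's a minimal Python version. It's a sketch of the textbook algorithm, not a port of goertzel.js or the Java code, and the function name and parameters are my own. It measures the power at a single target frequency in a block of samples.

```python
import math


def goertzel_power(samples, sample_rate, target_freq):
    """Return the squared magnitude of the DFT bin nearest target_freq."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    k = round(n * target_freq / sample_rate)  # nearest DFT bin
    omega = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(omega)
    s_prev = 0.0   # s[n-1]
    s_prev2 = 0.0  # s[n-2]
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2
```

A decoder that only cares about a handful of known tones can run one of these per tone per block, which is much cheaper than a full FFT. For a unit-amplitude sine sitting exactly on a bin, the result is about (N/2) squared, where N is the block length.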
There might be uses for desktop UI-based codecs, but I don't know what...I might well hook something up to the current implementation, see if it inspires.
Some kind of mobile device app should have potential.
But this is all very tied to another dev direction -
Applications
What to do with the darn thing? danbri's put some good ideas down with ChirpChirp (that I've still not fully digested).
Nicholas J Humphrey had a brilliant suggestion, use them on radio - nearly every programme these days (BBC R4 at least) seems to read out one or more URIs.
I've not got a smartphone so am pretty clueless about that kind of Apps, but presumably there are a few around there.
Doing stuff with DSP and/or GA and/or RDF
Building the thing led to a couple of collateral proto-products: a little genetic algorithm-based optimizer and the makings of a DSP vocab/ontology.
There has been work done already around DSP and semweb tech by the dbtune and omras folks. The Henry service is a sweet example of the kind of thing that's possible, it's "...able to perform audio processing tasks to answer a particular query". The shape/scope of their ont does seem a bit different to what I've been finding, though obviously there's overlap. My inclination is to derive what's needed from the running code then later align it with their material.
With a reusable system-description mechanism in place (i.e. a DSP vocab) it should be straightforward to apply the genetic algorithm optimization setup to any system which depends on a bunch of parameters and has a notion of fitness.
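The "bunch of parameters plus a notion of fitness" setup can be sketched generically. This isn't the Web Beep optimizer, just a minimal genetic algorithm under assumptions of my own: real-valued parameters with bounds, truncation selection, uniform crossover and Gaussian mutation. All names here are hypothetical.

```python
import random


def evolve(fitness, bounds, pop_size=30, generations=60, mutation=0.05, seed=0):
    """Maximise fitness(params), where params is a list bounded by (lo, hi) pairs."""
    rng = random.Random(seed)  # seeded so runs are repeatable

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def crossover(a, b):
        # Uniform crossover: each parameter comes from one parent or the other.
        return [rng.choice(pair) for pair in zip(a, b)]

    def mutate(individual):
        out = []
        for value, (lo, hi) in zip(individual, bounds):
            value += rng.gauss(0.0, mutation * (hi - lo))
            out.append(min(hi, max(lo, value)))  # clamp back into range
        return out

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # keep the fitter half unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents)))
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)
```

With a system description to hand, `bounds` would come from the declared parameter ranges and `fitness` from running the system and scoring the result, e.g. `evolve(lambda p: -((p[0] - 3) ** 2), [(-10, 10)])` for a toy one-parameter case.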
I've also got a few other personal tie-ins with this - the opportunity to tie the DSP (and analog SP) bits to the SPICE in RDF stuff I was playing around with last year, and going back somewhat further, updating the RPP vocab from over a decade ago (I'll get these things finished eventually...). From a suitable level of abstraction there looks to be interesting potential overlap with data processing too - check David Booth's RDF Data Pipelines for Semantic Data Federation.
2012-01-03T14:21:37+01:00
ga pipelines genetic webbeep algorithm dsp web rdf beep
Web Beep
I've just gone live with a little fun service : Web Beep - enjoy!
2011-12-31T19:22:33+01:00
audio dsp web rdf beep
A very compact database query language based on binary relations
Kragen thought it through a bit: http://canonical.org/~kragen/binary-relations.html
(God I love people that are cleverer than me)
2011-10-19T17:08:50+01:00
rdf
Queries
Words from Kragen, and I really can't answer this, I hope he doesn't mind me sharing:
[[
I've been thinking about how most queries don't really need to have free variables in the predicate position of a triple — that is, in most queries, you know what all the predicates are going to be, and your variables are only in the subject and object positions. Is this true in general, or is it just me?
I've been thinking about a different query language syntax that takes advantage of this, but it has to fall back on reification when it actually does need a variable predicate.
]]
2011-10-19T16:20:38+01:00
rdf
A Role Model of Consciousness
Past few weeks I've been on pause, my head not working properly. Finally got around to seeing doctor yesterday, now waiting for antidepressants to take effect. I haven't totally wasted my disconnected time, watched a lot of stuff. Including a Midsomer, a couple of Bargain Hunts and a geeky-great vid on poker bots (have I said I really like Berlin? This is a Chaos Communication Camp production, wonderful material). Simulating an actual poker player is really hard, but it got me thinking about the similarly hard problem of what consciousness is, appropriately mental for my state of mind.
Caveat, I'm not up to date on theories in psychology or even AI. Last big thing I read anywhere near this was a lay-reader book I think with "Intelligence" in the title, about what humans are really good at is predicting the future - pretty good hypothesis IMHO. Maybe someone can enlighten me about current thought (I'll cc Planet RDF). But the thing that has been on my mind is more old-school, the internal model bit I think was popular around the 17th century, gone downhill since. Although it may well be rubbish as human stuff, something makes me imagine it might be worth thinking about for machine stuff. I really like the agent metaphor.
Ok, generation 0, we have an agent (A) in a universe (U), and it just sits there. It's a rock. It's surrounded by other agents (which might also be rocks).

Generation 1, we have an agent capable of interacting with the environment, but its interactions are pretty minimal, starting somewhere around a pebble on a beach that has a wander with each tide up to a living creature that has built-in stimulus-response maps along with learnt ones. Kinda Behaviourist. I'm starting with the pebble because interaction with the environment can take a lot of forms, and there's quite a history from at least the Neolithic of generally anthropomorphic agency views of facets of the environment (weather etc) through the Bronze Age deities up to the modern-day religious mythologies.

Generation 2 we approach the Enlightenment and/or Smalltalk. The agent in question has an internal model of the universe containing the agents outside.

On generation 3 we come to the bit that I'll call novel until someone points to an 18th century philosopher who already suggested this. The agent in question has had all its sensors and actuators geared up to the outside world for a while, as well as sensors (and actuators) connected internally. By the mechanisms of Intelligent Design, Natural Selection and copy, paste and tweak a bit, it notices parallels between interactions with the external agents and interactions with itself. It develops a sense of self as another model very similar to the models it has for external agents. Here's the novelty - first the agent becomes aware of external agencies, only then by analogy it becomes aware of itself.

Like all the great (as in most entertaining) theories this is of course unverifiable. But I like the notion that the local stuff only appears after some level of comprehension of the remote stuff, feels like it might be useful somehow.
2011-10-15T20:59:10+01:00
mind intelligence psychology federated ai mad model rdf
Sell Out
A couple of days ago I got another mail from someone wanting to put links here to their client's. Unusually this seemed written by a human, so I didn't immediately bin it. Insert our links in your old posts and we'll give you some dollars (the figure I think was $50 a link), and the targets will be either relevant and/or to educational resources. Given that I'm in the red right now, and given my recent amount of enthusiasm for paid work, I said ok, bring it on.
It would have been better if I'd been able to do Sebastian Trüg's approach, having a real project to which to donate, but bugger it, I've added a donate button to this blog. Now go visit my sponsor.
2011-10-06T20:48:17+01:00
federated money rdf
Check Sums
I was offline last week when the news broke that CERN folks announced they'd found a discrepancy between the assumed speed limit of the universe and the way their neutrinos appeared to behave, 20 parts per million. That's a pretty big anomaly when you consider dogs can detect salami in 9 parts per billion of the kitchen (that paper will be published once I've got 99 other co-signatories who don't mind their crotches being sniffed). I was offline because I was feeling pretty crap after a boozy weekend, lightweight compared to previous exploits but after the hangover had passed I was left in an ultra-violet funk.
Incidentally, for a few days, going to sleep I wound up picking a random, unloaded word that flashed by on my Cartesian plasma screen, "mink", repeating it as a voiceover in said theatre as a mantra to keep demons at bay. I have since rationalised the word - it's a potential HTML5 rel value to correspond to URIQA's MGET. But that's by-the-by.
The too-fast neutrinos went from CERN to Gran Sasso. After dopplering my funk, I was curious about the constant thing. I knew where CERN was (because I watched The Champions as a child) but though I'd heard of Gran Sasso, couldn't place it. As any good mental illnessity goes, my funk featured a good proportion of guilt (getting sweary on social networks leaves you a bit shamefaced).
Now looking on the map my funk shifted back up the spectrum, if you draw a line on the globe from CERN to Gran Sasso it goes straight through this house. Those faster-than-light neutrinos came through here (ok, a little underground, but I do leave my empties in the cantina). So how's that for something to feel guilty about - screwing up the model of the universe..?
Which is why I can be sure they got their sums wrong. My empties would have slowed them down. You're probably 40 parts per million out guys.
2011-10-04T21:50:35+01:00
federated cern physics rdf c neutrinos light
stupid computers
Train of thought. The world imagined by machines won't ever be a direct reflection of the world experienced by animals. But that's not a bad starting point, go all Plato and have the computers as the shadowplay. Maybe the current generation of computers aren't capable of doing the 3D of a child's first discovery of a 4-leaf clover. They will though, probably in my lifetime. But there's the map/shadow, and there's stuff we can do well in this world, stuff that the machines are good at. A virtual reality with rules that are consistent with this side, but take advantage of that side.
Perhaps I'm getting a little too excited about being back online again.
2011-09-21T23:38:47+01:00
federated rdf
Speed
A passing observation. It's bloody slow this Web thing. I have terrible wire bandwidth here, but it isn't that that is the bottleneck. Me, ask anyone, might be smart but he's slow witted. Not as slow as this thing.
Picture a couple of people that know each other fairly well, have a spoken language in common. Say they've been out and are trying to figure out the best way of getting home. Bang bang bang bang, the ideas will flow. The Interwebs know the best way to get a taxi, walk or bus. Augmented by the smartphone. But it don't quite work. Computers + data = knowledge. Not.
Even if this machine in front had better than human standard AI, it would still be slow and useless right now compared to a (stupid) talking human. We are missing bits we need to take advantage of the technology. The back end seems to function well, the front end seems like it's as good as it gets. So why do these things behave as if they are slow and stupid?
Passing observation, I honestly don't know. But I feel we should be able to find out. How? Dunno.
2011-09-21T23:24:15+01:00
federated rdf
RDF, where art thou
In comments on a post on G+ I said something I might regret:
"There are plenty of RDF-based applications around, but none really have much broad public appeal."
Ade Oshineye responded with "why do you think that is?"
Ok, overnight I remembered there's at least one app (or set of apps if you prefer) that uses RDF and has a lot of adoption: Drupal. According to Wikipedia it's used on at least 1.5% of Web sites worldwide, and has RDF in its core. Then there's data.gov.uk, a public-facing national government site that's RDF through-and-through. I'm a little out of touch, there are no doubt quite a few other good examples of where I'm wrong.
But given that RDF has been around for 5 years*, it's the way of doing data on the Web and virtually every Web-oriented app uses data somewhere, why isn't it ubiquitous?
(* solid specs came out in 2004 although SPARQL wasn't until 2008 so I'm splitting the difference for a rough date for when it became usable)
RDF isn't something that's going to be in your face anyway, so "broad public appeal" is slightly off-target. Developer adoption may be a better key. Whadever.
In terms of it as a database tech, compared to relational DBs (MySQL etc), custom data handling (Twitter uses Ruby message queues), novel DBs (Facebook uses a key-value store Cassandra apparently) RDF stores don't get much of a look-in. Ok, arguably the big scale things need to be custom to hone performance, but why, alongside the Big Data handling, don't we see RDF augmentation?
For consuming apps and desktop apps, I can't actually think of any well-known ones off the top of my head (I think quite a few of the music apps on Linux use librdf under the covers). I don't have a mobile device - any iPhone apps?
What I find a little bizarre (and please give me counter-examples), is that in the areas where RDF really shines - Web-oriented data integration and reuse - there are hardly any well-known apps out there at all, using any technology. There are a handful of feed aggregators and things like techmeme, but the level of integration there is pretty trivial. (Before Kingsley jumps down my throat - OpenLink Virtuoso is seriously good at this kind of stuff out of the box - but what I'm after is where these things are being used by twitter-sized demographics).
There's certainly something to what Lee Feigenbaum said the other day, the wrong question is usually asked, it should be: What can I do with Semantic Web technologies that I wouldn't do otherwise?
In terms of app-building, right now most parts of most things can be built relatively easily using other technologies, so unless the RDF stack is part of the developer's on-hand toolkit (like e.g. LAMP) it won't be first choice. I do suspect that while the false perception that RDF is complex per se isn't so prevalent these days, there's still a notion around that RDF is complex for the benefits it offers, i.e. linked data isn't perceived as a significant value-add, so why bother? The primary objectives can be achieved by pushing around little JSON objects ("jobbies"?) in a fairly arbitrary fashion, so why look further? But data on the Web surely isn't a niche thing...
Feel free to shoot me down in flames from all angles over this one (I'm not interested in advocacy here so don't care if I expose the wrong message) - I also suspect there's still something in the idea that people simply don't get it. While developers seem to have no problem representing pretty much anything in local databases, the idea that anything can be represented on the Web in a similar way hasn't been grasped. I reckon there's good evidence in virtually every high-profile project. Things tend to be focused on HTML (with a little Javascript) and the browser experience. For service-oriented systems the unwritten assumption is that the services will tie into the same view. I'm certainly not saying that this focus is wrong (those user-facing components are vital), just that it can lead to a blinkered view of what is possible. Only relatively recently have developers at large started looking at things like the identity of people on the Web. You still don't see the same attention given to everything else in the world - products, ideas, activities. Ok, you might point to activity streams and the like, but the subject of those activities still largely tends to be doc-oriented: messages or posts. You might point to schema.org and microdata as ways in which people in the Web development community can put data on the Web. But scratch the surface and the main goals underneath are things like SEO; most of the data being expressed is document metadata, not data about the real world. (Next time you go shopping, notice your interactions with the world from finding your car keys onwards, compare and contrast with the Amazon experience.)
The other day I posted a question on G+ that probably should have gone here: All the necessary components were in place for online social networks, in a distributed form, before Facebook & co. came along: blogs, aggregators, the various protocols. So why were Facebook & co. so successful? (got some good comments there, and was very pleased to find out Andreas Kuckartz is researching the question)
The question of data on the Web seems to lie in a similar socio-politico-technical morass. On federation, I'm afraid I'm inclined to agree with Eric Siegel : "I predict decentralization is inevitable, but its very very far away." I feel pretty much the same about the Web of data, though perhaps not so far away (unless I'm confusing small and far away :)
[ooh - a good point on that from Seb Paquet I'd missed before: The folks who grokked decentralization didn't master social experience design and UI design as well as Zuck, and decentralized infrastructure is harder to monetize so getting funding was difficult.]
One final question dedicated to folks on Planet RDF, from danbri in response to (the Facebook re-presentation of) my post yesterday:
If RDF is so great, we should all be rich by now? :)
Another quote, it must have some relevance - via the BBC, from Sir William Preece chief engineer of the British Post Office in 1876: "The Americans have need of the telephone, but we do not. We have plenty of messenger boys."
Still no system here yet, comments to G+ again.
2011-09-17T13:52:14+01:00
federated semweb rdf
Related
Comments
Plan B - RDF for fun and profit
Last night, after finding out that part of the G+ API had gone public I skimmed their docs and the docs of some of the specs they draw on: Portable Contacts, Activity Streams and OAuth 2.0. Of course it's great that G+ is exposing an API, and great that they're drawing on existing standards. But after looking at those standards I came away shaking my head, feeling rather discouraged. Again and again they contain data expressed using JSON mappings like "kind": "plus#person" (G+ API) and "objectType" : "person" (Activity Streams) and "" (Portable Contacts assumes that if you've got data you're looking at contacts). Aside from the variation in the naming across these, there's a common theme, the assumption that a simple token (like "person") is adequate for definition of something on the Web. How do you know that their definition of "person" is compatible with your system's definition of "person"? Sure, there are the spec docs to back them up, but how do you get from the data to the spec docs? Ok, there's openness in the publication and dev of these specs and standardization to the extent that they're high-profile enough that vendors like Google will see them and adopt them. But in their technical detail they have more in common with pre-Web, offline proprietary formats - "person" means person because we say so, and everybody knows what we mean.
Digging a bit deeper there's reference to the Discovery Protocol Stack which draws on XRD (the OASIS spec for describing resources) and Web Linking (RFC 5988 for defining typed links). Here there's more of an attempt to make the stuff Web-friendly, entities (resources) and relations (links) are identified with URLs so Web-based discovery of further information is in principle possible. But the "One True Ontology" registry-based approach of Web Linking is questionable in a distributed environment (and comparable to schema.org).
The description of things using schema like "kind": "plus#person" looks like what RDF does, except rather than using a Web-based approach to naming (so you could derive a URL from "plus#person", look it up and find out what it means) instead we see ad hoc token-based naming schemes. With Web Linking we have something that corresponds exactly with RDF properties (they are typed links), and if you can look things up in a registry then that's a step in the right direction. We already use registries to decode the meaning of terms in other major vocabularies - e.g. the HTTP media types through which HTML is delivered lead you to the definitions of terms like "strong" in the relevant specs. But is a registry appropriate for every term we're ever going to use? Does a word like "strong" only have one meaning?
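To make the token problem a bit more concrete, here's a minimal sketch of the glue a consumer ends up writing by hand. The mapping table and the choice of foaf:Person as the shared term are my assumptions for illustration, none of the specs above define such a mapping.

```python
# Hypothetical lookup from ad hoc type tokens to one shared, dereferenceable URI.
# Picking foaf:Person as the common term is an assumption made for this sketch.
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"

TOKEN_TO_URI = {
    ("kind", "plus#person"): FOAF_PERSON,    # G+ API style token
    ("objectType", "person"): FOAF_PERSON,   # Activity Streams style token
}


def type_uris(record):
    """Return the set of type URIs implied by any known tokens in a JSON-like dict."""
    return {
        uri
        for (key, token), uri in TOKEN_TO_URI.items()
        if record.get(key) == token
    }


gplus = {"kind": "plus#person", "displayName": "Alice"}
stream = {"objectType": "person", "displayName": "Alice"}

# Both records now point at the same URI, which a client could look up on the Web.
print(type_uris(gplus) == type_uris(stream))
```

The table has to be maintained out-of-band for every new token scheme, which is exactly the work that URL-based naming would avoid.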
Ok, so far there's a phrase which sums up all this: Cargo Cult RDF
But the theory is that grassroots, use case-driven development will tend to create cowpaths in the environment, and all standards orgs have to do is pave these. Except it doesn't seem to quite work that way. On the one hand we have the XKCD Standards effect (check the first paragraph on the Portable Contacts page), on the other hand the simple fact that, even with the best will in the world and with good information, people often get things wrong. Take for example:
OAuth [1.0] aims to unify the experience and implementation of delegated web service authentication into a single, community-driven protocol.
[time passes]
OAuth 2.0 is a completely new protocol and is not backwards compatible with previous versions....As more sites started using OAuth, especially Twitter, developers realized that the single flow offered by OAuth was very limited and often produced poor user experiences...OAuth 1.0 was largely based on two existing proprietary protocols: Flickr’s API Auth and Google’s AuthSub. The result represented the best solution based on actual implementation experience. (Introducing OAuth 2.0)
So...even when good, informed standardization is aimed for, flawed technologies built with flawed processes are unavoidable.
But these things are so popular! Vendors and developers can't get enough of this kind of stuff. It's a continuous stream: XML APIs become JSON APIs, microformats become microdata, but the same patterns are repeated again and again.
Years of these developments passing RDF by. Plan A : The Semantic Web still seems as far in the future as it did 5, 10 years ago. The RDF technologies demonstrably work, and adoption is growing, but it's hardly viral. However you look at it, the world of trendy new specs repeatedly steers around that fact. What's a jaded RDF enthusiast to do? Here's what I recommend:
Exploit the situation!
With a continuous flow of different specs that each covers some little part of data on the Web, focusing on any specific development can only work in the short term. A strategy based on technologies that support flexibility and agility, using known best practices of the truly distributed Web is the best option in the long term, so that systems can be rapidly adapted to meet any new requirements. It doesn't matter that e.g. schema.org misses the point, the data is still useful. "Think globally, act locally" is a great expression - in this context it could mean accept whatever the world of Web 2.0+ has to offer, but handle it on your own terms.
In practice, let's say you're developing a system for a particular vertical market: dog leads (I'm getting serious hints as I type). Don't build the system from scratch based on what people in the dog lead market are doing, don't tie yourself to domain-specific schema or protocols. Wherever possible use commodity, off-the-shelf tools. Then if dog leads take a nose dive on the international market you can regroup with a different target - cowbells for cats - using the same tools, and same skill set. The only parts that need change are at the edges. Basically RDF technologies offer a long-term commercial advantage.
2011-09-16T14:31:52+01:00
google streams contacts rant federated web semantic semweb activity rdf portable
Related
Comments
RESTful Turing Machines
I went to bed a couple of hours ago but every time I started to drift off a mosquito buzzed by. Led to this train of thought - how would you build a Universal Turing Machine with hypertext as the engine of state?
Seemed natural to use the Web in the role of the tape in Turing's setup with URLs corresponding to the position on the tape. The path part can be tape-like. Imagine an infinite path: http://example.org/location/location/location ... To move left it's href="../location", to the right href="location/location" (I think...bit tired here :). Whatever, problem is the train crashes to the left. I'm pretty sure a single-ended tape would still be universal because it'd be like folding the normal tape and interleaving the cells. But I reckon it'd be better to stick with the standard tape config but with little sub-rules for the mapping, something like:
start at http://example.org/H
to move right:
if the final char of the URL is a L, remove it
else append a "R"
- and vice versa.
The content of the page http://example.org/HLL might then look like:
<html>
...
Symbol = 0
<a href="http://example.org/HLLL">Left</a>, <a href="http://example.org/HL">Right</a>
...
</html>
or it might 404, corresponding to Turing's blank symbol.
Reading from tape is a GET, writing is a PUT.
I think this is in keeping with the spirit of REST, there is no context kept on the server, the messages are self-contained. The client would have to know the rules for generating the new URLs, plus the instructions. Maybe a neat way of doing the instructions might be to have a series of linked scripts each corresponding to an instruction, effectively a second agent stepping through them.
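For the less sleepy, the URL rules above can be sketched in a few lines. This is only the tape part, with an in-memory dict standing in for the server (an assumption, a real client would issue HTTP GET and PUT); the function names are made up for the sketch.

```python
BASE = "http://example.org/"
START = BASE + "H"
BLANK = None  # stands in for a 404 response, i.e. Turing's blank symbol


def move_right(url):
    """If the URL ends in L, drop it; otherwise append R."""
    return url[:-1] if url.endswith("L") else url + "R"


def move_left(url):
    """The mirror rule: if the URL ends in R, drop it; otherwise append L."""
    return url[:-1] if url.endswith("R") else url + "L"


# In-memory stand-in for the Web: URL -> symbol.
tape = {}


def read(url):
    """Plays the role of GET; a missing cell reads as blank."""
    return tape.get(url, BLANK)


def write(url, symbol):
    """Plays the role of PUT."""
    tape[url] = symbol


position = START
write(position, "1")
position = move_right(position)             # one cell right of the start
write(position, "0")
position = move_left(move_left(position))   # back past the start, one cell left
print(position, read(position))
```

Because the H marker is always present, neither rule can ever eat into the host part of the URL, so the tape extends in both directions.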
Now I've had a glass of milk and a chunk of chocolate I'm off back to bed, hope that mosquito's gone. I'll leave criticism, improvement and implementation to other insomniacs.
2011-09-14T03:04:13+01:00
rest turing machine rdf insomnia
Related
Comments
node.js early impressions
It is possible to learn enough Javascript and node.js to do useful stuff in a week.
I've just done it. I'm not exactly familiar with the idioms and I'm sure there are constructs I've not yet encountered, but it's to be expected that broad knowledge will take time.
Of course I had encountered JS before, around HTML/browser, but had never tried doing any proper coding with it. It certainly doesn't lack power, but one drawback I'd say is that its flexibility means that it isn't always obvious what's going on. That goes double for node.js, where having callbacks everywhere can make things confusing (though I'm beginning to get used to that).
The little app I've put together is much more concise than it would be in the languages with which I'm familiar (mostly Java and Python), but then needing a lot of comments to explain what's going on isn't a good smell.
However, if my vague understanding of how node.js works is remotely correct, I get performance/scalability for free (something that'd need a lot of thought in Java/Python). node.js really does lend itself to Web wiring.
2011-09-07T12:58:29+01:00
seki node.js rdf javascript node
Related
Comments
Affordances, described with less clutter
Posts on this blog get picked up by Facebook. Alison who's an experienced Web developer spotted my last post over there and couldn't make much sense of it. Hardly surprising, I referred to rather a lot of obscure stuff and used a lot of jargon without much explanation. But given that this affordances thing relates directly to the way everyone uses the Web, a developer should be able to make sense of it. So here I go again, this time trying to stick to the main points, glossing over the detail. [Blimey, but I've ended up rambling on a long while]
So on the Web you've got lots of documents in HTML on servers and lots of people with clients (browsers) that understand HTML. Those documents and various other messages are passed between server and client using the HTTP protocol. Most of HTML is about document structure, which with the aid of CSS can make text look good on the screen. But it has several things built in that allow a client to communicate over HTTP and hence allow the end user to interact with the Web. Most used is almost certainly the <a href="http://example.org/here">something</a> link. When interpreted by a browser, that bit of markup highlights the word something and enables the link http://example.org/here to be followed by clicking on the something.
One fairly archaic definition of the word afford is to provide or supply (an opportunity or facility). Presumably this is where a 1970s psychologist got the word affordance (Wikipedia) which he defined as an "action possibility" (and some other stuff). This got picked up by human-computer interaction folks and mutated a bit, but "action possibility" is good enough here. So what the browser does with the bit of markup above - enables the link http://example.org/here to be followed by clicking on the something - can be described as an affordance.
The Web can be looked at as an information store with which we interact, and borrowing from database speak we have four basic operations: Create, Read, Update and Delete (CRUD). Through the highlighted, clickable link the browser provides the Read operation. When we want to Create e.g. a new blog entry, Update or Delete it we typically interact through an HTML <form>. So the kind of things a form enables can also be described as affordances. It's not unreasonable to expand the definition to include certain things the browser does that go beyond displaying a document with structure, things like displaying an image file that's linked to by an <img> element. Nowadays we're surrounded by loads of other different potential interactions thanks to Javascript and Ajax, these are also affordances. With the rise of blogging, online photo/video sharing and social platforms like Facebook, Twitter and now Google Plus, there's a new emergent breed of affordances that's been identified that include things like share, like, +1 etc. These are typically powered by Ajax and very often operate across sites and involve some data transfer, e.g. if you post a link on Facebook to a photo on Flickr it'll add it to your wall, displaying a thumbnail of the image and the title. This new breed of affordances has been called Web Intents or Web Actions depending on where you look. (The Web Intents thread is I believe partly derived from a similar thing called Intents on Android phones, but having never used one I can't comment).
Ok, now there's an increasing amount of data on the Web expressed as Linked Data. This is published using the Resource Description Framework, RDF (depending on who you ask, linky non-RDF formats can also be considered linked data, but that's not really relevant here). The question is, how best to interact with this material, in other words what affordances do we need? There's a natural expression of documents on the Web - just show them as documents - but even for a passive display it's not altogether clear how to represent data. Ok, with traditional databases we usually have a table of some kind. But in that context we have a good idea in advance what can go in the rows and columns. On the Web, where the data can potentially be any shape it's a much trickier creature to pin down. With documents there is the familiar constraint of the individual document or page, whereas data doesn't chunk so neatly - the data we're interested in might be spread wide across the Web, between files containing only a handful of statements and stores containing millions. Links are part of the expression of the data, and links are the fabric of the Web, twisty eh? And this is just considering the Read aspect, there's also (at bare minimum) Create, Update and Delete to throw into the mix. We also need to not only interface with simple file-like linked data representations, there are also triplestores with SPARQL interfaces to consider (although the linked data API should help there, it can make a triplestore+SPARQL setup look more like normal Web representations).
However, to put these kinds of problems into context - we don't need every possible operation for all data in all environments, far from it. One thing the work around Web Intents shows is that a handful of little facilities (share, like etc) are making a big difference in the benefit people get out of the Web. One thing that should really be avoided is treating things as special cases - if you can share from A to B then you should be able to use the same mechanism to share from C to D and so on (this isn't that different from the centralized system setup, things on the Web should be distributed and ideally federated).
Ok, seems that affordances are going to be pretty important for working with the Web of Data. Some fairly good analysis has been done of HTML-in-browser affordances, and taking a leaf from the HTML book the simple hypermedia click-following of links seems a reasonable place to start in assembling suitable tools (in fact there are quite a few tools out there that support this in one form or another). It's fairly certain that some of the affordances will be vastly different from those we're familiar with - data supports things like merging (trivial in RDF), query and inference, completely different kinds of transformation and analysis than text and so on. At the moment it's not even really clear that a general-purpose tool, in the way the HTML browser is for documents, makes sense for Web data (my guess is most likely a variety of different tools will be built inside the Javascript-capable browser, with different tasks being spread between clients and services).
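On the "merging is trivial" point, here's a rough sketch of why. If a graph is treated as a set of (subject, predicate, object) triples with everything named by URI, merging is just set union. The example.org URIs are made up, and the sketch deliberately ignores blank nodes, which need renaming before a merge.

```python
# A graph as a plain set of (subject, predicate, object) tuples.
# foaf:name and foaf:knows are real FOAF terms; the example.org URIs are invented.
NAME = "http://xmlns.com/foaf/0.1/name"
KNOWS = "http://xmlns.com/foaf/0.1/knows"

graph_a = {
    ("http://example.org/alice", NAME, "Alice"),
}

graph_b = {
    ("http://example.org/alice", NAME, "Alice"),  # same statement as in graph_a
    ("http://example.org/alice", KNOWS, "http://example.org/bob"),
}

# Merging is set union: the duplicate statement collapses into one.
merged = graph_a | graph_b
print(len(merged))
```

There's no schema to reconcile first, which is the contrast with merging two arbitrary tables or JSON documents.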
But again to put these problems into context, there's no reason why any individual applications should be much different than they are today. Passing an image and its title between Flickr and Facebook requires the same basic machinery whatever kind of markup is used to describe the material. One of the aims for the Web as a whole, augmented by the Web of Data, has to be a reduction in complexity for common tasks. The fact that a whole new world of potential applications becomes feasible is just, well, interesting.
2011-08-29T02:56:10+01:00
federated actions intent affordances rdf
Related
Comments
RDF Affordances
Short version : An RDF Affordance is a resource description which gives a client all the information it needs to perform an action.
see RdfAffordances and AffordanceVocabulary.
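As a very rough illustration of the short version, here's a sketch where a description carries everything a client needs to build an HTTP request. The keys (target, method, media_type) are hypothetical and not taken from AffordanceVocabulary; a plain dict stands in for what would really be an RDF description.

```python
from urllib.request import Request

# Hypothetical affordance description: these keys are illustrative only and do
# not come from AffordanceVocabulary or any published vocabulary.
affordance = {
    "target": "http://example.org/posts/1",
    "method": "PUT",
    "media_type": "text/turtle",
}


def build_request(description, body):
    """Turn an affordance description plus a payload into an HTTP request object.

    The request is only constructed here, nothing is sent over the network.
    """
    return Request(
        description["target"],
        data=body.encode("utf-8"),
        headers={"Content-Type": description["media_type"]},
        method=description["method"],
    )


payload = '<http://example.org/posts/1> <http://purl.org/dc/terms/title> "Hello" .'
request = build_request(affordance, payload)
print(request.get_method(), request.full_url)
```

The point is that the client code knows nothing about posts or titles, it only knows how to read the description.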
My last post about what a Data Web Browser might look like led to some fertile discussion on G+. Essentially Mike Amundsen neatly reframed the question to being one about affordances, pointing to a bit of related prior work by him on Hypermedia Types.
We hold this truth to be self-evident, that presented with a simple application scenario a Web Architect will abstract it into a form that will take decades to implement.
Only joking...
Web Intents and Actions
I was initially thinking only in terms of an RDF-oriented browser (plugin/service) but it does make sense to stand back and look at the bigger picture. For starters, while RDF is ideal for describing stuff like service characteristics, there's no compelling reason to limit the data that's being manipulated to RDF. With that door open, there's an immediate tie-in with Web Intents, a JSON/Javascript way of describing/implementing generic interactions like share, edit, view, pick etc. (As it happens I added a Web Intents repository to my todo list a few weeks ago, the idea being to store the descriptions as RDF, providing a minimal API for using them in browsers as others have described - nice bit of serendipitous tie-in).
Tantek has spotted the potential around intents and in Web Actions: Identifying A New Building Block For The Web looks at common features across existing systems like Blog this, Digg, Read later, Follow, Like, Share, Tweet, +1 (he uses "Actions" instead of "Intents" for essentially the same idea).
We hold this truth to be self-evident, that presented with the potential for open-ended innovation a Microformats Geek will start paving cowpaths.
Again, joking...
On the Wiki - RdfAffordances - Mike has brought the abstraction back down to ground with some more detail of RDF-oriented actions, and with a view to hacking an implementation (on my virgin node.js installation) I've started a vocabulary - AffordanceVocabulary - this may change fairly soon, apparently Michael Hausenblas has done a vocab in this area, that'll get precedence if there's overlap/conflict.
We hold this truth to be self-evident, that offered a simple application scenario a Semantic Web Geek will always create a vocabulary that obscures the purpose of the application and that no-one will ever use.
Not entirely joking...
There is one high-level abstraction I've noted on that vocab page that is probably useful. There's a natural boundary between affordances that are essentially just HTTP (e.g. click through link, replace a page) and those which require more complex interactions. For now at least I'm calling the former Actions (let me know if there's a better word that doesn't clash with Tantek's usage) - they are around the scope of Mike's Hypermedia Types - and the latter Intents, which are around the scope of Web Intents.
2011-08-28T13:52:53+01:00
intents json browser web affordances semweb rdf data
Related
Comments
Data-Oriented Web Browser
Not a new idea, but I thought I'd try and find out how far we've got and braindump a little. I'm making the fairly big assumption that a general-purpose data browser would be feasibly useful/usefully feasible in addition to application- or task-specific tools (i.e. use X for your contact/social data, Y for your project management data, Z for your shopping list).
Historically Web browsers provide simple display of (linked) HTML documents obtained via a subset of HTTP, and that's still their primary use. Not very promising for use on the Web of Data without a lot of server-side magic.
But, as well as supporting increasingly sophisticated UI elements, they have built-in support for a Turing-complete language, Javascript. The HTTP limitations can be worked around. So while there may still be potential for a totally new breed of data-oriented Web browsers built from scratch as Rich Internet Applications, current browsers have the potential to do whatever's needed. Although they're pretty much limited to playing a client role, in effect they can be whatever kind of Intelligent Agent you like. The bonus is that everyone's already got a browser on their desktop/tablet/mobile - it's an easy path to deployment either for a plugin or better still as code-on-demand.
What's needed for a Data-Oriented Web Browser?
I'm not sure if the Tabulator is still actively maintained (if not, why not!?), but that gave a good indication of the kind of thing that is possible. Taking a step back, the Web of Data is really the same thing as the Semantic Web, and what's new about the Semantic Web isn't the "Semantic" but the "Web" (once again I've lost the source of that quote). How did/do people work with data without the Web? Typically SQL databases and spreadsheets. From those we can lift SQL queries and command-line tools, stored procedures and database forms (this is rather a confession, but back in the day when I first encountered MS Access it blew me away). Then of course there's the spreadsheet UI paradigm, a grid of cells which can be filled with pretty much anything, including most significantly on-the-fly calculated values.
So here's an initial shopping list:
- an in-memory* graph data structure support (rdfstore-js looks the most advanced right now)
- a spreadsheet-like view (I bet David Huynh has got stuff like this, if not, how hard could it be with a <table> and jQuery? :)
- a little language for concisely expressing Web operations, e.g. running SPARQL queries, that could be used inside the spreadsheet (the RDF path-following DSL in Apache Clerezza could be useful here too - link please Henry)
- tools for building app-specific forms (quite a few tools support custom views of particular classes, e.g. foaf:Person, Fresnel might help here)
- the ability to write as well as read data (this shouldn't need saying)
* persistence would be provided by the Web
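To get a feel for the first two items on the list, here's a minimal sketch of an in-memory graph with wildcard matching, plus a function that lays it out as a grid (one row per subject, one column per predicate). The class and function names are made up for the sketch, it isn't how rdfstore-js or any existing tool does it.

```python
class Graph:
    """A tiny in-memory triple store; None acts as a wildcard in match()."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return the sorted triples that agree with every non-None argument."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple
            for triple in self.triples
            if all(want is None or want == got for want, got in zip(pattern, triple))
        )


def as_grid(graph, predicates):
    """Spreadsheet-like view: one row per subject, one column per predicate."""
    subjects = sorted({s for s, _, _ in graph.triples})
    grid = []
    for s in subjects:
        row = [s]
        for p in predicates:
            values = [o for _, _, o in graph.match(s, p)]
            row.append(", ".join(values))  # empty cell when there is no value
        grid.append(row)
    return grid


NAME = "http://xmlns.com/foaf/0.1/name"
KNOWS = "http://xmlns.com/foaf/0.1/knows"

g = Graph()
g.add("http://example.org/alice", NAME, "Alice")
g.add("http://example.org/alice", KNOWS, "http://example.org/bob")
g.add("http://example.org/bob", NAME, "Bob")

for row in as_grid(g, [NAME, KNOWS]):
    print(row)
```

The awkward bit the grid exposes straight away is that the columns have to be chosen by someone, since the data itself can be any shape.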
I doubt it's possible to say up front what would be a good user-friendly way of setting this stuff up. But given a bunch of scripts that supported these elements, I reckon with a bit of trial and error dogfood use, within a few iterations something really useful could be possible.
Thoughts? Volunteers? Startups? :)
I've still not got commenting set up here so please post any feedback to this Google Plus entry.
2011-08-26T10:49:28+01:00
gui browser ui spreadsheet semweb rdf data linked
Related
Comments
Magnificent Seven for APIs
Some interesting survey results have just been published about APIs: the good, the bad and the pains. I commented about this on G+ and the discussion there got on to Atom. Some interesting points made, including the likelihood that we're stuck with snowflake APIs (every one is different) for the foreseeable future. I think it was Bill de hÓra who had a post years ago (can't find it now) about the N x N problem of diverse APIs (/models/formats). Essentially if you've got N different APIs then to connect them all you need N x N different translators. But it's also worth noting that this can be reduced to 2 x N if you have mappings to a common format/model. I reckon recent history has shown that formats are secondary, assuming certain boxes are ticked (see below). Regarding the model - there is a well-known, Web-friendly one. So here I'll simply point to ConverterToRDF and ConverterFromRDF.
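The arithmetic is easy to check with a sketch. Counting strictly, direct translation between every ordered pair of N formats needs N x (N - 1) translators (roughly the N x N above), while going via a common model needs one converter in and one out per format. The formats "a" and "b" and their field names are hypothetical, and a plain dict stands in for the common model.

```python
def pairwise_translators(n):
    """Direct translators needed between every ordered pair of n formats."""
    return n * (n - 1)


def hub_translators(n):
    """Converters needed when every format maps to and from one common model."""
    return 2 * n


# Hypothetical formats "a" and "b", each with one converter in and one out.
to_common = {
    "a": lambda data: {"title": data["name"]},
    "b": lambda data: {"title": data["label"]},
}
from_common = {
    "a": lambda common: {"name": common["title"]},
    "b": lambda common: {"label": common["title"]},
}


def translate(data, source, target):
    """Translate between any two formats by passing through the common model."""
    return from_common[target](to_common[source](data))


for n in (3, 10, 50):
    print(n, pairwise_translators(n), hub_translators(n))

print(translate({"name": "Seki"}, "a", "b"))
```

Adding a third format means writing two more converters, after which it can talk to every format already registered.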
In the G+ discussion Bill referred to an old blog post of his, Magnificent Seven - the value of Atom. In it he highlights the 7 'primitives' that Atom (format and protocol) uses and that he suggests should be used in any carrier format. I'm inclined to agree, if you are creating an API, tick these boxes, repeated here without Atom-specificity:
- ID - a globally unique identifier for the chunk of data, ideally this should be a HTTP URL
- Link - as above, it's rare that a separate ID and URL are needed
- Updated - the most recent change, invaluable for keeping things in sync
- Extension rules (mustIgnore, foreign markup) - anything the parser doesn't understand, it simply ignores. This allows other people to reuse and extend the format in a compatible fashion.
- Date construct rules - using a standard date format is basic politeness
- Content encoding rules - generally follow the rules for the media type you're using, and if there's textual content use an existing standard format (XHTML is good). Rule of thumb: UTF-8.
- Unordered elements - insisting on order in the structure is (or at least should be) unnecessary, accessing things by name is more reliable
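A few of those boxes can be shown in a handful of lines. This is a minimal sketch of a consumer, assuming a JSON-like entry with hypothetical field names (id, updated, content): it ignores anything it doesn't understand, parses a standard date format, and accesses everything by name rather than position.

```python
from datetime import datetime

# Field names are assumptions for this sketch, not taken from Atom or any spec.
KNOWN_FIELDS = {"id", "updated", "content"}


def parse_entry(raw):
    """Keep the fields we understand and silently ignore the rest (mustIgnore)."""
    entry = {key: value for key, value in raw.items() if key in KNOWN_FIELDS}
    # A standard date format means the standard library can parse it.
    entry["updated"] = datetime.fromisoformat(entry["updated"])
    return entry


raw = {
    "content": "Hello",                       # order doesn't matter, access is by name
    "x-vendor-rating": 5,                     # foreign markup, ignored by this parser
    "updated": "2011-08-13T09:42:21+01:00",
    "id": "http://example.org/entries/1",     # HTTP URL doubling as ID and link
}

entry = parse_entry(raw)
print(entry["id"], entry["updated"].isoformat())
```

Someone else's extension field passes straight through a parser like this without breaking it, which is what makes compatible reuse possible.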
The most significant bit is the ID/Link, this is essential for any API on the Web. It allows the use of the "follow your nose" protocol: if you want any more information about a thing, follow the link. It works for regular Web documents and increasingly for linked data.
Incidentally (1), if you are an API developer/user you may like to have a look at the Linked Data API, looking at what's needed to make access to data in a SPARQL-capable store more developer-friendly. Comments welcome there.
Incidentally (2), Google Plus is emerging as a pretty good discussion space, if you're in need of an invite mail me.
2011-08-13T09:42:21+01:00
apis atom federated json rdf
Related
Comments
GTD meets the Dice Man
Not for the first time, recently I've been having trouble Getting Stuff Done. In the past I've tried various strategies along with lots of little bits of software that are meant to help. It's never been particularly successful. But pretty much all those techniques have involved starting with the strategy and applying it to your own needs. This time around I thought I'd try going the other way. Start with what I want to do, try and identify the problems I have, develop a strategy from there.
The stuff I want to do generally falls into three categories: what I call work-work, i.e. the stuff that pays the bills; personal projects, which includes things like coding, woodcarving and doing music stuff; chores - a wide range of things from washing up to gardening. I am still in the process of renovating a house, and jobs that need doing there are mostly good fun once I get into them. But I've put them in the chores category because they are things I feel I must do (unlike personal projects), but without any great urgency (unlike work-work).
I can sum up the problem I have with each category pretty easily:
- work-work : procrastination
- chores : laziness
- personal projects : distraction
47 years of assorted neuroses mean whatever psychology is behind these is anyone's guess, but I'm pretty sure each of them can be in either a vicious or virtuous cycle. If I can bump a little from the former to the latter then winning (as Charlie Sheen would say). So how can I deal with this strategically? Here's what I came up with.
work-work : the one thing I can't let slip (although still manage to), so that has to be my default activity before I think about anything else. But if I've got other stuff to look forward to (and off my mind) and am making reasonable progress with things, it should be easier to get down to it. Just have to make sure I get started in the morning...
chores : I reckon a lot of the problem I have with these is that there's always so much to do, so I feel swamped and stressed about them, only finding relief by putting my feet up in front of the tv, ideally with a bottle of wine. So for this category I've decided to take a leaf out of the GTD book - write stuff down and forget it until scheduled. I've got two whiteboards, the one on the left with a list of things I need to get done in the next week or so, the one on the right things to do today. I would guess the tasks probably average to a couple of hours each, some much shorter (e.g. bins, ~10mins) some much longer (e.g. making windows, ~20hrs). With these kinds of things it tends to be the case that once started, there's a tendency to keep going until next meal time. Whatever, if I can get at least half an hour done of each, I'll call that a success for the day. To give me a sense of progress I'll just cross things off as I get them done, only wiping them from the board when space is needed.
personal projects : this is a tricky category, for an unlikely reason. I reckon I probably spend about the right amount of time on these things. Problem is that it's very unfocused and I'm always ready to go off on a tangent on a whim. As well as hopping from project to project I'm always coming up with new ones, long before existing ones get finished. Big trail behind me. With things like woodcarving I don't think I'm too far from where I want to be (I may try the following strategy there as well). But with programming and the like, it's pathological. Fortunately there is usually a lot of common ground to software projects I play with, typically the Web and RDF-related stuff. Often work on one thing will help with another. So I've picked the 6 main projects I've got on the go and numbered them on a sheet of paper. Actually 'project areas' would be more accurate, with most of them there's loads of wiggle room within the same umbrella (not sure if that's mixing metaphors, certainly sounds kinky :)
Now when I feel it's time for a session on a coding project I'll roll the dice, consult the list and concentrate on that project until the next big natural break. Chances are there'll be a change at the next roll of the dice, I'm hoping that'll be enough to stop me project-hopping.
Anyhow, I'll give this a few weeks, see how it goes.
Incidentally I've had a Semantic Web-oriented project management tool on my todo list for years now. I've pushed that back on the big hidden stack for now, with a Personal Knowledgebase being the first goal. Given a decent one of those, personal project management stuff should be a smooth extension. If by then I still need it...
This post was brought to you by the number 2.
2011-08-10T23:49:08+01:00
gtd projects dice rdf
Sitemap notes
Today I added a sitemap to this blog. Some notes-to-self.
Not sure what inspired me to do this, but I have been wanting a complete list of blog post URIs for a while to play around with augmenting the data (e.g. pulling out the links contained within posts and grabbing more info about them).
Blog engine general setup
HTTP requests first go through Apache: if there's a file on the filesystem that matches the request, that is returned. If not, the request gets transparently forwarded to an instance of Gradino running on port 8080. The request gets dispatched through jax-rs to the appropriate handler in the code (most of the code is in Scala but using various Java libs). All the blog data is stored in a Jena TDB triplestore. When a request is made a SPARQL query is run programmatically against the store. The results are formatted as appropriate using a little crude templating. Results for the front page and feed are both cached as in-memory strings.
Adding sitemap generator
So for the sitemap, first pass I set things up in the same fashion as the front page and feed are generated, just without a LIMIT on the SPARQL query. This wound up making Apache give a proxy error, not sure exactly why (for some reason error messages didn't show) but it seemed reasonable to assume that it was somehow related to the quantity of results, maybe a silent timeout. I've got archives in the store going back years, my current query (excluding everything with "comment" in the URI) produces just over 5,500 results.
So then I decided to modify things to generate a static file when a POST was received at a particular URL. I should have seen this coming, but my initial attempt at this also gave a proxy error. D'oh! Performance-wise it was effectively the same routine running in the same thread.
But I was able to get it working by making the sitemap generator class a Scala Actor. When the appropriate POST is received, the handler creates a new instance of the Actor and sends it a message, but then continues along the original thread, returning an "ok" message to the browser.
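The same hand-off pattern, sketched in Python rather than with Scala Actors (so this is not the Gradino code, just an illustration; the function names and the example.org URLs are made up): the handler starts the slow job on a separate thread and returns "ok" straight away.

```python
import threading

def generate_sitemap(urls, out):
    """Slow job: build one sitemap entry per URL and store them in `out`."""
    out["lines"] = ["<url><loc>%s</loc></url>" % u for u in urls]
    out["done"].set()  # signal that the job has finished

def handle_post(urls, out):
    """Request handler: start the job on another thread and reply at once."""
    out["done"] = threading.Event()
    worker = threading.Thread(target=generate_sitemap, args=(urls, out))
    worker.start()
    return "ok"  # returned without waiting for the worker

result = {}
reply = handle_post(["http://example.org/a", "http://example.org/b"], result)
result["done"].wait(timeout=5)  # only this demo waits; a real handler would not
```

The point is only that the response no longer depends on how long the generator takes, which is presumably why the proxy error went away.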
Along the way I evolved what I reckoned was most suitable to put in the sitemap file. The blog front page just uses the core sitemap terms, and this is hard-coded:
<url>
<loc>http://dannyayers.com/</loc>
<changefreq>daily</changefreq>
<priority>0.9</priority>
</url>
Initially I had individual posts using the News sitemap terms, until just now I noticed that they are only for things that change a lot... So instead they just look like this:
<url>
<loc>http://dannyayers.com/2003/05/12/bufo-bufo/</loc>
<lastmod>1970-01-01T01:00:00Z</lastmod>
<changefreq>monthly</changefreq>
</url>
I've left it as monthly in case I want to change any of the template of the individually rendered pages, but reindexing isn't really a priority once the content text has been looked at.
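For the record, a rough Python sketch of producing entries shaped like the ones above (the real generator is Scala over SPARQL results; build_sitemap, the entry dicts and the lastmod value here are illustrative assumptions, only the sitemaps.org namespace and element names are standard):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """entries: dicts with 'loc' and optional lastmod/changefreq/priority."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit child elements in the order the sitemap schema lists them.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = entry[tag]
    return ET.tostring(urlset, encoding="unicode")

sitemap_xml = build_sitemap([
    {"loc": "http://dannyayers.com/", "changefreq": "daily", "priority": "0.9"},
    {"loc": "http://dannyayers.com/2003/05/12/bufo-bufo/",
     "lastmod": "2003-05-12", "changefreq": "monthly"},  # lastmod is made up
])
```

Going through an XML library rather than string concatenation means any awkward characters in URIs get escaped, which is one class of little error avoided.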
Next I guess I should look at Semantic Sitemaps.
I'm typing this as yet another version of the generator code is running, I've kept making little errors that only show up when I point Google at the sitemap file... But if you're reading this then Google is happy with the current version :)
2011-07-24T20:59:40+01:00
seo sitemap catalog gradino rdf
Schema/Vocab Mapping toolkit
Olaf Hartig has pointed me to the R2R Framework :
[[
The R2R Framework enables Linked Data applications which discover data on the Web, that is represented using unknown terms, to search the Web for mappings and apply the discovered mappings to translate Web data to the application's target vocabulary. The R2R Framework is aimed to be used by Linked Data publishers, vocabulary maintainers and Linked Data application developers. It support them by:
1. providing the R2R Mapping Language for publishing fine-grained term mappings on the Web
2. defining best-practices on how mappings can be discovered by Linked Data applications
3. providing an open-source implementation of the R2R Mapping Engine.
]]
2011-07-22T09:47:48+01:00
schema linkeddata r2r lod rdf vocab mapping
Protocol
Sasha and Primo demonstrate a combination of "follow-your-nose" and authentication:

2011-07-21T19:52:53+01:00
primo federated dog sasha rdf protocol cat
Translating between Schema.org and existing RDF
I just heard about a mapping from selected schema.org terms to SIOC (including a little FOAF and DC), it's a handful of statements using RDFS and OWL.
As one of the people responsible for the RDF Review Vocabulary I thought I should take a look at what's needed there. It raises a couple of questions. As it happens, schema.org's model for reviews is a little more complex than our RDF vocab (after we went to all that trouble to keep it simple :), and a lot of the terms can't be mapped directly to well-known ones with RDFS/OWL.
e.g. in the RDF vocab:
<#something> :rating "5" .
using schema.org:
<#something> :reviewRating <#theRating> .
<#theRating> :ratingValue "5" .
So the first question is how best to express this? I think it's straightforward in SPARQL, e.g. for RDF vocab to schema.org for the terms above something like:
CONSTRUCT {
?something s:reviewRating [ s:ratingValue ?value ]
} WHERE {
?something r:rating ?value
}
Or would some particular rule language be more appropriate?
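To make it concrete what that CONSTRUCT does, here is a plain-Python imitation over triples held as tuples. It's a sketch, not a SPARQL engine: the prefixed names are just strings, I'm assuming both schema.org terms sit under the s: prefix, and map_ratings and the r:title triple are invented for the example.

```python
import itertools

R_RATING = "r:rating"                # term in the RDF Review vocab
S_REVIEW_RATING = "s:reviewRating"   # schema.org terms (prefix assumed)
S_RATING_VALUE = "s:ratingValue"

def map_ratings(triples):
    """Mimic the CONSTRUCT: one fresh blank node per matched r:rating triple."""
    counter = itertools.count()
    out = []
    for subject, predicate, value in triples:
        if predicate == R_RATING:
            bnode = "_:b%d" % next(counter)
            out.append((subject, S_REVIEW_RATING, bnode))
            out.append((bnode, S_RATING_VALUE, value))
    return out

mapped = map_ratings([
    ("<#something>", "r:rating", '"5"'),
    ("<#something>", "r:title", '"Nice"'),  # not matched, so not copied
])
```

Note that, like CONSTRUCT, this only emits the mapped triples; anything the pattern doesn't match is dropped rather than passed through.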
The next question is how best to publish the mappings?
Where direct RDFS/OWL translation is possible, I think I'd be inclined towards including them in the RDF vocab, or at least linked via an rdfs:seeAlso.
Where rules are necessary, as above, I really haven't a clue. A simple online reasoning service could be useful (via some pre-cooked SPARQL maybe), but again how would you express that in the vocab?
I doubt I'll have chance to make the translations for Review in the near future (one contract I'm 7 weeks past deadline, another 3 weeks overdue, a couple of days overdue with those paper reviews...). But hopefully the lazyweb will be able to answer these questions in the interim.
That's a point - anyone fancy reimplementing lazyweb.org? It was very sweet, I think Ben only gave up on it due to lack of time to admin.
Oh yeah, and I still haven't implemented comments here yet, so for now please use email, Twitter, Facebook or Google+. (ooh, the Google+ link works properly for comments, but I guess non-G+ users can't comment...let me know if you want an invite)
2011-07-21T19:23:30+01:00
sioc foaf schema reviews schema.org rdf mapping
Translating between Schema.org and existing RDF
beh, this post has a trailing space on the title - see : http://dannyayers.com/2011/07/21/Translating-between-Schema.org-and-existing-RDF
2011-07-21T18:31:05+01:00
schema vocabs schema.org review rdf
FSW SFW?
See http://dannyayers.com/2011/07/20/FSW-SFW
[I've not got any handling in for ? on the end of titles in my blog engine...sorry if this post appears twice]
2011-07-20T11:57:21+01:00
federated social web rdf fsw2011
FSW SFW
[oops, I've not got any handling in for ? on the end of titles in my blog engine...sorry if this post appears twice]
As usual, after the Federated Social Web meet in Berlin I'd planned to write comprehensive blog posts about it. As usual I didn't get far before getting distracted. So far I've done a bit of overview of the conf, a brief note on privacy issues and a fairly random think-piece on decentralized vs. distributed networks. But I haven't actually covered what were probably the two main take-aways from the conf - Federated Social Web stuff itself and the role of WebID. In lieu of something better I'll drop a few key links in now. In both cases things have moved along very quickly in the past few weeks with Google+ and BrowserID, more on those in a mo.
FSW
One big meme was that of the Facebook-killer - basically we need something that has all the user-friendliness of Facebook but not as a walled garden (and with a better story on privacy etc). Step forward Diaspora - you can use it as a service a la Facebook (with which it shares many features), but also set up your own install. There were also a handful of other apps with a similar style. It took me about half an hour to set up my own install of Status.Net, essentially an open version of Twitter, though I have yet to start using it and, probably more significantly, yet to connect it up to the other services I use.
Another pointer I must include is to the W3C Federated Social Web Incubator Group. As the charter describes, its scope is pretty wide, including the various emergent protocols and technologies in this space. One of the initial targets is to move forward the Social Web Acid Test - Level 0 (SWAT0) - an integration use case for the federated social web. On the Wiki there are potential use-cases or user-stories that could become part of SWAT1. They're both fairly short so I'll paste SWAT0 and the list of non-W3C technologies from the charter below. The incubator group is encouraging people to join, so if you're interested in this material please sign up.
WebID
To quote from the WebID site, "With WebID, logging into a website is as simple as selecting a WebID and clicking 'log in'". It's a very nifty bit of tech, secure, relatively straightforward to implement, much simpler than most of the alternatives. In essence it's about passing a URI in with a PKI certificate. When Henry presented this at the conf, the audience response was interesting. Although it isn't rocket science, the certificate stuff used isn't very intuitive (personally I have a blind spot on all things auth), so not everybody got it. Of those that did get it, very few could believe what it provided. A question from the audience was telling : "What can be easier than using username + password to log in?". Henry : "One click.".
Although not critical to the functioning of WebID, one of the coolest aspects is that it cleanly supports FOAF (and other) profile discovery, the service can learn more about the user to improve their experience. In other words it's entirely compatible with the Semantic/Linked/Data Web.
WebID was initially known as FOAF+SSL. There are lists of implementations etc. on the Wiki (oh, and also here). Watch the video and read the notes from Berlin for more.
There's also a W3C WebID Incubator Group.
...
Videos of presentations of the FSW meet in Berlin are online, along with most of the papers.
Google+
Before going any further, I should remind you that we already have a Federated Social Web, the blogosphere. However this is weak on many aspects - the social graph is fairly inaccessible, UIs are often poor (feed aggregators in particular are clunky things), immediacy is seriously lacking, identity management is messy, as are the personal profiles that exist, and privacy, auth and access control systems are virtually non-existent. Of course all that has left a convenient niche for Twitter, Facebook, and now Google+.
I largely agree with Edd in his must-read blog post, Google+ is the social backbone. As a competitor to Facebook it does open up the social aspects as a commodity, and it's considerably more open and linkable, i.e. Webby (here's my stuff). I do worry about Google becoming all-powerful in this space, but as they say this too shall pass. I personally believe the nature of the Web is such that any attempts to monopolise or centralize systems will inevitably fail - because decentralized/distributed systems have inherent evolutionary advantages, though they may take time to take effect. So I reckon Google+ should be viewed by Web technologists not as an end in itself, rather as a bootstrap to a more social Web.
Although Google+ doesn't have any Semantic Web features per se, it does a reasonable job of giving people URIs and linking them together. But rather than a niche, there's a gaping void for describing things in general in a machine-friendly form. Whether RDF-oriented linked data activity will expand to fill this void or some Googlesque reinvention (cf. microdata overlords) of RDF remains to be seen, but either way this also seems inevitable (see also Smarter (Hash)Tags and Google+). I'm not sure we're seeing it yet, but with a bit of luck, once the commercial world sees the SEO etc advantages, GoodRelations should cause a large expansion of semwebbiness.
BrowserID
BrowserID is a recent development from Mozilla. It's close to WebID in that it's in the identity space and about secure signing in, but arguably the primary goal is somewhat different. Broadly speaking, it boils down to the payload of WebID being a URL and the payload of BrowserID being an email address. Discussion is ongoing about the (/any) relationship between the two protocols. All other considerations aside, I'd suggest that WebID is more versatile, in that there's a lot more you can do with a URL than an email address, whereas BrowserID is easier to integrate with existing email-based auth, so there's better impedance matching with existing systems. I've tried to argue that BrowserID should allow the user to associate a (non-secret) URL with their email address to allow profile discovery etc. But consensus seems to be that keep-it-simple now trumps easier stuff later (WebFinger has been suggested as the route to discovery, I'm not altogether convinced as it's quasi-centralized, requiring a service to assert the email/URL mapping). Whatever happens on this particular point, BrowserID is certainly an interesting and useful development.
- - - -
SWAT0 Use Case
- With his phone, Dave takes a photo of Tantek and uploads it using a service
- Dave tags the photo with Tantek
- Tantek gets a notification on another service that he's been tagged in a photo
- Evan, who is subscribed to Dave, sees the photo on yet another service
- Evan comments on the photo
- David and Tantek receive notifications that Evan has commented on the photo
FSW-related Technologies
- ActivityStreams
- ActivityStreams is an evolving format for syndicating social activities around the web.
- OpenID Foundation
- The OpenID Foundation is the group responsible for OpenID-related standardization. Although work like OpenID Connect is a moving target, the test-cases and specification should be compatible with OpenID.
- OStatus
- OStatus is an architecture combining Pubsubhubbub, WebFinger, ActivityStreams, and PortableContacts.
- Portable Contacts
- The goal of Portable Contacts is to make it easier for developers to give their users a secure way to access the address books and friends lists they have built up all over the web.
- Pubsubhubbub
- Pubsubhubbub (PUSH) is a server-to-server publish/subscribe protocol as an extension to Atom and RSS. Servers compliant with PubSubHubbub can get near-instant notifications when a feed they're interested in is updated.
- Salmon Protocol
- As updates and content flow in real time around the Web, conversations around the content are becoming increasingly fragmented into individual silos. Salmon aims to define a standard protocol for comments and annotations to swim upstream to original update sources -- and spawn more commentary in a virtuous cycle.
- SMOB
- SMOB (Semantic MicroBlogging) is a framework that enables an open, distributed and semantic microblogging experience based on Semantic Web and Linked Data technologies.
- Webfinger
- WebFinger is about making email addresses more valuable, by letting people attach public metadata to them.
2011-07-20T11:55:44+01:00
federated social web rdf fsw2011
The Symbiotic Web
During the Federated Social Web meetup in Berlin a few weeks ago, most folks used the phrases "distributed network" and "decentralized network" interchangeably, which doesn't seem unreasonable at this point in time when both appear in major contrast to the prevailing "centralized network" architecture of Web sites. On my last night in Berlin, on the steps of a crashed space station at around 4am (early flights to catch) I was chatting with Harry Halpin and he had the following diagram on his netbook:

It's from Paul Baran's landmark memo from 1964, "On Distributed Communications: 1. Introduction to Distributed Communications Networks" (see also some related network diagrams), some of the work which eventually led to the development of the Internet.
Harry was quite insistent on the significance of the "decentralized" net, saying that it was the one you found in nature (e.g. plant structure). I suggested that "distributed" looked a lot like (a 2D representation of) biological cell structure. That wasn't a very satisfactory analog, and since then I've had my eyes open for a good natural world example of "distributed". Now I think I have one, and while it's in a different dimension than e.g. plant structure I reckon it maps quite nicely onto Web systems.

(in the woods up the hill)
To quote Wikipedia:
Lichens are composite organisms consisting of a symbiotic association of a fungus (the mycobiont) with a photosynthetic partner (the photobiont or phycobiont), usually either a green alga (commonly Trebouxia) or cyanobacterium (commonly Nostoc). The morphology, physiology and biochemistry of lichens are very different from those of the isolated fungus and alga in culture.
Now imagine how these things might have evolved. Initially there must have been an inheritance tree for the fungi and an independent tree for the algae (following the "decentralized" form), but then at some point the organisms started to get benefit from each other (I am not a microbiologist, but I'd guess that it probably started as a parasitic relationship, then the host side evolved some advantage). So there's a structure something like this:

The tree has become a graph. [PS. ok, strictly speaking a tree is already a graph, but you know what I mean]
Analogies get useful when you can use known aspects of one perspective to predict unknown aspects of the other (like the weird old alchemists' "As Above, So Below"). I don't know, vague hand-waving, maybe the nutrient molecules the fungi handle in lichen could be said to correspond to data, the photosynthesis of the algae corresponding to processing.
While clear-cut symbiosis like this isn't exactly the most common relationship in nature, there's obvious interdependence between every kind of organism on this planet. I don't think it's much of a stretch to suggest there are good parallels with Web systems, especially if you view the interfaces between organisms and their environment as corresponding to APIs between online systems. Certainly client tools and services (agents, in other words) correspond nicely to organisms.
The Web of Data (alongside the Web of documents) is already pretty distributed, the Linked Open Data cloud diagram being a nifty representation. This aspect of the Web isn't in itself particularly dynamic in its operation (data usually just sits there, periodically updating). But given the number of processors connected to the Web as servers and clients, the digital environment certainly has the potential for extremely interesting interactions.
2011-07-15T15:06:41+01:00
federated networks decentralized lichen rdf distributed
HTML5 Kitten-Herding
Mailing lists again, Facebook doesn't have a monopoly on the notion of "poke", you provide a tiny bit of impetus and get valuable results. This time Sam Ruby, one of the smartest guys I've encountered who isn't a rabid RDF fan (heh). How HTML5 process works:
[[
*) If you have a problem with the process, participate in bug reports on the process.
*) If you have a technical problem with a given change, start by identifying the technical problem.
*) If you want to be impolite on twitter or IRC, I won't be responsible for policing such.
*) Outbursts on public-html will first be dealt with privately when possible, and will be dealt with publicly when necessary.
]]
2011-06-20T14:33:58+01:00
html html5 rdf
WebID for Pets
I challenged Henry to explain WebID in a form my dogs would understand, I think his response is worth sharing:
[[
Ok. So you need to give each of your dogs and cats a webid enabled RDFID chip that can publish webids to other animals with similarly equipped chips when they sniff them. From the frequence and length of sniffs you can work out the quality of the relationships. On coming home for food, this data could be uploaded automatically to your web server to their foaf file. These relationships could then be used to allow their pals access to parts of your house. For example good friends of your dog, could get a free meal once a week. You could also use that to tie up friendship with their owners, by the master-of-pet relationships, and give them special ability to tag their pet photos. Masters of my dogs friends could be potential friends. If you get these pieces working right you could set up a business with a strong viral potential, perhaps the strongest on the net.
]]
- and bonus points for attaching a photo of a cute kitteh.
2011-06-20T13:44:34+01:00
rdf
Personal Geo
It's early this year, but it's Wakes Week in Tideswell. This is when we make kings into fools, fools into kings. Always many fools, no kings. The local church (Cathedral of the Peak) is of John the Baptist, whose day funnily enough coincides with the summer solstice. We dress wells (flower pictures in clay) because water has been a bit important for people, sussed in early neolithic. A gathering of clans with lots of beer. We have our own tune and we have our own dance. We bless the ground and we bless the sun, we bless the moon and we bless the sheep shit. We take to the streets to reclaim the world we're from. There's a brass band, in the brass band there's a big bass drum. At the appointed time he goes dum-dum. Then the ritual dance begins, torchlight through the streets. Ending at the fair, now in the recreational ground, reminding some of us that it is still the 1970s. Last few pints at the Club, back to bed knowing the sun will keep going for another year.
Can't find an accurate youtube of Tidza Band, this might even be Cressbrook band:
http://www.youtube.com/watch?v=iMIAffb6SCY
2011-06-19T05:01:43+01:00
tidza rdf culture
Blank (node) Verse
Catching up on the HTTP-range-14 thread on the lod mailing list, in a robust response to cygri's robust brand of pragmatism, I couldn't help but notice timbl inadvertently (?) slipped into poetry here:
Formalisms aren't smart.
Sure, I can make a program to make sense of that.
But I'm not going to just to save you the effort of getting it right.
Disappointed by the intensity of your posting.
Systems have managed for a long time to distinguish between library card and book,
between message header and message,
between a book and its subject.
Now we have masses of information about many books
and about many other things we have great value in it
Let's not mess it up.
If you want an ambiguous source of information, use natural language.
The power of data is that is a whole lot less ambiguous.
----
As for the discussion I was having with Pat Hayes, I'm more than happy to call it a day now Pat's acknowledged (despite my communication failures) :
The document is a valid *representation* of the car, yes of course.
2011-06-18T21:21:24+01:00
poetry rdf
httpRange-14 Reflux
Back in 2002, the following issue was put before the TAG:
httpRange-14: What is the range of the HTTP dereference function?
TBL's argument was that HTTP URIs (without "#") should be understood as referring to documents, not cars.
By 2005 a resolution was accepted. If a GET is done and the thing being referred to isn't a document, then a 303 redirect should be used to provide something which is a document. As these things go, this is quite an elegant solution. Additionally it's accepted practice to use #-URIs for things which aren't documents. However, both approaches have their problems, many of which are listed in Providing and discovering definitions of URIs.
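A minimal Python sketch of the 303 pattern, just to pin down the behaviour. The paths, the two lookup tables and the respond function are all hypothetical; a real server would do this in its routing layer.

```python
# Hypothetical mapping from URIs of non-document things to documents about them.
DESCRIBED_BY = {"/id/car": "/doc/car"}
# Hypothetical documents the server can return directly.
DOCUMENTS = {"/doc/car": "<html>A page about the car</html>"}

def respond(path):
    """Return (status, headers, body) for a GET on `path`."""
    if path in DOCUMENTS:
        # The URI names a document, so serve it directly.
        return 200, {"Content-Type": "text/html"}, DOCUMENTS[path]
    if path in DESCRIBED_BY:
        # The URI names a thing: point the client at a document about it.
        return 303, {"Location": DESCRIBED_BY[path]}, ""
    return 404, {}, ""

status, headers, body = respond("/id/car")
```

A client that follows the Location header then gets a 200 for /doc/car, so it can tell the car apart from the page describing it, at the cost of an extra round trip.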
But I liked the 303 approach, and did my share of grumbling when things like Microformats and OpenID appeared to conflate the notion of a person with their home page, using the same URI for both. Now schema.org conflates lots of different kinds of things with documents. So while I still believe the TAG resolution works well I personally, finally feel we have to take into account that people won't make this kind of distinction. Others have already argued for pragmatism around the issue - e.g. see linking things and common sense and Back to Basics with Linked Data and HTTP. However it's hard not to see a conflict when (e.g.) RDF says "http://example.org/fred is a person" and in effect HTTP says "http://example.org/fred is a HTML page". Not pretty. On the lod list I just had a go at a conceptual model that avoided such a conflict, but I suspect so far I've only managed to persuade Pat Hayes that I'm barmy. So I'll have another go here with these arguments:
1. (solid) : HTTP doesn't have a notion of a "complete" representation of a resource. A photo of a car could reasonably be served as a GIF or lossy-compressed JPG image. The difference here as far as HTTP is concerned is just in the media type expressed by the Content-Type header.
2. (adequate) : a resource may have representations reasonably served with very different media types. Here I'm thinking of, say, a text version of a photo of a car. It may sound clunky, but there are good precedents: an RDF version of My Home Page can vary widely in its information content from a HTML version of My Home Page. In the HTML format, we have img alt="text". In both cases the assertion is made by the publisher that in some sense the different version can act as a useful alternative.
3. (strong enough IMHO) : a description of a resource can be considered a representation of that resource. On list I suggested there was some isomorphism between a description and a thing. Pat didn't accept that at all, but did say "there can indeed be correspondences between the syntactic structure of a description and the aspects of reality it describes". I'd suggest that's near enough. Those correspondences could be said to make the description a representation in the same way a lossy-compressed version of a photo can still share the same URI as an uncompressed version. As long as an appropriate Content-Type is used.
Given these three, a HTTP URI can simultaneously be understood as referring to a document and a car.
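The three points above lean on one URI having several representations that differ only in media type. A rough Python sketch of that idea follows. It's deliberately crude: it ignores q-values and wildcards like image/*, and the representations, the example.org URIs and the negotiate function are all invented for illustration.

```python
# Hypothetical representations of a single resource, keyed by media type.
REPRESENTATIONS = {
    "image/jpeg": b"JPEG-BYTES-PLACEHOLDER",
    "text/plain": "A red car, photographed from the front.",
    "text/turtle": "<http://example.org/car> a <http://example.org/Car> .",
}

def negotiate(accept_header, available=REPRESENTATIONS):
    """Pick the first listed media type we can serve (q-values are ignored)."""
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in available:
            return media_type, available[media_type]
        if media_type == "*/*":
            # Client takes anything: fall back to the first one we have.
            first = next(iter(available))
            return first, available[first]
    return None, None  # a real server would answer 406 Not Acceptable

chosen_type, chosen_body = negotiate("text/turtle;q=0.9, text/html")
```

As far as the protocol goes, the photo, the text and the RDF are all on an equal footing here; the only thing distinguishing them is the Content-Type that would go out with the response.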
I'd better mention the barmy part. If HTTP did support transfer of matter, then as far as the URI referencing is concerned, all you'd be looking at is another media type. The example I used on list was of my dog Sasha, and given the above I'd suggest you could have various different representations of diminishing fidolity: Sasha herself; Sasha's description in DNA; a photographic description of Sasha; an RDF description of Sasha; a HTML description of Sasha... As I put it on list: you can't squeeze a dog over the wire with HTTP, but that's just a limitation of the protocol.
2011-06-15T16:18:22+01:00
uri tag http httpRange-14 rdf
On Constraints
Yesterday, re. Microdata, I loosely quoted Dan Connolly - here's the original:
Are there parts of traditional logic and databases that, if we set them aside, will result in viral growth of the Semantic Web?
From a slide, Logic, Databases and Scale dated 2006.
(thanks Dan)
2011-06-10T22:17:55+01:00
rdf
Privacy bullet points
Federated Social Web stuff.
It seems privacy can't really be pinned down, the definition is evolving. But you can effectively use a working definition (pick one).
Things are different depending where in the world you live.
Your average internet user hasn't a clue.
What's being leeched from your online activity - virtually nobody takes on the implications. But people are learning, they're as far as 1999.
Even when the browser vendors get together and make a button to limit things - still no-one gets the implications (see Aleecia at the link above).
Ok, so far this is mostly "duh!".
But there was a lovely little revelation (from Soren I believe) that statistically the people more aware of privacy tend to be those with more disposable income [bum, that's twitterable]. If you want this demographic's dollars in your consumer base, you better get your privacy sussed.
2011-06-09T22:31:14+01:00
federated social web rdf fsw2011
Federated Social Web
I've been out of the tech loop somewhat the past couple of years, and had decided not to go to conferences for a while. Ennui mostly. But when a Federated Social Web meet in Berlin showed up on the radar, it struck me I might get the shot in the arm I needed. Wasn't far off the mark. Berlin itself I found awesome, but right now I want to get down some notes on the conf. Falk (my new pen-pal) has a couple of overview posts. Good start Thursday night meeting up with Henry and a good crew. Friday morning I was tempted to sit in on the WebID WG but decided to leave them to it, relax in the hostel instead. That was until I got a ping from danbri, flying visit, unexpected f2f. Then the conf. proper started.
It opened with a pep talk from timbl via video link (captured by Dan Romescu, who has also written up the event). Nothing remarkable (aside from how hyper the man can be at 5am local or whatever :), just reinforcement that the notion of "Federated Social Web" is pretty much the same as Tim's notion of how the Web should be.
After that, all the stage stuff was captured on video by the organisers (bravo!).
For most of the presentations and discussion, Facebook was the mammoth in the room. The number of stones they've turned over regarding identity, privacy and Web-wiring is astonishing. But there are people generally very well aware of these issues, which was nice.
beh, I'm really struggling writing this up, I get to 135 chars and start counting. Have to do it PowerPoint. The first bullet:
Lessons learned from Social Networking in Egypt (Amr Gharbeia) is really a must-see. A lot of the media bollocks about Facebook and Twitter playing a role in recent Middle Eastern events was true.
A related must-see presentation happened after the FSW event, over at starship c-base. How some European hackers were able to get communications going again after a govt. had pulled the plug - go to about 1700 on the vid here at telecomix (so I'm told, not got bandwidth here to check :)
2011-06-09T21:00:19+01:00
federated social web rdf fsw2011
I, for one, welcome our new Microdata overlords
I don't think the approach taken by schema.org is the best one, far from it. But the Semantic Web has got quagmired so many times, while the rest of the world gets on and does cool stuff. I recently got irritated enough by the goings-on in the HTML5 WG to join [c.f. extensibility] but then realised I couldn't make any difference, certain patterns/arguments are set in stone. Facebook has taken the world by storm with ideas that were in FOAF years ago. Ball dropped.
I used to hold the opinion that the shortest path from A to B was usually the best one. But any path is better than spending more time wading the river.
Also Dan Connolly came out with a shrewd phrase like "so which constraints do we need to relax for this to go viral?"
And Michael's already sorted http://schema.rdfs.org, mappings to RDF schema :)
2011-06-09T17:35:17+01:00
rdf microdata
Dark Secret
These Web people are remarkably cool. The Semantic Web people especially so, all with dark pasts around AI or worse still, data. But humungously human. This is significant. Even Hixie (who I basically want to take into a dark room and stick needles in) is a good guy. Emergent community, something that has only happened in the past around popes. Tagging this post with RDF so that Northern Irish guy sees it [yeah you dajobe]. People being nice to people, trying to discover new things, there's nothing wrong here.
2011-05-28T00:41:42+01:00
rdf
Smell the coffee
I have a problem with the Semantic Web right now. We have plenty of solid specs (and SPARQL 1.1 as it's evolving). We have a lot of data online. The browser is getting smarter. Maybe it's because I don't trust myself with mobile devices, but I'm somehow missing the benefits all this should provide. The social side, which did seem to be doing very well with blogs and people setting up their own sites, seems to have disappeared up a cul-de-sac with Twitter and Facebook. This morning I'd have liked to have found a hay fever remedy online, and been able to contact some local drug suppliers to stop me sneezing. It seems like we are all playing with toy projects and can't grow up (yes, I'm still working on my Turtle editor). I'm aware there's a huge amount of pharmaceutical data online, but to stop my sneezing the best bet still seems to be Wikipedia followed up with personal visits to medics. I expect, no *demand* more from the technology, but right now it all seems a bit Dark Ages. We have moved closer to the Bazaar model over Cathedrals, but that just seems to mean people have an excuse to be sloppy (it's open source so it's not my problem any more). About say 5 years ago there was inspirational stuff coming from academia, now it seems to have dried up into a poor mimic of the commercial world. I've no real specific idea what is needed right now, but I suspect it looks a lot like a big kick up the backside.
2011-05-27T10:22:55+01:00
rant rdf
Another SPARQL solution
Bravo! A solution to the latest SPARQL puzzle.
@glenn_mcdonald found a way of getting the non-Roman-god solar system bodies:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn: <http://www.w3.org/2006/03/wn/wn20/schema/>
PREFIX id: <http://wordnet.rkbexplorer.com/id/>
SELECT DISTINCT ?planet WHERE {
?s1 wn:memberMeronymOf id:synset-solar_system-noun-1 .
?s1 rdfs:label ?planet .
OPTIONAL {
?s1 wn:containsWordSense ?ws1 .
?ws1 wn:word ?w .
?ws2 wn:word ?w .
?s2 wn:containsWordSense ?ws2 .
?s2 wn:hyponymOf id:synset-Roman_deity-noun-1 .
}
FILTER (!bound(?s2))
}
Isolating just the planets looks to be out of reach using the WordNet endpoint alone, but I guess that can be left as a challenge for federated query e.g. CONSTRUCTs from different datasets into a local store before SELECTing.
Update
From RobVesse -
Here's an even simpler query for yesterdays puzzle - still doesn't isolate real planets though
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn: <http://www.w3.org/2006/03/wn/wn20/schema/>
SELECT DISTINCT ?label WHERE {
?s1 wn:memberMeronymOf <http://wordnet.rkbexplorer.com/id/synset-solar_system-noun-1> .
?s1 rdfs:label ?label.
OPTIONAL {
?s2 wn:hyponymOf <http://wordnet.rkbexplorer.com/id/synset-Roman_deity-noun-1> .
?s2 rdfs:label ?label.
}
FILTER (!BOUND(?s2))
}
...plus...
Here's a soln using wordnet and dbpedia to show only planets not named after roman gods, requires a SPARQL 1.1 engine to run
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn: <http://www.w3.org/2006/03/wn/wn20/schema/>
SELECT DISTINCT ?label WHERE {
SERVICE <http://wordnet.rkbexplorer.com/sparql/> {
?s1 wn:memberMeronymOf <http://wordnet.rkbexplorer.com/id/synset-solar_system-noun-1> .
?s1 rdfs:label ?label.
}
MINUS {
SERVICE <http://wordnet.rkbexplorer.com/sparql/> {
?s2 wn:hyponymOf <http://wordnet.rkbexplorer.com/id/synset-Roman_deity-noun-1> .
?s2 rdfs:label ?label.
}
}
BIND(URI(CONCAT("http://dbpedia.org/resource/", ?label)) AS ?dbpResource)
Here's a suitable engine: Leviathan (a demo of the SPARQL Engine used in dotNetRDF).
2011-04-12T20:19:30+01:00
sparql puzzle rdf
Another SPARQL puzzle
Using the WordNet endpoint at http://wordnet.rkbexplorer.com/sparql/ I can get the names of the solar system bodies that are named after Roman gods with :
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn: <http://www.w3.org/2006/03/wn/wn20/schema/>
SELECT DISTINCT ?label WHERE {
?s1 wn:memberMeronymOf <http://wordnet.rkbexplorer.com/id/synset-solar_system-noun-1> .
?s1 rdfs:label ?label.
?s2 wn:hyponymOf <http://wordnet.rkbexplorer.com/id/synset-Roman_deity-noun-1> .
?s2 rdfs:label ?label.
}
The challenge is to get the names of the solar system bodies that aren't named after Roman gods. (Ideally I'd like planets in the solar system... rather than ...bodies, but I can't see a suitable class).
2011-04-11T20:42:00+01:00
sparql puzzle rdf
Linked Data One-Liner
A lot of information is merely On the Web when it would be more useful In the Web...
2011-04-04T03:44:39+01:00
linkeddata rdf
Puppies on the Web of Data?
Received via email:
Several years ago I bought a cocker spaniel puppy in Pleasant view Colorado. Are you the Ayers that sell Cocker Spaniel puppies? If so could you contact me anytime you have a litter with a tri-colored male in it both my son and myself are interested. If you are not the correct party I apologize for bothering you.
Nothing to suggest this isn't a legit enquiry. The immediate solution is "no", but I wonder how the machines might help solve it otherwise...
2011-03-30T19:09:28+01:00
puppies rdf
Pattern exclusion in SPARQL
Seconds after I twittered the last post, @LeeFeigenbaum responded.
Ok, so I have two patterns, and I want to find the statements that match either pattern but don't match both. The solution is rather a flexible little idiom for this kind of negation. The specific patterns are:
?set dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
and
?set a yagoc:ProgrammingLanguage106898352 .
(I'm running this against DBpedia)
Lee's solution is:
PREFIX dbpp: <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>
SELECT count(?set) where {
{
?set dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
OPTIONAL {
?set a ?marker .
FILTER(?marker = yagoc:ProgrammingLanguage106898352)
}
FILTER(!bound(?marker))} UNION {
?set a yagoc:ProgrammingLanguage106898352 .
OPTIONAL {
?set dbpp:wikiPageUsesTemplate ?marker .
FILTER(?marker = <http://dbpedia.org/resource/Template:Infobox_programming_language>)
}
FILTER(!bound(?marker))
}
}
(Note that COUNT isn't (yet) standard SPARQL, but seeing the size of the result sets was handy here).
It looks convoluted, but each half of the UNION is kind-of the converse of the other (and will give interesting results independently). I was a little surprised it did work as variables are scoped to the whole query and ?marker looked troublesome. But FILTERs are scoped to the local group, and that's where it matters here (it will produce the same results if you had a different variable for each half of the UNION).
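To see why the OPTIONAL / FILTER(!bound) idiom behaves as a negation, it can help to model it outside SPARQL. The following is only a sketch in Python, not part of Lee's answer: the two sets and their members are made up, and `optional_not_bound` is a hypothetical helper standing in for one half of the UNION.

```python
# Toy data: hypothetical resources, not real DBpedia results.
uses_template = {"Python", "Haskell", "Logo"}   # matches the infobox pattern
typed_as_lang = {"Haskell", "Logo", "Prolog"}   # matches the yago class pattern


def optional_not_bound(required, optional):
    """Mimic: required pattern . OPTIONAL { other pattern } FILTER(!bound(?marker))."""
    kept = []
    for item in sorted(required):
        # OPTIONAL acts like a left join: ?marker is bound only if the
        # optional pattern also matches this item.
        marker = item if item in optional else None
        if marker is None:          # FILTER(!bound(?marker))
            kept.append(item)
    return kept


# Each half of the UNION keeps what matches one pattern but not the other.
first_half = optional_not_bound(uses_template, typed_as_lang)
second_half = optional_not_bound(typed_as_lang, uses_template)
either_but_not_both = first_half + second_half
print(either_but_not_both)   # ['Python', 'Prolog']
```

Each call uses its own local `marker`, which loosely mirrors the point about FILTERs being scoped to their own group.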
There is something slightly odd happening in this particular case (or I'm missing something obvious). The figures I got before were 762 matches for the UNION of the two patterns, 178 for the intersection, so I'd have expected 762 - 178 = 584 results, but this gives 406. So there's a bit of sloppy QED around here. I was missing something obvious.
Lee again via twitter: the numbers look perfect to me - the 762 double-counts the 178 in the intersection. 406+178=584
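The arithmetic can be checked with plain sets. This is a sketch only: the integers below are synthetic placeholders sized to match the reported counts (336 and 426 for the two patterns, 178 in both), not DBpedia data.

```python
# Synthetic stand-ins sized to match the reported counts.
both = set(range(178))                          # match both patterns
set1 = both | set(range(1000, 1000 + 158))      # 336 typed as the yago class
set2 = both | set(range(2000, 2000 + 248))      # 426 using the infobox template

union_with_dupes = len(set1) + len(set2)   # what UNION without DISTINCT counts
distinct_union = len(set1 | set2)          # what adding DISTINCT would count
in_one_only = len(set1 ^ set2)             # either pattern but not both

print(union_with_dupes, distinct_union, in_one_only)   # 762 584 406
```

So 584 is the size of the de-duplicated union, and taking the 178 shared members out of that leaves the 406 the exclusion query reports.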
As @glenn_mcdonald and Lee have pointed out, a DISTINCT would fix my original UNION query to exclude the dupes. Glenn also offers a more concise version taking advantage of a Virtuoso feature:
PREFIX dbpp: <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>
SELECT count(?set1) where {
{
?set1 dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
FILTER NOT EXISTS {?set1 a yagoc:ProgrammingLanguage106898352}
} UNION {
?set1 a yagoc:ProgrammingLanguage106898352 .
FILTER NOT EXISTS {?set1 dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language>}
}
}
2011-03-28T19:35:15+01:00
negation sparql rdf
Long multiplication
Querying http://dbpedia.org/sparql
PREFIX yagoc: <http://dbpedia.org/class/yago/>
SELECT COUNT(?set1) where {
?set1 a yagoc:ProgrammingLanguage106898352 .
}
result = 336
PREFIX dbpp: <http://dbpedia.org/property/>
SELECT COUNT(?set2) where {
?set2 dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language>
}
result = 426
disjunction, language is in set1 OR set2
PREFIX dbpp: <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>
SELECT count(?s) where {
{
?s dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
} UNION {
?s a yagoc:ProgrammingLanguage106898352 .
}
}
result = 762
conjunction, language is in set1 AND set2
PREFIX dbpp: <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>
SELECT COUNT(?set) where {
?set a yagoc:ProgrammingLanguage106898352 .
?set dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
}
result = 178
PREFIX dbpp: <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>
SELECT count(?and) where {
{
?or dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
} UNION {
?or a yagoc:ProgrammingLanguage106898352 .
}
?and dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
?and a yagoc:ProgrammingLanguage106898352 .
FILTER(?and = ?or)
}
result = 356!?
Took me a long while to realise what that number represents, definitely time for a break...
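One reading of the 356, offered as an interpretation rather than something the post spells out: the UNION keeps duplicates, so every resource matching both patterns shows up twice as ?or, and the join with ?and then counts it twice (2 x 178). A small Python sketch with synthetic placeholder values sized to the counts above:

```python
# Synthetic stand-ins sized to match the counts above; not real DBpedia data.
both = list(range(178))                   # match both patterns
template_only = list(range(2000, 2248))   # 248 items
class_only = list(range(1000, 1158))      # 158 items

# UNION keeps duplicates: each branch contributes its own rows for ?or.
or_rows = (both + template_only) + (both + class_only)

# The two trailing triple patterns bind ?and to the intersection.
and_rows = both

# FILTER(?and = ?or) applied over the cross product of the two row sets.
matches = [(a, o) for a in and_rows for o in or_rows if a == o]
print(len(or_rows), len(matches))   # 762 356
```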
What I'm trying to find (if it's possible) are queries to look at the difference between the sets above, the 762 - 178 = 584 part. I'm hoping something along the lines of Finding Resources that don't have a certain property might work. If anyone knows an idiom that'll work (or knows that it isn't possible) please ping me.
2011-03-28T13:59:45+01:00
sparql puzzle rdf
4store on Ubuntu
Bad notes for future ref. It works! Last night I finally got around to sticking 4store (a properly scalable free RDF store) on my slicehost server, which is running Jaunty x86 64bit (uname -a, I always have to look that up), and today I got it on my laptop x86 32bit Maverick. I did things a little out of sequence from the 4store instructions - I hadn't seen the pointer to ready-wrapped Debian/Ubuntu packages, and didn't keep notes. But in each case I did install the latest Raptor and Rasqal from source, and a stack of common dependencies with apt-get/Synaptic, as listed on the 4store site. (I think) on the server the deb package worked right away, locally I had a little snag but got there finally with the 4store source. That snag was an error when trying to load some data, in amongst gobbledegook was "getaddrinfo failed". It seems ipv6 stuff can be one cause for this, googling I found this way of disabling it:
added to /etc/sysctl.conf:
#disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
(reboot)
Seems ok now. Server install has currently just got this blog's data in it, exposed through SPARQL here. Local copy I think I'll just dump any bits of RDF I come across into, kind of a random DB. Filter it at read time...
2011-03-26T21:46:55+01:00
4store ubuntu triplestore rdf
Scripty Agents
A bit of background for a question I've put to SemanticOverflow (soon to be moving to semanticweb.com) -
Has anyone got a triplestore to interop with node.js?
(It's evented I/O for V8 JavaScript, looks nifty kit for HTTP stuff).
Given that V8 runs native, I'm guessing it should be possible to hook into Redland somehow (but I wouldn't know where to start). Alternately, I suppose one of the Javascript RDF engines could work, but I'm out of touch with what's available.
SPARQL would be nice to have...
node.js floated into my consciousness again after seeing mention of @sh1mmer's new video on the subject. This time around I got around to installing it - it was pretty easy to get the example server script running (error first time, but rather than disabling SSL as suggested in the docs I just apt-get installed openssl-dev etc. and that did the trick).
On and off I've been playing with the idea of using an agent metaphor for (Semantic Web) services. I did a bunch of slides for the Scripting for the Semantic Web 2007 meetup, which I've just managed to dig out again: Two Webs!. Here's what a generalised agent looks like (slide 47):

Pretty much any semweb service can be viewed this way, and if you degenerate it a bit by dropping the RDF store and HTTP client it covers pretty much any Web app/service/site. Drop the RDF and server (and add a view) and you've got a browser. But I reckon the fun should start when you go in the other direction, starting with this as a general architecture for the app, plugging in whatever bit of functionality you like. I like the idea of such things being quite small, and to add to the agility, use a scripting language for the behaviour, so it looks something like this (slide 48):

The pseudocode here is for a simple doc server (e.g. if the query was done with SPARQL, format done with XSLT), but a key piece to note is the call to another datasource if the query can't be fulfilled locally. It occurred to me that if you had a little framework for agents like these, they could communicate with each other over HTTP (as they would in the wild) although if they knew they were in the same VM then more direct programmatic comms could take place. In between direct and HTTP calls other protocols could also be available, e.g. XMPP, with negotiation used to choose the most suitable wire. But I think it would be important to always (MUST) have HTTP support. (I've been playing with this stuff a bit using Scala Actors, no results to show yet).
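As a rough illustration of that shape (answer locally, otherwise ask another datasource, then format), here is a minimal Python sketch. It is not the slide's pseudocode: the `Agent` class, its method names and the sample data are all hypothetical, a dict stands in for the RDF store, and a direct method call stands in for the HTTP request between agents.

```python
class Agent:
    """A tiny agent: answer from a local store, else ask other agents."""

    def __init__(self, name, store, peers=()):
        self.name = name
        self.store = dict(store)      # stand-in for an RDF store
        self.peers = list(peers)      # stand-in for remote HTTP endpoints

    def query(self, key):
        # Stand-in for a SPARQL query against the local store.
        if key in self.store:
            return self.store[key]
        # Not fulfilled locally: try another datasource. A direct method
        # call here plays the role of an HTTP request in the wild.
        for peer in self.peers:
            result = peer.query(key)
            if result is not None:
                return result
        return None

    def handle(self, key):
        # Stand-in for the formatting step (XSLT in the example above).
        result = self.query(key)
        if result is None:
            return f"{self.name}: no data for {key}"
        return f"{self.name}: {key} = {result}"


remote = Agent("remote", {"plum": "a tree"})
local = Agent("local", {"seki": "a project"}, peers=[remote])
print(local.handle("seki"))    # answered from the local store
print(local.handle("plum"))    # answered by the other agent
print(local.handle("pylon"))   # nobody knows
```

Swapping the direct `peer.query` call for an HTTP client would be the in-the-wild variant; the in-VM shortcut is what the direct call models.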
So...given how nifty node.js is for doing the HTTP client/server bits, it would be nice to get the RDF bits in there too. Given that node.js/V8 runs natively, one option would be a native RDF store like Redland. I'm not sure, but I think you'd have to wrap the Redland functions up quite a bit in V8's C++ to make them accessible through Javascript in V8 (and I'm not volunteering - it's years since I did any C, and I never got the hang of C++).
If there was a decent Javascript RDF engine, that would also be a good alternative - V8 compiles Javascript, so the result should be pretty performant. (Hmm, whatever happened to the RDF store in Tabulator?).
In the first instance I'm thinking here of running such agents server-side (slide 30 is "Tyranny of the Browser" in Gothic Black Letter - we've got used to such a narrow view of what the Web can do), though as a browser can supply the VM for V8, similar stuff could run there too.
Anyhow, there are already a couple of interesting answers to my question, comments over there please, or alternately I installed phpBB the other day to use in the near future for some support stuff - feel free to use that for comments or whatever.
2011-03-25T22:34:57+01:00
agents node.js rdf scripting
Adding SPICE to the Semantic Web
Main Course
Here's a circuit:
- and here's its SPICE model:
***
.INCLUDE la-components.mod
Rsrc 1 0 100E3
Rin 1 2 1E3
Rfeed1 2 3 10E3
Q1 3 0 4 BC109
Q2 3 0 4 BC179
Rfeed2 3 4 10E3
Xopamp 0 2 5 6 4 TL071
Rload 0 4 10E3
Vcc 5 0 15
Vee 6 0 -15
Vsrc 1 0 SIN(0V .1VPEAK 1KHZ)
.TRAN 10US 1000US
***
The .INCLUDE is as it sounds, the contents of that file are included in this model. After that it's describing a graph with two kinds of nodes: those associated with a component and connection nodes (i.e. common terminals/points/buses/PCB tracks...). Although the components kind-of contain arcs, they're hidden behind the component's connectors. The component's connectors are identified by their position in the space-separated data. On the schematic the nodes are marked in red.
Taking the first line:
Rsrc 1 0 100E3
This is interpreted via :
(a Resistor) <name> <node1connection> <node2connection> <value>
Rsrc is a 100k resistor connected between nodes 1 and 0 (Node/bus 0 is always ground)
Taking the first of the transistors:
Q1 3 0 4 BC109
a transistor of type BC109 called Q1 has its collector connected to node 3, base to node 0, emitter to node 4
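The positional reading of these lines can be sketched in a few lines of Python. This is an illustration under assumptions, not a real SPICE parser: `parse_spice_line` is a hypothetical helper, it only handles the resistor and BJT forms shown here, and it expects plain numeric values such as 100E3 rather than SPICE unit suffixes.

```python
def parse_spice_line(line):
    """Parse a resistor or BJT line into a dict (only these two kinds)."""
    fields = line.split()
    name = fields[0]
    kind = name[0].upper()            # SPICE uses the first letter as the type
    if kind == "R":
        # R<name> <node1> <node2> <value>
        return {"type": "resistor", "name": name,
                "nodes": fields[1:3], "ohms": float(fields[3])}
    if kind == "Q":
        # Q<name> <collector> <base> <emitter> <model>
        return {"type": "bjt", "name": name,
                "collector": fields[1], "base": fields[2],
                "emitter": fields[3], "model": fields[4]}
    raise ValueError(f"unhandled component: {name}")


print(parse_spice_line("Rsrc 1 0 100E3"))
print(parse_spice_line("Q1 3 0 4 BC109"))
```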
The .TRAN line is used to run a simulation (a transient analysis), sampling every 10uS for 1000uS. I've not really figured out this side of things properly, couldn't get a straight .DC based transfer chart. But the sine wave will do for now.
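For a feel of what those numbers mean, here is a small Python sketch of an ideal version of the source sampled at the .TRAN step. It only evaluates the sine from the Vsrc line (zero offset, 0.1 V peak, 1 kHz) at 10 microsecond intervals; it does not simulate the circuit, and a real simulator may choose its own internal time steps.

```python
import math

step_us, stop_us = 10, 1000            # .TRAN 10US 1000US
amplitude_v, freq_hz = 0.1, 1000.0     # SIN(0V .1VPEAK 1KHZ), zero offset

# Output points from t = 0 to t = 1000 us inclusive.
times_us = list(range(0, stop_us + 1, step_us))
samples = [amplitude_v * math.sin(2 * math.pi * freq_hz * t * 1e-6)
           for t in times_us]

print(len(times_us))            # 101 points, spanning one full 1 kHz cycle
print(round(samples[25], 6))    # 0.1, the peak a quarter of the way through
```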
Anyhow I can't go looking at a graph model for long without wondering how it could go on the Web. While there are no doubt loads of ways of doing it, the circuit definition can be transcribed into Turtle fairly directly. Bnodes could be used for the connection buses, but it's just as easy to name them. So making things up as I go along -
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix spice: <http://purl.org/stuff/spice/> .
@prefix u: <http://purl.org/stuff/units/> .
@prefix d: <http://purl.org/stuff/devices/> .
@base <http://hyperdata.org/circuits/logamp/> .
<http://hyperdata.org/circuits/logamp> a spice:Circuit ;
dc:title "Log Amp" ;
dc:description "a modified log function amplifier" ;
spice:components ( <Rsrc> <Rin> <Rfeed1> ... <N0> <N1> ...) .
# Rsrc 1 0 100E3
<Rsrc> a spice:Resistor ;
rdfs:label "Rsrc" ;
spice:terminal1 <N1> ;
spice:terminal2 <N0> ;
u:ohms "100000" .
...
that seems ok, now for a transistor:
# Q1 3 0 4 BC109
<Q1> a spice:BJT ;
rdfs:label "Q1" ;
spice:terminal1 <N3> ;
spice:terminal2 <N0> ;
spice:terminal3 <N4> ;
spice:device d:BC109 .
that'll do.
Doing a .INCLUDE in general could really do with something from RDF core (ping RDF WG), but here it's providing other SPICE definitions of the components so it seems reasonable to be more explicit:
d:BC109 rdfs:isDefinedBy <http://hyperdata.org/circuits/logamp/components#BC109> .
which given that SPICE supports subcircuits (which is how TL071 is defined) provides a nice composition mechanism.
I reckon it should be straightforward to write a transformer from SPICE syntax to Turtle. Going the other way, the usual SPARQLing shouldn't be rocket science.
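A minimal sketch of the SPICE-to-Turtle direction, in Python and for resistor lines only. The function name is hypothetical, the spice:, u: and rdfs: terms are the made-up ones from the Turtle above, and the prefix declarations are assumed to be written out separately.

```python
def resistor_to_turtle(line):
    """Turn 'R<name> <n1> <n2> <value>' into Turtle using the made-up terms above."""
    name, node1, node2, value = line.split()
    ohms = int(float(value))          # "100E3" -> 100000
    return "\n".join([
        f"# {line}",
        f"<{name}> a spice:Resistor ;",
        f'    rdfs:label "{name}" ;',
        f"    spice:terminal1 <N{node1}> ;",
        f"    spice:terminal2 <N{node2}> ;",
        f'    u:ohms "{ohms}" .',
    ])


print(resistor_to_turtle("Rsrc 1 0 100E3"))
```

Other component types would need their own templates, and values with unit suffixes would need converting first.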
All seems doable. Homework. Rainy day.
Starter
I want to play with analog electronics again, stuff I used to do before the Web came along and ate up my cycles. My motivation now is mostly driven by the price of recording studio equipment. If, for example, I just want to invert the phase of a signal, I'd need to pay say $50+ for a passive DI box or $100+ for a pre-amp. This is a bit demoralising when the components are available for pennies (though hardware like connectors and cases can cost a lot more). Then of course there's the circuit hacking angle, it really is good fun. A project that's a permanent fixture in this space is the distortion pedal (like ghard's Big Muff) - the circuits aren't complicated, but getting a good sound is the Holy Grail, so this is what I'm going to play with first.
I did buy a bunch of components a while back, but haven't got much in the way of prototyping/test gear. A cheapo USB ADC will hopefully do for a makeshift oscilloscope for now, and I've just ordered the parts to put together a simple PSU (along with *lots* of oddments). But feeling a bit impatient, I thought I'd have a quick look what software was available these days for circuit simulation.
I don't know if I'm missing something, but things hardly seem to have progressed at all in the last couple of decades (but then again analog electronics hasn't really changed). The de facto standard is SPICE, and there are quite a few open source tools available for using it (ah, things weren't open source back in the day, that's progress). I won't bother linking to the individual bits, if you look for 'spice' in Synaptic a bunch show up, and they all seem to come under the umbrella of gEDA. Anyhow, after an hour or so's fiddling I was able to draw a little circuit using gschem, but I haven't yet managed to get it to generate a working netlist file (which specifies the inter-component connections for SPICE). I think I just need to sit down and check/add all the component attributes. But that's a bit tedious so I've just been playing with a SPICE file manually. Praise be to text formats.
The first problem here was finding simulation definitions of the components I want to use. The little circuit I want to test includes a common op-amp (TL071) and a pair of transistors, one NPN (BC 109), one PNP (BC 179). Took a lot of searching, and although (allegedly) many of the manufacturers do provide SPICE modules for their components, I eventually found what I needed on hobbyist sites. (Making the component module files doesn't look too difficult, it'd mostly mean copying values from a spec sheet into a SPICE definition - again, sounds tedious).
There is a GUI adapter for running simulations, gspiceui (which I must have another look at now I've got a working model), but with the amount of trial and error I was having to do I settled back into the command-line tools. For future ref. it goes like this:
ngspice <filename>
This loads in the file and starts up an interactive shell. Took me a long time to figure out what to do next, but here are a couple of bits that worked for me. Once in the shell:
ngspice 1 -> run
Runs the simulation (the .TRAN bit). Then:
ngspice 2 -> plot V(1) V(4)
Produces a plot like this:
Red is the input (voltage on node 1), blue the output (voltage on node 4).
Certainly looks like distortion...wonder what it sounds like...
Pudding
Finding stuff and looking up references in this space is still fairly Paleolithic, so there's one application of exposing this kind of material as linked (wired!) data. But there are probably stacks of other more inspiring apps. Going totally blue sky, a globally distributed circuit could be rather cool. In the digital realm you could for example have a global computer that's built from just a few simulated gates on each of a million interconnected PCs. Bit like an extremely dumbed-down Web service/agent kind of thing.
In the analog realm it could get very wacky. Host your own local circuit subsystem, connect it to anyone else's. I guess you'd want to connect your inputs to other folks' outputs and offer outputs of your own. As long as you are limited to connecting your inputs to the rest of the world (or more versatile, I can only connect my output to your input if I have the appropriate rights) then subsystems should play nicely with each other. I see no reason why for control and audio signals you couldn't do this in real-time using existing streaming audio protocols (codec'ing locally to PCM for the instantaneous values).
This is pretty much what I assume some of the net-based recording systems that are around are doing. I must confess I've never looked into these, trying to mimic traditional recording/mixing stuff that way seems a bit of a non-starter because of the latency issues. But flipping it to a messier, slightly bonkers [insert pun about bipolar transistors] global analog synth kind of idea, then it starts to sound more fun.
2011-02-23T22:00:24+01:00
spice turtle electronics semweb rdf
Really Simple Reading Lists
I occasionally visit Dave Winer's blog, as he has been known to have good ideas. One of these from a few years ago, that he's talking about again, was 'Reading Lists', whereby (in principle) you can subscribe to a list of feeds. When the list changes, your aggregator (in principle) can subscribe/unsubscribe to the individual feeds in the list, showing the contents of the listed feeds probably grouped together in some fashion. Neat idea, but it doesn't really seem to have caught on.
There are two de facto standards for expressing lists of feeds: OPML and RDF ("foafrolls"). The former is probably better supported in desktop aggregators, the latter maybe more visible in the online Planet aggregators (including Planet RDF, though that uses chumpologica/Redland rather than PlanetPlanet). OPML is Dave Winer's 'outline processor' markup language, for lists of feeds it has typed links. The RDF version uses the FOAF, DC and RSS 1.0 vocabs (very typed links). Away from the feed list application, the OPML format is usable in Dave Winer's outliner, and any RDF tool can make sense of the RDF (naturally :) but I reckon it does rather lend itself to FOAFishness - feeds are associated with a foaf:Person (and/or foaf:Agent) with a foaf:weblog etc. (I dunno, the domain is right on top of SIOC too, maybe some info using those terms could be added to the feedlists..?).
For any kind of Information/Knowledge Manager kind of tool (Personal or otherwise) built with RDF, it seems quite natural to periodically refresh data (not only feeds but pretty much anything in the domain of interest - FOAF Profiles probably being ubiquitous), so Reading Lists would sit comfortably alongside other features.
But in the 'simple' world of RSS, subscribing to feedlists is something of a complication. For instance, in the good Mr. Winer's latest incarnation, he's got aggregated pages (e.g. daveriver.scripting.com) not unlike those of the Planets, with an autodiscovery link in the HTML pointing to the feedlist:
<link rel="alternate" type="application/rss+xml" title="OPML" href="index.opml" />
OPML is RSS? I don't think even the Universal Feed Parser is that liberal. The kludge does get Firefox to show the target as a subscribable link, but then that's still not much good if the tools don't know what to do with it. But it seems to me there's a much simpler approach - use RSS. To get myself some markup to show I just bookmarked this blog's feed with del.icio.us and had a look at the feed that produced, and it contains (trimmed) this:
<item>
<title>Danny Ayers : Raw Blog (feed)</title>
<link>http://dannyayers.com/index.rdf</link>
</item>
Now a current aggregator would see that and probably just display it as an HTML-style titled link. But if the aggregator bothered to do an HTTP HEAD, it'd see:
Content-Type: application/rdf+xml
To a (non-RDF savvy) aggregator that means an RSS 1.0 feed. So, aggregator dude, subscribe to it. Atom <link> elements have a (mime) type attribute, so there the HEAD wouldn't even be necessary.
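A sketch of that check in Python, using only the standard library. The function names and the list of media types treated as feeds are assumptions for illustration; `head_content_type` makes a real network request, so only the classification step is exercised here.

```python
import urllib.request

# Media types treated as "this is a feed"; the exact list is an assumption.
FEED_TYPES = {"application/rdf+xml", "application/rss+xml", "application/atom+xml"}


def is_feed_type(content_type):
    """True if a Content-Type header value names one of the feed media types."""
    media_type = content_type.split(";")[0].strip().lower()
    return media_type in FEED_TYPES


def head_content_type(url):
    """Fetch only the headers for url and return its Content-Type (network call)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Type", "")


print(is_feed_type("application/rdf+xml"))        # True
print(is_feed_type("text/html; charset=utf-8"))   # False
```

An aggregator could call `is_feed_type(head_content_type(link))` on each item link and subscribe when it returns True.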
While most feeds are a changing, fixed-length FIFO queue of entries, there's nothing to stop them being a variable length list.
In other words, the simplest RSS feed list is an RSS feed. Even if the aggregator needs a little help in recognising a feed list, it's got to be easier than understanding an entirely different format (published with an inappropriate media type).
Ok, so personally I'd go straight down the RDF route, it's a heck of a lot more flexible. But an RSS-format Reading List does seem like low-hanging fruit for non-RDF tools.
Anyhow, if anyone's building an aggregator (they're a great little starter app when learning a new language), consider Reading Lists as a feature.
2011-02-17T17:32:19+01:00
aggregators lists reading rss rdf opml
Why OWL ain't bad
John Sowa just posted some criticism of OWL as a KR language to the Conceptual Graphs list. I responded but the list's playing up, I ended up with an over quota message. So I'll post the text here and send John the link...
On 14 February 2011 22:59, John F. Sowa <sowa@bestweb.net> wrote:
> I have often commented on the limitations of OWL as a knowledge
> representation language.
...and I believe I have leapt to its defence more than once :)
I don't believe I've done so since OWL 2 [1] came out, so it behoves
me to add my few cents once again.
OWL is limited as a knowledge representation language, by design. As
with most other modelling languages it's a trade off between
expressivity and computational demands. But it has certain features
that set it apart from most other such languages, the key ones being
related to the fact that it's a Web language. Three such features
spring to mind:
* Most of the language's constructs (individuals, classes, relations
etc) are identified using URIs, so there's Web-compatibility built in
at a low level
* While the binary relation that's at the heart of OWL can seem like a
handicap (especially coming from a DB perspective), when the
information is seen as a graph structure, echoing the Web, its utility
is hard to dismiss
* By making the open world assumption (statements are either true or
unknown) the language reflects the real world as expressed on the Web
- in a global environment, we can't know everything
Given these features as a starting point, OWL does a good job of
providing an ontology language that ticks many of the logician's
boxes.
As it happens, as development of OWL 2 was being proposed, I'd argued
with some of its advocates that there were more useful things the time
could be spent on than the logic side of the language. Turns out it
didn't matter anyway - the enthusiasts there produced what they wanted
(and what apparently their customers were demanding) and there's been
no discernible impact on other development tracks.
The thing is RDF (plus RDFS, perhaps with a tiny bit of OWL) is enough
to cover the vast majority of descriptions (the statements of
interest) to a useful degree. Most of the time you don't actually need
expressive constructs to get useful data on the Web, a very simple
statement of relationship between resources is enough. Logic-wise, the
SPARQL query language (syntactically like SQL, but operating over
graphs) covers the requirements of the vast majority of applications
(IMHO), its simple pattern-matching being substantially more useful
than most other inferences.
The aspect of RDF/RDFS/OWL that really seems to work well is what's
been called the 'follow your nose' protocol. As when browsing the Web,
if you want to find out more about something (and it has a link), you
click the link to get more information. With RDF & OWL entities and
relations being identified by URIs, typically HTTP URIs, your machine
can do the same.
If it encounters a statement, say something like:
Fornitura(John Sowa, Something)
- it (and you) may have no idea of what's being stated. However all
three parts of the statement are Web resources, the statement could be
written longhand as (e.g.) :
<http://www.jfsowa.com/people#me> <http://some-vocab.org/fornitura>
<http://example.org/something> .
To find out more information, you can do an HTTP GET on the unknowns.
Because of its position in the statement you know :fornitura is a
predicate (a rdf:Property) and by following the link you can get more
information in a machine readable form. In this case, by asking for an
RDF mime type, you will typically get back the ontology defining the
predicate.
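As a rough illustration of that lookup, here is a minimal Python sketch. It only builds the request rather than sending it, and the particular Accept header (Turtle first, then RDF/XML) is an assumption about what a client might reasonably ask for, not something any spec mandates. The function name is made up for the example.

```python
from urllib.request import Request

# RDF media types to ask for, in order of preference (an illustrative choice).
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9, */*;q=0.1"


def follow_your_nose(uri):
    """Build (but do not send) a GET request asking for an RDF description."""
    # Fragment identifiers are never sent to the server, so drop them first.
    document_uri = uri.split("#", 1)[0]
    return Request(document_uri, headers={"Accept": RDF_ACCEPT}, method="GET")


request = follow_your_nose("http://some-vocab.org/fornitura")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the returned object to urllib.request.urlopen would actually perform the GET; whether RDF comes back depends entirely on how the vocabulary's server is set up.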
Where 'better' knowledge representation (and reasoning) is required, a
lot of the time that can be carried out locally. For example, you may
have a traditional SQL database covering your specific domain.
Internally there will be a closed-world assumption, native n-ary
relations and so on. On your own data you can use whatever languages
you like. But that data may be exposed to the Web through RDF (etc),
making it reusable elsewhere.
Ok, there's the argument that to be really useful, you need powerful
knowledge representation globally. But there is a major hurdle - to be
really useful you need a lot of people using *and publishing*
information in that form. Unfortunately along with the
representational power comes complexity, and the extra work required
has to be justifiable - in economic terms at least.
I forget the source, but there's a nice line: "what's new about the
Semantic Web isn't the semantics, it's the Web". When it comes to
global information sharing, the Web part is really where the
difficulties lie. Any logic/data has to actually be widely adopted.
Though the development of the Semantic Web is happening slower than
most folks hoped, it is happening. Take the recent statistics from
Yahoo! :
[[
The data shows that the usage of RDFa has increased 510% between
March, 2009 and October, 2010, from 0.6% of webpages to 3.6% of
webpages (or 430 million webpages in our sample of 12 billion).
]]
https://tripletalk.wordpress.com/2011/01/25/rdfa-deployment-across-the-web/
(RDFa is an RDF format allowing it to be embedded in HTML)
Also the Linked Data Cloud is a nice visualisation of some of the big
datasets out on the Web:
http://linkeddata.org/
So while RDF (alone) is a much more limited knowledge representation
language than OWL (essentially simple binary relations), at least the
data's getting out.
[snip]
> the OMG group is proposing as a way of representing type hierarchies
> in a simpler and more readable form than OWL.
I can't personally see how it could be much simpler than in OWL:
SubClassOf( :Woman :Person )
(in functional syntax)
Note also that structures other than trees can easily be represented,
an artificial example:
<#node1> :connectsTo <#node2> .
<#node2> :connectsTo <#node3> .
<#node3> :connectsTo <#node1> .
(Turtle syntax)
2011-02-15T10:47:40+01:00
sowa owl kr rdf
Some Problems
Georgi Kobilarov has a refreshing post, suggesting Making Linked Data work isnt the problem. I'm inclined to agree with most of what he says. The technology in itself isn't a solution to any problem, rather an enabler to solve problems. While the idea of serendipity is appealing, it isn't very good justification for a huge global commitment of resources. So what kind of problems do we, as living, social and technological organisms wish to solve?
To start exploring this space I reckon there are (at least) two general modes of knowledge use. The first is relatively domain-specific, directed by a set of requirements associated with a corresponding set of real-world tasks and operations. These I'd put under the umbrella of Applications, akin to the computer applications we already use but augmented with knowledge engineering facilities and access to the Web of Data. As a shortcut the starting point here is Connolly's Bane: "The bane of my existence is doing things I know the computer could do for me.". But in general it goes far further, in that there are plenty of beneficial things we don't already do. A second mode would be ad hoc, fairly immediate, unplanned, call it Just-in-time problem solving, the kind of thing that we currently turn to search engines for.
As an example of the Applications mode, one of the early drivers for the Web was e-commerce. I think I'm fairly safe in saying that only the surface of the potential there has been scratched. There's a hint of what can be possible with things like the individual targeting of Google Ads and Amazon recommendations. In this space the GoodRelations ontology is a marvellous baseline. But what we're not really seeing yet is the whole supply chain from the manufacturer to consumer being integrated; as it is today, it's fairly loosely coupled. In one direction there are the financial aspects ("follow the money"), in the other direction is all the transport, manufacturing and processing that go from raw materials to delivered finished product. Within those different parts of the pipeline there are a whole host of problems relating to technology tied together by human and natural resources.
Alongside this commercial world there are macroeconomic and macrosocial systems, those areas traditionally covered by government. We're already seeing some movement around transparency with the various government data projects, but I think we're still a very long way from seeing genuinely informed policy and decision making. Reflecting the darker side of advertising right down to commercial spam and taking advantage of general ignorance, good governance is seriously compromised by self-interest (of individuals and corporations) and misinformation. I recently heard a radio programme talking about the UK Conservative Party's successful "Broken Britain" election campaign. An aspect of this was that violent crime was perceived as being on the increase. However the statistics suggest that it had actually been declining (see Murder rate lowest for 12 years "Home Office figures show overall crime fell by 5% in England and Wales"). Politicians will always lie, but damage is only done when they get away with it and aren't held to account with the facts. But I don't really want to suggest that prevention of political badness is the goal here, rather the encouragement and facilitation of goodness (man...).
Another huge area where there are countless problems to solve is science. While the Web has vastly improved information sharing and been a boon to research, I'm not sure the underlying methodologies have changed that much. I'm convinced the open sharing of knowledge at the data level can offer A New Kind of Science (no hyperbole there!). There are plenty of other application domains that could benefit from a bit of Web-scale knowledge engineering. Ok, I'll name one more bundle: the Arts.
Ok, moving on to the Just-in-time mode of problem solving, take a look at the following list (random stuff that came off the top of my head when I woke up this morning). Imagine how you would solve these problems now, and then think how you might solve them with a thousand programmers at your beck and call. Most of them need something considerably deeper than a keyword/linkrank document search. I've dumped this list over on the ESW Wiki, additions and discussion welcome over there (I still haven't implemented comments on this blog, so if you have a comment for anything either mail me or blog it (and mail me) or tweet or use Facebook...).
- I'd like to upgrade the computer I use for video editing. My budget is about 300 euro. What should I buy?
- Who should I get to make the soundtrack to my new film?
- I've bought an Ubuntu laptop to replace my old Apple, I'd like it to run applications that fulfil all the tasks I have on the old machine. What do I need?
- Should HTML use namespace prefixes?
- Is there a political motivation behind Royal Weddings?
- Who should I vote for?
- Who might make a good (romantic) partner?
- I wish to sell my double glazing products in sub-Saharan Africa, who should I contact?
- Who might make a good (business) partner there?
- I got a mail from someone claiming to be my cousin, asking for a loan. Should I give them the loan?
- I've got an interesting rash. Should I see a doctor?
- I wish to enlarge my penis. What method is safe and reliable?
(Sorry, couldn't resist the last one - but it's a valid example of where you'd need good healthcare data alongside reputation and provenance information)
PS. danbri points me to a short 1989/90 document which contains a fairly similar list (minus references to genitalia) : Information Management: A Proposal, by a certain Tim Berners-Lee. Go read it. Now!
2011-01-22T17:46:57+01:00
semweb problems rdf
Drupal 7
I'm back home after a spell away so am having a lazy week, or rather picking odds and ends that have been on the to-do list seemingly forever. One thing was sorting out my sites, and coincidentally the following notice appeared on the SWIG list last week:
After over 3 years of development by almost 1,000 contributors, Drupal 7 has finally been released today! Drupal is an open source content management platform powering hundreds of thousands of websites and applications. Notable websites are WhiteHouse.gov and the many top music artist's sites of Warner Media Group. Drupal 7 features the latest web technologies and remarkable improvements to user experience (UX). Drupal is the first major CMS to include RDF as part of its DNA and embed RDFa markup out of the box: all Drupal 7 sites annotate by default their pages, comments, images, tags, authors, posted date with the popular SIOC, FOAF, Dublin Core, and SKOS vocabularies. We hope that with Drupal adopting RDFa, we can pave the way for a greater adoption of the Semantic Web technologies. Drupal is estimated to power 1% of the Web, and even though Drupal 7 was just released, more than 30,000 websites are already powered by Drupal 7. With today's announcement, this number is likely to sky rocket in the coming months.
...This blog is still a little (Scala) homework project, but not long ago I registered a domain as a place to put my music noodling: spikeandwave.com. It was just a couple of handwritten HTML pages until yesterday, when I slapped Drupal there. Ok, so first my MySQL install seemed to be broken, so I took the opportunity to upgrade from Ubuntu Jaunty to Karmic. Turned out not only had I forgotten the admin password but the instructions I'd found online for resetting it didn't work. But these instructions did work. The install would have been a breeze, had I known this, in php.ini :
memory_limit = 32M ; Maximum amount of memory a script may consume
Default is 16MB and Drupal 7 core requires at least 32MB (I've set mine at 64MB to be on the safe side). You also need to restart Apache2 after changing this. If you make the mistake I did, then DROPing the DB and replacing the settings.php gets you back to square one.
So yesterday I wound up spending a couple of hours or so getting the thing installed. Today it took me maybe 3 hours to get a handle on how to use it enough to get the site more or less how I want it to look. While Drupal core seems to work fine out of the box, I did hit a few bugs with plugin modules, e.g. no joy with XML Sitemap. Must admit I didn't spend long on trying to get such things working, opting for the disable and leave for later workaround. Most of the time was spent getting to know the navigation and where to find things (e.g. took me ages to discover that links are under menus, d'oh!). The only bit of handcoding I did was to tweak the CSS so the centre column was wide enough for a big image.
So basically I reckon it's pretty much comparable to WordPress in terms of ease of setup/use. One nice feature which WordPress didn't have last time I looked is in-place updating of code, something that will hopefully help avoid the usual mess.
So finally to check those semweb credentials. There are hints of typed nodes here and there, but what's its RDFa publishing like? The front page contains these 6 triples: drupal.txt (extracted with the RDFa distiller). content:encoded seems to have grown up!
2011-01-13T00:17:44+01:00
sites drupal rdf
del.icio.us bookmarks to RDF
The blogosphere seems to think Yahoo! is going to axe del.icio.us so I've knocked together a quick Python script to get my data out - 2317 occasionally annotated bookmarks. To use: make sure you've got Python first (!), download and install BeautifulSoup (navigate to the dir with setup.py, run python setup.py install), download the script and rename it to souper.py, get your del.icio.us bookmarks and rename to delicious.html. Then run python souper.py delicious.html > delicious.ttl and there you have the Turtle.
I've not checked the output particularly thoroughly, but I think it's ok (one shortcut I made was that any bookmarks that couldn't be converted to ASCII would get ignored). Here's my original bookmarks file and the same data in Turtle (7599 triples).
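For anyone curious about the general shape of such a conversion, here is a minimal sketch. It is not the souper.py script: it uses the standard library's html.parser instead of BeautifulSoup, and the choice of dc:title and dc:subject as the vocabulary is purely an assumption for illustration. It also assumes the export is the usual Netscape-style bookmark file, with the tags in a TAGS attribute on each link, and that bookmark URLs contain no characters that would need escaping inside a Turtle IRI.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collect href, title and tags from <A HREF=... TAGS=...> elements."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":  # HTMLParser lowercases tag and attribute names
            attrs = dict(attrs)
            tags = [t for t in (attrs.get("tags") or "").split(",") if t]
            self._current = {"href": attrs.get("href"), "title": "", "tags": tags}

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            if self._current["href"]:
                self.bookmarks.append(self._current)
            self._current = None


def literal(text):
    """Escape a string as a double-quoted Turtle literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def to_turtle(bookmarks):
    """Serialise the collected bookmarks, one subject per bookmark URL."""
    lines = ["@prefix dc: <http://purl.org/dc/elements/1.1/> .", ""]
    for b in bookmarks:
        parts = ["dc:title " + literal(b["title"].strip())]
        parts += ["dc:subject " + literal(t) for t in b["tags"]]
        lines.append("<" + b["href"] + "> " + " ;\n    ".join(parts) + " .")
    return "\n".join(lines)


sample = '<DL><DT><A HREF="http://example.org/" TAGS="rdf,semweb">An "example"</A></DL>'
parser = BookmarkParser()
parser.feed(sample)
parser.close()
print(to_turtle(parser.bookmarks))
```

Unlike the ASCII shortcut mentioned above, this keeps non-ASCII titles as they are, since Turtle is UTF-8 anyway.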
The Twitterati seem to be moving en masse to Pinboard, which has a sign-up fee of $7.42 but seems to have got good reviews.
2010-12-17T00:24:50+01:00
script python turtle semweb rdf delicious tags
Slow Data, Decentralization and Semantic Web Architecture
[I've still got a bug in my blog software which mangles links, so apologies for the ironically unlinky URIs]
Slow Food (http://en.wikipedia.org/wiki/Slow_Food) is an international movement founded to offer an alternative to fast food, "it strives to preserve traditional and regional cuisine and encourages farming of plants, seeds and livestock characteristic of the local ecosystem". By a little analogical legerdemain, fast data is the kind of stuff you get from regular search engines - quick but not very nutritious, probably bad for you. Slow Data on the other hand has been harvested with care and with attention paid to its preparation. It's far more satisfying in the long run. While complex Semantic Web systems are currently at a slight disadvantage performance-wise (largely due to their youth), there's no reason that high quality data can't be readily accessible at high speed using existing, well-documented Web techniques. But I'll call it Slow Data anyhow.
So...I recently got a letter (!) which included a description of a proposed social net application based around RDF data. The author knew what they were talking about and the system sounded good, but they were really struggling with one aspect, how to avoid making a centralised system.
One of the great rallying cries of the Linked Data movement has been to open data out to the Web. I doubt very much that I've seen a presentation on the subject that hasn't referred to data silos, usually with a predictable image. This antipattern reaches its zenith in applications where the only interface to the data is a dedicated 'snowflake' API (so named because every one is unique), severely limiting the potential for Web-style interconnection (links). Behind the scenes the application implementation may be highly distributed, but all the user or developer can see is a walled garden with a gatekeeper. That's a lot of buzzwords in one paragraph, so I'd better move towards the point.
How is an RDF triplestore any more open than a SQL-style database hooked up to the Web?
It might sound heretical, but it isn't, or at least isn't necessarily. The only advantage it has is that by default it uses URIs as identifiers for things (corresponding to the keys in a SQL store) which if designed properly will be dereferenceable over HTTP, i.e. they will be links which can be followed to find out more about the named resources. But SQL-backed Web applications can expose links that can be followed, and many do. (The same goes for NoSQL stores.) SPARQL is a query language that can be applied to a particular variety of graphs, but again in itself it isn't really any more webby than the triplestores it addresses. However there is the SPARQL Protocol for RDF (SPROT) http://www.w3.org/TR/rdf-sparql-protocol/ which allows things like a HTTP GET /sparql/?query=EncodedQuery and changes the whole ball game (you don't hear much mention of SPROT, I suppose because of the ugly name and a spec that's mostly WSDL stuff that everyone ignores).
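To make the GET form concrete, here is a minimal Python sketch that builds such a URL. The endpoint address and the function name are made up for the example; only the query parameter name comes from the protocol.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sparql_get_url(endpoint, query):
    """Return the URL for sending a SPARQL query to an endpoint via HTTP GET."""
    # urlencode percent-encodes the query text so it is safe in a URL.
    return endpoint + "?" + urlencode({"query": query})


query = "SELECT ?s WHERE { ?s a <http://xmlns.com/foaf/0.1/Person> } LIMIT 10"
url = sparql_get_url("http://example.org/sparql/", query)
print(url)

# Decoding the query string again gives back the original query text.
print(parse_qs(urlsplit(url).query)["query"][0] == query)
```

Because the whole request is just a URL, a query is itself something that can be linked to, bookmarked and cached like any other Web resource.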
Hopefully everyone's familiar with Chapter 5 of Fielding's dissertation - http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm - so to cut the waffle I'll cherry-pick one heading: 5.1.4 Cache. If we imagine the Web (of Data) as one huge interlinked information space, then individual stores such as those associated with specific applications can be considered as caches of small chunks of the Web of data. This is probably easiest to conceive by contrasting two different pieces of software. For one let's have a social net app that lets people discover other people with similar interests. It will store data around resources of the type foaf:Person with properties such as foaf:interest and to leverage the social angle foaf:knows. A traditional app for this kind of thing would involve people signing up and entering information about themselves. But quite justifiably a person might say "I don't want to enter loads of stuff into a form in application Y when I already entered it in application X yesterday" (yes, this is the old Data Portability thing). But pause there and for a second piece of software let's have a generic link-follower and data aggregator, i.e. a crawler or bot, or as they're known in FOAF circles, a scutter. It's not difficult to make such things directed, so they only follow specific link types of interest (check Slug http://ldodds.com/projects/slug/ - see also https://github.com/ldodds/slug). Let's make the storage system for this scutter a triplestore. Ok, set the scutter going on the Web at large with a plan to follow foaf:Person related links and slurp the data. Come back a few hours later, and you have an already-populated store to which you can plug in the social app, no need for people to sign up (in an ideal world, and ignoring privacy matters).
Now the scutter plan for this (i.e. get people data) is pretty much isomorphic to a SPARQL query along the lines of:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
CONSTRUCT { ?s ?p ?o } WHERE {
?s rdf:type foaf:Person .
# separate variables, so an interest needn't also be an acquaintance
?s foaf:interest ?interest .
?s foaf:knows ?friend .
?s ?p ?o .
}
This is exactly the kind of query you'd also want to be asking in the social net app. Going through the scutter, you're asking the Web at large, but because the data has already been aggregated in your store, it doesn't take a thousand GET requests to find relevant statements. But the statements are exactly the same. In other words, an RDF store is just a cache of a small chunk of the Web of Data.
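Here is a toy Python sketch of that idea, with no network involved. The WEB dictionary stands in for the Web at large (every URI and triple in it is invented), the scutter only follows foaf:knows links and only keeps three kinds of statement, and the store is a plain set of tuples rather than a real triplestore. All names are hypothetical.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Stand-in for the Web: document URI -> the triples it serves (all made up).
WEB = {
    "http://example.org/alice": [
        ("http://example.org/alice", RDF_TYPE, FOAF + "Person"),
        ("http://example.org/alice", FOAF + "interest", "http://example.org/topic/rdf"),
        ("http://example.org/alice", FOAF + "knows", "http://example.org/bob"),
        ("http://example.org/alice", FOAF + "name", "Alice"),
    ],
    "http://example.org/bob": [
        ("http://example.org/bob", RDF_TYPE, FOAF + "Person"),
        ("http://example.org/bob", FOAF + "interest", "http://example.org/topic/music"),
        ("http://example.org/bob", FOAF + "knows", "http://example.org/alice"),
    ],
}

WANTED = {RDF_TYPE, FOAF + "interest", FOAF + "knows"}  # statements to keep
FOLLOW = {FOAF + "knows"}                               # links to crawl


def scutter(start, fetch=WEB.get):
    """Crawl from start along FOLLOW links, caching WANTED triples in a set."""
    store, queue, seen = set(), [start], set()
    while queue:
        uri = queue.pop()
        if uri in seen:
            continue
        seen.add(uri)
        for s, p, o in fetch(uri) or []:  # one "GET" per document
            if p in WANTED:
                store.add((s, p, o))
            if p in FOLLOW:
                queue.append(o)
    return store


def match(store, s=None, p=None, o=None):
    """Return stored triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


store = scutter("http://example.org/alice")
for triple in match(store, p=FOAF + "interest"):
    print(triple)
```

The crawl costs one fetch per document, but every later match call is answered from the local set, which is the cache behaviour described above.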
For performance reasons this kind of cache would be selective in the data collected, so maybe strictly speaking the architecture is more like Uniform Pipe and Filter http://www.ics.uci.edu/~fielding/pubs/dissertation/net_arch_styles.htm#sec_3_2_2 with the uniformity essentially maintained by following the SPARQL and SPROT specs (and 5.1.6 Layered System is probably relevant too).
This kind of thing is entirely implementable today, in fact the Semantic Web Client Library http://www4.wiwiss.fu-berlin.de/bizer/ng4j/semwebclient/ can do SPARQL queries on the Web at large (SELECT at least, not sure if it supports CONSTRUCT).
There are other pieces of the Semantic Web toolkit that can be cleanly inserted into Web architecture (as one would hope, given that the Semantic Web is meant to be an extension of the existing Web). For example, a general-purpose WebID setup (FOAF+SSL http://esw.w3.org/WebID) could be inserted between client and server to handle authentication, acting as a proxy and/or gateway.
Somewhere recently (I think in a paper by danbri and others) I saw discussion about what was needed to get from a Web of Linked Data to a more fully Semantic Web. In other words, even if you score 5 stars at http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/ there might be more you can offer. I might have dreamt it, but I believe the discussion mentioned inference and reasoners. The thing is on the one hand we have lots of linked data already out there, and we already have pretty performant reasoners (e.g. http://clarkparsia.com/pellet/ ) but reasoning over Web-scale data is likely to remain a fantasy. That is, unless you imagine multiple reasoners acting as dedicated, fairly task-specific agents/services over their own manageable little batch of data. These again could be deployed as proxies. For example, another bit of FOAF jargon is smushing, which originally (when people were bnodes) meant the unification of data about a person based on the assumption that the person could be identified by means of their email address or homepage. Since it's more common now to use URIs to identify people (see http://dig.csail.mit.edu/breadcrumbs/node/71) I don't think it's unreasonable to extend the term to cover unification of multiple URIs for a person (typically with owl:sameAs links somewhere). Now going back to the triplestore of the app described above, that's only really interested in statements including the identified foaf:Person, foaf:interest and foaf:knows. There's nothing to stop this treating a person as two individuals if data has been pulled from sites which use their own person ID schemes. But if somewhere else on the Web at large there was a triplestore with reasoning capability that could eat person IDs, foaf:mbox and foaf:homepage data and spit out owl:sameAs statements, this could be used to unify the descriptions for the application. 
This triplestore could have a scutter as its input and a SPARQL endpoint to provide output, in other words being a uniform pipe kind of proxy.
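The smushing step itself needs very little machinery. Below is a minimal Python sketch using a union-find structure: any two subjects that share a foaf:mbox or foaf:homepage value are treated as the same person, and an owl:sameAs statement is emitted pointing each alias at one representative URI. Picking the alphabetically smallest URI as the representative is an arbitrary choice for the example, and all the data is invented.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
IDENTIFYING = {FOAF + "mbox", FOAF + "homepage"}


def smush(triples):
    """Return owl:sameAs triples linking subjects that share an identifying value."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Arbitrary but stable: the smaller URI becomes the representative.
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (property, value) -> first subject seen with that value
    for s, p, o in triples:
        if p in IDENTIFYING:
            if (p, o) in first_seen:
                union(first_seen[(p, o)], s)
            else:
                first_seen[(p, o)] = s
                find(s)  # register the subject
    return sorted((s, OWL_SAME_AS, find(s)) for s in list(parent) if find(s) != s)


data = [
    ("http://a.example/people/fred", FOAF + "mbox", "mailto:fred@example.org"),
    ("http://b.example/users/42", FOAF + "mbox", "mailto:fred@example.org"),
    ("http://b.example/users/42", FOAF + "homepage", "http://fred.example.org/"),
    ("http://c.example/id/fred", FOAF + "homepage", "http://fred.example.org/"),
    ("http://a.example/people/wilma", FOAF + "mbox", "mailto:wilma@example.org"),
]
for statement in smush(data):
    print(statement)
```

In the sample data the third URI never shares a mailbox with the first, but is still unified with it via the second, which is the transitive behaviour you'd want from such a service.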
Ok, so effectively I'm arguing here that we already have all the bits from which we can glue together a Semantic Web that sits nicely with the Architecture of the World Wide Web http://www.w3.org/TR/webarch/ . But I do think there are at least two specific areas that need attention in the near future. One is in the increased use and optimization for named graphs, especially those of the order of only tens or hundreds of statements. I thought I had a good justification for this, but now my mind's gone blank, so just call it a gut feeling. The other thing is in description of datasets - there's already some stuff around annotations and provenance etc, but I'm thinking more in terms of discovery, and agents/services being able to advertise themselves so that a client that's looking for some particular kind of data can find them. The Vocabulary of Interlinked Datasets (voiD, http://vocab.deri.ie/void) is pretty good in this space, but I reckon we need to go a lot further, and have been mulling over a little quasi-protocol for matchmaking between datasets and agents. I'll post more on that once I've got something to talk about...
There is a teeny bit of low-cost, potentially invaluable data that it'd be nice to see more of. Let's say a directed scutter has crawled the Web and has aggregated all statements of the form <http://example.org/fred> foaf:interest ?x. While ideally it will be placing the triples it's found into named graphs corresponding to the provenance, a more likely coding scenario (because the queries will get silly with thousands of FROMs - hmm, does SPARQL NG do anything about that?) would be to dump everything into a default graph. But, while the full provenance may not be retained in this setup, it can still be made available to consumers of the data if statements of the form <http://example.org/fred> rdfs:seeAlso <http://wherever.com/source/somedataaboutfred> are added to the store. Call it future-proofing.
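In code this costs almost nothing. A minimal Python sketch, assuming the store is just a set of tuples acting as the default graph (the function name is made up):

```python
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def add_with_source(store, triples, source_uri):
    """Add triples to a flat store, plus an rdfs:seeAlso pointer per subject."""
    for s, p, o in triples:
        store.add((s, p, o))
        # A set ignores duplicates, so each subject/source pair is kept once.
        store.add((s, RDFS_SEE_ALSO, source_uri))
    return store


store = set()
add_with_source(
    store,
    [("http://example.org/fred", "http://xmlns.com/foaf/0.1/interest",
      "http://example.org/topic/rdf")],
    "http://wherever.com/source/somedataaboutfred",
)
for triple in sorted(store):
    print(triple)
```

The pointer doesn't say which statement came from where, only where to look for more, but that's enough for a consumer to go back and check.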
2010-12-14T11:08:17+01:00
architecture arch semweb rdf
SPARQL Results and HTML
A thought in passing. When I've needed to display SPARQL results in a browser I've generally either used some kind of programmatic templating (as in this blog) or XSLT on XML results - which can get clunky, but when the transformation is done, it's done. Results XML is straightforward (and I'm still rather fond of XML) but the choice of syntax is pretty arbitrary. The RDF that comes back from a CONSTRUCT is grand, that's a really nice kind of query, the data is immediately ready for reuse (it might be an obsessive-compulsive thing, but DESCRIBE still feels a bit messy). I've not got around to playing with JSON results, presumably that lends itself to speedy application in most languages.
But I can't help thinking it'd be neat if SPARQL results came back directly as RDFa so by default you had something that made sense both in a browser and to an RDF agent. Is there anything you can do with a SELECT that a CONSTRUCT-to-HTML couldn't do? Is there any way the stuff could be structured to simplify templating? There's at least one results XML to HTML XSLT around somewhere, I guess that could be tweaked for experimentation.
2010-12-14T07:54:28+01:00
rdfa html sparql rdf
path slice grr
the little javascript UI I'm using for typing blog posts seems to break URIs in its attempts to make them relative
but they do work on the homepage, http://dannyayers.com, and I'm too tired to fix them now...
2010-11-21T19:01:25+01:00
blog rdf
Once more unto the breach (again)
For the first time in ages I've had a couple of days to sit down and look at code. A lot of it was stuff I hadn't finished, dating back a few years. The typical pattern was either getting distracted from the original aims and playing with the fun stuff or aiming to do so much that I never really got past square one. So this time around I've changed my mind, decided to keep the fun stuff (playing with Agents in Scala) separate from the main app work.
The main app in mind here is the Semantic Web in a Box idea which I'm back to thinking about in a more minimal form, informed a lot by what Rob wrote on his blog - What people find hard about Linked Data - and the stuff in the Talis tutorial. Basically what I'm after is a very easy-to-use Linked Data editor/visualization tool, with support for some kind of pluggability (TBD). There are existing tools which can do this sort of stuff, but the key here is to keep things as simple as possible (and free and open source). Target users are total beginners and experienced folks that want to be able to knock simple stuff together quickly. There's really not a lot to this, and 'wait long by the river and implementation of your plans will float by' usually works, but no-one really seems to have got around to this thing.
It'll be a Java/Swing desktop app with the following features:
- Internal triplestore(s)
- RDF editor with various views and syntax validation
- SPARQL editor and results viewer
- HTTP client (for examining remote resources, crawling and publishing to remote stores/services)
- HTTP server (for simulating live data)
- HTTP proxy (for examining headers etc)
- Basic HTML editor/viewer
What should also be possible is to run it headless, as a live service.
Probably more than half the people that read this are likely to have such parts living in their codebases - Java Swing components, Jena, ARQ, and Apache HTTP libs cover an awful lot, the tricky part is wiring them all up in a useful way, with a UI that doesn't confuse.
I've made a start on gathering together the bits, but I'm unlikely to get down to a good coding session for a while again, so what follows is really notes to self so I don't forget...
So, RDF editor.
Currently the main class is org.hyperdata.swing.rdftree.editor.RdfEditor
One view is a resource-centered thing, based on a JTree backed by a Jena Model. Like everything else here, it's unfinished and very buggy (notably there's something like an out-by-one error on which row expands). But this should give the general idea, the paths should expand indefinitely :
Right now it's only addressing the local model, but it should be reasonably straightforward to hook the HTTP client up to terminal node URIs to go and GET remote data (must check how Tabulator goes about that) and extend the drop-down paths.
Text views for Turtle and RDF/XML (with crude highlighting from JEditorPanes):


I've only just started looking at a graph view (again!), separate from the stuff above - I just hacked at one of the JGraph demos, long way to go:
The launcher for that is org.hyperdata.swing.graph.danja.GraphEditor
I've stuck the code over here:
2010-11-21T18:47:57+01:00
swib linkeddata semweb rdf
Piano Piano
Where I'm staying at the moment I don't have much time to get on the computer, and net access is really lousy. But I've had a lot of chance to think about stuff that I want to do, and have realised that I can feed a few birds with one bean. The blog engine (this) I've been writing in Scala is approaching the basic level of functionality I wanted, so I'm looking again at a couple of old ideas.
The first is Semantic Web in a Box (new name needed!), the second an agent-based engine that will support scripting (I did a lightning talk about that at one of the SFSW meetups, must see if I can find the slides). Given that Scala actors are perfect for constructing the kind of agents I have in mind, as well as offering a nice way of doing the SemWeb in a Box stuff, I reckon I'll wrap it all together into one project. And the first application built with this setup can be a refactoring of my blog engine...
Many of the agents probably won't have all these features, but the stereotypical agent I want, a SemWebAgent, will have the following traits:
- named with a URI
- access from a HTTP server
- access to a HTTP client
- triplestore
+ some code that'll actually do something useful
Looking from outside, the things will look like regular Web-accessible resources, and can call/be called by external (RESTful) clients/services etc. Internally, if a particular named resource lies within the same VM then more direct messaging is possible. For scripting (when I get around to it), I've got Jython and Rhino (or equivalents) in mind. To support the pluggability of SemWeb in a Box, I'll go for OSGi, probably using Felix as the container.
I've started coding up the core actor stuff, which I will fill with unit tests as well - being new to Scala I'll no doubt make a lot of mistakes. I'm also putting together some functional tests for the blog engine, which I'll refactor to use this system. I'm already using a tiny bit of Apache Clerezza (for jax-rs handling of HTTP calls), I believe there'll be quite a lot more I can cherry-pick.
2010-10-10T10:46:05+01:00
box clerezza gradino semweb rdf
Per-Tag Feeds
I've just added a quick feature here so that if you go to a URI of the form /feed/tag/{TAG} it will produce an RSS 1.0 (RDF) feed for that tag. So hopefully /feed/tag/rdf will now be everything tagged "rdf".
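The routing for that is about as simple as it sounds. The blog engine itself is written in Scala, so the following is only a Python sketch of the idea, not the actual code: a pattern picks the tag out of the path, and posts carrying that tag are selected for the feed. The post list is a hand-typed stand-in for the real data.

```python
import re

# Matches paths of the form /feed/tag/{TAG}, with an optional trailing slash.
TAG_FEED = re.compile(r"^/feed/tag/(?P<tag>[^/]+)/?$")

# Hand-typed stand-in for the blog's stored posts.
POSTS = [
    {"title": "Per-Tag Feeds", "tags": {"code", "gradino", "rdf", "tags"}},
    {"title": "Slides from KRDB 2010",
     "tags": {"bressanone", "krdb", "semweb", "rdf", "slides"}},
    {"title": "HTML in Turtle", "tags": {"ideas", "wikis", "rdf"}},
]


def titles_for_path(path, posts=POSTS):
    """Return titles of posts for a tag feed path, or None if the path isn't one."""
    found = TAG_FEED.match(path)
    if found is None:
        return None
    tag = found.group("tag")
    return [post["title"] for post in posts if tag in post["tags"]]


print(titles_for_path("/feed/tag/rdf"))
print(titles_for_path("/feed/tag/krdb"))
print(titles_for_path("/feed/"))
```

A real version would then serialise the selected posts as RSS 1.0 rather than returning titles.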
PS. Silly me mistyped the above, so went back and started coding up item editing (as yet unimplemented)...then realised I'd already set things up so that if I post something with the same title on the same day it will already overwrite the previous entry (all the triples hanging off that URI). Heh.
2010-09-27T15:23:45+01:00
code gradino rdf tags
Slides from KRDB 2010
A week or so ago I was up north in Brixen-Bressanone (definitely "a charming town") at the 3rd KRDB school on Trends in the Web of Data. The programme was exceptionally well contrived, IMHO, seriously apposite for what's going on in the Web of Data. In between beers (don't worry, I am sorting that one out) I did the opening session. My initial brief was (I think) "Semantic Web Platforms". Now I could happily have done the obligatory semweb intro and led into material about the Talis Platform (which is still as far as I know the only one I'd consider a true semweb platform, being provided in a Software as a Service manner via HTTP). But Tom was down to talk about Linked Data (slides) and Martin about the GoodRelations ontology (slides), so I assumed that between them most of those bases would be covered.
In many real senses the Semantic Web is already a done deal, so all this conspired to give me chance to look at the notion of a platform in general. Naturally I consider the Web of Data to be the key enabler right now, but when it comes to choices on how to use it and application strategies, there I reckon it's worth looking at analogous systems. So I refactored my title to "Platforms and the Semantic Web" and basically spent 2 hours rambling about my hobbies...
Slides on slideshare and pdf.
Many thanks to Enrico, Anja et al for the opportunity. I did stay to poke my nose into the SWAP 2010 goings-on, so caught up with quite a few old faces and met a bunch more new ones. Even made it home in one piece.
2010-09-27T08:44:16+01:00
bressanone krdb semweb rdf slides
HTML in Turtle
Because of the graph structure behind the scenes, pretty much any data can be expressed in the RDF model and hence in an RDF syntax, although it might get a bit nonsensical when it comes to interpreting the triples. Here is a case in point. There was definitely some sensible discussion of Atom syntax being RDF/XML (but the handful of extra attributes needed were considered too much overhead). But also I vaguely remember (or maybe imagine) HTML/RDF mappings being done pre-Turtle. It just crossed my mind, couldn't resist having a go.
So here's an example:
@prefix : <http://example.org/html9/> .
<http://example.org/hello> a :html ;
:head [ :title "A Page" ] ;
:body [
:h1 "My Page" ;
:p "Hello World!"
] .
The placing of the bnodes is a bit arbitrary, but I rather like the idea of a resource being a HTML. I believe this corresponds to the RDF/XML:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://example.org/html9/">
<html rdf:about="http://example.org/hello">
<head rdf:parseType="Resource">
<title>A Page</title>
</head>
<body rdf:parseType="Resource">
<h1>My Page</h1>
<p>Hello World!</p>
</body>
</html>
</rdf:RDF>
Hmm, actually it seems quite sensible at this level of nesting, not all that far from Reto's DiscoBits idea. In fact those bnodes could usefully be swapped for # URIs. But I'd prefer not to think how it gets with e.g. a load of nested <div>s.
Dunno, I could imagine an advanced (RDF-friendly!) Wiki syntax looking something like that Turtle.
2010-08-14T16:59:04+01:00
ideas wikis rdf
