'Towards a Science of the World Wide Web' by James Hendler

marielle · Post by **marielle** » Wed Dec 06, 2006 2:00 pm

I am going to the talk below this afternoon. If there is any interest, I would be happy to share my notes here.

-------- Towards a Science of the World Wide Web ----------
-------- by Professor James Hendler -----------

Computer Science research in the area of the World Wide Web has largely focused on improved search for individual web pages or on the modeling of Web connectivity (using the tools of networking). However, given the huge impact of the Web on our world, this seems to be an impoverished view. What are the principles of engineering that have made the Web flourish? How can we engineer new technologies, that will extend the capabilities of the Web? What are the social impacts of Web use, and how can Web technologies both allow greater freedoms while preserving the ones we have?

In this talk, I will use some examples from Semantic Web and "policy aware" information access to demonstrate new Web technologies and how we might explore some of the trade-offs between making it easier to integrate information on the Web with protecting that information from abuse. I will explore some of the emerging trends on the Web including social networking, blogging, and beyond page search, and discuss some of the research and technology challenges that they pose to continued Web growth and access, and some new technologies being explored to address these challenges.

Jerry Muelver · Post by **Jerry Muelver** » Thu Dec 07, 2006 3:06 am

Marielle, I, for one, would be interested. I'm involved in the "technology" plan development for a small K12 school district. Any input on the future of the science of the web would be most welcome, even though we are at this stage just struggling to make the present web actually useful for education.

marielle · Post by **marielle** » Thu Dec 07, 2006 11:30 am

They told me they will put all the slides on the web.
Check this space in about a week's time.
http://www.inf.ed.ac.uk/events/lectures/

There is a more technical talk today. Those slides will probably not make it onto the web, so I will share my notes for this second session.

If you are interested in this, you may be interested in last year's presentation, by Tim Berners-Lee: http://media.inf.ed.ac.uk/lectures/. You get everything, slides + video of the talk.

The key takeaway for a general audience: the Corporate Semantic Web (making good money by using data integration/linkable concepts technologies) is starting to flourish.

For educators like you: once you have the slides, check out the end of the talk. It has some examples about a scout troop and how to give access to different pictures based on simple, expert-system-style rules. If the person who requests to view the photo is a member of the troop, then grant access. If the person is not part of the troop, then grant access only if the parents have authorized the photos to be shown to the public, etc.
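To make the scout troop rules concrete, here is a minimal Python sketch of that kind of policy check. It is my own illustration, not code from the talk: the names `Photo`, `can_view`, `parents_authorized_public` and the sample members are all assumptions, and a real "policy aware" system would express such rules declaratively rather than hard-coding them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    """A photo plus the one policy flag the rules need (hypothetical model)."""
    photo_id: str
    parents_authorized_public: bool


def can_view(requester: str, photo: Photo, troop_members: set[str]) -> bool:
    """Apply the two rules described above, in order."""
    # Rule 1: a member of the troop is always granted access.
    if requester in troop_members:
        return True
    # Rule 2: anyone else is granted access only if the parents
    # have authorized the photo to be shown to the public.
    return photo.parents_authorized_public


# Hypothetical sample data for illustration only.
troop = {"alice", "bob"}
camp_photo = Photo("camp-01", parents_authorized_public=False)
parade_photo = Photo("parade-07", parents_authorized_public=True)

print(can_view("alice", camp_photo, troop))      # troop member
print(can_view("mallory", camp_photo, troop))    # outsider, not authorized
print(can_view("mallory", parade_photo, troop))  # outsider, authorized
```

The point of the example in the talk, as I understood it, is that the decision depends on facts about the requester and the photo, which is exactly the sort of linked data these technologies are meant to integrate.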

Is there any relatively specific problem that you hope to solve with semantic web (data integration and concept linking) types of technologies?

Marielle

marielle · Post by **marielle** » Thu Dec 07, 2006 10:42 pm

Here is a summary of a second talk, which was held in front of a smaller, more specialized audience (Artificial Intelligence lab seminar, Edinburgh University).

The title of the talk was Dark Side of the semantic web

The "dark side" in the title doesn't refer to a negative force but rather to the hidden face of the moon: when people talk about the semantic web, the concept that predominantly comes across is "semantic", while the "web" part is important as well.

A notion Prof. Hendler came back to very often during his talks, using different allegories, is that power comes from linking together a variety of pre-existing islands of knowledge.

This is really what the semantic web is about: data integration and concept linking. To quote Prof. Hendler: "Linking is power". Resources gain power by exploiting links to web resources, to data resources, to each other, to Web 2.0-like annotations on web resources, or to annotations in official resources.

An interesting characteristic of semantic web technologies is that, as in many other domains, a high end and a low end coexist (high end = big corporations with huge amounts of data to manipulate and huge budgets to buy or fund the design of specialized tools; low end = hackers, hobbyists, small or even one-person teams). However, here it seems that opportunities for commercial success (making money) are more apparent at the low end than at the high end. What we mostly see is commercialisation coming from Ajax, three-tier web architecture, etc. Commercial success doesn't come so much from the ability to design a tool that implements complex processes to manipulate huge amounts of data. Rather, it comes from using a bit of help from knowledge to make an application more intelligent and more interesting to the end user.

------------------------------------------------------------------------------

Then followed a few slides with a number of acronyms
- RDF, OWL (Web Ontology Language), SPARQL (query language for RDF) [TBD: explain what these technologies are about]
- RIF: Rule Interchange Format
- (Open)Cyc: world's largest and most complete general knowledge base and commonsense reasoning engine.
- SKOS (Simple Knowledge Organization System)
- LSID (Life Sciences Identifier)
- MMORPG (massively multiplayer online role-playing game)

Most of these appear in some form in the layer cake that Berners-Lee usually shows in his presentations

(this is a rather old version; the speaker showed an expanded version where, in place of the ontology vocabulary, you had SPARQL, OWL, RDF)

The version most like the one that was presented is this:

Basically, that graph is supposed to capture all the different layers that the semantic web is made of.

------------------------------------------------------------------------------

An excellent point Prof. Hendler made was that most of these technologies are fairly complex and require a big time investment to master and to understand what can and cannot be done with them. Even though a good collection of tools is starting to exist, access to these technologies is still not a piece of cake.

To quote him: "If you believe in a vision, then you should make it easy for developers to follow that vision."

Ontologies can be used to capture any knowledge expressed in a semi-structured way.

An interesting new trend is therefore to propose subsets of these RDF/OWL technologies. Initially, it was thought that people would ask for more expressivity in these markup languages. The trend now is rather to ask for more specific, limited expressivity: to use only the little bit of expressivity that helps you a lot and leave aside the more extended expressivity that you only need very occasionally.

He mentioned various initiatives in that direction
- GRDDL: embedding RDF into traditional content (like microformats, I assume); apparently a way of extracting RDF from XML docs (using XSLT). There is some W3C info on this.
- OWL mini (or OWL Fast, or OWL Prime, or OWLET, or whatever it might end up being called). That's fairly new and you don't really find much more than mailing list posts on the topic.
- RDF++. Again, something that appears to be very much under development. There are some blog posts and internal presentations.

------------------------------------------------------------------------------
A few more points:

1. According to James Hendler, the data sources that application code queries have started to shift from relational databases to RDF triple stores. In short, "RDFStore implements a generic hashed data storage that allows to serialise RDF models, resources, properties and property values either to disk or in-memory data structures."

2. What's so special about RDF/OWL?

A. To start with, they don't require an a priori, fully structured representation of the knowledge, as a relational database does. They are designed to cope with semi-structured knowledge.

B. RDF+OWL are designed to live in an open and distributed environment.

C. OWL, for instance, is grounded in URIs (Uniform Resource Identifiers). This means that anywhere within your knowledge tree, you can put a URI as the content of a node. Rather than giving a definition of a term or cloning the definition from elsewhere, as you would be forced to do in a relational database, you can simply refer to some knowledge that is held elsewhere (and eventually kept up to date on that other site). The big distinction to make is between a URL in the usual web page model, which links to a web page such as that of the Guardian (a newspaper in the UK) where the information is liable to change every day, and a URI used in a knowledge tree. In the second context, the URI doesn't let you retrieve the information content per se but information about where that content is to be found. An important implication is of course that the URI for this second type of resource (a pointer to the information content) has to remain persistent for linking to it to be useful.

D. Ontologies allow for web-like relationships between data, which is not easily done in a typical relational database. This corresponds to a point he made three times during the talk, the general idea being that links on the web have some parallel with links in the real world. He made a connection with this: Connected Services Framework 3.0 Developers Guide.

3. The current model of the web is one where information is pulled. Though Ajax can make the pull happen a lot quicker and give an illusion of push technology, it still remains a pull model. The contribution of agents would be to act as pushing agents (probably of more specific interest for mobile web technologies).
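To illustrate points 1 and 2 above, here is a toy in-memory triple store in Python. It is only a sketch under my own assumptions, not RDFStore and not any real RDF library: the `TripleStore` class, its `add` and `match` methods, and every `http://example.org/` URI are hypothetical. It shows that statements can be added without any schema declared up front, and that a node can simply hold a URI pointing at knowledge kept elsewhere.

```python
Triple = tuple[str, str, str]


class TripleStore:
    """Toy store of (subject, predicate, object) statements."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        # No schema: any subject may carry any predicate.
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None) -> list[Triple]:
        """Return the triples that fit the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


# All URIs below are made up for the example.
EX = "http://example.org/"
store = TripleStore()
store.add(EX + "troop42", EX + "hasMember", EX + "alice")
store.add(EX + "troop42", EX + "hasMember", EX + "bob")
# Semi-structured: only alice has a name recorded, and that is fine.
store.add(EX + "alice", EX + "name", "Alice")
# The object is a URI referring to knowledge held on another site.
store.add(EX + "alice", EX + "reads", "http://example.org/newspapers/guardian")

members = [o for _, _, o in store.match(subject=EX + "troop42",
                                        predicate=EX + "hasMember")]
print(members)
```

In a relational database the missing name for bob would mean a NULL column or a schema change; here it is just a statement that was never made. A real triple store would add persistence, indexing and a query language such as SPARQL on top of this idea.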

------------------------------------------------------------------------------

Further readings

- Spinning the Semantic Web (book), with a Review
- Three Theses of Representation in the Semantic Web
- Agents and the Semantic Web by Ky Van Ha (2005)
- Access Control on RDF Triple Stores from a Semantic Wiki Perspective (pdf) by Horrocks, I., & Patel-Schneider, P.F. (2003)

Further Links

- Swoogle - Semantic Web Search. Swoogle is a crawler-based indexing and retrieval system for the Semantic Web. That is, it indexes RDF and OWL documents, rather than plain HTML documents.

- ebiquity - Building intelligent systems in open, heterogeneous, dynamic, distributed environments. Check out the blog, in particular.

- Ontology Resources on the RevEd wiki

marielle · Post by **marielle** » Fri Dec 08, 2006 1:25 am

The notion of the Semantic Web is a very abstract one. It is very much about a collection of technologies that gain a lot of extra power by being bundled together. It's a forest where it is quite difficult to perceive the trees, their shapes and colors... or, in this context, their function and role within a larger ecosystem. It's in fact quite difficult to understand well what the notion of the semantic web refers to.

That presentation was for a specialist audience. It may fall into the category of things that tend to be "obvious" to me and not that obvious to people who did not get to attend an academic seminar a week for nearly 10 years (not that I feel I understand the notion of the Semantic Web well, despite reading a few books and papers and discussing it with knowledgeable people).

If you feel completely lost, simply wait for the slides of the first, general-public lecture to be made public. They refer to more common notions and more common usage of the web. However, as the speaker said, when you are asked "show me a website that demonstrates how the semantic web makes a difference", you cannot really show anything. The website looks very much the same from a user interface point of view. What changes is the data source back-end. Rather than having a highly structured relational database, you have an ontology system that allows for the storage and retrieval of semi-structured data.

What does this mean? I will try to explain that. More exactly, I will try to find a few links that explain this better than I could.

Don't hesitate to ask questions if you need some basic knowledge. Simply be aware that I probably won't have time for lengthy discussions until next Monday.

Jerry Muelver · Post by **Jerry Muelver** » Fri Dec 08, 2006 1:06 pm

marielle wrote: Is there any relatively specific problem that you hope to solve with semantic web (data integration and concept linking) types of technologies?

It's more a generic problem -- how to match technology, curriculum, and (inadequate) instructor proficiency into a concept-threaded web that students can follow independently or collaboratively without heavy teacher intervention. In other words, how to develop cross-disciplinary projects so students can use math, social studies, personal finance, English (writing skills), history (research skills), and computer/web skills to work on a single project all day, from class to class, in a task-oriented integrated manner.

I'm not an educator per se -- I'm a tech writer with some 40 years of experience, now pressed into duty substitute teaching in a K-12 system. The teaching experience has opened my eyes to the dismal state of education today. Bill Gates is right -- education is a broken system. I'm trying to stick my finger into one small crack in the dike.

marielle · Post by **marielle** » Fri Dec 08, 2006 1:17 pm

Jerry Muelver wrote: The teaching experience has opened my eyes to the dismal state of education today. Bill Gates is right -- education is a broken system. I'm trying to stick my finger into one small crack in the dike.

I am from the other end. An educator with silver fingers (good with technology). I have a similar analysis. We should talk about this; I believe it would make for an interesting conversation. I recently worked for an e-learning startup that was very much playing with these ideas. I am still in contact with them in a consultancy role. I would be keen to share some of my notes on this on the K-12 forum (hopefully sometime next week)... feel free to start with your own take on (1) what the main blocking elements are and (2) what types of solutions need to be created to overcome these blockages.

marielle · Post by **marielle** » Fri Dec 08, 2006 1:21 pm

Maybe I should post on the K-12 forums the notes that had been contributed on the revolution-education list a year or two ago and that I had archived on my rev-ed wiki:

The theme was then Education, what is needed

A problem, however, is that the forums are readable only by members. So I can't make the content available to education users who are not part of the revolution community yet.

marielle · Post by **marielle** » Sun Dec 10, 2006 9:42 pm

Jerry,

a paper you may be interested in:

Towards a Corporate Semantic Web Approach in Designing Learning Systems. Review of the TRIAL SOLUTION Project (pdf)

I haven't read more than the abstract.

Abstract. The TRIAL SOLUTION EU project focused on the publication of personalized electronic documents based on existing scientific books. Its general approach consists in slicing electronic books into elementary learning resources and annotating them with metadata enabling the retrieval of resources by a semantic search. The annotated resources are published into a repository available for teachers or students to produce personalised teaching or learning materials with delivery tools. In this paper we give an overview of the project, emphasizing the authoring tool we have developed to annotate the learning resources, and we review it by the light of the Semantic Web.

strongbow · Post by **strongbow** » Fri Oct 17, 2008 5:50 am

Hi Marielle + all

Interesting to read your discussion about the Semantic web, Learning systems, etc. I finished my Masters Thesis on Hypermedia Systems back in 92 and the systems that I researched were very much along these lines back in the 80s... I actually only wrote 1 paragraph or so about Tim Berners-Lee and his development at CERN of the WWW... In my opinion it took off because it was built on the lowest common denominator - very simple text based system, and thus cheap to start off with. But with that "cheapness" came a lack of power (if I may call it that!!), which is being progressively increased as the initial limitations become apparent. Hence the Semantic web and other efforts.

Looks like the TRIAL SOLUTION EU project built a tool similar to what I built back then - realising the need to provide extra tools to allow the Educator or "Guide" to create a lesson or "guided path" through the existing material, enhanced through the addition of metadata and other tools. Simply linking material is not always enough - if you want to get a particular point across you may need to explain why the link was made - metadata for this link. Each node contains its own "data" or information, and the links provide extra info, both through the sequencing of the links, as well as the metadata that was provided for the links and nodes. Of course a simple webpage provides some of these capabilities...!
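A rough Python sketch of what I mean by a guided path, where each link carries its own metadata explaining why it was made. This is not the tool I built back then nor the TRIAL SOLUTION one; the `Node`, `Link` and `GuidedPath` names and the sample lesson are made up purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A piece of existing material with its own content."""
    node_id: str
    content: str


@dataclass
class Link:
    """A link between two nodes, plus the reason the guide made it."""
    source: str
    target: str
    reason: str


@dataclass
class GuidedPath:
    """An ordered sequence of annotated links through existing nodes."""
    title: str
    nodes: dict[str, Node] = field(default_factory=dict)
    links: list[Link] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_link(self, source: str, target: str, reason: str) -> None:
        # Both ends must already exist in the material.
        for node_id in (source, target):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.links.append(Link(source, target, reason))

    def walk(self) -> list[str]:
        """Describe each step in the order the guide sequenced it."""
        return [f"{l.source} -> {l.target}: {l.reason}" for l in self.links]


# Hypothetical lesson for illustration only.
path = GuidedPath("Budgeting a class trip")
path.add_node(Node("math", "Percentages and averages"))
path.add_node(Node("finance", "Building a simple budget"))
path.add_node(Node("writing", "Writing a proposal"))
path.add_link("math", "finance", "percentages are needed to split costs")
path.add_link("finance", "writing", "the budget becomes the core of the proposal")

for step in path.walk():
    print(step)
```

The information is in two places: the order of the links gives the sequencing, and the `reason` field gives the explanation that a bare hyperlink cannot.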

There's definitely a lot to do here - Google provides one part of the solution, but I'm not sure that the WWW infrastructure is up to the task. Time will tell...

cheers

Alan
