Saturday, 9 May 2009

EXPRESS and the 3-tier Architecture

We can picture a mapping between the way EXPRESS does things and the 3-tier architecture. This mapping depends on what type of system EXPRESS is providing an interface for.

EXPRESS can provide RESTful Semantic Web Services as an interface for systems that have complex business logic, or for systems that hold datasets and want to offer simple access to them. In addition to dividing systems into complex business logic systems and dataset access systems, we can divide them into legacy systems (non-web or non-Semantic) and Semantic systems (built on Semantic technologies such as OWL or RDF, and carrying Semantic metadata).

So in total we have 4 categories of systems for which EXPRESS can be used to provide RESTful Semantic Web Services:
  1. Legacy systems with complex business logic
  2. Legacy systems providing dataset access
  3. Semantic systems with complex business logic
  4. Semantic systems providing dataset access
The mapping between EXPRESS and the 3-tier architecture can be viewed as follows:

1. Legacy systems with complex business logic
  • Presentation Tier = Semantic Interface
  • Application Tier = Business Logic
  • Data Tier = Database
2. Legacy systems providing dataset access
  • Presentation Tier = Semantic Interface
  • Application Tier = CRUD to SQL mappings
  • Data Tier = Database
3. Semantic systems with complex business logic
  • Presentation Tier = Semantic Interface
  • Application Tier = Business Logic
  • Data Tier = Triple store or another Semantic database
4. Semantic systems providing dataset access
  • Presentation Tier = Semantic Interface
  • Application Tier = CRUD to SQL mappings.
  • Data Tier = Triple store or another Semantic database
This mapping raises an important question: if we have two systems, one with complex business logic and the other providing only dataset access, and both have similar OWL files, how do we show that the one with complex business logic does more than the other?

EXPRESS the Dilemma

As I started the implementation, some ideas began popping into my head. I know this should happen more often :) but anyhow...

We can view the Semantic Data in an EXPRESSive system as an interface to the client, which tells it how to interact with the system. The simple CRUD operations affect the Semantic interface in the same way no matter what the system is. What differs however is the business logic which is triggered or reflected by the CRUD operations on the Semantic interface.

There are two levels of actions associated with EXPRESSive systems: actions on the Semantic interface, and actions by the business logic. In fact there are three levels if we count actions in the real world. I will illustrate that with an example. When a customer orders a pizza:

-On the Semantic interface level, an order instance containing the details of the order is created, with connections to related instances such as the customer and the pizzas.
-On the business logic level, the database is updated and notifications are issued.
-On the real world level, pizzas are baked and then delivered.
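
The first two levels can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the EXPRESS implementation: the class names `SemanticInterface` and `BusinessLogic`, the listener mechanism, and the "received" status are all hypothetical. The triples reuse the property names from the Pizza Delivery use case. The real world level (baking and delivery) is of course outside the code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class SemanticInterface:
    """Hypothetical Semantic interface level: the triples a client can see."""
    triples: List[Triple] = field(default_factory=list)
    listeners: List[Callable[[str], None]] = field(default_factory=list)

    def create_order(self, order_id: str, customer_id: str,
                     pizza_ids: List[str]) -> str:
        order_uri = f"/Order/{order_id}"
        # The order instance is connected to its customer and its pizzas.
        self.triples.append((order_uri, "isOrderedBy", f"/Customer/{customer_id}"))
        for pizza_id in pizza_ids:
            self.triples.append((order_uri, "hasPizzas", f"/Pizza/{pizza_id}"))
        # Message from the interface level to the business logic level.
        for listener in self.listeners:
            listener(order_uri)
        return order_uri


@dataclass
class BusinessLogic:
    """Hypothetical business logic level: database update plus notification."""
    database: Dict[str, str] = field(default_factory=dict)
    notifications: List[str] = field(default_factory=list)

    def on_order_created(self, order_uri: str) -> None:
        self.database[order_uri] = "received"
        self.notifications.append(f"new order {order_uri}")


interface = SemanticInterface()
logic = BusinessLogic()
interface.listeners.append(logic.on_order_created)
interface.create_order("12345", "42", ["7", "9"])
```

In a system that only manages data, the `listeners` list would simply stay empty and the interface level would be all there is.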

Questions:
  • What is this interface? Is it an OWL or RDF file, a triple store, or a triple space?
  • Does the client query the interface structures to infer what happened and what should happen?
  • Isn’t this the application state? Shouldn’t the client keep track of that?
  • Maybe it isn’t the application state. Maybe the server state was altered (by the server or by the client) and the server has to show it?
  • Why is this interface important? It shows the consequences of the action.
So what does all that imply?
  • There will be messages exchanged between the Semantic Interface level and the business logic level.
  • In the case of simple data management there is no business logic level.
  • The generation of the Semantic interface can be automated.
In the next post I'll either clarify or complicate things further.

Cool URIs

When I design the use cases, I begin by listing the resources’ URIs. For example, in the Pizza Delivery use case:

URI

../Order

../Order/{Order_ID}

../Order/{Order_ID}/isOrderedBy

../Order/{Order_ID}/hasPizzas

../Order/{Order_ID}/hasStatus

../Order/{Order_ID}/hasTime


../Customer

../Customer/{Customer_ID}/

../Customer/{Customer_ID}/hasName

../Customer/{Customer_ID}/hasPhoneNo

../Customer/{Customer_ID}/hasAddress


../Pizza

../Pizza/{Pizza_ID}/



Months ago, I had a discussion with my supervisor about the structure of the URI. Should it be like this:

http://www.example.com/Order/{Order_ID}

Or like this:

http://www.example.com/{Order_ID} without the /Order/ prefix.

We decided that the 2nd method was better. I believe the reason was that we wanted to have URIs in the form Subject/Predicate/Object. But I think the problem was that we thought of a URI as an ID, which is not the case. We initially thought of the URIs this way:

http://www.example.com/{OrderID}/isOrderedby/{CustomerID}

But this isn’t quite correct, because we are not using the whole URI for the Customer. To assert that an Order was ordered by him, we need the customer’s URI, not only the ID. So it should be like this:

http://www.example.com/{OrderID}/isOrderedby/http://www.example.com/{CustomerID}

The reason I am going back to that is that when I started implementing, I discovered that I need something in the URI to indicate what type of resource we are dealing with: is it a Pizza, a Customer or an Order? This indicator could be either a path segment, as in /Order/{OrderID}, or a prefix on the ID, such as having all order IDs start with the characters “OID”.
Having a classifying URI is a requirement if I want to use existing REST frameworks. But even if I wanted to implement everything from scratch, how would my framework know where to forward an incoming request? For example, if a customer wanted to check the status of an Order with ID 12345, he would submit a GET to

http://www.example.com/12345/

How would my framework know that this is an Order, not a Customer or a Pizza? Having an automatically generated stub for each dynamically created resource is not an option, so there must be a way to indicate which kind of resource this is. Maybe having an indicator in the URI is not bad.
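
To make the routing problem concrete, here is a minimal sketch in Python of a dispatcher that relies on a type indicator in the path. Everything in it is hypothetical: the handler functions, the `HANDLERS` table and the `route` function are my own illustration, not part of any REST framework.

```python
from typing import Callable, Dict, Tuple


def get_order(resource_id: str) -> str:
    return f"order {resource_id}"


def get_customer(resource_id: str) -> str:
    return f"customer {resource_id}"


def get_pizza(resource_id: str) -> str:
    return f"pizza {resource_id}"


# One handler per resource type; the first path segment selects it.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "Order": get_order,
    "Customer": get_customer,
    "Pizza": get_pizza,
}


def route(path: str) -> Tuple[int, str]:
    """Return (status code, body) for a GET on the given path."""
    segments = [s for s in path.split("/") if s]
    if len(segments) < 2 or segments[0] not in HANDLERS:
        # Without a type indicator there is nothing to dispatch on.
        return 404, "unknown resource type"
    return 200, HANDLERS[segments[0]](segments[1])


print(route("/Order/12345/"))  # dispatched to get_order
print(route("/12345/"))        # no indicator, so no handler can be chosen
```

With /Order/12345/ the dispatcher can pick a handler from the first segment alone. With /12345/ it would have to look the ID up in every collection, or keep a separate index from IDs to types.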

I wanted to know the standard way to assign URIs, so I checked the W3C. In the document about Cool URIs for the Semantic Web, they used URIs in the following form

http://www.example.com/people/alice

So to see if this is applied in practice, I checked RKBexplorer to see how they have been doing things. Using their CRS, I looked up the equivalent URIs for Ian Millard, a person that I’m sure is in the system, and these came up:

1. http://citeseer.rkbexplorer.com/id/resource-CSP211403-1e189c7f65e747386b40ae1031a26e27
2. http://data.semanticweb.org/person/ian-millard
3. http://dblp.rkbexplorer.com/id/people-5c3b0c986bef5fa4e181c5830d56326b-9118ee1bfc54e3cb07408669fc2f7c48
4. http://eprints.rkbexplorer.com/id/ecs-soton/person-04860
5. http://id.ecs.soton.ac.uk/person/4860
6. http://southampton.rkbexplorer.com/id/person-04860
7. http://wiki.rkbexplorer.com/id/ian_millard

I omitted the duplicates that had the same domain name because they used the same naming scheme. Looking at the URIs, we can see that it is standard practice to have an indicator in the URI to show what kind of resource it is.
There is something to watch out for, however, if I want to adhere to Cool URIs for the Semantic Web: the distinction between URIs for informational resources and non-informational resources. For more on that, check
http://www.w3.org/TR/2007/WD-cooluris-20071217/
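
One of the approaches described in that document is to answer a request for the URI of a thing (a non-informational resource) with a 303 See Other redirect to a document about it. A minimal sketch of that idea, assuming hypothetical /id/ and /doc/ path prefixes and a hypothetical `respond` function of my own:

```python
from typing import Dict, Tuple


def respond(path: str) -> Tuple[int, Dict[str, str]]:
    """Return (status code, headers) for a GET on the given path."""
    if path.startswith("/id/"):
        # The thing itself cannot be sent over HTTP, so point the client
        # to a document that describes it.
        return 303, {"Location": "/doc/" + path[len("/id/"):]}
    if path.startswith("/doc/"):
        # The describing document is an ordinary informational resource.
        return 200, {"Content-Type": "application/rdf+xml"}
    return 404, {}


print(respond("/id/alice"))
print(respond("/doc/alice"))
```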

Ruby on Rails

To show how - and if :) - EXPRESS works, I will implement a prototype of the approach and test it on some use cases. I want to use a REST framework to make things easier. There are some REST frameworks, like Ruby on Rails and Restlet. I started by reading about Ruby on Rails. From what I read, I understood that it makes simplifying assumptions that might conflict with my requirements, for example:
  • Every item is either a list or an item in a list, which may not work out for some resources like: Order123/hasDateTime
  • IDs of the items must be numbers.
  • Exchanged formats are either XML or key-value pairs
  • No linking between resources
There may be workarounds for these problems, but they aren’t straightforward. So I chose Restlet, and for now it seems OK. It is a Java API for modelling resources and responding to HTTP requests. It offers a lot more, but these are the features that I am concerned with right now. Another good thing is that I’ll also be using Jena, so doing everything in Java may simplify things. Something in Restlet may cause a problem though: it doesn’t have OWL/RDF in the representations’ MediaTypes. However, there is RDF/XML; I am not sure if it will work instead.

Sunday, 3 May 2009

Is EXPRESS RESTful

For a system to be RESTful it must adhere to the constraints of having a uniform interface and using hypertext as the engine of the application state. In light of these constraints we will describe how EXPRESS is RESTful, first by showing how it has a uniform interface, then by discussing how it can use hypertext to change states. The examples explained are from EXPRESS.
According to Fielding, having a uniform interface means: “having a unique resource identifier mechanism, having access methods with the same semantics for all resources, resources are manipulated by exchanging representations and messages for actions and representations are self descriptive.”[1]
In EXPRESS a resource is a class, an instance or a property, and each one of them has a URI. The access methods that can be applied are OPTIONS, GET, PUT, DELETE and POST. As for having the same semantics across resources, we can recognize a pattern, illustrated in the following table:

Resource             | OPTIONS          | Semantics
Class in an ontology | GET              | Gets the structure of the class
                     | POST             | A factory endpoint to create instances of this class
Instance or property | GET {only}       | Read only instance or property
Instance or property | GET, PUT, DELETE | Modifiable instance or property
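
The pattern in the table can be written down as a small lookup, sketched here in Python. The names `ALLOWED_METHODS`, `options` and `is_allowed`, and the three kind labels, are hypothetical; the only thing borrowed from HTTP is the `Allow` header, which is how a server normally reports the methods a resource supports.

```python
from typing import Dict, Tuple

# Methods per kind of resource, taken from the table.
ALLOWED_METHODS: Dict[str, Tuple[str, ...]] = {
    "class": ("GET", "POST"),
    "read_only": ("GET",),                   # read only instance or property
    "modifiable": ("GET", "PUT", "DELETE"),  # modifiable instance or property
}


def options(kind: str) -> Dict[str, str]:
    """Headers a server could return for OPTIONS on a resource of this kind."""
    return {"Allow": ", ".join(ALLOWED_METHODS[kind])}


def is_allowed(kind: str, method: str) -> bool:
    # OPTIONS itself is available on every resource.
    return method == "OPTIONS" or method in ALLOWED_METHODS[kind]


print(options("class"))
print(is_allowed("read_only", "PUT"))
```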



But taking this to a higher level of semantics, does the system respond internally to posting an order in the same way it responds to posting a customer? The answer is no. We believe it is still RESTful, because the semantics here are inferred from the shared understanding of the ontology, or in RESTful terms, the representations.
As for manipulation of resources by exchanging representations, this is done in EXPRESS. For example, to modify a customer, a representation of the customer in OWL or RDF is PUT to the customer’s URI. Messages are self-descriptive in the sense that they contain the method and the representation. The representation, whether it is OWL or RDF, is inherently self-describing.
The other constraint of REST is hypertext as the engine of the application state. Fielding describes it as: “each response contains a partial representation of server-side state, some representations contain directions on how to transition to the next state and that each steady-state (page) embodies the current application state.”
For an example of the response containing part of the application state, invoking GET on a resource returns a representation of that resource, which is part of the application state. Another example is invoking POST on a URI representing a class: a generated URI of an instance of that class is returned, which is part of the application state (a new instance/resource is created). However, when invoking a PUT on an instance to modify the resource, the message sent by the client contains the modified resource but the response from the server (in EXPRESS) doesn’t. This can be changed easily, so that the server responds with a representation of the modified instance if the modification is accepted. But the question is, is that a requirement of REST? By “response”, does Fielding mean a response by either the server or the client, in which case EXPRESS is already RESTful, or does he mean that all responses from the server must contain part of the application state? Another question is what the impact of this constraint is. Answering it may guide us to the answer of the previous one.
Regarding the other statement, “some representations contain directions on how to transition to the next state”, this can be accomplished through the OWL file, since it states what the client can do with the resources. Another example is when a client wants to create a new resource: it POSTs to a class’s URI, a newly generated URI for an instance of that class is sent back to the client, and the client can use that URI to move to the next state, which is PUTting the instance’s content to it.
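
The POST then PUT sequence can be sketched with a tiny in-memory stand-in for the server. This is an illustration only: the `InMemoryServer` class, the numeric IDs, the status codes and the `Location` header are my assumptions, not the EXPRESS implementation. The `put` method here returns the stored representation, which is the variant discussed above where the server echoes the modified instance.

```python
import itertools
from typing import Dict, Optional, Tuple


class InMemoryServer:
    """Hypothetical server holding instances keyed by their URI."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.resources: Dict[str, Optional[str]] = {}

    def post(self, class_uri: str) -> Tuple[int, Dict[str, str]]:
        # Factory endpoint: generate a URI for a new, still empty instance.
        uri = f"{class_uri}/{next(self._ids)}"
        self.resources[uri] = None
        return 201, {"Location": uri}

    def put(self, uri: str, representation: str) -> Tuple[int, Optional[str]]:
        if uri not in self.resources:
            return 404, None
        self.resources[uri] = representation
        # Echo the stored representation back to the client.
        return 200, representation


server = InMemoryServer()
status, headers = server.post("/Order")
# The client does not construct the instance URI; it uses the one it was given.
print(server.put(headers["Location"], "<rdf:RDF>...</rdf:RDF>"))
```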
For the last statement, “each steady-state (page) embodies the current application state”, I am not quite sure how that works. It may be that the representation the server sends to the client contains the current application state.

I believe the picture right now is still quite vague; going into implementation may clarify it more.

[1] Roy T. Fielding, A Little REST and Relaxation, JAZOON’07 The international Conference on Java Technology, 24-28 June, 2007, Zurich.
http://www.parleys.com/display/PARLEYS/A+little+REST+and+Relaxation

Thursday, 16 April 2009

Understanding REST

I’ve been reading to answer the questions in my last post, and the picture is a bit clearer.

I think I now understand what we gain by HATEOAS. Instead of the client guessing where to go throughout an application by constructing URIs, it is guided by the server, which provides it with hyperlinks (in the representations) to follow.

What do we gain from that?
More decoupling, and thus more reusability and less likelihood of clients breaking.
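
A small sketch of what that looks like from the client side, in Python. The `RESPONSES` table stands in for a server, and the link relation names ("orders", "latest", "customer") and URIs are made up for the illustration. The point is that the client only knows the entry URI and the relation names, never the URI structure.

```python
from typing import Any, Dict, List

# Canned representations standing in for a server. Each one carries the
# links the client may follow next.
RESPONSES: Dict[str, Dict[str, Any]] = {
    "/": {"links": {"orders": "/Order"}},
    "/Order": {"links": {"latest": "/Order/12345"}},
    "/Order/12345": {"status": "baking", "links": {"customer": "/Customer/42"}},
    "/Customer/42": {"name": "Alice", "links": {}},
}


def fetch(uri: str) -> Dict[str, Any]:
    return RESPONSES[uri]


def follow(start: str, relations: List[str]) -> str:
    """Walk from the entry point by following named links, one per step."""
    uri = start
    for relation in relations:
        uri = fetch(uri)["links"][relation]
    return uri


print(follow("/", ["orders", "latest", "customer"]))
```

If the server later moved customers to a different path, only the links in its representations would change and this client would keep working.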

So that is HATEOAS, but what is statelessness?

The application state is on the client, because the client is keeping track of where it is and what it is doing. The server doesn’t need to do that.
In the book RESTful Web Services the authors differentiate between two types of state: application state and resource state. Resource state is on the server and application state is on the client.

So in my case of ordering a pizza, should the order be a resource created by the client on the server, therefore a resource state? Or should it be part of the application state and should be stored on the client side?

There are some things to note here:
-In my pizza delivery there would be another client interested in the resource, the carrier service, so having the order on the client would not be very efficient.

-Scalability issues: having the order on the server won’t be as scalable and reliable as having the order on the client. In the RESTful Web Services book the authors wrote, “When your application is stateless you don’t need to coordinate activities between servers, sharing memory or creating “server affinity” to make sure the same server handles every request in a session. You can throw web servers at the problem until the bottleneck becomes access to your resource state. Then you have to get into database replication, mirroring or whatever strategy is most appropriate for the way you have chosen to store your resource state”. My problem with that is that what clients mostly deal with in Web Services is resource state. Or are we thinking about resources in the wrong way?

-I believe the problem boils down to who creates resources. In the book, clients can create resources. In the dissertation, however, to the best of my knowledge, Fielding didn’t mention that clients could create resources on the server. He referred to creators of resources as authors, who I assume have control over the server. So if the client can create resources on the server, then the server is storing part of the application state, which, according to what I understood from Fielding's dissertation, isn't RESTful.

-Also in the dissertation there is an example of a shopping cart: “A state mechanism that involves preferences can be more efficiently implemented using judicious use of context-setting URI rather than cookies, where judicious means one URI per state rather than an unbounded number of URI due to the embedding of a user-id. Likewise, the use of cookies to identify a user-specific “shopping basket” within a server-side database could be more efficiently implemented by defining the semantics of shopping items within the hypermedia data formats, allowing the user agent to select and store those items within their own client-side shopping basket, complete with a URI to be used for check-out when the client is ready to purchase.” I think that if this were implemented the RESTful Web Services book way, a client would create a shopping cart at the server and then post items to it, which won’t be as scalable as Fielding’s client-side shopping basket.
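
To make the contrast concrete, here is how I picture the client-side basket, as a minimal Python sketch. The `ClientBasket` class, the /Checkout URI and the shape of the request are all hypothetical. The basket lives entirely on the client, and the server only hears about it in a single request at check-out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ClientBasket:
    """Hypothetical basket kept by the user agent, not by the server."""
    checkout_uri: str  # supplied by the server inside a representation
    items: List[str] = field(default_factory=list)

    def add(self, item_uri: str) -> None:
        # Selecting an item changes application state on the client only.
        self.items.append(item_uri)

    def checkout_request(self) -> Tuple[str, str, Dict[str, List[str]]]:
        # The only message the server ever sees about this basket.
        return "POST", self.checkout_uri, {"items": list(self.items)}


basket = ClientBasket(checkout_uri="/Checkout")
basket.add("/Pizza/7")
basket.add("/Pizza/9")
print(basket.checkout_request())
```

In the server-side version, each `add` would instead be a request that changes resource state on the server.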

Difference between REST and ROA
By REST I am referring to Fielding’s dissertation, and by ROA I am referring to the RESTful Web Services book.
-In REST there is no CRUD. It states that a uniform interface should exist but not what it is.
-In REST there is more emphasis on HATEOAS.
-In ROA there is more emphasis on uniform interface: GET, PUT, POST and DELETE.
-ROA uses the HTTP protocol and its methods to implement RESTful Web Services. In REST a set of constraints is defined, but they are not tied to any protocol or system.
-Web Services are not mentioned in REST, but they are the heart of ROA.

Limitations of REST
As stated by Fielding in his dissertation:
“The REST interface is designed to be efficient for large grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction.”
“not optimal for other forms of architectural interaction”, so why the fuss about RESTful Web Services? They are not transferring large grain hypermedia. Is it that the WS community saw in REST what Fielding didn’t see?

Another limitation, which was pointed out by “Triple Space Computing” researchers when considering REST as an infrastructure, is that there isn’t a way for the server to send notifications to clients asynchronously.