What is WPS?

Geographic Information Processing for the Web
The Web Processing Service (WPS) offers a simple web-based method of finding, accessing, and using all kinds of calculations and models ([1]).

Web Processing Service offers you the following:

  • a simple web service that enables remote calls of calculations.
  • WPS services are self-describing.
  • WPS is an interface description. Several implementations exist.
  • WPS is part of the OGC open standards family: WMS, WFS, WCS, SOS, CSW, ...
  • can be called with simple HTTP requests with key/value pairs or
  • can be called with HTTP post-requests with XML documents.
  • a lightweight specification. Comparable to XML-RPC ... but XML-RPC is not self-describing.
  • can be registered in Catalog Services.

You will find further information in the appendix: WPS Documentation.

In the following we show an example with a Word Counter function which is enabled as a web-service using WPS.

Defining a Word Counter function

In the following example we will use the Word Counter function:

def count_words(file):
    """Calculates number of words in text document.
    Returns JSON document with occurrences of each word.
    """
    return json_doc

This Python function has the following parts:

  • a name or identifier: count_words
  • a description: Calculates number of words ...
  • input parameters: file (mime type text/plain)
  • output parameters: json_doc (mime type application/json)
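
The function body is left open above. A minimal sketch of how it might be filled in, assuming that file is a path to a UTF-8 text file, that a "word" is any run of letters, digits, or underscores, and that counting is case-insensitive (none of these choices come from the original function):

```python
import json
import re
from collections import Counter


def count_words(file):
    """Count how often each word occurs in a plain-text document.

    Assumption: `file` is a path to a UTF-8 text file. The return value is
    a JSON string mapping each lower-cased word to its number of occurrences.
    """
    with open(file, encoding="utf-8") as handle:
        text = handle.read()
    # Assumption: a "word" is a run of letters, digits, or underscores.
    words = re.findall(r"\w+", text.lower())
    return json.dumps(Counter(words), sort_keys=True)
```

The identifier, the description, the input, and the output stay the same as listed above; only the body is new.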

Now, we would like to run this function remotely using a simple web-service. To get this web-service we can use WPS. The function parts (name, parameters) are all we need to know to define a WPS process.

WPS definition of Word Counter

To add a new process you need to define the input and output parameters. For the Word Counter process this looks like the following.

../_images/WpsInOut.png

Here is another example for a Text Generator process. We will use it later for chaining processes.

../_images/WpsTextGenerator.png

There are two types of input/output parameters:

  • Literal Parameters (green): these are simple data types like integer, boolean, string, ...
  • Complex Parameters (yellow): these are documents with a MIME type (xml, csv, jpg, netcdf, ...) provided as a URL or directly.

An input/output parameter has:

  • a name or identifier
  • a descriptive title
  • an abstract giving a description of the parameter
  • multiplicity ... how often can this parameter occur: optional, once, many ...
  • in case of literal parameters a list of allowed values.
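
The attributes listed above can be pictured as a small record. The following is only an illustrative sketch: ParameterDescription and its field names are hypothetical and are not part of WPS or PyWPS. Multiplicity is modelled with a minimum and a maximum number of occurrences.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParameterDescription:
    """Hypothetical container for the attributes of one input/output parameter."""

    identifier: str
    title: str
    abstract: str = ""
    min_occurs: int = 1  # 0 means the parameter is optional
    max_occurs: int = 1  # values above 1 allow the parameter to occur many times
    allowed_values: List[str] = field(default_factory=list)  # literal parameters only
    mime_types: List[str] = field(default_factory=list)  # complex parameters only

    def accepts_count(self, n: int) -> bool:
        """Return True if the parameter may occur n times in one request."""
        return self.min_occurs <= n <= self.max_occurs


# The "text" input of the Word Counter process, described with this record.
text_input = ParameterDescription(
    identifier="text",
    title="Text Document",
    mime_types=["text/plain"],
)
```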

For more details see the following WPS Tutorial.

Chaining WPS processes

If you know the input/output parameters of processes, you can chain them. For example, we will chain a Text Generator process to our Word Counter process.

../_images/WpsChain.png

The text document output of the Text Generator process becomes the input of the Word Counter process.

You can chain processes manually by calling them one after the other. The WPS specification also allows you to chain processes with a single WPS request. To get even more flexibility (using if-clauses, loops, monitoring ...) you can also use a workflow engine (Taverna, VisTrails, Dispel4py, ...).
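
Manual chaining can be sketched as two calls in sequence. In this sketch run_process is a hypothetical callable that executes one WPS process and returns the URL of its output document, and the identifier "textgenerator" is an assumed name for the Text Generator process:

```python
def chain_text_to_wordcount(run_process, generator_inputs):
    """Call two processes one after the other, passing the output along.

    `run_process(identifier, inputs)` is a hypothetical callable that executes
    one WPS process and returns the URL of its output document.
    """
    # Step 1: the text generator produces a text document.
    text_url = run_process("textgenerator", generator_inputs)
    # Step 2: that document becomes the "text" input of the word counter.
    return run_process("wordcount", {"text": text_url})
```

Because the first process hands over only a URL, the client never needs to download the intermediate text document itself.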

You will find more details about chaining in the GeoProcessing document and the GeoServer Tutorial.

WPS process implementation with PyWPS

There are several WPS implementations available (GeoServer, COWS, ...). In birdhouse, we use the Python implementation PyWPS. In PyWPS the Word Counter process could look like the following:

from pywps.Process import WPSProcess

# count_words() is the function from the section above; it is assumed
# to be defined or imported in this module.


class Process(WPSProcess):

    def __init__(self):
        # Process initialization
        WPSProcess.__init__(
            self,
            identifier="wordcount",
            title="Word Counter",
            abstract="Counts words in text document.",
        )

        # Adding process inputs
        self.text = self.addComplexInput(
            identifier="text",
            title="Text Document",
            formats=[{"mimeType": "text/plain"}],
        )

        # Adding process outputs
        self.output = self.addComplexOutput(
            identifier="output",
            title="Word count result",
        )

    # Execution part of the process (a method of the class, not nested in __init__)
    def execute(self):
        # count words and save result
        self.output.setValue(count_words(self.text.getValue()))

You can see the definition of the input and output parameters and the execute() method where the real count_words() function is called. You will find more details about implementing a WPS process in the PyWPS Tutorial.

Using WPS

A WPS service has three operations:

  • GetCapabilities: which processes are available
  • DescribeProcess: what are the input and output parameters of a specific process
  • Execute: run a process with parameters.

The following diagram shows these operations:

../_images/wps_usage.png

To call these operations, one can use simple HTTP requests with key/value pairs:

  • GetCapabilities request:

    http://localhost:8094/wps?request=GetCapabilities&service=WPS&version=1.0.0
    
  • DescribeProcess request for wordcount process:

    http://localhost:8094/wps?request=DescribeProcess&service=WPS&version=1.0.0&identifier=wordcount
    
  • Execute request:

    http://localhost:8094/wps?request=Execute&service=WPS&version=1.0.0&identifier=wordcount
                            &DataInputs=text=http://birdhouse.readthedocs.org/en/latest/index.html
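
The three request URLs share the same key/value structure, so they can be assembled programmatically. A minimal sketch using only the Python standard library follows; wps_url is a hypothetical helper name, and unlike the hand-written Execute URL above, urlencode percent-encodes the DataInputs value.

```python
from urllib.parse import urlencode


def wps_url(base_url, request, **params):
    """Build a WPS 1.0.0 key/value-pair request URL (hypothetical helper)."""
    query = {"service": "WPS", "version": "1.0.0", "request": request}
    query.update(params)
    # urlencode percent-encodes reserved characters in the values.
    return base_url + "?" + urlencode(query)


base = "http://localhost:8094/wps"
capabilities_url = wps_url(base, "GetCapabilities")
describe_url = wps_url(base, "DescribeProcess", identifier="wordcount")
execute_url = wps_url(
    base,
    "Execute",
    identifier="wordcount",
    DataInputs="text=http://birdhouse.readthedocs.org/en/latest/index.html",
)
```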
    

A process can be run synchronously or asynchronously:

  • sync: You make an HTTP request and wait until it returns with a response (or times out). This is only useful for short-running processes.
  • async: You make an HTTP request and immediately get a response document. This document gives you a link to a status document which you need to poll until the process has finished.
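
The polling loop for the asynchronous case could look roughly like the sketch below. It assumes a WPS 1.0.0 status document whose wps:Status element contains one child naming the state (for example wps:ProcessAccepted or wps:ProcessSucceeded). The names read_status and wait_for_result are hypothetical, and fetch is a caller-supplied function that returns the document text for a URL.

```python
import time
import xml.etree.ElementTree as ET

WPS_NS = "{http://www.opengis.net/wps/1.0.0}"


def read_status(xml_text):
    """Return the state name (e.g. 'ProcessSucceeded') from a status document."""
    root = ET.fromstring(xml_text)
    status = root.find(WPS_NS + "Status")
    if status is None or len(status) == 0:
        raise ValueError("no wps:Status element with a state found")
    # The first child of wps:Status names the state, e.g. wps:ProcessAccepted.
    return status[0].tag.replace(WPS_NS, "")


def wait_for_result(fetch, status_url, interval=2.0, max_polls=30):
    """Poll the status document until the process has succeeded or failed.

    `fetch(url)` is a hypothetical caller-supplied function returning the
    document at `url` as text or bytes.
    """
    for _ in range(max_polls):
        state = read_status(fetch(status_url))
        if state in ("ProcessSucceeded", "ProcessFailed"):
            return state
        time.sleep(interval)
    raise TimeoutError("process did not finish within the polling limit")
```

Passing fetch in as a parameter keeps the loop independent of any particular HTTP library.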

Processes can be run with simple HTTP get-requests (as shown above) and also with HTTP post-requests. In the latter case, XML documents carrying the communication details (process, parameters, ...) are exchanged.
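
As an illustration of the post-request variant, the sketch below builds an Execute document for the wordcount process with the standard library. It assumes the WPS 1.0.0 XML layout (wps:Execute containing ows:Identifier and wps:DataInputs) and passes the input by reference; build_execute_xml is a hypothetical helper name, and the exact document a given server expects may differ.

```python
import xml.etree.ElementTree as ET

WPS = "http://www.opengis.net/wps/1.0.0"
OWS = "http://www.opengis.net/ows/1.1"
XLINK = "http://www.w3.org/1999/xlink"


def build_execute_xml(identifier, input_name, href):
    """Return an Execute request document as a string (hypothetical helper)."""
    # Use readable prefixes instead of the generated ns0, ns1, ...
    for prefix, uri in (("wps", WPS), ("ows", OWS), ("xlink", XLINK)):
        ET.register_namespace(prefix, uri)
    root = ET.Element(f"{{{WPS}}}Execute", {"service": "WPS", "version": "1.0.0"})
    ET.SubElement(root, f"{{{OWS}}}Identifier").text = identifier
    inputs = ET.SubElement(root, f"{{{WPS}}}DataInputs")
    item = ET.SubElement(inputs, f"{{{WPS}}}Input")
    ET.SubElement(item, f"{{{OWS}}}Identifier").text = input_name
    # The input document is passed by reference, i.e. as a URL.
    ET.SubElement(item, f"{{{WPS}}}Reference", {f"{{{XLINK}}}href": href})
    return ET.tostring(root, encoding="unicode")


execute_xml = build_execute_xml(
    "wordcount",
    "text",
    "http://birdhouse.readthedocs.org/en/latest/index.html",
)
```

The resulting string would be sent as the body of the post-request to the same service URL used for the get-requests.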

For more details see the following WPS Tutorial.

There are also some IPython notebooks which show the usage of WPS.

Calling Word Counter with Birdy

Now we use the Birdy WPS command-line client to access the wordcount process.

Which processes are available (GetCapabilities):

$ birdy -h
usage: birdy [-h] <command> [<args>]

optional arguments:
  -h, --help            show this help message and exit

command:
  List of available commands (wps processes)

  {chomsky,helloworld,inout,ultimatequestionprocess,wordcount}
                      Run "birdy <command> -h" to get additional help.

What input and output parameters does wordcount have (DescribeProcess):

$ birdy wordcount -h
usage: birdy wordcount [-h] --text [TEXT] [--output [{output} [{output} ...]]]

optional arguments:
  -h, --help            show this help message and exit
  --text [TEXT]         Text document: URL of text document, mime
                      types=text/plain
  --output [{output} [{output} ...]]
                      Output: output=Word count result, mime
                      types=text/plain (default: all outputs)

Run wordcount with a text document (Execute):

$ birdy wordcount --text http://birdhouse.readthedocs.org/en/latest/index.html
Execution status: ProcessAccepted
Execution status: ProcessSucceeded
Output:
output=http://localhost:8090/wpsoutputs/emu/output-37445d08-cf0f-11e4-ab7e-68f72837e1b4.txt

Footnotes

[1] What is WPS? - http://geoprocessing.info/wpsdoc/Concepts#what