Remote object system

The VisionAppster Engine provides a built-in HTTP server that makes objects accessible through HTTP requests. The remote object API is modeled after REST principles as far as possible. Function calls (RPC) and signals don’t fit naturally into the REST model and are handled in a different manner. Standard protocols and data formats are nevertheless used for all communication.

The remote object system supports a few different encoding schemes. In many cases, the most convenient way of encoding request and response bodies is JSON. When submitting a JSON request, the client specifies “application/json” in the “Content-Type” header. A JSON response is requested by also setting the “Accept” header to “application/json”. When calling functions or setting properties, the platform automatically performs safe type conversions (such as integer to double).

The applications of the VisionAppster platform usually require complex data types such as images and tensors to be transferred over the network. For these, JSON is not the best possible encoding. Therefore, the default encoding used for example by the JS client is “application/vnd.va-bin”, a high-performance binary serialization format that closely matches the actual memory layout of the complex data types.

To get the most out of this document you should make sure you have the VisionAppster Engine running on your computer. The embedded links can then be used to inspect and control it.

Objects

VisionAppster apps are composed of objects. Each object may have an arbitrary number of properties, functions and signals. These concepts can be found in many programming languages in one form or another and are assumed to be familiar to the reader.

In the HTTP interface, each object has a base URI relative to the server’s root. For example, an instance of an Application Manager object is mapped to /manager/. A GET request to this URI will produce a “directory” listing that shows the structure of a remote object.

Each object instance has a globally unique ID that can be retrieved by a GET request. The ID will change if a server goes down and the object is recreated.

The ping URI is for application-level connectivity checking and for keeping a dynamically created temporary object instance alive if no other requests are made.

Properties

Properties are variables that control the behavior or appearance of an object. Generally, properties describe the object’s current state, and they can be thought of as the member variables or fields of an object. Property values are specific to a single object instance.

Although properties are similar to member variables, they are technically implemented in a different way. The value of a property is set through a setter function that may validate the value before accepting it. Therefore, it is not guaranteed that a property actually assumes the value one sets to it. Properties also usually have a change notifier signal, which lets one conveniently and efficiently detect changes.

While changing the value of a property may seemingly cause an action (such as an animation in a user interface), properties aren’t used for invoking functionality. That is what functions are for.

Submitting a GET request to properties/ will list properties. Each property declaration contains optional qualifier flags such as const and volatile, a type name, a unique property name and optionally a change signal. A property may be const and thus non-settable but still change value and emit a change signal.

The current value of a property can be read by sending a GET request to properties/propertyname. For example, a GET request to properties/allApps will return a list of all installed applications.

Submitting a PUT request to a property’s URI will change the value of the property. The new value is sent in the request body, usually as JSON. When the value of a property changes, a change notification signal will be sent to all registered clients.

Each property is accompanied by a serial number that is incremented each time the value changes. The serial number is a 32-bit unsigned integer that is initialized to a random value when a remote object instance is created. In HTTP communication, the serial number is encoded as a hexadecimal string and passed in the ETag header. It can also be attached to property change signals as an extended header.
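On the client side, the serial number handling could look roughly like the following Python sketch. The helper names are hypothetical. The exact textual form of the ETag value (quoting, zero padding, letter case) is not specified here, so the sketch accepts optional double quotes and any hex width, and it assumes the counter wraps around at 32 bits.

```python
def parse_etag(value: str) -> int:
    """Decode a hexadecimal ETag header value into a 32-bit serial number."""
    # Assumption: the value may be wrapped in double quotes, as HTTP
    # entity tags usually are. Quotes are stripped before decoding.
    serial = int(value.strip().strip('"'), 16)
    if not 0 <= serial <= 0xFFFFFFFF:
        raise ValueError("serial number does not fit in 32 bits")
    return serial


def changes_between(old: int, new: int) -> int:
    """Number of increments from old to new, assuming 32-bit wrap-around."""
    return (new - old) & 0xFFFFFFFF
```

A client that caches a property value can compare the serial number of a later response with the cached one. A result of zero from changes_between suggests the cached value is still current.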

Functions

Functions provide a way to invoke actions on an object. A function can take an arbitrary number of input parameters and optionally return a value.

The functions of a remote object are listed at functions/. Each declaration consists of an optional return type, a name and a (possibly empty) list of parameter types. There may be multiple overloaded versions of the same function, each taking a different set of parameters. When an overloaded function is called, the server uses passed parameter types to find a matching overload.

POST requests

When a function is called, its arguments are passed as an array in a POST request body. If the function takes only a single argument, the array around the arguments can be omitted. As specified by the HTTP standard, the “Content-Type” header tells the server how to decode the arguments.

If a function returns no value, the server responds with an empty “200 OK” message or an appropriate error code. If there is a return value, it will be encoded as specified by the “Accept” header. Let’s assume the server provides a remote function pass(str: String) -> String that just returns the string it is given as an argument. Calling the function would be a POST request:

POST /myobject/functions/pass HTTP/1.1
Host: localhost:2015
Content-Type: application/json
Accept: application/json
Content-Length: 15

"Hello, World!"

The server’s response would be something like this:

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 15

"Hello, World!"

Since the function declares a name for its argument, it can be called with an object that contains named arguments:

POST /myobject/functions/pass HTTP/1.1
Host: localhost:2015
Content-Type: application/json
Accept: application/json
Content-Length: 24

{"str": "Hello, World!"}

If a function has many parameters that cannot be encoded in the same way (e.g. JSON), the arguments can be passed as a MIME multipart message. Let us assume the server provides a function with the signature binarize(image: Image, threshold: int) -> Image. This function could be called with a MIME multipart message as follows:

POST /myobject/functions/binarize HTTP/1.1
Host: localhost:2015
Content-Type: multipart/mixed; boundary=boundary_marker
Accept: image/png

--boundary_marker
Content-Type: image/png
Content-Length: 12345

... image data here ...

--boundary_marker
Content-Type: application/json
Content-Length: 3

127
--boundary_marker--

Creating multipart messages may be a bit of an overkill if only one of the arguments would actually require special encoding. In this case, it is possible to pass some of the arguments as query parameters in the URL. This would produce the same result:

POST /myobject/functions/binarize?127 HTTP/1.1
Host: localhost:2015
Content-Type: image/png
Accept: image/png
Content-Length: 12345

... image data here ...

If query parameters are not given names, they will be passed to the function in the order they appear in the call. In this case the request body becomes the first parameter. Function arguments that have default values can be omitted.

Finally, if the function declaration provides names for its arguments, they can be used to pass parameters in any order. One cannot however mix ordered and named arguments. The special @body query parameter specifies the name of the function argument to which the request body should be passed. Thus, the same call could be rewritten in yet another form:

POST /myobject/functions/binarize?threshold=127&@body=image HTTP/1.1
Host: localhost:2015
Content-Type: image/png
Accept: image/png
Content-Length: 12345

... image data here ...

This would tell the server to use 127 as threshold and the decoded body of the request as the image argument. You can do this using curl as follows:

curl -X POST -H "Content-Type: image/png" -H "Accept: image/png" \
  --data-binary @input.png \
  "http://localhost:2015/myobject/functions/binarize?threshold=127&@body=image" \
  > output.png
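The same named-argument call can be prepared with the Python standard library. This is a minimal sketch: the function name build_binarize_request is hypothetical, and the host, port and object path are simply the ones used in the examples of this document.

```python
import urllib.parse
import urllib.request


def build_binarize_request(image_bytes: bytes, threshold: int) -> urllib.request.Request:
    """Prepare a POST call to binarize with the image in the request body."""
    # Named arguments go to the query string. The special @body parameter
    # names the function argument that receives the decoded request body.
    # "@" is kept unescaped so that the URL reads like the example above.
    query = urllib.parse.urlencode(
        {"threshold": threshold, "@body": "image"}, safe="@"
    )
    url = "http://localhost:2015/myobject/functions/binarize?" + query
    return urllib.request.Request(
        url,
        data=image_bytes,
        method="POST",
        headers={"Content-Type": "image/png", "Accept": "image/png"},
    )
```

With an Engine running, the prepared request could be sent with urllib.request.urlopen(request), and the response body would then hold the PNG-encoded result.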

If the query string is used to pass function arguments, the server will try to automatically determine the types. It recognizes integers, decimal numbers and hexadecimal color codes in addition to the keywords true, false, nan, inf and -inf. Everything else is treated as a string. If you need to pass a string that matches one of the keywords, you can enclose it in double quotes, e.g. "true".

For example, consider the following query string:

?123&true&%22true%22&-2.1e5&inf&abc&%23ff0000

After splitting at ampersands (&), undoing URL encoding (percent codes) and auto-detecting types this would result in the following argument list:

123     -> int(123)
true    -> bool(true)
"true"  -> String("true")
-2.1e5  -> double(-210000)
inf     -> double(∞)
abc     -> String("abc")
#ff0000 -> Color32(255, 0, 0)
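The detection rules can be approximated on the client side, for example to predict how a query string will be interpreted. The following Python sketch is not the server's implementation: the function names are hypothetical, the exact grammar the server accepts (for instance for color codes) is an assumption, and colors are represented as plain (red, green, blue) tuples.

```python
import math
import re
from urllib.parse import unquote

_INT = re.compile(r"[+-]?\d+")
_DECIMAL = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")
_COLOR = re.compile(r"#[0-9a-fA-F]{6}")  # assumption: six hex digits only
_KEYWORDS = {"true": True, "false": False, "nan": math.nan,
             "inf": math.inf, "-inf": -math.inf}


def detect_type(token: str):
    """Guess the type of one decoded query-string argument."""
    if token in _KEYWORDS:
        return _KEYWORDS[token]
    if len(token) >= 2 and token[0] == token[-1] == '"':
        return token[1:-1]                      # quoted: always a string
    if _INT.fullmatch(token):
        return int(token)
    if _DECIMAL.fullmatch(token):
        return float(token)
    if _COLOR.fullmatch(token):
        return tuple(int(token[i:i + 2], 16) for i in (1, 3, 5))
    return token                                # everything else is a string


def parse_query_arguments(query: str) -> list:
    """Split at '&', undo percent encoding and auto-detect types."""
    parts = query.lstrip("?").split("&")
    return [detect_type(unquote(part)) for part in parts if part]
```

Applied to the example query string, parse_query_arguments yields an integer, a boolean, the string "true", a float, infinity, the string "abc" and the color tuple (255, 0, 0), in that order.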

GET requests

If a function takes no parameters or all of them can be encoded to the query string as shown above, it is possible to call a function using a GET request. For example, the parameterless hasError() and clearError() functions can be called with GET.

Asynchronous calls

It is possible to call a remote function asynchronously. In this case the server won’t wait for the function to complete before responding to the client. Instead, the call will be put in a queue and executed later. The results of the function will be pushed to the client’s return channel once the function is done. The format of the response is specified by the Media-Type header. The client needs to give a source ID for the call so that it can recognize the return value when it appears in the return channel.

To asynchronously call the function above:

POST /myobject/functions/pass HTTP/1.1
Host: localhost:2015
Content-Type: application/json
Media-Type: application/json
Client-ID: S3cR3t
Source-ID: 29
Content-Length: 15

"Hello, World!"

The server would respond with “200 OK” and put the call into a call queue. Once done, the return value would be pushed to the client’s (S3cR3t) return channel with source ID 29.

Signals

Signals are used as a way to notify listeners about changes in an object’s state or to emit the results of an asynchronous function call. A signal may pass any number of values (including zero), and it can be connected to any number of functions with a matching parameter set.

Most programming languages do not provide signals as a built-in concept. The same functionality can be achieved for example with Java-style listener interfaces. Signals are however conceptually cleaner and easier to use in practice.

Submitting a GET request to signals/ will produce a list of signals. Signal declarations are similar to functions, but there is no return value.

Property change signals are a special kind of a signal that will be emitted whenever the value of a property changes. If there is a change signal associated with a property, the signature of the signal will appear in the notifier field of the property description.

Callback functions

Callback functions provide another way for the remote object to push data to clients. They also make it possible to request data from the client. There are two types of callbacks: permanent ones and callback function arguments.

Permanent callbacks can be used to implement a “listener” design pattern. Any client can register itself as a listener to a callback function and receive the arguments of the callback function through the return channel (see below). The delivery mechanism is the same as with signals, but only one listener is allowed per function.

Unlike signals, callbacks may return a value, which will be passed back to the server through the return channel. Even if a callback function returns no value, the server receives an acknowledgement once the function call finishes. Permanent callbacks are listed at callbacks/.

In addition to permanent callbacks, a remote function may have callback arguments. Again, the delivery mechanism is the same as with signals, but the arguments of the callback function will only be delivered to the client that made the request. This makes it possible to implement for example asynchronous functions that have no return value but push their results to the calling client after a processing delay.

Pushable sources and return channels

The remote object system has a concept of a pushable source that lets the programmer define many different types of data that can be sent to the client from the server side. Pushable sources are identified by their relative URIs on the server and may represent many different kinds of data sources. Signals are the most commonly used type of pushable source.

To be able to receive push notifications, the client must first open a return channel. This happens by establishing a WebSocket connection to channels/<client-id>. The WebSocket URL will be, for example, ws://localhost:2015/manager/channels/3dcbff79-3f15-4a80-b7ca-b669cc0b3209. The opened socket will then be used whenever a notification needs to be sent to the client.

The client ID should be a cryptographically strong UUID. A different ID should be used for different servers. This makes it unlikely that anybody else will be able to steal or modify the return channel.

Each return channel is dedicated to a single client. If you make a new request to the same channel URL, the old connection will break. This is seldom needed, but makes it possible to transfer the end point of the channel to another client.

Once a channel has been established, the client can select which notifications to receive by issuing PUT requests to channels/<client-id>/sources/<source-id>. For example, a PUT request to /manager/channels/3dcbff79-3f15-4a80-b7ca-b669cc0b3209/sources/0 with the JSON request body {"sourceUri": "signals/runningAppsChanged"} would tell the server to notify the client whenever the runningAppsChanged signal is emitted.

If the WebSocket request contains a Media-Type header or a mediaType URL parameter, its value will be used as the default encoding for data pushed through the channel. For example, to request responses as JSON, you can do this:

GET /manager/channels/s3cr3t?mediaType=application/json HTTP/1.1
Host: localhost:2015

Note that just like with functions, there may be multiple overloaded versions of a signal. In such a case the client must give the full signature of the signal as the sourceUri parameter. For example:

PUT /myobject/channels/<client-id>/sources/0 HTTP/1.1
Host: localhost:2015
Content-Type: application/json
Content-Length: 44

{"sourceUri":"signals/booleanChanged(bool)"}

The client is responsible for picking a unique, non-negative 32-bit integer (0-2147483647) as a source ID that identifies the signal. In the example, the source ID is zero. The server will send the source ID when it pushes data to the channel so that the client can identify the signal. A DELETE request to /manager/channels/3dcbff79-3f15-4a80-b7ca-b669cc0b3209/sources/0 will tell the server to stop pushing that signal.
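Subscribing to and unsubscribing from a source can be sketched in Python as below. The function names and the base_url/object_path split are hypothetical conveniences. The URL layout, the sourceUri field and the source ID range are the ones described above.

```python
import json
import urllib.request

MAX_SOURCE_ID = 2147483647  # largest non-negative 32-bit integer


def _source_url(base_url, object_path, client_id, source_id):
    if not 0 <= source_id <= MAX_SOURCE_ID:
        raise ValueError("source ID must be in the range 0-2147483647")
    return f"{base_url}{object_path}/channels/{client_id}/sources/{source_id}"


def build_subscribe_request(base_url, object_path, client_id, source_id,
                            source_uri, **params):
    """Prepare the PUT request that attaches a pushable source to a channel."""
    # Compact separators make the body identical to the examples in this text.
    body = json.dumps({"sourceUri": source_uri, **params},
                      separators=(",", ":")).encode("utf-8")
    return urllib.request.Request(
        _source_url(base_url, object_path, client_id, source_id),
        data=body, method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_unsubscribe_request(base_url, object_path, client_id, source_id):
    """Prepare the DELETE request that stops the server pushing a source."""
    return urllib.request.Request(
        _source_url(base_url, object_path, client_id, source_id),
        method="DELETE",
    )
```

Extra keyword arguments, such as mediaType=["image/jpeg"], are merged into the JSON body. urllib computes the Content-Length header from the body when the request is sent.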

Connecting to multiple signals happens by giving each signal a unique ID number in the context of the channel. For example, to subscribe to the barCode and image signals, one can issue the following requests (client ID is “d351a1612998” for illustration):

PUT /myobject/channels/d351a1612998/sources/0 HTTP/1.1
Host: localhost:2015
Content-Type: application/json
Content-Length: 31

{"sourceUri":"signals/barCode"}

PUT /myobject/channels/d351a1612998/sources/1 HTTP/1.1
Host: localhost:2015
Content-Type: application/json
Content-Length: 56

{"sourceUri":"signals/image","mediaType":["image/jpeg"]}

The mediaType parameter specifies how the parameters of the signal should be encoded by the server. See details below. If mediaType is not specified, the channel’s default encoding will be used.

Connected sources can be listed by sending a GET request to channels/<client-id>/sources/. A channel is killed by sending a DELETE request to channels/<client-id>. Finally, a broken connection can be re-established by initiating a WebSocket connection to channels/<client-id>. The server will automatically delete the channel if a broken connection isn’t re-established within a “reasonable” time.

Source parameters

Each pushable source can be individually parameterized. In the examples above, the mediaType parameter was used to specify encoding for data pushed back from the server. Other configurable parameters are:

  • qos: Quality of service level. QoS is specified as a numeric value in the range [-1, 3]:

      • -1 (unspecified): Use the channel’s default.

      • 0 (at most once): Also known as fire and forget. Send the message at most once and don’t expect an acknowledgment.

      • 1 (last at most once): Same as 0, but new messages from the same source are allowed to overwrite an older message if it hasn’t been sent yet. The last message will be sent once.

      • 2 (at least once): Make sure the receiver gets the message. Resend after a while if the client fails to acknowledge.

      • 3 (last at least once): Make sure the receiver gets the last message from a source. Same as 2, but this mode allows the server to update the message content until it has been acknowledged.

  • maxAge: The maximum number of milliseconds the message will be kept in the outgoing message queue. 0 means unlimited and -1 uses the channel default.
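The numeric QoS levels are easier to use in client code when they are named. The Python sketch below uses hypothetical names (QoS, source_parameters) and assumes that qos and maxAge are sent in the same JSON object as sourceUri, in the same way as mediaType.

```python
import json
from enum import IntEnum


class QoS(IntEnum):
    """Quality of service levels with the numeric values listed above."""
    UNSPECIFIED = -1         # use the channel's default
    AT_MOST_ONCE = 0         # fire and forget
    LAST_AT_MOST_ONCE = 1    # unsent older messages may be overwritten
    AT_LEAST_ONCE = 2        # resend until acknowledged
    LAST_AT_LEAST_ONCE = 3   # content may be updated until acknowledged


def source_parameters(source_uri: str, qos: int = QoS.UNSPECIFIED,
                      max_age_ms: int = -1) -> str:
    """Build the JSON body for registering a source with QoS settings."""
    if max_age_ms < -1:
        raise ValueError("maxAge must be -1 (default), 0 (unlimited) or positive")
    return json.dumps(
        # QoS(qos) raises ValueError for values outside [-1, 3].
        {"sourceUri": source_uri, "qos": int(QoS(qos)), "maxAge": max_age_ms},
        separators=(",", ":"),
    )
```

For example, source_parameters("signals/image", QoS.LAST_AT_MOST_ONCE, 500) describes a subscription in which only the newest unsent image is kept and anything older than half a second is dropped from the queue.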

Receiving signals and callbacks

The server supports a few different transport protocols for notifications. This section describes the WebSocket implementation.

Messages between the client and the server are sent as WebSocket binary frames. The payload of each WebSocket frame starts with an eight-byte message header that has the following format:

 0               1               2               3
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+---------------+-+-+-----------+-------------------------------+
| message type  |A|E|  flags    |      source ID (2 LSBs)       |
|               |C|X|           |                               |
|               |K|T|           |                               |
+---------------+-+-+---------+-+-------------------------------+
|     source ID (2 MSBs)      |R|        message index          |
|                             |E|                               |
|                             |S|                               |
+-----------------------------+-+-------------------------------+
... optional extended headers ...
|               |
| header type   | header data ...
+---------------+
  • The first byte contains a message type that describes how the rest of the message should be interpreted:

    1. partial: The message contains a part of a sequence. One or more parts will follow. A “sequence” can be for example an array of signal arguments. A part of a sequence is a single, individually encoded element in the array.

    2. final: The message contains the final part of a sequence. If a sequence contains only one part, the first message will be final.

    3. complete: The message contains a complete sequence of messages, all encoded in a single message. For example, a JSON encoded array of signal parameters is a complete message.

    4. error: The message contains an error. Errors are objects with at least a message field.

  • The next eight bits contain control flags. Bit 0 indicates whether the server expects the client to acknowledge the message or not. If the ACK bit is set, the client must send a reply, optionally with return data. If the EXT bit (1) is set, extended headers follow.

  • The next four bytes contain the source ID the client gave when registering a source as a 32-bit little-endian integer. The most significant bit (31) is reserved and must be zero.

  • The last field is an unsigned 16-bit little-endian sequence number. This lets the client detect dropped messages in case the server runs out of bandwidth.

  • The rest of the payload is data that is interpreted according to message type.

Note that the header does not contain a length field since the WebSocket frame already has one.
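The fixed header can be decoded with a few lines of Python. This is a sketch rather than a reference implementation: the names are hypothetical, and it assumes that bit 0 of the flags byte is the least significant bit, so that ACK is 0x01 and EXT is 0x02.

```python
import struct
from typing import NamedTuple

# Little-endian layout: message type (1 byte), flags (1 byte),
# source ID (4 bytes), message index (2 bytes). Eight bytes in total.
HEADER = struct.Struct("<BBIH")
ACK_FLAG = 0x01  # assumption: "bit 0" is the least significant bit
EXT_FLAG = 0x02  # assumption: "bit 1" is the next one up


class MessageHeader(NamedTuple):
    message_type: int
    ack: bool
    ext: bool
    source_id: int
    message_index: int


def parse_header(frame: bytes) -> MessageHeader:
    """Decode the eight-byte header at the start of a WebSocket payload."""
    if len(frame) < HEADER.size:
        raise ValueError("frame is shorter than the 8-byte header")
    message_type, flags, source_id, index = HEADER.unpack_from(frame)
    if source_id & 0x80000000:
        raise ValueError("reserved bit 31 of the source ID is set")
    return MessageHeader(message_type, bool(flags & ACK_FLAG),
                         bool(flags & EXT_FLAG), source_id, index)
```

When the EXT flag is clear, the message data is simply frame[HEADER.size:]. When it is set, extended headers come before the data.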

Usually, the arguments of a signal or a callback function are encoded as a JSON array in the order they appear in the signal’s or callback’s declaration and sent as a complete message in a single WebSocket frame (which may get fragmented in transport). When registering a pushable source to a channel, the client can however request splitting arguments to successive frames by giving a preferred encoding scheme for each argument with the mediaType parameter. For example:

PUT /myobject/channels/<client-id>/sources/0 HTTP/1.1
Host: localhost:2015
Content-Type: application/json
Content-Length: 56

{"sourceUri":"signals/image","mediaType":["image/jpeg"]}

This will tell the server to connect to a signal called “image” and to encode the signal’s single parameter as a JPEG in a single final message. When the signal is received, the client can decode the body of the WebSocket frame (after skipping the eight header bytes) as a JPEG image.

If a signal has many parameters, they will be sent in successive WebSocket frames in the order the parameters appear in the declaration of the signal. The messages will be partial up to the last one, which will be final.
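Collecting partial and final messages back into one argument list could be sketched as follows. The class name is hypothetical, and the numeric type codes are an assumption: they are taken from the numbering of the message type list and should be checked against the server before use. The sketch also assumes that the parts of one source arrive in order, although parts of different sources may interleave.

```python
from collections import defaultdict

# Assumption: wire values follow the numbering of the message type list.
PARTIAL, FINAL, COMPLETE = 1, 2, 3


class SequenceAssembler:
    """Groups individually encoded parts by source ID until the final part."""

    def __init__(self):
        self._pending = defaultdict(list)

    def feed(self, message_type: int, source_id: int, payload: bytes):
        """Return the list of parts when a sequence ends, otherwise None."""
        if message_type == COMPLETE:
            return [payload]                  # whole sequence in one message
        if message_type == PARTIAL:
            self._pending[source_id].append(payload)
            return None                       # more parts will follow
        if message_type == FINAL:
            return self._pending.pop(source_id, []) + [payload]
        raise ValueError(f"unexpected message type {message_type}")
```

For a signal with an integer and an image parameter, the first feed call with a partial message returns None, and the second call with the final message returns both encoded parts in declaration order.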

It should be noted that "image/jpeg" requests encoding the whole parameter array as an image (which is not possible) whereas ["image/jpeg"] causes the image parameter itself to be encoded and sent as a separate message.

The mediaType parameter can also be used to selectively pick signal parameters. Let’s assume the object has a signal with the signature foobar: (v1: int32, v2: Image) -> void. To receive only the image one can give null as the media type for the first parameter:

PUT /myobject/channels/<client-id>/sources/1 HTTP/1.1
Host: localhost:2015
Content-Type: application/json
Content-Length: 61

{"sourceUri":"signals/foobar","mediaType":[null,"image/png"]}

Specifying a null media type for all parameters makes the server send the signal with an empty body. This may be useful if you want to e.g. observe property changes but retrieve the latest value only on explicit request.

Callbacks are connected to in a similar manner, but the sourceUri is “callbacks/callbackName”. Unlike signals, callbacks may return a value to the server. The encoding of the return value is specified by a replyMediaType parameter when registering a connection. For example:

PUT /myobject/channels/<client-id>/sources/2 HTTP/1.1
Host: localhost:2015
Content-Type: application/json
Content-Length: 65

{"sourceUri":"callbacks/getImage","replyMediaType":["image/png"]}

This would tell the server that the return value of the getImage callback will be encoded as a PNG. If replyMediaType is not specified, the channel’s default will be used.

If the ACK bit of the flags field is set, the client must respond to a message. If there is no data to be sent back, an empty complete message will acknowledge the reception. If there is data to be sent back, it must be encoded as specified by replyMediaType.

The semantics of the replyMediaType parameter are the same as those of the mediaType parameter: a single value specifies the encoding for a complete sequence, and an array of values specifies the encoding for each individual element. Currently, the server does not support partial responses. Therefore, replyMediaType must be either a single string or an array containing a single string.

Callbacks as function arguments

Let us assume a server object provides a function that calculates the sum of two integer arguments and pushes the result back using a callback function. The signature of the function can be written as sum: (callback: (int) -> void, a: int, b: int) -> void, where callback is a function that will be invoked once the computation of a + b is ready. The sum function itself returns no value.

To be able to receive the asynchronous reply one needs to first set up a return channel as described above. The generated client ID must then be passed to the sum function call so that the server knows which channel to use when calling back. This happens by adding a Client-ID header to the request. Callback function arguments are replaced with the numeric source ID that will be used when pushing that callback’s data back to the return channel. An example:

POST /myobject/functions/sum HTTP/1.1
Host: localhost:2015
Client-ID: R4Nd0M
Source-ID: 30
Content-Type: application/json
Content-Length: 9

[314,1,2]

This tells the server to calculate 1 + 2 and push the result back to the return channel of client R4Nd0M using 314 as the source ID. No other configuration has been given, so the server will encode the arguments of the callback function using the channel’s default encoding. Since the request contains a Source-ID header, the call to sum itself is asynchronous, and the server won’t wait for the function to complete. When it is done, an empty complete message with the given source ID (30) will be pushed to the client through the return channel.

If there is a need to receive the callback’s arguments in a non-default format, it is possible to register and configure the source ID beforehand just like with signals:

PUT /myobject/channels/R4Nd0M/sources/314 HTTP/1.1
Host: localhost:2015
Content-Type: application/json
Content-Length: 34

{"mediaType":["application/json"]}

This would cause the result to be sent as an individual integer (3) instead of a one-element array ([3]). Unlike signals and permanent callbacks, function argument callbacks don’t require a sourceUri parameter. The client identifies the callback solely based on the numeric source ID.

Permanent configuration is useful when the same parameters are used for all callback invocations. Temporary parameters that only apply during a single function call (which may cause multiple invocations of the callback) can be passed as an object type argument:

POST /myobject/functions/sum HTTP/1.1
Host: localhost:2015
Client-ID: R4Nd0M
Content-Type: application/json
Content-Length: 49

[{"id":315,"mediaType":["application/json"]},1,2]

This would cause an individual integer to be pushed back to the channel with 315 as the source ID. Since the mediaType parameter only applies to this call, further calls to sum without the parameter object would use default encoding.

Sharing return channels

If a server hosts multiple remote objects, it is possible to group them so that many or even all of them push data to the client through a single return channel. Unless there is a huge amount of data flowing from the server, bundling return channels is usually the right thing to do as it eases error recovery on the client side and makes the server consume slightly fewer resources.

To share a return channel, first open the channel on one object just as you normally would:

GET /object1/channels/s3cr3t HTTP/1.1
Host: localhost:2015

Then PUT the existing client ID under channels on another object:

PUT /object2/channels/s3cr3t HTTP/1.1
Host: localhost:2015

The first request will open a WebSocket connection that will be used as the return channel for object1. The second one informs object2 to reuse the existing return channel.

It is safe to PUT the same client ID multiple times, but the request will fail if no channel has been opened with GET. If the second request was also a GET, the first connection would be transferred to a new WebSocket client, but the channel would still remain in use by both objects until DELETEd.

Extended headers

Extended headers provide a way to attach additional information to each message passed through the return channel much like the way headers are used in standard HTTP communication. The client must however explicitly request headers to be added as most of the time nothing is needed.

Extended headers follow the static 8-byte message header. Each extended header starts with a byte that specifies its type. A zero byte means end of headers. If the type field is non-zero, the next byte(s) are interpreted according to the type field.

Recognized extended header types are:

  1. ETag. A serial number that counts changes to property values. Can be attached to property change signals. Together with the standard HTTP ETag header, this header lets clients cache property values. The ETag is sent as a 32-bit unsigned little-endian integer.

  2. Time stamp. The time at which the message was generated in the server, before it was put in the output queue. Represented as a 64-bit unsigned little-endian integer that counts the number of milliseconds since the Unix epoch (1970-01-01T00:00:00).
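A client-side decoder for the two header types could look like the following Python sketch. The names are hypothetical, and the type codes 1 (ETag) and 2 (time stamp) are taken from the numbering of the list above. The function should only be called for frames whose EXT flag is set.

```python
import struct

ETAG = 1        # 32-bit unsigned little-endian serial number
TIMESTAMP = 2   # 64-bit unsigned little-endian milliseconds since the epoch
_FORMATS = {ETAG: struct.Struct("<I"), TIMESTAMP: struct.Struct("<Q")}


def parse_extended_headers(frame: bytes, offset: int = 8):
    """Decode extended headers that follow the static 8-byte header.

    Returns a (headers, data_offset) pair, where headers maps a header type
    to its integer value and data_offset is where the message data starts.
    """
    headers = {}
    while True:
        if offset >= len(frame):
            raise ValueError("missing end-of-headers byte")
        header_type = frame[offset]
        offset += 1
        if header_type == 0:          # a zero byte ends the header list
            return headers, offset
        fmt = _FORMATS.get(header_type)
        if fmt is None:
            raise ValueError(f"unknown extended header type {header_type}")
        if offset + fmt.size > len(frame):
            raise ValueError("truncated extended header")
        (headers[header_type],) = fmt.unpack_from(frame, offset)
        offset += fmt.size
```

The time stamp can be converted with datetime.fromtimestamp(value / 1000, tz=timezone.utc), and the ETag value can be compared with the serial number received in the HTTP ETag header.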

To request extended headers, a client sends an array of header types in a PUT request when subscribing to a source. For example, to request the ETag header for a property change signal, the following request could be sent:

PUT /myobject/channels/d351a1612998/sources/2 HTTP/1.1
Host: localhost:2015
Content-Type: application/json
Content-Length: 50

{"sourceUri":"signals/valueChanged","headers":[1]}

In partial messages, extended headers are included only in the first one.

Bash client

The VisionAppster SDK comes with a Bash client that can be used either as a command-line tool or as a library of shell functions in other scripts. The library and the command-line tool are both included in a single file that is located in the Linux SDK in sdk/client/bash/va-client. Type va-client for command-line usage instructions.

If you want to use the functions in your scripts, the best way to do this is to include the va-client script and to call the commands directly as functions:

#!/bin/bash

# Load the client functions; quoting keeps paths with spaces intact
. "$HOME/VisionAppster/sdk/client/bash/va-client"

# The address of your Raspberry Pi running VisionAppster Engine
host=192.168.0.123
port=2015
object=/info

va_property_set userDefinedName '"New name"'

Note that both input and output data to the VisionAppster Engine are JSON-encoded by default, and you need to work with raw JSON data. The client makes no conversions for you. Use jq to en/decode JSON messages.