Visual ORB

David Arnold
davida@pobox.com
19 August 1996


The Trouble with VRML

It doesn't do collaborative, interactive virtual reality well.

VRML is a markup language. It (basically) describes a static scene as a series of 3D objects. It supports inclusions (allowing reuse of objects) and uses a scoping mechanism to restrict visibility.

But it doesn't do collaborative, interactive virtual reality.

Why do I want collaborative, interactive virtual reality? Aside from the general fun of the idea, I believe that it will promote new ways of interacting (look at MUDs) and new ways of programming, and will provide a better environment for information retrieval and management.

Did I mention fun? Fun!

3D Objects

Current work on VRML 2.0 is attempting to add rudimentary behaviour to VRML objects. Proposals include motion and sound.

However, the proposals do not address the fundamental issue: VRML is currently oriented towards a browser environment. It provides static declarations of object behaviour and does not allow one user's interactions with an object to influence another's. Simplistic approaches use server-side scripts to alter the description downloaded from the server as clients fetch different objects, but the end result is basically still static.

To achieve true VR, it is necessary to extend the definition of an object beyond static description to support interaction which alters its behaviour and appearance. Such an object could be conceptually similar to a merged VRML object description and a CORBA object interface.

Hmmm.

Geography

In reality, the curvature of the earth and the resolution of the human eye scope our perception. A virtual world requires a scoping mechanism.

In a virtual world, such mechanisms alone will not be sufficient for available machinery to meet the demands of hosting it. Some other constraint on the interaction domain will be required.

Proposal

We propose an architecture designed to support an interactive virtual reality. Derived from existing work on MUDs, VRML and distributed objects, it supports clients that, while related to current VRML browsers, require some additional functionality.

Interaction Engine

The basic component of the architecture is an interaction engine. It is responsible for two things: supporting direct interaction between objects (like an ORB) and propagation of the effects of this interaction to all interested parties. In this note, we'll call this indirect interaction.

Direct Interaction

Direct interaction between objects is basically an ORB function - allowing an object to invoke exported operations on another. This is the basis for the construction of applications and is also used for direct interaction between objects on an ad hoc basis. Examples of this might include browsing, administration and exploratory programming.

Fundamental to the notion of ad hoc direct interaction is that the initiator cannot be required to have previously prepared stubs for using the target object.
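
As a rough illustration of stub-free invocation, the following Python sketch discovers an object's exported operations at run time and calls one by name. The Lamp class, its "exported" tuple and the helper names list_operations and invoke are all hypothetical, invented for this example; a real ORB would use its own dynamic invocation machinery.

```python
class Lamp:
    """Hypothetical target object; its exported operations are listed explicitly."""

    exported = ("switch", "describe")

    def __init__(self):
        self.on = False

    def switch(self, on):
        self.on = bool(on)
        return self.on

    def describe(self):
        return "lamp (on)" if self.on else "lamp (off)"


def list_operations(target):
    """Discover the exported operations at run time, as a browser might."""
    return sorted(getattr(target, "exported", ()))


def invoke(target, operation, *args):
    """Invoke an exported operation by name, with no stub prepared in advance."""
    if operation not in getattr(target, "exported", ()):
        raise AttributeError(f"{operation!r} is not exported")
    return getattr(target, operation)(*args)


lamp = Lamp()
operations = list_operations(lamp)   # what can this object do?
invoke(lamp, "switch", True)         # ad hoc call, chosen at run time
state = invoke(lamp, "describe")
```

The initiator needs only a reference to the target and the name of an operation, which is what makes browsing and exploratory programming possible.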

Indirect Interaction

In a MUD engine this function is implemented by distributed notification of events to all connected parties, usually in the form of text sent to a telnet session.

Container objects (including rooms) have an ability to distribute notification of events to all enclosed objects. Notice that the container object naturally scopes the distribution of notifications.
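
A minimal sketch of this container-scoped notification, assuming a simple in-process model: the Thing and Container classes and the announce method are illustrative names, not taken from any particular MUD engine.

```python
class Thing:
    """Any object that can receive event notifications."""

    def __init__(self, name):
        self.name = name
        self.heard = []

    def notify(self, event):
        self.heard.append(event)


class Container(Thing):
    """A room-like object that distributes events to everything it encloses."""

    def __init__(self, name):
        super().__init__(name)
        self.contents = []

    def add(self, thing):
        self.contents.append(thing)

    def announce(self, event):
        # Only enclosed objects are notified: the container scopes the event.
        for thing in self.contents:
            thing.notify(event)


hall, kitchen = Container("hall"), Container("kitchen")
alice, bob, carol = Thing("alice"), Thing("bob"), Thing("carol")
hall.add(alice)
hall.add(bob)
kitchen.add(carol)

hall.announce("Alice waves")
```

Here bob (in the hall) receives the event while carol (in the kitchen) does not, without any explicit filtering by the sender.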

In a collaborative virtual reality, indirect interaction is essential. However, supporting wide-scale notification of events is extremely expensive. It is necessary to restrict the visibility (or propagation distance) of events.

Domains

Existing MUD engines typically manage a database containing the state of all objects in the system. This database is used by the interaction engine and the client program (both normally part of the same process) to support object mobility and interaction.

An important attribute of these systems is that the client is dumb: the server does the work of rendering (in text) the scene for the client.

Extending a MUD engine to support 3D object presentation requires an architectural change to move rendering to the client. It is not feasible to perform rendering at the server because of the bandwidth and processing required to render a video stream for each of many clients.

Instead, each client must maintain a model of the current environment and render it itself. This has the additional bonus of allowing the client to tune the level of detail, speed and other characteristics of the rendering.

The role of the server then is twofold: to maintain the state of the environment and to distribute updates to the attached clients. Obviously, a single server should not have to service all clients; the burden should be sharable across many machines for better performance.

Additionally, a client may not require notification of all events, but only those in which it is interested.

As a means of scoping, we introduce the notion of domains: a special class of object, like a container, with their own appearance and other behaviour. Domain objects exist within a three-dimensional space, and each has a finite, rectilinear 3D volume.

All other objects must exist inside a domain. At any point in time, an object is contained within exactly one domain. When moving between adjoining domains, a large object may appear to protrude beyond its hosting domain.
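
The "exactly one domain" rule can be sketched by treating each domain as an axis-aligned box and each object as a single reference point. This is only one possible reading: the Domain class, the half-open boundary convention and the hosting_domain helper are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """A finite, rectilinear volume given by two opposite corners (assumed layout)."""

    name: str
    lo: tuple  # (x, y, z) minimum corner
    hi: tuple  # (x, y, z) maximum corner

    def contains(self, point):
        # Half-open on every axis, so a point lying on a face shared by two
        # adjoining domains belongs to only one of them.
        return all(l <= p < h for l, p, h in zip(self.lo, point, self.hi))


def hosting_domain(domains, point):
    """Return the single domain hosting an object's reference point."""
    hosts = [d for d in domains if d.contains(point)]
    if len(hosts) != 1:
        raise LookupError(f"{point} is hosted by {len(hosts)} domains, expected 1")
    return hosts[0]


meadow = Domain("meadow", (0, 0, 0), (10, 10, 10))
forest = Domain("forest", (10, 0, 0), (20, 10, 10))

before = hosting_domain([meadow, forest], (9.9, 5, 1)).name
after = hosting_domain([meadow, forest], (10, 5, 1)).name
```

Because only the reference point decides the hosting domain, an object wider than 0.1 units standing at x = 9.9 would still be hosted by the meadow while visibly protruding into the forest.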

A client program, rendering the view of an object, uses the description of the domain object for artifacts such as floors, sky, etc. Such descriptions should include a "translucence" factor for the general content of the domain ("air", "fog", "water"?) which will affect the rendered view within the domain.

Of course, it is possible to construct a domain object without constraining the view out of the domain. At this point, the effects of crossing domain boundaries intrude: to allow scalability of implementation, it is impossible to support views reaching an arbitrary distance.

Within a domain, the domain object is responsible for ensuring that all objects are able to receive notification of any event occurring within the domain. The client is responsible for filtering these events to those within the client object's view and retrieving the appropriate presentation data from its sources.

Adjoining domains must also exchange notifications. Given the structure of the virtual space, a domain is able to determine its spatial neighbours, and can request from them notification of events for subsequent redistribution to its own constituents.

However, some limit must be placed on the propagation of these events. Simplistic approaches based on the number of domain hops are attractive, but do not cope well with "thin" domains. It is desirable that the virtual distance between the domains is used as the basic criterion.
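
To show why distance behaves better than hop count, here is a small sketch that measures the gap between two box-shaped domains. The box_gap and should_propagate functions, the (lo, hi) tuple layout and the limit of 25 units are all assumptions chosen for illustration.

```python
import math


def box_gap(a_lo, a_hi, b_lo, b_hi):
    """Shortest distance between two axis-aligned boxes (0 if they touch or overlap)."""
    gaps = [max(bl - ah, al - bh, 0.0)
            for al, ah, bl, bh in zip(a_lo, a_hi, b_lo, b_hi)]
    return math.sqrt(sum(g * g for g in gaps))


def should_propagate(source, target, limit):
    """Forward events from source to target only if the domains are close enough."""
    return box_gap(*source, *target) <= limit


# Each domain is a (lo, hi) pair of corners. The wall is a "thin" domain.
plaza = ((0.0, 0.0, 0.0), (10.0, 10.0, 10.0))
wall = ((10.0, 0.0, 0.0), (10.5, 10.0, 10.0))
garden = ((10.5, 0.0, 0.0), (20.0, 10.0, 10.0))
harbour = ((100.0, 0.0, 0.0), (110.0, 10.0, 10.0))

gap_to_garden = box_gap(*plaza, *garden)
near = should_propagate(garden, plaza, limit=25.0)
far = should_propagate(harbour, plaza, limit=25.0)
```

The garden is two hops from the plaza but only half a unit away, so a one-hop limit would wrongly hide it, whereas the distance test forwards its events and still excludes the harbour.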

In addition, there are cases where it is necessary to receive notification of events beyond your normal "seeing" distance. A modelled telescope (for example) will require the ability to manually subscribe to distant domains.

Beyond restricting the general propagation of notifications, it will also be possible to scope the visual "importance" of an event. As an example, a distant tree on a plain (several domains away) should be visible to you, but the falling of a leaf should normally not.

This will be implemented by supporting a range of "levels of detail" in the events, which domains can use to filter those they propagate.
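
One way such filtering might look, assuming that each event carries an integer detail level (0 for coarse events visible from afar, larger for finer detail) and that the permitted level drops with virtual distance. The Event class, the falloff of 50 units and the finest level of 3 are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    description: str
    detail_level: int  # 0 = coarse, visible from afar; larger = finer detail


def max_detail_for(distance, falloff=50.0, finest=3):
    """Finest detail level worth forwarding over the given virtual distance."""
    return max(0, finest - int(distance // falloff))


def events_to_forward(events, distance):
    """Keep only the events coarse enough to matter at this distance."""
    limit = max_detail_for(distance)
    return [e for e in events if e.detail_level <= limit]


events = [
    Event("a tree stands on the plain", 0),
    Event("a leaf falls from the tree", 3),
]

nearby = events_to_forward(events, distance=0.0)
distant = events_to_forward(events, distance=200.0)
```

A domain next to the tree forwards both events, while one 200 units away forwards only the tree itself.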

For simplicity, each domain is hosted by a single server. That server is responsible for the distribution of notifications for all events occurring within the domain. While it will be possible to replicate the domain in future, we do not initially propose to deal with this complexity.

Note that the object descriptions must be fetched separately: they are not included in the notification. It will be a matter for the user's policy whether the object's own description is used or a local one substituted on the basis of the object's type. Object descriptions can refer to VRML or other files served by different servers to that hosting the domain (and for performance, this is indeed likely).
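
The separation between notifications and descriptions might be sketched as follows: the notification carries only a reference, and the client resolves it through a cache. The DescriptionCache class, the notification field names and the example.org URL are hypothetical, and the fetch function is a stand-in for whatever transport is actually used.

```python
class DescriptionCache:
    """Fetch each object description once, then serve it locally."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable taking a URL and returning description text
        self.store = {}
        self.misses = 0

    def get(self, url):
        if url not in self.store:
            self.misses += 1
            self.store[url] = self.fetch(url)
        return self.store[url]


# Stand-in for a remote server holding VRML files (contents are placeholders).
remote_files = {"http://example.org/frisbee.wrl": "#VRML placeholder for a frisbee"}
cache = DescriptionCache(remote_files.__getitem__)

# A notification refers to the description; it does not contain it.
notification = {
    "object": "frisbee-1",
    "event": "moved",
    "description_url": "http://example.org/frisbee.wrl",
}

first = cache.get(notification["description_url"])
second = cache.get(notification["description_url"])
```

Repeated events for the same object cost one fetch in total, which is the saving that keeps notifications small.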

Given the structure of the virtual space, caching the object's presentation information might best be linked to the domain servers. This would reduce the distribution burden on an interesting object's host.

Customisation

Some objects will not be provided with appearance information by their author. Similarly to current web browsers, the client will provide such objects with a default presentation. Beyond this, and drawing from design work on Marco, a user might prefer to alter or replace the presentation provided by an object with another.

The client program must provide a means of either replacing or augmenting the presentation provided by an object with one of the user's own.

Uses of this could include tagging object types for the user's quick recognition, replacing the default presentation for unpresented objects with a type-based presentation devised by the user, etc.
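
A possible resolution order for these choices is sketched below. The UserPolicy fields, the file names and the precedence (user replacement, then the object's own presentation, then the user's type-based default, then the client default) are assumptions made for the example rather than a settled design.

```python
from dataclasses import dataclass, field

CLIENT_DEFAULT = "default-box"  # hypothetical built-in presentation


@dataclass
class UserPolicy:
    replacements: dict = field(default_factory=dict)   # type -> presentation that replaces the object's own
    type_defaults: dict = field(default_factory=dict)  # type -> presentation for unpresented objects
    tags: dict = field(default_factory=dict)           # type -> marker added on top (augmentation)


def resolve_presentation(obj_type, own_presentation, policy):
    """Return (presentation, tag) for an object of the given type."""
    base = (
        policy.replacements.get(obj_type)
        or own_presentation
        or policy.type_defaults.get(obj_type)
        or CLIENT_DEFAULT
    )
    return base, policy.tags.get(obj_type)


policy = UserPolicy(
    type_defaults={"printer": "my-printer.wrl"},
    tags={"chair": "red-outline"},
)

chair = resolve_presentation("chair", "chair.wrl", policy)
printer = resolve_presentation("printer", None, policy)
mystery = resolve_presentation("mystery", None, policy)
```

The chair keeps its author's presentation but gains the user's tag, the unpresented printer picks up the user's type-based presentation, and the unknown object falls back to the client default.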

Scenarios

There are a number of interesting scenarios that must be catered for in the implementation of such a system. Several of these are described below, together with some implementation notes.

The Telescope

Kasparov vs. Deep Blue

The Approaching Frisbee


David Arnold
28 May 1996