The Data-Centric Modus Operandi

First Published Tuesday, 17th August 2010 02:05 pm from Real-Time Innovations (RTI) : Rick Warren

The opinions expressed by this blogger and those providing comments are theirs alone and do not reflect the opinion of Automated Trader or any employee thereof. Automated Trader is not responsible for the accuracy of any of the information supplied by this article.


DDS stands for "Data Distribution Service." Data distribution is not messaging, and it is not eventing. However, data distribution subsumes messaging and eventing as use cases to a large extent (http://blogs.rti.com/2009/06/03/thinking-differently-about-messaging/), and as a result it often gets lumped into those categories.

Data distribution is about observing a changing world. A system whose communication is based on this paradigm tends to become data-centric: it becomes more concerned with modeling the first-class concepts of its business domain and less concerned with managing second-class "who-told-whom-to-do-what" middleware concepts like queues and messages. Along the way, it enjoys the benefits of decreased coupling and improved reliability, scalability, and performance.

Data Distribution and Its Kin

Classically, messaging is an evolution of the remote method invocation (RMI) paradigm - an attempt to make that paradigm less coupled and more scalable by making it asynchronous. A message says "I tell you to do this." When compared with RMI, "I" and "you" are more abstract, both in identity and multiplicity, and the request can be queued for processing at a later time or by another party without making the sender wait. These are improvements, but the interaction remains coupled, because the roles of "I" and "you" (often in the guises of "client" and "server" or the trendier "service consumer" and "service provider"), as well as the intention of what action should be performed, are still very much in play.

Eventing, like data distribution, is preoccupied with changes to the world. An event says "I changed in this way." It reduces coupling by entirely removing both the recipient of that information and any notion of intention from your business logic and your mental model; who might receive an event, and what they might choose to do as a result, are not the business of the event source. But state management remains a problem, because in order to understand the change that occurred, all recipients must have an up-to-date understanding of the state of the world prior to the latest event - "the price went up by a dollar" doesn't do me any good if I don't know what the price was before. This temporal coupling means that every recipient must process every event in order, whether those events are interesting or not, just in case the interpretation of a subsequent interesting event should happen to require the state established by a previous otherwise-uninteresting one.

The resulting processing and state management are complex and expensive. As a mitigation, they are frequently factored out of the applications that need the data and into state-management "servers" that "clients" must query using a message-centric or even RMI-based approach - a huge regression in engineering practice! The system becomes complicated by the presence of multiple interacting communication paradigms, and the servers (which serve no business role) introduce performance and fault-tolerance choke points.

A data-centric architecture eliminates these problems by simplifying the interactions. A data sample says simply "the world is like this." It thereby eliminates coupling not only in terms of source, recipients, and their intentions, but also in terms of time. There's no longer any need for recipients to process or store information they don't care about, because samples don't implicitly encompass previous samples. Therefore it becomes perfectly reasonable for one observer to examine the state of the world every second, or every minute, or every hour - and for another to observe every single intermediate state, even if those states change from one to the other many times a second.
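
The difference between an event and a data sample can be sketched in a few lines of Python. This is only an illustration of the idea, not DDS code: the names PriceChanged, PriceSample, replay_events, and latest_sample, as well as the symbol and prices, are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PriceChanged:
    """Event: "I changed in this way." Useless without the prior state."""
    symbol: str
    delta: float


@dataclass(frozen=True)
class PriceSample:
    """Data sample: "the world is like this." Complete on its own."""
    symbol: str
    price: float


def replay_events(initial_price: float, events: List[PriceChanged]) -> float:
    """An event consumer must apply every event, in order, to know the price."""
    price = initial_price
    for event in events:
        price += event.delta
    return price


def latest_sample(samples: List[PriceSample]) -> float:
    """A sample consumer may ignore everything except the newest sample."""
    return samples[-1].price


# The same three price movements, expressed both ways (hypothetical data).
events = [PriceChanged("XYZ", 1.0), PriceChanged("XYZ", -0.5), PriceChanged("XYZ", 2.0)]
samples = [PriceSample("XYZ", 11.0), PriceSample("XYZ", 10.5), PriceSample("XYZ", 12.5)]

price_from_events = replay_events(10.0, events)            # needs the starting state
price_from_samples = latest_sample(samples)                # needs only the last sample
price_missing_one_event = replay_events(10.0, events[1:])  # one lost event corrupts the result
```

An observer that skips the first two samples still sees the correct price, while an observer that misses a single event silently drifts away from it. That is the temporal coupling described above.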

Modeling the World with DDS

A set of DDS entities, and the data they distribute and manage, define a view into this changing "world."

  • A "domain" defines the boundaries of the world, the set of information that a collaborating group of applications might find interesting. A "domain participant" defines the presence of some application in that world; it is the data-centric analogue to what is frequently known as a "connection" in messaging middleware.

  • A "type" (http://blogs.rti.com/2009/04/30/data-transparency-why-you-should-care/) is a structural description of some part of the world - for example, an Antelope is brown in color and has four legs and two horns; a Ferrari is red in color and has four wheels and two seats. A type has a formal definition, usually (though not always) in a declarative language like XSD or OMG IDL, and it implies a corresponding definition in the target programming language.

  • A "quality-of-service" (QoS) definition defines the fidelity with which one or more parties are able to describe the world. For example, will the description contain every state the world passes through or only a subset? Will observers have access to new states of the world only, or will they be able to see previous states as well? If the latter, how far back will those previous states go?

  • A "topic" defines some aspect or subset of the world consisting of similar objects. As such, it combines a type, which defines the structure of those objects, with a QoS definition, which defines how they can be observed to change.

  • An "instance" defines a single object in the group defined by a topic. For example, a topic may be used to distribute the positions of airplanes as detected by a radar. Each plane would be an instance. All radar tracks have the same structure (type) and are updated in the same way (QoS). But they are also distinct from one another: it matters whether the plane at a given location happens to be American Airlines flight 123 or Delta flight 456.

  • A "data writer" defines a source of information about a particular subset of the world (topic). As such, it may override the QoS of its topic - multiple parties may provide information about the same part of the world but with different degrees of fidelity.

  • A "data reader" defines an observer of a particular subset of the world (topic). As such, it may also override the QoS of its topic. Furthermore, it may be able or willing to observe only certain states of the world. For example, it may only be interested in airplanes flying over a particular geographic area or in stocks trading at over $20/share.
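
To make the relationships among these entities concrete, here is a minimal in-memory sketch in plain Python. It is not the DDS API: every class and field name (Quote, QoS, Topic, Domain, DataWriter, DataReader, history_depth, content_filter) is an assumption chosen to mirror the vocabulary above, and the QoS definition is reduced to a single illustrative setting that the sketch carries around but does not enforce.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Quote:
    """The "type": the structure shared by every object in a topic."""
    symbol: str   # key field: each distinct symbol is one "instance"
    price: float


@dataclass(frozen=True)
class QoS:
    """A tiny stand-in for a quality-of-service definition."""
    history_depth: int = 1   # how many past states an observer may see


@dataclass(frozen=True)
class Topic:
    """A topic pairs a type with a default QoS definition."""
    name: str
    data_type: type
    qos: QoS = field(default_factory=QoS)


@dataclass
class DataReader:
    """An observer of one topic, optionally restricted to certain states."""
    topic: Topic
    qos: Optional[QoS] = None   # overrides the topic's QoS when set
    content_filter: Callable[[Quote], bool] = lambda sample: True
    latest: Dict[str, Quote] = field(default_factory=dict)

    def effective_qos(self) -> QoS:
        return self.qos if self.qos is not None else self.topic.qos

    def deliver(self, sample: Quote) -> None:
        if self.content_filter(sample):
            self.latest[sample.symbol] = sample   # one slot per instance


@dataclass
class Domain:
    """The boundary of the "world": routes samples to readers by topic."""
    readers: List[DataReader] = field(default_factory=list)

    def publish(self, topic: Topic, sample: Quote) -> None:
        for reader in self.readers:
            if reader.topic == topic:
                reader.deliver(sample)


@dataclass
class DataWriter:
    """A source of information about one topic; it never names its readers."""
    domain: Domain
    topic: Topic

    def write(self, sample: Quote) -> None:
        self.domain.publish(self.topic, sample)


domain = Domain()
quotes = Topic("Quotes", Quote)
all_quotes = DataReader(quotes)
over_20 = DataReader(quotes, qos=QoS(history_depth=5),
                     content_filter=lambda q: q.price > 20.0)
domain.readers += [all_quotes, over_20]

writer = DataWriter(domain, quotes)
for sample in (Quote("AAA", 19.5), Quote("BBB", 42.0), Quote("CCC", 20.0)):
    writer.write(sample)
```

After the three writes, all_quotes holds the latest state of all three instances, while over_20 holds only "BBB". The writer is the same in both cases and knows nothing about either reader, which is the decoupling the list above describes.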

By creating a data reader with a certain QoS definition, an application makes an affirmative statement that it wishes to observe a certain portion of the world under a certain set of circumstances. For example, it may state that it is interested in observing the most recent five states (samples) of the objects (instances) in its part of the world (topic), but it doesn't need to process changes more frequently than once every second.

This statement is one of interest only; it in no way requires the observer to actually observe a certain set of samples in a certain way or within a certain period of time. On the one hand, the observer may choose to be notified asynchronously of every new sample and to respond to it immediately. On the other, it may "go away" to other business and return hours later; when it does, it will find the most recent five samples of each instance, occurring no more frequently than once every second, waiting for it. In the meantime, DDS will have taken care of all of the necessary data reception, filtering, and replacement in order to make that happen.
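
The "most recent five samples, no more than one per second" behavior can be approximated with a small per-instance cache. The sketch below is an assumption-laden illustration in plain Python, not how any DDS implementation is written: ReaderCache, on_sample, and read are hypothetical names, and the two constructor parameters loosely mirror what DDS expresses through its history and time-based filter QoS policies.

```python
from collections import defaultdict, deque
from typing import Any, Dict, Hashable, List, Tuple


class ReaderCache:
    """Keeps the last N samples per instance, at most one per time interval."""

    def __init__(self, history_depth: int = 5, minimum_separation: float = 1.0):
        self.minimum_separation = minimum_separation
        # One bounded queue per instance; old samples fall off automatically.
        self._history = defaultdict(lambda: deque(maxlen=history_depth))
        self._last_accepted: Dict[Hashable, float] = {}

    def on_sample(self, key: Hashable, timestamp: float, value: Any) -> bool:
        """Called for every arriving sample; returns True if it was kept."""
        last = self._last_accepted.get(key)
        if last is not None and timestamp - last < self.minimum_separation:
            return False   # too soon after the previous accepted sample
        self._last_accepted[key] = timestamp
        self._history[key].append((timestamp, value))
        return True

    def read(self, key: Hashable) -> List[Tuple[float, Any]]:
        """What the application finds whenever it chooses to look."""
        return list(self._history.get(key, ()))


# A hypothetical radar track updated four times a second for ten seconds.
cache = ReaderCache(history_depth=5, minimum_separation=1.0)
for i in range(41):
    cache.on_sample("AA123", i * 0.25, i)

waiting = cache.read("AA123")
```

Of the 41 updates, only those at whole seconds are kept, and of those only the last five remain, so waiting holds the samples stamped 6.0 through 10.0. The application did nothing while the updates arrived; it simply reads the result whenever it returns.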

DDS's ability to combine notification and lightweight caching - in effect, to maintain an application's observed state of the world on its behalf - is something no other standards-based technology provides. Developers of data-centric systems reap the benefits: higher performance and scalability (http://www.rti.com/resources/product-tour/performance-scalability.html), greater tolerance to dynamic network conditions (http://www.rti.com/resources/product-tour/system-architecture.html), and ultimately improved ROI and time-to-market (http://www.rti.com/mk/commercial-middleware-vs-roll-your-own.html).


  • Copyright © Automated Trader Ltd 2013 - The Gateway to Algorithmic and Automated Trading
