This survey examines an emerging technology known as active databases. It presents the fundamental characteristics of active database systems and describes the features a database must support to be legitimately considered an active system. It describes the knowledge, execution and management models that support active behavior, and thus makes explicit the decision space within which the designers of active rule systems must work. The survey also describes some research prototypes and commercial systems, highlighting important similarities and differences in their implementations. Finally, the survey enumerates some of the challenges in developing and implementing active functionality in database systems.
The focus of this survey is on the issues that need to be considered in designing active capability, and on the way these issues have been addressed in current systems.
Introduction and outline of survey
This survey provides an overview of some of the issues and fundamental characteristics of active database systems. Section 1 defines active databases and enumerates the features a database must support to be legitimately considered an active system. Section 2 introduces some example applications and categorizes applications based on the applicability of active functionality. Section 3 presents the main components of an abstract active rule system architecture and describes the knowledge, execution and management models for supporting active behavior. Section 4 describes and compares some representative active databases (both relational and object oriented) in the research and commercial domains. Section 5 indicates which architectural features are important for the implementation of active database systems and what the challenges are in developing such systems. Section 6 specifies some fields in which future work is required. Section 7 presents the conclusion.
Traditional database management systems are passive. Active databases extend a “passive” DBMS with the possibility of specifying (re)active behavior. The rules that define this active behavior are known as triggers or production rules. Active databases can thus be defined as databases with production rules. The rules are stored in the database, so they can be shared by many application programs and the database can optimize their implementation.
Active databases can recognize specific situations (internal or external) and react to them without an immediate, explicit user or application request. An active database system must provide a knowledge model (description mechanism) and an execution model (runtime strategy) for supporting this reactive behavior. The most common rule format is based on the ECA model, i.e. “whenever the event occurs, check the condition and, if it holds, execute the action”.
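The ECA format can be illustrated with a minimal Python sketch. It is not taken from any of the systems surveyed: the `EcaRule` class, the `signal` function and the stock example are hypothetical names chosen only to show how an event, a condition and an action fit together.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]  # hypothetical event record, e.g. {"type": "update", ...}


@dataclass
class EcaRule:
    """One event-condition-action rule (all names are illustrative)."""
    event_type: str
    condition: Callable[[Event], bool]
    action: Callable[[Event], None]


def signal(rules: List[EcaRule], event: Event) -> int:
    """Run every rule whose event matches and whose condition holds.

    Returns the number of actions executed.
    """
    fired = 0
    for rule in rules:
        if rule.event_type == event["type"] and rule.condition(event):
            rule.action(event)
            fired += 1
    return fired


# Example: reorder an item when an update leaves fewer than 10 in stock.
reorders: List[str] = []
low_stock = EcaRule(
    event_type="update",
    condition=lambda e: e["quantity"] < 10,
    action=lambda e: reorders.append(e["item"]),
)

signal([low_stock], {"type": "update", "item": "bolt", "quantity": 3})
signal([low_stock], {"type": "update", "item": "nut", "quantity": 50})
signal([low_stock], {"type": "insert", "item": "gear", "quantity": 1})
print(reorders)  # ['bolt']
```

Only the first event fires the rule: the second fails the condition and the third has a different event type.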
The features a database must support to be legitimately considered an active system are:
Reactive behavior must be specifiable (definable) by the user. The ADBMS should provide a knowledge model consisting of data definition facilities and a rule definition language as a means to specify ECA rules.
1. An ADBMS has to provide a means for defining event types
2. An ADBMS has to provide a means for defining conditions
3. An ADBMS has to provide a means for defining actions
An action formulates the reaction to an event and is executed after the rule is triggered and its condition is satisfied.
The set of rules at a given point in time forms the rule base. An ADBMS should store information about which rules currently exist and how they are defined, and should make them visible to users and applications (subject to access control).
An ADBMS must also provide support for adding (defining) new ECA rules, deleting old rules, and modifying the event, condition or action definitions of existing rules. It should have a mechanism to enable and disable rules in the rule base.
1. An ADBMS must detect event occurrences (either automatically or through user/application signaling)
2. An ADBMS must support different binding modes and coupling modes
3. An ADBMS must be able to evaluate conditions and execute actions
4. An ADBMS must implement a conflict resolution mechanism (in most systems this is currently done by means of priorities)
5. An ADBMS must manage the event history, which consists of all the occurrences of the defined event types. The notion of event history also defines the lifetime of event occurrences and can be very useful in debugging and tracing rules
6. An ADBMS must implement consumption modes
The event consumption mode determines which component (primitive) events are considered for a composite event, and how the event parameters of the composite event are computed from its components. Different application classes may require different consumption modes, such as ‘recent’, ‘chronicle’, ‘continuous’ and ‘cumulative’.
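To make the difference between the consumption modes concrete, the sketch below detects a composite event seq(A, B) over a small stream under each mode. The semantics are a deliberately simplified reading, stated here as an assumption: ‘recent’ keeps only the latest A and leaves it available, ‘chronicle’ pairs the oldest unconsumed A with each B, ‘continuous’ lets one B complete every open A, and ‘cumulative’ folds all open A occurrences into one detection. The function name `detect_seq` and the stream are hypothetical.

```python
from typing import List, Tuple

Occurrence = Tuple[str, int]            # (event type, timestamp)
Detection = Tuple[List[int], int]       # (initiator timestamps, terminator timestamp)

MODES = ("recent", "chronicle", "continuous", "cumulative")


def detect_seq(stream: List[Occurrence], mode: str) -> List[Detection]:
    """Detect seq(A, B) under a simplified version of one consumption mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    pending: List[int] = []             # A occurrences that can still be used
    detected: List[Detection] = []
    for kind, ts in stream:
        if kind == "A":
            if mode == "recent":
                pending = [ts]          # a newer A replaces the older one
            else:
                pending.append(ts)
        elif kind == "B" and pending:
            if mode == "recent":
                detected.append(([pending[0]], ts))       # A stays available
            elif mode == "chronicle":
                detected.append(([pending.pop(0)], ts))   # oldest A, consumed
            elif mode == "continuous":
                detected.extend(([a], ts) for a in pending)
                pending = []                              # all A consumed
            else:  # cumulative
                detected.append((list(pending), ts))
                pending = []                              # all A consumed
    return detected


stream = [("A", 1), ("A", 2), ("B", 3), ("B", 4)]
for m in MODES:
    print(m, detect_seq(stream, m))
```

For this stream, ‘recent’ and ‘chronicle’ each report two detections (with different initiators), ‘continuous’ reports two detections that share the terminator at time 3, and ‘cumulative’ reports a single detection that carries both initiators.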
If the user ignores all active functionality, the ADBMS should be reducible to a passive DBMS.
They involve the use of active functionality to describe some of the behavior manifested by the software system without reference to external devices or systems. Rules mainly deal with the maintenance of data within the database, e.g. using rules for automatic statistical analysis or for alerting users to special conditions.
They deal with controlling and responding to events occurring outside the database system, e.g. aircraft monitoring databases [11], medical monitoring databases [14], battlefield threat assessment [15], etc.
Active rules can perform tasks that are handled by special-purpose subsystems in a passive database. For example, ECA rules have been used to support integrity constraints [16], materialized views [19], transaction models [20], advanced data modeling constructs [21], coordination of distributed computation [22], etc. Such extensions to core database functionality are supported by defining a high-level syntax for the extended functionality and a mapping onto a set of active rules.
Abstract active rule system architecture
This section gives an overview of the principal components and processes of an active database. In addressing these issues, reference is made to the abstract architecture of an active database system presented above. The figure makes explicit the principal processes/components (rectangles) and data stores (ellipses) used to implement active functionality. A brief description of these components is given below.
1. Event Detector: It ascertains what events of interest to the rule system have taken place. Primitive events are notified from the database or from external sources; composite events are constructed from incoming primitive events plus the information about past events that can be obtained from the history.
2. Condition Monitor: It evaluates the conditions of rules associated with events that have been detected by the event detector.
3. Scheduler: It compares recently triggered rules with those that have previously been triggered, updates the conflict set, and fires any rules that are scheduled for immediate processing.
4. Query Evaluator: It executes database queries or actions. Access may be required both to the current state of database and to past states in order to support monitoring of how the database is evolving.
Other components that are useful and provide support are:
1. Rule Manager/Rule Catalog: Required for handling rule definition and manipulation tasks.
2. Rule Execution Monitor: Required for maintaining the set of triggered rules and scheduling their execution. It operates on top of the scheduler, query evaluator and event detector.
3. Rule Action Planner: It is invoked by the rule execution monitor to produce optimized execution strategies for the database commands occurring in rule actions. These commands are executed by the same query processor that executes user commands.
An active database system must provide a knowledge, execution and management model for supporting the reactive behavior.
Component | Dimension and possible values
Event | Source ⊂ {Structure Operation, Behavior Invocation, Transaction, Abstract, Exception, Clock, External}
Event | Granularity ⊂ {Member, Subset, Set}
Event | Type ⊂ {Primitive, Composite}
Event | Operators ⊂ {or, and, seq, closure, times, not}
Event | Consumption mode ⊂ {Recent, Chronicle, Cumulative, Continuous}
Event | Role ∈ {Mandatory, Optional, None}
Condition | Role ∈ {Mandatory, Optional, None}
Condition | Context ⊂ {DBT*, Bind E*, DBE*, DBC*}
Action | Options ⊂ {Structure Operation, Behavior Invocation, Update Rules, Abort, Inform, External, Do Instead}
Action | Context ⊂ {DBT*, Bind E*, Bind C*, DBE*, DBC*, DBA*}
* Refer to the symbol table at the end.
Dimensions of the Knowledge Model
The knowledge model of an active database system indicates what can be said about active rules in the system. It essentially supports the description of active functionality. The knowledge model of an active rule is considered to have (up to) three principal components: an event, a condition, and an action. The table above illustrates a number of dimensions of active behavior and makes explicit the decision space within which the designers of active rule systems work. The nature of the description and the way in which an event can be detected largely depend on the source or generator of the event. Events can be caused by a structure operation, an exception, a method invocation, etc. The event granularity indicates whether an event is defined for every object in a set or for specific members of the set. The role of the event indicates the relative importance of events in the rule structure. Similarly, the role of the condition indicates the importance of the condition, and its context indicates the setting in which the condition is evaluated.
Execution Model
Condition Mode ⊂ {Immediate, Deferred, Detached}
Action Mode ⊂ {Immediate, Deferred, Detached}
Transition Granularity ⊂ {Tuple, Set}
Net-effect policy ∈ {Yes, No}
Cycle policy ⊂ {Iterative, Recursive}
Priorities ∈ {Dynamic, Numerical, Relative, None}
Scheduling ∈ {All Parallel, All Sequential, Saturation, Some}
Error handling ⊂ {Abort, Ignore, Backtrack, Contingency}
Dimensions of the Execution Model
Principal steps that take place during rule execution
The execution model specifies how a set of rules is treated at run time. The table above illustrates a number of dimensions of the execution model. The condition and action coupling modes specify when the condition and action should be executed relative to the event that triggers the rule. The net-effect policy indicates whether the cumulative effect of event occurrences, rather than each individual occurrence, should be considered. The cycle policy specifies what happens when events are signaled during the condition evaluation or action execution of a rule. The transition granularity indicates the relationship between event occurrences and rule instantiations. When the transition granularity is ‘tuple’, a single event occurrence triggers a single rule. When the granularity is ‘set’, a collection of event occurrences is used together to trigger a rule. The scheduling dimension determines what happens when multiple rules are triggered at the same time. The execution model is closely related to aspects of the underlying DBMS. The main phases in rule evaluation are:
1. Signaling Phase: Refers to the appearance of an event occurrence caused by an event source.
2. Triggering Phase: It takes the events produced so far and triggers the corresponding rules. The association of a rule with its event occurrence forms a rule instantiation.
3. Evaluation Phase: It evaluates the condition of the triggered rules. The rule conflict set is formed from all rule instantiations whose conditions are satisfied.
4. Scheduling phase: It indicates how the rule conflict set is processed.
5. Execution Phase: It carries out the actions of the chosen rule instantiations. During action execution other events can in turn be signaled that may produce cascaded rule firing.
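The five phases above can be sketched as a single processing loop. This is an illustration under stated assumptions rather than the algorithm of any particular system: scheduling is all-sequential with numeric priorities (larger first), events are handled one at a time in arrival order, and the names `Rule`, `process` and the stock example are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List

Db = Dict[str, int]


@dataclass
class Rule:
    name: str
    event: str
    priority: int                        # assumption: larger value fires first
    condition: Callable[[Db], bool]
    action: Callable[[Db], List[str]]    # returns the events the action signals


def process(rules: List[Rule], db: Db, first_event: str,
            max_events: int = 100) -> List[str]:
    """Process one event and everything it cascades into; return firing order."""
    signaled: Deque[str] = deque([first_event])                   # 1. signaling
    fired: List[str] = []
    handled = 0
    while signaled and handled < max_events:    # coarse guard against endless cascades
        event = signaled.popleft()
        handled += 1
        triggered = [r for r in rules if r.event == event]        # 2. triggering
        conflict_set = [r for r in triggered if r.condition(db)]  # 3. evaluation
        conflict_set.sort(key=lambda r: -r.priority)              # 4. scheduling
        for rule in conflict_set:                                 # 5. execution
            signaled.extend(rule.action(db))    # actions may signal new events
            fired.append(rule.name)
    return fired


def reorder(db: Db) -> List[str]:
    db["orders"] += 1
    return ["order_placed"]              # cascades into the 'notify' rule


def audit(db: Db) -> List[str]:
    db["audits"] += 1
    return []


def notify(db: Db) -> List[str]:
    db["notices"] += 1
    return []


db = {"stock": 3, "orders": 0, "audits": 0, "notices": 0}
rules = [
    Rule("audit", "stock_updated", 1, lambda d: True, audit),
    Rule("reorder", "stock_updated", 5, lambda d: d["stock"] < 10, reorder),
    Rule("notify", "order_placed", 1, lambda d: d["orders"] > 0, notify),
]
print(process(rules, db, "stock_updated"))  # ['reorder', 'audit', 'notify']
```

The higher-priority `reorder` rule runs before `audit`, and the event it signals triggers `notify` in a second pass, which is the cascaded firing described in the execution phase.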
Description ⊂ {Programming Language, Query Language, Objects}
Operations ⊂ {Activate, Deactivate, Signal}
Adaptability ∈ {Compile time, Run time}
Data Model ∈ {Relational, Extended Relational, Deductive, Object-Oriented}
Programmer Support ⊂ {Query, Trace}
Dimensions of Rule Management
The table above illustrates the facilities provided by the system for managing rules: which operations can be applied to rules, how rules can be represented, and what programming support is available for them. The ‘description’ of a rule refers to how rules themselves are normally expressed. The ‘operations’ are the tasks supported on the rule base. The ‘adaptability’ refers to the degree of support for changes to rules. The ‘data model’ refers to the underlying database and has a significant influence on the design of the rule system. The ‘programmer support’ is of paramount importance if active rules are to be adopted as a mainstream implementation technology in any practical and useful environment.
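The management dimensions can be pictured with a small in-memory rule catalog. The sketch below is hypothetical and does not reproduce the interface of any surveyed system. It assumes run-time adaptability (rules can be defined and dropped while the program runs), reads ‘Signal’ as explicitly firing a named rule, and offers a `query` method as a very basic form of programmer support.

```python
from typing import Callable, Dict, List


class RuleBase:
    """Hypothetical in-memory rule catalog (not any real system's API)."""

    def __init__(self) -> None:
        self._bodies: Dict[str, Callable[[], str]] = {}
        self._enabled: Dict[str, bool] = {}

    def define(self, name: str, body: Callable[[], str]) -> None:
        """Add a rule, or replace an existing one; it starts enabled."""
        self._bodies[name] = body
        self._enabled[name] = True

    def drop(self, name: str) -> None:
        """Remove a rule from the rule base."""
        del self._bodies[name]
        del self._enabled[name]

    def activate(self, name: str) -> None:
        self._set_enabled(name, True)

    def deactivate(self, name: str) -> None:
        self._set_enabled(name, False)

    def signal(self, name: str) -> List[str]:
        """Explicitly fire one rule; a deactivated rule does nothing."""
        if not self._enabled[name]:
            return []
        return [self._bodies[name]()]

    def query(self) -> Dict[str, bool]:
        """Programmer support: list every rule with its enabled flag."""
        return dict(self._enabled)

    def _set_enabled(self, name: str, value: bool) -> None:
        if name not in self._bodies:
            raise KeyError(name)
        self._enabled[name] = value


rule_base = RuleBase()
rule_base.define("audit", lambda: "audited")
rule_base.deactivate("audit")
skipped = rule_base.signal("audit")      # deactivated, so nothing runs
rule_base.activate("audit")
ran = rule_base.signal("audit")          # the rule body runs once
print(skipped, ran, rule_base.query())
```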
Active databases are a powerful, yet complex, extension to traditional database technology. This section presents some of the prominent ADBMS implementations, both research prototypes and commercial systems, highlighting important similarities and differences. The two categories of systems discussed are relational and object oriented.
The Starburst active rule system adds active functionality to an extensible relational database system. The unique feature of Starburst is that rule processing is invoked automatically at the end of each user transaction that triggers one or more rules. In addition, users can invoke rule processing within transactions by issuing special commands.
The rule system of Starburst is based on a set-based execution model, in which rules are triggered by the net effect of a set of changes to the data. When an operation takes place that is being monitored by a rule, the nature of the change is logged in a transition table. The information logged in the transition table is used to trigger rules at rule assertion points, which may take place either during or at the end of a transaction. Thus events do not trigger rules directly. The role of the event is mandatory. For conflict resolution, Starburst uses relative priorities. The operations it supports for rule management are ‘Signal’, ‘Activate’ and ‘Deactivate’, and the scheduling it offers is all-sequential. The action mode it supports is ‘immediate’ and the condition mode is ‘deferred’. The semantics of rule execution in Starburst are still quite complex, and it can be argued that supporting many more facilities would make rule sets difficult to understand and maintain.
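The idea of triggering on the net effect of logged changes, rather than on each change, can be sketched as follows. The composition rules used here are the commonly described ones (an insert followed by a delete cancels out, an insert followed by updates is still an insert, updates followed by a delete are a delete). The function name `net_effect`, the log layout, and the assumption that tuple identifiers are never reused after a delete are illustrative and are not drawn from Starburst's implementation.

```python
from typing import Dict, List, Tuple

LogEntry = Tuple[str, int]   # (operation, tuple identifier)


def net_effect(log: List[LogEntry]) -> Dict[int, str]:
    """Collapse a transition log into one net operation per tuple."""
    net: Dict[int, str] = {}
    for op, tid in log:
        if op not in ("insert", "update", "delete"):
            raise ValueError(f"unknown operation: {op}")
        prev = net.get(tid)
        if prev is None:
            net[tid] = op                 # first change seen for this tuple
        elif prev == "insert":
            if op == "delete":
                del net[tid]              # insert then delete: no net effect
            # insert then update: still a net insert
        elif prev == "update":
            if op == "delete":
                net[tid] = "delete"       # update then delete: net delete
            # update then update: still a net update
        else:
            # Assumption: a deleted tuple identifier is never reused.
            raise ValueError(f"tuple {tid} was already deleted")
    return net


log = [
    ("insert", 1), ("update", 1),     # net insert
    ("insert", 2), ("delete", 2),     # cancels out
    ("update", 3), ("update", 3),     # net update
    ("update", 4), ("delete", 4),     # net delete
]
print(net_effect(log))  # {1: 'insert', 3: 'update', 4: 'delete'}
```

A rule monitoring inserts would therefore be triggered for tuple 1 only, even though the log contains two insert operations.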
The POSTGRES rule system is a tuple-oriented active relational database system. The role of the event is mandatory and the scheduling is all-sequential. It does not use priorities, but it supports immediate condition and action execution modes. When a rule action is executed, it may modify additional tuples, each of which may (immediately) trigger additional rules. Thus rule processing is inherently recursive and synchronous.
The SQL 3 standard supports both row-level triggers (with a transition granularity of tuple) and statement-level triggers (with a transition granularity of set). Statement-level triggers are executed once in response to an update operation on a table, no matter how many tuples are affected by the update. The most important feature of SQL 3 is that it makes explicit how triggers interact with other features found in relational databases, such as declarative integrity checking mechanisms.
The role of the event is mandatory and scheduling is all-sequential. For error recovery it supports backtracking, i.e. rolling back the transaction. It has both immediate and deferred action execution modes. No conflict resolution is necessary, since no two rules can be defined to have the same triggering event. Further, additional syntactic restrictions on rule definition ensure that the same table cannot be modified multiple times in a sequence of rule firings, thereby ensuring termination.
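Row-level granularity can be observed with Python's built-in `sqlite3` module, used here only as a convenient stand-in: SQLite's trigger dialect is not the SQL 3 standard, and it implements only `FOR EACH ROW` triggers. The table and trigger names below are made up for the example. A single `UPDATE` statement that touches three rows fires the trigger body three times, whereas a statement-level trigger would run once for the whole statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
    CREATE TABLE audit_log (account_id INTEGER, old_balance INTEGER,
                            new_balance INTEGER);

    -- Row-level trigger: the body runs once per updated row.
    CREATE TRIGGER log_balance_change
    AFTER UPDATE OF balance ON account
    FOR EACH ROW
    WHEN NEW.balance <> OLD.balance
    BEGIN
        INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
    END;

    INSERT INTO account VALUES (1, 100), (2, 200), (3, 300);
""")

# One UPDATE statement (the event) that affects three rows.
conn.execute("UPDATE account SET balance = balance + 10")

rows = conn.execute(
    "SELECT account_id, old_balance, new_balance "
    "FROM audit_log ORDER BY account_id"
).fetchall()
print(rows)  # [(1, 100, 110), (2, 200, 210), (3, 300, 310)]
```

The `WHEN` clause plays the role of the condition and the `INSERT` in the trigger body is the action, so the trigger is a direct instance of the ECA format.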
HiPAC was associated with the passive OODB PROBE. It pioneered many of the most important ideas in active databases, such as coupling modes and composite events. In HiPAC, the rule definer has the flexibility of deciding whether or not the conditions and actions should execute in the triggering transaction. It supports two levels of coupling, i.e. event-condition coupling and condition-action coupling. Rule processing in HiPAC is invoked whenever any event occurs that triggers one or more rules. Rather than selecting one triggered rule to execute using some form of conflict resolution, HiPAC executes all triggered rules concurrently. To do so it uses an extension of the nested transaction model of execution. HiPAC rules may have a relative ordering, and this ordering is used to influence the serialization order of concurrently executing nested sub-transactions. The role of the event is mandatory, and HiPAC supports the ‘immediate’, ‘deferred’ and ‘detached’ modes for both condition and action execution. The transition granularity is ‘set oriented’.
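The three coupling modes differ in when the coupled work runs relative to the triggering transaction. The sketch below only records that ordering and is not HiPAC's nested-transaction mechanism. It assumes that immediate work runs right after the event, deferred work runs just before commit, and detached work runs in a separate transaction, which is shown after the commit for simplicity even though a real system could run it concurrently. The function and rule names are hypothetical.

```python
from typing import List, Tuple


def run_transaction(triggered: List[Tuple[str, str]]) -> List[str]:
    """Return the order of work for (rule name, coupling mode) pairs."""
    trace: List[str] = []
    deferred: List[str] = []
    detached: List[str] = []
    for rule, mode in triggered:
        if mode == "immediate":
            trace.append(f"{rule}@event")          # runs right away
        elif mode == "deferred":
            deferred.append(rule)                  # postponed to end of transaction
        elif mode == "detached":
            detached.append(rule)                  # handed to another transaction
        else:
            raise ValueError(f"unknown coupling mode: {mode}")
    trace.extend(f"{rule}@pre-commit" for rule in deferred)
    trace.append("commit")
    trace.extend(f"{rule}@separate-transaction" for rule in detached)
    return trace


order = run_transaction([
    ("send_alert", "detached"),
    ("recompute_total", "deferred"),
    ("check_limit", "immediate"),
])
print(order)
# ['check_limit@event', 'recompute_total@pre-commit', 'commit',
#  'send_alert@separate-transaction']
```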
This active object oriented database is not based upon persistent C++ system.
Sentinel is an active extension to the C++ based Open OODB system from Texas Instruments. The focus in this project has been upon the provision of comprehensive event specification mechanisms, the representation of rules as database objects, and the integration of the rule system with a sophisticated transaction manager. The rules may have relative priorities assigned to them. Scheduling supports parallel execution. Sentinel has a tuple-level transition granularity and supports both ‘immediate’ and ‘deferred’ condition execution modes. It supports all event consumption policies, i.e. ‘recent’, ‘chronicle’, ‘continuous’ and ‘cumulative’. The role of the event is mandatory in Sentinel.
It supports ‘immediate’, ‘deferred’ and ‘detached’ condition and action execution modes. It has a sequential scheduling policy and supports adaptability at run time. The consumption policy for events is ‘chronicle’, and it supports both member-level and set-level event granularity.
Features | Rule expressiveness | Execution semantics | Efficiency
System | Database Events | Temporal Events | External Events | Event Constructors | Coupling modes | Cascade rule exec | Multiple rule exec | Arch. | Type
HiPAC | Insert, Delete, Modify | Absolute, Relative | Yes | Disjunction, Sequence, Closure | Immediate, Deferred, Detached | Yes | Using extended nested transaction | Integrated, Object oriented | Research prototype
Postgres | Retrieve, Replace, Delete, Append, New, Old | Time(), Date() | No | Only disjunction | Immediate | Yes | Using user defined priorities | Extended relational | Research prototype
Starburst | Inserted, Deleted, Updated | No | No | Only disjunction | Deferred | Yes | Using a conflict resolution strategy | Extended relational | Research prototype
Sybase | Insert, Update, Delete | No | No | No | Immediate | Yes | No | Relational | Commercial system
V. Challenges [17]
There are some key challenges that need to be handled before this technology can be efficiently used. Some of the issues in developing active database applications are
1. Given the requirements, it is unclear which parts of the database should be supported using active mechanisms and what performance penalty is likely to result from the use of rules.
2. The functionality of a large rule base may be difficult to understand, with rules interacting in complex ways and no single description of how control flows through an application.
3. The tools associated with an active rule system may be minimal, with little support for browsing, monitoring or debugging active rules.
The following issues reflect the need for design methodologies, rule analysis techniques and tools for debugging and explanation.
Rule design: Given the requirement of an application, design techniques should provide guidance on the aspects that should be supported using active mechanisms, and those that are better addressed using other facilities. Proposals have been made for methods with explicit support for active behavior. Some of the examples are
These approaches assume that all active rules should surface explicitly in the design method, which might be very difficult and unreasonable.
Rule analysis: There are a number of different characteristics of rule behavior for which a rule analyzer can search [26]:
1. Termination problem (Is the rule processing guaranteed to terminate?) Static analysis of a rule base can indicate whether a given set of rules may fail to terminate. A less conservative approach is to examine the conditions and actions of rules in more detail, but this is more system specific.
2. Confluence (Is the result of rule processing independent of the order in which simultaneously triggered rules are selected for processing?) A rule base is confluent if, for any two rules Ri and Rj triggered in any initial state S, a single final state F is guaranteed to be reached regardless of the order in which any subsequent simultaneously triggered rules are selected for firing. Work on confluence analysis has developed algorithms for analyzing a complete rule base [26] and for considering the effect of an update on the truth of the condition [27].
3. Observable determinism (Does the user of the system observe the same effect of rule processing, independent of the order in which triggered rules are selected for processing?) There is a range of possible remedies, such as prioritizing rules, modifying rule conditions or actions to change their effect, or dropping rules that implement conflicting policies.
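A common conservative form of static termination analysis builds a triggering graph, with an edge from rule Ri to rule Rj whenever the action of Ri can raise an event that triggers Rj, and then looks for cycles. The sketch below assumes that such a graph has already been derived and is supplied as an adjacency mapping. The function name `may_not_terminate` and the rule names are hypothetical. A cycle only means that non-termination is possible, since the conditions on the cycle might never all hold.

```python
from typing import Dict, List


def may_not_terminate(triggers: Dict[str, List[str]]) -> bool:
    """Return True if the triggering graph contains a cycle.

    triggers[r] lists the rules whose events may be raised by r's action.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    color: Dict[str, int] = {rule: WHITE for rule in triggers}

    def visit(rule: str) -> bool:
        color[rule] = GREY
        for nxt in triggers.get(rule, []):
            state = color.get(nxt, WHITE)
            if state == GREY:             # back edge: rule can re-trigger itself
                return True
            if state == WHITE and visit(nxt):
                return True
        color[rule] = BLACK
        return False

    return any(color[rule] == WHITE and visit(rule) for rule in list(triggers))


acyclic = {"r1": ["r2"], "r2": ["r3"], "r3": []}
cyclic = {"r1": ["r2"], "r2": ["r3"], "r3": ["r1"]}
print(may_not_terminate(acyclic), may_not_terminate(cyclic))  # False True
```

The less conservative analyses mentioned above would refine this by removing edges that the rule conditions and actions show to be infeasible.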
Rule debugging: The fact that rule base exhibits terminating and confluent behavior does not imply that it is correct. As the rule base language becomes complicated, the need for rule debugging environment becomes very important. Developing a rule base debugger is a difficult task because insidious interactions rather than state are the main source of incorrect or unexpected behavior and this context-dependent control exhibited by active rules imposes new complex demands on rule debugger [28].
Error recovery: One issue not fully addressed in many ADBMS is the semantics of error recovery during rule processing. Most database rule systems handle errors during rule processing by aborting the current transaction. But in case of error conditions produced by rule actions, this is not the only possible reasonable behavior. Other alternatives are to terminate execution of that rule and continue rule processing, to return to the state preceding rule processing and resume database processing, or to restart rule processing. Another challenge is how to recover events after system crashes especially for temporal or external events.
Features for analyzing rule processing include the ability to trace rule execution, to display the current set of triggered rules, to query and browse the set of rules, and to cross reference rules and data. Other useful features include the ability to control errors in rule programs, to activate and deactivate selected rules or group of rules while database system is processing transactions, and to experiment with rules off-line.
The theory and technology of active database systems are still maturing. There has been considerable experimentation in developing ADBMSs in different domains, but relatively little work has been done on standardization and theory. This has resulted in a wide range of constructs, execution strategies and software architectures being proposed, each with utility in different problem domains. There is thus a need to develop a formal framework for ADBMSs that can be used to investigate several basic issues relating to their semantics and expressiveness. Along with this framework, generic performance metrics for active databases are needed so that different systems, and the optimization techniques they use, can be meaningfully compared.
Some of the domains in which a lot of work is needed are
Despite the challenges presented, the rewards offered by active functionality are great, and as the field matures it is likely to prove an enduring force within the area of database technology.
SYMBOLS | DESCRIPTION
ADBMS | Active Database Management System
ECA | Event Condition Action
DBT | Start of current transaction
DBE | Start of current event
DBC | Start of current condition
DBA | Start of current action
Bind E | Bind with the event
Bind C | Bind with the condition