[Rock-dev] Discussion about fault response tables
Sylvain Joyeux
sylvain.joyeux at dfki.de
Tue May 7 16:29:26 CEST 2013
On 05/06/2013 03:47 PM, Chris Mueller wrote:
> Hi,
>
> I have a few small points that came to mind:
>
> 1) I have little experience with professional fault-tolerant systems. It
> might also be helpful in the beginning to specify the types of
> possible responses the system could model, e.g.:
>    1. Respawning a task that has failed due to hardware defects,
>       software bugs, etc. (retry)
>    2. Retrying a task with other property values because the configuration
>       of e.g. a detector is completely wrong for the current
>       environment conditions.
>    3. Starting a repair task to replace the failed task until the system is
>       back in a stable state (that's one of the common use cases Roby
>       currently provides)
>    4. Abandoning a mission if the failed plan absolutely cannot succeed,
>       and trying to continue with the rest of the plan (that's a more
>       high-level failure)
>    5. Spawning a complete alternative plan if a specific task / action
>       cannot succeed but is necessary for the rest of a plan.
> I guess there are probably many more concrete response types that
> could be derived from our past experience and requirements with
> several systems.
> This could help to make the system more concrete, because in my opinion
> it's not a trivial matter to design a system model that can handle each
> kind of error.
All these points are covered by the proposed fault response tables.
Items 1, 2, 3 and 5 are provided by on_fault. It is interesting to note
that, from the point of view of Roby, 2, 3 and 5 are equivalent.
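
One way to read that equivalence, as a rough sketch only: in each of
cases 2, 3 and 5 the response boils down to "when this fault occurs,
start that action with these arguments". The toy table below is plain
Ruby, not the Roby API; the class, the fault names and the action names
are all made up for illustration.

```ruby
# Toy model only: none of these names exist in Roby.
ToyResponse = Struct.new(:fault, :action, :arguments)

class ToyResponseTable
  attr_reader :responses

  def initialize
    @responses = []
  end

  # Register "when `fault` occurs, start `action` with `arguments`"
  def on_fault(fault, action, **arguments)
    @responses << ToyResponse.new(fault, action, arguments)
  end

  # Responses registered for a given fault, in registration order
  def responses_for(fault)
    @responses.select { |response| response.fault == fault }
  end
end

table = ToyResponseTable.new
# Case 2: retry the same action with other property values
table.on_fault(:detector_lost_target, :detect_pipeline, threshold: 0.2)
# Case 3: start a repair task
table.on_fault(:motor_overheated, :cool_down)
# Case 5: start a complete alternative plan
table.on_fault(:path_blocked, :plan_alternative_route)
```

All three entries have the same shape, which is the point: the table
does not need to know whether the action is a retry, a repair or a
replacement plan.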
> 2) If a fault exception is thrown in the system, a fault handler
> (on_fault ...) should also provide the task that has failed. An
> idealized example could be:
The failed task is already provided by exception.origin.
> on_fault EXCEPTION do |exception, failed_task|
>     failed_task.prepare_restart
>     failed_task.reconfigure(:param => BETTER_PARAM_VALUE)
>     failed_task.respawn
> end
Reacting at the level of the individual task is ill-conceived, though,
as you can only very rarely handle a fault there.
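
To illustrate the exception.origin point: a handler that takes a single
block argument can still reach the failed task. The sketch below uses
hypothetical stand-in structs (ToyTask, ToyException) so that it runs on
its own; only the `origin` accessor is taken from the discussion above.

```ruby
# Hypothetical stand-ins, used only to make the sketch runnable
ToyTask = Struct.new(:name)
ToyException = Struct.new(:origin, :message)

# A handler with one argument: the failed task comes from exception.origin
handler = lambda do |exception|
  failed_task = exception.origin
  "#{failed_task.name} failed: #{exception.message}"
end

report = handler.call(ToyException.new(ToyTask.new("detector"), "timeout"))
puts report
```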
> 3) Fault tolerance tables could probably also be visualized in syskit
> browse/roby-display. That would later be helpful for implementing and
> debugging the response management. That would be great.
As well as the action interface in general, yes.
> 4) Could you explain the meaning of "symbol" within the
> FAULT_MATCHER specification a little more concretely (maybe with an
> example)?
> Is it some kind of custom signal that can be thrown from any
> composition/task when a specific data port doesn't output an expected
> value within a given time?
> (That's currently my interpretation of the concept 'data_predicates'
> mentioned in the wiki.)
It is just a means of giving a name to an error. For instance, you would do:

    # Monitor a battery level
    fault :battery_low do
        battery0_dev.status_port.battery_level < 1
    end

    # React to it
    on_fault :battery_low do |exception|
        surface
    end

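
For readers who want to play with the idea outside of Roby, here is a
minimal, self-contained toy model of the naming mechanism. It is only a
sketch under loud assumptions: ToyFaultTable and its `process` method are
invented for this example, samples are plain hashes instead of port data,
and the handler receives the fault name rather than an exception object.

```ruby
# Toy model only: a stand-in for the proposed DSL, not Roby's implementation.
class ToyFaultTable
  def initialize
    @predicates = {}                           # fault name => predicate block
    @handlers = Hash.new { |h, k| h[k] = [] }  # fault name => handler blocks
  end

  # Give a name to an error condition
  def fault(name, &predicate)
    @predicates[name] = predicate
  end

  # Register a reaction to a named fault
  def on_fault(name, &handler)
    @handlers[name] << handler
  end

  # Evaluate every predicate against a sample and call the handlers of
  # the faults that are active. Returns the names of the active faults.
  def process(sample)
    active = @predicates.select { |_name, predicate| predicate.call(sample) }.keys
    active.each do |name|
      @handlers[name].each { |handler| handler.call(name) }
    end
    active
  end
end

reactions = []
table = ToyFaultTable.new
# Monitor a battery level
table.fault(:battery_low) { |sample| sample[:battery_level] < 1 }
# React to it (the toy just records that it would surface)
table.on_fault(:battery_low) { |_fault| reactions << :surface }

table.process({ battery_level: 5 })   # level is fine, no handler is called
table.process({ battery_level: 0.5 }) # :battery_low is active, handler is called
```

The symbol is nothing more than the key that ties the monitoring block
to the reaction block.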
--
Sylvain Joyeux (Dr.Ing.)
Space & Security Robotics