Petals ESB Container

The recovery process must be improved with respect to error management

Details

  • Type: Improvement Request
  • Status: New
  • Priority: Blocker
  • Resolution: Unresolved
  • Affects Version/s: 3.0.6, 3.1.1
  • Fix Version/s: None
  • Component/s: Recovery
  • Security Level: Public
  • Description:

    Today, when recovery starts, all components are set to 'shutdown'. If an error occurs during recovery, we cannot restart the artefacts in the state they were in just before the last stop.

    The error management should be reviewed.

    In my view, the current install-state and lifecycle-state should not be updated by the recovery process. Both states record the state each artefact was in just before the last stop. A new state must be introduced to manage errors. And to deal with artefacts whose recovery ended in error while others are running, a new JMX API must be introduced.
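
    The proposal above can be illustrated with a minimal sketch. It is not Petals code: the names `LifecycleState`, `RecoveryState`, and `ArtifactRecord` are hypothetical, and the only point is that the state recorded before the last stop is never modified by recovery, while the outcome of recovery is tracked in a separate state.

```java
import java.util.concurrent.Callable;

/** Lifecycle state stored before the last stop (illustrative names, not the Petals model). */
enum LifecycleState { SHUTDOWN, STOPPED, STARTED }

/** Hypothetical additional state that records only the outcome of recovery. */
enum RecoveryState { NOT_RECOVERED, RECOVERED, FAILED }

final class ArtifactRecord {
    private final String name;
    // State the artefact was in just before the last stop; recovery never changes it.
    private final LifecycleState lifecycleState;
    private RecoveryState recoveryState = RecoveryState.NOT_RECOVERED;

    ArtifactRecord(String name, LifecycleState lifecycleState) {
        this.name = name;
        this.lifecycleState = lifecycleState;
    }

    /** Runs the restore action and records success or failure in recoveryState only. */
    void recover(Callable<Void> restoreAction) {
        try {
            restoreAction.call();
            recoveryState = RecoveryState.RECOVERED;
        } catch (Exception e) {
            recoveryState = RecoveryState.FAILED;
        }
    }

    String getName() { return name; }
    LifecycleState getLifecycleState() { return lifecycleState; }
    RecoveryState getRecoveryState() { return recoveryState; }
}
```

    With this split, a failed recovery leaves the stored lifecycle state available for a later attempt instead of overwriting it with 'shutdown'.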

  • Environment:
    -

Activity

Christophe DENEUX made changes - Mon, 4 Oct 2010 - 10:37:18 +0200
Field Original Value New Value
Priority Blocker [ 1 ]
Description (Original Value): Today, when the recover starts, all components are set as 'shutdowned'. And is an error occurs during recover, we are not able to restart artefacts in the same state than they was just before the last stop.

The error management should be reviewed.

In my mind, actual states install-state and lifecycle-state have not to be updated by the recovering process. These both states store the artefact states in which they was just before the last stop. A new state must be introduced to manage errors. And to be able to deal with artefacts recovered with errors during execution of others, an new JMX API must be introduced.
Description (New Value): Today, when the recover starts, all components are set as 'shutdown'. And if an error occurs during recover, we are not able to restart artefacts in the same state than they was just before the last stop.

The error management should be reviewed.

In my mind, actual states install-state and lifecycle-state have not to be updated by the recovering process. These both states store the artefact states in which they was just before the last stop. A new state must be introduced to manage errors. And to be able to deal with artefacts recovered with errors during execution of others, an new JMX API must be introduced.
Roland Naudin added a comment - Mon, 4 Oct 2010 - 10:58:19 +0200

IMO, there is no proper automatic solution to this problem.
There are only solutions that involve human interpretation and intervention.

If you keep the previous state when a recovery fails and then restart the container, imagine that it fails again and again...
What do you propose to recover the failing artifact?
The only solution is to uninstall it. That is why the state kept by the recovery process is the one that was set successfully.

Another current problem is that an SA cannot be installed if a component is not STARTED/STOPPED. This filter, set at the component level, must be removed.

What alternative do you propose that would be acceptable for operations?

Christophe DENEUX added a comment - Mon, 4 Oct 2010 - 13:07:25 +0200

The recovery process will take into account the JBI states and the new state. So if an artifact fails, it will not be automatically recovered next time. And a user can use the JMX API to continue the recovery after having fixed the problem.

Moreover, IMO, the lost+found makes no sense. Why can't directories be kept at their initial location?

Roland Naudin added a comment - Mon, 4 Oct 2010 - 13:18:40 +0200

The lost+found is for 'lost' directories in the repository.
'Lost' directories correspond to JBI artefacts that have not been properly uninstalled but have nevertheless been 'forced' to uninstall.
They are no longer referenced in the system-state.xml.

About the JMX API you mention, it seems to be the standard life-cycle operations. What is the difference?

IMO, the point is: how do we fix a problem when one occurs? No automatic solution is possible.

Christophe DENEUX added a comment - Mon, 4 Oct 2010 - 14:48:29 +0200

The main goal of the new JMX API is to let the user continue the recovery process when it was stopped because of an error.

Yes, no automatic solution exists to resolve the problem, but from a user's point of view, the restart process and error management can be improved.
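
To make the idea of "continuing" a stopped recovery more concrete, here is a rough sketch of what such a management interface might look like. It is an assumption for discussion, not an existing Petals API: `RecoveryAdminMBean`, `RecoveryAdmin`, `getPendingArtifacts`, `resumeRecovery`, and `addArtifact` are all hypothetical names. The interface name only follows the JMX standard MBean convention (class name plus `MBean`); registration on an MBean server is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical management interface, named after the JMX standard MBean convention. */
interface RecoveryAdminMBean {
    /** Names of the artefacts still waiting to be recovered, in recovery order. */
    List<String> getPendingArtifacts();

    /** Continues recovery from the first pending artefact; returns true once nothing is pending. */
    boolean resumeRecovery();
}

final class RecoveryAdmin implements RecoveryAdminMBean {
    private final List<String> pendingNames = new ArrayList<>();
    private final List<Callable<Void>> pendingActions = new ArrayList<>();

    /** Queues an artefact together with the action that restores its previous state. */
    synchronized void addArtifact(String name, Callable<Void> restoreAction) {
        pendingNames.add(name);
        pendingActions.add(restoreAction);
    }

    @Override
    public synchronized List<String> getPendingArtifacts() {
        return new ArrayList<>(pendingNames); // defensive copy
    }

    @Override
    public synchronized boolean resumeRecovery() {
        while (!pendingNames.isEmpty()) {
            try {
                pendingActions.get(0).call();
            } catch (Exception e) {
                // Stop at the first failure; the artefact stays pending
                // until the user fixes the problem and resumes again.
                return false;
            }
            pendingNames.remove(0);
            pendingActions.remove(0);
        }
        return true;
    }
}
```

In this sketch the user would inspect the pending list after a failed start, fix the faulty artefact, and call `resumeRecovery()` again; nothing is retried automatically.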

Sébastien André made changes - Mon, 2 May 2011 - 11:16:28 +0200
Link This issue blocks SPLOGIDGME-13 [ SPLOGIDGME-13 ]



People

Dates

  • Created:
    Mon, 4 Oct 2010 - 10:36:45 +0200
    Updated:
    Mon, 2 May 2011 - 11:16:28 +0200