Petals ESB Container

Introduce a mechanism to handle a lot of messages in the DeliveryChannel

Details

  • Type: New Feature
  • Status: Open
  • Priority: Major
  • Resolution: Unresolved
  • Affects Version/s: 4.3.0
  • Fix Version/s: 5.4.0
  • Component/s: Persistence
  • Security Level: Public
  • Description:

    Currently, when many messages are put in a DeliveryChannel and the component does not take them for any reason, the DeliveryChannel grows until no more memory is available.
    No protection currently exists to prevent this.

    We should introduce a mechanism for the queue to be flushed to disk: instead of using memory, messages would be stored on disk (beyond a threshold, so as not to impact performance in normal conditions).
    Note that persisting to disk using techniques such as memory-mapped files is very efficient, so writing to disk all the time could actually be acceptable.

    Special attention should be given to handling special situations such as a JVM crash, and to restoring state after a restart.
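    A minimal sketch of such an overflow queue (all names are hypothetical; it assumes exchanges are Serializable, and it leaves out the memory-mapped-file optimisation mentioned above):

        import java.io.IOException;
        import java.io.ObjectInputStream;
        import java.io.ObjectOutputStream;
        import java.io.Serializable;
        import java.nio.file.Files;
        import java.nio.file.Path;
        import java.util.ArrayDeque;
        import java.util.Deque;

        /** FIFO queue kept in memory up to a threshold, then spilled to disk (sketch). */
        public final class OverflowQueue<T extends Serializable> {

            private final Deque<T> memory = new ArrayDeque<>();
            private final Deque<Path> onDisk = new ArrayDeque<>();
            private final Path spillDir;
            private final int threshold;

            public OverflowQueue(final Path spillDir, final int threshold) throws IOException {
                this.spillDir = Files.createDirectories(spillDir);
                this.threshold = threshold;
            }

            public synchronized void put(final T msg) throws IOException {
                // Fast path: plain in-memory storage below the threshold.
                if (memory.size() < threshold && onDisk.isEmpty()) {
                    memory.addLast(msg);
                    return;
                }
                // Slow path: serialize the message to its own file.
                final Path file = Files.createTempFile(spillDir, "msg-", ".bin");
                try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                    out.writeObject(msg);
                }
                onDisk.addLast(file);
            }

            @SuppressWarnings("unchecked")
            public synchronized T take() throws IOException, ClassNotFoundException {
                // In-memory messages are always older than spilled ones, so FIFO holds.
                if (!memory.isEmpty()) {
                    return memory.pollFirst();
                }
                final Path file = onDisk.pollFirst();
                if (file == null) {
                    return null; // empty; a real implementation would block here
                }
                final T msg;
                try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
                    msg = (T) in.readObject();
                }
                Files.delete(file);
                return msg;
            }
        }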

  • Environment:
    -

Activity

Christophe DENEUX added a comment - Mon, 5 Oct 2015 - 15:47:28 +0200 - edited

Postponed to 5.0.3 or later.

Victor NOËL added a comment - Thu, 25 Jun 2015 - 12:22:29 +0200

Open questions and important points (so as not to forget them):

1) Exchanges can be persisted to disk to save memory (so exchanges are removed from memory)
2) If 1, then the provider of a service will have an instance of the previously persisted exchange, different from the instance that was persisted (if not, there is no point in persisting exchanges, because they would still be in memory…)
3) In case of the InOut pattern (and alike), the consumer keeps a reference to the exchange (either in its code, or in the CDK code handling async messaging)
4) 2+3 means there will be 2 instances of a message at a given time for the InOut pattern (and alike).
5) In case of the InOnly pattern (and alike), the consumer does not (may not? should not?) keep an instance of the exchange
5b) 5+2 is a good thing!
6) Exchanges can be persisted to disk to handle crashes
7) For 6, the persistence mechanism does not have to remove instances from memory
8) 7+6+2 will thus be OK, the same instance will be used; so will 7+6+5, by the way
9) With 2, it is easy to solve PETALSESBCONT-330

With a pluggable implementation of the DeliveryChannel (as explained in a previous comment), the implementation of the exchange must be part of the implementation of the DeliveryChannel (because the way exchanges are implemented impacts how they are persisted to disk). It also makes sense because, in the JBI specification, the DeliveryChannel is the one responsible for creating exchanges.

Three possible implementations of the DeliveryChannel (and thus of the Exchange) are listed below; a sketch of the pluggable pairing follows the list:

  • full memory (as currently): the instance of the exchange is shared by everyone
  • persistence for memory efficiency: only really makes sense when a majority of exchanges are InOnly; components must also be well implemented so as not to keep instances in memory uselessly, and the async manager in the CDK must be rewritten
  • persistence for safety: the persistence is done alongside an in-memory queue

Many open questions, then! And I guess there are more…
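A minimal sketch of what the pluggable pairing could look like (a hypothetical SPI under assumed names; only the javax.jbi types come from the JBI specification):

    import java.util.Properties;

    import javax.jbi.messaging.DeliveryChannel;

    /**
     * Hypothetical SPI: each DeliveryChannel implementation also owns its
     * exchange implementation, since the way exchanges are implemented
     * determines how (and whether) they can be persisted.
     */
    public interface DeliveryChannelProvider {

        /** Unique identifier of this implementation (e.g. its simple class name). */
        String id();

        /** Creates the channel; the channel creates its own exchange instances. */
        DeliveryChannel newChannel(Properties config);

        /** True if exchanges created by this channel survive a JVM crash. */
        boolean isPersistent();
    }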

Christophe DENEUX added a comment - Mon, 22 Jun 2015 - 12:01:39 +0200 - edited

I changed the summary to avoid confusion with the persistence of messages between consumers and producers.

Moreover, it could be interesting to be able to configure this queue implementation. That way, we could adapt the queue implementation to the use case:

  • if a few messages may be lost, we choose a faster but less safe implementation,
  • if no message may be lost, even during disk writes, we choose the most secure but slower implementation.

The queue implementation used is the one available as a Java SPI in the Petals container classloader (see the lookup sketch below). If more than one exists, the name of the queue implementation to use must be set at the local container configuration level (i.e. server.properties) through the parameter "petals.router.delivery-channel.queue.implementation".
The parameters of the queue implementation (such as the threshold, the number of messages, ...) will be set in the local container configuration (i.e. server.properties). Their name template is "petals.router.delivery-channel.queue.implementation.<impl-id>.<param-name>", where:

  • impl-id is the unique identifier of the queue implementation, for example the simple name of the implementation class (without package),
  • param-name is a parameter name of the implementation.

As a first step, we will provide only one queue implementation, the default one. This implementation remains to be defined. Perhaps an implementation that stores messages in memory until a threshold (a number of messages) is reached, beyond which messages are flushed to disk.
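A sketch of the SPI lookup described above (QueueImplementation is a hypothetical interface name; ServiceLoader and Properties are standard JDK):

    import java.util.Properties;
    import java.util.ServiceLoader;

    /** Hypothetical SPI contract for pluggable queue implementations. */
    interface QueueImplementation {
        /** Unique identifier, e.g. the simple class name. */
        String id();
    }

    final class QueueImplementationLoader {

        private static final String PARAM = "petals.router.delivery-channel.queue.implementation";

        /** Resolves the queue implementation, using server.properties to disambiguate. */
        static QueueImplementation load(final Properties serverProperties) {
            final String wanted = serverProperties.getProperty(PARAM);
            QueueImplementation found = null;
            for (final QueueImplementation impl : ServiceLoader.load(QueueImplementation.class)) {
                if (wanted != null) {
                    if (impl.id().equals(wanted)) {
                        return impl;              // explicitly selected implementation
                    }
                } else if (found == null) {
                    found = impl;                 // only one implementation: use it
                } else {
                    throw new IllegalStateException(
                            "Several queue implementations found, set " + PARAM);
                }
            }
            if (found == null) {
                throw new IllegalStateException("No matching queue implementation found");
            }
            return found;
        }
    }

A parameter of the selected implementation would then be read as serverProperties.getProperty(PARAM + "." + impl.id() + "." + paramName).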

Christophe DENEUX added a comment - Mon, 22 Jun 2015 - 10:45:41 +0200 - edited

In my mind, "simpler is better". So, even if disk is cheap, what is the point of using a persistence mechanism only to return an error once no more disk space is available? It requires writing source code for both "persistence" and "error handling". Moreover, how should this persistence be sized? It is simpler and more user-friendly to have only the error management here.

If you limit the number of messages waiting to be processed in the DeliveryChannel, the DeliveryChannel will not grow indefinitely. And I think I understand what you mean by "persistence": if the DeliveryChannel uses a queue to store messages, as today, but a queue using both memory and disk instead of a simple in-memory queue, it becomes possible to handle an "unlimited" number of messages. Is that what you call "persistence"?

About "keeping messages in error somewhere", a dump of the messages already exists (it should be completed) at the MONIT log level. It was designed for message replay. So be careful not to introduce duplication.

Victor NOËL added a comment - Mon, 22 Jun 2015 - 10:06:01 +0200

The first point is covered by PETALSCDK-135 and PETALSCDK-90.

The second one is meant to ensure the DeliveryChannel is never full.
Frankly, disk is cheap; you have to dimension your system correctly if such a situation can arise.

It is still a valid point and it is not covered by this issue; we should create another one or complete this one, but I don't know yet what to do: I mean, there is always a point where nothing can be done anymore.
Returning an error is one solution, yes, but if you have neither disk nor memory left (because if things are persisted to disk, it is because there is no more memory), how do you expect your JVM to actually answer an exchange? A solution would be to find a way to always keep some memory available (i.e., the DeliveryChannel persistence fails earlier than when there is no more memory at all).

Nevertheless, the case of the remote container is interesting, and we haven't thought about it yet either. I guess the message should be returned in error to the sending container.

Finally, when talking about persistence, I'm at the container level, so I don't really care which MEP is used (InOnly in your example); it is not an optional (i.e. per-exchange) guarantee, but a per-container guarantee: if you disable it, it is at the container level, so it is not reserved to InOnly or whatever.

I personally think it is possible to achieve performance as good as currently (and maybe better…) WHILE having persistence, so this should come by default.

Also, the main idea, maybe not explicitly expressed here, is that no message should be lost if we can avoid it without a loss of performance.
And the next step is to be able to keep messages in error somewhere and give the administrator the possibility to replay them.
It is related to PETALSCDK-136 (which you just commented on) and PETALSDISTRIB-144.

Christophe DENEUX added a comment - Mon, 22 Jun 2015 - 09:50:28 +0200

Victor, I think you are mixing two different problems:

  1. What to do when a component is not able to process a message that it picks from the DeliveryChannel because no more resources are available?
  2. What to do when the DeliveryChannel is full?

For the 1st problem, the component must return an error according to the MEP of the incoming message. A persistence mechanism is not sufficient; it only postpones the problem: what to do when no more disk space is available?

For the 2nd point, I also think that an error must be returned, according to the MEP, when the message is put into the DeliveryChannel. Caution when receiving a message from a remote container: we can get such an error when the receiving transporter tries to put the incoming message into its local DeliveryChannel (see the sketch below).

In my mind, the persistence mechanism must be reserved for ensuring a transport guarantee that can be used by consumers of InOnly services if they need such a feature.
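Returning such an error with the standard JBI API could look roughly like this (a sketch; the javax.jbi types are standard JBI, while the helper itself and the decision to reject are hypothetical):

    import javax.jbi.messaging.DeliveryChannel;
    import javax.jbi.messaging.ExchangeStatus;
    import javax.jbi.messaging.MessageExchange;
    import javax.jbi.messaging.MessagingException;

    public final class RejectionHelper {

        /**
         * Rejects an exchange that cannot be accepted because the channel is
         * full; the ERROR status flows back to the consumer, which reacts
         * according to the MEP of the exchange.
         */
        public static void rejectWhenFull(final DeliveryChannel channel,
                final MessageExchange exchange) throws MessagingException {
            exchange.setStatus(ExchangeStatus.ERROR);
            exchange.setError(new Exception("DeliveryChannel is full, message rejected"));
            channel.send(exchange);
        }
    }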

Victor NOËL added a comment - Fri, 19 Jun 2015 - 17:25:23 +0200 - edited

One possible library we could use for persistence is Chronicle Queue.

See https://groups.google.com/forum/#!topic/java-chronicle/4-cOUnWQKnE for a discussion with the author to see if it can fit our needs.

It's quite low-level, but then it makes sense for it to be low-level, since we are implementing an ESB where this kind of feature is critical.

The main open questions with it are:

  • Messages are stored on disk indefinitely (the queue is read through an index): some strategy must exist to remove old files; a rotation strategy already exists, so it is easy to identify the files to remove.
  • It means serializing every message that has to be transferred, but that's the cost of robustness and of not losing messages (and this question is related, in a way, to PETALSESBCONT-327, in particular when messages and attachments are cached!).
  • When recovering from a crash, the index has to be found again (but that can be done quite quickly and only once, or the index could also have been stored on disk); see the sketch below.

Also, this kind of behaviour opens interesting questions about replay (since we store all messages, both before they are processed AND after they are sent back, in two different queues!).
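For illustration, writing and reading with Chronicle Queue looks roughly like this (a sketch against a recent Chronicle Queue API, which has evolved since this comment was written; persisting the tailer index elsewhere to survive a crash is the hypothetical part):

    import net.openhft.chronicle.queue.ChronicleQueue;
    import net.openhft.chronicle.queue.ExcerptAppender;
    import net.openhft.chronicle.queue.ExcerptTailer;

    public final class ChronicleSketch {

        public static void main(final String[] args) {
            try (ChronicleQueue queue = ChronicleQueue.singleBuilder("delivery-channel").build()) {
                // Messages are appended to memory-mapped, rolled files.
                final ExcerptAppender appender = queue.acquireAppender();
                appender.writeText("serialized exchange");

                // Reading moves an index forward through the stored excerpts.
                final ExcerptTailer tailer = queue.createTailer();
                final String msg = tailer.readText();

                // Persisting this index (e.g. to a small file) would let a
                // restarted JVM resume reading where it left off:
                final long index = tailer.index();
                tailer.moveToIndex(index);
            }
        }
    }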


Dates

  • Created: Fri, 19 Jun 2015 - 16:43:26 +0200
  • Updated: Thu, 13 Apr 2023 - 17:03:34 +0200