SAP Knowledge Base Article - Preview

2973429 - Business Processes Stuck After System Update - Automatically Resume After X Hours

Symptom

Several conditions must be met for this issue to occur, and several checks can confirm that you are facing it:
 
  • As soon as a business process is triggered (e.g. after placing an order, triggering the forgotten password process, etc.), it goes into the Running state and remains stuck without progressing.



  • The runningOnClusterNode value for the Task (can be checked from backoffice or the database) will be -1.
     


  • No TaskExecutor threads will be running in the thread dumps.



  • After a fixed number of hours (more exactly, the difference between server time and application time), the business processes start on their own.



  • During 'ant updatesystem', the Java process that is spun up will not have a user.timezone parameter set, which means that the server timezone will be considered by default, e.g.

    /Library/Java/JavaVirtualMachines/sapmachine-jdk-11.0.5.jdk/Contents/Home/bin/java -Xmx2g -Djava.locale.providers=COMPAT,CLDR --add-exports=java.base/jdk.internal.ref=ALL-UNNAMED --add-exports=java.naming/com.sun.jndi.ldap=ALL-UNNAMED --add-exports=jdk.management.agent/jdk.internal.agent=ALL-UNNAMED -
    ...
    ..
    de.hybris.bootstrap.loader.Loader -deployname client -platformhome /Users/i852913/hybris/2005.3/hybris/bin/platform -cp . -loadhybris true -systeminit false -file /Users/i852913/hybris/2005.3/hybris/temp/hybris/client/yrunexec.bsh



  • Commerce version is at least 6.7.



  • The BufferedAuxTablesTasksProvider is used, rather than the DefaultTasksProvider.

    A good way to know which provider is used is to search the startup logs for something like:
     
    INFO   | jvm 1    | main    | 2020/09/18 23:00:07.932 | INFO  [Task-master-poll] [ConfigurableTasksProvider] no tasks provider defined - default tasks provider (de.hybris.platform.task.impl.BufferedAuxTablesTasksProvider) will be used
     
     
    If the DefaultTasksProvider is used, you will see instead:
     
    INFO   | jvm 1    | main    | 2020/08/12 21:29:00.021 | INFO  [PooledThread[2]] [ConfigurableTasksProvider] no tasks provider defined - default tasks provider (de.hybris.platform.task.impl.DefaultTasksProvider) will be used
     
     
    The quickest way to rule out this issue is to run the query below. If it returns an error because the table does not exist, this issue does not apply:
    select * from tasks_aux_scheduler
     
  • The application timezone is behind the server timezone. For example, the server time is UTC while the application time is America/Chicago (UTC -5):

    tomcat.generaloptions=${tomcat.jdkmodules.config} -Djava.locale.providers=COMPAT,CLDR -Xmx2G -ea -Dcatalina.base=%CATALINA_BASE% -Dcatalina.home=%CATALINA_HOME% -Dfile.encoding=UTF-8 -Djava.util.logging.config.file=jdk_logging.properties -Djava.io.tmpdir="${HYBRIS_TEMP_DIR}" -Duser.timezone=America/Chicago



  • The timezone for 'ant' tasks is not specified, or does not match the application timezone. It is determined by the standalone.javaoptions property. A configuration that fits the symptoms is one where -Duser.timezone is not specified, in which case the server timezone is used by default, e.g.

    standalone.javaoptions=-Xmx2g -Djava.locale.providers=COMPAT,CLDR



  • Enabling DEBUG level logging on the de.hybris.platform.task.impl package will show the following statement, with a negative duration:

    DEBUG [Task-master-poll] [AuxiliaryTablesSchedulerRole] 0: got scheduler timestamp 2020-09-19T03:25:45.890Z, now is 2020-09-19T02:58:57.545312Z, duration is -00:26:48.344



  • Manually running the query below in hac (with commit mode enabled and last_activity_ts set to any time in the past) will unblock the business processes within the next 10 seconds:
    update tasks_aux_scheduler set last_activity_ts='2000-01-01 00:00:00.000' where id='scheduler' and version='3'
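
The DEBUG statement and the timezone difference described in the list above can be checked with a short script. The following is a minimal Python sketch and not part of SAP Commerce: the function names remaining_wait and expected_stall are hypothetical, the log format is assumed to match the sample line in this article, and timezones are passed as fixed UTC offsets in hours.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed pattern, derived only from the sample DEBUG line in this article.
_LINE = re.compile(r"got scheduler timestamp ([^,\s]+), now is ([^,\s]+),")


def _parse_utc(stamp: str) -> datetime:
    """Parse an ISO timestamp ending in 'Z' (1 to 6 fractional digits)."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def remaining_wait(log_line: str) -> timedelta:
    """Return how far the scheduler timestamp lies ahead of 'now'.

    A positive result corresponds to the negative duration in the log.
    """
    match = _LINE.search(log_line)
    if match is None:
        raise ValueError("not a scheduler timestamp log line")
    scheduler_ts, now = (_parse_utc(group) for group in match.groups())
    return scheduler_ts - now


def expected_stall(server_utc_offset_hours: float,
                   app_utc_offset_hours: float) -> timedelta:
    """Difference between server time and application time (hypothetical helper)."""
    return timedelta(hours=server_utc_offset_hours - app_utc_offset_hours)


sample = ("DEBUG [Task-master-poll] [AuxiliaryTablesSchedulerRole] 0: got scheduler "
          "timestamp 2020-09-19T03:25:45.890Z, now is 2020-09-19T02:58:57.545312Z, "
          "duration is -00:26:48.344")
print(remaining_wait(sample))  # 0:26:48.344688
print(expected_stall(0, -5))   # 5:00:00 (server on UTC, application on UTC -5)
```

A positive remaining_wait means the scheduler timestamp is still in the future, which matches the negative duration shown in the DEBUG statement.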



Environment

This article applies only to solutions hosted on premise or on CCv1. Additionally, the Commerce patch version must be earlier than the corresponding version below:

  • 1808.33
  • 1811.31
  • 1905.23
  • 2005.7
  • 2011.2

For patch versions equal to or greater than the ones above, the issue is fixed without the need to apply the resolution from this article. See: ECP-5401
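
The version rule above can be expressed as a small check. This is a minimal Python sketch, assuming the version is given as <release>.<patch> (for example 2005.3); the names FIXED_PATCHES and needs_resolution are hypothetical, and release lines not listed in this article yield None rather than a guess.

```python
from typing import Optional

# First patch of each release line that contains the fix, as listed in this article.
FIXED_PATCHES = {"1808": 33, "1811": 31, "1905": 23, "2005": 7, "2011": 2}


def needs_resolution(version: str) -> Optional[bool]:
    """Return True if the patch is earlier than the fixed one for its release line.

    Returns None when the release line is not listed or the patch is not numeric.
    """
    release, _, patch = version.partition(".")
    if release not in FIXED_PATCHES or not patch.isdigit():
        return None
    return int(patch) < FIXED_PATCHES[release]


print(needs_resolution("2005.3"))  # True
print(needs_resolution("2005.9"))  # False
print(needs_resolution("2105.1"))  # None
```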

For CCv2 (SAP Commerce in the Public Cloud), a similar situation can occur only if:

  1. The application timezone is ahead of GMT/UTC (e.g. GMT +08:00 Australia/West).
  2. The initial Commerce version used is earlier than one of the patches listed above (e.g. 2005.4).
  3. The Commerce version is upgraded to a patch version equal to or greater than the ones listed above (e.g. 2005.9), and the system is updated in rolling mode.

In this scenario, manually setting last_activity_ts to a time in the past will resolve the issue (as in the last point of the Symptom section above). After applying the workaround once, the issue should not recur. See: ECP-6067

Product

SAP Commerce 1811 ; SAP Commerce 1905 ; SAP Commerce 2005 ; SAP Commerce 2011 ; SAP Hybris Commerce 1808

Keywords

  • taskengine
  • businessprocess
  • waiting
  • orders stuck
  • not progressing
  • time zone
  • deploy
KBA, CEC-COM-ADM-BO, Backoffice, CEC-HCS-CCAZ-CZO, Customer Zone on Azure, Problem
