Message Engine doesn't start after setting up cluster

This issue has hit me a few times, and each time it took me a while to figure out what happened, so I thought it would be a good idea to write it down.


Symptom:

When setting up a clustered environment for Maximo, I need to set up an integration bus with a message engine for each cluster (MIF, UI, Cron, etc.).


Each message engine requires its own schema (and thus its own database user if Oracle is used).


After the integration buses are set up and the Maximo clusters are started, we see a lot of errors in the log files, usually in the Cron or MIF cluster, because the message engine is not available.


While the cluster is restarting, the message engine for that cluster shows a "partial started" status. A few minutes after the whole cluster has started, the message engine shows an "unavailable" status.

 

Troubleshoot:

  • Check the FFDC logs under [Websphere_Home]\AppServer\profiles\ctgAppSrv01\logs\ffdc\. Look in [MXServer_Name]_exception.log for any exceptions related to the integration bus or message engine, such as:

com.ibm.ws.sib.msgstore.persistence.DatasourceWrapperStoppedException com.ibm.ws.sib.msgstore.persistence.impl.PersistentMessageStoreImpl.start 1:206:1.47.1.53 D:\IBM\WebSphere\AppServer\profiles\ctgAppSrv01\logs\ffdc\MAXUI-N1-4_58bec328_22.10.24_15.50.24.7956943157914483024680.txt

com.ibm.ws.sib.msgstore.PersistenceException com.ibm.ws.sib.msgstore.impl.MessageStoreImpl.start 755 D:\IBM\WebSphere\AppServer\profiles\ctgAppSrv01\logs\ffdc\MAXUI-N1-4_58bec328_22.10.24_15.50.24.8271423565797649410081.txt


  • For each of the exceptions above, open the file it references. It may contain a more detailed error message that helps to solve the issue.


  • In this case, all I got was the vague error message below:

CWSIS1501E: The data source has produced an unexpected exception: com.ibm.ws.sib.msgstore.persistence.DatasourceWrapperStoppedException: New connections cannot be provided because the persistence layer has been stopped
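The log check above can also be scripted. Below is a minimal sketch, assuming each summary line looks like the samples above (exception class first, FFDC detail file ending in .txt last, no spaces in the path). The function name find_msgstore_exceptions is made up for illustration and is not part of any IBM tooling.

```python
import re

# Assumed line shape (taken from the samples above):
#   <exception class> <source> <probe id> <path to FFDC detail file>.txt
# Group 1 captures the exception class, group 2 the detail file path.
SIB_LINE = re.compile(r"^\s*(com\.ibm\.ws\.sib\.msgstore\.\S+)\s.*?(\S+\.txt)\s*$")

def find_msgstore_exceptions(summary_text):
    """Return (exception_class, detail_file) pairs for message store exceptions."""
    hits = []
    for line in summary_text.splitlines():
        match = SIB_LINE.match(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits
```

Read the contents of [MXServer_Name]_exception.log into a string and pass it to the function; each returned pair names the exception and the detail file to open next.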


Solution:

  • In this specific case, the cause was a failed earlier setup attempt, which left behind a set of tables created under a different message engine's details. The current message engine refuses to reuse them. To solve this, we can simply change the schema name of the message store so that the next time the cluster starts, it creates a whole new set of tables for its own use.

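  • If an Oracle database is used, the schema is tied to the DB account used for login and is harder to change. So instead, I just drop all the tables in that schema. One quick way to identify those tables is to log in as the message engine's DB user and run the query below. It does not drop anything itself; it prints one drop statement per table, which we can then execute: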

select 'drop table ' || table_name || ';' from user_tables;
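If the schema might hold anything besides the message store, the generated statements can be filtered first. Here is a rough sketch of that idea, assuming the message store tables carry the usual SIB prefix (for example SIBOWNER and SIB000); check the actual table names first. The helper build_drop_statements is hypothetical and only builds the statement text, it does not connect to the database.

```python
def build_drop_statements(table_names, prefix="SIB"):
    """Build one Oracle DROP TABLE statement per table whose name starts with prefix."""
    statements = []
    for name in sorted(table_names):
        if not name.upper().startswith(prefix):
            continue  # leave tables that do not look like message store tables alone
        # Double quotes keep the name exactly as listed in user_tables.
        # CASCADE CONSTRAINTS and PURGE are standard Oracle DROP TABLE options;
        # PURGE skips the recycle bin.
        statements.append(f'DROP TABLE "{name}" CASCADE CONSTRAINTS PURGE;')
    return statements
```

Feed it the table names returned by "select table_name from user_tables;" and review the output before running it.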

