I had to deal with this quite often. Most of the times, I got it right and able to identify the problem quickly. In a few cases, it took some time, and usually very stressful as it mostly occurs in Production. (It occurs in DEV and PRE-PROD all the time, it’s just that people usually don’t care, and it goes unnoticed)
Today I had to deal with it again and it took me some time. The cause was something I dealt with before, was told by a colleague on how to fix it (the easy way), but I forgot. This time around, under panic mode, I restarted a few JVMs before I remembered I should ask around and was reminded by my colleague again that it could be fixed with much less damage. I told myself I should write it down for the next time, so here is the sum of what I learned:
- Publish channel in Maximo uses Sequential Queue by default. When a message failed, it will stop processing other messages.
- I saw configured in some systems, the behaviour (Sequential processing) can be disabled by simply change the error output of SQOUTBD bus destination from “none” to “system” or an error queue.
- Check Message Tracking to see if there are messages stuck in “RECEIVED” status, if there are many of those messages. it means JMSConsumer doesn’t run, or it does run but one message failed (has ERROR status in Message Reprocessing), and thus everything else got stuck. If there is no message in “RECEIVED” status, it is either because Publish Channel is not enabled, or because Message Tracking is not enabled.
- For Publish Channel to publish messages, we need both the External System and the Publish Channel to be enabled. Event Listener on the Publish Channel should be enabled (unless it is triggered by something else like an Autoscript)
- If Message Tracking is not enabled for the Publish Channel, we should enable it (now), unless the interface is extremely unimportant, and no one cares.
- If there are a ton of “RECEIVED” messages in Message Tracking, it’s likely due to two reasons noted below. Messages that get published successfully have “PROCESSED” status.
- If there’s an error in Message Reprocessing that blocks the queue, try to re-process, or delete it to clear the blockage.
- If there’s no blockage in Message Reprocessing, it’s likely due to JmsConsumer cron task doesn’t run. Try re-activate/re-load the cron instance. Make sure to enable “Keep History?” flag. After re-activating and reloading the cron instance. If it shows “START”, but doesn’t show “ACTION”, it means the Cron instance doesn’t run. It’s likely there’s a hang scheduler task. It can be resolved by restarting the concerning JVM/server. This is the bad approach used today. The easy way is to query and delete the task entry in “TASKSCHEDULER” table. Don’t worry, once deleted, the cron task instance will create a new task entry on the next run.
- For blockage on sequential queue, on non-prod environment, we can see queue depth and clear all of those messages in the queue to clear blockage using two methods below:
- In Maximo, go to External Systems > Action Add/Modify Queues > Select “sqout” > choose View (or Delete Queue Data)
- In Websphere, go to Service Integration > Buses > Destinations > SQOUTBD > Queue Points. It will show Queue Depth which is the number of messages in the queue. Click on the link to open > Runtime tab > Messages > Delete or Delete All
No comments:
Post a Comment