Sunday, January 25, 2009

SAP XI component monitoring (XI 3.0)

After a short period of assisting in the troubleshooting of a problematic SAP XI 3.0 installation, I propose a set of monitoring transactions that can (maybe must) be used on a daily basis by XI/Basis people to make sure the XI server is running well. First, some low-level prerequisites:

  1. If you are stupid enough to set up the SAP/XI server on a Windows OS (particularly 32-bit), or are responsible for one, make sure that the kernel has enough resources to handle the zillion connections required for XI operation. You NEED to keep the Memory / Free System Page Table Entries (PTEs) counter at 20,000 or more. You can monitor it with "perfmon.msc". If this number drops well below 20k, you will experience severe problems: dropped connections, internal time-outs and plenty of HTTP errors (401, 503, etc). In any case, check the boot.ini switches /USERVA, /3GB and /PAE and set them according to Microsoft's guidelines.
  2. Bear in mind that XI requires plenty of power and resources. It starts several virtual machine instances (both ABAP and Java/J2EE) and handles heavy network load. These need plenty of memory and a good OS kernel, and should be taken into account when setting up an XI server.
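
For those who prefer a scheduled check over watching perfmon by hand, here is a minimal sketch of the PTE check. It assumes the counter is read with the Windows "typeperf" command and that its output is the usual CSV (a header row, then timestamp and value). The function names and the sample layout are my own illustration, not something XI provides.

```python
import csv
import io
import subprocess

PTE_THRESHOLD = 20_000  # minimum number of free PTEs suggested above
COUNTER = r"\Memory\Free System Page Table Entries"


def parse_free_ptes(typeperf_output: str) -> int:
    """Return the first numeric sample found in typeperf-style CSV text."""
    for row in csv.reader(io.StringIO(typeperf_output)):
        if len(row) < 2:
            continue  # blank line or single-column status text
        try:
            return int(float(row[1]))
        except ValueError:
            continue  # header row or other non-numeric text
    raise ValueError("no numeric sample found in typeperf output")


def pte_status(free_ptes: int, threshold: int = PTE_THRESHOLD) -> str:
    """Classify a sample as 'OK' or 'LOW' against the threshold."""
    return "OK" if free_ptes >= threshold else "LOW"


def read_free_ptes_windows() -> int:
    """Take one sample on a Windows host (assumes typeperf is on the PATH)."""
    result = subprocess.run(
        ["typeperf", COUNTER, "-sc", "1"],
        capture_output=True, text=True, check=True,
    )
    return parse_free_ptes(result.stdout)
```

On the XI host itself, pte_status(read_free_ptes_windows()) gives a one-word answer that can be fed to whatever alerting you already have.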
The XI Directory and XI Repository have nothing to do with XI monitoring. Everything needs to be done from the R/3 system itself, and/or the XI Runtime Workbench (RWB). First, the XI-related SAP transactions, which help you find out if everything is OK:

  • XI Administration: SXMB_ADM. Manages every aspect of XI operation from the R/3 end, including queues, archiving and deletion, as well as various system-wide definitions.
  • ICM Monitor: SMICM. Supreme overlord for Java maintenance from the R/3 end. Permits J2EE restarts (Check "Administration" menu) and offers useful Java monitor stats.
  • Queue monitors: SMQ2, SMQ1, SMQ3. If any message is stuck on the way in or out of the XI switch, you'll find it here. Watch SMQ2 in particular. If you are unlucky enough to find that a certain queue is stuck (the top message failed to get processed), select the message and press SHIFT+F6 (Save LUW). This moves it to the end of another queue, permitting processing of the next messages. Note: XBTI* are queues for incoming messages, XBTO* are queues for outgoing messages.
  • Message monitor: SXMB_MONI. Find recent messages with errors (inbound or outbound) and generally anything that goes through XI. Perhaps the most useful transaction, but with rather limited search criteria options :-(
  • Business Process Engine monitors: SXMB_MONI_BPE. A useful list of monitoring tasks for the status of the XI Processing Engine.
  • BPE Errors: SWF_XI_SWI2_DIAG. A sub-section of BPE monitors for latest errors.
  • Outgoing RFC monitor: SM58. Find out if another R/3 system has problems processing the messages that XI sent.
  • Local RFC Monitor: SWF_XI_SWU2. Monitor the XI ABAP processing status.
  • Hardcore message monitoring/retrieval: SE16. For hardcore problems, you can always resort to the tables that XI uses to store (persist) transferred messages. Look into tables: SXMSPMAST (master table), SXMSPERROR/SXMSPERRO2 (error message info), SXMSPVERS, and some others. NOTE: XI_AF_* tables, which exist in the SAP XI DB (part of the J2EE adapter framework), are NOT managed by R/3 and do NOT exist in the ABAP workbench.
  • IDoc monitoring: IDX5. Useful for finding the status of certain R/3 objects (materials, sales/purchase orders, etc) that seem to have been lost on their way through XI. You need the IDoc number, the Message GUID, or a combination of date/time and IDoc type. Extremely useful when checking with other modules/systems and SAP users.
  • TCP/IP listeners: SMMS. See where the XI engine listens for incoming client connections at the TCP/IP level. Quite useful in conjunction with netstat, telnet, wireshark/ethereal and other networking tools.
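
Once you know the ports the XI engine listens on, a quick reachability test from another machine does the same job as a manual telnet. The sketch below uses only the Python standard library. The function name is mine, and any host and port you pass in are placeholders to be replaced with what your own system reports.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the full TCP handshake, like telnet does
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, timed out, unreachable, or name resolution failed
        return False
```

A False result only tells you that the connection could not be made from where the script runs; a firewall in between gives the same answer as a dead listener, so compare with netstat on the server itself.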
Another set of important monitors is within the Runtime Workbench (RWB), as a part of the web-based XI components. With RWB, you can:

  • Monitor Communication Channel status: Check channels for hidden errors (e.g. "why does my file not get parsed by XI?"); start/stop the channel service (a common fix on old XI systems); monitor produced/consumed messages.
    Access from: Component Monitoring / Adapter Engine / Communication Channel Monitoring.
    URL: http://<server_ip>:<adm_port>/rwb/index.jsp
  • Monitor Messages: Just like SXMB_MONI, but more reliable in some cases.
    Access from: Message Monitoring.
  • Check user locks: Remove stale locks from missed connections, etc. in Repository/Directory.
    Access From: XI Tools / Administration / Repository / Lock Overview
    URL: http://<server_ip>:<adm_port>/rep/support/public/LockAdminService
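
If you keep bookmarks or scripts for more than one XI box, the two URLs above can be generated from the server address and admin port. This is a small sketch and the function names are my own. The paths are exactly the ones listed above, and the example host and port are made up.

```python
def rwb_url(server: str, adm_port: int) -> str:
    """URL of the Runtime Workbench start page."""
    return f"http://{server}:{adm_port}/rwb/index.jsp"


def lock_admin_url(server: str, adm_port: int) -> str:
    """URL of the Repository lock overview."""
    return f"http://{server}:{adm_port}/rep/support/public/LockAdminService"


if __name__ == "__main__":
    # hypothetical host and port, replace with your own
    print(rwb_url("xi-host.example", 50000))
    print(lock_admin_url("xi-host.example", 50000))
```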

All XI messages follow the SOAP/XML scheme. When a specific problem is found with a message, it is always good practice to look into its Trace section as well as the Error section. Most likely, the trace contains a Java stack trace, of which the top entry is usually the one of interest.
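
Picking the top entry out of a long trace can be scripted. The sketch below assumes you have copied the Trace section into a plain string and that it contains ordinary Java stack frames, one per line, each starting with "at". The function name and the sample trace are made up for illustration.

```python
import re
from typing import Optional

# One Java stack frame per line: "at package.Class.method(File.java:123)"
_FRAME = re.compile(r"^\s*at\s+([\w.$<>]+)\(([^)]*)\)", re.MULTILINE)


def top_stack_frame(trace_text: str) -> Optional[str]:
    """Return the first (top) stack frame in the text, or None if there is none."""
    match = _FRAME.search(trace_text)
    if match is None:
        return None
    return f"{match.group(1)}({match.group(2)})"


if __name__ == "__main__":
    # hypothetical trace, not taken from a real XI system
    sample = (
        "com.example.MappingException: field missing\n"
        "\tat com.example.Mapper.map(Mapper.java:42)\n"
        "\tat com.example.Runner.run(Runner.java:10)\n"
    )
    print(top_stack_frame(sample))
```

If the trace is still wrapped in XML, entities and tags may have to be stripped before the frames can be matched.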