Sunday, May 15, 2016

TBSM 6.1.1 Fix Pack 4 - post-installation manual steps

New TBSM 6.1.1 Fix pack 4 was released by end of April as I informed you in my previous post.
You can find a post about it also on the IBM developerWorks site:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Business%20Service%20Manager1/page/Advanced%20Topics

Except obviously fixes the fix pack contains few new functionalities and some of them require a manual installation action from administrators. Are they worthwhile? Let me check with you.

1. IV65545 - TBSMEVENTREADER STOPS BECAUSE EVENT PROCESSOR BLOCKING ON RADEVENTSTORE

If you want to understand the context of the issue behind this fix (APAR), see here:
http://www-01.ibm.com/support/docview.wss?uid=swg1IV65545

If you let me quote:
"TBSM stops processing events because it is blocking on the TBSM
database radevent table.  Eventually all connections to the
database are used up and the processing stops.

This causes the event reader for TBSM stops reasding events
because the queue to process them is full - this is usually the
first symptom." 

So we have a typical performance issue here, if there's too many events to process, because you have too many rules or service instances defined in TBSM, the data amount that needs to be stored in TBSM database, in table RADEVENTSTORE (typically a mapping between events and instances) will overwhelm the TBSM database if it was left as configured by default. A new, custom configuration is recommended to do instead.
Since the Fix pack Readme available here:
http://www-01.ibm.com/support/docview.wss?uid=swg24041505

misses the Microsoft Windows instructions, just in case, here's my recipe:
a) click Menu Start-> IBM DB2 -> DB2COPY1 (Default) (or whatever is your DB2 copy name)
b) choose Command Windows - Administrator
c) In the command line tool go to (assuming your TBSM_HOME is in Program Files):

cd "\Program Files\IBM\tivoli\tipv2\profiles\TIP
Profile\installedApps\TIPCell\isc.ear\sla.war\install"\ 


C:\Program Files\IBM\tivoli\tipv2\profiles\TIPProfile\installedApps\TIPCell\isc.
ear\sla.war\install>db2 connect to tbsm

   Database Connection Information

 Database server        = DB2/NT64 10.1.0
 SQL authorization ID   = MPALUCH
 Local database alias   = TBSM


C:\Program Files\IBM\tivoli\tipv2\profiles\TIPProfile\installedApps\TIPCell\isc.
ear\sla.war\install>type AddRADEventStoreIndex.sql | db2
(c) Copyright IBM Corporation 1993,2007
Command Line Processor for DB2 Client 10.1.0

You can issue database manager commands and SQL statements from the command
prompt. For example:
    db2 => connect to sample
    db2 => bind sample.bnd

For general help, type: ?.
For command help, type: ? command, where command can be
the first few keywords of a database manager command. For example:
 ? CATALOG DATABASE for help on the CATALOG DATABASE command
 ? CATALOG          for help on all of the CATALOG commands.

To exit db2 interactive mode, type QUIT at the command prompt. Outside
interactive mode, all commands must be prefixed with 'db2'.
To list the current command option settings, type LIST COMMAND OPTIONS.

For more detailed help, refer to the Online Reference Manual.

db2 => DB20000I  The SQL command completed successfully.
db2 => db2 => SQL2314W  Some statistics are in an inconsistent state. The newly
collected
"INDEX" statistics are inconsistent with the existing "TABLE" statistics.
SQLSTATE=01650
db2 => db2 => DB20000I  The RUNSTATS command completed successfully.
db2 =>
C:\Program Files\IBM\tivoli\tipv2\profiles\TIPProfile\installedApps\TIPCell\isc.
ear\sla.war\install>




Next, open your TIP with TBSM and Impact UI, go to folders System Configuration->Event Automation and select Data Model page. Next change project to TBSM_BASE and find TBSMDatabase data source.

And the last step is the lock timeout set to 30, re-use your DB2 session, if you lost it connect to TBSM database first and then set the lock timeout:


db2 => connect to tbsm

 Database Connection Information

 Database server        = DB2/NT64 10.1.0
 SQL authorization ID   = MPALUCH
 Local database alias   = TBSM


db2 => set current lock timeout 30
DB20000I  The SQL command completed successfully.
db2 =>



2. Adding the new "Delete Service Instance" menu action.

Another manual action you want to take is adding this new action item to your right-click menu in your TIP-based ServiceInstance tree template. It is quite harsh to remove your selected instance especially if you have thousands of instances - going to that big container-like page with all instances and check boxes to select those which you want to install - is a hassle. Let's see if setting up the new menu item is fine and easy.
So basically this is what you want in your ServiceInstance.xml at the end:

<?xml version="1.0" encoding="UTF-8"?><treeTemplate name="ServiceInstance">
<columnList>
<column displayName="State"/>
<column displayName="Time"/>
<column displayName="Events"/>
</columnList>
<templateTreeMapping primaryTemplateName="DefaultTag">
<treeColumn displayName="State" attribute="serviceStatusImage"/>
<treeColumn displayName="Time" attribute="slaStatusImage"/>
<treeColumn displayName="Events" attribute="rawEventsImage"/>
<actionMapping clickType="~popupMenu" actionName="ShowServiceInstanceEditor"/>
<actionMapping clickType="~popupMenu" actionName = "DeleteServiceInstance"/>
<actionMapping clickType="~popupMenu" actionName="InstantiateOneHopServiceMap"/>
<actionMapping clickType="~popupMenu" actionName="ChooseChildrenTool"/>
<actionMapping clickType="~popupMenu" actionName="ShowMemberTemplates"/>
<actionMapping clickType="~popupMenu" actionName="ViewTools"/>
<actionMapping clickType="~popupMenu" actionName="IntegrationTools"/>
<actionMapping clickType="~popupMenu" actionName="MaintTools"/>
</templateTreeMapping>
</treeTemplate>


If, for any reason, something like this happens to you after you did put both artifacts back to DB2 and wanted to reinit your canvas:
C:\Program Files\IBM\tivoli\tbsm\bin>rad_reinitcanvas.bat tipadmin smartway
Final result - Failure : 1


simply restart your TIPProfile. It will do the job and after all you should get this new menu item available in the Service Navigator portlet in TIP while looking at Services:






What is slightly weird is that if you're in the service edit mode, even after deleting it it stays open in the editor:
No worries thought, even if you try to save it, TBSM won't let you do it.

3. The service SLAPrunePolicyActivator

This one is quite important. You may remember or be aware of this issue explained in here:
http://www.ibm.com/support/knowledgecenter/SSSPFK_6.1.1.3/com.ibm.tivoli.itbsm.doc/customization/bsmc_slac_maint.html

You're safe if you don't use SLA calculations in TBSM. But if you do, this article is nice enough to tell you which tables to clean manually yourself from time to time. So finally we have an automation doing that work for you! Let's see if it's easy to setup.

As the prerequisite you're to install new policies shipped with TBSM 6.1.1 FP4 (just one is new, others come from the previous 6.1.1 Fix packs).Unfortunately something went wrong on my TBSM on Windows:

C:\Program Files\IBM\tivoli\tbsm\bin>ApplyPoliciesFor611FP4.bat <user> <pass>
"running rad_discover_schema on SLA_OutageView"
Final result - Failure : 1
"running rad_discover_schema on SLA_ArchiveView"
Final result - Failure : 1
PUSH policy: ResEnrichReaderRecycle.js --project TBSM_BASE
Final result - Failure : 1
PUSH policy: AV_GetColorForSeverity.ipl  --project TBSM_BASE
Final result - Failure : 1
PUSH policy: SLAPrune.ipl  --project TBSM_BASE
Final result - Failure : 1
trigger policy: ResEnrichReaderRecycle
Final result - Failure : 1
NOTE:  If there are failures during the PUSH, it is likely due to policy being l
ocked!
       if this is the case, unlock the policy and re-run this script!
C:\Program Files\IBM\tivoli\tbsm\bin>


So I'll use the Impact UI instead:
Once I have the policy uploaded, I configure the service. I think it makes great sense to put that service also into the TBSM_BASE project. Let see how it works. The service's log looks not much self-explanatory:

May 14, 2016 11:36:22 PM CEST[SLAPrunePolicyActivator]SLAPrunePolicyActivator: executed
May 14, 2016 11:36:22 PM CEST[SLAPrunePolicyActivator]SLAPrunePolicyActivator: started


and the policy log is also not very talkative about the effort or how many entries where there in those 7 tables that it pruned:

14 maj 2016 23:36:22,401: [SLAPrune][pool-7-thread-63]Parser log: Starting policy SLAPrune
14 maj 2016 23:36:22,401: [SLAPrune][pool-7-thread-63]Parser log: Calling BatchDelete for DataType SLA_DurationCountWatchers with filter: timestamp < 1460669782401
14 maj 2016 23:36:22,417: [SLAPrune][pool-7-thread-63]Parser log: Calling BatchDelete for DataType SLA_TasksOfDurationCountWatcher with filter: timestamp < 1460669782401
14 maj 2016 23:36:22,417: [SLAPrune][pool-7-thread-63]Parser log: Calling BatchDelete for DataType SLA_IncidentCountWatchers with filter: timestamp < 1460669782401
14 maj 2016 23:36:22,432: [SLAPrune][pool-7-thread-63]Parser log: Calling BatchDelete for DataType SLA_TasksOfIncidentCountWatchers with filter: timestamp < 1460669782401
14 maj 2016 23:36:22,432: [SLAPrune][pool-7-thread-63]Parser log: Calling BatchDelete for DataType SLA_CumulDurationRuleState with filter: timestamp < 1460669782401
14 maj 2016 23:36:22,448: [SLAPrune][pool-7-thread-63]Parser log: Calling BatchDelete for DataType SLA_CumulRuleArchive with filter: timestamp < 1460669782401
14 maj 2016 23:36:22,448: [SLAPrune][pool-7-thread-63]Parser log: Calling BatchDelete for DataType SLA_EventStore with filter: timestamp < 1460669782401
14 maj 2016 23:36:22,464: [SLAPrune][pool-7-thread-63]Parser log: Finished policy SLAPrune and exiting now...


Too late I realized that I could check on the radeventstore data volume before I ran the service. I could only check on that after I did that:

db2 => select count(*) from tbsmbase.radeventstore
1
-----------
          0
  1 record(s) selected.
db2 =>


Well. next time. I believe it works ok ;)

4. TBSMDBChecker

Last not least, not sure why post-install steps are spread across the readme file, this one isn't really together with those mentioned above and you need to read the readme quite carefully in order to not miss it.

Copying the TBSMDB checker tools from Fix pack zip to TBSM tbsmdb directory could be done by the installer and it isn't, maybe because tbsmdb directory belongs to the TBSM DB management tool and it might be owned by another user, not the one which runs TBSM.

TBSM_Check_DB.bat, TBSM_Check_DB.sh and TBSMDBChecker.props are identical with those that you already have. So the only difference is in tbsmDatabaseChecker.xml which contains few extra sections related to checking on user providing a password. Checking TBSM DB is rather a simple task and should look like this (if everything works ok):

C:\Program Files\IBM\tivoli\tbsmdb\bin>TBSM_Check_DB.bat
Calling ant script tbsmDatabaseChecker.xml to check the databases
    [input] Enter administrative database userid. This user should have SYSADMIN
 or SECADMIN authority and will be used to check authority of data server userid
:
db2admin
Enter the password for administrative database userid db2admin:
    [input] Enter database userid that will be used to connect the database from
 the TBSM data server: [db2admin]

Enter the password for data server database userid db2admin:
    [input] Enter database(s) to be checked:
    [input]    S = Service Model
    [input]    H = Metric History
    [input]    M = Metric Marker
    [input]  [ALL]
S
     [echo] Executing main TBSM Database checker task at 2016/05/15 00:17:13.994

     [echo] Checking of the following TBSM database was successful: TBSM
     [echo] See the stdout and stderr logs for more detail. The logs can be foun
d in: C:\Program Files\IBM\tivoli\tbsmdb\logs.
     [echo] TBSM Database checker ended at 2016/05/15 00:17:17.075

BUILD SUCCESSFUL
Total time: 25 seconds
C:\Program Files\IBM\tivoli\tbsmdb\bin>


If you want to learn more about the TBSM database checker tool, read this technote:
http://www-01.ibm.com/support/docview.wss?uid=swg24031960


5. Discovery library toolkit and Monitoring Agent for TBSM fix pack installation 

The installation itself is just like described, I'm adding these steps to the list just to make the picture of the post-installation manual steps complete. There's no APAR for the Monitoring Agent for TBSM in the fix pack 4 hence still the one included in Fix Pack 1 is still the latest one and if you have it already installed, you're done with it. Speaking about the discovery library toolkit, there's an update in fix pack 4 to install.


This is it. A few of manual post-install steps, always easy to overlook when reading a long readme, I hope that someone finds this overview useful.

mp

Friday, May 13, 2016

Triggering TBSM Rules

Introduction

Tivoli Business Service Manager can calculate amazing things for you, if you only need them. This is thanks to the powerful rules engine being the key part of TBSM as well as the Netcool/Impact policies engine running  just under the hood together with every TBSM edition. You can present your calculation results later on on a dashboard or in reports, depending if you think of a real time scorecard or historical KPI reports.
In this article, I’ll show how you can make TBSM to process its inputs by triggering various template rules in various ways. It is something that isn’t really well documented or at least it isn’t well documented in a single place.

Status, numerical and text rules triggers

In this chapter  I’ll show three kinds of rules (status, numerical and text) and I’ll show how TBSM triggers them, so processes the input data, runs them and returns the outputs.
In general these three techniques always kick off TBSM rules based on the same two conditions: time and a new value. Here they are:
  1. Omnibus Event Reader service (for incoming status rules or numerical rules or text rules based on Omnibus ObjectServer as the data feed)
  2. TBSM data fetchers (for numerical or text rules with fetcher based data feed)
TBSM/Impact services of type Policy activator  (using Impact policy with PassToTBSM function calls to send data to numerical or text rules)
Figure 1. Three techniques of triggering status, numerical and text rules in TBSM



Make note. There are ITM policy fetchers as well as highly undocumented any-policy fetchers configurable in TBSM, I’ll not comment on them in this material, however their basis is just like any fetcher: the time.
Let’s take a look at the first type of rules triggers, the most popular, the OMNIbus Event Reader service in Impact, widely used in TBSM to process events stored in ObjectServer against the service trees.

OMNIbus Event Reader

Omnibus Event Reader is an Impact service running regularly by default every 3 seconds in order to connect Netcool/OMNIbus for getting events stored in the Objectserver memory that might be affecting TBSM service tree elements. It selects the events based on the following default filter:

(Class <> 12000) AND (Type <> 2) AND ((Severity <> RAD_RawInputLastValue) or (RAD_FunctionType = '')) AND (RAD_SeenByTBSM = 0) AND (BSM_Identity <> '')

(Severity <> RAD_RawInputLastValue) is the condition ensuring that event will be tested against containing a new value in the Severity field comparing to the previous event.
The Event Reader itself can be found in Impact UI server within Services page in TIP among other services included in the Impact project called TBSM_BASE:
Figure 2. Configuration of TBSMOMNIbusEventReader service


Make note. TBSM allows you configuring other event readers but you can use just one same time.
All incoming status rules use this Event Reader by default. There’s embedded mapping between the name of the Event Reader and the Data source field in Status/Text/Numerical rules, hence just “ObjectServer” caption occurs in the new rule form:
Figure 3. Screenshot of New Incoming status rule form

Policy activators  

Impact services of type Policy activators simply call a policy every defined period of time and run that policy.
Figure 4. Screenshot of a policy activator service configuration for TBSMTreeRuleHeartbeat

The policy needs to be created earlier. In order to trigger TBSM rules, it is required to call PassToTBSM() function and contain in its argument an object. Let’s say this is my TBSMTreeRulesHeartbeat policy:

Seconds = GetDate();
Time = LocalTime(Seconds, "HH:mm:ss");
randNum = Random(100000);

ev=NewEvent("TBSMTreeRuleHeartbeatService");
ev.timestamp = String(Time);
ev.bsm_identity = "AnyChild";
ev.randNum =  randNum;
PassToTBSM(ev);
log(ev); 

In my example, a new value generates every time when the policy is activated by using GetDate() function. Pay attention to field called ev.bsm_identity. I’ll be referring to this field later on. For simplification this field has always value “AnyChild”. 

Make note. Unlike TBSM OMNIbus event reader, the policy activating services, also policies themselves don’t have to be included in TBSM_BASE impact project.
Netcool/Impact policies give you freedom of reaching to any data source, via SQL or SOAP or XML or REST or command like or anyhow you like, and processing any data you can see useful to process in TBSM. The only requirement is passing that data to TBSM via PassToTBSM function.

TBSM data fetchers

TBSM datafetchers combine an Impact policy with DirectSQL function call and Impact policy activator service. Additionally, data fetchers have a mini concept of schedule, it means you can set their run time interval to specific hour and minute and run it once a day (i.e. 12:00 am daily). It also allows postponing or rushing next runtime in case of longer taking SQL queries.
Make note. Data fetchers can be SQL fetchers, ITM policy fetchers or any policy fetchers, unfortunately the TBSM GUI was never fully adjusted to reconfigure ITM policy fetcher and was never enabled to allow configuration of any policy fetchers and in case of the last two you’ve got to perform few manual and command-line level steps instead. Fortunately, the PassToTBSM function available in Impact 6.x policies can be used instead and the any-policy fetchers aren’t that useful anymore.
Every fetcher by default runs every 5 minutes.
Figure 5. Screenshot of data fetcher Heartbeat fetcher


In the example presented on the screenshot above data fetcher connects DB2 database and runs DB2 SQL query. The new value is ensured every time in this example by calling the native DB2 random function and the whole query is the following:

select 100000*rand() as "randNum", 'AnyChild' as "bsm_identity" from sysibm.sysdummy1

Pay attention to field bsm_identity. It always returns the same value “AnyChild”, just like the policy explained before.


Triggering status, numerical, text and formula rules  

In the previous chapter I presented various rules triggering methods. It’s now time to show how those triggers work in real. I’ll create a template with 5 numerical and text rules (I don’t want to change any instance status, so I won’t create any incoming status rule this time) and additionally with 3 supportive formulas and I’ll present the output values of all those rules on a scorecard in Tivoli Integrated Portal. Below you can see my intended template with rules:
Figure 6. Screenshot of t_triggerrulestest template with rules

OMNIbus Event Reader-based rules

Let’s start from 2 rules utilizing the TBSM OMNibus Event Reader service. Like I said, I won’t use a status rule, but I’ll use one text and one numerical rule, in order to return the last event’s severity and the last event’s summary. Before I do it, let me configure my simnet probe that will be sending random events to my ObjectServer. My future service instance implementing the t_triggerrulestest template will be called TestInstance (or at least it will have such a value in its event identifiers), I want one event of each type and various probability that such event will be sent:

#######################################################################
#
# Format:
#
#       Node Type Probability
#
# where Node        => Node name for alarm
#       Type        =>  0 - Link Up/Down
#                      1 - Machine Up/Down
#                      2 - Disk space
#                      3 - Port Error
#       Probability => Percentage (0 - 100%)
#
#######################################################################

TestInstance 0 10
TestInstance 1 15
TestInstance 2 20
TestInstance 3 25


Let’s see if it works:
Figure 7. Screenshot of Event Viewer presenting test events sent by simnet probe.

So now my two rules, I marked important settings with green. So my Data feed is ObjectServer, the source is SimNet probe, the field containing service identifiers is Node and the output value taken back is Severity:
Figure 8. Screenshot of the rule numr_event_lastseverity form
Same here, this time the rule is Text rule and I have a fancy output expression, it is:


'AlertGroup: '+AlertGroup+', AlertKey: '+AlertKey+', Summary: '+Summary

The rule itself:

Figure 9. Screenshot of the rule txtr_event_lastsummary

Fetcher-based rule

That was easy. Now something little bit more complicated, the data fetcher. I already have my datafetcher created and show above in this material, let’s check if it works fine, the logs shows the fetcher is fine, i.e. if it fetches data every 5 minutes:

1463089289272[HeartbeatFetcher]Fetching from TBSMComponentRegistry has started on Thu May 12 23:41:29 CEST 2016
1463089289287[HeartbeatFetcher]Fetched successfully on Thu May 12 23:41:29 CEST 2016 with 1 row(s)
1463089289287[HeartbeatFetcher]Fetching duration: 00:00:00s
1463089289412[HeartbeatFetcher]1 row(s) processed successfully on Thu May 12 23:41:29 CEST 2016. Duration: 00:00:00s. The entire process took 00:00:00s
1463089589412[HeartbeatFetcher]Fetching from TBSMComponentRegistry has started on Thu May 12 23:46:29 CEST 2016
1463089589427[HeartbeatFetcher]Fetched successfully on Thu May 12 23:46:29 CEST 2016 with 1 row(s)
1463089589427[HeartbeatFetcher]Fetching duration: 00:00:00s
1463089589558[HeartbeatFetcher]1 row(s) processed successfully on Thu May 12 23:46:29 CEST 2016. Duration: 00:00:00s. The entire process took 00:00:00s

And the data preview looks good too:
Figure 10. The Heartbeat fetcher output data preview
My next rule will be just one and it will be a numerical rule to return the randNum value, I marked important settings in green again, so I select HeartbeatFetcher as the Data Feed, I select bsm_identity as service event identifier and randNum as the output value:
Figure 11. Screenshot of numr_fetcher_randNum rule

Policy activated rules

Last but not least I will create two rules getting data from my policy activated by my custom Impact Service. I did show the policy and the service in the previous chapter, let’s just make sure they both work ok. This is how the service works, every 5 minutes I get my policy activated and every time it returns another value in the randNum field:

12 maj 2016 23:41:29,652: [TBSMTreeRulesHeartbeat][pool-7-thread-87]Parser log: (PollerName=TBSMTreeRuleHeartbeatService, randNum=71648, timestamp=23:41:29, bsm_identity=AnyChild)
12 maj 2016 23:46:29,673: [TBSMTreeRulesHeartbeat][pool-7-thread-87]Parser log: (PollerName=TBSMTreeRuleHeartbeatService, randNum=8997, timestamp=23:46:29, bsm_identity=AnyChild)
12 maj 2016 23:51:29,674: [TBSMTreeRulesHeartbeat][pool-7-thread-91]Parser log: (PollerName=TBSMTreeRuleHeartbeatService, randNum=73560, timestamp=23:51:29, bsm_identity=AnyChild)
12 maj 2016 23:56:29,700: [TBSMTreeRulesHeartbeat][pool-7-thread-91]Parser log: (PollerName=TBSMTreeRuleHeartbeatService, randNum=60770, timestamp=23:56:29, bsm_identity=AnyChild)
13 maj 2016 00:01:29,724: [TBSMTreeRulesHeartbeat][pool-7-thread-92]Parser log: (PollerName=TBSMTreeRuleHeartbeatService, randNum=55928, timestamp=00:01:29, bsm_identity=AnyChild)

Let’s then create the rules. I will have two rules again, one numerical and one text. The numerical rule will have the TBSMTreeRuleHeartbeatService as the Data Feed, the bsm_identiy field will be selected as the service event identifier field and randNum field will be my output:
Figure 12. Screenshot of numr_heartbeat_randNum rule

Make note. Every time you add another field to your policy activated by your service, make sure that new field is mapped to the right data type in the Customize Fields form. You will need to add that field first:
Figure 13. Screenshot of CustomizedFields form

And the second rule looks the following, this time it is a text rule and I return the timestamp value:
Figure 14. Screenshot of txtr_heartbeat_lasttime rule

Formula rules  

The last rules I’ll create will be three formula, policy-based, text rules. Each of them will go to another rules create previously and “spy” on their activity. Let’s see the first example:

Figure 15. Screenshot of nfr_triggered_by_events rule


This rule will use policy and will be a text rule. It is important to mark those fields before continuing, later the Text Rule field greys out and inactivates. After ticking both fields I click on the Edit policy button. All three rules will look the same at this level; hence I won’t include all 3 screenshots as just their names will differ. I’ll create another policy in IPL for each of them. Here’s the mapping:

Rule name
Policy name
nfr_triggered_by_events
p_triggered_by_events
nfr_triggered_by_fetcher
p_triggered_by_fetcher
nfr_triggered_by_service
p_triggered_by_service

Each policy of those three will look similar, it will just look after different rules created so far. The p_triggered_by_events policy will do this:
// Trigger numr_event_lastseverity
// Trigger txtr_event_lastsummary

Seconds = GetDate();
Time = LocalTime(Seconds, "HH:mm:ss");

Status = Time;
log("TestInstance triggered by SimNet events at "+Time);
log("Output value of numr_event_lastseverity: "+InstanceNode.numr_event_lastseverity.Value);
log("Output value of txtr_event_lastsummary: "+InstanceNode.txtr_event_lastsummary.Value);

Policy p_triggered_by_fetcher will do this:
// Trigger numr_fetcher_randnum

Seconds = GetDate();
Time = LocalTime(Seconds, "HH:mm:ss");

Status = Time;
log("TestInstance triggered by HeartbeatFetcher at "+Time);
log("Output value of numr_fetcher_randnum: "+InstanceNode.numr_fetcher_randnum.Value); 

And policy p_triggered_by_service this:

// Trigger numr_heartbeat_randnum
// Trigger txtr_heartbeat_lasttime

Seconds = GetDate();
Time = LocalTime(Seconds, "HH:mm:ss");

Status = Time;
log("TestInstance triggered by TBSMHeartbeatService at "+Time);
log("Output value of numr_heartbeat_randnum: "+InstanceNode.numr_heartbeat_randnum.Value);
log("Output value of txtr_heartbeat_lasttime: "+InstanceNode.txtr_heartbeat_lasttime.Value);

You can notice that each policy starts from a comment section. This is important. This is how the formula rules get triggered. It is enough to mention another rule by its name in a comment to trigger your formula every time that other referred rule returns another output value. This is why we have the randnum-related rules in every formula. Those rules are designed to return another value every time they run. Just the first rule isn’t the same, but I assume it will trigger every time a combination of Summary, AlertGroup and AlertKey fields value in the source event changes.
The trigger numerical or text rules are also mentioned later when these policies call them and obtain their output values in order i.e. to put those values into log file. But it is not necessary to trigger my formulas. I log those trigger text and numerical rules outputs for troubleshooting purposes only.
The purpose of these 3 policies and 3 formulas is to report on time when the numerical or text values worked for the last time.
Below you can see an example of one of the policies in actual form.
Figure 16. Screenshot of one of the policies text

Testing the triggers

Now it’s time to test the trigger rules, the triggers and troubleshoot in case 

Triggering rules in normal operations
In order to do that we will need a service instance implementing our newly created template. I call it TestInstance and this is its base configuration:
Figure 17. Screenshot of configuration of service instance TestInstance – templates

It is important to make sure that the right event identifiers were selected in the Identification Fields tab. I need to remember what bsm_identity I set in all rules, it is mainly AnyChild (the policy and the fetcher) and TestInstance (the SimNet probe).
Figure 18. Screenshot of configuration of service instance TestInstance – identifiers


Make note. In real life your instance will have its individual identifiers like TADDM GUID or ServiceNow sys_id. It is important to find a match between that value and the affecting events or matching KPIs and if this is necessary to define new identifiers, which will ensure such a match.
Let’s see if it works in general. I created a scorecard and a page to present on all values of my new instance. I’ll put on top also fragments of my formula related policy logs to see if the data returned in policies and timestamps match:
Figure 19. Screenshot of the scorecard with policy logs on top

Let’s take a closer look at the first section. Same event arrived just once but since formula is triggered by two rules it was triggered twice in a row. In general the last event arrived at 20:27:00 and its severity was 4 (major) and the summary was on Link Down. Both rules numr_event_lastseverity and txtr_event_lastsummary triggered m formula correctly.

The next section is about the fetcher, the latest random number is 16589,861 and the rule numr_fetcher_randnum triggered my formula correctly. 
The last is the policy activated rule and formula, let’s see. This time I have two rules again and they both triggered the formula correctly. The last run was at 20:26:30. I have two different randnum values in both runs. This is caused by referring to numerical rules twice in my formula policy.

Triggering rules after TBSM server restart



I’ll now show a problem that TBSM has with rules that are not triggered by any trigger. Like I said in the previous chapters, TBSM needs rules to be triggered every now and then but also the value to change between triggers, in order to return the value again.
It causes some issues in TBSM server restart situations. If a value hasn’t changed before server restart and is still the same after the restart, TBSM may be unable to display or return it correctly, if the rule used to return it is not triggered. Server restart situation means clearing TBSM memory so no output values of no rules are preserved for after the server restart.
Here’s an example. I’ll create one new formula rule with this policy in my test template:

Status = ServiceInstance.NUMCHILDREN;
log("Service instance "+ServiceInstance.SERVICEINSTANCENAME+" ("+ServiceInstance.DISPLAYNAME+") has children: "+Status);

Here’s the rule itself:
Figure 20. Screenshot of nfr_numchildren rule configuration
As next step, I add one more column to my scorecard to show the output of the newly created rule. I also created 3 service instances and made them a child to TestInstance instance.

Figure 21. Screenshot of the scorecard shows 3 children count

Also my formula policy log will return number 3:

13 maj 2016 12:17:56,664: [p_numchildren][pool-7-thread-34 [TransBlockRunner-2]]Parser log: Service instance TestInstance (TestInstance) has children: 3

Now, if I only restart TBSM server, the value shown will be 0 and I will see no new entry in the log:
Figure 22. Screenshot of the scorecard after server restart shows 0 children

I can change this situation by taking one of three actions:
  1. Adding new or removing old child instances from Testinstance
  2. Modifying the formula policy
  3. Introducing a trigger to the formula policy

However two first options don’t protect me from another server restart situation occurring.
Let’s say I add another child instance. This is how the scorecard will look like:

Before the restart
However, after the restart


Or I may want to modify my rule. After saving my changes, the value will display correctly. However another server restart will reset it back to 0 again anyway.
So let’s say I change my policy to this:
Status = ServiceInstance.NUMCHILDREN;
log("Service instance "+ServiceInstance.SERVICEINSTANCENAME+" ("+ServiceInstance.DISPLAYNAME+") has children: "+Status);
log("Service instance ID: "+ServiceInstance.SERVICEINSTANCEID);

And my policy log now contains two entries per run:
13 maj 2016 12:56:07,023: [p_numchildren][pool-7-thread-4 [TransBlockRunner-1]]Parser log: Service instance TestInstance (TestInstance) has children: 4
13 maj 2016 12:56:07,023: [p_numchildren][pool-7-thread-4 [TransBlockRunner-1]]Parser log: Service instance ID: 163

But the situation before and after the restart is the same:
Before the restart
After the restart


It’s not a frequent situation though. If your rules are normally event-triggered rules or data fetcher triggered rules you can expect frequent updates to your output values even after your TBSM server restarts.  Just in case you want to present an output value in a rule that normally is not triggered, make sure you include a reference to a trigger in your rule. Let’s use one of the triggers we configured previously in my new formula policy:

// Trigger by numr_fetcher_randnum    
// Trigger by numr_heartbeat_randnum

Status = ServiceInstance.NUMCHILDREN;
log("Service instance "+ServiceInstance.SERVICEINSTANCENAME+" ("+ServiceInstance.DISPLAYNAME+") has children: "+Status);
log("Service instance ID: "+ServiceInstance.SERVICEINSTANCEID); 

You can already notice by following the log of the policy that there are many entries per every policy run, precisely as many entries as many times the formula was triggered by one of the trigger rules.
The first pair of entries was added after saving the rule. The next 2 pairs were added in result of the triggers working fine:


13 maj 2016 13:22:36,837: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance TestInstance (TestInstance) has children: 4
13 maj 2016 13:22:36,837: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance ID: 163
13 maj 2016 13:24:12,833: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance TestInstance (TestInstance) has children: 4
13 maj 2016 13:24:12,833: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance ID: 163
13 maj 2016 13:24:18,465: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance TestInstance (TestInstance) has children: 4
13 maj 2016 13:24:18,465: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance ID: 163
 
Let’s make the final test, so TBSM server restart:
Before the restart
After the restart


This excercise ends my material for tonight. I'll continue in another material on triggering the status propagation rules and numeric aggregation rules. See you soon!
mp