Leveraging 13c Cloud Control’s Auto-heal Capabilities in Your Environment
Mukesh Pathak
As an Oracle Managed Service Provider (MSP), BIAS strives to make incident handling and repetitive actions as seamless and automated as possible. Automating repetitive tasks with Cloud Control’s corrective actions allows us to increase our efficiency and provide our customers with a consistent quality of service.
As a starting point, we analyzed our MSP customer base to classify alerts by type and frequency. As we suspected, a good deal of time was consumed by repetitive tasks such as resolving locking sessions and responding to tablespace and filesystem utilization alerts.
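A tally of this kind can be sketched in a few lines of Python. The record layout (dictionaries with `customer` and `type` keys) and the sample values are made up for illustration; they are not our actual alert data or an OEM export format.

```python
from collections import Counter


def rank_alert_types(alerts):
    """Return (alert_type, count) pairs, most frequent first."""
    counts = Counter(alert["type"] for alert in alerts)
    return counts.most_common()


# Hypothetical sample records, for illustration only.
sample_alerts = [
    {"customer": "A", "type": "tablespace_full"},
    {"customer": "B", "type": "tablespace_full"},
    {"customer": "C", "type": "tablespace_full"},
    {"customer": "A", "type": "filesystem_full"},
    {"customer": "B", "type": "filesystem_full"},
    {"customer": "C", "type": "locking_session"},
]

ranking = rank_alert_types(sample_alerts)
for alert_type, count in ranking:
    print(f"{alert_type}: {count}")
```

The alert types at the top of such a ranking are the natural first candidates for automation.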
To eliminate repeated tasks, we leveraged the corrective actions feature in Oracle Enterprise Manager (OEM). Corrective actions are processes that can be put in place at the metric level to execute when a certain threshold is reached. We created a corrective action script for each alert type, and each script includes the steps we typically perform manually when that alert is triggered.
As an example, let’s look at tablespace full alerts. The script either resizes an existing datafile or adds a new datafile, per the customer’s instructions.
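The resize-or-add decision can be sketched as follows. This is a minimal illustration, not our production script: the `Datafile` class, the function name, the size limits, and the file paths are all hypothetical, and the sketch only builds the SQL statement rather than running it against a database.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Datafile:
    path: str
    size_mb: int


def plan_tablespace_fix(tablespace: str, datafiles: List[Datafile],
                        increment_mb: int, max_file_mb: int,
                        new_file_path: str) -> str:
    """Build one SQL statement that adds space to a tablespace.

    Assumed policy (hypothetical): grow the smallest datafile that can
    take the increment without exceeding max_file_mb; if no datafile
    has room, add a new one of increment_mb.
    """
    candidates = [f for f in datafiles
                  if f.size_mb + increment_mb <= max_file_mb]
    if candidates:
        target = min(candidates, key=lambda f: f.size_mb)
        new_size = target.size_mb + increment_mb
        return f"ALTER DATABASE DATAFILE '{target.path}' RESIZE {new_size}M"
    return (f"ALTER TABLESPACE {tablespace} "
            f"ADD DATAFILE '{new_file_path}' SIZE {increment_mb}M")


# Hypothetical example values.
files = [Datafile("/u01/oradata/users01.dbf", 1024)]
resize_sql = plan_tablespace_fix("USERS", files, 512, 2048,
                                 "/u01/oradata/users02.dbf")
add_sql = plan_tablespace_fix("USERS", files, 512, 1200,
                              "/u01/oradata/users02.dbf")
print(resize_sql)
print(add_sql)
```

Keeping the decision separate from the execution makes it easy to encode each customer’s preference (resize first, or always add a file) as parameters and to review the generated statement before it is run.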
- The first step is to create the corrective action in the library. Log in to OEM, click Monitoring, and then click Corrective Actions:
- There are several types of corrective actions to choose from. For this example, we will choose “OS Command”:
- Provide a name for your corrective action, a short description, and the target it will run against:
- On the next screens, provide your script in the Parameters section and the credentials used to run it at the OS level, then click Save to Library. Once done, you will see the new corrective action in your library.
- Now we need to add this corrective action to the metric so that it runs when the metric meets the threshold criteria. Go to the home page of the database to which you want to apply the corrective action, then click Oracle Database > Monitoring > Metric and Collection Settings.
- Go to the Tablespace Space Used metric and set the critical and warning thresholds to acceptable values. Click Edit, and then click the Add button under the corrective action for the warning threshold.
- Choose “From Library” from the drop-down, click Continue, and then choose the corrective action we created earlier.
- You will now see that the metric has a corrective action set for the warning threshold. Click Continue. The corrective action now shows “warning only”, which means it will execute only when the warning threshold is hit. Click Continue, and then click OK to save the changes.
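To make the “warning only” behavior concrete, here is a simplified model of how a percent-used reading maps to a severity and whether a warning-level corrective action would run. It is a sketch under stated assumptions, not OEM’s actual evaluation engine: the function names and threshold values are hypothetical, and it assumes a reading at or above the critical threshold is treated as critical rather than warning.

```python
def severity_for(pct_used: float, warning: float, critical: float) -> str:
    """Classify a reading against the two thresholds (simplified model)."""
    if pct_used >= critical:
        return "critical"
    if pct_used >= warning:
        return "warning"
    return "clear"


def runs_corrective_action(pct_used: float, warning: float, critical: float,
                           run_on=("warning",)) -> bool:
    """True if the reading's severity is one the action is attached to."""
    return severity_for(pct_used, warning, critical) in run_on


# Hypothetical thresholds: warn at 85% used, critical at 97% used.
for reading in (70.0, 88.5, 98.0):
    print(reading, severity_for(reading, 85, 97),
          runs_corrective_action(reading, 85, 97))
```

Under this model, leaving enough room between the warning and critical thresholds gives the corrective action time to add space before the alert escalates.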
We have now created a corrective action and configured our metric to execute it once the threshold criteria are met. This is the first post of a two-part blog series. Part two will cover how to measure how many DBA man-hours can be saved. From the results thus far, we are very excited about the value this adds. Stay tuned. Happy reading!