CP4WatsonAIOps CP4WAIOPS v.3.4.0

Demo Environment Installation - Short Track 🚀

©2022 Niklaus Hirt / IBM

❗❗❗ This is the Repository for Field Validation Testing 3.4

❗ ⚠️ ❗ Create the pull secrets for FVT before starting installation (adapt the file first):

./00_FVT_CREATE_SECRETS.sh

❗ THIS IS WORK IN PROGRESS

Please drop me a note on Slack or by mail nikh@ch.ibm.com if you find glitches or problems.


Installation


🚀 Demo Installation

These are the steps that you have to execute to install a complete demo environment:

  1. AI Manager Installation
  2. AI Manager Configuration
  3. Slack integration
  4. Demo the Solution

❗You can find a PDF version of this guide here: PDF.

🚨🚨🚨🚨 📺 Here is a video that walks you through the complete installation process.

🚀 TLDR - Fast Track

These are the high-level steps that you need to execute to install the demo environment:

  1. Install AI Manager

    ansible-playbook ./ansible/00_aimanager-install-all.yaml -e ENTITLED_REGISTRY_KEY=<REGISTRY_TOKEN> 
  2. AI Manager Configuration

  3. Slack integration

ℹ️ In-depth documentation


1 Introduction


This document is a short version of the full README 🐥 that contains only the essential steps.

This is provided as-is:

  • I'm sure there are errors
  • I'm sure it's not complete
  • It clearly can be improved

❗This has been tested for the new CP4WAIOPS v.3.4.0 release on OpenShift 4.8 on ROKS (4.10 was not yet available on Techzone).

So if you have any feedback, please contact me.


2 AI Manager Installation


2.1 Get the code

Clone the GitHub Repository

From IBM internal:

git clone https://<YOUR GIT TOKEN>@github.ibm.com/NIKH/aiops-install-ansible-fvt-33.git 

Or my external repo (this is updated less often than the IBM internal one):

git clone https://github.com/niklaushirt/cp4waiops-public.git

2.2 Prerequisites

2.2.1 OpenShift requirements

I installed the demo in a ROKS environment.

You'll need:

  • ROKS 4.8
  • 5x worker nodes Flavor b3c.16x64 (so 16 CPU / 64 GB)

You might get away with less if you don't install some components (Event Manager, ELK, Turbonomic,...) but no guarantee:

  • Typically 4x worker nodes Flavor b3c.16x64 for only AI Manager
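
To compare a cluster against the node counts above, a small helper can count the nodes that report Ready. This is only a sketch: the function name count_ready_nodes is made up for illustration, and it assumes the default column layout of `oc get nodes --no-headers` (NAME STATUS ROLES AGE VERSION).

```shell
# Count nodes whose STATUS column is exactly "Ready".
# Reads the output of `oc get nodes --no-headers` from stdin.
# Nodes shown as "NotReady" or "Ready,SchedulingDisabled" are not counted.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Example (requires a logged-in oc session):
# oc get nodes --no-headers | count_ready_nodes
```

The helper does not look at the node flavor; CPU and memory still have to be checked separately.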

2.2.2 Tooling

You need the following tools installed in order to follow through this guide:

  • ansible
  • oc (4.7 or greater)
  • jq
  • kafkacat (only for training and debugging)
  • elasticdump (only for training and debugging)
  • IBM cloudctl (only for LDAP)
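
Before starting, it can be handy to see which of these tools are still missing. The following is a minimal sketch, not part of the repository: missing_tools is a hypothetical helper name, and the binary names in the example call are assumptions (for instance, newer packages may install kafkacat as kcat).

```shell
# Print every given tool that cannot be found on PATH, one per line.
# Returns 0 only if all tools are present.
missing_tools() {
  local status=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Example call with the tools named in this guide (uncomment to run):
# missing_tools ansible oc jq kafkacat elasticdump cloudctl
```

The check only confirms that a command exists; it does not verify versions such as oc 4.7 or greater.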

2.2.2.1 On Mac - Automated (preferred)

Just run:

./10_install_prerequisites_mac.sh

2.2.2.2 On Ubuntu - Automated (preferred)

Just run:

./11_install_prerequisites_ubuntu.sh

2.3 Pull Secrets

2.3.1 Get the CP4WAIOPS installation token

You can get the installation (pull) token from https://myibm.ibm.com/products-services/containerlibrary.

This allows the CP4WAIOPS images to be pulled from the IBM Container Registry.

2.4 Install AI Manager

2.4.1 Start AI Manager Installation

  1. ❗ Create the pull secrets for FVT (adapt the file first):

    ./00_FVT_CREATE_SECRETS.sh

  2. Start the Easy Installer with the token from 2.3.1:

    ./01_easy-install.sh -t <REGISTRY_TOKEN>

  3. Select option 🐥00 to install the complete AI Manager demo environment.

There are also options to install only a vanilla AI Manager.

Or directly run:

ansible-playbook ./ansible/00_aimanager-install-all.yaml -e ENTITLED_REGISTRY_KEY=<REGISTRY_TOKEN> 
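
A common slip is to run the playbook with an empty token or with the literal placeholder still in place. The wrapper below is a sketch that guards against this; install_aimanager is a hypothetical name, while the playbook path and the ENTITLED_REGISTRY_KEY variable are taken from the command above.

```shell
# Run the install playbook only if a real-looking token was passed.
# Usage: install_aimanager "$MY_TOKEN"
install_aimanager() {
  local token="${1:-}"
  if [ -z "$token" ] || [ "$token" = "<REGISTRY_TOKEN>" ]; then
    echo "Please pass the registry token from 2.3.1 as the first argument." >&2
    return 1
  fi
  ansible-playbook ./ansible/00_aimanager-install-all.yaml \
    -e ENTITLED_REGISTRY_KEY="$token"
}
```

Keeping the token in a shell variable (rather than typing it inline) also keeps it out of the shell history.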

This takes about one to two hours. After completion, the Easy Installer exits and opens the documentation and the AI Manager webpage (on Mac); you then have to perform the last manual steps.
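
While the installation runs, you may want to see which pods have not come up yet. The sketch below assumes the default column layout of `oc get pods --no-headers` (NAME READY STATUS RESTARTS AGE); list_unfinished_pods is a hypothetical helper, and <NAMESPACE> is a placeholder because this guide does not name the installation namespace.

```shell
# Print the names of pods whose STATUS is neither Running nor Completed.
# Reads the output of `oc get pods --no-headers` from stdin.
# Note: this looks at STATUS only, not at the READY column.
list_unfinished_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Example (requires a logged-in oc session):
# oc get pods -n <NAMESPACE> --no-headers | list_unfinished_pods
```

An empty result suggests that all pods in that namespace have started, not necessarily that the playbook itself has finished.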

You now have a full, basic installation of AI Manager with:

  • AI Manager
  • Open LDAP
  • RobotShop demo application
  • Trained Models based on precanned data (Log- and Metric Anomalies, Similar Incidents, Change Risk)
  • Topologies for demo scenarios
  • AWX (OpenSource Ansible Tower) with runbooks for the demo scenarios
  • Demo UI

2.5 Configure AI Manager

A few minimal configuration steps are needed to fully configure the demo environment. They are covered in the following chapters.

Minimal Configuration

These are the manual configurations you'll need to demo the system; they are covered by the flow above.

Basic Configuration

  1. Configure LDAP Logins

Advanced Configuration

  1. Enable Story creation Policy
  2. Create AWX Connection
  3. Create Runbook Policy

Configure Topology

  1. Re-Run Kubernetes Observer

Configure Slack

  1. Setup Slack

3. AI Manager Configuration


❗ Make sure the playbook 00 has completed before continuing

You have to do the following:

  1. Login to AI Manager
  2. Add LDAP Logins to CP4WAIOPS
  3. Enable Story creation Policy
  4. Publish Runbook
  5. Create Runbook Policy
  6. Re-Run Kubernetes Observer
  7. Now you can create the Slack Integration

3.1 First Login

After successful installation, the Playbook creates a file ./LOGINS.txt in your installation directory.

ℹ️ You can also run ./tools/20_get_logins.sh at any moment. This will print out all the relevant passwords and credentials.

  • Open the LOGINS.txt file that has been created by the Installer in your root directory

  • Open the URL from the LOGINS.txt file

  • Click on IBM provided credentials (admin only)

  • Login as admin with the password from the LOGINS.txt file

3.2 Add LDAP Logins to CP4WAIOPS

  • Go to AI Manager Dashboard

  • Click on the top left "Hamburger" menu

  • Select Access Control

  • Select User Groups Tab

  • Click New User Group

  • Enter demo (or whatever you like)

  • Click Next

  • Select Identity Provider Groups

  • Search for demo

  • Select cn=demo,ou=Groups,dc=ibm,dc=com

  • Click Next

  • Select Roles (I use Administrator for the demo environment)

  • Click Next

  • Click Create

  • Click on the top right image

  • Select Logout

  • Click Log In

  • Select Change your Authentication method

  • Select Enterprise LDAP

  • Login with the demo credentials

    • User: demo
    • Password: P4ssw0rd!

3.3 Enable Story creation Policy

  • In the AI Manager "Hamburger" Menu select Operate/Automations

  • Under Policies

  • Select Stories from the Tag dropdown menu

  • Enable Default story creation policy for high severity alerts

  • Also enable Default story creation policy for all alerts if you want to get all alerts grouped into a story

❗ Wait for the playbook to complete before continuing

3.4 Publish Runbooks

❗If you don't get any runbooks you can run the following to try to create them again: ansible-playbook ./ansible/45_aimanager-load-awx-playbooks-all.yaml

  • In the AI Manager "Hamburger" Menu select Operate/Automations

  • Select Runbooks tab

  • For the Mitigate RobotShop Problem runbook, click on the three dots at the end of the line

  • Click Edit

  • Click on the blue Publish button

  • Repeat for the other Runbooks

3.5 Create Runbook Policy

  • In the AI Manager "Hamburger" Menu select Operate/Automations

  • Under Policies, click Create Policy

  • Select Assign a runbook to alerts

  • Name it Mitigate RobotShop

  • Under Condition set 1

  • Select resource.name (you can type name and select the name field for resources)

  • Set Operator to contains

  • And for value you type mysql (select String: mysql)

  • Under Runbooks

  • Select the Mitigate RobotShop Problem Runbook

  • Under Select Mapping Type, select Use default parameter value (this has been prefilled by the installer)

  • Click Create Policy

3.6 Re-Run Kubernetes Integration

In the AI Manager (CP4WAIOPS)

  1. In the AI Manager "Hamburger" Menu select Define/Data and tool integrations
  2. Click Kubernetes
  3. Under robot-shop, click on Run (with the small play button)

4. Slack integration


For the system to work you need to follow these steps:

  1. Create Slack Workspace
  2. Create Slack App
  3. Create Slack Channels
  4. Create Slack Integration
  5. Get the Integration URL
  6. Create Slack App Communications
  7. Slack Reset

4.1 Create your Slack Workspace

  1. Create a Slack workspace by going to https://slack.com/get-started#/createnew and logging in with an email address that is not your IBM email. Your IBM email is part of the IBM Slack enterprise account, so you will not be able to create an independent Slack workspace outside of the IBM Slack service.

  2. After authentication, click Create a Workspace ->

  3. Name your Slack workspace

    Give your workspace a unique name such as aiops-<yourname>.

  4. Describe the workspace's current purpose

    This is free text; you may simply write “demo for Watson AIOps” or whatever you like.

  5. You may add team members to your new Slack workspace or skip this step.

At this point you have created your own Slack workspace where you are the administrator and can perform all the necessary steps to integrate with CP4WAIOPS.

Note: This Slack workspace is outside the control of IBM and must be treated as a completely public environment. Do not place any confidential material in this Slack workspace.

4.2 Create Your Slack App

  1. Create a Slack app by going to https://api.slack.com/apps and clicking Create New App.

  2. Select From an app manifest

  3. Select the workspace that you created before and click Next

  4. Copy and paste the content of this file: ./doc/slack/slack-app-manifest.yaml.

    Don't bother with the URLs just yet, we will adapt them as needed.

  5. Click Next

  6. Click Create

  7. Scroll down to Display Information and name your CP4WAIOPS app.

  8. You can add an icon to the app (there are some sample icons in the ./tools/4_integrations/slack/icons folder).

  9. Click Save Changes

  10. In the Basic Information menu click on Install to Workspace, then click Allow

4.3 Create Your Slack Channels

  1. In Slack, add two new channels:

    • aiops-demo-reactive
    • aiops-demo-proactive

  2. Right click on each channel and select Copy Link

    This should get you something like https://xxxx.slack.com/archives/C021QOY16BW. The last part of the URL is the channel ID (i.e. C021QOY16BW). Jot it down for both channels.

  3. Under Apps click Browse Apps

  4. Select the App you have just created

  5. Invite the Application to each of the two channels by typing

    @<MyAppname>

  6. Select Add to channel

    You should get a message saying that the app was added to #<your-channel> by ...
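
The channel ID from step 2 is simply the last path segment of the copied link, so it can be cut out with plain shell parameter expansion. A small sketch (channel_id_from_link is a hypothetical helper name):

```shell
# Extract the channel ID (last path segment) from a copied Slack channel link.
channel_id_from_link() {
  local link="${1%/}"   # drop a trailing slash, if any
  echo "${link##*/}"    # keep everything after the last slash
}

# Example with the sample link from this guide:
# channel_id_from_link "https://xxxx.slack.com/archives/C021QOY16BW"
```

Run it once per channel and keep both IDs for the integration step.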

4.4 Integrate Your Slack App

In the Slack App:

  1. In the Basic Information menu get the Signing Secret (not the Client Secret!) and jot it down

  2. In the OAuth & Permissions get the Bot User OAuth Token (not the User OAuth Token!) and jot it down
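
Before pasting the Bot User OAuth Token into AI Manager, you may want to confirm that Slack accepts it. One option is Slack's auth.test Web API method, which answers with a JSON body containing "ok": true for a valid token. The helper below is a sketch that only inspects such a response; slack_response_ok and the SLACK_BOT_TOKEN variable are names chosen for this example.

```shell
# Succeed if the JSON on stdin contains "ok": true (as Slack Web API replies do).
# This is a plain text match, not a full JSON parse.
slack_response_ok() {
  grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'
}

# Example (needs network access and a token in SLACK_BOT_TOKEN):
# curl -s -X POST -H "Authorization: Bearer ${SLACK_BOT_TOKEN}" \
#   https://slack.com/api/auth.test | slack_response_ok && echo "token accepted"
```

If the check fails, the raw response usually carries an error field that names the reason.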

In the AI Manager (CP4WAIOPS)

  1. In the AI Manager "Hamburger" Menu select Define/Data and tool integrations

  2. Click Add connection

  3. Under Slack, click on Add Connection

  4. Name it "Slack"

  5. Paste the Signing Secret from above

  6. Paste the Bot User OAuth Token from above

  7. Paste the channel IDs from the channel creation step in the respective fields

  8. Test the connection and click Save

4.5 Create the Integration URL

In the AI Manager (CP4WAIOPS)

  1. Go to Data and tool integrations

  2. Under Slack click on 1 integration

  3. Copy out the URL

This is the URL you will be using in section 4.6.

4.6 Create Slack App Communications

Return to the browser tab for the Slack app.

4.6.1 Event Subscriptions

  1. Select Event Subscriptions.

  2. In the Enable Events section, click the slider to enable events.

  3. For the Request URL field use the URL from section 4.5.

    e.g: https://<my-url>/aiops/aimanager/instances/xxxxx/api/slack/events

  4. After pasting the value in the field, a Verified message should display.

    If you get an error, please check 4.7

  5. Verify that the Subscribe to bot events section lists:

    • app_mention and
    • member_joined_channel events.

  6. Click the Save Changes button.

4.6.2 Interactivity & Shortcuts

  1. Select Interactivity & Shortcuts.

  2. In the Interactivity section, click the slider to enable interactivity. For the Request URL field, use the URL from above.

    There is no automatic verification for this form.

  3. Click the Save Changes button.

4.6.3 Slash Commands

Now, configure the welcome slash command. With this command, you can trigger the welcome message again if you closed it.

  1. Select Slash Commands

  2. Click Create New Command to create a new slash command.

    Use the following values:

    Field              Value
    Command            /welcome
    Request URL        the URL from above
    Short Description  Welcome to Watson AIOps

  3. Click Save.

4.6.4 Reinstall App

The Slack app must be reinstalled, as several permissions have changed.

  1. Select Install App
  2. Click Reinstall to Workspace

Once the workspace request is approved, the Slack integration is complete.

If you run into problems validating the Event Subscription in the Slack Application, see 4.7

4.7 Create valid CP4WAIOPS Certificate (optional)

The installer should already have done this.

But if there still are problems, you can directly run:

ansible-playbook ./ansible/31_aimanager-create-valid-ingress-certificates.yaml
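
To see whether the route now presents a certificate that a standard client trusts, you can call the URL with curl and look at its exit code: 0 means the request went through, and 60 means the certificate could not be verified against the known CA certificates. The sketch below only translates the exit code; describe_curl_exit is a hypothetical helper, and curl checks against your local CA store, which may differ from what Slack uses.

```shell
# Translate a curl exit code into a short verdict about the certificate.
describe_curl_exit() {
  case "$1" in
    0)  echo "certificate accepted" ;;
    60) echo "certificate not trusted" ;;
    *)  echo "other connection problem (curl exit code $1)" ;;
  esac
}

# Example (replace <my-url> with the host from the integration URL):
# curl --silent --output /dev/null --max-time 15 "https://<my-url>"
# describe_curl_exit $?
```

A "not trusted" verdict is a hint to run the certificate playbook above again.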

4.8 Slack Reset

4.8.1 Get the User OAuth Token

This is needed for the reset scripts in order to empty/reset the Slack channels.

This is based on Slack Cleaner2. You might have to install this:

pip3 install slack-cleaner2

Reset reactive channel

In your Slack app

  1. In the OAuth & Permissions get the User OAuth Token (not the Bot User OAuth Token this time!) and jot it down

In file ./tools/98_reset/13_reset-slack.sh

  1. Replace not_configured for the SLACK_TOKEN parameter with the token
  2. Adapt the channel name for the SLACK_REACTIVE parameter

Reset proactive channel

In your Slack app

  1. In the OAuth & Permissions get the User OAuth Token (not the Bot User OAuth Token this time!) and jot it down (same token as above)

In file ./tools/98_reset/14_reset-slack-changerisk.sh

  1. Replace not_configured for the SLACK_TOKEN parameter with the token
  2. Adapt the channel name for the SLACK_PROACTIVE parameter
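
If you prefer not to edit the reset scripts by hand, the placeholder can be swapped with sed. This is a sketch under an assumption: it expects the parameter name and the literal not_configured to sit on the same line (for example SLACK_TOKEN=not_configured), which has not been checked against the actual scripts. set_placeholder is a hypothetical helper name.

```shell
# Replace "not_configured" with a value on every line that mentions a parameter.
# Usage: set_placeholder <file> <parameter-name> <value>
# The value must not contain "|" or "&" (both are special to this sed call).
set_placeholder() {
  local file="$1" name="$2" value="$3"
  sed "/${name}/s|not_configured|${value}|" "$file" > "${file}.tmp" &&
    cat "${file}.tmp" > "$file" &&   # write back in place to keep file permissions
    rm -f "${file}.tmp"
}

# Example:
# set_placeholder ./tools/98_reset/13_reset-slack.sh SLACK_TOKEN "$SLACK_USER_TOKEN"
```

Check the result with a quick grep for the parameter name before running the reset.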

4.8.2 Perform Slack Reset

Call either of the scripts above to reset the channel:

./tools/98_reset/13_reset-slack.sh

or

./tools/98_reset/14_reset-slack-changerisk.sh

5. Demo the Solution


5.1 Simulate incident - Command Line

Make sure you are logged in to the Kubernetes cluster first.

In the terminal type:

./tools/01_demo/incident_robotshop.sh

This will delete all existing Alerts/Stories and inject pre-canned events, metrics and logs to create a story.
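
Because the script deletes existing Alerts/Stories, it is worth confirming the cluster login before it runs. A minimal sketch using oc whoami, which fails when there is no active session (run_if_logged_in is a hypothetical wrapper name):

```shell
# Run the given command only if `oc whoami` succeeds, i.e. a cluster login exists.
run_if_logged_in() {
  if ! oc whoami >/dev/null 2>&1; then
    echo "Not logged in to the cluster; run 'oc login' first." >&2
    return 1
  fi
  "$@"
}

# Example:
# run_if_logged_in ./tools/01_demo/incident_robotshop.sh
```

This does not check which cluster you are logged in to, so a glance at the current context is still a good idea.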

ℹ️ Give it a minute or two for all events and anomalies to arrive in Slack.

ℹ️ You might have to run the script 3-4 times for the log anomalies to start appearing.
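
If you do need several runs, a small loop saves retyping. The sketch below repeats a command a fixed number of times with a pause in between; repeat_with_pause is a hypothetical helper, and the 120 second pause in the example is an arbitrary choice, not a tested value.

```shell
# Run a command <times> times, sleeping <pause> seconds between runs.
# Stops early and returns 1 if one of the runs fails.
# Usage: repeat_with_pause <times> <pause-seconds> <command> [args...]
repeat_with_pause() {
  local times="$1" pause="$2" i=1
  shift 2
  while [ "$i" -le "$times" ]; do
    "$@" || return 1
    if [ "$i" -lt "$times" ]; then
      sleep "$pause"
    fi
    i=$((i + 1))
  done
  return 0
}

# Example: run the incident script three times, two minutes apart.
# repeat_with_pause 3 120 ./tools/01_demo/incident_robotshop.sh
```

Keep in mind that each run deletes the existing Alerts/Stories again before injecting new data.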