
1 Introduction

AI Monitoring for Boomi: Enhancing Performance and Efficiency.
Experience a new level of efficiency and performance with AI monitoring for Boomi. Designed specifically for IT professionals, system administrators, business owners, and data analysts, this powerful tool revolutionizes the way you monitor and manage your Boomi integrations.

Eyer possesses extensive knowledge about the integration market, its solutions and use cases, and this has been leveraged to build a complete anomaly detection package for the Boomi runtime (Atoms & Molecules).

2 Why Eyer & Boomi

Eyer is an automated time-series observability platform that easily integrates with and supports different technologies through APIs.

Eyer will integrate directly into Boomi processes, giving the user the power to act on early warnings relating to Atom & Molecule performance, such as throughput, latency and other metrics related to runtime processing. Some use cases are listed in chapter 5, with examples of data to act on from the anomaly alerts and how to utilise the data in a Boomi process.

Benefits of AI Monitoring for Boomi:
Experience the power of AI monitoring for Boomi and take control of your integrations like never before. Maximize performance, minimize downtime, and streamline your operations. Discover a new level of efficiency today.

  • Proactive Problem Detection: Identify and resolve issues before they impact your operations. Stay one step ahead with proactive monitoring.

  • Minimize Downtime: Swiftly address any disruptions to your integrations. Minimize downtime and keep your business running smoothly.

  • Streamline Operations: Optimize your workflows and streamline your operations with intelligent monitoring and analytics. Maximize efficiency and productivity.

  • Enhance Decision-Making: Leverage predictive analytics to gain valuable insights. Make data-driven decisions and drive better outcomes for your business.

  • Save Time and Effort: Automate monitoring alerts and notifications. Spend less time on manual tasks and more time on strategic initiatives.

Data to act on:

  • Anomaly Alerts: Receive instant notifications when anomalies are detected in your integrations, down to metrics and components. Take proactive measures to prevent disruptions.

  • Root Cause Analysis: Quickly identify the root cause of any issues and resolve them efficiently. Minimize downtime and keep your operations running smoothly.

  • TBA! Predictive Analytics: Harness the power of predictive analytics to anticipate potential bottlenecks or performance issues. Make informed decisions and optimize your workflows.

  • Automated Alerts: Save time and effort with automated alerts. Receive notifications directly to your preferred channels, ensuring you never miss critical updates.

  • Easy Integration: Seamlessly integrate AI monitoring for Boomi into your existing systems. Get up and running quickly without any disruption to your workflow.

  • Boomi Recipes: Leverage a library of pre-built recipes to accelerate your integration projects. Benefit from best practices and optimize your implementation.

3 Eyer Boomi runtime agent

Eyer can monitor and process runtime information from the Atoms and Molecules. To expose the runtime's JMX performance data to Eyer, we use Influx Telegraf in combination with Jolokia. The data we fetch from the runtime is the following (example, in Eyer's internal JSON format):

[{
    "system": "DESKTOP-S01F7CP",
    "nodes": [
        {
            "nodetype": "operatingsystem",
            "data": {
                "cpu_usage_system": 1.4797507788161994,
                "cpu_usage_user": 31.386292834890966,
                "TotalSystemMemUsed": 5312753664,
                "AtomCommittedVirtualMemorySize": 327794688,
                "HeapMemoryUsage.committed": 134217728,
                "HeapMemoryUsage.init": 134217728,
                "HeapMemoryUsage.max": 536870912,
                "HeapMemoryUsage.used": 78256432,
                "AtomProcessCpuLoad": 0.0028079687560744176,
                "TotalPhysicalMemorySize": 8502923264,
                "timestamp": 1697127400
            }
        },
        {
            "nodetype": "ExecutionManager",
            "data": {
                "AverageExecutionQueueTime": 0,
                "AverageExecutionTime": 0,
                "LocalRunningWorkersCount": 0,
                "MaxQueuedExecutions": 0,
                "QueuedExecutionCount": 0,
                "QueuedExecutionEstimatedCount": 0,
                "QueuedExecutionTimeout": 0,
                "RunningExecutionCount": 0,
                "RunningExecutionEstimatedCount": 0,
                "timestamp": 1697127400
            }
        },
        {
            "nodetype": "ResourceManager",
            "data": {
                "AtomInBadState": false,
                "DeadlockDetected": false,
                "LowMemory": false,
                "OutOfMemory": false,
                "TooManyOpenFiles": false,
                "timestamp": 1697127400
            }
        },
        {
            "nodetype": "Scheduler",
            "data": {
                "ExecutingSchedulesCount": 0,
                "MissedSchedulesCount": 0,
                "ScheduleCount": 7,
                "timestamp": 1697127400
            }
        },
        {
            "nodetype": "ProcessSummaryReportingService",
            "data": {
                "PendingExecutionCount": 0,
                "PendingReportCount": 0,
                "PendingResultCount": 0,
                "timestamp": 1697127400
            }
        },
        {
            "nodetype": "MessageQueueFactory",
            "data": {
                "PendingMessageCount": 0,
                "timestamp": 1697127400
            }
        },
        {
            "nodetype": "config",
            "data": {
                "Restarting": false,
                "Status": "RUNNING",
                "timestamp": 1697127400
            }
        },
        {
            "nodetype": "QueueAcknowledgement-track",
            "data": {
                "PendingStoreMessageCount": 0,
                "PendingUploadMessageCount": 0,
                "timestamp": 1697127400
            }
        },
        {
            "nodetype": "MessagePollerThread",
            "data": {
                "connectFailureCount": 2,
                "deliveredMessageCount": 0,
                "timestamp": 1697127400
            }
        }
    ]
}]
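
For processing outside of Boomi, the payload above can be flattened into individual time series. A minimal Python sketch, assuming the JSON structure shown above (the file name is illustrative):

import json

def flatten_payload(payload):
    # Turn the Eyer runtime payload into (system, nodetype, metric, value, timestamp) rows
    rows = []
    for entry in payload:
        system = entry["system"]
        for node in entry["nodes"]:
            data = dict(node["data"])
            timestamp = data.pop("timestamp")
            for metric, value in data.items():
                rows.append((system, node["nodetype"], metric, value, timestamp))
    return rows

if __name__ == "__main__":
    with open("runtime_sample.json") as f:  # hypothetical file containing the example above
        payload = json.load(f)
    for row in flatten_payload(payload):
        print(row)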

4 Eyer Boomi connector & recipes

The Eyer Boomi connector allows the user to interface with the anomaly & correlation engine from within a Boomi process. Time-series data from the Atoms & Molecules described in chapter 3 is used to detect anomalies and correlations. The anomaly alerts, with detailed information, can then be used in Boomi processes to take automated action and support decision making. Some use cases based on anomaly detection are listed below in chapter 5.

Eyer will provide Boomi recipes to get users quickly up and running.

[Image: Eyer_Boomi_Alert_Decision.jpg]

5 Use cases

An alert query done by the Eyer Boomi connector (see chapter 4) should always return new / updated / closed anomaly alerts since the last query (query time window). Recommended interval between queries is x minutes.

Example

Query 1 at time 0 returns:

  • 1 new alert, id 1

  • 1 new alert, id 2

Query 2 at time 1 returns:

  • 1 closed alert, id 1

  • 1 updated alert (increased number of nodes affected), id 2

  • 1 new alert, id 3

The timestamp from the previous query should be buffered and used in the next query from Boomi. Query parameters:

  • Current query timestamp (mandatory)

  • Previous query timestamp (optional)

  • Status (blank = all statuses, or new / updated / closed) - optional

  • Criticality (lower level, criticality >= x) - optional

Boomi will iterate over each anomaly in the return message (query response) and check whether it should trigger custom processing in Boomi (branching, decisions, etc.).

In the cases below, the “input control parameters” section lists which fields from the anomaly alert response will be used for validation / actions in Boomi (the fields are selected by Boomi from the query response).

In Boomi, some fields from the anomaly alert should be stored as variables (state, nodes involved, metrics, etc.) to ensure correct further processing depending on the anomaly alert state.
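
A minimal Python sketch of this polling pattern (the endpoint URL, parameter names and response fields are assumptions for illustration; the Eyer Boomi connector performs the equivalent steps inside a Boomi process):

import time
import requests

EYER_ALERT_API = "https://api.eyer.example/v1/anomaly-alerts"  # hypothetical endpoint
POLL_INTERVAL_SECONDS = 300  # the "x minutes" between queries; 5 minutes is only an example

def poll_alerts():
    previous_query_ts = None
    while True:
        current_query_ts = int(time.time())
        params = {
            "current_timestamp": current_query_ts,  # mandatory
            "status": "",                           # blank = all statuses (new / updated / closed)
            "min_criticality": 2,                   # criticality >= x, illustrative value
        }
        if previous_query_ts is not None:
            params["previous_timestamp"] = previous_query_ts  # optional
        response = requests.get(EYER_ALERT_API, params=params, timeout=30)
        response.raise_for_status()
        for alert in response.json():
            # Boomi would branch here (decision shapes, routing, logging) per anomaly alert
            print(alert["id"], alert["status"], alert.get("criticality"))
        previous_query_ts = current_query_ts  # buffer the timestamp for the next query
        time.sleep(POLL_INTERVAL_SECONDS)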

5.1 Log & alert an anomaly alert

Case 1: As a user I want to receive an alert & log IF new anomaly alert has alert criticality >= x

General logging / notification if the alert exceeds a certain criticality threshold. All alert updates with criticality >= x are also logged. Reports a new anomaly.
Logs: whole alert with all fields. See the sketch after the parameter list below.

Input control parameters: 

  • Id

  • Alert criticality

  • status = new

  • Timestamp
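
A hedged Python sketch of the Case 1 check (field names and the log / notify helpers are assumptions; in Boomi this corresponds to a decision shape followed by a logging / notification step):

CRITICALITY_THRESHOLD = 3  # the "x" in the case description, illustrative value

def log_alert(alert):
    # Log the whole alert with all fields
    print("ALERT LOG:", alert)

def send_notification(alert_id, timestamp):
    # Placeholder for mail / chat / incident notification
    print(f"Notify: new anomaly alert {alert_id} at {timestamp}")

def handle_case_1(alert, threshold=CRITICALITY_THRESHOLD):
    # Log & notify only for new alerts that meet the criticality threshold
    if alert["status"] == "new" and alert["criticality"] >= threshold:
        log_alert(alert)
        send_notification(alert["id"], alert["timestamp"])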

Case 2: As a user I want to receive an alert & log IF updated anomaly alert has alert criticality >= x

General logging / notification if the alert exceeds a certain criticality threshold. All alert updates with criticality >= x are also logged. Reports an updated anomaly.
Logs: whole alert with all fields

Input control parameters: 

  • Id

  • criticality

  • status = updated

  • Timestamp

  • Last updated timestamp

Case 3: As a user I want to receive an alert & log IF a closed anomaly alert previously had alert criticality >= x

General logging / notification if an alert that was previously acted on is closed.
Logs: alert closed.

Input control parameters: 

  • Id

  • status = closed

  • Timestamp

  • Last updated timestamp

Case 4: As a user I want to receive an alert & log IF new anomaly alert contains systems x & nodes y with alert criticality >= z

General logging and notification if the alert contains a specific set of system(s) and node(s), and criticality >= z. Logs: whole alert with all fields. A sketch of the system / node check follows the parameter list below.

Input control parameters:

  • Id

  • Alert criticality

  • status = new

  • Timestamp

  • Systems affected

  • Nodes affected
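
Cases 4 to 9 add a check on which systems and nodes the alert touches. A small Python sketch of that extra condition (field names are assumptions; the system name is taken from the chapter 3 example):

WATCHED_SYSTEMS = {"DESKTOP-S01F7CP"}              # "systems x"
WATCHED_NODES = {"ExecutionManager", "Scheduler"}  # "nodes y", illustrative

def matches_scope(alert, systems=WATCHED_SYSTEMS, nodes=WATCHED_NODES):
    # True if the alert involves at least one watched system and one watched node
    return (
        any(s in systems for s in alert.get("systems_affected", []))
        and any(n in nodes for n in alert.get("nodes_affected", []))
    )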

Case 5: As a user I want to receive an alert & log IF updated anomaly alert contains systems x & nodes y with alert criticality >= z

General logging and notification if the alert contains a specific set of system(s) and node(s), and criticality >= z. Logs: whole alert with all fields.

Input control parameters:

  • Id

  • Alert criticality

  • status = updated

  • Timestamp

  • Last updated timestamp

  • Systems affected

  • Nodes affected

Case 6: As a user I want to receive an alert & log IF closed anomaly alert contains systems x & nodes y with alert criticality > z

General logging and notification if previous alert contained a specific set of system(s) and node(s), and criticality > z.
Logs: alert closed.

Input control parameters:

  • Id

  • status = closed

  • Timestamp

  • Last updated timestamp

Case 7: As a user I want to receive an alert & log IF new anomaly alert contains systems x & nodes y, and node y1 has criticality > z

General logging and notification if alert contains a specific set of system(s) and node(s), and criticality > z. Logs: whole alert with all fields.

Input control parameters:

  • Id

  • Alert criticality

  • status = new

  • Timestamp

  • Last updated timestamp

  • (Systems affected)

  • (Nodes affected)

  • Nodes criticality

Case 8: As a user I want to receive an alert & log IF updated anomaly alert contains systems x & nodes y, and node y1 has criticality > z

General logging and notification if alert contains a specific set of system(s) and node(s), and criticality > z. Logs: whole alert with all fields.

Input control parameters:

  • Id

  • Alert criticality

  • status = updated

  • Timestamp

  • Last updated timestamp

  • (Systems affected)

  • (Nodes affected)

  • Nodes criticality

Case 9: As a user I want to receive an alert & log IF closed anomaly alert contains systems x & nodes y, and node y1 has criticality > z

General logging and notification if alert previously contained a specific set of system(s) and node(s), and criticality > z.
Logs: alert closed.

Input control parameters:

  • Id

  • Alert criticality

  • status = closed

  • Timestamp

  • Last updated timestamp

  • (Systems affected)

  • (Nodes affected)

  • Nodes criticality

5.2 Automated action based on anomaly alert

Case 1: As a user, I want to take automated action IF alert (new) includes an anomaly on a specific metric (higher than normal) on a system A and node B, with alert criticality >= C

Store the Id as a control token, to monitor for “updated” & “closed” status. Based on the alert, take the needed action (routing, decision, messaging). A sketch of this pattern follows the parameter list below.

Input control parameters:

  • Id

  • Systems involved

  • Nodes involved

  • Timestamp

  • Criticality

  • Status = new

  • Metric names with anomalies

  • Metric types with anomalies

  • Metric value at the time of anomaly per metric

  • Metrics baseline (upper/lower) on alert per metric involved
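
A Python sketch of the control-token pattern in Case 1 (the in-memory store, metric name and action function are illustrative; in Boomi the Id would typically be kept in a process property or document cache):

control_tokens = {}  # stands in for a Boomi process property / document cache

def take_action(alert):
    # Placeholder for routing, decision or messaging logic
    print(f"Action taken for alert {alert['id']}")

def handle_new_metric_anomaly(alert, metric_name="AverageExecutionTime", min_criticality=3):
    # Act only on new alerts that meet the criticality threshold and involve the watched metric
    if alert["status"] != "new" or alert["criticality"] < min_criticality:
        return
    if metric_name not in alert.get("metric_names", []):
        return
    control_tokens[alert["id"]] = alert["timestamp"]  # remember the Id for "updated" / "closed" handling
    take_action(alert)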

Case 2: As a user, I want to take automated action IF alert (updated) includes an anomaly on a specific metric (higher than normal) on a system A and node B, with criticality >= C

Store the Id as a control token, to monitor for “closed” status. Based on the alert, take the needed action (routing, decision, messaging).

Input control parameters:

  • Id

  • System name

  • Node name

  • Timestamp

  • Last updated timestamp

  • Criticality

  • Status = updated

  • Metric names with anomalies

  • Metric types with anomalies

  • Metric value at the time of anomaly per metric

  • Metrics baseline (upper/lower) on alert per metric involved

Case 3: As a user, I want to end & revert an automated action IF alert (closed) includes an anomaly on a specific metric (higher than normal) on a system A and node B, with criticality >= C

Reverts the action if the alert Id was previously stored and matched the criteria above. Based on the alert, revert the action taken (routing, decision, messaging). A sketch follows the parameter list below.

Input control parameters:

  • Id

  • Timestamp

  • Last updated timestamp

  • Status = closed
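
And a matching Case 3 sketch in Python that reverts the action once the buffered Id comes back with status closed (same assumptions as the Case 1 sketch above):

control_tokens = {}  # the same store used when the action was taken (see the Case 1 sketch)

def revert_action(alert):
    # Placeholder for undoing the routing, decision or messaging change
    print(f"Action reverted for alert {alert['id']}")

def handle_closed_metric_anomaly(alert):
    # Revert only if this Id was previously acted on
    if alert["status"] == "closed" and alert["id"] in control_tokens:
        revert_action(alert)
        del control_tokens[alert["id"]]  # the alert lifecycle has ended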

5.3 Manual action based on anomaly alert

Case 1: As a user, I want to take manual action IF alert (new) includes an anomaly on execution latency (higher than normal) on a system (atom)

Store the Id as a control token, to monitor for “updated” & “closed” status. Based on the alert, take the needed action (config change, scaling, routing, decision, messaging).

Input control parameters:

  • Id

  • System name

  • Node name

  • Timestamp

  • Criticality

  • Status = new

  • Metric names with anomalies

  • Metric types with anomalies

  • Metric value at the time of anomaly per metric

  • Metrics baseline (upper/lower) on alert per metric involved

Case 2: As a user, I want to take manual action IF alert (updated) includes an anomaly on execution latency (higher than normal) on a system (atom)

Store the Id as a control token, to monitor for “closed” status. Based on the alert, take the needed action (config change, scaling, routing, decision, messaging).

Input control parameters:

  • Id

  • System name

  • Node name

  • Timestamp

  • Last updated timestamp

  • Criticality

  • Status = updated

  • Metric names with anomalies

  • Metric types with anomalies

  • Metric value at the time of anomaly per metric

  • Metrics baseline (upper/lower) on alert per metric involved

5.4 Runtime scaling based on anomaly alert

Case 1: As a user, I want a scaling notification / memory exhaust alert sent to incident system IF anomaly alert (new) contains memory specific metric anomalies and value is > baseline

Store Id as control token, to monitor for “updated” & “closed” status. Based on alert, take needed action (config change, scaling).

Input control parameters:

  • Id

  • System name

  • Node name

  • Timestamp

  • status = new

  • Metric names with anomalies

  • Metric types with anomalies

  • Metric value at the time of anomaly per metric

  • Metrics baseline (upper/lower) on alert per metric involved
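
A Python sketch of the memory / baseline comparison in Case 1 (field names, the memory metric set and the incident call are assumptions):

MEMORY_METRICS = {"HeapMemoryUsage.used", "TotalSystemMemUsed"}  # illustrative set from the chapter 3 data

def create_incident(alert_id, metric, value, upper_baseline):
    # Placeholder for a call to the incident system
    print(f"Incident: alert {alert_id}, {metric}={value} above baseline {upper_baseline}")

def handle_memory_alert(alert):
    # Raise an incident for new alerts where a memory metric exceeds its upper baseline
    if alert["status"] != "new":
        return
    for metric in alert.get("metrics", []):
        if metric["name"] in MEMORY_METRICS and metric["value"] > metric["baseline_upper"]:
            create_incident(alert["id"], metric["name"], metric["value"], metric["baseline_upper"])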

Case 2: As a user, I want a scaling notification / memory exhaust alert sent to incident system IF anomaly alert (updated) contains memory specific metric anomalies and value is > baseline

Store Id as control token, to monitor for “closed” status. Based on alert, take needed action (config change, scaling).

Input control parameters:

  • Id

  • System name

  • Node name

  • Timestamp

  • Last updated timestamp

  • status = updated

  • Metric names with anomalies

  • Metric types with anomalies

  • Metric value at the time of anomaly per metric

  • Metrics baseline (upper/lower) on alert per metric involved

Case 3: As a user, I want a scaling notification / memory exhaust alert sent to chat IF anomaly alert (new) contains memory specific metric anomalies and value is > baseline

Store Id as control token, to monitor for “updated” & “closed” status. Based on alert, take needed action (config change, scaling).

Input control parameters:

  • Id

  • System name

  • Node name

  • Timestamp

  • status = new

  • Metric names with anomalies

  • Metric types with anomalies

  • Metric value at the time of anomaly per metric

  • Metrics baseline (upper/lower) on alert per metric involved

Case 4: As a user, I want a scaling notification / memory exhaust alert sent to chat IF anomaly alert (updated) contains memory specific metric anomalies and value is > baseline

Store Id as control token, to monitor for “closed” status. Based on alert, take needed action (config change, scaling).

Input control parameters:

  • Id

  • System name

  • Node name

  • Timestamp

  • Last updated timestamp

  • status = updated

  • Metric names with anomalies

  • Metric types with anomalies

  • Measured value on anomaly alerts per metric

  • Metrics baseline (upper/lower) on alert per metric involved

5.5 Set runtime variables based on anomaly alerts

Case 1: As a user, for an anomaly alert that contains an anomaly on a specific node and metric, with criticality >= x, I want to store Id, node criticality, status, timestamp and last updated timestamp (if status != closed)

A Boomi process “listens” for anomaly alerts matching a specific set of criteria, then stores alert fields as variables to “remember” the status for correct processing. A sketch follows the parameter list below.

Input control parameters:

  • Id

  • System name

  • Node name

  • Timestamp

  • Last updated timestamp

  • status = updated

  • Metric names with anomalies

  • Metric types with anomalies

  • Measured value on anomaly alerts per metric

  • Metrics baseline (upper/lower) on alert per metric involved
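
A minimal Python sketch of storing the alert fields as variables keyed by the alert Id (the dictionary stands in for Boomi process properties or a document cache; field names are assumptions):

runtime_variables = {}  # stands in for Boomi process properties / document cache

def store_alert_state(alert):
    # Keep the state while the alert is open; clear it when the alert closes
    if alert["status"] == "closed":
        runtime_variables.pop(alert["id"], None)
        return
    runtime_variables[alert["id"]] = {
        "node_criticality": alert.get("node_criticality"),
        "status": alert["status"],
        "timestamp": alert["timestamp"],
        "last_updated": alert.get("last_updated_timestamp"),
    }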

5.6 (Send) Endpoint throttling based on queue buildup

Case 1: As a user, for an anomaly alert that contains an anomaly on a specific node and metric, with criticality >= x, I want to control a flow shape (if status != closed)

A Boomi process “listens” for anomaly alerts matching a specific set of criteria, then stores alert fields as variables to “remember” the status for correct processing. The variable is used to control a flow shape that initiates endpoint throttling. A sketch follows the parameter list below.

Input control parameters:

  • Id

  • System name

  • Node name

  • Timestamp

  • Last updated timestamp

  • status = updated

  • Metric names with anomalies

  • Metric types with anomalies

  • Measured value on anomaly alerts per metric

  • Metrics baseline (upper/lower) on alert per metric involved
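
A Python sketch of deriving a throttle flag from such an alert, which a flow shape or decision step can then branch on (names and the threshold are illustrative):

def should_throttle(alert, min_criticality=3):
    # True while a qualifying alert is open, signalling the flow shape to throttle sends
    return alert["status"] != "closed" and alert["criticality"] >= min_criticality

# Example value a decision shape could branch on
throttle_endpoint = should_throttle({"status": "updated", "criticality": 4})
print(throttle_endpoint)  # True -> initiate endpoint throttling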

5.7 (Receive) Connector / Plans / process throttling based on queue buildup or resource exhaustion

Case 1: As a user, for an anomaly alert that contains an anomaly on a specific node and metric, with criticality >= x, I want to initiate receive throttling (if status != closed)

A Boomi process “listens” for anomaly alerts matching a specific set of criteria, then stores alert fields as variables to “remember” the status for correct processing. The variable is used to set connector properties / adjust rate limits in Plans (Contracts), or, if the properties cannot be set, to send an alert to the specified user(s) with details so manual restrictions can be applied.

Input control parameters:

  • Id

  • System name

  • Node name

  • Timestamp

  • Last updated timestamp

  • status = updated

  • Metric names with anomalies

  • Metric types with anomalies

  • Measured value on anomaly alerts per metric

  • Metrics baseline (upper/lower) on alert per metric involved

6 Anomaly Alert Data fields

Data fields returned from the Eyer anomaly query API (a typed sketch follows the list):

  • Anomaly Id

  • Status (new / updated / closed)

  • Anomaly alert timestamp

  • Last updated timestamp (if existing)

  • Criticality / score

  • Systems involved

    • Nodes involved

      • Metrics involved

        • Metric value at time of alert

        • Baseline(s) and their values per metric at time of alert

        • Metric type

  • Number of total nodes in correlation group at current alert

  • Number of total nodes in correlation group at previous alert

  • Number of total systems in the correlation group at time of alert

  • Number of total systems in the correlation group at time of previous alert
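
As a Python sketch, the returned structure could be represented roughly like this (the actual JSON key names from the API are not documented here, so all names below are assumptions):

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class MetricInvolved:
    name: str
    metric_type: str
    value_at_alert: float
    baseline_lower: Optional[float] = None
    baseline_upper: Optional[float] = None

@dataclass
class NodeInvolved:
    name: str
    metrics: List[MetricInvolved] = field(default_factory=list)

@dataclass
class SystemInvolved:
    name: str
    nodes: List[NodeInvolved] = field(default_factory=list)

@dataclass
class AnomalyAlert:
    anomaly_id: str
    status: str                              # new / updated / closed
    timestamp: int
    criticality: str                         # low / medium / high / critical
    score: int                               # 0-100
    last_updated_timestamp: Optional[int] = None
    systems: List[SystemInvolved] = field(default_factory=list)
    nodes_in_group_current: int = 0
    nodes_in_group_previous: int = 0
    systems_in_group_current: int = 0
    systems_in_group_previous: int = 0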

7 Terminology

  • Anomaly alert - created when a set of conditions is met (x number of deviations on y metrics, on z number of nodes in a correlation group). Can then be read by the Eyer Boomi connector.

  • Deviation - a metric deviating from any of its baselines at a given time

  • Baseline - the “normal” behaviour of a metric at any given time, described by an upper / lower limit. Learned automatically by our ML algorithms.

  • Correlation group - a set of correlated nodes. Correlation is automatically performed across all nodes.

  • Criticality - low / medium / high / critical. Based on how many of the total nodes in the correlation group have active deviations.

  • Score - criticality is derived from this numeric value (0-100). 100 means no deviations (no impact) on any metric on any node in a correlation group; 0 means that all metrics on all nodes in a correlation group have deviations (serious impact). The score is calculated per node and for the whole correlation group. See the illustrative sketch after this list.

  • Id - unique anomaly alert identifier. Will be maintained throughout the “lifecycle” of an anomaly alert (new / updated / closed)

  • Timestamp - the timestamp of when an anomaly alert (initial & updates) was triggered.

  • Status - an anomaly alert has a lifecycle, from when it is initiated until it is closed. In between “new” and “closed”, an anomaly alert can be updated if conditions are changing (more deviations on nodes in correlation group are detected, thus bigger impact).

  • System - typically defined by a monitoring agent (Windows server agent, Boomi agent, SQL agent, etc.)

  • Node - a logical unit with metrics (a system is also a node). For instance in a Boomi monitoring agent, there are nodes like operating_system, execution_manager, resource_manager and more.

  • Metric - a numeric value (KPI, performance metric) connected to a node as a time-series. Can also represent boolean values (1/0)
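
To make the score / criticality relationship concrete, a small illustrative calculation in Python (the exact formula used by Eyer is not documented here; a simple deviation ratio is assumed purely for illustration):

def illustrative_score(deviating_metrics, total_metrics):
    # Assumed example only: score falls from 100 (no deviations) towards 0 (all metrics deviating)
    return round(100 * (1 - deviating_metrics / total_metrics))

print(illustrative_score(3, 20))   # 3 of 20 metrics deviating -> 85, low impact
print(illustrative_score(18, 20))  # 18 of 20 metrics deviating -> 10, serious impact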
