Map your custom data source - Example 3 with LLM / ChatGPT assistance
An LLM (ChatGPT, Gemini, Mistral, etc.) can be a great help with the mapping. The key is to ask the right question and give the LLM enough information.
Before you ask the LLM for a mapping suggestion, make sure you pick the right mapping example and its output, so the LLM can “understand” how the mapping should be done.
In the example below, I provide four pieces of information to the LLM:
An example input file (JSON), in this case the input file from mapping example 1
An example mapping (JSON), again from mapping example 1
The output file that the example mapping produces (from example 1)
The new input file, together with the question of how it would be mapped based on the above
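The pieces listed above can be assembled into one prompt with a few lines of Python. This is only a minimal sketch: the function name `build_prompt` is hypothetical, and the connecting sentences are copied from the prompt used in this example.

```python
import json


def build_prompt(example_input, example_mapping, example_output, new_input):
    """Combine the example input, mapping and output with the new input.

    All four arguments are already-parsed JSON (dicts or lists).
    The wording between the JSON parts follows the prompt in this example.
    """
    parts = [
        "Learn how to do a json mapping. You have the file below",
        json.dumps(example_input),
        "then you have the JSON mapping",
        json.dumps(example_mapping),
        "you get the following output",
        json.dumps(example_output),
        "Question: how would the mapping for the following input look?",
        json.dumps(new_input),
    ]
    return "\n".join(parts)
```

Load each file with `json.load` and pass the results to `build_prompt`, then paste the returned text into the LLM of your choice.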
The new input in the example below is a single record from Influx Telegraf (SNMP in/out octets). We know that this record will repeat and may come from multiple routers and switches.
LLM prompt
Learn how to do a json mapping. You have the file below
{ "metrics": [ { "fields": { "disk_reads_mean": 1.0121461361787716, "disk_writes_per_sec_mean": 0.8571885607181682, "last_Run_mean": 1740047577489, "log_bytes_flushed_mean": 3.9596305549082618, "logical_reads_mean": 914.9966524003087, "total_sessions_mean": 14, "waits_total": 0, "waits_total_mean": 7.3 }, "name": "dbwatch", "tags": { "host": "testmachine1", "instance": "Database1", "url": "http://localhost:8080/dashboard/dbWatch%2001/data/HkRmUaLz0Z.telegraf.json" }, "timestamp": 1740047760 } ] }
then you have the JSON mapping
{ "selector": [ "tags.host", "tags.instance", "timestamp" ], "type": "telegraf.waitstats", "id_selector": "tags.instance", "root": "metrics", "system": "tags.host", "mappings": [ { "type": "telegraf.waitstats.waits", "guid": "16e41dba-e6a4-4427-b753-cf6eee41e384", "name": "dbWatch waits", "timestamp": "timestamp", "fields": [ { "name": "fields.disk_reads_mean", "display": "Disk reads", "type": "float", "aggregation": "avg", "guid": "9ed41a78-3094-4e41-98b5-b1f98f63dc3a" }, { "name": "fields.disk_writes_per_sec_mean", "display": "Disk writes per second", "type": "float", "aggregation": "avg", "guid": "1203a428-0b80-4240-8c1b-f7fa2665ec22" }, { "name": "fields.log_bytes_flushed_mean", "display": "Log bytes flushed", "type": "float", "aggregation": "avg", "guid": "2add293a-446a-468e-91c3-8f164fe697c8" }, { "name": "fields.logical_reads_mean", "display": "Logical reads", "type": "float", "aggregation": "avg", "guid": "7acb184d-c4ce-4287-ab4a-f63f15976bac" }, { "name": "fields.total_sessions_mean", "display": "Total sessions", "type": "int", "aggregation": "avg", "guid": "30c71236-48ad-4fa8-b282-f587ffe8b4b9" }, { "name": "fields.waits_total_mean", "display": "Waits total", "type": "float", "aggregation": "avg", "guid": "1f2a5a97-338b-4a92-8846-f27cd9f6b71f" } ], "name_map": "tags.instance" } ], "skip_merge": true }
you get the following output
[ { "Node": { "Name": "Database1", "SystemName": "testmachine1", "AgentId": "telegraf.waitstats", "AtomId": "Database1" }, "NodeType": { "Id": "16e41dba-e6a4-4427-b753-cf6eee41e384", "Name": "telegraf.waitstats.waits", "DisplayName": "Database waits" }, "Stats": [ { "Name": "fields.disk_reads_mean", "Type": "float", "Guid": "9ed41a78-3094-4e41-98b5-b1f98f63dc3a", "Aggregation": "avg", "Display": "Disk reads", "Value": 1.0121461361787716, "Timestamp": 1740047760 }, { "Name": "fields.disk_writes_per_sec_mean", "Type": "float", "Guid": "1203a428-0b80-4240-8c1b-f7fa2665ec22", "Aggregation": "avg", "Display": "Disk writes per second", "Value": 0.8571885607181682, "Timestamp": 1740047760 }, { "Name": "fields.log_bytes_flushed_mean", "Type": "float", "Guid": "2add293a-446a-468e-91c3-8f164fe697c8", "Aggregation": "avg", "Display": "Log bytes flushed", "Value": 3.9596305549082618, "Timestamp": 1740047760 }, { "Name": "fields.logical_reads_mean", "Type": "float", "Guid": "7acb184d-c4ce-4287-ab4a-f63f15976bac", "Aggregation": "avg", "Display": "Logical reads", "Value": 914.9966524003087, "Timestamp": 1740047760 }, { "Name": "fields.total_sessions_mean", "Type": "int", "Guid": "30c71236-48ad-4fa8-b282-f587ffe8b4b9", "Aggregation": "avg", "Display": "Total sessions", "Value": 14, "Timestamp": 1740047760 }, { "Name": "fields.waits_total_mean", "Type": "float", "Guid": "1f2a5a97-338b-4a92-8846-f27cd9f6b71f", "Aggregation": "avg", "Display": "Waits total", "Value": 7.3, "Timestamp": 1740047760 } ] } ]
Question: how would the mapping for the following input look?
{ "metrics": [ { "fields": { "ifInOctets": 2534174524, "ifOutOctets": 3408455650 }, "name": "Octets_field", "tags": { "agent_host": "192.168.88.54", "host": "MacBookPro.bambam", "hostname": "US-8-150W", "ifName": "0/3" }, "timestamp": 1741006560 } ] }
The LLM then produced the following mapping:
{
"selector": [
"tags.host",
"tags.ifName",
"timestamp"
],
"type": "telegraf.ifstats",
"id_selector": "tags.ifName",
"root": "metrics",
"system": "tags.host",
"mappings": [
{
"type": "telegraf.ifstats.octets",
"guid": "b5d4d7c9-8f52-4df5-8101-ec5c63a8a123",
"name": "Interface Octets",
"timestamp": "timestamp",
"fields": [
{
"name": "fields.ifInOctets",
"display": "Inbound Octets",
"type": "int",
"aggregation": "avg",
"guid": "8f1c9de3-91ef-4a34-bc76-e2d05d2a6e51"
},
{
"name": "fields.ifOutOctets",
"display": "Outbound Octets",
"type": "int",
"aggregation": "avg",
"guid": "d10f4182-fc7c-4d65-a8c5-b72a2e6ec351"
}
],
"name_map": "tags.ifName"
}
],
"skip_merge": true
}
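An LLM can invent field paths or malformed GUIDs, so a quick local check of a suggested mapping can be useful. The sketch below assumes that every path in the mapping is a dotted path into one record under `root`, which is how the examples on this page read. The helper names `resolve` and `check_mapping` are hypothetical and not part of the product.

```python
import uuid

# The new input and the LLM-suggested mapping from this page.
NEW_INPUT = {"metrics": [{
    "fields": {"ifInOctets": 2534174524, "ifOutOctets": 3408455650},
    "name": "Octets_field",
    "tags": {"agent_host": "192.168.88.54", "host": "MacBookPro.bambam",
             "hostname": "US-8-150W", "ifName": "0/3"},
    "timestamp": 1741006560,
}]}

MAPPING = {
    "selector": ["tags.host", "tags.ifName", "timestamp"],
    "type": "telegraf.ifstats",
    "id_selector": "tags.ifName",
    "root": "metrics",
    "system": "tags.host",
    "mappings": [{
        "type": "telegraf.ifstats.octets",
        "guid": "b5d4d7c9-8f52-4df5-8101-ec5c63a8a123",
        "name": "Interface Octets",
        "timestamp": "timestamp",
        "fields": [
            {"name": "fields.ifInOctets", "display": "Inbound Octets",
             "type": "int", "aggregation": "avg",
             "guid": "8f1c9de3-91ef-4a34-bc76-e2d05d2a6e51"},
            {"name": "fields.ifOutOctets", "display": "Outbound Octets",
             "type": "int", "aggregation": "avg",
             "guid": "d10f4182-fc7c-4d65-a8c5-b72a2e6ec351"},
        ],
        "name_map": "tags.ifName",
    }],
    "skip_merge": True,
}


def resolve(record, path):
    """Follow a dotted path such as 'tags.ifName' into a nested dict."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(path)
        current = current[key]
    return current


def check_mapping(mapping, data):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    records = data.get(mapping["root"], [])
    if not records:
        problems.append(f"root '{mapping['root']}' is missing or empty")

    # Collect every path the mapping refers to, and check each GUID.
    paths = list(mapping["selector"]) + [mapping["id_selector"], mapping["system"]]
    for node in mapping["mappings"]:
        paths += [node["timestamp"], node["name_map"]]
        paths += [field["name"] for field in node["fields"]]
        for guid in [node["guid"]] + [field["guid"] for field in node["fields"]]:
            try:
                uuid.UUID(guid)
            except ValueError:
                problems.append(f"not a valid GUID: {guid}")

    # Every path must resolve in every record under the root.
    for record in records:
        for path in paths:
            try:
                resolve(record, path)
            except KeyError:
                problems.append(f"path not found in input: {path}")
    return problems


print(check_mapping(MAPPING, NEW_INPUT))  # prints []
```

This check only catches structural mistakes. It does not replace testing the mapping against the API.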
We take this mapping and the input we want to map, and test them against our API to validate the mapping:
[
{
"Node":
{
"Name": "3/26",
"SystemName": "DESKTOP-S01F7CP",
"AgentId": "telegraf.ifstats",
"AtomId": "3/26"
},
"NodeType":
{
"Id": "b5d4d7c9-8f52-4df5-8101-ec5c63a8a123",
"Name": "telegraf.ifstats.octets",
"DisplayName": "Interface Octets"
},
"Stats":
[
{
"Name": "fields.ifInOctets",
"Type": "int",
"Guid": "8f1c9de3-91ef-4a34-bc76-e2d05d2a6e51",
"Aggregation": "avg",
"Display": "Inbound Octets",
"Timestamp": 1741167360
},
{
"Name": "fields.ifOutOctets",
"Type": "int",
"Guid": "d10f4182-fc7c-4d65-a8c5-b72a2e6ec351",
"Aggregation": "avg",
"Display": "Outbound Octets",
"Timestamp": 1741167360
}
]
},
{
"Node":
{
"Name": "3/11",
"SystemName": "DESKTOP-S01F7CP",
"AgentId": "telegraf.ifstats",
"AtomId": "3/11"
},
"NodeType":
{
"Id": "b5d4d7c9-8f52-4df5-8101-ec5c63a8a123",
"Name": "telegraf.ifstats.octets",
"DisplayName": "Interface Octets"
},
"Stats":
[
{
"Name": "fields.ifInOctets",
"Type": "int",
"Guid": "8f1c9de3-91ef-4a34-bc76-e2d05d2a6e51",
"Aggregation": "avg",
"Display": "Inbound Octets",
"Timestamp": 1741167360
},
{
"Name": "fields.ifOutOctets",
"Type": "int",
"Guid": "d10f4182-fc7c-4d65-a8c5-b72a2e6ec351",
"Aggregation": "avg",
"Display": "Outbound Octets",
"Timestamp": 1741167360
}
]
},
{
"Node":
{
"Name": "3/19",
"SystemName": "DESKTOP-S01F7CP",
"AgentId": "telegraf.ifstats",
"AtomId": "3/19"
},
"NodeType":
{
"Id": "b5d4d7c9-8f52-4df5-8101-ec5c63a8a123",
"Name": "telegraf.ifstats.octets",
"DisplayName": "Interface Octets"
},
"Stats":
[
{
"Name": "fields.ifInOctets",
"Type": "int",
"Guid": "8f1c9de3-91ef-4a34-bc76-e2d05d2a6e51",
"Aggregation": "avg",
"Display": "Inbound Octets",
"Timestamp": 1741167360
},
{
"Name": "fields.ifOutOctets",
"Type": "int",
"Guid": "d10f4182-fc7c-4d65-a8c5-b72a2e6ec351",
"Aggregation": "avg",
"Display": "Outbound Octets",
"Timestamp": 1741167360
}
]
}
]
If you compare this result to mapping example 2, you will notice that this mapping creates a new node for every switch port, each with its own metrics, instead of a single node with all metrics. This is because we asked the LLM to base its mapping on example 1.
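The per-port behaviour can be illustrated with a short Python sketch. It assumes, based only on the examples on this page, that `AtomId` comes from `id_selector`, `Name` from `name_map`, `SystemName` from `system` and `AgentId` from `type`. The function `build_nodes` is hypothetical and is not the product's implementation. The records are reconstructed from the output above, because the tested input file itself is not shown.

```python
# Only the mapping keys that decide how nodes are formed.
MAPPING = {
    "type": "telegraf.ifstats",
    "id_selector": "tags.ifName",
    "root": "metrics",
    "system": "tags.host",
    "mappings": [{"name_map": "tags.ifName"}],
}

# Records reconstructed from the API output (tags and timestamp only).
DATA = {"metrics": [
    {"tags": {"host": "DESKTOP-S01F7CP", "ifName": "3/26"}, "timestamp": 1741167360},
    {"tags": {"host": "DESKTOP-S01F7CP", "ifName": "3/11"}, "timestamp": 1741167360},
    {"tags": {"host": "DESKTOP-S01F7CP", "ifName": "3/19"}, "timestamp": 1741167360},
]}


def resolve(record, path):
    """Follow a dotted path such as 'tags.ifName' into a nested dict."""
    value = record
    for key in path.split("."):
        value = value[key]
    return value


def build_nodes(mapping, data):
    """Create one node per distinct id_selector value (assumed behaviour)."""
    nodes = {}
    for record in data[mapping["root"]]:
        atom_id = resolve(record, mapping["id_selector"])
        if atom_id not in nodes:
            nodes[atom_id] = {
                "Name": resolve(record, mapping["mappings"][0]["name_map"]),
                "SystemName": resolve(record, mapping["system"]),
                "AgentId": mapping["type"],
                "AtomId": atom_id,
            }
    return list(nodes.values())


# One node per switch port, as in the output above.
print(len(build_nodes(MAPPING, DATA)))  # prints 3

# In this sketch, keying on the host instead collapses the records into one node.
# This only illustrates the sketch; it is not necessarily how example 2 is mapped.
print(len(build_nodes(dict(MAPPING, id_selector="tags.host"), DATA)))  # prints 1
```

Under these assumptions, the number of nodes follows directly from the number of distinct values behind `id_selector`, which is why the example 1 pattern produces one node per port here.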