Linux AMA syslog agents: How to identify DCRs that are causing duplicate data collection

If you’re using the Microsoft AMA agent, you’re likely familiar with Data Collection Rules.

This tip is specifically for AMA agents installed on Linux servers for the purpose of collecting syslog data.

It’s pretty easy to create 2 or more DCRs that overlap in their logic and end up collecting duplicate data. A common example is duplicate syslog data showing up in both the Syslog and CommonSecurityLog tables.

It can be difficult to read through all of your DCRs to find the duplicate configuration.

One approach to fixing this issue is to log in to the server where the AMA agent is installed and look at the JSON files under:

/etc/opt/microsoft/azuremonitoragent/config-cache/configchunks/

Each JSON file represents a single Data Collection Rule. Here’s an example; pay attention to the value following “agentConfigurations/dcr-<some alphanumeric>”.

You’ll need that value to trace back to the DCR configuration in the Azure portal.
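To save some manual digging, here’s a minimal sketch (my own, not a Microsoft tool) that scans the configchunks directory and prints the dcr-… IDs each cached chunk references. The directory path and the dcr-<hex> pattern are taken from the observations above.

```python
# Sketch: map each cached AMA config chunk to the DCR immutable IDs it mentions.
import glob
import os
import re

def list_dcr_ids(config_dir):
    """Map each cached .json config chunk to the dcr-... IDs found inside it."""
    results = {}
    for path in sorted(glob.glob(os.path.join(config_dir, "*.json"))):
        with open(path, encoding="utf-8") as f:
            text = f.read()
        # DCR immutable IDs look like "dcr-<hex>" in the agentConfigurations path
        ids = sorted(set(re.findall(r"dcr-[0-9a-f]+", text)))
        results[os.path.basename(path)] = ids
    return results

if __name__ == "__main__":
    chunks = list_dcr_ids(
        "/etc/opt/microsoft/azuremonitoragent/config-cache/configchunks"
    )
    for name, ids in chunks.items():
        print(name, ids)
```

Run it on the server and compare its output against the Resource Graph query results described next.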

Now go to the Azure Portal and open the Resource Graph Explorer:

Run this query to get a list of your DCRs and their associated “dcr-xxx” values:

resources
| where type == 'microsoft.insights/datacollectionrules'
| extend immutableId = properties.immutableId
| project name, immutableId

Once you’ve identified the offending DCR, you can simply delete it; after a few minutes the .json file will disappear from your AMA’s /configchunks/ directory.

Restarting the AMA agent might speed up the process of the json file being removed:

systemctl restart azuremonitoragent

or:

cd /var/lib/waagent/Microsoft.Azure.Monitor.AzureMonitorLinuxAgent-<agent version>
./shim.sh -disable
./shim.sh -enable

ls /etc/opt/microsoft/azuremonitoragent/config-cache/configchunks/*.json

Getting Started With Defender for IoT/OT

Pressure is increasing on manufacturers to monitor their shop floors in order to avoid major disruptions in supply chains. A recent example of such a risk is CVE-2023-3595. This vulnerability has a CVSS score of 9.8 (i.e., very bad).

It involves the use of CIP (Common Industrial Protocol). As such, you wouldn’t expect the typical IOCs like IP addresses and hashes that you could add to your SIEM to detect this vulnerability. Instead, you would need to sniff your factory network, looking for malicious use of the CIP protocol. This is where OT security tools like Defender for OT come in. (Since we’re just talking about OT here, I’m going to drop the IoT…)

Here’s a quick walkthrough to getting started with Defender for OT:

Getting Started with Defender for OT

  • Log in to the Azure portal, search for Defender for OT, and select ‘Set up OT/ICS Security’.
  • Download the sensor ISO and install it in a hypervisor like Hyper-V or VMware. (When setting up your VM, make sure to use at least 2 network interfaces – 1 for management and 1 for sniffing.)
  • Connect the ‘sniffing’ network interface of your VM to a SPAN port on your network. If you’re just playing around with the sensor, you can have it sniff your home network or whatever is safe for you to monitor.
  • After the sensor installation you will be given 3 unique credentials for logging in to the sensor’s web interface, so don’t lose those credentials.
  • Get a license – there’s a 60-day trial license in the M365 admin center:
    https://learn.microsoft.com/en-us/azure/defender-for-iot/organizations/getting-started
  • Go back to the Azure portal and register your sensor. Once the sensor is registered, you’ll be given a zip file that serves as the license key for that sensor.

How to Test Your Sensor:

  • Download some sample pcap files, e.g. from here: https://github.com/EmreEkin/ICS-Pcaps
  • Log in to your sensor (https://<ip address of sensor>) with the username ‘cyberx’ and the password that was given to you during the sensor installation.
  • Go to System Settings > Play Pcap, and upload one of your sample pcap files.
  • After selecting ‘play all’, your sensor will begin analyzing your pcap traffic.
  • If nothing interesting is seen in the alerts tab you may need to create a custom alert to trigger some alerts. Some experience with network traffic analysis and Wireshark can be very useful.
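Before uploading, it can help to sanity-check that a downloaded file really is a capture file. Here’s a small sketch of my own (just checking the file’s magic number in pure Python; not part of Defender for OT):

```python
# Sketch: identify whether a file is pcap or pcapng by its magic number.
import struct

# Classic pcap magic numbers (little/big endian, microsecond/nanosecond variants)
PCAP_MAGICS = {0xa1b2c3d4, 0xd4c3b2a1, 0xa1b23c4d, 0x4d3cb2a1}
# pcapng files start with a Section Header Block type of 0x0A0D0D0A
PCAPNG_MAGIC = 0x0a0d0d0a

def capture_type(path):
    """Return 'pcap', 'pcapng', or None based on the file's first 4 bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if len(head) < 4:
        return None
    (magic,) = struct.unpack("<I", head)
    if magic in PCAP_MAGICS:
        return "pcap"
    if magic == PCAPNG_MAGIC:
        return "pcapng"
    return None
```

If `capture_type` returns None, the download probably grabbed an HTML error page instead of a capture.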

Next Steps: Connect Defender for OT to Sentinel

  • Back in the Azure portal, go to the Content Hub and install the Defender for OT solution bundle.
  • Now go to Connectors and enable the Defender for IoT connector.
  • Finally, go to Analytics, search the templates for all of the OT rules, and enable whatever you like.

References:
https://learn.microsoft.com/en-us/azure/defender-for-iot/organizations/ot-deploy/install-software-ot-sensor

https://www.netresec.com/?page=PcapFiles

Highlights of Microsoft Build 2023!

Although much of Microsoft Build is centred around helping developers, there’s plenty for you and me as well, like avatars in Teams! Well, read on…

Microsoft MVPs get early access to the full list of topics from Microsoft Build so we can review all of the topics and then share back to the community with some fresh perspectives right after the public release date has passed.

So here’s my pick of interesting topics, hot off the press, in case you decide to check them out at https://build.microsoft.com/en-US/home or https://news.microsoft.com/source

For a full list of Microsoft Build topics go here.

Configuring the ‘NEW’ AMA (and Arc) agent to forward syslog to Sentinel

(Note: I have an older blog on this topic but based on new insights this article supersedes the old one)

Microsoft has already labeled the old ‘OMS’ syslog collector agent as legacy, so it’s important to think about switching to the new AMA agent.

If you have the word ‘syslog’ stuck in your brain you might think that in order to configure syslog you’d go into the Sentinel data connectors and search for ‘syslog’ – don’t do this.

The procedure for getting syslog into Sentinel is a bit different for VMs that are in Azure vs on-prem.

I’ve tested the recommendations below with the latest versions of Red Hat and Ubuntu, on both on-prem VMs and VMs in Azure.

For Azure VMs:
Create a DCR and configure your syslog facilities. Done.

– Note that a DCR is a Data Collection Rule. This is a fairly new way to define what data needs to be collected. It somewhat hides the underlying method of collection – e.g. you tell it to collect syslog from a list of servers in scope, and it takes care of communicating with the AMA agent to make it happen.

– In Sentinel, you don’t need to do anything! The DCR points the data to the servers in scope and the log analytics workspace.

For an on-prem VM
– Just make sure you install the Arc agent first, then create your DCR for syslog, just like for the Azure VM. Done!

A very simple test:

On your Linux server, SSH in and type “logger testing123”

In Sentinel > Logs, type “search testing123”. You should see your logs show up in the Syslog table within about 5-10 minutes, depending on when you pushed out your DCR.

(TIP: If you don’t see your logs in Sentinel but you see them with tcpdump, then check your firewall rules, eg. by default RedHat7 will block incoming syslog)

To check if your logs are even getting to your syslog server, use tcpdump, e.g.:
tcpdump -i any port syslog -A -s0 -nn

And note that you might see your logs with the above tcpdump command while they’re still not getting to Sentinel. In that case, check whether the local firewall rules are blocking syslog.
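If logger isn’t handy, or you want to test from another machine, the same check can be scripted. A minimal sketch; the host, port, and facility/severity values are assumptions for a default UDP syslog listener, so adjust for your environment:

```python
# Sketch: send one syslog-style test message over UDP, like "logger testing123".
import socket

def send_syslog(message, host="127.0.0.1", port=514, facility=1, severity=6):
    """Send one RFC 3164-style syslog message over UDP and return the payload."""
    pri = facility * 8 + severity          # facility 1 (user) + severity 6 (info) -> <14>
    payload = f"<{pri}>{message}".encode()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(payload, (host, port))
    return payload

if __name__ == "__main__":
    print(send_syslog("testing123"))
```

Watch for the message with the tcpdump command above, then search for it in Sentinel.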

If you need to troubleshoot the actual AMA agent on linux (or windows), there’s a script for that here.

What about CEF Logs?

If you need to collect CEF formatted syslog then you will need to go into Sentinel > Data Connectors – and search for “Common Event Format (CEF) via AMA (Preview)”. There’s a script in here you’ll need to run for both your Azure VMs and your on-prem linux VMs/servers.

The AMA/CEF script is commonly used for when you are trying to collect CEF logs from somewhere like PaloAlto firewalls and you want to forward them into Sentinel.

After installing the CEF forwarder script, create a DCR to forward the logs to Sentinel’s workspace. MAKE SURE YOU CREATE THIS DCR from Sentinel > Data Connectors > CEF via AMA. This will provide a special CEF configuration containing the string ‘SECURITY_CEF_BLOB’ inside a JSON file located at /etc/opt/microsoft/azuremonitoragent/config-cache/configchunks/nnn.json. If this configuration doesn’t exist, your CEF data will only be forwarded to the Syslog table, not the CommonSecurityLog table.

Tip: Don’t use the same syslog facility for CEF as you use for syslog, or you may get log duplication. E.g. if you use ‘user.info’ for CEF, don’t use it for your ‘regular’ syslog DCR.
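A rough way to spot such overlaps without reading every DCR by hand: grep the cached config chunks for facility names and flag any facility that shows up in more than one chunk. This is a heuristic over the raw JSON text (I’m deliberately not assuming a particular schema), so treat the output as a pointer, not proof:

```python
# Sketch: flag syslog facilities that appear in more than one cached DCR chunk.
import glob
import os
import re
from collections import defaultdict

# Common syslog facility names to look for in each cached chunk
FACILITIES = ["auth", "authpriv", "cron", "daemon", "user"] + [
    f"local{i}" for i in range(8)
]

def facility_overlaps(config_dir):
    """Return facilities that appear (quoted) in more than one config chunk."""
    seen = defaultdict(list)
    for path in sorted(glob.glob(os.path.join(config_dir, "*.json"))):
        with open(path, encoding="utf-8") as f:
            text = f.read().lower()
        for fac in FACILITIES:
            if re.search(rf'"{fac}"', text):
                seen[fac].append(os.path.basename(path))
    return {fac: files for fac, files in seen.items() if len(files) > 1}
```

Any facility listed in two chunks is a candidate for the CEF/syslog duplication described above.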

There’s a great troubleshooter script for CEF:

Read this or jump to the bottom of the page and run the troubleshooter:


sudo wget -O sentinel_AMA_troubleshoot.py https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/DataConnectors/Syslog/Sentinel_AMA_troubleshoot.py && sudo python sentinel_AMA_troubleshoot.py

Don’t get this confused with the ‘old’ CEF troubleshooter that was created for the OMS agent.

After running all of the troubleshooting tests, go to the Sentinel logs and look for the ‘MOCK’ test logs:

CommonSecurityLog
| where DeviceProduct == "MOCK"

If you need to run the MOCK test manually, here’s the command. (Sometimes the troubleshooting script doesn’t catch the test log, so you may need to run it in another terminal session while the script is running.)


echo -n "<164>CEF:0|Mock-test|MOCK|common=event-format-test|end|TRAFFIC|1|rt=$common=event-formatted-receive_time" | nc -u -w0 <YOUR IP ADDRESS> 514

Another tip: Keep the /var mount point separate from root and give it about 500GB of disk space just for /var. Keep an eye on your disk space and increase this if necessary.

Another tip: There’s no need to keep the incoming syslog data in /var/log, so disable any default filters in /var/log/syslog.d/* that would store data there. The AMA agent queues the data it needs in /var/opt/microsoft/azuremonitoragent/events/user.info and clears it out daily or sooner. If you don’t disable these default filters, you may need double the disk space for /var.
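The /var sizing tip can be paired with a tiny watchdog; the threshold here is my own illustration:

```python
# Sketch: warn when a mount point (e.g. /var) is running low on free space.
import shutil

def check_mount(path="/var", warn_free_pct=20.0):
    """Return ("OK"|"LOW", free-percent) for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    free_pct = usage.free / usage.total * 100
    status = "OK" if free_pct >= warn_free_pct else "LOW"
    return status, round(free_pct, 1)

if __name__ == "__main__":
    print(check_mount())
```

Drop something like this into cron if you want an early warning before the agent’s queue fills the disk.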

References:

https://learn.microsoft.com/en-us/azure/sentinel/forward-syslog-monitor-agent

https://learn.microsoft.com/en-us/azure/sentinel/connect-cef-ama

OpenAI – What Should be Monitored?

Since the explosion of publicly accessible OpenAI, how to monitor its use within an organization has been a frequently asked question.

Below are some topics relevant to the most common OpenAI services/features available today. Consider using these topics/suggestions as a starting point to creating a scope of topics relevant to security governance, and to help develop security policies for your organization.

Publicly Accessible OpenAI services

  • Description: Web sites like OpenAI’s ChatGPT provide a wealth of knowledge and an opportunity to accelerate a user’s knowledge on an infinite number of topics.
  • Security Policy Consideration: Pasting corporate information into a public-facing site of any kind should be prohibited.

Corporate Licensed OpenAI services

  • Description: OpenAI resources such as Azure OpenAI can be enabled at low cost within the cloud. These AI models can be customized to solve complex challenges within an organization or provide public facing features which enhance a corporation’s service offerings.
  • Security Policy Consideration: Creation of resources in OpenAI-based tools such as Azure OpenAI Studio and PowerApps should be controlled and monitored by the security team.

End User OpenAI Related Productivity Tools

  • Description: Microsoft’s Copilot is an example of the end-user OpenAI tools that will change the way people work, and it will have a dramatic effect on their productivity.
  • Security Policy Consideration: Authorized use of AI tools, such as Copilot should be monitored.

Be aware of ‘Self-Aware’ OpenAI Tools

Description: If you’ve used Auto-GPT, you might be concerned about the ability of OpenAI tools to be given full root/admin control to do whatever it takes to provide the answer to a question. This includes creating scripts, adding/deleting files, and even rebooting your PC.

Security Policy Consideration: Any open-source OpenAI tools running on end-user PCs or on servers should be strictly monitored and approved for use.

Security Monitoring and Best Practices

  • All AI-generated activity should be monitored via EDR, CASB, SIEM, etc.
  • Discuss with your vendors the best practices on how their OpenAI tools can be monitored.
  • Test/simulate the use of each OpenAI tool and validate your ability to monitor its activities, including individual user access and change controls.

Creating Your Own Threat Actor Research Bot

There is nothing perfect about auto-gpt, but like chatgpt it’s another tool that, used creatively, can achieve amazing things I wouldn’t have even considered doing 2 months ago.

If you want to read about my odd path of discovery in building this script, see the short story below, otherwise just enjoy the script.

Ramon Gomez on LinkedIn had the idea of using auto-gpt to find threat actors in the news as they relate to the United States energy sector.

His attempts at using auto-gpt failed, but I gave it a try anyway.

Sure enough it failed for me too, but I carefully read the output from auto-gpt and I could see what it was trying to do:

  • Download the enterprise-attack.json file from Mitre. This is a full ‘database’ of all things Mitre ATT&CK, and it includes information about threat actors and some of the industries they’re associated with.
  • Create and run a python script that reads enterprise-attack.json and extracts the threat actors associated with the US energy sector. This script had syntax errors so it was never going to run, but it tried…
  • Find a list of reliable news websites related to cyber news. This worked, so I had a list of possible sites, but they weren’t perfect…
  • Create another python script that scrapes the news sites for information associated with the threat actors. Again it tried and failed.

Although auto-gpt tried and failed, it had an excellent approach to the problem.
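The first two steps above can be done by hand in a few lines. Note that ATT&CK doesn’t tag intrusion sets with industries as a structured field, so keyword-matching the description text, as below, is a rough heuristic rather than a proper sector mapping:

```python
# Sketch: pull threat-actor ("intrusion-set") names from an ATT&CK STIX bundle
# whose descriptions mention a sector keyword such as "energy".
import json

def actors_mentioning(data, keyword="energy"):
    """Return names of intrusion-set objects whose description mentions keyword."""
    hits = []
    for obj in data.get("objects", []):
        if obj.get("type") == "intrusion-set":
            text = (obj.get("description") or "").lower()
            if keyword.lower() in text:
                hits.append(obj.get("name"))
    return hits
```

Load the downloaded enterprise-attack.json with `json.load` and pass the resulting dict to `actors_mentioning`.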

And using ‘regular’ chatgpt I was able to ask the same sorts of questions and get much better answers.

Finally, as a result, chatgpt (and I) came up with the script you see below.

Note that this script has flaws (some of the URLs aren’t useful, but some are), but it does in fact work! Enjoy.

import requests
from bs4 import BeautifulSoup

# Define a dictionary of threat actor names and their aliases
threat_actors = {
    'APT1': ['Comment Crew'],
    'Lazarus': ['Lazarus'],
    'APT29': ['Cozy Bear'],
    'APT32': ['OceanLotus Group']
}

# Define the URLs of the news resources
urls = [
    'https://www.fireeye.com/blog/threat-research.html',
    'https://www.kaspersky.com/blog/tag/apt',
    'https://www.ncsc.gov.uk/news/reports',
    'https://thehackernews.com/search/label/apt',
    'https://www.recordedfuture.com/apt-group-threat-intelligence/',
    'https://www.anomali.com/blog/threat-research'
]

# Loop through the URLs and extract relevant web page URLs

webpage_urls = []
for url in urls:
    html = requests.get(url).text
    soup = BeautifulSoup(html, 'html.parser')
    for link in soup.find_all('a'):
        href = link.get('href')
        for actor in threat_actors:
            if actor in link.text or any(alias in link.text for alias in threat_actors[actor]):
                webpage_urls.append(href)

# Print the extracted webpage URLs
for url in webpage_urls:
    print(url)

Cheat Sheet for Configuring Carbon Black Cloud (EDR) for Sentinel

(And a pretty good example of configuring ANY Sentinel data connectors that use APIs and Azure Functions.)

There are several ways to get data into Sentinel.

If your log source requires an API pull, it’s very likely that Sentinel will use an Azure Function app to pull the data.

Vendors have many uses for their APIs so you can’t assume that theirs was built just to please your SIEM.

As a result there is often a bit of a learning curve when trying to understand how to get API logs into Sentinel and how Azure functions play a role.

Here are the 3 basic steps for getting your API logs into Sentinel:

  • Spend some time to understand the vendor’s API – List out all of the available API options. For Carbon Black, there are over 15 APIs!!! But you need just 2 of them for Sentinel.
  • Collect the variables needed to configure the Azure Function – for Carbon Black there are about 15 variables!!! (see cheat sheet below). You’ll need the help of your friendly AWS admin as well as your Carbon Black admin.
  • Get familiar with how to troubleshoot a vendor’s API and Azure Functions – If you’re familiar with curl, wget and/or Postman, then you’re on your way, but that subject is out of scope for this article. For Azure Functions it’s mostly about being able to tweak your variables and how to monitor the output logs (see cheat sheet below).
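Most of these Azure Function connectors boil down to the same pattern: page through the vendor’s API until it stops returning results, then ship each batch to Sentinel. Here’s a generic sketch of the paging half; `fetch_page` stands in for whatever call your vendor’s API needs, and nothing here is Carbon Black’s actual API:

```python
# Sketch: the offset/limit paging loop at the heart of most API-pull connectors.
def fetch_all(fetch_page, start=0, page_size=100):
    """Collect every result from a paged API via a caller-supplied fetch_page(offset, limit)."""
    items = []
    offset = start
    while True:
        batch = fetch_page(offset, page_size)
        if not batch:               # empty page means we've drained the API
            break
        items.extend(batch)
        offset += len(batch)
    return items
```

Once you can see this shape inside a connector’s Function code, the 15 variables are mostly just inputs to `fetch_page`.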

My cheat sheet for Sentinel Carbon Black Cloud (EDR) Connector

Converting this cheat sheet from Excel to fit in this blog is beyond my skills, so I’ve attached a pdf version.

I’m confident that if you can configure Carbon Black Cloud, you’ll probably have no issue with most other Azure Function API connectors for Sentinel.

I doubt any other connectors will have this many variables!

Addendum: Which audit logs should I collect?

When configuring this connector, there are 4 check boxes to choose from. Here’s a brief explanation of them:

CB Alerts Only:
If you just need the CB alerts, then just check off ‘Alert – supported with SIEM API credentials’. This is a good starting point since you don’t have to worry about connecting to AWS.

CB Alerts with All Events
If you need all of the events associated with the alerts, then check off these 2:
Event – supported with aws s3 bucket credentials
Alert – supported with aws s3 bucket credentials
And then proceed to set up an s3 bucket along with everything else.

Audit
These are just basic Carbon Black audit logs, no alerts/events here.
And there’s no need for S3 buckets, so you can enable it with no difficulty.

Create A Powerful Threat Hunter for Microsoft 365 Defender

If you’re a user of Microsoft Sentinel, you’re likely familiar with its Threat Hunting feature, which lets you run hundreds of KQL queries in a matter of seconds.

Unfortunately the Threat Hunting in the M365 Defender portal doesn’t have this feature, so you’re stuck running your hunting queries one at a time.

So I’ve created a proof-of-concept script that provides some threat hunting automation by taking the 400+ threat hunting queries in the Microsoft Sentinel Github repository and feeding them into the M365 Defender ThreatHunting API.

Requirements
  • a unix shell from which you can run python and git commands
  • python3, git command, and the python modules you see at the top of the scripts attached below
  • admin access to Azure to set up an app registration
  • operational use of the M365 Defender portal (https://security.microsoft.com)
Caveats

There are some known performance limitations to using Defender advanced threat hunting, so although the script may seem to be working, there could be timeouts happening in the background if Defender decides you’re using too many resources. This script doesn’t have any built in error checking so re-running the script or validating the queries within the Defender portal may be required.
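Since the script has no built-in error checking, one low-effort mitigation is a retry wrapper around each API call. A sketch you could fold into the script (the backoff values are arbitrary):

```python
# Sketch: retry a callable a few times with a growing delay, for flaky API calls.
import time

def with_retries(func, attempts=3, delay=1.0):
    """Call func(), retrying with an increasing delay if it raises."""
    for i in range(attempts):
        try:
            return func()
        except Exception:
            if i == attempts - 1:   # out of attempts: surface the error
                raise
            time.sleep(delay * (i + 1))
```

For example, wrap the hunting-query POST as `with_retries(lambda: requests.post(url, headers=headers, json=body))`.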

The Setup Procedure:
  1. Create an app registration in Azure. Configuring an app registration is out of the scope of this article, but there are plenty of examples on how to do this. What’s important are the permissions you’ll need to allow:
    • ThreatHunting.Read.All
    • APIConnectors.ReadWrite.All
  2. From the app registration created in step #1, copy the tenantID, clientID and the secret key, and paste them into the script I’ve provided below.
  3. Create a directory named ‘queries’. This will be used to store the .yaml files from Github. These files contain the hunting queries.
  4. In the same directory, download the Github repository using this command:
    • git clone https://github.com/Azure/Azure-Sentinel.git
  5. Now it’s time to create and run the first of 2 scripts. This first script should require no changes. Its purpose is to scan the Azure-Sentinel subdirectories (that you downloaded in step #4) for .yaml files and copy them all to the ‘queries’ directory.
    • Here’s the script; name it something like get_yaml.py and then run it like: “python3 get_yaml.py”
# python script name: "get_yaml.py"
import os
import shutil

# Set the directory paths
source_dir = 'Azure-Sentinel/Hunting Queries/Microsoft 365 Defender/'
target_dir = 'queries'

# Recursively search the source directory for YAML files
for root, dirs, files in os.walk(source_dir):
    for file in files:
        if file.endswith('.yaml'):
            # Create the target directory if it doesn't already exist
            os.makedirs(target_dir, exist_ok=True)

            # Copy the file to the target directory
            source_file = os.path.join(root, file)
            target_file = os.path.join(target_dir, file)
            shutil.copy2(source_file, target_file)

            # Print a message to indicate that the file has been copied
            print(f'Copied {source_file} to {target_file}')

6. Edit this next script (below) and insert the tenantID, clientID and secret from the Azure app registration, as mentioned above in step #1. Name the script something like threathunt.py and run it with the python (or python3) command: “python3 threathunt.py”.

  • Note that the script below expects the .yaml files to be in a folder named “queries” (from when you ran the first script above).
  • After running the script below, look in the queries folder for any new files with a .json extension.
  • If the query ran successfully and generated results, a .json file will be created with the same filename as the matching .yaml file (eg. test.yaml > test.json)
import requests
import os
import yaml
import json
from azure.identity import ClientSecretCredential

# Replace the values below with your Azure AD tenant ID, client ID, and client secret
client_id = "YOUR APP REGISTRATION CLIENT ID"
client_secret = "YOUR APP REGISTRATION SECRET KEY"
tenant_id = "YOUR TENANT ID"
scope = "https://graph.microsoft.com/.default"

authority = "https://login.microsoftonline.com/"

# Create a credential object using the client ID and client secret
credential = ClientSecretCredential(
    authority=authority,
    tenant_id=tenant_id,
    client_id=client_id,
    client_secret=client_secret
)

# Use the credential object to obtain an access token
access_token = credential.get_token("https://graph.microsoft.com/.default").token

# Print the access token
print(access_token)

# Define the headers for the API call
headers = {
    "Authorization": f"Bearer {access_token}",
    "Content-Type": "application/json",
    "Accept": "application/json"
}

# Define the API endpoint
url = "https://graph.microsoft.com/v1.0/security/microsoft.graph.security/runHuntingQuery"

# Define the path to the queries directory
directory_path = "queries"

# Loop through all YAML files in the queries directory
for file_name in os.listdir(directory_path):
    if file_name.endswith(".yaml") or file_name.endswith(".yml"):
        # Read the YAML file
        with open(os.path.join(directory_path, file_name)) as file:
            query_data = yaml.load(file, Loader=yaml.FullLoader)
            # Extract the query field from the YAML data
            query = query_data.get("query")
            if query:
                print(query)
                # Define the request body
                body = {
                    "query": query
                }
                # Send the API request
                response = requests.post(url, headers=headers, json=body)
                # Parse the response as JSON
                json_response = response.json()

                if json_response.get('results') and len(json_response['results']) > 0:
                    # print(json.dumps(json_response, indent=4))
                    output_file_name = os.path.splitext(file_name)[0] + ".json"
                    # Write the JSON response to the output file
                    with open(os.path.join(directory_path, output_file_name), "w") as output_file:
                        output_file.write(json.dumps(json_response, indent=4))

Once you have the results from your threat hunt you could:

  • Log into the M365 Defender portal and re-run the hunting queries that generated data. All of these YAML queries from Github should be in the Threat Hunting > Queries bookmarks. If not, you can manually copy/paste the query from inside the YAML file directly into the query editor and play around with it.
  • Import the data into Excel or Power BI and play around with the results.
  • Create another python script that does something with the resulting .json files like aggregate the fields and look for commonalities/anomalies.
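The last bullet can be started with a few lines: tally how many rows each hunting query returned, based on the .json files the threathunt.py script writes into the queries folder:

```python
# Sketch: summarize which hunting queries produced results, and how many rows each.
import glob
import json
import os

def summarize_results(directory="queries"):
    """Map each results .json file to the number of rows it contains."""
    summary = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.json"))):
        with open(path) as f:
            data = json.load(f)
        summary[os.path.basename(path)] = len(data.get("results", []))
    return summary

if __name__ == "__main__":
    for name, rows in summarize_results().items():
        print(f"{rows:6d}  {name}")
```

Sorting that output by row count is a quick way to decide which queries to re-run in the portal first.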

Happy Hunting!

Use ChatGPT to Search Mitre ATT&CK More Effectively

My python fu has never been above beginner, so writing scripts to use Mitre’s ATT&CK json files was always a hit and miss effort.

So I asked chatgpt to write it for me, and after several back-and-forth tweaks and coding errors, ‘we’ came up with these 2 scripts, which I find pretty useful.

To use the scripts, simply run “python <script>” and each will dump its results to a csv file (the first script requires you to first download the json file, but the 2nd one doesn’t – see comments).

If you don’t like them exactly the way they are, paste them into chatgpt and simply ask it to make some modifications, eg:

  • Add a header row
  • Sort by the first column
  • Only include these fields: technique_id,technique_name,tactic_name,platforms
  • Use a comma as the field separator
  • Rather than reading from a local json file, read from the web version of enterprise-attack.json

ATT&CK_Tactic_Technique_LogSource.py
import json

# Load the MITRE ATT&CK Enterprise Matrix from the JSON file
# https://raw.githubusercontent.com/mitre/cti/master/enterprise-attack/enterprise-attack.json

with open('enterprise-attack.json', 'r') as f:
    data = json.load(f)

# Open a file to write the output to
with open('output.csv', 'w') as f:
    # Print the header row
    f.write("technique_id,technique_name,tactic_name,platforms,permissions_required\n")

    # Loop through each technique in the JSON file and print its fields
    for technique in data['objects']:
        # Extract the technique ID and name
        if 'external_references' in technique and len(technique['external_references']) > 0:
            technique_id = technique['external_references'][0].get('external_id', '')
        else:
            technique_id = ''
        technique_name = technique.get('name', '')

        # Extract the required platforms, if any
        platforms = ",".join(technique.get('x_mitre_platforms', []))

        # Extract the required permissions, if any
        permissions_required = ",".join(technique.get('x_mitre_permissions_required', []))

        # Extract the tactic name, if any
        tactic_name = ""
        if 'kill_chain_phases' in technique and len(technique['kill_chain_phases']) > 0:
            tactic_name = technique['kill_chain_phases'][0].get('phase_name', '')

        # Write the technique fields to the output file in CSV format
        if technique_id and technique_name:
            f.write(f"{technique_id},{technique_name},{tactic_name},{platforms},{permissions_required}\n")

# Read the contents of the output file into a list of lines
with open('output.csv', 'r') as f:
    lines = f.readlines()

# Sort the lines based on the technique_id column
lines_sorted = sorted(lines[1:], key=lambda x: x.split(',')[0])

# Write the sorted lines back to the output file
with open('output.csv', 'w') as f:
    f.write(lines[0])
    f.writelines(lines_sorted)

ATT&CK_All_Raw_Fields.py
import json
import urllib.request

# Load the JSON file from the URL
url = "https://raw.githubusercontent.com/mitre/cti/master/enterprise-attack/enterprise-attack.json"
response = urllib.request.urlopen(url)
data = json.loads(response.read())

# Create a list of all the field names
field_names = []
for technique in data["objects"]:
    for field in technique:
        if field not in field_names:
            field_names.append(field)

# Add a header column to the field names
field_names_with_header = ["Header"] + field_names

# Write the data to a file with ";" as the delimiter
with open("enterprise-attack.txt", "w") as txt_file:
    # Write the header row
    header_row = ";".join(field_names_with_header) + "\n"
    txt_file.write(header_row)

    # Write the data rows
    for i, technique in enumerate(data["objects"]):
        values = [str(technique.get(field, "")).replace("\n", "") for field in field_names]
        row = f"T{i+1};" + ";".join(values) + "\n"
        txt_file.write(row)

Configuring Sentinel to Collect CEF Syslog with the AMA agent

(NOTE: I suggest you read my new blog post on this topic here. Some of the suggestions below may be outdated)

Here’s a quick guide to installing the AMA agent for collecting syslog data into Microsoft Sentinel.

Note: the ‘legacy’ method involved installing the old OMS agent and a separate step for the CEF collection script. The new procedure only requires a single python script, but it also requires the Azure Arc agent (which arguably is an added benefit for many reasons we won’t get into here).

Note: the AMA agent relies on the Azure Arc agent, unless your syslog server is in the Azure cloud.

This procedure is using Ubuntu 20.04, but it will be almost identical for most other linux platforms:

  1. Install Python3 and other tools:
    • sudo apt-get update;sudo apt-get install python3;sudo apt-get install net-tools;sudo ln -s /usr/bin/python3 /usr/bin/python
  2. Install the Arc agent.
    • (if you haven’t done this before, go to the Azure portal and search for Arc.)
    • Verify: In the Azure portal, verify the server shows up in the list of Arc servers.
  3. In Sentinel, enable the CEF AMA connector and Create a DCR (data collection rule).
    • Enable these syslog Facilities: LOG_AUTH, LOG_AUTHPRIV and all of the LOG_LOCAL*. Use the log level of LOG_DEBUG.
    • WAIT 5 MINUTES for the Arc agent heartbeats to get up to Sentinel.
    • Verify: In the Sentinel log analytics workspace, query the ‘Heartbeat’ table – verify your arc server is here. 
  4. Install CEF AMA for syslog:

In Sentinel, verify the CEF logs are parsing by querying the CommonSecurityLog table, e.g.:

CommonSecurityLog
| take 10

References:

https://learn.microsoft.com/en-us/azure/sentinel/connect-cef-ama

https://learn.microsoft.com/en-us/azure/sentinel/forward-syslog-monitor-agent#configure-linux-syslog-daemon

https://learn.microsoft.com/en-us/azure/sentinel/data-connectors/common-event-format-cef-via-ama