
A Python-based application to consume, convert and output information from the Microsoft 365 Endpoint API. Currently outputs CSV, Squid3+ configuration via a template, Puppet-Squid type Hiera YAML and an untested Palo Alto EDL.


m365-endpoint-api-digester

A Microsoft 365 endpoint API utility

For information see: https://docs.microsoft.com/en-us/microsoft-365/enterprise/microsoft-365-ip-web-service?view=o365-worldwide

The Office 365 IP Address and URL web service helps you better identify and differentiate Office 365 network traffic, making it easier for you to evaluate, configure, and stay up to date with changes.

An extension of the officially provided starter scripts at: https://docs.microsoft.com/en-us/microsoft-365/enterprise/microsoft-365-ip-web-service?view=o365-worldwide#example-python-script


What does this do?

Consumes the M365 endpoint API, which supplies a list of the IPv4/IPv6 addresses and DNS names used by M365-related products (Teams, OneDrive, SharePoint and O365 applications). With this list it is possible to create rules for network infrastructure devices such as firewalls, IDS and proxies, in order to permit, monitor, throttle or control access.
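
As a rough illustration of what consuming the API involves, the sketch below builds a request URL and sorts a response into IPv4 networks, IPv6 networks and domain names. It is not this project's code: the helper names are invented for the example, and the sample payload is a hand-written, abbreviated set of records in the shape documented by Microsoft (fields such as `urls`, `ips` and `category`), so no network access is needed to run it.

```python
import ipaddress
import json
import uuid

# Base URL as documented by Microsoft for the endpoint web service
BASE_URL = "https://endpoints.office.com/endpoints"


def build_url(instance="worldwide", request_id=None):
    """Return the endpoints URL for a service instance (hypothetical helper)."""
    request_id = request_id or str(uuid.uuid4())
    return f"{BASE_URL}/{instance}?clientrequestid={request_id}"


def split_endpoints(records, categories=("Allow", "Default")):
    """Sort endpoint records into IPv4 networks, IPv6 networks and domains."""
    ipv4, ipv6, domains = set(), set(), set()
    for record in records:
        if record.get("category") not in categories:
            continue
        domains.update(record.get("urls", []))
        for cidr in record.get("ips", []):
            network = ipaddress.ip_network(cidr, strict=False)
            (ipv4 if network.version == 4 else ipv6).add(str(network))
    return sorted(ipv4), sorted(ipv6), sorted(domains)


# Hand-written sample in the documented response shape, not real API output
SAMPLE = json.loads("""
[
  {"id": 1, "serviceArea": "Exchange", "category": "Optimize",
   "urls": ["outlook.office365.com"], "ips": ["13.107.6.152/31"]},
  {"id": 2, "serviceArea": "Skype", "category": "Allow",
   "urls": ["*.teams.microsoft.com"],
   "ips": ["52.112.0.0/14", "2603:1063::/38"]},
  {"id": 3, "serviceArea": "Common", "category": "Default",
   "urls": ["*.officeapps.live.com"]}
]
""")

if __name__ == "__main__":
    print(build_url(request_id="b10c5ed1-bad1-445f-b386-b919946339a7"))
    v4, v6, names = split_endpoints(SAMPLE)
    print(v4, v6, names)
```

With the default category filter the "Optimize" record is skipped, which mirrors the effect of leaving Optimize out of -z.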

This Python command line application can currently digest the M365 API and produce output files in the following formats:

  • Generic CSV - For further processing
  • PuppetSquid - For use in a Puppet-controlled environment, likely as part of your CI/CD workflow. See m365digester/Outputs/PuppetSquid.py
  • Squid3 via a template - For use directly in your Squid configuration. See m365digester/Outputs/SquidConfig.py and examples/squidconfig.template
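
To give a feel for template-driven output, here is a minimal sketch using Python's `string.Template`. The placeholder names `$acl_lines` and `$acl_name`, the ACL name `m365_domains` and the template text are all invented for this example; the real template format is defined by examples/squidconfig.template and may differ.

```python
from string import Template

# Hypothetical template text; the project's real template may look different
TEMPLATE = Template(
    "# Generated file, do not edit by hand\n"
    "$acl_lines\n"
    "http_access allow $acl_name\n"
    "http_access deny all\n"
)


def render_squid_acl(domains, acl_name="m365_domains"):
    """Render one 'acl <name> dstdomain <domain>' line per domain."""
    acl_lines = "\n".join(
        f"acl {acl_name} dstdomain {domain}" for domain in sorted(domains)
    )
    return TEMPLATE.substitute(acl_lines=acl_lines, acl_name=acl_name)


if __name__ == "__main__":
    print(render_squid_acl([".teams.microsoft.com", "outlook.office365.com"]))
```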

Motivation

This script began as part of a requirement to produce a Squid3-based proxy running on Puppet-managed infrastructure, with the sole purpose of proxying MS Teams and MS OneDrive connections to M365 from networks that were not permitted to be routable, nor permitted to have generic proxied internet access, in the interests of security. The first iteration of this script produced YAML only, and had little configurability. Squid3 uses splay trees, which do not necessarily work well with a direct translation of the rules provided by the M365 Endpoint API (see here), so a primitive 'collapser' is included to reduce the rule sets produced to their minimum, and thus keep Squid happy.
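
The idea behind such a collapser can be sketched as follows, assuming rules are plain domain strings in which a leading dot means "this domain and all of its subdomains". This illustrates the general technique only; it is not the project's implementation, and `collapse_domains` is a name invented for the example.

```python
def collapse_domains(domains):
    """Drop entries already covered by a broader leading-dot entry.

    '.live.com' covers 'live.com' and every subdomain, so
    '.officeapps.live.com' and 'login.live.com' are redundant next to it.
    """
    unique = {d.lower() for d in domains}
    # Entries with a leading dot act as wildcards for their subdomains
    wildcards = {d for d in unique if d.startswith(".")}

    def covered(domain):
        bare = domain.lstrip(".")
        labels = bare.split(".")
        # Check every parent suffix: for a.b.c test '.b.c' and '.c'
        for i in range(1, len(labels)):
            if "." + ".".join(labels[i:]) in wildcards:
                return True
        # An exact name is also covered by its own wildcard form
        return not domain.startswith(".") and "." + bare in wildcards

    return sorted(d for d in unique if not covered(d))


if __name__ == "__main__":
    print(collapse_domains([
        ".live.com", ".officeapps.live.com", "login.live.com",
        "live.com", "outlook.office365.com",
    ]))
```

Only the outermost entry of each subdomain tree survives, so the resulting list contains no pair of overlapping rules.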


Configuration

Uses the standard 'logging' module from Python ~2+ for log levels:

  • CRITICAL
  • ERROR
  • WARNING
  • INFO
  • DEBUG
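
As a sketch of how separate console and file levels can be wired up with the standard library, the helper below attaches one handler per destination. The function name `build_logger` is invented here, and the defaults simply mirror the INFO and DEBUG defaults listed under Parameters; the project's own setup may differ.

```python
import logging
import sys

LEVELS = ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG")


def build_logger(name, console_level="INFO", file_level="DEBUG", file_path=None):
    """Create a logger with independent console and (optional) file levels."""
    for level in (console_level, file_level):
        if level not in LEVELS:
            raise ValueError(f"Unknown log level: {level}")

    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let the handlers do the filtering
    logger.handlers.clear()         # avoid duplicate handlers on re-use

    console = logging.StreamHandler(sys.stderr)
    console.setLevel(getattr(logging, console_level))
    logger.addHandler(console)

    if file_path:
        file_handler = logging.FileHandler(file_path)
        file_handler.setLevel(getattr(logging, file_level))
        logger.addHandler(file_handler)
    return logger


if __name__ == "__main__":
    log = build_logger("m365digester-demo", console_level="WARNING")
    log.info("hidden on the console")
    log.warning("shown on the console")
```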

Options

Parameters can be set in several ways:

  • Environment variables
  • Command line
  • As part of a Python dict (when used as a module)
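
One plausible way to combine these sources is sketched below for a single option. The precedence shown (command line, then environment variable, then dict, then built-in default) is an assumption made for the example, not something this README states, and `resolve_option` and the dict key `log_level_console` are hypothetical.

```python
import argparse
import os
import sys


def resolve_option(argv, environ, config, default="INFO"):
    """Resolve --log-level-console from CLI, then env var, then dict."""
    parser = argparse.ArgumentParser(add_help=False)
    # default=None lets us detect "not given on the command line"
    parser.add_argument("--log-level-console", default=None)
    args, _unknown = parser.parse_known_args(argv)

    if args.log_level_console is not None:
        return args.log_level_console
    if "LOG_LEVEL_CONSOLE" in environ:
        return environ["LOG_LEVEL_CONSOLE"]
    # The dict key name is a guess based on the environment variable name
    return config.get("log_level_console", default)


if __name__ == "__main__":
    print(resolve_option(sys.argv[1:], os.environ, {}))
```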

Parameters

| Short | Long | ENVVAR | Type | Default | Information |
|---|---|---|---|---|---|
| -v | --version | N/A | Semantic version | | Return version information |
| | --log-level-file | LOG_LEVEL_FILE | Log Level | DEBUG | Set the log level for file output |
| | --log-level-console | LOG_LEVEL_CONSOLE | Log Level | INFO | Set the log level for console output |
| -l | --log-file-output | LOG_FILE_PATH | File path and name | None | Log file target |
| -k | --keep-sqlitedb | SQLITEDB_KEEP | Switch (Bool) | False | If set, any SQLite databases used on disk will not be deleted at the termination of this application |
| -j | --sqlitedb-file-path | SQLITEDB_FILE_PATH | File path and name | ./{APP_NAME}.db | If set, all SQLite operations will be performed on this file on disk, not in memory |
| -W | --disable-wildcards | WILDCARDS_DISABLED | Bool | False | Prevent the replacement of wildcards, e.g. '*.domain.com', with single prefix dots '.' |
| -w | --wildcard-pattern | WILDCARD_PATTERN | String (regex) | '^(*).' | Regex to use for the detection and replacement of wildcards |
| -C | --collapse-acls-disable | ACL_COLLAPSE_DISABLED | Switch (Bool) | True | If disabled, ACLs will not be reduced to a smaller set based on inner/outer subdomain tree positioning |
| -z | --categories-include | CATEGORIES_INCLUDE | Domain List (space separated) | Allow Default | List of categories from API to process |
| -q | --disable-domains | DOMAINS_DISABLED | Switch (Bool) | False | Disable processing of domain names from API |
| -n | --disable-ipv4 | IPV4_DISABLED | Switch (Bool) | False | Disable processing of IPv4 addresses from API |
| -m | --disable-ipv6 | IPV6_DISABLED | Switch (Bool) | False | Disable processing of IPv6 addresses from API |
| -i | --client-request-id | M365_REQUEST_ID | String (GUID) | Automatically generated from host NIC MAC | Request ID to use with M365 API |
| -s | --service-instance | M365_SERVICE_INSTANCE | String (Choice) | Worldwide | Specify M365 service instance API type |
| -e | --extra-known-domains | EXTRA_KNOWN_DOMAINS | Domain list (space separated) | Not specified | Use for your tenancy domain names or other extras including overrides. Do not use quotations; wildcards permitted, e.g. '-e mycompany-files.sharepoint.net *.live.com autodiscover.mycompany.mail.onmicrosoft.com' |
| -E | --extra-known-ips | EXTRA_KNOWN_IPS | IP address list (space separated) | Not specified | Use for other extra IP addresses including overrides. Do not use quotations; wildcards not permitted, e.g. '-E 192.168.1.0/24' |
| -x | --exclude-addresses | EXCLUDE_ADDRESSES | Domain/Address list (space separated) | Not specified | Use to exclude entries from consideration when processing or generating files, e.g. '-x autodiscover.*.onmicrosoft.com' |
| -u | --output-path | OUTPUT_PATH | File path without name | './' | Path on disk to place output file. Mutually exclusive with -o |
| -p | --output-prefix | OUTPUT_PREFIX | File name only without extension | '{APP_NAME}' | Filename without extension for output file |
| -o | --output-file | OUTPUT_FILE | File name and path | Unset | Full path and filename for output file. Mutually exclusive with -u and -p |
| -t | --output-type | OUTPUT_TYPE | String (Choice) | yaml | Output file type, from: [ GENERALCSV PUPPETSQUID SQUIDCONFIG ] |
| | --output-template | OUTPUT_TEMPLATE | File name and path | Unset | Input template file for output file types supporting it (e.g. SQUIDCONFIG) |
| | --linesep | LINESEP | String (Choice) | Python os.linesep | Specify line separator (new line): CRLF on Windows, LF on *nix |
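
The wildcard and exclusion options can be pictured with the sketch below. It assumes a wildcard is a leading `*.` that gets replaced by a single leading dot, and that exclusion patterns behave like shell-style globs. Both the regex and the use of `fnmatch` are choices made for this example rather than the project's confirmed behaviour, and the function names are hypothetical.

```python
import re
from fnmatch import fnmatch

# Assumed pattern: a literal '*.' at the start of the name
WILDCARD = re.compile(r"^\*\.")


def normalise(domain, wildcards_disabled=False):
    """Turn '*.domain.com' into '.domain.com' unless disabled."""
    if wildcards_disabled:
        return domain
    return WILDCARD.sub(".", domain)


def filter_excluded(domains, exclude_patterns):
    """Drop any domain matching one of the glob-style exclusion patterns."""
    return [
        d for d in domains
        if not any(fnmatch(d, pattern) for pattern in exclude_patterns)
    ]


if __name__ == "__main__":
    names = [normalise(d) for d in (
        "*.live.com",
        "autodiscover.mycompany.mail.onmicrosoft.com",
    )]
    print(filter_excluded(names, ["autodiscover.*.onmicrosoft.com"]))
```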

Use as a Docker container

NOTE: This container is not yet published, but the included Dockerfile has been tested locally and does work.

docker run -v /host/output/target:/output:rw dougbarry/m365digester:latest -l /output/m365digester.log -k -j /output/m365digester.db -z Allow Default Optimize -m -e testcompany-files.sharepoint.com testcompany-cloud.microsoft.com *.live.com -t puppetsquid -o /output/puppet-squid-snippet.yml

Use via docker-compose

services:
  m365digester:
    image: dougbarry/m365digester:latest
    environment:
      - "M365_REQUEST_ID=b10c5ed1-bad1-445f-b386-b919946339a7"
      - "OUTPUT_PATH=/output"
    volumes:
      - /host/path/output:/output:rw

Use as command line application:

./m365digester-cli -l ./m365-digester.log -k -j ./m365-digester.db -z Allow Default Optimize -m -e testcompany-files.sharepoint.com testcompany-cloud.microsoft.com *.live.com -o puppet-squid-snippet.yml

Use as module:

Default config in m365digester/Defaults.py
Examples in examples/

Minimum viable use as module:

#!/usr/bin/env python3
import pprint
from m365digester.M365Digester import M365Digester

app = M365Digester()
app.main()
pprint.pprint(app.rule_list)

Module usage with argument parsing

See m365digester/M365DigesterCli.py

Custom M365 domains in module

import logging
from m365digester.M365Digester import M365Digester

...
# my_logger is assumed here to be a standard 'logging' logger
my_logger = logging.getLogger('m365digester')

config = dict()
config.setdefault('extra_known_domains', [
    'mytenancy.sharepoint.com',
    'mytenancy-files.sharepoint.com',
    'mytenancy-my.sharepoint.com',
    'mytenancy-myfiles.sharepoint.com',
    # really just for *.officeapps.live.com, but Squid's splay trees don't always like it
    '.live.com'
])
...
app = M365Digester(config, my_logger)
...

Contributing

If you have any issues or suggestions, please submit an issue on GitHub. All contributions are considered and welcome.

License

MIT License