I Made A Thing Again - mx_recon.py


A common thing I do at work is look up MX records to see where a domain's mail is pointing. I work in the cloud email realm, so it comes up every day. Usually I just open a command prompt and run nslookup, which is fine. However, after I run nslookup, I have to copy and paste the info into a separate note, then add manual notes such as “lots of includes (the includes)”, “softfail!!!”, “DMARC could use xyz tags but isn’t”, et cetera.
It’s on my github at cpardue/python3/mx_recon if you want to just look at it and be done with this.

So to combat this, and to further my goal of actually learning some Python, I wrote a script to handle it using some Python DNS packages.
The shorthand of the logic is as follows:

  1. Gimme a domain name to check.
  2. Check for and output MX, SPF, and DMARC records.
  3. Write results to domainname.com.txt.

Initially the script just did a DNS query for a user-supplied FQDN. But the output was exactly what I’d get from nslookup, so other than writing to an output file, what’s the point? I decided it could be better and learned how to implement some regex using the Python ‘re’ package.

After playing around in regex101.com A LOT, I came up with some nifty regex to parse various things.

# look for redirect=_somespf.somewhere.example.com

# look for include:_somespf.somewhere.example.com

# look for (?+-~)all

# look for rua=mailto:someone@example.com

# look for aSPF=r;

and so on. 
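
The patterns themselves are not shown above, so here is a minimal sketch of what regexes for those comments could look like. The sample records, the pattern names, and the patterns are assumptions made up for illustration. They are not the ones in mx_recon.py.

```python
import re

# Hypothetical sample records, made up for illustration.
spf = "v=spf1 include:_spf.example.com include:mail.example.net ~all"
dmarc = "v=DMARC1; p=none; rua=mailto:reports@example.com; aspf=r;"

# look for include:<domain> and redirect=<domain> in an SPF record
include_re = re.compile(r"\binclude:(\S+)", re.IGNORECASE)
redirect_re = re.compile(r"\bredirect=(\S+)", re.IGNORECASE)
# look for an optional qualifier (?, +, -, ~) followed by "all"
all_re = re.compile(r"(?<!\S)([?+~-]?)all(?!\S)", re.IGNORECASE)
# look for rua=mailto:<address> and aspf=<r or s> in a DMARC record
rua_re = re.compile(r"\brua\s*=\s*(mailto:[^;\s]+)", re.IGNORECASE)
aspf_re = re.compile(r"\baspf\s*=\s*([rs])", re.IGNORECASE)

includes = include_re.findall(spf)          # every include target, in order
redirect = redirect_re.search(spf)          # None when no redirect= is present
qualifier = all_re.search(spf).group(1)     # "~" for this sample record
rua = rua_re.search(dmarc).group(1)         # the mailto: URI
aspf = aspf_re.search(dmarc).group(1)       # "r" for this sample record

print(includes, redirect, qualifier, rua, aspf)
```

re.IGNORECASE is there because tag names like aSPF can show up in mixed case in real records.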

Once I was able to nail down some regex and actually match values in DNS records, it was relatively simple to then parse out the relevant tags from their values. I found that I could do something like

    tag_and_value = regex for include:_spf.something.com in RECORD
    value = regex for _spf.something.com in tag_and_value
    if value is none
      print "this tag's not declared"
    else
      print "includes: " + value
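
In runnable Python, that two-step match might look roughly like the sketch below. The function name describe_includes and both patterns are hypothetical, chosen only to mirror the pseudocode.

```python
import re


def describe_includes(record):
    """Two-step match: grab each tag:value pair, then pull out the value."""
    # Step 1: find every whole "include:<domain>" token in the record.
    pairs = re.findall(r"\binclude:\S+", record, flags=re.IGNORECASE)
    if not pairs:
        return "includes: this tag's not declared"
    # Step 2: from each pair, keep only what follows the first colon.
    values = [re.search(r"(?<=:)\S+", pair).group(0) for pair in pairs]
    return "includes: " + ", ".join(values)


# Hypothetical SPF records, made up for illustration.
print(describe_includes("v=spf1 include:_spf.example.com -all"))
print(describe_includes("v=spf1 mx -all"))
```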

I ended up doing a two-tiered match on nearly everything that included a tag=value pair. I think I could have relied on a dictionary of key-value pairs instead, but I am not sure that I could then still account for keys that show up in mixed case. I’m probably going to investigate that more in whatever I write next.
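
For what it's worth, here is a minimal sketch of how the dictionary idea could handle mixed-case keys: lowercase each key while building the dictionary, then always look tags up in lowercase. The function name parse_tags and the sample record are hypothetical, and the sketch only covers semicolon-separated tag=value records like DMARC.

```python
def parse_tags(record):
    """Split a 'tag=value; tag=value' record into a dict with lowercase keys."""
    tags = {}
    for part in record.split(";"):
        key, sep, value = part.partition("=")
        if sep:  # skip empty pieces, such as the one after a trailing ';'
            tags[key.strip().lower()] = value.strip()
    return tags


# Hypothetical DMARC record with mixed-case tag names.
dmarc = "v=DMARC1; P=none; aDKIM=s; rua=mailto:reports@example.com;"
tags = parse_tags(dmarc)

print(tags.get("adkim", "not declared"))
print(tags.get("aspf", "not declared"))
```

Only the keys are lowercased, so values such as the rua address keep their original case.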

Once I had an MX Record check, then a parsed out SPF and DMARC record check, I felt bad.
I was ending up with a nicely simple txt file that basically said:

    MX Records
    0 aspx.google.com
    5 alt1.google.com
    5 alt2.google.com

    SPF Record
    “the entire record”
    Includes: [each, include, listed]
    Macros found: [each, macro, found]
    All Mechanism: [-] HARDFAIL

    DMARC Record
    “the entire record”
    adkim found: [s] STRICT
    each tag not listed
    Policy = [none]

Ok so the SPF and DMARC were parsed out, but all I had for MX records were the priority and the hostname. That’s fine, that’s all there is. But those hosts always belong to some service. I run into a lot of obscure security solutions or mail servers, and I end up having to google them, find out what hosting service it is, then list that in my notes as well, such as “MX still points to Mimecast” or something similarly simple. So, since I had figured out regex for the SPF and DMARC tags, I figured I might as well figure out how to use regex to match the MX record host to a small list of the most popular cloud email hosting solutions.
And thus I did so. Logic went something like this:

    for line in mx_data
      host = regex for .example.com in line
      if host == .google.com
        print "Google Workspace Host"
      else if host == .outlook.com
        print "O365 Host"
      else if host == .amazonaws.com
        print "Amazon Workmail Host"
and so on. 
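
A minimal runnable sketch of that lookup follows, assuming a table keyed on the last two labels of the hostname. PROVIDERS, label_mx, and the "Unknown Host" fallback are hypothetical names, the table holds only the three services named above, and the two-label shortcut would misfire on suffixes like .co.uk.

```python
import re

# Domain suffix -> label. Only the three services named in this post.
PROVIDERS = {
    "google.com": "Google Workspace Host",
    "outlook.com": "O365 Host",
    "amazonaws.com": "Amazon Workmail Host",
}


def label_mx(line):
    """Append a provider label to a 'priority hostname' MX line."""
    cleaned = line.strip()
    # Capture the last two labels of the hostname, ignoring a trailing dot.
    match = re.search(r"([a-z0-9-]+\.[a-z0-9-]+)\.?$", cleaned.lower())
    suffix = match.group(1) if match else ""
    label = PROVIDERS.get(suffix, "Unknown Host")
    return f"{cleaned} [{label}]"


# Sample MX lines for illustration.
for mx_line in ["0 aspmx.l.google.com.", "10 mail.example.org"]:
    print(label_mx(mx_line))
```

Keeping the services in a dictionary means adding a provider is one new entry rather than another if branch.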

I did this for as many MX hosts as I could think of off the top of my head, then searched Google for the top cloud email hosts and top cloud email security solutions and added each one in there. I added the hosting service name right next to the MX record’s priority and hostname to keep it simple.
I went ahead and set up a requirements.txt file and tested, tested, retested, added debugging and info logging, and made sure that the output txt file is easily understandable.
I might circle back and simplify it a little if it starts to annoy me while I’m using it. In the meantime I am having a lot of fun being able to actually make things that work. I am finding that there is more and more incentive to be decent at this stuff as time goes by.

logs, dns, mx, python