Two stories in as many weeks have exposed some of the problems Microsoft has managing its vast IT inventory - specifically DNS and SSL.
The first story was an outage of Microsoft Teams. The reason - an expired certificate. A bit of background from a LinkedIn recruiting post by one of the managers:
Microsoft Teams is the most exciting product Microsoft has produced in the last decade, and has been an incredible growth story. Unstoppable user growth also means challenges ... Arunachalam Thenappan - Principal Group Engineering Manager.
"... challenges to solve in our middle tier/backend services layers ..."
Keeping Up with Product Growth
We ran a quick report on the MS Teams subdomain and at the time it showed 22 certificates either expired or expiring within 7 days. We ran the same report today (as you can do yourself) and the number had gone up to 39. A large part of that number will be certificates for test/dev or legacy domains that are no longer in use.
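The "expired or expiring within 7 days" count above can be sketched in a few lines of Python. This is only an illustration of the classification rule, not how KeyChest produces its report. The hostnames and dates below are made up, and the `notAfter` strings are assumed to be in the format that Python's `ssl` module returns from `getpeercert()`.

```python
import ssl
from datetime import datetime, timedelta, timezone


def classify(not_after: str, now: datetime, window_days: int = 7) -> str:
    """Return 'expired', 'expiring' or 'ok' for a certificate notAfter string."""
    # ssl.cert_time_to_seconds parses e.g. "Feb  3 12:00:00 2020 GMT" into epoch seconds.
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    if expires <= now:
        return "expired"
    if expires - now <= timedelta(days=window_days):
        return "expiring"
    return "ok"


def count_at_risk(certs, now: datetime, window_days: int = 7) -> int:
    """Count certificates that are expired or expire within the window."""
    return sum(1 for _, not_after in certs if classify(not_after, now, window_days) != "ok")


if __name__ == "__main__":
    # Hypothetical inventory: (hostname, notAfter) pairs, not real Microsoft data.
    inventory = [
        ("legacy.example.com", "Feb  3 12:00:00 2020 GMT"),
        ("dev.example.com", "Feb 20 12:00:00 2020 GMT"),
        ("www.example.com", "Dec  1 00:00:00 2020 GMT"),
    ]
    report_time = datetime(2020, 2, 17, tzinfo=timezone.utc)
    for host, not_after in inventory:
        print(host, classify(not_after, report_time))
    print("at risk:", count_at_risk(inventory, report_time))
```

Passing the report time in explicitly, rather than reading the clock inside the function, makes it easy to rerun the same inventory for two different dates and compare the counts.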
Figure: Rapid growth of a cloud service means massive changes with security implications.
Two weeks ago, and with only the outside view, I would have said that MS Teams may need a better way to manage legacy certs. But maybe they use a simple approach - wipe unused virtual servers with everything on them, including the private keys for those certificates. Still, I would expect them to revoke unused certificates if they had an IT security admin, but they have to balance that against additional cost, other priorities, etc. You can see from the quote above that they have been growing massively, which makes it much harder to keep on top of things that do not directly impact the core service.
But let's have a look at the bigger picture. In terms of certificate expirations across the Microsoft domains, the free KeyChest domain audit is way too restricted (around 3,000 subdomains and only one domain at a time) but even with that, you can see that the problem is not limited to the MS Teams product.
Figure: msn.com is one of many domains owned by Microsoft, and it's mentioned in the DNS hijacking context.
What would be more worrying, though, is an overlap between domains whose private keys could be compromised and domains with other vulnerabilities - like DNS domain hijacking.
Certificate Management and DNS Hijacking
Having read several articles about domain hijacking within microsoft.com and other Microsoft-owned domains, I think the certificate problem is more significant than it first appears. Microsoft is a massive company, so mistakes are inevitably going to be made. They even had issues with the security of code signing keys in the past. We can assume there is a distinct possibility that attackers can combine several vulnerabilities to launch powerful attacks.
The researcher - Michel Gaschet of NIC.gp - is quoted in the ZDNet article "Microsoft has a subdomain hijacking problem":
"Microsoft has a problem in managing its thousands of subdomains, many of which can be hijacked and used for attacks against users, its employees, or for showing spammy content".
Michel has been reporting issues to Microsoft for the last 3 years, but only 5-10% were fixed - the reports included 142 domains in 2019 and 117 in 2020 (and we are still in February).
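One common form of subdomain hijacking is a dangling CNAME: a subdomain still points at an external hostname that no longer exists, so whoever re-registers that target controls the content. The article does not say which mechanism applied to the reported Microsoft domains, so the sketch below is only an assumption about what a first-pass check could look like. The record names are hypothetical, and a target that fails to resolve is a hint worth investigating, not proof of a hijackable subdomain.

```python
import socket
from typing import Callable, Dict, List


def resolves(hostname: str) -> bool:
    """Return True if the hostname currently resolves to at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def dangling_cnames(
    cname_records: Dict[str, str],
    resolver: Callable[[str], bool] = resolves,
) -> List[str]:
    """List subdomains whose CNAME target does not resolve (sorted for stable output)."""
    return sorted(name for name, target in cname_records.items() if not resolver(target))


if __name__ == "__main__":
    # Hypothetical zone export: subdomain -> CNAME target.
    records = {
        "promo.example.com": "old-campaign.hosting.invalid",
        "www.example.com": "example.com",
    }
    for name in dangling_cnames(records):
        print("check manually:", name, "->", records[name])
```

The resolver is passed in as a parameter so the check can be run against a recorded snapshot of DNS answers as well as live lookups.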
Validity of Cursory Checks
You may say that this is a pretty far-fetched theory, based only on high-level "circumstantial evidence". That is true, but my experience tells me that if you can see issues in cursory checks, there is a good chance that a proper audit will confirm some and discover more. Whenever my colleagues or I looked properly at a potential issue, we were almost always rewarded. Examples include vulnerabilities discovered recently by the team at Brno University (ROCA - insecure key generation in 25% of all TPMs, and Minerva - insecure elliptic curve signing in open source libraries) and some previous work like Unwrapping Chrysalis (a hidden API in HSMs) or sensor network security.
The risk of someone hijacking a Microsoft subdomain and getting a valid HTTPS certificate and its private key is certainly there. The ZDNet article mentions that one of the issues is that these kinds of integration and configuration bugs are "not part of Microsoft's bug bounty program", which lowers their priority. This is something I find to be a big issue across all kinds of companies - from high-tech startups and major banks to big technology companies.
Product Bugs v Integration Gaps
It is generally much easier to fix bugs in "products" as the vendor is easy to identify. Fixes are localized and companies have processes to test changes and deploy them. Management and integration issues have to be identified, analyzed, and fixed by the companies themselves. They are likely to involve vulnerabilities (or simply incorrect configuration options) in several products as well as IT system "glue" developed either by the integrators or by the companies themselves.
While it is easier to read about bugs in products than about gaps and weaknesses in business system designs, the threat of the latter is increasing with the use of public APIs and the opening-up of legacy systems - e.g., banking systems. All that is needed is a sufficiently high reward to motivate attackers to focus on a particular company, rather than on products used across the globe.
You have better things to do. KeyChest with its global database of web certificates can instantly create an initial "big picture" so you can start analyzing your exposure to cyber attacks and adjust it according to your risk appetite.