When is a control not a control?

The simple answer: when it is not controlling anything. Users call IT when their email system is down. Users call IT when they can't log in to critical systems. The entire financial system is down? The smart CIO is already on the phone to the CFO with an explanation.

But when the anti-virus software on a user's workstation stops auto-scanning new files, the user more than likely does not notice and does not call. When the web filter starts letting in the content it is supposed to block, not only do users not call in, they sometimes start sending links to their co-workers.

Audits can find controls that are not working. But audits are formal reviews, and they rarely happen at the level of detail needed to assure individual controls. Their findings, even those from internal audits, are still treated as pass or fail, so the stakes are high. And doing nothing between audits while you focus on your next big security initiative leads to the kind of benign neglect that does not do your career or your company any favors.

This is one of the areas where operations and security usually have vastly different viewpoints. IT operations know they'll hear about it if systems fail. They may have some alarms set up, and they do not assume that an outage is the only way to find out a system has stopped working. Nonetheless, they know the users are real-time monitors for uptime. The operations mindset is “install occurred without error messages; user is up and running; job well done; they'll call me if they need anything else.”

For security, waiting until something bad happens to find out that a control is not working cannot be an option.

Security cannot adopt the operations mindset when it comes to security controls. Controls need to be tested, and their effectiveness and coverage cannot be taken for granted.
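One way to avoid taking coverage for granted is to compare the devices that should have a control against the devices where that control is actually reporting in. Here is a minimal sketch in Python, assuming both lists can be exported as sets of hostnames. The function name `coverage_gaps` and all hostnames are hypothetical and used only for illustration.

```python
def coverage_gaps(expected_hosts, reporting_hosts):
    """Return, sorted, the hosts that should have the control but are not reporting."""
    def normalize(hosts):
        # Hostnames from different tools often differ in case or stray whitespace.
        return {h.strip().lower() for h in hosts}

    return sorted(normalize(expected_hosts) - normalize(reporting_hosts))


# Hypothetical data: the asset inventory vs. hosts seen by the anti-virus console.
inventory = {"ws-001", "ws-002", "SRV-DB01", "srv-web01"}
av_console = {"ws-001", "srv-db01"}

missing = coverage_gaps(inventory, av_console)
print(missing)
```

Anything in the result is a device where the control may be missing or silent, which is exactly the kind of failure no user will call in about.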

Another thing about controls is that they are often among the first things turned off by operations when there is trouble on a device. Vendors will instruct engineers to turn off anti-virus when an application is hanging. Not because they have proof it is causing the problem, but because doing so rules it out. When a server is running slowly, engineers will turn off logging in case that is causing the problem. Again, not because they have proof, but because it might help.

And, of course, once the problem with the server or the application is fixed, the incident is closed since “everything” is back up (well, maybe not logging or anti-virus).
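One way to narrow this gap is to record every control that gets disabled during troubleshooting and to refuse to close the incident until each one is back on. Below is a rough sketch in Python. The `Incident` class and its methods are hypothetical and are not modeled on any real ticketing system's API.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """Hypothetical incident record; not based on any specific ticketing tool."""
    summary: str
    disabled_controls: set = field(default_factory=set)
    closed: bool = False

    def disable_control(self, name):
        # Record the control at the moment it is turned off for troubleshooting.
        self.disabled_controls.add(name)

    def restore_control(self, name):
        # Mark the control as re-enabled (ideally after verifying it works).
        self.disabled_controls.discard(name)

    def close(self):
        # Refuse to close while any control is still switched off.
        if self.disabled_controls:
            still_off = ", ".join(sorted(self.disabled_controls))
            raise RuntimeError(f"Cannot close; controls still disabled: {still_off}")
        self.closed = True


incident = Incident("Application hanging on server")
incident.disable_control("anti-virus")
incident.disable_control("logging")
incident.restore_control("anti-virus")

try:
    incident.close()          # logging is still off, so this is rejected
except RuntimeError as err:
    print(err)

incident.restore_control("logging")
incident.close()              # succeeds now that every control is restored
```

The point of the design is that "everything is back up" becomes a condition the record enforces instead of a judgment made by whoever closes the ticket.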

Control testing is not glamorous. Some of it can be automated. But some of it comes down to the grunt work of a security department.