August 2013 – Life At Warp 9

One of the problems with machine learning in an uncontrolled environment is lies. Bad data, noise, and intentional or unintentional misinformation complicate learning. In an uncontrolled environment any intelligence (synthetic or otherwise) is faced with the extra task of separating truth from fiction.

Take GBUdb, for example. Message Sniffer’s GBUdb engine learns about IP behaviors by watching SNF’s scan results. Generally if a message scan matches a spam or malware rule then the IP that delivered the message gets a bad mark. If the scanner does not find spam or malware then the IP that sent the message is given the benefit of the doubt and gets a good mark.

In a perfect world this simple algorithm generates reliable statistics about what we can expect to see from any given IP address. As a result we can use these statistics to help Message Sniffer perform better. If GBUdb can predict spam and malware from an IP with high confidence then we can safely stop looking inside the message and tag it as bad.

Similarly if GBUdb can predict that an IP address only sends us good messages then we can let the message through. Even better than that — if the message matches a new spam or malware rule then most likely we’ve made a mistake. In that case we can turn off the troublesome rule, let the message through, and raise a flag so bigger brains can take a look and fix the error.

Right?

Not always!

Message Sniffer’s Auto-Panic feature does a fantastic job of helping us catch problems before they can cause trouble, but Auto-Panic can also be tricked into letting more spam through the filters.

When a new pre-tested spam campaign is launched on a new bot-net there is some period of time where completely unknown IP addresses are sending messages that are guaranteed (pre-tested) not to match any recognizable patterns. All of these IPs end up gathering good marks for sending “apparently” clean messages… and since they are churning out messages as fast as they can they gain a good reputation quickly.

Back at the lab the SortMonsters and RuleBots are hard at work analyzing samples and creating rules to recognize the new campaign. This takes a little bit of time and during that time GBUdb can’t help but become convinced that some of these IPs are good sources. The statistics prove it, after all.

When the new pattern rules get out to the edges the Auto-Panic feature begins to work against us. When the brand new pattern rules find spam or malware coming from one of these new IPs it looks like a mistake. So, Auto-Panic fires and turns off the new rules!

For a time the gates are held wide open. As new bots come online they get extra time to sneak their messages through while the new rules are suppressed by Auto-Panic. Not only that but all of the new IPs quickly gain a reputation for sending good messages so that they too can trigger the Auto-Panic feature.

In order to solve this problem we’ve introduced a new behavior into the learning engine. We’ve made it skeptical of new, clean IPs. We call it White-Guard.

White-Guard singles out IPs that are new to GBUdb and possibly pretending to be good message sources. Instead of taking the new statistics at face value the system decides immediately not to trust them and not to distrust them either. The good and bad counts are artificially set to the same moderately high value.

It’s like a stranger arriving in a small town. The town folk won’t treat the stranger badly, but they also won’t trust them either. They withhold judgement for a while to see what the stranger does. Whatever opinion is ultimately formed about the stranger they are going to have to earn it.

In GBUdb, the White-Guard behavior sets up a neutral bias that must be overcome by new data before any actions will be triggered by those statistics. Eventually the IP will earn a good or bad reputation but in the short term any new “apparently” clean IPs will be unable to trigger Auto-Panic.

With Auto-Panic temporarily out of reach for these sources new pattern rules can take effect more quickly to block the new campaigns. This earns most of the new bot-net IPs the bad reputations they deserve and helps to increase early capture rates.

Since we’ve implemented this new learning behavior we have seen a significant increase in the effectiveness of the GBUdb system as well as an improvement in the accuracy of our rule conflict instrumentation and sampling rates. All of these outcomes were predicted when modeling the dynamics of this new behavior.

It is going to take a little while before we get the parameters of this new feature dialed in for peak performance, but early indications are very good and it’s clear we will be able to apply the lessons from this experiment to other learning scenarios in the future.