Apr 092013

On terrors and trials and troubles we tumble – so easily lost in this worlds wild jumble of chaos and rumors and strife and the hundreds of pointless distractions that cost us our marbles, and yet there’s a way if you manage to find it to keep from the fray all your virtues and kindness – a way to find joy, even bliss and good morrows. Your own private stock of fond memories to follow.

This path to good fortune is not for the timid and those who are on it could tell you some tales. Amid crisis and horror, between tears and sorrow, there’s monsters and fears and dark nights on each side of this quaint, narrow road full of light and bright moments – it’s peace and warm comfort in stark brilliant contrast to all of the dark scary places it goes past.

‘Tis love that I speak of and not simply friendship, but kinship – the kind that you find when you tarry along on your first timid step on this path that can be so uplifting or so very bad… but then once you have found them, this singular spirit, that follows you on and pretends not to fear it, you find you’re together and somehow the darkness is farther away and not nearly as heartless.

Your steps intertwine, there is dancing and wine and good words and good song and good cheer goes along and you find that no matter how hard the wind blows and no matter how scary the outside world grows, and no matter how shaky your next step may be, that your partner can help you, and does, day to day, in their subtle sweet magical spiritual ways.

You’re both stronger, and braver, and more fleet of foot and the sharp narrow path that upon you first took becomes broader and wider with each every day – soon as broad as each moment you have in your way and with each tender kiss and each loving caress you can light up the darkness – force evil’s regress.

Lonely souls fear to tread here and well so they should for this place is not for them – it does them no good and the road doesn’t widen and so they fall off. It is sad, but it happens more often than not. And it also is true for more than one in two that do venture this road that their love is not true and they find they’re apart in this harsh frightening place and they find that they can’t stand the look on their face and they stumble and cry as the frightening beasts beat them, then scurry away as their partners retreat, and they loose all the joy and tell sad sorry stories that frighten young children and prosper young lawyers.

So hold fast your other! You dare not let go. There is truth to this story. It’s not just for show! I’ve been down this long path more than once don’t you see and found many fierce terrible things in my spree. These beasties I speak of were you and were me as we fell off the path and made bitter decrees to get justice from all those around: our just share! when in-stead we were missing the love that was there.

So hold tight to your lover, make strong your belief, and find comfort in each other’s arms, and sew peace, and you’ll find after all that true love does survive all the slings and the arrows that life can provide, and in fact it repels them. It really is true… just remember this magic will take both of you.

– Original (c) 1998 Pete McNeil

Feb 182013

MicroNeil has always been interested in the application of synthetic intelligence to real-world problems. So, when we were presented with the challenge of protecting messaging systems (and specifically email) from abuse, we applied machine learning and other AI techniques as part of the solution.

Email processing, and especially filtering, presents a number of challenges:

  • The Internet is increasingly a hostile environment.
  • Any systems that are exposed to the Internet must be hardened against attack.
  • The value of the Internet is derived from it’s openness. This openness tends to be in conflict with protecting systems from attack. Therefore, security measures must be carefully crafted so that they offer protection from abuse without compromising desirable and appropriate operations.
  • The presence of abuse and the corresponding need for sophisticated countermeasures sets up an environment that is constantly evolving and growing in complexity.
  • There is disagreement on: what constitutes abuse, the design of countermeasures and safeguards, what risks are acceptable, and what tactics are appropriate.
  • All of these conditions change over time.

As consequence of these circumstances any successful filtering system must be extremely efficient, flexible, and dynamic. At the same time it must respond to this complexity without becoming too complex to operate. This sounds like a perfect place to apply synthetic intelligence but in order to do that we need to use a framework that models an intelligent entity interacting with it’s environment.

The progressive evaluation model provides precisely that kind of framework while preserving both flexibility and control. This is accomplished by mapping a synthetic environment and the potential responses of an intelligent automaton (agent) onto the state map of the SMTP protocol and the message delivery process.

Each state in the message delivery process potentially represents a moment in the life of the agent where it can experience the conditions present at that moment and determine the next action it should take in response to those conditions. The default action may be to proceed to the next natural step in the protocol but under some conditions the agent might choose to do something else. It may initiate some kind of analysis to gather additional information or it might execute some other intermediate step that manipulates the underlying protocol.

The collection of steps that have been taken at any point and the potential steps that are possible from that point forward represent various “filtering strategies.” Filtering strategies can be selected and adjusted by the agent based on the changing conditions it perceives, successful patterns that it has learned, and the preferences established by administrators and users of the system.

The filtering strategies made available to the agent can be restrictive so that the system’s behavior is purely deterministic; or they can be flexible to allow the agent to learn, grow, and adapt. The constraints and parameters that are established represent the system policy and ultimately define what degrees of freedom are provided to the agent under various conditions. The agent works within these restrictions to optimize the performance of the system.

In a highly restrictive environment the agent might only be to allowed to determine which DNSBLs to check based on their speed and accuracy. Suppose there are several blacklists that are used to reject new connections. If one of these blacklists were to become slow to respond or somehow inaccurate (undesirable) then the agent might be allowed to exclude that test from the filtering strategy for a time. It might also be allowed to change the order in which the available blacklists are checked so that faster, less comprehensive blacklists are checked first. This would have the effect of reducing system loads and improving performance by rejecting many connections early in the process and applying slower tests only after the majority of connections have been eliminated.

A conservative administrator might also permit the agent to select only cached results from some blacklists that are otherwise too slow. The agent might make this choice in order to gain benefit from blacklists that would otherwise degrade the performance of the system. In this scenario the cached results from a slow but accurate blacklist would be used to evaluate each message and the blacklist would be queried out of band for any cache misses. If the agent perceived an improvement in the speed of the blacklist then it could elect to use the blacklist normally again.


Figure 1 – Basic Progressive Response Model for Email Processing

Refer to Figure 1. Generally the agent “lives” in a sequence of events that are triggered by new information. At the most basic level it is either acting or waiting for new information to arrive. When information is added to it’s local context (red arrows) then that new information is applied (green arrows) to the current state of the agent (blue boxes).

If the new information is relevant and sufficient then it will trigger the agent to take action again and change it’s state thus moving the process forward. Each action is potentially guided by the all of the information that is available in the local context including a complete history of all previous actions.

In Figure 1, an agent waiting asleep is prompted by it’s local context to let it know that a new connection has occurred. Let’s assume that this particular system is designed so that each agent is assigned a single connection. The agent acts by waking up and changing it’s state. That action (a change in it’s own state) triggers the next action which is to issue a command to test the local blacklist with the new IP. Then, the agent changes it’s state again to a branching state where it will respond to the local blacklist result once it is available. At this point the agent goes back to a waiting state until new information arrives from the test because it is unable to continue without new information.

Next, the local blacklist result arrives in the local context. This prompts the agent again causing it to evaluate the local blacklist result. Depending upon that result it will chose one of two filtering strategies to use moving forward: either rejecting the connection or proceeding to another test.

This process continues with the agent receiving new stimuli and responding to that stimuli according to the conditions it recognizes. Each stimulus elicits a response and each response is itself a stimulus. The chain of stimuli and responses cause the agent to interact with the process following a path through the states made available to it by progressively selecting filtering strategies as it goes.

As each step is taken additional information about the session and each message builds up. Each new piece of information becomes part of the local environment for the agent and allows it to make more sophisticated choices. In addition to conventional test data the agent also builds up other information about it’s operating environment such as performance statistics about the server, other sessions that are active, partial results from it’s own calculations, and references to previous “experiences” that are “interesting” to it’s learning algorithms.

Agents might also communicate with each other to share information. This would allow these agents to from a kind of group intelligence by sharing their experiences and the performance of their filtering strategies. Each agent would gain more comprehensive access to test data and the workload of devising and evaluating new strategies would be divided among a larger and more diverse collection of systems.

The level of sophistication that is possible is limited only by the sophistication of the agent software and the restrictions imposed by system policies. This framework is also flexible enough to accommodate additional technologies as they are developed so that the costs and risks associated with future upgrades are reduced.

Typically any new technologies would appear to the agent as optional tools for new filtering strategies. Existing filtering strategies could be modified to test the qualities of the new tools before allowing them to affect the message flow. This testing might be performed deterministically by the system administrator or the agent might be allowed to adapt to the presence of the new tool and integrate it automatically once it learns how to use it through experimentation.

So far the description we have used is strictly mechanical. Even in an intelligent system it would be possible and occasionally desirable for the system administrator to specify a completely deterministic set of filtering strategies. However, on a system that is not as restrictive there are two opportunities for the intelligence of the agent to emerge.

Parametric adaptation might allow the agent to respond with some flexibility within a given filtering strategy. For example, if the local blacklist test were replaced by a local IP reputation test then the agent might have a variable threshold that it uses to judge whether the connecting IP “failed.” As a result it would be allowed to select filtering strategies based upon learning algorithms that adjust IP reputation thresholds and develop the IP reputation statistics.

Structural adaptation might allow the agent to swap out components of filtering strategies. Segments of filtering strategies might be represented in a genetic algorithm. After each session is complete the local context would contain a complete record of the strategies that were followed and the conditions that led to those strategies. Each of these sessions could be added to a pool, evaluated for fitness (out of band), and the most successful strategies could then be selected to produce a new population of strategies for trial. A more sophisticated system might even simulate new strategies using data recorded in previous sessions so that the fitness of new filtering strategies could be predicted before testing them on live messages.

Structural and parametric adaptation allow an agent to explore a wide range of strategies and tuning parameters so it can adopt strategies that produce the best performance across a range of potentially conflicting criteria. In order to balance the need for both speed and accuracy the agent might evolve a progressive filtering strategy that leverages lightweight tests early in the process in order to reduce the cost of performing more sophisticated tests later in the process. It might also improve accuracy by combining the scores of less accurate tests using various tunable weighting schemes in order to refine the results.

Another interesting adaptation might depend on session specific parameters such as the connecting system address range, HELO, and MAIL FROM: information, header structure, or even the timing and sequence of the events in the underlying protocol. Over time the agent might learn to use different strategies for messages that appear to be from banks, online services, or dynamic address ranges.

Given enough flexibility and sensitivity it could learn to recognize early clues in the message delivery process and select from dozens of highly tuned filtering strategies that are each optimized for their own class of messages. For example it might learn to recognize and distrust systems that stall on open connections, attempt to use pipelining before asking permission, or attempt to guess recipient addresses through dictionary attacks.

It might also learn to recognize that messages from particular senders always include specific features. Any messages that disagree with the expected models would be tested by filtering strategies that are more “careful” and apply additional tests.

Systems with intelligent agents have the ability to adapt automatically as operating conditions change, new tests are made available, and test qualities change over time. This ability can be extended if collections of agents are allowed to exchange some of their more successful “formulas” with each other so that all of the participating agents can learn best practices from each other. Agents that share information tend to converge on optimal solutions more quickly.

There are also potential benefits to sharing information between systems of different types. Intelligent intrusion detection systems, application servers, firewalls, and email servers could collaborate to identify attackers and harden systems against new attack vectors in real time. Specialized agents operating in client applications could further accelerate these adaptations by contributing data from the end user’s point of view.

Of course, optimizing system performance and responding to external threats are only parts of the solution. In order to be successful these systems must also be able to adapt to changing stakeholder preferences.

Consider that a large scale filtering system needs to accommodate the preferences of system administrators in charge of managing the infrastructure, group administrators in charge of managing groups of email domains and/or email users, power users who desire a high degree of control, and ordinary users who simply want the system to work reliably and automatically.

In a scenario like this various parts of the filtering strategy might be modified or swapped into place at various stages based on the combined preferences of all stakeholders.


Figure 2 – Structural Adaptation Per User

At any point during the progressive evaluation process it is possible to change the remaining steps in the filtering strategy. The change might be in response to the results of a test, results from an analysis tool, a change in system performance data, or new information about the message.

In Figure 2 we show how the filtering strategy established by the administrator is followed until the first recipient is established. The first recipient is interpreted as the primary user for this message. Once the user is known the remainder of the filtering strategy is selected and adjusted based on the combined user, domain, group, and administrator preferences that apply.

Beginning with settings established by the system administrator each successively more specific layer is allowed to modify parts of the filtering strategy so that the composite that is ultimately used represents a blend of all relevant preferences. The higher, more general layers determine the default settings and establish how preferences can be modified by the lower, more specific layers.


Figure 3 – Blended Profile Selection

Refer to Figure 3. The applicable layers are selected from the bottom up. The specific user belongs to a domain, a domain belongs to a group, a group may belong to another group, and all top level groups belong to the system administrator. Once a specific user (recipient) is identified then the applicable layers can be selected by following a path through the parent of each layer until the top (administrator) layer is reached. Then, the defaults set by the administrator are applied downward and modified by each layer along the same path until the user is reached again. The resulting preferences contain a blend of the preferences defined at each layer.


Figure 4 – Composite Strategy Interaction

It is important to note that these drawings are potentially misleading in that they may appear to show that the agent is responsible for executing the SMTP protocol and all that is implied. In practice that would not be the case. Some of the key states in the illustrated filtering strategies have been named for states in the SMTP protocol because the agent is intended to respond to those specific conditions. However the machinery of the protocol itself is managed by other parts of the software – most likely embedded in the machinery that builds and maintains the local context.

You could say that the local context provides the world where the intelligent agent lives. The local context implements an API that describes what the agent can know and how it can respond. The agent and the local context interact by passing messages to each other through this API.

Typically the local context and the agent are separate modules. The local context module contains the machinery for interacting with the real world, interpreting it’s conditions, and presenting those conditions to the agent in a form it can understand. The agent module contains the machinery for learning and adapting to the artificial world presented by the local context. Both of these modules can be developed and maintained independently as long as the API remains stable.

It should be noted that this kind of framework can be applied broadly to many kinds of systems – not just email processing and other systems on the Internet. It is possible to map synthetic intelligence like this into any system that has sufficiently structured protocols and can tolerate some inconsistency during adaptation. The protocols provide a foundation upon which an intelligent agent can “grow” it’s learning processes. A tolerance for adaptation provides a venue for intelligent experimentation and optimization to occur.

Further, the progressive evaluation model is also not limited to large-scale processes like message delivery. It can also inform the development of smaller applications and even specialized functions embedded in other programs. A lightweight implementation of this technique underpins the design of the pattern matching engine used in Message Sniffer. Unlike conventional pattern matching engines, Message Sniffer uses a swarm of lightweight intelligent agents that explore their data set collaboratively in the context of an artificial “world” that is structured to represent their collective knowledge. Each of these agents progressively evaluates it’s location in the world, it’s location in the data set, it’s own state, and the locations and states of it’s peers. This approach allows the engine to be extremely efficient and virtually immune to the number of patterns it must recognize simultaneously.

Broadly speaking, this technique can be applied to a wide range of tasks such as automated network management, data systems provisioning, process control and diagnostics, interactive help desks, intelligent data mining, logistics, robotics, flight control systems, and many others.

Of course, email processing is a natural fit for applications that implement the Progressive Evaluation Model as a way to leverage machine learning and other AI techniques. The Internet community has already demonstrated a willingness to “bend” the SMTP protocol when necessary, SMTP provides a good foundation upon which to build intelligent interactive agents, and messaging security is a complex, dynamic problem in search of strong solutions.

Nov 092012

I recently read a few posts that suggest any computer language that requires an IDE is inherently flawed. If I understood the argument correctly the point was that all of the extra tools typically found in IDEs for languages like C++ and Java are really crutches that help the developer cope with the language’s failings.

On the one hand I suppose I can see that point. If it weren’t for all of the extra nudging and prompting provided by these tools then coding a Java application of any complexity would become much more difficult. The same could be said for serious C++ applications; and certainly any application with mixed environments and multiple developers.

On the other hand these languages are at the core of most heavy-lifting in software development and the feature list for most popular IDEs continues to grow. There must be a reason for that. The languages that can be easily managed with an ordinary editor (perhaps one with good syntax highlighting) are typically not a good fit for large scale projects, and if they were, a more powerful environment would be a must for other reasons.

This got me thinking that perhaps all of this extra complexity is part of the ongoing evolution of software development. Perhaps the complexity we are observing now is a temporary evil that will eventually give way to some truly profound advancements in software development. Languages with simpler constructs and syntax are more likely throw-backs to an earlier paradigm while the more complex languages are likely straining against the edges of what is currently possible.

The programming languages we use today are still rooted in the early days of computing when we used to literally hand-wire our systems to perform a particular task. In fact the term “bug” goes all the way back to actual insects that would occasionally infest the circuitry of these machines and cause them to malfunction. Once upon a time debugging really did mean what it sounds like!

As the hardware of computing became more powerful we were able to replace physical wiring with machine-code that could virtually rewire the computing hardware on the fly. This is still at the heart of computing. Even the most sophisticated software in use today eventually breaks down into a handful of bits that flip switches and cause one logic circuit to connect to another in some useful sequence.

In spite of the basics task remaining the same, software development has improved significantly over time. Machine-code was better than wires, but it too was still very complicated and hardware specific. Remembering op codes and their numeric translations is challenging for wetware (brains) and in any case isn’t portable from one type of machine to another. So machine-code eventually evolved into assembly language which allowed programmers to use more familiar verbs and register names to describe what they wanted to do. For example you can probably guess that “add ax, bx” probably instructs the hardware to add a couple of numbers together and that “ax” and “bx” are where those numbers can be found. Even better than that, assembly language offered some portability between one chunk of hardware and another because the compiler (a hardware specific chunk of software) would keep track of the specific op codes so that software developers could more easily reuse and share chunks of code.

From there we evolved to languages like C that were just barely more sophisticated than assembly language. In the beginning, C was slightly more than a handy syntax that could be expanded into assembly language in an almost cut-and-paste fashion. It was not uncommon to actually use assembly language inside of C programs when you wanted to do something specific with your hardware and you didn’t have a ready-made library for it.

That said, the C language and others like it did give us more distance from the hardware and allowed us to think about software more abstractly. We were better able to concentrate on algorithms and concepts once we loosened our grip the wiring under the covers.

Modern languages have come a long way from those days but essentially the same kind of translation is happening. It’s just that a lot more is being done automatically and that means that a lot more of the decisions are being made by other people, by way of software tools and libraries, or by the machinery itself, by way of memory managers, signal processors, and other specialized devices.

This advancement has given us the ability to create software that is profoundly complex – sometimes unintentionally! Our software development languages and development tools have become more sophisticated in order to help us cope with this complexity and the lure of creating ever more powerful software.

Still, fundamentally, we are stuck in the dark ages of software development. We’re still working from a paradigm where we tell the machine what to do and the machine does it. On some level we are still hand-wiring our machines. We hope that we can get the instructions right and that those instructions will accomplish what we have in mind but we really don’t have a lot of help with those tasks. We write code, we give it to the machine, we watch what the machine does, we make adjustments, and then we start again. The basic cycle has sped up quite a bit but the process of software development is still a very one-way endeavor.

What we are seeing now in complex IDEs could be a foreshadowing of the next revolution in software development where the machines will participate on a more equal footing in the process. The future is coming, but our past is holding us back. Right now we make educated guesses about what the machine will do with our software and our IDEs try to point out obvious errors and give us hints that help our memory along the way. In fact they are straining at the edges of the envelope to do this and the result is a kind of information overload.

The problem has become so bad that switching from one IDE to another is lot like changing countries. Even if the underlying language is the same, everything about how that language is used can be different. It is almost as if we’ve ended up back in the machine-code days where platform specific knowledge was a requirement. The difference is that instead of knowing how to rewire a chunk of hardware we must know how to rewire our tool stack.

So what would happen if we took the next step forward and let go of the previous paradigm completely? Instead of holding on to the idea that we’re rewiring the computer to do our bidding and that we are therefor completely responsible for all of the associated details, we could collaborate with the computer in a way that allows us to bring our relative strengths together and achieve a superior result.

Wetware is good at creativity, abstraction, and the kind of fuzzy thinking that goes into solving new problems and exploring new possibilities. Hardware is good at doing arithmetic, keeping track of huge amounts of data, and working very quickly. This seems like two sides of a great team because each partner brings something that the other is lacking. The trick is to create an environment where the two can collaborate efficiently.

Working with a collaborative IDE would be more like having a conversation than editing code. The developer would describe what they are trying to do using whatever syntax they understand best for that task and the machine would provide a real-time simulation of the result. Along the way the machine would provide recommendations about the solution they are developing through syntax highlighting and co-editing, hints about known algorithms that might be useful, and simulations of potential solutions.

The new paradigm takes the auto-complete, refactoring, and object browser features built into current IDEs and extends that model to reach beyond the code base for any given project. If the machine understands that you are building a particular kind of algorithm then it might suggest a working solution from the current state-of-the-art. This suggestion would be custom fitted to the code you are describing and presented as a complete simulation along with an analysis (if you want it) of the benefits. If the machine is unsure of what you are trying to accomplish then it would ask you questions about the project using a combination of natural language and the syntax of the code you are using. It would be very much like working side by side with an expert developer who has the entire world of computer science at top of mind.

The end result of this kind of interaction would be a kind of intelligent, self-documenting software that understands itself on a very deep level. Each part of the code base would carry with it a complete simulation of how the code should operate so that it can be tested automatically on various target platforms and so that new modifications can be regression tested during the development process.

The software would be _almost_ completely proven by the time it was written because unit tests would have been performed in real-time as various chunks of code were developed. I say, _almost_ because there are always limits to how completely any system can be tested and because there are always unknowns and unintended consequences when new software is deployed.

Intelligent software like this would be able to explain the design choices that were made along the way so that new developers could quickly get a full understanding of the intentions of the previous developers without having to hunt them down, embark on deep research efforts, or make wild guesses.

Intelligent software could also update and test itself as improved algorithms become available, port itself to new platforms automatically as needed, and provide well documented solutions to new projects when parts of the code base are applicable.

So, are strong IDEs a sign of weak languages? I think not. Instead, they are a sign that our current software development paradigm is straining at the edges as we reach toward the next revolution in computing: Intelligent Development Environments.

Aug 302012

An ill wind feeds my discontent it
grows with every lost moment and
steeped in deep anxiety it
grows the pit inside of me
it grows
the pit inside of me it grows the pit
consuming me

I lie awake, I pace the floor,
I watch the ever present door and
stare into the deep abyss
the hole
that ate the self I miss
the hole that ate the self
I miss the whole
I miss the wholeness

Selfless, lost, and still
not going far or fast
Perpetuation lasts and lasts
this yearning, churning, throbbing,
beating fast
I gasp!
breathe deep and let it pass
then start again
then let it pass
again and then

Again the thickening morass
my clouded mind, my faded past
The distant memories
obscured by present miseries and
blurred uncertain
treacheries I sense as apparitions hence

My aspect and my furrowed brow now
fearful to behold and yet
somehow all tempered by the very madness
working to conceal the sadness
stalking me and
still not talking

So I pull hard and
I strain in vein
I try not to complain
Insane, I give in to the pain and
then control myself again
and then again
control myself again
back where I started once again



Aug 252012

I needed to create MD5 hashes to populate a password database for apache. This seemed like a very simple thing. So, when I wanted an MD5 hash in hex for my JSP app I expected to find a utility ready and waiting inside java. No such luck. No problem, I thought — I’ll just “google” it.

I was surprised to find that there are lots of half-baked solutions out there posted on various discussion forums, but none of them were solid, simple, and self-explanatory; or at least they didn’t look like code I would be comfortable with. So I decided to write it up and make it easy to find just in case someone else has the same experience.

My solution breaks out into three tiny functions that might be re-used lots of places.

import java.util.*;
import java.security.*;

String HexForByte(byte b) {
    String Hex = Integer.toHexString((int) b & 0xff);
    boolean hasTwoDigits = (2 == Hex.length());
    if(hasTwoDigits) return Hex;
    else return "0" + Hex;

String HexForBytes(byte[] bytes) {
    StringBuffer sb = new StringBuffer();
    for(byte b : bytes) sb.append(HexForByte(b));
    return sb.toString();

String HexMD5ForString(String text) throws Exception {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    byte[] digest = md5.digest(text.getBytes());
    return HexForBytes(digest);

HexForByte(byte b) gives you a two digit hex string for any byte. This is important because Integer.toHexString() will give you only one digit if the input is less than 16. That can be a real problem if you are building hash strings. Another tricky bit in this function strips the sign from the byte before converting it to an integer. In java, every kind of number is signed so we have to watch out for that when making conversions.

HexForBytes(byte[] bytes) gives you a hex string for any array of bytes. Each byte will be correctly represented by precisely two hex digits. No more, no less.

Wrapping it all up, HexMD5ForString(String text) gives you an MD5 digest in hex of any string. According to the apache documentation this is what I will want to put into the database so that mod_auth_digest can authenticate users of my web app. To see what started all of this look here: http://httpd.apache.org/docs/2.4/misc/password_encryptions.html

With the code above in place I can now do something like:

HexMD5ForString( user + ":" + realm + ":" + password );

From the look of it, the Java code on the apache page looks like it will work; and it may be faster; but doing it my way the code is less obscure and yields a few extra utility functions that can be useful other places.


Jun 152012

Back in the early days of spam fighting we recognized a problem with all types of filtering. No matter what kind of filtering you are using it is fairly trivial for an attacker to defeat your filters for a time by pretesting their messages on your system.

You can try to keep them out, but in the end, if you allow customers on your system then any one of them might be an attacker pretending to be an ordinary customer. To test a new campaign they simply send a sample to themselves and see if it makes it through. If it does then they have a winner. If it doesn’t then they need to try again. Either way they always appear to be just an ordinary customer that gets ordinary spam like anyone else.

The simplest and most effective way to solve this problem is to selectively delay the processing of some messages so that all of your filtering strategies have time to catch up to new threats. After a short delay these messages are sent through the filtering process again where they receive the benefit of any adjustments that have been made. We call this solution “Gauntlet” because it forces some messages to “run the gauntlet” before allowing them to pass.

The first step is to send your messages through your usual filtering process. You will be able to discard (or quarantine) most of these immediately. The remaining messages should be fairly clean but, most importantly, they will be a much smaller volume.

The next step is deciding which messages to delay. This is controversial because customer expectations are often unreasonable. Even though email was never designed to be an instantaneous form of communication it tends to be nearly so most of the time; and in any case most email users expect to receive messages within seconds of when they are sent.

The reality is that many messages take some time to be delivered and that there is usually very little control or knowledge on part of the recipient regarding when messages are sent. As a result there is a fair amount of ambiguity over the apparent travel time of any given message. It is also true that while most customers will violently agree that email should never be delayed, under most circumstances a delay will be unnoticed and inconsequential. In fact one of the most powerful features of email is that the recipient can control when they receive and respond to email – unlike phone calls, instant messages, or friends dropping in unannounced.

This flexibility between perceived and actual delivery times gives us an opportunity to defeat pretested spam – particularly if we can be selective about which messages we delay.

The more sophisticated the selection process the less impact delayed processing will have on end users and support staff. Often the benefits from implementing Gauntlet far outweigh any discomfort that might occur.

For example, Message Sniffer generally responds to new threats within seconds of their arrival in spam traps and typically generates new rules within minutes of new outbreaks (if not immediately). Many of those messages, usually sent in dramatically large bursts, are likely to be reaching some mailboxes before they arrive in spam traps. If some messages can be delayed by as little as 10, 20, or 30 minutes then the vast majority of those messages will never reach a customer’s mailbox.

If a selective 30 minute delay can prevent virtually all of a new phishing or malware campaign from reaching it’s target then the benefits can be huge. If a legitimate bank notification is delayed by 30 minutes the delay is likely to go completely unnoticed. It is worth noting that many email users regularly go hours or even days without checking their mail!

On the other hand there are also email users (myself included) that are likely to “live in” their email – frequently responding to messages mere minutes or seconds after they arrive. For these users most of all, the sophistication of the selection process matters.

What should be selected for delayed processing?

More advanced systems might use sophisticated algorithms (even AI) to select messages in or out of delayed processing. A system like that might be tuned to delay anything “new” and anything common in recently blocked messages.

Less sophisticated systems might use lists of keywords and phrases to delay messages that look like high-value targets. Other simple selection criteria might include messages from ISPs and online services that are frequently abused or messages containing certain kinds of attachments. Some systems might chose to delay virtually all messages by some small amount while selecting others for longer delays.

A more important question is probably which messages should not be delayed. It is probably best to expedite messages that are responses to recently sent email, messages from known good senders such as those from sources with solid IP reputations, and those that have been white-listed by administrators or customers.

In order to remove the mystery and offload some of the support work, the best solutions can put some of the controls in the hands of their customers. Customers who feel it is vital that none of their messages are delayed might opt out. Others who prefer to minimize their exposure to threats might elect to impose longer delays and to delay every message regardless of it’s source and content.

One customer who implemented Gauntlet back in the early days had an interesting spin on how they presented it to their users. Instead of telling them they were delaying some messages they told the customer that the delayed messages were initially quarantined as suspicious but later released automatically by sophisticated algorithms. This allowed them to implement relatively moderate delays without burdening their users with any additional complexity.

However it is implemented, delayed message processing is a powerful tool against pretested spam. Recent, dramatic growth in the volume and sophistication of organized attacks by cyber criminals is a clear sign that the time has come to implement sophisticated defenses like Gauntlet.


Mar 312012

Artist: Pete McNeil and Julia Kasdorf
Album: Impromptu
Review By: Dan MacIntosh
Rating: 3.5 Stars (out of 5)

After getting an earful of Julia Kasdorf on Impromptu, it’s really difficult to believe this singer/songwriter/musician actually got her start by playing bass in San Francisco punk bands, such as Angry Samoans.  However, anyone that has followed the punk rock scene long enough is well aware of the way many of these players use punk music as career kickoffs, before moving on to their true musical loves. In Kasdorf’s case, singer/songwriter music — with just a touch of the blues – is the style that most sincerely represents her artistic heart.

Impromptu is actually a two-sided coin, if you will, as Pete McNeil (who also calls himself MadScientist) also contributes songs to this double-artist collection. Whereas Kasdorf goes for the mostly introspective approach to songwriting, McNeil is more apt to rev it up, as he does during the roadhouse blues of “Treat Me Like A Road.”  However, “Kitties” is one of the coolest tracks on this collection. It has a distinctive psychedelic – you might say druggy – feel to it. Instead of rollicking blues guitar, the six-string part is moody and spooky, instead, and placed over an inventive, wandering bass line. McNeil’s “Doldrums” and “Baby Please” are also built upon basic blues structures, much like “Treat Me Like A Road.”

Kasdorf’s songs are consistently lyrically intriguing. For instance, “Motel” opens with her announcing, “I’m gonna hide in a motel.” This could be describing reactionary behavior of typical musicians. However, it could represent something a lot darker, as in someone retreating to such anonymity in order to indulge in destructive drug-taking behavior. Nevertheless, when Kasdorf sings a line about burning old love letters, it suggests something more akin to post-relationship breakup activity.

With “This Heart,” Kasdorf expresses a much more empathetic perspective. It’s sung almost as a prayer, and speaks to the artist’s care for those less fortunate, including the underprivileged in Romania and Brazil. The track also features a bit of surf guitar in its upbeat melody, which is enjoyable. The chorus states, “You gave me this heart.” It reveals that Kasdorf might not be quite so concerned people half a world away, had Jesus not first given her a loving heart.

One other fine song is simply titled “Sunday.” It begins with rain sound effects before Kasdorf begins singing about the rain. When Kasdorf vocalizes on it, it’s with a world-weary, slightly scratchy voice. “I wish it was Sunday again,” she sings longingly. This recording is beautifully augmented by Carla Deniz’s supportive viola.

Although Kasdorf tends to sing with relatively stripped-down arrangements, she sure sounds boisterous and right at home during “Lament,” which also features a bevy of backing vocals and an orchestrated arrangement. This track is one place where the listener might secretly wish it also featured a string section. In other words, a little more could have been even better.

McNeil has said Impromptu is the first compilation for ALT-230 label. If what comes after this album is even close to the quality it contains, that is really a label future to get excited about. These songs may not be as commercial as what’s getting airplay these days, but that’s probably not a bad thing. Sure, it’s interesting to hear how electronic music is playing in such close quarters with rap and R&B, but after a while all of that stuff just starts to sound the same.

Best of all, Impromptu is filled with fantastic songs. The arrangements are slightly on the retro side, but they are retroactive back to a time when music just seemed to make a whole lot more sense. Instead of creating music for feet (for dancing), McNeil and Kasdorf compose songs for the heart and mind. After all, it doesn’t take a genius to create beats, no matter how much rap artists might brag about this particular skill. A title like Impromptu suggests something improvised and made up on the spot. However, his is well planned, and thoughtfully created music. You don’t have to love it, but you really oughta love it.

The Impromptu CD is available at CDBaby, AmazoniTunes, and everywhere you find great music!

Mar 202012

Artist: Julia Kasdorf and Pete McNeil
Album: Impromptu
Reviewed by Matthew Warnock
Rating: 4 Stars (out of 5)

Collaboration is the spark that has ignited some of the brightest musical fires in songwriting history.  When artists come together on a project featuring a core duo or group and a number of guest artists, there is something that can happen that makes these moments special, especially when the stars align and everything winds up in the right place at the right time.  Songwriters and performers Julia Kasdorf and Pete McNeil have recently come together on just such a record, which features the duo on each track alongside various other accomplished artists.  The end result, Impromptu, is an engaging and enjoyable record that possesses a sense of cohesiveness deriving from the duo’s contribution, but that moves in different and exciting directions as the different guest musicians come and go throughout the album.

Though most of the album is a collaborative effort between McNeil, Kasdorf and guest artists, there are a couple of tracks that feature just the duo, including “The Minute I’m Gone,” though one might not realize this unless the liner notes were consulted.  Kasdorf, being a multi-talented multi-instrumentalist, contributes the lyrics and music, as well as performs vocals, both lead and background, guitar and bass, while McNeil brings his talents to the drum work on the track.  Not only is the song a sultry, blues-rock number that grooves and grinds its way to being one of the most interesting songs on the album, but the duo do a seamless job of overdubbing each part to make it sound like a band playing live in the studio, rather than two musicians playing all the parts.  The same is true for the other duo track, “Motel,” though in a more laid-back and stripped-down approach.  Here, the brushes played by McNeil, set up Kasdorf’s vocals, bass and guitar in a subtle yet effective way, allowing the vocals to float over the accompaniment while interacting at the same time.  Recording in this way is not easy, especially when trying to create the atmosphere of an ensemble in the studio, but Kasdorf and McNeil pull it off in a way that is both creative and engaging, and it is one of the reasons that the album is so successful.

McNeil also steps to the forefront of several songs to take over the role of lead vocalist, including the Cream inspired blues-rocker “Doldrums.”  Here, the drummer lays down a hard-driving groove that is supported by Kasdorf on rhythm guitar and bass while he digs deep into the bluesy vocal lines that define the track.  Guest lead guitarist Eric Nanz contributes a memorable solo and plenty of bluesy fills to the song, bringing a Wah-based tone to the track that brings one back to the classic tone used by late-‘60s blues rockers such as Eric Clapton, Jeff Beck and Jimmy Page.  McNeill also takes the reins on the track “Kitties” where he sings, as well as plays drums and synth, with bassist John Wyatt filling in the bottom end.  With a psychedelic vibe to it, the song stands out against the rest of the album in a good way, adding variety and diversity to the overall programming of the album while featuring the talented drummer-vocalist-pianist at the forefront of the track.

Overall, Impromptu is not only a cool concept, but an album that stands on its own musicality and songwriting regardless of the writing and recording process used to bring the project together.  All of the artists featured on the album, the core duo and guest artists alike, gel together in a way that serves the larger musical goals of the record, providing an enjoyable listening experience along the way.

The Impromptu CD is available at CDBaby, AmazoniTunes, and everywhere you find great music!


Mar 162012

When I added the cube to the studio I was originally thinking that it would be just a handy practice amp for Chaos. He was starting to take his electric guitar work seriously and needed an amp he could occasionally drag to class.

Then the day came that one of our guitar friends showed up to record a session for ALT-230 and had forgotten their amp. So, instead of letting the time slot go to waste we decided to give the little cube a try. We figured that if what we got wasn’t usable we would re-amp the work later or run it through Guitar Rig on the DAW.