September 2014 – Life At Warp 9

What if you could describe any kind of digital modulation scheme with high fidelity using just a handful of symbols?

Origin:

While considering remote operations on HF using CW I became intuitively aware of two problems: 1. Most digital modes done these days require a lot of bandwidth because the signals are being transmitted and received as digitized audio (or worse), and 2. Network delays introduce timing problems inherent in CW work that are not normally apparent in half-duplex digital and audio work.

I’m not only a ham and an engineer, I’m also a musician and a CW enthusiast… so it doesn’t sit well with me that if I were to operate CW remotely I would have few options available to control the timing and presentation of my CW transmissions. A lot more than just letters can be communicated through a straight key – or even paddles if you try hard enough. Those nuances are important!

While considering this I thought that instead of sending audio to the transmitter as with other digital modes I might simply send timing information for the make and break actions of my key. Then it occurred to me that I don’t need to stop there… If I were going to do that then in theory I could send high fidelity modulation information to the transmitter and have it generate the desired signals via direct synthesis. In fact, I should be able to do that for every currently known modulation scheme and then later use the same mechanism for schemes that have not yet been invented.

The basic idea:

I started scratching a few things down in my mind and came up with the idea that:

I can define a set of frequencies to use for any given modulation scheme and then identify them symbolically.
I can define a set of phase relationships and assign them symbolic values.
I can define a set of amplitude relationships and assign them symbolic values.
I can define the timing and rate of transition between any of the above definitions.

If I were to send a handful of bytes with the above meanings then I could specify whatever I want to send with very high fidelity and very low bandwidth. There are only so many things you can do to an electrical signal and that’s all of them: Frequency, Amplitude, Phase, … the rate of change between those, and the timing of those changes.

Consider:

T – transition time in milliseconds, 0 – 255.
A – amplitude as a fraction of max voltage, 0 – 255 (255 = 100%).
F – frequency index (frequencies previously defined during setup), 0 – 255 for up to 256 possible frequencies.
P – phase index (phase shifts previously defined during setup), 0 – 255 for up to 256 possible phase shifts.

To send the letter S in CW I might send the following byte pairs. In each pair the first byte contains an ASCII letter and the second byte contains a numerical value from 0 – 255:

T10 A255 T245 T10 A0 T245 T10 A255 T245 T10 A0 T245 T10 A255 T245 T10 A0 T0

That string essentially means:

// First dit.
// Turn on for about a quarter of a second.

T10 – Transition to the following settings using 10 milliseconds.
A255 – Amplitude becomes 255 (100%).
T245 – Stay like that for 245 milliseconds.

// … then turn off for about a quarter of a second.

T10 – Transition to the following using 10 milliseconds.
A0 – Amplitude becomes 0.
T245 – Stay like that for 245 milliseconds.

// Do that once more for the second dit, then turn on again for the third dit.

// Turn off and stay off.

T10 – Transition using 10 ms.
A0 – Amplitude becomes 0.
T0 – Keep this state until further notice.

36 bytes total (18 byte pairs).
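As a quick check of that count, here is a minimal Python sketch that packs the letter S into byte pairs. The helper names (encode, dit) are made up for illustration; only the layout of one ASCII letter followed by one value byte comes from the description above.

```python
# Hypothetical helper names; the byte-pair layout follows the text above.
VALID_LETTERS = "TAFP"


def encode(tokens):
    """Pack (letter, value) tokens into byte pairs: ASCII letter, then value."""
    out = bytearray()
    for letter, value in tokens:
        if len(letter) != 1 or letter not in VALID_LETTERS:
            raise ValueError(f"unknown parameter letter: {letter!r}")
        if not 0 <= value <= 255:
            raise ValueError(f"value out of range: {value}")
        out.append(ord(letter))
        out.append(value)
    return bytes(out)


def dit(last=False):
    """One dit: ramp up, hold, ramp down, then hold (or T0 if it is the last)."""
    return [("T", 10), ("A", 255), ("T", 245),
            ("T", 10), ("A", 0), ("T", 0 if last else 245)]


letter_s = dit() + dit() + dit(last=True)
payload = encode(letter_s)
print(len(payload))  # 36
```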

So, a handful of bytes can be used to describe the amplitude modulation envelope of a CW letter S using a 50% duty cycle for dits and having a 10ms rise and fall time to avoid “clicks”. This is a tiny fraction of the data that might be required to send the same signal using digitized audio… and if you think about it the same technique could be used to send all other digital modes also.
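To put a rough number on "tiny fraction", the sketch below compares the 36 bytes with the same keying sent as raw audio. The audio format (8 kHz, 8-bit mono PCM) is purely an assumption chosen for the comparison; any real link might use something quite different.

```python
# Assumed audio format for comparison (not from the text): 8 kHz, 8-bit mono PCM.
SAMPLE_RATE_HZ = 8000
BYTES_PER_SAMPLE = 1

S_TOKENS = ("T10 A255 T245 T10 A0 T245 "
            "T10 A255 T245 T10 A0 T245 "
            "T10 A255 T245 T10 A0 T0").split()

# Two bytes per token: one letter, one value.
dmdl_bytes = 2 * len(S_TOKENS)

# Total keyed time is the sum of every T value (the final T0 adds nothing).
duration_ms = sum(int(tok[1:]) for tok in S_TOKENS if tok.startswith("T"))

audio_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * duration_ms // 1000

print(dmdl_bytes, duration_ms, audio_bytes)  # 36 1285 10280
```

Under those assumptions the audio version needs a few hundred times more data for the same letter.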

There’s nothing to say that we have to use that coding scheme specifically. I just chose that scheme for illustration purposes. The resolution could be more or less than 0-255, or there could even be a different range for each parameter. In practice it is likely that the data might be sent in a wide range of formats:

We could use a purely binary mode with byte positions indicating specific data values… (TransitionTime, Amplitude, Frequency, Phase) so that each word defines a transition. That opens up the use of some nonsensical values for special meanings — for example the phase and frequency values have no meaning at zero amplitude so many words with zero amplitude could be used as control signals. Of course that’s not very user friendly so…

We could use a purely text-based mode using ASCII… The same notation used above would provide a user-friendly way of sending (and debugging) DMDL. I think that’s probably my favorite version right now.
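Both framings are easy to sketch in Python. Everything named here (parse_text, pack_word, is_control) is hypothetical, and the control-word rule is just one possible convention: it assumes plain silence is always sent with F=0 and P=0, so any other zero-amplitude word is free to carry a special meaning.

```python
import re
import struct

TOKEN = re.compile(r"([TAFP])(\d{1,3})")
WORD = struct.Struct("4B")  # TransitionTime, Amplitude, Frequency, Phase


def parse_text(text):
    """Parse text such as 'T10 A255 T245' into (letter, value) tuples."""
    tokens = []
    for item in text.split():
        match = TOKEN.fullmatch(item)
        if match is None:
            raise ValueError(f"bad token: {item!r}")
        value = int(match.group(2))
        if value > 255:
            raise ValueError(f"value out of range: {item!r}")
        tokens.append((match.group(1), value))
    return tokens


def pack_word(t, a, f=0, p=0):
    """Pack one transition into a 4-byte binary word."""
    return WORD.pack(t, a, f, p)


def is_control(word):
    """Assumed convention: zero amplitude with a nonzero F or P is a control word."""
    _t, a, f, p = WORD.unpack(word)
    return a == 0 and (f, p) != (0, 0)


tokens = parse_text("T10 A255 T245")
key_down = pack_word(10, 255)        # ramp to full amplitude in 10 ms
key_up = pack_word(10, 0)            # ramp to silence in 10 ms
control = pack_word(0, 0, f=1, p=0)  # hypothetical control code number 1
```

The text form is what you would want to read in a log; the binary form trades that readability for fixed-size words.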

Why not just send the bytes and let the transmitter take care of the rest?

If I were to use a straight key for my input device this mechanism allows me to efficiently transmit a stream of data that accurately defines my operation of that key… so those small musical delays, rhythms, and quirks that I might want to use to convey “more than just the letters” would be possible.
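Here is a minimal sketch of how key timing might be turned into that stream. It assumes the key interface reports (time in ms, key down?) events, uses a fixed 10 ms ramp, and assumes that several T values in a row simply add up, so a hold longer than 255 ms is split into pieces. None of that is settled; it is just one way it could work.

```python
RAMP_MS = 10  # assumed rise/fall time, matching the letter S example


def key_events_to_dmdl(events, ramp_ms=RAMP_MS):
    """Convert (time_ms, is_down) key events into a DMDL text string.

    Assumption: consecutive T tokens with no A/F/P between them add up,
    so holds longer than 255 ms are split into several T tokens.
    """
    tokens = []
    for i, (t_ms, is_down) in enumerate(events):
        tokens.append(f"T{ramp_ms}")
        tokens.append("A255" if is_down else "A0")
        if i + 1 < len(events):
            # Hold until the next key event, minus the ramp already spent.
            hold = events[i + 1][0] - t_ms - ramp_ms
            while hold > 0:
                chunk = min(hold, 255)
                tokens.append(f"T{chunk}")
                hold -= chunk
        else:
            tokens.append("T0")  # keep the final state until further notice
    return " ".join(tokens)


# Perfectly even keying of the letter S, one make or break every 255 ms.
even_s = [(0, True), (255, False), (510, True),
          (765, False), (1020, True), (1275, False)]
print(key_events_to_dmdl(even_s))

# A long, lazy key-down of 600 ms needs its hold split into three pieces.
print(key_events_to_dmdl([(0, True), (600, False)]))
```

Because the event times are carried through directly, any swing or hesitation in the fist ends up in the T values rather than being rounded away.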

At the transmitter these codes can be translated directly into the RF signal I intended to send with perfect fidelity via direct synthesis. I simply tell the synthesizer what to do and how to make those transitions. I know that beast doesn’t quite exist yet, but I can foresee that the code and hardware to create it would not be terribly hard to develop using a wave table, some simple math, and a handful of timing tricks.
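To show roughly what "a wave table and some simple math" could look like, here is a small Python sketch that renders amplitude-only DMDL into audio-rate samples. The linear ramps, the 8 kHz sample rate, the 600 Hz tone, and the function names are all assumptions for illustration; F and P are ignored, and a real synthesizer would work at RF with a better ramp shape.

```python
import math

TABLE_SIZE = 1024
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def render_envelope(tokens, sample_rate=8000):
    """Turn (letter, value) tokens into amplitude samples in 0.0..1.0.

    Each T starts a segment; any A that follows sets the target level,
    and the level moves linearly to that target over T milliseconds.
    A T with nothing after it holds the current level.
    """
    samples = []
    level = 0.0
    i = 0
    while i < len(tokens):
        letter, duration_ms = tokens[i]
        if letter != "T":
            raise ValueError("each segment must start with a T token")
        i += 1
        target = level
        while i < len(tokens) and tokens[i][0] != "T":
            if tokens[i][0] == "A":
                target = tokens[i][1] / 255
            i += 1  # F and P are ignored in this sketch
        n = sample_rate * duration_ms // 1000
        for k in range(n):
            samples.append(level + (target - level) * (k + 1) / n)
        level = target
    return samples


def synthesize(envelope, tone_hz=600, sample_rate=8000):
    """Multiply the envelope by a sine read from the wave table."""
    out = []
    phase = 0.0
    step = tone_hz * TABLE_SIZE / sample_rate
    for level in envelope:
        out.append(level * SINE_TABLE[int(phase) % TABLE_SIZE])
        phase += step
    return out


one_dit = [("T", 10), ("A", 255), ("T", 245), ("T", 10), ("A", 0), ("T", 0)]
envelope = render_envelope(one_dit)
audio = synthesize(envelope)
```

The final T0 produces no samples here; in a live transmitter it would mean "stay put" rather than "stop".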

Even more fun: If I want to try out a new modulation scheme for digital communications then I can get to ground quickly by simply defining the specifications of what I want that modulation to look like and then converting those specs into DMDL code. With a little extra imagination I can even define DMDL code that uses multiple frequencies simultaneously!

If you really want to go down the rabbit hole, what about demodulating the same way?

Now you’re thinking like a Madscientist. A cognitive demodulator based on DMDL could make predictions based on what it thinks it hears. Given sufficient computing power, several of those predictions can be compared with the incoming signal to determine the best fit… something like a chess program tracing down the results of the most likely moves.
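A toy version of that best-fit step might look like the sketch below. It is only an illustration: the candidates are hand-written coarse envelopes rather than anything rendered from real DMDL, mean squared error stands in for whatever scoring a real demodulator would use, and the names are invented.

```python
def score(predicted, observed):
    """Mean squared error between a prediction and the signal; lower is better."""
    n = min(len(predicted), len(observed))
    if n == 0:
        raise ValueError("need at least one sample to compare")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n


def best_fit(candidates, observed):
    """Return the name of the candidate envelope closest to the observation."""
    return min(candidates, key=lambda name: score(candidates[name], observed))


# Coarse envelopes, one sample per dit length: a dit and a dah.
candidates = {
    "dit": [1.0, 0.0, 0.0, 0.0],
    "dah": [1.0, 1.0, 1.0, 0.0],
}

# A slightly smeared dit as it might arrive off the air.
observed = [0.9, 0.2, 0.1, 0.0]

print(best_fit(candidates, observed))  # dit
```

A real system would generate many more predictions and prune them as it goes, which is where the chess comparison comes in.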

The cognitive demodulator would output a sequence of transitions in DMDL and then some other logic that understands the protocol could convert that back into the original binary data.

Even kewler than that, the “converter” could also be cognitive. Given a stream of transitions and the data it expects to see it could send strings of DMDL back to the demodulator that indicate the most likely interpretations of the data stream and a confidence for each option.

If there is enough space and processing power, the cognitive demodulator when faced with a low confidence signal from the converter might respond by replaying the previous n seconds of signal using different assumptions to see if it can get a better score.

Then, if that weren’t enough, an unguided learning system could monitor all of these parameters to optimize for the assumptions that are most likely to produce a high confidence conversion and a high correlation with the incoming signals. Over time the system would learn “how to listen” in various band conditions.