
Audio cassette formats

RichCini

All --

A gentleman from France sent me an audio recording (PCM/WAV) of a section from a Tarbell cassette to try to process into bytes. I developed a program long ago (with a lot of help from Bob Grieb) which uses a software ADC and software UAR/T to take a PCM audio stream (the MITS 88-ACR v1) and convert it to data bytes to be used with my Altair32 Emulator. So, I thought that I'd take that program and adapt it for this.

In the process of pulling my old notes on the program, I thought that I'd add some flexibility to allow processing different audio encodings. This is where I need some expertise to make sure I have all the bits (no pun intended).

The original MITS encoding was 1850/2400Hz, 300,N81 (10-bit data frame). There was a later version of the 88-ACR which used the Bell 103 frequencies (2025/2225Hz) but it's not clear what the serial data format was (assuming the same as v1).

When looking at the Tarbell manual, it looks like it uses 1200/2400Hz, 300,N82 (11-bit data frame) which I think is the Kansas City Standard. But, I've also read that there was a faster "Tarbell native" format which I really can't seem to locate.

Right now, my program only supports the 88-ACR v1 format but it's not that hard to add the others. If someone can help clear this up, I'd appreciate it.
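The difference between the 10-bit (N81) and 11-bit (N82) frames mentioned above can be sketched as a tiny software UART. This is only an illustration, not the code from the program discussed here: the function names are made up, and the input is assumed to be already demodulated, with one 0/1 value per bit time and an idle line of 1.

```python
def encode_async_frames(data, stop_bits=1):
    """Build a bit list: start bit (0), 8 data bits LSB first, stop bit(s) (1)."""
    bits = []
    for byte in data:
        bits.append(0)                                  # start bit
        bits.extend((byte >> n) & 1 for n in range(8))  # data bits, LSB first
        bits.extend([1] * stop_bits)                    # stop bit(s)
    return bits


def decode_async_frames(bits, stop_bits=1):
    """Decode a bit list (one entry per bit time) back into bytes.

    A frame is accepted only if all of its stop bits are 1; otherwise the
    decoder moves on by one bit and looks for the next start bit.
    """
    out = []
    i = 0
    frame_len = 1 + 8 + stop_bits        # 10 for N81, 11 for N82
    while i + frame_len <= len(bits):
        if bits[i] != 0:                 # idle (mark): keep looking for a start bit
            i += 1
            continue
        data = bits[i + 1:i + 9]
        stops = bits[i + 9:i + frame_len]
        if all(s == 1 for s in stops):
            out.append(sum(b << n for n, b in enumerate(data)))
            i += frame_len
        else:
            i += 1                       # framing error: try to resynchronise
    return bytes(out)
```

In this sketch a decoder set for one stop bit also reads a two-stop-bit stream, because the extra stop bit just looks like idle time before the next start bit.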

Thanks!

Rich
 
Rich,

The original MITS encoding was the Bell 103 standard (2025/2225Hz). However, this proved too tight a frequency range to work reliably with many cassette players, so very shortly after its introduction, MITS changed the 88-ACR to use 1850/2400Hz. The data stream remained the same. Both have a center frequency of 2125Hz so it wasn't hard to convert old tapes to new tapes if you could find a cassette player able to reliably read the original format.

The 1200/2400 format you describe is, as you surmised, the Kansas City Standard. A modern demodulation approach that simply compares the incoming frequency to 2125Hz to determine if it's above or below that threshold (as required for the Altair 88-ACR) will also read a KCS tape without any issue.
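As a rough sketch of that threshold idea, the snippet below measures the time between rising zero crossings and reports whether each cycle is above or below 2125Hz. The zero-crossing method, the helper names and the 22050Hz sample rate are assumptions for illustration; a real tape would need filtering first.

```python
import math


def tone(freq_hz, n_samples, rate=22050):
    """Generate a clean test sine wave (stand-in for tape audio)."""
    return [math.sin(2 * math.pi * freq_hz * n / rate) for n in range(n_samples)]


def classify_cycles(samples, rate=22050, threshold_hz=2125.0):
    """Return one bool per full cycle: True if the cycle is above the threshold."""
    results = []
    last_crossing = None
    for n in range(1, len(samples)):
        if samples[n - 1] < 0 <= samples[n]:          # rising zero crossing
            # Linear interpolation gives a sub-sample crossing position.
            frac = -samples[n - 1] / (samples[n] - samples[n - 1])
            crossing = (n - 1) + frac
            if last_crossing is not None:
                freq = rate / (crossing - last_crossing)
                results.append(freq > threshold_hz)
            last_crossing = crossing
    return results
```

With clean tones, 1850Hz, 2025Hz and 1200Hz all land below the threshold while 2400Hz and 2225Hz land above it, which is why one comparison can cover both 88-ACR versions and KCS.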

When it comes to the higher speed Tarbell format, you're in the same boat as I am - I'm not familiar with the details and would have to read about it to learn more. It's probably like the 1200 baud format used by Processor Technology and the Sol-20 which was more of a base band mechanism than it was FSK.

Mike
 
Ok, so I had the MITS sequence backwards. Good to know that what I have for that is probably the most common. The program works quite well on the MITS tapes. If anyone has KCS/Tarbell tape recordings, let me know. I need them sampled at 22050Hz mono, 8-bit PCM/WAV to work. I haven't had much success with higher sampling rates or resolution.
 
hello,

Interesting info.

I've been playing quite a bit with the 'hxtape' system, which uses PHP code to do something very similar to this.

The audio file format is different of course. But from what is described here the bit/wave manipulations are similar, although I don't think I've spotted anything like the idea you mention, where the mid freq becomes the base and anything less is treated as one freq, and anything more as the other. Hxtape may be trying something cleverer, but it doesn't always work; then again, I don't fully understand the code.

However, hxtape works fine for my purposes. I can save progs etc from the HX-20 to the laptop, manipulate the saved .WAV to achieve the required format (this system needs 11,025 Hz, mono, 8 bit) for conversion from the .WAV to .BAS or .TXT. There's also a reverse process to convert .TXT back into .WAV which can then be LOADed back into the HX-20 (using, say, MediaPlayer to PLAY the file while the HX-20 goes into LOAD).

BUT, if the file from the tape is poor, or rather noisy, then hxtape gets too many errors and will not work, while the HX-20's LOAD process seems to be able to LOAD OK files that are too bad for hxtape to work with.

Geoff
 
Hello Rich;

I have been interested in the S100 Tarbell Cassette Board for a long time with the eventual goal of getting one fabricated. In researching the card, the "native" Tarbell recording method appears to be the "Manchester" encoding scheme. The Tarbell can be modified to use the Kansas City Standard but this is a slower format.

There are a few articles on-line that discuss this format (one is enclosed); but it would seem you may have to do some hardware work to get the Tarbell .WAV files in a format that your software can use if the Manchester encoding is being used.

I have never been able to locate an example of an actual Tarbell Tape file but from the Tarbell manual it seems that Don Tarbell used certain bytes as program control bytes...

IF KCTAPE ; BEGIN BLOCK FOR KCTAPE ROUTINES
; *RULES FOR WRITING/READING SOFTWARE FROM CASSETTE TAPE-
; *1) 1ST BYTE MUST BE 'START BYTE' = (3C)H
; *2) 2ND BYTE MUST BE 'SYNC BYTE' = (E6)H
; *3) 'DATA BYTES' ARE SENT NEXT AS SEQUENTIAL DATA
; *4) 'STOP BYTE' = (1A)H IDENTIFIED AS SUCH BY IMMEDIATELY FOLLOWING WITH (FF)H
; *5) 'CKSUM BYTE' WILL FOLLOW IF USED, OTHERWISE THERE WILL BE (FF)H
;
; * EXAMPLE: FF FF FF FF [3C] [E6] [DD DD DD DD DD DD] [1A] [FF] [CKSUM] FF FF FF
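A minimal sketch of how a decoder might apply these rules to an already-decoded byte stream follows. The function name is hypothetical, the checksum byte is returned without being verified (the listing above does not say how it is computed), and the sketch assumes the pair 1A FF does not also occur inside the data.

```python
def extract_payload(stream):
    """Locate [3C][E6] ... [1A][FF] and return (data_bytes, checksum_byte).

    Returns None if the start/sync pair or the stop marker is not found.
    The checksum byte is None if the stream ends right after the marker.
    """
    start = stream.find(b"\x3c\xe6")          # rules 1 and 2: start byte, sync byte
    if start < 0:
        return None
    body = start + 2
    stop = stream.find(b"\x1a\xff", body)     # rule 4: 1A immediately followed by FF
    if stop < 0:
        return None
    data = stream[body:stop]                  # rule 3: sequential data bytes
    after = stop + 2
    checksum = stream[after] if after < len(stream) else None   # rule 5
    return data, checksum
```

For example, the byte string FF FF FF FF 3C E6 41 42 43 1A FF C6 FF FF FF would yield the data bytes 41 42 43 and a trailing byte of C6.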

I will be interested in what you find out about the Tarbell tape file format.

>>> Charles

enclosure: Zipped WordPad 6.1 text file (wordpad included in Win7) View attachment Manchester_Data_Encoding.zip
 
Thanks Charles. Here's some additional information on the recording I got from this person. It came from an LP record (The Bermuda Triangle, by Tomita) and the record (from 1979) contains a message encoded with the "Tarbell protocol". Unfortunately there's not much more information than that. It seems that one of the tracks has a message encoded in it. I'm hoping that it can be re-sampled at the correct bit rate, but I do need to modify my UAR/T to have two stop bits (11-bit serial frame) rather than one (a 10-bit serial frame).
 
Hi Rich,

I'm not sure if it could be of any help, a while back I wrote up a detailed technical article on Processor Technology's 300 and 1200 baud tape record & playback systems.

One interesting thing about it is, as mentioned on page 17 of the article, because of the method it would be possible to gate out noise pulses and clean up the signals from damaged tapes while still recovering the original data:

http://worldphaco.com/uploads/The_SOL-20_tape.pdf

Hugo.
 
http://isaotomita.net/recordings/bermuda.html shows that there are two Tarbell audio recordings and the first part of both should be identical. It gives what the message should be, which might make it easier to figure out the waveforms or determine that the recording engineer didn't use a Tarbell card to encode the data.
 
There's an article by Don Tarbell himself in the April 1977 issue of Kilobaud https://archive.org/details/kilobaudmagazine-1977-04
I feel I've seen other articles on it in similar vintage computer magazines, maybe BYTE..?

(EDIT: Now I recall why I remember this article. That issue also has a series on writing your own operating system that I used for my own attempt at one for my Versafloppy II back in high school.)
 
I forgot to mention that one of the really interesting things about PT's 1200 Baud system was that they were able, in the playback decoder, to recognize and respond to only 1/2 of a cycle of a 600Hz tone. It is actually very clever the way they did it, I thought, using a frequency doubling technique to make it possible. Maybe I'm too easily impressed.
 
Thanks @krebizfan. The person who sent me the clip, sent me the translation for the side A clip, but not the side B. I can say that I'm not getting that translation, so I need to do some more work on it; I think it's because of the serial frame using two stop bits rather than one. A project for this weekend.

If people would find it helpful, I can post the code for the conversion program.
 
Hi,

I join this thread as I'm also involved in trying to decode the audio message contained in the "Bermuda Triangle" LP recording.

The Tarbell format is simply data bits Manchester encoded: "01" decodes to "1", and "10" to "0". The first decoded bit is the most significant bit of an 8-bit byte.

Since there are two pulses per data bit, special care must be taken in order to decode in sync with the first pulse and not the second. On the interface this is achieved by keeping a "clock" derived directly from the rising edges of the rectified audio signal.

Clock is assumed to be acquired when there is a change on the data bits (from 0->1 or 1->0). To achieve this, the tape recording starts with a leading sequence of $FF (all 111...) and then the byte $E6 which provides the needed bit change in order to acquire the clock.

Of course when decoding in software this complex clock acquisition resolves into just looking for a pulse "0110" or "1001".
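A minimal sketch of that software approach is shown here, assuming the audio has already been sliced into one 0/1 level per half bit. The function names are invented for illustration, and invalid pairs ("00" or "11" inside a bit cell) are simply counted and decoded as 0.

```python
def manchester_decode(half_bits):
    """Decode half-bit levels into data bits: "01" -> 1, "10" -> 0.

    Two equal adjacent half bits (the middle of "0110" or "1001") can only
    occur across a bit-cell boundary, so the first such pair fixes alignment.
    Returns (bits, error_count).
    """
    sync = None
    for i in range(1, len(half_bits)):
        if half_bits[i] == half_bits[i - 1]:
            sync = i                      # index of the first half bit of a cell
            break
    if sync is None:
        return [], 0
    bits, errors = [], 0
    for i in range(sync % 2, len(half_bits) - 1, 2):
        pair = (half_bits[i], half_bits[i + 1])
        if pair == (0, 1):
            bits.append(1)
        elif pair == (1, 0):
            bits.append(0)
        else:
            bits.append(0)                # invalid cell, counted as an error
            errors += 1
    return bits, errors


def bytes_after_sync(bits, sync_byte=0xE6):
    """Shift bits in MSB first until sync_byte appears, then pack the rest."""
    reg = 0
    for i, bit in enumerate(bits):
        reg = ((reg << 1) | bit) & 0xFF
        if i >= 7 and reg == sync_byte:
            payload = bits[i + 1:]
            out = bytearray()
            for j in range(0, len(payload) - 7, 8):
                byte = 0
                for b in payload[j:j + 8]:
                    byte = (byte << 1) | b
                out.append(byte)
            return bytes(out)
    return b""
```

Because alignment comes from the first repeated level rather than from the start of the recording, the sketch still decodes correctly if the first half bit of the leader is missing.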

I wrote a decoding program in JavaScript which I previously used for another Tarbell cassette file, the "Tarbell BASIC 1.5" (I think there is a post on this forum if I remember well).

The problem with the "Bermuda Triangle" LP recording is the bad quality of the audio (which is surprising if we consider it comes from a remastered source put on CD).

If you look at the attached picture below you can see the exact point after the $FF lead where the $E6 byte should be. You can see how some short pulses are below the zero level making them disappear when the signal is quantized.

wave.png

To make the signal less "wavy", I tried to apply different high pass filters with limited success; some clear text starts to appear but most are corrupted bits.

This is what I get after applying a 400 Hz high pass filter 12db rolloff:
piece1.png

And this with 600 Hz high pass 6db rolloff:
piece2.png

To be sure the issue is in the quality of the audio source, I did a plot of the pulse lengths distribution. Since the input signal is manchester encoded, there are only two possible pulses: the short one is 3000Hz (14.7 samples at 44100Hz) and the long one 1500Hz (29.4 samples at 44100Hz).

In a clean signal the plot should be made of two gaussians centered at 15 and 30, with a clear separation between the two; the more separated the signal, the better the overall quality of the audio.

But as you can see from the plot below, there are indeed two gaussian distributions, but they overlap considerably with no distinct division between the two. That results in "0" bits decoding as "1" and vice versa.

distr.png
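A pulse-length distribution like the one described can be gathered with a few lines of code. This is a sketch only: the function names are hypothetical, a "pulse" is taken to be a run of samples between sign changes, and the guard band used to flag ambiguous pulses is an arbitrary choice.

```python
from collections import Counter


def pulse_lengths(samples):
    """Return the length, in samples, of each run between sign changes.

    The final run is dropped because it is not terminated by a crossing.
    """
    lengths = []
    run = 1
    for prev, cur in zip(samples, samples[1:]):
        if (prev >= 0) == (cur >= 0):
            run += 1
        else:
            lengths.append(run)
            run = 1
    return lengths


def histogram(lengths):
    """Return sorted (length, count) pairs, ready for plotting or printing."""
    return sorted(Counter(lengths).items())


def ambiguous_fraction(lengths, short=15, long=30, guard=3):
    """Fraction of pulses that fall within `guard` samples of the midpoint."""
    if not lengths:
        return 0.0
    mid = (short + long) / 2
    return sum(abs(n - mid) <= guard for n in lengths) / len(lengths)
```

On a clean signal at 44100Hz the histogram should show two tight clusters near 15 and 30 and an ambiguous fraction close to zero; a large fraction near the midpoint indicates the kind of overlap described above.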

So apparently I am at a dead end with this one. The only possibility which I can think of is to apply some filtering in order to restore the quality of the audio.

Do you have any suggestion?
 
Well, the person I'm working with sent me additional copies of the audio, both 44100Hz and 22050Hz in 32-bit and 8-bit from both the CD and the LP. The conversion program I wrote for the MITS 88-ACR works in two-steps -- a software ADC based on two 81-step Hamming band pass filters with the bands corresponding to the two audio frequencies and then a software UAR/T to convert the cleaned-up audio into characters. The 88-ACR uses one stop bit while the Tarbell uses two. That's what I'm working on now, but I needed clean copies of the audio to work from.
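For readers unfamiliar with the first step, here is a rough sketch of a pair of Hamming-windowed band-pass FIR filters used to decide which of two tones is present. It is not the code from the program described above: it assumes "81-step" means 81 filter taps, and the band half-width of 200Hz, the function names and the energy comparison are all illustrative choices.

```python
import math


def bandpass_taps(f_lo, f_hi, rate, n_taps=81):
    """Windowed-sinc band-pass FIR coefficients with a Hamming window."""
    mid = (n_taps - 1) / 2
    taps = []
    for n in range(n_taps):
        k = n - mid
        if k == 0:
            ideal = 2 * (f_hi - f_lo) / rate
        else:
            ideal = (math.sin(2 * math.pi * f_hi * k / rate)
                     - math.sin(2 * math.pi * f_lo * k / rate)) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (n_taps - 1))
        taps.append(ideal * window)
    return taps


def band_energy(samples, taps):
    """Mean squared filter output over the fully overlapped region."""
    n = len(taps)
    total, count = 0.0, 0
    for i in range(n - 1, len(samples)):
        y = sum(taps[j] * samples[i - j] for j in range(n))
        total += y * y
        count += 1
    return total / count if count else 0.0


def stronger_tone(samples, rate=22050, f_low=1850, f_high=2400, half_width=200):
    """Return whichever of the two tone frequencies carries more energy."""
    low = band_energy(samples, bandpass_taps(f_low - half_width, f_low + half_width, rate))
    high = band_energy(samples, bandpass_taps(f_high - half_width, f_high + half_width, rate))
    return f_high if high > low else f_low
```

In practice the comparison would be made over a short sliding window so that each bit time yields its own decision, which the software UAR/T then samples.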

At some point, I'll post the code for it, but it's not cleaned up enough to release yet.

Rich
 
"The 88-ACR uses one stop bit while the Tarbell uses two."
The Tarbell interface does not use any stop bits; it's a synchronous stream once the clock is acquired.

The interface can also be made compatible with the Kansas City Standard, in that case there are indeed stop bits, but of course it's a different protocol and anyway it's not the one used in the "Bermuda Triangle LP".

When used in native Tarbell format, the data rate is also different: it's 1500 bits per second. As said before, the signal is made from a 1500 bps bitstream in XOR with a 1500 Hz clock signal.

I also have the whole set of WAV files, but they all have the same audio quality issue.

If you need clean test .WAV files I can make new ones with an encoding utility I wrote before. I also have the .WAV file sampled from the Tarbell BASIC cassette tape, sent to me by Jonathan Haddox. It's a noisy recording but it can be decoded with some effort.
 
"The Tarbell interface does not use any stop bits; it's a synchronous stream once the clock is acquired."

Well, then I probably misunderstood the manual...it looks like it may be referring to the KCS format. Page 4 of the Tarbell Cassette manual says (in talking about BYTE and Don Lancaster) "...In this format, each 8-bit byte is written on tape in an asynchronous format, with one start bit (zero), 8 data bits (zero or one), and two stop bits (ones)." That's where I got the 2-stop bits from, so yes, KCS. Thanks for pointing that out.

Page 13 also mentions 3Ch as a start byte before the E6h sync byte. It also mentions checksum bytes.

I don't have a Tarbell card in my collection so I can't experiment with it. Would it be fair to say that a "0" is 1500Hz and a "1" is 3000Hz, or do I have that reversed?


Thanks.
Rich
 
"Well, then I probably misunderstood the manual...it looks like it may be referring to the KCS format."
Yes, the asynchronous format is the KCS, which they provided for compatibility reasons, but the true Tarbell format is the synchronous one with Manchester encoding.

"Page 13 also mentions 3Ch as a start byte before the E6h sync byte. It also mentions checksum bytes."
Yes, there is the 3Ch byte and the checksum in the protocol, but I haven't found them in the Tarbell BASIC recording (at the moment the only good file we have).

"Would it be fair to say that a '0' is 1500Hz and a '1' is 3000Hz, or do I have that reversed?"

Bits are mapped to tones only in the KCS format. In the Tarbell "0" is an audio pulse that starts HIGH and then goes LOW, "1" is the opposite, starts LOW and then goes HIGH. It is the result of the data in XOR with the clock.
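The XOR relationship can be sketched as a small encoder. The names and the rendering step are illustrative assumptions: the clock is taken to be high in the first half of each bit cell, which reproduces the "0 starts HIGH, 1 starts LOW" description, and the 44100Hz rate and 1500 bps figures come from earlier in this thread.

```python
def manchester_encode(data):
    """Return half-bit levels for the given bytes, MSB first.

    Each data bit is XORed with a clock that is 1 for the first half of the
    bit cell and 0 for the second half, so 0 -> (1, 0) and 1 -> (0, 1).
    """
    half_bits = []
    for byte in data:
        for n in range(7, -1, -1):
            bit = (byte >> n) & 1
            for clock in (1, 0):
                half_bits.append(bit ^ clock)
    return half_bits


def render(half_bits, rate=44100, bit_rate=1500):
    """Turn half-bit levels into a square-wave sample list of +1.0 / -1.0.

    The half-bit length (14.7 samples at these defaults) is not an integer,
    so sample positions are accumulated and rounded to avoid drift.
    """
    per_half = rate / (2 * bit_rate)
    samples = []
    for i, level in enumerate(half_bits):
        end = round((i + 1) * per_half)
        samples.extend([1.0 if level else -1.0] * (end - len(samples)))
    return samples
```

Encoding the sync byte E6 (11100110) this way gives the half-bit sequence 01 01 01 10 10 01 01 10, which contains the repeated levels a decoder can use to find the bit-cell boundaries.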

BTW I also made a KCS decoder and encoder program in plain JavaScript here:

https://github.com/nippur72/z80ne-wav

("Z80NE" is the name of the computer that used the KCS format for which I wrote the utility).
 
The software I wrote processes the audio stream in two phases. The first phase does the ADC work, so just picking out the ones from the zeros based -- in the case of the ACR/KCS -- on the two frequency patterns recorded on the tape. The second part takes that preprocessed stream and reassembles it into 8-bit bytes. In the ACR/KCS version, it's a software UAR/T, but in this case it would do the Manchester decoding, looking for the transitions 0->1 (low->high) or 1->0 (high->low) and the 01 and 10 patterns. In one of the earlier posts, the following statement is made: "Since the input signal is manchester encoded, there are only two possible pulses: the short one is 3000Hz (14.7 samples at 44100Hz) and the long one 1500Hz (29.4 samples at 44100Hz)." I guess what I'm asking is which one is considered "low" and which one is "high" so I can look for transitions.

I have attached a copy of my core processing code. Since it's a Windows program, there's a lot of code there, but the important stuff is in ProcessWavFile(). I noticed that my comment at the top of the source file about the versions of the 88-ACR needs to change as I have the versions backwards.

View attachment WaveProc.c.zip

Rich
 
"I have attached a copy of my core processing code. Since it's a Windows program, there's a lot of code there, but the important stuff is in ProcessWavFile(). I noticed that my comment at the top of the source..."

I gave a look at the source code, cool stuff! I do a similar thing in my KCS decoder, I use two tuned filters to detect the tones and then a low pass filter + rectifier to do AM demodulation and recover the original baseband signal. I work in JavaScript which is very practical because it has libraries for handling WAV files and filters (which are calculated at runtime).

Anyway most of the cassette interfaces of the time did not use such filtering/DSP techniques, they simply counted the rising edges of the signal doing a very raw FM decoding (but it worked nicely).

"In one of the earlier posts, the following statement is made: 'Since the input signal is manchester encoded, there are only two possible pulses: the short one is 3000Hz (14.7 samples at 44100Hz) and the long one 1500Hz (29.4 samples at 44100Hz).' I guess what I'm asking is which one is considered 'low' and which one is 'high' so I can look for transitions."

The short and long pulses that come out of the manchester encoding can't be traced back to bits: a short pulse is a half bit, but a long pulse spans over two consecutive bits (half in one bit and the other half in the successive). This is a consequence of the manchester encoding. For example:

original_data_bits[] = { "0", "0", "0", "0", "1", "0" }

manchester[] = { "10", "10", "10", "10", "01", "10" }

now manchester[3][1] == manchester[4][0] == '0'

this results in a "00" pulse when the audio signal is produced. That pulse has a period of 1/1500 sec. Such a long pulse is also detected in the Tarbell interface to put the clock in sync.
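The relationship between half bits and pulses can be shown with a short sketch that uses the six Manchester pairs listed above. The helper names and the 22-sample threshold (roughly midway between 14.7 and 29.4 samples at 44100Hz) are assumptions for illustration.

```python
from itertools import groupby


def run_lengths(half_bits):
    """Collapse half-bit levels into (level, length_in_half_bits) pulses."""
    return [(level, len(list(group))) for level, group in groupby(half_bits)]


def pulses_to_half_bits(lengths, first_level, threshold=22.0):
    """Rebuild half-bit levels from measured pulse lengths in samples.

    A short pulse is one half bit, a long pulse is two, and the level
    toggles after every pulse.
    """
    half_bits = []
    level = first_level
    for length in lengths:
        half_bits.extend([level] * (2 if length > threshold else 1))
        level ^= 1
    return half_bits


# The pairs 10 10 10 10 01 10 from the example, flattened to half bits.
example = [1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0]
```

Running `run_lengths(example)` gives seven short pulses, then a long LOW pulse and a long HIGH pulse (each straddling a bit boundary), then a final short pulse.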
 
I would suggest peak detection. Every time the signal changes direction, that's a peak, so flip the result. It ignores baseline shift. Of course you lose polarity, but you can recover that by looking for SYNC or not SYNC.

joe
 
"I would suggest peak detection."

Thank you Joe! While thinking about your suggestion and how to detect peaks, I calculated the first derivative of the audio signal and suddenly realized I had quite a clean signal to work with. So I applied a wideband low pass filter (4000 Hz) to let only the information pass through.
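Why the derivative helps can be seen in a small sketch: a slow baseline drift dominates the raw signal but almost vanishes after differencing, so the zero crossings of the derivative track the peaks of the carrier. The function names are hypothetical, and a 3-point moving average stands in for the 4000 Hz low pass filter mentioned above.

```python
import math


def first_difference(samples):
    """Discrete first derivative: difference between neighbouring samples."""
    return [b - a for a, b in zip(samples, samples[1:])]


def moving_average(samples, width=3):
    """Very simple low-pass smoothing (an assumed stand-in for a real filter)."""
    half = width // 2
    return [sum(samples[i - half:i + half + 1]) / width
            for i in range(half, len(samples) - half)]


def count_sign_changes(samples):
    """Number of zero crossings, i.e. pulse boundaries seen by a slicer."""
    return sum((a >= 0) != (b >= 0) for a, b in zip(samples, samples[1:]))


# Test signal: a 1500Hz carrier riding on a much larger 50Hz baseline drift.
RATE = 44100
wavy = [math.sin(2 * math.pi * 1500 * n / RATE) + 3 * math.sin(2 * math.pi * 50 * n / RATE)
        for n in range(4410)]
cleaned = moving_average(first_difference(wavy))
```

Here `count_sign_changes(wavy)` misses most of the carrier's 300 or so half cycles because the drift pushes them away from zero, while `count_sign_changes(cleaned)` recovers nearly all of them. As noted earlier in the thread, polarity is lost and has to be recovered from the sync pattern.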

This was my clean signal in the exact same spot after the leading sequence:
derivata.png

I then fed it to my decoding program and ...voilà! The whole text fully decoded!

Code:
**********  THIS IS THE BERMUDA TRIANGLE, OVER. LOOK OUT!
THE CYLINDRICAL OBJECT JUST LIKE THE ONE EXPLODED OVER
SIBERIA AND CRASHED INTO TUSNGUSKA IN 1908,  HAS JUST 
COME INTO THE SOLAR SYSTEM.  ************  ***********

This is the hex dump; note the $3C $E6 start bytes and the two-byte checksum at the end (still to be verified).
decoded.png
 