The oscillator portion of the Kawai K3 consists of 12 digital oscillators with a fairly clean sound comparable to the Korg DWGS models. However, its design is remarkably simple and uses no custom ICs.

In general, it’s difficult to design a practical polyphonic digital synthesizer using only off-the-shelf parts, since even a straightforward oscillator design ends up being complicated and expensive. A divide-by-n based design generally needs separate hardware for each oscillator. While this is simple for a monophonic instrument, it doesn’t scale well for polyphonic synthesizers. Phase accumulators are better suited to polyphonic instruments because time multiplexing can be implemented easily, but to obtain good enough pitch resolution the phase accumulator needs to be about 24 bits wide. Many general purpose logic ICs like the 74283 adder only handle 4 bits, so between the phase memory, frequency memory, phase adder and phase latch, a standard implementation needs about 24 ICs just for the phase accumulator. An example of this is the PPG Wave 2.2’s PROZ board, which contains 72 ICs. Further, unless the sample rate is very high, it’s desirable to have good quality interpolation in order to avoid objectionable aliasing. But this generally requires a multiplier, and multipliers were expensive at the time. Alternatively, truncation can be used with highly oversampled waveform data, as in the Korg DWGS models. This doesn’t use a multiplier, but instead requires a large amount of waveform memory. To reduce the size and complexity of the hardware, most practical digital oscillator designs used custom ICs that integrate the phase accumulator and multiplier (if applicable) into a smaller package. But while custom ICs can offer good performance with simple hardware, they can be expensive to design, and are thus not well suited to relatively inexpensive instruments with small production runs.
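To put rough numbers on the pitch resolution point, here is a small Python sketch that compares the size of one frequency step for different accumulator widths. The sample rate and the test note are assumptions chosen for illustration, not values taken from any particular instrument’s documentation.

```python
import math

def pitch_step_cents(acc_bits, sample_rate_hz, note_hz):
    """Size of one frequency-word step, in cents, at a given note frequency."""
    step_hz = sample_rate_hz / (1 << acc_bits)   # smallest frequency increment
    return 1200.0 * math.log2((note_hz + step_hz) / note_hz)

SAMPLE_RATE_HZ = 41667.0   # assumed; roughly the K3 rate quoted later in the text
LOW_NOTE_HZ = 32.7         # assumed test note, about C1

for bits in (16, 20, 24):
    cents = pitch_step_cents(bits, SAMPLE_RATE_HZ, LOW_NOTE_HZ)
    print(f"{bits} bit accumulator: {cents:.4f} cents per step at {LOW_NOTE_HZ} Hz")
```

The step size in Hz is constant, so the error in cents is worst for the lowest notes, which is why the width is set by the bass end of the range.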

While there were a few commercially available polyphonic digital synthesizers and samplers built from only off-the-shelf parts (including the PPG Wave 2, 2.2 and 2.3, DK Synergy, NED Synclavier, Fairlight CMI, Dynacord ADD One, Gleeman Pentaphonic, digital organs made by Wersi and Hohner, and most divide-by-n samplers), they were generally expensive and complicated. As far as I’m aware, the only really economically designed digital synthesizers of any significance to use only off-the-shelf parts were the Kawai K3 and some computer peripherals like the Mountain Computer Music System (best known as the digital oscillator boards for the Syntauri alphaSyntauri), Acorn Music 500/Hybrid Music 5000/Peartree Computers Music 87, and the extremely obscure Clef PDSG. The K3 has a remarkably small parts count, with the digital oscillator board containing only 35 ICs. This is achieved by implementing the phase accumulator, linear interpolation and waveform address generation in very clever and efficient ways. A few of the other models I mentioned were similarly clever (mainly the Dynacord ADD One, Wersi and Hohner organs, and the Acorn Music 500). I’ll try to describe some of those later.

Follow along with the annotated schematic:

[Image: K3 annotated schematic]

The K3’s master clock is 12 MHz, fixed frequency. The output sample rate is 41.667 kHz (288 clocks per sample, 24 clocks per oscillator). The phase accumulator seems to be 23 or 24 bits. The phase for each oscillator is calculated in 3 passes, 8 bits at a time. Compared to a more straightforward implementation that uses a 24 bit wide data path to compute the phase all at once, this is significantly slower. This means that the maximum polyphony and sample rate are limited, but it requires far less hardware. In total, the phase accumulator uses only 6 ICs, plus 4 more to permit writing to the frequency memory. Since the phase and frequency data are time multiplexed, 3 more latch ICs are also needed to select the correct data for the waveform generation section.
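The three-pass idea can be sketched in Python as a byte-serial addition with a saved carry. This is only a model of the general technique, not a transcription of the K3 schematic: the 24 bit width, the least-significant-byte-first order and all function names are assumptions made for illustration.

```python
def accumulate_byte_serial(phase_bytes, freq_bytes):
    """Add a 24 bit frequency word to a 24 bit phase, one byte per pass.

    Both arguments are lists of 3 bytes, least significant byte first
    (an assumed ordering). The carry out of the top byte is dropped,
    so the phase wraps around like a hardware accumulator.
    """
    carry = 0
    result = []
    for p, f in zip(phase_bytes, freq_bytes):
        total = p + f + carry        # what an 8 bit adder with carry-in computes
        result.append(total & 0xFF)  # low 8 bits are written back to phase memory
        carry = total >> 8           # carry is held for the next pass
    return result

def to_bytes(value):
    return [(value >> shift) & 0xFF for shift in (0, 8, 16)]

def from_bytes(b):
    return b[0] | (b[1] << 8) | (b[2] << 16)

NUM_OSCILLATORS = 12
STEPS = 1000
# Arbitrary test frequency words, one per oscillator.
freq_words = [0x012345 * (i + 1) for i in range(NUM_OSCILLATORS)]
phases = [to_bytes(0) for _ in range(NUM_OSCILLATORS)]
freqs = [to_bytes(w) for w in freq_words]

# Time multiplexing: one adder serves every oscillator in turn.
for _ in range(STEPS):
    for osc in range(NUM_OSCILLATORS):
        phases[osc] = accumulate_byte_serial(phases[osc], freqs[osc])

print([hex(from_bytes(p)) for p in phases])
```

The result is identical to a single 24 bit wide addition; the cost is that each oscillator needs three adder cycles instead of one.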

The other clever thing is the way the waveforms are stored and reproduced. The K3 uses a series of ROMs to select the multisampled waveforms and implement linear interpolation in a very simple way. This interpolation technique is carried over from Kawai’s ADEPT digital additive organ models from 1981, used here in simplified form. I think this makes the K3 the only commercial digital synthesizer that implements linear interpolation in the digital domain without using a custom IC or high resolution multiplier.

For tabularized waveform wave(n) and fractional component frac, linear interpolation can be implemented in two ways:

out = wave(n) * (1-frac) + wave(n+1) * frac

out = wave(n) + (wave(n+1) - wave(n)) * frac

While these are mathematically equivalent, the second formula requires fewer multiply operations. In the K3, the differential waveform wave(n+1) - wave(n) is tabularized in a second waveform ROM. For linear interpolation, the output of this differential waveform ROM must then be scaled by the fractional waveform address, then added to the output of the main waveform ROM.

The scaling is typically performed by a multiplier, but historically this was a major problem for affordable digital synthesizers, since multipliers were complicated and expensive. They were eventually incorporated into ASICs, but before the late 80s it was generally desirable to find clever ways to eliminate multiply operations entirely. But linear interpolation of 8 bit waveforms reproduced by an 8 bit DAC doesn’t really require a high resolution multiplier. It’s good enough to use a small number of fractional bits, since much of the benefit from more fractional bits would be lost in the quantization distortion anyway. So it’s sufficient here to perform linear interpolation with 3 fractional bits using an 8 x 3 bit multiplier with 8 bit output.

While not necessarily elegant or efficient, any combinational logic, including a multiplier, can be implemented in a ROM. The size of the ROM required depends on the number of inputs and outputs needed by the multiplier. A 16 x 16 bit multiplier with 32 bit output could be implemented as a 4Gi x 32 ROM. Obviously that’s not practical, but a cheap EPROM can be used to easily implement a lower resolution multiplier. In this case it only needs a 2Ki x 8 ROM. The interpolation ROM contains 8 different sets of line segments with slopes 0, 1/8, 2/8, 3/8, 4/8, 5/8, 6/8 and 7/8. The output of the differential waveform ROM drives the lower bits of the interpolation ROM (A0 to A7) and the fractional waveform address drives the upper bits (A8 to A10). The ROM’s output is thus the differential waveform scaled by the fractional address. This is then added to the output of the main waveform ROM with an 8 bit adder, implementing the above formula for linear interpolation with one multiplication.
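The lookup-table multiplier described above can be modelled in a few lines of Python. The address layout follows the text (differential data on A0 to A7, fraction on A8 to A10), but the number format is an assumption: the differential byte is treated as signed two’s complement, the main waveform as unsigned, and the product is rounded toward negative infinity. The actual ROM contents may differ in these details.

```python
import math

def to_signed8(byte):
    return byte - 256 if byte >= 128 else byte

def build_interp_rom():
    """2Ki x 8 table: address = (frac << 8) | diff_byte, data = diff * frac / 8."""
    rom = bytearray(2048)
    for frac in range(8):                  # A8 to A10: slopes 0/8 to 7/8
        for diff_byte in range(256):       # A0 to A7: differential waveform data
            scaled = (to_signed8(diff_byte) * frac) >> 3   # floor of diff * frac / 8
            rom[(frac << 8) | diff_byte] = scaled & 0xFF
    return rom

INTERP_ROM = build_interp_rom()

def interpolate(wave, diff, n, frac):
    """wave(n) + (wave(n+1) - wave(n)) * frac / 8 with one lookup and an 8 bit add."""
    scaled = INTERP_ROM[(frac << 8) | diff[n]]
    return (wave[n] + scaled) & 0xFF       # 8 bit adder, carry out ignored

# Hypothetical test waveform: one sine cycle, plus its precomputed difference table.
POINTS = 32
wave = [round(128 + 100 * math.sin(2 * math.pi * i / POINTS)) for i in range(POINTS)]
diff = [(wave[(i + 1) % POINTS] - wave[i]) & 0xFF for i in range(POINTS)]

print([interpolate(wave, diff, 0, frac) for frac in range(8)])
```

A full 16 x 16 bit multiplier built the same way would need 2**32 addresses, which is the 4Gi figure mentioned above.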

There are two 32Ki x 8 waveform ROMs, one for the main waveform and one for the differential waveform. These contain 31 stored waveforms (the remaining space is empty), plus there are two 2Ki x 8 SRAMs for two programmable waveforms (in practice only one is available at a time). The user waveforms are synthesized by the CPU and written to the waveform RAM. Only 32 out of 128 harmonics may be programmed (this is purely a software limitation), but in all other respects the user waveforms are exactly the same as the other waveforms. Between the two waveform memories, each waveform is 2 x 1Ki x 8. The complete waveform is stored; the waveforms are all summations of sines, but no tricks are used to exploit the odd symmetry that results.

The 1024 points per waveform are divided into 6 multisamples:

  • octave 1: 512 points, 128 harmonics
  • octave 2: 256 points, 64 harmonics
  • octave 3: 128 points, 32 harmonics
  • octave 4: 64 points, 16 harmonics
  • octave 5: 32 points, 8 harmonics
  • octave 6: 32 points, 4 harmonics
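As a quick check of the figures in this list, the following Python sketch totals the points and computes how far each multisample is oversampled, taking a table of N points to be able to hold at most N/2 harmonics. The variable names are illustrative only.

```python
# (octave, points, harmonics), copied from the list above
MULTISAMPLES = [
    (1, 512, 128),
    (2, 256, 64),
    (3, 128, 32),
    (4, 64, 16),
    (5, 32, 8),
    (6, 32, 4),
]

total_points = sum(points for _, points, _ in MULTISAMPLES)

# Ratio of the highest representable harmonic (points / 2) to the highest stored one.
oversampling = [(points // 2) / harmonics for _, points, harmonics in MULTISAMPLES]

print("total points:", total_points)
for (octave, points, harmonics), ratio in zip(MULTISAMPLES, oversampling):
    print(f"octave {octave}: {points} points, {harmonics} harmonics, {ratio:g}x oversampled")
```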

Compared to the Korg DWGS models, the waveforms are oversampled much more modestly. This doesn’t degrade the sound quality, because linear interpolation attenuates the high frequency images more than truncation. In total, the K3’s waveform memory is 64Ki x 8 for 32-ish waveforms, vs. 64Ki x 8 for 8 waveforms in the DW-6000. So the K3 stores 4 times as many waveforms in the same amount of memory.

The waveform ROM is addressed in an unusual way. The address bits can be broken down as follows:

  • WA10 to WA15: These come from the CPU and select the waveform. WA15 selects between the ROM and RAM waveforms.
  • WA5 to WA9: These are the upper phase bits (32 steps), used the same way by all multisamples.
  • WA0 to WA4: These are used as the lower phase bits (1-16 steps), and to select the multisamples. The number of bits used for the waveform phase depends on the length of the multisample selected.

  • WF0 to WF2: Fractional address bits for interpolation.

In a straightforward implementation where the multisamples are arranged sequentially, the bits of the phase accumulator would be masked and added to an offset to play the correct portion of the waveform memory. 12 bits of phase data are needed for this: the upper 9 bits are masked and offset to form the waveform address, then the 3 bits immediately below the masked portion are selected and used as the fractional address. The K3 instead interleaves the multisamples in an unusual format. This is done so that only 6 bits of phase data need to be manipulated to select the correct multisample and generate the fractional bits. This multisample decoding is performed by a 4Ki x 8 ROM, significantly simplifying the hardware. The upper 5 phase bits go straight to the waveform ROM without any modification. The multisamples are automatically selected by the decoder ROM based on the oscillator frequency, so any pitch modulation that crosses the multisample split point will switch immediately. This is in contrast to other synthesizers that select the multisample in software based on the played note, regardless of pitch modulation.
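For comparison, the straightforward sequential layout described at the start of this paragraph might look like the Python sketch below. This is explicitly not the K3’s interleaved format, which isn’t reproduced here; the offsets simply assume the six multisamples are stored back to back, and the function name is hypothetical.

```python
POINTS_PER_OCTAVE = [512, 256, 128, 64, 32, 32]
# Start address of each multisample if they are stored one after another (assumed).
OFFSETS = [sum(POINTS_PER_OCTAVE[:i]) for i in range(len(POINTS_PER_OCTAVE))]
PHASE_BITS = 12   # 9 address bits + 3 fractional bits for the longest multisample
FRAC_BITS = 3

def sequential_address(phase12, octave_index):
    """Mask and offset a 12 bit phase into (waveform address, fractional address)."""
    points = POINTS_PER_OCTAVE[octave_index]
    index_bits = points.bit_length() - 1              # log2 of the table length
    index = phase12 >> (PHASE_BITS - index_bits)      # upper bits select the point
    frac_shift = PHASE_BITS - index_bits - FRAC_BITS  # next 3 bits are the fraction
    frac = (phase12 >> frac_shift) & ((1 << FRAC_BITS) - 1)
    return OFFSETS[octave_index] + index, frac

for octave_index in range(6):
    print(octave_index + 1, sequential_address(0xFFF, octave_index))
```

In hardware this needs a variable shift or bit selector across all 12 phase bits plus an adder for the offset, which is the logic the decoder ROM approach avoids.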

The lowest octave uses only 2 fractional bits for interpolation, resulting in somewhat degraded sound quality. This is because the decoder ROM uses only 6 phase bits total. In the lowest octave, 4 bits are used for the waveform address, leaving only 2 for fractional bits. This seems like a strange design oversight, since the decoder ROM uses only 4Ki x 8 of an 8Ki x 8 EPROM, and there doesn’t seem to be anything preventing the use of an additional phase bit. I think changing IC31 from a 74LS174 (6 bit latch) to a 74LS374 (8 bit latch) and rewriting the decoder ROM to use all 8Ki bytes would have been sufficient to have 3 fractional bits in the lowest octave.

The 2Ki x 8 interpolation ROM also uses an 8Ki x 8 EPROM, and similarly the pitch and phase data RAMs are 2Ki x 8 despite using only 36 x 8 bits. Probably Kawai used these so that they would have to stock fewer parts, and because the cost savings from using more appropriately specified components would be negligible. But these don’t suggest any easy improvements. An 8Ki x 8 interpolation ROM could support 5 fractional bits, but additional hardware would be needed to provide these bits. The pitch and phase data RAMs could support a very large number of oscillators with only minimal modifications to the digital oscillator board, but the sample rate would be proportionally reduced. The master clock frequency could be increased to improve this tradeoff, but there’s an upper limit to how fast the ICs can run. For the most part, the K3’s performance can’t be improved without significantly increasing the cost and complexity of the design.
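The sizing arguments in this paragraph reduce to simple arithmetic, sketched here in Python. The helper names are made up for illustration, and the sample rate scaling assumes that each oscillator keeps the same number of clock cycles and that the master clock is unchanged.

```python
import math

def fractional_bits(rom_words, diff_bits=8):
    """Address bits left for the fraction after the 8 bit differential input."""
    return int(math.log2(rom_words)) - diff_bits

def sample_rate_for(num_oscillators, base_rate_hz=41667.0, base_oscillators=12):
    """Sample rate if more oscillators share the same time-multiplexed hardware."""
    return base_rate_hz * base_oscillators / num_oscillators

# 12 oscillators x 3 bytes each for the phase (and likewise for the pitch data)
bytes_used = 12 * 3

print("2Ki ROM fractional bits:", fractional_bits(2 * 1024))
print("8Ki ROM fractional bits:", fractional_bits(8 * 1024))
print("bytes used per RAM:", bytes_used, "of", 2 * 1024)
for n in (12, 16, 24):
    print(f"{n} oscillators: {sample_rate_for(n):.0f} Hz")
```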

There are two 8 bit waveform DACs, one for each oscillator. The data is latched separately for each DAC, then the conversions are performed simultaneously. These are multiplying DACs, so the amplitude of each oscillator can be controlled by varying the reference voltage. By inverting the reference voltage to one of the DACs, a single control voltage can adjust the balance between the two oscillators. The DAC outputs are mixed and then demultiplexed into 6 voices. It’s more common to use a single DAC with 12 sample and holds, but this arrangement makes the oscillator balance and mixing circuits simpler. As in most similar synthesizers, amplitude scaling is performed by an analog VCA.

Overall, the design performs remarkably well. The sound quality is similar to Korg’s DWGS models, despite using 75% less data for each waveform, and it sounds considerably cleaner than the Ensoniq ESQ-1, which has a similar sample rate and waveform resolution but uses truncation. However, the transitions between multisample zones are highly audible, and the overall sound is rather dull, with the bandwidth only extending to about 8.5-10.5 kHz at the low end of the multisample range. In this respect it’s again similar to the DWGS models. Because the sample rate isn’t very high and the multisamples span an octave, the high harmonics are pushed very close to aliasing. But different waveform ROMs could potentially be written that would produce a brighter sound at the expense of a moderate amount of aliasing. It’s possible to have up to 256 harmonics, but this would result in a great deal of aliasing. 192 harmonics might be a better compromise.


Made in Pure Data and Octave, 2012-2016

Spotify link

I also made a video of the spectrogram for the last track:

I sampled the drum sounds from the rare Technics SX-AX7. The AX7 is a synthesizer/arranger from 1988. It’s fully digital, aside from the single BBD-based chorus. The AX3 and AX5 are similar but have fewer features. As far as I’m aware, both the hardware and the voice architecture are unique to the AX models. The SX-K700 and KN800 are completely different, despite being similar arranger models from the same time period. The AX7’s sounds are programmable, consisting of a sampled attack transient and single cycle “basic” and “harmonics” waveforms. There’s a global ADSR amplitude envelope, and a separate one for the harmonics waveform. This sort of crudely simulates filtering. It’s velocity sensitive, the harmonics waveform can be detuned and transposed, and there’s also a rather low quality digital reverb/delay. Hiding behind this relatively simple set of editable parameters are more complex semi-preset multi-stage envelopes (these change according to the selected waveforms) and a variety of interesting vibrato, tremolo and pitch envelope variations. The AX7 is good for bass and mallet sounds, but most other sounds are rather bad. Due to the low sample rate (31.25 kHz) and high quality interpolation, the sound is fairly clean and somewhat dull. The single cycle waveforms by themselves don’t sound particularly special, and many of the transient waveforms are useless, but in the right combination it can sound surprisingly good. The K700 and KN800 generally sound brighter and use conventionally sampled sounds.

The drum and accompaniment patterns are also programmable. These features are similar to the K700, but less sophisticated than the KN800. The drums aren’t amazing, but they’re fairly punchy with a few unique sounds. The sequencer timing is quite sloppy, but interestingly this results in a sort of syncopation rather than random jitter, so the effect generally isn’t unpleasant. One interesting feature is that the patterns can be switched immediately, in the middle of a pattern, rather than only at the start of the pattern. This means that it’s easy to obtain some interesting variety by programming a few variations and quickly switching between them. Another nice feature is that the patterns can be programmed in triplets or unusual time signatures like 5/4 and 7/4.

T6118A, T6118B, T6118C (1981): These are Kawai’s first custom drum ICs. 24 pins, clock is 1.6 MHz, manufactured by Toshiba. They each generate 6 channels, and the A/B/C suffix denotes different variants for different sets of sounds. They’re used in the DX series organs. The high end organs have all 3 ICs (15 sounds total), while the lower end models use only the T6118A and T6118B (10 sounds).

These are the sounds made by each IC, with the original names from the service manuals:

  • T6118A: Hihat-L, Hihat-S, Cymbal, Hi Conga, Clave, Bass D
  • T6118B: SD (Tone), Rimshot, Low Conga, Cowbell A, Cowbell B (these are used together), SD (Noise)
  • T6118C: Low Bongo, Low Tam, Guiro, Tamb., Wow-Gui. The remaining channel is unused.

They appear to continuously generate digital waveforms, which are either stored or synthesized, then sent to an onboard waveform DAC. This is externally buffered and sent back into the IC for enveloping via a second multiplying DAC. The output of this DAC is again buffered and again sent back into the IC for demultiplexing. The 6 outputs then go to sample and holds, then analog filtering and mixing circuits. The input seems to be simple trigger pulses, with no accent. So in total, each IC contains waveform generators, envelope generators, two DACs and demultiplexer control circuits.


Kawai R100, R50, R50e, R50iii (1986-87): These are 8 note polyphonic drum machines. Although they can transpose samples, there are no envelopes, so they’re not really full fledged romplers. Similar hardware is likely used in the ADEPT2 and ADD series Kawai organs.

They’re based around 3 custom gate arrays:

MB63H158: “Sensor LSI”, 64 pins, made by Fujitsu. This is a simple gate array that scans the front panel buttons and drum pads and calculates pad velocity. It’s also used as a keyboard scanner in other models, like the K1, K3 and K5.

The sample playback gate arrays are made by Mitsubishi:

M60009-0104FP: AGU “Address Generation Unit”, 100 pins. This interfaces with the CPU and contains the phase accumulators and address generation. It seems to contain all the registers for sound generation, and sends amplitude data and timing signals to the DGU IC.

M60009-0103FP: DGU “Data Generation Unit”, 100 pins. This contains the anti-log table and output accumulators, sends data to the DAC and controls the sample and holds.

The waveform ROM is 256k x 16, storing both 8 bit header data, which goes to the AGU, and 12 bit waveform data, which goes to the DGU. The data isn’t packed in any clever way, so the remaining space is wasted. The samples are recorded at 31.25 kHz (not 32 kHz as is incorrectly claimed in the user manuals). The R100, R50 and R50e each have different waveform ROMs. The rare R50iii contains an expansion board with all 3 ROMs. The AGU can only address 256k, so only one waveform ROM can be used at a time. The waveform data is stored in a logarithmic format. Like Yamaha’s FM ICs, the amplitude is scaled without a multiplier by adding the amplitude information to the logarithmic waveform data, then passing the sum through an anti-log table. This is equivalent to multiplication, since a * b = exp(log(a) + log(b)). There doesn’t appear to be an amplitude envelope, so the scaling is used only for the sound level, velocity and panning.
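The log-domain scaling trick can be illustrated with a small Python sketch. The number format here is entirely hypothetical (attenuation in 1/16 octave steps, a 256 entry anti-log table, sign carried separately); the actual encoding used by the AGU and DGU isn’t documented in this post.

```python
import math

STEPS_PER_OCTAVE = 16   # assumed log resolution
TABLE_SIZE = 256        # assumed: 16 octaves of attenuation
FULL_SCALE = 2047       # magnitude of a 12 bit signed sample

# Anti-log table: attenuation step -> linear magnitude.
ANTILOG = [round(FULL_SCALE * 2.0 ** (-i / STEPS_PER_OCTAVE)) for i in range(TABLE_SIZE)]

def to_log(magnitude):
    """Encode a magnitude in (0, 1] as attenuation in 1/16 octave steps."""
    return min(TABLE_SIZE - 1, round(-math.log2(magnitude) * STEPS_PER_OCTAVE))

def scale(sample_sign, sample_log, level_log):
    """Apply a level to a log-encoded sample with one add and one table lookup."""
    total = min(TABLE_SIZE - 1, sample_log + level_log)   # adding logs = multiplying
    return sample_sign * ANTILOG[total]

sample, level = 0.8, 0.3
via_log = scale(1, to_log(sample), to_log(level))
direct = sample * level * FULL_SCALE
print(via_log, direct)
```

The two results differ only by the quantization of the log values, which is the price paid for replacing the multiplier with an adder and a lookup table.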

The master clock is 5 MHz and the playback sample rate is 31.25 kHz (160 clocks per sample, 16 clocks per voice plus 16 clocks for each stereo output channel). The DAC is 12 bit linear (DAC312 in the R100; the R50 apparently uses a cheaper resistor array). There are 8 monophonic output channels, plus stereo mix outs. The DGU IC internally pans and accumulates the sounds, and the stereo outputs get their own time slots for conversion. While this is electronically simple and more flexible than drum machines that use analog mixing to assign output channels to fixed panning positions, it also means that the stereo mix outputs have lower dynamic range (12 bits per output) than an analog mix of the individual outputs (approximately 14-15 bits depending on how the channels are panned). The sounds can be tuned chromatically over a range of -8 to +7 semitones, but the flange effects in the R50 adjust the pitch more finely, and it’s not unlikely that the hardware supports a larger transposition range. There seems to be no interpolation, only truncation. Coupled with the low playback sample rate, this results in a dirty sound similar to the E-mu SP-1200 or Ensoniq Mirage.
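The timing and dynamic range figures above can be checked with a few lines of Python. The mix calculation is only an upper bound that assumes all 8 channels reach full scale at once, and the equal-tempered ratio formula for the tuning range is a standard assumption rather than something read from the hardware.

```python
import math

MASTER_CLOCK_HZ = 5_000_000
VOICES = 8
STEREO_SLOTS = 2        # left and right mix outputs get their own time slots
CLOCKS_PER_SLOT = 16

clocks_per_sample = (VOICES + STEREO_SLOTS) * CLOCKS_PER_SLOT
sample_rate_hz = MASTER_CLOCK_HZ / clocks_per_sample

# Summing 8 full-scale 12 bit channels can need up to log2(8) = 3 extra bits.
DAC_BITS = 12
max_mix_bits = DAC_BITS + math.log2(VOICES)

def transpose_ratio(semitones):
    """Playback rate ratio for an equal-tempered transposition."""
    return 2.0 ** (semitones / 12.0)

print("clocks per sample:", clocks_per_sample)
print("sample rate:", sample_rate_hz, "Hz")
print("upper bound for an analog mix:", max_mix_bits, "bits")
print("tuning range ratios:", transpose_ratio(-8), "to", transpose_ratio(7))
```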


 

This was the initial incarnation of the patch:

I continued working on it after that, so I don’t have that version anymore. The final version was later released here:

You can download the patch here.

The main patch is _dist_integral_test2a.pd. I’ve included all the relevant abstractions that I’ve made, but it also requires the creb, Cyclone, IEMlib and zexy libraries. I’ve only cleaned it up slightly and added a few annotations. I changed the reverb because the one I used originally needed to be manually adjusted to get the right sound. It’s still messy and confusing and some things are implemented in a less than ideal way. It’s fairly CPU intensive, so it may not run on anything less than a powerful laptop. Some of the abstractions are useful, but they’re not necessarily well documented, easy to use or free of bugs. They were mostly made for my personal use, and I can’t guarantee that they will be useful to anyone else. Ignore any “load_object: Symbol “lp_setup” not found” and “expr divide by zero” errors.

More PD algorithmic IDM, 2015-2017.

Spotify link

I signed up with DistroKid, so my music will be available on Spotify and various other stores and streaming services. I’m uploading now, but it will take some time for everything to appear.

If you want to buy a release I still recommend Bandcamp, because the sound quality and artwork will in some cases be better. I’m not doing this intentionally; it’s just that Bandcamp allows larger file uploads and additional artwork.