So, is there a "standard" that all Windows sound/MIDI drivers are written to follow (ie, such that each driver has the same set of functions, to be called using the same method)? Yes. This is called Windows "Media Control Interface" (ie, MCI). The Windows MIDI Mapper, the MCI Sequencer Device (driver), and the MCI Wave Audio Device (driver) are 3 parts of MCI that are particularly relevant to sound/MIDI cards.
When a user buys a sound/MIDI card, he installs the Windows MCI driver that ships with it (for example, using Windows 95 Control Panel's Add new Hardware/Software. See the explanation of installing a driver under Win95 for more details). Virtually every PC sound/MIDI card ships with a Windows MCI driver for that card. You should never have to write a Windows driver for a card unless you're the manufacturer of that card. As a program developer, you don't care what particular brand/model card is being used (as long as it supports what you need to do). You don't even need to know the name of the driver, because you call it indirectly via functions in the Windows operating system.
Note: Windows programs that call these MIDI and digital audio functions in the Windows operating system should include the header MMSYSTEM.H (and MMREG.H for Win32), and also be linked with the library WINMM.LIB (or MMSYSTEM.DLL if a Win3.1 program).
Therefore, Windows maintains 4 lists:
For a card that can input/output MIDI data, as well as record/play Digital Audio data, a name for it will appear in all 4 lists.
What name for a device appears in each list? Well, that is entirely up to the card's device driver. It is the card's device driver that tells Windows into which lists Windows should place the card's name, and what name to use for each list. And in fact, if the card has several different components that can logically fit into one list, then the device driver may tell Windows to put several component names into that list.
Let's take an example for illustration: A Creative Labs' AWE32 sound card. This card has the following components:
Notice that the last 3 components can all fit into the list of devices that support outputting or playing MIDI data. So, the AWE32 driver is going to tell Windows to put 3 names into that one list. For example, the driver may tell Windows to put the name "AWE32 MIDI Out" into the list. The driver may tell Windows to also put the name "AWE32 FM Synth" into that same list. Finally, the driver may tell Windows to also put the name "AWE32 Wavetable Synth" into that same list. Although these 3 components are all on the same card, to Windows, they represent 3 individual devices that are capable of outputting/playing MIDI data, and Windows treats them as if they really are 3 separate devices. (Incidentally, there is no particular rules about naming the components. I just arbitrarily picked the above 3 names, although they do make sense. A manufacturer can put any names he wants into his driver for inclusion into those Windows lists).
Of course, the AWE32 driver will also tell Windows to put the name "AWE32 Digital Audio In" into the list of devices that support recording Digital Audio. The driver will also tell Windows to put the name "AWE32 Digital Audio Out" into the list of devices that support playing Digital Audio. Finally, the driver will also tell Windows to put the name "AWE32 MIDI In" into the list of devices that support recording or creating MIDI data.
Windows assigns a Device ID (ie, merely an unsigned long numeric value) to each name in its lists. The first device in each list gets a value of 0, and the subsequent devices have increasing values for their IDs. (ie, The second device in a list has a Device ID of 1, the third device has an ID of 2, etc).
So in our above example, the 3 components that the AWE32 device driver told Windows to add to the list of devices that support playing MIDI data ("AWE32 MIDI Out", "AWE32 FM Synth", and "AWE32 Wavetable Synth") will have Device IDs of 0, 1, and 2 respectively (assuming that they were added to the list in that order and are the first 3 names to be added to the list).
In fact, drivers for MIDI interfaces that have multiple MIDI Outputs typically add a separate name to the list for each MIDI Output, for example "MIDI Out #1 for Brand X card", "MIDI Out #2 for Brand X card", etc.
The one caveat to this is the "MIDI Mapper" device driver which is part of Windows. This doesn't appear in the list of devices that support outputting MIDI data, although the MIDI Mapper is indeed a device to which your program can output MIDI data. MIDI Mapper always has a Device ID of -1.
For a discussion of writing a program that properly shares MIDI inputs and outputs, see Managing MIDI Ports.
For example, if you want to play a MIDI file (ie, play a musical sequence), then you have to load the MIDI data contained within that file, use a (Windows software) timer to determine when it is time to output each MIDI message, and when that time occurs, pass that message (ie, its data bytes) to a Windows function that actually outputs the bytes.
If you want to record a MIDI file, then you have to tell Windows to notify you when the card's driver has input the next MIDI message and to request Windows to pass you those MIDI data bytes, use a (Windows software) timer to timestamp that MIDI message, and write the bytes to disk (typically in MIDI File Format).
Because your program does most of the work, you have a lot of control over the playback/recording process, such as the ability to vary tempo, or perform filtering/mixing/transposing/etc of data on-the-fly.
For a more indepth discussion of writing a program that uses the Low level MIDI functions, see Low level MIDI API.
There are also low level functions for Digital Audio. Again, these require that you do a lot of work to record/play the digital audio data.
For example, if you want to play a WAVE file (ie, play digital audio), then you have to load the digital audio data contained within that file in "blocks" (ie, usually filling a buffer of 4K of data at a time), and feed it to a Windows function that actually outputs the data. In order to get a smooth, continuous playback of digital audio, you typically have to do double-buffering (ie, while the card is playing back one buffer of 4K data, you need to be simultaneously filling a second buffer with the next 4K of data).
If you want to record a WAVE file, then Windows has to feed you a continuous stream of those blocks of data, and you must write each block to disk (typically in WAVE File Format). In order to get a smooth, continuous recording of digital audio, you typically have to do double-buffering (ie, while the card is recording one buffer of 4K data, you need to be simultaneously writing the buffer containing the previous 4K of data to disk).
For a more indepth discussion of writing a program that uses the Low level Digital Audio functions, see Low level Digital Audio API.
For example, if you want to play a MIDI file (ie, play a musical sequence), then you merely call one Windows function, specifying the name of the MIDI file to be played back. Windows will open the MIDI file and load the MIDI data contained within that file, use a (Windows software) timer to determine when it is time to output each MIDI message, and actually output the bytes. In other words, Windows plays the MIDI file in the background while your program can go on to do other things simultaneously.
If you want to record a MIDI file, then you merely call one Windows function, specifying the name of the MIDI file to be created. Again, Windows does all of the work of reading the incoming MIDI data, time-stamping each message, and writing the data to disk in MIDI File Format. (NOTE: At this time, Windows does not offer High level API support for MIDI recording).
But because your program doesn't do most of the work, you lose a lot of control over the playback/recording process, such as being able to perform filtering of data on-the-fly, etc. You're pretty much limited to being able to stop, start, pause, rewind, and fast-forward the playback.
For a more indepth discussion of writing a program that uses the High level MIDI functions, see High level MIDI API.
There is also a high level API for playing/recording WAVE files. Just like with the High level MIDI API, Windows does a lot of the work (ie, completely loads and plays the WAVE file in the background), but again, you lose a lot of control over the actual playback (for example being able to do on-the-fly mixing of digital audio -- to implement virtual tracks. For a more indepth discussion of virtual tracks, see the article Digital Audio on a computer).
For a more indepth discussion of writing a program that uses the High level Digital Audio functions, see High level Digital Audio API.
Currently, the Stream API supports playback only (not recording -- ie, it can't supply you with time-stamped incoming MIDI messages). So, if you need to do MIDI recording, you'll have to use the Low level MIDI API and a MultiMedia Timer, or use DirectMusic.
For a more indepth discussion of writing a program that uses the MIDI Stream functions, see MIDI Stream API.
Many game programs use lots of digital audio for sound effects, and various waveforms often have to be mixed on-the-fly. For this purpose, Microsoft devised a new set of functions called "DirectSound" (ie, part of DirectX) under Win95/98 and WinNT (4.X and above only). These functions let the operating system take care of mixing several waveforms to a sound card's stereo DAC.
DirectX versions prior to 6.1 support playback, but not recording.
A card's device driver should have extra support to work well with the DirectSound API (although older drivers not directly supporting DirectSound may work with this API albeit very inefficiently and/or with some features disabled).
For a more indepth discussion of writing a program that uses the DirectSound functions, see DirectSound API.
Many game programs also use MIDI for background music. For this purpose, Microsoft devised a new set of functions called "DirectMusic" (ie, part of DirectX) under Win95/98 and WinNT (4.X and above only). These functions are a bit newer than the older low level MIDI API, and address some of the oddities/discrepancies that plague the legacy MIDI API.
DirectX versions prior to 6.0 do not include DirectMusic, and versions prior to 6.1 support only playback.
A card's device driver should have extra support to work well with the DirectMusic API (although older drivers not directly supporting DirectMusic may work with this API albeit very inefficiently and/or with some features disabled).
For a more indepth discussion of writing a program that uses the DirectMusic functions, see DirectMusic API.
A card's device driver needs extra support to work with the Mixer API. Not all Win95 and WinNT drivers have this support. Win3.1 drivers do not.
For a more indepth discussion of writing a program that uses the Mixer functions, see Mixer API.
Older drivers should at least support the Aux API. Just like with the lists for devices that support MIDI input and output, and Digital Audio playback and recording, Windows maintains a list of all devices that have Aux outputs. (And again, a card's device driver instructs Windows to put some name into this list). The Aux functions allow a program to control the volume (and perhaps other settings) of each Aux output. Typically, audio from a CDROM or the audio output of a voice modem is connected to a sound card's Aux output. (ie, So, it's the volume of these that you're affecting).
For a more indepth discussion of writing a program that uses the Aux functions, see Aux API.
For a more indepth discussion of writing a program that uses the MultiMedia File API, see MultiMedia File API.
Windows MultiMedia System Book
For a more indepth discussion of the history of why and how the Windows MIDI and digital audio functions came to be, read the introduction to the article Audio cards and MIDI Interfaces for a computer.