Using In Band DTMF Processing with the Skype API and PSTN Calls
I will explain, in general terms, how it can be done and what to be careful about. I will also point to a working example, MyToGo For Skype, which is free, so that you can see that this can in fact be done.
MyToGo For Skype was created using the Skype4COM library interface, but the same programming logic could have been implemented with the raw Skype API interface as well. MyToGo For Skype can be run as a stand-alone program or as an Extra for Skype.
The Audio API for Skype has been present since release 2.6 of the Skype client for Windows. There are 2 methods to gain access to Skype audio via the API: the .wav file method and the port access method. In our case we need the audio fed by the Skype API to a port, not a file. Here is why.
First you will need to process Skype audio using the Skype API port interface
You cannot use the Skype API file interface to process in band DTMF in real time. The Skype API opens the .wav file exclusively while it is writing audio data, and the file stays open until the Skype API closes it, so your application currently cannot read the file content while the call audio is still being written.
You will need to decode the DTMF digits using your application
The Skype API does not provide any in band DTMF decoding. (This is actually a good thing: the CPU overhead of decoding is high and should NOT be paid when it is not needed, which is why in band DTMF decoding is normally done on a separate CPU or processor, such as a modem.) This means your application must decode the tones itself, asynchronously, while it receives Skype audio data via a local port and carries on with whatever else it is doing.
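To give an idea of what the decoding step can look like, here is a minimal sketch in Python. It does not show how MyToGo For Skype decodes tones (that is not described here); it uses the Goertzel algorithm, a common way to measure the power at a handful of known frequencies. The standard DTMF frequency pairs are real, but everything else is an assumption for illustration: the function names, the 16 kHz sample rate, the 16-bit signed little-endian sample format, and the `min_fraction` threshold.

```python
import math
import struct

LOW_FREQS = (697, 770, 852, 941)        # DTMF row tones, Hz
HIGH_FREQS = (1209, 1336, 1477, 1633)   # DTMF column tones, Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")


def pcm16_to_samples(data):
    """Unpack raw bytes as 16-bit signed little-endian samples (assumed format)."""
    count = len(data) // 2
    return struct.unpack("<%dh" % count, data[:count * 2])


def goertzel_power(samples, sample_rate, freq):
    """Return the squared magnitude of the signal at `freq` (Goertzel recurrence)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def detect_digit(samples, sample_rate=16000, min_fraction=0.1):
    """Return the DTMF key heard in one block of samples, or None."""
    energy = sum(x * x for x in samples)
    if energy == 0:
        return None  # pure silence
    low = [goertzel_power(samples, sample_rate, f) for f in LOW_FREQS]
    high = [goertzel_power(samples, sample_rate, f) for f in HIGH_FREQS]
    row = max(range(4), key=lambda i: low[i])
    col = max(range(4), key=lambda i: high[i])
    # A clean tone pair puts roughly a quarter of N * energy into each tone;
    # speech or noise spreads its energy out, so a low fraction means "no digit".
    threshold = min_fraction * len(samples) * energy
    if low[row] < threshold or high[col] < threshold:
        return None
    return KEYPAD[row][col]
```

In practice you would cut each buffer read from the port into short blocks (a few hundred samples each), call `detect_digit` on every block, and report a digit only after it has been seen in several consecutive blocks, so that a single noisy block does not register as a key press.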
You should use a buffer size of at least 8,192 bytes when processing the data from the port asynchronously, because you can receive as much as 32,768 bytes of audio data a second. At that rate you need at most 4 reads a second (32,768 / 8,192 = 4) to keep up with the audio in the worst case, which leaves time for whatever other processing your application needs to do.
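The buffering described above can be sketched as follows, again in Python. This is only an illustration under stated assumptions: `read_chunks` is a hypothetical helper, and `conn` is assumed to be an already connected socket carrying the audio from the Skype API port. How that connection is set up is not shown.

```python
import socket

CHUNK_SIZE = 8192          # buffer size suggested in the text
BYTES_PER_SECOND = 32768   # peak audio rate quoted in the text
READS_PER_SECOND = BYTES_PER_SECOND // CHUNK_SIZE   # 4 reads a second, worst case


def read_chunks(conn, chunk_size=CHUNK_SIZE):
    """Yield fixed-size blocks of audio from a connected socket until it closes."""
    pending = bytearray()
    while True:
        data = conn.recv(chunk_size)
        if not data:                 # the other side closed the connection
            break
        pending.extend(data)
        while len(pending) >= chunk_size:
            yield bytes(pending[:chunk_size])
            del pending[:chunk_size]
    if pending:                      # hand over any final partial block
        yield bytes(pending)
```

Because `recv` blocks, a loop like `for chunk in read_chunks(conn): ...` would normally run on its own worker thread, so the rest of the application stays responsive while audio arrives.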
Is This My DTMF Tone or Yours? or “Look My Garage Door Opener Opens my Neighbors Garage Door Too!”
Imagine that multiple applications for Skype are processing the same DTMF input.
How does each application determine whether the tones are meant for it or not?
Can you predict the outcome of what can/could/would/will happen when two or more applications for Skype process the same DTMF digit?
What would happen if you conference called together two automated PBX systems that are both waiting for DTMF digits from your phone? Do you really want that mess?
When you call a PBX or an IVR system, you don’t have this issue because there is only one system ready to process your DTMF input. With Skype, because of the API interface, you can now have MULTIPLE applications all waiting for DTMF input, and different digits could mean different things to each of them. In MyToGo For Skype, for example, a # marks the end of the input: the digits before it are either a Skype Speed-Dial number, in which case MyToGo calls the Skype contact or PSTN number assigned to that Speed-Dial number, or a telephone number, in which case MyToGo calls that telephone number. Now what happens in another application where the # means something else?
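The # convention just described can be sketched like this in Python. It is not MyToGo's actual code: `DigitCollector`, `resolve`, and the `speed_dial` table are hypothetical names, and the rule "if the entry is in the Speed-Dial table it is a Speed-Dial number, otherwise it is a telephone number" is an assumption made only for the example.

```python
class DigitCollector:
    """Accumulate decoded DTMF digits until the '#' terminator arrives."""

    def __init__(self):
        self._digits = []

    def feed(self, digit):
        """Add one decoded key; return the finished entry on '#', else None."""
        if digit == "#":
            entry = "".join(self._digits)
            self._digits.clear()
            return entry or None     # a lone '#' carries no entry
        if digit.isdigit():
            self._digits.append(digit)
        return None                  # still collecting (or key ignored)


def resolve(entry, speed_dial):
    """Map a finished entry to a call target (assumed lookup rule)."""
    if entry in speed_dial:
        return ("speed_dial", speed_dial[entry])
    return ("phone", entry)
```

Feeding the keys 1, 2, # returns the entry "12" on the #. A second application listening on the same call would see exactly the same three keys, which is the conflict this section is about.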
A giant mess can be created when or if a Skype user is running two applications processing the same DTMF tones. We never see this in other cases because NOBODY has the rich API interface that Skype has.
PBX or IVR systems never need to worry about “What if there is another application listening for DTMF digits?”. They simply say “If you want to do this or that, enter this or that now”. This is NOT the case when designing applications using the Skype API.
Do you have error logic that says “if a call is in progress, then...”, or just “then...”?
Can more than one copy of your program be started?
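One simple way to stop a second copy from starting is a lock file, sketched below in Python. This is only one possible approach, the function names and the lock path are hypothetical, and a lock file left behind by a crash would need separate handling that is not shown here.

```python
import os


def acquire_instance_lock(lock_path):
    """Return a file descriptor if this is the first copy running, else None."""
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        return os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None


def release_instance_lock(fd, lock_path):
    """Close and delete the lock file so a later copy can start."""
    os.close(fd)
    os.remove(lock_path)
```

At startup the program would call `acquire_instance_lock`, exit with a message if it gets `None`, and call `release_instance_lock` on shutdown.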
So please be careful if you think that ONLY your application is going to be using DTMF, and plan proper error logic for when your application ends up running alongside other applications that may or may not be processing the same DTMF tones you are.