Presentation¶
This framework records audio from the microphone and applies some Digital Signal Processing to it.
The client app can request all the recorded audio and the audio fragment that has only voice samples in a real-time fashion, as well as other information from the recording.
Specifications¶
- Minimum operating system version: iOS 11.0 (iPhone only).
- Signal format: 16-bit linear PCM, signed.
- Sampling frequency: 8 or 16 kHz, configurable.
- Maximum recording length: 30 seconds.
- Full SDK implementation sizes in MB:
  - arm64 (device): 0.53
  - arm64 (simulator): 0.83
  - x86_64 (simulator): 0.83
  - TOTAL: 2.19

These sizes have been calculated using lipo -detailed_info for fat libraries and gdu or du for the rest of the files. No dSYMs folders are included in these sizes. The final App Store and installed sizes may differ.
If more than one architecture is supported, the total size is the sum of each of them; e.g., if all architectures are supported, the combined size of this SDK and its dependencies will be a total of 2.19 MB.
The end user is not affected by this total size, because the store distributes only the architecture specific to the user's device when the app is downloaded and installed.
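From the specifications above (16-bit signed PCM, up to 16 kHz, 30 seconds maximum), the worst-case in-memory size of a single recording can be estimated. This is only a back-of-the-envelope check, not part of the SDK API:

```swift
// Worst-case size of one recording, from the specs above:
// 16-bit (2-byte) signed linear PCM, up to 16 kHz, 30 seconds maximum.
let bytesPerSample = 2
let maxSampleRate = 16_000   // Hz (configurable: 8 or 16 kHz)
let maxDuration = 30         // seconds
let maxBytes = bytesPerSample * maxSampleRate * maxDuration
print(maxBytes)              // 960000 bytes, i.e. under 1 MB of raw audio
```

So even a maximum-length recording at the highest sampling rate stays below 1 MB of raw signal.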
Library dependencies¶
- There are no library dependencies. If you are upgrading, please remove any dependencies left over from previous versions of the Voice SDK.
Integration¶
- Create a new Xcode project.
- Add the required permissions into Info.plist:
  - Privacy - Microphone Usage Description.
- Drag and drop the Voice.xcframework into Xcode and copy it into the project.
- Add all the .xcframework files to the target's General tab, under Embedded Binaries and Linked Frameworks and Libraries (if they do not appear there yet); they should be set to "Embed & Sign".
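The Info.plist entry only provides the text shown in the system prompt; the permission itself is requested at runtime. The sketch below uses Apple's standard AVAudioSession API to trigger that prompt. It shows the underlying mechanism only; in the example later in this document, the SDK's PermissionsHandler takes care of this step:

```swift
import AVFoundation

// Triggers the system microphone prompt the first time it is called.
// The message shown comes from the Info.plist
// "Privacy - Microphone Usage Description" entry added above.
AVAudioSession.sharedInstance().requestRecordPermission { granted in
    DispatchQueue.main.async {
        // Enable or disable the recording UI according to the user's choice
        print(granted ? "Microphone access granted" : "Microphone access denied")
    }
}
```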
Real-time requests from the UI to the voice framework¶
Typical scenario¶
- The UI configures the recording.
- The UI starts the recording. It can be stopped at any time.
- The UI sends requests to the SDK and receives the responses asynchronously through callbacks. There are 5 possible responses:
- The audio info with the recording information.
- The state of the recorder.
- The total signal recorded so far.
- The portion of voice from the recorded signal so far.
- Possible errors from the SDK.
Example scenario¶
- On the main ViewController:
- Create a VoiceRecorder.
- Configure the VoiceRecorder (optional).
- Register a callback listener, as a subclass of VoiceRecorderAdapter, and override the desired methods.
- Start the recording on user action.
- Process the responses within the adapter, such as the received audio. Also process the recorder's state changes, typically to enable/disable certain user controls.
- Send extra requests to the VoiceRecorder and process the asynchronous responses with the adapter.
- Stop the recorder when desired.
Simplified Object Diagram¶
Example of use¶
import UIKit
import Voice

class MainViewController: UIViewController, PermissionsHandlerResultCallback {

    // [...] Views & stuff

    // Model & recorder
    // [...]
    private var voiceRecorder: VoiceRecorder!

    // Adapter for callbacks
    private var recAdapter: RecAdapter!

    // Permissions
    private var permissionsHandler: PermissionsHandler!

    override func viewDidLoad() {
        super.viewDidLoad()
        // Configure Views
        // [...]
        // Model state & recorder
        // [...]
        voiceRecorder = VoiceRecorder()
        // Callback control
        recAdapter = RecAdapter(self)
        // Ask for permission
        permissionsHandler = PermissionsHandler(self)
        permissionsHandler.askForPermissions()
    }

    func onPermissionsResult(_ accepted: Bool) {
        // Stub: process whether recording was accepted or not
    }

    override func viewWillAppear(_ animated: Bool) {
        voiceRecorder.setListener(recAdapter)
        // & configure
    }

    override func viewWillDisappear(_ animated: Bool) {
        // Detach listener
        voiceRecorder.setListener(nil)
    }

    // Requests to voiceRecorder -----------------
    private func fakeRequestsFunction() {
        // Example of UI requests
        voiceRecorder.start()
        voiceRecorder.stop()
        voiceRecorder.requestAudioInfo()
        voiceRecorder.requestRecordedAudio()
        voiceRecorder.requestRecordedVoice()
        // "Hollywood principle" (the responses are async):
        // they will be processed through callbacks
    }
    // -------------------------------------------

    /**
     * Listener for receiving the AudioThread callbacks.
     */
    private class RecAdapter: VoiceRecorderAdapter {

        private weak var outer: MainViewController!

        init(_ host: MainViewController) {
            self.outer = host
            super.init()
        }

        /**
         * Receives the state of the recorder. It can be VoiceRecorder.STOPPED or VoiceRecorder.RECORDING.
         * The UI typically changes in response to this state, enabling or disabling controls.
         */
        override func onRecorderState(_ recorderState: Int) {
            // Stub
        }

        /**
         * Receives the AudioInfo of the recording, and displays it on the UI.
         */
        override func onAudioInfo(_ audioInfo: AudioInfo) {
            // Get current config from VoiceRecorder
            let config = outer.voiceRecorder.getConfig()
            // Voice & total audio recorded so far (ms)
            let voiceDur = audioInfo.voiceDuration
            let audioDur = audioInfo.audioDuration
            let isQualityOk = audioInfo.isEnoughQuality
        }

        /**
         * Receives a wrapper with AudioInfo and all the audio recorded so far.
         */
        override func onRecordedAudio(_ wrapper: SignalWrapper) {
            if wrapper.rawSignal != nil { /* process audio in raw format (Double array) */ }
            if wrapper.wavSignal != nil { /* process audio in wav format (Byte array) */ }
            if wrapper.wavBase64Signal != nil { /* process audio in wav format (base64-encoded text) */ }
        }

        /**
         * Receives a wrapper with AudioInfo and all the voice recorded so far.
         */
        override func onRecordedVoice(_ wrapper: SignalWrapper) {
            if wrapper.rawSignal != nil { /* process voice in raw format (Double array) */ }
            if wrapper.wavSignal != nil { /* process voice in wav format (Byte array) */ }
            if wrapper.wavBase64Signal != nil { /* process voice in wav format (base64-encoded text) */ }
        }

        /**
         * Receives possible errors from the AudioThread.
         */
        override func onError(_ error: AudioInputError) {
            // Stub
        }
    }

    // Example of recorder's configuration
    private func configureVoiceRecorder() {
        let config = VoiceRecorder.Config.Builder()
            .setFormat(VoiceRecorder.Config.RAW_FORMAT)
            .setFs(8000)
            .setAutoAudioInfo(true)
            .setAutoVoiceCapture(true)
            .setStopOnCapture(true)
            .build()
        voiceRecorder.setConfig(config)
    }
}
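A common follow-up inside onRecordedAudio or onRecordedVoice is persisting the wav payload to disk. The helper below is a minimal sketch: the parameter wavBytes stands for wrapper.wavSignal, and its exact Swift type ([UInt8]?) is an assumption, since this document only describes it as a byte array:

```swift
import Foundation

/// Writes a wav payload received in a callback to the app's Documents folder.
/// `wavBytes` stands for `wrapper.wavSignal`; its type ([UInt8]?) is assumed.
func saveWav(_ wavBytes: [UInt8]?, as name: String) throws -> URL? {
    guard let wavBytes = wavBytes else { return nil }
    let dir = FileManager.default.urls(for: .documentDirectory,
                                       in: .userDomainMask)[0]
    let url = dir.appendingPathComponent(name).appendingPathExtension("wav")
    // The wav payload already contains its header, so it can be written as-is
    try Data(wavBytes).write(to: url, options: .atomic)
    return url
}
```

Call it from the adapter, e.g. `try? saveWav(wrapper.wavSignal, as: "recording")`, adapting the parameter type if the SDK exposes the signal differently.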
Additional note about integration¶
The SDK validates the audio recording before sending it to the das-Peak service in the cloud. The expected behavior is therefore that both ends agree that the recording is either valid or invalid.
Although this holds in general, there may be a few cases where the two ends, the SDK and the cloud, disagree, due to slight algorithmic and precision differences.
The following practices are recommended:
- The das-Peak service always has the final word; its verdict may differ from the SDK's validation.
- To avoid discrepancies between the two systems, configure the SDK to be slightly more restrictive than the cloud service. The default values of the VoiceRecorder guarantee this behavior: 5 seconds of voice / 5 dB of signal-to-noise ratio.
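In practice this means gating the upload on the SDK-side validation, for instance when onAudioInfo fires. The sketch below relies only on the AudioInfo fields shown in the example above; the 5-second figure is the documented VoiceRecorder default, and since the exact threshold accessors are not described in this document, the value is hard-coded here for illustration:

```swift
// Sketch: decide whether to send the recording to das-Peak,
// using the AudioInfo fields from the callback example above.
func shouldUpload(_ audioInfo: AudioInfo) -> Bool {
    // Documented default: at least 5 seconds of voice (durations are in ms)
    let minVoiceMs = 5_000
    // isEnoughQuality already reflects the SDK's signal-to-noise check
    return audioInfo.voiceDuration >= minVoiceMs && audioInfo.isEnoughQuality
}
```

If both checks pass, the recording should normally also be accepted by the cloud service, keeping in mind that das-Peak still has the final word.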