Today we will translate speech into text. First, make sure you have the latest versions of iOS and Xcode installed. The Speech framework requires iOS 10 or later. I created the project with SwiftUI support, which requires iOS 13, but that is optional: you can use a Storyboard instead. If you do not know what SwiftUI is and want a quick overview, here you are.
Create a new project with "File > New > Project...", select "Single View App" and set "User Interface: SwiftUI". The project will look something like this:
Select the ContentView.swift file and change "struct ContentView ..." to:
struct ContentView: View {
    @ObservedObject var speechRec = SpeechRec()
    var body: some View {
        Text(speechRec.recognizedText)
            .onAppear {
                self.speechRec.start()
            }
    }
}
class SpeechRec: ObservableObject {
    @Published private(set) var recognizedText = ""
    func start() {
        recognizedText = "!"
    }
}
ContentView is what we show on the screen. SpeechRec is where we will translate speech into text. We will keep the recognized text in recognizedText, and ContentView will display it on the screen.
Permission
First, we need to ask the user for permission. Select the Info.plist file and add two keys: NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription. The value of each key is the text shown to the user in the corresponding permission prompt.
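A missing usage description is easy to overlook, so a small check can help while developing. The following is only a sketch: the helper name `missingUsageKeys(in:)` and the constant `requiredUsageKeys` are hypothetical, not part of the Speech framework. It takes the Info.plist contents as a dictionary and returns the keys that are absent or empty.

```swift
import Foundation

// The two keys named above. The constant and the helper below are
// illustrative names, not Apple API.
let requiredUsageKeys = [
    "NSSpeechRecognitionUsageDescription",
    "NSMicrophoneUsageDescription",
]

/// Returns the required keys that are missing from `info`
/// or whose value is not a non-empty string.
func missingUsageKeys(in info: [String: Any]) -> [String] {
    requiredUsageKeys.filter { key in
        let text = info[key] as? String
        return text?.isEmpty ?? true
    }
}

// In an app you could call it like this during development:
// let missing = missingUsageKeys(in: Bundle.main.infoDictionary ?? [:])
// assert(missing.isEmpty, "Add to Info.plist: \(missing)")
```

Passing the dictionary in, instead of reading Bundle.main inside the helper, keeps the check easy to try out in a playground.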
Import Speech and ask for permission:
import Speech
...
class SpeechRec: ObservableObject {
    ...
    func start() {
        SFSpeechRecognizer.requestAuthorization { status in
        }
    }
}
If you run the app now, it will ask for permission:
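The user may decline the prompt, so it is worth looking at the status passed to the handler. Below is a minimal sketch of one way to do that. The function `message(for:)` and its texts are hypothetical; the status cases are those of SFSpeechRecognizerAuthorizationStatus.

```swift
import Speech

/// Maps an authorization status to a short text for the UI.
/// The function name and the wording are illustrative only.
func message(for status: SFSpeechRecognizerAuthorizationStatus) -> String {
    switch status {
    case .authorized:
        return "Speech recognition is allowed."
    case .denied:
        return "Speech recognition was denied. You can enable it in Settings."
    case .restricted:
        return "Speech recognition is restricted on this device."
    case .notDetermined:
        return "Speech recognition permission has not been requested yet."
    @unknown default:
        return "Unknown authorization status."
    }
}

// Possible use inside start(), assuming the Info.plist keys are in place:
// SFSpeechRecognizer.requestAuthorization { status in
//     DispatchQueue.main.async { print(message(for: status)) }
// }
```

The switch to the main queue in the usage comment is there because the handler is not guaranteed to run on the main queue.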
Translate speech to text
To translate speech into text, we use SFSpeechRecognizer with the locale "ru-RU" to recognize Russian speech. We then need to specify the audio source, which in our case is a stream from the microphone.
The latest version of our class:
...
class SpeechRec: ObservableObject {
    @Published private(set) var recognizedText = ""
    // nil if speech recognition is not supported for this locale
    let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "ru-RU"))
    var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
    var recognitionTask: SFSpeechRecognitionTask?
    let audioEngine = AVAudioEngine()
    func start() {
        self.recognizedText = "..."
        SFSpeechRecognizer.requestAuthorization { status in
            // The handler may be called on a background queue, so continue on the main queue.
            DispatchQueue.main.async {
                // Start only if the user granted permission.
                guard status == .authorized else { return }
                self.startRecognition()
            }
        }
    }
    func startRecognition() {
        do {
            recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
            guard let recognitionRequest = recognitionRequest else { return }
            // The handler is called repeatedly with updated results while the user speaks.
            recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest) { result, error in
                if let result = result {
                    self.recognizedText = result.bestTranscription.formattedString
                }
            }
            // Feed microphone buffers into the recognition request.
            let recordingFormat = audioEngine.inputNode.outputFormat(forBus: 0)
            audioEngine.inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { buffer, _ in
                recognitionRequest.append(buffer)
            }
            audioEngine.prepare()
            try audioEngine.start()
        }
        catch {
            print("Could not start the audio engine: \(error)")
        }
    }
}
Recognition starts as soon as the permission request completes. I tried to keep the code brief and therefore left out some necessary checks.
Now run the app and say something in Russian. Your speech will be displayed on the screen.
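The class above never stops listening. One of the pieces left out is a way to end a session, and the sketch below shows how that could look. The function name `stopRecognition` and its parameters are hypothetical; it assumes the engine, request and task were set up as in SpeechRec.

```swift
import AVFoundation
import Speech

/// Ends a recognition session set up as in SpeechRec.startRecognition().
/// The function name and signature are illustrative, not part of the Speech framework.
func stopRecognition(engine: AVAudioEngine,
                     request: SFSpeechAudioBufferRecognitionRequest?,
                     task: SFSpeechRecognitionTask?,
                     discardResults: Bool = false) {
    if engine.isRunning {
        // Stop capturing audio and remove the tap installed on bus 0.
        engine.stop()
        engine.inputNode.removeTap(onBus: 0)
    }
    // Tell the recognizer that no more audio will be appended,
    // so it can deliver its final result.
    request?.endAudio()
    if discardResults {
        // Cancel the task instead of waiting for the final result.
        task?.cancel()
    }
}
```

Inside SpeechRec this could be called as stopRecognition(engine: audioEngine, request: recognitionRequest, task: recognitionTask), for example from a button action or from onDisappear.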
What's next?
Documentation: developer.apple.com/documentation/speech
WWDC video: developer.apple.com/videos/all-videos/?q=Speech
GitHub project: github.com/usenbekov/speech-to-text-demo