I want to record short audio clips on Android and recognize them with Google Cloud Speech. The idea is to stop the recorder when there is a pause or after a few seconds, whichever comes first. I tried the built-in speech recognizer, but it no longer seems to support stopping the listener at will; instead it waits 5 to 6 seconds after the last utterance before stopping, which makes it unusable for me. Further, I may want to temporarily store the audio files.
Some testing with Google Cloud Speech shows promise, and now I want to record the audio in a supported format. Google's speech recognizer lists its supported encodings here, and MediaRecorder lists its formats here. The only overlap seems to be AMR_WB. The speech recognizer documentation recommends a lossless encoding where available.
So, that leads me to AudioRecord (I'm not quite sure why I didn't start there in the first place), which has a lot more formats available.
One more point: the speech recognizer request (ultimately to be made from a separate system) can be sent with the audio Base64-encoded (as opposed to saving the audio file on Google Cloud).
I have three questions right now:
- Which audio format makes the most sense to use?
- How do I encode it in Base64 afterwards?
- How do I stop the recorder when the user stops talking or after a set number of seconds, whichever comes first?
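
To make the last two questions concrete, here is a minimal plain-Java sketch of the behaviour being asked about, not a working Android recorder. It assumes the audio arrives as frames of 16-bit PCM samples (for example from `AudioRecord.read(short[], int, int)` with `AudioFormat.ENCODING_PCM_16BIT`). The class name `SilenceStopper`, the RMS threshold, and the frame counts are all hypothetical and would need tuning on a real device. The Base64 helper uses `java.util.Base64`, which is assumed to be available (on older Android versions `android.util.Base64` would be the substitute).

```java
import java.util.Base64;

/** Hypothetical helper: decides when to stop reading 16-bit PCM frames. */
class SilenceStopper {
    private final double rmsThreshold; // assumed level below which a frame counts as silence
    private final int maxSilentFrames; // consecutive silent frames that count as a pause
    private final int maxFrames;       // hard cap on frames (the "few seconds" limit)
    private int silentRun = 0;
    private int totalFrames = 0;
    private boolean heardSpeech = false;

    SilenceStopper(double rmsThreshold, int maxSilentFrames, int maxFrames) {
        this.rmsThreshold = rmsThreshold;
        this.maxSilentFrames = maxSilentFrames;
        this.maxFrames = maxFrames;
    }

    /** Root-mean-square amplitude of the first `length` samples of a frame. */
    static double rms(short[] frame, int length) {
        if (length <= 0) return 0.0;
        double sum = 0.0;
        for (int i = 0; i < length; i++) {
            sum += (double) frame[i] * frame[i];
        }
        return Math.sqrt(sum / length);
    }

    /** Feed one frame; returns true once a pause or the frame cap is reached. */
    boolean shouldStop(short[] frame, int length) {
        totalFrames++;
        if (rms(frame, length) < rmsThreshold) {
            silentRun++;
        } else {
            silentRun = 0;
            heardSpeech = true;
        }
        // Silence before any speech does not count as a pause.
        boolean paused = heardSpeech && silentRun >= maxSilentFrames;
        return paused || totalFrames >= maxFrames;
    }

    /** Packs samples as little-endian 16-bit bytes and Base64-encodes them. */
    static String toBase64(short[] samples, int length) {
        byte[] bytes = new byte[length * 2];
        for (int i = 0; i < length; i++) {
            bytes[2 * i] = (byte) (samples[i] & 0xff);
            bytes[2 * i + 1] = (byte) ((samples[i] >> 8) & 0xff);
        }
        return Base64.getEncoder().encodeToString(bytes);
    }
}
```

In a recording loop, each frame read from the recorder would be passed to `shouldStop`, and the accumulated samples would go through `toBase64` once it returns true. Whether raw 16-bit PCM is the right choice for the recognizer is exactly the first question above.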