Week 9 Non-visual Interfaces

Voice Meme is an app with a non-visual interface that adds sound effects as you talk. It is perfect for comedians, public speakers, and YouTubers who want to add a touch of humor as they talk to their audience.

The app detects keywords using the speech recognition framework, then plays the corresponding sound file and displays a GIF. I used the SwiftGif library to load GIFs into a UIImageView.
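The keyword lookup can be sketched as a simple table from words to sound files. This is only an illustration, not the app's actual code: the table `soundForKeyword`, the file names, and the function `soundFile(forTranscript:)` are all hypothetical, and the sketch assumes the transcript arrives as a plain string.

```swift
import Foundation

// Hypothetical keyword-to-sound table; the file names are placeholders.
let soundForKeyword: [String: String] = [
    "wow": "wow.mp3",
    "boom": "boom.mp3",
    "sad": "sad_trombone.mp3",
]

/// Returns the sound file for the most recent word in a transcript,
/// or nil if that word is not a keyword.
func soundFile(forTranscript transcript: String) -> String? {
    // Lowercase and split on anything that is not a letter or digit,
    // so "BOOM!" matches the key "boom".
    let words = transcript
        .lowercased()
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty }
    guard let lastWord = words.last else { return nil }
    return soundForKeyword[lastWord]
}
```

Checking only the last word is a design choice: if the recognizer reports the growing transcript repeatedly, matching against the whole string would trigger the same sound again every time.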

I wanted to measure how long it took the user to say certain words, so that the app only loads the GIF and sound file when a word lasts longer than a certain threshold. I tried using lastSegment.duration. According to its definition, duration is the number of seconds it took the user to say the word represented by the segment, measured from the start of the utterance. However, when I printed lastSegment.duration, the number kept increasing instead of giving the time it took to say the last word.
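One possible workaround, untested against the real framework, is to derive word lengths from start times instead of relying on duration. The sketch below assumes each segment also exposes a timestamp giving the offset of the word's start from the beginning of the audio. `TimedWord` is a hypothetical stand-in type, and `longWords(in:threshold:)` is an illustrative name.

```swift
import Foundation

/// Hypothetical stand-in for a transcription segment: the word and the
/// offset (in seconds) of its start from the beginning of the audio.
struct TimedWord {
    let text: String
    let timestamp: TimeInterval
}

/// Returns the words whose span (time until the next word starts)
/// is at least `threshold` seconds. The last word is skipped because
/// there is no following word to measure against.
func longWords(in words: [TimedWord], threshold: TimeInterval) -> [String] {
    guard words.count > 1 else { return [] }
    return zip(words, words.dropFirst())
        .filter { pair in pair.1.timestamp - pair.0.timestamp >= threshold }
        .map { pair in pair.0.text }
}
```

The span includes any pause after the word, so it is an upper bound on the time spent actually saying it.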

Another issue I encountered is that an individual speech request is limited to 60 seconds, so the app only listens to 60 seconds of live audio. I wanted the app to listen for as long as the user wants, but I couldn’t figure out how to reset the SFSpeechRecognitionTask at the end of every 60 seconds.
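One possible approach to the 60-second limit is to replace the recognition session shortly before it expires. The sketch below shows only the timing logic and is an assumption, not a verified fix: `SessionRotator` is a hypothetical helper, and the two closures are placeholders. In the app, `stopSession` would presumably call endAudio() on the current request and cancel() on the current SFSpeechRecognitionTask, and `startSession` would create a fresh request and task.

```swift
import Foundation

/// Hypothetical helper that restarts a session once it reaches a maximum age.
final class SessionRotator {
    private let maxSessionLength: TimeInterval
    private let startSession: () -> Void
    private let stopSession: () -> Void
    private var sessionStart: TimeInterval?
    private(set) var sessionsStarted = 0

    init(maxSessionLength: TimeInterval,
         startSession: @escaping () -> Void,
         stopSession: @escaping () -> Void) {
        self.maxSessionLength = maxSessionLength
        self.startSession = startSession
        self.stopSession = stopSession
    }

    /// Call periodically (for example from a repeating timer) with the current time in seconds.
    func tick(now: TimeInterval) {
        guard let started = sessionStart else {
            begin(at: now)  // first call: start the initial session
            return
        }
        if now - started >= maxSessionLength {
            stopSession()   // end the old session before it hits the limit
            begin(at: now)
        }
    }

    private func begin(at now: TimeInterval) {
        startSession()
        sessionStart = now
        sessionsStarted += 1
    }
}
```

Setting `maxSessionLength` a little below 60 seconds (say 55) leaves a margin, at the cost of possibly cutting off a word spoken right at the handover.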