Project 5: Voice Assistant
In this project, you will design your own customized voice assistant (“Hey Google”, “Hey Siri”, and now “your creative phrase here”). The implementation is simple, and you already have all the commands you need to accomplish this task.
Training your voice with Key Phrase [30 points]
Script File: run_record_voices.m
Begin with the starter code “run_record_voices.m” provided as part of this project. The code is designed so that you record the same key phrase, for instance “Hey Jarvis”, five times, each with a different modulation. Each recording lasts only 1.5 seconds, so make sure the phrase does not exceed that duration. Plot the key phrase after each recording and inspect the plot; if you find any disturbances, redo the recording. None of the five recordings should contain disturbances. Within the loop in this code, apply the function “voice_to_envelopes.m” (provided as part of this project) with the input arguments trigger_word and 100. This function acts like a filter: it converts your voice recording to an envelope. Store the envelope returned by the function in a column of the matrix training_envelopes, and visualize the envelope for each key phrase. After five recordings, training_envelopes should be of size 12000 x 5 (1.5 seconds at 8000 samples per second gives 12000 samples per recording). Make sure that your envelopes meet this requirement. Now, save the envelopes in a MAT file as shown below.
“save training_words training_envelopes”
Check the comments in “run_record_voices.m” and Recording Voice Sample.mov (present in resources) for further information.
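The bookkeeping described above (one envelope per column, final size 12000 x 5) can be checked without a microphone. The following is a minimal sketch under stated assumptions, not the starter code: the recorded phrase is replaced by random noise, and envelope_stub is a hypothetical stand-in for voice_to_envelopes(trigger_word, 100), which this sketch does not call.

```matlab
% Bookkeeping sketch for the training loop (no microphone needed).
fs = 8000;                      % sampling rate in Hz, as required in the testing phase
duration = 1.5;                 % recording length in seconds
num_samples = fs * duration;    % 8000 * 1.5 = 12000 samples per recording
num_words = 5;                  % five recordings of the key phrase

% Hypothetical stand-in so the sketch runs on its own. In the project,
% call voice_to_envelopes(trigger_word, 100) here instead.
envelope_stub = @(x) abs(x(:));

training_envelopes = zeros(num_samples, num_words);   % one column per recording
for k = 1:num_words
    % Placeholder for the recorded phrase: noise of the right length.
    trigger_word = randn(num_samples, 1);
    training_envelopes(:, k) = envelope_stub(trigger_word);
end

% Fail early if the matrix is not 12000 x 5.
assert(isequal(size(training_envelopes), [num_samples, num_words]));
```

The sketch deliberately leaves out the save command so that running it cannot overwrite a real training_words.mat; in the project, the save line goes after the size check.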
Testing Phase [40 points]
Script file: run_test_voices.m
Create a new script file named “run_test_voices.m”. Make sure to clear the workspace before you proceed further. Load the MAT file training_words as mentioned below:
“load training_words”
This command loads the envelopes you computed in your previous script (the training samples). Now, start a while loop that lets the user utter phrases for a maximum of three attempts or until the user says the exact key phrase [whichever comes first]. You can copy and paste the code in “run_record_voices.m” to record test words [make sure to assign ‘fs’ as 8000]. Apply voice_to_envelopes to each phrase that the user utters and save the result in the variable testing_envelope.
Measure the similarity by computing the cross-correlation between testing_envelope and each column of training_envelopes. The maximum value of xcorr() provides the similarity measure between two envelopes. The input arguments to xcorr() are testing_envelope and one column of training_envelopes at a time.
Hint: You can initiate a ‘for’ loop (for the number of training words) within the ‘while’ loop to compute the similarity between the test envelope and each training envelope.
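The inner ‘for’ loop can be sketched as follows. This is an illustration on synthetic data, not the expected solution: the Gaussian-shaped envelopes and their positions are made up, and the 'coeff' option (which scales the cross-correlation so that identical shapes peak at 1) is an assumption, since the handout only says to take the maximum of xcorr(). xcorr() is part of the Signal Processing Toolbox.

```matlab
% Synthetic envelopes stand in for real recordings (illustration only).
fs = 8000;
t = (0:12000-1)' / fs;                     % 1.5 s time axis, 12000 samples
centers = [0.55 0.58 0.60 0.62 0.65];      % made-up bump positions in seconds
training_envelopes = zeros(numel(t), 5);
for k = 1:5
    training_envelopes(:, k) = exp(-((t - centers(k)) / 0.1).^2);
end
testing_envelope = exp(-((t - 0.8) / 0.1).^2);   % same shape, uttered later

% One similarity value per training word.
num_words = size(training_envelopes, 2);
similarity = zeros(1, num_words);
for k = 1:num_words
    % Assumption: 'coeff' normalizes the result; the peak over all lags
    % is taken as the similarity, so a time shift does not matter.
    c = xcorr(testing_envelope, training_envelopes(:, k), 'coeff');
    similarity(k) = max(c);
end
disp(similarity)
```

Because the test envelope here is only a time-shifted copy of each training envelope, all five values come out close to 1. Real recordings will score lower and vary between columns.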
For a given test phrase, you will have one similarity value per training word [5 values in total]. If any of those values exceeds the threshold (0.9), display to the user that he/she has decoded the password; otherwise, ask the user to repeat the phrase until the maximum number of attempts is reached.
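The control flow of the ‘while’ loop (stop on a match or after three attempts) might look like the sketch below. To keep it runnable without a microphone, the scores in simulated_scores are made-up placeholders standing in for the largest of the five similarity values from each live attempt; the variable names and messages are illustrative, not taken from the starter code.

```matlab
threshold = 0.9;       % similarity threshold from the handout
max_attempts = 3;      % maximum number of attempts from the handout

% Made-up best-match scores, one per attempt (illustration only).
simulated_scores = [0.42 0.95 0.30];

attempts = 0;
unlocked = false;
while attempts < max_attempts && ~unlocked
    attempts = attempts + 1;
    % In the project: record the phrase, compute testing_envelope,
    % and take the largest of the five xcorr-based similarity values.
    best_score = simulated_scores(attempts);
    if best_score > threshold
        unlocked = true;
        fprintf('Attempt %d: key phrase recognized.\n', attempts);
    elseif attempts < max_attempts
        fprintf('Attempt %d: no match, please repeat the phrase.\n', attempts);
    else
        fprintf('Attempt %d: no match, maximum attempts reached.\n', attempts);
    end
end
```

Checking the largest of the five values against the threshold is equivalent to checking whether any of them exceeds it.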
Check Test Result Sample #1.mov, Test Result Sample #2.mov and Test Result Sample #3.mov (in resources) for some of the possible results.
Report [30 points]
Now, write a report about the project. You may include the plots you obtained for your key phrase along with their envelopes. You can also include plots of the envelope for a case where the user said something different from the key phrase and for a case where the user said exactly the same phrase. You may also study the performance of this algorithm by using thresholds other than 0.9 and report the results.
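One possible way to organize the threshold study is sketched below. The scores are placeholders, not measured results: replace them with the largest similarity values logged from your own test runs. The variable names are hypothetical.

```matlab
% Placeholder scores (made up); substitute values from your own tests.
match_scores    = [0.97 0.93 0.88 0.95];   % user said the key phrase
nonmatch_scores = [0.62 0.91 0.70 0.55];   % user said something else

thresholds = [0.7 0.8 0.9 0.95];
for th = thresholds
    accepted_match    = mean(match_scores > th);      % fraction correctly accepted
    accepted_nonmatch = mean(nonmatch_scores > th);   % fraction wrongly accepted
    fprintf('threshold %.2f: accepted %.2f of matches, %.2f of non-matches\n', ...
            th, accepted_match, accepted_nonmatch);
end
```

A lower threshold accepts more genuine utterances but also more wrong phrases; the report can discuss where that trade-off sits for your recordings.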
Things to be submitted: