Our solution
•Face detection in video, using an LSTM trained on a video face dataset.
•Infrequent face recognition based on a small number of frames.
Original solution (blog)
https://medium.com/@ageitgey/machine-learning-is-fun-part-4-modern-face-recognition-with-deep-learning-c3cffc121d78
•Face detection on individual frames using HOG
•Frequent face recognition on individual frames
Comparison metrics
•Better design:
•Speed (infrequent vs. each frame; network inference vs. HOG)
•Come up with some quantitative numbers.
•Consistency (inherent)
•Accuracy? Think about it.
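One rough way to get quantitative numbers for the speed comparison is to count how many detector and recognizer calls each design makes over a clip. The sketch below is only a back-of-the-envelope cost model: the per-call timings are placeholder parameters rather than measurements, `recog_frames` is a hypothetical name for the number of frames used to establish identity, and a single face per frame is assumed.

```python
from dataclasses import dataclass


@dataclass
class PipelineCost:
    detect_calls: int
    recognize_calls: int
    total_ms: float


def per_frame_cost(n_frames, detect_ms, recognize_ms):
    """Baseline design: detection and recognition both run on every frame."""
    total = n_frames * (detect_ms + recognize_ms)
    return PipelineCost(n_frames, n_frames, total)


def infrequent_cost(n_frames, detect_ms, recognize_ms, recog_frames):
    """Proposed design: detection on every frame, recognition on the first few only."""
    k = min(recog_frames, n_frames)  # cannot recognize on more frames than exist
    total = n_frames * detect_ms + k * recognize_ms
    return PipelineCost(n_frames, k, total)
```

Plugging in measured HOG and LSTM detection times for `detect_ms` would show whether a slower network detector is offset by the saved recognition calls.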
image.png
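Implementation plan:
1. Use an LSTM for detection (right part of the figure above).
Train an LSTM network on a video face detection dataset to detect faces in video, replacing the HOG-based step 1 of the original blog (https://medium.com/@ageitgey/machine-learning-is-fun-part-4-modern-face-recognition-with-deep-learning-c3cffc121d78).
2. Keep step 2: still perform posing and projecting of faces.
3. In the left half of the figure above, each box represents one frame. Giving all of a person's bounding boxes a single ID, as mentioned earlier, means the following: we run recognition on the first few frames, during which the person in the video should stay as still as possible. Once those frames establish who the person is, that person can be locked in for all later frames. In other words, recognition is not run on every frame, and the boxes act as a history: if the person leaves the frame and comes back 20 seconds later, they can still be identified without running recognition again.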
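The locking idea in step 3 can be sketched as a small tracker that links boxes across frames and stops calling the recognizer once an identity is fixed. This is a minimal illustration under several assumptions, none of which come from the notes: boxes are linked by IoU overlap, the identity is decided by majority vote over `lock_after` frames, `recognize` is a hypothetical callable supplied by the caller, and a returning person is assumed to reappear near their last known box (the notes do not say how re-entry is matched).

```python
from collections import Counter


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class Track:
    def __init__(self, track_id, box, frame_idx):
        self.track_id = track_id
        self.history = [(frame_idx, box)]  # box history for this person
        self.votes = []                    # recognizer outputs from early frames
        self.name = None                   # set once the identity is locked


class IdentityTracker:
    def __init__(self, recognize, lock_after=3, iou_thresh=0.3):
        self.recognize = recognize    # hypothetical callable: (frame, box) -> name
        self.lock_after = lock_after  # assumed number of frames used to decide
        self.iou_thresh = iou_thresh  # assumed minimum overlap to link boxes
        self.tracks = []
        self.recognize_calls = 0

    def update(self, frame_idx, frame, boxes):
        """Assign each detected box to a track; return [(track_id, name), ...]."""
        results, used = [], set()
        for box in boxes:
            # Compare with the last known box of every track. Tracks are never
            # dropped, so a person who left the frame can be matched again later.
            best, best_iou = None, self.iou_thresh
            for track in self.tracks:
                if track.track_id in used:
                    continue
                score = iou(track.history[-1][1], box)
                if score >= best_iou:
                    best, best_iou = track, score
            if best is None:
                best = Track(len(self.tracks), box, frame_idx)
                self.tracks.append(best)
            else:
                best.history.append((frame_idx, box))
            used.add(best.track_id)
            if best.name is None:
                # Identity not locked yet: spend one recognition call on this frame.
                best.votes.append(self.recognize(frame, box))
                self.recognize_calls += 1
                if len(best.votes) >= self.lock_after:
                    best.name = Counter(best.votes).most_common(1)[0][0]
            results.append((best.track_id, best.name))
        return results
```

With `lock_after=3`, a face that stays in place triggers three recognizer calls in total, however long the clip runs. Matching a returning person by position alone is fragile, so a real system would likely need a stronger re-entry cue.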
From the advisor:
Video face detection
•Data: multiple-faces detection
•Model: LSTM (each frame produces an output)
•We have done plenty of work on a similar setting
Video face recognition
•Face detection in each frame: from the LSTM model.
•Transfer learning: different datasets (video face data & real time)
•Recognition would be based on a small number of frames.
•Infrequent: once box IDs are known, recognition does not need to be rerun.
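One possible way to base recognition on a small number of frames is to average the face embeddings from those frames and compare the result with known people. The sketch below assumes the recognizer yields fixed-length embedding vectors, as in the original blog. The names `gallery` and `max_distance` are hypothetical, and the 0.6 threshold is a placeholder that would need tuning.

```python
import math


def mean_embedding(embeddings):
    """Average a list of equal-length embedding vectors element-wise."""
    n = len(embeddings)
    return [sum(values) / n for values in zip(*embeddings)]


def identify(frame_embeddings, gallery, max_distance=0.6):
    """Return the closest gallery name, or None if nothing is close enough.

    frame_embeddings: embeddings of one tracked face from a few frames.
    gallery: dict mapping a person's name to a reference embedding.
    """
    query = mean_embedding(frame_embeddings)
    best_name, best_dist = None, max_distance
    for name, reference in gallery.items():
        dist = math.dist(query, reference)  # Euclidean distance (Python 3.8+)
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name
```

Averaging over a few frames should dampen per-frame noise from pose or blur. Whether it actually improves accuracy over single-frame recognition would have to be checked on the video data.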
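In summary: the first task is to find a video dataset for the detection part. The advisor expects that an already-trained LSTM of this kind exists and can be used with some modification. His third line above means that our lab has such a finished model, built for speech recognition, which I can adapt. The second question is whether the detection data can also be used for recognition. He did not mention the last two steps of the original blog; you can check whether they are needed.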