You need to do the following:
Find one or more open-source libraries that do the following:
1. Lets the end user view live video footage when connected to a camera (for example, this can be done with the AForge.NET library).
2. Lets the end user stream audio when connected to the camera's built-in microphone or to an external microphone.
3. Notifies the user when a face is detected; face detection can be done with, for example, the OpenCV library. The notification should behave like a WhatsApp notification: when the user receives it and taps it, the app should open and show the live video or a photo of the person standing in front of the camera.
Remember that everything must be done with existing open-source libraries: nothing new needs to be invented, the existing pieces only need to be combined.
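
To give a rough idea of how the face-detection part could be put together from existing pieces, here is a minimal Python sketch. It assumes the `opencv-python` package and its bundled Haar cascade file. The names `FaceNotifier`, `watch_camera`, the `send` callback and the 30-second cooldown are hypothetical and only for illustration. Delivering the notification to a phone (the WhatsApp-style push) is not covered here and would need a separate push service.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FaceNotifier:
    """Hypothetical helper: sends at most one notification per cooldown window."""
    send: Callable[[str], None]   # assumption: any function that delivers a message
    cooldown_s: float = 30.0      # assumption: minimum gap between notifications
    _last_sent: Optional[float] = None

    def on_frame(self, face_count: int, now: float) -> bool:
        """Call once per video frame; returns True if a notification was sent."""
        if face_count <= 0:
            return False
        if self._last_sent is not None and now - self._last_sent < self.cooldown_s:
            return False  # still inside the cooldown window, stay quiet
        self._last_sent = now
        self.send(f"{face_count} face(s) detected")
        return True


def watch_camera(notifier: FaceNotifier, camera_index: int = 0) -> None:
    """Reads frames from a camera and reports detected faces to the notifier."""
    import cv2  # imported here so FaceNotifier can be used without OpenCV installed

    # Frontal-face Haar cascade that ships with the opencv-python package.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    capture = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break  # camera disconnected or no more frames
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            notifier.on_frame(len(faces), time.monotonic())
    finally:
        capture.release()
```

For a quick local try, `watch_camera(FaceNotifier(send=print))` would print a message instead of pushing one. The cooldown is there so the user is not flooded with one notification per frame while a person stays in view.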