This skill provides local audio and video transcription with Whisper, voice input (dictation and push-to-talk), and CapCut-style subtitle embedding for MP4 files. It activates for tasks such as transcribing audio or video files, podcasts, and social media content, as well as local speech-to-text, speaker diarization, and voice dictation.
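As a rough illustration of the step between transcription and subtitle embedding, the sketch below turns timed transcript segments into SRT subtitle text. It assumes each segment is a dict with `start`, `end` (in seconds), and `text` keys, which is the shape of the `segments` list returned by the open-source `openai-whisper` package. The helper names `srt_timestamp` and `segments_to_srt` are hypothetical and are not taken from this skill, whose actual implementation is not shown here.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments shaped like {"start", "end", "text"}.

    Assumption: the segment shape mirrors Whisper-style output; adapt the
    keys if your transcriber returns something different.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # Each SRT cue: running index, time range, then the caption text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.4, "text": " Hello there."},
        {"start": 2.4, "end": 5.0, "text": " Welcome."},
    ]
    print(segments_to_srt(demo))
```

The resulting text can be saved as an `.srt` file. One common way to burn such a file into an MP4 is ffmpeg's `subtitles` video filter, though whether this skill uses that route is an assumption.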
Category: Design and Frontend. Tags: #ai, #api. By Mobiss11.