Role Overview:
As a Research Scientist, you will help lead the development of our speech synthesis technology, which is core to Hippocratic AI and our Generative AI Healthcare Agents. You will play a pivotal role in developing and advancing our state-of-the-art speech synthesis capabilities tailored specifically for the healthcare domain. Working alongside a dedicated and talented team, you will lead efforts to tackle challenging problems in generative speech synthesis, text-to-speech, accent modeling, and voice conversion. This position offers an exciting opportunity to contribute to groundbreaking research and shape the future of AI-driven healthcare.
Responsibilities:
Collaborate with cross-functional teams to develop and implement innovative speech synthesis techniques to enhance our AI Healthcare Agent.
Lead research initiatives focused on advancing speech synthesis capabilities for medical applications, including TTS (Text-to-Speech), SST (Speech-to-Speech translation), and voice conversion methods.
Drive the development of novel approaches in accent modeling, style transfer, and multi-language audio synthesis to improve the efficacy and versatility of our healthcare AI platform.
Conduct experiments, analyze results, and publish findings in top-tier conferences and journals within the healthcare and AI domains.
Stay updated on the latest advancements in speech signal processing, deep learning, and related fields to inform research directions and maintain a competitive edge.
Qualifications:
Ph.D. or postdoctoral research experience in audio synthesis, speech processing, or a related field; a strong academic background in healthcare applications is preferred.
3+ years of experience in audio synthesis techniques, including TTS, SST, or voice conversion methods, with a track record of successful research projects.
Demonstrable research experience with a strong publication record in major Speech Synthesis and Speech Processing venues such as ICASSP, Interspeech, or NeurIPS.
Hands-on experience in Python, PyTorch, or TensorFlow.
Familiarity with compute platforms such as AWS for scalable model development and deployment.
Preferred Qualifications:
Experience in audio identity embedding, accent modeling, style transfer, and multi-language audio synthesis within the context of healthcare applications.
Knowledge of attention mechanisms, diffusion models, and advanced speech signal processing techniques.
About Hippocratic AI
Hippocratic AI is building a safety-focused large language model (LLM) for the healthcare industry. We believe that generative AI has the potential to massively increase healthcare access the world over, but it has to be built and tested responsibly. Like the Hippocratic oath that doctors take, we are building a model that aims to “Do no Harm.”