Responsibilities
Implement distributed training strategies across diverse hardware environments, using advanced optimization algorithms to speed up learning while preserving model accuracy (see the distributed-training sketch after this list).
Lead model-efficiency efforts by applying techniques such as compression, pruning, and quantization (a quantization sketch also follows this list), and develop effective methods for distilling knowledge from large models into smaller ones.
Optimize resource usage with fine-grained tuning strategies, and apply hardware-specific optimizations for efficient model training and improved performance.
Collaborate across teams to integrate cutting-edge technologies into practical applications, focusing on designing and refining models and streamlining training processes.
Develop robust systems for model deployment and evaluation, including data preprocessing and parallel training setups, ensuring high-quality outcomes and efficient iteration.
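To make the distributed-training responsibility concrete, here is a minimal sketch of single-node, multi-GPU data-parallel training with PyTorch's DistributedDataParallel. The linear model, random dataset, and hyperparameters are placeholders, not a prescribed setup, and it assumes CUDA GPUs with the NCCL backend:

```python
# Minimal single-node multi-GPU data-parallel training sketch (placeholder model/data).
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def train(rank: int, world_size: int):
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    model = torch.nn.Linear(128, 10).cuda(rank)   # placeholder model
    model = DDP(model, device_ids=[rank])         # wraps model for gradient sync
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

    data = TensorDataset(torch.randn(1024, 128), torch.randint(0, 10, (1024,)))
    sampler = DistributedSampler(data, num_replicas=world_size, rank=rank)
    loader = DataLoader(data, batch_size=32, sampler=sampler)

    for epoch in range(2):
        sampler.set_epoch(epoch)                  # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(rank), y.cuda(rank)
            loss = torch.nn.functional.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()                       # DDP all-reduces gradients here
            opt.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    world = torch.cuda.device_count()
    mp.spawn(train, args=(world,), nprocs=world)  # one process per GPU
```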
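Likewise, as one example of the efficiency techniques named above, here is a minimal sketch of post-training dynamic quantization in PyTorch. The model is a stand-in, and actual size and speed gains depend on architecture and workload:

```python
# Post-training dynamic quantization sketch: int8 weights for Linear layers,
# activations quantized on the fly at inference time.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 10),
).eval()

quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface, smaller weights, int8 matmuls
```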
Requirements
Bachelor's degree or higher in Computer Science, Artificial Intelligence, Mathematics, or a related field.
Minimum of 5 years of AI experience, including 3+ years specializing in large-scale language model development and optimization, ideally with a track record of successfully delivered projects.
Proficiency in deep learning theory and frameworks such as PyTorch and TensorFlow, with expertise in model fine-tuning, including supervised fine-tuning (SFT; see the sketch after this list).
Experience leading and/or mentoring a team is a must.
Strong skills in algorithm design, optimization, large-scale data processing, and high-performance computing, coupled with leadership, teamwork, communication, and project management abilities.
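For reference, SFT here means supervised fine-tuning of a language model on prompt/response pairs. Below is a minimal sketch using the Hugging Face transformers API; "gpt2" and the toy examples are stand-ins for a real base model and dataset, and prompt tokens are masked out of the loss so the model is trained only on the response:

```python
# Minimal SFT sketch: fine-tune a causal LM on prompt/response pairs,
# masking prompt tokens from the loss with the -100 ignore index.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=5e-5)

pairs = [
    ("Q: What is 2 + 2?\nA:", " 4"),
    ("Q: What is the capital of France?\nA:", " Paris"),
]  # toy data standing in for a real SFT dataset

model.train()
for prompt, response in pairs:
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    full_ids = tok(prompt + response + tok.eos_token, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100   # ignore prompt tokens in the loss
    loss = model(input_ids=full_ids, labels=labels).loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```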