Text generation is a key enabler for accelerated text input and intelligent interaction on Apple platforms. Our team is working on internationalizing generative models to redefine interaction for users all over the world. This work presents unique challenges, and we are dedicated to finding innovative solutions for various international languages and building pioneering NLP models that work offline.
Our efforts include developing on-device NLP solutions like tokenization and Chinese and Japanese conversion, enhancing the system experience for global customers. If you are passionate about delivering the best user experiences to a worldwide audience, join our ambitious and collaborative Input Experience Internationalization team in Software Engineering.
Description
Our mission is to make the input experience, a key pillar of Apple Intelligence, more inclusive and accessible for everyone, regardless of the languages they use or their background. With your ML expertise, engineering passion, and programming skills, you will:
- Work on language expansion for a wide range of advanced text generation technologies, such as context-augmented text rewriting, safety-controlled text composition, free-form text transformation, and personalized smart interactions
- Apply NLP techniques and linguistic knowledge to data engineering and ML training to deliver the best input experience to Apple customers
- Seek scalable and innovative approaches for international languages to continually update the input experience
- Collaborate with different teams, such as NLP, Localization, SIML, AIML, and Human Interface, to solve both design and engineering challenges and ensure that the next generation of experiences can reach all Apple customers
Minimum Qualifications
Proficiency in an East Asian language (Chinese, Japanese, Korean, etc.) or a Southeast Asian language (Thai, Burmese, etc.)
Strong fundamentals in ML and NLP algorithms including tokenization
Strong programming and communication skills
MS in Computer Science or a related field
Preferred Qualifications
Familiarity with tokenizer libraries and Unicode
Experience with prompt engineering for international languages
Understanding of data structures and common NLP algorithms
Proficiency in a C-like dialect (Java, C, C++, Objective-C, Swift, etc.)
PhD in CS, EE, Physics, Statistics, or a related field (or a Bachelor's or Master's degree with 2 years of industry experience)