This is Adyen

Adyen provides payments, data, and financial products in a single solution for customers like Meta, Uber, H&M, and Microsoft - making us the financial technology platform of choice. At Adyen, everything we do is engineered for amb

Adyen seeks an MLOps or DevOps Engineer to join our Generative AI team in Madrid. The mission of this team is to create a Generative AI platform enabling various applications with Large Language Models (LLMs). The team focuses on developing platform components for internal deployment and delivering end-to-end solutions for operational efficiency. Through use cases like support case routing and sentiment analysis, the team showcases AI's adaptability across different domains within the organization, improving workflows and decision-making processes.

Our mission is to create a Generative AI platform at Adyen, supporting various applications based on LLMs. This involves developing platform-oriented components for deploying an LLM backend within Adyen's GPU cluster in Kubernetes, with features like monitoring, access control, rate limiting, prompt debugging, and experiment tracking. We mainly use open-source frameworks like Hugging Face and LangChain, and models like Llama or Mixtral. Beyond the platform components, the team also delivers on some of the most promising use cases across different areas within the company.

Key responsibilities:

  • Collaborate with the team to design and build the infrastructure to host LLMs in-house while thinking about scale, performance and reliability.
  • Own the deployment strategy of ML models for downstream tasks such as ticket routing (text classification), summarization, sentiment analysis, and question-answer retrieval.
  • Automate the ML pipeline using MLOps tools and practices and optimize it for scalability and performance. 
  • Containerize applications and manage the Kubernetes deployments as well as the infrastructure needed to deploy LLMs internally, from GPUs to vector databases and inference components.
  • Develop observability best practices for the whole LLM infrastructure and build the internal framework that allows the team to monitor LLM behavior and ensure robustness under real-life conditions.
  • Design and implement APIs, services or frameworks to facilitate the seamless integration and usage of LLMs within various applications and services. 
  • Stay up to date with the latest advancements in MLOps tools and practices.

Qualifications:

  • 5+ years of professional experience as a DevOps Engineer, MLOps Engineer, ML Engineer, or Data Engineer.
  • Strong software development skills, including version control (e.g. Git, preferably on GitLab), coding best practices, debugging, and unit and integration testing.
  • Proficient in Python, Airflow, MLflow, Docker, Kubernetes and ArgoCD.
  • Proficiency with observability tools such as Prometheus, Logsearch, Kibana and Grafana.
  • Knowledge of data pipelines and ETL processes to prepare and manage data for ML training and inference, as well as model development and deployment frameworks.
  • Solid understanding of DevOps best practices, tools and CI/CD concepts for automating software development and deployment, with experience implementing them.
  • Ability to diagnose and resolve model performance, scalability, and deployment issues.
  • Familiarity with monitoring tools to track model performance, resource utilization, and system health. Experience in logging and error monitoring for ML models and applications.

Desirable additional requirements:

  • Knowledge of ETL pipelines using PySpark and Airflow for data preprocessing and model training.
  • Clear understanding of the end-to-end machine learning lifecycle.
  • Experience with Helm for packaging and deploying applications on Kubernetes and with Kustomize for customizing and managing Kubernetes configurations.
  • Familiarity with infrastructure as code with tools like Terraform.
  • Experience with open-source machine learning frameworks like Hugging Face Transformers.
  • General LLMOps experience is a plus, including model deployment, monitoring, resource and infrastructure management, and GPU knowledge.

Our Diversity, Equity and Inclusion commitments 

Our unique approach is a product of our diverse perspectives. This diversity of backgrounds and cultures is essential in helping us maintain our momentum. Our business and technical challenges are unique, and we need as many different voices as possible to join us in solving them - voices like yours. No matter who you are or where you’re from, we welcome you to be your true self at Adyen. 

Studies show that women and members of underrepresented communities apply for jobs only if they meet 100% of the qualifications. Does this sound like you? If so, Adyen encourages you to reconsider and apply. We look forward to your application!

What’s next?

Ensuring a smooth and enjoyable candidate experience is critical for us. We aim to get back to you regarding your application within 5 business days. Our interview process tends to take about 4 weeks to complete, but may fluctuate depending on the role. Learn more about our hiring process here. Don’t be afraid to let us know if you need more flexibility.

This role is based out of our Madrid office. We are an office-first company and value in-person collaboration; we do not offer remote-only roles.

Location

Madrid

Job Overview
Job Posted:
6 months ago
Job Type
Full Time