# Senior AI Engineer - Payments
Adyen is establishing a GenAI team focused on identifying high-impact use cases for automation through agentic capabilities. As a Senior AI Engineer, you will design agents that reason over complex, multi-step tasks, build infrastructure for production-grade reliability, and shape how humans and AI collaborate at scale within a global payments company.
## What you'll do:
- Discover and Build: Proactively engage with product and engineering teams to uncover critical challenges. Find high-impact opportunities and rapidly design and build AI prototypes (MVPs) to demonstrate value.
- Develop Strategic AI Products: Own end-to-end development of bespoke AI tools that solve problems unique to Adyen's scale. Build intelligent solutions for merchant experience optimization, pricing models, and internal workflows.
- Own Evaluation and Benchmarking: Define and lead the evaluation strategy for agentic systems and LLMs. Design internal benchmarks grounded in real domain complexity, probing for genuine capabilities, edge cases, and failure modes. Build reusable evaluation infrastructure embedded in the development process.
- Provide AI Expertise Across the Organization: Serve as a technical resource for AI initiatives across Adyen, evaluating agentic frameworks, retrieval and search strategies, and agent tool-use approaches. Surface connections across initiatives and help teams avoid duplicating work.
- Raise the Bar: Set engineering standards for the team and company. Provide mentorship through problem decomposition, research methodology, and code review. Champion reproducibility, documentation, and rigorous evaluation practices.
## Who you are:
- 7+ years of hands-on experience in applied AI/ML research or engineering with a clear track record of shipping AI systems, including agentic or LLM-powered systems, in production environments.
- Deep expertise in language models and Generative AI with hands-on depth across several of: architecture, post-training (fine-tuning, RLHF), inference optimization, context engineering (RAG), and failure modes at scale.
- Proven experience designing and operating agentic systems at scale, including multi-agent orchestration, tool use, memory and context management, state handling for long-running workflows, and human-in-the-loop design.
- Rigorous and systematic about evaluation, with experience designing evaluation frameworks or internal benchmarks beyond standard metrics.
- Strong foundation in classical machine learning: supervised learning, ensemble methods, optimization, probabilistic modeling, and statistics.
- You write clean, well-structured, production-ready code, primarily in Python, and hold research code to an engineering standard.
- Hands-on experience with at least one production-grade agentic framework.
## Nice to have:
- Familiarity with financial data, payments, fraud detection, or risk systems.
- Track record of external visibility: publications, conference presentations, or open-source contributions.
- Experience with observability and evaluation tooling.
- Familiarity with MLOps and model deployment pipelines in large-scale environments.