Senior Software Engineer - ML/AI: Develop and implement innovative AI solutions with Python, SQL, and Airflow to optimize payment processes. Build prototypes and strategic tools that increase operational efficiency and provide valuable insights to teams in Amsterdam.
This is Adyen

Adyen provides payments, data, and financial products in a single solution for customers like Meta, Uber, H&M, and Microsoft - making us the financial technology platform of choice. At Adyen, everything we do is engineered for ambition. Our Payment Solutions team is the engine behind this, processing billions of transactions while empowering businesses to create seamless payment experiences. As Adyen and our merchants scale globally, the complexity and volume of our operational knowledge are growing exponentially.

Senior AI Engineer - Payments

To position our Payments organization at the technological forefront, we are establishing a GenAI team focused on identifying high-impact use cases for automation through agentic capabilities. As a Senior AI Engineer, you will undertake some of the most technically demanding work in applied AI: designing agents that reason over complex, multi-step tasks, building the infrastructure to ensure production-grade reliability, and shaping how humans and AI collaborate at scale within a global payments company.

This is not a narrow research role. You will take full ownership of your work, from early research through deployed production systems, influence the team's technical direction, and act as a force multiplier for the broader AI organization, including contributing to custom model development for structured financial data, and working toward our longer-term ambition of defining how humans and AI collaborate at scale across the company.

What you’ll do:

Discover and Build: You will be an explorer within Adyen, proactively engaging with product and engineering teams to uncover their most critical challenges. You won't wait for instructions; you'll find high-impact opportunities and rapidly design and build AI prototypes (MVPs) to demonstrate value.

Develop Strategic AI Products: You will own the end-to-end development of bespoke AI tools that solve problems unique to Adyen's scale.
Whether it's optimizing merchant experience, enhancing pricing models, or improving internal workflows, you will build intelligent solutions.

Own Evaluation and Benchmarking: Define and lead the evaluation strategy for the agentic systems and LLMs your team builds and deploys. Design internal benchmarks grounded in real domain complexity, probing for genuine capabilities, edge cases, and failure modes that standard metrics miss. Build reusable evaluation infrastructure that is embedded in the development process, not bolted on after the fact.

Provide AI Expertise Across the Organization: Serve as a technical resource for AI initiatives across Adyen - evaluating agentic frameworks, retrieval and search strategies, or agent tool-use approaches across partner teams. Surface connections across initiatives and help teams avoid duplicating work or converging on the wrong approach.

Raise the Bar: Set engineering standards for the team and company. Provide mentorship through problem decomposition, research methodology, and code review. Champion reproducibility, documentation, and rigorous evaluation practices across the AI organization.

Who you are:

You have 7+ years of hands-on experience in applied AI/ML research or engineering, with a clear track record of shipping AI systems, including agentic or LLM-powered systems, in production environments.

You have deep expertise in language models and Generative AI, with hands-on depth across several of: architecture, post-training (fine-tuning, RLHF), inference optimization, context engineering (RAG), and failure modes at scale.

You have proven experience designing and operating agentic systems at scale: multi-agent orchestration, tool use, memory and context management, state handling for long-running workflows, and human-in-the-loop design. You understand what separates production agents from research prototypes.

You are rigorous and systematic about evaluation.
You have designed evaluation frameworks or internal benchmarks that go beyond standard metrics. You understand the failure modes of LLM-as-judge approaches and know how to measure what actually matters for a given system and use case.

You have a strong foundation in classical machine learning: supervised learning, en