What you will do

About this role

## Lead AI Application Engineer

We are looking for a dedicated Lead AI Application Engineer to join one of our clients' teams in an innovative environment.

### Key Responsibilities

#### Build & Run the Shared AI Platform

- Architect and maintain a multi-tenant AI Platform that supports the full ML lifecycle across cloud and on-premises environments
- Ensure high availability, low latency, and cost-efficiency for all shared AI resources
- Implement LLMOps/MLOps best practices, including automated deployment pipelines for models

#### Curate the AI Services Catalogue

- Develop and expose "as-a-service" capabilities: Inference-as-a-Service, Embeddings-as-a-Service, and RAG-as-a-Service
- Standardize how squads interact with LLMs, providing unified APIs and abstraction layers to prevent vendor lock-in

#### Manage AI Data Infrastructure

- Own the deployment and scaling of Vector Databases (e.g., Pinecone, Milvus, Weaviate) and Feature Stores (e.g., Feast, Tecton, Hopsworks)
- Optimize data retrieval patterns to support real-time AI applications and agentic workflows
- Oversee Model Hosting environments, utilizing Kubernetes (K8s) and GPU orchestration to manage compute resources efficiently

#### Enable Developer Self-Service

- Build and maintain a Self-Service Portal or CLI that allows product squads to provision AI environments, models, and data stores independently
- Reduce "Time-to-Inference" for new features by providing pre-configured templates and blueprints
- Conduct internal workshops and provide documentation to empower squads to use the platform effectively

### Must-Have Technical Skills

- Infrastructure: Deep experience with Kubernetes (K8s), Docker, and Terraform/Pulumi
- Hybrid Cloud: Proven experience managing workloads across AWS/Azure/GCP and On-Premises (NVIDIA AI Enterprise, OpenShift)
- AI/ML Tooling: Hands-on experience with vLLM, TGI (Text Generation Inference), or NVIDIA Triton for model serving
- Databases: Expertise in Vector DBs and traditional SQL/NoSQL databases
- Languages: High proficiency in Python and in either Go or Rust for platform tooling

### Experience

- 8+ years in Platform Engineering, DevOps, or Site Reliability Engineering (SRE)
- 2+ years specifically focused on building AI/ML infrastructure or platforms
- Experience building Internal Developer Platforms (IDP) is a massive plus

Skills & experience

Lead, Kubernetes, Docker, Terraform, Pulumi, AWS, Azure, GCP, NVIDIA AI Enterprise, OpenShift, vLLM, TGI, NVIDIA Triton, Vector Databases, Pinecone, Milvus, Weaviate, Feast, Tecton, Hopsworks, Python

More jobs

- Lead AI Application Engineer (Infrastructure & LLMOps), Berlin · Full-time
- Senior Embedded Software Engineer, Berlin · Full-time
