Senior Data Scientist role at Samba TV, a media intelligence company. You will shape Samba TV's data science function for the agentic era of advertising by defining modeling methodology, building production ML systems, and applying modern AI capabilities across data products.
Key Responsibilities:
- Own end-to-end delivery of data science projects from problem scoping through production deployment
- Define and ship modeling methodology including model selection, evaluation frameworks, and reproducibility standards
- Apply ML and statistics expertise (regression, classification, clustering, model evaluation, experimental design, causal inference) to billion-row datasets
- Write production-quality Python and PySpark code on Databricks that is well tested, documented, and reusable
- Partner with Data Engineering on data requirements and pipeline validation
- Build and operate advanced AI systems using RAG, LLM-augmented modeling, and Graph Neural Networks
- Integrate LLMs and agentic workflows into production ML pipelines
- Drive technical design for modeling components with clear solution documentation
- Establish MLOps practices including experiment tracking, pipeline orchestration (Airflow), model monitoring, and retraining workflows
- Apply privacy-compliant data handling practices (GDPR, CCPA)
- Mentor data scientists on the team
Required Qualifications:
- 8+ years of hands-on data science experience with a Bachelor's degree in Statistics, Data Science, Computer Science, Mathematics, or a related field (or 6+ years with a Master's, or 3+ years with a PhD)
- Demonstrated ability to own and deliver complex, multi-sprint data science projects
- Solid command of core ML and statistics applied to billion-row datasets
- Track record of defining modeling methodology, including data analysis, model selection, and evaluation frameworks
- Production experience with vector databases (Pinecone, Weaviate, Milvus, pgvector, or equivalent)
- Advanced Python, with experience writing production-quality, tested code; strong SQL and PySpark
- Databricks, Delta Lake, and job orchestration (Airflow) experience
- Hands-on production experience on AWS, GCP, and Databricks
- MLOps proficiency, including experiment tracking, pipeline orchestration, model monitoring, and retraining workflows
- Experience designing and operating agentic AI systems in production
- Strong communication and mentoring skills
Preferred Qualifications:
- Knowledge graph design (RDF, OWL, SPARQL)
- Natural Language Processing
- Background in ad tech, CTV/OTT, ACR, audience activation, identity resolution, or measurement methodologies
- Experience with causal inference (A/B testing, synthetic control, uplift modeling)