Design and build machine learning, NLP, and generative AI solutions that support scientific discovery, knowledge extraction, decision support, and intelligent content understanding.
Responsibilities:
- Design and build machine learning, NLP, and generative AI systems for scientific discovery, knowledge extraction, decision support, and intelligent content understanding
- Work with large-scale, complex, and heterogeneous data, including scientific publications, research datasets, knowledge graphs, ontologies, taxonomies, citations, metadata, and content from every scientific discipline
- Apply the right technique to each problem, using approaches such as classification, regression, clustering, ranking, feature engineering, deep learning, embeddings, LLMs, retrieval, and generative AI
- Develop capabilities for semantic search, information retrieval, entity extraction, content classification, recommendation, ranking, summarization, question answering, and evidence-grounded generation
- Build, evaluate, fine-tune, prompt, and integrate models into robust production systems, while continuously improving quality, relevance, reliability, and user value
- Write clean, tested, production-quality Python code, and contribute reusable data science components, packages, and scalable data pipelines for preprocessing, inference, experimentation, monitoring, and continuous improvement
- Support deployment, monitoring, model maintenance, drift detection, automated retraining, and ongoing optimization of data science systems
- Collaborate with engineering, product, UX, analytics, and research teams as well as domain experts, and communicate technical concepts, model behavior, insights, trade-offs, and recommendations clearly to both technical and non-technical audiences