Prompt Engineer responsible for the end-to-end technical workflow for migrating templates to LLM raters. The role involves applying prompt engineering techniques through internal tools to maximize model performance.
Responsibilities:
- Use Automatic Prompt Generation (APG) tools to create baseline prompts for complex parent-child template clusters
- Execute and supervise the Automated Prompt Optimization (APO) tool, review outputs, and flag when APO reaches deadlocks or plateaus
- Manually design, test, and refine prompts to navigate complex template architectures, overcome anti-patterns, and resolve edge cases
- Monitor Shadowbot runs to ensure they capture sufficient discrepancies between human and LLM ratings
- Test prompt versions against golden data to continuously measure rater quality against the human crowd baseline, using metrics such as precision, recall, and F1 score
- Prepare technical launch readiness justifications (Launch Certification Documentation)
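
As an illustration of the golden-data testing listed above, the following is a minimal sketch of how precision, recall, and F1 might be computed for one prompt version. It assumes a single positive label and two parallel lists of ratings. The names `score_against_golden` and `BinaryMetrics` and the label values are hypothetical and do not reflect the internal tooling.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    precision: float
    recall: float
    f1: float


def score_against_golden(golden, predicted, positive_label):
    """Compare LLM ratings with golden (human) ratings for one positive label.

    Hypothetical helper: `golden` and `predicted` are parallel lists of labels.
    """
    if len(golden) != len(predicted):
        raise ValueError("golden and predicted must have the same length")

    pairs = list(zip(golden, predicted))
    # True positives: both the golden data and the LLM chose the positive label.
    tp = sum(1 for g, p in pairs if g == positive_label and p == positive_label)
    # False positives: the LLM chose the positive label, the golden data did not.
    fp = sum(1 for g, p in pairs if g != positive_label and p == positive_label)
    # False negatives: the golden data is positive, the LLM chose something else.
    fn = sum(1 for g, p in pairs if g == positive_label and p != positive_label)

    # Guard against division by zero when a class never appears.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(precision, recall, f1)
```

Running the same function over each prompt version on the same golden set gives comparable numbers from one version to the next. Templates with more than two labels would need a per-label or averaged variant of this calculation.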