# Research Assignment: AI Engineer – Content Generation App for HBO/WO Students
## About the role
The City of Amsterdam is seeking a third- or fourth-year HBO or WO student to develop, during an internship, a Content Quality Framework that makes AI-generated content measurably better.
## The project
The City of Amsterdam is exploring how generative AI can help Amsterdam.nl content specialists create content. The Content Generation App combines a large language model (LLM) with a knowledge database (RAG) in a user-friendly front-end with workflow functionality, so that the generated content is consistent, accessible, and truthful.
## What you'll do
You will develop a Content Quality Framework that makes it possible to measure, assess, and structurally improve the quality of the Content Generation App. The framework has three scoring levels:
- Input score: how good are the source documents in the knowledge database (complete, current, consistent)?
- Output score: how good is the generated content? This is an overall content quality score that combines multiple metrics.
- Guardrail metrics: hard boundaries from Responsible AI. A change that worsens the bias score may not be implemented.
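
As an illustration of how the three levels could fit together, here is a minimal Python sketch. It is not the framework itself, which is what the assignment asks you to design. The metric names, the weights, the 0 to 1 scale, and the convention that a lower bias score is better are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    """One evaluation run of the app (hypothetical structure)."""
    input_score: float                # quality of the source documents, 0..1
    output_metrics: dict[str, float]  # per-metric output scores, 0..1
    bias_score: float                 # guardrail metric; assumed: lower is better


def output_score(metrics: dict[str, float], weights: dict[str, float]) -> float:
    """Combine several output metrics into one weighted average."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(metrics[name] * w for name, w in weights.items()) / total_weight


def may_implement(baseline: Evaluation, candidate: Evaluation,
                  weights: dict[str, float]) -> bool:
    """Decide whether a change may go ahead.

    The guardrail is checked first and acts as a hard boundary: a worse
    bias score blocks the change no matter how much the output improves.
    """
    if candidate.bias_score > baseline.bias_score:
        return False
    # Assumption: beyond the guardrail, a change should not lower the output score.
    return (output_score(candidate.output_metrics, weights)
            >= output_score(baseline.output_metrics, weights))
```

The point of the sketch is that the guardrail is not folded into the weighted average: it is a separate check that can veto a change, which is one way to read "hard boundary".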
### The three buttons
- Validation & Quality → improves the input score
  - Assess source documents in the RAG database
  - Identify outdated, conflicting, or missing sources
  - Validate semantic coverage
  - Perform consistency checks
- Optimization → improves the output score
  - Improve output through prompt engineering
  - Compare and evaluate different language models
  - Set up and conduct A/B tests
  - Build dashboards
- Content Intelligence Loop → improves both scores structurally
  - Create: generate content based on the knowledge database
  - Measure: measure how content performs
  - Analyze: compare performance against benchmarks
  - Optimize: feed insights back
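
To make the Validation & Quality checks more concrete, the sketch below flags outdated and missing sources and turns the result into a simple input score. It assumes each source document carries a topic and a last-reviewed date, that a document counts as outdated after one year, and that the input score is the plain average of the share of current documents and the share of covered topics. All of these are hypothetical choices for illustration, not properties of the actual RAG database.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class SourceDocument:
    """Minimal stand-in for a document in the knowledge database."""
    doc_id: str
    topic: str
    last_reviewed: date


def find_outdated(docs: list[SourceDocument], today: date,
                  max_age: timedelta = timedelta(days=365)) -> list[str]:
    """Return the ids of documents not reviewed within max_age."""
    return [d.doc_id for d in docs if today - d.last_reviewed > max_age]


def find_missing_topics(docs: list[SourceDocument],
                        required_topics: set[str]) -> list[str]:
    """Return required topics that no document covers, sorted by name."""
    covered = {d.topic for d in docs}
    return sorted(required_topics - covered)


def input_score(docs: list[SourceDocument], today: date,
                required_topics: set[str]) -> float:
    """Average of the share of current documents and the share of covered topics."""
    if not docs:
        return 0.0
    current_share = 1 - len(find_outdated(docs, today)) / len(docs)
    if required_topics:
        coverage_share = 1 - len(find_missing_topics(docs, required_topics)) / len(required_topics)
    else:
        coverage_share = 1.0
    return (current_share + coverage_share) / 2
```

Detecting conflicting sources and validating semantic coverage need more than metadata, for example comparing document contents, and are deliberately left out of this sketch.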
## Responsible AI as a binding requirement
- Human-centered: the editor always makes the final decision
- Guardrail metrics: bias detection on geographical, socioeconomic, and cultural factors
- Source traceability: every claim is traceable to a source document
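
The source traceability requirement can be expressed as an automated check. The sketch below assumes that generated content has already been split into individual claims, each annotated with the ids of the source documents it relies on. How claims are extracted and linked is an open design question and not something this example solves, and the `Claim` structure is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """A single statement in generated content with its cited sources."""
    text: str
    source_ids: tuple[str, ...]


def untraceable_claims(claims: list[Claim],
                       known_source_ids: set[str]) -> list[Claim]:
    """Return claims that cite no source or cite a source that does not exist."""
    return [
        claim for claim in claims
        if not claim.source_ids
        or not set(claim.source_ids) <= known_source_ids
    ]
```

In keeping with the human-centered requirement, a check like this would flag claims for the editor to review and would not remove or rewrite them on its own.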
## Research questions
Choose one of the following questions or formulate your own:
- Which metrics and scoring models are most suitable for reliably assessing the quality of AI-generated government content?
- How can the completeness and consistency of source documents in the RAG database be systematically monitored and improved?
- What is the effect of prompt optimization and model selection on the output quality score?
- How can a Content Intelligence Loop be structured systematically?
- How does AI-generated content perform compared to manually written content?
## Activities
You will work 32 to 36 hours per week from September 2026 through January 2027, partly on-site and partly from home. You will work together with data and analytics specialists, content specialists, and editors.