title: Content Moderator Agent
| status: active
---
goal: Review user-generated content for policy compliance
Automatically flag or remove content that violates community guidelines.
Maintain a complete audit trail of all moderation decisions.
---
step: Receive content for review
Accept content submissions from the queue. Each item includes
the content body, author ID, submission timestamp, and content type.
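The fields above can be sketched as a small record type; the class and field names below are illustrative assumptions, not a documented queue schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical shape of one queue item; field names mirror the
# fields listed in this step but are not a real API contract.
@dataclass
class Submission:
    content_id: str
    body: str
    author_id: str
    submitted_at: datetime
    content_type: str  # e.g. "post", "comment", "image_caption"

item = Submission(
    content_id="c-1001",
    body="Check out this amazing deal!!!",
    author_id="u-42",
    submitted_at=datetime.now(timezone.utc),
    content_type="comment",
)
```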
---
step: Run automated checks
| tool: ai.classifyContent
Score the content across multiple dimensions:
*Spam* — promotional, repetitive, or bot-generated content
*Harassment* — targeted abuse, threats, or bullying
*Misinformation* — verifiably false claims on sensitive topics
*Adult content* — explicit material outside designated areas
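A minimal sketch of what a per-dimension scorer might return, assuming the `ai.classifyContent` tool yields one 0.0–1.0 score per dimension (the stub heuristic below is purely illustrative, not the real model):

```python
# Illustrative stand-in for the ai.classifyContent tool; the real
# tool's request/response shape is not specified in this document.
def classify_content(body: str) -> dict[str, float]:
    """Return a 0.0-1.0 score per policy dimension (stubbed)."""
    scores = {"spam": 0.0, "harassment": 0.0,
              "misinformation": 0.0, "adult": 0.0}
    # Toy heuristic so the example is runnable; a real classifier
    # would call a model, not match substrings.
    if "!!!" in body or "deal" in body.lower():
        scores["spam"] = 0.85
    return scores

scores = classify_content("Check out this amazing deal!!!")
```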
---
policy: Moderation thresholds
Content scoring 0.9 or above on any policy dimension is auto-removed.
Content scoring at least 0.7 but below 0.9 is queued for human review.
Content scoring below 0.7 on every dimension is approved automatically.
All auto-decisions are logged with confidence scores.
---
gate: Human review required
| condition: highest policy score is at least 0.7 and below 0.9
Route the content to a human moderator with the AI assessment,
similar past decisions, and relevant policy excerpts.
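The review packet described above might be bundled as follows; the keys and helper name are hypothetical, chosen only to mirror the three items this gate routes to the moderator:

```python
def build_review_packet(content_id: str, scores: dict[str, float],
                        similar: list[dict], excerpts: list[str]) -> dict:
    """Bundle everything a human moderator sees; keys are illustrative."""
    return {
        "content_id": content_id,
        "ai_assessment": scores,
        "top_concern": max(scores, key=scores.get),
        "similar_decisions": similar,
        "policy_excerpts": excerpts,
    }

packet = build_review_packet(
    "c-1001",
    {"spam": 0.82, "harassment": 0.05},
    similar=[{"content_id": "c-0977", "decision": "reject"}],
    excerpts=["Promotional content is limited to designated areas."],
)
```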
---
step: Record decision
| tool: audit.logDecision
Log the moderation decision with: content ID, decision (approve/reject),
reviewer (auto or human), confidence score, policy cited, and timestamp.
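A sketch of building that audit record, assuming `audit.logDecision` accepts a JSON payload (the actual schema is not documented here):

```python
import json
from datetime import datetime, timezone

def log_decision(content_id: str, decision: str, reviewer: str,
                 confidence: float, policy: str) -> str:
    """Serialize the audit fields listed in this step as JSON."""
    record = {
        "content_id": content_id,
        "decision": decision,      # "approve" or "reject"
        "reviewer": reviewer,      # "auto" or "human"
        "confidence": confidence,
        "policy_cited": policy,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record)

entry = log_decision("c-1001", "reject", "auto", 0.95, "spam")
```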
---
step: Notify author
| tool: notification.send
If content is rejected, notify the author with the specific policy
violation and an appeal process link. Be factual, not accusatory.
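A factual, non-accusatory notice might look like the template below; the wording and helper name are illustrative, and the appeal URL would come from the real notification.send integration:

```python
def rejection_notice(policy: str, appeal_url: str) -> str:
    """Build a factual rejection message (template is illustrative)."""
    return (
        f"Your recent submission was removed because it did not meet "
        f"our {policy} policy. If you believe this was a mistake, "
        f"you can appeal here: {appeal_url}"
    )

msg = rejection_notice("spam", "https://example.com/appeal")
```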