What principles guide Playlab’s approach to Trust and Safety?

Playlab is guided by three core principles:
  • Impact and safety drive the organization's incentives: As a nonprofit, Playlab is not driven by short-term growth incentives. Instead, they invest in solving hard problems in education around safety, efficacy, transparency, and bias.
  • Responsible growth grounded in AI literacy: Playlab is invite-only; most users join through professional learning offered by Playlab or trusted partners. This approach grounds their work in AI literacy, reflecting the belief that the best way to understand new technology is to actively create with it.
  • Prioritizing open-source AI: Playlab believes AI models used in public education should be transparent and open to interrogation for bias and interpretability. While they currently use closed AI models to provide access to frontier technology, their long-term plan is to prioritize fully open-source models.

How does Playlab implement moderation and safety in their AI models?

Every app in Playlab includes:
  • Additional bias and alignment guidance provided to the AI models
  • Automated moderation of all user inputs (prompts) and model outputs
  • Automatic hiding of moderated content from conversations to protect users
  • The ability for users to manually flag outputs for issues related to bias, appropriateness, and hallucination
  • Org-level email notifications that alert admins when moderated content is detected
  • Designated Safety Admin roles for organizations, responsible for reviewing flagged activity
Playlab is actively refining moderation categories to be more intuitive and building a data annotation pipeline to train more accurate moderation models. For more details, see the Safety and Moderation Updates feature page.
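For readers who want a concrete picture of this flow, here is a minimal sketch in Python. It shows the pattern the list above describes (moderating both inputs and outputs, hiding moderated content, and notifying admins); every function name, rule, and message is a hypothetical stand-in, not Playlab's actual implementation or API.

```python
# A minimal, purely illustrative sketch of a two-sided moderation flow.
# Nothing here is Playlab's real code; names and rules are placeholders.
from dataclasses import dataclass, field

@dataclass
class ModerationResult:
    flagged: bool
    categories: list[str] = field(default_factory=list)

def moderate(text: str) -> ModerationResult:
    """Stand-in for an automated moderation model."""
    blocked_terms = {"example-unsafe-term"}  # placeholder rule set
    hits = [t for t in blocked_terms if t in text.lower()]
    return ModerationResult(flagged=bool(hits), categories=hits)

def notify_admins(kind: str) -> None:
    """Stand-in for the org-level email notifications described above."""
    print(f"ALERT to safety admins: moderated {kind} detected")

def handle_turn(user_prompt: str, generate) -> str:
    # 1. Moderate the user input before it reaches the model.
    if moderate(user_prompt).flagged:
        notify_admins("input")
        return "[This message was hidden by moderation.]"
    # 2. Generate a response, then moderate it before it is shown.
    output = generate(user_prompt)
    if moderate(output).flagged:
        notify_admins("output")
        return "[This response was hidden by moderation.]"
    return output
```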

What safety measures are built into Playlab’s product development?

Playlab’s approach to responsible product development includes:
  • Red teaming (adversarial testing)
  • Testing higher-risk releases with a smaller subset of users through co-design
  • In-product disclosures
  • Ongoing professional learning
  • Dedicated resources for developing improved age-appropriate and education-appropriate moderation models
  • Org-level moderation notifications so administrators are informed when flagged content is detected
  • Safety Admin designation at the organization level for dedicated safety oversight
  • Flag visibility and acknowledgement tools so admins can track and review flagged content with a full audit trail (sketched below)
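As a purely illustrative picture of the last item, a flagged-content record with an acknowledgement audit trail could be modeled as below. The field names are assumptions made for the sketch, not Playlab's actual schema.

```python
# Hypothetical model of a flag record with an append-only audit trail.
# All field names are illustrative assumptions, not Playlab's schema.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AuditEvent:
    actor: str        # e.g., a Safety Admin's user ID
    action: str       # e.g., "viewed", "acknowledged", "resolved"
    at: datetime

@dataclass
class FlagRecord:
    flag_id: str
    content_excerpt: str
    reason: str       # e.g., "bias", "appropriateness", "hallucination"
    events: list[AuditEvent] = field(default_factory=list)

    def acknowledge(self, admin_id: str) -> None:
        # Review steps are appended, never overwritten, preserving a
        # complete history of who saw and handled each flag.
        self.events.append(
            AuditEvent(admin_id, "acknowledged", datetime.now(timezone.utc))
        )
```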

How does Playlab help communities create appropriate guardrails?

Playlab supports their community through:
  • Professional learning, courses, content, and coaching to design guardrails for specific projects
  • App usage that creators can review and inspect, enabling teachers to understand how students use resources
  • Templates with built-in guardrails and guidelines when creating new apps
  • The Playlab Assistant, which provides suggestions on improving equity and mitigating biases
  • Reminders about the processing of sensitive information
  • Visibility into which uploaded References inform model outputs
  • Toggle control over additional functionality that carries increased risk
  • Batch moderation email digests that give org admins and workspace owners visibility into flagged activity without overwhelming them
  • A requirement that at least one person in each organization be designated to receive moderation notifications, so that flagged content is always reviewed

How does Playlab approach testing and evaluation?

Playlab encourages:
  • Piloting and testing how apps might drive impact in specific contexts
  • Testing for and guarding against harm and bias
  • Prioritizing projects that drive impact forward
In collaboration with partners like the Chan Zuckerberg Initiative and Leading Educators, Playlab is developing rubrics and evaluation tools to assess the quality, impact, and safety of apps built on their platform.

How can I provide feedback or suggestions about Trust and Safety?

For feedback, ideas, or questions about Playlab’s approach to trust and safety, you can reach out to safety@playlab.ai.

Where can I learn about recent safety updates?

For the latest on moderation improvements, org-level notifications, Safety Admin roles, and flag handling, visit the Safety and Moderation Updates feature page.

Where can I find more information about Playlab’s data policies?

For specific policies on how Playlab handles data, you can check their Security FAQ, which distills information from their privacy policy and Data Privacy Agreements with enterprise partners.