ChatGPT safety checks may trigger police action

Beyond threats of harm, OpenAI is also working to detect risky behaviours in ChatGPT, such as sleep deprivation and unsafe stunts, and to guide users toward trusted contacts and therapists.

OpenAI has confirmed that conversations on ChatGPT are subject to human review when they involve threats of serious harm to others, and in extreme cases, such threats may be referred to law enforcement. The company detailed these policies in a recent blog post outlining how it handles sensitive and potentially dangerous interactions.

How the system works

ChatGPT is designed with safeguards that identify when users express distress or harmful intentions. OpenAI explained that it treats self-harm and threats to others differently (a simplified sketch follows the list below):

  • Self-harm: If a user expresses suicidal intent, the AI provides empathetic responses and directs them to professional hotlines such as 988 in the US or Samaritans in the UK. OpenAI emphasised that these cases are not automatically referred to the police, to respect users’ privacy.

  • Threats to others: When a user signals intent to hurt someone else, the conversation is flagged for review. Human moderators trained in OpenAI’s usage policies examine the chat. If they determine there is an imminent threat, OpenAI may contact law enforcement. Accounts involved in such cases can also be suspended or banned.
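To make the two branches concrete, here is a minimal Python sketch of how such a triage policy could be expressed. It is purely illustrative: the Risk and Action types, the triage function, and the idea of a classifier feeding it are assumptions for this article, not OpenAI's actual (non-public) implementation.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Risk(Enum):
    NONE = auto()
    SELF_HARM = auto()          # user may be a danger to themselves
    THREAT_TO_OTHERS = auto()   # user signals intent to harm someone else

@dataclass
class Action:
    reply_style: str                  # how the model should answer
    human_review: bool = False        # route the chat to trained moderators?
    may_contact_police: bool = False  # moderators may refer if threat is imminent
    resources: tuple = ()             # hotlines to surface to the user

def triage(risk: Risk) -> Action:
    if risk is Risk.SELF_HARM:
        # Empathetic reply plus hotline referrals; per the article,
        # these cases are not automatically reported to police.
        return Action("empathetic", resources=("988 (US)", "Samaritans (UK)"))
    if risk is Risk.THREAT_TO_OTHERS:
        # Flagged for human moderators, who may contact law enforcement
        # if they judge the threat imminent, and may suspend the account.
        return Action("de-escalating", human_review=True,
                      may_contact_police=True)
    return Action("normal")
```

The point of the sketch is the asymmetry the article describes: self-harm routes to support resources without police referral, while threats to others can escalate through human review.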

Challenges and limitations

The company admitted that its safety features are more reliable in short conversations. In long or repeated chats, safeguards may weaken, potentially leading to inconsistent enforcement. OpenAI said it is working to improve protections so that risk detection remains strong across extended interactions.
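OpenAI has not said how it will harden long chats, but one common mitigation for this kind of drift is to re-run a safety classifier over a sliding window of recent turns rather than relying on checks anchored to the start of the conversation. The sketch below only illustrates that general idea; the window size, the monitor function, and the toy keyword classifier are all assumptions, reusing the hypothetical Risk labels from the earlier sketch.

```python
from collections import deque
from enum import Enum, auto

class Risk(Enum):               # same hypothetical labels as the sketch above
    NONE = auto()
    SELF_HARM = auto()
    THREAT_TO_OTHERS = auto()

def monitor(turns, classify, window: int = 10):
    """Yield (turn, risk) whenever the last `window` user turns, taken
    together, trip the classifier, so detection quality does not decay
    as the conversation grows."""
    recent = deque(maxlen=window)
    for turn in turns:
        recent.append(turn)
        # Classify the joined recent context rather than single turns:
        # risky intent often emerges across several messages.
        risk = classify(" ".join(recent))
        if risk is not Risk.NONE:
            yield turn, risk

# Toy keyword classifier, purely for demonstration:
toy = lambda text: Risk.SELF_HARM if "hopeless" in text else Risk.NONE
for turn, risk in monitor(["hi there", "feeling hopeless lately"], toy):
    print(turn, "->", risk)     # feeling hopeless lately -> Risk.SELF_HARM
```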

Beyond threats of harm, OpenAI is exploring ways to intervene earlier in risky behaviour, such as extreme sleep deprivation, dangerous stunts, or other activities that could cause harm. Future updates may include parental controls for teens and tools to connect users with trusted contacts or licensed therapists before situations escalate.

Why this matters

These policies highlight that ChatGPT conversations are not always fully private. While most interactions remain confidential, messages indicating imminent danger to others may be reviewed by humans and, in rare cases, lead to real-world interventions, including police involvement.

For users, this means balancing the benefit of an AI tool that can provide empathetic support with the understanding that certain conversations could trigger moderation and reporting.
