Bluesky is introducing a major moderation overhaul to clarify how rule violations are tracked and to make enforcement more visible to users. The changes include broader reporting options, a more detailed strike system with severity ratings, and clearer notices and appeals, all of which will ship with the app's latest release.
The initiative reflects the platform's rapid expansion in its first year of open sign-ups, and an attempt to head off the kind of toxicity that has plagued bigger social networks. Bluesky is not changing what it will enforce; instead, it is modernizing the tooling underneath enforcement to deliver more consistent outcomes, quicker action on high-risk reports, and plain-language explanations sent directly to affected accounts.

What Users Will See in Bluesky’s New Moderation Tools
The range of reporting options on posts grows from six to nine, giving people more specific labels for what they are flagging. New categories include Youth Harassment or Bullying, Eating Disorders, and Human Trafficking, reflecting risks that trust-and-safety teams and regulators currently treat as priorities.
Under the hood, Bluesky has centralized enforcement records so that moderators can see an account's pattern of rule-breaking over time rather than a snapshot of this month's policy violations. That history could help reduce inconsistent outcomes, a common criticism of social platforms when near-identical violating posts are penalized differently.
Smaller interface tweaks ship alongside the policy tooling as well, such as an updated control for who can reply to a post and a dark-mode app icon. But the focus is squarely on how reports move through the system and how outcomes are communicated.
A Strike System, and More Clarity on Consequences
Bluesky's newly extended strike system rates each piece of violating content by severity, from low to critical risk. The most severe violations can result in permanent bans, while lesser ones prompt temporary action, and severity is weighted upward for repeat offenses. Multiple milder violations can accumulate until they tip an account over the threshold, even if no single recent post is individually severe.
Most importantly, users facing enforcement will start receiving clearer notices that cite which guideline was violated, the severity level being assessed (which is not the same as a strike), how many total violations the platform has recorded, how close the account is to an account-level penalty where applicable, and when any temporary suspension ends.
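Bluesky hasn't published the internal mechanics, but the behavior described, per-violation severity ratings, cumulative escalation, and notices that spell out the guideline and the account's standing, maps onto a simple weighted-strike model. The TypeScript sketch below is purely illustrative: the type names, severity weights, and threshold are assumptions, not Bluesky's implementation.

```typescript
// Illustrative sketch of a severity-weighted strike ledger.
// Names, weights, and thresholds are hypothetical, not Bluesky's actual system.

type Severity = "low" | "moderate" | "high" | "critical";

interface Violation {
  guideline: string;   // e.g. "Youth Harassment or Bullying" (example label)
  severity: Severity;
  occurredAt: Date;
}

const SEVERITY_WEIGHT: Record<Severity, number> = {
  low: 1,
  moderate: 2,
  high: 4,
  critical: 8, // critical violations could trigger a permanent ban outright
};

const ACCOUNT_PENALTY_THRESHOLD = 8; // hypothetical cutoff for account-level action

interface EnforcementNotice {
  guideline: string;
  severity: Severity;
  totalViolations: number;
  pointsTowardPenalty: number;
  threshold: number;
  suspensionEnds?: Date; // present only for temporary actions
}

function buildNotice(
  history: Violation[],
  latest: Violation,
  suspensionEnds?: Date
): EnforcementNotice {
  const all = [...history, latest];
  // Repeat offenses accumulate: many mild strikes can cross the threshold
  // even if no single post is individually severe.
  const points = all.reduce((sum, v) => sum + SEVERITY_WEIGHT[v.severity], 0);
  return {
    guideline: latest.guideline,
    severity: latest.severity,
    totalViolations: all.length,
    pointsTowardPenalty: points,
    threshold: ACCOUNT_PENALTY_THRESHOLD,
    suspensionEnds,
  };
}
```

In a model like this, several low-severity strikes land an account in the same place as a single critical one, which is roughly the dynamic the new notices are meant to make legible.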
Appeals are available as well, in line with the Santa Clara Principles on transparency and remedy that many trust-and-safety experts support.
That sort of clarity could have helped avoid confusion in past flashpoints, such as when a popular account was suspended for quoting song lyrics that moderators read as a violent threat. When the rules and the reasons are clear, platforms generally see fewer prolonged moderation blowups and more focused appeals.

Why Transparency Is Important on a Federated Network
Because Bluesky is built on the AT Protocol, moderation isn't meant to be a purely top-down function; it is designed to be composable, leaving room for third-party labeling services and community norms. Stronger central enforcement tooling doesn't contradict that approach; it helps establish a baseline while the ecosystem explores layered controls.
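For a concrete sense of what composable moderation looks like, a third-party labeling service publishes label records that clients can subscribe to and interpret according to each user's preferences, layered on top of the platform's baseline enforcement. The TypeScript sketch below is a rough illustration; the field names and label values are simplified examples, not the AT Protocol's exact lexicon schema.

```typescript
// Simplified sketch of composable moderation on a federated network.
// Field names and label values are illustrative, not the protocol's exact schema.

interface ModerationLabel {
  src: string;       // identifier of the labeling service that issued the label
  uri: string;       // the labeled post or account
  val: string;       // label value, e.g. "graphic-media" (example value)
  createdAt: string; // when the label was issued
}

type LabelAction = "hide" | "warn" | "show";

// Each user (or client) decides how to treat labels from a given service,
// layering community preferences on top of the platform's baseline enforcement.
const myPreferences: Record<string, LabelAction> = {
  "graphic-media": "warn",
  "spam": "hide",
};

function actionFor(label: ModerationLabel): LabelAction {
  return myPreferences[label.val] ?? "show";
}
```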
The Trust and Safety Professional Association has said that clear notices and predictable escalation paths are important for upholding community trust. In decentralized environments, that predictability is doubly important: users can’t tune their behavior or select additional filters unless they understand the substrate.
Regulatory Pressure and the Current Safety Climate
The new categories dovetail with legal responsibilities concerning the safety of minors and illegal content. The UK’s Online Safety Act requires platforms to mitigate harms including grooming and trafficking with audited procedures, while the EU’s Digital Services Act calls for risk assessments and a faster response time to priority flags as well as transparency reporting.
In the United States, a hodgepodge of state laws is making noncompliance more financially onerous. Earlier this year, for instance, Bluesky opted to block access in Mississippi rather than divert the resources it would have taken to comply with that state's age-verification requirements and risk potential five-figure per-user fines. Precision reporting and auditable enforcement trails are exactly the sorts of controls regulators are increasingly demanding.
The need for safety tooling is evident across the industry. Research from Pew Research Center has shown that about 4 in 10 American adults have experienced some form of online harassment, with a meaningful share experiencing severe forms. For growing platforms, getting ahead of the curve is not only a matter of brand; it's risk management.
What to Watch Next as Bluesky Rolls Out Moderation
Signs that the overhaul is making a difference will include faster time-to-action on high-severity reports, consistent decisions across similar cases, and transparent appeals with clear reversal rates. Public transparency reporting, modeled on DSA-style metrics, would add further validation to the improvements.
Just as important will be how these tools interoperate with third-party labelers and community filters, the bet Bluesky made on composable moderation. If the baseline can be made sharper while the layers above it remain flexible, the network has a chance to adapt without sliding into the polarized patterns that have dogged larger incumbents.
