Four-layer moderation system for the most trusted space

Ensuring trust and safety in online environments is paramount in today’s digital age. We continually develop moderation systems to guarantee a pleasant, trouble-free user experience. Our moderation system contains four layers and is suitable for platforms in any industry.

We take the differences between industries into account and configure the moderation system so that it suits all of our partners and lets end users feel safe and free.

Four layers of moderation

Blocking lists, masking of sensitive data, and some extra tools for the spam-free area

The first moderation layer is based on lists of words, phrases, and links. It is the simplest yet effective way to prevent users from sending violent language or competitors' names. To prevent spam, you can restrict sending links to a chat. To keep private data out of the chat and protect users from fraud, we also mask phone numbers, bank account details, car registration numbers, and crypto wallet addresses.

The benefit of blocking lists is that they can be updated at any time: new items can be added even after launch. Default stop lists are prepared for different languages and are activated when we receive a GET parameter with the user interface language.
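The masking step described above can be sketched with simple pattern matching. This is a minimal illustration under assumptions, not the production rule set: the patterns, category names, and placeholder format are all hypothetical, and real rules would be locale-specific and far more thorough.

```python
import re

# Hypothetical patterns for the kinds of sensitive data the text mentions.
# Production-grade detection would need locale-aware, much stricter rules.
PATTERNS = {
    "phone": re.compile(r"\+?\d[\d\s\-()]{7,}\d"),
    "card": re.compile(r"\b(?:\d[ -]?){13,19}\b"),
    "crypto_wallet": re.compile(r"\b(?:0x[a-fA-F0-9]{40}|[13][a-km-zA-HJ-NP-Z1-9]{25,34})\b"),
}

def mask_sensitive(text: str) -> str:
    """Replace each detected item with a placeholder before the message is delivered."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} hidden]", text)
    return text

print(mask_sensitive("Call me at +1 555 123 4567"))
# → Call me at [phone hidden]
```

Because the lists are data rather than code, adding a new entry after launch is just a dictionary update, which matches the "updated at any time" property described above.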

AI-powered tools for text and pictures

The AI moderation layer checks every message sent to a chat within 10 ms. When it detects content matching criteria set in the rules, such as profanity, hate speech, sexual content, self-harm, politics, fraud, etc., it automatically takes the predetermined moderation action: the message is hidden, with a mark indicating the reason for hiding. The AI tool also flags such messages on the admin panel, so in-flight moderators can focus on the rule breakers who posted them or forward the issue to the CRM system.
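The reaction step described above can be sketched as a small rule table mapping detected categories to actions. All names here (categories, actions, message fields) are illustrative assumptions, not the system's actual API.

```python
# Hypothetical category→action table; real rules are configured per platform.
ACTIONS = {
    "profanity": "hide",
    "hate_speech": "hide",
    "self_harm": "send_to_crm",   # routed to support rather than just hidden
}

def apply_moderation(message: dict, detected: list[str]) -> dict:
    """Hide the message with a reason mark and flag it for the admin panel."""
    for category in detected:
        action = ACTIONS.get(category)
        if action == "hide":
            message["hidden"] = True
            message["hidden_reason"] = category
        if action is not None:
            # every triggered rule also surfaces the message to moderators
            message.setdefault("admin_flags", []).append(category)
    return message

msg = apply_moderation({"text": "..."}, detected=["profanity"])
print(msg["hidden"], msg["hidden_reason"])   # True profanity
```

Keeping the hide reason on the message itself is what allows the chat to display a mark explaining why a message was hidden.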

This tool can also help prevent self-harm, for example in the context of responsible betting. If the system finds messages that can be interpreted as potential self-harm content, they can be sent to the CRM system as a ticket to resolve the issue, possibly by blocking access to betting. Chats can be crucial for responsible betting because the platform sees what is really happening with users and what is on their minds. Moderation helps you keep chats clean while staying in touch with users and always knowing what they want, doubt, plan, and think about your platform.

User moderation

People are accustomed to using social media tools to keep their personal space comfortable and friendly. Users who see offensive messages tend to report such content, delete inappropriate posts and comments, and block their authors. So we let users report messages and rule breakers, and hide or block them. A user can also block another user they simply dislike: both can continue to communicate in the chat, but they won't see each other's messages.
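Per-viewer blocking like this can be sketched as a filter applied when rendering the chat for each user. The message structure and field names here are assumptions for illustration.

```python
# A minimal sketch of per-viewer blocking, assuming messages carry an author id.
def visible_messages(messages: list[dict], viewer_blocklist: set[str]) -> list[dict]:
    """Messages stay in the chat, but blocked authors are filtered out per viewer."""
    return [m for m in messages if m["author"] not in viewer_blocklist]

chat = [
    {"author": "alice", "text": "hi"},
    {"author": "bob", "text": "hello"},
]
# alice has blocked bob: bob keeps posting, alice simply never sees his messages
print(visible_messages(chat, viewer_blocklist={"bob"}))
```

Filtering at render time, rather than deleting anything, is what allows both users to keep chatting without seeing each other.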

Tools for human in-flight moderation

The first three layers cover 97% of violations in a chat, but if you need a chat that is 100% clean, we provide convenient tools for human in-flight moderation. Moderators can hide messages flagged by AI, ban users, and check how often particular users break the rules. Moderators also review user reports and pass technical issues on to the CRM system, which helps users feel safe on every level.
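Checking how often particular users break the rules can be sketched as a simple tally over AI-flagged messages. The field names are assumptions, not the admin panel's actual data model.

```python
from collections import Counter

# Hypothetical sample of messages the AI layer has already flagged.
flagged = [
    {"author": "bob", "reason": "profanity"},
    {"author": "bob", "reason": "spam"},
    {"author": "carol", "reason": "spam"},
]

# Tally violations per author so moderators can spot repeat offenders.
violations_per_user = Counter(m["author"] for m in flagged)
print(violations_per_user.most_common(1))   # [('bob', 2)]
```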

Each layer brings its own benefits and advantages

Pre-moderation is easily customized: platforms can tailor their own rules to fit community standards and cultural sensitivities, taking into account the habits and likely violations on the platform. AI tools support scalability: no matter how many users and messages the platform gathers, AI processes all content at the same speed and with the same high quality. AI also reduces routine work for human moderators by handling violations of the same type, letting them focus on more challenging tasks and nuances. Combined with in-flight moderation, AI enables real-time moderation, so users do not encounter violations in the chat: every instance of profanity is caught either by AI or by human intervention.

Coverage for all nuances: user moderation tools protect individual preferences and help users maintain a good mood. If someone cheers for a different football team, users can simply hide those fans for themselves, and everyone is content.

What about chat rules?

We have a default scope of universal chat rules for all domains and platforms; however, they can be customized for each platform along with the settings of AI tools. Regular chat rules include the following restrictions:

What is prohibited?

- Obscene and inappropriate language, socially unacceptable words or phrases, as well as abuse, swearing, and hostile remarks

Cursing and vulgar, offensive language aimed at denigrating the honour and dignity of other users are prohibited. By hostile remarks, we mean messages intended to offend people because of their race or ethnicity, nationality, religion, disability or illness, sex, gender identity, or sexual orientation.

Aggressive or degrading statements, statements referring to the inferiority of others, and harmful stereotypes (degrading comparisons) are prohibited, as is any speech that degrades human dignity through comparison, generalization, or descriptions of behaviour.

- Flooding and advertisements

- Threats

- Private data

It is forbidden to send phone numbers, addresses, IDs, any other document data, bank card details, etc.—whether users' own data or that of others.

- Fraud and begging

- Cybersecurity breaches

Any attempts to gather users’ confidential information or data or to gain unauthorized access to a service, product, or platform are forbidden.

- Nicknames/usernames containing anything restricted by these rules are forbidden. You can also activate a tool that checks users' nicknames from the main platform’s profiles.

- Political and sexualized provocations

Propaganda and any provocative content are forbidden.

Threshold for violations

All of these topics can be configured specifically for the AI tools. The default threshold for flagging a message is 75%: if the AI is at least 75% certain that a message contains one of these threats, the message is flagged and, where appropriate, hidden automatically. However, if a topic is more sensitive for your platform, you can lower the threshold for that violation so messages are flagged even at 30 or 40% certainty. So, if even a hint of sexual content or self-harm is inappropriate, set a low threshold for those topics, and the AI will catch all possible variations, even veiled messages in this area.
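A per-topic threshold configuration along these lines might look like the following sketch. The topic names and the 75% default mirror the text; the function and configuration format are hypothetical.

```python
# Hypothetical per-topic threshold configuration (confidences as fractions).
DEFAULT_THRESHOLD = 0.75

thresholds = {
    "profanity": DEFAULT_THRESHOLD,
    "self_harm": 0.30,   # lowered: even veiled hints should be flagged
    "sexual": 0.40,
}

def should_flag(topic: str, confidence: float) -> bool:
    """Flag when the AI's certainty meets the topic's threshold (default 0.75)."""
    return confidence >= thresholds.get(topic, DEFAULT_THRESHOLD)

print(should_flag("self_harm", 0.35))   # True: above the lowered 0.30 threshold
print(should_flag("profanity", 0.35))   # False: below the 0.75 default
```

Unknown topics fall back to the default threshold, so adding a new sensitive category only requires adding one entry to the configuration.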

Get in Touch

If you want to partner with us or get more details, please schedule a demo meeting by filling in the form. We'll get in touch with you shortly.