Sentinel & Prompt Injection Policy

Effective 26 May 2026 · BakersGuild Ltd

Prompt injection is one of the most significant AI safety risks in email-connected applications. An attacker can embed hidden instructions in an email subject line in the hope that an AI system will execute them. Mail-Organiser's Sentinel layer is built specifically to prevent this. This document explains what Sentinel is, how it works, and what it guarantees.

1. What is prompt injection?

A prompt injection attack occurs when malicious text is embedded in user-controlled content (like an email subject line) with the intent of manipulating an AI system into taking unintended actions. Examples include:

If an AI assistant blindly processes email content as instructions, these attacks could have serious consequences. Sentinel prevents this.

2. The Sentinel layer

Every email subject line and sender name that passes through Mail-Organiser is processed by Sentinel before it reaches any AI component. Sentinel:

3. Hard constraints — what AI can never do

These constraints are absolute and cannot be overridden by any email content.

The AI system cannot send, reply to, or forward any email. It cannot permanently delete any email. It cannot access any data outside the connected mailbox. It cannot follow any instruction found inside an email body or subject line. It cannot share access credentials or authentication tokens. All of these constraints are enforced at the architecture level, not just by the AI's own judgement.
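As an illustration only, one common way to enforce constraints like these outside the model is an allowlist dispatcher: the AI component can request an action by name, but only names on a fixed list are ever executed. The sketch below is not Mail-Organiser's actual code. The action names (`move_to_folder`, `add_label`), the `execute` function, and the `ActionNotPermitted` exception are all hypothetical and chosen for the example.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical allowlist of operations exposed to the AI component.

    Note what is absent: there is no send, reply, forward, or delete entry,
    so those requests cannot be dispatched regardless of what the model asks.
    """
    MOVE_TO_FOLDER = "move_to_folder"
    ADD_LABEL = "add_label"


class ActionNotPermitted(Exception):
    """Raised when a requested action is not on the allowlist."""


def execute(action_name: str, message_id: str, target: str) -> str:
    """Run an AI-requested action only if its name is on the allowlist.

    Returns a plain description of the action instead of touching a real
    mailbox, since this is only a sketch.
    """
    try:
        action = Action(action_name)
    except ValueError:
        # Unknown or forbidden names are rejected before any mailbox call.
        raise ActionNotPermitted(action_name) from None
    return f"{action.value}:{message_id}:{target}"
```

With this shape, a request such as `execute("forward", "msg-1", "someone@example.com")` raises `ActionNotPermitted` even if injected text has persuaded the model to ask for it, because the check does not depend on the model's judgement.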

4. Architecture-level enforcement

Sentinel's protections do not depend on the AI model refusing malicious instructions (a refusal could be bypassed by a sufficiently clever attack). Instead:

5. What to do if you receive a suspicious email

If Sentinel flags an email as a potential injection attempt, it will appear in your Suspicious/Scam folder with a yellow warning badge. You can:

6. False positives

Sentinel may occasionally flag legitimate emails that happen to contain language matching injection patterns (e.g. a subject like "You are now a premium member"). You can always override this classification from the add-in. If you see false positives frequently, please let us know; we tune the detection rules regularly.
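To show why a false positive like this can happen, here is a minimal sketch of pattern-based flagging with a per-message user override. It is an assumption-laden illustration, not Sentinel's real rule set: the two regular expressions, the `Classifier` class, and its `mark_safe` and `is_flagged` methods are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical detection rules; the real rules are not published here.
INJECTION_PATTERNS = [
    re.compile(r"\byou are now\b", re.IGNORECASE),
    re.compile(r"\bignore (all )?(previous|prior) instructions\b", re.IGNORECASE),
]


@dataclass
class Classifier:
    """Flags subjects matching a pattern unless the user has overridden it."""
    overrides: set = field(default_factory=set)

    def mark_safe(self, message_id: str) -> None:
        # Record the user's decision that this message is legitimate.
        self.overrides.add(message_id)

    def is_flagged(self, message_id: str, subject: str) -> bool:
        # A user override always wins over pattern matching.
        if message_id in self.overrides:
            return False
        return any(p.search(subject) for p in INJECTION_PATTERNS)
```

In this sketch, the subject "You are now a premium member" is flagged because it contains the phrase "you are now", and calling `mark_safe` for that message clears the flag without changing the rules for other emails.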

Security or Sentinel questions?

Email: [email protected]

To report a novel injection pattern: [email protected]