Skip to content

Markdown Sanitization - XSS#1046

Open
AndyVale wants to merge 6 commits intoalphaonelabs:mainfrom
AndyVale:fix/markdown-sanitization
Open

Markdown Sanitization - XSS#1046
AndyVale wants to merge 6 commits intoalphaonelabs:mainfrom
AndyVale:fix/markdown-sanitization

Conversation

@AndyVale
Copy link

@AndyVale AndyVale commented Mar 25, 2026

Markdown Sanitization - XSS

Abstract

This PR remediates Stored and Reflected XSS vulnerabilities in the markdown rendering pipeline by centralizing sanitization using bleach.

Current Problem

The application was vulnerable to script injection in two main areas:

  • Stored XSS: Success Stories and Course content were rendered using |safe without server-side sanitization, allowing malicious tags to execute for all users.
  • Reflected/Self-XSS: The EasyMDE editor used an unsanitized client-side preview, which could be exploited via social engineering to execute scripts in a teacher's session.
XSS.mp4

Fixes

A defense-in-depth approach has been implemented:

1. Centralized Sanitization

  • Created a markdownify_sanitized function in web/templatetags/markdown_filters.py using bleach to strip malicious tags and attributes while preserving markdown formatting.
  • Updated settings.py to use this function globally for all django-markdownx AJAX previews.

2. Template Security

  • Replaced dangerous |safe instances with a |markdown filter that converts and sanitizes content on the server.

3. Hardened Editor Previews

  • Integrated static/js/markdown_preview.js across (I think) all editors (Course, Teach, and Success Stories) to enforce server-side sanitized previews, eliminating the reflected XSS vector.

Verification

Confirmed that payloads such as <script> tags and onerror attributes are properly stripped while preserving valid markdown content.

maybeNOT_XSS.mp4

Markdown Sanitization XSS Fixes

This PR centralizes and enforces server-side sanitization for markdown rendering to remediate stored and reflected XSS vectors.

Purpose

Eliminate stored XSS (from templates that used raw HTML via |safe) and reflected/self-XSS in EasyMDE previews by routing all markdown rendering through a bleach-based sanitizer on the server.

Key Changes

  • Core sanitization:

    • Added markdownify_sanitized(content) in web/templatetags/markdown_filters.py which runs markdownify(content) then bleach.clean(..., strip=True) with explicit allowlists:
      • SAFE_TAGS: headers, p, br, hr, strong/em/b/i/u/del, ul/ol/li, a, pre/code, blockquote, table elements, img
      • SAFE_ATTRIBUTES: a[href,title], img[src,alt,title], code[class], td/th[align]
      • ALLOWED_PROTOCOLS: http, https, mailto
    • Updated the markdown template filter to return mark_safe(markdownify_sanitized(text)).
  • Configuration:

    • Set MARKDOWNX_MARKDOWNIFY_FUNCTION = "web.templatetags.markdown_filters.markdownify_sanitized" in web/settings.py so django-markdownx AJAX previews and the markdownify endpoint use the sanitized function.
  • Client-side preview hardening:

    • Added web/static/js/markdown_preview.js providing getCookie(name) and sanitizedPreviewRender(plainText, previewUrl). sanitizedPreviewRender sends the markdown to the server (synchronous XHR, sets X-CSRFToken) and returns the server's sanitized HTML; on non-200 responses it returns a small error HTML message.
    • Updated EasyMDE init across templates (courses create/update, success stories create, teach) to include previewRender: plainText => sanitizedPreviewRender(plainText, '{% url "markdownx_markdownify" %}') and to include the new static script. The inline synchronous XHR previously present in some templates was removed in favor of the shared helper.
  • Template updates:

    • success_stories/detail.html now loads the markdown_filters tag library and renders content with {{ success_story.content|markdown }} instead of trusting pre-rendered HTML (removes use of |safe).
    • Editor pages include the new preview script and route preview rendering through the server-side sanitizer.

Impact

  • Markdown rendered in Success Stories, Courses, and Teach editor previews is now sanitized server-side (bleach), stripping <script> tags and dangerous attributes (e.g., onerror) while preserving allowed markdown/HTML.
  • The EasyMDE reflected XSS vector is mitigated by using server-side sanitized previews; note EasyMDE preview uses synchronous XHR and depends on server availability and correct CSRF handling.
  • Behavior change is minimal for valid content; preview rendering now depends on the backend endpoint and will surface a small error message if the preview request fails.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 25, 2026

Warning

Rate limit exceeded

@AndyVale has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 12 minutes and 14 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: alphaonelabs/coderabbit/.coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5a559013-7502-4f13-aa6c-be681bd5c874

📥 Commits

Reviewing files that changed from the base of the PR and between 3872ac0 and ef871ad.

📒 Files selected for processing (1)
  • web/templatetags/markdown_filters.py

Walkthrough

Adds server-side sanitized markdown rendering and a client-side preview that posts editor content to the markdownify endpoint. Introduces a bleach-based sanitizer in a template tag, registers the sanitizer in settings, and updates EasyMDE editor instances in several templates to use the server-backed preview renderer.

Changes

Cohort / File(s) Summary
Backend config & sanitizer
web/settings.py, web/templatetags/markdown_filters.py
Added MARKDOWNX_MARKDOWNIFY_FUNCTION and implemented markdownify_sanitized(); updated the markdown template filter to return bleach-sanitized HTML.
Client preview utility
web/static/js/markdown_preview.js
New JS helpers getCookie() and sanitizedPreviewRender() that POST editor content to the markdownify endpoint with CSRF and return server HTML or an error snippet.
Editor templates
web/templates/courses/create.html, web/templates/courses/update.html, web/templates/success_stories/create.html, web/templates/teach.html
Included markdown_preview.js and set EasyMDE previewRender to use sanitizedPreviewRender() pointing at {% url "markdownx_markdownify" %}.
Content display template
web/templates/success_stories/detail.html
Loaded markdown_filters and switched success_story.content rendering to the markdown filter (now sanitized).

Sequence Diagram

sequenceDiagram
    participant User as User
    participant Browser as Browser
    participant EasyMDE as EasyMDE Editor
    participant Server as Django Server
    participant Bleach as Bleach Sanitizer

    User->>Browser: Types/edits markdown
    Browser->>EasyMDE: Editor content updated
    EasyMDE->>Browser: Trigger previewRender(plainText)
    Browser->>Browser: getCookie('csrftoken')
    Browser->>Server: POST /markdownx/markdownify/ (content + CSRF)
    Server->>Server: markdownify(content) -> raw HTML
    Server->>Bleach: clean(raw HTML) using allowlists
    Bleach->>Server: sanitized HTML
    Server->>Browser: 200 OK + sanitized HTML
    Browser->>EasyMDE: Render preview pane with sanitized HTML
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Markdown Sanitization - XSS' directly reflects the main security objective of the PR—remediating XSS vulnerabilities through centralized markdown sanitization.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
web/templates/courses/update.html (1)

88-102: ⚠️ Potential issue | 🟡 Minor

Remove this dead code—getCookie is no longer used.

After refactoring to use sanitizedPreviewRender from markdown_preview.js, this local getCookie function is no longer called anywhere in this template. The markdown_preview.js file includes its own getCookie implementation.

Leaving dead code can cause confusion for future maintainers and slightly increases the page size.

🧹 Suggested removal of dead code
          });

-          // Helper function to get CSRF token from cookies
-          function getCookie(name) {
-              let cookieValue = null;
-              if (document.cookie && document.cookie !== '') {
-                  const cookies = document.cookie.split(';');
-                  for (let i = 0; i < cookies.length; i++) {
-                      const cookie = cookies[i].trim();
-                      if (cookie.substring(0, name.length + 1) === (name + '=')) {
-                          cookieValue = decodeURIComponent(cookie.substring(name.length + 1));
-                          break;
-                      }
-                  }
-              }
-              return cookieValue;
-          }
-
           // Fix form submission for hidden required textareas
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/templates/courses/update.html` around lines 88 - 102, Remove the unused
local helper function getCookie from the template: locate the function
definition named getCookie in the update.html template and delete it; ensure no
remaining references to getCookie exist in update.html (the
sanitizedPreviewRender logic now comes from markdown_preview.js which provides
its own getCookie), so simply remove the dead getCookie function to avoid
duplication and reduce page size.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@web/templates/success_stories/detail.html`:
- Around line 57-59: Move the template tag `{% load markdown_filters %}` from
its current location near the content block to the top of the template
(immediately after the `{% extends ... %}` line) and remove the duplicate `{%
load markdown_filters %}` at line 57 so the filter usage `{{
success_story.content|markdown }}` continues to work but the `{% load %}`
follows Django convention and improves readability.

In `@web/templatetags/markdown_filters.py`:
- Around line 28-47: Update markdownify_sanitized to include type hints and
eliminate the unnecessary intermediate variable: annotate the parameter as
Optional[str] (or str) and the return type as str, and replace the two-step
assignment with a single return that calls bleach.clean(markdownify(content),
tags=SAFE_TAGS, attributes=SAFE_ATTRIBUTES, strip=True). Ensure you import
typing.Optional if using Optional and keep references to markdownify,
bleach.clean, SAFE_TAGS, and SAFE_ATTRIBUTES unchanged.
- Around line 20-26: The SAFE_ATTRIBUTES list is fine but you should add an
explicit protocol allowlist for defense-in-depth: define a small set like
ALLOWED_PROTOCOLS = {"http", "https", "mailto"} and pass it into the
bleach.clean (or bleach.Cleaner) invocation that uses SAFE_ATTRIBUTES so
href/src are only allowed those protocols; reference SAFE_ATTRIBUTES and the
bleach.clean / bleach.Cleaner call in your patch so protocol filtering is
explicit rather than relying on bleach defaults.

---

Outside diff comments:
In `@web/templates/courses/update.html`:
- Around line 88-102: Remove the unused local helper function getCookie from the
template: locate the function definition named getCookie in the update.html
template and delete it; ensure no remaining references to getCookie exist in
update.html (the sanitizedPreviewRender logic now comes from markdown_preview.js
which provides its own getCookie), so simply remove the dead getCookie function
to avoid duplication and reduce page size.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: alphaonelabs/coderabbit/.coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7143131e-c9af-42bc-9e91-452175c1c155

📥 Commits

Reviewing files that changed from the base of the PR and between f74a131 and e7da5bc.

📒 Files selected for processing (8)
  • web/settings.py
  • web/static/js/markdown_preview.js
  • web/templates/courses/create.html
  • web/templates/courses/update.html
  • web/templates/success_stories/create.html
  • web/templates/success_stories/detail.html
  • web/templates/teach.html
  • web/templatetags/markdown_filters.py

@github-actions
Copy link
Contributor

👀 Peer Review Required

Hi @AndyVale! This pull request does not yet have a peer review.

Before this PR can be merged, please request a review from one of your peers:

  • Go to the PR page and click "Reviewers" on the right sidebar.
  • Select a team member or contributor to review your changes.
  • Once they approve, this reminder will be automatically removed.

Thank you for contributing! 🎉

@github-actions github-actions bot added the files-changed: 8 PR changes 8 files label Mar 25, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@web/templatetags/markdown_filters.py`:
- Around line 48-52: Add type hints to the markdown filter: update the function
signature of markdown to accept text: str and return a django SafeString (or
typing.Any/str if you prefer) and eliminate the intermediate variable by
returning mark_safe(markdownify_sanitized(text)) directly; reference the
markdown function, markdownify_sanitized call, and mark_safe to locate and
update the code accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: alphaonelabs/coderabbit/.coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 161aeff7-7524-40b0-ac1f-bdc20f261e20

📥 Commits

Reviewing files that changed from the base of the PR and between e7da5bc and 286f914.

📒 Files selected for processing (3)
  • web/templates/courses/update.html
  • web/templates/success_stories/detail.html
  • web/templatetags/markdown_filters.py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
web/templatetags/markdown_filters.py (1)

49-51: ⚠️ Potential issue | 🟡 Minor

Return a concrete safe-string type instead of Any.

Line 49 still advertises Any, which leaves the ANN401 warning unresolved and hides the actual contract here. Since this filter returns mark_safe(...), annotate it as SafeString and consider text: Optional[str] to match markdownify_sanitized.

✨ Suggested typing cleanup
-import bleach
-from typing import Any, Optional
+from typing import Optional
+
+import bleach
 from django import template
-from django.utils.safestring import mark_safe
+from django.utils.safestring import SafeString, mark_safe
@@
 `@register.filter`
-def markdown(text: str) -> Any:
+def markdown(text: Optional[str]) -> SafeString:
     """Convert markdown text to sanitized HTML."""
     return mark_safe(markdownify_sanitized(text))

As per coding guidelines "Use type hints in Python where appropriate".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/templatetags/markdown_filters.py` around lines 49 - 51, The function
signature for markdown should use a concrete safe-string return type and accept
optional input: change the annotation of def markdown(text: str) -> Any to def
markdown(text: Optional[str]) -> SafeString, importing Optional from typing and
SafeString from django.utils.safestring, and keep the implementation returning
mark_safe(markdownify_sanitized(text)) to match markdownify_sanitized's nullable
input and to resolve the ANN401 warning while making the contract explicit.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@web/templatetags/markdown_filters.py`:
- Around line 31-45: Add regression tests covering the sanitizer boundary around
the markdownify_sanitized function: create tests that call markdownify_sanitized
with payloads that must be stripped (e.g., "<script>" tags, attributes like
onerror/onload, and javascript: links) and assert those are removed, and also
tests that ensure legitimate Markdown output is preserved (links, tables, fenced
code blocks, images) to prevent regressions when
SAFE_TAGS/SAFE_ATTRIBUTES/ALLOWED_PROTOCOLS change; include one test for the
markdownx preview flow if applicable and name tests to map to
markdownify_sanitized so future reviewers can find coverage easily.

---

Duplicate comments:
In `@web/templatetags/markdown_filters.py`:
- Around line 49-51: The function signature for markdown should use a concrete
safe-string return type and accept optional input: change the annotation of def
markdown(text: str) -> Any to def markdown(text: Optional[str]) -> SafeString,
importing Optional from typing and SafeString from django.utils.safestring, and
keep the implementation returning mark_safe(markdownify_sanitized(text)) to
match markdownify_sanitized's nullable input and to resolve the ANN401 warning
while making the contract explicit.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: alphaonelabs/coderabbit/.coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 232edf39-24e7-4c9e-a42e-8ec9331247e3

📥 Commits

Reviewing files that changed from the base of the PR and between 286f914 and 3872ac0.

📒 Files selected for processing (1)
  • web/templatetags/markdown_filters.py

Comment on lines +31 to +45
def markdownify_sanitized(content: Optional[str]) -> str:
"""
Custom markdownify function that sanitizes the output HTML using bleach.
This is used both for template filters and for the markdownx AJAX preview.
"""
if not content:
return ""

return bleach.clean(
markdownify(content),
tags=SAFE_TAGS,
attributes=SAFE_ATTRIBUTES,
protocols=ALLOWED_PROTOCOLS,
strip=True
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Add regression tests for the sanitizer boundary.

This function is now the trust boundary for both template rendering and the Markdownx preview, but I don't see automated coverage in this change for the deny/allow cases. Please lock in at least <script>, event-handler attributes like onerror, javascript: links, and a few preserved markdown cases such as links, tables, fenced code, and images so future allowlist edits don't quietly reopen XSS or break authoring. As per coding guidelines "Verify tests cover the key logic paths."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/templatetags/markdown_filters.py` around lines 31 - 45, Add regression
tests covering the sanitizer boundary around the markdownify_sanitized function:
create tests that call markdownify_sanitized with payloads that must be stripped
(e.g., "<script>" tags, attributes like onerror/onload, and javascript: links)
and assert those are removed, and also tests that ensure legitimate Markdown
output is preserved (links, tables, fenced code blocks, images) to prevent
regressions when SAFE_TAGS/SAFE_ATTRIBUTES/ALLOWED_PROTOCOLS change; include one
test for the markdownx preview flow if applicable and name tests to map to
markdownify_sanitized so future reviewers can find coverage easily.

@github-actions
Copy link
Contributor

💬 Unresolved Review Conversations

Hi @AndyVale! 👋

This pull request currently has 1 unresolved review conversation.

Please address all review feedback and push a new commit to resolve them before this PR can be merged.

Steps to resolve:

  1. Review each comment thread in the "Files changed" tab.
  2. Make the necessary changes to your code.
  3. Reply to each conversation to explain your changes or ask for clarification.
  4. Click "Resolve conversation" once the feedback has been addressed.
  5. Push a new commit with your changes.

Once all conversations are resolved, this notice will be removed automatically. Thank you! 🙏

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR has 1 unresolved review conversation. Please resolve them before this PR can be merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

files-changed: 8 PR changes 8 files

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant