A proof-of-concept Unicode obfuscation tool for .docx and .pdf documents. Every glyph in the body is recoded so text extractors see only Private Use Area characters; the rendered page is unchanged.
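The recoding idea can be illustrated with a minimal sketch. This is not the tool's actual implementation: the function names (`build_pua_map`, `obfuscate`, `deobfuscate`), the random assignment, and the use of the BMP Private Use Area range U+E000..U+F8FF are assumptions made for illustration. It covers only the text side; keeping the rendered page unchanged would also require a font that maps each Private Use Area code point back to the original glyph, which is not shown here.

```python
import random

# Basic Multilingual Plane Private Use Area (inclusive bounds).
PUA_START = 0xE000
PUA_END = 0xF8FF


def build_pua_map(text, seed=None):
    """Assign each distinct character in `text` a unique PUA code point.

    Hypothetical helper: the real tool may choose code points differently.
    """
    chars = sorted(set(text))
    capacity = PUA_END - PUA_START + 1
    if len(chars) > capacity:
        raise ValueError("more distinct characters than PUA code points")
    rng = random.Random(seed)
    # sample() draws without replacement, so the mapping is one-to-one.
    targets = rng.sample(range(PUA_START, PUA_END + 1), len(chars))
    return {ch: chr(cp) for ch, cp in zip(chars, targets)}


def obfuscate(text, mapping):
    """Replace every mapped character with its PUA stand-in."""
    return "".join(mapping.get(ch, ch) for ch in text)


def deobfuscate(text, mapping):
    """Invert obfuscate() using the same mapping."""
    reverse = {pua: ch for ch, pua in mapping.items()}
    return "".join(reverse.get(ch, ch) for ch in text)


if __name__ == "__main__":
    sample = "Hello, world"
    mapping = build_pua_map(sample, seed=1)
    encoded = obfuscate(sample, mapping)
    # A naive extractor would see only these code points.
    print([f"U+{ord(ch):04X}" for ch in encoded])
    print(deobfuscate(encoded, mapping))
```

Because the mapping is a bijection over the characters in the input, anyone holding the mapping can recover the text, while a plain text extractor sees only unassigned code points.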
Install the dependencies:

    pip install -r requirements.txt

Convert a document:

    python noroboto.py input.[docx|pdf] output.[docx|pdf]

Run the tests:

    python -m unittest discover tests

The corpus test runs the CLI over every .docx and .pdf it finds under docs/.
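A corpus test of the kind described above might look roughly like the following sketch. The helper `find_corpus`, the class name `CorpusTest`, and the pass criteria (exit code 0 and an output file on disk) are assumptions for illustration; the project's real test may check different things. It also assumes the test is run from the repository root, where `noroboto.py` and `docs/` live.

```python
import subprocess
import sys
import tempfile
import unittest
from pathlib import Path

SUFFIXES = {".docx", ".pdf"}


def find_corpus(root):
    """Return every .docx and .pdf file under `root`, sorted for stable order."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in SUFFIXES
    )


class CorpusTest(unittest.TestCase):
    def test_cli_handles_every_document(self):
        with tempfile.TemporaryDirectory() as tmp:
            for index, src in enumerate(find_corpus("docs")):
                # subTest reports each failing document separately.
                with self.subTest(document=str(src)):
                    # Index prefix avoids collisions between same-named files.
                    out = Path(tmp) / f"{index}_{src.name}"
                    result = subprocess.run(
                        [sys.executable, "noroboto.py", str(src), str(out)],
                        capture_output=True,
                        text=True,
                    )
                    # Assumed contract: success means exit code 0 and an output file.
                    self.assertEqual(result.returncode, 0, result.stderr)
                    self.assertTrue(out.exists())
```

Using `subTest` keeps one bad document from hiding failures in the rest of the corpus.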
Start the web app:

    python app.py

Then open http://127.0.0.1:5000. The upload accepts both .docx and .pdf.