Sanitext - Text sanitizer tool
Yesterday on HN, I stumbled upon a submission discussing how some LLMs insert fingerprints into their output. The idea is simple and has been discussed before: the fingerprints are left through Unicode manipulation, so AI-generated text can contain hidden characters that are not visible but can indicate its origin. The author then shared his Python CLI tool, which aims to give users clean and reliable text by detecting, normalizing, and removing suspicious Unicode characters. I found the idea of avoiding fingerprinting in generated text interesting, but I also had another reason to care: sometimes when I ask LLMs to fix typos or write function comments, they use emojis and Unicode characters, and I actually find that annoying. I don’t like using emojis in general, even in private chats, and I certainly would not want them while coding or writing professionally. I also sometimes copy and paste old, badly formatted text from different sources.
So this tool can prove useful in those cases, beyond the LLM fingerprinting issue. But I thought that shipping it as a CLI limits its accessibility, and that being able to use it from the browser would be much more convenient. That’s why I took inspiration from the tool’s code and wrote my own simple client-side Sanitext tool. It is a simple web-based text sanitizer that removes unwanted characters from text. It is written in JavaScript and can be used offline. It is also available as a web page hosted on my website: sanitext. No backend, no workers or edge functions, no tracking, just plain old HTML/CSS/JS running in your browser.
This way I can use it in different workflows and contexts, in particular on my mobile phone. Of course, the theme/style is ugly, as you can see. I’m not a designer and I’m not good at frontend work. I actually used Claude to come up with the bad style, so I can blame Claude for that, even though I know I just suck at design. But I’m happy with the functionality: I can use it to clean text from any source and paste it into my notes or code, and that’s all I need.
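To give an idea of what this kind of sanitizer does, here is a minimal sketch in JavaScript. It is not the actual Sanitext source: the character tables and the names `findInvisible` and `sanitize` are assumptions made for illustration, and a real tool would likely cover many more code points.

```javascript
// Illustrative tables only; the real tool's lists are not shown in this post.
const INVISIBLE = new Map([
  ["\u00AD", "SOFT HYPHEN"],
  ["\u200B", "ZERO WIDTH SPACE"],
  ["\u200C", "ZERO WIDTH NON-JOINER"],
  ["\u200D", "ZERO WIDTH JOINER"],
  ["\u2060", "WORD JOINER"],
  ["\uFEFF", "ZERO WIDTH NO-BREAK SPACE"],
]);

const REPLACEMENTS = new Map([
  ["\u2018", "'"], ["\u2019", "'"],   // curly single quotes
  ["\u201C", '"'], ["\u201D", '"'],   // curly double quotes
  ["\u2013", "-"], ["\u2014", "-"],   // en and em dashes
  ["\u2026", "..."],                  // horizontal ellipsis
  ["\u00A0", " "], ["\u202F", " "],   // no-break spaces
]);

// Code points that render as emoji by default (requires the `u` flag).
const EMOJI = /\p{Emoji_Presentation}/gu;

// Report each invisible character together with its UTF-16 offset.
function findInvisible(text) {
  const hits = [];
  let index = 0;
  for (const ch of text) {
    if (INVISIBLE.has(ch)) hits.push({ index, name: INVISIBLE.get(ch) });
    index += ch.length;
  }
  return hits;
}

// Drop invisible characters, map typographic ones to ASCII, strip emoji.
function sanitize(text) {
  let out = "";
  for (const ch of text) {
    if (INVISIBLE.has(ch)) continue;
    out += REPLACEMENTS.get(ch) ?? ch;
  }
  return out.replace(EMOJI, "");
}

const sample = "\u201CHi\u201D\u200B there\u2026🚀";
console.log(findInvisible(sample)); // one hit: ZERO WIDTH SPACE at index 4
console.log(sanitize(sample));      // "Hi" there...
```

In a page like this one, `sanitize` would presumably be called on the contents of a text box when a button is pressed. Note that the zero-width joiner and non-joiner are legitimate in some scripts and in emoji sequences, so removing them blindly is a trade-off rather than a safe default.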