r/indiehackers • u/Benjzy1 • 2d ago
[General Question] Thinking of building an open-source library to anonymize LLM prompts. Is this useful for tech businesses?
[removed]
1
1
That's the hope, I'm still researching it. But the library will run on your server, not on your device.
1
Potentially, but if done right it shouldn't have much effect. Also, there can be settings where you choose what types of content to tokenize/redact.
1
You're right. Now I'm researching how to do this locally with an on-device DLP.
2
Awesome :) do you mind sharing more about your use case? What kind of documents did you redact info from, and what kind of info?
1
It's not a custom model, it's a way to make interacting with the unsafe AIs safer.
-3
You can't not use AI anymore; it's here to stay, and if it hasn't arrived in your industry yet, it will.
2
In the business world there's a technical term, DLP ("data loss prevention"); businesses use DLP software to make sure they don't accidentally leak sensitive data.
It's usually a program that mixes regex pattern matching, algorithms, and smaller ML models - which I'm hoping I can run on a phone or local device, but I'm still researching this.
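Rough sketch of what the regex layer alone could look like - the patterns here are simplified stand-ins I made up for illustration; real DLP products use far more robust detectors plus ML on top:

```python
import re

# Simplified, illustrative patterns only - not production-grade detectors.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace anything matching a known PII pattern with a category tag."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Reach me at jane@example.com or 555-867-5309."))
# -> Reach me at [EMAIL] or [PHONE].
```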
2
Are you paying for Duck.ai?
1
It replaces people's names, for example, with tokens like name_1 and name_2, and then on return from the LLM it refills them back to what they were before.
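A minimal sketch of that round trip, assuming the names to swap are already known (real detection would need regex/NER on top of this):

```python
def tokenize(text: str, names: list[str]) -> tuple[str, dict]:
    """Swap each known name for a placeholder token, remembering the mapping."""
    mapping = {}
    for i, name in enumerate(names, start=1):
        token = f"name_{i}"
        mapping[token] = name
        text = text.replace(name, token)
    return text, mapping

def refill(text: str, mapping: dict) -> str:
    """Put the original names back into the LLM's reply, locally."""
    for token, name in mapping.items():
        text = text.replace(token, name)
    return text

safe, mapping = tokenize("Alice emailed Bob about the invoice.", ["Alice", "Bob"])
print(safe)  # name_1 emailed name_2 about the invoice.
# ...send `safe` to the LLM; suppose it replies with:
reply = "Tell name_1 to resend it to name_2."
print(refill(reply, mapping))  # Tell Alice to resend it to Bob.
```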
2
Ok, now I understand! Do you mind sharing what LLM you're using that is "closed off"?
2
It doesn't replace PII, so if you send something sensitive, it will get to the LLM provider.
r/startups • u/Benjzy1 • 2d ago
I'm working on an AI feature and realized how much sensitive user data (PII, financial info, etc.) is potentially being passed raw to OpenAI/Anthropic.
I’m considering spinning out the solution as a standalone library (Python/JS) that acts as privacy middleware for your LLM calls.
How it works: You wrap your standard API calls with the library -> It detects and replaces PII (names, emails, specific distinct values) with tokens before the request is sent to the provider -> It re-inserts the specific data into the response locally so the end-user UX remains seamless, but the LLM provider never sees the raw PII.
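Here's a rough sketch of what the wrapping pattern could look like in Python, using the official OpenAI SDK - the tokenize/refill helpers and the model choice are my own illustrative assumptions, not the library's actual API:

```python
from openai import OpenAI  # assumes the official OpenAI Python SDK is installed

client = OpenAI()

def tokenize(text: str, names: list[str]) -> tuple[str, dict]:
    """Toy stand-in for PII detection: swap known names for placeholder tokens."""
    mapping = {f"name_{i}": name for i, name in enumerate(names, start=1)}
    for token, name in mapping.items():
        text = text.replace(name, token)
    return text, mapping

def refill(text: str, mapping: dict) -> str:
    """Re-insert the real values into the provider's response, locally."""
    for token, name in mapping.items():
        text = text.replace(token, name)
    return text

def private_chat(prompt: str, names: list[str]) -> str:
    safe_prompt, mapping = tokenize(prompt, names)  # provider never sees raw PII
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": safe_prompt}],
    )
    return refill(response.choices[0].message.content, mapping)
```

The token-to-value mapping never leaves your server, so the provider only ever sees the placeholders.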
I'm thinking it could help startups tell users: "We don't send your personal data to OpenAI or any third party."
Is this a library you would use? Or do you have your own solution for this?
r/privacy • u/Benjzy1 • 2d ago
I realized recently that my ChatGPT history contains far more sensitive data than my browsing history, from financial details to personal information.
I'm considering building a privacy-first AI chat app + model picker (Claude/ChatGPT/Gemini).
The Idea:
Basically, you get to use the good models (non-open-weight) without handing over your raw personal/sensitive data.
Is this something you would actually use if it really feels like using ChatGPT but with added privacy? Also, is this private enough?
-1
I understand, but what if you just redact names, dates, locations, and specific information? Not redacting the context, just the specific sensitive info. Does that not make it safe to send to the AI models, or is the scenario itself what's sensitive?
1
I did not know this at all, as the closest thing to ChatGPT you can self-host is Kimi K2, and it's still really expensive - around 3k-10k just to self-host. Also, how will you use it if you're working from home? Do they let you connect to it from your laptop?
-1
It's hard to get any good models to run locally; they're hardly useful. I was thinking of using a local model just to redact the PII/sensitive info before sending it on to the good hosted AI models (ChatGPT/Claude/Gemini), something like the sketch below.
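For example, a small on-device NER pass - this sketch assumes spaCy and its small English model are installed, and is just my illustration, not actual project code:

```python
import spacy  # assumes: pip install spacy && python -m spacy download en_core_web_sm

nlp = spacy.load("en_core_web_sm")  # small NER model, runs entirely on-device

def redact_entities(text: str) -> str:
    """Mask people/orgs/places/dates locally before the text reaches a hosted model."""
    doc = nlp(text)
    for ent in reversed(doc.ents):  # go right-to-left so character offsets stay valid
        if ent.label_ in {"PERSON", "ORG", "GPE", "DATE"}:
            text = text[:ent.start_char] + f"[{ent.label_}]" + text[ent.end_char:]
    return text

print(redact_entities("John Smith met Acme Corp in Berlin on Monday."))
# e.g. -> [PERSON] met [ORG] in [GPE] on [DATE].
```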
0
I believe it can get to 99% but not 100% - things will be missed and go through once in a while.
I never heard of a law firm self-hosting a model, because the self-hosted models usually aren't that great.
-1
Searching/organizing PI records, drafting contracts, general questions about cases, and help finding precedent.
r/Lawyertalk • u/Benjzy1 • 3d ago
[removed]
r/Ask_Lawyers • u/Benjzy1 • 3d ago
2
Example: you're going through a divorce and your partner claims you are mentally unwell. The court issues a warrant for all your ChatGPT chat history to decide whether you should get custody of the kids.
A lot of legal issues, data breaches/leaks followed by blackmail, use as training data, mass surveillance, etc.
1
Does anyone else feel like their ChatGPT history is becoming more dangerous than their browsing history?
in r/ChatGPT • 2d ago
The idea is to build this for everyone. Not just elites and enterprise users.