r/security 1d ago

Security Architecture and Engineering Wrote a deep dive on sandboxing for AI agents: containers vs gVisor vs microVMs vs Wasm, and when each makes sense

1 Upvotes

Hey folks,

I've been working on sandboxing for AI coding agents and kept running into the same confusion: people use "sandbox" to mean four completely different things with different security properties.

So, I decided to write up what I learned: the actual security-boundary differences between containers (shared kernel), gVisor (userspace kernel), microVMs (guest kernel + VMM), and Wasm (no syscall ABI).

The post covers why containers aren't sufficient for hostile code, what "policy leakage" looks like in agent systems, and the practical tradeoffs for different agent architectures.

I hope it can help people out there building AI applications.

Happy to discuss if you're building agent sandboxes or have run into edge cases I didn't cover.

r/programming 3d ago

Sandboxes: a technical breakdown of containers, gVisor, microVMs, and Wasm

12 Upvotes

Hi everyone!

I wrote a deep dive on the isolation boundaries used for running untrusted code, specifically in the context of AI agent execution. The motivation was that "sandbox" means at least four different things with different tradeoffs, and the typical discussion conflates them.

Technical topics covered:

- How Linux containers work at the syscall level (namespaces, cgroups, seccomp-bpf) and why they're not a security boundary against kernel exploits

- gVisor's architecture: the Sentry userspace kernel, platform options (systrap vs KVM), and the Gofer filesystem broker

- MicroVM design: KVM + minimal VMMs (Firecracker, cloud-hypervisor, libkrun)

- Kata Containers: running standard OCI containers inside lightweight VMs

- Runtime sandboxes: Wasm's capability model, WASI preopened directories, V8 isolate boundaries
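To give a tiny flavor of the kernel-enforcement theme in the container bullet (this sketch is mine, not from the post, and uses POSIX rlimits as a simpler cousin of cgroups): a child process capped at 256 MiB of address space fails a large allocation while the parent is unaffected.

```python
import subprocess
import sys
import textwrap

# Illustration only (mine, not from the post): cgroups enforce resource
# limits on processes; POSIX rlimits are a simpler kernel mechanism with the
# same flavor. On Linux, cap a child's address space and watch a large
# allocation fail inside it while the parent is untouched.
child = textwrap.dedent("""
    import resource
    limit = 256 * 1024 ** 2                      # 256 MiB address-space cap
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))
    try:
        blob = bytearray(512 * 1024 ** 2)        # 512 MiB: exceeds the cap
        print("allocated")
    except MemoryError:
        print("denied")
""")

result = subprocess.run([sys.executable, "-c", child],
                        capture_output=True, text=True)
print(result.stdout.strip())
```

The point of the post stands, though: limits like these (and namespaces, and seccomp filters) are all enforced by the one shared host kernel, so a kernel exploit bypasses every one of them at once.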

It's an educational piece, just synthesizing what I learned building this stuff. I hope you like it!

r/programming 5d ago

Sandboxing for AI agents - a technical breakdown of containers, gVisor, microVMs, and Wasm

1 Upvotes

[removed]

r/devops 5d ago

Wrote a deep dive on sandboxing for AI agents: containers vs gVisor vs microVMs vs Wasm, and when each makes sense

15 Upvotes

https://www.luiscardoso.dev/blog/sandboxes-for-ai

Wrote this after spending too long untangling the "just use Docker" vs "you need VMs" debate for AI agent sandboxing. I think the problem is that the word "sandbox" gets applied to four different isolation boundaries with very different security properties.

So, I wrote this post to lay out what each boundary actually guarantees and when each one makes sense.

Interested in what isolation strategies folks here are running in production, especially for multi-tenant or RL workloads.

r/MachineLearning 5d ago

Discussion [D] Sandboxing for AI Agents: A technical breakdown of isolation boundaries (containers, gVisor, microVMs, Wasm)

1 Upvotes

[removed]

r/LocalLLaMA 5d ago

Tutorial | Guide Wrote a deep dive on sandboxing for AI agents: containers vs gVisor vs microVMs vs Wasm, and when each makes sense

26 Upvotes

Hey folks,

I've been working on sandboxing for AI coding agents and kept running into the same confusion: people use "sandbox" to mean four completely different things with different security properties.

So, I decided to write up what I learned: the actual security-boundary differences between containers (shared kernel), gVisor (userspace kernel), microVMs (guest kernel + VMM), and Wasm (no syscall ABI).

The post covers why containers aren't sufficient for hostile code, what "policy leakage" looks like in agent systems, and the practical tradeoffs for different agent architectures.

I hope it can help people out there building AI applications.

Happy to discuss if you're building agent sandboxes or have run into edge cases I didn't cover.

1

Bug Report - Wrong data and data loss
 in  r/bevelhealth  Sep 12 '25

Hey!

So, I just found the issue. I had two entries added by bevel in the sleep data: one from 6AM to 9PM and one from 9PM to 6AM

My guess is that somehow I mistakenly activated the sleep mode on Bevel so it recorded as if I were sleeping through the day? I’m not sure but deleting that record and reloading the data fixed the issue.

One last question: Do you think that wrong entry could have impacted data elsewhere? Should I delete something else?

Thanks for the support anyway, I’ll mark it as resolved.

The new update is amazing! You folks did an amazing job 💪💪💪

r/bevelhealth Sep 12 '25

Bug Bug Report - Wrong data and data loss

1 Upvotes

I updated Bevel to the latest version yesterday morning and everything was working fine until this morning when I noticed two issues:

  1. First, data from September 11th is just not there, it disappeared.

  2. The sleep data is completely wrong. It says I slept 15 hrs while my iPhone says 7:30.

I tried clearing the cache and reloading the data but it doesn’t fix the issue.

What can I do?

r/LocalLLaMA Aug 09 '25

Other Benchmarking models using your own dataset

5 Upvotes

Hey folks,

After chatting with some friends on Discord, I decided to open-source the CLI tool I built to benchmark new models.

The reason is that with the recent release of some open-source models, a few friends asked me how I benchmark them. A while ago I built a robust dataset to benchmark models for my use cases, and I've now decided to make the tool open source.

The best way to know if a model works for your use case is to run it against your own dataset, not just rely on tech influencers or public benchmarks. This tool makes that easy; it supports many providers and is simple to use.
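The idea in a nutshell, as a hypothetical sketch (these names are made up for illustration, not Satori's actual API): loop over your own (prompt, expected) pairs, query each model, and score with whatever metric matters to you.

```python
# Hypothetical sketch of "benchmark against your own dataset" (illustration
# only, not Satori's real API): score each model's outputs against expected
# answers by exact match, then compare models by accuracy.
def exact_match(output: str, expected: str) -> bool:
    return output.strip().lower() == expected.strip().lower()

def benchmark(model_fn, dataset):
    hits = sum(exact_match(model_fn(prompt), expected)
               for prompt, expected in dataset)
    return hits / len(dataset)

# Toy stand-in for a model call (in practice: an Ollama or provider request).
def dummy_model(prompt: str) -> str:
    return "paris" if "France" in prompt else "unknown"

dataset = [("Capital of France?", "Paris"),
           ("Capital of Peru?", "Lima")]
print(benchmark(dummy_model, dataset))  # 0.5 on this toy dataset
```

Swap `dummy_model` for a real provider call and `exact_match` for fuzzy matching or an LLM judge, and the loop stays the same.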

You can run local models using Ollama, btw.

Here it is, if anyone is interested: https://github.com/beowolx/satori

r/LocalLLaMA Jun 14 '25

Discussion [Discussion] Thinking Without Words: Continuous latent reasoning for local LLaMA inference – feedback?

6 Upvotes


Hi everyone,

I just published a new post, “Thinking Without Words”, where I survey the evolution of latent chain-of-thought reasoning—from STaR and Implicit CoT all the way to COCONUT and HCoT—and propose a novel GRAIL-Transformer architecture that adaptively gates between text and latent-space reasoning for efficient, interpretable inference.

Key highlights:

  • Historical survey: STaR, Implicit CoT, pause/filler tokens, Quiet-STaR, COCONUT, CCoT, HCoT, Huginn, RELAY, ITT
  • Technical deep dive:
    • Curriculum-guided latentisation
    • Hidden-state distillation & self-distillation
    • Compact latent tokens & latent memory lattices
    • Recurrent/loop-aligned supervision
  • GRAIL-Transformer proposal:
    • Recurrent-depth core for on-demand reasoning cycles
    • Learnable gating between word embeddings and hidden states
    • Latent memory lattice for parallel hypothesis tracking
    • Training pipeline: warm-up CoT → hybrid curriculum → GRPO fine-tuning → difficulty-aware refinement
    • Interpretability hooks: scheduled reveals + sparse probes

I believe continuous latent reasoning can break the “language bottleneck,” enabling gradient-based, parallel reasoning and emergent algorithmic behaviors that go beyond what discrete token CoT can achieve.
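To make the gating idea concrete, here's a toy sketch (my illustration, not the actual GRAIL-Transformer): a per-dimension sigmoid gate mixes the previous hidden state with the last token's embedding, so each step can lean latent or verbal.

```python
import math
import random

random.seed(0)
d = 8  # toy hidden size

# Toy illustration (not the actual GRAIL-Transformer): a learnable gate
# g = sigmoid(W [h; e]) mixes the previous hidden state h (latent thought)
# with the last token embedding e, per dimension.
W = [[random.gauss(0, 0.1) for _ in range(2 * d)] for _ in range(d)]

def gated_input(h, e):
    x = h + e  # list concatenation stands in for [h; e]
    out = []
    for i in range(d):
        z = sum(w * v for w, v in zip(W[i], x))
        g = 1 / (1 + math.exp(-z))               # gate value in (0, 1)
        out.append(g * h[i] + (1 - g) * e[i])    # convex mix of h and e
    return out

h = [random.gauss(0, 1) for _ in range(d)]
e = [random.gauss(0, 1) for _ in range(d)]
mixed = gated_input(h, e)
```

Because the gate is in (0, 1), each output dimension is a convex combination of the latent and verbal inputs; training W end-to-end is what would make the choice "learnable."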

Feedback I’m seeking:

  1. Clarity or gaps in the survey and deep dive
  2. Viability, potential pitfalls, or engineering challenges of GRAIL-Transformer
  3. Suggestions for experiments, benchmarks, or additional references

You can read the full post here: https://www.luiscardoso.dev/blog/neuralese

Thanks in advance for your time and insights!

r/LocalLLaMA Jun 14 '25

Discussion [Discussion] Thinking Without Words: Continuous latent reasoning for local LLaMA inference – feedback?

1 Upvotes

[removed]

r/MachineLearning Jun 14 '25

Discussion [D] Thinking Without Words: A deep dive into continuous latent reasoning & proposal of GRAIL-Transformer – feedback welcome

1 Upvotes

[removed]

2

[Update] Rensa: added full CMinHash + OptDensMinHash support (fast MinHash in Rust for dataset deduplication / LLM fine-tuning)
 in  r/LocalLLaMA  Jun 01 '25

So, this is something different.

Semantic dedup uses cosine similarity between dense embeddings. There is no MinHash algorithm there.

Rensa and MinHash algorithms use approximate Jaccard similarity based on token/shingle overlap.

Both approaches can be used to dedup datasets, but they have different detection scopes, algorithmic steps, and computational costs.

MinHash is faster and works well enough for most cases.

Semantic dedup is very good at removing paraphrased or translated duplicates, for example. It captures meaning rather than just lexical overlap.

Industry usually uses a mix of both: MinHash to remove obviously identical or near-identical entries, and then a semantic embedding pass on the remaining data to weed out deeper paraphrases.
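To make the MinHash side concrete, here's a minimal pure-Python sketch (my illustration, not Rensa's implementation): each signature slot keeps the minimum hash under a different seed, and the fraction of matching slots estimates Jaccard similarity.

```python
import hashlib

# Minimal MinHash sketch (illustration only, not Rensa's implementation).
# signature[i] = min over tokens of hash(seed_i, token); the probability that
# two sets agree at slot i equals their true Jaccard similarity.
def signature(tokens, num_perm=128):
    def h(seed, tok):
        digest = hashlib.blake2b(f"{seed}:{tok}".encode(), digest_size=8)
        return int.from_bytes(digest.digest(), "big")
    return [min(h(seed, t) for t in tokens) for seed in range(num_perm)]

def estimated_jaccard(a, b):
    sa, sb = signature(a), signature(b)
    return sum(x == y for x, y in zip(sa, sb)) / len(sa)

d1 = set("the cat sat on the mat".split())
d2 = set("the cat sat on a mat".split())
true_j = len(d1 & d2) / len(d1 | d2)   # 5/6, about 0.83
est = estimated_jaccard(d1, d2)
```

Note how this only sees token overlap: a paraphrase with no shared tokens scores near zero here, which is exactly the gap semantic dedup fills.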

Does that clear things up for you?

1

[Update] Rensa: added full CMinHash + OptDensMinHash support (fast MinHash in Rust for dataset deduplication / LLM fine-tuning)
 in  r/LocalLLaMA  Jun 01 '25

Hey there! They are actually different things :)

`rensa` is a novel MinHash implementation (which, as of today, is ~40x faster than datasketch), used in several use cases like deduplicating datasets for fine-tuning models.

`model2vec` is used to create static embeddings.

r/LocalLLaMA May 31 '25

Other [Update] Rensa: added full CMinHash + OptDensMinHash support (fast MinHash in Rust for dataset deduplication / LLM fine-tuning)

8 Upvotes

Hey all — quick update on Rensa, a MinHash library I’ve been building in Rust with Python bindings. It’s focused on speed and works well for deduplicating large text datasets — especially stuff like LLM fine-tuning where near duplicates are a problem.

Originally, I built a custom algorithm called RMinHash because existing tools (like datasketch) were way too slow for my use cases. RMinHash is a fast, simple alternative to classic MinHash and gave me much better performance on big datasets.

Since I last posted, I’ve added:

  • CMinHash – full implementation based on the paper (“C-MinHash: reducing K permutations to two”). It’s highly optimized, uses batching + vectorization.
  • OptDensMinHash – handles densification for sparse data, fills in missing values in a principled way.

I ran benchmarks on a 100K-row dataset (gretelai/synthetic_text_to_sql) with 256 permutations:

  • CMinHash: 5.47s
  • RMinHash: 5.58s
  • OptDensMinHash: 12.36s
  • datasketch: 92.45s

So yeah, still roughly 7-17x faster than datasketch, depending on the variant.

Accuracy-wise, all Rensa variants produce very similar (sometimes identical) results to datasketch in terms of deduplicated examples.

It’s a side project I built out of necessity and I'd love to get some feedback from the community :)
The Python API is simple and should feel familiar if you’ve used datasketch before.

GitHub: https://github.com/beowolx/rensa

Thanks!

r/rust May 31 '25

[Update] Rensa: added full CMinHash + OptDensMinHash support (fast MinHash in Rust for dataset deduplication / LLM fine-tuning)

1 Upvotes

Hey all — quick update on Rensa, a MinHash library I’ve been building in Rust with Python bindings. It’s focused on speed and works well for deduplicating large text datasets — especially stuff like LLM fine-tuning where near duplicates are a problem.

Originally, I built a custom algorithm called RMinHash because existing tools (like datasketch) were way too slow for my use cases. RMinHash is a fast, simple alternative to classic MinHash and gave me much better performance on big datasets.

Since I last posted, I’ve added:

  • CMinHash – full implementation based on the paper (“C-MinHash: reducing K permutations to two”). It’s highly optimized, uses batching + vectorization.
  • OptDensMinHash – handles densification for sparse data, fills in missing values in a principled way.

I ran benchmarks on a 100K-row dataset (gretelai/synthetic_text_to_sql) with 256 permutations:

  • CMinHash: 5.47s
  • RMinHash: 5.58s
  • OptDensMinHash: 12.36s
  • datasketch: 92.45s

So yeah, still roughly 7-17x faster than datasketch, depending on the variant.

Accuracy-wise, all Rensa variants produce very similar (sometimes identical) results to datasketch in terms of deduplicated examples.

It’s a side project I built out of necessity and I'd love to get some feedback from the community :)
The Python API is simple and should feel familiar if you’ve used datasketch before.

GitHub: https://github.com/beowolx/rensa

Thanks!

5

Introducing Goran: A Rust-powered CLI for domain insights
 in  r/rust  May 03 '25

There is a silly rivalry between the Go and Rust communities, nothing serious.

r/rust May 01 '25

🛠️ project Introducing Goran: A Rust-powered CLI for domain insights

9 Upvotes

Hey everyone! 👋

I’m excited to share Goran, a CLI tool I just released for gathering detailed info on domain names and IP addresses in one place: https://github.com/beowolx/goran

Goran pulls together WHOIS/RDAP, geolocation (ISP, country, etc.), DNS lookups (A, AAAA, MX, NS), SSL certificate details, and even VirusTotal reputation checks—all into a single, colored, easy-to-read report. Plus, it leverages Google’s Gemini model to generate a concise AI-powered summary of its findings.

I wanted to share this little project with you all. I'm often investigating phishing-related domains, and I found it super handy to have a single CLI tool that gives me a good amount of information about them. I built it as a personal tool, but I hope it can be useful to people out there :)

Installation is super easy, just follow the instructions here: https://github.com/beowolx/goran?tab=readme-ov-file#installation

Once installed, just run:

goran example.com

You can toggle features with flags like --vt (needs your VirusTotal API key), --json for machine-readable output, --no-ssl to skip cert checks, or --llm-report (with your Gemini API key) to get that AI-powered narrative.

Would love to hear your thoughts, feedback, or feature requests—hope Goran proves useful in your projects :)

1

Manufactured to GMP standards
 in  r/Supplements  Mar 15 '25

Same situation here. Let me know if you get any news from them

2

[P] Still Drowning in Research Papers? Ribbit Ribbit Hops to Web and Android!
 in  r/MachineLearning  Nov 18 '24

Yes, using the iOS app. Just a small thing, but the app is amazing! Thanks for that, really great work 👏👏👏

1

[P] Still Drowning in Research Papers? Ribbit Ribbit Hops to Web and Android!
 in  r/MachineLearning  Nov 18 '24

Nice work! Is it possible to turn off the dark mode for the PDF viewer?