Chethan Reddy G. P.

About

I'm Chethan. I'm doing my MTech in AI at IISc — lived in Bengaluru my whole life, except for undergrad at NIT Surathkal.

I care about deep learning from first principles and hardware-software co-design. Specifically: can superintelligence be efficient enough that everyone runs their own, on their own hardware? Local LLMs, open systems, unix philosophy.

What I've been up to

Does zero-order optimisation actually work at scale?

Zero-order methods like SPSA estimate gradients with just two forward passes — but they pay a steep price: variance scales with dimension d, so per-coordinate SNR is roughly 1/d. Catastrophic at a trillion parameters.
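To make the two-forward-pass idea concrete, here is a minimal sketch of an SPSA estimator in plain Python. The toy quadratic loss, the step size `c`, and the helper `mean_squared_error_per_coordinate` are assumptions chosen for illustration, not anything from my actual experiments.

```python
import random

def spsa_gradient(loss, w, c=1e-3, rng=random):
    """One SPSA estimate of the gradient of loss at w, from two loss evaluations."""
    # Rademacher perturbation: each coordinate is +1 or -1 with equal probability.
    delta = [rng.choice((-1.0, 1.0)) for _ in w]
    w_plus = [wi + c * di for wi, di in zip(w, delta)]
    w_minus = [wi - c * di for wi, di in zip(w, delta)]
    # One scalar difference is shared by every coordinate.
    diff = (loss(w_plus) - loss(w_minus)) / (2.0 * c)
    # Dividing by delta_i is the same as multiplying, since delta_i is +1 or -1.
    return [diff * di for di in delta]

def quadratic_loss(w):
    # Toy loss (assumption for illustration); its true gradient is 2 * w.
    return sum(wi * wi for wi in w)

def mean_squared_error_per_coordinate(d, trials=200, seed=0):
    """Average squared error of the SPSA estimate, per coordinate, at w = (1, ..., 1)."""
    rng = random.Random(seed)
    w = [1.0] * d
    total = 0.0
    for _ in range(trials):
        g = spsa_gradient(quadratic_loss, w, rng=rng)
        total += sum((gi - 2.0 * wi) ** 2 for gi, wi in zip(g, w)) / d
    return total / trials
```

Calling `mean_squared_error_per_coordinate(d)` for growing `d` shows the problem: on this toy loss the true gradient entry stays at 2, while the noise each coordinate inherits from all the other coordinates grows roughly linearly with `d`.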

What rescues this is the NTK regime. In massively overparameterised networks, the loss becomes nearly quadratic in each individual weight — higher-order terms average away across the rest of the network. A centred finite difference gives:

L(w+1) − L(w−1)  =  2 · ∂L/∂w  +  (1/3) · ∂³L/∂w³  +  …
error  ≈  O(1/N)  as parameters N → ∞   [dimensionality blessing]
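A quick numeric check of the expansion above, as a sketch on two hand-picked one-dimensional losses (both are assumptions for illustration). With a step of 1, the centred difference is exact for a quadratic, and for a cubic the leftover is exactly the (1/3) · ∂³L/∂w³ term.

```python
def centred_difference(loss, w, step=1.0):
    """L(w + step) - L(w - step), with no division, as in the expansion above."""
    return loss(w + step) - loss(w - step)

def quadratic(w):
    # L = 3w^2 + 2w + 1, so dL/dw = 6w + 2 and the third derivative is 0.
    return 3.0 * w * w + 2.0 * w + 1.0

def cubic(w):
    # L = w^3, so dL/dw = 3w^2 and the third derivative is 6.
    return w ** 3

w = 0.5
quadratic_gap = centred_difference(quadratic, w) - 2.0 * (6.0 * w + 2.0)
cubic_gap = centred_difference(cubic, w) - 2.0 * (3.0 * w * w)
```

Here `quadratic_gap` is zero and `cubic_gap` equals (1/3) · 6 = 2, which is why a loss that is nearly quadratic in each individual weight makes a unit step tolerable.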

Sparsity is the other lever. If the gradient is s-sparse (only s of its d coordinates are non-zero), the effective dimension collapses and sample complexity drops from Ω(d/ε²) to Ω(s/ε²). BitNet suggested that ternary weights {−1, 0, +1} hit a sweet spot: natural sparsity makes zero-order practical, and the finite-difference step Δ = 1 lands exactly on valid weight values, so no rounding is required.

I've been doing toy work on a BitNet-style architecture trained on FineWeb — currently ~100M params, ~2B tokens. Would love to scale it and see where the zero-order + ternary story actually breaks down.

Is IMC + ternary the right architecture for the AI age?

The memory wall is the real bottleneck in inference — not compute. In-memory computing (IMC) tries to fix this by doing multiply-accumulate directly inside the SRAM array, eliminating the constant weight-fetching that dominates power and latency.

Ternary weights make the hardware story almost embarrassingly clean. Each cell is 2 bits; the multiply collapses to a conditional add; peripheral logic simplifies drastically. A digital ternary IMC array at 7nm sits at ~0.15 µm²/cell. If you dedicate 80–90% of die to SRAM — which, for an inference-only chip, is entirely reasonable — you land at roughly 1–3 billion ternary weights on a single die, with weights never leaving the chip.
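The capacity estimate is simple arithmetic, sketched below. It assumes one cell stores one ternary weight at ~0.15 µm², and the die areas of 200 mm² and 500 mm² are my own illustrative picks, not figures for any specific chip.

```python
def ternary_weights_per_die(die_area_mm2, sram_fraction, cell_area_um2=0.15):
    """Rough count of ternary weights that fit in the SRAM share of a die.

    Assumes one cell holds one ternary weight and ignores peripheral overhead.
    """
    um2_per_mm2 = 1_000_000  # 1 mm = 1000 um, so 1 mm^2 = 10^6 um^2
    sram_area_um2 = die_area_mm2 * sram_fraction * um2_per_mm2
    return sram_area_um2 / cell_area_um2

small = ternary_weights_per_die(200, 0.80)  # hypothetical mid-sized die
large = ternary_weights_per_die(500, 0.90)  # hypothetical large die
```

Under these assumptions `small` comes out a little above one billion weights and `large` at about three billion, which is where the 1–3 billion range comes from.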

I've been writing toy Verilog for ternary mat-mul IMC macros: bit-serial, popcount-based, sign-separated {+1, −1} paths. Looking forward to scaling this and thinking properly about what a full inference chip looks like.
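For readers who don't read Verilog, here is a small software model of the same idea in Python. It is a sketch of the general technique, not a translation of my macros: weights are split into a +1 mask and a −1 mask, activations (assumed to be unsigned integers of `act_bits` bits) are fed one bit plane at a time, and each plane costs two AND-plus-popcount operations.

```python
def pack_ternary(weights):
    """Split ternary weights into two bitmasks: positions of +1 and positions of -1."""
    plus = minus = 0
    for i, w in enumerate(weights):
        if w == 1:
            plus |= 1 << i
        elif w == -1:
            minus |= 1 << i
        elif w != 0:
            raise ValueError("weights must be in {-1, 0, +1}")
    return plus, minus

def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")

def bit_serial_dot(weights, activations, act_bits=8):
    """Dot product of ternary weights with unsigned activations, one bit plane at a time."""
    plus, minus = pack_ternary(weights)
    acc = 0
    for b in range(act_bits):
        # Bit plane b: bit i is set when activation i has bit b set.
        plane = 0
        for i, a in enumerate(activations):
            if (a >> b) & 1:
                plane |= 1 << i
        # Sign-separated paths: count matches against the +1 and -1 masks.
        partial = popcount(plane & plus) - popcount(plane & minus)
        # Weight the partial sum by the significance of this bit plane.
        acc += partial << b
    return acc
```

For example, `bit_serial_dot([1, -1, 0, 1], [3, 5, 7, 2])` computes 3 − 5 + 0 + 2 without a single multiplication.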

Detecting deepfakes with just a few images

Every deepfake leaves two universal artifacts. Face Inconsistency Artifacts (FIA): seams between the forged region and the real background. Up-Sampling Artifacts (USA): the decoder's spectral fingerprint, unavoidably printed whenever a generator up-samples a latent code back to pixels.

The USA is generator-specific: a Stable Diffusion VAE's fingerprint looks different from StyleGAN's. That means reading the USA gives you a path to few-shot generalisation: five images from an unseen generator, and you know what to look for.

I'm using prototypical networks for this. The prototype for each generator class should capture that generator's USA signature. I'm working on making the network explicitly learn the USA — contrasting inside-mask (forged, USA-bearing) features against outside-mask (real) features during prototype learning, so the representations are discriminative in the frequency domain where the fingerprint lives.
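The classification step of a prototypical network is small enough to sketch in plain Python. This shows only the standard prototype-and-nearest-distance rule on embeddings that already exist. The embedding network, the mask-based contrast, and the labels `gen_a` and `gen_b` are all left out or made up here.

```python
def prototype(embeddings):
    """Class prototype: the coordinate-wise mean of the support embeddings."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(query, support):
    """Assign the query embedding to the class with the nearest prototype.

    support maps a class label to a list of embeddings (the few shots).
    """
    prototypes = {label: prototype(vecs) for label, vecs in support.items()}
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))

# Hypothetical 2-D embeddings, two shots per generator.
support = {
    "gen_a": [[0.0, 0.0], [0.2, 0.0]],
    "gen_b": [[1.0, 1.0], [1.0, 0.8]],
}
predicted = classify([0.1, 0.1], support)
```

With a handful of support images per unseen generator, adding a new class is just computing one more mean, which is what makes the few-shot setting workable.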

Slowly, on the side

NixOS and the unix philosophy as the first agentic OS

The unix philosophy — one tool, one job, composable, all text — is quietly the ideal substrate for agentic LLMs. Everything is a file. Everything is inspectable. Pipes compose tools the same way tool calls compose agents. An LLM that can shell out is already a capable agent; unix just makes that surface enormous and coherent.

NixOS takes this one step further: one declarative file describes your entire system state — packages, configs, services, dotfiles. Reproducible. Rollbackable. A future agent managing a NixOS system can reason about it completely because the whole system is just data. No hidden mutable state, no "works on my machine." One file to rule it all.

I'm moving toward this slowly. Currently on Arch, watching NixOS from a distance.

Rethinking ML without objective functions

I came across the Stanford seminar on hyperdimensional computing and went deep. The core idea: work in very high-dimensional binary or bipolar vectors where random vectors are nearly orthogonal with overwhelming probability. Encode, bind, and bundle information with XOR, permutation, and addition. The math works out, and no gradient is required.
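The basic operations fit in a few lines. This sketch uses bipolar vectors, where element-wise multiplication plays the role XOR plays for binary vectors. The function names and the dimension of 10,000 are my own choices for illustration.

```python
import random

def random_hv(dim, rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(dim)]

def bind(a, b):
    """Binding: element-wise product, the bipolar counterpart of XOR."""
    return [x * y for x, y in zip(a, b)]

def permute(a, shift=1):
    """Permutation: cyclic rotation to the right by `shift` positions."""
    shift %= len(a)
    return a[-shift:] + a[:-shift]

def bundle(vectors):
    """Bundling: element-wise sum followed by sign (ties go to +1)."""
    return [1 if s >= 0 else -1 for s in map(sum, zip(*vectors))]

def cosine(a, b):
    """Cosine similarity; for bipolar vectors both norms equal sqrt(dim)."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
a, b, c = (random_hv(10_000, rng) for _ in range(3))
```

Here `cosine(a, b)` sits close to zero, `bind(bind(a, b), b)` recovers `a` exactly, and `bundle([a, b, c])` stays clearly similar to each of its inputs.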

In a small experiment, trigram features are generated via permutations, and "learning" is just summing the encoded training vectors of each class into a single class vector. That's it. No loss function. No backprop. No Taylor series. No optimizer. And it works reasonably well.
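Here is a rough reconstruction of that kind of experiment, not the exact code I ran. The dimension, the four-letter alphabet, the toy training strings, and all function names are assumptions. Each trigram is encoded by rotating the first two character vectors and multiplying all three together, and training is nothing but addition.

```python
import math
import random

DIM = 2048  # hypervector dimension (assumption; larger is more robust)

def make_item_memory(alphabet, rng):
    """One fixed random bipolar vector per character."""
    return {ch: [rng.choice((-1, 1)) for _ in range(DIM)] for ch in alphabet}

def rotate(v, k):
    """Cyclic rotation to the right by k positions (the permutation)."""
    k %= len(v)
    return v[-k:] + v[:-k]

def encode(text, items):
    """Sum of trigram vectors; position in the trigram is marked by rotation."""
    total = [0] * DIM
    for a, b, c in zip(text, text[1:], text[2:]):
        ra, rb, vc = rotate(items[a], 2), rotate(items[b], 1), items[c]
        for i in range(DIM):
            total[i] += ra[i] * rb[i] * vc[i]
    return total

def train(examples, items):
    """'Learning': add every encoded example into its class vector."""
    classes = {}
    for label, text in examples:
        vec = encode(text, items)
        acc = classes.setdefault(label, [0] * DIM)
        for i in range(DIM):
            acc[i] += vec[i]
    return classes

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def predict(text, classes, items):
    """Pick the class whose vector is most similar to the encoded query."""
    query = encode(text, items)
    return max(classes, key=lambda label: cosine(query, classes[label]))

rng = random.Random(0)
items = make_item_memory("abcd", rng)
classes = train([("x", "abababab"), ("x", "babababa"), ("y", "cdcdcdcd")], items)
```

`predict("ababab", classes, items)` then returns the class whose training strings share the query's trigrams, with no loss, gradient, or optimizer anywhere in the pipeline.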

This keeps me awake at night. Not because I think it replaces gradient descent — but because it makes me question what "learning" actually means, and whether the objective-function framing is as fundamental as we treat it.

Uses

Hardware

laptop Asus Vivobook — 24 threads of Zen 5, love every one of them
server HP workstation, AMD MI50, Xeon E5-2689 (14 cores) — enough to self-host everything and run local LLMs
monitors LG ultrawide + BenQ vertical — the vertical is for reading and terminals, non-negotiable
keyboard Zebronics mechanical, blue switches
mouse Logitech G306 — fan of the high polling rate
phone Pixel 6a with GrapheneOS — exists because it has to exist

Software — almost all FOSS

os Arch Linux — huge fan of the unix philosophy and FOSS; Arch because the wiki is exceptional and everything exists in pacman or AUR
desktop GNOME — had the ricing and WM phase, came back to stability : )
terminal Kitty — by far the best terminal; kittens for SSH, image rendering, all of it
editor Neovim — the millisecond lag in large IDEs irritates me; Neovim doesn't have it
browser Zen Browser — Firefox-based (win for Linux), minimal aesthetic I like
twitter/x xcancel.com — like Twitter, not its algorithm or tracking
reddit RedReader — non-scrolling focused UI; mostly r/LocalLLaMA and r/stunfisk
video YouTube — self-explanatory

Elsewhere