RightNow Research Lab
Enabling Model-Hardware Co-Design at Scale
Abstract
We build the tools that close the gap between AI models and the hardware they run on, giving engineers and enterprises the infrastructure to go from model to optimized production inference, faster.
1 Products
We ship three products targeting different layers of the GPU development stack: managed GPU infrastructure for deployment at scale, an AI code editor for kernel developers, and an inference optimization agent for enterprises.
RightNow Editor
The only all-in-one AI code editor for GPU kernel development. Real-time profiling, emulation, and hardware-aware completions.

Forge
Drop-in optimized GPU kernels for your models. Up to 7.6x faster inference with verified correctness.
2 Research
Our research spans efficient inference, world modeling, GPU kernel optimization, and large-scale dataset curation, with four papers published on arXiv.
3 Open Source
We maintain three open-source projects spanning low-level agent infrastructure, edge GPU inference, and automated kernel optimization. Combined 18,500+ GitHub stars.
OpenFang
RightNow-AI/openfang
Open-source Agent OS that lets AI agents touch the kernel of the operating system. A single ~32MB Rust binary giving agents direct access to hardware, syscalls, and GPU resources without sandboxing.
PicoLM
RightNow-AI/picolm
Run a 1B parameter LLM on a $10 board with 256MB RAM. Pure C, zero dependencies, ~80KB binary. Efficient GPU-free inference for edge devices where every cycle and byte counts.
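A back-of-envelope check makes the 256MB constraint concrete: weight storage scales linearly with bit width, so a 1B-parameter model only fits on-device at very low precision. The bit widths below are illustrative assumptions, not PicoLM's actual weight format.

```python
# Hedged arithmetic sketch: weight memory for a 1B-parameter model at
# various quantization bit widths, versus a 256 MB RAM budget.
# Bit widths are assumptions for illustration; activations, KV cache,
# and runtime overhead are deliberately ignored here.
params = 1_000_000_000
budget_mb = 256

for bits in (16, 8, 4, 2):
    weights_mb = params * bits / 8 / 1e6  # bytes -> MB
    verdict = "fits" if weights_mb < budget_mb else "does not fit"
    print(f"{bits}-bit weights: {weights_mb:.0f} MB -> {verdict} in {budget_mb} MB")
```

Only at around 2 bits per weight (250 MB) does the raw weight storage drop under the budget, which is why every cycle and byte counts on hardware in this class.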
AutoKernel
RightNow-AI/autokernel
Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels. Runs ~40 experiments/hour using Amdahl's law to prioritize bottlenecks.
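Amdahl's law bounds the end-to-end speedup from optimizing one kernel by the fraction of runtime that kernel occupies, which is why prioritizing the biggest bottleneck pays off first. A minimal sketch of that ranking logic follows; the kernel names, runtime fractions, and assumed 3x local speedup are hypothetical and not AutoKernel's internals.

```python
# Hedged sketch: ranking kernels by potential end-to-end speedup using
# Amdahl's law. Profile numbers and the 3x local speedup are assumptions
# for illustration only.

def amdahl_speedup(fraction: float, local_speedup: float) -> float:
    """Overall speedup if a kernel taking `fraction` of total runtime
    is made `local_speedup` times faster (Amdahl's law)."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

# Hypothetical profile: fraction of total runtime spent in each kernel.
profile = {"attention": 0.55, "mlp": 0.30, "layernorm": 0.05}

# Rank kernels by the end-to-end payoff of an assumed 3x local speedup.
ranked = sorted(profile, key=lambda k: amdahl_speedup(profile[k], 3.0),
                reverse=True)
print(ranked)  # the kernel with the largest runtime fraction ranks first
```

Because the payoff is monotone in the runtime fraction, the largest bottleneck always tops the queue, so each experiment budget is spent where the model-level speedup bound is highest.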





