<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Thamme Gowda</title><link>https://gowda.ai/</link><description>Recent content on Thamme Gowda</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Mon, 30 Mar 2026 20:30:00 +0000</lastBuildDate><atom:link href="https://gowda.ai/index.xml" rel="self" type="application/rss+xml"/><item><title>From O(N) to O(log N): A Faster BPE Training Algorithm, Buried and Rediscovered</title><link>https://gowda.ai/posts/2026/03/faster-bpe-learn/</link><pubDate>Mon, 30 Mar 2026 20:30:00 +0000</pubDate><guid>https://gowda.ai/posts/2026/03/faster-bpe-learn/</guid><description>I wrote a fast BPE training algorithm in 2020, buried it in a Python codebase, and forgot about it. Five years later, I rewrote it in C++ and benchmarked it: up to 11× faster than SentencePiece. The trick? A max-heap with lazy deletion instead of periodic linear scans.</description></item><item><title>Building a Jinja2 Template Engine from Scratch in C++</title><link>https://gowda.ai/posts/2026/03/parsing-tutorial-jinja/</link><pubDate>Tue, 10 Mar 2026 12:00:00 +0000</pubDate><guid>https://gowda.ai/posts/2026/03/parsing-tutorial-jinja/</guid><description>A tutorial on building a Jinja2 template engine in C++ for rendering LLM chat templates. Covers the lexer, recursive descent parser, and tree-walking evaluator, with real examples from HuggingFace model templates.</description></item><item><title>I Let Two AI Agents Race to Modernize pigz</title><link>https://gowda.ai/posts/2026/03/pigzpp-with-agents/</link><pubDate>Sat, 07 Mar 2026 20:20:00 +0000</pubDate><guid>https://gowda.ai/posts/2026/03/pigzpp-with-agents/</guid><description>I gave Claude Opus 4.6 and GPT 5.4 the same task: rewrite pigz in modern C++23 as a thread-safe library. 
One agent did a clean-room rewrite; the other wrapped the legacy code. The winner went on to outperform pigz by up to 1.8× on compression and 2.4× on decompression.</description></item><item><title>Sequence Transduction: Generalization and Challenges</title><link>https://gowda.ai/posts/2021/05/nmt-generalization-n-challenges/</link><pubDate>Tue, 04 May 2021 10:20:00 +0000</pubDate><guid>https://gowda.ai/posts/2021/05/nmt-generalization-n-challenges/</guid><description>Sequence-to-sequence transduction is a general problem, of which many other problems are special cases. I also highlight some challenges of this general problem.</description></item><item><title>Many-to-English Machine Translation Tools, Data, and Pretrained Models</title><link>https://gowda.ai/posts/2021/04/mtdata-nlcodec-rtg-many-english/</link><pubDate>Sun, 25 Apr 2021 10:20:00 +0000</pubDate><guid>https://gowda.ai/posts/2021/04/mtdata-nlcodec-rtg-many-english/</guid><description>We present useful tools for machine translation research: MTData, NLCodec, and RTG. We demonstrate their usefulness by creating a multilingual neural machine translation model capable of translating from 500 source languages to English. We make this multilingual model readily downloadable and usable as a service, or as a parent model for transfer learning to even lower-resource languages.</description></item><item><title>Macro-Average: Rare Types Are Important Too</title><link>https://gowda.ai/posts/2021/03/macroavg-rare-types-important/</link><pubDate>Thu, 11 Mar 2021 10:20:00 +0000</pubDate><guid>https://gowda.ai/posts/2021/03/macroavg-rare-types-important/</guid><description>We explore the simple type-based classifier metric, MacroF1, and study its applicability to MT evaluation. 
We find that MacroF1 is competitive on direct assessment, and outperforms others in indicating downstream cross-lingual information retrieval task performance.</description></item><item><title>Finding the Optimal Vocabulary for Neural Machine Translation</title><link>https://gowda.ai/posts/2020/11/2020-optimal-vocab-nmt/</link><pubDate>Sun, 01 Nov 2020 10:20:00 +0000</pubDate><guid>https://gowda.ai/posts/2020/11/2020-optimal-vocab-nmt/</guid><description>We cast neural machine translation (NMT) as a classification task in an autoregressive setting and analyze the limitations of both classification and autoregression components. Classifiers are known to perform better with balanced class distributions during training. Since the Zipfian nature of languages causes imbalanced classes, we explore its effect on NMT.</description></item><item><title>Notes</title><link>https://gowda.ai/notes/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://gowda.ai/notes/</guid><description>&lt;p>Here are some useful notes I have collected:&lt;/p>
&lt;h2 id="my-notes">My Notes&lt;/h2>
&lt;ul>
&lt;li>Python Best Practices: &lt;a href="https://gowda.ai/files/Python-Best-Practices-TG-2019.pdf">Get PDF&lt;/a>; &lt;a href="https://docs.google.com/presentation/d/1qRq6VJH4FsOHQa9y4VunDLH14Z20cAQ3uCftTxlnIX0/edit?usp=sharing">Google Slides&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://gowda.ai/files/intro-quantum-optimization.pdf">Introduction to Quantum Optimization using D-WAVE 2X&lt;/a>&lt;/li>
&lt;li>2019-Fall CSCI-662
&lt;ul>
&lt;li>&lt;a href="https://gowda.ai/files/2019f-cs662/GoogleCC-Pytorch.pdf">Google Cloud Setup for Pytorch with GPU&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://gowda.ai/files/2019f-cs662/non-linear-classifier.pdf">Non-Linear Classifiers&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="https://thammegowda.github.io/slurm101">SLURM 101&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://thammegowda.github.io/summary/nmt/03-unsup/01-unsupervised-nmt.html">Unsupervised NMT Summary&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="notes-from-literature">Notes From Literature&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://gowda.ai/files/Sceptical-Thinking-Carl-Sagan.pdf">Tools for Sceptical Thinking&lt;/a> from the book &lt;em>The Demon-Hunted World&lt;/em> by &lt;strong>Carl Sagan&lt;/strong>&lt;/li>
&lt;li>&lt;a href="https://gowda.ai/files/PaleBlueDot-CarlSagan.pdf">Pale Blue Dot&lt;/a> by &lt;strong>Carl Sagan&lt;/strong>&lt;/li>
&lt;li>&lt;a href="https://gowda.ai/files/Creative-Thinking-Claude-Shannon-1952.pdf">Creative Thinking&lt;/a> by &lt;strong>Claude Shannon&lt;/strong>, 1952&lt;/li>
&lt;/ul></description></item><item><title>Publications</title><link>https://gowda.ai/publications/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://gowda.ai/publications/</guid><description/></item><item><title>Software</title><link>https://gowda.ai/software/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://gowda.ai/software/</guid><description>&lt;p>Solving problems with math and computers is my favourite kind of work.
In the early days of my career, I aspired to be a good software engineer, and I pursued that passionately until I gradually transitioned to a research career.
While I write less code as a researcher than I did as a software engineer, I emphasize good software engineering practices and open-sourcing tools under a permissive license.
Early on (2012&amp;ndash;2016) I wrote much of my code in Java/Groovy/Scala, but in recent years (2016&amp;ndash;now) Python has become my go-to choice. I have released a number of tools on &lt;a href="https://pypi.org/user/Thamme.Gowda/">PyPI&lt;/a>.&lt;/p></description></item></channel></rss>