<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem</title>
    <description>The most recent home feed on Forem.</description>
    <link>https://forem.com</link>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed"/>
    <language>en</language>
    <item>
      <title>AI Coding Tools Have a Context Problem — Here's the Fix</title>
      <dc:creator>RapidKit</dc:creator>
      <pubDate>Tue, 21 Apr 2026 08:11:47 +0000</pubDate>
      <link>https://forem.com/rapidkit/ai-coding-tools-have-a-context-problem-heres-the-fix-167i</link>
      <guid>https://forem.com/rapidkit/ai-coding-tools-have-a-context-problem-heres-the-fix-167i</guid>
      <description>&lt;h2&gt;
  
  
  The Wrong Unit of Context
&lt;/h2&gt;

&lt;p&gt;Most AI coding tools work at the &lt;strong&gt;file level&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That's fine for a React component. A component is self-contained — the context needed to help you fits in the file.&lt;/p&gt;

&lt;p&gt;Backend services aren't self-contained. They live inside environments. They share infrastructure. They depend on modules installed at the workspace level.&lt;/p&gt;

&lt;p&gt;This is why AI backend debugging suggestions are often... almost right. They're missing environment context.&lt;/p&gt;




&lt;h2&gt;
  
  
  What a Backend AI Actually Needs
&lt;/h2&gt;

&lt;p&gt;Take this error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;redis.exceptions.ConnectionError: Error 111 connecting to localhost:6379
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A file-level AI tells you: Redis isn't running.&lt;/p&gt;

&lt;p&gt;A workspace-aware AI knows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have the &lt;code&gt;redis-cache&lt;/code&gt; module installed in &lt;code&gt;auth-api&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Your Workspace Health check already flagged this&lt;/li&gt;
&lt;li&gt;You're using Docker Compose conventions (RapidKit workspace)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The second answer is specific. The first is a starting point you still have to work from.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Workspace as Context Unit
&lt;/h2&gt;

&lt;p&gt;In Workspai, when AI responds to a debug action, it receives:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"project"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"auth-api"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fastapi.standard"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="nl"&gt;"modules"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"jwt-auth"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"redis-cache"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"python"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"3.12.3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"health_warnings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Redis not reachable at localhost:6379"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ConnectionRefusedError at line 89"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not file contents. A structured workspace snapshot. The response is grounded from the first message.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why the Workspace Format Matters
&lt;/h2&gt;

&lt;p&gt;This only works because &lt;strong&gt;RapidKit defines a structured workspace format&lt;/strong&gt;. It knows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which projects exist and what type they are&lt;/li&gt;
&lt;li&gt;Which modules are installed in each project&lt;/li&gt;
&lt;li&gt;The runtime version&lt;/li&gt;
&lt;li&gt;The current health state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without this structure, you'd have to infer context from file contents — slow, unreliable, incomplete.&lt;/p&gt;

&lt;p&gt;With it, context assembly is deterministic. The AI starts informed.&lt;/p&gt;
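&lt;p&gt;As a sketch of what "deterministic context assembly" can mean in practice, here is a minimal illustration in Python. The manifest fields mirror the snapshot above, but the function name and manifest keys are hypothetical, not RapidKit's actual API:&lt;/p&gt;

```python
# Hypothetical sketch of deterministic context assembly from a structured
# workspace manifest. The field names are illustrative, not RapidKit's API.
import json

def build_debug_context(manifest: dict, error: str) -> str:
    """Assemble a grounded prompt preamble from a workspace manifest."""
    snapshot = {
        "project": manifest["name"],
        "type": manifest["kit"],
        "modules": manifest["modules"],
        "python": manifest["runtime"],
        "health_warnings": manifest.get("health_warnings", []),
        "error": error,
    }
    # The same manifest always yields the same snapshot: no inference needed.
    return json.dumps(snapshot, indent=2)

manifest = {
    "name": "auth-api",
    "kit": "fastapi.standard",
    "modules": ["jwt-auth", "redis-cache"],
    "runtime": "3.12.3",
    "health_warnings": ["Redis not reachable at localhost:6379"],
}
print(build_debug_context(manifest, "ConnectionRefusedError at line 89"))
```

&lt;p&gt;Because the snapshot is built from structured fields rather than inferred from file contents, the same workspace state always produces the same context.&lt;/p&gt;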




&lt;h2&gt;
  
  
  What's Available Now (v0.20)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;@workspai&lt;/code&gt; Chat Participant&lt;/strong&gt; — use &lt;code&gt;@workspai /ask&lt;/code&gt; for full-context Q&amp;amp;A scoped to your active project, or &lt;code&gt;@workspai /debug&lt;/code&gt; for structured root-cause + fix + prevention, directly in the VS Code Chat panel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI Create with presets&lt;/strong&gt; — describe a project in plain language (or pick a smart preset), and AI plans the workspace, picks a kit, and selects modules&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI Debug Actions&lt;/strong&gt; — lightbulb in Python/TS/JS/Go files with workspace-aware context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Doctor Fix with AI&lt;/strong&gt; — one-click AI resolution for workspace health issues&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Module Advisor&lt;/strong&gt; — compatible module suggestions based on what you're building&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workspace Memory&lt;/strong&gt; — persistent AI context scoped to the workspace, carried across sessions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All on top of the existing RapidKit workspace platform. No changes to CLI, kits, or modules.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;The teams that establish workspace structure now will leverage AI more effectively as the tools improve. Workspace-aware AI will become the baseline expectation — the file level will feel like working blind.&lt;/p&gt;

&lt;p&gt;🔗 &lt;a href="https://www.workspai.com/" rel="noopener noreferrer"&gt;workspai.com&lt;/a&gt;&lt;br&gt;&lt;br&gt;
🔗 &lt;a href="https://marketplace.visualstudio.com/items?itemName=rapidkit.rapidkit-vscode" rel="noopener noreferrer"&gt;Workspai — VS Code Marketplace&lt;/a&gt;&lt;br&gt;&lt;br&gt;
🔗 &lt;a href="https://getrapidkit.com" rel="noopener noreferrer"&gt;getrapidkit.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devtools</category>
      <category>workspai</category>
      <category>vscode</category>
    </item>
    <item>
      <title>The Planning Tax: Why Your AI Agent Feature Might Be Your Worst Investment</title>
      <dc:creator>Cornel Stefanache</dc:creator>
      <pubDate>Tue, 21 Apr 2026 08:05:07 +0000</pubDate>
      <link>https://forem.com/cstefanache/the-planning-tax-why-your-ai-agent-feature-might-be-your-worst-investment-50d7</link>
      <guid>https://forem.com/cstefanache/the-planning-tax-why-your-ai-agent-feature-might-be-your-worst-investment-50d7</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Your best feature may be destroying your margins, and your engineering team has no idea.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;This article isn’t about AI as a productivity tool. It’s about AI as a cost structure, embedded in your product, triggered by your users, and scaling with your revenue.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The AI agents embedded in your product are generating a cost structure your pricing model probably didn’t account for. Not a server bill. Not a licensing fee.&lt;/p&gt;

&lt;p&gt;A variable, compounding AI infrastructure cost that grows with engagement, spikes with complexity, and, unlike every other line in your budget, gets worse the more your product succeeds.&lt;/p&gt;

&lt;p&gt;Every interaction with an LLM-powered feature is a fresh purchase from a model provider, billed per token, at rates that compound with every feature you add to make the product smarter.&lt;/p&gt;

&lt;p&gt;The model provider captures guaranteed revenue on every interaction regardless of whether your business ever makes money on that customer. As Andreessen Horowitz has argued, the total cost of ownership for generative AI is reshaping the economics of an entire software category.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;AI is running at your expense, not your users'&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There is a quiet structural problem sitting at the centre of nearly every LLM-powered product business: the more useful your product becomes, the more expensive it is to run.&lt;/p&gt;

&lt;p&gt;This is not a temporary inefficiency that engineering will eventually optimise away. It is the defining economic characteristic of a new category of software, and most product teams are not treating it with the strategic gravity it deserves.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Paradox of the Power User
&lt;/h2&gt;

&lt;p&gt;The most celebrated features of LLM-powered products (personalisation at scale, natural language interfaces, conversational support that actually resolves issues, intelligent document summarisation) share a common characteristic: they get more expensive with use.&lt;/p&gt;

&lt;p&gt;The user who engages most deeply generates the most value and the most AI agent cost simultaneously. This inverts one of the foundational assumptions of the SaaS business model. In traditional software, your heaviest users are your best customers.&lt;/p&gt;

&lt;p&gt;They renew, they expand, they refer others. In LLM-powered products, your heaviest users may be your least profitable ones.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The user who loves your product enough to use it every day is the one most likely to be costing you more than they pay.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The evidence is not theoretical. GitHub Copilot launched at $10 per month per developer. Microsoft’s internal calculations later revealed that the average developer was costing roughly $30 per month in Azure compute, with heavy coders consuming up to $80 per month in inference. The product was operating at negative gross margin from day one for a meaningful subset of its user base.&lt;/p&gt;

&lt;p&gt;Microsoft subsequently raised pricing to $19 per month, not because the feature had improved, but because the original pricing had no defensible unit economics.&lt;/p&gt;

&lt;p&gt;Sam Altman confirmed publicly that ChatGPT Pro, priced at $200 per month, was losing money on users generating 20,000 or more queries. Cursor, Replit, and others have made similar mid-course corrections, shifting from flat-rate to consumption-based pricing once the distribution of actual usage became visible.&lt;/p&gt;

&lt;h2&gt;
  
  
  You Can’t Budget What You Can’t Predict
&lt;/h2&gt;

&lt;p&gt;Traditional compute scales linearly: you set a subscription price, model your cohorts, and the unit economics hold. AI agent costs break that contract entirely. You charge your customer a fixed monthly fee decided in a boardroom, while on the other side of that transaction, you are paying a dynamic, usage-driven price to a model provider that doesn’t care about your pricing page.&lt;/p&gt;

&lt;p&gt;A user who opens your product twice a month and one who runs complex queries for three hours a day pay you the same amount. They do not cost you the same amount.&lt;/p&gt;

&lt;p&gt;The gap between those two numbers isn’t an edge case to be managed — it is the fundamental structural risk of building a subscription business on top of a consumption-based cost model. As Sequoia Capital’s analysis highlights, the AI industry faces a $600 billion question around whether revenue can ever justify the infrastructure spend. You’ve sold certainty to your customer while absorbing all the variability yourself.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You’re not paying per query. You’re paying for every decision, retry, context window, and failure your product accumulates; the per-query figure is just where the math starts.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Start with context window growth. In a multi-turn conversation, each new response requires the model to process every prior token in the session. A 10-turn conversation doesn’t cost 10 times the price of a single turn; it costs closer to 55 times (the sum of 1 through 10), because each turn re-processes everything that came before. Product features designed around conversational depth have costs that escalate faster than engagement grows, not in proportion to it.&lt;/p&gt;
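&lt;p&gt;The arithmetic is easy to check with a toy model in which each turn contributes one unit of tokens and every turn re-sends all prior turns:&lt;/p&gt;

```python
# Quick check of the context-growth arithmetic: if every turn re-processes
# all prior turns, total tokens grow with the square of the turn count.
# Each turn is assumed to add one "unit" of tokens (a simplification).

def total_units(turns: int) -> int:
    # Turn k re-processes turns 1..k, so the total is 1 + 2 + ... + n.
    return sum(range(1, turns + 1))

print(total_units(1))   # 1: a single turn costs one unit
print(total_units(10))  # 55: a 10-turn conversation, not 10 units
```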

&lt;p&gt;Then consider the multiplier effect of making your product smarter. Add multi-step reasoning, tool use, or chained agents, and the multiplier compounds further. Research into agentic software engineering found that in multi-agent systems, iterative code review and refinement stages alone consumed nearly 60 per cent of all tokens in a task — not the generation, but the verification loops.&lt;/p&gt;

&lt;p&gt;The Reflexion architecture, which gives LLM agents the ability to reflect on and correct their own outputs across multiple trials, achieves impressive accuracy gains precisely because it runs multiple full inference passes per task. Each improvement in output quality is purchased with a corresponding increase in model API costs.&lt;/p&gt;

&lt;p&gt;A reasonable unit economics model makes the failure cost concrete. Consider a product with 1,000 daily user interactions, a 70 per cent success rate, and an average lifetime value of $200 per customer.&lt;/p&gt;

&lt;p&gt;The 300 daily failures each carry a recovery cost of at least one additional inference call, an escalation probability, and an amortised churn risk. Even conservative assumptions produce a total daily loss that frequently exceeds the entire inference budget. The cost per transaction you’re tracking is the visible part of a larger number.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Do You Calculate the True Cost of an AI Agent?
&lt;/h2&gt;

&lt;p&gt;There is a mathematical reality about agentic systems that is uncomfortable to confront in a board meeting: the more steps an agent takes, the more likely it is to fail, even when each individual step has a high probability of success.&lt;/p&gt;

&lt;p&gt;If an agent executes a ten-step task and achieves 85% accuracy at each step, the compound probability of a fully correct end-to-end outcome is approximately 19%. Four out of every five autonomous task completions produce a result that is wrong somewhere. The arithmetic is a function of sequential dependency, and it does not improve unless you shorten the chain.&lt;/p&gt;
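&lt;p&gt;The compound-reliability figure follows directly from raising per-step accuracy to the power of the step count:&lt;/p&gt;

```python
# The compound-reliability arithmetic from the text: end-to-end success is
# per-step accuracy raised to the number of sequentially dependent steps.

def end_to_end_success(per_step: float, steps: int) -> float:
    return per_step ** steps

p = end_to_end_success(0.85, 10)
print(f"{p:.1%}")  # prints 19.7%: roughly one fully correct outcome in five
```

&lt;p&gt;Shortening the chain is the only lever: at five steps the same 85% per-step accuracy yields about 44% end to end.&lt;/p&gt;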

&lt;p&gt;The true cost of an agentic system is expressed by this formula:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Expected Agentic ROI = (Task Value × Success Rate × Volume) − (Development Cost + Runtime Cost + Failure Cost)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The term most internal business cases leave blank is Failure Cost. When an agent fails in production, you incur the engineering labour required to diagnose and remediate, plus the business impact of lost customer value. An enterprise deployment processing 1,000 tickets per day at a 70% success rate generates 300 failures daily.&lt;/p&gt;

&lt;p&gt;At a conservative $10 per failure, the monthly failure cost reaches $90,000, often exceeding the compute budget. As McKinsey’s State of AI report notes, organisations that fail to account for these hidden costs are systematically underestimating their total cost of ownership.&lt;/p&gt;
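&lt;p&gt;Plugging the illustrative numbers above into the failure-cost term makes the gap concrete (the figures are the same assumed ones, not measurements):&lt;/p&gt;

```python
# The failure-cost term of the ROI formula, with the article's illustrative
# numbers: 1,000 interactions/day, 70% success, $10 per failure.

def monthly_failure_cost(daily_volume: int, success_rate: float,
                         cost_per_failure: float, days: int = 30) -> float:
    failures_per_day = daily_volume * (1 - success_rate)
    return failures_per_day * cost_per_failure * days

cost = monthly_failure_cost(1000, 0.70, 10.0)
print(round(cost))  # 90000: often larger than the compute budget itself
```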

&lt;p&gt;&lt;strong&gt;A demo that works 80 percent of the time is impressive. A production system that fails 20 percent of the time is useless.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  5 Proven Strategies to Reduce AI Agent Costs and Architect for Margin
&lt;/h2&gt;

&lt;p&gt;The AI cost structure described above is not fixed. It is simply the default you accept if you deploy without engineering the economics. You should treat unit economics as a first-class architectural concern from day one.&lt;/p&gt;

&lt;p&gt;When building cost-effective, production-ready AI agents for enterprise clients, we apply five core AI cost optimisation strategies to fundamentally alter the dollar-per-decision profile:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model Routing by Task Complexity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The costliest assumption in the industry is that every single step of a workflow requires a premium, frontier model. It doesn’t. You wouldn’t pay a senior executive to handle basic data entry, and you shouldn’t pay a frontier model to do it either.&lt;/p&gt;

&lt;p&gt;We design heterogeneous architectures that act as intelligent traffic controllers: they route complex, high-entropy planning to advanced models, but immediately delegate the execution of those plans to highly efficient, fine-tuned Small Language Models (SLMs).&lt;/p&gt;

&lt;p&gt;This approach isolates the cost of “expensive intelligence” only to the moments it is genuinely necessary, lowering execution costs by 10x to 30x for procedural, repetitive tasks without sacrificing output quality.&lt;/p&gt;
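&lt;p&gt;A routing layer can be sketched in a few lines. The tier names and the lookup-by-task-kind heuristic are invented for illustration; production routers typically use a learned classifier, as in the query-routing work cited in the references:&lt;/p&gt;

```python
# Illustrative sketch of complexity-based model routing. The tier names and
# the task-kind heuristic are invented for the example; real systems route
# with a learned classifier rather than a static table.

ROUTES = {
    "planning": "frontier-model",      # expensive, reserved for high-entropy work
    "extraction": "small-fine-tuned",  # cheap SLM for procedural steps
    "formatting": "small-fine-tuned",
}

def route(task_kind: str) -> str:
    # Default to the cheap tier; escalate only for kinds known to need it.
    return ROUTES.get(task_kind, "small-fine-tuned")

print(route("planning"))    # frontier-model
print(route("formatting"))  # small-fine-tuned
```

&lt;p&gt;The design point is the default: unknown work falls to the cheap tier, so "expensive intelligence" has to be opted into, not out of.&lt;/p&gt;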

&lt;p&gt;&lt;strong&gt;Temporal Scheduling &amp;amp; Compute Arbitrage&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not all agentic work is time-sensitive, yet default setups treat every request like an emergency. Heavy computational tasks — like end-of-day batch summarisation, large-scale data extraction, or automated inbox triaging — do not need sub-second latency. We architect systems that explicitly separate real-time user needs from asynchronous background work.&lt;/p&gt;

&lt;p&gt;By scheduling heavy processing during off-peak infrastructure hours and batching requests intelligently, we drastically reduce model API costs and prevent latency spikes for the users who actually need real-time responses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Constraining the Agent’s Latitude&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Planning capability is an incredible feature; unconstrained planning is a blank check. Without boundaries, agents will often fall down “rabbit holes,” exploring vast solution spaces and burning tokens in endless loops just to be thorough.&lt;/p&gt;

&lt;p&gt;We implement explicit step budgets, tight system guardrails, and hard termination conditions. An agent instructed to resolve a problem in three steps or fewer will often arrive at the exact same result as one told to “do whatever it takes,” but at a fraction of the cost per interaction. This ensures that your per-transaction costs remain predictable and strictly capped.&lt;/p&gt;
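&lt;p&gt;A hard step budget is simple to express. In this sketch, the step callable is a hypothetical stand-in for one model call; the point is the termination condition, which caps worst-case cost per interaction:&lt;/p&gt;

```python
# Sketch of a hard step budget around an agent loop. `call_agent_step` is a
# stand-in for one model call; the budget bounds the worst-case spend.

def run_with_budget(call_agent_step, max_steps: int = 3):
    for step in range(max_steps):
        result = call_agent_step(step)
        if result is not None:        # the agent produced an answer
            return result
    return "escalate-to-human"        # budget exhausted: fail closed, not open

# Toy step function that succeeds on the second call.
answers = [None, "fixed", None]
print(run_with_budget(lambda i: answers[i]))  # fixed
```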

&lt;p&gt;&lt;strong&gt;Prompt Engineering as Infrastructure&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Too many development teams treat prompt design as a quick launch prerequisite rather than core, scalable infrastructure. We treat prompts as highly optimised code. By implementing token-budget-aware reasoning, we mathematically force the model to be concise.&lt;/p&gt;

&lt;p&gt;Furthermore, we deploy semantic caching at the architectural level. If a customer asks a question today that is contextually similar to one asked yesterday, our system recognises the intent and serves the answer directly from a vector-embedded cache. This bypasses the model provider entirely, routinely slashing direct API costs by 50% to 70% in environments with recurring request patterns.&lt;/p&gt;
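&lt;p&gt;The cache logic itself is small; the hard part in production is embedding quality. In this toy sketch, a bag-of-words vector and an arbitrary 0.9 cosine threshold stand in for a real embedding model and vector store:&lt;/p&gt;

```python
# Minimal semantic-cache sketch. Real systems use learned embeddings and a
# vector database; here a toy bag-of-words vector, cosine similarity, and an
# arbitrary 0.9 threshold stand in for illustration.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

cache = []  # list of (embedding, answer) pairs

def answer(question: str, llm_call, threshold: float = 0.9):
    q = embed(question)
    for cached_q, cached_answer in cache:
        if cosine(q, cached_q) > threshold:
            return cached_answer      # semantic hit: the provider is bypassed
    result = llm_call(question)       # miss: pay for one real inference
    cache.append((q, result))
    return result

calls = []
def fake_llm(q):
    calls.append(q)
    return f"answer to: {q}"

answer("how do I reset my password", fake_llm)   # miss: one API call
answer("how do i reset my password", fake_llm)   # hit: served from cache
print(len(calls))  # 1
```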

&lt;p&gt;&lt;strong&gt;Difficulty-Aware Adaptive Reasoning&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We build automatic cognitive caps into the agent’s reasoning loop to prevent the system from overthinking. Informed by dual-process theories of cognition — distinguishing between rapid, intuitive responses and slow, deliberate analysis — we calibrate our architectures to allocate intensive planning resources only to tasks that actually warrant them.&lt;/p&gt;

&lt;p&gt;In AI reasoning, there is a strict point of diminishing returns where accuracy plateaus. We identify exactly where that plateau is for your specific business operations, ensuring you aren’t paying a premium for extra “thinking” that yields zero incremental correctness.&lt;/p&gt;

&lt;p&gt;As research on cost-efficient query routing demonstrates, matching model capability to task difficulty is one of the highest-leverage AI cost optimisation moves available.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Shinn, N., Cassano, F., Berman, E., Gopinath, A., Narasimhan, K., &amp;amp; Yao, S. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning. arXiv. &lt;a href="https://arxiv.org/abs/2303.11366" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2303.11366&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Chen, L., Zaharia, M., &amp;amp; Zou, J. (2023). FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance. arXiv. &lt;a href="https://arxiv.org/abs/2305.05176" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2305.05176&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Ding, D., Mallick, A., Wang, C., Sim, R., Mukherjee, S., Ruhle, V., Lakshmanan, L.V.S., &amp;amp; Awadallah, A.H. (2024). Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing. ICLR 2024. &lt;a href="https://arxiv.org/abs/2404.14618" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2404.14618&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Ong, I., Almahairi, A., Wu, V., Chiang, W.-L., Wu, T., Gonzalez, J.E., Kadous, M.W., &amp;amp; Stoica, I. (2024). RouteLLM: Learning to Route LLMs with Preference Data. ICLR 2025. &lt;a href="https://arxiv.org/abs/2406.18665" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2406.18665&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Regmi, S. &amp;amp; Pun, C.P. (2024). GPT Semantic Cache: Reducing LLM Costs and Latency via Semantic Embedding Caching. arXiv. &lt;a href="https://arxiv.org/abs/2411.05276" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2411.05276&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Salim, M., Latendresse, J., Khatoonabadi, S.H., &amp;amp; Shihab, E. (2026). Tokenomics: Quantifying Where Tokens Are Used in Agentic Software Engineering. arXiv. &lt;a href="https://arxiv.org/abs/2601.14470" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2601.14470&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Singla, A., Sukharevsky, A., Yee, L. et al. (2025). The State of AI: How Organizations Are Rewiring to Capture Value. McKinsey &amp;amp; Company / QuantumBlack. &lt;a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-how-organizations-are-rewiring-to-capture-value" rel="noopener noreferrer"&gt;https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-how-organizations-are-rewiring-to-capture-value&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Cahn, D. (2024). AI’s $600B Question. Sequoia Capital. &lt;a href="https://sequoiacap.com/article/ais-600b-question/" rel="noopener noreferrer"&gt;https://sequoiacap.com/article/ais-600b-question/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Jaipuria, T. (2025). The State of AI Gross Margins in 2025. Tanay Jaipuria’s Substack. &lt;a href="https://www.tanayj.com/p/the-gross-margin-debate-in-ai" rel="noopener noreferrer"&gt;https://www.tanayj.com/p/the-gross-margin-debate-in-ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Kappelhoff, K. (2025). Unit Economics for AI SaaS Companies: A Survival Guide for CFOs. Drivetrain.ai. &lt;a href="https://www.drivetrain.ai/post/unit-economics-of-ai-saas-companies-cfo-guide-for-managing-token-based-costs-and-margins" rel="noopener noreferrer"&gt;https://www.drivetrain.ai/post/unit-economics-of-ai-saas-companies-cfo-guide-for-managing-token-based-costs-and-margins&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Casado, M. &amp;amp; Wang, S. (2023). The Economic Case for Generative AI and Foundation Models. Andreessen Horowitz. &lt;a href="https://a16z.com/the-economic-case-for-generative-ai-and-foundation-models/" rel="noopener noreferrer"&gt;https://a16z.com/the-economic-case-for-generative-ai-and-foundation-models/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Anthropic. (2024). Introducing the Message Batches API. Anthropic Blog. &lt;a href="https://claude.com/blog/message-batches-api" rel="noopener noreferrer"&gt;https://claude.com/blog/message-batches-api&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Friedman, D. (2025). AI Startups Are SaaS Minus the Margins. Substack. &lt;a href="https://davefriedman.substack.com/p/ai-startups-are-saas-minus-the-margins" rel="noopener noreferrer"&gt;https://davefriedman.substack.com/p/ai-startups-are-saas-minus-the-margins&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Chaddha, N. (2025). Why AI Margins Matter More Than You Think. Mayfield Fund. &lt;a href="https://www.mayfield.com/why-ai-margins-matter-more-than-you-think/" rel="noopener noreferrer"&gt;https://www.mayfield.com/why-ai-margins-matter-more-than-you-think/&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>ai</category>
      <category>agents</category>
    </item>
    <item>
      <title>Configuring My Site for AI Discoverability</title>
      <dc:creator>Dennis Morello</dc:creator>
      <pubDate>Tue, 21 Apr 2026 08:02:58 +0000</pubDate>
      <link>https://forem.com/morellodev/configuring-my-site-for-ai-discoverability-1j38</link>
      <guid>https://forem.com/morellodev/configuring-my-site-for-ai-discoverability-1j38</guid>
      <description>&lt;p&gt;A growing share of web traffic doesn't come from people anymore. It comes from models reading on their behalf. ChatGPT, Claude, Perplexity, Copilot. They fetch a handful of pages, summarize, and ship the answer back. If your site isn't readable by those agents, you don't exist to them.&lt;/p&gt;

&lt;p&gt;People are calling this &lt;a href="https://wikipedia.org/wiki/Generative_engine_optimization" rel="noopener noreferrer"&gt;GEO&lt;/a&gt;, short for Generative Engine Optimization. It overlaps with SEO but the priorities are different. Agents don't care about your layout. They care about your prose, your metadata, and how many tokens it costs them to read you.&lt;/p&gt;

&lt;p&gt;This post covers how I configured this site for GEO. The first half is framework-agnostic. The second half is specific to my setup on Cloudflare, and includes a deliberate choice that fails a popular GEO audit. I'll explain why.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 1: general GEO techniques
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Serve raw Markdown alongside HTML
&lt;/h3&gt;

&lt;p&gt;The single biggest GEO win is giving agents a version of each page without the navigation, styling, and scripts. HTML is designed for browsers. Markdown is designed for readers, human or otherwise. Agents spend their context window on your prose, not your DOM.&lt;/p&gt;

&lt;p&gt;Every blog post on this site has a mirror URL with a &lt;code&gt;.md&lt;/code&gt; suffix:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/blog/my-post&lt;/code&gt; is the full HTML page for humans&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/blog/my-post.md&lt;/code&gt; is the raw Markdown, served as &lt;code&gt;text/markdown&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In Astro, this is a short route at &lt;code&gt;src/pages/blog/[slug].md.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;GET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;post&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getPostById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;formatPostMarkdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text/markdown; charset=utf-8&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both variants are pre-generated at build time. Same content, &lt;strong&gt;roughly half the tokens&lt;/strong&gt; for an agent to consume.&lt;/p&gt;

&lt;h3&gt;
  
  
  Advertise the Markdown version in &lt;code&gt;&amp;lt;head&amp;gt;&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Agents landing on the HTML need to know the Markdown exists. A single &lt;code&gt;&amp;lt;link&amp;gt;&lt;/code&gt; in the head does it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;"alternate"&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"text/markdown"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"/blog/my-post.md"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Browsers ignore this tag. Agents that parse the head follow it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Publish an &lt;code&gt;llms.txt&lt;/code&gt; index
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://llmstxt.org/" rel="noopener noreferrer"&gt;&lt;code&gt;llms.txt&lt;/code&gt;&lt;/a&gt; is a convention for a Markdown file at the root of your site listing your content with short descriptions and links. Think of it as a sitemap an LLM can actually read.&lt;/p&gt;

&lt;p&gt;I ship two variants:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/llms.txt&lt;/code&gt; is the index. Title, description, one line per post with a link to its &lt;code&gt;.md&lt;/code&gt; version.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/llms-full.txt&lt;/code&gt; is the full corpus. Every post body concatenated into a single response.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why both? An agent researching a specific topic can fetch &lt;code&gt;llms.txt&lt;/code&gt;, pick the relevant links, and pull them. An agent doing deep research on the site as a whole fetches &lt;code&gt;llms-full.txt&lt;/code&gt; once and has everything it needs in one request. Either way there's no crawling.&lt;/p&gt;
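&lt;p&gt;For reference, the index variant follows the shape described at llmstxt.org: an H1 title, a one-line blockquote summary, and sections of links with short descriptions. A minimal sketch, with illustrative titles, paths, and descriptions:&lt;/p&gt;

```markdown
# morello.dev

> Dennis Morello's blog on web development and developer tooling.

## Posts

- [Configuring My Site for AI Discoverability](/blog/ai-discoverability.md): serving Markdown mirrors, llms.txt, and Content-Signal
- [Another Post](/blog/another-post.md): a one-line description an agent can triage by
```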

&lt;h3&gt;
  
  
  Declare your AI stance in &lt;code&gt;robots.txt&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;robots.txt&lt;/code&gt; now carries a &lt;code&gt;Content-Signal&lt;/code&gt; directive for AI use. Mine reads:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User-agent: *
Content-Signal: search=yes, ai-train=no, ai-input=yes
Allow: /
Sitemap: https://morello.dev/sitemap-index.xml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three independent knobs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;search=yes&lt;/code&gt; lets search engines index&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ai-train=no&lt;/code&gt; says my content is not for training data&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ai-input=yes&lt;/code&gt; says my content &lt;em&gt;can&lt;/em&gt; be retrieved and used as input for AI answers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the stance I'm comfortable with. I want to show up when someone asks Claude about something I've written; I just don't want my posts absorbed into the next base model.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Whether any given operator actually honors this is another question. The signal's there regardless, and I'd rather be on record than silent about it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Add structured data that actually describes the content
&lt;/h3&gt;

&lt;p&gt;Most blogs ship JSON-LD schema by reflex. Few of them include the fields that help a generative engine decide whether your article is worth fetching.&lt;/p&gt;

&lt;p&gt;On each post I emit a &lt;code&gt;BlogPosting&lt;/code&gt; graph with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;wordCount&lt;/code&gt; and &lt;code&gt;timeRequired&lt;/code&gt; (ISO 8601 duration), so an agent can estimate how much context it'll spend before fetching&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;articleBody&lt;/code&gt;, the full text in machine-readable form, with no HTML parsing required&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;author&lt;/code&gt; linked to a &lt;code&gt;Person&lt;/code&gt; node with &lt;code&gt;knowsAbout&lt;/code&gt; so the entity is grounded in real topics&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;BreadcrumbList&lt;/code&gt; for site hierarchy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of it goes into a single &lt;code&gt;@graph&lt;/code&gt; per page rather than scattered &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt; tags, which makes it cheaper for an engine to walk from post to author to site without cross-referencing.&lt;/p&gt;
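&lt;p&gt;Condensed, the per-page &lt;code&gt;@graph&lt;/code&gt; looks roughly like this — the values are placeholders and the real graph carries more fields, but the shape is the point:&lt;/p&gt;

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "BlogPosting",
      "@id": "https://morello.dev/blog/my-post#article",
      "headline": "My Post",
      "wordCount": 1800,
      "timeRequired": "PT8M",
      "articleBody": "Full plain-text body of the post...",
      "author": { "@id": "https://morello.dev/#person" }
    },
    {
      "@type": "Person",
      "@id": "https://morello.dev/#person",
      "name": "Author Name",
      "knowsAbout": ["Web development", "Edge infrastructure"]
    },
    {
      "@type": "BreadcrumbList",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Blog", "item": "https://morello.dev/blog" },
        { "@type": "ListItem", "position": 2, "name": "My Post", "item": "https://morello.dev/blog/my-post" }
      ]
    }
  ]
}
```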

&lt;h3&gt;
  
  
  A sitemap that actually tracks freshness
&lt;/h3&gt;

&lt;p&gt;If you regenerate your sitemap once and never look at it again, you're wasting a signal. Every URL in mine carries a &lt;code&gt;lastmod&lt;/code&gt; timestamp pulled from the post's &lt;code&gt;updatedDate&lt;/code&gt; frontmatter, falling back to &lt;code&gt;pubDate&lt;/code&gt;. When I edit an old post, its &lt;code&gt;lastmod&lt;/code&gt; moves forward and crawlers reprioritize it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Validate with real tools
&lt;/h3&gt;

&lt;p&gt;Two tools I found useful while iterating on all of the above:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://isitagentready.com/" rel="noopener noreferrer"&gt;isitagentready.com&lt;/a&gt; audits across five categories: discoverability, content accessibility, bot access control, protocol discovery, and commerce. The bot access control checks (&lt;code&gt;Content-Signal&lt;/code&gt;, Web Bot Auth, AI bot rules) are the part that actually influences how agents treat your content.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://acceptmarkdown.com/" rel="noopener noreferrer"&gt;acceptmarkdown.com&lt;/a&gt; has a narrower focus. It checks whether your site responds to &lt;code&gt;Accept: text/markdown&lt;/code&gt; with a Markdown body, includes &lt;code&gt;Vary: Accept&lt;/code&gt;, returns &lt;code&gt;406&lt;/code&gt; for unsupported types, and parses q-values correctly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'll come back to the second one at the end of the post, because my site deliberately fails it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 2: the Cloudflare-specific setup
&lt;/h2&gt;

&lt;p&gt;General GEO gets you most of the way there. The rest is delivery: how fast you respond, whether the edge caches correctly, and how you advertise your agent-facing resources without waiting for someone to parse your HTML.&lt;/p&gt;

&lt;h3&gt;
  
  
  Static assets, zero Worker invocations
&lt;/h3&gt;

&lt;p&gt;My &lt;code&gt;wrangler.jsonc&lt;/code&gt; points &lt;a href="https://developers.cloudflare.com/workers/static-assets/" rel="noopener noreferrer"&gt;Cloudflare's assets deployment&lt;/a&gt; at the &lt;code&gt;./dist&lt;/code&gt; directory, with no &lt;code&gt;main&lt;/code&gt; entry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json-doc"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"morellodev"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"compatibility_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-18"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"assets"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"directory"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"./dist"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"html_handling"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"drop-trailing-slash"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"not_found_handling"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"404-page"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every request is served straight from the edge asset cache. HTML, Markdown, &lt;code&gt;llms.txt&lt;/code&gt;, sitemap, RSS. Same path for all of them, and no Worker ever runs. On the Workers Free tier this matters. A crawler sweep that would otherwise eat into 100k daily invocations now costs me nothing. Agents, for better or worse, don't crawl politely.&lt;/p&gt;

&lt;h3&gt;
  
  
  Advertise discovery endpoints in a &lt;code&gt;Link&lt;/code&gt; header
&lt;/h3&gt;

&lt;p&gt;Cloudflare's &lt;a href="https://developers.cloudflare.com/workers/static-assets/headers/" rel="noopener noreferrer"&gt;&lt;code&gt;_headers&lt;/code&gt; file&lt;/a&gt; lets you ship response headers without any server code. I use it to tell every response, not just HTML ones, where the agent-facing files live:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/*
  Link: &amp;lt;/sitemap-index.xml&amp;gt;; rel="sitemap",
        &amp;lt;/rss.xml&amp;gt;; rel="alternate"; type="application/rss+xml"; title="RSS",
        &amp;lt;/llms.txt&amp;gt;; rel="describedby"; type="text/plain",
        &amp;lt;/llms-full.txt&amp;gt;; rel="describedby"; type="text/plain"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A crawler doing a &lt;code&gt;HEAD&lt;/code&gt; against any URL on the site sees all four links before it parses a single byte of HTML. &lt;strong&gt;One round-trip, no body, full discovery.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Long-lived cache for hashed assets
&lt;/h3&gt;

&lt;p&gt;Astro emits fingerprinted filenames under &lt;code&gt;/_astro/&lt;/code&gt;, so those can sit in cache for a year:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/_astro/*
  Cache-Control: public, max-age=31536000, immutable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Faster first paint for humans, cheaper crawls for agents. Same lever.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why I skipped &lt;code&gt;Accept: text/markdown&lt;/code&gt; content negotiation
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://acceptmarkdown.com/" rel="noopener noreferrer"&gt;acceptmarkdown.com&lt;/a&gt; will tell you this site doesn't do content negotiation. No &lt;code&gt;Vary: Accept&lt;/code&gt;, no &lt;code&gt;406&lt;/code&gt;, no Markdown from the canonical URL. That's not an oversight. I tried it, shipped it briefly, and rolled it back.&lt;/p&gt;

&lt;p&gt;The reason is Cloudflare's free plan. Custom cache keys are Enterprise-only, and &lt;a href="https://developers.cloudflare.com/cache/concepts/cache-control/" rel="noopener noreferrer"&gt;their docs are explicit&lt;/a&gt; that &lt;code&gt;Vary: Accept&lt;/code&gt; is ignored for caching decisions. The edge collapses every variant of &lt;code&gt;/blog/my-post&lt;/code&gt; into one cache entry, so the first requester's format &lt;strong&gt;poisons the cache for everyone else&lt;/strong&gt; until TTL expires.&lt;/p&gt;
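&lt;p&gt;The failure mode is easy to simulate. This sketch models an edge cache that, like the free tier, keys responses by URL alone and ignores &lt;code&gt;Accept&lt;/code&gt; — the paths and bodies are illustrative:&lt;/p&gt;

```python
# Toy model of a Vary-blind edge cache: responses are keyed by URL only,
# so the first requester's negotiated format wins for everyone.
cache = {}  # URL -> cached response body

def origin(path, accept):
    """Origin that honors Accept-based content negotiation."""
    if accept == "text/markdown":
        return "markdown body of " + path
    return "html body of " + path

def edge_fetch(path, accept):
    """Edge lookup that ignores Accept when building the cache key."""
    if path not in cache:
        cache[path] = origin(path, accept)  # first variant is stored
    return cache[path]

# An agent requests Markdown first...
first = edge_fetch("/blog/my-post", "text/markdown")
# ...and a browser asking for HTML now gets Markdown until the TTL expires.
second = edge_fetch("/blog/my-post", "text/html")
assert second == "markdown body of /blog/my-post"
```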

&lt;p&gt;The workaround is a Worker that bypasses the edge cache. But now every &lt;code&gt;/blog/*&lt;/code&gt; request burns a Worker invocation, humans included, and the &lt;a href="https://developers.cloudflare.com/workers/platform/pricing/" rel="noopener noreferrer"&gt;Workers Free plan&lt;/a&gt; gives you 100k per day and 10ms of CPU each. That's a real budget to share across humans and bots, for no functional gain over a static &lt;code&gt;.md&lt;/code&gt; URL.&lt;/p&gt;

&lt;p&gt;So I deleted the Worker. The only thing I lost is &lt;code&gt;curl -H "Accept: text/markdown" …/blog/my-post&lt;/code&gt; returning Markdown. Between &lt;code&gt;llms.txt&lt;/code&gt;, &lt;code&gt;&amp;lt;link rel="alternate"&amp;gt;&lt;/code&gt;, and the &lt;code&gt;/blog/[slug].md&lt;/code&gt; convention, no mainstream agent I've seen actually needs &lt;code&gt;Accept:&lt;/code&gt; negotiation. It's the more elegant protocol; alternate URLs are the more robust one on a free-tier CDN. On a paid plan I'd probably do both.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this leaves things
&lt;/h2&gt;

&lt;p&gt;Every page exists in two forms, both served from the edge. Agent-facing resources are advertised in response headers on every request, before any HTML gets parsed. Structured data tells engines what the article is and how much context it takes to read. &lt;code&gt;robots.txt&lt;/code&gt; says what I'll allow and what I won't.&lt;/p&gt;

&lt;p&gt;GEO is still very new. The standards are half-drafted, the tools disagree with each other, and half the signals I described above didn't exist two years ago. I fully expect to be rewriting parts of this post within six months, probably with a different opinion about Accept-based negotiation, once I've either moved off the free plan or found a workaround that doesn't involve a Worker. But for now: serve agents a version they can cheaply consume, be explicit about what you'll allow, and accept that the defaults aren't on your side.&lt;/p&gt;

&lt;p&gt;If you're reading this via a summary from some assistant, hi. Thanks for the traffic.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>seo</category>
      <category>llm</category>
    </item>
    <item>
      <title>Less Human AI Agents, Please!</title>
      <dc:creator>Mariano Gobea Alcoba</dc:creator>
      <pubDate>Tue, 21 Apr 2026 08:01:31 +0000</pubDate>
      <link>https://forem.com/mgobea/less-human-ai-agents-please-1d4f</link>
      <guid>https://forem.com/mgobea/less-human-ai-agents-please-1d4f</guid>
      <description>&lt;h2&gt;
  
  
  The Uncanny Valley of AI Agent Interaction: Beyond Human Mimicry
&lt;/h2&gt;

&lt;p&gt;The burgeoning field of AI agents, designed to autonomously perform tasks and interact with users, presents a complex design challenge. A prevalent tendency is to imbue these agents with human-like characteristics, language, and even personality traits. While seemingly intuitive, this approach often leads to an undesirable outcome: the "uncanny valley" of human-AI interaction. This article delves into the technical and user experience implications of this human-centric design philosophy and explores alternative, more effective paradigms for AI agent development.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Allure and Peril of Anthropomorphism
&lt;/h3&gt;

&lt;p&gt;Anthropomorphism, the attribution of human characteristics to non-human entities, is a deeply ingrained cognitive bias. In the context of AI, this manifests as designing agents that speak, reason, and behave as closely to humans as possible. The motivations for this are varied:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Familiarity and Ease of Use:&lt;/strong&gt; Users are inherently familiar with human communication and interaction patterns. Designing AI agents that mirror these patterns can, in theory, reduce the learning curve and make adoption smoother.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Emotional Connection and Trust:&lt;/strong&gt; Some believe that a more "human" agent can foster greater trust and a sense of connection with the user, leading to more positive user experiences.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Simulating Human Capabilities:&lt;/strong&gt; The ultimate goal for many AI agents is to replicate or surpass human performance in specific tasks. This often leads to designing agents that think and communicate in ways that mimic human cognitive processes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, this pursuit of human likeness is fraught with peril. When an AI agent &lt;em&gt;almost&lt;/em&gt; succeeds at mimicking human behavior but falls short in subtle yet crucial ways, it can evoke feelings of unease, creepiness, or even revulsion. This is the AI equivalent of the uncanny valley, first described by roboticist Masahiro Mori in relation to humanoid robots.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Technical Manifestations of the Uncanny Valley:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Linguistic Inconsistencies:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Overly Formal or Stilted Language:&lt;/strong&gt; While aiming for politeness, agents might use phrasing that is grammatically correct but unnatural in spoken conversation.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Inappropriate Tone:&lt;/strong&gt; An agent attempting empathy might produce responses that feel hollow, insincere, or misaligned with the user's emotional state.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Repetitive Phrasing:&lt;/strong&gt; Limited generative capacity can lead to predictable and repetitive conversational patterns, signaling the artificial nature of the agent.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Misinterpretation of Nuance:&lt;/strong&gt; Sarcasm, irony, humor, and colloquialisms are notoriously difficult for AI to grasp. A failed attempt to engage with these can be jarring.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;  &lt;strong&gt;Behavioral Discrepancies:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Lack of True Agency:&lt;/strong&gt; Agents that claim to "understand" or "feel" but then act purely based on deterministic logic create a disconnect.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Inconsistent Persona:&lt;/strong&gt; An agent that fluctuates between being overly casual and then strictly professional can be disorienting.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Unrealistic Pacing:&lt;/strong&gt; Immediate responses to complex queries can feel unnatural, as humans typically require time to process information. Conversely, overly long pauses can also break the flow.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Failure to Adapt to Context:&lt;/strong&gt; An agent that forgets previous turns in a conversation or fails to acknowledge evolving user needs demonstrates a lack of true intelligence and makes the "human" facade crumble.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;  &lt;strong&gt;Task Performance Mismatch:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Over-promising and Under-delivering:&lt;/strong&gt; An agent that uses human-like language to suggest it can perform complex reasoning but then fails to do so effectively highlights its limitations.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Misaligned Expectations:&lt;/strong&gt; Users might expect the emotional intelligence or common sense reasoning of a human, which current AI agents generally lack.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Case for "Less Human" AI Agents
&lt;/h3&gt;

&lt;p&gt;Instead of striving for human mimicry, a more effective approach might be to design AI agents that embrace their artificial nature. This paradigm shift focuses on transparency, efficiency, and clarity of purpose, rather than a flawed attempt at emulation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Principles of "Less Human" AI Agents:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Transparency and Honesty:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Clearly State AI Identity:&lt;/strong&gt; The agent should explicitly identify itself as an AI. There should be no ambiguity.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Acknowledge Limitations:&lt;/strong&gt; Instead of trying to bluff its way through, the agent should be programmed to admit when it doesn't know something, can't perform a task, or requires human intervention.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Explain Capabilities and Purpose:&lt;/strong&gt; Users should understand what the agent &lt;em&gt;can&lt;/em&gt; do and why it exists. This sets realistic expectations.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Efficiency and Directness:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Focus on Task Completion:&lt;/strong&gt; The primary goal of an AI agent is to efficiently and accurately perform its designated tasks. Human-like chit-chat or personality embellishments can be distractions.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Precise Language:&lt;/strong&gt; Use clear, unambiguous language. Avoid jargon where possible, but prioritize accuracy and conciseness over conversational filler.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Structured Interaction:&lt;/strong&gt; For complex tasks, a more structured, form-based, or step-by-step interaction might be more efficient than an open-ended conversation.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Predictability and Reliability:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Consistent Behavior:&lt;/strong&gt; The agent's responses and actions should be predictable based on its programming and the input it receives. This builds trust through reliability.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Defined Scope:&lt;/strong&gt; Clearly defined operational boundaries prevent unexpected or undesirable behavior.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Functional Design:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;User Interface (UI) and User Experience (UX) Driven by Function:&lt;/strong&gt; The interface and interaction flow should be optimized for task completion, not for mimicking human conversation. This might involve dashboards, clear forms, and direct controls rather than free-form text input.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Error Handling as a Feature:&lt;/strong&gt; Robust error handling, with clear explanations and actionable steps, is more valuable than an apology that rings hollow.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
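&lt;p&gt;A minimal sketch of principle 1 in code: the agent states its identity, works from an explicit capability list, and refuses cleanly outside it. The capability names and phrasing are illustrative, not a real framework:&lt;/p&gt;

```python
# Transparency by construction: explicit AI identity, explicit scope,
# and an honest refusal instead of a bluff. Names are illustrative.
CAPABILITIES = {"check_order_status", "reorder_item"}

def respond(intent: str) -> str:
    if intent not in CAPABILITIES:
        # Acknowledge the limitation rather than improvising an answer.
        return ("I am an automated assistant. I can check order status or "
                "reorder items. I cannot help with that request.")
    return "I am an automated assistant. Handling: " + intent + "."

print(respond("cancel_subscription"))
```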

&lt;h3&gt;
  
  
  Technical Implementation Strategies
&lt;/h3&gt;

&lt;p&gt;Adopting a "less human" approach doesn't mean creating robotic, unfriendly interfaces. It means prioritizing functional excellence and transparency in design and implementation.&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Communication Protocols and Language Models
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Intent Recognition and Slot Filling:&lt;/strong&gt; For task-oriented agents, sophisticated Natural Language Understanding (NLU) models focusing on intent recognition and slot filling are crucial. These models should be trained to extract specific information rather than engaging in broad conversational discourse.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example using a hypothetical NLU library
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;nlu_service&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;NLUClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NLUClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;user_utterance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I want to book a flight from London to New York for two people next Tuesday.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_utterance&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Expected output focuses on structured data extraction
# {
#     "intent": "book_flight",
#     "slots": {
#         "origin": "London",
#         "destination": "New York",
#         "passengers": 2,
#         "date": "next Tuesday"
#     }
# }
&lt;/span&gt;
&lt;span class="c1"&gt;# The agent then uses these structured slots to query a booking system.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Controlled Generative Models:&lt;/strong&gt; If generative capabilities are needed, they should be carefully constrained. Fine-tuning Large Language Models (LLMs) on specific, task-oriented dialogue datasets can produce helpful, concise responses without venturing into overly human-like or speculative language. Techniques like Reinforcement Learning from Human Feedback (RLHF) can be used to steer generation towards helpfulness and factual accuracy, rather than "humanness."&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Hypothetical example of constrained generation
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llm_service&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LLMClient&lt;/span&gt;

&lt;span class="n"&gt;llm_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LLMClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task_oriented_model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
User Request: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the status of my order #12345?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;

System Instruction: Respond concisely with factual information only.
If information is unavailable, state &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Information not available.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;
Do not speculate or offer apologies.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Expected response: "Order #12345 is currently in transit. Estimated delivery: 2023-10-27."
# Or: "Information for order #12345 is not available."
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Explicit AI Identification:&lt;/strong&gt; The system should prepend or append clear disclaimers.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_ai_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;core_response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;prefix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;System AI: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;core_response&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;user_query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Book a meeting with John Doe tomorrow at 2 PM.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="c1"&gt;# ... logic to process query and find availability ...
&lt;/span&gt;&lt;span class="n"&gt;meeting_details&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Meeting with John Doe scheduled for tomorrow at 2 PM.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;generate_ai_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;meeting_details&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;# Output: System AI: Meeting with John Doe scheduled for tomorrow at 2 PM.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  2. State Management and Context Handling
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Session State:&lt;/strong&gt; Maintain a clear, explicit representation of the conversation state. This includes recognized intents, extracted slots, user preferences, and task progress.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Contextual Awareness:&lt;/strong&gt; The agent needs to understand the immediate context of the current turn as well as relevant historical context from the session. However, this context should be used to inform task execution, not to build a "personality."&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ConversationState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_intent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;slots&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_progress&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="c1"&gt;# Limited history relevant to task
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_state&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;intent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new_slots&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_intent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;intent&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;slots&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_slots&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;intent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;intent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slots&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;new_slots&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="c1"&gt;# Logic to advance task progress based on intent and slots
&lt;/span&gt;
&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ConversationState&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# User says: "I need to reorder my usual coffee."
# NLU identifies intent="reorder_item", slots={"item": "usual coffee"}
&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_state&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reorder_item&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;item&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;usual coffee&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="c1"&gt;# Agent uses state.slots["item"] to query order history.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  3. Error Handling and Fallback Strategies
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Informative Error Messages:&lt;/strong&gt; When an error occurs, the agent should provide a clear explanation of what went wrong and, if possible, suggest concrete next steps.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_booking_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;error_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slot_missing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;missing_slot&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;missing_slot&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;required information&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I cannot proceed without &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;missing_slot&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. Please provide it.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;error_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api_failure&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;An internal error occurred while processing your request. Please try again later.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;An unexpected error occurred. Please contact support.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Agent encounters an error
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;handle_booking_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slot_missing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;missing_slot&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;departure date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}))&lt;/span&gt;
&lt;span class="c1"&gt;# Output: I cannot proceed without departure date. Please provide it.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graceful Degradation:&lt;/strong&gt; If an agent cannot fulfill a request, it should offer alternatives or clearly state its inability to help, rather than generating nonsensical or misleading information.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_unfulfillable_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Check against agent's capabilities
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;agent_can_handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I am designed to assist with [specific tasks]. I cannot help with &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;This request cannot be fulfilled at this time.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;handle_unfulfillable_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze my company&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s stock market trends for the next decade.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;# Output: I am designed to assist with booking appointments and sending reminders. I cannot help with 'Analyze my company's stock market trends for the next decade.'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  4. User Interface Design for Clarity
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Visual Cues:&lt;/strong&gt; Use UI elements that clearly indicate the agent's function and status. Progress indicators, clear labels, and distinct input/output areas can be more effective than chat bubbles.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Structured Input:&lt;/strong&gt; For complex data entry, use forms, dropdowns, calendars, and other structured input fields instead of relying solely on natural language. This reduces ambiguity and ensures all necessary information is captured.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Actionable Output:&lt;/strong&gt; Present information and results in a clear, organized, and actionable manner. Buttons for confirmation, links to further information, or summaries of actions taken are beneficial.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- Example of a structured UI element for booking --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"booking-form"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;h3&amp;gt;&lt;/span&gt;Flight Booking&lt;span class="nt"&gt;&amp;lt;/h3&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;label&lt;/span&gt; &lt;span class="na"&gt;for=&lt;/span&gt;&lt;span class="s"&gt;"origin"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Origin:&lt;span class="nt"&gt;&amp;lt;/label&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"origin"&lt;/span&gt; &lt;span class="na"&gt;placeholder=&lt;/span&gt;&lt;span class="s"&gt;"e.g., London"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

    &lt;span class="nt"&gt;&amp;lt;label&lt;/span&gt; &lt;span class="na"&gt;for=&lt;/span&gt;&lt;span class="s"&gt;"destination"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Destination:&lt;span class="nt"&gt;&amp;lt;/label&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"destination"&lt;/span&gt; &lt;span class="na"&gt;placeholder=&lt;/span&gt;&lt;span class="s"&gt;"e.g., New York"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

    &lt;span class="nt"&gt;&amp;lt;label&lt;/span&gt; &lt;span class="na"&gt;for=&lt;/span&gt;&lt;span class="s"&gt;"departure-date"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Departure Date:&lt;span class="nt"&gt;&amp;lt;/label&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"date"&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"departure-date"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

    &lt;span class="nt"&gt;&amp;lt;button&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"search-flights"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Search Flights&lt;span class="nt"&gt;&amp;lt;/button&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Benefits of a Functionalist Approach
&lt;/h3&gt;

&lt;p&gt;Moving away from the pursuit of human-like interaction offers several advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Reduced User Frustration:&lt;/strong&gt; By setting realistic expectations and providing clear, efficient interactions, users are less likely to be frustrated by an agent's perceived shortcomings.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Increased Trust and Reliability:&lt;/strong&gt; An agent that is honest about its capabilities and consistently performs its functions accurately builds more genuine trust than one that fakes empathy or understanding.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Improved Efficiency:&lt;/strong&gt; Focusing on task completion rather than conversational pleasantries can lead to faster and more direct resolution of user needs.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Scalability:&lt;/strong&gt; Functionalist agents are often easier to scale and maintain, as their behavior is more predictable and less dependent on the nuances of human language and emotion.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Ethical Considerations:&lt;/strong&gt; Avoiding the creation of artificial "personalities" can mitigate concerns around emotional manipulation and the blurring of lines between human and machine relationships.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Conclusion: Embracing Artificiality
&lt;/h3&gt;

&lt;p&gt;The quest to make AI agents "less human" is not about creating cold, unfeeling interfaces. It is about a pragmatic recognition of current AI capabilities and a user-centered design philosophy that prioritizes clarity, efficiency, and honesty. By embracing the artificial nature of these agents, developers can build systems that are more reliable, trustworthy, and ultimately more helpful to users. The uncanny valley of human mimicry is a trap that can be avoided by focusing on what AI agents do best: process information, execute tasks, and communicate results with precision and transparency.&lt;/p&gt;

&lt;p&gt;We invite you to explore further advancements and discuss these principles in the context of your own projects. For expert guidance and consulting services in AI agent development and conversational interface design, please visit &lt;a href="https://www.mgatc.com" rel="noopener noreferrer"&gt;https://www.mgatc.com&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published in Spanish at &lt;a href="https://www.mgatc.com/blog/less-human-ai-agents-please/" rel="noopener noreferrer"&gt;www.mgatc.com/blog/less-human-ai-agents-please/&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ia</category>
      <category>agentesdeia</category>
      <category>interaccinhumanoia</category>
      <category>diseodeia</category>
    </item>
    <item>
      <title>We open sourced our Unity MCP server</title>
      <dc:creator>Daniel Fang (Glade)</dc:creator>
      <pubDate>Tue, 21 Apr 2026 08:01:05 +0000</pubDate>
      <link>https://forem.com/daniel_glade/we-open-sourced-our-unity-mcp-server-4i0l</link>
      <guid>https://forem.com/daniel_glade/we-open-sourced-our-unity-mcp-server-4i0l</guid>
      <description>&lt;p&gt;Many “AI for game dev” tools still stop at code generation.&lt;/p&gt;

&lt;p&gt;They can suggest a script, maybe explain an error, maybe even produce something close to what you want. But in actual Unity workflows, that is usually only a small part of the job.&lt;/p&gt;

&lt;p&gt;The real work is spread across scene hierarchy, prefabs, materials, UI, physics, animation, input setup, package differences, console errors, project conventions, and lots of repetitive editor actions.&lt;/p&gt;

&lt;p&gt;That gap is exactly why we built GladeKit.&lt;/p&gt;

&lt;p&gt;Today, we’re doing two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Launching GladeKit officially (see &lt;a href="https://www.producthunt.com/products/gladekit?launch=gladekit" rel="noopener noreferrer"&gt;Product Hunt&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Open sourcing the &lt;a href="https://github.com/Glade-tool/glade-mcp-unity" rel="noopener noreferrer"&gt;GladeKit Unity MCP server&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  GladeKit Unity MCP
&lt;/h2&gt;

&lt;p&gt;The open-source MCP server connects AI clients like Cursor, Claude Code, and Windsurf directly to the Unity Editor.&lt;/p&gt;

&lt;p&gt;That means the model is not just chatting about your game in the abstract. It can actually operate with real Unity context.&lt;/p&gt;

&lt;p&gt;The server includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;230+ Unity tools across areas like scenes, GameObjects, scripts, prefabs, materials, lighting, VFX, audio, animation, physics, camera, UI, input, terrain, and NavMesh&lt;/li&gt;
&lt;li&gt;a Unity-aware system prompt&lt;/li&gt;
&lt;li&gt;GLADE.md project context injection&lt;/li&gt;
&lt;li&gt;semantic script search&lt;/li&gt;
&lt;li&gt;skill calibration based on user expertise&lt;/li&gt;
&lt;li&gt;optional cloud intelligence for RAG and cross-session memory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Core features are free, local, and MIT licensed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why we open sourced it
&lt;/h2&gt;

&lt;p&gt;For Unity especially, usefulness depends on project awareness. The model needs to understand what scene is open, what objects exist, what scripts are relevant, what pipeline is being used, what errors are happening, and what conventions the project already follows.&lt;/p&gt;

&lt;p&gt;Without that, you end up with generic “AI-generated advice.”&lt;br&gt;
With that, you get much closer to a genuinely useful AI assistant or agent.&lt;/p&gt;

&lt;p&gt;Open sourcing the MCP server is our way of pushing that interface forward.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example of the difference
&lt;/h2&gt;

&lt;p&gt;A normal coding assistant might help with:&lt;br&gt;
“Write me a script for enemy spawning.”&lt;/p&gt;

&lt;p&gt;A Unity-connected MCP can help more like this:&lt;br&gt;
“Find how enemy spawning currently works in my project, inspect the related scripts, create a new spawn manager, wire it into the scene, and adjust the exposed values to match the existing design.”&lt;/p&gt;

&lt;p&gt;That difference is what we care about.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture at a high level
&lt;/h2&gt;

&lt;p&gt;The setup is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a Unity bridge package runs inside the editor&lt;/li&gt;
&lt;li&gt;the MCP server connects to that bridge&lt;/li&gt;
&lt;li&gt;your AI client talks to the MCP server over stdio or HTTP&lt;/li&gt;
&lt;li&gt;the model gets tool access plus Unity-specific context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So instead of copy-pasting back and forth between your IDE, a chatbot, and Unity, the agent can operate much closer to the actual source of truth.&lt;/p&gt;
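&lt;p&gt;For the stdio case, registering the server in an MCP client typically looks like the snippet below (illustrative only: the server name and launch command are assumptions; see the repo README for the exact invocation):&lt;/p&gt;

```json
{
  "mcpServers": {
    "gladekit-unity": {
      "command": "npx",
      "args": ["-y", "glade-mcp-unity"]
    }
  }
}
```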

&lt;h2&gt;
  
  
  Why this matters beyond GladeKit
&lt;/h2&gt;

&lt;p&gt;I think game dev is one of the most interesting places for MCP-style tooling.&lt;/p&gt;

&lt;p&gt;Game development has a huge amount of structured-but-fragmented work:&lt;br&gt;
editor actions, asset references, scene state, component wiring, engine-specific APIs, and long chains of small tasks that are annoying to do manually but difficult to solve with plain text generation alone.&lt;/p&gt;

&lt;p&gt;That makes it a really good fit for agent tooling with real tool access.&lt;/p&gt;

&lt;p&gt;My guess is we’ll see more of this pattern across game engines and other developer tools - not just AI that answers questions, but AI that can actually operate in the environment where the work is happening.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;p&gt;Open-source MCP repo:&lt;br&gt;
&lt;a href="https://github.com/Glade-tool/glade-mcp-unity" rel="noopener noreferrer"&gt;https://github.com/Glade-tool/glade-mcp-unity&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GladeKit site:&lt;br&gt;
&lt;a href="https://gladekit.com" rel="noopener noreferrer"&gt;https://gladekit.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Product Hunt launch:&lt;br&gt;
&lt;a href="https://www.producthunt.com/products/gladekit?launch=gladekit" rel="noopener noreferrer"&gt;https://www.producthunt.com/products/gladekit?launch=gladekit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Would love feedback from anyone building AI dev tools, working with MCP, or trying to make Unity workflows faster.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>gamedev</category>
      <category>unity3d</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Playing HEVC in a Browser Without Plugin — An H.265 Decoder in WebAssembly</title>
      <dc:creator>Thibaut Lion</dc:creator>
      <pubDate>Tue, 21 Apr 2026 08:00:42 +0000</pubDate>
      <link>https://forem.com/privaloops/playing-hevc-in-a-browser-without-plugin-an-h265-decoder-in-webassembly-4ag0</link>
      <guid>https://forem.com/privaloops/playing-hevc-in-a-browser-without-plugin-an-h265-decoder-in-webassembly-4ag0</guid>
      <description>&lt;h2&gt;
  
  
  The Problem — HEVC Everywhere Except the Browser
&lt;/h2&gt;

&lt;p&gt;HEVC/H.265 is the standard codec for Netflix, Apple, broadcasters, 4K/HDR. It saves 30-50% bandwidth versus H.264 at equivalent quality — millions in annual CDN savings for streaming services.&lt;/p&gt;

&lt;p&gt;But browser support is a mess.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;macOS&lt;/strong&gt; — Safari, Chrome, Edge, Firefox all decode HEVC natively via VideoToolbox. No extension needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chrome 107+ on Windows&lt;/strong&gt; — uses D3D11VA directly. No Microsoft extension required, but needs a GPU with hardware HEVC decoder (Intel Skylake 2015+, NVIDIA Maxwell 2nd gen+, AMD Fiji+). No software fallback.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Edge on Windows&lt;/strong&gt; — uses Media Foundation. &lt;strong&gt;Requires&lt;/strong&gt; the Microsoft &lt;a href="https://apps.microsoft.com/detail/9nmzlz57r3t7" rel="noopener noreferrer"&gt;HEVC Video Extension&lt;/a&gt; ($1 on the Store). Without it, no HEVC regardless of GPU.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Firefox 133+ on Windows&lt;/strong&gt; — same MFT path, same extension dependency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Linux&lt;/strong&gt; — Chrome with VAAPI, maybe. Firefox, no.&lt;/p&gt;

&lt;p&gt;The root cause is licensing. MPEG LA and Access Advance impose per-unit royalties. Microsoft passes this to users via the Store extension. Google negotiated a direct D3D11VA path. Mozilla relies on Microsoft's extension. The result: publishers must either encode everything twice (H.264 + HEVC) or accept that some users get a black screen.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution — Decode HEVC Client-Side in WebAssembly
&lt;/h2&gt;

&lt;p&gt;What if the browser didn't need to know it's playing HEVC?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/privaloops/hevc.js" rel="noopener noreferrer"&gt;hevc.js&lt;/a&gt; decodes HEVC in a Web Worker and re-encodes to H.264 via WebCodecs, delivering standard H.264 to Media Source Extensions. The player doesn't know it's happening.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fMP4 HEVC → mp4box.js (demux) → NAL units
         → WASM H.265 decoder → YUV frames
         → WebCodecs VideoEncoder → H.264
         → custom fMP4 muxer → MSE → &amp;lt;video&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The HEVC decoder is a from-scratch C++17 implementation of ITU-T H.265 (716 pages), compiled to WebAssembly. 236 KB gzipped. Zero dependencies. No special server headers needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  dash.js integration
&lt;/h3&gt;

&lt;p&gt;The plugin intercepts &lt;code&gt;MediaSource.addSourceBuffer()&lt;/code&gt;. When dash.js creates an HEVC SourceBuffer, a proxy accepts the HEVC MIME type but feeds the real SourceBuffer with H.264. ABR, seek, live — everything works unmodified.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;dashjs&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dashjs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;attachHevcSupport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@hevcjs/dashjs-plugin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;player&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;dashjs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;MediaPlayer&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;attachHevcSupport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;player&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;workerUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/transcode-worker.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;wasmUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/hevc-decode.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nx"&gt;player&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;videoElement&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;mpdUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Smart detection
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;MediaSource.isTypeSupported()&lt;/code&gt; can lie — Firefox on Windows reports HEVC support even without the Video Extension installed. So hevc.js creates a real SourceBuffer to probe, and activates transcoding only if that fails. When native HEVC works, the overhead is zero and the WASM module is never loaded.&lt;/p&gt;
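&lt;p&gt;The probe can be sketched roughly like this (a minimal illustration, not hevc.js's actual internals; the function name and codec string are assumptions, and the MediaSource is passed in so the helper can be exercised outside a browser):&lt;/p&gt;

```javascript
// Trust a real SourceBuffer creation attempt, not isTypeSupported(),
// which can report false positives (e.g. Firefox on Windows).
function probeHevcSupport(mediaSource, mime) {
  mime = mime || 'video/mp4; codecs="hvc1.1.6.L93.B0"';
  try {
    var sb = mediaSource.addSourceBuffer(mime);
    // Remove the probe buffer so it cannot interfere with playback.
    if (typeof mediaSource.removeSourceBuffer === 'function') {
      mediaSource.removeSourceBuffer(sb);
    }
    return true;  // native path works: skip transcoding, never load WASM
  } catch (err) {
    return false; // no real support: activate the WASM transcode path
  }
}
```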

&lt;h2&gt;
  
  
  Browser Compatibility
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Browser + OS&lt;/th&gt;
&lt;th&gt;Native HEVC&lt;/th&gt;
&lt;th&gt;hevc.js activates?&lt;/th&gt;
&lt;th&gt;Transcoding?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Safari 13+ (macOS/iOS)&lt;/td&gt;
&lt;td&gt;Yes (VideoToolbox)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chrome/Edge/Firefox (Mac)&lt;/td&gt;
&lt;td&gt;Yes (VideoToolbox)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chrome 107+ (Win, HEVC GPU)&lt;/td&gt;
&lt;td&gt;Yes (D3D11VA)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chrome 107+ (Win, no HEVC GPU)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edge (Win, with extension)&lt;/td&gt;
&lt;td&gt;Yes (MFT)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edge (Win, no extension)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Firefox 133+ (Win, with extension)&lt;/td&gt;
&lt;td&gt;Yes (MFT)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Firefox 133+ (Win, no extension)&lt;/td&gt;
&lt;td&gt;False positive&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chrome/Edge 94-106&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chrome (Linux, no VAAPI)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Requirements: WebAssembly, Web Workers, Secure Context (HTTPS), WebCodecs with H.264 encoding support.&lt;/p&gt;
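&lt;p&gt;Those four requirements can be feature-detected up front before loading anything (a hedged sketch; the function name is an assumption, and a real check of H.264 encoding support would additionally call &lt;code&gt;VideoEncoder.isConfigSupported()&lt;/code&gt;):&lt;/p&gt;

```javascript
// Cheap up-front check of the hard requirements. `env` defaults to the
// global object; passing a mock makes the helper unit-testable.
function meetsRequirements(env) {
  env = env || globalThis;
  var checks = [
    typeof env.WebAssembly === 'object',    // WebAssembly
    typeof env.Worker === 'function',       // Web Workers
    env.isSecureContext === true,           // Secure Context (HTTPS)
    typeof env.VideoEncoder === 'function'  // WebCodecs encoder present
  ];
  return checks.every(Boolean);
}
```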

&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;p&gt;Single-threaded, Apple Silicon:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Native C++&lt;/th&gt;
&lt;th&gt;WASM (Chrome)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1080p decode&lt;/td&gt;
&lt;td&gt;76 fps&lt;/td&gt;
&lt;td&gt;61 fps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4K decode&lt;/td&gt;
&lt;td&gt;28 fps&lt;/td&gt;
&lt;td&gt;21 fps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1080p transcode&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;~2.5x realtime&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;WASM reaches 80% of native C++ speed, and 83% of libde265 (a mature 10-year-old HEVC decoder) when both are compiled to WASM.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conformance&lt;/strong&gt;: 128/128 test bitstreams pixel-perfect against ffmpeg. Zero drift.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Tradeoff
&lt;/h2&gt;

&lt;p&gt;The first segment takes 2-3 seconds to transcode — that's the startup latency cost of software decode versus native hardware. After buffering, playback is smooth.&lt;/p&gt;

&lt;p&gt;This makes hevc.js a good fit for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Streaming platforms with existing HEVC catalogs&lt;/li&gt;
&lt;li&gt;Infrastructure simplification (single HEVC pipeline, no H.264 fallback)&lt;/li&gt;
&lt;li&gt;VOD or moderate-latency live&lt;/li&gt;
&lt;li&gt;Controlled environments (IPTV, B2B)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not ideal for: low-end mobile (CPU/battery), 4K on underpowered machines, or ultra-low-latency live sports.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Live demo&lt;/strong&gt;: &lt;a href="https://hevcjs.dev/demo/dash.html" rel="noopener noreferrer"&gt;hevcjs.dev/demo/dash.html&lt;/a&gt; — toggle "Force transcoding" to test the WASM path even if your browser has native HEVC.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Install&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @hevcjs/dashjs-plugin dashjs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/privaloops/hevc.js" rel="noopener noreferrer"&gt;github.com/privaloops/hevc.js&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;MIT license. Feedback and contributions welcome.&lt;/p&gt;

</description>
      <category>webassembly</category>
      <category>javascript</category>
      <category>video</category>
      <category>streaming</category>
    </item>
    <item>
      <title>How to Build a Remote Job Alert System (No API Key Required)</title>
      <dc:creator>agenthustler</dc:creator>
      <pubDate>Tue, 21 Apr 2026 08:00:09 +0000</pubDate>
      <link>https://forem.com/agenthustler/how-to-build-a-remote-job-alert-system-no-api-key-required-5f5e</link>
      <guid>https://forem.com/agenthustler/how-to-build-a-remote-job-alert-system-no-api-key-required-5f5e</guid>
      <description>&lt;h2&gt;
  
  
  The Problem with Job Board Notifications
&lt;/h2&gt;

&lt;p&gt;Most job boards have email alerts, but they're noisy and limited. You can't filter by salary range, tech stack, or specific keywords in the description. You can't combine alerts from multiple boards into one feed. And you definitely can't pipe the results into your own tools.&lt;/p&gt;

&lt;p&gt;Let's fix that. In this tutorial, we'll build a remote job alert system that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pulls fresh listings from remote job boards every few hours&lt;/li&gt;
&lt;li&gt;Filters by your criteria (keywords, salary, location)&lt;/li&gt;
&lt;li&gt;Sends you a clean email digest&lt;/li&gt;
&lt;li&gt;Runs on autopilot with zero API keys to manage&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data source&lt;/strong&gt;: &lt;a href="https://apify.com/cryptosignals/weworkremotely-scraper" rel="noopener noreferrer"&gt;WeWorkRemotely Scraper&lt;/a&gt; on Apify (handles the data collection)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduling&lt;/strong&gt;: Apify's built-in scheduler (or cron if self-hosting)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filtering + alerts&lt;/strong&gt;: A simple Python script&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Email&lt;/strong&gt;: SMTP (Gmail, SendGrid, or any provider)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Set Up Automated Data Collection
&lt;/h2&gt;

&lt;p&gt;Create a free Apify account and find the WeWorkRemotely Scraper in the store. Configure it with your search parameters and set it to run on a schedule (every 6 hours works well for job listings).&lt;/p&gt;

&lt;p&gt;Each run produces a dataset of JSON objects like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Senior Python Developer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"company"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Acme Corp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://weworkremotely.com/listings/acme-senior-python"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Programming"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-15"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"salary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$120k - $160k"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"We're looking for a senior Python developer..."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Filter and Alert with Python
&lt;/h2&gt;

&lt;p&gt;Here's a complete script that fetches the latest results, filters them, and sends an email:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;smtplib&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;email.mime.text&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MIMEText&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;

&lt;span class="c1"&gt;# Config
&lt;/span&gt;&lt;span class="n"&gt;APIfY_TOKEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;your_apify_token&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;DATASET_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;your_dataset_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;  &lt;span class="c1"&gt;# From the scheduled run
&lt;/span&gt;&lt;span class="n"&gt;EMAIL_FROM&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;alerts@yourdomain.com&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;EMAIL_TO&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;you@yourdomain.com&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;SMTP_HOST&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;smtp.gmail.com&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;SMTP_PORT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;587&lt;/span&gt;
&lt;span class="n"&gt;SMTP_USER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;your_email&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;SMTP_PASS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;your_app_password&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="c1"&gt;# Keywords to match (case-insensitive)
&lt;/span&gt;&lt;span class="n"&gt;KEYWORDS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;python&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fastapi&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data engineer&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;backend&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;MIN_SALARY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100_000&lt;/span&gt;  &lt;span class="c1"&gt;# Optional: filter by minimum salary
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_jobs&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Pull latest job listings from Apify dataset.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://api.apify.com/v2/datasets/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;DATASET_ID&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/items&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;token&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;APIFY_TOKEN&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;matches_criteria&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Check if a job matches our filter criteria.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;kw&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;KEYWORDS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;format_digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Format matching jobs into a readable email body.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Found &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; matching remote jobs:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;job&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;** at &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;company&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Salary: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;salary&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Not listed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Link: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Send the digest via SMTP.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MIMEText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Subject&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;subject&lt;/span&gt;
    &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;From&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EMAIL_FROM&lt;/span&gt;
    &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;To&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EMAIL_TO&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;smtplib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SMTP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SMTP_HOST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SMTP_PORT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;starttls&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;login&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SMTP_USER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SMTP_PASS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;jobs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;fetch_jobs&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;matching&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;jobs&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;matches_criteria&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;matching&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;subject&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matching&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; new remote jobs matching your criteria&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;format_digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matching&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;send_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Sent digest with &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matching&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; jobs&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;No matching jobs found&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Run It on a Schedule
&lt;/h2&gt;

&lt;p&gt;You have a few options:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Apify webhook&lt;/strong&gt; — Set up a webhook on your scheduled actor run that hits your script endpoint&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cron job&lt;/strong&gt; — Run the Python script every 6 hours on any server or even a Raspberry Pi&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Actions&lt;/strong&gt; — Free scheduled workflows that can run this script&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For GitHub Actions, create &lt;code&gt;.github/workflows/job-alerts.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Job Alerts&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;cron&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*/6&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*'&lt;/span&gt;
&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;check&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-python@v5&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;python-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;3.12'&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pip install requests&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python job_alerts.py&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;APIFY_TOKEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.APIFY_TOKEN }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
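&lt;p&gt;One mismatch to watch: the workflow passes &lt;code&gt;APIFY_TOKEN&lt;/code&gt; as an environment variable, while the script above hardcodes its config as constants. To make the two line up, read the config from the environment instead. A minimal sketch, with variable names mirroring the script's constants:&lt;/p&gt;

```python
import os

# Pull secrets from the environment instead of hardcoding them.
# Names mirror the config constants in the script above; the
# fallback values are placeholders for local testing only.
APIFY_TOKEN = os.environ.get('APIFY_TOKEN', '')
DATASET_ID = os.environ.get('DATASET_ID', '')
EMAIL_FROM = os.environ.get('EMAIL_FROM', 'alerts@yourdomain.com')
EMAIL_TO = os.environ.get('EMAIL_TO', 'you@yourdomain.com')
SMTP_HOST = os.environ.get('SMTP_HOST', 'smtp.gmail.com')
SMTP_PORT = int(os.environ.get('SMTP_PORT', '587'))
SMTP_USER = os.environ.get('SMTP_USER', '')
SMTP_PASS = os.environ.get('SMTP_PASS', '')
```

&lt;p&gt;Add the remaining values (&lt;code&gt;DATASET_ID&lt;/code&gt; and the SMTP credentials) as repository secrets and list them under &lt;code&gt;env:&lt;/code&gt; alongside &lt;code&gt;APIFY_TOKEN&lt;/code&gt;.&lt;/p&gt;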



&lt;h2&gt;
  
  
  Extending It
&lt;/h2&gt;

&lt;p&gt;Once the basic system works, you can add:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multiple sources&lt;/strong&gt; — Add RemoteOK, Indeed, or other boards to the same pipeline&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deduplication&lt;/strong&gt; — Track seen job URLs in a simple JSON file or SQLite database&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slack/Discord alerts&lt;/strong&gt; — Replace the email function with a webhook POST&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Salary parsing&lt;/strong&gt; — Extract numeric ranges and filter more precisely&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dashboard&lt;/strong&gt; — Push results to a Google Sheet for tracking over time&lt;/li&gt;
&lt;/ul&gt;
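&lt;p&gt;As a starting point for the salary parsing idea, here is an illustrative sketch. It assumes salary strings shaped like the sample data's &lt;code&gt;"$120k - $160k"&lt;/code&gt;; real listings will need more robust handling:&lt;/p&gt;

```python
import re

def parse_min_salary(salary_text):
    """Extract the lower bound of a range like '$120k - $160k'.

    Returns an int in dollars, or None when no number is found.
    """
    if not salary_text:
        return None
    # First number wins: optional '$', digits with optional
    # thousands separators, optional 'k' multiplier.
    match = re.search(r'\$?(\d+(?:,\d{3})*)\s*(k?)', salary_text,
                      re.IGNORECASE)
    if not match:
        return None
    value = int(match.group(1).replace(',', ''))
    if match.group(2).lower() == 'k':
        value *= 1000
    return value
```

&lt;p&gt;With this in place, &lt;code&gt;main()&lt;/code&gt; could filter on &lt;code&gt;(parse_min_salary(j.get('salary')) or 0) &gt;= MIN_SALARY&lt;/code&gt;, finally putting the &lt;code&gt;MIN_SALARY&lt;/code&gt; config value to work.&lt;/p&gt;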

&lt;h2&gt;
  
  
  Why This Beats Built-In Alerts
&lt;/h2&gt;

&lt;p&gt;Job board email alerts give you everything that matches a single keyword. This system lets you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Combine multiple boards into one feed&lt;/li&gt;
&lt;li&gt;Apply complex filters (salary + keywords + category)&lt;/li&gt;
&lt;li&gt;Control the format and delivery channel&lt;/li&gt;
&lt;li&gt;Keep a historical record of listings&lt;/li&gt;
&lt;li&gt;Build on top of it (analytics, auto-apply, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The whole setup takes about 20 minutes, runs for free (within Apify's free tier and GitHub Actions limits), and you'll never miss a relevant remote job posting again.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your current job search automation setup? I'd love to hear what tools people are using — drop a comment below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>productivity</category>
      <category>beginners</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Cinematic Product Videos with fal.ai and Kling 3.0 for $1 a Scene</title>
      <dc:creator>Ben Utting</dc:creator>
      <pubDate>Tue, 21 Apr 2026 08:00:00 +0000</pubDate>
      <link>https://forem.com/benutting/cinematic-product-videos-with-falai-and-kling-30-for-a-1-a-scene-go7</link>
      <guid>https://forem.com/benutting/cinematic-product-videos-with-falai-and-kling-30-for-a-1-a-scene-go7</guid>
      <description>&lt;p&gt;A client needed social media videos of their product in six different lifestyle scenes. Professional shoots would have cost thousands per location. We did all six for about $6 total, in under an hour.&lt;/p&gt;

&lt;p&gt;The pipeline is two API calls: one to place the real product into a generated scene, one to animate it into a 5-second video with sound. Both run through fal.ai.&lt;/p&gt;

&lt;h2&gt;
  
  
  The brief
&lt;/h2&gt;

&lt;p&gt;The client had a small physical product and a solid brand page with plenty of existing content. He sent me an AI-generated video he'd seen of someone walking through New York that seamlessly featured a product. He wanted something similar for his own brand: cinematic scenes showing the product in restaurant and bar settings, generated entirely from a single product photo.&lt;/p&gt;

&lt;p&gt;The goal was to build a repeatable skill that could produce these scenes on demand, not just a one-off video.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: place the product into a scene
&lt;/h2&gt;

&lt;p&gt;The first script uses Google's Nano Banana 2 edit model via fal.ai. You give it a reference photo of the real product and a text prompt describing the scene you want. It generates a new image with the product placed naturally into that environment, preserving the product's appearance, label, and proportions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python generate_kontext.py product_photo.jpg &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"Product on white linen table, candlelit restaurant, beside wine glass, warm golden light, cinematic"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--variations&lt;/span&gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--variations 5&lt;/code&gt; flag is important. AI image generation is inconsistent. Out of five attempts, usually two or three look good. One will be excellent. The rest get discarded. At $0.04 per image, generating five costs $0.20. Cheap enough to always overshoot.&lt;/p&gt;

&lt;p&gt;One thing I learned: prompts need a scale anchor. If the product is small, the model will sometimes scale it up to fill the scene. Always include a size reference in the prompt: a wine glass, a hand, a plate. Something that tells the model how big the product actually is relative to its surroundings.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: animate the winner
&lt;/h2&gt;

&lt;p&gt;The second script takes the best image from Step 1 and turns it into a 5-second video using Kling 3.0 Pro, also via fal.ai. It generates native audio too: sizzling sounds for a kitchen scene, ambient restaurant noise, clinking glasses.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python generate_video.py &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"Hand reaches for product, picks it up, tilts gently, slow motion"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--image_url&lt;/span&gt; &lt;span class="s2"&gt;"https://fal.media/files/..."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--duration&lt;/span&gt; 5 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cfg_scale&lt;/span&gt; 1.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;cfg_scale&lt;/code&gt; setting matters. The default (0.5) gives the model creative freedom, which is fine for abstract content but bad for product shots. Setting it to 1.0 forces the model to follow the prompt closely. For product content, you want maximum adherence: the product should stay in frame, the motion should be what you described, nothing should morph or distort.&lt;/p&gt;

&lt;p&gt;One video takes 60 to 180 seconds to generate and costs about $0.80. Combined with the image step, a full scene (5 image variations + 1 video) runs to about $1.&lt;/p&gt;

&lt;h2&gt;
  
  
  The scenes we built
&lt;/h2&gt;

&lt;p&gt;We created a prompt library with six scenes, each with an image prompt and a matching motion prompt. Restaurant lifestyle, in-hand close-ups, kitchen action shots, moody food pairings, textured product beauty shots, and bar settings.&lt;/p&gt;

&lt;p&gt;Each scene follows the same workflow: two commands, one decision (pick the best of five images), one output (a 5-second video with audio). Total cost for all six scenes: about $6. Total time: under an hour, including prompt iteration.&lt;/p&gt;

&lt;p&gt;The prompt library is the reusable part. Once you've dialled in the style and scale for one product, adapting it for another is just swapping the product description and the reference photo.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Batch the image generation.&lt;/strong&gt; Right now each scene is a separate script invocation. A wrapper that runs all six scenes, generates all 30 images, and presents them for review in one pass would save time.&lt;/p&gt;
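&lt;p&gt;A sketch of what that wrapper could look like. The scene prompts here are placeholder examples and &lt;code&gt;generate_kontext.py&lt;/code&gt; is the Step 1 script; by default the wrapper only prints the commands it would run:&lt;/p&gt;

```python
import subprocess

# Placeholder scene library; the real prompts live in your prompt library.
SCENES = {
    'restaurant': 'Product on white linen table, candlelit restaurant, beside wine glass',
    'bar': 'Product on polished bar counter, moody low light, cocktail in background',
}

def build_commands(photo, variations=5, dry_run=True):
    """Build one generate_kontext.py invocation per scene.

    With dry_run=False, each command is executed in sequence.
    """
    commands = []
    for name, prompt in SCENES.items():
        cmd = ['python', 'generate_kontext.py', photo, prompt,
               '--variations', str(variations)]
        commands.append((name, cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands

for name, cmd in build_commands('product_photo.jpg'):
    print(name, ' '.join(cmd))
```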

&lt;p&gt;&lt;strong&gt;Test 9:16 for Stories and Reels.&lt;/strong&gt; All our content was 16:9. Kling supports 9:16 for vertical video, but only in text-to-video mode (not image-to-video). For Instagram Reels, you'd need to either crop or generate the initial image at 9:16.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build a prompt template system.&lt;/strong&gt; The prompt library works, but it's manual. A template where you swap in the product name, size description, and setting would make this reusable across clients without rewriting prompts from scratch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this works for small brands
&lt;/h2&gt;

&lt;p&gt;This client is a bootstrapped D2C brand. There's no budget for location shoots across six restaurants. But the social content needs to look premium because the product is premium.&lt;/p&gt;

&lt;p&gt;This pipeline delivers that. Five minutes per scene, a dollar per video, and the output looks like it came from a production studio. The client picks from five image options, approves one, and gets a ready-to-post video with sound. No photographer, no stylist, no venue booking.&lt;/p&gt;

&lt;p&gt;If you're selling a physical product and need lifestyle content at scale, this exact pipeline works. Two scripts, one API key, and a good product photo to start from.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ctrlaltautomate.com&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>python</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>16 Ways to Make a Small Language Model Think Bigger</title>
      <dc:creator>Wojtek Pluta</dc:creator>
      <pubDate>Tue, 21 Apr 2026 07:56:58 +0000</pubDate>
      <link>https://forem.com/oracledevs/16-ways-to-make-a-small-language-model-think-bigger-2lbo</link>
      <guid>https://forem.com/oracledevs/16-ways-to-make-a-small-language-model-think-bigger-2lbo</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article is syndicated from the original post on &lt;a href="https://blogs.oracle.com/developers/16-ways-to-make-a-small-language-model-think-bigger" rel="noopener noreferrer"&gt;blogs.oracle.com&lt;/a&gt;. Read the canonical version there for the latest updates.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;All of the code in this article is available in the &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub" rel="noopener noreferrer"&gt;Oracle AI Developer Hub&lt;/a&gt;. The repository is part of Oracle’s open-source AI collection and serves as the reference implementation for everything covered here.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;You can install it with &lt;code&gt;pip install agent-reasoning&lt;/code&gt;, browse the 16 agent classes, run the TUI, or integrate it directly into an existing Ollama pipeline as a zero-change replacement client. If you find it useful, a GitHub star goes a long way.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Small language models struggle with complex reasoning on their own, but agent-based architectures (like Tree of Thoughts or Self-Consistency) can significantly improve their performance.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;agent-reasoning&lt;/code&gt; framework adds 16 research-backed reasoning strategies to any Ollama model using a simple &lt;code&gt;+strategy&lt;/code&gt; tag—no code changes required.&lt;/li&gt;
&lt;li&gt;Different strategies suit different tasks: CoT works well overall, ReAct excels with external data, and branching methods improve accuracy at the cost of speed.&lt;/li&gt;
&lt;li&gt;Much of modern AI progress comes from orchestration (prompting, search, control flow), not just larger models.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Generally, a 270M parameter LLM (as of today, April 2026) struggles with even basic multi-step reasoning. Ask a model like &lt;code&gt;gemma3:270m&lt;/code&gt; to solve the classic water jug problem, and it will often return a confidently incorrect answer—much like other small language models (SLMs) of similar size and training.&lt;/p&gt;

&lt;p&gt;However, take that same model and wrap it inside a Tree of Thoughts (ToT) agent, running a breadth-first search (BFS) with three levels and weighted branches, and it can reliably solve the puzzle. The improvement comes from the architecture: the agent distributes the reasoning process across structured exploration steps, compensating for the limitations of a single LLM call.&lt;/p&gt;

&lt;p&gt;This is where things get interesting. Much of the progress in applied AI isn't coming from bigger models alone, but from engineers rethinking how to orchestrate them—layering search, memory, and control flow on top of a standard LLM call to unlock new capabilities.&lt;/p&gt;

&lt;p&gt;This is the fundamental idea behind &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/agent-reasoning" rel="noopener noreferrer"&gt;agent-reasoning&lt;/a&gt;: sixteen cognitive architectures—each backed by peer-reviewed research—can be applied to any Ollama-served model via a simple &lt;code&gt;+Strategy&lt;/code&gt; tag appended to the model name. Call &lt;code&gt;gemma3:270m+tot&lt;/code&gt; instead of &lt;code&gt;gemma3:270m&lt;/code&gt;, and the interceptor handles everything else.&lt;/p&gt;

&lt;p&gt;Below, we’ll walk through the different ways to invoke these reasoning strategies through the project.&lt;/p&gt;




&lt;h2&gt;
  
  
  What You’ll Learn
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;How the &lt;code&gt;ReasoningInterceptor&lt;/code&gt; intercepts model names, removes the &lt;code&gt;+Strategy&lt;/code&gt; tag, and directs traffic to one of 16 agent classes&lt;/li&gt;
&lt;li&gt;How the 16 strategies divide into four families—sequential, branching, reflective, and meta—each representing a different reasoning approach and set of trade-offs&lt;/li&gt;
&lt;li&gt;What each major strategy accomplishes in practice, focusing on implementation rather than theory&lt;/li&gt;
&lt;li&gt;Which type of problem each strategy is best suited for, based on benchmark results from March 2026&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Interception Layer
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; The &lt;code&gt;ReasoningInterceptor&lt;/code&gt; is a drop-in client for Ollama that parses the model name for a &lt;code&gt;+Strategy&lt;/code&gt; tag and routes traffic to one of 16 cognitive agent classes, without modifying your pre-existing code.&lt;/p&gt;

&lt;p&gt;Everything relies on a single template: add &lt;code&gt;+Strategy&lt;/code&gt; to any Ollama model name.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2APLi2WumhUe2et_POG0V_Og.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2APLi2WumhUe2et_POG0V_Og.png" title="Using ReasoningInterceptor as a drop-in replacement client" alt="Using ReasoningInterceptor as a drop-in replacement client; strategy routing can be enabled via model name tags (e.g., +tot)." width="800" height="249"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Using ReasoningInterceptor as a drop-in replacement client&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The image below illustrates the entire routing process from start to finish. The interceptor acts as a middleman between your code and Ollama, removes the &lt;code&gt;+Strategy&lt;/code&gt; tag, and sends traffic to the correct agent class.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2A5MwkQVsNUA1pqBEzsV4ACA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2A5MwkQVsNUA1pqBEzsV4ACA.png" title="Illustrating how the interceptor separates the base model from the Strategy tag" alt="Diagram illustrating how the interceptor separates the base model from the Strategy tag and directs traffic to the corresponding agent class." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Illustrating how the interceptor separates the base model from the Strategy tag&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;agent_map&lt;/code&gt; contains over fifty-five aliases mapped to sixteen agent classes. For example, &lt;code&gt;cot&lt;/code&gt;, &lt;code&gt;chain_of_thought&lt;/code&gt;, and &lt;code&gt;CoT&lt;/code&gt; all map to &lt;code&gt;CotAgent&lt;/code&gt;, while &lt;code&gt;mcts&lt;/code&gt; and &lt;code&gt;monte_carlo&lt;/code&gt; map to &lt;code&gt;MCTSAgent&lt;/code&gt;. Because the interceptor is a drop-in client for Ollama—supporting the same &lt;code&gt;.generate()&lt;/code&gt; and &lt;code&gt;.chat()&lt;/code&gt; APIs—existing LangChain pipelines, web UIs, and scripts can automatically gain reasoning capabilities by changing a single string in the model name.&lt;/p&gt;

&lt;p&gt;Additionally, the interceptor can be used as a network proxy. Instead of pointing an Ollama-compatible application at &lt;code&gt;http://localhost:11434&lt;/code&gt;, direct it to &lt;code&gt;http://localhost:8080&lt;/code&gt;. With a model name like &lt;code&gt;gemma3:270m+CoT&lt;/code&gt;, the gateway applies reasoning transparently.&lt;/p&gt;
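
&lt;p&gt;The routing core is easy to picture in a few lines. The sketch below is a simplified reconstruction, not the library’s actual source—the real &lt;code&gt;agent_map&lt;/code&gt; holds over fifty-five aliases, and the function name here is hypothetical:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Simplified reconstruction of the interceptor's tag parsing (illustrative;
# the real agent_map maps 55+ aliases to 16 agent classes).
AGENT_MAP = {
    "cot": "CotAgent", "chain_of_thought": "CotAgent",
    "tot": "ToTAgent", "tree_of_thoughts": "ToTAgent",
    "mcts": "MCTSAgent", "monte_carlo": "MCTSAgent",
    "meta": "MetaReasoningAgent", "auto": "MetaReasoningAgent",
}

def parse_model_name(name):
    """Split 'gemma3:270m+CoT' into (base_model, agent_class_or_None)."""
    base, sep, tag = name.partition("+")
    if not sep:                  # no tag: pass the request straight to Ollama
        return base, None
    return base, AGENT_MAP.get(tag.lower())

print(parse_model_name("gemma3:270m+tot"))   # ('gemma3:270m', 'ToTAgent')
print(parse_model_name("gemma3:270m"))       # ('gemma3:270m', None)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Untagged model names pass through unchanged, which is what makes the client safe to drop into existing code.&lt;/p&gt;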

&lt;h2&gt;
  
  
  Family 1: Sequential Strategies
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; Sequential Strategies process problems in a linear chain, where each step feeds into the next. In benchmarks, CoT achieved 88.7% average accuracy, compared to 81.3% for standard generation on the same model and weights.&lt;/p&gt;

&lt;p&gt;Each of the sixteen strategies falls into one of four families. The diagram below illustrates how they are grouped.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AqIVVyTPUDA2luQCNkzWgKw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AqIVVyTPUDA2luQCNkzWgKw.png" title="Categorization of the four strategy families" alt="Categorization of the four Strategy families: sequential, branching, reflective, and meta. Each route leads to a specific type of reasoning agent. The fastest Sequential Strategies occupy the top-left quadrant while slower Branching strategies sacrifice speed for increased accuracy." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Categorization of the four strategy families&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Sequential strategies are designed for high-speed processing with minimal latency. They are ideal for problems with discrete, sequential steps.&lt;/p&gt;

&lt;h3&gt;
  
  
  Chain of Thought (CoT)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Paper:&lt;/strong&gt; Wei et al. (2022), &lt;a href="https://arxiv.org/abs/2201.11903" rel="noopener noreferrer"&gt;“Chain-of-Thought Prompting Elicits Reasoning in Large Language Models”&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Chain of Thought (CoT) is a prompting strategy in which the model generates intermediate reasoning steps before producing a final response. As noted in the original paper: prompting a model to produce these intermediate steps can significantly improve accuracy.&lt;/p&gt;

&lt;p&gt;For example, standard prompting on GSM8K achieves 66.7% accuracy. With CoT prompting, this increases to 73.3%—a roughly 10% relative improvement achieved through simple prompt design alone.&lt;/p&gt;

&lt;p&gt;The following graphic illustrates how CoT chains appear in practice: a sequence of numbered steps, each building on the previous one.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2ANwSyAs818bWZ3mCEDW2lOg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2ANwSyAs818bWZ3mCEDW2lOg.png" title="CoT in operation" alt="Visual representation of CoT in operation: the model sequentially progresses through numbered steps (step 1…step n). Each subsequent step depends on previously generated steps. The numbering in the prompt is the only special instruction provided." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;CoT in operation&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In terms of implementation within &lt;code&gt;CotAgent&lt;/code&gt;, the query is wrapped in a structured prompt:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AolcatRJAj5naE6svAHQbOA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AolcatRJAj5naE6svAHQbOA.png" title="Structured prompting enforces step-by-step reasoning in CoTAgent" alt="Structured prompting enforces step-by-step reasoning in CoTAgent" width="800" height="237"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Structured prompting enforces step-by-step reasoning in CoTAgent&lt;/em&gt;&lt;/p&gt;
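
&lt;p&gt;As a rough sketch, the wrapping amounts to something like the following—the template wording here is an assumption for illustration, not the actual &lt;code&gt;CotAgent&lt;/code&gt; prompt (see the repository for that):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def cot_prompt(query):
    """Wrap a query in a numbered step-by-step template (hypothetical wording)."""
    return (
        "Solve the following problem by reasoning in numbered steps.\n"
        "Write 'Step 1:', 'Step 2:', and so on, then finish with 'Answer:'.\n\n"
        f"Problem: {query}"
    )

print(cot_prompt("A train covers 60 km in 45 minutes. What is its speed in km/h?"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;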

&lt;p&gt;Benchmark result for qwen3.5:9b (9.7B): CoT achieves &lt;strong&gt;88.7% average accuracy&lt;/strong&gt; across GSM8K (math), MMLU (logic), and ARC-Challenge (reasoning), compared to 81.3% for standard generation. This seven-point gain is attributable solely to the structured prompt; identical weights and temperature were used in both runs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended usage:&lt;/strong&gt; Math word problems; logic puzzles; any multi-step reasoning task where the individual steps are sequential and do not have branches.&lt;/p&gt;

&lt;h3&gt;
  
  
  Decomposed Prompting
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Paper:&lt;/strong&gt; Khot et al. (2022), &lt;a href="https://arxiv.org/abs/2210.02406" rel="noopener noreferrer"&gt;“Decomposed Prompting: A Modular Approach for Solving Complex Tasks”&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Decomposed prompting is an architectural module that splits large problems into smaller sub-problems. Each sub-problem is handled independently while carrying forward accumulated context from earlier steps. Once all sub-problems are processed, their outputs are synthesized into a final result. &lt;code&gt;DecomposedAgent&lt;/code&gt; follows a three-phase process—decomposition, execution, and synthesis—propagating context throughout so that each step can build on prior results.&lt;/p&gt;
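
&lt;p&gt;The three phases can be sketched with a stub &lt;code&gt;llm&lt;/code&gt; callable standing in for the Ollama calls (function and prompt wording are illustrative, not the actual &lt;code&gt;DecomposedAgent&lt;/code&gt; source):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def decomposed_solve(query, llm):
    """Decompose, execute with accumulated context, then synthesize.
    `llm` is any callable mapping a prompt string to a response string."""
    subproblems = llm(f"Split into sub-problems, one per line: {query}").splitlines()
    context = []
    for sub in subproblems:                       # execution phase
        answer = llm(f"Context so far: {context}\nSolve: {sub}")
        context.append((sub, answer))             # carried into later steps
    return llm(f"Synthesize a final answer from: {context}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;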

&lt;p&gt;&lt;strong&gt;Recommended usage:&lt;/strong&gt; Planning problems; trip itinerary generation; any problem where the ultimate answer consists of multiple distinguishable parts that may be individually addressed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Decomposed prompting achieved only 38.5% average accuracy in benchmark testing. This result requires context. GSM8K primarily evaluates arithmetic reasoning, where decomposing a problem like “what is 47 × 13 + 9?” introduces overhead without improving the model's ability to compute the answer.&lt;/p&gt;

&lt;p&gt;Decomposition is more effective for problems with genuinely separable components (trip planning, multi-section reports, etc.), where each part benefits from focused attention. These strengths are not captured by the benchmark, and the results reflect that mismatch.&lt;/p&gt;

&lt;h3&gt;
  
  
  Least-to-Most Prompting
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Paper:&lt;/strong&gt; Zhou et al. (2022), &lt;a href="https://arxiv.org/abs/2205.10625" rel="noopener noreferrer"&gt;“Least-to-Most Prompting Enables Complex Reasoning in Large Language Models”&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Least-to-most prompting is a strategy that orders sub-questions from simplest to most complex, establishing prerequisite knowledge before tackling harder steps. Unlike decomposed prompting which generates arbitrary sub-problems, it enforces a deliberate progression where each step builds on the last. Knowledge is accumulated iteratively until the model reaches the final question.&lt;/p&gt;
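
&lt;p&gt;A minimal sketch of that progression, again with a stub &lt;code&gt;llm&lt;/code&gt; callable (names and prompt wording are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def least_to_most(question, llm):
    """Answer sub-questions simplest-first, feeding each answer forward."""
    steps = llm(f"List the sub-questions needed to answer '{question}', "
                "simplest first, one per line.").splitlines()
    knowledge = ""
    for step in steps:                     # accumulate prerequisite knowledge
        ans = llm(f"Given:{knowledge}\nAnswer: {step}")
        knowledge += f"\nQ: {step}\nA: {ans}"
    return llm(f"Using what we established:{knowledge}\nFinally answer: {question}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;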

&lt;p&gt;&lt;strong&gt;Recommended usage:&lt;/strong&gt; Questions with genuine prerequisites — e.g., “what is x?” before determining “how does x relate to y?”; educational style explanation sequences (“concept ladder”); tasks that require establishing foundational concepts before addressing more complex components.&lt;/p&gt;

&lt;h2&gt;
  
  
  Family 2: Branching Strategies
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; Branching strategies explore multiple reasoning paths simultaneously and choose the best one. ToT scored 76.7% on GSM8K math, compared to 66.7% for standard generation.&lt;/p&gt;

&lt;p&gt;More LLM calls mean higher latency—but often better answers on hard problems. Keep this trade-off in mind when running any branching strategy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tree of Thoughts (ToT)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Paper:&lt;/strong&gt; Yao et al. (2023), &lt;a href="https://arxiv.org/abs/2305.10601" rel="noopener noreferrer"&gt;“Tree of Thoughts: Deliberate Problem Solving with Large Language Models”&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;ToT is a search-based methodology that explores numerous possible reasoning paths concurrently and selects the best-performing path according to an evaluation metric, such as distance to the goal or the quality of intermediate solutions.&lt;/p&gt;

&lt;p&gt;Similar to chess engines, ToT applies BFS through an expanding tree of possible solutions. The core idea is straightforward: generate multiple partial solutions, evaluate them, prune weaker candidates, and continue exploring the most promising branches.&lt;/p&gt;

&lt;p&gt;Below is an illustration of how ToT generates and eliminates branches: green nodes represent surviving branches, while red nodes indicate those that have been eliminated. The final answer is derived from the highest scoring leaf node.&lt;/p&gt;

&lt;p&gt;A key design decision is how branches are evaluated. Should the same model handle both generation and scoring, or should a stronger model be introduced as a judge? In these benchmarks, the same model was used for both roles, but this is an area worth experimenting with, depending on your accuracy and latency constraints.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AQHJPySSkNpDOji9BCKz-Ng.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AQHJPySSkNpDOji9BCKz-Ng.png" title="Generating candidate branches at each level" alt="Illustration of how to generate candidate branches at each level; score candidate branches between 0 &amp;amp; 1; prune low-scored candidates; continue exploring surviving high-scored candidates until all levels are exhausted and then generate final answer from most promising leaf node." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Generating candidate branches at each level&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ToTAgent&lt;/code&gt; implements this with configurable &lt;code&gt;depth&lt;/code&gt; (default 3) and &lt;code&gt;width&lt;/code&gt; (default 2 branches). At every level, the agent generates a set of candidate next steps, evaluates them using a scoring function, prunes low-scoring options, and expands the remaining candidates into the next level.&lt;/p&gt;
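
&lt;p&gt;The BFS loop can be sketched as follows, with &lt;code&gt;propose&lt;/code&gt; and &lt;code&gt;score&lt;/code&gt; standing in for the LLM calls the real agent makes:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import heapq

def tot_search(problem, propose, score, depth=3, width=2):
    """BFS over partial reasoning paths: propose candidate steps, score
    each path, keep the best `width` per level, answer from the best leaf."""
    frontier = [""]                              # partial solution paths
    for _ in range(depth):
        candidates = [path + step
                      for path in frontier
                      for step in propose(problem, path)]
        frontier = heapq.nlargest(width, candidates, key=score)   # prune
    return max(frontier, key=score)              # highest-scoring leaf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;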

&lt;p&gt;ToT achieved &lt;strong&gt;76.7% accuracy&lt;/strong&gt;—a 10-point improvement over standard generation on GSM8K math problems. This performance comes at a cost: additional LLM calls are required at each step to evaluate candidate paths and their intermediate results, making it roughly 5-8x slower than the equivalent CoT query.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended usage:&lt;/strong&gt; Logic puzzles with multiple solution paths; strategic decision problems; tasks where multiple approaches can be explored and compared.&lt;/p&gt;

&lt;h3&gt;
  
  
  Self-Consistency (Majority Voting)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Paper:&lt;/strong&gt; Wang et al. (2022), &lt;a href="https://arxiv.org/abs/2203.11171" rel="noopener noreferrer"&gt;“Self-Consistency Improves Chain of Thought Reasoning in Language Models”&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Self-Consistency is a sampling method that generates multiple independent reasoning traces and selects a final answer through majority voting. Unlike standard prompting, it relies on sampling k diverse traces at a higher temperature to encourage variation. Each trace produces a candidate answer, and the most frequently occurring answer is selected as the final output.&lt;/p&gt;

&lt;p&gt;The image below illustrates how both Self-Consistency and Monte Carlo Tree Search (MCTS) sample multiple reasoning paths, but differ fundamentally in how those paths are evaluated—majority voting versus UCB1-based exploration-exploitation balancing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AUKyufmNfjpFnSizTxD1M2w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AUKyufmNfjpFnSizTxD1M2w.png" title="Self-Consistency vs MCTS comparison" alt="Left: Self-Consistency flowchart — sampling k independent traces &amp;amp; selecting most commonly occurring final answer via majority vote. Right: Monte Carlo Tree Search (MCTS) flowchart — sampling new paths through UCB1-based exploration/exploitation tradeoff balancing — both generate multiple possible answers — selection methodology differ significantly." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Self-Consistency vs MCTS comparison&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ConsistencyAgent&lt;/code&gt; uses &lt;code&gt;k=5&lt;/code&gt; samples at a temperature of &lt;code&gt;0.7&lt;/code&gt; by default. It extracts final answers using regex-based pattern matching and selects the most frequent result via &lt;code&gt;counter.most_common()&lt;/code&gt;.&lt;/p&gt;
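
&lt;p&gt;In outline, the voting step looks like this (the &lt;code&gt;Answer:&lt;/code&gt; pattern is a hypothetical answer format for illustration, not the agent’s actual regex):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re
from collections import Counter

def self_consistent_answer(traces):
    """Majority vote over k reasoning traces: regex-extract each final
    answer, then return the most common one."""
    answers = []
    for t in traces:
        m = re.search(r"Answer:\s*(\S+)", t)   # hypothetical answer format
        if m:
            answers.append(m.group(1))
    return Counter(answers).most_common(1)[0][0]

traces = ["... Answer: 42", "... Answer: 41", "steps ... Answer: 42"]
print(self_consistent_answer(traces))   # 42
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;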

&lt;p&gt;Self-Consistency matches CoT on both MMLU (96.7%) and GSM8K (76.7%). Its advantage lies in reliability rather than raw accuracy: majority voting across independent reasoning traces reduces the risk of single-trace errors propagating to the final answer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended usage:&lt;/strong&gt; Factual question answering; multiple-choice style questions; problems where arriving at the correct answer via diverse reasoning paths is more important than inspecting a single reasoning trace.&lt;/p&gt;

&lt;h2&gt;
  
  
  Family 3: Reflective Strategies
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Self-Reflection
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Paper:&lt;/strong&gt; Shinn et al. (2023), “Reflexion: Language Agents with Verbal Reinforcement Learning” — &lt;a href="https://arxiv.org/abs/2303.11366" rel="noopener noreferrer"&gt;arXiv:2303.11366&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Self-Reflection is a draft-critique-refine loop in which the model generates an initial answer, critiques it for errors, and then revises it. The Reflexion paper showed that this iterative process can meaningfully improve output quality, even without any gradient updates.&lt;/p&gt;

&lt;p&gt;The image below shows all three reflective strategies side by side: Self-Reflection, Debate, and Refinement Loop.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AGyy_CHbQa01wEnpRxsWMcA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AGyy_CHbQa01wEnpRxsWMcA.png" title="Reflective strategies comparison" alt="Left: Self-Reflection drafts, critiques, and refines until the critique says “CORRECT.” Right: Debate puts PRO and CON agents against each other with a Judge scoring each round. Bottom: Refinement Loop uses a numeric quality gate (0.0–1.0) to decide when to stop iterating." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Reflective strategies comparison&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SelfReflectionAgent&lt;/code&gt; runs a draft-critique-refine loop for up to five iterations, terminating early when the critique returns “CORRECT” in under 20 characters. This keeps latency low for queries the model answers correctly on the initial pass.&lt;/p&gt;
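
&lt;p&gt;The loop shape, with stub callables in place of the three LLM calls (a sketch, not the agent’s actual source):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def self_reflect(query, draft, critique, refine, max_iters=5):
    """Draft-critique-refine with early exit on a short 'CORRECT' verdict.
    The three callables stand in for LLM calls."""
    answer = draft(query)
    for _ in range(max_iters):
        verdict = critique(query, answer).strip()
        if verdict.upper().startswith("CORRECT") and len(verdict) &lt; 20:
            break                        # critique is satisfied; stop early
        answer = refine(query, answer, verdict)
    return answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;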

&lt;p&gt;&lt;strong&gt;Recommended usage:&lt;/strong&gt; Creative writing, high-stakes technical explanations, anything where “good enough on the first try” is insufficient.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adversarial Debate
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Paper:&lt;/strong&gt; Irving et al. (2018), &lt;a href="https://arxiv.org/abs/1805.00899" rel="noopener noreferrer"&gt;“AI Safety via Debate”&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Irving et al. proposed debate as a mechanism for improving AI safety. Two agents present opposing arguments, and a judge (either a human or another LLM) evaluates their merits. The underlying premise is that identifying flaws in weak arguments is often easier than constructing strong ones.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;DebateAgent&lt;/code&gt; conducts multiple rounds of PRO and CON arguments, with a judge evaluating each exchange. Following all rounds, the strongest arguments from both sides are synthesized into a final answer that balances competing perspectives. Context is carried forward between rounds, enabling incremental refinement rather than redundant arguments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended usage:&lt;/strong&gt; Controversial or ambiguous subjects; policy analysis; ethics and any subject matter requiring a balanced perspective.&lt;/p&gt;

&lt;h3&gt;
  
  
  Refinement Loop
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Paper:&lt;/strong&gt; Madaan et al. (2023), &lt;a href="https://arxiv.org/abs/2303.17651" rel="noopener noreferrer"&gt;“Self-Refine: Iterative Refinement with Self-Feedback”&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This paper describes a refinement loop similar to self-reflection, but instead of relying on a human-style critique to guide revisions, it uses a machine-based evaluation system with quantifiable quality metrics. These metrics determine whether further refinement is necessary. The loop terminates when a predefined quality metric is reached (&amp;gt; 0.9 by default) or when the maximum number of iterations is exceeded.&lt;/p&gt;
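
&lt;p&gt;The quality gate reduces to a simple loop; &lt;code&gt;judge&lt;/code&gt; and &lt;code&gt;improve&lt;/code&gt; below are stand-ins for the machine-evaluation and revision calls, not the library’s actual API:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def refine_until(text, improve, judge, threshold=0.9, max_iters=4):
    """Machine-scored refinement: `judge` returns a quality score in
    0.0-1.0; stop at the threshold or the iteration cap."""
    for _ in range(max_iters):
        if judge(text) &gt; threshold:
            break                        # quality gate passed
        text = improve(text)
    return text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;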

&lt;p&gt;The five-stage complex refinement pipeline consists of sequential stages, each focused on a distinct type of critique: technical accuracy, structure, depth, examples, and polish.&lt;/p&gt;

&lt;p&gt;Each stage targets a distinct aspect of quality, ensuring the model focuses exclusively on improving that dimension rather than attempting to optimize everything at once.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended usage:&lt;/strong&gt; Highly technical writing; documentation; blog posts; any scenario where production-quality output is required rather than simply a first draft.&lt;/p&gt;

&lt;h2&gt;
  
  
  Family 4: Cross-Domain and Meta Strategies
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; Cross-domain strategies enable sharing knowledge among disciplines, while meta-strategies automatically route queries to the most appropriate reasoning technique without requiring manual selection.&lt;/p&gt;

&lt;h3&gt;
  
  
  Analogy-Based Reasoning
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Paper:&lt;/strong&gt; Gentner (1983), &lt;a href="https://doi.org/10.1111/j.1551-6708.1983.tb00497.x" rel="noopener noreferrer"&gt;“Structure Mapping: A Theoretical Framework for Analogy”&lt;/a&gt;, Cognitive Science&lt;/p&gt;

&lt;p&gt;Gentner's structure-mapping theory proposes that analogical reasoning operates by identifying structural correspondences across domains, rather than relying on surface-level similarity. The &lt;code&gt;AnalogicalAgent&lt;/code&gt; builds on this idea through three phases: (1) identify the underlying structure independent of domain specifics, (2) generate analogous solutions from different domains that share that structure, (3) select the most effective analogy and apply its solution approach.&lt;/p&gt;

&lt;p&gt;This process reduces reliance on memorized patterns. By focusing on underlying structure, the model learns &lt;em&gt;why&lt;/em&gt; a solution works, rather than simply recalling &lt;em&gt;what&lt;/em&gt; worked before.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended usage&lt;/strong&gt;: Solving problems that are structurally similar to prior ones, even if they differ superficially; transferring knowledge across domains; explaining complex concepts through analogy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Socratic Questioning
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Paper:&lt;/strong&gt; Paul &amp;amp; Elder (2007), &lt;a href="https://www.criticalthinking.org/" rel="noopener noreferrer"&gt;“The Art of Socratic Questioning”&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Socratic Method:&lt;/strong&gt; Do not answer the question directly. Instead, ask follow-up questions that reduce ambiguity in the solution space.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SocraticAgent&lt;/code&gt; repeatedly asks questions and receives model responses, continuing until it reaches a limit of five question-response exchanges. It then synthesizes the collected information into a final answer. A deduplication or normalization step helps prevent repeated queries that differ only in wording.&lt;/p&gt;
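
&lt;p&gt;A sketch of the loop with its dedup step (the normalization shown is deliberately crude and the names are hypothetical; the library’s approach may differ):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def socratic(query, ask, answer, max_rounds=5):
    """Question/answer loop with crude dedup so reworded repeats stop it."""
    seen, transcript = set(), []
    for _ in range(max_rounds):
        q = ask(query, transcript)
        key = "".join(q.lower().split())     # normalize wording/spacing
        if key in seen:
            break                            # the model is circling; stop
        seen.add(key)
        transcript.append((q, answer(q)))
    return transcript
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;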

&lt;p&gt;&lt;strong&gt;Recommended usage:&lt;/strong&gt; Philosophy; ethics; deep technical knowledge; any field requiring the model to “know” something as opposed to merely answering it.&lt;/p&gt;

&lt;h3&gt;
  
  
  ReAct (Reason + Act)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Paper:&lt;/strong&gt; Yao et al. (2022), &lt;a href="https://arxiv.org/abs/2210.03629" rel="noopener noreferrer"&gt;“ReAct: Synergizing Reasoning and Acting in Language Models”&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;ReAct is a conceptual framework that interweaves reasoning steps with tool invocations, allowing the model to ground its thinking in external information. In practice, the model decides what action to take, calls a tool such as a web search engine, examines the result, updates its reasoning, and repeats the cycle until it reaches a satisfactory answer. Current tools include web scraping, Wikipedia access via an API call, and a calculator interface, with mock versions available for offline execution scenarios.&lt;/p&gt;
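
&lt;p&gt;The cycle can be sketched with a stub &lt;code&gt;think&lt;/code&gt; callable and a tool table (the action protocol shown is an assumption for illustration, not the agent’s actual format):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def react(question, think, tools, max_steps=5):
    """Reason-act loop: `think` returns (thought, action, arg); an action
    of 'finish' ends the loop. `tools` maps action names to callables."""
    observations = []
    for _ in range(max_steps):
        thought, action, arg = think(question, observations)
        if action == "finish":
            return arg                           # final grounded answer
        observations.append(tools[action](arg))  # e.g. search, calculator
    return observations[-1]                      # fall back to last result
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;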

&lt;p&gt;ReAct achieved 70.0% accuracy on ARC-Challenge (science reasoning). While not the highest score on this particular benchmark, it gives the LLM tool use, allowing it to search the Internet for required information.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended usage&lt;/strong&gt;: Fact-checking; current events queries; mathematical calculations; tasks where access to grounded, external information is important.&lt;/p&gt;

&lt;h2&gt;
  
  
  Auto Router: MetaReasoningAgent
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; A single LLM invocation allows &lt;code&gt;MetaReasoningAgent&lt;/code&gt; to classify each input into one of eleven categories and route it to the most appropriate strategy, without human intervention.&lt;/p&gt;

&lt;p&gt;Every strategy is only as good as the decision to use it for a given task, and that decision normally falls to the user. &lt;code&gt;MetaReasoningAgent&lt;/code&gt; eliminates the need for this manual selection.&lt;/p&gt;

&lt;p&gt;The diagram below shows how each category maps to its corresponding strategy.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2ASSObpiuAEGr1s3E7oVbKGA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2ASSObpiuAEGr1s3E7oVbKGA.png" title="MetaReasoningAgent classification diagram" alt="Classification occurs using a single LLM invocation returning CATEGORY, CONFIDENCE, and REASON." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;MetaReasoningAgent classification diagram&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;MetaReasoningAgent&lt;/code&gt; instantiates the selected strategy class and passes control to it, along with all event objects for visualization.&lt;/p&gt;

&lt;p&gt;To use this capability, specify a model such as &lt;code&gt;gemma3:270m+meta&lt;/code&gt; or &lt;code&gt;gemma3:270m+auto&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In practice, routing is generally intuitive: math problems are directed to CoT, logic puzzles to ToT, philosophical questions to Socratic Questioning, and controversial topics to Adversarial Debate.&lt;/p&gt;
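
&lt;p&gt;That routing can be pictured as a simple lookup (the categories and mappings here are illustrative, not the agent’s full eleven-category list):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative category-to-strategy table (the real agent classifies into
# eleven categories with a single LLM call).
ROUTES = {
    "math": "cot", "logic_puzzle": "tot",
    "philosophy": "socratic", "controversial": "debate",
}

def route(query, classify):
    """Map the classifier's category to a strategy tag; default to CoT."""
    return ROUTES.get(classify(query), "cot")

print(route("Is free will real?", lambda q: "philosophy"))   # socratic
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;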

&lt;p&gt;The trade-off is reduced control over strategy-specific hyperparameters in exchange for automatic routing aligned with the problem type.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Strategy Should You Pick? Benchmark Results (March 2026)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; CoT performs best on average (88.7%) across diverse tasks. ReAct excels when tool use is available (70.0% on ARC-Challenge). ToT and Self-Consistency tie on GSM8K math at 76.7%.&lt;/p&gt;

&lt;p&gt;These results are based on 4,200 evaluations across 11 strategies using &lt;code&gt;qwen3.5:9b&lt;/code&gt;, collected as of March 2026. All 16 strategies are implemented and production-ready. However, the benchmarks shown below focus on the 11 that produce a single extractable answer. The remaining five are generation-focused and not suited to multiple-choice evaluation.&lt;/p&gt;

&lt;p&gt;The heat map and bar chart below provide a complete view of the results.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AlkHAnyNpsABYEqnoueCr9g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AlkHAnyNpsABYEqnoueCr9g.png" title="Benchmark results heatmap and bar chart" alt="Left: accuracy heatmap across GSM8K, MMLU, and ARC-Challenge for each strategy. Right: average accuracy bar chart. CoT wins overall at 88.7%." width="800" height="444"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Benchmark results heatmap and bar chart&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The short version:&lt;/strong&gt; CoT wins on average across diverse tasks. Self-Consistency and ToT beat it on specific math benchmarks. ReAct dominates on factual/science tasks. Self-Reflection and Refinement Loop are not well captured by these benchmarks, as they primarily improve generation quality rather than multiple-choice accuracy.&lt;/p&gt;

&lt;p&gt;For most queries, start with &lt;code&gt;+cot&lt;/code&gt;. If you’re solving logic puzzles or planning problems, try &lt;code&gt;+tot&lt;/code&gt;. If you need factually grounded responses, use &lt;code&gt;+react&lt;/code&gt;. If you need polished, high-quality output rather than a quick answer, use &lt;code&gt;+refinement&lt;/code&gt;. When in doubt, &lt;code&gt;+meta&lt;/code&gt; will route the query automatically.&lt;/p&gt;

&lt;p&gt;In my experience building agent-reasoning, the most surprising finding is how much prompt structure alone can improve performance. For example, &lt;code&gt;qwen3.5:9b&lt;/code&gt; improves from 81.3% to 88.7% average accuracy simply by prompting it to produce numbered reasoning steps.&lt;/p&gt;
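&lt;p&gt;To make that concrete, here is a sketch of a numbered-steps prompt wrapper. This illustrates the technique, not the library's exact template:&lt;/p&gt;

```python
def cot_prompt(question: str, steps: int = 4) -> str:
    """Wrap a question in a Chain-of-Thought instruction that asks the
    model for explicitly numbered reasoning steps before the answer."""
    return (
        "Solve the problem below. Think step by step, writing each step "
        f"on its own line as '1.', '2.', ... (up to about {steps} steps), "
        "then give the final answer on a line starting with 'Answer:'.\n\n"
        f"Problem: {question}"
    )

prompt = cot_prompt("A train travels 120 km in 1.5 hours. What is its speed?")
print(prompt)
```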


&lt;p&gt;You can &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/agent-reasoning" rel="noopener noreferrer"&gt;find the repository here&lt;/a&gt;. Install with &lt;code&gt;pip install agent-reasoning&lt;/code&gt; or &lt;code&gt;uv add agent-reasoning&lt;/code&gt;. The commands to get started:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AXo6o2jGEUekHQjIkVWUI_A.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2AXo6o2jGEUekHQjIkVWUI_A.png" title="Getting started commands" alt="Getting started commandsInstallation and launching agent-reasoning in seconds to access a TUI with 16 reasoning agents." width="800" height="379"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Getting started commands&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The TUI provides a 16-agent sidebar, live streaming, and a step-through debugger. Arena mode runs all 16 agents simultaneously on the same query in a 4×4 grid.&lt;/p&gt;

&lt;p&gt;If this is useful, a GitHub star is always appreciated.&lt;/p&gt;




&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Do I need to modify my existing code to use agent-reasoning?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. The interceptor is a drop-in replacement for the Ollama client. Just change the model name string by appending &lt;code&gt;+strategy&lt;/code&gt; (e.g., &lt;code&gt;gemma3:270m+cot&lt;/code&gt;) and the interceptor handles everything else. Existing LangChain pipelines, web UIs, and scripts work without any other changes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Which strategy should I start with?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Start with &lt;code&gt;+cot&lt;/code&gt; (Chain of Thought). It scored the highest average accuracy (88.7%) across our benchmarks and adds minimal latency. If you are unsure, use &lt;code&gt;+meta&lt;/code&gt; and let the auto-router pick the best strategy for you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why were only 11 of the 16 strategies benchmarked?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The benchmarks (GSM8K, MMLU, ARC-Challenge) measure multiple-choice accuracy, which works well for strategies that produce a single extractable answer. The remaining five strategies are generation-focused (e.g., Refinement Loop, MCTS) and their strengths in output quality are not captured by multiple-choice evaluations. All 16 strategies are fully implemented and production-ready.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I use this with models other than Ollama-served models?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Currently the interceptor targets the Ollama API. Since it exposes the same &lt;code&gt;.generate()&lt;/code&gt; and &lt;code&gt;.chat()&lt;/code&gt; endpoints, any Ollama-compatible client works out of the box. Support for additional inference backends is on the roadmap.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How much slower are branching strategies compared to CoT?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;ToT is roughly 5-8x slower than CoT because it generates and evaluates multiple candidate branches at each level. Self-Consistency (k=5 samples) adds similar overhead. For latency-sensitive applications, stick with sequential strategies (CoT, Least-to-Most) and reserve branching strategies for problems where accuracy matters more than speed.&lt;/p&gt;
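&lt;p&gt;One way to reason about that overhead is to count LLM calls per query. The numbers below are an illustrative model (for ToT, &lt;code&gt;b&lt;/code&gt; candidate generations plus one batched evaluation per level), not measurements from agent-reasoning:&lt;/p&gt;

```python
def llm_calls(strategy: str, k: int = 5, b: int = 3, d: int = 2) -> int:
    """Rough per-query LLM call counts (illustrative model, not benchmarks)."""
    if strategy == "cot":
        return 1                # single sequential pass
    if strategy == "self_consistency":
        return k                # k independent samples, then majority vote
    if strategy == "tot":
        return d * (b + 1)      # b generations + 1 batched evaluation per level
    raise ValueError(strategy)

print(llm_calls("tot"))  # 8 calls vs. 1 for CoT
```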

&lt;p&gt;&lt;em&gt;Created by Nacho Martinez, Data Scientist at Oracle. Find Nacho on &lt;a href="https://github.com/jasperan" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and &lt;a href="https://linkedin.com/in/jasperan" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/em&gt;, or visit the &lt;a href="https://www.oracle.com/developer/resources/" rel="noopener noreferrer"&gt;Oracle AI Developer page&lt;/a&gt; for more resources.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>opensource</category>
      <category>python</category>
    </item>
    <item>
      <title>Exploring Elyan Labs: Open-source infrastructure for vintage silicon</title>
      <dc:creator>houariblr</dc:creator>
      <pubDate>Tue, 21 Apr 2026 07:55:03 +0000</pubDate>
      <link>https://forem.com/houariblr/exploring-elyan-labs-open-source-infrastructure-for-vintage-silicon-8a9</link>
      <guid>https://forem.com/houariblr/exploring-elyan-labs-open-source-infrastructure-for-vintage-silicon-8a9</guid>
      <description>&lt;p&gt;I recently looked into Elyan Labs and found their approach to hardware infrastructure quite interesting. They are focusing on the intersection of vintage hardware and open-source development, integrating "Proof of Antiquity" concepts within the RustChain blockchain.&lt;/p&gt;

&lt;p&gt;What caught my attention:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure focus:&lt;/strong&gt; 44+ PRs contributed to core projects like OpenSSL, Ghidra, vLLM, and LLVM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Research:&lt;/strong&gt; a paper accepted at CVPR 2026, which suggests a solid technical foundation behind their hardware attestation models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hardware integration:&lt;/strong&gt; making vintage silicon relevant in a modern AI/blockchain stack is a unique challenge.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Definitely worth a look if you are into low-level systems or hardware-software co-design.&lt;/p&gt;


</description>
      <category>opensource</category>
      <category>web3</category>
      <category>ai</category>
      <category>rust</category>
    </item>
    <item>
      <title>Your AI Agent Now Remembers Your Project: Persistent Memory with vem</title>
      <dc:creator>vem.dev</dc:creator>
      <pubDate>Tue, 21 Apr 2026 07:50:46 +0000</pubDate>
      <link>https://forem.com/vem/your-ai-agent-now-remembers-your-project-persistent-memory-with-vem-2d95</link>
      <guid>https://forem.com/vem/your-ai-agent-now-remembers-your-project-persistent-memory-with-vem-2d95</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;🚀 &lt;strong&gt;vem is in early access&lt;/strong&gt; — we're looking for our first users. If you try it and find it useful, we'd love to hear from you. &lt;strong&gt;Early access is completely free.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Every time you open a new chat with your AI coding assistant you spend the first few minutes re-explaining the same things: what the project does, which patterns you follow, what you were just working on, and why you made the architectural choices you did.&lt;/p&gt;

&lt;p&gt;This is not a UX quirk — it is a structural gap. AI agents are stateless. &lt;strong&gt;vem&lt;/strong&gt; solves this with a local memory layer that lives inside your repository.&lt;/p&gt;




&lt;h2&gt;
  
  
  Prerequisites — Install vem and Link a Project
&lt;/h2&gt;

&lt;p&gt;You need the vem CLI installed, an authenticated account, and a repository linked to a vem cloud project. If you completed the Cycles tutorial you are already set up — skip to the next section.&lt;/p&gt;

&lt;p&gt;Sign up at &lt;a href="https://vem.dev" rel="noopener noreferrer"&gt;vem.dev&lt;/a&gt;, grab your API key from vem.dev/keys, then run the three commands below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Install the CLI globally&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @vemdev/cli

&lt;span class="c"&gt;# 2. Authenticate with your API key from vem.dev/keys&lt;/span&gt;
vem login &amp;lt;your-api-key&amp;gt;

&lt;span class="c"&gt;# 3. Initialise memory in your repo and link to a cloud project&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;my-project
vem init
vem &lt;span class="nb"&gt;link&lt;/span&gt;

&lt;span class="c"&gt;# Confirm everything is connected&lt;/span&gt;
vem status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Problem: AI Agents Forget Everything
&lt;/h2&gt;

&lt;p&gt;As the intro noted, every new chat starts from zero. AI agents are stateless: they have no memory between sessions, so the minutes you spend orienting them at the start of each session are pure overhead, and the accumulated reasoning from previous sessions is permanently lost.&lt;/p&gt;

&lt;p&gt;vem closes this gap with a local memory layer that lives inside your repository. Everything your agents need to hit the ground running (project context, architectural decisions, sprint state) is stored durably in &lt;code&gt;.vem/&lt;/code&gt; and synced to the cloud so agents can query it instantly.&lt;/p&gt;




&lt;h2&gt;
  
  
  How vem Memory Works
&lt;/h2&gt;

&lt;p&gt;vem's memory system is built around four durable artifacts, all stored in &lt;code&gt;.vem/&lt;/code&gt; inside your repository. They are gitignored by default (so secrets never leak) but backed up to the vem cloud for search indexing and team sharing.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CONTEXT.md&lt;/strong&gt; — project overview and "need to know" facts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CURRENT_STATE.md&lt;/strong&gt; — live progress summary updated after each work session&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;decisions/&lt;/strong&gt; — one ADR file per architectural decision&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;tasks/&lt;/strong&gt; — structured task backlog with cycle assignments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;CONTEXT.md&lt;/code&gt; is your project's "North Star" — a human-readable summary of what the project is, who it is for, and the non-obvious things any new contributor (human or AI) needs to know. &lt;code&gt;CURRENT_STATE.md&lt;/code&gt; captures where work stands right now: what just changed, what is in progress, and what is blocked.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;decisions/&lt;/code&gt; directory holds Architectural Decision Records (ADRs) — one file per decision, recording what was chosen, why, and what was considered and rejected. Together these four artifacts give any AI agent a complete, structured picture of your project before it writes a single line of code.&lt;/p&gt;
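&lt;p&gt;The resulting layout is small enough to sketch. The snippet below builds an illustrative skeleton in a temporary directory, using the artifact names listed above (the ADR filename is hypothetical; what &lt;code&gt;vem init&lt;/code&gt; actually scaffolds may differ):&lt;/p&gt;

```python
import tempfile
from pathlib import Path

# Illustrative .vem/ skeleton using the four artifact names described above.
root = Path(tempfile.mkdtemp()) / ".vem"
(root / "decisions").mkdir(parents=True)
(root / "tasks").mkdir()
(root / "CONTEXT.md").write_text("# Project overview and need-to-know facts\n")
(root / "CURRENT_STATE.md").write_text("# What just changed, in progress, blocked\n")
(root / "decisions" / "0001-use-zod-validation.md").write_text("# ADR: one file per decision\n")

print(sorted(p.name for p in root.iterdir()))
# ['CONTEXT.md', 'CURRENT_STATE.md', 'decisions', 'tasks']
```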




&lt;h2&gt;
  
  
  Step 1 — Write Your First Project Context
&lt;/h2&gt;

&lt;p&gt;Start by writing a concise project context. Open &lt;code&gt;.vem/CONTEXT.md&lt;/code&gt; in any editor and describe your project in plain language: what it does, the main tech choices, and any gotchas a new developer would need to know on day one.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;vem context show&lt;/code&gt; prints the current context so you can confirm what your agents will see. After editing, run &lt;code&gt;vem push&lt;/code&gt; to sync it to the cloud immediately.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Open the context file in your editor&lt;/span&gt;
&lt;span class="nv"&gt;$EDITOR&lt;/span&gt; .vem/CONTEXT.md

&lt;span class="c"&gt;# Preview what agents see right now&lt;/span&gt;
vem context show

&lt;span class="c"&gt;# Sync to the cloud after editing&lt;/span&gt;
vem push
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 2 — Record an Architectural Decision
&lt;/h2&gt;

&lt;p&gt;Every non-obvious choice deserves a decision record. &lt;code&gt;vem decision add&lt;/code&gt; writes an ADR to &lt;code&gt;.vem/decisions/&lt;/code&gt; and immediately makes it searchable via the MCP server.&lt;/p&gt;

&lt;p&gt;Include the context (why you faced this decision) and the decision (what you chose). Future agents — and future you — will understand not just what was chosen but why. This prevents the "why did we do it this way?" confusion that slows down every project after the first month.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vem decision add &lt;span class="s2"&gt;"Use Zod for input validation at CLI boundaries"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--context&lt;/span&gt; &lt;span class="s2"&gt;"Catching invalid user input early prevents confusing downstream errors."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--decision&lt;/span&gt; &lt;span class="s2"&gt;"All CLI inputs are validated with Zod schemas before any business logic runs."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 3 — See Exactly What Your Agent Sees
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;vem pack&lt;/code&gt; generates a structured JSON snapshot of your entire project memory — tasks, context, decisions, and sprint state — in a single block. This is the exact payload that the MCP server sends to your AI agent at the start of each session.&lt;/p&gt;

&lt;p&gt;Running &lt;code&gt;vem pack&lt;/code&gt; manually is the fastest way to audit your memory quality. If the output looks thin or outdated, that is what your agents are working with. A well-maintained pack is the difference between an agent that needs three rounds of clarification and one that writes correct code on the first attempt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generate the full context pack&lt;/span&gt;
vem pack

&lt;span class="c"&gt;# Pipe to a file to inspect offline&lt;/span&gt;
vem pack &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /tmp/my-project-context.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
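&lt;p&gt;A quick way to audit a pack programmatically is to flag empty sections. The field names below are hypothetical, for illustration only; inspect your own &lt;code&gt;vem pack&lt;/code&gt; output for the real schema:&lt;/p&gt;

```python
import json

# Hypothetical pack fields for illustration; the real vem pack schema may differ.
pack = json.loads('{"context": "CLI memory tool", "decisions": [], "tasks": [{"id": "TASK-001"}]}')

def audit(pack: dict) -> list:
    """Flag thin memory: an empty section means agents start with a gap."""
    return [key for key, value in pack.items() if not value]

print(audit(pack))  # ['decisions']
```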






&lt;h2&gt;
  
  
  Step 4 — Ask Questions About Your Project
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;vem search&lt;/code&gt; performs semantic search across your project memory — tasks, decisions, context, and changelog entries. It is powered by the vem cloud vector index built from your most recent &lt;code&gt;vem push&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is especially useful for finding related decisions, locating tasks about a specific feature, or checking whether a topic has already been addressed before adding a new decision record.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Search across all memory artifacts&lt;/span&gt;
vem search &lt;span class="s2"&gt;"error handling"&lt;/span&gt;

&lt;span class="c"&gt;# Find decisions related to authentication&lt;/span&gt;
vem search &lt;span class="s2"&gt;"auth"&lt;/span&gt;

&lt;span class="c"&gt;# Find tasks mentioning a specific library&lt;/span&gt;
vem search &lt;span class="s2"&gt;"retry logic"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
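&lt;p&gt;Under the hood this is standard embedding search. Here is a dependency-free sketch of the idea, with bag-of-words cosine similarity standing in for a real vector index (not vem's implementation):&lt;/p&gt;

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, docs: dict) -> list:
    """Rank memory artifacts by similarity to the query (toy model:
    term overlap; a real index would use learned embeddings)."""
    q = Counter(query.lower().split())
    scored = ((cosine(q, Counter(text.lower().split())), name)
              for name, text in docs.items())
    return [name for score, name in sorted(scored, reverse=True) if score]

memory = {
    "ADR-0001": "use zod for input validation at cli boundaries",
    "TASK-003": "add retry logic to the http client",
    "CONTEXT": "cli tool for project memory and agent context",
}
print(search("retry logic", memory))  # ['TASK-003']
```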






&lt;h2&gt;
  
  
  Step 5 — Connect Any Agent via MCP
&lt;/h2&gt;

&lt;p&gt;The vem MCP server is the bridge between your memory layer and any AI agent that supports the Model Context Protocol: Claude Desktop, Cursor, Copilot, and more. Once connected, your agent calls structured tools to read tasks, search memory, and record decisions — no copy-pasting context into the chat window.&lt;/p&gt;

&lt;p&gt;Add the snippet below to your agent's MCP configuration file. Your vem API key is read automatically from &lt;code&gt;~/.vem/config.json&lt;/code&gt; — you never need to expose it in the config.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Privacy matters:&lt;/strong&gt; vem uses a Bring Your Own Key model. Your AI provider keys (OpenAI, Anthropic, etc.) are stored only on your local machine and never sent to the vem cloud.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"vem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"@vemdev/mcp-server"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tools available to your agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Tools exposed by the vem MCP server:&lt;/span&gt;
&lt;span class="c"&gt;# get_active_tasks()     — list current sprint tasks with status&lt;/span&gt;
&lt;span class="c"&gt;# search_memory(query)   — semantic search across all memory artifacts&lt;/span&gt;
&lt;span class="c"&gt;# read_decision(id)      — fetch a specific ADR by ID&lt;/span&gt;
&lt;span class="c"&gt;# update_task(id, ...)   — mark progress and add evidence&lt;/span&gt;
&lt;span class="c"&gt;# record_decision(...)   — write a new ADR from the agent session&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 6 — The Web Memory Dashboard
&lt;/h2&gt;

&lt;p&gt;The vem web app at &lt;a href="https://app.vem.dev" rel="noopener noreferrer"&gt;app.vem.dev&lt;/a&gt; gives you a visual view of everything stored in your project memory. The Context tab shows your &lt;code&gt;CONTEXT.md&lt;/code&gt;, current state, key decisions, and recent changelog entries all on one page.&lt;/p&gt;

&lt;p&gt;The Memory tab hosts a chat interface you can use to ask questions about your project directly from the browser — the same semantic search your agents use, but in a conversational UI. It is particularly useful during onboarding or code review when you need to quickly orient a new contributor.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F02kgzg4lmo6zjugpahuw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F02kgzg4lmo6zjugpahuw.png" alt="vem web context page showing key architectural decisions panel" width="800" height="500"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Key Decisions panel in the vem web app — all ADRs accessible to team members and agents&lt;/em&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Step 7 — Implement Tasks Remotely with the vem Agent
&lt;/h2&gt;

&lt;p&gt;vem does not just store context — it can act on it. The vem agent runner lets you trigger AI-powered task implementation from the web dashboard, delegating work to an agent running on your local dev machine or a cloud runner.&lt;/p&gt;

&lt;p&gt;Your AI keys never leave your machine. The vem cloud only orchestrates which task to run and where — the actual agent execution and code changes happen locally. This is BYOK (Bring Your Own Key) by design.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start the vem runner — listens for tasks dispatched from the web&lt;/span&gt;
vem runner

&lt;span class="c"&gt;# Or specify a particular AI agent&lt;/span&gt;
vem runner &lt;span class="nt"&gt;--agent&lt;/span&gt; claude

&lt;span class="c"&gt;# The runner outputs a secure token you connect in the web Workspace tab&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 8 — Track Agent Activity with Insights
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;vem insights&lt;/code&gt; shows a power score and command frequency breakdown for your project. It surfaces which workflow features you are using, which you are not, and how your agent activity patterns have evolved over time.&lt;/p&gt;

&lt;p&gt;The power score is a simple metric (0–100) that rewards high-value behaviours: agent-driven implementation, decision recording, task-driven work, and memory finalisation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Show power score and command frequency&lt;/span&gt;
vem insights
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 9 — Push Memory to the Cloud
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;vem push&lt;/code&gt; publishes a snapshot of your entire &lt;code&gt;.vem/&lt;/code&gt; memory to the vem cloud. The snapshot is marked &lt;code&gt;pending&lt;/code&gt; until a matching Git push is detected — at that point it is verified using the &lt;code&gt;git_hash&lt;/code&gt; + &lt;code&gt;snapshot_hash&lt;/code&gt; pair and becomes permanently auditable.&lt;/p&gt;
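&lt;p&gt;The pairing idea itself is simple to sketch (an illustration of content-hash verification, not vem's actual scheme; the commit hash below is a placeholder): hash the memory files into a snapshot hash, record it alongside the Git commit hash, and later recompute both to confirm nothing drifted.&lt;/p&gt;

```python
import hashlib

def snapshot_hash(files: dict) -> str:
    """Content hash over memory artifacts, order-independent via sorted paths.
    Illustrates the git_hash + snapshot_hash pairing; not vem's real scheme."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode())
        h.update(files[path].encode())
    return h.hexdigest()

files = {".vem/CONTEXT.md": "overview", ".vem/CURRENT_STATE.md": "auth done"}
record = {"git_hash": "abc123", "snapshot_hash": snapshot_hash(files)}

# Verification: recompute and compare against the recorded pair.
assert record["snapshot_hash"] == snapshot_hash(files)  # unchanged: verifies
assert record["snapshot_hash"] != snapshot_hash({".vem/CONTEXT.md": "edited"})
```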

&lt;p&gt;Push after any significant session: after adding decisions, after completing tasks, after updating context. Your teammates and any agent connected via MCP will immediately see the updated memory on their next request.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Publish current memory snapshot&lt;/span&gt;
vem push

&lt;span class="c"&gt;# Check sync and connection status&lt;/span&gt;
vem status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 10 — Cycle Validation: Memory Stays Correct Over Time
&lt;/h2&gt;

&lt;p&gt;Development never stops. New features can invalidate old decisions, refactors break assumptions captured in &lt;code&gt;CONTEXT.md&lt;/code&gt;, and security issues can surface weeks after the original code was written. vem's cycle validation step is designed exactly for this.&lt;/p&gt;

&lt;p&gt;When you close a sprint with &lt;code&gt;vem cycle validate&lt;/code&gt;, vem checks each completed task's validation steps against the current codebase and flags items that need human review.&lt;/p&gt;

&lt;p&gt;Run validation at the end of each cycle before you mark it done. It takes less than a minute and ensures your memory layer stays trustworthy — so agents in future cycles don't build on stale or incorrect foundations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Validate the active cycle before closing it&lt;/span&gt;
vem cycle validate

&lt;span class="c"&gt;# Review specific task validation results&lt;/span&gt;
vem cycle validate &lt;span class="nt"&gt;--task&lt;/span&gt; TASK-003

&lt;span class="c"&gt;# Close the cycle once validation passes&lt;/span&gt;
vem cycle close
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Full Memory Loop
&lt;/h2&gt;

&lt;p&gt;Every AI session should leave the project in a better state than it found it. That means updated context, recorded decisions, completed tasks, and a fresh push to the cloud. With vem, this loop takes under two minutes and pays dividends on every session that follows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Inspect what your agents currently see&lt;/span&gt;
vem pack

&lt;span class="c"&gt;# 2. Record any decisions made this session&lt;/span&gt;
vem decision add &lt;span class="s2"&gt;"..."&lt;/span&gt; &lt;span class="nt"&gt;--context&lt;/span&gt; &lt;span class="s2"&gt;"..."&lt;/span&gt; &lt;span class="nt"&gt;--decision&lt;/span&gt; &lt;span class="s2"&gt;"..."&lt;/span&gt;

&lt;span class="c"&gt;# 3. Update task progress&lt;/span&gt;
vem task &lt;span class="k"&gt;done &lt;/span&gt;TASK-001 &lt;span class="nt"&gt;--evidence&lt;/span&gt; &lt;span class="s2"&gt;"Implemented in src/auth.ts, tests pass"&lt;/span&gt;

&lt;span class="c"&gt;# 4. Refresh current state summary&lt;/span&gt;
vem context &lt;span class="nb"&gt;set &lt;/span&gt;current &lt;span class="s2"&gt;"Completed auth module. Next: add refresh token rotation."&lt;/span&gt;

&lt;span class="c"&gt;# 5. Push to cloud and verify&lt;/span&gt;
vem push
vem status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your agents start each session with full context. Your decisions are permanent and searchable. Your sprint state is always visible. And your memory is verified against your actual Git history — not just a file on disk.&lt;/p&gt;







&lt;p&gt;&lt;strong&gt;vem is currently in early access.&lt;/strong&gt; We're looking for our first users — developers and teams tired of re-explaining their project to AI agents every session. Early access is &lt;strong&gt;completely free&lt;/strong&gt;. No credit card, no trial timer.&lt;/p&gt;

&lt;p&gt;If you found this useful, &lt;a href="https://vem.dev" rel="noopener noreferrer"&gt;sign up at vem.dev&lt;/a&gt; and let us know what you're building. Your feedback will directly shape the product. 🙏&lt;/p&gt;

</description>
      <category>tutorial</category>
      <category>ai</category>
      <category>mcp</category>
      <category>programming</category>
    </item>
    <item>
      <title>I Built an Experiences Marketplace Five Years Before Airbnb Experiences</title>
      <dc:creator>Talvinder Singh</dc:creator>
      <pubDate>Tue, 21 Apr 2026 07:48:29 +0000</pubDate>
      <link>https://forem.com/talvinder/i-built-an-experiences-marketplace-five-years-before-airbnb-experiences-4bm7</link>
      <guid>https://forem.com/talvinder/i-built-an-experiences-marketplace-five-years-before-airbnb-experiences-4bm7</guid>
      <description>&lt;p&gt;In 2011, we built Tushky — a marketplace for local experiences in India. Cooking classes with home chefs. Heritage walks through old Mumbai. Photography workshops in the Western Ghats. Five years later, Airbnb launched Experiences and scaled the exact same model globally.&lt;/p&gt;

&lt;p&gt;We had the idea first. We executed reasonably well. We still failed.&lt;/p&gt;

&lt;p&gt;The reason wasn't timing or capital or competition. It was something more fundamental: we optimized for transactions when we should have been building social infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Social Capital Gap
&lt;/h2&gt;

&lt;p&gt;Most marketplace failures are diagnosed as "chicken-and-egg problems" — you need supply to attract demand, you need demand to attract supply. That's true but useless. It's like saying you failed because you ran out of money. The question is &lt;em&gt;why&lt;/em&gt; you couldn't solve the bootstrap problem when others did.&lt;/p&gt;

&lt;p&gt;The answer is what I call the &lt;strong&gt;Social Capital Gap&lt;/strong&gt; — the difference between a transactional platform and a community with economic infrastructure built on top.&lt;/p&gt;

&lt;p&gt;Airbnb Experiences closed that gap. We didn't. Not because we didn't understand marketplaces, but because we treated the wrong thing as the product.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we got right
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Profitable unit economics on outbound marketing.&lt;/strong&gt; We could acquire customers through Facebook ads and Google search profitably. Rs 200-300 customer acquisition cost, Rs 800-1200 average booking value, 15-20% take rate. Not venture scale, but sustainable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Easy supplier onboarding.&lt;/strong&gt; Experience providers could create a listing in under 10 minutes. No approval bottleneck. We had 150+ experiences listed within six months.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unique inventory.&lt;/strong&gt; A Parsi chef teaching dhansak in her South Mumbai apartment. A tabla master offering two-hour sessions in Dadar. A birding expert leading dawn walks in Sanjay Gandhi National Park.&lt;/p&gt;

&lt;p&gt;The product worked. People booked. Providers got paid. Reviews were positive.&lt;/p&gt;

&lt;p&gt;Transactions hit a wall at about 80-100 bookings per month.&lt;/p&gt;

&lt;p&gt;We couldn't break through. We added more experiences. We improved search. We ran more ads. We tried discounting. Nothing moved the number sustainably.&lt;/p&gt;

&lt;p&gt;The diagnosis in our internal docs: "Repeat customers were not getting enough options and first timers wanted more options to decide from."&lt;/p&gt;

&lt;p&gt;That diagnosis was wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  What actually broke
&lt;/h2&gt;

&lt;p&gt;The real problem was visible in how our experience providers talked about us.&lt;/p&gt;

&lt;p&gt;We wanted to be seen as business partners. We positioned ourselves that way in pitch decks and partner communications. But providers saw us as a booking channel — one of several ways they got customers, not materially different from their own Facebook page or a listing on JustDial.&lt;/p&gt;

&lt;p&gt;When we asked providers to promote Tushky to their existing customers, most didn't. When we asked them to refer other providers, most didn't. When we suggested they collaborate on multi-experience packages, almost none did.&lt;/p&gt;

&lt;p&gt;They had no social capital invested in the platform. We were a lead source, not a community.&lt;/p&gt;

&lt;p&gt;Compare that to what Airbnb built. They didn't just launch a booking interface. They built host meetups. They created an online forum where hosts shared tips. They featured hosts in marketing materials with their stories, not just their listings. They built a brand that hosts were proud to be associated with.&lt;/p&gt;

&lt;p&gt;Their CTO told me years later: "The product is not the website. It's the final booking." Meaning: the value isn't in the interface, it's in the trust infrastructure that makes the transaction possible.&lt;/p&gt;

&lt;p&gt;We built a website. They built social capital.&lt;/p&gt;

&lt;h2&gt;
  
  
  The numbers that should have told us
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Our repeat booking rate: 12-15%&lt;/li&gt;
&lt;li&gt;Our provider referral rate: &amp;lt;5%&lt;/li&gt;
&lt;li&gt;Our provider-to-provider collaboration rate: 0%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those aren't marketplace metrics. Those are lead-generation metrics.&lt;/p&gt;

&lt;p&gt;A real marketplace creates network effects. Each new provider should make the platform more valuable to customers. Each new customer should make the platform more valuable to providers. We had linear growth at best.&lt;/p&gt;

&lt;p&gt;We also made a strategic error on marketing. Outbound worked. We could buy traffic profitably. So we kept doing it. What we didn't realize until it was too late: outbound marketing scales linearly with spend. Inbound marketing — SEO, word of mouth, community — scales exponentially but takes longer to build.&lt;/p&gt;

&lt;p&gt;From our internal strategy doc in 2013: "Inbound marketing is the way to go. Build extremely loyal experience partner base. They will do word of mouth for you."&lt;/p&gt;

&lt;p&gt;We knew it. We wrote it down. We didn't do it. Because outbound delivered this month's numbers. Inbound required believing in next year's numbers. We were optimizing for the wrong time horizon.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I got wrong
&lt;/h2&gt;

&lt;p&gt;I treated the chicken-and-egg problem as a supply problem. I thought: get enough experiences listed, and demand will follow. So we focused on making supplier onboarding frictionless.&lt;/p&gt;

&lt;p&gt;That was backwards.&lt;/p&gt;

&lt;p&gt;The constraint wasn't the number of listings. It was the depth of engagement. We needed 20 customers who booked 5 times each, not 100 customers who booked once. We needed suppliers who saw Tushky as their primary channel, not one of five. Who would promote it to their customers. Who would collaborate with other suppliers. Who had reputational skin in the game.&lt;/p&gt;

&lt;p&gt;That requires a different product. Not a listing interface. A community infrastructure.&lt;/p&gt;

&lt;p&gt;We also underestimated the importance of curation and quality signaling. We made listing easy, which meant we had a quality variance problem. Some experiences were exceptional. Some were mediocre. Customers couldn't tell the difference from the listing page. Airbnb solved this with detailed reviews, verified photos, and editorial featuring. We had basic star ratings.&lt;/p&gt;

&lt;p&gt;The final mistake: we thought being first was an advantage. It's not. Being first means you absorb all the market education cost. You teach customers that "experience marketplaces" exist. Then someone with more capital and better execution takes the market you created.&lt;/p&gt;

&lt;p&gt;First-mover advantage is real in network-effect businesses only if you can build the network faster than competitors can copy the product. We couldn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  The test that matters
&lt;/h2&gt;

&lt;p&gt;If you're building a marketplace, here's the question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Are your suppliers investing social capital in your platform, or are they just using it as a lead source?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If it's the latter, you don't have a marketplace. You have a lead-gen business with marketplace unit economics. That's not venture-scalable. It's also not defensible.&lt;/p&gt;

&lt;p&gt;The test is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do suppliers refer other suppliers?&lt;/li&gt;
&lt;li&gt;Do suppliers promote your platform to their existing customers?&lt;/li&gt;
&lt;li&gt;Do suppliers collaborate with each other through your platform?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the answer to all three is no, you haven't built the social infrastructure yet. You've built a directory.&lt;/p&gt;

&lt;p&gt;We spent two years optimizing transaction flow when we should have been building community. By the time we realized it, we didn't have the capital or the team energy to rebuild.&lt;/p&gt;

&lt;p&gt;Airbnb had the capital. They also had something harder to replicate: they understood from day one that the product wasn't the booking form. It was the trust system that made strangers willing to transact.&lt;/p&gt;

&lt;p&gt;I still don't know if we could have won even if we'd understood this earlier. The India market in 2011 wasn't ready for experiential consumption at scale. Airbnb launched Experiences in 2016 into a global market that had already been trained by Airbnb Stays.&lt;/p&gt;

&lt;p&gt;But I know we lost for the wrong reasons. We lost because we optimized for the transaction when we should have been building the social capital that makes transactions possible at scale.&lt;/p&gt;

&lt;p&gt;The question I'm still working through: how do you build social capital infrastructure before you have transaction volume? Community requires critical mass. But you can't get to critical mass without community.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;That's the real chicken-and-egg problem. Not supply and demand. Trust and scale.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://talvinder.com/build-logs/experiences-before-airbnb/?utm_source=devto&amp;amp;utm_medium=syndication&amp;amp;utm_campaign=experiences-before-airbnb" rel="noopener noreferrer"&gt;talvinder.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>marketplaces</category>
      <category>startuplessons</category>
      <category>indiastartups</category>
    </item>
  </channel>
</rss>
