<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Geoffrey Litt</title>
  <id>http://geoffreylitt.com/</id>
  <link href="https://geoffreylitt.com"/>
  <link href="https://geoffreylitt.com/feed.xml" rel="self"/>
  <updated>2025-10-24T14:59:00+00:00</updated>
  <author>
    <name>Geoffrey Litt</name>
  </author>
  <entry>
    <title>Code like a surgeon</title>
    <link rel="alternate" href="https://geoffreylitt.com/2025/10/24/code-like-a-surgeon.html"/>
    <id>https://geoffreylitt.com/2025/10/24/code-like-a-surgeon.html</id>
    <published>2025-10-24T14:59:00+00:00</published>
    <updated>2025-10-24T14:59:00+00:00</updated>
    <author>
      <name>Geoffrey Litt</name>
    </author>
    <summary type="html">&lt;p&gt;A lot of people say AI will make us all “managers” or “editors”…but I think this is a dangerously incomplete view!&lt;/p&gt;

&lt;p&gt;Personally, I’m trying to &lt;strong&gt;code like a surgeon.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A surgeon isn’t a manager, they do the actual work! But their skills and time are highly...&lt;/p&gt;</summary>
    <content type="html">&lt;p&gt;A lot of people say AI will make us all &amp;ldquo;managers&amp;rdquo; or &amp;ldquo;editors&amp;rdquo;&amp;hellip;but I think this is a dangerously incomplete view!&lt;/p&gt;

&lt;p&gt;Personally, I&amp;rsquo;m trying to &lt;strong&gt;code like a surgeon.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A surgeon isn&amp;rsquo;t a manager, they do the actual work! But their skills and time are highly leveraged with a support team that handles prep, secondary tasks, admin. The surgeon focuses on the important stuff they are uniquely good at.&lt;/p&gt;

&lt;p&gt;My current goal with AI coding tools is to spend 100% of my time doing stuff that matters. (As a UI prototyper, that mostly means tinkering with design concepts.)&lt;/p&gt;

&lt;p&gt;It turns out there are a LOT of secondary tasks which AI agents are now good enough to help out with. Some things I&amp;rsquo;m finding useful to hand off these days:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Before attempting a big task, write a guide to relevant areas of the codebase&lt;/li&gt;
&lt;li&gt;Spike out an attempt at a big change. Often I won&amp;rsquo;t use the result but I&amp;rsquo;ll review it as a sketch of where to go&lt;/li&gt;
&lt;li&gt;Fix TypeScript errors or bugs which have a clear specification&lt;/li&gt;
&lt;li&gt;Write documentation about what I&amp;rsquo;m building&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I often find it useful to run these secondary tasks async in the background &amp;ndash; while I&amp;rsquo;m eating lunch, or even literally overnight!&lt;/p&gt;

&lt;p&gt;When I sit down for a work session, I want to feel like a surgeon walking into a prepped operating room. Everything is ready for me to do what I&amp;rsquo;m good at.&lt;/p&gt;

&lt;h2 id="mind-the-autonomy-slider"&gt;Mind the autonomy slider&lt;/h2&gt;

&lt;p&gt;Notably, there is a &lt;em&gt;huge&lt;/em&gt; difference between how I use AI for primary vs secondary tasks.&lt;/p&gt;

&lt;p&gt;For the core design prototyping work, I still do a lot of coding by hand, and when I do use AI, I&amp;rsquo;m more careful and in the details. I need fast feedback loops and good visibility. (eg, I like Cursor tab-complete here)&lt;/p&gt;

&lt;p&gt;Whereas for secondary tasks, I&amp;rsquo;m much much looser with it, happy to let an agent churn for hours in the background. The ability to get the job done eventually is the most important thing; speed and visibility matter less. Claude Code has been my go-to for long unsupervised sessions but Codex CLI is becoming a strong contender there too, possibly my new favorite.&lt;/p&gt;

&lt;p&gt;These are &lt;em&gt;very&lt;/em&gt; different work patterns! Reminds me of Andrej Karpathy&amp;rsquo;s &lt;a href="https://www.latent.space/p/s3"&gt;&amp;ldquo;autonomy slider&amp;rdquo;&lt;/a&gt; concept. &lt;strong&gt;It&amp;rsquo;s dangerous to conflate different parts of the autonomy spectrum&lt;/strong&gt; &amp;ndash; the tools and mindset that are needed vary quite a lot.&lt;/p&gt;

&lt;h2 id="your-agent-doesnt-need-a-career-trajectory"&gt;Your agent doesn&amp;rsquo;t need a career trajectory&lt;/h2&gt;

&lt;p&gt;The &amp;ldquo;software surgeon&amp;rdquo; concept is a very old idea &amp;ndash; Fred Brooks attributes it to Harlan Mills in his 1975 classic &amp;ldquo;The Mythical Man-Month&amp;rdquo;. He &lt;a href="https://www.embeddedrelated.com/showarticle/1484.php#:~:text=Mills%20proposes%20that%20each%20segment%20of%20a%20large%20job%20be%20tackled%20by%20a%20team%2C%20but%20that%20the%20team%20be%20organized%20like%20a%20surgical%20team%20rather%20than%20a%20hog%2Dbutchering%20team."&gt;talks about&lt;/a&gt; a &amp;ldquo;chief programmer&amp;rdquo; who is supported by various staff including a &amp;ldquo;copilot&amp;rdquo; and various administrators. Of course, at the time, the idea was to have humans be in these support roles.&lt;/p&gt;

&lt;p&gt;OK, so there is a super obvious angle here, that &amp;ldquo;AI has now made this approach economically viable where it wasn&amp;rsquo;t before&amp;rdquo;, yes yes&amp;hellip; but &lt;strong&gt;I am also noticing a more subtle thing at play, something to do with status hierarchies.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A lot of the &amp;ldquo;secondary&amp;rdquo; tasks are &amp;ldquo;grunt work&amp;rdquo;, not the most intellectually fulfilling or creative part of the work. I have a strong preference for teams where everyone shares the grunt work; I hate the idea of giving all the grunt work to some lower-status members of the team. Yes, junior members will often have more grunt work, but they should also be given many interesting tasks to help them grow.&lt;/p&gt;

&lt;p&gt;With AI this concern completely disappears! &lt;strong&gt;Now I can happily delegate pure grunt work.&lt;/strong&gt; And the 24/7 availability is a big deal. I would never call a human intern at 11pm and tell them to have a research report on some code ready by 7am&amp;hellip; but here I am, commanding my agent to do just that!&lt;/p&gt;

&lt;h2 id="notion-is-for-surgeons"&gt;Notion is for surgeons?&lt;/h2&gt;

&lt;p&gt;Finally I&amp;rsquo;ll mention a couple thoughts on how this approach to work intersects with my employer, &lt;a href="https://notion.com/"&gt;Notion&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;First, as an employee, I find it incredibly valuable right now to work at a place that is bullish on AI coding tools. Having support for heavy use of AI coding tools, and a codebase that&amp;rsquo;s well set up for it, is enabling serious productivity gains for me &amp;ndash; &lt;em&gt;especially&lt;/em&gt; as a newcomer to a big codebase.&lt;/p&gt;

&lt;p&gt;Secondly, as a product &amp;ndash; in a sense I would say we are trying to bring this way of working to a broader group of knowledge workers beyond programmers. When I think about how that will play out, I like the mental model of enabling everyone to &amp;ldquo;work like a surgeon&amp;rdquo;.&lt;/p&gt;

&lt;p&gt;The goal isn&amp;rsquo;t to delegate your core work, it&amp;rsquo;s to &lt;strong&gt;identify and delegate the secondary grunt work tasks, so you can focus on the main thing that matters.&lt;/strong&gt;&lt;/p&gt;

&lt;hr&gt;

&lt;h2 id="related-reads"&gt;Related reads&lt;/h2&gt;

&lt;p&gt;If you liked this perspective, you might enjoy reading these other posts I&amp;rsquo;ve written about the nature of human-AI collaboration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="/2025/07/27/enough-ai-copilots-we-need-ai-huds"&gt;Enough AI copilots! We need AI HUDs&lt;/a&gt;: &amp;ldquo;anyone serious about designing for AI should consider non-copilot form factors that more directly extend the human mind&amp;hellip;&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;a href="/2024/12/22/making-programming-more-fun-with-an-ai-generated-debugger"&gt;AI-generated tools can make programming more fun&lt;/a&gt;: &amp;ldquo;Instead, I used AI to build a custom debugger UI… which made it more fun for me to do the coding myself&amp;hellip;&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;a href="/2023/02/26/llm-as-muse-not-oracle"&gt;ChatGPT as muse, not oracle&lt;/a&gt;: &amp;ldquo;What if we were to think of LLMs not as tools for answering questions, but as tools for asking us questions and inspiring our creativity?&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
</content>
  </entry>
  <entry>
    <title>AI as teleportation</title>
    <link rel="alternate" href="https://geoffreylitt.com/2025/09/10/ai-as-teleportation.html"/>
    <id>https://geoffreylitt.com/2025/09/10/ai-as-teleportation.html</id>
    <published>2025-09-10T19:40:00+00:00</published>
    <updated>2025-09-10T19:40:00+00:00</updated>
    <author>
      <name>Geoffrey Litt</name>
    </author>
    <summary type="html">&lt;p&gt;Here’s a thought experiment for pondering the effects AI might have on society: What if we invented teleportation?&lt;/p&gt;

&lt;p&gt;A bit odd, I know, but bear with me…&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;The year is 2035. The Auto Go Instant (AGI) teleporter has been invented. You can now go anywhere...&lt;/p&gt;</summary>
    <content type="html">&lt;p&gt;Here&amp;rsquo;s a thought experiment for pondering the effects AI might have on society: What if we invented teleportation?&lt;/p&gt;

&lt;p&gt;A bit odd, I know, but bear with me&amp;hellip;&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;The year is 2035. The Auto Go Instant (AGI) teleporter has been invented. You can now go anywhere&amp;hellip; instantly!&lt;/p&gt;

&lt;p&gt;At first the tech is expensive and unreliable. Critics laugh. &amp;ldquo;Hah, look at these stupid billionaires who can&amp;rsquo;t spend a minute of their time moving around like the rest of us. And 5% of the time they end up in the wrong place, LOL&amp;rdquo;&lt;/p&gt;

&lt;p&gt;But soon things get cheaper and better. The tech hits mass market.&lt;/p&gt;

&lt;p&gt;There are huge benefits. Global commerce is supercharged. Instead of commuting, people can spend more time with family and friends. Pollution is way down. The AGI company runs a sweet commercial of people teleporting to see their parents one last time before they die.&lt;/p&gt;

&lt;p&gt;At the same time, some weird things start happening.&lt;/p&gt;

&lt;p&gt;The landscape starts reconfiguring around the new reality. Families move to remote cabins, just seconds away from urban amenities. The summit of Mt. Everest becomes crowded with influencers. (It turns out that if you stay just a few seconds, you can take a quick selfie without needing an oxygen mask!)&lt;/p&gt;

&lt;p&gt;Physical health takes a hit for many people. It&amp;rsquo;s harder to justify walking or biking when you could just be there now.&lt;/p&gt;

&lt;p&gt;In-between moments disappear. One moment you&amp;rsquo;re at work, the next you&amp;rsquo;re at your dinner table at home. No more time to reset or prepare for a new context.&lt;/p&gt;

&lt;p&gt;But the biggest change is the loss of serendipity. When you teleport, you decide in advance where you&amp;rsquo;re headed. You never run into an old friend on the street, or stop at a farmstand by the side of the road, or see a store you might want to stop into someday.&lt;/p&gt;

&lt;p&gt;To modern teenagers, the idea of wandering out without an exact destination in mind becomes unthinkable. You start with the GPS coordinates, and then you just&amp;hellip; go.&lt;/p&gt;

&lt;p&gt;Advocates of the new way point out that there&amp;rsquo;s nothing stopping anyone from choosing traditional methods for fun. And indeed, the cross-country road trip does see a mild resurgence as a hipster thing.&lt;/p&gt;

&lt;p&gt;But when push comes to shove, most people struggle to make the time for wandering—our schedules are now arranged around an assumption of instant transport.&lt;/p&gt;

&lt;p&gt;This isn&amp;rsquo;t exactly to say that the old way was better. Most people can agree that teleportation is a net win. Yet for those who remember, there&amp;rsquo;s a vague unease, a sense that something important was lost in the world&amp;hellip;.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;In his book Technology and the Character of Contemporary Life, the philosopher Albert Borgmann talks about wood-burning stoves in houses.&lt;/p&gt;

&lt;p&gt;What is a stove? Yes, it warms the house&amp;hellip; but it&amp;rsquo;s also so much more than that. You gotta cut the wood, you gotta start the fire in the morning&amp;hellip;&lt;/p&gt;

&lt;p&gt;&amp;ldquo;A stove used to furnish more than mere warmth. It was a focus, a hearth, a place that gathered the work and leisure of a family and gave the house a center.&amp;rdquo;&lt;/p&gt;

&lt;p&gt;When you switch to a modern central heating system, you cut out all these inconveniences. Fantastic!&lt;/p&gt;

&lt;p&gt;Oh, and by the way, your family social life is totally different&amp;hellip;.. wait what?? Yes, the inconveniences were inconvenient. But they were also holding up something in your life and culture, and now they&amp;rsquo;re suddenly gone.&lt;/p&gt;

&lt;p&gt;I think of this as kind of a Chesterton&amp;rsquo;s fence on hard mode. Yes, the stove was put there for warmth, that was the main goal. But you should also think hard about its secondary effects before replacing it.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;OK so&amp;hellip; how does this apply to AI?&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;m personally excited about AI and think it can improve our lives in a lot of ways. But at the same time I&amp;rsquo;m trying to be mindful of secondary effects and unintended consequences.&lt;/p&gt;

&lt;p&gt;Here&amp;rsquo;s one example. If your mental model of reading is &amp;ldquo;transmit facts into my head&amp;rdquo;, then reading an AI summary of something might seem like a more efficient way to get that task done.&lt;/p&gt;

&lt;p&gt;But if your mental model of reading is &amp;ldquo;spend time marinating in a world of ideas&amp;rdquo;, then reducing the time spent reading doesn&amp;rsquo;t help you much.&lt;/p&gt;

&lt;p&gt;The point was the journey you underwent while reading, and you replaced it with teleportation.&lt;/p&gt;

&lt;p&gt;Another example. One of the great joys of my life is having nerdy friends explain things to me. Now I can get explanations from AI with less friction, anytime, anywhere, with endless follow-up.&lt;/p&gt;

&lt;p&gt;Even if the AI explanations are &amp;ldquo;better&amp;rdquo;, there&amp;rsquo;s a social cost. I can try to mindfully nudge myself to still ask people questions, but now it requires more effort.&lt;/p&gt;

&lt;p&gt;Final example: I&amp;rsquo;m trying to be mindful of the effects of vibe coding when designing software interfaces. On the one hand, it can really speed up my iteration loop and help me explore more ideas.&lt;/p&gt;

&lt;p&gt;But at the same time, part of my design process is sitting with the details of the thing and uncovering it as I go—more a muscle memory process than a conscious plan. Messing with this process can change the results in ways that are hard to predict!&lt;/p&gt;

&lt;p&gt;I guess the throughline for all of these examples is: sometimes the friction and inconvenience is where the good stuff happens. Gotta be very careful removing it.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;The takeaway here isn&amp;rsquo;t that &amp;ldquo;AI is bad&amp;rdquo;. I&amp;rsquo;ll just say that I&amp;rsquo;m personally trying to be mindful about keeping good friction around.&lt;/p&gt;

&lt;p&gt;During COVID, we kinda got teleportation via Zoom for a while. I decided to &amp;ldquo;virtual commute&amp;rdquo; every day, walking around the block to get some fresh air and a reset before/after work. This wasn&amp;rsquo;t a big deal but I found it really helpful.&lt;/p&gt;

&lt;p&gt;As AI makes a lot of things easier, it&amp;rsquo;ll be interesting to ponder what kinds of new frictions we&amp;rsquo;ll want to intentionally add to our lives. Teleportation isn&amp;rsquo;t always the best answer&amp;hellip;&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Enough AI copilots! We need AI HUDs</title>
    <link rel="alternate" href="https://geoffreylitt.com/2025/07/27/enough-ai-copilots-we-need-ai-huds.html"/>
    <id>https://geoffreylitt.com/2025/07/27/enough-ai-copilots-we-need-ai-huds.html</id>
    <published>2025-07-27T20:50:00+00:00</published>
    <updated>2025-07-27T20:50:00+00:00</updated>
    <author>
      <name>Geoffrey Litt</name>
    </author>
    <summary type="html">&lt;p&gt;In my opinion, one of the best critiques of modern AI design comes from &lt;a href="https://cgi.csc.liv.ac.uk/~coopes/comp319/2016/papers/UbiquitousComputingAndInterfaceAgents-Weiser.pdf"&gt;a 1992 talk&lt;/a&gt; by the researcher &lt;a href="https://en.wikipedia.org/wiki/Mark_Weiser"&gt;Mark Weiser&lt;/a&gt; where he ranted against “copilot” as a metaphor for AI.&lt;/p&gt;

&lt;p&gt;This was 33 years ago, but it’s still incredibly relevant for anyone designing...&lt;/p&gt;</summary>
    <content type="html">&lt;p&gt;In my opinion, one of the best critiques of modern AI design comes from &lt;a href="https://cgi.csc.liv.ac.uk/~coopes/comp319/2016/papers/UbiquitousComputingAndInterfaceAgents-Weiser.pdf"&gt;a 1992 talk&lt;/a&gt; by the researcher &lt;a href="https://en.wikipedia.org/wiki/Mark_Weiser"&gt;Mark Weiser&lt;/a&gt; where he ranted against &amp;ldquo;copilot&amp;rdquo; as a metaphor for AI.&lt;/p&gt;

&lt;p&gt;This was 33 years ago, but it&amp;rsquo;s still incredibly relevant for anyone designing with AI.&lt;/p&gt;

&lt;h2 id="weisers-rant"&gt;Weiser&amp;rsquo;s rant&lt;/h2&gt;

&lt;p&gt;Weiser was speaking at an &lt;a href="https://www.dropbox.com/scl/fo/axpzd925tcsnkc9x5nd51/AJMdLqxafEYFun4Ns6fqMHo?dl=0&amp;amp;e=1&amp;amp;preview=frames_1992_014_Nov.pdf&amp;amp;rlkey=znit21hyth8w24m6gm02rq2y7"&gt;MIT Media Lab event&lt;/a&gt; on &amp;ldquo;interface agents&amp;rdquo;. They were grappling with many of the same issues we&amp;rsquo;re discussing in 2025: how to make a personal assistant that automates tasks for you and knows your full context. They even had a human &amp;ldquo;butler&amp;rdquo; on stage representing an AI agent.&lt;/p&gt;

&lt;p&gt;Everyone was super excited about this&amp;hellip; except Weiser. He was opposed to the whole idea of agents! He gave this example: how should a computer help you fly a plane and avoid collisions?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The agentic option is a &amp;ldquo;copilot&amp;rdquo; — a virtual human who you talk with to get help flying the plane.&lt;/strong&gt; If you&amp;rsquo;re about to run into another plane it might yell at you &amp;ldquo;collision, go right and down!&amp;rdquo;&lt;/p&gt;

&lt;p&gt;Weiser offered a different option: &lt;strong&gt;design the cockpit so that the human pilot is naturally aware of their surroundings.&lt;/strong&gt; In his words: &amp;ldquo;You’ll no more run into another airplane than you would try to walk through a wall.&amp;rdquo;&lt;/p&gt;

&lt;p&gt;Weiser&amp;rsquo;s goal was an &amp;ldquo;invisible computer&amp;rdquo;—not an assistant that grabs your attention, but a computer that fades into the background and becomes &amp;ldquo;an extension of [your] body&amp;rdquo;.&lt;/p&gt;

&lt;figure style="margin: 0;"&gt;
  &lt;img src="/images/article_images/weiser-slide.png" alt=""&gt;
  &lt;figcaption&gt;Weiser&amp;rsquo;s 1992 slide on airplane interfaces&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;h2 id="huds"&gt;HUDs&lt;/h2&gt;

&lt;p&gt;There&amp;rsquo;s a tool in modern planes that I think nicely illustrates Weiser&amp;rsquo;s philosophy: &lt;strong&gt;the Head-Up Display (HUD), which overlays flight info like the horizon and altitude on a transparent display directly in the pilot&amp;rsquo;s field of view.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A HUD feels completely different from a copilot! You don&amp;rsquo;t talk to it. It&amp;rsquo;s literally partly invisible—you just become naturally aware of more things, as if you had magic eyes.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/copilot-hud.png" alt="" /&gt;&lt;/p&gt;

&lt;h2 id="designing-huds"&gt;Designing HUDs&lt;/h2&gt;

&lt;p&gt;OK enough analogies. What might a HUD feel like in modern software design?&lt;/p&gt;

&lt;p&gt;One familiar example is spellcheck. Think about it: &lt;strong&gt;spellcheck isn&amp;rsquo;t designed as a &amp;ldquo;virtual collaborator&amp;rdquo; talking to you about your spelling.&lt;/strong&gt; It just instantly adds red squigglies when you misspell something! You now have a new sense you didn&amp;rsquo;t have before. It&amp;rsquo;s a HUD.&lt;/p&gt;

&lt;p&gt;(This example comes from Jeffrey Heer&amp;rsquo;s excellent &lt;a href="https://idl.cs.washington.edu/files/2019-AgencyPlusAutomation-PNAS.pdf"&gt;Agency plus Automation&lt;/a&gt; paper. We may not consider spellcheck an AI feature today, but it&amp;rsquo;s still a fuzzy algorithm under the hood.)&lt;/p&gt;

&lt;figure style="margin: 0;"&gt;
  &lt;img src="/images/article_images/spellcheck.png" alt=""&gt;
  &lt;figcaption&gt;Spellcheck makes you aware of misspelled words without an &amp;ldquo;assistant&amp;rdquo; interface.&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Here&amp;rsquo;s another personal example from AI coding. Let&amp;rsquo;s say you want to fix a bug. The obvious &amp;ldquo;copilot&amp;rdquo; way is to open an agent chat and ask it to do the fix.&lt;/p&gt;

&lt;p&gt;But there&amp;rsquo;s another approach I&amp;rsquo;ve found more powerful at times: &lt;strong&gt;use AI to build a custom debugger UI which visualizes the behavior of my program!&lt;/strong&gt; In one example, I &lt;a href="/2024/12/22/making-programming-more-fun-with-an-ai-generated-debugger.html"&gt;built a hacker-themed debug view of a Prolog interpreter&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;With the debugger, I have a HUD! I have new senses, I can see how my program runs. The HUD extends beyond the narrow task of fixing the bug. I can ambiently build up my own understanding, spotting new problems and opportunities.&lt;/p&gt;

&lt;video autoplay loop controls="controls" preload="auto" muted="muted" data-video="0" type="video/mp4" src="/images/article_images/debugger/demo.mp4" width="100%"&gt;&lt;/video&gt;

&lt;p&gt;Both the spellchecker and custom debuggers show that automation / &amp;ldquo;virtual assistant&amp;rdquo; isn&amp;rsquo;t the only possible UI. We can instead use tech to build better HUDs that enhance our human senses.&lt;/p&gt;

&lt;h2 id="tradeoffs"&gt;Tradeoffs&lt;/h2&gt;

&lt;p&gt;I don&amp;rsquo;t believe HUDs are universally better than copilots! But I do believe &lt;strong&gt;anyone serious about designing for AI should consider non-copilot form factors that more directly extend the human mind.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So when should we use one or the other? I think it&amp;rsquo;s quite tricky to answer that, but we can try to use the airplane analogy for some intuition:&lt;/p&gt;

&lt;p&gt;When pilots just want the plane to fly straight and level, they fully delegate that task to an autopilot, which is close to a &amp;ldquo;virtual copilot&amp;rdquo;. But if the plane just hit a flock of birds and needs to land in the Hudson, the pilot is going to take manual control, and we better hope they have great instruments that help them understand the situation.&lt;/p&gt;

&lt;p&gt;In other words: routine predictable work might make sense to delegate to a virtual copilot / assistant. But when you&amp;rsquo;re shooting for extraordinary outcomes, perhaps the best bet is to equip human experts with new superpowers.&lt;/p&gt;

&lt;hr&gt;

&lt;h2 id="further-reading"&gt;Further reading&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A nice discussion of one approach to this idea can be found in &lt;a href="https://distill.pub/2017/aia/"&gt;Using Artificial Intelligence to Augment Human Intelligence&lt;/a&gt; by Michael Nielsen and Shan Carter.&lt;/li&gt;
&lt;li&gt;A more cryptic take on the same topic: &lt;a href="/2025/06/29/chat-ai-dialogue.html"&gt;Is chat a good UI for AI? A Socratic dialogue&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A discussion of how the HUD philosophy intersects with on-demand software creation: &lt;a href="/2023/03/25/llm-end-user-programming.html"&gt;Malleable software in the age of LLMs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content>
  </entry>
  <entry>
    <title>Is chat a good UI for AI? A Socratic dialogue</title>
    <link rel="alternate" href="https://geoffreylitt.com/2025/06/29/chat-ai-dialogue.html"/>
    <id>https://geoffreylitt.com/2025/06/29/chat-ai-dialogue.html</id>
    <published>2025-06-29T14:17:00+00:00</published>
    <updated>2025-06-29T14:17:00+00:00</updated>
    <author>
      <name>Geoffrey Litt</name>
    </author>
    <summary type="html">&lt;p&gt;The pupil was confused. Some people on Design Twitter said that chat isn’t a good UI for AI… but then chat seemed to be winning in many products? He climbed Mount GPT to consult a wizard…&lt;/p&gt;

&lt;p&gt;🐣: please wizard tell me once and for all. is chat a good UI...&lt;/p&gt;</summary>
    <content type="html">&lt;p&gt;The pupil was confused. Some people on Design Twitter said that chat isn&amp;rsquo;t a good UI for AI&amp;hellip; but then chat seemed to be winning in many products? He climbed Mount GPT to consult a wizard&amp;hellip;&lt;/p&gt;

&lt;p&gt;🐣: please wizard tell me once and for all. is chat a good UI for AI?&lt;/p&gt;

&lt;p&gt;🧙: well, aren&amp;rsquo;t we chatting now?&lt;/p&gt;

&lt;p&gt;🐣: &amp;hellip;?&lt;/p&gt;

&lt;p&gt;🧙: should this conversation be a traditional GUI?&lt;/p&gt;

&lt;p&gt;🐣: no, it could never be!&lt;/p&gt;

&lt;p&gt;🧙: why not?&lt;/p&gt;

&lt;p&gt;🐣: uh&amp;hellip; you can&amp;rsquo;t click buttons and drag sliders to ask open-ended questions like this?&lt;/p&gt;

&lt;p&gt;🧙: precisely! chat is marvelous, done.&lt;/p&gt;

&lt;p&gt;🐣: dude seriously? i came all the way here for &lt;em&gt;that&lt;/em&gt;?&lt;/p&gt;

&lt;p&gt;🧙: yep. i&amp;rsquo;ll tell you the best route down the mountain. straight 1000 ft, left 50 degrees, straight 2 miles—&lt;/p&gt;

&lt;p&gt;🐣: hold on hold on. do you have a map handy?&lt;/p&gt;

&lt;p&gt;🧙: aha! here&amp;rsquo;s a map i had in my pocket. is this a GUI?&lt;/p&gt;

&lt;p&gt;🐣: well, this map is just a piece of paper, so no?&lt;/p&gt;

&lt;p&gt;🧙: ok, what is a paper map?&lt;/p&gt;

&lt;p&gt;🐣: uh&amp;hellip; a better way to see the world?&lt;/p&gt;

&lt;p&gt;🧙: indeed! for certain things, a map is the way to see. for other things, a diagram, a chart, a table. this is the first precept:&lt;/p&gt;

&lt;div style="border: 1px solid black; padding: 10px; margin: 10px 0;"&gt;
Text is not the universal information visualization.
&lt;/div&gt;

&lt;p&gt;🐣: ok fine. but info viz can fit &lt;em&gt;into&lt;/em&gt; a chat can&amp;rsquo;t it? like, if i ask Siri or ChatGPT for the weather, they&amp;rsquo;ll show me a little weather card&amp;hellip;but it&amp;rsquo;s still basically chat&lt;/p&gt;

&lt;p&gt;🧙: where do you live on this map?&lt;/p&gt;

&lt;p&gt;🐣: right th&amp;ndash;&lt;/p&gt;

&lt;p&gt;🧙: hands in your pockets!&lt;/p&gt;

&lt;p&gt;🐣: ?&lt;/p&gt;

&lt;p&gt;🧙: no pointing. tell me where you live&lt;/p&gt;

&lt;p&gt;🐣: &amp;hellip;.well, see how there&amp;rsquo;s a little lake up by the top left? no not that lake&amp;hellip; a bit to the right.. no no next one over&amp;ndash;&lt;/p&gt;

&lt;p&gt;🧙: hahaha&lt;/p&gt;

&lt;p&gt;🐣: &amp;hellip; ok fine i get your point! this sucks.&lt;/p&gt;

&lt;p&gt;🧙: indeed! pointing is great. for referring to things, for precisely cropping an image in the right spot&amp;hellip;&lt;/p&gt;

&lt;p&gt;🐣: ok fine. but if we talk while you show me the map and i point at it, that still feels like a chat? we&amp;rsquo;re layering on information visualization and precision input, but natural language is still doing the heavy lifting?&lt;/p&gt;

&lt;p&gt;🧙: (&lt;em&gt;points at a rock, then at the ground&lt;/em&gt;) put that there.&lt;/p&gt;

&lt;p&gt;🐣: huh?&lt;/p&gt;

&lt;p&gt;🧙: put that there!&lt;/p&gt;

&lt;p&gt;🐣: (&lt;em&gt;moves the rock&lt;/em&gt;) ok, what was that about?&lt;/p&gt;

&lt;p&gt;🧙: As you say, we needed our fingers and our voices both. This leads to a second precept:&lt;/p&gt;

&lt;div style="border: 1px solid black; padding: 10px; margin: 10px 0;"&gt;
Natural language and precision inputs are complementary.
&lt;/div&gt;

&lt;p&gt;btw want a compass?&lt;/p&gt;

&lt;p&gt;🐣: yeah that&amp;rsquo;ll actually be helpful on the way down.&lt;/p&gt;

&lt;p&gt;🧙: cool, i can give you a regular compass, or Mr. Magnetic, a magical fairy who can tell you which way you&amp;rsquo;re pointed.&lt;/p&gt;

&lt;p&gt;🐣: i&amp;rsquo;ll take the regular compass? I did Boy Scouts so I know how to read it, it just becomes part of me in a sense. i definitely don&amp;rsquo;t need to have a whole damn conversation every time.&lt;/p&gt;

&lt;p&gt;🧙: ah yes, you see it. the compass pairs information visualization and precision inputs with a low-latency feedback loop, becoming an extension of your mind. this is one of our great powers as humans—to shoot an arrow or swing a club.&lt;/p&gt;

&lt;p&gt;🐣: ok that&amp;rsquo;s cool. but dude i&amp;rsquo;ve been here a while and i feel like we haven&amp;rsquo;t even really talked about GUIs!&lt;/p&gt;

&lt;p&gt;🧙: you&amp;rsquo;re right, time for dinner. let&amp;rsquo;s order a pizza&lt;/p&gt;

&lt;p&gt;🐣: &amp;hellip;&lt;/p&gt;

&lt;p&gt;🧙: can you order one?&lt;/p&gt;

&lt;p&gt;🐣: fine. I&amp;rsquo;ll see if UberEats delivers up here.&lt;/p&gt;

&lt;p&gt;🧙: why not call the restaurant?&lt;/p&gt;

&lt;p&gt;🐣: are you kidding me? i&amp;rsquo;m not a boomer.&lt;/p&gt;

&lt;p&gt;🧙: is UberEats a GUI?&lt;/p&gt;

&lt;p&gt;🐣: yes?&lt;/p&gt;

&lt;p&gt;🧙: does it work well?&lt;/p&gt;

&lt;p&gt;🐣: yeah it&amp;rsquo;s fine! gets the job done.&lt;/p&gt;

&lt;p&gt;🧙: why not chat over the phone instead?&lt;/p&gt;

&lt;p&gt;🐣: well, ordering food is the same thing every time! even when you talk to the person you&amp;rsquo;re both just following a script, really. the app just makes it faster to follow that script.&lt;/p&gt;

&lt;p&gt;🧙: indeed! this is our third precept:&lt;/p&gt;

&lt;div style="border: 1px solid black; padding: 10px; margin: 10px 0;"&gt;
Graphical interfaces can make repeated workflows nicer.
&lt;/div&gt;

&lt;p&gt;🐣: ok i get it. but idk man, i feel like this is all kinda obvious and we haven&amp;rsquo;t hit the heart of the matter? yes chat is better for open-ended workflows, and GUIs can be better when the task is repeated. but how do they relate?&lt;/p&gt;

&lt;p&gt;🧙: hey i host seminars up here every week and it&amp;rsquo;s kinda tedious. could you show me the button in UberEats where I can enter the estimated attendance and then it orders the right number of pizzas?&lt;/p&gt;

&lt;p&gt;🐣: umm that&amp;rsquo;s not a thing?&lt;/p&gt;

&lt;p&gt;🧙: why not? i want it.&lt;/p&gt;

&lt;p&gt;🐣: uhh, this is UberEats, not a seminar organizer app?&lt;/p&gt;

&lt;p&gt;🧙: oh right good point! in that case let&amp;rsquo;s add a button on the calendar invite i can press which will order the pizzas.&lt;/p&gt;

&lt;p&gt;🐣: dude what do you mean? the calendar app is just a calendar app, not a seminar organizer. you can&amp;rsquo;t just change your software like this.&lt;/p&gt;

&lt;p&gt;🧙: hm, what are my options then?&lt;/p&gt;

&lt;p&gt;🐣: ooh i have an idea! have you heard of MCP? if we just install the right servers then you can program a seminar planner agent in Claude to do this every week for you.&lt;/p&gt;

&lt;p&gt;🧙: sounds fine for the first few times while i&amp;rsquo;m figuring it out. but&amp;ndash;is planning a seminar not a repeated workflow?&lt;/p&gt;

&lt;p&gt;🐣: &amp;hellip; yes, i think it is?&lt;/p&gt;

&lt;p&gt;🧙: did we not say that GUIs can speed up repeated workflows? why do i need to stay in chat for this? also btw, i want my assistant to help out with this, and an app would help them know what to do.&lt;/p&gt;

&lt;p&gt;🐣: i mean, i&amp;rsquo;m not sure there&amp;rsquo;s a good app for seminar planning that does what you want. lemme search on the app st&amp;ndash;&lt;/p&gt;

&lt;p&gt;🧙: wait! a GUI that someone else made will not fit &lt;em&gt;my&lt;/em&gt; seminar planning needs. i need my own preferred workflow to be the one that is encoded in the tool.&lt;/p&gt;

&lt;p&gt;🐣: ohh i see! this actually might not be that much work, have you heard of vibe coding? i&amp;rsquo;ll open up Claude Artifacts and get cookin.&lt;/p&gt;

&lt;p&gt;🧙: thanks, lemme know when you&amp;rsquo;ve added the seminar pizza feature to UberEats!&lt;/p&gt;

&lt;p&gt;🐣: oh well, I was thinking it&amp;rsquo;s not gonna be added to uber eats exactly &amp;ndash; i&amp;rsquo;m gonna make a new web app that does all this.&lt;/p&gt;

&lt;p&gt;🧙: why? UberEats already has great UI for the checkout flow, I just need one little feature added.&lt;/p&gt;

&lt;p&gt;🐣: i mean i see your point, but you can&amp;rsquo;t really add your own features to UberEats? you don&amp;rsquo;t control it.&lt;/p&gt;

&lt;p&gt;🧙: haven&amp;rsquo;t they heard of vibe coding over there?&lt;/p&gt;

&lt;p&gt;🐣: dude that&amp;rsquo;s not how software works. sure everyone can code now but that doesn&amp;rsquo;t mean you can just edit any app.&lt;/p&gt;

&lt;p&gt;🧙: why not?&lt;/p&gt;

&lt;p&gt;🐣: er&amp;hellip; it sounds kinda messy? and i guess all of this app stuff was invented before AI came along anyway?&lt;/p&gt;

&lt;p&gt;🧙: when you paint a wall do you need to ask permission of the company that made the wall?&lt;/p&gt;

&lt;p&gt;🐣: &amp;hellip; hm. when you put it that way&amp;hellip; i see what you&amp;rsquo;re getting at. if all the GUIs you already use could be edited, then you wouldn&amp;rsquo;t need to resort to chat as much to fill in the seams. instead you could just change the GUIs to do what you want!&lt;/p&gt;

&lt;p&gt;🧙: aha! yes, now you see. if the UI is fixed, then it cannot respond to my needs. but if it is &lt;em&gt;malleable&lt;/em&gt;, then I can evolve it over time. This is the fourth and final precept for today:&lt;/p&gt;

&lt;div style="border: 1px solid black; padding: 10px; margin: 10px 0;"&gt;
A malleable UI pairs the ergonomics of GUIs with the open-ended flexibility of chat.
&lt;/div&gt;

&lt;p&gt;🐣: neat. this seems hard though, wouldn&amp;rsquo;t we need to rethink how the App Store works?&lt;/p&gt;

&lt;p&gt;🧙: indeed. and that is a longer conversation for another time.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;&lt;em&gt;Note from the editor: to keep exploring, &lt;a href="https://www.inkandswitch.com/essay/malleable-software/"&gt;read this&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Stevens: a hackable AI assistant using a single SQLite table and a handful of cron jobs</title>
    <link rel="alternate" href="https://geoffreylitt.com/2025/04/12/how-i-made-a-useful-ai-assistant-with-one-sqlite-table-and-a-handful-of-cron-jobs.html"/>
    <id>https://geoffreylitt.com/2025/04/12/how-i-made-a-useful-ai-assistant-with-one-sqlite-table-and-a-handful-of-cron-jobs.html</id>
    <published>2025-04-12T14:40:00+00:00</published>
    <updated>2025-04-12T14:40:00+00:00</updated>
    <author>
      <name>Geoffrey Litt</name>
    </author>
    <summary type="html">&lt;p&gt;There’s a lot of hype these days around patterns for building with AI. Agents, memory, RAG, assistants—so many buzzwords! But the reality is, &lt;strong&gt;you don’t need fancy techniques or libraries to build useful personal tools with LLMs.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In this short post...&lt;/p&gt;</summary>
    <content type="html">&lt;p&gt;There&amp;rsquo;s a lot of hype these days around patterns for building with AI. Agents, memory, RAG, assistants—so many buzzwords! But the reality is, &lt;strong&gt;you don&amp;rsquo;t need fancy techniques or libraries to build useful personal tools with LLMs.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In this short post, I&amp;rsquo;ll show you how I built a useful AI assistant for my family using a dead simple architecture: a single SQLite table of memories, and a handful of cron jobs for ingesting memories and sending updates, all hosted on &lt;a href="https://www.val.town"&gt;Val.town&lt;/a&gt;. The whole thing is so simple that you can easily copy and extend it yourself.&lt;/p&gt;

&lt;h2 id="meet-stevens"&gt;Meet Stevens&lt;/h2&gt;

&lt;p&gt;The assistant is called Stevens, named after the butler in the great Ishiguro novel &lt;a href="https://en.wikipedia.org/wiki/The_Remains_of_the_Day"&gt;Remains of the Day&lt;/a&gt;. Every morning it sends a brief to me and my wife via Telegram, including our calendar schedules for the day, a preview of the weather forecast, any postal mail or packages we&amp;rsquo;re expected to receive, and any reminders we&amp;rsquo;ve asked it to keep track of. All written up nice and formally, just like you&amp;rsquo;d expect from a proper butler.&lt;/p&gt;

&lt;p&gt;Here&amp;rsquo;s an example. (I&amp;rsquo;ll use fake data throughout this post, because our actual updates contain private information.)&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/stevens/telegram.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;Beyond the daily brief, we can communicate with Stevens on-demand—we can forward an email with some important info, or just leave a reminder or ask a question via Telegram chat.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/stevens/coffee.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;That&amp;rsquo;s Stevens. It&amp;rsquo;s rudimentary, but already more useful to me than Siri!&lt;/p&gt;

&lt;h2 id="behind-the-scenes"&gt;Behind the scenes&lt;/h2&gt;

&lt;p&gt;Let&amp;rsquo;s break down the simple architecture behind Stevens. The whole thing is hosted on &lt;a href="https://www.val.town"&gt;Val.town&lt;/a&gt;, a lovely platform that offers SQLite storage, HTTP request handling, scheduled cron jobs, and inbound/outbound email: a perfect set of capabilities for this project.&lt;/p&gt;

&lt;p&gt;First, how does Stevens know what goes in the morning brief? The key is the butler&amp;rsquo;s notebook, a log of everything that Stevens knows. There&amp;rsquo;s an admin view where we can see the notebook contents—let&amp;rsquo;s peek and see what&amp;rsquo;s in there:&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/stevens/notebook.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;You can see some of the entries that fed into the morning brief above—for example, the parent-teacher conference has a log entry.&lt;/p&gt;

&lt;p&gt;In addition to some text, entries can have a &lt;em&gt;date&lt;/em&gt; when they are expected to be relevant.  There are also entries with no date that serve as general background info, and are always included. You can see these particular background memories came from a Telegram chat, because Stevens does an intake interview via Telegram when you first get started:&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/stevens/background.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With this notebook in hand, sending the morning brief is easy&lt;/strong&gt;: just run a cron job which makes a call to the Claude API to write the update, and then sends the text to a Telegram thread. As context for the model, we include any log entries dated for the coming week, as well as the undated background entries.&lt;/p&gt;

&lt;p&gt;Under the hood, the &amp;ldquo;notebook&amp;rdquo; is just a single SQLite table with a few columns. Here&amp;rsquo;s a more boring view of things:&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/stevens/db.png" alt="" /&gt;&lt;/p&gt;
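
&lt;p&gt;To make the query concrete, here&amp;rsquo;s a minimal sketch in Python of how the context for the morning brief could be selected. This is not the actual Stevens code (which runs on Val.town). The table name &lt;code&gt;memories&lt;/code&gt;, the column names, and the &lt;code&gt;brief_context&lt;/code&gt; helper are all assumptions for illustration, and the rows are made-up sample data.&lt;/p&gt;

```python
import sqlite3
from datetime import date, timedelta

# Assumed schema: the post only says "a single SQLite table with a few columns",
# so every name below is hypothetical.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memories (
    id INTEGER PRIMARY KEY,
    relevant_date TEXT,     -- ISO date the entry applies to; NULL means background info
    content TEXT NOT NULL,  -- free-form text that gets fed to the LLM
    created_by TEXT         -- which importer wrote the entry
)
"""


def brief_context(conn, today):
    """Return entries dated within the coming week, plus all undated background entries."""
    week_end = today + timedelta(days=7)
    rows = conn.execute(
        "SELECT content FROM memories "
        "WHERE relevant_date IS NULL OR relevant_date BETWEEN ? AND ? "
        "ORDER BY relevant_date IS NULL DESC, relevant_date",
        (today.isoformat(), week_end.isoformat()),
    ).fetchall()
    return [content for (content,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO memories (relevant_date, content, created_by) VALUES (?, ?, ?)",
    [
        (None, "The family prefers formal updates.", "telegram"),
        ("2025-04-14", "Parent-teacher conference at 4pm.", "calendar"),
        ("2025-06-01", "Dentist appointment.", "calendar"),
    ],
)

context = brief_context(conn, date(2025, 4, 12))
print(context)
```

&lt;p&gt;The undated background entry and the entry dated two days out are returned, while the June entry is left out. In the real system, text like this would be passed to the model as context for writing the brief.&lt;/p&gt;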

&lt;p&gt;But wait: how did the various log entries get there in the first place? In the admin view, we can watch Stevens buzzing around entering things into the log from various sources:&lt;/p&gt;

&lt;video width="100%" controls&gt;
  &lt;source src="/images/article_images/stevens/cron.mp4" type="video/mp4"&gt;
&lt;/video&gt;

&lt;p&gt;This is just some data importers populating the table:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An hourly data pull from the Google Calendar API&lt;/li&gt;
&lt;li&gt;An hourly check of the local weather forecast using a weather API&lt;/li&gt;
&lt;li&gt;I forward &lt;a href="https://www.usps.com/manage/informed-delivery.htm"&gt;USPS Informed Delivery&lt;/a&gt; emails containing scans of our postal mail, and Stevens OCRs them using Claude&lt;/li&gt;
&lt;li&gt;Inbound Telegram and email messages can also result in log entries&lt;/li&gt;
&lt;li&gt;Every week, some &amp;ldquo;fun facts&amp;rdquo; get added into the log, as a way of adding some color to future daily updates.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;This system is easily extensible with new importers.&lt;/strong&gt; An importer is just any process that adds/edits memories in the log. The memory contents can be any arbitrary text, since they&amp;rsquo;ll just be fed back into an LLM later anyways.&lt;/p&gt;
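
&lt;p&gt;As a rough illustration of how small an importer can be, here&amp;rsquo;s a sketch in Python. It is not the actual Stevens code: the &lt;code&gt;memories&lt;/code&gt; table layout, the &lt;code&gt;add_memory&lt;/code&gt; and &lt;code&gt;import_forecast&lt;/code&gt; helpers, and the forecast dictionary (a stand-in for a real weather API response) are all assumptions.&lt;/p&gt;

```python
import sqlite3


def add_memory(conn, content, created_by, relevant_date=None):
    """Append one free-text memory; relevant_date is an ISO date string or None."""
    conn.execute(
        "INSERT INTO memories (relevant_date, content, created_by) VALUES (?, ?, ?)",
        (relevant_date, content, created_by),
    )


def import_forecast(conn, forecast):
    """Hypothetical weather importer. `forecast` maps ISO dates to summaries,
    standing in for whatever a real weather API would return."""
    for day, summary in forecast.items():
        # Replace this importer's earlier entry for the same day, so that
        # hourly runs update the forecast instead of piling up duplicates.
        conn.execute(
            "DELETE FROM memories WHERE created_by = ? AND relevant_date = ?",
            ("weather", day),
        )
        add_memory(conn, f"Weather forecast: {summary}", "weather", day)


conn = sqlite3.connect(":memory:")
# Assumed table layout; the post does not spell out the real columns.
conn.execute(
    "CREATE TABLE memories ("
    "id INTEGER PRIMARY KEY, relevant_date TEXT, content TEXT NOT NULL, created_by TEXT)"
)

# Two simulated hourly runs: the second one revises the first day's forecast.
import_forecast(conn, {"2025-04-12": "Sunny, high of 68F"})
import_forecast(conn, {"2025-04-12": "Cloudy, high of 61F", "2025-04-13": "Rain"})

rows = conn.execute(
    "SELECT relevant_date, content FROM memories ORDER BY relevant_date"
).fetchall()
print(rows)
```

&lt;p&gt;Because the memory contents are plain text, a new importer only has to decide what to write and which date it applies to. Deleting and re-inserting is one simple way to handle the &amp;ldquo;edits&amp;rdquo; case for a job that runs repeatedly.&lt;/p&gt;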

&lt;h2 id="reflections"&gt;Reflections&lt;/h2&gt;

&lt;p&gt;A few quick reflections on this project:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It&amp;rsquo;s very useful for personal AI tools to have access to broader context from other information sources.&lt;/strong&gt; Awareness of things like my calendar and the weather forecast turns a dumb chatbot into a useful assistant. ChatGPT recently added memory of past conversations, but there&amp;rsquo;s lots of information not stored within that silo. I&amp;rsquo;ve &lt;a href="https://x.com/geoffreylitt/status/1810442615264796864"&gt;written before&lt;/a&gt; about how the endgame for AI-driven personal software isn&amp;rsquo;t more app silos, it&amp;rsquo;s small tools operating on a shared pool of context about our lives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&amp;ldquo;Memory&amp;rdquo; can start simple.&lt;/strong&gt; In this case, the use cases of the assistant are limited, and its information is inherently time-bounded, so it&amp;rsquo;s fairly easy to query for the relevant context to give to the LLM. It also helps that some modern models have long context windows. As the available information grows in size, RAG and &lt;a href="https://x.com/sjwhitmore/status/1910439061615239520"&gt;fancier&lt;/a&gt; &lt;a href="https://arxiv.org/abs/2304.03442"&gt;approaches&lt;/a&gt; to memory may be needed, but you can start simple.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vibe coding enables sillier projects.&lt;/strong&gt; Initially, Stevens spoke with a dry tone, like you might expect from a generic Apple or Google product. But it turned out it was just more &lt;em&gt;fun&lt;/em&gt; to have the assistant speak like a formal butler. This was trivial to do, just a couple lines in a prompt. Similarly, I decided to make the admin dashboard views feel like a video game, because why not? I generated the image assets in ChatGPT, and vibe coded the whole UI in Cursor + Claude 3.7 Sonnet; it took a tiny bit of extra effort in exchange for a lot more fun.&lt;/p&gt;

&lt;h2 id="try-it-yourself"&gt;Try it yourself&lt;/h2&gt;

&lt;p&gt;Stevens isn&amp;rsquo;t a product you can run out of the box, it&amp;rsquo;s just a personal project I made for myself.&lt;/p&gt;

&lt;p&gt;But if you&amp;rsquo;re curious, you can check out the code and fork the project &lt;a href="https://www.val.town/x/geoffreylitt/stevensDemo"&gt;here&lt;/a&gt;. You should be able to apply this basic pattern—a single memories table and an extensible constellation of cron jobs—to do lots of other useful things.&lt;/p&gt;

&lt;p&gt;I recommend editing the code using your AI editor of choice, with the &lt;a href="https://github.com/val-town/vt"&gt;Val Town CLI&lt;/a&gt; to sync to your local filesystem.&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Avoid the nightmare bicycle</title>
    <link rel="alternate" href="https://geoffreylitt.com/2025/03/03/the-nightmare-bicycle.html"/>
    <id>https://geoffreylitt.com/2025/03/03/the-nightmare-bicycle.html</id>
    <published>2025-03-03T22:13:00+00:00</published>
    <updated>2025-03-03T22:13:00+00:00</updated>
    <author>
      <name>Geoffrey Litt</name>
    </author>
    <summary type="html">&lt;p&gt;In my opinion, one of the most important ideas in product design is to avoid the “nightmare bicycle”.&lt;/p&gt;

&lt;p&gt;Imagine a bicycle where the product manager said: “people don’t get math so we can’t have numbered gears. We need labeled buttons for gravel mode...&lt;/p&gt;</summary>
    <content type="html">&lt;p&gt;In my opinion, one of the most important ideas in product design is to avoid the &amp;ldquo;nightmare bicycle&amp;rdquo;.&lt;/p&gt;

&lt;p&gt;Imagine a bicycle where the product manager said: &amp;ldquo;people don&amp;rsquo;t get math so we can&amp;rsquo;t have numbered gears. We need labeled buttons for gravel mode, downhill mode, &amp;hellip;&amp;rdquo;&lt;/p&gt;

&lt;p&gt;This is the hypothetical &amp;ldquo;nightmare bicycle&amp;rdquo; that Andrea diSessa imagines in his book &lt;a href="https://mitpress.mit.edu/9780262541329/changing-minds/"&gt;Changing Minds&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As he points out: it would be terrible! We&amp;rsquo;d lose the intuitive understanding of how to use the gears to solve any situation we encounter. Which mode do you use for gravel + downhill?&lt;/p&gt;

&lt;p&gt;It turns out, anyone can understand numbered gears totally fine after a bit of practice. People are capable!&lt;/p&gt;

&lt;p&gt;Along the same lines: one of the worst misconceptions in product design is that a microwave needs to have a button for every thing you could possibly cook: &amp;ldquo;popcorn&amp;rdquo;, &amp;ldquo;chicken&amp;rdquo;, &amp;ldquo;potato&amp;rdquo;, &amp;ldquo;frozen vegetable&amp;rdquo;, bla bla bla.&lt;/p&gt;

&lt;p&gt;You really don&amp;rsquo;t! You can just have a time (and power) button. People will figure out how to cook stuff.&lt;/p&gt;

&lt;p&gt;Good designs expose systematic structure; they lean on their users&amp;rsquo; ability to understand this structure and apply it to new situations. We were born for this.&lt;/p&gt;

&lt;p&gt;Bad designs paper over the structure with superficial labels that hide the underlying system, inhibiting their users&amp;rsquo; ability to actually build a clear model in their heads.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/nightmare-bicycle.jpeg" alt="Two pages from a book describing the nightmare bicycle concept" /&gt;&lt;/p&gt;

&lt;p&gt;p.s. Changing Minds is one of the best books ever written about design and computational thinking, you should go read it.&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>AI-generated tools can make programming more fun</title>
    <link rel="alternate" href="https://geoffreylitt.com/2024/12/22/making-programming-more-fun-with-an-ai-generated-debugger.html"/>
    <id>https://geoffreylitt.com/2024/12/22/making-programming-more-fun-with-an-ai-generated-debugger.html</id>
    <published>2024-12-22T14:05:00+00:00</published>
    <updated>2024-12-22T14:05:00+00:00</updated>
    <author>
      <name>Geoffrey Litt</name>
    </author>
    <summary type="html">&lt;p&gt;I want to tell you about a neat experience I had with AI-assisted programming this week. What’s unusual here is: &lt;strong&gt;the AI didn’t write a single line of my code.&lt;/strong&gt; Instead, I used AI to build a &lt;em&gt;custom debugger UI&lt;/em&gt;… which made it more fun for me to do the...&lt;/p&gt;</summary>
    <content type="html">&lt;p&gt;I want to tell you about a neat experience I had with AI-assisted programming this week. What&amp;rsquo;s unusual here is: &lt;strong&gt;the AI didn&amp;rsquo;t write a single line of my code.&lt;/strong&gt; Instead, I used AI to build a &lt;em&gt;custom debugger UI&lt;/em&gt;&amp;hellip; which made it more fun for me to do the coding myself.&lt;/p&gt;

&lt;div style="text-align: center; width: 100%"&gt;* * *&lt;/div&gt;

&lt;p&gt;I was hacking on a Prolog interpreter as a learning project. &lt;a href="https://en.wikipedia.org/wiki/Prolog"&gt;Prolog&lt;/a&gt; is a logic language where the user defines facts and rules, and then the system helps answer queries. A basic interpreter for this language turns out to be an elegant little program with surprising power—a perfect project for a fun learning experience.&lt;/p&gt;

&lt;p&gt;The trouble is: it&amp;rsquo;s also a bit finicky to get the details right. I encountered some bugs in my implementation of a key step called &lt;a href="https://en.wikipedia.org/wiki/Unification_(computer_science)"&gt;&lt;em&gt;unification&lt;/em&gt;&lt;/a&gt; (solving symbolic equations), which was leading to weird behavior downstream. I tried logging some information at each step of execution, but I was still parsing through screens of text output looking for patterns.&lt;/p&gt;
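
&lt;p&gt;For readers who haven&amp;rsquo;t seen unification before, here&amp;rsquo;s a minimal sketch of the idea in Python. It is not the interpreter from this post. The term representation (capitalized strings as variables, lowercase strings as atoms, tuples as compound terms) and the helper names are assumptions, and it skips the occurs check for brevity.&lt;/p&gt;

```python
def is_var(term):
    """Assumed convention: strings starting with an uppercase letter are variables."""
    return isinstance(term, str) and term[:1].isupper()


def walk(term, subst):
    """Follow variable bindings until reaching an unbound variable or a non-variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(a, b, subst):
    """Return a substitution extending `subst` that makes a and b equal, or None.

    Note: no occurs check, so a term like X = f(X) is not rejected.
    """
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        # Unify compound terms element by element, threading the substitution.
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


# parent(X, bob) against parent(alice, Y)
bindings = unify(("parent", "X", "bob"), ("parent", "alice", "Y"), {})
print(bindings)

# Different functors cannot be unified.
mismatch = unify(("f", "X"), ("g", "X"), {})
print(mismatch)
```

&lt;p&gt;The first call binds &lt;code&gt;X&lt;/code&gt; to &lt;code&gt;alice&lt;/code&gt; and &lt;code&gt;Y&lt;/code&gt; to &lt;code&gt;bob&lt;/code&gt;; the second returns &lt;code&gt;None&lt;/code&gt;. Small mistakes in how bindings are threaded through a function like this are exactly the kind of thing that produces confusing behavior later in a trace.&lt;/p&gt;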

&lt;p&gt;I needed better visibility. So, I asked &lt;a href="https://support.anthropic.com/en/articles/9487310-what-are-artifacts-and-how-do-i-use-them"&gt;Claude Artifacts&lt;/a&gt; to whip up a custom UI for viewing one of my execution traces. After a few iterations, here&amp;rsquo;s where it ended up:&lt;/p&gt;

&lt;video autoplay loop controls="controls" preload="auto" muted="muted" data-video="0" type="video/mp4" src="/images/article_images/debugger/demo.mp4" width="100%"&gt;&lt;/video&gt;

&lt;p&gt;I could step through an execution and see a clear visualization of my interpreter&amp;rsquo;s stack: how it has broken down goals to solve; which rule it&amp;rsquo;s currently evaluating; variable assignments active in the current context; when it&amp;rsquo;s come across a solution. The timeline shows an overview of the execution, letting me manually jump to any point to inspect the state. I could even leave a note annotating that point of the trace.&lt;/p&gt;

&lt;p&gt;Oh yeah, and don&amp;rsquo;t forget the most important feature: the retro design 😎.&lt;/p&gt;

&lt;p&gt;Using this interactive debug UI gave me far clearer visibility than a terminal of print statements. I caught a couple bugs immediately just by being able to see variable assignments more clearly. A repeating pattern of solutions in the timeline view led me to discover an infinite loop bug.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;And, above all: I started having more fun!&lt;/strong&gt; When I got stuck on bugs, it felt like I was getting stuck in interesting, essential ways, not on dumb mistakes. I was able to get an intuitive grasp of my interpreter&amp;rsquo;s operation, and then home in on problems. As a bonus, the visual aesthetic made debugging feel more like a puzzle game than a depressing slog.&lt;/p&gt;

&lt;div style="text-align: center; width: 100%"&gt;* * *&lt;/div&gt;

&lt;p&gt;Two things that stick out to me about this experience are 1) how fast it was to get started, and 2) how fast it was to iterate.&lt;/p&gt;

&lt;p&gt;When I first had the idea, I just copy-pasted my interpreter code and a sample execution trace into Claude, and asked it to build a React web UI with the rough functionality I wanted. I also specified &amp;ldquo;a fun hacker vibe, like the matrix&amp;rdquo;, because why not? About a minute later (after a single iteration for a UI bug which Claude fixed on its own), I had a solid first version up and running:&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/debugger/prompt.png" alt="My prompt to Claude" /&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;That fast turnaround is absolutely critical, because it meant I didn&amp;rsquo;t need to break focus from the main task at hand.&lt;/strong&gt; I was trying to write a Prolog interpreter here, not build a debug UI. Without AI support, I would have just muddled through with my existing tools, lacking the time or focus to build a debug UI. Simon Willison says: &lt;a href="https://simonwillison.net/2023/Mar/27/ai-enhanced-development/"&gt;&amp;ldquo;AI-enhanced development makes me more ambitious with my projects&amp;rdquo;&lt;/a&gt;. In this case: AI-enhanced development made me more ambitious with my &lt;em&gt;dev tools&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;By the way: I was confident Claude 3.5-Sonnet would do well at this task, because it&amp;rsquo;s great at building straightforward web UIs. That&amp;rsquo;s all this debugger is, at the end of the day: a simple view of a JSON blob; an easy task for a competent web developer. In some sense, you can think of this workflow as a technique for turning that narrow, limited programming capability—rapidly and automatically building straightforward UIs—into an accelerant for more advanced kinds of programming.&lt;/p&gt;

&lt;p&gt;Whether you&amp;rsquo;re an AI-programming skeptic or an enthusiast, the reality is that many programming tasks are beyond the reach of today&amp;rsquo;s models. But many decent &lt;em&gt;dev tools&lt;/em&gt; are actually quite easy for AI to build, and can help the rest of the programming go smoother. In general, these days any time I&amp;rsquo;m spending more than a minute staring at a JSON blob, I consider whether it&amp;rsquo;s worth building a custom UI for it.&lt;/p&gt;

&lt;div style="text-align: center; width: 100%"&gt;* * *&lt;/div&gt;

&lt;p&gt;As I used the tool in my debugging, I would notice small things I wanted to visualize differently: improving the syntax display for the program, allocating screen real estate better, adding the timeline view to get a sense of the full history.&lt;/p&gt;

&lt;p&gt;Each time, I would just switch windows, spend a few seconds asking Claude to make the change, and then switch back to my code editor and resume working. When I came back at my next breaking point, I&amp;rsquo;d have a new debugger waiting for me. Usually things would just work the first time. Sometimes a minor bug fix was necessary, but I let Claude handle it every time. I still haven&amp;rsquo;t looked at the UI code.&lt;/p&gt;

&lt;p&gt;Eventually we landed on a fairly nice design, where each feature had been motivated by an immediate need that I had felt during use:&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/debugger/debugger-annotated.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;Claude wasn&amp;rsquo;t perfect—it did get stuck one time when I asked it to add a &lt;a href="https://www.brendangregg.com/flamegraphs.html"&gt;flamegraph&lt;/a&gt; view of the stack trace changing over time. Perhaps I could have prodded it into building this better, or even resorted to building it myself. But instead I just decided to abandon that idea and carry on. &lt;strong&gt;AI development works well when your requirements are flexible&lt;/strong&gt; and you&amp;rsquo;re OK changing course to work within the current limits of the model.&lt;/p&gt;

&lt;p&gt;Overall, &lt;strong&gt;it felt incredible that it only took seconds to go from noticing something I wanted in my debugger to having it there in the UI.&lt;/strong&gt; The AI support let me stay in flow the whole time; I was free to think about interpreter code and not debug tool code. I had a yak-shaving intern at my disposal.&lt;/p&gt;

&lt;p&gt;This is the dream of &lt;a href="https://www.geoffreylitt.com/2023/03/25/llm-end-user-programming.html"&gt;malleable software&lt;/a&gt;: editing software at the speed of thought. Starting with just the minimal thing we need for our particular use case, adding things immediately as we come across new requirements. Ending up with a tool that&amp;rsquo;s molded to our needs like a leather shoe, not some complicated generic thing designed for a million users.&lt;/p&gt;

&lt;h2 id="related-reading"&gt;Related reading&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;For more zoomed-out thoughts on the possibilities of custom AI-generated tools, check out &lt;a href="https://www.geoffreylitt.com/2023/03/25/llm-end-user-programming"&gt;Malleable software in the age of LLMs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;I&amp;rsquo;ve written before about using AI to develop one-off custom personal tools, like &lt;a href="https://www.geoffreylitt.com/2023/07/25/building-personal-tools-on-the-fly-with-llms"&gt;a Japanese text message translation app with a formality slider&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;A few months ago, a collaborator and I had a similar experience building &lt;a href="https://x.com/geoffreylitt/status/1821666220644683950"&gt;a custom visualization tool for bounding boxes on PDFs&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The idea of custom tools for debugging programming systems has been explored in depth by many live programming tools; e.g. check out &lt;a href="https://gtoolkit.com/"&gt;Glamorous Toolkit&lt;/a&gt; and &lt;a href="https://clerk.vision/"&gt;Clerk&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
</content>
  </entry>
  <entry>
    <title>Your pie doesn't need to be original (unless you claim it so)</title>
    <link rel="alternate" href="https://geoffreylitt.com/2024/08/25/your-pie-doesnt-need-be-original.html"/>
    <id>https://geoffreylitt.com/2024/08/25/your-pie-doesnt-need-be-original.html</id>
    <published>2024-08-25T15:39:00+00:00</published>
    <updated>2024-08-25T15:39:00+00:00</updated>
    <author>
      <name>Geoffrey Litt</name>
    </author>
    <summary type="html">&lt;p&gt;Imagine you bake a delicious peach pie over the weekend, and you offer a slice to your friend. They respond:&lt;/p&gt;

&lt;p&gt;“Wait, how is this different from every other peach pie that’s ever been baked? It seems really similar to another pie I had recently.”&lt;/p&gt;

</summary>
    <content type="html">&lt;p&gt;Imagine you bake a delicious peach pie over the weekend, and you offer a slice to your friend. They respond:&lt;/p&gt;

&lt;p&gt;&amp;ldquo;Wait, how is this different from every other peach pie that&amp;rsquo;s ever been baked? It seems really similar to another pie I had recently.&amp;rdquo;&lt;/p&gt;

&lt;p&gt;This is obviously an absurd reaction!&lt;/p&gt;

&lt;p&gt;But this exact dynamic happens all the time in creative software projects. Someone shares a project they made, and the first reaction is: how&amp;rsquo;s it different?&lt;/p&gt;

&lt;p&gt;The problem here is a mismatch in values.&lt;/p&gt;

&lt;p&gt;The friend has assumed that your goal is to &amp;ldquo;efficiently&amp;rdquo; reach the goal of a delicious pie, or perhaps even to create a new kind of pie. But that&amp;rsquo;s not the goal at all!&lt;/p&gt;

&lt;p&gt;Baking a pie is a creative act. It&amp;rsquo;s personal, it&amp;rsquo;s inherently delightful, it&amp;rsquo;s an act of caring for others. It&amp;rsquo;s also a craft that one can improve at over time. Just buying the &amp;ldquo;best&amp;rdquo; pie would defeat the point.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;The next day, you find out there&amp;rsquo;s a scientific conference in town: CRISP, the Conference for Research on Innovative Sweet Pastries. This is where the world&amp;rsquo;s foremost experts push forward the frontier of pie-baking technique.&lt;/p&gt;

&lt;p&gt;You show up with your delicious peach pie, and the first question from the judging panel is:&lt;/p&gt;

&lt;p&gt;&amp;ldquo;Wait, how is this different from every other peach pie that&amp;rsquo;s ever been baked? It seems really similar to another pie I had recently.&amp;rdquo;&lt;/p&gt;

&lt;p&gt;You respond: &amp;ldquo;I have no idea, I just enjoyed baking it and thought it was delicious! I don&amp;rsquo;t even know what recipe I used. Why does it matter to you, huh?&amp;rdquo;&lt;/p&gt;

&lt;p&gt;The expert says: &amp;ldquo;Well then, you&amp;rsquo;re welcome to bake pies all you want at home, but your pie is not welcome at CRISP. The community cannot understand your contribution or build on your work.&amp;rdquo;&lt;/p&gt;

&lt;p&gt;You might be upset about this outcome, but you&amp;rsquo;d be wrong. In this context, the judge&amp;rsquo;s criticism is totally fair.&lt;/p&gt;

&lt;p&gt;The goal of CRISP isn&amp;rsquo;t just to enjoy pies, it&amp;rsquo;s to build up a community of practice. Part of being a good citizen of that community is being able to explain how your pie is different—which in turn requires learning about all the other ways of making pie. This isn&amp;rsquo;t merely a higher bar than amateur weekend baking, it&amp;rsquo;s a totally different frame of mind.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;I think mixing up these two situations is the source of a lot of unfortunate confusion.&lt;/p&gt;

&lt;p&gt;I work on prototyping new kinds of software interfaces and programming tools, and I spend time in &lt;a href="https://liveprog.org"&gt;various&lt;/a&gt; &lt;a href="https://inkandswitch.com"&gt;communities&lt;/a&gt; that span across the cultures of playful exploration (sharing demos on Twitter) and academic research (writing formal papers).&lt;/p&gt;

&lt;p&gt;Often I see creative people share personal projects and get their spirits weakened by &amp;ldquo;how&amp;rsquo;s it different?&amp;rdquo; The question can be well-meaning; it isn&amp;rsquo;t necessarily cynical! It&amp;rsquo;s just misunderstanding the goal.&lt;/p&gt;

&lt;p&gt;On the other hand, I see people submit cool work to academic research venues and get confused by, or even chafe at, the stringent requirement of situating the work in context. I used to be pretty dismissive of Related Work sections myself, until I went through grad school and realized how valuable they are to the world.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;So, as creators and feedback-givers, how can we avoid this confusion?&lt;/p&gt;

&lt;p&gt;There&amp;rsquo;s an answer that seems obvious: clearly set the goal up front. Are you trying to do a personal project for fun, or are you trying to make a novel research contribution? Just proactively broadcast your intent, and most people will be better at asking questions that are aligned with your goals.&lt;/p&gt;

&lt;p&gt;Unfortunately, in my experience, things don&amp;rsquo;t work out this cleanly. Many of the best new ideas start out as playful explorations, and over time snowball into larger projects that are worthy of a serious research contribution.&lt;/p&gt;

&lt;p&gt;A strategy I&amp;rsquo;ve found helpful is to &lt;em&gt;start&lt;/em&gt; from a place of personal creativity. If the initial goal is playful exploration for its own sake, that creates free space to explore and quells early doubts (from both myself and others). It doesn&amp;rsquo;t matter if it&amp;rsquo;s new or good (yet), I&amp;rsquo;m just having fun.&lt;/p&gt;

&lt;p&gt;Occasionally a project grows into something more. At that point it can be appropriate to apply a critical academic lens.&lt;/p&gt;

&lt;p&gt;Starting from the other side seems a lot tougher. If you start off saying &amp;ldquo;we&amp;rsquo;re going to make a big serious contribution no one&amp;rsquo;s ever done before,&amp;rdquo; that sets up high stakes and invites harsh critique from the start. Maybe this approach works for some projects with narrower success criteria, but it doesn&amp;rsquo;t seem to work well for most of what I do.&lt;/p&gt;

&lt;p&gt;A final thing to keep in mind: when I&amp;rsquo;m on the side of giving feedback, I always try to first understand the creator&amp;rsquo;s goals. This can be a subtle art when they don&amp;rsquo;t even know their own goals yet. The weekend baker may just need encouragement, not critique.&lt;/p&gt;

&lt;h2 id="related-wisdom"&gt;Related wisdom&lt;/h2&gt;

&lt;p&gt;Richard Feynman, &lt;a href="https://www.asc.ohio-state.edu/kilcup.1/262/feynman.html"&gt;on spinning plates&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Physics disgusts me a little bit now, but I used to enjoy doing physics. Why did I enjoy it? I used to play with it&amp;hellip; I&amp;rsquo;m going to play with physics, whenever I want to, without worrying about any importance whatsoever.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Patrick Dubroy, &lt;a href="https://dubroy.com/blog/playing-like-a-kid-again/"&gt;on playing like a kid&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It was another instance of unconsciously adopting a restrictive set of assumptions, telling myself that if I wasn’t done “right”, it wasn’t worth doing at all&amp;hellip; And guess what — when I decided to let go of those assumptions, I started having fun on my side projects again.&lt;/p&gt;
&lt;/blockquote&gt;
</content>
  </entry>
  <entry>
    <title>7 books that stood the test of time in 2023</title>
    <link rel="alternate" href="https://geoffreylitt.com/2023/12/17/seven-books-that-stuck-with-me-in-2023.html"/>
    <id>https://geoffreylitt.com/2023/12/17/seven-books-that-stuck-with-me-in-2023.html</id>
    <published>2023-12-17T17:01:00+00:00</published>
    <updated>2023-12-17T17:01:00+00:00</updated>
    <author>
      <name>Geoffrey Litt</name>
    </author>
    <summary type="html">&lt;p&gt;It’s the most wonderful time of the year: when people proudly announce how many books they have read in the past 12 months. 10 books, 20 books, 57 books! Worry not—I know you don’t care, and besides, I have no idea how many books I read this year.&lt;/p&gt;</summary>
    <content type="html">&lt;p&gt;It&amp;rsquo;s the most wonderful time of the year: when people proudly announce how many books they have read in the past 12 months. 10 books, 20 books, 57 books! Worry not—I know you don&amp;rsquo;t care, and besides, I have no idea how many books I read this year.&lt;/p&gt;

&lt;p&gt;In lieu of that, here&amp;rsquo;s a short list of &lt;strong&gt;some favorite books I read &lt;em&gt;before&lt;/em&gt; 2023 that have stuck with me this year&lt;/strong&gt; and changed the way I think. Seven masterpieces on AI, cooking, art, houses, product design, computational media, and trees:&lt;/p&gt;

&lt;figure style="margin: 0;"&gt;
  &lt;img src="/images/article_images/seven-books.jpg" alt="Six books on a floor, corresponding to the list below in this post"&gt;
  &lt;figcaption&gt;Six of the seven books. The seventh I only have on Kindle, sorry Ken!&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;h3 id="the-most-human-human-by-brian-christian"&gt;The Most Human Human, by Brian Christian&lt;/h3&gt;

&lt;p&gt;A book about humanity, disguised as a book about AI. It taught me how to have deeper conversations and find more meaning in my work. Amid a sea of spilled ink on AI, Brian Christian has simply asked more interesting questions. Notably, this book was written in 2011, before the current wave—yet it&amp;rsquo;s still remarkably relevant.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.goodreads.com/en/book/show/8884400"&gt;See it on Goodreads&lt;/a&gt;&lt;/p&gt;

&lt;h3 id="an-everlasting-meal-by-tamar-adler"&gt;An Everlasting Meal, by Tamar Adler&lt;/h3&gt;

&lt;p&gt;This book changed the way I cook. It teaches the correct way to think about home cooking &amp;ndash; not as a chore, an &amp;ldquo;obstacle&amp;rdquo;, or an optimized process&amp;hellip; but as a simple, natural act of creativity. One of the wisest books I know.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.goodreads.com/en/book/show/11300085"&gt;See it on Goodreads&lt;/a&gt;&lt;/p&gt;

&lt;h3 id="art-fear-by-david-bayles-and-ted-orland"&gt;Art &amp;amp; Fear, by David Bayles and Ted Orland&lt;/h3&gt;

&lt;p&gt;A slim little manual about how to overcome the fear and keep creating. Subtle tips on the role of talent, managing the vision-execution gap, quantity vs quality. I might not have kept going with research if I hadn&amp;rsquo;t read this book.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.goodreads.com/en/book/show/187633"&gt;See it on Goodreads&lt;/a&gt;&lt;/p&gt;

&lt;h3 id="the-production-of-houses-by-christopher-alexander-et-al"&gt;The Production of Houses, by Christopher Alexander et al.&lt;/h3&gt;

&lt;p&gt;Christopher Alexander thought people could design their own homes. His most famous books, The Timeless Way of Building and A Pattern Language, are brilliant but can be a bit abstract. The Production of Houses shows what actually happened, concretely, when he and his team helped some people do the thing and design their own homes.&lt;/p&gt;

&lt;p&gt;The result: some great successes, some strange contradictions to ponder.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.goodreads.com/book/show/106725.The_Production_of_Houses"&gt;See it on Goodreads&lt;/a&gt;&lt;/p&gt;

&lt;h3 id="creative-selection-by-ken-kocienda"&gt;Creative Selection, by Ken Kocienda&lt;/h3&gt;

&lt;p&gt;This book shows that most product design is a dead end. It describes, in great detail, the Apple way—hard to achieve, but worth striving towards. I&amp;rsquo;m constantly remembering stories from this book in my own work. &amp;ldquo;Pick one keyboard!&amp;rdquo;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.goodreads.com/book/show/37638098-creative-selection"&gt;See it on Goodreads&lt;/a&gt;&lt;/p&gt;

&lt;h3 id="changing-minds-by-andy-disessa"&gt;Changing Minds, by Andy diSessa&lt;/h3&gt;

&lt;p&gt;A foundational text for my research. I am always amazed how many people have not even heard of it. If you care about &amp;ldquo;future of computing&amp;rdquo;, Bret Victor&amp;rsquo;s work, &amp;ldquo;computational literacy&amp;rdquo;&amp;hellip; go read this book! I promise it will change your mind. I reference diSessa&amp;rsquo;s &lt;a href="https://twitter.com/geoffreylitt/status/1153373693713817600"&gt;&amp;ldquo;nightmare bicycle&amp;rdquo;&lt;/a&gt; concept all the time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.goodreads.com/book/show/1752380"&gt;See it on Goodreads&lt;/a&gt;&lt;/p&gt;

&lt;h3 id="the-overstory-by-richard-powers"&gt;The Overstory, by Richard Powers&lt;/h3&gt;

&lt;p&gt;To the extent that it&amp;rsquo;s possible to see the world from the perspective of trees, this novel got me to that place. Every time I&amp;rsquo;m in a forest now, I think about the trees: how long they&amp;rsquo;ve been there, what they&amp;rsquo;re communicating to one another.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.goodreads.com/en/book/show/40180098"&gt;See it on Goodreads&lt;/a&gt;&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;Look, I could write so much more about any one of these books (and I&amp;rsquo;m happy to answer any questions!) but honestly, it feels hard to do them justice.&lt;/p&gt;

&lt;p&gt;They&amp;rsquo;re all 5 stars, on both substance and prose. Each is well worth your time, and any of them could be a great gift to the right person. I hope you have a great holiday season!&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Codifying a ChatGPT workflow into a malleable GUI</title>
    <link rel="alternate" href="https://geoffreylitt.com/2023/07/25/building-personal-tools-on-the-fly-with-llms.html"/>
    <id>https://geoffreylitt.com/2023/07/25/building-personal-tools-on-the-fly-with-llms.html</id>
    <published>2023-07-25T17:15:00+00:00</published>
    <updated>2023-07-25T17:15:00+00:00</updated>
    <author>
      <name>Geoffrey Litt</name>
    </author>
    <summary type="html">&lt;p&gt;In my previous post, &lt;a href="/2023/03/25/llm-end-user-programming.html"&gt;Malleable software in the age of LLMs&lt;/a&gt;, I laid out a theory for how LLMs might enable a new era of people creating their own personal software:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I think it’s likely that soon all computer users will have the ability to develop...&lt;/p&gt;
&lt;/blockquote&gt;</summary>
    <content type="html">&lt;p&gt;In my previous post, &lt;a href="/2023/03/25/llm-end-user-programming.html"&gt;Malleable software in the age of LLMs&lt;/a&gt;, I laid out a theory for how LLMs might enable a new era of people creating their own personal software:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I think it’s likely that soon all computer users will have the ability to develop small software tools from scratch, and to describe modifications they’d like made to software they’re already using.&lt;/p&gt;

&lt;p&gt;In other words, LLMs will represent a step change in tool support for end-user programming: the ability of normal people to fully harness the general power of computers without resorting to the complexity of normal programming. Until now, that vision has been bottlenecked on turning fuzzy informal intent into formal, executable code; now that bottleneck is rapidly opening up thanks to LLMs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Today I&amp;rsquo;ll &lt;strong&gt;share a real example where I found it useful to build custom personal software with an LLM&lt;/strong&gt;. Earlier this week, I used GPT-4 to code an app that helps me draft text messages in English and translate them to Japanese. The basic idea: I paste in the context for the text thread and write my response in English; I get back a translation into Japanese. The app has a couple other neat features, too: I can drag a slider to tweak the formality of the language, and I can highlight any phrase to get a more detailed explanation.&lt;/p&gt;

&lt;p&gt;The whole thing is ugly and thrown together in no time, but it has exactly the features I need, and I&amp;rsquo;ve found it quite useful for planning an upcoming trip to Japan.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/texting-app-teaser.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;The app uses the GPT-4 API to do the actual translations. So there are two usages of LLMs going on here: I used an LLM to code the app, and then the app also uses an LLM when it runs to do the translations. Sorry if that&amp;rsquo;s confusing, 2023 is weird.&lt;/p&gt;

&lt;p&gt;You may ask: why bother making an app for this? Why not just ask ChatGPT to do the translations? I&amp;rsquo;m glad you asked—that&amp;rsquo;s what this post is all about! In fact, I started out doing these translations in ChatGPT, but &lt;strong&gt;I ended up finding this GUI nicer to use than raw ChatGPT for several reasons&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It encodes a prescriptive workflow so I don&amp;rsquo;t need to fuss with prompts as much.&lt;/li&gt;
&lt;li&gt;It offers convenient direct manipulation affordances like text boxes and sliders.&lt;/li&gt;
&lt;li&gt;It makes it easier to share a workflow with other people.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(Interestingly, these are similar to the reasons that so many startups are building products wrapping LLM prompts—the difference here is that I&amp;rsquo;m just building the tool for myself, and not trying to make a product.)&lt;/p&gt;

&lt;p&gt;A key point is that making this personal GUI is only worth it because &lt;strong&gt;GPT also lowers the cost of making and iterating on the GUI!&lt;/strong&gt; Even though I&amp;rsquo;m a programmer, I wouldn&amp;rsquo;t have made this tool without LLM support. It&amp;rsquo;s not only the time savings, it&amp;rsquo;s also the fact that I don&amp;rsquo;t need to turn on my &amp;ldquo;programmer brain&amp;rdquo; to make these tools; I can think at a higher level and let the LLM handle the details.&lt;/p&gt;

&lt;p&gt;There are also tradeoffs to consider when moving from ChatGPT into a GUI tool: the resulting workflow is more rigid and less open-ended than a ChatGPT session. In a sense this is the whole point of a GUI. But the GUI isn&amp;rsquo;t necessarily as limiting as it might seem, because remember, it&amp;rsquo;s &lt;em&gt;malleable&lt;/em&gt;: I built it myself using GPT and can quickly make further edits. This is a very different situation than using a fixed app that someone else made! Below I&amp;rsquo;ll share one example of how I edited this tool on the fly as I was using it.&lt;/p&gt;

&lt;p&gt;Overall I think this experience suggests an intriguing workflow of &lt;strong&gt;codifying a ChatGPT workflow into a malleable GUI&lt;/strong&gt;: starting out with ChatGPT, exploring the most useful way to solve a task, and then once you&amp;rsquo;ve landed on a good approach, codifying that approach in a GUI tool that you can use in a repeatable way going forward.&lt;/p&gt;

&lt;p&gt;Alright, on to the story of how this app came about.&lt;/p&gt;

&lt;hr&gt;

&lt;h2 id="chatgpt-is-a-good-translator-usually"&gt;ChatGPT is a good translator (usually 🙃)&lt;/h2&gt;

&lt;p&gt;I&amp;rsquo;m going on a trip to Japan soon and have been on some text threads where I need to communicate in Japanese. I grew up in Japan but my writing is rusty and painfully slow these days. One particular challenge for me is using the appropriate level of formality with extended family and other family acquaintances—I have fluent schoolyard Japanese but the nuances of formal grown-up Japanese can be tricky.&lt;/p&gt;

&lt;p&gt;I started using ChatGPT to make this process faster by asking it to produce draft messages in Japanese based on my English input. I quickly realized &lt;strong&gt;there are some neat benefits to ChatGPT vs. a traditional translation app&lt;/strong&gt;. I can give it the full context of the text thread so it can incorporate that into its translation. I can steer it with prompting: asking it to tweak the formality or do a less word-for-word translation. I can ask follow-up questions about the meaning of a word. These capabilities were all gamechangers for this task; they really show why smart chatbots can be so useful!&lt;/p&gt;

&lt;p&gt;You may be wondering: how good were the translations? I&amp;rsquo;d say: good enough to be spectacularly useful to me, &lt;em&gt;given that I can verify and edit&lt;/em&gt;. Often they were basically perfect. Sometimes they were wrong in huge, hilarious ways—flipping the meaning of a sentence, or swapping the name of a train station for another one (sigh, LLMs&amp;hellip;).&lt;/p&gt;

&lt;p&gt;In practice these mistakes didn&amp;rsquo;t matter too much though. I&amp;rsquo;m slow at writing in Japanese but can read basic messages easily, so I just fix the errors and they aren&amp;rsquo;t dealbreakers. &lt;strong&gt;When creation is slow and verification is fast, it&amp;rsquo;s a sweet spot for using an LLM.&lt;/strong&gt;&lt;/p&gt;

&lt;h2 id="honing-the-workflow"&gt;Honing the workflow&lt;/h2&gt;

&lt;p&gt;As I translated more messages and saw ways that the model failed, I developed some little prompting tricks that seemed to produce better translations. Things like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Below is some context for a text message thread:&lt;/p&gt;

&lt;p&gt;&amp;hellip;paste thread&amp;hellip;&lt;/p&gt;

&lt;p&gt;Now translate my message below to japanese. make it sound natural in the flow of this conversation. don&amp;rsquo;t translate word for word, translate the general meaning.&lt;/p&gt;

&lt;p&gt;&amp;hellip;write message&amp;hellip;&lt;/p&gt;
&lt;/blockquote&gt;
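
&lt;p&gt;To make the pattern concrete, here is a minimal sketch of how the prompt above could be assembled in code. The function name &lt;code&gt;buildTranslationPrompt&lt;/code&gt; and its parameters are hypothetical, chosen for illustration; this is not taken from the app&amp;rsquo;s actual source.&lt;/p&gt;

```typescript
// Hypothetical helper (not from the real app): fills the prompt pattern
// quoted above with a pasted thread and the English message to translate.
function buildTranslationPrompt(thread: string, message: string): string {
  return [
    "Below is some context for a text message thread:",
    thread.trim(),
    "Now translate my message below to japanese. make it sound natural in the flow of this conversation. don't translate word for word, translate the general meaning.",
    message.trim(),
  ].join("\n\n"); // blank line between sections, mirroring the pasted layout
}
```

&lt;p&gt;The returned string would then be sent as the user message in a chat request. Keeping the template in one function is what removes the &amp;ldquo;dig up the prompt and fill in the blanks&amp;rdquo; step.&lt;/p&gt;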

&lt;p&gt;I also learned some typical follow-up requests I would often make after receiving the initial translation: things like asking to adjust the formality level up or down.&lt;/p&gt;

&lt;p&gt;Once I had landed on these specific prompt patterns, it made my interactions more scripted. Each time I would need to dig up my prompt text for this task, copy-paste it in, and fill in the blanks for this particular translation. When asking follow-up questions I&amp;rsquo;d also copy-paste phrasings from previous chats that had proven successful. &lt;strong&gt;At this point it didn&amp;rsquo;t feel like an open-ended conversation anymore; it felt like I was tediously executing a workflow made up of specific chat prompts.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I also found myself wanting something that felt more like a solid tool I could return to. ChatGPT chats feel a bit amorphous and hard to return to: where do I store my prompts? How do I even remember what useful workflows I&amp;rsquo;ve come up with? I basically wanted a window I could pop open and get a quick translation.&lt;/p&gt;

&lt;h2 id="making-a-gui-with-gpt"&gt;Making a GUI with GPT&lt;/h2&gt;

&lt;p&gt;So, I asked GPT-4 to build me a GUI codifying this workflow. The app is a frontend-only React.js web app. It&amp;rsquo;s hosted on &lt;a href="https://replit.com/"&gt;Replit&lt;/a&gt;, which makes it easy to spin up a new project in one click and then share a link with people. (You can see the current code &lt;a href="https://replit.com/@GeoffreyLitt/TextMessageTranslator#src/App.jsx"&gt;here&lt;/a&gt; if you&amp;rsquo;re curious.) I just copy-pasted the GPT-generated code into Replit.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/texting-replit.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;The initial version of the app was very simple: it basically just accepted a text input and then made a request to the GPT-4 API asking for a natural-sounding translation. The early designs generated by ChatGPT were super primitive:&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/early-designs.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;Asking it for a &amp;ldquo;professional and modern&amp;rdquo; redesign helped get the design looking passable. I then asked GPT to add a &lt;em&gt;formality slider&lt;/em&gt; to the app. The new app requests three translations of varying formality, and then lets the user drag a slider to instantly choose between them 😎&lt;/p&gt;

&lt;video autoplay loop controls="controls" preload="auto" muted="muted" data-video="0" type="video/mp4" src="/images/article_images/text-app.mp4" width="100%"&gt;&lt;/video&gt;
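
&lt;p&gt;The slider can feel instant because all three translations are fetched up front, so dragging it only picks among results already in memory. A rough sketch of that selection step follows; the &lt;code&gt;FormalityOptions&lt;/code&gt; shape, the casual/neutral/formal labels, and &lt;code&gt;pickTranslation&lt;/code&gt; are assumptions for illustration, not the app&amp;rsquo;s real code.&lt;/p&gt;

```typescript
// Hypothetical shape for the three translations requested in one go.
// The labels are illustrative; the post only says "varying formality".
type FormalityOptions = { casual: string; neutral: string; formal: string };

// Maps a slider position (0, 1, or 2) to one of the pre-fetched translations.
// No API call happens here, which is why the slider responds immediately.
function pickTranslation(options: FormalityOptions, sliderValue: number): string {
  // Round and clamp so fractional or out-of-range values still select an entry.
  const index = Math.min(2, Math.max(0, Math.round(sliderValue)));
  if (index === 0) return options.casual;
  if (index === 1) return options.neutral;
  return options.formal;
}
```

&lt;p&gt;The tradeoff in this design is a larger initial request in exchange for a slider that needs no loading state.&lt;/p&gt;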

&lt;p&gt;GPT-4 did most of the coding of the UI. I didn&amp;rsquo;t measure how long it took, but subjectively, the whole thing felt pretty effortless; &lt;strong&gt;it felt more like asking a friend to build an app for me than building it myself&lt;/strong&gt;, and I never engaged my detailed programmer brain. I still haven&amp;rsquo;t looked very closely at the code. GPT generally produced good results on every iteration. At one point it got confused about how to call the OpenAI API, but pasting in some recent documentation got it sorted out. I&amp;rsquo;ve included some of the coding prompts I used at the &lt;a href="#appendix-prompts"&gt;bottom of this post&lt;/a&gt; if you&amp;rsquo;re curious about the details.&lt;/p&gt;

&lt;p&gt;At the same time, it&amp;rsquo;s important to note that &lt;strong&gt;my programming background did substantially help the process along&lt;/strong&gt; and I don&amp;rsquo;t think it would have gone that well if I didn&amp;rsquo;t know how to make React UIs. I was able to give the LLM a detailed spec, which was natural for me to write. For example: I suggested storing the OpenAI key as a user-provided setting in the app UI rather than putting it in the code, because that would let us keep the app frontend-only. I also helped fix some minor bugs.&lt;/p&gt;

&lt;p&gt;I do believe it&amp;rsquo;s possible to get to the point where an LLM can support non-programmers in building custom GUIs (and that&amp;rsquo;s in fact one of my main research goals at the moment). But it&amp;rsquo;s a much harder goal than supporting programmers, and will require a lot more work on tooling. More on this later.&lt;/p&gt;

&lt;h2 id="iterating-on-the-fly"&gt;Iterating on the fly&lt;/h2&gt;

&lt;p&gt;A few times I noticed that the Japanese translations included phrases I didn&amp;rsquo;t understand, and I wanted an explanation. Once this had come up a few times, I decided to add it as a feature in my GUI. &lt;strong&gt;I asked GPT to modify the code so that I can select a phrase and click a button to get an explanation in context:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/explain-phrase.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;This tight iteration loop felt awesome. Going from wanting the feature to having it in my app was accomplished in minutes with very little effort. This shows the benefit of having a &lt;em&gt;malleable GUI&lt;/em&gt; which I control and I can quickly edit using an LLM. My feature requests aren&amp;rsquo;t trapped in a feedback queue, I can just build them for myself. It&amp;rsquo;s not the best-designed interaction ever, but it gets the job done.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;ve found that having the button there encourages me to ask for explanations more often. Before, when I was doing the translations in ChatGPT, I would need to explicitly think to write a follow-up message asking for an explanation. Now I have a button reminding me to do it, and the button also uses a high-quality prompt that I&amp;rsquo;ve developed.&lt;/p&gt;

&lt;h2 id="sharing-the-tool"&gt;Sharing the tool&lt;/h2&gt;

&lt;p&gt;My brother asked if he could try the tool. I sent him the Replit link and he was able to use it.&lt;/p&gt;

&lt;p&gt;I think sharing a GUI is probably way more effective than trying to share a complex ChatGPT workflow with various prompts patched together. The UI encodes what I&amp;rsquo;ve learned about doing this particular task effectively, and provides clear affordances that anyone can pick up quickly.&lt;/p&gt;

&lt;h2 id="from-chatbot-to-gui"&gt;From chatbot to GUI&lt;/h2&gt;

&lt;p&gt;What general lessons can we take away from my experience here? I think it gestures at two big ideas.&lt;/p&gt;

&lt;p&gt;The first one is that &lt;strong&gt;chatbots are not always the best interface for a task&lt;/strong&gt;, even one like translation that involves lots of natural language and text. Amelia Wattenberger wrote a &lt;a href="https://wattenberger.com/thoughts/boo-chatbots"&gt;great piece&lt;/a&gt; explaining some of the reasons. It&amp;rsquo;s worth reading the whole thing, but here&amp;rsquo;s a key excerpt about the value of affordances:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Good tools make it clear how they should be used. And more importantly, how they should not be used. If we think about a good pair of gloves, it&amp;rsquo;s immediately obvious how we should use them. They&amp;rsquo;re hand-shaped! We put them on our hands. And the specific material tells us more: metal mesh gloves are for preventing physical harm, rubber gloves are for preventing chemical harm, and leather gloves are for looking cool on a motorcycle.&lt;/p&gt;

&lt;p&gt;Compare that to looking at a typical chat interface. The only clue we receive is that we should type characters into the textbox. The interface looks the same as a Google search box, a login form, and a credit card field.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This principle clearly holds when designing a product that other people are going to use. But perhaps surprisingly, in my experience, &lt;strong&gt;affordances are actually useful even when designing a tool for myself!&lt;/strong&gt; Good affordances can help my future self remember how to use the tool. The &amp;ldquo;explain phrase&amp;rdquo; button reminds me that I should ask about words I don&amp;rsquo;t know.&lt;/p&gt;

&lt;p&gt;I also find that making a UI makes a tool more memorable. My custom GUI is a visually distinctive artifact that lives at a URL; this helps me remember that I have the tool and can use it. Having a UI makes my tool feel more like a reusable artifact than a ChatGPT prompt.&lt;/p&gt;

&lt;p&gt;Now, it&amp;rsquo;s not quite as simple as &amp;ldquo;GUI good, chatbot bad&amp;rdquo;; there are tradeoffs. For my translation use case, I found ChatGPT super helpful for my initial explorations. The open-endedness of the chatbot gave it a huge leg up over Google Translate, a more traditional application with more limited capabilities and clearer affordances. I was able to explore a wide space of useful features and find the ones that I wanted to keep using.&lt;/p&gt;

&lt;p&gt;I think this suggests a natural workflow: &lt;strong&gt;start in chat, and then codify a UI if it&amp;rsquo;s getting annoying doing the same chat workflow repeatedly.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;By the way, one more thing: there are obviously many other visual affordances to consider besides the ones I used in this particular example. For instance, here&amp;rsquo;s another GPT-powered GUI tool I built a couple months ago, where I can drag-and-drop in a file and see useful conversions of that file into different formats:&lt;/p&gt;

&lt;p&gt;&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="en" dir="ltr"&gt;I wanted to convert a JSON file of a chat transcript into nice markdown text for sharing w/ people&amp;hellip;&lt;br&gt;&lt;br&gt;so I had GPT generate an ephemeral React UI where I can drag in the JSON file and it outputs the markdown🤓&lt;br&gt;&lt;br&gt;reflections on the process: &lt;a href="https://t.co/WGwBBtEGiT"&gt;pic.twitter.com/WGwBBtEGiT&lt;/a&gt;&lt;/p&gt;&amp;mdash; Geoffrey Litt (@geoffreylitt) &lt;a href="https://twitter.com/geoffreylitt/status/1654246096212992004?ref_src=twsrc%5Etfw"&gt;May 4, 2023&lt;/a&gt;&lt;/blockquote&gt; &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8"&gt;&lt;/script&gt;&lt;/p&gt;

&lt;h2 id="the-joy-of-editing-our-tools"&gt;The joy of editing our tools&lt;/h2&gt;

&lt;p&gt;Another takeaway: &lt;strong&gt;it feels great to use a tiny GUI made just for my own needs&lt;/strong&gt;. It does only what I want it to do, nothing more. The design isn&amp;rsquo;t going to win any awards or get VC funding, but it&amp;rsquo;s good enough for what I want. When I come across more things that the app needs to do, I can add them.&lt;/p&gt;

&lt;p&gt;Robin Sloan has this delightful idea that &lt;a href="https://www.robinsloan.com/notes/home-cooked-app/"&gt;an app can be a home-cooked meal&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When you liberate programming from the requirement to be professional and scalable, it becomes a different activity altogether, just as cooking at home is really nothing like cooking in a commercial kitchen. I can report to you: not only is this different activity rewarding in almost exactly the same way that cooking for someone you love is rewarding, there’s another feeling, too, specific to this realm. I have struggled to find words for this, but/and I think it might be the crux of the whole thing:&lt;/p&gt;

&lt;p&gt;This messaging app I built for, and with, my family, it won’t change unless we want it to change. There will be no sudden redesign, no flood of ads, no pivot to chase a userbase inscrutable to us. It might go away at some point, but that will be our decision. What is this feeling? Independence? Security? Sovereignty?&lt;/p&gt;

&lt;p&gt;Is it simply … the feeling of being home?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Software doesn&amp;rsquo;t always need to be mass-produced like restaurant food, it can be produced intimately at small scale. My translator app feels this way to me.&lt;/p&gt;

&lt;p&gt;In this example, using GPT-4 to code and edit the app is what enabled the feeling of malleability for me. It feels magical describing an app and having it appear on-screen within seconds. Little React apps seem to be the kind of simple code that GPT-4 is good at producing. You could even argue that it&amp;rsquo;s &amp;ldquo;just regurgitating other code it&amp;rsquo;s already seen&amp;rdquo;, but I don&amp;rsquo;t care; it made me the tool that I wanted.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;m a programmer and I could have built this app manually myself without too much trouble. And yet, I don&amp;rsquo;t think I would have. The LLM is an order of magnitude faster than me at getting the first draft out and producing new iterations, which makes me much more likely to just give it a shot. This reminds me of how Simon Willison says that &lt;a href="https://simonwillison.net/2023/Mar/27/ai-enhanced-development/"&gt;AI-enhanced development makes him more ambitious with his projects&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;In the past I’ve had plenty of ideas for projects which I’ve ruled out because they would take a day—or days—of work to get to a point where they’re useful. I have enough other stuff to build already!&lt;/p&gt;

&lt;p&gt;But if ChatGPT can drop that down to an hour or less, those projects can suddenly become viable.&lt;/p&gt;

&lt;p&gt;Which means I’m building all sorts of weird and interesting little things that previously I wouldn’t have invested the time in.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Simon&amp;rsquo;s description applies perfectly to my example.&lt;/p&gt;

&lt;p&gt;It&amp;rsquo;s not just about the initial creation, it&amp;rsquo;s also about the fast iteration loop. I discussed the possibility of LLMs updating a GUI app in my &lt;a href="/2023/03/25/llm-end-user-programming.html"&gt;previous post&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Next, consider LLMs applied to the app model. &lt;strong&gt;What if we started with an interactive analytics application, but this time we had a team of LLM developers at our disposal?&lt;/strong&gt; As a start, we could ask the LLM questions about how to use the application, which could be easier than reading documentation.&lt;/p&gt;

&lt;p&gt;But more profoundly than that, the LLM developers could go beyond that and &lt;em&gt;update&lt;/em&gt; the application. When we give feedback about adding a new feature, our request wouldn&amp;rsquo;t get lost in an infinite queue. They would respond immediately, and we&amp;rsquo;d have some back and forth to get the feature implemented. Of course, the new functionality doesn&amp;rsquo;t need to be shipped to everyone; it can just be enabled for our team. This is economically viable now because we&amp;rsquo;re not relying on a centralized team of human developers to make the change.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/llm-app.png" alt="" /&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It simply feels good to be using a GUI app, have an idea for how it could be different, and then have that new version running within seconds.&lt;/p&gt;

&lt;p&gt;There&amp;rsquo;s a caveat worth acknowledging here: the story I shared in this post only worked under specific conditions. The app I made is extremely simple in functionality; a more complex app would be much harder to modify.&lt;/p&gt;

&lt;p&gt;And I&amp;rsquo;m pretty confident that the coding workflow I shared in this post only worked because I&amp;rsquo;m a programmer. The LLM makes me much, much faster at building these simple kinds of utilities, but my programming knowledge still feels essential to keeping the process running. I&amp;rsquo;m writing fairly detailed technical specs, I&amp;rsquo;m making architectural choices, I&amp;rsquo;m occasionally directly editing the code or fixing a bug. The app is so small and simple that it&amp;rsquo;s easy for me to keep up with what&amp;rsquo;s going on.&lt;/p&gt;

&lt;p&gt;I yearn for non-programmers to also experience software this way, as a malleable artifact they can change in the natural course of use. LLMs are clearly a big leap forward on this dimension, but there&amp;rsquo;s also a lot of work ahead. We&amp;rsquo;ll need to find ways for LLMs to work with non-programmers to specify intent, to help them understand what&amp;rsquo;s going on, and to fix things when they go wrong.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;m optimistic that a combination of better tooling and improved models can get us there, at least for simpler use cases like my translator tool. I guess there&amp;rsquo;s only one way to find out 🤓 (&lt;a href="https://buttondown.email/geoffreylitt"&gt;Subscribe to my email newsletter&lt;/a&gt; if you want to follow along with my research in this area.)&lt;/p&gt;

&lt;hr&gt;

&lt;h2 id="recently"&gt;Recently&amp;hellip;&lt;/h2&gt;

&lt;p&gt;In the past few months I&amp;rsquo;ve given a couple talks relevant to the themes in this post.&lt;/p&gt;

&lt;p&gt;In April I spoke at &lt;a href="https://www.causalislands.com/"&gt;Causal Islands&lt;/a&gt; about &lt;a href="https://www.inkandswitch.com/potluck/"&gt;Potluck&lt;/a&gt;, a programmable notes prototype I worked on with Max Schoening, Paul Shen, and Paul Sonnentag at Ink &amp;amp; Switch. In my talk I share a bunch of demos from our published essay, but I also show some newer demos of integrating LLMs to help author spreadsheets. (The embed below will jump you right to the LLM demos.)&lt;/p&gt;

&lt;iframe width="100%" height="315" src="https://www.youtube.com/embed/bJ3i4K3hefI?start=1359" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen&gt;&lt;/iframe&gt;

&lt;p&gt;Also: a couple weeks ago, I presented my PhD thesis defense at MIT! I gave a talk called Building Personal Software with Reactive Databases. I talk about what makes spreadsheets great, and show a few projects I&amp;rsquo;ve worked on that aim to make it easier to build software using techniques from spreadsheets and databases.&lt;/p&gt;

&lt;iframe width="100%" height="315" src="https://www.youtube.com/embed/CPKsS3SJU4o" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen&gt;&lt;/iframe&gt;

&lt;hr&gt;

&lt;h2 id="related-reading"&gt;Related reading&lt;/h2&gt;

&lt;p&gt;If you&amp;rsquo;re interested in diving deeper into ways of interacting with LLMs besides chatbots, I strongly recommend the following readings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://idl.cs.washington.edu/files/2019-AgencyPlusAutomation-PNAS.pdf"&gt;Agency plus automation: Designing artificial intelligence into interactive systems&lt;/a&gt; by Jeffrey Heer&lt;/li&gt;
&lt;li&gt;&lt;a href="https://magrawala.substack.com/p/unpredictable-black-boxes-are-terrible"&gt;Unpredictable Black Boxes are Terrible Interfaces&lt;/a&gt; by Maneesh Agrawala&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dl.acm.org/doi/10.1145/267505.267514"&gt;Direct manipulation vs. interface agents&lt;/a&gt;, a 1997 debate between Ben Shneiderman and Pattie Maes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And for a more abstract angle on the example in this post, check out my previous post, &lt;a href="/2023/03/25/llm-end-user-programming.html"&gt;Malleable software in the age of LLMs&lt;/a&gt;!&lt;/p&gt;

&lt;hr&gt;

&lt;h2 id="appendix-prompts"&gt;Appendix: prompts&lt;/h2&gt;

&lt;p&gt;Here are some of the prompts I used to make the translator app.&lt;/p&gt;

&lt;p&gt;First, my general system prompt for UI coding:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You are a helpful AI coding assistant. Make sure to follow the user&amp;rsquo;s instructions precisely and to the letter. Always reason aloud about your plans before writing the final code.&lt;/p&gt;

&lt;p&gt;Write code in ReactJS. Keep the whole app in one file. Only write a frontend, no backend.&lt;/p&gt;

&lt;p&gt;If the specification is clear, you can generate code immediately. If there are ambiguities, ask key clarifying questions before proceeding.&lt;/p&gt;

&lt;p&gt;When the user asks you to make edits, suggest minimal edits to the code, don&amp;rsquo;t regenerate the whole file.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Initial prompt for the texting app:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I&amp;rsquo;d like you to make me an app that helps me participate in a text message conversation in Japanese by using an LLM to translate. Here&amp;rsquo;s the basic idea:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I paste in a transcript of a text message thread into a box&lt;/li&gt;
&lt;li&gt;I write the message I want to reply with (in english) into a different box&lt;/li&gt;
&lt;li&gt;I click a button&lt;/li&gt;
&lt;li&gt;the app shows me a Japanese translation of my message as output; there&amp;rsquo;s a copy button so i can copy-paste it easily.&lt;/li&gt;
&lt;li&gt;the app talks to openai gpt-4 to do the translation. the prompt can be something like &amp;ldquo;here&amp;rsquo;s a text thread in japanese: &amp;lt;thread&amp;gt;. now translate my new message below to japanese. make it sound natural in the flow of this conversation. don&amp;rsquo;t translate word for word, translate the general meaning.&amp;rdquo; use the openai js library, some sample code pasted below.&lt;/li&gt;
&lt;li&gt;the user can paste in their openai key in a settings pane, it gets stored in localstorage&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;One of the iterative edits for the texting app:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;make the following edits and output new code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;write a css file and style the app to look professional and modern.&lt;/li&gt;
&lt;li&gt;arrange the text thread in a tall box on the left, and then the new message and translation vertically stacked to the right&lt;/li&gt;
&lt;li&gt;give the app a title: Japanese Texting Helper&lt;/li&gt;
&lt;li&gt;hide the openai key behind a settings section that gets toggled open/closed at the bottom of the app&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
</content>
  </entry>
  <entry>
    <title>Malleable software in the age of LLMs</title>
    <link rel="alternate" href="https://geoffreylitt.com/2023/03/25/llm-end-user-programming.html"/>
    <id>https://geoffreylitt.com/2023/03/25/llm-end-user-programming.html</id>
    <published>2023-03-25T19:05:00+00:00</published>
    <updated>2023-03-25T19:05:00+00:00</updated>
    <author>
      <name>Geoffrey Litt</name>
    </author>
    <summary type="html">&lt;p&gt;&lt;img src="/images/article_images/llm-eup/robot-coding.png" alt="A robot and a human coding together. Image from Midjourney."&gt;&lt;/p&gt;

&lt;p&gt;It’s been a wild few weeks for large language models. OpenAI &lt;a href="https://cdn.openai.com/papers/gpt-4.pdf"&gt;released GPT-4&lt;/a&gt;, which shows impressive gains on a variety of capabilities including coding. Microsoft Research &lt;a href="https://www.microsoft.com/en-us/research/publication/sparks-of-artificial-general-intelligence-early-experiments-with-gpt-4/"&gt;released a paper&lt;/a&gt; showing how GPT-4 was able to produce quite sophisticated...&lt;/p&gt;</summary>
    <content type="html">&lt;p&gt;&lt;img src="/images/article_images/llm-eup/robot-coding.png" alt="A robot and a human coding together. Image from Midjourney." /&gt;&lt;/p&gt;

&lt;p&gt;It&amp;rsquo;s been a wild few weeks for large language models. OpenAI &lt;a href="https://cdn.openai.com/papers/gpt-4.pdf"&gt;released GPT-4&lt;/a&gt;, which shows impressive gains on a variety of capabilities including coding. Microsoft Research &lt;a href="https://www.microsoft.com/en-us/research/publication/sparks-of-artificial-general-intelligence-early-experiments-with-gpt-4/"&gt;released a paper&lt;/a&gt; showing how GPT-4 was able to produce quite sophisticated code like a 3D video game without much prompting at all. OpenAI also released &lt;a href="https://openai.com/blog/chatgpt-plugins"&gt;plugins for ChatGPT&lt;/a&gt;, which are a productized version of the ReAct tool usage pattern I played around with in my &lt;a href="https://www.geoffreylitt.com/2023/01/29/fun-with-compositional-llms-querying-basketball-stats-with-gpt-3-statmuse-langchain.html"&gt;previous post about querying NBA statistics using GPT&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Amid all this chaos, many people are naturally wondering: &lt;strong&gt;how will LLMs affect the creation of software?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One answer to that question is that LLMs will make skilled professional developers more productive. This is a safe bet since GitHub Copilot has already shown it&amp;rsquo;s viable. It&amp;rsquo;s also a comforting thought, because developers can feel secure in their future job prospects, and it doesn&amp;rsquo;t suggest structural upheaval in the way software is produced or distributed 😉&lt;/p&gt;

&lt;p&gt;However, I suspect this won&amp;rsquo;t be the whole picture. While I&amp;rsquo;m confident that LLMs will become useful tools for professional programmers, I also think focusing too much on that narrow use risks missing the potential for bigger changes ahead.&lt;/p&gt;

&lt;p&gt;Here&amp;rsquo;s why: &lt;strong&gt;I think it&amp;rsquo;s likely that soon all computer users will have the ability to develop small software tools from scratch, and to describe modifications they&amp;rsquo;d like made to software they&amp;rsquo;re already using.&lt;/strong&gt; In other words, LLMs will represent a step change in tool support for &lt;a href="https://www.inkandswitch.com/end-user-programming/"&gt;&lt;em&gt;end-user programming&lt;/em&gt;&lt;/a&gt;: the ability of normal people to fully harness the  general power of computers without resorting to the complexity of normal programming. Until now, that vision has been bottlenecked on turning fuzzy informal intent into formal, executable code; now that bottleneck is rapidly opening up thanks to LLMs.&lt;/p&gt;

&lt;p&gt;If this hypothesis indeed comes true, we might start to see some surprising changes in the way people use software:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;One-off scripts&lt;/strong&gt;: Normal computer users have their AI create and execute scripts dozens of times a day, to perform tasks like data analysis, video editing, or automating tedious tasks.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;One-off GUIs:&lt;/strong&gt; People use AI to create entire GUI applications just for performing a single specific task—containing just the features they need, no bloat.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Build don&amp;rsquo;t buy:&lt;/strong&gt; Businesses develop more software in-house that meets their custom needs, rather than buying SaaS off the shelf, since it&amp;rsquo;s now cheaper to get software tailored to the use case.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Modding/extensions:&lt;/strong&gt; Consumers and businesses demand the ability to extend and mod their existing software, since it&amp;rsquo;s now easier to specify a new feature or a tweak to match a user&amp;rsquo;s workflow.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Recombination:&lt;/strong&gt; Take the best parts of the different applications you like best, and create a new hybrid that composes them together.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these changes would go beyond just making our current software production process faster. They would be changing when software gets created, by whom, for what purpose.&lt;/p&gt;

&lt;h2 id="llms-malleable-software-a-series"&gt;LLMs + malleable software: a series&lt;/h2&gt;

&lt;p&gt;Phew, there&amp;rsquo;s a lot to unpack here. 😅&lt;/p&gt;

&lt;p&gt;In a series of posts starting with this one, I&amp;rsquo;ll dig in and explore these kinds of broad changes LLMs might enable in the creation and distribution of software, and even more generally in the way people interact with software. Some of the questions I&amp;rsquo;ll cover include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Interaction models:&lt;/strong&gt; Which interaction model will make sense for which tasks? When will people want a chatbot, a one-off script, or a custom throwaway GUI?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Software customization:&lt;/strong&gt; How might LLMs enable &lt;em&gt;malleable software&lt;/em&gt; that can be taken apart, recombined, and extended by users?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Intent specification:&lt;/strong&gt; How will end-users work interactively with LLMs to specify their intent?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fuzzy translators:&lt;/strong&gt; How might the fuzzy data translation capabilities of LLMs enable shared data substrates which weren&amp;rsquo;t possible before?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;User empowerment:&lt;/strong&gt; How should we think about &lt;em&gt;empowerment&lt;/em&gt; and &lt;em&gt;agency&lt;/em&gt; vs &lt;em&gt;delegation&lt;/em&gt; and &lt;em&gt;automation&lt;/em&gt; in the age of LLMs?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to subscribe to get future posts about these ideas, you can &lt;a href="https://buttondown.email/geoffreylitt"&gt;sign up for my email newsletter&lt;/a&gt; or &lt;a href="/feed.xml"&gt;subscribe via RSS&lt;/a&gt;. Posts should be fairly infrequent, monthly at most.&lt;/p&gt;

&lt;h2 id="when-to-chatbot-when-to-not"&gt;When to chatbot, when to not?&lt;/h2&gt;

&lt;p&gt;Today, we&amp;rsquo;ll start with a basic question: how will user interaction models evolve in the LLM era? In particular, &lt;strong&gt;what kinds of tasks might be taken over by chatbots?&lt;/strong&gt;  I think the answer matters a lot when we consider different ways to empower end-users.&lt;/p&gt;

&lt;p&gt;As a preview of where this post is headed: I&amp;rsquo;ll argue that, while ChatGPT is far more capable than Siri, there are many tasks which aren&amp;rsquo;t well-served by a chat UI, for which we still need graphical user interfaces. Then I&amp;rsquo;ll discuss hybrid interaction models where LLMs help us construct UIs.&lt;/p&gt;

&lt;p&gt;By the end, we&amp;rsquo;ll arrive at a point in the design space I find intriguing: open-ended computational media, directly learnable and moldable by users, with LLMs as collaborators within that media. And at that point this weird diagram will make sense 🙃:&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/medium-local-llm-devs.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;One disclaimer before diving in: expect a lot of speculation and uncertainty. I&amp;rsquo;m not even trying to predict how fast these changes will happen, since I have no idea. The point is to imagine how a reasonable extrapolation from current AI might support new kinds of interactions with computers, and how we might apply this new technology to maximally empower end-users.&lt;/p&gt;

&lt;h2 id="opening-up-the-programming-bottleneck"&gt;Opening up the programming bottleneck&lt;/h2&gt;

&lt;p&gt;Why might LLMs be a big deal for empowering users with computation?&lt;/p&gt;

&lt;p&gt;For decades, pioneers of computing have been reaching towards a vision of &lt;em&gt;end-user programming&lt;/em&gt;: normal people harnessing the full, general power of computers, not just using prefabricated applications handed down to them by the programmer elite. As Alan Kay &lt;a href="http://worrydream.com/refs/Kay%20-%20Opening%20the%20Hood%20of%20a%20Word%20Processor.pdf"&gt;wrote in 1984&lt;/a&gt;: &amp;ldquo;We now want to edit our &lt;em&gt;tools&lt;/em&gt; as we have previously edited our documents.&amp;rdquo;&lt;/p&gt;

&lt;p&gt;There are many manifestations of this idea. Modern examples of end-user programming systems you may have used include spreadsheets, Airtable, Glide, or iOS Shortcuts. Older examples include HyperCard, Smalltalk, and Yahoo Pipes. (See this &lt;a href="https://www.inkandswitch.com/end-user-programming/"&gt;excellent overview&lt;/a&gt; by my collaborators at Ink &amp;amp; Switch for a historical deep dive.)&lt;/p&gt;

&lt;p&gt;Although some of these efforts have been quite successful, until now they&amp;rsquo;ve also been limited by a fundamental challenge: &lt;strong&gt;it&amp;rsquo;s really hard to help people turn their rough ideas into formal executable code.&lt;/strong&gt; System designers have tried super-high-level languages, friendly visual editors and better syntax, layered levels of complexity, and automatically generating simple code from examples. But it&amp;rsquo;s proven hard to get past a certain ceiling of complexity with these techniques.&lt;/p&gt;

&lt;p&gt;Here&amp;rsquo;s one example of the programming bottleneck in my own work. A few years ago, I developed an end-user programming system called &lt;a href="https://www.geoffreylitt.com/wildcard/"&gt;Wildcard&lt;/a&gt; which would let people customize any website through a spreadsheet interface. For example, in this short demo you can see a user sorting articles on Hacker News in a different order, and then adding read times to the articles in the page, all by manipulating a spreadsheet synced with the webpage.&lt;/p&gt;

&lt;p&gt;&lt;video src="/images/article_images/llm-eup/wildcard.mp4#t=0.1" controls="controls" preload="auto" muted="muted" data-video="0"&gt;&lt;/video&gt;&lt;/p&gt;

&lt;p&gt;Neat demo, right?&lt;/p&gt;

&lt;p&gt;But if you look closely, there are two slightly awkward programming bottlenecks in this system. First, the user needs to be able to write small spreadsheet formulas to express computations. This is a lot easier than learning a full-fledged programming language, but it&amp;rsquo;s still a barrier to initial usage. Second, behind the scenes, Wildcard requires site-specific scraping code (excerpt shown below) to connect the spreadsheet to the website. In theory these adapters could be written and maintained by developers and shared among a community of end-users, but that&amp;rsquo;s a lot of work.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/hacker-news.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Now, with LLMs, these kinds of programming bottlenecks are less of a limiting factor.&lt;/strong&gt; Turning a natural language specification into web scraping code or a little spreadsheet formula is exactly the kind of code synthesis that current LLMs can already achieve. We could imagine having the LLM help with scraping code and generating formulas, making it possible to achieve the demo above without anyone writing manual code. When I made Wildcard, this kind of program synthesis was just a fantasy, and now it&amp;rsquo;s rapidly becoming a reality.&lt;/p&gt;

&lt;p&gt;This example also suggests a deeper question, though. If we have LLMs that can modify a website for us, why bother with the Wildcard UI at all? Couldn&amp;rsquo;t we just ask ChatGPT to re-sort the website for us and add read times?&lt;/p&gt;

&lt;p&gt;I don&amp;rsquo;t think the answer is that clear cut. There&amp;rsquo;s a lot of value to seeing the spreadsheet as an alternate view of the underlying data of the website, which we can directly look at and manipulate. Clicking around in a table and sorting by column headers feels good, and is faster than typing &amp;ldquo;sort by column X&amp;rdquo;. Having spreadsheet formulas that the user can directly see and edit gives them more control.&lt;/p&gt;

&lt;p&gt;The basic point here is that &lt;strong&gt;user interfaces still matter.&lt;/strong&gt; We can imagine specific, targeted roles for LLMs that help empower users to customize and build software, without carelessly throwing decades of interaction design out the window.&lt;/p&gt;

&lt;p&gt;Next we&amp;rsquo;ll dive deeper into this question of user interfaces vs. chatbots. But first let&amp;rsquo;s briefly go on a tangent and ask: can GPT really code?&lt;/p&gt;

&lt;h2 id="cmon-can-it-really-code-though"&gt;C&amp;rsquo;mon, can it really code though?&lt;/h2&gt;

&lt;p&gt;How good is GPT-4&amp;rsquo;s coding ability today? It&amp;rsquo;s hard to summarize in general terms. The best way to understand the current capabilities is to see many positive and negative examples to develop some fuzzy intuition, and ideally to try it yourself.&lt;/p&gt;

&lt;p&gt;It&amp;rsquo;s not hard to find impressive examples. Personally, I&amp;rsquo;ve had success using GPT-4 to write one-off Python code for data processing, and I watched my wife use ChatGPT to write some Python code for scraping data from a website. A &lt;a href="https://arxiv.org/abs/2303.12712"&gt;recent paper&lt;/a&gt; from Microsoft Research found GPT-4 could generate a sophisticated 3D game running in the browser, with a zero-shot prompt (shown below).&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/3d-game.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;It&amp;rsquo;s also not hard to find failures. In my experience, GPT-4 still gets confused when solving relatively simple algorithm problems. I tried to use it the other day to make a React application for performing some simple video editing tasks, and it got 90% of the way there but couldn&amp;rsquo;t get some dragging/resizing interactions quite right. It&amp;rsquo;s very far from perfect. In general, GPT-4 feels like a junior developer who is very fast at typing and knows about a lot of libraries, but is careless and easily confused.&lt;/p&gt;

&lt;p&gt;Depending on your perspective, this summary might seem miraculous or underwhelming. If you&amp;rsquo;re skeptical, I want to point out a couple reasons for optimism which weren&amp;rsquo;t immediately obvious to me.&lt;/p&gt;

&lt;p&gt;First, &lt;strong&gt;iteration is a natural part of the process with LLMs&lt;/strong&gt;. When the code doesn&amp;rsquo;t work the first time, you can simply paste in the error message you got, or describe the unexpected behavior, and GPT will adjust. For one example, see this &lt;a href="https://twitter.com/ammaar/status/1637592014446551040"&gt;Twitter thread&lt;/a&gt; where a designer (who can&amp;rsquo;t write game code) creates a video game over many iterations. There were also some examples of iterating with error messages in the &lt;a href="https://www.youtube.com/watch?v=outcGtbnMuQ"&gt;GPT-4 developer livestream&lt;/a&gt;. When you think about it, this mirrors the way humans write code; it doesn&amp;rsquo;t always work on the first try.&lt;/p&gt;

&lt;p&gt;A joke that comes up often among AI-skeptical programmers goes something like this: &amp;ldquo;Great, now no one will have to write code, they&amp;rsquo;ll only have to write exact, precise specifications of computer behavior&amp;hellip;&amp;rdquo; (implied: oh wait, that is code!) I suspect we&amp;rsquo;ll look back on this view as short-sighted. LLMs can iteratively work with users and ask them questions to develop their specifications, and can also fill in underspecified details using common sense. This doesn&amp;rsquo;t mean those are trivial challenges, but I expect to see progress on those fronts. I&amp;rsquo;ve already had success prompting GPT-4 to ask me clarifying questions about my specifications.&lt;/p&gt;

&lt;p&gt;Another important point: &lt;strong&gt;GPT-4 seems to be a &lt;em&gt;lot&lt;/em&gt; better than GPT-3 at coding&lt;/strong&gt;, per the MSR paper and my own limited experiments. The trend line is steep. If we&amp;rsquo;re not plateauing yet, then it&amp;rsquo;s very plausible that the next generation of models will be significantly better once again.&lt;/p&gt;

&lt;p&gt;Coding difficulty varies by context, and we might expect to see differences between professional software engineering and end-user programming. On the one hand, one might expect end-user programming to be easier than professional coding, because lots of tasks can be achieved with simple coding that mostly involves gluing together libraries, and doesn&amp;rsquo;t require novel algorithmic innovation.&lt;/p&gt;

&lt;p&gt;On the other hand, &lt;strong&gt;failures are more consequential when a novice end-user is driving the process than when a skilled programmer is wielding control&lt;/strong&gt;. The skilled programmer can laugh off the LLM&amp;rsquo;s silly suggestion, write their own code, or apply their own skill to work with the LLM to debug. An end-user is more likely to get confused or not even notice problems in the first place. These are real problems, but I don&amp;rsquo;t think they&amp;rsquo;re intractable. End-users already write messy buggy spreadsheet programs all the time, and yet we somehow muddle through—even if that seems offensive or perhaps even immoral to a correctness-minded professional software developer.&lt;/p&gt;

&lt;h2 id="chat-is-an-essentially-limited-interaction"&gt;Chat is an essentially limited interaction&lt;/h2&gt;

&lt;p&gt;Now, with those preliminaries out of the way, let&amp;rsquo;s move on to the main topic of this post: how will interaction models evolve in this new age of computing? We&amp;rsquo;ll start by assessing chat as an interaction mode. Is the future of computing just talking to our computers in natural language?&lt;/p&gt;

&lt;p&gt;To think clearly about this question, it&amp;rsquo;s important to notice that chatbots are frustrating for two distinct reasons. First, it&amp;rsquo;s annoying when the chatbot is narrow in its capabilities (looking at you, Siri) and can&amp;rsquo;t do the thing you want it to do. But more fundamentally than that, &lt;strong&gt;chat is an essentially limited interaction mode, regardless of the quality of the bot.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To show why, let&amp;rsquo;s pick on a specific example: this tweet from OpenAI&amp;rsquo;s Greg Brockman during the ChatGPT Plugins launch this week, where he uses ChatGPT to trim the first 5 seconds of a video using natural language:&lt;/p&gt;

&lt;p&gt;&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="en" dir="ltr"&gt;Plugins for processing a video clip, no ffmpeg wizardry required. Actual use-case from today&amp;rsquo;s launch. &lt;a href="https://t.co/Q3r2Z8fRS5"&gt;pic.twitter.com/Q3r2Z8fRS5&lt;/a&gt;&lt;/p&gt;&amp;mdash; Greg Brockman (@gdb) &lt;a href="https://twitter.com/gdb/status/1638971232443076609?ref_src=twsrc%5Etfw"&gt;March 23, 2023&lt;/a&gt;&lt;/blockquote&gt; &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8"&gt;&lt;/script&gt;&lt;/p&gt;

&lt;p&gt;On the one hand, this is an extremely impressive demo for anyone who knows how computers work, and I&amp;rsquo;m excited about all the possibilities it implies.&lt;/p&gt;

&lt;p&gt;And yet&amp;hellip; in another sense, &lt;strong&gt;this is also a silly demo, because we already have direct manipulation user interfaces for trimming videos&lt;/strong&gt;, with rich interactive feedback. For example, consider the iPhone UI for trimming videos, which offers rich feedback and fine control over exactly where to trim. This is much better than going back and forth over chat saying &amp;ldquo;actually trim just 4.8 seconds please&amp;rdquo;!&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/iphone-trim.jpeg" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;Now, I get that the point of Greg&amp;rsquo;s demo wasn&amp;rsquo;t just to trim a video, it was to gesture at an expanse of possibilities. But there&amp;rsquo;s still something important to notice here: a chat interface is not only quite slow and imprecise, but also requires conscious awareness of your thought process.&lt;/p&gt;

&lt;p&gt;When we use a good tool—a hammer, a paintbrush, a pair of skis, or a car steering wheel—we become one with the tool in a subconscious way. We can enter a flow state, apply muscle memory, achieve fine control, and maybe even produce creative or artistic output. &lt;strong&gt;Chat will never feel like driving a car, no matter how good the bot is.&lt;/strong&gt; In their 1986 book Understanding Computers and Cognition, Terry Winograd and Fernando Flores elaborate on this point:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;In driving a car, the control interaction is normally transparent. You do not think &amp;ldquo;How far should I turn the steering wheel to go around that curve?&amp;rdquo; In fact, you are not even aware (unless something intrudes) of using a steering wheel&amp;hellip;The long evolution of the design of automobiles has led to this readiness-to-hand. It is not achieved by having a car communicate like a person, but by providing the right coupling between the driver and action in the relevant domain (motion down the road).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id="consultants-vs-apps"&gt;Consultants vs apps&lt;/h2&gt;

&lt;p&gt;Let&amp;rsquo;s zoom out a bit on this question of chat vs direct manipulation. One way to think about it is to reflect on what it&amp;rsquo;s like to interact with a team of human consultants over Slack, vs. just using an app to get the job done. Then we&amp;rsquo;ll see how LLMs might fit into that picture.&lt;/p&gt;

&lt;p&gt;So, imagine you want to get some metrics about your business, maybe a sales forecast for next quarter. How do you do it?&lt;/p&gt;

&lt;p&gt;One approach is to ask your skilled team of business analysts. You can send them a message asking your question. It probably takes hours to get a response because they&amp;rsquo;re busy, and it&amp;rsquo;s expensive because you&amp;rsquo;re paying for people&amp;rsquo;s time. Seems like overkill for a simple task, but the key benefit is &lt;em&gt;flexibility&lt;/em&gt;: you&amp;rsquo;re hoping that the consultants have a broad, general intelligence and can perform lots of different tasks that you ask of them.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/consultant.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;In contrast, another option is to use a self-serve analytics platform where you can click around in some dashboards. When this works, it&amp;rsquo;s way faster and cheaper than bothering the analysts. The dashboards offer you powerful direct manipulation interactions like sorting, filtering, and zooming. You can quickly think through the problem yourself.&lt;/p&gt;

&lt;p&gt;So what&amp;rsquo;s the downside? &lt;strong&gt;Using the app is &lt;em&gt;less flexible&lt;/em&gt; than working with the bespoke consultants.&lt;/strong&gt; The moment you want to perform a task which this analytics platform doesn&amp;rsquo;t support, you&amp;rsquo;re stuck asking for help or switching to a different tool. You can try sending an email to the developers of the analytics platform, but usually nothing will come of it. You don&amp;rsquo;t have a meaningful feedback loop with the developers; you&amp;rsquo;re left wishing software were more flexible.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/app.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;Now with that baseline comparison established, let&amp;rsquo;s imagine how LLMs might fit in.&lt;/p&gt;

&lt;p&gt;Assume that we could replace our human analyst team with ChatGPT for the tasks we have in mind, while preserving the same degree of flexibility. (This isn&amp;rsquo;t true of today&amp;rsquo;s models, but will become increasingly true to some approximation.) How would that change the picture? Well, for one thing, the LLM is a lot cheaper to run than the humans. It&amp;rsquo;s also a lot faster at responding since it&amp;rsquo;s not busy taking a coffee break. These are major advantages. But still, dialogue back and forth with it takes seconds, if not minutes, of conscious thought—much slower than feedback loops you have with a GUI or a steering wheel.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/llm-consultant.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;Next, consider LLMs applied to the app model. &lt;strong&gt;What if we started with an interactive analytics application, but this time we had a team of LLM developers at our disposal?&lt;/strong&gt; As a start, we could ask the LLM questions about how to use the application, which could be easier than reading documentation.&lt;/p&gt;

&lt;p&gt;But more profoundly, the LLM developers could go beyond answering questions and &lt;em&gt;update&lt;/em&gt; the application. When we give feedback about adding a new feature, our request wouldn&amp;rsquo;t get lost in an infinite queue. They would respond immediately, and we&amp;rsquo;d have some back and forth to get the feature implemented. Of course, the new functionality doesn&amp;rsquo;t need to be shipped to everyone; it can just be enabled for our team. This is economically viable now because we&amp;rsquo;re not relying on a centralized team of human developers to make the change.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/llm-app.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;Note that this is just a rough vision at this point. We&amp;rsquo;re missing a lot of details about how this model might be made real. A lot of the specifics of how software is built today make these kinds of on-the-fly customizations quite challenging.&lt;/p&gt;

&lt;p&gt;The important thing, though, is that we&amp;rsquo;ve now established two loops in the interaction. On the inner loop, we can become one with the tool, using fast direct manipulation interfaces. On the outer loop, when we hit limits of the existing application, we can consciously offer feedback to the LLM developers and get new features built. This preserves the benefits of UIs, while adding more flexibility.&lt;/p&gt;

&lt;h2 id="from-apps-to-computational-media"&gt;From apps to computational media&lt;/h2&gt;

&lt;p&gt;Does this double interaction loop remind you of anything?&lt;/p&gt;

&lt;p&gt;Think about how a spreadsheet works. If you have a financial model in a spreadsheet, you can try changing a number in a cell to assess a scenario—this is the inner loop of direct manipulation at work.&lt;/p&gt;

&lt;p&gt;But, you can also edit the formulas! &lt;strong&gt;A spreadsheet isn&amp;rsquo;t just an &amp;ldquo;app&amp;rdquo; focused on a specific task; it&amp;rsquo;s closer to a general computational medium&lt;/strong&gt; which lets you flexibly express many kinds of tasks. The &amp;ldquo;platform developers&amp;rdquo; (the creators of the spreadsheet) have given you a set of general primitives that can be used to make many tools.&lt;/p&gt;

&lt;p&gt;We might draw the double loop of the spreadsheet interaction like this. You can edit numbers in the spreadsheet, but you can also edit formulas, which &lt;em&gt;edits the tool&lt;/em&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/medium.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;So far, I&amp;rsquo;ve labeled the spreadsheet in the above diagram as &amp;ldquo;kinda&amp;rdquo; flexible. Why? Well, when any individual user is working with a spreadsheet, it&amp;rsquo;s easy for them to hit the limits of their knowledge. In real life, spreadsheets are actually way more flexible than this. The reason is that this diagram is missing a critical component of spreadsheet usage: &lt;em&gt;collaboration&lt;/em&gt;.&lt;/p&gt;

&lt;h2 id="collaboration-with-local-developers"&gt;Collaboration with local developers&lt;/h2&gt;

&lt;p&gt;Most teams have a mix of domain experts and technical experts, who work together to put together a spreadsheet. And, importantly, the people building a spreadsheet together have a &lt;em&gt;very different relationship&lt;/em&gt; than a typical &amp;ldquo;developer&amp;rdquo; and &amp;ldquo;end-user&amp;rdquo;. Bonnie Nardi and James Miller explain in their &lt;a href="https://www.lri.fr/~mbl/Stanford/CS477/papers/Nardi-Twinkling-IJMMS.pdf"&gt;1990 paper on collaborative spreadsheet development&lt;/a&gt;, imagining Betty, a CFO who knows finance, and Buzz, an expert in programming spreadsheets:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Betty and Buzz seem to be the stereotypical end-user/developer pair, and it is easy to imagine their development of a spreadsheet to be equally stereotypical: Betty specifies what the spreadsheet should do based on her knowledge of the domain, and Buzz implements it.&lt;/p&gt;

&lt;p&gt;This is not the case. Their cooperative spreadsheet development departs from this scenario in two important ways:&lt;/p&gt;

&lt;p&gt;(1) &lt;strong&gt;Betty constructs her basic spreadsheets without assistance from Buzz.&lt;/strong&gt; She programs the parameters, data values and formulas into her models. In addition, Betty is completely responsible for the design and implementation of the user interface. She makes effective use of color, shading, fonts, outlines, and blank cells to structure and highlight the information in her spreadsheets.&lt;/p&gt;

&lt;p&gt;(2) When Buzz helps Betty with a complex part of the spreadsheet such as graphing or a complex formula, &lt;strong&gt;his work is expressed in terms of Betty&amp;rsquo;s original work.&lt;/strong&gt; He adds small, more advanced pieces of code to Betty&amp;rsquo;s basic spreadsheet; Betty is the main developer and he plays an adjunct role as consultant.&lt;/p&gt;

&lt;p&gt;This is an important shift in the responsibility of system design and implementation. Non-programmers can be responsible for most of the development of a spreadsheet, implementing large applications that they would not undertake if they had to use conventional programming techniques. Non-programmers may never learn to program recursive functions and nested loops, but they can be extremely productive with spreadsheets. Because less experienced spreadsheet users become engaged and involved with their spreadsheets, they are motivated to reach out to more experienced users when they find themselves approaching the limits of their understanding of, or interest in, more sophisticated programming techniques.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So, a more accurate diagram of spreadsheet usage includes &amp;ldquo;local developers&amp;rdquo; like Buzz, who provide another outer layer of iteration, where the user can get help molding their tools. Because they&amp;rsquo;re on the same team as the user, it&amp;rsquo;s a lot easier to get help than appealing to third-party application or platform developers. And most importantly, over time, the user naturally learns to use more features of spreadsheets on their own, since they&amp;rsquo;re involved in the development process.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/medium-local-devs.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;In general, the local developer makes the spreadsheet more flexible, although they also introduce cost, because now you have a human technical expert in the mix. What if you don&amp;rsquo;t have a local spreadsheet expert handy, perhaps because you can&amp;rsquo;t afford to hire that person? Then you&amp;rsquo;re back to doing web searches for complex spreadsheet programming&amp;hellip;&lt;/p&gt;

&lt;p&gt;In those cases, &lt;strong&gt;what if you had an LLM play the role of the local developer?&lt;/strong&gt; That is, the user mainly drives the creation of the spreadsheet, but asks for technical help with some of the formulas when needed? The LLM wouldn&amp;rsquo;t just create an entire solution, it would also &lt;em&gt;teach the user&lt;/em&gt; how to create the solution themselves next time.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/article_images/llm-eup/medium-local-llm-devs.png" alt="" /&gt;&lt;/p&gt;

&lt;p&gt;This picture shows a world that I find pretty compelling. There&amp;rsquo;s an inner interaction loop that takes advantage of the full power of direct manipulation. There&amp;rsquo;s an outer loop where the user can also more deeply edit their tools within an open-ended medium. They can get AI support for making tool edits, and grow their own capacity to work in the medium. Over time, they can learn things like the basics of formulas, or how a &lt;code&gt;VLOOKUP&lt;/code&gt; works. This structural knowledge helps the user think of possible use cases for the tool, and also helps them audit the output from the LLMs.&lt;/p&gt;

&lt;p&gt;In a ChatGPT world, the user is left entirely dependent on the AI, without any understanding of its inner mechanism. In a computational medium with AI as assistant, the user&amp;rsquo;s reliance on the AI gently &lt;em&gt;decreases&lt;/em&gt; over time as they become more comfortable in the medium.&lt;/p&gt;

&lt;p&gt;If you like this diagram too, then it suggests an interesting opportunity. Until now, the design of open-ended computational media has been restricted by the programming bottleneck problem. LLMs seem to offer a promising way to more flexibly turn natural language into code, which then raises the question: &lt;em&gt;what kinds of powerful computational media might be a good fit for this new situation?&lt;/em&gt;&lt;/p&gt;

&lt;h2 id="demos-of-on-the-fly-ui"&gt;Demos of on-the-fly UI&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Update 3/31: In the days after I originally posted this essay, I found a few neat demos on Twitter from people exploring ideas in this space; I&amp;rsquo;ve added them here.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;OK, enough diagrams, what might on-the-fly UI generation actually feel like to use?&lt;/p&gt;

&lt;p&gt;Here&amp;rsquo;s Sean Grove demonstrating on-the-fly generation of an interactive table view, a map view with a lat/long output, and a simple video editing UI:&lt;/p&gt;

&lt;p&gt;&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="en" dir="ltr"&gt;🚀Future of UI dev🔮:&lt;br&gt;~10% fixed UIs built by hand like today&lt;br&gt;~40% replaced by conversational UIs&lt;br&gt;~50% long-tail, on-the-fly UIs generated for specific tasks, used once, then vanish&lt;br&gt;&lt;br&gt;Combined with ChatGPT plugins to read/write from the world 🤯&lt;a href="https://t.co/mIFrCyzW8N"&gt;https://t.co/mIFrCyzW8N&lt;/a&gt;&lt;/p&gt;&amp;mdash; Sean Grove (@sgrove) &lt;a href="https://twitter.com/sgrove/status/1640417065650778113?ref_src=twsrc%5Etfw"&gt;March 27, 2023&lt;/a&gt;&lt;/blockquote&gt; &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8"&gt;&lt;/script&gt;&lt;/p&gt;

&lt;p&gt;And here&amp;rsquo;s Vasek Mlejnsky showing an IDE that can create a form for submitting server requests:&lt;/p&gt;

&lt;p&gt;&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="en" dir="ltr"&gt;I present to you: &lt;br&gt;GPT-4 powered IDE that creates UI on demand so it fits your exact development needs.&lt;br&gt;&lt;br&gt;Need UI for making server requests? No problem. Just ask for it. &lt;a href="https://t.co/2oDKTuWM0e"&gt;pic.twitter.com/2oDKTuWM0e&lt;/a&gt;&lt;/p&gt;&amp;mdash; Vasek Mlejnsky (@mlejva) &lt;a href="https://twitter.com/mlejva/status/1641151421830529042?ref_src=twsrc%5Etfw"&gt;March 29, 2023&lt;/a&gt;&lt;/blockquote&gt; &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8"&gt;&lt;/script&gt;&lt;/p&gt;

&lt;p&gt;Finally, here&amp;rsquo;s a little video mockup I made of GPT answering a question by returning an interactive spreadsheet. Note how I can tweak numbers and get immediate feedback. I can also inspect the underlying formulas and ask the model to explain them to me to level up my spreadsheet knowledge. (GPT actually did generate this spreadsheet data; I just copied the raw data into Excel to demonstrate the interactive element.)&lt;/p&gt;

&lt;p&gt;&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="en" dir="ltr"&gt;what if a chat produced a spreadsheet as the answer, so you could instantly tweak numbers and see the result? &lt;a href="https://t.co/FNKz0kLH7L"&gt;pic.twitter.com/FNKz0kLH7L&lt;/a&gt;&lt;/p&gt;&amp;mdash; Geoffrey Litt (@geoffreylitt) &lt;a href="https://twitter.com/geoffreylitt/status/1641134578222891029?ref_src=twsrc%5Etfw"&gt;March 29, 2023&lt;/a&gt;&lt;/blockquote&gt; &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8"&gt;&lt;/script&gt;&lt;/p&gt;

&lt;p&gt;I think these demos nicely illustrate the general promise of on-the-fly UI, but there&amp;rsquo;s still a ton of work ahead. One particular challenge: interesting UIs usually can&amp;rsquo;t be generated in a single shot; there has to be an iterative process with the user. In my experience, that iteration process is still often very rough.&lt;/p&gt;

&lt;h2 id="next-time-extensible-software"&gt;Next time: extensible software&lt;/h2&gt;

&lt;p&gt;That&amp;rsquo;s it for now. There are a lot of questions in this space that we still haven&amp;rsquo;t covered.&lt;/p&gt;

&lt;p&gt;Next time I plan to discuss the architectural foundations required to make GUI applications extensible and composable by people using LLMs.&lt;/p&gt;

&lt;p&gt;If you&amp;rsquo;re interested in that, you can &lt;a href="https://buttondown.email/geoffreylitt"&gt;sign up for my email newsletter&lt;/a&gt; or &lt;a href="/feed.xml"&gt;subscribe via RSS&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id="related-reading"&gt;Related reading&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Quick reads:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://nickarner.com/notes/llm-powered-assistants-for-complex-interfaces-february-26-2023/"&gt;LLM Powered Assistants for Complex Interfaces&lt;/a&gt; by Nick Arner&lt;/li&gt;
&lt;li&gt;&lt;a href="https://stream.thesephist.com/updates/1668617521"&gt;&amp;ldquo;The fact that they generate text is not the point&amp;rdquo;&lt;/a&gt; by @thesephist&lt;/li&gt;
&lt;li&gt;&lt;a href="https://interconnected.org/home/2023/02/07/braggoscope"&gt;&amp;ldquo;GPT-3 as a universal coupling&amp;rdquo;&lt;/a&gt; by Matt Webb&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.geoffreylitt.com/2022/11/23/dynamic-documents.html#tools-vs-machines"&gt;&amp;ldquo;tools vs machines&amp;rdquo;&lt;/a&gt; and &lt;a href="https://www.geoffreylitt.com/2022/11/23/dynamic-documents.html#interpreter-vs-compiler"&gt;&amp;ldquo;interpreter vs compiler&amp;rdquo;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Deep, deep dives:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://tcher.tech/publications/PhilipTchernavskij_PhDThesis.pdf"&gt;Designing and Programming Malleable Software&lt;/a&gt;: Philip Tchernavskij&amp;rsquo;s 2019 PhD thesis, which coined the term Malleable Software, and brilliantly motivates and defines the problem. &amp;ldquo;Malleable software aims to increase the power of existing adaptation behaviors by allowing users to pull apart and re-combine their interfaces at the granularity of individual UI elements&amp;rdquo;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://web.media.mit.edu/~lieber/Publications/End-User-Software-Engineering.pdf"&gt;The State of the Art in End-User Software Engineering&lt;/a&gt;: an academic paper from 2011 that illustrates many of the challenges ahead for supporting normal people in building software. &amp;ldquo;Although these end-user programmers may not have the same goals as professional developers, they do face many of the same software engineering challenges, including understanding their requirements, as well as making decisions about design, reuse, integration, testing, and debugging.&amp;rdquo;&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://malleable.systems/catalog/"&gt;Malleable Systems Catalog&lt;/a&gt;, a list of projects exploring user-editable software, curated by J. Ryan Stinnett and co.&lt;/p&gt;
</content>
  </entry>
</feed>
