João Freitas

I vibe coded for two weeks

I’ve been working solo on a new project at my current company: migrating a desktop browser extension to mobile (fairly simple software that scans and downloads images). In case you didn’t know, mobile browser extensions are only possible on iOS through the Safari Web Extension API, unless you manually sideload them on your device (Android).

Instead of following a traditional migration workflow, my CTO wanted to run an experiment to test whether LLMs can help us (engineers) become more productive (as in reducing time to release) by vibe coding the extension using only prompts. We set a deadline of two weeks (ten work days), aiming to have a version ready for an internal release by then, with one simple rule: no more than 20% of work time could be spent manually writing or fixing generated code; everything else had to be done via prompting.
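To make the 20% rule concrete, here is a minimal sketch of the arithmetic behind it. The function name, the result shape and the 8-hour work day are assumptions for illustration only; the experiment itself only fixed "ten work days" and "no more than 20%".

```typescript
// Hypothetical helper, not part of the actual project.
interface ManualBudget {
  totalHours: number;  // all working hours in the experiment
  manualHours: number; // hours allowed for hand-written or hand-fixed code
  promptHours: number; // hours that must be spent prompting
}

// Assumption: an 8-hour work day (the experiment only defined "ten work days").
function manualBudget(
  workDays: number,
  manualPercent: number,
  hoursPerDay: number = 8
): ManualBudget {
  if (manualPercent < 0 || manualPercent > 100) {
    throw new RangeError("manualPercent must be between 0 and 100");
  }
  const totalHours = workDays * hoursPerDay;
  // Multiply before dividing so whole-number inputs stay exact.
  const manualHours = (totalHours * manualPercent) / 100;
  return { totalHours, manualHours, promptHours: totalHours - manualHours };
}

// Ten work days with a 20% cap on manual work.
const budget = manualBudget(10, 20);
console.log(budget);
```

Under these assumptions the cap works out to 16 of 80 hours, which is roughly two full work days of manual coding across the whole experiment.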

We bought a Cursor subscription and let it choose which model fits best for each prompt (Claude 3-8, gpt4o, gemini-pro-max, among others). To ease future maintenance, we decided to build the tech stack on web technologies we already work with, like React, Styled Components and Webpack. Defining this stack in Cursor’s system rules was essential to reduce the likelihood of ending up with code that needs different frameworks and libraries to compile. We also added a system rule instructing Cursor to write clean and concise code and not to override any project configuration through Webpack.

Timeline

We first aimed at getting an end-to-end prototype working, which took two work days. Initially we thought we could get it done within a morning, but Cursor wasn’t able to understand why nothing was appearing on the screen. After manual inspection we realized that the React components’ HTML was properly rendered, but the UI library CSS wasn’t being included in the main CSS file (content.css). Cursor tried many times to fix the issue by injecting stylesheet elements into the page, but in the end we had to fix it manually by importing the bundled UI library CSS file in the main CSS file.

We spent the following days fixing all visible UI and logic bugs in the generated code, as well as updating the design to match the specification on Figma. We didn’t use any MCP server to feed Cursor the Figma frames, but instead attached screenshots of these frames to the chat conversation. I was really impressed that Cursor was able to quickly understand the frames and create a rough copy of them. This stage took about five work days.

The remaining three days were spent adjusting the feature flows to match the requirements specified by the product team.

The goods

Now I’m going to talk about what I really enjoyed in this experiment:

The bads

As expected, everything has a cost, and we noticed several downsides to vibe coding:


Overall I’m glad that we ran this experiment on a new project, since it’s much easier to evaluate the state of tools like Cursor on a greenfield project with few requirements and decision flows. I wouldn’t recommend replicating the same thing on a legacy project, as it would bring more harm than good for the engineer or team. I also think that for the price of $20/month, Cursor is well worth it.