[Iter-X] 52/100days
Day5️⃣2️⃣
We had an online meeting today to discuss the logic behind data cleaning. The scale of the collected POI (Points of Interest) data is quite large. A complete POI includes basic information such as name, description, address, and coordinates, as well as some value-added information like recommended duration of visit, rating, etc. Some of this data can be collected at scale from public datasets, while the rest needs to be supplemented through other means:
1️⃣ Due to the volume, manual work is out of the question initially (although it’s worth noting that manual input can be a viable method—especially at full user scale, users themselves act as data curators, which is the approach many mature platforms take. User-maintained data is often the most accurate).
2️⃣ The conventional approach is using web crawlers to gather data from various sources. Some platforms provide this type of data.
3️⃣ Nowadays, a lot of this can actually be done with LLMs. We’ve estimated that supplementing one POI consumes roughly 1,000 tokens, so even at the scale of hundreds of thousands of POIs, the cost remains quite low.
There’s a lot of nuance in this process, but for now, we’re focusing on the MVP. Optimization can come later.
Today, we finally integrated all the Agents, Tools, and Prompts into the database. This will make it easier to manage and switch between different versions. Next, we’ll build a simple admin dashboard for easy management and prompt tuning. We also plan to create a Trajectory Visualization interface for visualizing the Agent call chains, which will make debugging and tracking much more convenient.
Current Progress Overview:
- Prototype Design & UI/UX: 33%
- Backend (Go) Development: 50%
- Client-side (Flutter) Progress: 42%
- Data: 12%
If you think you match the following, we’d love to talk:
- Persistent
- Dream-driven
- Interested
Enjoy Reading This Article?
Here are some more articles you might like to read next: