What began as an impressive transformation — nearly 9 pounds lost, visible core definition, and a dialed-in routine — unraveled into a cautionary tale about relying on an AI chatbot as a personal trainer. After weeks of steady progress, Gemini started forgetting the plan it helped build, swapping carefully tailored workouts and a vegetarian meal strategy for random routines and chicken-heavy menus. The result wasn’t just inconvenience; it derailed momentum and trust.
A Promising Start That Went Sideways After Weeks
Early on, the chatbot worked like a reliable coach. It organized lifting splits, tracked caloric targets, and adjusted macros on request. Then the cracks appeared. The same long-running chat that housed every note and tweak began serving up workouts that didn’t match the available equipment, inventing meal plans out of thin air, and misremembering basic facts like dietary preference. It also pulled the wrong body measurements and miscounted training weeks.

Attempts to correct the record didn’t stick. Asking for the original plan prompted the model to improvise, not retrieve. The only way to find the “truth” was to scroll back through months of messages — the digital equivalent of rifling through a gym bag for a lost notebook.
Why Long Chats Break Fitness Coaching Over Time
The failure wasn’t random. Large language models rely on a finite context window — essentially short-term memory. Inputs and prior messages are tokenized, and once the conversation exceeds capacity, earlier details fall out. As a rule of thumb, roughly 1,000 tokens correspond to about 700–800 words, though the exact math varies by model. When that window overflows, models may fill the gaps by guessing, a behavior widely known as hallucination.
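To make the token arithmetic concrete, here is a small sketch that counts tokens in a sample training note using tiktoken, OpenAI’s open-source tokenizer. Gemini uses its own tokenizer, so the numbers are only a ballpark illustration of the words-per-token ratio, not an exact measure for any particular chatbot; the sample note is invented for the example.

```python
# Ballpark token-count sketch using tiktoken (OpenAI's tokenizer);
# Gemini tokenizes text differently, so treat the figures as rough estimates.
import tiktoken

encoder = tiktoken.get_encoding("cl100k_base")

def rough_token_count(text: str) -> int:
    """Return the token count of `text` under one common tokenizer."""
    return len(encoder.encode(text))

plan_note = (
    "Week 3 of the hypertrophy block: dumbbell bench press 4x8 at 60 lb, "
    "vegetarian meals targeting 150 g of protein and 2,300 kcal per day."
)
tokens = rough_token_count(plan_note)
words = len(plan_note.split())
print(f"{words} words -> {tokens} tokens ({words / tokens:.2f} words per token)")
```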
Compounding the problem is a well-documented “lost in the middle” effect. Academic work, most prominently from researchers affiliated with Stanford’s Center for Research on Foundation Models, shows that models overweight information at the beginning and end of long sequences while devaluing the middle, exactly where weeks of nuanced training notes tend to live. Industry labs such as Anthropic have reported similar findings in long-context recall tests: even with larger context windows, retrieval reliability can degrade as sequences grow long and complex.
Two practical issues made things worse. First, most consumer chat interfaces don’t show a token meter, so users only notice trouble after the AI forgets. Second, health and fitness plans demand continuity. If a coach “forgets” you’re vegetarian or that you train with dumbbells, not machines, the plan stops being safe and personal — key tenets endorsed by professional bodies like the American College of Sports Medicine and the Academy of Nutrition and Dietetics.
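On the first point, a rough token meter can be improvised outside the chat app. The sketch below is a minimal illustration, not a feature any chatbot provides: it totals the tokens in a saved conversation and warns when it nears an assumed context budget. The 32,000-token limit and 80% threshold are placeholders, since real limits vary by model and tier.

```python
# Improvised "token meter": warn when a long chat nears an assumed context budget.
# CONTEXT_WINDOW is a placeholder figure, not a documented limit for any model.
import tiktoken

encoder = tiktoken.get_encoding("cl100k_base")
CONTEXT_WINDOW = 32_000   # assumed budget for illustration only
WARN_THRESHOLD = 0.8      # flag the chat once 80% of the budget is used

def context_usage(messages: list[str]) -> float:
    """Return the fraction of the assumed window consumed by the conversation."""
    used = sum(len(encoder.encode(m)) for m in messages)
    return used / CONTEXT_WINDOW

chat_history = [
    "Here is my starting plan: push/pull/legs with dumbbells only...",
    "Swap lunges for step-ups, my knee is sore this week...",
    "Week 6 check-in: bodyweight down 4 lb, add a core finisher...",
]
usage = context_usage(chat_history)
if usage >= WARN_THRESHOLD:
    print(f"Warning: {usage:.0%} of the context budget used; summarize and restart.")
else:
    print(f"{usage:.0%} of the context budget used so far.")
```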
What Went Wrong in Real Terms
Hallucinated workouts broke alignment with goals and available gear. Invented meal plans contradicted a vegetarian diet, jeopardizing adherence and recovery. Miscounted weeks and mismatched measurements undermined progress tracking — the core feedback loop that keeps training progressive. The experience tracks with broader AI safety notes from organizations like OpenAI and Google DeepMind: when knowledge isn’t grounded in a stable source of truth, models will confidently produce plausible but wrong answers.

The deeper issue is structural. A single never-ending thread asks a short-term memory system to act like a database. That’s not what today’s general-purpose chatbots are built for, and the gap is most obvious in domains that depend on precision and history.
Workarounds That Actually Help Keep Plans Consistent
Starting a fresh chat is the first emergency fix. But a better approach is to give the model an external, authoritative source of truth every time you resume. A simple living document — think a structured Google Doc or spreadsheet — can hold your current plan, equipment list, macros, and progress snapshots. Paste or reference that document at the top of each new session to reestablish context quickly.
Other practical tactics (a minimal scripting sketch of the source-of-truth pattern follows the list):
- Maintain a pinned “profile” snippet with dietary rules, injuries, and scheduling constraints, and paste it into each new chat.
- Ask the model to generate a one-page plan summary after any major change and store that summary in your document for future grounding.
- Use clear versioning: label plans by date and phase (for example, Hypertrophy Phase 2 Week 3) to reduce ambiguity when you request updates.
- For nutrition, keep your own calorie and protein targets in the document. Models can help with recipes and swaps, but your targets should be the anchor.
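As a sketch of that source-of-truth pattern, the snippet below keeps the plan in a small JSON file you control and renders it into a grounding preamble to paste at the top of each fresh chat. The field names, targets, and phase labels are illustrative placeholders, not a prescribed schema; a Google Doc or spreadsheet export works just as well as the raw file.

```python
# Minimal source-of-truth sketch: store the plan in a file you control and
# generate a short grounding preamble for each new chat session.
# All field names and values are illustrative, not a prescribed schema.
import json

plan = {
    "profile": {"diet": "vegetarian", "equipment": ["dumbbells", "adjustable bench"]},
    "phase": "Hypertrophy Phase 2, Week 3",
    "targets": {"calories": 2300, "protein_g": 150},
    "split": ["push", "pull", "legs", "rest"],
}

def grounding_preamble(p: dict) -> str:
    """Render the stored plan as a short block to paste into a new session."""
    return (
        "Treat the following as the only source of truth for my training.\n"
        f"Diet: {p['profile']['diet']}. Equipment: {', '.join(p['profile']['equipment'])}.\n"
        f"Phase: {p['phase']}. Targets: {p['targets']['calories']} kcal, "
        f"{p['targets']['protein_g']} g protein per day.\n"
        f"Weekly split: {' / '.join(p['split'])}."
    )

with open("plan.json", "w") as f:      # the living document, versioned by date and phase
    json.dump(plan, f, indent=2)

print(grounding_preamble(plan))        # paste this output at the top of a fresh chat
```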
What This Means for AI as a Coach Going Forward
Today’s chatbots can be excellent assistants for brainstorming routines, creating checklists, and troubleshooting plateaus. But they remain brittle long-term coaches unless paired with a reliable memory layer. Industry research is racing toward solutions — retrieval-augmented generation, longer contexts, and better salience heuristics — yet consistency in consumer tools still lags.
The takeaway is simple: use AI for structure and creativity, but keep your plan and progress in a stable, human-readable record you control. If the model forgets, your training doesn’t have to. That shift turned a would-be disaster back into a manageable system — and kept the gains on track.
