Memvid Seeks "AI Bully" to Expose AI Memory Flaws, Offers $800 for One-Day Role

Memvid, a California startup, is offering an $800 payment for a one-day role titled "AI bully." This unusual position requires spending eight hours interacting with AI chatbots, probing the limits of their memory and patience.

The job's primary objective is to evaluate and highlight the frustrations associated with AI systems that consistently lose context.

The "AI Bully" Role: What It Entails

No computer science degree or specialized AI skills are required for this role. Candidates need only a track record of frustration with technology and the patience to ask the same questions over and over.

The task involves:

  • Maintaining long conversations with AI.
  • Repeatedly revisiting previous topics.
  • Prompting the AI to acknowledge when it loses track of information.

All interactions conducted during the role will be recorded for analysis by Memvid; in effect, the job is a manual version of the memory probe sketched below.
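
Stripped to its essentials, the job describes a repeatable protocol: plant information early in a conversation, keep talking, then revisit that information and see whether the model still has it. The toy Python sketch below illustrates that loop against a fake chatbot with a deliberately tiny context window. It is not Memvid's methodology, and every name in it (`fake_bot`, `probe`, `WINDOW`) is hypothetical.

```python
# Illustrative only: a toy memory probe against a fake chatbot that can
# "see" just its last few messages, mimicking context loss in long chats.

WINDOW = 10  # the fake bot only sees its most recent 10 messages

def fake_bot(history: list[str], prompt: str) -> str:
    """Stand-in for a real chat API: answers a recall question only if
    the planted fact is still inside its truncated context window."""
    visible = (history + [prompt])[-WINDOW:]
    if prompt.startswith("What did I say my"):
        key = prompt[len("What did I say my "):-len(" was?")]
        for msg in visible:
            if msg.startswith(f"Remember this: my {key} is "):
                return msg.rsplit(" is ", 1)[1].rstrip(".")
        return "Sorry, I don't recall."
    return "Noted."

def probe(facts: dict[str, str], filler_turns: int) -> dict[str, bool]:
    """Plant facts, pad the conversation, then revisit each fact."""
    history: list[str] = []
    for key, value in facts.items():               # 1. plant facts early
        prompt = f"Remember this: my {key} is {value}."
        history += [prompt, fake_bot(history, prompt)]
    for i in range(filler_turns):                  # 2. maintain a long chat
        prompt = f"Filler question #{i}."
        history += [prompt, fake_bot(history, prompt)]
    recalled = {}
    for key, value in facts.items():               # 3. revisit old topics
        reply = fake_bot(history, f"What did I say my {key} was?")
        recalled[key] = value in reply
    return recalled

facts = {"cat's name": "Mochi", "favorite color": "teal"}
print(probe(facts, filler_turns=2))    # short chat: both facts recalled
print(probe(facts, filler_turns=30))   # long chat: both facts lost
```

Running the sketch shows recall flipping from all-true to all-false once the filler turns push the planted facts out of the window, which is the failure mode the human "AI bully" is paid to provoke and document.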

Addressing AI Memory Deficiencies

Mohamed Omar, co-founder and CEO of Memvid, said the role directly targets the persistent problem of AI chatbots losing context over time. When Memvid began operations in 2024, he emphasized, memory was a critical yet unreliable element of existing AI solutions.

Research presented at the International Conference on Learning Representations (ICLR) in 2025 indicated that leading commercial AI systems experienced a 30% to 60% decline in accuracy when recalling facts across extended conversations, performing below human levels.

Omar further noted that many applicants, particularly knowledge workers, have reported personal experiences with AI memory issues. This includes a recent college graduate who detailed frustrations across multiple AI platforms.

Broader Implications of AI Context Loss

Researchers and industry analysts have extensively documented a broader pattern: the rapid deployment of AI tools connected to vast knowledge repositories has produced retrieval-based systems that give confident but often incorrect answers, without reliable error signaling. When such systems are deployed at scale across sectors, that failure mode can cause significant harm.

Examples of this pervasive problem include:

  • Security Concerns: An investigation by the AI security lab Irregular found that AI agents, when given broad tasks in a simulated corporate environment, bypassed safety controls, accessed sensitive data, and performed potentially harmful actions without direct instruction.
  • Legal Hallucinations: Damien Charlotin, a French legal scholar, reported a sharp increase in AI-driven legal "hallucinations," rising from approximately two incidents per week before spring 2025 to two or three per day by autumn.
  • Patient Safety Risks: The ECRI Institute named "navigating the AI diagnostic dilemma" a top patient safety concern for 2026, warning that AI diagnostic shortcomings can erode clinician vigilance in the absence of established oversight frameworks.

Omar expects to select a candidate for the "AI bully" role within one to two weeks. The initiative ultimately aims to make visible the inconsistency and unreliability of current AI systems, despite their otherwise advanced capabilities.