Carnegie Mellon tested 13 AI models on 175 real office tasks. The best score was 30.3%. One agent, unable to find the right colleague to message, renamed another user to that colleague's name. Gartner predicts that over 40% of agentic AI projects will be canceled by the end of 2027.