Tech Digest – January 25, 2026
AI CAPABILITIES & BENCHMARKS
GPT-5.2 Pro Hits 31% on FrontierMath Tier 4—Mathematicians Now Analyzing Its Failures
OpenAI’s GPT-5.2 Pro has scored 31% on FrontierMath Tier 4, Epoch AI’s benchmark of research-level mathematics problems that typically take professional mathematicians weeks to solve. The score is a new state of the art, up from the 17% achieved by GPT-5 Pro. Epoch AI notes that only 17 of 48 private Tier 4 problems have been solved by any model as of January 2026, and mathematicians are now studying the model’s failure cases to understand what makes the remaining problems resistant.
Note: This benchmark matters because it tests whether AI can contribute to actual mathematical research, not just solve textbook problems. For institutions tracking AI capability trajectories, the 31% score signals that frontier models are beginning to operate at the edge of human mathematical knowledge—though the 69% failure rate shows significant limits remain.
Sources: Epoch AI FrontierMath, OpenAI
AI GOVERNANCE & DEVELOPMENT
Anthropic Opens the Door on AI-Resistant Hiring—and Admits the Door Is Already Half-Shut
Anthropic has published its internal performance engineering exam as an open benchmark after Claude Opus 4.5 scored higher than any human candidate who has ever taken it. The exam, designed to test optimization skills under time pressure, now serves as a challenge to humans who believe they can outperform the model “given infinite time.” The company is explicitly inviting applicants to prove they can beat the AI.
Note: This is a concrete marker for HR and workforce planning. When a major AI lab publicly acknowledges its own hiring tests are now easier for AI than for humans, institutions should take it as a signal: technical assessment methods that worked five years ago may now be obsolete. The implications extend beyond hiring to internal training, competency evaluation, and role design.
Sources: Anthropic Engineering Blog
Claude Code Gets “Tasks”—Persistent, Dependency-Aware Project Management Across Sessions
Anthropic has shipped a “Tasks” feature for Claude Code that enables persistent task tracking with dependencies, cross-session coordination, and multi-agent collaboration. Unlike previous to-do functionality, tasks are stored in the filesystem (~/.claude/tasks), survive session restarts, and can be shared across multiple Claude Code instances working on the same project. The feature supports defining task dependencies and broadcasting updates when one session completes work.
Note: For institutions evaluating AI coding tools, this signals a shift from “AI assistant” to “AI project coordinator.” The ability to define task dependencies and coordinate across sessions moves Claude Code closer to functioning as an autonomous development team member rather than a reactive tool. Worth monitoring for implications on software procurement and development workflows.
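To make "dependency-aware" concrete, here is a minimal Python sketch of the idea. It is not Claude Code's actual on-disk format or API: the Task record, its field names, and the ready_tasks helper are all hypothetical, and persistence under ~/.claude/tasks is left out. It only shows how a task becomes available once everything it depends on is marked done.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task record; NOT Claude Code's real schema."""
    task_id: str
    description: str
    depends_on: list[str] = field(default_factory=list)
    done: bool = False


def ready_tasks(tasks: dict[str, Task]) -> list[str]:
    """Return IDs of unfinished tasks whose dependencies are all done."""
    return sorted(
        t.task_id
        for t in tasks.values()
        if not t.done and all(tasks[d].done for d in t.depends_on)
    )


# Illustrative three-step project: schema -> api -> tests.
tasks = {
    "schema": Task("schema", "Design database schema"),
    "api": Task("api", "Implement API endpoints", depends_on=["schema"]),
    "tests": Task("tests", "Write integration tests", depends_on=["api"]),
}

first = ready_tasks(tasks)    # only "schema" has no open dependencies
tasks["schema"].done = True   # one session finishes its work
second = ready_tasks(tasks)   # "api" is now unblocked
print(first, second)
```

In a shared setup, each session would presumably re-read the stored tasks before picking work, so a completed dependency in one session unblocks tasks for the others.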
Sources: Thariq Shihipar (Anthropic), VentureBeat
eBay Bans AI Shopping Agents—Signals the Start of the Human-Verification Era
eBay has updated its user agreement to explicitly prohibit AI-powered “buy for me” agents, LLM-driven bots, and any automated system that places orders without human review. The policy, effective February 20, 2026, makes eBay one of the first major platforms to draw a legal line against agentic commerce operating without human oversight.
Note: For public procurement and institutional purchasing, this is a preview of regulatory friction to come. If AI agents can negotiate deals and execute transactions autonomously, platforms, vendors, and public institutions will all need new verification and audit mechanisms. eBay’s move suggests the “agent economy” will face real resistance—and institutions should plan accordingly.
Sources: The Register, EcommerceBytes
AI & WORKFORCE
Nature Study: Scientists Using AI Publish 3x More Papers—But Science May Be Narrowing
A study in Nature analyzing 41 million research papers found that scientists who use AI tools publish 3.02 times more papers, receive 4.84 times more citations, and reach leadership positions 1.37 years earlier than peers who don’t. However, the same study found AI adoption shrinks the collective scope of scientific topics by 4.63% and reduces engagement between researchers by 22%.
Note: For public research institutions and universities, this is a workforce strategy signal with a warning attached. AI adoption clearly accelerates individual careers—but may concentrate research into narrower, more crowded territory. Funding bodies and policy-makers may need to actively incentivize exploration of underserved fields to counterbalance the efficiency gains.
UK Graduate Job Postings Collapse From 180,000 to 55,000 in Four Years
James Reed, CEO of recruitment giant Reed, told media that graduate job postings on his platform have dropped from over 180,000 in 2021 to 55,000 today. He described the situation as a “white-collar recession” and encouraged young people to consider manual trades. A Reed survey found 22% of employers have cut recruitment due to higher National Insurance costs, while 15% cited AI as a factor.
Note: For education ministries, workforce development agencies, and local administrations, this is a structural shift—not a cyclical dip. Entry-level knowledge work is contracting while skilled trades remain in demand. Institutions investing in training programs should weigh the evidence: vocational pathways may offer better employment outcomes than traditional graduate pipelines for the next decade.
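For scale, the drop Reed describes can be expressed as a percentage. The short Python sketch below uses only the two figures quoted above. Because the 2021 figure is given as "over 180,000", the result is a lower bound on the size of the decline, and the function name is illustrative.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


postings_2021 = 180_000  # quoted as "over 180,000", so a lower bound
postings_now = 55_000

decline = percent_change(postings_2021, postings_now)
print(f"{decline:.1f}%")  # -69.4%
```

In other words, roughly seven in ten graduate postings on the platform have disappeared over the period.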
Sources: LBC, Personnel Today
EU ENERGY TRANSITION
Wind and Solar Overtake Fossil Fuels in EU Power Mix for the First Time
Wind and solar generated 30% of EU electricity in 2025, overtaking fossil fuels (29%) for the first time on record, according to energy think tank Ember. Solar alone grew by more than 20% for the fourth consecutive year, reaching 13% of total generation. Coal fell to a historic low of 9.2%, while 14 of 27 EU member states now generate more power from wind and solar than from all fossil fuels combined.
Note: This is a structural milestone, not a temporary fluctuation. For public institutions across the EU, it validates the direction of EU energy policy and reinforces the relevance of the “twin transition” framing that pairs the green transition with Digital Decade goals. Institutions planning infrastructure investments should note: the grid is now majority low-carbon (71% from renewables and nuclear combined), and the next bottleneck is transmission capacity, not generation.
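The percentages above fit together with simple arithmetic, sketched here in Python. The sketch assumes the quoted shares are rounded and sum to 100%. The "other low-carbon" bucket (hydro, nuclear, bioenergy and similar) is inferred as a remainder rather than reported directly, and the variable names are illustrative.

```python
# Shares of EU electricity generation in 2025, in percent, as quoted above.
wind_and_solar = 30.0
solar = 13.0
fossil = 29.0

# Derived figures (assumption: shares are rounded and sum to 100).
wind = wind_and_solar - solar                # implied wind share
non_fossil = 100.0 - fossil                  # all low-carbon sources
other_low_carbon = non_fossil - wind_and_solar

print(f"wind: {wind:.0f}%")                          # wind: 17%
print(f"non-fossil total: {non_fossil:.0f}%")        # non-fossil total: 71%
print(f"other low-carbon: {other_low_carbon:.0f}%")  # other low-carbon: 41%
```

So wind and solar edge out fossil fuels by about one percentage point, while other low-carbon sources still supply the largest single block of generation.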
GLOBAL ENERGY
China Hits 10.4 Trillion kWh—More Than Double US Consumption
China’s total electricity consumption reached 10.4 trillion kWh in 2025, the first time any country has crossed the 10 trillion kWh threshold. The figure is more than double US consumption and exceeds the combined annual usage of the EU, Russia, India, and Japan. A 17% jump in data center power demand and a 48.8% surge in EV charging infrastructure drove much of the growth.
Note: For European institutions tracking global energy dynamics, China’s scale is now in a category of its own. The data center load increase (17%) is a direct signal that AI infrastructure buildout is a measurable driver of national energy demand. Institutions planning digitalization should factor in not just their own energy costs, but the systemic pressure AI deployments are placing on global power grids.
Japan Restarts First TEPCO Reactor Since Fukushima—Then Suspends Again
Tokyo Electric Power Company (TEPCO) restarted a reactor at the Kashiwazaki-Kariwa nuclear plant on January 21—the first TEPCO-operated unit to go online since the 2011 Fukushima disaster. Hours later, the restart was suspended due to a control rod malfunction. TEPCO says there was no safety issue, but the cause is under investigation.
Note: For institutions monitoring energy security and clean power strategies, this is a reminder that nuclear restarts are politically and operationally complex—even when policy support is strong. Japan is pushing for nuclear expansion to reduce fossil fuel dependence and meet AI-driven demand. The immediate glitch underscores that execution risk remains high.
Sources: NBC News, Al Jazeera
ENVIRONMENTAL TECHNOLOGY
New Filtration Technology Removes PFAS “Forever Chemicals” 100x Faster
Rice University researchers have developed an eco-friendly filtration material that captures PFAS chemicals over 1,000 times more effectively than existing methods and removes them 100 times faster than commercial carbon filters. The copper-aluminum layered material also destroys captured PFAS through thermal decomposition and can be regenerated for reuse across at least six cycles.
Note: For water utilities, environmental agencies, and municipalities dealing with PFAS contamination, this is a potential step-change in remediation economics. Current cleanup methods are slow, expensive, and generate secondary waste. If this technology scales, it could dramatically reduce the cost and timeline of addressing contaminated water supplies—a pressing issue across Europe and beyond.
Sources: The Guardian, Rice University, Science Daily