Forecasting AI Capabilities: A Look at 2025 and Beyond

At The Vanguard, we're passionate about the intersection of deep tech and its potential impact on society. We believe that anticipating future developments is critical for navigating the complex landscape of emerging technologies, particularly in the rapidly evolving field of Artificial Intelligence. Recently, we had the privilege of participating in two engaging initiatives that offered a glimpse into what the future of AI might hold: the insightful 2025 Forecasting Survey on AI Digest's website, and a collaborative event with AI Safety Poland. This post explores our contributions to both and highlights the importance of preparedness and collaboration as we approach the era of advanced AI.

Gauging the AI Landscape: Exploring the AI 2025 Forecasting Survey

The team at AI Digest has created a genuinely valuable tool with their 2025 Forecasting Survey, which takes a fascinating, crowdsourced approach to predicting key AI milestones. It's an important initiative: pre-registering our expectations for AI in 2025 will allow us to collectively make better sense of the progress that actually materializes. We had a lot of fun participating, and we highly recommend checking out their website, which hosts a number of other great tools and demos worth exploring. By contributing to this survey, we're not just guessing at the future; we're actively engaging in a collective exercise to understand and potentially shape the trajectory of AI development.
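
As a concrete illustration of what crowdsourced forecasting involves, here is a minimal sketch of one standard way to pool many participants' probability estimates into a single crowd forecast: the geometric mean of odds. To be clear, the aggregation rule is our illustrative choice and the sample numbers are hypothetical; we don't know how AI Digest actually aggregates responses.

    import numpy as np

    def aggregate_forecasts(probs):
        # Pool individual probability forecasts via the geometric
        # mean of odds. This rule is our illustrative choice, not
        # AI Digest's documented method.
        probs = np.asarray(probs, dtype=float)
        odds = probs / (1.0 - probs)
        pooled_odds = np.exp(np.log(odds).mean())
        return pooled_odds / (1.0 + pooled_odds)

    # Five hypothetical participants forecasting a single milestone:
    crowd = [0.35, 0.50, 0.20, 0.60, 0.45]
    print(f"Pooled probability: {aggregate_forecasts(crowd):.2f}")  # ~0.41

Compared with a simple average of probabilities, pooling odds gives confident outlier forecasts more influence, which is often desirable when a few participants hold strong information.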

Our predictions, detailed below, reflect a cautious optimism tempered by a realistic understanding of the challenges that lie ahead.

Our Predictions

  • High-Level Machine Intelligence (HLMI): We estimate a 50% probability of achieving HLMI, capable of outperforming humans on all cognitive tasks, by 2028. This prediction is based on the expectation that we will see at least one more significant leap in model size and capabilities, as well as the continued exploration of the "capabilities overhang" – the untapped potential within existing models. That said, given diminishing returns from scaling, this estimate may need to be pushed further into the future.
  • Existential Risk: We assign a 20% probability to the risk of HLMI having an extremely negative long-term impact on humanity. While this may seem high, it acknowledges the novel and unpredictable nature of this technology, drawing parallels to other existential risks humanity has faced.
  • Benchmark Performance:
    • RE-Bench: We predict a median score of 0.81 (80% confidence between 0.7 and 0.92), suggesting significant progress in automating AI research.
    • SWE-Bench Verified: We forecast a median performance of 78% (80% confidence between 73% and 91%), indicating that AI will be highly capable in software engineering tasks.
    • Cybench: We estimate a median performance of 48% (80% confidence between 40% and 57%), highlighting AI's growing proficiency in cybersecurity.
    • OSWorld: We predict a high median performance of 78% (80% confidence between 70% and 92%), suggesting that operating computers directly from screen images is low-hanging fruit.
    • FrontierMath: We forecast a median performance of 27% (80% confidence between 10% and 42%), acknowledging the difficulty of advanced mathematical problem-solving for current AI models.
  • OpenAI Preparedness Scores: We anticipate high probabilities (80-90%) of AI systems achieving scores of Medium or higher in Cybersecurity and Model Autonomy, and High or higher in CBRN and Persuasion, by the end of 2025. This reflects the rapid advancement of AI in areas with significant safety and security implications. The most alarming prediction is the possibility of an AI system reaching a pre-mitigation score of High or higher on CBRN. It rests on the assumption that model improvements are "universal": if a model improves on cyber offense capabilities, it will likely improve on CBRN as well.
  • Revenue: We project a median combined annualized revenue of $7.7 billion (80% confidence between $5.6 billion and $10 billion) for OpenAI, Anthropic, and xAI by the end of 2025, reflecting the growing commercialization of AI (see the sketch after this list for one way to turn such an interval forecast into a full distribution).
  • Public Attention: We foresee a modest 0.35% of Americans identifying computers/technology advancement as the US's most important problem (80% confidence between 0.1% and 0.6%). We don't expect this figure to change much, as there is currently little incentive to engage the public on topics around AI safety.
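
Several of the forecasts above are stated as a median plus an 80% confidence interval. Below is a minimal sketch of how such an interval forecast can be expanded into a full probability distribution, using the revenue prediction as the example; the lognormal shape and the exact fitting procedure are our illustrative assumptions, not part of the survey.

    import numpy as np
    from scipy import stats

    # Revenue forecast: median $7.7B, 80% CI [$5.6B, $10B],
    # i.e. the 10th and 90th percentiles.
    median, lo, hi = 7.7, 5.6, 10.0

    mu = np.log(median)          # for a lognormal, median = exp(mu)
    z = stats.norm.ppf(0.9)      # ~1.2816, z-score of the 90th percentile
    # The interval is slightly asymmetric in log space, so we
    # average the two one-sided estimates of sigma.
    sigma = ((np.log(hi) - mu) + (mu - np.log(lo))) / (2 * z)

    dist = stats.lognorm(s=sigma, scale=np.exp(mu))
    print(f"P(revenue > $12B): {dist.sf(12.0):.2f}")
    print(f"P(revenue < $5B):  {dist.cdf(5.0):.2f}")

Writing forecasts down as explicit distributions like this makes them straightforward to score later (for example, with a log score) once the actual end-of-2025 numbers are known.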

If you're interested in how we arrived at these forecasts, feel free to reach out and ask about our exact reasoning.

AI Safety Poland x The Vanguard: A Collaborative Effort

Building on the theme of preparedness, we recently had the honor of co-hosting an event with AI Safety Poland, a dynamic community dedicated to ensuring the responsible development of AI. The event, held on December 17th, 2024, at Techie's Space in Krakow, brought together around 20 passionate individuals, including engineers, scientists, students, and policymakers. The meetup served as a valuable platform for discussing the paper "Representation Engineering: A Top-Down Approach to AI Transparency" and for networking within the AI safety community.

Representing The Vanguard, we led a discussion on AI capabilities forecasting, where we shared our predictions from the AI Digest survey and discussed OpenAI's Preparedness Framework for assessing critical AI risks. The framework's focus on Cybersecurity, CBRN, Persuasion, and Model Autonomy sparked lively discussion among the attendees.

Key Takeaways and Reflections

A central theme that emerged from both the survey and the AI Safety Poland event is the urgent need for proactive measures to mitigate the potential risks associated with advanced AI. We are likely to see model capabilities in dangerous domains such as CBRN, Cybersecurity, and Model Autonomy improve substantially. We need good engineers and researchers to evaluate these models and build safe, robust systems that are not easily misused. This requires not only technical expertise but also a deep understanding of the ethical, societal, and philosophical implications of AI.

The success of the AI Safety Poland event, measured by the numerous valuable connections made (an average of 3.67 new connections per attendee), underscores the importance of community building and collaboration in the field of AI safety. By fostering dialogue and knowledge sharing, we can collectively work towards ensuring that AI remains a force for good in the world.

Moving Forward

The AI landscape is evolving at an unprecedented pace. As we move towards 2025 and beyond, The Vanguard remains committed to fostering informed discussions, supporting research, and promoting collaboration within the deep tech community. We believe that by working together, we can navigate the challenges and harness the immense potential of AI to create a brighter future for all. We once again encourage everyone to participate in the AI 2025 Forecasting Survey on the AI Digest website. It's a fun and insightful way to contribute to this important conversation and help us collectively anticipate and shape the future of AI.

Stay tuned for more updates from The Vanguard as we continue to explore the frontiers of deep tech and its impact on our world!