Hacker News | vmayoral's comments

Are CTFs becoming outdated as human benchmarks? If autonomous agents now dominate competitions designed to identify top security talent at negligible cost, what are CTFs actually measuring?

In 2025, the open-source CAI systematically won top-tier events, outperforming seasoned security teams worldwide.


Exploring the question: Which role is more effective in cybersecurity—attack or defense?


It’s humans who:

- Design the system and prompts

- Build and integrate the attack tools

- Guide the decision logic and analysis

This isn’t just semantics — overstating AI capabilities can confuse the public and mislead buyers, especially in high-stakes security contexts.

I say this as someone actively working in this space. I participated in the development of PentestGPT, which helped kickstart this wave of research and investment, and more recently, I’ve been working on Cybersecurity AI (CAI) — the leading open-source project for building autonomous agents for security:

- CAI GitHub: https://github.com/aliasrobotics/cai

- Tech report: https://arxiv.org/pdf/2504.06017

I’m all for pushing boundaries, but let’s keep the messaging grounded in reality. The future of AI in security is exciting — and we’re just getting started.


> It's humans

Who would it be, gremlins? Those humans weren't at the top of the leaderboard before they had the AI, so clearly it helps.


Actually, those humans (XBOW's) were already top rankers. Just look it up.

What's being criticized here is the hype, which can be misleading and confusing. On this topic, I wrote a short essay, “Cybersecurity AI: The Dangerous Gap Between Automation and Autonomy,” to sort fact from fiction -> https://shorturl.at/1ytz7


It's tremendous news indeed but I personally believe it's going to bring exciting changes and benefits: https://news.accelerationrobotics.com/is-open-robotics-acqui...


Well said. Robotics is indeed the art of system integration, and that remains (and will remain) one of the biggest tech hurdles.

We wrote about this, raised funding, and worked on it for years. Hell, we even created an extension of ROS for it called the "Hardware Robot Operating System" (H-ROS), which aimed to address many of the issues you describe using ROS as the common language and programmable logic (FPGAs) to deal with physical interfaces. More on this at https://ieeexplore.ieee.org/document/8046383.

Problem was that the market didn't really accept it. I still believe in the tech problem landscape here; I'm just not so sure anymore whether there's a real market/business for such a solution. After all, vendor lock-in is great business.


I agree that one of the major aspects to improve in robotics is software. There's still a lot to do there and many are working towards it, with most leading initiatives revolving around ROS 2. I also believe we need faster robotics simulation, and hardware acceleration will be valuable there, not only with GPUs but also with other accelerators (e.g., FPGAs outperform GPUs in many benchmarks, both in performance and in energy consumption).

I strongly disagree with the rest, and your comments indicate a clear lack of (hardware) understanding. I often encounter this in the AI world, where folks building ANNs "forget" that this computational abstraction only grew in popularity (earlier accelerators existed, though) once CNNs were implemented as accelerators on GPUs, enabling newer results. The same is likely to happen in robotics.

Robots are real-time systems: meeting time deadlines in their computations is the most important feature. Robot behaviors often take the form of computational graphs, with data flowing between compute abstractions (Nodes) across physical networks (communication buses), mapping to the underlying sensors, compute technologies, and actuators. ROS enables you to build these computational graphs and create robot behaviors by providing libraries, a communication infrastructure, drivers, and tools to put it all together.

From a more technical compute-architecture perspective, ROS 2 presents an event-driven programming interface, so the computational graphs built with it are "event-driven software architectures". Mapping these event-driven architectures to CPUs with classic control-flow (von Neumann) architectures leads to various issues; a key challenge is that CPUs hardly provide real-time and safety guarantees. The de facto industry strategy for meeting timing deadlines is laborious, empirical, case-by-case tuning of the system. Some are recognizing that this "whack-a-mole" approach is unsustainable and hard to scale, due to the lack of a hardware-supported, timing-safe event-driven programming interface. This is where accelerators come in. As already adopted in other industries, including aerospace, automotive, and healthcare, a hardware/software co-design strategy built on custom compute building blocks (accelerators) provides clean behavioral specifications with clearly stated timing guarantees (in other words, it avoids memory-centric von Neumann architectures).
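To make the "event-driven graph with deadlines" framing concrete, here is a toy, framework-free Python sketch (not rclpy or rclcpp; the Node abstraction and deadline figure are illustrative assumptions). It shows the point being made: on a general-purpose CPU, a callback's deadline can only be observed and counted after the fact, not enforced by the hardware.

```python
import time


class Node:
    """Toy stand-in for a ROS 2-style node: fires a callback when a
    message arrives and empirically checks whether a deadline was met."""

    def __init__(self, name, deadline_s):
        self.name = name
        self.deadline_s = deadline_s
        self.missed = 0

    def on_message(self, msg, work):
        start = time.monotonic()
        work(msg)  # the actual computation
        elapsed = time.monotonic() - start
        if elapsed > self.deadline_s:
            # On a CPU there is no hardware mechanism preventing this;
            # we can only count misses after they happen.
            self.missed += 1
        return elapsed


# A two-node graph: sensor -> filter. Tuning the workload until misses
# disappear is exactly the empirical "whack-a-mole" described above.
filter_node = Node("filter", deadline_s=0.010)
for sample in range(5):
    filter_node.on_message(sample, work=lambda m: sum(i * i for i in range(10_000)))

print(filter_node.missed)  # count of empirically observed deadline misses
```

A hardware/software co-design approach would instead make the timing bound part of the accelerator's behavioral specification, rather than a runtime observation.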

Hardware is important. Note the current scarcity of semiconductors, which is one of the drivers of this research. Note also that creating custom accelerators for robotics applications can lead not only to performance improvements (i.e., software that runs faster!) but also to more deterministic responses (which affect the downstream software pipeline!). Rephrasing Alan Kay's quote: if you're serious about robotics, you should care about hardware ;).


Yeah, started my own robotics company, early engineer at another that sold for >$200M. You seem to have misinterpreted my statement of "hardware is already good enough" to mean "hardware is unimportant."


That's exactly what this work does while remaining ROS 2-centric (with C++, CMake extensions, and other friends). The goal of this research was to provide the path of least friction for roboticists.

There's a huge gap between robotics development (more software-centric today) and the average hardware engineer at big semiconductor companies. Totally different perspectives and viewpoints; we felt bridging that gap was necessary.


Move to ROS 2 :)!

Regarding who's going to create the accelerators: enablers like this allow any ROSser to jump into creating their own accelerator from C++, purely by adding some metadata to their CMakeLists.txt files (which every ROS package already has).

On top of that, organizations are creating professional-grade kernels meant for production. NVIDIA and AMD are investing in that direction and so are we at Acceleration Robotics. Here are a few of the first ones announced:

- Accelerated perception stack (ROS 2 API-compatible): https://accelerationrobotics.com/robotcore-perception.php

- Accelerated coordinate-system transformation (ROS 2 API-compatible, tf2): https://accelerationrobotics.com/robotcore-transform.php


Use ROS 2. The misconceptions about ROS have been mostly addressed with ROS 2.

Check out the adoption of ROS 2 across various industries, including automotive, healthcare, and warehouse automation. ROS 2 is (and will remain) the de facto standard for robot application development.


The issues with ROS are mostly cultural, and heavily ingrained in the community. ROS 2 isn't going to fix that, and has been shown to repeat the same mistakes as ROS.

The atrocious build system has stayed the same. People continue to add even more brittle automagic to the pile (e.g., automagic compilation of C++ to FPGAs).

The semi-ad-hoc messaging semantics that grew around services have stayed the same. Even if DDS were an acceptable foundation, adding a ROS message translation layer and ROS semantics on top of a perfectly valid schema language brings the same pitfalls as object-relational mappers: you now have to understand three systems, the source, the compiler, and the target. And given that ROS 1 had a 'wontfix' issue on message schema checksums being computed wrongly (ignoring the cardinality of fields), which could result in your robot literally ripping your head off when it received messages from two different library versions, I don't trust anybody from the community to get it right this time.
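To illustrate the class of bug being described (this is a hedged toy, not the actual ROS 1 MD5 checksum algorithm; the schema strings and normalization step are invented for the example), here is a sketch of a schema checksum that ignores field cardinality, so an array field and a scalar field produce the same digest even though their wire formats are incompatible:

```python
import hashlib


def naive_schema_checksum(schema: str) -> str:
    """Buggy checksum: strips array markers before hashing, so
    'float32[] torque' and 'float32 torque' collide -- two incompatible
    message definitions look identical to the receiver."""
    normalized = "\n".join(
        line.replace("[]", "") for line in schema.strip().splitlines()
    )
    return hashlib.md5(normalized.encode()).hexdigest()


scalar_schema = "float32 torque"    # one value
array_schema = "float32[] torque"   # variable-length array

# The checksums match, so a compatibility check based on them passes
# even though the two sides would deserialize the bytes differently.
assert naive_schema_checksum(scalar_schema) == naive_schema_checksum(array_schema)
```

A correct scheme would hash the canonical schema text including cardinality, so any change to field multiplicity changes the digest.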

The whole DDS vendor based ecosystem is antithetical to open source. The DDS standard is way too big, and a bunch of vendors meeting every couple years to demonstrate that their systems can send messages about shapes between one another simply doesn't cut it to prove correctness.

ROS(2) is an attempt to make C++ easier by layering automagic and complexity on top. Most robot shops I know waste most of their time fighting that magic and complexity; the other ones don't use ROS(2).

ROS should have been:

* A bunch of libraries (not frameworks) to solve robotics, vision, and knowledge representation related tasks.

* A bunch of best practices to work with common middlewares, none of which are developed as part of ROS.

* A repository of competing data schemata for the above libraries and middlewares to facilitate interoperability but allow for experimentation.

* Dataviz libraries for common notebooks, e.g. jupyter, observablehq, that work with the common data schemata.

Instead we got one of the hairiest C++ projects and a bunch of shell scripts to open your default text editor more conveniently. Yay.


Well, yes but not only. There's only so much you can do with Pis and other SBCs that offer a CPU-centric compute solution. Robots are real-time systems and CPUs don't excel at that.

When optimizing dataflows for lower latencies and higher throughputs, you typically seek specialized compute architectures, and that's where GPUs and FPGAs come in. That's what this work really enables.


I see, thanks for the explanation!

