Try it out!

Eldritch v0.3 REPL (WASM)

Example commands:

dir(sys)

pprint(process.list())

!whoami

if sys.get_os()['arch'] == 'x86_64':
    print("not arm")

for i in sorted([3,1,2]):
    print(i)

The Evolution of Eldritch: Why We Built a DSL for Realm

If you’ve ever managed a complex red team engagement, you know maintaining TTPs and best practices requires discipline and lots of communication. Different operators run commands in different ways, tools behave differently across operating systems, and “it worked on my machine” becomes the team motto.

When we built Realm, we wanted the automation that Ansible offers combined with the stealth of a C2 framework. With this in mind, we built a dedicated Domain Specific Language (DSL) called Eldritch. And now, with the release of Eldritch v2, we are doubling down on that decision.

Here is why we put a language inside our C2, and why v2 changes the game.

Red Team TTPs as code

Across the tech industry, we are watching the adoption of “as code” paradigms revolutionize workflows. We see Infrastructure as Code (IaC), Detection as Code, Response as Code (SOAR), and even Policy as Code.

However, offensive operations have largely lagged behind. To keep pace with modern defenses, Red Teams need to leverage the same advantages that “as code” provides to developers and sysadmins: repeatability, version tracking, and self-documentation.

Repeatability

In a high-stakes engagement, human error is the enemy. Eldritch facilitates deterministic actions, ensuring that critical steps, like persistence installation or timestomping, are executed identically every time. It removes the risk of an operator forgetting a flag or skipping a hygiene step.

This repeatability also unlocks better collaboration.

DevOps for Offense: Just as developers use CI/CD and unit testing for their code, offensive tool developers can now apply unit testing to their tradecraft.

Purple Teaming: Products like Atomic Red Team and MSV allow detection engineers to define tests through code. Eldritch brings this capability natively to the C2, allowing operators to easily replay actions on targets to help Blue Teams tune their defenses.
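As a rough illustration of what unit-testing tradecraft can look like, here is a minimal sketch in plain Python rather than Eldritch. The plan_recon helper, its command lists, and the "platform" key are hypothetical; only the "arch" key is borrowed from the sys.get_os() call in the REPL examples above. The idea is that a playbook step written as a pure function of host facts can be checked for repeatability before it ever touches a target.

```python
# Hypothetical playbook step: choose commands from host facts.
# The dict loosely mirrors what sys.get_os() returns in the REPL examples;
# only the 'arch' key appears there, 'platform' is an assumption.
def plan_recon(os_info):
    """Return the ordered commands a playbook would run on this host."""
    if os_info.get("platform") == "windows":
        commands = ["whoami /all", "hostname"]
    else:
        commands = ["whoami", "hostname", "uname -a"]
    # Carry the architecture along so later steps can branch on it.
    return {"arch": os_info.get("arch", "unknown"), "commands": commands}


def test_same_input_gives_same_plan():
    # Repeatability: identical facts must always yield an identical plan.
    facts = {"platform": "linux", "arch": "x86_64"}
    assert plan_recon(facts) == plan_recon(facts)


def test_windows_plan_differs_from_linux():
    # Cross-platform behavior is explicit and testable, not operator habit.
    win = plan_recon({"platform": "windows", "arch": "x86_64"})
    lin = plan_recon({"platform": "linux", "arch": "x86_64"})
    assert win["commands"] != lin["commands"]
```

The test functions follow the pytest naming convention, so they could run in CI alongside the rest of a team's tooling.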

While tools like Cobalt Strike have introduced sleep interpreters and Caldera is pushing toward campaign automation, Eldritch v2 treats these repeatable actions as a core feature of the agent, not a server-side API.

Version Tracking

When your capabilities are defined in code, you gain the immense power of Git. You are no longer just storing scripts; you are visualizing how your tradecraft evolves over time.

I experienced this firsthand using Ansible at CCDC to perform red team operations via an “as code” process. Over the years, the code provided a historical record of the cat-and-mouse game with defenders. For example, we could literally see bind shells falling out of our playbooks as teams improved their defenses and checking those vectors became standard practice.

Capabilities over time

Self-Documenting

Documentation is often the most dreaded part of an engagement, and static runbooks are famous for becoming obsolete the moment they are written. Eldritch solves this through self-documentation.

Because the code is the context, a new operator can look at an Eldritch script and understand exactly what the team is doing and why, without needing a handover meeting.

More importantly, this changes the value proposition for the client. Instead of a static PDF report, teams can leave behind the actual playbooks used during the engagement. This allows Detection and Incident Response teams to follow exactly what happened and, crucially, retest those specific actions to verify their fixes.

Why a v2?

Eldritch v1 was based on the starlark-rust library. While starlark-rust is excellent for build systems and helped us rapidly prototype, it enforces constraints that make sense for builds but fight against the nature of offensive operations.

Eldritch v2 introduces a major overhaul to the language engine to better fit the operator’s needs.

Smaller Size

The original interpreter carried a lot of weight to support features we didn’t need. v2 is leaner. A smaller engine means a smaller agent binary, which translates to:

Less noise on the wire during staging.

A smaller footprint in memory, reducing the surface area for EDR memory scanners.

More Control Over Language Features

We moved from a generic implementation to a custom one tailored for C2. This allows us to introduce domain-specific types (like Beacon or Creds) as first-class citizens in the language, rather than hacking them together with dictionaries.
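
To make the contrast with dictionaries concrete, here is a small Python sketch. It is not Eldritch's actual Creds type: the field names (principal, secret, kind) and the describe helper are hypothetical and chosen only for illustration. It shows how a dedicated type rejects a misspelled field at construction time, while a dictionary silently accepts it.

```python
from dataclasses import dataclass


# Hypothetical record type; the field names are illustrative only and are
# not taken from Eldritch's real Creds type.
@dataclass(frozen=True)
class Creds:
    principal: str
    secret: str
    kind: str = "password"


def describe(cred):
    """Summarize a credential without exposing the secret itself."""
    return f"{cred.kind} for {cred.principal}"


typed = Creds(principal="svc_backup", secret="example-only")

# With a plain dict, a misspelled key ("secert") goes unnoticed until
# some later step looks up "secret" and fails.
loose = {"principal": "svc_backup", "secert": "example-only"}

print(describe(typed))      # password for svc_backup
print("secret" in loose)    # False

# The typed version refuses the same typo immediately.
try:
    Creds(principal="svc_backup", secert="example-only")
except TypeError as err:
    print("rejected:", err)
```
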

Removing Unnecessary Restrictions

Starlark was designed to be hermetic and deterministic, which led to two major annoyances for a C2 agent:

Frozen Objects: In Starlark, once a module is loaded, its values are “frozen” (immutable). This is great for preventing build side effects, but terrible for an implant that needs to maintain state. In v2, we removed this restriction. You can now update global variables, allowing your scripts to be stateful and dynamic (e.g., keeping a running counter of failed login attempts).

Lack of Infinite Loops: Starlark forbids recursion and unbounded loops to prevent builds from hanging. In a C2, “hanging” is often the goal: we want an agent to loop forever, polling for jobs or monitoring a log file. v2 enables infinite loops, allowing you to write long-running monitors and daemons directly in Eldritch without complex workarounds.
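
The two relaxed restrictions combine naturally in a long-running monitor that keeps state between polls. The sketch below is plain Python, not Eldritch, and every name in it (failed_logins, record_line, monitor, read_line, max_polls) is an assumption made for illustration. Python's global keyword is a Python detail; Eldritch's own syntax for updating globals may differ.

```python
import time

# Module-level state that persists across loop iterations. Under Starlark's
# freezing rules, a value like this could not be changed after load.
failed_logins = 0


def record_line(line):
    """Bump the global counter when a log line reports a failed login."""
    global failed_logins
    if "Failed password" in line:
        failed_logins += 1
    return failed_logins


def monitor(read_line, interval_s=1.0, max_polls=None):
    """Poll read_line() in an unbounded loop.

    max_polls exists only so this sketch can be exercised in a test;
    leave it as None for a loop that never returns.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        line = read_line()  # hypothetical source of new log lines
        if line is not None:
            record_line(line)
        time.sleep(interval_s)
        polls += 1
    return failed_logins


# Bounded demo run over three canned log lines.
lines = iter([
    "Accepted password for alice",
    "Failed password for root",
    "Failed password for admin",
])
total = monitor(lambda: next(lines, None), interval_s=0, max_polls=5)
print(total)  # 2
```

In a real agent the loop would have no max_polls bound and read_line would tail an actual log or poll for jobs; the bound here only keeps the example finite.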

Tags: eldritch
