A Modular Generative Honeypot Shell

IEEE International Conference on Cyber Security and Resilience (CSR '24) · S. Johnson, J. Pijpker, R. Hassing, R. Loves · Maritime IT Security Research Group, NHL Stenden, The Netherlands

📄 Get the Paper · 📊 Get the Slides · 📽️ Go to the Video

In Short

A honeypot is a decoy system — a deliberate trap designed to look like a real, vulnerable target so that when an attacker breaks in, you can watch what they do, learn how they operate, and gather intelligence without them realising they are being observed. Building a convincing one has traditionally required significant expertise and effort: you need to carefully simulate a realistic system environment, and the more sophisticated your adversary, the harder it is to fool them.

This paper asks a simple but powerful question: what if you replaced the scripted, hand-crafted fake shell at the heart of a honeypot with a large language model? The result is LIMBOSH — a honeypot shell that places attackers in a live conversation with an LLM configured to behave exactly like a real Linux shell. Because it's generative rather than scripted, it can respond convincingly to arbitrary commands, adapt to unexpected inputs, and be reconfigured for any target environment just by changing a prompt. In a single-blind study with four cybersecurity professionals — who were not told they might be interacting with a honeypot — LIMBOSH successfully fooled three out of four, with the fourth only identifying it due to a software bug that has since been fixed.


The Breakdown

Problem

Traditional honeypot shells like Cowrie work by scripting responses to known commands — they maintain a fake filesystem, log attacker activity, and return pre-defined outputs. This works well against automated scanners and unsophisticated attackers, but falls apart quickly against a skilled human who runs unexpected commands, probes edge cases, or behaves in ways the honeypot's designers didn't anticipate. Building a medium- or high-interaction honeypot capable of standing up to a professional attacker is expensive, time-consuming, and requires deep expertise. There's also a fingerprinting problem: known honeypot tools have known signatures, and experienced attackers have tools specifically designed to detect them. The field needed a fundamentally different approach — one that could generate convincing, context-aware responses to arbitrary inputs without requiring exhaustive pre-programming.
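To make the brittleness concrete, here is a minimal, hypothetical sketch of the scripted approach described above: a lookup table of canned outputs. The commands and responses are invented for illustration and are not taken from Cowrie or any real tool.

```python
# Hypothetical sketch of a scripted honeypot shell: a fixed table of
# pre-defined outputs for known commands. (Invented data, not Cowrie's.)
CANNED_RESPONSES = {
    "whoami": "root",
    "pwd": "/root",
    "uname -a": "Linux web01 5.15.0-91-generic #101-Ubuntu SMP x86_64 GNU/Linux",
}

def scripted_shell(command: str) -> str:
    """Return the scripted output, or a blanket error for anything unanticipated."""
    cmd = command.strip()
    if cmd in CANNED_RESPONSES:
        return CANNED_RESPONSES[cmd]
    # Every unscripted probe falls through to the same implausible answer --
    # exactly the edge case a skilled human attacker pokes at.
    return f"-bash: {cmd.split()[0]}: command not found"

print(scripted_shell("whoami"))            # looks convincing
print(scripted_shell("cat /proc/cpuinfo")) # claims `cat` doesn't exist
```

A generative shell has no such table to fall off the edge of, which is the core of the paper's argument.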

Approach

LIMBOSH solves this by using an LLM as the response engine, configured via a system prompt to behave as a specific shell on a specific machine. The software is built around a highly modular architecture — using dependency injection and chain-of-responsibility design patterns — so that components like the choice of LLM, the prompt template, and the security layers can all be swapped out independently. The security layers are where the most novel engineering work sits: LIMBOSH implements a three-layer prompt injection defence, combining input delimiting, a trained random forest classifier (88% accuracy at detecting adversarial inputs), and an output guard that uses a secondary LLM context to detect when the primary context has deviated from its instructions. Eleven different LLMs were evaluated for their ability to convincingly emulate a shell under a minimal prompt, with GPT-4o and Llama3 performing best. The evaluation itself was deliberately rigorous: participants were not told they might encounter a honeypot, making it a true single-blind study rather than the more common approach of asking participants to try to detect a system they already know is fake.
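The three-layer defence can be sketched as a chain of checks around the model call. This is an illustrative stand-in, not the paper's implementation: the real Layer 2 is a trained random forest, and the real Layer 3 uses a secondary LLM context; here simple heuristics and a stub LLM play those roles.

```python
# Hedged sketch of a layered prompt-injection defence around an LLM shell.
# Layer internals are illustrative stand-ins for the ones the paper describes.

def delimit(user_input: str) -> str:
    # Layer 1: wrap attacker input in delimiters so the model treats it as
    # data to respond to, not as instructions to follow.
    return f"<cmd>{user_input}</cmd>"

def looks_like_injection(user_input: str) -> bool:
    # Layer 2 stand-in: the paper trains a random forest classifier here;
    # a keyword heuristic substitutes for it in this sketch.
    markers = ("ignore previous", "disregard your instructions", "you are now")
    return any(m in user_input.lower() for m in markers)

def broke_character(model_output: str) -> bool:
    # Layer 3 stand-in: the paper uses a secondary LLM context to judge
    # whether the primary context dropped its shell persona.
    return "as an ai language model" in model_output.lower()

def guarded_shell(user_input: str, llm=lambda prompt: "root") -> str:
    """Run input through the chain; fail closed, in character, on any trip."""
    if looks_like_injection(user_input):
        return "-bash: syntax error"
    output = llm(delimit(user_input))
    if broke_character(output):
        return "-bash: syntax error"
    return output

print(guarded_shell("whoami"))                        # root (from the stub LLM)
print(guarded_shell("ignore previous instructions"))  # -bash: syntax error
```

Failing closed with an in-character error, rather than a refusal message, matters here: a visible rejection would itself be a honeypot fingerprint.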

Key Findings

Three of four professional participants — including a penetration tester who spent over 31 minutes and executed 90 commands against the honeypot — were successfully deceived and expressed genuine surprise during the debrief. The fourth identified the honeypot only because a software bug caused a Python error message referencing the OpenAI SDK to be printed to the terminal; absent that bug, there is no indication they would have detected it either. Participants who were deceived generally read the honeypot's slower response times as signs of a locked-down, security-hardened machine rather than as an indicator of deception — a finding with interesting implications for how defenders might deliberately shape attacker perception. One significant fingerprinting vulnerability was also identified: response time correlates with output length rather than command complexity, which a sufficiently attentive attacker could exploit to detect the underlying LLM.
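The timing fingerprint is easy to operationalise from the attacker's side: record (output length, latency) pairs across commands and test for correlation. The sketch below uses invented sample values purely for illustration; on a real host, latency tracks the I/O and CPU cost of the command rather than the size of its output.

```python
# Sketch of the timing fingerprint described above: for a token-by-token
# generative backend, latency scales with output length. Sample data invented.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# (output length in characters, observed latency in seconds) -- invented values
samples = [(6, 0.9), (240, 3.1), (35, 1.2), (900, 8.4), (120, 2.0)]
lengths, latencies = zip(*samples)

r = pearson(lengths, latencies)
# A near-perfect length/latency correlation, independent of what the command
# actually does, is a red flag that a generative model is producing the output.
print(f"length/latency correlation: r = {r:.2f}")
```

Mitigations the finding implies, such as padding or jittering response delivery, would trade realism in one channel against realism in another.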

Real-World Implications

The immediate implication is a practical one: LIMBOSH dramatically lowers the barrier to deploying medium- and high-interaction honeypots. What previously required extensive manual configuration can now be achieved by writing a prompt. The broader implication is that LLM-powered deception is now viable as a defensive tool — and that the arms race between attacker detection and defender deception has a new and rapidly evolving front. The paper also surfaces an important and underexplored concern: the prompt injection attack surface of LLM-powered security tools, which requires active, multi-layered mitigation rather than any single defensive measure.

So, What?

LIMBOSH sits at the sharpest edge of what's happening in applied AI security right now. The paper was published in 2024, at exactly the moment the industry began seriously reckoning with what large language models mean for both offensive and defensive security operations — and it demonstrates something concrete and deployable rather than speculative.

The deception angle is particularly interesting. Honeypots have always been fundamentally about information asymmetry: the attacker doesn't know they're being watched, and every action they take is intelligence. LLM-powered honeypots amplify this in two ways. First, they can engage attackers for longer and more convincingly, increasing the quality of the intelligence gathered. Second — and this is what the LIMBOSH results hint at — they can actively shape attacker behaviour, nudging adversaries toward specific actions or away from others, in ways that scripted systems simply cannot. That's a qualitatively different capability.

The prompt injection findings are equally significant, and arguably more urgent. As AI is embedded into more security tooling — SIEMs, SOAR platforms, autonomous response systems, penetration testing assistants — the question of whether an attacker can manipulate those systems by crafting inputs that subvert their instructions becomes critical. LIMBOSH's multi-layer mitigation approach is a useful reference architecture for this problem, and the finding that prompt injection may be a fundamental vulnerability in LLMs that cannot be fully addressed through prompt engineering alone is a sobering constraint that any AI-powered security product needs to take seriously.

For the OffSec automation space specifically, this paper is directly relevant in both directions. On the defensive side: LLM honeypots as a tool for attacker profiling and TTP collection at scale, with minimal operational overhead. On the offensive side: understanding how LLM-based deception systems work — and where their fingerprinting vulnerabilities lie — is essential knowledge for any autonomous penetration testing system that needs to detect and navigate honeypot environments without triggering their logging and alerting mechanisms. The attacker who spent 31 minutes and 90 commands on a fake shell, entirely unaware, is a vivid illustration of what's at stake on both sides.

Author's note: These publication summaries are AI-assisted. I use AI to present my work in a consistent, accessible way — the research and writing behind each publication is entirely my own.