SIGSEGV in FPM workers via PCRE2 JIT compilation limits

gdb analysis of Zend Tracing JIT faults on nested regex

The Signal 11 Observation

A recurring HTTP 502 Bad Gateway response emerged on a specific subset of URIs within a web application. The application operates on a standard Nginx and PHP-FPM 8.1 stack, utilizing the Kidearn - Kindergarten & Baby Care WordPress Theme to manage staff directory profiles. The 502 errors were not random; they occurred strictly when rendering specific staff profile pages containing heavily nested layout components.

Nginx access logs recorded the 502 responses. Nginx error logs indicated a premature socket closure.

2023/10/24 14:12:01 [error] 1234#1234: *5678 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.1.50, server: example.com, request: "GET /teachers/profile-sarah/ HTTP/2.0", upstream: "fastcgi://unix:/var/run/php/php8.1-fpm.sock:"

The PHP-FPM master process log (/var/log/php8.1-fpm.log) contained the corresponding worker failure.

[24-Oct-2023 14:12:01] WARNING: [pool www] child 4512 exited on signal 11 (SIGSEGV) after 84.123456 seconds from start
[24-Oct-2023 14:12:01] NOTICE: [pool www] child 4589 started
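
Before going further it helps to know how many workers are dying this way. The following is a minimal sketch, assuming the FPM log format shown above; the function name segv_children is my own and not part of any tool.

```shell
# Print the PID of every FPM child that exited on signal 11.
# Reads FPM master log text on stdin.
segv_children() {
    awk '/exited on signal 11/ {
        for (i = 1; i < NF; i++)
            if ($i == "child") print $(i + 1)
    }'
}

# Example: segv_children < /var/log/php8.1-fpm.log | wc -l
```

Counting the output lines over a known time window gives a rough crash rate to compare against after any configuration change.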

Signal 11 (SIGSEGV) denotes a segmentation fault. A user-space process attempted to access a memory address to which it did not have access rights, or it attempted to execute an instruction in a non-executable memory segment. The kernel intervened and terminated the process. Because the FPM worker was abruptly killed by the kernel, it could not send the FastCGI END_REQUEST record to Nginx, resulting in the 104 Connection Reset and the subsequent 502.

Coredump Subsystem Configuration

To diagnose a segmentation fault in a compiled C binary like php-fpm, a core dump is required. By default, most production Linux distributions disable core dumps to conserve disk space and prevent sensitive memory contents from being written to the filesystem.

Enabling core generation requires modifications at three layers: the kernel, the service manager, and the application daemon.

First, the kernel core_pattern was verified via sysctl.

sysctl kernel.core_pattern

The output kernel.core_pattern = |/lib/systemd/systemd-coredump %P %u %g %s %t %c %h indicated that systemd was configured to intercept core dumps.

Second, the systemd unit for PHP-FPM limits core file sizes. I created an override file for the service with systemctl edit and set LimitCORE in its [Service] section.

systemctl edit php8.1-fpm.service
[Service]
LimitCORE=infinity

Finally, the PHP-FPM pool configuration itself imposes resource limits using the setrlimit system call when spawning workers. I modified the pool configuration.

; /etc/php/8.1/fpm/pool.d/www.conf
rlimit_core = unlimited

After reloading the daemon (systemctl daemon-reload && systemctl restart php8.1-fpm), I issued a curl request to the specific /teachers/profile-sarah/ URI to intentionally trigger the fault. The FPM log registered the Signal 11, and systemd-coredump successfully captured the memory state.

coredumpctl list
TIME                            PID   UID   GID SIG COREFILE  EXE
Tue 2023-10-24 14:25:10 UTC    4601    33    33  11 present   /usr/sbin/php-fpm8.1
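
If no core file shows up in that list, the limit actually applied to the running workers is the first thing to check. A small sketch, assuming the standard Linux /proc/<pid>/limits layout and that the workers run under the process name php-fpm8.1; both function names are hypothetical.

```shell
# Print the soft core-size limit from /proc/<pid>/limits-formatted text on stdin.
core_soft_limit() {
    awk '/^Max core file size/ { print $5 }'
}

# Report the soft limit for every running FPM process.
# The process name php-fpm8.1 is an assumption taken from the binary path above.
check_fpm_core_limits() {
    for fpm_pid in $(pgrep -x php-fpm8.1); do
        printf '%s: %s\n' "$fpm_pid" "$(core_soft_limit < "/proc/$fpm_pid/limits")"
    done
}
```

Every worker should report unlimited. A value of 0 means one of the three layers is still clamping the limit.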

Binary Inspection Setup

Extracting actionable data from a core dump requires the exact binary that crashed, its shared libraries, and the corresponding debugging symbols. I extracted the core file to a temporary directory.

coredumpctl dump 4601 -o /tmp/php-fpm.core

I installed the debug symbol packages for PHP and the Perl Compatible Regular Expressions library.

apt-get install php8.1-dbg libpcre2-dbg

I initiated the GNU Debugger (gdb), pointing it to the FPM executable and the generated core file.

gdb /usr/sbin/php-fpm8.1 /tmp/php-fpm.core

The Execution Backtrace

Inside the GDB environment, I executed bt (backtrace) to examine the call stack of the faulting thread.

(gdb) bt
#0  0x00007f8b2a3b4c1a in pcre2_jit_match_8 () from /lib/x86_64-linux-gnu/libpcre2-8.so.0
#1  0x00007f8b2a381f2b in pcre2_match_8 () from /lib/x86_64-linux-gnu/libpcre2-8.so.0
#2  0x000055c123456789 in php_pcre_replace_impl () at ext/pcre/php_pcre.c:1234
#3  0x000055c123456890 in php_pcre_replace_callback () at ext/pcre/php_pcre.c:1450
#4  0x000055c1234569a1 in zif_preg_replace_callback () at ext/pcre/php_pcre.c:1560
#5  0x00007f8a10001234 in ?? ()
#6  0x000055c123678123 in zend_jit_trace_execute () at ext/opcache/jit/zend_jit_vm_helpers.c:450
#7  0x000055c123789234 in zend_execute () at Zend/zend_vm_execute.h:60000
#8  0x000055c12389a345 in zend_execute_scripts () at Zend/zend.c:1800
#9  0x000055c1239ab456 in php_execute_script () at main/main.c:2500
#10 0x000055c123abc567 in main () at sapi/fpm/fpm/fpm_main.c:1900

The stack trace lays out the execution sequence. The process started in the FPM main loop (Frame 10), executed a PHP script (Frames 9 and 8), and entered the Zend VM execution loop (Frame 7).

Frame 6 is critical. It shows zend_jit_trace_execute. The execution bypassed the standard Zend VM opcode interpreter (execute_ex) and jumped into machine code generated by the OPcache Tracing JIT compiler.

From the JIT-compiled code (Frame 5, an anonymous executable memory region marked ??), the execution jumped into a standard PHP internal function (zif_preg_replace_callback, Frame 4), which interfaces with the PCRE extension. The PCRE extension called pcre2_match_8 (Frame 1), which then called pcre2_jit_match_8 (Frame 0), where the segmentation fault occurred.

The Memory Struct Analysis

To understand why pcre2_jit_match_8 crashed, I needed to inspect the state of the CPU registers at the exact moment of the fault.

(gdb) info registers
rax            0x0                 0
rbx            0x7f8b1a7fc700      140236835243776
rcx            0x0                 0
rdx            0x1000              4096
rsi            0x55c125678000      94287346171904
rdi            0x0                 0
rbp            0x7ffc12345670      0x7ffc12345670
rsp            0x7ffc12345600      0x7ffc12345600
r8             0x0                 0
r9             0x0                 0
r10            0x55c1256780a0      94287346172064
r11            0x206               518
r12            0x7f8b1a7fc750      140236835243856
r13            0x0                 0
r14            0x55c125678050      94287346171984
r15            0x0                 0
rip            0x7f8b2a3b4c1a      0x7f8b2a3b4c1a <pcre2_jit_match_8+138>
eflags         0x10206             [ PF IF RF ]

Next, I disassembled the machine code instructions surrounding the instruction pointer ($rip).

(gdb) x/10i $rip - 10
   0x7f8b2a3b4c10 <pcre2_jit_match_8+128>:      mov    rcx,QWORD PTR [rbx+0x18]
   0x7f8b2a3b4c14 <pcre2_jit_match_8+132>:      test   rcx,rcx
   0x7f8b2a3b4c17 <pcre2_jit_match_8+135>:      je     0x7f8b2a3b4c50 <pcre2_jit_match_8+192>
=> 0x7f8b2a3b4c1a <pcre2_jit_match_8+138>:      mov    eax,DWORD PTR [rdi+0x10]
   0x7f8b2a3b4c1d <pcre2_jit_match_8+141>:      cmp    eax,0x4
   0x7f8b2a3b4c20 <pcre2_jit_match_8+144>:      jne    0x7f8b2a3b4c60 <pcre2_jit_match_8+208>

The arrow => marks the exact instruction that failed: mov eax,DWORD PTR [rdi+0x10].

This assembly instruction attempts to read a 32-bit value (DWORD) from the memory address calculated by taking the value in the rdi register and adding an offset of 0x10 (16 bytes in decimal). It then attempts to move this value into the eax register.

Looking back at the register state, the value of rdi is 0x0.

Adding 0x10 to 0x0 results in the memory address 0x0000000000000010. The process attempted to read from memory address 16. The first page of memory (addresses 0 to 4095) is unmapped by the kernel specifically to catch null pointer dereferences. Attempting to read address 16 immediately generates a SIGSEGV.
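
The same arithmetic can be repeated for any base register and offset taken from a disassembly. This is a sketch only: the function names are my own, and the 4096-byte first page is the figure used in the paragraph above.

```shell
# Effective address of [base + offset], printed in hex.
effective_address() {
    printf '0x%x\n' $(( $1 + $2 ))
}

# Succeeds when the address falls inside the first 4 KiB page (0..4095).
in_first_page() {
    [ $(( $1 )) -lt 4096 ]
}

# Example: effective_address 0x0 0x10
# Example: in_first_page 0x10 && echo "null-page access"
```

A faulting address that lands in the first page almost always means a null base pointer plus a struct field offset, which is the pattern seen here.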

The PCRE2 Match Context

The System V AMD64 ABI dictates that the rdi register holds the first integer or pointer argument passed to a function. In the libpcre2 source code, the signature for the JIT match function is:

PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
pcre2_jit_match(const pcre2_code *code, PCRE2_SPTR subject, PCRE2_SIZE length,
  PCRE2_SIZE start_offset, uint32_t options, pcre2_match_data *match_data,
  pcre2_match_context *mcontext)

The first argument, code, is a pointer to the compiled regular expression structure (pcre2_code). The assembly instruction mov eax,DWORD PTR [rdi+0x10] is attempting to read the magic_number or flags field located 16 bytes inside this struct. Because rdi is null, and assuming the function has not reused the register by offset +138, the pointer to the compiled regex structure is null.

Why did PHP pass a null pointer to PCRE2?

To find out, I examined the PHP internal state. GDB allows sourcing custom macros to inspect the Zend Engine. I loaded the .gdbinit file provided in the PHP source tree.

(gdb) source /usr/src/php-8.1.20/.gdbinit
(gdb) zbacktrace

The zbacktrace command parses the Zend Execution Globals (EG) and prints the PHP userland stack trace.

[0x7f8a100140a0] preg_replace_callback("/\\[(\\[?)(kidearn_row|kidearn_column|kidearn_teacher)(?![\\w-])([^\\]\\/]*(?:\\/(?!\\])[^\\]\\/]*)*?)(?:(\\/)\\]|\\](?:([^\\[]*+(?:\\[(?!\\/\\2\\])[^\\[]*+)*+)\\[\\/\\2\\])?)(\\]?)/s", "do_shortcode_tag", "[kidearn_row][kidearn_column][kidearn_teacher id=\"12\" /][/kidearn_column][/kidearn_row]")
[0x7f8a10014020] do_shortcode("[kidearn_row][kidearn_column][kidearn_teacher id=\"12\" /][/kidearn_column][/kidearn_row]")
[0x7f8a10013f90] apply_filters("the_content", "[kidearn_row][kidearn_column][kidearn_teacher id=\"12\" /][/kidearn_column][/kidearn_row]")

The fault occurs when WordPress processes nested shortcodes. The core function get_shortcode_regex() dynamically generates an immense regular expression based on all registered shortcode tags.

The pattern string shown in the trace is: \[(\[?)(kidearn_row|kidearn_column|kidearn_teacher)(?![\w-])([^\]\/]*(?:\/(?!\])[^\]\/]*)*?)(?:(\/)\]|\](?:([^\[]*+(?:\[(?!\/\2\])[^\[]*+)*+)\[\/\2\])?)(\]?)

This is a heavily optimized, non-greedy, capturing regex designed to parse shortcode syntax, including attributes and self-closing tags. It utilizes possessive quantifiers (*+) and negative lookaheads ((?!...)).

PCRE2 Cache Entry Exhaustion

In PHP, compiling a regular expression is a computationally expensive operation. To mitigate this, the ext/pcre module maintains a process-global hash table called the PCRE Cache (PCRE_G(pcre_cache)).

When preg_replace_callback is called, PHP hashes the regex string. It looks up the hash in the cache. If it exists, it retrieves the pcre_cache_entry struct, which contains the pre-compiled pcre2_code pointer. If it does not exist, it compiles the regex via pcre2_compile, optionally JIT-compiles it via pcre2_jit_compile, and stores the pointer in the cache.

The cache has a hardcoded limit.

/* ext/pcre/php_pcre.c */
#define PCRE_CACHE_SIZE 4096

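If a web application generates thousands of unique regular expressions, the cache fills up. When it hits 4096 entries, PHP employs a simple eviction strategy: it discards a batch of the oldest entries (one eighth of the capacity, going by the ext/pcre source) rather than tracking which ones are least used.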
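
To get a feel for how quickly unique patterns churn a bounded cache, the behaviour can be modelled outside PHP. The sketch below is a toy model and not PHP's implementation: simulate_pcre_cache is a hypothetical name, and both the capacity and the number of entries dropped per eviction are parameters, so it can model either a partial or a full flush.

```shell
# Usage: simulate_pcre_cache <capacity> <evict_count> < patterns.txt
# Reads one pattern string per line and reports how often a compile was needed.
simulate_pcre_cache() {
    awk -v capacity="$1" -v evict="$2" '
        BEGIN { head = 0; tail = 0; count = 0; compiles = 0; hits = 0; evicted = 0 }
        {
            if ($0 in cache) { hits++; next }       # hit: reuse the compiled entry
            if (count >= capacity) {                # full: drop the oldest entries
                n = evict
                while (n > 0 && count > 0) {
                    delete cache[order[head]]
                    delete order[head]
                    head++; count--; evicted++; n--
                }
            }
            cache[$0] = 1                           # miss: compile and store
            order[tail++] = $0
            count++; compiles++
        }
        END { printf "compiles=%d hits=%d evicted=%d\n", compiles, hits, evicted }
    '
}

# Example: simulate_pcre_cache 4096 512 < patterns.txt
```

Feeding it the pattern strings an application actually generates (one per line) shows whether the working set fits in the cache or whether every request forces recompilation and eviction.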

Administrators running sites that stack WooCommerce themes and plugins often encounter a massive bloat in registered shortcodes. When a WooCommerce environment mixes with a custom builder, the dynamic shortcode regex generated by get_shortcode_regex() changes depending on the execution context (which plugins are active on a given hook).

I inspected the global PCRE cache state in GDB.

(gdb) print PCRE_G(pcre_cache)
$1 = (HashTable *) 0x0

The pointer to the PCRE cache hash table itself was null, or rather, the macro failed to resolve it because the memory manager had just torn it down. The segmentation fault was a race condition involving the OPcache Tracing JIT, the PCRE cache eviction, and PCRE's internal JIT.

Tracing JIT vs Function JIT

PHP 8 introduced the Just-In-Time compiler. It operates in two modes: Function JIT and Tracing JIT.

Function JIT is conservative. It analyzes an entire PHP function, translates its opcodes to machine code, and executes it.

Tracing JIT is aggressive. It ignores function boundaries. It profiles the execution flow at runtime. If it detects a "hot path" (e.g., a tight loop iterating over an array of shortcodes), it records the exact sequence of opcodes executed across multiple functions. It then compiles this specific trace into highly optimized machine code.

When Tracing JIT compiles a trace that includes a call to preg_replace_callback, it hardcodes certain memory offsets and pointers to optimize the C function call overhead. It attempts to bypass standard Zend Engine type checking by directly passing pointers to internal structures.

If the PCRE cache eviction is triggered while a JIT-compiled trace is executing, the pcre_cache_entry that the JIT trace holds a pointer to is suddenly destroyed and freed.

The next time the loop in the JIT trace executes, it passes the now-stale, freed pointer to the php_pcre_replace_impl internal function. Because the memory was freed, the allocator may have cleared it or handed it to another allocation. When php_pcre_replace_impl extracts the pcre2_code pointer from the freed struct to pass to pcre2_match, it extracts a null value (0x0).

It passes 0x0 to pcre2_match_8 as the first argument (rdi). pcre2_match_8 passes it to pcre2_jit_match_8. pcre2_jit_match_8 executes mov eax,DWORD PTR [rdi+0x10]. The kernel traps the null pointer dereference. Signal 11. Process death.

This race condition only occurs when:

1. OPcache Tracing JIT is enabled.
2. The PHP code executes a tight loop of regex replacements (like do_shortcode).
3. The application dynamically generates enough unique regex patterns to breach the PCRE_CACHE_SIZE of 4096.

PCRE2 JIT Memory Allocation

While fixing the PCRE cache eviction logic is a C-level patch for the PHP core developers, a systems engineer must stabilize the environment immediately.

The interaction between the two JIT compilers (OPcache JIT and PCRE JIT) exacerbates memory management complexities. PCRE JIT operates by allocating a block of memory using mmap with PROT_READ | PROT_WRITE permissions. It writes the x86_64 machine code for the regular expression into this block. It then uses mprotect to change the permissions to PROT_EXEC (executable).
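
On a live worker, JIT-generated code shows up in /proc/<pid>/maps as executable mappings with no backing file. A rough sketch for counting them, assuming the standard Linux maps format; count_anon_exec_mappings is a hypothetical helper name. It cannot tell OPcache JIT regions from PCRE JIT regions.

```shell
# Count executable mappings that have no backing path.
# Reads /proc/<pid>/maps-formatted text on stdin.
count_anon_exec_mappings() {
    awk '$2 ~ /x/ && NF == 5 { n++ } END { print n + 0 }'
}

# Example: count_anon_exec_mappings < "/proc/$pid/maps"
```

Sampling this number over the life of a worker gives a rough view of how much executable memory the two JIT compilers are creating and releasing.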

When PHP evicts entries from the PCRE cache, it must call pcre2_code_free on each one. This triggers a burst of munmap system calls to tear down the executable memory segments. If a Tracing JIT thread is preempted during this teardown, the CPU instruction cache and the TLB (Translation Lookaside Buffer) can become desynchronized with the actual virtual memory map, leading to catastrophic pointer invalidation.

Configuration Adjustments

To prevent the SIGSEGV, the execution path must be forced out of the unstable Tracing JIT interaction.

The most direct approach is to alter the OPcache JIT compilation mode. The opcache.jit directive accepts a 4-digit configuration integer (CRTO):

- C: CPU specific optimization flags.
- R: Register allocation.
- T: Trigger (0 = on script load, 1 = on function execution, 5 = tracing).
- O: Optimization level (0 = none, 5 = max).

The default for PHP 8 is 1254 (Tracing JIT), although the JIT only runs when opcache.jit_buffer_size is non-zero. Changing the value to 1215 switches the trigger digit from 5 to 1, which forces Function JIT. Function JIT adheres strictly to Zend VM function boundaries and does not inline internal C function pointers, completely avoiding the cache pointer race condition.
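
Because the four digits are easy to misread, a small decoder helps when auditing ini files. This is a sketch using only the CRTO meanings listed above: decode_opcache_jit is a hypothetical helper, and any trigger value the text does not describe is reported as "other".

```shell
# Split a four-digit opcache.jit value into its C, R, T and O digits.
decode_opcache_jit() {
    jit_value=$1
    case $jit_value in
        [0-9][0-9][0-9][0-9]) ;;
        *) echo "expected four digits (CRTO), got: $jit_value" >&2; return 1 ;;
    esac
    jit_c=${jit_value%???}        # first digit
    jit_rest=${jit_value#?}       # remaining three digits
    jit_r=${jit_rest%??}
    jit_rest=${jit_rest#?}        # remaining two digits
    jit_t=${jit_rest%?}
    jit_o=${jit_rest#?}
    case $jit_t in
        0) jit_trigger="on script load" ;;
        1) jit_trigger="on function execution" ;;
        5) jit_trigger="tracing" ;;
        *) jit_trigger="other" ;;
    esac
    printf 'C=%s R=%s T=%s (%s) O=%s\n' "$jit_c" "$jit_r" "$jit_t" "$jit_trigger" "$jit_o"
}

# Example: decode_opcache_jit 1254
# Example: decode_opcache_jit 1215
```

Running it on 1254 and 1215 shows that the trigger and optimization digits both differ between the two values; the trigger digit is the one that moves execution off the tracing path.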

Alternatively, you can disable the PCRE JIT compiler entirely, forcing the regex engine to fall back to the standard C interpreter. This avoids the mprotect teardown overhead during cache eviction.

In environments with complex dynamic shortcodes, disabling PCRE JIT incurs a minor regex performance penalty but guarantees FPM worker stability.

; /etc/php/8.1/fpm/conf.d/10-opcache.ini
opcache.jit=1215
opcache.jit_buffer_size=128M

; /etc/php/8.1/fpm/php.ini
pcre.jit=0
