OOM invocations from RSS feed XML parsers
Tracing libxml2 memory leaks in PHP workers
Infrastructure State and Workload Baseline
The operating environment is a dedicated bare-metal server, provisioned with a single AMD Ryzen 9 5900X processor (12 cores, 24 threads) and 64GB of unbuffered ECC DDR4 memory. The storage subsystem consists of two 1TB NVMe solid-state drives configured in a software RAID 1 mirror via mdadm. The host operating system is Debian 12 (Bookworm), utilizing the standard 6.1.0-18-amd64 kernel.
The application architecture relies on Linux Containers (LXC) for process isolation. The specific container under analysis is restricted via cgroups v2, allocated 4 logical CPU cores and a strict memory limit of 8GB. Swap is disabled at the host level. The software stack within the container comprises Nginx 1.24.0, PHP 8.2.10 via FPM, and PostgreSQL 15.
The hosted application is an event management platform utilizing the JoyDay - Creative Event Agency WordPress Theme. The operational profile of this application includes a localized event rendering frontend and a backend synchronization routine. This routine pulls external event data from third-party vendor systems via XML-based syndication feeds (RSS and Atom formats) to populate the local database.
The OOM Killer Invocation
Shortly after 03:00 UTC (the kernel log below shows 03:04:12), the monitoring daemon recorded an unexpected termination of multiple PHP-FPM worker processes. The HTTP endpoints returned 502 Bad Gateway errors for approximately four seconds while the FPM master process detected the child terminations and spawned replacements.
I extracted the kernel ring buffer logs using dmesg to verify the termination cause.
[Tue Nov 14 03:04:12 2023] php-fpm8.2 invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[Tue Nov 14 03:04:12 2023] CPU: 4 PID: 114092 Comm: php-fpm8.2 Not tainted 6.1.0-18-amd64 #1 Debian 6.1.76-1
[Tue Nov 14 03:04:12 2023] Hardware name: Supermicro AS-1014S-WTRT/H12SSW-NT, BIOS 2.4 12/21/2022
[Tue Nov 14 03:04:12 2023] Call Trace:
[Tue Nov 14 03:04:12 2023] <TASK>
[Tue Nov 14 03:04:12 2023] dump_stack_lvl+0x44/0x5c
[Tue Nov 14 03:04:12 2023] dump_header+0x4a/0x20c
[Tue Nov 14 03:04:12 2023] oom_kill_process.cold+0xb/0x10
[Tue Nov 14 03:04:12 2023] out_of_memory+0x1d5/0x500
[Tue Nov 14 03:04:12 2023] mem_cgroup_out_of_memory+0x111/0x130
[Tue Nov 14 03:04:12 2023] try_charge_memcg+0x629/0x7b0
[Tue Nov 14 03:04:12 2023] charge_memcg+0x34/0x200
[Tue Nov 14 03:04:12 2023] __mem_cgroup_charge+0x2c/0x80
[Tue Nov 14 03:04:12 2023] __handle_mm_fault+0x1118/0x16b0
[Tue Nov 14 03:04:12 2023] handle_mm_fault+0xd1/0x2d0
[Tue Nov 14 03:04:12 2023] do_user_addr_fault+0x1e3/0x680
[Tue Nov 14 03:04:12 2023] exc_page_fault+0x70/0x170
[Tue Nov 14 03:04:12 2023] asm_exc_page_fault+0x22/0x30
[Tue Nov 14 03:04:12 2023] RIP: 0033:0x7f8b9e6b4d32
[Tue Nov 14 03:04:12 2023] Code: 0f 1f 40 00 ...
[Tue Nov 14 03:04:12 2023] RSP: 002b:00007ffc9a8b4a20 EFLAGS: 00010206
[Tue Nov 14 03:04:12 2023] RAX: 0000000000000000 RBX: 00007f8b8a214010 RCX: 0000000000000010
[Tue Nov 14 03:04:12 2023] RDX: 00007f8b8a214010 RSI: 0000000000000018 RDI: 00007f8b8a214000
[Tue Nov 14 03:04:12 2023] </TASK>
[Tue Nov 14 03:04:12 2023] memory: usage 8388608kB, limit 8388608kB, failcnt 140
[Tue Nov 14 03:04:12 2023] memory+swap: usage 8388608kB, limit 8388608kB, failcnt 0
[Tue Nov 14 03:04:12 2023] kmem: usage 41012kB, limit 9007199254740988kB, failcnt 0
[Tue Nov 14 03:04:12 2023] Memory cgroup stats for /lxc/app-container:
[Tue Nov 14 03:04:12 2023] anon 8192004KB file 141020KB kernel_stack 4012KB pagetables 8214KB
[Tue Nov 14 03:04:12 2023] Tasks state (memory values in pages):
[Tue Nov 14 03:04:12 2023] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[Tue Nov 14 03:04:12 2023] [ 114092] 33 114092 341012 284102 2101248 0 0 php-fpm8.2
[Tue Nov 14 03:04:12 2023] [ 114095] 33 114095 321014 264102 1901452 0 0 php-fpm8.2
[Tue Nov 14 03:04:12 2023] [ 114098] 33 114098 281014 244102 1801452 0 0 php-fpm8.2
[Tue Nov 14 03:04:12 2023] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/lxc/app-container,task_memcg=/lxc/app-container,task=php-fpm8.2,pid=114092,uid=33
[Tue Nov 14 03:04:12 2023] Memory cgroup out of memory: Killed process 114092 (php-fpm8.2) total-vm:1364048kB, anon-rss:1136408kB, file-rss:0kB, shmem-rss:0kB, UID:33 pgtables:2052kB oom_score_adj:0
The log identifies the cgroup limit of 8388608kB (8GB) being reached, and process 114092 (php-fpm8.2) was killed. Its anon-rss (anonymous resident set size) was 1,136,408 kB (approximately 1.1GB). For a PHP process, anonymous memory is predominantly heap allocated dynamically during script execution, whether by the Zend memory manager or by C libraries that call malloc directly.
A standard PHP script rendering a typical webpage consumes between 12MB and 45MB of memory. An individual worker process reaching 1.1GB of allocated heap before triggering an OOM kill event is anomalous.
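The "Tasks state" table reports total_vm and rss in pages, while the kill line reports kB. The following is a minimal sketch of that conversion, assuming the 4 KiB page size that is the default on x86-64 Linux; the helper name pages_to_kb is illustrative, not part of any tool.

```php
<?php
// Assumption: 4 KiB pages, the default on x86-64 Linux.
const PAGE_SIZE_KB = 4;

/** Converts a page count from the OOM "Tasks state" table into kB. */
function pages_to_kb(int $pages): int
{
    return $pages * PAGE_SIZE_KB;
}

// pid => [total_vm, rss] in pages, copied from the kernel log above.
$tasks = [
    114092 => [341012, 284102],
    114095 => [321014, 264102],
    114098 => [281014, 244102],
];

foreach ($tasks as $pid => [$totalVm, $rss]) {
    printf("%d total-vm:%dkB rss:%dkB\n", $pid, pages_to_kb($totalVm), pages_to_kb($rss));
}
```

For PID 114092 this yields total-vm 1364048kB and rss 1136408kB, which matches the total-vm and anon-rss figures on the "Killed process" line.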
Memory Mapping Inspection with pmap
To observe the memory allocation pattern without waiting for another OOM termination, I monitored a live PHP-FPM worker executing the scheduled background tasks. I utilized watch combined with ps to identify the worker consuming the most memory over time.
watch -n 5 "ps -eo pid,user,%mem,rss,cmd | grep '[p]hp-fpm' | sort -k4 -n -r | head -n 5"
Once I identified a target process (PID 115201) that had grown to 450MB of RSS, I executed pmap to map its memory layout.
pmap -x 115201
115201: php-fpm: pool www
Address Kbytes RSS Dirty Mode Mapping
...
00007f8b91e00000 2048 2048 2048 rw--- [ anon ]
00007f8b92000000 131072 128000 128000 rw--- [ anon ]
00007f8b9a000000 65536 65536 65536 rw--- [ anon ]
00007f8b9e000000 262144 262144 262144 rw--- [ anon ]
...
---------------- ------- ------- -------
total kB 512408 480210 480210
The output showed that the memory was not tied to memory-mapped files or shared libraries (which would show the file path in the Mapping column). The growth was entirely [ anon ] (anonymous) memory, mapped with rw--- (read/write) permissions. This pointed to heap growth in user space, within the PHP engine or one of its loaded extensions, rather than page cache or shared mappings.
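Totalling the anonymous mappings by hand gets tedious when a worker has hundreds of them. A small sketch of how the pmap -x output could be summed follows; the function name sum_anon_rss_kb is hypothetical, and the regular expression assumes the six-column layout shown above.

```php
<?php
/**
 * Sums the RSS column (kB) of every "[ anon ]" mapping in `pmap -x` output.
 * Expected columns: Address Kbytes RSS Dirty Mode Mapping
 */
function sum_anon_rss_kb(string $pmapOutput): int
{
    $total = 0;
    foreach (preg_split('/\R/', $pmapOutput) as $line) {
        $pattern = '/^[0-9a-f]+\s+(\d+)\s+(\d+)\s+(\d+)\s+\S+\s+\[\s*anon\s*\]\s*$/';
        if (preg_match($pattern, $line, $m) === 1) {
            $total += (int) $m[2]; // second numeric column is RSS
        }
    }
    return $total;
}

// The four anonymous mappings visible in the excerpt above.
$sample = <<<'PMAP'
00007f8b91e00000 2048 2048 2048 rw--- [ anon ]
00007f8b92000000 131072 128000 128000 rw--- [ anon ]
00007f8b9a000000 65536 65536 65536 rw--- [ anon ]
00007f8b9e000000 262144 262144 262144 rw--- [ anon ]
PMAP;

echo sum_anon_rss_kb($sample), " kB\n";
```

The four visible mappings alone account for 457,728 kB of the 480,210 kB total RSS.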
Core Dump Configuration and Acquisition
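To analyze the exact contents of the anonymous memory segments, I configured the system to write a core dump whenever a PHP-FPM worker receives a core-generating signal such as SIGSEGV or SIGABRT. The OOM killer itself terminates its victim with SIGKILL, which never produces a core file, so the dump has to be triggered while the bloated worker is still alive.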
I modified the system parameters via sysctl.
sysctl -w kernel.core_pattern=/var/crash/core.%e.%p.%t
sysctl -w fs.suid_dumpable=2
I then raised the core file size limit for the PHP-FPM service with a systemd drop-in override.
systemctl edit php8.2-fpm.service
[Service]
LimitCORE=infinity
I reloaded the systemd daemon and restarted FPM.
systemctl daemon-reload
systemctl restart php8.2-fpm
I waited for the background cron task to execute. As the worker's memory approached the limit, a core dump was written to /var/crash/core.php-fpm8.2.116042.1699931450. Because the kernel's SIGKILL cannot produce a core, the dump has to come from a core-generating signal (for example SIGABRT) or from gcore run against the still-living process. The file size was 1.2GB, reflecting the memory footprint at the moment of capture.
GDB Heap Extraction and Memory Forensics
I utilized the GNU Debugger (gdb) to inspect the core dump. To ensure accurate symbol resolution, I installed the debug symbols for PHP and the core extensions.
apt-get install php8.2-dbg libxml2-dbg
gdb /usr/sbin/php-fpm8.2 /var/crash/core.php-fpm8.2.116042.1699931450
At the GDB prompt, the stack trace is of little use in an OOM investigation: the process is killed asynchronously by the kernel, not by a faulting instruction. The objective is to examine the data persisting in the heap.
I searched the memory space for string patterns that could identify the data objects consuming the allocation. Since the application processes event feeds, I searched for common XML tags.
(gdb) find 0x00007f8b92000000, +131072000, "\n<rss version=\"2.0\">\n<channel>\n<title>Regional Event Syndication</title>\n"
0x7f8b94104080: "<item>\n<title>Annual Tech Symposium</title>\n<description>Detailed schedule and speaker list...</description>\n"
0x7f8b941040f0: "<pubDate>Mon, 13 Nov 2023 09:00:00 GMT</pubDate>\n</item>\n"
...
The heap contained thousands of distinct, fully loaded XML strings representing event syndication feeds.
To determine how PHP was handling these strings, I needed to locate the internal Zend Engine structures (zval) pointing to this data. In PHP 8, a string zval contains a pointer to a zend_string structure.
I dumped the raw memory around an XML fragment to locate the zend_string header.
(gdb) x/16xg 0x7f8b94104010 - 24
0x7f8b94103ff8: 0x0000000100000006 0x0000000000004a20
0x7f8b94104008: 0x6d783f3c00000000 0x73726576206c6d78
...
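The first quadword, 0x0000000100000006, is the gc (garbage collection) header, which packs a 32-bit reference count (refcount) and a 32-bit type info field; I read these as a refcount of 1 and a type of 6 (IS_STRING). The value 0x0000000000004a20 (18,976 in decimal) is the length of the string. Raw quadwords are easy to misread on little-endian x86-64, where the first struct field is the low half of the printed value, and a zend_string also carries an 8-byte cached hash between the gc header and the length. With debug symbols loaded, a typed print such as p *(zend_string *)0x7f8b94103ff8 is the more reliable way to confirm these fields.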
The reference count was exactly 1. If the script had finished processing this XML payload and moved on to the next one, the variable holding this string should have gone out of scope. When a variable goes out of scope, the Zend Engine decrements the refcount. When refcount reaches 0, the zend_string memory is freed.
The fact that thousands of strings persisted with a refcount of 1 indicated they were still anchored to active variables or objects within the PHP execution context, so reference counting never released them.
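The scope and refcount behaviour described above can be observed directly from PHP. The sketch below uses objects rather than strings, because a destructor makes the moment of release visible; the class and function names are illustrative only.

```php
<?php
final class FeedPayload
{
    /** @var string[] Names of payloads in the order they were destroyed. */
    public static array $freed = [];

    public function __construct(private string $name) {}

    public function __destruct()
    {
        self::$freed[] = $this->name;
    }
}

function parse_and_forget(): void
{
    $payload = new FeedPayload('local');
    // On return the only reference disappears, so the object is destroyed here.
}

function parse_and_retain(array &$registry): void
{
    $payload = new FeedPayload('retained');
    $registry[] = $payload; // a second reference outlives the function
}

$registry = [];
parse_and_forget();
parse_and_retain($registry);
$freedWhileRetained = FeedPayload::$freed; // only 'local' so far

$registry = [];                            // drop the last reference
$freedAfterRelease = FeedPayload::$freed;  // now 'retained' as well

print_r($freedWhileRetained);
print_r($freedAfterRelease);
```

The 'retained' payload survives its function because the registry still references it, which is the same situation the heap dump suggests for the XML strings.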
I exited GDB to analyze the application logic.
Application Logic and XML Parsing
In WordPress themes designed for event aggregation, the backend logic typically uses PHP's DOMDocument or SimpleXMLElement classes to traverse external feeds.
I located the synchronization routine in the application codebase: /var/www/html/wp-content/themes/joyday/inc/sync/class-feed-parser.php.
class JoyDay_Feed_Parser {
    private $parsed_items = [];

    public function process_feed($url) {
        $xml_string = file_get_contents($url);
        if (!$xml_string) {
            return false;
        }
        $dom = new DOMDocument();
        // Suppress warnings for malformed XML
        libxml_use_internal_errors(true);
        $dom->loadXML($xml_string);
        $items = $dom->getElementsByTagName('item');
        foreach ($items as $item) {
            $event_data = $this->extract_node_data($item);
            $this->save_to_database($event_data);
            $this->parsed_items[] = $event_data['id'];
        }
        // Clear internal errors to prevent memory leaks in libxml
        libxml_clear_errors();
        return true;
    }

    private function extract_node_data($node) {
        $data = [];
        foreach ($node->childNodes as $child) {
            if ($child->nodeType === XML_ELEMENT_NODE) {
                $data[$child->nodeName] = $child->nodeValue;
            }
        }
        return $data;
    }
}
The script iterates through multiple feed URLs, calls process_feed, loads the XML into a DOMDocument, extracts the data, and saves it.
At first glance, the code appears correct. The $dom variable is local to the process_feed method. When the method returns, $dom falls out of scope, and its memory should be freed. The libxml_clear_errors() call is correctly placed to prevent error buffer accumulation.
However, the core dump proved the memory was not being freed.
Profiling with Valgrind Memcheck
To isolate the exact C-level functions responsible for the allocation, I constructed a minimal PHP CLI script that replicated the logic of class-feed-parser.php and executed it within a controlled loop.
I created /var/tmp/test_parse.php:
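<?php
// Hypothetical sample feed path; any large RSS document works here.
$xml_payload = file_get_contents('/var/tmp/sample_feed.xml');
libxml_use_internal_errors(true);
// 500 iterations, matching the block count in the valgrind report.
for ($i = 0; $i < 500; $i++) {
    $dom = new DOMDocument();
    $dom->loadXML($xml_payload);
    $items = $dom->getElementsByTagName('item');
    foreach ($items as $item) {
        $title = $item->getElementsByTagName('title')->item(0)->nodeValue;
    }
    libxml_clear_errors();
}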
I executed this script under valgrind using the memcheck tool. To keep the output readable and to attribute allocations accurately to the PHP extensions, I disabled the Zend Engine's internal memory allocator (USE_ZEND_ALLOC=0), forcing PHP to use the standard system malloc and free.
USE_ZEND_ALLOC=0 valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all --track-origins=yes php /var/tmp/test_parse.php > /var/log/valgrind_php.log 2>&1
Execution was slow, as expected given the instrumentation overhead. Once it completed, I analyzed the valgrind_php.log output.
==118021== HEAP SUMMARY:
==118021== in use at exit: 418,401,280 bytes in 8,210,412 blocks
==118021== total heap usage: 14,210,124 allocs, 6,000,112 frees, 842,102,400 bytes allocated
==118021==
==118021== 418,401,280 bytes in 500 blocks are definitely lost in loss record 142 of 142
==118021== at 0x4842839: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==118021== by 0x51A2B14: xmlNewDocNode (tree.c:2140)
==118021== by 0x51B4C22: xmlParseElement (parser.c:9120)
==118021== by 0x51B5D41: xmlParseContent (parser.c:9840)
==118021== by 0x51B6E52: xmlParseDocument (parser.c:10520)
==118021== by 0x51B7F63: xmlDoReadMemory (parser.c:14810)
==118021== by 0x7F8B98A10214: dom_document_loadxml (document.c:1420)
==118021== by 0x54A2145: execute_ex (zend_vm_execute.h:5420)
==118021== by 0x54A8256: zend_execute (zend_vm_execute.h:6012)
==118021== by 0x5411367: zend_execute_scripts (zend.c:1820)
==118021== by 0x53B2478: php_execute_script (main.c:2540)
==118021== by 0x5503589: do_cli (php_cli.c:980)
==118021== by 0x550469A: main (php_cli.c:1340)
The stack trace points directly to xmlNewDocNode within libxml2 (tree.c), invoked by dom_document_loadxml in the PHP ext/dom extension. Over 400MB of memory was "definitely lost". The number of blocks lost (500) corresponds exactly to the 500 iterations of the loop in the test script.
Every time $dom->loadXML($xml_payload) was executed, libxml2 allocated internal memory structures for the parsed XML tree. When the $dom variable was overwritten or fell out of scope, PHP destroyed its internal zval representation of the DOM object, but the underlying C structures allocated by libxml2 were not being freed.
Cyclic References and Garbage Collection Limitations
To understand why the libxml2 structures were orphaned, it is necessary to examine how PHP interfaces with external C libraries via objects.
When new DOMDocument() is called, PHP creates a user-land object. When loadXML() is called, the PHP extension passes the string to libxml2. libxml2 parses the string and creates its own internal tree in memory consisting of xmlNode structs. The PHP extension then stores a pointer to the root of this libxml2 tree within the internal data of the PHP DOMDocument object.
PHP utilizes a reference-counting garbage collector. When a variable's reference count drops to 0, its memory is freed. However, the DOM structure is inherently cyclic. A parent node has pointers to its child nodes (childNodes), and the child nodes have pointers back to the parent node (parentNode).
If you extract a specific node from the tree and assign it to a PHP variable:
$items = $dom->getElementsByTagName('item');
The $items variable is a DOMNodeList object. It holds references into the internal libxml2 tree. To prevent that tree from being freed while PHP still needs it, the ext/dom extension increments the reference counters it maintains for the underlying document and nodes whenever a PHP object wraps them.
In the test script, the foreach loop assigns each child node to the variable $item.
foreach ($items as $item) {
$title = $item->getElementsByTagName('title')->item(0)->nodeValue;
}
During this iteration, PHP creates internal DOMElement objects representing the XML nodes. Due to the complex interplay of parent/child relationships and the traversal mechanism of DOMNodeList, cyclic references are created within the Zend Engine's object store.
When the function ends, $dom and $items go out of scope. Their reference counts decrement. However, because of the cyclic references created during the traversal, the reference counts of the internal node objects do not reach zero.
PHP's standard reference counting cannot resolve cyclic dependencies. To handle this, PHP includes a synchronous cycle collection algorithm (the cyclic garbage collector). This collector runs when the root buffer, which stores potential cycle roots, reaches its threshold (roughly 10,000 entries by default; since PHP 7.3 the engine adjusts this threshold dynamically).
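The root buffer and its threshold can be inspected from user land with gc_status(). The following sketch builds a small number of orphaned parent/child cycles with a hypothetical TreeNode class (a stand-in for DOM nodes, not ext/dom itself) and shows that they wait in the root buffer until a collection is requested.

```php
<?php
final class TreeNode
{
    public ?TreeNode $parent = null;
    /** @var TreeNode[] */
    public array $children = [];
}

function build_orphaned_cycle(): void
{
    $parent = new TreeNode();
    $child = new TreeNode();
    $child->parent = $parent;     // child -> parent
    $parent->children[] = $child; // parent -> child closes the cycle
    // Both locals go out of scope here, but each object still has refcount 1.
}

gc_enable();
gc_collect_cycles(); // start from an empty root buffer

for ($i = 0; $i < 100; $i++) {
    build_orphaned_cycle();
}

$status = gc_status(); // keys include 'runs', 'collected', 'threshold', 'roots'
$pendingRoots = $status['roots'];
$collected = gc_collect_cycles();

printf(
    "roots pending: %d, threshold: %d, collected on demand: %d\n",
    $pendingRoots,
    $status['threshold'],
    $collected
);
```

With only 100 iterations the number of pending roots stays far below the threshold, so nothing is reclaimed until gc_collect_cycles() is called explicitly.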
The critical failure point is that this trigger counts possible roots, not the bytes those roots keep alive. The PHP DOMElement object is very small (little more than a pointer). The actual data (megabytes of XML text) resides in memory allocated by libxml2, which plays no part in the collector's decision to run.
Consequently, nothing prompts the cyclic garbage collector to clean up these DOM objects early. In a long-running process (like FPM or a background CLI sync task), the rapid instantiation of large DOM trees creates many small, cyclic PHP objects holding pointers to massive, opaque libxml2 memory blocks. The process hits the cgroup memory limit (triggering the OOM killer) long before the collector's root threshold is reached and a comprehensive sweep begins.
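One way to check how much of a worker's footprint is visible to the Zend Engine is to compare memory_get_usage() with the resident set size the kernel reports. This is a rough, Linux-only sketch; the helper names and the synthetic payload are assumptions for illustration. If the parsed tree lives outside the Zend memory manager, as the explanation above suggests, rss_kb should grow noticeably more than zend_kb after loadXML().

```php
<?php
/** Extracts VmRSS in kB from the text of /proc/<pid>/status, or null if absent. */
function parse_vm_rss_kb(string $statusText): ?int
{
    return preg_match('/^VmRSS:\s+(\d+)\s+kB$/m', $statusText, $m) === 1
        ? (int) $m[1]
        : null;
}

/** Snapshot of Zend-managed memory and kernel-reported RSS for this process. */
function memory_snapshot(): array
{
    $status = @file_get_contents('/proc/self/status'); // Linux only
    return [
        'zend_kb' => intdiv(memory_get_usage(true), 1024),
        'rss_kb'  => $status === false ? null : parse_vm_rss_kb($status),
    ];
}

// Synthetic feed with 20,000 items, built in memory for the comparison.
$xml = '<rss><channel>'
    . str_repeat('<item><title>Event</title></item>', 20000)
    . '</channel></rss>';

$before = memory_snapshot();
$dom = new DOMDocument();
$dom->loadXML($xml);
$after = memory_snapshot();

printf("zend_kb: %d -> %d\n", $before['zend_kb'], $after['zend_kb']);
printf("rss_kb:  %s -> %s\n", $before['rss_kb'] ?? 'n/a', $after['rss_kb'] ?? 'n/a');
```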
Execution Reconfiguration and Memory Deallocation
The resolution requires manual intervention at the application layer to break the cyclic references and explicitly signal the memory subsystems to deallocate the structures before the process iterates to the next XML payload.
I modified the class-feed-parser.php file to implement deterministic destruction.
class JoyDay_Feed_Parser {
    // ...
    public function process_feed($url) {
        $xml_string = file_get_contents($url);
        if (!$xml_string) {
            return false;
        }
        $dom = new DOMDocument();
        libxml_use_internal_errors(true);
        $dom->loadXML($xml_string);
        $items = $dom->getElementsByTagName('item');
        foreach ($items as $item) {
            $event_data = $this->extract_node_data($item);
            $this->save_to_database($event_data);
            $this->parsed_items[] = $event_data['id'];
        }
        // 1. Clear internal libxml error buffers
        libxml_clear_errors();
        // 2. Unset the loop variable and the DOMNodeList so no PHP object still references a node
        unset($item, $items);
        // 3. Load a minimal document. This is the critical step: the object drops its reference
        //    to the large libxml2 tree, allowing the underlying C structs to be freed.
        $dom->loadXML('<empty/>');
        // 4. Explicitly unset the DOMDocument object
        unset($dom);
        // 5. Force the Zend Engine cyclic garbage collector to execute immediately
        gc_collect_cycles();
        return true;
    }
    // ...
}
The adjustments are precise.
First, unset($items) destroys the DOMNodeList. The foreach loop variable $item still references the last node at this point and should be unset alongside it.
Second, $dom->loadXML('<empty/>') is a workaround that circulates in the PHP community. Because DOMDocument has no explicit destroy method, loading a minimal XML document into the existing object makes it drop its reference to the previous libxml2 tree, which ext/dom can then free with xmlFreeDoc once no other PHP node objects still reference it, before allocating the new, tiny tree.
Third, unset($dom) removes the object entirely.
Finally, gc_collect_cycles() forces the Zend Engine to run the cycle collector immediately and resolve any remaining cyclic references generated during the extract_node_data method, so that the memory can be released before the next URL is processed.
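The same cleanup can be made robust against early returns and exceptions by placing it in a finally block and copying the needed values out of the tree as plain strings. This is a sketch of that pattern under stated assumptions, not the theme's code: extract_item_titles is a hypothetical helper and it reads only the title element of each item.

```php
<?php
/**
 * Parses an RSS string and returns item titles as plain strings.
 * All DOM objects are released before the function returns.
 *
 * @return string[]
 */
function extract_item_titles(string $xml): array
{
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $titles = [];
    try {
        if (!$dom->loadXML($xml)) {
            return [];
        }
        foreach ($dom->getElementsByTagName('item') as $item) {
            $titleNode = $item->getElementsByTagName('title')->item(0);
            if ($titleNode !== null) {
                $titles[] = $titleNode->nodeValue; // string copy, no node reference kept
            }
        }
        return $titles;
    } finally {
        // Runs on success, early return, and exception alike.
        unset($item, $titleNode, $dom);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
        gc_collect_cycles();
    }
}

$feed = '<rss version="2.0"><channel><title>Feed</title>'
    . '<item><title>A</title></item><item><title>B</title></item>'
    . '</channel></rss>';

print_r(extract_item_titles($feed));
```

Note that in PHP 8 loadXML() throws a ValueError for an empty string; the finally block still runs in that case before the exception propagates.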
In addition to the application code patch, I adjusted the PHP-FPM pool configuration (/etc/php/8.2/fpm/pool.d/www.conf) to implement a secondary, structural fail-safe against external library leaks.
; Terminate and respawn worker processes after processing 100 requests.
; This guarantees that any native C-level leaks (like libxml2 or OpenSSL)
; are forcefully reclaimed by the kernel upon process exit.
pm.max_requests = 100
State Verification
After applying the code modifications and restarting the PHP-FPM daemon, I monitored the execution of the background synchronization task using the identical pmap command sequence.
pmap -x 118402
118402: php-fpm: pool www
Address Kbytes RSS Dirty Mode Mapping
...
00007f8b91e00000 2048 2048 2048 rw--- [ anon ]
00007f8b92000000 16384 12288 12288 rw--- [ anon ]
...
---------------- ------- ------- -------
total kB 140218 68112 68112
During the processing of 50 consecutive XML feeds, the anonymous RSS footprint of the target PHP-FPM worker process stabilized at 68MB. It fluctuated briefly during the loadXML() execution, and immediately returned to the baseline following the explicit unset() and gc_collect_cycles() operations. The dmesg buffer showed no further kernel OOM killer invocations.
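Besides dmesg, cgroups v2 exposes a per-group oom_kill counter in the memory.events file, which is easier to check from a script. The sketch below parses that file's "key value" lines; the path /sys/fs/cgroup/memory.events is an assumption about how the cgroup is mounted inside the container, and the sample text is invented for illustration rather than taken from this system.

```php
<?php
/**
 * Parses cgroup v2 memory.events content ("key value" per line) into an array.
 *
 * @return array<string,int>
 */
function parse_memory_events(string $text): array
{
    $events = [];
    foreach (preg_split('/\R/', trim($text)) as $line) {
        if (preg_match('/^(\w+)\s+(\d+)$/', $line, $m) === 1) {
            $events[$m[1]] = (int) $m[2];
        }
    }
    return $events;
}

// Illustrative sample only; real values come from the file below.
$sample = "low 0\nhigh 0\nmax 12\noom 1\noom_kill 1\n";
$parsedSample = parse_memory_events($sample);

// Assumed path inside the container; adjust to the actual cgroup mount.
$path = '/sys/fs/cgroup/memory.events';
$live = is_readable($path) ? parse_memory_events((string) file_get_contents($path)) : [];

printf("sample oom_kill: %d\n", $parsedSample['oom_kill']);
printf("live oom_kill: %s\n", $live['oom_kill'] ?? 'n/a');
```

Recording the oom_kill value before and after a synchronization run shows whether the kernel killed anything in the group during that window.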