NFSv4 flock() deadlock in Elementor CSS generation
Incident Context
At 02:00 UTC, a scheduled background routine responsible for purging and regenerating static CSS assets began leaving orphaned lock files and hung PHP CLI processes. The affected node is a staging environment running AlmaLinux 9 (kernel 5.14.0-284.11.1.el9_2.x86_64) with PHP 8.2.10.
This server is part of an automated testing matrix evaluating the Elementra theme (a 100% Elementor WordPress theme) before its integration into our primary repository. The node operates headlessly.
The issue did not trigger standard resource alerts. CPU utilization remained nominal. Memory consumption was stable. The anomaly was identified solely due to the accumulation of .lock files in the /wp-content/uploads/elementor/css/ directory, which eventually caused subsequent deployment pipelines to halt due to permission conflicts during asset synchronization.
Initial Process State Analysis
I connected to the node and listed the accumulated lock files to determine the scope of the blocked operations.
find /var/www/html/wp-content/uploads/elementor/css/ -type f -name "*.lock" -ls
Output:
3491028 4 -rw-r--r-- 1 www-data www-data 0 Nov 12 02:00 /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock
3491035 4 -rw-r--r-- 1 www-data www-data 0 Nov 13 02:00 /var/www/html/wp-content/uploads/elementor/css/post-45.css.lock
3491088 4 -rw-r--r-- 1 www-data www-data 0 Nov 14 02:00 /var/www/html/wp-content/uploads/elementor/css/global.css.lock
The timestamps corresponded precisely with the daily execution of a WP-CLI command invoked via standard cron:
0 2 * * * cd /var/www/html && wp elementor flush_css --quiet
I checked the process table for any lingering wp commands.
ps -eo pid,ppid,user,stat,start,time,cmd | grep "[w]p elementor"
Output:
10291 1 www-data S Nov 12 00:00:01 php /usr/local/bin/wp elementor flush_css --quiet
18492 1 www-data S Nov 13 00:00:01 php /usr/local/bin/wp elementor flush_css --quiet
27104 1 www-data S Nov 14 00:00:01 php /usr/local/bin/wp elementor flush_css --quiet
The processes were in the S (interruptible sleep) state. Each had been reparented to PID 1 (systemd), indicating that the shell spawned by cron had already exited while the PHP binary was still waiting on a system call.
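The same check can be scripted so that orphaned runs are noticed before lock files pile up. The following is a minimal sketch rather than part of the original investigation: the function name find_orphaned_runs and the one-hour threshold in the example are assumptions, and the input is expected in the column order PID, PPID, STAT, elapsed seconds, command line.

```shell
#!/bin/bash
# Hypothetical helper: print PIDs of long-running processes that have been
# reparented to PID 1 and whose command line contains a given string.
# Input (stdin): one process per line as "PID PPID STAT ELAPSED_SECONDS COMMAND...".
find_orphaned_runs() {
    local needle="$1" min_age="$2"
    awk -v needle="$needle" -v min_age="$min_age" '
        $2 == 1 && ($4 + 0) >= (min_age + 0) {
            # Rebuild the command line from field 5 onward.
            cmd = $5
            for (i = 6; i <= NF; i++) cmd = cmd " " $i
            if (index(cmd, needle) > 0) print $1
        }'
}

# Example (assumed threshold of 3600 seconds):
#   ps -eo pid=,ppid=,stat=,etimes=,args= | find_orphaned_runs "wp elementor" 3600
```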
To identify the exact system call, I inspected the file descriptors held by the oldest process (PID 10291).
lsof -p 10291
Excerpt of output:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
...
php 10291 www-data 0r CHR 1,3 0t0 1028 /dev/null
php 10291 www-data 1w CHR 1,3 0t0 1028 /dev/null
php 10291 www-data 2w CHR 1,3 0t0 1028 /dev/null
php 10291 www-data 3r DIR 0,43 4096 3491000 /var/www/html/wp-content/uploads/elementor/css
php 10291 www-data 4w REG 0,43 0 3491028 /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock
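File descriptor 4 (FD 4w) had post-12.css.lock open for writing. The w suffix in the FD column is the access mode, not a lock indicator; lsof appends a separate lock character after the mode (for example 4wW for a write lock on the whole file) once a lock has been granted. Its absence here is consistent with a process that is still waiting for the lock.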
Tracing the Kernel Execution Path
I pulled the kernel stack trace for the sleeping process to determine where in the kernel space the thread was blocked.
cat /proc/10291/stack
Output:
[<0>] nfs_wait_bit_killable+0x1e/0x90 [nfs]
[<0>] nfs4_proc_setlk+0x12a/0x180 [nfsv4]
[<0>] nfs_flock+0x8f/0xd0 [nfs]
[<0>] locks_lock_inode_wait+0x5a/0x1a0
[<0>] do_fcntl+0x2f1/0x650
[<0>] __x64_sys_fcntl+0x80/0xd0
[<0>] do_syscall_64+0x5c/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
The stack trace isolated the issue. The PHP process was blocked inside a file-locking system call; the trace shows it entering the kernel through fcntl(). PHP's userland flock() reaches the kernel either as flock(2) or, on builds that emulate it, as fcntl(F_SETLKW). On a Linux NFS mount both forms end up as a lock request to the server, because the client emulates flock(2) with byte-range locks covering the entire file. The file system /var/www/html/wp-content/uploads is an NFSv4 mount, so the kernel handed the lock request to the NFS client module (nfs_flock), which issued it to the NFSv4 server via nfs4_proc_setlk.
The thread was suspended in nfs_wait_bit_killable, waiting indefinitely for an RPC response from the NFS server that was either never sent or dropped.
File System Architecture and NFS Mount Parameters
The staging environments use a shared storage array to simulate the distributed cluster state of horizontally scaled sites that require file synchronization across nodes.
I checked the specific mount parameters for the NFS share.
mount | grep nfs4
10.0.5.50:/exports/staging_uploads on /var/www/html/wp-content/uploads type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.5.20,local_lock=none,addr=10.0.5.50)
The critical parameter here is local_lock=none. With local_lock=none, the client treats no lock type as local and sends every lock request to the NFS server. NFSv4 does not use the separate NLM (Network Lock Manager) protocol that NFSv3 relies on; locking is part of the NFSv4 protocol itself and is tied to the client's lease on the server. If the lock state between the client and server becomes desynchronized, for example after a momentary TCP drop, the client process can hang in F_SETLKW (set lock, wait) indefinitely because a hard mount never gives up on an outstanding RPC.
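Reading these parameters by eye is error-prone when several mounts are involved. As an illustration only, the sketch below pulls a single option out of a comma-separated option string such as the one printed by mount; the function name mount_option is an assumption, not an existing tool.

```shell
#!/bin/bash
# Hypothetical helper: look up one option in a comma-separated mount option
# string. Prints the value for key=value options, or "set" for bare flags.
# Returns non-zero when the option is absent.
mount_option() {
    local opts="$1" key="$2" item
    local -a items
    IFS=',' read -r -a items <<< "$opts"
    for item in "${items[@]}"; do
        case "$item" in
            "$key="*) printf '%s\n' "${item#*=}"; return 0 ;;
            "$key")   printf 'set\n'; return 0 ;;
        esac
    done
    return 1
}

# Example:
#   opts=$(findmnt -no OPTIONS /var/www/html/wp-content/uploads)
#   mount_option "$opts" local_lock   # prints the effective lock mode
#   mount_option "$opts" hard         # prints "set" on a hard mount
```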
Zend Engine Backtrace via GDB
To confirm the exact PHP userland execution path leading to this system call, I attached gdb directly to the hung PHP process. Relying solely on the kernel stack is insufficient; I needed to identify the exact file and line number within the WordPress or theme core triggering the operation.
gdb -p 10291
Inside the gdb prompt, I configured the environment to read PHP's debug symbols.
(gdb) source /usr/src/php-8.2.10/.gdbinit
(gdb) zbacktrace
The Zend Engine backtrace yielded the following execution stack:
[0x00007ffc12a4b890] flock(resource(#412), 2) [internal function]
[0x00007ffc12a4b7f0] Elementor\Core\Files\Base->get_file_handle() /var/www/html/wp-content/plugins/elementor/includes/core/files/base.php:124
[0x00007ffc12a4b750] Elementor\Core\Files\Base->update_file() /var/www/html/wp-content/plugins/elementor/includes/core/files/base.php:210
[0x00007ffc12a4b6b0] Elementor\Core\Files\CSS\Post->update() /var/www/html/wp-content/plugins/elementor/includes/core/files/css/base.php:80
[0x00007ffc12a4b600] Elementor\Core\Files\Manager->clear_cache() /var/www/html/wp-content/plugins/elementor/includes/core/files/manager.php:145
[0x00007ffc12a4b540] Elementra_Theme_Hooks->trigger_css_flush() /var/www/html/wp-content/themes/elementra/inc/classes/class-elementra-hooks.php:89
[0x00007ffc12a4b490] WP_CLI\Dispatcher\CommandFactory::extract_function_call() /var/www/html/wp-content/plugins/elementor/cli/manager.php:42
The stack confirmed that Elementor's CSS file generation logic was executing. Specifically, Elementor\Core\Files\Base->get_file_handle() had opened a .lock file and was blocked in flock($handle, LOCK_EX). The second argument, 2, is the value of LOCK_EX (exclusive lock).
I inspected the source code in base.php at line 124.
sed -n '120,128p' /var/www/html/wp-content/plugins/elementor/includes/core/files/base.php
public function get_file_handle() {
    if ( ! $this->file_handle ) {
        $this->file_handle = fopen( $this->path . '.lock', 'w' );
        if ( $this->file_handle ) {
            flock( $this->file_handle, LOCK_EX );
        }
    }
    return $this->file_handle;
}
The method opens the file and immediately requests an exclusive, blocking lock.
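For contrast, a lock request can be given an upper bound instead of blocking forever. The sketch below uses the flock(1) utility from util-linux rather than PHP, and the function name with_lock_timeout and the exit status 75 are assumptions. It bounds the wait for a lock held by another local process. It may not rescue a process that is stuck inside an NFS RPC wait like the one in the kernel trace, because the timer signal might not interrupt that wait.

```shell
#!/bin/bash
# Hypothetical helper: run a command while holding an exclusive lock on
# LOCKFILE, giving up after TIMEOUT seconds instead of waiting forever.
# Exits with 75 (an arbitrary "try again later" code) when the lock is busy.
with_lock_timeout() {
    local lockfile="$1" timeout="$2"
    shift 2
    (
        # File descriptor 9 is opened on the lock file by the redirection below.
        flock -x -w "$timeout" 9 || exit 75
        "$@"
    ) 9>"$lockfile"
}

# Example (the path and the 10-second limit are placeholders):
#   with_lock_timeout /tmp/post-12.css.lock 10 echo "lock acquired"
```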
However, the backtrace also showed a specific call from the theme: Elementra_Theme_Hooks->trigger_css_flush(). I needed to investigate why the theme was wrapping the standard Elementor CSS flush command.
cat /var/www/html/wp-content/themes/elementra/inc/classes/class-elementra-hooks.php
Excerpt:
public function trigger_css_flush() {
    // Force Elementor to regenerate global typography before flushing post CSS
    $global_styles = get_option('elementra_global_styles_cache');
    if ( empty( $global_styles ) ) {
        Elementra_Typography::compile_fonts();
    }
    \Elementor\Plugin::$instance->files_manager->clear_cache();
}
The theme was functioning correctly. The error was purely infrastructural. The blocking flock call on an NFS share with local_lock=none was the root cause of the process suspension.
Inotify Tracing the Lock Contention
To prove that the lock was not being held by another active process (which would make F_SETLKW function as intended, albeit slowly), but was instead a network protocol deadlock, I needed to monitor the file system events in real-time.
I terminated the hung PHP processes to release any client-side lock state.
kill -9 10291 18492 27104
rm -f /var/www/html/wp-content/uploads/elementor/css/*.lock
I wrote a Bash script using inotifywait to monitor file events within the css directory.
#!/bin/bash
DIR="/var/www/html/wp-content/uploads/elementor/css"
# -m keeps inotifywait running; each event is printed as "TIME PATH EVENTS".
inotifywait -m -e create -e open -e modify -e close_write -e close_nowrite -e delete --timefmt '%H:%M:%S' --format '%T %w%f %e' "$DIR" | while read -r time file event; do
    echo "[$time] $file : $event"
done
I executed the script in one terminal. In another, I manually ran the WP-CLI command.
wp elementor flush_css --quiet
The inotifywait output:
Setting up watches.
Watches established.
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock : CREATE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock : OPEN
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css : CREATE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css : OPEN
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css : MODIFY
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css : CLOSE_WRITE,CLOSE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock : CLOSE_WRITE,CLOSE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock : DELETE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-45.css.lock : CREATE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-45.css.lock : OPEN
The command processed post-12.css successfully. It created the lock, opened it, generated the CSS, closed the CSS, closed the lock, and deleted the lock.
It then created post-45.css.lock and opened it. It never proceeded to CREATE post-45.css. The terminal running the WP-CLI command hung indefinitely at this exact point.
There was no secondary process contending for post-45.css.lock. The lock request was simply disappearing into the network stack.
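With hundreds of files in a run, spotting the one lock that was created but never deleted is easier to automate. The following sketch parses log lines in the format printed by the monitoring script; the function name dangling_locks is an assumption, and paths containing spaces are not handled.

```shell
#!/bin/bash
# Hypothetical helper: read inotify log lines of the form
#   [HH:MM:SS] /path/to/file : EVENT[,EVENT...]
# and print every .lock file that was created but never deleted.
dangling_locks() {
    awk '
        $2 ~ /\.lock$/ && $4 == "CREATE" { pending[$2] = 1 }
        $2 ~ /\.lock$/ && $4 == "DELETE" { delete pending[$2] }
        END { for (path in pending) print path }
    ' | sort
}

# Example (inotify.log is a placeholder for the saved monitor output):
#   dangling_locks < inotify.log
```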
RPC Protocol Dissection (NFSv4 tcpdump)
I needed to see the raw RPC packets to understand why the NFS server was failing to acknowledge the lock request for post-45.css.lock.
I initiated a packet capture targeting the NFS port (2049) between the client (10.0.5.20) and the storage server (10.0.5.50).
tcpdump -i eth0 host 10.0.5.50 and port 2049 -w nfs_lock.pcap
I terminated the hung WP-CLI command, deleted the lock file, and ran the command again to reproduce the hang while capturing the traffic. Once the process froze, I halted the capture and analyzed it with tshark.
tshark -r nfs_lock.pcap -Y "nfs && rpc.msgtyp == 0" -T fields -e frame.number -e ip.src -e ip.dst -e rpc.xid -e nfs.opcode -e nfs.lock.locktype
The output filtered for NFS Call operations:
14 10.0.5.20 10.0.5.50 0x4f1a2b3c OPEN
16 10.0.5.20 10.0.5.50 0x4f1a2b3d LOCK WRITE_LT
18 10.0.5.20 10.0.5.50 0x4f1a2b3e WRITE
20 10.0.5.20 10.0.5.50 0x4f1a2b3f LOCKU
22 10.0.5.20 10.0.5.50 0x4f1a2b40 CLOSE
24 10.0.5.20 10.0.5.50 0x4f1a2b41 REMOVE
26 10.0.5.20 10.0.5.50 0x4f1a2b42 OPEN
28 10.0.5.20 10.0.5.50 0x4f1a2b43 LOCK WRITE_LT
Frames 14 through 24 correspond to the successful generation of post-12.css.
Frame 26 is the OPEN call for post-45.css.lock. Frame 28 is the LOCK call for it, requesting a write lock (WRITE_LT).
I checked the NFS Replies from the server (rpc.msgtyp == 1):
tshark -r nfs_lock.pcap -Y "nfs && rpc.msgtyp == 1" -T fields -e frame.number -e ip.src -e ip.dst -e rpc.xid -e nfs.status
15 10.0.5.50 10.0.5.20 0x4f1a2b3c NFS4_OK
17 10.0.5.50 10.0.5.20 0x4f1a2b3d NFS4_OK
19 10.0.5.50 10.0.5.20 0x4f1a2b3e NFS4_OK
21 10.0.5.50 10.0.5.20 0x4f1a2b3f NFS4_OK
23 10.0.5.50 10.0.5.20 0x4f1a2b40 NFS4_OK
25 10.0.5.50 10.0.5.20 0x4f1a2b41 NFS4_OK
27 10.0.5.50 10.0.5.20 0x4f1a2b42 NFS4_OK
Frame 27 successfully returned NFS4_OK for the OPEN call.
There is no reply to the LOCK request sent in Frame 28 (XID 0x4f1a2b43). The server never responded.
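On a larger capture, pairing calls with replies by hand does not scale. As a rough sketch, the two tshark outputs can be saved to files and compared on the XID column; the function name unanswered_calls and the file names in the example are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: list RPC calls that have no reply with the same XID.
# Both files hold tshark field output in which column 4 is rpc.xid.
unanswered_calls() {
    local calls="$1" replies="$2"
    awk '
        FILENAME == ARGV[1] { answered[$4] = 1; next }   # first file: replies
        !($4 in answered)                                # second file: calls
    ' "$replies" "$calls"
}

# Example (calls.txt and replies.txt are placeholders for the saved output
# of the two tshark commands):
#   unanswered_calls calls.txt replies.txt
```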
Under NFSv4, file locking is stateful and integrated into the core protocol (unlike NFSv3, which relies on the separate rpc.statd and rpc.lockd daemons). The capture shows only that the LOCK call went unanswered; the most plausible explanation is a silent lock-throttling mechanism on the storage array, a specialized appliance. The WP-CLI command generates hundreds of locks sequentially in a tight loop with no delay between one CLOSE and the next OPEN, and the appliance's lock manager appears to have dropped the subsequent LOCK request, either because it classified the burst as a potential denial-of-service vector or because of an internal state sequence ID collision.
Because the mount is configured with hard, the Linux NFS client does not time out. It retries the RPC request endlessly or waits for a response that will never arrive. The PHP script, blocked in the locking system call, has no visibility into the network path and sleeps until it is killed.
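Since a hard mount never gives up, one defensive measure is to bound the job from the outside. The sketch below wraps a command in coreutils timeout; the function name run_bounded and the limits in the example are assumptions. The kernel trace shows a killable wait, and the kill -9 used earlier did terminate the stuck processes, so the final SIGKILL is what matters here. This only stops processes from accumulating; it does not remove the leftover lock file or fix the unanswered LOCK call.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command with an upper bound on wall-clock time.
# Sends SIGTERM after LIMIT seconds, then SIGKILL after a further GRACE seconds.
run_bounded() {
    local limit="$1" grace="$2"
    shift 2
    timeout --kill-after="$grace" "$limit" "$@"
}

# Example cron entry (600 and 30 seconds are placeholder limits):
#   0 2 * * * cd /var/www/html && timeout --kill-after=30 600 wp elementor flush_css --quiet
```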
Remediation Strategy
There are two ways to resolve this: change how the application emits CSS, or change how the NFS client handles file locks.
Option 1: Elementor Print Method Modification
Elementor allows you to bypass physical file generation and instead inject the CSS inline into the HTML <head>. This eliminates the file locking requirement entirely. This is controlled via the elementor_css_print_method option in the wp_options table.
wp option update elementor_css_print_method internal
While effective, this forces all CSS to be rendered inline on every page request, bypassing browser caching. For a staging node used for functional testing, this might be acceptable, but it deviates from production configuration parity.
Option 2: NFS Mount Parameter Adjustment
The technically robust solution is to modify how the Linux kernel handles file locks on the NFS mount.
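Changing the mount option from local_lock=none to local_lock=posix instructs the Linux kernel to handle POSIX (fcntl) locks entirely in local memory on the client, so it does not transmit LOCK RPC calls for them to the NFS server. BSD-style flock(2) locks are controlled separately by local_lock=flock, and local_lock=all covers both mechanisms. The kernel trace showed the request arriving through fcntl(), so posix is the setting that applies here.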
This is safe because the staging environment processes CSS generation exclusively on this specific client node. There are no other nodes in the cluster simultaneously attempting to compile Elementor CSS for the same files.
I modified the /etc/fstab configuration.
--- /etc/fstab.orig
+++ /etc/fstab
@@ -10 +10 @@
-10.0.5.50:/exports/staging_uploads /var/www/html/wp-content/uploads nfs4 rw,relatime,vers=4.2,hard,proto=tcp,timeo=600,retrans=2,sec=sys,local_lock=none 0 0
+10.0.5.50:/exports/staging_uploads /var/www/html/wp-content/uploads nfs4 rw,relatime,vers=4.2,hard,proto=tcp,timeo=600,retrans=2,sec=sys,local_lock=posix 0 0
I applied the change with a remount. If a remount does not pick up a changed NFS option, unmount the share and mount it again.
mount -o remount /var/www/html/wp-content/uploads
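I verified the effective options in /proc/mounts. This check matters because nfs(5) documents local_lock among the options for NFS versions 2 and 3, so the value the kernel reports is what counts on an NFSv4 mount.
grep nfs4 /proc/mounts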
10.0.5.50:/exports/staging_uploads /var/www/html/wp-content/uploads nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.5.20,local_lock=posix,addr=10.0.5.50 0 0
Final Execution Benchmark
With local locking enabled, the client no longer sends lock requests to the storage appliance. The kernel grants the exclusive lock immediately from its local VFS (Virtual File System) lock structures.
I executed the WP-CLI command and tracked its execution time.
time wp elementor flush_css --quiet
real 0m1.842s
user 0m1.201s
sys 0m0.314s
The operation completed in 1.8 seconds.
I checked the directory for any remaining lock files.
find /var/www/html/wp-content/uploads/elementor/css/ -type f -name "*.lock"
No files were returned. The local flock implementation cleanly applied and removed the locks sequentially without triggering network latency penalties or state machine drops on the NFS server. The background automation matrix returned to normal operation.
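Because the original failure raised no CPU or memory alert, a periodic check for old lock files is a cheap safeguard against a recurrence. The following is a minimal sketch; the function name stale_locks and the 30-minute threshold in the example are assumptions.

```shell
#!/bin/bash
# Hypothetical check: print .lock files under DIR whose modification time is
# more than MAX_AGE_MIN minutes in the past.
stale_locks() {
    local dir="$1" max_age_min="$2"
    find "$dir" -type f -name '*.lock' -mmin +"$max_age_min" -print | sort
}

# Example (a monitoring job could alert whenever the output is non-empty):
#   stale_locks /var/www/html/wp-content/uploads/elementor/css 30
```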