Elementra 1.1.1 - 100% Elementor WordPress Theme / Free download with activation included

NFSv4 flock() deadlock in Elementor CSS generation

Incident Context

At 02:00 UTC, a scheduled background routine that purges and regenerates static CSS assets began leaving behind orphaned lock files and hung PHP CLI processes. The affected node is a staging environment running AlmaLinux 9 (kernel 5.14.0-284.11.1.el9_2.x86_64) with PHP 8.2.10.

This server is part of an automated testing matrix that evaluates the Elementra theme (a 100% Elementor WordPress theme) before its integration into our primary repository. The node runs headless.

The issue did not trigger standard resource alerts. CPU utilization remained nominal. Memory consumption was stable. The anomaly was identified solely due to the accumulation of .lock files in the /wp-content/uploads/elementor/css/ directory, which eventually caused subsequent deployment pipelines to halt due to permission conflicts during asset synchronization.

Initial Process State Analysis

I connected to the node and listed the accumulated lock files to determine the scope of the blocked operations.

find /var/www/html/wp-content/uploads/elementor/css/ -type f -name "*.lock" -ls

Output:

3491028 4 -rw-r--r-- 1 www-data www-data 0 Nov 12 02:00 /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock
3491035 4 -rw-r--r-- 1 www-data www-data 0 Nov 13 02:00 /var/www/html/wp-content/uploads/elementor/css/post-45.css.lock
3491088 4 -rw-r--r-- 1 www-data www-data 0 Nov 14 02:00 /var/www/html/wp-content/uploads/elementor/css/global.css.lock

The timestamps corresponded precisely with the daily execution of a WP-CLI command invoked via standard cron:

0 2 * * * cd /var/www/html && wp elementor flush_css --quiet

I checked the process table for any lingering wp commands.

ps -eo pid,ppid,user,stat,start,time,cmd | grep "[w]p elementor"

Output:

10291     1 www-data S    Nov 12  00:00:01 php /usr/local/bin/wp elementor flush_css --quiet
18492     1 www-data S    Nov 13  00:00:01 php /usr/local/bin/wp elementor flush_css --quiet
27104     1 www-data S    Nov 14  00:00:01 php /usr/local/bin/wp elementor flush_css --quiet

The processes were in the S (interruptible sleep) state. They had been reparented to PID 1 (systemd), which indicates that the cron shell that launched them had exited or been terminated while the PHP processes were still blocked in a system call.
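The same check can be scripted so that it is easy to repeat on other nodes. The following is a minimal sketch, not part of the original investigation: find_orphaned_sleepers is a hypothetical helper name, and it assumes its input was produced by ps -eo pid,ppid,stat,cmd so that the parent PID is in column 2 and the state in column 3.

```shell
#!/bin/bash
# Hypothetical helper (name and column layout are assumptions).
# Reads `ps -eo pid,ppid,stat,cmd` output on stdin and prints the PID of
# every process that has been reparented to PID 1, is in a state starting
# with S, and whose line contains the fixed string passed as $1.
find_orphaned_sleepers() {
    awk -v pat="$1" '
        $2 == 1 && $3 ~ /^S/ && index($0, pat) > 0 { print $1 }
    '
}

# Usage on a live system:
#   ps -eo pid,ppid,stat,cmd | find_orphaned_sleepers "wp elementor"
```

Because the filter requires PPID 1, the awk process itself never appears in the result even though its own command line contains the pattern.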

To identify the exact system call, I inspected the file descriptors held by the oldest process (PID 10291).

lsof -p 10291

Excerpt of output:

COMMAND   PID     USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
...
php     10291 www-data    0r   CHR    1,3      0t0    1028 /dev/null
php     10291 www-data    1w   CHR    1,3      0t0    1028 /dev/null
php     10291 www-data    2w   CHR    1,3      0t0    1028 /dev/null
php     10291 www-data    3r   DIR   0,43     4096 3491000 /var/www/html/wp-content/uploads/elementor/css
php     10291 www-data    4w   REG   0,43        0 3491028 /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock

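File descriptor 4 had post-12.css.lock open for writing. The w in the FD column is the access mode, not a lock indicator; lsof prints a separate lock character after the mode when a lock is held, and none appears here. That is consistent with a process that has opened the lock file but is still waiting to acquire the lock.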

Tracing the Kernel Execution Path

I pulled the kernel stack trace for the sleeping process to determine where in the kernel space the thread was blocked.

cat /proc/10291/stack

Output:

[<0>] nfs_wait_bit_killable+0x1e/0x90 [nfs]
[<0>] nfs4_proc_setlk+0x12a/0x180 [nfsv4]
[<0>] nfs_flock+0x8f/0xd0 [nfs]
[<0>] locks_lock_inode_wait+0x5a/0x1a0
[<0>] do_fcntl+0x2f1/0x650
[<0>] __x64_sys_fcntl+0x80/0xd0
[<0>] do_syscall_64+0x5c/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

The stack trace isolated the issue. The PHP process had issued an fcntl() system call to acquire a file lock (PHP's userland flock() can end up as fcntl(F_SETLKW), depending on how the PHP build implements it). Because /var/www/html/wp-content/uploads is an NFSv4 mount, the kernel handed the lock request to the NFS client module (nfs_flock), which forwarded it to the NFSv4 server via nfs4_proc_setlk.

The thread was suspended in nfs_wait_bit_killable, waiting indefinitely for an RPC response from the NFS server that was either never sent or dropped.

File System Architecture and NFS Mount Parameters

The staging environments use a shared storage array to simulate the distributed cluster state of sites that require horizontally scaled file synchronization.

I checked the specific mount parameters for the NFS share.

mount | grep nfs4
10.0.5.50:/exports/staging_uploads on /var/www/html/wp-content/uploads type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.5.20,local_lock=none,addr=10.0.5.50)

The critical parameter here is local_lock=none. With local_lock=none, the kernel forwards file lock requests to the NFS server instead of resolving them on the client. Under NFSv4 the server tracks these locks through the protocol's own stateful LOCK operations and leases; the separate NLM (Network Lock Manager) protocol applies only to NFSv3 and earlier. If the lock state between client and server becomes desynchronized, for example after a momentary TCP drop, the client process can hang in F_SETLKW (set lock, wait) indefinitely, because a hard mount keeps retrying or waiting rather than returning an error to the caller.
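Since the behaviour hinges on a single mount option, it can be useful to read that option programmatically, for example in a pre-flight check. The sketch below is an illustration only: local_lock_mode is a hypothetical helper, and it assumes /proc/mounts-style input (device, mount point, type, comma-separated options) rather than the output format of the mount command.

```shell
#!/bin/bash
# Hypothetical helper: print the local_lock value for the mount point in $1.
# Reads /proc/mounts-style lines on stdin. Prints nothing and returns 1
# if the mount point is not found or carries no local_lock option.
local_lock_mode() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] ~ /^local_lock=/) {
                    sub(/^local_lock=/, "", opts[i])
                    print opts[i]
                    found = 1
                }
            }
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage:
#   local_lock_mode /var/www/html/wp-content/uploads < /proc/mounts
```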

Zend Engine and GDB Core Dump Analysis

To confirm the exact PHP userland execution path leading to this system call, I attached gdb directly to the hung PHP CLI process. The kernel stack alone does not show which code requested the lock; I needed the exact file and line number within WordPress, the plugin, or the theme that triggered the operation.

gdb -p 10291

Inside the gdb prompt, I configured the environment to read PHP's debug symbols.

(gdb) source /usr/src/php-8.2.10/.gdbinit
(gdb) zbacktrace

The Zend Engine backtrace yielded the following execution stack:

[0x00007ffc12a4b890] flock(resource(#412), 2) [internal function]
[0x00007ffc12a4b7f0] Elementor\Core\Files\Base->get_file_handle() /var/www/html/wp-content/plugins/elementor/includes/core/files/base.php:124
[0x00007ffc12a4b750] Elementor\Core\Files\Base->update_file() /var/www/html/wp-content/plugins/elementor/includes/core/files/base.php:210
[0x00007ffc12a4b6b0] Elementor\Core\Files\CSS\Post->update() /var/www/html/wp-content/plugins/elementor/includes/core/files/css/base.php:80
[0x00007ffc12a4b600] Elementor\Core\Files\Manager->clear_cache() /var/www/html/wp-content/plugins/elementor/includes/core/files/manager.php:145
[0x00007ffc12a4b540] Elementra_Theme_Hooks->trigger_css_flush() /var/www/html/wp-content/themes/elementra/inc/classes/class-elementra-hooks.php:89
[0x00007ffc12a4b490] WP_CLI\Dispatcher\CommandFactory::extract_function_call() /var/www/html/wp-content/plugins/elementor/cli/manager.php:42

The stack confirmed that Elementor's CSS file generation logic was executing. Specifically, Elementor\Core\Files\Base->get_file_handle() had opened a .lock file and was attempting to lock it with flock($handle, LOCK_EX). The second argument, 2, is the value of LOCK_EX (exclusive lock).

I inspected the source code in base.php at line 124.

sed -n '120,128p' /var/www/html/wp-content/plugins/elementor/includes/core/files/base.php
    public function get_file_handle() {
        if ( ! $this->file_handle ) {
            $this->file_handle = fopen( $this->path . '.lock', 'w' );
            if ( $this->file_handle ) {
                flock( $this->file_handle, LOCK_EX );
            }
        }
        return $this->file_handle;
    }

The method opens the file and immediately requests an exclusive, blocking lock.

However, the backtrace also showed a specific call from the theme: Elementra_Theme_Hooks->trigger_css_flush(). I needed to investigate why the theme was wrapping the standard Elementor CSS flush command.

cat /var/www/html/wp-content/themes/elementra/inc/classes/class-elementra-hooks.php

Excerpt:

    public function trigger_css_flush() {
        // Force Elementor to regenerate global typography before flushing post CSS
        $global_styles = get_option('elementra_global_styles_cache');
        if ( empty( $global_styles ) ) {
            Elementra_Typography::compile_fonts();
        }

        \Elementor\Plugin::$instance->files_manager->clear_cache();
    }

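The theme was functioning correctly: the wrapper only primes a typography cache and then calls Elementor's standard clear_cache(), adding no locking of its own. The error was purely infrastructural. The blocking flock call on an NFS share with local_lock=none was the root cause of the process suspension.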
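For contrast with the unbounded wait in get_file_handle(), the sketch below shows what a bounded exclusive lock looks like at the shell level. It is an illustration, not Elementor's code: with_lock_deadline is a hypothetical wrapper, it assumes flock(1) from util-linux is installed, and the exit code 75 is an arbitrary choice.

```shell
#!/bin/bash
# Hypothetical wrapper around flock(1) from util-linux (assumed installed).
# Takes an exclusive lock on LOCKFILE, waiting at most SECONDS, then runs
# the given command while holding the lock.
# Returns 75 if the lock was not obtained in time, otherwise the
# command's own exit status.
with_lock_deadline() {
    local lockfile=$1 seconds=$2
    shift 2
    # -x: exclusive lock, -w: wait limit in seconds,
    # -E: exit code to use when the wait limit is reached
    flock -x -w "$seconds" -E 75 "$lockfile" "$@"
}

# Usage (paths are illustrative):
#   with_lock_deadline /tmp/post-12.css.lock 5 cp new.css post-12.css
```

A deadline like this bounds the wait when another process legitimately holds the lock. It should not be read as a fix for the hang described here: on a hard NFS mount the lock call may still block inside the kernel while it waits for an RPC reply, regardless of what the caller asked for.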

Inotify Tracing the Lock Contention

To prove that the lock was not being held by another active process (which would make F_SETLKW function as intended, albeit slowly), but was instead a network protocol deadlock, I needed to monitor the file system events in real-time.

I terminated the hung PHP processes to release any client-side lock state.

kill -9 10291 18492 27104
rm -f /var/www/html/wp-content/uploads/elementor/css/*.lock

I wrote a Bash script that uses inotifywait to monitor file system events within the css directory.

#!/bin/bash
# Watch the Elementor CSS directory and print one timestamped line per event.
DIR="/var/www/html/wp-content/uploads/elementor/css"

inotifywait -m -e create -e open -e modify -e close_write -e close_nowrite -e delete --timefmt '%H:%M:%S' --format '%T %w%f %e' "$DIR" | while read -r time file event; do
    echo "[$time] $file : $event"
done

I executed the script in one terminal. In another, I manually ran the WP-CLI command.

wp elementor flush_css --quiet

The inotifywait output:

Setting up watches.
Watches established.
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock : CREATE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock : OPEN
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css : CREATE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css : OPEN
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css : MODIFY
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css : CLOSE_WRITE,CLOSE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock : CLOSE_WRITE,CLOSE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-12.css.lock : DELETE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-45.css.lock : CREATE
[15:12:44] /var/www/html/wp-content/uploads/elementor/css/post-45.css.lock : OPEN

The command processed post-12.css successfully. It created the lock, opened it, generated the CSS, closed the CSS, closed the lock, and deleted the lock.

It then created post-45.css.lock and opened it. It never proceeded to CREATE post-45.css. The terminal running the WP-CLI command hung indefinitely at this exact point.

There was no secondary process contending for post-45.css.lock. The lock request was simply disappearing into the network stack.
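With hundreds of files in a real run, spotting the one lock that was created but never removed by eye is tedious. The following sketch automates that comparison. It is a hypothetical helper (unreleased_locks is not an existing tool) and assumes log lines in the format printed by the monitoring script, with paths that contain no spaces.

```shell
#!/bin/bash
# Hypothetical helper: reads log lines of the form
#   [HH:MM:SS] /path/to/file : EVENT[,EVENT...]
# on stdin and prints every *.lock path that saw a CREATE event
# without a later DELETE event. Assumes paths contain no spaces.
unreleased_locks() {
    awk '
        $2 ~ /\.lock$/ {
            if ($4 ~ /(^|,)CREATE(,|$)/) { pending[$2] = 1 }
            if ($4 ~ /(^|,)DELETE(,|$)/) { delete pending[$2] }
        }
        END { for (path in pending) print path }
    ' | sort
}

# Usage (assuming the monitoring script's output was saved to inotify.log):
#   unreleased_locks < inotify.log
```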

RPC Protocol Dissection (NFSv4 tcpdump)

I needed to see the raw RPC packets to understand why the NFS server was failing to acknowledge the lock request for post-45.css.lock.

I initiated a packet capture targeting the NFS port (2049) between the client (10.0.5.20) and the storage server (10.0.5.50).

tcpdump -i eth0 host 10.0.5.50 and port 2049 -w nfs_lock.pcap

I terminated the hung WP-CLI command, deleted the lock file, and ran the command again to reproduce the hang while capturing the traffic. Once the process froze, I halted the capture and analyzed it with tshark.

tshark -r nfs_lock.pcap -Y "nfs && rpc.msgtyp == 0" -T fields -e frame.number -e ip.src -e ip.dst -e rpc.xid -e nfs.opcode -e nfs.lock.locktype

The output filtered for NFS Call operations:

14   10.0.5.20   10.0.5.50   0x4f1a2b3c   OPEN       
16   10.0.5.20   10.0.5.50   0x4f1a2b3d   LOCK       WRITE_LT
18   10.0.5.20   10.0.5.50   0x4f1a2b3e   WRITE      
20   10.0.5.20   10.0.5.50   0x4f1a2b3f   LOCKU      
22   10.0.5.20   10.0.5.50   0x4f1a2b40   CLOSE      
24   10.0.5.20   10.0.5.50   0x4f1a2b41   REMOVE     
26   10.0.5.20   10.0.5.50   0x4f1a2b42   OPEN       
28   10.0.5.20   10.0.5.50   0x4f1a2b43   LOCK       WRITE_LT

Frames 14 through 24 correspond to the successful generation of post-12.css.

Frame 26 is the OPEN call for post-45.css.lock. Frame 28 is the LOCK call for it, requesting a write lock (WRITE_LT).

I checked the NFS Replies from the server (rpc.msgtyp == 1):

tshark -r nfs_lock.pcap -Y "nfs && rpc.msgtyp == 1" -T fields -e frame.number -e ip.src -e ip.dst -e rpc.xid -e nfs.status
15   10.0.5.50   10.0.5.20   0x4f1a2b3c   NFS4_OK
17   10.0.5.50   10.0.5.20   0x4f1a2b3d   NFS4_OK
19   10.0.5.50   10.0.5.20   0x4f1a2b3e   NFS4_OK
21   10.0.5.50   10.0.5.20   0x4f1a2b3f   NFS4_OK
23   10.0.5.50   10.0.5.20   0x4f1a2b40   NFS4_OK
25   10.0.5.50   10.0.5.20   0x4f1a2b41   NFS4_OK
27   10.0.5.50   10.0.5.20   0x4f1a2b42   NFS4_OK

Frame 27 successfully returned NFS4_OK for the OPEN call.

There is no reply to the LOCK request sent in Frame 28 (XID 0x4f1a2b43). The server never responded.
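Matching calls to replies by XID can also be done mechanically, which helps on longer captures. The sketch below is one way to do it, under the assumption that the two tshark listings were saved to separate files with the XID in the fourth column; unanswered_xids is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: prints every XID that appears in CALLS_FILE but
# not in REPLIES_FILE. Both files are assumed to contain tshark
# "-T fields" output with the RPC XID in column 4.
unanswered_xids() {
    local calls=$1 replies=$2
    awk '
        FILENAME == ARGV[1] { answered[$4] = 1; next }  # first file: replies
        !($4 in answered)   { print $4 }                # second file: calls
    ' "$replies" "$calls"
}

# Usage (file names are illustrative):
#   unanswered_xids calls.txt replies.txt
```

The comparison keys on FILENAME rather than on record counters so that an empty replies file still reports every call as unanswered.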

Under NFSv4, file locking is stateful and integrated into the core protocol (unlike NFSv3, which relies on the separate rpc.statd and rpc.lockd daemons). The storage array, a specialized appliance, was enforcing a silent lock-throttling mechanism. Because the WP-CLI command was issuing hundreds of lock requests back to back, with no delay between one CLOSE and the next OPEN, the appliance's lock manager dropped the subsequent request, either because it classified the burst as a potential denial-of-service vector or because of an internal state sequence ID collision.

Because the Linux NFS client mount was configured with hard, it does not time out. It simply retries the RPC request endlessly or waits for a response that will never arrive. The PHP script, blocked by fcntl, is unaware of the network topology and sleeps until interrupted.
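Independently of the mount options, the scheduled job itself can be given a deadline so that a hung run does not linger for days. The following is a sketch under stated assumptions rather than what was deployed: run_with_deadline is a hypothetical wrapper, it relies on timeout(1) from GNU coreutils, and the limits in the usage line are illustrative. The SIGKILL follow-up is included because kill -9 was what cleared the stuck processes earlier.

```shell
#!/bin/bash
# Hypothetical cron wrapper built on timeout(1) from GNU coreutils.
# Sends SIGTERM to the command after LIMIT seconds, then SIGKILL if it
# is still running GRACE seconds later.
# timeout(1) exits with status 124 when the limit is reached.
run_with_deadline() {
    local limit=$1 grace=$2
    shift 2
    timeout -k "$grace" "$limit" "$@"
}

# Usage (limits are illustrative):
#   run_with_deadline 600 30 wp elementor flush_css --quiet
```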

Remediation Strategy

There are two ways to resolve this: modifying the application logic or changing how the NFS mount handles lock requests.

Option 1: Elementor Print Method Modification

Elementor allows you to bypass physical file generation and instead inject the CSS inline into the HTML <head>. This eliminates the file locking requirement entirely. This is controlled via the elementor_css_print_method option in the wp_options table.

wp option update elementor_css_print_method internal

While effective, this forces all CSS to be rendered inline on every page request, bypassing browser caching. For a staging node used for functional testing, this might be acceptable, but it deviates from production configuration parity.

Option 2: NFS Mount Parameter Adjustment

The technically robust solution is to modify how the Linux kernel handles file locks on the NFS mount.

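By changing the mount option from local_lock=none to local_lock=posix, we instruct the Linux kernel to resolve POSIX (fcntl-style) lock requests in local memory on the client instead of transmitting LOCK RPC calls to the NFS server. Per nfs(5), posix covers only fcntl()-style locks, while flock(2)-style locks are made local with local_lock=flock or local_lock=all; posix is sufficient here because the kernel trace showed the request arriving through fcntl().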

This is safe because the staging environment processes CSS generation exclusively on this specific client node. There are no other nodes in the cluster simultaneously attempting to compile Elementor CSS for the same files.

I modified the /etc/fstab configuration.

--- /etc/fstab.orig
+++ /etc/fstab
@@ -10,2 +10,2 @@
-10.0.5.50:/exports/staging_uploads /var/www/html/wp-content/uploads nfs4 rw,relatime,vers=4.2,hard,proto=tcp,timeo=600,retrans=2,sec=sys,local_lock=none 0 0
+10.0.5.50:/exports/staging_uploads /var/www/html/wp-content/uploads nfs4 rw,relatime,vers=4.2,hard,proto=tcp,timeo=600,retrans=2,sec=sys,local_lock=posix 0 0

I applied the remount operation.

mount -o remount /var/www/html/wp-content/uploads

I verified the applied options.

grep nfs4 /proc/mounts
10.0.5.50:/exports/staging_uploads /var/www/html/wp-content/uploads nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.5.20,local_lock=posix,addr=10.0.5.50 0 0

Final Execution Benchmark

With the local lock mapping enabled, the PHP process no longer delegates lock state to the storage appliance. The kernel immediately grants the exclusive lock via local VFS (Virtual File System) structs.

I executed the WP-CLI command and tracked its execution time.

time wp elementor flush_css --quiet
real    0m1.842s
user    0m1.201s
sys     0m0.314s

The operation completed in 1.8 seconds.

I checked the directory for any remaining lock files.

find /var/www/html/wp-content/uploads/elementor/css/ -type f -name "*.lock"

No files were returned. The local flock implementation cleanly applied and removed the locks sequentially without triggering network latency penalties or state machine drops on the NFS server. The background automation matrix returned to normal operation.
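Because the original problem was noticed only through accumulating lock files, the same find check can be turned into a recurring alert. This is a minimal sketch: stale_locks is a hypothetical helper, the 60-minute threshold in the usage line is an arbitrary example, and wiring the output into an alerting system is left out.

```shell
#!/bin/bash
# Hypothetical monitoring helper: prints *.lock files under DIR whose
# modification time is more than MINUTES minutes in the past.
# Empty output means no stale locks were found.
stale_locks() {
    local dir=$1 minutes=$2
    find "$dir" -type f -name '*.lock' -mmin +"$minutes" -print
}

# Usage (threshold is illustrative):
#   stale_locks /var/www/html/wp-content/uploads/elementor/css 60
```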
