tests: retry os.rename on PermissionError failure in lttng_live_server.py
authorSimon Marchi <simon.marchi@efficios.com>
Mon, 30 Oct 2023 18:38:57 +0000 (14:38 -0400)
committerPhilippe Proulx <eeppeliteloop@gmail.com>
Tue, 31 Oct 2023 15:08:36 +0000 (11:08 -0400)
On the Windows CI jobs, we get random failures like:

    # plugins/src.ctf.lttng-live/test-live.sh: python3 /c/Users/jenkins/workspace/dev_review_babeltrace_master_winbuild/build/std/conf/std/platform/msys2-mingw64/src/babeltrace/tests/data/plugins/src.ctf.lttng-live/lttng_live_server.py /c/Users/jenkins/workspace/dev_review_babeltrace_master_winbuild/build/std/conf/std/platform/msys2-mingw64/src/babeltrace/tests/data/plugins/src.ctf.lttng-live/inactivity-discarded-packet.json --port-file /c/Users/jenkins/workspace/dev_review_babeltrace_master_winbuild/build/std/conf/std/platform/msys2-mingw64/tmp/test-live-server-port.Rn2dyS --trace-path-prefix C:\Users\jenkins\workspace\dev_review_babeltrace_master_winbuild\build\std\conf\std\platform\msys2-mingw64\src\babeltrace\tests\data\ctf-traces
    Traceback (most recent call last):
      File "C:/Users/jenkins/workspace/dev_review_babeltrace_master_winbuild/build/std/conf/std/platform/msys2-mingw64/src/babeltrace/tests/data/plugins/src.ctf.lttng-live/lttng_live_server.py", line 1951, in <module>
        LttngLiveServer(port, port_filename, sessions, max_query_data_response_size)
      File "C:/Users/jenkins/workspace/dev_review_babeltrace_master_winbuild/build/std/conf/std/platform/msys2-mingw64/src/babeltrace/tests/data/plugins/src.ctf.lttng-live/lttng_live_server.py", line 1667, in __init__
        self._write_port_to_file(port_filename)
      File "C:/Users/jenkins/workspace/dev_review_babeltrace_master_winbuild/build/std/conf/std/platform/msys2-mingw64/src/babeltrace/tests/data/plugins/src.ctf.lttng-live/lttng_live_server.py", line 1792, in _write_port_to_file
        os.replace(tmp_port_file.name, port_filename)
    PermissionError: [WinError 5] Access is denied: 'C:/Users/jenkins/workspace/dev_review_babeltrace_master_winbuild/build/std/conf/std/platform/msys2-mingw64/tmp/tmpt13jh6sp' -> 'C:/Users/jenkins/workspace/dev_review_babeltrace_master_winbuild/build/std/conf/std/platform/msys2-mingw64/tmp/test-live-server-port.Rn2dyS'

The PermissionError exception is raised when trying to move the port
file from its temporary location to its final location, where the bash
script expects it to appear.

I don't understand the root cause of the issue.  When exiting the `with`
scope, the temporary file is supposed to be closed, and it should be
fine to move it.  I suppose it's possible that something in the Windows
kernel hasn't completely finished using the file when we try to move it.

Implement a wait-and-retry scheme as a (bad) workaround.

Change-Id: Ia8dcefca9538aa5e58438bf84a3fa67e5e05a49a
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Reviewed-on: https://review.lttng.org/c/babeltrace/+/11174
Reviewed-by: Philippe Proulx <eeppeliteloop@gmail.com>
tests/data/plugins/src.ctf.lttng-live/lttng_live_server.py

index 2177afa508fb09bc79c575b1888b4b4e2c0379c1..3a3c565e094a53caa5956ab4173fff4f125968c9 100644 (file)
@@ -7,6 +7,7 @@
 
 import os
 import re
+import time
 import socket
 import struct
 import logging
@@ -1788,13 +1789,41 @@ class LttngLiveServer:
         ) as tmp_port_file:
             print(self._server_port, end="", file=tmp_port_file)
 
-        # Rename temporary file to real file
-        os.replace(tmp_port_file.name, port_filename)
-        logging.info(
-            'Renamed port file: src-path="{}", dst-path="{}"'.format(
-                tmp_port_file.name, port_filename
-            )
-        )
+        # Rename temporary file to real file.
+        #
+        # For unknown reasons, on Windows, moving the port file from its
+        # temporary location to its final location (where the user of
+        # the server expects it to appear) may raise a `PermissionError`
+        # exception.
+        #
+        # We suppose it's possible that something in the Windows kernel
+        # hasn't completely finished using the file when we try to move
+        # it.
+        #
+        # Use a wait-and-retry scheme as a (bad) workaround.
+        num_attempts = 5
+        retry_delay_s = 1
+
+        for attempt in reversed(range(num_attempts)):
+            try:
+                os.replace(tmp_port_file.name, port_filename)
+                logging.info(
+                    'Renamed port file: src-path="{}", dst-path="{}"'.format(
+                        tmp_port_file.name, port_filename
+                    )
+                )
+                return
+            except PermissionError:
+                logging.info(
+                    'Permission error while attempting to rename port file; retrying in {} second: src-path="{}", dst-path="{}"'.format(
+                        retry_delay_s, tmp_port_file.name, port_filename
+                    )
+                )
+
+                if attempt == 0:
+                    raise
+
+                time.sleep(retry_delay_s)
 
 
 def _session_descriptors_from_path(
This page took 0.034482 seconds and 4 git commands to generate.