<p>The Wombelix Post - reposync: How changing '<' to '>=' introduced a weird and hard to track bug</p>
<p>Dominik Wombacher, 2022-03-22, https://dominik.wombacher.cc/posts/how_changing_less-than_to_greater-equal_introduced_a_weird_and_hard_to_track_bug.html</p>
<!-- SPDX-FileCopyrightText: 2023 Dominik Wombacher <dominik@wombacher.cc> -->
<!-- -->
<!-- SPDX-License-Identifier: CC-BY-SA-4.0 -->
<p>After upgrading to <a class="reference external" href="https://www.uyuni-project.org">Uyuni 2022.2</a> I wasn't able to sync openSUSE Leap 15.3
and Oracle Linux 8 repositories anymore because the configured HTTP proxy was ignored. Multiple people reported similar issues:
<a class="reference external" href="https://github.com/uyuni-project/uyuni/issues/4932">#4932</a>
(Archive: <a class="reference external" href="https://web.archive.org/web/20220322220517/https://github.com/uyuni-project/uyuni/issues/4932">[1]</a>,
<a class="reference external" href="https://archive.today/2022.03.22-220522/https://github.com/uyuni-project/uyuni/issues/4932">[2]</a>),
<a class="reference external" href="https://github.com/uyuni-project/uyuni/issues/4850">#4850</a>
(Archive: <a class="reference external" href="https://web.archive.org/web/20220322220540/https://github.com/uyuni-project/uyuni/issues/4850">[1]</a>,
<a class="reference external" href="https://archive.today/2022.03.22-220553/https://github.com/uyuni-project/uyuni/issues/4850">[2]</a>),
<a class="reference external" href="https://github.com/uyuni-project/uyuni/issues/4826">#4826</a>
(Archive: <a class="reference external" href="https://web.archive.org/web/20220322220605/https://github.com/uyuni-project/uyuni/issues/4826">[1]</a>,
<a class="reference external" href="https://archive.today/2022.03.22-220613/https://github.com/uyuni-project/uyuni/issues/4826">[2]</a>).</p>
<p>During multiple troubleshooting sessions I learned a lot about
the Python Codebase and the reposync tool.</p>
<p>It took me quite a while to track the issue down to the <code>urlgrabber</code> package.</p>
<p>What happened? The <code>find_proxy</code> method of <code>urlgrabber.grabber.URLGrabberOptions</code>
picks the proxy server to use based on the provided URL scheme.</p>
<p>Let's assume <code>server.satellite.http_proxy</code> in <code>/etc/rhn/rhn.conf</code> is set to <code>http://10.11.12.13:80</code>.</p>
<pre class="code python literal-block">
<span class="pygments-o">>>></span> <span class="pygments-kn">from</span> <span class="pygments-nn">urlgrabber.grabber</span> <span class="pygments-kn">import</span> <span class="pygments-n">URLGrabberOptions</span><span class="pygments-w">
</span><span class="pygments-o">>>></span> <span class="pygments-n">opts</span> <span class="pygments-o">=</span> <span class="pygments-n">URLGrabberOptions</span><span class="pygments-p">(</span><span class="pygments-n">proxy</span><span class="pygments-o">=</span><span class="pygments-kc">None</span><span class="pygments-p">,</span> <span class="pygments-n">proxies</span><span class="pygments-o">=</span><span class="pygments-p">{</span><span class="pygments-s1">'http'</span><span class="pygments-p">:</span> <span class="pygments-s1">'http://10.11.12.13:80'</span><span class="pygments-p">,</span> <span class="pygments-s1">'https'</span><span class="pygments-p">:</span> <span class="pygments-s1">'http://10.11.12.13:80'</span><span class="pygments-p">,</span> <span class="pygments-s1">'ftp'</span><span class="pygments-p">:</span> <span class="pygments-s1">'http://10.11.12.13:80'</span><span class="pygments-p">})</span><span class="pygments-w">
</span><span class="pygments-o">>>></span> <span class="pygments-n">opts</span><span class="pygments-o">.</span><span class="pygments-n">find_proxy</span><span class="pygments-p">(</span><span class="pygments-sa">b</span><span class="pygments-s1">'http://test.example.com'</span><span class="pygments-p">,</span> <span class="pygments-sa">b</span><span class="pygments-s1">'http'</span><span class="pygments-p">)</span><span class="pygments-w">
</span>
</pre>
<p>That's a simplified version of the relevant code. An instance of <code>URLGrabberOptions</code> holds a mapping of proxies,
one per URL scheme. The <code>find_proxy</code> method picks the right proxy from <code>opts.proxies</code>
based on two parameters, <code>url</code> and <code>scheme</code>, which are both passed as type <code>bytes</code>.</p>
<p>Expected result: <code>opts.proxy</code> contains the value <code>http://10.11.12.13:80</code> as type <code>str</code>.
Actual result: <code>None</code>.</p>
<p>The package <code>urlgrabber</code> is quite old and not that actively maintained.
Python 3 support was added in Version 4, the last release is from October 2019.</p>
<p>It looks like the problem above, <code>proxy</code> being <code>None</code>, comes from an inconsistent bytes / string conversion.
If you pass the scheme as a string instead of bytes, you get the expected result:</p>
<pre class="code python literal-block">
<span class="pygments-o">>>></span> <span class="pygments-kn">from</span> <span class="pygments-nn">urlgrabber.grabber</span> <span class="pygments-kn">import</span> <span class="pygments-n">URLGrabberOptions</span><span class="pygments-w">
</span><span class="pygments-o">>>></span> <span class="pygments-n">opts</span> <span class="pygments-o">=</span> <span class="pygments-n">URLGrabberOptions</span><span class="pygments-p">(</span><span class="pygments-n">proxy</span><span class="pygments-o">=</span><span class="pygments-kc">None</span><span class="pygments-p">,</span> <span class="pygments-n">proxies</span><span class="pygments-o">=</span><span class="pygments-p">{</span><span class="pygments-s1">'http'</span><span class="pygments-p">:</span> <span class="pygments-s1">'http://10.11.12.13:80'</span><span class="pygments-p">,</span> <span class="pygments-s1">'https'</span><span class="pygments-p">:</span> <span class="pygments-s1">'http://10.11.12.13:80'</span><span class="pygments-p">,</span> <span class="pygments-s1">'ftp'</span><span class="pygments-p">:</span> <span class="pygments-s1">'http://10.11.12.13:80'</span><span class="pygments-p">})</span><span class="pygments-w">
</span><span class="pygments-o">>>></span> <span class="pygments-n">opts</span><span class="pygments-o">.</span><span class="pygments-n">find_proxy</span><span class="pygments-p">(</span><span class="pygments-sa">b</span><span class="pygments-s1">'http://test.example.com'</span><span class="pygments-p">,</span> <span class="pygments-s1">'http'</span><span class="pygments-p">)</span><span class="pygments-w">
</span><span class="pygments-o">>>></span> <span class="pygments-n">opts</span><span class="pygments-o">.</span><span class="pygments-n">proxy</span><span class="pygments-w">
</span><span class="pygments-s1">'http://10.11.12.13:80'</span><span class="pygments-w">
</span>
</pre>
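<p>The mismatch can be reproduced without <code>urlgrabber</code> at all. The following is a minimal sketch, assuming the proxy is looked up by using the scheme as a key into a <code>str</code>-keyed mapping; <code>lookup_proxy</code> is a hypothetical helper for illustration and not part of <code>urlgrabber</code>.</p>

```python
# Stand-in for opts.proxies: the keys are str, as in the examples above.
proxies = {
    'http': 'http://10.11.12.13:80',
    'https': 'http://10.11.12.13:80',
    'ftp': 'http://10.11.12.13:80',
}


def lookup_proxy(proxies, scheme):
    """Hypothetical helper: return the proxy configured for a scheme, or None."""
    return proxies.get(scheme)


# In Python 3, bytes and str never compare equal, so a bytes key misses.
proxy_from_bytes = lookup_proxy(proxies, b'http')
proxy_from_str = lookup_proxy(proxies, 'http')

print(proxy_from_bytes)  # None
print(proxy_from_str)    # http://10.11.12.13:80
```

<p>No exception is raised for the <code>bytes</code> key; the lookup just quietly returns nothing, which matches the silently ignored proxy described above.</p>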
<p>To get the sync of openSUSE Leap 15.3 and Oracle Linux 8 repositories working again
through an HTTP proxy, a small change was sufficient:</p>
<pre class="code diff literal-block">
<span class="pygments-gh">diff --git a/backend/satellite_tools/download.py b/backend/satellite_tools/download.py</span><span class="pygments-w">
</span><span class="pygments-gh">index 3d064e5c6ce..3b7a02d5176 100644</span><span class="pygments-w">
</span><span class="pygments-gd">--- a/backend/satellite_tools/download.py</span><span class="pygments-w">
</span><span class="pygments-gi">+++ b/backend/satellite_tools/download.py</span><span class="pygments-w">
</span><span class="pygments-gu">@@ -114,7 +114,7 @@ def __init__(self, url, filename, opts, curl_cache, parent):</span><span class="pygments-w">
</span> self.parent = parent<span class="pygments-w">
</span> (url, parts) = opts.urlparser.parse(url, opts)<span class="pygments-w">
</span> (scheme, host, path, parm, query, frag) = parts<span class="pygments-w">
</span><span class="pygments-gd">- opts.find_proxy(url, scheme)</span><span class="pygments-w">
</span><span class="pygments-gi">+ opts.find_proxy(url, scheme.decode("utf-8"))</span><span class="pygments-w">
</span> super().__init__(url, filename, opts)<span class="pygments-w">
</span> def _do_open(self):<span class="pygments-w">
</span>
</pre>
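<p>Why the scheme is <code>bytes</code> in the first place can be illustrated with the standard library. This is only a sketch: <code>urllib.parse.urlparse</code> stands in for <code>opts.urlparser.parse</code> here, and whether <code>urlgrabber</code> parses URLs the same way internally is an assumption.</p>

```python
from urllib.parse import urlparse

# Parsing a bytes URL yields bytes components, including the scheme.
parts = urlparse(b'http://test.example.com')
scheme = parts.scheme

# Decode before using the scheme as a key into a str-keyed proxy mapping.
scheme_str = scheme.decode('utf-8')

print(type(scheme).__name__, type(scheme_str).__name__)
```

<p>Decoding at the call site, as in the diff above, fixes exactly one caller. Any other code path that passes a <code>bytes</code> scheme is left untouched.</p>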
<p>Pull Request <a class="reference external" href="https://github.com/uyuni-project/uyuni/pull/4953">#4953</a>
(Archive: <a class="reference external" href="https://web.archive.org/web/20220322220636/https://github.com/uyuni-project/uyuni/pull/4953">[1]</a>,
<a class="reference external" href="https://archive.today/2022.03.22-220654/https://github.com/uyuni-project/uyuni/pull/4953">[2]</a>)
was accepted and merged. Unfortunately, a few users still reported issues, this time especially with CentOS 7.</p>
<p>After a quick check, I had the impression that syncing EL8-based distributions like Oracle Linux 8
uses the same code as syncing EL7-based distributions like CentOS 7.
That is more or less right, but due to things like <em>mirrorlist</em>, there are some additional steps and further
calls to <code>urlgrabber</code> spread across the reposync code.</p>
<p>I had to find another way to fix it without touching multiple files and methods.
So I gave some hacky <a class="reference external" href="https://en.wikipedia.org/wiki/Monkey_patch">Monkey Patching</a> a try.
First I reverted the previous workaround:</p>
<pre class="code diff literal-block">
<span class="pygments-gh">diff --git a/python/spacewalk/satellite_tools/download.py b/python/spacewalk/satellite_tools/download.py</span><span class="pygments-w">
</span><span class="pygments-gh">index 3b7a02d5176..3d064e5c6ce 100644</span><span class="pygments-w">
</span><span class="pygments-gd">--- a/python/spacewalk/satellite_tools/download.py</span><span class="pygments-w">
</span><span class="pygments-gi">+++ b/python/spacewalk/satellite_tools/download.py</span><span class="pygments-w">
</span><span class="pygments-gu">@@ -114,7 +114,7 @@ def __init__(self, url, filename, opts, curl_cache, parent):</span><span class="pygments-w">
</span> self.parent = parent<span class="pygments-w">
</span> (url, parts) = opts.urlparser.parse(url, opts)<span class="pygments-w">
</span> (scheme, host, path, parm, query, frag) = parts<span class="pygments-w">
</span><span class="pygments-gd">- opts.find_proxy(url, scheme.decode("utf-8"))</span><span class="pygments-w">
</span><span class="pygments-gi">+ opts.find_proxy(url, scheme)</span><span class="pygments-w">
</span> super().__init__(url, filename, opts)<span class="pygments-w">
</span> def _do_open(self):<span class="pygments-w">
</span>
</pre>
<p>Then I created a new <code>find_proxy</code> method, which just calls the original one
but performs the bytes to string conversion first. The magic happens in the two lines after
the method: <code>urlgrabber_find_proxy</code> keeps a reference to the original method from <code>urlgrabber</code>,
and my own version replaces the original one. That way it doesn't matter where in
<code>yum_src.py</code> anything related to <code>urlgrabber</code> is triggered: the scheme
is always converted to a string and the proxy is set as configured and expected.</p>
<pre class="code diff literal-block">
<span class="pygments-gh">diff --git a/python/spacewalk/satellite_tools/repo_plugins/yum_src.py b/python/spacewalk/satellite_tools/repo_plugins/yum_src.py</span><span class="pygments-w">
</span><span class="pygments-gh">index 85013cfb36a..39f36be61e5 100644</span><span class="pygments-w">
</span><span class="pygments-gd">--- a/python/spacewalk/satellite_tools/repo_plugins/yum_src.py</span><span class="pygments-w">
</span><span class="pygments-gi">+++ b/python/spacewalk/satellite_tools/repo_plugins/yum_src.py</span><span class="pygments-w">
</span><span class="pygments-gu">@@ -80,6 +80,17 @@</span><span class="pygments-w">
</span>APACHE_USER = 'wwwrun'<span class="pygments-w">
</span>APACHE_GROUP = 'www'<span class="pygments-w">
</span><span class="pygments-gi">+</span><span class="pygments-w">
</span><span class="pygments-gi">+# Monkey Patch 'urlgrabber.grabber' method 'find_proxy' to enforce type string for variable 'scheme'</span><span class="pygments-w">
</span><span class="pygments-gi">+# Workaround due to wrong byte/string handling in 'urlgrabber' package</span><span class="pygments-w">
</span><span class="pygments-gi">+# Required by reposync to connect through http_proxy as configured</span><span class="pygments-w">
</span><span class="pygments-gi">+def find_proxy(self, url, scheme):</span><span class="pygments-w">
</span><span class="pygments-gi">+    urlgrabber_find_proxy(self, url, scheme.decode('utf-8'))</span><span class="pygments-w">
</span><span class="pygments-gi">+</span><span class="pygments-w">
</span><span class="pygments-gi">+urlgrabber_find_proxy = urlgrabber.grabber.URLGrabberOptions.find_proxy</span><span class="pygments-w">
</span><span class="pygments-gi">+urlgrabber.grabber.URLGrabberOptions.find_proxy = find_proxy</span><span class="pygments-w">
</span><span class="pygments-gi">+</span><span class="pygments-w">
</span><span class="pygments-gi">+</span><span class="pygments-w">
</span>class ZyppoSync:<span class="pygments-w">
</span> """<span class="pygments-w">
</span> This class prepares a environment for running Zypper inside a dedicated reposync root<span class="pygments-w">
</span>
</pre>
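<p>The same pattern can be tried in isolation. The sketch below uses a hypothetical <code>Options</code> class as a stand-in for <code>URLGrabberOptions</code>, so it runs without <code>urlgrabber</code> installed. It also adds an <code>isinstance</code> check that is not part of the patch above, so that callers which already pass a <code>str</code> don't fail on <code>.decode()</code>.</p>

```python
class Options:
    """Hypothetical stand-in for urlgrabber.grabber.URLGrabberOptions."""

    def __init__(self, proxies):
        self.proxies = proxies
        self.proxy = None

    def find_proxy(self, url, scheme):
        # Simplified: only a str scheme matches the str keys.
        self.proxy = self.proxies.get(scheme)


# Keep a reference to the original method before replacing it.
original_find_proxy = Options.find_proxy


def find_proxy(self, url, scheme):
    # Decode only when needed, so callers passing str keep working.
    if isinstance(scheme, bytes):
        scheme = scheme.decode('utf-8')
    return original_find_proxy(self, url, scheme)


# Replace the method on the class; every instance now uses the wrapper.
Options.find_proxy = find_proxy

opts = Options({'http': 'http://10.11.12.13:80'})

opts.find_proxy(b'http://test.example.com', b'http')
proxy_after_bytes = opts.proxy

opts.find_proxy(b'http://test.example.com', 'http')
proxy_after_str = opts.proxy

print(proxy_after_bytes, proxy_after_str)
```

<p>Because the replacement happens on the class and not on a single instance, it applies to every object created afterwards, no matter which part of the code creates it.</p>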
<p>There is an <a class="reference external" href="https://github.com/rpm-software-management/urlgrabber/issues/33">Issue</a>
(Archive: <a class="reference external" href="https://web.archive.org/web/20220322220711/https://github.com/rpm-software-management/urlgrabber/issues/33">[1]</a>,
<a class="reference external" href="https://archive.today/2022.03.22-220733/https://github.com/rpm-software-management/urlgrabber/issues/33">[2]</a>)
in the urlgrabber repository. Until that's fixed, it looks like a
workaround in the Uyuni / reposync codebase will be required.</p>
<p>I created a new <a class="reference external" href="https://github.com/uyuni-project/uyuni/pull/5051">Pull Request</a>
(Archive: <a class="reference external" href="https://web.archive.org/web/20220322220814/https://github.com/uyuni-project/uyuni/pull/5051">[1]</a>,
<a class="reference external" href="https://archive.today/2022.03.22-220809/https://github.com/uyuni-project/uyuni/pull/5051">[2]</a>)
in the Uyuni Project based on the fix described above. Let's see if anyone comes
up with a more elegant solution or if the guys are happy with this one and agree to merge it.</p>
<p>Based on my tests syncing <em>openSUSE Leap 15.3</em>, <em>Oracle Linux 8</em>, <em>CentOS 7</em> and <em>Ubuntu 20.04</em> repositories,
this should finally resolve all issues known so far related to reposync and the HTTP proxy.</p>
<p>And what did all this have to do with a change of '<' to '>='?</p>
<p>In PR <a class="reference external" href="https://github.com/uyuni-project/uyuni/pull/4604">#4604</a>
(Archive: <a class="reference external" href="https://web.archive.org/web/20220322220815/https://github.com/uyuni-project/uyuni/pull/4604">[1]</a>,
<a class="reference external" href="https://archive.today/2022.03.22-220831/https://github.com/uyuni-project/uyuni/pull/4604">[2]</a>)
the version was bumped. Changing <code>python3-urlgrabber < 4</code> to
<code>python3-urlgrabber >= 4</code> caused all that trouble and a lot of issues
where syncing repositories behind an HTTP proxy was just not possible anymore.</p>