<feed xmlns='http://www.w3.org/2005/Atom'>
<title>unit.git/src/python, branch 1.34.2-1</title>
<subtitle>Universal Web Application Server</subtitle>
<link rel='alternate' type='text/html' href='https://git.sigsegv.uk/unit.git/'/>
<entry>
<title>python: Don't decrement a reference to a borrowed object</title>
<updated>2024-09-13T16:12:39+00:00</updated>
<author>
<name>Andrew Clayton</name>
<email>a.clayton@nginx.com</email>
</author>
<published>2024-09-12T15:48:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.sigsegv.uk/unit.git/commit/?id=50b1aca3b8318c58f7073fe11911f1f0d52c651d'/>
<id>50b1aca3b8318c58f7073fe11911f1f0d52c651d</id>
<content type='text'>
On some Python 3.11 systems, 3.11.9 &amp; 3.11.10, we were seeing a crash
triggered by Py_Finalize() in nxt_python_atexit() when running one of
our pytests, namely
test/test_python_factory.py::test_python_factory_invalid_callable_value

  2024/09/12 15:07:29 [alert] 5452#5452 factory "wsgi_invalid_callable" in module "wsgi" can not be called to fetch callable
  Fatal Python error: none_dealloc: deallocating None: bug likely caused by a refcount error in a C extension
  Python runtime state: finalizing (tstate=0x00007f560b88a718)

  Current thread 0x00007f560bde7ad0 (most recent call first):
    &lt;no Python frame&gt;
  2024/09/12 15:07:29 [alert] 5451#5451 app process 5452 exited on signal 6 (core dumped)

This was due to

  obj = PyDict_GetItemString(PyModule_GetDict(module), callable);

in nxt_python_set_target() which returns a *borrowed* reference. In this
test the object is `None`, so we `goto fail` and call

  Py_DECREF(obj);

which then causes `Py_Finalize()` to blow up.

The simple fix is to just increment its reference count before the `goto
fail`.
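
As a rough illustration of why the extra increment keeps things
balanced, here is a minimal CPython-only sketch driven from Python via
ctypes. It is not Unit's code: balanced_cleanup() and module_dict are
hypothetical names, and it assumes a standard CPython build where
ctypes.pythonapi is available.

```python
import ctypes
import sys

# CPython-only: ctypes.pythonapi exposes the C API of the running interpreter.
incref = ctypes.pythonapi.Py_IncRef
decref = ctypes.pythonapi.Py_DecRef
for fn in (incref, decref):
    fn.argtypes = [ctypes.py_object]
    fn.restype = None


def balanced_cleanup(obj):
    """Take our own reference before the cleanup path releases one.

    Returns the net change in the reference count, which should be 0.
    """
    before = sys.getrefcount(obj)
    incref(obj)   # turn the borrowed reference into one we own
    decref(obj)   # the cleanup path's Py_DECREF now releases our reference
    return sys.getrefcount(obj) - before


# Hypothetical stand-in for the module dict that the lookup borrows from.
module_dict = {"callable": object()}
net_change = balanced_cleanup(module_dict["callable"])
```

Without the incref() call the count would end up one lower than it
started, which is the imbalance that Py_Finalize() tripped over.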

Note: Of the various versions of Python we test on, this problem only
showed up under 3.11.9 &amp; 3.11.10. It doesn't show up under 3.6, 3.7,
3.9, 3.10 or 3.12.

Cc: Konstantin Pavlov &lt;thresh@nginx.com&gt;
Closes: https://github.com/nginx/unit/issues/1413
Fixes: a9aa9e76d ("python: Support application factories")
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On some Python 3.11 systems, 3.11.9 &amp; 3.11.10, we were seeing a crash
triggered by Py_Finalize() in nxt_python_atexit() when running one of
our pytests, namely
test/test_python_factory.py::test_python_factory_invalid_callable_value

  2024/09/12 15:07:29 [alert] 5452#5452 factory "wsgi_invalid_callable" in module "wsgi" can not be called to fetch callable
  Fatal Python error: none_dealloc: deallocating None: bug likely caused by a refcount error in a C extension
  Python runtime state: finalizing (tstate=0x00007f560b88a718)

  Current thread 0x00007f560bde7ad0 (most recent call first):
    &lt;no Python frame&gt;
  2024/09/12 15:07:29 [alert] 5451#5451 app process 5452 exited on signal 6 (core dumped)

This was due to

  obj = PyDict_GetItemString(PyModule_GetDict(module), callable);

in nxt_python_set_target() which returns a *borrowed* reference. In this
test the object is `None`, so we `goto fail` and call

  Py_DECREF(obj);

which then causes `Py_Finalize()` to blow up.

The simple fix is to just increment its reference count before the `goto
fail`.

Note: Of the various versions of Python we test on, this problem only
showed up under 3.11.9 &amp; 3.11.10. It doesn't show up under 3.6, 3.7,
3.9, 3.10 or 3.12.

Cc: Konstantin Pavlov &lt;thresh@nginx.com&gt;
Closes: https://github.com/nginx/unit/issues/1413
Fixes: a9aa9e76d ("python: Support application factories")
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>python: Constify some local static variables</title>
<updated>2024-07-03T19:41:00+00:00</updated>
<author>
<name>Andrew Clayton</name>
<email>a.clayton@nginx.com</email>
</author>
<published>2024-06-21T23:20:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.sigsegv.uk/unit.git/commit/?id=ff6d504530ad2c126fc264744faa9e62bcc43fb9'/>
<id>ff6d504530ad2c126fc264744faa9e62bcc43fb9</id>
<content type='text'>
These somehow got missed in my previous constification patches...

Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These somehow got missed in my previous constification patches...

Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>python: Support application factories</title>
<updated>2024-07-02T18:13:14+00:00</updated>
<author>
<name>Gourav</name>
<email>gouravkandoria1500@gmail.com</email>
</author>
<published>2024-06-26T05:44:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.sigsegv.uk/unit.git/commit/?id=a9aa9e76db2766a681350c09947df848898531f6'/>
<id>a9aa9e76db2766a681350c09947df848898531f6</id>
<content type='text'>
Adds support for the app factory pattern to the Python language module.
A factory is a callable that returns a WSGI or ASGI application object.

Unit does not support passing arguments to factories.

Setting the `factory` option to `true` instructs Unit to treat the
configured `callable` as a factory.

For example:

    "my-app": {
        "type": "python",
        "path": "/srv/www/",
        "module": "hello",
        "callable": "create_app",
        "factory": true
    }

This is similar to other WSGI / ASGI servers. E.g.,

    $ uvicorn --factory hello:create_app
    $ gunicorn 'hello:create_app()'

The factory setting defaults to false.
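
A minimal sketch of what such a hello module might contain, assuming a
plain WSGI application. The greeting text is made up for illustration;
only the create_app name comes from the configuration above.

```python
def create_app():
    """Factory: builds and returns a WSGI application object.

    It takes no arguments, since none are passed to factories.
    """
    greeting = b"Hello from a factory-built app\n"  # illustrative payload

    def application(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [greeting]

    return application
```

With "factory": true the configured callable is called once to obtain
the application, rather than being used as the application itself.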

Closes: https://github.com/nginx/unit/issues/1106
Link: &lt;https://github.com/nginx/unit/pull/1336#issuecomment-2179381605&gt;
[ Commit message - Dan / Minor code tweaks - Andrew ]
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adds support for the app factory pattern to the Python language module.
A factory is a callable that returns a WSGI or ASGI application object.

Unit does not support passing arguments to factories.

Setting the `factory` option to `true` instructs Unit to treat the
configured `callable` as a factory.

For example:

    "my-app": {
        "type": "python",
        "path": "/srv/www/",
        "module": "hello",
        "callable": "create_app",
        "factory": true
    }

This is similar to other WSGI / ASGI servers. E.g.,

    $ uvicorn --factory hello:create_app
    $ gunicorn 'hello:create_app()'

The factory setting defaults to false.

Closes: https://github.com/nginx/unit/issues/1106
Link: &lt;https://github.com/nginx/unit/pull/1336#issuecomment-2179381605&gt;
[ Commit message - Dan / Minor code tweaks - Andrew ]
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Convert 0-sized arrays to true flexible array members</title>
<updated>2024-05-07T01:46:49+00:00</updated>
<author>
<name>Andrew Clayton</name>
<email>a.clayton@nginx.com</email>
</author>
<published>2023-04-13T18:42:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.sigsegv.uk/unit.git/commit/?id=e2a09c7742d2b74e3896ef99d3941ab1e46d2a15'/>
<id>e2a09c7742d2b74e3896ef99d3941ab1e46d2a15</id>
<content type='text'>
Declaring a 0-sized array (e.g. 'char arr[0];') as the last member of a
structure is a GNU extension that was used to implement flexible array
members (FAMs) before they were standardised in C99 as simply '[]'.

The GNU extension itself was introduced to work around a hack of
declaring 1-sized arrays to mean a variable-length object. The advantage
of the 0-sized (and true FAMs) is that they don't count towards the size
of the structure.

Unit already declares some true FAMs, but it also declared some 0-sized
arrays.

Converting these 0-sized arrays to true FAMs is not only good for
consistency but will also allow better compiler checks now (as in a C99
FAM *must* be the last member of a structure and the compiler will warn
otherwise) and in the future as doing this fixes a bunch of warnings
(treated as errors in Unit by default) when compiled with

  -O2 -Warray-bounds -Wstrict-flex-arrays -fstrict-flex-arrays=3

(Note -Warray-bounds is enabled by -Wall and -Wstrict-flex-arrays seems
to also be enabled via -Wall -Wextra. The -O2 is required to make
-fstrict-flex-arrays more effective; =3 is the default on at least GCC
14.)

such as

  CC     build/src/nxt_upstream.o
src/nxt_upstream.c: In function ‘nxt_upstreams_create’:
src/nxt_upstream.c:56:18: error: array subscript i is outside array bounds of ‘nxt_upstream_t[0]’ {aka ‘struct nxt_upstream_s[]’} [-Werror=array-bounds=]
   56 |         string = nxt_str_dup(mp, &amp;upstreams-&gt;upstream[i].name, &amp;name);
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/nxt_upstream.c:9:
src/nxt_upstream.h:55:48: note: while referencing ‘upstream’
   55 |     nxt_upstream_t                             upstream[0];
      |                                                ^~~~~~~~

Making our flexible array members proper C99 FAMs and ensuring any &gt;0
sized trailing arrays in structures are really normal arrays will allow
us to enable various compiler options (such as the above and more) that
will help keep our array usage safe.

Changing 0-sized arrays to FAMs should have no effect on structure
layouts/sizes (they both have a size of 0, although doing a sizeof() on
a FAM will result in a compiler error).

Looking at pahole(1) output for the nxt_http_route_ruleset_t structure
for the [0] and [] cases...

$ pahole -C nxt_http_route_ruleset_t /tmp/build/src/nxt_http_route.o
typedef struct {
        uint32_t           items;                /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        nxt_http_route_rule_t * rule[];          /*     8     0 */

        /* size: 8, cachelines: 1, members: 2 */
        /* sum members: 4, holes: 1, sum holes: 4 */
        /* last cacheline: 8 bytes */
} nxt_http_route_ruleset_t;
$ pahole -C nxt_http_route_ruleset_t build/src/nxt_http_route.o
typedef struct {
        uint32_t           items;                /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        nxt_http_route_rule_t * rule[];          /*     8     0 */

        /* size: 8, cachelines: 1, members: 2 */
        /* sum members: 4, holes: 1, sum holes: 4 */
        /* last cacheline: 8 bytes */
} nxt_http_route_ruleset_t;

Also checking with the size(1) command on the affected object files
shows no changes to their sizes

$ for file in build/src/nxt_upstream.o \
	build/src/nxt_upstream_round_robin.o \
	build/src/nxt_h1proto.o \
	build/src/nxt_http_route.o \
	build/src/nxt_http_proxy.o \
	build/src/python/*.o; do \
	size -G /tmp/${file} $file; echo; done
      text       data        bss      total filename
       640        418          0       1058 /tmp/build/src/nxt_upstream.o
       640        418          0       1058 build/src/nxt_upstream.o

      text       data        bss      total filename
       929        351          0       1280 /tmp/build/src/nxt_upstream_round_robin.o
       929        351          0       1280 build/src/nxt_upstream_round_robin.o

      text       data        bss      total filename
     11707       8281         16      20004 /tmp/build/src/nxt_h1proto.o
     11707       8281         16      20004 build/src/nxt_h1proto.o

      text       data        bss      total filename
      8319       3101          0      11420 /tmp/build/src/nxt_http_route.o
      8319       3101          0      11420 build/src/nxt_http_route.o

      text       data        bss      total filename
      1495       1056          0       2551 /tmp/build/src/nxt_http_proxy.o
      1495       1056          0       2551 build/src/nxt_http_proxy.o

      text       data        bss      total filename
      4321       2895          0       7216 /tmp/build/src/python/nxt_python_asgi_http-python.o
      4321       2895          0       7216 build/src/python/nxt_python_asgi_http-python.o

      text       data        bss      total filename
      4231       2266          0       6497 /tmp/build/src/python/nxt_python_asgi_lifespan-python.o
      4231       2266          0       6497 build/src/python/nxt_python_asgi_lifespan-python.o

      text       data        bss      total filename
     12051       6090          8      18149 /tmp/build/src/python/nxt_python_asgi-python.o
     12051       6090          8      18149 build/src/python/nxt_python_asgi-python.o

      text       data        bss      total filename
        28       1963        432       2423 /tmp/build/src/python/nxt_python_asgi_str-python.o
        28       1963        432       2423 build/src/python/nxt_python_asgi_str-python.o

      text       data        bss      total filename
      5818       3518          0       9336 /tmp/build/src/python/nxt_python_asgi_websocket-python.o
      5818       3518          0       9336 build/src/python/nxt_python_asgi_websocket-python.o

      text       data        bss      total filename
      4391       2089        168       6648 /tmp/build/src/python/nxt_python-python.o
      4391       2089        168       6648 build/src/python/nxt_python-python.o

      text       data        bss      total filename
      9095       5909        152      15156 /tmp/build/src/python/nxt_python_wsgi-python.o
      9095       5909        152      15156 build/src/python/nxt_python_wsgi-python.o

Link: &lt;https://lwn.net/Articles/908817/&gt;
Link: &lt;https://people.kernel.org/kees/bounded-flexible-arrays-in-c&gt;
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Declaring a 0-sized array (e.g. 'char arr[0];') as the last member of a
structure is a GNU extension that was used to implement flexible array
members (FAMs) before they were standardised in C99 as simply '[]'.

The GNU extension itself was introduced to work around a hack of
declaring 1-sized arrays to mean a variable-length object. The advantage
of the 0-sized (and true FAMs) is that they don't count towards the size
of the structure.

Unit already declares some true FAMs, but it also declared some 0-sized
arrays.

Converting these 0-sized arrays to true FAMs is not only good for
consistency but will also allow better compiler checks now (as in a C99
FAM *must* be the last member of a structure and the compiler will warn
otherwise) and in the future as doing this fixes a bunch of warnings
(treated as errors in Unit by default) when compiled with

  -O2 -Warray-bounds -Wstrict-flex-arrays -fstrict-flex-arrays=3

(Note -Warray-bounds is enabled by -Wall and -Wstrict-flex-arrays seems
to also be enabled via -Wall -Wextra. The -O2 is required to make
-fstrict-flex-arrays more effective; =3 is the default on at least GCC
14.)

such as

  CC     build/src/nxt_upstream.o
src/nxt_upstream.c: In function ‘nxt_upstreams_create’:
src/nxt_upstream.c:56:18: error: array subscript i is outside array bounds of ‘nxt_upstream_t[0]’ {aka ‘struct nxt_upstream_s[]’} [-Werror=array-bounds=]
   56 |         string = nxt_str_dup(mp, &amp;upstreams-&gt;upstream[i].name, &amp;name);
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/nxt_upstream.c:9:
src/nxt_upstream.h:55:48: note: while referencing ‘upstream’
   55 |     nxt_upstream_t                             upstream[0];
      |                                                ^~~~~~~~

Making our flexible array members proper C99 FAMs and ensuring any &gt;0
sized trailing arrays in structures are really normal arrays will allow
us to enable various compiler options (such as the above and more) that
will help keep our array usage safe.

Changing 0-sized arrays to FAMs should have no effect on structure
layouts/sizes (they both have a size of 0, although doing a sizeof() on
a FAM will result in a compiler error).

Looking at pahole(1) output for the nxt_http_route_ruleset_t structure
for the [0] and [] cases...

$ pahole -C nxt_http_route_ruleset_t /tmp/build/src/nxt_http_route.o
typedef struct {
        uint32_t           items;                /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        nxt_http_route_rule_t * rule[];          /*     8     0 */

        /* size: 8, cachelines: 1, members: 2 */
        /* sum members: 4, holes: 1, sum holes: 4 */
        /* last cacheline: 8 bytes */
} nxt_http_route_ruleset_t;
$ pahole -C nxt_http_route_ruleset_t build/src/nxt_http_route.o
typedef struct {
        uint32_t           items;                /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        nxt_http_route_rule_t * rule[];          /*     8     0 */

        /* size: 8, cachelines: 1, members: 2 */
        /* sum members: 4, holes: 1, sum holes: 4 */
        /* last cacheline: 8 bytes */
} nxt_http_route_ruleset_t;

Also checking with the size(1) command on the affected object files
shows no changes to their sizes

$ for file in build/src/nxt_upstream.o \
	build/src/nxt_upstream_round_robin.o \
	build/src/nxt_h1proto.o \
	build/src/nxt_http_route.o \
	build/src/nxt_http_proxy.o \
	build/src/python/*.o; do \
	size -G /tmp/${file} $file; echo; done
      text       data        bss      total filename
       640        418          0       1058 /tmp/build/src/nxt_upstream.o
       640        418          0       1058 build/src/nxt_upstream.o

      text       data        bss      total filename
       929        351          0       1280 /tmp/build/src/nxt_upstream_round_robin.o
       929        351          0       1280 build/src/nxt_upstream_round_robin.o

      text       data        bss      total filename
     11707       8281         16      20004 /tmp/build/src/nxt_h1proto.o
     11707       8281         16      20004 build/src/nxt_h1proto.o

      text       data        bss      total filename
      8319       3101          0      11420 /tmp/build/src/nxt_http_route.o
      8319       3101          0      11420 build/src/nxt_http_route.o

      text       data        bss      total filename
      1495       1056          0       2551 /tmp/build/src/nxt_http_proxy.o
      1495       1056          0       2551 build/src/nxt_http_proxy.o

      text       data        bss      total filename
      4321       2895          0       7216 /tmp/build/src/python/nxt_python_asgi_http-python.o
      4321       2895          0       7216 build/src/python/nxt_python_asgi_http-python.o

      text       data        bss      total filename
      4231       2266          0       6497 /tmp/build/src/python/nxt_python_asgi_lifespan-python.o
      4231       2266          0       6497 build/src/python/nxt_python_asgi_lifespan-python.o

      text       data        bss      total filename
     12051       6090          8      18149 /tmp/build/src/python/nxt_python_asgi-python.o
     12051       6090          8      18149 build/src/python/nxt_python_asgi-python.o

      text       data        bss      total filename
        28       1963        432       2423 /tmp/build/src/python/nxt_python_asgi_str-python.o
        28       1963        432       2423 build/src/python/nxt_python_asgi_str-python.o

      text       data        bss      total filename
      5818       3518          0       9336 /tmp/build/src/python/nxt_python_asgi_websocket-python.o
      5818       3518          0       9336 build/src/python/nxt_python_asgi_websocket-python.o

      text       data        bss      total filename
      4391       2089        168       6648 /tmp/build/src/python/nxt_python-python.o
      4391       2089        168       6648 build/src/python/nxt_python-python.o

      text       data        bss      total filename
      9095       5909        152      15156 /tmp/build/src/python/nxt_python_wsgi-python.o
      9095       5909        152      15156 build/src/python/nxt_python_wsgi-python.o

Link: &lt;https://lwn.net/Articles/908817/&gt;
Link: &lt;https://people.kernel.org/kees/bounded-flexible-arrays-in-c&gt;
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Python: bytearray body support for ASGI module.</title>
<updated>2024-02-21T14:06:43+00:00</updated>
<author>
<name>Andrei Zeliankou</name>
<email>zelenkov@nginx.com</email>
</author>
<published>2024-01-26T14:58:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.sigsegv.uk/unit.git/commit/?id=697a58506235e89af1c8cc3cafc92b3d85a3467d'/>
<id>697a58506235e89af1c8cc3cafc92b3d85a3467d</id>
<content type='text'>
@filiphanes requested support for bytearray
and memoryview in the request body here:
&lt;https://github.com/nginx/unit/issues/648&gt;

This patch implements bytearray body support only.
Memoryview body support still needs to be implemented.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
@filiphanes requested support for bytearray
and memoryview in the request body here:
&lt;https://github.com/nginx/unit/issues/648&gt;

This patch implements bytearray body support only.
Memoryview body support still needs to be implemented.
</pre>
</div>
</content>
</entry>
<entry>
<title>Python: Fix header field values character encoding.</title>
<updated>2023-11-09T17:53:09+00:00</updated>
<author>
<name>Andrew Clayton</name>
<email>a.clayton@nginx.com</email>
</author>
<published>2023-05-26T19:54:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.sigsegv.uk/unit.git/commit/?id=5cfad9cc0bb3809f802cf83d2739739fdfaab7a8'/>
<id>5cfad9cc0bb3809f802cf83d2739739fdfaab7a8</id>
<content type='text'>
On GitHub, @RomainMou reported an issue whereby HTTP header field values
were being incorrectly reported as non-ascii by the Python .isascii()
method.

For example, using the following test application

  def application(environ, start_response):
      t = environ['HTTP_ASCIITEST']

      t = "'" + t + "'" +  " (" + str(len(t)) + ")"

      if t.isascii():
          t = t + " [ascii]"
      else:
          t = t + " [non-ascii]"

      resp = t + "\n\n"

      start_response("200 OK", [("Content-Type", "text/plain")])
      return (bytes(resp, 'latin1'))

You would see the following

  $ curl -H "ASCIITEST: $" http://localhost:8080/
  '$' (1) [non-ascii]

'$' has an ASCII code of 0x24 (36).

The initial idea was to adjust the second parameter to the
PyUnicode_New() call from 255 to 127. This unfortunately had the
opposite effect.

  $ curl -H "ASCIITEST: $" http://localhost:8080/
  '$' (1) [ascii]

Good. However...

  $ curl -H "ASCIITEST: £" http://localhost:8080/
  '£' (2) [ascii]

Not good. Let's take a closer look at this.

'£' is not in basic ASCII, but is in extended ASCII with a value of 0xA3
(163). Its UTF-8 encoding is 0xC2 0xA3, hence the length of 2 bytes
above.

  $ strace -s 256 -e sendto,recvfrom curl -H "ASCIITEST: £" http://localhost:8080/
  sendto(5, "GET / HTTP/1.1\r\nHost: localhost:8080\r\nUser-Agent: curl/8.0.1\r\nAccept: */*\r\nASCIITEST: \302\243\r\n\r\n", 92, MSG_NOSIGNAL, NULL, 0) = 92
  recvfrom(5, "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nServer: Unit/1.30.0\r\nDate: Mon, 22 May 2023 12:44:11 GMT\r\nTransfer-Encoding: chunked\r\n\r\n12\r\n'\302\243' (2) [ascii]\n\n\r\n0\r\n\r\n", 102400, 0, NULL, NULL) = 160
  '£' (2) [ascii]

So we can see curl sent it UTF-8 encoded as '\302\243', which is C octal
escaped UTF-8 for 0xC2 0xA3, and we got the same back. But it should not
be marked as ASCII.

When doing PyUnicode_New(size, 127) it sets the buffer as ASCII. So we
need to use another function and that function would appear to be

  PyUnicode_DecodeCharmap()

which creates a Unicode object with the correct ascii/non-ascii
properties based on the character encoding.

With this function we now get

  $ curl -H "ASCIITEST: $" http://localhost:8080/
  '$' (1) [ascii]

  $ curl -H "ASCIITEST: £" http://localhost:8080/
  '£' (2) [non-ascii]

and for good measure

  $ curl -H "ASCIITEST: $ £" http://localhost:8080/
  '$ £' (4) [non-ascii]

  $ curl -H "ASCIITEST: $" -H "ASCIITEST: £" http://localhost:8080/
  '$, £' (5) [non-ascii]

PyUnicode_DecodeCharmap() does require having the full string upfront so
we need to build up the potentially comma separated header field values
string before invoking this function.
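
The behaviour above can be approximated in pure Python. This is only a
sketch, assuming the values are decoded as Latin-1 (which is what
PyUnicode_DecodeCharmap() does when it is given a NULL mapping);
field_value() is a hypothetical helper, not the function in Unit.

```python
def field_value(raw_values):
    """Join repeated header field values, then decode them as Latin-1."""
    # Build the full comma separated string first, as the decoder
    # needs the whole buffer upfront.
    joined = b", ".join(raw_values)
    # Latin-1 maps each byte to one code point, so the resulting str
    # is only ASCII when every byte is below 0x80.
    return joined.decode("latin-1")


# The three cases from the curl runs above; b"\xc2\xa3" is UTF-8 for '£'.
for raw in ([b"$"], [b"\xc2\xa3"], [b"$", b"\xc2\xa3"]):
    value = field_value(raw)
    kind = "ascii" if value.isascii() else "non-ascii"
    print(len(value), kind)
```

This prints lengths of 1, 2 and 5 with only the first marked as ascii,
matching the lengths and flags in the curl output.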

I did not want to touch the Python 2.7 code (which may or may not even
be affected by this) so kept these changes completely isolated from
that, hence a slight duplication with the for () loop.

Python 2.7 was sunset on January 1st 2020[0], so this code will
hopefully just disappear soon anyway.

I also purposefully didn't touch other code that may well have similar
issues (such as the HTTP header field names). If we ever get issue
reports about them, we'll deal with them then.

[0]: &lt;https://www.python.org/doc/sunset-python-2/&gt;

Link: &lt;https://docs.python.org/3/c-api/unicode.html&gt;
Closes: &lt;https://github.com/nginx/unit/issues/868&gt;
Reported-by: RomainMou &lt;https://github.com/RomainMou&gt;
Tested-by: RomainMou &lt;https://github.com/RomainMou&gt;
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On GitHub, @RomainMou reported an issue whereby HTTP header field values
were being incorrectly reported as non-ascii by the Python .isascii()
method.

For example, using the following test application

  def application(environ, start_response):
      t = environ['HTTP_ASCIITEST']

      t = "'" + t + "'" +  " (" + str(len(t)) + ")"

      if t.isascii():
          t = t + " [ascii]"
      else:
          t = t + " [non-ascii]"

      resp = t + "\n\n"

      start_response("200 OK", [("Content-Type", "text/plain")])
      return (bytes(resp, 'latin1'))

You would see the following

  $ curl -H "ASCIITEST: $" http://localhost:8080/
  '$' (1) [non-ascii]

'$' has an ASCII code of 0x24 (36).

The initial idea was to adjust the second parameter to the
PyUnicode_New() call from 255 to 127. This unfortunately had the
opposite effect.

  $ curl -H "ASCIITEST: $" http://localhost:8080/
  '$' (1) [ascii]

Good. However...

  $ curl -H "ASCIITEST: £" http://localhost:8080/
  '£' (2) [ascii]

Not good. Let's take a closer look at this.

'£' is not in basic ASCII, but is in extended ASCII with a value of 0xA3
(163). Its UTF-8 encoding is 0xC2 0xA3, hence the length of 2 bytes
above.

  $ strace -s 256 -e sendto,recvfrom curl -H "ASCIITEST: £" http://localhost:8080/
  sendto(5, "GET / HTTP/1.1\r\nHost: localhost:8080\r\nUser-Agent: curl/8.0.1\r\nAccept: */*\r\nASCIITEST: \302\243\r\n\r\n", 92, MSG_NOSIGNAL, NULL, 0) = 92
  recvfrom(5, "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nServer: Unit/1.30.0\r\nDate: Mon, 22 May 2023 12:44:11 GMT\r\nTransfer-Encoding: chunked\r\n\r\n12\r\n'\302\243' (2) [ascii]\n\n\r\n0\r\n\r\n", 102400, 0, NULL, NULL) = 160
  '£' (2) [ascii]

So we can see curl sent it UTF-8 encoded as '\302\243', which is C octal
escaped UTF-8 for 0xC2 0xA3, and we got the same back. But it should not
be marked as ASCII.

When doing PyUnicode_New(size, 127) it sets the buffer as ASCII. So we
need to use another function and that function would appear to be

  PyUnicode_DecodeCharmap()

which creates a Unicode object with the correct ascii/non-ascii
properties based on the character encoding.

With this function we now get

  $ curl -H "ASCIITEST: $" http://localhost:8080/
  '$' (1) [ascii]

  $ curl -H "ASCIITEST: £" http://localhost:8080/
  '£' (2) [non-ascii]

and for good measure

  $ curl -H "ASCIITEST: $ £" http://localhost:8080/
  '$ £' (4) [non-ascii]

  $ curl -H "ASCIITEST: $" -H "ASCIITEST: £" http://localhost:8080/
  '$, £' (5) [non-ascii]

PyUnicode_DecodeCharmap() does require having the full string upfront so
we need to build up the potentially comma separated header field values
string before invoking this function.

I did not want to touch the Python 2.7 code (which may or may not even
be affected by this) so kept these changes completely isolated from
that, hence a slight duplication with the for () loop.

Python 2.7 was sunset on January 1st 2020[0], so this code will
hopefully just disappear soon anyway.

I also purposefully didn't touch other code that may well have similar
issues (such as the HTTP header field names). If we ever get issue
reports about them, we'll deal with them then.

[0]: &lt;https://www.python.org/doc/sunset-python-2/&gt;

Link: &lt;https://docs.python.org/3/c-api/unicode.html&gt;
Closes: &lt;https://github.com/nginx/unit/issues/868&gt;
Reported-by: RomainMou &lt;https://github.com/RomainMou&gt;
Tested-by: RomainMou &lt;https://github.com/RomainMou&gt;
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Python: Do nxt_unit_sptr_get() earlier in nxt_python_field_value().</title>
<updated>2023-11-08T21:53:46+00:00</updated>
<author>
<name>Andrew Clayton</name>
<email>a.clayton@nginx.com</email>
</author>
<published>2023-05-26T19:51:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.sigsegv.uk/unit.git/commit/?id=dd0c53a77dbd3c8b7e3496c5e15cef757346ef8b'/>
<id>dd0c53a77dbd3c8b7e3496c5e15cef757346ef8b</id>
<content type='text'>
This is a preparatory patch for fixing an issue with the encoding of
HTTP header field values.

This patch simply moves the nxt_unit_sptr_get() to the top of the
function where we will need it in the next commit.

Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a preparatory patch for fixing an issue with the encoding of
HTTP header field values.

This patch simply moves the nxt_unit_sptr_get() to the top of the
function where we will need it in the next commit.

Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Python: Fix error checks in nxt_py_asgi_request_handler().</title>
<updated>2023-05-31T23:25:40+00:00</updated>
<author>
<name>synodriver</name>
<email>diguohuangjiajinweijun@gmail.com</email>
</author>
<published>2023-05-27T15:18:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.sigsegv.uk/unit.git/commit/?id=b84f6ecad42f4217b6eafb0ceb1e66a75b34e948'/>
<id>b84f6ecad42f4217b6eafb0ceb1e66a75b34e948</id>
<content type='text'>
Signed-off-by: synodriver &lt;diguohuangjiajinweijun@gmail.com&gt;
Reviewed-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
[ Re-word commit subject - Andrew ]
Fixes: c4c2f90c5b53 ("Python: ASGI server introduced.")
Closes: &lt;https://github.com/nginx/unit/issues/895&gt;
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: synodriver &lt;diguohuangjiajinweijun@gmail.com&gt;
Reviewed-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
[ Re-word commit subject - Andrew ]
Fixes: c4c2f90c5b53 ("Python: ASGI server introduced.")
Closes: &lt;https://github.com/nginx/unit/issues/895&gt;
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Python: Add ASGI lifespan state support.</title>
<updated>2023-05-31T23:25:03+00:00</updated>
<author>
<name>synodriver</name>
<email>diguohuangjiajinweijun@gmail.com</email>
</author>
<published>2023-05-27T14:18:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.sigsegv.uk/unit.git/commit/?id=93ed66958e11b40cc06dcb32fc8e623967af6347'/>
<id>93ed66958e11b40cc06dcb32fc8e623967af6347</id>
<content type='text'>
Lifespan state is a special dict in the ASGI lifespan scope which allows
applications to persist data from the lifespan cycle to request/response
handling. The scope["state"] namespace provides a place to store these
sorts of things. The server will ensure that a shallow copy of the
namespace is passed into each subsequent request/response call into the
application.
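
As a rough illustration only (this is not the Unit implementation, which
is written in C), the shallow-copy behaviour described above can be
sketched in Python. The helper name make_request_scope and the "db" and
"request_id" keys are hypothetical.

```python
import copy


def make_request_scope(lifespan_state, path):
    # Hypothetical helper: build a per-request scope carrying a shallow
    # copy of the state dict populated during the lifespan cycle.
    return {
        "type": "http",
        "path": path,
        "state": copy.copy(lifespan_state),
    }


# State the application stored during lifespan startup.
lifespan_state = {"db": ["connection"]}

scope_a = make_request_scope(lifespan_state, "/a")
scope_b = make_request_scope(lifespan_state, "/b")

# Adding a key in one request does not leak into other requests ...
scope_a["state"]["request_id"] = 1

# ... but the stored values are shared, because the copy is shallow.
shared = scope_a["state"]["db"] is scope_b["state"]["db"]
```

Each request gets its own top-level namespace, while objects placed in
it during the lifespan cycle are the same objects in every request.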

Some frameworks, Starlette for example, already rely on this feature
and would not work properly without it.

Signed-off-by: synodriver &lt;diguohuangjiajinweijun@gmail.com&gt;
Reviewed-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
[ Minor code tweaks to avoid lines &gt; 80 chars, static a function and
  re-work the PyMemberDef structure initialisation for Python &lt;3.7
  and -Wwrite-strings compatibility - Andrew ]
Tested-by: &lt;https://github.com/synodriver&gt;
Tested-by: &lt;https://github.com/hawiliali&gt;
Closes: &lt;https://github.com/nginx/unit/issues/864&gt;
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Lifespan state is a special dict in the ASGI lifespan scope, which allows
applications to persist data from the lifespan cycle to request/response
handling. The scope["state"] namespace provides a place to store these
sorts of things. The server will ensure that a shallow copy of the
namespace is passed into each subsequent request/response call into the
application.

Some frameworks, Starlette for example, already rely on this feature
and would not work properly without it.

Signed-off-by: synodriver &lt;diguohuangjiajinweijun@gmail.com&gt;
Reviewed-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
[ Minor code tweaks to avoid lines &gt; 80 chars, static a function and
  re-work the PyMemberDef structure initialisation for Python &lt;3.7
  and -Wwrite-strings compatibility - Andrew ]
Tested-by: &lt;https://github.com/synodriver&gt;
Tested-by: &lt;https://github.com/hawiliali&gt;
Closes: &lt;https://github.com/nginx/unit/issues/864&gt;
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Python: Fix ASGI applications accessed over IPv6.</title>
<updated>2023-05-18T14:57:11+00:00</updated>
<author>
<name>Andrew Clayton</name>
<email>a.clayton@nginx.com</email>
</author>
<published>2023-05-15T21:48:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.sigsegv.uk/unit.git/commit/?id=47683c4704572bbe0efb3b989b35a3912b65ac83'/>
<id>47683c4704572bbe0efb3b989b35a3912b65ac83</id>
<content type='text'>
There are a couple of reports on GitHub about issues accessing Python
ASGI based applications over IPv6.

A request over IPv6 would result in an error like

2023/05/13 17:49:12 [alert] 47202#47202 [unit] #10: Python failed to create 'client' pair
2023/05/13 17:49:12 [alert] 47202#47202 [unit] Python failed to call 'loop.call_soon'
ValueError: invalid literal for int() with base 10: 'db8:1:1:1ee7:dead:beef:cafe'

The above error was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 765, in call_soon
    handle = self._call_soon(callback, args, context)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 781, in _call_soon
    handle = events.Handle(callback, args, self, context)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SystemError: &lt;class 'asyncio.events.Handle'&gt; returned a result with an exception set

This issue occurred in the nxt_py_asgi_create_ip_address() function
where it tries to create an IP address / port number pair.

It does this by looking for the first ':' in the address and taking
everything after it as the port number. Like in the above error message,
if we tried to access the server @ 2001:db8:1:1:1ee7:dead:beef:cafe,
then we'd end up with the port number as 'db8:1:1:1ee7:dead:beef:cafe'.

There are two issues with this:

 1) The IP address and port number are already flowed through
    separately.
 2) Even if (1) wasn't true, it would still be broken for IPv6 as we'd
    expect to get an address literal like
    [2001:db8:1:1:1ee7:dead:beef:cafe]:8080, however there was no code to
    handle the []'s.

The fix is to simply not try looking for a port number. We already pass
a port number into this function to use in the case where we don't find
one in the address, and we never will.
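
The broken parsing and the fix can be sketched in Python as follows. This
is only an illustration: the real code is C, in
nxt_py_asgi_create_ip_address(), and the function names
split_at_first_colon and make_pair are hypothetical.

```python
def split_at_first_colon(addr):
    # Old behaviour (sketch): everything after the first ':' is taken
    # as the port number.
    host, _, port = addr.partition(":")
    return host, port


def make_pair(addr, port):
    # Fixed behaviour (sketch): address and port arrive separately, so
    # the address is used as-is and never searched for a port.
    return (addr, port)


addr = "2001:db8:1:1:1ee7:dead:beef:cafe"

host, bogus_port = split_at_first_colon(addr)

try:
    int(bogus_port)
    parse_failed = False
except ValueError:
    # Same failure as in the log above: the "port" is not an integer.
    parse_failed = True

pair = make_pair(addr, 8080)
```

Splitting at the first ':' leaves 'db8:1:1:1ee7:dead:beef:cafe' as the
port, which int() rejects, whereas passing the port separately keeps the
IPv6 address intact.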

A further cleanup would be to flow through the server port number when
creating the 'server pair' PyTuple, rather than just using the hard
coded 80.

Closes: &lt;https://github.com/nginx/unit/issues/793&gt;
Closes: &lt;https://github.com/nginx/unit/issues/874&gt;
Reviewed-by: Alejandro Colomar &lt;alx@nginx.com&gt;
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are a couple of reports on GitHub about issues accessing Python
ASGI based applications over IPv6.

A request over IPv6 would result in an error like

2023/05/13 17:49:12 [alert] 47202#47202 [unit] #10: Python failed to create 'client' pair
2023/05/13 17:49:12 [alert] 47202#47202 [unit] Python failed to call 'loop.call_soon'
ValueError: invalid literal for int() with base 10: 'db8:1:1:1ee7:dead:beef:cafe'

The above error was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 765, in call_soon
    handle = self._call_soon(callback, args, context)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 781, in _call_soon
    handle = events.Handle(callback, args, self, context)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SystemError: &lt;class 'asyncio.events.Handle'&gt; returned a result with an exception set

This issue occurred in the nxt_py_asgi_create_ip_address() function
where it tries to create an IP address / port number pair.

It does this by looking for the first ':' in the address and taking
everything after it as the port number. Like in the above error message,
if we tried to access the server @ 2001:db8:1:1:1ee7:dead:beef:cafe,
then we'd end up with the port number as 'db8:1:1:1ee7:dead:beef:cafe'.

There are two issues with this:

 1) The IP address and port number are already flowed through
    separately.
 2) Even if (1) wasn't true, it would still be broken for IPv6 as we'd
    expect to get an address literal like
    [2001:db8:1:1:1ee7:dead:beef:cafe]:8080, however there was no code to
    handle the []'s.

The fix is to simply not try looking for a port number. We already pass
a port number into this function to use in the case where we don't find
one in the address, and we never will.

A further cleanup would be to flow through the server port number when
creating the 'server pair' PyTuple, rather than just using the hard
coded 80.

Closes: &lt;https://github.com/nginx/unit/issues/793&gt;
Closes: &lt;https://github.com/nginx/unit/issues/874&gt;
Reviewed-by: Alejandro Colomar &lt;alx@nginx.com&gt;
Signed-off-by: Andrew Clayton &lt;a.clayton@nginx.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
