path: root/src (unfollow)
Age  Commit message  Author  Files  Lines
2023-09-05HTTP: compress: gzip: calculating wbits and memlevel dynamically.gzip-v36Alejandro Colomar1-4/+32
When the content length is small, optimize zlib for low memory usage. Conversely, when the content length is large, use a similar amount of memory within zlib, as it will improve compression, and won't hurt significantly. Signed-off-by: Alejandro Colomar <alx@nginx.com>
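A minimal sketch of how the two zlib parameters could be derived from the content length. The helper names and the exact scaling policy are assumptions made for illustration, not the code from this commit; the only facts relied on are zlib's documented parameter ranges (memLevel 1..9, windowBits up to 15).

```c
#include <stddef.h>

/* Hypothetical helper: smallest n such that (1 << n) >= len. */
static int
ceil_log2(size_t len)
{
    int  n = 0;

    while (n < (int) (sizeof(size_t) * 8 - 1) && ((size_t) 1 << n) < len) {
        n++;
    }

    return n;
}

static int
clamp_int(int v, int lo, int hi)
{
    return (v < lo) ? lo : (v > hi) ? hi : v;
}

/* Window just large enough to cover the body, kept within 9..15. */
int
gzip_wbits(size_t content_length)
{
    return clamp_int(ceil_log2(content_length), 9, 15);
}

/* Memory level scaled alongside the window (assumed policy), within 1..9. */
int
gzip_memlevel(size_t content_length)
{
    return clamp_int(ceil_log2(content_length) - 6, 1, 9);
}
```

With this policy a 100-byte body gets the smallest window and memory level, while anything of 32 KiB or more gets the maximum of both.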
2023-09-05Libunit: added bit functions.Alejandro Colomar1-0/+64
These are based on C23's <stdbit.h>. Signed-off-by: Alejandro Colomar <alx@nginx.com>
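For readers without a C23 toolchain, the following sketch shows portable equivalents of three <stdbit.h> operations (bit width, bit floor, bit ceil). The function names are hypothetical and are not the ones added by this commit.

```c
/* Number of bits needed to represent x; 0 for x == 0. */
unsigned
bit_width_ul(unsigned long x)
{
    unsigned  n = 0;

    while (x != 0) {
        n++;
        x >>= 1;
    }

    return n;
}

/* Largest power of two <= x; 0 for x == 0. */
unsigned long
bit_floor_ul(unsigned long x)
{
    return (x == 0) ? 0 : 1UL << (bit_width_ul(x) - 1);
}

/*
 * Smallest power of two >= x; 1 for x <= 1.
 * The caller must make sure the result fits in an unsigned long.
 */
unsigned long
bit_ceil_ul(unsigned long x)
{
    return (x <= 1) ? 1 : 1UL << bit_width_ul(x - 1);
}
```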
2023-09-05HTTP: compress: checking $header_accept_encoding.Alejandro Colomar3-0/+48
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-09-05HTTP: compress: added "mime_types" rule.Alejandro Colomar7-10/+63
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-09-05HTTP: compress: added configurable threshold for Content-Length.Alejandro Colomar4-6/+42
With this, short responses, that is, responses with a body of up to content_length_threshold bytes, won't be compressed. The default value is 20, as in NGINX. Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-09-04String: added strto[u]l(3) variants for nxt_str_t.Alejandro Colomar2-0/+47
They're really more inspired by the API of BSD's strto[iu](3), but use long just to keep it simple, instead of intmax_t, and since they wrap strtol(3), I named them after it. Signed-off-by: Alejandro Colomar <alx@nginx.com>
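A rough sketch of what a range-checked strtol(3) wrapper for a length-delimited string might look like. The str_t type and the str_to_l() name are stand-ins invented for this example; the real functions and their signatures may differ.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a length-delimited string type. */
typedef struct {
    size_t       length;
    const char  *start;
} str_t;

/*
 * Parse the whole of s as a long in [min, max].
 * On success store the value in *out and return 0; otherwise return -1.
 */
int
str_to_l(const str_t *s, int base, long min, long max, long *out)
{
    char  buf[64], *end;
    long  v;

    if (s->length == 0 || s->length >= sizeof(buf)) {
        return -1;
    }

    /* strtol(3) needs a NUL-terminated string, so copy first. */
    memcpy(buf, s->start, s->length);
    buf[s->length] = '\0';

    errno = 0;
    v = strtol(buf, &end, base);

    if (errno != 0 || (size_t) (end - buf) != s->length) {
        return -1;
    }

    if (v < min || v > max) {
        return -1;
    }

    *out = v;

    return 0;
}
```

Taking the valid range as parameters, in the style of strtoi(3), lets the caller reject out-of-range input without inspecting errno.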
2023-09-04Auto: zlib: added --no-zlib.Alejandro Colomar3-3/+22
Related to: HTTP: compress: gzip Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-09-04HTTP: compress: added configurable "level" of compression.Alejandro Colomar4-2/+23
2023-09-04HTTP: compress: added "encoding": "gzip".Alejandro Colomar3-1/+203
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-09-03HTTP: compress: added "compress" action.Alejandro Colomar6-4/+171
There are still no supported encodings. This is just infrastructure for the next commits, which will add gzip compression. Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-09-03HTTP: filter: supporting a list of filter_handlersAlejandro Colomar3-0/+68
Filter handlers are a new handler that takes place when a buffer is about to be sent. It filters (modifies) the contents of the buffer in-place, so that the new contents will be sent. Several filters can be applied in a loop. Signed-off-by: Alejandro Colomar <alx@nginx.com>
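The idea of applying several in-place filters in a loop can be sketched as follows. All types and names here are hypothetical and much simpler than Unit's real buffer and request structures.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical buffer: data lives in [pos, free). */
typedef struct {
    unsigned char  *pos;
    unsigned char  *free;
} buf_t;

typedef int (*filter_handler_t)(buf_t *b, void *ctx);

typedef struct {
    filter_handler_t  handler;
    void              *ctx;
} filter_t;

/* Apply every filter in order; stop at the first failure. */
int
run_filters(const filter_t *filters, size_t n, buf_t *b)
{
    size_t  i;

    for (i = 0; i < n; i++) {
        if (filters[i].handler(b, filters[i].ctx) != 0) {
            return -1;
        }
    }

    return 0;
}

/* Example filter: upper-case the buffer contents in place. */
int
filter_upper(buf_t *b, void *ctx)
{
    unsigned char  *p;

    (void) ctx;

    for (p = b->pos; p < b->free; p++) {
        *p = (unsigned char) toupper(*p);
    }

    return 0;
}

/* Example filter: replace the byte pointed to by ctx with '_'. */
int
filter_mask(buf_t *b, void *ctx)
{
    unsigned char  *p;
    unsigned char  victim = *(unsigned char *) ctx;

    for (p = b->pos; p < b->free; p++) {
        if (*p == victim) {
            *p = '_';
        }
    }

    return 0;
}
```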
2023-09-03HTTP: refactor: storing the body_handler as part of r.Alejandro Colomar8-24/+24
This will allow sending the header from a totally different point, since the data for the call is present in the request, which is available everywhere. It will also allow consulting in a filter if there is a body_handler installed. The gzip filter will need this, as it should be a no-op if there is no body handler installed. Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-09-03Libunit: added macros that enhance type safety.Alejandro Colomar3-13/+92
nxt_min()
nxt_max()
    Return the minimum/maximum of two values.

nxt_swap()
    Swap the values of two variables passed by their addresses.

nxt_sizeof_array()
    Return the size (in bytes) of an array.

nxt_nitems()
    Return the number of elements in an array.

nxt_memberof()
    Expand to a member of a structure. It uses a compound literal for the object.

nxt_sizeof_incomplete()
    Calculate the size of an incomplete type, as if sizeof() could be applied to it.

nxt_sizeof_fam0()
    Calculate the size of each element of a FAM of a structure.

nxt_sizeof_fam()
    Calculate the size of a FAM of a structure.

nxt_offsetof_fam()
    Calculate the offset of the nth element of the FAM from the start of the containing structure.

nxt_sizeof_struct()
    Calculate the total size of a structure containing a FAM. This value is the one that should be used for allocating the structure.
    Suggested-by: Andrew Clayton <a.clayton@nginx.com>

nxt_is_near_end()
    Evaluate to true if the member is near the end of a structure. This is only designed to be used with FAMs, to make sure that the FAM is near the end of the structure (a zero-length array near the end of the structure would still pass this test, but it's a reasonable assertion to make).
    Suggested-by: David Laight <David.Laight@ACULAB.COM>

nxt_is_zero_sizeof()
    Evaluate to true if the size of 'x' is 0.

nxt_is_same_type()
    Evaluate to true if both arguments are compatible types.

nxt_is_same_typeof()
    Evaluate to true if the types of both arguments are compatible.

nxt_is_array()
    Evaluate to true if the argument is an array.

nxt_must_be()
    It's like static_assert(3), but implemented as an expression. It's necessary for writing the nxt_must_be_array() macro. It always evaluates to (int) 0.

nxt_must_be_array()
    Statically assert that the argument is an array. It is an expression that always evaluates to (int) 0.

nxt_must_be_zero_sizeof()
    Statically assert that the argument has a size of 0.

nxt_must_be_near_end()
    Statically assert that a member of a structure is near the end of it.
    Suggested-by: David Laight <David.Laight@ACULAB.COM>

nxt_must_be_fam()
    Statically assert that the argument is a flexible array member (FAM). It's an expression that always evaluates to (int) 0.

Link: <https://gustedt.wordpress.com/2011/03/14/flexible-array-member/>
Link: <https://lore.kernel.org/lkml/202308161913.91369D4A@keescook/T/>
Link: <https://inbox.sourceware.org/gcc/dac8afb7-5026-c702-85d2-c3ad977d9a48@kernel.org/T/>
Link: <https://stackoverflow.com/a/57537491>
Link: <https://github.com/shadow-maint/shadow/pull/762>
Cc: Andrew Clayton <a.clayton@nginx.com>
Cc: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
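To illustrate the technique behind a few of these macros, here is a sketch of an array-only element count built on a static assertion that is usable as an expression. It relies on GNU C extensions (__typeof__ and __builtin_types_compatible_p, available in GCC and Clang) and C11 _Static_assert; the unprefixed macro names are chosen for this example and the actual nxt_*() definitions may differ.

```c
#include <stddef.h>

/* True if the two types are compatible (GNU C builtin). */
#define is_same_type(a, b)    __builtin_types_compatible_p(a, b)
#define is_same_typeof(a, b)  is_same_type(__typeof__(a), __typeof__(b))

/* An array is not compatible with a pointer to its first element. */
#define is_array(a)           (!is_same_typeof((a), &(a)[0]))

/*
 * A static assertion in expression form: compilation fails if e is
 * false, otherwise the whole expression evaluates to (int) 0.
 */
#define must_be(e)                                                        \
    (0 * (int) sizeof(struct { _Static_assert((e), #e); int dummy_; }))

#define must_be_array(a)      must_be(is_array(a))

/* Element count that refuses to compile when given a pointer. */
#define nitems(a)             (sizeof(a) / sizeof((a)[0]) + must_be_array(a))
```

Passing a pointer instead of an array, for example `int *p; nitems(p);`, is rejected at compile time rather than silently yielding a wrong count.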
2023-08-01Added unit pkg-config file.Konstantin Pavlov1-0/+11
2023-08-17Wasm: Add support for directory access.Andrew Clayton6-1/+69
Due to the sandboxed nature of WebAssembly, by default WASM modules don't have any access to the underlying filesystem. There is however a capabilities based mechanism[0] for allowing such access.

This adds a config option to the 'wasm' application type; 'access.filesystem' which takes an array of directory paths that are then made available to the WASM module. This access works recursively, i.e. everything under a specific path is allowed access to.

Example config might look like

    "access": {
        "filesystem": [
            "/tmp",
            "/var/tmp"
        ]
    }

The actual mechanism used allows directories to be mapped differently in the guest. But at the moment we don't support that and just map say /tmp to /tmp. This can be revisited if it's something users clamour for.

Network sockets are another resource that may be controlled in this manner, for example there is a wasi_config_preopen_socket() function, however this requires the runtime to open the network socket then effectively pass this through to the guest. This is something that can be revisited in the future if users desire it.

[0]: <https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-capabilities.md>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-17Wasm: Wire up Wasm language module support to the config system.Andrew Clayton3-0/+92
This exposes various WebAssembly language module specific options. The application type is "wasm". There is a "module" option that is required; this specifies the full path to the WebAssembly module to be run. This module should be in binary format, i.e. a .wasm file.

There are also currently eight function handlers that can be specified. Three of them are _required_

1) request_handler
   The main driving function. This may be called multiple times for a single HTTP request if the request is larger than the shared memory.

2) malloc_handler
   Used to allocate a chunk of memory at language module startup. This memory is allocated from the WASM module's address space and is what is used for communicating between the WASM module (the guest) and Unit (the host).

3) free_handler
   Used to free the memory from above at language module shutdown.

Then there are the following five _optional_ handlers

1) module_init_handler
   If set, called at language module startup.

2) module_end_handler
   If set, called at language module shutdown.

3) request_init_handler
   If set, called at the start of request. Called only once per HTTP request.

4) request_end_handler
   If set, called once all of a request has been sent to the WASM module.

5) response_end_handler
   If set, called at the end of a request, once the WASM module has sent all its headers and data.

Example config

    "applications": {
        "luw-echo-request": {
            "type": "wasm",
            "module": "/path/to/unit-wasm/examples/c/luw-echo-request.wasm",
            "request_handler": "luw_request_handler",
            "malloc_handler": "luw_malloc_handler",
            "free_handler": "luw_free_handler",
            "module_init_handler": "luw_module_init_handler",
            "module_end_handler": "luw_module_end_handler"
        }
    }

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-17Wasm: Add the core of initial WebAssembly language module support.Andrew Clayton3-0/+801
This adds the core of runtime WebAssembly[0] support. Future commits will enable this in the Unit core and expose the configuration.

This introduces a new src/wasm directory for storing this source. We are initially using Wasmtime[1] as the WebAssembly runtime, however this has been designed with the ability to use different runtimes in mind.

src/wasm/nxt_wasm.[ch] is the main interface to Unit.

src/wasm/nxt_rt_wasmtime.c is the Wasmtime runtime support. This is nicely insulated from any knowledge of internal Unit workings.

Wasmtime is what loads and runs the Wasm modules. The Wasm modules can export functions Wasmtime can call and Wasmtime can export functions that the module can call. We make use of both. The terminology used is that function exports are what the Wasm module exports and function imports are what the Wasm runtime exports to the module.

We currently have four function imports (functions exported by the runtime to be called by the Wasm module).

1) nxt_wasm_get_init_mem_size
   This allows Wasm modules to get the size of the initially allocated shared memory. This is the size allocated at Unit startup and what the Wasm modules can assume they have access to (in reality this shared memory will likely be larger). The amount of memory allocated at startup is NXT_WASM_MEM_SIZE which as of this commit is 32MiB. We do actually allocate NXT_WASM_MEM_SIZE + NXT_WASM_PAGE_SIZE at startup which is an extra 64KiB (the smallest allocation unit), this is to allow room for the response structure and so module developers can just assume they have the full 32MiB for their actual response.

2) nxt_wasm_send_headers
   This allows WASM modules to send their headers.

3) nxt_wasm_send_response
   This allows WASM modules to send their response.

4) nxt_wasm_response_end
   This allows WASM modules to inform Unit they have finished sending their response. This calls nxt_unit_request_done()

Then there are currently up to eight functions that a module can export. Three of which are required. These functions can be named anything. I'll use the Unit configuration names to refer to them

1) request_handler
   The main driving function. This may be called multiple times for a single HTTP request if the request is larger than the shared memory.

2) malloc_handler
   Used to allocate a chunk of memory at language module startup. This memory is allocated from the WASM module's address space and is what is used for communicating between the WASM module (the guest) and Unit (the host).

3) free_handler
   Used to free the memory from above at language module shutdown.

Then there are the following optional handlers

1) module_init_handler
   If set, called at language module startup.

2) module_end_handler
   If set, called at language module shutdown.

3) request_init_handler
   If set, called at the start of request. Called only once per HTTP request.

4) request_end_handler
   If set, called once all of a request has been sent to the WASM module.

5) response_end_handler
   If set, called at the end of a request, once the WASM module has sent all its headers and data.

32bits

We currently support 32bit WASM modules, i.e. wasm32-wasi. Newer versions of clang, 13+[2], do seem to have support for wasm64 as a target (which uses a LP64 model). However it's not entirely clear if the WASI SDK fully supports[3] this and by extension WASI libc/wasi-sysroot. 64bit support is something that can be explored more thoroughly in the future.

As such in structures that are used to communicate between the host and guest we use 32bit ints. Even when a single byte might be enough. This is to avoid issues with structure layout differences between a 64bit host and 32bit guest (i.e. WASM module) and the need for various bits of structure padding depending on host architecture. Instead everything is 4-byte aligned.
[0]: <https://webassembly.org/> [1]: <https://wasmtime.dev/> [2]: <https://reviews.llvm.org/rG670944fb20b226fc22fa993ab521125f9adbd30a> [3]: <https://github.com/WebAssembly/wasi-sdk/issues/185> Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
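The "32bit ints everywhere" rule for host/guest structures can be sketched like this. The structure and its field names are invented for illustration and are not the layout used in src/wasm; the point is only that fixed-width 32-bit members give the same offsets on a 64-bit host and a 32-bit guest.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical response header placed in the shared memory. */
typedef struct {
    uint32_t  status;      /* would fit in 16 bits, kept at 32 */
    uint32_t  finished;    /* a boolean, still 32 bits wide */
    uint32_t  size;        /* number of payload bytes that follow */
    uint8_t   data[];      /* payload */
} shm_response_t;

/* The layout is the same regardless of the host's pointer size. */
_Static_assert(offsetof(shm_response_t, finished) == 4, "finished at 4");
_Static_assert(offsetof(shm_response_t, size) == 8, "size at 8");
_Static_assert(offsetof(shm_response_t, data) == 12, "data at 12");
_Static_assert(sizeof(shm_response_t) == 12, "no hidden padding");

/*
 * A wasm32 guest refers to memory by 32-bit offsets, so the host
 * translates them relative to the base of the guest's linear memory.
 */
void *
guest_to_host(uint8_t *base, uint32_t offset)
{
    return base + offset;
}
```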
2023-08-16Wasm: Add core configuration data structure.Andrew Clayton1-0/+16
This is required to actually _build_ the Wasm language module. The nxt_wasm_app_conf_t structure consists of the modules name, e.g wasm, then the three required function handlers followed by the five optional function handlers. See the next commit for details of these function handlers. We also need to include the u.wasm union entry that provides access to the above structure. The bulk of the configuration infrastructure will be added in a subsequent commit. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-10Wasm: Register a new WebAssembly language module type.Andrew Clayton2-0/+2
This is the first patch in adding WebAssembly language module support. This just adds a new NXT_APP_WASM type, required by subsequent commits. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-10Index initialise the nxt_app_msg_prefix array.Andrew Clayton1-6/+6
This makes it much more clear what's what. This is in preparation for adding WebAssembly language module support. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
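As an illustration of index (designated) initialisation, the sketch below ties each array element to its enumerator, so the table stays correct when entries are added or reordered. The enum and strings are hypothetical, not the real contents of nxt_app_msg_prefix.

```c
#include <stddef.h>

/* Hypothetical application types. */
typedef enum {
    APP_EXTERNAL = 0,
    APP_PYTHON,
    APP_PHP,
    APP_WASM,
} app_type_t;

/* Each entry names the index it belongs to. */
static const char *const  app_msg_prefix[] = {
    [APP_EXTERNAL] = "Go",
    [APP_PYTHON]   = "Python",
    [APP_PHP]      = "PHP",
    [APP_WASM]     = "Wasm",
};

const char *
app_prefix(app_type_t type)
{
    size_t  n = sizeof(app_msg_prefix) / sizeof(app_msg_prefix[0]);

    return ((size_t) type < n) ? app_msg_prefix[type] : "unknown";
}
```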
2023-08-09HTTP: controlling response headers support.Zhidao HONG6-1/+238
2023-08-09HTTP: stored matched action in nxt_http_request_t.Zhidao HONG5-9/+20
No functional changes.
2023-07-12NJS: workaround for the warning in nxt_js_call() on Freebsd12 gcc.Zhidao HONG1-4/+3
2023-07-01Var: supported HTTP response header variables.Zhidao HONG3-6/+216
This commit adds the variable $response_header_NAME.
2023-06-19Variables refactoring.Zhidao HONG6-180/+204
This commit is to reimplement the variables with an unknown field such as $header_{name} to make the parsing more generic, it's a preparation for supporting response header variables.
2023-07-11NJS: supported 0.8.0.Zhidao HONG1-15/+15
2023-06-30Fixed indentation.Alejandro Colomar2-4/+4
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-05-25HTTP: fixed variable caching.Zhidao HONG3-14/+41
When a variable is accessed in the Unit configuration, the value is cached. This was useful prior to the URI rewrite feature, but now that the URI (more precisely, the request target) can be rewritten, the contents of the variable $uri (which contains the path part of the request target, and is decoded) should not be cached anymore, or at least the cached value should be invalidated after a URI rewrite. Example: { "rewrite": "/prefix$uri", "share": "$uri" } For a request line like GET /foo?bar=baz HTTP/1.1\r\n, the expected file served in the response would be /prefix/foo, but due to the caching issue, Unit currently serves /foo.
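The caching problem and one possible remedy (dropping the cached value on rewrite) can be sketched as below. This is a toy model with invented names; it does not reflect how Unit actually stores or invalidates variable values.

```c
#include <stddef.h>

/* Toy request: the current target plus a cached $uri value. */
typedef struct {
    const char  *target;
    const char  *uri_cache;   /* NULL means "not cached yet" */
} request_t;

/* Return $uri, computing and caching it on first use. */
const char *
var_uri(request_t *r)
{
    if (r->uri_cache == NULL) {
        r->uri_cache = r->target;
    }

    return r->uri_cache;
}

/* Rewrite the target and invalidate the cached $uri. */
void
rewrite_target(request_t *r, const char *new_target)
{
    r->target = new_target;
    r->uri_cache = NULL;
}
```

Without the invalidation in rewrite_target(), a later lookup of $uri would keep returning the value from before the rewrite, which is the behaviour described in the commit message.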
2023-06-01Python: Fix error checks in nxt_py_asgi_request_handler().synodriver1-2/+2
Signed-off-by: synodriver <diguohuangjiajinweijun@gmail.com> Reviewed-by: Andrew Clayton <a.clayton@nginx.com> [ Re-word commit subject - Andrew ] Fixes: c4c2f90c5b53 ("Python: ASGI server introduced.") Closes: <https://github.com/nginx/unit/issues/895> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-06-01Python: Add ASGI lifespan state support.synodriver4-3/+84
Lifespan state is a special dict in the ASGI lifespan scope, which allows applications to persist data from the lifespan cycle to request/response handling. The scope["state"] namespace provides a place to store these sorts of things. The server will ensure that a shallow copy of the namespace is passed into each subsequent request/response call into the application. Some frameworks are already taking advantage of this feature, for example, starlette, and without this feature they wouldn't work properly. Signed-off-by: synodriver <diguohuangjiajinweijun@gmail.com> Reviewed-by: Andrew Clayton <a.clayton@nginx.com> [ Minor code tweaks to avoid lines > 80 chars, static a function and re-work the PyMemberDef structure initialisation for Python <3.7 and -Wwrite-strings compatibility - Andrew ] Tested-by: <https://github.com/synodriver> Tested-by: <https://github.com/hawiliali> Closes: <https://github.com/nginx/unit/issues/864> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-05-25Tests: fixed incorrect pointer assignment.Alejandro Colomar1-2/+2
If we don't update the pointer before copying the request body, then we get the behavior shown below. After this patch, "foo\n" is rightly appended at the end of the response body. Request: "GET / HTTP/1.1\r\nHost: _\nContent-Length: 4\n\nfoo\n" Response body: """ Hello world! foo est data: Method: GET Protocol: HTTP/1.1 Remote addr: 127.0.0.1 Local addr: 127.0.0.1 Target: / Path: / Fields: Host: _ Content-Length: 4 Body: """ Fixes: 1bb22d1e922c ("Unit application library.") Reviewed-by: Andrew Clayton <a.clayton@nginx.com> Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-05-21Added back deprecated options to unitd.Alejandro Colomar1-0/+31
We renamed the options recently, with the intention of keeping the old names as supported but deprecated for some time, before removal. This was done with the configure script options, but in the unitd binary, we accidentally removed the old names, causing some unintended breakage. Keep support for the old names, albeit with a deprecation message to stderr, for some time, until we decide to remove them. Fixes: 5a37171f733f ("Added default values for pathnames.") Closes: <https://github.com/nginx/unit/issues/876> Reported-by: El RIDO <elrido@gmx.net> Acked-by: Liam Crilly <liam@nginx.com> Acked-by: Artem Konev <a.konev@f5.com> Acked-by: Timo Stark <t.stark@nginx.com> Reviewed-by: Andrew Clayton <a.clayton@nginx.com> Cc: Andrei Zeliankou <zelenkov@nginx.com> Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-05-18Python: Fix ASGI applications accessed over IPv6.Andrew Clayton1-11/+3
There are a couple of reports on GitHub about issues accessing Python ASGI based applications over IPv6. A request over IPv6 would result in an error like

    2023/05/13 17:49:12 [alert] 47202#47202 [unit] #10: Python failed to create 'client' pair
    2023/05/13 17:49:12 [alert] 47202#47202 [unit] Python failed to call 'loop.call_soon'
    ValueError: invalid literal for int() with base 10: 'db8:1:1:1ee7:dead:beef:cafe'

    The above error was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/usr/lib64/python3.11/asyncio/base_events.py", line 765, in call_soon
        handle = self._call_soon(callback, args, context)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/lib64/python3.11/asyncio/base_events.py", line 781, in _call_soon
        handle = events.Handle(callback, args, self, context)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    SystemError: <class 'asyncio.events.Handle'> returned a result with an exception set

This issue occurred in the nxt_py_asgi_create_ip_address() function where it tries to create an IP address / port number pair. It does this by looking for the first ':' in the address and taking everything after it as the port number. Like in the above error message, if we tried to access the server @ 2001:db8:1:1:1ee7:dead:beef:cafe, then we'd end up with the port number as 'db8:1:1:1ee7:dead:beef:cafe'.

There are two issues with this

1) The IP address and port number are already flowed through separately.

2) Even if (1) wasn't true, it would still be broken for IPv6 as we'd expect to get an address literal like [2001:db8:1:1:1ee7:dead:beef:cafe]:8080, however there was no code to handle the []'s.

The fix is to simply not try looking for a port number. We pass a port number into this function to use in the case where we don't find a port number, we never will...

A further cleanup would be to flow through the server port number when creating the 'server pair' PyTuple, rather than just using the hard coded 80.
Closes: <https://github.com/nginx/unit/issues/793> Closes: <https://github.com/nginx/unit/issues/874> Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
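The failure mode described above is easy to reproduce in isolation. The sketch below is not the code from nxt_py_asgi_create_ip_address(); it only demonstrates why splitting at the first ':' breaks for IPv6, and shows the alternative of carrying the address and port as separate values. All names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Buggy approach: treat everything after the first ':' as the port. */
const char *
port_after_first_colon(const char *addr)
{
    const char  *p = strchr(addr, ':');

    return (p != NULL) ? p + 1 : NULL;
}

/* Safer approach: keep address and port apart from the start. */
typedef struct {
    const char  *address;
    int         port;
} ip_pair_t;

ip_pair_t
make_ip_pair(const char *address, int port)
{
    ip_pair_t  pair = { address, port };

    return pair;
}
```

For an IPv4 literal such as 127.0.0.1:8080 the first function happens to work, which is presumably why the problem only surfaced with IPv6 clients.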
2023-05-08NJS: supported loadable modules.Zhidao HONG14-49/+1584
2023-04-20HTTP: added basic URI rewrite.Zhidao HONG9-11/+160
This commit introduced the basic URI rewrite. It allows users to change the request URI. Note the "rewrite" option ignores the contained query if any and the query from the request is preserved.

An example:

    "routes": [
        {
            "match": {
                "uri": "/v1/test"
            },
            "action": {
                "return": 200
            }
        },
        {
            "action": {
                "rewrite": "/v1$uri",
                "pass": "routes"
            }
        }
    ]

Reviewed-by: Alejandro Colomar <alx@nginx.com>
2023-04-25Allow to remove the version string in HTTP responses.Andrew Clayton4-3/+22
Normally Unit responds to HTTP requests by including a header like Server: Unit/1.30.0 however it can sometimes be beneficial to withhold the version information and in this case just respond with Server: Unit This patch adds a new "settings.http" boolean option called server_version, which defaults to true, in which case the full version information is sent. However this can be set to false, e.g "settings": { "http": { "server_version": false } }, in which case Unit responds without the version information as the latter example above shows. Link: <https://www.ietf.org/rfc/rfc9110.html#section-10.2.4> Closes: <https://github.com/nginx/unit/issues/158> Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-25Decouple "Unit" from NXT_SERVER.Andrew Clayton1-1/+2
Split out the "Unit" name from the NXT_SERVER #define into its own NXT_NAME #define, then make NXT_SERVER a combination of that and NXT_VERSION. This is required for a subsequent commit where we may want the server name on its own. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
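The split can be pictured with string-literal concatenation. The "/" separator and the version value below are inferred from the "Server: Unit/1.30.0" header quoted elsewhere in this log, and server_header_value() is a hypothetical helper, so treat this as a sketch rather than the actual definitions.

```c
#define NXT_VERSION  "1.30.0"                    /* example value */
#define NXT_NAME     "Unit"
#define NXT_SERVER   NXT_NAME "/" NXT_VERSION    /* "Unit/1.30.0" */

/* Hypothetical helper: pick the header value with or without version. */
const char *
server_header_value(int with_version)
{
    return with_version ? NXT_SERVER : NXT_NAME;
}
```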
2023-04-24Remove an erroneous semi-colon.Andrew Clayton1-1/+1
Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-24Don't conflate the error variable in nxt_kqueue_poll().Andrew Clayton1-3/+4
In nxt_kqueue_poll() error is declared as a nxt_bool_t aka unsigned int (on x86-64 anyway). It is used both as a boolean and as the return storage for a bitwise AND operation. This has potential to go awry. If nxt_bool_t was changed to be a u8 then we would have the following issue gcc12 -c -pipe -fPIC -fvisibility=hidden -O -W -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wmissing-prototypes -Werror -g -O2 -I src -I build -I/usr/local/include -o build/src/nxt_kqueue_engine.o -MMD -MF build/src/nxt_kqueue_engine.dep -MT build/src/nxt_kqueue_engine.o src/nxt_kqueue_engine.c src/nxt_kqueue_engine.c: In function 'nxt_kqueue_poll': src/nxt_kqueue_engine.c:728:17: error: overflow in conversion from 'int' to 'nxt_bool_t' {aka 'unsigned char'} changes value from '(int)kev->flags & 16384' to '0' [-Werror=overflow] 728 | error = (kev->flags & EV_ERROR); | ^ cc1: all warnings being treated as errors EV_ERROR has the value 16384, after the AND operation error holds 16384, however this overflows and wraps around (64 times) exactly to 0. With nxt_bool_t defined as a u32, we would have a similar issue if EV_ERROR ever became UINT_MAX + 1 (or a multiple thereof)... Rather than conflating the use of error, keep error as a boolean (it is used further down the function) but do the AND operation inside the if (). Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
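The truncation is easy to demonstrate in isolation. In this sketch SKETCH_EV_ERROR stands in for the kqueue flag (using the value 16384 quoted above) and the two functions are hypothetical; they contrast storing the masked value in a narrow boolean with storing only the result of the test.

```c
#include <stdint.h>

#define SKETCH_EV_ERROR  0x4000   /* 16384 */

/* Conflated: the masked flags are squeezed into an 8-bit "boolean". */
uint8_t
error_conflated(unsigned flags)
{
    return (uint8_t) (flags & SKETCH_EV_ERROR);   /* 16384 becomes 0 */
}

/* Separated: do the AND in the condition and keep only 0 or 1. */
uint8_t
error_separated(unsigned flags)
{
    return (flags & SKETCH_EV_ERROR) != 0;
}
```

With the flag set, the first function reports no error while the second reports one, which is exactly the discrepancy the compiler diagnostic points at.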
2023-04-24Remove a bunch of dead code.Andrew Clayton21-4935/+0
This removes a bunch of unused files that would have been touched by subsequent commits that switch to using nxt_bool_t (AKA uint8_t) in structures.

In auto/sources we have

    NXT_LIB_SRC0=" \
        src/nxt_buf_filter.c \
        src/nxt_job_file.c \
        src/nxt_stream_module.c \
        src/nxt_stream_source.c \
        src/nxt_upstream_source.c \
        src/nxt_http_source.c \
        src/nxt_fastcgi_source.c \
        src/nxt_fastcgi_record_parse.c \
        \
        src/nxt_mem_pool_cleanup.h \
        src/nxt_mem_pool_cleanup.c \
    "

None of these seem to actually be used anywhere (other than within themselves). That variable is _not_ referenced anywhere else.

Also remove the unused related header files: src/nxt_buf_filter.h, src/nxt_fastcgi_source.h, src/nxt_http_source.h, src/nxt_job_file.h, src/nxt_stream_source.h and src/nxt_upstream_source.h

Also, these files do not seem to be used, no mention under auto/ or build/

    src/nxt_file_cache.c
    src/nxt_cache.c
    src/nxt_job_file_cache.c

src/nxt_cache.h is #included in src/nxt_main.h, but AFAICT is not actually used.

With all the above removed

    $ ./configure --openssl --debug --tests && make -j && make -j tests && make libnxt

all builds. Buildbot passes.

NOTE: You may need to do a 'make clean' before the next build attempt.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-12HTTP: optimizing $request_line.Alejandro Colomar5-41/+20
Don't reconstruct a new string for the $request_line from the parsed method, target, and HTTP version, but rather keep a pointer to the original memory where the request line was received. This will be necessary for implementing URI rewrites, since we want to log the original request line, and not one constructed from the rewritten target. This implementation changes behavior (only for invalid requests) in the following way: Previous behavior was to log as many tokens from the request line as were parsed validly, thus: Request -> access log ; error log "GET / HTTP/1.1" -> "GET / HTTP/1.1" OK ; = "GET / HTTP/1.1" -> "GET / HTTP/1.1" [1] ; = "GET / HTTP/2.1" -> "GET / HTTP/2.1" OK ; = "GET / HTTP/1." -> "GET / HTTP/1." [2] ; "GET / HTTP/1. [null]" "GET / food" -> "GET / food" [2] ; "GET / food [null]" "GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2] ; = "GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2] ; = "GET food HTTP/1.1" -> "GET" ; "GET [null] [null]" "OPTIONS * HTTP/1.1" -> "OPTIONS" [3] ; "OPTIONS [null] [null]" "FOOBAR baz HTTP/1.1"-> "FOOBAR" ; "FOOBAR [null] [null]" "FOOBAR / HTTP/1.1" -> "FOOBAR / HTTP/1.1" ; = "get / HTTP/1.1" -> "-" ; " [null] [null]" "" -> "-" ; " [null] [null]" This behavior was rather inconsistent. We have several options to go forward with this patch: - NGINX behavior. Log the entire request line, up to '\r' | '\n', even if it was invalid. This is the most informative alternative. However, RFC-complying requests will probably not send invalid requests. This information would be interesting to users where debugging requests constructed manually via netcat(1) or a similar tool, or maybe for debugging a client, are important. It might be interesting to support this in the future if our users are interested; for now, since this approach requires looping over invalid requests twice, that's an overhead that we better avoid. 
- Previous Unit behavior This is relatively fast (almost as fast as the next alternative, the one we chose), but the implementation is ugly, in that we need to perform the same operation in many places around the code. If we want performance, probably the next alternative is better; if we want to be informative, then the first one is better (maybe in combination with the third one too). - Chosen behavior Only logging request lines when the request is valid. For any invalid request, or even unsupported ones, the request line will be logged as "-". Thus: Request -> access log [4] "GET / HTTP/1.1" -> "GET / HTTP/1.1" OK "GET / HTTP/1.1" -> "GET / HTTP/1.1" [1] "GET / HTTP/2.1" -> "-" [3] "GET / HTTP/1." -> "-" "GET / food" -> "-" "GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2] "GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2] "GET food HTTP/1.1" -> "-" "OPTIONS * HTTP/1.1" -> "-" "FOOBAR baz HTTP/1.1"-> "-" "FOOBAR / HTTP/1.1" -> "FOOBAR / HTTP/1.1" "get / HTTP/1.1" -> "-" "" -> "-" This is less informative than previous behavior, but considering how inconsistent it was, and that RFC-complying agents will probably not send us such requests, we're ready to lose that information in the log. This is of course the fastest and simplest implementation we can get. We've chosen to implement this alternative in this patch. Since we modified the behavior, this patch also changes the affected tests. [1]: Multiple successive spaces as a token delimiter is allowed by the RFC, but it is discouraged, and considered a security risk. It is currently supported by Unit, but we will probably drop support for it in the future. [2]: Unit currently supports spaces in the request-target. This is a violation of the relevant RFC (linked below), and will be fixed in the future, and consider those targets as invalid, returning a 400 (Bad Request), and thus the log lines with the previous inconsistent behavior would be changed. [3]: Not yet supported. 
[4]: In the error log, regarding the "log_routes" conditional logging of the request line, we only need to log the request line if it was valid. It doesn't make sense to log "" or "-" in case that the request was invalid, since this is only useful for understanding decisions of the router. In this case, the access log is more appropriate, which shows that the request was invalid, and a 400 was returned. When the request line is valid, it is printed in the error log exactly as in the access log. Link: <https://datatracker.ietf.org/doc/html/rfc9112#section-3> Suggested-by: Liam Crilly <liam@nginx.com> Reviewed-by: Zhidao Hong <z.hong@f5.com> Cc: Timo Stark <t.stark@nginx.com> Cc: Andrei Zeliankou <zelenkov@nginx.com> Cc: Andrew Clayton <a.clayton@nginx.com> Cc: Artem Konev <a.konev@f5.com> Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-04-11Add per-application logging.Andrew Clayton5-0/+81
Currently when running in the foreground, unit application processes will send stdout to the current TTY and stderr to the unit log file. That behaviour won't change.

When running as a daemon, unit application processes will send stdout to /dev/null and stderr to the unit log file.

This commit makes it possible to alter the latter case of unit running as a daemon, by allowing applications to redirect stdout and/or stderr to specific log files. This is done via two new application options, 'stdout' & 'stderr', e.g.

    "applications": {
        "myapp": {
            ...
            "stdout": "/path/to/log/unit/app/stdout.log",
            "stderr": "/path/to/log/unit/app/stderr.log"
        }
    }

These log files are created by the application processes themselves and thus the log directories need to be writable by the user (and/or group) of the application processes. E.g.

    $ sudo mkdir -p /path/to/log/unit/app
    $ sudo chown APP_USER /path/to/log/unit/app

These need to be set up before starting unit with the above config.

Currently these log files do not participate in log-file rotation (SIGUSR1), that may change in a future commit. In the meantime these logs can be rotated using the traditional copy/truncate method.

NOTE: You may or may not see stuff printed to stdout as stdout was traditionally used by CGI applications to communicate with the webserver.

Closes: <https://github.com/nginx/unit/issues/197>
Closes: <https://github.com/nginx/unit/issues/846>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-11Add nxt_file_stdout().Andrew Clayton2-0/+20
This is analogous to the nxt_file_stderr() function and will be used in a subsequent commit. This function redirects stdout to a given file descriptor. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
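As an illustration of what redirecting stdout to a file descriptor involves, here is a minimal sketch built on dup2(2). It is not the code of nxt_file_stdout(); the names redirect_stdout() and redirect_stdout_to_path() are hypothetical, and the open flags and 0644 mode are assumptions made for the example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not Unit's nxt_file_stdout()): make STDOUT_FILENO
 * refer to the same open file as fd.  Returns 0 on success, -1 on error.
 */
static int
redirect_stdout(int fd)
{
    /* Flush buffered stdio data so it still reaches the old destination. */
    if (fflush(stdout) != 0) {
        return -1;
    }

    /* dup2() closes the old descriptor 1 and duplicates fd onto it. */
    if (dup2(fd, STDOUT_FILENO) == -1) {
        return -1;
    }

    return 0;
}

/* Hypothetical wrapper: open path for appending and redirect stdout to it. */
static int
redirect_stdout_to_path(const char *path)
{
    int  fd, ret;

    fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1) {
        return -1;
    }

    ret = redirect_stdout(fd);

    /* Descriptor 1 holds its own reference now; drop the original. */
    if (fd != STDOUT_FILENO) {
        close(fd);
    }

    return ret;
}
```

A process would typically call such a helper once, early after fork(2), so that everything it later writes to descriptor 1 ends up in the chosen file.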
2023-04-11PHP: Make the filter_input() function work.Andrew Clayton1-3/+12
On GitHub, @jamesRUS52 reported that the PHP filter_input()[0] function would just return NULL.

To enable this function we need to run the variables through the sapi_module.input_filter() function when we call php_register_variable_safe().

In PHP versions prior to 7.0.0, input_filter() takes 'len' as an unsigned int, while later versions take it as a size_t.

Now, with this commit and the following PHP

    <?php

    var_dump(filter_input(INPUT_SERVER, 'REMOTE_ADDR'));
    var_dump(filter_input(INPUT_SERVER, 'REQUEST_URI'));
    var_dump(filter_input(INPUT_GET, 'get', FILTER_SANITIZE_SPECIAL_CHARS));

    ?>

you get

    $ curl 'http://localhost:8080/854.php?get=foo<>'
    string(3) "::1"
    string(18) "/854.php?get=foo<>"
    string(13) "foo&#60;&#62;"

[0]: <https://www.php.net/manual/en/function.filter-input.php>

Tested-by: <https://github.com/jamesRUS52>
Closes: <https://github.com/nginx/unit/issues/854>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-03Remove a useless assignment in nxt_mem_zone_alloc_pages().Andrew Clayton1-1/+1
This was reported by the 'Clang Static Analyzer' as a 'dead nested assignment'.

We assign prev_size, then check if it's != 0, and if true we then set prev_pages to page_size right shifted by two, at the same time setting prev_size to be right shifted by two (>>=). However, prev_size is never used again, so there is no need to set it here.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
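The general shape of a dead nested assignment can be sketched as follows. This is an illustration only, not the code of nxt_mem_zone_alloc_pages(); the function names, parameters, and shift amounts are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical illustration; the names only loosely echo the commit
 * message and this is not Unit's code.
 */

/*
 * Before: '>>=' also stores the shifted value back into size, but size
 * is never read again, so that store is dead.
 */
static size_t
pages_before(size_t size, unsigned shift)
{
    size_t  pages;

    pages = size >>= shift;

    return pages;
}

/* After: a plain '>>' yields the same result without the dead store. */
static size_t
pages_after(size_t size, unsigned shift)
{
    return size >> shift;
}
```

Both functions return the same value for any input; the second simply avoids writing to a variable whose new value nobody reads.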
2023-04-03Prevent a possible NULL de-reference in nxt_job_create().Andrew Clayton1-4/+6
We allocate 'job', then check that it's not NULL and do stuff with it, but then we accessed it outside this check.

Simply return if job is NULL.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
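The early-return pattern described above might look like the following sketch. The job_t type, its fields, and job_create() are hypothetical stand-ins; they are not Unit's nxt_job_t or nxt_job_create().

```c
#include <stdlib.h>

/* Hypothetical type; not Unit's nxt_job_t. */
typedef struct {
    int  id;
    int  ready;
} job_t;

/* Hypothetical constructor showing the early return on allocation failure. */
static job_t *
job_create(int id)
{
    job_t  *job;

    job = calloc(1, sizeof(job_t));
    if (job == NULL) {
        /* Return right away so nothing below can dereference NULL. */
        return NULL;
    }

    /* Every access to job now sits after the NULL check. */
    job->id = id;
    job->ready = 1;

    return job;
}
```

Returning early keeps all accesses on one side of the check, instead of guarding some of them with `if (job != NULL)` and leaving others outside the guard.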
2023-04-03Remove a useless assignment in nxt_fs_mkdir_all().Andrew Clayton1-1/+1
This was reported by the 'Clang Static Analyzer' as a 'dead nested assignment'. We set end outside the loop but the first time we use it is to assign it in the loop (not used anywhere else). Further cleanup could be to reduce the scope of end by moving its declaration inside the loop. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-29Auto: mirroring installation structure in build tree.Alejandro Colomar1-1/+1
This makes the build tree more organized, which is good for adding new stuff. Now, it's useful for example for adding manual pages in man3/, but it may be useful in the future for example for extending the build system to run linters (e.g., clang-tidy(1), Clang analyzer, ...) on the C source code.

Previously, the build tree was quite flat, and looked like this (after `./configure && make`):

    $ tree -I src build
    build
    ├── Makefile
    ├── autoconf.data
    ├── autoconf.err
    ├── echo
    ├── libnxt.a
    ├── nxt_auto_config.h
    ├── nxt_version.h
    ├── unitd
    └── unitd.8

    1 directory, 9 files

And after this patch, it looks like this:

    $ tree -I src build
    build
    ├── Makefile
    ├── autoconf.data
    ├── autoconf.err
    ├── bin
    │   └── echo
    ├── include
    │   ├── nxt_auto_config.h
    │   └── nxt_version.h
    ├── lib
    │   ├── libnxt.a
    │   └── unit
    │       └── modules
    ├── sbin
    │   └── unitd
    ├── share
    │   └── man
    │       └── man8
    │           └── unitd.8
    └── var
        ├── lib
        │   └── unit
        ├── log
        │   └── unit
        └── run
            └── unit

    17 directories, 9 files

It also solves one issue introduced in 5a37171f733f ("Added default values for pathnames."). Before that commit, it was possible to run unitd from the build system (`./build/unitd`). Now, since it expects files in a very specific location, that has been broken. By having a directory structure that mirrors the installation, it's possible to trick it into believing it's installed, and run it from there:

    $ ./configure --prefix=./build
    $ make
    $ ./build/sbin/unitd

Fixes: 5a37171f733f ("Added default values for pathnames.")
Reported-by: Liam Crilly <liam@nginx.com>
Reviewed-by: Konstantin Pavlov <thresh@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Cc: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-03-29Renamed --libstatedir to --statedir.Alejandro Colomar1-5/+5
In BSD systems, it's usually </var/db> or some other dir under </var> that is not </var/lib>, so $statedir is a more generic name. See hier(7). Reported-by: Andrei Zeliankou <zelenkov@nginx.com> Reported-by: Zhidao Hong <z.hong@f5.com> Reviewed-by: Konstantin Pavlov <thresh@nginx.com> Reviewed-by: Andrew Clayton <a.clayton@nginx.com> Cc: Liam Crilly <liam@nginx.com> Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-02-23Set a safer umask(2) when running as a daemon.Andrew Clayton1-3/+3
When running as a daemon, unit currently sets umask(0), i.e. no umask. This results in various directories being created with a mode of 0777 (rwxrwxrwx). This currently affects cgroup and rootfs directories, which are created with a mode of 0777 when running as a daemon, as there is no umask to restrict the permissions.

This also affects the language modules (the umask is inherited over fork(2)), whereby unless something explicitly sets a umask, files and directories will be created with full permissions, 0666 (rw-rw-rw-) and 0777 (rwxrwxrwx) respectively.

This could be an unwitting security issue.

My original idea was to just remove the umask(0) call and thus inherit the umask from the executing shell/program. However, there was some concern about just inheriting whatever umask was in effect.

Alex suggested that rather than simply removing the umask(0) call, we change it to a value of 022 (which is a common default), which will result in directories and files with permissions of at most 0755 (rwxr-xr-x) and 0644 (rw-r--r--) respectively.

If applications need some other umask set, they can (as they always have been able to) set their own umask(2).

Suggested-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Liam Crilly <liam@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
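The effect of the umask on newly created files can be checked with a small sketch like the one below. The helper mode_with_umask() is hypothetical and is not part of Unit; it creates a file requesting mode 0666 under a given umask and reports the permission bits the file actually received.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create path requesting mode 0666 under the given
 * umask and return the permission bits the file actually got, or
 * (mode_t) -1 on error.  The file is removed again and the previous
 * umask is restored before returning.
 */
static mode_t
mode_with_umask(const char *path, mode_t mask)
{
    int          fd;
    mode_t       old, result;
    struct stat  st;

    old = umask(mask);

    result = (mode_t) -1;

    /* O_EXCL ensures the file is really created here with our mode. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);
    if (fd != -1) {
        if (fstat(fd, &st) == 0) {
            result = st.st_mode & 0777;
        }

        close(fd);
        unlink(path);
    }

    umask(old);

    return result;
}
```

On a typical filesystem without default ACLs, a mask of 0 leaves the requested 0666 untouched, while a mask of 022 clears the group and other write bits, giving 0644.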