``` {=html}
<style>
body { max-width: 48em !important; }
</style>
```
##### [digital-domain.net](https://digital-domain.net/)
## NGINX Unit Serialised Pointers
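In [NGINX Unit](https://unit.nginx.org/) we make use of what we call
_serialised pointers_. In the simplest terms these are nothing more than
_offsets_ into memory. However, the way they are implemented is somewhat
non-obvious.

They are needed when we want to share memory that contains pointers with
another process via inter-process communication (IPC). A raw pointer is only
meaningful in the address space of the process that created it, whereas an
offset relative to the data itself stays valid wherever that memory ends up.

This text will attempt to explain them.
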
In Unit it is common to have a chunk of memory that starts with a _structure_
and then has some data after it, such as a number of (possibly NUL-terminated)
strings.

Each of these strings has an associated `nxt_unit_sptr_t` structure member,
which is defined as:

```c
union nxt_unit_sptr_u {
    uint8_t   base[1];
    uint32_t  offset;
};
```

`.base` is only used to get the address of this union: used in an expression,
the array decays to a pointer to its first element, and because every member
of a union starts at the same address, `.base` is the address of the union
itself.

**This is really the key to the whole thing: we never set (or retrieve)
`.base`; it merely exists to provide the address of the union.**

`.offset` is then the offset from the `.base` address to the start of the
data in question.

(This could also have been implemented using a simple integer type.)

The following example program and diagram will hopefully make things clear:

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

union sptr_u {
    uint8_t base[1];
    uint32_t offset;
};
typedef union sptr_u sptr_t;

struct s {
    uint8_t name1_len;
    uint8_t name2_len;
    uint8_t name3_len;

    sptr_t name1;
    sptr_t name2;
    sptr_t name3;
};

/* Store the distance from the sptr itself to ptr. */
static void sptr_set(sptr_t *sptr, void *ptr)
{
    sptr->offset = (uint8_t *)ptr - sptr->base;
}

/* Turn the stored offset back into a pointer. */
static void *sptr_get(sptr_t *sptr)
{
    return sptr->base + sptr->offset;
}

int main(void)
{
    static const char * const names[] = { "toor", "foobar", "baz" };

    /* One allocation: the structure, then the three strings + NULs. */
    struct s *s = malloc(sizeof(struct s) +
                         strlen(names[0]) + strlen(names[1]) +
                         strlen(names[2]) + 3);
    if (s == NULL)
        exit(EXIT_FAILURE);

    /* The string data starts right after the structure. */
    char *p = (char *)(s) + sizeof(struct s);

    sptr_set(&s->name1, p);
    p = stpcpy(p, names[0]);  /* returns a pointer to the NUL */
    p++;                      /* step over the NUL */

    sptr_set(&s->name2, p);
    p = stpcpy(p, names[1]);
    p++;

    sptr_set(&s->name3, p);
    p = stpcpy(p, names[2]);

    printf("name1 : %s\n", (const char *)sptr_get(&s->name1));
    printf("name2 : %s\n", (const char *)sptr_get(&s->name2));
    printf("name3 : %s\n", (const char *)sptr_get(&s->name3));

    free(s);

    exit(EXIT_SUCCESS);
}
```

The above program results in a memory layout like the one described below.

[pahole(1)](https://www.kernel.org/doc/ols/2007/ols2007v2-pages-35-44.pdf)
shows the layout of the union and the structure:

```
union sptr_u {
        uint8_t                    base[1];              /*     0     1 */
        uint32_t                   offset;               /*     0     4 */
};
struct s {
        uint8_t                    name1_len;            /*     0     1 */
        uint8_t                    name2_len;            /*     1     1 */
        uint8_t                    name3_len;            /*     2     1 */

        /* XXX 1 byte hole, try to pack */

        sptr_t                     name1;                /*     4     4 */
        sptr_t                     name2;                /*     8     4 */
        sptr_t                     name3;                /*    12     4 */

        /* size: 16, cachelines: 1, members: 6 */
        /* sum members: 15, holes: 1, sum holes: 1 */
        /* last cacheline: 16 bytes */
};
```

So we have three strings: "toor", "foobar" and "baz". Each offset is measured
from the address of its own `sptr_t` member, and the string data begins
immediately after the 16-byte structure.

_toor_ starts at the address of _s->name1_ + _12_; 12 is `sizeof(sptr_t) * 3`.

_foobar_ starts at the address of _s->name2_ + _13_; 13 is
`sizeof(sptr_t) * 2` plus the length of "toor\0" (5).

_baz_ starts at the address of _s->name3_ + _16_; 16 is `sizeof(sptr_t)` plus
the combined length of "toor\0" and "foobar\0" (12).

---
[Andrew Clayton](mailto:Andrew Clayton <andrew@digital-domain.net>),
Apr 8th 2024