We have received no reports of 502 errors in the last 11 days, so it looks like the problem is finally solved 🎉
The only problem is that we changed two things. We might revert one of them for a short time in the next few days to check which of the two caused the problem. So if you experience 502 errors, it will only be for a short time, but please still report them.
We now know what caused the 502 errors. The backends of the Mastodon application bind only to IPv4 by default. On the nginx reverse proxy we used the name localhost to refer to the backend. As usual, this first resolves to ::1, the local IPv6 address. Since no backend is listening on that address, nginx had to fall back to IPv4, and we think this was too slow in some cases, leading to the user-facing 502 error.
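The mismatch described above can be reproduced outside of nginx. The following is a minimal Python sketch, not the actual chaos.social setup: the helper names (`start_ipv4_only_listener`, `try_connect`, `candidates_for`) are hypothetical, and the stand-in listener only imitates a backend that binds to 127.0.0.1. It does not model how nginx itself picks or retries addresses; it only shows that localhost can resolve to an address nobody is listening on.

```python
import socket


def start_ipv4_only_listener():
    """Bind a listener to 127.0.0.1 only, imitating an IPv4-only backend."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(5)
    return srv, srv.getsockname()[1]


def try_connect(address, port, timeout=1.0):
    """Return True if a TCP connection to address:port succeeds."""
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            s.connect((address, port))
        return True
    except OSError:
        # Covers refused connections, timeouts, and hosts without IPv6.
        return False


def candidates_for(name, port):
    """Addresses the resolver offers for name, in the order it returns them."""
    try:
        infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return [info[4][0] for info in infos]


if __name__ == "__main__":
    srv, port = start_ipv4_only_listener()
    # Which addresses appear, and in which order, depends on the host.
    for addr in candidates_for("localhost", port):
        print(addr, "ok" if try_connect(addr, port) else "failed")
    srv.close()
```

On a host where localhost resolves to both ::1 and 127.0.0.1, only the IPv4 attempt should succeed. Pointing the proxy at 127.0.0.1 directly avoids the failed attempt altogether.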
@ordnung Have you documented anywhere what you changed and what the possible cause was?
@clerie well, we don't know yet what it was, so we can't document it yet either :D
@ordnung So you're still at that stage of debugging :D
Once you know more, I'm sure I'll hear about it ^^
@ordnung What was the solution? Enabling IPv6 for Docker, or changing localhost to 127.0.0.1?
@gcrkrause We don't use Docker. And the solution was to change it to 127.0.0.1, so yes.
@ordnung Nice find! That's not the sort of failure mode you see every day.
chaos.social – a Fediverse instance for & by the Chaos community