Follow

But in the meantime: Hey, , anybody else having trouble going from Postgres 9.6 (Stretch) to 11 (Buster)? We're seeing high rates of 502s, a whole lot of long-running CPU eating queries, and no real idea why this would happen due to a DB upgrade.

Ran vacuum and reindex, and the queries that seem to be hanging look fairly normal.

pg_hero reports 4 duplicate indexes and 26 unused ones (do these numbers look similar for you?), but I wouldn't expect those to be the issue.
But the upgrade has our server running at full CPU load, and ~90% of that is postgres. Mastodon issue? Postgres config change between 9.6 and 11?

This seems to be a thing that's coming in waves – we're ok for a while, then suddenly CPU load across all cores spikes, leading to plenty of 502s.

I'm particularly confused that postgres would do this (misconfig on our part?), and that there always a couple of *idle* queries among the CPU hogs. How come?

Any hints or experiences on your side would be very appreciated – either on here or via r at rixx.de (email) or rixx at jabber.shackspace.de, or messenger pidgeon, I suppose.

@rixx postgres upgrades are usually major things, i've never done anything that just completely skipped 10

@rixx Our little instance's PgHero has shown “4 duplicate indexes” right from the start. We started on Debian Buster.

Not sure about unused indexes. They just went from 11(?) to 10 after I clicked “with a migration”.

Alas, our in-house competence with PostgreSQL is limited. Formerly, we used MySQL for everything, but lots of projects seem to prefer psql these days.

@rixx Have you tried to export and re-import it, again? Without index.
Have you tried to set the (error) logging on the highest level?

@rixx I specialise in Postgres if you still need a hand with anything.

I'd turn the query logs on (log_min_duration_statement = 0, just needs a reload) and run a query analyzer such as pgfouine on it.

Then get an example of the slower queries and EXPLAIN ANALYZE.

@rixx if something is showing as "idle" in pg_stat_activity it's not running anything. The statement shown is the last one it executed.

@russss Thanks, we found we didn't run the analyze script (which seems to do more than a plain analyze, which we ran). All good now, was just weird to figure out what was going on (especially with late-night brain on).

@russss But it's good to know you specialise in postgres. *files notes*
Is there anything you don't specialise in, though.

@rixx with most things I just know enough to be useful, usually :)

@rixx (although I just spotted that's broken on postgres 11 so I shall fix it now!)

@russss Ohh, that looks nice, thanks! I still have a small handfull of open tabs on Postgres performance (though for pretalx originally, I guess the results would apply here, too). Is there any source you'd recommend to learn more, apart from postgresql.org/docs/12/admin.h ?

@rixx I wrote this a few years ago - I think it's still mostly correct. russ.garrett.co.uk/2015/10/02/

...and that just reminded me about pg_stat_statements which I'd completely forgotten about!

Sign in to participate in the conversation
chaos.social

The social network of the future: No ads, no corporate surveillance, ethical design, and decentralization! Own your data with Mastodon!