Today, while building a new Grafana dashboard for Ceph, I noticed that the OSDs on one host perform much better than those on the other hosts. After this observation and a round of debugging I found out that the controllers were configured differently.
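
For anyone who wants to reproduce the comparison without a dashboard: Ceph can print the per-OSD latencies directly (a rough sketch; the exact output columns vary between releases).

```
# Per-OSD commit/apply latencies, the same numbers the dashboard shows:
ceph osd perf
# Map the slow OSD IDs back to their hosts:
ceph osd tree
```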

The two hosts with the much slower apply and commit latency used the JBOD mode of the controller; the faster one used single-disk RAID 0 volumes (with a write-back cache) to provide the disks to Ceph.
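
To check which mode a controller is actually running in, something like the following works on LSI/Broadcom controllers (an assumption on my part, as is /c0 being the right controller number):

```
storcli /c0 show             # controller summary, including JBOD capability/state
storcli /c0/vall show        # virtual drives: single-disk RAID 0 volumes show up here
storcli /c0/eall/sall show   # physical drives, incl. whether they are exposed as JBOD
```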

After that I searched for an answer to the, apparently religious, question of whether a single-disk RAID 0 or the controller's JBOD mode is recommended underneath a Ceph cluster's OSDs.

Finding no clear answer, I decided to go by the official Red Hat docs, as they seem to be the best source of truth for Ceph and are a lot better than the Ceph project's own docs. Well... maybe a result of Red Hat buying Ceph a while ago.

Now, the headline for this topic in the docs is "Avoid RAID", which, as usual, isn't a great match for the recommendation they actually give in the paragraph:

"Red Hat recommends that each hard drive be exported separately from the RAID controller as a single volume with write-back caching enabled."

So they do indeed recommend RAID, but only in the one-RAID-0-per-disk configuration, on a controller with a battery-backed write-back cache.
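
On a storcli-managed controller that layout would look roughly like this (a sketch, not a recipe: controller /c0, enclosure 252 and slots 0-11 are made-up values, check `storcli /c0/eall/sall show` for the real ones):

```
# One RAID 0 virtual drive per physical disk, with write-back caching:
for slot in $(seq 0 11); do
    storcli /c0 add vd type=raid0 drives=252:$slot wb
done
```

Nice detail: with the plain wb policy the controller should fall back to write-through when the battery fails, which is exactly why the recommendation comes with the battery-backed caveat.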

JBOD, the other option, they recommend only for a much more specific use case: "Using Just a Bunch of Drives (JBOD) in independent drive mode with Ceph is supported when using all Solid State Drives (SSDs), or for configurations with high numbers of drives per controller, for example, 60 drives attached to one controller."

What fun! :)


The only open question is: why is the read/write operation latency so much better? Do they measure another part of the operation than apply+commit, or...?
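
(For anyone who wants to dig into that: the OSDs expose separate perf counters for whole ops and for the journal/filestore path, so the two latencies plausibly measure different stages. A sketch; counter names vary by release:)

```
# On an OSD host, dump that daemon's perf counters:
ceph daemon osd.0 perf dump
# osd.op_r_latency / osd.op_w_latency cover the full read/write op as seen
# by the OSD, while `ceph osd perf` only reports the commit/apply side.
```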

@leah RAID 0 can simply distribute across more disks. Two disks with half the writes each are faster than one disk handling all of them. Or does every controller only have one disk?

@Toasterson Nope, you're getting something wrong here, I think. The controller has multiple disks, but every single disk sits in its own RAID 0; there are never two disks in one RAID 0. That would be a problematic configuration, because it would duplicate functions Ceph already handles itself. Oh, and the question doesn't have much in common with the problem described above.

@leah Ah, an interesting combination then. A firmware difference, maybe? But thanks for clarifying.

@Toasterson I don't get the point of your question, sorry.

@leah No need. I understand your thread now. Curious to see how this turns out.

@leah Maybe the RAID 0 setting enables some caching on the controller, so you'd be measuring your cache instead of the disk. Just a guess.

@wasserpanther That wouldn't explain it, because two of the three hosts have no caching enabled.
