#mdadm


today in #fedora qa:
* investigated a funky bug #openqa found in mutter 48.0, filed an issue with my findings: gitlab.gnome.org/GNOME/mutter/
* deep dive into a blocker bug where reuse of existing RAID devices during install doesn't work - bugzilla.redhat.com/show_bug.c . traced it to a questionable check added to #mdadm 4.3, sent a patch: github.com/md-raid-utilities/m
* a bit more cleanup on openqa needles
* wiped a bunch of old updates(-testing) composes to save space on the disk they live on

Linked issue: GNOME / mutter #3991, "Desktop sometimes stops responding/updating after switch from VT to desktop (qemu/virtio VM) (48.0)". The gnome-shell / mutter 48.0 final update for Fedora failed openQA testing due to an interesting input/state update(?) bug.

Ever accidentally blow away your /boot partition?

dpkg -S /boot

will list what packages were written there. Then you can run 'apt-get --reinstall install <pkgs>' to fix it... This is Debian; pacman, apk, emerge, etc. probably have similar options.
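Putting the two together, a minimal sketch of the recovery (assuming a Debian-ish system; the pipeline and package-name handling are my own, so review the list before reinstalling anything):

# Rough sketch (Debian): reinstall every package that shipped files into /boot
dpkg -S /boot/ 2>/dev/null | cut -d: -f1 | tr ',' '\n' | sort -u \
    | xargs apt-get install --reinstall -y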

So, why do I mention this?

Was replacing a failed drive (md RAID 1 root) in my garage "Artoo" unit and was duping the partitions from the working drive to the new SSD.

I guess I dd'ed the new empty /boot to the old working /boot. I was aiming for the UEFI partition...
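For the record, the intended workflow looks roughly like this (a sketch only; device names are made up, and the dd direction is exactly the part that bit me):

# Sketch only: /dev/sda = working drive, /dev/sdb = new SSD, /dev/md0 = the RAID 1 root (adjust!)
sfdisk -d /dev/sda | sfdisk /dev/sdb                 # copy the partition table to the new drive
mdadm --manage /dev/md0 --add /dev/sdb3              # let RAID 1 resync the root member
dd if=/dev/sda1 of=/dev/sdb1 bs=1M status=progress   # clone the ESP (and /boot, if separate): old -> new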

Big dummy! LOL

Oh well. No one died, move along, nothing to see here. System is back up again now and in sleep mode for the rest of the night.

#Linux #Debian #mdadm

Is there any know-how about creating filesystems on top of RAID 1? Both XFS and ext4 top out at 35 MB/s for me even though the block device below them is capable of 135 MB/s (yes, a 4x difference!). This is even with absurdly large I/O request sizes like 512 KB.

The disks are old-ish SATA drives with 512-byte sectors. XFS seems to use 512 B sectors while ext4 picked 4 KB. They're connected via a USB DAS, with dm-integrity on top of each disk; the disks are then combined into an mdadm RAID 1 and encrypted with LUKS.
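For anyone comparing notes, this is roughly how to check the sizes involved (paths below are placeholders, not my actual device names):

# Placeholders: sdX = one of the disks, /dev/mapper/cryptdata = opened LUKS device, /mnt/data = mount point
cat /sys/block/sdX/queue/logical_block_size /sys/block/sdX/queue/physical_block_size
xfs_info /mnt/data | grep -i sectsz                    # sector size XFS chose
tune2fs -l /dev/mapper/cryptdata | grep 'Block size'   # block size ext4 chose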

I confirmed with fio that the slowdown happens at the filesystem level, not in LUKS or anywhere below it.
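Roughly the kind of comparison I mean (a sketch; job parameters and paths are illustrative, not my exact jobs):

# Sequential reads with 512 KB requests: once through a file on the filesystem, once against the LUKS device
fio --name=fs  --directory=/mnt/data --rw=read --bs=512k --size=2G --direct=1
fio --name=raw --filename=/dev/mapper/cryptdata --rw=read --bs=512k --size=2G --direct=1 --readonly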

The only advice I found online is about alignment, but with 512KB requests it shouldn't be an issue because that's way larger than any of the block/sector sizes involved. I must be missing something, but what is it?

#mdadm #raid #ext4

Huh, so apparently adding just two simple #udev lines makes my #mdadm #RAID array seem much more responsive (at least, Jellyfin feels way more responsive). One increases the size of the stripe cache, and the other...I'm honestly not sure?

I'm not one to recommend adding random advice off the Internet to your system config, but if you're using RAID 5 or RAID 6 and wondering about performance optimizations, maybe check them out.
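For the curious, RAID-tuning udev rules of this sort usually look something like the following (a sketch of the common pattern, not a copy of the linked commit; the values are assumptions you should tune for your array):

# Typical pattern: bump the RAID 5/6 stripe cache and the read-ahead when an md device appears
SUBSYSTEM=="block", KERNEL=="md*", ACTION=="add|change", TEST=="md/stripe_cache_size", ATTR{md/stripe_cache_size}="8192"
SUBSYSTEM=="block", KERNEL=="md*", ACTION=="add|change", TEST=="queue/read_ahead_kb", ATTR{queue/read_ahead_kb}="65536"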

#Linux

github.com/8bitbuddhist/nix-co

GitHub: "Modules: set RAID optimizations" · 8bitbuddhist/nix-configuration@9da3430 (NixOS configuration files; mirror of https://code.8bitbuddhism.com/aires/nix-configuration)

As it's the 1st Sunday of the month and #mdadm's periodic check is running, here's a short plug for all those who want the #SoftwareRAID state on their #LinuxDesktop monitored without installing a fully fledged #monitoring system like Icinga or Xymon:

I wrote a small helper for the #systemtray that shows an icon with a green, yellow or red state indicator and sends notifications when the state changes.

github.com/xtaran/systray-mdst
packages.debian.org/systray-md
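The underlying idea boils down to watching /proc/mdstat, roughly like this (a simplified sketch, not the actual systray-mdstat code):

# Degraded arrays show up as an underscore inside the [UU...] status field of /proc/mdstat
if grep -q '\[.*_.*\]' /proc/mdstat; then
    notify-send -u critical "mdadm" "RAID array degraded!"
fi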

Replied in thread

@Gentoo_eV Provided I get a KVM console in time, I will demonstrate my installation guide (gentoo.duxsco.de/) in English using a #Hetzner dedicated server.

  • What? Beyond Secure Boot – Measured Boot on Gentoo Linux?
  • When? Saturday, 2024-10-19 at 18:00 UTC (20:00 CEST)
  • Where? Video call via BigBlueButton: bbb.gentoo-ev.org/

The final setup will feature:

  • #SecureBoot: All EFI binaries and unified kernel images are signed.
  • #MeasuredBoot: #clevis and #tang will be used to check the system for tampering via #TPM 2.0 PCRs and for remote LUKS unlocking (no tty needed); see the sketch after this list.
  • Fully encrypted: Except for ESPs, all partitions are #LUKS encrypted.
  • #RAID: Except for ESPs, #btrfs and #mdadm based #RAID are used for all partitions.
  • Rescue System: A customised #SystemRescue (system-rescue.org/) supports SSH logins and provides a convenient chroot.sh script.
  • Hardened #Gentoo #Linux for a highly secure, high stability production environment.
  • If enough time is left at the end, #SELinux, which provides Mandatory Access Control using type enforcement and role-based access control.
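To give an idea of the clevis part mentioned above: binding an existing LUKS2 device to a tang server and to TPM 2.0 PCRs looks roughly like this (device path, tang URL and PCR selection are placeholders, not the values from the guide):

# Placeholders only; consult gentoo.duxsco.de/ for the real setup
clevis luks bind -d /dev/nvme0n1p3 tang '{"url": "http://tang.example.org"}'
clevis luks bind -d /dev/nvme0n1p3 tpm2 '{"pcr_bank": "sha256", "pcr_ids": "0,2,4,7"}'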

Today is shaping up to be some kind of hell.

Looks like my RAID is falling apart.

While I was asleep today, a notification came in that one disk had dropped out of the array; by the time I crawled over to the computer, two were already missing.

The disks weren't marked as failed, they simply vanished from the array, while still being present in the system and showing up in mdadm --examine as part of the RAID, even with an "active" status. At the same time, mdadm --detail was two disks short, as if they simply didn't exist.

In the end, one came back on its own after I physically pulled both of them and plugged them back in.

The second one came back after an mdadm <array> --add. The whole thing looked as if mdadm were blind and I had to shove its nose right into the disk before it went: "Oh, there it is!"

Except that on the next reboot the filesystem sitting on top of LUKS on this array wouldn't mount, failing with "fsconfig system call failed: Structure needs cleaning", and e2fsck found a whole damn pile of problems (so far only in no-changes mode).

I've left mdadm running an integrity check of the array overnight, with a resync, and...
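(For reference, kicking off such a check by hand and watching it looks roughly like this; md0 is a placeholder for the array name:)

# md0 is a placeholder; "check" only reads and compares, "repair" would also rewrite mismatches
echo check > /sys/block/md0/md/sync_action
cat /proc/mdstat                              # watch the progress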

Ordered the disks I had been planning to upgrade to at the very end of the year.

In short, the astrologers have declared a week of adventures, tambourine dancing, unplanned expenses and the potential loss of 30 terabytes of data 🎉

Feel free to place your bets on the outcome.

#hardware #server #soft
Continued thread

Just done an `mdadm --examine` on the drives.

sda1, sdb1 and sdc1 all have Last Update times of Thu Mar 28 15:59. sdd1's Last Update time is Wed Mar 27 05:32. The only writes during that 34-hour gap would have been from Syncthing, so I have all that elsewhere.
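(For anyone following along, the relevant fields can be pulled out like this; device names as above:)

# Compare update times and event counters across the members
mdadm --examine /dev/sd[abcd]1 | grep -E 'Update Time|Events|Array State|Device'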

However, sdc1 has a "bad blocks present" error.

This looks like a classic case of a drive dropping off an array and exposing corruption elsewhere.

Can I re-add sdd1 to the array read-only, without completely breaking it?
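One approach that gets suggested for this situation looks roughly like the following (a sketch only; /dev/md0 is a placeholder, and --force can paper over real damage, so treat it as a thought experiment rather than a recipe):

# Stop the partially-assembled array, then reassemble read-only so nothing gets written
mdadm --stop /dev/md0
mdadm --assemble --readonly --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1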

Continued thread

So, running `mdadm --assemble` with all the drives, this is what I get.

When I first thought there was an issue last week, I tried re-adding sdc to the array. When it failed, I shut the server down. It now says it's a spare, which is worrying.

sdd won't re-add because it's too old (in dmesg: "kicking non-fresh sdd1 from array!").

Bear in mind that I don't need any recent writes, so I don't really care about corruption on any data from the past, say, 3 months.

Any ideas, people?

I’ve rebooted a server and it went into emergency boot mode because a #Linux software #RAID name changed from md0 to md127.

Took me an hour to figure out the cause and the solution. The server's hostname is no longer assigned by the new DHCP server, so #mdadm now sees the RAID as a foreign one. Updated fstab and now it's all good again. Apparently you're not supposed to refer to your software RAIDs as /dev/md? anyway…
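The usual way to make the naming robust is to pin the arrays in mdadm.conf and refer to them by UUID in fstab, roughly like this (a sketch with Debian-style paths; adjust for your distro):

# Record the arrays with their UUIDs so assembly no longer depends on the homehost
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u                     # rebuild the initramfs so early boot sees the new config
blkid /dev/md127                        # then use UUID=... instead of /dev/md0 in /etc/fstab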

How to be a #UNIX expert and earn >$1M/yr!

- Redirect both stderr AND stdout in tcsh
- Copy a #ssh key that works the 1st time
- Update a #OSX #homebrew #LAMP stack
- Read all the #kernel sources (especially pci.c)
- Replace a failed drive with #mdadm
- Self host a #mastodon server
- Install #Gentoo once
- Read about #Kubernetes decide to just not
- Get the wifi working on #ubuntu
- Reset a parent’s password over the phone
- Switch from #systemd to /etc/init.d scripts
- Work out how to exit #vi

Replied in thread

@pelzi I've seen it quite a few times that #btrfs complains much more loudly when something is wrong with the block device.
I was always able to confirm it afterwards with #smart values and tests.
Occasionally I could trace it back to dropouts caused by the cable or something else, which something like #mdadm would just silently tolerate while btrfs always logged them.