Uniquely malformed MBR containing NTFS PBS causes udev spam and constant reads #1207

Open
jarrodsfarrell opened this issue Oct 21, 2023 · 7 comments

Comments

@jarrodsfarrell

jarrodsfarrell commented Oct 21, 2023

This issue has no samples to reproduce it, as they were inadvertently destroyed. If you arrived here from a search and are seeing the same behavior, please comment here and supply samples before modifying any partitions, since doing so may destroy the conditions that trigger this bug.


/dev/sdb1 used to be an NTFS filesystem holding games, back when Windows was the dominant OS on this desktop. It later became a BTRFS filesystem by lazily pointing mkfs.btrfs at it. Importantly, the games were migrated over. It has been set up this way for almost a year.

Around this month I noticed my HDD light was constantly illuminated, and checking iotop I saw udisksd writing to the disk at a constant 6-7 M/s. Checking udevadm monitor showed a lot of UDEV and KERNEL change events.

UDEV  [106382.274587] change   /devices/pci0000:00/0000:00:01.2/0000:01:00.1/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb1 (block)
UDEV  [106382.279426] change   /devices/pci0000:00/0000:00:01.2/0000:01:00.1/ata2/host1/target1:0:0/1:0:0:0/block/sdb (block)
KERNEL[106382.281577] change   /devices/pci0000:00/0000:00:01.2/0000:01:00.1/ata2/host1/target1:0:0/1:0:0:0/block/sdb (block)
KERNEL[106382.282234] change   /devices/pci0000:00/0000:00:01.2/0000:01:00.1/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb1 (block)
UDEV  [106382.287861] change   /devices/pci0000:00/0000:00:01.2/0000:01:00.1/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb1 (block)
UDEV  [106382.293801] change   /devices/pci0000:00/0000:00:01.2/0000:01:00.1/ata2/host1/target1:0:0/1:0:0:0/block/sdb (block)
UDEV  [106382.301564] change   /devices/pci0000:00/0000:00:01.2/0000:01:00.1/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb1 (block)
[and so on...]
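
To put a number on a storm like the one above, the monitor output can be tallied per device. The following is a minimal sketch, assuming the output was saved to a text file and that every event line has the `SOURCE[timestamp] action devpath (subsystem)` shape shown in the excerpt. The name `summarize_events` is hypothetical and not part of udev or udisks.

```python
import re
from collections import Counter

# Matches lines such as:
#   UDEV  [106382.274587] change   /devices/.../block/sdb/sdb1 (block)
#   KERNEL[106382.281577] change   /devices/.../block/sdb (block)
LINE_RE = re.compile(
    r"^(?P<source>KERNEL|UDEV)\s*\[(?P<ts>\d+\.\d+)\]\s+"
    r"(?P<action>\w+)\s+(?P<devpath>\S+)\s+\((?P<subsystem>[^)]+)\)$"
)


def summarize_events(lines):
    """Count events per (source, action, device name) and return the time span.

    Lines that do not look like event lines (banners, blanks) are skipped.
    The device name is the last component of the devpath, e.g. "sdb1".
    """
    counts = Counter()
    first_ts = None
    last_ts = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        ts = float(match.group("ts"))
        first_ts = ts if first_ts is None else min(first_ts, ts)
        last_ts = ts if last_ts is None else max(last_ts, ts)
        device = match.group("devpath").rsplit("/", 1)[-1]
        counts[(match.group("source"), match.group("action"), device)] += 1
    span = 0.0 if first_ts is None else last_ts - first_ts
    return counts, span


# Hypothetical usage with a saved log:
#   with open("udev-monitor.log") as fh:
#       counts, span = summarize_events(fh)
#   for key, n in counts.most_common():
#       print(key, n)
```

Dividing the counts by the returned span gives a rough events-per-second figure that can be compared between a misbehaving disk and a healthy one.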

I tried some other debugging steps I found online, but the one that ended up showing something unusual was strace. Scanning through the log, the string \353R\220NTFS came up frequently in read calls. Searching for it brought me to an unrelated post about someone having issues with NTFS, and looking up the NTFS on-disk structure led me to its Partition Boot Sector (PBS). This is when I discovered this disk was still MBR. Using gdisk to discard the MBR and recreate the partition table resolved the issue: the disk is no longer constantly read.
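
The strace string is the octal-escaped form of the bytes EB 52 90 followed by "NTFS", which is how an NTFS boot sector conventionally begins (a jump instruction, then the OEM ID). The following is a minimal sketch of checking a saved copy of sector 0 for those markers. The function name `describe_sector0` is hypothetical, and the check only looks at the conventional signature bytes; it does not validate the rest of the boot sector.

```python
# The prefix as strace printed it (octal escapes) and the same bytes in hex.
STRACE_PREFIX = b"\353R\220NTFS"
assert STRACE_PREFIX == bytes([0xEB, 0x52, 0x90]) + b"NTFS"


def describe_sector0(sector: bytes) -> dict:
    """Report which well-known markers are present in a 512-byte sector 0.

    ntfs_jump:      bytes 0-2 are the jump instruction EB 52 90
    ntfs_oem_id:    bytes 3-10 are the OEM ID "NTFS" padded with four spaces
    boot_signature: bytes 510-511 are 55 AA (shared by MBRs and boot sectors)
    """
    if len(sector) < 512:
        raise ValueError("need at least the first 512 bytes of the device")
    return {
        "ntfs_jump": sector[0:3] == b"\xeb\x52\x90",
        "ntfs_oem_id": sector[3:11] == b"NTFS    ",
        "boot_signature": sector[510:512] == b"\x55\xaa",
    }
```

Because an MBR and an NTFS boot sector both end in 55 AA, the boot signature alone cannot tell them apart; the first eleven bytes are what give away leftover NTFS data.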

Before discarding the MBR, I made a backup of the first 2048 bytes, sdb.2048-bytes.bin.gz. Running parted against the disk before fixing it produced an interesting result, included for the absurdity.

[nix-shell:/dev]# parted /dev/sdb print
Model: ATA Samsung SSD 860 (scsi)
Disk /dev/sdb: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags: 

Number  Start  End     Size    File system  Flags
 1      0.00B  1000GB  1000GB  ntfs

Reminder: /dev/sdb1 is actually a btrfs file system.
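
A backup like the 2048-byte one mentioned above can be inspected offline to see what partition table, if any, sector 0 holds. The following is a minimal sketch, assuming the classic MBR layout: four 16-byte entries starting at byte offset 446, each with a type byte, a little-endian starting LBA and a little-endian sector count, followed by the 55 AA signature. The name `parse_mbr_entries` is hypothetical, and nothing here claims what the actual backup contains.

```python
import struct

PARTITION_TABLE_OFFSET = 446  # 0x1BE
ENTRY_SIZE = 16
SECTOR_SIZE = 512  # logical sector size reported for this disk


def parse_mbr_entries(sector: bytes):
    """Return (has_boot_signature, entries) for a classic MBR sector.

    Each entry is a dict with the slot number, boot flag, partition type,
    starting LBA, and sector count. Slots whose type byte is 0 are skipped.
    """
    if len(sector) < 512:
        raise ValueError("need at least the first 512 bytes of the device")
    has_signature = sector[510:512] == b"\x55\xaa"
    entries = []
    for slot in range(4):
        start = PARTITION_TABLE_OFFSET + slot * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # status (1), CHS start (3, skipped), type (1), CHS end (3, skipped),
        # first LBA (4, little-endian), sector count (4, little-endian)
        status, ptype, lba_start, num_sectors = struct.unpack("<B3xB3xII", raw)
        if ptype == 0:
            continue
        entries.append({
            "slot": slot + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "num_sectors": num_sectors,
            "size_bytes": num_sectors * SECTOR_SIZE,
        })
    return has_signature, entries


# Hypothetical usage against the gzipped backup:
#   import gzip
#   with gzip.open("sdb.2048-bytes.bin.gz", "rb") as fh:
#       print(parse_mbr_entries(fh.read(512)))
```

If sector 0 were a pure NTFS boot sector, these 64 bytes would be boot code rather than a table, so implausible entries from this parser would themselves be a useful hint.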

Extra

/etc/fstab declaration:

/dev/disk/by-uuid/XXXX /media/Games btrfs noatime,compress=zstd 0 0

Disk information:

Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 860 EVO 1TB
Serial Number:    ----
LU WWN Device Id: 5 002538 ec0bdad6b
Firmware Version: RVT04B6Q
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available, deterministic, zeroed
Device is:        In smartctl database 7.3/5387
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Oct 21 15:33:53 2023 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

/etc/os-release:

BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="23.11pre536534.ca012a02bf83"
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 23.11 (Tapir)"
SUPPORT_URL="https://nixos.org/community.html"
VERSION="23.11 (Tapir)"
VERSION_CODENAME=tapir
VERSION_ID="23.11"
@jarrodsfarrell
Author

strace.log.gz

@tbzatek
Member

tbzatek commented Oct 22, 2023

Please provide udevadm info for /dev/sdb and /dev/sdb1 (with the broken MBR). Initial probing is done by udev, and udisks only consumes most of that info. Is there anything related in dmesg? Anything printed to stdout or stderr by udisksd?

@jarrodsfarrell
Author

jarrodsfarrell commented Oct 22, 2023

@tbzatek As mentioned, I did make a backup of the first 2048 bytes, so for posterity I wrote it to a small file attached as a loop device and could not recreate the issue that way. I also tried an old USB flash drive I had lying around, with the same result.

The issue is not happening right now, even after I returned to the previous setup as closely as I could. These are the actions I took:

  • Unmounted the active partition.
  • Backed up the working GPT (dd if=/dev/sdb of=./working-gpt.bin bs=2048 count=1).
  • Overwrote the start of the disk with the broken MBR (dd if=./broken-mbr.bin of=/dev/sdb bs=2048 count=1).
  • Mounted the partition.
  • udevadm monitor was silent, so I rebooted.
  • Returned some BIOS settings I had changed to their previous values:
    • SATA hotplugging on the port the SSD is attached to.
      • The port is in a front-mounted bay; I had been lazy about configuring it correctly until now.
    • A PCI bus speed (Auto back to 3G).
  • I recalled that I had also detached a separate internal disk (where Windows is installed) and moved it elsewhere because it was not being detected. I returned it to its original place. Still nothing.
    • The disk is otherwise OK, but I guess the SATA power cable is damaged, and a reseat did not change that. Fun.
  • Disabled SATA hotplugging on a separate SATA port that had also been changed.

For now I'm going to leave the system in that setup and see whether the issue crops up over time, since the system works fine even with a broken MBR. I'll report back if it starts happening again.

@jarrodsfarrell
Author

On second thought, the udevadm info might still be useful, so here it is as requested in case there's anything of interest.

/dev/sdb:

P: /devices/pci0000:00/0000:00:01.2/0000:01:00.1/ata2/host1/target1:0:0/1:0:0:0/block/sdb
M: sdb
U: block
T: disk
D: b 8:16
N: sdb
L: 0
S: disk/by-id/ata-Samsung_SSD_860_EVO_1TB_S599NZFNB22849E
S: disk/by-path/pci-0000:01:00.1-ata-2
S: disk/by-path/pci-0000:01:00.1-ata-2.0
S: disk/by-diskseq/3
S: disk/by-id/wwn-0x5002538ec0bdad6b
Q: 3
E: DEVPATH=/devices/pci0000:00/0000:00:01.2/0000:01:00.1/ata2/host1/target1:0:0/1:0:0:0/block/sdb
E: DEVNAME=/dev/sdb
E: DEVTYPE=disk
E: DISKSEQ=3
E: MAJOR=8
E: MINOR=16
E: SUBSYSTEM=block
E: USEC_INITIALIZED=3471770
E: PATH=/nix/store/yjisihkg87ycnpj5db42s4z9xlaxrqy0-udev-path/bin:/nix/store/yjisihkg87ycnpj5db42s4z9xlaxrqy0-udev-path/sbin
E: ID_ATA=1
E: ID_TYPE=disk
E: ID_BUS=ata
E: ID_MODEL=Samsung_SSD_860_EVO_1TB
E: ID_MODEL_ENC=Samsung\x20SSD\x20860\x20EVO\x201TB\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20
E: ID_REVISION=RVT04B6Q
E: ID_SERIAL=Samsung_SSD_860_EVO_1TB_S599NZFNB22849E
E: ID_SERIAL_SHORT=S599NZFNB22849E
E: ID_ATA_WRITE_CACHE=1
E: ID_ATA_WRITE_CACHE_ENABLED=1
E: ID_ATA_FEATURE_SET_HPA=1
E: ID_ATA_FEATURE_SET_HPA_ENABLED=1
E: ID_ATA_FEATURE_SET_PM=1
E: ID_ATA_FEATURE_SET_PM_ENABLED=1
E: ID_ATA_FEATURE_SET_SECURITY=1
E: ID_ATA_FEATURE_SET_SECURITY_ENABLED=0
E: ID_ATA_FEATURE_SET_SECURITY_ERASE_UNIT_MIN=4
E: ID_ATA_FEATURE_SET_SECURITY_ENHANCED_ERASE_UNIT_MIN=8
E: ID_ATA_FEATURE_SET_SECURITY_FROZEN=1
E: ID_ATA_FEATURE_SET_SMART=1
E: ID_ATA_FEATURE_SET_SMART_ENABLED=1
E: ID_ATA_DOWNLOAD_MICROCODE=1
E: ID_ATA_SATA=1
E: ID_ATA_SATA_SIGNAL_RATE_GEN2=1
E: ID_ATA_SATA_SIGNAL_RATE_GEN1=1
E: ID_ATA_ROTATION_RATE_RPM=0
E: ID_WWN=0x5002538ec0bdad6b
E: ID_WWN_WITH_EXTENSION=0x5002538ec0bdad6b
E: ID_PATH=pci-0000:01:00.1-ata-2.0
E: ID_PATH_TAG=pci-0000_01_00_1-ata-2_0
E: ID_PATH_ATA_COMPAT=pci-0000:01:00.1-ata-2
E: ID_PART_TABLE_UUID=22a22938
E: ID_PART_TABLE_TYPE=dos
E: DEVLINKS=/dev/disk/by-id/ata-Samsung_SSD_860_EVO_1TB_S599NZFNB22849E /dev/disk/by-path/pci-0000:01:00.1-ata-2 /dev/disk/by-path/pci-0000:01:00.1-ata-2.0 /dev/disk/by-diskseq/3 /dev/disk/by-id/wwn-0x5002538ec0bdad6b
E: TAGS=:systemd:
E: CURRENT_TAGS=:systemd:

/dev/sdb1:

P: /devices/pci0000:00/0000:00:01.2/0000:01:00.1/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb1
M: sdb1
R: 1
U: block
T: partition
D: b 8:17
N: sdb1
L: 0
S: disk/by-diskseq/3-part1
S: disk/by-id/ata-Samsung_SSD_860_EVO_1TB_S599NZFNB22849E-part1
S: disk/by-id/wwn-0x5002538ec0bdad6b-part1
S: disk/by-path/pci-0000:01:00.1-ata-2-part1
S: disk/by-partuuid/22a22938-01
S: disk/by-uuid/9fa48288-ca5a-4300-a0cc-283f5d36265a
S: disk/by-path/pci-0000:01:00.1-ata-2.0-part1
S: disk/by-label/Games
Q: 3
E: DEVPATH=/devices/pci0000:00/0000:00:01.2/0000:01:00.1/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb1
E: DEVNAME=/dev/sdb1
E: DEVTYPE=partition
E: DISKSEQ=3
E: PARTN=1
E: MAJOR=8
E: MINOR=17
E: SUBSYSTEM=block
E: USEC_INITIALIZED=3471792
E: PATH=/nix/store/yjisihkg87ycnpj5db42s4z9xlaxrqy0-udev-path/bin:/nix/store/yjisihkg87ycnpj5db42s4z9xlaxrqy0-udev-path/sbin
E: ID_ATA=1
E: ID_TYPE=disk
E: ID_BUS=ata
E: ID_MODEL=Samsung_SSD_860_EVO_1TB
E: ID_MODEL_ENC=Samsung\x20SSD\x20860\x20EVO\x201TB\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20
E: ID_REVISION=RVT04B6Q
E: ID_SERIAL=Samsung_SSD_860_EVO_1TB_S599NZFNB22849E
E: ID_SERIAL_SHORT=S599NZFNB22849E
E: ID_ATA_WRITE_CACHE=1
E: ID_ATA_WRITE_CACHE_ENABLED=1
E: ID_ATA_FEATURE_SET_HPA=1
E: ID_ATA_FEATURE_SET_HPA_ENABLED=1
E: ID_ATA_FEATURE_SET_PM=1
E: ID_ATA_FEATURE_SET_PM_ENABLED=1
E: ID_ATA_FEATURE_SET_SECURITY=1
E: ID_ATA_FEATURE_SET_SECURITY_ENABLED=0
E: ID_ATA_FEATURE_SET_SECURITY_ERASE_UNIT_MIN=4
E: ID_ATA_FEATURE_SET_SECURITY_ENHANCED_ERASE_UNIT_MIN=8
E: ID_ATA_FEATURE_SET_SECURITY_FROZEN=1
E: ID_ATA_FEATURE_SET_SMART=1
E: ID_ATA_FEATURE_SET_SMART_ENABLED=1
E: ID_ATA_DOWNLOAD_MICROCODE=1
E: ID_ATA_SATA=1
E: ID_ATA_SATA_SIGNAL_RATE_GEN2=1
E: ID_ATA_SATA_SIGNAL_RATE_GEN1=1
E: ID_ATA_ROTATION_RATE_RPM=0
E: ID_WWN=0x5002538ec0bdad6b
E: ID_WWN_WITH_EXTENSION=0x5002538ec0bdad6b
E: ID_PATH=pci-0000:01:00.1-ata-2.0
E: ID_PATH_TAG=pci-0000_01_00_1-ata-2_0
E: ID_PATH_ATA_COMPAT=pci-0000:01:00.1-ata-2
E: ID_PART_TABLE_UUID=22a22938
E: ID_PART_TABLE_TYPE=dos
E: ID_FS_LABEL=Games
E: ID_FS_LABEL_ENC=Games
E: ID_FS_UUID=9fa48288-ca5a-4300-a0cc-283f5d36265a
E: ID_FS_UUID_ENC=9fa48288-ca5a-4300-a0cc-283f5d36265a
E: ID_FS_UUID_SUB=652facac-6256-40f7-bd03-cbc1dd020066
E: ID_FS_UUID_SUB_ENC=652facac-6256-40f7-bd03-cbc1dd020066
E: ID_FS_BLOCKSIZE=4096
E: ID_FS_LASTBLOCK=244189696
E: ID_FS_SIZE=1000200994816
E: ID_FS_TYPE=btrfs
E: ID_FS_USAGE=filesystem
E: ID_PART_ENTRY_SCHEME=dos
E: ID_PART_ENTRY_UUID=22a22938-01
E: ID_PART_ENTRY_TYPE=0x83
E: ID_PART_ENTRY_NUMBER=1
E: ID_PART_ENTRY_OFFSET=2048
E: ID_PART_ENTRY_SIZE=1953517568
E: ID_PART_ENTRY_DISK=8:16
E: ID_BTRFS_READY=1
E: DEVLINKS=/dev/disk/by-diskseq/3-part1 /dev/disk/by-id/ata-Samsung_SSD_860_EVO_1TB_S599NZFNB22849E-part1 /dev/disk/by-id/wwn-0x5002538ec0bdad6b-part1 /dev/disk/by-path/pci-0000:01:00.1-ata-2-part1 /dev/disk/by-partuuid/22a22938-01 /dev/disk/by-uuid/9fa48288-ca5a-4300-a0cc-283f5d36265a /dev/disk/by-path/pci-0000:01:00.1-ata-2.0-part1 /dev/disk/by-label/Games
E: TAGS=:systemd:
E: CURRENT_TAGS=:systemd:

@tbzatek
Member

tbzatek commented Oct 23, 2023

Thanks, the udevadm dumps look fine (e.g. the ID_FS_TYPE=btrfs). FYI, GPT is located at both the start and the end of the block device, so simply backing up the leading 2 kB doesn't work. The MBR partition table may have contained a protective partition (even with bogus boundaries, just to indicate there's something on the disk). This may have been messed up in a number of different ways, so having a working reproducer is crucial.
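
To illustrate why the leading 2 kB is not enough, here is a minimal sketch that computes where the GPT structures would sit on a disk of the reported capacity. It assumes the common default layout (128 entries of 128 bytes each, primary entry array starting at LBA 2, backup header in the last sector); a real GPT header records the actual locations, so these numbers are only the typical case. The function names are hypothetical.

```python
def gpt_regions(disk_bytes, sector_size=512, num_entries=128, entry_size=128):
    """Return inclusive (first_lba, last_lba) ranges for a default GPT layout."""
    total_sectors = disk_bytes // sector_size
    # Sectors needed for the partition entry array, rounded up.
    array_sectors = (num_entries * entry_size + sector_size - 1) // sector_size
    last_lba = total_sectors - 1
    return {
        "protective_mbr": (0, 0),
        "primary_header": (1, 1),
        "primary_entries": (2, 1 + array_sectors),
        "backup_entries": (last_lba - array_sectors, last_lba - 1),
        "backup_header": (last_lba, last_lba),
    }


def lbas_in_backup(backup_bytes, sector_size=512):
    """Inclusive LBA range covered by a backup of the first backup_bytes bytes."""
    return (0, backup_bytes // sector_size - 1)


# The disk in this issue reports 1,000,204,886,016 bytes with 512-byte sectors.
regions = gpt_regions(1_000_204_886_016)
covered = lbas_in_backup(2048)
for name, (first, last) in regions.items():
    fully_saved = covered[0] <= first and last <= covered[1]
    print(f"{name:16} LBA {first}..{last}  fully in 2 kB backup: {fully_saved}")
```

Under these assumptions a 2048-byte copy spans LBAs 0 to 3, so it captures the protective MBR, the primary header, and only the first two sectors of the entry array, while both backup structures at the end of the disk are missed entirely.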

@jarrodsfarrell
Author

jarrodsfarrell commented Oct 23, 2023

@tbzatek Ah, I didn't know that. Good to know, and something I'll probably have to look into myself. I'll see whether I can instead recreate the events that led to this situation by using Windows to format an NTFS disk and then lazily turning the created partition into BTRFS.

@jarrodsfarrell
Author

So I went into Windows and tried to recreate the issue with a spare SSD, using HxD to look at the results, but none of the approaches available in Disk Management placed the NTFS PBS where the MBR lives. The testing process was mostly: initialize the disk, add NTFS, look at it in HxD, then manually zero out the first handful of sectors (much to Windows' annoyance) before trying something else. In all cases it would create an MBR stub and/or a GPT with nothing unusual; no NTFS PBS.

Back on the Linux side, I also tried purposefully formatting the whole disk as NTFS (mkfs.ntfs --quick --force /dev/sda) and then creating a new MBR with a partition (fdisk /dev/sda), which kept the PBS, but udevadm monitor displayed normal remove/add/change events and fdisk -l correctly reported the disk as dos instead of loop as seen earlier. The same happened when using gdisk to create a GPT partition, which also kept the PBS. I also tried some silly things like creating an MBR on a partition, but nothing wanted to humor partitions within partitions.

Mildly disappointed that I destroyed the only known case of a malformed MBR disk that causes udev spam. Oh well, at least if someone ever encounters a situation like this, they can take some solace that they weren't the only one, and this issue will be here for them to supply their working (broken?) case of udisksd misbehaving.

So at this point it would seem this issue is stalled.

jarrodsfarrell changed the title from "MBR formatted disk formerly containing NTFS causes udisksd to spam udev and constantly reads" to "Uniquely malformed MBR containing NTFS PBS causes udev spam and constant reads" on Oct 25, 2023