Physical block size detection incorrect for AF SAT drives
GreatEmerald opened this issue · 83 comments
Currently the script uses blockdev --getpbsz
to detect the physical block size. However, this is incorrect for Advanced Format drives using SCSI-to-ATA Translation (SAT; includes USB enclosures and such), as detailed here: http://nunix.fr/index.php/linux/7-astuces/65-too-hard-to-be-above-2tb
The correct way to determine it is by using either hdparm -I
or smartctl -d sat -a
. Since the former doesn't need explicit specification that it's a SAT drive, it's probably better to use that, like so:
BLOCK_SIZE=$(sudo hdparm -I /dev/$DEVICE | grep "Physical Sector size" | awk '{print $(NF-1)}')
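For cross-checking, smartctl can show both sizes too (the exact label wording varies between smartmontools versions, so this grep is deliberately loose):

```bash
# Print the logical/physical sector sizes as reported over SAT
sudo smartctl -d sat -a /dev/$DEVICE | grep -i "sector size"
```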
Thanks for the report! This is helpful.
This is reminiscent of issue #5, in which @barak suggested that we move (from lshw
) to blockdev
. I like his argument that blockdev
is an essential package on Ubuntu, so it's pretty much guaranteed to be present (though it's not part of the POSIX standard). Right now, the only other non-POSIX commands used are printf
and xxd
, I believe.
Do you have any ideas on how to solve your problem using a more portable solution than hdparm
? It seems undesirable to make users install the hdparm package even if they don't have an AF SAT drive. A hacky solution would be lazy environmental validation, such that the script doesn't exit until (1) it knows that it needs hdparm
; and (2) it doesn't have hdparm
(vs. failing at script init if it doesn't have hdparm
). Ideas?
Also, philosophical question: Does your defect report better belong against blockdev
instead of format-udf? If blockdev
is reporting an incorrect value, then it seems most proper to pursue a fix there, instead of here.
Honestly, I think the real bug is in the kernel. I think the canonical way to get the block size on Linux should be via:
cat /sys/class/block/$DEVICE/queue/physical_block_size
However, for AF SAT it, too, reports 512. So clearly the kernel itself is confused as well. I'll see if I can open a ticket in the kernel bug tracker for this.
That's great. In the meantime, I'd like to keep this issue open as a placeholder.
Once a kernel ticket has been opened, would you mind adding a link to it in this thread?
Many thanks.
Filed a bug report here: https://bugzilla.kernel.org/show_bug.cgi?id=102271
Just for the record, I ran into the same issue. I guess it's fine to fix the root cause of the error, but a word of caution in the Readme or other prominent location would have saved me a lot of trouble.
FWIW, I looked up the correct sector size using hdparm manually and verified that blockdev indeed reported it wrong, then set BLOCKSIZE to 512 in the script. Worked like a charm!
thanks, all.
fyi, i plan on getting back to addressing this (and other outstanding issues) in the upcoming weeks. just finished graduate school in december. thanks for your patience!
FYI, 107cf1c adds a new option -b BLOCK_SIZE
which allows manually overriding the OS-detected block size. It doesn't fix the root issue here (still waiting on the Linux kernel fix), but it does make things slightly less painful.
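For example, to override detection with a 512-byte block size (device name and label below are placeholders; see the usage text for the exact argument form):

```bash
sudo ./format-udf.sh -b 512 sdb "My UDF Drive"
```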
Also, @ksterker, thanks for the tip: The new usage (and README) highlights the kernel bug.
The kernel devs said that they can't do anything about it, because they'd have to use libusb, and that's not available from the kernel.
So I guess another tool is a way to go in any case. Or maybe someone could write to the linux-usb mailing list and see if they can come up with a solution.
Although, in any case, as per #12 there's a reason to just hardcode it to 512 for Windows (or use logical size?).
The kernel devs said that they can't do anything about it, because they'd have to use libusb, and that's not available from the kernel.
Can you give me a link to this discussion, specifically regarding the need for libusb? This seems like a misunderstanding to me, as libusb is just a userspace library which uses the kernel API for direct access to the USB bus. It exists because only the kernel can access hardware. The kernel can, of course, access USB hardware.
Although, in any case, as per #12 there's a reason to just hardcode it to 512 for Windows (or use logical size?).
No, there is no good reason to hardcode the value to any specific constant.
Anyway, UDF uses the logical block size, not the physical one. So the physical block size is not needed for formatting.
Physical block size is required to enable UDF support on Windows XP. See Pieter's article linked on the readme (you may need the Wayback Machine). The block size calculated by format-udf is used for both the formatting operation and the creation of the false MBR.
UDF 2.60 specification says:
http://www.osta.org/specs/pdf/udf260.pdf
Logical Block Size - The Logical Block Size for a Logical Volume shall be set to the logical sector size of the volume or volume set on which the specific logical volume resides.
Uint32 LogicalBlockSize in struct LogicalVolumeDescriptor - Interpreted as specifying the Logical Block Size for the logical volume identified by this Logical Volume Descriptor. This field shall be set to the largest logical sector size encountered amongst all the partitions on media that constitute the logical volume identified by this Logical Volume Descriptor. Since UDF requires that all Volumes within a Volume Set have the same logical sector size, the Logical Block Size will be the same as the logical sector size of the Volume.
Physical block size is required to enable UDF support on Windows XP.
Ah :-( Is that really true? And not the logical block size?
See Pieter's article linked on the readme (you may need the Wayback Machine).
I do not see a requirement about the physical block size there.
The block size calculated by format-udf is used for both the formatting operation and the creation of the false MBR.
The new version of mkudffs autodetects the block size (via the BLKSSZGET ioctl) if it is not passed via a command-line option.
Sorry, @pali, I don't think I'm explaining myself very well. Let me try again.
There are 3 different block sizes we're discussing here:
- Disk physical block size. This is an artifact of the disk, governed by the disk controller. Not controlled by the user. Reported by blockdev --getpbsz or /sys/block/sda/queue/physical_block_size.
- Disk logical block size. This is an artifact of the disk, governed by the kernel. Not controlled by the user. Reported by blockdev --getss or /sys/block/sda/queue/logical_block_size. Usually 512 bytes.
- File system block size. This is an artifact of the file system (not the underlying disk), specified by the user at file system format time.
All of these 3 values can be different, even for the same disk.
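For the curious, here's a quick way to inspect items 1 and 2 for a given disk on Linux (assuming blockdev and sysfs are available; sdb is a placeholder). Item 3 doesn't exist until you format:

```bash
DEVICE=sdb   # placeholder device name
echo "physical block size (item 1): $(sudo blockdev --getpbsz /dev/$DEVICE) bytes"
echo "logical block size  (item 2): $(sudo blockdev --getss   /dev/$DEVICE) bytes"
# item 3 (file system block size) is whatever you pass to mkudffs -b at format time
```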
format-udf.sh uses UDF v2.01 (not the latest v2.60), for reasons specified in the README. However, both v2.01 and v2.60 use the same term (LogicalBlockSize) to describe item 3 above. It's important to note that the use of the adjective Logical here is with respect to the file system, not the disk. In other words, it's important not to confuse the UDF LogicalBlockSize with the disk's logical block size. Those can be (and often are) different.
The udftools package provides mkudffs, which is the de facto tool for formatting a UDF file system in Linux. The -b,--blocksize option lets you specify item 3 above when formatting a disk. It has no bearing on items 1 or 2. The version of udftools on my current Ubuntu 16.04.1 LTS machine is 1.0.0b3-14.4. That version defaults to a block size of 2048 bytes unless otherwise specified. It's completely possible that newer versions default to something else.
I fully acknowledge that a UDF drive formatted to be fully compliant with the spec has a file system block size (item 3) set to the disk logical block size (item 2). Your quotes above from the spec are accurate (though I'm targeting v2.01).
Be reminded that the goal of format-udf.sh is to output a drive that can be used for reading/writing across multiple operating system families. If a user is interested in a fully spec-compliant format for use on a single OS, then he/she should use the native formatting tool (mkudffs or newfs_udf) and set the file system block size (item 3) to the disk logical block size (item 2). However, this will be insufficient for a cross-platform UDF drive that works on Linux, OS X, and Windows.
From Pieter's article, for Windows XP, "the UDF block size must match the block size of the underlying device". I interpreted this to refer to the disk physical block size (item 1). (I believe this is the point you're contesting, submitting that it should instead be the disk logical block size, which is item 2.) I verified the use and behavior of the disk physical block size (item 1) in my lab over 2 years ago when I first authored format-udf.sh. It's completely possible that I made (and repeated) this mistake. However, with the number of tests that I ran at the time, I find it unlikely. Unfortunately, I have no Windows XP machines in my lab at the moment, so I'm unable to re-validate. Thus, I cannot refute your claim.
However, there is another (more significant) reason that format-udf.sh relies on disk physical block size (item 1) instead of disk logical block size (item 2). UDF v2.01 itself has a limit of 2^32 blocks. If the disk logical block size (item 2, which is usually 512 bytes) is used in formatting and in the partition table, then the maximum disk size supported by format-udf.sh will most often be 2 TiB. This is unacceptable for many modern (larger) drives. Out of convention, many disk manufacturers still respect the 2^32-block limit, which means their only practical way for crafting larger disks is--you guessed it--increasing the disk physical block size (item 1).
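To make the arithmetic explicit (just a sanity check, not part of format-udf):

```bash
# maximum UDF capacity = block size * 2^32 blocks
echo $(( 512  * 2**32 / 1024**4 ))   # 2  (TiB) with 512-byte blocks
echo $(( 4096 * 2**32 / 1024**4 ))   # 16 (TiB) with 4096-byte blocks
```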
Therefore, for format-udf.sh to accomplish its goal of providing a true cross-platform read/write disk (and not one artificially capped at 2 TiB), it must use the (sometimes larger) disk physical block size (item 1), which comes at the cost (as you correctly point out) that it technically isn't fully spec-compliant. format-udf.sh is a tool for pragmatists. Theorists who prefer spec compliance will need to give up dreams of a true cross-platform read/write disk, and stick to the native formatting tools.
Sorry for the verbosity, but I hope this explains my position more clearly.
Understood now.
Anyway, my understanding of the UDF specification is that the block size must be the disk logical block size (2) (if the disk is to be compliant with the UDF spec). Correct or not?
mkudffs from udftools prior to version 1.1 (or 1.0) has a lot of problems and generates an invalid UDF filesystem. I can imagine that this causes problems for other operating systems. So it would be good to use mkudffs version 1.1 (or better, the latest version from git) and retest all operating systems without "hacks" like the MBR and with different block sizes.
Anyway, if you have a testing environment for different operating systems, can you test the newfs_udf tool from the UDFClient project? http://www.13thmonkey.org/udfclient/ I remember that some disks formatted by newfs_udf from UDFClient were recognized correctly by Win7, but not when formatted by mkudffs. Maybe we are really facing problems in mkudffs, and not always in other operating systems.
About the number of blocks you are right; the limit is, I think, 2^32-1 (not the full 2^32). But increasing the block size also has disadvantages (as in other filesystems): some data (or metadata) will always occupy a full block on disk. Maybe some threshold could be used to start increasing the block size only when the number of blocks would exceed the maximum -- and not to use a big block size on smaller disks (to not waste disk space)?
I understand UDF specification that blocksize must be disk logical blocksize (2) (if disk needs to be compliant to UDF spec). Correct or not?
To be honest, I'm still not 100% sure. The -b option in mkudffs sets the volume's logicalBlockSize to the disk's disc->blocksize, which is set to the value passed in by the user. Note how the v2.01 spec uses slightly different terminology than you and I have used in this discussion:
Logical Sector Size - The Logical Sector Size for a specific volume shall be the same as the physical sector size of the specific volume.
Logical Block Size - The Logical Block Size for a Logical Volume shall be set to the logical sector size of the volume or volume set on which the specific logical volume resides.
The way that I read that makes it sound like the spec is calling for the volume's block size to be set to the disk's logical sector size, which should be set to the disk's physical sector (block) size.
mkudffs from udftools prior to version 1.1 (or 1.0) has lot of problems and generates invalid UDF filesystem.
Correct. I've been glad to see you pick up maintenance of udftools. And I'm even more glad that Debian/Ubuntu package maintainers have picked up your edits.
So it could be good to use mkudffs version 1.1 (or better last version from git) and retest all operating systems without "hacks" like MBR or with different block sizes.
Agreed. I have always conducted my testing against mainstream packages for udftools and OS X. Debian stable still uses 1.0.0b3-14.3, but (as you know) Ubuntu has picked up 1.2-1build1 as of Yakkety.
My testing resources are currently allocated for other projects (and also I'm about to have a baby), but I agree that it would be good to test format-udf.sh
with udftools 1.2. I captured this in #33.
I should mention that all of my initial testing was conducted without any MBR/hacks. In fact, the addition of the MBR was the outcome of having performed my initial testing.
can you test newfs_udf tool from UDFClient project?
Is this the same suite included in OS X? (I would conjecture yes, based on the number of BSD references.) If so, then I've already conducted implicit testing. I've observed minor differences, but largely consistent behavior. See #11.
I'm particularly interested in udfclient's ongoing release, where they claim to be working on a functional fsck. That would be huge for the credibility of UDF. I had barely started porting the Solaris implementation, but haven't gotten very far yet.
Maybe some thresholds could be used to start increasing blocksize when number of blocks is max number -- and not to use big blocksize on smaller disks (to not waste disk space)?
If users of format-udf.sh
are concerned about block-level efficiency, they're always welcome to use the -b BLOCK_SIZE
switch to specify their own. I don't recall having seen any disks <= 2 TiB with a physical block size > 512. Most folks, I've found, are more interested in using their entire disk vs. truncating it but having better block efficiency.
In the 2.01 spec it is also written:
physical sector - A sector [1/5.9] given by a relevant standard for recording [1/5.10]. In this specification, a sector [1/5.9] is equivalent to a logical sector [3/8.1.2].
So it is not so clear! [3/8.1.2] is a reference to the ECMA 167 standard, which says:
1/5.5 logical sector - The unit of allocation of a volume.
3/8.1.2 Logical sector - The sectors of a volume shall be organised into logical sectors of equal length. The length of a logical sector shall be referred to as the logical sector size and shall be an integral multiple of 512 bytes. The logical sector size shall be not less than the size of the smallest sector of the volume. Each logical sector shall begin in a different sector, starting with the sector having the next higher sector number than that of the last sector constituting the previous, if any, logical sector of the volume. The first byte of a logical sector shall be the first byte of the sector in which it begins, and if the size of this sector is smaller than the logical sector size, then the logical sector shall comprise a sequence of constituent sectors with consecutive ascending sector numbers.
So I'm still not sure...
Is this the same suite [newfs_udf] included in OS X?
No, from what I saw, OS X has its own closed implementation of UDF and does not use UDFClient. So UDFClient's newfs_udf implementation should be different.
Debian stable packages will never be updated (stable means that package versions are stable). But both Ubuntu (in some version) and Debian (testing) have packages for newfs_udf (in udfclient) and mkudffs (in udftools). Currently, mkudffs 1.2 does not work on 32-bit systems for formatting disks above 4GB (a problem with Large File Support); this will be fixed in mkudffs 1.3 (64-bit systems do not have this problem).
I'm particularly interested in udfclient's ongoing release where they claim to be working on a functional fsck.
The author is interested in that but has not had much time to implement it yet... Last year I was contacted by a student who wants to implement fsck as part of a thesis, so maybe there will be something...
I had barely started porting the Solaris implementation, but haven't gotten very far yet.
That is useless. I already looked at it years ago; it supported only UDF 1.2 and used Solaris kernel drivers, where the functionality was implemented... So for systems without the Solaris kernel it means reimplementing the whole functionality, and that is probably more work than implementing fsck from scratch. (And UDF 1.2 is not enough!)
I don't recall having seen any disks <= 2 TiB with a physical block size > 512.
All my 512GB and 1TB disks have a physical block size of 4096, so they are not rare. (But I'm using ext4 on them...)
Thanks for your comments and additional data points, @pali. I will terminate my udf-fsck project given your guidance.
I am still leaving this issue (#13) open as a placeholder to track https://bugzilla.kernel.org/show_bug.cgi?id=102271. We're also waiting for @GreatEmerald to respond to your request.
I was referring to the bug tracker, and @pali has already seen that. I don't feel like starting a thread about it in the Linux mailing list, since I don't think I know as much as you two about the whole issue.
From what I recall of my own testing, Windows 10 simply did not work with UDF volumes with FS block size other than 512. I tested that with an external HDD which has a physical block size of 4096, and thus by some logic should have used 4096 block size, but no dice. I also haven't really used any MBR hacks, and Win10 worked fine with that.
Maybe it would be a good idea to revive the wiki page with test results, but also make certain to add the tested OS versions. Because "Windows" can mean 10 or 95, with wildly different results. Same with macOS etc. And not everyone cares about XP support, or macOS support, etc., one may just want to have something that works on a given Windows version and a given Linux version.
I forwarded the bug to the mailing list and CCed you so you are kept informed about the status.
The discussion is available e.g. in this archive:
http://www.spinics.net/lists/linux-usb/index.html#151780
Now I have looked at this logical vs. physical block size problem again, and I think all block addressing in UDF should be done according to LBA. This means that the logical block size of the disk should be used, not the physical block size! Basically, all read/write operations in disk implementations work with LBA, and the physical block size is just a hint for disk tools to align partitions/buffers for better performance...
This comment from @GreatEmerald just proves it:
From what I recall of my own testing, Windows 10 simply did not work with UDF volumes with FS block size other than 512. I tested that with an external HDD which has a physical block size of 4096, and thus by some logic should have used 4096 block size, but no dice.
As of today, basically all disks have a logical block size of 512 (and a physical one of 4096). In the past there were HDDs which operated with a logical block size of 4096, so the LBA unit was 4096, and it caused problems because disk partition utilities, simple bootloaders and other small programs had the logical block size hardcoded to 512.
And I bet that in Pieter's article "block size" means the logical block size, for LBA addressing.
Therefore I think that the default block size for mkudffs should be the value of the logical block size (blockdev --getss). And for 2TB+ disks it would make sense to set it to at least 1024, to be able to format the whole disk.
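Something like this shell sketch (illustrative only, not what mkudffs actually does; the device path is a placeholder):

```bash
# Pick the smallest block size that is >= the logical sector size and still
# lets the whole disk fit into 2^32 - 1 blocks.
DEVICE=/dev/sdb                          # placeholder
bs=$(blockdev --getss "$DEVICE")         # logical sector size
size=$(blockdev --getsize64 "$DEVICE")   # disk size in bytes
while (( size / bs > 2**32 - 1 )); do
    bs=$(( bs * 2 ))
done
echo "suggested UDF block size: $bs"
```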
New mkudffs already takes the default block size (if not specified on the cmdline) from the logical block size of the disk.
Note that the MBR table (and also GPT structures) works with LBA and therefore depends on the logical block size, not the physical one!
Also of note is that in Windows 10 Creators update, MS finally implemented mounting multiple partitions in one device, just like it is in Linux. So for 2 TiB+ drives, it might make sense to make multiple partitions at 2 TiB in size.
Oh, and last I tried, my UDF-partitioned SD card contents were shown in Windows 10, but it was read-only. Performing the disk check resulted in all files being relocated to lost+found
and the card became read-write. Not sure why.
Also of note is that in Windows 10 Creators update, MS finally implemented mounting multiple partitions in one device
Is it really confirmed by MS?
Because the current MS implementation is very special. For removable disks an MBR/GPT is optional, but if present, only the first partition is used. For non-removable disks an MBR/GPT is required, but then all partitions are used.
So if you do not create an MBR/GPT and format the whole disk with some filesystem, then it is recognized by Windows only if the disk itself is marked as "removable".
And that "removable" flag is part of the USB mass storage protocol, so some flash disks announce "removable" and some do not.
So for 2 TiB+ drives, it might make sense to make multiple partitions at 2 TiB in size.
In the MBR, partitions are stored as 32-bit pairs (first LBA, number of LBA blocks). So for 512-byte LBA disks you can have a partition at most 2TB long, and every partition must start before the 2TB offset. This means you can use at most 4TB disks: the first partition would be 2TB long and start somewhere at the beginning, and the second partition would be 2TB long and start just before the 2TB offset.
PS: It is not exactly 2TB, but just (2^32-1)*512B ~= 2048 GiB, and due to the MBR header and alignment the first partition would be smaller...
In GPT there are pairs (first LBA, last LBA), but those are 64-bit, which means that for 512-byte LBA disks every partition must start and also end within the first 8 ZiB (1 ZiB = 2^70 B).
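A quick back-of-the-envelope check of those MBR numbers for a 512-byte LBA (plain arithmetic, nothing more):

```bash
lba=512
max_sectors=$(( 2**32 - 1 ))                    # 32-bit "number of sectors" field
echo "$(( max_sectors * lba )) bytes"           # 2199023255040 bytes per partition
echo "$(( max_sectors * lba / 1024**3 )) GiB"   # 2047 GiB, i.e. just under 2 TiB
```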
Oh, and last I tried, my UDF-partitioned SD card contents were shown in Windows 10, but it was read-only. Performing the disk check resulted in all files being relocated to lost+found and the card became read-write. Not sure why.
Make sure you are formatting with mkudffs version 1.3; older versions do not prepare UDF partitions correctly.
Pretty sure. It was a removable drive I tried, listed in the safely remove hardware list. Clicking to remove one removes both at the same time. I usually use GPT for everything these days.
Hm... looks like Microsoft added support for more partitions in Windows 10, Version 1703.
In Windows 10, Version 1703 you can create multiple partitions on a USB drive, allowing you to have a single USB key with a combination of FAT32 and NTFS partitions. To work with USB drives that have multiple partitions, your technician PC has to be Windows 10, Version 1703, with the most recent version of the ADK installed.
But it still does not solve the problem for UDF, which should not be installed on a partition but rather on the whole disk (not partitioned).
Yeap, 1703 is Creators Update. But true, I don't think I tested that with UDF, I probably should try it and see if it works.
Thanks to you both for your contributions in this conversation.
There are a few things that I wish to remind you of:
- This comment thread is for the topic of AF SAT drives. Our conversation has gotten a bit off-track, but I'll still entertain them here due to context.
- There are multiple block sizes that we're dealing with here. Please be clear.
- My priorities for this project are:
  i. Ease of use for the average user
  ii. Maximal compatibility across operating systems
  iii. Specification compliance
  iv. Maximal flexibility to help users with uncommon needs
- XP is still a supported OS (as far as format-udf is concerned), and I'm not yet willing to deprecate support for XP due to the level of interest in XP support when I first started this project.
- Windows 10 is not (yet) officially supported by format-udf (please see the README)
- The block size to be used during a format operation can be user-specified with the -b BLOCK_SIZE parameter. Because of this, the discussion on what default block size format-udf should use is relatively unimportant. However, I'm still personally interested in the question.
@pali: Maybe I'm missing something, but I don't see any new evidence from your comments today. You appear to be basing your claim from an "I think" statement:
Now I looked at this logical vs physical block size problem again and I think whole block addressing in UDF should be according to LBA.
My whole reason for choosing the physical block size in the first place was Pieter's comment from 2010:
- Windows XP: supports read-only support for UDF up to version 2.01, but the UDF block size must match the block size of the underlying device (which for USB-sticks and disks is 512 bytes).
- Windows Vista and 7 have full support up to UDF v2.6, but the UDF block size has the same constraint.
To your earlier point, this statement is indeed ambiguous, because we don't know why 512 was the right answer. In 2010, it was common for disks to have a physical block size that matched the logical block size, hence the ambiguity. (n.b. In 2017, it's a different landscape.)
For me to change the default block size used by format-udf, it will require learning exactly how Windows XP mounts UDF drives. This could be done via reverse engineering, I suppose, but I think it seems way easier to just test a physical (not virtual) Windows XP machine with real hard drives. The ideal test would include 2 hard drives:
- A disk with logical block size of 512 and physical block size of 4096
- A disk with logical block size of 4096 and physical block size of 4096
Using each of these disks, I would test a UDF file system with block sizes of 512, 1024, 2048, and 4096. That's a total of 8 test cases. For posterity, this testing ought to be repeated on Windows 7 (and 10, while we're at it). That brings our number of test cases to 24.
Using these 24 test cases, it should be possible to see whether Windows mounts UDF drives:
- with a block size hard-coded to 512
- with a block size set to the logical block size
- with a block size set to the physical block size
If the answer ends up being hard-coded to 512 or the logical block size, we'll have to get creative in order to support drives >2TiB. Perhaps that could take the shape of asking the user whether they prefer a disk compatible with Windows, or a partition that uses the entire disk.
Given the large amount of testing and burn-in that has occurred with format-udf using physical block size, if there is a change, then it will also require substantial regression testing with the new parameter, ideally across all major versions of all supported OSes.
Unfortunately, since I have a newborn at home, I don't have much free time at the moment to conduct this testing. I do have a spare machine right now, and I have at least one of the above-listed test drives. (I don't think I have a 4096/4096 drive.) I will make an attempt to conduct this testing, but it may not be for awhile. If you're willing to conduct this testing, that may lead to a quicker resolution. :)
In the meantime, users can still specify whatever block size they want using the -b BLOCK_SIZE
parameter.
Do 4096/4096 drives even exist? I thought the main reason behind the logical block size is for it to be set to 512 for backwards compatibility purposes.
And why would Windows 10 not be supported? Excluding the second most common OS conflicts with your goal ii ;) Not to mention that UDF works fine on it, and even better than any other Windows version.
Hi @GreatEmerald.
You and I are on the same page. According to @pali (here), 4096/4096 exists. If he's right, I've learned something today.
Windows 10 may very well work, but it has not received the same level of testing as previous versions of Windows. For example, please see #21. There is nothing preventing a user from running format-udf on Windows 10, but (for now) it's at your own risk. I will update the README when I've completed my testing.
If the answer ends up being hard-coded to 512 or the logical block size, we'll have to get creative in order to support drives >2TiB.
I will answer this part. As Windows XP does all addressing in 32 bits only, it does not support partitions >2TB (for classic 512-byte LBA disks). So no, you would not get >2TB working on Windows XP.
Do 4096/4096 drives even exist?
There are USB-based SATA controllers which "fake" the logical block size of the disk, propagating the disk's physical block size to the host system as the logical one. It is big fun to use those controllers with 512/4096 disks, as the MBR uses LBA. If you format those disks with such a controller, you cannot read them without it.
I do not know if it is possible to buy 4096/4096 disks today, but in the past they probably really existed, as there really were problems with software which had "logical block size = 512" hardcoded for LBA.
I thought the main reason behind the logical block size is for it to be set to 512 for backwards compatibility purposes.
Yes. A lot of applications, implementations and also formats/specifications themselves depend on 512 or LBA, so their internal structures are based on the logical block size. Changing it would mean incompatibility between disks. "Smarter" programs can read the physical block size and correctly align partitions or do read/write operations for RAID.
Do 4096/4096 drives even exist?
I did some research and those disks really exist! They are known as 4K native Advanced Format: https://en.wikipedia.org/wiki/Advanced_Format#4K_native
There is also "SmartAlign technology" for Seagate disks, and via some Seagate tools it is possible to "switch" what the disk itself reports as the logical block size.
So SmartAlign allows changing the logical block size between 512 and 4096.
Also, you can google the string "Sector size (logical/physical): 4096 bytes / 4096 bytes" (output from fdisk) to see that people have disks with a logical block size of 4096 (either faked by USB-based SATA controllers, or via SmartAlign, or native from the factory).
Thanks for the info, @pali. Indeed, I learned something today.
Be reminded that the block size limitation applies to Windows 7 as well as XP (at least according to Pieter). It wouldn't surprise me if my 24 test cases yielded different results for XP, Win7, and/or Win10.
Tests can also be done in qemu, as it is possible to emulate different logical and physical block sizes.
Here is an example of how to set logical_block_size and physical_block_size for a hard disk image stored in the local file udf.img:
qemu-system-x86_64 -enable-kvm -cpu host -m 1024 -device lsi -device scsi-hd,drive=hd,logical_block_size=4096,physical_block_size=4096 -drive file=udf.img,if=none,id=hd
Also, the removable flag can be emulated; just append ,removable=on after the physical_block_size setting.
Here are my tests for Windows XP:
removable | MBR | UDF block size | logical block size | physical block size | readable |
---|---|---|---|---|---|
no | no | * | * | * | no (not initialized) |
yes | no | 512 | 512 | 512 | yes |
yes | no | 512 | 512 | 4096 | yes |
yes | no | 512 | 4096 | 4096 | no (corrupted) |
yes | no | 4096 | 512 | 512 | no (unformatted) |
yes | no | 4096 | 512 | 4096 | no (unformatted) |
yes | no | 4096 | 4096 | 4096 | no (unformatted) |
no | yes | 512 | 512 | 512 | yes |
no | yes | 512 | 512 | 4096 | yes |
no | yes | 512 | 4096 | 4096 | no (corrupted) |
no | yes | 4096 | 512 | 512 | no (unformatted) |
no | yes | 4096 | 512 | 4096 | no (unformatted) |
no | yes | 4096 | 4096 | 4096 | no (unformatted) |
yes | yes | 512 | 512 | 512 | yes |
yes | yes | 512 | 512 | 4096 | yes |
yes | yes | 512 | 4096 | 4096 | no (corrupted) |
yes | yes | 4096 | 512 | 512 | no (unformatted) |
yes | yes | 4096 | 512 | 4096 | no (unformatted) |
yes | yes | 4096 | 4096 | 4096 | no (unformatted) |
(note that Windows XP has only read-only support for UDF)
So for Windows XP, the logical block size and the UDF block size must match. The physical block size does not matter. This just proves my assumption about the use of LBA (or a hardcoded 512) in the Windows UDF HDD driver.
Here is some information about 4K disks from MS:
https://msdn.microsoft.com/en-us/windows/compatibility/advanced-format-disk-compatibility-update
https://en.wikipedia.org/wiki/Advanced_Format#4K_native
Readiness of the support for 4 KB logical sectors within operating systems differs among their types, vendors and versions. For example, Microsoft Windows supports 4K native drives since Windows 8 and Windows Server 2012 (both released in 2012), and Linux supports 4K native drives since the Linux kernel version 2.6.31 and util-linux-ng version 2.17 (released in 2009 and 2010, respectively).
Thank you, @pali! This is fantastic data. I think we've gotten to the heart of the matter.
It seems none of us (Pieter included) had the complete picture:
- I incorrectly interpreted Pieter's comment on block size to refer to physical block size. According to your testing, it appears that XP's UDF mounting behavior indeed depends on the logical block size.
- Both Pieter and you seemed to miss the information that XP attempts to mount UDF drives assuming a hard-coded block size of 512.
So, if I had to sum up what your testing has included: According to empirical testing, Windows XP 32-bit will mount a UDF drive (read-only) if and only if BOTH the UDF file system block size AND the device's logical block size are 512 bytes.
According to your testing, breaking either of these conditions causes XP to fail mounting/reading the device.
This information is invaluable and helps advance progress for both UDF and also format-udf. Thank you!
I have already begun thinking through how format-udf should be modified to best accommodate this new information. However, we still have an incomplete picture.
If it's not too much trouble, are you willing to repeat your testing on Windows 7, and also on Windows 10? The information you pasted on 4K native drives makes me think that Windows 10 will have different behavior than Windows XP.
Also, do you anticipate that 64-bit Windows operating systems will have UDF-mounting behavior that differs from their 32-bit counterpart?
Thanks again!
XP attempts to mount UDF drives assuming a hard-coded block size of 512.
CD/DVD optical media have a logical block size of 2048. The UDF block size for optical media is also 2048. And the XP UDF driver can read DVDs without any problem. So it looks like the UDF driver does not have the block size hardcoded to 512; rather, the logical block size for HDDs is hardcoded to 512 and for CD/DVD to 2048.
Windows XP 32-bit will mount a UDF drive (read-only) if and only if BOTH the UDF file system block size AND the device's logical block size are 512 bytes.
Agree.
are you willing to repeat your testing on Windows 7, and also on Windows 10?
Later I will try to prepare a similar table.
Now I'm going to add support for a fake MBR and GPT partition table directly into mkudffs, and also an option to clean the reserved UDF boot area to wipe out all previous magic headers of other filesystems.
Also, do you anticipate that 64-bit Windows operating systems will have UDF-mounting behavior that differs from their 32-bit counterpart?
It is possible, especially in the case of GPT. E.g. Windows Server 2003 SP1 has GPT support, so maybe it could handle >= 2TB disks, but Windows XP (both 32-bit and 64-bit) does not support GPT and is unable to handle >= 2TB disks.
Very good. Thanks, @pali.
Now I'm going to add support for fake MBR and GPT partition table directly into mkudffs. And also option to clean reserved UDF boot area to wipeout all previous magic headers of other filesystems.
I haven't added GPT support yet (#15), but it sounds like you aim to replicate inside mkudffs what format-udf already does. You're more than welcome to do that, but you may wish to make the fake partition table non-default behavior, as users of mkudffs are expecting compliance to the UDF spec. format-udf advertises a different goal than mkudffs, allowing it to break spec compliance in favor of maximal compatibility. See our previous conversation.
The fake table would of course be optional and not enabled by default. Writing a proper MBR or GPT is tricky, as all offsets must be according to the logical block size of the underlying device, and not the UDF block size or the physical block size... Plus GPT needs a backup copy (and not an exact copy) at the end of the disk. I need it for proper testing of >= 2TB 4096/4096 disks.
The terms "physical block size" and "logical block size" in Linux seem to match the definitions in the Advanced Format specification. Advanced Format introduces 4k physical sector drives, which also define a logical sector size, typically either 512 bytes or 4k bytes. The whole point with the smaller logical sector size is that the drive emulates a classic 512 byte sector hard drive, so that it is compatible with older operating systems and BIOSes that can not handle 4k sectors. The logical sector size is the addressable sector size of the drive, i.e. the sector size the ATA command set and the SCSI commands refer to.
The ECMA-167 standard (which the UDF standard is built upon) also defines the terms (physical) sector and logical sector, but these mean different things and should not be confused with the Advanced Format definitions. From ECMA-167: "Sector: The data field of the smallest addressable part of the medium that can be accessed independently of other addressable parts of the medium. Logical sector: The unit of allocation of a volume. The length of a logical sector shall be referred to as the logical sector size and shall be an integral multiple of 512 bytes". So here a logical sector is potentially bigger than the physical sector. Also bear in mind that the UDF and ECMA-167 specs predate AF disks by about two decades.
To me it is clear that what Linux and AF calls the "logical sector size" (and not the "physical sector size") is what should be chosen as the sector size for UDF. This is the "addressable sector size", i.e. the sector number N in an ATA or SCSI command refers to a sector of this size and the offset to this sector on the disk is N*(logical sector size) bytes. How could you otherwise move an 4096/512 AF disk to an older system that has no knowledge of 4k sector disks? The physical sector size (in the Linux/AF sense) describes how the data is stored physically on the disk platter (e.g. preamble - 4k data block - ECC - gap). The value just serves as a hint that the partitions on the disk should preferably be aligned to 4k borders.
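A concrete illustration of why the addressable (logical) sector size is the one that matters (plain arithmetic, nothing drive-specific):

```bash
# Byte offset of LBA 1000 depends on the logical sector size the OS assumes:
echo $(( 1000 * 512  ))   # 512000  -- where a 512-assuming OS would look
echo $(( 1000 * 4096 ))   # 4096000 -- where the data actually is on a 4Kn disk
```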
Welcome to the conversation, @jmyreen!
You are correct, and there are several people in this discussion who agree with you (including myself) on the theoretical correctness. If you haven't already, I encourage you to read this entire thread for context. Along the way, you'll learn about a kernel bug discovered/reported by @GreatEmerald, the historical reasons why format-udf used the physical block size, and information on format-udf project priorities.
As I've said a few times in this discussion, theoretical correctness (or spec compliance) is not the highest goal of this project. If you're looking for theoretical correctness in UDF, may I recommend checking out @pali's udftools (which, by the way, now defaults to the logical block size if not overridden).
In case you missed it, format-udf also has an explicit override. Users can still specify whatever UDF block size they want using the -b BLOCK_SIZE
parameter.
I already have a fix adjusting the logic for format-udf's default block size in the works, but because maximal compatibility is a large project priority, I am not making decisions according to theoretical correctness. Instead, empirically observed behavior is the primary means I'm using for determining the adjusted logic.
I'm interested in observing behavior of the UDF mount on Windows XP, 7, and 10. We have good reason to believe that Windows 10 has adjusted their UDF logic. @pali was kind enough to do a significant amount of testing on XP. However, nobody has stepped up (yet) to replicate his efforts for Windows 7 or 10. If you're interested in digging in to do some qemu testing on Windows 7 and 10, I'd be grateful for the assistance.
Instead of adjusting the logic multiple times, I'm waiting until I see results from Windows 7 and 10 before pushing my fixes.
I think theoretical correctness should be the starting point, don't you think? If some systems don't work according to spec, then work around the bugs in these implementations. I have read the whole thread, and I don't see anything in the findings above that contradicts what I said. In particular, the XP table by @pali contains just what I expected. The 512 byte block size for hard disks is not only hardcoded for the UDF driver in XP, it's hardcoded in the whole operating system, because 4k disks didn't exist when XP was released. (Optical disks are a different matter, they had a 2k sector size from the beginning.)
My point is that the logical sector size is really the only thing that matters, since that is the unit used by the ATA and SCSI protocols. The drive and the OS had better agree on the size, otherwise things won't work. The physical sector size is a drive internal thing, and if the kernel has a bug reporting the correct physical sector size, it doesn't really matter. (Based on the bug report above I find it disputable there even is a bug in the kernel.)
Note that the GPT partition table works in the same way. The partition table uses Logical Block Addresses to address the start and end of the partition. On an AF 512e (emulated) disk 512 byte blocks are used, but on a 4Kn (native) disk the LBA entries in the table reference 4096 byte blocks.
Thanks for sharing your opinion, @jmyreen.
If some systems don't work according to spec, then work around the bugs in these implementations.
That's why this format-udf project exists, actually. Several features of format-udf are meant to formalize workarounds that help maximize OS compatibility.
logical sector size is really the only thing that matters
I beg to differ. Logical sector size is one thing that matters; but it's not the only thing that matters. As evidenced by @pali's research on XP, Windows (at least earlier versions) incorrectly hard-coded a 512-byte block size while mounting UDF. So, it's insufficient for format-udf to merely use logical sector size (vs. physical sector size). In order to make any attempt at supporting XP (which is a stated goal--at least currently), format-udf must also do something differently when a user is attempting to format a device having a logical sector size other than 512.
When I push my changes that switch the default block size used from physical to logical, there are a few edge cases that must be addressed as well (a rough sketch of the logic follows this list):
- (This is the case mentioned above.) If a user attempts to run format-udf on a device having a logical sector size other than 512, format-udf should warn the user that the resultant drive will not work on XP. That warning should also list Windows 7 and 10, if they also have the same limitation. Since Windows is closed-source, the only way to know for sure is to test UDF mounting behavior on Windows 7 and 10.
- Larger drives today have more than 2^32 logical sectors. (That seems to be less true of physical sectors, as some manufacturers seem to artificially keep the number of physical sectors below that 2^32 threshold.) Since UDF cannot handle more than 2^32 blocks, this creates an edge case where some users must decide between utilizing the full drive capacity and maintaining XP support. That's an easy decision for many users who never deal with XP. However, if Windows 7 and 10 also have the same 512-byte block limitation as XP, then format-udf users will need to decide between utilizing the full drive capacity and maintaining Windows support. That might be a deal-breaker.
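Roughly, the warning logic I have in mind looks something like this (a sketch only, not the actual patch; it assumes $DEVICE holds the target device name, as in the script):

```bash
LOGICAL_BS=$(blockdev --getss "/dev/$DEVICE")        # disk logical block size
TOTAL_SIZE=$(blockdev --getsize64 "/dev/$DEVICE")    # disk size in bytes

if (( LOGICAL_BS != 512 )); then
    echo "WARNING: logical sector size is $LOGICAL_BS bytes; the result will not mount on Windows XP" >&2
fi

if (( TOTAL_SIZE / LOGICAL_BS > 2**32 - 1 )); then
    echo "WARNING: disk has more than 2^32-1 blocks at ${LOGICAL_BS} bytes; choose between full capacity and Windows compatibility" >&2
fi
```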
As author/maintainer of format-udf, one of the early lessons I learned is that this project must be self-documenting. The average user of format-udf might skim the README, but otherwise never reads any documentation. I've learned that, if I don't provide contextual warnings about the implications of the user's decisions, my inbox floods with complaints about how format-udf is broken, when in reality it's working just fine; users often don't understand the nuances of how different OSs across the board handle their specific drive.
The net effect of this is that I'm incentivised to tell the complete truth when it comes to alerting users about limitations. It would be great to repeat @pali's testing on every version of Windows. However, I'm settling for just Windows 7 and 10.
Again, if you're willing to help shoulder the testing burden, I'd be grateful for the assistance.
@JElchison, can I ask one question: Why did you choose the physical block size of the media as the default block size for UDF? Do you have some combination of HD and OS which does not accept 512 bytes as the block size? Because from everything I have read, it looks like the block size for UDF is bound to the hard disk sector size, and only new OSes are able to handle hard disks with a non-512-byte sector size (logical block size).
The only exception is the Linux kernel prior to 2.6.30, which was unable to use UDF with a block size different from 2048 (also on HDD) unless the user manually and explicitly defined the correct UDF block size (via a mount option).
I was not able to simulate or find a hard disk and Windows version which needed a block size of 4096. That does not mean such a combination does not exist, but I'm trying to find the source of the statement that UDF needs the physical block size and the UDF block size to match in some existing implementation, due to some compatibility issue... Finding such a combination, or an explicit source for this statement, is the key to finally deciding what the best options for UDF compatibility are.
(I agree with @jmyreen, nice summary!)
Windows XP is unable to deal with 4k native (i.e. 4k logical sector size) AF disks in general, because they didn't exist until long after XP was released. This shortcoming is at a lower level than the file system level, so it's not UDF's fault. I found the following hotfix from Microsoft which implies that not even Vista or Windows Server 2008 can handle disks with a 4k logical sector size.
Does anybody know of a list of known 4k native (not 512e) Advanced Format USB disks that are available for purchase? Apparently they have existed for some time already. A 4k sector size pushes the UDF partition size limit from 2 TB to 16 TB.
@pali: Of course! I thought I've answered this question already, but perhaps it's been lost in my verbosity. Sorry about that. Let me try again, and I'll try to be more concise.
There are two reasons why I chose physical block size in the first place (years ago):
- That is how I initially interpreted Pieter's blog writeup, which was the inspiration for format-udf
- Years ago, I conducted considerable testing on my own, and using the physical block size (seemed to) provide better compatibility across my test set of drives and operating systems
However, since then, both of these claims have come into question. Specifically:
- @pali's testing on XP has proven my initial interpretation of Pieter's blog incorrect
- My initial round of testing was conducted before having added other compatibility features to format-udf, most notably the fake partition table. Once format-udf started specifically overwriting the first N sectors on disk (and optionally also writing a fake MBR), I realized that my previous testing (although extensive) could no longer be trusted, because all initial testing used an unknown initial state. Sadly, I no longer have all of the drives that I used years ago in my initial round of testing, so I can't go back and re-validate.
In case I've been unclear so far, I agree with both of you that the default behavior for format-udf should be based on logical block size, not physical block size. The reason why I'm waiting to push my fix for that is because I want to provide a full solution, not just a partial one. More reasons were outlined here.
It's important to me that format-udf provide maximal support across operating systems. The only way to do that against closed-source operating systems is to test them. As soon as I can ensure that my fixes (and alerts to the end user) are accurate (specifically for Windows 7 and 10), I'll be happy to push my fixes.
Yea, I agree with the lot of you. Though I wonder, since XP and previous (or maybe Vista and previous) are rather different in working with UDF, maybe it would make sense to make some sort of format wizard that would ask the user if they care about supporting such OSs in problematic cases? If they don't, follow the spec more closely and not use quirks for old OSs (fake MBR included).
Speaking of MBRs, do Macs still require a whole disk filesystem? Has anybody tried a disk with a GPT partition table on a Mac? GPT works fine on Linux and at least Windows 7 and later, according to my tests.
@GreatEmerald: Good ideas. In fact, format-udf is that "format wizard" you suggested. (Otherwise, users can just use udftools directly.) One primary aim of format-udf is to abstract away most of the technical details so that an average end user doesn't have to worry about them. "UDF as a service", if you will. The logic currently in question is how to prompt the user when they need to decide between using the entire disk's capacity and maintaining compatibility for XP (and possibly Win7/10). I would like to repeat @pali's research for Win7 and Win10 so that the user prompts can be accurate, helpful, complete, and still abstract away the complicated detail.
@jmyreen: My Mac still requires a whole disk filesystem. However, I have not tried UDF with GPT. Sounds like another great thing to test. :)
Looks like Windows 10 does not have drivers for the LSI Ultra2 SCSI controller, and therefore it is not possible to attach a qemu scsi-hd with a 4096 physical sector size from a local file (as described in #13 (comment)).
@pali: Thanks for checking. Do you know whether the same applies to Win7?
We can always conduct testing the old-fashioned way (i.e. a real, non-virtual machine).
Here is a Microsoft support article which seems to put the nail in the coffin for UDF (and FAT32) disks larger than 2 TB on Windows 7: Microsoft support policy for 4K sector hard drives in Windows. According to the article, 4k sectors are only supported on Windows 8, Windows 10 and Windows Server 2012.
I have also learned that apparently there are USB enclosures that present themselves as 4k logical sector drives, even if the drive inside the enclosure uses 512 byte sectors. This seems to make it easier to find an external 4k drive than an internal drive, which all seem to be 4k physical / 512 emulated (at least if we are talking about SATA drives.) See this superuser.com article.
Thanks for the info, @jmyreen. Valuable resources indeed.
A point of clarification for future readers who don't click through to the articles: Across its OS versions, Windows currently offers wider support for 512E devices than for 4K native. However, Windows 10 isn't listed as supporting 512E.
Missing from the Microsoft article, unfortunately, is information specific to its UDF mounting behavior. If a specific HD is supported by Windows, but the file system isn't, format-udf users are still out of luck.
For example, if a supported HD (according to the Microsoft article) has a 2048-byte UDF file system block size, should we expect the UDF file system to mount? Does the behavior differ between versions of Windows? I only know of two ways to obtain this kind of information:
- reverse engineer the Windows OS (difficult and illegal)
- test it (a nuisance, but legal)
As a representative set, I still think that testing the UDF mounting behavior on Windows 7 and 10 is the way to provide the format-udf user base with the most accurate compatibility expectations.
The UDF file system uses logical block numbers, not byte offsets, as the addressing unit. File system structures that reference other parts of the file system, for example a data block, contain block numbers. This is why the block (sector) size is so important. The logical sector size of the drive governs the internal structure of the file system, and the file system driver must adapt to this. I doubt there are any 2048-byte sector hard drives on the market, so a 2048 sector size UDF file system is not applicable to hard drives. Optical media, on the other hand, use 2048 byte sectors.
The logical sector size of the drive governs the internal structure of the file system, and the file system driver must adapt to this.
In theory, you are correct. In practice, operating systems have done silly (non-adaptive) things like hard-coding 512-byte file system block sizes when mounting HDDs. This is what @pali's XP research indicated.
a 2048 sector size UDF file system is not applicable to hard drives
In theory, you are correct. In practice, udftools defaulted to a block size of 2048 until @pali made a recent update to it. Plenty of people have contacted me over the years asking questions why their 2048-byte UDF HDD doesn't mount on their choice OS.
It's particularly bad for Windows, which is both a closed-source OS and doesn't provide a CLI method for specifying mount parameters (such as file system block size). Both *nix and macOS systems provide a way to specify the file system block size you wish to use.
To cover for cases when operating systems don't do the theoretically optimal thing, I still think there's no better substitute for testing.
Did the 2048-byte UDF HDD ever work even on Linux? I don't see how that is possible, unless the Linux UDF driver does something non-standard. All the guides I have come across say that you should use the --blocksize=512 switch with mkudffs when formatting an HDD.
In theory, you are correct
I don't think @pali's XP findings negate anything I said about the requirement that the file system driver and the drive must agree upon the sector size. XP just didn't adapt. It didn't follow the rules, and that's because Windows XP doesn't know how to handle 4k sectors.
Don't get me wrong. I do think testing is important, and should be done as thoroughly as possible. On the other hand, I don't think it's worth the trouble to test something that's not according to spec.
Did the 2048-byte UDF HDD ever work even on Linux?
In the past (before the 2.6.30 kernel) only UDF with a 2048 block size worked automatically, independent of the disk sector size. Currently the Linux kernel can mount and use UDF with probably any block size on an HDD with a 512-byte sector size. IIRC, udf.ko detects the UDF block size based on the VRS and AVDP blocks.
I did some investigation for UDF support in linux kernel.
Prior to 2.6.30, only a UDF block size of 2048 is tried (independent of the logical disk sector size). Since 2.6.30 and prior to 4.11, it tries the logical sector size and 2048 (as fallback). Since 4.11, it tries the logical sector size and falls back to any valid block size between the logical sector size and 4096.
In any case (also prior to 2.6.30) it is possible to manually specify the UDF block size via the bs= mount parameter.
So UDF in the Linux kernel fully ignores the physical block size of the device.
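For example (device and mount point are placeholders), forcing the block size at mount time looks like this:

```bash
# Explicitly tell the UDF driver which block size the filesystem was formatted with
mount -t udf -o bs=512 /dev/sdb /mnt
```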
Since 4.11 [Linux] tries logical sector size and fallback to any valid blocksize between logical sector size and 4096.
The Linux kernel allows the UDF "internal block size" to be a multiple of disk's logical sector size, and employs this heuristic to detect what the block size is. We cannot expect any other OS (Windows) to do so, because it is against the UDF specification. If a UDF formatted disk is to be portable over operating systems, the formatting tool should not use any other block size than the disk's logical sector size. Unfortunately, this means the maximum UDF file system size for a typical hard disk is 2 TB.
If somebody has a free 2TB+ HDD, please try to format it as UDF on Windows (a quick format is enough). We would then see what Windows generates... I tried this on Windows in qemu, but after qemu wrote 117MB (yes, megabytes!) to the disk image, qemu paused and crashed (yes, qemu! not Windows).
Of course, if a 2TB+ disk has 4k logical sectors (i.e. it is a 4k native Advanced Format disk, not 512e), then the maximum size of the UDF file system is 16 TB. Note that FAT32 is affected by the same problem: the maximum volume size is 2 TB with 512 byte sectors, and 16 TB with 4k sectors. It seems the vendors of external drives have adopted two strategies to deal with this, either
- using 4k disks (possibly faking 4k with the USB chipset in the enclosure), or
- pre-formatting the disk with NTFS and including an NTFS file system driver for Mac
It's quite hard to find information on whether big disks are 4k native or not. My guess is (based on web search) that they are actually quite rare, and mainly SCSI disks.
I mean classic disks with 512-byte logical sectors. At least Windows in qemu allowed me to start formatting a 4TB partition (GPT scheme) to UDF. But as I wrote, qemu paused and crashed, so I do not know what the result from Windows would be.
Yes, when the UDF block size is 512 and the logical sector size is also 512, then the upper limit is 2TB (minus one sector...). But the question is what Windows does with larger disks using the GPT scheme, which supports larger partitions, and with UDF?
@pali I created a ~3 TB virtual disk in VirtualBox and attached it to a Windows 10 virtual machine. The result is not encouraging: the format program says "QuickFormatting 2,9 TB", but the end result is a file system of size 881 GB, which is suspiciously close to 2.9 T modulo 2 T. It seems the Windows format program is not at all prepared for big disks; it really should have printed an error message saying that the partition is too large, instead of doing this strange thing. Formatting to NTFS works, though.
When I tried reducing the volume size on the disk to 1.9 TB (leaving the rest of the disk unpartitioned), everything worked OK: "Format complete. 1.91 TB total disk space."
So... after formatting a 3TB disk, the UDF filesystem has only 881GB of total space (also in Explorer)? Then it just proves the fact that 2TB+ disks (with 512b sectors) are unsuitable for UDF on Windows. Which also means that GPT (the fake header) is not needed at all, as MBR is enough.
Thanks to everyone for continuing this conversation!
On @jmyreen's latest tests: My understanding (which may be incorrect) is that UDF itself uses 32-bit block numbers internally, which seems to be the limiting factor in play here.
If UDF is limited to 32-bit block numbers, then I don't think there's any partition format (such as GPT) that can extend the max capacity beyond 2^32 blocks.
Yes, UDF uses 32-bit block numbers; this is a limit. But if you use a larger block size, then you can increase the maximal size of the formatted filesystem. I just wanted to see what Windows would do in this case...
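A quick back-of-the-envelope check of those limits (2^32 blocks times the block size):
echo $(( 2**32 * 512 ))     # 2199023255552 bytes = 2 TiB with 512-byte blocks
echo $(( 2**32 * 4096 ))    # 17592186044416 bytes = 16 TiB with 4096-byte blocks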
Anyway, there is another tool for formatting disks under Windows which also supports UDF. See this blog: https://fsfilters.blogspot.com/2011/10/testing-minifilter-on-more-filesystems.html
@pali I tried the Diskpart tool mentioned in the blog post you linked to. The result is the same: no error message ("DiskPart successfully formatted the volume."), but the final file system size is the partition size modulo 2 T. My guess is that both Format.exe and Diskpart get as input the number of sectors on the disk, truncate it to 32 bits, and use the truncated (unsigned) result as input to the formatting code. Maybe both tools use the same library code internally?
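As a rough illustration of that hypothesis (assuming a nominal 3 TB decimal capacity and 512-byte sectors; the exact figure depends on the true size of the virtual disk):
SECTORS=$(( 3 * 10**12 / 512 ))     # 5859375000 sectors on a nominal 3 TB disk
TRUNCATED=$(( SECTORS % 2**32 ))    # 1564407704 sectors left after 32-bit truncation
echo $(( TRUNCATED * 512 ))         # 800976744448 bytes, the same order of magnitude as the 881 GB observed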
It is possible that both tools use the same library... Anyway, both format and diskpart support specifying the block/sector size (I do not know the correct parameter name). Can you test whether it is possible to specify it for UDF as well, and whether it works?
I tried with unit sizes 4096, 2048, and 1024, but all failed. Specifying unit=512 worked (not shown in the screenshot). This is in line with the UDF spec, which only allows a block size equal to the logical sector size of the disk. I couldn't find any error message giving more information in Event Viewer.
OK, thanks for testing! I think now we can definitely say that Windows does not support UDF with a block size different from the logical sector size of the disk, which means it does not support UDF on a partition which has more than 2^32-1 sectors (for 512/4096 disks, more than 2TB).
The original bug report is about incorrect detection of the physical block size, and that is a problem in the kernel (see the relevant mailing list discussion). Moreover, for both Windows and Linux it is irrelevant, as UDF needs to have a block size equal to the logical sector size of the disk.
Therefore this bug can be closed, and the script needs to be fixed to use the logical sector size instead of the physical one...
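For example, the logical sector size could be read like this (device name is only a placeholder):
blockdev --getss /dev/sdX
cat /sys/class/block/sdX/queue/logical_block_size    # equivalent, without root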
Well, the purpose of this issue has rather shifted from physical block size detection to defaulting to logical block size + updating documentation about the change.
Meanwhile I implemented a new option --bootarea=preserve|erase|mbr in mkudffs for specifying how to fill the UDF boot area: https://github.com/pali/udftools/commits/master For hard disks the default would be erase, to clean all headers of all previous filesystems. The mbr option would then put in a similar "fake" partition. I have also experimented with gpt; it works, but I do not see any reason to use it, as it has no benefits for now (and just causes problems with the last UDF anchor)...
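For example, formatting a hard disk with the new option could look roughly like this (the device name is only a placeholder, and the option spelling is taken from the description above):
mkudffs --media-type=hd --blocksize=512 --bootarea=mbr /dev/sdX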
From the new README:
Many operating systems will only attempt one block size (usually whatever the mount utility defaults to). For example, in order to mount a UDF device, Windows seems to require that the UDF file system use a block size equal to the logical block size. If your block size isn't the OS's default, then auto-mounting likely will not work on your OS.
It's the UDF specification that requires that the UDF file system use a block size equal to the logical block size. You can't really blame Microsoft for adhering to the UDF specification – it's the right thing to do for the Windows implementation to enforce this. That Linux supports a file system block size that differs from the block size of the disk is a non-standard extension. That older Linux UDF kernel driver versions only supported 2048 byte blocks on hard disks was clearly a bug.
It would make life much easier for the users of the UDF formatting and mount utilities if the programs just did the right thing, and we could simply forget about the -b option. Formatting and mounting should use the (logical) block size of the device, because otherwise we really can't call the resulting file system UDF anymore. Has anybody ever come across a case where deviating from this rule has been necessary? That is, a case where using a file system internal block size equal to the block size of the drive would cause an incompatibility between systems? Why would you want to create a file system that doesn't follow the specification?
There may be situations where the -b option is needed: if the operating system is unable to report the block size of the drive. Note that the problem that started this whole discussion, Linux reporting (or "lying about") the wrong physical block size, doesn't count, because the physical block size is irrelevant here. There is a mention that "macOS is known to report the incorrect block size in certain scenarios as well." Is this a documented fact, and if it is, is there something that could be done to work around the problem?
If the -b option were to stay, it should (in my opinion) be hidden in "expert" sections of the documentation, with ample warnings that the user should really understand what he or she is doing.
Specifying the block size is needed when you are creating/formatting a disk image stored in a local file (which is later "flashed" or cloned to a real disk device).
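A minimal sketch of that workflow (image name, size and target device are only placeholders):
truncate -s 8G udf.img                             # create a sparse image file
mkudffs --media-type=hd --blocksize=512 udf.img    # block size must be given explicitly; a plain file has no sector size of its own
dd if=udf.img of=/dev/sdX bs=4M status=progress    # later, clone the image to the real disk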
Good point. Though this also falls into the "really know what you are doing" category.
Great catch, @jmyreen. The discussion about the Linux kernel bug is now irrelevant. I have removed references to it from the README and from the usage, and have added an "expert warning" to the usage.
I attempted to find my message history describing macOS reporting a suspect block size, but I cannot find it. Perhaps I handled that discussion outside of a GitHub Issue. Thus, I have removed the macOS reference as well.
Let me know if these changes meet your liking. (You can review it again on the same pull request, #35.)
Is this a documented fact, and if it is, is there something that could be done to work around the problem?
...am I misunderstanding Pali's table above? It seems to me that if you had a 4K native disk, you'd want to set the UDF block size to 512 for compatibility with Windows XP, since according to the table, XP needs 512 but doesn't mind if the physical size doesn't match.
What am I missing?