Much lower compression on new pool with largely same settings #16360
-
I have a pool that I have been using for a long time to store various kinds of data, and I recently started moving some of it to a new pool built from the disks I had used in the first pool before upgrading its drives. Both pools use ashift=12, lz4 compression, and recordsize=1M on all datasets. If I take a file that is well compressed on the origin pool and make a copy on that same pool, the copy is also well compressed, judging by the difference between its logical size and its size on disk.

I should note that I originally created the new pool with checksum=blake3 and recordsize=4M (with the limit increased), but when I noticed this difference I set these back to the values used on the origin pool, deleted all copied files, and started over. That made no difference. I have compared the dataset and pool properties; the configurable native properties are essentially the same. The only notable differences are the size of the pool, the vdev structure (the origin pool is a single raidz1 vdev, while the new pool has no parity and its sole vdev is a Device Mapper virtual block device), the name and description, and the enabled features. Namely, the following features are enabled in the new pool but not in the old one:

What is the cause of this difference? I would prefer to keep the compression behavior of my old pool, because it has a meaningful impact on how soon I will need to upgrade. This is what I'm using:
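For context, one way to spot-check the logical-vs-on-disk comparison described above is with `du` (the file path below is a placeholder, not from the thread):

```sh
# Logical (apparent) size vs. space actually allocated on disk
du -h --apparent-size /oldpool/data/somefile
du -h /oldpool/data/somefile
```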
-
I have attached exports of the pool properties and of one of the affected datasets: props.zip. Please ignore the size and free-space properties; I'm in the process of copying all files again to get a bigger picture of the current effectiveness of compression.
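For anyone who wants to produce the same kind of export, roughly (pool and dataset names are placeholders):

```sh
# Dump every pool and dataset property to text files for comparison
zpool get all newpool > newpool.pool.props
zfs get all newpool/data > newpool.dataset.props
```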
-
Are you sure the old pool really has ashift=12? You should not rely on the pool property, since it only affects newly added vdevs and changing it later does not matter. Look at the zdb output, which shows the real ashift of each individual leaf vdev.
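For reference, the per-vdev ashift can be read from the cached pool config like this (pool names are placeholders):

```sh
# zdb -C prints the cached config, including "ashift:" for each leaf vdev
zdb -C oldpool | grep -w ashift
zdb -C newpool | grep -w ashift
```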
What is the average file size? If files are very small, then recordsize may not matter if it is rarely/never reached.
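A rough way to get that file-size picture, if useful (GNU find/awk; the path is a placeholder):

```sh
# Count files and compute the average size in MiB under a dataset mountpoint
find /newpool/data -type f -printf '%s\n' |
    awk '{ s += $1; n++ } END { if (n) printf "%d files, avg %.1f MiB\n", n, s / n / 1048576 }'
```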
Pool topology may also matter. For example, RAIDZ1 rounds each allocation up to a multiple of 2 allocation units, which is only 1KB at ashift=9 but a much more significant 8KB at ashift=12, while a single vdev or a mirror rounds up only to 1 allocation unit (see the sketch below). The mention of Device Mapper does not tell me anything by itself; depending on its characteristics, ZFS may increase ashift up to 16KB if needed (again, check the zdb output). If you use Device Mapper to build some sort of RAID under ZFS -- please don't, since you lose ZFS's ability to recover data from multiple copies.
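To make the rounding concrete, here is a rough sketch of the raidz allocation math, assuming a 4-disk raidz1 at ashift=12 and a 32K compressed block (these numbers are illustrative, not taken from the thread):

```sh
# Mirrors the rounding done for raidz allocations: data sectors plus parity
# sectors, rounded up to a multiple of (nparity + 1) sectors.
awk -v psize=$((32 * 1024)) -v ashift=12 -v ndisks=4 -v nparity=1 'BEGIN {
    sect = 2 ^ ashift
    data = int((psize - 1) / sect) + 1                               # data sectors
    par  = nparity * int((data + ndisks - nparity - 1) / (ndisks - nparity))  # parity sectors
    tot  = data + par
    rup  = nparity + 1
    tot  = int((tot + rup - 1) / rup) * rup                          # round up
    printf "psize=%d B -> asize=%d B on raidz%d\n", psize, tot * sect, nparity
}'
# Prints: psize=32768 B -> asize=49152 B on raidz1
```

On a single-disk or mirror vdev the same 32K block would simply occupy 32K, which is why identical data can show different "space saved" numbers on the two pools.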
-
That's a good point, thanks!
The vdev with guid of
On the dataset for which I have included the properties export, most files range from 500 MB to 10+ (below 20) GB.
In case it matters, the Device Mapper virtual block device is a LUKS device on a full disk, no partition table.
I don't, this is only a single layer of full-disk encryption. This pool is intentionally without parity; it will only store replaceable data, backups, and other such things.
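It may be worth checking what sector sizes the LUKS mapping actually advertises, since that is what ZFS derives ashift from (the device/mapping name below is a placeholder):

```sh
# Logical and physical sector size reported by the mapped device
blockdev --getss --getpbsz /dev/mapper/backup-crypt
# LUKS2 can also be formatted with a larger sector size; cryptsetup shows it
cryptsetup status backup-crypt | grep -i sector
```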
-
Looking at the output of Do both of them agree the result is logically the same size, even if actually not? The awkward part about doing math on raidz is that all the numbers you get from things like
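For what it's worth, a quick way to put the logical and allocated numbers for both datasets side by side (dataset names are placeholders):

```sh
# -p prints exact byte values; compare logical vs. allocated space and ratios
zfs get -p logicalused,used,logicalreferenced,referenced,compressratio oldpool/data newpool/data
```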
-
I wonder how you copied the data to the new pool. If you used
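In case the copy went over zfs send/receive, these flags affect how blocks land on the destination; a sketch with placeholder snapshot and dataset names:

```sh
# -L keeps records larger than 128K intact, -c sends blocks still compressed,
# -e lets very small blocks stay embedded in the block pointer
zfs send -L -e -c oldpool/data@snap | zfs receive newpool/data
```

Without -L, records larger than 128K are split back down to 128K in the stream, which changes how the data is laid out and compressed on the receiving pool.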
Well, in the example dataset you included, assuming it's the same data on the old pool and the new one, the old said 744G including snapshots and 394G live, with a compression ratio of 1.00x, while the new says 144G and a compression ratio of 1.00x.
So I'm not sure the problem here is one of compression differences. I really think it's just raidz deflateratio surprising you.
Pick a particular file on the old and the new which differ in apparent space savings and examine them closely in zdb.
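A sketch of how one such file can be examined (dataset name and path are placeholders):

```sh
# -O resolves a path inside the dataset to its object and dumps it;
# enough -d levels then print each block's lsize/psize/asize
zdb -O newpool/data path/inside/dataset/to/file
zdb -ddddd newpool/data 12345   # 12345 = object number reported by -O above (placeholder)
```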