
Linux file systems comparison. Windows file systems

The Linux operating system supports a wide variety of file system types. From a Linux point of view, file systems can be roughly divided into four groups:

  • "Native" file systems. This means that the file system supports all the attributes inherent in Linux: access rights, timestamps, information about the owner of the file, etc.;
  • Non-native file systems. That is, file systems that do not support Linux attributes;
  • Virtual. These are filesystems that do not have physical media;
  • Network file systems.

Native file systems include:

  • reiserfs

Ext2 file system

Ext2 is one of the first file systems used in Linux. (More precisely, the first Linux file system was minix, but its capabilities were very limited, and it was used only in the early stages of Linux development.) It was introduced in 1993. The system is considered very reliable and time-tested. But since ext2 was developed at a time when a 300 MB hard drive was considered very large, it has some limitations. It makes little sense to use it on large partitions, and it slows down when a partition holds a large number of files. That is, ext2 is considered slow. (Slow is a very relative term: ext2 is considered slow on Linux, but compared to the standard FreeBSD file system, ext2 is very fast.) Of course, as disks grew and new trends emerged, the file system was changed to improve its performance and functionality; POSIX ACL support is one example. Still, it never received the kind of fundamental changes that would let you say:

Yes, this is the only file system that suits me completely.

In addition, ext2 has serious limitations:

  • The maximum file size is 2048 GB.
  • The maximum file system size is 32768 GB.
  • The maximum number of subdirectories in one directory is 31998 (a consequence of the link-count limit of 32000).

Journaled file systems

Nowadays the ext2 file system is hardly used anymore. The reason is not its limitations; ext2 is a fairly reliable file system. It is all about how fast Linux servers boot. A server needs to run constantly, but miracles do not happen, and sometimes it has to be rebooted. Your task is to make sure that after a system crash it comes back up as quickly as possible. When the server is turned on, the disks are checked, and checking file systems, especially large ones, takes a long time. If there are several such file systems, the check can take a very long time. And the server should be working!

Journaling file systems have been developed to reduce the time spent on checking and to increase reliability. If you have worked with databases, you probably know such a thing as a transaction. Several SQL statements are combined into a transaction. The system must execute all statements. If at least one of them fails, then the system rolls back to the beginning of the transaction. If the system was shut down while a transaction was in progress, at power up, if possible, it tries to execute the remaining statements or return to the beginning of the transaction.

Transaction log support has been added to modern file systems. From the point of view of the file system operation, all operations with the file look like one transaction. If you take a closer look at file operations in Linux, writing or modifying a file is a rather complex procedure, consisting of many actions with data on the disk. When using the transaction log, before any physical changes are made to disk, a new transaction is opened in the log, which will record all the actions that will be performed on the file system. And only after the transaction is saved to disk, changes will be made in the file system.

If the file system is unmounted incorrectly, the checker first looks at the transaction log and, based on the data in it, tries either to return (roll back) the file system to its state at the time the transaction started or, if possible, to complete the actions described in the transaction. Since the journal is small (in the ext3 file system it is 32 MB), recovery of the file system is significantly faster.
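The log-first, apply-later idea can be sketched in a few lines of Python. This is a toy model only, not how ext3 or any real journal is laid out: the "disk" is a dictionary, the "journal" is a list, and the names `apply_transaction`, `recover`, and `crash_after` are made up for the illustration.

```python
# Toy model of a write-ahead journal; every name here is hypothetical.

def apply_transaction(disk, journal, txn_id, changes, crash_after=None):
    """Log the changes first, then apply them to 'disk' (a dict of block -> data).

    crash_after simulates power loss: "log" stops before the commit record,
    "commit" stops after the commit record but before the blocks are updated.
    """
    journal.append(("begin", txn_id, dict(changes)))
    if crash_after == "log":
        return
    journal.append(("commit", txn_id))
    if crash_after == "commit":
        return
    disk.update(changes)
    journal.append(("done", txn_id))


def recover(disk, journal):
    """Finish committed-but-unapplied transactions and drop uncommitted ones."""
    pending, committed, finished = {}, set(), set()
    for record in journal:
        kind, txn_id = record[0], record[1]
        if kind == "begin":
            pending[txn_id] = record[2]
        elif kind == "commit":
            committed.add(txn_id)
        elif kind == "done":
            finished.add(txn_id)
    for txn_id, changes in pending.items():
        if txn_id in committed and txn_id not in finished:
            disk.update(changes)  # complete the interrupted transaction
        # Uncommitted transactions never touched the disk, so rolling
        # them back simply means ignoring them.
    journal.clear()


disk = {"inode_7": "old", "block_42": "old"}
journal = []
apply_transaction(disk, journal, 1,
                  {"inode_7": "new", "block_42": "new"}, crash_after="commit")
recover(disk, journal)
print(disk)
```

Recovery only has to scan the short journal, not the whole disk, which is where the time saving comes from.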

Ext3 filesystem

When the need arose for a journaled file system in Linux, Red Hat developed ext3. Red Hat took the path of least resistance: they took the well-known ext2 as a basis and added journal support.

Ext2 is physically identical to ext3. This feature made it possible to use the same utilities (creating, checking and configuring file systems) for working with ext3, as for working with ext2.

Despite the addition of a journal, ext3 is faster than ext2. Another advantage of ext3 is that it can journal not only the metadata operations but also the file data itself, which other journaling file systems do not allow. This feature makes ext3 very reliable.

Ext3 supports three modes of operation:

  • Writeback - file data is not journaled in this mode. Only the metadata (file inodes, block pointers) is placed in the journal, and no ordering is enforced between journal commits and data writes, so after a crash a file may contain stale data.
  • Ordered (default mode) - as in writeback mode, only metadata is journaled. The difference is that the data blocks are written to the file system before the metadata describing them is committed to the journal, so committed metadata never points to blocks that have not been written.
  • Journal - full journaling mode. Both metadata and data are first written to the journal, and only after that is the file system itself changed.

ReiserFS file system

ReiserFS was developed by Hans Reiser and his company Namesys (http://www.namesys.com). It is a very fast file system, well suited to storing large numbers of small files.

It solved the problem of storing small files on disk. In ext2/3, for example, a whole block is used on disk to hold a file containing a single character. An ext2/3 block can be 1 to 8 KB (the size depends on the size of the file system). In ReiserFS, data from several files can be placed in one block. Moreover, if a file is very small, its data can be stored in the inode, that is, directly in the metadata.
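To get a feel for how much space whole-block allocation costs, here is a rough calculation in Python. It assumes a 4 KB block, one of the sizes in the range mentioned above, and the helper names `allocated_bytes` and `wasted_bytes` are invented for this sketch.

```python
def allocated_bytes(file_size, block_size=4096):
    """Space taken on disk when every file occupies whole blocks."""
    blocks = (file_size + block_size - 1) // block_size  # round up
    return blocks * block_size


def wasted_bytes(file_sizes, block_size=4096):
    """Total slack (allocated minus actual) over a set of files."""
    return sum(allocated_bytes(size, block_size) - size for size in file_sizes)


sizes = [1, 100, 5000]          # file sizes in bytes
print(allocated_bytes(1))       # a one-character file still takes 4096 bytes
print(wasted_bytes(sizes))      # 4095 + 3996 + 3192 = 11283 bytes of slack
```

With thousands of tiny files the slack adds up quickly, which is the problem that packing several files into one block is meant to address.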

The file system is based on optimized trees (B-trees). This speeds up searching in the file system and removes the limit on the number of files and directories in a directory.

This file system also handles large files very confidently.

ReiserFS 3.6 has the following limitations:

  • The maximum file size is 8 TB (for 32-bit computers);
  • The maximum file system size is 16 TB.

The next, fourth version of ReiserFS is now being developed. It is expected to be included in kernel 2.6.17 or 2.6.18.

JFS file system

This file system was developed by IBM and is licensed under the GNU GPL. A description of JFS can be found on the Internet. JFS is used not only on Linux but also on other operating systems such as AIX and OS/2.

JFS is a Journaled File System. Its main strong point is its use in conjunction with LVM (Logical Volume Manager). LVM allows you to combine multiple physical hard disk partitions into one logical one, which can then be partitioned like a regular hard disk. At the same time, some types of LVM allow you to connect new disk space on the fly. And if you use the ext3 file system on growing partitions, one day you will receive a message about the impossibility of creating a new file. The fact is that when formatting a partition in ext3, a finite number of inodes are reserved in it in advance, depending on the size. That is, the maximum number of files is known in advance. If the size of the file system does not increase, then this number of inodes is sufficient for normal operation. JFS has the ability to dynamically grow the file system and the number of inodes. Thanks to this property, when the size of the file system increases, there is no limit on the number of files created.

The JFS file system has the following restrictions:

  • The maximum file size is limited by the bitness of the operating system.
  • The maximum file system size is 512 TB.

XFS file system

The XFS file system was developed by SGI (formerly Silicon Graphics, Inc.). XFS appeared in 1994 and originally shipped with the IRIX operating system. SGI is renowned for its video production workstations and storage servers, so the file system is optimized to serve a large number of huge files and to support large directories. Thanks to its structure, it also copes well with a large number of small files. In speed it is comparable to ReiserFS, and in reliability it surpasses Hans Reiser's file system. (I have lost a great deal of data on ReiserFS for no apparent reason; only backups saved me. That is why I no longer use ReiserFS on servers.)

Large file support is possible because XFS is a 64-bit file system. And the speed of the file system is achieved by using B+ trees to find and describe internal structures.

The internal structure of the file system is quite complex, and I see no point in describing it briefly, especially since there are good articles on the Internet that cover XFS in detail.

Microsoft file systems

As far as Microsoft file systems are concerned, Linux supports FAT and NTFS. With FAT everything is very simple: the structure of the file system is well known, so it is fully supported in Linux. The only thing to keep in mind is that Linux has two FAT drivers:

  • msdos - FAT with short (8.3) file names only.
  • vfat - FAT with long file name support; it handles FAT12, FAT16, and FAT32 volumes.

FAT support should be enabled if you intend to use floppy disks and various USB storage devices: flash cards, hard drives, etc. The point is that they are all usually formatted in FAT.

NTFS is a little more complicated. This file system is normally mounted read-only, and using it in write mode is not recommended. Write mode is supported, but if you read the documentation for the NTFS driver, you will see it stated in capital letters: in write mode you may only change the contents of existing files; under no circumstances should you create new files or delete or resize existing ones, because this can destroy the file system.

The iso9660 and udf file systems

These file systems are used to store information on CDs and DVDs.

Iso9660 was originally a very simple file system with many limitations, for example MS-DOS-style file names and a limit on directory nesting depth. Several extensions have therefore been written for iso9660 to expand its capabilities, including ones that preserve UNIX file attributes. All of these extensions are supported by the file system driver, so you should not run into difficulties. Moreover, the iso9660 driver supports, oddly enough, write mode, which is used to create CD-ROM images.

No special problems were noticed with udf either. Thus, working with CDs and DVDs is supported in Linux without any restrictions.

The proc filesystem

The proc file system belongs to the category of virtual file systems and is a very useful one. As an administrator, you will turn to it very often. In one of the first chapters, on the organization of the Linux file system, I briefly discussed its purpose. As a reminder, the files in the /proc directory are a mapping of the kernel data area onto the file system. That is, when you look at the contents of a file, you actually see a certain part of the kernel data area.

Below are some interesting files that you can find in the /proc directory. The contents of the files on your system will differ from those shown in the examples.

/proc/cmdline

Contains the command line passed to the kernel when it was started.

    # cat cmdline
    BOOT_IMAGE=Linux-2613 ro root=303
    #

/proc/cpuinfo

Information about the processor or processors.

    # cat cpuinfo
    processor       : 0
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 9
    model name      : Intel(R) Pentium(R) M processor 1400MHz
    stepping        : 5
    cpu MHz         : 1399.050
    cache size      : 1024 KB
    fdiv_bug        : no
    hlt_bug         : no
    f00f_bug        : no
    coma_bug        : no
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 2
    wp              : yes
    flags           : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 tm pbe est tm2
    bogomips        : 2800.9
    #

/proc/devices

List of devices.

    # cat devices
    Character devices:
      1 mem
      2 pty
      3 ttyp
      4 /dev/vc/0
      4 tty
      4 ttyS
      5 /dev/tty
      5 /dev/console
      5 /dev/ptmx
      7 vcs
     10 misc
     13 input
     14 sound
     21 sg
    116 alsa
    128 ptm
    136 pts
    171 ieee1394
    180 usb
    226 drm
    254 pcmcia

    Block devices:
      3 ide0
      7 loop
      8 sd
     11 sr
     65 sd
    #

/proc/dma

DMA channels in use.

    # cat dma
     4: cascade
    #

/proc/filesystems

List of supported file systems.

    # cat filesystems
    nodev   sysfs
    nodev   rootfs
    nodev   bdev
    nodev   proc
    nodev   sockfs
    nodev   pipefs
    nodev   futexfs
    nodev   tmpfs
    nodev   inotifyfs
    nodev   eventpollfs
    nodev   devpts
            ext3
            ext2
    nodev   ramfs
            msdos
            vfat
            iso9660
            ntfs
            udf
    nodev   mqueue
    nodev   usbfs
    #

/proc/interrupts

Distribution of interrupts.

    # cat interrupts
               CPU0
      0:     850627          XT-PIC  timer
      1:       9691          XT-PIC  i8042
      2:          0          XT-PIC  cascade
      7:          2          XT-PIC  parport0
      8:          1          XT-PIC  rtc
      9:       6620          XT-PIC  acpi
     11:     238626          XT-PIC  Intel 82801DB-ICH4, yenta, yenta, eth0, eth1, ohci1394, ehci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb3, uhci_hcd:usb4, [email protected]:0000:01:00.0
     12:      65575          XT-PIC  i8042
     14:      11538          XT-PIC  ide0
    NMI:          0
    LOC:          0
    ERR:          0
    MIS:          0
    #

/proc/modules

List of loaded modules.

    # cat modules
    irtty_sir 5248 0 - Live 0xf8a09000
    sir_dev 13548 1 irtty_sir, Live 0xf8a1d000
    irda 107768 1 sir_dev, Live 0xf8a3f000
    crc_ccitt 1792 1 irda, Live 0xf8a04000
    parport_pc 24324 0 - Live 0xf8a16000
    parport 30920 1 parport_pc, Live 0xf8a0d000
    uhci_hcd 30416 0 - Live 0xf89e7000
    ehci_hcd 27656 0 - Live 0xf897a000
    usbcore 103740 3 uhci_hcd,ehci_hcd, Live 0xf8990000
    ohci1394 31092 0 - Live 0xf895e000
    ieee1394 86392 1 ohci1394, Live 0xf891e000
    ipw2100 78204 0 - Live 0xf8936000
    ieee80211 18948 1 ipw2100, Live 0xf8918000
    ieee80211_crypt 4488 1 ieee80211, Live 0xf88f8000
    eepro100 26512 0 - Live 0xf8909000
    pcmcia 30568 4 - Live 0xf8900000
    firmware_class 7680 2 ipw2100,pcmcia, Live 0xf88f2000
    yenta_socket 20748 4 - Live 0xf8879000
    rsrc_nonstatic 11264 1 yenta_socket, Live 0xf88756 pcma_cent2

/proc/mounts

Contains a list of mounted file systems.

    # cat mounts
    rootfs / rootfs rw 0 0
    /dev/root / ext3 rw 0 0
    proc /proc proc rw,nodiratime 0 0
    sysfs /sys sysfs rw 0 0
    none /dev ramfs rw 0 0
    /dev/hda5 /usr ext3 rw 0 0
    /dev/hda6 /home ext3 rw 0 0
    /dev/hda1 /mnt/win ntfs ro,noatime,nodiratime,uid=0,gid=0,fmask=0177,dmask=077,nls=iso8859-1,errors=continue,mft_zone_multiplier=1 0 0
    devpts /dev/pts devpts rw 0 0
    usbfs /proc/bus/usb usbfs rw 0 0
    #
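Because each mount is one line of whitespace-separated fields (device, mount point, type, options, and two numbers), the file is easy to process in a script. The Python sketch below works on a shortened copy of the listing above rather than on the real file, and the function names `parse_mounts` and `read_only_mounts` are my own, chosen only for the illustration.

```python
SAMPLE = """\
rootfs / rootfs rw 0 0
/dev/root / ext3 rw 0 0
proc /proc proc rw,nodiratime 0 0
/dev/hda6 /home ext3 rw 0 0
/dev/hda1 /mnt/win ntfs ro,noatime,nodiratime,uid=0,gid=0 0 0
"""


def parse_mounts(text):
    """Turn /proc/mounts-style text into a list of dictionaries."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        entries.append({
            "device": device,
            "mount_point": mount_point,
            "fs_type": fs_type,
            "options": options.split(","),
        })
    return entries


def read_only_mounts(entries):
    """Mount points whose option list contains 'ro'."""
    return [e["mount_point"] for e in entries if "ro" in e["options"]]


entries = parse_mounts(SAMPLE)
print(read_only_mounts(entries))                       # ['/mnt/win']
print([e["mount_point"] for e in entries if e["fs_type"] == "ext3"])
```

On a live system the same functions can be fed the text read from /proc/mounts.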

/proc/partitions

Contains a list of partitions for all connected drives.

    # cat partitions
    major minor  #blocks  name
       3     0   58605120 hda
       3     1   10485688 hda1
       3     2     506520 hda2
       3     3    9775080 hda3
       3     4          1 hda4
       3     5    9775048 hda5
       3     6   28062688 hda6
    #

/proc/pci

List of devices detected on the PCI bus.

This file can be used to diagnose why some devices do not work. Pay attention to the IRQ: if it is 0, the device has not been allocated an interrupt for some reason. I will not give the full contents of this file; it is very large.

    # cat pci
    PCI devices found:
      Bus  0, device   0, function  0:
        Host bridge: Intel Corporation 82855PM Processor to I/O Controller (rev 3).
          Prefetchable 32 bit memory at 0xd0000000.
      Bus  0, device   1, function  0:
        PCI bridge: Intel Corporation 82855PM Processor to AGP Controller (rev 3).
          Master Capable.  Latency=96.  Min Gnt=12.
      Bus  0, device  29, function  0:
        USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 1).
          IRQ 11.
          I/O at 0x1800.
    #

/proc/swaps

Contains a list of connected swap files and partitions.

    # cat swaps
    Filename        Type            Size    Used    Priority
    /dev/hda2       partition       506512  0       -1
    #

/proc/version

Contains information about the version of the operating system and Linux kernel.

    # cat version
    Linux version 2.6.13-rc3-my ([email protected]) (gcc version 3.3.6) #3 Tue Jul 19 22:25:23 GMT+3 2005
    #

Process information

In addition to files, /proc contains directories whose names are numbers. Each such directory describes the process whose PID matches the directory name, and the files in it describe the parameters of that process. The contents of one of these directories are shown below.

    # ls /proc/4624
    auxv     cwd@     exe@  maps  mounts   oom_score  seccomp  statm   task/
    cmdline  environ  fd/   mem   oom_adj  root@      stat     status  wchan
    #

Only a few of the files in the example contain information that would be understandable without preprocessing.
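Since process directories are recognizable purely by their numeric names, a script can pick out the PIDs from a directory listing with a simple filter. The Python sketch below uses a made-up list of names instead of a real listing, and `process_dirs` is a hypothetical helper name.

```python
def process_dirs(names):
    """Keep only the entries whose name is a decimal number, i.e. a PID."""
    return sorted(int(name) for name in names if name.isdecimal())


# A hand-written stand-in for the entries found in /proc.
listing = ["4624", "cpuinfo", "1", "mounts", "sys", "4510", "version"]
print(process_dirs(listing))   # [1, 4510, 4624]
```

On a Linux machine the input could come from `os.listdir("/proc")`.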

cmdline

Contains command line arguments.

    # cat cmdline
    -su
    #

environ

Contains the values of the environment variables of the process.

    # cat environ
    HZ=100TERM=xtermPATH=/usr/local/sbin:/usr/local/bin:/sbin:/usr/sbin:/bin:/usr/binHOME=/rootSHELL=/bin/bashUSER=rootLOGNAME=rootMAIL=/var/spool/mail/root
    #
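The variables appear to run together because the entries in this file are separated by NUL bytes rather than newlines, and the terminal does not display them. A small Python sketch shows how the records can be split apart; it works on a hand-made byte string modeled on the output above, and `parse_environ` is a name invented for this example.

```python
def parse_environ(raw):
    """Split the NUL-separated KEY=VALUE records of an environ file."""
    env = {}
    for record in raw.split(b"\0"):
        if not record:
            continue  # the data usually ends with a trailing NUL
        key, _, value = record.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


sample = b"HZ=100\0TERM=xterm\0HOME=/root\0SHELL=/bin/bash\0"
env = parse_environ(sample)
print(env["TERM"])    # xterm
print(sorted(env))    # ['HOME', 'HZ', 'SHELL', 'TERM']
```

To read a real process, open its environ file in binary mode and pass the bytes to the same function.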

status

Contains information about the state of the process in a human-readable format.

    # cat status
    Name:   bash
    State:  S (sleeping)
    SleepAVG:       98%
    Tgid:   4510
    Pid:    4510
    PPid:   4498
    TracerPid:      0
    Uid:    0       0       0       0
    Gid:    0       0       0       0
    FDSize: 256
    Groups: 0 1 2 3 4 6 10 11
    VmSize:     2832 kB
    VmLck:         0 kB
    VmRSS:      1724 kB
    VmData:      388 kB
    VmStk:        88 kB
    VmExe:       628 kB
    VmLib:      1628 kB
    VmPTE:        12 kB
    Threads:        1
    SigQ:   0/7168
    SigPnd: 0000000000000000
    ShdPnd0000 : 0000000000010000
    SigIgn: 0000000000384004
    SigCgt: 000000004b813efb
    CapInh: 0000000000000000
    CapPrm: 00000000fffffeff
    CapEff: 00000000fffffeff
    #

Other directories

In addition to the directories describing system processes, /proc may contain other directories. The purpose of some of them is given below:

  • ide - information about devices connected to the IDE interface.
  • irq - information about the distribution of interrupts.
  • net - information about the network: the contents of the ARP table and the routing table, statistics on network interfaces and protocols, and so on.
  • scsi - information about SCSI devices.
  • sys - tunable system parameters.

/proc/sys

The /proc/sys file system is a big topic of its own. Using the files in this directory, you can change system parameters on the fly: it is enough to write the desired value to the appropriate file. I will not describe /proc/sys here; there is too much information, and you need to know too much to understand what the files are for. Instead, I refer you to the documentation for this file system.
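As a rough illustration of "write the desired value to a file", here is a Python sketch. Parameters are commonly written in dotted form, such as net.ipv4.ip_forward, which corresponds to a path under /proc/sys. The helper names `sysctl_path`, `read_param`, and `write_param` are hypothetical, and the simple dot-to-slash mapping is an assumption that breaks for names whose components themselves contain dots (some network interface names, for instance).

```python
from pathlib import PurePosixPath


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted name such as 'net.ipv4.ip_forward' to its file path."""
    return str(PurePosixPath(root, *name.split(".")))


def read_param(name, root="/proc/sys"):
    """Return the current value of a parameter as a string."""
    with open(sysctl_path(name, root)) as f:
        return f.read().strip()


def write_param(name, value, root="/proc/sys"):
    """Write a new value; on a real system this needs root privileges."""
    with open(sysctl_path(name, root), "w") as f:
        f.write(f"{value}\n")


print(sysctl_path("net.ipv4.ip_forward"))   # /proc/sys/net/ipv4/ip_forward
```

The `root` argument exists only so the functions can be tried against an ordinary directory before touching the live system; a value written to the real /proc/sys takes effect immediately and is lost on reboot.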

Sysfs is used by udev to dynamically create device files.

Hello, readers of my site! I want to tell you about existing and new file systems and help you choose the right one. After all, the choice affects speed, comfort, and even your health: I doubt you enjoy it when the computer freezes or slows down, and it certainly gets on your nerves 🙂

What is a file system and what is it for?

In simple terms, it is a system for storing files and folders on a hard drive or other media: a flash drive, phone, camera, and so on. It also organizes files and folders: moving, copying, and renaming them. This system is responsible for all your files, which is why it is so important.

If you choose the wrong file system, your computer can malfunction, freeze, or hang, data can transfer slowly, and, worse still, data corruption is possible. You are lucky if it is not system files that get damaged, but the damage will show up sooner or later. Most importantly, if your computer is slow for this reason, no amount of junk cleaning will help!

Types of file systems

Many file systems are a thing of the past, and some are on their last legs, because technology advances every day. A completely new file system, which may well be the future, is already on the way! Let's see where it all started.

FAT12

FAT stands for File Allocation Table. At first the file system was 12-bit and used a maximum of 4096 clusters. It was developed a very long time ago, back in the days of DOS, and was used for floppy disks and small drives up to 16 MB in size. It was later replaced by the more advanced FAT16.

FAT16

This file system could already address 65525 clusters and supported disks up to 4.2 GB, which was a luxury at the time, so it served well back then. But a file could not exceed 2 GB, and in terms of economy it is not the best option: the larger the volume, the larger the cluster, and the more space each file wastes. It is therefore not worthwhile to use it on volumes larger than 512 MB. The table shows how the cluster size depends on the size of the media.
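The volume limits follow from simple arithmetic: the number of addressable clusters multiplied by the cluster size. The Python sketch below reproduces the figures quoted above. The cluster sizes (4 KiB for FAT12, 32 and 64 KiB for FAT16) are my assumptions for the example, and `max_volume_bytes` is a made-up helper name.

```python
KIB = 1024


def max_volume_bytes(cluster_count, cluster_size):
    """Largest data area addressable with the given clusters."""
    return cluster_count * cluster_size


fat12_clusters = 2 ** 12      # 4096, the 12-bit ceiling
fat16_clusters = 65525        # the figure quoted in the text

# FAT12 with 4 KiB clusters: 16 MiB
print(max_volume_bytes(fat12_clusters, 4 * KIB) // KIB ** 2)

# FAT16 with 32 KiB clusters: just under 2 GiB
print(round(max_volume_bytes(fat16_clusters, 32 * KIB) / KIB ** 3, 2))

# FAT16 with 64 KiB clusters: about 4.29 billion bytes
print(round(max_volume_bytes(fat16_clusters, 64 * KIB) / 10 ** 9, 2))
```

The same multiplication explains the economy problem: with 32 KiB clusters even a one-byte file takes up 32 KiB on disk.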

Although the system coped at that time, a number of shortcomings appeared in the future:

1. You cannot work with hard disks over 8 GB.

2. You cannot create files larger than 2 GB.

3. The root folder cannot contain more than 512 items.

4. Inability to work with disk partitions larger than 2 GB.

FAT32

Technology does not stand still: over time FAT16 was no longer enough, and FAT32 came to replace it. This system could already support disks up to 2 terabytes (2048 gigabytes) and used disk space more economically thanks to smaller clusters. Another plus is that there is no limit on the number of files in the root folder, and it is more reliable than previous versions. But its biggest disadvantage today is that files can get damaged, and you are lucky if that leads to nothing worse. The second major disadvantage is that files now often exceed 4 GB, and the system does not support a single file larger than that. That is why users often ask why they cannot download a 7 GB movie even though 100 GB is free on the disk: this limit is the whole problem.

So there are plenty of cons here too:

1. Files larger than 4GB are not supported by the system.

2. The system is prone to file fragmentation, which causes the system to slow down.

3. Files are prone to corruption.

4. Disks larger than 2 TB already exist, and the system cannot handle them.

NTFS

It was replaced by a new system, NTFS (New Technology File System), in which a number of shortcomings were removed, although it has enough drawbacks of its own. This is the latest established system, apart from the new one that I discuss below. It appeared back in the 90s, became the standard in 2001 with the release of Windows XP, and is used to this day. It supports disks up to 18 TB, cool, huh? When files are fragmented, the loss of speed is not so noticeable. Reliability has reached a good level as well: in the event of a failure, data corruption is unlikely.

There are minuses here too:

1. RAM consumption: if you have less than 64 MB of RAM, installing it is not recommended.

2. When only 10% of free space remains on the hard disk, the system starts to slow down noticeably.

3. Working with small-capacity drives can be difficult.

New ReFS

The brand new ReFS (Resilient File System) is a fault-tolerant file system developed for the new Windows operating system, and it may well be the future! According to the developers, the system should be extremely reliable and, soon after it is finalized, will be supported on other operating systems. Here is a table of differences:

As you can see, the new system supports larger disks and more characters in paths and file names. It promises to be more reliable, with a minimum of failures thanks to the new architecture and a different way of writing the journal. So far only the pros are visible, and how true they are is not yet known. Once it is fully established, a number of cons may well appear, but for now this remains a mystery. Let's hope the new file system brings us only positive experiences.

Which file system should you choose?

On a well-performing computer it is better to use NTFS; it will be more productive and more reliable for this purpose. It is not recommended on computers with a hard disk smaller than 32 GB and less than 64 MB of RAM. Good old FAT32 can go on a small flash drive, where its performance can be higher. One more thing: after formatting a flash drive for a phone, digital camera, or other electronic device as NTFS, you may get errors, because some devices do not support NTFS or are slow and crash with it. So before formatting, find out which file system is best for your device.

There are other file systems as well, for example for Linux: XFS, ReiserFS (Reiser3), JFS (Journaled File System), ext (extended filesystem), ext2 (second extended file system), ext3 (third extended filesystem), Reiser4, ext4, Btrfs (B-tree FS or Butter FS), Tux2, Tux3, Xiafs, ZFS (Zettabyte File System), but that is a completely different story ...

Why might a smartphone refuse to run programs from a memory card? How is ext4 fundamentally different from ext3? Why will a flash drive live longer if it is formatted as NTFS rather than FAT? What is the main problem with F2FS? The answers lie in the structural features of file systems, and that is what we will talk about.

Introduction

File systems determine how data is stored. They determine what restrictions the user will face, how fast read and write operations will be, and how long the drive will work without failures. This is especially true of budget SSDs and their younger brothers - flash drives. Knowing these features, you can squeeze the maximum out of any system and optimize its use for specific tasks.

You have to choose the type and parameters of the file system every time you need to do something non-trivial. For example, you want to speed up the most frequent file operations. At the file system level, this can be accomplished in a number of ways: indexing will provide fast searches, and pre-reservation of free blocks will make it easier to overwrite frequently changing files. Optimizing the data in RAM beforehand will reduce the amount of I/O required.

Features of modern file systems such as lazy writing, deduplication, and other advanced algorithms help extend uptime. They are especially relevant for cheap SSDs with TLC memory chips, flash drives and memory cards.

Separate optimizations exist for disk arrays of different tiers: for example, the file system can support lightweight volume mirroring, snapshots, or dynamic scaling without taking a volume offline.

Black box

Users mainly work with the file system offered by the operating system by default. They rarely create new disk partitions and even less often think about their settings - they just use the recommended parameters or even buy pre-formatted media.

For Windows fans, everything is simple: NTFS on all disk partitions and FAT32 (or the same NTFS) on flash drives. If there is a NAS that uses some other file system, most users never notice. They simply connect to it over the network and download files, as if from a black box.

On Android mobile gadgets, ext4 is most often found in internal memory and FAT32 on microSD cards. Apple users do not care at all what file system they have: HFS+, HFSX, APFS, WTFS... for them there are only beautiful folder and file icons drawn by the best designers. Linux users have the richest choice, but support for non-native file systems can be added to both Windows and macOS; more on that later.

Common roots

More than a hundred different file systems have been created, but only a little over a dozen can be called relevant. Although each was designed for its own specific applications, many ended up being conceptually related. They are similar because they use the same type of structure to represent (meta)data: B-trees.

As with any hierarchical system, the B-tree starts at the root record and further branches down to the final elements - individual records about files and their attributes, or "leaves". The main purpose of creating such a logical structure was to speed up the search for file system objects on large dynamic arrays - like hard drives of several terabytes or even more impressive RAID arrays.

B-trees require far fewer disk accesses than other types of balanced trees when performing the same operations. This is because the final objects in a B-tree all sit at the same height, and the cost of every operation is proportional to the height of the tree.

Like other balanced trees, B-trees have the same path length from root to any leaf. Instead of growing up, they branch more and grow more in width: all branch points in the B-tree store many references to child objects, making them easy to find in fewer calls. A large number of pointers reduces the number of the longest disk operations - head positioning when reading arbitrary blocks.
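A quick back-of-the-envelope calculation shows why wide branching matters. The Python sketch below counts how many levels an idealized, completely full tree needs to reach a given number of records; real B-trees are not perfectly full, and both the fan-out of 256 and the name `levels_needed` are assumptions made for the example.

```python
def levels_needed(records, fanout):
    """Smallest number of levels h such that fanout ** h >= records."""
    levels, capacity = 1, fanout
    while capacity < records:
        capacity *= fanout
        levels += 1
    return levels


n = 1_000_000
print(levels_needed(n, 2))     # 20 levels for a binary tree
print(levels_needed(n, 256))   # 3 levels with 256 pointers per node
```

If every level costs one disk access, finding a record among a million takes about three head movements instead of twenty.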

The concept of B-trees was formulated back in the seventies and has undergone various improvements since then. It is implemented in one form or another in NTFS, BFS, XFS, JFS, ReiserFS and many DBMSs. They are all cousins in terms of the basic principles of data organization. The differences concern details, which are often quite important. The disadvantage of related file systems is also common: they were all created to work with disks even before the advent of SSDs.

Flash memory as an engine of progress

Solid-state drives are gradually replacing hard disk drives, but so far they are forced to use inherited file systems that are alien to them. They are built on arrays of flash memory, whose operating principles differ from those of disk devices. In particular, flash memory must be erased before it is written, and in NAND chips this operation cannot be performed on individual cells. It is possible only for large blocks as a whole.

This limitation arises because in NAND memory all cells are combined into blocks, each of which has only one common connection to the control bus. We will not go into the details of page organization or describe the complete hierarchy. What matters is the principle of group operations on cells and the fact that flash memory blocks are usually larger than the blocks addressed by any file system. Therefore, all addresses and commands for drives with NAND flash must be translated through the FTL (Flash Translation Layer) abstraction layer.

Flash memory controllers provide compatibility with the logic of disk devices and support for commands of their native interfaces. Usually FTL is implemented in their firmware, but it can (partially) run on the host - for example, Plextor writes drivers for its SSDs that accelerate writing.

You cannot do without FTL at all, since even writing one bit to a specific cell leads to the launch of a whole series of operations: the controller searches for a block containing the required cell; the block is read in full, written to the cache or to free space, then erased entirely, after which it is rewritten back with the necessary changes.

This approach resembles everyday life in the army: to give an order to one soldier, the sergeant calls a general formation, calls the poor fellow out of the ranks, and orders the rest to disperse. In the now rare NOR memory, the organization was more like special forces: each cell was controlled independently (each transistor had an individual contact).
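The read, erase, and rewrite sequence described above can be modeled in a few lines of Python. This is a deliberately naive sketch: real controllers usually write the updated data to a different, already erased block, and the 128 KiB block size, the class `NandBlock`, and the function `rewrite_byte` are all assumptions made for the illustration.

```python
class NandBlock:
    """One erase block; its cells can only be reset all at once."""

    def __init__(self, size):
        self.cells = bytearray(b"\xff" * size)  # erased flash reads as 0xFF
        self.erase_count = 0
        self.bytes_programmed = 0

    def erase(self):
        self.cells = bytearray(b"\xff" * len(self.cells))
        self.erase_count += 1

    def program(self, data):
        self.cells[:] = data
        self.bytes_programmed += len(data)


def rewrite_byte(block, offset, value):
    """Change one byte the expensive way: read, modify, erase, program."""
    cached = bytearray(block.cells)   # 1. read the whole block into a cache
    cached[offset] = value            # 2. apply the change in the cache
    block.erase()                     # 3. erase the entire block
    block.program(cached)             # 4. write everything back


block = NandBlock(size=128 * 1024)
rewrite_byte(block, 10, 0x42)
print(block.erase_count, block.bytes_programmed)   # 1 131072
```

Changing a single byte cost one erase cycle and 131072 programmed bytes, which is the kind of overhead that wear-aware file systems and controllers try to avoid.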

The tasks for the controllers are increasing, since with each generation of flash memory, the technical process of its manufacture decreases in order to increase the density and reduce the cost of data storage. Along with technological standards, the estimated life of the chips is also reduced.

Modules with single-level (SLC) cells had a declared endurance of 100 thousand rewrite cycles or even more. Many of them still work in old flash drives and CF cards. Enterprise-class MLC (eMLC) claimed an endurance of 10 to 20 thousand cycles, while for ordinary consumer-grade MLC it is estimated at 3-5 thousand. Memory of this type is being actively displaced by the even cheaper TLC, whose endurance barely reaches a thousand cycles. Keeping the lifespan of flash memory at an acceptable level has to be done through software tricks, and new file systems are becoming one of them.

Initially, manufacturers assumed that the file system was unimportant. The controller itself must maintain a short-lived array of memory cells of any type, distributing the load between them in an optimal way. For the file system driver, it simulates a regular disk, and itself performs low-level optimizations on any access. However, in practice, optimization varies from magical to fictitious for different devices.

In corporate SSDs, the built-in controller is a small computer. It has a huge memory buffer (half a gig and more), and it supports many methods to improve the efficiency of working with data, which avoids unnecessary rewriting cycles. The chip arranges all the blocks in the cache, performs lazy writes, performs deduplication on the fly, reserves some blocks and clears others in the background. All this magic happens completely unnoticed by the OS, programs and the user. With an SSD like this, it really doesn't matter what filesystem is used. Internal optimizations have a much larger impact on performance and resource than external ones.

Budget SSDs (and even more so - flash drives) are equipped with much less intelligent controllers. The cache in them is truncated or absent, and advanced server technologies are not used at all. In memory cards, the controllers are so primitive that it is often claimed that they do not exist at all. Therefore, for cheap devices with flash memory, external load balancing methods remain relevant - primarily using specialized file systems.

JFFS to F2FS

One of the first attempts to write a file system that would take into account the principles of organization of flash memory was JFFS - Journaling Flash File System. Initially, this development by the Swedish company Axis Communications was focused on improving the memory efficiency of network devices that Axis produced in the nineties. The first version of JFFS only supported NOR memory, but already in the second version it became friends with NAND.

JFFS2 is of limited use right now. Mostly it is still used in Linux distributions for embedded systems. It can be found in routers, IP cameras, NAS, and other regulars on the Internet of Things. In general, wherever a small amount of reliable memory is required.

A further development effort for JFFS2 was LogFS, which stored inodes in a separate file. The authors of this idea are an employee of the German division of IBM Jorn Engel and a professor at the University of Osnabrück Robert Mertens. The source code for LogFS is available on GitHub. Judging by the fact that the last change in it was made four years ago, LogFS has not gained popularity.

But these attempts spurred the emergence of another specialized file system - F2FS. It was developed by Samsung Corporation, which accounts for a large part of the flash memory produced in the world. Samsung makes NAND Flash chips for its own devices and for other companies, and is also developing SSDs with fundamentally new interfaces instead of legacy disk ones. The creation of a specialized file system optimized for flash memory has been a long overdue necessity from Samsung's point of view.

Four years ago, in 2012, Samsung created F2FS (Flash Friendly File System). The idea is good, but the implementation turned out to be raw. The key task in creating F2FS was simple: to reduce the number of cell rewrite operations and distribute the load across cells as evenly as possible. This requires operating on several cells within the same block at once rather than hammering them one by one. In other words, what is needed is not an instant rewrite of existing blocks at the first request of the OS, but caching of commands and data, appending new blocks to free space, and delayed erasure of cells.
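The principle of appending new blocks to free space and postponing erasure can be shown with a small sketch. It is not the F2FS on-disk layout: the `LogStore` class and its fields are hypothetical and model only the general log-structured idea.

```python
class LogStore:
    """Toy append-only store: an update never overwrites a page in place."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages   # physical pages, None = free
        self.mapping = {}                 # logical block number -> physical page
        self.stale = set()                # pages waiting for a deferred erase
        self.next_free = 0

    def write(self, logical, data):
        if self.next_free >= len(self.pages):
            raise RuntimeError("no free pages; garbage collection needed")
        old = self.mapping.get(logical)
        if old is not None:
            self.stale.add(old)           # invalidate the old copy, do not erase yet
        self.pages[self.next_free] = data
        self.mapping[logical] = self.next_free
        self.next_free += 1

    def read(self, logical):
        return self.pages[self.mapping[logical]]


store = LogStore(num_pages=8)
store.write(0, "v1")
store.write(0, "v2")      # goes to a new page; page 0 becomes stale
store.write(1, "other")
print(store.read(0), sorted(store.stale), store.mapping)
```

Stale pages accumulate and can later be erased together, a whole block at a time, instead of triggering an erase on every update.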

Today, F2FS support is already officially implemented in Linux (and hence in Android), but it does not provide any special advantages in practice. The main feature of this file system (deferred overwrite) has led to premature conclusions about its effectiveness. The old caching trick even fooled early versions of benchmarks, where F2FS showed an apparent advantage not by a few percent (as expected) or even several times, but orders of magnitude. It's just that the F2FS driver reported the execution of an operation that the controller was just planning to do. However, if the real performance gain in F2FS is small, then cell wear will definitely be less than when using the same ext4. Those optimizations that a cheap controller cannot do will be performed at the level of the file system itself.

Extents and bitmaps

So far, F2FS is perceived as something exotic for geeks. Even Samsung's own smartphones still use ext4. Many consider ext4 a further development of ext3, but this is not entirely true. It is more of a revolution than a matter of breaking the 2 TB per file barrier and simply increasing other metrics.

When computers were large and files were small, addressing was easy. Each file was allocated a certain number of blocks, the addresses of which were entered into the correspondence table. This is how the ext3 filesystem worked, which is still in use today. But in ext4, a fundamentally different way of addressing appeared - extents.

Extents can be thought of as extensions of the inode: discrete sets of blocks, each addressed as a whole contiguous sequence. One extent can contain a whole medium-sized file, and for large files it is enough to allocate a dozen or two extents. This is much more efficient than addressing hundreds of thousands of small blocks of four kilobytes.
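To see the difference in bookkeeping, here is a minimal sketch that collapses a per-block list into (start, length) pairs. The function name `blocks_to_extents` and the block numbers are made up for illustration; the actual ext4 extent tree is more elaborate.

```python
def blocks_to_extents(blocks):
    """Collapse block numbers (in file order) into (start, length) extents."""
    extents = []
    for b in blocks:
        if extents and extents[-1][0] + extents[-1][1] == b:
            start, length = extents[-1]
            extents[-1] = (start, length + 1)   # block continues the current run
        else:
            extents.append((b, 1))              # block starts a new run
    return extents


# A 1 MiB file in 4 KiB blocks, stored as two contiguous runs on disk.
blocks = list(range(1000, 1200)) + list(range(5000, 5056))
extents = blocks_to_extents(blocks)
print(len(blocks), extents)
```

Here 256 separate block addresses shrink to two entries, and the saving grows with the size of the file as long as it stays mostly contiguous.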

The write mechanism itself has also changed in ext4. Blocks are now allocated all at once, in a single request, and not in advance but just before the data is written to disk. Delayed multi-block allocation gets rid of the unnecessary operations that ext3 was guilty of: there, blocks for a new file were allocated immediately, even if the file fit entirely in the cache and was going to be deleted as temporary.


FAT restricted diet

Besides balanced trees and their modifications, there are other popular logical structures. There are file systems with a fundamentally different type of organization - for example, linear. You probably use at least one of them a lot.

Mystery

Guess the riddle: at twelve she began to gain weight, by sixteen she was a stupid fat woman, and by thirty-two she became fat, and remained a simpleton. Who is she?

That's right, this is a story about the FAT file system. Compatibility requirements gave her a bad inheritance. On floppy disks it was 12-bit, on hard disks - at first it was 16-bit, and to this day it has come down as 32-bit. In each subsequent version, the number of addressable blocks increased, but in the very essence nothing changed.

The still popular FAT32 file system appeared twenty years ago. Today it is still primitive and does not support ACLs, disk quotas, background compression, or other modern data optimization technologies.

Why is FAT32 needed these days? All the same for compatibility purposes only. Manufacturers rightly believe that any OS can read a FAT32 partition. Therefore, they create it on external hard drives, USB Flash and memory cards.

How to free up flash memory on your smartphone

MicroSD (HC) cards used in smartphones are formatted to FAT32 by default. This is the main obstacle to installing applications on them and transferring data from internal memory. To overcome it, you need to create an ext3 or ext4 partition on the card. All file attributes (including owner and access rights) can be transferred to it, so any application can work as if it was launched from internal memory.

Windows cannot create more than one partition on flash drives, but for this you can run Linux (at least in a virtual machine) or an advanced utility for working with logical partitioning - for example, MiniTool Partition Wizard Free. Having found an additional primary partition with ext3 / ext4 on the card, the Link2SD application and similar ones will offer much more options than in the case of a single FAT32 partition.


Another argument in favor of choosing FAT32 is often cited as its lack of journaling, which means faster writes and less wear on NAND Flash memory cells. In practice, the use of FAT32 leads to the opposite and gives rise to many other problems.

Flash drives and memory cards die quickly precisely because any change in FAT32 overwrites the same sectors, where the two copies of the file allocation table are located. Saved an entire web page? The tables were rewritten a hundred times, once for each small GIF added to the flash drive. Launched portable software? It created temporary files and keeps changing them while it runs. Therefore, it is much better to use NTFS on flash drives, with its fault-tolerant $MFT table. Small files can be stored directly in the master file table, and its extensions and copies are written to different areas of flash memory. In addition, NTFS indexing makes searches faster.

INFO

For FAT32 and NTFS, the theoretical nesting level limits are not specified, but in practice they are the same: only 7707 subdirectories can be created in the first-level directory. Lovers of nesting dolls will appreciate it.

Another problem that most users face is that a file larger than 4 GB cannot be written to a FAT32 partition. The reason is that in FAT32 the file size is stored in a 32-bit field of the directory entry, and 2^32 (minus one, to be precise) gives exactly four gigabytes. As a result, neither a movie in decent quality nor a DVD image can be written to a freshly purchased flash drive.
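The arithmetic behind the limit is easy to check. In this small sketch the helper `fits_on_fat32` is hypothetical, and the DVD size is the nominal 4.7 GB single-layer capacity, used only as an example.

```python
SIZE_FIELD_BITS = 32                      # width of the FAT32 file size field
MAX_FILE_SIZE = 2**SIZE_FIELD_BITS - 1    # largest size the field can hold, in bytes
GIB = 2**30


def fits_on_fat32(size_in_bytes):
    """Return True if a single file of this size can be stored on FAT32."""
    return size_in_bytes <= MAX_FILE_SIZE


dvd_image = 4_700_000_000                 # nominal single-layer DVD, bytes
print(MAX_FILE_SIZE, MAX_FILE_SIZE / GIB)
print(fits_on_fat32(dvd_image), fits_on_fat32(700 * 2**20))
```

The limit is one byte short of 4 GiB, so the DVD image is rejected while a 700 MiB CD image fits without trouble.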

Copying large files is still half the trouble: when you try to do this, the error is at least immediately visible. In other situations, FAT32 acts as a time bomb. For example, you copied portable software to a USB flash drive and at first you can use it without any problems. After a long time, one of the programs (for example, accounting or mail) has a database bloated, and ... it just stops updating. The file cannot be overwritten because it has reached the 4 GB limit.

A less obvious problem is that in FAT32 the modification time of a file or directory is stored with an accuracy of only two seconds. This is not sufficient for many cryptographic applications that use timestamps. The low precision of the date attribute is another reason why FAT32 is not considered a complete file system from a security point of view. However, its weaknesses can be used for your own purposes. For example, if you copy any files from an NTFS partition to a FAT32 volume, they will be stripped of all metadata, as well as inherited and explicitly set permissions. FAT simply does not support them.
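The two-second granularity follows from how the classic 16-bit DOS/FAT time field is packed: five bits for hours, six for minutes, and only five for seconds, which therefore store seconds divided by two. A minimal sketch of that packing (the function names are chosen for this example):

```python
from datetime import datetime


def encode_fat_time(dt):
    """Pack a time into the 16-bit FAT layout: hhhhh mmmmmm sssss (seconds // 2)."""
    return (dt.hour << 11) | (dt.minute << 5) | (dt.second // 2)


def decode_fat_time(value):
    """Unpack the 16-bit field back into (hour, minute, second)."""
    return (value >> 11) & 0x1F, (value >> 5) & 0x3F, (value & 0x1F) * 2


a = encode_fat_time(datetime(2016, 5, 1, 12, 30, 44))
b = encode_fat_time(datetime(2016, 5, 1, 12, 30, 45))
print(a == b, decode_fat_time(b))
```

Two writes one second apart end up with the same stored time, and the odd second is lost on decoding.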

exFAT

Unlike FAT12/16/32, exFAT was designed specifically for USB flash drives and large memory cards (≥ 32 GB). Extended FAT eliminates the aforementioned disadvantage of FAT32: overwriting the same sectors on any change. As a 64-bit system, it has practically no meaningful limits on the size of a single file. Theoretically, a file can be 2^64 bytes (16 EB) long, and cards of this size will not appear soon.

Another major difference in exFAT is its support for Access Control Lists (ACLs). This is no longer that simpleton from the nineties, but the closed format hinders the adoption of exFAT. exFAT support is fully and legally implemented only in Windows (starting from XP SP2) and OS X (starting from 10.6.5). On Linux and *BSD, it is supported either with restrictions or not entirely legally. Microsoft requires licensing to use exFAT, and there are many legal disputes in this area.

Btrfs

Another prominent example of B-tree file systems is called Btrfs. This FS appeared in 2007 and was originally created in Oracle with an eye to working with SSD and RAID. For example, it can be dynamically scaled: create new inodes on the live system, or split a volume into subvolumes without allocating free space to them.

The copy-on-write mechanism implemented in Btrfs and full integration with the Device mapper kernel module allow you to make almost instant snapshots via virtual block devices. Data precompression (zlib or lzo) and deduplication speed up basic operations while also extending the lifetime of flash memory. This is especially noticeable when working with databases (compression is achieved by 2–4 times) and small files (they are written in orderly large blocks and can be stored directly in the "leaves").
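Why copy-on-write makes snapshots almost instant can be shown with a toy model. It is only a sketch of the idea: the `CowVolume` class is hypothetical and uses plain dictionaries, whereas Btrfs shares B-tree nodes between the live volume and its snapshots.

```python
class CowVolume:
    """Toy copy-on-write volume: data blocks are never modified in place."""

    def __init__(self):
        self.blocks = {}    # block id -> data
        self.live = {}      # file name -> block id (the current view)
        self.next_id = 0

    def write(self, name, data):
        self.blocks[self.next_id] = data   # new data always goes to a new block
        self.live[name] = self.next_id     # only the pointer is switched
        self.next_id += 1

    def snapshot(self):
        return dict(self.live)             # copies the small name table, not the data

    def read(self, name, view=None):
        table = self.live if view is None else view
        return self.blocks[table[name]]


vol = CowVolume()
vol.write("db", "version 1")
snap = vol.snapshot()
vol.write("db", "version 2")
print(vol.read("db"), "|", vol.read("db", snap))
```

The snapshot keeps pointing at the old block, so taking it costs almost nothing and later writes do not disturb it.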

Btrfs also supports full journaling (data and metadata), volume checking without unmounting, and many other modern features. Btrfs code is published under the GPL license. This file system has been maintained as stable on Linux since kernel 4.3.1.

Flight logs

Almost all more or less modern file systems (ext3/ext4, NTFS, HFSX, Btrfs and others) belong to the general group of journaling file systems, since they record the changes being made in a separate log (journal) and consult it if a failure occurs during disk operations. However, the level of detail and fault tolerance of these file systems differs.

Ext3 supports three journaling modes: writeback, ordered, and full journaling. The first mode records only general changes (metadata), asynchronously with respect to changes in the data itself. In the second mode, the same metadata is recorded, but the data blocks are written to disk strictly before the corresponding metadata is committed. The third mode is full journaling (changes to both metadata and the files themselves).

Only the latter option ensures data integrity. The other two only speed up the identification of errors during the check and guarantee the restoration of the integrity of the file system itself, but not the contents of the files.

NTFS logging is similar to ext3's second logging mode. Only changes to the metadata are recorded in the log, and the data itself may be lost in the event of a failure. This NTFS journaling method was not conceived as a way to achieve maximum reliability, but only as a compromise between performance and fault tolerance. This is why people accustomed to working with fully journaling systems consider NTFS to be pseudo-journaled.

The NTFS approach is somewhat better than the default in ext3. In NTFS, checkpoints are additionally created periodically to ensure that all previously pending disk operations are completed. These checkpoints have nothing to do with restore points in \System Volume Information\. They are just service entries in the log.

Practice shows that such partial NTFS journaling is in most cases enough for trouble-free operation. After all, even in a sudden power outage, disk devices do not lose power instantly. The power supply unit and the numerous capacitors in the drives themselves provide just the minimum energy reserve needed to complete the current write operation. For modern SSDs, with their speed and efficiency, the same amount of energy is usually enough to complete pending operations. Switching to full journaling would reduce the speed of most operations several times over.

We connect third-party filesystems in Windows

The use of file systems is limited by their support at the OS level. For example, Windows does not understand ext2/3/4 and HFS+, but sometimes you need to use them. This can be done by adding the appropriate driver.

WARNING

Most drivers and plugins for supporting third-party file systems have their limitations and do not always work stably. They can interfere with other drivers, antivirus and virtualization programs.

Open driver for reading and writing ext2/3 partitions with partial ext4 support. The latest version supports extents and partitions up to 16 TB. LVM, ACLs and extended attributes are not supported.


There is a free plugin for Total Commander that supports reading ext2/3/4 partitions.


coLinux is an open source and free port of the Linux kernel. Together with a 32-bit driver, it allows you to run Linux on Windows 2000 through 7 without using virtualization technologies. Only 32-bit versions are supported; development of the 64-bit modification was canceled. Among other things, coLinux lets you access ext2/3/4 partitions from Windows. Project support was suspended in 2014.

Windows 10 may already have native support for Linux-specific file systems, it's just hidden. These thoughts are suggested by the kernel-level driver Lxcore.sys and the LxssManager service, which is loaded as a library by the Svchost.exe process. For more details, see Alex Ionescu's talk "The Linux Kernel Hidden Inside Windows 10", which he presented at Black Hat 2016.


ExtFS for Windows is a paid driver released by Paragon. It works on Windows 7 to 10 and supports read/write access to ext2/3/4 volumes. Provides almost complete ext4 support on Windows.

HFS+ for Windows 10 is another proprietary driver from Paragon Software. Despite the name, it works in all versions of Windows starting from XP. Provides full access to HFS+/HFSX file systems on disks with any partitioning scheme (MBR/GPT).

WinBtrfs is an early development of the Btrfs driver for Windows. Already in version 0.6, it supports both read and write access to Btrfs volumes. It is able to handle hard and symbolic links, supports alternate data streams, ACL, two types of compression and asynchronous read / write mode. So far WinBtrfs cannot use mkfs.btrfs, btrfs-balance and other utilities to maintain this file system.

File System Capabilities and Limitations: Pivot Table

| File system | Maximum volume size | Maximum size of one file | Length of file name | Length of full file name (including path from root) | Maximum number of files and/or directories | Accuracy of file/directory date | Access rights | Hard links | Symbolic links | Snapshots | Background data compression | Background data encryption | Data deduplication |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FAT16 | 2 GB in 512 byte sectors or 4 GB in 64 KB clusters | 2 GB | 255 bytes with LFN | - | - | - | - | - | - | - | - | - | - |
| FAT32 | 8 TB in 2 KB sectors | 4 GB (2^32 - 1 byte) | 255 bytes with LFN | up to 32 subdirectories with CDS | 65460 | 10 ms (create) / 2 s (change) | No | No | No | No | No | No | No |
| exFAT | ≈ 128 PB (2^32-1 clusters of 2^25-1 bytes) theoretical / 512 TB due to third-party limitations | 16 EB (2^64 - 1 byte) | | | 2796202 per directory | 10 ms | ACL | No | No | No | No | No | No |
| NTFS | 256 TB in 64 KB clusters or 16 TB in 4K clusters | 16 TB (Win 7) / 256 TB (Win 8) | 255 Unicode characters (UTF-16) | 32,760 Unicode characters, but no more than 255 characters per element | 2^32-1 | 100 ns | ACL | Yes | Yes | Yes | Yes | Yes | Yes |
| HFS+ | 8 EB (2^63 bytes) | 8 EB | 255 Unicode characters (UTF-16) | not limited separately | 2^32-1 | 1 sec | Unix, ACL | Yes | Yes | No | Yes | Yes | No |
| APFS | 8 EB (2^63 bytes) | 8 EB | 255 Unicode characters (UTF-16) | not limited separately | 2^63 | 1 ns | Unix, ACL | Yes | Yes | Yes | Yes | Yes | Yes |
| Ext3 | 32 TB (theoretical) / 16 TB in 4K clusters (due to limitations of e2fs programs) | 2 TB (theoretical) / 16 GB for older programs | 255 Unicode characters (UTF-16) | not limited separately | - | 1 sec | Unix, ACL | Yes | Yes | No | No | No | No |
| Ext4 | 1 EB (theoretical) / 16 TB in 4K clusters (due to limitations of e2fs programs) | 16 TB | 255 Unicode characters (UTF-16) | not limited separately | 4 billion | 1 ns | POSIX | Yes | Yes | No | No | Yes | No |
| F2FS | 16 TB | 3.94 TB | 255 bytes | not limited separately | - | 1 ns | POSIX ACL | Yes | Yes | No | No | Yes | No |
| BTRFS | 16 EB (2^64 - 1 byte) | 16 EB | 255 ASCII characters | 2^17 bytes | - | 1 ns | POSIX ACL | Yes | Yes | Yes | Yes | Yes | Yes |

(2010) with some additions and clarifications.

Journaling

Before talking about file systems, let's take a quick look at the concept of "journaling".

Journaling is used in one form or another in almost all modern file systems.

Logging is used only for write operations to disk, and is a kind of buffer for all such operations. This approach helps to solve problems that arise during a disk write operation in which the computer turns off, for example, due to a power outage. Without journaling, in such cases it is impossible to figure out which files were written and which were not or were partially written.

When using logging, the file is first written to the log (or "log"). After that, the file is written to the hard disk and then deleted from the log, after which the write operation is considered complete. If the power is turned off during recording, then after turning on the system, the file system can check the log and find uncompleted operations.
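The sequence above (write to the journal, write to disk, clear the journal entry, replay after a crash) can be sketched as follows. The `JournaledDisk` class and the `crash_before_commit` flag are hypothetical and exist only to simulate a power failure.

```python
class JournaledDisk:
    """Toy model of the journaling steps described in the text."""

    def __init__(self):
        self.journal = []   # operations that have been logged but not finished
        self.files = {}     # the "main" storage

    def write(self, name, data, crash_before_commit=False):
        self.journal.append((name, data))    # 1. record the operation in the journal
        if crash_before_commit:
            return                           # simulated power loss
        self.files[name] = data              # 2. write to the main storage
        self.journal.remove((name, data))    # 3. clear the journal entry

    def recover(self):
        """After a restart, replay whatever is still in the journal."""
        replayed = [name for name, _ in self.journal]
        for name, data in self.journal:
            self.files[name] = data
        self.journal.clear()
        return replayed


disk = JournaledDisk()
disk.write("a.txt", "hello")
disk.write("b.txt", "world", crash_before_commit=True)
print(disk.files, disk.journal)
replayed = disk.recover()
print(replayed, disk.files)
```

After the simulated crash only `b.txt` remains in the journal, so recovery knows exactly which operation was left unfinished and can complete it.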

The biggest problem with using logging is that it requires additional system resources to use it. In order to reduce such overhead, journaling file systems do not write the entire file to the journal, but only certain metadata.

Ext file systems

Ext

Ext stands for the "Extended" file system, and it was the first one designed specifically for Linux systems. There are four Ext file systems in total today. The very first one, plain Ext, was a major update of the Minix file system.

Characteristics Ext:

  • maximum file size: 2GB;
  • maximum partition size: 2GB;

The first version appeared in 1992.

We will not consider it, because most likely you will never encounter it.

Ext2

Ext2 is a non-journaling file system released in 1993, whose main goal was to support devices up to 2 terabytes in size. Because Ext2 has no journal, it performs far fewer disk writes, which affects both its performance and where it is used.

Characteristics:

  • maximum file size: 16GB - 2TB;
  • maximum partition size: 2 - 32 TB;
  • the maximum name size is 255 characters.
  • due to the low number of write-delete operations, it is ideal for various flash drives;
  • at the same time, modern SSDs have improved life-cycle characteristics (wear resistance of the storage elements) and some other features that offset the shortcomings of Ext2 as a non-journaling FS.

Ext3

Ext3 appeared in 2001, together with the release of Linux kernel 2.4.15. In essence it is the same Ext2, but with journaling support. The main goal of Ext3 was backward compatibility with Ext2 without the need to reformat partitions. One advantage is that most of the testing and bug fixing done for Ext2 carried over to Ext3, which made Ext3 a more stable and faster FS.

Characteristics:

  • maximum file size: 16GB - 2TB (depending on the block size);
  • maximum partition size: 2 - 32 TB (depending on block size);
  • suitable if you are using Ext2 and want journaling;
  • thanks to its performance and stability, it is probably the most suitable FS for database servers;
  • probably not the best choice for servers: it does not support file system snapshots, and recovering deleted files is difficult.

Ext4

Like Ext3, Ext4 is backward compatible with previous versions of the FS. In fact, you can mount Ext2 or Ext3 as Ext4 and, under certain conditions, get better performance. You can also mount Ext4 as Ext3 without any side effects.

The stable version of Ext4 was released in 2008. It is the first FS in the Ext "family" to use extents, which reduces file fragmentation and increases the overall performance of the file system. In addition, Ext4 implements delayed allocation, which also reduces disk fragmentation and CPU utilization. On the other hand, although delayed allocation is used in many file systems, the complexity of its implementation increases the likelihood of data loss.

Characteristics:

  • maximum file size: 16 TB;
  • maximum file name size: 255 characters.
  • the best choice for SSD;
  • the best performance compared to previous Ext file systems;
  • it is also great as a file system for database servers, although it is younger than Ext3.

BtrFS

BtrFS was developed by Oracle in 2007. Its design is similar to ReiserFS, and its main operating principle is so-called copy-on-write. BtrFS allows you to dynamically allocate inodes, create snapshots of the file system while it is running, perform transparent compression of files and defragment the file system online.

Although a stable version of BtrFS is not yet included in most Linux distributions (at the time of writing, only SUSE and Oracle Linux), it may well replace Ext3/4 in the foreseeable future and already provides tools for converting Ext3/4 to BtrFS. It is also worth mentioning that one of the Ext developers said that "BtrFS is a step into the future."

Characteristics:

  • maximum partition size: 16 EB;
  • maximum file name size: 255 characters.
  • due to performance, snapshots and other features - BtrFS is an excellent file system for the server;
  • Oracle also develops a replacement for NFS and CIFS, which is called CRFS and which is designed to improve performance for file storages with BtrFS;
  • performance tests showed BtrFS lagging behind Ext4 on solid-state media such as SSDs and in operations with relatively small files.

ReiserFS

ReiserFS was introduced in 2001 and implemented many features that can never be implemented in Ext*. In 2004, Reiser4 was released to replace ReiserFS.

At the same time, development of Reiser4 is progressing very slowly, and it still has limited support (?) in the Linux kernel. At the time of writing, only ReiserFS itself is supported.

Characteristics:

  • maximum file size: 1 EB;
  • maximum partition size: 16 TB;
  • maximum file name size: 4032 bytes, but limited to 255 characters.
  • excellent performance when working with small files such as log files and is great for database servers or mail servers;
  • ReiserFS lends itself well to increasing the volume size - but does not support its reduction and encryption at the FS level;
  • future Reiser4 is still in question and for now BtrFS remains the preferred (?) choice between these two FS.

ZFS

ZFS is worth mentioning here because it was developed by Sun Microsystems, which was later acquired by Oracle, and it has capabilities similar to BtrFS and ReiserFS. It also became quite well known after Apple announced its intention to use it as the default file system. The first release of ZFS took place in 2005.

Due to license restrictions, ZFS cannot be included in the Linux kernel, but it can be supported through Filesystem in Userspace (FUSE).

Characteristics:

  • maximum file size: 16 EB;
  • maximum partition size: 256 ZiB (Zebibyte);
  • maximum file name size: 255 bytes.
  • shows excellent performance when working with large disk arrays;
  • supports combining disks into arrays, creating FS snapshots, and working with a "layered display" of data;
  • possible difficulties when installing and using it on Linux systems, due to the need to use FUSE.

Swap

Swap is not a file system at all. A swap file or partition is used by the kernel's virtual memory system and has no file system structure. You cannot mount it and read data from it, because swap is used exclusively by the Linux kernel to write memory pages to disk. Usually swap is used only when the OS runs short of free RAM and "flushes" part of the data from memory to swap in order to free it up.