China Linux Storage & Filesystem 2014 workshop (first day)

CLSF (China Linux Storage & File System Workshop) is an effort to make local Linux kernel hackers get together to share and exchange ideas. CLSF is an invitation only workshop, in order to have effective communication, only a small group of people will be invited. Most of the invitees are active upstream Linux kernel developers locally from China, focus on I/O and storage sub-systems.

CLSF 2014 was hold in office of XiaoMi which is a famous consumer electronics company in china. Participators are mainly from Huawei, Fujitsu, Intel, Alibaba and other companies.

The first topic lead by Jiufei Xue from Huawei is about ocfs2. Huawei was building their private cloud product on ocfs2, so in recent two years the kernel developers in Huawei commited many fix patches and new features into ocfs2 community. In this year, they add range lock into ocfs2, so users could not only lock the whole file but a specific range of one file, which will promote the performance in cluster when many clients read/write files at the same time.

ocfs2

F2FS is a new filesystem in kernel. It is based on devices like SD-card. It stores meta-data in the beginning of device (The random read/write performance in the beginning of SD-card is very well) and use indirect-block layout just like ext2. The reason of not using extent (like ext4) is for its convenience of garbage collection of NAND block. F2FS also merges many ‘sync’ operations into one, mainly for the speed (The ‘sync’ operation on SD-card in mobile-phone is very slow).

Zeifan Li (from Huawei): If one day, the SD-card adds FTL layer into its controller and let SD-card works just like enterprise SSD. Will the advantage of F2FS be disappear?
Ying Huang (from Intel): Let’s look at this another way. If F2FS do the whole work of SSD firmware, we could use kernel file system on NAND flash directly, which will save a lot of money 🙂

f2f2

Bo Liu from Oracle hold the topic about btrfs. The new features for btrfs decreased in this year, and the main work is fixing bugs. Using the standard kernel worker thread in btrfs cause a serious bug which would lost user’s data. Bo Liu spends a long time to repeat the bug, and fortunately, fix it at last.

btrfs

Coly (from Alibaba): I have joined the linux plumber conference in this year. On a presentation about docker, the speaker complained that btrfs is the most unstable filesystem in linux kernel. The developers in btrfs community may have to spend more time on fixing bugs.

The last topic in first day is hold by an engineer from Memblaze (a china-version fusion-io company). They face many problmes in building All-Flash-Array (called AFA). In AFA, the linux system in box has become bottleneck in the whole IO path: too much interrupts cost too much CPU and time; the implemention of socket and TCP/IP stack in linux is too insufficient; context switch for processes also make IOPS unstable; even filesystem itself spend too much time in searching files.

memblaze

Besides these problems, they also speak out a new viewpoint: SSD will become a perfect device to store cold data as NVME and PCM will become heap very quickly. For this point, we began a long and heated discussion. If PCM comes into the market, the Block-layer and filesystem in linux kernel will become obsolete, and almost all the guys sit in the meeting-room will be out of work :). So, maybe the time has come closely for the big chang in storage area.

clsf 2014