Wrong ‘struct timeval’ for setsockopt()

What if we deliberately use ‘struct timeval’ in this incorrect way to set the receive timeout to 3 seconds:
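A minimal sketch of the mistake (the original snippet is not preserved here; the AF_INET socket and error handling are just placeholders):

    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/time.h>

    int main(void)
    {
        int sockfd = socket(AF_INET, SOCK_STREAM, 0);
        struct timeval tv;

        /* WRONG: "3 seconds" written only in microseconds */
        tv.tv_sec  = 0;
        tv.tv_usec = 3 * 1000000;   /* 3000000 is out of the valid range */

        /* the correct way would be: tv.tv_sec = 3; tv.tv_usec = 0; */

        if (setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
            perror("setsockopt");   /* fails with errno == EDOM on Linux */

        return 0;
    }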

setsockopt() will fail and return -1.

Let’s look at the Linux kernel code for the system call sys_setsockopt():
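An abridged sketch of the syscall entry, based on net/socket.c in the 3.x/4.x kernels (the LSM hook and some error handling are omitted, and the exact code differs between versions):

    SYSCALL_DEFINE5(setsockopt, int, fd, int, level, int, optname,
            char __user *, optval, int, optlen)
    {
        int err, fput_needed;
        struct socket *sock;

        if (optlen < 0)
            return -EINVAL;

        sock = sockfd_lookup_light(fd, &err, &fput_needed);
        if (sock != NULL) {
            /* ... security hook omitted ... */
            if (level == SOL_SOCKET)
                err = sock_setsockopt(sock, level, optname, optval, optlen);
            else
                err = sock->ops->setsockopt(sock, level, optname, optval, optlen);
            fput_light(sock->file, fput_needed);
        }
        return err;
    }

For SOL_SOCKET options such as SO_RCVTIMEO, the call goes to sock_setsockopt().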

sock_setsockopt() will invoke sock_set_timeout(), and sock_set_timeout() looks like this:
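Here is the relevant part, abridged from net/core/sock.c (again, the exact code differs a little between kernel versions):

    static int sock_set_timeout(long *timeo_p, char __user *optval, int optlen)
    {
        struct timeval tv;

        if (optlen < sizeof(tv))
            return -EINVAL;
        if (copy_from_user(&tv, optval, sizeof(tv)))
            return -EFAULT;
        if (tv.tv_usec < 0 || tv.tv_usec >= USEC_PER_SEC)
            return -EDOM;   /* <-- our "wrong" timeval hits this check */

        /* ... otherwise tv is converted to jiffies and stored in *timeo_p ... */
        return 0;
    }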

That’s it. If ‘tv.tv_usec’ is greater than or equal to USEC_PER_SEC (which equals 1000000), sock_set_timeout() returns -EDOM and setsockopt() fails.

DCTC 2016 conference

Yesterday I attended DCTC (Data Center Technology Conference) 2016 in Beijing. Although it is called “Data Center Technology”, most of the topics were about storage, because the conference was hosted by Memblaze, a well-known flash-storage startup in China.
Xuebing Yin, the CEO of Memblaze, gave the first talk:




2016 is an important year for flash storage, because SSD revenue exceeded hard-disk revenue for the first time. As we can see, data centers will become all-silicon in the near future (the hard disk is the only non-silicon component left in servers). As SSDs and their interfaces (from SATA to PCIe) get faster and faster, a lot of old software becomes the performance bottleneck: MySQL 5.6 cannot use up the performance of a high-speed SSD, but MySQL 5.7 can.

Janene Ellefson from the NVMe organization explained why we need a standard for high-speed data transfer.




With up to 64K queues and 64K commands per queue, NVMe is definitely the most powerful protocol for modern (and future) I/O devices.

Xin Wu from GBase introduced the problems they face when using SSDs for databases.




GBase is a series of database products for OLTP/OLAP and global data storage. SSDs benefit the OLTP applications, but for OLAP applications SSDs are too expensive, because a hard-disk array can provide the same bandwidth. Maybe that is why AWS released a new type of EBS a few months ago.

Coly Li (yes, my old friend ^_^) from SUSE Labs showed us the improvements in Linux software RAID in recent years. Many years ago, Linux software RAID was only used with low-speed hard disks, so the cost of an inefficient software implementation was not significant. But recently the wide use of SSDs has exposed many bottlenecks in software RAID, and open-source developers have committed many patches to improve its performance. Many of those patches came from Shaohua Li, an experienced kernel developer (he worked first for Intel, then Fusion-io, and now Facebook).



During the tea break, I visited the Memblaze exhibition.




This is a 1U server built by SuperMicro; it uses eight NVMe SSDs through SFF-8639 interfaces in the front. Even with MySQL running at full speed (4600% CPU usage), the air blowing out of the back was still not hot. It looks like the SuperMicro server is very efficient, and cool 🙂

Some tips about Hive

Here are some tips about Hive that I found while learning it:

1. When I started “bin/hive” for the first time, it reported errors like these:

The solution is simple:
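Assuming the errors were the usual first-run complaint about an uninitialized Derby metastore (the exact messages are not reproduced here), one common fix is to initialize the metastore schema with Hive’s schematool before starting the CLI:

    bin/schematool -dbType derby -initSchema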

Actually, we had better use MySQL instead of Derby as the metastore database in a multi-user environment.

2. Control the number of mappers for SQL jobs. If a SQL job uses too many mappers, the process context switches (including frequently launching and stopping JVMs) cost extra CPU. We could use a setting like the sketch below to change the number of mappers for all the SQL jobs.
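The exact property from the original post is missing; one common knob, assuming Hadoop 2.x with Hive’s default CombineHiveInputFormat, is the input split size, which decides how much data each mapper gets (the values below are only examples):

    -- a bigger split packs more data into each mapper, so fewer mappers run
    SET mapreduce.input.fileinputformat.split.maxsize=268435456;
    SET mapreduce.input.fileinputformat.split.minsize=268435456;

Putting such settings into hive-site.xml (or a per-user .hiverc file) applies them to every session instead of a single one.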

3. After I imported 1 TB of data into an ORC-format table, the size of the table was just 250 GB. But after I imported the same 1 TB of data into a Parquet-format table, the size was 900 GB. It looks like Apache ORC has a more effective compression scheme for our kind of data.
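For reference, a hypothetical sketch of such a comparison (the real table schema is not shown in the post; “logs_staging” stands for the source table holding the 1 TB of data):

    CREATE TABLE logs_orc     (id BIGINT, msg STRING) STORED AS ORC;
    CREATE TABLE logs_parquet (id BIGINT, msg STRING) STORED AS PARQUET;

    -- load the same source data into both, then compare their sizes on HDFS,
    -- e.g. with: hadoop fs -du -s -h /user/hive/warehouse/logs_orc
    INSERT INTO TABLE logs_orc     SELECT * FROM logs_staging;
    INSERT INTO TABLE logs_parquet SELECT * FROM logs_staging;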

4. Use partitions carefully.

Suppose we have a table named “users” which is partitioned by the field “ca”:
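A hypothetical definition (the real columns are not shown in the post; a single “name” column is used here):

    CREATE TABLE users (name STRING)
    PARTITIONED BY (ca STRING);

    -- put one record into the ca='China' partition (INSERT ... VALUES needs Hive 0.14+)
    INSERT INTO TABLE users PARTITION (ca='China') VALUES ('Alice');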

Now there is a record in the HDFS directory “/user/hive/warehouse/users/ca=China/”.
In the book <>, it says we could copy the data of a partition directory to AWS S3 and then point the partition at it. But what if I point the partition at a new, empty HDFS directory? Let’s try:
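A sketch of the experiment (the namenode address and the empty directory path are placeholders for your own cluster):

    -- point the partition at a new, empty HDFS directory
    ALTER TABLE users PARTITION (ca='China')
    SET LOCATION 'hdfs://namenode:8020/tmp/empty_dir';

    -- no rows come back now
    SELECT * FROM users WHERE ca='China';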

Because the partition now points to an empty directory, the SELECT cannot find any records. That is what “schema on read” means.

5. Debug.
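Assuming the missing command here was the standard debug switch, we can start Hive with the root logger set to DEBUG on the console:

    bin/hive --hiveconf hive.root.logger=DEBUG,console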

This will print a lot of debug information, which helps to find the cause of problems such as: