Raw disk I/O performance on FreeBSD
Author: Willem Jan Withagen (wjw@withagen.nl), date: 13 October 2005
Introduction
 
  Driven by remarks on the FreeBSD mailing lists, some information in an article,
  and my personal curiosity, I started to look into "evaluation" of the raw I/O
  performance of disks.
  
Tests
Simple dd(1)
In general dd(1) is considered a (very) poor test of disk I/O performance,
  but here it serves as a simple first approach to see what is going on
  and to get a first indication of what to expect. So I created a simple setup
  and ran the initial tests.
Setup
  On a server with 2 disks, the OS (FreeBSD 5.4) is installed on the first disk. The
  second disk is not partitioned or formatted, so it is only available
  as the raw device. dd(1) is then used to write blocks of fixed size (from /dev/zero)
  to successive parts of the disk until the disk is filled. dd(1) is used since it
  reports the transfer rate all by itself and, judging from its code, this figure
  should be fairly accurate.
  (This removes the need to modify dd(1) or to write a special tool.)
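  The write loop described above could be sketched as follows in Python. This is
  a minimal sketch, not the script used for the measurements: the device name
  /dev/ad1, the function names, and the choice to start one dd(1) process per
  block are assumptions for illustration. Only the standard dd(1) operands if=,
  of=, bs=, seek= and count= are used.

```python
import subprocess


def dd_command(device, block_bytes, index):
    """Build the dd(1) argument list that writes one block of zeros.

    seek= is counted in units of bs=, so block `index` lands at byte
    offset index * block_bytes on the device.
    """
    return [
        "dd",
        "if=/dev/zero",
        f"of={device}",
        f"bs={block_bytes}",
        f"seek={index}",
        "count=1",
    ]


def write_block(device, block_bytes, index):
    """Run dd(1) for one block and return its summary text.

    dd(1) prints its transfer statistics on stderr.
    WARNING: this overwrites whatever is stored on `device`.
    """
    result = subprocess.run(
        dd_command(device, block_bytes, index),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stderr


if __name__ == "__main__":
    # Hypothetical device name; the commands are printed, not executed.
    for index in range(3):
        print(" ".join(dd_command("/dev/ad1", 1024 * 1024, index)))
```

  Printing the commands first makes it easy to check the offsets before
  anything is written; write_block() is only meant for a disk whose contents
  may be destroyed.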
  
Results
Each graph is based on 10 runs for each of the block sizes (1, 5, 10 and 20 Mbyte).
  Diskindex refers to the location on the disk: the disk size is divided by the
  write block size, giving the max. blockcount, and the index then counts from 0
  to the max. blockcount.
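  The disk index arithmetic can be illustrated with a short Python sketch. The
  disk size of 80 * 10^9 bytes is an assumption (the exact capacity is not
  stated here), the function names are made up for illustration, and 1 Mbyte is
  taken as 1048576 bytes, matching the dd(1) samples quoted in this article.

```python
MBYTE = 1024 * 1024

# Assumption: a nominal "80 GB" disk; substitute the real device size.
ASSUMED_DISK_BYTES = 80 * 10**9


def max_block_count(disk_bytes, block_bytes):
    """Number of whole blocks of block_bytes that fit on the disk."""
    return disk_bytes // block_bytes


def byte_offset(index, block_bytes):
    """Byte position on the disk at which block `index` starts."""
    return index * block_bytes


if __name__ == "__main__":
    for size_mbyte in (1, 5, 10, 20):
        block = size_mbyte * MBYTE
        count = max_block_count(ASSUMED_DISK_BYTES, block)
        print(f"{size_mbyte:2d} Mbyte blocks: max. blockcount = {count}")
```

  Because the block size differs per run, the same diskindex points at a
  different byte offset in each graph; byte_offset() converts an index back to
  a position so that the graphs can be compared.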
Block Reading
   
  
   
  
Block Writing
   
  
   
  
Other disks:
  - BigFoot
  - DiamondMax9
  - WD2500YD
  - st38421
  - twa-single
  - wd800-gmirror
  - wd800-sata
Observations:
  - 
    The first part of the 1 Mbyte block writes does not match the picture that we
    see in the other three graphs (5, 10 and 20 Mbyte). The last three give more or
    less identical values.
 This might be because on the fast outer part of the disk (the low disk indices) the
    cache can keep up with the transfers. The difference would then be caused by the
    "slow" process of starting dd(1) anew for every block.
 Looking at the inner side of the disk (the high disk indices), where the slow tracks
    are, all four runs are at the same speed.
 This would suggest that the curve shown by the 5 Mbyte block writes indicates the
    upper limit of the write transfer rate on each specific part of the disk.
- 
    "systat -vm 1" shows a maximum of 85% disk usage; the disk never gets fully
    saturated. I am not sure whether this is because all transfers complete within
    the 1 second sample interval that systat(1) uses.
  
- 
    We have not (yet) accounted for the transfer time from /dev/zero into the buffer
    before writing to the disk device. This would have the largest impact on the
    1 Mbyte transfers.
  
- 
    When enough samples are run, some sort of transfer "shadowing" starts to show:
    several transfers are consistently slower than the max. transfer rate. The fact
    that the shadowing reproduces the same pattern as the main line, only a little
    slower, suggests that some deterministic process interferes with either the
    transfer or the time measurement.
 Time quantisation could be the cause of the very large dispersion with the 1 Mbyte blocks.
 Typical samples are: 1048576 bytes transferred in 0.014545 secs (72090850 bytes/sec)
    or 1048576 bytes transferred in 0.021420 secs (48953123 bytes/sec).
    Perhaps the time measurement is not accurate enough to really differentiate
    between these values.
 The fact that the 5 Mbyte blocks actually show the fewest and smallest deviations from
    the main line would indicate that it is not a result of time quantisation. If that
    were the case, then this effect would decrease as the sample duration becomes
    longer, i.e. the relative influence of the quantisation would be four times
    smaller with 20 Mbyte blocks than with 5 Mbyte blocks.
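To compare the numbers dd(1) reports, its summary lines can be parsed and the
  relative weight of a fixed timing error estimated. The following Python sketch
  assumes the summary format of the two samples quoted in the observations; the
  1 ms timer granularity and the linear scaling of transfer time with block size
  are assumptions chosen purely for illustration, not measured values.

```python
import re

# Matches e.g. "1048576 bytes transferred in 0.014545 secs (72090850 bytes/sec)"
DD_SUMMARY = re.compile(
    r"(\d+) bytes transferred in ([\d.]+) secs \((\d+) bytes/sec\)"
)


def parse_dd_summary(line):
    """Return (bytes, seconds, reported_rate) from a dd(1) summary line."""
    match = DD_SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a dd summary line: {line!r}")
    return int(match.group(1)), float(match.group(2)), int(match.group(3))


def relative_timing_error(tick_seconds, duration_seconds):
    """Relative error when the duration is only known to within one tick."""
    return tick_seconds / duration_seconds


SAMPLES = [
    "1048576 bytes transferred in 0.014545 secs (72090850 bytes/sec)",
    "1048576 bytes transferred in 0.021420 secs (48953123 bytes/sec)",
]

# Assumptions for illustration only: a 1 ms timer tick, and a transfer
# time that grows linearly with the block size.
ASSUMED_TICK = 0.001
BASE_SECONDS = 0.014545  # duration of the faster 1 Mbyte sample

if __name__ == "__main__":
    for line in SAMPLES:
        nbytes, secs, _rate = parse_dd_summary(line)
        print(f"{nbytes} bytes in {secs:.6f} s = {nbytes / secs / 1e6:.1f} MB/s")

    for size_mbyte in (1, 5, 10, 20):
        duration = BASE_SECONDS * size_mbyte
        error = relative_timing_error(ASSUMED_TICK, duration)
        print(f"{size_mbyte:2d} Mbyte: relative timing error {error:.2%}")
```

  Whatever tick size is assumed, the ratio between the 5 Mbyte and 20 Mbyte
  errors stays at four, which is the factor used in the argument above.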
Systems
  
    Dual Opteron 244, 1 GB RAM
  
  
    Hard disk:
      Western Digital WD800 SATA disk (WD800JD), 8 MB cache
  
On both the server and clients, all processes that are not required for the
  tests are terminated, especially cron(8), syslogd(8) and sendmail(8).
Do it Yourself
  to be filled
Interesting other reading
  - NFS Tricks and Benchmarking Traps, Daniel Ellard and Margo Seltzer,
    Harvard University.
    Proceedings of the FREENIX Track: 2003 USENIX Annual Technical Conference,
    San Antonio, Texas, USA, June 9-14, 2003.