Sunday, December 22, 2013

Random vs Sequential I/O and database systems


Main aim of this post is to set foundation to explain how random I/O's influence database systems.

Before going into the details, we need to understand data storage on physical layer (hard disk). It's important to know storage at physical layer to get better understanding on data access techniques used by RDBMS systems.

Not going into too many details of hard disk, some of the important components of hard disk for this post are: platter, read/write head, tracks and sectors.

Check these links on hard disk working:

Platter (below image) consists of set of tracks (red and black color rings), track is divided into sectors (yellow sections in red track).



Sector is the lowest physical storage unit, typical size of sector is 512 bytes, and data is stored in sectors (there are disks with larger than 512 bytes per sector).

To save a file of size 9000 bytes, two sectors are allocated on the physical layer.

Sequential Access
Let's walk thru a sample (sector size is 512 bytes in our sample)...
1) File is created with initial size of 1024 bytes, two continuous sectors are allocated on outer most track (say track1).
2) Append 1024 bytes to the file, it need two more sectors, these two sectors are allocated adjacent to previous two blocks
3) Append 1024 bytes to the file, it need two more sectors, these two sectors are allocated adjacent to previous four blocks.

Data blocks for file are allocated sequentially on track1.

To read this data, certain operations has to be performed...
 i)   Move read/write head to track1 (seek time)
 ii)  Wait for the sectors to come under read/write head (rotational latency)
 iii) Transfer data to OS (data transfer rate).

Typical seek times are anywhere between 9 to15ms, let's say 10ms in our case.
Rotational latency depends on rpm rate of disk, it typically goes from 2 to 5ms (http://en.wikipedia.org/wiki/Rotational_latency#Rotational_latency ), assume 2ms in our case.
Data transfer rate has increased over the period of time, typical transfer rate goes from 50MB/sec to 170MB/sec.
In our case, assume its 50MB/sec (50 * 1024 KB/sec).

Total cost to read file: 10ms (seek time) + 2ms rotational latency + 0.05ms (time to transfer 3KB of data) = 12.05ms.

Seek time is to move read/write head to track1 (if read/write head is not on track1), then onwards data is read sequentially.

Among all the three costs, seek time is the expensive one.


Random Access
Let's walkthrough the same sample when data is saved randomly

1) File is created with initial size of 1024 bytes, it need two sectors, and these two sectors are allocated on two different tracks (track1 & track100).
2) Append 1024 bytes to the file, it need two more sectors, these two sectors are allocated on two different tracks (track2 & track200).
3) Append 1024 bytes to the file, these two sectors are allocated on two different tracks (track3 & track300).

Data blocks for file are allocated randomly at different locations.


Read/Write head movement sequence: track1->track100->track2->track200->track3->track300

Sequence of physical operations and cost for each operation:
A) Cost for track1: 10ms (seek time) + 2ms rotational latency + 0.001ms (time to transfer 512 bytes) =12.001ms

B) Read/Write head is on track1, it should be moved to track100 to read second block, which means seek time is involved.
Cost for track100: 10ms (seek time) +2ms rotational latency+0.001ms (time to transfer 512 bytes) =12.001ms

C) Read/Write head is on track100, it should be moved to track2 to read third block.
Cost for track2: 10ms (seek time) +2ms rotational latency+0.001ms (time to transfer 512 bytes) =12.001ms

D) Read/Write head is on track2, it should be moved to track200 to read fourth block.
Cost for track200: 10ms (seek time) +2ms rotational latency+0.001ms (time to transfer 512 bytes) =12.001ms

E) Read/Write head is on track200, it should be moved to track3 to read fifth block.
Cost for track3: 10ms (seek time) +2ms rotational latency+0.001ms (time to transfer 512 bytes) =12.001ms

F) Read/Write head is on track3, it should be moved to track300 to read sixth block.
Cost for track300: 10ms (seek time) +2ms rotational latency+0.001ms (time to transfer 512 bytes) =12.001ms

Total cost to read file = 12.001+12.001+12.001+12.001+12.001+12.001--->72.006ms

Access time has increased by six times when data is randomly distributed.

This is the cost you pay when data is fragmented.

There has been lot of improvements in data transfer rate over the years, there is only small percentage of improvement in seek times due to mechanical movement of read/write head.
Improvements in data transfer rates helps systems like map-reduce which scans data sequentially.


My experience in database systems has been mostly with Microsoft SQL Server, when I refer to any specific aspects of database systems in my posts, it'll be with reference to SQL Server.

Fragmentation is the silent killer of query performance in database systems. If the execution plan looks perfect as far as physical access to data goes, and still query takes longer to execute, most likely your data is fragmented either in index or heap.

Fragmentation leads to random I/O's and affects the performance of query.

Continue to my next post by following link http://technicalpostsbyarifkhan.blogspot.com/2013/12/influence-of-random-ios-on-key-look-ups.html to know more on how random I/O's influence database systems.


If you like this post, please do not forget to share it on social networks by clicking buttons below.