Question
· Jul 3, 2020

Disk read speed: single process vs. aggregate disk speed

Running Caché 5.0.21 (64-bit) on Windows Server 2016 in a virtual environment. Trying to understand why every single-process disk read (simple SQL data walks) caps out around ~20MB/s, yet 2 such tasks running in parallel on different data areas can reach 19MB/s each, four reach 17MB/s each (about 70MB/s total), and so on. Also, a simple file copy to nul on the same system reaches ~400MB/s.

What can keep a single query on an otherwise idle system from reaching, for example, 200MB/s? Virtualization? Windows? Caché? Processor usage is below 1-3%.

Discussion (4)

Undeniably. But that is the exact reason why migration is no longer possible - the project runs 24x7, with 3TB of data, modified system classes, ~1000 classes, and 2000+ queries fine-tuned for a very exact COS plan. There was an attempt in 2013, but it failed, mainly because of changes in SQL optimization.

But the system is running fine. This was more of a theoretical question - perhaps something can be tuned at the VMware or Windows level - what the Caché process can be doing between the IOPS and the COS interpreter. 1 job - 25000 IOPS, <1% CPU; 2 jobs - 50000 IOPS, <1% CPU; 4 jobs - 100000 IOPS, <1% CPU. As the system usually runs more than 20 IOPS-hungry jobs, the result is a good SSD load. Just wondering what state a single process sits in on an idle system.

Hi Eriks,

Specific to your question about why you cannot achieve 200MB/s, there are some specific physical reasons why this is the case.  Firstly, your file copy is a completely different IO operation - it is performed with larger block-size requests and is 100% sequential, benefiting from the file cache and/or storage controller cache along with NTFS read-ahead prediction.

In a Caché SQL query, Caché (or IRIS) will do 8KB block reads, presumably random in nature as well depending on the query and the data/global structure, so any caching will be mostly limited to whatever you have defined for database cache (global buffers) in the Caché instance.  Since this is 5.0.21, I wouldn't expect your installation to have hundreds of GBs of global buffers (and I would not recommend that on 5.0.21 either), so you are at the mercy of the disk latency of a single process doing random 8KB reads, not the total throughput you see in a file copy operation.
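To make that distinction concrete, below is a minimal sketch (plain Python, nothing Caché-specific; the file path and read count are just placeholders) that reads the same file with 8KB requests sequentially and then at random offsets. On a file considerably larger than RAM, the random pass shows roughly the latency-bound behavior your query processes see, while the sequential pass benefits from read-ahead and caching much like your file copy does.

```python
# Illustrative only: compare sequential vs. random 8KB reads against one file.
# Use a test file much larger than RAM so the random reads actually hit storage.
import random
import time

PATH = "testfile.dat"      # placeholder path on the datastore under test
BLOCK = 8 * 1024           # 8KB, matching the Caché database block size
N_READS = 50_000           # placeholder sample size

with open(PATH, "rb", buffering=0) as f:
    f.seek(0, 2)                       # seek to end to get the file size
    blocks = f.tell() // BLOCK

    # Sequential pass: consecutive 8KB blocks, friendly to read-ahead and caches.
    start = time.perf_counter()
    for i in range(N_READS):
        f.seek((i % blocks) * BLOCK)
        f.read(BLOCK)
    seq_mbs = N_READS * BLOCK / (time.perf_counter() - start) / 1e6

    # Random pass: 8KB blocks at random offsets, each read pays the full latency.
    start = time.perf_counter()
    for _ in range(N_READS):
        f.seek(random.randrange(blocks) * BLOCK)
        f.read(BLOCK)
    rnd_mbs = N_READS * BLOCK / (time.perf_counter() - start) / 1e6

print(f"sequential: {seq_mbs:.1f} MB/s   random: {rnd_mbs:.1f} MB/s")
```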

So, based on the ~20MB/sec you are seeing, this indicates you are getting about 2500 8KB IOPS, or 0.4ms single-process storage latency - this is actually very good performance for a single process.  As you add more jobs in parallel you start approaching other limits in the IO chain, such as SCSI queue depths at the VM layer, at the VMware ESXi layer, etc., and it's more an IO operation limitation than a throughput (MB/s) limitation.
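For reference, here is the same back-of-the-envelope arithmetic as a tiny, purely illustrative Python helper, tying single-process read latency to IOPS and throughput at the 8KB block size:

```python
# Single-process throughput is latency-bound: each 8KB read must complete
# before the next one is issued, so MB/s = (1000 / latency_ms) * 8KB.
BLOCK_KB = 8

def single_process_throughput(latency_ms):
    iops = 1000.0 / latency_ms            # synchronous reads per second
    mb_per_s = iops * BLOCK_KB / 1024     # throughput at 8KB per read
    return iops, mb_per_s

for latency in (0.4, 0.2, 0.04):
    iops, mbs = single_process_throughput(latency)
    print(f"{latency:.2f} ms -> ~{iops:,.0f} IOPS -> ~{mbs:.0f} MB/s")
# 0.40 ms -> ~2,500 IOPS  -> ~20 MB/s   (what you are observing)
# 0.04 ms -> ~25,000 IOPS -> ~195 MB/s  (what ~200MB/s would require)
```

In other words, to get ~200MB/s from one process at 8KB per read, the storage path would have to return each read in roughly 0.04ms, which is well below what the IO chain described above typically delivers.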

I hope this helps explain the situation you are seeing. This is expected behavior, because the ~20MB/s you see is simply a function of the storage latency of a single process (0.4ms), so that is a maximum of ~2500 IOPS * 8KB IO size = ~20MB/sec.

Kind regards,

Mark B-