Please welcome guest blogger, Tushar Patel, Principal Engineer, Oracle Solutions
Proven 37 Million IOPS, now augmented with 100G Networking and Integrated GPU – Oh My!
We live in a world of change. Many of us are old enough to remember a time before cable television, the internet, and cell phones were commonplace. Movies made in the 1980s provide a view of the past. Yet technology continued to advance, fed by a desire to deliver, for example, broadcast programming faster and in a form that saves the consumer complexity and cost. Television shows transitioned from a big wooden box in the family room to VCR tapes, to DVDs, to Netflix streamed on your latest smartphone. All of this change in the span of a single generation!
Server architecture has evolved over the past decade as well. When Cisco launched the Unified Computing System (UCS) back in 2009, the Cisco UCS B-Series 200 blade only had capacity for two disk drives: one housed the operating system and the other served as a backup drive. All application and database files were placed on external storage arrays such as NetApp. This led to the development of Converged Infrastructures such as FlexPod (Cisco UCS & NetApp) and FlashStack (Cisco UCS & Pure Storage), which became widely accepted platforms for hosting large database, ERP, and other mission-critical workloads. The largest Oracle databases are hosted on Converged Infrastructures, and there will continue to be an ample market for this system design.
Technology continues to advance, and Cisco has captured many of these advances in the new Cisco UCS X-Series chassis, which houses up to eight UCS X210c M6 compute nodes. Think of each node as a rack-optimized server that slides vertically, like a sled, into the new Cisco UCS X-Series Modular System (Cisco UCS X9508 chassis). This new design is managed by Cisco Intersight, a cloud-based systems management platform that carries forward key features of UCS Manager, which managed the previous generation of systems. While there are many new capabilities we could elaborate on, for today let's focus on the fact that the X210c has six NVMe PCIe Gen 4 solid state drives (SSDs) in addition to the two drives that house the operating system.
This month the options expand! Database Administrators (DBAs) look at every system design point to maximize performance. Network bandwidth, traditionally the network admin's domain, was considered out of bounds for the DBA. But Cisco UCS X-Series now supports 100G networking, giving these teams the ability to design solutions that reach new performance thresholds. Just think of the possibilities! This system can easily drive your database, VDI, AI/ML, and other enterprise applications.
With this new design, customers who have single instance databases such as Oracle or Microsoft SQL Server now have up to 90TB of onboard storage space per X210c compute node. System architects can use these drives to house a single instance of a database and avoid the cost and complexity of investing in external storage. This also saves space and power within the datacenter without sacrificing performance.
Servers supporting Virtual Desktops can drive more complex applications than ever before. Graphics-intensive workloads require Graphics Processing Units (GPUs) to maximize performance. X-Series provides two options. The first is the ability to configure one or two NVIDIA T4 GPUs per X210c compute node. The second is a dedicated GPU compute node, which is ideal for a wide array of workloads such as VDI.
You might recall that back in July 2021, John published a blog that detailed Virtual Desktop Infrastructure (VDI) testing performed on a single X210c M6. The results proved that this compute node could deliver excellent performance. Database administrators reading this will point out that a database workload is very different, for any number of reasons. While they have a point, we wanted to first determine a baseline performance for these drives. It’s common practice to evaluate baseline system performance before deploying any database application. Baseline testing is performed using common I/O calibration tools such as Linux FIO. These tools can generate I/O patterns that mimic the type of I/O operations performed by Oracle databases.
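As a rough illustration of what such a baseline run can look like, the Python sketch below assembles an fio command line for a random I/O test against raw NVMe devices. It is not the job definition used for the results in this post. The device paths, the `iodepth` and `numjobs` values, the run time, and the choice of `libaio` as the asynchronous I/O engine are all assumptions made for the example, and the helper `build_fio_command` is hypothetical. The sketch only builds and prints the command; it does not run fio.

```python
import shlex


def build_fio_command(devices, read_pct, block_size="4k",
                      iodepth=32, numjobs=8, runtime_s=1800):
    """Build an fio argument list for a random I/O test on raw devices.

    All defaults are illustrative assumptions, not the values used in
    the testing described in this post.
    """
    if not devices:
        raise ValueError("at least one device is required")
    if not 0 <= read_pct <= 100:
        raise ValueError("read_pct must be between 0 and 100")

    # Pick the fio access pattern from the requested read percentage.
    if read_pct == 100:
        pattern = ["--rw=randread"]
    elif read_pct == 0:
        pattern = ["--rw=randwrite"]
    else:
        pattern = ["--rw=randrw", f"--rwmixread={read_pct}"]

    return [
        "fio",
        "--name=baseline",
        # fio accepts several targets in one job, separated by ':'.
        "--filename=" + ":".join(devices),
        "--ioengine=libaio",   # asynchronous I/O (assumed engine)
        "--direct=1",          # bypass the page cache
        *pattern,
        f"--bs={block_size}",
        f"--iodepth={iodepth}",
        f"--numjobs={numjobs}",
        "--time_based",
        f"--runtime={runtime_s}",  # seconds
        "--group_reporting",
    ]


if __name__ == "__main__":
    # Hypothetical device names for six local NVMe drives.
    nvme_devices = [f"/dev/nvme{i}n1" for i in range(6)]
    # WARNING: any write mix against raw devices destroys the data on them.
    for read_pct in (100, 70, 50):
        print(shlex.join(build_fio_command(nvme_devices, read_pct)))
```

Generating the command from a small set of parameters keeps the read/write ratio as the only thing that changes between runs, which makes results easier to compare.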
In our tests, we used the FIO load generator to exercise six local Intel 5600 NVMe disks of 3.2 TB each. Even though the target workload is a single-instance Oracle database running on a single blade server, we decided to run the performance tests on all eight blades of a fully populated chassis simultaneously, so we could ensure that each blade reaches its designed performance within the X9508 chassis. The goals, method, and results of this testing were:
- No impact on individual blade server performance from power or cooling
- No slot dependency. All blades achieve identical performance irrespective of slot inside the chassis.
- Precondition all disks by writing random data patterns
- Observe and validate sustained IOPS and performance over 30 minutes, 1 hour, and 4 hours, with average and 90th percentile latency under one millisecond for various read/write ratios. Multiple runs were conducted for each iteration, and the results were carefully analyzed to verify data consistency
- Generate reproducible results with minimal parameter changes while mimicking application behavior. In other words, we could have generated slightly higher numbers, but the goal was not to produce peak benchmark numbers. For each test, an appropriate FIO iodepth and number of jobs were selected, along with asynchronous I/O
- Each blade server generated approximately 4.7 million sustained IOPS at 100% reads, for a total of more than 37 million IOPS across the chassis, at nearly 500 microseconds latency with 4K data blocks
- Near-identical I/O behavior on all 8 blade servers
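To put these figures in context, the short Python sketch below works through the arithmetic behind them: the chassis total, the bandwidth implied by 4K blocks, and an estimate of how many I/Os must be in flight to sustain that rate. The last estimate uses Little's law (concurrency = throughput × latency) and treats the roughly 500 microsecond figure as a mean latency, which is an assumption on our part. The sketch also assumes a 4K block is 4,096 bytes and that load is spread evenly across six drives per blade.

```python
BLADES = 8
DRIVES_PER_BLADE = 6
IOPS_PER_BLADE = 4.7e6   # approximate sustained IOPS at 100% reads
BLOCK_BYTES = 4096       # assumes "4K" means 4,096 bytes
LATENCY_S = 500e-6       # assumes ~500 microseconds is the mean latency


def aggregate_iops(iops_per_blade, blades):
    """Total IOPS when every blade performs identically."""
    return iops_per_blade * blades


def bandwidth_gb_per_s(iops, block_bytes):
    """Data rate in decimal gigabytes per second for a given block size."""
    return iops * block_bytes / 1e9


def outstanding_ios(iops, latency_s):
    """Little's law: average I/Os in flight = throughput x latency."""
    return iops * latency_s


if __name__ == "__main__":
    total = aggregate_iops(IOPS_PER_BLADE, BLADES)
    in_flight = outstanding_ios(IOPS_PER_BLADE, LATENCY_S)
    print(f"Chassis IOPS:         {total / 1e6:.1f} million")
    print(f"Per-blade bandwidth:  "
          f"{bandwidth_gb_per_s(IOPS_PER_BLADE, BLOCK_BYTES):.1f} GB/s")
    print(f"Chassis bandwidth:    "
          f"{bandwidth_gb_per_s(total, BLOCK_BYTES):.1f} GB/s")
    print(f"I/Os in flight/blade: {in_flight:.0f}")
    print(f"I/Os in flight/drive: {in_flight / DRIVES_PER_BLADE:.0f}")
```

Under these assumptions, eight blades at 4.7 million IOPS each come to about 37.6 million IOPS, and each blade needs on the order of 2,350 I/Os in flight. That is the quantity the combined FIO iodepth and number of jobs has to cover.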
This is excellent performance, better than you might have imagined! Now, with this baseline in place, it is time to load up a single instance of Oracle Database 19c and see how the X210c M6 performs, which will be the topic of our next blog. Stay tuned!