I feel the need…OK, most people are familiar with the line from the film "Top Gun," but this same movie quote can sum up many people's feelings when it comes to the speed of their computer. It seems that no matter how fast their computer operates, most people wish it would run faster.
Sometimes this desire is justified, particularly for users who run a lot of multimedia applications or perform video editing. But for many, the desire is simply to have the latest and greatest technology, without regard to cost or to actual improvements in work completion. How can an organization decide whether it's worth upgrading its computer equipment to the latest processing technology?
What seems like a straightforward question can easily turn into quite a puzzle when you first try to answer it. However, a top-down methodology can help you understand the fundamental business processes or mission goals that drive your organization, and the software applications, data characteristics and network requirements needed to support those processes.
Combining this technique with performance benchmarks can enable you to determine whether it's worth upgrading your existing technology. To better explain this methodology, we'll use the Pentium 4, Intel's newest processor, as an example and examine how a "top-down" methodology can be used to break the question down and answer it.
The Pentium 4
Intel's newest processor, the Pentium 4, debuted this year in speeds of 1.3, 1.4 and 1.5 GHz. This is the first completely new chip architecture from Intel since the introduction of the Pentium Pro in 1995. Designed for high-end multimedia and Internet applications such as video editing, encoding MP3 files, 3D gaming and visualization, the new processor is built on a new IA-32 microarchitecture that Intel calls NetBurst (Intel). Why is this new architecture called NetBurst? Probably because it sounds cool and fits in well with Intel's marketing plan. However, the highlights of NetBurst and the Pentium 4, or P4 as it's commonly called, can be summarized in the following categories:
•400-MHz system bus
•Advanced transfer cache
•Execution trace cache
•Hyper-pipelined technology
•Rapid execution engine
•Advanced dynamic execution
•Enhanced floating-point and multimedia unit
To better understand the potential improvements of upgrading to this processor, each category is discussed in further detail.
400-MHz System Bus – The P4's front-side bus, or system bus, is physically clocked at 100 MHz and is 64 bits (8 bytes) wide. However, the new bus is what Intel calls quad-pumped, meaning it performs four data transfers per clock cycle. This translates to 8 bytes * 100 million/s * 4 = 3,200 MB/s. This is a tremendous improvement over the Pentium III and is currently the fastest desktop system bus available.
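The arithmetic behind that figure is simple enough to check for yourself. The short Python sketch below computes peak bus bandwidth from the bus width, physical clock rate and transfers per clock; the Pentium III comparison numbers (a 133-MHz bus with one transfer per clock) are our assumption for illustration, not figures from this article.

```python
# Peak front-side-bus bandwidth: bytes per transfer x clock rate x transfers per clock.
def bus_bandwidth_mb_s(bus_width_bytes, clock_mhz, transfers_per_clock):
    # Treating 1 MB as 10**6 bytes, as marketing figures do.
    return bus_width_bytes * clock_mhz * transfers_per_clock

p4 = bus_bandwidth_mb_s(8, 100, 4)    # quad-pumped 100-MHz bus
piii = bus_bandwidth_mb_s(8, 133, 1)  # assumed Pentium III 133-MHz bus, for comparison
print(p4, piii)  # 3200 MB/s vs. 1064 MB/s
```

The quad-pumping, not a faster physical clock, is what produces the "400-MHz" effective rate.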
Advanced Transfer Cache – This is also known as the Level 2, or L2, cache. The L2 cache on the P4 is identical in size to the Pentium III's at 256 KB. However, the Pentium 4's L2 cache uses 128-byte cache lines that are divided into two 64-byte sections. Thus the L2 cache reads data 64 bytes at a time, which improves performance during burst transfers but can reduce performance if only one byte out of the 64 is actually required.
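To see the trade-off, consider how much of each fetch is actually useful. The illustrative Python below computes the fraction of fetched bytes a program really needed, assuming the 64-byte sectors described above; the function and its name are ours, purely for illustration.

```python
# Fraction of fetched bytes actually used, given 64-byte cache-line sectors.
def fetch_efficiency(bytes_needed, sector_size=64):
    sectors = -(-bytes_needed // sector_size)  # ceiling division: sectors fetched
    return bytes_needed / (sectors * sector_size)

print(fetch_efficiency(64))  # 1.0 -- a full burst transfer wastes nothing
print(fetch_efficiency(1))   # ~0.016 -- one needed byte still drags in all 64
```

Sequential, streaming workloads stay near the top of that range; scattered single-byte accesses sit at the bottom, which is why the benchmark results later in this article split the way they do.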
Execution Trace Cache – The P4's trace cache can be viewed as an L1 instruction cache that lies behind the decoders. In the P4, unlike the Pentium III, the decoded instructions are of a defined size. Once instructions are in the trace cache, the P4 saves time by not having to re-decode repeating instructions. The trace cache ensures that the processor pipeline is continuously fed with instructions, decreasing the chance of the processor having to wait on the decoder units. This is especially important for a processor with a clock speed as high as the P4's.
Hyper-Pipeline – The P4's pipeline, at 20 stages, is twice as long as the Pentium III's. Spreading instruction processing across twice as many stages means each stage does less work, and the shorter each stage, the higher the clock rate the processor can achieve. However, the longer pipeline becomes a disadvantage when the software branches to an address that was not predicted. In that case, the whole pipeline must be flushed and refilled.
The improved branch prediction unit that works in conjunction with the trace cache is supposed to ensure this occurs only rarely.
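A toy model makes the cost of a flush concrete. The Python sketch below estimates average cycles per instruction, assuming (our assumption, purely illustrative) that a misprediction penalty is roughly the pipeline depth in cycles; the branch frequency and misprediction rate are made-up round numbers, not measured values for either chip.

```python
# Toy model: average cycles per instruction (CPI) when branch mispredictions
# flush the pipeline. Penalty is assumed ~ pipeline depth (illustrative only).
def avg_cpi(pipeline_depth, branch_freq, mispredict_rate, base_cpi=1.0):
    return base_cpi + branch_freq * mispredict_rate * pipeline_depth

shallow = avg_cpi(10, 0.20, 0.05)  # 10-stage pipeline
deep    = avg_cpi(20, 0.20, 0.05)  # 20-stage pipeline pays double per flush
print(shallow, deep)  # 1.1 vs. 1.2
```

The model shows why the deeper pipeline is a gamble: the higher clock rate only pays off if the branch predictor keeps the flush term small.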
Rapid Execution Engine – The core of the Rapid Execution Engine consists of two double-pumped arithmetic logic units (ALUs) and two address generation units (AGUs). Each of the four units is said to run at double the processor's clock and can accept a simple instruction every half-clock.
The normal, or slow, ALUs and AGUs process complex instructions, and the majority of instructions travel through this path. This makes the Rapid Execution Engine a sensible, but not dramatic, design improvement.
Advanced Dynamic Execution – This enables the P4 to process data more efficiently by choosing from up to 126 instructions in the execution buffer, triple the Pentium III's capacity. It also predicts program flow more accurately. This works well when advanced dynamic execution predicts correctly, but it can also hinder processing speed when it guesses wrong, which is often the case with office applications.
Enhanced Floating Point and Multimedia Unit – This unit accelerates processor-intensive tasks such as streaming video, voice recognition, video and audio encoding, and image processing. This is where the P4 uses Streaming SIMD Extensions 2 (SSE2): 144 new multimedia and graphics instructions designed to speed up graphics and multimedia applications. However, unless software is written to take advantage of these new instructions, they probably won't have much effect on processing performance.
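The basic SIMD idea behind SSE2 is one instruction operating on a packed group of values, for example four 32-bit floats in a single 128-bit register. The sketch below mimics that 4-wide behavior in plain Python purely as a concept demonstration; real SSE2 operates on hardware registers, not Python lists.

```python
# The idea behind SIMD: one "instruction" applies an operation to a packed
# group of values. Here we mimic a 4-wide (128-bit / four 32-bit floats) add.
def simd_add(a, b, width=4):
    out = []
    for i in range(0, len(a), width):
        # One conceptual instruction handles a whole chunk of `width` elements.
        out.extend(x + y for x, y in zip(a[i:i + width], b[i:i + width]))
    return out

print(simd_add([1, 2, 3, 4, 5, 6, 7, 8], [10, 20, 30, 40, 50, 60, 70, 80]))
# [11, 22, 33, 44, 55, 66, 77, 88] -- eight adds in two "instructions"
```

This is also why recompiled or rewritten software is required to benefit: a program issuing one add at a time never touches the packed path.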
So What Does It All Mean?
So now that we’ve gotten past all of the geek speak, what do all these new enhancements incorporated into the P4 mean to the user in terms of realized performance?
The easiest way to determine this is to look at performance benchmark tests performed by independent evaluators. In Business Winstone 2001 and Content Creation Winstone 2001 tests, which simulate the use of common office applications, the P4 narrowly edged out the Pentium III and was actually beaten by AMD's Athlon. However, with more demanding applications, the P4 was the winner.
The memory bandwidth improvements of the P4 architecture allowed it to beat the Pentium III and AMD Athlon on 3D Winbench 2000, Quake III and Photoshop tests. These test applications transfer large amounts of data sequentially to and from memory. The P4, thanks to its SSE2 enhancements, also performed very well on floating point intensive tests using Photoshop Lighting Effects. Additionally, as one would expect, the P4 was the best performer on SPEC tests, which contain applications optimized for the P4 architecture.
So now that we have some indications of how the new P4 will perform while using real-world applications, how do we determine whether it is operationally and economically feasible to upgrade to computers using the new processor? The answer lies in using a top-down approach that looks at how the new computer will be implemented into the organization and network as a whole.
The Top-Down, not Top Gun, Approach
The top-down approach requires an understanding of business constraints and objectives, as well as of the applications and the data those applications use, before data processing, communications and networking options are considered. While normally applied to networks as a whole, the approach works for individual computers too, since most desktop and laptop computers are used as part of a network.
Using the top-down approach is fairly straightforward: you work through it by asking yourself a series of questions, layer by layer, and answering them.
The Top-Down Model
Use of the top-down approach will help ensure that the implementation of the personal computer into the network design meets the business needs and objectives that motivated the network design in the first place. To better understand the model, each layer is described below.
Business Layer – At the business layer one must understand the major business functions of the organization. What is the organization trying to achieve? What are the problems or opportunities that may be solved by upgrading to the P4? Once business or organizational level objectives are understood, one can then move to the applications layer.
Applications Layer – Here the applications that will be running on the computer must be examined. How will current applications be improved by upgrading to the P4? What other information needs can you identify and how can you relate these to business processes and opportunities? Will upgrading to the P4 allow for new applications development or procurement to satisfy these needs? The use of benchmark tests can greatly assist you in answering questions at this level.
Data Layer – After a thorough analysis of the applications that can or will be used, the data that the applications use or produce must be examined. Here you must look at not only the amount of data, but also the type of data that will be used, such as voice, video, image, fax, etc., in addition to true data.
What type of data distribution architecture is in place, and how can it be improved? How can you relate data collection and distribution to information and business needs, and how will upgrading to the P4 help meet those needs? Once again, benchmark tests can help answer many of the questions at this layer.
Network Layer – Here one must ask, what are the network requirements? How does the network support the business and information needs of the organization, and how does the computer affect the network's ability to satisfy those needs? How will upgrading to the P4 improve the network?
Technology Layer – At this point, one must finally look at the physical implementation of the computer. Although it is beyond the scope of this article to detail, one must look at not only the computer processor itself, but also all other components of the computer, such as RAM, hard drives, chipset, video cards, etc. Can an upgrade to the processor suffice or will an upgrade to a completely new computer be necessary?
In the case of the P4, an upgrade to a completely new computer will almost certainly be necessary. Additionally, you will also need to examine how the computer physically connects to the network.
The overall relationship between layers of the top-down model can be described as follows: analysis at upper layers produces requirements that are passed down to lower layers, while solutions meeting those requirements are passed back to upper layers. If this process is followed throughout the model, then the implemented technology in the bottom layer, in this case the P4, should meet the business or organizational objectives in the top layer.
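That requirements-down, solutions-up flow can be sketched in a few lines of Python. The layer names come from the model above; the two callback functions are hypothetical placeholders for your organization's actual analysis, not part of any published methodology.

```python
# Requirements flow down through the layers; candidate solutions flow back up.
LAYERS = ["Business", "Applications", "Data", "Network", "Technology"]

def top_down(requirements_for, solution_for):
    """requirements_for(layer, reqs_from_above) -> refined requirements;
    solution_for(layer, reqs) -> a solution meeting them (both caller-supplied)."""
    reqs, stack = None, []
    for layer in LAYERS:                 # pass requirements down
        reqs = requirements_for(layer, reqs)
        stack.append((layer, reqs))
    solutions = {}
    for layer, reqs in reversed(stack):  # pass solutions back up
        solutions[layer] = solution_for(layer, reqs)
    return solutions
```

Because each layer only refines what it receives from above, the technology choice at the bottom remains traceable to a business objective at the top, which is the whole point of the model.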
Performing this type of analysis, along with assessing performance benchmark information, should enable organizations to determine the potential benefits of upgrading to a newer technology and ensure that it will support the organizational objectives.
Combining this information with a cost-benefit analysis can then help you determine if upgrading to a newer technology is warranted. While the P4 chip was used here as an example, this method can be applied to any technology that will be implemented in your organization’s network.
Lt. Cmdr. Tim J. King is the special assistant for Joint Matters, Navy Personnel Command. Dr. Mark Frolick is an associate professor of Management Information Systems at the University of Memphis.
The views expressed here are solely those of the author, and do not necessarily reflect those of the Department of the Navy, Department of Defense or the United States government.