May 2000

Understanding parallel queries in SQL Server 7.0

by Scott Mauvais

When Microsoft shipped SQL Server 7.0, it added another key check box for its head-to-head competition with Oracle and DB2: the ability to execute queries in parallel on multiprocessor servers.

If you've ever watched Windows NT Performance Monitor, you might have noticed a situation where SQL Server only uses a single processor. "Wait a minute," you might have asked yourself. "I thought SQL Server was supposed to be a true, multithreaded app." Well, the answer is yes and no. Prior to version 7.0, SQL Server as a whole would take advantage of all the processors in your system and spread the work across them, but when it came down to the basic unit of work, the query, it executed everything serially. In other words, the application was multithreaded, but an individual query had only one thread of execution. Both Oracle and IBM used this fact, along with the lack of row-level locking, to great advantage when selling against SQL Server in shops with large data sets and applications that relied upon complex queries. All that has changed with the release of SQL Server 7.0.

In this article, we'll take a detailed look at the parallel query architecture and how SQL Server decides whether or not to execute a query in parallel. From there, we'll examine the performance tradeoffs and learn how to control how SQL Server executes parallel queries. For example, you can limit the number of processors SQL Server can use to process parallel queries, as shown in Figure A. Finally, we'll set up and execute some sample queries and compare the performance between the serial and parallel execution plans.

Figure A: You can configure the number of processors SQL Server uses for
parallel queries.

Parallel query architecture

First, let's make sure we understand what is meant by intra-query parallelism. When SQL Server executes a query in parallel, it breaks this single query down into subsections and spreads those subsections across multiple processors. A classic example of a parallel query is the calculation of quarterly sales. To apply this to the Northwind sample database, you would use something like the following query:

SELECT DATEPART(qq, OrderDate) AS FQ,
    DATEPART(year, OrderDate) AS FY,
    SUM(UnitPrice * Quantity) AS Sales
FROM Orders O, [Order Details] OD
WHERE O.OrderID = OD.OrderID
GROUP BY DATEPART(year, OrderDate), DATEPART(qq, OrderDate)
ORDER BY FY DESC, FQ DESC

In the classic example, this query would be broken down into four subtasks, one for each of the four quarters. On a quad-processor machine, the server would delegate one fiscal quarter to each processor. SQL Server behaves quite similarly, except that it doesn't break the query into tasks based on column values and the number of groups returned. This makes sense because the server can't know the number of distinct values until it actually executes the query, so it has no way of partitioning the query across the processors using this approach. Instead, when executing a query in parallel, SQL Server breaks the rows into streams, regardless of value, and sends each stream, consisting of a roughly equal number of rows, to a processor. Each processor then executes the complete query against its subset of the rows. Finally, near the end of the query execution, SQL Server aggregates these separate streams and produces a single result set.

Well, that's almost correct. Sometimes, not every processor participates in a
parallel query. SQL Server supports what it calls the degree of parallelism (DOP), the number of processors that actually take part in executing a given query; as we'll see, both the server's configuration and its current workload can hold the DOP below the total number of processors in the machine.

One thing that makes SQL Server's parallel query architecture stand out is its lack of explicit parallelism directives. Other databases require the developer to explicitly define a query as parallel at design time by including a specific syntax in the queries. Microsoft, by contrast, has written parallelism into SQL Server's Query Optimizer rather than writing it into the various logical operations (for example, seek, join, sequence, and so on). Instead of relying on the developer, SQL Server decides at optimization time whether to execute a parallel or serial query plan. When the Query Optimizer detects that a given query could benefit from parallelism, it inserts exchange operators into the query. These operators manage the streams and the flow control during the execution of the query. With this approach in mind, Microsoft developed what it calls parallel everything, where each query operation can execute in parallel when combined with an exchange operator.

In the SQL Server 7.0 release, the INSERT, UPDATE, and DELETE operators (just the operators, not the entire query) always execute serially to avoid race conditions. The remainder of the query, such as the WHERE clause and the SELECT portion of an INSERT query, can execute in parallel. In other words, SQL Server identifies the rows affected by the data modification in parallel, but it applies the changes serially. Microsoft has said that future versions of SQL Server will be fully parallel.

Performance tradeoffs

Let's move on and examine how the optimizer determines if a query would run faster if executed on multiple processors. To decide this, the optimizer looks at both the query itself and the current operating environment. This is a key point that also distinguishes SQL Server from its competitors. Similar to its approach with dynamic locking, Microsoft has built into SQL Server the ability to monitor its own usage and optimize queries appropriately.
Thus, depending on the current load on the server, a query that executes in parallel one time may execute serially the next. For SQL Server to execute a query in parallel, the query and the current environment must pass the following six tests:

1. Is SQL Server running on a computer with more than one processor?
2. Does the max degree of parallelism setting allow SQL Server to use more than one processor?
3. Does the estimated cost of executing the query serially exceed the cost threshold for parallelism?
4. Are enough worker threads available to service the additional streams, given the current workload?
5. Is sufficient memory available to execute the query in parallel?
6. Does the query process enough rows to justify the overhead of coordinating multiple streams?
After the optimizer has evaluated a query against these six criteria, it places exchange operators in the queries it has determined will benefit from parallel execution. As a developer or DBA, you can exercise a great deal of influence over how SQL Server handles these six criteria and, consequently, you can control the DOP at several different levels. First, you can disable all parallel queries or limit the maximum DOP at the server level. If that's too sweeping of a change, you can increase the complexity threshold a query must cross before the optimizer considers it for parallel execution. Finally, you can get extremely granular control by specifying the maximum DOP on a query-by-query basis. Let's look at each of these approaches.

Disabling parallel query processing for your server

To turn off all parallel query processing, bring up the SQL Server Properties window shown in Figure A on the cover. To do so, right-click on your server in Enterprise Manager and choose Properties. Next, click on the Processor tab. By default, SQL Server uses all available processors for executing parallel queries. To reduce the number of processors used, select the Use option button in the Parallelism section of the dialog box. Doing so enables the combo box where you can select the maximum number of processors you want SQL Server to use. This setting affects the outcome of the second test mentioned earlier. Be careful not to make changes in the Processor Control section of this dialog box. These settings affect the processors used by SQL Server itself, not just the parallel query processor.

The other parallelism setting you can change in this window is the minimum query plan threshold for considering queries for parallel execution, which corresponds to the cost threshold for parallelism configuration option.

If you prefer to change settings through queries instead of dialog boxes, you can also configure parallelism by using Transact-SQL. The following commands enable you to turn off intra-query parallelism for the entire server:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'max degree of parallelism', 1
RECONFIGURE
EXEC sp_configure 'show advanced options', 0
RECONFIGURE

Just like the maximum degree of parallelism, you can change the minimum cost for parallelism via Transact-SQL. For example, to increase the threshold to 7 seconds, you could use the following statements:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'cost threshold for parallelism', 7
RECONFIGURE
EXEC sp_configure 'show advanced options', 0
RECONFIGURE

Disabling parallel query processing for an individual query

If you require more control over individual queries, you can use an optimizer hint to control the DOP. Unlike the locking hints you might be familiar with, which affect concurrency and consistency and which you place following the FROM
clause, the DOP hint is part of the new OPTION
clause. Back to our quarterly sales example: if we want to limit the intra-query
parallelism to two processors, we could add a hint like this:
SELECT DATEPART(qq, OrderDate) AS FQ,
    DATEPART(year, OrderDate) AS FY,
    SUM(UnitPrice * Quantity) AS Sales
FROM Orders O, [Order Details] OD
WHERE O.OrderID = OD.OrderID
GROUP BY DATEPART(year, OrderDate), DATEPART(qq, OrderDate)
ORDER BY FY DESC, FQ DESC
OPTION (MAXDOP 2)

It's worth drawing attention to the specific syntax. Some of the early SQL
Server documentation, as well as several articles, list the syntax incorrectly; the form shown above is the one SQL Server actually accepts.

Why should you adjust the DOP?

With all of SQL Server's great self-tuning we keep hearing about, you might be wondering why you would want to adjust the DOP. The most common reason is if you have a server that isn't dedicated to a single task. In this case, you might want to ensure that a handful of complex queries don't monopolize all of the CPU horsepower on your server and affect the performance of all the other applications running on it. With the declining cost of hardware, the need for a server to do double duty by running both SQL Server and, say, IIS has gone away for the most part. However, this declining cost has also led to a new class of super servers that host a whole series of SQL applications, running the gamut from high-volume OLTP systems to data warehouses holding hundreds of gigabytes. With servers like this, you'll probably want to limit the DOP so you can maintain the agreed-upon quality of service across all your applications.

Turning off parallel queries is also a documented means of working around a couple of bugs in SQL Server. For more information, refer to Knowledge Base articles Q240853 and Q247500.

Now that we understand how SQL Server decides whether to run a query in parallel across a server's processors and we've discussed how to influence this behavior, let's move on and examine the performance differences between parallel and serial queries.

Analyzing the performance of parallel queries

To study the performance of parallel queries in a controlled environment, we created a script that increased the size of the sample Northwind database a thousandfold. We chose this approach primarily because it provided a repeatable test bed that wasn't based on a proprietary schema or confidential information. We'll discuss the details of the script in a future article.
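To give you a feel for the general technique, a table can be inflated by cross-joining it with a numbers table. The sketch below is a hypothetical illustration only, not the actual downloadable script, and the #Numbers and BigOrderDetails names are ours:

-- Hypothetical sketch only, not the script used for this article's tests.
-- Build a 1,000-row numbers table by cross-joining a system table with
-- itself (sysobjects easily yields well over 1,000 combinations).
SELECT TOP 1000 IDENTITY(int, 1, 1) AS n
INTO #Numbers
FROM sysobjects a CROSS JOIN sysobjects b

-- Write 1,000 copies of every [Order Details] row into a work table.
SELECT OD.OrderID, OD.ProductID, OD.UnitPrice, OD.Quantity, OD.Discount
INTO BigOrderDetails
FROM [Order Details] OD CROSS JOIN #Numbers

A real growth script would also need to handle primary keys and grow the parent tables, such as Orders and Customers, so that the joins stay meaningful.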
But, in the meantime, you can download it, along with instructions for its use, from my web site.

To compare the performance of parallel versus serial execution, we constructed a simple query that computes, for each product, the average sale, the number of customers who bought it, and the number of times it was ordered. Here's the query we used:

SELECT ProductID,
    COUNT(C.CustomerID) AS NumCustomers,
    COUNT(O.OrderID) AS NumOrders,
    AVG(UnitPrice * Quantity) AS AvgPrice
FROM Customers C, Orders O, [Order Details] OD
WHERE C.CustomerID = O.CustomerID
    AND O.OrderID = OD.OrderID
GROUP BY ProductID
ORDER BY ProductID

This query provides us with a good performance test of parallel queries because
it incorporates three large joins and a good-sized aggregation. To run this same query serially, simply append OPTION (MAXDOP 1) to it.

Using Performance Monitor

In Figure B, you can see a Windows NT Performance Monitor graph we captured of the query, first executed serially and then executed in parallel on the same dual-processor Pentium 500 server. While there are several interesting counters we could have looked at, here we chose to focus strictly on the CPU utilization. The solid black line represents the total CPU utilization across both processors.

Figure B: You can analyze serial and parallel query performance with
Windows NT Performance Monitor.

The first thing you'll undoubtedly notice is the shorter execution time of the second, parallel query. Astute readers will suspect that some of the performance gain during parallel execution came from SQL Server caching the data after it ran the query the first time. To correct for this measuring error, we used DBCC PINTABLE against the relevant tables before running these tests to ensure they were already in the data cache. To double-check our results, we reversed the order of the tests and achieved the identical outcome.

Faster execution is, of course, the expected result because of the increased CPU resources available. The interesting thing, however, is the relative amplitude of the CPU utilization between the serial and parallel queries. During the serial execution, CPU 1 (the dotted line) immediately jumps to around 90 percent utilization, while CPU 0 (the white line) languishes around 10 percent. The result is a total CPU usage of around 50 percent. Even more interesting is the apparent inverse relationship between the two processors: as one processor peaks within its range, the other drops off. While there's no way to know for sure, the most likely cause is that the thread executing the query on CPU 1 has to wait for SQL Server or Windows NT itself to return the desired data from the cache to the CPU for processing.

When we turn on parallel execution, the first thing you'll notice is the nearly
100 percent utilization of both processors. The other important aspect of the parallel query plan is that it provided only a little more than a 20 percent performance boost over the serial plan, based on a rough estimate of the CPU graphs. This shows that the scaling isn't linear, because of the overhead of coordinating the multiple processors.

Using SQL Server Profiler

To get a better measure of the relative performance of serial and parallel execution plans, we can turn to another tool provided with SQL Server, the Profiler. Before you can effectively use SQL Server Profiler to analyze parallel queries, you need to add the Event Sub Class data column. To do so, open up your trace and select File | Properties. This brings up the Trace Properties dialog box. Next, click on the Data Columns tab, highlight Event Sub Class in the Unselected Data list box, and click Add, as shown in Figure C. The Event Sub Class column lets you see the DOP for each query in the Trace window.

Figure C: You'll need to add Event Sub Class to Profiler to analyze
parallel query performance.

With the Event Sub Class column added, let's execute our sales-by-product query with both serial and parallel execution plans and look at the results. You can see our SQL Server Profiler results in Figure D.

Figure D: SQL Server Profiler enables you to view serial and parallel
DOP, CPU usage, and duration.

You can tell the first query (above the highlight) is the serial one both by its Event Sub Class value, which shows a DOP of 1, and by its longer duration.

Conclusion

In this article, we covered SQL Server's new intra-query parallelism. While previous versions of SQL Server were indeed multithreaded, an individual query would execute on only a single thread. We discussed how SQL Server breaks a query into subtasks and assigns each to a separate processor. We then looked at how the SQL Server Query Optimizer decides whether to use a serial or parallel execution plan. Next, we looked at three methods that we as developers and DBAs can employ to control the degree of parallelism, from the server as a whole down to the level of an individual query. Finally, we showed some examples of parallel queries and compared their performance to normal serial queries.

With the addition of parallel query execution, SQL Server takes an important step toward being a real, enterprise-class database, now that it captures the wasted horsepower in SMP servers by using the full CPU resources available. This faster execution comes at a cost, though, as coordinating a query's multiple threads requires additional memory and CPU cycles.