MICROSOFT CORPORATION

PAPER EXPLAINS
Why do we need massively parallel processing (MPP)?
What is PDW?
Why is query optimization (QO) complex in PDW?
What changes were made to the SQL Server optimizer?
How is cost calculated?

PDW
Composed of hardware and software
Shared-nothing (loosely coupled) architecture
SQL Server uses symmetric multiprocessing (SMP), which runs on a single server
MPP runs several servers in parallel, independently of one another
Cost-effective
Easy to add extra servers and storage
Components (CPUs, memory, storage, etc.) are easily upgraded or individually replaced

SQL SERVER PDW ARCHITECTURE

CONTROL NODE & COMPUTE NODE
Control Node:
Distributes queries among the compute nodes
Accepts client connections (ODBC, OLE DB, ADO.NET)
Contains additional software to support the distributed architecture of PDW
Manages DMS, the communication layer between nodes
Contains the shell database
Performs database authentication and authorization
Compute Node:
Hosts a single SQL Server instance
Runs DMS for communication and data transfer
Each node contains a portion of the user data (hash-partitioned)

SHELL DATABASE
Contains the metadata of user tables
Holds no user data
Indistinguishable from a database that contains the actual data
Used for testing and debugging compilation issues
Additionally stores:
Users and privileges
Checks security and access rights
Provides the same security as SQL Server
Contains global statistics
Local statistics are calculated on each node and merged into the global statistics

DATA MOVEMENT SERVICE (DMS)
Responsible for moving data between all the nodes of the appliance
In one case, intermediate results move from one compute node to another
In another case, one or more compute nodes move intermediate results to the control node
The control node computes final aggregations and sorting before returning the results
Uses temp tables to store intermediate results
In some cases, a query produces its result directly, without intermediate tables, and the result is sent back to the client (DMS is not involved)

DSQL PLAN AND ITS EXECUTION
A DSQL plan contains:
SQL operations – executed directly in SQL Server
DMS operations – move data among nodes
Temp table operations
Return operations – push data to clients
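As a rough illustration of the four step types listed above, a DSQL plan can be modelled as an ordered list of typed steps. This is only a sketch: the names StepKind, DsqlStep and describe_plan are hypothetical and do not reflect PDW's internal representation. The two sample steps paraphrase the Orders/Customer join example from this deck.

```python
from dataclasses import dataclass
from enum import Enum


class StepKind(Enum):
    """The four kinds of step a DSQL plan can contain (names are illustrative)."""
    SQL = "SQL operation"                # executed directly in SQL Server
    DMS = "DMS operation"                # moves data among nodes
    TEMP_TABLE = "Temp table operation"
    RETURN = "Return operation"          # pushes data to the client


@dataclass(frozen=True)
class DsqlStep:
    kind: StepKind
    description: str


def describe_plan(steps):
    """Return one numbered line per step, in the order the steps would run."""
    return [
        f"{i}. {step.kind.value}: {step.description}"
        for i, step in enumerate(steps, start=1)
    ]


# Hypothetical two-step plan for the Orders/Customer join example.
plan = [
    DsqlStep(StepKind.DMS, "repartition Orders on o_custkey"),
    DsqlStep(StepKind.RETURN, "send the joined tuples to the client"),
]
lines = describe_plan(plan)
for line in lines:
    print(line)
```

Keeping the plan as a plain ordered list mirrors the idea that steps are consumed one after another, while the work inside each step is left to the nodes.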
Query plans are executed serially, one step at a time.
However, each step runs in parallel across the nodes

DSQL PLAN EXAMPLE
Assume the Customer table is partitioned on c_custkey and the Orders table on o_orderkey
SELECT c_custkey, o_orderdate
FROM Orders, Customer
WHERE o_custkey = c_custkey AND o_totalprice > 100
The tables are not compatible for this join: Orders is partitioned on o_orderkey (its primary key), not on the join column o_custkey
DSQL plan
1. DMS operation: repartitions the Orders table on o_custkey
2. Return SQL operation: sends the final tuples to the client

QUERY OPTIMIZATION
PDW Parser
SQL Server compilation
XML generator
PDW Query Optimizer

COST BASED QUERY OPTIMIZATION IN PDW

DSQL PLAN GENERATION
Input: physical operator tree
Output: DSQL-formatted plan
Framework: QRel programming
Sends SQL queries to the compute nodes instead of the operator tree (unlike other MPP systems, e.g. Greenplum)
SQL statements are executed on the underlying compute nodes, and DMS operations are used to transfer data
Similar to the Aster Data approach

DSQL PLAN GENERATIONS – QREL FRAMEWORK

DMS OPERATIONS
7 DATA MOVEMENT OPERATIONS:
1. Shuffle Move (many-to-many). Rows are moved from each compute node to the target table based on a hash of the value in the specified distribution column.
2. Partition Move (many-to-one). Rows are moved from each compute node to the target table on the target node (typically the control node, but this is not a requirement).
3. Control-Node Move (from the control node to the compute nodes). A table on the control node is replicated to all compute nodes.
4. Broadcast Move. Rows are moved from each compute node to the target table on all compute nodes.
5. Trim Move. Initiated against a replicated table on all compute nodes, where the destination is a distributed table on the same nodes. Hashing is applied so that each node keeps only the rows it is responsible for.
6. Replicated Broadcast. A table that exists on only one compute node is replicated to the other nodes via a broadcast move.
7. Remote copy to single node.
Can be either a remote copy of a replicated table (from the control node or from a compute node) or a remote copy of a distributed table.

COST OF DMS OPERATIONS
Separated into two components: source (sending side) and target (receiving side)
Source and target are each divided into subcomponents
Source: C_source = max(C_reader, C_network)
Target: C_target = max(C_writer, C_SQLBlkCpy)
DMS operation cost: C_DMS = max(C_source, C_target)
Data transmission happens asynchronously
Source and target operations run in parallel on each node

QUERIES?
THANK YOU!!
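The cost composition under COST OF DMS OPERATIONS can be sketched in a few lines of Python. This is a minimal illustration, not PDW's cost model code: the class and function names, the four sub-component names (reader, network, writer, bulk_copy) and every numeric value are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DmsComponentCosts:
    """Hypothetical sub-component costs, all in the same arbitrary unit."""
    reader: float     # source side: reading rows out of SQL Server
    network: float    # source side: sending rows over the network
    writer: float     # target side: receiving and unpacking rows
    bulk_copy: float  # target side: bulk-inserting rows into a temp table


def source_cost(c):
    """Sending side: the slower of its two sub-components dominates."""
    return max(c.reader, c.network)


def target_cost(c):
    """Receiving side: the slower of its two sub-components dominates."""
    return max(c.writer, c.bulk_copy)


def dms_cost(c):
    """Whole DMS operation: the slower of the two sides dominates."""
    return max(source_cost(c), target_cost(c))


# Made-up numbers purely to exercise the functions.
example = DmsComponentCosts(reader=2.0, network=3.5, writer=1.0, bulk_copy=5.0)
total = dms_cost(example)
print(source_cost(example), target_cost(example), total)
```

The use of max rather than a sum follows from the two points stated on the slide: data transmission is asynchronous and the source and target sides work in parallel, so the slowest component bounds the elapsed time.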