









## We've All Followed This Path ...

Given the clear and pressing need for improved computer performance, there are several means of achieving this end. In the simplest approach, current computer architectures are reimplemented using faster device technologies. Although this approach will always be exploited, physical, technological and economic limitations make it incapable of providing all the needed computational power. Instead, parallelism must be exploited to obtain truly significant performance improvements.

November 1987






## One, Two, Three, Many?

[Figure: chip diagrams built from repeated Core/Cache blocks, growing from a few cores to many, captioned "Serious Parallelism and Optimization"]

Acknowledgment: Tim Mattson, Intel

#### © 2007 Microsoft Corporation. All rights reserved.


# Think Chocolates, Not Cookies

- Sugar cookies
  - Similar, modulo process variation
  - You must eat lots to be satisfied
- Designer chocolates
  - Diversity is a feature
  - Forrest Gump was right






Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.


## The Other Axis: Core Complexity

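- Remember Amdahl's Law: $\text{Speedup} = \left(S + \frac{1-S}{N}\right)^{-1}$, where $S$ is the serial fraction of the work and $N$ is the number of processors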
- Multicore implications?
  - Symmetric or asymmetric cores
  - Legacy and new code
  - Programming heterogeneity
- Some very nice work by Mark Hill
  - "Amdahl's Law in the Multicore Era," M. D. Hill and M. R. Marty, IEEE Computer, July 2008

G. Amdahl, "Validity of the Single Processor Approach to Achieving Large-Scale Computing Capabilities," *AFIPS Conference Proceedings*, pp. 483-485, 1967
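To make the Amdahl's Law formula on this slide concrete, here is a minimal Python sketch. The function names `amdahl_speedup` and `speedup_limit` and the sample serial fraction of 0.1 are illustrative assumptions, not part of the original talk.

```python
import math


def amdahl_speedup(serial_fraction: float, n_cores: int) -> float:
    """Amdahl's Law: Speedup = (S + (1 - S) / N) ** -1.

    serial_fraction (S) is the share of the work that cannot be parallelized;
    n_cores (N) is the number of identical processors.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)


def speedup_limit(serial_fraction: float) -> float:
    """Upper bound on speedup as N grows without limit: 1 / S."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    return math.inf if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    s = 0.1  # hypothetical workload: 10% of the work is serial
    for n in (1, 2, 4, 16, 64, 1024):
        print(f"N={n:5d}  speedup={amdahl_speedup(s, n):6.2f}")
    print(f"limit as N grows: {speedup_limit(s):.1f}")
```

Under these assumptions the speedup can never exceed 1/S however many cores are added. That bound is why the serial fraction, and hence the speed of the core running the serial code, bears on the question of symmetric versus asymmetric designs.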



### Architectural Futures

- Replication of tweaked cores
  - interconnect (it really matters)
  - mix of core types
    - heterogeneity and programmability
- Or, more radical ideas …
- Other issues …
  - process variation and cores
  - performance and reliability
  - dynamic power management

# Where is our architectural vision? Where are the new ideas?


## Variability Cause and Estimated Impact on Delay

| Time domain (sec)    | Mechanism                      | Delay impact (3σ) |
|----------------------|--------------------------------|-------------------|
| 1 × 10<sup>10</sup>  | Lithography node               | 29%               |
| 1 × 10*              | Electromigration               | 5%                |
| 1 × 10<sup>4</sup>   | Hot electron effect            | 6%                |
| 1 × 10°              | NBTI                           | 19%               |
| 1 × 10<sup>4</sup>   | Chip electrical mean variation | 18%               |
| 1 × 10<sup>4</sup>   | Across-chip L variation        | 15%               |
| 1 × 10°              | Self heating/temperature       | 12%               |
| 1 × 10<sup>8</sup>   | SOI history effect             | 10%               |
| 1 × 10<sup>+0</sup>  | Supply voltage                 | 17%               |
| 1 × 10"              | Line-to-line coupling          | 10%               |
| 1 × 10<sup>11</sup>  | Residual S/D charge            | 5%                |


## The Siren Call ...



- We've seen parts of this movie before
  - Vector processors, systolic arrays, attached processors
- Success requires optimizing for efficiency
  - data movement, computation and software costs
- Efficient exploitation, in two senses
  - achieved application performance
    - holistic assessment, not just application kernels
  - high human productivity
    - extant software base, available tools
- We must raise the abstraction level ...
  - Managed code and SLAs
  - Performance and failures




## Programming Groups

- Three developer groups
  - Heroes
  - Mainstream
  - Entry/Novice
- Each with differing needs
  - Heroes
    - "Neurosurgery? No problem, hand me that screwdriver."
  - Mainstream
    - The typical computing graduate
  - Entry/Novice
    - Think of Visual Basic developers




### Breaking Through The Brick Wall

"Industry Giants Try to Break Computing's Dead End," John Markoff, *The New York Times*, March 19, 2008

"Intel and Microsoft said Tuesday that they planned to finance two groups of university researchers to start over and design a new generation of computing systems intended to break the industry out of a technological cul-de-sac that threatens to end decades of performance increases in computers."

### Microsoft/Intel UPCRCs

- Two academic research centers
  - UC-Berkeley and UIUC
- Jointly funded by Intel and Microsoft
  - \$20M over five years
  - Matching funds from each institution

### Rationale


- Long-term approaches to parallel computing
- Integrated thinking, from applications to architectures


















