Software Exploits of Instruction-Level Parallelism For Supercomputers

International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR) ISSN 2249-6831 Vol.
2, Issue 4, Dec 2012 19-38 TJPRC Pvt. Ltd.,
SOFTWARE EXPLOITS OF INSTRUCTION-LEVEL PARALLELISM FOR SUPERCOMPUTERS

S. N. TAZI, PRAKASH MEENA, ISHITA SHARMA, A. K. DUBEY & NEETU SHARMA
M.Tech Scholar, Computer Engineering, Govt. Engineering College, Ajmer-305002, Rajasthan, India
M.Tech Scholar, Computer Engineering, Govt. Women Engineering College, Ajmer-305002, Rajasthan, India; IEEE Member
Govt. Engineering College, Ajmer-305002, Rajasthan, India
ABSTRACT
For decades, hardware algorithms have dominated the field of parallel processing. With Moore's law reaching its limit, however, the need for software pipelining is being felt. This area has long eluded researchers, although a significant measure of success has been obtained in graphics processing using software approaches to pipelining. This project aims at developing software to detect the various kinds of data dependencies (data flow dependency, anti-dependency and output dependency) in a basic code block. Graphs are generated for each kind of dependency present in a code block and combined to obtain a single data dependency graph. This graph is processed further to obtain a transitive closure graph and finally an ILP graph. The ILP graph can be used to predict the possible combinations of instructions that may be executed in parallel. A scheduling algorithm is developed to obtain an instruction schedule in the form of instruction execution start times. The schedule obtained is used to compute performance metrics such as speed-up factor, efficiency and throughput.
KEYWORDS: ILP, Dependencies, System Design, Agile, Performance Metrics
INTRODUCTION

Instruction-level parallelism (ILP) is a measure of how many of the operations in a computer program can be performed simultaneously. A central goal of compiler and processor designers is to identify ILP and take advantage of as much of it as possible. Ordinary programs are written under a sequential execution model, in which instructions execute one after another in the order specified by the programmer. ILP allows the compiler and the processor to overlap the execution of multiple instructions, or even to change the order in which instructions are executed [1].
To achieve high performance, supercomputers use both super-pipelined and EPIC (Explicitly Parallel Instruction Computing) processors. Of the two common approaches to exploiting ILP, hardware based and software based, this work takes the software-based approach. How much ILP exists in a program depends strongly on the application. In fields such as graphics and scientific computing the amount of ILP can be very large, whereas workloads such as cryptography exhibit much less.
Micro-architectural techniques used to exploit ILP include:
• Instruction pipelining, in which the execution of multiple instructions is partially overlapped.
• Superscalar execution, VLIW and the closely related Explicitly Parallel Instruction Computing concepts, in which multiple execution units are used to execute multiple instructions in parallel.
• Out-of-order execution, in which instructions execute in any order that does not violate data dependencies. This technique is independent of both pipelining and superscalar execution. Current implementations extract ILP from ordinary programs dynamically, while the program is executing. An alternative is to extract this parallelism at compile time and convey the information to the hardware. Because out-of-order execution is complex to scale, the industry has re-examined instruction sets that explicitly encode multiple independent operations per instruction.
• Register renaming, a technique used to avoid the unnecessary serialization of program operations that is imposed by the reuse of registers by those operations.
• Speculative execution, in which instructions, or parts of instructions, are executed before the target of a control flow instruction has been determined.
• Branch prediction, which is used together with speculative execution to avoid stalling until control dependencies are resolved. [1]
Figure 1.1: A Canonical Five-Stage Pipeline in a RISC Machine (IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, MEM = Memory Access, WB = Register Write Back) [2]
DEPENDENCIES
In computer science, a data dependency is a situation in which an instruction or program statement refers to the data of a preceding statement. In compiler theory, dependence analysis is the technique used to discover data dependencies among statements (or instructions). Two common types of dependency are described below.
DATA DEPENDENCIES
Assume statements S1 and S2. S2 depends on S1 if:
$$[I(S_1) \cap O(S_2)] \cup [O(S_1) \cap I(S_2)] \cup [O(S_1) \cap O(S_2)] \neq \emptyset$$
where $I(S_i)$ is the set of memory locations read by $S_i$, $O(S_j)$ is the set of memory locations written by $S_j$, and there is a feasible run-time execution path from S1 to S2. This condition is called the Bernstein condition, after A. J. Bernstein. Three cases exist:
• True (data) dependence: $O(S_1) \cap I(S_2) \neq \emptyset$, S1 -> S2, and S1 writes something read by S2.
• Anti-dependence: $I(S_1) \cap O(S_2) \neq \emptyset$, S1 -> S2, and S1 reads something that S2 later overwrites (the mirror image of true dependence).
• Output dependence: $O(S_1) \cap O(S_2) \neq \emptyset$, S1 -> S2, and both statements write the same memory location.
True Dependencies
A true dependency, also known as a data dependency, occurs when an instruction depends on the result of a previous instruction.
Anti-dependence
An anti-dependency occurs when an instruction requires a value that is updated by a later instruction.
Output Dependencies
An output dependency occurs when the ordering of instructions affects the final output value of a variable.
A commonly used naming convention for data dependencies is the following:
• Read-after-Write (true dependence)
• Write-after-Write (output dependence)
• Write-after-Read (anti-dependence)
CONTROL DEPENDENCY
Programs are written under a sequential execution model, in which instructions execute one after another, atomically. Dependencies among instructions can prevent the parallel execution of multiple instructions: a processor that exploits instruction-level parallelism without taking these dependencies into account risks producing wrong results, a situation known as a hazard. This project is restricted to data dependencies and does not deal with control dependencies. [2]
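The three Bernstein cases can be checked mechanically once the read and write sets of two statements are known. The following is a minimal sketch, not the authors' implementation: the class name `BernsteinCheck`, the enum `Dependence` and the helper `setOf` are hypothetical names introduced for illustration, and S1 is assumed to precede S2 in program order.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: classifies the data dependencies of S2 on an earlier S1. */
final class BernsteinCheck {

    enum Dependence { TRUE_RAW, ANTI_WAR, OUTPUT_WAW }

    /** Convenience constructor for a set of location names (illustrative only). */
    static Set<String> setOf(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    /** Returns true when the two sets share at least one location. */
    private static boolean intersects(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return !common.isEmpty();
    }

    /**
     * in1/out1 are the locations read/written by S1, in2/out2 those of S2.
     * S1 is assumed to come before S2 on a feasible execution path.
     */
    static EnumSet<Dependence> classify(Set<String> in1, Set<String> out1,
                                        Set<String> in2, Set<String> out2) {
        EnumSet<Dependence> found = EnumSet.noneOf(Dependence.class);
        if (intersects(out1, in2))  found.add(Dependence.TRUE_RAW);   // O(S1) and I(S2) overlap
        if (intersects(in1, out2))  found.add(Dependence.ANTI_WAR);   // I(S1) and O(S2) overlap
        if (intersects(out1, out2)) found.add(Dependence.OUTPUT_WAW); // O(S1) and O(S2) overlap
        return found;
    }

    public static void main(String[] args) {
        // S1: R1 = R2 + R3      S2: R2 = R1 + R4
        EnumSet<Dependence> deps = classify(
                setOf("R2", "R3"), setOf("R1"),
                setOf("R1", "R4"), setOf("R2"));
        System.out.println(deps); // prints [TRUE_RAW, ANTI_WAR]
    }
}
```

An empty result means the Bernstein condition does not hold for the pair, so the two statements are candidates for parallel execution as far as data dependencies are concerned.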
REQUIREMENT ELICITATION
Basic
Requirement elicitation captures the information relevant to system development: the customer's details, the client's problem, and the developers suited to that problem. It acts as the interface between the customer's statement of the problem and the system specification used by the development team. Its main purpose is to capture the customer's view of the system. [3][4]
During requirements analysis the analyst concentrates on two things: understanding the real problem clearly, and working out a procedure for solving it. The problem may be to automate an existing system, to develop a new automated system, or a combination of the two. Large systems have many features and perform many different tasks, so understanding the requirements of the system is itself a major task. The analyst examines the problem and its context, which requires a thorough understanding of the existing system and of the parts that are to be automated.
Proposed System
This project aims at developing software to detect the various kinds of data dependencies (data flow dependency, anti-dependency and output dependency) in a basic code block.
Figure 1.2: ILP Application Algorithm (flowchart steps, starting from instructions without ILP: detect data flow dependency; detect anti-dependency; detect output dependency; obtain data dependency graph; obtain transitive closure graph; obtain architectural restrictions graph; obtain dependence graph; obtain ILP graph; obtain instruction schedule; compute performance metrics)
Graphs would be generated for the various kinds of dependencies present in a code block and would be combined to obtain a single data dependency graph. This graph would further be processed using certain backtracking algorithms to obtain a transitive closure graph (TCG). The TCG is an indication of the various kinds of dependencies and can be used to predict the possible combinations of instructions that may execute in parallel. Finally an ILP graph would be obtained. A scheduling algorithm would be applied to obtain an instruction schedule in the form of instruction start times. Certain performance metrics would then be computed.
Specification of Software & Hardware
Processor: Intel Core2 Duo @ 2.66GHz
RAM: 2GB DDR2
Hard Disk: Samsung HD 161 (160 GB)
Operating system: Fedora 10 (Linux kernel version 2.6.27.5-117.fc10.i686)
X-Windows system: GNOME
Editor: Gedit
Development kit: JDK 1.6
Programming paradigm: Object Oriented
Programming language: JAVA 2 SE
Development philosophy: Agile
Process model: Scrum
Technology: Open Source
Image manipulator: GIMP
SYSTEM DESIGN
The objective of analysis modeling is to create a variety of representations that depict software requirements for information, function, and behavior. To accomplish this, two different modeling philosophies can be applied: structured analysis and object-oriented analysis. Structured analysis views software as an information transformer. It helps the software engineer identify data objects and the relationships between them, and shows how functions transform those objects as they flow through the system. Object-oriented analysis examines a problem domain defined as a set of use-cases in an effort to extract classes that define the problem. Each class has a set of attributes and operations. Classes are related to one another in a number of ways and are modeled using UML diagrams. The analysis model is composed of four modeling elements: scenario-based, class-based, flow and behavioral models.
Scenario-Based Modeling
This model represents software requirements from the user's point of view. The use-case, a narrative or template-driven description of an interaction between an actor and the software, is the primary modeling element. Derived during requirement elicitation, the use-case defines the key steps for a specific function or interaction. The degree of use-case formality and detail varies, but the end result provides necessary input to all other analysis modeling activities. Scenarios can also be described using an activity diagram, a flowchart-like graphical representation that depicts the processing flow within a specific scenario.
Figure 2: Use-Case Diagram
A use-case captures the interactions that occur between producers and consumers of information and the system itself. Requirements gathering mechanisms are used to identify stakeholders, define the scope of the problem, specify overall operational goals, outline all known functional requirements, and describe the objects that will be manipulated by the system.
Flow Modeling
Flow models focus on the flow of data objects as they are transformed by processing functions. Derived from structured analysis, flow models use the data flow diagram (DFD), a modeling notation that depicts how input is transformed into output as data objects move through a system.
Figure 3: Context-Level DFD
Each software function that transforms data is described by a process specification or narrative. In addition to data flow, this modeling element also depicts control flow, a representation that illustrates how events affect the behavior of a system. The DFD takes an input-process-output view of a system. That is, data objects flow into the software, are transformed by processing elements, and resultant data objects flow out of the software.
Figure 4: Level 1 DFD
Figure 5: Level 2 DFD that Refines the Detect Direct Dependency Process
Class-Based Modeling
Figure 6: Class Diagram
Behavioral Modeling
Data objects are represented by labeled arrows and transformations are represented by bubbles. The DFD is presented in a hierarchical fashion. That is, the first data flow model (sometimes called a level 0 DFD or context diagram) represents the system as a whole. Subsequent data flow diagrams refine the context diagram, providing increasing detail with each subsequent level.
Figure 7: Sequence Diagram
SYSTEM ANALYSIS
Agile Design Philosophy
Agile is a philosophy and a set of guidelines for building software. The philosophy encourages customer satisfaction and early delivery of software, motivates the development team, and minimizes the work products that have to be produced. The guidelines stress continuous communication between analysts, developers and the client.
Manifesto for agile software development:
We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:
• Individuals and interactions over processes and tools
• Working software over comprehensive documentation
• Customer collaboration over contract negotiation
• Responding to change over following a plan
That is, while there is value in the items on the right, we value the items on the left more. [3][4]
Software engineers and other project stakeholders work together on an agile team, a team that is self-organizing and in control of its own destiny.
Figure 8: Agile v/s waterfall
An agile team fosters communication and collaboration among all who serve on it. Agile development may be best termed as software engineering lite. The basic framework activities (customer communication, planning, modeling, construction, delivery and evaluation) remain. But they morph into a minimal task set that pushes the project team toward construction and delivery (some argue that this is done at the expense of problem analysis and solution design). Customers and software engineers who have adopted the agile philosophy have the same view: the only really important work product is an operational software increment that is delivered to the customer on the appropriate commitment date.
The Agile Alliance defines 12 principles for those who want to achieve agility [5]:
1. Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
2. Welcome changing requirements, even late in development. Agile processes harness change for the customer's competitive advantage.
3. Deliver working software frequently, from a couple of weeks to a couple of months, with a preference for the shorter timescale.
4. Business people and developers must work together daily throughout the project.
5. Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
6. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
7. Working software is the primary measure of progress.
8. Agile processes promote sustainable development. The sponsors, developers and users should be able to maintain a constant pace indefinitely.
9. Continuous attention to technical excellence and good design enhances agility.
10. Simplicity, the art of maximizing the amount of work not done, is essential.
11. The best architectures, requirements and designs emerge from self-organizing teams.
12. At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.
Agility can be applied to any software process. However, to accomplish this, it is essential that the process be designed in a way that allows the project team to adapt tasks and to streamline them, conduct planning in a way that understands the fluidity of an agile development approach, eliminate all but the most essential work products and keep them lean, and emphasize an incremental delivery strategy that gets working software to the customer as rapidly as feasible for the product type and operational environment. [10]
Any agile software process is characterized in a manner that addresses three key assumptions about the majority of software projects [6][7][11]:
1. It is difficult to predict in advance which software requirements will persist and which will change. It is equally difficult to predict how customer priorities will change as a project proceeds.
2. For many types of software, design and construction are interleaved. That is, both activities should be performed in tandem so that design models are proven as they are created. It is difficult to predict how much design is necessary before construction is used to prove the design.
3. Analysis, design, construction, and testing are not as predictable (from a planning point of view) as we might like.
A number of key traits must exist among the people on an agile team [8][9]:
• Common focus
• Competence
• Collaboration
• Decision-making ability
• Fuzzy problem-solving ability
• Mutual trust and respect
• Self-organization
Scrum Process Model [13]
Scrum is an agile process model that was developed by Jeff Sutherland and his team in the early 1990s. In recent years, further development of the Scrum methods has been performed by Schwaber and Beedle. Scrum principles are consistent with the agile manifesto.
Figure 9: Scrum [13]
Scrum emphasizes the use of a set of software process patterns that have proven effective for projects with tight timelines, changing requirements, and business criticality. Each of these process patterns defines a set of development activities:
Backlog: a list of project requirements prioritized according to business value. Items can be added to the backlog at any time (this is how changes are introduced). The project manager assesses the backlog and updates the priorities as required.
Figure 10: Prevalence of Scrum[13]
Sprints: work units that are required to achieve a high-priority requirement taken from the backlog and that must be completed within a predefined time-box. During the sprint, the backlog items that the sprint work units address are frozen (i.e., changes are not introduced during the sprint). Hence, the sprint allows the team members to work in a short-term but stable environment.
Scrum meetings: short meetings held daily by the Scrum team. Three key questions are asked and answered by all team members:
• What did you do since the last meeting?
• What obstacles are you encountering?
• What do you plan to accomplish by the next team meeting?
Demos: deliver the software increment to the customer so that functionality that has been implemented can be demonstrated and evaluated by the customer. It is important to note that the demo may not contain all planned functionality, but rather those functions that can be delivered within the time-box that was established.
CONSTRUCTION
Base
The programming paradigm used for coding is object oriented. It provides ease of development through constructs like classes, constructors, inheritance, interfaces, encapsulation, and packages. Java provides a rich set of language features such as pre-defined classes and methods in the form of packages, interfaces for establishing guidelines for methods, data hiding, and event handling with AWT. The user interface for this software has been designed using Swing, which provides lightweight components compared to AWT. The use of AWT in this software is restricted to event handling. Java gives this software its present platform-independent form. Security is ensured by the sandbox model of the JVM. The packages most prominently used in the development of this software include javax.swing, java.awt, java.util, java.awt.geom, and java.awt.event. The interfaces used in this software include ActionListener and Runnable.
Transitive Closure Graph
The transitive closure graph is the union of the direct and the indirect dependencies. Given an n-vertex digraph G, we construct the transitive closure graph of G as another n-vertex digraph H by adding edges to G according to the following rule: H contains an edge (i, j) directed from vertex i to vertex j if, and only if, there is a directed path (of any length 1, 2, 3, ..., n-1) from i to j in G.
One way to compute the transitive closure of G in $\Theta(n^3)$ time, which saves time and space in practice, is to substitute the logical operations $\lor$ (logical OR) and $\land$ (logical AND) for the arithmetic operations min and + in the Floyd-Warshall algorithm [12]. For $i, j, k = 1, 2, \ldots, n$, define $t_{ij}^{(k)}$ to be 1 if there is a path in G from vertex i to vertex j with all intermediate vertices in the set $\{1, 2, \ldots, k\}$, and 0 otherwise. We construct the transitive closure $G^* = (V, E^*)$ by putting edge (i, j) into $E^*$ if and only if $t_{ij}^{(n)} = 1$. The recursive definition is
$$t_{ij}^{(0)} = \begin{cases} 0 & \text{if } i \neq j \text{ and } (i, j) \notin E, \\ 1 & \text{if } i = j \text{ or } (i, j) \in E, \end{cases}$$
and for $k \geq 1$,
$$t_{ij}^{(k)} = t_{ij}^{(k-1)} \lor \left( t_{ik}^{(k-1)} \land t_{kj}^{(k-1)} \right).$$
Transitive-Closure (G)
    n ← |V[G]|
    for i ← 1 to n
        do for j ← 1 to n
            do if i = j or (i, j) ∈ E[G]
                then t_ij(0) ← 1
                else t_ij(0) ← 0
    for k ← 1 to n
        do for i ← 1 to n
            do for j ← 1 to n
                do t_ij(k) ← t_ij(k-1) ∨ (t_ik(k-1) ∧ t_kj(k-1))
    return T(n)
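The Transitive-Closure procedure translates almost line for line into Java. The sketch below is illustrative rather than the authors' code: the class name `TransitiveClosure` and method `close` are hypothetical, vertices are numbered from 0, and a single boolean matrix is updated in place instead of keeping the n + 1 matrices $T^{(0)}, \ldots, T^{(n)}$ (a standard space saving that leaves the result unchanged).

```java
/** Hypothetical helper: Warshall-style boolean transitive closure. */
final class TransitiveClosure {

    /**
     * adj[i][j] is true when the graph has an edge i -> j.
     * Returns t where t[i][j] is true when i == j or j is reachable from i.
     */
    static boolean[][] close(boolean[][] adj) {
        int n = adj.length;
        boolean[][] t = new boolean[n][n];
        // Initialisation: t^(0) has a 1 on the diagonal and for every edge.
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                t[i][j] = (i == j) || adj[i][j];
            }
        }
        // Allow vertex k as an intermediate vertex, one k at a time.
        for (int k = 0; k < n; k++) {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    t[i][j] = t[i][j] || (t[i][k] && t[k][j]);
                }
            }
        }
        return t;
    }

    public static void main(String[] args) {
        // Chain 0 -> 1 -> 2 plus an isolated vertex 3.
        boolean[][] adj = new boolean[4][4];
        adj[0][1] = true;
        adj[1][2] = true;
        boolean[][] t = close(adj);
        System.out.println(t[0][2]); // true: indirect dependency found by the closure
        System.out.println(t[2][0]); // false: no path back
        System.out.println(t[0][3]); // false: vertex 3 is unreachable
    }
}
```

In the dependency setting, an entry that is true for a pair of distinct instructions means one must wait, directly or indirectly, for the other; pairs for which neither direction is set are the candidates for the ILP graph.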
Scheduling Algorithm
Schedule (T, index)
    unscheduled_count := index
    initialize inst_state to 0
    initialize pipeline_stage to 0
    while unscheduled_count > 0
        do if stage = EMPTY
            then sel_stage ← stage
        for j ← 1 to index
            if inst_state = UNPROCESSED
                then while dependency or unprocessed predecessor exists
                    if sched_condition
                        break
                    else stage = OCCUPIED
                        sel_stage = stage_no
                        inst_state := PROCESSED
                        time_array[index] := clock
        update stage counters and clock
    return time_array
where
• T is the ILP graph
• time_array is an array that stores the execution start times of instructions
• sel_stage represents the pipeline stage to which an instruction has been supplied
• index denotes the total number of instructions
• inst_state denotes whether an instruction has been scheduled or not
• stage denotes whether a pipeline stage is empty or occupied
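To make the idea of an instruction schedule expressed as start times concrete, here is a small greedy scheduler. It is a simplified sketch and not the Schedule procedure above: the class `ListScheduler`, the parameters `latency` and `width`, and the policy (visit instructions in program order, start each one in the first cycle in which all its predecessors have finished and an issue slot is free) are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a greedy, program-order list scheduler. */
final class ListScheduler {

    /**
     * dep[i][j] == true means instruction j must wait for instruction i (i < j).
     * latency is the number of cycles before a result is available (assumed uniform).
     * width is the number of instructions that may start in the same cycle (assumed).
     * Returns the start cycle of every instruction.
     */
    static int[] schedule(boolean[][] dep, int latency, int width) {
        if (width < 1) {
            throw new IllegalArgumentException("width must be at least 1");
        }
        int n = dep.length;
        int[] start = new int[n];
        Map<Integer, Integer> issuedPerCycle = new HashMap<>();
        for (int j = 0; j < n; j++) {
            // Earliest cycle in which every predecessor has produced its result.
            int cycle = 0;
            for (int i = 0; i < j; i++) {
                if (dep[i][j]) {
                    cycle = Math.max(cycle, start[i] + latency);
                }
            }
            // Skip cycles whose issue slots are already taken.
            while (issuedPerCycle.getOrDefault(cycle, 0) >= width) {
                cycle++;
            }
            issuedPerCycle.merge(cycle, 1, Integer::sum);
            start[j] = cycle;
        }
        return start;
    }

    public static void main(String[] args) {
        // Five instructions: 0 -> 1 -> 3 form a chain, 2 and 4 are independent.
        boolean[][] dep = new boolean[5][5];
        dep[0][1] = true;
        dep[1][3] = true;
        // Five-cycle instructions, two issue slots per cycle.
        System.out.println(Arrays.toString(schedule(dep, 5, 2))); // [0, 5, 0, 10, 1]
    }
}
```

The independent instructions 2 and 4 fill cycles that the dependent chain would otherwise leave idle, which is the effect an ILP-aware schedule aims for.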
Instruction Set
This software operates on a basic code block written in a generic instruction set. All instructions are assumed to take five clock cycles.
Transfer instructions, such as:
    MOV RD, RS
    MVI R, 8-BIT
    OUT [ADDRESS]
    IN [ADDRESS]
Arithmetic instructions, such as:
    ADD R
    ADI 8-BIT
    SUB R
    SUI 8-BIT
    INR R
    DCR R
Logic instructions, such as:
    ANA R
    ANI 8-BIT
    ORA R
    ORI 8-BIT
    XRA R
    XRI 8-BIT
Machine control instructions, such as:
    HLT
    NOP
Notes:
• The accumulator is used as an implicit register.
• Instructions like INR are presumed to both use and modify the associated register.
• Branch instructions like JMP have not been scheduled because control dependencies are not dealt with at this stage of the project.
• GIGO (garbage in, garbage out), a fundamental law of computer science, applies here as well: this software has no explicit error handling facility.
Architectural Restrictions
Some processors restrict which instructions can be combined in parallel. Such architectural restrictions may be represented by an architectural restrictions graph, which depicts the instructions that cannot be combined in parallel. We have considered the following architectural restrictions (pairs of instructions that cannot be combined) in this software:
    ADD   MOV
    ADDF  MULF
    SUBF  DIVF
    SUB   MOV
    INR   DIV
Performance Metrics
The performance metrics are computed for a K-stage linear pipeline processor with a fixed clock period.
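The formulas themselves did not survive in this copy of the paper. As a hedged reconstruction, the textbook metrics for a linear pipeline (as given, for example, by Hwang [18]) are shown below; it is an assumption that the paper uses exactly these definitions. Here $k$ is the number of stages, and the symbols $\tau$ (clock period) and $n$ (number of instructions) are introduced for this illustration.

```latex
\begin{align*}
T_1 &= n\,k\,\tau && \text{(time on an equivalent non-pipelined processor)}\\
T_k &= [\,k + (n-1)\,]\,\tau && \text{(time on the $k$-stage pipeline)}\\
S_k &= \frac{T_1}{T_k} = \frac{n\,k}{k + (n-1)} && \text{(speed-up factor)}\\
E_k &= \frac{S_k}{k} = \frac{n}{k + (n-1)} && \text{(efficiency)}\\
H_k &= \frac{n}{[\,k + (n-1)\,]\,\tau} = \frac{E_k}{\tau} && \text{(throughput)}
\end{align*}
```

For example, with $k = 5$ and $n = 8$ and no stalls, $S_5 = 40/12 \approx 3.33$, $E_5 = 8/12 \approx 0.67$ and $H_5 = 8/(12\tau) \approx 0.67/\tau$. These are ideal values; dependencies that delay instruction start times lengthen $T_k$ and lower all three metrics.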
Testing
This software has been tested using a modular testing approach. Finally, the integrated product has been tested as a single unit and the detected flaws have been removed. It has been tested on both the Linux and Windows platforms for consistent performance and absence of errors.
Let us consider a test case to understand the working of the software. The code sequence is given below:
    ADDF R1 R2 R3
    SUB R4 R2 R1
    MOV R2 PORT#1
    INR R4
    DCR R1
    ORA R2
    DIV R7 R5 R3
    MULF R6 R8 R9
The code consists of a block of eight instructions, defined as follows:
• ADDF: floating-point add the contents of R2 and R3 and store the result in R1
• SUB: subtract the contents of R1 from R2 and store the result in R4
• MOV: move the data from port#1 to R2
• INR: increment register R4
• DCR: decrement register R1
• ORA: perform an OR operation over the contents of R2 and the accumulator
• DIV: divide R5 by R3 and store the result in R7
• MULF: floating-point multiply R8 and R9 and store the result in R6
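The dependency detection step can be tried on this test case with a few lines of Java. This is a sketch under stated assumptions, not the authors' program: the class `TestCaseDependencies` and its `READS`/`WRITES` tables are hypothetical and encode one reading of the instruction descriptions above (in particular, ORA is assumed to write its result to the accumulator, named "A" here). Every ordered pair of instructions is compared, so an edge is reported even when an intervening instruction rewrites the same register.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: pairwise dependency detection for the eight-instruction test case. */
final class TestCaseDependencies {

    // Registers read by instructions 1..8 (assumed from the descriptions in the text).
    static final String[][] READS = {
        {"R2", "R3"}, {"R2", "R1"}, {"PORT1"}, {"R4"},
        {"R1"}, {"R2", "A"}, {"R5", "R3"}, {"R8", "R9"}
    };
    // Registers written by instructions 1..8 (ORA writing "A" is an assumption).
    static final String[][] WRITES = {
        {"R1"}, {"R4"}, {"R2"}, {"R4"},
        {"R1"}, {"A"}, {"R7"}, {"R6"}
    };

    /** Returns true when the two arrays share at least one name. */
    static boolean overlap(String[] a, String[] b) {
        Set<String> names = new HashSet<>(Arrays.asList(a));
        for (String name : b) {
            if (names.contains(name)) {
                return true;
            }
        }
        return false;
    }

    /** kind is "RAW", "WAR" or "WAW"; edges are returned as "i->j" with 1-based numbers. */
    static List<String> edges(String kind) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < READS.length; i++) {
            for (int j = i + 1; j < READS.length; j++) {
                String[] first = kind.equals("WAR") ? READS[i] : WRITES[i];
                String[] second = kind.equals("RAW") ? READS[j] : WRITES[j];
                if (overlap(first, second)) {
                    result.add((i + 1) + "->" + (j + 1));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Data flow (RAW): " + edges("RAW")); // [1->2, 1->5, 2->4, 3->6]
        System.out.println("Anti (WAR):      " + edges("WAR")); // [1->3, 2->3, 2->5]
        System.out.println("Output (WAW):    " + edges("WAW")); // [1->5, 2->4]
    }
}
```

Under these assumptions, instructions 7 (DIV) and 8 (MULF) have no data dependency on any other instruction, so only architectural restrictions limit when they can be scheduled.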
Figure 11: Data Flow Dependency Graph
Figure 12: Anti-Dependency Graph
Figure 13: Output Dependency Graph
Figure 14: Data Dependency Graph
Figure 15: Transitive Closure Graph
Figure 16: Architectural Restrictions Graph
Figure 17: Dependence Graph
Figure 18: ILP Graph
Figure 19: Performance Metrics
DEPLOYMENT
System Implementation
Implementation is the stage of the project in which the theoretical design is turned into a working system. It can therefore be considered the most critical stage in achieving a successful new system and in giving the user confidence that the new system will work properly and be effective. The implementation stage involves careful planning, investigation of the existing system and its constraints on implementation, design of methods to achieve the changeover, and evaluation of changeover methods.
Although the software was developed on the Linux platform, it has been deployed on the Windows platform as well. Its platform independence is due to the platform independence of Java. The software has been tested on both platforms for consistent performance. The final working software has been packaged by assembling all the required class files in a jar archive.
The delivered software provides benefit for the end user, but it also provides useful feedback for the software team. End users comment on characteristics of the software such as reliability and user friendliness, and on its functions and features. Feedback should be collected and recorded by the software team and used to:
• Make immediate modifications to the delivered increment (if required)
• Define changes to be incorporated into the next planned increment
• Make necessary design modifications to accommodate changes
• Revise the plan for the next increment to reflect the changes
CONCLUSIONS AND FUTURE SCOPE

Conclusions
For decades, hardware algorithms such as Tomasulo's algorithm for the FPU of the IBM System/360 Model 91 and scoreboarding for the CDC 6600 have dominated pipelining in processors. With Moore's law reaching its limit, however, it is no longer feasible to depend purely on hardware pipelining. A paradigm shift is expected in the near future from hardware-centric approaches to a software-oriented approach to exploiting instruction-level parallelism. Intel, IBM, AMD and other companies have already begun intense research in this field. An area where this approach has found significant application is graphics processing, as graphics data contains a considerable amount of redundancy and parallelism. A prominent example is Graphics Processing Unit (GPU) technology, which relies heavily on software approaches to pipelining. Another example is the Itanium processor developed by Intel Corporation, which has found a significant application in a supercomputer at NASA. Itanium has features such as software pipelining for loop optimization, rotating registers and speculative branch prediction. This is a field of intense research that provides ample opportunities for developers and scientists. It also presents significant challenges for system programmers.
Future Scope
The project has covered almost all the requirements initially laid out. Further requirements and improvements can be easily incorporated since the code is mainly modular in nature. The agile nature of the project has provided scope for easy accommodation of changes and emerging requirements. Some of the possible extensions are:
• GCD tests before computing dependencies
• Use of an expanded instruction set
• Inclusion of control dependencies to extend the software to handle complex branching code blocks
• Application of global code scheduling algorithms such as trace scheduling
• Refinement of the scheduling algorithm to handle resource dependencies
REFERENCES
1. Sumanta, "Hardware and Software approaches for instruction level parallelism", Yahoo answer, 2011.
2. John L. Hennessy, David A. Patterson (2003), Computer Architecture: A Quantitative Approach (3rd ed.), Morgan Kaufmann. ISBN 1-55860-724-2.
3. Beck, Kent; et al. (2001). "Manifesto for Agile Software Development". Agile Alliance. Retrieved 14 June 2010.
4. Ambler, S. W. "Examining the Agile Manifesto". Retrieved 6 April 2011.
5. Beck, Kent; et al. "Principles behind the Agile Manifesto". Agile Alliance. Archived from the original on 14 June 2010. Retrieved 6 June 2010.
6. Black, S. E.; Boca, P. P.; Bowen, J. P.; Gorman, J.; Hinchey, M. G. "Formal versus agile: Survival of the fittest". IEEE Computer 49 (9): 39-45, September 2009.
7. Boehm, B.; Turner, R. Balancing Agility and Discipline: A Guide for the Perplexed. Boston, MA: Addison-Wesley. ISBN 0-321-18612-5. Appendix A, pages 165-194.
8. Mark Seuffert, Piratson Technologies, Sweden. "Karlskrona test, A generic agile adoption test". Piratson.se. Retrieved 6 June 2010.
9. "How agile are you, a scrum-specific test". Agile-software-development.com. Retrieved 6 June 2010.
10. Tim Rosenblatt, http://www.cloudspace.com/blog/2010/08/25/agile-principle-11-the-best-architectures-requirements-and-designsemerge-from-self-organizing-teams/, posted on August 25, 2010.
11. Roger S. Pressman, Software Engineering: A Practitioner's Approach, chapter 04.
12. "The Floyd-Warshall algorithm", http://serverbob.3x.ro/IA/DDU0157.html
13. Ken Schwaber and Mike Beedle, Agile Software Development with Scrum.
14. Paolo Faraboschi, Joseph A. Fisher and Cliff Young, "Instruction Scheduling for Instruction Level Parallel Processors", Proceedings of the IEEE, Vol. 89, No. 11, November 2001.
15. Rainer Leupers, "Exploiting Conditional Instructions in Code Generation for Embedded VLIW Processors".
16. Alexandru Nicolau and Joseph A. Fisher, "Measuring the Parallelism Available for Very Long Instruction Word Architectures", IEEE Transactions on Computers, Vol. C-33, No. 11, November 1984.
17. Lei Wang and Gui Chen, "Architecture-dependent Register Allocation and Instruction Scheduling on VLIW", IEEE, 2010.
18. Kai Hwang, Advanced Computer Architecture: Parallelism, Scalability, Programmability.
19. Kai Hwang and Faye A. Briggs, Computer Architecture and Parallel Processing.
