
CSC 557/449: Special Topic:
High Availability and Performance Computing Course
Prerequisites: CSC 437, 445, or instructor's permission
Times: Tues and Thurs
12
Place: CEnIT Board Room or Innovation Lab, Live view
Instructor: Dr. Box Leangsuksun, box@latech.edu
Office: Room 237, Nethken Hall, 318-257-3291
Office Hours: M-W 2
Guest Lecturers:
| Guest Speakers | Tentative Schedule and Topic |
| Dr. Hong Ong, Oak Ridge National Lab | Dec 7, 2005 at Innovation Lab: Performance Measurement and Evaluation Tools for Large-scale Systems |
| Charles Grassl, IBM | TBA: POWER5 Programming and Optimization |
| More external speakers will be announced soon. | |
The course will expose students to state-of-the-art research and development in High Availability and Performance Computing (HAPC) and related fields. The class is oriented toward reading, research, and hands-on work. Activities include studies of HAPC systems and techniques and of selected research topics of current interest. Topics include but are not limited to:
Books and References:
1) http://webct.ncsa.uiuc.edu:8900/public/MPI/
2) Parallel Programming with MPI by Peter Pacheco, Morgan Kaufmann; 1st edition (October 1996), ISBN: 1558603395 (optional).
Other class activities: research, experiments, and term projects. The activities will be carried out on an HA-OSCAR Linux cluster.
Grading Policies:
Since this class is research (reading) oriented, I think it is more appropriate to evaluate your learning and mastery of the class objectives in the following categories:
1) Hands-on term project (40%)
2) Paper (15%) (due right after the Christmas break)
3) Exams (25%) and Homework (15%)
4) Attendance (5%)
Grading scheme:
| 91 and up | A |
| 81-90 | B |
| 71-80 | C |
| below 70 | F |
Schedule and class materials:
| Nov 30, 2005 | HAPC introduction |
| Dec 1, 2005 | Progress in Supercomputing by Dr. Horst Simon (video); homework 1 (due date Dec 5) |
| Chapter 1 Intro to High Performance Cluster Computing | |
| Chapter 1 Intro to High Performance Cluster Computing | |
| Dr. Hong Ong's presentation | |
| Intro to Grid Computing (PowerPoint by Prof. Ed Seidel from http://www.cactuscode.org) | |
| The Development of Computational Grid Techniques for the D0 Experiment | |
| Intro to Globus (by ANL, USC Information Sciences Institute, www.globus.org) | |
| A discussion of the Condor-G architecture and its fault-tolerance aspects. Materials were excerpted from the Condor tutorial, www.cs.wisc.edu/condor | |
| Continued discussion of Grid computing and its 4 primary services; MPI programming | |
| Load Balancing over Network and Homework #1 (MPI program) | |
| Job and Resource Management System; an HPC file system case study: Lustre (by Peter Braam, presented at IEEE Cluster 2003) | |
| Lustre discussion (continued) and a case study on cluster management systems: OSCAR and ROCKS | |
| A case study on a cluster management system: summary (continued). Quantifying Non-functional Requirements (Availability and Performance) |
· HA-cluster with Windows
· Workload Characterization, Performance Modeling and Evaluation for HPC systems/applications
· Applying HPC/HA to solve a specific problem (e.g. sensor networks, bioscience, bioinformatics etc.)
· HA-OSCAR cluster with Windows
· HA and DR-enabled storage system
· Drug discovery cluster
· IPMI-based cluster management
· HA-cluster and load balancer to support e-commerce/internet services
· HA-cluster and fault-tolerant HPC job schedulers
· Hot-swap Cluster OS
· HA-OSCAR and grid computing
· Performance benefits analysis from HA-OSCAR
· Beneficial factors from Standards for HAPC environments
· FT LAM/MPI in HA-Cluster
· Performance Benchmark, micro-benchmark & macro-benchmark