Google Summer of Code

Revision as of 03:42, 12 March 2008 by Nate (talk | contribs) (Detailed In-Order core model)

Introduction

The Google Summer of Code (SoC) is a great opportunity for students to contribute to open source software projects. The open source projects get additional contributions and active developers while the students get some money and gain experience in large distributed software development.

About M5

M5 is a modular platform for computer system architecture research, encompassing system-level architecture as well as processor microarchitecture. At its core, M5 provides a generic, object-oriented discrete-event simulation framework, including a foundation for defining, parameterizing, configuring, and marshaling simulation objects. This foundation, along with various pre-made object models, allows M5 to simulate both single systems and multiple networked systems deterministically. Simulations can run a single binary (syscall emulation) or boot an entire operating system such as Linux or Solaris (full-system) on most major ISAs (SPARC, MIPS, Alpha, ARM, x86/64). The simulator is written in a combination of C++ and Python and is pervasively object oriented. Python is used for configuration and non-performance-critical parts, while C++ is used for the core of the simulation framework. Using the M5 simulator, computer architecture researchers around the world have been able to successfully model their systems and publish their work in magazines, conferences, and academic journals. So far, the Publications list has reached more than 50 entries, and it grows every year.
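For readers new to discrete-event simulation, the following is a minimal sketch of the general idea, not M5's actual implementation. The names `EventQueue`, `Ticker`, and `simulate` are hypothetical and exist only for this illustration. Objects schedule callbacks at future ticks, and a sequence number breaks ties so that repeated runs produce the same order.

```python
import heapq
import itertools


class EventQueue:
    """Toy discrete-event kernel: events fire in (tick, insertion order)."""

    def __init__(self):
        self.now = 0
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps runs deterministic

    def schedule(self, delay, callback):
        heapq.heappush(self._heap, (self.now + delay, next(self._seq), callback))

    def run(self):
        while self._heap:
            self.now, _, callback = heapq.heappop(self._heap)
            callback()


class Ticker:
    """Hypothetical simulation object that reschedules itself every `period` ticks."""

    def __init__(self, name, queue, period, count, log):
        self.name, self.queue, self.period = name, queue, period
        self.remaining, self.log = count, log
        queue.schedule(period, self.tick)

    def tick(self):
        self.log.append((self.queue.now, self.name))
        self.remaining -= 1
        if self.remaining > 0:
            self.queue.schedule(self.period, self.tick)


def simulate():
    """Run two toy objects to completion and return the ordered event log."""
    queue, log = EventQueue(), []
    Ticker("cpu", queue, period=2, count=3, log=log)
    Ticker("dev", queue, period=3, count=2, log=log)
    queue.run()
    return log
```

Calling `simulate()` twice returns the same log, which is the property "deterministically" refers to above.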


Project Ideas

Below is a list of possible project ideas and starting points; however, we're open to other ideas students may have. All the ideas listed here will require some familiarity with Python and a good grasp of advanced C++ concepts.

Direct Execution model

Direct execution is a well-known technique for speeding up simulation and is employed by a number of simulators. A direct execution simulator uses the native machine to execute guest instructions without interpretation. Methods of direct execution include static code instrumentation, dynamic code instrumentation, full OS virtualization, and application virtualization. There are several mechanisms for implementing direct execution, each with different pros and cons:

  1. The Linux Kernel-based Virtual Machine (KVM)
    • PRO: Could be brought up quickly and can leverage an existing virtualization system
    • CON: Can only be used for fast-forwarding since instructions cannot be trapped
    • http://kvm.qumranet.com/kvmwiki
  2. Pin-based application virtualization
  3. Custom implementation
    • PRO: Can do exactly what we want
    • CON: Significant effort

Parallelization

As the industry moves toward multicore systems, software will need to become parallel if it is to benefit from successive generations of chips, and the simulator itself is no exception.

  • Use the Wisconsin Wind Tunnel as a guide
  • This is less daunting than it sounds: all objects schedule their own events, and there are limited ways they can interact with other objects in the system.
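One conservative way to exploit the limited interaction between objects is quantum-based synchronization, the general style of approach the Wisconsin Wind Tunnel is known for. The sketch below is a toy illustration under stated assumptions, not a design for M5. `Partition`, `run`, and `QUANTUM` are hypothetical names, and every cross-partition message is assumed to take at least `QUANTUM` ticks. Under that assumption, partitions can process a quantum independently and exchange messages only at a barrier. The partitions are run one after another here; the point is that the result does not depend on the order.

```python
import heapq

QUANTUM = 10  # assumed minimum latency of any cross-partition message


class Partition:
    """Toy partition with a private event queue and an outbox for its peer."""

    def __init__(self, name, peer):
        self.name, self.peer = name, peer
        self.events = []   # heap of (tick, seq, hops_left)
        self.seq = 0
        self.trace = []    # (tick, hops_left) for every event handled
        self.outbox = []   # (dest, tick, hops_left) delivered at the barrier

    def post(self, tick, hops):
        heapq.heappush(self.events, (tick, self.seq, hops))
        self.seq += 1

    def run_until(self, limit):
        # Safe to run alone: nothing sent during this quantum arrives before `limit`.
        while self.events and self.events[0][0] < limit:
            tick, _, hops = heapq.heappop(self.events)
            self.trace.append((tick, hops))
            if hops > 0:
                self.outbox.append((self.peer, tick + QUANTUM, hops - 1))


def run(order, end=100):
    """Simulate two partitions, executing them in `order` within each quantum."""
    parts = {"a": Partition("a", "b"), "b": Partition("b", "a")}
    parts["a"].post(0, 5)
    parts["b"].post(3, 2)
    for start in range(0, end, QUANTUM):
        for name in order:              # these calls could be separate threads
            parts[name].run_until(start + QUANTUM)
        for name in sorted(parts):      # barrier: deliver messages in a fixed order
            for dest, tick, hops in parts[name].outbox:
                parts[dest].post(tick, hops)
            parts[name].outbox.clear()
    return {name: part.trace for name, part in parts.items()}
```

`run(["a", "b"])` and `run(["b", "a"])` produce identical traces, which suggests why real threads could replace the inner loop without changing simulation results, provided the minimum-latency assumption holds.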

Memory Network Models

Interconnection networks are becoming important for multicore research. Having various models for on-chip networks would be very useful.

  • Mesh models
  • Crossbar models
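As a rough illustration of how the two topologies differ, here is a toy latency model. It is a sketch under simple assumptions, not an M5 interface: the function names and the delay parameters are hypothetical, the mesh uses dimension-ordered (XY) routing, and contention is ignored entirely.

```python
def xy_route(src, dst):
    """Return the (x, y) routers visited under dimension-ordered XY routing."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:               # travel along X first
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:               # then along Y
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def mesh_latency(src, dst, router_delay=1, link_delay=1):
    """Zero-load latency: one router per node visited, one link per hop."""
    hops = len(xy_route(src, dst)) - 1
    return hops * link_delay + (hops + 1) * router_delay


def crossbar_latency(switch_delay=2):
    """Zero-load latency of a single-stage crossbar, independent of endpoints."""
    return switch_delay
```

In this model the mesh latency grows with the Manhattan distance between endpoints, while the crossbar latency is constant. A useful model would also have to account for contention and for the crossbar's cost as the port count grows.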

Directory Coherence Protocol

As the number of cores increases, coherence traffic will consume an increasing proportion of system resources. Directory-based cache coherence protocols drastically reduce the resources required to maintain coherence across a large number of cores. This project can go hand-in-hand with an effort to implement a new network.
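The saving comes from the directory knowing which cores hold a copy of each block, so invalidations go only to those cores instead of being broadcast. The following toy MSI-style directory is a sketch of that bookkeeping only. The class and its fields are hypothetical, and it omits data transfer, acknowledgements, and transient states.

```python
class Directory:
    """Toy MSI-style directory: tracks which cores hold each block."""

    def __init__(self):
        self.sharers = {}   # block address -> set of cores with a read-only copy
        self.owner = {}     # block address -> core holding the block modified
        self.messages = 0   # requests the directory has sent to other cores

    def read(self, core, addr):
        owner = self.owner.get(addr)
        if owner == core:
            return                        # already holds the block modified
        if owner is not None:
            self.messages += 1            # ask the owner to downgrade
            del self.owner[addr]
            self.sharers.setdefault(addr, set()).add(owner)
        self.sharers.setdefault(addr, set()).add(core)

    def write(self, core, addr):
        owner = self.owner.get(addr)
        if owner == core:
            return                        # already exclusive
        targets = self.sharers.pop(addr, set()) - {core}
        if owner is not None:
            targets.add(owner)
        self.messages += len(targets)     # invalidate only cores with copies
        self.owner[addr] = core
```

For example, if cores 0 and 1 read a block and core 2 then writes it, the directory sends two invalidations regardless of how many cores the system has. A broadcast scheme on a 64-core system would send 63.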

Detailed In-Order core model

There is currently no detailed in-order CPU model in M5; there is code to start with but nothing that is fully fleshed out. Cores will likely become lower power and less complex in the future, making in-order cores attractive for such systems.

Interface to an HDL

  1. Write a PLI interface to connect Verilog CPUs to the memory system.

Sampling/fast-forwarding techniques

  • This would have the most impact if it were coupled with the direct execution model described above
  • The SMARTS work would be a good guide
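The basic idea behind periodic sampling can be sketched as follows: simulate short windows in detail at regular intervals, fast-forward through everything in between, and estimate overall CPI from the detailed windows alone. This is a simplified illustration in the spirit of SMARTS, not a description of that work's actual procedure. It omits warming of caches and predictors during fast-forwarding, and `cycles_for` is a synthetic stand-in for detailed simulation. All names and parameters are hypothetical.

```python
def cycles_for(i):
    """Synthetic stand-in for detailed simulation: cycles taken by instruction i."""
    return 1 + (3 if i % 7 == 0 else 0) + (10 if i % 101 == 0 else 0)


def sampled_cpi(total, unit=1000, interval=50):
    """Measure `unit` instructions in detail out of every `unit * interval`."""
    detailed_cycles = detailed_insts = 0
    for start in range(0, total, unit * interval):
        end = min(start + unit, total)
        # Detailed window: only these instructions pay the slow-simulation cost.
        detailed_cycles += sum(cycles_for(i) for i in range(start, end))
        detailed_insts += end - start
        # Instructions from `end` to the next window would be fast-forwarded.
    return detailed_cycles / detailed_insts, detailed_insts


def full_cpi(total):
    """Reference value: detailed simulation of every instruction."""
    return sum(cycles_for(i) for i in range(total)) / total
```

With the defaults, only 2% of the instructions are simulated in detail. On this synthetic workload the estimate lands close to the full-run CPI, but real workloads have phases, so the window size and interval would need to be chosen with care.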

Other device models

  1. Flash memory device model (seems popular nowadays)
    • This could be a hard-drive-style model, like the flash drives now appearing in laptops, or a memory device model, which several research papers have suggested as a storage layer between DRAM and disk.

Other Information

The most successful project is one that is going to be interesting to you. We've got some suggested projects above, but the suggestions are just that. If there is something related that you would rather do, please put that in your proposal.

Please describe who you are and what you've done in your application. In particular, we would like to know about other projects you've worked on and your familiarity with Python and C++. The M5 code base tends to exercise most of the C++ standard (and the non-standard). A good familiarity with C++ and object-oriented programming is necessary for a successful M5 project.

Additionally, we would like to see a set of goals/milestones in your proposal. We don't expect the list to be etched in stone; however, stepping back and figuring out how you're planning to get from point A to point B is a good way for you and your mentors to track your progress and evaluate how reasonable your goals are. Finally, we expect that working on M5 would be your main summer activity.

Mentors / M5 Simulation Team

  • Steve Reinhardt - Simulator Infrastructure; Parallel Simulation; ISA description; Full System Simulation; Memory Modeling
  • Nate Binkert - Simulator Infrastructure; Parallel Simulation; Python Integration; Full System Simulation; Networking Models; Configuration Scripts
  • Ali Saidi - Networking Models; Device Modeling; Full System Simulation; Memory Modeling not including caches
  • Lisa Hsu - Full System Workloads; Memory Modeling; Checkpointing Simulations
  • Kevin Lim - CPU Modeling (Out-of-Order, SimpleCPU); Full System Simulation
  • Gabe Black - ISA description (SPARC, x86); Full System Simulation
  • Korey Sewell - ISA description (MIPS); Out-of-Order CPU Modeling; SMT Simulation
  • Ron Dreslinski - Memory Modeling