Difference between revisions of "Running gem5"
Revision as of 04:31, 25 April 2011
Usage
The M5 command line has four parts: the M5 binary, options for the binary, a simulation script, and options for the script. Options passed to the M5 binary and options passed to the script are handled separately, so make sure each option you use is placed where the right component will receive it.
% <m5 binary> [m5 options] <simulation script> [script options]
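The four-part layout can be illustrated with a small stand-alone Python sketch. It is not part of M5; the helper name split_m5_command is hypothetical, and it assumes the simulation script is the first argument ending in ".py".

```python
import shlex


def split_m5_command(command):
    """Split an M5 command line into binary, M5 options, script, script options.

    Assumption (not taken from M5 itself): the simulation script is the
    first argument whose name ends in ".py".
    """
    tokens = shlex.split(command)
    binary, rest = tokens[0], tokens[1:]
    for i, tok in enumerate(rest):
        if tok.endswith(".py"):
            return {
                "binary": binary,
                "m5_options": rest[:i],          # everything before the script
                "script": tok,
                "script_options": rest[i + 1:],  # everything after the script
            }
    raise ValueError("no simulation script (*.py) found on the command line")


parts = split_m5_command(
    "build/ALPHA_FS/m5.debug -d /tmp/output configs/example/fs.py -b NetperfMaerts"
)
print(parts)
```

Here "-d /tmp/output" ends up with the M5 binary and "-b NetperfMaerts" with the script, which is the distinction the paragraph above asks you to keep in mind.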
M5 Options
Running m5 with the "-h" flag prints a help message that includes all of the supported simulator options. Here's a snippet:
% build/ALPHA_SE/m5.debug -h
Usage
=====
  m5.debug [m5 options] script.py [script options]
Copyright (c) 2001-2008 The Regents of The University of Michigan All Rights Reserved
Options
=======
--version               show program's version number and exit
--help, -h              show this help message and exit
--authors, -A           Show author information
--build-info, -B        Show build information
--copyright, -C         Show full copyright information
--readme, -R            Show the readme
--outdir=DIR, -d DIR    Set the output directory to DIR [Default: m5out]
--redirect-stdout, -r   Redirect stdout (& stderr, without -e) to file
--redirect-stderr, -e   Redirect stderr to file
--stdout-file=FILE      Filename for -r redirection [Default: simout]
--stderr-file=FILE      Filename for -e redirection [Default: simerr]
--interactive, -i       Invoke the interactive interpreter after running the script
--pdb                   Invoke the python debugger before running the script
--path=PATH[:PATH], -p PATH[:PATH]
                        Prepend PATH to the system path when invoking the script
--quiet, -q             Reduce verbosity
--verbose, -v           Increase verbosity
...
Script Options
The script section of the command line begins with the path to your script file and includes any options that you'd like to pass to that script. Most example scripts accept a '-h' or '--help' flag that lists the script-specific options. For example:
M5 compiled Apr 2 2011 00:57:11
M5 started Apr 3 2011 21:16:02
M5 executing on zooks
command line: build/ALPHA_SE/m5.opt configs/example/se.py -h
Usage: se.py [options]
Options:
  -h, --help            show this help message and exit
  -c CMD, --cmd=CMD     The binary to run in syscall emulation mode.
  -o OPTIONS, --options=OPTIONS
                        The options to pass to the binary, use " " around the entire string
  -i INPUT, --input=INPUT
                        Read stdin from a file.
  --output=OUTPUT       Redirect stdout to a file.
  --errout=ERROUT       Redirect stderr to a file.
  --ruby
  -d, --detailed
  -t, --timing
  --inorder
  -n NUM_CPUS, --num-cpus=NUM_CPUS
  --caches
  --l2cache
  --fastmem
  --clock=CLOCK
  --num-dirs=NUM_DIRS
  --num-l2caches=NUM_L2CACHES
  --num-l3caches=NUM_L3CACHES
  --l1d_size=L1D_SIZE
  --l1i_size=L1I_SIZE
  --l2_size=L2_SIZE
  --l3_size=L3_SIZE
  --l1d_assoc=L1D_ASSOC
  --l1i_assoc=L1I_ASSOC
  --l2_assoc=L2_ASSOC
  --l3_assoc=L3_ASSOC
  ...
The script file documentation page (Simulation Scripts Explained) describes how to write your own simulation scripts, and the Options section explains how to add your own command line options.
Output Files
Full System (FS) Mode
Booting Linux
We'll assume that you've already built an ALPHA_FS version of the M5 simulator, and downloaded and installed the full-system binary and disk image files. Then you can just run the fs.py configuration file in the m5/configs/example directory. For example:
% build/ALPHA_FS/m5.debug -d /tmp/output configs/example/fs.py
M5 Simulator System
Copyright (c) 2001-2006
The Regents of The University of Michigan
All Rights Reserved
M5 compiled Aug 16 2006 18:51:57
M5 started Wed Aug 16 21:53:38 2006
M5 executing on zeep
command line: ./build/ALPHA_FS/m5.debug configs/example/fs.py
0: system.tsunami.io.rtc: Real-time clock set to Sun Jan 1 00:00:00 2006
Listening for console connection on port 3456
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
warn: Entering event queue @ 0. Starting simulation...
<...simulation continues...>
Basic Operation
By default, the fs.py script boots Linux and starts a shell on the system console. To keep console traffic separate from simulator input and output, this simulated console is associated with a TCP port. To interact with the console, you must connect to the port using a program such as telnet, for example:
% telnet localhost 3456
Telnet's echo behavior doesn't work well with m5, so if you are using the console regularly, you probably want to use m5term instead of telnet. By default m5 will try to use port 3456, as in the example above. However, if that port is already in use, it will increment the port number until it finds a free one. The actual port number used is printed in the m5 output.
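The port search described above can be sketched in a few lines of stand-alone Python. This is only an illustration of the "increment until free" idea using the standard socket module; it is not M5's own code, and the function name first_free_port and its max_tries limit are hypothetical.

```python
import socket


def first_free_port(start=3456, host="127.0.0.1", max_tries=100):
    """Return the first port >= start that can currently be bound on host."""
    for port in range(start, min(start + max_tries, 65536)):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port already in use, try the next higher one
            return port   # bind succeeded; the probe socket is closed on exit
    raise RuntimeError("no free port found")


print(first_free_port())
```

Because another process may take the port between the probe and the real connection, treat the result as a hint; the port that M5 actually uses is the one printed in its output.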
In addition to loading a Linux kernel, M5 mounts one or more disk images for its filesystems. At least one disk image must be mounted as the root filesystem. Any application binaries that you want to run must be present on these disk images. To run benchmarks without an interactive shell session, M5 can load .rcS files that replace the normal Linux boot scripts and are executed directly after the OS boots. These .rcS files can be used to configure ethernet interfaces, execute special m5 instructions, or begin executing a binary on the disk image. The pointers for the Linux binary, disk images, and .rcS files are all set in the simulation script. (To see how these files work, see Simulation Scripts Explained.)
Examples:
Going into / of the root filesystem and typing ls will show:
benchmarks  etc     lib         mnt      sbin  usr
bin         floppy  lost+found  modules  sys   var
dev         home    man         proc     tmp   z
Snippet of an .rcS file:
echo -n "setting up network..."
/sbin/ifconfig eth0 192.168.0.10 txqueuelen 1000
/sbin/ifconfig lo 127.0.0.1
echo -n "running surge client..."
/bin/bash -c "cd /benchmarks/surge && ./Surge 2 100 1 192.168.0.1 5.
echo -n "halting machine"
m5 exit
m5term
The m5term program allows the user to connect to the simulated console interface that full-system m5 provides. Simply change into the util/term directory and build m5term:
% cd m5/util/term
% make
gcc -o m5term term.c
% make install
sudo install -o root -m 555 m5term /usr/local/bin
The usage of m5term is:
./m5term <host> <port>
<host> is the host that is running m5
<port> is the console port to connect to. m5 defaults to using port 3456, but if the port is used, it will try the next higher port until it finds one available.
If there are multiple systems running within one simulation, there will be a console for each one. (The first system's console will be on 3456 and the second on 3457 for example)
m5term uses '~' as an escape character. If you enter the escape character followed by a '.', the m5term program will exit.
m5term can be used to work with the simulator interactively, though users must often adjust various terminal settings to get things to work.
A slightly shortened example of m5term in action:
% m5term localhost 3456
==== m5 slave console: Console 0 ====
M5 console
Got Configuration 127
memsize 8000000 pages 4000
First free page after ROM 0xFFFFFC0000018000
HWRPB 0xFFFFFC0000018000 l1pt 0xFFFFFC0000040000 l2pt 0xFFFFFC0000042000 l3pt_rpb 0xFFFFFC0000044000 l3pt_kernel 0xFFFFFC0000048000 l2reserv 0xFFFFFC0000046000
CPU Clock at 2000 MHz IntrClockFrequency=1024
Booting with 1 processor(s)
...
...
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 480k freed
init started: BusyBox v1.00-rc2 (2004.11.18-16:22+0000) multi-call binary
PTXdist-0.7.0 (2004-11-18T11:23:40-0500)
mounting filesystems...
EXT2-fs warning: checktime reached, running e2fsck is recommended
loading script...
Script from M5 readfile is empty, starting bash shell...
# ls
benchmarks  etc     lib         mnt      sbin  usr
bin         floppy  lost+found  modules  sys   var
dev         home    man         proc     tmp   z
#
Full System Benchmarks
We have several full-system benchmarks already up and running. The binaries are available in the disk images you can obtain/download from us, and the .rcS files are in the m5/configs/boot/ directory. To run any of them, you merely need to set the benchmark option to the name of the test you want to run. For example:
% ./build/ALPHA_FS/m5.opt configs/example/fs.py -b NetperfMaerts
To see a comprehensive list of all benchmarks available:
% ./build/ALPHA_FS/m5.opt configs/example/fs.py -h
Checkpoints
Creation
First of all, you need to create a checkpoint. After booting the M5 simulator, execute the following command (in the shell):
m5 checkpoint
This creates a new directory containing the checkpoint, named 'cpt.TICKNUMBER'.
Restoring
With the new simulator (2.0 beta2), restoring from a checkpoint can usually be done directly from the command line, e.g.:
build/ALPHA_FS/m5.debug configs/example/fs.py -r N
OR
build/ALPHA_FS/m5.debug configs/example/fs.py --checkpoint-restore=N
The number N is an integer that gives the checkpoint's position when the checkpoints are ordered lexically (i.e. by tick number): the checkpoint with the oldest tick is number 1, the next checkpoint is number 2, and so on.
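To make the numbering concrete, the following stand-alone Python sketch maps a checkpoint number N to a 'cpt.TICKNUMBER' directory. It is not taken from the M5 scripts: the function name checkpoint_dir is hypothetical, and it assumes that "ordered by tick number" means sorting the ticks as integers.

```python
import os
import re
import tempfile

CPT_PATTERN = re.compile(r"^cpt\.(\d+)$")


def checkpoint_dir(outdir, n):
    """Return the path of the n-th checkpoint (1-based) in outdir.

    Assumption: checkpoints are ordered by their tick number, compared
    as integers, so the oldest tick is checkpoint number 1.
    """
    found = []
    for name in os.listdir(outdir):
        match = CPT_PATTERN.match(name)
        if match and os.path.isdir(os.path.join(outdir, name)):
            found.append((int(match.group(1)), name))
    found.sort()  # oldest tick first
    if not 1 <= n <= len(found):
        raise IndexError("checkpoint %d not found (%d available)" % (n, len(found)))
    return os.path.join(outdir, found[n - 1][1])


# Demonstration with made-up tick numbers in a temporary output directory.
with tempfile.TemporaryDirectory() as outdir:
    for tick in (30000, 200, 1000):
        os.mkdir(os.path.join(outdir, "cpt.%d" % tick))
    print(os.path.basename(checkpoint_dir(outdir, 1)))  # cpt.200
    print(os.path.basename(checkpoint_dir(outdir, 3)))  # cpt.30000
```

Note that a purely textual sort would place cpt.1000 before cpt.200, which is why the sketch compares the ticks as numbers.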
Detailed example: Parsec
In this section we describe how checkpoints are created for workloads in the PARSEC benchmark suite. A similar procedure can be followed to create checkpoints for workloads outside the PARSEC suite. The high-level steps for creating a checkpoint are:
- Annotate each workload with the start and end of the Region of Interest and with the start and end of the work units in the program.
- Take a checkpoint at the start of the Region of Interest.
- Simulate the whole program within the Region of Interest, taking checkpoints periodically.
- Analyse the statistics corresponding to the periodic checkpoints and select the most interesting section of the program execution.
- Take a warm-up cache trace for Ruby before reaching the most interesting portion of the program, and take the final checkpoint.
The following sections explain each of these steps in more detail.
Annotating workloads
Annotation is required for two purposes: to mark the region of the program that lies beyond its initialization section, and to define logical units of work in each workload.
Workloads in the PARSEC benchmark suite already have annotations demarcating the start and end of the portion of the program that excludes the initialization and finalization sections. We simply use the gem5-specific annotations for the Region of Interest. The start of the Region of Interest (ROI) is marked by m5_roi_begin() and the end of the ROI is marked by m5_roi_end().
Because simulation times are long, it is not always possible to simulate the whole program. Moreover, unlike in single-threaded programs, simulating a given number of instructions in a multi-threaded workload is not a correct way to simulate a portion of the program, because some of those instructions may be spinning on synchronization variables. It is therefore important to define semantically meaningful logical units of work in each workload. Simulating a given number of work units in a multi-threaded workload is a reasonable way to simulate a portion of it, because it avoids the problem of instructions spinning on synchronization variables.
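The argument for work units can be illustrated with a toy calculation in Python. This is not gem5 code and the numbers are made up; the function work_units_completed is a hypothetical helper that assumes each work unit costs a fixed number of useful instructions plus a variable number of spin instructions.

```python
def work_units_completed(instruction_budget, useful_per_unit, spin_per_unit):
    """Whole work units finished within a fixed instruction budget."""
    return instruction_budget // (useful_per_unit + spin_per_unit)


budget = 1000000  # same instruction count in every run
for spin in (0, 500, 4000):
    done = work_units_completed(budget, useful_per_unit=1000, spin_per_unit=spin)
    print("spin=%d instructions per unit -> %d work units" % (spin, done))
```

The same instruction budget covers very different amounts of real work depending on how much spinning occurs, whereas a run bounded by a work-unit count always covers the same amount of work.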
Switchover/Fastforwarding
Sampling
Sampling (switching between functional and detailed models) can be implemented via your Python script. In your script you can direct the simulator to switch between two sets of CPUs. To do this, set up a list of (oldCPU, newCPU) tuples in your script. If there are multiple CPUs you wish to switch simultaneously, they can all be added to that list. For example:
run_cpu1 = SimpleCPU()
switch_cpu1 = DetailedCPU(defer_registration=True)
run_cpu2 = SimpleCPU()
switch_cpu2 = FooCPU(defer_registration=True)
switch_cpu_list = [(run_cpu1,switch_cpu1),(run_cpu2,switch_cpu2)]
Note that the CPU that does not immediately run should have the parameter "defer_registration=True". This keeps those CPUs from adding themselves to the list of CPUs to run; they will instead get added when you switch them in.
In order for M5 to instantiate all of your CPUs, you must make the CPUs that will be switched in a child of something that is in the configuration hierarchy. Unfortunately at the moment some configuration limitations force the switch CPU to be placed outside of the System object. The Root object is the next most convenient place to place the CPU, as shown below:
root1 = Root()
root1.system = System(cpu = run_cpu1)
root1.switch_cpu = switch_cpu1
root2 = Root()
root2.system = System(cpu = run_cpu2)
root2.switch_cpu = switch_cpu2
This will add the switch CPUs as children of each root object. Note that switch_cpu is not an actual parameter of Root; the assignment just indicates that Root has a child named switch_cpu.
After the systems and the CPU list are set up, your script can direct M5 to switch the CPUs at the appropriate cycle. This is achieved by calling switchCpus(cpus_list). For example, assuming the code above, and a system that is set up to run run_cpu1 and run_cpu2 initially:
m5.simulate(500) # simulate for 500 cycles
m5.switchCpus(switch_cpu_list)
m5.simulate(500) # simulate another 500 cycles after switching
Note that M5 may have to simulate for a few cycles prior to switching CPUs due to any outstanding state that may be present in the CPUs being switched out.
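The bookkeeping behind the (oldCPU, newCPU) list can be sketched with a stand-alone mock in Python. None of this is M5 API: MockCPU, switch_cpus and running are hypothetical stand-ins, and the reversed list used to switch back is an assumption about how repeated sampling might be arranged, which the text above does not confirm for any M5 version.

```python
class MockCPU:
    """Stand-in for a CPU model; not an M5 class."""

    def __init__(self, name, defer_registration=False):
        self.name = name
        # A deferred CPU does not run until it is switched in.
        self.active = not defer_registration


def switch_cpus(cpu_pairs):
    """Deactivate each old CPU and activate its replacement."""
    for old_cpu, new_cpu in cpu_pairs:
        if not old_cpu.active:
            raise ValueError("%s is not running" % old_cpu.name)
        old_cpu.active = False
        new_cpu.active = True


def running(cpus):
    """Names of the CPUs that are currently active."""
    return [cpu.name for cpu in cpus if cpu.active]


simple = MockCPU("simple")
detailed = MockCPU("detailed", defer_registration=True)
to_detailed = [(simple, detailed)]
to_simple = [(new, old) for old, new in to_detailed]  # assumed reverse switch

trace = []
for _ in range(2):
    trace.append(running([simple, detailed]))  # functional (fast) phase
    switch_cpus(to_detailed)
    trace.append(running([simple, detailed]))  # detailed (measured) phase
    switch_cpus(to_simple)
print(trace)
```

In a real script the phases would be separated by m5.simulate(...) calls, as in the example above; the mock only shows which CPU of each pair is active at each point.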