Initial release

s5z · Mar 13, 2014 · 56684e3
1 parent 12d25b5
commit 56684e3
Showing 159 changed files with 37,217 additions and 7 deletions.
diff --git a/LICENSE b/LICENSE
@@ -1,7 +1,7 @@
-GNU GENERAL PUBLIC LICENSE
+                    GNU GENERAL PUBLIC LICENSE
                        Version 2, June 1991
 
- Copyright (C) 1989, 1991 Free Software Foundation, Inc., <http://fsf.org/>
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
  51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  Everyone is permitted to copy and distribute verbatim copies
  of this license document, but changing it is not allowed.
@@ -290,8 +290,8 @@ to attach them to the start of each source file to most effectively
 convey the exclusion of warranty; and each file should have at least
 the "copyright" line and a pointer to where the full notice is found.
 
-    {description}
-    Copyright (C) {year}  {fullname}
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year>  <name of author>
 
     This program is free software; you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -329,11 +329,11 @@ necessary.  Here is a sample; alter the names:
   Yoyodyne, Inc., hereby disclaims all copyright interest in the program
   `Gnomovision' (which makes passes at compilers) written by James Hacker.
 
-  {signature of Ty Coon}, 1 April 1989
+  <signature of Ty Coon>, 1 April 1989
   Ty Coon, President of Vice
 
 This General Public License does not permit incorporating your program into
 proprietary programs.  If your program is a subroutine library, you may
 consider it more useful to permit linking proprietary applications with the
 library.  If this is what you want to do, use the GNU Lesser General
-Public License instead of this License.
+Public License instead of this License.
diff --git a/README.md b/README.md
@@ -1,4 +1,206 @@
 zsim
 ====
 
-A fast and scalable x86-64 multicore simulator
+zsim is a fast x86-64 simulator. It was originally written to evaluate ZCache
+(Sanchez and Kozyrakis, MICRO-44, Dec 2010), hence the name, but it has since
+outgrown its purpose.
+zsim's main goals are to be fast, simple, and accurate, with a focus on
+simulating memory hierarchies and large, heterogeneous systems. It is parallel
+and uses DBT extensively, resulting in speeds of hundreds of millions of
+instructions/second on a modern multicore host. Unlike conventional simulators,
+zsim is organized to scale well (almost linearly) with simulated core count.
+
+You can find more details about zsim in our ISCA 2013 paper:
+http://people.csail.mit.edu/sanchez/papers/2013.zsim.isca.pdf.
+
+
+License & Copyright
+-------------------
+
+zsim is free software; you can redistribute it and/or modify it under the terms
+of the GNU General Public License as published by the Free Software Foundation,
+version 2.
+
+zsim was originally written by Daniel Sanchez at Stanford University, and per
+Stanford University policy, the copyright of this original code remains with
+Stanford (specifically, the Board of Trustees of Leland Stanford Junior
+University). Since then, zsim has been substantially modified and enhanced at
+MIT by Daniel Sanchez, Nathan Beckmann, and Harshad Kasture. zsim also
+incorporates contributions on main memory performance models from Krishna
+Malladi, Makoto Takami, and Kenta Yasufuku.
+
+zsim was also modified and enhanced while Daniel Sanchez was an intern at
+Google. Google graciously agreed to share these modifications under a GPLv2
+license. This code is (C) 2011 Google Inc. Files containing code developed at
+Google have a different license header with the correct copyright attribution.
+
+Additionally, if you use this software in your research, we request that you
+reference the zsim paper ("ZSim: Fast and Accurate Microarchitectural
+Simulation of Thousand-Core Systems", Sanchez and Kozyrakis, ISCA-40, June
+2013) as the source of the simulator in any publications that use this
+software, and that you send us a citation of your work.
+
+
+Setup
+-----
+
+External dependencies: `gcc >=4.6, pin, scons, libconfig, libhdf5`
+
+1. Clone a fresh copy of the git zsim repository (`git clone <path to zsim repo>`).
+
+2. Download Pin, http://www.pintool.org . Tested with Pin 2.8+ on an x86-64
+   architecture.  Compiler flags are set up for Pin 2.9 on x86-64. To get flags
+   for other versions, examine the Pin makefile or derive from sample pintools.
+   Set the PINPATH environment variable to Pin's base directory.
+
+   NOTE: Linux 3.0+ systems require Pin 2.10+, just because Pin does a kernel
+   version check that 3.0 fails.
+
+   NOTE 2: Use Pin 2.12 with Sandy/Ivy Bridge systems; earlier Pin versions
+   have strange performance regressions on these machines (extremely low IPC).
+
+3. zsim requires some additional libraries. If they are not installed in your
+   system, you will need to download and build them:
+
+  3.1 libconfig, http://www.hyperrealm.com/libconfig . To install locally,
+      untar, run `./configure --prefix=<libconfig install path> && make install`.
+      Then define the env var `LIBCONFIGPATH=<libconfig install path>`. 
+
+  3.2 libhdf5, http://www.hdfgroup.org (v1.8.4 patch 1 or higher). The
+      SConstruct file assumes it is installed in the system.
+
+  3.3 (OPTIONAL) polarssl (currently used just for their SHA-1 hash function),
+      http://www.polarssl.org Install locally as in 3.1 and define the env var
+      `POLARSSLPATH=<polarssl install path>`
+
+      NOTE: You may need to add `-fPIC` to the Makefile's C(PP/XX)FLAGS depending
+      on the version.
+
+  3.4 (OPTIONAL) DRAMSim2 for main memory simulation. Build locally and define
+      the env var DRAMSIMPATH as in 3.1 and 3.3.
+
+4. Compile zsim: `scons -j16`
+
+5. Launch a test run: `./build/opt/zsim tests/simple.cfg`
+
+For more compilation options, run scons --help. You can build debug, optimized
+and release variants of the simulator (--d, --o, --r options). Optimized (opt)
+is the default. You can build profile-guided optimized (PGO) versions of the
+code with --p. These improve simulation performance with OOO cores by about
+30%.
+
+NOTE: zsim uses C++11 features available in `gcc >=4.6` (such as range-based for
+loops, strongly typed enums, lambdas, and type inference). Older versions of gcc
+will not work. zsim can also be built with `icc` (see the `SConstruct` file). 
+
+
+Notes
+-----
+
+**Accuracy:** While we have validated zsim against a real system, you should be
+aware that we sometimes sacrifice some accuracy for speed and simplicity. The
+ISCA 2013 paper details the possible sources of inaccuracy.  Despite our
+validation efforts, if you are using zsim with workloads or architectures that
+are significantly different from ours, you should not blindly trust these
+results. Also, zsim can be configured with varying degrees of accuracy, which
+may be OK in some cases but not others (e.g., longer bound phases to reduce
+overheads are often OK if your application has little communication, but not
+with fine-grained parallelism and synchronization). Finally, in some cases, you
+will need to modify the code, and for some purposes, zsim is just not the right
+tool. In any case, we strongly recommend validating your baseline configuration
+and workloads against a real machine.
+
+**Memory Management:** zsim can simulate multiple processes, which introduces some
+complexities in memory management. Each Pin process uses SysV IPC shared
+memory to communicate through a global heap. Be aware that Pin processes have a
+global and a process-local heap, and all simulator objects should be allocated
+in the global heap. A global heap allocator is implemented (galloc.c and g\_heap
+folder) using Doug Lea's malloc. The global heap allocator functions mirror the
+usual ones, with the gm\_ prefix (e.g., gm\_malloc, gm\_calloc, gm\_free). Objects
+can be allocated in the global heap automatically by making them inherit from
+GlobAlloc, which redefines the new and delete operators. STL classes use their
+own internal allocators, so they cannot be members of globally visible objects.
+To ease this, the g\_stl folder has template specializations of commonly used
+STL classes that are changed to use our own STL-compliant allocator that
+allocates from the global heap. Use these classes as drop-in replacements when
+you need a globally visible STL class, e.g. substitute std::vector with
+g\_vector, etc.
+
+**Harness:** While most of zsim is implemented as a pintool (`libzsim.so`), a harness
+process (`zsim`) is used to control the simulation: set up the shared memory
+segment, launch pin processes, check for deadlock, and ensure termination of
+the whole process tree when it is killed. In prior revisions of the simulator,
+you could launch the pintool directly, but now you should use the harness.
+
+**Transparency & I/O:** To maintain transparency w.r.t. instrumented
+applications, zsim does all logging through info/warn/panic methods. With the
+sim.logToFile option, these dump to per-process log files instead of the
+console. *You should never use cout/cerr or printf in simulator code* ---
+simple applications will work, but more complex setups, e.g., anything that
+uses pipes, will break.
+
+**Interfacing with applications:** You can use special instruction sequences to
+control the simulation from the application (e.g., fast-forward to the region
+you want to simulate). `misc/hooks` has wrappers for C/C++, Fortran, and Java,
+and extending this to other languages should be easy.
+
+**Host Configuration:** The system configuration may need some tweaks to support
+zsim. First, it needs to allow for large shared memory segments. Second, for
+Pin to work, it must allow a process to attach to any other process of the same
+user, not just to a child. Use sysctl to ensure that `kernel.shmmax=1073741824`
+(or larger) and `kernel.yama.ptrace_scope=0`. zsim has mainly been used on
+Ubuntu 11.10, 12.04, 12.10, 13.04, and 13.10, but it should work on other Linux
+distributions. Using it on OSs other than Linux (e.g., OS X, Windows) will be
+non-trivial, since the user-level virtualization subsystem has deep ties into
+the Linux syscall interface.
+
+**Stats:** The simulator outputs periodic, eventual and end-of-sim stats files.
+Stats can be output in both HDF5 and plain text. Read the README.stats file
+and the associated scripts repository to see how to use these stats.
+
+**Configuration & Getting Started:** A detailed use guide is out of the scope of
+this README, because the simulator options change fairly often. In general,
+*the documentation is the source code*. You should be willing to occasionally
+read the source code to see how different zsim features work. To get familiar
+with the way to configure the simulator, the following three steps usually work
+well when getting started:
+
+1. Check the examples in the `tests/` folder, play around with the settings, and
+   launch a few runs. Config files have three sections, sys (configures the
+   simulated system, e.g., core and cache parameters), sim (configures simulation
+   parameters, e.g., how often periodic stats are output, phase length, etc.),
+   and process{0, 1, 2, ...} entries (what processes to run). 
+
+2. Most parameters have implicit defaults. zsim produces an out.cfg file that
+   includes all the default choices (and we recommend that your analysis scripts
+   automatically parse this file to check that what you are simulating makes
+   sense). Inspecting the out.cfg file reveals more configuration options to play
+   with, as well as their defaults.
+
+3. Finally, check the source code for more info on options. The whole system is
+   configured in the init.cpp (sys and sim sections) and process\_tree.cpp
+   (processX sections) files, so there is no need to grok the whole simulator
+   source to find out all the configuration options.
+
+**Hacking & Style Guidelines:** zsim is mostly consistent with Google's C++ style
+guide. You can use cpplint.py to check rule violations. We depart from these
+guidelines on a couple of aspects:
+
+- 4-space indentation instead of 2 spaces
+
+- 120-character lines instead of 80-char (though you'll see a clear disregard
+  for strict line length limits every now and then)
+
+You can use cpplint.py (included in misc/ with slight modifications) to check
+your changes. misc/ also has a script to tidy includes, which should be in
+alphabetical order within each type (own, system, and project headers).
+
+vim will indent the code just fine with the following options:
+`set cindent shiftwidth=4 expandtab smarttab`
+
+Finally, as Google's style guidelines say, please be consistent with the
+current style used elsewhere. For example, the parts of code that deal with Pin
+follow a style consistent with pintools.
+
+Happy hacking, and hope you find zsim useful!
+
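The Setup and Host Configuration notes in the README above name a few host prerequisites (`PINPATH`, `kernel.shmmax`, `kernel.yama.ptrace_scope`). The following is a minimal preflight sketch that checks them before launching a run. It is not part of this commit: the function names `read_sysctl` and `check_host` are hypothetical, and treating a missing `ptrace_scope` file as acceptable is an assumption (the file is normally absent when the Yama security module is not enabled).

```python
import os

# Values taken from the README text: shmmax of at least 1 GiB, ptrace_scope of 0.
MIN_SHMMAX = 1073741824
REQUIRED_PTRACE_SCOPE = 0


def read_sysctl(name, root="/proc/sys"):
    """Return the integer value of a sysctl such as 'kernel.shmmax', or None if unreadable."""
    path = os.path.join(root, *name.split("."))
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def check_host(read=read_sysctl, environ=os.environ):
    """Return a list of problem descriptions; an empty list means all checks passed."""
    problems = []

    shmmax = read("kernel.shmmax")
    if shmmax is None:
        problems.append("could not read kernel.shmmax")
    elif shmmax < MIN_SHMMAX:
        problems.append("kernel.shmmax=%d is below %d" % (shmmax, MIN_SHMMAX))

    scope = read("kernel.yama.ptrace_scope")
    # Assumption: an unreadable value means Yama is not enabled, so nothing to fix.
    if scope is not None and scope != REQUIRED_PTRACE_SCOPE:
        problems.append("kernel.yama.ptrace_scope=%d, expected %d"
                        % (scope, REQUIRED_PTRACE_SCOPE))

    if not environ.get("PINPATH"):
        problems.append("PINPATH is not set")

    return problems
```

Calling `check_host()` with no arguments reads the live values from `/proc/sys` and the current environment. The `read` and `environ` parameters exist only so the logic can be exercised with fake values.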
diff --git a/README.stats b/README.stats
@@ -0,0 +1,71 @@
+#!/usr/bin/python
+# zsim stats README
+# Author: Daniel Sanchez <[email protected]>
+# Date: May 3 2011
+#
+# Stats are now saved in HDF5, and you should never need to write a stats
+# parser. This README explains how to access them in python using h5py. It
+# doubles as a python script, so you can just execute it with "python
+# README.stats" and see how everything works (after you have generated a stats
+# file).
+#
+
+import h5py # presents HDF5 files as numpy arrays
+import numpy as np
+
+# Open stats file
+f = h5py.File('zsim-ev.h5', 'r')
+
+# Get the single dataset in the file
+dset = f["stats"]["root"]
+
+# Each dataset is first indexed by record. A record is a snapshot of all the
+# stats taken at a specific time.  All stats files have at least two records,
+# at beginning (dset[0]) and end of simulation (dset[-1]).  Inside each record,
+# the format follows the structure of the simulated objects. A few examples:
+
+# Phase count at end of simulation
+endPhase = dset[-1]['phase']
+print endPhase
+
+# If your L2 has a single bank, this is all the L2 hits. Otherwise it's the
+# hits of the first L2 bank
+l2_0_hits = dset[-1]['l2'][0]['hGETS'] + dset[-1]['l2'][0]['hGETX']
+print l2_0_hits
+
+# Hits into all L2s
+l2_hits = np.sum(dset[-1]['l2']['hGETS'] + dset[-1]['l2']['hGETX'])
+print l2_hits
+
+# Total number of instructions executed, counted by adding per-core counts
+# (you could also look at procInstrs)
+totalInstrs = np.sum(dset[-1]['simpleCore']['instrs'])
+print totalInstrs
+
+# You can also focus on one sample, or index over multiple steps, e.g.,
+lastSample = dset[-1]
+allHitsS = lastSample['l2']['hGETS']
+firstL2HitsS = allHitsS[0]
+print firstL2HitsS
+
+# There is a certain slack in the positions of numeric and non-numeric indices,
+# so the following are equivalent:
+print dset[-1]['l2'][0]['hGETS'] 
+#print dset[-1][0]['l2']['hGETS'] # can't do
+print dset[-1]['l2']['hGETS'][0]
+print dset['l2']['hGETS'][-1,0]
+print dset['l2'][-1,0]['hGETS']
+print dset['l2']['hGETS'][-1,0]
+
+# However, you can't do things like dset[-1][0]['l2']['hGETS'], because the [0]
+# indexes a specific element in array 'l2'. The rule of thumb seems to be that
+# numeric indices can "flow up", i.e., you can index them later than you should.
+# This introduces no ambiguities.
+
+# Slicing works as in numpy, e.g.,
+print dset['l2']['hGETS'] # a 2D array with samples*per-cache data
+print dset['l2']['hGETS'][-1] # a 1D array with per-cache numbers, for the last sample
+print dset['l2']['hGETS'][:,0] # 1D array with all samples, for the first L2 cache
+
+# OK, now go bananas!
+
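The indexing rules described in README.stats can be tried without a stats file. The sketch below builds a small NumPy structured array as a stand-in for `dset`. The field names (`phase`, `l2`, `hGETS`, `hGETX`) come from the README; the record layout, the two-bank L2, and all values are made up for illustration, and a real h5py dataset may behave differently in corner cases. It is written for Python 3, unlike the Python 2 script in the diff.

```python
import numpy as np

# Hypothetical stand-in for dset = f["stats"]["root"]: three records (samples),
# each holding a phase counter and two L2 banks. All values are invented.
l2_dtype = np.dtype([("hGETS", "<u8"), ("hGETX", "<u8")])
record_dtype = np.dtype([("phase", "<u8"), ("l2", l2_dtype, (2,))])

dset = np.zeros(3, dtype=record_dtype)
dset["phase"] = [0, 100, 200]
dset["l2"]["hGETS"] = [[0, 0], [10, 20], [30, 40]]
dset["l2"]["hGETX"] = [[0, 0], [1, 2], [3, 4]]

# Hits of the first L2 bank and of all L2 banks, in the last record.
last = dset[-1]
first_bank_hits = int(last["l2"][0]["hGETS"] + last["l2"][0]["hGETX"])
all_l2_hits = int(np.sum(last["l2"]["hGETS"] + last["l2"]["hGETX"]))

# Numeric indices can be applied later than the field names they belong to,
# so these four expressions select the same element.
equivalent = [
    dset[-1]["l2"][0]["hGETS"],
    dset[-1]["l2"]["hGETS"][0],
    dset["l2"]["hGETS"][-1, 0],
    dset["l2"][-1, 0]["hGETS"],
]
all_same = len({int(v) for v in equivalent}) == 1

# Slicing: per-bank values for the last sample, and all samples for bank 0.
per_bank_last = dset["l2"]["hGETS"][-1]
first_bank_all_samples = dset["l2"]["hGETS"][:, 0]

print(first_bank_hits, all_l2_hits, all_same)
print(per_bank_last, first_bank_all_samples)
```

Using a plain in-memory array keeps the example independent of any particular simulated configuration; with a real run, `dset` would instead come from `h5py.File(...)` as in the script above.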