GitHub - epfl-dias/shore-kits

Please refer to the on-line wiki at https://bitbucket.org/shoremt/shore-kits/wiki/Home for more detailed setup information.

-----------------------
Shore Kits Readme
-----------------------

TOC:

1. Installing
2. Running
3. Notes
4. Warnings
5. Contact

-----------------------
1. Installing
-----------------------

a. Prerequisites
-----------------------

- Download and compile Shore-MT.

b. Compiling
-----------------------

#> ln -s <shore-mt-dir>/m4
#> ./autogen.sh
#> ./configure --with-shore=<shore-mt-dir>
               --with-glibtop (for reporting throughput periodically)
               --enable-debug (optional)
[
Special cases in SPARC/Solaris:
#> CXX=CC ./configure --with-readline=<path to the dir that has include/readline dir> ...
]
#> make -j
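
Before linking m4 and running ./configure, it can help to confirm that the
Shore-MT path is usable. The sketch below is not part of Shore Kits:
check_shore_mt_dir is a hypothetical helper, and it assumes only what the
"ln -s <shore-mt-dir>/m4" step implies, namely that the Shore-MT directory
contains an m4 subdirectory.

```shell
# Hypothetical pre-build check (not shipped with Shore Kits).
# Succeeds only if the given path is a directory that contains m4/,
# which the "ln -s <shore-mt-dir>/m4" step relies on.
check_shore_mt_dir() {
    shore_dir=$1
    if [ ! -d "$shore_dir" ]; then
        echo "not a directory: $shore_dir" >&2
        return 1
    fi
    if [ ! -d "$shore_dir/m4" ]; then
        echo "no m4 directory under: $shore_dir" >&2
        return 1
    fi
    return 0
}
```

For example, "check_shore_mt_dir <shore-mt-dir> && ln -s <shore-mt-dir>/m4"
creates the link only when the check passes.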

-----------------------
2. Running
-----------------------

a. Edit shore.conf
-----------------------

To run a workload, first make sure that shore.conf has a proper setup
for the specific scaling factor you want.
For example,
if you want to run the TPC-C workload with scaling factor 10,
check shore.conf to see whether "tpcc-10-device",
"tpcc-10-devicequota", "tpcc-10-bufpoolsize", etc. are defined for it.
Also, make sure that the configured values fit your needs.
For example
(the points below are warnings based on questions previously asked by
our external users):
- your buffer pool size cannot be larger than the available memory on your
machine,
- your device quota should be big enough to hold the whole database even as the
database grows during your runs,
- if you are running in-memory,
the "db-loaders" parameter should not be bigger than the number of available
hardware contexts on your machine; otherwise you are going to observe convoys.
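
The checks above can be partly automated. The following is a minimal sketch,
not a tool shipped with Shore Kits: missing_conf_keys and loaders_fit are
hypothetical helpers, and the sketch assumes that each setting in shore.conf
is written as "<key> = <value>" at the start of a line. Adjust the pattern
if your shore.conf is laid out differently.

```shell
# Hypothetical helper: print the per-workload keys named in this README
# that are NOT defined in the given configuration file.
# Assumption: settings look like "tpcc-10-device = ..." on their own line.
missing_conf_keys() {
    conf_file=$1      # path to shore.conf
    workload=$2       # e.g. tpcc-10
    if [ ! -f "$conf_file" ]; then
        echo "no such file: $conf_file" >&2
        return 1
    fi
    for suffix in device devicequota bufpoolsize; do
        key="$workload-$suffix"
        # Require '=' right after the key so that "...-device" does not
        # also match "...-devicequota".
        if ! grep -Eq "^[[:space:]]*$key[[:space:]]*=" "$conf_file"; then
            echo "$key"
        fi
    done
}

# Hypothetical helper: succeeds when the db-loaders value ($1) does not
# exceed the number of hardware contexts ($2).
loaders_fit() {
    [ "$1" -le "$2" ]
}
```

For example, "missing_conf_keys shore.conf tpcc-10" prints nothing when all
three keys are present, and "loaders_fit 8 $(getconf _NPROCESSORS_ONLN)"
compares a db-loaders value of 8 against the online processor count.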

b. Run shore_kits
-----------------------

---- Creating general log and device directories:

You need to create log and device directories indicated by shore.conf for
the workload configuration you want.
For convenience, the device directory is either "databases" or "dskdb"
and the log directory is "log-<workload name>-<scaling factor>".
We use two device names simply to differentiate logically between
the small databases that we can link to memory and the databases that we link
to disk because they are too big.

If you have enough main memory and do not have a sufficiently fast I/O subsystem,
first link the database and log to somewhere in your machine's main memory
(such as /dev/shm/).
#> mkdir -p <dir for main memory>/<log dir name in shore.conf>
#> mkdir -p <dir for main memory>/<device dir name in shore.conf>
#> ln -s <dir for main memory>/<log dir name in shore.conf>
#> ln -s <dir for main memory>/<device dir name in shore.conf>

If you would like to run everything on disk, then simply do:
#> mkdir -p <log dir name in shore.conf>
#> mkdir -p <device dir name in shore.conf>

So if you would like to run TPC-C with scaling factor 10 on Linux,
then you need something like:
#> mkdir -p /dev/shm/log-tpcc-10
#> mkdir -p /dev/shm/databases
#> ln -s /dev/shm/log-tpcc-10
#> ln -s /dev/shm/databases

or

#> mkdir -p log-tpcc-10
#> mkdir -p databases
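
The in-memory variant of the steps above can be wrapped in a small function.
This is only a sketch: link_dirs_to_memory is a hypothetical helper, and it
expects an absolute path for the memory-backed directory so that the symbolic
links resolve correctly. It creates the links in the current directory, as
the commands above do.

```shell
# Hypothetical helper: create the log and device directories under a
# memory-backed location and link them into the current directory.
# $1 must be an absolute path (for example /dev/shm).
link_dirs_to_memory() {
    mem_dir=$1
    log_dir=$2        # e.g. log-tpcc-10
    dev_dir=$3        # e.g. databases
    for d in "$log_dir" "$dev_dir"; do
        mkdir -p "$mem_dir/$d" || return 1
        # Leave anything that already exists under this name untouched.
        if [ ! -e "$d" ] && [ ! -L "$d" ]; then
            ln -s "$mem_dir/$d" "$d" || return 1
        fi
    done
}
```

For TPC-C with scaling factor 10 this would be called as
"link_dirs_to_memory /dev/shm log-tpcc-10 databases".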

---- Running shore-kits without a previously populated database

#> ./shore_kits -c <workload name>-<scaling factor> -s <design> -d <tree type> -r

Examples:

If you would like to run TPC-C with scaling factor 10 in baseline Shore-MT
using regular B+trees:
#> ./shore_kits -c tpcc-10 -s baseline -d normal -r

If you would like to run TPC-C with scaling factor 10 in baseline Shore-MT
using MRBtrees:
#> ./shore_kits -c tpcc-10 -s baseline -d mrbtnorm -r

If you would like to run TPC-C with scaling factor 10 with DORA using regular
B+trees:
#> ./shore_kits -c tpcc-10 -s dora -d normal -r

If you would like to run TPC-C with scaling factor 10 with DORA using MRBtrees:
#> ./shore_kits -c tpcc-10 -s dora -d mrbtnorm -r

If you would like to run TPC-C with scaling factor 10 with PLP using MRBtrees
(PLP inherently does not use regular B+trees), with no latching on B+trees but
keeping latching on heap pages:
#> ./shore_kits -c tpcc-10 -s plp -d mrbtnorm -r

If you would like to run TPC-C with scaling factor 10 with PLP using MRBtrees
(PLP inherently does not use regular B+trees), with no latching on either
B+trees or heap pages:
#> ./shore_kits -c tpcc-10 -s plp -d mrbtleaf -r

---- Running shore-kits with a previously populated database

Make sure that the populated database and the corresponding log files are under
the device and log directories. Then run shore_kits without the "-r" option:
#> ./shore_kits -c <workload name>-<scaling factor> -s <design> -d <tree type>
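
The invocations above all follow one pattern, which can be sketched as a
function that only prints the command line. shore_kits_cmd is a hypothetical
helper, not part of Shore Kits. The only rule it enforces is the one stated
in this README, that PLP does not use regular B+trees; it does not check
whether the design or tree type names are otherwise valid.

```shell
# Hypothetical helper: print a shore_kits command line.
# $5 is "yes" to populate a fresh database (adds -r); anything else
# assumes a previously populated database.
shore_kits_cmd() {
    workload=$1       # e.g. tpcc
    sf=$2             # scaling factor, e.g. 10
    design=$3         # e.g. baseline, dora, plp
    tree=$4           # e.g. normal, mrbtnorm, mrbtleaf
    populate=$5
    # Per this README, PLP does not use regular B+trees.
    if [ "$design" = plp ] && [ "$tree" = normal ]; then
        echo "plp cannot be combined with tree type 'normal'" >&2
        return 1
    fi
    cmd="./shore_kits -c $workload-$sf -s $design -d $tree"
    if [ "$populate" = yes ]; then
        cmd="$cmd -r"
    fi
    echo "$cmd"
}
```

For example, "shore_kits_cmd tpcc 10 dora mrbtnorm yes" prints the DORA with
MRBtrees command shown earlier, including "-r".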

c. List of commands
-----------------------

i. The three basic commands for running experiments:

measure - Executes a specific workload for a specific duration.
          The user can set the number of clients,
          whether the requests will be spread,
          the number of iterations,
          and the scaling factor.

test    - For a specific workload, each client executes a specific number
          of transactions.
          The user can set the number of clients,
          whether the requests will be spread,
          the number of iterations,
          and the scaling factor.
          WARNING: The throughput reported by the 'test' command may not
          be very accurate.

trxs    - Lists the transactions supported by the current workload.

ii. Additional commands:

break   - Breaks into a debugger by raising ^C
          (terminates program if no debugger is active)
cpu     - Process CPU usage/statistics
echo    - Echoes its input arguments to the screen
help    - Prints help
iodelay - Sets the fake I/O disk delay
quit    - Quits
stats   - Print statistics
trace   - Sets tracing level

iii. Help:

help     - Prints the list of additional commands 
help cmd - Prints cmd-specific information 

d. Usage example:
-----------------------

If you want to run the "measure" command on a TPC-C database with
scaling factor 10 (10 warehouses):

- "measure" usage

measure <NUM_QUERIED> [<SPREAD> <NUM_THRS> <DURATION> <TRX_ID> <ITERATIONS>]

<NUM_QUERIED> : The SF queried (queried factor)
This parameter is the SF (e.g., the number of Warehouses) to be used.
For example, you may have a 100-Warehouse TPC-C database (13GB) populated
but want only 10 of the Warehouses to be used.
In that case, put 10 in NUM_QUERIED.

<SPREAD> : Whether to spread threads (0=No, Other=Yes, Default=No) (optional)
The second parameter determines whether each client submits queries
to its own specific warehouse (to reduce logical contention)
or chooses randomly among all the warehouses.

<NUM_THRS> : Number of threads used (optional)
The third parameter determines the number of clients that will run.
If there is no I/O (the log is linked to memory and the database fits in memory),
then each client is expected to utilize one core.

<DURATION> : Duration of experiment in secs (Default=20) (optional)
The fourth parameter determines the duration of the run.

<TRX_ID> : Transaction ID to be executed (0=mix) (optional)
The fifth parameter determines which transaction the clients will run.
The list of available transactions is given by the "trxs" command.
For example, in TPC-C, putting 0 in that parameter runs the full TPC-C
benchmark (transactions are executed following the distribution defined in the
benchmark). On the other hand, if you want all the clients to execute the
Payment transaction, put 2 there.

<ITERATIONS> : Number of iterations (Default=5) (optional)
Finally, the last parameter determines how many times this run will be executed.

- Examples

- Execute 5 iterations of 1-minute runs of 20 clients running the TPC-C Mix.
(tpcc-base) measure 10 0 20 60 0 5

- Run 2 iterations of 30-second runs of 10 clients submitting only TPC-C Payment
  transactions, where each client uses a different warehouse.
(tpcc-base) measure 10 1 10 30 2 2
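
Because "measure" takes up to six positional numbers, it is easy to miscount
them. The sketch below is a hypothetical helper, measure_cmd, that checks the
argument count against the usage line above and prints the command to type at
the shore_kits prompt. It assumes every parameter is a non-negative integer,
which holds for all examples in this README but is not stated as a rule.

```shell
# Hypothetical helper: validate and print a "measure" command.
# Argument order follows the usage line:
#   NUM_QUERIED [SPREAD NUM_THRS DURATION TRX_ID ITERATIONS]
measure_cmd() {
    if [ "$#" -lt 1 ] || [ "$#" -gt 6 ]; then
        echo "expected 1 to 6 arguments, got $#" >&2
        return 1
    fi
    for arg in "$@"; do
        case $arg in
            # Reject empty strings and anything containing a non-digit.
            ''|*[!0-9]*)
                echo "not a non-negative integer: $arg" >&2
                return 1
                ;;
        esac
    done
    echo "measure $*"
}
```

For example, "measure_cmd 10 1 10 30 2" prints a command that queries 10
warehouses with spreading enabled, 10 clients, 30-second runs of the Payment
transaction, and the default number of iterations.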

-----------------------
3. Notes
-----------------------

a. Re-populate a database:
-----------------------

If errors occur during powerruns,
it is faster to re-populate a new database than to wait for recovery to finish.
When Shore starts, it always tries to recover first.
Therefore, the log files should be deleted before re-populating a benchmark
database; that is, delete all log files in the corresponding log directory.
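
The cleanup step can be sketched as follows. clear_log_dir is a hypothetical
helper, not part of Shore Kits. It assumes that the log files are regular,
non-hidden files directly inside the log directory, and it leaves
subdirectories alone.

```shell
# Hypothetical helper: delete the regular files directly inside a log
# directory so that the next start does not attempt recovery.
# Works whether the directory is real or a symbolic link to one.
clear_log_dir() {
    log_dir=$1        # e.g. log-tpcc-10
    if [ ! -d "$log_dir" ]; then
        echo "no such directory: $log_dir" >&2
        return 1
    fi
    for f in "$log_dir"/*; do
        # An empty directory leaves the pattern unexpanded; -f skips it.
        if [ -f "$f" ]; then
            rm -f -- "$f" || return 1
        fi
    done
    return 0
}
```

For TPC-C with scaling factor 10, "clear_log_dir log-tpcc-10" would be run
before starting shore_kits with "-r".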

-----------------------
4. Warnings
-----------------------

Warning-1: Fields that are included in an index cannot be updated in place;
           delete the record and insert it again.

Warning-2: Only the last field of an index can be of variable length.

-----------------------
5. Contact
-----------------------

For any issues, bug reports, etc., please visit:

https://bitbucket.org/shoremt/shore-kits and send us an email.