diff --git a/benchmarks/README.md b/benchmarks/README.md
index a5ae4406fe729c067cdb6c34a231bf5ec230f181..ef410bcddf874f41509f62caba7c41eb586fb7f6 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -1 +1,155 @@
 # LBMPY benchmarks
+
+This directory provides all scripts needed to reproduce the benchmark comparison
+between lbmpy and [lbmBench](https://github.com/RRZE-HPC/lbm-benchmark-kernels/).
+
+
+## Get started
+Clone this repository and make sure the following dependencies are installed:
+- [Intel Compiler 19](https://software.intel.com/en-us/parallel-studio-xe/choose-download)
+- Python 3 including (this branch of) [pystencils](https://i10git.cs.fau.de/pycodegen/pystencils/),
+  [lbmpy](https://i10git.cs.fau.de/pycodegen/lbmpy/),
+  [Kerncraft](https://github.com/RRZE-HPC/kerncraft/) and their dependencies.
+- [likwid](https://github.com/RRZE-HPC/likwid/) or any similar tool for fixing
+  the CPU frequency and pinning threads.
+- [Jupyter Notebook](https://jupyter.org/install) for creating plots from the
+  result data.
+
+Copy the `config.json` file from `config/` into your local configuration directory
+`~/.config/pystencils/` to use ICC instead of GCC.
+
+If you want to reproduce all results, delete all `results_*` files in this
+directory first, since some of the run scripts append their results to the
+existing data, which would distort the resulting plots.
+If you only want to view the measured results or reproduce some of them,
+keep the corresponding result files.
+
+**Always fix the CPU frequency of your system before running any benchmarks!**
+
+
+## Hardware settings
+The benchmarks can be run on any Skylake (SKL), Haswell (HSW) or Ivy Bridge (IVB) system.
+For comparison, the hardware used in this [report]() is listed below.
+
+| Architecture | Platform | Fixed frequency |
+|--------------|----------|----------------------|
+| SKL          | Intel Xeon Gold 6148 | 2.40 GHz |
+| HSW          | Intel Xeon E5-2695 v3| 2.30 GHz |
+| IVB          | Intel Xeon E5-2660 v2| 2.20 GHz |
+
+## Usage
+All results from this [report]() can be reproduced by running the corresponding
+scripts, and plots can be created with the `plot_results.ipynb` Jupyter notebook.
+
+In general, the workflow is divided into **four steps**:
+
+### 1. Get roofline by using the copy benchmark
+> Produces `results_copy.csv`
+
+Change into `copy/` and run
+```
+./run_copy.sh ARCH
+```
+For help, run this script with `--help`.
+
+### 2. Get lbmBench results as reference peak performance
+> Produces `results_lbmBench_[ARCH].csv`, `results_node.csv`, `results_dims.csv`
+
+Change into `lbmBench/` and run all three *create_plotdata_lbmBench...* scripts:
+- Run
+  ```
+  ./create_plotdata_lbmBench_single_core.sh ARCH
+  ```
+  to create the result file `results_lbmBench_ARCH.csv` for all kernels on a
+  single core.  
+  For help, run this script with `--help`.
+- Run
+  ```
+  ./create_plotdata_lbmBench_node_scale.sh ARCH
+  ```
+  to create results in `results_node.csv` for the fastest lbmBench kernel on the
+  given architecture, scaling from 1 to 40 threads.  
+  For help, run this script with `--help`.
+- Run
+  ```
+  ./create_plotdata_lbmBench_dim_scaling.sh ARCH
+  ```
+  to create results for the fastest lbmBench kernel on a single core with a
+  varying problem size in the x dimension
+  (`[10, 25, 50, 100, 125, 150, 175, 200, 250, 300, 400, 500]`).  
+  For help, run this script with `--help`.
+
+### 3. Get lbmpy results
+> Produces `results_lbmpy_[ARCH].csv`, `results_node.csv`, `results_dims.csv`
+
+Change into `lbmpy/` and run all three *create_plotdata_lbmpy...* scripts:
+- Run
+  ```
+  ./create_plotdata_lbmpy_single_core.sh ARCH
+  ```
+  to create the result file `results_lbmpy_ARCH.csv` for all parameter
+  combinations on a single core.  
+  For help, run this script with `--help`.
+- Run
+  ```
+  ./create_plotdata_lbmpy_node_scale.sh ARCH
+  ```
+  to create results in `results_node.csv` for the fastest lbmpy kernel on the
+  given architecture, scaling from 1 to 40 threads.
+  This script uses the `bench-omp-ARCH` binary.
+  To compile the binary for the architecture `ARCH` yourself,
+  run the following commands before executing the main script:
+  ```bash
+  ./lbmpy_bench.py ARCH -f --openmp -o dummy-out
+  mv bench bench-omp-ARCH
+  rm dummy-out
+  ```  
+  For help, run this script or `lbmpy_bench.py` with `--help`.
+- Run
+  ```
+  ./create_plotdata_lbmpy_dim_scaling.sh ARCH
+  ```
+  to create results for the fastest lbmpy kernel on a single core with a
+  varying problem size in the x dimension
+  (`[10, 25, 50, 100, 125, 150, 175, 200, 250, 300, 400, 500]`).  
+  For help, run this script with `--help`.
+
+### 4. Plot results and visualize differences
+To create plots from the generated results, open `plot_results.ipynb` in
+Jupyter Notebook and execute the cells.
+
+**Export plots as SVG**  
+The code cell of each plot contains a comment block:
+```python
+# For download as SVG
+# iplot(fig, image='svg', filename='YOUR_PLOT_NAME', image_width=1280)
+```
+To download a plot, uncomment the last line and adjust the filename or
+image width if needed.
+
+**Change bar graph orientation**  
+For each bar graph you can switch between horizontal and vertical mode.
+For all lbmBench kernels, this is done by changing the `lbm_orientation`
+variable in the following code cell:
+```python
+##################################################
+# **ADJUST HERE**
+# 'h' or 'v'
+lbm_orientation='v'
+
+##################################################
+```
+Similarly, for all lbmpy kernels the `lbmpy_orientation` variable is set in the
+corresponding code cell:
+```python
+##################################################
+# **ADJUST HERE**
+# 'h' or 'v'
+lbmpy_orientation='v'
+
+##################################################
+```
+
+In addition to the plots, the notebook shows all lbmpy results in a queryable
+dataframe and reports the average speedup from using the `split` and `nontemporal`
+options as well as the parameters of the fastest kernel.
\ No newline at end of file