Add km module kafka

leewei
2023-02-14 16:27:47 +08:00
parent 229140f067
commit 0b8160a714
4039 changed files with 718112 additions and 46204 deletions

vagrant/README.md Normal file

@@ -0,0 +1,134 @@
# Apache Kafka #
Using Vagrant to get up and running.
1) Install Virtual Box [https://www.virtualbox.org/](https://www.virtualbox.org/)
2) Install Vagrant >= 1.6.4 [https://www.vagrantup.com/](https://www.vagrantup.com/)
3) Install Vagrant Plugins:
    $ vagrant plugin install vagrant-hostmanager
    # Optional
    $ vagrant plugin install vagrant-cachier # Caches & shares package downloads across VMs
In the main Kafka folder, do a normal Kafka build:
    $ gradle
    $ ./gradlew jar
You can override default settings in `Vagrantfile.local`, which is a Ruby file
that is ignored by git and imported into the Vagrantfile.
One setting you likely want to enable
in `Vagrantfile.local` is `enable_dns = true` to put hostnames in the host's
/etc/hosts file. You probably want this to avoid having to use IP addresses when
addressing the cluster from outside the VMs, e.g. if you run a client on the
host. It's disabled by default since it requires `sudo` access, mucks with your
system state, and breaks with naming conflicts if you try to run multiple
clusters concurrently.
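For example, a minimal `Vagrantfile.local` enabling this could be created from the shell (an illustrative one-liner):

    $ echo 'enable_dns = true' >> Vagrantfile.local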
Now bring up the cluster:
    $ vagrant/vagrant-up.sh
    $ # If on aws, run: vagrant/vagrant-up.sh --aws
(This essentially runs `vagrant up --no-provision && vagrant hostmanager && vagrant provision`.)
We separate out the steps (bringing up the base VMs, mapping hostnames, and configuring the VMs)
due to current limitations in ZooKeeper (ZOOKEEPER-1506) that require us to
collect IPs for all nodes before starting ZooKeeper nodes. Breaking into multiple steps
also allows us to bring machines up in parallel on AWS.
Once this completes:
* Zookeeper will be running on 192.168.50.11 (and `zk1` if you used enable_dns)
* Broker 1 on 192.168.50.51 (and `broker1` if you used enable_dns)
* Broker 2 on 192.168.50.52 (and `broker2` if you used enable_dns)
* Broker 3 on 192.168.50.53 (and `broker3` if you used enable_dns)
To log into one of the machines:
    $ vagrant ssh <machineName>
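For example, with the hostnames above:

    $ vagrant ssh broker1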
You can access the brokers and ZooKeeper nodes by their IP or hostname, e.g.

    # Specify ZooKeeper node 1 by its IP: 192.168.50.11
    bin/kafka-topics.sh --create --zookeeper 192.168.50.11:2181 --replication-factor 3 --partitions 1 --topic sandbox

    # Specify brokers by their hostnames: broker1, broker2, broker3
    bin/kafka-console-producer.sh --broker-list broker1:9092,broker2:9092,broker3:9092 --topic sandbox

    # Specify brokers by their IP: 192.168.50.51, 192.168.50.52, 192.168.50.53
    bin/kafka-console-consumer.sh --bootstrap-server 192.168.50.51:9092,192.168.50.52:9092,192.168.50.53:9092 --topic sandbox --from-beginning
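To verify the topic exists and is fully replicated, you could for example describe it (an illustrative check, reusing the ZooKeeper address above):

    bin/kafka-topics.sh --describe --zookeeper 192.168.50.11:2181 --topic sandbox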
If you need to update the running cluster, you can re-run the provisioner (the
step that installs software and configures services):
    $ vagrant provision
Note that this doesn't currently ensure a fresh start -- old cluster state will
still remain intact after everything restarts. This can be useful for updating
the cluster to your most recent development version.
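Vagrant can also provision a single machine, which may be handy when iterating on one node (an illustrative example):

    $ vagrant provision broker1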
Finally, you can clean up the cluster by destroying all the VMs:
    $ vagrant destroy -f
## Configuration ##
You can override some default settings by specifying the values in
`Vagrantfile.local`. It is interpreted as a Ruby file, although you'll probably
only ever need to change a few simple configuration variables. Some values you
might want to override:
* `enable_hostmanager` - true by default; override to false if on AWS to allow parallel cluster bringup.
* `enable_dns` - Register each VM with a hostname in /etc/hosts on the
host. Hostnames are always set in /etc/hosts in the VMs, so this is only
necessary if you want to address them conveniently from the host for tasks
that aren't provided by Vagrant.
* `enable_jmx` - Whether to enable JMX ports on 800x for ZooKeeper nodes and 900x for brokers, where `x` is the node number. For example, the zk1 machine would have JMX exposed on 8001, zk2 on 8002, etc.
* `num_workers` - Generic workers that get the code (from this project), but don't start any services (no brokers, no zookeepers, etc). Useful for starting clients. Each worker will have an IP address of `192.168.50.10x` where `x` starts at `1` and increments for each worker.
* `num_zookeepers` - Size of zookeeper cluster
* `num_brokers` - Number of broker instances to run
* `ram_megabytes` - The size of each virtual machine's RAM; defaults to `1200MB`
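As a sketch, a `Vagrantfile.local` combining several of these settings (illustrative values) might look like:

    $ cat > Vagrantfile.local <<'EOF'
    enable_dns = true
    num_zookeepers = 1
    num_brokers = 3
    ram_megabytes = 2048
    EOF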
## Using Other Providers ##
### EC2 ###
Install the `vagrant-aws` plugin to provide EC2 support:
    $ vagrant plugin install vagrant-aws
Next, configure parameters in `Vagrantfile.local`. A few are *required*:
`enable_hostmanager`, `enable_dns`, `ec2_access_key`, `ec2_secret_key`, `ec2_keypair_name`, `ec2_keypair_file`, and
`ec2_security_groups`. A couple of important notes:
1. You definitely want to use `enable_dns` if you plan to run clients outside of
the cluster (e.g. from your local host). If you don't, you'll need to look up
the connection details with `vagrant ssh-config`.
2. You'll have to set up a reasonable security group yourself. You'll need to
open ports for Zookeeper (2888 & 3888 between ZK nodes, 2181 for clients) and
Kafka (9092). Beware that opening these ports to all sources (e.g. so you can
run producers/consumers locally) will allow anyone to access your Kafka
cluster. All other settings have reasonable defaults for setting up an
Ubuntu-based cluster, but you may want to customize instance type, region,
AMI, etc.
3. `ec2_access_key` and `ec2_secret_key` will use the environment variables
`AWS_ACCESS_KEY` and `AWS_SECRET_KEY` respectively if they are set and not
overridden in `Vagrantfile.local`.
4. If you're launching into a VPC, you must specify `ec2_subnet_id` (the subnet
in which to launch the nodes) and `ec2_security_groups` must be a list of
security group IDs instead of names, e.g. `sg-34fd3551` instead of
`kafka-test-cluster`.
Now start things up, but specify the aws provider:
    $ vagrant/vagrant-up.sh --aws
Your instances should get tagged with a name including your hostname to make
them identifiable and make it easier to track instances in the AWS management
console.

vagrant/aws/aws-access-keys-commands Normal file

@@ -0,0 +1,25 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
export AWS_IAM_ROLE=$(curl -s http://169.254.169.254/latest/meta-data/iam/info | grep InstanceProfileArn | cut -d '"' -f 4 | cut -d '/' -f 2)
export AWS_ACCESS_KEY=$(curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/$AWS_IAM_ROLE | grep AccessKeyId | awk -F\" '{ print $4 }')
export AWS_SECRET_KEY=$(curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/$AWS_IAM_ROLE | grep SecretAccessKey | awk -F\" '{ print $4 }')
export AWS_SESSION_TOKEN=$(curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/$AWS_IAM_ROLE | grep Token | awk -F\" '{ print $4 }')
if [ -z "$AWS_ACCESS_KEY" ]; then
echo "Failed to populate environment variables AWS_ACCESS_KEY, AWS_SECRET_KEY, and AWS_SESSION_TOKEN."
echo "AWS_IAM is currently $AWS_IAM. Double-check that this is correct. If not set, add this command to your .bashrc file:"
echo "export AWS_IAM=<my_aws_iam> # put this into your ~/.bashrc"
fi
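# Example usage (illustrative): source this file, then confirm the variables are populated:
#   . vagrant/aws/aws-access-keys-commands
#   env | grep '^AWS_'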

vagrant/aws/aws-example-Vagrantfile.local Normal file

@@ -0,0 +1,29 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Use this template Vagrantfile.local for running system tests on aws
# To use it, move it to the base kafka directory and rename
# it to Vagrantfile.local, and adjust variables as needed.
ec2_instance_type = "m3.xlarge"
ec2_spot_max_price = "0.266" # On-demand price for instance type
enable_hostmanager = false
num_zookeepers = 0
num_brokers = 0
num_workers = 9
ec2_keypair_name = 'kafkatest'
ec2_keypair_file = '../kafkatest.pem'
ec2_security_groups = ['kafkatest']
ec2_region = 'us-west-2'
ec2_ami = "ami-29ebb519"

vagrant/aws/aws-init.sh Executable file

@@ -0,0 +1,81 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This script can be used to set up a driver machine on aws from which you will run tests
# or bring up your mini Kafka cluster.
# Install dependencies
sudo apt-get install -y \
maven \
openjdk-8-jdk-headless \
build-essential \
ruby-dev \
zlib1g-dev \
realpath \
python-setuptools \
iperf \
traceroute
base_dir=`dirname $0`/../..
if [ -z `which vagrant` ]; then
echo "Installing vagrant..."
wget https://releases.hashicorp.com/vagrant/2.1.5/vagrant_2.1.5_x86_64.deb
sudo dpkg -i vagrant_2.1.5_x86_64.deb
rm -f vagrant_2.1.5_x86_64.deb
fi
# Install necessary vagrant plugins
# Note: Do NOT install vagrant-cachier since it doesn't work on AWS and only
# adds log noise
# Custom vagrant-aws with spot instance support. See https://github.com/mitchellh/vagrant-aws/issues/32
wget -nv https://s3-us-west-2.amazonaws.com/confluent-packaging-tools/vagrant-aws-0.7.2.spot.gem -P /tmp
vagrant_plugins="/tmp/vagrant-aws-0.7.2.spot.gem vagrant-hostmanager"
existing=`vagrant plugin list`
for plugin in $vagrant_plugins; do
echo $existing | grep $plugin > /dev/null
if [ $? != 0 ]; then
vagrant plugin install $plugin
fi
done
# Create Vagrantfile.local as a convenience
if [ ! -e "$base_dir/Vagrantfile.local" ]; then
cp $base_dir/vagrant/aws/aws-example-Vagrantfile.local $base_dir/Vagrantfile.local
fi
gradle="gradle-2.2.1"
if [ -z `which gradle` ] && [ ! -d $base_dir/$gradle ]; then
if [ ! -e $gradle-bin.zip ]; then
wget https://services.gradle.org/distributions/$gradle-bin.zip
fi
unzip $gradle-bin.zip
rm -rf $gradle-bin.zip
mv $gradle $base_dir/$gradle
fi
# Ensure aws access keys are in the environment when we use an EC2 driver machine
LOCAL_HOSTNAME=$(hostname -d)
if [[ ${LOCAL_HOSTNAME} =~ .*\.compute\.internal ]]; then
grep "AWS ACCESS KEYS" ~/.bashrc > /dev/null
if [ $? != 0 ]; then
echo "# --- AWS ACCESS KEYS ---" >> ~/.bashrc
echo ". `realpath $base_dir/aws/aws-access-keys-commands`" >> ~/.bashrc
echo "# -----------------------" >> ~/.bashrc
source ~/.bashrc
fi
fi
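# Example usage (illustrative): on a fresh Ubuntu EC2 driver machine, from the kafka base directory:
#   vagrant/aws/aws-init.sh
#   source ~/.bashrc   # pick up the AWS access key environment variables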

vagrant/base.sh Executable file

@@ -0,0 +1,167 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -ex
# The version of Kibosh to use for testing.
# If you update this, also update tests/docker/Dockerfile
export KIBOSH_VERSION=8841dd392e6fbf02986e2fb1f1ebf04df344b65a
path_to_jdk_cache() {
jdk_version=$1
echo "/tmp/jdk-${jdk_version}.tar.gz"
}
fetch_jdk_tgz() {
jdk_version=$1
path=$(path_to_jdk_cache $jdk_version)
if [ ! -e $path ]; then
mkdir -p $(dirname $path)
curl -s -L "https://s3-us-west-2.amazonaws.com/kafka-packages/jdk-${jdk_version}.tar.gz" -o $path
fi
}
JDK_MAJOR="${JDK_MAJOR:-8}"
JDK_FULL="${JDK_FULL:-8u202-linux-x64}"
if [ -z `which javac` ]; then
apt-get -y update
apt-get install -y software-properties-common python-software-properties binutils java-common
echo "===> Installing JDK..."
mkdir -p /opt/jdk
cd /opt/jdk
rm -rf $JDK_MAJOR
mkdir -p $JDK_MAJOR
cd $JDK_MAJOR
fetch_jdk_tgz $JDK_FULL
tar x --strip-components=1 -zf $(path_to_jdk_cache $JDK_FULL)
for bin in /opt/jdk/$JDK_MAJOR/bin/* ; do
name=$(basename $bin)
update-alternatives --install /usr/bin/$name $name $bin 1081 && update-alternatives --set $name $bin
done
echo -e "export JAVA_HOME=/opt/jdk/$JDK_MAJOR\nexport PATH=\$PATH:\$JAVA_HOME/bin" > /etc/profile.d/jdk.sh
echo "JDK installed: $(javac -version 2>&1)"
fi
chmod a+rw /opt
if [ -h /opt/kafka-dev ]; then
# reset symlink
rm /opt/kafka-dev
fi
ln -s /vagrant /opt/kafka-dev
get_kafka() {
version=$1
scala_version=$2
kafka_dir=/opt/kafka-$version
url=https://s3-us-west-2.amazonaws.com/kafka-packages/kafka_$scala_version-$version.tgz
# the .tgz above does not include the streams test jar hence we need to get it separately
url_streams_test=https://s3-us-west-2.amazonaws.com/kafka-packages/kafka-streams-$version-test.jar
if [ ! -d /opt/kafka-$version ]; then
pushd /tmp
curl -O $url
curl -O $url_streams_test || true
file_tgz=`basename $url`
file_streams_jar=`basename $url_streams_test` || true
tar -xzf $file_tgz
rm -rf $file_tgz
file=`basename $file_tgz .tgz`
mv $file $kafka_dir
mv $file_streams_jar $kafka_dir/libs || true
popd
fi
}
# Install Kibosh
apt-get update -y && apt-get install -y git cmake pkg-config libfuse-dev
pushd /opt
rm -rf /opt/kibosh
git clone -q https://github.com/confluentinc/kibosh.git
pushd "/opt/kibosh"
git reset --hard $KIBOSH_VERSION
mkdir "/opt/kibosh/build"
pushd "/opt/kibosh/build"
../configure && make -j 2
popd
popd
popd
# Install iperf
apt-get install -y iperf traceroute
# Test multiple Kafka versions
# We want to use the latest Scala version per Kafka version
# Previously we could not pull in Scala 2.12 builds, because Scala 2.12 requires Java 8 and we were running the system
# tests with Java 7. We have since switched to Java 8, so 2.0.0 and later use Scala 2.12.
get_kafka 0.8.2.2 2.11
chmod a+rw /opt/kafka-0.8.2.2
get_kafka 0.9.0.1 2.11
chmod a+rw /opt/kafka-0.9.0.1
get_kafka 0.10.0.1 2.11
chmod a+rw /opt/kafka-0.10.0.1
get_kafka 0.10.1.1 2.11
chmod a+rw /opt/kafka-0.10.1.1
get_kafka 0.10.2.2 2.11
chmod a+rw /opt/kafka-0.10.2.2
get_kafka 0.11.0.3 2.11
chmod a+rw /opt/kafka-0.11.0.3
get_kafka 1.0.2 2.11
chmod a+rw /opt/kafka-1.0.2
get_kafka 1.1.1 2.11
chmod a+rw /opt/kafka-1.1.1
get_kafka 2.0.1 2.12
chmod a+rw /opt/kafka-2.0.1
get_kafka 2.1.1 2.12
chmod a+rw /opt/kafka-2.1.1
get_kafka 2.2.2 2.12
chmod a+rw /opt/kafka-2.2.2
get_kafka 2.3.1 2.12
chmod a+rw /opt/kafka-2.3.1
get_kafka 2.4.0 2.12
chmod a+rw /opt/kafka-2.4.0
get_kafka 2.5.0 2.12
chmod a+rw /opt/kafka-2.5.0
# For EC2 nodes, we want to use /mnt, which should have the local disk. On local
# VMs, we can just create it if it doesn't exist and use it like we'd use
# /tmp. Eventually, we'd like to also support more directories, e.g. when EC2
# instances have multiple local disks.
if [ ! -e /mnt ]; then
mkdir /mnt
fi
chmod a+rwx /mnt
# Run ntpdate once to sync to ntp servers
# use -u option to avoid port collision in case ntp daemon is already running
ntpdate -u pool.ntp.org
# Install ntp daemon - it will automatically start on boot
apt-get -y install ntp
# Increase the ulimit
mkdir -p /etc/security/limits.d
echo "* soft nofile 128000" >> /etc/security/limits.d/nofile.conf
echo "* hard nofile 128000" >> /etc/security/limits.d/nofile.conf
ulimit -Hn 128000
ulimit -Sn 128000
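# This script is normally run by the Vagrant provisioner. To re-run it by hand on a
# VM (illustrative; assumes /vagrant is the synced kafka source tree):
#   vagrant ssh broker1 -c "sudo /vagrant/vagrant/base.sh"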

vagrant/broker.sh Executable file

@@ -0,0 +1,43 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Usage: broker.sh <broker ID> <public hostname or IP> <comma-separated list of ZooKeeper public hostname/IP:port> [jmx_port]
set -e
BROKER_ID=$1
PUBLIC_ADDRESS=$2
PUBLIC_ZOOKEEPER_ADDRESSES=$3
JMX_PORT=$4
kafka_dir=/opt/kafka-dev
cd $kafka_dir
sed \
-e 's/broker.id=0/'broker.id=$BROKER_ID'/' \
-e 's/#advertised.host.name=<hostname routable by clients>/'advertised.host.name=$PUBLIC_ADDRESS'/' \
-e 's/zookeeper.connect=localhost:2181/'zookeeper.connect=$PUBLIC_ZOOKEEPER_ADDRESSES'/' \
$kafka_dir/config/server.properties > $kafka_dir/config/server-$BROKER_ID.properties
echo "Killing server"
bin/kafka-server-stop.sh || true
sleep 5 # Because kafka-server-stop.sh doesn't actually wait
echo "Starting server"
if [[ -n $JMX_PORT ]]; then
export JMX_PORT=$JMX_PORT
export KAFKA_JMX_OPTS="-Djava.rmi.server.hostname=$PUBLIC_ADDRESS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false "
fi
bin/kafka-server-start.sh $kafka_dir/config/server-$BROKER_ID.properties 1>> /tmp/broker.log 2>> /tmp/broker.log &
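# Example invocation (illustrative values; normally invoked by the Vagrantfile provisioner):
#   vagrant/broker.sh 1 192.168.50.51 192.168.50.11:2181 9091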

vagrant/package-base-box.sh Executable file

@@ -0,0 +1,75 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This script automates the process of creating and packaging
# a new vagrant base_box. For use locally (not aws).
base_dir=`dirname $0`/..
cd $base_dir
backup_vagrantfile=backup_Vagrantfile.local
local_vagrantfile=Vagrantfile.local
# Restore original Vagrantfile.local, if it exists
function revert_vagrantfile {
rm -f $local_vagrantfile
if [ -e $backup_vagrantfile ]; then
mv $backup_vagrantfile $local_vagrantfile
fi
}
function clean_up {
echo "Cleaning up..."
vagrant destroy -f
rm -f package.box
revert_vagrantfile
}
# Name of the new base box
base_box="kafkatest-worker"
# vagrant VM name
worker_name="worker1"
echo "Destroying vagrant machines..."
vagrant destroy -f
echo "Removing $base_box from vagrant..."
vagrant box remove $base_box
echo "Bringing up a single vagrant machine from scratch..."
if [ -e $local_vagrantfile ]; then
mv $local_vagrantfile $backup_vagrantfile
fi
echo "num_workers = 1" > $local_vagrantfile
echo "num_brokers = 0" >> $local_vagrantfile
echo "num_zookeepers = 0" >> $local_vagrantfile
vagrant up
up_status=$?
if [ $up_status != 0 ]; then
echo "Failed to bring up a template vm, please try running again."
clean_up
exit $up_status
fi
echo "Packaging $worker_name..."
vagrant package $worker_name
echo "Adding new base box $base_box to vagrant..."
vagrant box add $base_box package.box
clean_up
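# Example usage (illustrative): package the base box, then point system tests at it:
#   vagrant/package-base-box.sh
#   echo 'base_box = "kafkatest-worker"' >> Vagrantfile.local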

vagrant/system-test-Vagrantfile.local Normal file

@@ -0,0 +1,26 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Use this example Vagrantfile.local for running system tests
# To use it, move it to the base kafka directory and rename
# it to Vagrantfile.local
num_zookeepers = 0
num_brokers = 0
num_workers = 9
base_box = "kafkatest-worker"
# System tests use hostnames for each worker that need to be defined in /etc/hosts on the host running ducktape
enable_dns = true
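# Example usage (illustrative, assuming this file lives at vagrant/system-test-Vagrantfile.local):
#   cp vagrant/system-test-Vagrantfile.local Vagrantfile.local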

vagrant/vagrant-up.sh Executable file

@@ -0,0 +1,266 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -o nounset
set -o errexit # exit script if any command exits with nonzero value
readonly PROG_NAME=$(basename $0)
readonly PROG_DIR=$(dirname $(realpath $0))
readonly INVOKE_DIR=$(pwd)
readonly ARGS="$@"
# overrideable defaults
AWS=false
PARALLEL=true
MAX_PARALLEL=5
DEBUG=false
readonly USAGE="Usage: $PROG_NAME [-h | --help] [--aws [--no-parallel] [--max-parallel MAX]]"
readonly HELP="$(cat <<EOF
Tool to bring up a vagrant cluster on local machine or aws.
-h | --help Show this help message
--aws Use if you are running in aws
--no-parallel Bring up machines serially instead of in parallel. Only applicable on aws
--max-parallel MAX Maximum number of machines to bring up in parallel. Note: only applicable on test worker machines on aws. default: $MAX_PARALLEL
--debug Enable debug information for vagrant
Approximately speaking, this wrapper script essentially wraps 2 commands:
vagrant up
vagrant hostmanager
The situation on aws is complicated by the fact that aws imposes a maximum request rate,
which effectively caps the number of machines we are able to bring up in parallel. Therefore, on aws,
this wrapper script attempts to bring up machines in small batches.
If you are seeing rate limit exceeded errors, you may need to use a reduced --max-parallel setting.
EOF
)"
function help {
echo "$USAGE"
echo "$HELP"
exit 0
}
while [[ $# -gt 0 ]]; do
key="$1"
case $key in
-h | --help)
help
;;
--aws)
AWS=true
;;
--no-parallel)
PARALLEL=false
;;
--max-parallel)
MAX_PARALLEL="$2"
shift
;;
--debug)
DEBUG=true
;;
*)
# unknown option
echo "Unknown option $1"
exit 1
;;
esac
shift # past argument or value
done
# Get a list of vagrant machines (in any state)
function read_vagrant_machines {
local ignore_state="ignore"
local reading_state="reading"
local tmp_file="tmp-$RANDOM"
local state="$ignore_state"
local machines=""
while read -r line; do
# Lines before the first empty line are ignored
# The first empty line triggers change from ignore state to reading state
# When in reading state, we parse in machine names until we hit the next empty line,
# which signals that we're done parsing
if [[ -z "$line" ]]; then
if [[ "$state" == "$ignore_state" ]]; then
state="$reading_state"
else
# all done
echo "$machines"
return
fi
continue
fi
# Parse machine name while in reading state
if [[ "$state" == "$reading_state" ]]; then
line=$(echo "$line" | cut -d ' ' -f 1)
if [[ -z "$machines" ]]; then
machines="$line"
else
machines="${machines} ${line}"
fi
fi
done < <(vagrant status)
}
# Filter "list", returning a list of strings containing pattern as a substring
function filter {
local list="$1"
local pattern="$2"
local result=""
for item in $list; do
if [[ ! -z "$(echo $item | grep "$pattern")" ]]; then
result="$result $item"
fi
done
echo "$result"
}
# Given a list of machine names, return only test worker machines
function worker {
local machines="$1"
local workers=$(filter "$machines" "worker")
workers=$(echo "$workers" | xargs) # trim leading/trailing whitespace
echo "$workers"
}
# Given a list of machine names, return only zookeeper and broker machines
function zk_broker {
local machines="$1"
local zk_broker_list=$(filter "$machines" "zk")
zk_broker_list="$zk_broker_list $(filter "$machines" "broker")"
zk_broker_list=$(echo "$zk_broker_list" | xargs) # trim leading/trailing whitespace
echo "$zk_broker_list"
}
# Run a vagrant command on batches of machines of size $group_size
# This is annoying but necessary on aws to avoid errors due to AWS request rate
# throttling
#
# Example
# $ vagrant_batch_command "vagrant up" "m1 m2 m3 m4 m5" "2"
#
# This is equivalent to running "vagrant up" on groups of machines of size 2 or less, i.e.:
# $ vagrant up m1 m2
# $ vagrant up m3 m4
# $ vagrant up m5
function vagrant_batch_command {
local vagrant_cmd="$1"
local machines="$2"
local group_size="$3"
local count=1
local m_group=""
# Using --provision flag makes this command usable both when bringing up a cluster from scratch,
# and when bringing up a halted cluster. Permissions on certain directories set during provisioning
# seem to revert when machines are halted, so --provision ensures permissions are set correctly in all cases
for machine in $machines; do
m_group="$m_group $machine"
if [[ $(expr $count % $group_size) == 0 ]]; then
# We've reached a full group
# Bring up this part of the cluster
$vagrant_cmd $m_group
m_group=""
fi
((count++))
done
# Take care of any leftover partially complete group
if [[ ! -z "$m_group" ]]; then
$vagrant_cmd $m_group
fi
}
# We assume vagrant-hostmanager is installed, but may or may not be disabled during vagrant up
# In this fashion, we ensure we run hostmanager after machines are up, and before provisioning.
# This sequence of commands is necessary for example for bringing up a multi-node zookeeper cluster
function bring_up_local {
vagrant up --no-provision
vagrant hostmanager
vagrant provision
}
function bring_up_aws {
local parallel="$1"
local max_parallel="$2"
local machines="$(read_vagrant_machines)"
case "$3" in
true)
local debug="--debug"
;;
false)
local debug=""
;;
esac
zk_broker_machines=$(zk_broker "$machines")
worker_machines=$(worker "$machines")
if [[ "$parallel" == "true" ]]; then
if [[ ! -z "$zk_broker_machines" ]]; then
# We still have to bring up zookeeper/broker nodes serially
echo "Bringing up zookeeper/broker machines serially"
vagrant up --provider=aws --no-parallel --no-provision $zk_broker_machines $debug
vagrant hostmanager --provider=aws
vagrant provision
fi
if [[ ! -z "$worker_machines" ]]; then
echo "Bringing up test worker machines in parallel"
# Try to isolate this job in its own /tmp space. See note
# below about vagrant issue
local vagrant_rsync_temp_dir=$(mktemp -d);
TMPDIR=$vagrant_rsync_temp_dir vagrant_batch_command "vagrant up $debug --provider=aws" "$worker_machines" "$max_parallel"
rm -rf $vagrant_rsync_temp_dir
vagrant hostmanager --provider=aws
fi
else
vagrant up --provider=aws --no-parallel --no-provision $debug
vagrant hostmanager --provider=aws
vagrant provision
fi
# Currently it seems that the AWS provider will always run rsync
# as part of vagrant up. However,
# https://github.com/mitchellh/vagrant/issues/7531 means it is not
# safe to do so. Since the bug doesn't seem to cause any direct
# errors, just missing data on some nodes, follow up with serial
# rsyncing to ensure we're in a clean state. Use custom TMPDIR
# values to ensure we're isolated from any other instances of this
# script that are running/ran recently and may cause different
# instances to sync to the wrong nodes
for worker in $worker_machines; do
local vagrant_rsync_temp_dir=$(mktemp -d);
TMPDIR=$vagrant_rsync_temp_dir vagrant rsync $worker;
rm -rf $vagrant_rsync_temp_dir
done
}
function main {
if [[ "$AWS" == "true" ]]; then
bring_up_aws "$PARALLEL" "$MAX_PARALLEL" "$DEBUG"
else
bring_up_local
fi
}
main
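# Example invocations (illustrative):
#   vagrant/vagrant-up.sh                          # bring up a local cluster
#   vagrant/vagrant-up.sh --aws --max-parallel 3   # on aws, smaller batches to stay under the request rate limit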

vagrant/zk.sh Executable file

@@ -0,0 +1,47 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Usage: zk.sh <zkid> <num_zk> [jmx_port]
set -e
ZKID=$1
NUM_ZK=$2
JMX_PORT=$3
kafka_dir=/opt/kafka-dev
cd $kafka_dir
cp $kafka_dir/config/zookeeper.properties $kafka_dir/config/zookeeper-$ZKID.properties
echo "initLimit=5" >> $kafka_dir/config/zookeeper-$ZKID.properties
echo "syncLimit=2" >> $kafka_dir/config/zookeeper-$ZKID.properties
echo "quorumListenOnAllIPs=true" >> $kafka_dir/config/zookeeper-$ZKID.properties
for i in `seq 1 $NUM_ZK`; do
echo "server.${i}=zk${i}:2888:3888" >> $kafka_dir/config/zookeeper-$ZKID.properties
done
mkdir -p /tmp/zookeeper
echo "$ZKID" > /tmp/zookeeper/myid
echo "Killing ZooKeeper"
bin/zookeeper-server-stop.sh || true
sleep 5 # Because zookeeper-server-stop.sh doesn't actually wait
echo "Starting ZooKeeper"
if [[ -n $JMX_PORT ]]; then
export JMX_PORT=$JMX_PORT
export KAFKA_JMX_OPTS="-Djava.rmi.server.hostname=zk$ZKID -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false "
fi
bin/zookeeper-server-start.sh config/zookeeper-$ZKID.properties 1>> /tmp/zk.log 2>> /tmp/zk.log &
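# Example invocation (illustrative values; normally run by the Vagrantfile provisioner):
#   vagrant/zk.sh 1 3 8001   # node zk1 of a 3-node ensemble, JMX on 8001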