Issue apache#2212: Fix issues in Bookkeeper Docker image that prevented containers from starting up

Descriptions of the changes in this PR:

### Motivation

This PR modifies the Apache Bookkeeper Docker image to fix issues that were causing errors upon container bootstrap. The containers would exit soon after they were launched. See issue apache#2212 for a description of such an error. 

The problems fixed in this PR were observed:

* when launching containers with both Docker Compose and Kubernetes.
* while upgrading the image to `4.9.2`. The issue very likely affects other versions too (except `4.7.3`).
* when launching both a standalone container and a cluster of three containers.

### Changes
The major changes made in this PR are as follows:
* Update the `Dockerfile` to install `zk-shell`.
* Update the `init_bookie.sh` file to:
  * Use `zk-shell` instead of the `/opt/bookkeeper/bin/bookkeeper org.apache.zookeeper.ZooKeeperMain` command, which doesn't work.
  * Use `/opt/bookkeeper/bin/bookkeeper shell initnewcluster` to initialize the cluster instead of the previously used command, which did not work.
  * Increase the time a container waits for an in-flight `initnewcluster` operation.
  * Make the comments more descriptive.
* Modify `bin/common.sh` to handle the case where the file `/proc/sys/net/ipv6/bindv6only` is missing from the system. A missing file can prevent a container from starting up in some cases; we have seen this in some Kubernetes-based environments.
* Fix the `BK_zkLedgersRootPath` entries in `docker-compose.yml`, which had stray spaces around `=`.
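
To illustrate why the spacing in the `docker-compose.yml` entries matters, here is a minimal sketch. It assumes a list-form `KEY=VALUE` entry is split at the first `=` without trimming whitespace; that is an assumption about how such entries are parsed, not something this PR states. `split_env_entry` is a hypothetical helper written for this example only.

```shell
#!/usr/bin/env bash
# Illustrative only: split a KEY=VALUE entry at the first '=' with no
# trimming, to show what stray spaces do to the name and the value.
# split_env_entry is a hypothetical helper, not part of BookKeeper or Compose.
split_env_entry() {
    local entry="$1"
    local name="${entry%%=*}"   # everything before the first '='
    local value="${entry#*=}"   # everything after the first '='
    printf 'name=[%s] value=[%s]\n' "$name" "$value"
}

split_env_entry 'BK_zkLedgersRootPath = /ledgers'
# prints: name=[BK_zkLedgersRootPath ] value=[ /ledgers]
split_env_entry 'BK_zkLedgersRootPath=/ledgers'
# prints: name=[BK_zkLedgersRootPath] value=[/ledgers]
```

Under that assumption, the spaced form yields a variable name with a trailing space and a value with a leading space, so a script that reads `${BK_zkLedgersRootPath}` would not see the intended `/ledgers`.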

*Note:* Some of the changes made in this PR are modeled after changes made by sijie for `v4.7.2` in PR apache#1666.

### Master Issue
apache#2212 




Reviewers: Jia Zhai <[email protected]>, Enrico Olivelli <[email protected]>

This closes apache#2219 from ravisharda/startup-failure-docker-image, closes apache#2212
Ravi Sharda authored May 4, 2020
1 parent 94912a9 commit 092ebf2
Showing 5 changed files with 35 additions and 34 deletions.
2 changes: 1 addition & 1 deletion bin/common.sh
@@ -18,7 +18,7 @@
 # */

 # Check net.ipv6.bindv6only
-if [ -f /sbin/sysctl ]; then
+if [ -f /sbin/sysctl ] && [ -f /proc/sys/net/ipv6/bindv6only ]; then
 # check if net.ipv6.bindv6only is set to 1
 bindv6only=$(/sbin/sysctl -n net.ipv6.bindv6only 2> /dev/null)
 if [ -n "$bindv6only" ] && [ "$bindv6only" -eq "1" ]
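
The guard added to `bin/common.sh` checks that the `/proc` entry exists before the setting is queried. A minimal sketch of the same idea follows. It reads the file directly instead of calling `/sbin/sysctl`, which is a simplification for illustration and not what the script does. `read_bindv6only` and its `proc_root` parameter are hypothetical, added so the logic can be run against a scratch directory.

```shell
#!/usr/bin/env bash
# Sketch: only read the bindv6only setting when its /proc entry exists.
# read_bindv6only and proc_root are hypothetical names for this example.
read_bindv6only() {
    local proc_root="${1:-/proc}"
    local flag_file="${proc_root}/sys/net/ipv6/bindv6only"
    if [ -f "$flag_file" ]; then
        # Entry present: print its value.
        cat "$flag_file"
    else
        # Entry absent: report that instead of failing.
        echo "unset"
    fi
}
```

Calling `read_bindv6only` with no argument inspects the real `/proc`; passing a directory lets the missing-file path be exercised without changing the host.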
7 changes: 6 additions & 1 deletion docker/Dockerfile
@@ -33,7 +33,7 @@ ENV JAVA_HOME=/usr/lib/jvm/jre-1.8.0
 # Download Apache Bookkeeper, untar and clean up
 RUN set -x \
 && adduser "${BK_USER}" \
-&& yum install -y java-1.8.0-openjdk-headless wget bash python sudo \
+&& yum install -y java-1.8.0-openjdk-headless wget bash python sudo\
 && mkdir -pv /opt \
 && cd /opt \
 && wget -q "${DISTRO_URL}" \
@@ -46,6 +46,11 @@ RUN set -x \
 && tar -xzf "$DISTRO_NAME.tar.gz" \
 && mv bookkeeper-server-${BK_VERSION}/ /opt/bookkeeper/ \
 && rm -rf "$DISTRO_NAME.tar.gz" "$DISTRO_NAME.tar.gz.asc" "$DISTRO_NAME.tar.gz.sha512" \
+# install zookeeper shell
+&& wget -q https://bootstrap.pypa.io/get-pip.py \
+&& python get-pip.py \
+&& pip install zk-shell \
+&& rm -rf get-pip.py \
 && yum remove -y wget \
 && yum clean all

6 changes: 3 additions & 3 deletions docker/docker-compose.yml
@@ -30,7 +30,7 @@ services:
 environment:
 - JAVA_HOME=/usr/lib/jvm/jre-1.8.0
 - BK_zkServers=zookeeper:2181
-- BK_zkLedgersRootPath = /ledgers
+- BK_zkLedgersRootPath=/ledgers

 bookie2:
 image: apache/bookkeeper
@@ -40,7 +40,7 @@ services:
 environment:
 - JAVA_HOME=/usr/lib/jvm/jre-1.8.0
 - BK_zkServers=zookeeper:2181
-- BK_zkLedgersRootPath = /ledgers
+- BK_zkLedgersRootPath=/ledgers

 bookie3:
 image: apache/bookkeeper
@@ -50,7 +50,7 @@ services:
 environment:
 - JAVA_HOME=/usr/lib/jvm/jre-1.8.0
 - BK_zkServers=zookeeper:2181
-- BK_zkLedgersRootPath = /ledgers
+- BK_zkLedgersRootPath=/ledgers

 dice:
 image: caiok/bookkeeper-tutorial
49 changes: 20 additions & 29 deletions docker/scripts/init_bookie.sh
@@ -19,65 +19,57 @@
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */

source ${SCRIPTS_DIR}/common.sh

function wait_for_zookeeper() {
echo "wait for zookeeper"
until /opt/bookkeeper/bin/bookkeeper org.apache.zookeeper.ZooKeeperMain -server ${BK_zkServers} ls /; do sleep 5; done
until zk-shell --run-once "ls /" ${BK_zkServers}; do sleep 5; done
}

function create_zk_root() {
if [ "x${BK_CLUSTER_ROOT_PATH}" != "x" ]; then
echo "create the zk root dir for bookkeeper at '${BK_CLUSTER_ROOT_PATH}'"
/opt/bookkeeper/bin/bookkeeper org.apache.zookeeper.ZooKeeperMain -server ${BK_zkServers} create ${BK_CLUSTER_ROOT_PATH}
zk-shell --run-once "create ${BK_CLUSTER_ROOT_PATH} '' false false true" ${BK_zkServers}
fi
}

# Init the cluster if required znodes not exist in Zookeeper.
# Use ephemeral zk node as lock to keep initialize atomic.
function init_cluster() {
if [ "x${BK_STREAM_STORAGE_ROOT_PATH}" == "x" ]; then
echo "BK_STREAM_STORAGE_ROOT_PATH is not set. fail fast."
exit -1
fi

/opt/bookkeeper/bin/bookkeeper org.apache.zookeeper.ZooKeeperMain -server ${BK_zkServers} stat ${BK_STREAM_STORAGE_ROOT_PATH}
zk-shell --run-once "ls ${BK_zkLedgersRootPath}/available/readonly" ${BK_zkServers}
if [ $? -eq 0 ]; then
echo "Metadata of cluster already exists, no need to init"
echo "Cluster metadata already exists"
else
# create ephemeral zk node bkInitLock, initiator who this node, then do init; other initiators will wait.
/opt/bookkeeper/bin/bookkeeper org.apache.zookeeper.ZooKeeperMain -server ${BK_zkServers} create -e ${BK_CLUSTER_ROOT_PATH}/bkInitLock
if [ $? -eq 0 ]; then
# bkInitLock created success, this is the successor to do znode init
echo "Initializing bookkeeper cluster at service uri ${BK_metadataServiceUri}."
/opt/bookkeeper/bin/bkctl --service-uri ${BK_metadataServiceUri} cluster init
# Create an ephemeral zk node `bkInitLock` for use as a lock.
lock=`zk-shell --run-once "create ${BK_CLUSTER_ROOT_PATH}/bkInitLock '' true false false" ${BK_zkServers}`
if [ -z "$lock" ]; then
echo "znodes do not exist in Zookeeper for Bookkeeper. Initializing a new Bookkeekeper cluster in Zookeeper."
/opt/bookkeeper/bin/bookkeeper shell initnewcluster
if [ $? -eq 0 ]; then
echo "Successfully initialized bookkeeper cluster at service uri ${BK_metadataServiceUri}."
echo "initnewcluster operation succeeded"
else
echo "Failed to initialize bookkeeper cluster at service uri ${BK_metadataServiceUri}. please check the reason."
echo "initnewcluster operation failed. Please check the reason."
echo "Exit status of initnewcluster"
echo $?
exit
fi
else
echo "Other docker instance is doing initialize at the same time, will wait in this instance."
echo "Others may be initializing the cluster at the same time."
tenSeconds=1
while [ ${tenSeconds} -lt 10 ]
while [ ${tenSeconds} -lt 100 ]
do
sleep 10
echo "run '/opt/bookkeeper/bin/bookkeeper org.apache.zookeeper.ZooKeeperMain -server ${BK_zkServers} stat ${BK_STREAM_STORAGE_ROOT_PATH}'"
/opt/bookkeeper/bin/bookkeeper org.apache.zookeeper.ZooKeeperMain -server ${BK_zkServers} stat ${BK_STREAM_STORAGE_ROOT_PATH}
zk-shell --run-once "ls ${BK_zkLedgersRootPath}/available/readonly" ${BK_zkServers}
if [ $? -eq 0 ]; then
echo "Waited $tenSeconds * 10 seconds, bookkeeper inited"
echo "Waited $tenSeconds * 10 seconds. Successfully listed ''${BK_zkLedgersRootPath}/available/readonly'"
break
else
echo "Waited $tenSeconds * 10 seconds, still not init"
echo "Waited $tenSeconds * 10 seconds. Continue waiting."
(( tenSeconds++ ))
continue
fi
done

if [ ${tenSeconds} -eq 10 ]; then
echo "Waited 100 seconds for bookkeeper cluster init, something wrong, please check"
if [ ${tenSeconds} -eq 100 ]; then
echo "Waited 100 seconds for bookkeeper cluster to initialize, but to no avail. Something is wrong, please check."
exit
fi
fi
@@ -97,5 +89,4 @@ function init_bookie() {

# init the cluster
init_cluster

}
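
The waiting logic in `init_cluster` polls ZooKeeper at a fixed interval until the cluster metadata appears or a limit is reached. A generic sketch of that bounded-polling pattern is shown below. `wait_until` and its parameters are hypothetical names for this example and are not part of `init_bookie.sh`.

```shell
#!/usr/bin/env bash
# Sketch of a bounded polling loop: run a check command up to max_attempts
# times, pausing delay_seconds between tries.
# wait_until is a hypothetical helper, not part of init_bookie.sh.
wait_until() {
    local max_attempts="$1"
    local delay_seconds="$2"
    shift 2
    local attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # "$@" is the check command and its arguments.
        if "$@"; then
            echo "check succeeded on attempt ${attempt}"
            return 0
        fi
        # Pause only if another attempt will follow.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay_seconds"
        fi
        attempt=$((attempt + 1))
    done
    echo "gave up after ${max_attempts} attempts" >&2
    return 1
}
```

With the check used in the script, a call might look like `wait_until 100 10 zk-shell --run-once "ls ${BK_zkLedgersRootPath}/available/readonly" "${BK_zkServers}"`. Returning a non-zero status on timeout lets the caller decide whether to exit.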
5 changes: 5 additions & 0 deletions tests/docker-images/current-version-image/Dockerfile
@@ -39,6 +39,11 @@ RUN set -x \
 && yum install -y java-1.8.0-openjdk-headless wget bash python-pip python-devel sudo netcat gcc gcc-c++ \
 && mkdir -pv /opt \
 && cd /opt \
+# install zookeeper shell
+&& wget -q https://bootstrap.pypa.io/get-pip.py \
+&& python get-pip.py \
+&& pip install zk-shell \
+&& rm -rf get-pip.py \
 && yum clean all

 # untar tarballs
