Dmitrii Khalezin DevOps Engineer

Created: May 30, 2024

[Step-by-Step Guide: How to Run a ClickHouse Cluster in a GKE Cluster]

Overview

ClickHouse is a high-performance, column-oriented SQL database management system (DBMS) for online analytical processing (OLAP).

OLAP systems facilitate the organization and analysis of extensive datasets, enabling the swift execution of intricate queries. With the ability to efficiently handle petabytes of data, OLAP is invaluable in fields such as data science and business analytics.

In this article, we'll demonstrate step-by-step how to run a ClickHouse cluster in a GKE cluster.

Please note that in this article we're intentionally not using Terraform, Helm, or other tools, and will use the fastest and easiest way to launch both the GKE and ClickHouse clusters.

Prerequisites

Account in GCP.
GKE cluster.
Node pool with 3 x e2-standard-2 nodes.

Required tools

gcloud
kubectl (configured to communicate with the required cluster)
clickhouse-client

GKE cluster

We start by creating a simple GKE cluster. If you already have a cluster, you can skip this step and move to installing the ClickHouse operator.

1. Initialize gcloud

gcloud init

2. Set project

gcloud config set project PROJECT_ID

3. Set default account

gcloud config set account [email protected]

4. Authorization in the project

gcloud auth application-default login

5. Enable Kubernetes Engine API and Compute Engine API

gcloud services enable container.googleapis.com compute.googleapis.com

6. Verify existing cluster

gcloud container clusters list

7. Create a GKE cluster with e2-standard-2 nodes and 1 node per zone

gcloud container clusters create clickhouse-cluster --location us-central1 --machine-type e2-standard-2 --num-nodes 1
kubeconfig entry generated for clickhouse-cluster.
NAME                LOCATION     MASTER_VERSION      MASTER_IP       MACHINE_TYPE  NODE_VERSION        NUM_NODES  STATUS
clickhouse-cluster  us-central1  1.27.8-gke.1067004  35.224.126.241  e2-standard-2     1.27.8-gke.1067004  9          RUNNING

8. Authenticate the GKE cluster

gcloud container clusters get-credentials clickhouse-cluster --location us-central1

9. Verify access to the cluster with kubectl

kubectl get nodes
NAME                                                STATUS   ROLES    AGE     VERSION
gke-clickhouse-cluster-default-pool-121c11eb-fh4l   Ready    <none>   6m22s   v1.27.8-gke.1067004
gke-clickhouse-cluster-default-pool-413e1fe1-s662   Ready    <none>   6m21s   v1.27.8-gke.1067004
gke-clickhouse-cluster-default-pool-fee1829a-c2tq   Ready    <none>   6m19s   v1.27.8-gke.1067004

You can find more information about cluster quickstart here and here.

Altinity Сlickhouse operator

The Altinity ClickHouse operator serves as a robust solution for overseeing ClickHouse within a Kubernetes ecosystem. It provides an array of functionalities, including automated cluster setup, scalability, and monitoring. Whether you're overseeing a solitary ClickHouse instance or administering multiple ClickHouse clusters, leveraging the ClickHouse Operator can streamline operations and enhance efficiency.

Installation

We'll follow a simple approach to install the Altinity ClickHouse operator and apply a k8s manifest.

Install Altinity ClickHouse Operator version 0.23.4:

kubectl apply -f https://raw.githubusercontent.com/Altinity/clickhouse-operator/release-0.23.4/deploy/operator/clickhouse-operator-install-bundle.yaml

With this command, we install clickhouse-operator version release-0.23.4;
Pay attention to the default ClickHouse operator installation in the kube-system namespace. If you want to change the installation namespace, you need to download the operator manifest and change the namespace value before applying;
By default, the ClickHouse operator watches all namespaces:

  watch:
      # List of namespaces where clickhouse-operator watches for events.
      # Concurrently running operators should watch on different namespaces.
      # IMPORTANT
      # Regexp is applicable.
      #namespaces: ["dev", "test"]
      namespaces: []

If you want to change this behavior, you need to download the operator manifest change this configuration, and apply it.

You can verify the operator with this command:

kubectl get pods --namespace kube-system
NAME READY   STATUS    RESTARTS      AGE
…………
clickhouse-operator-857c69ffc6-ttnsj   2/2     Running   0  4s

You can find more information about the ClickHouse operator in the github repository. Our next step is to launch the ClickHouse Keeper and ClickHouse cluster. Let's familiarize ourselves with the step-by-step instructions.

ClickHouse cluster

ClickHouse Keeper or ZooKeeper coordinates and manages the state in a ClickHouse cluster. They ensure data consistency and availability and manage various aspects of the cluster such as leader election, node discovery, and distribution of data processing tasks.

We use ClickHouse Keeper in this guide. Read more about why we use ClickHouse Keeper.

Installation

Create namespace:

kubectl create namespace clickhouse
namespace/clickhouse created

Clickhouse Keeper

1. With this example, we get the following:

StatefulSet of ClickHouse Keeper with 3 replicas.
25GB of storage per replica.
Services:
ClusterIP;
Second with headless type.
Configmap with ClickHouse Keeper settings.
Configmap with ClickHouse Keeper scripts.
PodDisruptionBudget manifest for ClickHouse Keeper.

kubectl -n clickhouse apply -f https://raw.githubusercontent.com/Altinity/clickhouse-operator/release-0.23.3/deploy/clickhouse-keeper/clickhouse-keeper-manually/clickhouse-keeper-3-nodes.yaml

2. Verify installation status

kubectl -n clickhouse get po                                                                                                                                                                          
NAME                     READY   STATUS    RESTARTS   AGE
clickhouse-keeper-0      1/1     Running   0          25s
clickhouse-keeper-1      1/1     Running   0          18s
clickhouse-keeper-2      1/1     Running   0          11s

3. Verify ClickHouse Keeper

kubectl  -n clickhouse port-forward svc/clickhouse-keeper 2181
……
echo ruok | nc 127.0.0.1 2181
imok

If everything is fine you get imok as the answer.

Find more information about this example in the GitHub repository and official documentation.

ClickHouse cluster

1. Create a service account for the ClickHouse pods. Create a sa.yaml file with the following content:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: clickhouse-gcp

Then you need to apply this manifest:

kubectl -n clickhouse apply -f  sa.yaml

2. Using the ClickHouseInstallation object, we can launch the ClickHouse cluster. With this configuration, we get the following:

Two replicas of ClickHouse;
One shard;
User, password, and network restriction, in this example, we have user admin and password admin.

     admin/password_sha256_hex: 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
     admin/access_management: 1
     admin/networks/ip:
     - 127.0.0.1
     - "::1"

You can generate the password sha256_hex with the following command:

echo -n admin| sha256sum

Note for ClickHouse, we need to use SSD disks and in this configuration we use: storageClassName: premium-rwo.

You need to create a chi.yaml file with the following content:

apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
 name: dc1
spec:
 defaults:
   templates:
     serviceTemplate: chi-service-template
     clusterServiceTemplate: cluster-service-template
 configuration:
   zookeeper:
     nodes:
       - host: clickhouse-keeper
         port: 2181
   settings:
     default_replica_name: "{replica}"
     default_replica_path: "/clickhouse/tables/{uuid}/{shard}"
   users:
     admin/password_sha256_hex: 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
     admin/access_management: 1
     admin/networks/ip:
     - 127.0.0.1
     - "::1"
   files:
     users.d/remove_database_ordinary.xml: |
       <yandex>
         <profiles>
            <default>
               <default_database_engine remove="1"/>
            </default>
         </profiles>
       </yandex>
     config.d/z_log_disable.xml: |-
       <clickhouse>
           <asynchronous_metric_log remove="1"/>
           <metric_log remove="1"/>
           <query_views_log remove="1" />
           <part_log remove="1"/>
           <session_log remove="1"/>
           <text_log remove="1" />
           <trace_log remove="1"/>
           <crash_log remove="1"/>
           <opentelemetry_span_log remove="1"/>
           <zookeeper_log remove="1"/>
       </clickhouse>
     config.d/query_log_ttl.xml: |-
       <clickhouse>
           <query_log replace="1">
               <database>system</database>
               <table>query_log</table>
               <engine>ENGINE = MergeTree PARTITION BY (event_date)
                       ORDER BY (event_time)
                       TTL event_date + INTERVAL 14 DAY DELETE
               </engine>
               <flush_interval_milliseconds>7500</flush_interval_milliseconds>
           </query_log>
       </clickhouse>
     config.d/query_thread_log_ttl.xml: |-
       <clickhouse>
           <query_thread_log replace="1">
               <database>system</database>
               <table>query_thread_log</table>
               <engine>ENGINE = MergeTree PARTITION BY (event_date)
                       ORDER BY (event_time)
                       TTL event_date + INTERVAL 14 DAY DELETE
               </engine>
               <flush_interval_milliseconds>7500</flush_interval_milliseconds>
           </query_thread_log>
       </clickhouse>
     config.d/merge_tree.xml: |-
       <clickhouse>
           <merge_tree>
               <max_part_loading_threads>4</max_part_loading_threads>
               <max_part_removal_threads>4</max_part_removal_threads>
           </merge_tree>
       </clickhouse>
   clusters:
     - name: "cluster1"
       templates:
         podTemplate: clickhouse-23.8-lts
         clusterServiceTemplate: cluster-service-template
       layout:
         shardsCount: 1
         replicasCount: 2
 templates:
   serviceTemplates:
     - name: chi-service-template
       generateName: "{chi}"
       spec:
         ports:
           - name: tcp
             port: 9000
         type: ClusterIP
         ClusterIP: None
     - name: cluster-service-template
       generateName: "{chi}-{cluster}"
       spec:
         ports:
           - name: tcp
             port: 9000
         type: ClusterIP
         ClusterIP: None
   podTemplates:
     - name: clickhouse-23.8-lts
       podDistribution:
         - type: ClickHouseAntiAffinity
       spec:
         serviceAccount: clickhouse-gcp
         containers:
           - name: clickhouse
             image: clickhouse/clickhouse-server:23.8.9.54
             volumeMounts:
               - name: server-storage
                 mountPath: /var/lib/clickhouse
             resources:
               requests:
                 memory: "1024Mi"
                 cpu: 200m
               limits:
                 memory: "1280Mi"
   volumeClaimTemplates:
     - name: server-storage
       reclaimPolicy: Retain
       spec:
         accessModes:
           - ReadWriteOnce
         storageClassName: premium-rwo
         resources:
           requests:
             storage: 20Gi

Then you need to apply this manifest

kubectl -n clickhouse apply -f chi.yaml

Verify ClickHouse installation

kubectl -n clickhouse get po
chi-dc1-cluster1-0-0-0   1/1     Running   0          100s
chi-dc1-cluster1-0-1-0   1/1     Running   0          93s

Verify the ClickHouse cluster after installation

1. You need to use port-forward to get access to the cluster

kubectl -n clickhouse port-forward svc/dc1 9000:9000

2. Using the ClickHouse-client, you can connect to the cluster

clickhouse-client --user admin --password admin --host localhost --port 9000

3. Get information about the ClickHouse cluster

chi-dc1-cluster1-0-0-0.chi-dc1-cluster1-0-0.clickhouse.svc.cluster.local :) SELECT * FROM system.clusters LIMIT 2 FORMAT Vertical;

SELECT *
FROM system.clusters
LIMIT 2
FORMAT Vertical

Query id: b6e62d38-08d8-488c-9f69-893d79786080

Row 1:
──────
cluster:                 all-replicated
shard_num:               1
shard_weight:            1
replica_num:             1
host_name:               chi-dc1-cluster1-0-0
host_address:            127.0.0.1
port:                    9000
is_local:                1
user:                    default
default_database:        
errors_count:            0
slowdowns_count:         0
estimated_recovery_time: 0
database_shard_name:     
database_replica_name:   
is_active:               ᴺᵁᴸᴸ

Row 2:
──────
cluster:                 all-replicated
shard_num:               1
shard_weight:            1
replica_num:             2
host_name:               chi-dc1-cluster1-0-1
host_address:            10.2.1.32
port:                    9000
is_local:                0
user:                    default
default_database:        
errors_count:            0
slowdowns_count:         0
estimated_recovery_time: 0
database_shard_name:     
database_replica_name:   
is_active:               ᴺᵁᴸᴸ

2 rows in set. Elapsed: 0.002 sec.

Conclusion

This guide provides a comprehensive tutorial for deploying ClickHouse on a k8s cluster. By following the steps above, users can establish a robust ClickHouse environment suitable for their analytical workloads. With the ClickHouse Operator facilitating cluster management and ClickHouse Keeper ensuring coordination, users can easily leverage the power of ClickHouse within a Kubernetes ecosystem.

For further customization and advanced configurations, users are encouraged to refer to the official ClickHouse Operator documentation and GitHub repository. Additionally, exploring monitoring and alerting solutions like Victoria Metrics can enhance cluster observability and performance management.

By deploying ClickHouse on k8s, users can unlock powerful analytical capabilities, enabling efficient data processing and analysis for various use cases ranging from data science to business intelligence.