Home

Advanced Topics

Service Discovery

As of Kubernetes 1.3, DNS is a built-in service launched automatically using the addon manager.

The addons are in the /etc/kubernetes/addons directory on the master node.

The service can be used within pods to find other services running on the same cluster.

Multiple containers within 1 pod don't need this service, as they can contact each other directly. A container in the same pod can just use localhost:port.

To make DNS work, a pod will need a service definition.

How can app 1 reach app 2 using DNS? A container in app 1 can simply talk to the service of app 2 by name.

For example, running host app1-service might return 10.0.0.1, while host app2-service returns 10.0.0.2.
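
For reference, a minimal Service definition that makes such a name resolvable could look like the following. This is a sketch; the name app1-service, the label selector and the ports are assumptions for illustration, not taken from the demo files.

# app1-service.yml (illustrative sketch)
apiVersion: v1
kind: Service
metadata:
  name: app1-service        # this is the name DNS will resolve
spec:
  selector:
    app: app1               # assumed pod label
  ports:
  - port: 80
    targetPort: 3000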

Examples from the command line:

host app1-service                             # has address 10.0.0.1
host app2-service                             # has address 10.0.0.2
host app2-service.default                     # app2-service.default has address 10.0.0.2
host app2-service.default.svc.cluster.local   # app2-service.default.svc.cluster.local has address 10.0.0.2

The default stands for the default namespace. Pods and services can be launched in different namespaces (to logically separate your cluster).

So how does this resolution work?

Say we launch a pod with kubectl run -i --tty busybox --image=busybox --restart=Never -- sh and, from that shell, run cat /etc/resolv.conf. You will see a nameserver entry and a search path. If you do a lookup of a service name from this pod, the search domains explain why the lookups above work with .default and .default.svc.cluster.local.
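
As a rough sketch of what you might see inside the busybox pod (the IPs and the exact output format will differ per cluster):

# inside the busybox pod (illustrative output, values will differ)
$ cat /etc/resolv.conf
nameserver 10.0.0.10
search default.svc.cluster.local svc.cluster.local cluster.local

$ nslookup app2-service
Server:    10.0.0.10
Address 1: 10.0.0.10
Name:      app2-service
Address 1: 10.0.0.2 app2-service.default.svc.cluster.local

Because default.svc.cluster.local is in the search path, the short name app2-service and the longer forms all resolve to the same service IP.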

Demo: Service Discovery

In the demo we create a Secret, a pod for the database (MySQL, using the Secret), and a service exposing the database port. We then deploy three replicas of a helloworld-deployment whose container runs node index-db.js, code that talks to the database through the service. The MYSQL_HOST environment variable is set to database-service, which resolves because the metadata name in database-service.yml is database-service.
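
A minimal sketch of what that database-service.yml could look like; the selector label and ports here are assumptions, not the exact demo file:

# database-service.yml (illustrative sketch)
apiVersion: v1
kind: Service
metadata:
  name: database-service    # this name is what MYSQL_HOST resolves against
spec:
  selector:
    app: database           # assumed label on the database pod
  ports:
  - port: 3306
    targetPort: 3306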

Running kubectl get pod, we should see the database pod plus the 3 pods for the deployment.

Running kubectl logs [pod-name] will show us the logs for that pod.

Again, remember that running kubectl get svc will get all the services available.

ConfigMap

Config params that are not secret can be put in the ConfigMap.

The input is again key-value pairs.

The ConfigMap key-value pairs can then be read by the app using:

  1. Env variables
  2. Container commandline args in the Pod config
  3. Using volumes

A ConfigMap can also contain full config files, e.g. a webserver config file. That file can then be mounted, using volumes, at the path where the application expects its config file.

This way you can inject configuration settings into containers without changing the container image itself.

To generate a configmap using files:

$ cat << EOF > app.properties
driver=jdbc
database=postgres
lookandfeel=1
otherparams=xyz
param.with.hierarchy=xyz
EOF
$ kubectl create configmap app-config --from-file=app.properties
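
You can then verify what ended up in the ConfigMap (illustrative commands; the data section will contain the app.properties key with the file contents as its value):

$ kubectl get configmap app-config -o yaml
$ kubectl describe configmap app-config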

How to use it? You can create a pod that exposes the ConfigMap using a volume.

# pod-helloworld.yml w/ secrets
apiVersion: v1
kind: Pod
metadata:
  name: nodehelloworld.example.com
  labels:
    app: helloworld
spec:
  # The containers are listed here
  containers:
  - name: k8s-demo
    image: okeeffed/docker-demo
    ports:
    - containerPort: 3000
    # @@@ These are the volume mounts for the container
    volumeMounts:
    - name: credvolume
      mountPath: /etc/creds
      readOnly: true
    # @@@ For the ConfigMap
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: credvolume
    secret:
      secretName: db-secrets
  # @@@ For the ConfigMap
  - name: config-volume
    configMap:
      name: app-config

Within /etc/config, the config values are stored one file per key, e.g. /etc/config/driver and /etc/config/param.with.hierarchy.
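
A quick way to check one of the mounted values from inside the container (the pod and container names are the ones from the example above; the output is illustrative):

# read a single config value from the mounted ConfigMap volume
$ kubectl exec -i -t nodehelloworld.example.com -c k8s-demo -- cat /etc/config/driver
jdbc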

This is an example of a pod that exposes the ConfigMap as env variables:

# pod-helloworld.yml
apiVersion: v1
kind: Pod
metadata:
  name: nodehelloworld.example.com
  labels:
    app: helloworld
spec:
  # The containers are listed here
  containers:
  - name: k8s-demo
    image: okeeffed/docker-demo
    ports:
    - containerPort: 3000
    # @@@ ConfigMap values exposed as environment variables
    env:
    - name: DRIVER
      valueFrom:               # where you get the value from
        configMapKeyRef:       # ensuring the ref comes from the ConfigMap
          name: app-config
          key: driver
    - name: DATABASE
      [ ... ]

Demo: Config Map

Using an example for a reverse proxy config for NGINX:

server {
    listen 80;
    server_name localhost;

    location / {
        proxy_bind 127.0.0.1;
        proxy_pass http://127.0.0.1:3000;
    }

    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
        root /usr/share/nginx/html;
    }
}

We could then create this config map with kubectl create configmap nginx-config --from-file=reverseproxy.conf.

# pod-hellonginx.yml
apiVersion: v1
kind: Pod
metadata:
  name: hellonginx.example.com
  labels:
    app: hellonginx
spec:
  # The containers are listed here
  containers:
  - name: nginx
    image: nginx:1.11
    ports:
    - containerPort: 80
    # @@@ The important conf mount
    volumeMounts:
    - name: config-volume
      mountPath: /etc/nginx/conf.d
  - name: k8s-demo
    image: okeeffed/docker-demo
    ports:
    - containerPort: 3000
  # @@@ The important mounting
  volumes:
  - name: config-volume    # @@@ this is referred to above in volumeMounts
    configMap:
      name: nginx-config
      items:
      - key: reverseproxy.conf
        path: reverseproxy.conf

After also creating the service, we can grab the minikube service URL and use curl to make a request. From there we can see that nginx answers the request and forwards it to the NodePort.

If we then want to jump into the nginx container to see what is going on, we can run kubectl exec -i -t helloworld-nginx -c nginx -- bash (-c flag to specify container) and run ps x to see the processes and we can cat /etc/nginx/conf.d/reverseproxy.conf.

At this stage, we can enable SSL for NGINX.

Ingress Controller

Ingress is a solution, available since Kubernetes 1.1, that allows inbound connections to the cluster.

It's an alternative to the external LoadBalancer and nodePorts. It allows you to easily expose services that need to be accessible from outside the cluster.

With Ingress you can run your own ingress controller (basically a load balancer) within the Kubernetes cluster.

There are default ingress controllers available, or you can write your own.

How does it work? If you connect over port 80/443, you will first hit the ingress controller. You can use the NGINX controller that comes with Kubernetes. That controller will then direct the traffic according to the ingress rules.

The ingress rules could define that traffic for host-x.example.com goes to Pod 1, and so on. You can even route specific URL paths.

To create the Ingress object with the routing rules:

# ingress-rules.yml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: helloworld-rules
spec:
  # @@@ Setting the important rules
  rules:
  - host: helloworld-v1.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: helloworld-v1
          servicePort: 80
  - host: helloworld-v2.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: helloworld-v2
          servicePort: 80

Demo: Ingress Controller

In the example, the ingress controller is a Replication Controller to ensure that there is always one up and running.

After deploying, if we curl with the -H Host header set to helloworld-v1.example.com and helloworld-v2.example.com respectively, the ingress controller routes each request to the matching service.
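
A hedged sketch of those requests (the node IP and port depend on how the ingress controller is exposed in your setup):

# illustrative; replace <node-ip> with your ingress controller's address
$ curl http://<node-ip>/ -H 'Host: helloworld-v1.example.com'   # answered by the helloworld-v1 service
$ curl http://<node-ip>/ -H 'Host: helloworld-v2.example.com'   # answered by the helloworld-v2 service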

External DNS

On public cloud providers, you can use the ingress controller to reduce the cost of your LoadBalancers.

  • You could use 1 LoadBalancer that captures all the external traffic and sends it to the ingress controller.
  • The ingress controller can be configured to route the different traffic to all your apps based on HTTP rules.
  • This only works for HTTP(S)-based apps.

The External DNS tool will automatically create the necessary DNS records in your external DNS server (like route53).

  • For every hostname used in ingress, it'll create a new record to send traffic to the load balancer.
  • The major DNS providers are supported: Route53, Google CloudDNS, CloudFlare etc.

Volumes

How can we run stateful apps?

Volumes in Kubernetes allow you to store data outside of the container. So far, all the applications we have deployed have been stateless; state was kept in external services like a database or caching server (e.g. MySQL, AWS S3).

Persistent Volumes in Kubernetes allow you to attach a volume to a container that exists even when the container stops. Volumes can be attached using different volume plugins. Eg local volume, EBS Storage etc.

Using EBS Storage

With this, we can keep state. You could run a MySQL database using persistent volumes, although this may not be ready for production yet.

The use case is that if your node stops working, the pod can be rescheduled on another node, and the volume can be attached to the new node.

To use volumes, you first need to create the volume:

aws ec2 create-volume --size 10 --region us-east-1 --availability-zone us-east-1a --volume-type gp2

Next, we need to create a pod with a volume definition:

# pod-helloworld.yml
apiVersion: v1
kind: Pod
metadata:
  name: hellonginx.example.com
  labels:
    app: hellonginx
spec:
  # The containers are listed here
  containers:
  - name: k8s-demo
    image: okeeffed/k8s-demo
    volumeMounts:
    - name: myvolume
      mountPath: /myvol
  # @@@ The important mounting
  volumes:
  - name: myvolume    # @@@ this is referred to above in volumeMounts
    awsElasticBlockStore:
      volumeID: vol-9835id

Demo: Volumes

Using Vagrant for kops, we can first create a volume using the above mentioned command.

After receiving a response, you can update the pod definition .yml file to attach that volumeID, then create and deploy the deployment.

  • After creation and confirmation, we can get the pod name with kubectl get pod, attach with kubectl exec helloworld-deployment-923id -i -t -- bash (-c flag to specify a container if needed), and then run ls -ahl /myvol/ to check for the volume.
  • If we run echo 'test' > /myvol/myvol.txt and echo 'test 2' > /test.txt, we know that the latter file will not persist if the pod is restarted/rescheduled.
  • If we run kubectl drain ip --force we can drain the node. Assuming this is a Replication Controller or Deployment, another pod should spin up on a different node.
  • Once that pod is scheduled on another node, we can attach to it again with the exec command and confirm that /myvol/myvol.txt is still there, while /test.txt is gone since it was not saved to the volume.

If you need to remove the ebs volume, you can run aws ec2 delete-volume --volume-id vol-[id].

Volume Provisioning

The Kubernetes volume plugins have the capability to provision storage for you. The AWS plugin, for instance, can provision storage by creating the volumes in AWS before attaching them to a node.

This is done using the StorageClass object (beta at the time of the course, expected to be stable soon).

To use auto-provisioning, create the following StorageClass:

# storage.yml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: ap-southeast-1

Next, you can create a volume claim and specify the size:

# my-volume-claim.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
  annotations:
    volume.beta.kubernetes.io/storage-class: "standard"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi

Finally, if launching a pod:

# pod-helloworld.yml
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  # The containers are listed here
  containers:
  - name: myfrontend
    image: nginx
    volumeMounts:
    - name: mypd
      mountPath: '/var/www/html'
  # @@@ The important mounting
  volumes:
  - name: mypd    # @@@ this is referred to above in volumeMounts
    persistentVolumeClaim:
      claimName: myclaim    # @@@ refers to the claim from the previous definition

Demo: Using Wordpress with Volumes

We first declare a StorageClass and a PersistentVolumeClaim from yaml files.

# storage.yml
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: eu-west-1a

# PV Claim
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: db-storage
  annotations:
    volume.beta.kubernetes.io/storage-class: "standard"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi

There is also a simple ReplicationController for the WordPress database. In the spec for the mysql container, we declare where the mountPath will be.

apiVersion: v1
kind: ReplicationController
metadata:
  name: wordpress-db
spec:
  replicas: 1
  selector:
    app: wordpress-db
  template:
    metadata:
      name: wordpress-db
      labels:
        app: wordpress-db
    spec:
      containers:
      - name: mysql
        image: mysql:5.7
        args:
        - "--ignore-db-dir=lost+found"
        ports:
        - name: mysql-port
          containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: wordpress-secrets
              key: db-password
        volumeMounts:
        - mountPath: "/var/lib/mysql"
          name: mysql-storage
      volumes:
      - name: mysql-storage
        persistentVolumeClaim:
          claimName: db-storage

A makeshift Secret file holds the database password and the WordPress keys and salts:

apiVersion: v1
kind: Secret
metadata:
  name: wordpress-secrets
type: Opaque
data:
  db-password: cGFzc3dvcmQ=
  # random sha1 strings - change all these lines
  authkey: MTQ3ZDVhMTIzYmU1ZTRiMWQ1NzUyOWFlNWE2YzRjY2FhMDkyZGQ4OA==
  loggedinkey: MTQ3ZDVhMTIzYmU1ZTRiMWQ1NzUyOWFlNWE2YzRjY2FhMDkyZGQ4OQ==
  secureauthkey: MTQ3ZDVhMTIzYmU1ZTRiMWQ1NzUyOWFlNWE2YzRjY2FhMDkyZGQ5MQ==
  noncekey: MTQ3ZDVhMTIzYmU1ZTRiMWQ1NzUyOWFlNWE2YzRjY2FhMDkyZGQ5MA==
  authsalt: MTQ3ZDVhMTIzYmU1ZTRiMWQ1NzUyOWFlNWE2YzRjY2FhMDkyZGQ5Mg==
  secureauthsalt: MTQ3ZDVhMTIzYmU1ZTRiMWQ1NzUyOWFlNWE2YzRjY2FhMDkyZGQ5Mw==
  loggedinsalt: MTQ3ZDVhMTIzYmU1ZTRiMWQ1NzUyOWFlNWE2YzRjY2FhMDkyZGQ5NA==
  noncesalt: MTQ3ZDVhMTIzYmU1ZTRiMWQ1NzUyOWFlNWE2YzRjY2FhMDkyZGQ5NQ==

To open up the service for the port:

apiVersion: v1
kind: Service
metadata:
  name: wordpress-db
spec:
  ports:
  - port: 3306
    protocol: TCP
  selector:
    app: wordpress-db
  type: NodePort

Opening up the web deployment and its service:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: wordpress-deployment
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
      - name: wordpress
        image: wordpress:4-php7.0
        # uncomment to fix perm issue, see also https://github.com/kubernetes/kubernetes/issues/2630
        # command: ['bash', '-c', 'chown www-data:www-data /var/www/html/wp-content/uploads && apache2-foreground']
        ports:
        - name: http-port
          containerPort: 80
        env:
        - name: WORDPRESS_DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: wordpress-secrets
              key: db-password
        - name: WORDPRESS_AUTH_KEY
          valueFrom:
            secretKeyRef:
              name: wordpress-secrets
              key: authkey
        - name: WORDPRESS_LOGGED_IN_KEY
          valueFrom:
            secretKeyRef:
              name: wordpress-secrets
              key: loggedinkey
        - name: WORDPRESS_SECURE_AUTH_KEY
          valueFrom:
            secretKeyRef:
              name: wordpress-secrets
              key: secureauthkey
        - name: WORDPRESS_NONCE_KEY
          valueFrom:
            secretKeyRef:
              name: wordpress-secrets
              key: noncekey
        - name: WORDPRESS_AUTH_SALT
          valueFrom:
            secretKeyRef:
              name: wordpress-secrets
              key: authsalt
        - name: WORDPRESS_SECURE_AUTH_SALT
          valueFrom:
            secretKeyRef:
              name: wordpress-secrets
              key: secureauthsalt
        - name: WORDPRESS_LOGGED_IN_SALT
          valueFrom:
            secretKeyRef:
              name: wordpress-secrets
              key: loggedinsalt
        - name: WORDPRESS_NONCE_SALT
          valueFrom:
            secretKeyRef:
              name: wordpress-secrets
              key: noncesalt
        - name: WORDPRESS_DB_HOST
          value: wordpress-db
        volumeMounts:
        # shared storage for things like media
        - mountPath: /var/www/html/wp-content/uploads
          name: uploads
      volumes:
      - name: uploads
        nfs:
          server: eu-west-1a.fs-5714e89e.efs.eu-west-1.amazonaws.com
          path: /

apiVersion: v1
kind: Service
metadata:
  name: wordpress
spec:
  ports:
  - port: 80
    targetPort: http-port
    protocol: TCP
  selector:
    app: wordpress
  type: LoadBalancer

With the AWS command line, you can create a file system and mount target. For the file system, run aws efs create-file-system --creation-token and then, after grabbing the file-system-id and subnet-id, run aws efs create-mount-target --file-system-id <id> --security-groups <sg>. Ensure you update the fs id in the nfs volume above.
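
As a sketch, the two commands look roughly like this; the token, subnet and security group values are placeholders, and the file-system id must match the one in the nfs volume above:

# create the EFS file system (illustrative values)
$ aws efs create-file-system --creation-token my-wordpress-efs --region eu-west-1
# then create a mount target in the subnet your nodes run in
$ aws efs create-mount-target --file-system-id fs-5714e89e --subnet-id subnet-xxxxxxxx --security-groups sg-xxxxxxxx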

Pod Presets

Pod presets can inject information into pods at runtime.

  • Used to inject Kubernetes Resources like Secrets, ConfigMaps, Volumes and Environment variables.

Imagine you have 20 apps to deploy, all needing a specific credential. You can edit the 20 specs and add the creds, or you can create a single PodPreset object, which will inject an environment variable or config file into all matching pods.

When injecting env vars and volume mounts, the Pod Preset will apply the changes to all containers within the pod.

# PodPreset file
apiVersion: settings.k8s.io/v1alpha1
kind: PodPreset
metadata:
  name: allow-database
spec:
  selector:
    matchLabels:
      role: frontend
  env:
  - name: DB_PORT
    value: '6379'
  volumeMounts:
  - mountPath: /cache
    name: cache-volume
  volumes:
  - name: cache-volume
    emptyDir: {}

# Pod file using the PodPreset
apiVersion: v1
kind: Pod
metadata:
  name: website
  labels:
    app: website
    role: frontend
spec:
  containers:
  - name: website
    image: nginx
    ports:
    - containerPort: 80

$ kubectl create -f pod-preset.yml
$ kubectl create -f pod.yml

Stateful Sets - (formerly Pet Sets)

Stateful distributed applications - a feature since Kubernetes 1.3.

It is introduced to be able to run stateful applications that need:

  1. A stable pod hostname (instead of podname-randomstring) - pods get an index, i.e. podname-0, podname-1, etc.
  2. Multiple pods with volumes based on their ordinal number. Currently, deleting and/or scaling a PetSet down will not delete the associated volumes.

A Pet Set allows your stateful app to use DNS to find its peers. One running node of the Pet Set is called a Pet. Using Pet Sets you can, for instance, run 5 Cassandra nodes on Kubernetes named cass-1 through cass-5.

The big difference is that you don't want to connect to just any pod behind a service; you want to make sure a specific pod definitely connects to another specific pod.

A Pet Set also allows ordered startup and teardown of the pets.

There is still a lot of future work planned in this area.
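
As a minimal sketch of what this looks like today (the names, image and ports are assumptions, and current clusters use the apps/v1 StatefulSet API rather than PetSets):

# cassandra-sketch.yml (illustrative, not from the course)
apiVersion: v1
kind: Service
metadata:
  name: cassandra
spec:
  clusterIP: None          # headless service, gives each pod a stable DNS name
  selector:
    app: cassandra
  ports:
  - port: 9042
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra   # ties the pods to the headless service above
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: cassandra:3   # placeholder image
        ports:
        - containerPort: 9042

The pods come up in order as cassandra-0, cassandra-1, cassandra-2, each reachable through the headless service at a stable DNS name such as cassandra-0.cassandra.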

Daemon Sets

  • Daemon Sets ensure that every node in the Kubernetes cluster runs the same pod resource. This is useful when a certain pod needs to run on every single node (a minimal sketch follows the use cases below).
  • When a node is added to the cluster, a new pod will be started on it automatically.
  • When a node is removed, that pod will not be rescheduled on another node.

Use cases:

  1. Logging aggregators
  2. Monitoring
  3. Load Balancers/Reverse Proxies/API Gateways
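
A minimal DaemonSet sketch for something like a log collector; the name and image are placeholders, not from the course:

# daemonset-sketch.yml (illustrative)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
      - name: log-agent
        image: fluentd:latest    # placeholder image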

Resource Usage Monitoring

  • Heapster enables Container Cluster Monitoring and Performance Analysis.
  • It provides a monitoring platform for Kubernetes.
  • It's a prerequisite if you want to do pod auto-scaling in Kubernetes.
  • Heapster exports cluster metrics via REST endpoints.
  • You can use different backends with Heapster.
    • Demo uses InfluxDB, but Kafka is also possible.
  • Visualisations can be shown with Grafana.
    • Kubernetes dashboard will also show graphs once monitoring is enabled.
  • All these technologies can be started in pods
  • The yaml files can be found on the github repo of Heapster.

Since Heapster is now deprecated, you would have to use metrics-server or an alternative like Prometheus.
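
With metrics-server installed, the same resource metrics are available from the command line (illustrative commands):

# requires metrics-server (or Heapster on older clusters) to be running
$ kubectl top nodes
$ kubectl top pods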

Horizontal Pod Autoscaling

  • Kubernetes has the possibility to autoscale pods based on metrics.

  • Kubernetes can autoscale Deployment, Replication Controller or ReplicaSet.

  • In Kubernetes 1.3 scaling based on CPU usage is possible out of the box.

    • Application based metrics are also available (like queries per second or average request latency).
      • To enable this, the cluster has to be started with the env var ENABLE_CUSTOM_METRICS set to true.
  • It will periodically query the utilization for the targeted pods.

    • By default every 30 seconds; this can be changed using the --horizontal-pod-autoscaler-sync-period flag when launching the controller manager.
  • Requires the metrics system to work.

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example-autoscaler
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: hpa-example
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
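
The same autoscaler can also be created imperatively (a sketch, assuming the hpa-example deployment already exists):

# equivalent to the manifest above
$ kubectl autoscale deployment hpa-example --min=1 --max=10 --cpu-percent=50
# check the autoscaler status
$ kubectl get hpa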

Affinity/Anti-Affinity

  • The affinity/anti-affinity feature allows you to do more complex scheduling than the nodeSelector and also works on Pods.
    • The language is more expressive.
    • You can create rules that are not hard requirements but rather preferences, meaning the scheduler will still be able to schedule your pod even if the rules cannot be met.
    • You can create rules to take other pod labels into account
      • Example, you can make sure two different pods are never on the same node.
  • Kubernetes can do node affinity and pod affinity/anti-affinity.
    • Node affinity is similar to the nodeSelector.
    • Pod affinity/anti-affinity allows you to create rules how pods should be scheduled taking into account other running pods.
    • Affinity/anti-affinity mechanism is only relevant during scheduling, once a pod is running, it'll need to be recreated to apply the rules again.
  • There are currently 2 types you can use for node affinity:
    • 1) requiredDuringSchedulingIgnoredDuringExecution
    • 2) preferredDuringSchedulingIgnoredDuringExecution
  • The first one sets a hard requirement (like the nodeSelector).
    • The rules must be met before the pod can be scheduled.
  • The second type will try to enforce the rule, but it will not guarantee it.
    • Even if the rule is not met, the pod can still be scheduled, it's a soft requirement, a preference.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: <% helloworld-deployment %>
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: <% app_name %>
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: env
                operator: In
                values:
                - dev
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1    # the higher the weight, the more emphasis on the rule
            preference:
              matchExpressions:
              - key: team
                operator: In
                values:
                - engineering-project1
      containers:
      - name: k8s-demo
        image: <% image_name %>
        ports:
        - containerPort: 3000

When scheduling, Kubernetes scores every node by summing the weights of the preferred rules that match on that node.

  • E.g. two different rules with weights 1 and 5.
    • If both rules match, score 6.
    • If only rule with weight 1 matches, score 1.
  • The node that has the highest total score, that's where the pod will be scheduled on.

3.13 Interpod Affinity/Anti-Affinity

  • This allows you to influence scheduling based on the labels of other pods that are already running on the cluster.
  • Pods belong to a namespace, so the rules apply within a namespace (defaulting to the namespace of the pod).

Two types:

  1. requiredDuringSchedulingIgnoredDuringExecution
  2. preferredDuringSchedulingIgnoredDuringExecution

The required type creates rules that must be met for the pod to be scheduled; the preferred type is a "soft" type whose rules may be met but do not have to be.

A good use case for pod affinity is co-located pods.

  • Example, you have an app that uses redis as cache and you want to have the Redis pod on the same node as the app itself.
  • Another use-case is to co-locate pods within the same availability zone.
  • When writing your pod affinity and anti-affinity rules, you need to specify a topology domain, called topologyKey in the rules.
    • The topologyKey refers to a node label.
    • If the affinity rule matches, the new pod will only be scheduled on nodes that have the same topologyKey value as the currently running pod.
(Diagram: Interpod affinity and anti-affinity)

(Diagram: Zone topology)

Anti-affinity

You can use anti-affinity to make sure a pod is scheduled only once on each node.

  • Example: you have 3 nodes and you want to schedule 2 pods, but they shouldn't land on the same node.
  • Pod anti-affinity allows you to create a rule that says not to schedule on the same host if a pod with a matching label is already running there.
(Diagram: Anti-affinity)

Topology operators

  • In
  • NotIn
  • Exists
  • DoesNotExist

Affinity rules require a substantial amount of processing by the scheduler. Take this into account if you have a lot of rules.

# pod-affinity.yml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: pod-affinity-1
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: pod-affinity-1
    spec:
      containers:
      - name: k8s-demo
        image: wardviaene/k8s-demo
        ports:
        - name: nodejs-port
          containerPort: 3000
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: pod-affinity-2
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: pod-affinity-2
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - pod-affinity-1
            topologyKey: "kubernetes.io/hostname"    # this could be changed for zoning
      containers:
      - name: redis
        image: redis
        ports:
        - name: redis-port
          containerPort: 6379

We can then check this is fine by running kubectl get pod -o wide to see the Node the pods are running on.

As for anti-affinity:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: pod-affinity-1
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: pod-affinity-1
    spec:
      containers:
      - name: k8s-demo
        image: wardviaene/k8s-demo
        ports:
        - name: nodejs-port
          containerPort: 3000
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: pod-affinity-2
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: pod-affinity-2
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - pod-affinity-1
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: redis
        image: redis
        ports:
        - name: redis-port
          containerPort: 6379
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: pod-affinity-3
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: pod-affinity-3
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - pod-affinity-1
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: k8s-demo
        image: wardviaene/k8s-demo
        ports:
        - name: nodejs-port
          containerPort: 3000
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: pod-affinity-4
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: pod-affinity-4
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - pod-affinity-1
                - pod-affinity-3
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: k8s-demo
        image: wardviaene/k8s-demo
        ports:
        - name: nodejs-port
          containerPort: 3000

(Diagram: Resulting run with the affinity/anti-affinity rules)

Note the difference between preferred and required: with preferred, the pod may still be scheduled in situations we don't necessarily want, since the rule is only a best-effort preference.

3.14 Taints and Tolerations

Taints and tolerations are the opposite of node affinity.

  • They allow a node to repel a set of pods.
  • Taints mark a node, tolerations are applied to pods to influence the scheduling of a pod.
  • One use case for taints is to make sure that new pods are not scheduled on the master (node-role.kubernetes.io/master:NoSchedule). This taint is set by default.

# To add a taint
$ kubectl taint nodes node1 key=value:NoSchedule
# This makes sure that no pods are scheduled on node1 as long as they don't have a matching toleration

# tolerations.yml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tolerations-1
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: tolerations-1
    spec:
      containers:
      - name: k8s-demo
        image: wardviaene/k8s-demo
        ports:
        - name: nodejs-port
          containerPort: 3000
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tolerations-2
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: tolerations-2
    spec:
      tolerations:
      - key: "type"
        operator: "Equal"
        value: "specialnode"
        effect: "NoSchedule"
      containers:
      - name: k8s-demo
        image: wardviaene/k8s-demo
        ports:
        - name: nodejs-port
          containerPort: 3000

Tolerations usage

# Taint a node
$ kubectl taint nodes NODE-NAME type=specialnode:NoSchedule
# Taint with NoExecute
$ kubectl taint nodes NODE-NAME testkey=testvalue:NoExecute

Keys

  • Operators
    • Equal (providing key + value)
    • Exists (only providing a key, checking only whether a key exists)
  • Effects
    • NoSchedule (hard requirement that a pod will not be scheduled unless there is a matching toleration)
    • PreferNoSchedule (avoid placing a pod that doesn't have a matching toleration, but it's not a hard requirement)
    • NoExecute (evict already-running pods with non-matching tolerations)
      • A tolerationSeconds key can be added to specify how long a pod may keep running on the tainted node before it is evicted (see the snippet after this list).
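
A hedged snippet of a toleration using tolerationSeconds (the key and value here are illustrative):

# pod spec fragment (illustrative)
tolerations:
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 6000    # pod may keep running for 6000s after the taint appears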

Use Cases

  • Existing node taints for master nodes.
  • Taint nodes that are dedicated for a team or user.
  • For nodes with specific hardware (e.g. GPUs), you can taint them to avoid running non-specific applications on those nodes.
  • Tainting nodes by condition is an alpha (soon-to-be beta) feature.

Useful Taints and Tolerations

  • node.kubernetes.io/not-ready
  • node.kubernetes.io/unreachable
  • node.kubernetes.io/out-of-disk
  • node.kubernetes.io/memory-pressure
  • node.kubernetes.io/disk-pressure
  • node.kubernetes.io/network-unavailable
  • node.kubernetes.io/unschedulable

3.15 Custom Resource Definitions (CRDs)

  • Lets you extend the Kubernetes API (a minimal sketch follows this list).
  • Resources are the endpoints in the Kubernetes API that store collections of API objects (ie Deployment, LoadBalancer).
  • Operators use CRDs to extend the Kubernetes API with their own functionality.
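
A minimal CustomResourceDefinition sketch; the group, names and version below are made up for illustration:

# crd-sketch.yml (illustrative)
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: crontabs.stable.example.com   # must be <plural>.<group>
spec:
  group: stable.example.com
  version: v1
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab

Once created, objects of kind CronTab can be managed with the usual kubectl verbs, just like built-in resources.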

3.16 Operators

An Operator is a method of packaging, deploying and managing a Kubernetes Application.

It puts operational knowledge in an application.

  • Brings the user closer to the experience of managed cloud services, rather than having to know all the specifics of an application deployed to Kubernetes.
  • Once an Operator is deployed, it can be managed using Custom Resource Definitions (arbitrary types that extend the Kubernetes API).
  • It also provides a great way to deploy Stateful applications to Kubernetes.
  • There are operators for Prometheus, Vault, Rook (storage), MySQL, PostgreSQL and so on.

PostgreSQL Operator Demo

If you just deploy a PostgreSQL container, it'll only start the database. If you use the operator instead, it'll also allow you to create replicas, initiate a failover, create backups, and scale.

  • An operator contains a lot of the management logic that you as an administrator or user might want, rather than having to implement it yourself.

Intro to kubeadm

kubeadm is an alternative way of running Kubernetes that does not use kops.
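
As a rough sketch of the workflow (the exact flags, token and hash come from your own cluster and network plugin choice):

# on the master node
$ kubeadm init
# on each worker node, using the token printed by kubeadm init
$ kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>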