Reaching Zen in Elasticsearch’s Cluster Coordination

A presentation by Philipp Krenn at Berlin Buzzwords 2019 (June 2019, Berlin, Germany)

Slide 1

Reaching Zen in Elasticsearch’s Cluster Coordination Philipp Krenn @xeraa

Slide 2

Slide 3

Developer

Slide 4

Demo
https://github.com/xeraa/elastic-docker/tree/master/rolling_upgrade

elasticsearch1:
  image: docker.elastic.co/elasticsearch/elasticsearch:$ELASTIC_VERSION
  environment:
    - node.name=elasticsearch1
    - ES_JAVA_OPTS=-Xms512m -Xmx512m
    - discovery.zen.ping.unicast.hosts=elasticsearch2,elasticsearch3
    - discovery.zen.minimum_master_nodes=2
    #- discovery.seed_hosts=elasticsearch2,elasticsearch3
    #- cluster.initial_master_nodes=elasticsearch1,elasticsearch2,elasticsearch3
  volumes:
    - esdata_upgrade1:/usr/share/elasticsearch/data
  ports:
    - 9201:9200
  networks:
    - esnet

Slide 5

Cluster Coordination?

Slide 6

Cluster State?

Slide 7

Cluster Metadata
Cluster Settings
Index Metadata
Lots more

Slide 8

GET _cluster/state
Only move forward
Do not lose data

Slide 9

{
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "n0Hcm7Q3R5yMN5z1PoG6UQ",
  "version" : 29,
  "state_uuid" : "Of1zG0noRaGgIfYw_w58MA",
  "master_node" : "P9UHiA-YSkesOfR7-G50_Q",
  "blocks" : { },
  "nodes" : {
    "P9UHiA-YSkesOfR7-G50_Q" : {
      "name" : "elasticsearch3",
      "ephemeral_id" : "MdWyvnTfRCuhzD9ftWt0Dw",
      "transport_address" : "172.21.0.3:9300",
      "attributes" : {
…
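As a small illustration of reading this response, the sketch below resolves the `master_node` ID to a node name. It is only a sketch: `SAMPLE_STATE` is an abbreviated copy of the fields visible on the slide (the truncated parts are left out), and `master_name` is a hypothetical helper, not part of any Elasticsearch client.

```python
import json

# Abbreviated cluster state: only fields visible on the slide are included;
# everything the slide truncates is simply omitted here.
SAMPLE_STATE = """
{
  "cluster_name": "docker-cluster",
  "version": 29,
  "master_node": "P9UHiA-YSkesOfR7-G50_Q",
  "nodes": {
    "P9UHiA-YSkesOfR7-G50_Q": {"name": "elasticsearch3"}
  }
}
"""


def master_name(state: dict) -> str:
    """Resolve the elected master's node ID to its human-readable name."""
    master_id = state["master_node"]
    return state["nodes"][master_id]["name"]


state = json.loads(SAMPLE_STATE)
print(state["version"], master_name(state))
```

The `version` field is the counter that should only ever move forward as new cluster states are published.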

Slide 10

Main Components
Discovery
Master Election
Cluster State Publication

Slide 11

Zen
Zen to Zen2
Not pluggable

Slide 12

Slide 13

Why
https://www.elastic.co/guide/en/elasticsearch/resiliency/current/index.html
Repeated network partitions can cause cluster state updates to be lost (STATUS: DONE, v7.0.0)
And more

Slide 14

How
https://github.com/elastic/elasticsearch-formal-models
TLA+ specification
TLC model checking

Slide 15

https://github.com/elastic/elasticsearch-formal-models/blob/master/cluster/isabelle/Preliminaries.thy

text \<open>It works correctly on finite and nonempty sets as follows:\<close>

theorem
  fixes S :: "Term set"
  assumes finite: "finite S"
  shows maxTerm_mem: "S \<noteq> {} \<Longrightarrow> maxTerm S \<in> S"
    and maxTerm_max: "\<And> t'. t' \<in> S \<Longrightarrow> t' \<le> maxTerm S"
proof
  presume "S \<noteq> {}"
  with assms
  obtain t where t: "t \<in> S" "\<And> t'. t' \<in> S \<Longrightarrow> t' \<le> t"
  proof (induct arbitrary: thesis)
    case empty
    then show ?case by simp
…

Slide 16

Discovery
Where are master-eligible nodes?
Is there a master already?

Slide 17

Slide 18

Settings
Static: discovery.zen.ping.unicast.hosts → discovery.seed_hosts
Dynamic (file, EC2, GCE, …): discovery.zen.hosts_provider → discovery.seed_providers
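The two renames above can be applied mechanically to an existing settings map. The following is a minimal sketch, assuming the settings are held as a flat Python dict; `RENAMED_SETTINGS` and `migrate_settings` are hypothetical names for illustration, and only the two renames listed on the slide are covered.

```python
# Only the two renames named on the slide; other 6.x settings are out of scope.
RENAMED_SETTINGS = {
    "discovery.zen.ping.unicast.hosts": "discovery.seed_hosts",
    "discovery.zen.hosts_provider": "discovery.seed_providers",
}


def migrate_settings(settings: dict) -> dict:
    """Return a copy of `settings` with the old discovery keys renamed.

    Keys that are not in RENAMED_SETTINGS are copied through unchanged.
    """
    return {RENAMED_SETTINGS.get(key, key): value for key, value in settings.items()}


old = {
    "node.name": "elasticsearch1",
    "discovery.zen.ping.unicast.hosts": "elasticsearch2,elasticsearch3",
}
print(migrate_settings(old))
```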

Slide 19

Master Election
Agree which node should be master
Form a cluster

Slide 20

Slide 21

discovery.zen.minimum_master_nodes
Trust users?
Scaling up or down?

Slide 22

Three Node Cluster

Slide 23

discovery.zen.minimum_master_nodes: ~

Slide 24

discovery.zen.minimum_master_nodes: 2

Slide 25

discovery.zen.minimum_master_nodes: 2

Slide 26

discovery.zen.minimum_master_nodes: 2

Slide 27

discovery.zen.minimum_master_nodes: 2

Slide 28

discovery.zen.minimum_master_nodes: 2

Slide 29

discovery.zen.minimum_master_nodes: 2
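The value 2 for the three-node cluster is a strict majority of the master-eligible nodes, (n / 2) + 1 with integer division. A minimal sketch of that arithmetic follows; the function name `minimum_master_nodes` is a hypothetical helper chosen to mirror the setting.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Smallest strict majority of `master_eligible` nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible // 2 + 1


for n in (1, 2, 3, 5):
    print(n, minimum_master_nodes(n))
```

With Zen, this number had to be kept correct by hand whenever master-eligible nodes were added or removed, which is the "Trust users?" and "Scaling up or down?" concern.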

Slide 30

cluster.initial_master_nodes
List of node names for the very first election

Slide 31

OK to set on multiple nodes as long as they are all consistent

Slide 32

Ignored once a node has joined a cluster, even if restarted

Slide 33

Unnecessary when joining a new node to an existing cluster

Slide 34

Upgrade 6 to 7
Full cluster restart: Set cluster.initial_master_nodes
Rolling upgrade: cluster.initial_master_nodes not required

Slide 35

Demo
Upgrade 6.7 → 7.0, 6.8 → 7.1+

Slide 36

Demo
Full Cluster Restart
docker stop <ID> on all nodes
docker start <ID> on all nodes

Slide 37

Cluster Scaling
Master-ineligible: as before
Adding master-eligible: just do it
Removing master-eligible: just do it
As long as you remove fewer than half of them at once
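The "fewer than half at once" rule can be written as a one-line check. This is a sketch of the rule as stated on the slide, not of Elasticsearch's internal logic; `safe_to_remove_at_once` is a hypothetical helper name.

```python
def safe_to_remove_at_once(master_eligible: int, removing: int) -> bool:
    """True if `removing` is strictly fewer than half of `master_eligible`."""
    if master_eligible < 1 or removing < 0:
        raise ValueError("counts must be positive (nodes) and non-negative (removals)")
    # Compare 2 * removing against the total to avoid fractional halves.
    return removing * 2 < master_eligible


print(safe_to_remove_at_once(3, 1))  # one of three nodes
print(safe_to_remove_at_once(3, 2))  # two of three nodes
```

Removing exactly half (for example 2 of 4) does not satisfy the rule, since the remaining nodes would not form a majority of the previous set.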

Slide 38

Demo
Scale down to a single node
POST /_cluster/voting_config_exclusions/elasticsearch1
POST /_cluster/voting_config_exclusions/elasticsearch2
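For scripting the two calls above, a request object can be built with Python's standard library. This is a minimal sketch: the base URL `http://localhost:9201` is an assumption taken from the `9201:9200` port mapping in the demo's compose file, `exclusion_request` is a hypothetical helper, and the path uses the node-name form shown on the slide. The request is only constructed here, not sent.

```python
from urllib.request import Request


def exclusion_request(base_url: str, node_name: str) -> Request:
    """Build (but do not send) the POST that excludes `node_name` from voting."""
    url = f"{base_url.rstrip('/')}/_cluster/voting_config_exclusions/{node_name}"
    return Request(url, method="POST")


# Assumed base URL: host port 9201 maps to elasticsearch1 in the demo setup.
req = exclusion_request("http://localhost:9201", "elasticsearch1")
print(req.get_method(), req.full_url)
```

Passing the object to `urllib.request.urlopen(req)` would send it against a running cluster.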

Slide 39

Demo
Cluster Rebuild
Empty cluster.initial_master_nodes

Slide 40

Log
elasticsearch2 | {"type": "server", "timestamp": "2019-05-24T14:02:51,173+0000", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "docker-cluster", "node.name": "elasticsearch2", "message":

Slide 41

"master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [
{elasticsearch1}{pSUJ60tSRWSrcWkRevLfyA}{_jIaabgyTQOHAOjcwUruIQ}{192.168.112.3}{192.168.112.3:9300}{…},
{elasticsearch3}{ngaTCze8QHSHydCXsttXyw}{mbIad-A4SLOJvP7Ava5dEw}{192.168.112.4}{192.168.112.4:9300}{…}
];

Slide 42

discovery will continue using [192.168.112.3:9300, 192.168.112.4:9300] from hosts providers and [
{elasticsearch2}{iANt64LESxqjJv8tHV5KKw}{K0bYEuQ2TnamsiOefTUXgQ}{192.168.112.2}{192.168.112.2:9300}{…}
] from last-known cluster state; node term 0, last-accepted version 0 in term 0"

Slide 43

Cluster State Publication
Agree on cluster state updates
Broadcast updates to all nodes

Slide 44

Slide 45

Slide 46

Conclusion

Slide 47

Zen to Zen2
Faster, safer, more debuggable

Slide 48

Tonight: Elasticsearch Meetup @Camunda
https://www.meetup.com/Elasticsearch-Berlin/

Slide 49

Reaching Zen in Elasticsearch’s Cluster Coordination Philipp Krenn @xeraa