Make Your Data FABulous

A presentation at Infiniteconf in July 2019 in London, UK by Philipp Krenn

Slide 1

Slide 1

Make Your Data FABulous Philipp Krenn @xeraa

Slide 2

Slide 2

Developer

Slide 3

Slide 3

What is the perfect datastore solution?

Slide 4

Slide 4

It depends…

Slide 5

Slide 5

Pick your tradeoffs

Slide 6

Slide 6

Slide 7

Slide 7

CAP Theorem

Slide 8

Slide 8

Slide 9

Slide 9

Consistent “[…] a total order on all operations such that each operation looks as if it were completed at a single instant.”

Slide 10

Slide 10

Available “[…] every request received by a nonfailing node in the system must result in a response.”

Slide 11

Slide 11

Partition Tolerant “[…] the network will be allowed to lose arbitrarily many messages sent from one node to another.”

Slide 12

Slide 12

https://berb.github.io/diploma-thesis/original/061_challenge.html

Slide 13

Slide 13

Misconceptions Partition Tolerance is not a choice in a distributed system

Slide 14

Slide 14

Misconceptions Consistency in ACID is a predicate Consistency in CAP is a linear order

Slide 15

Slide 15

Robinson Crusoe

Slide 16

Slide 16

Slide 17

Slide 17

/dev/null breaks CAP: effect of write are always consistent, it’s always available, and all replicas are consistent even during partitions. — https://twitter.com/ashic/status/591511683987701760

Slide 18

Slide 18

FAB Theory

Slide 19

Slide 19

Mark Harwood

Slide 20

Slide 20

Fast Near real-time instead of batch processing

Slide 21

Slide 21

Accurate Exact instead of approximate results

Slide 22

Slide 22

Big Parallelization needed to handle the data

Slide 23

Slide 23

Say Big Data one more time

Slide 24

Slide 24

Fast Big Accurate

Slide 25

Slide 25

Slide 26

Slide 26

Shard Unit of scale

Slide 27

Slide 27

Slide 28

Slide 28

“The evil wizard Mondain had attempted to gain control over Sosaria by trapping its essence in a crystal. When the Stranger at the end of Ultima I defeated Mondain and shattered the crystal, the crystal shards each held a refracted copy of Sosaria. http://www.raphkoster.com/2009/01/08/database-shardingcame-from-uo/

Slide 29

Slide 29

Terms Aggregation

Slide 30

Slide 30

Word Count Word Count Luke 64 Droid 13 R2 31 3PO 13 Alderaan 20 Princess 12 Kenobi 19 Ben 11 Obi-Wan 18 Vader 11 Droids 16 Han 10 Blast 15 Jedi 10 Imperial 15 Sandpeople 10

Slide 31

Slide 31

PUT starwars { “settings”: { “number_of_shards”: 5, “number_of_replicas”: 0 } }

Slide 32

Slide 32

{ “index” : { “_index” { “word” : “Luke” } { “index” : { “_index” { “word” : “Luke” } { “index” : { “_index” { “word” : “Luke” } { “index” : { “_index” { “word” : “Luke” } … : “starwars”, “_type” : “_doc”, “routing”: “0” } } : “starwars”, “_type” : “_doc”, “routing”: “1” } } : “starwars”, “_type” : “_doc”, “routing”: “2” } } : “starwars”, “_type” : “_doc”, “routing”: “3” } }

Slide 33

Slide 33

Slide 34

Slide 34

GET starwars/_search { “query”: { “match”: { “word”: “Luke” } } }

Slide 35

Slide 35

{ “took”: 6, “timed_out”: false, “_shards”: { “total”: 5, “successful”: 5, “skipped”: 0, “failed”: 0 }, “hits”: { “total”: 64, “max_score”: 3.2049506, “hits”: [ { “_index”: “starwars”, “_type”: “_doc”, “_id”: “0vVdy2IBkmPuaFRg659y”, “_score”: 3.2049506, “_routing”: “1”, “_source”: { “word”: “Luke” } }, …

Slide 36

Slide 36

GET starwars/_search { “aggs”: { “most_common”: { “terms”: { “field”: “word.keyword”, “size”: 1 } } }, “size”: 0 }

Slide 37

Slide 37

{ } “took”: 13, “timed_out”: false, “_shards”: { “total”: 5, “successful”: 5, “skipped”: 0, “failed”: 0 }, “hits”: { “total”: 288, “max_score”: 0, “hits”: [] }, “aggregations”: { “most_common”: { “doc_count_error_upper_bound”: 10, “sum_other_doc_count”: 232, “buckets”: [ { “key”: “Luke”, “doc_count”: 56 } ] } }

Slide 38

Slide 38

Slide 39

Slide 39

{ “index” : { “_index” { “word” : “Luke” } { “index” : { “_index” { “word” : “Luke” } { “index” : { “_index” { “word” : “Luke” } … { “index” : { “_index” { “word” : “Luke” } { “index” : { “_index” { “word” : “Luke” } { “index” : { “_index” { “word” : “Luke” } { “index” : { “_index” { “word” : “Luke” } … : “starwars”, “_type” : “_doc”, “routing”: “0” } } : “starwars”, “_type” : “_doc”, “routing”: “1” } } : “starwars”, “_type” : “_doc”, “routing”: “2” } } : “starwars”, “_type” : “_doc”, “routing”: “8” } } : “starwars”, “_type” : “_doc”, “routing”: “9” } } : “starwars”, “_type” : “_doc”, “routing”: “0” } } : “starwars”, “_type” : “_doc”, “routing”: “0” } }

Slide 40

Slide 40

Routing shard# = hash(_routing) % #primary_shards

Slide 41

Slide 41

GET _cat/shards?index=starwars&v index starwars starwars starwars starwars starwars shard 3 4 2 1 0 prirep p p p p p state docs store ip node STARTED 58 6.4kb 172.19.0.2 Q88C3vO STARTED 26 5.2kb 172.19.0.2 Q88C3vO STARTED 71 6.9kb 172.19.0.2 Q88C3vO STARTED 63 6.6kb 172.19.0.2 Q88C3vO STARTED 70 6.7kb 172.19.0.2 Q88C3vO

Slide 42

Slide 42

(Sub) Results Per Shard shard_size = (size * 1.5 + 10)

Slide 43

Slide 43

How Many? Results per shard Results for aggregation

Slide 44

Slide 44

“doc_count_error_upper_bound”: 10 “sum_other_doc_count”: 232

Slide 45

Slide 45

GET starwars/_search { “aggs”: { “most_common”: { “terms”: { “field”: “word.keyword”, “size”: 1, “show_term_doc_count_error”: true } } }, “size”: 0 }

Slide 46

Slide 46

“aggregations”: { “most_common”: { “doc_count_error_upper_bound”: 10, “sum_other_doc_count”: 232, “buckets”: [ { “key”: “Luke”, “doc_count”: 56, “doc_count_error_upper_bound”: 9 } ] } }

Slide 47

Slide 47

GET starwars/_search { “aggs”: { “most_common”: { “terms”: { “field”: “word.keyword”, “size”: 1, “shard_size”: 20, “show_term_doc_count_error”: true } } }, “size”: 0 }

Slide 48

Slide 48

“aggregations”: { “most_common”: { “doc_count_error_upper_bound”: 0, “sum_other_doc_count”: 224, “buckets”: [ { “key”: “Luke”, “doc_count”: 64, “doc_count_error_upper_bound”: 0 } ] } }

Slide 49

Slide 49

Cardinality Aggregation

Slide 50

Slide 50

Naive Implementation: HashSet HashSet noDuplicates = new HashSet(); noDuplicates.add(“Luke”); noDuplicates.add(“R2”); noDuplicates.add(“Luke”); // … noDuplicates.size();

Slide 51

Slide 51

Simple Estimator: Even distribution 0 – 1 hash(“Luke”) hash(“R2”) hash(“Jedi”) hash(“Luke”) -> -> -> -> 0.44 0.71 0.07 0.44 Estimated cardinality:

Slide 52

Slide 52

Probabilistic Counting: Leading 0 hash(value) -> … … … … … … … … 0 0 0 0 1 1 1 1 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1 Probability or generally

Slide 53

Slide 53

LogLog: Probabilistic Averaging

Slide 54

Slide 54

Slide 55

Slide 55

LogLog: Bucketing for Averages 4 bit bucket, rest for cardinality per bucket hash(“Luke”) -> 0100 101001000 -> [4]: 3 hash(“R2”) -> 1001 001010000 -> [9]: 4 hash(“Jedi”) -> 0000 101110010 -> [0]: 1

Slide 56

Slide 56

Slide 57

Slide 57

Slide 58

Slide 58

Slide 59

Slide 59

GET starwars/_search { “aggs”: { “type_count”: { “cardinality”: { “field”: “word.keyword”, “precision_threshold”: 10 } } }, “size”: 0 }

Slide 60

Slide 60

{ } “took”: 3, “timed_out”: false, “_shards”: { “total”: 5, “successful”: 5, “skipped”: 0, “failed”: 0 }, “hits”: { “total”: 288, “max_score”: 0, “hits”: [] }, “aggregations”: { “type_count”: { “value”: 17 } }

Slide 61

Slide 61

precision_threshold Default 3,000 Maximum 40,000

Slide 62

Slide 62

Memory precision_threshold x 8 bytes

Slide 63

Slide 63

Slide 64

Slide 64

GET starwars/_search { “aggs”: { “type_count”: { “cardinality”: { “field”: “word.keyword”, “precision_threshold”: 12 } } }, “size”: 0 }

Slide 65

Slide 65

{ } “took”: 12, “timed_out”: false, “_shards”: { “total”: 5, “successful”: 5, “skipped”: 0, “failed”: 0 }, “hits”: { “total”: 288, “max_score”: 0, “hits”: [] }, “aggregations”: { “type_count”: { “value”: 16 } }

Slide 66

Slide 66

Precompute Hashes? Client or mapper-murmur3 plugin

Slide 67

Slide 67

It Depends ! large / high-cardinality fields ! low cardinality / numeric fields

Slide 68

Slide 68

Improvement: LogLog-β https://github.com/elastic/elasticsearch/ pull/22323

Slide 69

Slide 69

Improvement? “New cardinality estimation algorithms for HyperLogLog sketches” https://arxiv.org/abs/1702.01284

Slide 70

Slide 70

Inverse Document Frequency

Slide 71

Slide 71

GET starwars/_search { “query”: { “match”: { “word”: “Luke” } } }

Slide 72

Slide 72

… { “_index”: “starwars”, “_type”: “_doc”, “_id”: “0vVdy2IBkmPuaFRg659y”, “_score”: 3.2049506, “_routing”: “1”, “_source”: { “word”: “Luke” } }, { “_index”: “starwars”, “_type”: “_doc”, “_id”: “2PVdy2IBkmPuaFRg659y”, “_score”: 3.2049506, “_routing”: “7”, “_source”: { “word”: “Luke” } }, { “_index”: “starwars”, “_type”: “_doc”, “_id”: “0_Vdy2IBkmPuaFRg659y”, “_score”: 3.1994843, “_routing”: “2”, “_source”: { “word”: “Luke” } }, …

Slide 73

Slide 73

Slide 74

Slide 74

Term Frequency / Inverse Document Frequency (TF/IDF)

Slide 75

Slide 75

BM25 Default in Elasticsearch 5.0

Slide 76

Slide 76

Term Frequency

Slide 77

Slide 77

Slide 78

Slide 78

Inverse Document Frequency

Slide 79

Slide 79

Slide 80

Slide 80

Field-Length Norm

Slide 81

Slide 81

Query Then Fetch

Slide 82

Slide 82

Query

Slide 83

Slide 83

Fetch

Slide 84

Slide 84

DFS Query Then Fetch Distributed Frequency Search

Slide 85

Slide 85

GET starwars/_search?search_type=dfs_query_then_fetch { “query”: { “match”: { “word”: “Luke” } } }

Slide 86

Slide 86

{ “_index”: “starwars”, “_type”: “_doc”, “_id”: “0fVdy2IBkmPuaFRg659y”, “_score”: 1.5367417, “_routing”: “0”, “_source”: { “word”: “Luke” } }, { “_index”: “starwars”, “_type”: “_doc”, “_id”: “2_Vdy2IBkmPuaFRg659y”, “_score”: 1.5367417, “_routing”: “0”, “_source”: { “word”: “Luke” } }, { “_index”: “starwars”, “_type”: “_doc”, “_id”: “3PVdy2IBkmPuaFRg659y”, “_score”: 1.5367417, “_routing”: “0”, “_source”: { “word”: “Luke” } }, …

Slide 87

Slide 87

Slide 88

Slide 88

Slide 89

Slide 89

Don’t use dfs_query_then_fetch in production. It really isn’t required. — https://www.elastic.co/guide/en/elasticsearch/ guide/current/relevance-is-broken.html

Slide 90

Slide 90

Single Shard Default in 7.0

Slide 91

Slide 91

Simon Says Use a single shard until it blows up

Slide 92

Slide 92

PUT starwars/_settings { “settings”: { “index.blocks.write”: true } }

Slide 93

Slide 93

POST starwars/_shrink/starletwars?copy_settings=true { “settings”: { “number_of_shards”: 1, “number_of_replicas”: 0 } }

Slide 94

Slide 94

GET starletwars/_search { “query”: { “match”: { “word”: “Luke” } }, “_source”: false }

Slide 95

Slide 95

{ “_index”: “starletwars”, “_type”: “_doc”, “_id”: “0fVdy2IBkmPuaFRg659y”, “_score”: 1.5367417, “_routing”: “0” }, { “_index”: “starletwars”, “_type”: “_doc”, “_id”: “2_Vdy2IBkmPuaFRg659y”, “_score”: 1.5367417, “_routing”: “0” }, { }, “_index”: “starletwars”, “_type”: “_doc”, “_id”: “3PVdy2IBkmPuaFRg659y”, “_score”: 1.5367417, “_routing”: “0”

Slide 96

Slide 96

GET starletwars/_search { “aggs”: { “most_common”: { “terms”: { “field”: “word.keyword”, “size”: 1 } } }, “size”: 0 }

Slide 97

Slide 97

{ } “took”: 1, “timed_out”: false, “_shards”: { “total”: 1, “successful”: 1, “skipped”: 0, “failed”: 0 }, “hits”: { “total”: 288, “max_score”: 0, “hits”: [] }, “aggregations”: { “most_common”: { “doc_count_error_upper_bound”: 0, “sum_other_doc_count”: 224, “buckets”: [ { “key”: “Luke”, “doc_count”: 64 } ] } }

Slide 98

Slide 98

Change for the Cardinality Count?

Slide 99

Slide 99

Slide 100

Slide 100

Conclusion

Slide 101

Slide 101

Tradeoffs…

Slide 102

Slide 102

Consistent Available Partition Tolerant Fast Accurate Big

Slide 103

Slide 103

Questions? Philipp Krenn PS: Stickers @xeraa