Elastic Stack Workshop: Search and Beyond (Workshop)

A presentation at CrunchConf in October 2018 in Budapest, Hungary by Philipp Krenn

Slide 1

Search and Beyond Philipp Krenn @xeraa

Slide 2

Developer

Slide 3

This is not a Training https://training.elastic.co

Slide 4

Agenda
09:00 - 10:40: Intro & Architecture & Search
10:40 - 11:00: Coffee break
11:00 - 12:20: More Search
12:20 - 13:05: Lunch
13:05 - 15:00: Monitoring
15:00 - 15:20: Coffee break
15:20 - 17:00: More Monitoring & Q&A

Slide 5

Elastic Stack Architecture

Slide 6

Slide 7

$ curl http://localhost:9200 { "name" : "elasticsearch-hot", "cluster_name" : "metrics-cluster", "cluster_uuid" : "06nHPLLgTrmZEpYli6JW5w", "version" : { "number" : "6.5.0", "build_flavor" : "default", "build_type" : "tar", "build_hash" : "c53b7d3", "build_date" : "2018-11-08T21:28:50.577384Z", "build_snapshot" : false, "lucene_version" : "7.5.0", "minimum_wire_compatibility_version" : "5.6.0", "minimum_index_compatibility_version" : "5.0.0" }, "tagline" : "You Know, for Search" }

Slide 8

https://db-engines.com/en/ranking

Slide 9

Slide 10

Slide 11

Slide 12

Slide 13

Slide 14

Slide 15

Slide 16

Slide 17

Slide 18

Slide 19

Only accept features that scale. — https://github.com/elastic/engineering/blob/master/development_constitution.md

Slide 20

Horizontal Scaling Shards Replication Writes & Reads

Slide 21

Slide 22

Exhibit A: A JSON Document { "name": "Elasticsearch", "author": "Shay Banon", "stable_version": "6.5.0", "preview_version": "7.0.0-alpha1" }

Slide 23

Exhibit B: A cURL Command $ curl -XPOST -i localhost:9200/databases/nosql -H 'Content-Type: application/json' -d ' { "name": "Elasticsearch", "author": "Shay Banon", "stable_version": "6.5.0", "preview_version": "7.0.0-alpha1" }'

Slide 24

Exhibit B: A cURL Command HTTP/1.1 201 Created Location: /databases/nosql/AVfD8XQaeuK3k1LGtT8- content-type: application/json; charset=UTF-8 content-length: 162 { "_index":"databases", "_type":"nosql", "_id":"AVfD8XQaeuK3k1LGtT8-", "_version":1, "result":"created", "_shards": { "total":2, "successful":1, "failed":0 }, "created":true }

Slide 25

Exhibit C: A Console Command POST /databases/nosql { "name": "Elasticsearch", "author": "Shay Banon", "stable_version": "6.4.2", "preview_version": "7.0.0-alpha2" }

Slide 26

Exhibit C: A Console Command { "_index": "databases", "_type": "nosql", "_id": "AVfD6ukyeuK3k1LGtSwT", "_version": 1, "result": "created", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "created": true }

Slide 27

Slide 28

Slide 29

Slide 30

Single node $ cd elasticsearch-<version> $ ./bin/elasticsearch

Slide 31

Slide 32

Cluster No more broadcasting

Slide 33

Slide 34

Node Types

Slide 35

Master-eligible node Default by default

Slide 36

Data node Default by default

Slide 37

Client node Smart load balancer

Slide 38

Ingest node Parse and enrich

Slide 39

ML node Machine learning

Slide 40

Discovery

Slide 41

Zen discovery

Slide 42

Peer to peer network Unicast Who to contact Ping Discover each other

Slide 43

One master node Elected or joined to

Slide 44

Three dedicated master nodes for production

Slide 45

Healthcheck Master pings all nodes, they report back

Slide 46

discovery.zen.no_master_block write | all

Slide 47

discovery.zen.minimum_master_nodes

Slide 48

Split brain
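
The usual guard against split brain with Zen discovery is to set discovery.zen.minimum_master_nodes to a majority of the master-eligible nodes. A minimal sketch of that arithmetic follows; the function name is made up for illustration and is not part of any Elasticsearch API.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Majority (quorum) of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


for n in (1, 2, 3, 5):
    print(f"{n} master-eligible node(s) -> quorum {minimum_master_nodes(n)}")
```

With three dedicated master nodes the quorum is 2, so one of them may fail. With two, the quorum is also 2, so losing either one blocks master election.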

Slide 49

GET /_cluster/health green yellow red

Slide 50

Alternative discovery Azure, EC2, GCE

Slide 51

Indexing a Document

Slide 52

Slide 53

Slide 54

Document Unique combination: _index _type _id PS: Types will be removed

Slide 55

POST /databases/nosql vs PUT /databases/nosql/elasticsearch

Slide 56

Autogenerated ID 20 characters, URL-safe, Base64-encoded, GUID strings AVfD6ukyeuK3k1LGtSwT

Slide 57

Consistent hashing Before 2.0: djb2 Ignoring _routing

Slide 58

unsigned long hash(unsigned char *str) { unsigned long hash = 5381; int c; while (c = *str++) hash = ((hash << 5) + hash) + c; /* hash * 33 + c */ return hash; }

Slide 59

Consistent hashing Current default: murmur3 https://github.com/elastic/elasticsearch/blob/5.4/core/src/main/java/org/elasticsearch/common/hash/MurmurHash3.java

Slide 60

Consistent hashing Better distribution 100,000 incremental IDs https://github.com/elastic/elasticsearch/pull/7954

Slide 61

Consistent hashing 3 shards murmur3 [33185, 33347, 33468] djb2 [30100, 30000, 39900]

Slide 62

Consistent hashing 5 shards murmur3 [19933, 19964, 19940, 20030, 20133] djb2 [20000, 20000, 20000, 20000, 20000]

Slide 63

Consistent hashing 33 shards murmur3 [2999, 3096, 2930, 2986, 3070, 3093, 3023, 3052, 3112, 2940, 3036, 2985, 3031, 3048, 3127, 2961, 2901, 3105, 3041, 3130, 3013, 3035, 3031, 3019, 3008, 3022, 3111, 3086, 3016, 2996, 3075, 2945, 2977] djb2 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 900, 900, 900, 900, 1000, 1000, 10000, 10000, 10000, 10000, 9100, 9100, 9100, 9100, 9000, 9000, 0, 0, 0, 0, 0, 0]

Slide 64

Shard decision shard = hash(doc_id) % (num_of_primary_shards)
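
The shard decision can be tried out in a few lines. The following is a rough sketch, assuming a plain djb2 hash kept within 64 bits and incremental numeric IDs formatted as strings. The helper names are hypothetical, and this is not Elasticsearch's routing code, so the counts will not match the slides exactly.

```python
from collections import Counter


def djb2(key: str) -> int:
    """djb2 string hash, masked to 64 bits like a C unsigned long."""
    h = 5381
    for byte in key.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFFFFFFFFFF
    return h


def shard_for(doc_id: str, num_primary_shards: int) -> int:
    """shard = hash(doc_id) % num_of_primary_shards"""
    return djb2(doc_id) % num_primary_shards


def distribution(num_docs: int, num_primary_shards: int) -> list[int]:
    """Count how many of the IDs "0" .. str(num_docs - 1) land on each shard."""
    counts = Counter(shard_for(str(i), num_primary_shards) for i in range(num_docs))
    return [counts.get(shard, 0) for shard in range(num_primary_shards)]


print(distribution(100_000, 3))
print(distribution(100_000, 33))
```

Because djb2 multiplies by 33 at every step, the result modulo 33 (or modulo 3) depends only on the last character of a short ID. In this sketch only 10 of the 33 shards receive any documents, which is the kind of skew that motivated the switch to murmur3. The formula also shows why the number of primary shards cannot be changed without reindexing: a different modulus sends existing IDs to different shards.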

Slide 65

Write Coordinating Node, Hash, Primary, Replica(s)

Slide 66

Get & Aggregate Coordinating Node, Hash, Shard

Slide 67

Optimistic concurrency control _version
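
Optimistic concurrency control means a write states which version it expects to replace and is rejected if the document has changed in the meantime. The toy in-memory store below only mimics that check. The class and method names are invented for illustration and do not correspond to any client library.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class VersionedStore:
    """Hypothetical in-memory store that mimics the _version check."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def get(self, doc_id):
        return self._docs[doc_id]

    def index(self, doc_id, source, expected_version=None):
        current_version = self._docs.get(doc_id, (0, None))[0]
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"current version is {current_version}, expected {expected_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, source)
        return new_version


store = VersionedStore()
store.index("1", {"stock": 10})                       # creates version 1
store.index("1", {"stock": 9}, expected_version=1)    # succeeds, now version 2
try:
    store.index("1", {"stock": 8}, expected_version=1)  # stale read
except VersionConflict as error:
    print("rejected:", error)
```

The second writer has to re-read the document and retry, instead of silently overwriting the first writer's change.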

Slide 68

Flaky nodes index.unassigned.node_left.delayed_timeout: 1m

Slide 69

Sequence numbers Quick recovery in 6.0

Slide 70

Lucene Segment

Slide 71

Lucene at work

Slide 72

Slide 73

Slide 74

Slide 75

Slide 76

Slide 77

Slide 78

Slide 79

Slide 80

Slide 81

Segments are immutable

Slide 82

index.refresh_interval 7.0: index.search.idle.after

Slide 83

Tombstone file Marks deleted documents

Slide 84

Merge Combine (and clean up) segments

Slide 85

Visualize merges http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

Slide 86

Slide 87

Searching

Slide 88

Search Coordinating Node, Query then Fetch
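
Query then fetch is a scatter-gather in two phases: every shard first reports only the IDs and scores of its local top hits, and the coordinating node fetches full documents only for the global winners. The sketch below is a simplified model in which each shard is a dict of precomputed scores. The data and function names are assumptions made for illustration.

```python
import heapq


def query_phase(shard, k):
    """A shard returns only (score, doc_id) pairs for its local top k."""
    return heapq.nlargest(k, ((score, doc_id) for doc_id, (score, _) in shard.items()))


def search(shards, k):
    # Query phase: collect the local top k of every shard on the coordinating node.
    candidates = []
    for shard_no, shard in enumerate(shards):
        for score, doc_id in query_phase(shard, k):
            candidates.append((score, shard_no, doc_id))
    winners = heapq.nlargest(k, candidates)
    # Fetch phase: load the _source only for the global top k.
    return [(doc_id, score, shards[shard_no][doc_id][1]) for score, shard_no, doc_id in winners]


# Hypothetical shards: doc_id -> (score for the current query, _source)
shards = [
    {"1": (0.4, {"quote": "a"}), "2": (1.2, {"quote": "b"})},
    {"3": (0.9, {"quote": "c"})},
]
print(search(shards, 2))
```

Each shard has to return k candidates because the coordinating node cannot know in advance which shard holds the best hits.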

Slide 89

Benchmarks Fair Reproducible Close to Production

Slide 90

Slide 91

Full-Text Search

Slide 92

Slide 93

Slide 94

Who uses a Database?

Slide 95

Who uses Search?

Slide 96

Slide 97

Store

Slide 98

Apache Lucene Elasticsearch

Slide 99

Slide 100

Example These are <em>not</em> the droids you are looking for.

Slide 101

html_strip Char Filter These are not the droids you are looking for.

Slide 102

standard Tokenizer These are not the droids you are looking for

Slide 103

lowercase Token Filter these are not the droids you are looking for

Slide 104

stop Token Filter droids you looking

Slide 105

snowball Token Filter droid you look
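
The chain of char filter, tokenizer, and token filters can be imitated in plain Python to see how each step transforms the text. This is only a sketch: the regular expressions are crude stand-ins for html_strip and the standard tokenizer, and toy_stem is a made-up suffix stripper, not the Snowball algorithm.

```python
import re

# English stop word list as used by Lucene's EnglishAnalyzer.
STOP_WORDS = set(
    "a an and are as at be but by for if in into is it no not of on or such "
    "that the their then there these they this to was will with".split()
)


def html_strip(text: str) -> str:
    """Char filter: drop anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)


def tokenize(text: str) -> list[str]:
    """Tokenizer: split on anything that is not a word character."""
    return re.findall(r"\w+", text)


def toy_stem(token: str) -> str:
    """Hypothetical stemmer: strips a few common suffixes, nothing more."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze(text: str) -> list[str]:
    tokens = [token.lower() for token in tokenize(html_strip(text))]
    return [toy_stem(token) for token in tokens if token not in STOP_WORDS]


print(analyze("These are <em>not</em> the droids you are looking for."))
```

The order of the steps matters: lowercasing has to happen before the stop word filter, otherwise "These" would not match the lowercase entry "these".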

Slide 106

Analyze

Slide 107

GET /_analyze { "analyzer": "english", "text": "These are not the droids you are looking for." }

Slide 108

{ "tokens": [ { "token": "droid", "start_offset": 18, "end_offset": 24, "type": "<ALPHANUM>", "position": 4 }, { "token": "you", "start_offset": 25, "end_offset": 28, "type": "<ALPHANUM>", "position": 5 }, ... ] }

Slide 109

GET /_analyze { "char_filter": [ "html_strip" ], "tokenizer": "standard", "filter": [ "lowercase", "stop", "snowball" ], "text": "These are <em>not</em> the droids you are looking for." }

Slide 110

{ "tokens": [ { "token": "droid", "start_offset": 27, "end_offset": 33, "type": "<ALPHANUM>", "position": 4 }, { "token": "you", "start_offset": 34, "end_offset": 37, "type": "<ALPHANUM>", "position": 5 }, ... ] }

Slide 111

Stop Words a an and are as at be but by for if in into is it no not of on or such that the their then there these they this to was will with https://github.com/apache/lucene-solr/blob/master/lucene/analysis/common/src/java/org/apache/lucene/analysis/en/EnglishAnalyzer.java#L44-L50

Slide 112

Always Use Stop Words?

Slide 113

To be, or not to be.

Slide 114

Languages Arabic, Armenian, Basque, Brazilian, Bulgarian, Catalan, CJK, Czech, Danish, Dutch, English, Finnish, French, Galician, German, Greek, Hindi, Hungarian, Indonesian, Irish, Italian, Latvian, Lithuanian, Norwegian, Persian, Portuguese, Romanian, Russian, Sorani, Spanish, Swedish, Turkish, Thai

Slide 115

Language Rules English: Philipp's → philipp French: l'église → eglis German: äußerst → ausserst

Slide 116

More Language Plugins Core: ICU (Asian languages), Kuromoji (advanced Japanese), Phonetic, SmartCN, Stempel (Polish), Ukrainian Community: Hebrew, Vietnamese, Network Address Analysis, String2Integer,...

Slide 117

German GET /_analyze { "analyzer": "german", "text": "Das sind nicht die Droiden, nach denen du suchst." }

Slide 118

{ "tokens": [ { "token": "droid", "start_offset": 19, "end_offset": 26, "type": "<ALPHANUM>", "position": 4 }, { "token": "den", "start_offset": 33, "end_offset": 38, "type": "<ALPHANUM>", "position": 6 }, { "token": "such", "start_offset": 42, "end_offset": 48, "type": "<ALPHANUM>", "position": 8 } ] }

Slide 119

German with the English Analyzer da sind nicht die droiden nach denen du suchst

Slide 120

German Stop Words https://github.com/apache/lucene-solr/blob/master/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/german_stop.txt

Slide 121

Detect Languages https://github.com/spinscale/elasticsearch-ingest-langdetect

Slide 122

PUT _ingest/pipeline/langdetect-pipeline { "description": "A pipeline to detect languages", "processors": [ { "langdetect" : { "field" : "quote", "target_field" : "language" } } ] }

Slide 123

POST _ingest/pipeline/langdetect-pipeline/_simulate { "docs": [ { "_source": { "quote": "Das sind nicht die Droiden, nach denen du suchst." } } ] }

Slide 124

{ "docs": [ { "doc": { "_index": "_index", "_type": "_type", "_id": "_id", "_source": { "language": "de", "quote": "Das sind nicht die Droiden, nach denen du suchst." }, "_ingest": { "timestamp": "2018-10-26T00:06:42.320613Z" } } } ] }

Slide 125

Phonetic GET /_analyze { "tokenizer": "standard", "filter": [ { "type": "phonetic", "encoder": "beider_morse", "languageset": "any" } ], "text": "These are not the droids you are looking for." }

Slide 126

Phonetic ... drDts drits drots loknk... iou ari ori

Slide 127

Another Example Obi-Wan never told you what happened to your father.

Slide 128

Another Example obi wan never told you what happen your father

Slide 129

Another Example <b>No</b>. I am your father.

Slide 130

Another Example i am your father

Slide 131

Inverted Index

term    ID 1   ID 2   ID 3
am      0      0      1[2]
droid   1[4]   0      0
father  0      1[9]   1[4]
happen  0      1[6]   0
i       0      0      1[1]
look    1[7]   0      0
never   0      1[2]   0
obi     0      1[0]   0
told    0      1[3]   0
wan     0      1[1]   0
what    0      1[5]   0
you     1[5]   1[4]   0
your    0      1[8]   1[3]
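
An inverted index of this shape maps every term to the documents containing it, together with the positions inside each document. The sketch below builds one from the three analyzed quotes and uses the positions for a simple phrase lookup. The token lists are typed in by hand (None marks a removed stop word so that positions stay intact), and the function names are made up for illustration.

```python
from collections import defaultdict

# Analyzed tokens per document ID; None marks a removed stop word.
DOCS = {
    1: [None, None, None, None, "droid", "you", None, "look", None],
    2: ["obi", "wan", "never", "told", "you", "what", "happen", None, "your", "father"],
    3: [None, "i", "am", "your", "father"],
}


def build_index(docs):
    """Return term -> {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for position, term in enumerate(tokens):
            if term is not None:
                index[term].setdefault(doc_id, []).append(position)
    return index


def phrase_match(index, terms):
    """Doc IDs in which the terms occur at consecutive positions."""
    first, rest = terms[0], terms[1:]
    hits = []
    for doc_id, positions in index.get(first, {}).items():
        for start in positions:
            if all(
                start + offset in index.get(term, {}).get(doc_id, [])
                for offset, term in enumerate(rest, 1)
            ):
                hits.append(doc_id)
                break
    return sorted(hits)


index = build_index(DOCS)
print(index["father"])
print(phrase_match(index, ["your", "father"]))
```

Looking up a single term is one dictionary access, which is what makes full-text search fast. The stored positions are what a phrase query needs to tell "your father" apart from two unrelated occurrences of the same words.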

Slide 132

To / The Index

Slide 133

PUT /starwars { "settings": { "number_of_shards": 1, "analysis": { "filter": { "my_synonym_filter": { "type": "synonym", "synonyms": [ "father,dad", "droid => droid,machine" ] } },

Slide 134

"analyzer": { "my_analyzer": { "char_filter": [ "html_strip" ], "tokenizer": "standard", "filter": [ "lowercase", "stop", "snowball", "my_synonym_filter" ] } } } },

Slide 135

"mappings": { "_doc": { "properties": { "quote": { "type": "text", "analyzer": "my_analyzer" } } } } }

Slide 136

Synonyms Index synonym or query time synonym_graph

Slide 137

GET /starwars/_mapping GET /starwars/_settings

Slide 138

PUT /starwars/_doc/1 { "quote": "These are <em>not</em> the droids you are looking for." } PUT /starwars/_doc/2 { "quote": "Obi-Wan never told you what happened to your father." } PUT /starwars/_doc/3 { "quote": "<b>No</b>. I am your father." }

Slide 139

GET /starwars/_doc/1 GET /starwars/_doc/1/_source

Slide 140

Multi Lingual Index PUT /starwars_en/_doc/1 Type Field { "quote_en": "...", "quote_de": "..." }

Slide 141

PS: Single Type per Index

Slide 142

Search

Slide 143

POST /starwars/_search { "query": { "match_all": { } } }

Slide 144

GET vs POST

Slide 145

{ "took": 1, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 3, "max_score": 1, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "2", "_score": 1, "_source": { "quote": "Obi-Wan never told you what happened to your father." } }, ...

Slide 146

POST /starwars/_search { "query": { "match": { "quote": "droid" } } }

Slide 147

{ "took": 2, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 1, "max_score": 0.39556286, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "1", "_score": 0.39556286, "_source": { "quote": "These are <em>not</em> the droids you are looking for." } } ] } }

Slide 148

POST /starwars/_search { "query": { "match": { "quote": "dad" } } }

Slide 149

... "hits": { "total": 2, "max_score": 0.41913947, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "3", "_score": 0.41913947, "_source": { "quote": "<b>No</b>. I am your father." } }, { "_index": "starwars", "_type": "_doc", "_id": "2", "_score": 0.39291072, "_source": { "quote": "Obi-Wan never told you what happened to your father." } } ] } }

Slide 150

POST /starwars/_doc/0/_explain { "query": { "match": { "quote": "dad" } } }

Slide 151

{ "_index": "starwars", "_type": "_doc", "_id": "0", "matched": false }

Slide 152

POST /starwars/_doc/1/_explain { "query": { "match": { "quote": "dad" } } }

Slide 153

{ "_index": "starwars", "_type": "_doc", "_id": "1", "matched": false, "explanation": { "value": 0, "description": "no matching term", "details": [] } }

Slide 154

POST /starwars/_doc/2/_explain { "query": { "match": { "quote": "dad" } } }

Slide 155

{ "_index": "starwars", "_type": "_doc", "_id": "2", "matched": true, "explanation": { ...

Slide 156

POST /starwars/_search { "query": { "match": { "quote": "machine" } } }

Slide 157

{ "took": 2, "timed_out": false, "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 1.2499592, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "1", "_score": 1.2499592, "_source": { "quote": "These are <em>not</em> the droids you are looking for." } } ] } }

Slide 158

POST /starwars/_search { "query": { "match_phrase": { "quote": "I am your father" } } }

Slide 159

{ "took": 3, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 1, "max_score": 1.5665855, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "3", "_score": 1.5665855, "_source": { "quote": "<b>No</b>. I am your father." } } ] } }

Slide 160

POST /starwars/_search { "query": { "match_phrase": { "quote": { "query": "I am father", "slop": 1 } } } }

Slide 161

{ "took": 16, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 1, "max_score": 0.8327639, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "3", "_score": 0.8327639, "_source": { "quote": "<b>No</b>. I am your father." } } ] } }

Slide 162

POST /starwars/_search { "query": { "match_phrase": { "quote": { "query": "I am not your father", "slop": 1 } } } }

Slide 163

{ "took": 5, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 1, "max_score": 1.0409548, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "3", "_score": 1.0409548, "_source": { "quote": "<b>No</b>. I am your father." } } ] } }

Slide 164

POST /starwars/_search { "query": { "match": { "quote": { "query": "van", "fuzziness": "AUTO" } } } }

Slide 165

{ "took": 14, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 1, "max_score": 0.18155496, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "2", "_score": 0.18155496, "_source": { "quote": "Obi-Wan never told you what happened to your father." } } ] } }

Slide 166

POST /starwars/_search { "query": { "match": { "quote": { "query": "ovi-van", "fuzziness": 1 } } } }

Slide 167

{ "took": 109, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 1, "max_score": 0.3798467, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "2", "_score": 0.3798467, "_source": { "quote": "Obi-Wan never told you what happened to your father." } } ] } }

Slide 168

FuzzyQuery History http://blog.mikemccandless.com/2011/03/lucenes-fuzzyquery-is-100-times-faster.html Before: Brute force Now: Levenshtein Automaton

Slide 169

http://blog.notdot.net/2010/07/Damn-Cool-Algorithms-Levenshtein-Automata

Slide 170

SELECT * FROM starwars WHERE quote LIKE '_an' OR quote LIKE 'V_n' OR quote LIKE 'Va_'
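
Fuzzy matching is defined by edit distance: how many single-character insertions, deletions, or substitutions turn one term into another. The sketch below is a brute-force comparison, not the Levenshtein automaton Lucene uses. It uses plain Levenshtein distance, whereas Elasticsearch also counts a transposition as one edit by default. The length thresholds in auto_fuzziness follow the documented defaults for "fuzziness": "AUTO"; the function names are made up.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(
                min(
                    previous[j] + 1,                       # deletion
                    current[j - 1] + 1,                    # insertion
                    previous[j - 1] + (char_a != char_b),  # substitution
                )
            )
        previous = current
    return previous[-1]


def auto_fuzziness(term: str) -> int:
    """Allowed edits for AUTO: 0 below 3 characters, 1 below 6, otherwise 2."""
    if len(term) < 3:
        return 0
    if len(term) < 6:
        return 1
    return 2


def fuzzy_match(query: str, indexed_term: str) -> bool:
    return levenshtein(query, indexed_term) <= auto_fuzziness(query)


print(levenshtein("van", "wan"), fuzzy_match("van", "wan"))
```

Comparing the query against every indexed term like this is the brute-force approach; the automaton instead enumerates only the terms within the allowed distance.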

Slide 171

Score

Slide 172

Term Frequency / Inverse Document Frequency (TF/IDF) Search one term

Slide 173

BM25 Default in Elasticsearch 5.0 https://speakerdeck.com/elastic/improved-text-scoring-with-bm25

Slide 174

Term Frequency

Slide 175

Slide 176

Inverse Document Frequency

Slide 177

Slide 178

Field-Length Norm

Slide 179

POST /starwars/_search?explain=true { "query": { "match": { "quote": "father" } } }

Slide 180

... "_explanation": { "value": 0.41913947, "description": "weight(Synonym(quote:dad quote:father) in 0) [PerFieldSimilarity], result of:", "details": [ { "value": 0.41913947, "description": "score(doc=0,freq=2.0 = termFreq=2.0\n), product of:", "details": [ { "value": 0.2876821, "description": "idf(docFreq=1, docCount=1)", "details": [] }, { "value": 1.4569536, "description": "tfNorm, computed from:", "details": [ { "value": 2, "description": "termFreq=2.0", "details": [] }, ...
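
The numbers in this explanation can be recomputed by hand. The sketch below uses the BM25 idf and tfNorm formulas as Lucene 7 reports them in explain output. The explanation above is cut off before the field lengths, so field_length=4 and avg_field_length=5 are assumed values, picked because they reproduce the printed tfNorm.

```python
import math


def idf(doc_freq: int, doc_count: int) -> float:
    """BM25 inverse document frequency."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def tf_norm(freq, field_length, avg_field_length, k1=1.2, b=0.75):
    """BM25 term frequency, saturated by k1 and length-normalized by b."""
    return freq * (k1 + 1) / (freq + k1 * (1 - b + b * field_length / avg_field_length))


term_idf = idf(doc_freq=1, doc_count=1)
# field_length and avg_field_length are assumptions, not taken from the slide.
term_tf = tf_norm(freq=2.0, field_length=4, avg_field_length=5)
print(term_idf, term_tf, term_idf * term_tf)
```

The termFreq of 2.0 comes from the synonym filter: "father" is indexed as both father and dad at the same position, and the query for father is expanded to both terms.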

Slide 181

Score 0.41913947: i am your father 0.39291072: obi wan never told you what happen your father

Slide 182

Vector Space Model Search multiple terms

Slide 183

Search your father

Slide 184

Slide 185

Coordination Factor Reward multiple terms

Slide 186

Search for 3 terms 1 term: 2 terms: 3 terms:

Slide 187

Practical Scoring Function Putting it all together

Slide 188

score(q,d) = queryNorm(q) · coord(q,d) · ∑_{t in q} ( tf(t in d) · idf(t)² · t.getBoost() · norm(t,d) )

Slide 189

Function Score Script, weight, random, field value, decay (geo or date)

Slide 190

POST /starwars/_search { "query": { "function_score": { "query": { "match": { "quote": "father" } }, "random_score": {} } } }

Slide 191

Compare Scores "100% perfect" vs a "50%" match

Slide 192

Don't do this. Seriously. Stop trying to think about your problem this way, it's not going to end well. — https://wiki.apache.org/lucene-java/ScoresAsPercentages

Slide 193

GET /starwars/_analyze { "analyzer" : "my_analyzer", "text": "These are my father's machines." }

Slide 194

{ "tokens": [ { "token": "my", "start_offset": 10, "end_offset": 12, "type": "<ALPHANUM>", "position": 2 }, { "token": "father", "start_offset": 13, "end_offset": 21, "type": "<ALPHANUM>", "position": 3 }, { "token": "dad", "start_offset": 13, "end_offset": 21, "type": "SYNONYM", "position": 3 }, { "token": "machin", "start_offset": 22, "end_offset": 30, "type": "<ALPHANUM>", "position": 4 } ] }

Slide 195

PUT /starwars/_doc/4 { "quote": "These are my father's machines." }

Slide 196

POST /starwars/_search { "query": { "match": { "quote": "my father machine" } } }

Slide 197

"hits": { "total": 4, "max_score": 2.92523, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "4", "_score": 2.92523, "_source": { "quote": "These are my father's machines." } }, { "_index": "starwars", "_type": "_doc", "_id": "1", "_score": 0.8617505, "_source": { "quote": "These are <em>not</em> the droids you are looking for." } }, ...

Slide 198

2.92523 == 100%

Slide 199

DELETE /starwars/_doc/4 POST /starwars/_search { "query": { "match": { "quote": "my father machine" } } }

Slide 200

"hits": { "total": 3, "max_score": 1.2499592, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "1", "_score": 1.2499592, "_source": { "quote": "These are <em>not</em> the droids you are looking for." } }, ...

Slide 201

1.2499592 == 43% or 100%?

Slide 202

PUT /starwars/_doc/4 { "quote": "These droids are my father's father's machines." } POST /starwars/_search { "query": { "match": { "quote": "my father machine" } } }

Slide 203

"hits": { "total": 4, "max_score": 3.0068164, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "4", "_score": 3.0068164, "_source": { "quote": "These droids are my father's father's machines." } }, { "_index": "starwars", "_type": "_doc", "_id": "1", "_score": 0.89701396, "_source": { "quote": "These are <em>not</em> the droids you are looking for." } }, ...

Slide 204

3.0068164 == 103%?

Slide 205

Slide 206

PS: Shards Default? Effect on IDF?

Slide 207

Distributed Frequency Search GET starwars/_search?search_type=dfs_query_then_fetch { ... }

Slide 208

Don’t use dfs_query_then_fetch in production. It really isn’t required. — https://www.elastic.co/guide/en/elasticsearch/guide/current/relevance-is-broken.html

Slide 209

More Search

Slide 210

Highlighting

Slide 211

POST /starwars/_search { "query": { "match": { "quote": "father" } }, "highlight": { "type": "unified", "pre_tags": [ "<tag>" ], "post_tags": [ "</tag>" ], "fields": { "quote": {} } } }

Slide 212

... "hits": { "total": 3, "max_score": 0.631961, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "4", "_score": 0.631961, "_source": { "quote": "These droids are my father's father's machines." }, "highlight": { "quote": [ "These droids are my <tag>father's</tag> <tag>father's</tag> machines." ] } }, ...

Slide 213

Boolean Queries must must_not should filter

Slide 214

POST /starwars/_search { "query": { "bool": { "must": { "match": { "quote": "father" } }, "should": [ { "match": { "quote": "your" } }, { "match": { "quote": "obi" } } ] } } }

Slide 215

... "hits": { "total": 3, "max_score": 2.117857, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "2", "_score": 2.117857, "_source": { "quote": "Obi-Wan never told you what happened to your father." } }, { "_index": "starwars", "_type": "_doc", "_id": "3", "_score": 1.3856719, "_source": { "quote": "<b>No</b>. I am your father." } }, ...

Slide 216

POST /starwars/_search { "query": { "bool": { "filter": { "match": { "quote": "father" } }, "should": [ { "match": { "quote": "your" } }, { "match": { "quote": "obi" } } ] } } }

Slide 217

... "hits": { "total": 3, "max_score": 1.6694657, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "2", "_score": 1.6694657, "_source": { "quote": "Obi-Wan never told you what happened to your father." } }, { "_index": "starwars", "_type": "_doc", "_id": "3", "_score": 0.8317767, "_source": { "quote": "<b>No</b>. I am your father." } },

Slide 218

Named Queries & minimum_should_match

Slide 219

POST /starwars/_search { "query": { "bool": { "must": { "match": { "quote": "father" } }, "should": [ { "match": { "quote": { "query": "your", "_name": "quote-your" } } }, { "match": { "quote": { "query": "obi", "_name": "quote-obi" } } }, { "match": { "quote": { "query": "droid", "_name": "quote-droid" } } } ], "minimum_should_match": 2 } } }

Slide 220

... "hits": { "total": 1, "max_score": 2.117857, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "2", "_score": 2.117857, "_source": { "quote": "Obi-Wan never told you what happened to your father." }, "matched_queries": [ "quote-obi", "quote-your" ] } ] } }

Slide 221

Boosting: >1 increases, <1 decreases, <0 punishes (negative boosts were removed in 7.0)

Slide 222

POST /starwars/_search { "query": { "bool": { "must": { "match": { "quote": "father" } }, "should": [ { "match": { "quote": "your" } }, { "match": { "quote": { "query": "obi", "boost": 3 } } } ] } } }

Slide 223

... "hits": { "total": 3, "max_score": 4.2368493, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "2", "_score": 4.2368493, "_source": { "quote": "Obi-Wan never told you what happened to your father." } }, { "_index": "starwars", "_type": "_doc", "_id": "3", "_score": 1.3856719, "_source": { "quote": "<b>No</b>. I am your father." } }, ...

Slide 224

Search for father but prefer father father

Slide 225

POST /starwars/_search { "query": { "bool": { "must": { "match": { "quote": "father father" } } } } }

Slide 226

... "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "4", "_score": 1.263922, "_source": { "quote": "These droids are my father's father's machines." } }, { "_index": "starwars", "_type": "_doc", "_id": "3", "_score": 1.1077905, "_source": { "quote": "<b>No</b>. I am your father." } },

Slide 227

POST /starwars/_search { "query": { "bool": { "must": { "match": { "quote": "father" } }, "should": { "match_phrase": { "quote": "father father" } } } } }

Slide 228

... "hits": { "total": 3, "max_score": 9.146545, "hits": [ { "_index": "starwars", "_type": "_doc", "_id": "4", "_score": 9.146545, "_source": { "quote": "These droids are my father's father's machines." } }, { "_index": "starwars", "_type": "_doc", "_id": "3", "_score": 1.0454913, "_source": { "quote": "<b>No</b>. I am your father." } }, ...

Slide 229

Suggestion Suggest a similar text _search end point _suggest deprecated since 5.0

Slide 230

POST /starwars/_search { "query": { "match": { "quote": "drui" } }, "suggest": { "my_suggestion" : { "text" : "drui", "term" : { "field" : "quote" } } } }

Slide 231

... "hits": { "total": 0, "max_score": null, "hits": [] }, "suggest": { "my_suggestion": [ { "text": "drui", "offset": 0, "length": 4, "options": [ { "text": "droid", "score": 0.5, "freq": 1 } ] } ] } }

Slide 232

Multiple Suggesters term phrase completion context

Slide 233

NGram Partial matches Edge Gram

Slide 234

GET /_analyze { "char_filter": [ "html_strip" ], "tokenizer": { "type": "ngram", "min_gram": "3", "max_gram": "3", "token_chars": [ "letter" ] }, "filter": [ "lowercase" ], "text": "These are <em>not</em> the droids you are looking for." }

Slide 235

{ "tokens": [ { "token": "the", "start_offset": 0, "end_offset": 3, "type": "word", "position": 0 }, { "token": "hes", "start_offset": 1, "end_offset": 4, "type": "word", "position": 1 }, { "token": "ese", "start_offset": 2, "end_offset": 5, "type": "word", "position": 2 }, { "token": "are", "start_offset": 6, "end_offset": 9, "type": "word", "position": 3 }, ...

Slide 236

GET /_analyze { "char_filter": [ "html_strip" ], "tokenizer": { "type": "edge_ngram", "min_gram": "1", "max_gram": "3", "token_chars": [ "letter" ] }, "filter": [ "lowercase" ], "text": "These are <em>not</em> the droids you are looking for." }

Slide 237

{ "tokens": [ { "token": "t", "start_offset": 0, "end_offset": 1, "type": "word", "position": 0 }, { "token": "th", "start_offset": 0, "end_offset": 2, "type": "word", "position": 1 }, { "token": "the", "start_offset": 0, "end_offset": 3, "type": "word", "position": 2 }, { "token": "a", "start_offset": 6, "end_offset": 7, "type": "word", "position": 3 }, { "token": "ar", "start_offset": 6, "end_offset": 8, "type": "word", "position": 4 }, ...
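
The difference between the two tokenizers is easiest to see per word. A minimal sketch, with made-up function names, that produces the same grams as the first tokens in the two responses above:

```python
def ngrams(word: str, min_gram: int, max_gram: int) -> list[str]:
    """All substrings of length min_gram..max_gram, at every start offset."""
    return [
        word[start : start + size]
        for start in range(len(word))
        for size in range(min_gram, max_gram + 1)
        if start + size <= len(word)
    ]


def edge_ngrams(word: str, min_gram: int, max_gram: int) -> list[str]:
    """Only the prefixes of length min_gram..max_gram."""
    return [word[:size] for size in range(min_gram, min(max_gram, len(word)) + 1)]


print(ngrams("these", 3, 3))
print(edge_ngrams("these", 1, 3))
```

N-grams allow matches anywhere inside a word at the cost of many more terms in the index, while edge n-grams only support prefix matches, which is usually what search-as-you-type needs.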

Slide 238

Combining Analyzers Reindex Store multiple times Tune BM25 Combine scores

Slide 239

BM25 Revisited

Slide 240

https://www.elastic.co/blog/practical-bm25-part-2-the-bm25-algorithm-and-its-variables

Slide 241

b: field length amplification, default 0.75
k1: term frequency saturation, default 1.2

Slide 242

PUT /starwars_v42 { "settings": { "number_of_shards": 1, "index": { "similarity": { "default": { "type": "BM25", "b": 0, "k1": 0 } } },

Slide 243

"analysis": { "filter": { "my_synonym_filter": { "type": "synonym", "synonyms": [ "father,dad", "droid => droid,machine" ] }, "my_ngram_filter": { "type": "ngram", "min_gram": "3", "max_gram": "3", "token_chars": [ "letter" ] } },

Slide 244

"analyzer": { "my_lowercase_analyzer": { "char_filter": [ "html_strip" ], "tokenizer": "whitespace", "filter": [ "lowercase" ] }, "my_full_analyzer": { "char_filter": [ "html_strip" ], "tokenizer": "standard", "filter": [ "lowercase", "stop", "snowball", "my_synonym_filter" ] },

Slide 245

"my_ngram_analyzer": { "char_filter": [ "html_strip" ], "tokenizer": "whitespace", "filter": [ "lowercase", "stop", "my_ngram_filter" ] } } } },

Slide 246

"mappings": { "_doc": { "properties": { "quote": { "type": "text", "fields": { "lowercase": { "type": "text", "analyzer": "my_lowercase_analyzer" }, "full": { "type": "text", "analyzer": "my_full_analyzer" }, "ngram": { "type": "text", "analyzer": "my_ngram_analyzer" } } } } } } }

Slide 247

POST /_reindex { "source": { "index": "starwars" }, "dest": { "index": "starwars_v42" } }

Slide 248

Aliases Atomic remove and add Point to multiple indices (read-only)

Slide 249

POST /_aliases { "actions": [ { "add": { "index": "starwars_v42", "alias": "starwars_extended" } } ] }

Slide 250

POST /starwars/_search { "query": { "match": { "quote": "droid" } } }

Slide 251

"hits": [ { "_index": "starwars", "_type": "_doc", "_id": "4", "_score": 1.1533037, "_source": { "quote": "These droids are my father's father's machines." } }, { "_index": "starwars", "_type": "_doc", "_id": "1", "_score": 1.1295731, "_source": { "quote": "These are <em>not</em> the droids you are looking for." } } ]

Slide 252

POST /starwars_extended/_search { "query": { "match": { "quote.full": "droid" } } }

Slide 253

"hits": [ { "_index": "starwars_v42", "_type": "_doc", "_id": "1", "_score": 0.6931472, "_source": { "quote": "These are <em>not</em> the droids you are looking for." } }, { "_index": "starwars_v42", "_type": "_doc", "_id": "4", "_score": 0.6931472, "_source": { "quote": "These droids are my father's father's machines." } } ]

Slide 254

There are no "best" b and k1 values
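
What the two parameters do is easier to see with numbers. The sketch below evaluates only the term-frequency part of BM25 for growing frequencies, once with the defaults and once with b and k1 set to 0 as in starwars_v42. The function name and the length_ratio parameter (field length divided by average field length) are assumptions for illustration.

```python
def tf_norm(freq, k1=1.2, b=0.75, length_ratio=1.0):
    """Term-frequency part of BM25; length_ratio = field length / average length."""
    return freq * (k1 + 1) / (freq + k1 * (1 - b + b * length_ratio))


for freq in (1, 2, 5, 10, 100):
    defaults = round(tf_norm(freq), 3)
    flattened = tf_norm(freq, k1=0, b=0)
    print(f"freq={freq:>3}  defaults={defaults}  k1=0,b=0={flattened}")
```

With the defaults the value grows with the frequency but never exceeds k1 + 1, which is the saturation. With k1 set to 0 it is always 1, so repeated terms and field length stop mattering and only idf remains. That is consistent with both hits on starwars_v42 scoring 0.6931472 (ln 2), the idf of a term found in 2 of 4 documents; those document counts are inferred here, not shown in the response.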

Slide 255

POST /starwars_extended/_search?explain=true { "query": { "multi_match": { "query": "obiwan", "fields": [ "quote", "quote.lowercase", "quote.full", "quote.ngram" ], "type": "most_fields" } } }

Slide 256

Slide 256

... "hits": { "total": 1, "max_score": 0.4912064, "hits": [ { "_shard": "[starwars_v42][2]", "_node": "BCDwzJ4WSw2dyoGLTzwlqw", "_index": "starwars_v42", "_type": "_doc", "_id": "2", "_score": 0.4912064, "_source": { "quote": "Obi-Wan never told you what happened to your father." }, ...

Slide 257

Slide 257

Whitespace Tokenizer "weight( Synonym(quote.ngram:biw quote.ngram:iwa quote.ngram:obi quote.ngram:wan) in 0) [PerFieldSimilarity], result of:"

Slide 258

Slide 258

POST /starwars_extended/_search { "query": { "multi_match": { "query": "you", "fields": [ "quote", "quote.lowercase^5", "quote.full", "quote.ngram" ], "type": "best_fields" } } }

Slide 259

Slide 259

"hits": [ { "_index": "starwars_v42", "_type": "_doc", "_id": "1", "_score": 3.465736, "_source": { "quote": "These are <em>not</em> the droids you are looking for." } }, { "_index": "starwars_v42", "_type": "_doc", "_id": "2", "_score": 3.465736, "_source": { "quote": "Obi-Wan never told you what happened to your father." } }, { "_index": "starwars_v42", "_type": "_doc", "_id": "3", "_score": 0.35667494, "_source": { "quote": "<b>No</b>. I am your father." } } ]

Slide 260

Slide 260

Multi Match Type best_fields: Score of the best field (default) cross_fields: All terms in at least one field most_fields: Score sum of all fields phrase
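The difference between best_fields and most_fields comes down to how the per-field scores are combined. A rough sketch, assuming the per-field scores (after boosts) are already known; the function name and the numbers are made up for illustration.

```python
def combine_field_scores(field_scores, match_type="best_fields", tie_breaker=0.0):
    """Combine per-field scores of one document into a single score."""
    scores = list(field_scores.values())
    if not scores:
        return 0.0
    if match_type == "most_fields":
        # Every matching field contributes.
        return sum(scores)
    if match_type == "best_fields":
        # Only the best field counts; the others contribute via tie_breaker.
        best = max(scores)
        return best + tie_breaker * (sum(scores) - best)
    raise ValueError("unsupported type in this sketch: " + match_type)


# Hypothetical per-field scores for one document.
example = {"quote": 0.3, "quote.lowercase": 1.5, "quote.full": 0.2}
print(combine_field_scores(example, "best_fields"))
print(combine_field_scores(example, "most_fields"))
```

cross_fields and phrase work differently (they change how terms are matched, not only how scores are combined), so they are left out of this sketch.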

Slide 261

Slide 261

Different Analyzers for Indexing and Searching Per query In the mapping

Slide 262

Slide 262

POST /starwars_extended/_search { "query": { "match": { "quote.ngram": { "query": "the", "analyzer": "standard" } } } }

Slide 263

Slide 263

... "hits": [ { "_index": "starwars_extended", "_type": "_doc", "_id": "2", "_score": 0.38254172, "_source": { "quote": "Obi-Wan never told you what happened to your father." } }, { "_index": "starwars_extended", "_type": "_doc", "_id": "3", "_score": 0.36165747, "_source": { "quote": "<b>No</b>. I am your father." } } ] ...

Slide 264

Slide 264

Edge Gram vs Trigram Test a setting before adding a field

Slide 265

Slide 265

Shingle Token Filter Shingles (token ngrams) from a token stream
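A shingle is an n-gram over tokens instead of characters. The following sketch mimics a shingle filter with a size of 2 that also keeps the single tokens (the output_unigrams behavior); the function itself is an illustration, not the Lucene implementation.

```python
def shingles(tokens, min_size=2, max_size=2, output_unigrams=True):
    """Return token n-grams, optionally together with the original tokens."""
    result = []
    for i, token in enumerate(tokens):
        if output_unigrams:
            result.append(token)
        for size in range(min_size, max_size + 1):
            if i + size <= len(tokens):
                result.append(" ".join(tokens[i:i + size]))
    return result


print(shingles(["these", "droids", "are"]))
# ['these', 'these droids', 'droids', 'droids are', 'are']
```

A document that contains "these droids" as neighboring words shares more shingles with the query than one that only contains the single words, which is how word order starts to influence the score.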

Slide 266

Slide 266

POST /starwars_extended/_close PUT /starwars_extended/_settings { "index": { "similarity": { "default": { "type": "BM25", "b": null, "k1": null } } },

Slide 267

Slide 267

"analysis": { "filter": { "my_edgegram_filter": { "type": "edge_ngram", "min_gram": 3, "max_gram": 10 }, "my_shingle_filter": { "type": "shingle", "min_shingle_size": 2, "max_shingle_size": 2 } },

Slide 268

Slide 268

"analyzer": { "my_edgegram_analyzer": { "char_filter": [ "html_strip" ], "tokenizer": "standard", "filter": [ "lowercase", "my_edgegram_filter" ] },

Slide 269

Slide 269

"my_shingle_analyzer": { "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "my_shingle_filter" ] } } } } POST /starwars_extended/_open

Slide 270

Slide 270

GET starwars_extended/_analyze { "text": "Father", "analyzer": "my_edgegram_analyzer" }

Slide 271

Slide 271

{ "tokens": [ { "token": "fat", "start_offset": 0, "end_offset": 6, "type": "<ALPHANUM>", "position": 0 }, { "token": "fath", "start_offset": 0, "end_offset": 6, "type": "<ALPHANUM>", "position": 0 }, { "token": "fathe", "start_offset": 0, "end_offset": 6, "type": "<ALPHANUM>", "position": 0 }, { "token": "father", "start_offset": 0, "end_offset": 6, "type": "<ALPHANUM>", "position": 0 } ] }
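The tokens above can be reproduced with a few lines. This sketch compares edge grams (min_gram 3, max_gram 10 as configured in my_edgegram_filter) with plain trigrams; the function names are assumptions for illustration.

```python
def edge_ngrams(token, min_gram=3, max_gram=10):
    """Prefixes of the token, from min_gram up to max_gram characters."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def trigrams(token):
    """Every 3-character window of the token."""
    return [token[i:i + 3] for i in range(len(token) - 2)]


print(edge_ngrams("father"))  # ['fat', 'fath', 'fathe', 'father']
print(trigrams("father"))     # ['fat', 'ath', 'the', 'her']
print(trigrams("obiwan"))     # ['obi', 'biw', 'iwa', 'wan']
```

Edge grams only match from the start of a word, which suits search-as-you-type. Trigrams match anywhere inside a word and therefore produce more, and noisier, hits.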

Slide 272

Slide 272

PUT /starwars_extended/_doc/_mapping { "properties": { "quote": { "type": "text", "fields": { "edgegram": { "type": "text", "analyzer": "my_edgegram_analyzer", "search_analyzer": "standard" }, "shingle": { "type": "text", "analyzer": "my_shingle_analyzer" } } } } }

Slide 273

Slide 273

PUT /starwars_extended/_doc/5 { "quote": "I find your lack of faith disturbing." } PUT /starwars_extended/_doc/6 { "quote": "That... is your failure." }

Slide 274

Slide 274

GET /starwars_extended/_doc/5/_termvectors { "fields": [ "quote.edgegram" ], "offsets": true, "payloads": true, "positions": true, "term_statistics": true, "field_statistics": true }

Slide 275

Slide 275

{ "_index": "starwars_v42", "_type": "_doc", "_id": "5", "_version": 1, "found": true, "took": 3, "term_vectors": { "quote.edgegram": { "field_statistics": { "sum_doc_freq": 26, "doc_count": 2, "sum_ttf": 26 }, "terms": { "dis": { "doc_freq": 1, "ttf": 1, "term_freq": 1, "tokens": [ { "position": 6, "start_offset": 26, "end_offset": 36 } ] }, "dist": { "doc_freq": 1, "ttf": 1, ...

Slide 276

Slide 276

POST /starwars_extended/_search { "query": { "match": { "quote": "fail" } } }

Slide 277

Slide 277

POST /starwars_extended/_search { "query": { "match": { "quote.lowercase": "fail" } } }

Slide 278

Slide 278

POST /starwars_extended/_search { "query": { "match": { "quote.full": "fail" } } }

Slide 279

Slide 279

POST /starwars_extended/_search { "query": { "match": { "quote.ngram": "fail" } } }

Slide 280

Slide 280

"hits": [ { "_index": "starwars_v42", "_type": "_doc", "_id": "6", "_score": 1.8400999, "_source": { "quote": "That... is your failure." } }, { "_index": "starwars_v42", "_type": "_doc", "_id": "5", "_score": 1.442779, "_source": { "quote": "I find your lack of faith disturbing." } } ]

Slide 281

Slide 281

POST /starwars_extended/_search { "query": { "match": { "quote.edgegram": "fail" } } }
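Why quote.edgegram uses "search_analyzer": "standard" can be shown with the two new documents. In this simplified sketch the index side produces edge grams while the query side either does the same or keeps the query term whole; the tokenizer is a plain regex split and all helper names are assumptions.

```python
import re


def edge_ngrams(token, min_gram=3, max_gram=10):
    """Prefixes of the token; tokens shorter than min_gram yield nothing."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def tokenize(text):
    """Very rough stand-in for the standard tokenizer plus lowercase filter."""
    return re.findall(r"\w+", text.lower())


def index_terms(text):
    """Terms stored in the index: edge grams of every token."""
    return {gram for token in tokenize(text) for gram in edge_ngrams(token)}


def matches(query, text, edge_ngrams_at_search):
    """True if any query term is found among the indexed terms."""
    if edge_ngrams_at_search:
        query_terms = index_terms(query)
    else:
        query_terms = set(tokenize(query))
    return bool(query_terms & index_terms(text))


docs = {
    "5": "I find your lack of faith disturbing.",
    "6": "That... is your failure.",
}
for doc_id, quote in docs.items():
    print(doc_id,
          "same analyzer at search:", matches("fail", quote, True),
          "| standard at search:", matches("fail", quote, False))
```

If the query is also split into edge grams, "fai" is enough to pull in "faith". Keeping the query term whole means only words that really start with "fail" match.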

Slide 282

Slide 282

"hits": [ { "_index": "starwars_v42", "_type": "_doc", "_id": "6", "_score": 1.0114291, "_source": { "quote": "That... is your failure." } } ]

Slide 283

Slide 283

Updating Missing Fields Expensive

Slide 284

Slide 284

POST /starwars_extended/_update_by_query { "query": { "bool": { "must_not": { "exists": { "field": "quote.edgegram" } } } } }

Slide 285

Slide 285

Shingles: Context Should Matter

Slide 286

Slide 286

POST /starwars_extended/_search { "query": { "bool": { "must": { "match": { "quote.lowercase": "these droids are" } } } } }

Slide 287

Slide 287

"hits": [ { "_index": "starwars_v42", "_type": "_doc", "_id": "1", "_score": 2.1837702, "_source": { "quote": "These are <em>not</em> the droids you are looking for." } }, { "_index": "starwars_v42", "_type": "_doc", "_id": "4", "_score": 2.137744, "_source": { "quote": "These droids are my father's father's machines." } } ]

Slide 288

Slide 288

POST /starwars_extended/_search { "query": { "bool": { "must": { "match": { "quote.shingle": "these droids are" } } } } }

Slide 289

Slide 289

"hits": [ { "_index": "starwars_v42", "_type": "_doc", "_id": "4", "_score": 3.1811738, "_source": { "quote": "These droids are my father's father's machines." } }, { "_index": "starwars_v42", "_type": "_doc", "_id": "1", "_score": 2.6568544, "_source": { "quote": "These are <em>not</em> the droids you are looking for." } } ]

Slide 290

Slide 290

Decompounding Common in German, Scandinavian languages, Finnish, Korean

Slide 291

Slide 291

PUT /decompound_en { "settings": { "number_of_shards": 1, "analysis": { "filter": { "british_decompounder": { "type": "hyphenation_decompounder", "hyphenation_patterns_path": "hyph/en_GB.xml", "word_list": [ "death", "star" ] } }, "analyzer": { "british_decompound": { "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "british_decompounder" ] } } } } }

Slide 292

Slide 292

GET /decompound_en/_analyze { "analyzer" : "british_decompound", "text" : "deathstar" }

Slide 293

Slide 293

{ "tokens": [ { "token": "deathstar", "start_offset": 0, "end_offset": 9, "type": "<ALPHANUM>", "position": 0 }, { "token": "death", "start_offset": 0, "end_offset": 9, "type": "<ALPHANUM>", "position": 0 }, { "token": "star", "start_offset": 0, "end_offset": 9, "type": "<ALPHANUM>", "position": 0 } ] }
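The idea behind this output, keeping the original token and adding the known subwords, can be sketched with a brute-force word list lookup. Note that this is closer to a dictionary decompounder than to the hyphenation-based filter configured above, which uses hyphenation patterns to find candidate split points first; the function is purely illustrative.

```python
def decompound(token, word_list, min_subword_size=2):
    """Return the original token followed by every listed word found inside it."""
    found = []
    for start in range(len(token)):
        for end in range(start + min_subword_size, len(token) + 1):
            part = token[start:end]
            if part in word_list:
                found.append(part)
    return [token] + found


print(decompound("deathstar", {"death", "star"}))
# ['deathstar', 'death', 'star']
```

Checking every substring gets slow and noisy with a large word list, which is one reason to restrict the candidates with hyphenation patterns.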

Slide 294

Slide 294

German Dictionary (LGPL) https://github.com/uschindler/german-decompounder

Slide 295

Slide 295

PUT /decompound_de { "settings": { "number_of_shards": 1, "analysis": { "filter": { "german_decompounder": { "type": "hyphenation_decompounder", "word_list_path": "dictionary-de.txt", "hyphenation_patterns_path": "hyph/de_DR.xml", "only_longest_match": true, "min_subword_size": 4 }, "german_stemmer": { "type": "stemmer", "language": "light_german" } }, "analyzer": { "german_decompound": { "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "german_decompounder", "german_normalization", "german_stemmer" ] } } } } }

Slide 296

Slide 296

GET /decompound_de/_analyze { "analyzer" : "german_decompound", "text" : "Todesstern" }

Slide 297

Slide 297

{ "tokens": [ { "token": "todesst", "start_offset": 0, "end_offset": 10, "type": "<ALPHANUM>", "position": 0 }, { "token": "tod", "start_offset": 0, "end_offset": 10, "type": "<ALPHANUM>", "position": 0 }, { "token": "stern", "start_offset": 0, "end_offset": 10, "type": "<ALPHANUM>", "position": 0 } ] }

Slide 298

Slide 298

Without Word Lists https://github.com/jprante/elasticsearch-analysis-decompound

Slide 299

Slide 299

Performance

Slide 300

Slide 300

Slide 301

Slide 301

Slide 302

Slide 302

Conclusion

Slide 303

Slide 303

Indexing Formatting Tokenize Lowercase, Stop Words, Stemming Synonyms

Slide 304

Slide 304

Scoring Term Frequency Inverse Document Frequency Field-Length Norm Vector Space Model

Slide 305

Slide 305

Advanced Queries Highlighting Suggestions NGrams, Edge Grams Multiple Analyzers

Slide 306

Slide 306

Advanced Queries Reindex & Alias Update by Query Shingles Decompound

Slide 307

Slide 307

Monitor Your Apps with the

Slide 308

Slide 308

Slide 309

Slide 309

Slide 310

Slide 310

Slide 311

Slide 311

Slide 312

Slide 312

Disclaimer I build highly monitored Hello World apps

Slide 313

Slide 313

Agenda Monitor Java (preconfigured) Some Security Monitor PHP (configure yourself)

Slide 314

Slide 314

Code https://github.com/xeraa/microservice-monitoring

Slide 315

Slide 315

Cloud

Slide 316

Slide 316

Slide 317

Slide 317

Slide 318

Slide 318

Slide 319

Slide 319

Java Application

Slide 320

Slide 320

Simple No discovery, load-balancing,...

Slide 321

Slide 321

Slide 322

Slide 322

Monitor Java

Slide 323

Slide 323

Kibana Monitoring Overview of the Elastic Stack components

Slide 324

Slide 324

Metricbeat System [Metricbeat System] Overview and [Metricbeat System] Host overview dashboards See the memory spike every 5min

Slide 325

Slide 325

Time Series Visual Builder Sum of system.memory.actual.used.bytes Sum of system.process.memory.rss.bytes grouped by the term system.process.name and moved to the negative y-axis with a Math step

Slide 326

Slide 326

Slide 327

Slide 327

Packetbeat Call /, /good, /bad, and /foobar [Packetbeat] Overview, [Packetbeat] Flows, [Packetbeat] HTTP, and [Packetbeat] DNS Tunneling dashboards

Slide 328

Slide 328

Packetbeat Raw events in Discover Process enrichment for nginx, Java, and the APM server

Slide 329

Slide 329

Filebeat Modules [Filebeat Nginx] Access and error logs, [Filebeat System] Syslog dashboard, and [Osquery Result] Compliance pack dashboards

Slide 330

Slide 330

Filebeat Raw events in Discover /good: MDC logging under json.name and the context view for one log message meta.* and host.* information

Slide 331

Slide 331

Filebeat /bad and /null: Stacktraces by filtering down on application:java and json.severity:ERROR Visualize json.stack_hash

Slide 332

Slide 332

Slide 333

Slide 333

Heartbeat Heartbeat HTTP monitoring dashboard Stop and start the frontend application while auto refreshing

Slide 334

Slide 334

Metricbeat nginx [Metricbeat Nginx] Overview dashboard

Slide 335

Slide 335

Metricbeat HTTP /health and /metrics endpoints Collected information in Discover

Slide 336

Slide 336

Metricbeat JMX Same data Visualize the heap usage: jolokia.metrics.memory.heap_usage.used divided by the max of jolokia.metrics.memory.heap_usage.max

Slide 337

Slide 337

Annotations Add changes from the events index

Slide 338

Slide 338

Slide 339

Slide 339

Slide 340

Slide 340

Some Security

Slide 341

Slide 341

Filebeat Modules [Filebeat Auditd] Audit Events, [Filebeat System] New users and groups, and [Filebeat System] Sudo commands dashboards

Slide 342

Slide 342

https://github.com/linux-audit "auditd is the userspace component to the Linux Auditing System. It's responsible for writing audit records to the disk. Viewing the logs is done with the ausearch or aureport utilities."

Slide 343

Slide 343

Auditd Monitors File and network access System calls Commands run by a user Security events

Slide 344

Slide 344

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/security_guide/chap-system_auditing

Slide 345

Slide 345

Understanding Logs https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/security_guide/sec-understanding_audit_log_files

Slide 346

Slide 346

Auditbeat [Auditbeat Auditd] Overview dashboard

Slide 347

Slide 347

Fail SSH ssh elastic-user@xeraa.wtf with a bad password [Filebeat System] SSH login attempts dashboard

Slide 348

Slide 348

Success ssh elastic-user@xeraa.wtf with a good password Run service nginx restart and pick the elastic-admin user

Slide 349

Slide 349

Audit Event [Auditbeat Auditd] Executions dashboard filter elastic-user

Slide 350

Slide 350

Audit Event cat /etc/passwd Filter for tags is developers-passwdread in Discover

Slide 351

Slide 351

Power Abuse ssh elastic-admin@xeraa.wtf sudo cat /home/elastic-user/secret.txt Tag power-abuse in Discover

Slide 352

Slide 352

File Integrity Change something in /var/www/html/index.html [Auditbeat File Integrity] Overview dashboard

Slide 353

Slide 353

Monitor PHP

Slide 354

Slide 354

Heartbeat Add HTTP on port 88

Slide 355

Slide 355

Packetbeat Add HTTP on port 88

Slide 356

Slide 356

Metricbeat php-fpm
- module: php_fpm
  metricsets: ["pool"]
  period: 10s
  status_path: "/status"
  hosts: ["http://localhost"]

Slide 357

Slide 357

Filebeat Collect /var/www/html/silverstripe/logs/*.json

Slide 358

Slide 358

More

Slide 359

Slide 359

Alerting Gold License and part of the Elastic Cloud

Slide 360

Slide 360

Slide 361

Slide 361

Machine Learning Anomaly Detection of Time Series Data Platinum License and part of the Elastic Cloud

Slide 362

Slide 362

Slide 363

Slide 363

Conclusion

Slide 364

Slide 364

Slide 365

Slide 365

System metrics & network Filebeat modules & Auditbeat Application logs

Slide 366

Slide 366

Uptime Application metrics Request tracing

Slide 367

Slide 367

Code https://github.com/xeraa/microservice-monitoring

Slide 368

Slide 368

Questions? Philipp Krenn @xeraa PS: Sticker