CRATEDB: A SEARCH ENGINE OR A DATABASE? BOTH!


HOW WE BUILT A SQL DATABASE ON TOP OF ELASTICSEARCH AND LUCENE
Maximilian Michels
@stadtlegende
max@crate.io
mxm@apache.org

WHY ARE WE TALKING ABOUT THIS?
- Traditional databases are well-researched and there are plenty of them (Postgres, MySQL, Oracle)
- Scalable search using these can be tricky
- Search engines are databases optimized for search and scale (Lucene, Solr, Elasticsearch)
- You can't typically use SQL with search engines
- Why not stick with a mature query language standard which everybody knows?

“A scalable SQL database optimized for search without the NoSQL bullshit.”

CRATEDB IN A NUTSHELL
- Since 2014: https://github.com/crate/crate
- Apache 2.0 licensed (community edition)
- Built using Elasticsearch, Lucene, Netty, Antlr
- SQL-99 compatible
- REST / Postgres Wire Protocol / JDBC / Python

WHAT TO EXPECT
What is great about CrateDB
- Easy to set up
- No funny APIs / SQL
- Great scale-out: massive reads / writes
- Container aware
Not so great
- Transactions
- Foreign keys

USING CRATEDB

CRATEDB IS JUST LIKE A SQL DB
- SQL is the only query API

    CREATE TABLE fosdem.speakers (id INT PRIMARY KEY, name STRING);
    CREATE TABLE fosdem.talks (id INT PRIMARY KEY, title STRING, abstract STRING, speaker INT);
    INSERT INTO fosdem.speakers (id, name) VALUES (1, 'max');
    INSERT INTO fosdem.talks (id, title, abstract, speaker) VALUES (1, 'Talk about CrateDB', 'bla', 1);
    SELECT * FROM fosdem.talks t1 LEFT JOIN fosdem.speakers t2 ON t1.speaker = t2.id;

BUT THERE IS MORE
- CrateDB denormalized (no joins necessary)

    CREATE TABLE fosdem.speakers (name STRING, talk OBJECT AS (title STRING, abstract STRING));
    INSERT INTO fosdem.speakers (name, talk) VALUES ('max', {title = 'CrateDB', abstract = 'Lorem ipsum'});
    SELECT talk['title'] AS title FROM fosdem.speakers ORDER BY title;

CLUSTERING / REPLICATION

    CREATE TABLE fosdem.speakers (name STRING, talk OBJECT AS (title STRING, abstract STRING))
    CLUSTERED BY (name) INTO 4 SHARDS;

[Diagram: shards distributed across NODE1, NODE2, NODE3, NODE4]

CLUSTERING / REPLICATION

    CREATE TABLE fosdem.speakers (name STRING, talk OBJECT AS (title STRING, abstract STRING))
    CLUSTERED BY (name) INTO 4 SHARDS
    WITH (number_of_replicas = 1);

[Diagram: primary and replica shards on NODE1, NODE2, NODE3, NODE4]
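To make the two clustering slides concrete, here is a minimal sketch of how a routing column value could select a shard, and how primaries and replicas could be spread over nodes. The CRC32 hash and the round-robin placement are stand-ins chosen for illustration; they are assumptions, not the hash or allocation strategy that CrateDB or Elasticsearch actually use. The names `shard_for` and `place_shards` are hypothetical.

```python
import zlib


def shard_for(routing_value: str, num_shards: int) -> int:
    """Map a routing column value to a shard number.

    Assumption: CRC32 is only a stand-in for the real routing hash.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def place_shards(num_shards: int, num_replicas: int, nodes: list) -> dict:
    """Assign each shard a primary and its replicas, round-robin over nodes.

    Copies of the same shard always land on different nodes, so losing
    one node never loses every copy of a shard.
    """
    if num_replicas + 1 > len(nodes):
        raise ValueError("need at least one node per shard copy")
    placement = {}
    for shard in range(num_shards):
        copies = [nodes[(shard + i) % len(nodes)] for i in range(num_replicas + 1)]
        placement[shard] = {"primary": copies[0], "replicas": copies[1:]}
    return placement


nodes = ["NODE1", "NODE2", "NODE3", "NODE4"]
placement = place_shards(num_shards=4, num_replicas=1, nodes=nodes)
shard = shard_for("max", 4)  # all rows with name 'max' go to this one shard
```

Because the shard is derived only from the routing value, every row with the same `name` ends up in the same shard, which is what `CLUSTERED BY (name)` asks for.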

PARTITIONED TABLES

    CREATE TABLE fosdem.speakers (name STRING, talk OBJECT AS (title STRING, abstract STRING), year INT)
    CLUSTERED BY (name) INTO 4 SHARDS
    PARTITIONED BY (year)
    WITH (number_of_replicas = 1);

[Diagram: primary and replica shards on NODE1, NODE2, NODE3, NODE4]

MORE FEATURES
- Aggregations
- Geo search
- Text analyzers
- UDFs
- Snapshots
- User management
- Schema / table privileges
- SSL encryption
- MQTT ingestion

ARCHITECTURE

ON THE SHOULDERS OF GIANTS
- CrateDB: Distributed SQL execution engine
- Antlr: Parsing of SQL statements
- Netty: REST, Postgres Wire Protocol, web interface
- Lucene: Storage, indexing, queries
- Elasticsearch: Transport, routing, replication

INTRODUCTION TO LUCENE
- Lucene stores documents, which are CrateDB's rows
- Documents have fields

    {
      id: '123',
      name: 'Bob',
      title: 'How I Learned to Stop Worrying and Love the Bomb',
      text: 'Lorem ipsum'
    }

- Fields are indexed for efficient lookup
- Fields have a column store for efficient aggregation
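The last two bullets describe two different access paths over the same documents. The following sketch shows the idea with plain Python containers: an inverted index answers "which documents have this value?", and a column store supports aggregating over one field. It is a toy model under simplifying assumptions, not Lucene's actual data structures or API; the sample documents and function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample documents (rows); only the field layout follows the slide.
docs = [
    {"id": "1", "name": "Bob", "year": 2017},
    {"id": "2", "name": "Alice", "year": 2018},
    {"id": "3", "name": "Bob", "year": 2018},
]


def build_inverted_index(docs, field):
    """Map each field value to the list of document numbers containing it."""
    index = defaultdict(list)
    for doc_no, doc in enumerate(docs):
        index[doc[field]].append(doc_no)
    return dict(index)


def build_column(docs, field):
    """Store one field's values contiguously, addressed by document number."""
    return [doc[field] for doc in docs]


name_index = build_inverted_index(docs, "name")
year_column = build_column(docs, "year")

# Lookup via the index, then aggregate via the column, without
# touching the full documents.
matching = name_index["Bob"]
max_year = max(year_column[doc_no] for doc_no in matching)
```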

INTRODUCTION TO ELASTICSEARCH
- Elasticsearch core concepts revolve around indices, shards, and replicas
- An index is a document store with n parts, called shards
- Each shard has 0 or more replicas, which hold copies of the shard data
- Replicas are not only useful for fault tolerance but also increase search performance

HOW TABLES RELATE TO INDICES AND SHARDS
- Each table in CrateDB is represented by an ES index with a mapping
- Each partition in a partitioned table is represented by an ES index
- Partition indices are created by encoding the partition value in the index name

Mapping excerpt (beginning missing in the transcription):

    {"type":"keyword"},"title":{"type":"keyword"}}}

[Diagram: tables T1, T2, T3; indices t1, t2.day1, t2.day2, t3; each index split into SHARD1 to SHARD4]
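The table-to-index relationship on this slide can be sketched in a few lines. The naming scheme below simply follows the diagram labels (`t2.day1`, `t2.day2`); it is an assumption for illustration, since the slides do not show how CrateDB really encodes partition values into index names. `index_names` and `total_shards` are hypothetical helpers.

```python
def index_names(table, partition_values=None):
    """Return the index names backing a table.

    A non-partitioned table maps to one index; a partitioned table maps
    to one index per partition value. The "<table>.<value>" naming mirrors
    the slide's diagram and is not the real encoding.
    """
    if not partition_values:
        return [table]
    return [f"{table}.{value}" for value in partition_values]


def total_shards(table, shards_per_index, partition_values=None):
    """Every index is split into the same number of shards."""
    return len(index_names(table, partition_values)) * shards_per_index


t1_indices = index_names("t1")
t2_indices = index_names("t2", ["day1", "day2"])
t2_shards = total_shards("t2", 4, ["day1", "day2"])
```

One consequence worth noting: since every partition is its own index with its own shards, the shard count of a partitioned table grows with the number of partitions.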

FROM QUERY TO EXECUTION

    SELECT name, count(*) AS talks FROM fosdem.speakers
    WHERE room = 'hpc' AND year = 2018
    GROUP BY name ORDER BY name;

[Diagram: clients (psql, web, crash, JDBC, Python, Rust, Node) connect to NODE1. Inside NODE1: REST / Postgres -> Parser -> Analyzer -> Planner -> Executor -> Transport (ES) -> Storage (Lucene). NODE1 communicates with NODE2, NODE3, NODE4, NODE5.]
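One way to picture the executor stage for the query above is a two-phase aggregation: each node filters and counts its local rows, and the node that received the query merges the partial counts and sorts them. This is a simplified sketch under that assumption, not CrateDB's actual execution plan; the sample rows, node assignment, and function names are invented for illustration.

```python
from collections import Counter

# Hypothetical rows as they might be stored on two data nodes.
node_rows = {
    "NODE2": [
        {"name": "max", "room": "hpc", "year": 2018},
        {"name": "anna", "room": "hpc", "year": 2018},
        {"name": "max", "room": "k1", "year": 2018},
    ],
    "NODE3": [
        {"name": "max", "room": "hpc", "year": 2018},
        {"name": "anna", "room": "hpc", "year": 2017},
    ],
}


def partial_counts(rows, room, year):
    """Phase 1, run on each node: apply WHERE, then count(*) per name."""
    return Counter(
        row["name"] for row in rows if row["room"] == room and row["year"] == year
    )


def merge_and_sort(partials):
    """Phase 2, run on the handling node: sum partial counts, ORDER BY name."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return sorted(total.items())


partials = [partial_counts(rows, "hpc", 2018) for rows in node_rows.values()]
result = merge_and_sort(partials)
```

Only the small per-name counts travel between nodes, not the rows themselves, which is why this shape of plan scales with the number of nodes.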

ARCHITECTURE HIGHLIGHTS
- Distributed storage / distributed query execution
- Masterless
- Replication
- Only ephemeral storage needed (container aware)
- Optimized for search: indexing of all fields with Lucene (tuneable)

HANDS-ON

WHAT CAN YOU DO WITH CRATEDB?
- Monitoring (IoT, Industry 4.0, cyber security)
- Stream analysis
- Text analysis
- Time series analysis
- Geospatial queries

DEMO: CrateDB Web Interface


CONCLUSION

WHAT WE HAVE LEARNED
- Elasticsearch used Lucene and Netty to build a distributed search engine
- CrateDB used Elasticsearch, Lucene, and Netty to build a distributed SQL database
- CrateDB is perfect when you
  - want or have to use SQL
  - store large amounts of structured or unstructured data
  - have many thousands of queries per second

SEE FOR YOURSELF!
- Try out CrateDB
  - Download from https://crate.io/download/
  - or curl try.crate.io | bash
  - or docker run crate
  - or build from source: https://github.com/crate/crate
- Check out https://crate.io/docs
- Contributions welcome
  - Check out cs/index.rst
  - Check out the issues
- Stack Overflow
- Join our Slack channel

THANK YOU!
Maximilian Michels
@stadtlegende
max@crate.io
mxm@apache.org

