Tableau Spark SQL Setup Instructions


Contents

1. Prerequisites
2. Configuring Hive
3. Configuring Spark & Hive
4. Starting the Spark Service and the Spark Thrift Server
5. Connecting Tableau to Spark SQL
   5A. Install Tableau DevBuild 8.3.1
   5B. Install the Spark SQL ODBC
   5C. Opening a Spark SQL ODBC Connection
6. Appendix: Spark SQL 1.1.x Installation Steps
   6A. Pre-Requisites
   6B. Apache Hadoop Install
       Install Java
       Install Hadoop
       Edit Config Files
       Start Hadoop and format namenode
       Create HDFS directories
       Install PostgreSQL
       Install and configure Hive
       Configure PostgreSQL as Hive metastore
       Start metastore and hiveserver2 services
       To Shutdown Hadoop
       To Start Hadoop
       Loading TestV1 v2 data via Beeline
       Spark SQL 1.1.x Install

1. Prerequisites

There are a number of prerequisites required to be able to run Tableau with Spark SQL. The main requirements are:

Server side:
- Spark V1.2 - please use the 1.2 branch at https://github.com/apache/spark/tree/branch-1.2
- Hadoop V2.4 or higher
- Hive V0.12 or V0.13

Client side:
- Tableau 8.3.1
- Simba Spark ODBC Driver V1.0.4: http://databricks.com/spark-odbc-driver-download

2. Configuring Hive

There are no special Hive configurations when used with Spark SQL. If you are installing from scratch, you can follow the Appendix 6B steps for our sample Spark cluster configuration.

3. Configuring Spark & Hive

There are no special Spark configurations; the defaults will get you up and running. See Appendix 6B for our sample cluster configuration.

4. Starting the Spark Service and the Spark Thrift Server

Verify that HiveServer2 is running and that you are using PostgreSQL or MySQL as the metastore, then run the following:

  $SPARK_HOME/sbin/start-master.sh
  $SPARK_HOME/sbin/start-slaves.sh
  $SPARK_HOME/sbin/start-thriftserver.sh --master spark://localhost:7077 \
      --driver-class-path $CLASSPATH \
      --hiveconf hive.server2.thrift.bind.host=localhost \
      --hiveconf hive.server2.thrift.port=10001

Note: we chose port 10001 arbitrarily.
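Before connecting Tableau, it can be worth confirming that the Thrift Server actually answers SQL on that port. A minimal check, using the Beeline client bundled with Spark and the port 10001 chosen above (the user name here is arbitrary):

  # Connect to the Spark Thrift Server over JDBC with the bundled Beeline client
  $SPARK_HOME/bin/beeline -u jdbc:hive2://localhost:10001 -n anonymous
  # At the beeline prompt, run a quick sanity query:
  #   show databases;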
5. Connecting Tableau to Spark SQL

5A. Install Tableau DevBuild 8.3.1

The first thing you must do is install the latest version of Tableau - anything 8.3.1 or later should work. The Spark SQL connection will be hidden in the product unless you install a special license key; please e-mail Jackie Clough if you do not have it. To install a new key, go to Help - Manage Product Keys.
5B. Install the Spark SQL ODBC

To install the Spark SQL ODBC driver, simply open the appropriate version of the driver for your system and follow the instructions:

- Windows 64-bit: SimbaSparkODBC64.msi
- Windows 32-bit: SimbaSparkODBC32.msi
- Mac OS X: SimbaSparkODBC.dmg

When installing the Mac driver, you may get a message that says "SimbaSparkODBC.dmg can't be opened because it is from an unidentified developer." To allow the driver to be installed, go to Applications - System Preferences - Security & Privacy - General tab and click Open Anyway.

5C. Opening a Spark SQL ODBC Connection

If you have properly installed Tableau and the special license key, you should see Spark SQL (Beta) as one of the connection options after clicking Connect to Data. Select Spark SQL (Beta) and you will see a dialog box similar to the one below:

The parameters you need to enter include:

- Server: server name or IP address of your Spark server
- Port: your Spark Thrift Server port (10001 in the example above)
- Type: Spark ThriftServer (Spark 1.1 and later)
- Authentication: User Name
- User Name: leave blank
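If you prefer to pre-register the connection rather than type these values into the dialog each time, the Simba driver can also be configured as an ODBC DSN. A sketch of what that might look like on Mac OS X; the key names and driver path are assumptions based on the Simba Spark ODBC driver's conventions, so verify them against the driver documentation:

  # Hypothetical DSN entry for the Simba Spark ODBC driver (key names assumed;
  # SparkServerType=3 selects the Spark Thrift Server, AuthMech=2 is User Name auth)
  cat >> ~/Library/ODBC/odbc.ini <<'EOF'
  [Spark SQL]
  Driver=/Library/simba/spark/lib/libsparkodbc_sbu.dylib
  HOST=localhost
  PORT=10001
  SparkServerType=3
  AuthMech=2
  UID=anonymous
  EOF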

6. Appendix: Spark SQL 1.1.x Installation Steps

6A. Pre-Requisites

Sample configuration:

- OS: CentOS 6.5
- CPU: 2 dual core
- RAM: 16GB

6B. Apache Hadoop Install

Install Java:

Copy/scp the Java rpm file jdk-7u25-linux-x64.rpm to /tmp and extract/install it with rpm:

  rpm -Uvh jdk-7u25-linux-x64.rpm

Set environment variables:

  export JAVA_HOME=/usr/java/jdk1.7.0_25/

Verify the Java version:

  java -version

Install Hadoop:

  ssh-keygen -t rsa -P ""
  cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  ssh localhost
  ssh <actual server name>
  useradd hadoop
  cd /
  wget http://archive.apache.org/dist/hadoop/common/hadoop-2.4.1/hadoop-2.4.1.tar.gz
  tar xzvf hadoop-2.4.1.tar.gz
  chown -R hadoop:hadoop /hadoop-2.4.1
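Before moving on to the config files, a quick sanity pass can save debugging later. A minimal sketch, assuming the install paths used above:

  # Passwordless ssh should log in without prompting for a password
  ssh localhost 'echo ssh OK'
  # The JDK and the Hadoop build should both report their versions
  /usr/java/jdk1.7.0_25/bin/java -version
  /hadoop-2.4.1/bin/hadoop version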
Edit Config Files:

Edit the following config files located in /hadoop-2.4.1/etc/hadoop. These will vary depending on your environment but are provided here as a sample:
core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
    <final>true</final>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop_data</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>

mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

yarn-site.xml:

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <!-- To increase number of apps that can run in YARN -->
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>512</value>
  </property>
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>

In addition, add the following environment variables to ~/.bashrc:

  export JAVA_HOME=/usr/java/jdk1.7.0_25
  export HADOOP_PREFIX=/hadoop-2.4.1
  export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
  export YARN_CONF_DIR=$HADOOP_CONF_DIR
  export PATH=$PATH:$HADOOP_PREFIX/bin
  export HADOOP_INSTALL=/hadoop-2.4.1
  export HADOOP_HOME=/hadoop-2.4.1
  export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
  export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/"
  export HIVE_HOME=/usr/local/hive-0.12.0/
  export PATH=$PATH:$HIVE_HOME/bin
  export SPARK_MASTER_PORT=7077

Start Hadoop and format namenode:

  /hadoop-2.4.1/bin/hdfs namenode -format
  /hadoop-2.4.1/sbin/start-dfs.sh
  /hadoop-2.4.1/sbin/start-yarn.sh
  /hadoop-2.4.1/sbin/mr-jobhistory-daemon.sh start historyserver

Create HDFS directories:

  hdfs dfs -mkdir -p /user/root
  hdfs dfs -mkdir -p /user/hive
  hdfs dfs -mkdir -p /user/hive/metastore
  hdfs dfs -mkdir -p /user/anonymous
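A quick way to confirm the HDFS directories were created as expected (a sketch, assuming the paths above and the Hadoop binaries on PATH via ~/.bashrc):

  # List the directories just created in HDFS
  hdfs dfs -ls /user
  hdfs dfs -ls /user/hive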

Install PostgreSQL:

Edit /etc/yum.repos.d/CentOS-Base.repo by adding "exclude=postgresql*" to the [base] and [updates] sections, then:

  wget http://yum.postgresql.org/9.3/redhat/rhel-6-x86_64/pgdg-centos93-9.3-1.noarch.rpm
  rpm -Uvh pgdg-centos93-9.3-1.noarch.rpm
  yum install postgresql93-server
  service postgresql-9.3 initdb
  chkconfig postgresql-9.3 on
  service postgresql-9.3 start

Install and configure Hive:

  cd /usr/local
  wget http://archive.apache.org/dist/hive/hive-0.12.0/hive-0.12.0.tar.gz
  tar xvf hive-0.12.0.tar.gz

Configure /usr/local/hive-0.12.0/conf/hive-site.xml to look something like this:

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:postgresql://localhost/metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.postgresql.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>mypassword</value>
  </property>
  <property>
    <name>datanucleus.autoCreateSchema</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
    <description>IP address (or fully-qualified domain name) and port of the metastore host</description>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/metastore</value>
  </property>
</configuration>

Configure PostgreSQL as Hive metastore:

Set the following in /var/lib/pgsql/9.3/data/postgresql.conf:

  standard_conforming_strings = off
  listen_addresses = '*'

Allow remote access by adding the following to /var/lib/pgsql/9.3/data/pg_hba.conf under the IPv6 section:

  host    all    all    0.0.0.0    0.0.0.0    password

Restart the service:

  service postgresql-9.3 restart

Install the PostgreSQL JDBC driver:

  yum install postgresql-jdbc
  ln -s /usr/share/java/postgresql-jdbc.jar /usr/local/hive/lib/postgresql-jdbc.jar

Create the metastore database and user account:

  su - postgres
  psql
  CREATE USER hiveuser WITH PASSWORD 'password';
  CREATE DATABASE metastore;
  \c metastore
  \i /usr/local/hive-0.12.0/scripts/metastore/upgrade/postgres/hive-schema-0.12.0.postgres.sql
  \o /tmp/grant-privs
  SELECT 'GRANT SELECT,INSERT,UPDATE,DELETE ON "' || schemaname || '"."' || tablename || '" TO hiveuser;'
  FROM pg_tables
  WHERE tableowner = CURRENT_USER AND schemaname = 'public';
  \o
  \i /tmp/grant-privs

Verify the connection with the hive user:

  psql -h myhost -U hiveuser -d metastore

Create a softlink to the hive-site.xml file:

  ln -s /hadoop-2.4.1/etc/hadoop/hive-site.xml /usr/local/hive-0.12.0/conf/hive-site.xml
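Once the schema script and grants have run, you can also confirm that the metastore tables actually landed in PostgreSQL. A minimal check, assuming the database and user created above:

  # List the Hive metastore tables as the hive user (prompts for the password)
  psql -h localhost -U hiveuser -d metastore -c '\dt'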

To Shutdown Hadoop:

  /hadoop-2.4.1/sbin/mr-jobhistory-daemon.ssh stop historyserver
  /hadoop-2.4.1/sbin/stop-yarn.sh
  /hadoop-2.4.1/sbin/stop-dfs.sh

To Start Hadoop:

  /hadoop-2.4.1/sbin/start-dfs.sh
  /hadoop-2.4.1/sbin/start-yarn.sh
  /hadoop-2.4.1/sbin/mr-jobhistory-daemon.sh start historyserver

Start metastore and hiveserver2 services:

  mkdir /var/log/hive
  nohup hive --service metastore > /var/log/hive/metastore.log &
  nohup hive --service hiveserver2 > /var/log/hive/hiveserver2.log &
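Before moving on to the Spark build, it may be worth confirming that HiveServer2 is answering on its default port (10000). A sketch using the Beeline client that ships with Hive 0.12; the user name is an assumption:

  # Connect to HiveServer2 with Beeline and run a sanity query
  $HIVE_HOME/bin/beeline -u jdbc:hive2://localhost:10000 -n hive
  # At the prompt:
  #   show databases;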
Spark SQL 1.1.x Install

Build Spark from source with Maven:

  cd /opt
  wget http://archive.apache.org/dist/maven/maven-3/3.2.3/binaries/apache-maven-3.2.3-bin.tar.gz
  tar xvf apache-maven-3.2.3-bin.tar.gz
  mv apache-maven-3.2.3 /opt/maven
  ln -s /opt/maven/bin/mvn /usr/bin/mvn
  vim /etc/profile.d/maven.sh

Add the following contents:

  #!/bin/bash
  MAVEN_HOME=/opt/maven
  PATH=$MAVEN_HOME/bin:$PATH
  export PATH MAVEN_HOME
  export CLASSPATH=.

Save and close the file. Make it executable using the following command:

  chmod +x /etc/profile.d/maven.sh

Then, set the environment variables permanently by running the following command:

  source /etc/profile.d/maven.sh

Get the Spark source code:

  mkdir /usr/local/spark-1.1.x-bin-hadoop2.4
  cd /usr/local/spark-1.1.x-bin-hadoop2.4
  wget https://github.com/apache/spark/archive/master.zip
  unzip master.zip
  mv spark-master/* /usr/local/spark-1.1.x-bin-hadoop2.4/
  cd /usr/local/spark-1.1.x-bin-hadoop2.4
  export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
  mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -DskipTests clean package

Wait for the compiler to finish.

Configure Spark: the following is optional, as the default settings work just fine. Edit conf/spark-env.sh and add the following under the YARN configurations; the values will vary depending on your environment:

  SPARK_EXECUTOR_CORES=4    # Number of cores for the workers (Default: 1)
  SPARK_EXECUTOR_MEMORY=4G  # Memory per Worker (e.g. 1000M, 2G) (Default: 1G)
  SPARK_DRIVER_MEMORY=4G    # Memory for Master (e.g. 1000M, 2G) (Default: 512 MB)

Starting Spark: to start the Spark master/worker and the Hive Thrift Server connector, run the following:

  $SPARK_HOME/sbin/start-master.sh
  $SPARK_HOME/sbin/start-slaves.sh
  $SPARK_HOME/sbin/start-thriftserver.sh --master spark://localhost:7077 \
      --driver-class-path $CLASSPATH \
      --hiveconf hive.server2.thrift.bind.host=localhost \
      --hiveconf hive.server2.thrift.port=10001
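To confirm the daemons came up, jps (part of the JDK) lists the running JVM processes. The expected process names below are what Spark's standalone scripts typically register, so treat them as a guideline rather than a guarantee:

  # Verify the Spark standalone daemons are running
  jps
  # Expect to see something like:
  #   Master        (from start-master.sh)
  #   Worker        (from start-slaves.sh)
  #   SparkSubmit   (the Thrift Server runs via spark-submit)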

