Migrating Teradata To BigQuery - ISmile Technologies

Upload by : Konnor Frawley

Migrating Teradata to BigQuery: Activities and Stages

With the increasing volume, variety and velocity of data, maintaining traditional data warehouses can be expensive and complex. Moreover, sorting and filtering the data that matters for analytics becomes a mammoth task.

Google BigQuery is a cloud enterprise data warehouse. It uses Google's storage and the processing power of Google's infrastructure to run fast SQL queries and deliver scalable database solutions for clients. This serverless data warehouse enables real-time data analysis.

Migrating your complete database means moving the pivot of your business intelligence and your entire data analytics to a new environment, so the migration can be a lengthy and complex task. Transferring to BigQuery means going serverless and NoOps, which lets clients start using BigQuery for storage and processing of data without having to think about warehouse security, disk configuration, load balancing and so on.

[Figure: Google BigQuery Components]

Migrating Teradata to BigQuery is an attractive option because:
– It enables super-fast analytics over petabytes of data
– It allows you to shift from a CAPEX model to an OPEX model, thereby reducing costs
– It doesn't require provisioning storage and computing resources in advance
– It allows streaming ingestion of huge volumes of data
– Owing to BigQuery's column-wise data store, high levels of data compression are obtained
– BigQuery integrates with various BI tools such as Tableau, Looker and MicroStrategy
– It eliminates the traditional ETL process and tools

PRE-MIGRATION

PRE-MIGRATION CONSIDERATIONS
Three important areas need to be considered before migrating the Teradata database.

Functional differences owing to migration
While migrating, there may be differences in functionality. BigQuery has a different architecture from Teradata and may not be able to use the traditional star schema as it stands. Not to worry! BigQuery offers native support, through extensions to standard SQL, for querying nested and repeated data. It also allows direct querying of data held in Google Cloud Storage or Google Drive, and it supports JSON (newline-delimited), CSV and Avro files.

Federated data sources are helpful in such cases:
– You can refine, load and clean the data in one go by querying it from a federated data source (a location outside BigQuery) and then writing the cleaned data into BigQuery storage
– You can also join small amounts of frequently changing data with other tables. Because the data sits in a federated data source, it does not have to be uploaded again every time it is updated
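As a minimal sketch of what nested and repeated data can look like in practice, the Python below folds child rows from a star-schema-style layout into their parent record and serializes the result as newline-delimited JSON, one of the file formats mentioned above. The table and field names (`orders`, `line_items`, `sku`, `quantity`) are hypothetical and chosen only for illustration; the snippet uses the standard library and does not call BigQuery.

```python
import json
from collections import defaultdict


def nest_orders(orders, line_items):
    """Fold line-item rows into their parent order as a repeated field."""
    items_by_order = defaultdict(list)
    for item in line_items:
        items_by_order[item["order_id"]].append(
            {"sku": item["sku"], "quantity": item["quantity"]}
        )
    # Each order gets a "line_items" list; orders without items get [].
    return [
        {**order, "line_items": items_by_order.get(order["order_id"], [])}
        for order in orders
    ]


def to_ndjson(records):
    """Serialize records as newline-delimited JSON (one object per line)."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)


# Hypothetical sample rows standing in for a parent table and a child table.
orders = [
    {"order_id": 1, "customer": "acme"},
    {"order_id": 2, "customer": "globex"},
]
line_items = [
    {"order_id": 1, "sku": "A-100", "quantity": 2},
    {"order_id": 1, "sku": "B-200", "quantity": 1},
    {"order_id": 2, "sku": "A-100", "quantity": 5},
]

nested = nest_orders(orders, line_items)
ndjson = to_ndjson(nested)
```

Each line of `ndjson` then describes one order together with its items, so a query over the loaded table would not need a join to reach them.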

Usage and billing
Your usage and query performance are reflected in productivity. It is important to determine whether the solution will let you scale with data volume and query complexity. BigQuery employs the concept of slots to execute SQL queries. A slot is a unit of analytical computing, essentially a virtual CPU and RAM. BigQuery automatically determines the number of slots for each query depending on the query's size and complexity. By default, 2,000 slots are afforded per project, and if consumption goes beyond that the allocation limit is automatically raised. If your SLA demands more slots, or reserved slots, the flat-rate billing model would be the better fit. (Refer to the on-demand and flat-rate pricing models for slots.) Integrating Google Stackdriver with BigQuery helps you monitor slot availability as well as allocations.

BigQuery services are billed on the amount of data stored and processed:
– Storage and processing are billed separately, and there are pricing tiers for storage. This eliminates the need to plan for volume discounts
– The processing cost is based only on the data scanned. So, if we can identify the smallest set of columns required to produce the results, we can reduce the processing costs
– An extra charge is levied if the API is used for streaming ingestion into BigQuery in real time
– Cost control tools such as billing alerts and custom quotas can be used to track and limit the resource usage cost per project
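The effect of selecting fewer columns can be illustrated with some simple arithmetic. The sketch below assumes a hypothetical table whose per-column sizes are known and a placeholder on-demand price per TiB scanned; both the sizes and `PRICE_PER_TIB` are made-up values for illustration and should be replaced with figures from your own tables and the current price list.

```python
TIB = 1024 ** 4  # bytes in one tebibyte


def estimate_scan_cost(column_bytes, selected_columns, price_per_tib):
    """Return (bytes scanned, estimated cost) for a query reading the given columns."""
    scanned = sum(column_bytes[name] for name in selected_columns)
    return scanned, scanned / TIB * price_per_tib


# Hypothetical per-column storage sizes for one table.
column_bytes = {
    "order_id": 2 * TIB,
    "customer": 3 * TIB,
    "notes": 15 * TIB,
}
PRICE_PER_TIB = 5.0  # placeholder price, not an official figure

# Reading every column (the equivalent of SELECT *).
all_bytes, all_cost = estimate_scan_cost(column_bytes, column_bytes.keys(), PRICE_PER_TIB)

# Reading only the two columns the report actually needs.
narrow_bytes, narrow_cost = estimate_scan_cost(
    column_bytes, ["order_id", "customer"], PRICE_PER_TIB
)
```

With these made-up numbers, dropping the wide `notes` column cuts the scanned volume from 20 TiB to 5 TiB, and the estimated cost falls in the same proportion.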

Data Transformations for Migration
Certain data transformations may be required for the source data to become fully compatible with the target schema. For this you can create a views database in Teradata containing all the views and macros needed, with one view for each table to be migrated. The views database converts data types so that they are compatible with the BigQuery data schema. The data is then loaded into temporary staging tables in BigQuery, where it is parsed using JavaScript UDFs or BigQuery's built-in functions. Finally, the data is loaded into the target schema.

Data Ingestion
Knowing how to make data ingestion work in the new environment is an essential part of a Teradata migration to BigQuery. There are multiple options for loading data into BigQuery:
– Using the BigQuery API
– Loading from JSON, CSV and Avro files on GCS
– Using the API client libraries to stream data into BigQuery
– Using Google Cloud Dataflow
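The data type conversion step can be sketched as a lookup table that turns Teradata column definitions into a BigQuery-style schema. The mapping below is an illustrative assumption covering only common types; it is not taken from the original whitepaper and should be checked against the official type-mapping documentation before use. The function names are hypothetical.

```python
import re

# Illustrative mapping only; verify against the official documentation.
TERADATA_TO_BIGQUERY = {
    "BYTEINT": "INT64",
    "SMALLINT": "INT64",
    "INTEGER": "INT64",
    "BIGINT": "INT64",
    "DECIMAL": "NUMERIC",
    "NUMERIC": "NUMERIC",
    "FLOAT": "FLOAT64",
    "CHAR": "STRING",
    "VARCHAR": "STRING",
    "CLOB": "STRING",
    "BYTE": "BYTES",
    "VARBYTE": "BYTES",
    "BLOB": "BYTES",
    "DATE": "DATE",
    "TIME": "TIME",
    "TIMESTAMP": "TIMESTAMP",
}


def map_type(teradata_type):
    """Map a Teradata type such as 'DECIMAL(18,2)' to a BigQuery type name."""
    # Strip length/precision arguments, e.g. VARCHAR(100) -> VARCHAR.
    base = re.sub(r"\(.*\)", "", teradata_type).strip().upper()
    try:
        return TERADATA_TO_BIGQUERY[base]
    except KeyError:
        raise ValueError(f"No mapping defined for Teradata type {teradata_type!r}") from None


def to_bigquery_schema(columns):
    """Build a list of schema fields from (name, Teradata type) pairs."""
    return [
        {"name": name, "type": map_type(td_type), "mode": "NULLABLE"}
        for name, td_type in columns
    ]


# Hypothetical source table definition.
schema = to_bigquery_schema(
    [("id", "INTEGER"), ("amount", "DECIMAL(18,2)"), ("name", "VARCHAR(100)")]
)
```

Raising an error for unmapped types, rather than silently defaulting to `STRING`, makes it easier to spot columns that need a hand-written view or a UDF in the staging step.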

Data Transfer Modes
There are three options for setting up data transfer from Teradata to BigQuery:
– Extraction method: JDBC coupled with TPT (Teradata Parallel Transporter) or FastExport. In this mode a table is extracted into Avro files at a specified location on the local file system. The extracted files are uploaded to the storage bucket and, once transferred, deleted from the local file system
– Automatic schema conversion: the BigQuery Data Transfer Service enables automatic schema detection and conversion of data during transfer
– On-demand or incremental transfers: you have the option of a snapshot transfer, or a recurring (periodic) data transfer, from the Teradata database instance using the BigQuery Data Transfer Service
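The extract, upload and delete sequence of the extraction method can be sketched as follows. This is not the migration agent's implementation: the `upload` callable is a stand-in for whatever actually copies a file to the storage bucket, a second temporary directory plays the role of the bucket, and the `.avro` files contain placeholder bytes rather than real Avro data.

```python
import shutil
import tempfile
from pathlib import Path


def transfer_and_clean(staging_dir, upload):
    """Upload each extracted .avro file, deleting the local copy only after success."""
    uploaded = []
    for path in sorted(Path(staging_dir).glob("*.avro")):
        upload(path)   # if this raises, the local file is left in place for a retry
        path.unlink()  # remove the local copy once it has been transferred
        uploaded.append(path.name)
    return uploaded


# Demo with temporary directories standing in for local staging and the bucket.
staging = Path(tempfile.mkdtemp())
bucket = Path(tempfile.mkdtemp())
for name in ("orders_000.avro", "orders_001.avro"):
    (staging / name).write_bytes(b"placeholder")

uploaded = transfer_and_clean(staging, lambda p: shutil.copy(p, bucket / p.name))
```

Deleting only after a successful upload means an interrupted transfer can be resumed from the files still present in the staging directory.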

PRE-MIGRATION ACTIVITIES

PREPARE AND DISCOVER
At this stage you and your stakeholders discover the use cases for migrating Teradata to BigQuery and analyse the expected benefits of the migration. The benefits can range from reducing the total cost of ownership to productivity gains and lower data usage costs. The stakeholders also need to agree on the key questions the migration has to answer at this stage.

ASSESS AND PLAN
At this stage, you assess the state of your existing infrastructure, prioritize the use cases, design the proof of concept, choose your migration partner, estimate costs, and set your internal success metrics for data performance on BigQuery. The input from the discovery phase is taken into consideration when planning the migration.

The planners need to consider the following questions at this stage:
– How to ensure that the data loaded into the new environment is identical to the on-premises data?
– How to manage disruptions while switching over to BigQuery?
– What should be the architecture of the BigQuery data pipeline?
– How to integrate reporting and BI tools with BigQuery?
– What would be the usage billing and how to monitor it?
– What would be the licensing and infrastructure requirements?

MIGRATION PHASE
Here you need to:
– Set the configurations
– Download the migration agent
– Enable the required APIs

For the BigQuery Data Transfer Service, you need to enable the following APIs:

    gcloud services enable bigquery-json.googleapis.com
    gcloud services enable storage-api.googleapis.com
    gcloud services enable storage-component.googleapis.com
    gcloud services enable pubsub.googleapis.com
    gcloud services enable bigquerydatatransfer.googleapis.com

You then need to:
– Create the BigQuery datasets
– Fulfill the on-premises requirements (local machine requirements, Teradata connection details)
– Create a service account for the BigQuery Data Transfer Service
– Bind the service account to the Storage Admin, BigQuery Admin and Pub/Sub Admin roles
– Accept the licensing terms for downloading the Teradata Express VM image, Teradata tools and JDBC drivers from Teradata's support site
– Select the modes of data transfer
– Select the appropriate tool for the transfer
– Prioritize the data sets, applications and other assets for migration
– Configure the Google Cloud environment
– Initialize and run the migration agent
– Monitor the progress of the migration
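Binding the service account to the three admin roles can be scripted. The sketch below only builds and prints the `gcloud projects add-iam-policy-binding` commands rather than running them; the project ID and service account email are hypothetical placeholders, and the role identifiers are assumed to correspond to the Storage Admin, BigQuery Admin and Pub/Sub Admin roles named in the list above.

```python
import shlex

# Role IDs assumed to match the three admin roles named in the text.
ROLES = ["roles/storage.admin", "roles/bigquery.admin", "roles/pubsub.admin"]


def binding_commands(project_id, service_account_email):
    """Build one add-iam-policy-binding command per role for the service account."""
    member = f"serviceAccount:{service_account_email}"
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}",
            f"--role={role}",
        ]
        for role in ROLES
    ]


# Hypothetical project and service account names.
commands = binding_commands(
    "my-project", "td-transfer@my-project.iam.gserviceaccount.com"
)
for command in commands:
    print(shlex.join(command))
```

Keeping the commands as argument lists makes it straightforward to hand them to `subprocess.run` once the printed output has been reviewed. Admin roles are broad, so it is worth checking whether narrower roles would satisfy your security policy.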

To avoid disrupting workflows during the migration, you can use an architecture in which your existing data warehouse and BigQuery run side by side, with both ingesting data, providing access to users and running the applications.

[Figure: architecture for using the data warehouse and BigQuery simultaneously]

POST MIGRATION
Test and validate the data to ensure that it is consistent, and confirm that the applications deliver the expected performance.
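One simple way to check consistency is to compare a row count and an order-independent checksum computed over the same table on both sides. The sketch below is an assumption about how such a check could look, not a method prescribed by the original text: rows are plain tuples as a database driver might return them, and `table_fingerprint` is a hypothetical helper.

```python
import hashlib

MODULUS = 2 ** 256  # keeps the running checksum at a fixed size


def table_fingerprint(rows):
    """Return (row count, checksum); the checksum does not depend on row order."""
    count = 0
    checksum = 0
    for row in rows:
        # Join the values with a separator unlikely to appear in data;
        # NULLs get their own marker so they differ from empty strings.
        encoded = "\x1f".join(
            "\x00" if value is None else str(value) for value in row
        ).encode("utf-8")
        row_hash = int.from_bytes(hashlib.sha256(encoded).digest(), "big")
        checksum = (checksum + row_hash) % MODULUS
        count += 1
    return count, checksum


# Hypothetical rows fetched from the source and target systems.
source_rows = [(1, "a", None), (2, "b", 3.5)]
target_rows = [(2, "b", 3.5), (1, "a", None)]

tables_match = table_fingerprint(source_rows) == table_fingerprint(target_rows)
```

Because values are compared through their string form, differences in how the two systems render the same value (for example trailing zeros in decimals or timestamp precision) would show up as mismatches and may need normalizing first.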

[Figure: The complete framework of migration]

About us
Ismile Technologies is a global technology services company that helps businesses compete by adopting disruptive technologies such as advanced analytics, big data, cloud, databases, DevOps, and infrastructure management to advance innovation and increase agility. Specializing in designing, implementing, and managing systems that directly contribute to revenue growth and business success, Ismile Technologies' highly skilled technical teams work as an integrated extension of our clients' organizations to deliver solutions that enable the strategic use of data, accelerate software delivery, and ensure reliable, scalable IT systems.

Connect With Us
service@iSmileTechnologies.com
(732)
www.youtube.com/channel/UCAIYEyaJeOeCk-wVBDY4-0w
