IAIT Test Talend Integration Cloud


Test: Talend Integration Cloud
Cloud-based data integration solution with high-performance development environment
Dr. Götz Güttich

With the Talend Integration Cloud, Talend provides a secure Cloud-based integration platform with which users can connect their applications with one another in the Cloud or on premises and transfer data between them. The solution offers graphical development tools that can be used in the browser as well as ready-made integration actions, flow templates, components and connectors. These make data integration easy. For more demanding development tasks, Talend Studio for Cloud is also available. It runs locally on the user's computer and communicates directly with the Cloud. We took a look at what it's like to work with Talend's Cloud in the test laboratory.

The Talend Integration Cloud is not only suited to transferring data from one system to another (e.g. from an on-premises database to Salesforce); it can also be used to reorganize information (e.g. breaking down a name field into first and last names) and enrich it (e.g. by automatically filling in postal codes based on city and street names). In addition, the solution can be used to standardize and deduplicate data.

The product essentially consists of two components. The first is the Cloud solution itself, which offers users various features such as browser-based actions for defining their data sources, data sinks, flows, mappings, etc. With the Web interface, it is also possible to test the flows generated (i.e. data transfers) and to take them live. In addition, there is the option to monitor the activities carried out and manage the existing actions, i.e. the components of which a flow is made up. A scheduler that ensures that specific jobs automatically run at specific times rounds off the range of services of the Web interface.

The second component is the Talend Studio for Cloud.
This is an Eclipse-based development environment that gives users the ability to develop components for the Talend Integration Cloud that do not yet exist in the Web interface. This makes it possible, for instance, to develop new [...] sources for which there is not yet a connector, or to implement new functions. The newly developed software solutions can then be directly uploaded to the Cloud where they can be used. With Talend Exchange, there is also a platform via which users can make their developments available to other users in the network. Studio offers more than 900 connectors and components out of the box with which the relevant staff members can simplify their data integration projects.

[...] Redshift, SAP and Salesforce, among others. The solution also works with relational databases such as Oracle, MySQL or the Microsoft SQL Server. The same applies for non SQL-based databases such as Cassandra and MongoDB, as well as data warehouses such as AWS Redshift, Teradata and ExaSol.

[...] in the Integration Cloud, Talend also offers a remote engine. This is a software program that is installed on a server in the corporate network and handles the task of processing integration flows. This makes sense if large volumes of data need to be transferred to the Cloud or if data to be processed by the Cloud solution are not allowed to leave the company.

With the Talend Integration Cloud, users are able to define integration flows that run from one Cloud service to another and to generate hybrid flows that are used as controlled integration services. Shifting workloads from the on-premises servers to a secure Cloud environment with Cloud-to-Cloud and Cloud-to-ground connectivity makes it possible to outsource a number of tasks, which frees up resources in the corporate network.

Talend provides its Cloud customers at all times with the Cloud resources they need to perform their tasks, including in the big data environment. All security requirements at the corporate level are met in the process.

The test

For the test, Talend provided us with Cloud access of the elastic edition type. This was designed for enterprise use in companies and in the Cloud and boasts all the features made available by the Talend Integration Cloud. For environments with fewer requirements, Talend also provides the hybrid and SaaS editions. These are aimed at users wanting to run Cloud-to-ground and B2B integration (hybrid) or ones that make do with pure Cloud-to-Cloud connectivity (SaaS). Consequently, these editions do not offer the full range of services of the solution, but they are less expensive.

After receiving the access credentials for our Cloud account, we first got acquainted with the solution's Web interface and worked through the examples that Talend provides for new users. We then went on to transfer data from one system to another. In the process, we kept an eye on the development and work burden as well as user guidance during ongoing operation.

In the next step, we installed a remote engine probe at our site in the LAN and used it to process the flows generated up to that point. Finally, we used the Talend Studio for Cloud to develop our own components, upload them to the Cloud and use them in our flows.

Figure: The Talend Integration Cloud welcome screen with the introductory [...]

Starting work

In order to be able to work with the Talend Integration Cloud, users need to have a current browser. Officially, Talend supports Internet Explorer 10 and 11, Firefox 38 or newer, Chrome 41 or newer and Apple Safari 8. With regard to mobile devices, Talend recommends working with iPad 3 with iOS 7 or newer, the Samsung Galaxy Tab 3 10.1 with Android 4.2.x or the Samsung Galaxy Note 10.1 with Android 4.4.

The Studio runs on workstations under Linux, MacOS and Windows. The software requires Java as well as at least three GB of RAM (four GB are recommended) and more than three GB of storage on the hard drive. The remote engine can also be used under Linux, MacOS and Windows. In terms of hardware requirements, a standard server should be more than sufficient for most requirements.

After logging into the Talend Integration Cloud's Web interface, the user first sees a welcome screen [...] about the integration environment. These English-language videos provide an overview of the structure and concept of the Cloud solution, show by way of examples how a flow is set up, demonstrate the installation and start-up of a remote engine and provide a glimpse into working with the Talend Studio for Cloud.

Figure: Comprehensive help texts that appear the first time a page is called up ensure that users can quickly and efficiently work with the Cloud solution.

In addition, on this page there are also examples that make it easier to start working with the product. Current news, an activity overview and a reference to Talend Exchange, the online platform for exchanging components developed by users, round off the range of services on the welcome page.

Incidentally, when a user calls up a page for the first time, the Cloud solution shows him or her comprehensive help texts that provide information on how to work properly with the interface. This enables even beginners to use the environment efficiently.

The second page of the Web interface is particularly interesting. It is called "flows" and includes all the data flows present in the system. The Talend Integration Cloud always provides users with two workspaces.
The projects saved under "Personal" only belong to a given user; the data located under "Shared" can also be used by other company employees. Folders can be created in the workspaces to better organize the environment.

If users switch to a flow entry, the system shows how often the flow has been processed, whether it is active at the moment and whether everything went off smoothly or if there were any rejections or failures. The run details, i.e. the data source used, the target environment and the schedules for automatic runs, can be viewed in the same place. In addition, users have the option to start and stop runs.

The Flow Builder

The so-called Flow Builder is available to process the flows. This tool makes it possible for users to add data sources and sinks to the flow and integrate functions such as mappers or even data standardizers. This makes it possible to adapt data to the format of the target system and to supplement it with information such as the previously mentioned postal codes.

Since the Flow Builder is the Integration Cloud's web-based development environment, it makes sense here to explore in more detail how to work with the tool. If a user wants to generate a new flow, then he or she first has the option to give the flow a name. At this point during the test, we wanted to upload the content of an Excel table into our Salesforce account.

We thus called the flow "Migration from Excel to Salesforce" and took advantage of the opportunity to give it a short description as well. Then under "Choose a Source" we selected our Excel file as a data source. Since this file was in a Dropbox account, we selected the entry "dropbox file download source" as a source icon and provided our source with the file path in Dropbox and the corresponding Dropbox access token.

Then it was time to convert the information from the Excel file so that the system could modify it. To do this, we used the entry "xlsx file toColumns process step". This converted, as the name already says, the entries present in the Excel file to columns. At this point, we could already see under "Preview Data" which data were present in the worksheet. The aforementioned columns are an internal Integration Cloud format. In this format, the data are held in storage and streamed.

In the next step, we defined our target, i.e. our Salesforce account. To do this, we used the action "salesforce contact upsertBulk target". This requires, in addition to the account information, a Salesforce security token for it to work properly. As soon as the required entries had been made, we were able to create, using the mapper, the schema data that we had previously found out via the "Preview Data" entry, i.e. fields for company names, the name of the contact person, the address and similar. At the end, we assigned these fields to the associated Salesforce database entries so that the system knew the location to which it should write the data.

Figure: A simple flow that reads out data from a CSV file stored in Dropbox

Alternatively, it is also possible to find out the schema using the Guess Schema function. This asks for the source file to be used, opens it and then reads the schema out of it.

With the assignment of the content to fields, the definition of the flow was complete and we were able to perform an initial test run.
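The structure of such a flow (source rows, optional transformations, a mapper and a target) can be sketched in plain Python. This is only an illustration of the idea and not Talend's implementation: the sample rows, the target field names, the country alias table and every function name below are hypothetical. The sketch also includes a name split and a value standardization of the kind mentioned in the introduction.

```python
# Hypothetical sketch of a source -> transform -> map -> target flow.
# None of these names come from Talend; they only illustrate the idea.

# Rows as they might arrive from the spreadsheet after conversion to columns.
SOURCE_ROWS = [
    {"Company": "Acme Ltd", "Name": "Ada Lovelace", "Country": "Great Britain"},
    {"Company": "Globex", "Name": "Grace Hopper", "Country": "USA"},
]

# Assumed alias table for standardizing country names.
COUNTRY_ALIASES = {
    "great britain": "United Kingdom",
    "uk": "United Kingdom",
    "usa": "United States",
}


def split_name(full_name):
    """Split a full name into (first, last) at the last space."""
    first, _, last = full_name.strip().rpartition(" ")
    return first, last


def standardize_country(country):
    """Return the canonical country name, or the input if no alias is known."""
    return COUNTRY_ALIASES.get(country.strip().lower(), country.strip())


def map_row(row):
    """Map one source row to the (assumed) field names of the target."""
    first, last = split_name(row["Name"])
    return {
        "Account": row["Company"],
        "FirstName": first,
        "LastName": last,
        "MailingCountry": standardize_country(row["Country"]),
    }


def run_flow(rows):
    """Run the flow and report how many records were read and written."""
    target = [map_row(row) for row in rows]
    return {"read": len(rows), "written": len(target), "records": target}


result = run_flow(SOURCE_ROWS)
print(result["read"], result["written"])
```

Keeping each step as its own small function mirrors the way a flow is assembled from separate actions: any step can be previewed or swapped without touching the others.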
During the test run, the system showed us how many data records it read out of the file and how many data records were uploaded. In our simple example, these figures were identical, but it is also possible to define more complex flows that, for example, standardize all country names in all data records.

The latter makes sense for example when different names for the same country are found in the source data, e.g. "United Kingdom" and "Great Britain". If such modification methods are used, errors can arise during name detection or conversion and certain entries may get left out.

In this case, the Preview Data function shows precisely what information gets caught up in which step and thus assists in troubleshooting. It is also possible to filter out data in a targeted manner with a flow. In this case, the numbers of the incoming and outgoing data fields may not match and here too the preview can be used to determine whether everything is working as expected.

If it turns out during the test run that everything is running properly, then users have the option, using the "Go Live" button, to equip the flow with a scheduler that starts it at regular intervals, thus ensuring that changes in the source are transferred to the target, daily, for example. Here there is not only the option to process the flow once, daily, weekly or monthly, but there is also an item for selecting the running environment.

Here the Cloud, any existing remote engines, and "Cloud Exclusive" and "Cloud Sandbox" are available. With Cloud Exclusive, the flow receives its own working environment that it doesn't have to share with anything else, and with Cloud Sandbox, the work runs similarly to Cloud Exclusive. In this case, the Cloud environment also ensures that the system does not process production data at the same time as tests.

In the test, there were no problems with our first flow and we were able to take it live immediately. We set it up so that it ran every day, and that worked as expected from the start. If problems arise and the employees in charge have to make changes to the flow, a high-performance version management system helps them to keep things clear.

The range of functions of the Web interface

Let's take another look at the range of functions of the Web interface. Under "Activity", users have dashboards available that show information about tasks that are currently running or have already been performed. These can be defined for user-defined periods or for the last day, the last two days, the last week or the last month. In addition, it is also possible to filter the display by users and workspaces.

Under "Manage", in contrast to this, we find the actions used by the respective users, i.e. the sources, targets, conversion actions and similar. These can be exported here to Talend Exchange and if needed, there is the option to import other action entries from this platform.

Figure: This flow reads data out of an Excel file located in the Dropbox, standardizes country names and divides the "Name" field into first and last names. It then transfers the data to Salesforce.

The entry "Connections" also offers users the option to manage existing connections, e.g. to Salesforce, to Dropbox or other services. Similarly, there are also entries for the flow templates (i.e. flow entries that have been predefined as templates for other flows), resources (e.g. database schemas) and remote engines with their current status. All components present under Manage are available both for the personal and the shared workspace.

The last item in the Web interface is called "Admin". It is used to manage the subscription. First, users can view a dashboard here that gives them information about how long their license is still valid, how many user accounts have been created and how many new messages there are from Talend.

User administration makes it possible to create new user accounts. To do this, administrators must give users a login name and a password and define the user's first and last name, email address and telephone number. In this context, it is important to know that the password must contain capital and lowercase letters as well as numbers and special characters.
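For administrators who create accounts in bulk, a candidate password can be checked against this rule before it is entered. The following is a minimal sketch under stated assumptions: the review does not say which characters count as "special" or whether a minimum length applies, so the function name is hypothetical and ASCII punctuation is assumed to count as special.

```python
import string


def meets_password_rule(password):
    """Check for at least one uppercase letter, lowercase letter, digit and special character.

    Assumption: "special" means ASCII punctuation; the real rule may differ.
    """
    has_upper = any(c.isupper() for c in password)
    has_lower = any(c.islower() for c in password)
    has_digit = any(c.isdigit() for c in password)
    has_special = any(c in string.punctuation for c in password)
    return has_upper and has_lower and has_digit and has_special


print(meets_password_rule("Tal3nd!x"))
```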

The next configuration dialog offers administrators the option to assign rights to the user accounts. Normal users only get the right to log in, whereas administration users also get a tick mark in the "Account Admin" column. Only they have access to the administration area. Rights can also be granted for workspaces. Administrators thus specify, for every workspace (i.e. for users' personal spaces and the shared workspace), whether the users can create flows (author), execute (execute) and manage (manage) the components contained therein. It is also possible, in terms of workspaces, to allow or forbid importing and exporting of components from or to Talend Exchange. No problems arose during the test.

The item "Subscription" provides information about the current subscription. Here users see which subscription they have taken out, how long it is valid, how many engines are available and how many user accounts they can set up. In contrast to this, additional remote engines can be added under "Remote Engines".

This functions via a so-called remote engine key that is generated here by the Web interface and that administrators enter when installing remote engines. They then know to which account they have to connect and are then ready right after setup. Since remote engines only require an outgoing SSL connection in order to make contact with the Talend Integration Cloud, in most environments it is not necessary to reconfigure the firewall for this purpose. In any case, during the test, no problems arose during setup of the remote engines.

"Downloads" ultimately provides users with download links for the additional software. In addition to Talend Studio for Cloud and the remote engines, this includes the SAP RFC server, which makes it possible to receive and edit SAP IDocs when necessary.

Various links to the user community, to Talend Exchange and the help function with support and documentation are also part of the range of functions of the Web interface. The same applies to a configuration dialog for editing the current user profile. In this dialog, users can update their name and email address, change their password and set their time zone.

Working with sample flows

As already mentioned, Talend also provides users with sample flows where they can get acquainted with how the Integration Cloud works. Concretely speaking, there are two examples: The first is a simple "Hello World!" application. This assigns customers a customer campaign based on their type defined in a list. Next, it sends customers, based on their current campaign, a message, such as "Hello World!"

Figure: Sample flow entries include, among other things, the "Run History" that provides information about whether problems arose during processing of data

The second example is more interesting. Like our previously described template, it reads customer data out of an Excel file and then exports it to Salesforce. Unlike our simple solution, this template also standardizes country names and separates first names from last names. Thus, Talend provides a very practically oriented example that administrators charged with data integration should definitely consider.

Figure: The "Preview Data" function allows users to view in what form the data (here our book entries) arrive, and in which work step.

Talend Studio for Cloud

Let's now take a look at Talend Studio for Cloud. Although Talend has already pre-defined many actions for the Integration Cloud and there are a number of other features available, employees in charge might also have to define their own data transformations, data sources or even data targets. Talend provides the studio mentioned above for this purpose. Talend Studio for Cloud is an Eclipse-based development environment that is able to assist those in charge in creating the action they require and then uploading newly generated programs to the Integration Cloud so that they can be used inside the flows.

Unlike most other development environments, Studio for Cloud works with a collection of icons that represent certain functions. If a user with Studio wishes to develop a software program, all he or she has to do is drag the associated icons into the workspace, enter the required configuration parameters, e.g. required variables or paths, and then connect the icons so that the data flow is displayed. The connections then define in what order the individual functions are processed.

Previously we had already used the action to download a file from Dropbox. If Dropbox is opened in Studio for Cloud, we find several icons in the workspace that define this action. Concretely speaking, these include the work steps "Connection Configuration", "Download File", "Rename Variables" and "Action Output Data".

Connection configuration includes, in this context, the definition of the connection to the source service, "Download File" is self-explanatory and "Rename Variables" is used to standardize file accesses. "Action Output Data" handles the task of forwarding information to the next work step. In parallel, there are also two other entries that have nothing to do with the actual workflow. "Catch Error" detects errors that occur and forwards them to "Log Error" for recording.

In this manner, it is possible not only to implement new functions, but also to modify, expand and adapt already existing actions to the respective requirements. In practice, working with Studio goes as follows: After creating or opening a project, the employees in charge have the option, on the right side of the work panel, to access a sort of "toolbox" that contains all icons with the predefined functions. This can be searched by keyword, but has been divided into groups such as "Big Data", "Cloud", "Databases", "ELT" and "System" so that the people in charge can find their tools without problems. If they know the name of the function they need at a given time, it is even enough to enter this name in the workspace and Studio will offer the corresponding icon directly. If employees use the toolbox, they can simply drag and drop the icons.

If the function with all icons and the data flow has been defined, the employees in charge have the option to test them right in Studio. If the task runs without difficulties, it can be uploaded in the next step to the Integration Cloud and be used there. If errors occur, then Studio offers extensive options for debugging. If needed, developers have the option to embed code in their projects at any time. This will only be necessary in pre-defined functions in specific cases.

Figure: The "Activity Dashboard" shows at a glance what flows ran when and whether problems arose.

During the test, at this time, we proceeded to generate an entry element that met specific requirements. We had previously created a database with the books we had in house using an Android app and then wanted to import this database into another system. Unfortunately, the Android app was only able to export the content in the form of a CSV file which was in no way, either in terms of the separators or the encoding used, standard compliant. It could therefore not be imported with the out-of-the-box import function of the Integration Cloud. This was due to, as mentioned, the very strange format of the data source, not the Talend solution itself, but made manual development work necessary.

In order to successfully import the CSV file, we first created a new flow in the browser and used our CSV file in Dropbox as a data source. We converted this in the next step with the action "csv file toColumns process step" so that the system could detect the data contained. To do this, we had to modify this action within Studio for Cloud in such a way that it could correctly interpret our rather odd CSV file. It was necessary here to manually adapt the encoding, field separators and similar parameters which, in our file, did not meet the usual expectations. After we uploaded the modified action by right clicking on the entry and selecting the command "Publish to Cloud" in our workspace, we were able to add it to our flow.

In order to export the book list, all we needed to do now was enter the target database in the Integration Cloud flow builder and perform the mapping. Afterwards, the Cloud solution migrated all the information to the database.
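The parameters adapted for the book list (encoding, field separator and similar) are the same ones any CSV parser needs for a non-standard file. As a rough illustration outside of Talend, the sketch below reads such a file with Python's standard `csv` module. The review does not state which encoding or separators the Android app used, so the Latin-1 encoding, the `|` separator, the `'` quote character, the sample rows and the function name are all assumptions made for this example.

```python
import csv
import io

# Assumed example export: Latin-1 bytes, "|" as separator, "'" as quote character.
RAW_EXPORT = (
    "title|author|year\n"
    "'Die Verwandlung'|'Franz Kafka'|1915\n"
    "'Émile'|'Rousseau'|1762\n"
).encode("latin-1")


def read_odd_csv(raw_bytes, encoding="latin-1", delimiter="|", quotechar="'"):
    """Decode the bytes with the given encoding and parse them into dict rows."""
    text = raw_bytes.decode(encoding)
    reader = csv.DictReader(
        io.StringIO(text, newline=""),
        delimiter=delimiter,
        quotechar=quotechar,
    )
    return list(reader)


books = read_odd_csv(RAW_EXPORT)
print(len(books), books[0]["title"])
```

Exposing the encoding and separators as parameters makes it cheap to try several combinations until the preview of the parsed rows looks right, which is essentially the trial-and-error loop a non-standard export requires.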
The migration didn't work for us right off the bat, since we had to fiddle around with the parameters for interpreting the CSV file. Thanks to the integrated version management, it was not a problem to work with Studio until the result met our expectations.

Conclusion

With Integration Cloud, Talend offers a very high-performance product for data integration. Despite the high flexibility and large range of functions, the solution is relatively easy to use and makes it possible for companies to carry out "do it yourself" data integration projects, for example.

In many cases, it might be necessary for the development department to pre-define the actions required in the respective company using Studio for Cloud. Once this is done, the data integration officers can compile and modify the flows they need themselves using the Web interface as a sort of end user.

They do not require Studio for this, nor do they require training for developing with Studio. This saves time both for the development department, which after the initial setup only has to deal with data integration in special cases, and for the "end users", who only require knowledge about working with the Web interface and the flow builder.

Figure: Under "Manage", actions, connections, flow templates, resources and remote engines can be managed.

The Integration Cloud significantly reduces the burden on the IT administration as well, since the Cloud makes it unnecessary to implement and maintain the environment for processing data integration projects (hardware and software, high availability and security) on premises. This saves a lot of money and effort, especially in the big data environment. As such, the Talend Integration Cloud provides an interesting alternative to traditional data integration projects for many companies. Other organizations as well, which previously lacked the courage to tackle the topic of "data integration", find the Cloud to be an easy-to-use, risk-free and convenient entry-level option.

Finally, let's talk about the security of the Talend Integration Cloud. The solution is based on Amazon Web Services and works with dual encryption of passwords and connection data. In addition, every access to the Cloud is secured with a key. The Cloud does not store any usage data during operation; they are only streamed through the [...] available in encrypted form in the Cloud permanently.

Due to the high security level, the many applications and the wide range of services, we give the Talend Integration Cloud the distinction of "IAIT Tested and Recommended."

WP213-EN
