Are you considering a move from traditional on-prem ETL tools to a cloud-based solution? Or maybe just mystified by the ever-expanding landscape of data integration tools, and wondering about the pros and cons of using one of the battle-hardened heavyweights versus one of their newer peers?
In this on-demand roundtable session, we discuss data integration options with our ETL experts. Bringing different perspectives and a lot of experience to the subject, our panelists discuss:
- The evolution of ETL from traditional on-prem solutions to cloud-based data integration tools
- Deciding between on-prem, cloud and hybrid deployments
- Challenges when converting from on-prem to cloud-based tools
- Departmental vs. enterprise-wide toolsets
- Real-world examples using three of the most tried-and-true workhorses of data integration: DataStage, Informatica and SSIS
Hear the unvarnished truth about the benefits (and risks) of the myriad options available for data integration.
Peter is a seasoned data architect who is passionate about using technology to design and implement solutions for clients.
Rick is an experienced database architect and ETL developer. He has extracted data from several ERP systems and understands how to transform that data into dimensional models.
Solutions Architect, Data Engineering
Brad has over two decades of experience helping clients achieve their analytics reporting goals. Over the past several years he has focused on schema design and data delivery using a variety of tools.
Hello, everyone, and welcome to today’s Data Integration Roundtable, part of the Senturus webinar series. On the agenda for today, we’ll do some quick introductions of the experts who are going to be speaking to you, and we’ll talk a little bit about the background of data integration.
And then we’ll head right into the roundtable discussion, and there’ll be a lot of back and forth with our team. So joining us today, we’ve got Brad Green, Solutions Architect and Data Engineer here at Senturus. Brad has been with us for a long time; most of us have been around here quite a bit, so lots of years of experience on the session today. And we also have Rick Fukunaga, who’s a senior consultant here at Senturus.
And third, I guess maybe last but not least, Peter Gigopolis. Peter is also a senior consultant here at Senturus, and these three are just rock stars when it comes to data integration. Personally, I’ve worked on tons of projects with all of them over the last many, many years; I’m almost embarrassed to say how many. But yeah, we’ve got a great team assembled here today. As for me, I’m Steve Reed Pittman,
Director of Enterprise Architecture and Engineering here at Senturus, and I’ll provide the overview for today’s session. All right. So before we get started, we want to do just a couple of quick polls. This first poll has a whole bunch of options on it, so I’m not going to read all of them to you. Go ahead and answer in the window that’s popped up there. The question for this poll is: which data integration platforms do you currently use? We’ve put about ten of the most common platforms on here, and you can check off as many of those as apply to your situation. We’re just curious to get a sense of what platforms folks on the session today are using.
So I see quite a few coming in. There’s a lot of SSIS showing up; that’s not surprising. SSIS is a very mature and solid product, heavily used across many industries. We’ve got some Informatica coming in here, and a fair amount of Azure Data Factory, which also isn’t surprising as more and more folks move to the cloud.
We’ve seen that shift: some SSIS workloads have been moved to ADF, which makes sense. Let me share out the results. I’m not sure if you all see the same view I do, so I’m expanding my window here to show all of the results in one place; hopefully you all see the same thing. So you can see, 58% of you answered that you’re using SSIS, so, not surprisingly, a good solid majority there. A fair amount of ADF, some Informatica out there, and smaller bits of DataStage, Cloud Pak for Data and some of the Oracle tools. Beyond that, we’ve got a little bit of other cloud. And I’d be curious, actually, for those of you who are using other platforms, if you want to just type a little note about them into the Q&A panel; I’d be curious to know some of the other platforms being used out there.
All right, on to the next poll. Let me get my deck going here. We’re also curious to know, for a future webinar, if there’s a particular platform that you’d like to see a deep dive on. We’ve got some of the big heavies who’ve been around for a long time, from SSIS to DataStage, and we’ve also got others. So, let me back up there: not just cloud platforms, but any data integration platforms that you’d like to see a deep dive webinar on, just let us know. Again, you can put that into the Q&A panel if you have a particular platform that you’re interested in. So I’m going to go ahead and close this out and share those results with all of you. Again, about half of you would be interested in an SSIS deep dive. Second place goes to Informatica, DataStage isn’t too far behind, and about a quarter of you would be interested in other platforms. All right. So with that, I’m going to stop doing most of the talking, anyway, and we’re going to move on toward our roundtable group. We’re going to start with the definition of data integration, and I will talk for this slide.
So, generically speaking, data integration is typically about combining data from multiple data sources into a unified data set. That can mean different things in different organizations, but in general, you’re trying to take data that’s not structured enough and get it structured in a way that’s useful for your reporting. That can include lots of different activities: it can include cleansing, it can include ETL, it can include data mapping and transformations.
And it can even include modeling. Of course, as most of us know, there are an enormous number of solutions out there today. Even though it’s true, and we can see from the polls today, that Informatica, DataStage and SSIS are still heavily used, those are kind of industry standards that have been in place for a long time and continue to be very strong products, there’s also a whole universe of new tools out there. So it’s a wild and woolly world. Here’s a quick view of the Gartner Magic Quadrant for data integration; most of you have probably seen these over the years. You can see some of the different companies that have come in over the span between 2014 and 2022.
But you’ll note that the leaders have more or less stayed the same. We’ve still got Informatica and IBM over here, and we’ve got Microsoft up here. Honestly, I’m a little surprised that Gartner didn’t have Microsoft up there even in 2014, because SSIS has been, at least in our experience, heavily used for a long time.
All right. And with that, we’re going to go into the live Q&A. I’m going to stop sharing the deck, just because that’s not very interesting to look at while we chat, and I’m going to introduce Rick, Brad and Peter. We’ll all be on camera here in a moment, and we’re just going to start with some general questions and general discussion about data integration. So, as a starting point, guys, if we could just talk a little bit about how data integration has evolved over, say, this last decade. We used to think about ETL as being the thing you do: you buy an ETL tool and that’s it. But now you’ve got tools that do an enormous number of different things; you’ve got cloud and you’ve got on-prem. Could you talk about some of what you’ve seen in terms of the shifts in recent years?
Sure. Who wants to? And remember to unmute your microphones, guys; I know I had you turn on your video, but make sure your microphones are unmuted. Sorry, Brad. That’s okay. Yeah, I’m unmuted, so I guess that nominates me. Good morning, good afternoon, everybody. Thanks for tuning in, first of all, and thanks for the intro, Steve. Yeah, things have changed a lot. I was thinking back the other day.
When I first started with Senturus, my first class was an ETL class, which is interesting. I spent five days learning that wonderful tool we called Data Manager, and I thought, wow, this is kind of cool, a GUI tool, because up until then, right, everybody was writing SQL code, and everything was done in chunks of SQL against Oracle databases. So, yeah, I think things have changed a lot over the last, what, almost 20 years. We’ve gone from hand-coded SQL, and I think most of the world now is using GUI tools or something similar. I mean, if you’re in a cloud environment you might be using dbt, you might be back to using Python. So the environment is varied and different, and in some ways familiar.
It’s interesting. I think things have changed a tremendous amount. We have tons of options. I mean, you showed that slide of vendors; that’s a way busier slide than it was in 2005, right? We just have such an array of choices now. When people go out to look at what to use, how do I want to develop my data integration, my pipelines, moving data around the organization or from external data sources, you’re spoiled for choice, and it’s a difficult job. So I’ll stop there and see what Rick and Peter have to contribute on that topic. Yeah, I think for me, and again, good morning, everybody, good afternoon.
What I’ve seen is, as Brad was alluding to, when you have all of these new players and new technologies coming in, it also strengthens some of the existing platforms. For myself, with Informatica, while the constructs remain the same, there have been a lot of back-end technology changes that have given some drastic improvements in performance, in how many rows it can process, and given you as a developer some tools for doing your own optimization within the toolset. Specifically, some of the functions in Informatica: they have this pushdown optimization which, based on your database platform, can potentially push some of your routines and logic back to the database to try and give you better performance. So these advances in technologies from other solutions have helped create an even better product, and to some extent that’s why they still sit in that leader quadrant: they continue to push the platform, continue to improve it, not only the on-prem version but even the cloud versions, trying to make these solutions able to handle more than just a simple movement of data from point A to point B.
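To illustrate the general idea behind pushdown optimization (this is a generic sketch of the concept, using SQLite as a stand-in database, not Informatica’s actual mechanism), the same transformation can either run row by row inside the integration engine or be expressed as SQL and pushed to the database, so only the result crosses the wire:

```python
import sqlite3

# A toy source database standing in for the warehouse.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id INTEGER, region TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'west', 100.0), (2, 'east', 250.0),
                              (3, 'west', 50.0), (4, 'east', 75.0);
""")

# Engine-side transformation: fetch every row, then aggregate in the tool.
rows = db.execute("SELECT region, amount FROM orders").fetchall()
engine_side = {}
for region, amount in rows:
    engine_side[region] = engine_side.get(region, 0) + amount

# "Pushed down": the same logic expressed as SQL, executed by the database,
# so only the small aggregated result set leaves the database.
pushed_down = dict(db.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region"
))

assert engine_side == pushed_down  # same answer, far less data moved
```

The tradeoff the panel describes is exactly this: the pushed-down form moves less data and exploits the database engine, at the cost of depending on what SQL that particular database supports.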
Yeah, hello everyone, I’m Rick. I think over the years what I’ve found is we have to adapt a lot. You know, when Senturus signs a client, sometimes they already have an ETL tool, so we have to adapt to their environment. But a lot of it has been SSIS development, at least for me.
And I’ve seen a large shift now going towards cloud-based solutions too.
Actually, Rick, I’m a little curious, since you mentioned the shift to cloud-based solutions. One thing that we’re seeing more and more, of course, is that some of the traditional vendors now offer both what we traditionally think of as on-prem solutions, even though many times today those on-prem tools are running inside a VM that’s actually out in the cloud, and newer cloud versions. Maybe you could just, I think each of you has some experience with this, so I’ll just throw it out to all three of you.
If you could just talk a little bit about your experience working with the traditional on-prem version of a particular tool, and what you’ve seen with the more modern cloud versions: where they have parity and where they don’t. I think, with a lot of the cloud versions of the tools, for instance Data Factory versus SSIS, some of the features are lacking; they haven’t matured as much as the on-prem versions, and I think that’s true with other solutions too. So for me, I prefer the on-prem versions, because I know they work; I already have built solutions with them. So trying to find workarounds in cloud-based tools
can sometimes be challenging. Yeah, absolutely. So I have very limited personal experience in these particular realms, but I have seen a little bit of Data Factory versus SSIS. And just from my own observation, I mean, it’s been a few years now, but at least at the time, the way that Data Factory pipelines were constructed, they seemed almost like completely different beasts, even though they’re both built by Microsoft. But my understanding is there is some way to deploy SSIS packages into Data Factory nowadays. Yeah, you can still use Visual Studio to create your SSIS packages and then deploy them into Azure.
I think some of those differences come down, in a lot of cases, to the overall design. I mean, if you’re looking at, let’s sit down and design a pipeline, I need to move data from one place to another, and that may or may not involve transformations along the way, in some cases those cloud-based tools are designed, in some ways, for a slightly different process. We know that traditionally we have this lexicon of ETL versus ELT, and as you guys know, labeling things is a personal pet peeve of mine. We prefer to focus on what job, what process you’re trying to build, and not get hung up on labels. But some of these cloud-based tools are, I think, intended to load data and process it in a fundamentally different way than these legacy tools, as we like to call them. And so when we shift to these cloud-based tools, we kind of have to rethink how we build pipelines a little bit, because the tools are just structured slightly differently. So I think you can kind of get there, but it is true that many of them are not as mature. And having had some recent experience with a cloud-based tool from a vendor that has newly released their cloud-based product, it is definitely something to consider: look at how long a product has been on the market, and make sure you have feature parity and that these tools do exactly
what you expect from beginning to end. So true. Yep, cloud-based products are maybe not as mature. I guess, in a roundabout way, I’m saying that looking at the solution you’re trying to create, and making sure the tool is appropriate for the job, is very important, because we built things differently 10 years ago than we build things today, but in some cases you’re still trying to do exactly the same job. So an on-prem tool may be the perfect solution; but if you’re trying to move lots of data from a Salesforce app that’s already in the cloud to another database that’s already in the cloud, then something like Matillion and a Snowflake database may be the perfect solution. And it doesn’t work the same way that SSIS does, that’s my point. And maybe Rick and Peter will say that’s a bunch of, you know what, and totally disagree. No, I agree with you. I think if you’re going from a cloud source to a cloud target, having a tool that works in the cloud too, that makes perfect sense. And there are different ways to move the data around.
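The cloud-source-to-cloud-target pattern being discussed here, land the raw data in the target first and then transform it in place with SQL, is the classic ELT approach. A minimal sketch, using SQLite in-memory databases as hypothetical stand-ins for a source app and a cloud warehouse (the table and column names are invented for illustration):

```python
import sqlite3

# Stand-ins for a cloud app database (source) and a cloud warehouse (target).
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE orders (id INTEGER, region TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'west', 100.0), (2, 'east', 250.0),
                              (3, 'west', 50.0);
""")
target = sqlite3.connect(":memory:")

# Extract + Load: copy the raw rows, untransformed, into a staging table.
rows = source.execute("SELECT id, region, amount FROM orders").fetchall()
target.execute("CREATE TABLE stg_orders (id INTEGER, region TEXT, amount REAL)")
target.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", rows)

# Transform: run SQL on the *target* to shape the data for reporting.
# Transforming after loading, in the target, is the defining move of ELT.
target.execute("""
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(amount) AS total
    FROM stg_orders
    GROUP BY region
""")
totals = dict(target.execute("SELECT region, total FROM sales_by_region"))
print(totals)
```

Tools like Fivetran handle only the extract-and-load half of this at scale; the transform step then runs in the warehouse, often via SQL or dbt.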
You know, there are products like Fivetran that you use strictly to move data from one point to another; then you do your transformations on the target side. So again, that’s a perfect ELT solution. Yeah, I would say I have a couple of projects that I’m currently working on where we actually have hybrid environments, where we have Azure in the cloud that is
pulling in data from other cloud sources, but internally, all of the data that they also blend in is an on-prem data set. So we’ve got this mixture of data we’re bringing in through Azure, storing it in Azure Data Lake. And one of the nice things is, with something like a JSON file from a Mongo database, you can still get at it with Informatica PowerCenter, but it’s not a direct and easy connection; you can still get it with some coding. Nowadays, some of these cloud-based tools have connectors readily available to connect easily. That’s a little different from previous models, in which you were licensed for how you could connect and limited in what you could use: oh, you’re just SQL-based; oh no, you want these other connections, there was somewhat of a fee for that. Now, with these solutions, especially when you’re talking about Azure, they’re available to you; you can configure them and move the data cleanly. Like I said, we’re doing a hybrid: Azure to acquire the data, and then continuing to utilize some of the existing internal resources, Informatica, for the data aggregation and the data mart creation, as well as master data management. So there is that sweet spot; at least in this particular project, this combination, or hybrid solution, is working very well for us. But just like anything else, looking at your data sources, looking at your objectives, will help determine which platform, which direction you want to go.

Yeah, that’s interesting, and maybe that’s what you were going to say, Steve, so I’ll just, well, it might be interesting to explore that question: when you have to deal with hybrid, how do people make a decision about sticking with a key vendor like Informatica or IBM or Microsoft
and trying to use the same vendor for multiple data integration tools? Or do you go, well, OK, I’m going to use Matillion and Informatica and something else to try to solve all my data integration needs? I don’t know, that’s just an interesting point. Yeah, that is a great point, Brad. And it actually leads into a question that came in about
making decisions between: do I go on-prem, do I go cloud, do I go hybrid? From what you guys have said so far, I think the answer is probably “it depends.” But it led me to think about the question of what you do if you have on-prem data but you want to use a cloud-based tool for your data integration. How mature are those tools today in terms of being able to reach into your on-prem data and pull it out? I’m sure it’s much easier to do that with an on-prem integration tool. But if you have a cloud tool, are they fairly well evolved now in terms of being able to pull data from on-prem databases, or are they really more designed for, and efficient with, native cloud data, which is what I would expect?
I know from the project I’m on, some of the hurdles we had were because some of our data was on-prem and some of the transformations were in Azure. You can’t really do it in the traditional sense of grabbing your source data and then running it through some of the transformation engines; you’re limited more to the data flows, very simplistic movement. So it’s: acquire the data, store it in the data lake, and then you can massage the data and write it back out to your target, which is on-prem. There are a couple of intermediate steps you have to take. It’s not simply grab your data, manipulate it, and write it, as you could if it were all in the cloud; there’s that interim state of, let me put the data in the lake, now I can read that data. You’ve got that interim step you have to do in order to use the cloud tool end to end. It’s not a simple direct connect where you do your manipulations and then write to your target.

Yeah, I think for clients that have a significantly large amount of data internally, you know, that haven’t already lifted all their data to the cloud, in other words, it exists internally.
You know, whether you think of that internal data as already existing in its own internal cloud, that’s a big decision, right? Because if you’ve got a few terabytes or petabytes of data sitting in a private network, you’re probably going to think pretty hard about what you do with that data. You’re not just going to say, okay, I’m going to hand that over to Amazon and then we’ll go cloud with some cloud app, right? So that’s a significant decision point for people to make, and it’s not going to be driven by the tail wagging the dog: oh, I want to use Data Factory, so let’s just move all our data to Azure. I don’t think people would make the decision that way.
Yeah, I would think also that data transfer, like the performance of data transfers, would be a consideration if you’re pulling on-prem data out. Well, and not to mention the security aspects that you kind of touched on there, Brad. But I would think if you’re having to pull a lot of data from an on-prem database out into one of these cloud tools, that could be a concern, just performance, and cost as well. Each vendor has different ways that they charge for bandwidth, for data moving in and out of the cloud, so I can imagine that can be a consideration also. Yeah, I don’t think it’s practical to move huge amounts of data out of the cloud, process it, and then turn right around and move it back. Yeah, it doesn’t make any sense; it seems illogical. You might find a scenario where it makes sense, though: if you’re going to put your analytics in the cloud, and we’re talking about data integration for analytics purposes, which I think is really what we’re talking about here, you might be able to make the case for moving the data up to the cloud for analytics. That way it’s a one-way trip, right? You’ve got ERP systems internally; move that data up once, process it, and now it’s available in the cloud for analytics. That might make sense.
Yeah. So I’m going to pause here for a minute for a quick public service announcement, in case everybody hasn’t noticed in the chat window: our own Scott Felton has posted a link if you want to get some time on his calendar. Just so everybody’s aware, if you have deeper questions about data integration, or you’d like to talk to us about specific data integration needs that you have today, you can just use that link that Scott posted in the chat
and spend some time with him. He’s always happy to chat with you about your current data integration needs. Also, a reminder to everybody that you can enter questions into the Q&A panel; if you have questions specifically for our panel of experts, you can type those in as we go along. So beyond the stuff we’ve already talked about, guys, I was wondering if each of you could talk about one real-world example of an integration project you’ve worked on. It may be hybrid, it might be on-prem, it might be cloud-only, but just some of the challenges and considerations that come up in real-world projects today.
So somebody else gets to go first rather than me this time. Want me to flip a, yeah, roll a three-sided die, right? Who wants to go first? Yeah, go ahead and go first on this one. So I think for me, I’ve got a couple of different projects. One was traditionally a pure on-prem solution, where we had Informatica, and Oracle as the database for our integration area. But the challenges we faced were really more in dealing with multiple ERP systems: trying to coordinate and consolidate 25, 26 different business units on different ERP solutions, and trying to create that federated solution set. Some we were able to connect with directly; some were the old traditional “let me give you a CSV file, FTP it over, and then you process it.”

One of the current projects I’m on now is that hybrid solution I mentioned earlier. It involves connecting to cloud versions of Mongo, a cloud version of a SQL database, an on-prem SQL database, and some external metrics from the government, census and population statistics, that are then blended in and crunched, all on an on-prem SQL database that is used to try and figure out labor and employment trends. So, different challenges. The cloud made it a little easier to connect to all these particular environments, but just like anything, there’s still that initial setup that takes time to coordinate, especially when you’re transmitting across the cloud and making sure that the PII data coming across is secured correctly: that those tunnels have been created, those keys have been created, so that nobody who isn’t allowed to access the data can get to it. So I’m just going to choose Rick as next; we’ll make you last, Brad. One of the
issues I had with one client: most of their business units were on one ERP system, well, different copies of the ERP system, so they were distributed across multiple servers, but one of their business units was on a completely different ERP system. So trying to consolidate their financials, because they wanted to do planning, you take that data and put it into planning software, was a big challenge. You have to create a consolidated chart of accounts, a singular look at what the chart of accounts is, so mapping was involved: that required mapping an account in one of the business units to a consolidated account number. It took a while; of course we had to ask the business people to help with that type of mapping. Once we got that done, they wanted to also consolidate some sales data. Across all of the systems they had their own separate data warehouses, but now they wanted to consolidate all of it, so they had their bookings, revenue and backlog all together in a dashboard. Those types of issues present some challenges in consolidating the data, but we were able to do it all with SSIS, and it became the standard for reporting for the company.
Do you find, actually, before we go to Brad, Rick, that organizations have SSIS workflows that have been out there for years and years that they just continue using? Or how do they tend to evolve? Well, for this particular client, we started development of their data warehouse with SSIS in 2012, I think it was, and it’s still running today. Over the years it’s evolved by adding additional subject areas. Most places start with sales, a sales data mart; then you do purchases and inventory, and then it keeps growing: operations, and so on. So over the years it evolves, but the base code, I think, has been in place for years. I’ve done that for a number of customers, and yeah, it just works. That’s true with a lot of different data warehouses and other tools too: unless they change their ERP system, you don’t have to touch the code very often.
Yeah. And just real quick, I would say that was the same for me. I’ve had a client where code has been in place since 2005, in terms of the Informatica ETL; their processes remain the same. What’s changed over the years is really just the movement of where the data is stored, on-prem Oracle to off-prem or cloud-based Oracle, and then migrations to different database platforms. Like I said, now they’re on SQL Server, so a completely different movement of where it’s written. But the core of the code remains consistent. And like Rick says, as long as the process and the data acquisition components don’t change, these marts keep running; they’ve been running for, you know, now almost two decades, which is very nice to see. They’re still getting value out of the information.

Yeah, that’s interesting. So before I launch into my little example, Steve, did you have something you wanted to comment on? Just something you brought up, Peter, actually. I hadn’t really thought about the question of database platform migration. You talked about how the core ETL might have stayed the same over all these years, but the underlying database itself might have shifted from Oracle to SQL Server, or moved out to the cloud. I’m just curious if there are particular challenges any of you guys have seen where it wasn’t that customers needed to change their integration tool, but simply that their data was moving to a different database. What challenges does that create for the existing code? Are those difficult to accommodate? Or maybe it’s tool-specific, but I’ll let you guys tell me.

Yeah, for this one, the Oracle to SQL Server move was a task that required a lot of effort, but it wasn’t a complicated migration, simply because a lot of the database technologies, in terms of options for how the tables are created, are consistent. There may be some minor features or functions that we had to be aware of, especially since some of the code we had was stored procedures. The one thing in particular that we came across, for those of us that have worked with Oracle, is the use of the Oracle DECODE function, which is not ANSI-compliant and doesn’t work on SQL Server. So you’ve got to convert all of those routines to CASE statements to produce the equivalent. That’s the hard part: OK, now we have to scan through hundreds of pieces of potential code, looking for vendor-specific functions. It’s very similar to migrating BI platforms: if you’ve used vendor-specific functions there, you have the same problem on the integration side. You’ve got to go look for what’s going to break because the code is using some vendor-specific function the new vendor doesn’t support, and figure out what new equivalent function to put in its place.
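The DECODE-to-CASE conversion described above is mechanical enough to sketch. Here is a simplified, hypothetical helper for illustration only, not a production migration tool; note that Oracle’s DECODE treats two NULLs as equal, which a plain CASE does not, so NULL search values still need hand conversion:

```python
def decode_to_case(expr, *args):
    """Rewrite Oracle's DECODE(expr, s1, r1, ..., [default]) as an
    ANSI simple CASE expression. All arguments are SQL fragments
    passed as strings; quoting is the caller's responsibility."""
    parts = list(args)
    # An odd argument count means the last argument is the default.
    default = parts.pop() if len(parts) % 2 else None
    whens = " ".join(
        f"WHEN {search} THEN {result}"
        for search, result in zip(parts[::2], parts[1::2])
    )
    case = f"CASE {expr} {whens}"
    if default is not None:
        case += f" ELSE {default}"
    return case + " END"

# DECODE(status, 'A', 'Active', 'I', 'Inactive', 'Unknown') becomes:
print(decode_to_case("status", "'A'", "'Active'", "'I'", "'Inactive'", "'Unknown'"))
# CASE status WHEN 'A' THEN 'Active' WHEN 'I' THEN 'Inactive' ELSE 'Unknown' END
```

In practice the hard part is the one Peter names: finding every occurrence of the vendor-specific function across hundreds of stored procedures, not rewriting any single one.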
Yeah. That’s why we recommend that you never change anything. No, I’m kidding. All right, well, thanks, Peter. Thank you, guys. Well, I want to add something. Sometimes, excuse me, sometimes you have to change the database architecture for performance reasons. You can load it, it’s fine, but then when you start reporting on it, you have to try different techniques to speed up reporting. In one case we had to change the architecture of the table to use columnstore indexes in SQL Server. That greatly improved the reporting, but slowed down the ETL. So you have to find workarounds to make everything work as you want. Considerations like that have to be made too.

Yeah. And, you know, fortunately people don’t change database vendors too often, which is a good thing. I mean, there is often a tendency to push transformations back to the database, right? We face that often; it’s something we all do, we push those transformations into our SQL statements.
And of course, the more you do that, the more chance there is that you'll have to rewrite things when you change database vendors. When you rely on the tools to do those transformations, it shields you from that: the tools handle the transformations, so if you change vendors it makes no difference. That's the benefit of using the tool for transformations: you don't have to worry about changing DECODE statements, because the compiled code inside those tools deals with it for you. That's the tradeoff. You can have it buried in your SQL, or you can let the tool do it, and there are arguments for either approach.
Anyway, I just wanted to make that point. Peter, it looks like you want to say something. Oh, you're muted, Peter. Or do you have nothing to add?
No, just agreeing with you. That's the way of the world now.
Yeah, it is, but it depends on the people doing the work.
There were so many things said there that I wanted to comment on, and I've already forgotten most of them. For my example, I've got so many different projects I could talk about, but one that may be interesting is a project where the client was using a cloud-based solution, Azure Data Factory. They had no internal resources or expertise at all; the entire data integration and ETL process was external. We built all the ETL, and it was all managed in the cloud. Data Factory, Azure security, everything was completely outsourced, and the business took that data into business-critical reports. What's interesting when people do that is that certain dependencies develop when you decide as a business to completely outsource the process that delivers your data for critical analytics. It was a good decision for many reasons, but there are downsides. One of the downsides, as the person doing the work, was the regular updates to Data Factory, unlike an on-prem tool, where you get to control when new versions of the software are delivered.
Oh, there’s a new version of data stage or a new version of SSIS. When do you want to apply that? I don’t know. When do you think? What? What? What’s changed? But with cloud tools, you usually have no control. You come in the next day and oh guess what? We upgraded your environment and things change and stuff stops working and you’re looking at your code going, oh, why doesn’t this run?
Well, because we change things and now your pipelines don’t run anymore and that that can be pretty annoying because you get calls from the client. That’s hardly a feature. Yeah, our new feature is we break your pipelines. Yeah, we broke your pipelines and the clients calling you early in the morning going my reports didn’t get delivered. So there are some issues there and there’s you know.
Centrally managed security which is awesome, and rotating security keys which are great to take advantage of, but it just shifts sort of where things traditionally have been taken care of and maybe an internal IT department to an external resource that may or may not have the same priorities as you do.
So if you have a something that breaks, you might be subject to a four 8/12/24 hour SLA and that may or may not work well for you. So is it was an interesting process to go through and see how that worked for them. And you know most days it was good, but some days it was particularly bad. So anyway, just thought I’d throw that out there. That’s my example.
Yeah, Brad, that reminds me of one of the challenges I've seen with environments that are all, or at least mostly, cloud. A big risk that a lot of people don't realize they're stepping into is the dependency you're talking about: you're dependent on the vendor of each distinct cloud tool you use to keep their tool operating successfully, and if their service goes down, it can have dramatic impacts. Like the Data Factory example you gave: I can recall experiences where Microsoft pushed an update without really announcing it, and the way they tend to roll out updates over time and across the continent means you don't necessarily know when your systems will start being affected by a new update, or even when they start rolling it back, which they periodically had to do. It was unpredictable. Historically we'd always complain that it takes too long to fix the problem we have, but at least in that situation you had somebody you could go to directly and say, hey, I need help, and you could get much more real-time updates about what was happening. That's a big challenge nowadays. It's more opaque; that's the word I would use. What's going on is more opaque. If you're a big company paying for high-level premium support, you can probably get somebody on the phone. But if you're a midrange business and not willing to pay a very large amount for that kind of support, you're not going to have the direct feedback you'd get from calling your buddy down the hall and saying, that upgrade didn't work, roll it back. So it definitely was a different thing, and 99.8% uptime sounds fine until that 0.2% hits; then it can be pretty painful.
Yeah, if it happens at the wrong time. We don't think about it: 0.2%, that's nothing, until it's the hour you need the system to run.
Well, and to that point, Brad, there are common situations where these systems are heavily used, like the beginning of a quarter or end of year. All of these cloud platforms have a much heavier load at that time, and it's also the most critical time for a lot of execs to be getting their data. If that 0.2% downtime happens to occur first thing in the morning on one of those critical days, it can be a big, big problem. So anyway, just a little bit of perspective on being totally cloud-based.
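The back-of-the-envelope math behind that 0.2% is easy to check. A quick sketch (plain arithmetic, not any vendor's actual SLA terms):

```python
# How much annual downtime does a given uptime percentage allow?
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

def downtime_hours(uptime_pct):
    """Hours per year a service can be down at the given uptime %."""
    return HOURS_PER_YEAR * (100 - uptime_pct) / 100

for sla in (99.8, 99.9, 99.99):
    print(f"{sla}% uptime allows {downtime_hours(sla):.1f} hours/year of downtime")
# 99.8% uptime allows 17.5 hours/year of downtime
# 99.9% uptime allows 8.8 hours/year of downtime
# 99.99% uptime allows 0.9 hours/year of downtime
```

Seventeen-plus hours a year leaves plenty of room for an outage to land exactly on the morning the quarter-end reports are due.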
It has its own downsides. So yeah, that's my example.
Great. Well, thanks, Brad. The thing I thought of as you guys were talking, and I'm curious how much of this you've seen, is multiple-tool environments. Historically you'd tend to see adoption of a single platform; a company would choose SSIS and do everything that way. But in today's world, with this broad landscape of new tools, do you still mostly see organizations adopting one platform and sticking with it, or do you see hybrids, in the sense of multiple different tools working together?
On my side, most clients have been on a single tool. But as management changes or new IT staff come in, that can erode. In one case, the client was on Informatica, and some of the new staff coming in didn't understand it as well as, say, SSIS. So they would start a new project using the tool they knew best, and slowly but surely you build up a multitude of different routines, some in Informatica, some in SSIS, just because of turnover among the people taking things over. Then you get to where they're trying to consolidate again into a single solution, in Azure, in the cloud. It's an ongoing cycle. You may start at one point, but internal changes, losing the folks who knew the tool, the application and what it was doing, and newcomers introducing another toolset out of familiarity with something different from what was in place, cause the issue. At least for this one client, that's what contributed to it. But there are others where, yes, we definitely go in and say we want to consolidate: we may be using a multitude of different ETL routines, and we need to consolidate across the board. And I've been somewhere different: one organization had Informatica, DataStage and SSIS and was looking to go to a completely different integrated solution, consolidating all three into something else.
Oh, wow. That's not for the faint of heart.
No, definitely not. But it just depends on the organization. Sometimes things crop up departmentally, and that's where you get the multitude of data integration tools, just as you end up with departmental use of Tableau, Business Objects, Power BI or even Cognos. The same thing is true on the data integration side.
Yeah, and sometimes it's based on economics too.
I had one client who was on Informatica, and there's an annual licensing fee involved that they didn't want to pay anymore. So they brought in SSIS, and I migrated some of their ETL routines from Informatica to SSIS. They're planning to sunset Informatica there. So sometimes it's economics.
I think it’s how difficult. Oh actually go ahead Brad. I just hold my question. All right, we’ll get all three of us. And I just wanted to comment that I found that it’s more and more common now to find multiple tools. It used to be there was one vendor involved and people tended to be pretty monolithic, but in the last five plus years it seems like everybody’s got multiple tools. It’s.
It’s I would say more likely that people have multiple tools than not. We still leave its interest. We have clients who have a single tool but that’s I think they’re in the minority. I think you’re finding that more with the cloud based solutions to you know going from a cloud source to and you got multiple ways of getting data from there into for instance Snowflake.
So they use different tools for and even like Ruby scripts and stuff to move data. Yeah. Especially if you throw in data prep tools. Like if you throw in Tableau Prep or some other, you know, something that you wouldn’t call a heavy lifter, you’re definitely going to find a proliferation of tools for loading data, yeah.
Yeah, I would guess that the existence of so many cloud-based tools also accelerates that proliferation, just because there's less friction in getting started. If I can drop my credit card into a cloud tool's portal and get a month of service for a small amount of money, that's a lot easier than getting approval in my organization to purchase licenses for a tool. But I also think that can be a false economy, because then you have the question of what it actually takes to use the tool effectively and integrate it with your other tools.
Yeah, well, the usual hurdle is data access, right? You can buy a service and get set up, but the first time you try to access that database that's behind a firewall somewhere, it's, oh, whoops, they won't let me in. That's a very real roadblock.
Yeah. All right, guys, we're getting close to the top of the hour. A couple of quick questions. Rick, I was curious: when you were talking about changing platforms, I think Informatica to SSIS was the example you gave a moment ago. Is it straightforward to migrate, or is it some heavy lifting?
As far as I know, there's no tool that will do the migration of the code for you. You basically have to open things up on the Informatica side, look through to see how all the lookups are being done and what transformations are happening, and then translate that into SSIS. So it's not an easy thing to do. And I think we've done some Data Manager to DataStage migrations too; same thing. You have to know both tools, go in, open everything up, find the logic and what's being done to the data, and then rewrite it in whatever tool you're migrating to. So yeah, it's not an easy process; I don't know of any migration tool that does it for you.
And I would say I've seen just the opposite: I've been on projects where we've gone from SSIS to Informatica, and it's the same thing; you have to open up the toolsets. At that point it almost becomes a reengineering exercise: what was wrong with the existing process, and what improvements could we make? Normally, because the process is so integrated into everything we're doing, we're not going to break things. But in a tool migration, you actually get a chance to start over. You can think about what you would have done differently now that the system has been running for a period of time: what changes would make the process easier to maintain, or more efficient? So tool migrations open up the ability to think about what we could do differently, rather than just doing an apples-to-apples migration, where we have some form of testing to ensure the process still delivers the same result sets. Somewhere in between, we can ask how to get a little more efficient in how we process the data.
Yeah, it’s interesting. I think when you gave your example initially, Rick, you said basically they didn’t want to pay the license fee, right? They were trying to save on licensing costs and that was kind of the driver for that switch. But of course there’s the additional not insignificant cost of what it actually takes to do not migration work.
All right, gang. We're coming up to the end here, so I'm going to call it a wrap on our roundtable discussion. I'm going to show the deck here again, but we'll all still be here. Thanks, everybody, for joining today. If you have questions beyond what we covered in the discussion today, you can always reach out to [email protected], or click on Scott Felton's calendar link there in the chat window. We're always happy to talk to you about your current challenges. If you want to do a deep dive on topics we discussed today, again, just reach out to us through our website. You can also find lots of additional resources on the Senturus website. We've been in this business for a long time, so there's a wealth of information on our site; check it out. There are lots of details about the tools we've discussed today, and others.
A little bit about upcoming events: we've got a bunch of stuff coming up. We have a Chat with Pat session on Power Query Editor for data cleansing coming up later this month, and one on Power BI datamarts. We'll have a session on Cognos Analytics performance tuning; for those of you who are Cognos folks, don't miss that. Our own Todd Schumann, expert on all things Cognos, will cover that, and there's another Chat with Pat on building good visualizations in Tableau. You can register for any of these at Senturus.com/events. Here at Senturus, we specialize in modern BI, particularly in hybrid environments. Nowadays we see a lot of Cognos, Power BI and Tableau in various combinations.
We’re happy to help you out with all of that. No, we’ve been in business for 22 years now, over 1400 clients, over 3000 projects. So we’ve got a lot under our belts. Just between the three experts on the call here today with a bunch of these clients and projects have all been touched by these guys over the years. So you’re in good hands with us. That’s the point.
If you’re looking to change positions that we are hiring and we’re looking for a senior Microsoft BI consultant, so if that is of interest to you, jump over to our site, visit Senturus.com why Senturus careers? Or you can just send your resume straight to us at jobs next Senturus.
Dot com all right. And with that, thank you everybody for joining us today. Brad, Rick, Peter. Thank you guys for being here as the experts for today’s roundtable discussion. Great to have you and thank you everybody for attending. We hope to see you again on future Senturus webinar. Thanks everyone. Have a good day. Thank you, everyone.