BI-Ops – 1/5 – What we are trying to do
Over the next five blog posts, I’ll be going into the details of why and how my team has reached the level of maturity and control over its BI development process that it has.
There is still a lot to improve but I believe we have reached a point where we should be proud of ourselves and others could benefit from looking into how we got most of our build and deployment work automated.
BI-Ops – 1/5 – What we are trying to do
BI-Ops – 2/5 – Source Control (Git)
BI-Ops – 3/5 – Build (CI)
BI-Ops – 4/5 – Deploy DB, SSIS and SSAS (CD)
BI-Ops – 5/5 – Deploy PowerBI (CD)
BI is mostly about organizing data: building the right queries to give your business the results it needs in an acceptable time. My team works mostly with SQL Server 2019 and Power BI Report Server.
The database is not your playground
My first job was subcontracting for a major French bank doing BI. We were doing really awesome stuff that even 10 years afterwards I still haven’t seen become mainstream in the industry I now work in. SQL Server 2008 R2 was out, PowerPivot was an awesome Excel plugin that just came out, Sharepoint integration made our eyes sparkle. Looking back at those days, I can’t help laugh inside.
And back in those days, we developed in SSMS. We had production environments, UAT, pre-production that was a daily refresh of production, and of course our DEV server. And we did everything on the DEV server. Write some code? just save it on the DEV SQL Server Instance. Tables, views, stored procedures… everything was on there. SVN? Yeah, we used that, but it was a pain to save everything to *.sql files and commit. And sometimes there were conflicts. Saving to that server was just so fast, so convenient. What could go wrong? Well… here is a list of some of the things that went wrong:
- A colleague saved a stored procedure on top of the version I had just pushed. Concurrency? Nope. My work was gone. Local copy? Why would I have one, everything was on the DEV server right?
- We’re finally done with UAT and push into production. Great ! Happy client. And after a few drinks it hits us: we have been working on DEV for 2 weeks preparing the next version. But we should refresh it with production data to be up to date. Oops. We need to script everything to restore on top of that refresh.
- What I coded for the last 2 days is rubbish. I should come back to what I had on Monday. Of wait… production is 1 month behind, UAT 2 weeks behind. I don’t have a version of what I did over the whole of last week.
- My “favorite” is walking in one morning and our boss telling us “the DEV server is dead”. And for a few seconds it hits you: that thing isn’t backed up. That dear server we loved, cherished every day got deleted by a sysadmin making a mistake. But who cares, everything important is backed up. Right?
At the end of the day, all the above could have been avoided if we had used source control. But source control isn’t native in SSMS. Source control was not fun. But it wasn’t fun because we mostly didn’t have a strong process around it. Only our lead dev had everything well planned out because he knew! Damn those grumpy old guys. He had been through all this before. I had less that 3 years of experience, I was invincible! But now, I have learned.
Also, I won’t even go over why you are not developing straight in production like I’m sure some of you readers are. Those who have pressed F5 a bit too quickly on an update statement without having the WHERE clause selected will know.
IDEs are awesome
Having said all this, what should we do then? How do others do it? People in BI, they seem to be all over the place. A bit like we were when we started looking around. We know what to do but it seems like such a pain to implement. So how do others do it?
It happens that we have a sister team in my company. A team I used to be part of. A team with C# junkies that swear you only need two tools to be happy: Visual Studio and Git.
Our first reaction was: “Visual Studio? But we have SSMS! Its just Visual Studio but better for us SQL maniacs. Turns out, its not. One is an IDE and the other is an administration tool.
So we decided to use Visual Studio, with a bit of pushing from our most senior dev. I feel like he had been through all of the horrible stuff above as well. I mean, he even refuses to be sysadmin on SQL Server instances! Crazy guy. Or is he?
After a week, we came to realize that Visual Studio brings us a lot great functionalities to make our work faster, better quality and integrates very well with source control which I’ll talk more about in the next chapter. And that is VS for SQL, as it is mandatory for SSIS and SSAS development anyway.
Production is off limits
So, why would my most senior developer NOT want sysadmin access on production? Is temptation to screw it up really that great? Well, after 10 years of experience in the IT business, I agree with him.
Developers don’t get the keys to the production environment. Repeat after me: Developers don’t get the keys to the production environment.
Developers and Administrators both think they can do the job of one another. But they are 100% convinced the other cannot do their job. See where this is going?
So which access rights did we want to give to each persona?
Our design is that everybody can read production environments. Isn’t that the goal of self-BI? (to the exception of a few confidential tables)
The team can write in production (you never know when you’ll need to do a quick update or delete on a technical table) but with a different account that the one they usually work with. No risk of doing things unintentionally.
And finally, two experts from our team are administrators of the whole environment. They use a 3rd set of accounts for that, so again, nothing unintentional should be happening there.
So this means that publishing can’t really be done by just clicking “Publish” in Visual Studio. And considering we are trying to challenge our trainees to work on critical projects as well, it is a design we like. Our goal is that we deploy from source control only. And we deploy in production the same package that was deployed and validated in the User Acceptance Tests environment. No changing a query at the end of testing and pushing it without having it tested and approved by the business. To do this, we use the DevOps tools our application-developing colleagues use.
Beginners can hop on quickly
Another great advantage of having a controlled environment where team members have a tuned and pretty straight forward process to deploy their databases, cubes, packages and reports is that people joining our team are guided, can’t mess up much and are very quickly autonomous all the way to pushing their code into production. They don’t need an expert to explain to them every detail of the environment and they can progressively learn about it all at their own pace.
It also allows us to add tools to the process and automate them to the point where people don’t need to change their configuration or even learn the details of the new tool to use it. This is for example something we plan to do with our documentation generator. Instead of each team member generating the html documentation dedicated to what he just worked on (new tables, views, dimensions, facts, etc.), they just have to deploy. In the pipeline, we will add a brick that takes care of it and publishes it on the documentation website. It lets each team member focus on his area of expertise.
We have control over our environments
This is perhaps the most important point. By showing our company system administrators that we are trustworthy because we implemented such procedures, by scripting and being able to execute certain tasks like restoring a production database in our development environment, we no longer depend on other people that already have a lot of work to do things which present no added value other than securing actions.
Having this level of trust brings a higher level of freedom, allowing us to deliver higher quality IT to our internal customers because we are less dependent on others when issues arise. Things just go faster with less people involved. This allows us to have high-level SLAs and a lower level of frustration for the developers.
And in the end, the only thing we used to get away with but we can’t anymore is writing technical documentation. We used to be able to get away with it because somebody else was doing the heavy lifting. Now, its all down to us and to what we share amongst ourselves as a team.
Why do BI-Ops ?
If there were 3 bullet points I’d have to slap on a slide to explain to management why we go through the trouble of working like this they would be:
- Secure our environments
- Empower our developers
- Improve our team efficiency and work quality
Actually, 4: add a cool buzzword on slides to show around 😎