We’ve all seen it, haven’t we? The explosion of new analytic tools and techniques, like R and TensorFlow, that promise the world huge advances from using machine learning and artificial intelligence.
The tide is definitely flowing in the direction of these new tools. Leaders in big organisations get excited, set up big-budget programmes and recruit expensive specialist staff, and then the IT department stops them doing their job. That, at least, is the data scientist’s perspective.
This leads to frustration and grief for both the data scientists and IT teams, but it doesn’t have to be a problem for either. Here we look at common problem scenarios and at ways for data scientists and IT security to collaborate and deliver a long-term solution for the company.
But, before you can find a solution, you often have to go through the different stages of grief.
Stage one: anger and denial
After a data scientist joins a company and wants to start using open source tools that have not yet been approved, they often need to put a request in. A conversation might go like this:
Data scientist: “I need to be able to install RStudio and download libraries from CRAN (https://cran.r-project.org/), so I can build some machine learning algorithms to generate real business value. How do I do it, given that my laptop is locked down?”
IT: “Is this on the Approved Supplier list?”
Data scientist: “No. But it is open source technology. You don’t need to buy it and you can easily download it.”
IT: “Well, you’ll have to get it approved first.”
Data scientist: “OK, well how do I do that?”
This strict process is understandable though. The core responsibility of IT teams is to ensure that data is secure and controlled, that there are no risks to core IT systems, that these services run without hindrance and all SLAs are met.
Allowing the use of just any piece of software that is not fully enabled and supported by the platform vendor would be negligent on their part. Ultimately, IT security would be the first people blamed if something went wrong. So, they need to be cautious.
But, at this point the data scientist wishes they’d joined a more progressive business, and then often does just that – leaves for a ‘better’ company, meaning that the project fails to deliver on its promises. Or, maybe worse, out of frustration someone hacks the system to get their tool installed anyway and crashes the production system.
This has happened in the real world, and this is where the story often ends. However, it doesn’t have to. The real question is whether traditional enterprises can make use of new and open source tools at all.
Stage two: bargaining
As is the case with every business, employees must work within the boundaries (such as security) set by the company and often this means being creative in their thinking. The first thing to do is to take a step back and ask: “What is the problem to be solved?”
To serve the business with the solutions it really needs, the different departments must collaborate and agree on an analytics strategy, outlining what they want to achieve, how they want to achieve it and what tools they need to do so.
There really is no substitute for sitting around the table and collaborating to come up with a solution that works. We really believe this should mean actually sitting around a table: it is far superior to a faceless web meeting or an extended email chain.
The advantage of insiders collaborating is that compliance with internal processes can be woven into the actual solution more effectively – not circumventing the process but exploiting the experience people have of actually making things happen.
Building the business case for change – quantifying the benefit from a ‘proper’ fix and considering this against the effort to solve it in the context of a future architecture vision – becomes a critical activity that the team needs to undertake.
Data scientists looking to introduce new software and tools to their companies should try to answer some core questions:
- What is the use case and why was an open source library identified to meet this?
- Can this requirement be met in a different way using tools the organisation already has (e.g. using the traditional tools such as SQL or SAS)?
- What is the wider scope for the output of this work – e.g. what systems/applications does it need to integrate with to drive business value going forward?
In our next article, we’ll look at how the teamwork between data scientists and IT can help to deliver a better long-term solution for the company, one that can keep both parties happy and deliver beneficial results for the business.
If you have found this interesting and want to explore further with Teradata, feel free to reach out to the authors – Stewart to talk about exciting use cases, or Greg for the boring technical stuff.