Advanced analytics is becoming a hot topic among businesses.
If you’ve read my primer on the Internet of Things, then you’re already aware of some of the ways that businesses will be collecting data in the near future. It’s happening already, in fact. Sensors on manufacturing equipment, online engagement tracking, and customer behaviors around brands are already generating data in ways that weren’t possible years ago. Gartner has reported that by 2018 more than half of large organizations globally will compete using advanced analytics and proprietary algorithms, and by 2020 predictive and prescriptive analytics will attract 40 percent of enterprises’ net new investment in business intelligence and analytics.
With all of this new data available, how do we sift through it all? It isn’t a task of finding a needle in a haystack. Instead, it’s a task of sorting the haystack into groups of individual pieces of hay by size, color and texture, and then identifying the trends in those pieces of hay to determine how a needle would be lost within the stack. Sound daunting? Well, without some sort of program helping you do it, it’s essentially impossible.
Enter Dell Statistica, Dell’s award-winning advanced analytics platform. Statistica, and advanced analytics platforms like it, allow end users to manage, analyze and discover trends based on the data they collect. That data could be a number of things. Perhaps you’re monitoring how and when users are engaging with your website. Maybe you’re keeping track of levels of material and heat being used in your manufacturing process. It could be that you are tracking usage patterns for new technologies you’ve introduced to your employees. Statistica can be modified in order to analyze myriad data sets and give you useful information based on that data.
At the Commonwealth Hotel beside Fenway Park in Boston last week, Dell announced to a group of reporters and end users the latest major release of Statistica, version 13.1. The new release offers capabilities designed to meet the specific needs of the modern “citizen” data scientist. As the global need for traditional data scientists outpaces the available supply, citizen data scientists – everyday, non-technical users who are embedded in the line of business – will become the driving force behind analytics initiatives. Dell Statistica 13.1 addresses this emergent need with new data preparation functionality built specifically for the citizen data scientist that simplifies the preparation of structured and unstructured data.
Advanced Analytics on the Manufacturing Floor
As a part of the Dell Statistica dinner, Dell introduced those in attendance to Rob Dimitri and Gloria Gadea-Lopez of Shire and Tim Alosi of Sanofi. Both Shire and Sanofi utilize Statistica on the manufacturing floor. Because both companies are in the pharmaceutical industry, data is extremely important both in ensuring that yield is at maximum capacity (and therefore maximum profit) and in ensuring that the process and yield meet strict FDA requirements. Statistica has allowed both companies to better monitor their processes and better produce their manufactured products.
I was able to speak at length with Gloria Gadea-Lopez about how Shire has utilized Statistica in its process. Gloria works as the Director of Manufacturing Systems in Internal manufacturing at a facility that manufactures only about 30 batches of its product per year, so every batch is extremely important to the overall profitability of the company. More importantly, these batches will be used to combat real-life diseases, so there is serious pressure to make sure that there are no anomalies and that all of the batches are delivered.
“My joke is, we don’t have big data, we have wide data,” says Robert Dimitri, Associate Director Manufacturing Systems at Shire. “Meaning we’re lucky if we manufacture 30 batches in a year. Our processes run for weeks and months. So we have to collect an enormous number of variables in order to monitor our process. We need to enable the end users on the data side with as much data as possible because that’s all we have. With the length of time that our processes run we have to be aware of what’s going on and able to react to anything that’s happening during the manufacturing process. Having that visibility is absolutely critical.”
There was once a time when scientists at the facility would hand-record data samples and enter them into spreadsheets for analysis. This took time, as data could only be obtained by checking sensors manually, and in many cases by manually taking samples and recording the data. By adding sensors and tying them into a data collection platform, Shire was able to save time and effort, and collect data sets at a much faster rate: what used to take weeks and months now takes days and weeks.
Statistica then analyzes those data sets to provide information about how the manufacturing processes are performing. This analysis can happen for processes that run daily as well as those that take months to run their course. Data scientists within the manufacturing facility can then take the analysis and use the information to better run their processes. So, for example, analysis might recognize that yield is greatest when a batch has a certain pH level at a certain temperature. Scientists can then monitor the pH level and temperature to ensure the greatest yield. And this happens in near real time, so scientists don’t have to wait until the batch is finished to understand how well the batch will come out. They can monitor as it goes in order to create the best yield for every batch.
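To make the pH and temperature example concrete, here is a minimal sketch in plain Python of what checking in-process readings against target windows might look like. It is not Statistica’s API or Shire’s actual setup: the Reading class, the function names and the target ranges are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One in-process sensor reading for a batch (hypothetical structure)."""
    batch_id: str
    ph: float
    temperature_c: float


# Hypothetical target windows; real values would come from prior analysis.
PH_RANGE = (6.9, 7.3)
TEMP_RANGE_C = (36.0, 38.0)


def out_of_range(reading):
    """Return the names of the variables that fall outside their target window."""
    problems = []
    if not PH_RANGE[0] <= reading.ph <= PH_RANGE[1]:
        problems.append("ph")
    if not TEMP_RANGE_C[0] <= reading.temperature_c <= TEMP_RANGE_C[1]:
        problems.append("temperature_c")
    return problems


def monitor(readings):
    """Collect an alert for every reading with at least one variable off target."""
    alerts = []
    for reading in readings:
        problems = out_of_range(reading)
        if problems:
            alerts.append((reading.batch_id, problems))
    return alerts
```

Fed readings as they arrive, a check like this flags a drifting batch while it is still running rather than after it has finished.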
Statistica also allows Shire to scale and validate data. “Things happen in an environment of constant change. So maintaining compliance while operating in constant change is challenging,” says Gadea-Lopez. “So we need to have the right application that allows us to manage change. At the same time, continuing the expansion of these applications, either by onboarding new products and processes, or by expanding the use of the core application to new areas of use.
“One area that’s very interesting right now is that the facility in Lexington utilizes disposable technology. Traditionally, we make the product using stainless steel [reusable technology]. Single-use systems are a completely different way of manufacturing. We have another plant that makes the same product on the traditional stainless steel technology. So we can monitor performance, get approval for the facility, and demonstrate parallels. That’s where these applications are priceless – we can monitor and use all of the data from the separate systems at the same time.”
What’s New with Statistica?
Shire is just one example of how Statistica has helped companies save money and manage their processes. There are plenty of different ways, in different capacities, that Statistica can be utilized. Don’t think of it only in terms of manufacturing. Think of it in terms of sifting through data, any data, analyzing it, and giving you information at a rate that humans never could. Statistica 13.1 aims to help companies in all disciplines make better use of their data.
Statistica 13.1 introduces a slew of new capabilities to the already powerful platform. In tandem with Statistica’s Reusable Process Templates, it’s now easier than ever for users to share and distribute analytic workflows with non-technical users. With Statistica, traditional data scientists can build analytic models and workflows once, and non-technical business analysts can reuse those workflow templates repeatedly within the organization. This eliminates redundancy and allows business users to efficiently use analytics to solve real business problems without the technical expertise traditionally required. Other new capabilities include:
- Enhanced offering provides citizen data scientists with streamlined data prep and analytic workflows for greater ease of use and increased efficiency
- New release features edge scoring capability to address nearly all IoT analytics use cases, which reduces expense of streaming large amounts of data to a central analytic repository
- Latest version extends in-database analytics to multiple platforms including Apache Hive, MySQL, Oracle, and Teradata, while adding network analytics for easier fraud detection
- Improved visualization dashboards, enabling users to easily visualize the results of any analytic node and tie process-specific visualizations to the top level of MAS dashboards
- An upgraded Web UI that allows users to distribute analytic outputs with a modern look and feel in any web browser
- Enhanced validated data entry, ensuring that individuals relying on manual data entry can build data quality into the point of collection so it can be trusted for robust analytics
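The last item, building data quality into the point of collection, can be illustrated with a small sketch in plain Python. This is not Statistica’s validated data entry feature; the ENTRY_RULES table, the field names, the ranges and the validate_entry function are hypothetical and only show the general idea of rejecting a bad manual entry before it is stored.

```python
# Hypothetical allowed ranges for manually entered fields.
ENTRY_RULES = {
    "ph": (0.0, 14.0),
    "temperature_c": (0.0, 100.0),
}


def validate_entry(field, raw_value, rules=ENTRY_RULES):
    """Check one manually typed value (a string).

    Returns (value, None) if the entry is accepted,
    or (None, error_message) if it is rejected.
    """
    low, high = rules[field]
    try:
        value = float(raw_value)
    except ValueError:
        return None, f"{field}: '{raw_value}' is not a number"
    # NaN fails every comparison, so it is rejected here as well.
    if not low <= value <= high:
        return None, f"{field}: {value} is outside {low} to {high}"
    return value, None
```

Rejecting the entry at the moment it is typed means the person who made the mistake can correct it immediately, instead of an analyst discovering it weeks later.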
With these new capabilities, it will be easier than ever for companies to analyze data using Dell’s Statistica platform.