Data Science and Systems Integration

Posted by Adrian C. Keister on June 27, 2017

Data Science. Systems Integration. To many systems integrators, the two seem to meet at a clean hand-off point and otherwise have nothing in common. The systems integrators generate the data and store it; once that is done, their job is finished. The customers do what they like with the data, and the problem is solved. I would argue, however, that there are significant opportunities for helping customers at a higher level (higher, meaning not only further away from hardware but also higher up in the customer's management hierarchy) in the realm of data science.

What is data science? There are many definitions, but the one I like the best is the following: data science is the recognition of patterns in data to answer questions. Questions may be earlier in the process, such as: what are the patterns in the data? Or they may be further along in the process: given a particular pattern of light and dark regions in an image of a handwritten number, predict which digit it is.

To help customers at this higher level and provide more value, the project discussion needs to start at a higher level: what are the questions the customer wants answered? For many customers, questions do not seem to percolate down to the engineering level from upper management. The data needed to answer the questions is often already there, waiting for the questions, much like Jeopardy. Perhaps the customer is making widgets: a certain percentage are failing, and management wants to know why. But the end-of-line tester is already generating all the data the customer needs to answer that question. If only the customer would analyze that data, they could correlate it with the failures to provide a starting point for the investigation, and eventually find the root cause. This is precisely a data science scenario.
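To make the widget scenario concrete, here is a minimal sketch in Python of what "correlating the tester data with the failures" could look like. Everything in it is an assumption for illustration: the measurement names (`supply_voltage`, `leak_current_ma`), the `failed` flag, and the record values are invented, and a plain Pearson correlation is only one of many ways to rank candidate measurements. A high score is a starting point for the investigation, not a root cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column tells us nothing about the failures
    return cov / sqrt(var_x * var_y)


def rank_measurements(records, outcome_key="failed"):
    """Rank each measurement by the strength of its correlation with failure."""
    outcomes = [1.0 if r[outcome_key] else 0.0 for r in records]
    names = [k for k in records[0] if k != outcome_key]
    scores = {
        name: pearson([r[name] for r in records], outcomes) for name in names
    }
    # Strongest relationship first, regardless of sign.
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical end-of-line test records (invented values, illustration only).
records = [
    {"supply_voltage": 5.01, "leak_current_ma": 0.10, "failed": False},
    {"supply_voltage": 4.99, "leak_current_ma": 0.12, "failed": False},
    {"supply_voltage": 5.02, "leak_current_ma": 0.45, "failed": True},
    {"supply_voltage": 5.00, "leak_current_ma": 0.11, "failed": False},
    {"supply_voltage": 4.98, "leak_current_ma": 0.50, "failed": True},
    {"supply_voltage": 5.01, "leak_current_ma": 0.09, "failed": False},
]

ranking = rank_measurements(records)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

In this made-up data set the leakage current tracks the failures far more closely than the supply voltage does, so that is where an engineer would look first. With real tester logs, the same idea applies to whatever measurements the tester already records.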

Having systems integrators do the data science has a number of significant advantages for the customer and for the integrators:

  1. Because the integrators are generating the data in the first place, they understand the data already. Understanding the data is always an important step in the data science process, and it is already done!
  2. The systems integrators can produce a more tightly integrated system as a whole than if they merely generate the data and stop. Here the system is regarded as the data generation piece plus the data analysis and presentation piece. 
  3. From the data scientists' perspective, this is practically an ideal situation in which to do data science: the data is already far cleaner than what data scientists typically receive! Most data scientists spend a good deal of their time cleaning (they call it "munging") the data before they can even start analyzing it to find patterns. But systems integrators are already highly experienced at producing exceptionally clean data.
  4. It is more business for the systems integrators. (Naturally, being a data scientist in a systems integration company, I had to include this point!)
  5. The systems integrators, simply by prodding the customer in this direction, have the opportunity ultimately to solve the customer's real problems. 

In an ideal scenario, then, the process would be top-down, like so:

  1. Find out what the customer's central questions are. 
  2. Determine what data and analysis of the data would be necessary to answer those questions. 
  3. Design the hardware and software necessary to generate the data.
  4. Build the system, test, debug, etc. 

This process addresses (and automates!) a larger swath of the customer's business and thereby provides considerably greater value. 

Topics: Systems Integration, Data Science