Everyone knows how valuable big data can be. However, companies often make mistakes when they run large data science projects. Here are a few typical errors, so you can avoid them or correct them if they are already happening.
- Why all the Fuss? Big Data Stats
- Operator Error
- #1 – Expecting the solution to be immediate.
- #2 – Thinking you should focus on a big data strategy, rather than using big data in carrying out a business strategy.
- #3 – Entrusting data scientists to find solutions for business problems.
Why all the Fuss? Big Data Stats
As with any buzzword or trending business concept, it helps to look at some hard numbers to understand the idea from a 10,000-foot view. Here are a couple of telling figures on big data:
- If you think of the various data-producing systems your company manages as a pancake house, be assured that you are going to be cranking out a lot of flapjacks. Between 2013 and 2015, more data was produced than in all of prior human history.
- Again with the pancake analogy, I don’t know if you’ve noticed, but you’re making more pancakes all the time. Information isn’t just growing, but growing at an ever-increasing speed. By the end of the decade, approximately 1.7 megabytes of new data will be generated each second per person globally.
In other words, if you took all the data being produced and divided it evenly across the entire population, each person would be able to fill the storage capacity of the original disk drive in under 3 seconds. That’s right: the first disk drive, 1956’s IBM Model 350 Disk File, had 50 disks that together could store just under 5 MB.
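The arithmetic behind that comparison can be checked with a quick sketch. It uses only the two figures quoted above and treats the drive as holding a full 5 MB, which is an upper bound since the original stored slightly less; the function name is just for illustration.

```python
# Figures taken from the text above.
MB_PER_SECOND_PER_PERSON = 1.7  # projected new data per person, per second
IBM_350_CAPACITY_MB = 5.0       # upper bound; the drive held "just under 5 MB"


def seconds_to_fill(capacity_mb: float, rate_mb_per_s: float) -> float:
    """Return how long it takes to fill a drive at a constant data rate."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return capacity_mb / rate_mb_per_s


fill_time = seconds_to_fill(IBM_350_CAPACITY_MB, MB_PER_SECOND_PER_PERSON)
print(f"{fill_time:.2f} seconds")  # 2.94 seconds
```

Because the real capacity was a little under 5 MB, the actual fill time would be slightly shorter still.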
Related: The sheer size of big data projects, and the variability of needs at a given time, make the affordability of cloud attractive in these situations, notes InfoWorld. For elastic infrastructure to meet the vastness of big data, development and testing of analytics apps is often performed in public cloud environments. At Superb Internet, we remove single points of failure through distributed (rather than centralized) storage and achieve guaranteed always-zero packet loss through InfiniBand (rather than 10 Gigabit Ethernet). Explore our cloud for hosting your big data project.
Operator Error
Many companies that embrace big data projects aren’t able to accurately determine what aspects of big data, predictive modeling, and data science will be of greatest use to them. They want to move forward and keep up with the competition in this regard, but they often end up wasting their time accelerating in the wrong directions.
What are some errors companies make when they work with big data?
#1 – Expecting the solution to be immediate.
Firms want data algorithms or machine learning, designed by a highly skilled professional, to improve their business edge. When business leaders look at their particular challenge, they often think they will be able to get a consultant to quickly create a model or purchase a plan that allows them to do it themselves.
However, the example of one major enterprise reveals why it’s more complicated than that, explains Erik Severinghaus in Forbes. Netflix “employs 300 people to maintain and improve its content recommendations [because] customer data is a continuously changing environment,” he says. “That’s why the company also spends $150 million recommending movies and TV shows to its members every year.”
Essentially, recognize how complex these projects are rather than expecting to find a silver bullet for your problem right away.
#2 – Thinking you should focus on a big data strategy, rather than using big data in carrying out a business strategy.
Start with what you are trying to achieve for the business over the next 6 to 9 months. Then figure out how big data fits within that context to help you meet your goals.
Specifically, what are your major problems as a business? How might big data help you make better decisions?
Here’s a real-life situation from Jessica Davis of InformationWeek: A hospital in a big city noticed that more people with injuries were arriving at its emergency room following pro sports games. Doctors, nurses, and administrators all saw what was happening, but they needed their hunch to be quantified so they could figure out exactly how to adjust.
“[A]n investigation of the data showed that injuries went up by 27% on game day,” notes Davis, “and that quantification of the injuries was something the hospital could use to figure out how much it needed to augment its ER nursing staff for game days.”
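As a rough illustration of how a number like that turns into a staffing decision, here is a minimal sketch. It assumes nursing needs scale in direct proportion to injury volume, which the article does not claim, and the baseline of 10 nurses is a made-up figure. Only the 27% comes from the source.

```python
GAME_DAY_INCREASE_PCT = 27  # from the article: injuries rose 27% on game days


def game_day_nurses(baseline_nurses: int,
                    increase_pct: int = GAME_DAY_INCREASE_PCT) -> int:
    """Scale baseline staff by the expected rise in injuries, rounding up.

    Assumes staffing needs grow linearly with injury volume (a simplification).
    Integer math avoids floating-point rounding surprises.
    """
    if baseline_nurses < 0 or increase_pct < 0:
        raise ValueError("inputs must be non-negative")
    scaled = baseline_nurses * (100 + increase_pct)
    return (scaled + 99) // 100  # ceiling division by 100


# Hypothetical baseline of 10 ER nurses on a normal day.
print(game_day_nurses(10))  # 13
```

A real hospital would also weigh shift lengths, injury severity, and timing, but the point stands: once the hunch is quantified, the adjustment becomes a calculation rather than a guess.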
#3 – Entrusting data scientists to find solutions for business problems.
People who focus their careers on machine learning, statistics, and other data science specialties will be pivotal to moving your big data ideas forward. But you can’t just let these folks run without ample guidance and expect to meet your overarching business objectives.
“Data scientists typically build new models and solve intricate equations, leaving a business problem, however obvious, not a priority,” says Severinghaus. “Data scientists are only one part of the complex, cross-functional team required to create business value.”