Sunday, September 4, 2016

ABC's of Data Science

Terms like Big Data (BD), Machine Learning (ML), Data Mining (DM), Cognitive Applications (CA), and Artificial Intelligence (AI) keep coming up in discussions these days, but what exactly are they, and how are they interrelated? Here is a simple explanation of some of these terms. Let’s start with Big Data.


1. What is Big Data? When exactly is data termed “Big Data”? Is it the size of an elephant or a dinosaur?
A: There is actually no fixed threshold beyond which data becomes big data. Data is big or small relative to the application in which we use it. When a particular application can no longer handle or respond to the data, that data is big for that application. For example, the same amount of data can be Big Data for Excel but not for SAS or SPSS. Hence, the term Big Data is not entirely appropriate in itself, and it is better to refer to such data as Large Data.
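To make the "big relative to the tool" idea concrete, here is a minimal Python sketch. It relies on one known fact: an Excel worksheet (Excel 2007 and later) holds at most 1,048,576 rows. The second entry in the table, `hypothetical_server_tool`, and its limit are made up purely for illustration and do not describe SAS, SPSS, or any real product. The function name `is_big_for` is also just an illustrative choice.

```python
# Documented worksheet row limit for Excel 2007 and later.
EXCEL_MAX_ROWS = 1_048_576

# Row capacity per tool. The second entry is a hypothetical tool with an
# assumed limit, included only to show the contrast.
TOOL_ROW_LIMITS = {
    "Excel": EXCEL_MAX_ROWS,
    "hypothetical_server_tool": 10_000_000_000,
}


def is_big_for(tool: str, n_rows: int) -> bool:
    """Return True when the dataset has more rows than the tool can hold."""
    return n_rows > TOOL_ROW_LIMITS[tool]


if __name__ == "__main__":
    n_rows = 5_000_000
    for tool in TOOL_ROW_LIMITS:
        label = "big" if is_big_for(tool, n_rows) else "manageable"
        print(f"{n_rows:,} rows is {label} for {tool}")
```

The same five million rows count as "big" for one tool and not for the other, which is the point of the answer above: the label depends on the application, not on the data alone.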

2. What is the difference between Data Analysis & Data Science?

A: Data Analysis is simply analyzing data. Data has been analyzed for ages using techniques such as MIS reporting, Six Sigma, and Lean. Data Science, however, is a little different from Data Analysis. Data Science is also data analysis, but here we analyze large data sets with the help of Machine Learning and/or Data Mining techniques. Data Analysis can be done with tools like Excel, whereas Data Science requires tools like SAS, SPSS, Python, or R. A Data Scientist needs statistical knowledge along with programming skills.