03. Data Science Techniques

Data science is an interdisciplinary field that combines scientific methods, processes, business understanding, and systems to gain knowledge or insights from data. A data scientist analyzes data using multiple methodologies to help companies improve their business through insights, predictions, and automation. These data-driven processes start with data and apply various techniques to produce insights. For this section, we have broken these techniques down into three categories:

  • Visualization – Presenting data so that a user can navigate through it and discover new insights
  • Machine Learning – Using data to make predictions about new data or to automatically find connections within the data
  • Deep Learning – Using data to make sophisticated decisions, similar to human capabilities, in a limited domain

VISUALIZATION

Data visualization makes data more visible and supports the move toward data-driven decisions. Being able to view the data can yield more insight into how the company is running. Visualization can help with:

  • Response Times – Putting the data into the hands of the people who can act on it improves response times.
  • Clarity of Vision – Users can get an overall understanding of a situation first, then drill down for more details to validate or disprove their hypotheses.
  • Pattern Detection – Users can absorb more information quickly and see patterns more readily. Visualization allows decision-makers to view data using graphical representations such as charts, fever charts, and heat maps.
  • Easier Collaboration – Through visualization, teams can quickly see the situation, adjust, and react to the new information.
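
The overview-then-drill-down idea above can be sketched in a few lines of Python. This is only a minimal illustration: the sales records, region names, and function names are made up, and a real dashboard would use a charting tool rather than printed text.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product, units sold).
SALES = [
    ("North", "Widgets", 12),
    ("North", "Gadgets", 4),
    ("South", "Widgets", 7),
    ("South", "Gadgets", 9),
    ("West", "Widgets", 3),
]


def totals_by_region(records):
    """Overview: add up the units sold for each region."""
    totals = defaultdict(int)
    for region, _product, units in records:
        totals[region] += units
    return dict(totals)


def bar_chart(totals):
    """Render one text bar per region, one '#' per unit sold."""
    lines = []
    for region in sorted(totals):
        lines.append(f"{region:<6}{'#' * totals[region]} {totals[region]}")
    return "\n".join(lines)


def drill_down(records, region):
    """Detail: return the per-product rows behind one region's bar."""
    return [(product, units) for r, product, units in records if r == region]


overview = totals_by_region(SALES)
print(bar_chart(overview))          # the big picture first
print(drill_down(SALES, "South"))   # then the detail behind one bar
```

The bar chart gives the overall picture at a glance, and drill_down shows the rows that make up a single bar.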

LEARNING TECHNIQUES FOR MACHINE LEARNING

Supervised Machine Learning – Teaching a system with data that has a known answer.

  • Classification – Assigning data to categories (e.g., predicting the type of flower from its measurements). Each new data point is placed into one of a set of discrete classes.

  • Regression – Predicting a numeric value (e.g., the amount of a loan based on data factors, or the number of people attending a football game). The answer can be any value in a continuous range.
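
To make the difference between the two supervised tasks concrete, here is a minimal sketch in plain Python. The flower measurements, attendance figures, and function names are invented for illustration, and the two methods shown (nearest neighbor and a least-squares straight line) are just simple stand-ins; many other algorithms exist for both tasks.

```python
import math

# --- Classification: the answer is a category ---
# Made-up training data: (petal length cm, petal width cm) -> flower type.
FLOWERS = [
    ((1.4, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((4.5, 1.5), "versicolor"),
    ((4.7, 1.4), "versicolor"),
    ((6.0, 2.5), "virginica"),
    ((5.8, 2.2), "virginica"),
]


def classify(measurement, examples):
    """Return the label of the closest training example (1-nearest neighbor)."""
    _, label = min(examples, key=lambda ex: math.dist(ex[0], measurement))
    return label


# --- Regression: the answer is a number ---
# Made-up history: (home-team wins so far, attendance in thousands).
# The data is kept perfectly linear so the result is easy to check by hand.
GAMES = [(1, 22.0), (2, 24.0), (3, 26.0), (4, 28.0)]


def fit_line(points):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


print(classify((1.6, 0.25), FLOWERS))   # a category
slope, intercept = fit_line(GAMES)
print(slope * 5 + intercept)            # a number: predicted attendance after 5 wins
```

In both cases the system learns from examples with known answers; the only difference is whether the answer is a class or a number.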

Unsupervised Machine Learning – The training data has no known answers, so the system discovers trends or features on its own.

  • Clustering – Grouping similar items together. It computes similarities between items based on their attributes and uses that information to group like items together. Clustering is often counted among data mining techniques.
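
As a rough sketch of how clustering can work, the example below uses k-means, one common clustering method (chosen here for illustration, not the only option). The customer data, the starting centers, and the function name are assumptions. Notice that no labels are supplied; the groups come from the data alone.

```python
import math


def k_means(points, centers, rounds=10):
    """Group points around the given starting centers (basic k-means)."""
    groups = []
    for _ in range(rounds):
        # Step 1: assign every point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: math.dist(p, centers[i]),
            )
            groups[nearest].append(p)
        # Step 2: move each center to the average of its group.
        # An empty group keeps its old center.
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return groups, centers


# Made-up customers: (visits per month, average spend).
CUSTOMERS = [(1, 10), (2, 12), (1, 11), (9, 50), (10, 52), (8, 49)]

# Start from two of the data points; real implementations often pick
# starting centers at random and run several times.
groups, centers = k_means(CUSTOMERS, [CUSTOMERS[0], CUSTOMERS[3]])
for group, center in zip(groups, centers):
    print(center, group)
```

Here the occasional low spenders and the frequent high spenders end up in separate groups, even though nobody told the system those two groups exist.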

 

DEEP LEARNING

Deep learning allows for deep pattern recognition and currently requires a lot of data for training. Techniques are being developed to reduce the amount of data needed, but they are not yet fully commercialized. Deep learning is making an impact in the fields of language, vision, and automation through text understanding, image recognition, image creation, text creation, self-driving cars, autonomous robots, and games (e.g., chess, Go, and Mario Brothers).
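
Deep learning models are built from layers of simple calculations stacked on top of each other. The sketch below shows that stacking idea with a tiny two-layer network in plain Python. The weights are hand-picked purely for illustration (an assumption, not a trained model); in real deep learning the weights are learned from large amounts of training data, and networks have many more layers and values.

```python
def relu(values):
    """Keep positive values and replace negative ones with zero."""
    return [max(0.0, v) for v in values]


def layer(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hand-picked weights (illustrative assumption; real networks learn these).
HIDDEN_W, HIDDEN_B = [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]
OUTPUT_W, OUTPUT_B = [[1.0, -2.0]], [0.0]


def predict(x1, x2):
    """Two stacked layers that output 1 when exactly one input is 1."""
    hidden = relu(layer([x1, x2], HIDDEN_W, HIDDEN_B))
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", predict(a, b))
```

A single layer of this kind cannot produce the "exactly one input is 1" pattern, but two stacked layers can, which hints at why depth helps with more complex patterns.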

For more about AI, watch the videos below:
What AI is? (2:31)
This video covers what AI can do as well as the social impact AI is having.


The Basics of AI and Business by Philipp Gerbert (12.5 min)
This video covers the advantages of AI and why it is good to understand AI.

Have FUN and CHOOSE Powerfully