15.5 Exercises

Consider the datasets GlobalCitiesPBI.csv, 2016collisionsfinal.csv, polls_us_election_2016.csv, HR_2016_Census_simple.csv, UniversalBank.csv, and algae_bloom.csv (as described in the Exercises sections of Modules 8, 9, and 11), or any other dataset with a sufficiently large number of predictors.

  1. Establish 2-3 questions that you could try to answer with each dataset.

  2. Based on the questions obtained in Exercise 1, provide 3-5 subsets of features that would do a good job of representing each dataset (use some of the methods described in this module, or other methods as needed).

  3. Learn 3-5 reduced manifolds for each dataset (use some of the methods described in this module, or other methods as needed).

  4. How would you validate your results?
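
As a possible starting point for Exercise 2, the following is a minimal sketch of one common filter approach: ranking numeric features by the absolute value of their Pearson correlation with a target. It is not necessarily one of the methods covered in this module. The data is synthetic, and the names `pearson`, `rank_features`, `signal`, `weak`, `noise`, and `constant` are hypothetical, chosen for illustration only; none of them come from the datasets listed above.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no linear information
    return sxy / math.sqrt(sxx * syy)


def rank_features(columns, target):
    """Return (name, |correlation with target|) pairs, strongest first."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic toy data (an assumption for illustration, not one of the course datasets).
rng = random.Random(0)
n = 200
signal = [rng.gauss(0, 1) for _ in range(n)]
weak = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]
constant = [1.0] * n
target = [2.0 * s + 1.0 * w + rng.gauss(0, 0.5) for s, w in zip(signal, weak)]

columns = {"signal": signal, "weak": weak, "noise": noise, "constant": constant}
ranking = rank_features(columns, target)

# Keep the k highest-scoring features as one candidate subset.
k = 2
subset = [name for name, _ in ranking[:k]]
print(ranking)
print(subset)
```

A filter of this kind only detects linear, one-feature-at-a-time relationships, so it would be one candidate subset among the 3-5 requested. For Exercise 4, one option is to compare the subsets by refitting the same model on each and checking performance on held-out data.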