I have read several articles on the subject, but none of the authors were really Data Scientists, and they admit that, so I thought it was time something was written by an actual Data Scientist.
First off, let’s make sure you understand that there’s lots of college involved; there is no way around that one. If you noticed that a lady in the 2nd row, 3rd from the left, had a mole on her nose in the last commercial you watched, you might have what it takes, even if you hadn’t thought of Mathematics, Engineering or Econometrics as a field of study. What I am implying is that it takes someone who is VERY observant to be successful in Data Science. Why? Because you deal with such large data sets and such large outputs and results that your ability to absorb lots of information quickly and exactly is your best friend. I can scroll through a million records, or run a small SQL script and analyze the results, and tell you within minutes whether that data is bad or corrupt. Cleansing data is always the 1st step. If this part is left out, I can guarantee you will have lots of N/A’s, characters where numbers should be, and so on, so make QA your friend, not your enemy.
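To make that 1st step concrete, here is a minimal sketch of the kind of check I mean, written in plain Python. The function name, the list of missing-value markers, the column name and the sample rows are all made up for illustration; your own data will need its own rules.

```python
import math

# Assumed markers for missing values; adjust to whatever your source system emits.
MISSING_MARKERS = {"", "n/a", "na", "null", "none", "nan"}


def find_bad_numeric_values(rows, column):
    """Return (row_index, raw_value, reason) for every value in `column`
    that is missing or cannot be read as a finite number."""
    problems = []
    for index, row in enumerate(rows):
        raw = row.get(column)
        text = "" if raw is None else str(raw).strip()
        if text.lower() in MISSING_MARKERS:
            problems.append((index, raw, "missing"))
            continue
        try:
            value = float(text)
        except ValueError:
            # Characters where a number should be.
            problems.append((index, raw, "not numeric"))
            continue
        if not math.isfinite(value):
            problems.append((index, raw, "not finite"))
    return problems


# Hypothetical sample data, not taken from any real client.
sample_rows = [
    {"customer_id": "1001", "order_total": "49.95"},
    {"customer_id": "1002", "order_total": "N/A"},
    {"customer_id": "1003", "order_total": "12O.50"},  # letter O typed instead of zero
    {"customer_id": "1004", "order_total": ""},
]

for index, raw, reason in find_bad_numeric_values(sample_rows, "order_total"):
    print(f"row {index}: {raw!r} ({reason})")
```

A check like this only flags the suspicious rows. Deciding whether to fix, drop or question each one is still your job.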
What major or coursework produces the best Data Scientists? Econometrics and Mathematics, as long as they come with an additional major in Business. Why? Because of the logic involved, as well as the classic theory of left-brain people and numbers. Creative is great for making PowerPoint presentations, but when you have 10 terabytes of raw data, pretty is not the 1st thing on your mind. Minor in, or actively take, courses that will teach you programming. You don’t need hard-core Perl, but you will need SQL skills at the very least. Microsoft Visual Studio, the SSAS, SSIS and SSRS packages, SAS, SPSS, SQL, Cognos, macros and Visual Basic are all not only good to know but vital when you have multiple clients who use different CRM, BI and ETL tools.
Once the schooling ends, the real world begins. My 1st boss said, “Forget everything you learned in college, there is no ‘bell curve’ here,” meaning that statistics, programming, mathematics, logic and common sense are only the start. Practice cleansing data, extracting data, normalizing data, segmenting data, loading data, trending data, modeling… in other words, data data data data data. Never assume your results, never ignore anomalies, do keep an unbiased mind and never skimp on tools, software or classes. Yes, that’s right, I still attend webinars and read like crazy to stay sharp on my tools and techniques.
We desperately need more people in Science, Technology, Engineering and Mathematics (STEM), so please consider Data Science as a career. According to the latest study we’re in high demand, and some even consider us rock stars.
What can your data do for you? @Data_Nerd :o)