What Does It Take To Be A Data Scientist?

What does it take to be a data scientist? What do I need to learn? What skills should I have? What tools should I know?

I don't think anyone in my generation ever thought of becoming a data scientist. To begin with, the term "data scientist" didn't exist yet. According to this Wikipedia article, the term was coined in 2008. Today, it's regarded as the coolest job that could ever appear in anyone's profile.

Most of the time, I think in bullets, lists and outlines. That's simply why I like writing pseudocode. It's natural to me. However, I know a couple of people who think in matrices. It's natural for them, and they don't even realize how complex their default mindset is to the people around them. I think very few people can really think in multiple dimensions. But then again, is that relevant to becoming a true data scientist?

What is a true data scientist anyway? If you gather and analyze data, does that make you a data scientist? If you gather and process data so that they can be analyzed and turned into discoverable, meaningful information, does that make you a data scientist? What if you play a role in creating big data, AI and machine learning solutions? Does that qualify you as a data scientist? Should you be a statistician? A mathematician? A graph theory expert?

Is it enough that you use R, F#, Python, Scala, Go, SAS, Hadoop, Neo4j, Cosmos DB or MongoDB? Should you be good at APIs and tools like those from Microsoft, Amazon, Google, Apple, IBM and others? Is data science really a new breed of statistics? Is it analytics? Is it artificial intelligence (AI)? Cognition? What is it really?

Seriously, what does it take to be a data scientist?
