So you want to become a data scientist? One thing we can assure you: the field is in high demand right now, and it generally pays well.
But let us focus on the important parts of becoming a data scientist.
Who is a data scientist?
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining – Wikipedia
Just as a scientist runs experiments, a data scientist tries to gain insight from the data available to them. Remember, since data comes in all shapes and sizes, it takes a lot of effort to master the art.
Why should one aspire to become a data scientist?
Data is the new gold: the more data you have, the more efficiently you can serve your customers with your products and services.
Major companies in Silicon Valley are hiring data scientists. They want to know what they can do with all the data.
They want to know why some products are used more and why others go unused.
And most importantly, they want to know where they are heading.
What do you need to know about becoming a data scientist?
Not much, as it turns out. You may already have the skills required to pursue a career as a data scientist. One thing should be very clear, though: becoming a data scientist does require a basic understanding of programming. If you have never programmed before, you will need to learn that first.
Skills you need to learn to become a data scientist
Programming scares a lot of people, but by mastering it you can explore a whole new world. If you don’t have programming skills yet, we suggest taking a good beginner course.
Data science has a great set of tools, but most of them work with a few specific languages. You can use any language for data science, but your learning curve might be steep and community support might be weaker.
Preferred languages for the data scientist role
Python – The most popular choice. It has huge community support and is easy to learn; in fact, a lot of people learn Python as their first programming language.
R – Built for statistical computing and data analysis. It has less community support than Python, and R does come with a learning curve.
Special degree? Companies such as Google and Apple no longer require a degree for many roles, so we won’t focus on degrees here.
As a data scientist, your job has everything to do with big data, so you must learn the basics of databases.
The learning curve is small: it takes roughly 15 to 24 days to learn the concepts, because querying a database reads much like the way human beings talk.
Database basics consist of:
- 10% concepts
- 90% SQL
What is SQL?
SQL stands for Structured Query Language. In simple terms, it means querying the database (getting data from the database) much as we would ask a human being a question.
Tell me which students are from the Maths department.
English into SQL
SELECT * FROM students WHERE department = 'maths';
Don’t worry about the details of the statement for now. If you can understand English, you can understand what the statement above is asking for.
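If you want to try the query yourself, here is a minimal sketch in Python using the built-in sqlite3 module. The table layout (name, department) and the sample rows are made up for illustration; the query above only tells us that a students table with a department column exists.

```python
import sqlite3

def students_in_department(conn, department):
    """Return all rows from the students table for one department."""
    # The ? placeholder lets sqlite3 handle the quoting of the value.
    cursor = conn.execute(
        "SELECT * FROM students WHERE department = ?", (department,)
    )
    return cursor.fetchall()

# Build a throwaway in-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Asha", "maths"), ("Ben", "physics"), ("Chen", "maths")],
)

maths_students = students_in_department(conn, "maths")
print(maths_students)
```

Only the two rows whose department is 'maths' come back; the physics student is filtered out by the WHERE clause.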
Working with big data relies heavily on tools and frameworks that make your life easier.
You can become a data scientist without mastering these tools, but it is usually better to use them and save time.
- Hadoop – A framework with which you can break your big data into smaller pieces and process them on different machines. Big data really is huge, and sometimes it is impossible to process it on a single machine.
We use frameworks like Hadoop to break the data into smaller pieces and then process those pieces on different machines to save time.
- Apache Spark – This framework works similarly, but it is often faster than Hadoop.
Why is Apache Spark faster than Hadoop? When reading and writing intermediate data, Hadoop works directly on the hard disk, while Apache Spark caches the data in primary memory (aka RAM).
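The split, process, and combine idea behind these frameworks can be sketched in plain Python. This is only an illustration that runs on a single machine and does not use the Hadoop or Spark APIs; the function names (split_into_chunks, count_words, merge_counts) and the sample lines are hypothetical.

```python
from collections import Counter

def split_into_chunks(items, chunk_size):
    """Break a list into smaller pieces of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]

def count_words(chunk):
    """Process one chunk on its own: count the words in its lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts

def merge_counts(partial_counts):
    """Combine the per-chunk results into one final result."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total

# Hypothetical sample data standing in for a much larger dataset.
lines = [
    "data is the new gold",
    "big data needs big tools",
    "tools save time",
    "data data data",
]

chunks = split_into_chunks(lines, chunk_size=2)
# In a real cluster each chunk would go to a different machine;
# here we simply process them one after another.
partials = [count_words(chunk) for chunk in chunks]
word_counts = merge_counts(partials)
print(word_counts["data"], word_counts["big"], word_counts["tools"])
```

Because each chunk is counted independently, the per-chunk step could run anywhere, and only the small partial results need to be brought together at the end.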
Common misconceptions about a data scientist career (busted)
- Data scientists wear a white coat and big black glasses. NO, they wear regular clothes.
- They have to work in a closed environment. NO, you can work from anywhere. Thanks to the internet, you can connect to your environment from anywhere in the world.
- You need a special degree. NO, a lot of people with nontechnical degrees are now data scientists.
- You have to take special classes or skip your job or classes to learn data science. NO, even if you can spare only 60 minutes a day, you can become a data scientist in about 3 months.
- As a student, I cannot pay huge fees to learn data science. NO, you can learn data science for free. In fact, we have a free tutorial series on PySpark.
- Data science is for big companies. NO, many government agencies and small businesses are looking for data scientists.
With this, we wish you luck on your journey. Feel free to drop a comment below and let us know what you think.
If you feel we have missed something, feel free to let us know.