Data Science for Business: What you need to know about data mining and data-analytic thinking
Foster Provost and Tom Fawcett
Data science is the new best thing but, like Aristotle’s elephant, people struggle to define
exactly what data science is and what skills it requires.
When we see data science we tend to recognise what it is, that mixture
of analysis, inference and logic that pulls information out of numbers, be it social
network analysis, plotting interest in a topic over time, or predicting the impact of the
weather on supermarket stock levels.
This book serves as an introduction to the topic. It’s designed for use as a
college textbook and perhaps aimed at business management courses. It starts at a very
low level, assuming little or no knowledge of statistics or of any of the more advanced
techniques such as cluster analysis or topic modelling.
If all you ever do is read the first two chapters, you’ll come away with enough
high-level knowledge to fluff your way through a job interview, as long as you’re
not expected to get your hands dirty.
From chapter three, things get a bit more rigorous. The book noticeably changes
gear and takes you through some fairly advanced mathematics, discussing
regression, cluster analysis and the overfitting of mathematical models, all of
which are handled fairly well.
It’s difficult to know where this book sits. The first two chapters are most
definitely ‘fluffy’; the remainder demands some knowledge of probability theory
and statistics from the reader, plus an ability not to be scared by equations embedded
in the text.
It’s a good book and a useful one. It probably asks too much to be ideal for the
general reader or even the non-numerate graduate. I’d position it as an
introduction to data analysis for beginning researchers and statisticians
rather than as a backgrounder on data science.
[originally written for LibraryThing]