Theoretical relevance can make a model highly valuable,
but sometimes it's all about performance and product quality,
balancing bias and variance as best possible.
What is modeling?
The real world is a messy place,
but this is where data comes from.
While data might be generated by a "true" mechanism,
that mechanism may be unknown, and obscured by interference and noise.
Some modeling seeks to simulate real world phenomena,
but for us modeling will exist in the context of explaining data,
and "fitting it" to identify and predict real-world processes.
The role of mathematics
Mathematics is integral to modeling,
as it provides tools and frameworks for abstraction.
While models might have physical or social foundations,
these must be expressed mathematically to be applied.
A mathematical model might use simple, high school algebra,
or any part of calculus or matrix algebra as a framework.
Calculus
Calculus is all about rates of change and accumulation.
This becomes very important when approaching optimization,
e.g., when error is minimized to estimate a parameter.
However, calculus is a continuous (smooth) science,
while measurements and computations are always discrete,
so its day-to-day value for data science lies largely in intuition
and in model development or analysis, rather than direct application.
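To make the optimization idea concrete, here is a minimal sketch of estimating a single parameter by minimizing squared error. The data values, the function names, and the step size are all assumptions chosen for illustration. The slope of the error tells us which way to nudge the estimate, and for this particular error the answer should land on the ordinary average of the data.

```python
# Hypothetical measurements; the values are made up for illustration.
data = [2.0, 3.5, 4.0, 5.5, 7.0]


def squared_error(mu, xs):
    """Total squared distance between the estimate mu and each data point."""
    return sum((x - mu) ** 2 for x in xs)


def error_slope(mu, xs):
    """Derivative of squared_error with respect to mu."""
    return -2.0 * sum(x - mu for x in xs)


def minimize_error(xs, start=0.0, step=0.05, iterations=200):
    """Repeatedly step downhill along the slope of the error (gradient descent)."""
    mu = start
    for _ in range(iterations):
        mu -= step * error_slope(mu, xs)
    return mu


estimate = minimize_error(data)
mean = sum(data) / len(data)
print(estimate, mean)  # the two values should agree closely
```

Note that the computer never takes a true limit here: it takes small discrete steps, which is why calculus serves more as a guide to the method than as the computation itself.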
Differential calculus is about rates (slopes)
The derivative f′(x) of a curve at a point is the slope of the line tangent to that curve at that point.
This slope is determined by considering the limiting value of the slopes of secant lines.
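The limiting process can be sketched numerically. The example below assumes the curve f(x) = x^2 and the point x = 3 (both chosen only for illustration) and computes secant slopes over shrinking gaps h. The slopes should approach the tangent slope f′(3) = 6.

```python
def f(x):
    """Example curve, assumed for illustration: f(x) = x squared."""
    return x ** 2


def secant_slope(func, x, h):
    """Slope of the secant line through (x, func(x)) and (x + h, func(x + h))."""
    return (func(x + h) - func(x)) / h


# Shrink the gap h and watch the secant slope settle toward the tangent slope.
gaps = (1.0, 0.1, 0.01, 0.001)
slopes = [secant_slope(f, 3.0, h) for h in gaps]

for h, slope in zip(gaps, slopes):
    print(h, slope)
```

For this curve the secant slope works out algebraically to 6 + h, so the leftover error is exactly the size of the gap.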
Integral calculus is about accumulation
Integration can be thought of as measuring the area under a curve, defined by f(x), between two points (here a and b).
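As a rough sketch of accumulation, the area can be approximated by adding up many thin rectangles (a midpoint Riemann sum). The curve f(x) = x^2, the endpoints a = 0 and b = 3, and the number of rectangles are assumptions for illustration. The exact area in this case is 9.

```python
def f(x):
    """Example curve, assumed for illustration: f(x) = x squared."""
    return x ** 2


def area_under_curve(func, a, b, rectangles=1000):
    """Approximate the area under func between a and b with thin rectangles.

    Each rectangle's height is the curve's value at the middle of its base.
    """
    width = (b - a) / rectangles
    total = 0.0
    for i in range(rectangles):
        midpoint = a + (i + 0.5) * width
        total += func(midpoint) * width
    return total


area = area_under_curve(f, 0.0, 3.0)
print(area)  # should be close to the exact area, 9
```

Using more rectangles brings the sum closer to the exact area, which is the discrete counterpart of the integral.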
Required readings
If you have never seen calculus, please read
sections 2.2
and 2.4
from Wikipedia's article on the topic,
but do not worry if you don't understand the algebraic details,
i.e., just focus on gaining an intuition for rates and accumulation.
Linear algebra
Linear algebra is all about equations and solutions,
with restrictions on the types of operations considered,
i.e., only addition and multiplication by constants (making things linear),
and extreme generality for many dimensions.
For data representation this is an important framework,
e.g., raster images are matrices of color intensities.
In algorithms, linear algebra plays a central role, too,
e.g., Google's PageRank search algorithm is, at its core, a matrix multiplication!
PageRank is matrix multiplication
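A simplified sketch of the idea follows. This is not Google's actual implementation. The three-page web, the function names, and the damping value of 0.85 (a commonly quoted choice) are all assumptions for illustration. Each column of the matrix describes where one page's links point, and multiplying the matrix by the current ranks over and over lets importance flow along the links until it settles.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def pagerank(links, damping=0.85, iterations=100):
    """Rank pages by repeated matrix-vector multiplication (power iteration)."""
    n = len(links)
    ranks = [1.0 / n] * n  # start with every page equally important
    for _ in range(iterations):
        walked = mat_vec(links, ranks)
        # With probability (1 - damping) a surfer jumps to a random page.
        ranks = [damping * w + (1.0 - damping) / n for w in walked]
    return ranks


# Hypothetical web of three pages: A links to B and C, B links to C, C links to A.
# Entry [i][j] is the chance of moving from page j to page i.
links = [
    [0.0, 0.0, 1.0],  # into A: from C
    [0.5, 0.0, 0.0],  # into B: from A
    [0.5, 1.0, 0.0],  # into C: from A and B
]

ranks = pagerank(links)
print(ranks)  # page C should rank highest, then A, then B
```

Page C comes out on top because both other pages link to it, while B receives only half of A's attention.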
Required videos
If you've never had linear algebra, please watch the first four videos in this series:
the essence of linear algebra,
and once again, don't worry so much about the details,
but focus on gaining an intuition for the nature of the subject.
Calculus vs. linear algebra
Calculus usually gets extreme emphasis in math education.
However, there is contention over which is more important.
The debate is perhaps sharpest in data science,
where many attest to the usefulness of linear algebra over calculus.
Is this true, or simply backlash against historical emphasis?
Whether either is intuitively grounding or explicitly useful,
neither branch of math should really be left out entirely,
i.e., a data scientist is best off understanding both.