Kadomin

Vector Space

We will work our way toward the definition of a vector space, starting from simpler algebraic ideas and gradually adding structure.

Inverses in Arithmetic

In everyday arithmetic, we learn four operations: addition, subtraction, multiplication, and division. However, subtraction and division are not actually new operations. Subtraction is just addition with an additive inverse. Example:

\(2-3 = 2 + (-3)\)

The additive inverse of \(3\) is \(-3\), because \(3 + (-3) = 0\). The number \(0\) is known as the neutral element or identity element because adding it to any number leaves the result unchanged. Division is multiplication with a multiplicative inverse. Example:

\(5 \div 3 = 5 \cdot \frac{1}{3} \)

The multiplicative inverse of \(3\) is \(\frac{1}{3}\), because \(3 \cdot \frac{1}{3} = 1\). Here, the number \(1\) is the neutral element with respect to multiplication, since multiplying by \(1\) leaves any number unchanged.

This viewpoint is helpful because it lets us focus on addition and multiplication as the core operations.
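In code, the same viewpoint can be sketched in a few lines of plain Python (the function names here are my own, purely for illustration):

```python
def subtract(a, b):
    """Subtraction is addition with the additive inverse of b."""
    return a + (-b)

def divide(a, b):
    """Division is multiplication with the multiplicative inverse of b."""
    return a * (1 / b)   # requires b != 0

print(subtract(2, 3))    # same as 2 - 3
print(divide(5, 3))      # same as 5 / 3
```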

Groups

Now we want to capture how objects interact given some operation we define. These objects could be anything, but for simplicity, think of them as numbers for now. We place these objects into something called a set. A set is simply a collection of objects or elements, written using curly braces. For example:

\(\{1,2,3\}\)

is the set containing the elements \(1\), \(2\), and \(3\).

Two key properties of sets:

  • No duplicates:
    \(\hspace{1cm}\)Writing an element more than once does not change the set.
    \(\hspace{1cm}\)For example, \(\{2,2\}\) is the same set as \(\{2\}\).
  • Order does not matter:
    \(\hspace{1cm}\)\(\{2,3\}\) and \(\{3,2\}\) represent the same set.

There is also a special set that contains no elements at all, called the empty set, written as:

\(\{\}\) or \(\emptyset\)

With a set, we have the objects we want to work with. The next step is to describe how these objects interact. This is done by defining an operation on the set. An operation is simply a rule that takes two elements from the set and assigns another element to them.

The operation can be defined in any way we choose. For example, we could invent a special kind of addition where:

\(1 \oplus 0 = 1\) and \(1 \oplus 1 = 0\)

Or we could define something entirely different if we wanted. To keep it simple, just think about the classical addition or multiplication for now. We usually write addition with the symbol \(+\), and multiplication with the symbol \(\cdot\). When speaking about a general operation (without specifying which one), we will use the symbol \(\circ\).
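That toy operation \(\oplus\) is simply addition modulo 2. A quick sketch (the function name is mine):

```python
def oplus(a, b):
    """The toy operation from the text: addition modulo 2."""
    return (a + b) % 2

assert oplus(1, 0) == 1
assert oplus(1, 1) == 0
```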

Now we are ready to define what a group is:

A group \((G, \circ)\) is a set \(G\) together with one operation \(\circ\) that follows certain rules:

Closure
\(\hspace{1cm}\)For all \(a,b \in G\), \(a \circ b \in G\).

Associativity
\(\hspace{1cm} (a \circ b) \circ c = a \circ (b \circ c)\) for all \(a,b,c \in G\).

Identity
\(\hspace{1cm}\)There exists an element \(e \in G\) such that \(a \circ e = a\) for all \(a \in G\).

Inverse
\(\hspace{1cm}\)For every \(a \in G\), there exists an \(a^{-1} \in G\) such that \(a \circ a^{-1} = e\).

Closure means that when an operation is applied to elements of a set, the result is still an element of that same set. For example, the set of natural numbers \(\mathbb{N} = \{1,2,3, \dots\}\) is closed under addition, since adding any two natural numbers always produces another natural number. In contrast, the set \(\{1,2,3\}\) is not closed under addition, because \(2 + 3 = 5\), and \(5\) is not part of the set.

Associativity means that the way elements are grouped under an operation does not affect the result.

The inverse property means that every element has a partner, or inverse, that cancels it out when the operation is applied. For instance, with addition on integers, \(3\) and \(-3\) are inverses because \(3 + (-3) = 0\).

The identity element is one that doesn’t change other elements when the operation is applied. For example, adding \(0\) to a number.

To clarify the definition, look at the following example: Consider the group \((\mathbb{Z}, +)\)—the integers under addition. You can check that all the group properties hold: the set of integers \(\mathbb{Z} = \{\dots, -2, -1, 0, 1, 2, \dots\}\) is closed under addition; grouping doesn’t matter (associativity); \(0\) serves as the identity element; and every integer \(a\) has an inverse \(-a\).

As a counterexample, the integers under multiplication, \((\mathbb{Z}, \cdot)\), do not form a group. While the set is closed and multiplication is associative, most integers do not have multiplicative inverses that are also integers—for example, the inverse of \(2\) would be \(\frac{1}{2}\)​, which is not an integer.
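We can’t brute-force all of \(\mathbb{Z}\), but on a small finite set the four group axioms can be checked exhaustively. Here is a sketch (the helper `is_group` is my own; the example uses the integers modulo 5):

```python
def is_group(elements, op, identity):
    """Brute-force check of the group axioms on a finite set."""
    elements = list(elements)
    # Closure: every result stays inside the set.
    if any(op(a, b) not in elements for a in elements for b in elements):
        return False
    # Associativity: grouping does not matter.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a in elements for b in elements for c in elements):
        return False
    # Identity: combining with it changes nothing.
    if any(op(a, identity) != a or op(identity, a) != a for a in elements):
        return False
    # Inverse: every element has a partner that yields the identity.
    return all(any(op(a, b) == identity for b in elements) for a in elements)

Z5 = range(5)
print(is_group(Z5, lambda a, b: (a + b) % 5, 0))  # True: a group
print(is_group(Z5, lambda a, b: (a * b) % 5, 1))  # False: 0 has no inverse
```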

Why this definition of a group? To answer that, consider the following analogy: Think of a group like the basic framework of a car. To be considered a car, you need a few essential components: a chassis, wheels that roll, a steering mechanism, an engine, and brakes. With these in place, you know it will behave predictably—you can drive it, turn it, and stop safely. Similarly, we now know how a group will behave, since we know that the set and its associated operation follow the rules stated above.

Once you have this basic car (the group), you can build on it. You might add features like racing seats, a subwoofer, or a massive spoiler—just as in mathematics, once we know we have a group, we can construct more complex structures like rings, fields, and vector spaces. With a group, we can predict how things behave, prove properties about them, and relate them to other “cars” (groups) that follow the same rules.

One simple upgrade to the group structure is commutativity:

\(a \circ b = b \circ a \) for all \(a,b \in G,\)

which means that the order in which we apply the operation does not matter. If this rule is satisfied along with the others, the resulting structure is called an abelian group, or simply a commutative group. Now, let’s upgrade our structure further by introducing another operation.

Rings

Consider a set \(R\) equipped with two operations: addition \((+)\) and multiplication \((\cdot)\). Together, they form a ring if the following conditions hold:

  • \((R, +)\) forms an abelian group.
  • Under multiplication, \(R\) is closed and associative, but not necessarily a group — only these two properties are required. (This structure is often referred to as a semigroup. Note that a semigroup does not require the existence of an identity element).
  • The two operations are connected by the distributive law, which states that:
    \(\hspace{1cm} a \cdot (b +c) = a \cdot b + a \cdot c\) and \((a + b) \cdot c = a \cdot c+ b \cdot c\) for all \(a,b,c \in R\).

It is important to note that rings do not require multiplication to be commutative, nor do they require multiplicative inverses for all elements.
(If a ring contains a multiplicative identity element, usually denoted \(1\), it is called a ring with unity, a unital ring, or a ring with identity.)

In essence, a ring is a specific extension of the group structure that adds a second operation, multiplication, alongside addition. The constraints on multiplication are less strict than those on addition.

If multiplication is also commutative (\(a \cdot b = b \cdot a\)) then the ring is a commutative ring. Still no multiplicative inverses needed. If every nonzero element has a multiplicative inverse, then the ring is a division ring. Multiplication does not need to be commutative in a division ring.
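A concrete illustration of why ring multiplication need not commute (an aside, not from the text): 2×2 real matrices form a ring under entrywise addition and matrix multiplication, and the order of multiplication matters. A sketch without any libraries:

```python
def mat_mul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(mat_mul(A, B))  # [[2, 1], [4, 3]]
print(mat_mul(B, A))  # [[3, 4], [1, 2]]: different order, different result
```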

We’re getting closer to the definition of a vector space — we just need one more thing, one final upgrade.

Fields

If a ring is commutative and every nonzero element has a multiplicative inverse (meaning a commutative division ring), we call it a field. In other words, a set \(F\) with addition \((+)\) and multiplication \((\cdot)\) forms a field if:

  • \((F, +)\) is an abelian group.
  • \((F \setminus \{0\}, \cdot)\) is also an abelian group.
  • The two operations are connected by the distributive law.

Below is the same definition, written without using group terminology, for those who want to see all the rules explicitly:

A set \(F\) is called a field if it contains at least two distinct elements, \(0\) and \(1\), with \(0 \neq 1\), and is equipped with two operations called addition \((+)\) and multiplication \((\cdot)\), such that:

Closure
\(\hspace{1cm}\)For all \(a,b \in F\), both \(a + b \in F\) and \(a \cdot b \in F\).

Commutativity
\(\hspace{1cm} a + b = b + a \) and \(a \cdot b = b \cdot a\) for all \(a,b \in F\).

Associativity
\(\hspace{1cm} (a + b) + c = a + (b + c)\) and \((a \cdot b) \cdot c = a \cdot (b \cdot c)\) for all \(a,b,c \in F\).

Identities
\(\hspace{1cm} a + 0 = a\) and \(a \cdot 1 = a\) for all \(a \in F\), where \(0\) and \(1\) are distinct.

Additive Inverse
\(\hspace{1cm}\)For every \(a \in F\), there exists a unique \(-a \in F\) such that \(a + (-a) = 0\).

Multiplicative Inverse
\(\hspace{1cm}\)For every \(a \in F\) with \(a \neq 0\), there exists a unique \(a^{-1} \in F\) such that \(a \cdot a^{-1} = 1\).

Distributive Property
\(\hspace{1cm} a \cdot (b + c) = a \cdot b + a \cdot c\) and \((a + b) \cdot c = a \cdot c+ b \cdot c\) for all \(a,b,c \in F\).

A field provides the perfect setting for all the arithmetic we take for granted—one where operations behave safely, predictably, and symmetrically.

In simpler terms, a field is a mathematical playground where the four basic arithmetic operations—addition, subtraction, multiplication, and division (except by zero)—all work together in harmony. This balance is what makes algebra and especially linear algebra possible.

Fields form the foundation of vector spaces. Whenever you work with vectors, matrices, or linear transformations, you’re implicitly relying on a field. It ensures that you can scale, combine, and manipulate numbers freely, without ever breaking the underlying arithmetic rules.

In short:

  • Groups give us basic structure (like addition and subtraction).
  • Rings let us add, subtract and multiply.
  • Fields let us add, subtract, multiply, and divide—consistently and predictably.

You can think of a field as the fully equipped car in our analogy. Groups gave us the basic framework—wheels, an engine, steering and braking. Rings added more systems, like comfort, safety features and of course the massive spoiler. But a field is the complete, road-ready vehicle: it can drive smoothly in any direction, handle any maneuver, and take you wherever you need to go in mathematics.

Common examples of fields in linear algebra are the real numbers \(\mathbb{R}\) and the complex numbers \(\mathbb{C}\). In these tutorials, we will focus on the real numbers \(\mathbb{R}\), since they are the ones most commonly used in machine learning. A real number is any number that can be represented on the number line, including positive numbers, negative numbers, zero, rational numbers, and irrational numbers. So just remember: the set of all real numbers \(\mathbb{R}\), together with addition and multiplication, forms a field.
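A brief aside beyond the text: finite fields exist too. The integers modulo a prime form a field, while a composite modulus breaks the multiplicative-inverse rule. A brute-force check (helper name is mine):

```python
def every_nonzero_element_invertible(n):
    """True if every nonzero element of {0, ..., n-1} has a
    multiplicative inverse modulo n."""
    return all(any(a * b % n == 1 for b in range(n)) for a in range(1, n))

print(every_nonzero_element_invertible(5))   # True:  Z mod 5 is a field
print(every_nonzero_element_invertible(6))   # False: 2 has no inverse mod 6
```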

The next ingredient to a vector space will be lists, so what are they?

Lists

Suppose \(n\) is a nonnegative integer. A list of length \(n\) is an ordered collection of \(n\) elements.

By “ordered,” we mean that the sequence of elements matters—so \((2,3)\) is not the same as \((3,2)\). Two lists are considered equal if and only if they have the same length and their corresponding elements are identical and in the same order. A typical list of numbers looks like this:

\((x_1, x_2, \dots, x_n)\)

A list of length \(0\) looks like this:

\(()\)

There is an important difference between lists and sets. In a list, the order of elements matters and repetitions are significant. In a set, however, order does not matter and repetitions are ignored.

For example:

  • The lists \((19, 8)\) and \((8, 19)\) are not equal, because the order of elements is different. But the sets \(\{19, 8\}\) and \(\{8, 19\}\) are equal, since order doesn’t matter in sets.
  • Similarly, the lists \((6, 6)\) and \((6, 6, 6)\) are not equal, because they have different lengths. However, the sets \(\{6, 6\}\) and \(\{6, 6, 6\}\) are equal, since repetitions in a set are ignored.
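Python happens to mirror this distinction directly: tuples behave like our lists (order and repetition matter), while Python’s built-in `set` behaves like the mathematical set:

```python
# Tuples model lists: order and repetition matter.
assert (19, 8) != (8, 19)
assert (6, 6) != (6, 6, 6)      # different lengths, different lists

# Python sets match mathematical sets: order and repetition are ignored.
assert {19, 8} == {8, 19}
assert {6, 6} == {6, 6, 6}      # both are just {6}
assert len({6, 6, 6}) == 1
```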

Now, we’ll combine fields with lists.

\(\mathbb{R}^n\)

\(\mathbb{R}^n\) is the set of all lists of length \(n\) of elements of \(\mathbb{R}\):

\(\mathbb{R}^n = \{(x_1, x_2, \dots, x_n) : x_k \in \mathbb{R} \; \text{for} \; k = 1, \dots, n\}\)

For \( (x_1, \dots, x_n) \in \mathbb{R}^n\) and \(k \in \{1,\dots,n\}\), we say that \(x_k\) is the \(k^{\text{th}}\) coordinate of \( (x_1, \dots, x_n)\).

To better understand the definition, let’s look at an example: \(\mathbb{R}^2\). Any point in \(\mathbb{R}^2\) can be written as \((x_1, x_2)\). For instance, take the coordinates \(2\) and \(3\), which give us the list \((2, 3)\). Picture a number line carrying the first number, and a second number line carrying the second number. Combining the two number lines, one perpendicular to the other and intersecting at the origin (zero), gives the familiar 2D plane, in which the list \((2, 3)\) marks a single point.

Here you can see that \((2, 3)\) and \((3, 2)\) are different points, which shows why the order matters. This was just a single point in 2D space, represented as a list. Now, consider every possible point in this 2D plane—each can be represented as a list. The set of all these lists, representing all possible points, is \(\mathbb{R}^2\).

This idea can be naturally extended to three-dimensional space by adding a third number line perpendicular to the first two. In this case, \(\mathbb{R}^3\) represents all possible points, each expressed as a list of length 3: \((x_1, x_2, x_3)\)—one entry for each direction. Visualizing beyond \(\mathbb{R}^3\) becomes challenging, but the same principles apply in any number of dimensions.

Next, we need to define how elements of \(\mathbb{R}^n\) interact with each other. We already have our set — now we’ll define the operations.

Operations on \(\mathbb{R}^n\): Addition and Scalar Multiplication

Addition in \(\mathbb{R}^n\) is defined by adding corresponding coordinates:

\( (x_1, x_2, \dots, x_n) + (y_1, y_2, \dots, y_n) = (x_1 + y_1, x_2 + y_2, \dots, x_n + y_n) \)

We can introduce a placeholder for the lists so that we don’t have to write out every element each time:

\(\mathbf{x} + \mathbf{y}\)

where \(\mathbf{x} = (x_1, \dots, x_n)\) and \(\mathbf{y} = (y_1, \dots, y_n)\). Observe that this operation is commutative; in other words, \( \mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}\).
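Modeling elements of \(\mathbb{R}^n\) as tuples, coordinate-wise addition is a one-liner (a sketch; the function name is mine):

```python
def vec_add(x, y):
    """Add two elements of R^n coordinate by coordinate."""
    assert len(x) == len(y), "both lists must have the same length n"
    return tuple(xi + yi for xi, yi in zip(x, y))

x = (1.0, 2.0, 3.0)
y = (4.0, 5.0, 6.0)
print(vec_add(x, y))                   # (5.0, 7.0, 9.0)
assert vec_add(x, y) == vec_add(y, x)  # commutativity
```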

Next, we’ll look at the neutral element, denoted by \( \mathbf{0} \), which has length \(n\):

\( \mathbf{0} = (0,0,\dots,0) \)

This is useful because it allows us to define the additive inverse in \(\mathbb{R}^n\):

For \(\mathbf{x} \in \mathbb{R}^n\), the additive inverse of \(\mathbf{x}\), denoted by \(-\mathbf{x}\), is the element \(-\mathbf{x} \in \mathbb{R}^n \) such that:

\(\mathbf{x} + (-\mathbf{x}) = \mathbf{0}\)

So if \(\mathbf{x} = (x_1, \dots, x_n)\) then \(-\mathbf{x} = (-x_1, \dots, -x_n)\).

Let’s move on to scalar multiplication:

The product of a scalar \(\lambda\) and an element in \(\mathbb{R}^n\) is computed by multiplying each coordinate of the element by \(\lambda\):

\(\lambda \mathbf{x} = \lambda (x_1, x_2, \dots, x_n) = (\lambda x_1, \lambda x_2, \dots, \lambda x_n)\)

where \(\lambda \in \mathbb{R}\) and \(\mathbf{x} \in \mathbb{R}^n\).

“Scalar” is really just another word for “number”; we’ll see soon why that name makes sense. Also, note that both vector addition and scalar multiplication in \(\mathbb{R}^n\) are closed operations. Finally, observe that the scalar comes from our field \(\mathbb{R}\), whereas the elements we add (the lists) come from \(\mathbb{R}^n\).
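Modeling elements of \(\mathbb{R}^n\) as tuples, scalar multiplication, the zero element, and the additive inverse look like this (function names are mine):

```python
def scalar_mul(lam, x):
    """Multiply every coordinate of x by the scalar lam."""
    return tuple(lam * xi for xi in x)

def additive_inverse(x):
    """-x is obtained coordinate-wise, i.e. (-1) * x."""
    return scalar_mul(-1, x)

x = (1.0, -2.0, 3.0)
zero = (0.0,) * len(x)   # the neutral element 0 of R^3

# x + (-x) = 0, coordinate by coordinate
assert tuple(a + b for a, b in zip(x, additive_inverse(x))) == zero
print(scalar_mul(2.0, x))   # (2.0, -4.0, 6.0)
```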

Finally, we have everything we need to define a vector space — specifically, lists of elements over the field \(\mathbb{R}\). The definition of a vector space arises from the properties of addition and scalar multiplication in \(\mathbb{R}^n\) that we have just introduced.

Vector Spaces

A vector space over a field \(F\) is a set \(V\) along with an addition on \(V\) and a scalar multiplication on \(V\) (by elements of \(F\)) such that the following properties hold:

Closure
\(\hspace{1cm}\)For all \(u,v \in V\) and \(a \in F\), both \(u + v \in V\) and \(a \cdot u \in V\).

Commutativity
\(\hspace{1cm} u + v = v + u \) for all \(u,v \in V\).

Associativity
\(\hspace{1cm} (u + v) + w = u + (v + w)\) and \((a \cdot b) \cdot v = a \cdot (b \cdot v)\) for all \(u,v,w \in V\) and \(a, b \in F\).

Additive Identity
\(\hspace{1cm}\)There exists an element \(\mathbf{0} \in V\) such that \(v + \mathbf{0} = v\) for all \(v \in V\).

Additive Inverse
\(\hspace{1cm}\)For every \(v \in V\), there exists a unique \(-v \in V\) such that \(v + (-v) = \mathbf{0}\).

Multiplicative Identity
\(\hspace{1cm}1 \cdot v = v\) for all \(v \in V\).

Distributive Property
\(\hspace{1cm} a \cdot (u + v) = a \cdot u + a \cdot v\) and \((a + b) \cdot v = a \cdot v + b \cdot v\) for all \(a,b \in F\) and for all \(u,v \in V\).

In essence, a vector space is an abelian group under addition that is also equipped with a scalar multiplication operation by elements of a field \(F\). Simply put, a vector space is a set of objects that you can add together and scale (multiply by a number, called a “scalar”) while following some rules. I stated the definition in its general form, meaning it works for any field \(F\). However, in our setting we will work specifically with the field \(\mathbb{R}\). So whenever we refer to “the field” of a vector space in what follows, we mean \(\mathbb{R}\).

The elements of a vector space—the objects in the set \(V\)—are called vectors. In our case, the lists we defined earlier are our vectors.
Geometrically, the point with coordinates \(2\) and \(3\) from our \(\mathbb{R}^2\) example, represented by the list \((2,3)\), is a vector. It is not difficult to see that \(\mathbb{R}^2\) is a vector space under our definitions of list addition and scalar multiplication (just check the rules).
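“Just check the rules” can even be done numerically for a few sample vectors and scalars. This only spot-checks the axioms at particular points rather than proving them, but it makes the definition concrete (function names are mine):

```python
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def smul(c, v):
    return tuple(c * x for x in v)

u, v, w = (2, 3), (-1, 4), (0, 5)
a, b = 2, -3
zero = (0, 0)

assert add(u, v) == add(v, u)                               # commutativity
assert add(add(u, v), w) == add(u, add(v, w))               # associativity
assert add(v, zero) == v                                    # additive identity
assert add(v, smul(-1, v)) == zero                          # additive inverse
assert smul(1, v) == v                                      # multiplicative identity
assert smul(a, add(u, v)) == add(smul(a, u), smul(a, v))    # distributivity
assert smul(a + b, v) == add(smul(a, v), smul(b, v))        # distributivity
```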

Now, you may wonder: in the definition of a vector space, the vectors come from the set \(V\), and only the scalars come from the field \(F\). So why did we spend so much time defining what a field is, only to end up using ordinary numbers? Couldn’t we have replaced the field with a ring? Technically, yes—doing so would give us a module. But a module lacks many of the nice properties a vector space has.

For example, consider solving a simple equation (this is algebra, after all):

\(a\mathbf{x} = \mathbf{b}\),

where \(a\) is a scalar. If we want to solve for \(\mathbf{x}\), we need to “move” \(a\) to the other side—that is, we need to divide by \(a\). This requires a multiplicative inverse. Rings don’t guarantee that; fields do. Without inverses, even a simple exercise like this could break down.
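In \(\mathbb{R}^n\), solving this equation works precisely because the field supplies the inverse \(a^{-1} = 1/a\). A sketch (function names are mine):

```python
def smul(c, v):
    return tuple(c * x for x in v)

def solve(a, b):
    """Solve a * x = b for x in R^n, where a is a nonzero scalar."""
    if a == 0:
        raise ValueError("a = 0 has no multiplicative inverse")
    return smul(1 / a, b)   # multiply both sides by a^{-1} = 1/a

b = (6.0, -2.0)
x = solve(2.0, b)
print(x)                    # (3.0, -1.0)
assert smul(2.0, x) == b    # a * x really equals b
```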

These rules exist precisely so we can perform the operations we typically take for granted—solving equations, manipulating expressions, and computing without running into avoidable complications. And the scalars can’t come from just a group either, since both addition and multiplication are required for the distributive laws.

Examples of Vector Spaces (Bonus)

We’ve already seen the example of \(\mathbb{R}^2\), or more generally \(\mathbb{R}^n\), which will be our main “workspace” in machine learning. This is the example you really need to know by heart and understand deeply.

But remember, the definition of a vector space is abstract, so many things can form a vector space. Here are a few examples for fun; we won’t be working with these right away, but it’s nice to see the bigger picture.

Polynomials of degree at most 3

These are polynomials that look like:

\(ax^3 + bx^2 + cx + d\)

Now, imagine a whole collection of these polynomials. This collection will be our set \(V\), and each polynomial is a vector.

  • We can add any two polynomials in the set, and the result is still a polynomial of degree 3 or less.
  • We can multiply a polynomial by a scalar, and it still stays in the set.
  • All the other vector space rules we discussed also hold.
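Representing each polynomial \(ax^3 + bx^2 + cx + d\) by its coefficient tuple \((a, b, c, d)\), addition and scaling are coordinate-wise, just like in \(\mathbb{R}^4\) (function names are mine):

```python
# Represent a*x^3 + b*x^2 + c*x + d by its coefficient tuple (a, b, c, d).
def poly_add(p, q):
    """Add two polynomials coefficient by coefficient."""
    return tuple(pi + qi for pi, qi in zip(p, q))

def poly_scale(lam, p):
    """Multiply a polynomial by the scalar lam."""
    return tuple(lam * pi for pi in p)

p = (1, 0, 2, 5)    # x^3 + 2x + 5
q = (0, 3, -1, 1)   # 3x^2 - x + 1
print(poly_add(p, q))     # (1, 3, 1, 6), i.e. x^3 + 3x^2 + x + 6
print(poly_scale(2, p))   # (2, 0, 4, 10)
```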

Real-valued functions on a domain

Next, consider all real-valued functions defined on some interval, such as all continuous functions on \([0,1]\). Denote them by \(f(x), g(x), \dots\). These are our vectors.

  • Adding two functions gives another function in the same set.
  • Multiplying a function by a scalar still gives a function in the set.
  • And again, all the other vector space rules apply.
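Addition and scaling of functions are defined pointwise: \((f + g)(x) = f(x) + g(x)\) and \((\lambda f)(x) = \lambda f(x)\). A sketch using closures (function names are mine):

```python
def f_add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def f_scale(lam, f):
    """Pointwise scaling: (lam * f)(x) = lam * f(x)."""
    return lambda x: lam * f(x)

f = lambda x: x ** 2
g = lambda x: 3 * x

h = f_add(f, g)           # h(x) = x^2 + 3x
print(h(2))               # 10
print(f_scale(2, f)(3))   # 18
```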

The key takeaway: vector spaces aren’t limited to lists of numbers. Polynomials, functions, sequences, and so on can all form vector spaces — as long as addition and scalar multiplication follow the rules.

Summary

  • A vector space is a mathematical structure consisting of a set \(V\) where we can add elements and multiply them by scalars, following certain rules.
  • The elements of \(V\) are called vectors.
  • The scalars come from a field \(F\).
  • We need the field \(F\) and the vector space rules so we can perform familiar operations—solving equations, manipulating expressions, and computing—without running into unnecessary complications.
  • The main vector space used in machine learning is \(\mathbb{R}^n\), where each vector is a list of real numbers.