### \(\S 1.\) Symmetric polynomials

Let \(R\) be a commutative ring and \(R[x_1, x_2 , \ldots , x_n]\) the polynomial ring in \(n\) indeterminates over \(R\). A polynomial \(f(x)\) in it is said to be a symmetric polynomial (or symmetric function) of \(x_1, \ldots , x_n\) if it is unchanged by any permutation of the indeterminates. Consider the polynomials:

All of \(f(t ,x_1, \ldots, x_n)\) and \(e_i(x_1, \ldots , x_n)\) are symmetric functions of \(x_1, \ldots , x_n\).
The \(e_i\)'s are called elementary symmetric polynomials of \(x_1, \ldots , x_n\).
Sums, products and scalar multiples of symmetric polynomials are again symmetric polynomials.
A polynomial \(g(e_1, \ldots , e_n)\) of \(e_1, \ldots , e_n\) becomes a symmetric polynomial \(g(x_1, \ldots x_n)\) when \(e_i\)'s are written in terms of \(x_i\)'s.
The fundamental theorem on symmetric functions is the converse of this statement, credited to Gauss^{2},
which will here be more precisely stated and proved.

Introduce one more indeterminate \(t\) and consider the polynomial

by which

Note that symmetric polynomials give general expressions for coefficients of a polynomial in terms of its roots. The fundamental theorem (once proved) can then be thought of as being able to express any symmetric function of the roots of a polynomial, in terms of the coefficients without needing to explicitly compute the roots.

Note that the discussion so far is valid for any ring \(R\) by e.g., first doing in the ring \(\mathbb Z[t, x_1, \ldots , x_n]\) and tensoring with \(R\).
A more sophisticated way to state context is to construct the ring of symmetric functions as a "power series ring" constructed as the inverse limit as \(n\) varies, of the ring of symmetric polynomials in \(n\) variables over \(R\).
Details of such a construction is available eg in MacDonald (1979)^{5}.
The exposition here is of van der Waerden (1930)^{1} and Lang (1993)^{3}.

Every polynomial \(e_i\) is a homogeneous polynomial of degree \(i\) in \(x_1, \ldots x_n\).

Consider the homomorphism from \(R[x_1, \ldots , x_n]\) to itself sending \(x_1 \mapsto e_1, \ldots ,x_n \mapsto e_n\).
Given a polynomial \(g(x_1, \ldots , x_n)\), one gets the symmetric polymonial \(g(e_1, \ldots , e_n)\) considered as a polynomial in \(x_1, \ldots , x_n\).
The term \(a.e_1^{\mu_1}.\cdots .e_n^{\mu_n}\) of \(g(e_1, \ldots , e_n)\) where \(a\) is a scalar, becomes a homogeneous polynomial w. r. t. \(x_i\) of degree \(\mu_1 + 2\mu_2 + \ldots + n\mu_n\) in \(x_1, \ldots, x_k.\)
The sum \(\mu_1 + 2\mu_2 + \ldots + n\mu_n\) is said to be the *weight* of the monomial \(a.e_1^{\mu_1}.\cdots .e_n^{\mu_n}\).
Weight of a polynomial may be defined as the highest among the weights of monomial summands.
A polynomial \(g\) of weight \(k\) as above thus yields a symmetric polynomial in \(x_i\) of weight at most \(k\).

Theorem (Fundamental Theorem of Symmetric FunctionsLet \(R\) be a commutative ring and \(R[x_1, x_2 , \ldots , x_n]\) the polynomial ring in \(n\) indeterminates over \(R\). A symmetric polynomial \(f(x_1, \ldots, x_n)\) of degree \(k\) can be expressed as a polynomial of the elementary symmetric polynomials. This representation is unique.

For proof, we follow the presentation of van der Waerden^{1}.
First we establish that any symmetric polynomial can be expressed as a polynomial of the elementary symmetric polynomials.

The proof is constructive and uses a lexicographically ordering of the monomials, in which a term \(x_1^{\mu_1}.\cdots .x_n^{\mu_n}\) is greater than \(e_1^{\nu_1}.\cdots .e_n^{\nu_n}\) if the first nonvanishing difference \(\mu_i - \nu_i\) is positive. The component monomials of \(f\) are written in decreasing lexicographical order, where, given a particular monomial, all other resulting monomials result from permuting the \(x_i\) in it are considered together with it and written implicitly in short using the lexicographically greatest term only inside a summation, like:

Let \(k\) be the degree of \(f\) and let the first term in the lexicographic ordering be \(a x_1^{\mu_1}.\cdots .x_n^{\mu_n}\). Thus \(k = \mu_1 + 2\mu_2 + \ldots + n\mu_n\). Now the product

has weight \(k\) and when expanded as a function of the \(x_i\) and lexicographically ordered, has initial term \(a x_1^{\mu_1}.\cdots .x_n^{\mu_n}\). The weight of this product is less than or equal to

This product may be subtracted from \(f(x)\) and the resulting polynomial, in lexicographical ordering, has the term corresponding to weight \(k\) and indices \(\mu_1, \ldots, \mu_n\), equal to zero. On this difference, the procudure may be inductively applied. This subtraction does not increase the degree of the polynomial. For each given degree, there are only finitely many choices of possible indices. At each step we move further along the lexicographical ordering as some term vanishes. Therefore the inductive procedure terminates after finitely many steps and the polynomial \(f\) is expressed as a polynomial of the elementary symmetric polynomials.

Note

If \(f\) is additionally a homogeneous polynomial, the above algorithm works with every step yielding homogeneous polynomials. In the non-homogeneous case, the procedure may be considered to be running on each homogeneous component from the highest degree, one by one.

To show uniqueness of the representation, if two different polynomials functions \(f_1\) and \(f_2\)of \(e_i\) represent the same symmetric function, then their (nonzero) difference \(f_1 - f_2\) evaluated on the \(e_i\) would yield \(0\). Therefore it is enough to show that if \(f(x_1, \ldots, x_n) \neq 0\), then \(f(e_1, \ldots, e_n) \neq 0\). Given nonzero \(f\), after lexicographical ordering, each monomial summand can be written in the form \(ax_1^{\mu_1 - \mu_2}e_x^{\mu_2 - \mu_3}.\cdots .x_n^{\mu_n}\) (together with its permutations). Among all systems \((\mu_1, \ldots , \mu_n)\) of indices for nonzero coefficients \(a\), there is one that is first in the lexicographic ordering. Replacing \(x_i\) by \(e_i\) in the above and expanding in terms of \(x_i\), then in the above algorithm the first term of the lexicographical ordering is

and this term does not cancelled by any ones that are lesser in the lexicographical ordering,
and thus \(g(e_1, \ldots, e_n)\) is nonzero.
Therefore uniqueness of the representation is proved.
`Q. E. D.`

Let the symmetric group \(S_n\) act on \(R[x_1, x_2 , \ldots , x_n]\) sending \(x_i \to \sigma(x_i)\) where \(\sigma \in S_n\) is a permutation. Then the fundamental theorem says that the subring fixed under this action is the subring generated by \(e_1, \ldots , e_n\) and that there are no relations between the generators \(e_i\); that is to say, the homomorphism generated by \(x_i \to e_i\) is an isomorphism from \(R[x_1, x_2 , \ldots , x_n]\) to itself.

As the ring of symmetric functions in \(n\) variables and the polynomial ring in \(n\) variables are isomorphic, one may substitute particular values in \(R\) or another approprite rings for the \(x_i\) and any relations between the the symmetric polynomials are preserved for the evaluations. In particular, considering \(f\) as in \eqref{eq:elemsymmpoly}, symmetric functions of the roots of a polynomial are expressible as polynomials of the coefficients of \(f\).

### \(\S 2.\) Discriminant of a polynomial

Consider the polynomial \(f = a_nx^n + a_{n-1} x^{n-1} + \cdots + a_0\) with roots \(x_i\). Consider the difference product

which is an alternating function of \(x_i;\) that is to say, it changes sign upon an odd permutation of the \(x_i.\)
Its vanishing indicates that \(f\) has at least one root of multiplicity greater than \(1\).
Its square is called the *discriminant*:

and is a symmetric function of the \(x_i\). By the preceding, the discriminant is expressible as a polynomial function of the coefficents of \(f\). Given a more general polynomial \(f(t) = a_0 + a_1t^n + \cdots + a_n t^n\) where \(a_n\) is not necessarily \(1\), with roots \(x_i\), the discriminant may be defined as:

The product

in \eqref{eq:discprod} is a vandermonde determinant:

This agrees with definition of descriminant given in the quadratic case \(ax^2 + bx + c\) where it is equal to \(b^2 - 4ac\). For the case of cubics considered for e. g., for elliptic curves such as \(ax^3 + bx^2 + cx + d\), the discriminant is \(b^2c^2 -4ac^3 - 4b^3d - 27 a^2 d^2 + 18abcd\), which may be simplified for the polynomial \(x^3 + ax +b\) into \(-(4a^3 + 27b^2)\). A method for computing discriminants from the coefficents of \(f\) will be given proved in \(\S 5.\) These statements may be verified either by computation by hand using that methord, or a software computer algebra system.

The discriminant is a homogeneous polynomial of the roots; therefore it is also a homogenous w. r. t. the coefficients.

The discriminanant is a determinantal expression on polynomials in \(R[x]\); therefore, it may also be thought of as an element living in the exterior algebra on \(R[x]\).

### \(\S 3.\) Resultant of two polynomials

Consider two polynomials in \(R[x]\) at once, viz:

We wish to find conditions for \(f(x)\) and \(g(x)\) to have a common divisor.
We follow the approach of van der Waerden^{1}.

In the above, in general, it may be that either \(a_n\) or \(b_n\) vanish; we call \(n\) or \(m\) the respective formal degree of the polynomial and \(a_n\) or \(b_m\) the formal leading coefficent. For the present we assume that \(a_n\) and \(b_n\) do not simultaneously vanish.

**Lemma 1.** \(f(x)\) and \(g(x)\) have a nonconstant common divisor \(\phi(x)\) only if
an equation of the form

exists, where \(h(x)\) is of degree at most \(m - 1\) and \(k(x)\) is of degree at most \(n-1\), and \(h, k\) do not simultaneously equal \(0\). If \(R\) is a field \(\mathbb F\), the above condition is also sufficient.

**Proof.**

If \(\varphi(x)\) is a common factor of \(f(x)\) and \(g(x)\), then one may decompose \(f(x) = \varphi(x)h(x)\) and \(g(x) = \varphi(x)k(x)\); these \(h(x)\) and \(g(x)\) will satisfy \eqref{eq:polycommfactor}.

Now we show sufficiency and start with \eqref{eq:polycommfactor}.
If \(R\) is a field \(\mathbb F\), then we have unique factorization into irreducibles also in the polynomial ring \(R[x]\).
Without loss of generality let it be that \(a_n \neq 0\).
Every prime factor of \(f(x)\) must also be a factor of \(k(x)g(x)\) with at least the same exponent;
but \(f(x)\) does not divide \(k(x)\) as \(f\) is of degree \(n\) and \(k\) of degree \(n-1\).
Therefore at least one factor of \(f(x)\) divides \(k(x)\); therefore \(k\) is nonvanishing.
`Q. E. D.`

Now put

where the highest degree coefficent might also be zero.

Evaluating \eqref{eq:polycommfactor} and expanding we get \(n+m\) homogeneous linear equations in \(n+m\) quantities:

For a solution to exist, the determinant of the *Sylvester matrix* should vanish:

Here, to avoid minus signs in the determinant, \(c_i\) and \(-d_j\) were regarded as unknowns, and the terms invoving \(d_j\) in \eqref{eq:sylveqn} were moved to the left, thus yielding a matrix equation in \(n+m\) unknowns that equals zero, for which the \((n+m) \times (n+m)\) matrix should have zero determinant to have solutions.

The quantity \(R\) is called the *resultant* of \(f\) and \(g\).
By the preceding, its vanishing is necessary and sufficient condition for \(f\) and \(g\) to have a nonconstant common factor.
Generally, for more efficient computation of a determinant of a polynomial matrix with coefficents in a field,
one strategy that may be employed is to use the
row operations mentioned in the article on determinants;
one may multiply one row with a scalar and add or subtract it from another,
or even apply a variant of the Euclidean algorithm to compute the “gcd” of the rows involved, and repeat the procedure with more rows,
and finally use the Leibniz expansion.
For ease of computation, software for computer algebra such as PARI/GP have built-in functions to compute resultants and these may be used.

The determinant contains the “principal term” \(a_n^mb_0^n\) and this is the highest degree term in \(R\) considered as a polynomial of \(a_\mu\) and \(b_\nu\). It vanishes when \(a_n = b_n = 0\). There are precisely \(m\) rows with entries \(a_n, \ldots, a_1, a_0\) and \(n\) rows with entries \(b_m, \ldots, b_1, b_0.\) The resultant is a homogeneous polynomial of degree \(m\) in the \(a_\mu\) and degree \(n\) in the \(b_\nu\).

In the above, it was assumed that at least one of \(a_n, b_m\) was nonzero. To deal with the case when both are zero, consider the homogeneous polynomials

Any factorization of \(f(x) = (p_nx^r + p_{r-1} x^{r-1} + \cdots + p_0)(q_nx^s + q_{s-1} x^{s-1} + \cdots + q_0)\) yield a factorization \(F(x) = (p_nx^r + p_{r-1}y x^{r-1} + \cdots + p_0y^r)(q_nx^s + q_{s-1} x^{s-1}y + \cdots + q_0y^s)\) and vice versa; factors of \(f\) may be obtained from those of \(F\) by setting \(y = 1\). Same argument may be made for \(g\) and \(F\). Every common factor of \(f\) and \(g\) gives a common factor of \(F\) and \(G\) and vice versa. The whole process above of taking determinant of the the Sylvester matrix may be repeated. The case \(a_n = b_m =0\) corresponds to when a nonconstant common factor of \(F, G\) is a pure power of \(y\). Thus the resultant criterion may be reformulated, using homogeneous polynomials, as: if the resultant is zero, then \(F\) and \(G\) have a nonconstant, homogeneous common factor and conversely.

### \(\S 4.\) Resultant as a symmetric function

Suppose that the two polynomials \(f, g\) above can be factored completely into linear factors:

Then the coefficents \(a_\mu\) of \(f(x)\) are products by \(a_n\) of the elementary symmetric functions of the roots \(x_i.\) Likewise the coefficents \(b_\nu\) of \(g(x)\) are products by \(b_n\) of the elementary symmetric functions of the roots \(y_j.\) The resultant \(R\) is homogenous of degree \(m\) in the \(a_\mu\) and degree \(n\) in the \(b_\nu\) where it is unchanged by a permutation of the \(x_i\) or \(y_j\).

Applying the fundamental theorem twice in succession, \(R\) can be written as \(a_0^mb_0^n\) times a polynomial function of the coefficents \(a_\mu\) and \(b_\nu\), that are respectively the values of the elementary symmetric functions on \(x_i\) and \(y_j\). The following more explicit argument also may be made given \(x_i\) and \(y_j\).

Suppose the \(x_i\) and \(y_j\) are indeterminates. For some \(i\) and \(j\) let \(x_i = y_j\). Then \(R(f,g) = 0\) as \(f, g\) has a common factor. Therefore \((x_i - x_j)\) divides \(R(f,g)\) and likewise for each pair of \(i\) and \(j\). Thus the product of all \((x_i - x_j)\) divides \(R(f,g)\).

Consider

and form the product

Similarly, from

form the product

It is convenient to define

Now like \(R\), \(S\) is also homogenous of degree \(m\) in the \(a_\mu\) and degree \(n\) in the \(b_\nu\) and is unchanged by a permutation of the \(x_i\) or \(y_j\). Therefore \(R\) and \(S\) can differ only by a scalar factor. But the highest powers of either \(a_n\) or \(b_m\) in both \(S\) and \(R\) corresponds to the “principal term” \(a_n^mb_m^n\) and therefore the scalar factor must be \(1\) and \(R = S\).

The representation of \(R\) in terms of \(a_\mu\) and \(b_\nu\) is unique by the uniqueness part of the elementary theorem for symmetric functions. Therefore the foregoing is valid even if \(f(x)\) and \(g(x)\) do not split into linear factors.

Moreover, the computation of \(R\) via determinant of Sylvester matrix is valid over any ring, as determinant computation involves
only addition, subtraction and scalar multiplication.
Regarding arbitrary field of definition, we may additionally prove the *absolute irreducibility* of \(R\).
That is to say, \(R\) is an irreducible polynomial of the \(a_\mu\) and \(b_\nu\) over any field of definition.
To see this, assume that \(R\) factors as \(R = R_1R_2,\) where \(R_1\) is nonconstant;
that is, for some \(i, j\), the monomial \((x_i - y_j)\) divides \(R_1\).
But as \(R_1\) and \(R_2\) are polynomial expressions in \(a_\mu\) and \(b_\nu\), they are also symmetric functions in the roots \(x_i\) and \(y_j\).
Thus the product \(\prod_{i,j} x_i - y_j)\) divides \(R_1\).
Therefore \(R_2\) must divide \(a_n^m b_m^n\).
But looking at the determinant of the Sylvester matrix, one sees that \(R\) is not divisible by \(a_n\) or \(b_n\).
Therefore \(R_2 =1\) and a nontrivial factorization of \(R\) is impossible.

### \(\S 5.\) Resultant and discriminant

Consider the polynomial

and its derivative \(f^\prime(x)\). The resultant \(R(f, f^\prime)\) is

By the product rule for differentiation:

Using this in \eqref{eq:polyderiresu},

and if \(\Delta\) denotes the discriminant,

Thus, the discriminant may be computed as using the determinant of Sylvester Matrix for \(R(f,f^\prime):\)

### Bibliography

van der Waerden, B. L., Moderne Algebra, 1930; Algebra I & II , Springer, 1991. ↩

Barbeau, E. J., Polynomials , Springer, 1989. ↩

Lang, S., Algebra , Addison-Wesley, 1993, Rev. 3rd, Ed., Graduate Texts in Mathematics 211, Springer, 2002. ↩

Prosolov, V. V., Polynomials , Springer, 2001. ↩

MacDonald, I. G., Symmetric Functions and Hall Polynomials , Oxford Mathematical Monographs, 1979. ↩