You are currently browsing the monthly archive for December 2007.

Here is an interesting article (by Terry Tao) and the accompanying discussion on the relation between high IQ and doing good mathematics. Moral of the story: there is hope! 🙂

I wish to use this part of the blog to quickly go through the basic elements of propositional calculus, and then later move on to predicate calculus in another part of the blog, followed by the fundamentals of relational algebra in yet another part. I might then go through the problem of query optimization in RDBMS after that. Let’s see how far this goes.

The Harvard College Mathematics Review (HCMR) published its inaugural issue in April 2007, and the second issue was out almost a fortnight ago. Clearly, the level of exposition contained in the articles is extremely high, and it is a pleasure reading all the articles even though a lot of it may not make a lot of sense to a lot of people. I would recommend anyone to visit their website and browse all their issues. For problem-solvers, the problem section in each issue is a delight!

Anyway, the first issue’s problem section contained a somewhat hard inequality problem (proposed by Shrenik Shah), which I was able to solve and for which my solution was acknowledged in the problem section of the second issue. The HCMR carried Greg Price’s solution to that particular problem, and I must say his solution is somewhat more “natural” and “intuitive” than the one I gave.

Well, I want to present my solution here but in a more detailed fashion. In particular, I want to develop the familiar AM-GM-HM inequality up to a point where the solution to the aforesaid problem turns out to be somewhat “trivial.” The buildup to the final solution itself is non-trivial, however. This post relies on the material presented in the classic book Problems and Theorems in Analysis I by George Pólya and Gabor Szegö. Again, I strongly recommend all problem-solvers to study this book. Anyway, we will now state the problem and discuss its solution. (I have slightly changed the formulation of the problem in order to avoid any possible copyright issues.)

Problem: For all distinct positive reals a and b, show that

\displaystyle \frac{a+b}{2} > \frac{a^{\frac{a}{a-b}}b^{\frac{b}{b-a}}}{e} > \frac{a-b}{\ln a - \ln b}.

First, let us discuss some facts.

1. AM-GM-HM inequality: If x_1, x_2, \ldots, x_n are n positive real numbers, then

\displaystyle (x_1 + x_2 + \ldots + x_n)/n \geq \sqrt[n]{x_1x_2\cdots x_n} \geq \frac{n}{1/x_1 + 1/x_2 + \ldots + 1/x_n},

with equality if and only if x_1 = x_2 = \ldots = x_n.

Proof: For a hint on proving the above using mathematical induction, read this. However, we will make use of Jensen’s inequality to prove the above result. We won’t prove Jensen’s inequality here, though it too can be proved using induction.

Jensen’s inequality: Let f : (a, b) \to \mathbb{R} be a continuous convex function. Let \lambda_1, \lambda_2, \ldots, \lambda_n be positive reals such that \lambda_1 + \lambda_2 + \ldots + \lambda_n = 1. Then for all x_1, x_2, \ldots, x_n \in (a,b), we have

\lambda_1f(x_1) + \lambda_2f(x_2) + \ldots + \lambda_nf(x_n) \geq f(\lambda_1x_1 + \lambda_2x_2 + \ldots + \lambda_nx_n),

with equality if and only if x_1 = x_2 = \ldots = x_n.

 

 

 

 

Now, in order to prove (1), consider the function f : (0,\infty) \to \mathbb{R} defined by f(x) = -\ln x. Indeed, f is continuous on the stated domain. Also, f''(x) = 1/x^2 > 0, which implies f is convex on (0, \infty). Therefore, using Jensen’s inequality and setting \lambda_1 = \lambda_2 = \ldots = \lambda_n = 1/n, for positive reals x_1, x_2, \ldots, x_n, we have

\displaystyle -\frac1{n} \left( \ln x_1 + \ln x_2 + \ldots + \ln x_n \right) \geq -\ln \left( \frac{x_1 + x_2 + \ldots + x_n}{n} \right)

\displaystyle \Rightarrow \frac{x_1 + x_2 + \ldots + x_n}{n} \geq \sqrt[n]{x_1x_2\cdots x_n} \quad \ldots (since \ln x is monotonically increasing on (0, \infty).)

This proves the first part of the inequality in (1). Now, replace each x_i with 1/x_i for 1 \leq i \leq n to derive the second part of the inequality. And, this proves the AM-GM-HM inequality.

We can generalize this further. Indeed, for any positive reals p_1, p_2, \ldots, p_n and positive reals x_1, x_2, \ldots, x_n, replacing each \lambda_i with p_i/(p_1 + p_2 + \ldots + p_n) for 1 \leq i \leq n, and using Jensen’s inequality for f(x) = -\ln x once again, we obtain the following generalized AM-GM-HM inequality, which we shall call by a different name:

2. Generalized Cauchy Inequality (non-integral version) : For any positive reals p_1, p_2, \ldots, p_n and positive reals x_1, x_2, \ldots, x_n, we have

\displaystyle \frac{p_1x_1 + \ldots + p_nx_n}{p_1 + \ldots + p_n} \geq  \left(  x_1^{p_1}\cdots x_n^{p_n}\right)^{1/(p_1 + \ldots + p_n)} \geq \frac{p_1 + \ldots + p_n}{p_1/x_1 + \ldots + p_n/x_n}.

Now, the remarkable thing is we can generalize the above inequality even further to obtain the following integral version of the inequality.

3. Generalized Cauchy Inequality (integral version) : Let f(x) and p(x) be continuous and positive functions on the interval [a, b] ; also suppose f(x) is not a constant function. Then we have

\displaystyle \frac{\int_{a}^{b} p(x)f(x)\, dx}{\int_{a}^{b} p(x)\, dx} > \exp\left({\frac{\int_{a}^{b} p(x)\ln f(x)\, dx}{\int_{a}^{b} p(x)\, dx}}\right) > \frac{\int_{a}^{b} p(x)\, dx}{\int_{a}^{b} \frac{p(x)}{f(x)}\, dx},

where \exp(x) = e^x.

Okay, now we are finally ready to solve our original problem. First, without any loss of generality, we can assume a < b. Now, we shall use the above version of the Generalized Cauchy Inequality, and so we set p(x) = 1 and f(x) = x. Here f and g are both positive functions on the interval [a, b]. Also, note that f is not a constant function.

 

 

 

Thus, we have \displaystyle \frac{\int_a^b 1\, dx}{\int_a^b \frac1{x} \, dx} = \frac{a-b}{\ln a - \ln b} \quad \ldots (*).

Also, \exp \left( \frac{\int_a^b \ln x \, dx}{\int_a^b 1 \, dx} \right) = \exp \left(  \frac{b\ln b - a\ln a - (b-a)}{b-a}\right) = \frac{a^{\frac{a}{a-b}} b^{\frac{b}{b-a}}}{e} \quad \ldots (**).

And, \displaystyle \frac{\int_a^b x\, dx}{\int_a^b 1 \, dx} = \frac{(b^2 - a^2)/2}{b-a} = \frac{a+b}{2} \quad \ldots (***).

Combining (*), (**) and (***), we immediately obtain the desired inequality. And, we are done.

The difference between sets A and B, also known as the relative complement of B in A, is the set A - B defined by

A - B = \{ x: x \in A \mbox{ and } x \not\in B \}.

If we assume the existence of a universe, U, such that all the sets are subsets of U, then we can considerably simplify our notation. So, for instance, U - B can simply be written as B', which denotes the complement of B in U. Similarly, U' = U - U  = \emptyset \,, \emptyset ' = U - \emptyset = U, and so on. A quick look at a few more facts:

  • (A')' = A,
  • A \cap A' = \emptyset,
  • A \cup A' = U
  • A \subset B if and only if B' \subset A'.

The last one is proved as follows. We prove the “if” part first. Suppose B' \subset A'. If x \in A, then clearly x \not\in A'. But, since B' \subset A', we have x \not\in B', which implies x \in B. Hence, A \subset B. This closes the proof of the “if” part. Now, we prove the “only if” part. So, suppose A \subset B. Now, if x \in B', then clearly x \not\in B. But, since A \subset B, we have x \not\in A, which implies x \in A'. Hence, B' \subset A'. This closes the proof of the “only if” part, and, we are done.

The following are the well-known DeMorgan’s laws (about complements):

(A \cup B)' = A' \cap B', and (A \cap B)' = A' \cup B'.

Let’s quickly prove the first one. Suppose x belongs to the left hand side. Then, x \not\in A \cup B, which implies x \not\in A and x \not\in B, which implies x \in A' and x \in B', which implies x \in A' \cap B'. This proves that the left hand side is a subset of the right hand side. We can similarly prove the right hand side is a subset of the left hand side, and this closes our proof.

Though it isn’t very apparent, but if we look carefully at the couple of problems whose proofs we did above, we note something called the principle of duality for sets. One encounters such dual principles in mathematics quite often. In this case, this dual principle is stated a follows.

Principle of duality (for sets): If in an inclusion or equation involving unions, intersections and complements of subsets of U (the universe) we replace each set by its complement, interchange unions and intersections, and reverse all set-inclusions, the result is another theorem.

Using the above principle, it is easy to “derive” one of the DeMorgan’s laws from another and vice versa. In addition, DeMorgan’s laws can be extended to larger collections of sets instead of just pairs.

Here are a few exercises on complementation.

  1. A - B = A \cap B',
  2. A \subset B if and only if A - B = \emptyset,
  3. A - (A - B) = A \cap B,
  4. A \cap (B - C) = (A \cap B) - (A \cap C),
  5. A \cap B \subset (A \cap B) \cup (A \cap C),
  6. (A \cup C) \cap (B \cup C') \subset A \cup B.

 

We will prove the last one, leaving the remaining as exercises to the reader. Suppose x belongs to the left hand side. Then, x \in A \cup C and x \in B \cup C'. Now, note that if x \in C, then x \not\in C', which implies x \in B, which implies x \in A \cup B. If, on the other hand, x \in C', then x \not\in C, which implies x \in A, which implies x \in A \cup B. Hence, in either case, the left hand side is a subset of A \cup B, and we are done.

We now define the symmetric difference (or Boolean sum) of two sets A and B as follows:

A \triangle B = (A-B) \cup (B-A).

This is basically the set of all elements in A or B but not in A \cap B. In other words, A \triangle B = (A \cup B) - (A \cap B). Again, a few facts (involving symmetric differences) that aren’t hard to prove:

  1. A \triangle \emptyset = A,
  2. A \triangle A = \emptyset,
  3. A \triangle B = B \triangle A (commutativity),
  4. (A \triangle B) \triangle C = A \triangle (B \triangle C) (associativity),
  5. (A \triangle B) \triangle (B \triangle C) = A \triangle C,
  6. A \cap (B \triangle C) = (A \cap B) \triangle (A \cap C).

This brings us now to the axiom of powers, which basically states if E is a set then there exists a set that contains all the possible subsets of E as its elements.

Axiom of powers: If E is a set, then there exists a set (collection) P, such that if X \subset E, then X \in P.

The set P, described above, may be too “comprehens ive”, i.e., it may contain sets other than the subsets of E. Once again, we “fix” this by applying the axiom of specification to form the new set P = \{ X: X \subset E \}. The set P is called the power set of E, and the axiom of extension, again, guarantees its uniqueness. We denote P by P(E) to show the dependence of P on E. A few illustrative examples: P(\emptyset) = \{ \emptyset \}, P(\{ a \}) = \{ \emptyset, \{ a \} \}, P(\{ a, b \}) = \{ \emptyset,  \{ a \}, \{ b \}, \{ a, b \} \}, and so on.

Note that if E is a finite set, containing n elements, then the power set P(E) contains 2^n elements. The “usual” way to prove this is by either using a simple combinatorial argument or by using some algebra. The combinatorial argument is as follows. An element in E either belongs to a subset of E or it doesn’t: there are thus two choices; since there are n elements in E, the number of all possible subsets of E is therefore 2^n. A more algebraic way of proving the same result is as follows. The number of subsets with k elements is \binom{n}{k}. So, the number of subsets of E is \sum_{k=0}^{n}\binom{n}{k}. But, from the binomial theorem, we have \displaystyle (1 + x)^n = \sum_{k=0}^{n}\binom{n}{k}x^k. Putting x = 1, we get 2^n as our required answer.

A few elementary facts:

  1. \bigcap_{X \in P(E)} X = \emptyset.
  2. If E \subset F, then P(E) \subset P(F).
  3. E = \bigcup P(E) .

EXERCISES:

1. Prove that P(E) \bigcap P(F) = P(E \bigcap F).

2. Prove that P(E) \bigcup P(F) \subset P(E \bigcup F).

 

 

 

 

 

 

 

 

 

 

We are now ready to discuss a couple of familiar set theoretic operations: unions and intersections. Given two sets A and B, it would be “nice” to have a set U that contains all the elements that belong to at least one of A or B. In fact, it would be nicer to generalize this to a collection of sets instead of just two, though we must be careful about using words like “two”, “three” and so on, since we haven’t really defined what numbers are so far. We don’t want to run the risk of inadvertently introducing circularity in our arguments! Anyway, this brings us to the following axiom.

Axiom of unions: For every collection of sets, there exists a set that contains all the elements that belong to at least one set of the given collection.

In other words, for every collection C, there exists a set U such that if x \in X for some X in C, then x \in U. Now, the set U may contain “extra” elements that may not belong to any X in C. This can be easily fixed by invoking the axiom of specification to form the set \{ x \in U: x \in X \mbox{ for some } X \mbox{ in } C \}. This set is called the union of the collection C of set. Its uniqueness is guaranteed by the axiom of extension.

Generally, if C is a collection of sets, then the union is denoted by \bigcup \{  X: X \in C \}, or \bigcup_{X \in C} X.

A quick look at a couple of simple facts.

1) \bigcup \{  X: X \in \emptyset \} = \emptyset, and

2) \bigcup \{  X: X \in \{ A \} \} = A.

We finally arrive at the definition of the union of two sets, A and B. A \cup B = \{ x: x \in A \mbox{ or } x \in B \}.

Below is a list of a few facts about unions of pairs:

  • A \cup \emptyset = A,
  • A \cup B = B \cup A (commutativity),
  • A \cup ( B \cup C) = (A \cup B) \cup C (associativity),
  • A \cup A = A (idempotence),
  • A \subset B if and only if A \cup B = B.

Now, we define the intersection of two sets, A and B as follows.

A \cap B = \{ x: x \in A \mbox{ and } x \in B\}.

Once again, a few facts about intersections of pairs (analogous to the ones involving unions):

  • A \cap \emptyset = \emptyset ,
  • A \cap B = B \cap A ,
  • A \cap ( B \cap C) = (A \cap B) \cap C,
  • A \cap A = A,
  • A \subset B if and only if A \cap B = A.

Also, if A \cap B = \emptyset, then the sets A and B are called disjoint sets.

Two useful distributive laws involving unions and intersections:

  • A \cap (B \cup C) = (A \cap B) \cup (A \cap C),
  • A \cup (B \cap C) = (A \cup B) \cap (A \cup C).

We prove the first one of the above. The proof of the second one is left as an exercise to the reader. The proof relies on the idea that we show each side is a subset of the other. So, suppose x belongs to the left hand side; then x \in A and x \in B \cup C, which implies x \in A and x \in B or C, which implies x \in A \cap B or x \in A \cap C, which implies x \in (A \cap B) \cup (A \cap C); hence x belongs to the right hand side. This proves that the left hand side is a subset of the right hand side. A similar argument shows that the right hand side is a subset of the left hand side. And, we are done.

The operation of the intersection of sets from a collection, C, is similar to that of the union of sets from C. However, the definition will require that we prohibit C from being empty, and we will see why in the next section. So, for each collection, C, there exists a set V such that x \in V if and only if x \in X for every X in C. To construct such a set V, we choose any set A in C – this is possible because C \not= \emptyset – and write \{ x \in A: x \in X \mbox{ for all } X \mbox{ in } C\}.

Note that the above construction is only used to prove that V exists. The existence of V doesn’t depend on any arbitrary set A in the collection C. We can, in fact, write

\{ x: x \in X \mbox{ for all } X \mbox{ in } C\}.

The set V is called the intersection of the collection C of sets. The axiom of extension, once again, guarantees its uniqueness. The usual notation for such a set V is \bigcap \{ X: X \in C \} or \bigcap_{X \in C} X.

EXERCISE: (A \cap B) \cup C = A \cap (B \cup C) if and only if C \subset A.

SOLUTION: We first prove the “if” part. So, suppose C \subset A. Now, if x \in (A \cap B) \cup C, then either x \in A \cap B or x \in C. In the first case, x \in A and x \in B, which implies x \in A and x \in B \cup C. In the second case, we again have x \in A (since C \subset A), which implies x \in A and x \in B \cup C. In either case, we have x \in A \cap (B \cup C). Hence, (A \cap B) \cup C is a subset of A \cap (B \cup C).

Similarly, if x \in A \cap (B \cup C), then x \in A and x \in B \cup C. Now, if x \in B, then x \in A \cap B, which implies x \in (A \cap B) \cup C. And, if x \in C, then once again x \in (A \cap B) \cup C. Thus, in either case, x \in (A \cap B) \cup C. Hence, A \cap (B \cup C) is a subset of (A \cap B) \cup C. We, thus, proved (A \cap B) \cup C = A \cap (B \cup C). This concludes the proof of the “if” part.

Now, we prove the “only if” part. So, suppose (A \cap B) \cup C = A \cap (B \cup C). If x \in C, then x belongs to the left hand side of the equality, which implies x belongs to the right hand side. This implies x \in A (and x \in B \cup C.) Hence, C \subset A. And, we are done.

After postulating a couple of important axioms in the previous two sections, we now arrive at a couple of important results.

1. There exists an empty set. (In fact, there exists exactly one!)

2. The empty set is a subset of every set.

Indeed, to prove the first result, suppose A is some set. Then, the set \{ x \in A: x \not= x \} is clearly an empty set, i.e. it doesn’t contain any elements. To “picture” this, imagine an empty box with nothing inside it. In fact, we can apply the axiom of specification to A with any universally false sentence to create an empty set. The empty set is denoted by \emptyset. The axiom of extension, on the other hand, guarantees there can be only one empty set.

Now, how do we argue that \emptyset \subset A, for any arbitrary set A? Well, the reasoning is an indirect one, and for most beginners, it doesn’t seem like a complete one. There is something in the argument that doesn’t feel quite “right!” However, there is nothing “incomplete” about the argument, and here it is anyway.

Suppose, for the sake of contradiction, the emptyset, \emptyset, is not a subset of A. Then, there exists an element in \emptyset that doesn’t belong to A. But, the empty set is empty, and hence, no such element exists! This means our initial hypothesis is false. Hence, we conclude (maybe, still somewhat reluctantly) \emptyset \subset A.

Now, the set theory we have developed thus far isn’t a very rich one; after all, we have only showed there is only one set and that it is empty! Can we do better? Can we come up with an axiom that can help us construct new sets? Well, it turns out, there is one.

Axiom of pairing: If a and b are two sets, then there exists a set A such that a \in A and b \in A.

The above axiom guarantees that if there are two sets, a and b, then there exists another one, A, that contains both of these. However, A may contain elements other than a and b. So, can we guarantee there is a set that contains exactly a and b and nothing else? Indeed, we can. We just apply the axiom of specification to A with the sentence “x = a or x = b.” Thus, the set \{ x \in A: x = a \mbox{ or } x = b\} is the required one.

The above construction of a particular set illustrates one important fact: all the remaining principles of set construction are pseudo-special cases of the axiom of specification. Indeed, if it were given that there exists a set containing some particular elements, then the existence of a set containing exactly those elements (and nothing else) would follow as a special case of the axiom of specification.

Now, observe if a is a set, then the axiom of pairing implies the existence of the set \{ a, a \}, which is the same as the set \{ a \} and is called a singleton of a. Also, note that \emptyset and \{ \emptyset \} are different sets; the first has no elements at all, whereas the second has exactly one element, viz. the empty set. In fact, there is a minimalist (inductive) way of constructing the set of natural numbers, N, (due to von Neumann) using the axiom of infinity as follows.

0 = \emptyset, 1 = \{ 0 \}, 2 = \{ 0, 1 \}, \ldots

But, more on this later.

The axiom of extension (discussed in Section 1) is unique in the sense that it postulates the existence of a relation between belonging and equality. All the other axioms of set theory, on the other hand, are designed to create new sets out of old ones!

The axiom of specification, loosely speaking, states that given some arbitrary (but well-defined) set U (our universe), if we can make some “intelligent” assertion about the elements of U, then we specify or characterize a subset of U. An intelligent assertion about the elements of U could, for example, specify a property that is shared by some elements of U and not shared by the other elements. In the end, we will take up an example about an assertion that is tied to the famous Russell’s paradox.

For now, let us discuss a simple example. Suppose A is the set of all (living) women. If we use x to denote an arbitrary element of A, then the sentencex is married” is true for some of the elements of A and false for others. Thus, we could generate a subset of A using such a sentence. So, the subset of all the women who are married is denoted by \{ x \in A: x \mbox{ is married}\}. To take another example, if N is the set of natural numbers, then \{x \in N: 1 < x \leq 5 \} = \{ 2, 3, 4, 5 \}. Now, note that the subset \{ 2 \} of N is not the same as the number 2. Loosely speaking, a box containing a hat is not the same thing as the hat itself.

Now, we only need to define what a sentence is before we can precisely formulate our axiom of specification. The following rules would be a formal way to (recursively) define a sentence:

x \in A” is a sentence.”

A = B” is a sentence.

If X is a sentence, then (\mbox{not}(X)) is a sentence.

If X, Y are sentences, then (X \mbox{ and } Y) is a sentence.

If X, Y are sentences, then (X \mbox{ or } Y) is a sentence.

If X, Y are sentences, then (\mbox{if } X \mbox{ then } Y) is a sentence.

If X, Y are sentences, then (X \mbox{ if and only if } Y) is a sentence.

If X is a sentence, then (\mbox {for some }y  (X) ) is a sentence.

If X is a sentence, then (\mbox {for all }y  (X)) is a sentence.

Note that the two types of sentences, “x \in A” and “A = B“, stated in the first two rules, are what we would call atomic sentences, while the rest of the other rules specify (valid) ways of generating (infinitely) many sentences from those two atomic sentences using the usual logical operators. Also, note that some of the rules above are rather redundant because it is possible to convert certain sentences having a set of logical operators to another sentence having a different set of logical operators. For example, “X \mbox{ or } Y” can be written as “\mbox{not ( not(X) and not(Y))}“, and so on. Anyway, we are digressing too far from our objective.

Having defined what a sentence is, we can now formulate the major principle of set theory, often referred to by its German name Aussonderungsaxiom.

Axiom of specification: To every set A and to every condition S(x), there corresponds a set B whose elements are exactly those elements x of A for which S(x) holds.

A “condition” here just means a sentence. The letter x is free in the sentence S(x), meaning x occurs in S(x) at least once without occurring in the phrases “for some x” or “for all x“. Now, the axiom of extension guarantees us that the axiom of specification determines the set B uniquely, and we usually write B = \{ x \in A: S(x) \}.

This finally brings us to the example we mentioned in the beginning of this section. Let us define S(x) := x \not\in x. Suppose A is some arbitrary set. Let B = \{ x \in A: x \not\in x \}. Then for all y,

(*) \,\, y \in B \mbox{ if and only if } (y \in A \mbox{ and } y \not\in y).

Can we have B \in A? The answer is no, and here’s why. Suppose, for the sake of contradiction, B \in A. Then, we have either B \in B, or B \not\in B. If B \in B, then using (*), we have B \not\in B, a contradiction. And, if B \not\in B, then using (*) again, the assumption B \in A yields B \in B, a contradiction. This proves that B \in A is false, and hence we conclude B \not\in A. Note that our set A was an arbitrary one, and we just showed that there is something (viz. B) that does not belong to A. We have, thus, essentially proved that

 

there is no universe.

Here, “universe” means “universe of discourse”, a set that contains all the objects that enter into that discussion.

It was mentioned earlier that the above example has something to do with Russell’s paradox. We shall see why. In the earlier pre-axiomatic approaches to set theory, the existence of a universe was taken for granted. Now, in the above example, we just showed that B \not\in A implies the non-existence of a universe. So, if we assume that a universe exists, then it implies that B \in A, but we have already shown that this leads to a contradiction! And this was exactly the content of Russell’s paradox. In Halmos’ own words:

The moral is that it is impossible , especially in mathematics, to get something for nothing. To specify a set, it is not enough to pronounce some magic words (which may form a sentence such as “x \not\in x“); it is necessary also to have at hand a set whose elements the magic words apply.

We encounter sets, or if we prefer, collections of objects, everyday in our lives. A herd of cattle, a group of women, or a bunch of yahoos are all instances of sets of living beings. “The mathematical concept of a set can be used as the foundation for all known mathematics.” The purpose here is to develop the basic properties of sets. As a slight digression, I wouldn’t consider myself a Platonist; hence, I don’t believe there are some abstract properties of sets “out there” and that the chief purpose of mathematics is to discover those abstract things, so to speak. Even though the idea of a set is ubiquitous and it seems like the very concept of a set is “external” to us, I still think that we must build, or rather postulate, the existence of the fundamental properties of sets. (I think I am more of a quasi-empiricist.)

Now, we won’t define what a set is, just as we don’t define what points or lines are in the familiar axiomatic approach to elementary geometry. So, we somewhat rely on our intuition to develop a definition of sets. Of course, our intuition may go wrong once in a while, but one of the very purposes of our exposition is to reason very clearly about our intuitive ideas, so that we can correct them any time if we discover they are wrong.

Now, a very reasonable thing to “expect” from a set is it should have elements or members. So, for example, Einstein was a member of the set of all the people who lived in the past. In mathematics, a line has points as its members, and a plane has lines as its members. The last example is a particularly important one for it underscores the idea that sets can be members of other sets!

So, a way to formalize the above notion is by developing the concept of belonging. This is a primitive (undefined) concept in axiomatic set theory. If x is a member of A (x is contained in A, or x is an element of A), we write x \in A. (\in is a derivation of the Greek letter epsilon, \epsilon, introduced by Peano in 1888.) If x is not an element of A, we write x \not\in A. Note that we generally reserve lowercase letters (x, y, etc) for members or elements of a set, and we use uppercase letters to denote sets.

A possible relation between sets, more elementary than belonging, is equality. If two sets A and B are equal, we write A = B. If two sets A and B are not equal, we write A \not= B.

Now, the most basic property of belonging is its relation to equality, which brings us to the following formulation of our first axiom of set theory.

Axiom of extension: Two sets are equal if and only if they have the same elements.

Let us examine the relation between equality and belonging a little more deeply. Suppose we consider human beings instead of sets, and change our definition of belonging a little. If x and A are human beings, we write x \in A whenever x is an ancestor of A. Then our new (or analogous) axiom of extension would say if two human beings A and B are equal then they have the same ancestors (this is the “only if” part, and it is certainly true), and also that if A and B have the same ancestors, then they are equal (this is the “if” part, and it certainly is false.) A and B could be two sisters, in which case they have the same ancestors but they are certainly not the same person.

Conclusion: The axiom of extension is not just a logically necessary property of equality but a non-trivial statement about belonging.

Also, note that the two sets A = \{ x, y \} and B = \{ x, x, y, y, y \} have the same elements, and hence, by the axiom of extension, A = B, even though it seems like A has just two elements while B has five! It is due to this that we drop duplicates while writing down the elements of a set. So, in the above example, it is customary to simply write B = \{ x,y \}.

Now, we come to the definition of a subset. Suppose A and B are sets. If every member of A is a member of B, then we say A is a subset of B, or B includes A, and write A \subset B or B \supset A. This definition, clearly, implies that every set A is a subset of itself, i.e. A \subset A, which demonstrates the reflexive property of set inclusion. (Of course, equality also satisfies the reflexive property, i.e. A = A.) We say A is a proper subset of B whenever A \subset B but A \not= B. Now, if A \subset B and B \subset C, then A \subset C, which demonstrates the transitive property of set inclusion. (Once again, equality also satisfies this property, i.e. if A = B and B = C, then A = C.) However, we note that set inclusion doesn’t satisfy the symmetric property. This means, ifA \subset B, then it doesn’t necessarily imply B \subset A. (On the other hand, equality satisfies the symmetric property, i.e. if A = B, then B = A.)

But, set inclusion does satisfy one very important property: the antisymmetric one. If we have A \subset B and B \subset A, then A and B have the same elements, and therefore, by the axiom of extension, A = B. In fact, we can reformulate the axiom of extension as follows:

Axiom of extension(another version): Two sets A and B are equal if and only if A \subset B and B \subset A.

In mathematics, the above is almost always used whenever we are required to prove that two sets A and B are equal. All we do is show that A \subset  B and B \subset A, and invoke the (above version of) axiom of extension to conclude that A = B.

Before we conclude, we note that conceptually belonging (\in) and set inclusion (\subset) are two different things. A \subset A always holds, but A \in A is “false”; at least, it isn’t true of any reasonable set that anyone has ever constructed! This means, unlike set inclusion, belonging does not satisfy the reflexive property. Again, unlike set inclusion, belonging does not satisfy the transitive property. For example, a person could be considered a member of a country and a country could be considered a member of the United Nations Organizations (UNO); however, a person is not a member of the UNO.

 

I have just started reading Paul R. Halmos’ classic text Naive Set Theory, and I intend to blog on each section of the book. The purpose is mainly to internalize all the material presented in the book and at the same time provide the gist of each section, so that I can always come back and read whenever I feel like doing so. The actual text, divided into 25 sections (or chapters, if you will), comprises 102 pages. Halmos’ original intention was “to tell the beginning student of advanced mathematics the basic set theoretic facts of life, and to do so with the minimum of philosophical discourse and logical formalism… The style is usually informal to the point of being conversational.

The reader is warned that “the expert specialist will find nothing new here.” Halmos recommends Hausdorff’s Set Theory and Axiomatic Set Theory by Suppes for a more extensive treatment of the subject. Nevertheless, the treatment by Halmos is not trivial at all. I personally feel his exposition is impeccable!

Almost all the ideas presented in the following posts belong to the author of the book, and I make absolutely no claims to originality in the exposition.

Here is some career advice by Terry Tao on his blog. I think it has tons of useful information and wisdom.

Our other blog

Visitors to this blog

Blog Stats

  • 380,789 hits

Wikio Ranking

Wikio - Top Blogs - Sciences

Current Online Readers

Archives

December 2007
M T W T F S S
 12
3456789
10111213141516
17181920212223
24252627282930
31