Formal Languages & Automata Theory
Languages :
A general definition of language must cover a variety of distinct categories: natural languages, programming languages, mathematical languages, etc. The notion of natural languages like English, Hindi, etc. is familiar to us. Informally, language can be defined as a system suitable for expression of certain ideas, facts, or concepts, which includes a set of symbols and rules to manipulate these. The languages we consider for our discussion is an abstraction of natural languages. That is, our focus here is on formal languages that need precise and formal definitions. Programming languages belong to this category. We start with some basic concepts and definitions required in this regard.
Symbols :
Symbols are indivisible objects or entity that cannot be defined. That is, symbols are the atoms of the world of languages. A symbol is any single object such as a, 0, 1, #, begin, or do. Usually, characters from a typical keyboard are only used as symbols.
A general definition of language must cover a variety of distinct categories: natural languages, programming languages, mathematical languages, etc. The notion of natural languages like English, Hindi, etc. is familiar to us. Informally, language can be defined as a system suitable for expression of certain ideas, facts, or concepts, which includes a set of symbols and rules to manipulate these. The languages we consider for our discussion is an abstraction of natural languages. That is, our focus here is on formal languages that need precise and formal definitions. Programming languages belong to this category. We start with some basic concepts and definitions required in this regard.
Symbols :
Symbols are indivisible objects or entity that cannot be defined. That is, symbols are the atoms of the world of languages. A symbol is any single object such as a, 0, 1, #, begin, or do. Usually, characters from a typical keyboard are only used as symbols.
|
Alphabets :
An alphabet is a finite, nonempty set of symbols. The alphabet of a language is normally denoted by ∑. When more than one alphabets are considered for discussion, then subscripts may be used or sometimes other symbol like G may also be introduced. Strings or Words over Alphabet : A string or word over an alphabet ∑ is a finite sequence of concatenated symbols of ∑. Example : 0110, 11, 001 are three strings over the binary alphabet { 0, 1 } . aab, abcb, b, cc are four strings over the alphabet { a, b, c }. It is not the case that a string over some alphabet should contain all the symbols from the alphabet. For example, the string cc over the alphabet { a, b, c } does not contain the symbols a and b. Hence, it is true that a string over an alphabet is also a string over any superset of that alphabet. |
Length of a string :
The number of symbols in a string w is called its length, denoted by |w|.
Example : | 011 | = 4, |11| = 2, | b | = 1 It is convenient to introduce a notation e for the empty string, which contains no symbols at all. The length of the empty string e is zero, i.e., | e | = 0.
Convention :
We will use small case letters towards the beginning of the English alphabet to denote symbols of an alphabet and small case letters towards the end to denote strings over an alphabet. That is, a, b, c, ϵ, ∑ (symbols) and u, v, w, x, y, z are strings.
The number of symbols in a string w is called its length, denoted by |w|.
Example : | 011 | = 4, |11| = 2, | b | = 1 It is convenient to introduce a notation e for the empty string, which contains no symbols at all. The length of the empty string e is zero, i.e., | e | = 0.
Convention :
We will use small case letters towards the beginning of the English alphabet to denote symbols of an alphabet and small case letters towards the end to denote strings over an alphabet. That is, a, b, c, ϵ, ∑ (symbols) and u, v, w, x, y, z are strings.
Need More Information Download the materials!