Copyright © Philip M. Parker, INSEAD. Terms of Use.

(From Wikipedia, the free Encyclopedia)
C is a free-form programming language developed by Dennis Ritchie in the early 1970s, from BCPL, for use on the UNIX operating system. C is the most widely used language for writing system software; it is also used for writing applications. C is one of the most frequently used programming languages in computer science education. C++ was developed from C.
History
The initial development of C occurred between 1969 and 1973 (according to Ritchie, the most creative period was 1972). It was called "C" because many features derived from an earlier language named B. Accounts differ regarding the origins of B. It may have derived from an earlier language called BCPL, or from another language called Bon, which may or may not have been named after Ken Thompson's wife Bonnie.
By 1973, the C language had become powerful enough that most of the UNIX kernel was reimplemented in C, perhaps following the examples of the Multics system (implemented in PL/I), Tripos (implemented in BCPL), and possibly others. In 1978, Ritchie and Brian Kernighan published The C Programming Language. During the late '70s, C began to replace BASIC as a microcomputer language, and it was eventually adopted for use with the IBM PC.
The popularity of C increased significantly during the '80s. In 1983, the American National Standards Institute (ANSI) formed a committee to standardize the language; the resulting standard was completed in 1989 and adopted by the International Organization for Standardization (ISO) in 1990. In the early 1980s, Bjarne Stroustrup and others at Bell Labs worked to add object-oriented programming language constructs to C. The language they produced with Cfront was called C++ (thus avoiding the issue of whether the successor to "B" and "C" should be "D" or "P"). C++ is now the language most commonly used for commercial applications on the Microsoft Windows operating system, though C remains more popular in the Unix world.
A study of one Linux distribution found that 71% of its 30 million lines of code was C code.
Features
The main features of C are:
- Focus on the procedural programming paradigm, with facilities for programming in a structured style.
- Access to low level hardware via the use of pointers to refer to locations in memory.
- Parameters are always passed to functions by value, not by reference.
- Lexical variable scoping (but no support for closures or functions defined within functions).
- A language definition simple enough to keep the entire language in the programmer's head. This is accomplished by restricting the language to essential syntax and operators, and exporting everything nonessential to a standardized set of library routines.
- Use of a preprocessor language, the C preprocessor, for tasks such as defining macros and including multiple source code files.
- O(1) performance for all operators.
The functionality of C is guaranteed by the ANSI/ISO C89/90 and C99 standards documents, which explicitly specify when the compiler (or environment) shall issue diagnostics. The documents also specify what behavior one can expect from C code that conforms to the standard.
For example, the following code, according to the standard, produces undefined behavior (specifically, because the parameters passed to the standard strcpy() function should not overlap).
    #include <stdio.h>
    #include <string.h>

    int main (void)
    {
        char *s = "Hello World!\n";

        strcpy (s, s+1);  /* remove first character from string s --
                             s = s+1;
                             is probably what the programmer wanted */
        return 0;
    }
Note: bracing style varies from programmer to programmer and can be the subject of great debate ("religious wars"). See Indent style for more details.
"Undefined behavior" means that the resulting program can do anything, including (accidentally) working as the programmer intended it to, producing incorrect output, crashing horribly every time it is run, crashing only under certain obscure conditions, etc. The canonical expression among experienced C programmers is that "demons may fly out of your nose" (usually abbreviated to "nasal demons"); that is, anything can happen.
Some compilers do not adhere to either of the standards in their default mode, so many programs are written that will compile only with a certain version of a certain compiler on a certain platform. By contrast, any program written only in standard C will compile unchanged on any platform which has a conforming C implementation.
Although C is usually termed a high level language, this is only in comparison to assembly language; it is significantly lower-level than most other programming languages. In particular, it is up to the programmer to manage the contents of computer memory. C provides no facilities for array bounds checking or automatic garbage collection. C has sometimes been termed "portable assembly language".
Manual memory management provides the programmer with greater leeway in tuning the performance of a program, which is particularly important for programs such as device drivers. However, it also makes it easy to accidentally create code with bugs stemming from erroneous memory operations, such as buffer overflows. Tools have been created to help programmers avoid these errors, including libraries for performing array bounds checking and garbage collection, and the lint source code checker. Intentional exploitation of programs written in C containing potential buffer overruns is often used to break computer security, either manually or by viruses and worms.
Some of the perceived shortcomings of C have been addressed by newer programming languages derived from C. The Cyclone programming language has features to guard against erroneous memory operations. C++ and Objective C provide constructs designed to aid object-oriented programming. Java and C# add object-oriented programming constructs as well as a higher level of abstraction, such as automatic memory management.
Versions of C
K&R C
C evolved continuously from its beginnings in Bell Labs. In 1978, the first edition of Kernighan and Ritchie's The C Programming Language was published. It introduced the following features to the existing versions of C:
- structure data types
- long int data type
- unsigned int data type
- The =+ operator was changed to +=, and so forth (=+ was confusing the C compiler's lexical analyzer).
For several years, the first edition of The C Programming Language was widely used as a de facto specification of the language. The version of C described in this book is commonly referred to as "K&R C." (The second edition covers the ANSI C standard, described below.)
K&R C is often considered the most basic part of the language that is necessary for a C compiler to support. Since not all of the currently-used compilers have been updated to fully support ANSI C, and reasonably well-written K&R C code is also legal ANSI C, K&R C is considered the lowest common denominator that programmers should stick to when maximum portability is desired. For example, the bootstrapping version of the GCC compiler, xgcc, is written in K&R C. This is because many of the platforms supported by GCC did not have an ANSI C compiler when GCC was written, just one supporting K&R C.
However, ANSI C is now supported by almost all the widely used compilers. Most of the C code being written nowadays uses language features that go beyond the original K&R specification.
ANSI C and ISO C
In 1989, C was first officially standardized by ANSI in ANSI X3.159-1989 "Programming Language C". One of the aims of the ANSI C standard process was to produce a superset of K&R C. However, the standards committees also included several new features, more than is normal in programming language standardization.
Some of the new features had been "unofficially" added to the language after the publication of K&R, but before the beginning of the ANSI C process. These included:
- void functions and void * data type
- functions returning struct or union types
- struct field names in a separate name space for each struct type
- assignment for struct data types
- const qualifier to make an object read-only
- a standard library incorporating most of the functionality implemented by various vendors
- enumerations
- the single-precision float type
Several features were added during the standardization process, most notably function prototypes (borrowed from C++), and a more capable preprocessor.
The ANSI C standard, with a few minor modifications, was adopted as ISO standard number ISO 9899. The first ISO edition of this document was published in 1990 (ISO 9899:1990.)
C99
After the ANSI standardization process, the C language specification remained relatively static for some time, whereas C++ continued to evolve. (Normative Amendment 1 created a new version of the C language in 1995, but this version is rarely acknowledged.) However, the standard underwent revision in the late 1990s, leading to ISO 9899:1999, which was published in 1999. This standard is commonly referred to as "C99". It was adopted as an ANSI standard in March 2000.
The new features added in C99 include:
- inline functions
- freeing of restrictions on the location of variable declarations (as in C++)
- the addition of several new data types, including long long int (to reduce the pain of the 32-bit to 64-bit transition looming for much old code with the predicted obsolescence of the x86 architecture), an explicit boolean datatype, and a _Complex type representing complex numbers
- variable-length arrays
- official support for one-line comments beginning with // as in C++ (already supported by many C89 compilers as a nonstandard extension)
- several new library functions, including snprintf()
- several new header files, including stdint.h
Interest in supporting the new C99 features is mixed. Whereas GCC and several commercial compilers support most of the new features of C99, the compilers maintained by Microsoft and Borland do not, and these two companies do not seem to be interested in adding such support.
"Hello, World!" in C
The following simple application prints out "Hello, World" to the standard output file (which is usually the screen, but might be a file or some other hardware device). A version of this program appeared for the first time in K&R.
    #include <stdio.h>

    int main(void)
    {
        printf("Hello, World!\n");
        return 0;
    }
Anatomy of a C Program
A C program consists of functions and variables. C functions are like the subroutines and functions of Fortran or the procedures and functions of Pascal. The function main() is special in that a C program always begins executing at the beginning of this function. This means that every C program must have a main() function.
The main() function will usually call other functions to help perform its job, such as printf() in the above example. Functions from the standard library are frequently used. Other libraries can provide extra functionality, such as a graphical interface, advanced mathematical operations, or access to platform-specific features. Any nontrivial program will include its own functions written by the programmer.
A function may return a value to the environment which called it. This is usually another C function. The main() function's calling environment is the operating system. Hence, in the "Hello, world!" example above, the operating system receives a value of 0 when the program terminates.
A C function consists of a return type (void if no value is returned), a unique name, a list of parameters in parentheses (void if there are none) and a function body delimited by braces. The syntax of the function body is equivalent to that of a compound statement.
Control structures
Compound statements
Compound statements in C have the form
    { <optional-declaration-list> <optional-statement-list> }
and are used as the body of a function or anywhere that a single statement is expected.
Expression statements
A statement of the form
    <optional-expression> ;
is an expression statement. If the expression is missing, the statement is called a null statement.
Selection statements
C has three types of selection statements: two kinds of if and the switch statement.
The two kinds of if statement are
    if (<expression>) <statement>
and
    if (<expression>) <statement> else <statement>
In the if statement, if the expression in parentheses is nonzero or true, control passes to the statement following the if. If the else clause is present, control will pass to the statement following the else clause if the expression in parentheses is zero or false. The two are disambiguated by matching an else to the nearest preceding unmatched if at the same nesting level. Braces may be used to override this or for clarity.
The switch statement causes control to be transferred to one of several statements depending on the value of an expression, which must have integral type. The substatement controlled by a switch is typically compound. Any statement within the substatement may be labeled with one or more case labels, which consist of the keyword case followed by a constant expression and then a colon (:). No two of the case constants associated with the same switch may have the same value. There may be at most one default label associated with a switch; control passes to the default label if none of the case labels are equal to the expression in the parentheses following switch. Switches may be nested; a case or default label is associated with the smallest switch that contains it. Switch statements can "fall through": when one case section has completed its execution, statements continue to be executed downward until a break statement is encountered. This may prove useful in certain circumstances, but some newer programming languages forbid case statements from falling through. In the example below, if <label2> is reached, the statements <statements 2> are executed and nothing more inside the braces. However, if <label1> is reached, both <statements 1> and <statements 2> are executed, since there is no break to separate the two case statements.
    switch (<expression>) {
        case <label1> :
            <statements 1>
        case <label2> :
            <statements 2>
            break;
        default :
            <statements 3>
    }
Iteration statements
C has three forms of iteration statement:
    do <statement> while (<expression>);
    while (<expression>) <statement>
    for (<expression>; <expression>; <expression>) <statement>
In the while and do statements, the substatement is executed repeatedly so long as the value of the expression remains nonzero or true. With while, the test, including all side effects from the expression, occurs before each execution of the statement; with do, the test follows each iteration.
If all three expressions are present in a for, the statement
    for (e1; e2; e3) s;
is equivalent to
    e1; while (e2) { s; e3; }
apart from the behavior of a continue inside s, which in the for form still executes e3.
Any of the three expressions in the for loop may be omitted. A missing second expression makes the while test nonzero, creating an infinite loop.
Jump statements
Jump statements transfer control unconditionally. There are four types of jump statements in C: goto, continue, break, and return.
The goto statement looks like this:
    goto <identifier>;
The identifier must be a label located in the current function. Control transfers to the labeled statement.
A continue statement may appear only within an iteration statement and causes control to pass to the loop-continuation portion of the smallest enclosing such statement. That is, within each of the statements
    while (expression) { /* ... */ cont: ; }
    do { /* ... */ cont: ; } while (expression);
    for (optional-expr; optexp2; optexp3) { /* ... */ cont: ; }
a continue not contained within a nested iteration statement is the same as goto cont.
The break statement is used to get out of a for loop, while loop, do loop, or switch statement. Control passes to the statement following the terminated statement.
A function returns to its caller by the return statement. When return is followed by an expression, the value is returned to the caller of the function. Flowing off the end of the function is equivalent to a return with no expression. In both of these latter cases, the returned value is undefined.
Operator precedence in C89
Operators are listed from highest precedence to lowest.
    () [] -> . ++ --                      postfix operators
    ++ -- * & ~ ! + - sizeof              unary operators
    (cast)                                cast operator
    * / %                                 multiplicative operators
    + -                                   additive operators
    << >>                                 shift operators
    < <= > >=                             relational operators
    == !=                                 equality operators
    &                                     bitwise and
    ^                                     bitwise exclusive or
    |                                     bitwise inclusive or
    &&                                    logical and
    ||                                    logical or
    ?:                                    conditional operator
    = += -= *= /= %= <<= >>= &= |= ^=     assignment operators
    ,                                     comma operator
Data declaration
    name             minimum range
    char             -127..127 or 0..255
    unsigned char    0..255
    signed char      -127..127
    int              -32767..32767
    short int        -32767..32767
    long int         -2147483647..2147483647
    float            1e-37..1e+37 (positive range)
    double           1e-37..1e+37 (positive range)
    long double      1e-37..1e+37 (positive range)
    a[0][0]  a[0][1]  a[0][2]  a[0][3]
    a[1][0]  a[1][1]  a[1][2]  a[1][3]
    a[2][0]  a[2][1]  a[2][2]  a[2][3]
Pointers
If a variable has an asterisk (*) in its declaration it is said to be a pointer.
Examples:
    int *pi;      /* pointer to int */
    int *api[3];  /* array of 3 pointers to int */
    char **argv;  /* pointer to pointer to char */
The value at the address stored in a pointer variable can then be accessed in the program with an asterisk. For example, given the first example declaration above, *pi is an int. This is called "dereferencing" a pointer.
Another operator, the & (ampersand), called the address-of operator, returns the address of a variable, array, or function. Thus, given the following
    int i, *pi; /* int and pointer to int */
    pi = &i;
i and *pi could be used interchangeably (at least until pi is set to something else).
Strings
Strings may be manipulated without using the standard library. However, the library contains many useful functions for working with both zero-terminated strings and unterminated arrays of char.
The most commonly used string functions are:
- strcat(dest, source) - appends the string source to the end of string dest
- strchr(s, c) - finds the first instance of character c in string s and returns a pointer to it or a null pointer if c is not found
- strcmp(a, b) - compares strings a and b (lexical ordering); returns negative if a is less than b, 0 if equal, positive if greater.
- strcpy(dest, source) - copies the string source to the string dest
- strlen(st) - returns the length of string st
- strncat(dest, source, n) - appends a maximum of n characters from the string source to the end of string dest; characters after the null terminator are not copied.
- strncmp(a, b, n) - compares a maximum of n characters from strings a and b (lexical ordering); returns negative if a is less than b, 0 if equal, positive if greater.
- strncpy(dest, source, n) - copies a maximum of n characters from the string source to the string dest
- strrchr(s, c) - finds the last instance of character c in string s and returns a pointer to it or a null pointer if c is not found
The less important string functions are:
- strcoll(s1, s2) - compares two strings according to a locale-specific collating sequence
- strcspn(s1, s2) - returns the index of the first character in s1 that matches any character in s2
- strerror(err) - returns a string with an error message corresponding to the code in err
- strpbrk(s1, s2) - returns a pointer to the first character in s1 that matches any character in s2 or a null pointer if not found
- strspn(s1, s2) - returns the index of the first character in s1 that matches no character in s2
- strstr(st, subst) - returns a pointer to the first occurrence of the string subst in st or a null pointer if no such substring exists.
- strtok(s1, s2) - returns a pointer to a token within s1 delimited by the characters in s2.
- strxfrm(s1, s2, n) - transforms s2 into s1 using locale-specific rules
File Input / Output
In C, input and output are performed via a group of functions in the standard library. In ANSI/ISO C, those functions are defined in the <stdio.h> header.
Standard I/O
Three standard I/O streams are predefined:
- stdin standard input
- stdout standard output
- stderr standard error
These streams are automatically opened and closed by the runtime environment; they need not and should not be opened explicitly.
The following example demonstrates how a filter program is typically structured:
    #include <stdio.h>

    int main(void)
    {
        int c;
        int anErrorOccurs = 0;  /* placeholder: set when processing fails */

        while ((c = getchar()) != EOF) {
            /* do various things
               to the characters */

            if (anErrorOccurs) {
                fputs("an error occurred\n", stderr);
                break;
            }

            /* ... */
            putchar(c);
            /* ... */
        }
        return 0;
    }
Passing command line arguments
The parameters given on a command line are passed to a C program through two parameters of main(): the count of the command line arguments in argc and the individual arguments as character arrays in the pointer array argv. So the command
    myFilt p1 p2 p3
results in something like
    argc    = 4
    argv[0] = "myFilt"
    argv[1] = "p1"
    argv[2] = "p2"
    argv[3] = "p3"
    argv[4] = NULL
(Note: there is no guarantee that the individual strings are contiguous.)
The individual values of the parameters may be accessed with argv[1], argv[2], and argv[3].
The C Library
Many features of the C language are provided by the standard C library. A "hosted" implementation provides all of the C library. (Most implementations are hosted, but some, not intended to be used with an operating system, aren't.) Access to library features is achieved by including standard headers via the #include preprocessing directive.
See C library, C standard library (ANSI C standard library), GNU Compiler Collection.
References
- The C Programming Language, by Brian Kernighan and Dennis Ritchie. Also known as K&R. This is good for beginners.
- 1st, Prentice-Hall 1978; ISBN 0-131-10163-3. Pre-ANSI C.
- 2nd, Prentice-Hall 1988; ISBN 0-131-10362-8. ANSI C.
- The C Standard, edited by the British Standards Institute. The official ISO standard (C99) in book form.
- Wiley, 2003; ISBN 0-470-84573-2.
- C: A Reference Manual, by Samuel P. Harbison and Guy L. Steele. This book is excellent as a definitive reference manual, and for those working on C compilers and processors. The book contains a BNF grammar for C.
- 4th, Prentice-Hall 1994; ISBN 0-133-26224-3.
- 5th, Prentice-Hall 2002; ISBN 0-130-89592-X.
External links
This article (or an earlier version of it) contains material from FOLDOC, used with permission.
- The Development of the C Language article by Dennis M. Ritchie
- Programming in C: A Tutorial by Brian W. Kernighan
- Lysator collection of C language resources
- comp.lang.c Answers to Frequently Asked Questions (FAQ List) by Steve Summit
Source: adapted by the editor from Wikipedia, the free encyclopedia under a copyleft GNU Free Documentation License (GFDL) from the article "C programming language."