SNCA:C/Tutorial

From Soyjak Wiki, the free ensoyclopedia

This is a brief tutorial on the C programming language.

This tutorial assumes C23 or later.

When you're done, consider reading the C++ tutorial, which builds on this one.

Hello World

The Hello World program in C is as follows:

#include <stdio.h>

int main() {
    printf("Hello, world!");
}

Here, #include <stdio.h> is a preprocessor directive that inserts the contents of the standard library header <stdio.h> at the exact location of the #include directive. This header declares various functions for input and output, including printf().

The main function must always have a return type of int. However, an explicit return statement may be omitted in main(), which then implicitly returns 0 (indicating success). This shortcut applies only to main(). Unlike C++, C has no namespaces, so main is simply defined at file scope.

Additionally, the signature of main may be int main(int argc, char* argv[]) to accommodate command-line arguments.

  • argc is the number of command line arguments, plus 1 for the program name.
  • argv is an array of char* (char pointers, essentially the C way of representing strings), where the first element is the name of the executable and all subsequent elements are the command-line arguments.
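As a rough sketch of how these two parameters are typically used, the helper below prints every argument after the program name and returns how many there were. The name printArguments is hypothetical, not a standard function; a real program would call it as printArguments(argc, argv) from main.

```c
#include <stdio.h>

// Hypothetical helper: prints each command-line argument after the
// program name and returns how many arguments were printed.
int printArguments(int argc, char* argv[]) {
    int count = 0;
    // argv[0] is the program name, so start at index 1
    for (int i = 1; i < argc; ++i) {
        printf("Argument %d: %s\n", i, argv[i]);
        ++count;
    }
    return count;
}
```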

Basic types

C consists of the following basic types:

  • int: an integer, contains at least 16 bits; on most systems it is 32 bits.
  • short: a (sometimes) smaller integer, contains at least 16 bits; on most systems it is 16 bits.
  • long: a long integer, contains at least 32 bits; it is 64 bits on most 64-bit Unix-like systems, but 32 bits on Windows. Note that long long is another long integer, which is guaranteed to have at least 64 bits, so use it when you need 64 bits regardless of platform.
  • float: a floating point number (a number with a fractional part), usually 32 bits.
  • double: a double-precision floating point number, usually 64 bits. Note that long double is a floating point type with at least the precision of double, but it is rarely needed.
  • char: an integral type used to represent characters, with a size of at least 8 bits (exactly 8 on virtually all modern systems). Typically represents a single ASCII character. Note that it can be signed or unsigned depending on implementation; for use as a small integer you should specify it to be signed or unsigned.
  • bool: a boolean value (either true or false)
  • void: represents no value (used in functions that don't return a value). This is also used in void* (a void pointer), which will be explained later.

All integral types (int, char, long, etc.) have both a signed and an unsigned version. By default, those types are signed (except char, whose signedness depends on the implementation), but if prefixed with unsigned, they do not accommodate negative values, and have a minimum value of 0. If you exceed the range of allowed values in the type, it causes an overflow. For signed integers this is undefined behavior and should be avoided. (Signed ranges are also usually asymmetric: because 0 takes up one slot, there is one more negative value than positive, for example -128 to 127 for an 8-bit type.) Unsigned integer overflow, by contrast, is well-defined behavior: the value wraps around modulo 2^N, where N is the number of bits in the type, so it can be safely used.
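The wrap-around of unsigned integers can be illustrated with a small sketch. The two function names are hypothetical; UINT_MAX comes from the standard header <limits.h> and is the largest value an unsigned int can hold.

```c
#include <limits.h>

// Returns UINT_MAX + 1, which wraps around to 0 in unsigned arithmetic
unsigned int wrapPastMaximum(void) {
    unsigned int x = UINT_MAX; // largest value an unsigned int can hold
    return x + 1u;
}

// Returns 0 - 1, which wraps around to UINT_MAX in unsigned arithmetic
unsigned int wrapBelowZero(void) {
    unsigned int x = 0u;
    return x - 1u;
}
```

The same arithmetic on a signed int at INT_MAX would be undefined behavior, so it is deliberately not shown.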

To declare strings, use the type char*. This is called a char pointer (where the * represents a "pointer" type). This will be explained in detail later. Note that you can't use == to compare strings, for reasons that will be explained later.

Note that before C23, the type bool (and its values true and false) was not directly part of the language; to use it, you needed to include the header <stdbool.h>.

Variables

Variables in C are first declared with their type, and can later be assigned to with the = operator. The left operand of = must be an lvalue (something that designates a storage location, such as a variable), and it receives the value of the expression on the right (the rvalue). It is best to name your variables something descriptive but not too long, so that their purpose is obvious when re-reading code.

Note that you can declare variables without assigning to them, though a local variable declared this way holds an indeterminate (garbage) value; uninitialized variables should not be read.[1] An easy fix for this is to assign a value to every variable when you declare it.

Assignment expressions evaluate to the assigned value, which allows you to chain assignments together or assign to a variable and check its result in the same expression.

#include <stdio.h>

int main() {
    int age = 18;
    char* name = "Nate";
    char* website = "soyjak.party";
    printf("%s is %d, and thus old enough to post on %s!\n", name, age, website);
}

In a string, \n represents a newline.
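The point that an assignment is itself an expression can be sketched as follows. Both function names are hypothetical and exist only for illustration.

```c
// Hypothetical helper showing chained assignment
int chainedAssignment(void) {
    int a;
    int b;
    int c;
    // Evaluated right to left: c = 5, then b = 5, then a = 5
    a = b = c = 5;
    return a + b + c;
}

// Hypothetical helper: assigns inside a condition and tests the result
int assignAndCheck(int input) {
    int half;
    // half receives input / 2, and the assigned value is then compared with 0
    if ((half = input / 2) > 0) {
        return half;
    }
    return 0;
}
```

The extra parentheses around the assignment in the condition are needed because > binds more tightly than =.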

The type of a variable can be omitted by declaring the variable using auto, which leaves it to the compiler to infer the type from the initialiser.

auto a = 42; // int
auto b = 3.14; // double
auto c = 'A'; // int (in C, unlike C++, character constants have type int)

Reading input

To read input from the standard input stream, use scanf(), which reads a value according to a format specifier and stores it in the variable whose address you pass (hence the & before the variable name).

#include <stdio.h>

int main() {
    int a;
    int b;
    printf("Enter a first number: ");
    scanf("%d", &a);
    printf("\nEnter a second number: ");
    scanf("%d", &b);
    printf("\nThe sum of the numbers is a + b = %d + %d = %d\n", a, b, a + b);
}
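The program above trusts that the user really typed two numbers. scanf() returns the number of items it successfully assigned, so checking that return value is the usual way to detect bad input. The sketch below uses sscanf(), which follows the same format rules but reads from a string, purely so that it can be shown without interactive input; parseTwoInts is a hypothetical name.

```c
#include <stdio.h>

// Hypothetical helper: parses two integers from text.
// Returns 1 if both were read successfully, 0 otherwise.
int parseTwoInts(const char* text, int* a, int* b) {
    // sscanf returns the number of items it assigned,
    // so anything other than 2 means the input was malformed
    int assigned = sscanf(text, "%d %d", a, b);
    return assigned == 2;
}
```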

Control flow

If/else

if, else if, and else allow you to check boolean conditions and execute code based on them. The basic syntax is if (condition1) { body1 } else if (condition2) { body2 } ... else { bodyN }. You can have any number of else if blocks, even zero, and you can have just an if block by itself, but you cannot have an else or else if block without a preceding if.

#include <stdio.h>

int main() {
    int number;

    printf("Enter a number: ");
    scanf("%d", &number);
    printf("\n");

    if (number > 0) {
        printf("The number %d is positive.\n", number);
    } else if (number < 0) {
        printf("The number %d is negative.\n", number);
    } else {
        printf("The number is exactly 0.\n");
    }
}

Switch

A switch block in C basically selects a path based on the value of an integral expression. This is often cleaner than using if/else checks (like YandereDev does), and when possible the compiler will expand it into a jump table, which is usually faster than if/else.

A switch block consists of cases to check against and an optional default case (if no case matches). In order to leave the switch block prematurely, you need to use a break; statement. If there is no break;, it continues on to the next case.

#include <stdio.h>

int main() {
    int day = 3;
    switch (day) {
        case 1:
            printf("Monday");
            break;
        case 2:
            printf("Tuesday");
            break;
        case 3:
            printf("Wednesday");
            break;
        default:
            printf("Some other day");
    }
}
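Falling through to the next case is not always a mistake: it can be used on purpose to let several cases share one body. A minimal sketch, assuming days are numbered 1 (Monday) to 7 (Sunday) as in the example above; isWeekend is a hypothetical name.

```c
// Hypothetical helper: returns 1 for Saturday (6) and Sunday (7), 0 otherwise.
// Days are numbered 1 (Monday) to 7 (Sunday).
int isWeekend(int day) {
    int weekend = 0;
    switch (day) {
        case 6: // no break, so execution falls through to case 7
        case 7:
            weekend = 1;
            break;
        default:
            weekend = 0;
    }
    return weekend;
}
```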

Goto

A goto statement is a statement used to directly tell the program to jump to another part of the code. To make a label, insert label: at a specified part of the code, and then use goto label; to immediately jump to that part of the code.

#include <stdio.h>

int main() {
    int i = 0;

    // Mark a label (start)
start:
    printf("i = %d\n", i);
    i++;

    if (i < 5) {
        goto start; // Jump back to the label (start)
    }

    printf("Done\n");
}

Note that in C++ (but not in C), it is illegal for a goto statement to jump over the initialisation of a variable. (C only forbids jumping into the scope of a variable-length array.)

#include <stdio.h>

int main() {
    goto skip;

    int x = 10; // Declare variable x

    // Illegal in C++, but not in C
skip:
    printf("Hello\n");
}

goto statements are widely considered bad practice, a view that dates back to Edsger Dijkstra's 1968 letter arguing that they result in spaghetti code.[2] They are still useful for avoiding repeated code, and are often used to run cleanup code (such as calls to free()) in a cascade at the end of a function.
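The cleanup cascade can be sketched as follows. This is only an illustration: duplicateTwice and its labels are hypothetical names, and malloc() and free() are the <stdlib.h> functions covered later in this tutorial.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: copies text into two heap buffers and checks
// that the copies match. Returns 0 on success, -1 on failure.
int duplicateTwice(const char* text) {
    int result = -1;
    size_t size = strlen(text) + 1; // +1 for the terminating '\0'
    char* first;
    char* second;

    first = malloc(size);
    if (!first) goto done;        // nothing to clean up yet
    second = malloc(size);
    if (!second) goto freeFirst;  // only first needs freeing

    strcpy(first, text);
    strcpy(second, text);
    result = (strcmp(first, second) == 0) ? 0 : -1;

    // Cleanup runs in reverse order of allocation
    free(second);
freeFirst:
    free(first);
done:
    return result;
}
```

Each failure point jumps to the label that frees exactly what has been allocated so far, so the cleanup code is written only once.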

Loops

A loop in C is basically a way to repeatedly perform some code.

Loops support the two following keywords:

  • break: leaves the loop immediately.
  • continue: ends the current iteration of the loop, and begins the next one immediately.
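Both keywords can be seen together in the following sketch, which uses a for loop (described in the next section). The function name sumOddsBelow is hypothetical.

```c
// Hypothetical helper: adds up the odd numbers from 0 up to (not including) limit
int sumOddsBelow(int limit) {
    int sum = 0;
    for (int i = 0; ; ++i) {
        if (i >= limit) {
            break; // leave the loop entirely
        }
        if (i % 2 == 0) {
            continue; // skip even numbers and move on to the next i
        }
        sum += i;
    }
    return sum;
}
```

For a limit of 10 the loop adds 1 + 3 + 5 + 7 + 9, giving 25.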

For loop

A for loop is a loop where you control the initialisation, the condition for continuing, and the update performed after each iteration. It is typically used when you know how many times you want to iterate. The basic syntax is for (initialisation; condition; update) { body }.

#include <stdio.h>

int main() {
    for (int i = 0; i < 5; ++i) {
        printf("i = %d\n", i);
    }
}

While loop

A while loop is basically a loop where you only control the condition. It is typically used when you don't know exactly how many iterations you want. The basic syntax is while (condition) { body }.

#include <stdio.h>

int main() {
    int i = 0;
    while (i < 5) {
        printf("i = %d\n", i);
        ++i;
    }
}

If you want an infinite loop, you would set the condition to just true, like while (true):

while (true) {
    printf("Cobson will always be a gem!\n");
}

Do-while loop

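A do-while loop is basically the same as a while loop, except the body is guaranteed to execute at least once (even if the condition is false), because the condition is checked after the body rather than before it. The basic syntax is do { body } while (condition); (note the trailing semicolon).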

#include <stdio.h>

int main() {
    int i = 0;
    do {
        printf("i = %d\n", i);
        ++i;
    } while (i < 5);
}

Star pyramid

To make a star pyramid like in the Java tutorial, here is what you would do:

#include <stdio.h>

int main() {
    int rows;
    printf("Enter number of rows: ");
    scanf("%d", &rows);
    printf("\n");
    for (int i = 1; i <= rows; ++i) {
        // Leading spaces centre the row
        for (int j = 0; j < rows - i; ++j) {
            printf(" ");
        }
        // Row i contains 2 * i - 1 stars
        for (int k = 0; k < 2 * i - 1; ++k) {
            printf("*");
        }
        printf("\n");
    }
}

Functions

To create your own function, you begin with the return type, then the function name, then the list of parameters, and then the function's body.

void postBait() {
    printf("Trans rights are human rights!");
}

int currentAdminNumber() {
    return 6;
}
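Neither function above takes parameters. As a small sketch of a parameter list, the hypothetical functions below take int parameters and return a result.

```c
// Takes two int parameters and returns their sum
int add(int a, int b) {
    return a + b;
}

// Functions can call other functions; the arguments are passed by value
int addThree(int a, int b, int c) {
    return add(add(a, b), c);
}
```

Because arguments are passed by value, the function receives copies: changing a or b inside add would not affect the caller's variables.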

Pointers and arrays

In C, a pointer is a variable that stores the memory address of another variable.

int x = 10;
int* addr = &x; // addr stores the address of x

For any type T, it has a corresponding type T*, meaning a pointer to that type T.

There are two important operators with pointers:

  • &, the address-of operator. This gets the address of where something lives in memory. For example, &x gets the address of x.
  • *, the dereference operator. This gets whatever is stored at the address held by a pointer. For example, if p is a pointer, *p gets the value stored at the address that p holds.

nullptr (or NULL in older versions of C) represents a "null pointer", in other words a pointer that points to nothing, and usually has the memory address 0. It must not be dereferenced: doing so is undefined behavior, and in practice the program will usually crash.

int* p = nullptr;
int x = *p; // Dereferences a null pointer: crashes the program!

This is a serious source of bugs in many C programs.

Arrays in C are a collection of objects stored in contiguous memory locations. They are indexed beginning with 0.

int numbers[5] = {1, 2, 3, 4, 5};

numbers[3] = 6;
// Now: numbers = [1, 2, 3, 6, 5]

// You don't always have to specify the length
int moreNumbers[] = {6, 7, 8, 9};

To loop through an array:

int numbers[] = {1, 2, 3, 4, 5};
for (int i = 0; i < 5; ++i) {
    printf("numbers[%d] = %d\n", i, numbers[i]);
}

Unless you are using dynamic memory, arrays can't be resized and their size must be known at creation.

Arrays in C are closely related to pointers. For every type T, there is a type T[N], representing an array of T of length N. In most expressions, an array is implicitly converted (it "decays") to a pointer to its first element. For example, when you declare int numbers[] = {1, 2, 3};, using numbers by itself acts like a pointer to the first element (&numbers[0]). Also, if a function takes an array as a parameter, then it is actually taking a pointer to the first element of that array, which is why the length usually has to be passed separately.

The indexing operation is roughly equivalent to pointer arithmetic. For an array a: a[i] is equivalent to *(a + i), the element i entries away from a (the start).
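The equivalence between indexing and pointer arithmetic can be sketched like this. The function name sumWithPointers is hypothetical; because a function only receives a pointer, the element count is passed as a separate parameter.

```c
// Hypothetical helper: sums count ints starting at the pointer values
int sumWithPointers(int* values, int count) {
    int sum = 0;
    for (int i = 0; i < count; ++i) {
        // *(values + i) means exactly the same as values[i]
        sum += *(values + i);
    }
    return sum;
}
```

Passing an array name such as numbers hands over a pointer to its first element, and passing numbers + 2 would start the sum at the third element instead.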

Accessing an array beyond the valid range of indices is undefined behavior: in practice the program reads or overwrites whatever happens to be stored next to the array in memory, which is extremely unsafe and is the cause of many bugs.

You can also have a multi-dimensional array:

int matrix[2][3] = {
    {1, 2, 3},
    {4, 5, 6}
};

int value = matrix[0][2]; // value = 3 

Void pointers

In C, there is a special kind of pointer, the void* (void pointer). Note that this doesn't mean it points to a void in memory (as void doesn't have a size and can't occupy memory); rather, it is a pointer that can point to any data type, but does not know the type it is pointing to. You cannot directly dereference a void pointer; you must first cast it to a pointer to a concrete type. You also cannot perform pointer arithmetic on a void pointer in standard C.

void pointers are thus extremely useful for writing generic code, as they can point to any piece of memory.

#include <stdio.h>

void printAsInt(void* data) {
    printf("%d\n", *(int*)data);
}

int main() {
    int x = 10;
    void* p = &x;

    printAsInt(p);
}

In the <stdlib.h> header, there are useful algorithms which operate on void*, such as qsort(), which sorts an array (despite the name, the standard does not require it to use the quicksort algorithm), or bsearch(), which performs a binary search on a sorted array.
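As a sketch of how qsort() relies on void*, the comparison callback below receives two untyped pointers and casts them back to the real element type. The names compareInts and sortInts are hypothetical; qsort() itself is the standard function from <stdlib.h>.

```c
#include <stdlib.h>

// Comparison callback for qsort: receives pointers to two elements
// and returns a negative, zero, or positive value depending on their order
int compareInts(const void* left, const void* right) {
    int a = *(const int*)left;
    int b = *(const int*)right;
    // Avoids a - b, which could overflow for large values
    return (a > b) - (a < b);
}

// Hypothetical helper: sorts an int array in ascending order
void sortInts(int* values, size_t count) {
    qsort(values, count, sizeof(int), compareInts);
}
```

Because qsort() only sees void*, it has to be told the size of each element (sizeof(int)) and how to compare two of them.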

Strings

In C, there isn't a native string type or class like there is in C++, Java, or many other languages. Instead, a string is essentially represented as an array of chars, where the final element is the character '\0' (the null character), which has the ASCII value 0.

char s1[] = "Hello";
// s1 = ['H', 'e', 'l', 'l', 'o', '\0']

char s2[] = {'H', 'e', 'l', 'l', 'o', '\0'};
// Declares an equivalent string using the explicit array syntax

Note that you cannot check s1 == s2, as this operation actually compares the memory addresses of s1 and s2. In order to check if strings are equal, you must use the strcmp() function from <string.h>.

#include <stdio.h>
#include <string.h>

int main() {
    char s1[] = "Hello";
    char s2[] = {'H', 'e', 'l', 'l', 'o', '\0'};

    if (strcmp(s1, s2) == 0) {
        printf("s1 and s2 are the same string");
    } else {
        printf("s1 and s2 are not the same string");
    }
}

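Note that both char[] and char* can represent a string. However, a char* initialised from a string literal points to read-only memory that must not be modified, whereas a char[] initialised from a string literal holds its own copy, which can be modified.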
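Since a string is just chars followed by '\0', its length can be found by walking along it until the terminator is reached, and a char[] copy can be edited in place. The sketch below shows both; the function names are hypothetical, and capitaliseFirst assumes an ASCII-compatible character set.

```c
#include <stddef.h>

// Hypothetical helper: counts the chars before the terminating '\0',
// which is what strlen() from <string.h> does
size_t stringLength(const char* text) {
    size_t length = 0;
    while (text[length] != '\0') {
        ++length;
    }
    return length;
}

// Hypothetical helper: makes the first char upper case in place.
// Only valid for modifiable arrays, never for string literals.
// Assumes ASCII, where lower and upper case letters are contiguous.
void capitaliseFirst(char* text) {
    if (text[0] >= 'a' && text[0] <= 'z') {
        text[0] = (char)(text[0] - 'a' + 'A');
    }
}
```

Calling capitaliseFirst on a char word[] = "gem"; array is fine, while calling it on a char* that points to a string literal would be undefined behavior.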

Structs

In C, there are no classes like you might see in C++ or Java, but the closest thing is a struct, which aggregates data into a single object.

To access elements inside the struct, use the . syntax to get its elements.

struct Nusoi {
    char name[50];
    int years;
};

int main() {
    struct Nusoi nate = {
        .name = "Nate Higgers",
        .years = 1
    };

    printf("%s is a nusoi who has been on the bald men with glasses website for %d year(s).", nate.name, nate.years);
}

If you have a pointer to a struct, instead of ., you need to use -> to get the fields:

struct Nusoi nate = {"Nate Higgers", 1};
struct Nusoi* nusoi = &nate;

printf("The nusoi %s has used the sharty for %d year(s)", nusoi->name, nusoi->years);

Enums

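In C, an enum represents a type that assigns names to integer values. By default, the first name has the value 0 and each subsequent name is one greater than the previous one.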

enum Colour {
    RED,
    ORANGE,
    YELLOW,
    GREEN,
    BLUE,
    INDIGO,
    VIOLET
};

Enums can also be manually assigned values.

enum Day {
    MONDAY = 1,
    TUESDAY = 2,
    WEDNESDAY = 3,
    THURSDAY = 4,
    FRIDAY = 5,
    SATURDAY = 6,
    SUNDAY = 7
};

Enums are particularly useful for representing a fixed set of related constants, especially if those values represent categories, states, or options. Note that in C, enums are not type-safe:

enum Colour colour = RED;
enum Day day = MONDAY;
day = colour; // day is assigned the integer value RED holds
colour = 15; // colour is assigned the integer value 15, even if there's no Colour with value 15

In other words, enums are basically just ints with some named constants.
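Enums pair naturally with switch. The sketch below repeats the Colour enum from above and adds a hypothetical colourName function; the final return handles the case where the variable holds an integer that has no named constant, which C permits.

```c
enum Colour {
    RED,
    ORANGE,
    YELLOW,
    GREEN,
    BLUE,
    INDIGO,
    VIOLET
};

// Hypothetical helper: maps a Colour to a readable name
const char* colourName(enum Colour colour) {
    switch (colour) {
        case RED:    return "red";
        case ORANGE: return "orange";
        case YELLOW: return "yellow";
        case GREEN:  return "green";
        case BLUE:   return "blue";
        case INDIGO: return "indigo";
        case VIOLET: return "violet";
    }
    // Reached only if colour holds an integer with no named constant
    return "unknown";
}
```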

Dynamic memory allocation

In C, you can request memory at runtime using the <stdlib.h> header. This provides the following functions:

  • void* malloc(size_t size): allocates memory of size size (note that size_t is an unsigned integer type)
  • void* calloc(size_t num, size_t size): allocates memory for an array of num objects of size size and sets all bytes to 0
  • void* realloc(void* ptr, size_t size): re-allocates the memory at ptr to size size, allowing you to grow or shrink the amount of allocated memory you have
  • void free(void* ptr): frees the memory allocated at ptr, returning it to the allocator so that it can be reused

In C, memory is divided between the stack and the heap. On the stack, allocation and deallocation are done automatically and are much faster than on the heap, but the size of each object is generally fixed at compile time and the total space is limited. Meanwhile, on the heap, you have manual control over the memory, have far more space to work with, and can choose sizes at runtime, but you are responsible for freeing the memory when you are done.

One problem with the stack is that you need to know the size of each object at compile time; if you don't, you can use heap allocations to decide the size dynamically. With dynamic memory, you can also have strings of unspecified size, and grow the string as needed, rather than have a fixed size limit as with a stack-allocated char[]. This also allows you to create complex data structures that need to grow and shrink, such as linked lists, trees, graphs, hash tables, etc. Furthermore, if you need an object to persist even after the function that created it ends, you can use a pointer to heap memory and free it when you are done.

int* a = malloc(10 * sizeof(int)); // Creates space for 10 ints

free(a); // Frees the memory in a

It is extremely important to free the memory once you are done with it. When the program finishes executing, the operating system will reclaim all the memory used by the program anyway, but a long-running program that never frees its memory will keep consuming more and more, and may eventually run out of memory.

If you lose the pointer you get from malloc() (without freeing it first), it causes a memory leak: that memory stays allocated but inaccessible until the program exits. Languages with garbage collection, like Java, C#, and Python, largely avoid this problem, but pay for it with the overhead of garbage collection.

Furthermore, using a pointer after you have already freed it is a memory error (known as a use-after-free). You also cannot free a pointer twice (a double free); both are undefined behavior.

Finally, malloc() may return nullptr if it fails to allocate, such as if you ask for too much memory. You must always check if malloc() returned nullptr after you call it.
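Putting these rules together, a minimal sketch of allocating, checking, and handing ownership to the caller might look like this. The function name makeSquares is hypothetical.

```c
#include <stdlib.h>

// Hypothetical helper: returns a heap array holding 0*0, 1*1, 2*2, ...
// or a null pointer if the allocation fails.
// The caller owns the result and must free() it exactly once.
int* makeSquares(size_t count) {
    int* squares = malloc(count * sizeof(int));
    if (!squares) {
        return squares; // allocation failed: pass the null pointer on
    }
    for (size_t i = 0; i < count; ++i) {
        squares[i] = (int)(i * i);
    }
    return squares;
}
```

A caller would check the returned pointer before using it, and call free() on it when finished.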

It should be noted that you can immediately assign void* (which is returned by malloc(), calloc() and realloc()) to any pointer type in C, but this is not legal in C++; there it must first be cast to the correct type, as C++ does not allow this implicit conversion due to stricter type safety.

Headers

In C, you typically split code into a header file (with extension .h), which contains the declarations, or "interface" of your library, and a source file (with extension .c), which contains the actual definitions (the implementation). To use your library, use the #include directive followed by the path to that header. Since including multiple files may cause the same header to be included twice (which causes redefinition errors, for example for struct definitions), you should add #pragma once to the top of the header to prevent it from being included more than once. Note that #pragma once is not part of the C standard, although virtually all compilers support it.

For example:

In Nusoi.h:

#pragma once

struct Nusoi {
    char* name;
    int years;
};

struct Nusoi* newNusoi(char* name, int years);
void deleteNusoi(struct Nusoi* nusoi);
char* getNusoiName(struct Nusoi* nusoi);
int getNusoiYears(struct Nusoi* nusoi);
void printNusoi(struct Nusoi* nusoi);
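#pragma once is widely supported but is not part of the C standard. The portable alternative is a traditional include guard, sketched below for a shortened version of the same header. The guard macro name NUSOI_H is an arbitrary choice; any name that is unique within the project would do.

```c
#ifndef NUSOI_H
#define NUSOI_H

struct Nusoi {
    char* name;
    int years;
};

// Declarations only; the definitions live in Nusoi.c
struct Nusoi* newNusoi(char* name, int years);
void deleteNusoi(struct Nusoi* nusoi);

#endif // NUSOI_H
```

The first time the header is included, NUSOI_H is not yet defined, so the contents are processed and the macro is defined; on any later inclusion the #ifndef fails and the contents are skipped.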

In Nusoi.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "Nusoi.h"

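struct Nusoi* newNusoi(char* name, int years) {
    struct Nusoi* nusoi = malloc(sizeof(struct Nusoi));
    // Note that you can check !nusoi instead of nusoi == nullptr
    // since nullptr implicitly converts to false in boolean checks
    if (!nusoi) {
        // fprintf with first parameter as stderr writes to the
        // standard error stream, instead of standard output stream
        fprintf(stderr, "Failed to create nusoi!\n");
        return nullptr;
    }
    // nusoi->name is only a pointer, so allocate room for the copy
    // (+1 for the terminating '\0') before calling strcpy
    nusoi->name = malloc(strlen(name) + 1);
    if (!nusoi->name) {
        fprintf(stderr, "Failed to create nusoi!\n");
        free(nusoi); // Don't leak the struct if the name can't be allocated
        return nullptr;
    }
    strcpy(nusoi->name, name);
    nusoi->years = years;
    return nusoi;
}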

void deleteNusoi(struct Nusoi* nusoi) {
    free(nusoi->name); // Free the name first
    free(nusoi); // Then free the nusoi pointer
}

char* getNusoiName(struct Nusoi* nusoi) {
    return nusoi->name;
}

int getNusoiYears(struct Nusoi* nusoi) {
    return nusoi->years;
}

void printNusoi(struct Nusoi* nusoi) {
    printf("Nusoi %s has been on the bald men with glasses site for %d year(s)", nusoi->name, nusoi->years);
}

Constants

A runtime constant is denoted with const, which makes a variable unmodifiable after it is initialised.

const int x = 10;

Meanwhile, a compile-time constant is denoted with constexpr (added in C23). Its initialiser must be a constant expression that can be evaluated at compile time, unlike const, which can be initialised by a value computed at runtime. It is a much stronger form of const: the compiler can substitute the value wherever it is used rather than reading it from memory. By convention, such constants are named in ALL_CAPS so that they stand out from ordinary variables.

constexpr double PI = 3.14159265358979323846;

In older versions of C, this had to be done using a preprocessor directive, #define:

#define PI 3.14159265358979323846

However, this is considered far less safe: the preprocessor performs purely textual substitution before the compiler proper sees the code, so a macro has no type and ignores scope, and can easily break code in surprising ways.

Preprocessor

In C and C++, compilation proceeds in the following steps:

  • Preprocessing (expand all macros and preprocessor tokens, handle all file inclusions)
  • Compilation (compile code to assembly)
  • Assembling (assemble assembly code into machine code to object files)
  • Linking (link translation units to be able to reference other object files, combine all object files into single binary)

So, before compilation is run, the compiler must first expand all macros. The C preprocessor is used to handle this step. The preprocessor itself does not understand the C language, but just modifies the text in place according to rules. It expands macros, includes header files (by copying the contents in), removes comments, and handles conditional compilation.

Common directives include:

  • #include: inserts contents of another file into code at that spot. Note the difference between angle brackets and quotations (#include <stdio.h> vs #include "Nusoi.h"): angle brackets search the system and compiler-configured include directories, while quotations typically search the directory of the current file first and then fall back to the same directories as angle brackets.
  • #embed: embeds the contents of a binary file at that spot. Basically #include, but for embedding binary data rather than code.
  • #define: defines a macro. This can be either a constant, or a function-like macro. Function-like macros take one or more parameters and expand based on the arguments. For example: #define SQUARE(x) ((x) * (x)). Each parameter and the whole macro body should be wrapped in parentheses, as otherwise operator precedence may break the expression once the arguments are substituted in.
  • #undef: undefines a macro.
  • #if/#elif/#else/#endif: allows code to be included or excluded based on certain conditions.
    • #ifdef/#ifndef/#elifdef: allows code to be excluded based on whether a macro is defined.
  • #pragma: indicates compiler-specific instructions.
  • #error: halts compilation if encountered.
  • #warning: displays a compiler warning during compilation if encountered.
  • #line: overrides the __LINE__ and __FILE__ macros
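To see why the parentheses in a macro such as SQUARE matter, compare it with an unparenthesised version. In this sketch BAD_SQUARE and the two function names are hypothetical and exist only to show the difference.

```c
// Without parentheses the argument is pasted in as raw text
#define BAD_SQUARE(x) x * x
// With parentheses around each use of x and around the whole body
#define SQUARE(x) ((x) * (x))

// BAD_SQUARE(1 + 2) expands to 1 + 2 * 1 + 2, which is 5
int badSquareOfThree(void) {
    return BAD_SQUARE(1 + 2);
}

// SQUARE(1 + 2) expands to ((1 + 2) * (1 + 2)), which is 9
int goodSquareOfThree(void) {
    return SQUARE(1 + 2);
}
```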

This is an example of using the preprocessor for conditional compilation:

#include <stdio.h>
#define DEBUG 1

int main() {
#if DEBUG
    printf("Debug mode on\n");
#else
    printf("Debug mode off\n");
#endif
}

Conditional compilation can be used to check the operating system, for example.

#include <stdio.h>

int main() {
#ifdef _WIN32
    printf("GEEEEEG, do winjeets really?\n");
#elifdef __linux__
    printf("GEEEEEG, do linuxtroons really?\n");
#elifdef __APPLE__
    printf("GEEEEEG, do itoddlers really?\n");
#else
    printf("I don't know what operating system is so I can't insult it o algo\n");
#endif
}

Function-like macros

You can create function-like macros, which expand by replacing their parameters with the arguments provided to them. Function-like macros are used to achieve functionality which would otherwise be impossible. This includes type-generic variable declarations using auto or typeof, converting names of variables to strings (with the # operator), and concatenating text together to declare multiple unique variables (with the ## operator).

A macro intended to be used as an expression must consist of a single expression, so several operations are combined with the comma operator.

Use of macros should be limited to situations where functions cannot do the job, as macros are not type-safe and are difficult to debug due to being fully expanded before compilation.

You should wrap each use of an argument in the macro's body in parentheses so that it expands with the intended precedence. Note also that an argument which appears more than once in the macro's body is evaluated more than once, so avoid passing expressions with side effects (such as i++) to such macros.
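The danger of side effects in macro arguments can be sketched with a typical MAX macro. Both MAX and maxWithSideEffect are hypothetical names used only for this illustration.

```c
#define MAX(a, b) ((a) > (b) ? (a) : (b))

// Hypothetical helper: passes an expression with a side effect to MAX.
// The call expands to
//   (((*counter)++) > (3) ? ((*counter)++) : (3))
// so the increment runs once in the comparison and, if the first
// branch is chosen, a second time when producing the result.
int maxWithSideEffect(int* counter) {
    return MAX((*counter)++, 3);
}
```

Starting from a counter of 5, the call returns 6 and leaves the counter at 7, whereas an ordinary function taking the same argument would return 5 and increment only once.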

Here is an example of a basic function like macro:

#define PRINT_VAR(var) printf(#var " = %d\n", var_##var)
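The two operators used in that macro can be looked at separately: # turns an argument into a string literal, and ## pastes two tokens together into a single identifier. The sketch below uses hypothetical macro, variable, and function names throughout.

```c
// #name turns the argument into a string literal
#define NAME_OF(name) #name
// score_##name pastes the prefix and the argument into one identifier
#define SCORE_OF(name) score_##name

// Hypothetical variables whose names share the prefix score_
int score_gem = 10;
int score_coal = 2;

// SCORE_OF(gem) expands to score_gem, SCORE_OF(coal) to score_coal
int totalScore(void) {
    return SCORE_OF(gem) + SCORE_OF(coal);
}

// NAME_OF(gem) expands to the string literal "gem"
const char* gemName(void) {
    return NAME_OF(gem);
}
```

In the same way, PRINT_VAR(gem) would expand to printf("gem" " = %d\n", var_gem), so it only compiles if a variable named var_gem exists.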

While C doesn't support function overloading (allowing one function to have multiple signatures), it can be somewhat emulated by using a function-like macro and resolving it using _Generic:

#include <stdio.h>

#define printValue(x) _Generic((x), \
     int: printInt, \
     double: printDouble, \
     char: printChar)(x)

void printInt(int x) {
    printf("int: %d\n", x);
}

void printDouble(double x) {
    printf("double: %f\n", x);
}

void printChar(char x) {
    printf("char: %c\n", x);
}

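int main() {
    char c = 'A';

    printValue(10); // calls printInt
    printValue(3.14); // calls printDouble
    printValue(c); // calls printChar
    // Note: printValue('A') would call printInt, because
    // character constants have type int in C
}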

_Generic also supports a default case, if no type matches.

#include <stdio.h>

#define typeName(x) _Generic((x), \
    int: "int", \
    double: "double", \
    char: "char", \
    default: "unknown")

int main() {
    int a = 5;
    double b = 3.14;
    char c = 'A';

    printf("%s\n", typeName(a)); // prints "int"
    printf("%s\n", typeName(b)); // prints "double"
    printf("%s\n", typeName(c)); // prints "char"
}

Snopes