Linux - History, Terms, and Concepts

Motivation - Why Open Source?

Big Tech Giants - short lists of selected acquisitions

See also

History

In terms of lines of code: Linux System = Linux Kernel (6%) + GNU software (15%) + other free software (79%)

Terms

File systems

File systems provide applications with a unified interface to physical media, i.e. the program code for reading and writing files is identical for hard disks, USB sticks, and SD cards. File systems differ in terms of (among other things)

Often file names are given and cannot be changed. However, if you have the choice, consider the following:

Here is an overview of the most commonly used file systems:

Buffers and Page Cache

When applications read or write files the system does not implement each access as a physical operation on the disk; this would result in very poor performance. Instead, files are copied to RAM where they stay for some time so subsequent requests can be performed much faster. A large amount of available RAM is therefore beneficial for performance.

Buffers are used for I/O operations that move data between storage devices, such as from RAM to disk.

The page cache is an area of memory where file content is stored for faster access. This area is typically much larger than the buffers and can occupy a significant part of the total RAM, leaving very little free memory. This is not a problem, since cache memory can be quickly re-allocated to programs if needed. Memory usage figures are therefore only meaningful when listed for programs and cache separately.
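On Linux the split between program memory, buffers, and page cache can be read from /proc/meminfo. The following is a minimal Python sketch and not a replacement for tools such as free. The function names are illustrative, and the figure for programs is the simple approximation MemTotal - MemFree - Buffers - Cached.

```python
import os


def parse_meminfo(text):
    """Parse 'Key:   value kB' lines (the /proc/meminfo format) into a dict."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])  # kB for most entries
    return values


def memory_summary(info):
    """Separate memory used by programs from buffers and page cache (in kB)."""
    cache = info["Buffers"] + info["Cached"]
    return {
        "total_kb": info["MemTotal"],
        "free_kb": info["MemFree"],
        "buffers_and_cache_kb": cache,
        # Approximation: whatever is neither free nor buffers/cache.
        "programs_kb": info["MemTotal"] - info["MemFree"] - cache,
    }


# Only runs on systems that actually provide /proc/meminfo (i.e. Linux).
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        summary = memory_summary(parse_meminfo(f.read()))
    for name, kb in summary.items():
        print(f"{name:>22}: {kb / 1024:10.1f} MiB")
```

A small free_kb value next to a large buffers_and_cache_kb value is the normal situation described above, not a sign of memory shortage.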

Under normal conditions the system takes care of buffering and page caching without the user noticing. The exception is removable drives such as USB sticks, which need to be 'unmounted' or 'ejected' before they are physically disconnected, so that any data still held in memory is written to the device before it is unplugged.

Encodings

Character encodings are used to translate byte streams into characters displayed on the screen. The hard drive contains files, which in turn contain bytes. Each byte contains 8 bits, so a single byte can take 2^8 = 256 different values and can therefore encode at most 256 different characters.

The ASCII code (American Standard Code for Information Interchange) is a 7-bit code. Its 128 positions are occupied by the characters used in the English language and some control characters used in data communication. Each character is stored in one byte.

The ISO 8859-1 character encoding contains the ASCII characters as the first 128 entries, and other characters used in Western European languages in the remaining 128 positions. Each character is encoded in one byte.

UTF-8 is the standard encoding for Unicode. The Unicode project aims at providing support for all characters used by any major community in the world. Currently the Unicode table contains roughly 150,000 characters.

From the position in the Unicode table (the code point) the UTF-8 encoding of the character can be derived:
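The derivation can be sketched in Python as follows. The function name utf8_encode is illustrative only. The bit layout is the standard UTF-8 scheme of one to four bytes, and in practice Python's built-in str.encode("utf-8") does the same job.

```python
def utf8_encode(code_point):
    """Derive the UTF-8 byte sequence from a Unicode code point."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")
    if code_point < 0x80:
        # 1 byte:  0xxxxxxx (identical to ASCII)
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# Cross-check against Python's own encoder for a few sample characters.
for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
    assert utf8_encode(cp) == chr(cp).encode("utf-8")
    print(f"U+{cp:04X} -> {utf8_encode(cp).hex(' ')}")
```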

UTF-8 is becoming the standard in information processing. However, many applications still use other encodings, and this continues to cause problems.

Note that the ASCII characters have the same byte values in all three encodings described above. Files containing only ASCII characters have the best chance of being correctly transferred across different types of systems and processed by whatever application software will work on them. For this reason it is still a good idea to only use ASCII characters if at all feasible.
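A short Python sketch illustrates both points: ASCII text yields identical bytes in all three encodings, while a non-ASCII character does not, and decoding with the wrong encoding garbles it. The sample strings are arbitrary choices for illustration.

```python
# Pure ASCII text: the bytes are the same in all three encodings.
text = "Linux"
ascii_bytes = {enc: text.encode(enc) for enc in ("ascii", "iso-8859-1", "utf-8")}
print(ascii_bytes)

# One non-ASCII character (e with acute accent, U+00E9) changes the picture.
accented = "caf\u00e9"
latin1 = accented.encode("iso-8859-1")   # one byte for U+00E9
utf8 = accented.encode("utf-8")          # two bytes for U+00E9
print(latin1, utf8)

# Reading UTF-8 bytes as if they were ISO 8859-1 produces garbled text.
mojibake = utf8.decode("iso-8859-1")
print(mojibake)
```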

Network Basics

Linux and Unix systems are usually connected to the Internet and often function as servers, i.e. they provide a number of services to the outside world.

In order to achieve data communication between networked computers (hosts) a number of protocols have to be established; these form the Internet Protocol Suite which is commonly described in the following layers (in the TCP/IP model):

Ethernet is a family of networking technologies commonly used in the LAN (Local Area Network).

DSL (Digital Subscriber Line) is a family of technologies for transmitting digital data over telephone lines.

PPP (Point to Point Protocol) is used to establish a direct connection between two nodes over many types of physical networks, including cellular phone networks.

IP (Internet Protocol) is used for packet construction, addressing and routing along a number of nodes from source to destination.

UDP (User Datagram Protocol) is a fast and lightweight connectionless protocol that does not guarantee delivery or ordering. TCP (Transmission Control Protocol) is a slower, heavyweight connection-oriented protocol that provides reliable, ordered delivery.

Both UDP and TCP use port numbers: when an application sends a request to a server host, the corresponding service at the destination is identified by its port number, since that host may provide a number of different services. Some services are identified by their well-known ports, such as port 80 for HTTP and port 22 for SSH.
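To illustrate how a port number ties a request to a service, the following Python sketch runs a tiny uppercase-echo 'service' over both UDP and TCP on the loopback interface. The function names and the echo behaviour are assumptions made for illustration. Port 0 asks the operating system for any free port, whereas a real service would bind to its well-known port.

```python
import socket

LOOPBACK = "127.0.0.1"


def udp_round_trip(message):
    """Connectionless: each datagram is simply addressed to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind((LOOPBACK, 0))           # port 0: let the OS pick a free port
        server.settimeout(2)
        client.settimeout(2)
        port = server.getsockname()[1]
        client.sendto(message, (LOOPBACK, port))
        data, sender = server.recvfrom(1024)
        server.sendto(data.upper(), sender)  # reply to the sender's own port
        reply, _ = client.recvfrom(1024)
        return reply


def tcp_round_trip(message):
    """Connection-oriented: a connection is established before data flows."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((LOOPBACK, 0))
        server.listen(1)
        server.settimeout(2)
        port = server.getsockname()[1]
        with socket.create_connection((LOOPBACK, port), timeout=2) as client:
            conn, _ = server.accept()
            with conn:
                client.sendall(message)
                data = conn.recv(1024)
                conn.sendall(data.upper())
                return client.recv(1024)


if __name__ == "__main__":
    print(udp_round_trip(b"hello via udp"))
    print(tcp_round_trip(b"hello via tcp"))
```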

A firewall is commonly used to allow only certain types of network traffic to certain hosts and ports, thereby avoiding a large number of problems associated with malicious requests. A firewall establishes a barrier against attacks and involves one or more computers or specialized hardware.

If a host is meant to answer HTTP requests and allow users to connect via ssh, then ports 80 and 22 have to be open. The following command can be used to find the open ports of a given host and the services listening on those ports. Scanning for open ports can be interpreted as preparing for an attack, so consider carefully before using this command on hosts outside your own control.

nmap localhost

Note that while this command identifies the open ports on your local host, the results do not mean that those ports are actually reachable from outside your LAN or organisation. A number of port scanner web sites are available to test the ports on your host reachable from the Internet.
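For a single port, a similar check can be sketched in Python by attempting a TCP connection. The function name is_port_open is hypothetical, and this is far less capable than nmap. The same caution applies: only probe hosts under your own control.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error number otherwise.
        return sock.connect_ex((host, port)) == 0
```

For example, is_port_open("127.0.0.1", 22) reports whether an SSH server is listening on the local host. As with nmap localhost, a positive result says nothing about reachability from outside the LAN.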

Linux Installation

Look at the website distrowatch.com to get an overview of current Linux distributions, their popularity, and their features. Popular distributions are:

  1. Mint is based on Ubuntu and comes in several 'spins' i.e. desktops:
  2. Ubuntu is based on Debian. It comes with the new Gnome 3 desktop which is not universally popular with users, especially those coming from Gnome 2, as the interface design has been changed considerably and for no good reason, many would feel, the present author included. By switching to the 'Classic' desktop upon login the traditional Gnome 2 design can be restored to some extent.
  3. Debian is the system of choice for servers. It can be run on the desktop, but this is best left to more experienced users.
  4. Fedora is another popular choice; this one is not based on Debian.

The choice of desktop is not critical; you can always install additional desktops later. Switching to another distribution can be a hassle: the software will be basically the same, but all your personal configurations need to be migrated, and many things will work slightly differently, so expect some headaches. It is better to choose once and stick with it.

Download the ISO image for the distribution of your choice, e.g. linuxmint-21.2-mate-64bit.iso

32 or 64 bit: today 64-bit is the only sensible choice for practically everyone. Almost all current desktop computers and notebooks work in 64-bit mode. The main difference is memory addressing. A 32-bit pointer can address 2^32 = 4294967296 bytes of memory, i.e. 4 GB, a serious limitation when PCs today come with 8 GB of RAM or more.
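The arithmetic behind the 4 GB limit can be checked with a few lines of Python. The helper name is illustrative, and 'GB' is taken here in the binary sense used for RAM sizes (1 GiB = 2^30 bytes).

```python
def addressable_bytes(pointer_bits):
    """Number of distinct byte addresses a pointer of the given width can hold."""
    return 2 ** pointer_bits


GIB = 2 ** 30  # bytes per GiB

for bits in (32, 64):
    total = addressable_bytes(bits)
    print(f"{bits}-bit pointer: {total} bytes = {total // GIB} GiB")
```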

VirtualBox Installation:

You can install Linux inside a virtual machine such as VirtualBox. Download the software from virtualbox.org (on Windows you probably also need the Visual C++ runtime), then use the Linux ISO file when you create the VM for your Linux system; version 7 of VirtualBox makes this quite easy and intuitive.

There will be an impact on performance, but you do not need to worry about disturbing your existing operating system (the 'host' system in VM terms), and you can run both systems at the same time. Setup is also much easier: just start the VM with the ISO file mounted (e.g. as 'optical drive'), and then run the Linux installer.

Especially for trying things out, comparing various distributions, and getting comfortable with the Linux system the VM is certainly a sensible option. However, at some point you will probably become unhappy with the slower performance, and you will want to switch to dual-boot.

Dual-boot Installation:

Even if you do not intend a dual-boot setup, the installation will still use the boot manager (GRUB). In that case you can set the GRUB timeout to something like 2 seconds to speed up the startup. Do not set it to 0: you want to be able to go into rescue mode without needing a USB stick containing the live system (which is always an option).

Partitions: For a very basic desktop system you only need one partition mounted as / (root) with a minimum size of about 20 GB (Mint). Unless your hard disk or SSD space is severely limited, about 50 GB (at least) is a more reasonable choice.

Swap space: The installer suggests creating a swap partition, but for a desktop installation with a single Linux system it is more flexible to use a swap file, which can be set up later. If you plan on installing other Linux versions as well, a separate swap partition can be shared among all of them to save some space.

/home partition: by default /home is in the system partition so the disk space can be shared among system and user data; however, there are advantages to having a separate /home partition, such as easily keeping your data and user settings when you upgrade to a new major release of your distribution, and installing and using more than one Linux system. In both cases there is a chance of incompatible user settings.

Encrypt home directory: usually, but not always, a good idea.

Notebooks: very much recommended, since they can get lost or stolen easily. Obviously encryption only helps if you use a sufficiently strong login password. See section Tools/Keyring for details.

A strong password should be at least 12 characters long and must not be based on dictionary words. Use upper case, lower case, and digits, but do not substitute digits for letters: good old simple passwords like netw0rk or g0ldf1sh can be cracked almost instantly nowadays.
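These rules can be turned into a rough checker. The following Python sketch is illustrative only: the function name, the substitution table, and the tiny word list are assumptions, and a real check would use a large dictionary. It undoes common digit-for-letter substitutions before looking for dictionary words, which is why netw0rk and g0ldf1sh are caught.

```python
# Undo common digit-for-letter substitutions: 0->o, 1->i, 3->e, 4->a, 5->s, 7->t
LEET = str.maketrans("013457", "oieast")


def password_problems(password, dictionary):
    """Return a list of reasons why the password is weak (empty list = no issue found)."""
    problems = []
    if len(password) < 12:
        problems.append("shorter than 12 characters")
    if not any(c.isupper() for c in password):
        problems.append("no upper case letter")
    if not any(c.islower() for c in password):
        problems.append("no lower case letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    normalized = password.lower().translate(LEET)
    for word in sorted(dictionary):
        if len(word) >= 4 and word in normalized:
            problems.append(f"based on dictionary word '{word}'")
    return problems


if __name__ == "__main__":
    words = {"network", "goldfish"}  # stand-in for a real dictionary
    for candidate in ("netw0rk", "g0ldf1sh", "Tq7vXm2LwZr9bK"):
        print(candidate, "->", password_problems(candidate, words) or "no issue found")
```

An empty result only means that none of these simple rules fired; it does not prove that a password is strong.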

Desktop PCs: these tend to be at a much lower risk of theft or loss compared to laptops, and home directory encryption comes with some downsides:

For these reasons many people opt for other encryption solutions on their desktop, such as encrypting only a particular folder or virtual volume. Suitable tools include gocryptfs (apt-get install gocryptfs) and VeraCrypt.

Boot Menu: After the installation is finished you will see the boot manager taking over at startup. It shows the boot menu and allows you to choose an operating system for this session.

After installation the boot manager defaults to the new Linux system. Once the system is up and you are logged in the default can be changed with

sudo grub-set-default n

in a terminal window, where n is the number of the entry you see in the boot menu (counting from zero!). Note that this only takes effect if GRUB_DEFAULT=saved is set in /etc/default/grub.

New Release: this depends on your distribution; in Linux Mint there are new major versions every other year or so. There are also three point releases (or minor releases). A release is identified by major and minor number and a name, e.g. Linux Mint 19.3 Tricia. Releases are supported for five years, e.g. Mint 19 was released in 2018 and supported until (end of) 2023. Transitions between minor releases such as from 19.2 to 19.3 tend to cause no problems and can be done with the Update Manager: Refresh, then look in Edit.

Going to a new major release is usually not painless. There are several options, and the choice is tricky. Whatever you opt for, back up your home directory first; also make a list of your installed packages. The Backup Tool (in the Administration menu) helps with that; however, it creates a single (possibly huge) tar file. Another simple backup of your home directory is e.g. cp -rp /home/myuser /media/myuser/somedrive. This gives you a copy on your external drive with all the files and directories, ready to work with.
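The same kind of plain copy can be sketched in Python with the standard shutil module. The function name backup_tree is hypothetical. Unlike cp -rp, shutil.copy2 preserves permissions and timestamps but not file ownership, and copytree refuses to overwrite an existing target directory.

```python
import shutil
from pathlib import Path


def backup_tree(source, destination):
    """Copy the directory 'source' recursively to destination/<name of source>."""
    source = Path(source)
    target = Path(destination) / source.name
    # symlinks=True copies symbolic links as links instead of following them;
    # copy2 keeps permission bits and modification times.
    shutil.copytree(source, target, symlinks=True, copy_function=shutil.copy2)
    return target
```

For example, backup_tree("/home/myuser", "/media/myuser/somedrive") would create /media/myuser/somedrive/myuser, using the paths from the cp command above.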

The following are feasible options, in order of probable usefulness, depending on your situation:

Additional Packages: When your system is up and running you may want to start your Software Manager and install some additional packages. There are thousands of packages; here are only a few suggestions (some of them may be part of your system already, depending on your distro):

Running software for other operating systems

Within your Linux session you sometimes want to run software from other operating systems.

Participate!

Even if your programming skills are not top-notch, and you don't have the skill or time to write documentation and tutorials, you can still take part in the Open Source movement with moderate effort: