Executable Shell Scripts

Warning

This page covers a topic which is not a part of POSIX; these features are not strictly portable across disparate systems.

It’s nice that we can run our script by passing it as the first argument to the sh utility, but could we make the script itself executable? How does the operating system even know that a file is executable, or what type of file it is?

Magic Bytes

A short sequence of bytes at the beginning of many files, called the file signature, is used to identify the contents of that file. These signatures are colloquially referred to as magic numbers or magic bytes.

The POSIX utility file can be used to inspect these magic bytes; it can also guess the type of many plaintext files that don’t have magic bytes by inspecting their contents.

Let’s see an example,

$ file /bin/vim
/bin/vim: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked ...
$ file image.png
image.png: PNG image data, 16 x 16, 8-bit/color RGBA, non-interlaced

Notice that the file utility is able to determine that /bin/vim is an ELF (Executable and Linkable Format) file, even though there is no file extension or other meta-information stored about the file. That is because the first four bytes of the file itself are 7F 45 4C 46 (␡ELF), as we can see with a hex-dump:

$ od -An -t x1c -N4 /bin/vim
  7f   45  4c  46
 177    E   L   F

Same goes for image.png, which begins with a magic byte sequence,

$ od -An -t x1c -N4 image.png
  89  50  4e  47
 211   P   N   G

The rest of the information that file is able to provide is derived by reading more of the file header following the magic bytes, which contains additional information that is specific to that particular format.
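As a rough sketch of the first step that file performs, we can pull out the leading bytes ourselves and print them as a single hex string. The function name magic_of is made up for this example, and it reads only the first four bytes, which is enough for the two signatures shown above but not for every format.

```shell
# magic_of FILE
# Print the first four bytes of FILE as one lowercase hex string.
# (magic_of is a hypothetical helper name, not a standard utility.)
magic_of() {
    # -An    : omit the offset column
    # -tx1   : one hexadecimal byte per field
    # -N4    : stop after four bytes
    # tr then removes the padding and newline that od adds.
    od -An -tx1 -N4 < "$1" | tr -d '[:space:]'
    echo
}
```

Running magic_of on an ELF binary should print 7f454c46, and on a PNG image 89504e47, matching the hex-dumps above.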

What does file tell us when we inspect hello-world.sh?

$ file hello-world.sh
hello-world.sh: ASCII text

Notice that the first byte of both the ELF and PNG signatures is not a printable ASCII character; this makes it very unlikely that such files are misinterpreted as plaintext. A shell script, on the other hand, consists entirely of plaintext, which makes it difficult to identify.

Executable Formats

There are several standard executable formats in wide use today; these are files which contain header information about the type of executable, symbol tables, version information, program code, and data. Microsoft Windows uses the Portable Executable (PE and PE32+) formats for its executables, with the two-byte file signature 4D 5A (ASCII "MZ", or 0x5A4D when read as a little-endian 16-bit integer). Linux and most versions of Unix have standardized on the Executable and Linkable Format (ELF) we discussed above, with the four-byte file signature 0x7f "ELF". Finally, macOS and iOS use the Mach-O (Mach object) format, with the four-byte file signature 0xfeedface for 32-bit programs, and 0xfeedfacf for 64-bit programs; yes, the “Feed Face” is intentional.
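To make the comparison concrete, here is a toy classifier that recognizes only the signatures named above. The name exe_format is invented for this sketch. One assumption worth calling out: the Mach-O magic is a 32-bit integer written in the byte order of the target machine, so on little-endian hardware the bytes on disk appear reversed (ce fa ed fe rather than fe ed fa ce); the sketch accepts both orders. A real tool like file checks far more than this.

```shell
# exe_format FILE
# Guess the executable format of FILE from its first four bytes.
# (exe_format is a hypothetical name; only the signatures discussed
# in this section are covered.)
exe_format() {
    sig=$(od -An -tx1 -N4 < "$1" | tr -d '[:space:]')
    case "$sig" in
        7f454c46)          echo "ELF" ;;
        4d5a*)             echo "PE (MZ header)" ;;   # only two bytes matter
        cefaedfe|feedface) echo "Mach-O 32-bit" ;;    # either byte order
        cffaedfe|feedfacf) echo "Mach-O 64-bit" ;;    # either byte order
        *)                 echo "unknown" ;;
    esac
}
```

Note that a plaintext shell script falls through to "unknown" here, which is exactly the problem the next section addresses.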

Shebang

One magic byte sequence is the ASCII characters #!, also known as “shebang” (a contraction of “sharp-bang”), followed by the path to a program. When an executable file with a shebang is executed, the system executes the specified path with the name of the file itself as an argument. In other words, it is a method of specifying the interpreter to execute a script with.

Here is a simple example,

script.sh

#!/bin/echo
$ chmod +x script.sh  # mark script.sh as executable
$ ./script.sh
./script.sh

This technique is frequently used with shell scripts, to pass the script in as an argument to sh or another shell utility,

script.sh

#!/bin/sh
echo "Hello World!"
$ ./script.sh
Hello World!
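If we want to check which interpreter a script asks for without running it, we can read the first line ourselves. This is only an approximation of what the system does when it executes the file; the function name shebang_of is made up for this sketch, and it simply prints whatever follows the #! characters.

```shell
# shebang_of FILE
# Print the interpreter line of FILE (everything after "#!"),
# or return 1 if FILE does not start with a shebang.
# (shebang_of is a hypothetical helper name.)
shebang_of() {
    # Read the first line; the second test keeps a line that has
    # no trailing newline, and an empty file returns 1.
    IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
    case "$first_line" in
        '#!'*) printf '%s\n' "${first_line#??}" ;;  # strip the two magic bytes
        *)     return 1 ;;
    esac
}
```

For the script.sh above, shebang_of script.sh prints /bin/sh.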

One downside to this is that the locations of system utilities are not always standardized very well, and shebangs do not perform command lookup if only a name is supplied (e.g. #!bash won’t work). Is it /bin/bash, or is it /usr/bin/bash? Thankfully, most implementations of the shebang allow for an additional argument, and this is typically used to call /usr/bin/env with the name of the utility we want to find,

script.sh

#!/usr/bin/env bash
echo "This is Bash!"
$ ./script.sh
This is Bash!

The location of env is widely standardized as /usr/bin/env rather than somewhere else like /bin/env, so that is the path that is used most often.
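The lookup that env performs on our behalf is the ordinary search through the directories listed in PATH. The sketch below imitates that search so we can see which file a name would resolve to; find_in_path is a made-up name, and the details (for example, how empty PATH entries are treated) are simplified assumptions rather than a description of any particular env implementation.

```shell
# find_in_path NAME
# Print the first executable regular file called NAME found in $PATH,
# or return 1 if there is none.
# (find_in_path is a hypothetical helper; the body runs in a subshell,
# so the changes to IFS and globbing do not leak into the caller.)
find_in_path() (
    set -f      # do not expand wildcards while splitting $PATH
    IFS=:       # PATH entries are separated by colons
    for dir in $PATH; do
        candidate=${dir:-.}/$1      # an empty entry means the current directory
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
)
```

Running find_in_path bash on two different systems may well print two different paths, which is why a shebang that hard-codes one of them is less portable than one that goes through env.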

This also works for other types of scripts, such as Python,

script.py

#!/usr/bin/env python3

print("This is python!")
$ chmod +x script.py
$ ./script.py
This is python!

By the way, file can read shebangs,

$ file script.py
script.py: Python script, ASCII text executable
$ file script.sh
script.sh: Bourne-Again shell script, ASCII text executable