This is a short tutorial on Linear Codes, using Sage. To follow along with the code, you must either download Sage or use it online. Both options are free and you can find the links here: sagemath.org

### Basics:

Size of input word: *k*

Size of coded word: *n*

Number of coded words: *2 ^{k}*

Generator Matrix: [

*I*;

*P*] where

*I*is

*k*x

*k*and

*P*=

*k*x

*n*–

*k*

*I*: Identity Matrix

*P*: An Arbitrary Matrix

The Minimum Hamming Distance (*MHD*) is very important in Code Theory because it is closely related to the code’s ability to correct errors. **A code with MHD=d can only correct (d-1)/2 errors.**

The *MHD* depends heavily on *P*, so deciding on its values is of the utmost importance. A safe strategy to build around is by cycling a string of bits *k* times.

Example:

We cycled the binary word *11100* three times and produced the above matrix.

Choosing all 1s or all 0s is an especially bad practise and will result in the shortest possible *MHD*. Increasing *n* generally increases *MHD*, but one will still have some experimenting to do.

Let’s define two coded words *c _{1}* and

*c*. Their modulo 2 addition is also a coded word.

_{2}*c*+

_{1}*c*=

_{2}*c*, where

_{3}*c*a valid coded word. Because of that, the code will always include the word

_{3}*0*(

_{n}*n*zeros in a row – 000…000). We can see that by adding the same word

*c*twice.

*c + c = 0*.

### Encoding:

This is how a coded word is calculated:

*d = d _{1},d_{2},d_{3}…d_{k}* – Input Word

*c = c*– Coded Word (output)

_{1},c_{2},c_{3}…c_{n}The first *k* bits of the coded word *c* are the same with the input word *d*:

*c _{1} = d_{1}*

*c*

_{2}= d_{2}*c*

_{3}= d_{3}…

*c*

_{k}= d_{k}The rest bits of the coded word are computed like this:

*c _{k+1} = p_{11}*d_{1} + p_{21}*d_{2} + p_{31}*d_{3} + p_{k1}*d_{k}*

*c*

_{k+2}= p_{12}*d_{1}+ p_{22}*d_{2}+ p_{32}*d_{3}+ p_{k2}*d_{k}…

…

*c*

_{n}= p_{1,n-k}*d_{1}+ p_{2,n-k}*d_{2}+ p_{3,n-k}*d_{3}+ p_{k,n-k}*d_{k}All multiplications are modulo 2.

If we represent the coded word with the *1*x*n* matrix *C*, the input word with the *1*x*k* matrix *D* and our generator matrix with *G*, we can write the above with the formula:

Linear codes are *systematic* codes, which means the input data can be found in the coded word. In this case, the input will be at the front of the word (*c _{i} = d_{i}* for

*i*in [

*i,k*]).

Let’s say we want to encode a message of *m* bits, with *m > k*. We simply divide the message into parts of *k* and encode them seperately, joining them all together in the end to form our coded message.

Because of that, it is better if we choose *k* as a number that divides *m* perfectly. For example, if *m* is in ASCII, which is represented by 8-bits (although the first is not used), we can choose *k* to be 2, 4 or 8 (or even 1).

### Sage Example:

#### Generation

After you open up a new Sage project, add this code in the editor:

MS = MatrixSpace(GF(2),8,16); G = MS([ [1,0,0,0,0,0,0,0,1,0,1,1,1,0,0,0], [0,1,0,0,0,0,0,0,0,1,0,1,1,1,0,0], [0,0,1,0,0,0,0,0,0,0,1,0,1,1,1,0], [0,0,0,1,0,0,0,0,0,0,0,1,0,1,1,1], [0,0,0,0,1,0,0,0,1,0,0,0,1,0,1,1], [0,0,0,0,0,1,0,0,1,1,0,0,0,1,0,1], [0,0,0,0,0,0,1,0,1,1,1,0,0,0,1,0], [0,0,0,0,0,0,0,1,0,1,1,1,0,0,0,1] ]);

MS is the Matrix Space the Generator Matrix will be in. GF(2) means that the values in the space will be binary (0 or 1). The dimension of the matrices in the space will be 8. That also means the input word will be 8 bits long. As a result, there will be *2 ^{8}* coded words in total. The legnth of a coded word is 16.

G is our generator matrix. It comes from MS (our Matrix Space). We simply give it the values we want. If you pay close attention, you will notice that G is in the form of [*I;P*] (where *I* is *8*x*8* and *P=8*x*16-8*). In every line, the first 8 values come from the Identity Matrix, and the last 8 are some arbitrary values. Notice though that we haven’t put just some random values, but we have cycled a 5-bit word into a space of 8 bits. Because the dimension of G is 8, we can only cycle those 5 bits 8 times.

Now it is time to actually create our Linear Code. Add:

C=LinearCode(G);

And voila. Our Linear Code is C. We can quickly check the code’s minimum distance (MHD) by writing:

C.minimum_distance();

This code has a minimum distance of 5. That means it can correct *(5-1)/2=2* errors. (NOTE: If *MHD* is even, then the number of errors the code can correct is *x.5*. The *.5* means that it can correct all errors of size x, and it can correct half of the errors of size x+1, but not any for bigger sizes.)

#### Encoding

Let’s say we have a message that we want to encode. We will call this message *m*, and it will be of length *l*. Keep in mind that *m* is in binary and *l* is the number of bits in the message. If you have a non-binary message you want to encode, you’d have to convert it to binary.

As previously mentioned, we need to divide our word into parts of length *k*. Assuming we chose *k* as a number that perfectly divides *l*, we can divide *l* into *l/k* parts, all of which will have a length of *k*.

Now we go through each part and encode it, joining all the coded parts to form our coded word. The coded word will have a length of *n*l/k*.

In general, we can write something like this:

l = len(m); c = []; for i in range(l/k): w = C.encode(m[i*k:i*k+k]); c.append(w);

The above code divides *m* into *l/k* parts and encodes them one-by-one, adding each separate result into one final coded word. The part of the message from *i*k* to *i*k+k* is the part that will be encoded on each loop.

For our example, we replace *k* with *8* and we have:

l = len(m); c = []; for i in range(l/8): w = C.encode(m[i*8:i*8+8]); c.append(w);

Keep in mind that every time we encode a part, a vector is created. So, *c* is in fact a list of vectors, where each vector is a coded word on its own. If we join the vectors together one after the other, we will end up with our true coded word in string (or anything we want) from.

#### Decoding

To decode a word we do the exact opposite of encoding. We have our list of coded vectors, *c*, and we need to decode them one-by-one and join them together to form our original message. Let *s* be the size of *c*, aka the number of vectors in the list. We will store the decoded message to *decodedMessage*. Because we want the decoded word as a string, we will also have to do some convertion. After we decode each vector, we get another vector of *k* bits. We will be converting these bits to string one-by-one and we will add them to *decodedMessage*. In the end, we have this piece of code:

decodedMessage = ""; for i in range(s): decodedPart = C.decode_to_message(c[i]); for j in range(k): decodedMessage += str(decodedPart[j]);

The temporary variable *decodedPart* is used for storing the decoded vector. We then need to parse the *k* bits in *decodedPart* to string and add them to *decodedMessage*.

After the above code, *decodedMessage* will hold our original message in binary form.

#### Complete Code

MS = MatrixSpace(GF(2),8,16); G = MS([ [1,0,0,0,0,0,0,0,1,0,1,1,1,0,0,0], [0,1,0,0,0,0,0,0,0,1,0,1,1,1,0,0], [0,0,1,0,0,0,0,0,0,0,1,0,1,1,1,0], [0,0,0,1,0,0,0,0,0,0,0,1,0,1,1,1], [0,0,0,0,1,0,0,0,1,0,0,0,1,0,1,1], [0,0,0,0,0,1,0,0,1,1,0,0,0,1,0,1], [0,0,0,0,0,0,1,0,1,1,1,0,0,0,1,0], [0,0,0,0,0,0,0,1,0,1,1,1,0,0,0,1] ]); C=LinearCode(G); C.minimum_distance(); message = "Hello World!"; #Our message to encode m = binascii.hexlify(message); #Turn to hex message = ""; #We will put the message here, converted to binary #Convert message to binary for i in range(0, len(m), 2): binary = bin(int(m[i:i+2], 16))[2:]; message += (8 - len(binary)) * '0' + binary; m = vector(map(int, message));#Turn into vector #Encoding the message l = len(m); c = []; for i in range(l/8): w = C.encode(m[i*8:i*8+8]); c.append(w); #Decoding the message decodedMessage = ""; for i in range(s): decodedPart = C.decode_to_message(c[i]); for j in range(k): decodedMessage += str(decodedPart[j]); decodedMessage.replace(' ','').strip(); #Remove any rogue space n = int(decodedMessage, 2); #Binary to decimal #From binary to string decodedMessage = binascii.unhexlify('%x' % n); #Our message, decoded. It's the same as the original. decodedMessage;