[FrontPage] [TitleIndex] [WordIndex

Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

1. Bureaucratic part

This part you will not be graded on, but you should do it anyway.

Send me email. My address is <aspnes@cs.yale.edu>. In your message, include:

  1. Your name.
  2. Your status: whether you are an undergraduate, grad student, auditor, etc.
  3. Anything else you'd like to say.

2. Technical part

This part you will be graded on.

Steganography is the technique of hiding a message inside another, seemingly-innocuous message. For this assignment, your are to implement a program that extracts messages hidden in texts as described below.

2.1. The encoding

To encode a message, each character is expanded into a sequence of eight bits representing its ASCII code, with most-significant bit first. For example, the character 'R' (ASCII code 82) would be represented as the bit string 01010010. Each bit is then encoded by mapping each one to one of the first ten letters of the alphabet ('a' through 'j' or 'A' through 'J') and each 0 to some other character. This gives the first 8 characters of a line of text that is then used to encode the original character (any characters after the first 8 are ignored). So, for example, 'R' might be encoded by

Misinterpretation of this important message is encouraged!

A new line is started for each character in the original plaintext. So

Of more delight than hawks and horses be;
The more I hear and see just cause of hate?
That they behold, and see not what they see?
The dedicated words which writers use
The dear respose for limbs with travel tir'd;

encodes the bit sequences

01000010
01100001
01100011
01101111
01101110

which represent the ASCII codes for the characters Bacon, a notorious Elizabethan-era steganographer.

2.2. Your task

Write a program decode.c that reads a hidden message encoded as above from stdin, and writes the message to stdout.

2.3. Details

  1. Your program should handle lines of arbitrary length (including lines that are too short). Missing characters should be treated as zeros, so that

    We do
    think about
    things
    we do
    think about.

    decodes to PchPc.

  2. If end-of-file occurs before reaching a newline, you should not output anything for that line, even if the line has 8 or more characters.
  3. Your program should be able to handle arbitrary characters in its input and output, whether printable or not. Sending non-printable characters directly to your terminal window can have odd effects. It's safer to either redirect the output to a file (./decode < input-file > output-file) and look at the result in a text editor; or pipe the output to a program that will convert it to a more readable format (./decode < input-file | od -t x1).

2.4. Extra hints added 2012-01-18

3. Submitting your assignment

To submit your assignment, use the program /c/cs223/bin/submit on any machine in the Zoo, like this:

$ /c/cs223/bin/submit 1 decode.c
Copying decode.c
$

Here the 1 is the assignment number and decode.c is the name of your source file (which must be exactly decode.c). There is no limit to how many times you can submit your assignment, but only the last version submitted will be graded. Note that submitting your assignment after the deadline will incur a lateness penalty.

You can check that the file was copied correctly by running /c/cs223/bin/check. Additional tools for dealing with submissions are described in HowToUseTheComputingFacilities.

4. Testing your assignment

A public test script is available on the Zoo in /c/cs223/Hwk1/test.decode. You can run this with no arguments in a directory containing a file decode.c, and it will compile your program and run various cryptic tests on it. The tests used for grading may be different, so the score you get from test.decode may not accurately predict your final grade.

If you want to see the inputs for each test and the corresponding expected outputs, look in /c/cs223/Hwk1/testfiles. Many of these files contain non-printing characters, so looking at them with cat is not recommended.

Note that test.decode first tries compiling your program with the line gcc -Wall -std=c99 -pedantic -o decode decode.c, which turns on all of gcc's warnings about errors or use of non-standard C extensions (e.g., features from C++, which gcc will often permit by default). You should make sure that your program compiles without warnings with these flags for full credit.

After submitting your assignment, you can run the test.decode script on the submitted files with the command

/c/cs223/bin/testit 1 decode

This can be useful for making sure you have in fact submitted everything you were supposed to.

5. Sample solution

   1 /*
   2  * Decode steganographic messages
   3  *
   4  * For each line, interpret characters a..j and A..J as 1 bits
   5  * all others as 0 bits,
   6  * representing a single character, most significant bit first.
   7  *
   8  */
   9 
  10 #include <stdio.h>
  11 
  12 #define MOST_SIGNIFICANT_BIT (0x80)   /* value of most significant bit in an 8-bit char */
  13 
  14 int
  15 main(int argc, char **argv)
  16 {
  17     int c;
  18     int output = 0;
  19     int mask = MOST_SIGNIFICANT_BIT; 
  20 
  21     while((c = getchar()) != EOF) {
  22         if(c == '\n') {
  23             /* flush and reset output */
  24             putchar(output);
  25             output = 0;
  26             mask = MOST_SIGNIFICANT_BIT;
  27         } else if((('a' <= c) && (c <= 'j')) || (('A' <= c) && (c <= 'J'))) {
  28             /* one bit */
  29             output += mask;
  30             mask >>= 1;
  31         } else {
  32             /* zero bit */
  33             mask >>= 1;
  34         } 
  35     }
  36 
  37     return 0;
  38 }
decode.c

2014-06-17 11:58