Practice Assignment

(fileconv.c)

Information

  1. For this practice assignment, you will write two functions. The first converts a Windows-style text file to a Unix-style text file, and the second converts a Unix-style text file to a Windows-style text file. The only difference between the two text file formats is how the end-of-line (EOL) is stored. Windows uses two characters: a carriage return (CR, ASCII 0x0D) followed by a line feed (LF, ASCII 0x0A). Unix-like operating systems (e.g. Linux, Mac OS X, iOS, Android) use only a single LF character.

    Historical footnote: Prior to the release of Mac OS X (circa 2000), the Mac used neither of these schemes. Instead, it used a single carriage return (CR, ASCII 0x0D).

    Given a text file that contains these 4 lines of text:
    Roses are red.
    Violets are blue.
    Some poems rhyme.
    But not this one.
    
    Here are the ASCII bytes of that file in hexadecimal, in each of the two formats:

    Windows with CR-LF (0D 0A):

    roses-CRLF.txt:
           00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
    --------------------------------------------------------------------------
    000000 52 6F 73 65 73 20 61 72  65 20 72 65 64 2E 0D 0A   Roses are red...
    000010 56 69 6F 6C 65 74 73 20  61 72 65 20 62 6C 75 65   Violets are blue
    000020 2E 0D 0A 53 6F 6D 65 20  70 6F 65 6D 73 20 72 68   ...Some poems rh
    000030 79 6D 65 2E 0D 0A 42 75  74 20 6E 6F 74 20 74 68   yme...But not th
    000040 69 73 20 6F 6E 65 2E 0D  0A                        is one...
    
    Unix with LF only (0A):
    roses-LF.txt:
           00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
    --------------------------------------------------------------------------
    000000 52 6F 73 65 73 20 61 72  65 20 72 65 64 2E 0A 56   Roses are red..V
    000010 69 6F 6C 65 74 73 20 61  72 65 20 62 6C 75 65 2E   iolets are blue.
    000020 0A 53 6F 6D 65 20 70 6F  65 6D 73 20 72 68 79 6D   .Some poems rhym
    000030 65 2E 0A 42 75 74 20 6E  6F 74 20 74 68 69 73 20   e..But not this 
    000040 6F 6E 65 2E 0A                                     one..
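
    The difference between the two dumps can also be checked in code. The following is only a minimal sketch, not part of the assignment: count_eols is a hypothetical helper name, and it assumes the file contents have already been read into a memory buffer. It counts how many LF bytes are preceded by a CR (Windows line endings) and how many are not (Unix line endings).

```c
#include <assert.h>
#include <stddef.h>

#define CR 0x0D /* Carriage Return */
#define LF 0x0A /* Line Feed       */

/*
 * Hypothetical helper (not part of the assignment): counts how many
 * line endings in buf are CR-LF pairs and how many are a bare LF.
 */
void count_eols(const unsigned char *buf, size_t len,
                size_t *crlf, size_t *lf_only)
{
  size_t i;

  assert(buf != NULL && crlf != NULL && lf_only != NULL);

  *crlf = 0;
  *lf_only = 0;

  for (i = 0; i < len; i++)
  {
    if (buf[i] != LF)
      continue;

    /* An LF preceded by a CR is a Windows EOL; otherwise it is a Unix EOL */
    if (i > 0 && buf[i - 1] == CR)
      (*crlf)++;
    else
      (*lf_only)++;
  }
}
```

    Applied to the 73 bytes of roses-CRLF.txt shown above, this reports 4 CR-LF pairs and 0 bare LFs. Applied to the 69 bytes of roses-LF.txt, it reports 0 CR-LF pairs and 4 bare LFs.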
    
    The prototypes for the functions look like this:
    enum FILE_ERR win2unix(const char *finput, const char *foutput);
    enum FILE_ERR unix2win(const char *finput, const char *foutput);
    
    Here is a header file that you should use (fileconv.h):
    #define CR 0x0D /* Carriage Return */
    #define LF 0x0A /* Line Feed       */
    
    /* Possible file errors */
    enum FILE_ERR {feNONE, feINPUT, feOUTPUT};
    
    enum FILE_ERR win2unix(const char *finput, const char *foutput);
    enum FILE_ERR unix2win(const char *finput, const char *foutput);
    
    Here is a driver file (main.c). You must specify the file names on the command line, along with the target format:
    fileconv unix some_winfile.txt some_unixfile.txt
    
    This will convert the Windows file some_winfile.txt to a Unix file named some_unixfile.txt.

    If you don't provide 3 arguments to the program, it will display this message:

    Usage:  fileconv target input_file output_file
    
    where:  target is either win or unix (the resulting format)
            input_file is the file to convert
            output_file is the newly converted file
    
    Example: (Converts a Windows text file to a Unix text file)
      fileconv unix file-with-CRLF.txt file-with-LF.txt
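
    The provided main.c already handles the command line, so you do not need to write this yourself. Purely as an illustration of how the target argument might be interpreted, here is a small sketch. The enumeration TARGET and the function parse_target are hypothetical names invented for this example, and the real driver may work differently.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enumeration for this sketch; it is not part of fileconv.h */
enum TARGET {tgWIN, tgUNIX, tgUNKNOWN};

/* Maps the "target" command-line argument to the desired output format */
enum TARGET parse_target(const char *target)
{
  assert(target != NULL);

  if (strcmp(target, "win") == 0)
    return tgWIN;
  if (strcmp(target, "unix") == 0)
    return tgUNIX;

  /* Anything else is not a recognized target */
  return tgUNKNOWN;
}
```

    A driver built this way would call unix2win when the result is tgWIN, call win2unix when it is tgUNIX, and print the usage message otherwise.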
    

    The name of your implementation file should be fileconv.c and the command to compile it will look like this:

    gcc -O -Werror -Wall -Wextra -ansi -pedantic main.c fileconv.c -o fileconv
    

    Approximate number of lines of code for each function: 15.

Notes

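  1. Hint: You need to open both files in binary mode (the "rb" and "wb" modes of fopen). Otherwise, on platforms such as Windows, the C library translates the EOL characters as it reads and writes text-mode files, and you don't want that.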
  2. Just read in one character at a time (using fgetc) and write one character (using fputc). Any other way will just complicate things for you.
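  3. The return values of the functions are as follows (the names come from the FILE_ERR enumeration in fileconv.h):

    feNONE    No error; the conversion was performed.
    feINPUT   The input file could not be opened.
    feOUTPUT  The output file could not be opened.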

    We are going to assume that, if you can open the file, then you can do the conversion. It is possible to get a read error during the conversion, but you don't have to handle that at this point.

  4. You may notice that the code for the two functions is almost identical. You may write a helper function that does all of the work and simply call it from the two functions you must create (win2unix and unix2win). You may want to write two separate functions first to make sure you can do it, then combine them. It's up to you.
  5. You don't have to worry about being given binary files instead of text files. Also, you don't have to worry about being given invalid text files. What this means is, if you're given a Windows text file and are told to convert to Windows, your code may convert it incorrectly. For this simple practice, just assume you are given the correct text files to convert.
  6. Here are some files to test your code with. The filenames tell you which kind of line endings are used. They will likely all look the same in the browser. You will have to download them to your computer to preserve the line endings.
  7. There are a couple of Unix utilities (available in Cygwin) called dos2unix and unix2dos which convert files from Windows to Unix and Unix to Windows, respectively. (These were around when DOS was the primary operating system from Microsoft, hence the name dos2unix.) You can use these to test any files. Your output should match the output from these programs. For example, to convert a Windows (DOS) file to Unix:
    dos2unix -n input.txt output.txt
    
    Be sure you have the -n or you'll overwrite your original files! The unix2dos program works similarly. For detailed information on how to use the two programs, type:
    man dos2unix

  8. When using diff on your output, DO NOT use --strip-trailing-cr. This time, you REALLY want to see different line endings between Windows and Unix files. (You don't want to ignore those differences.)
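
  Notes 1 through 4 can be combined into one possible sketch, shown below. Treat it as an illustration under stated assumptions, not as the reference solution. The helper name convert_file and its to_windows flag are invented for this example. The sketch assumes feINPUT and feOUTPUT mean that the input or output file could not be opened, and it relies on note 5: in a valid Windows text file every CR belongs to a CR-LF pair, so dropping each CR is enough. In a real fileconv.c, the definitions at the top would come from fileconv.h instead of being repeated.

```c
#include <assert.h>
#include <stdio.h>

/* Repeated here only to keep the sketch self-contained (see fileconv.h) */
#define CR 0x0D /* Carriage Return */
#define LF 0x0A /* Line Feed       */

enum FILE_ERR {feNONE, feINPUT, feOUTPUT};

/* Hypothetical helper: copies finput to foutput, converting line endings */
static enum FILE_ERR convert_file(const char *finput, const char *foutput,
                                  int to_windows)
{
  FILE *in;
  FILE *out;
  int ch;

  assert(finput != NULL && foutput != NULL);

  /* Binary mode, so the C library does not translate EOL characters */
  in = fopen(finput, "rb");
  if (!in)
    return feINPUT;

  out = fopen(foutput, "wb");
  if (!out)
  {
    fclose(in);
    return feOUTPUT;
  }

  /* One character in, at most two characters out */
  while ((ch = fgetc(in)) != EOF)
  {
    if (to_windows)
    {
      if (ch == LF)
        fputc(CR, out); /* Unix LF becomes CR-LF */
      fputc(ch, out);
    }
    else if (ch != CR)
      fputc(ch, out);   /* Windows CR-LF becomes LF: skip the CR */
  }

  fclose(in);
  fclose(out);
  return feNONE;
}

enum FILE_ERR win2unix(const char *finput, const char *foutput)
{
  return convert_file(finput, foutput, 0);
}

enum FILE_ERR unix2win(const char *finput, const char *foutput)
{
  return convert_file(finput, foutput, 1);
}
```

  Because the sketch drops every CR, its output matches dos2unix only for valid Windows text files. A file containing a CR that is not followed by an LF would be handled differently by the two, which is outside the scope of this practice (see note 5).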