
A thorough reimplementation of the C Standard Library's printf


printf_42

A personal printf built from Professor Don Colton's wonderful resource, Secrets of "printf".

This project was a great exercise in understanding how variable arguments work in C and how versatile the printf function is through its many flags and format identifiers. The aim of this README is to give a concise explanation of the project's most important features and touch on other points of interest.

Key Details

The first function to be addressed, run_thru_string, gives a glimpse into how this project is organized through the loop found on line 43:

screen shot 2017-11-28 at 1 39 20 pm

The format string is processed much as a string in a simple putstr would be: each character is output through the write system call until the terminating null is reached. However, printf would not be printf without its ability to print formatted data to the standard output. This is why the while loop also checks for format identifiers (subsequences beginning with %) on line 45. If the loop encounters a %, the additional arguments fed to printf are formatted and inserted into the resulting string where their respective specifiers would have been. The first specifier handled is 'n', found on line 49, which stores the total number of characters written so far. All other specifiers are determined later on in the program. To view the many types of format identifiers, please refer to the image below. The last thing worth mentioning is the variable named 'size'. printf returns the number of characters in the resulting string, which is why size is incremented with each call to write and is passed as an argument throughout the program.
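The loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: mini_printf, put_char, and put_str are hypothetical names, and only %s, %c, %% and %n are handled so that the control flow and the running 'size' counter stay visible.

```c
#include <stdarg.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write one character, return 1 on success. */
static int put_char(char c)
{
    return (write(1, &c, 1) == 1);
}

/* Hypothetical helper: write a string, return the number of characters. */
static int put_str(const char *s)
{
    size_t len = strlen(s);

    return (write(1, s, len) == (ssize_t)len ? (int)len : 0);
}

/* Simplified stand-in for run_thru_string (assumed shape, not the original). */
int mini_printf(const char *fmt, ...)
{
    va_list ap;
    int     size = 0;               /* characters written so far */

    va_start(ap, fmt);
    while (*fmt)
    {
        if (*fmt != '%')
        {
            size += put_char(*fmt++);       /* ordinary character */
            continue;
        }
        fmt++;                              /* step past the '%' */
        if (*fmt == '\0')
            break;                          /* lone trailing '%' */
        if (*fmt == 'n')
            *va_arg(ap, int *) = size;      /* %n stores the running count */
        else if (*fmt == 's')
            size += put_str(va_arg(ap, char *));
        else if (*fmt == 'c')
            size += put_char((char)va_arg(ap, int)); /* char is promoted to int */
        else if (*fmt == '%')
            size += put_char('%');
        fmt++;                              /* step past the specifier */
    }
    va_end(ap);
    return (size);
}
```

Calling mini_printf("ab%s%n%c%%\n", "cd", &n, 'e') would write the seven characters "abcde%" plus a newline, return 7, and leave 4 in n, since four characters had been written when %n was reached.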

screen shot 2017-11-28 at 1 51 19 pm

Note ~ This program does not manage doubles (%f, %e/%E, %g/%G), but does manage wide characters (%C and %S).

After running some validation checks on what follows the % in the run_thru_string function, the next major function worth examining is the determine_flags function. The format specifier in printf is interesting because it can also contain sub-specifiers such as flags, width, precision, and modifiers. The determine_flags function is the main function responsible for checking all of the above. Please refer to the detailed explanation behind each sub-specifier in the table below before moving on to determine_flags:


screen shot 2017-11-28 at 2 38 03 pm

screen shot 2017-11-28 at 2 28 35 pm

determine_flags takes both the sub-string that followed the % from earlier and the list of variable arguments. The purpose of this function is two-fold. First, it takes the content of its sub-string argument and parses it into the project's struct. Second, it turns on flags and manages the data for the appropriate sub-specifiers (listed above) if they are found in the sub-string. The latter is done primarily through the function calls on lines 101 to 103. determine_width will be looked at to get an idea of how this is done on a more detailed level before moving on to dispatcher.
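As a rough idea of what parsing flags into a struct can look like, here is a small sketch. The struct layout, its field names, and parse_flags are all assumptions made for illustration; the project's real struct is not shown in this README.

```c
#include <stdbool.h>

/* Hypothetical struct: one switch per printf flag character. */
typedef struct s_spec
{
    bool minus;     /* '-' left-justify within the width */
    bool plus;      /* '+' always print a sign */
    bool space;     /* ' ' blank in front of positive numbers */
    bool zero;      /* '0' pad with zeros instead of spaces */
    bool pound;     /* '#' alternate form (e.g. leading 0 for octal) */
}   t_spec;

/* Turn on every flag found at the start of s.
 * Returns the number of flag characters consumed. */
int parse_flags(const char *s, t_spec *spec)
{
    int i = 0;

    *spec = (t_spec){0};            /* start with every flag off */
    while (s[i])
    {
        if (s[i] == '-')
            spec->minus = true;
        else if (s[i] == '+')
            spec->plus = true;
        else if (s[i] == ' ')
            spec->space = true;
        else if (s[i] == '0')
            spec->zero = true;
        else if (s[i] == '#')
            spec->pound = true;
        else
            break;                  /* first non-flag character ends the run */
        i++;
    }
    return (i);
}
```

Given the sub-string "-#08x", this sketch would consume three characters, switch on minus, pound, and zero, and leave the width digit 8 for the next parsing step.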

screen shot 2017-11-28 at 4 01 38 pm

To determine the width and precision from the given sub-string, the determine_width function contains two if-statements. The first checks for a numerical value in the sub-string (but not zero, as zero is its own sub-specifier; see the diagram above), since this number sets the width offset for the corresponding variable argument. If one is found, this case is noted and the value is stored, as seen on lines 23 and 24.
What the output looks like with the width justified to the right
screen shot 2017-11-28 at 5 08 35 pm
What the output looks like with the width justified to the left (note the preceding minus sign)
screen shot 2017-11-28 at 5 08 49 pm

The second if-statement checks for the character that introduces precision, '.', and if it is found, makes note of it and stores the value in a similar fashion on lines 30 and 31. Lastly, the two nested while-loops are used to skip past the digits that set width and/or precision.
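The two checks described above might be sketched like this. The struct t_width and the function parse_width are hypothetical names chosen for the example, and the layout differs from the original determine_width, which is only visible in the screenshot.

```c
#include <ctype.h>

/* Hypothetical holder for the parsed values. */
typedef struct s_width
{
    int has_width;
    int width;
    int has_precision;
    int precision;
}   t_width;

/* Parse an optional width followed by an optional ".precision".
 * Returns the number of characters consumed from s. */
int parse_width(const char *s, t_width *w)
{
    int i = 0;

    w->has_width = 0;
    w->width = 0;
    w->has_precision = 0;
    w->precision = 0;
    /* A leading '0' is the zero flag, so a width must start with 1-9. */
    if (s[i] >= '1' && s[i] <= '9')
    {
        w->has_width = 1;
        while (isdigit((unsigned char)s[i]))
            w->width = w->width * 10 + (s[i++] - '0');
    }
    /* '.' introduces the precision; a bare '.' means precision 0. */
    if (s[i] == '.')
    {
        w->has_precision = 1;
        i++;
        while (isdigit((unsigned char)s[i]))
            w->precision = w->precision * 10 + (s[i++] - '0');
    }
    return (i);
}
```

For the sub-string "12.5d" this would record a width of 12 and a precision of 5 and report four characters consumed, leaving the caller positioned on the conversion letter.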

Next, the dispatcher function will be examined. This function was originally planned to store a dispatch table, but for the sake of brevity and speed, it is now the first of three functions that each contain a few if-statements corresponding to the appropriate format specifiers.

screen shot 2017-11-28 at 5 16 57 pm

This function passes the integer conversions to their appropriate functions on line 44; strings, characters, and pointers to their respective functions on line 46; and everything else to the invalid function on line 47.
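For comparison, the dispatch table that was originally planned could take a shape like the one below. This is not what the project does (it uses if-statements), and every name here is hypothetical. The handlers are stubs that only return a tag, and the grouping of conversion letters is an assumption.

```c
#include <stddef.h>
#include <string.h>

typedef int (*t_handler)(char spec);    /* hypothetical handler signature */

/* Stub handlers: each returns a tag so the routing is observable. */
static int handle_int(char spec)     { (void)spec; return (1); }
static int handle_text(char spec)    { (void)spec; return (2); }
static int handle_invalid(char spec) { (void)spec; return (0); }

/* One row per handler: the conversion letters it accepts. */
typedef struct s_entry
{
    const char *specs;
    t_handler   fn;
}   t_entry;

static const t_entry g_table[] = {
    {"dDioOuUxX", handle_int},      /* assumed integer conversions */
    {"sScCp", handle_text},         /* strings, characters, pointers */
};

/* Look the specifier up in the table; fall back to the invalid handler. */
int dispatch(char spec)
{
    size_t i;

    for (i = 0; i < sizeof(g_table) / sizeof(g_table[0]); i++)
    {
        /* strchr would also match the terminating NUL, so rule it out. */
        if (spec != '\0' && strchr(g_table[i].specs, spec))
            return (g_table[i].fn(spec));
    }
    return (handle_invalid(spec));
}
```

The trade-off is the one the README hints at: a table keeps the routing in data and is easy to extend, while a handful of if-statements is quicker to write for a small, fixed set of specifiers.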

As seen in the dispatcher function above, this recoded printf supports 15 conversion specifiers with invalid options handled in a separate function. To showcase this variety, the support for strings, octal integers, and wide characters will be discussed in full.

screen shot 2017-12-03 at 12 42 23 am

The handler for string types was perhaps the most straightforward to implement. The first task is to check whether the corresponding variable argument is NULL. If it is, the program's output should mimic printf's error handling, as seen on lines 50 and 52. Otherwise, the variable argument is duplicated and passed to the auxiliary function in charge of output. In this auxiliary function, helper, two calls to the precision and width formatting functions (named s_precision and s_width in reference to the string type) finally apply the specifications gathered earlier in the program, if any were given. The functions alter the duplicated string by either limiting the number of characters to be printed or justifying the result with white space. Once these two actions have been performed, the string is ready to be printed and the resulting length in characters is returned.
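The precision and width steps for strings can be illustrated with a single function. This sketch merges what s_precision and s_width do into one hypothetical format_str, and it assumes the "(null)" text that common libc implementations print for a NULL string argument; the project's own handling is only visible in the screenshot.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, formatted copy of src (caller frees).
 * precision < 0 means "no precision given"; left != 0 means the '-' flag. */
char *format_str(const char *src, int precision, int width, int left)
{
    size_t  len;
    size_t  pad;
    char    *out;

    if (!src)
        src = "(null)";                     /* assumed NULL behaviour */
    len = strlen(src);
    if (precision >= 0 && (size_t)precision < len)
        len = (size_t)precision;            /* precision truncates */
    pad = (width > 0 && (size_t)width > len) ? (size_t)width - len : 0;
    out = malloc(len + pad + 1);
    if (!out)
        return (NULL);
    if (left)
    {
        memcpy(out, src, len);              /* text first, spaces after */
        memset(out + len, ' ', pad);
    }
    else
    {
        memset(out, ' ', pad);              /* spaces first, text after */
        memcpy(out + pad, src, len);
    }
    out[len + pad] = '\0';
    return (out);
}
```

With these rules, format_str("hello", 3, 6, 0) yields three spaces followed by "hel", and setting the last argument to 1 yields "hel" followed by three spaces.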

The handler for base-8 integers will be observed next.

screen shot 2017-12-03 at 12 40 59 am

The parameter 'type' stores the variable argument, which had to be matched to the correct data type before being passed to this function. The value is then converted to a string in the appropriate base, as seen on line 39. If the pound flag is set, the octal value is given a zero prefix before the rest of the code is executed. From lines 51 to 57, the logic follows that of the string handler's auxiliary function: precision and width are taken care of before writing to the standard output. Lastly, the number of characters written is returned, just as before.
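The base conversion and the pound-flag prefix can be sketched as below. to_octal is a hypothetical helper written for this README, not the project's conversion routine, and it leaves precision and width to a later step.

```c
#include <stdint.h>

/* Write the base-8 digits of value into buf and return the length.
 * buf must hold at least 24 bytes: 22 digits, optional prefix, NUL. */
int to_octal(uintmax_t value, int pound, char *buf)
{
    char    tmp[24];
    int     n = 0;
    int     len = 0;

    /* Collect digits least-significant first, three bits at a time. */
    do
    {
        tmp[n++] = (char)('0' + (value & 7));
        value >>= 3;
    } while (value);
    /* '#' adds a leading zero unless the number already starts with one
     * (which only happens when the value is 0). */
    if (pound && tmp[n - 1] != '0')
        buf[len++] = '0';
    while (n > 0)
        buf[len++] = tmp[--n];              /* reverse into the output */
    buf[len] = '\0';
    return (len);
}
```

For the value 8 this produces "10" without the pound flag and "010" with it, while 0 stays "0" in both cases, which matches what the standard printf prints for %o and %#o.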

Lastly, the handler for wide characters (though not wide strings, as those are taken care of elsewhere) will be discussed.

screen shot 2017-12-03 at 12 42 08 am

Making this project UTF-8 compatible was probably the hardest part of it all, and it required a lot of research on Wikipedia. The gist of the function above, though, is relatively simple. It follows the main logic of print_s in how the length is returned and the value is output. After an initial check for NULL, the main difference is in wc_to_s, a function that bit-shifts the variable argument into the UTF-8 byte sequence that can be printed. It took a lot of time to create something that functioned as it was supposed to, but the Wiki page was an excellent guide.
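The bit-shifting involved in such a conversion follows the standard UTF-8 layout, which can be sketched as follows. wc_to_utf8 is a hypothetical name and signature for illustration and is not the project's wc_to_s; for brevity it does not reject surrogate code points.

```c
/* Encode one code point as UTF-8 into out (up to 4 bytes).
 * Returns the number of bytes written, or 0 if cp is out of range. */
int wc_to_utf8(unsigned int cp, unsigned char out[4])
{
    if (cp < 0x80)                          /* 0xxxxxxx */
    {
        out[0] = (unsigned char)cp;
        return (1);
    }
    if (cp < 0x800)                         /* 110xxxxx 10xxxxxx */
    {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return (2);
    }
    if (cp < 0x10000)                       /* 1110xxxx 10xxxxxx 10xxxxxx */
    {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return (3);
    }
    if (cp < 0x110000)                      /* 11110xxx plus three 10xxxxxx */
    {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return (4);
    }
    return (0);                             /* beyond U+10FFFF */
}
```

As a worked case, U+20AC (the euro sign) falls in the three-byte range and encodes to the bytes E2 82 AC, which can then be handed to write like any other byte string.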

Acknowledgement

printf_42 was developed at École 42 USA in Fremont, California