HEX File to Array in C
Setting Up the GCC Compiler
I set up a C environment that was as basic as I could make it. There may be easier ways to go about this, but I wanted to use GCC to compile. To set up the environment:
- I downloaded and set up MinGW32.
- I added these includes to make the code go.
#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>
#include <windows.h>
#include <windef.h>
#include <winnt.h>
#include <winbase.h>
#include <string.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/time.h>
I used this line to build it:
$ gcc -o main main.c
As for editing, I've really grown to love Sublime Text 3. If you have issues, make sure the directory containing your files is in your PATH environment variable (I go over how to add the directory to your environment variables in this post).
Intel Hexfile to an Array Based on Data Address
To load data from an Intel HEX format file I used two main functions: open_file(), which creates a data stream (more commonly known as a file pointer) from the file I wanted to read, and hex_file_to_array(), which parses the hex file and extracts the data.
Main.c
int main(int argc, char *argv[])
{
    // If the user fails to give us two arguments, yell at him.
    if ( argc != 2 )
    {
        fprintf ( stderr, "Usage: %s \n", argv[0] );
        exit ( EXIT_FAILURE );
    }

    // Data array
    uint8_t HEX_array[32768];

    // Bytes read into array.
    int HEX_array_size;

    // File to be loaded.
    FILE *hex_file;

    // Open file using command-line info; for reading.
    // argv[0] is the program name, so the file name is argv[1].
    hex_file = open_file ( argv[1], "rb" );

    // Load the data from file
    HEX_array_size = hex_file_to_array(hex_file, HEX_array);

} // END PROGRAM
- 6: Let's check the number of arguments passed in by the user. If there is no file name, then we exit.
- 11: Declare an unsigned 8-bit array for the data. I've set its size arbitrarily, but it needs to be large enough to hold all the data extracted from the hex file.
- 17: Here we create a pointer to a file data stream.
- 20: We call the open_file function with the name of the file we wish to open and the mode "rb" (read-only, binary). It returns the opened file, which we store in our file pointer.
- 23: We pass hex_file_to_array a file pointer and a pointer to an array. This function reads the hex file, parses it, extracts the data, and places each byte into the uint8_t array based on the data's address found in the hex file. The function then returns the number of data bytes found in the hex file.
HEX File Format
Let's take a look at the raw data:
:10010000214601360121470136007EFE09D2190140
:100110002146017E17C20001FF5F16002148011928
:10012000194E79234623965778239EDA3F01B2CAA7
:100130003F0156702B5E712B722B732146013421C7
:00000001FF
Parsed HEX file:
: 11 2222 33 44444444444444444444444444444444 55 \n
- ':' = Start Code.
- 11 = Byte Count
- 2222 = Address
- 33 = Data Type
- 44 = Data
- 55 = Check Sum
- '\n' = End Code
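To make the layout above concrete, here is a minimal sketch that pulls the fixed fields out of one record held in a string. It uses sscanf rather than the character-by-character helpers described in the rest of this post, and the names hex_record_fields and parse_record_fields are made up for the illustration.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical holder for the fixed fields of one record (data bytes omitted). */
typedef struct {
    uint8_t  byte_count;   /* 11   */
    uint16_t address;      /* 2222 */
    uint8_t  record_type;  /* 33   */
    uint8_t  checksum;     /* 55   */
} hex_record_fields;

/* Returns 1 on success, 0 if the line is not a well-formed record. */
int parse_record_fields(const char *line, hex_record_fields *out)
{
    unsigned int count, address, type, checksum;

    /* Every record begins with the start code. */
    if (line[0] != ':')
        return 0;

    /* Byte count (2 hex digits), address (4), record type (2). */
    if (sscanf(line + 1, "%2x%4x%2x", &count, &address, &type) != 3)
        return 0;

    /* ':' + 8 header characters + 2 per data byte + 2 checksum characters. */
    if (strlen(line) < 1 + 8 + 2 * (size_t)count + 2)
        return 0;

    /* The checksum sits right after the data characters. */
    if (sscanf(line + 9 + 2 * count, "%2x", &checksum) != 1)
        return 0;

    out->byte_count  = (uint8_t)count;
    out->address     = (uint16_t)address;
    out->record_type = (uint8_t)type;
    out->checksum    = (uint8_t)checksum;
    return 1;
}
```

Feeding it the first line of the raw data should give a byte count of 0x10, an address of 0x0100, a record type of 0x00, and a checksum of 0x40.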
All of the information in the file is important, but we are only looking to put the Data into the array. To extract this data we are going to use three sub-routines:
- read_byte_from_file()
- Ascii2Hex()
- clear_special_char()
Open_File()
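A function like this can be very small. The following is only a sketch of what open_file() might look like, assuming it wraps fopen() and exits if the file cannot be opened; the parameter names and the error message are my assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Sketch: open file_name in the given mode, or exit with a message on failure. */
FILE *open_file(const char *file_name, const char *mode)
{
    FILE *file = fopen(file_name, mode);

    if (file == NULL)
    {
        fprintf(stderr, "Could not open file: %s\n", file_name);
        exit(EXIT_FAILURE);
    }
    return file;
}
```

Exiting on failure keeps main() short, since the caller never has to check for a NULL file pointer.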
Read_Byte_from_File()
One bit to understand about hex files is the data is actually stored as ASCII characters. When we open a file pointer to these ASCII characters, we can't just read the bytes, since they'd simply be an ASCII character representing the nibble read. To make the conversion we get a character, store it as a binary nibble A, get another character and store it as binary nibble B. We then combine nibble A and B into a single byte.
The function takes three parameters: the file pointer, a uint8_t pointer for storing the complete byte, and the total_chars_read, which allows us to track how far we are into the file.
- 6: Declaring an 8-bit unsigned integer to hold the finished byte.
- 8: Get an ASCII character from the file pointer.
- 9: Here we call the clear_special_char function to remove any '\n' and '\r' found in the hex file.
- 11: We then convert the ASCII character into a true binary nibble. The result is stored in the string. (I will cover the Ascii2Hex function below.) The above steps are repeated for nibble B.
- 18: We combine the string of nibbles into a byte.
- 26: We increase the count of ASCII characters read from the file pointer by two.
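The steps above can be sketched roughly as follows. This is my approximation, not the original listing: the return type, the use of two nibble variables instead of a string, and the small stand-ins for clear_special_char() and Ascii2Hex() (both described in the sections that follow) are assumptions.

```c
#include <stdio.h>
#include <stdint.h>

/* Stand-in: skip ':', '\n' and '\r', counting each skipped character. */
static int clear_special_char(FILE *file, int c, int *total_chars_read)
{
    while (c == ':' || c == '\n' || c == '\r')
    {
        (*total_chars_read)++;
        c = fgetc(file);
    }
    return c;
}

/* Stand-in: convert one ASCII hex digit to its 4-bit value. */
static uint8_t Ascii2Hex(int c)
{
    if (c >= '0' && c <= '9') return (uint8_t)(c - '0');
    if (c >= 'A' && c <= 'F') return (uint8_t)(c - 'A' + 10);
    if (c >= 'a' && c <= 'f') return (uint8_t)(c - 'a' + 10);
    return 0; /* not a hex digit (this includes EOF) */
}

/* Read two ASCII characters and combine them into one byte. */
uint8_t read_byte_from_file(FILE *file, uint8_t *byte, int *total_chars_read)
{
    int c;
    uint8_t nibble_a, nibble_b;

    /* Nibble A: the high four bits. */
    c = clear_special_char(file, fgetc(file), total_chars_read);
    nibble_a = Ascii2Hex(c);

    /* Nibble B: the low four bits. */
    c = clear_special_char(file, fgetc(file), total_chars_read);
    nibble_b = Ascii2Hex(c);

    /* Shift A into the upper half of the byte and OR in B. */
    *byte = (uint8_t)((nibble_a << 4) | nibble_b);

    /* Two hex characters were consumed. */
    *total_chars_read += 2;
    return *byte;
}
```

For the characters '7' and 'E', nibble A is 7 and nibble B is 14, so the combined byte is (7 << 4) | 14 = 0x7E.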
Clear_Special_Char()
The clear special character function is simply meant to remove the ':', '\n', and '\r' characters from the data stream. It simply looks through the character pulled from the data stream. If it is not a special character, it does nothing. If it is, it increments the character counter and discards the character.
Ascii2Hex()
Another fairly simple function. Here, we simply find the numeric value of the ASCII character and convert it to its binary equivalent.
This function is pretty simple if you keep in mind each character is actually an integer. For example, the if-statements could be rewritten as follows:
if (c >= 48 && c <= 57) { return (uint8_t)(c - 48); }
if (c >= 65 && c <= 70) { return (uint8_t)(c - 65 + 10); }
if (c >= 97 && c <= 102) { return (uint8_t)(c - 97 + 10); }
You can use an ASCII reference table to determine how a character read will be interpreted. For instance, 'D' or 'd' would be 68 or 100. 68 - 65 + 10 = 13, and likewise 100 - 97 + 10 = 13. We know D is hexadecimal for 13 (0 = 0, 1 = 1, 2 = 2, etc... A = 10, B = 11, C = 12, D = 13, E = 14, F = 15).
Read_Line_from_Hex_File()
This brings us to the main function,
The above code parses exactly one line of hex data from the file pointer.
- 17: We read the first byte of a line. The first character should be ':', but remember our clear_special_char() skips it, so we actually read the next two characters, '1' and '0' (green). The "10" is how many bytes of data (blue) are found on this line. Note, 10 is not a decimal number, it's hexadecimal, meaning there should be 16 bytes of data on this line.
- 20: We check if there was any data on this line. If there are zero data bytes, we return false.
- 23: Take the first byte of the data address (purple).
- 26: Take the second byte of the data address (purple).
- 29: Get the byte (red) identifying the type of information found on this line. We are only looking for data ('00'). The other types are explained well at the ole' Wiki article: Intel HEX record types.
- 32: If the record type is not data, we don't want it. We return false.
- 34: Combine the two 8-bit address bytes into one 16-bit address.
- 37: Let's get all the data found on this line and put it into the array we provided the function.
- 42: We have to keep track of how many bytes are on each line, to complete our address of the data. Therefore, we pass it back to hex_file_to_array().
- 45: I read the checksum, but I don't do anything with it. I probably should.
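Using the checksum is not much work. In the Intel HEX format the checksum is chosen so that the sum of every byte in the record (byte count, both address bytes, record type, data, and the checksum itself) has a low byte of zero. Here is a minimal sketch of that check; the function name hex_checksum_ok and the idea of collecting the record's bytes into one array first are my assumptions, not part of the code described above.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * record_bytes holds, in order: byte count, address high, address low,
 * record type, the data bytes, and finally the checksum byte.
 * Returns 1 if the checksum is consistent, 0 otherwise.
 */
int hex_checksum_ok(const uint8_t *record_bytes, size_t length)
{
    uint8_t sum = 0;

    /* Add everything up, keeping only the low eight bits. */
    for (size_t i = 0; i < length; i++)
    {
        sum = (uint8_t)(sum + record_bytes[i]);
    }

    /* A valid record sums to zero (mod 256). */
    return sum == 0;
}
```

For the end-of-file record :00000001FF the bytes are 00 00 00 01 FF, which add up to 0x100, so the low byte is zero and the record passes.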
Hex_File_Line_Count()
To properly parse the hex file we need to know how many lines are in the file. We can find this information several ways, but I counted the number of line start characters ':'.
- 8: Loops until the end of the file is reached.
- 10: Gets an ASCII character from the file pointer.
- 11: We check to see if the character we got was the line start character ':'.
- 13: This function iterates through the entire file, but we want to start pulling data from the beginning of the file, so we rewind the file to the first character.
- 14: We return the number of lines.
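A minimal sketch of a line counter along these lines is shown below. The variable names are assumptions; the approach (count ':' characters, then rewind) is the one described above.

```c
#include <stdio.h>

/* Count the records in a hex file by counting ':' start codes. */
int hex_file_line_count(FILE *file)
{
    int c;               /* int, not char, so EOF can be told apart from data */
    int line_count = 0;

    /* Walk the whole file one character at a time. */
    while ((c = fgetc(file)) != EOF)
    {
        if (c == ':')
        {
            line_count++;
        }
    }

    /* Go back to the start so parsing can begin at the first record. */
    rewind(file);
    return line_count;
}
```

Storing the result of fgetc() in an int matters here, since a char cannot reliably hold the EOF value.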
Hex_File_to_Array()
- 23: We count the number of lines in the file from which we wish to extract data.
- 31: This is the work-horse loop. We loop until we have read through all the lines we counted.
- 33: We pass read_line_from_hex() the variables we wish to fill: the hex file we want to parse (file), the buffer that holds the line data, the int array that holds the address of this line of data, and a variable to hold the number of bytes in this line. If the function got data, it returns true. Otherwise, it returns false. We store this flag to make sure we got something.
- 34: We check to see if we actually got data from our attempt.
- 39: Here, we move the line of data from the buffer into the final array.
- 41: We place the data into the array based upon the address we pulled from the line (address1 + address2) and the byte number.
- 42: Reset the buffer to nil.
- 49-64: Finally, we print out the data. The k-loop goes through each line we extracted; the j-loop goes through each byte found on the respective line.
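To show the address-based placement in isolation, here is a compact sketch that fills an array from a hex file. It is not the function walked through above: it swaps the per-character helpers for fgets and sscanf, takes an extra array_size parameter for bounds checking, and uses the made-up name hex_file_to_array_sketch.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define MAX_LINE 600 /* longest record is about 521 characters plus line ending */

/* Returns the number of data bytes stored, or -1 on malformed input or overflow. */
int hex_file_to_array_sketch(FILE *file, uint8_t *array, size_t array_size)
{
    char line[MAX_LINE];
    int total_bytes = 0;

    while (fgets(line, sizeof line, file) != NULL)
    {
        unsigned int count, address, type, value;

        /* Skip anything that does not begin with the start code. */
        if (line[0] != ':')
            continue;

        /* Byte count, 16-bit address, record type. */
        if (sscanf(line + 1, "%2x%4x%2x", &count, &address, &type) != 3)
            return -1;

        /* Only data records ('00') are stored. */
        if (type != 0x00)
            continue;

        /* The line must hold all the data characters plus the checksum. */
        if (strlen(line) < 9 + 2 * (size_t)count + 2)
            return -1;

        /* Refuse to write past the end of the caller's array. */
        if ((size_t)address + count > array_size)
            return -1;

        /* Place each byte at (line address + byte number). */
        for (unsigned int i = 0; i < count; i++)
        {
            if (sscanf(line + 9 + 2 * i, "%2x", &value) != 1)
                return -1;
            array[address + i] = (uint8_t)value;
        }
        total_bytes += (int)count;
    }
    return total_bytes;
}
```

With the sample file from earlier, the four data records cover addresses 0x0100 through 0x013F, so the first data byte (0x21) should land at index 0x0100 of the array.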
And that's it. Note, 49-64 is meant to demonstrate the data is properly extracted. These lines could be moved to another function where the data may be used as needed.