> Let's say that index.head[0] is null so I would have to set a head.
> So I should just set index.head[0] to the first WORD_T that comes in?

Yes.

> And I don't understand something about find_word. I can see that it
> could be used to delete the words... but the problem is since we are only
> using singly linked lists then when it returns the word we won't be able
> to know the previous node... so I can't set the previous node's next to
> the one after the deleted node..??? So should I delete in find_word or
> what?

You can use find_word for two purposes. First, you can call find_word to
discover whether the word is already present in the index before creating a
new node and trying to insert it. If it's found, you may want to update
its section number array using the returned pointer.

Second, you can use find_word to delete a node. If find_word finds the word
in the index and returns a pointer to that node, the exclude_words function
can use that address to find the previous node: traverse the list from its
head until you reach the node whose next pointer equals the returned
pointer. If the returned node is the head of its list, there is no previous
node, and the head pointer itself must be updated instead.
--------------------------------------------------------------------------
> I don't quite understand when we should be using malloc.
> Do we hard code the array sizes for word and sections in
> the struct? If not, how (or when) do we use malloc to do
> that?

The number of linked lists in the index is constant, as are the sizes of
the word string and the section array, so they can be hard-coded (if you're
not doing the extra credit). You'll need to use malloc whenever you create
a new WORD_T node to insert into the index. For the extra credit portions,
you also need to use malloc to allocate space for the word string and for
the nodes in the section number linked list.
--------------------------------------------------------------------------
> When I read in the words from the omit file using fgets, and storing in my
> variable called omit, then I look at the length of that variable, it seems
> to be reading in too many chars.
>
> i.e.
> the first word in the omit list is "a", and as soon as I read in
> that word and use strlen, it tells me the length is 2. And I know that
> strlen doesn't even count the new line character. So what is it counting
> and how do I get around that?

strlen doesn't count the string terminator ('\0'), but it does count the
newline ('\n') as a character. fgets stores the newline that ends each line
in the buffer, so the string read for "a" is "a\n", which has length 2. To
get around that, overwrite the trailing '\n' with '\0' before using the
word.
--------------------------------------------------------------------------
> Will the input file ever look like this?
>
> 3RTY
> In other words, will the file try to trick the program to see if it is
> only checking the first character in the line for a digit? In
> the above case, this would not be a new section according to the
> specifications, but in my program as it runs now it would. You
> said there could be up to 30 sections, but I didn't know if we
> had to check for that case since you didn't specify if there were
> any input standards.

You don't have to worry about the above case. If there's a blank line and
the next line begins with a digit, it's a new section.
--------------------------------------------------------------------------
> Also, on the paper it says "The head array in INDEX_T will contain
> one entry for each letter in the alphabet." This is really confusing
> because INDEX_T is a type of structure, not a structure. I don't see
> why you would have a structure with one field anyway unless it is just
> easier to see it being passed around to functions that way. Is there
> something wrong with accessing the array directly with a pointer?

There is no difference between the type name INDEX_T and the underlying
abstract structure that it renames. If what you really mean is that INDEX_T
is not the same as a variable that has been allocated space for the struct,
then you need to think of it more abstractly. As for structs with one
field, occasionally it's convenient to place a type within a struct to hide
its complexity in formal and actual parameters. It also results in more
readable code. For these reasons, it was done here.
--------------------------------------------------------------------------
> When we are "removing punctuation and other extraneous characters", should
> we remove apostrophes from contractions and index the contraction without
> the apostrophe?

There will be no contractions in the input file, so you don't need to worry
about them.
--------------------------------------------------------------------------
> What about hyphenated words?
>
> Can we assume that every word in a line will be separated by a space
> character?

No hyphenated words will appear in the input. And yes, every word will be
delimited by white space.
--------------------------------------------------------------------------
> Will there ever be a case where there are tricky indexes, like 0, negative
> numbers, or numbers out of order? I started coding for those cases and
> realized it would probably be a waste of time since I don't think you're
> that cruel. ;)

No tricky section numbers (0, negative, or out of order); however, the
section numbers do not necessarily start at 1.
--------------------------------------------------------------------------
> Also, so we ONLY allocate memory when we're adding new WORD_T's? And the
> program will free them automatically?

Yes and no. You only allocate memory when creating a new node to insert
into the index. You must free the memory yourself whenever you delete a
node from the index; the system reclaims memory only when your program
terminates.
--------------------------------------------------------------------------
> Is there a function like "is_digit()" that can check a char
> to see if it is a number, and if so could you give us the specs for it?

You can easily check for a digit by testing whether the character is in the
range '0' to '9'. The standard library function isdigit(), declared in
<ctype.h>, performs the same test.
--------------------------------------------------------------------------
> If you have time, can you put another test driver on the web with its
> solution?

I encourage you to create another test file yourself and verify the results
(it's quite easily done).
--------------------------------------------------------------------------
> Whenever I try to get the section number from a line I
> just read in it gives me a number like 49 for 1, 50 for 2,
> and so on.
> I am setting the section like this:
>
> section = (int) line[0];
>
> Is there a special way we need to extract the number from
> a line of chars?

You're getting the ASCII value of the character (49 is the code for '1')
instead of the int value. You should use sscanf with the %d conversion,
which also handles section numbers of more than one digit.
--------------------------------------------------------------------------
> I've heard that we need to free all the memory after
> we've printed the index, but on the handout it doesn't
> say anything about that. Could you clear this up?

I answered this question in class. You only need to free memory when you
delete a node (omit a word). Upon program termination, all of the memory
the program has allocated will be freed automatically by the system.
--------------------------------------------------------------------------
> Are there going to be any section numbers with 3 or more digits?
> Or will all section numbers have 1 or 2 digits only?

Your program should be able to handle any int as the section number.
--------------------------------------------------------------------------
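To illustrate the two answers about section numbers, here is a minimal
sketch of reading one with sscanf. The function name parse_section and its
interface are hypothetical, not part of the handout; the sketch assumes the
line has already been read into a string (for example with fgets) and that
the caller has already checked for the preceding blank line.

```c
#include <stdio.h>

/* Hypothetical helper, not from the assignment handout.
 * Returns 1 and stores the number in *section if the line begins with a
 * digit; returns 0 (leaving *section untouched) otherwise. */
int parse_section(const char *line, int *section)
{
    /* Test the first character ourselves, so that leading white space
     * or a sign is not accepted as "begins with a digit". */
    if (line[0] < '0' || line[0] > '9')
        return 0;

    /* %d converts the whole run of digits, so "12 Lists" yields 12,
     * whereas (int) line[0] would yield the character code of '1'. */
    return sscanf(line, "%d", section) == 1;
}
```

Calling parse_section("12 Lists\n", &section) stores 12 in section and
returns 1, while a line that begins with a letter or is empty returns 0.
Subtracting '0' from line[0] would also turn a character into its digit
value, but only for single-digit section numbers, which is why sscanf is
the safer choice here.
--------------------------------------------------------------------------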