Please read the remarks about programming problems for this course.


Well-constructed hash tables provide constant-time access to data. Such fast-searching data structures are essential for many real-world applications. In this assignment you will develop a hash table abstract data type (ADT) and then use your hash table ADT as the core data structure in a spell-checking program. This assignment has three parts:

  1. Implement a hash table ADT. Name your new class HashTable. Your HashTable class must provide fast access to std::string objects and support the following operations:
    • Constructor: Provide a constructor that accepts a single integer representing the size of the table and a string file name. The constructor should create a hash table of the given size and fill it with the words found in the indicated text file.

    • hash function: Include a hash function that maps a string to an index within the table.

    • insert method: Include an insert method that inserts a string into the hash table. The method should return true if it successfully inserts the word into the table; otherwise, it should return false if the word already is present in the table.

    • remove method: Include a method named remove that removes a string from the hash table. This method returns true if it removes the word; otherwise, it returns false if the word to remove is not in the hash table.

    • contains method: Provide a contains method that returns true if the hash table contains a given string; otherwise, the method returns false if the string is not present.

    • size method: Include a size method that returns the total number of strings in the hash table.

    Use a std::vector as the basis for your hash table. Do not use std::unordered_map, std::map, std::set, or any other STL container—your task is to implement a hash table from the ground up.

    You may use chaining or open addressing to resolve collisions in your hash table. Chaining is easier, and if you choose chaining, I recommend using std::vector<std::vector<std::string>> (the table is a vector where each element is a vector of strings forming the chain).

  2. Create an instance of your HashTable ADT and fill it with the strings found in the file http://www-01.sil.org/linguistics/wordlists/english/wordlist/wordsEn.txt). This text file contains 109,582 English words, each one on its own line.

  3. Write a main function containing a loop that allows the user to enter a single word each time through the loop. If the word is in the hash table, the program should print an asterisk (*) indicating that the word is acceptable. If the word is not found in the hash table, the program should attempt correction in the following ways:

    • Perhaps the user omitted a letter. Try all new words resulting from adding a single letter at all possible places in the word. If any of these new words match a word in the hash table, print the word. Note that for a word that contains n letters you can create 26(n + 1) words by adding a single letter in all possible positions (in front, in back, and anywhere in the middle).

    • Perhaps the user added an extraneous letter. Try all new words that result from removing a single letter from the word. If any of these words match a word in the hash table, print the word. Note that for a word that contains n letters you can create n new words by removing a single letter.

    • Perhaps the user mistyped one of the letters in the word. Try all new words that result from replacing an existing letter in the word with some other letter. If any of these words match a word in the hash table, print the word. Note that for a word that contains n letters you can create 25n new words.

    • Perhaps the user transposed two adjacent letters. Try all new words that result from transposing adjacent letters. If any of these words match a word in the hash table, print the new word. Note that for a word that contains n letters you can create n – 1 new words via letter transposition.

    • Perhaps the user typed two legal words but ran them together forgetting to put a space in between. Try all new word pairs that result from placing a space in all possible locations between letters in the word. If both words in any of these word pairs match words in the hash table, print the word pair—print the first word, then a space, then the second word. Note that for a string that contains n letters you can create n – 1 potential word pairs.

    When multiple corrections are possible, list each potential word on a line all by itself, with the words appearing in alphabetical order. (For word pairs, alphabetize by the first word in the pair.) If none of the five kinds of simple corrections listed above produce a valid word (that is, a word in the hash table), the program should print ??? (three question marks) to indicate it is unable to provide a suggestion.

    The user terminates the loop (and, therefore, the program) by typing a single period (.).

You probably have gathered that your program will be accessing your hash table object frequently. You must ensure that your program remains responsive. Your overall grade on this assignment is dependent on the speed of your hash table. You should experiment with different hash functions and settle on the one that provides the best performance.

When you are satisfied that your code is complete and correct, consolidate all your code into one source file named with your last name; for example, halterman.cpp. It should be possible to compile this single source file to produce the executable. Submit this source file to eclass.