In this Programming Assignment, you will be assessing your understanding of Sets, Multiway Tries (MWTs), and Ternary Search Trees (TSTs).
Complete this weekly reflection survey: https://forms.gle/bvFS7BPP6brGXn8B9
Also, if you haven't already completed the degree planning assignment, please do so as soon as possible.
The Set Abstract Data Type (ADT) is an ADT that can store unique elements. Generally, the Set ADT is defined by the following operations:
size
: Return the number of elements stored in the setinsert(x)
: Insert elementx
into this set (don't allow duplicates)remove(x)
: Remove elementx
from this set (if it exists)find(x)
: Determine whether or notx
exists in this set
We can implement the Set ADT using various data structures we have already learned about, and we can even simply wrap around existing C++ classes:
- We can use an Array List, which is implemented in the C++
vector
class - We can use a Linked List, which is implemented in the C++
list
class - We can use a Red-Black Tree (RBT), which is implemented in the C++
set
class - We can use a Hash Table, which is implemented in the C++
unordered_set
class - We can use a Multiway Trie (MWT), which is not natively implemented in C++
- We can use a Ternary Search Tree (TST), which is not natively implemented in C++
Task 1a: Edit ArrayListSet.cpp
(10 points)
In this repository, there is a file called ArrayListSet.cpp
that contains initial steps towards implementing the Set ADT using an Array List via the C++ vector
class. Function headers (with usage details) are included in Set.h
. Your task is to fill in the missing code.
Task 1b: Edit LinkedListSet.cpp
(10 points)
In this repository, there is a file called LinkedListSet.cpp
that contains initial steps towards implementing the Set ADT using a Linked List via the C++ list
class. Function headers (with usage details) are included in Set.h
. Your task is to fill in the missing code.
Task 1c: Edit RedBlackTreeSet.cpp
(10 points)
In this repository, there is a file called RedBlackTreeSet.cpp
that contains initial steps towards implementing the Set ADT using a Red-Black Tree via the C++ set
class. Function headers (with usage details) are included in Set.h
. Your task is to fill in the missing code.
Task 1d: Edit HashTableSet.cpp
(10 points)
In this repository, there is a file called HashTableSet.cpp
that contains initial steps towards implementing the Set ADT using a Hash Table via the C++ unordered_set
class. Function headers (with usage details) are included in Set.h
. Your task is to fill in the missing code.
Imagine we want to insert n elements of length k into our set. Array Lists, Linked Lists, and Red-Black Trees all scale as a function of n in the average and worst cases, and although Hash Tables are O(k) in the average case, they scale as a function of n in the worst case, and they have no inherent order. Instead, if we implement the Set ADT using a Multiway Trie, our find, insert, and remove operations will all be O(k) in the worst case, meaning our data structure's runtime will not worsen as n increases, and we can iterate over our elements in sorted order.
Task: Edit MultiwayTrieSet.cpp
(30 points)
In this repository, there is a file called MultiwayTrieSet.cpp
that contains initial steps towards implementing the Set ADT using a Multiway Trie. Function headers (with usage details) are included in Set.h
. Your task is to fill in the missing code.
We have provided a tester program, SetTest
, that will help you test your code. You can compile it as follows:
g++ -Wall -pedantic -g -O0 -std=c++11 -o SetTest SetTest.cpp ArrayListSet.cpp HashTableSet.cpp LinkedListSet.cpp MultiwayTrieSet.cpp RedBlackTreeSet.cpp
We have also provided a Makefile
to make compilation easier. Here's an example of how it should look like when it's compiled and run from the command line:
$ make clean
rm -f SetTest *.o
$ make
g++ -Wall -pedantic -g -O0 -std=c++11 -o SetTest SetTest.cpp ArrayListSet.cpp HashTableSet.cpp LinkedListSet.cpp MultiwayTrieSet.cpp RedBlackTreeSet.cpp
$ ./SetTest
If nothing is printed by SetTest
, then your functions in Parts 1 and 2 are correct. Otherwise, SetTest
will print out which classes are incorrect, such as the following:
$ ./SetTest
ArrayListSet failed
HashTableSet failed
LinkedListSet failed
MultiwayTrieSet failed
RedBlackTreeSet failed
To receive these 10 points, your implementations for both Part 1 and Part 2 must not have any memory leaks. You can use valgrind
to check for memory leaks. For example, you can run it as follows:
valgrind --tool=memcheck --leak-check=yes ./SetTest
If it gives you a report like the following, you do not have memory leaks and are good to go (the important part is All heap blocks were freed -- no leaks are possible
):
==1482== Memcheck, a memory error detector
==1482== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1482== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==1482== Command: ./SetTest
==1482==
==1482== error calling PR_SET_PTRACER, vgdb might block
==1482==
==1482== HEAP SUMMARY:
==1482== in use at exit: 0 bytes in 0 blocks
==1482== total heap usage: 29,796 allocs, 29,796 frees, 1,455,627 bytes allocated
==1482==
==1482== All heap blocks were freed -- no leaks are possible
==1482==
==1482== For counts of detected and suppressed errors, rerun with: -v
==1482== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
If you do have memory leaks, the report will look something like the following:
==1516== Memcheck, a memory error detector
==1516== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1516== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==1516== Command: ./SetTest
==1516==
==1516== error calling PR_SET_PTRACER, vgdb might block
==1516==
==1516== HEAP SUMMARY:
==1516== in use at exit: 941,480 bytes in 24,559 blocks
==1516== total heap usage: 29,822 allocs, 5,263 frees, 1,457,307 bytes allocated
==1516==
==1516== 941,480 (64 direct, 941,416 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 4
==1516== at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1516== by 0x112912: MultiwayTrieSet::MultiwayTrieSet() (MultiwayTrieSet.cpp:7)
==1516== by 0x10A134: main (SetTest.cpp:22)
==1516==
==1516== LEAK SUMMARY:
==1516== definitely lost: 64 bytes in 1 blocks
==1516== indirectly lost: 941,416 bytes in 24,558 blocks
==1516== possibly lost: 0 bytes in 0 blocks
==1516== still reachable: 0 bytes in 0 blocks
==1516== suppressed: 0 bytes in 0 blocks
==1516==
==1516== For counts of detected and suppressed errors, rerun with: -v
==1516== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Unfortunately, when a MWT is sparse, it can waste a lot of space. The Ternary Search Tree (TST) is essentially the combination of a MWT and a BST, and it can be useful for contexts in which we would want to use a MWT (e.g. spell-checking and autocompletion) but don't have enough memory.
Imagine we have the following TST:
Create a file called 3a.txt
(case-sensitive) in the root directory of this repository (i.e., in the same folder as README.md
) containing the list of strings (1 per line) that are contained in this TST (the order of the lines doesn't matter). For example, if it contained the strings niema
and moshiri
, your file would contain the following:
niema
moshiri
The following would also be equally correct:
moshiri
niema
Imagine we have the following TST:
Create a file called 3b.txt
(case-sensitive) in the root directory of this repository (i.e., in the same folder as README.md
) containing the list of strings (1 per line) that are contained in this TST (the order of the lines doesn't matter). For example, if it contained the strings niema
and moshiri
, your file would contain the following:
niema
moshiri
The following would also be equally correct:
moshiri
niema
Imagine we have the following TST:
Create a file called 3c.txt
(case-sensitive) in the root directory of this repository (i.e., in the same folder as README.md
) containing the list of strings (1 per line) that are contained in this TST (the order of the lines doesn't matter). For example, if it contained the strings niema
and moshiri
, your file would contain the following:
niema
moshiri
The following would also be equally correct:
moshiri
niema
Imagine we have the following TST:
Create a file called 3d.txt
(case-sensitive) in the root directory of this repository (i.e., in the same folder as README.md
) containing the list of strings (1 per line) that are contained in this TST (the order of the lines doesn't matter). For example, if it contained the strings niema
and moshiri
, your file would contain the following:
niema
moshiri
The following would also be equally correct:
moshiri
niema
This Programming Assignment (PA) must be completed 100% independently! You may only discuss the PA with the Tutors, TAs, and Instructors. You are free to use resources from the internet, but you are not allowed to blatantly copy-and-paste code. If you ever find yourself highlighting a code snippet, copying, and pasting into your PA, you are likely violating the Academic Integrity Policy. If you have any concerns or doubts regarding what you are about to do, please be sure to post on Piazza first to ask us if it is okay.
- Part 0: Weekly Reflection Survey
- 5 points for submitting the weekly reflection survey (extra credit)
- 0 points for not submitting the weekly reflection survey
- Part 1: Implementing the Set ADT
- 10 points for a correct
ArrayListSet
implementation (0 points for incorrect) - 10 points for a correct
LinkedListSet
implementation (0 points for incorrect) - 10 points for a correct
RedBlackTreeSet
implementation (0 points for incorrect) - 10 points for a correct
HashTableSet
implementation (0 points for incorrect)
- 10 points for a correct
- Part 2: Implementing a Multiway Trie (MWT)
- 30 points for a correct
MultiwayTrieSet
implementation (0 points for incorrect)
- 30 points for a correct
- Parts 1-2: Checking for Memory Leaks
- 10 points if Parts 1 and 2 have no memory leaks
- 0 points if either Part 1 or Part 2 has any memory leaks
- Part 3: Ternary Search Trees (TSTs)
- 5 points for a correct
3a.txt
(0 points for incorrect) - 5 points for a correct
3b.txt
(0 points for incorrect) - 5 points for a correct
3c.txt
(0 points for incorrect) - 5 points for a correct
3d.txt
(0 points for incorrect)
- 5 points for a correct