/PA4-Starter

PA4 Starter

Primary LanguageC++

Programming Assignment 4: Sets, Multiway Tries (MWTs), and Ternary Search Trees (TSTs)

In this Programming Assignment, you will be assessing your understanding of Sets, Multiway Tries (MWTs), and Ternary Search Trees (TSTs).

Part 0: Weekly Reflection Survey (5 points)

Complete this weekly reflection survey: https://forms.gle/bvFS7BPP6brGXn8B9

Also, if you haven't already completed the degree planning assignment, please do so as soon as possible.

Part 1: Implementing the Set ADT (40 points)

The Set Abstract Data Type (ADT) is an ADT that can store unique elements. Generally, the Set ADT is defined by the following operations:

  • size: Return the number of elements stored in the set
  • insert(x): Insert element x into this set (don't allow duplicates)
  • remove(x): Remove element x from this set (if it exists)
  • find(x): Determine whether or not x exists in this set

We can implement the Set ADT using various data structures we have already learned about, and we can even simply wrap around existing C++ classes:

Task 1a: Edit ArrayListSet.cpp (10 points)

In this repository, there is a file called ArrayListSet.cpp that contains initial steps towards implementing the Set ADT using an Array List via the C++ vector class. Function headers (with usage details) are included in Set.h. Your task is to fill in the missing code.

Task 1b: Edit LinkedListSet.cpp (10 points)

In this repository, there is a file called LinkedListSet.cpp that contains initial steps towards implementing the Set ADT using a Linked List via the C++ list class. Function headers (with usage details) are included in Set.h. Your task is to fill in the missing code.

Task 1c: Edit RedBlackTreeSet.cpp (10 points)

In this repository, there is a file called RedBlackTreeSet.cpp that contains initial steps towards implementing the Set ADT using a Red-Black Tree via the C++ set class. Function headers (with usage details) are included in Set.h. Your task is to fill in the missing code.

Task 1d: Edit HashTableSet.cpp (10 points)

In this repository, there is a file called HashTableSet.cpp that contains initial steps towards implementing the Set ADT using a Hash Table via the C++ unordered_set class. Function headers (with usage details) are included in Set.h. Your task is to fill in the missing code.

Part 2: Implementing a Multiway Trie (MWT) (30 points)

Imagine we want to insert n elements of length k into our set. Array Lists, Linked Lists, and Red-Black Trees all scale as a function of n in the average and worst cases, and although Hash Tables are O(k) in the average case, they scale as a function of n in the worst case, and they have no inherent order. Instead, if we implement the Set ADT using a Multiway Trie, our find, insert, and remove operations will all be O(k) in the worst case, meaning our data structure's runtime will not worsen as n increases, and we can iterate over our elements in sorted order.

Task: Edit MultiwayTrieSet.cpp (30 points)

In this repository, there is a file called MultiwayTrieSet.cpp that contains initial steps towards implementing the Set ADT using a Multiway Trie. Function headers (with usage details) are included in Set.h. Your task is to fill in the missing code.

Testing Parts 1 and 2

We have provided a tester program, SetTest, that will help you test your code. You can compile it as follows:

g++ -Wall -pedantic -g -O0 -std=c++11 -o SetTest SetTest.cpp ArrayListSet.cpp HashTableSet.cpp LinkedListSet.cpp MultiwayTrieSet.cpp RedBlackTreeSet.cpp

We have also provided a Makefile to make compilation easier. Here's an example of how it should look like when it's compiled and run from the command line:

$ make clean
rm -f SetTest *.o
$ make
g++ -Wall -pedantic -g -O0 -std=c++11 -o SetTest SetTest.cpp ArrayListSet.cpp HashTableSet.cpp LinkedListSet.cpp MultiwayTrieSet.cpp RedBlackTreeSet.cpp
$ ./SetTest

If nothing is printed by SetTest, then your functions in Parts 1 and 2 are correct. Otherwise, SetTest will print out which classes are incorrect, such as the following:

$ ./SetTest
ArrayListSet failed
HashTableSet failed
LinkedListSet failed
MultiwayTrieSet failed
RedBlackTreeSet failed

Parts 1-2: Checking for Memory Leaks (10 points)

To receive these 10 points, your implementations for both Part 1 and Part 2 must not have any memory leaks. You can use valgrind to check for memory leaks. For example, you can run it as follows:

valgrind --tool=memcheck --leak-check=yes ./SetTest

If it gives you a report like the following, you do not have memory leaks and are good to go (the important part is All heap blocks were freed -- no leaks are possible):

==1482== Memcheck, a memory error detector
==1482== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1482== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==1482== Command: ./SetTest
==1482==
==1482== error calling PR_SET_PTRACER, vgdb might block
==1482==
==1482== HEAP SUMMARY:
==1482==     in use at exit: 0 bytes in 0 blocks
==1482==   total heap usage: 29,796 allocs, 29,796 frees, 1,455,627 bytes allocated
==1482==
==1482== All heap blocks were freed -- no leaks are possible
==1482==
==1482== For counts of detected and suppressed errors, rerun with: -v
==1482== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If you do have memory leaks, the report will look something like the following:

==1516== Memcheck, a memory error detector
==1516== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1516== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==1516== Command: ./SetTest
==1516==
==1516== error calling PR_SET_PTRACER, vgdb might block
==1516==
==1516== HEAP SUMMARY:
==1516==     in use at exit: 941,480 bytes in 24,559 blocks
==1516==   total heap usage: 29,822 allocs, 5,263 frees, 1,457,307 bytes allocated
==1516==
==1516== 941,480 (64 direct, 941,416 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 4
==1516==    at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1516==    by 0x112912: MultiwayTrieSet::MultiwayTrieSet() (MultiwayTrieSet.cpp:7)
==1516==    by 0x10A134: main (SetTest.cpp:22)
==1516==
==1516== LEAK SUMMARY:
==1516==    definitely lost: 64 bytes in 1 blocks
==1516==    indirectly lost: 941,416 bytes in 24,558 blocks
==1516==      possibly lost: 0 bytes in 0 blocks
==1516==    still reachable: 0 bytes in 0 blocks
==1516==         suppressed: 0 bytes in 0 blocks
==1516==
==1516== For counts of detected and suppressed errors, rerun with: -v
==1516== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

Part 3: Ternary Search Trees (TSTs) (20 points)

Unfortunately, when a MWT is sparse, it can waste a lot of space. The Ternary Search Tree (TST) is essentially the combination of a MWT and a BST, and it can be useful for contexts in which we would want to use a MWT (e.g. spell-checking and autocompletion) but don't have enough memory.

Task 3a: Create 3a.txt (5 points)

Imagine we have the following TST:

Create a file called 3a.txt (case-sensitive) in the root directory of this repository (i.e., in the same folder as README.md) containing the list of strings (1 per line) that are contained in this TST (the order of the lines doesn't matter). For example, if it contained the strings niema and moshiri, your file would contain the following:

niema
moshiri

The following would also be equally correct:

moshiri
niema

Task 3b: Create 3b.txt (5 points)

Imagine we have the following TST:

Create a file called 3b.txt (case-sensitive) in the root directory of this repository (i.e., in the same folder as README.md) containing the list of strings (1 per line) that are contained in this TST (the order of the lines doesn't matter). For example, if it contained the strings niema and moshiri, your file would contain the following:

niema
moshiri

The following would also be equally correct:

moshiri
niema

Task 3c: Create 3c.txt (5 points)

Imagine we have the following TST:

Create a file called 3c.txt (case-sensitive) in the root directory of this repository (i.e., in the same folder as README.md) containing the list of strings (1 per line) that are contained in this TST (the order of the lines doesn't matter). For example, if it contained the strings niema and moshiri, your file would contain the following:

niema
moshiri

The following would also be equally correct:

moshiri
niema

Task 3d: Create 3d.txt (5 points)

Imagine we have the following TST:

Create a file called 3d.txt (case-sensitive) in the root directory of this repository (i.e., in the same folder as README.md) containing the list of strings (1 per line) that are contained in this TST (the order of the lines doesn't matter). For example, if it contained the strings niema and moshiri, your file would contain the following:

niema
moshiri

The following would also be equally correct:

moshiri
niema

Academic Integrity

This Programming Assignment (PA) must be completed 100% independently! You may only discuss the PA with the Tutors, TAs, and Instructors. You are free to use resources from the internet, but you are not allowed to blatantly copy-and-paste code. If you ever find yourself highlighting a code snippet, copying, and pasting into your PA, you are likely violating the Academic Integrity Policy. If you have any concerns or doubts regarding what you are about to do, please be sure to post on Piazza first to ask us if it is okay.

Grading (100 points total)

  • Part 0: Weekly Reflection Survey
    • 5 points for submitting the weekly reflection survey (extra credit)
    • 0 points for not submitting the weekly reflection survey
  • Part 1: Implementing the Set ADT
    • 10 points for a correct ArrayListSet implementation (0 points for incorrect)
    • 10 points for a correct LinkedListSet implementation (0 points for incorrect)
    • 10 points for a correct RedBlackTreeSet implementation (0 points for incorrect)
    • 10 points for a correct HashTableSet implementation (0 points for incorrect)
  • Part 2: Implementing a Multiway Trie (MWT)
    • 30 points for a correct MultiwayTrieSet implementation (0 points for incorrect)
  • Parts 1-2: Checking for Memory Leaks
    • 10 points if Parts 1 and 2 have no memory leaks
    • 0 points if either Part 1 or Part 2 has any memory leaks
  • Part 3: Ternary Search Trees (TSTs)
    • 5 points for a correct 3a.txt (0 points for incorrect)
    • 5 points for a correct 3b.txt (0 points for incorrect)
    • 5 points for a correct 3c.txt (0 points for incorrect)
    • 5 points for a correct 3d.txt (0 points for incorrect)