/C-Macro-Collections

Header only, macro generated, generic and type-safe Collections in C

Primary LanguageCMIT LicenseMIT

C Macro Collections

C Macro Collections Logo

Header only, macro generated, generic and type-safe Collections in C.

LinkToRepo LinkToDocs

License Version travis-ci codecov

Table of Contents

  • Project Structure
  • Available Collections
  • Overall To-Do
  • Design Decisions
  • What to use
  • How to use

Project Structure

  • benchmarks - Where all benchmarks are hosted
  • examples - Examples separated by collections
  • src - All headers part of the C Macro Collections Library
    • cmc - The main C Macro Collections Library
    • dev - The main C Macro Collections Library for development (containing logging)
    • sac - Statically Allocated Collections
    • utl - Utility like ForEach macros, logging, etc
    • macro_collections.h - Master header containing all collections and utilities
  • tests - Where all tests are hosted

Available Collections

  • Linear Collections
    • List, LinkedList, Deque, Stack, Queue
  • Sets
    • HashSet, TreeSet, MultiSet
  • Maps
    • HashMap, TreeMap, MultiMap
  • Heaps
    • Heap, IntervalHeap
  • Coming Soon
    • BidiMap, SortedList
Collection Abstract Data Type Data Structure Details
BidiMap
bidimap.h
Bidirectional Map Two Hashtables A bijection between two sets of unique keys and unique values K <-> V using two hashtables
Deque
deque.h
Double-Ended Queue Dynamic Circular Array A circular array that allows push and pop on both ends (only) at constant time
HashMap
hashmap.h
Map Hashtable A unique set of keys associated with a value K -> V with constant time look up using a hashtable with open addressing and robin hood hashing
HashSet
hashset.h
Set Hashtable A unique set of values with constant time look up using a hashtable with open addressing and robin hood hashing
Heap
heap.h
Priority Queue Dynamic Array A binary heap as a dynamic array as an implicit data structure
IntervalHeap
intervalheap.h
Double-Ended Priority Queue Custom Dynamic Array A dynamic array of nodes, each hosting one value from the MinHeap and one from the MaxHeap
LinkedList
linkedlist.h
List Doubly-Linked List A default doubly-linked list
List
list.h
List Dynamic Array A dynamic array with push and pop anywhere on the array
MultiMap
multimap.h
Multimap Custom Hashtable A mapping of multiple keys with one node per key using a hashtable with separate chaining
Multiset
multiset.h
Multiset Hashtable A mapping of a value and its multiplicity using a hashtable with open addressing and robin hood hashing
Queue
queue.h
FIFO Dynamic Circular Array A queue using a circular array with enqueue at the back index and dequeue at the front index
SortedList
sortedlist.h
Sorted List Sorted Dynamic Array A lazily sorted dynamic array that is sorted only when necessary
Stack
stack.h
FILO Dynamic Array A stack with push and pop at the end of a dynamic array
TreeMap
treemap.h
Sorted Map AVL Tree A unique set of keys associated with a value K -> V using an AVL tree with log(n) look up and sorted iteration
TreeSet
treeset.h
Sorted Set AVL Tree A unique set of keys using an AVL tree with log(n) look up and sorted iteration

Overall To-Do

In the long term, these are the steps left for the completion of this library:

  • Complete the implementation of all the functions in the scope of the TODO file for the main collections;
  • Reorganize and complete all tests for the cmc collections;
  • Make an exact copy of all collections to dev with many logging utility, for them to be used under development;
  • Port all of these collections to be statically allocated and be part of the sac library;
  • Complete all tests for sac.

Design Decisions

Stack vs Heap Allocation

Currently all collections need to be allocated on the heap. Iterators have both options but it is encouraged to allocate them on the stack since they don't require dynamic memory.

Some collections overlap others in terms of functionality

Yes, you can use a Deque as a Queue or a List as a Stack without any major cost, but the idea is to have the least amount of code to fulfill the needs of a collection.

Take for example the Stack. It is simple, small and doesn't have many functions. If you generate a List to be used (only) as a Stack (which is one of the bulkiest collections) you'll end up with a lot of code generated and compiled for nothing.

The Deque versus Queue situation is a little less problematic, but again, you need to be careful when generating a lot of code as compilation times might go up to 15 seconds even with modern ultra-fast compilers.

Another example is using a HashMap/TreeMap as a HashSet/TreeSet (with a dummy value that is never used), but I just think that this is a bad thing to do and you would be wasting some memory. Also, the sets generate a lot of code related to set theory, whereas maps don't.

But what about the LinkedList ?

You can use them as Stacks, Queues and Deques, but with modern memory hierarchy models, array-based data structures have a significantly faster runtime due to caching, so I didn't bother to have specific implementations of those aforementioned collections.

You can't structurally modify a collection when iterating over it

Modifying a collection will possibly invalidate all iterators currently initialized by it. Currently, the only collection that allows this is the LinkedList (using the node-based functions, not the iterator).

What to use

The following table shows how each collection is implemented and how well they do when using as common abstract data types.

  • Ideal - The collection implements correctly the abstract data type;
  • Not Ideal - The implementation is fulfilled but some functionalities are either not part of the ADT or not present;
  • Bad - It can be done, but its a bad idea.

DataStructuresDiagram

GoodColor AverageColor BadColor

How to use

To generate the collection, all you need to do is to include the necessary header files. You can include the containers you want to use individually or you can include the master header, macro_collections.h, that comes with the entire C-Macro-Collections library.

Macros

Note here that SNAME represents the uppercase name of the collection.

Every collection is separated by two parts:

  • HEADER - Contains all struct definitions and function definitions.
  • SOURCE - Contains all function implementations.

All collections have three main macros:

  • CMC_GENERATE_SNAME - Generates CMC_GENERATE_SNAME_HEADER and CMC_GENERATE_SNAME_SOURCE.

Or you can generate each part individually:

  • CMC_GENERATE_SNAME_HEADER - Generates all struct definitions and function definitions.
  • CMC_GENERATE_SNAME_SOURCE - Generates all function implementations.

Parameters

When including macro_collections.h in your source code you gain access to a macro called CMC_COLLECTION_GENERATE with the following parameters:

  • C - Container name in uppercase (LIST, LINKEDLIST, STACK, QUEUE, DEQUE, HEAP, TREESET, TREEMAP, HASHSET, HASHMAP).
  • PFX - Functions prefix or namespace.
  • SNAME - Structure name (typedef struct SNAME##_s SNAME).
  • K - Key type. Only used in HASHMAP, TREEMAP, MULTIMAP and BIDIMAP; ignored by others.
  • V - Value type. Primary type for most collections, or value to be mapped by HASHMAP, TREEMAP, MULTIMAP and BIDIMAP.

In fact, all macros follow this pattern. So whenever you see a macro with a bunch of parameters and you don't know what they are, you can check out the above list.

For Each

There are 2 for-each macros:

  • CMC_FOR_EACH - Starts at the start of the collection towards the end.
  • CMC_FOR_EACH_REV - Starts at the end of the collection towards the start.

When including foreach.h in your source code you gain access to all for-each macros with the following parameters:

  • PFX - Functions prefix or namespace.
  • SNAME - Structure name.
  • TARGET - The variable name of the collection you wish to iterate over.
  • BODY - Block of code.

Inside body you will have access to the iterator variable iter. With it you can use functions to access the key, value or index from the iterator. Checkout the documentation for more details.


Check out some code reviews that covers some parts the project:

About Link
Unit Test ./utl/test.h Code Review
Interval Heap ./cmc/intervalheap.h Code Review
Hash Set ./cmc/hashset.h Code Review
Linked List ./cmc/linkedlist.h Code Review
Others Code Review