Source Code Search Engine
The task is to implement a simple search engine on top of source code repositories.
Description
Your program receives a path to a directory that contains source code. It will traverse the directory recursively and index the text files. Your program will provide users the ability to query the indexed files.
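The recursive traversal could be sketched as below. The task does not define what counts as a text file, so the check here is an assumption: a file is treated as text if its first bytes contain no NUL byte and decode as UTF-8. The names `is_text_file` and `iter_text_files` are hypothetical.

```python
import codecs
import os


def is_text_file(path, sample_size=4096):
    """Heuristic (an assumption, not part of the task): the file is text
    if its first bytes contain no NUL byte and decode as UTF-8."""
    try:
        with open(path, "rb") as f:
            sample = f.read(sample_size)
    except OSError:
        return False
    if b"\x00" in sample:
        return False
    try:
        # The incremental decoder tolerates a multi-byte character
        # that is cut off at the end of the sample.
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def iter_text_files(root):
    """Yield paths of text files under root, relative to root."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if is_text_file(path):
                yield os.path.relpath(path, root)
```

Yielding paths relative to the indexed directory matches the way files are listed in the expected outputs of the test files.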
Reverse index
Imagine you have 2 documents, doc1 and doc2. doc1 contains the words a and b. doc2 contains the words a and c. Let's say that doc1 has number 1 and doc2 has number 2. A reverse index is a data structure that maps each word to the numbers of the documents that contain it, and looks like this:
a - 1,2
b - 1
c - 2
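As a minimal sketch, the structure above can be built with a dictionary that maps each word to a set of document numbers. Two things here are assumptions rather than requirements of the task: a word is taken to be a run of letters, digits, and underscores, and matching is case-sensitive. The name `build_reverse_index` is hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of letters, digits, and underscores.
WORD_RE = re.compile(r"\w+")


def build_reverse_index(documents):
    """documents maps a document number to the document's text.
    Returns a mapping from each word to the set of document numbers."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in WORD_RE.findall(text):
            index[word].add(doc_id)
    return index


# doc1 (number 1) contains a and b, doc2 (number 2) contains a and c.
index = build_reverse_index({1: "a b", 2: "a c"})
for word in sorted(index):
    print(word, "-", ",".join(str(n) for n in sorted(index[word])))
# prints:
# a - 1,2
# b - 1
# c - 2
```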
Supported commands
- index ${abs_path_to_dir} - creates index directory for the directory
- search ${query} - searches the reverse index and returns the files that match the ${query}
Query format
- +word - document must contain this word
- -word - document cannot contain this word
- word - optional words - if provided then document must contain at least one of these words
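One way to read this format is to split the query on whitespace and sort each token into one of three groups by its prefix. The sketch below assumes tokens are separated by whitespace and that a lone `+` or `-` is ignored; neither detail is specified by the task. The name `parse_query` is hypothetical.

```python
def parse_query(query):
    """Split a query into (required, excluded, optional) word lists."""
    required, excluded, optional = [], [], []
    for token in query.split():
        if token in ("+", "-"):
            continue  # assumption: a bare prefix carries no word, skip it
        if token.startswith("+"):
            required.append(token[1:])
        elif token.startswith("-"):
            excluded.append(token[1:])
        else:
            optional.append(token)
    return required, excluded, optional


print(parse_query("+main -int"))
# prints: (['main'], ['int'], [])
```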
Examples
- +class - finds all files that contain word class
- +main -int - finds all files that contain word main but do not contain word int
- +String equals trim - finds all files that contain word String and either equals or trim or both
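The matching rules behind these examples can be expressed as set operations on a reverse index. The sketch below assumes the query has already been split into required, excluded, and optional words, and that the index is a dictionary from word to a set of document numbers. The name `match_documents` is hypothetical, and how a query with only excluded words behaves is an assumption (here it returns every document that lacks those words).

```python
def match_documents(index, all_docs, required, excluded, optional):
    """Return the set of documents that satisfy the query."""
    result = set(all_docs)
    for word in required:
        result &= index.get(word, set())  # must contain every +word
    for word in excluded:
        result -= index.get(word, set())  # must contain no -word
    if optional:
        # Must contain at least one of the optional words.
        any_optional = set()
        for word in optional:
            any_optional |= index.get(word, set())
        result &= any_optional
    return result


# The two-document index from the "Reverse index" section.
index = {"a": {1, 2}, "b": {1}, "c": {2}}
print(match_documents(index, {1, 2}, ["a"], ["b"], []))
# prints: {2}
```

Intersecting for required words, subtracting for excluded words, and intersecting with the union of optional words keeps every step a direct lookup in the index, with no need to re-read the files.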
Requirements
- commands are read from standard input ("CLI")
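A command loop along these lines might look like the sketch below. It takes the command name to be the first word of each line and passes the rest of the line to a handler. The name `run_cli`, the handler dictionary, and the "Unknown command" message are assumptions for illustration; the task does not say how unknown input is handled.

```python
import io
import sys


def run_cli(lines, handlers, out=None):
    """Read one command per line and print each handler's result."""
    out = out or sys.stdout
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        command, _, argument = line.partition(" ")
        handler = handlers.get(command)
        if handler is None:
            # Assumption: the task defines no output for unknown commands.
            print(f"Unknown command: {command}", file=out)
        else:
            print(handler(argument.strip()), file=out)


# Stub handlers stand in for the real index and search logic.
handlers = {
    "index": lambda path: f"(would index {path})",
    "search": lambda query: f"(would search for {query})",
}

# In the real program the first argument would be sys.stdin.
run_cli(io.StringIO("index /abs/path/to/repo\nsearch +main -int\n"), handlers)
```

Passing the input as an iterable of lines keeps the loop testable without touching the real standard input.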
Output format
- index:
  - Successfully created index for ${directory_name}.
- search:
  - Found ${x} documents for ${query}. List of files:
    * file1
    * file2
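The two messages could be produced by small formatting helpers such as the ones sketched here. Two details are inferred from the expected outputs in the Test files section rather than stated in the format itself: the query is wrapped in double quotes, and the files are listed in sorted order. The function names are hypothetical.

```python
import os


def format_index_result(path_to_dir):
    """Message for the index command; uses the last path component."""
    directory_name = os.path.basename(os.path.normpath(path_to_dir))
    return f"Successfully created index for {directory_name}."


def format_search_result(query, files):
    """Message for the search command, one '* file' line per match."""
    lines = [f'Found {len(files)} documents for "{query}". List of files:']
    lines += [f"* {name}" for name in sorted(files)]  # assumption: sorted
    return "\n".join(lines)


print(format_index_result("tests/SimpleTest"))
print(format_search_result("+a -b", ["doc2.txt"]))
# prints:
# Successfully created index for SimpleTest.
# Found 1 documents for "+a -b". List of files:
# * doc2.txt
```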
Test files
Simple Test
Command: index tests/SimpleTest
Expected output: Successfully created index for SimpleTest.
Command: search +a
Expected output:
Found 2 documents for "+a". List of files:
* doc1.txt
* doc2.txt
Command: search +a -b
Expected output:
Found 1 documents for "+a -b". List of files:
* doc2.txt
Command: search b c
Expected output:
Found 2 documents for "b c". List of files:
* doc1.txt
* doc2.txt
Complex Test
Command: index tests/ComplexTest
Expected output: Successfully created index for ComplexTest.
Command: search +MyFileUtils
Expected output:
Found 2 documents for "+MyFileUtils". List of files:
* AntExercise/src/cz/cuni/mff/fileutils/MyFileUtils.java
* README.md
Good luck :)