Large Scale Programming & Testing Project 1 (Search Engine Architecture)

LSPT Project: Team Z

Members:

  • Lee Cattarin
  • Charles Schmitter
  • Zachary Wimer

Unit Tests

To run the unit tests, change into the test directory (cd src/Unit-Tests), then run:

	python3 -m unittest discover
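As a rough illustration of what the discover command collects: by default, unittest discovery picks up files matching test*.py in the current directory. The file name, the collapse_whitespace helper, and the test cases below are hypothetical and are not taken from this repository.

```python
# Hypothetical file: src/Unit-Tests/test_example.py
import unittest


def collapse_whitespace(text):
    """Hypothetical helper: squeeze runs of whitespace into single spaces."""
    return " ".join(text.split())


class CollapseWhitespaceTest(unittest.TestCase):
    def test_collapses_runs(self):
        # Mixed spaces, newlines, and tabs become single spaces.
        self.assertEqual(collapse_whitespace("a  b\n\tc"), "a b c")

    def test_whitespace_only(self):
        # A string of only whitespace collapses to the empty string.
        self.assertEqual(collapse_whitespace("   "), "")
```

Any method whose name starts with test_ inside a unittest.TestCase subclass is run automatically, so no explicit registration is needed.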

Design preferences:

  1. Indexing: Map out where in each document any given word can be found. Note existing fields such as title and author.
  2. Ranking: Use inlink analysis and index date (with bigram and trigram analysis) to generate a ranked list of documents for a search query.
  3. Text transformation: Strip out HTML tags and clean the text to prepare it for indexing. Possibly extract and organize metadata such as title, author, date, and tags.
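The text transformation and indexing steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual implementation: the names TextExtractor, transform, and positional_index are assumptions, only the title field is extracted, and ranking is not covered.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text and the <title> field."""

    SKIPPED = {"script", "style"}  # content that should never be indexed

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._stack = []  # currently open <title>/<script>/<style> tags

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED or tag == "title":
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        current = self._stack[-1] if self._stack else None
        if current in self.SKIPPED:
            return
        if current == "title":
            self.title += data
        else:
            self.chunks.append(data)


def transform(html):
    """Strip tags and return the title plus whitespace-normalized text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining chunks with a space keeps words in adjacent elements apart.
    text = " ".join(" ".join(parser.chunks).split())
    return {"title": parser.title.strip(), "text": text}


def positional_index(docs):
    """Map each word to {doc_id: [word positions]} for a dict of texts."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            index[word][doc_id].append(position)
    return index
```

For example, calling transform on a page and passing {"doc1": result["text"]} to positional_index yields, for each word, the positions at which it occurs in doc1. A fuller version would also record which field (title, author, and so on) each occurrence came from.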

Progress

  • Initial presentation on design, components, test plan, and quality standards
  • Initial design document covering language, input and output, program flow and components, test plans, quality metrics, and coding standards