BUAA-DataStructures-2023-Project

Beihang University, Data Structure-2023 Project for Postgraduate International and Chinese Students.

BUAA Data Structure Project 2023, 3 credits.

This is a comprehensive performance testing assignment that involves large amounts of test data and assesses students' mastery of data structures and algorithms. Topics covered may include sequential lists, linked lists, binary search trees, searching (indexing and hashing), and sorting. There are two identical questions with different test data sets. The assignment grade is the sum of the scores for both questions.

  1. The first question is a small data set testing question. A dictionary file (dictionary.txt), a stopword file (stopword.txt), a text file (article.txt), and a sample running result file (results(example).txt) are provided for students to use when debugging the program. There is no performance testing for this question, and a score is awarded for correct results.

    My results for Question 1: Total Test Data: 2; Average Memory Usage: 71.750K; Average CPU Time: 0.73170S; Average Wall Clock Time: 0.73167S

  2. The second question is a large data set testing question. Points are awarded only if the program runs correctly (passes the test cases), with correct results accounting for 40% of the score and performance accounting for 60%. Performance is evaluated against the average of the two fastest programs, and scores are calculated accordingly. Programs that do not produce results or produce incorrect results will not receive points.

    My results for Question 2: Total Test Data: 1; Average Memory Usage: 129.387K; Average CPU Time: 0.64111S; Average Wall Clock Time: 0.64108S

It is recommended that students try different methods to implement the program to understand how the combination of different data structures and algorithms can affect program performance. Additionally, students are encouraged to use the knowledge learned in this course to solve the problem.
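As one way to see how the choice of structure changes lookup cost, the sketch below checks stopword membership three ways: a linear scan of a list, binary search over a sorted list, and a hash set. The word list and the function names are hypothetical and are not taken from the assignment; this is only an illustration of the idea, not a reference solution.

```python
from bisect import bisect_left


def in_list_linear(words, target):
    """Sequential list: scan every element, O(n) per lookup."""
    for w in words:
        if w == target:
            return True
    return False


def in_sorted_binary(sorted_words, target):
    """Sorted list + binary search: O(log n) per lookup."""
    i = bisect_left(sorted_words, target)
    return i < len(sorted_words) and sorted_words[i] == target


def in_hash_set(word_set, target):
    """Hash set: O(1) expected time per lookup."""
    return target in word_set


# Hypothetical stopword list, for illustration only.
stopwords = ["the", "a", "of", "and", "to", "in"]
sorted_stopwords = sorted(stopwords)
stopword_set = set(stopwords)

queries = ["and", "tree", "to", "hash"]
linear = [in_list_linear(stopwords, q) for q in queries]
binary = [in_sorted_binary(sorted_stopwords, q) for q in queries]
hashed = [in_hash_set(stopword_set, q) for q in queries]
print(linear, binary, hashed)
```

All three approaches return the same answers; they differ only in how much work each lookup costs, which is what starts to matter once the test data grows large.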

Keyword-Based Large-Scale Document Search (Comprehensive - Small Data)

[Problem Description] Search engines such as Baidu and Google provide efficient webpage and document search functions, allowing users to query information of interest through one or more keywords. Large-scale text document search typically requires efficient indexing and query algorithms. Implement a keyword-based document search program that can quickly search and rank large-scale text documents.
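A minimal sketch of the indexing-and-query idea described above, using an inverted index that maps each keyword to the documents containing it. The problem statement here does not give the ranking formula, so the sketch ranks by summed term frequency purely as an assumption; the tokenizer, the sample documents, and all function names are likewise hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs, stopwords):
    """Map each word to {doc_id: occurrence count}, skipping stopwords."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            if word in stopwords:
                continue
            index[word][doc_id] = index[word].get(doc_id, 0) + 1
    return index


def search(index, keywords, top_n=5):
    """Rank documents by summed keyword frequency (assumed scoring rule)."""
    scores = defaultdict(int)
    for kw in keywords:
        for doc_id, count in index.get(kw.lower(), {}).items():
            scores[doc_id] += count
    # Highest score first; ties broken by smaller document id.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical sample data, for illustration only.
docs = {
    1: "Hash tables give fast search. Search trees keep keys sorted.",
    2: "Sorting a linked list is slower than sorting an array.",
    3: "A binary search tree supports search, insert and delete.",
}
stopwords = {"a", "an", "is", "the", "and", "than"}

index = build_index(docs, stopwords)
results = search(index, ["search", "tree"])
print(results)
```

Building the index once means each query only touches the posting lists of its keywords instead of rescanning every document, which is the main reason an index helps on the large data set.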

Keyword-Based Large-Scale Document Search (Comprehensive - Big Data)

The problem description is the same as for Keyword-Based Large-Scale Document Search (Comprehensive - Small Data).

[Scoring Criteria] This is a comprehensive performance test question. The program with the fastest running time receives full marks, and the scores of other programs are calculated relative to the running time of the fastest program. Programs that produce no results, time out (the running time must not exceed 120 seconds), or produce incorrect results will not receive any points.

I have successfully scored 100 out of 100 on this project.

Note that I haven't uploaded my source code. If you need help, you can contact me by email at sahchandan98@buaa.edu.cn. Thanks! Let's code!