Thanks to the following people who participated in the project:

SID | Name | Contribution ratio | GitHub ID |
---|---|---|---|
12011736 | Zhe DING | 50% | DingZ0115 |
12011126 | Zexuan Jia | 50% | Kazawaryu |
This project uses Vue 3 as its front-end framework, together with the Element Plus component library and ECharts for data visualization. The structure is as follows:
Front-end
├─public
└─src
├─apis
├─assets
├─components
├─plugins
├─router
├─store
├─utils
└─views
This project adopts the Spring SSM & MVC framework at the back end and divides the code into common, config, controller, entity, mapper, and service layers. The structure is as follows:
CS209
└─project
├─common
├─config
├─controller
├─entity
├─mapper
└─service
The project runs on Spring Boot and uses MyBatis to interact with the database. This setup facilitates decoupling, simplifies development, supports declarative transaction management, and reduces the difficulty of using the Java EE APIs, which keeps development lightweight. The basic function and structure of the four-layer framework are shown in the following figure:
The front end uses axios to interact with the back end. In addition, the front end encapsulates the API code, making it easy to write and manage.
import api from './apis'
app.config.globalProperties.$api = api;
export function getTop5UpvoteTags() {
return request({
method: 'GET',
url: '/tag/getTop5UpvoteTags'
})
}
import axios from "axios";
const request = axios.create({
baseURL: 'http://10.24.125.235:8080',
timeout: 5000
})
export default request
this.$api.API.getTop5UpvoteTags().then((resp) => {
_this.top5Upvote = resp.data.data.list1
_this.xValue_upvote = resp.data.data.list2
_this.showUpvotes()
}).catch(err => {
console.log(err);
});
@Configuration
@EnableSwagger2
@EnableWebMvc
public class SwaggerConfig implements WebMvcConfigurer {
@Override
public void addResourceHandlers(ResourceHandlerRegistry registry);
@Bean
public Docket createRestApi();
}
Swagger is a specification and complete framework for generating, describing, invoking, and visualizing RESTful web services. To use the Swagger framework, you need to register the interface during initialization and open the relevant permissions. More details are in config\SwaggerConfig.
Since this project does not involve large-scale data interaction, it uses a lightweight non-relational database rather than a relational one, which speeds up queries and responses. When data is added or modified, the server checks whether the data is valid.
public void updateQuestionAndTagFromWeb() throws URISyntaxException, IOException, SQLException;
public void updateAnswerFromDB() throws SQLException, URISyntaxException, IOException;
public void updateJavaAPIFromWeb() throws IOException, SQLException;
public void updateUserFromDB() throws SQLException, IOException;
public void updateTags() throws SQLException, IOException;
The five methods above update the database by crawling the API. The basic idea is: construct the request URI, request a response in JSON format, process the returned data, and write it to the database. More details are in config\DBconfig.
Details and implementation of the above method can be found in the QuestionController, QuestionMapper, and QuestionService.
To obtain the percentage of unanswered questions, simply count the number of questions with an answer number of 0 in the database and divide it by the total number of questions.
public Result getNoAnswerRatio();
To obtain the average and maximum number of answers for a single question, simply perform a statistical analysis on the "answer number" column of the "Question" table in the database.
public Result getMaxAnswer();
public Result getAverageAnswer();
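As a rough illustration of the three statistics above (unanswered percentage, maximum, and average), the sketch below computes them from an in-memory list of answer counts. The class `AnswerStats`, its method names, and the sample data are hypothetical; in the project these values presumably come from queries issued through QuestionMapper.

```java
import java.util.IntSummaryStatistics;
import java.util.List;

// Hypothetical helper, not part of the project: works on a plain list of
// per-question answer counts instead of the "Question" table.
final class AnswerStats {
    // Percentage of questions whose answer count is 0.
    static double noAnswerPercent(List<Integer> answerCounts) {
        if (answerCounts.isEmpty()) {
            return 0.0;
        }
        long unanswered = answerCounts.stream().filter(c -> c == 0).count();
        return 100.0 * unanswered / answerCounts.size();
    }

    // Min, max, sum, count, and average of the answer counts in one pass.
    static IntSummaryStatistics summarize(List<Integer> answerCounts) {
        return answerCounts.stream().mapToInt(Integer::intValue).summaryStatistics();
    }

    public static void main(String[] args) {
        List<Integer> counts = List.of(0, 2, 0, 5, 1); // made-up sample data
        IntSummaryStatistics stats = summarize(counts);
        System.out.println("unanswered %: " + noAnswerPercent(counts));
        System.out.println("max: " + stats.getMax());
        System.out.println("average: " + stats.getAverage());
    }
}
```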
To perform a horizontal analysis of the number of answers for each question, divide the questions into eight sub-intervals based on the maximum number of answers, and count the number of questions in each sub-interval. Then, create a bar graph to display the results.
public Result getDistributionOfAnswers() {
int maxAnswerOfQuestion = questionMapper.getMaxAnswerOfQuestion();
List<Range> listRange = new ArrayList<>();
int r = (int) Math.round((double) maxAnswerOfQuestion / 8);
int left = 0;
int right = r;
for (int i = 0; i < 8; i++) {
......
listRange.add(new Range(tName, tValue));
}
return Result.ok().code(200).message("success").addData("distribution", listRange);
}
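The body of the bucketing loop is elided in the listing above. As an illustration only, the sketch below shows one way to count questions in eight equal-width intervals. It is not the project's implementation: the class `AnswerDistribution` is hypothetical, and the interval width is chosen as `max / 8 + 1` (instead of the rounded value used above) so that the maximum always falls inside the last interval.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: histogram of answer counts over 8 equal-width intervals.
final class AnswerDistribution {
    static final int BUCKETS = 8;

    // Assumes all answer counts are non-negative.
    static int[] histogram(List<Integer> answerCounts) {
        int max = answerCounts.stream().mapToInt(Integer::intValue).max().orElse(0);
        // width = floor(max / 8) + 1, so 8 * width > max and max / width <= 7.
        int width = max / BUCKETS + 1;
        int[] counts = new int[BUCKETS];
        for (int c : answerCounts) {
            counts[c / width]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data with a maximum of 34, so each interval spans 5 values.
        int[] result = histogram(List.of(0, 1, 4, 5, 34));
        System.out.println(Arrays.toString(result));
    }
}
```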
Details and implementation of the above method can be found in the QuestionController, QuestionMapper, and QuestionService.
Similar to 4.1.1, simply count the number of questions in the "Question" table of the database where the "accepted_answer" column is true.
public Result getAcceptedQuestionCount();
To calculate the distribution of the time difference between accepting an answer and posting a question, only questions that have accepted answers need to be considered. For each question with an accepted answer, retrieve the submission time of the question and the posting time of the accepted answer, calculate their difference, and perform a statistical analysis similar to 4.1.3.
For the small percentage of questions where the time difference exceeds 3000 hours, we consider them to be "zombie" questions. To ensure the data is more readable and analyzable, we will not discuss them here.
When performing the analysis, the time difference can be converted into minutes and seconds to create a scatter plot for display.
public Result getDistributionOfQuestionDeltaTimes() {
List<Integer> post = questionMapper.getPostQuestionTimes();
List<Integer> accept = questionMapper.getAcceptedQuestionTimes();
int[][] array = new int[post.size()][2];
for (int i = 0; i < post.size(); i++) {
int delta = accept.get(i) - post.get(i);
if (delta < 3000) {
array[i][0] = delta / 60;
array[i][1] = delta % 60;
}
}
return Result.ok().code(200).message("success").addData("distribution", array);
}
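The filtering and conversion described above can also be sketched without a fixed-size array, so that skipped "zombie" questions do not leave empty rows in the result. This is a sketch under assumptions: timestamps are taken to be Unix epoch seconds, the 3000-hour cutoff is applied in seconds, and the class `AcceptDelay` and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: turn (post time, accept time) pairs into scatter-plot points.
final class AcceptDelay {
    // Assumed cutoff: 3000 hours expressed in seconds.
    static final long ZOMBIE_SECONDS = 3000L * 3600;

    // Both lists are assumed to have the same length and matching order.
    static List<long[]> minuteSecondPairs(List<Long> postTimes, List<Long> acceptTimes) {
        List<long[]> points = new ArrayList<>();
        for (int i = 0; i < postTimes.size(); i++) {
            long delta = acceptTimes.get(i) - postTimes.get(i);
            if (delta < 0 || delta > ZOMBIE_SECONDS) {
                continue; // drop inconsistent rows and "zombie" questions
            }
            points.add(new long[] {delta / 60, delta % 60}); // {minutes, seconds}
        }
        return points;
    }

    public static void main(String[] args) {
        List<Long> post = List.of(0L, 100L, 0L);
        List<Long> accept = List.of(125L, 100L + ZOMBIE_SECONDS + 1, 59L);
        for (long[] p : minuteSecondPairs(post, accept)) {
            System.out.println(p[0] + " min " + p[1] + " s");
        }
    }
}
```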
For this question, only questions with accepted answers need to be considered. For each question that has an accepted answer, use the question ID to find all of its other answers, take the maximum upvote count among them, and compare it with the upvote count of the accepted answer. If the maximum is higher, the question has a non-accepted answer with more upvotes than the accepted one.
public Result getBetterRatio() {
List<Integer> acceptQuestion = questionMapper.getAcceptQuestionId();
int questionCount = questionMapper.getQuestionCount();
int validCount = 0;
for (Integer question_id : acceptQuestion) {
int maxUnAcceptedScore = questionMapper.getMaxUnAcceptedScore(question_id);
int acceptedScore = questionMapper.getAcceptedScore(question_id);
if (maxUnAcceptedScore > acceptedScore) {
validCount++;
}
}
double ret = round((double) validCount / questionCount * 100, 2);
return Result.ok().code(200).message("success").addData("ratio", ret + "%");
}
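The same comparison can be illustrated on in-memory data, without the mapper calls. In the sketch below, the classes `BetterRatio` and `Question` are hypothetical, and a question with no other answers is treated as not having a better answer. Like the method above, the percentage is taken over all questions, not only those with an accepted answer.

```java
import java.util.List;

// Hypothetical sketch of the "better than accepted" ratio on in-memory data.
final class BetterRatio {
    // One question that has an accepted answer.
    static final class Question {
        final int acceptedScore;
        final List<Integer> otherScores; // scores of the non-accepted answers

        Question(int acceptedScore, List<Integer> otherScores) {
            this.acceptedScore = acceptedScore;
            this.otherScores = otherScores;
        }
    }

    static boolean hasBetterAnswer(Question q) {
        int bestOther = q.otherScores.stream()
                .mapToInt(Integer::intValue)
                .max()
                .orElse(Integer.MIN_VALUE); // no other answers: never "better"
        return bestOther > q.acceptedScore;
    }

    // Percentage relative to the total number of questions.
    static double betterPercent(List<Question> accepted, int totalQuestions) {
        if (totalQuestions == 0) {
            return 0.0;
        }
        long better = accepted.stream().filter(BetterRatio::hasBetterAnswer).count();
        return 100.0 * better / totalQuestions;
    }

    public static void main(String[] args) {
        List<Question> accepted = List.of(
                new Question(5, List.of(3, 7)),
                new Question(10, List.of(2)),
                new Question(1, List.of()));
        System.out.println(betterPercent(accepted, 4) + "%");
    }
}
```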
Details and implementation of the above method can be found in the TagController, TagMapper, and TagService.
The answer to this question can be obtained by performing a statistical analysis on the "Question" table. When querying questions with the "Java" tag, other tags that appear together with the "Java" tag can be considered as co-occurring tags. Count the frequency of these co-occurring tags and sort them in descending order.
public Result getTop15AppearWithJavaTags();
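A minimal sketch of the co-occurrence count described above, assuming each question is available as a list of tag names. The class `CoOccurringTags` and the sample data are hypothetical, and ties are broken alphabetically, which the original does not specify.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: most frequent tags that appear together with a target tag.
final class CoOccurringTags {
    static List<String> topWith(String target, List<List<String>> questionTags, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> tags : questionTags) {
            if (!tags.contains(target)) {
                continue; // only questions carrying the target tag count
            }
            for (String tag : tags) {
                if (!tag.equals(target)) {
                    counts.merge(tag, 1, Integer::sum);
                }
            }
        }
        return counts.entrySet().stream()
                // descending by frequency, then alphabetical for ties (assumption)
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Integer>comparingByKey()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> questions = List.of(
                List.of("java", "spring"),
                List.of("java", "spring", "android"),
                List.of("python", "spring"),
                List.of("java", "android"),
                List.of("java", "kotlin"));
        System.out.println(topWith("java", questions, 2));
    }
}
```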
To obtain statistics on upvote count, a filter of !)4qcVpY9Tvb_cdttjBnShttVWo.l was used. The upvote count can be found and counted in the "answers" field of the resulting data.
To obtain statistics and query data for a single question, you can use the following URL with the question ID:
https://api.stackexchange.com/2.2/questions/{question_id}/answers?order=desc&sort=votes&site=stackoverflow&filter=!)4qcVpY9Tvb_cdttjBnShttVWo.l&votes>=100
This will return information about the question with the specified ID, including the number of answers, the score of the question, and other relevant data. You can adjust the filter as needed to include or exclude certain fields from the response.
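As a small sketch, the request URL above can be assembled for a given question ID with `java.net.URI`. The class `AnswersUrl` is a hypothetical name, and the sketch leaves out the trailing `votes>=100` part of the URL because `>` would need percent-encoding before it can be parsed as a URI.

```java
import java.net.URI;

// Hypothetical sketch: build the per-question answers URL shown above.
final class AnswersUrl {
    static final String FILTER = "!)4qcVpY9Tvb_cdttjBnShttVWo.l";

    static URI forQuestion(long questionId) {
        String url = "https://api.stackexchange.com/2.2/questions/" + questionId
                + "/answers?order=desc&sort=votes&site=stackoverflow&filter=" + FILTER;
        return URI.create(url);
    }

    public static void main(String[] args) {
        // 42 is a placeholder ID, not a question from the project data.
        System.out.println(forQuestion(42));
    }
}
```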
public Result getTop5UpvoteTags();
public Result getTagsUpvoteComb();
Similar to 4.3.2, the same method can be used to perform the statistics.
public Result getTop5ViewTags();
public Result getTagsViewComb();
Details and implementation of the above method can be found in the UserController, UserMapper, and UserService.
The answer to this question can be obtained by performing a statistical analysis on the "Question" table. For each question, count the number of unique users who participated in the discussion, including users who posted answers and comments.
Note: Due to certain restrictions and limitations, it may not be possible to obtain a large amount of data for this analysis. The results should be considered as reference only.
The method for calculating the distribution is similar to that in 4.1.3.
public Result getUserDistributionOfCommunication();
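Counting the unique participants of one question amounts to a set union of the users who answered and the users who commented. The sketch below shows this on plain ID lists; the class `Participants` is hypothetical, and whether the question's author should also be counted is not stated in the text, so the author is left out here.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: number of distinct users taking part in one question.
final class Participants {
    static int distinctUsers(List<Long> answererIds, List<Long> commenterIds) {
        Set<Long> users = new HashSet<>(answererIds);
        users.addAll(commenterIds); // users who both answered and commented count once
        return users.size();
    }

    public static void main(String[] args) {
        // Made-up user IDs: user 2 answers twice and also comments.
        System.out.println(distinctUsers(List.of(1L, 2L, 2L), List.of(2L, 3L)));
    }
}
```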
Similarly, the answer to this question is obtained from the "Question" table. When counting the number of participants in a question, the "comment count" and "answer count" fields of the corresponding users in the "User" table can be updated and counted. The method for calculating the distribution is the same as in 4.1.3.
public Result getUserDistributionOfAnswer();
public Result getUserDistributionOfComment();
The statistics for this question come from three sources: the number of questions posted, the number of questions answered, and the number of questions discussed. For each user, calculate an activity score using the formula

$$ A_u = \text{post\_num} \times 0.2 + \text{answer\_num} \times 0.5 + \text{comment\_num} \times 0.3 $$

Sort the users by their activity score to obtain a ranking of user activity. In addition, it is also possible to query for the users with the highest number of posted questions, answered questions, and discussed questions.
public Result getMostActiveUser();
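The activity score and ranking can be sketched as follows, using the weights 0.2, 0.5, and 0.3 from the formula. The classes `ActivityRanking` and `UserActivity` and the sample users are hypothetical; the project's own version presumably reads these counts from the "User" table.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: weighted activity score and ranking by that score.
final class ActivityRanking {
    static final class UserActivity {
        final long userId;
        final int postNum;
        final int answerNum;
        final int commentNum;

        UserActivity(long userId, int postNum, int answerNum, int commentNum) {
            this.userId = userId;
            this.postNum = postNum;
            this.answerNum = answerNum;
            this.commentNum = commentNum;
        }

        // A_u = post_num * 0.2 + answer_num * 0.5 + comment_num * 0.3
        double score() {
            return postNum * 0.2 + answerNum * 0.5 + commentNum * 0.3;
        }
    }

    // Returns a new list sorted from most to least active.
    static List<UserActivity> rank(List<UserActivity> users) {
        List<UserActivity> sorted = new ArrayList<>(users);
        sorted.sort(Comparator.comparingDouble((UserActivity u) -> u.score()).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<UserActivity> users = List.of(
                new UserActivity(1, 5, 0, 0),
                new UserActivity(2, 0, 4, 0),
                new UserActivity(3, 0, 0, 5));
        for (UserActivity u : rank(users)) {
            System.out.println(u.userId + " -> " + u.score());
        }
    }
}
```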
Measuring the popularity of Java APIs looks difficult at first, but the Java API documentation shows that API names follow a naming rule: they start with java, javax, org, and so on. It is therefore enough to process tag names when querying tags: run a fuzzy query on these prefixes and use regular expressions to exclude tags such as java-11, then count the questions carrying each remaining Java API tag.
This method counts only the questions that are correctly tagged with a Java API; questions without the correct tag are not considered.
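The tag filtering described above might look like the following sketch. The two regular expressions are assumptions for illustration, not the project's actual patterns: one accepts tags that start with java, javax, or org followed by `.` or `-`, and the other rejects pure version tags such as java-11.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch: decide whether a tag name looks like a Java API tag.
final class JavaApiTags {
    // Assumed prefix rule: java, javax, or org, then '.' or '-', then a name.
    private static final Pattern API_PREFIX = Pattern.compile("(java|javax|org)[.-].+");
    // Version tags such as java-8 or java-11 are excluded.
    private static final Pattern VERSION_TAG = Pattern.compile("java-\\d+");

    static boolean isApiTag(String tag) {
        return API_PREFIX.matcher(tag).matches() && !VERSION_TAG.matcher(tag).matches();
    }

    static List<String> filter(List<String> tags) {
        return tags.stream().filter(JavaApiTags::isApiTag).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tags = List.of(
                "java-stream", "java-11", "javascript", "java.util.scanner", "java-8", "org.json");
        System.out.println(filter(tags));
    }
}
```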
The following are some of the Java API popularity data crawled:
item_id | api_name | appear_num |
---|---|---|
42 | java-stream | 11383 |
43 | java-native-interface | 9653 |
13 | java.util.scanner | 6401 |
44 | java-me | 5780 |
45 | java-ee-6 | 2033 |
46 | java-io | 1750 |
In the service layer, the method is simplified as follows:
public Result getJavaAPI() {
List<String> name = tagMapper.getJavaAPIName();
List<String> num = tagMapper.getJavaAPINum();
List<Range> items = new ArrayList<>();
for (int i = 0; i < name.size(); i++) {
items.add(new Range(name.get(i), num.get(i)));
}
return Result.ok().code(200).message("success").addData("list", items);
}
In this project, there are four controllers: QuestionController, AnswerController, TagController, and UserController. Each controller contains a set of different RESTful API endpoints that handle requests related to their respective resources (Question, Answer, Tag, and User). Here are three examples of these endpoints:
@GetMapping("/getAcceptedQuestionCount")
@ApiOperation(value = "Question with Accepted Answer Count")
public Result getAcceptedQuestionCount() {
return questionService.getAcceptedQuestionCount();
}
@GetMapping("/getDistributionOfAnswers")
@ApiOperation(value = "Distribution Of Answers")
public Result getDistributionOfAnswers() {
return questionService.getDistributionOfAnswers();
}
@GetMapping("/getBetterRatio")
@ApiOperation(value = "Ratio Of Questions With A Higher-Voted Non-Accepted Answer")
public Result getBetterRatio() {
return questionService.getBetterRatio();
}
To mark a class as a controller that handles REST requests, the @RestController annotation is used. The @RequestMapping annotation specifies the URL path for each endpoint. Different HTTP method annotations (@GetMapping, @PostMapping, @PutMapping, @DeleteMapping) specify the type of operation for each endpoint.
- Percentage of questions without any answer: 45.2%
- Average number of answers: 1.37
- Maximum number of answers: 34
- The 0-4 answers interval contains the most questions.
- Percentage of questions with an accepted answer: 18.8%
- Percentage of questions with a non-accepted answer that received more upvotes than the accepted answer: 9
- Top 5 tags that most frequently appear together with the java tag: spring-boot, android, spring, kotlin, hibernate
- Top 5 tags with the most upvotes: java, collections, arraylist, initialization, hibernate
- Top 5 tag combinations with the most upvotes:
  - java, date, timezone
  - java, nullpointerexception, null
  - java, casting, operators, variable-assignment, assignment-operator
  - java, java-8, java-stream
  - java, java-8, method-reference
- Top 5 tags with the most views: java, collections, arraylist, initialization, hibernate
- Top 5 tag combinations with the most views:
  - java, nullpointerexception, null
  - java, date, timezone
  - java, arrays, java-8, java-stream
  - java, java-8, java-stream
  - java, list, search, contains
- Users who posted 1 question form the largest group.
- Users with 0 answers form the largest group.
- Users with 0 comments form the largest group.
- Most questions are discussed by 0-1 users.
- The top 3 active users in Java threads: 3145399, 3668752, 67541
- The top 5 most frequently discussed APIs on Stack Overflow: java-stream, java-native-interface, java.util.scanner, java-me, java-ee-6