Welcome to the final project of the camera course. By completing all the lessons, you now have a solid understanding of keypoint detectors, descriptors, and methods to match them between successive images. Also, you know how to detect objects in an image using the YOLO deep-learning framework. And finally, you know how to associate regions in a camera image with Lidar points in 3D space. Let's take a look at our program schematic to see what we have already accomplished and what is still missing.
In this final project, you will implement the missing parts in the schematic. To do this, you will complete four major tasks:
- First, you will develop a way to match 3D objects over time by using keypoint correspondences.
- Second, you will compute the TTC based on Lidar measurements.
- Third, you will do the same using the camera, which requires you to first associate keypoint matches with regions of interest and then compute the TTC based on those matches.
- And lastly, you will conduct various tests with the framework. Your goal is to identify the most suitable detector/descriptor combination for TTC estimation and also to search for problems that can lead to faulty measurements by the camera or Lidar sensor. In the last course of this Nanodegree, you will learn about the Kalman filter, which is a great way to combine the two independent TTC measurements into an improved version which is much more reliable than a single sensor alone can be. But before we think about such things, let us focus on your final project in the camera course.
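Both TTC estimates implemented below assume a constant-velocity motion model. As a brief recap, with Δt the time between two frames, d₀ and d₁ the closest Lidar distances in the previous and current frame, and h₁/h₀ the ratio of keypoint distances between the current and previous frame:

```latex
% Constant-velocity model: relative speed v_0 = (d_0 - d_1) / \Delta t
TTC_{\mathrm{Lidar}}  = \frac{d_1}{v_0} = \frac{d_1 \, \Delta t}{d_0 - d_1}

% Pinhole model: the keypoint-distance ratio satisfies h_1 / h_0 = d_0 / d_1
TTC_{\mathrm{Camera}} = -\frac{\Delta t}{1 - h_1 / h_0}
```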
- cmake >= 2.8
  - All OSes: click here for installation instructions
- make >= 4.1 (Linux, Mac), 3.81 (Windows)
  - Linux: make is installed by default on most Linux distros
  - Mac: install Xcode command line tools to get make
  - Windows: click here for installation instructions
- Git LFS
  - Weight files are handled using LFS
- OpenCV >= 4.1
  - This must be compiled from source using the `-D OPENCV_ENABLE_NONFREE=ON` cmake flag for testing the SIFT and SURF detectors.
  - The OpenCV 4.1.0 source code can be found here
- gcc/g++ >= 5.4
  - Linux: gcc / g++ is installed by default on most Linux distros
  - Mac: same deal as make - install Xcode command line tools
  - Windows: recommend using MinGW
- Clone this repo.
- Make a build directory in the top-level project directory: `mkdir build && cd build`
- Compile: `cmake .. && make`
- Run it: `./3D_object_tracking`
In `camFusion_Student.cpp`: implement `matchBoundingBoxes`

```cpp
void matchBoundingBoxes(std::vector<cv::DMatch> &matches,
                        std::map<int, int> &bbBestMatches,
                        DataFrame &prevFrame,
                        DataFrame &currFrame)
{
    // Loop over every bounding box in the previous frame
    for (auto boundingBoxPrev : prevFrame.boundingBoxes)
    {
        // Collect all matches whose keypoint in the previous frame is enclosed by this bounding box
        std::vector<cv::DMatch> matchesEnclosedInPrev;
        for (auto match : matches)
        {
            int prevKeyPointIdx = match.queryIdx;
            cv::Point prevKeyPoint = prevFrame.keypoints.at(prevKeyPointIdx).pt;
            if (boundingBoxPrev.roi.contains(prevKeyPoint))
            {
                matchesEnclosedInPrev.push_back(match);
            }
        }

        // Multimap storing the IDs of bounding boxes in the current frame
        // that enclose the same matches as this bounding box in the previous frame
        std::multimap<int, int> sameMatchesCurr;

        // For each match found in this previous-frame bounding box,
        // check which bounding boxes in the current frame also enclose it
        for (auto matchInPrev : matchesEnclosedInPrev)
        {
            int currKeyPointIdx = matchInPrev.trainIdx;
            for (auto boundingBoxCurr : currFrame.boundingBoxes)
            {
                cv::Point currKeyPoint = currFrame.keypoints.at(currKeyPointIdx).pt;
                if (boundingBoxCurr.roi.contains(currKeyPoint))
                {
                    sameMatchesCurr.insert({boundingBoxCurr.boxID, currKeyPointIdx});
                }
            }
        }

        // The bounding box in the current frame with the highest number of shared matches
        // is assigned as the best match
        if (sameMatchesCurr.size() > 0)
        {
            size_t maxOccurrence = 0;
            int boundingBoxBestMatchedIdx = -1;
            for (auto sameMatchCurr : sameMatchesCurr)
            {
                int boxID = sameMatchCurr.first;
                size_t boxIdOccurrence = sameMatchesCurr.count(boxID);
                if (boxIdOccurrence > maxOccurrence)
                {
                    maxOccurrence = boxIdOccurrence;
                    boundingBoxBestMatchedIdx = boxID;
                }
            }
            bbBestMatches.insert({boundingBoxPrev.boxID, boundingBoxBestMatchedIdx});
        }
    }
}
```
In `camFusion_Student.cpp`: implement `computeTTCLidar`

```cpp
void computeTTCLidar(std::vector<LidarPoint> &lidarPointsPrev,
                     std::vector<LidarPoint> &lidarPointsCurr,
                     double frameRate,
                     double &TTC)
{
    double dt = 1.0 / frameRate;
    double minXPrev = 1e9, minXCurr = 1e9;

    // Find the closest distance (smallest x) among the Lidar points of each frame
    for (auto itrPointsPrev = lidarPointsPrev.begin(); itrPointsPrev != lidarPointsPrev.end(); ++itrPointsPrev)
    {
        if (itrPointsPrev->x < minXPrev)
        {
            minXPrev = itrPointsPrev->x;
        }
    }
    for (auto itrPointsCurr = lidarPointsCurr.begin(); itrPointsCurr != lidarPointsCurr.end(); ++itrPointsCurr)
    {
        if (itrPointsCurr->x < minXCurr)
        {
            minXCurr = itrPointsCurr->x;
        }
    }

    // Compute TTC from both measurements using the constant-velocity model
    TTC = minXCurr * dt / (minXPrev - minXCurr);
}
```
In `camFusion_Student.cpp`: implement `clusterKptMatchesWithROI`

```cpp
void clusterKptMatchesWithROI(BoundingBox &boundingBoxCurr,
                              BoundingBox &boundingBoxPrev,
                              std::vector<cv::KeyPoint> &kptsPrev,
                              std::vector<cv::KeyPoint> &kptsCurr,
                              std::vector<cv::DMatch> &kptMatches)
{
    float shrinkFactor = 0.15; // shrink bounding boxes by 15 % to reduce outliers near the ROI border
    std::vector<cv::DMatch> kptMatches_roi;
    cv::Rect smallerBoxCurr = shrinkBoundingBox(boundingBoxCurr, shrinkFactor);
    cv::Rect smallerBoxPrev = shrinkBoundingBox(boundingBoxPrev, shrinkFactor);

    // Keep a match only if both of its keypoints lie inside the shrunken bounding boxes
    for (auto match : kptMatches)
    {
        cv::KeyPoint currKeypoint = kptsCurr.at(match.trainIdx);
        auto currCvPoint = cv::Point(currKeypoint.pt.x, currKeypoint.pt.y);
        cv::KeyPoint prevKeypoint = kptsPrev.at(match.queryIdx);
        auto prevCvPoint = cv::Point(prevKeypoint.pt.x, prevKeypoint.pt.y);
        if (smallerBoxCurr.contains(currCvPoint) && smallerBoxPrev.contains(prevCvPoint))
        {
            kptMatches_roi.push_back(match);
        }
    }

    // Calculate the mean match distance within the ROI
    double meanKeypointDist = 0.0;
    for (auto kptMatch : kptMatches_roi)
    {
        meanKeypointDist += cv::norm(kptsCurr.at(kptMatch.trainIdx).pt - kptsPrev.at(kptMatch.queryIdx).pt);
    }
    if (kptMatches_roi.size() > 0)
    {
        meanKeypointDist /= kptMatches_roi.size();
    }
    else
    {
        return;
    }

    // Keep only matches with distance < 1.5 * mean distance
    double distThreshold = meanKeypointDist * 1.5;
    for (auto kptMatch : kptMatches_roi)
    {
        double dist = cv::norm(kptsCurr.at(kptMatch.trainIdx).pt - kptsPrev.at(kptMatch.queryIdx).pt);
        if (dist < distThreshold)
        {
            boundingBoxCurr.kptMatches.push_back(kptMatch);
            boundingBoxPrev.kptMatches.push_back(kptMatch);
        }
    }
}
```
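The helper `shrinkBoundingBox` used above is not part of the starter code. A minimal sketch of what it could look like, assuming it simply scales the ROI around its center by the given factor:

```cpp
// Hypothetical helper (sketch only): returns the bounding-box ROI shrunk by
// `shrinkFactor` around its center, so keypoints near the box border are excluded.
cv::Rect shrinkBoundingBox(const BoundingBox &boundingBox, float shrinkFactor)
{
    cv::Rect smallerBox;
    smallerBox.x = boundingBox.roi.x + shrinkFactor * boundingBox.roi.width / 2.0;
    smallerBox.y = boundingBox.roi.y + shrinkFactor * boundingBox.roi.height / 2.0;
    smallerBox.width = boundingBox.roi.width * (1 - shrinkFactor);
    smallerBox.height = boundingBox.roi.height * (1 - shrinkFactor);
    return smallerBox;
}
```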
In `camFusion_Student.cpp`: implement `computeTTCCamera`

```cpp
void computeTTCCamera(std::vector<cv::KeyPoint> &kptsPrev,
                      std::vector<cv::KeyPoint> &kptsCurr,
                      std::vector<cv::DMatch> kptMatches,
                      double frameRate,
                      double &TTC)
{
    double dt = 1.0 / frameRate;

    // Guard against too few matches, which would make the loops below ill-defined
    if (kptMatches.size() < 2)
    {
        TTC = std::numeric_limits<double>::quiet_NaN();
        return;
    }

    // Compute distance ratios on every pair of keypoints, O(n^2) on the number of matches within the ROI
    std::vector<double> distRatios;
    for (auto itrFirstMatch = kptMatches.begin(); itrFirstMatch != kptMatches.end() - 1; ++itrFirstMatch)
    {
        // First keypoint of the pair in the previous and current frame
        cv::KeyPoint kptFirstCurr = kptsCurr.at(itrFirstMatch->trainIdx);
        cv::KeyPoint kptFirstPrev = kptsPrev.at(itrFirstMatch->queryIdx);
        for (auto itrSecondMatch = kptMatches.begin() + 1; itrSecondMatch != kptMatches.end(); ++itrSecondMatch)
        {
            cv::KeyPoint kptSecondCurr = kptsCurr.at(itrSecondMatch->trainIdx);
            cv::KeyPoint kptSecondPrev = kptsPrev.at(itrSecondMatch->queryIdx);

            // Compute distances between the two keypoints in the previous and current frame
            double distCurr = cv::norm(kptFirstCurr.pt - kptSecondCurr.pt);
            double distPrev = cv::norm(kptFirstPrev.pt - kptSecondPrev.pt);

            // Require a minimum current distance and avoid division by zero
            double minDist = 100.0;
            if (distPrev > std::numeric_limits<double>::epsilon() && distCurr >= minDist)
            {
                double distRatio = distCurr / distPrev;
                distRatios.push_back(distRatio);
            }
        }
    }

    // Only continue if the vector of distance ratios is not empty
    if (distRatios.size() == 0)
    {
        TTC = std::numeric_limits<double>::quiet_NaN();
        return;
    }

    // Use the median distance ratio to reduce the influence of outliers
    std::sort(distRatios.begin(), distRatios.end());
    size_t medIndex = distRatios.size() / 2;
    double medianDistRatio = distRatios.size() % 2 == 0
                                 ? (distRatios[medIndex - 1] + distRatios[medIndex]) / 2.0
                                 : distRatios[medIndex];

    TTC = -dt / (1 - medianDistRatio);
}
```
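For context, a rough sketch of how these functions might be tied together for one frame pair; the variable and member names used here (`prevFrame`, `currFrame`, `kptMatches`, `lidarPoints`, `sensorFrameRate`) are assumptions about the surrounding main loop, not code shown above:

```cpp
// Sketch only: associate bounding boxes across frames, then estimate TTC per matched pair.
std::map<int, int> bbBestMatches;
matchBoundingBoxes(currFrame.kptMatches, bbBestMatches, prevFrame, currFrame);

for (auto &bbMatch : bbBestMatches)
{
    // Look up the matched bounding boxes by their box IDs
    BoundingBox *prevBB = nullptr, *currBB = nullptr;
    for (auto &box : prevFrame.boundingBoxes)
        if (box.boxID == bbMatch.first) prevBB = &box;
    for (auto &box : currFrame.boundingBoxes)
        if (box.boxID == bbMatch.second) currBB = &box;
    if (prevBB == nullptr || currBB == nullptr)
        continue;

    double ttcLidar = 0.0, ttcCamera = 0.0;
    computeTTCLidar(prevBB->lidarPoints, currBB->lidarPoints, sensorFrameRate, ttcLidar);

    clusterKptMatchesWithROI(*currBB, *prevBB, prevFrame.keypoints, currFrame.keypoints, currFrame.kptMatches);
    computeTTCCamera(prevFrame.keypoints, currFrame.keypoints, currBB->kptMatches, sensorFrameRate, ttcCamera);
}
```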
In some frame pairs, such as (11,12) and (16,17) shown below, the Lidar TTC becomes negative because the preceding vehicle is actually accelerating away, i.e. the closest distance increases between frames, so the method should be adapted to handle this case.
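One possible adaptation (a sketch, not part of the submitted code) is to report an undefined TTC whenever the gap is not closing, instead of letting the constant-velocity formula produce a negative value:

```cpp
// Sketch only: if the closest distance did not decrease between frames, the
// constant-velocity model would yield a negative or infinite TTC, so report NaN instead.
if (minXPrev - minXCurr <= std::numeric_limits<double>::epsilon())
{
    TTC = std::numeric_limits<double>::quiet_NaN();
}
else
{
    TTC = minXCurr * dt / (minXPrev - minXCurr);
}
```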
Based on the performance evaluation results from the midterm project, the top-3 detector/descriptor suggestions all work well. I chose FAST/BRIEF as the combination because it is fast and detects relatively many keypoints. The camera TTC is quite stable and realistic, and it is also quite consistent with the Lidar TTC.
With a combination such as HARRIS/ORB, however, the camera TTC becomes very unstable: it is sometimes NaN or -inf because too few keypoints are detected or too few matches are found.
Per-frame results for the poor detector/descriptor combination are listed in the left part of the table below; the good combination is listed in the right part.
Frame | Combination | TTC Lidar [s] | TTC Camera [s] | Combination | TTC Lidar [s] | TTC Camera [s] |
---|---|---|---|---|---|---|
0/1 | HARRIS/ORB | 12.9722 | nan | FAST/BRIEF | 12.9722 | 12.1312 |
1/2 | HARRIS/ORB | 12.264 | -inf | FAST/BRIEF | 12.264 | 10.7908 |
2/3 | HARRIS/ORB | 13.9161 | 93.2924 | FAST/BRIEF | 13.9161 | 12.5844 |
3/4 | HARRIS/ORB | 7.11572 | 12.6321 | FAST/BRIEF | 7.11572 | 14.1171 |
4/5 | HARRIS/ORB | 16.2511 | 12.5624 | FAST/BRIEF | 16.2511 | 14.7374 |
5/6 | HARRIS/ORB | 12.4213 | nan | FAST/BRIEF | 12.4213 | 13.0071 |
6/7 | HARRIS/ORB | 34.3404 | -inf | FAST/BRIEF | 34.3404 | 13.5165 |
7/8 | HARRIS/ORB | 9.34376 | 12.9486 | FAST/BRIEF | 9.34376 | 32.985 |
8/9 | HARRIS/ORB | 18.1318 | nan | FAST/BRIEF | 18.1318 | 12.7521 |
9/10 | HARRIS/ORB | 18.0318 | nan | FAST/BRIEF | 18.0318 | 11.5258 |
10/11 | HARRIS/ORB | 3.83244 | nan | FAST/BRIEF | 3.83244 | 12.1 |
11/12 | HARRIS/ORB | -10.8537 | nan | FAST/BRIEF | -10.8537 | 11.7805 |
12/13 | HARRIS/ORB | 9.22307 | 13.4095 | FAST/BRIEF | 9.22307 | 11.6205 |
13/14 | HARRIS/ORB | 10.9678 | nan | FAST/BRIEF | 10.9678 | 11.1323 |
14/15 | HARRIS/ORB | 8.09422 | nan | FAST/BRIEF | 8.09422 | 12.8498 |
15/16 | HARRIS/ORB | 3.17535 | 12.5305 | FAST/BRIEF | 3.17535 | 11.328 |
16/17 | HARRIS/ORB | -9.99424 | nan | FAST/BRIEF | -9.99424 | 10.8182 |
17/18 | HARRIS/ORB | 8.30978 | nan | FAST/BRIEF | 8.30978 | 11.8703 |