您好,登錄后才能下訂單哦!
這篇文章主要介紹“基于python怎么實(shí)現(xiàn)單目三維重建”,在日常操作中,相信很多人在基于python怎么實(shí)現(xiàn)單目三維重建問題上存在疑惑,小編查閱了各式資料,整理出簡單好用的操作方法,希望對大家解答”基于python怎么實(shí)現(xiàn)單目三維重建”的疑惑有所幫助!接下來,請跟著小編一起來學(xué)習(xí)吧!
客觀世界的物體是三維的,而我們用攝像機(jī)獲取的圖像是二維的,但是我們可以通過二維圖像感知目標(biāo)的三維信息。三維重建技術(shù)是以一定的方式處理圖像進(jìn)而得到計算機(jī)能夠識別的三維信息,由此對目標(biāo)進(jìn)行分析。而單目三維重建則是根據(jù)單個攝像頭的運(yùn)動來模擬雙目視覺,從而獲得物體在空間中的三維視覺信息,其中,單目即指單個攝像頭。
在對物體進(jìn)行單目三維重建的過程中,相關(guān)運(yùn)行環(huán)境如下:
matplotlib 3.3.4
numpy 1.19.5
opencv-contrib-python 3.4.2.16
opencv-python 3.4.2.16
pillow 8.2.0
python 3.6.2
其重建主要包含以下步驟:
(1)相機(jī)的標(biāo)定
(2)圖像特征提取及匹配
(3)三維重建
接下來,我們來詳細(xì)看下每個步驟的具體實(shí)現(xiàn):
在我們?nèi)粘I钪杏泻芏嘞鄼C(jī),如手機(jī)上的相機(jī)、數(shù)碼相機(jī)及功能模塊型相機(jī)等等,每一個相機(jī)的參數(shù)都是不同的,即相機(jī)拍出的照片的分辨率、模式等。假設(shè)我們在進(jìn)行物體三維重建的時候,事先并不知道我們相機(jī)的矩陣參數(shù),那么,我們就應(yīng)當(dāng)計算出相機(jī)的矩陣參數(shù),這一個步驟就叫做相機(jī)的標(biāo)定。相機(jī)標(biāo)定的相關(guān)原理我就不介紹了,網(wǎng)上很多人都講解的挺詳細(xì)的。其標(biāo)定的具體實(shí)現(xiàn)如下:
def camera_calibration(ImagePath): # 循環(huán)中斷 criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001) # 棋盤格尺寸(棋盤格的交叉點(diǎn)的個數(shù)) row = 11 column = 8 objpoint = np.zeros((row * column, 3), np.float32) objpoint[:, :2] = np.mgrid[0:row, 0:column].T.reshape(-1, 2) objpoints = [] # 3d point in real world space imgpoints = [] # 2d points in image plane. batch_images = glob.glob(ImagePath + '/*.jpg') for i, fname in enumerate(batch_images): img = cv2.imread(batch_images[i]) imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # find chess board corners ret, corners = cv2.findChessboardCorners(imgGray, (row, column), None) # if found, add object points, image points (after refining them) if ret: objpoints.append(objpoint) corners2 = cv2.cornerSubPix(imgGray, corners, (11, 11), (-1, -1), criteria) imgpoints.append(corners2) # Draw and display the corners img = cv2.drawChessboardCorners(img, (row, column), corners2, ret) cv2.imwrite('Checkerboard_Image/Temp_JPG/Temp_' + str(i) + '.jpg', img) print("成功提取:", len(batch_images), "張圖片角點(diǎn)!") ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, imgGray.shape[::-1], None, None)
其中,cv2.calibrateCamera函數(shù)求出的mtx矩陣即為K矩陣。
當(dāng)修改好相應(yīng)參數(shù)并完成標(biāo)定后,我們可以輸出棋盤格的角點(diǎn)圖片來看看是否已成功提取棋盤格的角點(diǎn),輸出角點(diǎn)圖如下:
圖1:棋盤格角點(diǎn)提取
在整個三維重建的過程中,這一步是最為關(guān)鍵的,也是最為復(fù)雜的一步,圖片特征提取的好壞決定了你最后的重建效果。
在圖片特征點(diǎn)提取算法中,有三種算法較為常用,分別為:SIFT算法、SURF算法以及ORB算法。通過綜合分析對比,我們在這一步中采取SURF算法來對圖片的特征點(diǎn)進(jìn)行提取。三種算法的特征點(diǎn)提取效果對比如果大家感興趣可以去網(wǎng)上搜來看下,在此就不逐一對比了。具體實(shí)現(xiàn)如下:
def epipolar_geometric(Images_Path, K): IMG = glob.glob(Images_Path) img1, img2 = cv2.imread(IMG[0]), cv2.imread(IMG[1]) img1_gray = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY) img2_gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY) # Initiate SURF detector SURF = cv2.xfeatures2d_SURF.create() # compute keypoint & descriptions keypoint1, descriptor1 = SURF.detectAndCompute(img1_gray, None) keypoint2, descriptor2 = SURF.detectAndCompute(img2_gray, None) print("角點(diǎn)數(shù)量:", len(keypoint1), len(keypoint2)) # Find point matches bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True) matches = bf.match(descriptor1, descriptor2) print("匹配點(diǎn)數(shù)量:", len(matches)) src_pts = np.asarray([keypoint1[m.queryIdx].pt for m in matches]) dst_pts = np.asarray([keypoint2[m.trainIdx].pt for m in matches]) # plot knn_image = cv2.drawMatches(img1_gray, keypoint1, img2_gray, keypoint2, matches[:-1], None, flags=2) image_ = Image.fromarray(np.uint8(knn_image)) image_.save("MatchesImage.jpg") # Constrain matches to fit homography retval, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 100.0) # We select only inlier points points1 = src_pts[mask.ravel() == 1] points2 = dst_pts[mask.ravel() == 1]
找到的特征點(diǎn)如下:
圖2:特征點(diǎn)提取
我們找到圖片的特征點(diǎn)并相互匹配后,則可以開始進(jìn)行三維重建了,具體實(shí)現(xiàn)如下:
points1 = cart2hom(points1.T) points2 = cart2hom(points2.T) # plot fig, ax = plt.subplots(1, 2) ax[0].autoscale_view('tight') ax[0].imshow(cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)) ax[0].plot(points1[0], points1[1], 'r.') ax[1].autoscale_view('tight') ax[1].imshow(cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)) ax[1].plot(points2[0], points2[1], 'r.') plt.savefig('MatchesPoints.jpg') fig.show() # points1n = np.dot(np.linalg.inv(K), points1) points2n = np.dot(np.linalg.inv(K), points2) E = compute_essential_normalized(points1n, points2n) print('Computed essential matrix:', (-E / E[0][1])) P1 = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]) P2s = compute_P_from_essential(E) ind = -1 for i, P2 in enumerate(P2s): # Find the correct camera parameters d1 = reconstruct_one_point(points1n[:, 0], points2n[:, 0], P1, P2) # Convert P2 from camera view to world view P2_homogenous = np.linalg.inv(np.vstack([P2, [0, 0, 0, 1]])) d2 = np.dot(P2_homogenous[:3, :4], d1) if d1[2] > 0 and d2[2] > 0: ind = i P2 = np.linalg.inv(np.vstack([P2s[ind], [0, 0, 0, 1]]))[:3, :4] Points3D = linear_triangulation(points1n, points2n, P1, P2) fig = plt.figure() fig.suptitle('3D reconstructed', fontsize=16) ax = fig.gca(projection='3d') ax.plot(Points3D[0], Points3D[1], Points3D[2], 'b.') ax.set_xlabel('x axis') ax.set_ylabel('y axis') ax.set_zlabel('z axis') ax.view_init(elev=135, azim=90) plt.savefig('Reconstruction.jpg') plt.show()
其重建效果如下(效果一般):
圖3:三維重建
從重建的結(jié)果來看,單目三維重建效果一般,我認(rèn)為可能與這幾方面因素有關(guān):
(1)圖片拍攝形式。如果是進(jìn)行單目三維重建任務(wù),在拍攝圖片時最好保持平行移動相機(jī),且最好正面拍攝,即不要斜著拍或特異角度進(jìn)行拍攝;
(2)拍攝時周邊環(huán)境干擾。選取拍攝的地點(diǎn)最好保持單一,減少無關(guān)物體的干擾;
(3)拍攝光源問題。選取的拍照場地要保證合適的亮度(具體情況要試才知道你們的光源是否達(dá)標(biāo)),還有就是移動相機(jī)的時候也要保證前一時刻和此時刻的光源一致性。
其實(shí),單目三維重建的效果確實(shí)一般,就算將各方面情況都拉滿,可能得到的重建效果也不是特別好?;蛘呶覀兛梢钥紤]采用雙目三維重建,雙目三維重建效果肯定是要比單目的效果好的,在實(shí)現(xiàn)是也就麻煩一(億)點(diǎn)點(diǎn),哈哈。其實(shí)也沒有多太多的操作,主要就是整兩個相機(jī)拍攝和標(biāo)定兩個相機(jī)麻煩點(diǎn),其他的都還好。
import cv2 import json import numpy as np import glob from PIL import Image import matplotlib.pyplot as plt plt.rcParams['font.sans-serif'] = ['SimHei'] plt.rcParams['axes.unicode_minus'] = False def cart2hom(arr): """ Convert catesian to homogenous points by appending a row of 1s :param arr: array of shape (num_dimension x num_points) :returns: array of shape ((num_dimension+1) x num_points) """ if arr.ndim == 1: return np.hstack([arr, 1]) return np.asarray(np.vstack([arr, np.ones(arr.shape[1])])) def compute_P_from_essential(E): """ Compute the second camera matrix (assuming P1 = [I 0]) from an essential matrix. E = [t]R :returns: list of 4 possible camera matrices. """ U, S, V = np.linalg.svd(E) # Ensure rotation matrix are right-handed with positive determinant if np.linalg.det(np.dot(U, V)) < 0: V = -V # create 4 possible camera matrices (Hartley p 258) W = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]]) P2s = [np.vstack((np.dot(U, np.dot(W, V)).T, U[:, 2])).T, np.vstack((np.dot(U, np.dot(W, V)).T, -U[:, 2])).T, np.vstack((np.dot(U, np.dot(W.T, V)).T, U[:, 2])).T, np.vstack((np.dot(U, np.dot(W.T, V)).T, -U[:, 2])).T] return P2s def correspondence_matrix(p1, p2): p1x, p1y = p1[:2] p2x, p2y = p2[:2] return np.array([ p1x * p2x, p1x * p2y, p1x, p1y * p2x, p1y * p2y, p1y, p2x, p2y, np.ones(len(p1x)) ]).T return np.array([ p2x * p1x, p2x * p1y, p2x, p2y * p1x, p2y * p1y, p2y, p1x, p1y, np.ones(len(p1x)) ]).T def scale_and_translate_points(points): """ Scale and translate image points so that centroid of the points are at the origin and avg distance to the origin is equal to sqrt(2). :param points: array of homogenous point (3 x n) :returns: array of same input shape and its normalization matrix """ x = points[0] y = points[1] center = points.mean(axis=1) # mean of each row cx = x - center[0] # center the points cy = y - center[1] dist = np.sqrt(np.power(cx, 2) + np.power(cy, 2)) scale = np.sqrt(2) / dist.mean() norm3d = np.array([ [scale, 0, -scale * center[0]], [0, scale, -scale * center[1]], [0, 0, 1] ]) return np.dot(norm3d, points), norm3d def compute_image_to_image_matrix(x1, x2, compute_essential=False): """ Compute the fundamental or essential matrix from corresponding points (x1, x2 3*n arrays) using the 8 point algorithm. Each row in the A matrix below is constructed as [x'*x, x'*y, x', y'*x, y'*y, y', x, y, 1] """ A = correspondence_matrix(x1, x2) # compute linear least square solution U, S, V = np.linalg.svd(A) F = V[-1].reshape(3, 3) # constrain F. Make rank 2 by zeroing out last singular value U, S, V = np.linalg.svd(F) S[-1] = 0 if compute_essential: S = [1, 1, 0] # Force rank 2 and equal eigenvalues F = np.dot(U, np.dot(np.diag(S), V)) return F def compute_normalized_image_to_image_matrix(p1, p2, compute_essential=False): """ Computes the fundamental or essential matrix from corresponding points using the normalized 8 point algorithm. :input p1, p2: corresponding points with shape 3 x n :returns: fundamental or essential matrix with shape 3 x 3 """ n = p1.shape[1] if p2.shape[1] != n: raise ValueError('Number of points do not match.') # preprocess image coordinates p1n, T1 = scale_and_translate_points(p1) p2n, T2 = scale_and_translate_points(p2) # compute F or E with the coordinates F = compute_image_to_image_matrix(p1n, p2n, compute_essential) # reverse preprocessing of coordinates # We know that P1' E P2 = 0 F = np.dot(T1.T, np.dot(F, T2)) return F / F[2, 2] def compute_fundamental_normalized(p1, p2): return compute_normalized_image_to_image_matrix(p1, p2) def compute_essential_normalized(p1, p2): return compute_normalized_image_to_image_matrix(p1, p2, compute_essential=True) def skew(x): """ Create a skew symmetric matrix *A* from a 3d vector *x*. Property: np.cross(A, v) == np.dot(x, v) :param x: 3d vector :returns: 3 x 3 skew symmetric matrix from *x* """ return np.array([ [0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0] ]) def reconstruct_one_point(pt1, pt2, m1, m2): """ pt1 and m1 * X are parallel and cross product = 0 pt1 x m1 * X = pt2 x m2 * X = 0 """ A = np.vstack([ np.dot(skew(pt1), m1), np.dot(skew(pt2), m2) ]) U, S, V = np.linalg.svd(A) P = np.ravel(V[-1, :4]) return P / P[3] def linear_triangulation(p1, p2, m1, m2): """ Linear triangulation (Hartley ch 12.2 pg 312) to find the 3D point X where p1 = m1 * X and p2 = m2 * X. Solve AX = 0. :param p1, p2: 2D points in homo. or catesian coordinates. Shape (3 x n) :param m1, m2: Camera matrices associated with p1 and p2. Shape (3 x 4) :returns: 4 x n homogenous 3d triangulated points """ num_points = p1.shape[1] res = np.ones((4, num_points)) for i in range(num_points): A = np.asarray([ (p1[0, i] * m1[2, :] - m1[0, :]), (p1[1, i] * m1[2, :] - m1[1, :]), (p2[0, i] * m2[2, :] - m2[0, :]), (p2[1, i] * m2[2, :] - m2[1, :]) ]) _, _, V = np.linalg.svd(A) X = V[-1, :4] res[:, i] = X / X[3] return res def writetofile(dict, path): for index, item in enumerate(dict): dict[item] = np.array(dict[item]) dict[item] = dict[item].tolist() js = json.dumps(dict) with open(path, 'w') as f: f.write(js) print("參數(shù)已成功保存到文件") def readfromfile(path): with open(path, 'r') as f: js = f.read() mydict = json.loads(js) print("參數(shù)讀取成功") return mydict def camera_calibration(SaveParamPath, ImagePath): # 循環(huán)中斷 criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001) # 棋盤格尺寸 row = 11 column = 8 objpoint = np.zeros((row * column, 3), np.float32) objpoint[:, :2] = np.mgrid[0:row, 0:column].T.reshape(-1, 2) objpoints = [] # 3d point in real world space imgpoints = [] # 2d points in image plane. batch_images = glob.glob(ImagePath + '/*.jpg') for i, fname in enumerate(batch_images): img = cv2.imread(batch_images[i]) imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # find chess board corners ret, corners = cv2.findChessboardCorners(imgGray, (row, column), None) # if found, add object points, image points (after refining them) if ret: objpoints.append(objpoint) corners2 = cv2.cornerSubPix(imgGray, corners, (11, 11), (-1, -1), criteria) imgpoints.append(corners2) # Draw and display the corners img = cv2.drawChessboardCorners(img, (row, column), corners2, ret) cv2.imwrite('Checkerboard_Image/Temp_JPG/Temp_' + str(i) + '.jpg', img) print("成功提取:", len(batch_images), "張圖片角點(diǎn)!") ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, imgGray.shape[::-1], None, None) dict = {'ret': ret, 'mtx': mtx, 'dist': dist, 'rvecs': rvecs, 'tvecs': tvecs} writetofile(dict, SaveParamPath) meanError = 0 for i in range(len(objpoints)): imgpoints2, _ = cv2.projectPoints(objpoints[i], rvecs[i], tvecs[i], mtx, dist) error = cv2.norm(imgpoints[i], imgpoints2, cv2.NORM_L2) / len(imgpoints2) meanError += error print("total error: ", meanError / len(objpoints)) def epipolar_geometric(Images_Path, K): IMG = glob.glob(Images_Path) img1, img2 = cv2.imread(IMG[0]), cv2.imread(IMG[1]) img1_gray = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY) img2_gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY) # Initiate SURF detector SURF = cv2.xfeatures2d_SURF.create() # compute keypoint & descriptions keypoint1, descriptor1 = SURF.detectAndCompute(img1_gray, None) keypoint2, descriptor2 = SURF.detectAndCompute(img2_gray, None) print("角點(diǎn)數(shù)量:", len(keypoint1), len(keypoint2)) # Find point matches bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True) matches = bf.match(descriptor1, descriptor2) print("匹配點(diǎn)數(shù)量:", len(matches)) src_pts = np.asarray([keypoint1[m.queryIdx].pt for m in matches]) dst_pts = np.asarray([keypoint2[m.trainIdx].pt for m in matches]) # plot knn_image = cv2.drawMatches(img1_gray, keypoint1, img2_gray, keypoint2, matches[:-1], None, flags=2) image_ = Image.fromarray(np.uint8(knn_image)) image_.save("MatchesImage.jpg") # Constrain matches to fit homography retval, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 100.0) # We select only inlier points points1 = src_pts[mask.ravel() == 1] points2 = dst_pts[mask.ravel() == 1] points1 = cart2hom(points1.T) points2 = cart2hom(points2.T) # plot fig, ax = plt.subplots(1, 2) ax[0].autoscale_view('tight') ax[0].imshow(cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)) ax[0].plot(points1[0], points1[1], 'r.') ax[1].autoscale_view('tight') ax[1].imshow(cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)) ax[1].plot(points2[0], points2[1], 'r.') plt.savefig('MatchesPoints.jpg') fig.show() # points1n = np.dot(np.linalg.inv(K), points1) points2n = np.dot(np.linalg.inv(K), points2) E = compute_essential_normalized(points1n, points2n) print('Computed essential matrix:', (-E / E[0][1])) P1 = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]) P2s = compute_P_from_essential(E) ind = -1 for i, P2 in enumerate(P2s): # Find the correct camera parameters d1 = reconstruct_one_point(points1n[:, 0], points2n[:, 0], P1, P2) # Convert P2 from camera view to world view P2_homogenous = np.linalg.inv(np.vstack([P2, [0, 0, 0, 1]])) d2 = np.dot(P2_homogenous[:3, :4], d1) if d1[2] > 0 and d2[2] > 0: ind = i P2 = np.linalg.inv(np.vstack([P2s[ind], [0, 0, 0, 1]]))[:3, :4] Points3D = linear_triangulation(points1n, points2n, P1, P2) return Points3D def main(): CameraParam_Path = 'CameraParam.txt' CheckerboardImage_Path = 'Checkerboard_Image' Images_Path = 'SubstitutionCalibration_Image/*.jpg' # 計算相機(jī)參數(shù) camera_calibration(CameraParam_Path, CheckerboardImage_Path) # 讀取相機(jī)參數(shù) config = readfromfile(CameraParam_Path) K = np.array(config['mtx']) # 計算3D點(diǎn) Points3D = epipolar_geometric(Images_Path, K) # 重建3D點(diǎn) fig = plt.figure() fig.suptitle('3D reconstructed', fontsize=16) ax = fig.gca(projection='3d') ax.plot(Points3D[0], Points3D[1], Points3D[2], 'b.') ax.set_xlabel('x axis') ax.set_ylabel('y axis') ax.set_zlabel('z axis') ax.view_init(elev=135, azim=90) plt.savefig('Reconstruction.jpg') plt.show() if __name__ == '__main__': main()
到此,關(guān)于“基于python怎么實(shí)現(xiàn)單目三維重建”的學(xué)習(xí)就結(jié)束了,希望能夠解決大家的疑惑。理論與實(shí)踐的搭配能更好的幫助大家學(xué)習(xí),快去試試吧!若想繼續(xù)學(xué)習(xí)更多相關(guān)知識,請繼續(xù)關(guān)注億速云網(wǎng)站,小編會繼續(xù)努力為大家?guī)砀鄬?shí)用的文章!
免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場,如果涉及侵權(quán)請聯(lián)系站長郵箱:is@yisu.com進(jìn)行舉報,并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。