This article walks through a case study on controlling the mouse with your eyes using OpenCV. The explanation is kept simple, so follow along step by step.
Before starting the project, we need to import the third-party libraries.
# For monitoring the web camera and performing image manipulations
import cv2
# For performing array operations
import numpy as np
# For creating and removing directories
import os
import shutil
# For recognizing and performing actions on mouse presses
from pynput.mouse import Listener
First, let's look at how pynput's Listener works. pynput.mouse.Listener creates a background thread that records mouse movements and mouse clicks. Here is a simplified version that prints the mouse coordinates whenever you press a mouse button:
from pynput.mouse import Listener

def on_click(x, y, button, pressed):
    """
    Args:
        x: the x-coordinate of the mouse
        y: the y-coordinate of the mouse
        button: a pynput.mouse.Button value, e.g. Button.left or Button.right
        pressed: True if the button was pressed, False if it was released
    """
    if pressed:
        print(x, y)

with Listener(on_click=on_click) as listener:
    listener.join()
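Now let's extend this framework to suit our purpose. First, though, we need to write the code that crops the bounding box of the eyes; we will call this function inside on_click later. We use Haar cascade object detection to determine the bounding boxes of the user's eyes. You can download the detector file (the code loads it as haarcascade_eye.xml). Let's do a simple demonstration to show how it works: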
import cv2
import numpy as np

# Load the cascade classifier detection object
cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
# Turn on the web camera
video_capture = cv2.VideoCapture(0)
# Read data from the web camera (get the frame)
_, frame = video_capture.read()
# Convert the image to grayscale
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
# Predict the bounding boxes of the eyes
boxes = cascade.detectMultiScale(gray, 1.3, 10)
# Filter out frames taken from a bad angle or with detection errors:
# we want to make sure both eyes were detected, and nothing else
if len(boxes) == 2:
    eyes = []
    for box in boxes:
        # Get the rectangle parameters for the detected eye
        x, y, w, h = box
        # Crop the bounding box from the frame
        eye = frame[y:y + h, x:x + w]
        # Resize the crop to 32x32
        eye = cv2.resize(eye, (32, 32))
        # Normalize to [0, 1]
        eye = (eye - eye.min()) / (eye.max() - eye.min())
        # Further crop to just around the eyeball
        eye = eye[10:-10, 5:-5]
        # Scale to [0, 255] and convert to an integer datatype
        eye = (eye * 255).astype(np.uint8)
        # Add the current eye to the list of 2 eyes
        eyes.append(eye)
    # Concatenate the two eye images into one
    eyes = np.hstack(eyes)
Now let's use this knowledge to write a function for cropping the eye images. First, we need a helper function for normalization:
def normalize(x):
    minn, maxx = x.min(), x.max()
    return (x - minn) / (maxx - minn)
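Min-max normalization divides by the difference between the maximum and the minimum, so a completely uniform crop (a fully dark frame, for instance) would divide by zero. A guarded variant might look like the sketch below; the name safe_normalize and the eps threshold are assumptions added for illustration and are not part of the original code.

```python
import numpy as np

def safe_normalize(x, eps=1e-8):
    """Min-max normalize x to [0, 1]; return zeros if x is (nearly) constant."""
    x = np.asarray(x, dtype=np.float64)
    lo, hi = x.min(), x.max()
    span = hi - lo
    if span < eps:
        # Constant input: no contrast to rescale, so avoid dividing by zero
        return np.zeros_like(x)
    return (x - lo) / span

print(safe_normalize(np.array([10, 20, 30])))  # [0.  0.5 1. ]
print(safe_normalize(np.full((2, 2), 7)))      # all zeros
```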
Here is our eye-cropping function. If both eyes are found, it returns the image; otherwise it returns None:
def scan(image_size=(32, 32)):
    ok, frame = video_capture.read()
    # Bail out if the camera did not deliver a frame
    if not ok:
        return None
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, 1.3, 10)
    # Only accept frames where exactly two eyes were detected
    if len(boxes) == 2:
        eyes = []
        for box in boxes:
            x, y, w, h = box
            eye = frame[y:y + h, x:x + w]
            eye = cv2.resize(eye, image_size)
            eye = normalize(eye)
            eye = eye[10:-10, 5:-5]
            eyes.append(eye)
        return (np.hstack(eyes) * 255).astype(np.uint8)
    else:
        return None
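To see why the returned images always have the same size, the slicing in scan can be traced by hand. The helper below is a hypothetical illustration (cropped_shape and its parameters do not appear in the original code); it only reproduces the index arithmetic of eye[10:-10, 5:-5] followed by a horizontal stack of two eyes.

```python
def cropped_shape(image_size=(32, 32), row_trim=10, col_trim=5, n_eyes=2):
    """Return (rows, cols) of the stacked eye image produced by the slicing in scan.

    Hypothetical helper for illustration only: it mirrors eye[10:-10, 5:-5]
    applied to each resized eye, then np.hstack over the eyes.
    """
    width, height = image_size      # cv2.resize takes (width, height)
    rows = height - 2 * row_trim    # rows left after trimming top and bottom
    cols = width - 2 * col_trim     # columns left after trimming left and right
    return rows, cols * n_eyes      # hstack places the eyes side by side

print(cropped_shape())  # (12, 44)
```

With the defaults this gives 12 rows by 44 columns, which together with the 3 color channels matches the input_shape = (12, 44, 3) that the model uses later in the article.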
Now let's write the automation that runs every time a mouse button is pressed. (Assume that root has already been defined earlier in the code as the directory where we want to store the images.)
def on_click(x, y, button, pressed):
    # If the action was a mouse PRESS (not a RELEASE)
    if pressed:
        # Crop the eyes
        eyes = scan()
        # If the function returned None, something went wrong
        if eyes is not None:
            # Save the image; os.path.join works whether or not root ends in a separator
            filename = os.path.join(root, "{} {} {}.jpeg".format(x, y, button))
            cv2.imwrite(filename, eyes)
Now we can bring back the pynput Listener and put together the full implementation:
import cv2
import numpy as np
import os
import shutil
from pynput.mouse import Listener

root = input("Enter the directory to store the images: ")
if os.path.isdir(root):
    resp = ""
    while resp not in ["Y", "N"]:
        resp = input("This directory already exists. If you continue, the contents of the existing directory will be deleted. If you would still like to proceed, enter [Y]. Otherwise, enter [N]: ")
    if resp == "Y":
        shutil.rmtree(root)
    else:
        exit()
os.mkdir(root)

# Normalization helper function
def normalize(x):
    minn, maxx = x.min(), x.max()
    return (x - minn) / (maxx - minn)

# Eye cropping function
def scan(image_size=(32, 32)):
    ok, frame = video_capture.read()
    # Bail out if the camera did not deliver a frame
    if not ok:
        return None
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, 1.3, 10)
    if len(boxes) == 2:
        eyes = []
        for box in boxes:
            x, y, w, h = box
            eye = frame[y:y + h, x:x + w]
            eye = cv2.resize(eye, image_size)
            eye = normalize(eye)
            eye = eye[10:-10, 5:-5]
            eyes.append(eye)
        return (np.hstack(eyes) * 255).astype(np.uint8)
    else:
        return None

def on_click(x, y, button, pressed):
    # If the action was a mouse PRESS (not a RELEASE)
    if pressed:
        # Crop the eyes
        eyes = scan()
        # If the function returned None, something went wrong
        if eyes is not None:
            # Save the image
            filename = os.path.join(root, "{} {} {}.jpeg".format(x, y, button))
            cv2.imwrite(filename, eyes)

cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
video_capture = cv2.VideoCapture(0)

with Listener(on_click=on_click) as listener:
    listener.join()
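When you run this, every mouse click (provided both eyes are in view) automatically crops the eyes from the webcam frame and saves the image to the chosen directory. The image's filename contains the mouse coordinates and whether it was a right click or a left click.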
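Because the label for each image lives entirely in its filename, it is worth checking that the name can be parsed back. The sketch below uses a stand-in Button enum so that it runs without pynput; this assumes that str(button) for a real pynput button looks like Button.left. The helpers make_name and parse_name are hypothetical names introduced for illustration.

```python
from enum import Enum

class Button(Enum):
    """Stand-in for pynput.mouse.Button (assumption, for illustration only)."""
    left = 1
    right = 2

def make_name(x, y, button):
    # Same format string as the one used in on_click
    return "{} {} {}.jpeg".format(x, y, button)

def parse_name(filename):
    # Split on spaces: x, y, and the button part (ignored when training)
    x, y, _ = filename.split(" ")
    return float(x), float(y)

name = make_name(385, 686, Button.left)
print(name)              # 385 686 Button.left.jpeg
print(parse_name(name))  # (385.0, 686.0)
```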
Here is a sample image. In this image, I left-clicked at coordinates (385, 686) on a monitor with a resolution of 2560x1440:
The cascade classifier is very accurate; so far I have not seen any errors in my own data directory. Now let's write the code for training a neural network that predicts the mouse position given an image of your eyes.
import numpy as np
import os
import cv2
import pyautogui
# Import only the names used below instead of wildcard imports
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dense, Flatten
Now let's add the cascade classifier:
cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
video_capture = cv2.VideoCapture(0)
Normalization:
def normalize(x):
    minn, maxx = x.min(), x.max()
    return (x - minn) / (maxx - minn)
Capturing the eyes:
def scan(image_size=(32, 32)):
    ok, frame = video_capture.read()
    # Bail out if the camera did not deliver a frame
    if not ok:
        return None
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, 1.3, 10)
    if len(boxes) == 2:
        eyes = []
        for box in boxes:
            x, y, w, h = box
            eye = frame[y:y + h, x:x + w]
            eye = cv2.resize(eye, image_size)
            eye = normalize(eye)
            eye = eye[10:-10, 5:-5]
            eyes.append(eye)
        return (np.hstack(eyes) * 255).astype(np.uint8)
    else:
        return None
Let's define the dimensions of the display. You must change the following parameters to match the resolution of your own screen:
# Note that there are actually 2560x1440 pixels on my screen.
# I am simply recording one less, so that when we divide by these
# numbers, we will normalize between 0 and 1. Note that mouse
# coordinates are reported starting at (0, 0), not (1, 1).
width, height = 2559, 1439
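The effect of dividing by one less than the pixel count can be checked on the corner coordinates. This is a small sketch; to_unit and to_pixels are hypothetical helper names, and the values are those of the author's 2560x1440 example screen.

```python
width, height = 2559, 1439  # one less than the 2560x1440 pixel counts

def to_unit(x, y):
    """Map pixel coordinates to the [0, 1] range used as training targets."""
    return x / width, y / height

def to_pixels(u, v):
    """Map a value in [0, 1] back to the nearest pixel coordinates."""
    return round(u * width), round(v * height)

print(to_unit(0, 0))                  # (0.0, 0.0)
print(to_unit(2559, 1439))            # (1.0, 1.0)
print(to_pixels(*to_unit(385, 686)))  # (385, 686)
```

The top-left pixel maps to exactly 0 and the bottom-right pixel to exactly 1, which suits a model whose last layer is a sigmoid.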
Now let's load the data (again, assuming you have already defined root). We don't care whether the click was a right click or a left click, because our goal is only to predict the mouse position:
filepaths = os.listdir(root)
X, Y = [], []
for filepath in filepaths:
    # Filenames look like "<x> <y> <button>.jpeg"; the button part is ignored
    x, y, _ = filepath.split(' ')
    x = float(x) / width
    y = float(y) / height
    X.append(cv2.imread(os.path.join(root, filepath)))
    Y.append([x, y])
# Scale pixel values to [0, 1]
X = np.array(X) / 255.0
Y = np.array(Y)
print(X.shape, Y.shape)
Let's define our model architecture:
model = Sequential()
# Positional arguments of Conv2D are filters, kernel_size, strides
model.add(Conv2D(32, 3, 2, activation='relu', input_shape=(12, 44, 3)))
model.add(Conv2D(64, 2, 2, activation='relu'))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
# Two sigmoid outputs: the normalized x and y of the mouse
model.add(Dense(2, activation='sigmoid'))
model.compile(optimizer="adam", loss="mean_squared_error")
model.summary()
Here is our summary:
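The shapes and parameter counts that model.summary() reports can also be derived by hand. The sketch below assumes Keras's default 'valid' padding for Conv2D; conv_out, conv_params, and dense_params are hypothetical helper names used only for this arithmetic.

```python
def conv_out(size, kernel, stride):
    """Output length of a 'valid' convolution along one dimension."""
    return (size - kernel) // stride + 1

def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a square-kernel Conv2D layer."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weights plus biases of a Dense layer."""
    return n_in * n_out + n_out

h, w, c = 12, 44, 3                      # input_shape of the model

# Conv2D(32, 3, 2): 32 filters, 3x3 kernel, stride 2
h, w = conv_out(h, 3, 2), conv_out(w, 3, 2)
shape1, p1 = (h, w, 32), conv_params(3, c, 32)

# Conv2D(64, 2, 2): 64 filters, 2x2 kernel, stride 2
h, w = conv_out(h, 2, 2), conv_out(w, 2, 2)
shape2, p2 = (h, w, 64), conv_params(2, 32, 64)

flat = h * w * 64                        # size after Flatten
p3 = dense_params(flat, 32)              # Dense(32)
p4 = dense_params(32, 2)                 # Dense(2)
total = p1 + p2 + p3 + p4

print(shape1, shape2, flat)   # (5, 21, 32) (2, 10, 64) 1280
print(p1, p2, p3, p4, total)  # 896 8256 40992 66 50210
```

Most of the roughly 50,000 parameters sit in the first Dense layer, so the network is small enough to train quickly on a few hundred images.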
The next task is to train the model. We will add some noise to the image data:
epochs = 200
for epoch in range(epochs):
    # Each call to fit runs one pass over the data
    model.fit(X, Y, batch_size=32)
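The training loop as written fits on the unmodified X on every pass. One possible way to add the noise that the text mentions is sketched below. The add_noise helper, the Gaussian distribution, and the scale of 0.02 are all assumptions made for illustration, not the author's settings.

```python
import numpy as np

def add_noise(images, scale=0.02, rng=None):
    """Return a noisy copy of images, clipped to stay within [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    # Independent Gaussian noise per pixel (assumed noise model)
    noise = rng.normal(0.0, scale, size=images.shape)
    return np.clip(images + noise, 0.0, 1.0)

# Stand-in batch with the same layout as X: (samples, 12, 44, 3), values in [0, 1]
X_demo = np.random.default_rng(0).random((4, 12, 44, 3))
X_noisy = add_noise(X_demo, rng=np.random.default_rng(1))
print(X_noisy.shape)  # (4, 12, 44, 3)
```

Inside the loop this could be used as model.fit(add_noise(X), Y, batch_size=32), so that each pass sees a slightly different version of the images.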
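Now let's use our model to move the mouse in real time. Note that this needs a lot of data to work well. As a proof of concept, though, you will notice that with only about 200 images it does move the mouse to the general area you are looking at. Of course, unless you have much more data, it is not really controllable.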
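while True:
    eyes = scan()
    if eyes is not None:
        # Scale to [0, 1] and add a batch dimension
        eyes = np.expand_dims(eyes / 255.0, axis=0)
        x, y = model.predict(eyes)[0]
        # Convert the normalized prediction back to screen coordinates
        pyautogui.moveTo(x * width, y * height)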
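Here is a proof-of-concept example. Note that we had trained on very little data before making this screen recording. This is a video of our mouse moving automatically to the terminal application window based on eye movement. As I said, the cursor is jumpy because there is so little data. With more data, it should hopefully become stable enough to be controlled with higher precision. With only a few hundred images, you can only move it to the general region you are looking at. Also, if you never captured any images for a particular region of the screen (the edges, for example) during data collection, the model is unlikely to ever predict a position in that region.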