
Searching Across Different Python Data Types

Published: 2020-02-29 16:34:50  Source: web  Views: 153  Author: 樂趣碼農  Category: Programming Languages

Language: Python 3.7.2

System: Win10 Ver. 10.0.17763

Topic: 004.01 Searching across different Python data types
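While working on a recent data search-and-compare project, I found that searching and comparing large amounts of data became extremely slow, unacceptably so. I wanted results 'immediately', yet I ended up waiting several hours. Ugh! Python certainly cannot match C or assembly language, but I still need to find a way to speed things up. Below are several ways to look up 10,000 values in a data set of 10,000 records, together with the time each one takes. The last method, index_list_sort(), is much faster, but I still don't think it is fast enough, and this is only an integer search. What about strings? What about substrings? If you know a better method, please share. Thanks!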
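On the string and substring question, here is a minimal sketch rather than a benchmarked answer. It assumes exact string matches can reuse the dictionary idea from the integer tests, while substring matches fall back to a plain linear scan. The sample list `names` and the helpers `find_exact` and `find_substring` are hypothetical and only serve as illustration.

```python
# Hypothetical sample data; the timings in this post only cover integers.
names = ["alpha", "beta", "gamma", "delta"]

# Exact match: map each string to its first position, built once up front.
position = {}
for i, name in enumerate(names):
    position.setdefault(name, i)

def find_exact(target):
    # Hash lookup; returns -1 when the string is not present.
    return position.get(target, -1)

def find_substring(fragment):
    # Linear scan: positions of every string that contains the fragment.
    return [i for i, name in enumerate(names) if fragment in name]

print(find_exact("gamma"))     # 2
print(find_exact("omega"))     # -1
print(find_substring("ta"))    # [1, 3]
```

The exact-match lookup stays fast as the list grows because the dictionary is built only once. The substring scan still touches every string on every query, so it is the part that would need a different data structure for large inputs.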

Results:

0:00:04.734338 : index_sequence
0:00:01.139984 : index_list
0:00:00.330116 : index_np
0:00:00.233343 : index_np_sort
0:00:00.223401 : index_dict
0:00:00.213462 : index_set
0:00:00.007977 : index_list_sort

Code:

from datetime import datetime
import numpy as np
import bisect
import time
import random
import inspect
import copy

size        = 10000
db          = random.sample(range(size), size)   # unsorted data to search in
db_sort     = sorted(db)                         # sorted copy for binary search
db_set      = set(db)
db_dict     = {db[i]:i for i in range(size)}     # value -> position in db
db_np       = np.array(db)
value       = list(range(size))                  # values to look up

def call(func):
    # Call function and calculate execution time, then print duration and function name
    start_time = datetime.now()
    func()
    print(datetime.now() - start_time,':',func.__name__)

def do_something():
    # Placeholder for the work done on each match; it adds the same cost per hit to every method
    for i in range(1000):
        pass

def index_sequence():
    # Unsorted list, plain nested loops with no built-in search helper
    for i in range(size):
        for j in range(size):
            if db[j] == value[i]:
                index = j
                do_something()
                break

def index_list():
    # Unsorted list, use list.index()
    for i in range(size):
        try:
            index = db.index(value[i])
        except ValueError:
            # list.index() raises ValueError when the value is absent
            index = -1
        if index >= 0:
            do_something()

def index_np():
    # NumPy array, use np.where()
    for i in range(size):
        result = np.where(db_np==value[i])
        if len(result[0])!=0:
            do_something()

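db_np_sort = np.sort(db_np)   # np.searchsorted() requires sorted input

def index_np_sort():
    # Sorted NumPy array, binary search with np.searchsorted()
    for i in range(size):
        pos = np.searchsorted(db_np_sort, value[i])
        # pos is only an insertion point, so confirm the value is really there
        if pos < size and db_np_sort[pos] == value[i]:
            do_something()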

def index_list_sort():
    # Sorted list, binary search with the bisect module
    for i in range(size):
        index = bisect.bisect_left(db_sort, value[i])
        # bisect_left returns an insertion point, so confirm the match
        if index < size and db_sort[index] == value[i]:
            do_something()

def index_set():
    # Set search
    for i in range(size):
        if value[i] in db_set:
            do_something()

def index_dict():
    # Dictionary search
    for i in range(size):
        try:
            index = db_dict[value[i]]
        except KeyError:
            # A missing key raises KeyError
            index = -1
        if index >= 0:
            do_something()

# Test execution time

call(index_sequence)
call(index_list)
call(index_np)
call(index_np_sort)
call(index_dict)
call(index_set)
call(index_list_sort)
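One caveat with the sorted-list approach: bisect returns a position in the sorted copy, not in the original data. The sketch below shows one possible way to recover the original position, assuming it is acceptable to keep a second list of positions alongside the sorted keys. The helpers `build_sorted_index` and `find` are hypothetical names, not part of the test script above.

```python
import bisect
import random

def build_sorted_index(data):
    # Pair each value with its original position, then sort by value.
    pairs = sorted((v, i) for i, v in enumerate(data))
    keys = [v for v, _ in pairs]          # sorted values for bisect
    positions = [i for _, i in pairs]     # matching original positions
    return keys, positions

def find(keys, positions, target):
    # Binary search; returns the original position, or -1 if absent.
    pos = bisect.bisect_left(keys, target)
    if pos < len(keys) and keys[pos] == target:
        return positions[pos]
    return -1

data = random.sample(range(1000), 1000)
keys, positions = build_sorted_index(data)

# Every lookup should agree with list.index() on the unsorted data.
assert all(find(keys, positions, v) == data.index(v) for v in data)
print(find(keys, positions, 1000))   # 1000 is not in data, prints -1
```

Sorting costs O(n log n) once, after which each lookup is O(log n). Whether that beats a dict or set depends on the data, so it is worth timing on the real workload.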