<font id="xxjwf"></font>

<big id="xxjwf"></big>

溫馨提示×

溫馨提示×

您好，登錄后才能下訂單哦！

密碼登錄×

忘記密碼？

登錄注冊(cè)×

獲取短信驗(yàn)證碼

其他方式登錄

點(diǎn)擊登錄注冊(cè) 即表示同意《億速云用戶服務(wù)條款》

用戶登錄×

賬戶密碼登錄

請(qǐng)使用微信掃描上方二維碼

使用幫助

請(qǐng)求超時(shí)！

請(qǐng)點(diǎn)擊重新獲取二維碼

C語言怎么用堆解決Topk問題

發(fā)布時(shí)間：2021-12-02 11:45:52 來源：億速云閱讀：205 作者：小新欄目：開發(fā)技術(shù)

這篇文章給大家分享的是有關(guān)C語言怎么用堆解決Topk問題的內(nèi)容。小編覺得挺實(shí)用的，因此分享給大家做個(gè)參考，一起跟隨小編過來看看吧。

前言

將詳細(xì)講解如何利用小根堆的方法解決TopK問題，這么多數(shù)據(jù)要處理，

該算法時(shí)間復(fù)度居然只需

C語言怎么用堆解決Topk問題

TopK問題

TopK問題介紹：在N個(gè)數(shù)中找出最大的前K個(gè) （比如在1000個(gè)數(shù)中找出最大的前10個(gè)）

解題方法

方法1：先排降序，前N個(gè)就是最大的。

時(shí)間復(fù)雜度：

C語言怎么用堆解決Topk問題

方法2：N個(gè)數(shù)依次插入大堆，HeapPop K次，每次取堆頂?shù)臄?shù)據(jù)，即為前K個(gè)。

時(shí)間復(fù)雜度：

C語言怎么用堆解決Topk問題

假設(shè)N非常大，N是10億，內(nèi)存中存不下這些數(shù)，它們存在文件中的。K是100，方法1 和方法2 就都不能用了……

話說 10 億個(gè)整數(shù)，大概占用多少空間？

1G = 1024MB

1G = 1024*1024KB

1G = 1024*1024*1024Byte

要占用10億字節(jié)！所以我們來看看方法3。

方法3：

① 用前個(gè)K數(shù)建立一個(gè)K個(gè)數(shù)的小堆。

② 剩下的N-K個(gè)數(shù)，依次跟堆頂?shù)臄?shù)據(jù)進(jìn)行比較。如果比堆頂?shù)臄?shù)據(jù)大，就替換堆頂?shù)臄?shù)據(jù)，再向下調(diào)整。

③ 最后堆里面的K個(gè)數(shù)就是最大的K個(gè)數(shù)。

時(shí)間復(fù)雜度：

C語言怎么用堆解決Topk問題

這里為什么使用小堆而不使用大堆？

最大的前K個(gè)數(shù)一定會(huì)比其他數(shù)要大，只要進(jìn)來的數(shù)比堆頂數(shù)據(jù)大，就替代它。因?yàn)槭切《眩ㄐ〉脑谏洗蟮脑谙拢?，最大的?shù)進(jìn)去后一定會(huì)沉到下面，所以不可能存在大的數(shù)堵在堆頂導(dǎo)致某個(gè)數(shù)進(jìn)不去的情況，數(shù)越大沉得越深。對(duì)應(yīng)地，如果使用大堆就會(huì)出現(xiàn)一個(gè)大數(shù)堵在堆頂，剩下的數(shù)都比這個(gè)大數(shù)小，導(dǎo)致其他數(shù)進(jìn)不來，最后只能選出最大的那一個(gè)。

代碼實(shí)現(xiàn)與講解

由于還沒開始講 C++ ，這里我們沒法用優(yōu)先級(jí)隊(duì)列，我們得手動(dòng)自己寫一個(gè)堆來使用。當(dāng)然，如果自己懶得寫，以下是 C語言實(shí)現(xiàn)堆的代碼。

Heap.h

#define _CRT_SECURE_NO_WARNINGS 1
#pragma once
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <stdbool.h>
 
typedef int HPDataType;
 
typedef struct Heap {
    HPDataType* array;  //指向動(dòng)態(tài)開辟的數(shù)組
    int size;           //有效數(shù)據(jù)的個(gè)數(shù)
    int capacity;       //容量空間的大小
} HP;
 
/* 堆的初始化 */
void HeapInit(HP* php);
 
/* 堆的銷毀 */
void HeapDestroy(HP* php);
 
/* 堆的打印 */
void HeapPrint(HP* php);
 
/* 判斷堆是否為空 */
bool HeapIfEmpty(HP* hp);
 
/* 堆的插入 */
void HeapPush(HP* php, HPDataType x);
    /* 檢查容量 */
    void HeapCheckCapacity(HP* php);
        /* 交換函數(shù) */
        void Swap(HPDataType* px, HPDataType* py);
    /* 大根堆上調(diào) */ 
    void BigAdjustUp(int* arr, int child);
    /* 小根堆上調(diào) */ 
    void SmallAdjustUp(int* arr, int child);
 
/* 堆的刪除 */
void HeapPop(HP* php);
    /* 小根堆下調(diào)*/ 
    void SmallAdjustDown(int* arr, int n, int parent);
    /* 大根堆下調(diào) */
    void BigAdjustDown(int* arr, int n, int parent);
 
/* 返回堆頂數(shù)據(jù)*/
HPDataType HeapTop(HP* php);
 
/* 統(tǒng)計(jì)堆的個(gè)數(shù) */
int HeapSize(HP* php);

Heap.c

#define _CRT_SECURE_NO_WARNINGS 1
#include "Heap.h"
 
/* 堆的初始化 */
void HeapInit(HP* php) {
    assert(php);
    php->array = NULL;
    php->size = php->capacity = 0;
}
 
/* 堆的銷毀 */
void HeapDestroy(HP* php) {
    assert(php);
    free(php->array);
    php->capacity = php->size = 0;
}
 
/* 堆的打印 */
void HeapPrint(HP* php) {
    for (int i = 0; i < php->size; i++) {
        printf("%d ", php->array[i]);
    }
    printf("\n");
}
 
/* 判斷堆是否為空 */
bool HeapIfEmpty(HP* php) {
    assert(php);
 
    return php->size == 0; // 如果為size為0則表示堆為空
}
 
/* 堆的插入 */
    /* 檢查容量 */ 
    void HeapCheckCapacity(HP* php) {
        if (php->size == php->capacity) {
            int newCapacity = php->capacity == 0 ? 4 : (php->capacity * 2); //第一次給4，其他情況擴(kuò)2倍
            HPDataType* tmpArray = (HPDataType*)realloc(php->array, sizeof(HPDataType) * newCapacity); // 數(shù)組擴(kuò)容
            if (tmpArray == NULL) {  //檢查realloc
                printf("realloc failed!\n");
                exit(EXIT_FAILURE);
            }
            //更新他們的大小
            php->array = tmpArray;
            php->capacity = newCapacity;
        }
    }
 
        /* 交換函數(shù) */ 
        void Swap(HPDataType* px, HPDataType* py) {
            HPDataType tmp = *px;
            *px = *py;
            *py = tmp;
        } 
 
    /* 大根堆上調(diào) */ 
    void BigAdjustUp(int* arr, int child) {
        assert(arr);
        // 首先根據(jù)公式計(jì)算算出父親的下標(biāo)
        int parent = (child - 1) / 2;
        // 最壞情況:調(diào)到根，child=parent 當(dāng)child為根節(jié)點(diǎn)時(shí)結(jié)束（根節(jié)點(diǎn)永遠(yuǎn)是0）
        while (child > 0) {
            if (arr[child] > arr[parent]) {  // 如果孩子大于父親（不符合堆的性質(zhì)）
                // 交換他們的值
                Swap(&arr[child], &arr[parent]);
                // 往上走
                child = parent;
                parent = (child - 1) / 2;
            } else {  // 如果孩子小于父親（符合堆的性質(zhì)）
            // 跳出循環(huán)
                break;
            }
        }
    }
 
    /* 小根堆上調(diào) */ 
    void SmallAdjustUp(int* arr, int child) {
        assert(arr);
        // 首先根據(jù)公式計(jì)算算出父親的下標(biāo)
        int parent = (child - 1) / 2;
        // 最壞情況:調(diào)到根，child=parent 當(dāng)child為根節(jié)點(diǎn)時(shí)結(jié)束（根節(jié)點(diǎn)永遠(yuǎn)是0）
        while (child > 0) {
            if (arr[child] < arr[parent]) {  // 如果孩子大于父親（不符合堆的性質(zhì)）
                // 交換他們的值
                Swap(&arr[child], &arr[parent]);
                // 往上走
                child = parent;
                parent = (child - 1) / 2;
            } else {  // 如果孩子小于父親（符合堆的性質(zhì)）
            // 跳出循環(huán)
                break;
            }
        }
    }
void HeapPush(HP* php, HPDataType x) {
    assert(php);
    // 檢查是否需要擴(kuò)容
    HeapCheckCapacity(php);
    // 插入數(shù)據(jù)
    php->array[php->size] = x;
    php->size++;
    // 向上調(diào)整 [目標(biāo)數(shù)組，調(diào)整位置的起始位置（剛插入的數(shù)據(jù)）]
    SmallAdjustUp(php->array, php->size - 1);
}
 
/* 堆的刪除 */
 
    /* 小根堆下調(diào)*/ 
    void SmallAdjustDown(int* arr, int n, int parent) {
        int child = parent * 2 + 1; // 默認(rèn)為左孩子
        while (child < n) { // 葉子內(nèi)
            // 選出左右孩子中小的那一個(gè)
            if (child + 1 < n && arr[child + 1] < arr[child]) {
                child++;
            }
            if (arr[child] < arr[parent]) { // 如果孩子小于父親（不符合小堆的性質(zhì)）
                // 交換它們的值
                Swap(&arr[child], &arr[parent]);
                // 往下走
                parent = child;
                child = parent * 2 + 1;
            } else { // 如果孩子大于父親（符合小堆的性質(zhì)）
                // 跳出循環(huán)
                break;
            }
        }
    }
 
    /* 大根堆下調(diào) */
    void BigAdjustDown(int* arr, int n, int parent) {
        int child = parent * 2 + 1; // 默認(rèn)為左孩子
        while (child < n) { // 葉子內(nèi)
            // 選出左右孩子中大的那一個(gè)
            if (child + 1 < n && arr[child + 1] > arr[child]) {
                child++;
            }
            if (arr[child] > arr[parent]) { // 如果孩子大于父親（不符合大堆的性質(zhì)）
                // 交換它們的值
                Swap(&arr[child], &arr[parent]);
                // 往下走
                parent = child;
                child = parent * 2 + 1;
            } else { // 如果孩子小于父親（符合大堆的性質(zhì)）
                // 跳出循環(huán)
                break;
            }
        }
    }
void HeapPop(HP* php) {
    assert(php);
    assert(!HeapIfEmpty(php));
    // 刪除數(shù)據(jù)
    Swap(&php->array[0], &php->array[php->size - 1]);
    php->size--;
    // 向下調(diào)整 [目標(biāo)數(shù)組，數(shù)組的大小，調(diào)整位置的起始位置]
    SmallAdjustDown(php->array, php->size, 0);
}
 
/* 返回堆頂數(shù)據(jù) */
HPDataType HeapTop(HP* php) {
    assert(php);
    assert(!HeapIfEmpty(php));
 
    return php->array[0];
}
 
/* 統(tǒng)計(jì)堆的個(gè)數(shù) */
int HeapSize(HP* php) {
    assert(php);
 
    return php->size;
}

第三種方法的參考代碼：

#define _CRT_SECURE_NO_WARNINGS 1
#include "Heap.h"
 
/* 在N個(gè)數(shù)中找出最大的前K個(gè) */
void PrintTopK(int* arr, int N, int K) {
 
    HP hp;                             // 創(chuàng)建堆
    HeapInit(&hp);                     // 初始化堆
 
    for (int i = 0; i < K; i++) {      // Step1: 創(chuàng)建一個(gè)K個(gè)數(shù)的小堆
        HeapPush(&hp, arr[i]);
    }
 
    for (int i = K; i < N; i++) {      // Step2: 剩下的N-K個(gè)數(shù)跟堆頂?shù)臄?shù)據(jù)比較
        if (arr[i] > HeapTop(&hp)) {   // 如果比堆頂?shù)臄?shù)據(jù)大就替換進(jìn)堆
            HeapPop(&hp);              // 此數(shù)確實(shí)比堆頂大，刪除堆頂
            HeapPush(&hp, arr[i]);     // 將此數(shù)推進(jìn)堆中，數(shù)越大下沉越深
            /* 另一種寫法: 手動(dòng)替換
            hp.array[0] = arr[i];
            SmallAdjustDown(hp.array, hp.size, 0);
            */
        }
    }
    HeapPrint(&hp);                    // 打印K個(gè)數(shù)的堆
    HeapDestroy(&hp);                  // 銷毀堆
}
 
/* 模擬測試 TopK */
void TestTopK() {
    int N = 1000000;
    int* arr = (int*)malloc(sizeof(int) * N);
 
    srand(time(0)); // 置隨機(jī)數(shù)種子
    for(size_t i = 0; i < N; i++) {
        // 產(chǎn)生隨機(jī)數(shù)，每次產(chǎn)生的隨機(jī)數(shù)都mod100w，這樣產(chǎn)生的數(shù)一定是小于100w的
        arr[i] = rand() % 1000000; 
    }
    
    // 再去設(shè)置10個(gè)比100w大的數(shù)
    arr[5] = 1000000 + 1;
	arr[1231] = 1000000 + 2;
	arr[5355] = 1000000 + 3;
	arr[51] = 1000000 + 4;
	arr[15] = 1000000 + 5;
	arr[2335] = 1000000 + 6;
	arr[9999] = 1000000 + 7;
	arr[76] = 1000000 + 8;
	arr[423] = 1000000 + 9;
	arr[3144] = 1000000 + 10;
 
    PrintTopK(arr, N, 10); //測試用，所以給10個(gè)
}
 
int main(void) {
    TestTopK();
 
	return 0;
}

運(yùn)行結(jié)果

C語言怎么用堆解決Topk問題

函數(shù)解讀

PrintTopK 解讀

① 準(zhǔn)備好我們實(shí)現(xiàn)好的堆之后，我們就可以寫TopK的算法了。我們創(chuàng)建一個(gè) PrintTopK 函數(shù)，函數(shù)需要接收存放數(shù)據(jù)的數(shù)組、數(shù)組的大?。∟）和要找前多少個(gè)（K）。

② 首先創(chuàng)建一個(gè)堆，用來存放K 。按照規(guī)矩我們先把 HeapInit（初始化）和 HeapDestroy（銷毀）先寫好，防止自己不小心忘記銷毀。

③ 核心步驟1：創(chuàng)建一個(gè)K個(gè)數(shù)的小堆，我們直接用 for 循環(huán)將數(shù)組中前K個(gè)值先逐個(gè) HeapPush （堆的插入）進(jìn)去。

這里不代表最后的結(jié)果，我們只是相當(dāng)于 "默認(rèn)" 認(rèn)為這前K個(gè)數(shù)是最大的，方便我們后續(xù)進(jìn)行比較替代。經(jīng)過 HeapPush （堆的插入）后，這些數(shù)據(jù)會(huì)通過 SmallAdjustDown （小堆下調(diào)接口）對(duì)數(shù)據(jù)進(jìn)行相應(yīng)的調(diào)整：

for (int i = 0; i < K; i++) {      // Step1: 創(chuàng)建一個(gè)K個(gè)數(shù)的小堆
    HeapPush(&hp, arr[i]);
}

④ 核心步驟2：除去K，將剩下的N-K個(gè)數(shù)據(jù)進(jìn)行比較。我們?cè)倮?for 循環(huán)將數(shù)組中剩下的N-K個(gè)數(shù)據(jù)進(jìn)行遍歷。

這里逐個(gè)進(jìn)行判斷，如果該數(shù)堆頂?shù)臄?shù)據(jù) i＜K(max)，我們就進(jìn)行替換操作。調(diào)用 HeapPop（堆的刪除）刪除堆頂?shù)臄?shù)據(jù)，給讓位。之后將其 HeapPush （堆的插入）推到堆中，就完成了替換的工作。值得一提的是，我們還可以不調(diào)用 Pop 和 Push 這兩個(gè)操作，手動(dòng)進(jìn)行替換。hp.array [ 0 ] 就表示棧頂，我們將賦值給它，隨后再手動(dòng)進(jìn)行 SmallAdjustDown （小堆下調(diào)操作），傳入相應(yīng)的值即可：

for (int i = K; i < N; i++) {      // Step2: 剩下的N-K個(gè)數(shù)跟堆頂?shù)臄?shù)據(jù)比較
    if (arr[i] > HeapTop(&hp)) {   // 如果比堆頂?shù)臄?shù)據(jù)大就替換進(jìn)堆
        HeapPop(&hp);              // 此數(shù)確實(shí)比堆頂大，刪除堆頂
        HeapPush(&hp, arr[i]);     // 將此數(shù)推進(jìn)堆中，數(shù)越大下沉越深
        /* 另一種寫法: 手動(dòng)替換
        hp.array[0] = arr[i];
        SmallAdjustDown(hp.array, hp.size, 0);
        */
    }
}

⑤ 當(dāng) for 遍歷完所有數(shù)據(jù)之后，小堆中就保留了N個(gè)數(shù)據(jù)中最大的前K個(gè)數(shù)了。此時(shí)我們調(diào)用堆打印接口將小堆里的數(shù)據(jù)打印出來就大功告成了。

TestTopK 解讀

① 這是用來測試我們寫的TopK算法的函數(shù)。設(shè)置 N 的大小為 100w，動(dòng)態(tài)內(nèi)存開辟一塊可以存下這么多數(shù)據(jù)的空間：

int N = 1000000;
int* arr = (int*)malloc(sizeof(int) * N);

② 隨后根據(jù)時(shí)間來置隨機(jī)數(shù)種子，將每個(gè)元素都進(jìn)行隨機(jī)數(shù)的填充，每次產(chǎn)生的隨機(jī)數(shù)都模上100w，這樣可以保證產(chǎn)生的隨機(jī)數(shù)一定是小于100w的。

srand(time(0));
for(size_t i = 0; i < N; i++) {
    arr[i] = rand() % 1000000; 
}

③ 隨便寫幾個(gè)大于100w的數(shù)，便于測試：

// 再去設(shè)置10個(gè)比100w大的數(shù)
arr[5] = 1000000 + 1;
arr[1231] = 1000000 + 2;
arr[5355] = 1000000 + 3;
arr[51] = 1000000 + 4;
arr[15] = 1000000 + 5;
arr[2335] = 1000000 + 6;	
arr[9999] = 1000000 + 7;	
arr[76] = 1000000 + 8;
arr[423] = 1000000 + 9;
arr[3144] = 1000000 + 10;

④ 調(diào)用我們剛才實(shí)現(xiàn)好的 PrintTopK 函數(shù)，遞交對(duì)應(yīng)的參數(shù)后就可以進(jìn)行測試了。這里為了方便測試，我們的K設(shè)置為10：

PrintTopK(arr, N, 10);

感謝各位的閱讀！關(guān)于“C語言怎么用堆解決Topk問題”這篇文章就分享到這里了，希望以上內(nèi)容可以對(duì)大家有一定的幫助，讓大家可以學(xué)到更多知識(shí)，如果覺得文章不錯(cuò)，可以把它分享出去讓更多的人看到吧！

向AI問一下細(xì)節(jié)

推薦閱讀：

免責(zé)聲明：本站發(fā)布的內(nèi)容（圖片、視頻和文字）以原創(chuàng)、轉(zhuǎn)載和分享為主，文章觀點(diǎn)不代表本網(wǎng)站立場，如果涉及侵權(quán)請(qǐng)聯(lián)系站長郵箱：is@yisu.com進(jìn)行舉報(bào)，并提供相關(guān)證據(jù)，一經(jīng)查實(shí)，將立刻刪除涉嫌侵權(quán)內(nèi)容。

上一篇新聞：
Android?CameraX如何結(jié)合LibYUV和GPUImage自定義相機(jī)濾鏡
下一篇新聞：
tk.Mybatis插入數(shù)據(jù)獲取Id怎么實(shí)現(xiàn)

猜你喜歡

AI
助
手

產(chǎn)品服務(wù)

地區(qū)劃分

專題活動(dòng)

幫助支持

關(guān)于我們

售后咨詢

7*24小時(shí)在線電話：400-100-2938

7*24小時(shí)在線 QQ：800811969

關(guān)注億速云

億速云公眾號(hào)

手機(jī)網(wǎng)站二維碼