When implementing clustering algorithms in C++, soft clustering and hard clustering are two common approaches, and they differ in how they decide which cluster a data point belongs to.
Hard clustering partitions the data points into a fixed number of clusters: each point belongs to exactly one cluster, and the cluster boundaries are crisp. In C++, hard clustering can be implemented with a variety of algorithms, K-means being the classic example.
K-means is an iterative optimization algorithm that partitions n observations into k (k ≤ n) clusters so that each observation belongs to the cluster whose mean (the cluster centroid) is nearest, minimizing the total within-cluster sum of squared Euclidean distances.
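In symbols, K-means seeks a partition S = {S_1, …, S_k} and centroids μ_i that minimize the within-cluster sum of squares:

```latex
\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2,
\qquad \mu_i = \frac{1}{|S_i|} \sum_{x \in S_i} x
```

Each iteration alternates between assigning points to the nearest centroid and recomputing each centroid as the mean of its assigned points.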
The following is a simple C++ implementation of K-means:
#include <iostream>
#include <vector>
#include <cmath>
#include <limits>
#include <random>

struct Point {
    double x, y;
};

double distance(const Point& a, const Point& b) {
    return std::sqrt((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y));
}

std::vector<Point> kMeans(const std::vector<Point>& points, int k, int maxIterations = 100) {
    std::vector<Point> centroids(k);
    std::vector<int> assignments(points.size(), -1);
    std::default_random_engine generator;
    std::uniform_int_distribution<int> distribution(0, static_cast<int>(points.size()) - 1);
    // Initialize the centroids with randomly chosen input points
    for (int c = 0; c < k; ++c) {
        centroids[c] = points[distribution(generator)];
    }
    for (int i = 0; i < maxIterations; ++i) {
        // Assign each point to the nearest centroid
        std::vector<int> counts(k, 0);
        for (size_t j = 0; j < points.size(); ++j) {
            double minDist = std::numeric_limits<double>::max();
            int closestCentroid = -1;
            for (int c = 0; c < k; ++c) {
                double dist = distance(points[j], centroids[c]);
                if (dist < minDist) {
                    minDist = dist;
                    closestCentroid = c;
                }
            }
            assignments[j] = closestCentroid;
            counts[closestCentroid]++;
        }
        // Move each centroid to the mean of its assigned points
        for (int c = 0; c < k; ++c) {
            if (counts[c] > 0) {
                centroids[c] = {0, 0};
                for (size_t j = 0; j < points.size(); ++j) {
                    if (assignments[j] == c) {
                        centroids[c].x += points[j].x;
                        centroids[c].y += points[j].y;
                    }
                }
                centroids[c].x /= counts[c];
                centroids[c].y /= counts[c];
            }
        }
    }
    return centroids;
}

int main() {
    std::vector<Point> points = {{1, 2}, {1, 4}, {1, 0}, {10, 2}, {10, 4}, {10, 0}};
    int k = 2;
    std::vector<Point> centroids = kMeans(points, k);
    for (const auto& centroid : centroids) {
        std::cout << "Centroid: (" << centroid.x << ", " << centroid.y << ")\n";
    }
    return 0;
}
Unlike hard clustering, soft clustering allows a data point to belong to several clusters at once: each point is assigned a degree of membership (or probability) in every cluster. This offers more flexibility, because a point can partially belong to a cluster.
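This idea can be stated precisely with a membership matrix. Writing u_{ij} for the degree to which point i belongs to cluster j, soft clustering requires

```latex
u_{ij} \in [0, 1], \qquad \sum_{j=1}^{k} u_{ij} = 1 \quad \text{for every point } i
```

Hard clustering is then the special case where every u_{ij} is exactly 0 or 1.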
In C++, a commonly used refinement is K-means++, an extension of K-means that improves the selection of the initial centroids and thereby the quality of the final clustering. It is worth stressing that K-means++ is still a hard clustering method: it changes only the seeding step, choosing each new centroid with a preference for points far from the centroids already chosen, which avoids the instability caused by purely random initialization. Genuinely soft methods include fuzzy c-means and Gaussian mixture models.
The following is a simple C++ implementation of K-means++ seeding:
#include <iostream>
#include <vector>
#include <cmath>
#include <limits>
#include <random>

struct Point {
    double x, y;
};

double distance(const Point& a, const Point& b) {
    return std::sqrt((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y));
}

std::vector<Point> kMeansPlusPlus(const std::vector<Point>& points, int k) {
    std::vector<Point> centroids(k);
    std::default_random_engine generator;
    std::uniform_real_distribution<double> distribution(0.0, 1.0);
    std::uniform_int_distribution<int> indexDist(0, static_cast<int>(points.size()) - 1);
    // Choose the first centroid uniformly at random
    centroids[0] = points[indexDist(generator)];
    for (int i = 1; i < k; ++i) {
        // Squared distance from each point to its nearest already-chosen centroid
        std::vector<double> distances(points.size());
        double sumDistances = 0.0;
        for (size_t j = 0; j < points.size(); ++j) {
            double minDist = std::numeric_limits<double>::max();
            for (int c = 0; c < i; ++c) {
                double dist = distance(points[j], centroids[c]);
                if (dist < minDist) {
                    minDist = dist;
                }
            }
            distances[j] = minDist * minDist; // D^2 weighting
            sumDistances += distances[j];
        }
        // Sample the next centroid with probability proportional to D^2
        double r = distribution(generator) * sumDistances;
        double cumulative = 0.0;
        size_t chosen = points.size() - 1;
        for (size_t j = 0; j < points.size(); ++j) {
            cumulative += distances[j];
            if (cumulative >= r) {
                chosen = j;
                break;
            }
        }
        centroids[i] = points[chosen];
    }
    // A full clustering would now refine these seeds with the standard
    // K-means iterations; here we return them directly.
    return centroids;
}

int main() {
    std::vector<Point> points = {{1, 2}, {1, 4}, {1, 0}, {10, 2}, {10, 4}, {10, 0}};
    int k = 2;
    std::vector<Point> centroids = kMeansPlusPlus(points, k);
    for (const auto& centroid : centroids) {
        std::cout << "Centroid: (" << centroid.x << ", " << centroid.y << ")\n";
    }
    return 0;
}
In both examples, we define a Point struct to represent a point in two-dimensional space and a distance function that computes the Euclidean distance between two points. The kMeans function implements the basic K-means hard clustering algorithm, while kMeansPlusPlus implements K-means++ seeding: it improves the choice of initial centroids by preferring points that lie far from the centroids already selected, and these seeds can then be refined with the standard K-means iterations.