What does the create function in HDFS do? Many newcomers are unclear on this, so this article walks through it in detail. Hopefully it is useful to anyone who needs it.
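Once the client learns through exists() that the file does not yet exist on the namenode,
it creates the file by calling namenode.create. The details are as follows:
The call means that clientName, running on clientMachine, creates the file src.
clientMachine is used only to choose the target DataNodes.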
public LocatedBlock create(String src, String clientName, String clientMachine, boolean overwrite) throws IOException {
Object results[] = namesystem.startFile(new UTF8(src), new UTF8(clientName), new UTF8(clientMachine), overwrite);// call the namesystem's startFile; the result holds the block and the target datanodes
if (results == null) {
throw new IOException("Cannot create file " + src + " on client " + clientName);
} else {
Block b = (Block) results[0];// retrieve the block
DatanodeInfo targets[] = (DatanodeInfo[]) results[1];// retrieve the DatanodeInfo array
return new LocatedBlock(b, targets);// combine both into the final result
}
}
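To make the hand-off between create and startFile concrete, here is a minimal sketch of the same pattern: a two-slot Object[] that is either null (failure) or holds a block and its targets. The class CreateSketch, the block id 42, the datanode names and the fail flag are all hypothetical stand-ins for illustration, not the real Hadoop classes.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins; not the real Hadoop classes.
class CreateSketch {
    static final class LocatedBlock {
        final long blockId;
        final List<String> targets;

        LocatedBlock(long blockId, List<String> targets) {
            this.blockId = blockId;
            this.targets = targets;
        }
    }

    // Mimics startFile: returns {blockId, targets}, or null on failure.
    // The fail flag is an assumption used only to force the failure path.
    static Object[] startFile(String src, boolean fail) {
        if (fail) {
            return null;
        }
        return new Object[] { Long.valueOf(42L), new String[] { "dn1", "dn2", "dn3" } };
    }

    // Mimics create: unpack the two slots, or turn null into an IOException.
    static LocatedBlock create(String src, String clientName, boolean fail) throws IOException {
        Object[] results = startFile(src, fail);
        if (results == null) {
            throw new IOException("Cannot create file " + src + " on client " + clientName);
        }
        long blockId = (Long) results[0];
        String[] targets = (String[]) results[1];
        return new LocatedBlock(blockId, Arrays.asList(targets));
    }

    public static void main(String[] args) throws IOException {
        LocatedBlock lb = create("/tmp/a", "DFS_CLIENT_1", false);
        System.out.println("block " + lb.blockId + " -> " + lb.targets);
    }
}
```

The untyped array forces the caller to cast each slot, which is why create has to know the slot order by convention.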
====================================
Next, let's study
public synchronized Object[] startFile(UTF8 src, UTF8 holder, UTF8 clientMachine, boolean overwrite) {
An analysis of this function follows:
public synchronized Object[] startFile(UTF8 src, UTF8 holder, UTF8 clientMachine, boolean overwrite) {
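// Background: the parameters include holder and clientMachine. For example:
//   holder: DFS_CLIENT_xxxx
//   clientMachine: Machine66
// In other words, one clientMachine can host several holders.
// Here, a holder on some clientMachine has issued an upload request.
// While reading the code below, pay attention to where holder is used
// and where clientMachine is used.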
Object results[] = null;
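if (pendingCreates.get(src) == null) {// pendingCreates tracks files that are currently being created
boolean fileValid = dir.isValidToCreate(src);// confirm that the path really does not exist. Is this needed? The client's exists() call was a separate request, so it is re-checked here
if (overwrite && ! fileValid) {// overwriting was requested (for now, overwriting is never allowed)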
delete(src);
fileValid = true;
}
if (fileValid) {// creation is allowed, so continue
results = new Object[2];// array that will hold the result
// Get the array of replication targets
DatanodeInfo targets[] = chooseTargets(this.desiredReplication, null, clientMachine);
// choose several target datanodes based on clientMachine and the desired replication count
if (targets.length < this.minReplication) {
LOG.warning("Target-length is " + targets.length +
", below MIN_REPLICATION (" + this.minReplication+ ")");
return null;
}// fewer targets than minReplication: report failure
// Reserve space for this pending file
pendingCreates.put(src, new Vector());// mark this file as being created
synchronized (leases) {// update the lease bookkeeping
Lease lease = (Lease) leases.get(holder);// look up the holder's lease
if (lease == null) {// no lease yet
lease = new Lease(holder);// create one
leases.put(holder, lease);// store it in leases
sortedLeases.add(lease);// store it in sortedLeases
} else {// a lease exists: renew its timestamp and re-insert it into sortedLeases
// note that re-inserting puts the lease back in sorted order
sortedLeases.remove(lease);
lease.renew();
sortedLeases.add(lease);
}
lease.startedCreate(src);// the lease records the file name in its own creates collection
}
// Create next block
results[0] = allocateBlock(src);// mainly records the Block that belongs to this file
results[1] = targets;// the datanodes that were chosen
} else { // ! fileValid
LOG.warning("Cannot start file because it is invalid. src=" + src);
}
} else {
LOG.warning("Cannot start file because pendingCreates is non-null. src=" + src);
}
return results;// null on failure, otherwise {block, targets}
}
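The remove, renew, add sequence in the lease branch is worth a closer look. A sorted set locates an element by comparing keys, so if the timestamp were changed while the lease was still inside the set, the set could no longer find or order it correctly. The following is a minimal sketch of that idea, assuming leases are ordered by last-renewal time with the holder name as a tie-breaker. LeaseSketch, touch, oldestHolder and the explicit now argument are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch of lease bookkeeping; not the real Hadoop classes.
class LeaseSketch {
    static final class Lease implements Comparable<Lease> {
        final String holder;
        long lastUpdate;

        Lease(String holder, long now) {
            this.holder = holder;
            this.lastUpdate = now;
        }

        void renew(long now) {
            lastUpdate = now;
        }

        // Assumed ordering: oldest renewal first, holder name breaks ties.
        @Override
        public int compareTo(Lease o) {
            int c = Long.compare(lastUpdate, o.lastUpdate);
            return c != 0 ? c : holder.compareTo(o.holder);
        }
    }

    final Map<String, Lease> leases = new HashMap<>();
    final TreeSet<Lease> sortedLeases = new TreeSet<>();

    // Create the holder's lease, or renew it if one already exists.
    void touch(String holder, long now) {
        Lease lease = leases.get(holder);
        if (lease == null) {
            lease = new Lease(holder, now);
            leases.put(holder, lease);
            sortedLeases.add(lease);
        } else {
            sortedLeases.remove(lease); // remove while the old sort key is still valid
            lease.renew(now);           // only now change the sort key
            sortedLeases.add(lease);    // re-insert at the new position
        }
    }

    // The lease that was renewed longest ago, i.e. the first to expire.
    String oldestHolder() {
        return sortedLeases.first().holder;
    }

    public static void main(String[] args) {
        LeaseSketch s = new LeaseSketch();
        s.touch("DFS_CLIENT_A", 1);
        s.touch("DFS_CLIENT_B", 2);
        s.touch("DFS_CLIENT_A", 3);
        System.out.println(s.oldestHolder());
    }
}
```

Keeping both a map (lookup by holder) and a sorted set (scan by age) lets an expiry check start from the oldest lease without scanning every holder.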
-------------------------------------------------------------
DatanodeInfo[] chooseTargets(int desiredReplicates, TreeSet forbiddenNodes, UTF8 clientMachine) {
TreeSet alreadyChosen = new TreeSet();// starts empty; holds the machines chosen so far
Vector targets = new Vector();// I don't see why a separate targets collection is needed; it wastes memory, and passing one collection straight to chooseTarget would work just as well
for (int i = 0; i < desiredReplicates; i++) {// one attempt per desired replica
DatanodeInfo target = chooseTarget(forbiddenNodes, alreadyChosen, clientMachine);// choose a single machine
if (target != null) {// a node was chosen: add it to both targets and alreadyChosen. What is the point of adding it twice?
targets.add(target);
alreadyChosen.add(target);
} else {
break; // calling chooseTarget again won't help
}
}
return (DatanodeInfo[]) targets.toArray(new DatanodeInfo[targets.size()]);// return the chosen nodes as an array
}
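As a possible answer to the complaint about keeping both targets and alreadyChosen, the sketch below uses a single insertion-ordered set for both jobs: it excludes nodes already picked and it preserves the order in which they were picked. This is only an illustration of the idea, not how the original code is written. ChooseTargetsSketch and its simplified chooseTarget (which just returns the first candidate that is not excluded) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; nodes are represented by plain host names.
class ChooseTargetsSketch {
    // Simplified single pick: first candidate that is not excluded, or null.
    static String chooseTarget(List<String> candidates, Set<String> excluded) {
        for (String c : candidates) {
            if (!excluded.contains(c)) {
                return c;
            }
        }
        return null;
    }

    // Pick up to 'desired' distinct nodes, stopping early when none are left.
    static List<String> chooseTargets(int desired, List<String> candidates) {
        // One collection: membership test for exclusion, insertion order for the result.
        LinkedHashSet<String> chosen = new LinkedHashSet<>();
        for (int i = 0; i < desired; i++) {
            String target = chooseTarget(candidates, chosen);
            if (target == null) {
                break; // calling chooseTarget again won't help
            }
            chosen.add(target);
        }
        return new ArrayList<>(chosen);
    }

    public static void main(String[] args) {
        List<String> nodes = new ArrayList<>();
        nodes.add("dn1");
        nodes.add("dn2");
        System.out.println(chooseTargets(3, nodes));
    }
}
```

Note that the result can be shorter than the requested count, which is exactly the case startFile checks against minReplication.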
---------------
=======================
DatanodeInfo chooseTarget(TreeSet forbidden1, TreeSet forbidden2, UTF8 clientMachine) {
//
// Check if there are any available targets at all
//
int totalMachines = datanodeMap.size();// number of datanodes currently known
if (totalMachines == 0) {// none at all, so return null
LOG.warning("While choosing target, totalMachines is " + totalMachines);
return null;
}
//
// Build a map of forbidden hostnames from the two forbidden sets.
//
TreeSet forbiddenMachines = new TreeSet();
if (forbidden1 != null) {// forbidden1 is the caller-supplied set of forbidden nodes; it is null on this code path
for (Iterator it = forbidden1.iterator(); it.hasNext(); ) {
DatanodeInfo cur = (DatanodeInfo) it.next();
forbiddenMachines.add(cur.getHost());
}
}
if (forbidden2 != null) {// forbidden2 holds the nodes already chosen, which must not be returned again
for (Iterator it = forbidden2.iterator(); it.hasNext(); ) {
DatanodeInfo cur = (DatanodeInfo) it.next();
forbiddenMachines.add(cur.getHost());
}
}
//
// Build list of machines we can actually choose from
//
Vector targetList = new Vector();// all known nodes minus the forbidden ones, i.e. the remaining candidates
for (Iterator it = datanodeMap.values().iterator(); it.hasNext(); ) {
DatanodeInfo node = (DatanodeInfo) it.next();
if (! forbiddenMachines.contains(node.getHost())) {
targetList.add(node);
}
}
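Collections.shuffle(targetList);// I had to look this up: it randomly shuffles the list, like shuffling cards
// Why? Because DFSShell uploads data daisy-chain style (as in computer organization), presumably so that a random order spreads the load across datanodes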
//
// Now pick one
//
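if (targetList.size() > 0) {// some candidates remain; if clientMachine is among them
// and has room for more than 5 blocks, return it directly. I guess this is to speed things up locally:
// after all, uploading to the local machine is not the same as uploading to a remote host.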
// If the requester's machine is in the targetList,
// and it's got the capacity, pick it.
//
if (clientMachine != null && clientMachine.getLength() > 0) {
for (Iterator it = targetList.iterator(); it.hasNext(); ) {
DatanodeInfo node = (DatanodeInfo) it.next();
if (clientMachine.equals(node.getHost())) {
if (node.getRemaining() > BLOCK_SIZE * MIN_BLOCKS_FOR_WRITE) {
return node;
}
}
}
}
//
// Otherwise, choose node according to target capacity
// otherwise, choose a node with room for more than 5 blocks
for (Iterator it = targetList.iterator(); it.hasNext(); ) {
DatanodeInfo node = (DatanodeInfo) it.next();
if (node.getRemaining() > BLOCK_SIZE * MIN_BLOCKS_FOR_WRITE) {
return node;
}
}
//
// That should do the trick. But we might not be able
// to pick any node if the target was out of bytes. As
// a last resort, pick the first valid one we can find.
// otherwise, choose a node with room for more than 1 block
for (Iterator it = targetList.iterator(); it.hasNext(); ) {
DatanodeInfo node = (DatanodeInfo) it.next();
if (node.getRemaining() > BLOCK_SIZE) {
return node;
}
}
LOG.warning("Could not find any nodes with sufficient capacity");
return null;// otherwise return null
} else {
LOG.warning("Zero targets found, forbidden1.size=" +
( forbidden1 != null ? forbidden1.size() : 0 ) +
" forbidden2.size()=" +
( forbidden2 != null ? forbidden2.size() : 0 ));
return null;// not a single candidate node is available
}
}
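The selection order in chooseTarget can be summarized as three tiers: the client's own machine if it has comfortable room, then any node with comfortable room, then any node with room for more than one block. Below is a compact sketch of that order under stated assumptions: nodes are reduced to a host name and remaining bytes, the 32 MB BLOCK_SIZE is an assumed value, MIN_BLOCKS_FOR_WRITE is taken as 5 from the comments above, and ChooseTargetSketch, Node and choose are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch of the three-tier choice; not the real Hadoop classes.
class ChooseTargetSketch {
    static final long BLOCK_SIZE = 32L * 1024 * 1024; // assumed value
    static final int MIN_BLOCKS_FOR_WRITE = 5;        // "5 blocks" per the comments above

    static final class Node {
        final String host;
        final long remaining; // free bytes

        Node(String host, long remaining) {
            this.host = host;
            this.remaining = remaining;
        }
    }

    static Node choose(List<Node> all, Set<String> forbiddenHosts, String clientHost, Random rnd) {
        // Candidates are all known nodes minus the forbidden hosts.
        List<Node> candidates = new ArrayList<>();
        for (Node n : all) {
            if (!forbiddenHosts.contains(n.host)) {
                candidates.add(n);
            }
        }
        if (candidates.isEmpty()) {
            return null;
        }
        Collections.shuffle(candidates, rnd); // randomize so no node is always first

        long comfortable = BLOCK_SIZE * MIN_BLOCKS_FOR_WRITE;
        // Tier 1: the requester's own machine, if it has comfortable room.
        if (clientHost != null && !clientHost.isEmpty()) {
            for (Node n : candidates) {
                if (n.host.equals(clientHost) && n.remaining > comfortable) {
                    return n;
                }
            }
        }
        // Tier 2: any node with comfortable room.
        for (Node n : candidates) {
            if (n.remaining > comfortable) {
                return n;
            }
        }
        // Tier 3: last resort, any node with room for more than one block.
        for (Node n : candidates) {
            if (n.remaining > BLOCK_SIZE) {
                return n;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("Machine66", 10 * BLOCK_SIZE));
        nodes.add(new Node("Machine67", 10 * BLOCK_SIZE));
        Node picked = choose(nodes, Collections.<String>emptySet(), "Machine66", new Random());
        System.out.println(picked.host);
    }
}
```

Passing the Random in explicitly is a choice made here so the sketch can be exercised deterministically; the original simply calls Collections.shuffle(targetList).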